EvalML Library for Machine Learning Automation Pipeline with Python

Build machine learning models by using automation pipelines

Amit Chauhan



EvalML is a Python library that automates building machine learning models with pipelines. It performs feature engineering automatically, which makes the work easier for data scientists, and it tunes hyperparameters as part of the same automated process.

To install the evalML library, use the command shown below:

pip install evalml

If the above command fails (for example, because of missing permissions), try installing for the current user only:

pip install evalml --user
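Before moving on, it can help to confirm that the install actually worked. The snippet below is a minimal sketch that uses only the Python standard library; the helper name `installed_version` is made up for this example and is not part of evalML.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing.

    `installed_version` is a hypothetical helper written for this article.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("evalml")
if version is None:
    print("evalml is not installed in this environment")
else:
    print(f"evalml {version} is installed")
```

If the message says the package is missing, the `pip` command probably ran against a different Python environment than the one you are using now.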

Import the evalML library:

import evalml

Next, load the breast cancer demo dataset that ships with evalML.

X, y = evalml.demos.load_breast_cancer()

With the built-in demo dataset loaded, we split it into training and test sets using the split_data method from evalML's preprocessing module.

X_train, X_test, y_train, y_test = evalml.preprocessing.split_data(X, y, problem_type='binary')
X_train.head()

Let’s check the type of the data after splitting.

type(X_train)

#output:
pandas.core.frame.DataFrame
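Why does split_data need a problem_type? As far as I understand, for classification problems evalML keeps the class proportions similar in the training and test sets (a stratified split). The code below is not evalML's implementation. It is a small pure-Python sketch of that idea, and the function `stratified_split` and the toy labels are made up for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_size=0.2, seed=0):
    """Split row indices so each class keeps roughly the same share in both parts.

    Hypothetical helper for illustration only; not part of evalML.
    """
    rng = random.Random(seed)

    # Group row indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Take the same fraction of every class for the test set.
        n_test = max(1, round(len(indices) * test_size))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy target with a 60/40 class balance (made-up data).
labels = ["benign"] * 60 + ["malignant"] * 40
train_idx, test_idx = stratified_split(labels, test_size=0.2, seed=0)

test_share = sum(labels[i] == "malignant" for i in test_idx) / len(test_idx)
train_share = sum(labels[i] == "malignant" for i in train_idx) / len(train_idx)
print(f"malignant share in test: {test_share:.2f}, in train: {train_share:.2f}")
```

Both parts keep the same class balance as the full data, so a model is not evaluated on a test set that happens to contain too few examples of one class.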

Now let's list the problem types that evalML supports.

evalml.problem_types.ProblemTypes.all_problem_types

#output:
[<ProblemTypes.BINARY: 'binary'>,
<ProblemTypes.MULTICLASS: 'multiclass'>,
<ProblemTypes.REGRESSION: 'regression'>,
<ProblemTypes.TIME_SERIES_REGRESSION: 'time series regression'>,
<ProblemTypes.TIME_SERIES_BINARY: 'time series binary'>,
<ProblemTypes.TIME_SERIES_MULTICLASS: 'time series multiclass'>,
<ProblemTypes.MULTISERIES_TIME_SERIES_REGRESSION: 'multiseries time series regression'>]

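The list above shows the problem types we can choose from. The choice tells evalML what kind of output the model should produce, such as a class label for classification or a numeric value for regression.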
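To make the first three problem types concrete, here is a rough sketch of how one might guess the problem type from the target column alone. The helper `guess_problem_type` and its `max_classes` threshold are assumptions made for this example. EvalML provides its own `detect_problem_type` helper in `evalml.problem_types`, which is the better choice in real code.

```python
from numbers import Number


def guess_problem_type(target, max_classes=10):
    """Guess 'binary', 'multiclass', or 'regression' from target values.

    Hypothetical heuristic for illustration; the threshold is an assumption.
    """
    values = [v for v in target if v is not None]
    if not values:
        raise ValueError("target has no usable values")

    unique = set(values)
    if len(unique) == 2:
        return "binary"

    # Many distinct numeric values suggest a continuous target.
    all_numeric = all(
        isinstance(v, Number) and not isinstance(v, bool) for v in values
    )
    if all_numeric and len(unique) > max_classes:
        return "regression"
    return "multiclass"


print(guess_problem_type(["benign", "malignant", "benign"]))
print(guess_problem_type(["cat", "dog", "bird", "cat"]))
print(guess_problem_type([0.5 * i for i in range(50)]))
```

The returned strings match the values shown in the output above, so they can be passed as the problem_type argument. The time series variants cannot be guessed this way, because they depend on how the rows are ordered in time and not only on the target values.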
