Fully Explained Voting Ensemble Technique in Machine Learning
Ensemble learning is a machine learning method that combines multiple weak learners (i.e. different algorithms) into a single strong predictive model, also called a strong learner.
In general, ensemble methods fall into a few broad types: bagging, boosting, stacking, and voting. From these methods, we will study the voting ensemble.
In this technique, we train several different machine learning models on the same dataset and combine their predictions for classification or regression.
Assumptions made in the voting technique:
- The base models should be different from one another.
- The accuracy of each model should be greater than 50% (i.e. better than random guessing in the binary case). The final accuracy depends on the prediction probabilities of each model.
Because we combine many base models, the strong models can compensate for the poor performance of any single algorithm.
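To see why the greater-than-50% assumption matters, here is a small worked calculation (with an assumed individual accuracy of 0.7 and the simplifying assumption that the models err independently): a hard majority vote of three such models is correct whenever at least two of them are right.

```python
from math import comb

# Assumed accuracy of each independent base model; must exceed 0.5
# for majority voting to improve on a single model.
p = 0.7

# Majority of 3 is correct when exactly 2 or all 3 models are correct.
ensemble_acc = comb(3, 2) * p**2 * (1 - p) + comb(3, 3) * p**3
print(round(ensemble_acc, 3))  # 0.784, better than any single model's 0.7
```

If the individual accuracy were below 0.5, the same formula would show the vote performing *worse* than a single model, which is why the assumption is stated.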
Types of voting ensemble, by how the predictions are combined:
- Soft voting: average the predicted class probabilities of all base models and choose the class with the highest average probability.
- Hard voting: each base model predicts a class, and the majority class becomes the final prediction.
- Averaging (for regression): the final prediction is the average of each model's numeric prediction.
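The difference between soft and hard voting can be sketched by hand. Below, the probability values for three hypothetical base models (A, B, C) on a single sample are made up for illustration; note that the two schemes can disagree:

```python
import numpy as np

# Hypothetical predicted class probabilities from three base models
# for one sample, over classes [0, 1, 2].
probas = np.array([
    [0.6, 0.3, 0.1],   # model A
    [0.4, 0.5, 0.1],   # model B
    [0.3, 0.4, 0.3],   # model C
])

# Soft voting: average the probabilities, then take the argmax.
# mean = [0.433, 0.400, 0.167] -> class 0
soft_pred = int(np.argmax(probas.mean(axis=0)))

# Hard voting: each model votes for its own argmax class; majority wins.
votes = np.argmax(probas, axis=1)            # [0, 1, 1]
hard_pred = int(np.bincount(votes).argmax()) # class 1

print(soft_pred, hard_pred)  # 0 1
```

Here soft voting picks class 0 (model A is very confident), while hard voting picks class 1 (two of three models vote for it), showing why the choice of scheme matters.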
Soft and Hard voting Classifier Examples
import numpy as np   # linear algebra
import pandas as pd  # data processing, CSV file I/O (e.g. pd.read_csv)
from sklearn.preprocessing import LabelEncoder

df = pd.read_csv('iris.csv')

# Encode the categorical target column as integer labels
encoder = LabelEncoder()
df['variety'] = encoder.fit_transform(df['variety'])

from sklearn.ensemble import VotingClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import…
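Since the snippet above is cut off, here is a minimal end-to-end sketch of both voting modes. It uses sklearn's built-in iris dataset in place of `iris.csv`, and the choice of base estimators (logistic regression, decision tree, random forest) is an assumption for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, VotingClassifier

# Built-in iris data stands in for iris.csv here.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

estimators = [
    ('lr', LogisticRegression(max_iter=1000)),
    ('dt', DecisionTreeClassifier(random_state=42)),
    ('rf', RandomForestClassifier(random_state=42)),
]

# Hard voting: majority class across the base models.
hard_clf = VotingClassifier(estimators=estimators, voting='hard')
hard_clf.fit(X_train, y_train)
print('hard voting accuracy:', hard_clf.score(X_test, y_test))

# Soft voting: argmax of the averaged class probabilities
# (requires every base model to support predict_proba).
soft_clf = VotingClassifier(estimators=estimators, voting='soft')
soft_clf.fit(X_train, y_train)
print('soft voting accuracy:', soft_clf.score(X_test, y_test))
```

Soft voting generally needs base estimators that expose `predict_proba`; hard voting only needs `predict`.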