Data Science Enthusiast | Electronics R&D | Data Visualization | BI | NLP

Data Visualization

Basic concepts for forecasting models in machine learning with example

In this article, we will discuss time series concepts, with machine learning examples, for data that has a time component.

Forecasting is very important in banking, weather, population prediction, and many other areas that deal directly with real-life problems.

Time series models are based on a function of time. The measurements are taken at regular intervals of time, and time is the independent variable for modeling.

Z = f(t)

Z takes the values Z1, Z2, …, Zn, and “t” takes the times T1, T2, …, Tn at which those values are measured.
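To make Z = f(t) concrete, here is a minimal sketch of a series measured at regular intervals. The linear function `f`, the start date, and the daily step are all assumptions chosen for illustration; they are not taken from the article.

```python
from datetime import datetime, timedelta


def f(t):
    """Hypothetical model: a simple linear trend in the time index t."""
    return 10.0 + 2.0 * t


start = datetime(2021, 1, 1)  # assumed start date
step = timedelta(days=1)      # assumed regular interval between measurements

# Observation times T1..Tn and the values Z1..Zn = f(t) at those times
times = [start + i * step for i in range(5)]
values = [f(i) for i in range(5)]

for t, z in zip(times, values):
    print(t.date(), z)
```

Time is the only input here, which is what makes it the independent variable of the model.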

Topics to be covered:

  1. Components of Time Series
  2. White Noise
  3. Stationary and Non-Stationary
  4. Rolling Statistics and Dickey-Fuller…

Data Science

A robust method to make data ready for machine learning estimators

In this article, we will study some important data preprocessing methods. Visualizing the data and bringing it into a suitable form is a very important step, so that the estimators (algorithms) fit well and give good accuracy.

Topics to be covered:

  1. Standardization
  2. Scaling with sparse data and outliers
  3. Normalization
  4. Categorical Encoding
  5. Imputation


Standardization is a process that rescales the data points using their mean and standard deviation. In raw data, the values can vary from very low to very high. To avoid poor model performance caused by these differences in scale, we use standardization. …
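As a rough illustration of standardization, the sketch below subtracts the mean from each value and divides by the standard deviation. It uses only the Python standard library; the function name `standardize` and the sample values are assumptions for this example, and the article itself may use a library helper instead.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column has no spread; map every value to 0.0
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


raw = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical feature values
scaled = standardize(raw)
print(scaled)
```

After this step the feature has mean 0 and standard deviation 1, so features on very different scales become comparable.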

Deep Learning

A classification approach for binary classes

The world has witnessed an explosion in machine learning and deep learning technology in the last decade, from personal use to professional activities everywhere. With these technologies around, I am sure you have heard or read the term “perceptron” in neural networks; it is probably the first concept one learns when starting with neural networks. In this article, my emphasis is on showing what a perceptron is and how it works.

The American scientist Frank Rosenblatt was very much inspired by the biological neuron and its ability to learn, and he introduced the term perceptron around 1957…
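To give a feel for how a perceptron works on binary classes, here is a minimal sketch of the classic perceptron learning rule trained on the logical AND function. The dataset, the learning rate of 1, the zero initial weights, and the epoch count are assumptions for this illustration, not details from the article.

```python
def predict(weights, bias, x):
    """Step activation: output 1 if the weighted sum is positive, else 0."""
    total = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if total > 0 else 0


def train_perceptron(samples, epochs=20, learning_rate=1):
    """Perceptron rule: nudge weights by (target - prediction) * input."""
    weights = [0] * len(samples[0][0])
    bias = 0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Logical AND: linearly separable, so the rule is able to converge
and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_data)
predictions = [predict(weights, bias, x) for x, _ in and_data]
print(predictions)
```

The weights only change when a prediction is wrong, which is why training settles once every sample is classified correctly.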


Analysis of variables association and distribution

When we talk about chi-square tests, we basically study two types:

  • Chi-square for independence
  • Chi-square for the goodness of fit

Both of them are non-parametric tests: they work on counts of categorical data rather than on measurements on a continuous scale, and they make few assumptions about the underlying distribution.

The first one helps determine whether there is any association between qualitative variables, and the second one tells whether a sample follows an expected distribution or not.

Chi-square for independence

This test helps determine whether there is any association between two categorical variables of qualitative data. It is only applicable to categorical data.
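The calculation behind the test of independence can be sketched in plain Python: for each cell of the contingency table, the expected count is (row total × column total) / grand total, and the statistic sums (observed − expected)² / expected over all cells. The function name and the 2×2 table below are hypothetical.

```python
def chi_square_independence(table):
    """Return the chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            # Expected count if the two variables were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed_count - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


# Hypothetical counts: rows are one categorical variable, columns the other
observed = [[20, 30],
            [30, 20]]
chi2, dof = chi_square_independence(observed)
print(chi2, dof)
```

The statistic is then compared with a chi-square critical value for the given degrees of freedom; for 1 degree of freedom at the 5% level that value is about 3.841.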

Machine Learning

A part of continuous integration, continuous development, and continuous testing

What is MLOps?

If we break down the word itself, it is a combination of two words: machine learning and operations. Machine learning stands for model development or any kind of code development, and operations means the production and deployment of code.

A more technical definition: MLOps is a set of principles and practices to standardize and streamline machine learning lifecycle management.

It is not a new technology or tool but rather a culture, with a set of principles and guidelines defined in the machine learning world, to seamlessly integrate and automate the development phase with the operational phase.

Installation and CRUD operation in databases with Jupyter notebook

In this article, we will perform CRUD operations on a database using Python in a Jupyter notebook. Python has become a versatile language used in numerous fields all over the globe. To automate a whole project pipeline, we also need a connection between Python and the database system.

For those who do not know, CRUD stands for the Create, Read, Update, and Delete operations.

Here, we are using MySQL Workbench 8.0 and Python 3.7. …
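The four CRUD operations can be sketched as follows. Note the swap: this example uses Python's built-in `sqlite3` module with an in-memory database so that it runs without a server, whereas the article works with MySQL. The `students` table and its rows are hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL server
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, marks INTEGER)"
)

# Create: insert new rows
cur.execute("INSERT INTO students (name, marks) VALUES (?, ?)", ("Asha", 82))
cur.execute("INSERT INTO students (name, marks) VALUES (?, ?)", ("Ravi", 74))
conn.commit()

# Read: fetch the rows back
rows_after_insert = cur.execute(
    "SELECT name, marks FROM students ORDER BY id"
).fetchall()

# Update: change an existing row
cur.execute("UPDATE students SET marks = ? WHERE name = ?", (90, "Asha"))

# Delete: remove a row
cur.execute("DELETE FROM students WHERE name = ?", ("Ravi",))
conn.commit()

rows_final = cur.execute(
    "SELECT name, marks FROM students ORDER BY id"
).fetchall()
print(rows_after_insert, rows_final)
conn.close()
```

The SQL statements themselves carry over to MySQL largely unchanged; what differs is the connection setup and the placeholder style of the database driver.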

WebApp framework for data science and machine learning

In this article, we will make visualization charts in Streamlit with the Matplotlib and Seaborn libraries. Streamlit is a new-generation web framework for data science and machine learning enthusiasts.

What is streamlit?

It is an open-source web framework to make customizable apps in the field of machine learning and data science.

Topics to be covered

  1. Matplotlib
     a. Bar chart
     b. Histogram
  2. Seaborn
     a. Count plot
     b. Violin plot

Visualization with Matplotlib library in streamlit

Here, we will make a bar chart and histogram with streamlit functionalities. Visualization is an important part of the analysis…

Predictive Analytics

Improving a model and its accuracy for high-dimensional data

As we know, features play a very crucial role in machine learning algorithms and in predictive analysis in any field.

When the data has many complex features, there is a high chance of multicollinearity, that is, a high correlation between two or more features. This hurts training, and the model may overfit or underfit the data.

There are some methods to select and remove features as shown below:

Feature Selection Methods
1. Uni-variate Selection
2. Selecting from Model
Feature Removal Methods
1. Low variance method
2. …
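As an illustration of the low variance method named in the list above, the sketch below drops any feature whose variance does not exceed a threshold, since a near-constant column carries little information for prediction. The function name, the threshold default, and the sample columns are assumptions for this example.

```python
from statistics import pvariance


def drop_low_variance(columns, threshold=0.0):
    """Keep only the columns whose population variance exceeds the threshold."""
    return {
        name: values
        for name, values in columns.items()
        if pvariance(values) > threshold
    }


# Hypothetical feature columns; "constant_flag" never changes
features = {
    "age": [25, 32, 47, 51],
    "constant_flag": [1, 1, 1, 1],
    "income": [30, 45, 80, 62],
}
kept = drop_low_variance(features)
print(list(kept))
```

A threshold of 0 removes only constant columns; a larger threshold is a judgment call that depends on the scale of each feature.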


Error interruption in the normal flow of the program

Exception Handling

The concept of an exception is simple. An exception occurs when the normal flow of a program is interrupted by an error. If the program cannot handle a particular operation, it raises an exception. An exception is a Python object that represents an error. Once a Python script raises an exception, it has to be handled; if it is not, the program terminates.

Try and except in Python

In Python, a try-except statement first runs the code under the try clause. If the code does not execute successfully, it stops where an error…
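A small sketch of this flow, using a hypothetical `safe_divide` helper: the code under `try` runs first, the `except` clause handles the error if one is raised, `else` runs only when no error occurred, and `finally` runs in both cases.

```python
def safe_divide(numerator, denominator):
    """Divide two numbers, returning None instead of crashing on division by zero."""
    try:
        result = numerator / denominator
    except ZeroDivisionError as exc:
        # Runs only when the try block raised ZeroDivisionError
        print(f"Could not divide: {exc}")
        return None
    else:
        # Runs only when the try block finished without an exception
        return result
    finally:
        # Runs whether or not an exception occurred
        print("Division attempt finished")


print(safe_divide(10, 2))
print(safe_divide(1, 0))
```

Because the exception is handled inside the function, the second call does not terminate the program.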

Deep Learning

Using an Artificial Neural Network with Keras in Google Colab

In this article, we will predict customer churn, or attrition, with an artificial neural network in Google Colab. The aim is not to get high accuracy from the model; what matters is building a good model with different techniques and algorithms.

Churn analysis is a classification problem because the label column has only binary values. Customers are very important for every company and institution, and customer attrition is a part of the analysis.

Here, we will use a neural network to predict customer attrition based on the input features and the target column, as shown below:


