Data Science Enthusiast | Electronics R&D | Data Visualization | BI | NLP

In this article, we will discuss some basic visualizations with the matplotlib and seaborn libraries. Both libraries are well known in the data science and analytics community.

- **Matplotlib:** It provides basic plotting functionality with a customizable approach. It works well with pandas and numpy. It also supports plotting multiple figures.
- **Seaborn:** It is also a very powerful visualization tool and works even more comfortably with pandas data frames. It provides attractive themes for plots. It supports multiple figures but can sometimes lead to OOM (Out Of Memory) problems.

Some examples of visualization with matplotlib…

This article aims to change how beginners think about learning natural language processing (NLP). When I first started learning NLP, I always wondered how I would actually use all these concepts.

The prerequisite for this article is a basic knowledge of natural language processing concepts. You can read the article below to brush up on them.

- **Reading sentiment text file**
- **Data Exploration and Text Processing**
- **Data Cleaning: Stopwords, Stemming, and Lemmatization**
- **Model Building: Naive Bayes**
- **Saving and Revoking the model**

Reading sentiment text file

Import all the necessary libraries.

`import pandas as pd import numpy…`

This article covers the **concepts of the list** in depth as part of data structures. It moves from basic to advanced so that you know the list inside out.

The first question that comes to mind is *what is a list?*

The following points explain what a list is:

- A list may or may not be a collection of heterogeneous data types.
- It is useful for small sequences.
- The list is mutable in Python, which means we can modify its items. …
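As a minimal sketch of the points above (the variable names and values are made up for illustration), the snippet below shows a list holding mixed data types and a list being modified in place, which is what mutability means in Python.

```python
# A list can mix data types (heterogeneous) or hold a single type (homogeneous).
mixed = [1, "two", 3.0, True]
numbers = [10, 20, 30]

# Lists are mutable: items can be changed, added, or removed in place.
numbers[0] = 99        # replace an item by index
numbers.append(40)     # add an item to the end
numbers.remove(20)     # remove the first item equal to 20

print(mixed)
print(numbers)
```

Every operation here changes the existing list object rather than creating a new one.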

In this article, we will discuss time series concepts with machine learning examples that deal with the time component in the data.

Forecasting is very important in banking, weather, population prediction, and many other areas that deal directly with real-life problems.

Time series models are based on a function of time. The measurements are taken at regular intervals of time, and time is the independent variable for modeling.

**Z = f(t)**

**Z** denotes the values Z1, Z2, …, Zn, and **t** denotes the times T1, T2, …, Tn at which they are observed.
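To make the relation Z = f(t) concrete, here is a small sketch. The function `f` is a hypothetical choice (a linear trend plus a repeating cycle) and is not taken from the article. It only illustrates values observed at regular time steps.

```python
import math

def f(t):
    """Hypothetical series: a linear trend plus a cycle that repeats every 4 steps."""
    return 0.5 * t + 2.0 * math.sin(2 * math.pi * t / 4)

# Regular time steps T1..Tn and the corresponding values Z1..Zn.
times = list(range(1, 9))
values = [f(t) for t in times]

for t, z in zip(times, values):
    print(f"t={t}  Z={z:.2f}")
```

Here time is the only input, so each value depends solely on when it was measured.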

- **Components of Time Series**
- **White Noise**
- **Stationary and Non-Stationary**
- **Rolling Statistics and Dickey-Fuller…**

This article is about predicting whether a person is diabetic or not based on the given data. We will use two machine learning approaches and measure the accuracy of their predictions.

- Introduction about the data
- Exploratory Data Analysis (EDA) with matplotlib and sweetviz
- Prediction with SVM and KNN classifier

The data contains 8 independent variables and 1 dependent variable. The motivation for building the prediction model is to reduce the work and time involved and to make fast predictions that can guide further medication.

The independent variables are: **pregnancies, glucose, BMI, insulin, blood pressure, skin thickness, pedigree function, age**

The dependent variable: **outcome**

This article will cover all the concepts related to functions and make you feel comfortable in programming. This topic is very easy to understand, yet it feels difficult without enough practice.

- Introduction
- Function arguments and their types
- Global and local variables
- Passing a data sequence to a function
- Anonymous function: *Lambda*

The value of a function becomes clear when you write the same formula more than once in a program or algorithm, which costs time.

It is better to write a single function that contains the formula and call that function as many times as needed.
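As a small illustration of this idea, the sketch below wraps one formula in a function and reuses it. The simple-interest formula and the name `simple_interest` are chosen only as an example and do not come from the article.

```python
def simple_interest(principal, rate, years):
    """Return simple interest for a principal, an annual rate in percent, and a number of years."""
    return principal * rate * years / 100

# The same formula is reused without being rewritten each time.
print(simple_interest(1000, 5, 2))
print(simple_interest(2500, 4, 3))
```

If the formula ever changes, only the function body needs to be updated.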

The benefits of using…

In this article, we will discuss error handling in Python with the try, except, and finally keywords for file and data management.

In general, errors fall into the three categories shown below:

- **Compile-time error:** When we make syntactical mistakes such as leaving something out, misspelling a word, or using an undefined variable, these errors appear at compile time.
- **Logical error:** This type of error occurs when the program compiles and runs properly but gives wrong or undesired output.
- **Runtime error:** This error occurs in…
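The following is a minimal sketch of try, except, and finally applied to file handling. The helper name `read_first_line` and the file names are hypothetical. It handles one kind of runtime error (a missing file) and uses finally to make sure the file is closed.

```python
import os
import tempfile

def read_first_line(path):
    """Return the first line of a file, or None if the file does not exist."""
    handle = None
    try:
        handle = open(path, encoding="utf-8")
        return handle.readline().rstrip("\n")
    except FileNotFoundError:
        # Runtime error: the code is valid, but the file is missing.
        return None
    finally:
        # Runs whether or not an exception was raised.
        if handle is not None:
            handle.close()

# Usage with a temporary file (illustrative content).
with tempfile.TemporaryDirectory() as folder:
    existing = os.path.join(folder, "notes.txt")
    with open(existing, "w", encoding="utf-8") as out:
        out.write("hello\nworld\n")
    print(read_first_line(existing))
    print(read_first_line(os.path.join(folder, "missing.txt")))
```

In everyday code a `with open(...)` block closes the file automatically. The explicit finally clause is used here only to show where cleanup happens.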

This is another article in the Fully Explained series on machine learning algorithms, covering BIRCH clustering in unsupervised learning.

This algorithm performs hierarchical clustering based on trees. These trees are called CFTs, i.e. Cluster Feature Trees. BIRCH stands for **B**alanced **I**terative **R**educing **C**lusters using **H**ierarchies. BIRCH clustering is used in the following scenarios:

- Large dataset
- Outlier detection
- Data reduction.

The metric used in this clustering to measure distance is the Euclidean distance.

BIRCH is very useful among clustering algorithms for the following reasons:
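As a rough sketch, the snippet below computes a Euclidean distance and a cluster summary. The summary triple (number of points, linear sum, sum of squares) follows the standard BIRCH formulation of a clustering feature rather than anything stated in this article. The function names and the toy points are assumptions for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def clustering_feature(points):
    """Summarise points as (count, per-dimension linear sum, sum of squared values)."""
    n = len(points)
    dims = len(points[0])
    linear_sum = [sum(p[d] for p in points) for d in range(dims)]
    squared_sum = sum(x * x for p in points for x in p)
    return n, linear_sum, squared_sum

def centroid(cf):
    """Centroid recovered from the summary alone, without the raw points."""
    n, linear_sum, _ = cf
    return [s / n for s in linear_sum]

points = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]  # toy data
cf = clustering_feature(points)
print(cf)
print(centroid(cf))
print(euclidean(centroid(cf), (0.0, 0.0)))
```

Because the centroid can be recovered from the summary, the raw points do not need to be kept, which is one reason this kind of summary suits large datasets.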

- It is very useful to handle…

In this article, we will discuss how to create a dummy (fake) estimator to compare against the actual model estimator. We will discuss two types of dummy estimators in supervised learning, i.e. for regression and classification.

This concept is part of the metrics and scoring section of sklearn.

It makes predictions using a simple rule, which gives a simple baseline to compare other regressors against. It is not meant for real problems.

**Parameters in DummyRegressor**

The main parameters are shown below:

**Strategy:** It is used to generate predictions based on its different arguments.

*Mean:* It is used to predict the mean of the…
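To show what a mean-based baseline does, here is a hand-rolled sketch in plain Python. The class `MeanBaseline` and the toy data are hypothetical stand-ins and are not sklearn code. In sklearn itself the corresponding estimator is `DummyRegressor(strategy="mean")` from `sklearn.dummy`.

```python
from statistics import mean

class MeanBaseline:
    """Stand-in baseline that always predicts the mean of the training targets."""

    def fit(self, X, y):
        # The features are ignored; only the targets determine the prediction.
        self.constant_ = mean(y)
        return self

    def predict(self, X):
        # One identical prediction per input row.
        return [self.constant_ for _ in X]

X_train = [[1], [2], [3], [4]]        # toy features, ignored by the baseline
y_train = [10.0, 12.0, 14.0, 16.0]    # toy targets

baseline = MeanBaseline().fit(X_train, y_train)
print(baseline.predict([[5], [6]]))
```

A real regressor is only worth keeping if it scores clearly better than a baseline like this one.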

This article will be fun for all readers

Hypothesis testing is a way of testing an idea statistically on observed data points. It is about making a guess about what may or may not work and checking it to reach meaningful results.

A good hypothesis contains “if” and “then” words. For example, **if** the temperature is increased **then** the solid will melt.

- Knowledge of Descriptive Statistics
- Knowledge of Inferential Statistics

- Formulate a hypothesis
- Find the right test
- Execute the test
- Make a decision
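The four steps above can be sketched with a two-sided one-sample z-test. The choice of test, the sample values, the assumed known standard deviation, and the 0.05 significance level are all illustrative assumptions and are not taken from the article.

```python
from math import sqrt
from statistics import NormalDist, mean

def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: population mean equals mu0, with known sigma."""
    # Execute the test: standardise the gap between the sample mean and mu0.
    z = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Make a decision: reject H0 only when the p-value is below alpha.
    return z, p_value, p_value < alpha

sample = [52, 48, 55, 51, 49, 53, 50, 54]  # toy data
z, p, reject = one_sample_z_test(sample, mu0=50, sigma=3)
print(f"z={z:.3f}, p={p:.3f}, reject H0: {reject}")
```

With this toy sample the p-value is above 0.05, so the null hypothesis is not rejected.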

Whenever we test a hypothesis, we have to know what our null hypothesis is. For example, if we say the…