Probability and Classification are two of the most important aspects of Machine Learning, and they often go hand in hand. We use various algorithms to classify data into distinguishable classes. One such algorithm is the Naïve Bayes Algorithm.

Image Source: Simran Kaur Arora

“A learner that uses Bayes’ theorem and assumes the effects are independent given the cause is called a Naïve Bayes classifier. That’s because, well, that’s such a naïve assumption.”
— Pedro Domingos

The basics of Naïve Bayes Algorithm

To understand Naïve Bayes, we need some basic knowledge of probability and Bayes’ Theorem. To build that understanding, let’s consider an example:

A Dice and a…

Moving on with our knowledge from Logistic Regression, a Supervised Learning Algorithm for classification of data, we now study a much more geometrically motivated algorithm: the Support Vector Machine Algorithm.

Image Source: Shuyu Luo

In this blog, we will learn how SVM works and the motivation behind using it for classification. We will also discuss why SVM may be needed over the Logistic Regression Algorithm and when to use it. Although Support Vector Machine can be used for both regression and classification tasks, it is mainly used for classification problems.

Support Vector Machine — The way to Social Distancing!

Logistic Regression is a probabilistic binary classifier algorithm. This means it classifies the…

From our study of K-Nearest Neighbors (KNN), a Supervised Learning Algorithm and a lazy learner, we found that it assumes similar things exist in close proximity. KNN hinges on this assumption being true enough for the algorithm to be useful.

K-Means is no different. Based on the same basic principle of proximity or similarity, K-Means is also able to categorize data points into groups. In this blog, we will go over the Math behind K-Means Clustering and build a small model from scratch.

K-Means Clustering — Introduction

K-Means Clustering, also known as Lloyd’s Algorithm, is an iterative, data-partitioning, Unsupervised Learning Algorithm…

Finding out whether or not something will happen is a dilemma we face every day. We are faced with the question of Yes or No all the time. Researchers in the field of Machine Learning are no different.

In Machine Learning, the question of how probable an event is can be answered using Logistic Regression. Although it is called “Regression”, Logistic Regression is an algorithm built to solve Classification Problems.

Math behind Logistic Regression

In statistics, the logit function or the log-odds is the logarithm of the odds p/(1-p) where p is a probability. …
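The log-odds definition above can be sketched in a few lines of Python. This is a minimal illustration, not code from the original post: the function names logit and sigmoid are chosen here for clarity, and sigmoid is included on the assumption that the reader will want the inverse mapping from log-odds back to a probability.

```python
import math


def logit(p: float) -> float:
    """Return the log-odds log(p / (1 - p)) for a probability p in (0, 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def sigmoid(z: float) -> float:
    """Inverse of logit: map a log-odds value z back to a probability."""
    # Branch on the sign of z so that math.exp never receives a large
    # positive argument, which would raise OverflowError.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


if __name__ == "__main__":
    print(logit(0.5))            # even odds give log-odds of 0.0
    print(sigmoid(logit(0.8)))   # round trip returns roughly 0.8
```

A probability of 0.5 corresponds to odds of 1 and therefore log-odds of 0; probabilities above 0.5 give positive log-odds and probabilities below 0.5 give negative ones.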

Decisions, Decisions, Decisions… we make numerous decisions every day, unconsciously or consciously, sometimes automatically with little effort and sometimes agonizing for hours over a single one. If only there were a way to chart a path to the conclusion.

Sample Decision Tree

The Decision Tree is one of the most popular and powerful tools for Classification and Regression. As its name suggests, it is a flowchart-like tree structure. It is this property that makes Decision Trees easy to understand and interpret.

Anatomy of a Decision Tree

Today, we expect our machines to be autonomous, intelligent decision makers. We want them to make our lives easier and hassle-free. In this blog, we will learn how a machine can be trained to make such decisions for us.

Image Source: Mike from Pexels

The oldest, shortest words — ‘Yes’ and ‘No’ — are those which require the most thought.
— Pythagoras

Classification is the problem of identifying which of a set of categories a new observation belongs to. Classification comes naturally to human beings: we see similar looking, feeling, or even smelling things and we put them under the same category. This technique is…

When getting into Machine Learning, one of the first things everyone learns is Regression. It is a Supervised Learning technique that helps us model the relationship between two or more variables. It enables us to predict a continuous output (dependent) variable based on one or more predictor (independent) variables.

“An approximate answer to the right problem is worth a good deal more than an exact answer to an approximate problem.”
— John Tukey

Regression Algorithms

There are numerous Regression Algorithms used in Data Science and Machine Learning. Each algorithm suits a different scenario, which largely depends upon the type of…

In this follow-up blog, we shall study the next concepts of the Mathematics behind Machine Learning.

“There are three types of lies — lies, damn lies, and statistics.”
— Benjamin Disraeli

Image Source: Malte Luk from Pexels

We’ve already seen how Linear Algebra and Probability function in the world of Machine Learning. If you haven’t already, read the previous blog first: “Why study Mathematics? : Machine Learning in Python” by Divyansh Chaudhary (Medium, Jan 2021). The next topics we encounter most often while working through data and predictions are Statistics and Calculus.


Statistics is an important tool of Machine Learning. It…

“Mathematics is not about numbers, equations, computation, or algorithms: it is about understanding.”
— William Paul Thurston

Image Source: Lum3n from Pexels

Understanding is a crucial part of the journey to becoming a Machine Learning Professional. One might argue that learning the mathematics behind machine learning is not necessary because Python provides numerous libraries to perform these mathematical operations, but this is a fallacy that has created a false sense of expectation among aspiring ML Professionals.

Mathematics behind Machine Learning

If you are feeling a bit of anxiety even before reading this blog, don’t worry. Mathematics can be quite complex for some people, especially for people coming from non-technical…

With all our gathered knowledge from the previous Data Visualization blogs, Data Visualization and Data Visualization II, our next move should be to gain an even deeper understanding of visualization in Python.

Image Source: Luke Chesser on Unsplash

Python offers the Seaborn library, which provides a high-level interface for drawing attractive and informative statistical graphs. Although Matplotlib is sufficient for plotting basic graphs, customizing its output can be complex. Seaborn, on the other hand, combines its statistical plotting prowess with built-in themes to produce beautiful and informative graphs.

“If Matplotlib “tries to make easy things easy and hard things possible”, seaborn tries to make a well-defined…

Divyansh Chaudhary

Machine Learning and Python Student. Coding Enthusiast. Pursuing a Bachelor’s in Computer Science.
