Supervised Learning

Society of AI
10 min read · Sep 16, 2020


Supervised Learning is one type of Machine Learning.

What is Machine Learning

● According to Arthur Samuel, Machine Learning is “the field of study that gives computers the ability to learn without being explicitly programmed.”

● Tom Mitchell provides a more modern definition: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”

● There are four types of Machine Learning:

○ Supervised learning

○ Unsupervised Learning

○ Semi-supervised Learning

○ Reinforcement Learning

What is Supervised Learning

● Supervised machine learning algorithms are designed to learn by example.

● Supervised learning trains the machine on well-labeled data, meaning that each training example is already tagged with the correct answer.

● The training data consists of inputs paired with the correct outputs. During training, the algorithm searches for patterns in the data that correlate with the desired outputs.

● The objective of a supervised learning model is to predict the correct label for newly presented input data.

● A supervised learning algorithm can be written simply as:

● Y = f(x)

● Here Y is the predicted output, determined by a mapping function f that assigns an output (a class in classification, a number in regression) to an input value x.

● Supervised Machine Learning can be split into two subcategories.

○ (1) Classification

○ (2) Regression


● A classification algorithm is used when the output variable is a category, such as “disease” or “no disease”, or “red” or “blue”.

● Example of classification:

○ Consider the mail classification problem, where we have to decide whether a particular mail is spam or not. Here the output variable is categorized as “spam” or “not spam”.


● Regression is a predictive modelling technique that investigates the relationship between a dependent variable and one or more independent variables. It is used when the output variable is a continuous numeric value.

● Example of regression:

○ Consider the problem of predicting a student’s marks based on the number of hours he/she put into preparation.

List Of Supervised Machine Learning Algorithms

● Linear Regression

● Multiple Variable Linear Regression

● Logistic Regression

● Naive Bayes Classifiers

● K-NN Classification

● Support Vector Machine

● Decision Tree

Linear Regression

● Linear regression predicts the value of a dependent variable (y) based on a given independent variable (x).

● It is a predictive algorithm that models a linear relationship between the dependent variable (call it ‘Y’) and the independent variable (call it ‘X’), hence the name Linear Regression.

● If we draw a linear relationship in two-dimensional space, we get a straight line.

● Linear Regression gives us the straight line that best fits the data points, as shown in the figure below:

Figure: Linear Regression

● The equation of the straight line is

Y = mx + c

where Y = dependent variable, x = independent variable, m = slope, and c = intercept.


There are two types of Linear Regression:

○ Simple Linear Regression

○ Multiple Linear Regression

Simple Linear Regression

● In Simple Linear Regression we try to find the relationship between a single independent variable (input) and a corresponding dependent variable (output). This can be expressed in the form of a straight line.

● The same equation of a straight line can be rewritten for Simple Linear Regression as

● Y = B0 + B1*x + ε


Y = dependent variable

x = independent variable

B0 and B1 represent the intercept and the slope (coefficient), respectively.

ε (epsilon) represents an error term.

Real World Example Of Simple Linear Regression

● Let’s consider the problem of predicting the marks of a student based on the number of hours he/she put into preparation.

● As this is a simple linear regression problem, let’s assume that the marks of a student (M) depend on the number of hours (H) he/she put into preparation.

● The following formula represents the simple linear regression model: Marks = m*Hours + c

● The observed data points (hours and marks) are used to find the line that fits them best.

● The slope of the best-fit line is the value of “m”, and “c” is its intercept.

● The values of m and c are determined using an objective function. For simple linear regression, the objective function is the Mean Squared Error (MSE): the average of the squared differences between the target variable (actual marks) and the predicted value (marks calculated using the above equation). The best-fit line is obtained by minimizing this objective function.

● The following plot represents a simple linear model with a regression line:

Simple Linear Model
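To make the objective function concrete, here is a minimal sketch that computes the MSE of a candidate line and the least-squares values of m and c. The hours and marks values are made up for illustration, and the names `mse`, `m_best` and `c_best` are hypothetical helpers, not part of any library.

```python
import numpy as np

# Hypothetical data: hours of preparation and marks obtained (illustrative only).
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
marks = np.array([35.0, 45.0, 50.0, 65.0, 70.0])


def mse(m, c):
    """Mean squared error of the line Marks = m*Hours + c on the data above."""
    predicted = m * hours + c
    return float(np.mean((marks - predicted) ** 2))


# Closed-form least-squares estimates that minimize the MSE.
hours_centered = hours - hours.mean()
marks_centered = marks - marks.mean()
m_best = float(np.sum(hours_centered * marks_centered) / np.sum(hours_centered ** 2))
c_best = float(marks.mean() - m_best * hours.mean())

print("slope m:", m_best)
print("intercept c:", c_best)
print("MSE of best-fit line:", mse(m_best, c_best))
print("MSE of another line (m=8, c=26):", mse(8.0, 26.0))
```

Any other choice of m and c gives a larger MSE on the same data, which is what "best fit" means here.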

Sample Code For Simple Linear Regression

We can implement simple linear regression using a machine learning library called scikit-learn.


Simple Linear Regression : Code
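A minimal sketch of how this might look with scikit-learn's `LinearRegression`, under the assumption of a small made-up hours/marks data set. It is an illustration and not necessarily the authors' original code.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data (illustrative only). scikit-learn expects a 2-D feature array.
hours = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
marks = np.array([35.0, 45.0, 50.0, 65.0, 70.0])

# Fit Marks = m*Hours + c by least squares.
model = LinearRegression()
model.fit(hours, marks)

slope = float(model.coef_[0])        # m
intercept = float(model.intercept_)  # c

# Predict the marks for a student who prepared for 6 hours.
predicted = float(model.predict(np.array([[6.0]]))[0])

print("m =", slope, "c =", intercept)
print("predicted marks for 6 hours:", predicted)
```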

Multiple Variable Linear Regression

● In Multiple Linear Regression, we try to find the relationship between two or more independent variables (inputs) and the corresponding dependent variable (output).

● The independent variables can be categorical or continuous.

● The following equation describes how the predicted value of Y is related to n (multiple) independent variables:

● Y = B0 + B1*x1 + B2*x2 + … + Bn*xn + ε

o Y = dependent variable

o B0 is the intercept

o B1, B2, …, Bn are the coefficients

o x1, x2, …, xn are the independent variables

o ε (epsilon) represents the error term

Real World Example of Multiple Variable Linear Regression

● Let’s consider the problem of predicting weight reduction based on multiple independent variables such as the age, height and weight of the person and the time spent on exercise.

● The following equation represents the multiple linear regression model:

● Weight Reduction = B1*age + B2*height + B3*weight + B4*timeonexercise + B0

● The values of B0, B1, B2, B3 and B4 are determined using the objective function.

● The objective function is the mean squared error: the average of the squared differences between the actual value and the predicted value over the different values of age, height, weight and timeonexercise.

● Our goal is to find the values of B0, B1, B2, B3 and B4 that minimize the objective function.

Sample Code for Linear Regression

● We can implement multiple variable linear regression using a machine learning library called scikit-learn.

● Example:

Linear Regression : Code
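A minimal sketch with scikit-learn's `LinearRegression` on synthetic data. The feature ranges, the "true" coefficients and the intercept below are invented purely so that the fit can be checked. They say nothing about real weight reduction.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic, noise-free data (assumption for illustration only).
rng = np.random.default_rng(0)
n_samples = 50
age = rng.uniform(20, 60, n_samples)
height = rng.uniform(150, 190, n_samples)
weight = rng.uniform(50, 110, n_samples)
timeonexercise = rng.uniform(0, 10, n_samples)

# One row per person, one column per independent variable.
X = np.column_stack([age, height, weight, timeonexercise])

# Made-up B1..B4 and B0 used to generate the dependent variable.
true_coefficients = np.array([-0.05, 0.01, 0.08, 0.9])
true_intercept = 1.5
y = X @ true_coefficients + true_intercept

# Fitting minimizes the mean squared error over B0..B4.
model = LinearRegression()
model.fit(X, y)

print("estimated B1..B4:", model.coef_)
print("estimated B0:", model.intercept_)
```

Because the synthetic data contains no error term, the estimated coefficients should match the made-up ones up to floating-point precision.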

Logistic Regression

● Logistic Regression is one of the most popular supervised machine learning algorithms. It is used for predicting a categorical dependent variable from a given set of independent variables.

● Generally, in logistic regression the dependent variable is binary, with data coded as either 1 (success/yes/true) or 0 (failure/no/false). Instead of giving an exact value of 0 or 1, the model gives a probability that lies between 0 and 1.

● Logistic Regression is similar to Linear Regression, but Linear Regression is used for solving regression problems while Logistic Regression is used for solving classification problems.

● Logistic Regression can be used for various classification problems such as spam detection, diabetes prediction and cancer detection.

Types Of Logistic Regression

● Based on the dependent variable (target variable), logistic regression can be classified into the following types.

o Binary or Binomial:

▪ In this type, the dependent variable has only two possible values: 1 (success/yes/true) or 0 (failure/no/false).

o Multinomial:

▪ In this type, the dependent variable has 3 or more possible unordered values, that is, values with no quantitative significance. For example, the variable may represent “Type A”, “Type B” or “Type C”.

o Ordinal:

▪ In this type, the dependent variable has 3 or more possible ordered values, that is, values with quantitative significance. For example, the variable may represent “poor”, “good”, “very good” or “excellent”, and the categories can be given scores such as 0, 1, 2, 3.

Logistic Regression Equation

● The Logistic Regression equation can be obtained from the Linear Regression equation as follows.

o Linear Regression equation:

Y = B0 + B1*x1 + B2*x2 + … + Bn*xn

o In Logistic Regression Y can only lie between 0 and 1, so we take the odds by dividing Y by (1-Y):

o Y/(1-Y), which is 0 for Y=0 and tends to infinity as Y approaches 1

o But we need a range from -infinity to +infinity, so we take the logarithm of the odds and model it with the linear expression:

Log(Y/(1-Y)) = B0 + B1*x1 + B2*x2 + … + Bn*xn

…Final Equation.
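Solving the final equation for Y gives the sigmoid function, Y = 1/(1 + e^(-z)), where z is the linear expression on the right-hand side. The short sketch below illustrates that the log-odds and the sigmoid are inverses of each other. The function names `log_odds` and `sigmoid` are chosen here for illustration.

```python
import math


def log_odds(p):
    """Map a probability in (0, 1) to the range (-infinity, +infinity)."""
    return math.log(p / (1.0 - p))


def sigmoid(z):
    """Map any real number z back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


for p in (0.1, 0.5, 0.8):
    z = log_odds(p)
    print("Y =", p, "log-odds =", z, "sigmoid(log-odds) =", sigmoid(z))
```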

Sample Code for Logistic Regression

● We can implement Logistic Regression using a machine learning library called scikit-learn.

Logistic Regression : Code
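A minimal sketch of binary logistic regression with scikit-learn's `LogisticRegression`. The data (hours of preparation against pass = 1 / fail = 0) is made up for illustration, and the default settings are assumed.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: hours of preparation and whether the student passed.
hours = np.array([[1.0], [2.0], [3.0], [4.0], [6.0], [7.0], [8.0], [9.0]])
passed = np.array([0, 0, 0, 0, 1, 1, 1, 1])

model = LogisticRegression()
model.fit(hours, passed)

new_students = np.array([[2.0], [8.0]])

# predict returns the class label (0 or 1) ...
labels = model.predict(new_students)
# ... while predict_proba returns the probability of each class, between 0 and 1.
probabilities = model.predict_proba(new_students)

print("predicted labels:", labels)
print("probabilities [P(fail), P(pass)]:", probabilities)
```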

Naive Bayes Classifiers

● Naïve Bayes is a classification technique based on Bayes’ theorem.

● The algorithm assumes that all the predictors are independent of each other. In simple words, the assumption is that the presence of a feature in a class is independent of the presence of any other feature of the same class. For example, in fruit classification we classify a fruit based on its color, smell and shape, and treat color, smell and shape as independent of one another.

● It is a probabilistic classifier, which means it predicts on the basis of the probability that an object belongs to a class.

● Some popular applications of the Naïve Bayes algorithm are spam filtering, sentiment analysis and classifying articles.

● Bayes’ theorem is also known as Bayes’ Rule or Bayes’ law. It is based on conditional probability.

● The formula for Bayes’ theorem is given as:

o P(L|features) = P(L) P(features|L) / P(features)

▪ Here, P(L|features) is the posterior probability of the class.

▪ P(L) is the prior probability of the class.

▪ P(features|L) is the likelihood, which is the probability of the predictors given the class.

▪ P(features) is the prior probability of the predictors.
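As a worked illustration of the formula, consider a spam filter that looks at a single feature: whether a mail contains a particular word. All probabilities below are hypothetical numbers chosen for the example.

```python
# Hypothetical probabilities (assumptions for illustration only).
p_spam = 0.4                 # P(L): prior probability that a mail is spam
p_word_given_spam = 0.5      # P(features|L): the word appears in a spam mail
p_word_given_not_spam = 0.1  # the word appears in a non-spam mail

# P(features): total probability of seeing the word in any mail.
p_word = p_word_given_spam * p_spam + p_word_given_not_spam * (1.0 - p_spam)

# Bayes' theorem: P(L|features) = P(L) * P(features|L) / P(features)
p_spam_given_word = p_spam * p_word_given_spam / p_word

print("P(word) =", p_word)
print("P(spam | word) =", p_spam_given_word)
```

With these numbers, seeing the word raises the probability of spam from the prior of 0.4 to roughly 0.77.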

Sample Code For Naive Bayes Classifiers

We can implement Naive Bayes classifiers using a machine learning library called scikit-learn.

Naive Bayes Classifiers : Code
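A minimal sketch using scikit-learn's `GaussianNB`, one of several Naive Bayes variants in the library. It assumes continuous features and uses a tiny made-up data set with two numeric features and two classes.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Hypothetical data: two numeric features per sample, two classes (0 and 1).
X = np.array([
    [1.0, 2.1], [1.2, 1.9], [0.8, 2.0], [1.1, 2.2],
    [5.0, 6.1], [5.2, 5.9], [4.8, 6.0], [5.1, 6.2],
])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

model = GaussianNB()
model.fit(X, y)

new_points = np.array([[1.0, 2.0], [5.0, 6.0]])

# Predicted class and posterior probability P(L|features) for each new point.
labels = model.predict(new_points)
posteriors = model.predict_proba(new_points)

print("predicted classes:", labels)
print("posterior probabilities:", posteriors)
```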

K-NN Classification

● K-Nearest Neighbor is one of the simplest Machine Learning algorithms based on the Supervised Learning technique.

● The K-NN algorithm assumes similarity between the new case/data and the available cases, and puts the new case into the category that is most similar to it.

● The K-NN algorithm stores all the available data and classifies a new data point based on similarity.

● The K-NN algorithm can be used for both classification and regression problems, but it is mostly used to solve classification problems.

● Steps:

o First, select the number K of neighbors. The optimum K value varies depending on the data set. It should be large enough that noise does not strongly affect the prediction, and small enough that one factor does not dominate another. Cross-validation is mostly used to find the optimum K value.

o Then calculate the Euclidean distance from the new data point to each available data point.

o Take the K nearest neighbors as per the calculated Euclidean distance.

o Then, among these K neighbors, count the data points in each category.

o Assign the new data point to the category for which the number of neighbors is maximum.

o Model is ready.

o Suppose:

o Category-A

Category-B

So we can see that after applying the K-NN algorithm (k=3), the new data point is in category B.
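The steps above can be sketched in plain Python. The points, the labels and the helper name `knn_predict` are made up for illustration. A library implementation would normally be used in practice.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, new_point, k=3):
    """Classify new_point by majority vote among its k nearest training points."""
    # Euclidean distance from the new point to every available data point.
    distances = [
        (math.dist(point, new_point), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k nearest neighbors.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # Count the neighbors in each category and pick the most common one.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical data: three points in category A, four in category B.
points = [(1, 1), (1, 2), (2, 1), (5, 5), (6, 5), (5, 6), (4, 4)]
labels = ["A", "A", "A", "B", "B", "B", "B"]

print(knn_predict(points, labels, (4, 5), k=3))
print(knn_predict(points, labels, (1.5, 1.5), k=3))
```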

Sample Code For K-NN Classification Algorithm.

We can implement the K-NN classification algorithm using a machine learning library called scikit-learn.

K-NN Classification Algorithm : Code
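A minimal sketch with scikit-learn's `KNeighborsClassifier`, using k=3 and a small made-up data set with two categories. The default distance metric in this class is Euclidean.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical data: three points in category A, four in category B.
X = np.array([[1, 1], [1, 2], [2, 1], [5, 5], [6, 5], [5, 6], [4, 4]], dtype=float)
y = np.array(["A", "A", "A", "B", "B", "B", "B"])

# K = 3 neighbors.
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X, y)

new_points = np.array([[4.0, 5.0], [1.5, 1.5]])
predicted = model.predict(new_points)

print("predicted categories:", predicted)
```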

Support Vector Machine

● Support Vector Machine, or SVM, is one of the most popular machine learning algorithms and is used to solve classification and regression problems.

● The idea of SVM is to create a line or a hyperplane which separates the data into classes.

● The goal of the SVM algorithm is to create the best line or decision boundary that segregates n-dimensional space into classes, so that we can easily put a new data point in the correct category in the future. This best decision boundary is called a hyperplane.

● SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases are called support vectors.

Example:

Suppose we have a dataset and we need to separate the red rectangles from the blue ellipses (let’s say positives from negatives). Our task is to find an ideal line that separates this dataset into two classes (say red and blue).

From both classes, the SVM algorithm considers the points nearest to the line. These points are called support vectors. We then measure the distance between the support vectors and the line. This distance is called the margin. Our goal is to maximize the margin. The hyperplane for which the margin is maximum is the optimal hyperplane.

● Thus SVM tries to make a decision boundary in such a way that the separation between the two classes (the “street”) is as wide as possible.

Sample Code for Support Vector Machine

We can implement the support vector machine algorithm using a machine learning library called scikit-learn.

Support Vector Machine: Code
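A minimal sketch with scikit-learn's `SVC` and a linear kernel, on a small made-up data set of linearly separable "red" and "blue" points. The variable `street_width` is a name chosen here. It is the full width of the street, 2/||w||, so the distance from the hyperplane to the nearest points is half of it.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical, linearly separable data (illustrative only).
X = np.array([
    [1.0, 1.0], [2.0, 1.0], [1.0, 2.0],   # red
    [4.0, 4.0], [5.0, 4.0], [4.0, 5.0],   # blue
])
y = np.array(["red", "red", "red", "blue", "blue", "blue"])

# A linear kernel looks for a separating line (hyperplane) with maximum margin.
model = SVC(kernel="linear")
model.fit(X, y)

# The extreme points that define the hyperplane.
support_vectors = model.support_vectors_

# For a linear kernel, coef_ holds the normal vector w of the hyperplane.
w = model.coef_[0]
street_width = 2.0 / np.linalg.norm(w)

predicted = model.predict(np.array([[1.5, 1.5], [4.5, 4.5]]))

print("support vectors:", support_vectors)
print("street width:", street_width)
print("predicted classes:", predicted)
```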

