Machine learning and data science are about understanding your data through concepts from mathematics and statistics. Before you start writing Python code, you should know how to do a few basic calculations. In this blog, I cover some of the maths used in data science.

1. Understanding Mean, Median and Mode:

The mean is the average: add up all the values and divide by the number of values.

Example:- Dataset  (1,2,3,4,5,6,7)

Mean=(1+2+3+4+5+6+7)/7

Mean= 4
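The same calculation can be sketched in Python. This is a minimal example that reuses the dataset above; the variable names are only illustrative, and `statistics.mean` from the standard library is shown as a cross-check.

```python
import statistics

# Dataset from the example above
data = [1, 2, 3, 4, 5, 6, 7]

# Mean = sum of the values divided by the number of values
mean = sum(data) / len(data)
print(mean)  # 4.0

# The standard library gives the same result
assert statistics.mean(data) == mean
```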

The median is the central value of a dataset. To find it, first arrange the values in ascending order. There are two cases: one for a dataset with an odd number of values and one for a dataset with an even number of values.

For an even number of values:

  • Sample Dataset: 1,58,34,56,23,2,45,8
  • Arrange the dataset in ascending order (1,2,8,23,34,45,56,58)
  • Take the two middle values, 23 and 34, and average them: (23+34)/2 = 28.5. This is the median of the dataset above.

For an odd number of values:

  • Sample Dataset: 1,2,8,23,34,45,56
  • Arrange the dataset in ascending order (this one is already sorted).
  • 23 divides the dataset into two equal parts, with three values on each side.
  • So in this case the median is 23.
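Both cases can be handled by one small function. The sketch below assumes a non-empty list of numbers; the function name `median` and the variable names are illustrative and not taken from any library.

```python
def median(values):
    """Return the middle value of a non-empty list of numbers."""
    ordered = sorted(values)  # arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value
        return ordered[mid]
    # Even count: average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

even_data = [1, 58, 34, 56, 23, 2, 45, 8]
odd_data = [1, 2, 8, 23, 34, 45, 56]

print(median(even_data))  # 28.5
print(median(odd_data))   # 23
```

Python's built-in `statistics.median` follows the same rule, so it can be used instead of a hand-written function.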

Mode:- The value that occurs most often in the dataset.

Sample Dataset (1,2,4,3,2,5,6,2,7,8,2)

2 occurs most often (four times), so the mode is 2.
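One way to find the mode in Python is to count how often each value appears. This is a minimal sketch using `collections.Counter` from the standard library; the variable names are illustrative.

```python
from collections import Counter

# Dataset from the example above
data = [1, 2, 4, 3, 2, 5, 6, 2, 7, 8, 2]

# Count the occurrences of each value
counts = Counter(data)

# most_common(1) returns a list with the single most frequent (value, count) pair
mode, frequency = counts.most_common(1)[0]
print(mode, frequency)  # 2 4
```

If several values are tied for the highest count, this approach reports only one of them, so a dataset with more than one mode needs extra handling.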

2. Variance and standard deviation:

Variance and standard deviation are used to understand how scattered your data is. Variance measures how far the data points are spread around the mean. Standard deviation is the square root of the variance.

Let’s see how to calculate variance. 

  1. Calculate the mean of your dataset.
  2. Subtract the mean from each data point and square the result.
  3. Take the average of all the values from step 2. This is the variance of your dataset.
  4. The square root of the variance is the standard deviation.
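The four steps above can be sketched in Python as follows. The dataset (1 to 7) is reused from the mean example purely for illustration, and the variable names are assumptions. Because step 3 divides by the number of data points, this is the population variance.

```python
import math
import statistics

data = [1, 2, 3, 4, 5, 6, 7]

# Step 1: mean of the dataset
mean = sum(data) / len(data)

# Step 2: squared difference between each data point and the mean
squared_diffs = [(x - mean) ** 2 for x in data]

# Step 3: average of the squared differences (population variance)
variance = sum(squared_diffs) / len(data)

# Step 4: standard deviation is the square root of the variance
std_dev = math.sqrt(variance)

print(variance)  # 4.0
print(std_dev)   # 2.0

# Cross-check against the standard library's population functions
assert math.isclose(variance, statistics.pvariance(data))
assert math.isclose(std_dev, statistics.pstdev(data))
```

When the data is a sample drawn from a larger population, the sum in step 3 is usually divided by n-1 instead of n; `statistics.variance` and `statistics.stdev` use that convention.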

Data scientists use these values for further analysis.