
Find the correlation matrix of the Iris dataset using Python

 Q. Find the correlation matrix

Correlation: Correlation is a statistical measure of the extent to which two or more variables are related to each other.

or

Correlation is a statistical measure that describes the degree to which two variables change together. The correlation coefficient is denoted by 'r' and ranges from -1 to +1.
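As a rough sketch of what 'r' measures, the Pearson coefficient can be computed by hand as the covariance of two variables divided by the product of their spreads. The helper name pearson_r is hypothetical, and the sample values are only the ten SepalLengthCm and PetalLengthCm readings visible in the data preview later in this post, not the full dataset.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson r: sum of co-deviations over the product of the deviation norms."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations of x from its mean
    dy = y - y.mean()  # deviations of y from its mean
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))


# Ten readings copied from the rows shown in the data preview (first 5 and last 5 rows)
sepal_length = [5.1, 4.9, 4.7, 4.6, 5.0, 6.7, 6.3, 6.5, 6.2, 5.9]
petal_length = [1.4, 1.4, 1.3, 1.5, 1.4, 5.2, 5.0, 5.2, 5.4, 5.1]

r = pearson_r(sepal_length, petal_length)
print(round(r, 3))
```

The hand-written version should agree with np.corrcoef(sepal_length, petal_length)[0, 1], which is a convenient way to check it.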

Type of Correlations:

Positive Correlation (r > 0): If the value of one variable increases, the value of the other variable also tends to increase.

Negative Correlation (r < 0): If the value of one variable increases, the value of the other variable tends to decrease.

No Correlation (r = 0): There is no linear relationship between the two variables.
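The three cases can be illustrated with a small made-up example. The arrays below are invented for illustration and are not taken from the Iris data. The last case is worth noting: y_sq depends entirely on x, yet r is 0 because the relationship is not linear.

```python
import numpy as np

# Made-up values, symmetric around zero
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])

y_pos = 2 * x + 1  # rises as x rises
y_neg = -3 * x     # falls as x rises
y_sq = x ** 2      # depends on x, but not linearly

# np.corrcoef returns a 2x2 matrix; element [0, 1] is r between the two inputs
r_pos = np.corrcoef(x, y_pos)[0, 1]
r_neg = np.corrcoef(x, y_neg)[0, 1]
r_sq = np.corrcoef(x, y_sq)[0, 1]

print("positive:", r_pos)
print("negative:", r_neg)
print("no linear relationship:", r_sq)
```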

Program using Python

Import Libraries:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

Read CSV File: Download the Iris dataset as a CSV file, then load it with the pd.read_csv() function.

df = pd.read_csv('/Iris.csv')

df

      Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm         Species
0      1            5.1           3.5            1.4           0.2     Iris-setosa
1      2            4.9           3.0            1.4           0.2     Iris-setosa
2      3            4.7           3.2            1.3           0.2     Iris-setosa
3      4            4.6           3.1            1.5           0.2     Iris-setosa
4      5            5.0           3.6            1.4           0.2     Iris-setosa
...  ...            ...           ...            ...           ...             ...
145  146            6.7           3.0            5.2           2.3  Iris-virginica
146  147            6.3           2.5            5.0           1.9  Iris-virginica
147  148            6.5           3.0            5.2           2.0  Iris-virginica
148  149            6.2           3.4            5.4           2.3  Iris-virginica
149  150            5.9           3.0            5.1           1.8  Iris-virginica

150 rows × 6 columns

Display the first 5 rows

df.head()

   Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm      Species
0   1            5.1           3.5            1.4           0.2  Iris-setosa
1   2            4.9           3.0            1.4           0.2  Iris-setosa
2   3            4.7           3.2            1.3           0.2  Iris-setosa
3   4            4.6           3.1            1.5           0.2  Iris-setosa
4   5            5.0           3.6            1.4           0.2  Iris-setosa

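Compute the correlation matrix with the corr() function

# Species is a text column; pandas 2.0 and later raise an error if it is passed to corr(),
# so keep only the numeric columns first
cor = df.select_dtypes(include='number').corr()

cor
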
                      Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm
Id              1.000000       0.716676     -0.397729       0.882747      0.899759
SepalLengthCm   0.716676       1.000000     -0.109369       0.871754      0.817954
SepalWidthCm   -0.397729      -0.109369      1.000000      -0.420516     -0.356544
PetalLengthCm   0.882747       0.871754     -0.420516       1.000000      0.962757
PetalWidthCm    0.899759       0.817954     -0.356544       0.962757      1.000000
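One way to read the matrix is to rank the feature pairs by the strength of their correlation. The sketch below rebuilds the matrix from the values printed above so that it runs without the CSV file. Dropping the Id column is an assumption made here: Id is only a row counter, so its correlations presumably reflect the order of the rows in the file rather than anything about the flowers.

```python
from itertools import combinations

import pandas as pd

cols = ["Id", "SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm"]

# Values copied from the corr() output printed above
cor = pd.DataFrame(
    [
        [1.000000, 0.716676, -0.397729, 0.882747, 0.899759],
        [0.716676, 1.000000, -0.109369, 0.871754, 0.817954],
        [-0.397729, -0.109369, 1.000000, -0.420516, -0.356544],
        [0.882747, 0.871754, -0.420516, 1.000000, 0.962757],
        [0.899759, 0.817954, -0.356544, 0.962757, 1.000000],
    ],
    index=cols,
    columns=cols,
)

# Assumption: Id is a row counter, so leave it out of the ranking
features = cor.drop(index="Id", columns="Id")

# The matrix is symmetric, so visit each unordered pair of columns once
pairs = {(a, b): float(features.loc[a, b]) for a, b in combinations(features.columns, 2)}

# Strongest relationship first, judged by absolute value of r
ranked = sorted(pairs.items(), key=lambda item: abs(item[1]), reverse=True)

for (a, b), r in ranked:
    print(f"{a} vs {b}: {r:+.6f}")
```

With the values above, PetalLengthCm and PetalWidthCm form the strongest pair (0.962757) and SepalLengthCm and SepalWidthCm the weakest (-0.109369).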


Correlation map using heatmap


sns.heatmap(data=cor, annot=True)

[Figure: annotated heatmap of the correlation matrix cor]
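The heatmap call can be tuned so that the colours are easier to compare. The following is a sketch rather than the original post's code: it pins the colour scale to the full range of r (-1 to +1), uses a diverging colormap so positive and negative values get different hues, and formats the annotations to two decimals. The matrix is rebuilt from the printed corr() values so the example runs without the CSV, and the output file name is hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend for scripts; omit this line in a notebook

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

cols = ["Id", "SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm"]

# Values copied from the corr() output printed earlier in the post
cor = pd.DataFrame(
    [
        [1.000000, 0.716676, -0.397729, 0.882747, 0.899759],
        [0.716676, 1.000000, -0.109369, 0.871754, 0.817954],
        [-0.397729, -0.109369, 1.000000, -0.420516, -0.356544],
        [0.882747, 0.871754, -0.420516, 1.000000, 0.962757],
        [0.899759, 0.817954, -0.356544, 0.962757, 1.000000],
    ],
    index=cols,
    columns=cols,
)

fig, ax = plt.subplots(figsize=(7, 6))
sns.heatmap(
    data=cor,
    annot=True,       # write the value of r in every cell
    fmt=".2f",        # two decimals keep the cells readable
    cmap="coolwarm",  # diverging colormap: one hue for negative r, another for positive
    vmin=-1,          # fix the colour scale to the full range of r
    vmax=1,
    square=True,
    ax=ax,
)
ax.set_title("Correlation matrix of the Iris dataset")
fig.tight_layout()
fig.savefig("iris_correlation_heatmap.png")  # hypothetical output file name
```

In a notebook, the figure is rendered inline, so the matplotlib.use("Agg") and fig.savefig() lines can be dropped.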




