Correlation Analysis

Editor · 2022-02-11 · 96 views

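Correlation analysis between variables mainly covers the following cases:

  1. Analyzing the pattern within a single variable
    • Autocorrelation analysis
    • Partial correlation analysis
  2. Analyzing the correlation between any two series of equal length
    • Simple correlation analysis
  3. Simple correlation analysis that allows a lag between the two series
    • Cross-correlation analysis
  4. Analyzing the correlation between two groups of variables
    • Canonical correlation analysis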
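
The first three cases in the list can be sketched with NumPy alone. This is a minimal illustration, not a reference implementation: the helper names `pearson`, `autocorr`, and `cross_corr` are hypothetical, the test signal is synthetic, and the lagged estimates simply correlate the overlapping parts of the series (statistics packages often use a slightly different normalization). Canonical correlation is left out; scikit-learn provides it as `sklearn.cross_decomposition.CCA`.

```python
import numpy as np


def pearson(a, b):
    """Simple (Pearson) correlation between two equal-length sequences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.corrcoef(a, b)[0, 1])


def autocorr(x, lag):
    """Autocorrelation: correlate x[t] with x[t + lag] (lag >= 0)."""
    x = np.asarray(x, dtype=float)
    if lag == 0:
        return 1.0
    return pearson(x[:-lag], x[lag:])


def cross_corr(x, y, lag):
    """Cross-correlation: correlate x[t] with y[t + lag] (lag >= 0)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if lag == 0:
        return pearson(x, y)
    return pearson(x[:-lag], y[lag:])


# Synthetic data (an assumption for illustration): a noisy sine wave with
# period 20, and a copy of it delayed by 5 steps.
rng = np.random.default_rng(0)
t = np.arange(200)
x = np.sin(2 * np.pi * t / 20) + 0.1 * rng.standard_normal(t.size)
y = np.roll(x, 5)  # y[t] = x[t - 5], with wrap-around at the start

print('autocorrelation at lag 20:', autocorr(x, 20))
print('autocorrelation at lag 10:', autocorr(x, 10))
print('simple correlation x vs y:', cross_corr(x, y, 0))
print('cross-correlation at lag 5:', cross_corr(x, y, 5))
```

The simple correlation between `x` and `y` is close to zero even though `y` is just a delayed copy of `x`; only the lagged comparison reveals the relationship, which is the point of cross-correlation analysis.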

Drawing correlation plots

I. Correlation matrix plot

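    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    from sklearn import datasets

    # Load iris into a DataFrame and attach the species name for each row
    iris = datasets.load_iris()
    iris_data = pd.DataFrame(iris.data, columns=iris.feature_names)
    iris_data['species'] = iris.target_names[iris.target]
    print(iris_data)

    # Correlation is only defined for the numeric columns, so drop the label
    df = iris_data.drop(columns='species')

    # Pearson correlation matrix (the default method of DataFrame.corr)
    corr = df.corr()

    # Show every row and column when printing the matrix
    pd.set_option('display.max_columns', None)
    pd.set_option('display.max_rows', None)
    print('corr: \n', corr)

    # corrplot is the helper defined in the next section; it must be
    # defined (or imported) before this call runs
    corrplot(corr, cmap='Spectral', s=2000)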

The corrplot function

    import matplotlib.pyplot as plt
    import numpy as np


    def corrplot(corr, cmap, s):
        """Draw a correlation matrix as circles sized and colored by coefficient."""
        x, y, z = [], [], []
        N = corr.shape[0]
        for row in range(N):
            for column in range(N):
                x.append(row)
                y.append(N - 1 - column)  # first variable goes at the top
                z.append(round(corr.iloc[row, column], 2))
        # Marker area scales with |r|; color encodes the signed value in [-1, 1]
        sc = plt.scatter(x, y, c=z, vmin=-1, vmax=1, s=s * np.absolute(z), cmap=cmap)
        plt.colorbar(sc)
        plt.xlim((-0.5, N - 0.5))
        plt.ylim((-0.5, N - 0.5))
        plt.xticks(range(N), corr.columns, rotation=90)
        plt.yticks(range(N)[::-1], corr.columns)
        plt.grid(False)
        ax = plt.gca()
        ax.xaxis.set_ticks_position('top')

        # Light grid lines on the N - 1 boundaries between cells
        internal_space = [0.5 + k for k in range(N - 1)]
        for m in internal_space:
            plt.plot([m, m], [-0.5, N - 0.5], c='lightgray')
            plt.plot([-0.5, N - 0.5], [m, m], c='lightgray')
        plt.show()

The iris dataset

     sepal length (cm)  sepal width (cm)  ...  petal width (cm)    species
0                  5.1               3.5  ...               0.2     setosa
1                  4.9               3.0  ...               0.2     setosa
2                  4.7               3.2  ...               0.2     setosa
3                  4.6               3.1  ...               0.2     setosa
4                  5.0               3.6  ...               0.2     setosa
..                 ...               ...  ...               ...        ...
145                6.7               3.0  ...               2.3  virginica
146                6.3               2.5  ...               1.9  virginica
147                6.5               3.0  ...               2.0  virginica
148                6.2               3.4  ...               2.3  virginica
149                5.9               3.0  ...               1.8  virginica

The computed correlation coefficient matrix

                   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)
sepal length (cm)           1.000000         -0.117570           0.871754          0.817941
sepal width (cm)           -0.117570          1.000000          -0.428440         -0.366126
petal length (cm)           0.871754         -0.428440           1.000000          0.962865
petal width (cm)            0.817941         -0.366126           0.962865          1.000000
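
Each entry in this matrix is a Pearson coefficient, which is what `DataFrame.corr` computes by default: the covariance of two columns divided by the product of their standard deviations. As a small check of that definition, the sketch below recomputes the coefficient by hand for the first five rows of the iris table shown above and compares it with `np.corrcoef`. The helper name `pearson_r` is hypothetical, and because only five rows are used, the value will not match the full-dataset entry of -0.117570.

```python
import numpy as np

# First five rows of the iris table shown above
sepal_length = np.array([5.1, 4.9, 4.7, 4.6, 5.0])
sepal_width = np.array([3.5, 3.0, 3.2, 3.1, 3.6])


def pearson_r(a, b):
    """Pearson r: covariance of a and b over the product of their std devs."""
    da = a - a.mean()  # deviations from the mean
    db = b - b.mean()
    return float((da * db).sum() / np.sqrt((da ** 2).sum() * (db ** 2).sum()))


r_manual = pearson_r(sepal_length, sepal_width)
r_numpy = float(np.corrcoef(sepal_length, sepal_width)[0, 1])
print('manual:', r_manual)
print('numpy: ', r_numpy)
```

The normalization factors cancel between numerator and denominator, so the result is the same whether the sample or the population variance is used.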

[Figure: correlation matrix plot of the iris features]

II. Correlation hierarchy plot (dendrogram)

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    from scipy.cluster.hierarchy import dendrogram, linkage
    from scipy.spatial.distance import pdist

    mtcars = pd.read_csv('data/mtcars.csv', index_col=0)
    print(mtcars)

    # Turn correlations into dissimilarities: d = sqrt(1 - r^2),
    # so strongly correlated variables (|r| near 1) get d near 0
    corr = mtcars.corr()
    d = np.sqrt(1 - corr * corr)
    # Rounding can push r^2 slightly above 1 on the diagonal, giving NaN
    d = d.fillna(0)
    print(d)

    # Ward clustering on the Euclidean distances between the rows of d
    row_cluster = linkage(pdist(d, metric='euclidean'), method='ward')
    row_dendr = dendrogram(row_cluster, labels=d.index.tolist())
    plt.ylabel('Euclidean distance')
    # Reference cut line at height 1.5
    plt.axhline(1.5, c='gray', linestyle='--')
    plt.tight_layout()
    plt.show()

mtcars.csv

"","mpg","cyl","disp","hp","drat","wt","qsec","vs","am","gear","carb"
"Mazda RX4",21,6,160,110,3.9,2.62,16.46,0,1,4,4
"Mazda RX4 Wag",21,6,160,110,3.9,2.875,17.02,0,1,4,4
"Datsun 710",22.8,4,108,93,3.85,2.32,18.61,1,1,4,1
"Hornet 4 Drive",21.4,6,258,110,3.08,3.215,19.44,1,0,3,1
"Hornet Sportabout",18.7,8,360,175,3.15,3.44,17.02,0,0,3,2
"Valiant",18.1,6,225,105,2.76,3.46,20.22,1,0,3,1
"Duster 360",14.3,8,360,245,3.21,3.57,15.84,0,0,3,4
"Merc 240D",24.4,4,146.7,62,3.69,3.19,20,1,0,4,2
"Merc 230",22.8,4,140.8,95,3.92,3.15,22.9,1,0,4,2
"Merc 280",19.2,6,167.6,123,3.92,3.44,18.3,1,0,4,4
"Merc 280C",17.8,6,167.6,123,3.92,3.44,18.9,1,0,4,4
"Merc 450SE",16.4,8,275.8,180,3.07,4.07,17.4,0,0,3,3
"Merc 450SL",17.3,8,275.8,180,3.07,3.73,17.6,0,0,3,3
"Merc 450SLC",15.2,8,275.8,180,3.07,3.78,18,0,0,3,3
"Cadillac Fleetwood",10.4,8,472,205,2.93,5.25,17.98,0,0,3,4
"Lincoln Continental",10.4,8,460,215,3,5.424,17.82,0,0,3,4
"Chrysler Imperial",14.7,8,440,230,3.23,5.345,17.42,0,0,3,4
"Fiat 128",32.4,4,78.7,66,4.08,2.2,19.47,1,1,4,1
"Honda Civic",30.4,4,75.7,52,4.93,1.615,18.52,1,1,4,2
"Toyota Corolla",33.9,4,71.1,65,4.22,1.835,19.9,1,1,4,1
"Toyota Corona",21.5,4,120.1,97,3.7,2.465,20.01,1,0,3,1
"Dodge Challenger",15.5,8,318,150,2.76,3.52,16.87,0,0,3,2
"AMC Javelin",15.2,8,304,150,3.15,3.435,17.3,0,0,3,2
"Camaro Z28",13.3,8,350,245,3.73,3.84,15.41,0,0,3,4
"Pontiac Firebird",19.2,8,400,175,3.08,3.845,17.05,0,0,3,2
"Fiat X1-9",27.3,4,79,66,4.08,1.935,18.9,1,1,4,1
"Porsche 914-2",26,4,120.3,91,4.43,2.14,16.7,0,1,5,2
"Lotus Europa",30.4,4,95.1,113,3.77,1.513,16.9,1,1,5,2
"Ford Pantera L",15.8,8,351,264,4.22,3.17,14.5,0,1,5,4
"Ferrari Dino",19.7,6,145,175,3.62,2.77,15.5,0,1,5,6
"Maserati Bora",15,8,301,335,3.54,3.57,14.6,0,1,5,8
"Volvo 142E",21.4,4,121,109,4.11,2.78,18.6,1,1,4,2

Result of reading the mtcars dataset:

                    mpg  cyl   disp   hp  drat  ...   qsec  vs  am  gear  carb
Mazda RX4            21.0    6  160.0  110  3.90  ...  16.46   0   1     4     4
Mazda RX4 Wag        21.0    6  160.0  110  3.90  ...  17.02   0   1     4     4
Datsun 710           22.8    4  108.0   93  3.85  ...  18.61   1   1     4     1
Hornet 4 Drive       21.4    6  258.0  110  3.08  ...  19.44   1   0     3     1
Hornet Sportabout    18.7    8  360.0  175  3.15  ...  17.02   0   0     3     2
Valiant              18.1    6  225.0  105  2.76  ...  20.22   1   0     3     1
Duster 360           14.3    8  360.0  245  3.21  ...  15.84   0   0     3     4
Merc 240D            24.4    4  146.7   62  3.69  ...  20.00   1   0     4     2
Merc 230             22.8    4  140.8   95  3.92  ...  22.90   1   0     4     2
Merc 280             19.2    6  167.6  123  3.92  ...  18.30   1   0     4     4
Merc 280C            17.8    6  167.6  123  3.92  ...  18.90   1   0     4     4
Merc 450SE           16.4    8  275.8  180  3.07  ...  17.40   0   0     3     3
Merc 450SL           17.3    8  275.8  180  3.07  ...  17.60   0   0     3     3
Merc 450SLC          15.2    8  275.8  180  3.07  ...  18.00   0   0     3     3
Cadillac Fleetwood   10.4    8  472.0  205  2.93  ...  17.98   0   0     3     4
Lincoln Continental  10.4    8  460.0  215  3.00  ...  17.82   0   0     3     4
Chrysler Imperial    14.7    8  440.0  230  3.23  ...  17.42   0   0     3     4
Fiat 128             32.4    4   78.7   66  4.08  ...  19.47   1   1     4     1
Honda Civic          30.4    4   75.7   52  4.93  ...  18.52   1   1     4     2
Toyota Corolla       33.9    4   71.1   65  4.22  ...  19.90   1   1     4     1
Toyota Corona        21.5    4  120.1   97  3.70  ...  20.01   1   0     3     1
Dodge Challenger     15.5    8  318.0  150  2.76  ...  16.87   0   0     3     2
AMC Javelin          15.2    8  304.0  150  3.15  ...  17.30   0   0     3     2
Camaro Z28           13.3    8  350.0  245  3.73  ...  15.41   0   0     3     4
Pontiac Firebird     19.2    8  400.0  175  3.08  ...  17.05   0   0     3     2
Fiat X1-9            27.3    4   79.0   66  4.08  ...  18.90   1   1     4     1
Porsche 914-2        26.0    4  120.3   91  4.43  ...  16.70   0   1     5     2
Lotus Europa         30.4    4   95.1  113  3.77  ...  16.90   1   1     5     2
Ford Pantera L       15.8    8  351.0  264  4.22  ...  14.50   0   1     5     4
Ferrari Dino         19.7    6  145.0  175  3.62  ...  15.50   0   1     5     6
Maserati Bora        15.0    8  301.0  335  3.54  ...  14.60   0   1     5     8
Volvo 142E           21.4    4  121.0  109  4.11  ...  18.60   1   1     4     2

The distance matrix d = sqrt(1 - r^2) computed from the correlation matrix

           mpg       cyl          disp  ...        am      gear      carb
mpg   0.000000  0.523278  5.307133e-01  ...  0.800126  0.877113  0.834555
cyl   0.523278  0.000000  4.316673e-01  ...  0.852574  0.870207  0.849873
disp  0.530713  0.431667  2.107342e-08  ...  0.806505  0.831470  0.918691
hp    0.630526  0.554104  6.118826e-01  ...  0.969975  0.992068  0.661650
drat  0.732124  0.714203  7.039859e-01  ...  0.701458  0.714525  0.995870
wt    0.497159  0.622656  4.598822e-01  ...  0.721422  0.812266  0.903965
qsec  0.908132  0.806494  9.010583e-01  ...  0.973224  0.977121  0.754544
vs    0.747698  0.585307  7.037821e-01  ...  0.985728  0.978547  0.821917
am    0.800126  0.852574  8.065052e-01  ...  0.000000  0.607841  0.998344
gear  0.877113  0.870207  8.314703e-01  ...  0.607841  0.000000  0.961709
carb  0.834555  0.849873  9.186911e-01  ...  0.998344  0.961709  0.000000

[Figure: correlation hierarchy plot (dendrogram) of the mtcars variables]
