Model Evaluation
Symbol | Meaning |
---|---|
$y$ | Ground-truth labels (a vector) |
$\hat y$ | Predicted labels (a vector) |
Metrics for Classification
Accuracy
Example:
$y = (0,0,1,0,1)$
$\hat y = (1,0,1,0,1)$
$$
\begin{aligned} \mathrm{Accuracy} &= \mathrm{sum}(y == \hat y)/\mathrm{len}(y) \\ &= \mathrm{sum}((0,0,1,0,1) == (1,0,1,0,1))/5 \\ &= 4/5 \end{aligned}
$$
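The accuracy formula above can be sketched in plain Python. This is a minimal illustration, not a library implementation; the function name `accuracy` and the variable names `y` and `y_hat` are chosen here for the example.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Count the positions where true label and prediction agree.
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


y = (0, 0, 1, 0, 1)      # ground-truth labels from the example
y_hat = (1, 0, 1, 0, 1)  # predicted labels from the example

print(accuracy(y, y_hat))  # 0.8, i.e. 4/5
```

Only the first position disagrees, so 4 of the 5 predictions are counted as correct.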
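Precision
When the positive and negative classes are imbalanced, accuracy can misrepresent how well the model actually performs, so additional metrics are needed to weigh the trade-offs.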
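To make the imbalance problem concrete, here is a small sketch with made-up data: the 95/5 class split and the "always predict negative" model are assumptions for illustration only, not taken from the original example.

```python
# Hypothetical imbalanced data set: 95 negatives and 5 positives.
y = [0] * 95 + [1] * 5
# A trivial model that predicts the negative class for every sample.
y_hat = [0] * 100

# Accuracy: fraction of positions where prediction and truth agree.
acc = sum(1 for t, p in zip(y, y_hat) if t == p) / len(y)
# Number of positive samples the model actually identified.
found = sum(1 for t, p in zip(y, y_hat) if t == 1 and p == 1)

print(acc)    # 0.95
print(found)  # 0
```

The accuracy looks high even though the model never finds a single positive sample, which is why accuracy alone is not enough here.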
Example:
$y = (0,0,1,0,1)$
$\hat y = (1,0,1,0,1)$
Precision is the fraction of samples predicted as positive that are actually positive:
$$
\begin{aligned} \mathrm{Precision} &= \mathrm{sum}(y == 1\ \mathrm{and}\ \hat y == 1)/\mathrm{sum}(\hat y == 1) \\ &= \mathrm{sum}((0,0,1,0,1) == 1\ \mathrm{and}\ (1,0,1,0,1) == 1)/\mathrm{sum}((1,0,1,0,1) == 1) \\ &= 2/3 \end{aligned}
$$
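The precision formula can be sketched the same way in plain Python. The function name `precision`, the `positive` parameter, and the choice to return 0.0 when nothing is predicted positive are assumptions made for this example.

```python
def precision(y_true, y_pred, positive=1):
    """True positives divided by all samples predicted as positive."""
    # Samples that are positive and were also predicted positive.
    true_pos = sum(1 for t, p in zip(y_true, y_pred)
                   if t == positive and p == positive)
    # All samples predicted positive, whether correct or not.
    pred_pos = sum(1 for p in y_pred if p == positive)
    if pred_pos == 0:
        return 0.0  # assumed convention: no positive predictions at all
    return true_pos / pred_pos


y = (0, 0, 1, 0, 1)
y_hat = (1, 0, 1, 0, 1)

print(precision(y, y_hat))  # 2/3, roughly 0.667
```

Three samples are predicted positive and two of them really are positive, which gives 2/3.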
Recall
Example:
$y = (0,0,1,0,1)$
$\hat y = (1,0,1,0,1)$
Recall is the fraction of actually positive samples that the model predicts as positive:
$$
\begin{aligned} \mathrm{Recall} &= \mathrm{sum}(y == 1\ \mathrm{and}\ \hat y == 1)/\mathrm{sum}(y == 1) \\ &= \mathrm{sum}((0,0,1,0,1) == 1\ \mathrm{and}\ (1,0,1,0,1) == 1)/\mathrm{sum}((0,0,1,0,1) == 1) \\ &= 2/2 \end{aligned}
$$
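Recall differs from precision only in the denominator, which a short sketch makes visible. As before, the function name `recall`, the `positive` parameter, and the 0.0 fallback for data without positive samples are assumptions for illustration.

```python
def recall(y_true, y_pred, positive=1):
    """True positives divided by all samples that are actually positive."""
    # Samples that are positive and were also predicted positive.
    true_pos = sum(1 for t, p in zip(y_true, y_pred)
                   if t == positive and p == positive)
    # All samples whose true label is positive.
    actual_pos = sum(1 for t in y_true if t == positive)
    if actual_pos == 0:
        return 0.0  # assumed convention: no positive samples in the data
    return true_pos / actual_pos


y = (0, 0, 1, 0, 1)
y_hat = (1, 0, 1, 0, 1)

print(recall(y, y_hat))  # 1.0, i.e. 2/2
```

Both positive samples are found, so recall is 1 even though one negative sample was wrongly flagged as positive.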
F1
$F1 = 2 \cdot P \cdot R/(P+R)$, where $P$ is the precision and $R$ is the recall.
Example:
Computing F1 for the example above:
$$
\begin{aligned} F1 &= 2 \cdot P \cdot R/(P+R) \\ &= (2 \cdot 2/3 \cdot 2/2)/(2/3 + 2/2) \\ &= 4/5 \end{aligned}
$$
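The F1 arithmetic can be checked with exact fractions. This sketch uses Python's standard `fractions.Fraction` so the result prints as 4/5 rather than a rounded float; the helper names `precision_recall` and `f1`, and the zero fallbacks, are assumptions for this example.

```python
from fractions import Fraction


def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) as exact fractions."""
    tp = sum(1 for t, p in zip(y_true, y_pred)
             if t == positive and p == positive)
    pred_pos = sum(1 for p in y_pred if p == positive)
    actual_pos = sum(1 for t in y_true if t == positive)
    # Assumed convention: a metric with an empty denominator is 0.
    p = Fraction(tp, pred_pos) if pred_pos else Fraction(0)
    r = Fraction(tp, actual_pos) if actual_pos else Fraction(0)
    return p, r


def f1(p, r):
    """F1 = 2 * P * R / (P + R), with 0 when both P and R are 0."""
    if p + r == 0:
        return Fraction(0)
    return 2 * p * r / (p + r)


y = (0, 0, 1, 0, 1)
y_hat = (1, 0, 1, 0, 1)

p, r = precision_recall(y, y_hat)
print(p, r, f1(p, r))  # 2/3 1 4/5
```

With $P = 2/3$ and $R = 1$, the numerator is $4/3$ and the denominator is $5/3$, which gives $4/5$.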