Machine Learning: Model Evaluation (Confusion Matrix, ROC, AUC)

Confusion Matrix
A confusion matrix is a performance measurement for machine learning classification.


TP, FP, FN, TN
  • True Positive (TP): predicted positive, and the prediction is correct.
  • False Positive (FP, Type I error): predicted positive, but the prediction is wrong.
  • False Negative (FN, Type II error): predicted negative, but the prediction is wrong.
  • True Negative (TN): predicted negative, and the prediction is correct.
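
To make the four counts concrete, here is a minimal sketch in plain Python; the `y_true` and `y_pred` labels are made up for illustration, not from the article.

```python
# Toy labels: 1 = positive class, 0 = negative class (illustrative values).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

# Tally each cell of the confusion matrix by comparing prediction to truth.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # Type I error
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # Type II error
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

print(tp, fp, fn, tn)  # 3 1 1 3
```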

Evaluation Metrics
  • TPR / Recall / Sensitivity (true positive rate)
    Out of all the positive examples, how many did we predict correctly? It is the fraction of actual positives that are predicted positive (e.g., the probability that a sick patient is correctly diagnosed as sick). When every sample is predicted positive, recall is 100%.
    $\mathsf{TPR = Recall = Sensitivity = \frac{TP}{TP+FN}}$
  • FPR (false positive rate)
    Out of all the negative examples, how many did we predict incorrectly? It is the fraction of actual negatives that are predicted positive (e.g., the probability that a healthy patient is misdiagnosed as sick). When every sample is predicted positive, FPR is 1.
    $\mathsf{FPR = \frac{FP}{TN+FP}}$
  • Precision
    Out of all the examples we predicted positive, how many are actually positive? Raising the threshold for the positive class, so that only high-confidence samples are predicted positive, improves precision, but recall can drop sharply.
    $\mathsf{Precision = \frac{TP}{TP+FP}}$
  • Accuracy
    Out of all examples, how many did we predict correctly?
    $\mathsf{Accuracy = \frac{TP+TN}{TP+FP+FN+TN}}$
  • F-measure
    It is difficult to compare two models when one has low precision and high recall and the other the reverse. The F-score measures recall and precision at the same time, using the harmonic mean in place of the arithmetic mean to punish extreme values more (see the sketch after this list).
    $\mathsf{F\text{-}measure = \frac{2 \cdot Recall \cdot Precision}{Recall + Precision}}$
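
As a quick illustration, the following sketch computes each metric above from the four counts; the count values are assumed to be those of the toy example in the previous snippet.

```python
# Assumed counts from the earlier toy example.
tp, fp, fn, tn = 3, 1, 1, 3

tpr = tp / (tp + fn)                       # Recall / Sensitivity
fpr = fp / (tn + fp)                       # False positive rate
precision = tp / (tp + fp)
accuracy = (tp + tn) / (tp + fp + fn + tn)
f_measure = 2 * tpr * precision / (tpr + precision)  # harmonic mean of P and R

print(f"TPR={tpr:.2f} FPR={fpr:.2f} Precision={precision:.2f} "
      f"Accuracy={accuracy:.2f} F={f_measure:.2f}")
```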
Take a binary credit-card-fraud classification task as an example, with equal numbers of positive and negative samples, where a positive example means fraud. We want to find as many of the fraud (positive) samples as possible, i.e., recall should be as high as possible:
  • Recall is the probability that a true fraud sample is found;
  • FPR is the probability that a non-fraud sample is misjudged as fraud;
  • Precision is the probability that a sample predicted as fraud is truly fraud;
  • Accuracy is the probability that any sample is predicted correctly.
Recall and precision constrain each other: when precision is very high, recall may be very low, and when recall is very high, precision may be very low, as the sketch below illustrates.
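
A minimal sketch of this trade-off, assuming a hypothetical set of predicted fraud probabilities: as the positive-class threshold drops, recall rises while precision falls.

```python
# Hypothetical predicted fraud probabilities and true labels (1 = fraud).
scores = [0.95, 0.90, 0.80, 0.60, 0.55, 0.40, 0.30, 0.10]
labels = [1,    1,    0,    1,    0,    0,    1,    0]

for threshold in (0.9, 0.5, 0.2):
    # Predict fraud whenever the score reaches the threshold.
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(p and t for p, t in zip(preds, labels))
    fp = sum(p and not t for p, t in zip(preds, labels))
    fn = sum((not p) and t for p, t in zip(preds, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn)
    print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")
# threshold=0.9: precision=1.00, recall=0.50
# threshold=0.5: precision=0.60, recall=0.75
# threshold=0.2: precision=0.57, recall=1.00
```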

Area Under the Curve and Receiver Operating Characteristics
The ROC curve plots TPR against FPR, with TPR on the y-axis and FPR on the x-axis. Evaluating a binary classification model at many different classification thresholds produces the ROC curve.
The AUC-ROC curve is a performance measurement for classification problems across threshold settings.
ROC is a probability curve, and AUC represents the degree of separability: it tells how capable the model is of distinguishing between classes. The higher the AUC, the better the model is at predicting 0s as 0s and 1s as 1s; by analogy, the better it is at distinguishing patients with the disease from patients without it. An AUC of 0.5 means the model has no class-separation capability whatsoever.
When all samples are predicted positive, FN = TN = 0 and TPR = FPR = 1. So as the probability threshold for the positive class is lowered from 1.0 to 0.0:
  • TN ↓, FP ↑, so FPR = FP/(TN+FP) ↑
  • FN ↓, TP ↑, so TPR = TP/(TP+FN) ↑
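
A minimal sketch of this threshold sweep using scikit-learn's roc_curve and auc; the labels and scores are illustrative, not real data.

```python
from sklearn.metrics import roc_curve, auc

# Illustrative true labels and predicted positive-class probabilities.
labels = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.95, 0.90, 0.80, 0.60, 0.55, 0.40, 0.30, 0.10]

# roc_curve sweeps the decision threshold from high to low and returns the
# (FPR, TPR) pair at each step; auc integrates the resulting curve.
fpr, tpr, thresholds = roc_curve(labels, scores)
print(list(zip(thresholds, fpr, tpr)))
print("AUC =", auc(fpr, tpr))
```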

Reference:
1. Understanding AUC - ROC Curve
