Metrics

This chapter introduces popular metrics for ML applications:

Classification - confusion matrix, accuracy, precision, recall, F1-score, ROC, AUC.
Regression - MSE, MAE, RMSE, R squared.
Recommender system (learning to rank) - MRR, P@k, AP@N, mAP@N, NDCG.

Classification

Confusion matrix
- True positive(TP): predict positive, actual positive
- True negative(TN): predict negative, actual negative
- False positive(FP): predict positive, actual negative
- False negative(FN): predict negative, actual positive
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Precision = TP / (TP + FP)
Recall (True positive rate) = TP / (TP + FN)
False positive rate = FP / (FP + TN)
F1-score = 2 * Precision * Recall / (Precision + Recall)
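
The confusion-matrix formulas above can be sketched in plain Python. This is a minimal illustration, not a library implementation: the function names and the sample labels are made up for the example, labels are assumed to be binary (1 = positive, 0 = negative), and returning 0.0 when a denominator is zero is one possible convention.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred):
    """Compute the metrics defined above from the confusion counts."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Assumption: fall back to 0.0 when a denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "fpr": fpr, "f1": f1}


# Hypothetical labels, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(confusion_counts(y_true, y_pred))  # (TP, TN, FP, FN)
print(metrics)
```

With these sample labels there are 3 true positives, 3 true negatives, 1 false positive and 1 false negative, so accuracy, precision, recall and F1 all come out to 0.75.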
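ROC curve: plots the true positive rate (y-axis) against the false positive rate (x-axis) as the decision threshold sweeps from 0 to 1.
AUC: area under the ROC curve. 1 means every positive is ranked above every negative, 0.5 is no better than random guessing, and 0 means the ranking is perfectly inverted.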
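
As a rough sketch of how the ROC curve and AUC can be computed by hand: sweep the threshold over the distinct scores, record the (false positive rate, true positive rate) pair at each step, and integrate with the trapezoidal rule. The function names and sample scores are illustrative assumptions, and the code assumes both classes are present in `y_true`.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points from sweeping the threshold over the scores.

    Assumes binary labels and at least one positive and one negative.
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    # Start above the highest score (nothing predicted positive),
    # then lower the threshold through each distinct score.
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    points = []
    for thr in thresholds:
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical scores, for illustration only.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
points = roc_points(y_true, scores)
roc_auc = auc(points)
print(points)
print(roc_auc)
```

For this small sample the curve passes through (0, 0), (0, 0.5), (0.5, 0.5), (0.5, 1) and (1, 1), giving an AUC of 0.75.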

Regression

MSE = mean((y - y_pred)^2)
MAE = mean(abs(y-y_pred))
RMSE = sqrt(MSE)
R_squared = 1 - SSR/SST = 1 - sum((y - y_pred)^2)/sum((y - y_avg)^2), where SSR is the sum of squared residuals and SST is the total sum of squares
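
The four regression formulas translate directly into a few lines of Python. The sketch below uses only the standard library. The function name and the sample values are assumptions for illustration, and it assumes `y` is not constant (otherwise SST is zero and R squared is undefined).

```python
import math


def regression_metrics(y, y_pred):
    """Compute MSE, MAE, RMSE and R squared as defined above."""
    n = len(y)
    residuals = [a - b for a, b in zip(y, y_pred)]
    ss_res = sum(r * r for r in residuals)       # sum of squared residuals
    mse = ss_res / n
    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(mse)
    y_avg = sum(y) / n
    ss_tot = sum((a - y_avg) ** 2 for a in y)    # total sum of squares
    r_squared = 1 - ss_res / ss_tot              # assumes ss_tot != 0
    return {"mse": mse, "mae": mae, "rmse": rmse, "r_squared": r_squared}


# Hypothetical targets and predictions, for illustration only.
y = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
reg = regression_metrics(y, y_pred)
print(reg)
```

Here the residuals are 0.5, -0.5, 0 and -1, so MSE is 0.375, MAE is 0.5, and R squared is roughly 0.949.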

Learning to Rank (Recommender systems)

Learning to rank means predicting the rank (order) of relevant items for a given task.

Mean reciprocal rank (MRR)

Average of the reciprocal ranks of “the first relevant item” for a set of queries. MRR = mean(1/rank).
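
A minimal sketch of MRR, assuming each query is given as a ranked list of predicted items plus a set of relevant items. The names and the sample queries are hypothetical, and scoring a query as 0 when no relevant item appears in its list is a common convention adopted here as an assumption.

```python
def reciprocal_rank(predicted, relevant):
    """1 / rank of the first relevant item (ranks start at 1)."""
    for rank, item in enumerate(predicted, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0  # assumption: no relevant item retrieved counts as 0


def mean_reciprocal_rank(queries):
    """Average the reciprocal rank over (predicted, relevant) pairs."""
    return sum(reciprocal_rank(p, r) for p, r in queries) / len(queries)


# Hypothetical queries, for illustration only.
queries = [
    (["a", "b", "c"], {"a"}),       # first relevant item at rank 1
    (["x", "y", "z"], {"y"}),       # first relevant item at rank 2
    (["p", "q", "r", "s"], {"s"}),  # first relevant item at rank 4
]
mrr = mean_reciprocal_rank(queries)
print(mrr)
```

For these three queries the reciprocal ranks are 1, 1/2 and 1/4, so MRR = (1 + 0.5 + 0.25) / 3, roughly 0.583.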

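Precision @ k

Fraction of the top k items that are relevant.

P@k = (# relevant items in the top k) / k
AP@N = 1/n * sum over k = 1..N of (P@k * rel(k)), where rel(k) is 1 if the item at rank k is relevant and 0 otherwise, and n is the number of relevant items (the example shows two common choices for n)
mAP@N = mean(AP@N) over all queries (users)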

Example:

true_items = {"a", "b", "c", "d", "e", "k"}
predict_items = ["a", "f", "d", "e", "g"]
relevant_list = [1, 0, 1, 1, 0]
AP@N = 1/len(true_items) * (1/1 + 0/2 + 2/3 + 3/4 + 0/5)   (normalize by all relevant items)
or
AP@N = 1/sum(relevant_list) * (1/1 + 0/2 + 2/3 + 3/4 + 0/5)   (normalize by the relevant items actually retrieved)
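
The example above can be reproduced with a short sketch. The helper names are illustrative assumptions. Both normalizations from the example are computed so the difference is visible, and mAP@N is simply the mean of AP@N over several queries.

```python
def precision_at_k(relevant_list, k):
    """Fraction of the top k items that are relevant."""
    return sum(relevant_list[:k]) / k


def average_precision(relevant_list, n_relevant):
    """Sum P@k over the ranks k that hold a relevant item, divided by n_relevant."""
    total = sum(precision_at_k(relevant_list, k)
                for k, rel in enumerate(relevant_list, start=1) if rel)
    return total / n_relevant if n_relevant else 0.0


def mean_average_precision(ap_values):
    """mAP: mean of per-query AP values."""
    return sum(ap_values) / len(ap_values)


true_items = {"a", "b", "c", "d", "e", "k"}
predict_items = ["a", "f", "d", "e", "g"]
relevant_list = [1 if item in true_items else 0 for item in predict_items]

# Normalize by all relevant items (first variant in the example).
ap_all = average_precision(relevant_list, len(true_items))
# Normalize by the relevant items actually retrieved (second variant).
ap_hits = average_precision(relevant_list, sum(relevant_list))
print(relevant_list, ap_all, ap_hits)
```

The numerator is 1/1 + 2/3 + 3/4 in both cases. Dividing by 6 gives about 0.403, and dividing by 3 gives about 0.806, so it matters which convention a given library or paper uses.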

Normalized Discounted Cumulative Gain (NDCG)

Cumulative Gain : Sum of all relevance values in a search result list, sum(rel_i).
Discounted Cumulative Gain : Relevance values discounted by the log of their position, sum(rel_i / log2(i+1)), where i is the 1-based rank.

Example:

true_items = ["a", "b", "c", "d", "e", "k"]
relevant_scores = [6, 5, 4, 3, 2, 1]
predict_items = ["a", "f", "d", "e", "g"]
relevant_list = [6, 0, 3, 2, 0]
DCG = 6/1 + 0 + 3/2 + 2/2.32 + 0
ideal_relevant_list = [6, 3, 2, 0, 0]
IDCG = 6/1 + 3/1.58 + 2/2 + 0 + 0
NDCG = DCG / IDCG
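
The same calculation as a small Python sketch. The function names are assumptions for illustration. Following the example, the ideal ordering is obtained by sorting the relevance values of the predicted list. Another common convention builds the ideal list from the top-scoring true items instead, which would give a larger IDCG here.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: sum(rel_i / log2(i + 1)), i starting at 1."""
    return sum(rel / math.log2(i + 1)
               for i, rel in enumerate(relevances, start=1))


def ndcg(relevances):
    """DCG divided by the DCG of the same relevances sorted descending."""
    ideal = sorted(relevances, reverse=True)
    idcg = dcg(ideal)
    return dcg(relevances) / idcg if idcg else 0.0


true_items = ["a", "b", "c", "d", "e", "k"]
relevant_scores = [6, 5, 4, 3, 2, 1]
score_of = dict(zip(true_items, relevant_scores))

predict_items = ["a", "f", "d", "e", "g"]
# Items that are not in true_items get relevance 0.
relevant_list = [score_of.get(item, 0) for item in predict_items]

print(relevant_list)
print(dcg(relevant_list), ndcg(relevant_list))
```

For this data DCG is about 8.36, IDCG is about 8.89, and NDCG is about 0.94.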
