TY - GEN
T1 - A methodology for comparing classifiers that allow the control of bias
AU - Zamolotskikh, Anton
AU - Delany, Sarah Jane
AU - Cunningham, Pádraig
PY - 2006
Y1 - 2006
N2 - This paper presents False Positive-Critical Classifiers Comparison, a new technique for the pairwise comparison of classifiers that allow the control of bias. An evaluation of Naïve Bayes, k-Nearest Neighbour and Support Vector Machine classifiers has been carried out on five datasets containing unsolicited and legitimate e-mail messages to confirm the advantage of the technique over Receiver Operating Characteristic (ROC) curves. The evaluation results suggest that the technique may be useful for choosing the better classifier when the ROC curves do not show clear differences, as well as for showing that the difference between two classifiers is not significant when ROC suggests that it might be. Spam filtering is a typical application for such a comparison tool, as it requires a classifier to be biased toward negative prediction and to have an upper limit on the rate of false positives. Finally, the evaluation summary is presented, which confirms that Support Vector Machines outperform the other methods in most cases, while the Naïve Bayes classifier works well in a narrow but relevant range of false positive rates.
UR - http://www.scopus.com/inward/record.url?scp=33751052281&partnerID=8YFLogxK
U2 - 10.1145/1141277.1141411
DO - 10.1145/1141277.1141411
M3 - Conference contribution
AN - SCOPUS:33751052281
SN - 1595931082
SN - 9781595931085
T3 - Proceedings of the ACM Symposium on Applied Computing
SP - 582
EP - 587
BT - Applied Computing 2006 - Proceedings of the 2006 ACM Symposium on Applied Computing
PB - Association for Computing Machinery
T2 - 2006 ACM Symposium on Applied Computing
Y2 - 23 April 2006 through 27 April 2006
ER -