Adversarial Pattern Classification - Tutorial proposal at ICML 2011
Topic overview
Pattern classifiers are currently used in several applications, such as biometric recognition, spam filtering, and intrusion detection in computer networks, in which the goal of the classification task is usually to discriminate between a 'legitimate' and a 'malicious' pattern class (for instance, legitimate and spam e-mails, legitimate and intrusive network traffic, genuine and impostor users in biometric systems). However, these tasks differ from traditional pattern recognition tasks, since the samples of the 'malicious' class are generated by an intelligent, adaptive adversary, who can actively manipulate data to get his samples misclassified as legitimate. Well-known examples of adversarial behaviour are the tricks used by spammers to camouflage their spam e-mails (like misspelling typical spam words, or embedding the spam message into an attached image, to evade spam filters based on the analysis of the e-mail's body text), the techniques used by hackers to camouflage their intrusive network packets, and potential attacks against biometric systems (like spoof attacks, for instance by constructing fake fingerprints). Other adversarial settings are likely to emerge in many applications, like web search engines (e.g., inflating web page ranking), social networks, reputation systems, etc.
Fig.1: Attacking a biometric verification system using fake fingerprints (left). Example of spam camouflaged by misspelling typical spam words ("bad word obfuscation") and adding ham words ("good word insertion") (right).
Traditional pattern recognition techniques and design methods do not take into account the adversarial nature of classification problems like the ones mentioned above, and exhibit several vulnerabilities (potentially unknown) which can be exploited by an adversary to make them ineffective. One of the consequences is that, when they are used in adversarial settings, their performance can significantly degrade under attack.
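The "bad word obfuscation" and "good word insertion" tricks mentioned above can be illustrated with a toy linear spam scorer. The following is a minimal sketch under stated assumptions: the word weights, the threshold, and the function names are hypothetical and are not taken from any filter discussed in the tutorial.

```python
# Toy linear spam scorer. All weights and the threshold are made-up values
# chosen only to illustrate the two camouflage tricks.
WEIGHTS = {
    "viagra": 2.5, "free": 1.0, "winner": 1.5,          # typical spam words
    "meeting": -1.2, "report": -1.0, "thanks": -0.8,    # typical ham words
}
THRESHOLD = 1.0  # a score above this value is classified as spam


def spam_score(message: str) -> float:
    """Sum the weights of known words; unknown words contribute 0."""
    return sum(WEIGHTS.get(tok, 0.0) for tok in message.lower().split())


def is_spam(message: str) -> bool:
    return spam_score(message) > THRESHOLD


original = "free viagra winner"
# Bad word obfuscation: misspell a spam word so the filter does not know it.
obfuscated = "free v1agra winner"
# Good word insertion: append words typical of legitimate mail.
padded = obfuscated + " meeting report thanks"

for text in (original, obfuscated, padded):
    print(f"{text!r}: score={spam_score(text):.1f}, spam={is_spam(text)}")
```

In this toy setting the obfuscation alone lowers the score but does not evade the filter, while adding a few ham words pushes the message below the threshold.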
This kind of classification problem has been named adversarial classification (Dalvi et al., 2004), and is the subject of an emerging research field in the machine learning community. However, so far only a limited set of issues related to specific applications (mainly intrusion detection in computer networks and spam filtering) has been addressed. Given the increasing relevance of adversarial settings and the increasing adoption of pattern recognition and machine learning techniques in such applications, there is a strong need to develop general theories and design methods for pattern recognition systems in adversarial environments, and to generalize and systematize the solutions proposed so far in specific contexts.
The purposes of this tutorial are:
- to introduce the fundamentals of adversarial classification from the perspective of a designer of a pattern recognition system;
- to illustrate the design cycle of a pattern recognition system for adversarial tasks;
- to present the techniques recently proposed to assess the performance of pattern classifiers under attack, evaluate classifiers’ vulnerabilities, and implement defence strategies that make classifiers more robust against attacks;
- to show some applications of adversarial classification techniques to pattern recognition tasks like biometric recognition and spam filtering.
Target audience
This tutorial is intended for:
- people who want to become aware of the new research field of adversarial classification and learn its fundamentals;
- people doing research in machine learning, data mining and pattern recognition applications that have a potential adversarial component, and who wish to learn how the techniques of adversarial classification can be used effectively in such applications.
Content details
Introduction to adversarial pattern classification (0.5 hours) by F. Roli
Introduction by examples from biometrics, spam filtering, and intrusion detection in computer networks. Basic concepts and terminology. The concept of adversary-aware classifier. Definitions of attack and defence.
Design of pattern classification systems in adversarial environments (0.5 hours) by F. Roli
Modelling of adversarial tasks. The two-player model (the attacker and the classifier). Levels of reciprocal knowledge of the two players (perfect knowledge, limited knowledge, knowledge by queries and feedback). The design cycle of adversarial pattern classification systems. Distinctive features and main differences in comparison to the traditional design cycle. The notion of security by design.
System design: vulnerability assessment and performance evaluation (1 hour) by G. Fumera
Attack models against pattern classifiers. The influence of the attack on the classifier: causative or exploratory attacks. The type of security violation: integrity or availability attacks. The specificity of the attack: targeted or indiscriminate attacks.
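The three axes of this attack taxonomy can be written down as a small data structure. This is a sketch only: the class and field names are hypothetical, the one-line glosses in the comments follow the taxonomy of Barreno et al. (2006) listed in the bibliography, and the labels given to the two example attacks are one reasonable reading rather than a definitive classification.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Influence(Enum):
    CAUSATIVE = "causative"      # attacker alters the training data
    EXPLORATORY = "exploratory"  # attacker only probes the deployed classifier


class Violation(Enum):
    INTEGRITY = "integrity"        # malicious samples pass as legitimate
    AVAILABILITY = "availability"  # legitimate samples get rejected


class Specificity(Enum):
    TARGETED = "targeted"              # aimed at particular samples
    INDISCRIMINATE = "indiscriminate"  # aimed at a broad class of samples


@dataclass(frozen=True)
class AttackProfile:
    """Hypothetical record placing one attack along the three axes."""
    name: str
    influence: Influence
    violation: Violation
    specificity: Specificity


# Illustrative labels (assumptions, not claims made by the tutorial itself).
good_word_insertion = AttackProfile(
    "good word insertion", Influence.EXPLORATORY,
    Violation.INTEGRITY, Specificity.TARGETED)
dictionary_attack = AttackProfile(
    "dictionary poisoning of a spam filter", Influence.CAUSATIVE,
    Violation.AVAILABILITY, Specificity.INDISCRIMINATE)

# The three binary axes span eight possible attack categories.
all_categories = list(product(Influence, Violation, Specificity))
print(len(all_categories))
```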
Performance evaluation. Limits of the traditional approach based on “cross-validation” of data for the performance evaluation under attack. Performance evaluation by simulation of attack patterns. Vulnerability assessment by performance evaluation. Examples of performance evaluation of classifiers under attack.
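The idea of evaluating performance by simulating attack patterns, rather than relying only on held-out data drawn from the training distribution, can be sketched as follows. Everything here is an assumption made for illustration: the synthetic two-dimensional Gaussian data, the nearest-centroid classifier, and the attack model (moving each malicious sample a fraction `strength` of the way towards the legitimate centroid) are not the specific methods covered in the tutorial.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible


def sample(mean, n):
    """Draw n two-dimensional points from a unit-variance Gaussian."""
    return [(random.gauss(mean[0], 1.0), random.gauss(mean[1], 1.0))
            for _ in range(n)]


def centroid(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def dist2(a, b):
    """Squared Euclidean distance."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


# Synthetic training data: legitimate around (0, 0), malicious around (4, 4).
c_legit = centroid(sample((0.0, 0.0), 200))
c_mal = centroid(sample((4.0, 4.0), 200))


def predict_malicious(x):
    """Nearest-centroid decision rule."""
    return dist2(x, c_mal) < dist2(x, c_legit)


def simulate_attack(x, strength):
    """Move a malicious sample a fraction `strength` towards c_legit."""
    return (x[0] + strength * (c_legit[0] - x[0]),
            x[1] + strength * (c_legit[1] - x[1]))


mal_test = sample((4.0, 4.0), 500)


def detection_rate(strength):
    """Fraction of attacked malicious test samples still detected."""
    attacked = [simulate_attack(x, strength) for x in mal_test]
    return sum(predict_malicious(x) for x in attacked) / len(attacked)


rates = {s: detection_rate(s) for s in (0.0, 0.25, 0.5, 0.75, 1.0)}
for s, r in rates.items():
    print(f"attack strength {s:.2f}: detection rate {r:.3f}")
```

Strength 0.0 corresponds to the traditional evaluation on unmodified test data. The other values show how the detection rate of the same classifier falls as the simulated adversary becomes stronger, which is the kind of curve a vulnerability assessment would report.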
System design: defence strategies (1 hour) by F. Roli
Taxonomy of possible defence strategies. Defence strategies against specific attacks and for specific applications. Examples from biometrics, spam filtering, and intrusion detection in computer networks. General-purpose defence strategies. Hiding information about the pattern classifier. Disinformation. Randomized classifiers. Evade hard multiple classifier systems. Detecting attacks.
Tutorial format: slide presentation.
Examples of past introductory talks on the tutorial topics
F. Roli, “Adversarial pattern classification”, Plenary lecture at the 2009 Int. Conference on Machine Learning and Cybernetics, Baoding, Hebei, China, available at http://prag.diee.unica.it/pra/node/778
F. Roli, “Adversarial pattern classification”, Plenary lecture at the 2010 Int. Workshop on Computational Intelligence, Pattern Recognition, and Classifier Systems, Cairo, Egypt, available at http://www.datamine.nileu.edu.eg/files/file/F_Roli@Cairo-April2010.pdf
Organizers & presenters' expertise
Fabio Roli received his M.S. degree, with honours, and Ph.D. degree in Electronic Engineering from the University of Genoa, Italy. He was adjunct professor at the University of Trento, Italy, in 1993 and 1994. In 1995, he joined the Dept. of Electrical and Electronic Engineering of the University of Cagliari, Italy, where he is now professor of computer engineering and head of the research group on pattern recognition and applications (http://prag.diee.unica.it). Dr Roli's research has always focused on pattern recognition, machine learning, and image analysis, in the context of real applications including biometric recognition, video surveillance, and computer security. He did seminal work on the fusion of multiple classifiers and its applications to computer security. Dr Roli established the popular workshop series on multiple classifier systems and co-chaired its ten editions (www.diee.unica.it/mcs). On these topics, he has published more than one hundred papers in conference proceedings and journals. Recently, he has focused his activity on pattern recognition in adversarial environments. He gave four invited lectures on this topic, and a plenary talk on adversarial classification at the international conference ICMLC 2009. Dr Roli is a Senior Member of the IEEE and a Fellow of the International Association for Pattern Recognition (IAPR). He was the chairman of the IAPR Technical Committee on Statistical Techniques in Pattern Recognition from 2004 to 2008, and is a member of the IAPR governing board.
Giorgio Fumera received the M.Sc. degree in Electronic Eng., with honours, in 1997, and the Ph.D. in Electronic and Computer Eng. in 2002, from the University of Cagliari. Since 2002 he has been assistant professor of computer engineering in the Dept. of Electrical and Electronic Eng. of the same University. His research interests are in the field of statistical pattern recognition and its applications. His research topics include the reliability of pattern recognition systems, multiple classifier systems, and multimedia document categorization. G. Fumera is a member of the IEEE and IEEE Computer Society, and of the Italian chapter of the International Association for Pattern Recognition.
Main contributions of presenters to the field of adversarial classification
Journal papers
B. Biggio, G. Fumera, and F. Roli, "Multiple Classifier Systems for Robust Classifier Design in Adversarial Environments", International Journal of Machine Learning and Cybernetics, vol. 1, no. 1, Springer Berlin / Heidelberg, pp. 27–41, 2010.
G. Fumera, I. Pillai, and F. Roli, "Spam filtering based on the analysis of text information embedded into images", Journal of Machine Learning Research, vol. 7, pp. 2699–2720, December 2006. Available at http://www.jmlr.org/papers/volume7/fumera06a/fumera06a.pdf
Conference papers
B. Biggio, G. Fumera, and F. Roli, "Multiple classifier systems under attack". 9th International Workshop on Multiple Classifier Systems, MCS 2010, Cairo, Egypt, April 7-9 2010, Springer, LNCS, Vol. 5997, pp. 74-83. (http://prag.diee.unica.it/pra/system/files/biggio-mcs2010.pdf)
B. Biggio, G. Fumera, and F. Roli, “Evade hard multiple classifier systems”. In O. Okun and G. Valentini, editors, Supervised and Unsupervised Ensemble Methods and Their Applications, volume 245 of Studies in Computational Intelligence, Springer Berlin / Heidelberg, 2009, pages 15–38. (http://prag.diee.unica.it/pra/system/files/biggio08 suemaBook.pdf)
B. Biggio, G. Fumera, and F. Roli, “Adversarial pattern classification using multiple classifiers and randomisation”. In 12th Joint IAPR International Workshop on Structural and Syntactic Pattern Recognition (SSPR 2008), Vol. 5342 of Lecture Notes in Computer Science, Springer-Verlag, pages 500–509. (http://prag.diee.unica.it/pra/system/files/Biggio_SPR2008.pdf)
B. Biggio, G. Fumera, and F. Roli, “Multiple classifier systems for adversarial classification tasks”. 8th International Workshop on Multiple Classifier Systems, MCS 2009, Reykjavik, Iceland, June 10-12, 2009. Proceedings, volume 5519 of Lecture Notes in Computer Science. Springer, 2009, pages 132–141. (http://prag.diee.unica.it/pra/system/files/Biggio_MCS2009.pdf)
B. Biggio, G. Fumera, I. Pillai, and F. Roli, "Image Spam Filtering by Content Obscuring Detection", Fourth Conference on Email and Anti-Spam (CEAS 2007), Microsoft Research Silicon Valley, Mountain View, California, 2007. (http://prag.diee.unica.it/pra/system/files/Biggio_CEAS2007.pdf)
Essential bibliography on adversarial classification
Journal papers
R. N. Rodrigues, L. L. Ling, and V. Govindaraju, “Robustness of multimodal biometric fusion methods against spoof attacks,” J. Vis. Lang. Comput., vol. 20, no. 3, pp. 169–179, 2009.
M. Kearns and M. Li, “Learning in the presence of malicious errors,” SIAM J. Comput., vol. 22, no. 4, pp. 807–837, 1993.
Z. Jorgensen, Y. Zhou, and M. Inge, “A multiple instance learning strategy for combating good word attacks on spam filters,” Journal of Machine Learning Research, vol. 9, pp. 1115–1146, June 2008.
Conference papers
N. Dalvi, P. Domingos, Mausam, S. Sanghai, and D. Verma, “Adversarial classification,” in Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), Seattle, 2004, pp. 99–108.
D. Lowd and C. Meek, “Adversarial learning,” in Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), Chicago, IL: ACM Press, 2005, pp. 641–647.
M. Barreno, B. Nelson, R. Sears, A. D. Joseph, and J. D. Tygar, “Can machine learning be secure?” in ASIACCS ’06: Proceedings of the 2006 ACM Symposium on Information, Computer and Communications Security. New York, NY, USA: ACM, 2006, pp. 16–25.
P. Laskov and R. Lippmann, Eds., Neural Information Processing Systems (NIPS) Workshop on Machine Learning in Adversarial Environments for Computer Security, http://mls-nips07.first.fraunhofer.de, 2007.
Tom Fawcett, "In vivo spam filtering: a challenge problem for KDD", in ACM SIGKDD Explorations Newsletter, v.5 n.2, December 2003.
G. L. Wittel and S. F. Wu, “On attacking statistical spam filters,” in First Conference on Email and Anti-Spam (CEAS), Microsoft Research Silicon Valley, Mountain View, California, 2004.
D. Lowd and C. Meek, “Good word attacks on statistical spam filters,” in Second Conference on Email and Anti-Spam (CEAS), Mountain View, CA, USA, 2005.
B. Nelson, M. Barreno, F. J. Chi, A. D. Joseph, B. I. P. Rubinstein, U. Saini, C. Sutton, J. D. Tygar, and K. Xia, “Exploiting machine learning to subvert your spam filter,” in LEET’08: Proceedings of the 1st Usenix Workshop on Large-Scale Exploits and Emergent Threats. Berkeley, CA, USA: USENIX Association, 2008, pp. 1–9.
M. Kloft and P. Laskov, “A ’poisoning’ attack against online anomaly detection,” in Neural Information Processing Systems (NIPS) Workshop on Machine Learning in Adversarial Environments for Computer Security, http://mls-nips07.first.fraunhofer.de, P. Laskov and R. Lippmann, Eds., 2007.
R. Perdisci, D. Dagon, W. Lee, P. Fogla, and M. Sharif, “Misleading worm signature generators using deliberate noise injection,” in IEEE Symposium on Security and Privacy, May 2006, pp. 17–31.
G. F. Cretu, A. Stavrou, M. E. Locasto, S. J. Stolfo, and A. D. Keromytis, “Casting out demons: Sanitizing training data for anomaly sensors,” in IEEE Symposium on Security and Privacy, 2008, pp. 81–95.
A. Kolcz and C. H. Teo, “Feature weighting for improved classifier robustness,” in Sixth Conference on Email and Anti-Spam (CEAS), Mountain View, CA, USA, July 2009.
A. Globerson and S. T. Roweis, “Nightmare at test time: robust learning by feature deletion,” in ICML, ser. ACM International Conference Proceeding Series, W. W. Cohen and A. Moore, Eds., vol. 148. ACM, 2006, pp. 353–360.