Classification with Reject Option in Text Categorisation Systems
Publication Type: Conference Paper
Source: 12th International Conference on Image Analysis and Processing (ICIAP 2003), IEEE Computer Society, Mantova, p. 582-587 (2003)
Keywords: document categorisation; text categorisation; classification reliability; reject option; rej00; doc01; doc00
The aim of this paper is to evaluate the potential usefulness of the reject option for text categorisation (TC) tasks. The reject option is a technique used in statistical pattern recognition for improving classification reliability. Our work is motivated by the fact that, although the reject option proved to be useful in several pattern recognition problems, it has not yet been considered for TC tasks. Since TC tasks differ from usual pattern recognition problems in the performance measures used and in the fact that documents can belong to more than one category, we developed a specific rejection technique for TC problems. The performance improvement achievable by using the reject option was experimentally evaluated on the Reuters dataset, which is a standard benchmark for TC systems.