Text Categorization
Our work on text categorization is related to the issue of classification reliability (see the corresponding pages under the Research section of this site). So far, classification reliability has been addressed only in the context of "standard" pattern recognition applications, namely single-label problems in which performance is measured in terms of the expected classification cost, of which the misclassification probability is a particular case.
We are investigating whether and how the reject option can be useful in text categorization tasks. So far we have considered two cases: multi-stage text categorization systems, in which documents that cannot be reliably categorized by a given stage (any stage except the last) are passed to the next stage; and text categorization systems in which uncertain documents are handled by human operators instead of being categorized automatically.
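The two settings described above can be illustrated with a minimal Python sketch. It is not the implementation used in our work: the function name `categorize_with_reject`, the toy stages, and the use of a fixed confidence threshold per stage as the reject rule are all assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical stage interface: a stage maps a document to
# (predicted_label, confidence), with confidence in [0, 1].
Stage = Callable[[str], Tuple[str, float]]


def categorize_with_reject(
    doc: str,
    stages: Sequence[Stage],
    thresholds: Sequence[float],
    last_stage_decides: bool = True,
) -> Tuple[Optional[str], int]:
    """Run a document through a cascade of stages with a reject option.

    Returns (label, index of the stage that decided). If every stage
    rejects, returns (None, -1), meaning the document is left to a
    human operator.
    """
    if len(stages) != len(thresholds):
        raise ValueError("one threshold is required per stage")
    last = len(stages) - 1
    for i, (stage, threshold) in enumerate(zip(stages, thresholds)):
        label, confidence = stage(doc)
        # Assumed reject rule: accept only if confidence reaches the threshold.
        # The last stage may be forced to decide, as in a multi-stage system.
        if confidence >= threshold or (i == last and last_stage_decides):
            return label, i
    return None, -1


# Toy stages, purely illustrative.
def keyword_stage(doc: str) -> Tuple[str, float]:
    text = doc.lower()
    if "goal" in text or "match" in text:
        return "sport", 0.9
    return "other", 0.3


def fallback_stage(doc: str) -> Tuple[str, float]:
    if "market" in doc.lower():
        return "finance", 0.6
    return "other", 0.4
```

With `last_stage_decides=True` the sketch behaves like a multi-stage system in which only the stages before the last one can reject. With `last_stage_decides=False`, a document rejected by every stage is returned with label `None`, which corresponds to handing it to a human operator.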
People working on this topic:
- Giorgio Fumera
- Ignazio Pillai
- Fabio Roli