Unbalanced learning in Content-Based Image Classification and Retrieval

Title: Unbalanced learning in Content-Based Image Classification and Retrieval
Publication Type: Conference Paper
Year of Publication: 2010
Authors: Piras, L, Giacinto, G
Conference Name: IEEE International Conference on Multimedia & Expo (ICME)
Date Published: 19/07/2010
Conference Location: Singapore
Nowadays, very large archives of digital images can be easily produced thanks to the availability of digital cameras, either as stand-alone devices or embedded in a number of portable devices. Each personal computer is typically a repository for thousands of images, while the Internet can be seen as a very large repository. One of the most severe problems in the classification and retrieval of images from very large repositories is the very limited number of elements belonging to each semantic class compared to the number of images in the repository. As a consequence, an even smaller fraction of images per semantic class can be used as a training set in a classification problem, or as a query in a content-based image retrieval problem. In this paper we propose a technique that artificially increases the number of examples in the training set in order to improve the learning capabilities, thereby reducing the imbalance between the semantic class of interest and all other images. The proposed approach is tailored to classification and relevance feedback techniques based on the Nearest-Neighbor paradigm. A number of new points in the feature space are created from the available training patterns, so that they better represent the distribution of the semantic class of interest. These new points are created according to the k-NN paradigm, and take into account both relevant and non-relevant images with respect to the semantic class of interest. The proposed approach increases the generalization capability of NN techniques, and mitigates the risk of classifier over-training on few patterns. Reported experiments show the effectiveness of the proposed technique in Content-Based Image Retrieval tasks, where the Nearest-Neighbor approach is used to exploit users' relevance feedback. The improvement in precision and recall gained in one feature space also exceeds the improvement in performance attained by combining different feature spaces.
Citation Key: 825
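
The abstract above describes creating new feature-space points from the available training patterns, following the k-NN paradigm and taking both relevant and non-relevant images into account. The abstract does not give the exact construction, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm: it interpolates between a relevant pattern and one of its k nearest relevant neighbours (a SMOTE-like step, swapped in here for illustration) and keeps a candidate only if it lies closer to a relevant pattern than to any non-relevant one. The function names, the parameters `k` and `per_seed`, and the acceptance rule are all hypothetical.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(point, candidates, k):
    """Return the k candidates closest to `point`, skipping `point` itself."""
    others = [c for c in candidates if c is not point]
    return sorted(others, key=lambda c: euclidean(point, c))[:k]


def synthesize_relevant(relevant, non_relevant, k=3, per_seed=2, rng=None):
    """Create artificial 'relevant' points (illustrative assumption only).

    For each relevant pattern, pick one of its k nearest relevant
    neighbours and interpolate at a random position between the two.
    A candidate is kept only if its nearest relevant pattern is closer
    than its nearest non-relevant pattern.
    """
    rng = rng or random.Random(0)
    synthetic = []
    for seed in relevant:
        neighbours = nearest(seed, relevant, k)
        if not neighbours:
            continue  # a single relevant pattern has no neighbour to pair with
        for _ in range(per_seed):
            mate = rng.choice(neighbours)
            t = rng.random()
            candidate = [s + t * (m - s) for s, m in zip(seed, mate)]
            d_rel = min(euclidean(candidate, r) for r in relevant)
            d_non = min(
                (euclidean(candidate, n) for n in non_relevant),
                default=math.inf,
            )
            if d_rel < d_non:
                synthetic.append(candidate)
    return synthetic


# Toy 2-D feature space: three relevant images, four non-relevant ones.
relevant = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
non_relevant = [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0], [6.0, 6.0]]
new_points = synthesize_relevant(relevant, non_relevant, k=2, per_seed=2)
augmented_relevant = relevant + new_points
```

In a relevance feedback loop, `augmented_relevant` would stand in for the few images the user marked as relevant when computing nearest-neighbour scores. The acceptance test is one simple way of letting non-relevant images influence where new points may be placed; other rules are equally plausible.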