Image and Video Description using Local Binary Patterns

Developing pattern recognition systems involves two crucial issues: image representation and classifier design. The aim of image representation is to derive a set of features from the raw images that minimizes the intra-class variations (i.e. between instances of the same object) and maximizes the inter-class variations (i.e. between images of different objects). If an inadequate representation is adopted, even the most sophisticated classifiers fail to accomplish the recognition task. Therefore, it is important to choose the representation carefully when designing pattern recognition systems. Ideally, the representation should: (i) discriminate different objects well while tolerating within-class variations; (ii) be easily extracted from the raw images/videos in order to allow fast processing; and (iii) lie in a low-dimensional space (short vector length) in order to avoid a computationally expensive classifier. It is not easy to find features that meet all these criteria, because object appearance varies widely with imaging factors such as scale, orientation, pose and lighting conditions. Thus, a key issue in pattern recognition and computer vision is finding efficient image and video descriptors.

Feature (or descriptor) extraction from images and videos is a crucial task in almost all computer vision systems. It consists of extracting characteristics that describe important information in the images and videos. In the literature, global (or holistic) methods such as Principal Component Analysis (PCA) have been widely studied and applied, but lately local descriptors (such as LBP, WLD, LPQ, SIFT, Gabor, DCT and HOG) have gained more attention due to their robustness to challenges such as pose and illumination changes. This lecture gives an overview of the image and video descriptors found in the literature, with an emphasis on the most recent developments in the field.

To explain and demonstrate the use of image and video descriptors, the local binary pattern (LBP) operator is taken as an example of a method for computing descriptors. LBP has been shown to be very efficient in describing image and video appearance, and it provides outstanding results in representing and analyzing different patterns in both still images and video sequences. The LBP operator is defined as a grayscale-invariant texture measure, derived from a general definition of texture in a local neighborhood. Due to its discriminative power and computational simplicity, the LBP texture operator has become a popular approach in various applications, including visual inspection, image retrieval, remote sensing, biomedical image analysis, face image analysis, motion analysis, environment modeling, and outdoor scene analysis. After the presentation, the participants will be aware of the state of the art in image and video descriptors and of their development in computer vision. In particular, they will understand the fundamental theory behind Local Binary Patterns (LBP). They will also be advised on the effective and proper use of LBP in various applications.
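As a rough illustration of the idea, the basic 3x3 form of the LBP operator can be sketched in a few lines of Python. This sketch is not taken from the lecture material: the function names, the clockwise neighbor ordering, the bit weights and the small test patch are all assumptions made for the example. Each neighbor is thresholded against the center pixel, the resulting bits are packed into an 8-bit code, and the codes are collected into a histogram that serves as the descriptor.

```python
# Illustrative sketch of the basic 3x3 LBP operator (not from the lecture).
# Neighbor offsets (row, col), clockwise from the top-left pixel. This
# ordering and the bit weights are one common convention, assumed here.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """Return the 8-bit LBP code of the interior pixel at (r, c)."""
    center = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        # A neighbor contributes a 1 bit when it is at least as bright
        # as the center pixel.
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Return a 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


# Hypothetical 3x3 grayscale patch, used only for demonstration.
patch = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
code = lbp_code(patch, 1, 1)

# A monotonic brightness change (here 2 * v + 10) leaves the code unchanged,
# which is the sense in which the operator is grayscale invariant.
brighter = [[2 * v + 10 for v in row] for row in patch]
same_code = lbp_code(brighter, 1, 1)
```

Because only the sign of each neighbor-minus-center difference is kept, any monotonically increasing change of the gray scale yields the same code, and the cost per pixel is eight comparisons. Practical variants (circular neighborhoods with interpolation, uniform patterns, spatio-temporal extensions for video) build on this same thresholding step.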

Where: Room Mocci, A building
When: Thursday December 15, 2011. 3.30 P.M.
Contacts: roli[at]diee[dot]unica[dot]it
Web site: http://www.ee.oulu.fi/~hadid/