Visual Content-Based and Semantic Concept-Based Multimedia Indexing and Search

PhD Seminar Course

Cagliari, September 20-22, 2010
Instructor: Dr. Apostol (Paul) Natsev

IBM Research - Watson Research Center (Hawthorne)
Duration: 8 hours
Schedule:
  • Monday, September 20, 9 A.M. - 1 P.M.
  • Wednesday, September 22, 9 A.M. - 1 P.M.
Venue: Room I, Main building - Faculty of Engineering
Topics:

Social media creation and use have skyrocketed in recent years and have become an indelible part of our lives, from the way we entertain and inform ourselves to the way we communicate, socialize, or learn. More than 24 hours of new video content are uploaded to YouTube each minute, and over 100M new photos are uploaded to Facebook every day. With the tremendous growth of online multimedia content come great opportunities but also great expectations and challenges. Users expect images and video to be as easily searchable as text, but technology has unfortunately not kept pace.

In this short course, I will present the current state of the art in multimedia search and review the current approaches to visual content-based as well as semantic indexing and search. Emphasis will be placed on a promising new direction of research, semantic concept-based retrieval, which aims to boost both the effectiveness and the usability of multimedia search. I will describe techniques that leverage the computer's ability to analyze visual features of images and video effectively, and that apply statistical machine learning to automatically classify and label visual scenes, objects, people, and activities. I will also describe methods that use such automatically generated labels to improve the quality of multimedia indexing and search, as well as to enable new applications and content monetization models. These approaches will be presented and demonstrated in the context of a state-of-the-art multimedia analysis and retrieval system developed at IBM Research (http://www.alphaworks.ibm.com/tech/imars).

Here is a preliminary synopsis and list of topics to be covered (still subject to change but not by much):
1. Introduction and motivation
1.1 Opportunities of multimedia analysis and retrieval
1.2 Challenges and basic problems
1.3 Example applications and demos

2. Content-based retrieval
2.1 Global visual features
        - Color spaces and color features
        - Color feature representations (color histograms, correlograms, moments)
        - Texture features (structural, statistical, spectral)
        - Edge and shape features
2.2 Local visual features
        - Interest point detection
        - Local descriptor representation
        - Local point matching and spatial registration
2.3 Similarity measures and evaluation metrics
2.4 Video segmentation and matching
2.5 Advanced techniques of content-based retrieval
2.6 Video fingerprinting and near-duplicate detection

3. Semantic concept-based retrieval
3.1 Definitions and motivation
3.2 Semantic concept vocabulary design
3.3 Semantic concept modeling and extraction
3.4 Multi-modal fusion and semantic context exploitation
3.5 Retrieval by semantic concepts
3.6 Concept-based query expansion

4. Multi-modal video retrieval -- a case study
4.1 Speech-based retrieval
4.2 Visual content-based retrieval
4.3 Semantic concept-based retrieval
4.4 Query-dependent multi-modal fusion
4.5 Performance evaluation (TRECVID)
4.6 Demos and other applications

 

Speaker Bio

Dr. Apostol (Paul) Natsev is a Research Staff Member and Manager of the Multimedia Research Group at the IBM T. J. Watson Research Center.  He received his M.S. (1997) and Ph.D. (2001) degrees in Computer Science from Duke University, and joined IBM Research in 2001.  At IBM, he leads research efforts on multimedia analysis and retrieval, with an agenda to advance the science and practice of systems that enable users to manage and search vast repositories of unstructured multimedia content.  

Dr. Natsev is a founding member and senior researcher of IBM’s award-winning IMARS project on multimedia analysis and retrieval, with pioneering contributions in the area of content-based and semantic concept-based multimedia retrieval. He is an active participant in the NIST TREC Video Retrieval (TRECVID) evaluation, where his team has previously achieved top performance on the TRECVID concept detection, search, and video copy detection tasks (in 2006, TRECVID involved an estimated 380 researchers from almost 100 separate institutions worldwide). He is also the chief architect and lead developer of an IMARS-based video fingerprinting and copy detection system, which achieved top performance in the 2007 CIVR Video Copy Detection Showcase.

Dr. Natsev is an author of more than 60 publications and 16 U.S. patents (7 granted, 9 pending) in the areas of multimedia analysis, indexing and search, video fingerprinting, multimedia databases, and query optimization.  His work has been recognized with several awards, including the 2004 Wall Street Journal Innovation Award (for the IBM multimedia analysis and retrieval system), a 2005 IBM Outstanding Technical Accomplishment Award, a 2005 ACM Multimedia Plenary Paper Award, a 2006 ICME Best Poster Award, and the 2008 CIVR VideOlympics People's Choice Award (for IMARS). 

Organizer: Prof. Giorgio Giacinto
Dep. of Electrical and Electronic Engineering
University of Cagliari, Italy
Email: giacinto[at]diee[dot]unica[dot]it