Aqua Phoenix
    About  
   
Title: VAST MM: Video Audio Structure Text Multimedia Browser
Short Description: This video indexing and browsing tool is designed for unstructured presentation videos (lectures, talks, etc.), particularly candidly captured student presentation videos. It demonstrates several integrated approaches to multi-modal analysis and indexing of audio and video. Visual segmentation techniques are applied to unedited video to detect likely changes of topic. Speaker segmentation methods determine individual student appearances, which are linked to extracted headshots to create a visual speaker index. Videos are augmented with time-aligned keywords and phrases filtered from highly inaccurate speech transcripts. The user interface, the VAST MM Browser (Video Audio Structure Text Multimedia Browser), combines streaming video with visual and textual indices for browsing and searching. It has been evaluated in a large engineering design course over four semesters with 598 student participants. Results on student performance suggest that this video indexing and retrieval approach is effective, and that the exam scores of students using the browser increased significantly.
Hardware required: Computer for indexing, server for database and streaming, MPEG-2 encoder if transferring from video tape to digital video
Software required: FFMPEG and its libraries (video transcoding, video indexing), IBM ViaVoice (automatic transcription), C compiler for indexing software, Java for UI browser
Platforms: Video Indexer: any platform with FFMPEG and C compiler; Video Browser: any Java WebStart compliant platform
Date: June 2004 - May 2008
Acknowledgement: This material is based upon work partially supported by the National Science Foundation under Grant No. IIS-0713064
Keywords: Video, audio, segmentation, indexing, searching, browsing, semantics, inaccurate transcripts, lecture video, instructional video, presentation video, talks, Java MPEG, MPEG video player, streaming video, streaming server
 

The VAST MM (Video Audio Structure Text Multimedia) Browser is an interactive video library browser with tools for video search and retrieval, developed in the High Level Vision lab in the Department of Computer Science at Columbia University. Its major components are:

  • Text search tools for video content
  • Interactive visualization for video scene segmentation
  • Interactive visualization for key phrases filtered from automatically generated speech recognition (ASR) transcripts, produced with IBM ViaVoice
  • Visualization for speaker segmentation
  • Streaming video (MPEG-1) via a pure Java implementation that does not require a separate package such as JMF and does not rely on the underlying system codecs
  • Keyframe player for fast video skimming
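
As an illustration of how text search over time-aligned keywords might work, the following is a minimal Java sketch. It is not the VAST MM implementation: the class name `KeywordTimeIndex`, its methods, and the sample keywords and offsets are all hypothetical, and it assumes keywords have already been filtered from the transcript and paired with a start time in seconds. A query returns the offsets to which a player could seek.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical time-aligned keyword index; an illustrative sketch only. */
class KeywordTimeIndex {
    // Normalized keyword -> offsets (in seconds) at which it was recognized.
    private final Map<String, List<Double>> occurrences = new TreeMap<>();

    /** Records that a filtered keyword or phrase occurs at the given offset. */
    void add(String keyword, double seconds) {
        occurrences
            .computeIfAbsent(normalize(keyword), k -> new ArrayList<>())
            .add(seconds);
    }

    /** Returns the sorted offsets of an exact keyword match, or an empty list. */
    List<Double> search(String query) {
        List<Double> hits = new ArrayList<>(
            occurrences.getOrDefault(normalize(query), Collections.emptyList()));
        Collections.sort(hits);
        return hits;
    }

    // Case-insensitive matching with surrounding whitespace ignored.
    private static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        KeywordTimeIndex index = new KeywordTimeIndex();
        index.add("prototype", 312.5);   // made-up sample data
        index.add("Prototype", 95.0);
        index.add("budget", 40.0);
        // Prints the offsets a player could seek to for this query.
        System.out.println(index.search("prototype"));
    }
}
```

Exact matching keeps the sketch short. Because the transcripts are described as highly inaccurate, a real index would likely need more tolerant matching, which is outside the scope of this example.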