INTERMEDIA Multimedia Tools

The European Union project INTERMEDIA (IST-038419) aims at integrating tools and technologies from various disciplines to facilitate user-centered access to multimedia data.

The Institut für Nachrichtentechnik is one of 16 partners conducting research on human-machine interfaces, wearable computing, networking, localization, embedded systems, digital rights management and multimedia adaptation. Our contribution concerns multimedia annotation. The research focus is to provide tools that automatically and autonomously extract information from video data, enabling adaptation of the media to a given user's current needs (i.e. the current 'context').

During the first year of the project, we contributed a couple of tools that were showcased at the Open Forum in Madrid in October 2007. We have decided to open-source these tools and make them publicly available on SourceForge.

Shot boundary detection

A key advantage of digital media formats is the ability to store meta information. The most prominent kind of metadata is information on temporal structure, be it the simple track lists of audio CDs or the animated menus available on DVDs.

Unfortunately, with digital video recording equipment becoming affordable and popular, users will likely soon own a significant amount of home-recorded video without sophisticated annotation.

The first step in adding semantic information is to recover the temporal structure. Our approach to shot boundary detection is based on moving averages over color histograms. Color-based scene cut detection is known to be robust and performs well even on multimedia devices with little processing power, and the use of moving averages makes it possible to detect hard cuts as well as smooth transitions.
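The following sketch illustrates the general idea in Python using OpenCV; the histogram layout, window length and threshold are illustrative assumptions, not the tuned values used in our tool.

```python
# Minimal sketch of color-histogram shot boundary detection with a moving
# average.  Window size and threshold are illustrative, not tuned values.
import cv2
import numpy as np

def detect_cuts(video_path, window=10, threshold=0.5):
    cap = cv2.VideoCapture(video_path)
    history = []          # recent histograms (the moving-average window)
    cuts = []
    frame_idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # 3D color histogram in HSV, normalized so frames are comparable
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        hist = cv2.calcHist([hsv], [0, 1, 2], None, [8, 8, 8],
                            [0, 180, 0, 256, 0, 256])
        hist = cv2.normalize(hist, None).flatten()
        if history:
            avg = np.mean(history, axis=0)            # moving average
            dist = cv2.compareHist(avg.astype(np.float32),
                                   hist.astype(np.float32),
                                   cv2.HISTCMP_BHATTACHARYYA)
            if dist > threshold:                      # sudden change => cut
                cuts.append(frame_idx)
                history = []                          # restart the window
        history.append(hist)
        if len(history) > window:
            history.pop(0)
        frame_idx += 1
    cap.release()
    return cuts
```

Comparing against the moving average rather than only the previous frame is what lets gradual transitions (fades, dissolves) accumulate enough histogram distance to be detected.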

Main author: Dr.-Ing. Mark Asbach.

Scene classification

Typical TV material contains visually similar scene setups, such as the trailers of TV shows, the anchor person of a newscast, the weather map or even commercials. Until broadcasters deliver detailed metadata, automatic tools that detect these special scene types can provide useful meta information. A user might, for example, have recorded a whole day's program but only be interested in a certain show.

We have developed a very simple scene classifier that performs surprisingly well on anchor person shots, weather maps, etc. Author: Dr.-Ing. Michael Unger.
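The sketch below shows one plausible way such a lightweight classifier could work, by comparing a shot's key-frame color signature against a few labelled reference frames; it is an illustrative assumption, not the algorithm implemented in the tool.

```python
# Illustrative nearest-reference scene classifier: a key frame gets the label
# of the most similar reference histogram.  This is an assumption about how a
# simple classifier could be built, not the project's actual code.
import cv2
import numpy as np

def color_signature(frame_bgr):
    """Normalized HSV color histogram used as a compact scene descriptor."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, [16, 16], [0, 180, 0, 256])
    return cv2.normalize(hist, None).flatten()

def classify(frame_bgr, references, reject_threshold=0.4):
    """references: dict mapping class name -> signature of a labelled example."""
    sig = color_signature(frame_bgr).astype(np.float32)
    best_label, best_dist = None, float("inf")
    for label, ref in references.items():
        dist = cv2.compareHist(ref.astype(np.float32), sig,
                               cv2.HISTCMP_BHATTACHARYYA)
        if dist < best_dist:
            best_label, best_dist = label, dist
    # shots that match no reference well fall through to a generic label
    return best_label if best_dist < reject_threshold else "other"
```

The reference set could be built from a single key frame per class, e.g. one anchor-person shot and one weather-map shot of the recorded station.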

Motion compensated background subtraction

A standard task in computer vision is object segmentation, which allows defining regions of interest (ROI), i.e. spatial descriptions of object areas on a frame-wise level. Video content can thus be adapted to small screens, such as those of mobile video players, by cropping to the ROI and scaling.
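As a toy illustration, assuming the ROI is given as a per-frame bounding box, the adaptation step can be as simple as cropping and rescaling (the helper name and target size below are hypothetical):

```python
# Hypothetical helper: crop a frame to a region of interest and scale the crop
# to a small target display.
import cv2

def adapt_to_screen(frame, roi, screen_size=(320, 240)):
    x, y, w, h = roi                      # per-frame ROI bounding box
    crop = frame[y:y + h, x:x + w]
    return cv2.resize(crop, screen_size, interpolation=cv2.INTER_AREA)
```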

We have developed an object segmentation algorithm based on motion compensated background subtraction, which is able to handle unconstrained camera motion, zoom, rotation and even (weak) lens distortion.
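The sketch below illustrates the principle for a pair of consecutive frames, using feature-based homography estimation for the motion compensation step; it omits the background modelling and lens-distortion handling of our algorithm and is not its actual implementation.

```python
# Sketch of motion-compensated background subtraction between consecutive
# frames: estimate the global (camera) motion as a homography, warp the
# previous frame onto the current one, and threshold the residual.
import cv2
import numpy as np

def foreground_mask(prev_gray, curr_gray, diff_threshold=25):
    # 1. Global motion estimation from sparse feature correspondences
    orb = cv2.ORB_create(1000)
    kp1, des1 = orb.detectAndCompute(prev_gray, None)
    kp2, des2 = orb.detectAndCompute(curr_gray, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)
    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)

    # 2. Motion compensation: warp the previous frame into the current view
    h, w = curr_gray.shape
    warped = cv2.warpPerspective(prev_gray, H, (w, h))

    # 3. Background subtraction on the aligned frames
    diff = cv2.absdiff(curr_gray, warped)
    _, mask = cv2.threshold(diff, diff_threshold, 255, cv2.THRESH_BINARY)
    return mask
```

Because the camera motion is compensated before subtraction, pixels that move with the background cancel out, while independently moving objects remain in the mask and can be turned into ROIs.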