With the rapid growth of Internet capacity and speed, as well as the wide adoption
of media technologies in people’s daily lives, the volume of video has
surged, and these videos need to be efficiently processed or organized based on interest.
The human visual perception system can, without difficulty, interpret and recognize
thousands of events in videos, despite high levels of object clutter,
different types of scene context, variability of motion scales, appearance changes,
occlusions and object interactions. For computer vision systems, by contrast, automatic
video event understanding has remained very challenging for decades.
Broadly speaking, these challenges include robust detection of events under motion
clutter, event interpretation in complex scenes, multi-level semantic
event inference, placing events in context and across multiple cameras, and event inference
from object interactions.
In recent years, steady progress has been made towards better models for video
event categorisation and recognition, e.g., from modelling events with bags of
spatio-temporal features to discovering event context, from detecting events using
a single camera to inferring events through a distributed camera network, and
from low-level event feature extraction and description to high-level semantic
event classification and recognition. Nowadays, text-based video retrieval is
widely used by commercial search engines. However, it is still very difficult to
retrieve or categorise a specific video segment based on its content in a real
multimedia system or in surveillance applications. To make further progress,
we must adapt recent or existing approaches to find new solutions for intelligent
video understanding.
This book aims to present state-of-the-art research advances in video event understanding
technologies. It will provide researchers and practitioners with a rich resource
for future research directions and successful practice. It could also serve
as a reference tool and handbook for researchers in a number of applications, including
visual surveillance, human-computer interaction, and video search and
indexing. Its potential audience consists of active researchers and
practitioners as well as graduate students working on video analysis in various
disciplines such as computer vision, pattern recognition, information security,
and artificial intelligence.