An Exploration of MPEG-7 Shape Descriptors

Woz, Bret

oai:repository.rit.edu:theses-8567

Authors: Bret Woz
Publication date: 1 January 2003
Publisher: RIT Digital Institutional Repository

Abstract

The Multimedia Content Description Interface (ISO/IEC 15938), commonly known as MPEG-7, became a standard in September 2001. Unlike its predecessors, MPEG-7 standardizes multimedia metadata description. By providing robust descriptors and an effective system for storing them, MPEG-7 is designed to provide a means of navigating audio-visual content. In particular, MPEG-7 provides two two-dimensional shape descriptors, the Angular Radial Transform (ART) and Curvature Scale Space (CSS), for use in image and video annotation and retrieval. Field Programmable Gate Arrays (FPGAs) have a very general structure and are made up of programmable switches that the end user, rather than the manufacturer, configures for whatever design the application requires. This flexibility has led to the use of FPGAs for prototyping and implementing circuit designs, and to their being suggested as part of reconfigurable computing. For this work, an FPGA-based ART extractor was designed and simulated for a Xilinx Virtex-E XCV300e in order to provide a speedup over software-based extraction. The design created is capable of processing over 69,4400 pixels a minute. This design utilizes 99% of the FPGA's logical resources and operates at a clock rate of 25 MHz. Along with the proposed design, the MPEG-7 shape descriptors were evaluated on how well they retrieved similar objects and how closely the retrieved objects matched what a human would expect. Results showed that the majority of the retrievals made using the MPEG-7 shape descriptors returned visually acceptable results. It should be noted that even the human results had a high amount of variance. Finally, this thesis briefly explored the potential of utilizing the ART descriptor for optical character recognition (OCR) in the context of image retrieval from databases. It was demonstrated that the ART has potential for use in OCR; however, there is still research to be performed in this area.

Last time updated on 12/01/2024

This paper was published in RIT Scholar Works.
