Text Recognition in Multimedia Documents: A Study of two Neural-based OCRs Using and Avoiding Character Segmentation

Elagouni, Khaoula; Garcia, Christophe; Mamalet, Franck; Sébillot, Pascale

Authors: Khaoula Elagouni, Christophe Garcia, Franck Mamalet, Pascale Sébillot
Publication date: 1 March 2014
Publisher: Springer Verlag

Abstract

Text embedded in multimedia documents is an important source of semantic information that helps to access the content automatically. This paper proposes two neural-based OCRs that handle the text recognition problem in different ways. The first approach segments a text image into individual characters before recognizing them, while the second avoids the segmentation step by integrating a multi-scale scanning scheme that jointly localizes and recognizes characters at each position and scale. Some linguistic knowledge is also incorporated into the proposed schemes to remove errors due to recognition confusions. Both OCR systems are applied to caption texts embedded in videos and in natural scene images, and provide outstanding results showing that the proposed approaches outperform the state-of-the-art methods.

oai:HAL:hal-00867225v1

Last time updated on 01/11/2023

This paper was published in HAL.
