
RESEARCH PRODUCT

Text Extraction from Scrolling News Tickers

Guntis Barzdins, Ingus Janis Pretkalnins, Arturs Sprogis

subject

Information retrieval; Computer science; Character (computing); Scrolling; Extraction (chemistry); ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; Keyword extraction; Tesseract; Pipeline (software)

description

While much work exists on text or keyword extraction from videos, little addresses the specific problem of extracting continuous text from scrolling tickers. This work proposes a novel Tesseract OCR-based pipeline for locating scrolling tickers in videos and extracting their continuous text. The solution ran faster than real time and achieved a character accuracy of 97.3% on 45 min of manually transcribed 360p videos of popular Latvian news shows.
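
The description reports a character accuracy figure but does not say how it is computed. A common convention, assumed here and not confirmed by the source, is one minus the Levenshtein edit distance between the manual transcript and the OCR output, divided by the transcript length. The function names below are hypothetical and are not taken from the authors' pipeline; this is only a minimal sketch of that convention.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn ref into hyp."""
    prev = list(range(len(hyp) + 1))  # distances from the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning the first i chars of ref into "" takes i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_accuracy(ref: str, hyp: str) -> float:
    """Assumed definition: 1 - edit_distance / len(reference)."""
    if not ref:
        raise ValueError("reference transcript must be non-empty")
    return 1.0 - levenshtein(ref, hyp) / len(ref)
```

Under this assumed definition, one wrong character in a four-character reference, as in `character_accuracy("abcd", "abxd")`, gives 0.75. The value can drop below zero when the OCR output contains many spurious insertions.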

https://doi.org/10.1007/978-3-030-57672-1_11