RESEARCH PRODUCT

A Robust Multi Stage Technique for Image Binarization of Degraded Historical Documents

Walid Khaled Hidouci, Dominique Michelucci, Omar Boudraa

subject

adaptive thresholding; Computer science; Historical document image analysis; [SPI] Engineering Sciences [physics]; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; 02 engineering and technology; hybrid algorithm; 01 natural sciences; Grayscale; Electronic mail; 010309 optics; Histogram; 0103 physical sciences; 0202 electrical engineering electronic engineering information engineering; Noise measurement; business.industry; Pattern recognition; Image segmentation; global thresholding; Thresholding; [SPI.TRON] Engineering Sciences [physics]/Electronics; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; contrast enhancement; 020201 artificial intelligence & image processing; Algorithm design; Adaptive histogram equalization; Artificial intelligence; business

description

Document image binarization is a central problem in many document analysis systems. It is one of the basic challenges of the field, especially in the analysis of historical documents. In this paper, we propose a novel, robust multi-stage framework that combines several existing document image thresholding methods to obtain a better binarization result. First, the CLAHE technique (contrast-limited adaptive histogram equalization) is applied to significantly enhance the contrast of poor-quality images. The proposed method then uses a hybrid algorithm to partition the image into foreground and background. Finally, a special procedure is applied to remove small noise and correct character morphology. Experimental results on three popular datasets demonstrate the accuracy and efficiency of our approach to document image binarization compared with several well-known methods in the literature.
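The three stages named in the abstract (contrast enhancement, hybrid thresholding, noise removal) can be illustrated with a minimal pure-Python sketch. This is not the authors' algorithm, whose details the abstract does not give. It rests on several assumptions: a linear contrast stretch stands in for CLAHE, the hybrid step is a hypothetical combination of Otsu's global threshold with a local-mean test for pixels just above that threshold, and the cleanup step only deletes small connected components without correcting character morphology. All function names and the parameters `margin`, `radius`, `offset`, and `min_size` are illustrative.

```python
from collections import deque

def stretch_contrast(img):
    """Linear stretch to 0..255 (a simple stand-in for CLAHE, not CLAHE itself)."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:
        return [row[:] for row in img]
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in img]

def otsu_threshold(img):
    """Global threshold that maximises the between-class variance (Otsu)."""
    hist = [0] * 256
    for row in img:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t

def local_mean(img, y, x, radius):
    """Mean grey level in a (2*radius+1)^2 window clipped at the image border."""
    h, w = len(img), len(img[0])
    vals = [img[j][i]
            for j in range(max(0, y - radius), min(h, y + radius + 1))
            for i in range(max(0, x - radius), min(w, x + radius + 1))]
    return sum(vals) / len(vals)

def hybrid_binarize(img, margin=20, radius=1, offset=10):
    """Hypothetical hybrid rule: pixels at or below the global threshold are ink (1);
    pixels up to `margin` above it are decided by a local-mean test."""
    t = otsu_threshold(img)
    out = []
    for y, row in enumerate(img):
        out_row = []
        for x, p in enumerate(row):
            if p <= t:
                ink = True
            elif p <= t + margin:
                ink = p < local_mean(img, y, x, radius) - offset
            else:
                ink = False
            out_row.append(1 if ink else 0)
        out.append(out_row)
    return out

def remove_small_components(binary, min_size=2):
    """Delete 8-connected ink components with fewer than `min_size` pixels."""
    h, w = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if out[y][x] != 1 or seen[y][x]:
                continue
            comp, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                comp.append((cy, cx))
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if out[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(comp) < min_size:
                for cy, cx in comp:
                    out[cy][cx] = 0
    return out

def binarize(img):
    """Run the three illustrative stages in order."""
    return remove_small_components(hybrid_binarize(stretch_contrast(img)))

# Toy 5x5 page: a three-pixel dark stroke plus one isolated dark speck.
page = [
    [200, 200, 200, 200, 200],
    [200,  30,  30,  30, 200],
    [200, 200, 200, 200, 200],
    [200, 200, 200, 200,  40],
    [200, 200, 200, 200, 200],
]
result = binarize(page)
```

On this toy page the stroke survives and the isolated speck is discarded by the size filter. A real pipeline would operate on full-resolution scans with an array library and a proper CLAHE implementation; the nested-list form is used here only to keep the sketch dependency-free.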

https://u-bourgogne.hal.science/hal-01858390