Improving character recognition by thresholding natural images

This is a Bachelor's thesis from KTH/School of Computer Science and Communication (CSC)

Abstract: Current state-of-the-art optical character recognition (OCR) algorithms are capable of extracting text from images under predefined conditions. OCR is extremely reliable for interpreting machine-written text with minimal distortions, but images taken in natural scenes are still challenging. In recent years, improving recognition rates in natural images has gained interest as more powerful handheld devices have come into use. The main problems in recognizing text in natural images are distortions such as illumination, font textures, and complex backgrounds. Different preprocessing approaches for separating text from its background have been researched lately. In our study, we assess the improvement achieved by two of these preprocessing methods, k-means and Otsu, by comparing the results of an OCR algorithm applied to their output. The study showed that the preprocessing yielded some improvement in particular cases, but overall resulted in lower accuracy than the unaltered images.
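As a rough illustration of the two preprocessing methods named in the abstract, the following is a minimal sketch in Python with NumPy. It is not the thesis authors' implementation. The function names (`otsu_threshold`, `otsu_binarize`, `kmeans_binarize`), the choice of 8-bit grayscale input, and the use of two clusters on pixel intensity are all assumptions made for this example. The abstract does not say how k-means was configured or which OCR algorithm was used.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the gray level t that maximizes between-class variance.

    Assumes `gray` is a uint8 array; class 0 is pixels <= t, class 1 is pixels > t.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256, dtype=np.float64)

    omega0 = np.cumsum(prob)            # weight of class 0 for each candidate t
    mu_cum = np.cumsum(prob * levels)   # cumulative first moment up to t
    mu_total = mu_cum[-1]               # global mean intensity

    numerator = (mu_total * omega0 - mu_cum) ** 2
    denominator = omega0 * (1.0 - omega0)

    # Between-class variance; left at zero where one class is (nearly) empty.
    sigma_b = np.zeros(256, dtype=np.float64)
    valid = denominator > 1e-12
    sigma_b[valid] = numerator[valid] / denominator[valid]
    return int(np.argmax(sigma_b))


def otsu_binarize(gray):
    """Boolean mask that is True for pixels brighter than the Otsu threshold."""
    return gray > otsu_threshold(gray)


def kmeans_binarize(gray, iters=20):
    """Two-cluster k-means on pixel intensity (an assumption of this sketch).

    Returns a boolean mask that is True for pixels in the brighter cluster.
    """
    x = gray.astype(np.float64).ravel()
    centers = np.array([x.min(), x.max()])  # start from the intensity extremes
    for _ in range(iters):
        # Assign each pixel to its nearest center, then recompute the centers.
        labels = np.abs(x[:, None] - centers[None, :]).argmin(axis=1)
        for k in range(2):
            members = x[labels == k]
            if members.size:
                centers[k] = members.mean()
    return (labels == np.argmax(centers)).reshape(gray.shape)


if __name__ == "__main__":
    # Synthetic stand-in for a photo: dark background with a brighter patch.
    image = np.full((8, 8), 50, dtype=np.uint8)
    image[2:6, 2:6] = 200
    print(otsu_threshold(image))
    print(otsu_binarize(image).sum(), kmeans_binarize(image).sum())
```

Either mask, or its inverse depending on whether the text is darker or lighter than the background, would then be passed to the OCR step in place of the original image.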
