Receipt Scanning Using Deep Learning

This is a Master's thesis from KTH/School of Electrical Engineering and Computer Science (EECS)

Author: Gustaf Gunér; [2020]

Abstract: Employees often make purchases on behalf of the companies they work for. These purchases must be reported manually, either by the employees themselves or by sending the receipts to the company's accountant. In both cases, parts of the receipts are transcribed by hand. This process is time-consuming and carries a risk of human transcription errors, which can lead to ambiguities in the company's financial statements. A fully automatic receipt scanner, which extracts metadata (e.g. total price, VAT, and individual item names) from a photograph of a receipt, would solve many of these problems. It would make the reporting process more efficient, reducing costs and saving time, and it could also improve the correctness of the data. This report evaluates Deep Learning (DL) as an approach to receipt scanning, in comparison to a heuristic Computer Vision (CV) solution. Both approaches detect the receipt in a photograph, preprocess the original photograph based on the detected location, and extract the text from it using Optical Character Recognition (OCR). The approaches were evaluated on the accuracy of the predicted receipt locations and the accuracy of the extracted texts. The results show that the Deep Learning approach achieved significantly better results than the heuristic approach in both tasks. On the generic test set, which combined all test instances, the Deep Learning approach achieved a 31.1 percentage points higher average Intersection over Union (IoU), a 23.4 percentage points lower average Character Error Rate (CER), and a 17.5 percentage points lower average Word Error Rate (WER).
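The three metrics named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact definitions used in the thesis, so the following are assumptions: receipt locations are treated as axis-aligned boxes `(x_min, y_min, x_max, y_max)`, and CER and WER are computed as Levenshtein edit distance divided by the length of the reference text (in characters and in whitespace-separated words, respectively). The function names `iou`, `edit_distance`, `cer`, and `wer` are illustrative and not taken from the thesis.

```python
def iou(box_a, box_b):
    """Intersection over Union for two axis-aligned boxes (assumed format:
    (x_min, y_min, x_max, y_max))."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero if the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character Error Rate: character edits per reference character."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word Error Rate: word edits per reference word."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

For example, under these assumptions an OCR output of `"T0TAL 12.5O"` against the reference `"TOTAL 12.50"` contains two character substitutions over eleven reference characters, so `cer` returns 2/11. Averaging such per-image values over a test set would give figures comparable in kind to those reported in the abstract.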
