Bibliographic Details
Main Authors: Sammed S. Admuthe, Hemlata P. Channe
Format: Digital resource
Language:
Published: Zenodo 2021
Subjects:
Online Access: https://doi.org/10.5281/zenodo.18515979
Table of Contents:
  • A large number of photographs, signatures, and documents are produced, processed, and stored as digital images. The PAN Card, Aadhaar Card, Passport, and Voter ID are among the most authoritative identity documents. Classifying such documents is an important step in office automation, digital libraries, and other document image analysis applications. Document classification generally focuses on extracting textual data and using it for feature engineering. Many document image classification methods exist, but they are generally used for photographic image classification. In the present work, two different techniques are used to classify document images. The first is based on textual feature extraction with TF-IDF (Term Frequency-Inverse Document Frequency). The second is visual classification with a convolutional neural network (CNN). The dataset contains 600 document images belonging to 4 different categories. Classification based on textual features achieved an average accuracy of 85.5%, while classification based on visual features achieved an average accuracy of 93%.
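The textual route described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not say which OCR engine, TF-IDF variant, or classifier was used. The sketch assumes the text has already been extracted from each image, uses a smoothed TF-IDF weighting, and swaps in a simple nearest-neighbour rule on cosine similarity as the classifier. The training strings, the labels, and all function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical OCR output for one document per category (illustrative only).
TRAIN = [
    ("income tax department permanent account number card", "pan"),
    ("unique identification authority of india aadhaar enrolment number", "aadhaar"),
    ("republic of india passport nationality place of birth", "passport"),
    ("election commission of india elector photo identity card", "voter_id"),
]


def tokenize(text):
    """Lowercase and split on whitespace; a real pipeline would clean OCR noise."""
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf_vector(text, idf):
    """L2-normalised sparse TF-IDF vector; terms unseen in training are dropped."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    total = sum(counts.values())
    if total == 0:
        return {}
    vec = {t: (c / total) * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()}


def cosine(a, b):
    """Dot product of two L2-normalised sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def classify(text, train_vectors, labels, idf):
    """Return the label of the most similar training document."""
    query = tfidf_vector(text, idf)
    scores = [cosine(query, v) for v in train_vectors]
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best]


IDF = fit_idf([text for text, _ in TRAIN])
TRAIN_VECTORS = [tfidf_vector(text, IDF) for text, _ in TRAIN]
LABELS = [label for _, label in TRAIN]

if __name__ == "__main__":
    print(classify("permanent account number income tax", TRAIN_VECTORS, LABELS, IDF))
```

Terms that appear in many categories, such as "india" or "of", receive a lower IDF weight than category-specific terms such as "passport", which is what lets a short query be matched to the right document type. Accuracy figures from a toy sketch like this are not comparable with the 85.5% reported in the abstract.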