M-HMOGA: A New Multi-Objective Feature Selection Algorithm for Handwritten Numeral Classification
release_fpdy73lmovcvtpmjagw3zvtf5i

by Ritam Guha, Manosij Ghosh, Pawan Kumar Singh, Ram Sarkar, Mita Nasipuri

Published in Journal of Intelligent Systems by Walter de Gruyter GmbH.

2019  

Abstract

The feature selection process is very important in the field of pattern recognition: it selects the informative features so as to reduce the curse of dimensionality, thus improving the overall classification accuracy. In this paper, a new feature selection approach named Memory-Based Histogram-Oriented Multi-objective Genetic Algorithm (M-HMOGA) is introduced to identify the informative feature subset to be used for a pattern classification problem. The proposed M-HMOGA approach is applied to two recently used feature sets, namely Mojette transform and Regional Weighted Run Length features. The experiments are carried out on Bangla, Devanagari, and Roman numeral datasets, which are the three most popular scripts used in the Indian subcontinent. In-house Bangla and Devanagari script datasets and the Competition on Handwritten Digit Recognition (HDRC) 2013 Roman numeral dataset are used for evaluating our model. Moreover, as proof of robustness, we have applied an innovative approach of using different datasets for training and testing. We have used the in-house Bangla and Devanagari script datasets for training the model, and the trained model is then tested on the Indian Statistical Institute numeral datasets. For Roman numerals, we have used the HDRC 2013 dataset for training and the Modified National Institute of Standards and Technology dataset for testing. Comparison of the results obtained by the proposed model with the existing HMOGA and MOGA techniques clearly indicates the superiority of M-HMOGA over both of its ancestors. Moreover, the use of K-nearest neighbor as well as multi-layer perceptron classifiers demonstrates the classifier-independent nature of M-HMOGA.
Compared to using the full feature set, the proposed M-HMOGA model achieves around a 1% increase in classification ability while using only about 45–50% of the features when the same datasets are partitioned for training and testing, and a 2–3% increase while using only 35–45% of the features when different datasets are used for training and testing.
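The general idea behind such an approach — a genetic algorithm that evolves binary feature masks against two competing objectives (maximize classification accuracy, minimize the number of selected features) while keeping a memory archive of non-dominated solutions — can be illustrated with a minimal, self-contained sketch. This is not the authors' implementation: the surrogate fitness function, population sizes, and operators below are illustrative assumptions standing in for a real classifier evaluated on a numeral dataset.

```python
import random

random.seed(42)

N_FEATURES = 20   # length of the binary feature mask (hypothetical)
POP_SIZE = 30
N_GEN = 40

# Toy ground truth: only a few features are informative. In the real setting
# this would be replaced by classifier accuracy on held-out numeral data.
INFORMATIVE = set(range(8))

def fitness(chrom):
    """Two objectives: a surrogate 'accuracy' (maximize) and subset size (minimize)."""
    selected = {i for i, bit in enumerate(chrom) if bit}
    hits = len(selected & INFORMATIVE)
    noise = len(selected - INFORMATIVE)
    accuracy = hits / len(INFORMATIVE) - 0.02 * noise  # surrogate score
    return accuracy, len(selected)

def dominates(a, b):
    """Pareto dominance: a is no worse on both objectives and strictly better on one."""
    return (a[0] >= b[0] and a[1] <= b[1]) and (a[0] > b[0] or a[1] < b[1])

def evolve():
    pop = [[random.randint(0, 1) for _ in range(N_FEATURES)]
           for _ in range(POP_SIZE)]
    memory = []  # external archive of non-dominated solutions (the 'memory' idea)
    for _ in range(N_GEN):
        scored = [(fitness(c), c) for c in pop]
        # Add individuals not dominated by anyone seen this generation or archived.
        for f, c in scored:
            if not any(dominates(g, f) for g, _ in scored + memory):
                memory.append((f, c))
        # Prune the archive back to its non-dominated front.
        memory = [(f, c) for f, c in memory
                  if not any(dominates(g, f) for g, _ in memory)]
        # Tournament selection, one-point crossover, bit-flip mutation.
        nxt = []
        while len(nxt) < POP_SIZE:
            p1 = max(random.sample(scored, 2), key=lambda s: s[0][0])[1]
            p2 = max(random.sample(scored, 2), key=lambda s: s[0][0])[1]
            cut = random.randrange(1, N_FEATURES)
            child = p1[:cut] + p2[cut:]
            if random.random() < 0.2:
                i = random.randrange(N_FEATURES)
                child[i] ^= 1  # flip one bit
            nxt.append(child)
        pop = nxt
    return memory
```

The archive returned by `evolve()` approximates a Pareto front trading accuracy against subset size; reporting a compact, high-accuracy member of such a front corresponds to the paper's result of matching or beating full-feature accuracy with roughly half the features.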

Archived Files and Locations

application/pdf  1.5 MB
file_kj27ot3g35d4layfrlzp7jkhci
www.degruyter.com (publisher)
web.archive.org (webarchive)
Type  article-journal
Stage   published
Date   2019-06-14
Journal Metadata
Open Access Publication
In DOAJ
In Keepers Registry
ISSN-L:  0334-1860
Work Entity
Access all versions, variants, and formats of this work (e.g. pre-prints)
Catalog Record
Revision: 210b7e4c-597d-45fb-83d6-da59a151e8ee
API URL: JSON