Fitur Baharu dari Kombinasi Geometri Segitiga dan Pengezonan untuk Paleografi Jawi Digital (New Features from the Combination of Triangle Geometry and Zoning for Digital Jawi Paleography)

Azmi, Mohd Sanusi and Omar, Khairuddin and Nasrudin, Mohamad Faidzul and Muda, Azah Kamilah (2013) Fitur Baharu dari Kombinasi Geometri Segitiga dan Pengezonan untuk Paleografi Jawi Digital. PhD thesis, Universiti Kebangsaan Malaysia.

Abstract

The study of paleography focuses on the physical attributes of writing, such as character, calligraphy, and illumination types, to determine the origin of a manuscript. Determining the physical attributes of Jawi characters, such as the calligraphy type, using optical character recognition (OCR) is part of digital Jawi paleography (PJD, from the Malay "Paleografi Jawi Digital"). The subjects of digital paleography in this study are Jawi characters written in standalone form. However, OCR still relies on features that neglect the angle of standalone characters, an important criterion in calligraphy type recognition. Moreover, paleography research is still limited by its use of local datasets. The purpose of this study is to propose a PJD approach based on calligraphy type determination. To achieve this purpose and to address the dataset and feature problems mentioned above, this study sets four objectives: (i) to create a PJD framework and an Arabic calligraphy dataset of standalone characters; (ii) to propose angle-sensitive features from the combination of triangle geometry and zoning; (iii) to propose vector features as a continuation of (ii); and (iv) to propose discriminant standalone Arabic characters for PJD. The study began with Arabic/Farsi digit datasets because, as far as we are aware, no standalone-character dataset exists for PJD. The Arabic digit datasets chosen are HODA and IFHCDB, each with ten classes. Experiments on these datasets verified the proposed algorithm. Later, a dataset of 69,400 images of Arabic calligraphy characters was built from the handwriting of ten calligraphy experts. The proposed features combine the geometry of scalene triangles with zoning on the Cartesian plane. Features are generated from binary images divided into four zones. The zoning is then extended to Horizontal, Vertical, 25-zone, 45-degree, and 33-zone zoning.
Every experiment uses the original distribution, 10-fold cross-validation, and 10 random classifications, with both supervised machine learning (SML) and unsupervised machine learning (UML). The SML methods are support vector machine (SVM) and multi-layer perceptron (MLP), whereas the UML methods are minimum Euclidean distance and average accuracy mean. The highest SML result, 99.695 percent, is obtained with MLP on 33-zone zoning. Experiments on the new Arabic calligraphy dataset achieved more than 90 percent accuracy. The proposed algorithm outperforms other research and yields six major discriminant calligraphy characters for PJD. In conclusion, the proposed algorithm can be used for digit recognition as well as calligraphy type determination, both of which are useful for PJD.
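
As a rough illustration of the two ingredients named in the abstract, the sketch below groups the foreground pixels of a binary image into four zones and computes side lengths and interior angles of a triangle from three corner points. The abstract does not say how the triangle corners are chosen within each zone, so the function names, the midpoint-based zone split, and the feature layout are all assumptions made for this example, not the thesis's actual method.

```python
import math


def split_into_four_zones(image):
    """Group foreground pixel coordinates (x, y) of a binary image by quadrant.

    Assumption: the four zones meet at the image midpoint; the thesis may
    place the Cartesian origin elsewhere (for example at a centroid).
    """
    mid_y, mid_x = len(image) // 2, len(image[0]) // 2
    zones = {"top_left": [], "top_right": [], "bottom_left": [], "bottom_right": []}
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value:
                vertical = "top" if y < mid_y else "bottom"
                horizontal = "left" if x < mid_x else "right"
                zones[vertical + "_" + horizontal].append((x, y))
    return zones


def triangle_features(a, b, c):
    """Side lengths and interior angles (degrees) of triangle ABC.

    How corners A, B, C are picked from a zone is left to the caller
    (hypothetical; not specified in the abstract).
    """
    ab, bc, ca = math.dist(a, b), math.dist(b, c), math.dist(c, a)
    if min(ab, bc, ca) == 0:
        raise ValueError("corner points must be distinct")

    def angle(opposite, side1, side2):
        # Law of cosines, clamped to guard against rounding error.
        cos_value = (side1**2 + side2**2 - opposite**2) / (2 * side1 * side2)
        return math.degrees(math.acos(max(-1.0, min(1.0, cos_value))))

    return {
        "sides": (ab, bc, ca),
        "angles": (angle(bc, ab, ca), angle(ca, ab, bc), angle(ab, bc, ca)),
    }
```

Because the angles change when a stroke is slanted while the side ratios may not, features of this kind respond to the orientation of a character, which is the property the abstract asks of the proposed features.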

Item Type: Thesis (PhD)
Subjects: T Technology > T Technology (General)
D History General and Old World > DS Asia
Q Science > QA Mathematics > QA76 Computer software
Divisions: Faculty of Information and Communication Technology > Department of Software Engineering
Depositing User: Dr Mohd Sanusi Azmi
Date Deposited: 26 Mar 2014 15:37
Last Modified: 28 May 2015 04:20
URI: http://eprints.utem.edu.my/id/eprint/11814