Teknik Rough-Apriori Untuk Perlombongan Petua Sekutuan Linguistik

Choo, Yun Huoy (2008) Teknik Rough-Apriori Untuk Perlombongan Petua Sekutuan Linguistik. PhD thesis, Universiti Kebangsaan Malaysia.

Capturing the ambiguous, uncertain boundaries of discretised quantitative intervals has emerged as an important problem in linguistic association rule mining. Fuzzy-based techniques are currently the most prominent approach to capturing the soft boundaries of natural language terms. Besides fuzzy theory, rough set theory, through its approximation concept and rough membership function, is inherently able to deal with ambiguous boundaries. However, findings on the use of the rough membership function in mining linguistic association rules remain very limited. The main purpose of this research is therefore to capture the soft boundaries of the discretised quantitative intervals in quantitative association rules using the rough membership function. The research follows an experimental methodology. All experiments were carried out on ten UCI datasets under a 10-fold cross-validation setting. The evaluation measures were classification accuracy, rules predictive ratio, area under the ROC curve, number of rules generated, reduct length and the dimension of rules. The results were further analysed with t-tests for statistical significance.
A rough-Apriori technique is proposed in this research. It consists of two key components: the fitness-rough (FsR) attribute reduction method and the rough membership reference association rules mining (RMRARM) technique. RMRARM uses the rough membership value as the weight of each linguistic interval when transforming the linguistic decision system, before the Apriori mining algorithm is applied. A modified fuzzy generalized mining algorithm and the Boolean Reasoning quantitative association rules method were chosen for the comparative study. The FsR method is embedded in RMRARM to reduce the complexity of the data while keeping information loss low. It is a hybrid of statistical fitness degree and rough reducts calculation that reduces attributes in a simpler way. The FsR method was compared with the classical rough reducts method, the statistical entropy method and the correlation-based feature selection method. Experimental results showed that the FsR method performed comparatively well, with higher reduction strength and smaller rule sets than the methods it was compared with. The RMRARM technique, in turn, demonstrated performance comparable to that of the techniques it was compared with, and it generates more specific rules than the fuzzy-based technique. The research contributes to the study of data mining, particularly linguistic association rule mining. The experiments show that the rough membership function is comparably capable of capturing the soft boundaries of discretised quantitative intervals through its natural feature of assigning graded probabilities to objects.
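
For readers unfamiliar with the rough membership function mentioned in the abstract, the following is a minimal sketch of its standard textbook definition, mu_X(x) = |[x]_B ∩ X| / |[x]_B|, where [x]_B is the set of objects indistinguishable from x on the condition attributes B. It is not the RMRARM technique from the thesis. The function name, the attribute names and the toy table are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from fractions import Fraction


def rough_membership(rows, condition_attrs, decision_attr, target):
    """Return mu_X(x) for every row, where X = {rows whose decision == target}.

    mu_X(x) = |[x]_B ∩ X| / |[x]_B|, with [x]_B the set of rows that share
    x's values on the condition attributes B.
    """
    # Group row indices into indiscernibility classes on the condition attributes.
    classes = defaultdict(list)
    for i, row in enumerate(rows):
        key = tuple(row[a] for a in condition_attrs)
        classes[key].append(i)

    membership = [None] * len(rows)
    for members in classes.values():
        # Count how many members of this class belong to the target concept X.
        hits = sum(1 for i in members if rows[i][decision_attr] == target)
        value = Fraction(hits, len(members))
        for i in members:
            membership[i] = value
    return membership


# Hypothetical toy decision table with already-discretised (linguistic) values.
rows = [
    {"age": "young", "income": "low", "buys": "no"},
    {"age": "young", "income": "low", "buys": "yes"},
    {"age": "young", "income": "low", "buys": "yes"},
    {"age": "old", "income": "high", "buys": "yes"},
    {"age": "old", "income": "low", "buys": "no"},
]

mu = rough_membership(rows, ["age", "income"], "buys", "yes")
for row, m in zip(rows, mu):
    print(row["age"], row["income"], "->", m)
```

In this toy table the three indistinguishable (young, low) objects each receive a graded membership of 2/3 in the concept "buys = yes", rather than a crisp 0 or 1. This is the kind of graded value that the abstract describes as being used to weight linguistic intervals.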

Item Type: Thesis (PhD)
Uncontrolled Keywords: Dissertations, Academic -- Malaysia, Algorithms, Data mining, Data warehousing
Subjects: Q Science > Q Science (General)
Q Science > QA Mathematics > QA76 Computer software
Divisions: Library > Disertasi > FTMK
Depositing User: Nor Aini Md. Jali
Date Deposited: 23 Nov 2014 13:18
Last Modified: 28 May 2015 04:31
URI: http://eprints.utem.edu.my/id/eprint/13398