Validating Causal Effect Rules Using Cluster-Based Cohort Study

Iman, Muhammad Naufal (2017) Validating Causal Effect Rules Using Cluster-Based Cohort Study. Masters thesis, Universiti Teknikal Malaysia Melaka.

Full text: Validating Causal Effect Rules Using Cluster-Based Cohort Study - Muhammad Naufal Iman - 24 Pages.pdf (Submitted Version, 376kB)

Abstract

Mining association rules from massive amounts of data is of interest to many industries, especially for root cause analysis. Many techniques have been introduced to identify causal effect root causes within the association rule mining framework, such as the support-confidence and support-lift frameworks. However, verifying and validating causal effect root causes usually involves an expert from the business domain, which increases the complexity of the rule mining process and the time it takes. Hence, this study proposes a cohort study approach to statistically verify the causal effect root causes generated by the Apriori association rule mining technique. The study follows an experimental methodology in testing and validating the proposed cohort study approach. The project also studied the partitioning technique used in the cohort study approach: the proposed cluster-based partitioning technique using k-means clustering was compared with manual partitioning through analysis of experimental results. The data used in the experiments were taken from a semiconductor manufacturer in Melaka and consist of true alarms of failure detection collected from the business intelligence reporting unit. The results were positive for root cause validation using the k-means partitioned cohort study. The manual partitioning cohort study generated 107 rules, while the k-means partitioned cohort study produced 49 rules. Only 8 rules appeared in both approaches. Thus, we conclude that the 8 rules generated by both approaches are definite causal effect rules, while the others remain to be confirmed by a domain expert. In summary, the cohort study approach can be used to validate causal effect rules to a certain extent. Manual partitioning to create different cohort data can be done only if there is sufficient knowledge about the data. On the other hand, the k-means clustering technique can be used to partition the raw data into different cohorts for further validation. The limitation of this work is that the generated root causes were not validated with a domain expert, owing to time constraints, so future work should focus on domain expert validation. In addition, lift standardization and thresholding should be considered, as they are believed to improve the generated causal effect rules.
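The abstract names the support-confidence and support-lift frameworks and a cohort-based check, but does not give the formulas or the exact validation criterion used in the thesis. The following is a minimal sketch under stated assumptions: it uses the standard definitions of support, confidence and lift, and it assumes relative risk (a common cohort-study measure) as the test applied within each cohort. The function names, the `min_rr` threshold and the toy records are all hypothetical and are not taken from the thesis.

```python
from typing import Dict, FrozenSet, Iterable, List, Optional

Record = FrozenSet[str]


def support(records: List[Record], items: Iterable[str]) -> float:
    """Fraction of records that contain every item in `items`."""
    wanted = frozenset(items)
    if not records:
        return 0.0
    return sum(1 for r in records if wanted <= r) / len(records)


def rule_metrics(records: List[Record], antecedent: Iterable[str],
                 consequent: Iterable[str]) -> Dict[str, float]:
    """Standard support, confidence and lift for antecedent -> consequent."""
    a, c = frozenset(antecedent), frozenset(consequent)
    s_a, s_c, s_ac = support(records, a), support(records, c), support(records, a | c)
    confidence = s_ac / s_a if s_a else 0.0
    lift = confidence / s_c if s_c else 0.0
    return {"support": s_ac, "confidence": confidence, "lift": lift}


def relative_risk(records: List[Record], antecedent: Iterable[str],
                  consequent: Iterable[str]) -> Optional[float]:
    """Risk of the consequent among exposed records divided by the risk
    among unexposed records. Returns None when it cannot be computed
    (an empty group, or zero risk in the unexposed group)."""
    a, c = frozenset(antecedent), frozenset(consequent)
    exposed = [r for r in records if a <= r]
    unexposed = [r for r in records if not (a <= r)]
    if not exposed or not unexposed:
        return None
    risk_exposed = sum(1 for r in exposed if c <= r) / len(exposed)
    risk_unexposed = sum(1 for r in unexposed if c <= r) / len(unexposed)
    if risk_unexposed == 0:
        return None
    return risk_exposed / risk_unexposed


def holds_in_every_cohort(cohorts: List[List[Record]], antecedent: Iterable[str],
                          consequent: Iterable[str], min_rr: float = 1.0) -> bool:
    """Assumed criterion: keep a rule only if its relative risk exceeds
    `min_rr` in every cohort (cohorts may come from manual or k-means
    partitioning; the partitioning itself is not shown here)."""
    a, c = list(antecedent), list(consequent)
    for cohort in cohorts:
        rr = relative_risk(cohort, a, c)
        if rr is None or rr <= min_rr:
            return False
    return True


def make(tool: str, outcome: str, n: int) -> List[Record]:
    """Build n identical toy records (hypothetical data)."""
    return [frozenset({tool, outcome})] * n


# Two invented cohorts of failure-detection records.
cohort_1 = (make("tool=A", "fail", 3) + make("tool=A", "ok", 1)
            + make("tool=B", "fail", 1) + make("tool=B", "ok", 3))
cohort_2 = (make("tool=A", "fail", 2)
            + make("tool=B", "fail", 2) + make("tool=B", "ok", 2))

print(rule_metrics(cohort_1, ["tool=A"], ["fail"]))
print(holds_in_every_cohort([cohort_1, cohort_2], ["tool=A"], ["fail"]))
```

In this toy data the rule `tool=A -> fail` has a relative risk above 1 in both cohorts, so the assumed criterion retains it, while `tool=B -> fail` would be rejected. Treating an undefined relative risk as a rejection is a simplification chosen for the sketch.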

Item Type: Thesis (Masters)
Uncontrolled Keywords: Data mining, Sampling (Statistics), Web database
Subjects: Q Science > Q Science (General); Q Science > QA Mathematics > QA76 Computer software
Divisions: Library > Tesis > FTMK
Depositing User: Nor Aini Md. Jali
Date Deposited: 25 Apr 2018 09:20
Last Modified: 25 Apr 2018 09:20
URI: http://eprints.utem.edu.my/id/eprint/20738
