Classification of SNPs for obesity analysis using FARNeM modelling

Ong, Phaik Ling (2013) Classification of SNPs for obesity analysis using FARNeM modelling. Project Report. UTeM. (Submitted)



Obesity research is increasingly turning to Single Nucleotide Polymorphisms (SNPs), because once disease-related SNPs are recognised through classification, personalized medicine can be customized for the patient, which in turn allows early diagnosis. However, large, redundant and noisy SNP data are costly and time-consuming to process, so feature selection has to precede the classification task. This experiment follows a general methodology of six phases: preliminary studies, data preparation, SNP reduction, classification of SNPs, benchmarking and analysis, and result validation. Forward attribute reduction based on the neighbourhood rough set model (FARNeM) is used to select attributes that are disease-related and to discard those that are not, because it avoids the information loss caused by the discretization process in classical rough sets. A common threshold, 0.1, and a common distance measure, Euclidean distance, are used in FARNeM to perform feature selection. The reduction results are then compared among FARNeM, Correlation Feature Selection (CFS), ReliefF, and the data without feature selection. CFS and ReliefF were chosen for their reduction properties, subset reduction and ranking reduction respectively, which are believed to produce results comparable with FARNeM. In a diagnostic task it is best to maximize positive predictive value and negative predictive value, so classification accuracy, sensitivity and specificity are used to assess the error rate in more detail. The experimental results show that performing feature selection is worthwhile and that FARNeM performs better than the other techniques in the sensitivity and specificity measurements. However, the accuracy of FARNeM is badly affected by skewed data, so future work needs to improve the handling of skewed data. It is also suggested to tune the threshold parameter, because the threshold is very important in determining the size of the neighbourhood.
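
The forward reduction described above can be sketched roughly as follows. This is a minimal illustration of greedy attribute selection with a neighbourhood rough set dependency measure, not the implementation used in the report. The function names `dependency` and `forward_reduction`, the stopping tolerance `eps`, and the assumption that every attribute is numeric and scaled to [0, 1] are all hypothetical; only the threshold of 0.1 and the Euclidean distance come from the abstract.

```python
import numpy as np

def dependency(X, y, features, delta=0.1):
    """Fraction of samples whose delta-neighbourhood holds a single class."""
    if not features:
        return 0.0
    sub = X[:, features]
    # Pairwise Euclidean distances using only the chosen attributes.
    # This builds an n x n x k array, so it only suits small data sets.
    diff = sub[:, None, :] - sub[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    neighbours = dist <= delta
    same_class = y[:, None] == y[None, :]
    # A sample is consistent when every neighbour shares its class label.
    consistent = np.all(~neighbours | same_class, axis=1)
    return consistent.mean()

def forward_reduction(X, y, delta=0.1, eps=0.0):
    """Greedily add the attribute that raises the dependency the most."""
    selected = []
    remaining = list(range(X.shape[1]))
    best = 0.0
    while remaining:
        scores = [dependency(X, y, selected + [f], delta) for f in remaining]
        i = int(np.argmax(scores))
        # Stop once no remaining attribute improves the dependency (assumed rule).
        if scores[i] - best <= eps:
            break
        selected.append(remaining.pop(i))
        best = scores[i]
    return selected
```

Because the neighbourhood is defined directly on distances between numeric values, no discretization step is needed. A larger `delta` widens each neighbourhood and tends to make fewer samples consistent, which is why the choice of threshold matters.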

Item Type: Monograph (Project Report)
Uncontrolled Keywords: human genetics -- data processing, obesity -- genetics, genetics -- data processing
Subjects: Q Science > QH Natural history
Divisions: Library > Projek Sarjana Muda > FTMK
Depositing User: Noor Rahman Jamiah Jalil
Date Deposited: 06 Jul 2015 06:37
Last Modified: 06 Jul 2015 06:37
