Mediterranean anemia (thalassemia) is a genetic disease whose diagnosis currently relies heavily on expert clinical experience. This dependence on expert judgment makes diagnosis insufficiently precise. This paper proposes two modeling methods to predict whether a patient has Mediterranean anemia. The first uses Principal Component Analysis (PCA) to reduce the dimensionality of the data and then fits a logistic regression model on the reduced dataset (PCA-LR). The second builds a Partial Least Squares (PLS) regression model. Experimental results show a prediction accuracy of 87.5% for the PCA-LR model (degree = 2, λ = 4) and 92.5% for the PLS model (ncomp = 4), indicating good predictive performance for both models.
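The PCA-LR idea described in this abstract (project the data onto a few principal components, then fit a logistic regression on the projected data) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data is synthetic, the function names (`pca_fit`, `pca_transform`, `fit_logistic`, `predict`) are hypothetical, and the settings used here (two components, an L2 penalty `lam`, plain gradient descent) are illustrative choices that do not reproduce the paper's "degree = 2, λ = 4" configuration.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the feature means and the top principal axes of X."""
    mean = X.mean(axis=0)
    # SVD of the centered data; the rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    """Project X onto the principal axes learned by pca_fit."""
    return (X - mean) @ components.T

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(Z, y, lam=1.0, lr=0.1, n_iter=2000):
    """Gradient descent on the L2-regularized logistic loss (bias unpenalized)."""
    n, d = Z.shape
    w, b = np.zeros(d), 0.0
    for _ in range(n_iter):
        p = sigmoid(Z @ w + b)
        w -= lr * (Z.T @ (p - y) / n + lam * w / n)
        b -= lr * np.mean(p - y)
    return w, b

def predict(Z, w, b):
    """Classify as 1 when the predicted probability is at least 0.5."""
    return (sigmoid(Z @ w + b) >= 0.5).astype(int)

# Synthetic stand-in data (assumption): 10 features, all shifted for class 1.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
X = rng.normal(size=(200, 10)) + 2.0 * y[:, None]

X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# Fit PCA on the training split only, then reuse it for the test split.
mean, components = pca_fit(X_train, n_components=2)
Z_train = pca_transform(X_train, mean, components)
Z_test = pca_transform(X_test, mean, components)

w, b = fit_logistic(Z_train, y_train)
accuracy = float(np.mean(predict(Z_test, w, b) == y_test))
print(f"held-out accuracy: {accuracy:.3f}")
```

Fitting the PCA on the training split alone keeps information from the held-out samples out of the projection, so the reported accuracy reflects data the model has not seen.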
Brain arteriovenous malformation (BAVM) is frequently described as a vascular malformation. Although computed tomography (CT), magnetic resonance imaging (MRI), and angiography can clearly detect lesions, no diagnostic biological markers of BAVM are available. Current studies have demonstrated that microRNAs (miRNAs) are feasible markers for vascular disease. To find key correlations between these miRNAs and the onset of BAVM, we carried out chip analysis of serum miRNAs and identified 18 potential markers of BAVM. We then constructed a principal component analysis and logistic regression (PCA-LR) model to analyze the 18 miRNAs collected from 77 patients. Another 9 independent samples were used to test the resulting model. The results showed that hsa-mir-126-3p and hsa-mir-140 are important protective factors, while hsa-mir-338 is a dominant risk factor; all three correlate more strongly with BAVM than the other miRNAs. We also compared the test results of the PCA-LR model with those of the LR model. The comparison revealed that the PCA-LR model is better at predicting the disease.
There are a variety of classification techniques, such as neural networks, decision trees, support vector machines, and logistic regression. The problem of dimensionality is pertinent to many learning algorithms: it denotes the drastic rise in computational complexity, which is why dimensionality reduction methods are needed. These methods include principal component analysis (PCA) and locality preserving projection (LPP). In many real-world classification problems, the local structure is more important than the global structure, yet dimensionality reduction techniques such as PCA ignore the local structure and preserve the global structure. The objectives are to compare PCA and LPP in terms of accuracy, to develop appropriate representations of complex data by reducing its dimensionality, and to explain the importance of using LPP with logistic regression. The results show that the proposed LPP approach provides a better representation and higher accuracy than the PCA approach.