This paper presents a new system that distinguishes promoter sequences from non-promoter sequences using a position weight matrix and a backpropagation neural network. The system performs well on both the training set and the test set, with mean recognition rates as high as 99% on the training set and 97% on the test set. Experimental results demonstrate that the system effectively recognizes both the promoter sequences it was trained on and promoter sequences it has not seen previously.
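As a rough illustration of the position weight matrix component named in this abstract, the following sketch builds a log-odds matrix from a few aligned sites and scores candidate windows against it. The abstract does not describe how the matrix is constructed or how its scores are passed to the backpropagation network, so the pseudocount, the uniform background, the toy sites, and the function names `build_pwm`, `score_window`, and `best_score` are all assumptions made for illustration.

```python
import math

def build_pwm(aligned_sites, pseudocount=1.0, alphabet="ACGT"):
    """Build a log-odds position weight matrix from equal-length aligned sites.

    Assumes a uniform background distribution and an additive pseudocount;
    neither choice is taken from the paper.
    """
    length = len(aligned_sites[0])
    background = 1.0 / len(alphabet)
    pwm = []
    for pos in range(length):
        counts = {base: pseudocount for base in alphabet}
        for site in aligned_sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / background)
                    for b in alphabet})
    return pwm

def score_window(pwm, window):
    """Sum the log-odds weights of the bases in a window as wide as the matrix."""
    return sum(column[base] for column, base in zip(pwm, window))

def best_score(pwm, sequence):
    """Slide the matrix along a longer sequence and return the highest window score."""
    width = len(pwm)
    return max(score_window(pwm, sequence[i:i + width])
               for i in range(len(sequence) - width + 1))

# Hypothetical toy sites, loosely resembling a -10 box; not data from the paper.
sites = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
pwm = build_pwm(sites)

consensus_score = score_window(pwm, "TATAAT")
unrelated_score = score_window(pwm, "GCGCGC")
scanned_score = best_score(pwm, "GGGTATAATCCC")
```

In a pipeline of the kind the abstract outlines, scores such as `scanned_score` could serve as input features for a classifier, though the actual feature design is not stated in the source.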
One hundred encoded proteins were sampled from SARS coronavirus (SARS-CoV, NC_004718) and six other coronaviruses, and 23 variables were selected from 172 candidates by stepwise multiple regression (SMR). The resulting multiple linear regression (MLR) model gave a quantitative modelling correlation coefficient of R² = 0.645 and a cross-validation correlation coefficient of RCV = 0.375. After 4 outliers were removed, the modelling and cross-validation correlation coefficients were R² = 0.743 and RCV = 0.543, respectively.
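To make the difference between a modelling coefficient and a cross-validation coefficient concrete, here is a minimal sketch that fits a multiple linear regression by ordinary least squares and compares the fitted R² with a leave-one-out estimate. The abstract does not say which cross-validation scheme or formula lies behind RCV, so leave-one-out and the definition 1 - PRESS/SS_tot are assumptions, the data are invented toy values, and the stepwise variable selection step is not shown.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_mlr(xs, ys):
    """Return [intercept, b1, b2, ...] from the normal equations."""
    rows = [[1.0] + list(x) for x in xs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear(xtx, xty)

def predict(coefs, x):
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], x))

def r_squared(ys, preds):
    """1 - SS_res / SS_tot, with SS_tot taken around the mean of ys."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def loo_predictions(xs, ys):
    """Predict each sample from a model fitted on all the other samples."""
    preds = []
    for i in range(len(ys)):
        coefs = fit_mlr(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(predict(coefs, xs[i]))
    return preds

# Invented toy data: two descriptors per sample and a noisy response.
xs = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5]]
ys = [4.2, 5.3, 9.1, 10.4, 14.3, 15.2]

coefs = fit_mlr(xs, ys)
r2_fit = r_squared(ys, [predict(coefs, x) for x in xs])
q2_loo = r_squared(ys, loo_predictions(xs, ys))
```

For ordinary least squares the leave-one-out value cannot exceed the fitted one, which is consistent with the abstract reporting a lower cross-validation coefficient than modelling coefficient both before and after outlier removal.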