This paper develops sequence-based methods for identifying novel protein-protein interactions (PPIs) by means of support vector machines (SVMs). The authors encode proteins not only at the gene level but also at the amino acid level, and design a procedure for selecting the negative training set to deal with the imbalance of the training dataset, i.e., interacting protein pairs are scarce relative to the large number of non-interacting protein pairs. The proposed methods are validated on PPI data of Plasmodium falciparum and Escherichia coli, and yield predictive accuracies of 93.8% and 95.3%, respectively. Functional annotation analysis and database searches indicate that the novel predictions are worthy of future experimental validation. The new methods will be useful supplementary tools for future proteomics studies.
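The abstract does not spell out the encoding or the negative-set selection procedure, so the following is only a minimal sketch of the general idea under stated assumptions. It swaps in two common stand-ins: amino acid composition as the sequence-level encoding, and random undersampling of unobserved pairs as the negative set. The names `composition`, `pair_features`, and `balanced_training_set` are hypothetical and not taken from the paper.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Return a 20-dimensional vector of amino acid frequencies.

    Assumption: composition is used here as a simple stand-in for the
    amino-acid-level encoding; the paper's actual encoding is not given.
    """
    seq = seq.upper()
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts)  # non-standard residues are ignored
    return [c / total if total else 0.0 for c in counts]


def pair_features(seq_a, seq_b):
    """Encode a protein pair by concatenating the two composition vectors."""
    return composition(seq_a) + composition(seq_b)


def balanced_training_set(proteins, positive_pairs, seed=0):
    """Build features X and labels y with as many negatives as positives.

    proteins: dict mapping protein name -> sequence.
    positive_pairs: list of (name, name) tuples known to interact.
    Assumption: pairs absent from positive_pairs are treated as
    non-interacting and sampled uniformly at random (plain random
    undersampling, not the authors' selection procedure).
    """
    rng = random.Random(seed)
    known = {frozenset(p) for p in positive_pairs}
    names = sorted(proteins)
    candidates = [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if frozenset((a, b)) not in known
    ]
    negatives = rng.sample(candidates, min(len(positive_pairs), len(candidates)))
    X, y = [], []
    for label, pairs in ((1, positive_pairs), (0, negatives)):
        for a, b in pairs:
            X.append(pair_features(proteins[a], proteins[b]))
            y.append(label)
    return X, y
```

The resulting `X` and `y` could then be passed to any SVM implementation, for example `sklearn.svm.SVC` from scikit-learn, if that library is available.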
A local algorithm is proposed for unconstrained optimization problems. Compared with the traditional Newton method with Cholesky factorization, this algorithm has the same quadratic convergence, but its average computational cost per iteration is lower when the dimension n ≥ 55. The saving is estimated within the theoretical framework.
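For reference, the baseline that the abstract compares against can be sketched as follows. This is a minimal pure-Python illustration of the traditional Newton method, in which each step solves the Newton system through a Cholesky factorization of the Hessian. It is not the proposed algorithm, and it does not reproduce the cost saving for n ≥ 55. The function names and the callables `grad` and `hess` are assumptions made for the example.

```python
import math


def cholesky(a):
    """Return lower-triangular L with a = L L^T for a symmetric matrix a.

    Raises ValueError if a is not positive definite.
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                l[i][j] = math.sqrt(s)
            else:
                l[i][j] = s / l[j][j]
    return l


def cholesky_solve(l, b):
    """Solve (L L^T) x = b by forward and backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):  # forward substitution: L y = b
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # backward substitution: L^T x = y
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x


def newton(grad, hess, x0, tol=1e-10, max_iter=50):
    """Pure Newton iteration (no line search), so convergence is only local.

    grad(x) returns the gradient as a list; hess(x) returns the Hessian
    as a list of rows and is assumed positive definite near the solution.
    """
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        if math.sqrt(sum(gi * gi for gi in g)) < tol:
            break
        step = cholesky_solve(cholesky(hess(x)), [-gi for gi in g])
        x = [xi + si for xi, si in zip(x, step)]
    return x
```

As a usage example, minimizing f(x, y) = cosh(x) + (x - y)^2, whose unique minimizer is the origin:

```python
grad = lambda v: [math.sinh(v[0]) + 2 * (v[0] - v[1]), -2 * (v[0] - v[1])]
hess = lambda v: [[math.cosh(v[0]) + 2, -2.0], [-2.0, 2.0]]
x_min = newton(grad, hess, [1.0, -1.0])
```

The factorization costs on the order of n^3/3 multiplications per iteration, which is the per-iteration cost the abstract's comparison refers to.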
Funding: This research is supported by the Key Project of the National Natural Science Foundation of China under Grant No. 10631070, the National Natural Science Foundation of China under Grant Nos. 10801112, 10971223, and 11071252, and the Ph.D. Graduate Start Research Foundation of Xinjiang University Funded Project under Grant No. BS080101. The authors thank Dr. Yong Wang of the Institute of Systems Science, Academy of Mathematics and Systems Science, for helpful discussions and suggestions.