Abstract: In the post-genomic era, the construction and control of genetic regulatory networks using gene expression data is an active research topic. Boolean networks (BNs) and their extension, probabilistic Boolean networks (PBNs), have served as effective tools for this purpose. However, PBNs are difficult to use in practice when the number of genes is large because of the huge computational cost. In this paper, we propose a simplified multivariate Markov model for approximating a PBN. The new model preserves the strength of PBNs, namely the ability to capture the inter-dependence of the genes in the network, and at the same time reduces the complexity of the network and therefore the computational cost. We then present an optimal control model with hard constraints for the control/intervention of a genetic regulatory network. Numerical examples based on yeast data are given to demonstrate the effectiveness of the proposed model and control policy.
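To make the cost argument concrete, the following is a minimal sketch of how the state-transition matrix of a tiny PBN can be built. It is not the simplified multivariate Markov model proposed in the paper. The two-gene network, its Boolean predictor functions, and their selection probabilities are hypothetical, and predictors are assumed to be selected independently for each gene. The matrix has 2^n rows for n genes, which is the growth that makes large PBNs expensive.

```python
from itertools import product

# Hypothetical 2-gene PBN: each gene has candidate Boolean predictor
# functions paired with selection probabilities (illustrative values only).
PREDICTORS = [
    # gene 0: two candidate predictors
    [(lambda s: s[1], 0.7), (lambda s: s[0] and s[1], 0.3)],
    # gene 1: a single predictor
    [(lambda s: not s[0], 1.0)],
]


def transition_matrix(predictors):
    """Return (states, P) where P[i][j] is the one-step probability
    of moving from states[i] to states[j]."""
    n = len(predictors)
    states = list(product([0, 1], repeat=n))  # all 2**n network states
    index = {s: i for i, s in enumerate(states)}
    P = [[0.0] * len(states) for _ in states]
    for s in states:
        # Enumerate every combination of one predictor per gene
        # (assumption: predictors are chosen independently across genes).
        for choice in product(*predictors):
            prob = 1.0
            nxt = []
            for func, p in choice:
                prob *= p
                nxt.append(int(bool(func(s))))
            P[index[s]][index[tuple(nxt)]] += prob
    return states, P


states, P = transition_matrix(PREDICTORS)
```

Each row of `P` sums to one, so `P` is the transition matrix of a Markov chain over the network states.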
Funding: Supported by the Research Grants Council of Hong Kong under Grant No. 17301214; HKU CERG Grants; the Fundamental Research Funds for the Central Universities; the Research Funds of Renmin University of China; the Hung Hing Ying Physical Research Grant; and the Natural Science Foundation of China under Grant No. 11271144.
Abstract: Driven by the challenge of integrating large amounts of experimental data, classification has emerged as one of the major tools in computational biology and bioinformatics research. Machine learning methods, especially kernel methods with Support Vector Machines (SVMs), are very popular and effective. From the perspective of the kernel matrix, a technique called Eigen-matrix translation has been introduced for protein data classification. The Eigen-matrix translation strategy has many useful properties that deserve further exploration. This paper investigates the major role of Eigen-matrix translation in classification. The authors propose that its importance lies in the dimension reduction of predictor attributes within the data set, which is especially important when the dimension of the feature space is huge. The authors show by numerical experiments on real biological data sets that the proposed framework is effective in improving classification accuracy. It can therefore serve as a novel perspective for future research on dimension reduction problems.
Funding: Supported in part by an HKRGC Grant; the HKU Strategic Theme Grant on Computational Sciences; and the National Natural Science Foundation of China under Grant Nos. 10971075 and 11271144.
Abstract: Modeling genetic regulatory networks is an important research topic in genomic research and computational systems biology. This paper considers the problem of constructing a genetic regulatory network (GRN) using the discrete dynamic system (DDS) model approach. Although considerable research has been devoted to building GRNs, many of these works did not consider the time-delay effect. Here, the authors propose a time-delay DDS model composed of linear difference equations to represent temporal interactions among significantly expressed genes. The authors also introduce an interpolation scheme and a re-sampling method to compensate for non-uniformly spaced sampling time points. Statistical significance plays an active role in obtaining the optimal interaction matrix of GRNs. The genetic network constructed using linear multiple regression matches the original data very well. Simulation results are given to demonstrate the effectiveness of the proposed method and model.
Funding: This research is supported in part by HKRGC Grant 7017/07P, HKU CRCG Grants, the HKU Strategic Theme Grant on Computational Sciences, the HKU Hung Hing Ying Physical Science Research Grant, National Natural Science Foundation of China Grant No. 10971075, and Guangdong Provincial Natural Science Grant No. 9151063101000021. A preliminary version of this paper was presented at the OSB2009 conference and published in the corresponding conference proceedings [25]. The authors would like to thank the anonymous referees for their helpful comments and suggestions.
Abstract: Predicting protein functions is an important issue in the post-genomic era. This paper studies several network-based kernels, including the local linear embedding (LLE) kernel, the diffusion kernel, and the Laplacian kernel, to uncover the relationship between protein functions and protein-protein interactions (PPI). The authors first construct kernels based on PPI networks, then apply support vector machine (SVM) techniques to classify proteins into different functional groups. Five-fold cross-validation is then applied to 359 selected GO terms to compare the performance of the different kernels and of guilt-by-association methods, including neighbor counting methods and Chi-square methods. Finally, the authors predict the functions of some unknown genes and verify the accuracy of the predictions in part using information from other data sources.
Funding: Supported in part by the Research Grants Council of Hong Kong under Grant No. 17301214; HKU CERG Grants; the Hung Hing Ying Physical Research Grant; the Research Funds of Renmin University of China; and the National Natural Science Foundation of China under Grant Nos. 11271144, 11101382, 11471256, and S201201009985.
Abstract: String kernels are popular tools for analyzing protein sequence data and have been successfully applied to many computational biology problems. Traditional string kernels assume that different substrings are independent. However, substrings can be highly correlated due to their substructure relationship or common physico-chemical properties. This paper proposes two kinds of weighted spectrum kernels: the correlation spectrum kernel and the AA spectrum kernel. The authors evaluate their performance by predicting glycan-binding proteins of 12 glycans. The results show that the correlation spectrum kernel and the AA spectrum kernel perform significantly better than the spectrum kernel for nearly all of the 12 glycans. By comparing the predictive power of AA spectrum kernels constructed from different physico-chemical properties, the authors can also identify the physico-chemical properties that contribute the most to glycan-protein binding. The results indicate that physico-chemical properties of amino acids in proteins play an important role in the mechanism of glycan-protein binding.
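For orientation, here is a minimal sketch of the plain (unweighted) k-spectrum kernel that serves as the baseline in this abstract. It treats every length-k substring as an independent feature, which is the assumption the weighted kernels relax. The function names and the default k=3 are illustrative choices, and the correlation and AA weightings are not reproduced because the abstract does not specify them.

```python
from collections import Counter


def spectrum_features(seq, k):
    """Count every length-k substring (k-mer) of seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(x, y, k=3):
    """Inner product of the k-mer count vectors of sequences x and y."""
    fx = spectrum_features(x, k)
    fy = spectrum_features(y, k)
    # A Counter returns 0 for k-mers it has not seen, so only shared
    # k-mers contribute to the sum.
    return sum(count * fy[kmer] for kmer, count in fx.items())


# Toy strings standing in for protein sequences.
value = spectrum_kernel("ABAB", "BABA", k=2)
```

A weighted variant would replace the plain product of counts with a sum over pairs of k-mers scaled by a similarity weight, so that related substrings also contribute.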
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61673019, 11931018); the Natural Science Foundation of Guangdong Province (Grant Nos. 2018A030313738, 2021A1515010057); the Guangdong Province Key Laboratory of Computational Science at Sun Yat-sen University (2020B1212060032); and the IMR and RAE Research Fund, Faculty of Science, HKU.
Abstract: We study Markov decision processes under the average-value-at-risk criterion. The state space and the action space are Borel spaces, the costs are allowed to be unbounded from above, and the discount factors are state-action dependent. Under suitable conditions, we establish the existence of optimal deterministic stationary policies. Furthermore, we apply our main results to a cash-balance model.
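As a small illustration of the risk measure named in this abstract, the sketch below computes the average value-at-risk (AVaR) of an empirical cost sample. It uses the standard minimization representation AVaR_alpha(X) = min_s { s + E[(X - s)^+] / (1 - alpha) }. It does not model the decision process, the discounting, or the cash-balance application. The function name, the equal weighting of samples, and the convention that alpha lies in [0, 1) are assumptions made for this example.

```python
def avar(costs, alpha):
    """Average value-at-risk of equally likely cost samples at level alpha."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    n = len(costs)

    def objective(s):
        # s + E[(X - s)^+] / (1 - alpha) for the empirical distribution.
        excess = sum(max(c - s, 0.0) for c in costs)
        return s + excess / (n * (1.0 - alpha))

    # The objective is convex and piecewise linear in s with kinks only at
    # the sample values, so its minimum is attained at one of the samples.
    return min(objective(s) for s in costs)


risk = avar([1.0, 2.0, 3.0, 4.0], alpha=0.5)
```

At alpha = 0.5 the result is the mean of the worst half of the sample, and at alpha = 0 it reduces to the ordinary expected cost.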