Abstract: In this editorial, we introduce the research paradigms of signal processing in the era of systems biology. Signal processing is a field of science traditionally focused on modeling electronic and communications systems, but recently it has turned to biological applications with astounding results. The essence of signal processing is to describe the natural world by mathematical models and then, based on these models, develop efficient computational tools for solving engineering problems. Here, we underline, with examples, the endless possibilities that arise when the battle-hardened tools of engineering are applied to solve the problems that have tormented cancer researchers. Based on this approach, a new field has emerged, called cancer systems biology. Despite its short history, cancer systems biology has already produced several success stories tackling previously impracticable problems. Perhaps most importantly, it has been accepted as an integral part of the major endeavors of cancer research, such as analyzing the genomic and epigenomic data produced by The Cancer Genome Atlas (TCGA) project. Finally, we show that signal processing and cancer research, two fields that are seemingly distant from each other, have merged into a field that is indeed more than the sum of its parts.
Abstract: We examine the role of big data and machine learning in cancer research. We describe an example in cancer research where gene-level data from The Cancer Genome Atlas (TCGA) consortium is interpreted using a pathway-level model. As the complexity of computational models increases, their sample requirements grow exponentially. This growth stems from the fact that the number of combinations of variables grows exponentially as the number of variables increases. Thus, a large sample size is needed. The number of variables in a computational model can be reduced by incorporating biological knowledge. One particularly successful way of doing this is by using available gene regulatory, signaling, metabolic, or context-specific pathway information. We conclude that the incorporation of existing biological knowledge is essential for progress in using big data for cancer research.
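The exponential growth described in this abstract can be illustrated with a small calculation. The sketch below assumes every variable is discrete with the same number of levels. The function name `joint_configurations` and the sizes (30 gene-level variables, 5 pathway-level variables) are hypothetical choices for illustration and are not taken from the paper.

```python
def joint_configurations(n_variables: int, levels: int = 2) -> int:
    """Number of distinct joint states of n discrete variables with equal levels."""
    return levels ** n_variables


# Hypothetical sizes: 30 binary gene-level variables vs. 5 binary pathway-level variables.
gene_level = joint_configurations(30)
pathway_level = joint_configurations(5)
print(gene_level, pathway_level)  # prints 1073741824 32
```

Under these assumptions, collapsing 30 gene-level variables into 5 pathway-level variables shrinks the space of joint states from about a billion to 32, which is the kind of reduction the abstract attributes to incorporating pathway knowledge.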
Funding: Supported in part by the Arkansas Biosciences Institute under Grant (No. UL1TR000039) and the IDeA Networks of Biomedical Research Excellence (INBRE) Grant (No. P20RR16460).
Abstract: Background: Data from RNA-seq experiments provide a wealth of information about the transcriptome of an organism. However, the analysis of such data is very demanding. In this study, we aimed to establish robust analysis procedures that can be used in clinical practice. Methods: We studied RNA-seq data from triple-negative breast cancer patients. Specifically, we investigated the subsampling of RNA-seq data. Results: The main results of our investigations are as follows: (1) the subsampling of RNA-seq data gave biologically realistic simulations of sequencing experiments with smaller sequencing depth, whereas direct scaling of count matrices did not; (2) the saturation of results required an average sequencing depth larger than 32 million reads and an individual sequencing depth larger than 46 million reads; and (3) for an abrogated feature selection, higher moments of the distribution of all expressed genes had a higher sensitivity for signal detection than the corresponding mean values. Conclusions: Our results reveal important characteristics of RNA-seq data that must be understood before one can apply such an approach to translational medicine.
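The contrast between subsampling reads and directly scaling a count matrix can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the helper names `subsample_counts` and `scale_counts`, the toy gene counts, and the choice of sampling reads without replacement are all assumptions.

```python
import random
from collections import Counter


def subsample_counts(counts, fraction, seed=0):
    """Draw a random subset of individual reads (without replacement) and recount per gene."""
    rng = random.Random(seed)
    # Expand the count table into one label per read.
    reads = [gene for gene, n in counts.items() for _ in range(n)]
    keep = round(len(reads) * fraction)
    sampled = Counter(rng.sample(reads, keep))
    return {gene: sampled.get(gene, 0) for gene in counts}


def scale_counts(counts, fraction):
    """Direct scaling: multiply every count by the same factor and round."""
    return {gene: round(n * fraction) for gene, n in counts.items()}


# Toy counts (hypothetical): one highly and one lowly expressed gene.
toy = {"GENE_A": 100, "GENE_B": 3}
print(subsample_counts(toy, 0.1))  # total is always 10; the split varies with the seed
print(scale_counts(toy, 0.1))      # deterministic, carries no sampling noise
```

Subsampling keeps the sampling noise that a shallower sequencing run would have, so a lowly expressed gene is sometimes observed and sometimes not. Direct scaling is deterministic and always maps the same small count to the same rounded value.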
Funding: Supported by the Academy of Finland (267581) and the D2I SHOK Project from Digile Oy, as well as Nokia Technologies (Tampere, Finland).
Abstract: Facial expression recognition is a hot topic in computer vision, but it remains challenging due to the feature inconsistency caused by person-specific characteristics of facial expressions. To address this challenge, and inspired by the recent success of the deep identity network (DeepID-Net) for face identification, this paper proposes a novel deep learning based framework for recognising human expressions from facial images. Compared to existing deep learning methods, our proposed framework, which is based on multi-scale global images and local facial patches, achieves significantly better performance on facial expression recognition. Finally, we verify the effectiveness of our proposed framework through experiments on the public benchmarking datasets JAFFE and extended Cohn-Kanade (CK+).
Funding: Supported by the Project for the Biological Information and Information Processing Properties of Biological Systems from the Academy of Finland (No. 122973); the Project for the Structure-dynamics Relationships in Biological Network from the Academy of Finland (No. 132877); and the Finnish Funding Agency for Technology and Innovation Finland Distinguished Professor program (No. 1480/31/09).
Abstract: Identification of genetic signatures is the main objective of many computational oncology studies. The signature usually consists of numerous genes that are differentially expressed between two clinically distinct groups of samples, such as tumor subtypes. Prospectively, many signatures have been found to generalize poorly to other datasets and, thus, have rarely been accepted into clinical use. Recognizing the limited success of traditionally generated signatures, we developed a systems biology-based framework for robust identification of key transcription factors and their genomic regulatory neighborhoods. Application of the framework to study the differences between gastrointestinal stromal tumor (GIST) and leiomyosarcoma (LMS) resulted in the identification of nine transcription factors (SRF, NKX2-5, CCDC6, LEF1, VDR, ZNF250, TRIM63, MAF, and MYC). Functional annotations of the obtained neighborhoods identified the biological processes that the key transcription factors regulate differently between the tumor types. Analyzing the differences in the expression patterns using our approach resulted in a more robust genetic signature and more biological insight into the diseases compared to a traditional genetic signature.
Abstract: In this article, we propose to facilitate local peer-to-peer communication by a Device-to-Device (D2D) radio that operates as an underlay network to an IMT-Advanced cellular network. It is expected that local services may utilize mobile peer-to-peer communication instead of central server based communication for rich multimedia services. The main challenge of the underlay radio in a multi-cell environment is to limit the interference to the cellular network while achieving a reasonable link budget for the D2D radio. We propose a novel power control mechanism for D2D connections that share cellular uplink resources. The mechanism limits the maximum D2D transmit power utilizing cellular power control information of the devices in D2D communication. Thereby it enables underlaying D2D communication even in interference-limited networks with full load and without degrading the performance of the cellular network. We also study a single-cell scenario consisting of a device communicating with the base station and two devices that communicate with each other. The results demonstrate that the D2D radio, sharing the same resources as the cellular network, can provide higher capacity (sum rate) compared to pure cellular communication where all the data is transmitted through the base station.
Abstract: Transcription, post-transcriptional modification, translation, post-translational modification, DNA replication, and signaling interaction of intra- and extracellular components are the relevant mechanisms in gene regulation. Transcription is one of the most important mechanisms in the control of gene expression. Further, post-transcriptional modifications play a crucial role after transcription and determine whether the transcribed gene is coding or non-coding RNA (ncRNA). Genome-wide analysis of RNAs provides information about the coding RNAs, whereas the status of ncRNAs is still at large and must be discussed in detail, as variations in the ncRNAs can lead to different phenotypes. In this short article, we discuss the role of genetic variation in ncRNA genes and how this variation may play a crucial role in ncRNA biogenesis that eventually leads to phenotypic variation and thus speciation.
Funding: The National Science and Engineering Research Council of Canada (NSERC); the Canadian Institutes of Health Research (CIHR); the Academy of Finland (Application Number 129657, Finnish Programme for Centres of Excellence in Research 2006-2011, and 124615); and the Tampere Graduate School in Information Science and Engineering (TISE).
Abstract: We present an algorithm for the stochastic simulation of gene expression and heterogeneous population dynamics. The algorithm combines an exact method to simulate molecular-level fluctuations in single cells and a constant-number Monte Carlo method to simulate time-dependent statistical characteristics of growing cell populations. To benchmark performance, we compare simulation results with steady-state and time-dependent analytical solutions for several scenarios, including steady-state and time-dependent gene expression, and the effects on population heterogeneity of cell growth, division, and DNA replication. This comparison demonstrates that the algorithm provides an efficient and accurate approach to simulate how complex biological features influence gene expression. We also use the algorithm to model gene expression dynamics within "bet-hedging" cell populations during their adaptation to environmental stress. These simulations indicate that the algorithm provides a framework suitable for simulating and analyzing realistic models of heterogeneous population dynamics combining molecular-level stochastic reaction kinetics, relevant physiological details, and phenotypic variability.
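To make the single-cell part of this abstract concrete, the sketch below uses Gillespie's stochastic simulation algorithm, a standard exact method, on a one-species birth-death model of mRNA. The abstract does not say which exact method or which reaction network the authors used, so the algorithm choice, the function name `gillespie_birth_death`, and the rate values are assumptions. The constant-number Monte Carlo population step is not shown.

```python
import random


def gillespie_birth_death(k_prod, k_deg, t_end, m0=0, seed=0):
    """Simulate mRNA copy number with constant production and first-order degradation."""
    rng = random.Random(seed)
    t, m = 0.0, m0
    times, counts = [t], [m]
    while True:
        a_prod = k_prod          # propensity of producing one molecule
        a_deg = k_deg * m        # propensity of degrading one molecule
        a_total = a_prod + a_deg
        if a_total == 0:
            break                # no reaction can fire any more
        t += rng.expovariate(a_total)  # exponentially distributed waiting time
        if t > t_end:
            break
        # Choose the reaction in proportion to its propensity.
        if rng.random() * a_total < a_prod:
            m += 1
        else:
            m -= 1
        times.append(t)
        counts.append(m)
    return times, counts


# Hypothetical rates: production 10 per unit time, degradation 1 per molecule per unit time.
times, counts = gillespie_birth_death(k_prod=10.0, k_deg=1.0, t_end=50.0)
print(len(times), counts[-1])
```

For this model the stationary copy-number distribution is Poisson with mean `k_prod / k_deg`, which gives the kind of analytical steady-state solution against which a simulator like this can be benchmarked.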
Abstract: Accurate prediction of electricity price (EP) is crucial for energy utilities and grid operators for enhancing the energy trading, grid stability studies, resource allocations and pricing strategies, thereby improving the overall grid reliability, efficiency, and cost-effectiveness. This study introduces a novel D3Net model for half-hourly EP prediction, integrating Seasonal-Trend decomposition using LOESS (STL) and Variational Mode Decomposition (VMD) with Multi-Layer Perceptron (MLP), Random Forest Regression (RFR), and Tabular Neural Network (TabNet). The methodology involves applying STL to the EP time-series to extract trend, seasonal, and residual components. The trend is predicted using an MLP model, the seasonal component is further decomposed with VMD into 20 Variational Mode Functions (VMFs) and predicted using an RFR model, and the residual component is decomposed with VMD and predicted using the TabNet model. Input features are identified using the Partial Autocorrelation Function, and models are optimized using the Optuna algorithm. The final prediction combines the trend, seasonal, and residual components' predictions. Explainable Artificial Intelligence (xAI) methods were used to enhance model interpretability and trustworthiness, with optimization via the Optuna algorithm. Comparative analysis with seven standalone and seven decomposition-based models confirmed the superior performance and statistical significance of the D3Net model. The D3Net achieved the highest global performance indicator for South Australia (GPI ≈ 11.068) and Tasmania (GPI ≈ 12.206). These results validate the efficacy and statistical significance of the D3Net model, demonstrating the viability of integrating STL and VMD decomposition approaches with MLP, RFR, and TabNet for EP prediction.
Funding: Supported by the “Agencia Estatal de Investigación (España)”, Spanish Ministry of Science, Innovation and Universities (grant refs.: PID2023-150663NB-C21 and PID2023-150663NB-C22/AEI/10.13039/501100011033); by the European Commission, projects “CLImate INTelligence: Extreme events detection, attribution and adaptation design using machine learning, CLINT” (grant ref.: H2020-LC-CLA-2020-2, 101003876) and “Test and Experiment Facilities for the Agri-Food Domain, AgriFoodTEF” (grant ref.: DIGITAL-2022-CLOUD-AI-02, 101100622); by the ENIA International Chair in Agriculture, University of Córdoba (grant ref.: TSI-100921-2023-3); by the European Union through the European Regional Development Fund; “project IA-CONV” funded by the Community of Madrid through the grant agreement for the promotion and advancement of research and technology transfer at the University of Alcalá (grant ref.: CM/DEMG/2024-039); and by the University of Córdoba through competitive grants for Andalusian society challenges (grant ref.: PP2F_L1_15). Antonio Manuel Gómez-Orellana has been supported by “Consejería de Transformación Económica, Industria, Conocimiento y Universidades de la Junta de Andalucía” (grant ref.: PREDOC-00489). David Guijo-Rubio has been supported by the “Agencia Estatal de Investigación (España)” MCIU/AEI/10.13039/501100011033 and European Union NextGenerationEU/PRTR (grant ref.: JDC2022-048378-I).
Abstract: This paper presents and evaluates two novel ordinal classification methods for wind speed prediction, considering three prediction time-horizons: 1h, 4h, and 8h. To address the problem, wind speed values are discretised into four classes, critical for wind farm management. Each class represents essential information for wind farm production, ranging from very low wind speeds to extreme wind speed events and the corresponding production conditions, facilitating operational decisions for wind farm operators. Ordinal classifiers are more suitable than nominal methods to tackle this problem. The study's primary objective is to compare recently proposed ordinal classifiers for addressing the challenges of wind speed prediction with a focus on extreme wind conditions, which are responsible for many turbine shutdowns. Hourly wind speed measurements from a Spanish wind farm and predictor variables from the European Centre for Medium-Range Weather Forecasts Reanalysis v5 (ERA5 Reanalysis) model are used. The proposed methods include an Artificial Neural Network (ANN) model implementing the Cumulative Link Model as an ordinal output function (MLP-CLM^(O)), which emphasises overall performance, and an ANN model optimised using a soft labelling technique based on triangular distributions (MLP-T^(O)), which excels at handling extreme class performance. The results demonstrate the superiority of both approaches over other nominal and ordinal methods across performance metrics that account for the unbalanced nature and ordinality of the data. MLP-CLM^(O) excels in overall and ordinal performance, while MLP-T^(O) demonstrates superior handling of the extreme class predictions.
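The idea of soft labelling for ordinal classes can be sketched in a few lines. The version below is a simplified discrete stand-in, not the paper's formulation: it moves a fixed share of probability mass from the true class to each adjacent class, giving a triangle-shaped target. The function name `triangular_soft_label` and the `neighbour_mass` value are assumptions.

```python
def triangular_soft_label(true_class, n_classes, neighbour_mass=0.1):
    """Soft target peaked at the true class, with a small mass on each adjacent class."""
    label = [0.0] * n_classes
    # Only the classes directly next to the true one receive mass.
    neighbours = [j for j in (true_class - 1, true_class + 1) if 0 <= j < n_classes]
    for j in neighbours:
        label[j] = neighbour_mass
    label[true_class] = 1.0 - neighbour_mass * len(neighbours)
    return label


# Four ordered wind speed classes, as in the abstract; class indices are 0..3.
print(triangular_soft_label(0, 4))  # an extreme class has a single neighbour
print(triangular_soft_label(2, 4))  # an interior class has two neighbours
```

Compared with one-hot targets, a label of this shape penalises a prediction in an adjacent class less than one several classes away, which is the ordinal information a nominal classifier ignores.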
Funding: This work was partially supported through project PID2020-115454GB-C21 of the Spanish Ministry of Science and Innovation (MICINN).
Abstract: This paper develops a trustworthy deep learning model that considers electricity demand (G) and local climate conditions. The model utilises a Multi-Head Self-Attention Transformer (TNET) to capture critical information from 𝐻, to attain reliable predictions with local climate (rainfall, radiation, humidity, evaporation, and maximum and minimum temperatures) data from Energex substations in Queensland, Australia. The TNET model is then evaluated against deep learning models (Long Short-Term Memory (LSTM), Bidirectional LSTM (BILSTM), Gated Recurrent Unit (GRU), Convolutional Neural Network (CNN), and Deep Neural Network (DNN)) based on robust model assessment metrics. The Kernel Density Estimation method is used to generate the prediction interval (PI) of electricity demand forecasts and to derive probability metrics, and the results show that the developed TNET model is accurate for all the substations. The study concludes that the proposed TNET model is a reliable electricity demand predictive tool that has high accuracy and low predictive errors and could be employed as a stratagem by demand modellers and energy policy-makers who wish to incorporate climatic factors into electricity demand patterns and develop national energy market insights and analysis systems.