As more information is gathered on the mechanisms of transcription and translation, it is becoming apparent that these processes are highly regulated. The formation of mRNA secondary and tertiary structures is one such regulatory process that, until recently, had not been analysed in depth. Formation of these mRNA structures has the potential to enhance or inhibit alternative splicing of transcripts, and to regulate the rate and amount of translation. As this regulatory mechanism potentially acts at both the transcriptional and translational levels, while also potentially utilising the vast array of non-coding RNAs, it warrants further investigation. Currently, a variety of high-throughput sequencing techniques, including parallel analysis of RNA structure (PARS), fragmentation sequencing (FragSeq) and selective 2'-hydroxyl acylation analysed by primer extension (SHAPE), lead the way in the genome-wide identification and analysis of mRNA structure formation. These new sequencing techniques highlight the diversity and complexity of the transcriptome, and demonstrate another regulatory mechanism that could become a target for new therapeutic approaches.
Many recent exciting discoveries have revealed the versatility of RNAs and their importance in a variety of cellular functions, which are strongly coupled to RNA structures. To understand the functions of RNAs, a number of structure prediction models have been developed in recent years. In this review, the progress in computational models for RNA structure prediction is introduced and the distinguishing features of many outstanding algorithms are discussed, with emphasis on three-dimensional (3D) structure prediction. A promising coarse-grained model for predicting RNA 3D structure, stability and salt effects is also introduced briefly. Finally, we discuss the major challenges in RNA 3D structure modeling.
RNAs have important biological functions, and these functions are generally coupled to their structures, especially their secondary structures. In this work, we have made a comprehensive evaluation of the performance of the top existing RNA secondary structure prediction methods, including five deep-learning (DL) based methods and five minimum free energy (MFE) based methods. First, we give a brief overview of these RNA secondary structure prediction methods. Afterwards, we build two rigorous test datasets consisting of RNAs with non-redundant sequences and comprehensively examine the performance of the prediction methods by classifying the RNAs into different length ranges and different types. Our examination shows that the DL-based methods generally perform better than the MFE-based methods for long RNAs with complex structures, while the MFE-based methods can achieve good performance for small RNAs, and some specialized MFE-based methods can achieve good prediction accuracy for pseudoknots. Finally, we provide some insights and perspectives on modeling RNA secondary structures.
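The abstract above does not say which accuracy measures were used. As a minimal sketch of how a predicted secondary structure is commonly compared with a reference, the Python below computes base-pair sensitivity, positive predictive value (PPV) and F1 from two dot-bracket strings. The function names are hypothetical, and plain parentheses cannot encode pseudoknots, so this illustrates the general idea rather than the authors' evaluation pipeline.

```python
def pairs_from_dot_bracket(structure):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            pairs.add((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return pairs


def score_prediction(reference, predicted):
    """Compare two equal-length dot-bracket strings.

    Returns (sensitivity, ppv, f1), all computed over base pairs.
    """
    if len(reference) != len(predicted):
        raise ValueError("structures must have the same length")
    ref = pairs_from_dot_bracket(reference)
    pred = pairs_from_dot_bracket(predicted)
    tp = len(ref & pred)  # pairs present in both structures
    sensitivity = tp / len(ref) if ref else 0.0
    ppv = tp / len(pred) if pred else 0.0
    f1 = 2 * sensitivity * ppv / (sensitivity + ppv) if tp else 0.0
    return sensitivity, ppv, f1


if __name__ == "__main__":
    # Reference hairpin with 4 pairs; the prediction recovers 3 of them.
    print(score_prediction("((((....))))", "(((......)))"))
```

Grouping such scores by RNA length range or RNA type, as the abstract describes, then only requires averaging them within each group.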
To enable diverse functions and precise regulation, an RNA sequence often folds into complex yet distinct structures in different cellular states. Probing RNA in its native environment is essential to uncovering RNA structures in their biological contexts. However, current methods generally require large amounts of input RNA and are challenging to apply in physiologically relevant settings. Here, we report smartSHAPE, a new RNA structure probing method that requires very low amounts of input RNA, owing to greatly reduced artefacts in the probing signal and increased efficiency of library construction. Using smartSHAPE, we profiled the RNA structure landscape of mouse intestinal macrophages upon inflammation, and provide evidence that RNA conformational changes regulate immune responses. These results demonstrate that smartSHAPE can greatly expand the scope of RNA structure-based investigations in practical biological systems, and also provide a research paradigm for the study of post-transcriptional regulation.
RNA molecules serve a wide range of functions that are closely linked to their structures. The basic structural units of RNA consist of single- and double-stranded regions. In order to carry out advanced functions such as catalysis and ligand binding, certain types of RNAs can adopt higher-order structures. The analysis of RNA structures has progressed alongside advancements in structural biology techniques, but it comes with its own set of challenges and corresponding solutions. In this review, we discuss recent advances in RNA structure analysis techniques, including structural probing methods, X-ray crystallography, nuclear magnetic resonance, cryo-electron microscopy, and small-angle X-ray scattering. Often, a combination of multiple techniques is employed for the integrated analysis of RNA structures. We also survey important RNA structures that have been recently determined using various techniques.
RNA folds into intricate structures that are crucial for its function and regulation. To date, a multitude of approaches for probing the structures of the whole transcriptome, i.e., RNA structuromes, have been developed. Applications of these approaches to different cell lines and tissues have generated a rich resource for the study of RNA structure-function relationships at a systems biology level. In this review, we first introduce the designs of these methods and their applications to the study of different RNA structuromes. We emphasize their technological differences, especially their unique advantages and caveats. We then summarize the structural insights into RNA function and regulation obtained from studies of RNA structuromes. Finally, we propose potential directions for future improvements and studies.
RNAs play crucial and versatile roles in biological processes. Computational prediction approaches can help to understand RNA structures and their stabilizing factors, thus providing information on their functions and facilitating the design of new RNAs. Machine learning (ML) techniques have made tremendous progress in many fields in the past few years. Although their use in protein-related fields has a long history, the use of ML methods in predicting RNA tertiary structures is new and rare. Here, we review recent advances in using ML methods for RNA structure prediction and discuss the advantages and limitations, as well as the difficulties and potential, of these approaches when applied in the field.
[Objective] To examine a grammar model based on lexical substring extraction for RNA secondary structure prediction. [Method] By introducing the cloud model into the stochastic grammar model, a machine learning algorithm suitable for the lexicalized stochastic grammar model was proposed. The word grid mode was used to extract and divide the RNA sequence to acquire lexical substrings, and the cloud classifier was used to search for the maximum probability of each lemma being marked as a certain secondary structure type. Then, the lemma information was introduced into the stochastic grammar training process as prior information to predict the secondary structure of RNA, and the method was tested by experiment. [Result] The experimental results showed that the prediction accuracy and search speed of the stochastic grammar cloud model were significantly improved over prediction with a simple stochastic grammar. [Conclusion] This study lays the foundation for the wide application of the stochastic grammar model to RNA secondary structure prediction.
RNAs play crucial and versatile roles in cellular biochemical reactions. Since experimental approaches for determining their three-dimensional (3D) structures are costly and less efficient, it is greatly advantageous to develop computational methods to predict RNA 3D structures. For these methods, designing a model or scoring function for structure quality assessment is an essential but challenging step. In this study, we designed and trained a deep learning model to tackle this problem. The model is based on a graph convolutional network (GCN) and is named RNAGCN. The model provides a natural way of representing RNA structures, avoids complex algorithms to preserve atomic rotational equivalence, and is capable of automatically extracting features from structural patterns. Test results on two datasets demonstrate that RNAGCN performs similarly to or better than four leading scoring functions. Our approach provides an alternative way of assessing RNA tertiary structures and may facilitate RNA structure prediction. RNAGCN can be downloaded from https://gitee.com/dcw-RNAGCN/rnagcn.
Knowledge of RNA three-dimensional (3D) structures is critical to understanding the important biological functions of RNAs, and various models have been developed to predict RNA 3D structures in silico. However, a reliable and efficient statistical potential for RNA 3D structure evaluation is still lacking. For this purpose, we developed a statistical potential, cgRNASP-CN, based on a minimal coarse-grained representation and residue separation, in which every nucleotide is represented by the C4′ atom for the backbone and the N1 (or N9) atom for the base. In analogy to the newly developed all-atom rsRNASP, cgRNASP-CN is composed of short-ranged and long-ranged potentials, with the short-ranged one treated more subtly. Our examination indicates that, for two realistic test datasets including the RNA-Puzzles dataset, the performance of cgRNASP-CN is close to that of the all-atom rsRNASP and superior to that of other top all-atom traditional statistical potentials and of scoring functions trained with neural networks. Importantly, cgRNASP-CN is about 100 times more efficient than existing all-atom statistical potentials and scoring functions, including rsRNASP. cgRNASP-CN is available at https://github.com/Tan-group/cgRNASP-CN.
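The abstract does not give the functional form of the potential. Distance-dependent statistical potentials are usually built on the inverse-Boltzmann relation, so as a hedged illustration only (the symbols and the choice of reference state are assumptions, not taken from the abstract), the energy assigned to a pair of coarse-grained atom types a and b at distance r can be written as:

```latex
% Generic inverse-Boltzmann form; an illustrative assumption, not quoted from the paper.
% P^{obs}_{ab}(r): probability of observing atom types a, b at distance r in known structures.
% P^{ref}_{ab}(r): the same probability in a chosen reference state.
\Delta E_{ab}(r) = -k_{B} T \, \ln \frac{P^{\mathrm{obs}}_{ab}(r)}{P^{\mathrm{ref}}_{ab}(r)},
\qquad
E_{\mathrm{total}} = \sum_{m<n} \Delta E_{a_m b_n}(r_{mn})
```

Here the sum runs over all pairs of coarse-grained atoms m and n in the structure being scored. In a residue-separation-based scheme such as the one the abstract describes, the pair statistics would be collected separately for short and long sequence separations, giving the short-ranged and long-ranged parts of the potential.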
We have previously reported that the human ACAT1 gene produces a chimeric mRNA through the interchromosomal processing of two discontinuous RNAs transcribed from chromosomes 1 and 7. The chimeric mRNA uses AUG1397-1399 and GGC1274-1276 as translation initiation codons to produce the normal 50-kDa ACAT1 and a novel, enzymatically active 56-kDa isoform, respectively, with the latter being authentically present in human cells, including human monocyte-derived macrophages. In this work, we report that RNA secondary structures located in the vicinity of the GGC1274-1276 codon are required for production of the 56-kDa isoform. The effects of the three predicted stem-loops (nt 1255-1268, 1286-1342 and 1355-1384) were tested individually by transfecting cells with expression plasmids that contained the wild-type, deleted or mutant stem-loop sequences linked to a partial ACAT1 AUG open reading frame (ORF) or to the ORFs of other genes. The expression patterns were monitored by western blot analyses. We found that the upstream stem-loop 1255-1268 from chromosome 7 and the downstream stem-loop 1286-1342 from chromosome 1 were needed for production of the 56-kDa isoform, whereas the last stem-loop, 1355-1384, from chromosome 1 was dispensable. The results of experiments using both monocistronic and bicistronic vectors with a stable hairpin showed that translation initiation from the GGC1274-1276 codon was mediated by an internal ribosome entry site (IRES). Further experiments revealed that translation initiation from the GGC1274-1276 codon requires the upstream AU-constituted RNA secondary structure and the downstream GC-rich structure. This mechanistic work provides further support for the biological significance of the chimeric nature of the human ACAT1 transcript.
A novel method for the prediction of RNA secondary structure was proposed based on particle swarm optimization (PSO). PSO is known to be effective in solving many different types of optimization problems and is able to approximate the global optimum in the solution space. We designed an efficient objective function based on the minimum free energy, the number of selected stems and the average length of the selected stems. We enumerated the legal stems in the sequence and used PSO, guided by the objective function, to select a subset of them that gives an optimal result. The method, based on an improved particle swarm optimization (IPSO), consists of three stages. The first stage encodes the source sequences and explores all the legal stems, producing a set of encoded stems as input data for the second stage. In the second stage, IPSO performs structure selection. In the last stage, the optimal result is obtained from the secondary structures selected via IPSO. Nine sequences from the comparative RNA website were selected for the evaluation of the proposed method. Compared with six other methods, the proposed method decreased the complexity and enhanced the sensitivity and specificity in the experimental results.
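The first stage described above, exploring all legal stems, can be sketched as a simple enumeration. The abstract does not define "legal", so the Python below makes explicit assumptions: Watson-Crick and G-U wobble pairs are allowed, a stem needs at least three stacked pairs, and a hairpin loop needs at least three unpaired bases. The function name and both thresholds are hypothetical. The IPSO selection stage and the objective function are not shown, because the abstract gives neither their weights nor their encoding.

```python
# Canonical Watson-Crick pairs plus the G-U wobble pair (an assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_stems(seq, min_stem=3, min_loop=3):
    """Enumerate maximal stems as (i, j, length) tuples.

    A stem pairs seq[i + k] with seq[j - k] for k in range(length).
    min_stem and min_loop are illustrative thresholds, not values
    taken from the paper.
    """
    seq = seq.upper()
    n = len(seq)
    stems = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            if (seq[i], seq[j]) not in PAIRS:
                continue
            # Skip pairs that only continue a stem starting one step outward.
            if i > 0 and j < n - 1 and (seq[i - 1], seq[j + 1]) in PAIRS:
                continue
            length = 0
            # Extend inward while the bases pair and the loop stays long enough.
            while (
                (j - length) - (i + length) > min_loop
                and (seq[i + length], seq[j - length]) in PAIRS
            ):
                length += 1
            if length >= min_stem:
                stems.append((i, j, length))
    return stems


if __name__ == "__main__":
    print(find_stems("GGGAAAACCC"))
```

An optimizer such as PSO would then search over subsets of these candidate stems, rejecting combinations in which stems overlap or conflict.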
The secondary structures of RNAs are the basis for understanding their tertiary structures and functions, so their prediction is widely needed given the increasing discovery of noncoding RNAs. In recent decades, many methods have been proposed to predict RNA secondary structures, but their accuracy has reached a bottleneck. Here we present a method for RNA secondary structure prediction using direct coupling analysis and a remove-and-expand algorithm that shows better performance than four existing popular multiple-sequence methods. We further show that the results can also be used to improve the prediction accuracy of single-sequence methods.
A simple stepwise folding process has been developed to simulate RNA secondary structure formation. Modifications to the energy parameters of various loops were included in the program. Five possible types of pseudoknots, including the well-known H-type pseudoknot, were permitted to occur if reasonable. We have applied this approach to a number of RNA sequences. The prediction accuracies we obtained were higher than those in published papers.
RNA tertiary structure is essential to understanding RNA function and biological processes. Unfortunately, it is still challenging to determine the structure of large RNAs by direct experimentation or computational modeling. One promising approach is first to predict the tertiary contacts and then to use the contacts as constraints to model the structure. The quality of RNA structure modeling therefore depends on the accuracy of contact prediction. Although many contact prediction methods have been developed in the protein field, there are only a few contact prediction methods in the RNA field at present. Here, we first review the theoretical basis and test the performance of recent RNA contact prediction methods for tertiary structure and complex modeling problems. Then, we summarize the advantages and limitations of these RNA contact prediction methods. Finally, we suggest some future directions for this rapidly expanding field.
RNAs carry out diverse biological functions, partly because different conformations of the same RNA sequence can play different roles in cellular activities. Fully understanding the biological functions of RNAs requires a conceptual framework for investigating the folding kinetics of RNA molecules, rather than native structures alone. Over the past several decades, many experimental and theoretical methods have been developed to address RNA folding. The helix-based RNA folding theory uses helices as building blocks to calculate the folding kinetics of secondary structures with pseudoknots of long RNAs in two different folding scenarios. Here, we briefly review the helix-based RNA folding theory and its application in exploring the regulation mechanisms of several riboswitches and the self-cleavage activities of the hepatitis delta virus (HDV) ribozyme.
The attenuated vaccine strains of CSFV have a 12-nucleotide (nt) insertion in the 3'-UTR of the genome compared with CSFV virulent strains. In this study, we found a distinct heterogeneity in the 3'-UTR of the attenuated Thiverval and HCLV strains. The longest 3'-UTR of the Thiverval strain was 259 base pairs (bp) with a 32-nt insertion, and the shortest had only 233 bp with a 6-nt insertion. The longest 3'-UTR of the HCLV strain was 244 bp with a 17-nt insertion, and the shortest was 235 bp with an 8-nt insertion. Compared with the published 3'-UTR sequences of vaccine and virulent strains, the 3'-UTRs of CSFV vaccine strains have two variable regions where insertions were frequently found among the different vaccine strains. The first is located between the second conserved TALk codon and the start of the T-rich region, where we found insertions of variable length within the same vaccine strain (Thiverval or HCLV). The second is located between the end of the T-rich region and the front of the GAA codon; in the virulent Shimen strain, however, a 4-nt deletion was found in this region. These two regions may represent mutation "hot spots". Modeling the secondary structures of the 3'-UTR suggests that the T-rich insertion could change the structure and free energy, thus affecting the stability of the 3'-UTR structure. These findings will help to understand the mechanism of attenuated vaccines and improve vaccine safety, stability, and efficacy.
The 5′-cap structures of eukaryotic mRNAs are important for RNA stability, pre-mRNA splicing, mRNA export, and protein translation. Many viruses have evolved mechanisms for generating their own cap structures with methylation at the N7 position of the capped guanine and the ribose 2′-O position of the first nucleotide, which help viral RNAs escape recognition by the host innate immune system. The RNA genomes of coronaviruses were identified as having 5′-caps in the early 1980s. However, for decades the RNA capping mechanisms of coronaviruses remained unknown. Since 2003, the outbreak of severe acute respiratory syndrome coronavirus has drawn increased attention and stimulated numerous studies on the molecular virology of coronaviruses. Here, we review the current understanding of the mechanisms adopted by coronaviruses to produce the 5′-cap structure and methylation modification of viral genomic RNAs.
To investigate how synonymous codons have been adapted to the formation of ribonucleic acid (RNA) G-quadruplex (rG4) structures, the computational search algorithm G4Hunter was applied to detect rG4 structures in the protein-coding sequences of mRNAs in five eukaryotic species. The native sequences forming rG4s were then compared with randomized sequences to evaluate selection on synonymous codons. Factors that may influence the formation of rG4s were also investigated, and the selection pressures on rG4s in different gene regions were compared to explore their potential roles in gene regulation. The results show that a universal selective pressure acts on synonymous codons in rG4 regions to facilitate rG4 formation in the five eukaryotic organisms. While G-rich codon combinations are preferred in rG4 structural regions, C-rich codon combinations are selectively unfavorable for rG4 formation. A gene's codon usage bias, nucleotide composition, and evolutionary rate can account for the variation in selection on synonymous codons among rG4 structures within a species. Moreover, rG4 structures in the translational initiation region showed significantly higher selective pressures than those in the translational elongation region.
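For readers unfamiliar with G4Hunter, the sketch below illustrates its scoring idea as generally described for the published algorithm: each G in a run of G's scores the run length (capped at 4), each C scores the negative of its run length, other bases score 0, and the scores are averaged over a sliding window. The window size, the threshold and the function names are illustrative assumptions. The abstract does not report the parameters used, and this simplified version keeps only G-rich (positive-scoring) windows because the query is a single-stranded mRNA.

```python
def g4hunter_base_scores(seq):
    """Score each base: G runs positive, C runs negative, magnitude capped at 4."""
    seq = seq.upper()
    scores = [0] * len(seq)
    i = 0
    while i < len(seq):
        base = seq[i]
        j = i
        while j < len(seq) and seq[j] == base:
            j += 1  # j ends one past the current run of identical bases
        run = min(j - i, 4)
        if base == "G":
            scores[i:j] = [run] * (j - i)
        elif base == "C":
            scores[i:j] = [-run] * (j - i)
        i = j
    return scores


def g_rich_windows(seq, window=25, threshold=1.2):
    """Return (start, mean_score) for windows whose mean score >= threshold.

    window and threshold are illustrative values, not taken from the abstract.
    """
    scores = g4hunter_base_scores(seq)
    hits = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        if mean >= threshold:
            hits.append((start, mean))
    return hits


if __name__ == "__main__":
    print(g4hunter_base_scores("GGGACC"))
    print(g_rich_windows("GGGAGGGAGGGAGGG", window=15))
```

Comparing the scores of native coding sequences with those of synonymously randomized sequences, as the abstract describes, would then indicate whether codon choice favors or disfavors rG4 formation.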
RNA-protein interactions are crucial for regulating various cellular processes such as gene expression, RNA modification and translation. In contrast, undesirable RNA-protein interactions often cause dysregulated cellular activities associated with many human diseases. RNA containing expanded GGGGCC repeats forms secondary structures that sequester various RNA-binding proteins (RBPs), leading to the development of amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD). However, a gap persists in understanding the structural basis for the binding of GGGGCC repeat RNA to RBPs. Here, we resolve the first solution NMR structure of a natural GGGGCC repeat RNA containing a 2×2 GG/GG internal loop, and perform MD simulations and site-directed mutagenesis to elucidate the mechanism by which GGGGCC repeat RNA binds to SRSF2, a splicing factor and key marker of nuclear speckles. We reveal that the R47/T51/R61 residues in the RNA recognition motif of SRSF2 and the 2×2 GG/GG internal loop in the GGGGCC repeat RNA are essential for binding. This work provides a valuable high-resolution structural basis for understanding the binding mechanism between GGGGCC repeat RNA and RBPs, and guides RNA structure-based drug design.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11074191, 11175132, and 11374234), the National Basic Research Program of China (Grant No. 2011CB933600), and the Program for New Century Excellent Talents of China (Grant No. NCET 08-0408).
Funding: Supported by grants from the National Science Foundation of China (Grant Nos. 12375038 and 12075171 to ZJT, and 12205223 to YLT).
Funding: The National Key R&D Program of China (2019YFA0110002 and 2018YFA0107603 to Q.C.Z., and 2020YFA0509100 to X.H.); the National Natural Science Foundation of China (Grant Nos. 32125007, 91940306, 91740204, and 31761163007 to Q.C.Z., and 31725010, 31821003, 31991174, 32030037, and 82150105 to X.H.); and the Research Grants Council of the Hong Kong SAR, China (Project No. N_CityU110/17 to C.K.K.).
Funding: National Key R&D Program of China (2021YFA1301500, 2017YFA0504600, 2022YFC2303700, 2022YFA1302700, 2022YFF1203100); National Natural Science Foundation of China (U1832215, 32171191, 91940302, 32230018 and 32125007); Strategic Priority Research Program of the Chinese Academy of Sciences (XDB37010201, XDB0490000); Center for Advanced Interdisciplinary Science and Biomedicine of IHM (QYPY20220019); the Fundamental Research Funds for the Central Universities (WK9100000032 and WK9100000044); Guangdong Science and Technology Department (2022A1515010328, 2020B1212060018 and 2020B1212030004); the Postdoctoral Foundation of Tsinghua-Peking Center for Life Sciences (to J.Z.); the Beijing Advanced Innovation Center for Structural Biology (to Q.C.Z.); and the Tsinghua-Peking Joint Center for Life Sciences (to Q.C.Z.).
Funding: Supported by the National Natural Science Foundation of China (Grant No. 31671355) and the National Thousand Young Talents Program of China to QCZ.
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 11774158, 11974173, 11774157, and 11934008).
Abstract: RNAs play crucial and versatile roles in biological processes. Computational prediction approaches can help to understand RNA structures and their stabilizing factors, thus providing information on their functions and facilitating the design of new RNAs. Machine learning (ML) techniques have made tremendous progress in many fields in the past few years. Although their usage in protein-related fields has a long history, the use of ML methods in predicting RNA tertiary structures is new and rare. Here, we review recent advances in using ML methods for RNA structure prediction and discuss the advantages and limitations, as well as the difficulties and potential, of these approaches when applied in the field.
Funding: Supported by the Science Foundation of Hengyang Normal University of China (09A36).
Abstract: [Objective] To examine a grammar model based on lexical substring extraction for RNA secondary structure prediction. [Method] By introducing the cloud model into the stochastic grammar model, a machine learning algorithm suitable for the lexicalized stochastic grammar model was proposed. The word grid mode was used to extract and divide the RNA sequence to acquire lexical substrings, and the cloud classifier was used to search for the maximum probability of each lemma being marked as a certain secondary structure type. Then, the lemma information was introduced into the stochastic grammar training process as prior information to predict the secondary structure of RNA, and the method was tested experimentally. [Result] The experimental results showed that the prediction accuracy and searching speed of the stochastic grammar cloud model were significantly improved over prediction with the simple stochastic grammar. [Conclusion] This study laid the foundation for the wide application of the stochastic grammar model to RNA secondary structure prediction.
Funding: Funded by the National Natural Science Foundation of China (Grant Nos. 11774158 to JZ, 11934008 to WW, and 11974173 to WFL).
Abstract: RNAs play crucial and versatile roles in cellular biochemical reactions. Since experimental approaches for determining their three-dimensional (3D) structures are costly and inefficient, it is greatly advantageous to develop computational methods to predict RNA 3D structures. For these methods, designing a model or scoring function for structure quality assessment is an essential but challenging step. In this study, we designed and trained a deep learning model to tackle this problem. The model is based on a graph convolutional network (GCN) and is named RNAGCN. The model provides a natural way of representing RNA structures, avoids complex algorithms to preserve atomic rotational equivalence, and is capable of extracting features automatically from structural patterns. Testing results on two datasets demonstrated that RNAGCN performs similarly to or better than four leading scoring functions. Our approach provides an alternative way of assessing RNA tertiary structures and may facilitate RNA structure prediction. RNAGCN can be downloaded from https://gitee.com/dcw-RNAGCN/rnagcn.
Funding: Supported by grants from the National Science Foundation of China (12075171, 11774272).
Abstract: Knowledge of RNA three-dimensional (3D) structures is critical to understanding the important biological functions of RNAs, and various models have been developed to predict RNA 3D structures in silico. However, there is still a lack of a reliable and efficient statistical potential for RNA 3D structure evaluation. For this purpose, we developed a statistical potential, cgRNASP-CN, based on a minimal coarse-grained representation and residue separation, where every nucleotide is represented by the C4' atom for the backbone and the N1 (or N9) atom for the base. In analogy to the newly developed all-atom rsRNASP, cgRNASP-CN is composed of short-ranged and long-ranged potentials, and the short-ranged one was treated more subtly. The examination indicates that the performance of cgRNASP-CN is close to that of the all-atom rsRNASP and is superior to other top all-atom traditional statistical potentials and scoring functions trained from neural networks, for two realistic test datasets including the RNA-Puzzles dataset. Importantly, cgRNASP-CN is about 100 times more efficient than existing all-atom statistical potentials/scoring functions, including rsRNASP. cgRNASP-CN is available at https://github.com/Tan-group/cgRNASP-CN.
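The two-bead representation described in this abstract (C4' for the backbone, N1 for pyrimidines or N9 for purines) can be illustrated with a short Python sketch. This is not the cgRNASP-CN code: the function names `base_atom` and `coarse_grain` are hypothetical, and the parser assumes standard fixed-column PDB ATOM records with one-letter RNA residue names (A, C, G, U). It only extracts the beads and does not evaluate any potential.

```python
# Illustrative sketch only; names and input assumptions are hypothetical.
PURINES = {"A", "G"}      # base bead is the glycosidic nitrogen N9
PYRIMIDINES = {"C", "U"}  # base bead is the glycosidic nitrogen N1


def base_atom(residue_name):
    """Return the atom name used as the base bead for a nucleotide."""
    if residue_name in PURINES:
        return "N9"
    if residue_name in PYRIMIDINES:
        return "N1"
    raise ValueError(f"unsupported residue: {residue_name!r}")


def coarse_grain(pdb_lines):
    """Keep only the C4' and N1/N9 atoms from PDB ATOM records.

    Returns a list of (chain, residue_number, residue_name, atom_name, xyz).
    """
    beads = []
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        atom_name = line[12:16].strip()
        residue_name = line[17:20].strip()
        if residue_name not in PURINES | PYRIMIDINES:
            continue  # skip modified or non-RNA residues in this sketch
        if atom_name in ("C4'", base_atom(residue_name)):
            chain = line[21]
            residue_number = int(line[22:26])
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            beads.append((chain, residue_number, residue_name, atom_name, xyz))
    return beads
```

Each nucleotide contributes at most two beads, which is the reason a potential built on this representation needs far fewer pairwise distance evaluations than an all-atom one.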
Abstract: We have previously reported that the human ACAT1 gene produces a chimeric mRNA through the interchromosomal processing of two discontinuous RNAs transcribed from chromosomes 1 and 7. The chimeric mRNA uses AUG1397-1399 and GGC1274-1276 as translation initiation codons to produce normal 50-kDa ACAT1 and a novel enzymatically active 56-kDa isoform, respectively, with the latter being authentically present in human cells, including human monocyte-derived macrophages. In this work, we report that RNA secondary structures located in the vicinity of the GGC1274-1276 codon are required for production of the 56-kDa isoform. The effects of the three predicted stem-loops (nt 1255-1268, 1286-1342 and 1355-1384) were tested individually by transfecting cells with expression plasmids that contained the wild-type, deleted or mutant stem-loop sequences linked to a partial ACAT1 AUG open reading frame (ORF) or to the ORFs of other genes. The expression patterns were monitored by western blot analyses. We found that the upstream stem-loop 1255-1268 from chromosome 7 and the downstream stem-loop 1286-1342 from chromosome 1 were needed for production of the 56-kDa isoform, whereas the last stem-loop 1355-1384 from chromosome 1 was dispensable. The results of experiments using both monocistronic and bicistronic vectors with a stable hairpin showed that translation initiation from the GGC1274-1276 codon was mediated by an internal ribosome entry site (IRES). Further experiments revealed that translation initiation from the GGC1274-1276 codon requires the upstream AU-constituted RNA secondary structure and the downstream GC-rich structure. This mechanistic work provides further support for the biological significance of the chimeric nature of the human ACAT1 transcript.
Funding: Supported by the National Natural Science Foundation of China (No. 60971089).
Abstract: A novel method for the prediction of RNA secondary structure was proposed based on particle swarm optimization (PSO). PSO is known to be effective in solving many different types of optimization problems and to be able to approximate the global optimum in the solution space. We designed an efficient objective function according to the minimum free energy, the number of selected stems and the average length of selected stems. We counted the legal stems in the sequence and selected some of them to obtain an optimal result using PSO in light of the objective function. The method, based on improved particle swarm optimization (IPSO), consists of three stages. The first stage encodes the source sequences and explores all the legal stems, creating a set of encoded stems as input data for the second stage. In the second stage, IPSO is responsible for structure selection. Finally, the optimal result is obtained from the secondary structures selected via IPSO. Nine sequences from the comparative RNA website were selected for the evaluation of the proposed method. According to the experimental results, the proposed method decreased the complexity and enhanced the sensitivity and specificity compared with six other methods.
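The first stage described in this abstract, exploring all legal stems in a sequence, can be sketched in Python as follows. The abstract does not define "legal", so the criteria here are assumptions: a stem is a maximal run of at least `min_len` stacked Watson-Crick or G-U wobble pairs enclosing a hairpin loop of at least `min_loop` unpaired nucleotides. The function name `find_stems` is hypothetical, and the PSO selection stage is not reproduced.

```python
# Illustrative sketch only; the stem criteria are assumptions, not the paper's.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_stems(seq, min_len=3, min_loop=3):
    """Enumerate candidate stems as (i, j, length) tuples.

    i and j are the 0-based positions of the outermost pair; the stem
    consists of the pairs (i + k, j - k) for k in range(length).
    """
    seq = seq.upper()
    n = len(seq)
    stems = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            if (seq[i], seq[j]) not in PAIRS:
                continue
            # Report each helix once, starting from its outermost pair.
            if i > 0 and j < n - 1 and (seq[i - 1], seq[j + 1]) in PAIRS:
                continue
            length = 0
            # Extend inward while the loop stays at least min_loop long.
            while (i + length < j - length - min_loop
                   and (seq[i + length], seq[j - length]) in PAIRS):
                length += 1
            if length >= min_len:
                stems.append((i, j, length))
    return stems
```

For example, `find_stems("GGGAAAACCC")` yields a single three-pair stem closing a four-nucleotide loop. An optimizer such as PSO would then choose a mutually compatible subset of these candidates according to its objective function.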
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 31570722).
Abstract: Secondary structures of RNAs are the basis for understanding their tertiary structures and functions, so their prediction is widely needed given the increasing discovery of noncoding RNAs. In the last decades, many methods have been proposed to predict RNA secondary structures, but their accuracy has reached a bottleneck. Here we present a method for RNA secondary structure prediction using direct coupling analysis and a remove-and-expand algorithm that shows better performance than four existing popular multiple-sequence methods. We further show that the results can also be used to improve the prediction accuracy of single-sequence methods.
Abstract: A simple stepwise folding process has been developed to simulate RNA secondary structure formation. Modifications to the energy parameters of various loops were included in the program. Five possible types of pseudoknots, including the well-known H-type pseudoknot, were permitted to occur if reasonable. We have applied this approach to a number of RNA sequences. The prediction accuracies we obtained were higher than those in published papers.
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 11704140) and the Self-determined Research Funds of CCNU from the Colleges' Basic Research and Operation of MOE (Grant No. CCNU20TS004).
Abstract: RNA tertiary structure is essential to understanding RNA function and biological processes. Unfortunately, it is still challenging to determine the structures of large RNAs by direct experimentation or computational modeling. One promising approach is to first predict the tertiary contacts and then use the contacts as constraints to model the structure, so the quality of the RNA structure model depends on the contact prediction accuracy. Although many contact prediction methods have been developed in the protein field, only a few contact prediction methods exist in the RNA field at present. Here, we first review the theoretical basis and test the performance of recent RNA contact prediction methods for tertiary structure and complex modeling problems. Then, we summarize the advantages and limitations of these RNA contact prediction methods. Finally, we suggest some future directions for this rapidly expanding field.
Funding: Project supported by the Science Fund from the Key Laboratory of Hubei Province, China (Grant No. 201932003) and the National Natural Science Foundation of China (Grant Nos. 1157324 and 31600592).
Abstract: RNAs carry out diverse biological functions, partly because different conformations of the same RNA sequence can play different roles in cellular activities. Fully understanding the biological functions of RNAs requires a conceptual framework for investigating the folding kinetics of RNA molecules, instead of native structures alone. Over the past several decades, many experimental and theoretical methods have been developed to address RNA folding. The helix-based RNA folding theory uses helices as building blocks to calculate the folding kinetics of secondary structures with pseudoknots of long RNAs in two different folding scenarios. Here, we briefly review the helix-based RNA folding theory and its application in exploring the regulation mechanisms of several riboswitches and the self-cleavage activity of the hepatitis delta virus (HDV) ribozyme.
Funding: Supported by the National Natural Science Foundation of China (30571377) and the National High-Tech R&D Program of China (863 Program, 2006AA10A204).
Abstract: The attenuated vaccine strains of CSFV have a 12-nucleotide (nt) insertion in the 3'-UTR of the genome compared with CSFV virulent strains. In this study, we found a distinct heterogeneity in the 3'-UTR of the attenuated Thiverval and HCLV strains. The longest 3'-UTR of the Thiverval strain was 259 base pairs (bp) with a 32-nt insertion, and the shortest 3'-UTR had only 233 bp with a 6-nt insertion. The longest 3'-UTR of the HCLV strain was 244 bp with a 17-nt insertion, and the shortest 3'-UTR was 235 bp with an 8-nt insertion. Compared with the published 3'-UTR sequences of vaccine and virulent strains, the 3'-UTRs of CSFV vaccine strains have two variable regions where insertions among the different vaccine strains were frequently found. The first is located between the second conservative TALk codon and the start of the T-rich region, where we found variable-length insertions within the same vaccine strain (Thiverval or HCLV). The second is located between the end of the T-rich region and the front of the GAA codon; however, a 4-nt deletion was found in this region in the virulent Shimen strain. These two regions may represent "hot spots" for mutation. Modeling the secondary structures of the 3'-UTR suggests that the T-rich insertion could change the structure and free energy, thus affecting the stability of the 3'-UTR structure. These findings will help to understand the mechanism of attenuated vaccines and improve vaccine safety, stability, and efficacy.
Funding: Supported by the China "973" Basic Research Program (2013CB911101) and China NSFC grants (81130083 and 81271817).
Abstract: The 5′-cap structures of eukaryotic mRNAs are important for RNA stability, pre-mRNA splicing, mRNA export, and protein translation. Many viruses have evolved mechanisms for generating their own cap structures with methylation at the N7 position of the capped guanine and the ribose 2′-O position of the first nucleotide, which help viral RNAs escape recognition by the host innate immune system. The RNA genomes of coronaviruses were identified to have 5′-caps in the early 1980s. However, for decades the RNA capping mechanisms of coronaviruses remained unknown. Since 2003, the outbreak of severe acute respiratory syndrome coronavirus has drawn increased attention and stimulated numerous studies on the molecular virology of coronaviruses. Here, we review the current understanding of the mechanisms adopted by coronaviruses to produce the 5′-cap structure and methylation modification of viral genomic RNAs.
Funding: The National Key Research and Development Program of China (No. 2018YFC1314900, 2018YFC1314902); the National Natural Science Foundation of China (No. 61571109); the Fundamental Research Funds for the Central Universities (No. 2242017K3DN04).
Abstract: To investigate how synonymous codons have been adapted to the formation of ribonucleic acid (RNA) G-quadruplex (rG4) structures, the computational search algorithm G4Hunter was applied to detect rG4 structures in the protein-coding sequences of mRNAs in five eukaryotic species. The native sequences forming rG4s were then compared with randomized sequences to evaluate selection on synonymous codons. Factors that may influence the formation of rG4s were also investigated, and the selection pressures on rG4s in different gene regions were compared to explore their potential roles in gene regulation. The results show that universal selective pressure acts on synonymous codons in rG4 regions to facilitate rG4 formation in the five eukaryotic organisms. While G-rich codon combinations are preferred in rG4 structural regions, C-rich codon combinations are selected against because they are unfavorable for rG4 formation. A gene's codon usage bias, nucleotide composition, and evolutionary rate can account for the variation in selection on synonymous codons among rG4 structures within a species. Moreover, rG4 structures in the translational initiation region showed significantly higher selective pressures than those in the translational elongation region.
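The abstract names G4Hunter but does not describe its scoring. As commonly described, G4Hunter scores each G by the length of the G-run it belongs to (capped at 4), each C by the negative of its C-run length (capped at -4), and all other bases as 0, then averages the scores over a sliding window. The Python sketch below follows that description under stated assumptions: it is a simplified reimplementation and not the tool itself, the function names are hypothetical, and the default window of 25 nt is an assumed parameter rather than a value taken from this study.

```python
# Simplified sketch of G4Hunter-style scoring; not the published tool.
def g4hunter_scores(seq):
    """Per-base scores: +min(run, 4) for G-runs, -min(run, 4) for C-runs."""
    seq = seq.upper()
    scores = [0] * len(seq)
    i = 0
    while i < len(seq):
        base = seq[i]
        j = i
        while j < len(seq) and seq[j] == base:
            j += 1  # extend to the end of the current homopolymer run
        run = j - i
        if base == "G":
            value = min(run, 4)
        elif base == "C":
            value = -min(run, 4)
        else:
            value = 0
        for k in range(i, j):
            scores[k] = value
        i = j
    return scores


def window_means(seq, window=25):
    """Mean score of every window of the given size along the sequence."""
    scores = g4hunter_scores(seq)
    if window <= 0 or len(scores) < window:
        return []
    total = sum(scores[:window])
    means = [total / window]
    for start in range(1, len(scores) - window + 1):
        total += scores[start + window - 1] - scores[start - 1]
        means.append(total / window)
    return means
```

Windows whose mean exceeds a chosen threshold would be reported as candidate rG4 regions. The opposite signs for G-runs and C-runs are consistent with the abstract's observation that G-rich codon combinations favor rG4 formation while C-rich combinations do not.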
Funding: Supported by the National Key Research and Development Program of China (2021YFA0909400); National Natural Science Foundation of China (22374132, 22225402, 32341017); Department of Science and Technology of Zhejiang Province (2024R01005, 2023SDYXS0002); Natural Science Foundation of Zhejiang Province (QKHM25B0501).
Abstract: RNA-protein interactions are crucial for regulating various cellular processes such as gene expression, RNA modification and translation. In contrast, undesirable RNA-protein interactions often cause dysregulated cellular activities associated with many human diseases. RNA containing expanded GGGGCC repeats forms secondary structures that sequester various RNA-binding proteins (RBPs), leading to the development of amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD). However, a gap persists in understanding the structural basis for GGGGCC repeat RNA binding to RBPs. Here, we resolve the first solution NMR structure of a natural GGGGCC repeat RNA containing a 2×2 GG/GG internal loop, and perform MD simulations and site-directed mutagenesis to elucidate the mechanism of GGGGCC repeat RNA binding to SRSF2, a splicing factor and key marker of nuclear speckles. We reveal that the R47/T51/R61 residues in the RNA recognition motif of SRSF2 and the 2×2 GG/GG internal loop in the GGGGCC repeat RNA are essential for binding. This work furnishes a valuable high-resolution structural basis for understanding the binding mechanism between GGGGCC repeat RNA and RBPs, and guides RNA structure-based drug design.