Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) relies on the central molecular machine RNA-dependent RNA polymerase (RdRp) for viral replication and transcription. Remdesivir at the template strand has been shown to effectively inhibit RNA synthesis in SARS-CoV-2 RdRp by deactivating not only the complementary UTP incorporation but also the next nucleotide addition. However, the underlying molecular mechanism of the second inhibitory point remains unclear. In this work, we performed molecular dynamics simulations and demonstrated that this inhibition does not act directly on nucleotide addition at the active site. Instead, the translocation of Remdesivir from the +1 to the -1 site is hindered thermodynamically, as the post-translocation state is less stable than the pre-translocation state due to the motif B residue G683. Moreover, another conserved residue on motif B, S682, further hinders the dynamic translocation of Remdesivir due to a steric clash with the 1′-cyano substitution. Overall, our study has unveiled an alternative role of motif B in mediating translocation when Remdesivir is present in the template strand and complements our understanding of the inhibitory mechanisms exerted by Remdesivir on RNA synthesis in SARS-CoV-2 RdRp.
Dear Editor, Pyruvate dehydrogenase complex (PDHc) is a large multienzyme assembly (Mr = 4–10 million Daltons) consisting of three essential components: pyruvate dehydrogenase (E1p), dihydrolipoyl transacetylase (E2p), and dihydrolipoyl dehydrogenase (E3). These three enzymes perform distinct functions sequentially to catalyze the oxidative decarboxylation of pyruvate with formation of nicotinamide adenine dinucleotide (NADH) and acetyl-coenzyme A (Patel and Roche, 1990).
In mammalian cells, transcribed enhancers (TrEns) play important roles in the initiation of gene expression and maintenance of gene expression levels in a spatiotemporal manner. One of the most challenging questions is how the genomic characteristics of enhancers relate to enhancer activities. To date, only a limited number of enhancer sequence characteristics have been investigated, leaving space for exploring the enhancers’ DNA code in a more systematic way. To address this problem, we developed a novel computational framework, Transcribed Enhancer Landscape Search (TELS), aimed at identifying predictive cell type/tissue-specific motif signatures of TrEns. As a case study, we used TELS to compile a comprehensive catalog of motif signatures for all known TrEns identified by the FANTOM5 consortium across 112 human primary cells and tissues. Our results confirm that combinations of different short motifs optimally characterize cell type/tissue-specific TrEns. Our study is the first to report combinations of motifs that maximize the performance of classifying TrEns exclusively transcribed in one cell type/tissue against TrEns exclusively transcribed in different cell types/tissues. Moreover, we also report 31 motif signatures predictive of enhancers’ broad activity. TELS codes and material are publicly available at http://www.cbrc.kaust.edu.sa/TELS.
The COVID-19 pandemic has caused a global crisis that prompted the scientific community to tackle new, extraordinary challenges. The urge to make informed decisions with dramatic impact on public health, wealth, and society as a whole has pushed the scientific community to produce knowledge and tools at an unprecedented pace. This massive effort resulted in a deep understanding of SARS-CoV-2 infection, spread, and pathogenetic mechanisms, together with the assessment of non-pharmaceutical interventions (NPIs) and the development of therapies and vaccines. We argue that a remarkable contribution to the fight against the pandemic has been provided by the availability of computational biology and bioinformatics tools, which helped to boost the research process in every application area and to better cope with the urgent needs created by the pandemic.
Covering a quarter of the world's tropical coastlines and being one of the most threatened ecosystems, mangroves are among the major sources of terrestrial organic matter to oceans and harbor a wide microbial diversity. In order to protect, restore, and better understand these ecosystems, researchers have extensively studied their microbiology, yet few surveys have focused on their fungal communities. Our lack of knowledge is even more pronounced for specific fungal populations, such as the ones associated with the rhizosphere. Likewise, the Red Sea gray mangroves (Avicennia marina) remain poorly characterized, and understanding of their fungal communities still relies on cultivation-dependent methods. In this study, we analyzed metagenomic datasets from gray mangrove rhizosphere and bulk soil samples collected on the Red Sea coast to obtain a snapshot of their fungal communities. Our data indicated that Ascomycota was the dominant phylum (76%–85%), while Basidiomycota was less abundant (14%–24%), yet present in higher numbers than usually reported for such environments. Fungal communities were more stable within the rhizosphere than within the bulk soil, both at class and genus level. This finding is consistent with the intrinsic patchiness in soil sediments and with the selection of specific microbial communities by plant roots. Our study indicates the presence of several species in this mycobiome that were not previously reported as mangrove-associated. In particular, we detected representatives of several commercially used fungi, e.g., producers of secreted cellulases and anaerobic producers of cellulosomes. These results represent additional insights into the fungal community of the gray mangroves of the Red Sea, and show that they are significantly richer than previously reported.
Abscisic acid (ABA) is an important carotenoid-derived phytohormone that plays essential roles in plant responses to biotic and abiotic stresses as well as in various physiological and developmental processes. In Arabidopsis, ABA biosynthesis starts with the epoxidation of zeaxanthin by the ABA DEFICIENT 1 (ABA1) enzyme, leading to epoxycarotenoids, e.g., violaxanthin. The oxidative cleavage of 9-cis-epoxycarotenoids, a key regulatory step catalyzed by 9-CIS-EPOXYCAROTENOID DIOXYGENASE, forms xanthoxin, which is converted in further reactions mediated by ABA DEFICIENT 2 (ABA2), ABA DEFICIENT 3 (ABA3), and ABSCISIC ALDEHYDE OXIDASE 3 (AAO3) into ABA. By combining genetic and biochemical approaches, we unravel here an ABA1-independent ABA biosynthetic pathway starting upstream of zeaxanthin. We identified the carotenoid cleavage products (i.e., apocarotenoids: β-apo-11-carotenal, 9-cis-β-apo-11-carotenal, 3-OH-β-apo-11-carotenal, and 9-cis-3-OH-β-apo-11-carotenal) as intermediates of this ABA1-independent ABA biosynthetic pathway. Using labeled compounds, we showed that β-apo-11-carotenal, 9-cis-β-apo-11-carotenal, and 3-OH-β-apo-11-carotenal are successively converted into 9-cis-3-OH-β-apo-11-carotenal, xanthoxin, and finally into ABA in both Arabidopsis and rice. When applied to Arabidopsis, these β-apo-11-carotenoids exert ABA biological functions, such as maintaining seed dormancy and inducing the expression of ABA-responsive genes. Moreover, the transcriptomic analysis revealed a high overlap of differentially expressed genes regulated by β-apo-11-carotenoids and ABA, suggesting that β-apo-11-carotenoids exert ABA-independent regulatory activities. Taken together, our study identifies a biological function for the common plant metabolites β-apo-11-carotenoids, extends our knowledge about ABA biosynthesis, and provides new insights into plant apocarotenoid metabolic networks.
More than 99% of identified prokaryotes, including many from the marine environment, cannot be cultured in the laboratory. This lack of capability restricts our knowledge of microbial genetics and community ecology. Metagenomics, the culture-independent cloning of environmental DNAs that are isolated directly from an environmental sample, has already provided a wealth of information about the uncultured microbial world. It has also facilitated the discovery of novel biocatalysts by allowing researchers to probe directly into a huge diversity of enzymes within natural microbial communities. Recent advances in these studies have led to a great interest in recruiting microbial enzymes for the development of environmentally friendly industry. Although the metagenomics approach has many limitations, it is expected to provide not only scientific insights but also economic benefits, especially in industry. This review highlights the importance of metagenomics in mining microbial lipases, as an example, by using high-throughput techniques. In addition, we discuss challenges in metagenomics as an important part of bioinformatics analysis in big data.
The outstanding properties of graphene have initiated myriads of research and development; yet, its economic impact is hampered by the difficulties encountered in production and practical application. Recently discovered laser-induced graphene (LIG) is generated by a simple printing process on flexible and lightweight polyimide films. Exploiting the electrical features and mechanical pliability of LIG on polyimide, we developed wearable resistive bending sensors that pave the way for many cost-effective measurement systems. The versatile sensors we describe can be utilized in a wide range of configurations, including measurement of force, deflection, and curvature. The deflection induced by different forces and speeds is effectively sensed through a resistance measurement, exploiting the piezoresistance of the printed graphene electrodes. The LIG sensors possess an outstanding range for strain measurements, reaching >10%. A double-sided electrode concept was developed by printing the same electrodes on both sides of the film and employing difference measurements. This provided a large bidirectional bending response combined with temperature compensation. Versatility in geometry and a simple fabrication process enable the detection of a wide range of flow speeds, forces, and deflections. The sensor response can be easily tuned by geometrical parameters of the bending sensors and the LIG electrodes. As a wearable device, LIG bending sensors were used for tracking body movements. For underwater operation, PDMS-coated LIG bending sensors were integrated with ultra-low power aquatic tags and utilized in underwater animal speed monitoring applications and in a recording of the surface current velocity on a coral reef in the Red Sea.
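The temperature compensation of the double-sided concept can be illustrated with a simple linear model. This is only a sketch under assumptions that the abstract does not state: both electrodes are taken to share a base resistance R_0 and a temperature coefficient α, and bending with strain ε is taken to change the two resistances by equal and opposite amounts with a gauge factor G. All symbols are introduced here for illustration.

```latex
\begin{align}
R_{\mathrm{top}} &= R_0\,(1 + \alpha\,\Delta T + G\,\varepsilon), \\
R_{\mathrm{bottom}} &= R_0\,(1 + \alpha\,\Delta T - G\,\varepsilon), \\
R_{\mathrm{top}} - R_{\mathrm{bottom}} &= 2\,R_0\,G\,\varepsilon .
\end{align}
```

Under these assumptions the temperature term cancels in the difference, and the sign of the difference follows the bending direction.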
Deep learning (DL) has shown explosive growth in its application to bioinformatics and has demonstrated thrillingly promising power to mine the complex relationships hidden in large-scale biological and biomedical data. A number of comprehensive reviews have been published on such applications, ranging from high-level reviews with future perspectives to those mainly serving as tutorials.
Graphene has shown considerable potential for sensing magnetic fields based on the Hall effect, due to its high carrier mobility, low sheet carrier density, and low temperature dependence. However, the cost of graphene in comparison to conventional materials has meant that its uptake in electronic manufacturing has been slow. To lower technological barriers and bring about more widespread adoption of graphene Hall sensors, we use a one-step laser scribing process that does not rely on multiple steps, toxic chemicals, or subsequent treatments. Laser-scribed graphene Hall sensors offer a linear response to magnetic fields with a normalized sensitivity of ~1.12 V/(A·T). They also exhibit a low constant noise voltage floor of ~50 nV/√Hz for a bias current of 100 μA at room temperature, which is comparable with state-of-the-art low-noise Hall sensors. The sensors combine high bendability with high robustness and operating temperatures of up to 400 °C. They enable device ideas in various areas, for instance, soft robotics. As an example, we combined a laser-scribed graphene sensor with a deformable elastomer and flexible magnet to realize low-cost, compliant, and customizable tactile sensors.
Droplet microfluidic techniques have shown promising results for studying single cells at high throughput. However, their adoption in laboratories studying “-omics” sciences remains limited due to the complex and multidisciplinary nature of the field. To facilitate their use, here we provide engineering details and organized protocols for integrating three droplet-based microfluidic technologies into the metagenomic pipeline to enable functional screening of bioproducts at high throughput. First, a device encapsulating single cells in droplets at a rate of ~250 Hz is described, considering droplet size and cell growth. Then, we expand on previously reported fluorescence-activated droplet sorting systems to integrate the use of 4 independent fluorescence-exciting lasers (i.e., 405, 488, 561, and 637 nm) in a single platform to make it compatible with different fluorescence-emitting biosensors. For this sorter, both hardware and software are provided and optimized for effortlessly sorting droplets at 60 Hz. Then, a passive droplet merger is also integrated into our pipeline to enable adding new reagents to already-made droplets at a rate of 200 Hz. Finally, we provide an optimized recipe for manufacturing these chips using silicon dry-etching tools. Because of the overall integration and the technical details presented here, our approach allows biologists to quickly use microfluidic technologies and achieve both single-cell resolution and high-throughput capability (>50,000 cells/day) for mining and bioprospecting metagenomic data.
Antibody leads must fulfill multiple desirable properties to be clinical candidates. Primarily due to the low throughput of the experimental procedure, the need for such multi-property optimization causes a bottleneck in preclinical antibody discovery and development, because addressing one issue usually causes another. We developed a reinforcement learning (RL) method, named AB-Gen, for antibody library design using a generative pre-trained transformer (GPT) as the policy network of the RL agent. We showed that this model can learn the antibody space of heavy chain complementarity determining region 3 (CDRH3) and generate sequences with similar property distributions. In addition, when using human epidermal growth factor receptor-2 (HER2) as the target, the agent model of AB-Gen was able to generate novel CDRH3 sequences that fulfill multi-property constraints. In total, 509 generated sequences passed all property filters, and three highly conserved residues were identified. The importance of these residues was further demonstrated by molecular dynamics simulations, confirming that the agent model was capable of grasping important information in this complex optimization task. Overall, the AB-Gen method is able to design novel antibody sequences with a higher success rate than the traditional propose-then-filter approach. It has the potential to be used in practical antibody design, thus empowering the antibody discovery and development process. The source code of AB-Gen is freely available at Zenodo (https://doi.org/10.5281/zenodo.7657016) and BioCode (https://ngdc.cncb.ac.cn/biocode/tools/BT007341).
The number of available protein sequences in public databases is increasing exponentially. However, a significant percentage of these sequences lack functional annotation, which is essential for understanding how biological systems operate. Here, we propose a novel method, Quantitative Annotation of Unknown STructure (QAUST), to infer protein functions, specifically Gene Ontology (GO) terms and Enzyme Commission (EC) numbers. QAUST uses three sources of information: structure information encoded by global and local structure similarity search, biological network information inferred from protein–protein interaction data, and sequence information extracted from functionally discriminative sequence motifs. These three pieces of information are combined by consensus averaging to make the final prediction. Our approach has been tested on 500 protein targets from the Critical Assessment of Functional Annotation (CAFA) benchmark set. The results show that our method provides accurate functional annotation and outperforms other prediction methods based on sequence similarity search or threading. We further demonstrate that a previously unknown function of human tripartite motif-containing 22 (TRIM22) protein predicted by QAUST can be experimentally validated.
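The consensus-averaging step mentioned in the abstract above can be sketched in a few lines. This is a minimal illustration and not the QAUST implementation: the function name `consensus_average`, the equal weighting of the three sources, the rule that a term missing from a source counts as 0, and all scores are assumptions made here for the example.

```python
from collections import defaultdict


def consensus_average(score_sources):
    """Average per-term confidence scores across sources.

    Hypothetical helper: a term that a source does not report is
    treated as having score 0 in that source (an assumption).
    """
    totals = defaultdict(float)
    for scores in score_sources:
        for term, score in scores.items():
            totals[term] += score
    n_sources = len(score_sources)
    return {term: total / n_sources for term, total in totals.items()}


# Made-up scores for two GO terms from the three kinds of evidence.
structure_scores = {"GO:0003824": 0.9, "GO:0005515": 0.3}
network_scores = {"GO:0003824": 0.6}
motif_scores = {"GO:0003824": 0.6, "GO:0005515": 0.6}

combined = consensus_average([structure_scores, network_scores, motif_scores])
# Rank terms by their averaged confidence, highest first.
ranked = sorted(combined, key=combined.get, reverse=True)
```

A term supported by all three sources ranks above one supported by only two, which is the intuition behind combining independent evidence; a weighted average would be an equally plausible variant.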
The global features of H3K4 and H3K27 trimethylations (H3K4me3 and H3K27me3) have been well studied in recent years, but most of these studies were performed in mammalian cell lines. In this work, we generated genome-wide maps of H3K4me3 and H3K27me3 in mouse cerebrum and testis using ChIP-seq, and their high-coverage transcriptomes using ribominus RNA-seq with SOLiD technology. We examined the global patterns of H3K4me3 and H3K27me3 in both tissues and found that the modifications are closely associated with tissue-specific expression, function, and development. Moreover, we revealed that H3K4me3 and H3K27me3 rarely occur in silent genes, which contradicts the findings of previous studies. Finally, we observed that bivalent domains, with both H3K4me3 and H3K27me3, existed ubiquitously in both tissues and demonstrated an invariable preference for the regulation of developmentally related genes. However, the bivalent domains tend towards a "winner-takes-all" approach to regulate the expression of associated genes. We also verified the above results in mouse ES cells. As expected, the results in ES cells are consistent with those in cerebrum and testis. In conclusion, we present two very important findings. One is that H3K4me3 and H3K27me3 rarely occur in silent genes. The other is that bivalent domains may adopt a "winner-takes-all" principle to regulate gene expression.
The ever-increasing high-volume and high-dimensional genomics data on the one hand challenge traditional data analysis approaches, and on the other hand provide ample opportunities for developing novel analytic strategies.
The deep-sea brines of the Red Sea include some of the most extreme and unique environments on Earth. They combine high salinities with increases in temperature, heavy metals, hydrostatic pressure, and anoxic conditions, creating unique settings for thriving populations of novel extremophiles. Despite a recent increase in studies focusing on these unusual biotopes, their viral communities remain unexplored. The current survey explores four metagenomic datasets obtained from different brine-seawater interface samples, focusing specifically on the diversity of their viral communities. Data analysis confirmed that the particle-attached viral communities present in the brine-seawater interfaces were diverse and generally dominated by Caudovirales, yet appeared distinct from sample to sample. With a level of caution, we report the unexpected finding of Phycodnaviridae, which infect algae and plants, and trace amounts of insect-infecting Iridoviridae. Results from Kebrit Deep revealed stratification in the viral communities present in the interface: the upper interface was enriched with viruses associated with typical marine bacteria, while the lower interface was enriched with haloviruses and halophages. These results provide first insights into the unexplored viral communities present in deep-sea brines of the Red Sea, representing one of the first steps for ongoing and future sampling efforts and studies.
Alternative polyadenylation (APA) is a crucial step in post-transcriptional regulation. Previous bioinformatic studies have mainly focused on the recognition of polyadenylation sites (PASs) in a given genomic sequence, which is a binary classification problem. Recently, computational methods for predicting the usage level of alternative PASs in the same gene have been proposed. However, all of them cast the problem as a non-quantitative pairwise comparison task and do not take the competition among multiple PASs into account. To address this, here we propose a deep learning architecture, Deep Regulatory Code and Tools for Alternative Polyadenylation (DeeReCT-APA), to quantitatively predict the usage of all alternative PASs of a given gene. To accommodate different genes with potentially different numbers of PASs, DeeReCT-APA treats the problem as a regression task with a variable-length target. Based on a convolutional neural network-long short-term memory (CNN-LSTM) architecture, DeeReCT-APA extracts sequence features with CNN layers, uses bidirectional LSTM to explicitly model the interactions among competing PASs, and outputs percentage scores representing the usage levels of all PASs of a gene. In addition to the fact that only our method can quantitatively predict the usage of all the PASs within a gene, we show that our method consistently outperforms other existing methods on three different tasks for which they are trained: pairwise comparison task, highest usage prediction task, and ranking task. Finally, we demonstrate that our method can be used to predict the effect of genetic variations on APA patterns and sheds light on future mechanistic understanding of APA regulation. Our code and data are available at https://github.com/lzx325/DeeReCT-APA-repo.
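The idea of a variable-length target, where one gene may have two PASs and another five, can be illustrated by the final normalization step alone. The sketch below is not taken from the DeeReCT-APA code: it assumes that a network has already produced one raw score per PAS and that a softmax turns those scores into usage percentages; the function name `usage_percentages` and the input values are hypothetical.

```python
import math


def usage_percentages(raw_scores):
    """Turn one raw score per PAS into percentages that sum to 100.

    Assumption for illustration: a softmax over the PASs of a single
    gene, so the sites compete with each other for the same total.
    """
    if not raw_scores:
        raise ValueError("a gene needs at least one PAS")
    shift = max(raw_scores)  # subtract the max for numerical stability
    exps = [math.exp(score - shift) for score in raw_scores]
    total = sum(exps)
    return [100.0 * value / total for value in exps]


# Two hypothetical genes with different numbers of PASs.
gene_a = usage_percentages([2.0, 0.5])
gene_b = usage_percentages([1.0, 1.0, 1.0, 1.0])
```

Because the normalization runs over however many sites a gene has, the same function handles both genes, and raising the score of one site necessarily lowers the predicted share of the others.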
Aerophobetes (or CD12) is a recently defined bacterial phylum, of which the metabolic processes and ecological importance remain unclear. In the present study, we obtained the draft genome of an Aerophobetes bacterium TCS1 from saline sediment near the Thuwal cold seep in the Red Sea using a genome binning method. Analysis of 16S rRNA genes of TCS1 and close relatives revealed wide distribution of Aerophobetes in deep-sea sediments. Phylogenetic relationships showed affinity between Aerophobetes TCS1 and some thermophilic bacterial phyla. The genome of TCS1 (at least 1.27 Mbp) contains a full set of genes encoding core metabolic pathways, including glycolysis and pyruvate fermentation to produce acetyl-CoA and acetate. The identification of cross-membrane sugar transporter genes further indicates its potential ability to consume carbohydrates preserved in the sediment under the microbial mat. Aerophobetes bacterium TCS1 therefore probably carries out saccharolytic and fermentative metabolism. The genes responsible for autotrophic synthesis of acetyl-CoA via the Wood-Ljungdahl pathway were also found in the genome. Phylogenetic study of the essential genes for the Wood-Ljungdahl pathway implied relative independence of Aerophobetes bacterium from the known acetogens and methanogens. Compared with genomes of acetogenic bacteria, the Aerophobetes bacterium TCS1 genome lacks the genes involved in nitrogen metabolism, sulfur metabolism, signal transduction, and cell motility. The metabolic activities of TCS1 might depend on geochemical conditions such as supplies of CO2, hydrogen, and sugars, and therefore TCS1 might be a facultative bacterium in anaerobic saline sediments near cold seeps.
Professor Vladimir B. Bajic, a world-renowned pioneer in bioinformatics and associate editor-in-chief of Genomics, Proteomics & Bioinformatics (GPB) since 2015, passed away on 31 October 2019 in Jeddah, Saudi Arabia. Prof. Bajic joined the editorial board of GPB in 2012 and devoted his precious time and efforts to handling and reviewing manuscripts, and to providing many valuable instructions and suggestions for improving the journal's quality and readership.
The accurate annotation of transcription start sites (TSSs) and their usage is critical for the mechanistic understanding of gene regulation in different biological contexts. To this end, specific high-throughput experimental technologies have been developed to capture TSSs in a genome-wide manner, and various computational tools have also been developed for in silico prediction of TSSs based solely on genomic sequences. Most of these computational tools cast the problem as a binary classification task on a balanced dataset, thus resulting in drastic false positive predictions when applied on the genome scale. Here, we present DeeReCT-TSS, a deep learning-based method that is capable of identifying TSSs across the whole genome based on both DNA sequence and conventional RNA sequencing data. We show that by effectively incorporating these two sources of information, DeeReCT-TSS significantly outperforms other solely sequence-based methods on the precise annotation of TSSs used in different cell types. Furthermore, we develop a meta-learning-based extension for simultaneous TSS annotation on 10 cell types, which enables the identification of cell type-specific TSSs. Finally, we demonstrate the high precision of DeeReCT-TSS on two independent datasets by correlating our predicted TSSs with experimentally defined TSS chromatin states. The source code for DeeReCT-TSS is available at https://github.com/JoshuaChou2018/DeeReCT-TSS_release and https://ngdc.cncb.ac.cn/biocode/tools/BT007316.
Funding (SARS-CoV-2 RdRp and Remdesivir translocation study): Supported by the National Key R&D Program of China (No. 2021YFA1502300) and the National Natural Science Foundation of China (No. 21733007).
Funding (pyruvate dehydrogenase complex letter): Supported by the National Key R&D Program of China (2022YFA1302701); the National Natural Science Foundation of China (32030056 to M.Y.; 32241031 and 32171195 to S.L.); the scientific project of Beijing Life Science Academy (2023300CA0090); the Tsinghua University Initiative Scientific Research Program (2023Z11DSZ001); and the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award OSR-2020-CRG9-4352 and Office of Research Administration (ORA) under Award Nos. URF/1/4352-01-01, FCC/1/1976-44-01, FCC/1/1976-45-01, REI/1/5234-01-01, and REI/1/5414-01-01.
Funding: Supported by the base funding (Grant No. BAS/1/1606-01-01) to VBB from the King Abdullah University of Science and Technology (KAUST), Saudi Arabia.
Abstract: In mammalian cells, transcribed enhancers (TrEns) play important roles in the initiation of gene expression and the maintenance of gene expression levels in a spatiotemporal manner. One of the most challenging questions is how the genomic characteristics of enhancers relate to enhancer activities. To date, only a limited number of enhancer sequence characteristics have been investigated, leaving space for exploring the enhancers' DNA code in a more systematic way. To address this problem, we developed a novel computational framework, Transcribed Enhancer Landscape Search (TELS), aimed at identifying predictive cell type/tissue-specific motif signatures of TrEns. As a case study, we used TELS to compile a comprehensive catalog of motif signatures for all known TrEns identified by the FANTOM5 consortium across 112 human primary cells and tissues. Our results confirm that combinations of different short motifs optimally characterize cell type/tissue-specific TrEns. Our study is the first to report combinations of motifs that maximize the performance of classifying TrEns exclusively transcribed in one cell type/tissue against TrEns exclusively transcribed in different cell types/tissues. Moreover, we also report 31 motif signatures predictive of enhancers' broad activity. TELS codes and material are publicly available at http://www.cbrc.kaust.edu.sa/TELS.
Abstract: The COVID-19 pandemic has caused a global crisis that prompted the scientific community to tackle new, extraordinary challenges. The urgent need to make informed decisions with dramatic impact on public health, wealth, and society as a whole has pushed the scientific community to produce knowledge and tools at an unprecedented pace. This massive effort resulted in a deep understanding of SARS-CoV-2 infection, spread, and pathogenetic mechanisms, together with the assessment of non-pharmaceutical interventions (NPIs) and the development of therapies and vaccines. We argue that a remarkable contribution to the fight against the pandemic has been provided by the availability of computational biology and bioinformatics tools, which helped boost the research process in every application area and better cope with the urgent needs created by the pandemic.
Funding: Supported by the base research funds to VBB and the competitive research funding of VBB from King Abdullah University of Science and Technology (KAUST) in Saudi Arabia.
Abstract: Covering a quarter of the world's tropical coastlines and being one of the most threatened ecosystems, mangroves are among the major sources of terrestrial organic matter to oceans and harbor a wide microbial diversity. In order to protect, restore, and better understand these ecosystems, researchers have extensively studied their microbiology, yet few surveys have focused on their fungal communities. Our lack of knowledge is even more pronounced for specific fungal populations, such as the ones associated with the rhizosphere. Likewise, the Red Sea gray mangroves (Avicennia marina) remain poorly characterized, and understanding of their fungal communities still relies on cultivation-dependent methods. In this study, we analyzed metagenomic datasets from gray mangrove rhizosphere and bulk soil samples collected on the Red Sea coast to obtain a snapshot of their fungal communities. Our data indicated that Ascomycota was the dominant phylum (76%-85%), while Basidiomycota was less abundant (14%-24%), yet present in higher numbers than usually reported for such environments. Fungal communities were more stable within the rhizosphere than within the bulk soil, both at class and genus level. This finding is consistent with the intrinsic patchiness in soil sediments and with the selection of specific microbial communities by plant roots. Our study indicates the presence of several species in this mycobiome that were not previously reported as mangrove-associated. In particular, we detected representatives of several commercially used fungi, e.g., producers of secreted cellulases and anaerobic producers of cellulosomes. These results represent additional insights into the fungal community of the gray mangroves of the Red Sea, and show that they are significantly richer than previously reported.
Funding: This work was supported by baseline funding and the Research Grants Program-Round 4 (CRG4) from King Abdullah University of Science and Technology to S.A.-B., and by the National Natural Science Foundation of China (funds 31900245 and 32170271) given to K.-P.J.
Abstract: Abscisic acid (ABA) is an important carotenoid-derived phytohormone that plays essential roles in plant response to biotic and abiotic stresses as well as in various physiological and developmental processes. In Arabidopsis, ABA biosynthesis starts with the epoxidation of zeaxanthin by the ABA DEFICIENT 1 (ABA1) enzyme, leading to epoxycarotenoids, e.g., violaxanthin. The oxidative cleavage of 9-cis-epoxycarotenoids, a key regulatory step catalyzed by 9-CIS-EPOXYCAROTENOID DIOXYGENASE, forms xanthoxin, which is converted in further reactions mediated by ABA DEFICIENT 2 (ABA2), ABA DEFICIENT 3 (ABA3), and ABSCISIC ALDEHYDE OXIDASE 3 (AAO3) into ABA. By combining genetic and biochemical approaches, we unravel here an ABA1-independent ABA biosynthetic pathway starting upstream of zeaxanthin. We identified the carotenoid cleavage products (i.e., apocarotenoids) β-apo-11-carotenal, 9-cis-β-apo-11-carotenal, 3-OH-β-apo-11-carotenal, and 9-cis-3-OH-β-apo-11-carotenal as intermediates of this ABA1-independent ABA biosynthetic pathway. Using labeled compounds, we showed that β-apo-11-carotenal, 9-cis-β-apo-11-carotenal, and 3-OH-β-apo-11-carotenal are successively converted into 9-cis-3-OH-β-apo-11-carotenal, xanthoxin, and finally into ABA in both Arabidopsis and rice. When applied to Arabidopsis, these β-apo-11-carotenoids exert ABA biological functions, such as maintaining seed dormancy and inducing the expression of ABA-responsive genes. Moreover, the transcriptomic analysis revealed a high overlap of differentially expressed genes regulated by β-apo-11-carotenoids and ABA, suggesting that β-apo-11-carotenoids exert ABA-independent regulatory activities. Taken together, our study identifies a biological function for the common plant metabolites β-apo-11-carotenoids, extends our knowledge about ABA biosynthesis, and provides new insights into plant apocarotenoid metabolic networks.
Funding: Supported by King Abdullah University of Science and Technology (KAUST), Saudi Arabia.
Abstract: More than 99% of identified prokaryotes, including many from the marine environment, cannot be cultured in the laboratory. This limitation restricts our knowledge of microbial genetics and community ecology. Metagenomics, the culture-independent cloning of DNA isolated directly from an environmental sample, has already provided a wealth of information about the uncultured microbial world. It has also facilitated the discovery of novel biocatalysts by allowing researchers to probe directly into a huge diversity of enzymes within natural microbial communities. Recent advances in these studies have led to great interest in recruiting microbial enzymes for the development of environmentally friendly industry. Although the metagenomics approach has many limitations, it is expected to provide not only scientific insights but also economic benefits, especially in industry. This review highlights the importance of metagenomics in mining microbial lipases, as an example, by using high-throughput techniques. In addition, we discuss challenges in metagenomics as an important part of bioinformatics analysis in big data.
Funding: This research is a contribution to the CAASE project funded by King Abdullah University of Science and Technology (KAUST) under the KAUST Sensor Initiative.
Abstract: The outstanding properties of graphene have initiated myriads of research and development; yet, its economic impact is hampered by the difficulties encountered in production and practical application. Recently discovered laser-induced graphene (LIG) is generated by a simple printing process on flexible and lightweight polyimide films. Exploiting the electrical features and mechanical pliability of LIG on polyimide, we developed wearable resistive bending sensors that pave the way for many cost-effective measurement systems. The versatile sensors we describe can be utilized in a wide range of configurations, including measurement of force, deflection, and curvature. The deflection induced by different forces and speeds is effectively sensed through a resistance measurement, exploiting the piezoresistance of the printed graphene electrodes. The LIG sensors possess an outstanding range for strain measurements, reaching >10%. A double-sided electrode concept was developed by printing the same electrodes on both sides of the film and employing difference measurements. This provided a large bidirectional bending response combined with temperature compensation. Versatility in geometry and a simple fabrication process enable the detection of a wide range of flow speeds, forces, and deflections. The sensor response can be easily tuned by geometrical parameters of the bending sensors and the LIG electrodes. As a wearable device, LIG bending sensors were used for tracking body movements. For underwater operation, PDMS-coated LIG bending sensors were integrated with ultra-low power aquatic tags and utilized in underwater animal speed monitoring applications and in recording the surface current velocity on a coral reef in the Red Sea.
Abstract: Deep learning (DL) has shown explosive growth in its application to bioinformatics and has demonstrated promising power to mine the complex relationships hidden in large-scale biological and biomedical data. A number of comprehensive reviews have been published on such applications, ranging from high-level reviews with future perspectives to those mainly serving as tutorials.
Funding: Funded by King Abdullah University of Science and Technology (KAUST) under the KAUST Sensor Initiative. B.A. Kaidarova et al., npj Flexible.
Abstract: Graphene has shown considerable potential for sensing magnetic fields based on the Hall effect, due to its high carrier mobility, low sheet carrier density, and low temperature dependence. However, the cost of graphene in comparison to conventional materials has meant that its uptake in electronic manufacturing has been slow. To lower technological barriers and bring more widespread adoption of graphene Hall sensors, we are using a one-step laser scribing process that does not rely on multiple steps, toxic chemicals, or subsequent treatments. Laser-scribed graphene Hall sensors offer a linear response to magnetic fields with a normalized sensitivity of ~1.12 V/AT. They also exhibit a low constant noise voltage floor of ~50 nV/√Hz for a bias current of 100 μA at room temperature, which is comparable with state-of-the-art low-noise Hall sensors. The sensors combine high bendability with high robustness and operating temperatures up to 400 °C. They enable device ideas in various areas, for instance, soft robotics. As an example, we combined a laser-scribed graphene sensor with a deformable elastomer and flexible magnet to realize low-cost, compliant, and customizable tactile sensors.
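To make the reported sensitivity concrete, the short sketch below applies the usual definition of current-normalized Hall sensitivity, V_H = S × I × B, to the figures in the abstract. It assumes an ideal linear sensor with no offset and reuses the 100 μA bias current quoted for the noise measurement; the 10 mT field and the function name `hall_voltage` are hypothetical, chosen only for illustration.

```python
# Normalized (current-related) sensitivity from the abstract, in V/(A*T).
S_NORM = 1.12
# Bias current quoted in the abstract (100 uA), expressed in amperes.
I_BIAS = 100e-6


def hall_voltage(b_field_tesla, sensitivity=S_NORM, bias_current=I_BIAS):
    """Hall voltage in volts for an ideal linear sensor: V_H = S * I * B."""
    return sensitivity * bias_current * b_field_tesla


# Hypothetical 10 mT field, chosen only for illustration.
v_hall = hall_voltage(10e-3)
print(f"{v_hall * 1e6:.2f} uV")
```

Under these assumptions a 10 mT field gives a Hall voltage of about 1.12 μV, which shows why a noise floor in the tens of nV/√Hz matters for resolving small fields.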
Funding: The work was supported by grants from King Abdullah University of Science and Technology (KAUST), Saudi Arabia (Grant Nos. BAS/1/1059/01/01, URF/1/1976/03/01, URF/1/1976-17-01, URF/1/1976-20-01, and FCS/1/3326-01-01).
Abstract: Droplet microfluidic techniques have shown promising outcomes for studying single cells at high throughput. However, their adoption in laboratories studying "-omics" sciences remains limited due to the complex and multidisciplinary nature of the field. To facilitate their use, here we provide engineering details and organized protocols for integrating three droplet-based microfluidic technologies into the metagenomic pipeline to enable functional screening of bioproducts at high throughput. First, a device encapsulating single cells in droplets at a rate of ~250 Hz is described, considering droplet size and cell growth. Then, we expand on previously reported fluorescence-activated droplet sorting systems to integrate the use of 4 independent fluorescence-exciting lasers (i.e., 405, 488, 561, and 637 nm) in a single platform to make it compatible with different fluorescence-emitting biosensors. For this sorter, both hardware and software are provided and optimized for effortlessly sorting droplets at 60 Hz. Then, a passive droplet merger is also integrated into our pipeline to enable adding new reagents to already-made droplets at a rate of 200 Hz. Finally, we provide an optimized recipe for manufacturing these chips using silicon dry-etching tools. Because of the overall integration and the technical details presented here, our approach allows biologists to quickly use microfluidic technologies and achieve both single-cell resolution and high-throughput capability (>50,000 cells/day) for mining and bioprospecting metagenomic data.
Funding: Supported in part by the Office of Research Administration (ORA), King Abdullah University of Science and Technology (KAUST), Saudi Arabia (Grant Nos. FCC/1/1976-44-01, FCC/1/1976-45-01, REI/1/5234-01-01, and URF/1/4352-01-01) and the National Natural Science Foundation of China (Grant No. 22273107).
Abstract: Antibody leads must fulfill multiple desirable properties to be clinical candidates. Primarily due to the low throughput of the experimental procedure, the need for such multi-property optimization causes a bottleneck in preclinical antibody discovery and development, because addressing one issue usually causes another. We developed a reinforcement learning (RL) method, named AB-Gen, for antibody library design using a generative pre-trained transformer (GPT) as the policy network of the RL agent. We showed that this model can learn the antibody space of heavy chain complementarity determining region 3 (CDRH3) and generate sequences with similar property distributions. In addition, when using human epidermal growth factor receptor-2 (HER2) as the target, the agent model of AB-Gen was able to generate novel CDRH3 sequences that fulfill multi-property constraints. In total, 509 generated sequences were able to pass all property filters, and three highly conserved residues were identified. The importance of these residues was further demonstrated by molecular dynamics simulations, confirming that the agent model was capable of grasping important information in this complex optimization task. Overall, the AB-Gen method is able to design novel antibody sequences with a higher success rate than the traditional propose-then-filter approach. It has the potential to be used in practical antibody design, thus empowering the antibody discovery and development process. The source code of AB-Gen is freely available at Zenodo (https://doi.org/10.5281/zenodo.7657016) and BioCode (https://ngdc.cncb.ac.cn/biocode/tools/BT007341).
Funding: Supported by the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) (Grant Nos. URF/1/1976-04, URF/1/1976-06).
Abstract: The number of available protein sequences in public databases is increasing exponentially. However, a significant percentage of these sequences lack functional annotation, which is essential for the understanding of how biological systems operate. Here, we propose a novel method, Quantitative Annotation of Unknown STructure (QAUST), to infer protein functions, specifically Gene Ontology (GO) terms and Enzyme Commission (EC) numbers. QAUST uses three sources of information: structure information encoded by global and local structure similarity search, biological network information inferred by protein–protein interaction data, and sequence information extracted from functionally discriminative sequence motifs. These three pieces of information are combined by consensus averaging to make the final prediction. Our approach has been tested on 500 protein targets from the Critical Assessment of Functional Annotation (CAFA) benchmark set. The results show that our method provides accurate functional annotation and outperforms other prediction methods based on sequence similarity search or threading. We further demonstrate that a previously unknown function of human tripartite motif-containing 22 (TRIM22) protein predicted by QAUST can be experimentally validated.
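The consensus-averaging step can be pictured with a minimal sketch. The abstract does not specify the weighting or score scale, so this example assumes equal weights, scores in [0, 1], and a score of 0 for any term a source does not report; the function name `consensus_average` and the example scores are hypothetical and are not taken from QAUST itself.

```python
from collections import defaultdict


def consensus_average(score_sets):
    """Average per-term confidence scores across information sources.

    Assumption: a term that a source does not report counts as 0 for
    that source, and all sources are weighted equally.
    """
    totals = defaultdict(float)
    for scores in score_sets:
        for term, score in scores.items():
            totals[term] += score
    n_sources = len(score_sets)
    return {term: total / n_sources for term, total in totals.items()}


# Hypothetical scores from the three sources named in the abstract.
structure = {"GO:0003824": 0.9, "GO:0005515": 0.3}
network = {"GO:0003824": 0.6}
motif = {"GO:0003824": 0.6, "GO:0005515": 0.6}

consensus = consensus_average([structure, network, motif])
for term, score in sorted(consensus.items(), key=lambda kv: -kv[1]):
    print(term, round(score, 3))
```

Treating a missing term as 0 penalizes predictions supported by only one source, which is one simple way for agreement between sources to raise a term's rank.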
Funding: Supported by grants from the Knowledge Innovation Program of the Chinese Academy of Sciences (KSCX2-EW-R-01-04); the National Science and Technology Key Project (2008ZX1004-013); the 863 Program (2009AA01A130); the Special Foundation Work Program (2009FY120100); the National Key Technology R&D Program (2008BA164B02); and the 973 Program (2011CB944100, 2011CB965300, and 2007CB948101) from the Ministry of Science and Technology of the People's Republic of China.
Abstract: The global features of H3K4 and H3K27 trimethylations (H3K4me3 and H3K27me3) have been well studied in recent years, but most of these studies were performed in mammalian cell lines. In this work, we generated genome-wide maps of H3K4me3 and H3K27me3 in mouse cerebrum and testis using ChIP-seq, and their high-coverage transcriptomes using ribominus RNA-seq with SOLiD technology. We examined the global patterns of H3K4me3 and H3K27me3 in both tissues and found that the modifications are closely associated with tissue-specific expression, function, and development. Moreover, we revealed that H3K4me3 and H3K27me3 rarely occur in silent genes, which contradicts the findings of previous studies. Finally, we observed that bivalent domains, with both H3K4me3 and H3K27me3, existed ubiquitously in both tissues and demonstrated an invariable preference for the regulation of developmentally related genes. However, the bivalent domains tend towards a "winner-takes-all" approach to regulate the expression of associated genes. We also verified the above results in mouse ES cells. As expected, the results in ES cells are consistent with those in cerebrum and testis. In conclusion, we present two important findings. One is that H3K4me3 and H3K27me3 rarely occur in silent genes. The other is that bivalent domains may adopt a "winner-takes-all" principle to regulate gene expression.
Funding: Supported by the Basic Research Grant (Grant No. JCYJ20170307105752508) from the Science and Technology Innovation Commission of Shenzhen Municipal Government, China, and the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR), Saudi Arabia (Grant Nos. FCC/1/1976-04, URF/1/2602-01, URF/1/3007-01, URF/1/3412-01, URF/1/3450-01, and URF/1/3454-01).
Abstract: The ever-increasing high-volume and high-dimensional genomics data on the one hand challenge traditional data analysis approaches, and on the other hand provide ample opportunities for developing novel analytic strategies.
Funding: Supported through the KAUST baseline research funds to VBB and partially supported by the KAUST-AUC Global Collaborative Research Program.
Abstract: The deep-sea brines of the Red Sea include some of the most extreme and unique environments on Earth. They combine high salinities with increases in temperature, heavy metals, hydrostatic pressure, and anoxic conditions, creating unique settings for thriving populations of novel extremophiles. Despite a recent increase in studies focusing on these unusual biotopes, their viral communities remain unexplored. The current survey explores four metagenomic datasets obtained from different brine-seawater interface samples, focusing specifically on the diversity of their viral communities. Data analysis confirmed that the particle-attached viral communities present in the brine-seawater interfaces were diverse and generally dominated by Caudovirales, yet appeared distinct from sample to sample. With some caution, we report the unexpected finding of Phycodnaviridae, which infect algae and plants, and trace amounts of insect-infecting Iridoviridae. Results from Kebrit Deep revealed stratification in the viral communities present in the interface: the upper interface was enriched with viruses associated with typical marine bacteria, while the lower interface was enriched with haloviruses and halophages. These results provide first insights into the unexplored viral communities present in deep-sea brines of the Red Sea, representing one of the first steps for ongoing and future sampling efforts and studies.
Funding: Supported by the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) (Grant Nos. URF/1/4098-01-01, BAS/1/1624-01, FCC/1/1976-18-01, FCC/1/1976-23-01, FCC/1/1976-25-01, FCC/1/1976-26-01, and FCS/1/4102-02-01); the International Cooperation Research Grant from the Science and Technology Innovation Commission of Shenzhen Municipal Government, China (Grant No. GJHZ20170310161947503 to YH); and the Shenzhen Science and Technology Program, China (Grant No. KQTD20180411143432337 to YH and WC).
Abstract: Alternative polyadenylation (APA) is a crucial step in post-transcriptional regulation. Previous bioinformatic studies have mainly focused on the recognition of polyadenylation sites (PASs) in a given genomic sequence, which is a binary classification problem. Recently, computational methods for predicting the usage level of alternative PASs in the same gene have been proposed. However, all of them cast the problem as a non-quantitative pairwise comparison task and do not take the competition among multiple PASs into account. To address this, here we propose a deep learning architecture, Deep Regulatory Code and Tools for Alternative Polyadenylation (DeeReCT-APA), to quantitatively predict the usage of all alternative PASs of a given gene. To accommodate different genes with potentially different numbers of PASs, DeeReCT-APA treats the problem as a regression task with a variable-length target. Based on a convolutional neural network-long short-term memory (CNN-LSTM) architecture, DeeReCT-APA extracts sequence features with CNN layers, uses a bidirectional LSTM to explicitly model the interactions among competing PASs, and outputs percentage scores representing the usage levels of all PASs of a gene. Besides being the only method that can quantitatively predict the usage of all the PASs within a gene, our method consistently outperforms other existing methods on the three different tasks for which they are trained: the pairwise comparison task, the highest usage prediction task, and the ranking task. Finally, we demonstrate that our method can be used to predict the effect of genetic variations on APA patterns and sheds light on future mechanistic understanding of APA regulation. Our code and data are available at https://github.com/lzx325/DeeReCT-APA-repo.
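The idea of a variable-length target whose outputs are percentage scores can be sketched independently of the network. The example below assumes that each PAS of a gene has already been given one raw score (for instance by the CNN-LSTM layers) and uses a softmax to turn those scores into usage percentages that sum to 100. The softmax normalization and the name `usage_percentages` are assumptions made for illustration; the abstract does not state how DeeReCT-APA normalizes its outputs.

```python
import math


def usage_percentages(site_scores):
    """Convert one raw score per PAS into usage percentages summing to 100.

    Works for any number of sites, so genes with different numbers of
    PASs can share the same output layer. Softmax is an assumption here.
    """
    top = max(site_scores)  # subtract the maximum for numerical stability
    weights = [math.exp(score - top) for score in site_scores]
    total = sum(weights)
    return [100.0 * weight / total for weight in weights]


# Hypothetical raw scores for a gene with two PASs and one with four.
two_sites = usage_percentages([0.0, 0.0])
four_sites = usage_percentages([2.0, 1.0, 0.5, -1.0])
print(two_sites)
print([round(p, 1) for p in four_sites])
```

Because every site's percentage depends on the scores of all other sites in the same gene, raising one site's score necessarily lowers the others, which mirrors the competition among PASs that the abstract emphasizes.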
Funding: Supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB06010201); the National Natural Science Foundation of China (41476104); the Strategic Priority Research Program (XDB06010102); and an award from the King Abdullah University of Science and Technology (SA-C0040/UK-C0016) to P.Y. Qian. V.B. Bajic was supported by KAUST Base Research Funds. S. Bougouffa was supported by a SABIC postdoctoral fellowship.
Abstract: Aerophobetes (or CD12) is a recently defined bacterial phylum whose metabolic processes and ecological importance remain unclear. In the present study, we obtained the draft genome of an Aerophobetes bacterium, TCS1, from saline sediment near the Thuwal cold seep in the Red Sea using a genome binning method. Analysis of 16S rRNA genes of TCS1 and close relatives revealed a wide distribution of Aerophobetes in deep-sea sediments. Phylogenetic relationships showed affinity between Aerophobetes TCS1 and some thermophilic bacterial phyla. The genome of TCS1 (at least 1.27 Mbp) contains a full set of genes encoding core metabolic pathways, including glycolysis and pyruvate fermentation to produce acetyl-CoA and acetate. The identification of cross-membrane sugar transporter genes further indicates its potential ability to consume carbohydrates preserved in the sediment under the microbial mat. Aerophobetes bacterium TCS1 therefore probably carries out saccharolytic and fermentative metabolism. The genes responsible for autotrophic synthesis of acetyl-CoA via the Wood-Ljungdahl pathway were also found in the genome. Phylogenetic study of the essential genes for the Wood-Ljungdahl pathway implied relative independence of the Aerophobetes bacterium from the known acetogens and methanogens. Compared with genomes of acetogenic bacteria, the Aerophobetes bacterium TCS1 genome lacks the genes involved in nitrogen metabolism, sulfur metabolism, signal transduction, and cell motility. The metabolic activities of TCS1 might depend on geochemical conditions such as supplies of CO2, hydrogen, and sugars, and therefore TCS1 might be a facultative bacterium in anaerobic saline sediments near cold seeps.
Abstract: Professor Vladimir B. Bajic, a world-renowned pioneer in bioinformatics and associate editor-in-chief of Genomics Proteomics & Bioinformatics (GPB) since 2015, passed away on 31 October 2019 in Jeddah, Saudi Arabia. Prof. Bajic joined the editorial board of GPB in 2012 and devoted his precious time and efforts to handling and reviewing manuscripts and to providing many valuable instructions and suggestions for improving the journal's quality and readership.
Funding: Supported in part by grants from the Office of Research Administration (ORA) at King Abdullah University of Science and Technology (KAUST) (Grant Nos. BAS/1/1624-01-01, FCC/1/1976-04-01, URF/1/4098-01-01, REI/1/0018-01-01, REI/1/4216-01-01, REI/1/4437-01-01, REI/1/4473-01-01, URF/1/4352-01-01, REI/1/4742-01-01, and URF/1/4663-01-01); supported in part by the National Natural Science Foundation of China (Grant No. 31970601); the Shenzhen Science and Technology Program (Grant No. KQTD20180411143432337); and the Shenzhen Key Laboratory of Gene Regulation and Systems Biology (Grant No. ZDSYS20200811144002008), China.
Abstract: The accurate annotation of transcription start sites (TSSs) and their usage is critical for the mechanistic understanding of gene regulation in different biological contexts. To this end, specific high-throughput experimental technologies have been developed to capture TSSs in a genome-wide manner, and various computational tools have also been developed for in silico prediction of TSSs solely based on genomic sequences. Most of these computational tools cast the problem as a binary classification task on a balanced dataset, thus producing large numbers of false positive predictions when applied on the genome scale. Here, we present DeeReCT-TSS, a deep learning-based method that is capable of identifying TSSs across the whole genome based on both DNA sequence and conventional RNA sequencing data. We show that by effectively incorporating these two sources of information, DeeReCT-TSS significantly outperforms other solely sequence-based methods on the precise annotation of TSSs used in different cell types. Furthermore, we develop a meta-learning-based extension for simultaneous TSS annotation on 10 cell types, which enables the identification of cell type-specific TSSs. Finally, we demonstrate the high precision of DeeReCT-TSS on two independent datasets by correlating our predicted TSSs with experimentally defined TSS chromatin states. The source code for DeeReCT-TSS is available at https://github.com/JoshuaChou2018/DeeReCT-TSS_release and https://ngdc.cncb.ac.cn/biocode/tools/BT007316.