Funding: Supported by the National Key R&D Program of China (No. 2021YFA1502300) and the National Natural Science Foundation of China (No. 21733007).
Abstract: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) relies on its central molecular machine, the RNA-dependent RNA polymerase (RdRp), for viral replication and transcription. Remdesivir in the template strand has been shown to effectively inhibit RNA synthesis by the SARS-CoV-2 RdRp by blocking not only incorporation of the complementary UTP but also the next nucleotide addition. However, the molecular mechanism underlying the second inhibitory point remains unclear. In this work, we performed molecular dynamics simulations and demonstrated that this inhibition does not act directly on nucleotide addition at the active site. Instead, translocation of Remdesivir from the +1 site to the -1 site is hindered thermodynamically, because the post-translocation state is less stable than the pre-translocation state owing to the motif B residue G683. Moreover, another conserved motif B residue, S682, further hinders the dynamic translocation of Remdesivir through a steric clash with the 1′-cyano substitution. Overall, our study unveils an alternative role of motif B in mediating translocation when Remdesivir is present in the template strand and complements our understanding of the inhibitory mechanisms exerted by Remdesivir on RNA synthesis by the SARS-CoV-2 RdRp.
Abstract: Musical rhythms are represented as sequences of symbols. The sequences may be composed of binary symbols denoting either silent or monophonic sounded pulses, or of ternary symbols denoting silent pulses and two types of sounded pulses: low-pitched (dum) and high-pitched (tak) sounds. Experiments are described that evaluate how well the many-to-many minimum-weight matching between two sequences serves as a measure of similarity that correlates with human judgements of rhythm similarity. This measure is also compared to the often-used edit distance and to the one-to-one minimum-weight matching. New results are reported from experiments performed with three widely different datasets of real-world and artificially generated musical rhythms (including Afro-Cuban rhythms), and compared with results previously reported for a dataset of Middle Eastern dum-tak rhythms.
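As a rough illustration of the edit-distance baseline mentioned in the abstract, the sketch below computes a unit-cost Levenshtein distance between two rhythms written as symbol strings. The encoding ('x' for a sounded pulse, '.' for a silent pulse), the two 16-pulse example patterns, and the function name `edit_distance` are assumptions made for this example; they are not taken from the study, and the many-to-many matching measure itself is not implemented here.

```python
def edit_distance(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance between two symbol sequences."""
    # prev[j] holds the distance between the first i-1 symbols of a
    # and the first j symbols of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if ca == cb else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


# Illustrative 16-pulse binary rhythms (assumed encoding, see lead-in).
rhythm_a = "x..x..x...x.x..."
rhythm_b = "x..x...x..x.x..."

if __name__ == "__main__":
    print(edit_distance(rhythm_a, rhythm_b))
```

The same function works unchanged for ternary sequences, for example with 'D' for dum and 'T' for tak, because it only compares symbols for equality.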
Funding: National Natural Science Foundation of China, Grant/Award Number: 32470720; Innovation Program of Chinese Academy of Agricultural Sciences, Grant/Award Number: CAASASTIP-2021-AGIS; Chinese University of Hong Kong, Grant/Award Numbers: 4937025, 4937026, 5501517, 5501329; Research Grants Council of the Hong Kong Special Administrative Region, China, Grant/Award Number: CUHK 24204023; Innovation and Technology Commission of the Hong Kong Special Administrative Region, China, Grant/Award Number: GHP/065/21SZ; RMGS in CUHK, Grant/Award Numbers: 8601603, 8601663; Shun Hing Institute of Advanced Engineering (SHIAE), Grant/Award Number: BME-p1-24; The King Abdullah University of Science and Technology Office of Research Administration, Grant/Award Numbers: REI/1/5234-01-01, REI/1/5289-01-01, REI/1/5404-01-01, REI/1/5414-01-01, REI/1/5992-01-01, URF/1/4663-01-01; The King Abdullah University of Science and Technology Center of Excellence for Smart Health (KCSH), Grant/Award Number: 5932; The King Abdullah University of Science and Technology Center of Excellence on Generative AI, Grant/Award Number: 5940.
Abstract: It is challenging to identify comorbidity patterns and mechanistically investigate disease associations based on health-related data that are often sparse, large-scale, and multimodal. Adopting a systems biology approach, embedding-based algorithms provide a new perspective for examining diseases under a unified framework by mapping diseases into a high-dimensional space as embedding vectors. These vectors, and the disease space they constitute, encode pathological information and enable a quantitative and systemic measurement of the similarity between any pair of diseases, opening up an avenue for numerous types of downstream analyses. Here, we exemplify the potential of this approach through applications in discovering hidden disease associations, assisting in genetic parameter estimation, facilitating data-driven disease classifications, and transforming genetic association studies of diseases in consideration of comorbidities. While underscoring the power and versatility of this approach, we also discuss the challenges posed by medical context, the requirements of online training and result validation, and research opportunities in constructing foundation models from multimodal disease data. With continued innovation and exploration, disease embedding has the potential to transform the fields of disease association analysis and even pathology studies by providing a holistic representation of patient health status.
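To make the idea of measuring similarity between any pair of diseases in an embedding space concrete, the following sketch ranks diseases by cosine similarity. Cosine similarity is one common choice and is an assumption here; the abstract does not name a specific metric. The three-dimensional vectors, the labels `disease_a` to `disease_c`, and the helper names are hypothetical placeholders, not data from the study.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def most_similar(query, embeddings):
    """Return the other labels sorted from most to least similar to `query`."""
    others = [name for name in embeddings if name != query]
    return sorted(
        others,
        key=lambda name: cosine_similarity(embeddings[query], embeddings[name]),
        reverse=True,
    )


# Hypothetical toy embeddings; real disease embeddings are learned from data.
toy_embeddings = {
    "disease_a": [0.9, 0.1, 0.0],
    "disease_b": [0.8, 0.2, 0.1],
    "disease_c": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    print(most_similar("disease_a", toy_embeddings))
```

Because every disease lives in the same vector space, the same pairwise score can feed clustering, nearest-neighbour lookup, or other downstream analyses.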
Funding: Supported by the National Key R&D Program of China (2022YFA1302701); the National Natural Science Foundation of China (32030056 to M.Y.; 32241031 and 32171195 to S.L.); the scientific project of Beijing Life Science Academy (2023300CA0090); the Tsinghua University Initiative Scientific Research Program (2023Z11DSZ001); and the King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR) under Award OSR-2020-CRG9-4352 and Office of Research Administration (ORA) under Award Nos. URF/1/4352-01-01, FCC/1/1976-44-01, FCC/1/1976-45-01, REI/1/5234-01-01, and REI/1/5414-01-01.
Abstract: Dear Editor, Pyruvate dehydrogenase complex (PDHc) is a large multienzyme assembly (Mr = 4–10 million Daltons) consisting of three essential components: pyruvate dehydrogenase (E1p), dihydrolipoyl transacetylase (E2p), and dihydrolipoyl dehydrogenase (E3). These three enzymes perform distinct functions sequentially to catalyze the oxidative decarboxylation of pyruvate with formation of nicotinamide adenine dinucleotide (NADH) and acetyl-coenzyme A (Patel and Roche, 1990).
Funding: Supported by the Air Force Office of Scientific Research under LRIR Project 17RQCOR504 under the management of Dr. Riq Parra; support provided by the AFOSR summer faculty program.
Abstract: High-intensity laser–plasma interactions produce a wide array of energetic particles and beams with promising applications. Unfortunately, the high repetition rate and high average power requirements of many applications are not satisfied by the lasers, optics, targets, and diagnostics currently employed. Here, we aim to address the need for high-repetition-rate targets and optics through the use of liquids. A novel nozzle assembly is used to generate high-velocity, laminar-flowing liquid microjets which are compatible with a low-vacuum environment, generate little to no debris, and exhibit precise positional and dimensional tolerances. Jets, droplets, submicron-thick sheets, and other exotic configurations are characterized with pump–probe shadowgraphy to evaluate their use as targets. To demonstrate a high-repetition-rate, consumable, liquid optical element, we present a plasma mirror created by a submicron-thick liquid sheet. This plasma mirror provides etalon-like anti-reflection properties in the low field of 0.1% and high reflectivity as a plasma, 69%, at a repetition rate of 1 kHz. Practical considerations of fluid compatibility, in-vacuum operation, and estimates of maximum repetition rate are addressed. The targets and optics presented here demonstrate a potential technique for enabling the operation of laser–plasma interactions at high repetition rates.
Abstract: Substantial research has been devoted to modelling the small-world phenomenon that arises in nature as well as in human society. Earlier work focused on the static properties of various small-world models. To examine the routing aspects, Kleinberg proposes a model based on a d-dimensional toroidal lattice with long-range links chosen at random according to the d-harmonic distribution. Kleinberg shows that, using only local information, the greedy routing algorithm delivers a message in O(lg^2 n) expected hops. We extend Kleinberg's small-world model by allowing each node x to have two more random links to nodes chosen uniformly at random within Manhattan distance (lg n)^(2/d) of x. Based on this extended model, we then propose an oblivious algorithm that can route messages between any two nodes in O(lg n) expected hops. Our routing algorithm keeps only O((lg n)^(β+1)) bits of information on each node, where 1 < β < 2, and is thus scalable w.r.t. the network size. To our knowledge, our result is the first to achieve the optimal routing complexity while still storing only a poly-logarithmic number of bits of information on each node in small-world networks.
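The greedy routing baseline attributed to Kleinberg in the abstract can be sketched as follows for the two-dimensional case (d = 2). This is a minimal illustration of the original model only: it does not implement the extended model or the oblivious O(lg n) algorithm proposed in the paper. The function names, the grid side `n`, and the choice of one long-range link per node are assumptions made for the example.

```python
import random


def torus_distance(a, b, n):
    """Manhattan distance between grid points a and b on an n x n torus."""
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    return min(dx, n - dx) + min(dy, n - dy)


def long_range_contact(u, n, rng):
    """Sample one long-range contact of u with probability proportional to
    distance^-2, i.e. the d-harmonic distribution for d = 2."""
    nodes = [(x, y) for x in range(n) for y in range(n) if (x, y) != u]
    weights = [torus_distance(u, v, n) ** -2 for v in nodes]
    return rng.choices(nodes, weights=weights, k=1)[0]


def greedy_route(src, dst, n, rng):
    """Forward the message to whichever known contact is closest to dst.

    Returns the number of hops. Long-range links are sampled lazily; greedy
    routing never revisits a node, because the distance to dst strictly
    decreases at every hop, so this matches fixing all links in advance.
    """
    current, hops = src, 0
    while current != dst:
        x, y = current
        contacts = [
            ((x + 1) % n, y),
            ((x - 1) % n, y),
            (x, (y + 1) % n),
            (x, (y - 1) % n),
            long_range_contact(current, n, rng),
        ]
        current = min(contacts, key=lambda v: torus_distance(v, dst, n))
        hops += 1
    return hops


if __name__ == "__main__":
    rng = random.Random(0)
    print(greedy_route((0, 0), (5, 5), 12, rng))
```

Each step uses only the current node's own contacts and the target's coordinates, which is the sense in which the routing relies on local information. The hop count can never exceed the lattice distance between source and target, since a lattice neighbour one step closer always exists.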
Funding: Supported by the National Natural Science Foundation of China (Nos. 62072435, 82130055, 32271297, and 32370657) and the National Key Research and Development Program of China (No. 2020YFA0907000).
Abstract: Accurate prediction of peptide spectra is crucial for improving the efficiency and reliability of proteomic analysis, as well as for gaining insight into various biological processes. In this study, we introduce Deep MS Simulator (DMSS), a novel attention-based model tailored for forecasting theoretical spectra in mass spectrometry. DMSS has undergone rigorous validation through a series of experiments, consistently demonstrating superior performance compared to current methods in forecasting theoretical spectra. The superior ability of DMSS to distinguish extremely similar peptides highlights the potential of incorporating our predicted intensity information into mass spectrometry search engines to enhance the accuracy of protein identification. These findings contribute to the advancement of proteomics analysis and highlight the potential of DMSS as a valuable tool in the field.
Funding: Supported in part by grants from the Office of Research Administration (ORA) at King Abdullah University of Science and Technology (KAUST) (Grant Nos. BAS/1/1624-01-01, FCC/1/197604-01, URF/1/4098-01-01, REI/1/0018-01-01, REI/1/4216-0101, REI/1/4437-01-01, REI/1/4473-01-01, URF/1/4352-01-01, REI/1/4742-01-01, and URF/1/4663-01-01); supported in part by the National Natural Science Foundation of China (Grant No. 31970601), the Shenzhen Science and Technology Program (Grant No. KQTD20180411143432337), and the Shenzhen Key Laboratory of Gene Regulation and Systems Biology (Grant No. ZDSYS20200811144002008), China.
Abstract: The accurate annotation of transcription start sites (TSSs) and their usage is critical for the mechanistic understanding of gene regulation in different biological contexts. To this end, specific high-throughput experimental technologies have been developed to capture TSSs in a genome-wide manner, and various computational tools have also been developed for in silico prediction of TSSs based solely on genomic sequences. Most of these computational tools cast the problem as a binary classification task on a balanced dataset, which results in a large number of false positive predictions when they are applied on the genome scale. Here, we present DeeReCT-TSS, a deep learning-based method that is capable of identifying TSSs across the whole genome based on both DNA sequence and conventional RNA sequencing data. We show that, by effectively incorporating these two sources of information, DeeReCT-TSS significantly outperforms other solely sequence-based methods on the precise annotation of TSSs used in different cell types. Furthermore, we develop a meta-learning-based extension for simultaneous TSS annotation on 10 cell types, which enables the identification of cell type-specific TSSs. Finally, we demonstrate the high precision of DeeReCT-TSS on two independent datasets by correlating our predicted TSSs with experimentally defined TSS chromatin states. The source code for DeeReCT-TSS is available at https://github.com/JoshuaChou2018/DeeReCT-TSS_release and https://ngdc.cncb.ac.cn/biocode/tools/BT007316.
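The point about false positives at genome scale follows from basic class-imbalance arithmetic, which the sketch below works through. All numbers (95% sensitivity, 95% specificity, and a true-site fraction of 1 in 10,000 positions) are hypothetical values chosen for illustration; they are not results from the study, and the function name `precision` is likewise an assumption.

```python
def precision(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Fraction of positive predictions that are true positives.

    prevalence is the fraction of candidate positions that are real positives.
    """
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Balanced benchmark: half of the examples are positives.
    print(precision(0.95, 0.95, 0.5))
    # Genome-scale scan with the same classifier and a hypothetical
    # prevalence of one true site per 10,000 positions.
    print(precision(0.95, 0.95, 1e-4))
```

With these assumed values, a classifier that looks accurate on a balanced benchmark yields mostly false positives once nearly all scanned positions are negatives, which is the failure mode the abstract describes for purely sequence-based tools.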