This paper used a standardized methodology to identify prognostic biomarkers in hepatocellular carcinoma (HCC) by analyzing transcriptomic and clinical data from The Cancer Genome Atlas (TCGA) database. The approach, which included stringent data preprocessing, differential gene expression analysis, and Kaplan-Meier survival analysis, provided insights into the genetic underpinnings of HCC. Analysis of a dataset of 370 HCC patients uncovered correlations between survival status and pathological characteristics, including tumor size, lymph node involvement, and distant metastasis. The processed transcriptome dataset, comprising 420 samples and 26,783 annotated genes, served as the basis for identifying differential gene expression patterns. Among the significantly differentially expressed genes, FBXO43, HAGLROS, CRISPLD1, LRRC3.DT, and ERN2 showed significant associations with patient survival outcomes, indicating their potential as novel prognostic biomarkers. This study not only enhances the understanding of HCC's genetic landscape but also establishes a blueprint for a standardized process to discover prognostic biomarkers of various diseases using genetic big data. Future research should focus on validating these biomarkers in independent cohorts and exploring their utility in developing personalized treatment strategies.
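Discourse-based English teaching in the upper grades of primary school has three pain points: students' thinking stays superficial, question design is fragmented, and older editions of the textbook are hard to adapt. Taking the teaching of Story time in Unit 4 Then and now of the Yilin edition English textbook (Grade 6, Volume 1) as an example, the teacher builds on the textbook text to construct a five-step closed loop: "setting the questions before class, introducing the question chain at the start of class, working through the chain during class, extending the chain after class, and evaluating the chain throughout". Using big questions to draw the main thread and small questions to build the steps can activate students' intrinsic motivation for discourse learning and shift English teaching from "knowledge transmission" to "competency cultivation".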
The discovery of novel materials with desired properties is essential to the advancement of energy-related technologies. Despite the rapid development of computational infrastructures and theoretical approaches, progress so far has been limited by the empirical and serial nature of experimental work. Fortunately, the situation is changing thanks to the maturation of theoretical tools such as density functional theory, high-throughput screening, crystal structure prediction, and emerging approaches based on machine learning. Together, these recent innovations in computational chemistry, data informatics, and machine learning have acted as catalysts for revolutionizing material design and will hopefully lead to faster kinetics in the development of energy-related industries. This report reviews recent advances in material discovery methods for energy devices. Three paradigms, based on empiricism-driven experiments, database-driven high-throughput screening, and data informatics-driven machine learning, are discussed critically. The key methodological advancements involved are reviewed, including high-throughput screening, crystal structure prediction, and generative models for target material design. Their applications in energy-related devices such as batteries, catalysts, and photovoltaics are selectively showcased.
The security of the seed industry is crucial for ensuring national food security. Currently, developed countries in Europe and America, along with international seed industry giants, have entered the Breeding 4.0 era, which integrates biotechnology, artificial intelligence (AI), and big data information technology. In contrast, China is still in a transition period between stages 2.0 and 3.0, relying primarily on conventional selection and molecular breeding. In an increasingly complex international situation, accurately identifying the core issues in China's seed industry innovation and seizing the frontier of international seed technology are strategically important for ensuring food security and revitalizing the seed industry. This paper systematically analyzes the characteristics of crop breeding data from artificial selection to intelligent design breeding. It explores the applications and development trends of AI and big data in modern crop breeding from several key perspectives: high-throughput phenotype acquisition and analysis, construction of multi-omics big data databases and management systems, AI-based integrated multi-omics analysis, and the development of intelligent breeding software tools based on biological big data and AI technology. Based on an in-depth analysis of the current status and challenges of China's seed industry technology development, we propose strategic goals and key tasks for China's new generation of AI and big data-driven intelligent design breeding. These suggestions aim to accelerate the development of an intelligence-driven crop breeding engineering system that features large-scale gene mining, efficient gene manipulation, engineered variety design, and systematized biobreeding. This study provides a theoretical basis and practical guidance for the development of China's seed industry technology.
Research into metamorphism plays a pivotal role in reconstructing the evolution of continents, particularly through the study of ancient rocks that are highly susceptible to metamorphic alteration due to multiple tectonic events. In the big data era, the establishment of new data platforms and the application of big data methods have become a focus of metamorphic rock research. Significant progress has been made in creating specialized databases, compiling comprehensive datasets, and using data analytics to address complex scientific questions. However, many existing databases do not meet the specific requirements of metamorphic research, and a substantial amount of valuable data remains uncollected. Constructing new databases that can keep pace with the data era is therefore necessary. This article provides an extensive review of existing databases related to metamorphic rocks and discusses data-driven studies in this field. Accordingly, it identifies several crucial factors to consider when establishing specialized metamorphic databases, with the aim of leveraging data-driven applications to achieve broader scientific objectives in metamorphic research.
On October 18, 2017, the 19th National Congress Report called for the implementation of the Healthy China Strategy. The development of biomedical data plays a pivotal role in advancing this strategy. Since the 18th National Congress of the Communist Party of China, China has vigorously promoted the integration and implementation of the Healthy China and Digital China strategies. The National Health Commission has prioritized the development of health and medical big data, issuing policies to promote standardized applications and foster innovation in "Internet+Healthcare." Biomedical data has contributed significantly to precision medicine, personalized health management, drug development, disease diagnosis, public health monitoring, and epidemic prediction capabilities.
Transformer models have emerged as pivotal tools in drug discovery, distinguished by their architectural features and strong performance on intricate data. Leveraging the capacity of transformer architectures to capture hierarchical dependencies in sequential data, these models show remarkable efficacy across various tasks, including new drug design and drug target identification. The adaptability of pre-trained transformer-based models makes them indispensable for driving data-centric advances in drug discovery, chemistry, and biology, furnishing a robust framework that expedites innovation and discovery within these domains. Beyond their technical strengths, the success of transformer-based models in drug discovery, chemistry, and biology extends to their interdisciplinary potential, combining biological, physical, chemical, and pharmacological insights to bridge gaps across diverse disciplines. This integrative approach not only enhances the depth and breadth of research but also fosters collaboration and the exchange of ideas among disparate fields. In our review, we describe the applications of transformers in drug discovery, as well as in chemistry and biology, spanning protein design and protein engineering, molecular dynamics (MD), drug target identification, transformer-enabled drug virtual screening (VS), drug lead optimization, drug addiction, small data set challenges, chemical and biological image analysis, chemical language understanding, and single-cell data. Finally, we conclude the survey by discussing promising trends for transformer models in drug discovery and other sciences.
This paper addresses urban sustainability challenges amid global urbanization, emphasizing the need for innovative approaches aligned with the Sustainable Development Goals. While traditional tools and linear models offer insights, they fall short of presenting a holistic view of complex urban challenges. System dynamics (SD) models, which are often used to provide a holistic, systematic understanding of a research subject such as the urban system, emerge as valuable tools, but data scarcity and theoretical inadequacy pose challenges. The research reviews papers on SD model applications in urban sustainability published since 2018, categorizing them according to nine key indicators. Among the reviewed papers, data limitations and model assumptions were identified as major challenges in applying SD models to urban sustainability. This led to exploring the transformative potential of big data analytics, an approach this study found to be rare in the field, to strengthen the empirical foundation of SD models. Integrating big data could provide data-driven calibration, potentially improving predictive accuracy and reducing reliance on simplified assumptions. The paper concludes by advocating new approaches that reduce assumptions and promote models applicable in real time, contributing to a comprehensive understanding of urban sustainability through the synergy of big data and SD models.
Deep-time Earth research plays a pivotal role in deciphering the rates, patterns, and mechanisms of Earth's evolutionary processes throughout geological history, providing essential scientific foundations for climate prediction, natural resource exploration, and sustainable planetary stewardship. To advance deep-time Earth research in the era of big data and artificial intelligence, the International Union of Geological Sciences initiated the "Deeptime Digital Earth International Big Science Program" (DDE) in 2019. At the core of this program lies the development of geoscience knowledge graphs, which serve as a knowledge infrastructure that enables the integration, sharing, mining, and analysis of heterogeneous geoscience big data. The DDE knowledge graph initiative has made significant strides in three critical dimensions: (1) establishing a unified knowledge structure across geoscience disciplines that ensures consistent representation of geological entities and their interrelationships through standardized ontologies and semantic frameworks; (2) developing a robust and scalable software infrastructure capable of supporting both expert-driven and machine-assisted knowledge engineering for large-scale graph construction and management; (3) implementing a three-tiered architecture encompassing basic, discipline-specific, and application-oriented knowledge graphs, spanning approximately 20 geoscience disciplines. Through its open knowledge framework and international collaborative network, this initiative has fostered multinational research collaborations, establishing a foundation for next-generation geoscience research while moving the discipline toward FAIR (Findable, Accessible, Interoperable, Reusable) data practices in deep-time Earth systems research.
Well logging technology has accumulated a large amount of historical data through four generations of technological development, which forms the basis of well logging big data and digital assets. However, these data have not been well stored, managed, or mined for their value. The development of cloud computing technology provides a rare opportunity for a logging big data private cloud. The traditional petrophysical evaluation and interpretation model has encountered great challenges in the face of new evaluation objects, and research on solutions that integrate distributed storage, processing, and learning functions for logging big data in a private cloud has not yet been carried out. This work aims to establish a distributed logging big data private cloud platform centered on a unified learning model, which achieves distributed storage and processing of logging big data and facilitates the learning of novel knowledge patterns via a unified logging learning model that integrates physical simulation and data models in a large-scale functional space, thus resolving the geo-engineering evaluation problem of geothermal fields. Based on the research idea of "logging big data cloud platform, unified logging learning model, large function space, knowledge learning and discovery, application", the theoretical foundation of the unified learning model, the cloud platform architecture, data storage and learning algorithms, computing power allocation and platform monitoring, platform stability, and data security are analyzed. The designed logging big data cloud platform realizes parallel distributed storage and processing of data and learning algorithms. The feasibility of constructing a well logging big data cloud platform based on a unified learning model of physics and data is analyzed in terms of the structure, ecology, management, and security of the cloud platform. The case study shows that the logging big data cloud platform has clear technical advantages over traditional logging evaluation methods in terms of knowledge discovery method, sharing of data, software, and results, accuracy, speed, and complexity.
Viral infectious diseases, characterized by their intricate nature and wide-ranging diversity, pose substantial challenges for data management. The vast volume of data generated by these diseases, spanning from molecular mechanisms within cells to large-scale epidemiological patterns, has surpassed the capabilities of traditional analytical methods. In the era of artificial intelligence (AI) and big data, these analytical methods urgently need to be optimized to handle and use the information more effectively. Despite the rapid accumulation of data associated with viral infections, the lack of a comprehensive framework for integrating, selecting, and analyzing these datasets has left many researchers uncertain about which data to select, how to access it, and how to use it most effectively in their research. This review aims to fill these gaps by exploring the multifaceted nature of viral infectious diseases and summarizing relevant data across multiple levels, from the molecular details of pathogens to broad epidemiological trends. The scope extends from the micro-scale to the macro-scale, encompassing pathogens, hosts, and vectors. In addition to summarizing data, this review investigates various dataset sources. It also traces the historical evolution of data collection in the field of viral infectious diseases, highlighting the progress achieved over time, and evaluates the current limitations that impede data utilization. Furthermore, we propose strategies to overcome these challenges, focusing on the development and application of advanced computational techniques, AI-driven models, and enhanced data integration practices. By providing a comprehensive synthesis of existing knowledge, this review is designed to guide future research and contribute to more informed approaches to the surveillance, prevention, and control of viral infectious diseases, particularly in the context of the expanding big data landscape.
This study examines the Big Data Collection and Preprocessing course at Anhui Institute of Information Engineering, implementing a hybrid teaching reform using the Bosi Smart Learning Platform. The proposed hybrid model follows a "three-stage" and "two-subject" framework, incorporating a structured design for teaching content and assessment methods before, during, and after class. Practical results indicate that this approach significantly enhances teaching effectiveness and improves students' learning autonomy.
The modernization and internationalization of traditional Chinese medicine (TCM) have long been constrained by the "black box" problem of its complex compositional system and unclear mechanisms of action. Target discovery, as a core step in revealing the principles of drug action, is key to promoting TCM's transition from "empirical medicine" to "precision medicine". In recent years, the rapid development of chemical biology technologies has provided powerful tools to address this challenge. This article focuses on the latest progress in applying chemical biology strategies, such as molecular probes, click chemistry, fluorescent labeling, and photo-crosslinking microarrays, in TCM target identification research. Drawing on typical case studies such as Sapanone A and Eupalinolide B, it elaborates on how these technologies can precisely identify the direct targets of active TCM components, thereby achieving comprehensive mechanism analysis from cells and animals to clinical samples. Furthermore, this article discusses novel "supramolecular drugs" formed by the self-assembly of TCM components at the nanoscale and their unique biological effects. It also constructs a preliminary modern scientific framework, centered on target distribution and regulation, for interpreting TCM theories such as "property-flavor-channel tropism" and "processing theory". Finally, this article proposes that the "chemical biology of TCM," as a key driver for discovering original drug targets derived from TCM theory, can offer a novel paradigm for innovative drug discovery and contribute significantly to the modernization and scientific elucidation of TCM theory.
This paper studies in depth the status quo and development trends of the big health industry against the background of digitalization. It aims to explore ways to promote the transformation, upgrading, and sustainable development of the big health industry through technological innovation and policy guidance, and to provide a useful reference for the government, enterprises, universities, and other stakeholders in promoting the high-quality development of the big health industry in the digital era.
RETRACTION: P. Goyal and R. Malviya, "Challenges and Opportunities of Big Data Analytics in Healthcare," Health Care Science 2, no. 5 (2023): 328-338, https://doi.org/10.1002/hcs2.66. The above article, published online on 4 October 2023 in Wiley Online Library (wileyonlinelibrary.com), has been retracted by agreement between the journal Editor-in-Chief, Zongjiu Zhang; Tsinghua University Press; and John Wiley & Sons Ltd.
Pesticides play a pivotal role in modern agriculture. However, the pesticide industry faces significant challenges closely linked to major global concerns such as pesticide resistance, environmental pollution, food safety, and crop yields. Developing safe, efficient, and environmentally friendly pesticides has become a key challenge for the industry. Recently, Qing Yang and colleagues unveiled the mode of action of a dual-functional protein, the ABCH transporter, which plays essential roles in lipid transport to construct the lipid barrier of insect cuticles and in pesticide detoxification within insects. Since ABCH transporters are critical for all insects but absent in mammals and plants, this elegant and exciting work provides a highly promising target for developing safe, low-resistance pesticides. Here, we highlight the groundbreaking discoveries made by Qing Yang's team in unraveling the intricate mechanisms of the ABCH transporter.
Objectives: This study aimed to develop and validate a stroke risk prediction model based on machine learning (ML) and regional healthcare big data, and to determine whether it improves prediction performance compared with the conventional Logistic Regression (LR) model. Methods: This retrospective cohort study analyzed data from the CHinese Electronic health Records Research in Yinzhou (CHERRY) (2015–2021). We included adults aged 18–75 from the platform who had established records before 2015. Individuals with pre-existing stroke, missing key data, or excessive missingness (>30%) were excluded. Data on demographics, clinical measures, lifestyle factors, comorbidities, and family history of stroke were collected. Variable selection was performed in two stages: an initial screening via univariate analysis, followed by prioritization of variables based on clinical relevance and actionability, with a focus on those that are modifiable. Stroke prediction models were developed using LR and four ML algorithms: Decision Tree (DT), Random Forest (RF), eXtreme Gradient Boosting (XGBoost), and Back Propagation Neural Network (BPNN). The dataset was split 7:3 into training and validation sets. Performance was assessed using receiver operating characteristic (ROC) curves, calibration, and confusion matrices, and the cutoff value for classifying risk groups was determined by Youden's index. Results: The study cohort comprised 92,172 participants with 436 incident stroke cases (incidence rate: 474/100,000 person-years). Ultimately, 13 predictor variables were included. RF achieved the highest accuracy (0.935), precision (0.923), sensitivity (recall: 0.947), and F1 score (0.935). Model evaluation demonstrated superior predictive performance of the ML algorithms over conventional LR, with training/validation area under the curve (AUC) values of 0.777/0.779 (LR), 0.921/0.918 (BPNN), 0.988/0.980 (RF), 0.980/0.955 (DT), and 0.962/0.958 (XGBoost). Calibration analysis revealed a better fit for DT, LR, and BPNN than for the RF and XGBoost models. Based on the RF model, which performed best, the factors ranked in descending order of importance were: hypertension, age, diabetes, systolic blood pressure, waist, high-density lipoprotein cholesterol, fasting blood glucose, physical activity, BMI, low-density lipoprotein cholesterol, total cholesterol, dietary habits, and family history of stroke. Using Youden's index as the optimal cutoff, the RF model stratified individuals into high-risk (>0.789) and low-risk (≤0.789) groups with robust discrimination. Conclusions: The ML-based prediction models demonstrated superior performance metrics compared with conventional LR, and RF was the optimal prediction model, providing an effective tool for risk stratification in primary stroke prevention in community settings.
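The stroke-prediction abstract above picks its risk cutoff with Youden's index and reports an F1 score alongside precision and recall. As a minimal sketch of those two calculations (not the authors' code; the function names and the toy labels and scores below are assumptions made up for illustration), Youden's index J = sensitivity + specificity - 1 can be maximized over candidate thresholds, and F1 can be recomputed from the reported precision and recall:

```python
from typing import List, Tuple


def youden_cutoff(y_true: List[int], scores: List[float]) -> Tuple[float, float]:
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A sample is classified as high-risk when score > threshold, which
    mirrors the abstract's ">cutoff" / "<=cutoff" grouping.
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    best_t, best_j = 0.0, -1.0
    for t in sorted(set(scores)):  # every observed score is a candidate
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s > t)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s <= t)
        j = tp / pos + tn / neg - 1.0
        if j > best_j:  # strict '>' keeps the lowest threshold on ties
            best_t, best_j = t, j
    return best_t, best_j


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data, NOT taken from the study.
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
toy_scores = [0.10, 0.20, 0.35, 0.40, 0.55, 0.70, 0.80, 0.90]

print(youden_cutoff(toy_labels, toy_scores))  # (0.35, 0.75)
# Precision and recall as reported for RF in the abstract.
print(round(f1_score(0.923, 0.947), 3))  # 0.935
```

The second print reproduces the F1 score of 0.935 that the abstract reports for RF. The cutoff of 0.789 in the abstract cannot be reproduced here, because it depends on the study's own predicted probabilities.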
Anemoside B4 (AB4), a triterpenoidal saponin derived from Pulsatilla chinensis, has garnered considerable attention for its potent anti-inflammatory and immunomodulatory activities, culminating in its approval for clinical trials by the Center for Drug Evaluation, National Medical Products Administration, for the treatment of mild to moderate ulcerative colitis. Despite this, AB4's therapeutic potential remained underexplored until the development of its injection formulation. This review discusses the scientific rationale and theoretical framework behind AB4's development, offering a new paradigm and research strategy for discovering lead compounds or drug candidates from natural medicines. In-depth investigations into AB4's cellular targets, biochemical pathways, and administration routes have provided valuable insights into its druggability and clinical potential. The high water solubility of AB4, attributable to its multiple sugar units, limits its bioavailability and pharmacokinetic profile. To address this, structural modifications via chemical methods and enzymatic hydrolysis have been employed, resulting in derivatives with reduced molecular weight, improved bioavailability, enhanced pharmacological activity, and greater clinical potential. These advances lay a solid foundation for the continued development of AB4 and its derivatives as promising therapeutic agents.
As times change, China's science and technology have entered a period of rapid development. The economic structure is changing as well, and in the process the established logistics industry in Haikou faces new impacts and challenges. To stand out in fierce market competition, related enterprises must optimize and upgrade the current state of industry development, promote the integrated development of Haikou's logistics and manufacturing industries, and continuously promote the innovative application of digital technology in both industries, forming a multi-force economic development model. Starting from the development status of Haikou logistics, this paper analyzes the importance of integrating Haikou's logistics and manufacturing industries under big data and discusses in depth the path to that integration, hoping to contribute to social and economic development.
Funding (HCC prognostic biomarker study): the 2023 Inner Mongolia Public Institution High-Level Talent Introduction Scientific Research Support Project, with start-up funding from Linyi Vocational College.
Funding: partially supported by the Construction of Collaborative Innovation Center of Beijing Academy of Agricultural and Forestry Sciences (KJCX20240406), the Beijing Natural Science Foundation (JQ24037), the National Natural Science Foundation of China (32330075), and the Earmarked Fund for China Agriculture Research System (CARS-02 and CARS-54).
Abstract: The security of the seed industry is crucial for ensuring national food security. Currently, developed countries in Europe and America, along with international seed industry giants, have entered the Breeding 4.0 era. This era integrates biotechnology, artificial intelligence (AI), and big data information technology. In contrast, China is still in a transition period between stages 2.0 and 3.0, which primarily relies on conventional selection and molecular breeding. In the context of increasingly complex international situations, accurately identifying core issues in China's seed industry innovation and seizing the frontier of international seed technology are strategically important. These efforts are essential for ensuring food security and revitalizing the seed industry. This paper systematically analyzes the characteristics of crop breeding data from artificial selection to intelligent design breeding. It explores the applications and development trends of AI and big data in modern crop breeding from several key perspectives. These include high-throughput phenotype acquisition and analysis, multiomics big data database and management system construction, AI-based multiomics integrated analysis, and the development of intelligent breeding software tools based on biological big data and AI technology. Based on an in-depth analysis of the current status and challenges of China's seed industry technology development, we propose strategic goals and key tasks for China's new generation of AI and big data-driven intelligent design breeding. These suggestions aim to accelerate the development of an intelligent-driven crop breeding engineering system that features large-scale gene mining, efficient gene manipulation, engineered variety design, and systematized biobreeding. This study provides a theoretical basis and practical guidance for the development of China's seed industry technology.
Funding: funded by the National Natural Science Foundation of China (No. 42220104008).
Abstract: Research into metamorphism plays a pivotal role in reconstructing the evolution of continents, particularly through the study of ancient rocks that are highly susceptible to metamorphic alteration due to multiple tectonic activities. In the big data era, the establishment of new data platforms and the application of big data methods have become a focus of metamorphic rock research. Significant progress has been made in creating specialized databases, compiling comprehensive datasets, and utilizing data analytics to address complex scientific questions. However, many existing databases do not meet the specific requirements of metamorphic research, with the result that a substantial amount of valuable data remains uncollected. Therefore, it is necessary to construct new databases that can keep pace with the data era. This article provides an extensive review of existing databases related to metamorphic rocks and discusses data-driven studies in this field. Accordingly, several crucial factors that need to be taken into consideration in the establishment of specialized metamorphic databases are identified, aiming to leverage data-driven applications to achieve broader scientific objectives in metamorphic research.
Abstract: On October 18, 2017, the 19th National Congress Report called for the implementation of the Healthy China Strategy. The development of biomedical data plays a pivotal role in advancing this strategy. Since the 18th National Congress of the Communist Party of China, China has vigorously promoted the integration and implementation of the Healthy China and Digital China strategies. The National Health Commission has prioritized the development of health and medical big data, issuing policies to promote standardized applications and foster innovation in "Internet+Healthcare." Biomedical data has significantly contributed to precision medicine, personalized health management, drug development, disease diagnosis, public health monitoring, and epidemic prediction capabilities.
Funding: supported in part by the National Institutes of Health (NIH), USA (Grant Nos.: R01GM126189, R01AI164266, and R35GM148196); the National Science Foundation, USA (Grant Nos.: DMS2052983, DMS-1761320, and IIS-1900473); the National Aeronautics and Space Administration (NASA), USA (Grant No.: 80NSSC21M0023); the Michigan State University (MSU) Foundation, USA; Bristol-Myers Squibb, USA (Grant No.: 65109); and Pfizer, USA. Also supported by the National Natural Science Foundation of China (Grant Nos.: 11971367, 12271416, and 11972266).
Abstract: Transformer models have emerged as pivotal tools within the realm of drug discovery, distinguished by their unique architectural features and exceptional performance in managing intricate data landscapes. Leveraging the innate capabilities of transformer architectures to comprehend intricate hierarchical dependencies inherent in sequential data, these models showcase remarkable efficacy across various tasks, including new drug design and drug target identification. The adaptability of pre-trained transformer-based models renders them indispensable assets for driving data-centric advancements in drug discovery, chemistry, and biology, furnishing a robust framework that expedites innovation and discovery within these domains. Beyond their technical prowess, the success of transformer-based models in drug discovery, chemistry, and biology extends to their interdisciplinary potential, seamlessly combining biological, physical, chemical, and pharmacological insights to bridge gaps across diverse disciplines. This integrative approach not only enhances the depth and breadth of research endeavors but also fosters synergistic collaborations and exchange of ideas among disparate fields. In our review, we elucidate the myriad applications of transformers in drug discovery, as well as chemistry and biology, spanning from protein design and protein engineering to molecular dynamics (MD), drug target identification, transformer-enabled drug virtual screening (VS), drug lead optimization, drug addiction, small data set challenges, chemical and biological image analysis, chemical language understanding, and single cell data. Finally, we conclude the survey by deliberating on promising trends in transformer models within the context of drug discovery and other sciences.
Funding: sponsored by the U.S. Department of Housing and Urban Development (Grant No. NJLTS0027-22). The opinions expressed in this study are the authors' alone, and do not represent the U.S. Department of HUD's opinions.
Abstract: This paper addresses urban sustainability challenges amid global urbanization, emphasizing the need for innovative approaches aligned with the Sustainable Development Goals. While traditional tools and linear models offer insights, they fall short in presenting a holistic view of complex urban challenges. System dynamics (SD) models, which are often utilized to provide a holistic, systematic understanding of a research subject such as the urban system, emerge as valuable tools, but data scarcity and theoretical inadequacy pose challenges. The research reviews relevant papers on recent SD model applications in urban sustainability since 2018, categorizing them based on nine key indicators. Among the reviewed papers, data limitations and model assumptions were identified as major challenges in applying SD models to urban sustainability. This led to exploring the transformative potential of big data analytics, a rare approach in this field as identified by this study, to enhance SD models' empirical foundation. Integrating big data could provide data-driven calibration, potentially improving predictive accuracy and reducing reliance on simplified assumptions. The paper concludes by advocating for new approaches that reduce assumptions and promote real-time applicable models, contributing to a comprehensive understanding of urban sustainability through the synergy of big data and SD models.
Funding: Strategic Priority Research Program of the Chinese Academy of Sciences, No. XDB0740000; National Key Research and Development Program of China, No. 2022YFB3904200, No. 2022YFF0711601; Key Project of Innovation LREIS, No. PI009; National Natural Science Foundation of China, No. 42471503.
Abstract: Deep-time Earth research plays a pivotal role in deciphering the rates, patterns, and mechanisms of Earth's evolutionary processes throughout geological history, providing essential scientific foundations for climate prediction, natural resource exploration, and sustainable planetary stewardship. To advance Deep-time Earth research in the era of big data and artificial intelligence, the International Union of Geological Sciences initiated the "Deeptime Digital Earth International Big Science Program" (DDE) in 2019. At the core of this ambitious program lies the development of geoscience knowledge graphs, serving as a transformative knowledge infrastructure that enables the integration, sharing, mining, and analysis of heterogeneous geoscience big data. The DDE knowledge graph initiative has made significant strides in three critical dimensions: (1) establishing a unified knowledge structure across geoscience disciplines that ensures consistent representation of geological entities and their interrelationships through standardized ontologies and semantic frameworks; (2) developing a robust and scalable software infrastructure capable of supporting both expert-driven and machine-assisted knowledge engineering for large-scale graph construction and management; (3) implementing a comprehensive three-tiered architecture encompassing basic, discipline-specific, and application-oriented knowledge graphs, spanning approximately 20 geoscience disciplines. Through its open knowledge framework and international collaborative network, this initiative has fostered multinational research collaborations, establishing a robust foundation for next-generation geoscience research while propelling the discipline toward FAIR (Findable, Accessible, Interoperable, Reusable) data practices in deep-time Earth systems research.
Funding: supported by Grant PLN2022-14 of the State Key Laboratory of Oil and Gas Reservoir Geology and Exploitation (Southwest Petroleum University).
Abstract: Well logging technology has accumulated a large amount of historical data through four generations of technological development, which forms the basis of well logging big data and digital assets. However, these data have not been well stored, managed, or mined for their value. The development of cloud computing technology provides a rare opportunity for a logging big data private cloud. The traditional petrophysical evaluation and interpretation model has encountered great challenges in the face of new evaluation objects. Research on solutions that integrate distributed storage, processing, and learning functions for logging big data in a private cloud has not yet been carried out. This study aims to establish a distributed logging big data private cloud platform centered on a unified learning model, which achieves distributed storage and processing of logging big data and facilitates the learning of novel knowledge patterns via a unified logging learning model that integrates physical simulation and data models in a large-scale functional space, thus resolving the geo-engineering evaluation problem of geothermal fields. Based on the research idea of "logging big data cloud platform - unified logging learning model - large function space - knowledge learning & discovery - application", the theoretical foundation of the unified learning model, cloud platform architecture, data storage and learning algorithms, computing power allocation and platform monitoring, platform stability, data security, and related topics have been analyzed. The designed logging big data cloud platform realizes parallel distributed storage and processing of data and learning algorithms. The feasibility of constructing a well logging big data cloud platform based on a unified learning model of physics and data is analyzed in terms of the structure, ecology, management, and security of the cloud platform. The case study shows that the logging big data cloud platform has obvious technical advantages over traditional logging evaluation methods in terms of knowledge discovery method, data, software and results sharing, accuracy, speed, and complexity.
Funding: supported by the National Natural Science Foundation of China (32370703), the CAMS Innovation Fund for Medical Sciences (CIFMS) (2022-I2M-1-021, 2021-I2M-1-061), and the Major Project of Guangzhou National Laboratory (GZNL2024A01015).
Abstract: Viral infectious diseases, characterized by their intricate nature and wide-ranging diversity, pose substantial challenges in the domain of data management. The vast volume of data generated by these diseases, spanning from the molecular mechanisms within cells to large-scale epidemiological patterns, has surpassed the capabilities of traditional analytical methods. In the era of artificial intelligence (AI) and big data, there is an urgent need to optimize these analytical methods to handle and utilize the information more effectively. Despite the rapid accumulation of data associated with viral infections, the lack of a comprehensive framework for integrating, selecting, and analyzing these datasets has left numerous researchers uncertain about which data to select, how to access it, and how to utilize it most effectively in their research. This review endeavors to fill these gaps by exploring the multifaceted nature of viral infectious diseases and summarizing relevant data across multiple levels, from the molecular details of pathogens to broad epidemiological trends. The scope extends from the micro-scale to the macro-scale, encompassing pathogens, hosts, and vectors. In addition to data summarization, this review thoroughly investigates various dataset sources. It also traces the historical evolution of data collection in the field of viral infectious diseases, highlighting the progress achieved over time. Simultaneously, it evaluates the current limitations that impede data utilization. Furthermore, we propose strategies to surmount these challenges, focusing on the development and application of advanced computational techniques, AI-driven models, and enhanced data integration practices. By providing a comprehensive synthesis of existing knowledge, this review is designed to guide future research and contribute to more informed approaches in the surveillance, prevention, and control of viral infectious diseases, particularly within the context of the expanding big-data landscape.
Funding: 2024 Anqing Normal University University-Level Key Project (ZK2024062D).
Abstract: This study examines the Big Data Collection and Preprocessing course at Anhui Institute of Information Engineering, implementing a hybrid teaching reform using the Bosi Smart Learning Platform. The proposed hybrid model follows a "three-stage" and "two-subject" framework, incorporating a structured design for teaching content and assessment methods before, during, and after class. Practical results indicate that this approach significantly enhances teaching effectiveness and improves students' learning autonomy.
Funding: supported by the Natural Science Foundation of Shandong Province (No. ZR2023ZD25) and the Taishan Scholars Project in Shandong Province (Nos. tstp20230633 and tsqn202408246).
Abstract: The modernization and internationalization of traditional Chinese medicine (TCM) have long been constrained by the "black box" problem of its complex compositional system and unclear mechanisms of action. Target discovery, as a core step in revealing drug action principles, is key to promoting TCM's transition from "empirical medicine" to "precision medicine". In recent years, the rapid development of chemical biology technologies has provided powerful tools to address this challenge. This article focuses on the latest progress in applying chemical biology strategies, such as molecular probes, click chemistry, fluorescent labeling, and photo-crosslinking microarrays, in TCM target identification research. Drawing on typical case studies such as Sapanone A and Eupalinolide B, it elaborates on how these cutting-edge technologies can precisely identify the direct targets of active TCM components, thereby achieving comprehensive mechanism analysis from cells and animals to clinical samples. Furthermore, this article prospectively discusses novel "supramolecular drugs" formed by the self-assembly of TCM components at the nanoscale and their unique biological effects. It also preliminarily constructs a modern scientific interpretation framework for TCM theories like "property-flavor-channel tropism" and "processing theory", centered around target distribution and regulation. Finally, this article proposes that "chemical biology of TCM," as a key driver for discovering original drug targets derived from TCM theory, offers a novel paradigm for innovative drug discovery and contributes significantly to the modernization and scientific elucidation of TCM theory.
Abstract: This paper presents an in-depth study of the current status and development trends of the big health industry against the background of digitalization. It aims to explore ways to promote the transformation, upgrading, and sustainable development of the big health industry through technological innovation and policy guidance, and to provide valuable references for the government, enterprises, universities, and other stakeholders in promoting the high-quality development of the big health industry in the digital era.
Abstract: RETRACTION: P. Goyal and R. Malviya, "Challenges and Opportunities of Big Data Analytics in Healthcare," Health Care Science 2, no. 5 (2023): 328-338, https://doi.org/10.1002/hcs2.66. The above article, published online on 4 October 2023 in Wiley Online Library (wileyonlinelibrary.com), has been retracted by agreement between the journal Editor-in-Chief, Zongjiu Zhang; Tsinghua University Press; and John Wiley & Sons Ltd.
Funding: National Natural Science Foundation of China (32471265).
Abstract: Pesticides play a pivotal role in modern agriculture. However, the pesticide industry faces significant challenges closely linked to major global concerns such as pesticide resistance, environmental pollution, food safety, and crop yields. Developing safe, efficient, and environmentally friendly pesticides has become a key challenge for the industry. Recently, Qing Yang and colleagues unveiled the mode of action of a dual-functional protein, the ABCH transporter, which plays essential roles in lipid transport to construct the lipid barrier of insect cuticles and in pesticide detoxification within insects. Since ABCH transporters are critical for all insects but absent in mammals and plants, this elegant and exciting work provides a highly promising target for developing safe, low-resistance pesticides. Here, we highlight the groundbreaking discoveries made by Qing Yang's team in unraveling the intricate mechanisms of the ABCH transporter.
Funding: funded by the Beijing Natural Science Foundation-Haidian Original Innovation Joint Fund (Grant No. L222103) and the National Natural Science Foundation of China (Grant No. 72174012).
Abstract: Objectives: This study aimed to develop and validate a stroke risk prediction model based on machine learning (ML) and regional healthcare big data, and to determine whether it may improve prediction performance compared with the conventional Logistic Regression (LR) model. Methods: This retrospective cohort study analyzed data from the CHinese Electronic health Records Research in Yinzhou (CHERRY) (2015–2021). We included adults aged 18–75 from the platform who had established records before 2015. Individuals with pre-existing stroke, key data absence, or excessive missingness (>30%) were excluded. Data on demographics, clinical measures, lifestyle factors, comorbidities, and family history of stroke were collected. Variable selection was performed in two stages: an initial screening via univariate analysis, followed by a prioritization of variables based on clinical relevance and actionability, with a focus on those that are modifiable. Stroke prediction models were developed using LR and four ML algorithms: Decision Tree (DT), Random Forest (RF), eXtreme Gradient Boosting (XGBoost), and Back Propagation Neural Network (BPNN). The dataset was split 7:3 into training and validation sets. Performance was assessed using receiver operating characteristic (ROC) curves, calibration, and confusion matrices, and the cutoff value was determined by Youden's index to classify risk groups. Results: The study cohort comprised 92,172 participants with 436 incident stroke cases (incidence rate: 474/100,000 person-years). Ultimately, 13 predictor variables were included. RF achieved the highest accuracy (0.935), precision (0.923), sensitivity (recall: 0.947), and F1 score (0.935). Model evaluation demonstrated superior predictive performance of ML algorithms over conventional LR, with training/validation areas under the curve (AUCs) of 0.777/0.779 (LR), 0.921/0.918 (BPNN), 0.988/0.980 (RF), 0.980/0.955 (DT), and 0.962/0.958 (XGBoost). Calibration analysis revealed a better fit for DT, LR, and BPNN compared to the RF and XGBoost models. Based on the optimal performance of the RF model, the ranking of factors in descending order of importance was: hypertension, age, diabetes, systolic blood pressure, waist, high-density lipoprotein cholesterol, fasting blood glucose, physical activity, BMI, low-density lipoprotein cholesterol, total cholesterol, dietary habits, and family history of stroke. Using Youden's index as the optimal cutoff, the RF model stratified individuals into high-risk (>0.789) and low-risk (≤0.789) groups with robust discrimination. Conclusions: The ML-based prediction models demonstrated superior performance metrics compared to conventional LR, and RF is the optimal prediction model, providing an effective tool for risk stratification in primary stroke prevention in community settings.
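To make the cutoff step in this abstract concrete, here is a minimal sketch of choosing a risk threshold by Youden's index, J = sensitivity + specificity - 1, together with a check of the reported F1 score against the reported precision and recall. The function names and the toy scores are assumptions for illustration; this is not the study's code, and the sketch assumes both classes are present in the labels.

```python
def youden_cutoff(scores, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A sample counts as high-risk when score > threshold, matching the
    ">cutoff is high-risk" convention used in the abstract.
    Assumes labels contain at least one 1 and at least one 0.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    best_thr, best_j = None, -1.0
    for thr in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s > thr and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s <= thr and y == 0)
        j = tp / pos + tn / neg - 1.0
        if j > best_j:
            best_thr, best_j = thr, j
    return best_thr, best_j


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical predicted risks and outcomes, NOT data from the CHERRY cohort.
    scores = [0.1, 0.2, 0.3, 0.8, 0.9, 0.4]
    labels = [0, 0, 0, 1, 1, 1]
    print(youden_cutoff(scores, labels))
    # Precision 0.923 and recall 0.947 are the RF values reported in the abstract.
    print(round(f1_score(0.923, 0.947), 3))
```

The F1 value computed from the reported precision and recall rounds to 0.935, which agrees with the F1 score given for RF.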
Funding: supported by the National Natural Science Foundation of China (82341087, 82073912, and 81903896) and a project funded by the Priority Academic Program Development (PAPD) of Jiangsu Higher Education Institutions.
Abstract: Anemoside B4 (AB4), a triterpenoidal saponin derived from Pulsatilla chinensis, has garnered considerable attention for its potent anti-inflammatory and immunomodulatory activities, culminating in its approval for clinical trials by the Center for Drug Evaluation, National Medical Products Administration, for the treatment of mild to moderate ulcerative colitis. Despite this, AB4's therapeutic potential remained underexplored until the development of its injection formulation. This review discusses the scientific rationale and theoretical framework behind AB4's development, offering a new paradigm and innovative research strategy for discovering lead compounds or drug candidates from natural medicines. In-depth investigations into AB4's cellular targets, biochemical pathways, and administration routes have provided valuable insights into its druggability evaluation and clinical potential. The high water solubility of AB4, attributable to its multiple sugar units, imposes limitations on its bioavailability and pharmacokinetic profile. To address this, structural modification via chemical methods and enzymatic hydrolysis has been employed, resulting in derivatives with reduced molecular weight, improved bioavailability, enhanced pharmacological activity, and greater clinical potential. These advances lay a solid foundation for the continued development of AB4 and its derivatives as promising therapeutic agents.
Funding: Research on the Digital Transformation of Financial Management Major and the Training Model of Outstanding Talents (2023122203988); Research on the Integration of Haikou Logistics and Manufacturing Driven by Big Data and Its Consumption Promotion Effect (HKKY2024-ZD-24).
Abstract: China's science and technology have entered a period of rapid development, and the country's economic structure is changing with it. As a result, the established logistics industry in Haikou faces new impacts and challenges. To stand out in fierce market competition, the enterprises concerned must optimize and upgrade the current state of the industry, promote the integrated development of Haikou's logistics and manufacturing industries, and continually advance the innovative application of digital technology in both, forming an economic development model driven by multiple forces. This paper starts from the development status of Haikou logistics, analyzes the importance of integrating Haikou's logistics and manufacturing industries in a big data-driven context, and discusses in depth the path to that integration, in the hope of contributing to social and economic development.