In the current data-intensive era, the traditional hands-on method of conducting scientific research, in which related publications are explored to generate a testable hypothesis, is well on its way to becoming obsolete within a year or two. Analyzing the literature and data to generate hypotheses automatically might become the de facto approach for informing the core research efforts of those trying to keep up with the exponential growth of publications and datasets. Here, viewpoints are provided and discussed to aid understanding of the challenges of data-driven discovery.
Stochastic differential equations (SDEs) are mathematical models that are widely used to describe complex processes or phenomena perturbed by random noise from different sources. Identifying the SDEs that govern a system is often challenging because of the strong inherent stochasticity of the data and the complexity of the system's dynamics. The practical utility of existing parametric approaches for identifying SDEs is usually limited by insufficient data. This study presents a novel framework for identifying SDEs that leverages the sparse Bayesian learning (SBL) technique to search for a parsimonious, yet physically necessary, representation in the space of candidate basis functions. More importantly, we use the analytical tractability of SBL to develop an efficient formulation of the linear regression problem for SDE discovery that requires considerably less time-series data. The effectiveness of the proposed framework is demonstrated using real data on stock and oil prices, bearing variation, and wind speed, as well as simulated data on well-known stochastic dynamical systems, including the generalized Wiener process and the Langevin equation. The framework aims to assist specialists in extracting stochastic mathematical models from random phenomena in the natural sciences, economics, and engineering for analysis, prediction, and decision making.
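The regression view of SDE identification can be illustrated with a minimal sketch. It is not the SBL framework of the abstract: ordinary least squares on the candidate basis {1, x} stands in for sparse Bayesian learning, and the process parameters, step size, and function names below are all assumptions chosen for illustration.

```python
import math
import random

def simulate_langevin(theta, sigma, dt, n_steps, seed=0):
    """Euler-Maruyama path of dx = -theta*x dt + sigma dW (toy Langevin process)."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x.append(x[-1] - theta * x[-1] * dt + sigma * math.sqrt(dt) * noise)
    return x

def fit_drift_and_diffusion(x, dt):
    """Least-squares fit of the drift on the candidate basis {1, x}, plus a
    quadratic-variation estimate of the diffusion coefficient.
    Plain least squares is used here in place of SBL (an assumption)."""
    xs = x[:-1]
    increments = [b - a for a, b in zip(x[:-1], x[1:])]
    rates = [d / dt for d in increments]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_r = sum(rates) / n
    sxx = sum((v - mean_x) ** 2 for v in xs)
    sxr = sum((v - mean_x) * (r - mean_r) for v, r in zip(xs, rates))
    slope = sxr / sxx                      # coefficient of the basis function x
    intercept = mean_r - slope * mean_x    # coefficient of the basis function 1
    sigma_hat = math.sqrt(sum(d * d for d in increments) / (n * dt))
    return intercept, slope, sigma_hat

path = simulate_langevin(theta=1.0, sigma=0.5, dt=0.01, n_steps=20000)
intercept, slope, sigma_hat = fit_drift_and_diffusion(path, dt=0.01)
print(f"drift ~ {intercept:.3f} + ({slope:.3f}) x, diffusion ~ {sigma_hat:.3f}")
```

With enough data the slope should land near -theta, the intercept near zero, and the diffusion estimate near sigma. A sparse method such as SBL would additionally prune the near-zero intercept term automatically.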
This study integrates multiple sources of data (transaction data, policy text, and public opinion data) with visualization techniques (such as heat maps, time-series trend charts, and 3D building brochures) to construct an analysis framework for the Chengdu real estate market. Using an Adaptive Neuro-Fuzzy Inference System (ANFIS) prediction model, spatial analysis with a Geographic Information System (GIS), and interactive dashboards, the study reveals market differentiation, policy impacts, and changes in demand structure, thereby providing decision support for the government, enterprises, and homebuyers.
To address instability, or even loss of balance, in the orientation and attitude control of quadrotor unmanned aerial vehicles (QUAVs) under random disturbances, this paper proposes a distributed anti-disturbance, data-driven, event-triggered fusion control method that achieves efficient fault diagnosis while suppressing random disturbances and mitigating communication conflicts within the QUAV swarm. First, the impact of random disturbances on the UAV swarm is analyzed, and a model for the orientation and attitude control of QUAVs under stochastic perturbations is established, with the disturbance gain threshold determined. Second, a fault diagnosis system based on a high-gain observer is designed, and a fault gain criterion is constructed by integrating orientation and attitude information from the QUAVs. Subsequently, a model-free dynamic linearization-based data modeling (MFDLDM) framework is developed using model-free adaptive control; it efficiently fits the nonlinear control model of the QUAV swarm while relaxing temporal constraints on the control data. On this basis, the paper constructs a distributed data-driven event-triggered controller based on a staggered communication mechanism. The controller consists of an equivalent QUAV controller and an event-triggered controller, and it reduces communication conflicts while suppressing the influence of random interference. Finally, with random disturbances incorporated into the controller, comparative experiments and physical validations are conducted on QUAV platforms, demonstrating the strong adaptability and robustness of the proposed distributed event-triggered fault-tolerant control system.
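The event-triggered idea, transmitting only when it matters, can be sketched in a few lines. This is a generic send-on-delta trigger under assumed values (the sinusoidal signal, the threshold, and the function name are hypothetical); it is not the paper's controller or its staggered communication mechanism.

```python
import math

def event_triggered_stream(samples, threshold):
    """Return (held_values, send_count). A sample is transmitted only when it
    deviates from the last transmitted value by more than `threshold`;
    otherwise the receiver keeps holding the last transmitted value."""
    held = []
    last_sent = samples[0]   # the first sample is always transmitted
    send_count = 1
    for value in samples:
        if abs(value - last_sent) > threshold:
            last_sent = value
            send_count += 1
        held.append(last_sent)
    return held, send_count

# Hypothetical attitude signal sampled 100 times.
signal = [math.sin(0.1 * k) for k in range(100)]
held, sends = event_triggered_stream(signal, threshold=0.2)
max_error = max(abs(s - h) for s, h in zip(signal, held))
print(f"transmissions: {sends} of {len(signal)}, worst held-value error: {max_error:.3f}")
```

The trade-off is visible directly: a larger threshold means fewer transmissions (less channel contention in a swarm) at the cost of a larger bounded error between the true and the held value.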
Wetting deformation in earth-rockfill dams is a critical factor influencing dam safety. Although numerous mathematical models have been developed to describe this phenomenon, most rely on empirical formulations and lack prior knowledge of the model parameters, which is essential if Bayesian parameter inversion is to enhance accuracy and reduce uncertainty. This study introduces a data-driven approach to establishing prior knowledge for earth-rockfill dams. Driving factors are used to determine the potential range of the model parameters, and settlement changes within this range are calculated. The results are iteratively compared with actual monitoring data until the calculated range encompasses the observed data, thereby providing prior knowledge of the model parameters. The proposed method is applied to the right-bank earth-rockfill dam of Danjiangkou. With a Gibbs sample size of 30,000, the method effectively calibrates the prior knowledge of the wetting model parameters, achieving a root mean square error (RMSE) of 5.18 mm for the settlement predictions. By comparison, non-informative priors with sample sizes of 30,000 and 50,000 yield significantly larger RMSE values of 11.97 mm and 16.07 mm, respectively. Furthermore, the inversion takes 902 s for 30,000 samples with the proposed method, notably less than the 1026 s and 1558 s required with non-informative priors for 30,000 and 50,000 samples, respectively. These results show that the proposed method improves both prediction accuracy and computational efficiency, enabling optimal parameter identification with reduced computational effort. The approach provides a robust and efficient framework for advancing dam safety assessments.
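Why an informative prior helps can be seen in the simplest Bayesian setting. The sketch below uses a closed-form conjugate normal update for a single scalar parameter, not the Gibbs-sampled wetting model of the abstract. The settlement readings, prior settings, and noise level are hypothetical values chosen for illustration.

```python
def normal_posterior(prior_mean, prior_sd, observations, noise_sd):
    """Posterior mean and standard deviation of a scalar parameter under a
    normal prior and normal measurement noise with known standard deviation."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = len(observations) / noise_sd ** 2
    post_precision = prior_precision + data_precision
    sample_mean = sum(observations) / len(observations)
    post_mean = (prior_precision * prior_mean
                 + data_precision * sample_mean) / post_precision
    return post_mean, post_precision ** -0.5

# Hypothetical settlement readings in mm (not from the Danjiangkou dataset).
settlements_mm = [48.0, 52.0, 50.5, 49.5]

# Informative prior: range narrowed beforehand. Vague prior: almost flat.
informative = normal_posterior(50.0, 2.0, settlements_mm, noise_sd=5.0)
vague = normal_posterior(0.0, 1000.0, settlements_mm, noise_sd=5.0)
print("informative prior -> mean %.2f, sd %.2f" % informative)
print("vague prior       -> mean %.2f, sd %.2f" % vague)
```

Both posteriors centre near the data, but the informative prior yields a tighter posterior from the same four readings. In a sampling-based inversion, the analogous effect is that fewer samples are spent exploring implausible parameter regions.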
Identifying the community structure of complex networks is crucial to extracting insights and understanding network properties. Although several community detection methods have been proposed, many are unsuitable for social networks because of significant limitations. Specifically, most approaches depend mainly on user-user structural links while overlooking service-centric, semantic, and multi-attribute drivers of community formation, and they lack flexible filtering mechanisms for large-scale, service-oriented settings. Our proposed approach, called community discovery-based service (CDBS), leverages user profiles and their interactions with consulted web services. The method introduces a novel similarity measure, the global similarity interaction profile (GSIP), which goes beyond typical similarity measures by unifying user and service profiles for all attribute types into a coherent representation. It applies multiple filtering criteria related to user attributes, accessed services, and interaction patterns. Experimental comparisons against Louvain, Hierarchical Agglomerative Clustering, Label Propagation, and Infomap show that CDBS performs best, achieving 0.74 modularity, 0.13 conductance, 0.77 coverage, and a fast response time of 9.8 s, even with 10,000 users and 400 services. Moreover, CDBS consistently detects a larger number of communities with distinct topics of interest, underscoring its capacity to generate detailed and efficient structures in complex networks. These results confirm both the efficiency and the effectiveness of the proposed method. Beyond controlled evaluation, CDBS is applicable to targeted recommendations, group-oriented marketing, access control, and service personalization, where communities are shaped not only by user links but also by service engagement.
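For readers unfamiliar with the three quality scores quoted above, the following sketch computes modularity, coverage, and conductance on a toy graph using their standard definitions for undirected, unweighted graphs. The graph and partition are hypothetical, and nothing here reproduces CDBS or GSIP.

```python
def community_metrics(edges, communities):
    """Modularity, coverage, and per-community conductance for an undirected,
    unweighted graph given as an edge list and a list of node sets."""
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    modularity = 0.0
    intra_total = 0
    conductance = []
    for nodes in communities:
        intra = sum(1 for u, v in edges if u in nodes and v in nodes)
        cut = sum(1 for u, v in edges if (u in nodes) != (v in nodes))
        volume = sum(degree.get(n, 0) for n in nodes)
        # Fraction of edges inside the community minus the expected fraction.
        modularity += intra / m - (volume / (2 * m)) ** 2
        intra_total += intra
        # Boundary edges relative to the smaller side's total degree.
        conductance.append(cut / min(volume, 2 * m - volume))
    return modularity, intra_total / m, conductance

# Toy graph: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
q, coverage, cond = community_metrics(edges, [{0, 1, 2}, {3, 4, 5}])
print(f"modularity={q:.3f} coverage={coverage:.3f} conductance={cond}")
```

Higher modularity and coverage indicate denser communities, while lower conductance indicates fewer edges leaving each community, which is why the abstract reports a low conductance as a strength.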
Most Convolutional Neural Network (CNN) interpretation techniques visualize only the dominant cues that the model relies on, but there is no guarantee that these represent all the evidence the model uses for classification. This limitation becomes critical when hidden secondary cues, potentially more meaningful than the visualized ones, remain undiscovered. This study introduces CasCAM (Cascaded Class Activation Mapping) to address this limitation through counterfactual reasoning. By asking "if this dominant cue were absent, what other evidence would the model use?", CasCAM progressively masks the most salient features and systematically uncovers the hierarchy of classification evidence hidden beneath them. Experimental results demonstrate that CasCAM effectively discovers the full spectrum of reasoning evidence and can be applied with nine existing interpretation methods.
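The counterfactual masking loop can be illustrated without any neural network. In the sketch below, a toy "model" whose score is dominated by its strongest feature stands in for a CNN, and occlusion-based saliency stands in for class activation mapping. Both substitutions, and all names and values, are assumptions for illustration rather than the CasCAM implementation.

```python
def toy_model(features):
    """Stand-in classifier score: the strongest cue dominates the output."""
    return max(features)

def occlusion_saliency(features):
    """Drop in score when each feature is zeroed out, one at a time."""
    base = toy_model(features)
    return [base - toy_model(features[:i] + [0.0] + features[i + 1:])
            for i in range(len(features))]

def cascaded_evidence(features):
    """Repeatedly mask the most salient feature and record the order in which
    cues surface, stopping once the score reaches zero."""
    current = list(features)
    order = []
    while toy_model(current) > 0.0:
        saliency = occlusion_saliency(current)
        top = max(range(len(current)), key=saliency.__getitem__)
        order.append(top)
        current[top] = 0.0   # counterfactual: "what if this cue were absent?"
    return order

cues = [0.2, 0.9, 0.5, 0.7]
first_pass = occlusion_saliency(cues)
order = cascaded_evidence(cues)
print("single-pass saliency:", first_pass)
print("cascaded order of evidence:", order)
```

A single saliency pass credits only feature 1 and assigns zero to everything else, although features 3, 2, and 0 would each carry the decision once the stronger cues are removed. The cascade exposes that hidden ordering.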
Recent years have witnessed significant breakthroughs in new materials discovery brought about by artificial intelligence (AI). AI has been successfully applied to predicting formability, revealing properties, and guiding the experimental synthesis of materials. Rapid progress has been made through the integration of growing databases and improved computing power. Although some reviews present these developments from their own particular angles, few cover both how AI empowers the discovery of new materials and how it deepens the understanding of existing materials, treating the two as synergistic aspects. Here, the latest developments in AI-empowered materials research are systematically reviewed, reflecting advanced design of intelligent systems for the discovery, synthesis, prediction, and validation of materials. First, background and mechanisms are briefly introduced, after which the design of AI systems that include data, machine learning, and automated laboratories is illustrated. Next, strategies for obtaining AI systems for materials with improved performance are summarized, covering both in-depth understanding of existing materials and rapid discovery of new materials, and design ideas for future AI systems in materials science are pointed out. Finally, some perspectives are put forward.
Owing to the emergence of drug resistance and high morbidity, novel antiviral drugs with novel targets are urgently needed. Many marine-derived compounds possess potent antiviral activity and serve as a primary source for developing novel antiviral drugs, making the rapid discovery and evaluation of marine antiviral agents particularly crucial. Future research should therefore place greater emphasis on identifying novel antiviral targets by combining artificial intelligence (AI) with structural pharmacology, as well as on expanding marine resource and target databases.
Despite advances in current anti-cancer therapies, challenges such as drug resistance, toxicity, and tumor heterogeneity persist. The limitations of traditional single-target drugs and simple combination therapies are becoming increasingly apparent. To address these issues, a novel treatment strategy, the artificially intelligent synergistic engineered drug (AISED) paradigm, merits further exploration. The paradigm is based on the systematic, engineered integration of multiple active ingredients into a single unified entity through artificial intelligence (AI). The strategy aims to develop new anti-cancer drug designs involving multiple ingredients, multiple molecular targets, and multiple biological effects, for multiple cancer types, thereby providing a novel theoretical paradigm for overcoming existing treatment bottlenecks.
Purpose: The late Don R. Swanson was well appreciated during his lifetime as Dean of the Graduate Library School at the University of Chicago, as winner of the American Society for Information Science Award of Merit for 2000, and as author of many seminal articles. In this informal essay, I give my personal perspective on Don's contributions to science and outline some current and future directions in literature-based discovery that are rooted in concepts he developed. Design/methodology/approach: Personal recollections and literature review. Findings: The Swanson A-B-C model of literature-based discovery has been successfully used by laboratory investigators analyzing their findings and hypotheses. It continues to be a fertile area of research in a wide range of application areas, including text mining, drug repurposing, studies of scientific innovation, knowledge discovery in databases, and bioinformatics. Recently, additional modes of discovery that do not follow the A-B-C model have also been proposed and explored (e.g., so-called storytelling, gaps, analogies, link prediction, negative consensus, outliers, and revival of neglected or discarded research questions). Research limitations: This paper reflects the opinions of the author and is neither a comprehensive nor a technically based review of literature-based discovery. Practical implications: The general scientific public is still not aware of the availability of tools for literature-based discovery. Our Arrowsmith project site maintains a suite of discovery tools that are free and open to the public (http://arrowsmith.psych.uic.edu), as does BITOLA, which is maintained by Dmitar Hristovski (http://ibmi.mf.uni-lj.si/bitola), and Epiphanet, which is maintained by Trevor Cohen (http://epiphanet.uth.tme.edu/). Bringing user-friendly tools to the public should be a high priority: even more than advancing basic research in informatics, it is vital to ensure that scientists actually use discovery tools and that these tools actually help them make experimental discoveries in the lab and in the clinic. Originality/value: This paper discusses problems and issues that were inherent in Don's thoughts during his life, including those that have not yet been fully taken up and studied systematically.
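The A-B-C model lends itself to a compact sketch: if term A co-occurs with B in one set of papers and B co-occurs with C in another, but A and C never appear together, then A-C is a candidate hypothesis. The example below uses a tiny hypothetical corpus modeled on Swanson's well-known fish oil and Raynaud's syndrome case. The function name and documents are illustrative assumptions, not the Arrowsmith implementation.

```python
from collections import defaultdict

def abc_candidates(documents, a_term):
    """Rank C terms that co-occur with some B term linked to `a_term`
    but never co-occur with `a_term` directly."""
    neighbors = defaultdict(set)
    for terms in documents:
        for t in terms:
            neighbors[t] |= set(terms) - {t}
    b_terms = neighbors[a_term]
    bridges = defaultdict(set)
    for b in b_terms:
        for c in neighbors[b]:
            if c != a_term and c not in b_terms:
                bridges[c].add(b)
    # More distinct bridging B terms means stronger indirect support.
    return sorted(bridges.items(), key=lambda item: (-len(item[1]), item[0]))

# Hypothetical corpus: each set holds the index terms of one paper.
toy_corpus = [
    {"fish oil", "blood viscosity"},
    {"fish oil", "platelet aggregation"},
    {"blood viscosity", "raynaud syndrome"},
    {"platelet aggregation", "raynaud syndrome"},
    {"platelet aggregation", "aspirin"},
]
hypotheses = abc_candidates(toy_corpus, "fish oil")
for c_term, b_links in hypotheses:
    print(c_term, "via", sorted(b_links))
```

Ranking by the number of distinct B terms is only one simple heuristic. Practical tools also weight by term frequency and filter B terms by semantic type.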
Tauopathies, diseases characterized by neuropathological aggregates of tau, including Alzheimer's disease and subtypes of frontotemporal dementia, make up the vast majority of dementia cases. Although there have been recent developments in tauopathy biomarkers and disease-modifying treatments, ongoing progress is required to ensure these are effective, economical, and accessible for the globally ageing population. Continued identification of new potential drug targets and biomarkers is therefore critical. "Big data" studies, such as proteomics, can generate information on thousands of possible new targets for dementia diagnostics and therapeutics, but they currently remain underutilized because there is no clear process by which targets are selected for future drug development. In this review, we discuss current tauopathy biomarkers and therapeutics and highlight areas in need of improvement, particularly when addressing the needs of frail, comorbid, and cognitively impaired populations. We highlight biomarkers that have been developed from proteomic data and outline possible future directions in this field. We propose new criteria by which potential targets in proteomics studies can be objectively ranked as favorable for drug development, and we demonstrate their application to our group's recent tau interactome dataset as an example.
Mitigating vortex-induced vibrations (VIV) in flexible risers is a critical concern in offshore oil and gas production because of its potential impact on operational safety and efficiency. Accurately predicting the displacement and position of VIV in flexible risers remains challenging under actual marine conditions. This study presents a data-driven model for riser displacement prediction that corresponds to field conditions. Analysis of experimental data reveals that the XGBoost algorithm predicts the maximum displacement and its position more accurately than support vector regression (SVR), considering both computational efficiency and precision. Platform displacement in the Y-direction shows a significant positive correlation with both the axial depth and the magnitude of the maximum displacement. The displacement at the fourth point contributes most to the model's predictions, with a positive influence on the maximum displacement and a negative influence on the axial depth of the maximum displacement. Platform displacements in the X- and Y-directions have competing effects on both the riser's maximum displacement and its axial depth. By combining the XGBoost algorithm with SHapley Additive exPlanation (SHAP) analysis, the model effectively estimates the riser's maximum displacement and its location. The data-driven approach makes its predictions from a minimal set of readily available data points, which enhances its practicality in the field and its relevance to academic and professional communities.
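The attribution idea behind SHAP can be shown exactly on a model small enough to enumerate. The sketch below computes exact Shapley values by averaging marginal contributions over all feature orderings. It does not use the SHAP library or XGBoost, and the three-feature "riser" function, its coefficients, and the input values are hypothetical stand-ins for a trained model.

```python
from itertools import permutations

def exact_shapley(model, instance, baseline):
    """Exact Shapley values by averaging each feature's marginal contribution
    over all feature orderings (feasible only for a handful of features)."""
    n = len(instance)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for ordering in orderings:
        current = list(baseline)
        previous = model(current)
        for i in ordering:
            current[i] = instance[i]      # switch feature i from baseline to actual
            value = model(current)
            totals[i] += value - previous
            previous = value
    return [t / len(orderings) for t in totals]

def toy_riser_model(x):
    """Hypothetical response with an interaction between the two platform terms."""
    platform_x, platform_y, point4 = x
    return 2.0 * point4 + platform_x - 0.5 * platform_y + 0.5 * platform_x * platform_y

instance = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = exact_shapley(toy_riser_model, instance, baseline)
print("attributions (platform_x, platform_y, point4):", phi)
```

The attributions sum to the difference between the prediction and the baseline prediction, and the interaction term is split evenly between the two platform features. Tree-based explainers obtain the same kind of quantity far more efficiently than this brute-force enumeration.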
In the context of educational evaluation reform, this study explores the construction of a data-driven, evidence-based value-added evaluation system that aims to overcome the limitations of traditional evaluation methods. The research combines theoretical analysis with practical application and designs an evidence-based value-added evaluation framework whose core elements are a multi-source heterogeneous data acquisition and processing system, a value-added evaluation agent based on a large model, and an evaluation implementation and application mechanism. Empirical verification shows that the evaluation system is markedly effective in improving learning participation, promoting ability development, and supporting teaching decisions, and that it provides a theoretical reference and a practical path for educational evaluation reform in the new era. The research shows that, by accurately measuring differences in students' starting points and extent of development, the data-driven, evidence-based value-added evaluation system reflects students' actual progress more fairly and objectively and provides strong support for high-quality education development.
Transformer models have emerged as pivotal tools in drug discovery, distinguished by their unique architectural features and exceptional performance in managing intricate data landscapes. Leveraging the innate ability of transformer architectures to capture intricate hierarchical dependencies in sequential data, these models show remarkable efficacy across various tasks, including new drug design and drug target identification. The adaptability of pre-trained transformer-based models makes them indispensable assets for driving data-centric advances in drug discovery, chemistry, and biology, furnishing a robust framework that expedites innovation and discovery in these domains. Beyond their technical prowess, the success of transformer-based models in drug discovery, chemistry, and biology extends to their interdisciplinary potential: they combine biological, physical, chemical, and pharmacological insights to bridge gaps across diverse disciplines. This integrative approach not only enhances the depth and breadth of research but also fosters collaboration and the exchange of ideas among disparate fields. In our review, we describe the many applications of transformers in drug discovery, as well as in chemistry and biology, spanning protein design and protein engineering, molecular dynamics (MD), drug target identification, transformer-enabled drug virtual screening (VS), drug lead optimization, drug addiction, small data set challenges, chemical and biological image analysis, chemical language understanding, and single-cell data. Finally, we conclude the survey by discussing promising trends in transformer models within the context of drug discovery and other sciences.
In drug discovery, recent advances have paved the way for innovative approaches and methodologies. This comprehensive review comprises six distinct yet interrelated mini-reviews, each shedding light on novel strategies in drug development. (a) The resurgence of covalent drugs is highlighted, focusing on targeted covalent inhibitors (TCIs) and their role in enhancing selectivity and affinity. (b) The potential of the quantum mechanics-based computer-aided drug design (CADD) tool Cov_DOX is introduced for predicting protein-covalent ligand binding structures and affinities. (c) The scaffolding function of proteins is proposed as a new avenue for drug design, with a focus on modulating protein-protein interactions through small molecules and proteolysis targeting chimeras (PROTACs). (d) The concept of pro-PROTACs is explored as a promising strategy for cancer therapy, combining the principles of prodrugs and PROTACs to enhance specificity and reduce toxicity. (e) The design of prodrugs through carbon-carbon bond cleavage is discussed, offering a new perspective on the activation of drugs with limited modifiable functional groups. (f) The targeting of programmed cell death pathways in cancer therapies with small molecules is reviewed, emphasizing the induction of autophagy-dependent cell death, ferroptosis, and cuproptosis. These insights collectively contribute to a deeper understanding of the dynamic landscape of drug discovery.
We propose a method that integrates data-driven and mechanism models for well logging formation evaluation, focusing explicitly on predicting reservoir parameters such as porosity and water saturation. Accurate interpretation of these parameters is crucial for the effective exploration and development of oil and gas. However, as geological conditions in this industry grow more complex, demand for more accurate reservoir parameter prediction is increasing, which raises the cost of manual interpretation. Conventional logging interpretation methods rely on empirical relationships between logging data and reservoir parameters, and they suffer from low interpretation efficiency, strong subjectivity, and suitability only for ideal conditions. Applying artificial intelligence to the interpretation of logging data offers a new solution to the problems of traditional methods and is expected to improve both the accuracy and the efficiency of interpretation. Given large, high-quality datasets, data-driven models can reveal relationships of arbitrary complexity. Nevertheless, constructing sufficiently large logging datasets with reliable labels remains challenging, which makes it difficult to apply data-driven models effectively to logging data interpretation. Furthermore, data-driven models often act as "black boxes" that neither explain their predictions nor ensure compliance with basic physical constraints. This paper proposes a machine learning method with strong physical constraints that integrates mechanism and data-driven models. Prior knowledge of logging data interpretation is embedded into the machine learning model through its network structure, loss function, and optimization algorithm. We employ the Physically Informed Auto-Encoder (PIAE) to predict porosity and water saturation; it can be trained without labeled reservoir parameters using self-supervised learning. This approach achieves automated interpretation and facilitates generalization across diverse datasets.
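The label-free training idea can be sketched as follows: if a physics-based decoder maps reservoir parameters to logs, the parameters can be recovered by minimizing the error between decoded and observed logs, with no parameter labels required. The abstract does not state which physical relations PIAE embeds. The density mixing law and Archie's equation used below, their constants, and the grid search standing in for a trained neural encoder are all assumptions for illustration.

```python
import math

def physics_decoder(porosity, sw, rho_matrix=2.65, rho_fluid=1.0,
                    a=1.0, m=2.0, n=2.0, rw=0.05):
    """Forward model from reservoir parameters to two synthetic logs:
    bulk density (volumetric mixing) and deep resistivity (Archie's equation).
    All constants are assumed textbook-style values."""
    bulk_density = rho_matrix - porosity * (rho_matrix - rho_fluid)
    resistivity = a * rw / (porosity ** m * sw ** n)
    return bulk_density, resistivity

def reconstruction_loss(params, observed_logs):
    """Self-supervised objective: mismatch between decoded and observed logs."""
    rho_hat, rt_hat = physics_decoder(*params)
    rho_obs, rt_obs = observed_logs
    return (rho_hat - rho_obs) ** 2 + (math.log(rt_hat) - math.log(rt_obs)) ** 2

def invert_logs(observed_logs):
    """Grid search standing in for a trained encoder: pick the (porosity, Sw)
    pair whose decoded logs best match the observed ones."""
    grid = [(p / 100.0, s / 20.0) for p in range(1, 41) for s in range(1, 21)]
    return min(grid, key=lambda params: reconstruction_loss(params, observed_logs))

# Synthetic "measurement" generated from known parameters, then inverted.
observed = physics_decoder(0.20, 0.50)
porosity_hat, sw_hat = invert_logs(observed)
print(f"observed logs: density={observed[0]:.2f} g/cm3, resistivity={observed[1]:.2f} ohm-m")
print(f"recovered porosity={porosity_hat:.2f}, water saturation={sw_hat:.2f}")
```

In an auto-encoder setting, a neural encoder would replace the grid search and be trained across many depth samples with the same reconstruction loss, so that the physics in the decoder constrains what the latent variables can mean.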
Funding (sparse Bayesian learning study of SDE identification): Supported by the National Key Research and Development Program of China (2018YFB1701202), the National Natural Science Foundation of China (92167201 and 51975237), and the Fundamental Research Funds for the Central Universities, Huazhong University of Science and Technology (2021JYCXJJ028).
Funding (Chengdu real estate market study): Chengdu City Philosophy and Social Sciences Research Center, "Artificial Intelligence + Urban Communication" Theory and Application Research Center project "Chengdu real estate vertical market public opinion data visualization research" (Project No. RZCC2025017).
Funding (QUAV event-triggered control study): Supported in part by the National Natural Science Foundation of China (Grant/Award Number: 62003267), the Key Research and Development Program of Shaanxi Province (Grant/Award Number: 2023-GHZD-33), and the Open Project of the State Key Laboratory of Intelligent Game (Grant/Award Number: ZBKF-23-05).
Funding: Supported by the National Key R&D Program of China (Grant No. 2023YFC3209504), the Natural Science Foundation of Wuhan (Grant No. 2024040801020271), and the Fundamental Research Funds for Central Public Welfare Research Institutes (Grant No. CKSF2025718/YT).
Abstract: Wetting deformation in earth-rockfill dams is a critical factor influencing dam safety. Although numerous mathematical models have been developed to describe this phenomenon, most of them rely on empirical formulations and lack prior knowledge of model parameters, which is essential for Bayesian parameter inversion to enhance accuracy and reduce uncertainty. This study introduces a data-driven approach to establishing prior knowledge for earth-rockfill dams. Driving factors are used to determine the potential range of model parameters, and settlement changes within this range are calculated. The results are iteratively compared with actual monitoring data until the calculated range encompasses the observed data, thereby providing prior knowledge of the model parameters. The proposed method is applied to the right-bank earth-rockfill dam of Danjiangkou. With a Gibbs sample size of 30,000, the method effectively calibrates the prior knowledge of the wetting model parameters, achieving a root mean square error (RMSE) of 5.18 mm for the settlement predictions. By comparison, non-informative priors with sample sizes of 30,000 and 50,000 yield significantly larger RMSE values of 11.97 mm and 16.07 mm, respectively. The inversion computation time of the proposed method is 902 s for 30,000 samples, notably shorter than the 1026 s and 1558 s required for non-informative priors with 30,000 and 50,000 samples, respectively. These findings show that the proposed approach improves both prediction accuracy and computational efficiency, enabling optimal parameter identification with reduced computational effort. The approach provides a robust and efficient framework for advancing dam safety assessments.
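The iterative range check described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the scalar parameter, the toy forward model `toy_settlement`, the widening factor, and the rule that the envelope is spanned by the predictions at the two range endpoints (which only holds when the prediction is monotone in the parameter). It is not the authors' procedure, which derives the range from driving factors.

```python
def calibrate_prior_range(forward, observations, lo, hi, widen=1.5, max_iter=50):
    """Widen [lo, hi] about its centre until the predicted envelope covers the data.

    Assumes forward(param) returns one prediction per observation and that
    predictions are monotone in the parameter, so the endpoints bound the envelope.
    """
    for _ in range(max_iter):
        pred_lo, pred_hi = forward(lo), forward(hi)
        covered = all(
            min(a, b) <= obs <= max(a, b)
            for a, b, obs in zip(pred_lo, pred_hi, observations)
        )
        if covered:
            return lo, hi
        # Not covered yet: keep the centre and enlarge the half-width.
        centre, half = (lo + hi) / 2, (hi - lo) / 2 * widen
        lo, hi = centre - half, centre + half
    raise ValueError("range did not cover the observations within max_iter")


def toy_settlement(rate):
    """Hypothetical forward model: settlement grows linearly with time."""
    return [rate * t for t in (1.0, 2.0, 3.0)]


observed = [2.0, 4.1, 5.9]  # invented monitoring values, for illustration only
prior_lo, prior_hi = calibrate_prior_range(toy_settlement, observed, 0.9, 1.1)
```

The returned interval could then serve as the support of a prior in a Bayesian inversion; how the interval is turned into a distribution is left open here.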
Abstract: Identifying the community structure of complex networks is crucial to extracting insights and understanding network properties. Although several community detection methods have been proposed, many are unsuitable for social networks because of significant limitations. Specifically, most approaches depend mainly on user-user structural links while overlooking service-centric, semantic, and multi-attribute drivers of community formation, and they lack flexible filtering mechanisms for large-scale, service-oriented settings. Our proposed approach, called community discovery-based service (CDBS), leverages user profiles and their interactions with consulted web services. The method introduces a novel similarity measure, the global similarity interaction profile (GSIP), which goes beyond typical similarity measures by unifying user and service profiles for all attribute types into a coherent representation. It applies multiple filtering criteria related to user attributes, accessed services, and interaction patterns. Experimental comparisons against Louvain, Hierarchical Agglomerative Clustering, Label Propagation, and Infomap show that CDBS performs best, achieving 0.74 modularity, 0.13 conductance, 0.77 coverage, and a response time of 9.8 s, even with 10,000 users and 400 services. Moreover, CDBS consistently detects a larger number of communities with distinct topics of interest, underscoring its capacity to generate detailed and efficient structures in complex networks. These results confirm both the efficiency and the effectiveness of the proposed method. Beyond controlled evaluation, CDBS is applicable to targeted recommendations, group-oriented marketing, access control, and service personalization, where communities are shaped not only by user links but also by service engagement.
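The three partition-quality metrics reported above have standard definitions for undirected, unweighted graphs, and a small sketch may help in reading the numbers. The function name, the toy graph, and the choice to average conductance over communities are assumptions for illustration; the abstract does not state how the authors aggregate conductance.

```python
from collections import defaultdict


def partition_metrics(edges, communities):
    """Return (modularity, coverage, mean conductance) for an undirected graph.

    edges: list of (u, v) pairs; communities: list of disjoint node sets.
    Assumes every community and its complement have nonzero volume.
    """
    m = len(edges)
    label = {node: i for i, comm in enumerate(communities) for node in comm}
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    k = len(communities)
    internal, cut, volume = [0] * k, [0] * k, [0] * k
    for node, d in degree.items():
        volume[label[node]] += d          # sum of degrees inside each community
    for u, v in edges:
        cu, cv = label[u], label[v]
        if cu == cv:
            internal[cu] += 1             # edge inside one community
        else:
            cut[cu] += 1                  # edge crossing a community boundary
            cut[cv] += 1

    # Newman modularity: observed internal edge fraction minus its expectation.
    modularity = sum(internal[c] / m - (volume[c] / (2 * m)) ** 2 for c in range(k))
    coverage = sum(internal) / m
    conductance = [cut[c] / min(volume[c], 2 * m - volume[c]) for c in range(k)]
    return modularity, coverage, sum(conductance) / k


# Toy graph: two triangles joined by a single bridge edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
q, cov, cond = partition_metrics(toy_edges, [{0, 1, 2}, {3, 4, 5}])
```

Higher modularity and coverage and lower conductance indicate a cleaner partition, which is the direction of the values the abstract reports for CDBS.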
Funding: Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (RS-2023-00249743).
Abstract: Most Convolutional Neural Network (CNN) interpretation techniques visualize only the dominant cues that the model relies on, but there is no guarantee that these represent all the evidence the model uses for classification. This limitation becomes critical when hidden secondary cues, potentially more meaningful than the visualized ones, remain undiscovered. This study introduces CasCAM (Cascaded Class Activation Mapping) to address this limitation through counterfactual reasoning. By asking “if this dominant cue were absent, what other evidence would the model use?”, CasCAM progressively masks the most salient features and systematically uncovers the hierarchy of classification evidence hidden beneath them. Experimental results demonstrate that CasCAM effectively discovers the full spectrum of reasoning evidence and can be applied with nine existing interpretation methods.
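The progressive-masking idea can be sketched generically as follows. This is not the CasCAM implementation: `saliency_fn` is a hypothetical stand-in for any interpretation method that returns a per-pixel saliency map, and the zero-fill masking, the `top_fraction` parameter, and the toy saliency used in the demo are all assumptions.

```python
import numpy as np


def cascaded_maps(image, saliency_fn, steps=3, top_fraction=0.1):
    """Repeatedly hide the most salient pixels and recompute the saliency map.

    Returns the list of saliency maps (one per step) and the final hidden mask.
    """
    masked = image.astype(float).copy()
    hidden = np.zeros(image.shape, dtype=bool)
    n_mask = max(1, int(top_fraction * image.size))
    maps = []
    for _ in range(steps):
        sal = np.asarray(saliency_fn(masked), dtype=float)
        sal = np.where(hidden, -np.inf, sal)       # never re-select hidden pixels
        maps.append(sal)
        top = np.argsort(sal, axis=None)[-n_mask:]  # flat indices of the top pixels
        hidden[np.unravel_index(top, hidden.shape)] = True
        masked[hidden] = 0.0                        # counterfactual: cue removed
    return maps, hidden


# Toy demo: saliency is simply the pixel intensity (an assumption, not a real CAM).
toy_image = np.array([[9.0, 1.0], [5.0, 3.0]])
toy_maps, toy_hidden = cascaded_maps(toy_image, lambda x: x, steps=2, top_fraction=0.25)
```

In this toy run the first pass selects the brightest pixel and the second pass, with that pixel hidden, surfaces the next strongest cue, which mirrors the hierarchy of evidence the abstract describes.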
Funding: Supported by the Hong Kong Polytechnic University (Project Nos. 4-ZZW1, 4-YWER, 97D9, 4-W443).
Abstract: Recent years have witnessed significant breakthroughs in new materials discovery brought about by artificial intelligence (AI). AI has been successfully applied to predicting formability, revealing properties, and guiding the experimental synthesis of materials. Rapid progress has been driven by growing databases and improved computing power. Although some reviews present these developments from particular angles, few cover both aspects together: how AI empowers the discovery of new materials and the understanding of existing ones. Here, the latest developments in AI-empowered materials research are systematically reviewed, reflecting advanced design of intelligent systems for the discovery, synthesis, prediction, and validation of materials. First, the background and mechanisms are briefly introduced, after which the design of AI systems comprising data, machine learning, and automated laboratories is illustrated. Next, strategies for obtaining AI systems for materials with improved performance are summarized, covering both in-depth understanding of existing materials and rapid discovery of new ones, and design ideas for future AI systems in materials science are then pointed out. Finally, some perspectives are put forward.
Abstract: Owing to the emergence of drug resistance and high morbidity, novel antiviral drugs with novel targets are urgently needed. Many marine-derived compounds possess potent antiviral activity and serve as a primary source for developing novel antiviral drugs, making the rapid discovery and evaluation of marine antiviral agents particularly important. Future research should therefore place greater emphasis on identifying novel antiviral targets by combining artificial intelligence (AI) with structural pharmacology, as well as on expanding marine resource and target databases.
Abstract: Despite advances in current anti-cancer therapies, challenges such as drug resistance, toxicity, and tumor heterogeneity persist. The limitations of traditional single-target drugs and simple combination therapies are becoming increasingly apparent [1]. To address these issues, a novel treatment strategy, the artificially intelligent synergistic engineered drug (AISED) paradigm, merits further exploration. This paradigm is based on the systematic, engineered integration of multiple active ingredients into a single unified entity through artificial intelligence (AI). The strategy aims to develop new anti-cancer drug designs involving multiple ingredients, multiple molecular targets, and multiple biological effects for multiple cancer types, thereby providing a novel theoretical paradigm for overcoming existing treatment bottlenecks.
Funding: Supported by NIH grants R01LM010817 and P01AG039347.
Abstract: Purpose: The late Don R. Swanson was well appreciated during his lifetime as Dean of the Graduate Library School at University of Chicago, as winner of the American Society for Information Science Award of Merit for 2000, and as author of many seminal articles. In this informal essay, I will give my personal perspective on Don's contributions to science, and outline some current and future directions in literature-based discovery that are rooted in concepts that he developed. Design/methodology/approach: Personal recollections and literature review. Findings: The Swanson A-B-C model of literature-based discovery has been successfully used by laboratory investigators analyzing their findings and hypotheses. It continues to be a fertile area of research in a wide range of application areas including text mining, drug repurposing, studies of scientific innovation, knowledge discovery in databases, and bioinformatics. Recently, additional modes of discovery that do not follow the A-B-C model have also been proposed and explored (e.g. so-called storytelling, gaps, analogies, link prediction, negative consensus, outliers, and revival of neglected or discarded research questions). Research limitations: This paper reflects the opinions of the author and is neither a comprehensive nor a technically based review of literature-based discovery. Practical implications: The general scientific public is still not aware of the availability of tools for literature-based discovery. Our Arrowsmith project site maintains a suite of discovery tools that are free and open to the public (http://arrowsmith.psych.uic.edu), as does BITOLA, which is maintained by Dimitar Hristovski (http://ibmi.mf.uni-lj.si/bitola), and Epiphanet, which is maintained by Trevor Cohen (http://epiphanet.uth.tme.edu/).
Bringing user-friendly tools to the public should be a high priority, since even more than advancing basic research in informatics, it is vital that we ensure that scientists actually use discovery tools and that these are actually able to help them make experimental discoveries in the lab and in the clinic. Originality/value: This paper discusses problems and issues which were inherent in Don's thoughts during his life, including those which have not yet been fully taken up and studied systematically.
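The A-B-C model mentioned in this abstract links a starting concept A to a target concept C through intermediate concepts B that co-occur with each in the literature, where A and C themselves never co-occur. A minimal sketch follows; the function name and the toy records are hypothetical (the terms echo Swanson's well-known fish oil and Raynaud's syndrome example, but the records are invented), and real tools such as those named above rely on far richer term extraction and ranking.

```python
from collections import defaultdict
from itertools import combinations


def abc_candidates(records, a_term):
    """Find C terms linked to a_term only indirectly, via shared B terms.

    records: iterable of term sets, one per paper.
    Returns {c_term: set of bridging b_terms}.
    """
    cooccur = defaultdict(set)
    for terms in records:
        for x, y in combinations(sorted(set(terms)), 2):
            cooccur[x].add(y)
            cooccur[y].add(x)

    b_terms = set(cooccur[a_term])          # terms seen directly with A
    candidates = defaultdict(set)
    for b in b_terms:
        for c in cooccur[b]:
            # Keep C only if it is not A and has no direct link to A.
            if c != a_term and c not in b_terms:
                candidates[c].add(b)
    return dict(candidates)


# Invented toy records, for illustration only.
toy_records = [
    {"fish oil", "blood viscosity"},
    {"fish oil", "platelet aggregation"},
    {"blood viscosity", "Raynaud's syndrome"},
    {"platelet aggregation", "Raynaud's syndrome"},
]
links = abc_candidates(toy_records, "fish oil")
```

Each candidate C comes with the B terms that bridge it to A, which is the evidence an investigator would inspect before treating the link as a hypothesis worth testing.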
Funding: Supported by funding from the Bluesand Foundation and the Alzheimer's Association (AARG-21-852072 and the Blas Frangione Early Career Achievement Award) to ED, and by an Australian Government Research Training Program scholarship and the University of Sydney's Brain and Mind Centre fellowship to AH.
Abstract: Tauopathies, diseases characterized by neuropathological aggregates of tau, including Alzheimer's disease and subtypes of frontotemporal dementia, make up the vast majority of dementia cases. Although there have been recent developments in tauopathy biomarkers and disease-modifying treatments, ongoing progress is required to ensure these are effective, economical, and accessible for the globally ageing population. As such, continued identification of new potential drug targets and biomarkers is critical. "Big data" studies, such as proteomics, can generate information on thousands of possible new targets for dementia diagnostics and therapeutics, but currently remain underutilized because there is no clear process by which targets are selected for future drug development. In this review, we discuss current tauopathy biomarkers and therapeutics, and highlight areas in need of improvement, particularly when addressing the needs of frail, comorbid, and cognitively impaired populations. We highlight biomarkers that have been developed from proteomic data, and outline possible future directions in this field. We propose new criteria by which potential targets in proteomics studies can be objectively ranked as favorable for drug development, and demonstrate their application to our group's recent tau interactome dataset as an example.
Funding: The research work was financially supported by the National Natural Science Foundation of China (Grant Nos. 51979238 and 52301338) and the Sichuan Science and Technology Program (Grant Nos. 2023NSFSC1953 and 2023ZYD0140).
Abstract: Mitigating vortex-induced vibrations (VIV) in flexible risers is a critical concern in offshore oil and gas production because of its potential impact on operational safety and efficiency. Accurately predicting the displacement and position of VIV in flexible risers remains challenging under actual marine conditions. This study presents a data-driven model for riser displacement prediction that corresponds to field conditions. Experimental data analysis reveals that the XGBoost algorithm predicts the maximum displacement and its position with superior accuracy compared with support vector regression (SVR), considering both computational efficiency and precision. Platform displacement in the Y-direction shows a significant positive correlation with both axial depth and maximum displacement magnitude. The fourth point displacement contributes most to the model's predictions, with a positive influence on maximum displacement and a negative influence on the axial depth of maximum displacement. Platform displacements in the X- and Y-directions have competing effects on both the riser's maximum displacement and its axial depth. Using the XGBoost algorithm together with SHapley Additive exPlanation (SHAP) analysis, the model effectively estimates the riser's maximum displacement and its location. This data-driven approach achieves predictions from a small number of readily available data points, which enhances its practicality in the field.
Funding: This paper is a research result of the project "Research on Innovation of Evidence-Based Teaching Paradigm in Vocational Education under the Background of New Quality Productivity" (2024JXQ176) and of the Shandong Province Artificial Intelligence Education Research Project (SDDJ202501035), which explores the application of large artificial intelligence models in student value-added evaluation from an evidence-based perspective.
Abstract: In the context of educational evaluation reform, this study explores the construction of a data-driven, evidence-based value-added evaluation system, aiming to overcome the limitations of traditional evaluation methods. The research combines theoretical analysis with practical application and designs an evidence-based value-added evaluation framework whose core elements are a multi-source heterogeneous data acquisition and processing system, a value-added evaluation agent based on a large model, and an evaluation implementation and application mechanism. Empirical verification indicates that the evaluation system improves learning participation, promotes ability development, and supports teaching decision-making, and that it provides a theoretical reference and a practical path for educational evaluation reform in the new era. The research shows that the data-driven, evidence-based value-added evaluation system can reflect students' actual progress more fairly and objectively by accurately measuring differences in students' starting points and range of development, and can provide strong support for high-quality education development.
Funding: Supported in part by the National Institutes of Health (NIH), USA (Grant Nos.: R01GM126189, R01AI164266, and R35GM148196); the National Science Foundation, USA (Grant Nos.: DMS2052983, DMS-1761320, and IIS-1900473); the National Aeronautics and Space Administration (NASA), USA (Grant No.: 80NSSC21M0023); the Michigan State University (MSU) Foundation, USA; Bristol-Myers Squibb, USA (Grant No.: 65109); and Pfizer, USA. Also supported by the National Natural Science Foundation of China (Grant Nos.: 11971367, 12271416, and 11972266).
Abstract: Transformer models have emerged as pivotal tools in drug discovery, distinguished by their architectural features and strong performance on complex data. Because transformer architectures can capture intricate hierarchical dependencies in sequential data, these models are effective across various tasks, including new drug design and drug target identification. The adaptability of pre-trained transformer-based models makes them valuable for driving data-centric advances in drug discovery, chemistry, and biology, providing a robust framework that accelerates innovation and discovery in these domains. Beyond their technical strengths, the success of transformer-based models in drug discovery, chemistry, and biology extends to their interdisciplinary potential: they combine biological, physical, chemical, and pharmacological insights to bridge gaps across disciplines. This integrative approach not only enhances the depth and breadth of research but also fosters collaboration and the exchange of ideas among different fields. In our review, we describe the applications of transformers in drug discovery, as well as in chemistry and biology, spanning protein design and protein engineering, molecular dynamics (MD), drug target identification, transformer-enabled drug virtual screening (VS), drug lead optimization, drug addiction, small data set challenges, chemical and biological image analysis, chemical language understanding, and single-cell data. Finally, we conclude the survey by discussing promising trends in transformer models in the context of drug discovery and other sciences.
Funding: Supported by grants from the National Natural Science Foundation of China (No. 82273770) and the Foundation for Innovative Research Groups of the National Natural Science Foundation of Sichuan Province (No. 24NSFTD0051).
Abstract: Recent advances in drug discovery have paved the way for innovative approaches and methodologies. This comprehensive review comprises six distinct yet interrelated mini-reviews, each shedding light on novel strategies in drug development. (a) The resurgence of covalent drugs is highlighted, focusing on targeted covalent inhibitors (TCIs) and their role in enhancing selectivity and affinity. (b) The potential of Cov_DOX, a quantum mechanics-based computer-aided drug design (CADD) tool, is introduced for predicting protein-covalent ligand binding structures and affinities. (c) The scaffolding function of proteins is proposed as a new avenue for drug design, with a focus on modulating protein-protein interactions through small molecules and proteolysis targeting chimeras (PROTACs). (d) The concept of pro-PROTACs is explored as a promising strategy for cancer therapy, combining the principles of prodrugs and PROTACs to enhance specificity and reduce toxicity. (e) The design of prodrugs through carbon-carbon bond cleavage is discussed, offering a new perspective for the activation of drugs with limited modifiable functional groups. (f) The targeting of programmed cell death pathways in cancer therapies with small molecules is reviewed, emphasizing the induction of autophagy-dependent cell death, ferroptosis, and cuproptosis. These insights collectively contribute to a deeper understanding of the dynamic landscape of drug discovery.
Funding: Supported by the National Key Research and Development Program (2019YFA0708301), the National Natural Science Foundation of China (51974337), the Strategic Cooperation Projects of CNPC and CUPB (ZLZX2020-03), the Science and Technology Innovation Fund of CNPC (2021DQ02-0403), and the Open Fund of the Petroleum Exploration and Development Research Institute of CNPC (2022-KFKT-09).
Abstract: We propose a method that integrates data-driven and mechanism models for well logging formation evaluation, focusing on predicting reservoir parameters such as porosity and water saturation. Accurately interpreting these parameters is crucial for effectively exploring and developing oil and gas. However, as geological conditions in this industry grow more complex, the demand for more accurate reservoir parameter prediction is increasing, which raises the cost of manual interpretation. Conventional logging interpretation methods rely on empirical relationships between logging data and reservoir parameters; they suffer from low interpretation efficiency and strong subjectivity, and they are suitable only for ideal conditions. Applying artificial intelligence to the interpretation of logging data offers a new solution to these problems and is expected to improve both the accuracy and the efficiency of interpretation. If large, high-quality datasets exist, data-driven models can reveal relationships of arbitrary complexity. Nevertheless, constructing sufficiently large logging datasets with reliable labels remains challenging, making it difficult to apply data-driven models effectively in logging data interpretation. Furthermore, data-driven models often act as “black boxes” that neither explain their predictions nor ensure compliance with basic physical constraints. This paper proposes a machine learning method with strong physical constraints by integrating mechanism and data-driven models. Prior knowledge of logging data interpretation is embedded into the machine learning model through the network structure, loss function, and optimization algorithm. We employ the Physically Informed Auto-Encoder (PIAE) to predict porosity and water saturation; it can be trained without labeled reservoir parameters using self-supervised learning techniques. This approach achieves automated interpretation and facilitates generalization across diverse datasets.