Dear Editor, Health management is essential to ensure battery performance and safety, and data-driven learning systems are a promising solution for efficient state of health (SoH) estimation of lithium-ion (Li-ion) batteries. However, time-consuming signal data acquisition and the lack of model interpretability still hinder their efficient deployment. Motivated by this, this letter proposes a novel and interpretable data-driven learning strategy that combines the benefits of explainable AI and non-destructive ultrasonic detection for battery SoH estimation. Specifically, after the battery is equipped with an advanced ultrasonic sensor to enable fast real-time ultrasonic signal measurement, an interpretable data-driven learning strategy named generalized additive neural decision ensemble (GANDE) is designed to rapidly estimate battery SoH and explain the effects of the ultrasonic features of interest.
For unachievable tracking problems, where the system output cannot precisely track a given reference, achieving the best possible approximation of the reference trajectory becomes the objective. This study investigates solutions using the P-type learning control scheme. Initially, we demonstrate the necessity of gradient information for achieving the best approximation. Subsequently, we propose an input-output-driven learning gain design to handle the imprecise gradients of a class of uncertain systems. However, it is found that the desired performance may not be attainable when the available information is incomplete. To address this issue, an extended iterative learning control scheme is introduced. In this scheme, the tracking errors are modified through output data sampling, which has a low memory footprint and offers flexibility in learning gain design. The input sequence is shown to converge towards the desired input, resulting in an output that is closest to the given reference in the least-squares sense. Numerical simulations are provided to validate the theoretical findings.
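The idea of a P-type update that uses gradient information can be illustrated with a minimal sketch. It is not the scheme proposed in the paper: it assumes a known, static lifted map `G` from input to output (a hypothetical 3-by-2 matrix here), uses the gradient-type gain `L = gamma * G.T`, and omits the output-sampling modification that the abstract describes. Under these assumptions, the input converges to the least-squares solution even though the reference cannot be tracked exactly.

```python
import numpy as np

def p_type_ilc(G, r, gamma, iterations=500):
    """Iterate u_{k+1} = u_k + L e_k with the gradient-type gain L = gamma * G.T."""
    u = np.zeros(G.shape[1])
    for _ in range(iterations):
        e = r - G @ u              # tracking error of the current trial
        u = u + gamma * (G.T @ e)  # P-type update using gradient information
    return u

# Hypothetical lifted system: 3 outputs, 2 inputs, so r is generally unreachable.
G = np.array([[1.0, 0.0],
              [0.5, 1.0],
              [0.2, 0.5]])
r = np.array([1.0, 1.0, 1.0])

# Step size chosen below 2 / ||G||_2^2 so the iteration contracts.
gamma = 1.0 / np.linalg.norm(G, 2) ** 2

u_ilc = p_type_ilc(G, r, gamma)
u_ls = np.linalg.lstsq(G, r, rcond=None)[0]  # direct least-squares reference
print(u_ilc, u_ls)
```

The residual `r - G @ u_ilc` stays nonzero, which is the sense in which the tracking problem is unachievable; the iteration only drives the input to the best approximation.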
The effectiveness of data-driven learning (DDL) has been tested on Chinese learners using sample corpora of English articles. The results show that independent manipulation of the corpora by the learner cannot ensure the success of DDL.
A case study was conducted to explore whether the teacher’s role in data-driven learning (DDL) can be minimized. The outcome shows that the teacher’s role in offering explicit instruction may be indispensable and even central to the acquisition of English articles.
NJmat is a user-friendly, data-driven machine learning interface designed for materials design and analysis. The platform integrates advanced computational techniques, including natural language processing (NLP), large language models (LLM), machine learning potentials (MLP), and graph neural networks (GNN), to facilitate materials discovery. The platform has been applied in diverse materials research areas, including perovskite surface design, catalyst discovery, battery materials screening, structural alloy design, and molecular informatics. By automating feature selection, predictive modeling, and result interpretation, NJmat accelerates the development of high-performance materials across energy storage, conversion, and structural applications. Additionally, NJmat serves as an educational tool, allowing students and researchers to apply machine learning techniques in materials science with minimal coding expertise. Through automated feature extraction, genetic algorithms, and interpretable machine learning models, NJmat simplifies the workflow for materials informatics, bridging the gap between AI and experimental materials research. The latest version (available at https://figshare.com/articles/software/NJmatML/24607893 (accessed on 01 January 2025)) enhances its functionality by incorporating NJmatNLP, a module leveraging language models like MatBERT and those based on Word2Vec to support materials prediction tasks. By utilizing clustering and cosine similarity analysis with UMAP visualization, NJmat enables intuitive exploration of materials datasets. While NJmat primarily focuses on structure-property relationships and the discovery of novel chemistries, it can also assist in optimizing processing conditions when relevant parameters are included in the training data. By providing an accessible, integrated environment for machine learning-driven materials discovery, NJmat aligns with the objectives of the Materials Genome Initiative and promotes broader adoption of AI techniques in materials science.
The Underwater Acoustic (UWA) channel is bandwidth-constrained and experiences doubly selective fading. It is challenging to acquire perfect channel knowledge for Orthogonal Frequency Division Multiplexing (OFDM) communications using a finite number of pilots. Deep Learning (DL) approaches have been very successful in wireless OFDM communications, but whether they work underwater remains an open question. For the first time, this paper compares two categories of DL-based UWA OFDM receivers: the Data-Driven (DD) method, which performs as an end-to-end black box, and the Model-Driven (MD) method, also known as the model-based data-driven method, which combines DL and expert OFDM receiver knowledge. The encoder-decoder framework and Convolutional Neural Network (CNN) structure are employed to establish the DD receiver, while an unfolding-based Minimum Mean Square Error (MMSE) structure is adopted for the MD receiver. We analyze the characteristics of the different receivers by Monte Carlo simulations under diverse communication conditions and propose a strategy for selecting a proper receiver under different communication scenarios. Field trials in a pool and at sea are also conducted to verify the feasibility and advantages of the DL receivers. It is observed that DL receivers perform better than conventional receivers in terms of bit error rate.
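For readers unfamiliar with the MMSE structure that the MD receiver builds on, the following is a minimal sketch of a classical one-tap MMSE equalizer per OFDM subcarrier. It is not the unfolded network from the paper. It assumes unit-power QPSK symbols, a known flat channel gain `H` per subcarrier, and no inter-carrier interference, which a doubly selective UWA channel would in practice introduce; all names and values are illustrative.

```python
import numpy as np

def mmse_equalize(Y, H, noise_var):
    """One-tap MMSE equalizer per subcarrier, assuming unit-power symbols."""
    return np.conj(H) * Y / (np.abs(H) ** 2 + noise_var)

rng = np.random.default_rng(0)
n_sub = 64  # hypothetical number of subcarriers

# Unit-power QPSK symbols.
bits = rng.integers(0, 2, size=(n_sub, 2))
X = ((2 * bits[:, 0] - 1) + 1j * (2 * bits[:, 1] - 1)) / np.sqrt(2)

# Synthetic Rayleigh-like channel gains and complex Gaussian noise.
H = (rng.normal(size=n_sub) + 1j * rng.normal(size=n_sub)) / np.sqrt(2)
noise_var = 0.01
noise = np.sqrt(noise_var / 2) * (rng.normal(size=n_sub) + 1j * rng.normal(size=n_sub))

Y = H * X + noise                      # received frequency-domain symbols
X_hat = mmse_equalize(Y, H, noise_var)  # soft symbol estimates
```

With `noise_var` set to zero the expression reduces to zero-forcing, `Y / H`; the noise term in the denominator is what keeps deeply faded subcarriers from amplifying noise.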
The distillation process is an important chemical process, and the data-driven modelling approach has the potential to reduce model complexity compared to mechanistic modelling, thus improving the efficiency of process optimization or monitoring studies. However, the distillation process is highly nonlinear and has multiple uncertainty perturbation intervals, which makes accurate data-driven modelling of distillation processes challenging. This paper proposes a systematic data-driven modelling framework to solve these problems. Firstly, data segment variance was introduced into the K-means algorithm to form K-means data interval (KMDI) clustering, which clusters the data into perturbed and steady-state intervals for steady-state data extraction. Secondly, the maximal information coefficient (MIC) was employed to calculate the nonlinear correlation between variables in order to remove redundant features. Finally, extreme gradient boosting (XGBoost) was integrated as the base learner into adaptive boosting (AdaBoost), with an error threshold (ET) set to improve the weight update strategy, to construct a new ensemble learning algorithm, XGBoost-AdaBoost-ET. The superiority of the proposed framework is verified by applying it to a real industrial propylene distillation process.
The asphalt pavement industry is transforming because of the growing influence of artificial intelligence and industrial digitization. As a result of this shift, there is a stronger emphasis on advanced statistical approaches, such as the optimization tool response surface methodology (RSM), and on machine learning (ML) techniques. The goal of this paper is to provide a scientometric and systematic review of RSM and ML applications in data-driven approaches such as optimizing, modeling, and predicting asphalt pavement performance to achieve sustainable asphalt pavements in support of numerous sustainable development goals (SDGs). These include Goals 9 (sustainable infrastructure), 11 (urban resilience), 12 (sustainable construction strategies), 13 (climate action through optimized materials), and 17 (multidisciplinary interaction). A thorough search of the ScienceDirect, Web of Science, and Scopus databases from 2010 to 2023 yielded 1249 relevant records, with 125 studies closely examined. Over the last thirteen years, there has been significant research growth in RSM and ML applications, particularly in ML-based pavement optimization. The study shows that the topic has a global presence, with notable contributions from Asia, North America, Europe, and other continents. Researchers have concentrated on utilizing sophisticated ML models such as support vector machines (SVM), artificial neural networks (ANN), and Bayesian networks for prediction. Also, the integration of RSM and ML provides a faster and more efficient method for analyzing large datasets to optimize asphalt pavement performance variables. Key contributors include the United States, China, and Malaysia, with global efforts focused on sustainable materials and approaches to reduce environmental impact. Furthermore, the review demonstrates the integrated use of RSM and ML as transformative tools for improving sustainability, which contributes significantly to SDGs 9, 11, 12, 13, and 17. It also provides valuable insights for future research and guides decision-making for soft computing applications in asphalt pavement projects.
Predicting fracture intensity is essential for optimising reservoir production and mitigating drilling risks in the Brazilian pre-salt layer. However, previous studies rely excessively on conceptual models and typically do not integrate multiple types of data to perform such a task. Moreover, to date, no feasibility-like studies have assessed the reasonableness of such approaches. We propose a data-driven approach that utilises upscaled well logs (Young's modulus, Poisson's ratio, and silica content) alongside seismic attributes (curvature, distance to fault) to predict fracture intensity. The distance to fault is measured using the fault probability volume estimated by a pre-trained convolutional neural network (CNN). We evaluate the effectiveness of this data-driven approach employing two tree-ensemble models, eXtreme Gradient Boosting (XGBoost) and Random Forest, to estimate the volumetric fracture intensity (P32) in the wells. Regression and residual analyses indicate that XGBoost outperforms Random Forest. Results from feature importance methods, such as permutation importance and Shapley Additive explanations (SHAP), highlight curvature as the most important feature, followed by distance to fault, Young's modulus (or P-Impedance), silica content, and Poisson's ratio. The approach has been validated with rock sampling information and two blind tests. Consequently, we believe this workflow can be applied to other wells in nearby fields. The study offers a valuable tool for quantitatively estimating fracture intensity in pre-salt reservoirs. Future research may use this study as a reference for estimating fracture intensity within a seismic volume. The predicted fracture intensity estimates can enhance the reliability of reservoir porosity models and serve as a geohazard indicator to mitigate drilling risks.
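Permutation importance, one of the feature importance methods named above, can be sketched in a few lines. The example below is illustrative only: it uses synthetic data and a linear least-squares model rather than the study's well logs, seismic attributes, or tree ensembles, and the function name `permutation_importance` is a local helper, not a library call.

```python
import numpy as np

def permutation_importance(predict, X, y, rng, n_repeats=10):
    """Mean increase in MSE when each feature column is shuffled in turn."""
    base = np.mean((predict(X) - y) ** 2)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])  # break the feature-target link
            scores[j] += np.mean((predict(Xp) - y) ** 2) - base
    return scores / n_repeats

rng = np.random.default_rng(0)

# Synthetic target that depends strongly on feature 0 and weakly on feature 1.
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] + 0.1 * X[:, 1]

coef = np.linalg.lstsq(X, y, rcond=None)[0]  # stand-in for a trained model
predict = lambda A: A @ coef

importance = permutation_importance(predict, X, y, rng)
print(importance)
```

Because the method only needs a `predict` function, the same helper could in principle wrap any fitted regressor; ranking features by the returned scores gives an ordering analogous to the one reported in the abstract.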
Teacher–student relationships play a vital role in improving college students’ academic performance and the quality of higher education. However, empirical studies with substantial data-driven insights remain limited. To address this gap, this study collected 3278 questionnaires from seven universities across four provinces in China to analyze the key factors affecting college students’ academic performance. A machine learning framework, CQFOA-KELM, was developed by enhancing the Fruit Fly Optimization Algorithm (FOA) with Covariance Matrix Adaptation Evolution Strategy (CMAES) and Quadratic Approximation (QA). CQFOA significantly improved population diversity and was validated on the IEEE CEC2017 benchmark functions. The CQFOA-KELM model achieved an accuracy of 98.15% and a sensitivity of 98.53% in predicting college students’ academic performance. Additionally, it effectively identified the key factors influencing academic performance through the feature selection process.
As legal cases grow in complexity and volume worldwide, integrating machine learning and artificial intelligence into judicial systems has become a pivotal research focus. This study introduces a comprehensive framework for verdict recommendation that synergizes rule-based methods with deep learning techniques specifically tailored to the legal domain. The proposed framework comprises three core modules: legal feature extraction, semantic similarity assessment, and verdict recommendation. For legal feature extraction, a rule-based approach leverages Black’s Law Dictionary and WordNet Synsets to construct feature vectors from judicial texts. Semantic similarity between cases is evaluated using a hybrid method that combines rule-based logic with an LSTM model, analyzing the feature vectors of query cases against a legal knowledge base. Verdicts are then recommended through a rule-based retrieval system, enhanced by predefined legal statutes and regulations. By merging rule-based methodologies with deep learning, this framework addresses the interpretability challenges often associated with contemporary AI models, thereby enhancing both transparency and generalizability across diverse legal contexts. The system was rigorously tested using a legal corpus of 43,000 case laws across six categories: Criminal, Revenue, Service, Corporate, Constitutional, and Civil law, ensuring its adaptability across a wide range of judicial scenarios. Performance evaluation showed that the feature extraction module achieved an average accuracy of 91.6% with an F-Score of 95%. The semantic similarity module, tested using Manhattan, Euclidean, and Cosine distance metrics, achieved 88% accuracy and a 93% F-Score for short queries (Manhattan), 89% accuracy and a 93.7% F-Score for medium-length queries (Euclidean), and 87% accuracy with a 92.5% F-Score for longer queries (Cosine). The verdict recommendation module outperformed existing methods, achieving 90% accuracy and a 93.75% F-Score. This study highlights the potential of hybrid AI frameworks to improve judicial decision-making and streamline legal processes, offering a robust, interpretable, and adaptable solution for the evolving demands of modern legal systems.
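The three distance metrics used by the semantic similarity module have standard definitions, sketched below on two toy feature vectors. The vectors `query` and `case` are hypothetical and are not drawn from the paper's legal corpus; how the framework builds its feature vectors is not shown here.

```python
import numpy as np

def manhattan(a, b):
    """L1 distance: sum of absolute coordinate differences."""
    return float(np.sum(np.abs(a - b)))

def euclidean(a, b):
    """L2 distance: straight-line distance between the vectors."""
    return float(np.linalg.norm(a - b))

def cosine_distance(a, b):
    """One minus the cosine of the angle between the vectors."""
    return float(1.0 - (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical feature vectors for a query case and a stored case.
query = np.array([1.0, 2.0, 0.0])
case = np.array([2.0, 0.0, 1.0])

print(manhattan(query, case))        # 4.0
print(euclidean(query, case))        # sqrt(6), about 2.449
print(cosine_distance(query, case))  # about 0.6
```

Cosine distance ignores vector length and compares direction only, which is one plausible reason it is often paired with longer texts, whereas Manhattan and Euclidean distances are sensitive to magnitude.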
Bifunctional oxide-zeolite-based composites (OXZEO) have emerged as promising materials for the direct conversion of syngas to olefins. However, experimental screening and optimization of reaction parameters remain resource-intensive. To address this challenge, we implemented a three-stage framework integrating machine learning, Bayesian optimization, and experimental validation, utilizing a carefully curated dataset from the literature. Our ensemble-tree model (R² > 0.87) identified Zn-Zr and Cu-Mg binary mixed oxides as the most effective OXZEO systems, with their light olefin space-time yields confirmed experimentally by physically mixing them with HSAPO-34. Density functional theory calculations further elucidated the activity trends between Zn-Zr and Cu-Mg mixed oxides. Among 16 catalyst and reaction condition descriptors, the oxide/zeolite ratio, reaction temperature, and pressure emerged as the most significant factors. This interpretable, data-driven framework offers a versatile approach that can be applied to other catalytic processes, providing a powerful tool for experiment design and optimization in catalysis.
Non-learning-based motion and path planning of an Unmanned Aerial Vehicle (UAV) is faced with low computation efficiency, mapping memory occupation, and local optimization problems. This article investigates the challenge of quadrotor control using offline reinforcement learning. By establishing a data-driven learning paradigm that operates without real-environment interaction, the proposed workflow offers a safer approach than traditional reinforcement learning, making it particularly suited for UAV control in industrial scenarios. The introduced algorithm evaluates dataset uncertainty and employs a pessimistic estimation to foster offline deep reinforcement learning. Experiments highlight the algorithm's superiority over traditional online reinforcement learning methods, especially when learning from offline datasets. Furthermore, the article emphasizes the importance of a more general behavior policy. In evaluations, the trained policy demonstrated versatility by adeptly navigating diverse obstacles, underscoring its real-world applicability.
Early identification and treatment of stroke can greatly improve patient outcomes and quality of life. Although clinical tests such as the Cincinnati Pre-hospital Stroke Scale (CPSS) and the Face Arm Speech Test (FAST) are commonly used for stroke screening, accurate administration depends on specialized training. In this study, we proposed a novel multimodal deep learning approach, based on the FAST, for assessing suspected stroke patients exhibiting symptoms such as limb weakness, facial paresis, and speech disorders in acute settings. We collected a dataset comprising videos and audio recordings of emergency room patients performing designated limb movements, facial expressions, and speech tests based on the FAST. We compared the constructed deep learning model, which was designed to process multi-modal datasets, with six prior models that achieved good action classification performance: I3D, SlowFast, X3D, TPN, TimeSformer, and MViT. We found that the results of our deep learning model had higher clinical value than those of the other approaches. Moreover, the multi-modal model outperformed its single-module variants, highlighting the benefit of utilizing multiple types of patient data, such as action videos and speech audio. These results indicate that a multi-modal deep learning model combined with the FAST could greatly improve the accuracy and sensitivity of early stroke identification, thus providing a practical and powerful tool for assessing stroke patients in an emergency clinical setting.
With the rapid development of artificial intelligence, the Internet of Things (IoT) can deploy various machine learning algorithms for network and application management. In the IoT environment, many sensors and devices generate massive data, but data security and privacy protection have become a serious challenge. Federated learning (FL) can enable many intelligent IoT applications by training models on local devices, allowing AI training on distributed IoT devices without data sharing. This review explores the combination of FL and the IoT in depth and analyzes the application of federated learning in the IoT from the aspects of security and privacy protection. In this paper, we first describe the potential advantages of FL and the challenges faced by current IoT systems in terms of network burden and privacy security. Next, we focus on exploring and analyzing the advantages of combining FL with the IoT, including privacy security, attack detection, efficient IoT communication, and enhanced learning quality. We also list various application scenarios of FL in the IoT. Finally, we propose several open research challenges and possible solutions.
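The core mechanism described above, training on local devices and sharing only model parameters, is commonly realized with federated averaging. The sketch below is a toy illustration under stated assumptions: three simulated clients hold noise-free linear regression data, each runs a few local gradient steps, and a server averages the resulting weights by client data size. It is not taken from the review and omits communication, security, and privacy mechanisms entirely.

```python
import numpy as np

def local_update(w, X, y, lr=0.2, epochs=5):
    """A few local gradient-descent steps on one client's private data."""
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def fed_avg(client_data, dim, rounds=100):
    """Server loop: broadcast weights, collect local updates, average by data size."""
    w = np.zeros(dim)
    for _ in range(rounds):
        updates = [local_update(w.copy(), X, y) for X, y in client_data]
        sizes = [len(y) for _, y in client_data]
        w = np.average(updates, axis=0, weights=sizes)  # raw data never leaves clients
    return w

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])  # hypothetical ground-truth model

# Three simulated devices with different amounts of local data.
clients = []
for n in (20, 30, 50):
    X = rng.normal(size=(n, 2))
    clients.append((X, X @ true_w))

w_global = fed_avg(clients, dim=2)
print(w_global)
```

Only the weight vectors cross the client boundary in this sketch, which is the property that makes the approach attractive for privacy-sensitive IoT data; shared updates can still leak information, which is why the review also discusses attacks and defences.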
In the realm of Intelligent Railway Transportation Systems, effective multi-party collaboration is crucial due to concerns over privacy and data silos. Vertical Federated Learning (VFL) has emerged as a promising approach to facilitate such collaboration, allowing diverse entities to collectively enhance machine learning models without the need to share sensitive training data. However, existing works have highlighted VFL’s susceptibility to privacy inference attacks, where an honest-but-curious server could potentially reconstruct a client’s raw data from embeddings uploaded by the client. This vulnerability poses a significant threat to VFL-based intelligent railway transportation systems. In this paper, we introduce SensFL, a novel privacy-enhancing method to defend against privacy inference attacks in VFL. Specifically, SensFL integrates regularization of the sensitivity of embeddings to the original data into the model training process, effectively limiting the information contained in shared embeddings. By reducing the sensitivity of embeddings to the original data, SensFL can effectively resist reverse privacy attacks and prevent the reconstruction of the original data from the embeddings. Extensive experiments were conducted on four distinct datasets and three different models to demonstrate the efficacy of SensFL. Experimental results show that SensFL can effectively mitigate privacy inference attacks while maintaining the accuracy of the primary learning task. These results underscore SensFL’s potential to advance privacy protection technologies within VFL-based intelligent railway systems, addressing critical security concerns in collaborative learning environments.
The burgeoning market for lithium-ion batteries has stimulated a growing need for more reliable battery performance monitoring. Accurate state-of-health (SOH) estimation is critical for ensuring battery operational performance. Despite numerous data-driven methods reported in existing research for battery SOH estimation, these methods often exhibit inconsistent performance across different application scenarios. To address this issue and overcome the performance limitations of individual data-driven models, integrating multiple models for SOH estimation has received considerable attention. Ensemble learning (EL) typically leverages the strengths of multiple base models to achieve more robust and accurate outputs. However, the lack of a clear review of current research hinders the further development of ensemble methods in SOH estimation. Therefore, this paper comprehensively reviews multi-model ensemble learning methods for battery SOH estimation. First, existing ensemble methods are systematically categorized into six classes based on their combination strategies. Different realizations and underlying connections are analyzed for each category of EL methods, highlighting distinctions, innovations, and typical applications. Subsequently, these ensemble methods are compared in terms of base models, combination strategies, and publication trends. Evaluations across six dimensions underscore the outstanding performance of stacking-based ensemble methods. Following this, the ensemble methods are further inspected from the perspectives of weighted ensemble and diversity, aiming to inspire potential approaches for enhancing ensemble performance. Moreover, the paper emphasizes the need to address challenges in practical applications, such as base model selection, measurement of model robustness and uncertainty, and interpretability of ensemble models. Finally, future research prospects are outlined, specifically noting that deep learning ensembles are poised to advance ensemble methods for battery SOH estimation. The convergence of advanced machine learning with ensemble learning is anticipated to yield valuable avenues for research. Accelerated research in ensemble learning holds promising prospects for achieving more accurate and reliable battery SOH estimation under real-world conditions.
Mental health is a significant issue worldwide, and the use of technology to support mental health is a growing trend that aims to alleviate the workload on healthcare professionals and aid individuals. Numerous applications have been developed to address the challenges in intelligent healthcare systems. However, because mental health data is sensitive, privacy concerns have emerged, and federated learning has received attention as a result. This research reviews studies on federated learning and mental health that address the problems of intelligent healthcare systems. It explores various dimensions of federated learning in mental health, such as datasets (their types and sources), applications categorized by mental health symptoms, federated mental health frameworks, federated machine learning, federated deep learning, and the benefits of federated learning in mental health applications. This research surveys the current state of mental health applications, mainly focusing on the role of Federated Learning (FL) and related privacy and data security concerns. The survey provides valuable insights into how these applications are emerging and evolving, specifically emphasizing FL’s impact.
Funding (battery SoH estimation letter): supported by the National Natural Science Foundation of China (62373224, 62333013, U23A20327) and the Natural Science Foundation of Shandong Province (ZR2024JQ021).
Funding (iterative learning control study): supported by the National Natural Science Foundation of China (62173333, 12271522), the Beijing Natural Science Foundation (Z210002), and the Research Fund of Renmin University of China (2021030187).
Funding (NJmat): supported by the Jiangsu Provincial Science and Technology Project Basic Research Program (Natural Science Foundation of Jiangsu Province) (No. BK20211283).
Funding: Funded in part by the National Natural Science Foundation of China under Grants 62401167 and 62192712, in part by the Key Laboratory of Marine Environmental Survey Technology and Application, Ministry of Natural Resources, P.R. China under Grant MESTA-2023-B001, and in part by the Stable Supporting Fund of National Key Laboratory of Underwater Acoustic Technology under Grant JCKYS2022604SSJS007.
Abstract: The Underwater Acoustic (UWA) channel is bandwidth-constrained and experiences doubly selective fading. It is challenging to acquire perfect channel knowledge for Orthogonal Frequency Division Multiplexing (OFDM) communications using a finite number of pilots. On the other hand, Deep Learning (DL) approaches have been very successful in wireless OFDM communications. However, whether they will work underwater remains an open question. For the first time, this paper compares two categories of DL-based UWA OFDM receivers: the Data-Driven (DD) method, which performs as an end-to-end black box, and the Model-Driven (MD) method, also known as the model-based data-driven method, which combines DL and expert OFDM receiver knowledge. The encoder-decoder framework and Convolutional Neural Network (CNN) structure are employed to establish the DD receiver. An unfolding-based Minimum Mean Square Error (MMSE) structure is adopted for the MD receiver. We analyze the characteristics of the different receivers by Monte Carlo simulations under diverse communication conditions and propose a strategy for selecting a proper receiver under different communication scenarios. Field trials in a pool and at sea are also conducted to verify the feasibility and advantages of the DL receivers. It is observed that DL receivers perform better than conventional receivers in terms of bit error rate.
Funding: Supported by the National Key Research and Development Program of China (2023YFB3307801); the National Natural Science Foundation of China (62394343, 62373155, 62073142); the Major Science and Technology Project of Xinjiang (No. 2022A01006-4); the Programme of Introducing Talents of Discipline to Universities (the 111 Project) under Grant B17017; the Fundamental Research Funds for the Central Universities, Science Foundation of China University of Petroleum, Beijing (No. 2462024YJRC011); and the Open Research Project of the State Key Laboratory of Industrial Control Technology, China (Grant No. ICT2024B70).
Abstract: Distillation is an important chemical process, and data-driven modelling has the potential to reduce model complexity compared with mechanistic modelling, thus improving the efficiency of process optimization and monitoring studies. However, the distillation process is highly nonlinear and has multiple uncertainty perturbation intervals, which makes accurate data-driven modelling challenging. This paper proposes a systematic data-driven modelling framework to solve these problems. Firstly, data segment variance was introduced into the K-means algorithm to form K-means data interval (KMDI) clustering, which clusters the data into perturbed and steady-state intervals for steady-state data extraction. Secondly, the maximal information coefficient (MIC) was employed to calculate the nonlinear correlation between variables in order to remove redundant features. Finally, extreme gradient boosting (XGBoost) was integrated as the base learner into adaptive boosting (AdaBoost), with an error threshold (ET) set to improve the weight-update strategy, yielding a new ensemble learning algorithm, XGBoost-AdaBoost-ET. The superiority of the proposed framework is verified by applying it to a real industrial propylene distillation process.
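The steady-state extraction step described in the abstract above can be illustrated with a minimal sketch. This is not the paper's KMDI algorithm, whose details the abstract does not give. It only assumes that each fixed-length data segment is summarized by its variance and that a plain 1-D K-means with two clusters separates low-variance (steady) segments from high-variance (perturbed) ones. The function names, the window length, and the signal values are all hypothetical.

```python
from statistics import pvariance


def segment_variances(series, window):
    """Split a series into consecutive, non-overlapping windows and
    return the population variance of each window."""
    return [
        pvariance(series[i:i + window])
        for i in range(0, len(series) - window + 1, window)
    ]


def two_means_1d(values, iters=50):
    """Plain 1-D K-means with k=2, initialized at the min and max value.
    Returns one label per value: 0 = low-value cluster, 1 = high-value cluster."""
    lo, hi = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value goes to the nearer centroid.
        labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        low = [v for v, lab in zip(values, labels) if lab == 0]
        high = [v for v, lab in zip(values, labels) if lab == 1]
        # Update step: recompute centroids (keep the old one if a cluster is empty).
        new_lo = sum(low) / len(low) if low else lo
        new_hi = sum(high) / len(high) if high else hi
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return labels


# Hypothetical process signal: steady, then a disturbance, then steady again.
signal = [10.0, 10.1, 9.9, 10.0,
          10.0, 13.0, 7.0, 12.0,
          10.0, 10.1, 10.0, 9.9]
variances = segment_variances(signal, window=4)
labels = two_means_1d(variances)
# Segments labelled 0 are treated as steady state and kept for modelling.
steady_segments = [i for i, lab in enumerate(labels) if lab == 0]
```

In this toy signal the middle segment carries the disturbance, so only the first and last segments would be kept as steady-state data.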
Abstract: The asphalt pavement industry is transforming because of the growing influence of artificial intelligence and industrial digitization. As a result of this shift, there is a stronger emphasis on advanced statistical approaches, such as the optimization tool response surface methodology (RSM), and on machine learning (ML) techniques. The goal of this paper is to provide a scientometric and systematic review of the application of RSM and ML in data-driven approaches such as optimizing, modeling, and predicting asphalt pavement performance to achieve sustainable asphalt pavements in support of numerous sustainable development goals (SDGs). These include Goals 9 (sustainable infrastructure), 11 (urban resilience), 12 (sustainable construction strategies), 13 (climate action through optimized materials), and 17 (multidisciplinary interaction). A thorough search of the ScienceDirect, Web of Science, and Scopus databases from 2010 to 2023 yielded 1249 relevant records, with 125 studies closely examined. Over the last thirteen years, there has been significant research growth in RSM and ML applications, particularly in ML-based pavement optimization. The study shows that the topic has a global presence, with notable contributions from Asia, North America, Europe, and other continents. Researchers have concentrated on utilizing sophisticated ML models such as support vector machines (SVM), artificial neural networks (ANN), and Bayesian networks for prediction. Also, the integration of RSM and ML provides a faster and more efficient method for analyzing large datasets to optimize asphalt pavement performance variables. Key contributors include the United States, China, and Malaysia, with global efforts focused on sustainable materials and approaches to reduce impact on the environment. Furthermore, the review demonstrates the integrated use of RSM and ML as transformative tools for improving sustainability, which contributes significantly to SDGs 9, 11, 12, 13, and 17. It thereby provides valuable insights for future research and guides decision-making for soft computing applications in asphalt pavement projects.
Abstract: Predicting fracture intensity is essential for optimising reservoir production and mitigating drilling risks in the Brazilian pre-salt layer. However, previous studies rely excessively on conceptual models and typically do not integrate multiple types of data to perform this task. Moreover, to date, no feasibility-like studies have assessed the reasonableness of such approaches. We propose a data-driven approach that utilises upscaled well logs (Young's modulus, Poisson's ratio, and silica content) alongside seismic attributes (curvature, distance to fault) to predict fracture intensity. The distance to fault is measured using the fault probability volume estimated by a pre-trained convolutional neural network (CNN). We evaluate the effectiveness of this data-driven approach employing two tree-ensemble models, eXtreme Gradient Boosting (XGBoost) and Random Forest, to estimate the volumetric fracture intensity (P32) in the wells. Regression and residual analyses indicate that XGBoost outperforms Random Forest. Results from feature importance methods, such as permutation importance and SHapley Additive exPlanations (SHAP), highlight curvature as the most important feature, followed by distance to fault, Young's modulus (or P-Impedance), silica content, and Poisson's ratio. The approach has been validated with rock sampling information and two blind tests. Consequently, we believe this workflow can be applied to other wells in nearby fields. The study offers a valuable tool for quantitatively estimating fracture intensity in pre-salt reservoirs. Future research may use this study as a reference for estimating fracture intensity within a seismic volume. The predicted fracture intensity estimates can enhance the reliability of reservoir porosity models and serve as a geohazard indicator to mitigate drilling risks.
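Permutation importance, one of the feature-ranking methods named in the abstract above, can be sketched in a few lines. The idea is to shuffle one feature column at a time and measure how much the model's error grows. The toy model, the data, and the function names below are hypothetical stand-ins, not the paper's XGBoost model or well-log data.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """For each feature, return the mean increase in MSE after shuffling
    that feature's column while leaving all other columns intact."""
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the dataset with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted_error = mse(y, [predict(row) for row in X_perm])
            increases.append(permuted_error - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical fitted model: depends strongly on feature 0,
    weakly on feature 1, and not at all on feature 2."""
    return 3.0 * row[0] + 0.5 * row[1]


X = [[float(i), float((i * 7) % 10), float((i * 3) % 10)] for i in range(10)]
y = [toy_model(row) for row in X]
scores = permutation_importance(toy_model, X, y)
```

A feature the model ignores scores exactly zero, because shuffling it leaves every prediction unchanged. Features the model relies on more heavily produce larger error increases.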
Abstract: Teacher–student relationships play a vital role in improving college students’ academic performance and the quality of higher education. However, empirical studies with substantial data-driven insights remain limited. To address this gap, this study collected 3278 questionnaires from seven universities across four provinces in China to analyze the key factors affecting college students’ academic performance. A machine learning framework, CQFOA-KELM, was developed by enhancing the Fruit Fly Optimization Algorithm (FOA) with Covariance Matrix Adaptation Evolution Strategy (CMAES) and Quadratic Approximation (QA). CQFOA significantly improved population diversity and was validated on the IEEE CEC2017 benchmark functions. The CQFOA-KELM model achieved an accuracy of 98.15% and a sensitivity of 98.53% in predicting college students’ academic performance. Additionally, it effectively identified the key factors influencing academic performance through the feature selection process.
Funding: Funded by the Deanship of Scientific Research at Jouf University under Grant number DSR-2022-RG-0101.
Abstract: As legal cases grow in complexity and volume worldwide, integrating machine learning and artificial intelligence into judicial systems has become a pivotal research focus. This study introduces a comprehensive framework for verdict recommendation that synergizes rule-based methods with deep learning techniques specifically tailored to the legal domain. The proposed framework comprises three core modules: legal feature extraction, semantic similarity assessment, and verdict recommendation. For legal feature extraction, a rule-based approach leverages Black’s Law Dictionary and WordNet Synsets to construct feature vectors from judicial texts. Semantic similarity between cases is evaluated using a hybrid method that combines rule-based logic with an LSTM model, analyzing the feature vectors of query cases against a legal knowledge base. Verdicts are then recommended through a rule-based retrieval system, enhanced by predefined legal statutes and regulations. By merging rule-based methodologies with deep learning, this framework addresses the interpretability challenges often associated with contemporary AI models, thereby enhancing both transparency and generalizability across diverse legal contexts. The system was rigorously tested using a legal corpus of 43,000 case laws across six categories: Criminal, Revenue, Service, Corporate, Constitutional, and Civil law, ensuring its adaptability across a wide range of judicial scenarios. Performance evaluation showed that the feature extraction module achieved an average accuracy of 91.6% with an F-Score of 95%. The semantic similarity module, tested using Manhattan, Euclidean, and Cosine distance metrics, achieved 88% accuracy and a 93% F-Score for short queries (Manhattan), 89% accuracy and a 93.7% F-Score for medium-length queries (Euclidean), and 87% accuracy with a 92.5% F-Score for longer queries (Cosine). The verdict recommendation module outperformed existing methods, achieving 90% accuracy and a 93.75% F-Score. This study highlights the potential of hybrid AI frameworks to improve judicial decision-making and streamline legal processes, offering a robust, interpretable, and adaptable solution for the evolving demands of modern legal systems.
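The three distance metrics named in the abstract above (Manhattan, Euclidean, and Cosine) can be written out directly for a pair of feature vectors. The sketch below also shows a nearest-case lookup. The vectors, the case names, and the `most_similar` helper are hypothetical, and the paper's LSTM and rule-based components are not modelled here.

```python
import math


def manhattan(u, v):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(a - b) for a, b in zip(u, v))


def euclidean(u, v):
    """Straight-line (L2) distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_distance(u, v):
    """One minus the cosine of the angle between u and v.
    Assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def most_similar(query, cases, metric):
    """Return the name of the stored case with the smallest distance to the query."""
    return min(cases, key=lambda name: metric(query, cases[name]))


# Hypothetical feature vectors for a query and two stored cases.
query = [1.0, 0.0, 2.0]
knowledge_base = {
    "case_a": [1.0, 0.0, 2.0],   # identical features to the query
    "case_b": [0.0, 3.0, 0.0],   # no features in common with the query
}
best = most_similar(query, knowledge_base, euclidean)
```

Manhattan and Euclidean distance depend on vector magnitude, while cosine distance depends only on direction. That difference is one reason a metric may suit queries of one length better than another.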
Funding: Funded by the KRICT Project (KK2512-10) of the Korea Research Institute of Chemical Technology and the Ministry of Trade, Industry and Energy (MOTIE); the Korea Institute for Advancement of Technology (KIAT) through the Virtual Engineering Platform Program (P0022334); supported by the Carbon Neutral Industrial Strategic Technology Development Program (RS-202300261088) funded by the Ministry of Trade, Industry & Energy (MOTIE, Korea). Further support was provided by a research fund of Chungnam National University.
Abstract: Bifunctional oxide-zeolite-based composites (OXZEO) have emerged as promising materials for the direct conversion of syngas to olefins. However, experimental screening and optimization of reaction parameters remain resource-intensive. To address this challenge, we implemented a three-stage framework integrating machine learning, Bayesian optimization, and experimental validation, utilizing a carefully curated dataset from the literature. Our ensemble-tree model (R² > 0.87) identified Zn-Zr and Cu-Mg binary mixed oxides as the most effective OXZEO systems, with their light olefin space-time yields confirmed by physically mixing with HSAPO-34 through experimental validation. Density functional theory calculations further elucidated the activity trends between Zn-Zr and Cu-Mg mixed oxides. Among 16 catalyst and reaction condition descriptors, the oxide/zeolite ratio, reaction temperature, and pressure emerged as the most significant factors. This interpretable, data-driven framework offers a versatile approach that can be applied to other catalytic processes, providing a powerful tool for experiment design and optimization in catalysis.
Funding: Supported by the National Natural Science Foundation of China (No. 52272382), the Aeronautical Science Foundation of China (No. 20200017051001), and the Fundamental Research Funds for the Central Universities, China.
Abstract: Non-learning-based motion and path planning for an Unmanned Aerial Vehicle (UAV) suffers from low computational efficiency, high map memory usage, and local-optimum problems. This article investigates the challenge of quadrotor control using offline reinforcement learning. By establishing a data-driven learning paradigm that operates without real-environment interaction, the proposed workflow offers a safer approach than traditional reinforcement learning, making it particularly suited for UAV control in industrial scenarios. The introduced algorithm evaluates dataset uncertainty and employs a pessimistic estimation to foster offline deep reinforcement learning. Experiments highlight the algorithm's superiority over traditional online reinforcement learning methods, especially when learning from offline datasets. Furthermore, the article emphasizes the importance of a more general behavior policy. In evaluations, the trained policy demonstrated versatility by adeptly navigating diverse obstacles, underscoring its real-world applicability.
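The abstract above does not say how its pessimistic estimate is computed. One common way to express the idea in offline reinforcement learning is to penalize an action's value by the disagreement among an ensemble of value estimates, since disagreement tends to be larger where the dataset has little coverage. The sketch below assumes that formulation. The function names, the `beta` coefficient, the action names, and the numbers are all hypothetical.

```python
from statistics import mean, pstdev


def pessimistic_q(q_estimates, beta=1.0):
    """Lower-confidence-bound style value: ensemble mean minus
    beta times the ensemble standard deviation."""
    return mean(q_estimates) - beta * pstdev(q_estimates)


def select_action(q_ensemble_by_action, beta=1.0):
    """Pick the action with the highest pessimistic value."""
    return max(
        q_ensemble_by_action,
        key=lambda action: pessimistic_q(q_ensemble_by_action[action], beta),
    )


# Hypothetical ensemble estimates for two quadrotor actions.
q_values = {
    "hover": [1.0, 1.0, 1.0, 1.0],   # well covered by the dataset: members agree
    "dash": [4.0, 0.0, 4.0, 0.0],    # poorly covered: members disagree
}
cautious_choice = select_action(q_values, beta=1.0)
greedy_choice = select_action(q_values, beta=0.0)
```

With `beta=0.0` the rule reduces to picking the highest mean value, which favors the uncertain action. With `beta=1.0` the uncertainty penalty makes the well-supported action preferable.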
Funding: Supported by the Ministry of Science and Technology of China, No. 2020AAA0109605 (to XL), and the Meizhou Major Scientific and Technological Innovation Platforms and Projects of Guangdong Provincial Science & Technology Plan Projects, No. 2019A0102005 (to HW).
Abstract: Early identification and treatment of stroke can greatly improve patient outcomes and quality of life. Although clinical tests such as the Cincinnati Pre-hospital Stroke Scale (CPSS) and the Face Arm Speech Test (FAST) are commonly used for stroke screening, accurate administration is dependent on specialized training. In this study, we proposed a novel multimodal deep learning approach, based on the FAST, for assessing suspected stroke patients exhibiting symptoms such as limb weakness, facial paresis, and speech disorders in acute settings. We collected a dataset comprising videos and audio recordings of emergency room patients performing designated limb movements, facial expressions, and speech tests based on the FAST. We compared the constructed deep learning model, which was designed to process multi-modal datasets, with six prior models that achieved good action classification performance, including the I3D, SlowFast, X3D, TPN, TimeSformer, and MViT. We found that the findings of our deep learning model had a higher clinical value compared with the other approaches. Moreover, the multi-modal model outperformed its single-module variants, highlighting the benefit of utilizing multiple types of patient data, such as action videos and speech audio. These results indicate that a multi-modal deep learning model combined with the FAST could greatly improve the accuracy and sensitivity of early stroke identification, thus providing a practical and powerful tool for assessing stroke patients in an emergency clinical setting.
Funding: Supported by the Shandong Province Science and Technology Project (2023TSGC0509, 2022TSGC2234), the Qingdao Science and Technology Plan Project (23-1-5-yqpy-2-qy), and the Open Topic Grants of Anhui Province Key Laboratory of Intelligent Building & Building Energy Saving, Anhui Jianzhu University (IBES2024KF08).
Abstract: With the rapid development of artificial intelligence, the Internet of Things (IoT) can deploy various machine learning algorithms for network and application management. In the IoT environment, many sensors and devices generate massive data, but data security and privacy protection have become a serious challenge. Federated learning (FL) can enable many intelligent IoT applications by training models on local devices, allowing AI training on distributed IoT devices without data sharing. This review explores in depth the combination of FL and the IoT, and analyzes the application of federated learning in the IoT from the aspects of security and privacy protection. In this paper, we first describe the potential advantages of FL and the challenges faced by current IoT systems in the fields of network burden and privacy security. Next, we focus on exploring and analyzing the advantages of combining FL with the IoT, including privacy security, attack detection, efficient communication of the IoT, and enhanced learning quality. We also list various application scenarios of FL on the IoT. Finally, we propose several open research challenges and possible solutions.
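The core of federated learning as described in the abstract above is that devices share model parameters rather than raw data. A minimal sketch of the server-side aggregation step follows. It assumes the widely used federated averaging rule, in which client parameters are averaged with weights proportional to local dataset size. The review covers many FL variants and does not prescribe this rule. The function name and all numbers are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Aggregate client model parameters into a global model.
    Each client's parameters are weighted by its number of local samples;
    no raw data leaves the clients, only the parameter vectors."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
        for k in range(n_params)
    ]


# Two hypothetical IoT devices, each holding a locally trained 2-parameter model.
clients = [[1.0, 2.0], [3.0, 4.0]]
sizes = [1, 3]  # the second device trained on three times as much data
global_weights = federated_average(clients, sizes)
```

Here the second device contributes three quarters of the average, so the global parameters land closer to its local model.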
Funding: Supported by the Systematic Major Project of Shuohuang Railway Development Co., Ltd., National Energy Group (Grant Number: SHTL-23-31) and the Beijing Natural Science Foundation (U22B2027).
Abstract: In the realm of Intelligent Railway Transportation Systems, effective multi-party collaboration is crucial due to concerns over privacy and data silos. Vertical Federated Learning (VFL) has emerged as a promising approach to facilitate such collaboration, allowing diverse entities to collectively enhance machine learning models without the need to share sensitive training data. However, existing works have highlighted VFL’s susceptibility to privacy inference attacks, where an honest-but-curious server could potentially reconstruct a client’s raw data from embeddings uploaded by the client. This vulnerability poses a significant threat to VFL-based intelligent railway transportation systems. In this paper, we introduce SensFL, a novel privacy-enhancing method to defend against privacy inference attacks in VFL. Specifically, SensFL integrates regularization of the sensitivity of embeddings to the original data into the model training process, effectively limiting the information contained in shared embeddings. By reducing the sensitivity of embeddings to the original data, SensFL can effectively resist reverse privacy attacks and prevent the reconstruction of the original data from the embeddings. Extensive experiments were conducted on four distinct datasets and three different models to demonstrate the efficacy of SensFL. Experimental results show that SensFL can effectively mitigate privacy inference attacks while maintaining the accuracy of the primary learning task. These results underscore SensFL’s potential to advance privacy protection technologies within VFL-based intelligent railway systems, addressing critical security concerns in collaborative learning environments.
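The abstract above describes regularizing how sensitive the shared embeddings are to the original data, but it does not give SensFL's exact regularizer. One plausible reading, assumed here purely for illustration, measures sensitivity as the squared Frobenius norm of the Jacobian of the embedding with respect to the input and adds it to the task loss with a weight `lam`. The linear embedding, the finite-difference estimate, and all names below are hypothetical.

```python
def linear_embed(x, W):
    """Toy client-side embedding z = W x (W given as a list of rows)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sensitivity(embed_fn, x, eps=1e-6):
    """Finite-difference estimate of the squared Frobenius norm of
    d(embedding)/d(input) at the point x."""
    base = embed_fn(x)
    total = 0.0
    for i in range(len(x)):
        bumped = list(x)
        bumped[i] += eps  # perturb one input coordinate at a time
        out = embed_fn(bumped)
        total += sum(((o - b) / eps) ** 2 for o, b in zip(out, base))
    return total


def regularized_loss(task_loss, sens, lam):
    """Training objective: task loss plus a weighted sensitivity penalty."""
    return task_loss + lam * sens


W = [[1.0, 2.0], [0.0, -1.0]]
x = [0.5, -0.3]
sens = sensitivity(lambda v: linear_embed(v, W), x)
loss = regularized_loss(task_loss=0.25, sens=sens, lam=0.1)
```

For a linear embedding the Jacobian is simply `W`, so the penalty equals the sum of squared entries of `W` (6.0 here, up to floating-point error). A larger `lam` pushes training toward embeddings that change less when the raw input changes.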
Funding: National Natural Science Foundation of China (52075420), Fundamental Research Funds for the Central Universities (xzy022023049), and National Key Research and Development Program of China (2023YFB3408600).
Abstract: The burgeoning market for lithium-ion batteries has stimulated a growing need for more reliable battery performance monitoring. Accurate state-of-health (SOH) estimation is critical for ensuring battery operational performance. Despite numerous data-driven methods reported in existing research for battery SOH estimation, these methods often exhibit inconsistent performance across different application scenarios. To address this issue and overcome the performance limitations of individual data-driven models, integrating multiple models for SOH estimation has received considerable attention. Ensemble learning (EL) typically leverages the strengths of multiple base models to achieve more robust and accurate outputs. However, the lack of a clear review of current research hinders the further development of ensemble methods in SOH estimation. Therefore, this paper comprehensively reviews multi-model ensemble learning methods for battery SOH estimation. First, existing ensemble methods are systematically categorized into 6 classes based on their combination strategies. Different realizations and underlying connections are meticulously analyzed for each category of EL methods, highlighting distinctions, innovations, and typical applications. Subsequently, these ensemble methods are comprehensively compared in terms of base models, combination strategies, and publication trends. Evaluations across 6 dimensions underscore the outstanding performance of stacking-based ensemble methods. Following this, these ensemble methods are further inspected from the perspectives of weighted ensemble and diversity, aiming to inspire potential approaches for enhancing ensemble performance. Moreover, the paper emphasizes the need to address challenges in practical applications, such as base model selection, measuring model robustness and uncertainty, and the interpretability of ensemble models.
Finally, future research prospects are outlined, specifically noting that deep learning ensemble is poised to advance ensemble methods for battery SOH estimation. The convergence of advanced machine learning with ensemble learning is anticipated to yield valuable avenues for research. Accelerated research in ensemble learning holds promising prospects for achieving more accurate and reliable battery SOH estimation under real-world conditions.
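A weighted ensemble, one of the perspectives the review above examines, can be illustrated with a short sketch. It assumes inverse-error weighting, where each base model's weight is proportional to the reciprocal of its validation error. This is one simple choice among the combination strategies such a review covers. The model errors and SOH predictions are hypothetical numbers.

```python
def inverse_error_weights(validation_errors):
    """Weights proportional to 1/error, normalized to sum to one.
    Assumes every validation error is strictly positive."""
    inverse = [1.0 / e for e in validation_errors]
    total = sum(inverse)
    return [v / total for v in inverse]


def weighted_ensemble(predictions, weights):
    """Combine base-model predictions into a single estimate."""
    return sum(p * w for p, w in zip(predictions, weights))


# Hypothetical validation RMSE and SOH predictions of three base models.
errors = [0.01, 0.02, 0.04]
soh_predictions = [0.90, 0.88, 0.84]

weights = inverse_error_weights(errors)
soh_estimate = weighted_ensemble(soh_predictions, weights)
```

Because the weights are non-negative and sum to one, the combined estimate always lies between the smallest and largest base prediction, and it is pulled toward the models with lower validation error.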
Abstract: Mental health is a significant issue worldwide, and the use of technology to support mental health is a growing trend. This trend aims to alleviate the workload on healthcare professionals and to aid individuals. Numerous applications have been developed to address the challenges in intelligent healthcare systems. However, because mental health data is sensitive, privacy concerns have emerged, and federated learning has attracted attention as a result. This research reviews studies on federated learning and mental health in the context of intelligent healthcare systems. It explores various dimensions of federated learning in mental health, such as datasets (their types and sources), applications categorized based on mental health symptoms, federated mental health frameworks, federated machine learning, federated deep learning, and the benefits of federated learning in mental health applications. This research conducts surveys to evaluate the current state of mental health applications, mainly focusing on the role of Federated Learning (FL) and related privacy and data security concerns. The survey provides valuable insights into how these applications are emerging and evolving, specifically emphasizing FL’s impact.