The Underwater Acoustic (UWA) channel is bandwidth-constrained and experiences doubly selective fading. It is challenging to acquire perfect channel knowledge for Orthogonal Frequency Division Multiplexing (OFDM) communications using a finite number of pilots. Deep Learning (DL) approaches, on the other hand, have been very successful in wireless OFDM communications, but whether they will work underwater remains an open question. For the first time, this paper compares two categories of DL-based UWA OFDM receivers: the Data-Driven (DD) method, which operates as an end-to-end black box, and the Model-Driven (MD) method, also known as the model-based data-driven method, which combines DL with expert OFDM receiver knowledge. The encoder-decoder framework and a Convolutional Neural Network (CNN) structure are employed to build the DD receiver, while an unfolding-based Minimum Mean Square Error (MMSE) structure is adopted for the MD receiver. We analyze the characteristics of the different receivers by Monte Carlo simulations under diverse communication conditions and propose a strategy for selecting a proper receiver under different communication scenarios. Field trials in a pool and at sea are also conducted to verify the feasibility and advantages of the DL receivers. The DL receivers are observed to perform better than conventional receivers in terms of bit error rate.
The distillation process is an important chemical process, and data-driven modelling has the potential to reduce model complexity compared with mechanistic modelling, thus improving the efficiency of process optimization and monitoring studies. However, the distillation process is highly nonlinear and has multiple uncertain perturbation intervals, which makes accurate data-driven modelling challenging. This paper proposes a systematic data-driven modelling framework to solve these problems. First, data segment variance is introduced into the K-means algorithm to form K-means data interval (KMDI) clustering, which clusters the data into perturbed and steady-state intervals for steady-state data extraction. Second, the maximal information coefficient (MIC) is employed to calculate the nonlinear correlation between variables in order to remove redundant features. Finally, extreme gradient boosting (XGBoost) is integrated as the base learner into adaptive boosting (AdaBoost), with an error threshold (ET) set to improve the weight update strategy, yielding a new ensemble learning algorithm, XGBoost-AdaBoost-ET. The superiority of the proposed framework is verified by applying it to a real industrial propylene distillation process.
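The first step of the framework above, separating steady-state from perturbed intervals using segment variance, can be illustrated with a minimal sketch. This is not the authors' KMDI algorithm: it assumes fixed-width, non-overlapping segments and a plain one-dimensional two-cluster K-means on the segment variances, and the names `segment_variances`, `two_means_1d`, and the toy `signal` are hypothetical.

```python
from statistics import pvariance


def segment_variances(series, width):
    """Variance of each non-overlapping segment of `width` samples."""
    return [
        pvariance(series[i:i + width])
        for i in range(0, len(series) - width + 1, width)
    ]


def two_means_1d(values, iterations=20):
    """Two-cluster K-means on scalars; label 0 = low variance, 1 = high variance."""
    lo, hi = min(values), max(values)  # initialise the centres at the extremes
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: attach each value to the nearer centre.
        labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        low_group = [v for v, lab in zip(values, labels) if lab == 0]
        high_group = [v for v, lab in zip(values, labels) if lab == 1]
        # Update step: move each centre to the mean of its group.
        if low_group:
            lo = sum(low_group) / len(low_group)
        if high_group:
            hi = sum(high_group) / len(high_group)
    return labels


# Toy measurement (hypothetical): steady, perturbed, then steady again.
signal = [5.0, 5.1, 5.0, 4.9, 5.0, 7.5, 3.0, 6.5, 5.0, 5.0, 5.1, 5.0]
labels = two_means_1d(segment_variances(signal, width=4))
print(labels)  # segments labelled 0 would be kept as steady-state data
```

In this sketch only the segments labelled 0 would be passed on to feature selection and model training; a real process would also need a principled choice of segment width.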
The burgeoning market for lithium-ion batteries has stimulated a growing need for more reliable battery performance monitoring. Accurate state-of-health (SOH) estimation is critical for ensuring battery operational performance. Despite numerous data-driven methods reported in existing research for battery SOH estimation, these methods often exhibit inconsistent performance across different application scenarios. To address this issue and overcome the performance limitations of individual data-driven models, integrating multiple models for SOH estimation has received considerable attention. Ensemble learning (EL) typically leverages the strengths of multiple base models to achieve more robust and accurate outputs. However, the lack of a clear review of current research hinders the further development of ensemble methods in SOH estimation. Therefore, this paper comprehensively reviews multi-model ensemble learning methods for battery SOH estimation. First, existing ensemble methods are systematically categorized into 6 classes based on their combination strategies. Different realizations and underlying connections are meticulously analyzed for each category of EL methods, highlighting distinctions, innovations, and typical applications. Subsequently, these ensemble methods are comprehensively compared in terms of base models, combination strategies, and publication trends. Evaluations across 6 dimensions underscore the outstanding performance of stacking-based ensemble methods. Following this, these ensemble methods are further inspected from the perspectives of weighted ensemble and diversity, aiming to inspire potential approaches for enhancing ensemble performance. Moreover, addressing challenges such as base model selection, measuring model robustness and uncertainty, and interpretability of ensemble models in practical applications is emphasized. Finally, future research prospects are outlined, specifically noting that deep learning ensemble is poised to advance ensemble methods for battery SOH estimation. The convergence of advanced machine learning with ensemble learning is anticipated to yield valuable avenues for research. Accelerated research in ensemble learning holds promising prospects for achieving more accurate and reliable battery SOH estimation under real-world conditions.
NJmat is a user-friendly, data-driven machine learning interface designed for materials design and analysis. The platform integrates advanced computational techniques, including natural language processing (NLP), large language models (LLM), machine learning potentials (MLP), and graph neural networks (GNN), to facilitate materials discovery. The platform has been applied in diverse materials research areas, including perovskite surface design, catalyst discovery, battery materials screening, structural alloy design, and molecular informatics. By automating feature selection, predictive modeling, and result interpretation, NJmat accelerates the development of high-performance materials across energy storage, conversion, and structural applications. Additionally, NJmat serves as an educational tool, allowing students and researchers to apply machine learning techniques in materials science with minimal coding expertise. Through automated feature extraction, genetic algorithms, and interpretable machine learning models, NJmat simplifies the workflow for materials informatics, bridging the gap between AI and experimental materials research. The latest version (available at https://figshare.com/articles/software/NJmatML/24607893 (accessed on 01 January 2025)) enhances its functionality by incorporating NJmatNLP, a module leveraging language models like MatBERT and those based on Word2Vec to support materials prediction tasks. By utilizing clustering and cosine similarity analysis with UMAP visualization, NJmat enables intuitive exploration of materials datasets. While NJmat primarily focuses on structure-property relationships and the discovery of novel chemistries, it can also assist in optimizing processing conditions when relevant parameters are included in the training data. By providing an accessible, integrated environment for machine learning-driven materials discovery, NJmat aligns with the objectives of the Materials Genome Initiative and promotes broader adoption of AI techniques in materials science.
For control systems with unknown model parameters, this paper proposes a data-driven iterative learning method for fault estimation. First, input and output data from the system under fault-free conditions are collected. By applying orthogonal triangular decomposition and singular value decomposition, a data-driven realization of the system's kernel representation is derived, and a residual generator is constructed from this representation. Then, the actuator fault signal is estimated online by analyzing the system's dynamic residual, and an iterative learning algorithm is introduced to continuously optimize the residual-based performance function, thereby enhancing estimation accuracy. The proposed method achieves actuator fault estimation without requiring knowledge of model parameters, eliminating the time-consuming system modeling process and allowing operators to focus on system optimization and decision-making. Compared with existing fault estimation methods, the proposed method demonstrates superior transient performance, steady-state performance, and real-time capability, while reducing the need for manual intervention and lowering operational complexity. Finally, experimental results on a mobile robot verify the effectiveness and advantages of the method.
During the past few decades, mobile wireless communications have experienced four generations of technological revolution, from 1G to 4G, and the deployment of the latest 5G networks is expected to take place in 2019. One fundamental question is how we can push forward the development of mobile wireless communications now that it has become an extremely complex and sophisticated system. We believe that the answer lies in the huge volumes of data produced by the network itself, and that machine learning may become a key to exploiting such information. In this paper, we elaborate on why the conventional model-based paradigm, which has been widely proven useful in pre-5G networks, can be less efficient or even less practical in future 5G and beyond mobile networks. Then, we explain how the data-driven paradigm, using state-of-the-art machine learning techniques, can become a promising solution. Finally, we provide a typical use case of the data-driven paradigm, namely proactive load balancing, in which online learning is used to adjust cell configurations in advance to avoid burst congestion caused by rapid traffic changes.
In the information age, blended teaching, which combines online and offline instruction, has become the mainstream of college teaching reform. In this teaching model, self-directed learning and cooperative learning are the two main learning approaches. On the online teaching platform, students mainly learn knowledge-based content through self-directed learning, while practising their language skills through cooperative learning in flipped classroom activities. On the one hand, the model advocates a student-centered strategy so as to improve students' autonomous learning ability; on the other hand, teachers serve as guides who organize the classroom activities and give timely feedback to students in order to promote their learning ability. In the blended teaching model, this mutually compatible and reinforcing combination of self-directed learning and cooperative learning is undoubtedly helpful for improving teaching efficiency.
Recently, orthogonal time frequency space (OTFS) was presented to alleviate severe Doppler effects in high-mobility scenarios. Most current OTFS detection schemes rely on perfect channel state information (CSI). However, in real-life systems, channel parameters change constantly and are often difficult to capture and describe. In this paper, we summarize the existing research on OTFS detection based on data-driven deep learning (DL) and propose three new network structures for OTFS detection: a residual network (ResNet), a dense network (DenseNet), and a residual dense network (RDN). Detection schemes based on the data-driven paradigm do not require a mathematically tractable model. Moreover, compared with the existing fully connected deep neural network (FC-DNN) and the standard convolutional neural network (CNN), these three new networks can alleviate the problems of exploding and vanishing gradients. Simulations show that RDN has the best performance among the three proposed schemes owing to its combination of shallow and deep features. RDN can resolve the performance loss that arises when a traditional network does not fully utilize all the hierarchical information.
Due to growing concerns regarding climate change and environmental protection, smart power generation has become essential for the economical and safe operation of both conventional thermal power plants and sustainable energy. Traditional first-principle model-based methods are becoming insufficient when faced with the ever-growing system scale and its various uncertainties. The burgeoning era of machine learning (ML) and data-driven control (DDC) techniques promises an improved alternative to these outdated methods. This paper reviews typical applications of ML and DDC at the level of monitoring, control, optimization, and fault detection of power generation systems, with a particular focus on uncovering how these methods can function in evaluating, counteracting, or withstanding the effects of the associated uncertainties. A holistic view is provided on the control techniques of smart power generation, from the regulation level to the planning level. The benefits of ML and DDC techniques are accordingly interpreted in terms of visibility, maneuverability, flexibility, profitability, and safety (abbreviated as the "5-TYs"), respectively. Finally, an outlook on future research and applications is presented.
Across multiple temporal and spatial scales, massive data from experiments, flow field measurements, and high-fidelity numerical simulations have greatly promoted the rapid development of fluid mechanics. Machine Learning (ML) provides a wealth of analysis methods for extracting information from large amounts of data, either for an in-depth understanding of the underlying flow mechanism or for further applications. Furthermore, machine learning algorithms can enhance flow information and automatically perform tasks that involve active flow control and optimization. This article provides an overview of the history, current development, and promising prospects of machine learning in the field of fluid mechanics. In addition, to facilitate understanding, this article outlines the basic principles of machine learning methods and their applications in engineering practice, turbulence models, flow field representation problems, and active flow control. In short, machine learning provides a powerful and more intelligent data processing architecture, and may greatly enrich the existing research methods and industrial applications of fluid mechanics.
We propose a method that integrates data-driven and mechanism models for well logging formation evaluation, focusing explicitly on predicting reservoir parameters such as porosity and water saturation. Accurately interpreting these parameters is crucial for effectively exploring and developing oil and gas. However, with the increasing complexity of geological conditions in this industry, there is a growing demand for improved accuracy in reservoir parameter prediction, leading to higher costs associated with manual interpretation. Conventional logging interpretation methods rely on empirical relationships between logging data and reservoir parameters, and they suffer from low interpretation efficiency, strong subjectivity, and suitability only for ideal conditions. The application of artificial intelligence to the interpretation of logging data provides a new solution to the problems of traditional methods and is expected to improve the accuracy and efficiency of interpretation. If large, high-quality datasets exist, data-driven models can reveal relationships of arbitrary complexity. Nevertheless, constructing sufficiently large logging datasets with reliable labels remains challenging, making it difficult to apply data-driven models effectively in logging data interpretation. Furthermore, data-driven models often act as "black boxes" that neither explain their predictions nor ensure compliance with basic physical constraints. This paper proposes a machine learning method with strong physical constraints by integrating mechanism and data-driven models. Prior knowledge of logging data interpretation is embedded into machine learning through the network structure, loss function, and optimization algorithm. We employ the Physically Informed Auto-Encoder (PIAE) to predict porosity and water saturation; it can be trained without labeled reservoir parameters using self-supervised learning techniques. This approach effectively achieves automated interpretation and facilitates generalization across diverse datasets.
A comprehensive and precise analysis of shale gas production performance is crucial for evaluating resource potential, designing a field development plan, and making investment decisions. However, quantitative analysis can be challenging because production performance is dominated by the complex interaction among a series of geological and engineering factors. In fact, each factor can be viewed as a player that makes cooperative contributions to the production payoff within the constraints of physical laws and models. Inspired by this idea, we propose a hybrid data-driven analysis framework in which the contributions of dominant factors are quantitatively evaluated, production is precisely forecasted, and development optimization suggestions are comprehensively generated. More specifically, game theory and machine learning models are coupled to determine the dominant geological and engineering factors. The Shapley value, which has a definite physical meaning, is employed to quantitatively measure the effects of individual factors. A multi-model-fused stacked model is trained for production forecasting, which provides the basis for derivative-free optimization algorithms to optimize the development plan. The complete workflow is validated with actual production data collected from the Fuling shale gas field, Sichuan Basin, China. The validation results show that the proposed procedure can draw rigorous conclusions with quantified evidence and thereby provide specific and reliable suggestions for development plan optimization. Compared with traditional and experience-based approaches, the hybrid data-driven procedure is more advanced in terms of both efficiency and accuracy.
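The idea of treating each factor as a player in a cooperative game can be made concrete with a small, exact Shapley value computation. The sketch below is illustrative only and is not the study's workflow: the three factor names and the `toy_payoff` function are hypothetical, whereas in the paper the payoff would come from a trained production model. The exact enumeration over all orderings is only feasible for a handful of players.

```python
from itertools import permutations


def shapley_values(players, payoff):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when it joins the current coalition.
            totals[p] += payoff(with_p) - payoff(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


def toy_payoff(coalition):
    """Hypothetical production payoff with one interaction term."""
    base = {"porosity": 2.0, "frac_stages": 3.0, "proppant": 1.0}
    value = sum(base[p] for p in coalition)
    if {"frac_stages", "proppant"} <= coalition:
        value += 2.0  # synergy only when both engineering factors are present
    return value


FACTORS = ["porosity", "frac_stages", "proppant"]
contributions = shapley_values(FACTORS, toy_payoff)
print(contributions)
```

The contributions always sum to the payoff of the full coalition, and the synergy term is split equally between the two factors that create it, which is the property that makes the attribution interpretable.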
In this paper, we present a novel data-driven design method for the human-robot interaction (HRI) system, where a given task is achieved by cooperation between the human and the robot. The presented HRI controller design is a two-level control design approach consisting of a task-oriented performance optimization design and a plant-oriented impedance controller design. The task-oriented design minimizes the human effort and guarantees perfect task tracking in the outer loop, while the plant-oriented design achieves the desired impedance from the human to the robot manipulator end-effector in the inner loop. Data-driven reinforcement learning techniques are used for performance optimization in the outer loop to assign the optimal impedance parameters. In the inner loop, a velocity-free filter is designed to avoid the requirement of end-effector velocity measurement. On this basis, an adaptive controller is designed to achieve the desired impedance of the robot manipulator in the task space. Simulations and experiments on a robot manipulator are conducted to verify the efficacy of the presented HRI design framework.
Over the last decades, whispering gallery mode based sensors have become a prominent solution for label-free sensing of various physical and chemical parameters. At the same time, widespread use of the approach is hindered by the restricted applicability of known configurations for quantifying ambient variations outside laboratory conditions and by their low affordability, where the need for spectrally resolved data collection is among the main limiting factors. In this paper we demonstrate the first realization of an affordable whispering gallery mode sensor powered by deep learning and multi-resonator imaging at a fixed frequency. The approach is shown to enable refractive index unit (RIU) prediction with an absolute error at the 3×10^(-6) level for a dynamic range of RIU variations from 0 to 2×10^(-3), with a temporal resolution of several milliseconds and an instrument-driven detection limit of 3×10^(-5). High sensing accuracy, together with instrumental affordability and production simplicity, places the reported detector among the most cost-effective realizations of the whispering gallery mode approach. The proposed solution is expected to have a great impact on shifting the whole sensing paradigm away from model-based solutions and toward flexible self-learning ones.
For unachievable tracking problems, where the system output cannot precisely track a given reference, achieving the best possible approximation of the reference trajectory becomes the objective. This study investigates solutions using the P-type learning control scheme. Initially, we demonstrate the necessity of gradient information for achieving the best approximation. Subsequently, we propose an input-output-driven learning gain design to handle the imprecise gradients of a class of uncertain systems. However, it is found that the desired performance may not be attainable when information is incomplete. To address this issue, an extended iterative learning control scheme is introduced. In this scheme, the tracking errors are modified through output data sampling, which has a low memory footprint and offers flexibility in learning gain design. The input sequence is shown to converge towards the desired input, resulting in an output that is closest to the given reference in the least-squares sense. Numerical simulations are provided to validate the theoretical findings.
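The convergence claim above can be illustrated with a minimal sketch of a P-type update, u_{k+1} = u_k + L e_k, on an unachievable reference. Several things here are assumptions made for illustration and are not taken from the study: the plant is a known static lifted map y = G u, the learning gain is the gradient-type choice L = gamma * G^T, and the names `p_type_ilc`, `G`, and `Y_REF` are hypothetical.

```python
def matvec(matrix, vector):
    """Matrix-vector product on plain nested lists."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def p_type_ilc(G, y_ref, gamma, iterations):
    """Iterate u <- u + gamma * G^T (y_ref - G u), starting from u = 0."""
    u = [0.0] * len(G[0])
    G_t = transpose(G)
    for _ in range(iterations):
        y = matvec(G, u)                               # output of this trial
        error = [r - yi for r, yi in zip(y_ref, y)]    # tracking error e_k
        correction = matvec(G_t, error)                # gradient direction
        u = [ui + gamma * ci for ui, ci in zip(u, correction)]
    return u


# One input drives two outputs identically, so y_ref = (1, 3) is unachievable.
G = [[1.0], [1.0]]
Y_REF = [1.0, 3.0]
u_final = p_type_ilc(G, Y_REF, gamma=0.25, iterations=200)
print(u_final)  # approaches the least-squares input u = 2
```

For this toy plant the update reduces to u_{k+1} = 0.5 u_k + 1, whose fixed point u = 2 is exactly the least-squares solution; the tracking error does not vanish, it settles at the smallest achievable value.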
The era of big data is arriving. Combining big data with traditional teaching can provide more accurate services for students' self-learning, and it is a good way to teach students according to their aptitude. Against this background, a learning society is emerging, one aimed at learning, autonomous learning, and lifelong learning. A learning society places unprecedented emphasis on students' capacity for learning autonomy. Learning is no longer limited to the campus; learning ability will accompany learners throughout their social lives and become an active and healthy lifelong activity. Autonomous learning is a learning theory that meets the requirements of the times and has broad development prospects. The study of autonomous learning not only has important guiding significance for educational and teaching practice in China, but also plays an important role in the life development of every student. The focus of learning is gradually shifting from the classroom, teachers, and textbooks to the students themselves. Teachers should not only impart knowledge and answer questions but also, most importantly, teach students how to exercise their autonomy in autonomous learning. After investigating the existing monitoring model of autonomous English learning in colleges and universities, our group found that, in practice, corresponding monitoring mechanisms and means are lacking, and autonomous learning has gradually become a mere formality. Therefore, based on the actual situation of autonomous English learning in Chinese universities, the monitoring model of autonomous English learning has been reconstructed, and a comprehensive evaluation system has been established to effectively improve students' English learning ability.
Text mining has emerged as a powerful strategy for extracting domain knowledge structure from large amounts of text data. To date, most text mining methods are restricted to specific literature information, resulting in incomplete knowledge graphs. Here, we report a method that combines citation analysis with topic modeling to describe the hidden development patterns in the history of science. Leveraging this method, we construct a knowledge graph in the field of Raman spectroscopy. The traditional Latent Dirichlet Allocation model is chosen as the baseline for validating the performance of our model. Our method improves topic coherence with a minimum growth rate of 100% compared with the traditional text mining method. It also outperforms the traditional method on diversity, with a growth rate ranging from 0 to 126%. The results show the effectiveness of the rule-based tokenizer we designed in solving the word tokenization problem caused by entity naming rules in the field of chemistry. The method is versatile in revealing the distribution of topics, establishing similarity and inheritance relationships, and identifying the important moments in the history of Raman spectroscopy. Our work provides a comprehensive tool for science of science research and promises to offer new insights into the historical survey and development forecast of a research field.
In the era of big data, reinforcement learning (RL) has emerged as a powerful data-driven optimization approach in materials science, enabling unprecedented advances in material design and performance improvement. Unlike traditional trial-and-error and physics-based approaches, RL agents autonomously identify optimal strategies across high-dimensional and dynamic design spaces by iterative interactions with complex environments. This capability makes RL especially effective for target optimization and sequential decision-making in challenging materials science problems. In this review, we present a comprehensive overview of fundamental RL algorithms, including Q-learning, deep Q-networks (DQN), actor-critic methods, and deep deterministic policy gradient (DDPG). Then, the core mechanisms, advantages, limitations, and representative applications of RL in materials discovery, property optimization, process control, and manufacturing are discussed systematically. Lastly, key future research directions and opportunities are outlined. The perspectives presented herein aim to foster interdisciplinary collaboration and drive innovation at the frontier of AI-driven materials science.
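Of the algorithms listed above, tabular Q-learning is the simplest to show in full. The sketch below is a generic illustration, not an example from the review: the five-state chain environment, the reward of 1 at the last state, and all parameter values are hypothetical stand-ins for a sequential design problem, and the agent explores with a uniformly random behaviour policy, which Q-learning permits because it is off-policy.

```python
import random

N_STATES = 5         # states 0..4; state 4 is the terminal "target" state
ACTIONS = (-1, +1)   # step left / step right along the chain


def step(state, action):
    """Deterministic toy environment: reward 1 on reaching the last state."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train_q_table(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            a = rng.randrange(len(ACTIONS))   # uniform random exploration
            nxt, reward, done = step(state, ACTIONS[a])
            # Q-learning target: bootstrap from the best next-state value.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q


q_table = train_q_table()
# Greedy action in every non-terminal state after training.
greedy_policy = [ACTIONS[row.index(max(row))] for row in q_table[:-1]]
print(greedy_policy)
```

The learned greedy policy moves right in every state, because the value of stepping right, gamma raised to the number of remaining steps, exceeds the value of any detour. Deep variants such as DQN replace the table with a neural network but keep the same target.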
Perovskite solar cells (PSCs) have developed rapidly, positioning them as potential candidates for next-generation renewable energy sources. However, conventional trial-and-error approaches and the vast compositional parameter space continue to pose challenges in the pursuit of exceptional performance and high stability in perovskite-based optoelectronics. The increasing demand for novel materials in optoelectronic devices and the establishment of substantial databases have enabled data-driven machine learning (ML) approaches to advance swiftly in the materials field. This review succinctly outlines the fundamental ML procedures, techniques, and recent breakthroughs, particularly in predicting the physical characteristics of perovskite materials. Moreover, it highlights research endeavors aimed at optimizing and screening materials to enhance the efficiency and stability of PSCs. Additionally, this review highlights recent efforts to use characterization data for ML and to explore its correlations with material properties and device performance, an area that is actively researched but has yet to receive significant attention. Lastly, we provide future perspectives, such as leveraging Large Language Models (LLMs) and text mining, to expedite the discovery of novel perovskite materials and expand their use across various optoelectronic fields.
Funding: Funded in part by the National Natural Science Foundation of China under Grants 62401167 and 62192712, in part by the Key Laboratory of Marine Environmental Survey Technology and Application, Ministry of Natural Resources, P.R. China under Grant MESTA-2023-B001, and in part by the Stable Supporting Fund of the National Key Laboratory of Underwater Acoustic Technology under Grant JCKYS2022604SSJS007.
Abstract: The Underwater Acoustic (UWA) channel is bandwidth-constrained and experiences doubly selective fading. It is challenging to acquire perfect channel knowledge for Orthogonal Frequency Division Multiplexing (OFDM) communications using a finite number of pilots. Deep Learning (DL) approaches, on the other hand, have been very successful in wireless OFDM communications, but whether they will work underwater remains an open question. For the first time, this paper compares two categories of DL-based UWA OFDM receivers: the Data-Driven (DD) method, which operates as an end-to-end black box, and the Model-Driven (MD) method, also known as the model-based data-driven method, which combines DL with expert OFDM receiver knowledge. The encoder-decoder framework and a Convolutional Neural Network (CNN) structure are employed to build the DD receiver, while an unfolding-based Minimum Mean Square Error (MMSE) structure is adopted for the MD receiver. We analyze the characteristics of the different receivers by Monte Carlo simulations under diverse communication conditions and propose a strategy for selecting a proper receiver under different communication scenarios. Field trials in a pool and at sea are also conducted to verify the feasibility and advantages of the DL receivers. The DL receivers are observed to perform better than conventional receivers in terms of bit error rate.
Funding: Supported by the National Key Research and Development Program of China (2023YFB3307801); the National Natural Science Foundation of China (62394343, 62373155, 62073142); the Major Science and Technology Project of Xinjiang (No. 2022A01006-4); the Programme of Introducing Talents of Discipline to Universities (the 111 Project) under Grant B17017; the Fundamental Research Funds for the Central Universities, Science Foundation of China University of Petroleum, Beijing (No. 2462024YJRC011); and the Open Research Project of the State Key Laboratory of Industrial Control Technology, China (Grant No. ICT2024B70).
Abstract: Distillation is an important chemical process, and data-driven modelling has the potential to reduce model complexity compared with mechanistic modelling, thus improving the efficiency of process optimization and monitoring studies. However, the distillation process is highly nonlinear and has multiple uncertain perturbation intervals, which makes accurate data-driven modelling challenging. This paper proposes a systematic data-driven modelling framework to solve these problems. First, data segment variance is introduced into the K-means algorithm to form K-means data interval (KMDI) clustering, which clusters the data into perturbed and steady-state intervals for steady-state data extraction. Second, the maximal information coefficient (MIC) is employed to calculate the nonlinear correlation between variables in order to remove redundant features. Finally, extreme gradient boosting (XGBoost) is integrated as the base learner into adaptive boosting (AdaBoost), with an error threshold (ET) set to improve the weight update strategy, to construct a new ensemble learning algorithm, XGBoost-AdaBoost-ET. The superiority of the proposed framework is verified by applying it to a real industrial propylene distillation process.
Funding: National Natural Science Foundation of China (52075420); Fundamental Research Funds for the Central Universities (xzy022023049); National Key Research and Development Program of China (2023YFB3408600).
Abstract: The burgeoning market for lithium-ion batteries has stimulated a growing need for more reliable battery performance monitoring. Accurate state-of-health (SOH) estimation is critical for ensuring battery operational performance. Despite the numerous data-driven methods reported for battery SOH estimation, these methods often exhibit inconsistent performance across different application scenarios. To address this issue and overcome the performance limitations of individual data-driven models, integrating multiple models for SOH estimation has received considerable attention. Ensemble learning (EL) typically leverages the strengths of multiple base models to achieve more robust and accurate outputs. However, the lack of a clear review of current research hinders the further development of ensemble methods in SOH estimation. Therefore, this paper comprehensively reviews multi-model ensemble learning methods for battery SOH estimation. First, existing ensemble methods are systematically categorized into six classes based on their combination strategies. Different realizations and underlying connections are analyzed for each category of EL methods, highlighting distinctions, innovations, and typical applications. Subsequently, these ensemble methods are compared in terms of base models, combination strategies, and publication trends. Evaluations across six dimensions underscore the outstanding performance of stacking-based ensemble methods. These ensemble methods are then further examined from the perspectives of weighted ensembles and diversity, with the aim of inspiring potential approaches for enhancing ensemble performance. Moreover, the need to address challenges in practical applications, such as base model selection, the measurement of model robustness and uncertainty, and the interpretability of ensemble models, is emphasized. Finally, future research prospects are outlined, specifically noting that deep learning ensembles are poised to advance ensemble methods for battery SOH estimation. The convergence of advanced machine learning with ensemble learning is anticipated to yield valuable avenues for research. Accelerated research in ensemble learning holds promising prospects for achieving more accurate and reliable battery SOH estimation under real-world conditions.
Funding: Supported by the Jiangsu Provincial Science and Technology Project Basic Research Program (Natural Science Foundation of Jiangsu Province) (No. BK20211283).
Abstract: NJmat is a user-friendly, data-driven machine learning interface designed for materials design and analysis. The platform integrates advanced computational techniques, including natural language processing (NLP), large language models (LLM), machine learning potentials (MLP), and graph neural networks (GNN), to facilitate materials discovery. The platform has been applied in diverse materials research areas, including perovskite surface design, catalyst discovery, battery materials screening, structural alloy design, and molecular informatics. By automating feature selection, predictive modeling, and result interpretation, NJmat accelerates the development of high-performance materials across energy storage, conversion, and structural applications. Additionally, NJmat serves as an educational tool, allowing students and researchers to apply machine learning techniques in materials science with minimal coding expertise. Through automated feature extraction, genetic algorithms, and interpretable machine learning models, NJmat simplifies the workflow for materials informatics, bridging the gap between AI and experimental materials research. The latest version (available at https://figshare.com/articles/software/NJmatML/24607893 (accessed on 01 January 2025)) enhances its functionality by incorporating NJmatNLP, a module leveraging language models such as MatBERT and those based on Word2Vec to support materials prediction tasks. By utilizing clustering and cosine similarity analysis with UMAP visualization, NJmat enables intuitive exploration of materials datasets. While NJmat primarily focuses on structure-property relationships and the discovery of novel chemistries, it can also assist in optimizing processing conditions when relevant parameters are included in the training data. By providing an accessible, integrated environment for machine learning-driven materials discovery, NJmat aligns with the objectives of the Materials Genome Initiative and promotes broader adoption of AI techniques in materials science.
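The cosine similarity analysis mentioned in the abstract can be illustrated with a small sketch. This is not NJmat's code or API: the function names and the three-dimensional "embeddings" are made-up placeholders standing in for vectors that a language model such as Word2Vec would produce.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, embeddings):
    """Return the other keys sorted from most to least similar to `query`."""
    scores = {
        name: cosine_similarity(embeddings[query], vec)
        for name, vec in embeddings.items()
        if name != query
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy embeddings; real ones would have hundreds of dimensions.
toy_embeddings = {
    "CsPbI3": [0.9, 0.1, 0.0],
    "MAPbI3": [0.8, 0.2, 0.1],
    "LiFePO4": [0.1, 0.9, 0.3],
}

ranking = rank_by_similarity("CsPbI3", toy_embeddings)
```

Because cosine similarity depends only on direction, not on vector length, it is a common choice for comparing embeddings whose magnitudes vary with term frequency.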
Funding: Supported by the Shandong Provincial Taishan Scholar Program (Grant No. tsqn202312133); the Shandong Provincial Natural Science Foundation (Grant Nos. ZR2022YQ61, ZR2023ZD32); and the National Natural Science Foundation of China (Grant Nos. 61772551 and 62111530052).
Abstract: For control systems with unknown model parameters, this paper proposes a data-driven iterative learning method for fault estimation. First, input and output data from the system under fault-free conditions are collected. By applying orthogonal triangular (QR) decomposition and singular value decomposition, a data-driven realization of the system's kernel representation is derived; based on this representation, a residual generator is constructed. Then, the actuator fault signal is estimated online by analyzing the system's dynamic residual, and an iterative learning algorithm is introduced to continuously optimize the residual-based performance function, thereby enhancing estimation accuracy. The proposed method achieves actuator fault estimation without requiring knowledge of model parameters, eliminating the time-consuming system modeling process and allowing operators to focus on system optimization and decision-making. Compared with existing fault estimation methods, the proposed method demonstrates superior transient performance, steady-state performance, and real-time capability, and it reduces the need for manual intervention and lowers operational complexity. Finally, experimental results on a mobile robot verify the effectiveness and advantages of the method.
Funding: Partially supported by the National Natural Science Foundation of China (61751306, 61801208, 61671233); the Jiangsu Science Foundation (BK20170650); the Postdoctoral Science Foundation of China (BX201700118, 2017M621712); the Jiangsu Postdoctoral Science Foundation (1701118B); and the Fundamental Research Funds for the Central Universities (021014380094).
Abstract: During the past few decades, mobile wireless communications have experienced four generations of technological revolution, from 1G to 4G, and the deployment of the latest 5G networks is expected to take place in 2019. One fundamental question is how to push forward the development of mobile wireless communications now that it has become an extremely complex and sophisticated system. We believe that the answer lies in the huge volumes of data produced by the network itself, and that machine learning may become a key to exploiting such information. In this paper, we elaborate on why the conventional model-based paradigm, which has proven widely useful in pre-5G networks, can be less efficient or even less practical in 5G and beyond mobile networks. Then, we explain how the data-driven paradigm, using state-of-the-art machine learning techniques, can become a promising solution. Finally, we provide a typical use case of the data-driven paradigm, namely proactive load balancing, in which online learning is used to adjust cell configurations in advance to avoid burst congestion caused by rapid traffic changes.
Abstract: In the information age, blended teaching, which combines online and offline instruction, has become the mainstream of college teaching reform. In this teaching model, self-directed learning and cooperative learning are the two main learning approaches. On the online teaching platform, students mainly learn knowledge-based content through self-directed learning, while practising their language skills through cooperative learning in flipped classroom activities. On the one hand, the model advocates a student-centered strategy in order to improve students' autonomous learning ability; on the other hand, teachers serve as guides who organize the classroom activities and give timely feedback to students in order to promote their learning ability. In the blended teaching model, this mutually compatible and reinforcing combination of self-directed learning and cooperative learning is undoubtedly helpful in improving teaching efficiency.
Funding: Supported by the Beijing Natural Science Foundation (L223025); the National Natural Science Foundation of China (62201067); and the R&D Program of Beijing Municipal Education Commission (KM202211232008).
Abstract: Recently, orthogonal time frequency space (OTFS) modulation was presented to alleviate severe Doppler effects in high-mobility scenarios. Most current OTFS detection schemes rely on perfect channel state information (CSI). However, in real-life systems, channel parameters change constantly and are often difficult to capture and describe. In this paper, we summarize the existing research on OTFS detection based on data-driven deep learning (DL) and propose three new network structures for OTFS detection: a residual network (ResNet), a dense network (DenseNet), and a residual dense network (RDN). Detection schemes based on data-driven paradigms do not require a mathematically tractable model. Meanwhile, compared with the existing fully connected deep neural network (FC-DNN) and the standard convolutional neural network (CNN), these three new networks can alleviate the problems of exploding and vanishing gradients. Simulations show that RDN performs best among the three proposed schemes because it combines shallow and deep features. RDN can resolve the performance loss caused by traditional networks not fully utilizing all the hierarchical information.
Abstract: Due to growing concerns regarding climate change and environmental protection, smart power generation has become essential for the economical and safe operation of both conventional thermal power plants and sustainable energy. Traditional first-principle model-based methods are becoming insufficient when faced with the ever-growing system scale and its various uncertainties. The burgeoning era of machine learning (ML) and data-driven control (DDC) techniques promises an improved alternative to these outdated methods. This paper reviews typical applications of ML and DDC at the level of monitoring, control, optimization, and fault detection of power generation systems, with a particular focus on uncovering how these methods can function in evaluating, counteracting, or withstanding the effects of the associated uncertainties. A holistic view is provided on the control techniques of smart power generation, from the regulation level to the planning level. The benefits of ML and DDC techniques are accordingly interpreted in terms of visibility, maneuverability, flexibility, profitability, and safety (abbreviated as the "5-TYs"). Finally, an outlook on future research and applications is presented.
Funding: Supported by the National Natural Science Foundation of China (No. 11972139).
Abstract: Massive data spanning multiple temporal and spatial scales, obtained from experiments, flow field measurements, and high-fidelity numerical simulations, have greatly promoted the rapid development of fluid mechanics. Machine Learning (ML) provides a wealth of analysis methods for extracting potential information from large amounts of data, whether for an in-depth understanding of the underlying flow mechanisms or for further applications. Furthermore, machine learning algorithms can enhance flow information and automatically perform tasks that involve active flow control and optimization. This article provides an overview of the history, current development, and promising prospects of machine learning in the field of fluid mechanics. In addition, to facilitate understanding, this article outlines the basic principles of machine learning methods and their applications in engineering practice, turbulence models, flow field representation problems, and active flow control. In short, machine learning provides a powerful and more intelligent data processing architecture, and may greatly enrich the existing research methods and industrial applications of fluid mechanics.
Funding: Supported by the National Key Research and Development Program (2019YFA0708301); the National Natural Science Foundation of China (51974337); the Strategic Cooperation Projects of CNPC and CUPB (ZLZX2020-03); the Science and Technology Innovation Fund of CNPC (2021DQ02-0403); and the Open Fund of Petroleum Exploration and Development Research Institute of CNPC (2022-KFKT-09).
Abstract: We propose a method that integrates data-driven and mechanism models for well logging formation evaluation, focusing explicitly on predicting reservoir parameters such as porosity and water saturation. Accurately interpreting these parameters is crucial for effectively exploring and developing oil and gas. However, with the increasing complexity of geological conditions in this industry, there is a growing demand for improved accuracy in reservoir parameter prediction, leading to higher costs associated with manual interpretation. Conventional logging interpretation methods rely on empirical relationships between logging data and reservoir parameters, and they suffer from low interpretation efficiency, strong subjectivity, and suitability only for ideal conditions. The application of artificial intelligence to logging data interpretation provides a new solution to the problems of traditional methods and is expected to improve the accuracy and efficiency of interpretation. If large, high-quality datasets exist, data-driven models can reveal relationships of arbitrary complexity. Nevertheless, constructing sufficiently large logging datasets with reliable labels remains challenging, making it difficult to apply data-driven models effectively in logging data interpretation. Furthermore, data-driven models often act as "black boxes" that neither explain their predictions nor ensure compliance with basic physical constraints. This paper proposes a machine learning method with strong physical constraints by integrating mechanism and data-driven models. Prior knowledge of logging data interpretation is embedded into machine learning through the network structure, loss function, and optimization algorithm. We employ the Physically Informed Auto-Encoder (PIAE) to predict porosity and water saturation; it can be trained without labeled reservoir parameters using self-supervised learning techniques. This approach effectively achieves automated interpretation and facilitates generalization across diverse datasets.
Funding: This work was supported by the National Natural Science Foundation of China (Grant No. 42050104) and the Science Foundation of SINOPEC Group (Grant No. P20030).
Abstract: A comprehensive and precise analysis of shale gas production performance is crucial for evaluating resource potential, designing a field development plan, and making investment decisions. However, quantitative analysis can be challenging because production performance is dominated by the complex interaction among a series of geological and engineering factors. In fact, each factor can be viewed as a player who makes cooperative contributions to the production payoff within the constraints of physical laws and models. Inspired by this idea, we propose a hybrid data-driven analysis framework in this study, in which the contributions of dominant factors are quantitatively evaluated, production is precisely forecasted, and development optimization suggestions are comprehensively generated. More specifically, game theory and machine learning models are coupled to determine the dominant geological and engineering factors. The Shapley value, which has a definite physical meaning, is employed to quantitatively measure the effects of individual factors. A multi-model-fused stacked model is trained for production forecasting, which provides the basis for derivative-free optimization algorithms to optimize the development plan. The complete workflow is validated with actual production data collected from the Fuling shale gas field, Sichuan Basin, China. The validation results show that the proposed procedure can draw rigorous conclusions with quantified evidence and thereby provide specific and reliable suggestions for development plan optimization. Compared with traditional, experience-based approaches, the hybrid data-driven procedure is more advanced in terms of both efficiency and accuracy.
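The idea of treating each factor as a player in a cooperative game can be made concrete with a small sketch of the exact Shapley value, computed by averaging each player's marginal contribution over all orderings. The factor names and payoff numbers below are invented for illustration and do not come from the Fuling data; in the paper's setting the payoff would be supplied by a trained machine learning model rather than a hand-written function.

```python
from itertools import permutations


def shapley_values(players, payoff):
    """Exact Shapley values: mean marginal contribution over all player orderings.

    players: list of hashable player identifiers
    payoff:  function mapping a frozenset of players to a number
    """
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += payoff(with_p) - payoff(coalition)
            coalition = with_p
    return {p: totals[p] / len(orderings) for p in players}


def toy_payoff(coalition):
    """Hypothetical production payoff: additive terms plus one interaction term."""
    base = {"porosity": 10.0, "frac_stages": 20.0, "proppant": 5.0}
    value = sum(base[f] for f in coalition)
    if {"frac_stages", "proppant"} <= coalition:
        value += 6.0  # made-up synergy between two engineering factors
    return value


factors = ["porosity", "frac_stages", "proppant"]
contributions = shapley_values(factors, toy_payoff)
```

In this toy game the synergy of 6.0 is split equally between the two interacting factors, and the contributions sum to the payoff of the full coalition, which is the efficiency property that makes Shapley values attractive for attribution. Exact enumeration grows factorially with the number of players, so practical tools rely on sampling or model-specific approximations.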
Funding: This work was supported in part by the National Natural Science Foundation of China (61903028); the Youth Innovation Promotion Association, Chinese Academy of Sciences (2020137); the Lifelong Learning Machines Program from DARPA/Microsystems Technology Office; and the Army Research Laboratory (W911NF-18-2-0260).
Abstract: In this paper, we present a novel data-driven design method for the human-robot interaction (HRI) system, where a given task is achieved through cooperation between the human and the robot. The presented HRI controller design is a two-level control design approach consisting of a task-oriented performance optimization design and a plant-oriented impedance controller design. The task-oriented design minimizes the human effort and guarantees perfect task tracking in the outer loop, while the plant-oriented design achieves the desired impedance from the human to the robot manipulator end-effector in the inner loop. Data-driven reinforcement learning techniques are used for performance optimization in the outer loop to assign the optimal impedance parameters. In the inner loop, a velocity-free filter is designed to avoid the requirement of end-effector velocity measurement. On this basis, an adaptive controller is designed to achieve the desired impedance of the robot manipulator in the task space. Simulations and experiments on a robot manipulator are conducted to verify the efficacy of the presented HRI design framework.
Abstract: During the last decades, whispering gallery mode based sensors have become a prominent solution for label-free sensing of various physical and chemical parameters. At the same time, widespread use of the approach is hindered by the limited applicability of known configurations for quantifying ambient variations outside laboratory conditions and by their low affordability, with the need for spectrally resolved data collection among the main limiting factors. In this paper, we demonstrate the first realization of an affordable whispering gallery mode sensor powered by deep learning and multi-resonator imaging at a fixed frequency. It is shown that the approach enables refractive index unit (RIU) prediction with an absolute error at the 3×10^(-6) level for a dynamic range of RIU variations from 0 to 2×10^(-3), with a temporal resolution of several milliseconds and an instrument-driven detection limit of 3×10^(-5). High sensing accuracy, together with instrumental affordability and production simplicity, places the reported detector among the most cost-effective realizations of the whispering gallery mode approach. The proposed solution is expected to have a great impact on the shift of the whole sensing paradigm away from model-based solutions and toward flexible self-learning solutions.
Funding: Supported by the National Natural Science Foundation of China (62173333, 12271522); the Beijing Natural Science Foundation (Z210002); and the Research Fund of Renmin University of China (2021030187).
Abstract: For unachievable tracking problems, where the system output cannot precisely track a given reference, achieving the best possible approximation of the reference trajectory becomes the objective. This study investigates solutions using the P-type learning control scheme. Initially, we demonstrate the necessity of gradient information for achieving the best approximation. Subsequently, we propose an input-output-driven learning gain design to handle the imprecise gradients of a class of uncertain systems. However, it is discovered that the desired performance may not be attainable when faced with incomplete information. To address this issue, an extended iterative learning control scheme is introduced. In this scheme, the tracking errors are modified through output data sampling, which has a low memory footprint and offers flexibility in learning gain design. The input sequence is shown to converge towards the desired input, resulting in an output that is closest to the given reference in the least-squares sense. Numerical simulations are provided to validate the theoretical findings.
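As background for the P-type scheme named in the abstract, here is a minimal sketch of the classic P-type iterative learning control update, in which the input for the next trial is the current input plus a gain times the tracking error. It uses an achievable reference on a made-up first-order plant, so it illustrates only the baseline update and not the paper's extended scheme for unachievable references; the plant parameters `a`, `b`, the gain, and the function names are assumptions.

```python
def simulate(inputs, a=0.5, b=1.0):
    """Run one trial of the toy plant x(t+1) = a*x(t) + b*u(t), y(t+1) = x(t+1), x(0) = 0."""
    x = 0.0
    outputs = []
    for u in inputs:
        x = a * x + b * u
        outputs.append(x)  # outputs[t] is y(t+1), the response to inputs[t]
    return outputs


def p_type_ilc(reference, gain=0.8, iterations=30):
    """Repeat the trial, updating u_{k+1}(t) = u_k(t) + gain * e_k(t+1) after each one."""
    inputs = [0.0] * len(reference)
    max_errors = []
    for _ in range(iterations):
        outputs = simulate(inputs)
        errors = [r - y for r, y in zip(reference, outputs)]
        max_errors.append(max(abs(e) for e in errors))
        inputs = [u + gain * e for u, e in zip(inputs, errors)]
    return inputs, max_errors


reference = [1.0] * 10  # illustrative constant reference over a 10-step trial
learned_inputs, error_history = p_type_ilc(reference)
```

For this plant the update contracts the error whenever |1 - gain * b| < 1, which holds for the values chosen here, so `error_history` decays toward zero across trials.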
Abstract: The era of big data is arriving. Combining big data with traditional teaching can provide increasingly accurate services for students' self-learning, and it is a good way to teach students according to their aptitude. Against this background, a learning society is emerging, one that aims at learning, autonomous learning, and lifelong learning. A learning society places unprecedented emphasis on students' capacity for autonomous learning. Learning is no longer limited to the campus. Learning ability will accompany learners throughout their social lives and become an active and healthy lifelong activity. Autonomous learning is a learning theory that meets the requirements of the times and has broad development prospects. The study of autonomous learning not only has important guiding significance for educational and teaching practice in China, but also plays an important role in the life development of every student. The focus of learning is gradually shifting from the classroom, teachers, and textbooks to the students themselves. Teachers should not only impart knowledge and answer questions but also, most importantly, teach students how to exercise their autonomy in autonomous learning. After investigating the existing monitoring model of autonomous English learning in colleges and universities, our group found that, in practice, corresponding monitoring mechanisms and means are lacking, and autonomous learning has gradually become a formality. Therefore, according to the actual situation of autonomous English learning in Chinese universities, the monitoring model of autonomous English learning has been reconstructed, and an effective comprehensive evaluation system has been established to improve students' English learning ability.
Funding: Supported by the National Natural Science Foundation of China (T2222002, 22032004); the Fundamental Research Funds for the Central Universities (Xiamen University: No. 20720240053); and the State Key Laboratory of Vaccines for Infectious Diseases, Xiang An Biomedicine Laboratory (2023XAKJ0103074).
Abstract: Text mining has emerged as a powerful strategy for extracting domain knowledge structure from large amounts of text data. To date, most text mining methods are restricted to specific literature information, resulting in incomplete knowledge graphs. Here, we report a method that combines citation analysis with topic modeling to describe the hidden development patterns in the history of science. Leveraging this method, we construct a knowledge graph in the field of Raman spectroscopy. The traditional Latent Dirichlet Allocation model is chosen as the baseline for comparison to validate the performance of our model. Our method improves topic coherence by at least 100% compared with the traditional text mining method. It also outperforms the traditional method on diversity, with improvements ranging from 0 to 126%. The results show the effectiveness of the rule-based tokenizer we designed in solving the word tokenization problem caused by entity naming rules in the field of chemistry. The method is versatile in revealing the distribution of topics, establishing similarity and inheritance relationships, and identifying important moments in the history of Raman spectroscopy. Our work provides a comprehensive tool for science-of-science research and promises to offer new insights into the historical survey and development forecast of a research field.
Funding: Supported by the National Natural Science Foundation of China (Nos. 52571028, 52301029); the Fundamental Research Funds for the Central Universities (No. 06500165); the Guangdong Basic and Applied Basic Research Foundation (No. 2022A1515140006); the AVIC Heavy Machinery Innovation Fund (ZJQT-2025-06); and the Young Elite Scientists Sponsorship Program by CAST (No. 2023QNRC001).
Abstract: In the era of big data, reinforcement learning (RL) has emerged as a powerful data-driven optimization approach in materials science, enabling unprecedented advances in material design and performance improvement. Unlike traditional trial-and-error and physics-based approaches, RL agents autonomously identify optimal strategies across high-dimensional and dynamic design spaces through iterative interactions with complex environments. This capability makes RL especially effective for target optimization and sequential decision-making in challenging materials science problems. In this review, we present a comprehensive overview of fundamental RL algorithms, including Q-learning, deep Q-networks (DQN), actor-critic methods, and deep deterministic policy gradient (DDPG). Then, the core mechanisms, advantages, limitations, and representative applications of RL in materials discovery, property optimization, process control, and manufacturing are discussed systematically. Lastly, key future research directions and opportunities are outlined. The perspectives presented herein aim to foster interdisciplinary collaboration and drive innovation at the frontier of AI-driven materials science.
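Of the algorithms listed in the abstract, Q-learning is the simplest to show in full. The sketch below runs tabular Q-learning on a made-up five-state chain in which the agent earns a reward of 1 for reaching the right-hand end; the environment, hyperparameters, and names are illustrative assumptions and have no connection to any materials problem in the review. A uniformly random behavior policy is used for exploration, which is valid because Q-learning is off-policy.

```python
import random

N_STATES = 5       # states 0..4; state 4 is the terminal goal
ACTIONS = (0, 1)   # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic chain dynamics with a reward of 1.0 on reaching the goal."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def q_learning(episodes=300, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # cap on episode length
            action = rng.choice(ACTIONS)  # random exploration (off-policy)
            nxt, reward, done = step(state, action)
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


q_table = q_learning()
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

After training, the greedy policy moves right in every non-terminal state, and the learned values decay by a factor of `gamma` per step of distance from the goal. DQN replaces the table with a neural network so that the same update can be applied to large or continuous state spaces.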
Funding: Supported by the Ministry of Science and ICT (MSIT) of the Republic of Korea (00302646) and by the National Research Foundation of Korea grant funded by the Korean Government (MSIT) (NRF-2022R1A4A1019296, 1345374646, 2022M3J1A1064315).
Abstract: Perovskite solar cells (PSCs) have developed rapidly, positioning them as potential candidates for next-generation renewable energy sources. However, conventional trial-and-error approaches and the vast compositional parameter space continue to pose challenges in the pursuit of exceptional performance and high stability of perovskite-based optoelectronics. The increasing demand for novel materials in optoelectronic devices and the establishment of substantial databases have enabled data-driven machine learning (ML) approaches to advance swiftly in the materials field. This review succinctly outlines the fundamental ML procedures, techniques, and recent breakthroughs, particularly in predicting the physical characteristics of perovskite materials. Moreover, it highlights research endeavors aimed at optimizing and screening materials to enhance the efficiency and stability of PSCs. Additionally, this review highlights recent efforts in using characterization data for ML and exploring their correlations with material properties and device performance, a topic that is being actively researched but has yet to receive significant attention. Lastly, we provide future perspectives, such as leveraging Large Language Models (LLMs) and text mining, to expedite the discovery of novel perovskite materials and expand their utilization across various optoelectronic fields.