The distillation process is an important chemical process, and applying a data-driven modelling approach has the potential to reduce model complexity compared with mechanistic modelling, thus improving the efficiency of process optimization or monitoring studies. However, the distillation process is highly nonlinear and has multiple uncertainty perturbation intervals, which makes accurate data-driven modelling of distillation processes challenging. This paper proposes a systematic data-driven modelling framework to solve these problems. Firstly, data segment variance was introduced into the K-means algorithm to form K-means data interval (KMDI) clustering, which clusters the data into perturbed and steady-state intervals for steady-state data extraction. Secondly, the maximal information coefficient (MIC) was employed to calculate the nonlinear correlation between variables for removing redundant features. Finally, extreme gradient boosting (XGBoost) was integrated as the base learner into adaptive boosting (AdaBoost), with an error threshold (ET) set to improve the weight update strategy, to construct a new ensemble learning algorithm, XGBoost-AdaBoost-ET. The superiority of the proposed framework is verified by applying it to a real industrial propylene distillation process.
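The idea of separating steady-state from perturbed intervals by segment variance can be illustrated with a small sketch. This is not the KMDI algorithm from the paper, whose details the abstract does not give; it assumes fixed-length windows, a plain one-dimensional K-means with K = 2 on the window variances, and a toy signal. The names `segment_variances` and `two_means_1d` are hypothetical.

```python
from statistics import pvariance


def segment_variances(signal, window):
    """Split the signal into consecutive windows and return each window's variance."""
    return [
        pvariance(signal[i:i + window])
        for i in range(0, len(signal) - window + 1, window)
    ]


def two_means_1d(values, iterations=50):
    """Plain 1-D K-means with K=2; returns a label per value (0 = low, 1 = high)."""
    low, high = min(values), max(values)  # initialise the two centres at the extremes
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [0 if abs(v - low) <= abs(v - high) else 1 for v in values]
        low_members = [v for v, lab in zip(values, labels) if lab == 0]
        high_members = [v for v, lab in zip(values, labels) if lab == 1]
        new_low = sum(low_members) / len(low_members) if low_members else low
        new_high = sum(high_members) / len(high_members) if high_members else high
        if (new_low, new_high) == (low, high):
            break  # centres stopped moving
        low, high = new_low, new_high
    return labels


# Toy signal (assumed data): steady, then a perturbation, then steady again.
signal = [1.0, 1.01, 0.99, 1.0, 1.0, 3.0, -1.0, 4.0, 2.0, 2.01, 1.99, 2.0]
variances = segment_variances(signal, window=4)
labels = two_means_1d(variances)
# Windows in the low-variance cluster are treated as steady state.
steady_segments = [i for i, lab in enumerate(labels) if lab == 0]
```

In this sketch the window length is a free parameter; a real process would likely need it tuned to the plant's time constants.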
With the rapid advancement of machine learning technology and its growing adoption in research and engineering applications, an increasing number of studies have embraced data-driven approaches for modeling wind turbine wakes. These models can capture the complex, high-dimensional characteristics of wind turbine wakes while offering significantly greater prediction efficiency than physics-driven models. As a result, data-driven wind turbine wake models are regarded as powerful and effective tools for predicting wake behavior and turbine power output. This paper aims to provide a concise yet comprehensive review of existing studies on wind turbine wake modeling that employ data-driven approaches. It begins by defining and classifying machine learning methods to facilitate a clearer understanding of the reviewed literature. Subsequently, the related studies are categorized into four key areas: wind turbine power prediction, data-driven analytic wake models, wake field reconstruction, and the incorporation of explicit physical constraints. The accuracy of data-driven models is influenced by two primary factors: the quality of the training data and the performance of the model itself. Accordingly, both data accuracy and model structure are discussed in detail within the review.
A data-driven model of multiple variable cutting (M-VCUT) level set-based substructures is proposed for the topology optimization of lattice structures. The M-VCUT level set method is used to represent substructures, enriching their diversity of configuration while ensuring connectivity. To construct the data-driven model of substructures, a database is prepared by sampling the space of substructures spanned by several substructure prototypes. Then, for each substructure in this database, the stiffness matrix is condensed so that its degrees of freedom are reduced. Thereafter, the data-driven model of substructures is constructed through interpolation with compactly supported radial basis functions (CS-RBF). The inputs of the data-driven model are the design variables of topology optimization, and the outputs are the condensed stiffness matrix and volume of the substructures. During the optimization, this data-driven model is used, thus avoiding repeated static condensation that would require much computation time. Several numerical examples are provided to verify the proposed method.
Accurate prediction of strip width is a key factor in the quality of hot rolling manufacture. Firstly, based on the strip width formation mechanism model of the strip rolling process, an improved width mechanism calculation model is established, with its process parameters optimized via the particle swarm optimization algorithm. Subsequently, a hybrid strip width prediction model is proposed by combining the respective advantages of the improved mechanism model and the data-driven model. Because strip width prediction requires a positive error, an adaptive width error compensation algorithm is proposed. Finally, comparative simulation experiments are designed on an actual rolling dataset after data cleaning and feature engineering. The experimental results show that the proposed hybrid prediction model has superior precision and robustness compared with the improved mechanism model and eight other common data-driven models, and that it satisfies the needs of practical applications. Moreover, the hybrid model realizes the complementary advantages of the mechanism model and the data-driven model, alleviating both the difficulty of improving the accuracy of the mechanism model and the poor interpretability of the data-driven model, which has significant practical implications for research on strip width control.
Additive manufacturing (AM), particularly fused deposition modeling (FDM), has emerged as a transformative technology in modern manufacturing processes. The dimensional accuracy of FDM-printed parts is crucial for ensuring their functional integrity and performance. To achieve sustainable manufacturing in FDM, it is necessary to optimize the print quality and time efficiency concurrently. However, owing to the complex interactions of printing parameters, achieving a balanced optimization of both remains challenging. This study examines four key factors affecting dimensional accuracy and print time: printing speed, layer thickness, nozzle temperature, and bed temperature. Fifty parameter sets were generated using enhanced Latin hypercube sampling. A whale optimization algorithm (WOA)-enhanced support vector regression (SVR) model was developed to predict dimensional errors and print time effectively, with the non-dominated sorting genetic algorithm III (NSGA-III) utilized for multi-objective optimization. The Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) was applied to select a balanced solution from the Pareto front. In experimental validation, the parts printed using the optimized parameters exhibited excellent dimensional accuracy and printing efficiency. This study comprehensively considered optimizing the printing time and size to meet quality requirements while achieving higher printing efficiency and aiding in the realization of sustainable manufacturing in the field of AM. In addition, the printing of a specific prosthetic component was used as a case study, highlighting the high demands on both dimensional precision and printing efficiency. The optimized process parameters required significantly less printing time while satisfying the dimensional accuracy requirements. This study provides valuable insights for achieving sustainable AM using FDM.
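The TOPSIS step, which picks one balanced solution from a Pareto front, can be sketched as follows. The sketch uses the standard TOPSIS formulation (vector normalisation, weighting, distances to the ideal and anti-ideal points). The equal weights and the three Pareto points are made-up assumptions, not values from the study, and the function name `topsis` is hypothetical.

```python
from math import sqrt


def topsis(matrix, weights, benefit):
    """Return the TOPSIS closeness score of each alternative (higher is better).

    matrix  : rows are alternatives, columns are criteria
    weights : one weight per criterion
    benefit : True if larger is better for that criterion, False if smaller is better
    Assumes the alternatives are not all identical (otherwise the score is 0/0).
    """
    n_criteria = len(weights)
    # Vector-normalise each column, then apply the criterion weights.
    norms = [sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_criteria)]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_criteria)]
        for row in matrix
    ]
    columns = list(zip(*weighted))
    ideal = [max(col) if benefit[j] else min(col) for j, col in enumerate(columns)]
    anti = [min(col) if benefit[j] else max(col) for j, col in enumerate(columns)]
    scores = []
    for row in weighted:
        d_best = sqrt(sum((v - b) ** 2 for v, b in zip(row, ideal)))
        d_worst = sqrt(sum((v - w) ** 2 for v, w in zip(row, anti)))
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Hypothetical Pareto-front points: (dimensional error in mm, print time in min).
pareto = [(0.05, 120.0), (0.10, 80.0), (0.20, 60.0)]
# Both criteria are costs (smaller is better); equal weights are an assumption.
scores = topsis(pareto, weights=[0.5, 0.5], benefit=[False, False])
best = max(range(len(pareto)), key=lambda i: scores[i])
```

With these toy numbers the middle point is selected, which matches the intuition that TOPSIS favours a compromise when the weights are equal.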
This study explores the effectiveness of machine learning models in predicting the air-side performance of microchannel heat exchangers. The data were generated by experimentally validated Computational Fluid Dynamics (CFD) simulations of air-to-water microchannel heat exchangers. A distinctive aspect of this research is the comparative analysis of four diverse machine learning algorithms: Artificial Neural Networks (ANN), Support Vector Machines (SVM), Random Forest (RF), and Gaussian Process Regression (GPR). These models are applied to predict air-side heat transfer performance with high precision, with ANN and GPR exhibiting notably superior accuracy. Additionally, this research examines the influence of both geometric and operational parameters (including louvered angle, fin height, fin spacing, air inlet temperature, velocity, and tube temperature) on model performance. Moreover, it incorporates dimensionless numbers such as aspect ratio, fin height-to-spacing ratio, Reynolds number, Nusselt number, normalized air inlet temperature, temperature difference, and louvered angle into the input variables. This inclusion significantly refines the predictive capabilities of the models by establishing a robust analytical framework supported by the CFD-generated database. The results show the enhanced prediction accuracy achieved by integrating dimensionless numbers, highlighting the effectiveness of data-driven approaches in precisely forecasting heat exchanger performance. This advancement is pivotal for the geometric optimization of heat exchangers, illustrating the considerable potential of integrating sophisticated modeling techniques with traditional engineering metrics.
Blades are essential components of wind turbines. Reducing their fatigue loads during operation helps to extend their lifespan, but it is difficult to calculate the fatigue loads of blades quickly and accurately. To solve this problem, this paper designs a data-driven blade load modeling method based on a deep learning framework through mechanism analysis, feature selection, and model construction. In the mechanism analysis part, the generation mechanism of blade loads and the theoretical load calculation method based on material damage theory are analyzed, and four measurable operating state parameters related to blade loads are screened. In the feature extraction part, 15 characteristic indicators of each screened parameter are extracted in the time and frequency domains, and feature selection is completed through correlation analysis with blade loads to determine the input parameters of data-driven modeling. In the model construction part, a deep neural network based on feedforward and feedback propagation is designed to construct the nonlinear coupling relationship between the unit operating parameter characteristics and blade loads. The results show that the proposed method mines the wind turbine operating state characteristics that are highly correlated with the blade load, such as the standard deviation of wind speed. The model built using these characteristics has reasonable calculation and fitting capabilities for the blade load and fits untrained out-of-sample data better than the traditional scheme. Based on the mean absolute percentage error calculation, the modeling accuracy of the two blade loads can reach more than 90% and 80%, respectively, providing a good foundation for subsequent optimization control to suppress the blade load.
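For readers unfamiliar with accuracy figures derived from the mean absolute percentage error (MAPE), a small sketch may help. It assumes the common convention accuracy = 100% minus MAPE; the abstract does not state which convention the authors used, and the load values below are invented for illustration.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def accuracy_from_mape(actual, predicted):
    """Accuracy in percent, under the assumed convention accuracy = 100 - MAPE."""
    return 100.0 - mape(actual, predicted)


# Hypothetical fatigue-load values in arbitrary units (not data from the paper).
measured = [100.0, 200.0, 400.0]
modelled = [90.0, 210.0, 380.0]
acc = accuracy_from_mape(measured, modelled)
```

Because MAPE divides by the measured value, it becomes unstable when measured loads are close to zero, which is worth checking before using it as an accuracy metric.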
Increasing the production and utilization of shale gas is of great significance for building a clean and low-carbon energy system. A sharp decline in gas production has been widely observed in shale gas reservoirs. Forecasting shale gas production is still challenging due to complex fracture networks, dynamic fracture properties, frac hits, complicated multiphase flow, and multi-scale flow, as well as data quality and uncertainty. This work develops an integrated framework for evaluating shale gas well production based on data-driven models. Firstly, a comprehensive dominated-factor system is established, including geological, drilling, fracturing, and production factors. Data processing and visualization are required to ensure data quality and determine the final data set. A shale gas production evaluation model is then developed to evaluate shale gas production levels. Finally, the random forest algorithm is used to forecast shale gas production. The prediction accuracy of the shale gas production level is higher than 95% for shale gas reservoirs in China. Forty-one wells are randomly selected to predict cumulative gas production using the optimal regression model. The proposed shale gas production evaluation framework avoids the many assumptions of analytical or semi-analytical models, as well as the huge computation cost and poor generalization of numerical modelling.
The curse of dimensionality refers to the problem of increased sparsity and computational complexity when dealing with high-dimensional data. In recent years, the types and variables of industrial data have increased significantly, making data-driven models more challenging to develop. To address this problem, data augmentation technology has been introduced as an effective tool to solve the sparsity problem of high-dimensional industrial data. This paper systematically explores and discusses the necessity, feasibility, and effectiveness of augmented industrial data-driven modeling in the context of the curse of dimensionality and virtual big data. Then, the process of data augmentation modeling is analyzed, and the concept of data boosting augmentation is proposed. Data boosting augmentation involves designing the reliability weight and actual-virtual weight functions, and developing a double weighted partial least squares model to optimize the three stages of data generation, data fusion, and modeling. This approach significantly improves the interpretability, effectiveness, and practicality of data augmentation in industrial modeling. Finally, the proposed method is verified using practical examples of fault diagnosis systems and virtual measurement systems in industry. The results demonstrate the effectiveness of the proposed approach in improving the accuracy and robustness of data-driven models, making them more suitable for real-world industrial applications.
Vortex-induced vibration (VIV) is a challenge in ocean engineering. Several devices, including fairings, have been designed to suppress VIV. However, how to optimize the design of suppression devices is still an open problem. In this paper, an optimization design methodology is presented based on data-driven models and a genetic algorithm (GA). Data-driven models are introduced to substitute for complex physics-based equations. The GA is used to rapidly search for the optimal suppression device among all possible solutions. Taking fairings as an example, a VIV response database for different fairings is established based on parameterized models in which the cross sections of the fairings are controlled by several control points and Bezier curves. Then a data-driven model, which can predict the VIV response of fairings with different sections accurately and efficiently, is trained with a BP neural network. Finally, a comprehensive optimization method and process is proposed based on the GA and the data-driven model. The proposed method is demonstrated by its application to a case. The results show that the proposed method can perform the optimization design of fairings effectively and that VIV can be reduced markedly through the optimization design.
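The parameterisation of a fairing section by control points and Bezier curves can be illustrated with de Casteljau's algorithm, a standard way to evaluate a Bezier curve. The control points below are hypothetical and are not taken from the paper; how the authors actually lay out the control points is not stated in the abstract.

```python
def bezier_point(control_points, t):
    """Evaluate a Bezier curve at parameter t in [0, 1] with de Casteljau's algorithm."""
    points = [tuple(p) for p in control_points]
    # Repeatedly interpolate between neighbouring points until one point remains.
    while len(points) > 1:
        points = [
            tuple((1 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(points, points[1:])
        ]
    return points[0]


def bezier_curve(control_points, samples=50):
    """Sample the curve at evenly spaced parameter values, end points included."""
    return [bezier_point(control_points, i / (samples - 1)) for i in range(samples)]


# Hypothetical control points for the upper half of a fairing section (x, y).
ctrl = [(0.0, 0.0), (0.0, 0.5), (1.0, 0.5), (2.0, 0.0)]
outline = bezier_curve(ctrl, samples=5)
```

In an optimization loop of the kind described above, the coordinates in `ctrl` would be the design variables that the genetic algorithm varies.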
The dynamical modeling of projectile systems with sufficient accuracy is very difficult due to the high-dimensional space and various perturbations. With the recent rapid development of data science and scientific measurement tools, numerous data-driven methods have been devoted to discovering governing laws from data. In this work, a data-driven method is employed to model the projectile based on the Kramers–Moyal formulas. More specifically, the four-dimensional projectile system is assumed to be an Itô stochastic differential equation. Then the least squares method and sparse learning are applied to identify the drift coefficient and diffusion matrix from sample path data, and the results agree well with the real system. The effectiveness of the data-driven method demonstrates that it can become a powerful tool for extracting governing equations and predicting complex dynamical behaviors of the projectile.
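The Kramers–Moyal idea of reading drift and diffusion off the increments of a sample path can be sketched in one dimension. This is a simplified illustration, not the paper's method: it uses a scalar Ornstein–Uhlenbeck process instead of the four-dimensional projectile system, ordinary least squares without sparse learning, and assumed parameter values. The function names are hypothetical.

```python
import random


def simulate_ou(theta, sigma, dt, steps, seed=0):
    """Euler-Maruyama path of dX = -theta * X dt + sigma dW, starting at X = 0."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(steps):
        noise = sigma * dt ** 0.5 * rng.gauss(0.0, 1.0)
        x.append(x[-1] - theta * x[-1] * dt + noise)
    return x


def kramers_moyal_1d(path, dt):
    """Estimate a linear drift slope and a constant diffusion from one sample path.

    Drift: least-squares fit (through the origin) of increment / dt against the
    state, i.e. the first Kramers-Moyal coefficient under a linear-drift assumption.
    Diffusion: mean squared increment divided by dt, i.e. the second coefficient.
    """
    increments = [b - a for a, b in zip(path, path[1:])]
    states = path[:-1]
    slope = sum(x * dx for x, dx in zip(states, increments)) / (
        dt * sum(x * x for x in states)
    )
    diffusion = sum(dx * dx for dx in increments) / (dt * len(increments))
    return slope, diffusion


# Assumed ground truth: drift -1.0 * x and diffusion sigma**2 = 0.25.
path = simulate_ou(theta=1.0, sigma=0.5, dt=0.01, steps=200_000, seed=0)
slope, diffusion = kramers_moyal_1d(path, dt=0.01)
```

The estimates should land near the assumed drift slope and squared noise amplitude, with a spread that shrinks as the path gets longer.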
This work addresses the multiscale optimization of the purification processes of antibody fragments. Chromatography decisions in the manufacturing processes are optimized, including the number of chromatography columns and their sizes, the number of cycles per batch, and the operational flow velocities. Data-driven models of chromatography throughput are developed considering loaded mass, flow velocity, and column bed height as the inputs, using manufacturing-scale simulated datasets based on microscale experimental data. The piecewise linear regression modeling method is adapted due to its simplicity and better prediction accuracy in comparison with other methods. Two alternative mixed-integer nonlinear programming (MINLP) models are proposed to minimize the total cost of goods per gram of the antibody purification process, incorporating the data-driven models. These MINLP models are then reformulated as mixed-integer linear programming (MILP) models using linearization techniques and multiparametric disaggregation. Two industrially relevant cases with different chromatography column size alternatives are investigated to demonstrate the applicability of the proposed models.
In the synthesis of control algorithms for complex systems, we are often faced with imprecise or unknown mathematical models of the dynamical systems, or even with problems in finding a mathematical model of the system in open loop. To tackle these difficulties, this paper proposes an approach to data-driven model identification and control algorithm design based on the maximum stability degree criterion. The data-driven model identification procedure finds the mathematical model of the system from the undamped transient response of the closed-loop system. The system is approximated with an inertial model whose coefficients are calculated from the critical transfer coefficient and the oscillation amplitude and period of the underdamped response of the closed-loop system. In the data-driven control design, the tuning parameters of the controller are calculated from the parameters obtained in the preceding system identification step, and expressions for calculating the tuning parameters are presented. The results of the data-driven model identification and the controller synthesis algorithm were verified by computer simulation.
With the continual deployment of power-electronics-interfaced renewable energy resources, increasing privacy concerns due to deregulation of electricity markets, and the diversification of demand-side activities, traditional knowledge-based power system dynamic modeling methods are faced with unprecedented challenges. Data-driven modeling has been increasingly studied in recent years because of its lesser need for prior knowledge, higher capability of handling large-scale systems, and better adaptability to variations in system operating conditions. This paper discusses the motivations and the generalized process of data-driven modeling, and provides a comprehensive overview of various state-of-the-art techniques and applications. It also compares the advantages and disadvantages of these methods and provides insight into outstanding challenges and possible research directions for the future.
This study presents an improved data-driven Model-Free Adaptive Control (MFAC) strategy for attitude stabilization of a partially constrained combined spacecraft with external disturbances and input saturation. First, a novel dynamic linearization data model for the partially constrained combined spacecraft with external disturbances is established. The generalized disturbances composed of external disturbances and dynamic linearization errors are then reconstructed by a Discrete Extended State Observer (DESO). With the dynamic linearization data model and reconstructed information, a DESO-MFAC strategy for the combined spacecraft is proposed based only on input and output data. Next, the input saturation is overcome by introducing an anti-windup compensator. Finally, numerical simulations are carried out to demonstrate the effectiveness and feasibility of the proposed controller when the dynamic properties of the partially constrained combined spacecraft are completely unknown.
Sub-Saharan Africa (SSA) has the highest maternal and under-five mortality rates in the world. The advent of coronavirus disease 2019 exacerbated the region's problems by overwhelming the health systems and affecting access to healthcare through travel restrictions and the rechannelling of resources towards the containment of the pandemic. The region failed to achieve the Millennium Development Goals on maternal and child mortality, and is poised to fail to achieve the same goals in the Sustainable Development Goals. To improve maternal and child health outcomes, many SSA countries introduced digital technologies for educating pregnant and nursing women, making doctors' appointments, and sending reminders to mothers and expectant mothers, as well as capturing information about patients and their illnesses. However, the collected epidemiological data are not being utilised to inform patient care and improve the quality, efficiency, and accessibility of maternal, neonatal and child health (MNCH) care. To the researchers' best knowledge, no review paper has been published that focuses on digital health for MNCH care in SSA and proposes data-driven approaches to the same. Therefore, this study sought to: (1) identify digital systems for MNCH in SSA; (2) identify the applicability and weaknesses of the digital MNCH systems in SSA; and (3) propose a data-driven model for diverging emerging technologies into MNCH services in SSA to make better use of data to improve MNCH care coverage, efficiency, and quality. The PRISMA methodology was used in this study. The study revealed that there are no data-driven models for monitoring pregnant women and under-five children in Sub-Saharan Africa, with the available digital health technologies mainly based on SMS and websites. Thus, the current digital health systems in SSA do not support real-time, ubiquitous, pervasive, and data-driven healthcare. Their main applicability is in non-real-time pregnancy monitoring, education, and information dissemination. Unless new and more effective approaches are implemented, SSA might remain the region with the highest, and unacceptable, maternal and under-five mortality rates globally. The study proposes feasible emerging technologies that can be used to provide data-driven healthcare for MNCH in SSA, along with recommendations on how to make the transition successful and lessons learnt from other regions.
Using stochastic dynamic simulation for railway vehicle collisions still faces many challenges, such as high modelling complexity and long computation times. To address these challenges, we introduce a novel data-driven stochastic process modelling (DSPM) approach into the dynamic simulation of railway vehicle collisions. This DSPM approach consists of two steps: (i) process description, in which four kinds of kernels are used to describe the uncertainty inherent in collision processes; and (ii) solving, in which stochastic variational inference and mini-batch algorithms are used to accelerate the computation of stochastic processes. By applying DSPM, Gaussian process regression (GPR), and finite element (FE) methods to two collision scenarios (i.e. a lead car colliding with a rigid wall, and a lead car colliding with another lead car), we are able to achieve a comprehensive analysis. The comparison between the DSPM approach and the FE method revealed that the DSPM approach is capable of calculating the corresponding confidence interval while improving the overall computational efficiency. Comparing the DSPM approach with the GPR method indicates that the DSPM approach can accurately describe the dynamic response under unknown conditions. Overall, this research demonstrates the feasibility and usability of the proposed DSPM approach for stochastic dynamics simulation of railway vehicle collisions.
We propose an integrated method of data-driven and mechanism models for well logging formation evaluation, explicitly focusing on predicting reservoir parameters such as porosity and water saturation. Accurately interpreting these parameters is crucial for effectively exploring and developing oil and gas. However, with the increasing complexity of geological conditions in this industry, there is a growing demand for improved accuracy in reservoir parameter prediction, leading to higher costs associated with manual interpretation. Conventional logging interpretation methods rely on empirical relationships between logging data and reservoir parameters, and they suffer from low interpretation efficiency, strong subjectivity, and suitability only for ideal conditions. The application of artificial intelligence to the interpretation of logging data provides a new solution to the problems of traditional methods and is expected to improve the accuracy and efficiency of the interpretation. If large, high-quality datasets exist, data-driven models can reveal relationships of arbitrary complexity. Nevertheless, constructing sufficiently large logging datasets with reliable labels remains challenging, making it difficult to apply data-driven models effectively in logging data interpretation. Furthermore, data-driven models often act as "black boxes" without explaining their predictions or ensuring compliance with basic physical constraints. This paper proposes a machine learning method with strong physical constraints by integrating mechanism and data-driven models. Prior knowledge of logging data interpretation is embedded into machine learning through the network structure, loss function, and optimization algorithm. We employ the Physically Informed Auto-Encoder (PIAE) to predict porosity and water saturation, which can be trained without labeled reservoir parameters using self-supervised learning techniques. This approach effectively achieves automated interpretation and facilitates generalization across diverse datasets.
Pressure differential deviations under static conditions and pressure convergence fluctuations under dynamic disturbances are widely reported problems with pressure differential control in pharmaceutical cleanrooms, yet their underlying mechanisms and key causes remain insufficiently explored. This study performed a field survey and model-based simulations to identify the major influencing parameters and quantify their influence on pressure differentials. Twelve pharmaceutical cleanrooms with varying environmental control parameters were included in the field survey, all of which were served by a variable air volume (VAV) ventilation system. Large deviations between actual and design pressure differentials were found, ranging from 10% to 42.5%, and a total of 24 uncertain parameters and their respective uncertainty ranges were identified. Based on the field survey, a data-driven pressure differential response model was developed using the MATLAB/Simulink platform. The model fully took into account the system dynamics and facilitated real-time monitoring and control of the pressure differential. Sobol-based sensitivity analysis was then conducted to identify the key influencing parameters of pressure differential deviations. The simulated results revealed that static pressure differential deviations were predominantly influenced by pressure sensing accuracy, exhaust airflow accuracy, and duct impedance, while dynamic disturbances were mainly driven by room envelope airtightness and supply airflow accuracy. The interactions between connected zones were pronounced. Rooms with higher branch duct impedance experienced smaller pressure differential deviations due to natural buffering characteristics, while the parameter uncertainties in these rooms significantly affected the pressure differential in other rooms. These findings offer practical guidance for the design and operation of precise pressure differential control in pharmaceutical cleanrooms.
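Sobol-based sensitivity analysis of the kind mentioned above can be sketched with a pick-and-freeze Monte Carlo estimator of the first-order indices. The sketch assumes independent inputs that are uniform on [0, 1] and uses a two-input toy model as a stand-in; it is not the study's MATLAB/Simulink model with 24 parameters, and the function name `first_order_sobol` is hypothetical.

```python
import random


def first_order_sobol(model, n_inputs, n_samples=20000, seed=0):
    """Estimate first-order Sobol indices for independent U(0, 1) inputs.

    Uses two sample matrices A and B and the estimator
    V_i = mean((f(B) - mean) * (f(AB_i) - f(A))),
    where AB_i is A with column i taken from B. Centring f(B) on the sample
    mean does not change the expectation but lowers the estimator's variance.
    """
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    b = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    f_a = [model(row) for row in a]
    f_b = [model(row) for row in b]
    outputs = f_a + f_b
    mean = sum(outputs) / len(outputs)
    variance = sum((y - mean) ** 2 for y in outputs) / len(outputs)
    indices = []
    for i in range(n_inputs):
        f_ab = [model(ra[:i] + [rb[i]] + ra[i + 1:]) for ra, rb in zip(a, b)]
        v_i = sum(
            (fb - mean) * (fab - fa) for fb, fab, fa in zip(f_b, f_ab, f_a)
        ) / n_samples
        indices.append(v_i / variance)
    return indices


def toy_model(x):
    """Stand-in response: the second input has twice the effect of the first."""
    return x[0] + 2.0 * x[1]


indices = first_order_sobol(toy_model, n_inputs=2)
```

For this additive toy model the exact first-order indices are 0.2 and 0.8, so the estimates can be checked against a known answer before the estimator is applied to a more expensive simulation.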
The permanent magnet synchronous motor (PMSM) is widely used in alternating current servo systems as it provides high efficiency, high power density, and a wide speed regulation range. Servo systems are placing ever higher demands on control performance. The model predictive control (MPC) algorithm is emerging as a potential high-performance motor control algorithm due to its capability of handling multiple-input and multiple-output variables and imposed constraints. When MPC is used in the PMSM control process, nonlinear disturbances caused by changes in electromagnetic parameters or by load disturbances may lead to a mismatch between the nominal model and the controlled object, which causes prediction error and thus affects the dynamic stability of the control system. This paper proposes a data-driven MPC strategy in which historical data in an appropriate range are utilized to eliminate the impact of parameter mismatch and further improve the control performance. The stability of the proposed algorithm is proved, and simulation demonstrates its feasibility. The superiority of the algorithm over the classical MPC strategy has also been verified.
Funding: Supported by the National Key Research and Development Program of China (2023YFB3307801); the National Natural Science Foundation of China (62394343, 62373155, 62073142); the Major Science and Technology Project of Xinjiang (No. 2022A01006-4); the Programme of Introducing Talents of Discipline to Universities (the 111 Project) under Grant B17017; the Fundamental Research Funds for the Central Universities, Science Foundation of China University of Petroleum, Beijing (No. 2462024YJRC011); and the Open Research Project of the State Key Laboratory of Industrial Control Technology, China (Grant No. ICT2024B70).
Abstract: The distillation process is an important chemical process, and applying a data-driven modelling approach has the potential to reduce model complexity compared to mechanistic modelling, thus improving the efficiency of process optimization or monitoring studies. However, the distillation process is highly nonlinear and has multiple uncertainty perturbation intervals, which makes accurate data-driven modelling of distillation processes challenging. This paper proposes a systematic data-driven modelling framework to solve these problems. Firstly, data segment variance was introduced into the K-means algorithm to form K-means data interval (KMDI) clustering, which clusters the data into perturbed and steady-state intervals for steady-state data extraction. Secondly, the maximal information coefficient (MIC) was employed to calculate the nonlinear correlation between variables for removing redundant features. Finally, extreme gradient boosting (XGBoost) was integrated as the base learner into adaptive boosting (AdaBoost), with an error threshold (ET) set to improve the weight update strategy, to construct a new ensemble learning algorithm, XGBoost-AdaBoost-ET. The superiority of the proposed framework is verified by applying it to a real industrial propylene distillation process.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 52131102.
Abstract: With the rapid advancement of machine learning technology and its growing adoption in research and engineering applications, an increasing number of studies have embraced data-driven approaches for modeling wind turbine wakes. These models can capture complex, high-dimensional characteristics of wind turbine wakes while offering significantly greater prediction efficiency than physics-driven models. As a result, data-driven wind turbine wake models are regarded as powerful and effective tools for predicting wake behavior and turbine power output. This paper aims to provide a concise yet comprehensive review of existing studies on wind turbine wake modeling that employ data-driven approaches. It begins by defining and classifying machine learning methods to facilitate a clearer understanding of the reviewed literature. Subsequently, the related studies are categorized into four key areas: wind turbine power prediction, data-driven analytic wake models, wake field reconstruction, and the incorporation of explicit physical constraints. The accuracy of data-driven models is influenced by two primary factors: the quality of the training data and the performance of the model itself. Accordingly, both data accuracy and model structure are discussed in detail within the review.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 12272144).
Abstract: A data-driven model of multiple variable cutting (M-VCUT) level set-based substructure is proposed for the topology optimization of lattice structures. The M-VCUT level set method is used to represent substructures, enriching their diversity of configuration while ensuring connectivity. To construct the data-driven model of the substructure, a database is prepared by sampling the space of substructures spanned by several substructure prototypes. Then, for each substructure in this database, the stiffness matrix is condensed so that its degrees of freedom are reduced. Thereafter, the data-driven model of substructures is constructed through interpolation with compactly supported radial basis functions (CS-RBF). The inputs of the data-driven model are the design variables of topology optimization, and the outputs are the condensed stiffness matrix and volume of substructures. During the optimization, this data-driven model is used, thus avoiding repeated static condensation that would require much computation time. Several numerical examples are provided to verify the proposed method.
Funding: Supported by the National Natural Science Foundation of China (No. 62273234); the Key Research and Development Program of Shaanxi (Program No. 2022GY-306); and the Technology Innovation Leading Program of Shaanxi (Program No. 2022QFY01-16).
Abstract: Accurate prediction of strip width is a key factor in the quality of hot rolling manufacture. Firstly, based on the strip width formation mechanism model within the strip rolling process, an improved width mechanism calculation model is delineated, with its process parameters optimized via the particle swarm optimization algorithm. Subsequently, a hybrid strip width prediction model is proposed by effectively combining the respective advantages of the improved mechanism model and the data-driven model. In acknowledgment of the prerequisite for positive error in strip width prediction, an adaptive width error compensation algorithm is proposed. Finally, comparative simulation experiments are designed on the actual rolling dataset after completing data cleaning and feature engineering. The experimental results show that the proposed hybrid prediction model has superior precision and robustness compared with the improved mechanism model and eight other common data-driven models, and that it satisfies the needs of practical applications. Moreover, the hybrid model realizes the complementary advantages of the mechanism model and the data-driven model, effectively alleviating both the difficulty of improving the accuracy of the mechanism model and the poor interpretability of the data-driven model, which has significant practical implications for research on strip width control.
Funding: Supported by the Natural Science Foundation of Shanghai (Grant No. 22ZR1463900); the State Key Laboratory of Mechanical System and Vibration (Grant No. MSV202318); and the Fundamental Research Funds for the Central Universities (Grant No. 22120220649).
Abstract: Additive manufacturing (AM), particularly fused deposition modeling (FDM), has emerged as a transformative technology in modern manufacturing processes. The dimensional accuracy of FDM-printed parts is crucial for ensuring their functional integrity and performance. To achieve sustainable manufacturing in FDM, it is necessary to optimize print quality and time efficiency concurrently. However, owing to the complex interactions of printing parameters, achieving a balanced optimization of both remains challenging. This study examines four key factors affecting dimensional accuracy and print time: printing speed, layer thickness, nozzle temperature, and bed temperature. Fifty parameter sets were generated using enhanced Latin hypercube sampling. A whale optimization algorithm (WOA)-enhanced support vector regression (SVR) model was developed to predict dimensional errors and print time effectively, with the non-dominated sorting genetic algorithm III (NSGA-III) utilized for multi-objective optimization. The Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) was applied to select a balanced solution from the Pareto front. In experimental validation, the parts printed using the optimized parameters exhibited excellent dimensional accuracy and printing efficiency. This study jointly optimized printing time and dimensional accuracy to meet quality requirements while achieving higher printing efficiency, aiding the realization of sustainable manufacturing in the field of AM. In addition, the printing of a specific prosthetic component was used as a case study, highlighting the high demands on both dimensional precision and printing efficiency. The optimized process parameters required significantly less printing time while satisfying the dimensional accuracy requirements. This study provides valuable insights for achieving sustainable AM using FDM.
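The TOPSIS selection step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the equal weights, the vector normalization, and the three Pareto points below are assumptions chosen only to show how a balanced solution might be picked when both criteria (dimensional error and print time) are to be minimized.

```python
import math

def topsis(matrix, weights, benefit):
    """Score alternatives with TOPSIS (higher closeness score is better).

    matrix  : list of rows, one row per alternative, one column per criterion
    weights : criterion weights (normalized here so they sum to 1)
    benefit : list of bools, True where a larger criterion value is better
    """
    n_crit = len(weights)
    total_w = sum(weights)
    w = [x / total_w for x in weights]
    # Vector-normalize each column, then apply the criterion weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[w[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    # Ideal and anti-ideal points depend on the direction of each criterion.
    ideal = [max(r[j] for r in v) if benefit[j] else min(r[j] for r in v)
             for j in range(n_crit)]
    worst = [min(r[j] for r in v) if benefit[j] else max(r[j] for r in v)
             for j in range(n_crit)]
    scores = []
    for r in v:
        d_best = math.dist(r, ideal)    # distance to the ideal point
        d_worst = math.dist(r, worst)   # distance to the anti-ideal point
        scores.append(d_worst / (d_best + d_worst))
    return scores

# Hypothetical Pareto-front points: (dimensional error in mm, print time in min).
pareto = [(0.05, 120.0), (0.08, 90.0), (0.15, 60.0)]
scores = topsis(pareto, weights=[0.5, 0.5], benefit=[False, False])
best = max(range(len(pareto)), key=scores.__getitem__)
print("closeness scores:", scores, "selected index:", best)
```

Changing the weights shifts the selected compromise toward accuracy or toward speed, which is why the weighting is a design decision rather than part of the method itself.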
Funding: Supported by the National Natural Science Foundation of China (Grant No. 52306026); the Wenzhou Municipal Science and Technology Research Program (Grant No. G20220012); the Special Innovation Project Fund of the Institute of Wenzhou, Zhejiang University (XMGL-KJZX202205); the State Key Laboratory of Air-Conditioning Equipment and System Energy Conservation Open Project (Project No. ACSKL2021KT01); and the Special Innovation Project Fund of the Institute of Wenzhou, Zhejiang University (XMGL-KJZX-202205).
Abstract: This study explores the effectiveness of machine learning models in predicting the air-side performance of microchannel heat exchangers. The data were generated by experimentally validated Computational Fluid Dynamics (CFD) simulations of air-to-water microchannel heat exchangers. A distinctive aspect of this research is the comparative analysis of four diverse machine learning algorithms: Artificial Neural Networks (ANN), Support Vector Machines (SVM), Random Forest (RF), and Gaussian Process Regression (GPR). These models are applied to predict air-side heat transfer performance with high precision, with ANN and GPR exhibiting notably superior accuracy. Additionally, this research examines the influence of both geometric and operational parameters (louvered angle, fin height, fin spacing, air inlet temperature, velocity, and tube temperature) on model performance. Moreover, it incorporates dimensionless numbers such as aspect ratio, fin height-to-spacing ratio, Reynolds number, Nusselt number, normalized air inlet temperature, temperature difference, and louvered angle into the input variables. This inclusion significantly refines the predictive capabilities of the models by establishing a robust analytical framework supported by the CFD-generated database. The results show the enhanced prediction accuracy achieved by integrating dimensionless numbers, highlighting the effectiveness of data-driven approaches in precisely forecasting heat exchanger performance. This advancement is pivotal for the geometric optimization of heat exchangers, illustrating the considerable potential of integrating sophisticated modeling techniques with traditional engineering metrics.
Funding: Supported by Science and Technology Project funding from China Southern Power Grid Corporation, No. GDKJXM20230245 (031700KC23020003).
Abstract: Blades are essential components of wind turbines. Reducing their fatigue loads during operation helps to extend their lifespan, but it is difficult to calculate the fatigue loads of blades quickly and accurately. To solve this problem, this paper designs a data-driven blade load modeling method based on a deep learning framework through mechanism analysis, feature selection, and model construction. In the mechanism analysis part, the generation mechanism of blade loads and the theoretical load calculation method based on material damage theory are analyzed, and four measurable operating state parameters related to blade loads are screened. In the feature extraction part, 15 characteristic indicators of each screened parameter are extracted in the time and frequency domains, and feature selection is completed through correlation analysis with blade loads to determine the input parameters of data-driven modeling. In the model construction part, a deep neural network based on feedforward and feedback propagation is designed to construct the nonlinear coupling relationship between the unit operating parameter characteristics and blade loads. The results show that the proposed method mines the wind turbine operating state characteristics that are highly correlated with the blade load, such as the standard deviation of wind speed. The model built using these characteristics has reasonable calculation and fitting capabilities for the blade load and shows a better fitting level for untrained out-of-sample data than the traditional scheme. Based on the mean absolute percentage error calculation, the modeling accuracy of the two blade loads can reach more than 90% and 80%, respectively, providing a good foundation for the subsequent optimization control to suppress the blade load.
Funding: Funded by the National Natural Science Foundation of China (52004238) and the China Postdoctoral Science Foundation (2019M663561).
Abstract: Increasing the production and utilization of shale gas is of great significance for building a clean and low-carbon energy system. A sharp decline in gas production has been widely observed in shale gas reservoirs. Forecasting shale gas production remains challenging due to complex fracture networks, dynamic fracture properties, frac hits, complicated multiphase flow, and multi-scale flow, as well as data quality and uncertainty. This work develops an integrated framework for evaluating shale gas well production based on data-driven models. Firstly, a comprehensive dominated-factor system has been established, including geological, drilling, fracturing, and production factors. Data processing and visualization are required to ensure data quality and determine the final data set. A shale gas production evaluation model is developed to evaluate shale gas production levels. Finally, the random forest algorithm is used to forecast shale gas production. The prediction accuracy of the shale gas production level is higher than 95% based on the shale gas reservoirs in China. Forty-one wells are randomly selected to predict cumulative gas production using the optimal regression model. The proposed shale gas production evaluation framework avoids the many assumptions of analytical or semi-analytical models as well as the huge computation cost and poor generalization of numerical modelling.
Funding: Supported in part by the National Natural Science Foundation of China (NSFC) (92167106, 61833014) and the Key Research and Development Program of Zhejiang Province (2022C01206).
Abstract: The curse of dimensionality refers to the problem of increased sparsity and computational complexity when dealing with high-dimensional data. In recent years, the types and variables of industrial data have increased significantly, making data-driven models more challenging to develop. To address this problem, data augmentation technology has been introduced as an effective tool to solve the sparsity problem of high-dimensional industrial data. This paper systematically explores and discusses the necessity, feasibility, and effectiveness of augmented industrial data-driven modeling in the context of the curse of dimensionality and virtual big data. Then, the process of data augmentation modeling is analyzed, and the concept of data boosting augmentation is proposed. Data boosting augmentation involves designing the reliability weight and actual-virtual weight functions, and developing a double weighted partial least squares model to optimize the three stages of data generation, data fusion, and modeling. This approach significantly improves the interpretability, effectiveness, and practicality of data augmentation in industrial modeling. Finally, the proposed method is verified using practical examples of fault diagnosis systems and virtual measurement systems in industry. The results demonstrate the effectiveness of the proposed approach in improving the accuracy and robustness of data-driven models, making them more suitable for real-world industrial applications.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 51809279); the Major National Science and Technology Program (Grant No. 2016ZX05028-001-05); the Program for Changjiang Scholars and Innovative Research Team in University (Grant No. IRT14R58); and the Fundamental Research Funds for the Central Universities, that is, the Opening Fund of National Engineering Laboratory of Offshore Geophysical and Exploration Equipment (Grant No. 20CX02302A).
Abstract: Vortex-induced vibration (VIV) is a challenge in ocean engineering. Several devices, including fairings, have been designed to suppress VIV. However, how to optimize the design of suppression devices is still an open problem. In this paper, an optimization design methodology is presented based on data-driven models and a genetic algorithm (GA). Data-driven models are introduced to substitute for complex physics-based equations. The GA is used to rapidly search for the optimal suppression device among all possible solutions. Taking fairings as an example, a VIV response database for different fairings is established based on parameterized models in which the sections of fairings are controlled by several control points and Bezier curves. Then a data-driven model, which can predict the VIV response of fairings with different sections accurately and efficiently, is trained through a BP neural network. Finally, a comprehensive optimization method and process is proposed based on the GA and the data-driven model. The proposed method is demonstrated by its application to a case. The results show that the proposed method can perform the optimization design of fairings effectively, and that VIV can be reduced markedly through the optimization design.
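The parameterization described above, in which a fairing section is controlled by a few control points and Bezier curves, can be sketched as follows. The abstract does not give the actual control points or curve degree, so the four points below are purely hypothetical; the sketch only shows how a section profile could be sampled from control points using de Casteljau's algorithm.

```python
def bezier_point(control_points, t):
    """Evaluate a Bezier curve at parameter t in [0, 1] with de Casteljau's algorithm."""
    pts = [tuple(p) for p in control_points]
    # Repeatedly interpolate between neighbouring points until one point remains.
    while len(pts) > 1:
        pts = [
            tuple((1.0 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(pts[:-1], pts[1:])
        ]
    return pts[0]

def bezier_curve(control_points, n_samples=50):
    """Sample n_samples points along the curve, including both end points."""
    return [bezier_point(control_points, i / (n_samples - 1)) for i in range(n_samples)]

# Hypothetical control points (x, y) for the upper half of a fairing section.
ctrl = [(0.0, 0.5), (0.5, 0.6), (1.5, 0.3), (2.0, 0.0)]
profile = bezier_curve(ctrl, n_samples=21)
print(profile[0], profile[10], profile[-1])
```

In an optimization loop of the kind the abstract describes, the coordinates of the interior control points would play the role of design variables that the genetic algorithm varies.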
Funding: Supported by the Six Talent Peaks Project in Jiangsu Province, China (Grant No. JXQC-002).
Abstract: Modeling the dynamics of projectile systems with sufficient accuracy is very difficult because of the high-dimensional state space and various perturbations. With the recent rapid development of data science and scientific measurement tools, numerous data-driven methods have been devoted to discovering governing laws from data. In this work, a data-driven method based on the Kramers–Moyal formulas is employed to model the projectile. More specifically, the four-dimensional projectile system is modeled as an Itô stochastic differential equation. Then the least squares method and sparse learning are applied to identify the drift coefficient and diffusion matrix from sample path data, and the identified terms agree well with the real system. The effectiveness of the data-driven method demonstrates that it can become a powerful tool for extracting governing equations and predicting complex dynamical behaviors of the projectile.
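To make the Kramers–Moyal idea concrete, the sketch below identifies the drift and diffusion of a one-dimensional Itô equation from a simulated sample path. It is a deliberately reduced illustration, not the paper's four-dimensional projectile model: the Ornstein–Uhlenbeck test system, its parameters, and the single-term basis are assumptions, and the sparse learning step is omitted so that only the least squares part is shown.

```python
import random

def simulate_ou(theta, sigma, dt, n_steps, seed=0):
    """Euler-Maruyama path of dX = -theta * X dt + sigma dW (assumed test system)."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, dt ** 0.5)  # Brownian increment with variance dt
        x.append(x[-1] - theta * x[-1] * dt + sigma * dw)
    return x

def estimate_drift_diffusion(x, dt):
    """Least squares estimates based on the first two Kramers-Moyal coefficients.

    Drift:     E[dX | X = x] / dt  is fitted with the single basis function x.
    Diffusion: E[dX^2 | X = x] / dt is fitted with a constant.
    """
    dx = [b - a for a, b in zip(x[:-1], x[1:])]
    xs = x[:-1]
    sxx = sum(v * v for v in xs)
    sxy = sum(v * d for v, d in zip(xs, dx))
    drift_coeff = sxy / (sxx * dt)                 # slope of dX/dt against X
    diffusion = sum(d * d for d in dx) / (len(dx) * dt)  # estimate of sigma^2
    return drift_coeff, diffusion

path = simulate_ou(theta=1.0, sigma=0.5, dt=0.01, n_steps=200000, seed=0)
a_hat, d_hat = estimate_drift_diffusion(path, dt=0.01)
# a_hat should land near -theta and d_hat near sigma**2, up to sampling error.
print("estimated drift coefficient:", a_hat, "estimated diffusion:", d_hat)
```

With a richer library of basis functions, the same regression would be followed by a sparsity-promoting step to discard terms with negligible coefficients, which is the role sparse learning plays in the abstract.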
Abstract: This work addresses the multiscale optimization of the purification processes of antibody fragments. Chromatography decisions in the manufacturing processes are optimized, including the number of chromatography columns and their sizes, the number of cycles per batch, and the operational flow velocities. Data-driven models of chromatography throughput are developed considering loaded mass, flow velocity, and column bed height as the inputs, using manufacturing-scale simulated datasets based on microscale experimental data. The piecewise linear regression modeling method is adapted due to its simplicity and better prediction accuracy in comparison with other methods. Two alternative mixed-integer nonlinear programming (MINLP) models are proposed to minimize the total cost of goods per gram of the antibody purification process, incorporating the data-driven models. These MINLP models are then reformulated as mixed-integer linear programming (MILP) models using linearization techniques and multiparametric disaggregation. Two industrially relevant cases with different chromatography column size alternatives are investigated to demonstrate the applicability of the proposed models.
Abstract: In the synthesis of control algorithms for complex systems, we are often faced with imprecise or unknown mathematical models of the dynamical systems, or even with problems in finding a mathematical model of the system in the open loop. To tackle these difficulties, an approach to data-driven model identification and control algorithm design based on the maximum stability degree criterion is proposed in this paper. The data-driven model identification procedure finds the mathematical model of the system from the undamped transient response of the closed-loop system. The system is approximated with the inertial model, whose coefficients are calculated from the values of the critical transfer coefficient, oscillation amplitude, and period of the underdamped response of the closed-loop system. In the data-driven control design, the tuning parameters of the controller are calculated from the parameters obtained in the preceding system identification step, and expressions for calculating the tuning parameters are presented. The results of the data-driven model identification and the algorithm for synthesizing the controller were verified by computer simulation.
Funding: Supported by the U.S. Department of Energy's Office of Energy Efficiency and Renewable Energy (EERE) under the Solar Energy Technologies Office Award Number 38456.
Abstract: With the continual deployment of power-electronics-interfaced renewable energy resources, increasing privacy concerns due to deregulation of electricity markets, and the diversification of demand-side activities, traditional knowledge-based power system dynamic modeling methods face unprecedented challenges. Data-driven modeling has been increasingly studied in recent years because of its lesser need for prior knowledge, higher capability of handling large-scale systems, and better adaptability to variations of system operating conditions. This paper discusses the motivations and the generalized process of data-driven modeling, and provides a comprehensive overview of various state-of-the-art techniques and applications. It also compares the advantages and disadvantages of these methods and provides insight into outstanding challenges and possible research directions for the future.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61603114, 61673135) and the Fundamental Research Funds for the Central Universities of China (No. HIT.NSRIF.201826).
Abstract: This study presents an improved data-driven Model-Free Adaptive Control (MFAC) strategy for attitude stabilization of a partially constrained combined spacecraft with external disturbances and input saturation. First, a novel dynamic linearization data model for the partially constrained combined spacecraft with external disturbances is established. The generalized disturbances, composed of external disturbances and dynamic linearization errors, are then reconstructed by a Discrete Extended State Observer (DESO). With the dynamic linearization data model and the reconstructed information, a DESO-MFAC strategy for the combined spacecraft is proposed based only on input and output data. Next, the input saturation is overcome by introducing an anti-windup compensator. Finally, numerical simulations are carried out to demonstrate the effectiveness and feasibility of the proposed controller when the dynamic properties of the partially constrained combined spacecraft are completely unknown.
Abstract: Sub-Saharan Africa (SSA) has the highest maternal and under-five mortality rates in the world. The advent of the coronavirus disease 2019 exacerbated the region's problems by overwhelming the health systems and affecting access to healthcare through travel restrictions and rechannelling of resources towards the containment of the pandemic. The region failed to achieve the Millennium Development Goals on maternal and child mortality, and is poised to fail to achieve the corresponding targets in the Sustainable Development Goals. To improve maternal and child health outcomes, many SSA countries introduced digital technologies for educating pregnant and nursing women, making doctors' appointments and sending reminders to mothers and expectant mothers, as well as capturing information about patients and their illnesses. However, the collected epidemiological data are not being utilised to inform patient care and improve the quality, efficiency and accessibility of maternal, neonatal and child health (MNCH) care. To the researchers' best knowledge, no review paper has been published that focuses on digital health for MNCH care in SSA and proposes data-driven approaches to the same. Therefore, this study sought to: (1) identify digital systems for MNCH in SSA; (2) identify the applicability and weaknesses of the digital MNCH systems in SSA; and (3) propose a data-driven model for diverging emerging technologies into MNCH services in SSA to make better use of data to improve MNCH care coverage, efficiency and quality. The PRISMA methodology was used in this study. The study revealed that there are no data-driven models for monitoring pregnant women and under-five children in Sub-Saharan Africa, with the available digital health technologies mainly based on SMS and websites. Thus, the current digital health systems in SSA do not support real-time, ubiquitous, pervasive and data-driven healthcare. Their main applicability is in non-real-time pregnancy monitoring, education and information dissemination. Unless new and more effective approaches are implemented, SSA might remain the region with the highest, and unacceptable, maternal and under-five mortality rates globally. The study proposes feasible emerging technologies that can be used to provide data-driven healthcare for MNCH in SSA, together with recommendations on how to make the transition successful and lessons learned from other regions.
Funding: Supported by the National Key Research and Development Project (No. 2019YFB1405401) and the National Natural Science Foundation of China (No. 5217120056).
Abstract: Using stochastic dynamic simulation for railway vehicle collision still faces many challenges, such as high modelling complexity and long computation times. To address these challenges, we introduce a novel data-driven stochastic process modelling (DSPM) approach into dynamic simulation of railway vehicle collision. This DSPM approach consists of two steps: (i) process description, in which four kinds of kernels are used to describe the uncertainty inherent in collision processes; and (ii) solving, in which stochastic variational inference and mini-batch algorithms are used to accelerate the computation of stochastic processes. By applying DSPM, Gaussian process regression (GPR) and finite element (FE) methods to two collision scenarios (i.e. a lead car colliding with a rigid wall, and a lead car colliding with another lead car), we are able to achieve a comprehensive analysis. The comparison between the DSPM approach and the FE method revealed that the DSPM approach is capable of calculating the corresponding confidence interval while improving the overall computational efficiency. Comparing the DSPM approach with the GPR method indicates that the DSPM approach can accurately describe the dynamic response under unknown conditions. Overall, this research demonstrates the feasibility and usability of the proposed DSPM approach for stochastic dynamics simulation of railway vehicle collision.
Funding: Supported by the National Key Research and Development Program (2019YFA0708301); the National Natural Science Foundation of China (51974337); the Strategic Cooperation Projects of CNPC and CUPB (ZLZX2020-03); the Science and Technology Innovation Fund of CNPC (2021DQ02-0403); and the Open Fund of Petroleum Exploration and Development Research Institute of CNPC (2022-KFKT-09).
Abstract: We propose a method that integrates data-driven and mechanism models for well logging formation evaluation, explicitly focusing on predicting reservoir parameters such as porosity and water saturation. Accurately interpreting these parameters is crucial for effectively exploring and developing oil and gas. However, with the increasing complexity of geological conditions in this industry, there is a growing demand for improved accuracy in reservoir parameter prediction, leading to higher costs associated with manual interpretation. Conventional logging interpretation methods rely on empirical relationships between logging data and reservoir parameters, and they suffer from low interpretation efficiency, strong subjectivity, and suitability only for ideal conditions. The application of artificial intelligence to the interpretation of logging data provides a new solution to the problems of traditional methods and is expected to improve the accuracy and efficiency of interpretation. If large, high-quality datasets exist, data-driven models can reveal relationships of arbitrary complexity. Nevertheless, constructing sufficiently large logging datasets with reliable labels remains challenging, making it difficult to apply data-driven models effectively in logging data interpretation. Furthermore, data-driven models often act as "black boxes" without explaining their predictions or ensuring compliance with primary physical constraints. This paper proposes a machine learning method with strong physical constraints by integrating mechanism and data-driven models. Prior knowledge of logging data interpretation is embedded into machine learning through the network structure, loss function, and optimization algorithm. We employ the Physically Informed Auto-Encoder (PIAE) to predict porosity and water saturation, which can be trained without labeled reservoir parameters using self-supervised learning techniques. This approach effectively achieves automated interpretation and facilitates generalization across diverse datasets.
Funding: Supported by the Natural Science Foundation of Hunan Province of China (No. 2024JJ9082) and by the Fundamental Research Funds for the Central Universities (No. 531118010378).
Abstract: Pressure differential deviations under static conditions and pressure convergence fluctuations under dynamic disturbances are widely reported problems with pressure differential control in pharmaceutical cleanrooms, yet their underlying mechanisms and key causes remain insufficiently explored. This study performed a field survey and model-based simulations to identify the major influencing parameters and quantify their influence on pressure differentials. Twelve pharmaceutical cleanrooms with varying environmental control parameters were included in the field survey, all of which were served by a variable air volume (VAV) ventilation system. Large deviations between actual and design pressure differentials were found, ranging from 10% to 42.5%, and a total of 24 uncertain parameters and their respective uncertainty ranges were identified. Based on the field survey, a data-driven pressure differential response model was developed using the MATLAB/Simulink platform. The model fully took into account the system dynamics and facilitated real-time monitoring and control of the pressure differential. Sobol-based sensitivity analysis was then conducted to identify the key influencing parameters of pressure differential deviations. The simulated results revealed that static pressure differential deviations were predominantly influenced by pressure sensing accuracy, exhaust airflow accuracy, and duct impedance, while dynamic disturbances were mainly driven by room envelope airtightness and supply airflow accuracy. The interactions between connected zones were pronounced. Rooms with higher branch duct impedance experienced smaller pressure differential deviations due to natural buffering characteristics, while the parameter uncertainties in these rooms significantly affected the pressure differential in other rooms. These findings offer practical guidance for the design and operation of precise pressure differential control in pharmaceutical cleanrooms.
Abstract: The permanent magnet synchronous motor (PMSM) is widely used in alternating current servo systems because it provides high efficiency, high power density, and a wide speed regulation range. Servo systems are placing ever higher demands on its control performance. The model predictive control (MPC) algorithm is emerging as a potential high-performance motor control algorithm due to its capability of handling multiple-input and multiple-output variables and imposed constraints. When MPC is used in the PMSM control process, a nonlinear disturbance caused by changes in electromagnetic parameters or by load disturbance may lead to a mismatch between the nominal model and the controlled object, which causes prediction error and thus affects the dynamic stability of the control system. This paper proposes a data-driven MPC strategy in which the historical data in an appropriate range are utilized to eliminate the impact of parameter mismatch and further improve the control performance. The stability of the proposed algorithm is proved, and simulation demonstrates its feasibility. The superiority of the algorithm over the classical MPC strategy has also been verified.