The world’s increasing population requires the process industry to produce food, fuels, chemicals, and consumer products in a more efficient and sustainable way. Functional process materials lie at the heart of this challenge. Traditionally, new advanced materials are found empirically or through trial-and-error approaches. As theoretical methods and associated tools are continuously improved and computing power has reached a high level, it is now efficient and popular to use computational methods to guide material selection and design. Due to the strong interaction between material selection and the operation of the process in which the material is used, it is essential to perform material and process design simultaneously. Despite this significant connection, solving the integrated material and process design problem is not easy because multiple models at different scales are usually required. Hybrid modeling provides a promising option to tackle such complex design problems. In hybrid modeling, the material properties, which are computationally expensive to obtain, are described by data-driven models, while the well-known process-related principles are represented by mechanistic models. This article highlights the significance of hybrid modeling in multiscale material and process design. The generic design methodology is first introduced. Six important application areas are then selected: four from the chemical engineering field and two from the energy systems engineering domain. For each selected area, state-of-the-art work using hybrid modeling for multiscale material and process design is discussed. Concluding remarks are provided at the end, and current limitations and future opportunities are pointed out.
We propose an integrated method of data-driven and mechanism models for well logging formation evaluation, explicitly focusing on predicting reservoir parameters such as porosity and water saturation. Accurately interpreting these parameters is crucial for effectively exploring and developing oil and gas. However, with the increasing complexity of geological conditions in this industry, there is a growing demand for improved accuracy in reservoir parameter prediction, leading to higher costs associated with manual interpretation. Conventional logging interpretation methods rely on empirical relationships between logging data and reservoir parameters, and they suffer from low interpretation efficiency, strong subjectivity, and suitability only for ideal conditions. The application of artificial intelligence to the interpretation of logging data provides a new solution to the problems of traditional methods and is expected to improve the accuracy and efficiency of interpretation. If large, high-quality datasets exist, data-driven models can reveal relationships of arbitrary complexity. Nevertheless, constructing sufficiently large logging datasets with reliable labels remains challenging, making it difficult to apply data-driven models effectively in logging data interpretation. Furthermore, data-driven models often act as "black boxes" without explaining their predictions or ensuring compliance with basic physical constraints. This paper proposes a machine learning method with strong physical constraints by integrating mechanism and data-driven models. Prior knowledge of logging data interpretation is embedded into machine learning through the network structure, loss function, and optimization algorithm. We employ the Physically Informed Auto-Encoder (PIAE) to predict porosity and water saturation, which can be trained without labeled reservoir parameters using self-supervised learning techniques. This approach effectively achieves automated interpretation and facilitates generalization across diverse datasets.
Steam cracking is the dominant technology for producing light olefins, which are regarded as the foundation of the chemical industry. Predictive models of the cracking process can boost production efficiency and profit margin. Rapid advancements in machine learning research have recently enabled data-driven solutions to usher in a new era of process modeling. However, their practical application to steam cracking is still hindered by the trade-off between prediction accuracy and computational speed. This research presents a framework for data-driven intelligent modeling of the steam cracking process. Industrial data preparation and feature engineering techniques provide computation-ready datasets for the framework, and feedstock similarities are exploited using k-means clustering. We propose LArge-Residuals-Deletion Multivariate Adaptive Regression Spline (LARD-MARS), a modeling approach that explicitly generates output formulas and eliminates potentially outlying instances. The framework is further validated by the presentation of clustering results, the explanation of variable importance, and the testing and comparison of model performance.
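The steam cracking abstract above does not spell out how LARD-MARS deletes large residuals, so the following is only a minimal sketch of the general fit, delete, refit idea. It assumes a straight-line least-squares model in place of MARS, and a hypothetical deletion rule (drop instances whose absolute residual exceeds `factor` times the root-mean-square residual). The function names, the threshold, and the toy data are illustrative and are not taken from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_with_large_residual_deletion(xs, ys, factor=2.0):
    """Fit once, delete instances with large residuals, then refit.

    Assumption: a straight line stands in for the MARS model, and the
    deletion threshold (factor * RMS residual) is a hypothetical rule.
    """
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    rms = (sum(r * r for r in residuals) / len(residuals)) ** 0.5
    if rms < 1e-12:
        # The first fit is already essentially exact; nothing to delete.
        return slope, intercept, list(zip(xs, ys))
    kept = [(x, y) for x, y, r in zip(xs, ys, residuals)
            if abs(r) <= factor * rms]
    kept_xs = [x for x, _ in kept]
    kept_ys = [y for _, y in kept]
    slope, intercept = fit_line(kept_xs, kept_ys)
    return slope, intercept, kept


# Synthetic toy data: points on y = 2x + 1 with one injected outlier.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
ys[5] = 100.0
slope, intercept, kept = fit_with_large_residual_deletion(xs, ys)
```

A single deletion pass is used here for simplicity; an iterative variant would need a stopping rule so that it does not keep trimming points once the residuals are down to noise level.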
The curse of dimensionality refers to the problem of increased sparsity and computational complexity when dealing with high-dimensional data. In recent years, the types and variables of industrial data have increased significantly, making data-driven models more challenging to develop. To address this problem, data augmentation technology has been introduced as an effective tool to solve the sparsity problem of high-dimensional industrial data. This paper systematically explores and discusses the necessity, feasibility, and effectiveness of augmented industrial data-driven modeling in the context of the curse of dimensionality and virtual big data. Then, the process of data augmentation modeling is analyzed, and the concept of data boosting augmentation is proposed. Data boosting augmentation involves designing the reliability weight and actual-virtual weight functions, and developing a double weighted partial least squares model to optimize the three stages of data generation, data fusion, and modeling. This approach significantly improves the interpretability, effectiveness, and practicality of data augmentation in industrial modeling. Finally, the proposed method is verified using practical examples of fault diagnosis systems and virtual measurement systems in industry. The results demonstrate the effectiveness of the proposed approach in improving the accuracy and robustness of data-driven models, making them more suitable for real-world industrial applications.
Aerodynamic surrogate modeling mostly relies only on integrated loads data obtained from simulation or experiment, while neglecting and wasting the valuable distributed physical information on the surface. To make full use of both integrated and distributed loads, a modeling paradigm called heterogeneous data-driven aerodynamic modeling is presented. The essential concept is to incorporate the physical information of distributed loads as additional constraints within end-to-end aerodynamic modeling. For heterogeneous data, a novel and easily applicable physical feature embedding modeling framework is designed. This framework extracts low-dimensional physical features from the pressure distribution and then effectively enhances the modeling of the integrated loads via feature embedding. The proposed framework can be coupled with multiple feature extraction methods, and its good generalization capability over different airfoils is verified through a transonic case. Compared with traditional direct modeling, the proposed framework can reduce testing errors by almost 50%. Given the same prediction accuracy, it can save more than half of the training samples. Furthermore, visualization analysis has revealed a significant correlation between the discovered low-dimensional physical features and the heterogeneous aerodynamic loads, which shows the interpretability and credibility of the superior performance offered by the proposed deep learning framework.
The constitutive models of shape memory alloys (SMAs) play an important role in facilitating the widespread application of such alloys in various engineering fields. However, to accurately describe the deformation behaviors of SMAs, the existing constitutive models employ concepts from classical plasticity and involve a series of complex mathematical equations. Such complexity makes the construction, implementation, and application of the constitutive models inconvenient. To overcome these shortcomings, a data-driven constitutive model of SMAs is developed in this work based on the artificial neural network (ANN). In the proposed model, the components of the strain tensor in principal space, the ambient temperature, and the maximum equivalent strain in the deformation history from the initial state to the current loading state are chosen as the input features, and the components of the stress tensor in principal space are set as the output. The proposed ANN-based constitutive model is implemented in the finite element program ABAQUS by deriving its consistent tangent modulus and writing a user-defined material subroutine. The stress-strain responses of SMA material under various loading paths and at different ambient temperatures, which are generated from the existing constitutive model (numerical experiments), are used to train the ANN model. To validate the capability of the proposed model, the predicted stress-strain responses of SMA material and the global and local responses of two typical SMA structures are compared with the corresponding numerical experiments. This work demonstrates good potential for obtaining the constitutive model of SMAs from data alone, avoiding the need for vast stores of knowledge in the construction of constitutive models.
This paper focuses on the numerical solution of a tumor growth model under a data-driven approach. Based on the inherent laws of the data and reasonable assumptions, an ordinary differential equation model for tumor growth is established. Nonlinear fitting is employed to obtain the optimal parameter estimation of the mathematical model, and the numerical solution is carried out using the Matlab software. By comparing the clinical data with the simulation results, a good agreement is achieved, which verifies the rationality and feasibility of the model.
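The tumor growth abstract above names neither the ODE nor the fitting algorithm, so the sketch below only illustrates the workflow of integrating a growth ODE and fitting its parameters by least squares. It assumes a logistic growth law dV/dt = r V (1 - V/K), a classical fourth-order Runge-Kutta integrator, and an exhaustive grid search in place of a proper nonlinear optimizer. The "observations" are synthetic values generated from the same model, not clinical data, and all names and numbers are hypothetical.

```python
def logistic_rhs(v, r, cap):
    """Right-hand side of the assumed logistic ODE dV/dt = r*V*(1 - V/cap)."""
    return r * v * (1.0 - v / cap)


def simulate(v0, r, cap, times, substeps=20):
    """Integrate the ODE with classical RK4; return V at each requested time."""
    values = [v0]
    v = v0
    for t_prev, t_next in zip(times, times[1:]):
        h = (t_next - t_prev) / substeps
        for _ in range(substeps):
            k1 = logistic_rhs(v, r, cap)
            k2 = logistic_rhs(v + 0.5 * h * k1, r, cap)
            k3 = logistic_rhs(v + 0.5 * h * k2, r, cap)
            k4 = logistic_rhs(v + h * k3, r, cap)
            v += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        values.append(v)
    return values


def sum_squared_error(params, v0, times, observed):
    """Least-squares objective between simulated and observed volumes."""
    r, cap = params
    predicted = simulate(v0, r, cap, times)
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def fit_by_grid_search(v0, times, observed, r_grid, cap_grid):
    """Pick the (r, cap) pair on the grid with the smallest squared error."""
    candidates = ((r, cap) for r in r_grid for cap in cap_grid)
    return min(candidates,
               key=lambda p: sum_squared_error(p, v0, times, observed))


# Synthetic, noise-free observations generated from the assumed model.
times = [float(t) for t in range(11)]
observed = simulate(5.0, 0.5, 100.0, times)

r_grid = [i / 20 for i in range(2, 21)]          # 0.10, 0.15, ..., 1.00
cap_grid = [float(c) for c in range(50, 155, 5)]  # 50, 55, ..., 150
best = fit_by_grid_search(5.0, times, observed, r_grid, cap_grid)
```

With real measurements the grid search would normally be replaced by a gradient-based or Levenberg-Marquardt style fit, and the recovered parameters would only approximate the underlying values.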
With the rapid advancement of machine learning technology and its growing adoption in research and engineering applications, an increasing number of studies have embraced data-driven approaches for modeling wind turbine wakes. These models leverage the ability to capture complex, high-dimensional characteristics of wind turbine wakes while offering significantly greater efficiency in the prediction process than physics-driven models. As a result, data-driven wind turbine wake models are regarded as powerful and effective tools for predicting wake behavior and turbine power output. This paper aims to provide a concise yet comprehensive review of existing studies on wind turbine wake modeling that employ data-driven approaches. It begins by defining and classifying machine learning methods to facilitate a clearer understanding of the reviewed literature. Subsequently, the related studies are categorized into four key areas: wind turbine power prediction, data-driven analytic wake models, wake field reconstruction, and the incorporation of explicit physical constraints. The accuracy of data-driven models is influenced by two primary factors: the quality of the training data and the performance of the model itself. Accordingly, both data accuracy and model structure are discussed in detail within the review.
pH-sensitive hydrogels play a crucial role in applications such as soft robotics, drug delivery, and biomedical sensors, which require precise control of swelling behaviors and stress distributions. Traditional experimental methods struggle to capture stress distributions due to technical limitations, while numerical approaches are often computationally intensive. This study presents a hybrid framework combining analytical modeling and machine learning (ML) to overcome these challenges. An analytical model is used to simulate transient swelling behaviors and stress distributions, and its viability is confirmed by comparing the simulation results with existing experimental swelling data. The predictions from this model are used to train neural networks, including a two-step augmented architecture. The initial neural network predicts hydration values, which are then fed into a second network to predict stress distributions, effectively capturing nonlinear interdependencies. This approach achieves mean absolute errors (MAEs) as low as 0.031, with average errors of 1.9% for the radial stress and 2.55% for the hoop stress. This framework significantly enhances predictive accuracy and reduces computational complexity, offering actionable insights for optimizing hydrogel-based systems.
The distillation process is an important chemical process, and applying a data-driven modelling approach has the potential to reduce model complexity compared to mechanistic modelling, thus improving the efficiency of process optimization or monitoring studies. However, the distillation process is highly nonlinear and has multiple uncertainty perturbation intervals, which makes accurate data-driven modelling of distillation processes challenging. This paper proposes a systematic data-driven modelling framework to solve these problems. Firstly, data segment variance was introduced into the K-means algorithm to form K-means data interval (KMDI) clustering, which clusters the data into perturbed and steady-state intervals for steady-state data extraction. Secondly, the maximal information coefficient (MIC) was employed to calculate the nonlinear correlation between variables for removing redundant features. Finally, extreme gradient boosting (XGBoost) was integrated as the base learner into adaptive boosting (AdaBoost), with an error threshold (ET) set to improve the weight update strategy, to construct a new ensemble learning algorithm, XGBoost-AdaBoost-ET. The superiority of the proposed framework is verified by applying it to a real industrial propylene distillation process.
The permanent magnet synchronous motor (PMSM) is widely used in alternating current servo systems as it provides high efficiency, high power density, and a wide speed regulation range. The servo system is placing higher demands on its control performance. The model predictive control (MPC) algorithm is emerging as a potential high-performance motor control algorithm due to its capability of handling multiple-input and multiple-output variables and imposed constraints. For MPC used in the PMSM control process, a nonlinear disturbance caused by changes in electromagnetic parameters or by load disturbance may lead to a mismatch between the nominal model and the controlled object, which causes prediction error and thus affects the dynamic stability of the control system. This paper proposes a data-driven MPC strategy in which historical data in an appropriate range are utilized to eliminate the impact of parameter mismatch and further improve the control performance. The stability of the proposed algorithm is proved, and simulation demonstrates its feasibility. Its superiority over the classical MPC strategy has also been verified.
To ensure the safe operation of batteries, accurately obtaining key internal state parameters is essential. However, traditional parameter measurement methods require either opening the battery or long-term measurements, which are impractical. Therefore, fixed values are commonly used for these parameters in electrochemical models, which imposes significant limitations. To overcome these limitations, this paper proposes a deep neural network (DNN) based data-driven evaluation method to determine model parameters. By coupling an improved one-dimensional isothermal pseudo-two-dimensional (P2D) model with a DNN, this study identified concentration-dependent parameters through detailed discharge curve analysis. The results show that the data-driven method can effectively obtain the change trend of concentration-dependent parameters from the charge and discharge curves, and that the method can be extended to different battery systems at different discharge rates and in aging applications. This work is expected to provide new parameter selection insights for data-driven battery prediction and monitoring models.
With the continual deployment of power-electronics-interfaced renewable energy resources, increasing privacy concerns due to deregulation of electricity markets, and the diversification of demand-side activities, traditional knowledge-based power system dynamic modeling methods are faced with unprecedented challenges. Data-driven modeling has been increasingly studied in recent years because of its lesser need for prior knowledge, higher capability of handling large-scale systems, and better adaptability to variations of system operating conditions. This paper discusses the motivations and the generalized process of data-driven modeling, and provides a comprehensive overview of various state-of-the-art techniques and applications. It also compares the advantages and disadvantages of these methods and provides insight into outstanding challenges and possible research directions for the future.
The dynamical modeling of projectile systems with sufficient accuracy is very difficult due to the high-dimensional space and various perturbations. With the recent rapid development of data science and scientific measurement tools, numerous data-driven methods have been devoted to discovering governing laws from data. In this work, a data-driven method is employed to model the projectile based on the Kramers–Moyal formulas. More specifically, the four-dimensional projectile system is assumed to be an Itô stochastic differential equation. Then the least squares method and sparse learning are applied to identify the drift coefficient and diffusion matrix from sample path data, which agree well with the real system. The effectiveness of the data-driven method demonstrates that it will become a powerful tool for extracting governing equations and predicting complex dynamical behaviors of the projectile.
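For reference, the Kramers–Moyal formulas mentioned in the projectile abstract above can be written as follows. The notation is an assumption made here for illustration (the abstract gives none): $X_t \in \mathbb{R}^4$ is the state, $a$ the drift coefficient, $\sigma$ the noise intensity, and $B_t$ a Brownian motion.

```latex
% Assumed Ito SDE for the four-dimensional state X_t
\mathrm{d}X_t = a(X_t)\,\mathrm{d}t + \sigma(X_t)\,\mathrm{d}B_t

% Kramers--Moyal formula for the drift coefficient
a(x) = \lim_{\Delta t \to 0} \frac{1}{\Delta t}\,
       \mathbb{E}\!\left[\, X_{t+\Delta t} - X_t \,\middle|\, X_t = x \right]

% Kramers--Moyal formula for the diffusion matrix
\sigma(x)\,\sigma(x)^{\mathsf{T}} = \lim_{\Delta t \to 0} \frac{1}{\Delta t}\,
       \mathbb{E}\!\left[\, (X_{t+\Delta t} - X_t)(X_{t+\Delta t} - X_t)^{\mathsf{T}}
       \,\middle|\, X_t = x \right]
```

In practice the limits are replaced by a small finite time step and the conditional expectations by regressions over sample paths, which is where least squares and sparse learning would enter.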
This paper aims to develop Machine Learning algorithms to classify electronic articles related to COVID-19 through information retrieval and topic modelling. The methodology of this study is divided into three phases: the Text Classification Approach (TCA), the Proposed Algorithms Interpretation (PAI), and finally, the Information Retrieval Approach (IRA). The TCA covers the text preprocessing pipeline that produces a clean corpus. The Global Vectors for Word Representation (GloVe) pre-trained model, FastText, Term Frequency-Inverse Document Frequency (TF-IDF), and Bag-of-Words (BOW) are interpreted in this research as feature extraction methods. The PAI presents the Bidirectional Long Short-Term Memory (Bi-LSTM) and Convolutional Neural Network (CNN) models used to classify the COVID-19 news. The IRA explains the mathematical interpretation of Latent Dirichlet Allocation (LDA), used for topic modelling in Information Retrieval (IR). In this study, 99% accuracy was obtained by performing K-fold cross-validation on Bi-LSTM with GloVe. A comparative analysis between Deep Learning and Machine Learning based on feature extraction and computational complexity has been performed. Furthermore, some text analyses and the most influential aspects of each document have been explored. We also used Bidirectional Encoder Representations from Transformers (BERT) as a Deep Learning mechanism in our model training, but the results were not satisfactory. However, the proposed system can be adapted to real-time news classification of COVID-19.
Increasing the production and utilization of shale gas is of great significance for building a clean and low-carbon energy system. A sharp decline in gas production has been widely observed in shale gas reservoirs. Forecasting shale gas production is still challenging due to complex fracture networks, dynamic fracture properties, frac hits, complicated multiphase flow, and multi-scale flow, as well as data quality and uncertainty. This work develops an integrated framework for evaluating shale gas well production based on data-driven models. Firstly, a comprehensive system of dominant factors is established, including geological, drilling, fracturing, and production factors. Data processing and visualization are required to ensure data quality and determine the final data set. A shale gas production evaluation model is developed to evaluate shale gas production levels. Finally, the random forest algorithm is used to forecast shale gas production. The prediction accuracy of the shale gas production level is higher than 95% for the shale gas reservoirs in China. Forty-one wells are randomly selected to predict cumulative gas production using the optimal regression model. The proposed shale gas production evaluation framework avoids the numerous assumptions of analytical or semi-analytical models as well as the huge computation cost and poor generalization of numerical modelling.
In terms of multiple temporal and spatial scales, massive data from experiments, flow field measurements, and high-fidelity numerical simulations have greatly promoted the rapid development of fluid mechanics. Machine Learning (ML) provides a wealth of analysis methods to extract potential information from a large amount of data for in-depth understanding of the underlying flow mechanism or for further applications. Furthermore, machine learning algorithms can enhance flow information and automatically perform tasks that involve active flow control and optimization. This article provides an overview of the past history, current development, and promising prospects of machine learning in the field of fluid mechanics. In addition, to facilitate understanding, this article outlines the basic principles of machine learning methods and their applications in engineering practice, turbulence models, flow field representation problems, and active flow control. In short, machine learning provides a powerful and more intelligent data processing architecture, and may greatly enrich the existing research methods and industrial applications of fluid mechanics.
In the manufacturing of thin-wall components for the aerospace industry, apart from the side wall contour error, the Remaining Bottom Thickness Error (RBTE) of thin-wall pocket components (e.g. rocket shells) is equally important but overlooked in current research. If the RBTE is reduced by 30%, the weight reduction of the entire component can reach tens of kilograms, while the dynamic balance performance of the large component is improved. Current RBTE control requires off-process measurement of a limited number of discrete points on the component bottom to provide the reference value for compensation. This leads to incomplete control of the remaining bottom thickness and redundant measurement in manufacturing. In this paper, a data-driven physics-based model framework is proposed and developed for the real-time prediction of critical quality for large components, which enables accurate prediction and compensation of the RBTE value for thin-wall components. The physics-based model considers the primary root cause of RBTE formation, namely the variation in Axial Material Removal Thickness (AMRT) induced by tool deflection and clamping stiffness. To account for the dynamic and inherent coupling of the complicated manufacturing system, multi-feature fusion and machine learning algorithms, i.e. kernel Principal Component Analysis (kPCA) and kernel Support Vector Regression (kSVR), are incorporated with the physics-based model. The proposed data-driven physics-based model therefore combines both the process mechanism and the system disturbance to achieve better prediction accuracy. A final verification experiment is carried out to validate the effectiveness of the proposed method for dimensional accuracy prediction in pocket milling, and the prediction accuracy of AMRT reaches 0.014 mm and 0.019 mm for straight and corner milling, respectively.
Molten iron temperature and the Si, P, and S contents are the most essential molten iron quality (MIQ) indices in blast furnace (BF) ironmaking, and they require strict monitoring during the whole ironmaking production. However, these MIQ parameters are difficult to measure directly online, and a large time delay exists in off-line analysis through laboratory sampling. Focusing on this practical challenge, a data-driven modeling method was presented for the prediction of MIQ using improved multivariable incremental random vector functional-link networks (M-I-RVFLNs). Compared with the conventional random vector functional-link networks (RVFLNs) and the online sequential RVFLNs, the M-I-RVFLNs have solved the problem of deciding the optimal number of hidden nodes and overcome the overfitting problems. Moreover, the proposed M-I-RVFLNs model has exhibited the potential for multivariable prediction of the MIQ and improved the terminal condition for the multiple-input multiple-output (MIMO) dynamic system, which is suitable for the BF ironmaking process in practice. Ultimately, industrial experiments and comparative studies have been conducted on BF No. 2 of Liuzhou Iron and Steel Group Co. Ltd. of China using the proposed method, and the results demonstrate that the established model produces better estimation accuracy than other MIQ modeling methods.
The complex sand-casting process, combined with the interactions between process parameters, makes it difficult to control the casting quality, resulting in a high scrap rate. A strategy based on a data-driven model was proposed to reduce casting defects and improve production efficiency, which includes the random forest (RF) classification model, feature importance analysis, and process parameter optimization with Monte Carlo simulation. The collected data, which include four types of defects and the corresponding process parameters, were used to construct the RF model. Classification results show a recall rate above 90% for all categories. The Gini Index was used to assess the importance of the process parameters in the formation of various defects in the RF model. Finally, the classification model was applied to different production conditions for quality prediction. In the case of process parameter optimization for gas porosity defects, this model serves as the experimental process in the Monte Carlo method to estimate a better temperature distribution. The prediction model, when applied in the factory, greatly improved the efficiency of defect detection. Results show that the scrap rate decreased from 10.16% to 6.68%.
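The sand-casting abstract above does not say how the Gini Index was computed. In random forests, a common choice is to score a feature by how much its splits reduce the Gini impurity, 1 - sum(p_i^2), where p_i is the fraction of samples in class i. The sketch below shows that quantity for a single split; the label names and the split are hypothetical and are not the authors' data.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_i^2) over the class fractions p_i."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_decrease(parent, left, right):
    """Impurity of the parent node minus the size-weighted child impurity."""
    n = len(parent)
    weighted = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted


# Hypothetical node: two castings with gas porosity, two without defects,
# separated perfectly by a split on some process parameter.
parent = ["gas_porosity", "gas_porosity", "no_defect", "no_defect"]
left = parent[:2]
right = parent[2:]
decrease = gini_decrease(parent, left, right)
```

Averaging such decreases over every split that uses a given process parameter, across all trees, gives one common form of importance score for that parameter.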
Funding: Supported by the National Key Research and Development Program (2019YFA0708301); the National Natural Science Foundation of China (51974337); the Strategic Cooperation Projects of CNPC and CUPB (ZLZX2020-03); the Science and Technology Innovation Fund of CNPC (2021DQ02-0403); and the Open Fund of Petroleum Exploration and Development Research Institute of CNPC (2022-KFKT-09).
Abstract: We propose an integrated method of data-driven and mechanism models for well logging formation evaluation, explicitly focusing on predicting reservoir parameters such as porosity and water saturation. Accurately interpreting these parameters is crucial for effectively exploring and developing oil and gas. However, with the increasing complexity of geological conditions in this industry, there is a growing demand for improved accuracy in reservoir parameter prediction, leading to higher costs associated with manual interpretation. Conventional logging interpretation methods rely on empirical relationships between logging data and reservoir parameters; they suffer from low interpretation efficiency and strong subjectivity, and are suitable only for ideal conditions. The application of artificial intelligence to the interpretation of logging data provides a new solution to the problems of traditional methods and is expected to improve the accuracy and efficiency of the interpretation. If large, high-quality datasets exist, data-driven models can reveal relationships of arbitrary complexity. Nevertheless, constructing sufficiently large logging datasets with reliable labels remains challenging, making it difficult to apply data-driven models effectively in logging data interpretation. Furthermore, data-driven models often act as “black boxes” without explaining their predictions or ensuring compliance with primary physical constraints. This paper proposes a machine learning method with strong physical constraints by integrating mechanism and data-driven models. Prior knowledge of logging data interpretation is embedded into machine learning through the network structure, loss function, and optimization algorithm. We employ the Physically Informed Auto-Encoder (PIAE) to predict porosity and water saturation, which can be trained without labeled reservoir parameters using self-supervised learning techniques. This approach effectively achieves automated interpretation and facilitates generalization across diverse datasets.
Funding: Supported by the National Key Research and Development Program of China (2021 YFB 4000500, 2021 YFB 4000501, and 2021 YFB 4000502).
Abstract: Steam cracking is the dominant technology for producing light olefins, which are believed to be the foundation of the chemical industry. Predictive models of the cracking process can boost production efficiency and profit margin. Rapid advancements in machine learning research have recently enabled data-driven solutions to usher in a new era of process modeling. However, their practical application to steam cracking is still hindered by the trade-off between prediction accuracy and computational speed. This research presents a framework for data-driven intelligent modeling of the steam cracking process. Industrial data preparation and feature engineering techniques provide computation-ready datasets for the framework, and feedstock similarities are exploited using k-means clustering. We propose LArge-Residuals-Deletion Multivariate Adaptive Regression Spline (LARD-MARS), a modeling approach that explicitly generates output formulas and eliminates potentially outlying instances. The framework is further validated by the presentation of clustering results, the explanation of variable importance, and the testing and comparison of model performance.
Funding: Supported in part by the National Natural Science Foundation of China (NSFC) (92167106, 61833014) and the Key Research and Development Program of Zhejiang Province (2022C01206).
Abstract: The curse of dimensionality refers to the problem of increased sparsity and computational complexity when dealing with high-dimensional data. In recent years, the types and variables of industrial data have increased significantly, making data-driven models more challenging to develop. To address this problem, data augmentation technology has been introduced as an effective tool to solve the sparsity problem of high-dimensional industrial data. This paper systematically explores and discusses the necessity, feasibility, and effectiveness of augmented industrial data-driven modeling in the context of the curse of dimensionality and virtual big data. Then, the process of data augmentation modeling is analyzed, and the concept of data boosting augmentation is proposed. Data boosting augmentation involves designing the reliability weight and actual-virtual weight functions, and developing a double weighted partial least squares model to optimize the three stages of data generation, data fusion, and modeling. This approach significantly improves the interpretability, effectiveness, and practicality of data augmentation in industrial modeling. Finally, the proposed method is verified using practical examples of fault diagnosis systems and virtual measurement systems in industry. The results demonstrate the effectiveness of the proposed approach in improving the accuracy and robustness of data-driven models, making them more suitable for real-world industrial applications.
Funding: Supported by the National Natural Science Foundation of China (Nos. 92152301, 12072282).
Abstract: Aerodynamic surrogate modeling mostly relies only on integrated loads data obtained from simulation or experiment, while neglecting and wasting the valuable distributed physical information on the surface. To make full use of both integrated and distributed loads, a modeling paradigm called heterogeneous data-driven aerodynamic modeling is presented. The essential concept is to incorporate the physical information of distributed loads as additional constraints within end-to-end aerodynamic modeling. For heterogeneous data, a novel and easily applicable physical feature embedding modeling framework is designed. This framework extracts low-dimensional physical features from the pressure distribution and then effectively enhances the modeling of the integrated loads via feature embedding. The proposed framework can be coupled with multiple feature extraction methods, and its good generalization capability over different airfoils is verified through a transonic case. Compared with traditional direct modeling, the proposed framework can reduce testing errors by almost 50%. For the same prediction accuracy, it can save more than half of the training samples. Furthermore, visualization analysis reveals a significant correlation between the discovered low-dimensional physical features and the heterogeneous aerodynamic loads, which supports the interpretability and credibility of the superior performance offered by the proposed deep learning framework.
Funding: Supported by the National Natural Science Foundation of China (NSFC) (Grant No. 12322203).
Abstract: The constitutive models of shape memory alloys (SMAs) play an important role in facilitating the widespread application of such alloys in various engineering fields. However, to accurately describe the deformation behaviors of SMAs, existing constitutive models employ concepts from classical plasticity and involve a series of complex mathematical equations. Such complexity makes the construction, implementation, and application of the constitutive models inconvenient. To overcome these shortcomings, a data-driven constitutive model of SMAs is developed in this work based on the artificial neural network (ANN). In the proposed model, the components of the strain tensor in principal space, the ambient temperature, and the maximum equivalent strain in the deformation history from the initial state to the current loading state are chosen as the input features, and the components of the stress tensor in principal space are set as the output. The proposed ANN-based constitutive model is implemented in the finite element program ABAQUS by deriving its consistent tangent modulus and writing a user-defined material subroutine. The stress-strain responses of SMA material under various loading paths and at different ambient temperatures, generated from the existing constitutive model (numerical experiments), are used to train the ANN model. To validate the capability of the proposed model, the predicted stress-strain responses of SMA material and the global and local responses of two typical SMA structures are compared with the corresponding numerical experiments. This work demonstrates good potential for obtaining the constitutive model of SMAs purely from data, avoiding the need for vast stores of knowledge in the construction of constitutive models.
Funding: National Natural Science Foundation of China (Project No.: 12371428); Projects of the Provincial College Students’ Innovation and Training Program in 2024 (Project No.: S202413023106, S202413023110).
Abstract: This paper focuses on the numerical solution of a tumor growth model under a data-driven approach. Based on the inherent laws of the data and reasonable assumptions, an ordinary differential equation model for tumor growth is established. Nonlinear fitting is employed to obtain the optimal parameter estimates of the mathematical model, and the numerical solution is carried out using the Matlab software. The simulation results agree well with the clinical data, which verifies the rationality and feasibility of the model.
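The fitting step described in this abstract can be illustrated with a minimal sketch. The abstract does not state which ordinary differential equation is used, so the logistic law dV/dt = rV(1 - V/K), the parameter names, the grid-refinement fitter, and the synthetic data below are all assumptions made for illustration; the paper itself uses Matlab and clinical data.

```python
import math

def logistic_volume(t, v0, r, k):
    """Closed-form solution of dV/dt = r*V*(1 - V/K) with V(0) = v0 (assumed model)."""
    return k / (1.0 + (k / v0 - 1.0) * math.exp(-r * t))

def sse(params, times, volumes, v0):
    """Sum of squared errors between the model and the observations."""
    r, k = params
    return sum((logistic_volume(t, v0, r, k) - v) ** 2 for t, v in zip(times, volumes))

def fit_logistic(times, volumes, v0, r_range=(0.01, 1.0), k_range=(10.0, 1000.0),
                 rounds=8, grid=21):
    """Nonlinear least squares by repeated grid refinement (standard library only)."""
    r_lo, r_hi = r_range
    k_lo, k_hi = k_range
    best = None
    for _ in range(rounds):
        r_step = (r_hi - r_lo) / (grid - 1)
        k_step = (k_hi - k_lo) / (grid - 1)
        candidates = [(r_lo + i * r_step, k_lo + j * k_step)
                      for i in range(grid) for j in range(grid)]
        best = min(candidates, key=lambda p: sse(p, times, volumes, v0))
        # Shrink the search box to two grid steps around the current best point.
        r_lo, r_hi = max(best[0] - 2 * r_step, 1e-9), best[0] + 2 * r_step
        k_lo, k_hi = max(best[1] - 2 * k_step, v0 * 1.001), best[1] + 2 * k_step
    return best

# Synthetic "measurements" generated from known parameters (not clinical data).
TRUE_R, TRUE_K, V0 = 0.3, 500.0, 5.0
times = [float(t) for t in range(0, 40, 2)]
volumes = [logistic_volume(t, V0, TRUE_R, TRUE_K) for t in times]

r_hat, k_hat = fit_logistic(times, volumes, V0)
print(f"estimated r = {r_hat:.4f}, estimated K = {k_hat:.2f}")
```

With real measurements, a dedicated nonlinear least-squares solver would normally replace the grid search; the grid is used here only to keep the sketch dependency-free.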
Funding: Supported by the National Natural Science Foundation of China under Grant No. 52131102.
Abstract: With the rapid advancement of machine learning technology and its growing adoption in research and engineering applications, an increasing number of studies have embraced data-driven approaches for modeling wind turbine wakes. These models can capture complex, high-dimensional characteristics of wind turbine wakes while offering significantly greater prediction efficiency than physics-driven models. As a result, data-driven wind turbine wake models are regarded as powerful and effective tools for predicting wake behavior and turbine power output. This paper aims to provide a concise yet comprehensive review of existing studies on wind turbine wake modeling that employ data-driven approaches. It begins by defining and classifying machine learning methods to facilitate a clearer understanding of the reviewed literature. Subsequently, the related studies are categorized into four key areas: wind turbine power prediction, data-driven analytic wake models, wake field reconstruction, and the incorporation of explicit physical constraints. The accuracy of data-driven models is influenced by two primary factors: the quality of the training data and the performance of the model itself. Accordingly, both data accuracy and model structure are discussed in detail within the review.
Abstract: pH-sensitive hydrogels play a crucial role in applications such as soft robotics, drug delivery, and biomedical sensors, which require precise control of swelling behaviors and stress distributions. Traditional experimental methods struggle to capture stress distributions due to technical limitations, while numerical approaches are often computationally intensive. This study presents a hybrid framework combining analytical modeling and machine learning (ML) to overcome these challenges. An analytical model is used to simulate transient swelling behaviors and stress distributions, and its viability is confirmed by comparing the simulation results with existing experimental swelling data. The predictions from this model are used to train neural networks, including a two-step augmented architecture. The initial neural network predicts hydration values, which are then fed into a second network to predict stress distributions, effectively capturing nonlinear interdependencies. This approach achieves mean absolute errors (MAEs) as low as 0.031, with average errors of 1.9% for the radial stress and 2.55% for the hoop stress. This framework significantly enhances predictive accuracy and reduces computational complexity, offering actionable insights for optimizing hydrogel-based systems.
Funding: Supported by the National Key Research and Development Program of China (2023YFB3307801); the National Natural Science Foundation of China (62394343, 62373155, 62073142); the Major Science and Technology Project of Xinjiang (No. 2022A01006-4); the Programme of Introducing Talents of Discipline to Universities (the 111 Project) under Grant B17017; the Fundamental Research Funds for the Central Universities, Science Foundation of China University of Petroleum, Beijing (No. 2462024YJRC011); and the Open Research Project of the State Key Laboratory of Industrial Control Technology, China (Grant No. ICT2024B70).
Abstract: Distillation is an important chemical process, and a data-driven modelling approach has the potential to reduce model complexity compared with mechanistic modelling, thus improving the efficiency of process optimization or monitoring studies. However, the distillation process is highly nonlinear and has multiple uncertainty perturbation intervals, which makes accurate data-driven modelling of distillation processes challenging. This paper proposes a systematic data-driven modelling framework to solve these problems. Firstly, data segment variance was introduced into the K-means algorithm to form K-means data interval (KMDI) clustering, which clusters the data into perturbed and steady-state intervals for steady-state data extraction. Secondly, the maximal information coefficient (MIC) was employed to calculate the nonlinear correlation between variables in order to remove redundant features. Finally, extreme gradient boosting (XGBoost) was integrated as the base learner into adaptive boosting (AdaBoost), with an error threshold (ET) set to improve the weight update strategy, to construct a new ensemble learning algorithm, XGBoost-AdaBoost-ET. The superiority of the proposed framework is verified by applying it to a real industrial propylene distillation process.
Abstract: The permanent magnet synchronous motor (PMSM) is widely used in alternating current servo systems because it provides high efficiency, high power density, and a wide speed regulation range. Servo systems are placing higher demands on its control performance. The model predictive control (MPC) algorithm is emerging as a potential high-performance motor control algorithm due to its capability of handling multiple-input multiple-output variables and imposed constraints. When MPC is used for PMSM control, a nonlinear disturbance caused by changes in electromagnetic parameters or by load disturbance may lead to a mismatch between the nominal model and the controlled object, which causes prediction error and thus affects the dynamic stability of the control system. This paper proposes a data-driven MPC strategy in which historical data in an appropriate range are used to eliminate the impact of parameter mismatch and further improve the control performance. The stability of the proposed algorithm is proved, and simulation demonstrates its feasibility. Its superiority over the classical MPC strategy has also been verified.
Funding: Supported by the National Natural Science Foundation of China (22478239), the Science and Technology Commission of Shanghai Municipality (19DZ2271100), and the National Natural Science Foundation of China (22208208).
Abstract: To ensure the safe operation of batteries, accurately obtaining key internal state parameters is essential. However, traditional parameter measurement methods require either opening the battery or long-term measurements, which is impractical. Fixed values are therefore commonly used for these parameters in electrochemical models, which imposes significant limitations. To overcome these limitations, this paper proposes a deep neural network (DNN) based data-driven evaluation method to determine model parameters. By coupling an improved one-dimensional isothermal pseudo-two-dimensional (P2D) model with a DNN, this study identifies concentration-dependent parameters through detailed discharge curve analysis. The results show that the data-driven method can effectively obtain the trend of concentration-dependent parameters from the charge and discharge curves, and that the method can be extended to different battery systems at different discharge rates and in aging applications. This work is expected to provide new insights into parameter selection for data-driven battery prediction and monitoring models.
Funding: Supported by the U.S. Department of Energy’s Office of Energy Efficiency and Renewable Energy (EERE) under the Solar Energy Technologies Office Award Number 38456.
Abstract: With the continual deployment of power-electronics-interfaced renewable energy resources, increasing privacy concerns due to deregulation of electricity markets, and the diversification of demand-side activities, traditional knowledge-based power system dynamic modeling methods face unprecedented challenges. Data-driven modeling has been increasingly studied in recent years because of its lesser need for prior knowledge, higher capability of handling large-scale systems, and better adaptability to variations in system operating conditions. This paper discusses the motivations for and the generalized process of data-driven modeling, and provides a comprehensive overview of various state-of-the-art techniques and applications. It also compares the advantages and disadvantages of these methods and provides insight into outstanding challenges and possible research directions for the future.
Funding: The Six Talent Peaks Project in Jiangsu Province, China (Grant No. JXQC-002).
Abstract: Modeling the dynamics of projectile systems with sufficient accuracy is very difficult because of the high-dimensional state space and various perturbations. With the recent rapid development of data science and scientific measurement tools, numerous data-driven methods have been devoted to discovering governing laws from data. In this work, a data-driven method based on the Kramers–Moyal formulas is employed to model the projectile. More specifically, the four-dimensional projectile system is assumed to follow an Itô stochastic differential equation. The least squares method and sparse learning are then applied to identify the drift coefficient and diffusion matrix from sample path data, and the identified terms agree well with the real system. The effectiveness of the data-driven method demonstrates that it can become a powerful tool for extracting governing equations and predicting complex dynamical behaviors of the projectile.
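The idea of identifying drift and diffusion from sample paths can be sketched on a much simpler system than the four-dimensional projectile. The example below is only an illustration under stated assumptions: it uses a one-dimensional Ornstein-Uhlenbeck process, a single linear basis function for the drift, a constant diffusion, and ordinary least squares without the sparse-learning step; none of the function or parameter names come from the paper.

```python
import math
import random

def simulate_ou(theta, sigma, dt, n_steps, x0=0.0, seed=0):
    """Euler-Maruyama path of dX = -theta*X dt + sigma dW (toy system, assumed)."""
    rng = random.Random(seed)
    xs = [x0]
    sqrt_dt = math.sqrt(dt)
    for _ in range(n_steps):
        x = xs[-1]
        xs.append(x - theta * x * dt + sigma * sqrt_dt * rng.gauss(0.0, 1.0))
    return xs

def estimate_drift_and_diffusion(xs, dt):
    """First two Kramers-Moyal coefficients, assuming drift a*x and constant diffusion."""
    increments = [b - a for a, b in zip(xs, xs[1:])]
    states = xs[:-1]
    # First moment: E[dX | X = x] / dt is approximated by a*x; fit a by least squares.
    sxx = sum(x * x for x in states)
    sxy = sum(x * d / dt for x, d in zip(states, increments))
    drift_slope = sxy / sxx
    # Second moment: E[dX^2] / dt approximates sigma^2 for small dt.
    diffusion = sum(d * d for d in increments) / (len(increments) * dt)
    return drift_slope, diffusion

THETA, SIGMA, DT = 1.0, 0.5, 0.01
path = simulate_ou(THETA, SIGMA, DT, n_steps=100_000)
drift_slope, diffusion = estimate_drift_and_diffusion(path, DT)
print(f"drift slope = {drift_slope:.3f} (true {-THETA}), "
      f"diffusion = {diffusion:.4f} (true {SIGMA ** 2})")
```

In a higher-dimensional setting, the single basis function would be replaced by a library of candidate terms, and a sparsity-promoting regression would select the active ones.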
Abstract: This paper aims to develop Machine Learning algorithms to classify electronic articles related to COVID-19 by means of information retrieval and topic modelling. The methodology of this study is divided into three phases: the Text Classification Approach (TCA), the Proposed Algorithms Interpretation (PAI), and finally, the Information Retrieval Approach (IRA). The TCA covers the text preprocessing pipeline that produces a clean corpus. The Global Vectors for Word Representation (GloVe) pre-trained model, FastText, Term Frequency-Inverse Document Frequency (TF-IDF), and Bag-of-Words (BOW) are examined in this research for feature extraction. The PAI presents the Bidirectional Long Short-Term Memory (Bi-LSTM) and Convolutional Neural Network (CNN) used to classify COVID-19 news. The IRA explains the mathematical interpretation of Latent Dirichlet Allocation (LDA), used for topic modelling in Information Retrieval (IR). In this study, 99% accuracy was obtained by performing K-fold cross-validation on Bi-LSTM with GloVe. A comparative analysis between Deep Learning and Machine Learning, based on feature extraction and computational complexity, has been performed. Furthermore, some text analyses and the most influential aspects of each document have been explored. We also used Bidirectional Encoder Representations from Transformers (BERT) as a Deep Learning mechanism in our model training, but the results were not satisfactory. Nevertheless, the proposed system can be adapted to real-time classification of COVID-19 news.
Funding: Funded by the National Natural Science Foundation of China (52004238) and the China Postdoctoral Science Foundation (2019M663561).
Abstract: Increasing the production and utilization of shale gas is of great significance for building a clean and low-carbon energy system. A sharp decline in gas production has been widely observed in shale gas reservoirs. Forecasting shale gas production remains challenging because of complex fracture networks, dynamic fracture properties, frac hits, complicated multiphase flow, and multi-scale flow, as well as data quality and uncertainty. This work develops an integrated framework for evaluating shale gas well production based on data-driven models. Firstly, a comprehensive dominant-factor system is established, including geological, drilling, fracturing, and production factors. Data processing and visualization are required to ensure data quality and determine the final data set. A shale gas production evaluation model is developed to evaluate shale gas production levels. Finally, the random forest algorithm is used to forecast shale gas production. The prediction accuracy of the shale gas production level is higher than 95% for the shale gas reservoirs in China that were studied. Forty-one wells are randomly selected to predict cumulative gas production using the optimal regression model. The proposed shale gas production evaluation framework avoids the many assumptions of analytical or semi-analytical models as well as the large computational cost and poor generalization of numerical modelling.
Funding: Supported by the National Natural Science Foundation of China (No. 11972139).
Abstract: Massive data spanning multiple temporal and spatial scales, from experiments, flow field measurements, and high-fidelity numerical simulations, have greatly promoted the rapid development of fluid mechanics. Machine Learning (ML) provides a wealth of analysis methods to extract potential information from a large amount of data for in-depth understanding of the underlying flow mechanism or for further applications. Furthermore, machine learning algorithms can enhance flow information and automatically perform tasks that involve active flow control and optimization. This article provides an overview of the history, current development, and promising prospects of machine learning in the field of fluid mechanics. In addition, to facilitate understanding, this article outlines the basic principles of machine learning methods and their applications in engineering practice, turbulence models, flow field representation problems, and active flow control. In short, machine learning provides a powerful and more intelligent data processing architecture, and may greatly enrich the existing research methods and industrial applications of fluid mechanics.
Funding: The Science and Technology Major Project of China (No. 2019ZX04020001-004, 2017ZX04007001).
Abstract: In the manufacturing of thin-wall components for the aerospace industry, the Remaining Bottom Thickness Error (RBTE) of thin-wall pocket components (e.g., rocket shells) is as important as the side wall contour error but is overlooked in current research. If the RBTE is reduced by 30%, the weight of the entire component can be reduced by up to tens of kilograms, while the dynamic balance performance of the large component is improved. Current RBTE control requires off-process measurement of a limited number of discrete points on the component bottom to provide the reference value for compensation. This leads to incomplete control of the remaining bottom thickness and redundant measurement in manufacturing. In this paper, a data-driven physics based model framework is proposed and developed for the real-time prediction of critical quality for large components, which enables accurate prediction and compensation of the RBTE value for thin-wall components. The physics based model considers the primary root cause of RBTE formation, namely the variation in Axial Material Removal Thickness (AMRT) induced by tool deflection and clamping stiffness. To account for the dynamic and inherent coupling of the complicated manufacturing system, multi-feature fusion and machine learning algorithms, i.e., kernel Principal Component Analysis (kPCA) and kernel Support Vector Regression (kSVR), are incorporated into the physics based model. The proposed data-driven physics based model therefore combines both the process mechanism and the system disturbance to achieve better prediction accuracy. A final verification experiment validates the effectiveness of the proposed method for dimensional accuracy prediction in pocket milling; the prediction accuracy of AMRT reaches 0.014 mm and 0.019 mm for straight and corner milling, respectively.
Funding: Sponsored by the National Natural Science Foundation of China (61290323, 61333007, 61473064); the Fundamental Research Funds for Central Universities of China (N130108001); the National High Technology Research and Development Program of China (2015AA043802); and the General Project on Scientific Research for Education Department of Liaoning Province of China (L20150186).
Abstract: Molten iron temperature and Si, P, and S contents are the most essential molten iron quality (MIQ) indices in blast furnace (BF) ironmaking and require strict monitoring throughout ironmaking production. However, these MIQ parameters are difficult to measure directly online, and off-line analysis through laboratory sampling involves a large time delay. To address this practical challenge, a data-driven modeling method was presented for the prediction of MIQ using improved multivariable incremental random vector functional-link networks (M-I-RVFLNs). Compared with conventional random vector functional-link networks (RVFLNs) and online sequential RVFLNs, the M-I-RVFLNs solve the problem of deciding the optimal number of hidden nodes and overcome overfitting. Moreover, the proposed M-I-RVFLNs model has shown potential for multivariable prediction of the MIQ and improves the terminal condition for the multiple-input multiple-output (MIMO) dynamic system, which makes it suitable for the BF ironmaking process in practice. Finally, industrial experiments and comparative studies have been conducted on BF No. 2 at Liuzhou Iron and Steel Group Co. Ltd. of China using the proposed method, and the results demonstrate that the established model produces better estimation accuracy than other MIQ modeling methods.
Funding: Financially supported by the National Key Research and Development Program of China (2022YFB3706800, 2020YFB1710100) and the National Natural Science Foundation of China (51821001, 52090042, 52074183).
Abstract: The complexity of the sand-casting process, combined with the interactions between process parameters, makes it difficult to control casting quality, resulting in a high scrap rate. A strategy based on a data-driven model was proposed to reduce casting defects and improve production efficiency; it comprises a random forest (RF) classification model, feature importance analysis, and process parameter optimization with Monte Carlo simulation. The collected data, which cover four types of defects and the corresponding process parameters, were used to construct the RF model. Classification results show a recall rate above 90% for all categories. The Gini index was used to assess the importance of the process parameters in the formation of the various defects in the RF model. Finally, the classification model was applied to different production conditions for quality prediction. In the case of process parameter optimization for gas porosity defects, the model serves as the experimental process in the Monte Carlo method to estimate a better temperature distribution. When applied in the factory, the prediction model greatly improved the efficiency of defect detection. Results show that the scrap rate decreased from 10.16% to 6.68%.
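The role of a trained classifier as the "experiment" inside a Monte Carlo search can be sketched as follows. This is a hedged illustration only: `predicted_defect_probability` is a hypothetical stand-in for the trained RF model, and the two temperature parameters, their bounds, and the bowl-shaped response are invented for the example rather than taken from the paper.

```python
import random

def predicted_defect_probability(pouring_temp_c, mold_temp_c):
    """Hypothetical stand-in for a trained classifier's gas-porosity probability."""
    # Toy bowl-shaped response with its minimum at (1400, 150); not fitted to any data.
    score = ((pouring_temp_c - 1400.0) / 50.0) ** 2 + ((mold_temp_c - 150.0) / 40.0) ** 2
    return min(1.0, 0.05 + 0.2 * score)

def monte_carlo_search(model, bounds, n_samples=5000, seed=0):
    """Sample parameters uniformly within bounds and keep the lowest predicted risk."""
    rng = random.Random(seed)
    best_params, best_prob = None, float("inf")
    for _ in range(n_samples):
        params = tuple(rng.uniform(lo, hi) for lo, hi in bounds)
        prob = model(*params)  # the model plays the role of the physical experiment
        if prob < best_prob:
            best_params, best_prob = params, prob
    return best_params, best_prob

# Assumed search ranges in degrees Celsius (illustrative values only).
bounds = [(1300.0, 1500.0), (50.0, 250.0)]
best_params, best_prob = monte_carlo_search(predicted_defect_probability, bounds)
print(f"best parameters = {best_params}, predicted defect probability = {best_prob:.3f}")
```

Replacing the stand-in with a fitted classifier's probability output would leave the search loop unchanged, which is the main appeal of using the model as a cheap surrogate for plant trials.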