Funding: Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (RS-2020-NR049579).
Abstract: High-dimensional data causes difficulties in machine learning because of high time consumption and large memory requirements. In a multi-label environment in particular, complexity grows with the number of labels. Moreover, an optimization problem that fully considers all dependencies between features and labels is difficult to solve. In this study, we propose a novel regression-based multi-label feature selection method that integrates mutual information to better exploit the underlying data structure. By incorporating mutual information into the regression formulation, the model captures not only linear relationships but also complex non-linear dependencies. The proposed objective function simultaneously considers three types of relationships: (1) feature redundancy, (2) feature-label relevance, and (3) inter-label dependency. All three quantities are computed using mutual information. These relationships are key factors in multi-label feature selection, and our method expresses them within a unified formulation, enabling efficient optimization while accounting for all of them at once. To solve the proposed optimization problem efficiently under non-negativity constraints, we develop a gradient-based optimization algorithm with fast convergence. Experimental results on seven multi-label datasets show that the proposed method outperforms existing multi-label feature selection techniques.
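The three mutual-information quantities named in the abstract can be illustrated with a small sketch. This is not the authors' objective function or optimizer, which the abstract does not spell out; it only shows, assuming discrete-valued features and labels, how empirical mutual information yields a feature-label relevance matrix, a feature-feature redundancy matrix, and a label-label dependency matrix. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    count_x = Counter(x)
    count_y = Counter(y)
    count_xy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in count_xy.items():
        # p(a, b) * log( p(a, b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (count_x[a] * count_y[b]))
    return mi


def pairwise_mi(cols_a, cols_b):
    """Matrix of mutual information between every column of cols_a and cols_b."""
    return [[mutual_information(a, b) for b in cols_b] for a in cols_a]


# Hypothetical toy data: 3 binary features and 2 binary labels over 8 samples.
features = [
    [0, 0, 1, 1, 0, 0, 1, 1],  # f0
    [0, 0, 1, 1, 0, 0, 1, 1],  # f1, an exact copy of f0 (fully redundant)
    [0, 1, 0, 1, 0, 1, 0, 1],  # f2, independent of f0
]
labels = [
    [0, 0, 1, 1, 0, 0, 1, 1],  # l0, determined by f0
    [0, 1, 0, 1, 0, 1, 0, 1],  # l1, determined by f2
]

relevance = pairwise_mi(features, labels)        # feature-label relevance
redundancy = pairwise_mi(features, features)     # feature-feature redundancy
label_dependency = pairwise_mi(labels, labels)   # label-label dependency
```

In this toy data, f0 and f1 are equally relevant to l0 but fully redundant with each other, which is the kind of trade-off a selection objective built on these matrices would have to resolve.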
Funding: Project 2004CB217601, supported by the National Key Basic Research Development Plan (973) of China.
Abstract: Oil-coal slurry prepared in direct coal liquefaction is a dispersed solid-liquid suspension system. In this paper, factors that affect the viscosity of oil-coal slurry, such as solvent properties, solid concentration, and temperature, were studied. The viscosity of the coal slurry was measured with a rotary viscometer, and its rheological properties were investigated. The viscosity and rheological curves were plotted and fitted by regression. The results show that the coal slurry exhibits pseudoplastic and thixotropic behavior. The rheological type of the coal slurry was determined and its rheological equations were derived. The oil-coal slurry changes from a Newtonian fluid to a non-Newtonian fluid as the solid concentration increases.
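The abstract reports pseudoplastic (shear-thinning) behavior but does not state which rheological equation was fitted. As an illustration only, the sketch below assumes the Ostwald-de Waele power-law model, shear stress = K * (shear rate)^n with n < 1 for a pseudoplastic fluid, and fits K and n by linear least squares on the logarithms. The data are synthetic and the function names are hypothetical; thixotropy (time dependence) is not modeled.

```python
import math


def fit_power_law(shear_rates, shear_stresses):
    """Fit tau = K * gamma_dot**n by least squares on log-transformed data."""
    xs = [math.log(g) for g in shear_rates]
    ys = [math.log(t) for t in shear_stresses]
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    n = sxy / sxx                        # flow behavior index (slope)
    K = math.exp(mean_y - n * mean_x)    # consistency index (from intercept)
    return K, n


def apparent_viscosity(K, n, shear_rate):
    """Apparent viscosity tau / gamma_dot for the power-law model."""
    return K * shear_rate ** (n - 1.0)


# Synthetic measurements generated from K = 2.0, n = 0.6 (assumed values).
rates = [1.0, 10.0, 100.0, 1000.0]
stresses = [2.0 * g ** 0.6 for g in rates]

K_fit, n_fit = fit_power_law(rates, stresses)
print(f"K = {K_fit:.3f}, n = {n_fit:.3f}")
```

A fitted n below 1 means the apparent viscosity falls as the shear rate rises, while n equal to 1 reduces the model to a Newtonian fluid with constant viscosity K.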
Funding: Supported by the National Natural Science Foundation of China (60774079), the National High Technology Research and Development Program of China (2006AA04Z184), and the Sinopec Science & Technology Development Project of China (205073).
Abstract: Reducing acetic acid consumption in the purified terephthalic acid (PTA) solvent system has become a widely shared concern. According to the technical features of the process, the electrical conductivity is directly proportional to the acetic acid content. A general regression neural network (GRNN) is used to model the electrical conductivity on the basis of mechanism analysis, and a particle swarm optimization (PSO) algorithm with improved inertia weight and population diversity is then proposed to regulate the operating conditions. From this, a method for decreasing the acid loss is derived and applied to the PTA solvent system in a chemical plant. Case studies show that both the modeling and the optimization achieve higher precision. The results also provide the optimal operating conditions, which decrease cost and improve profit.
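The role of the inertia weight in PSO can be sketched as follows. The abstract does not describe the authors' specific improvements to the inertia weight or to population diversity, so this example uses a common textbook variant instead: an inertia weight that decreases linearly from `w_start` to `w_end`. The objective is a hypothetical quadratic standing in for the GRNN conductivity model, and all parameter values are assumptions.

```python
import random


def pso_minimize(objective, bounds, n_particles=30, n_iters=200,
                 w_start=0.9, w_end=0.4, c1=1.5, c2=1.5, seed=0):
    """Basic PSO with a linearly decreasing inertia weight (assumed variant)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for it in range(n_iters):
        # Large inertia early (exploration), small inertia late (exploitation).
        w = w_start - (w_start - w_end) * it / max(1, n_iters - 1)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Clamp the new position to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_objective(p):
    """Hypothetical stand-in for a process model; minimum at (1, -2)."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


best_pos, best_val = pso_minimize(toy_objective, [(-5.0, 5.0), (-5.0, 5.0)])
print(best_pos, best_val)
```

In a setting like the one described, the objective would instead evaluate a trained surrogate model at candidate operating conditions, with `bounds` encoding the admissible operating range.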
Funding: Science and Technology Program for Guangdong Province (2005B32601007); Project of Guangdong Meteorological Bureau (2008B05); Natural Science Foundation of China "Project 973" (2010CB950304); Project of Meteorological Science and Technology of Guangdong Province (200902); Project for Science and Technology Planning in Guangdong (2012A061400012); Science Project for Guangdong Meteorological Bureau (2013B08); Project for Guangdong Provincial Bureau of Science and Technology (2012A030200006); Project for Meteorological Center of the South China Region, China Meteorological Administration (GRMC2012M02); Science and Technology Planning Project for Guangdong Province (2011A032100006, 2012A061400012).
Abstract: The Climate Forecast System (CFS) datasets provided by the National Centers for Environmental Prediction (NCEP), which cover 1981 to 2008, can be used to forecast atmospheric circulation nine months ahead. Compared with the NCEP datasets, the CFS datasets successfully simulate many major features of the Asian monsoon circulation systems and exhibit reasonably high skill in simulating and predicting ENSO events. Based on the CFS forecasting results, a downscaling method using Optimal Subset Regression (OSR) and a multivariate mean generational function model are used to forecast seasonal precipitation in Guangdong. After statistical tests, sea level pressure, wind, and the geopotential height field are selected as predictors. Although the results are unstable in some individual seasons, both the OSR and the multivariate mean generational function model provide good forecasts, with operational test scores above sixty points. In contrast to the NCEP dataset, the CFS datasets are available and updated in real time. The downscaling forecast method based on the CFS datasets can predict seasonal precipitation in Guangdong three seasons ahead, enriching traditional statistical methods. However, its forecasting stability needs to be improved.
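The idea behind optimal subset regression is to fit a regression for every candidate subset of predictors and keep the subset that scores best under a selection criterion. The abstract does not state which criterion the authors used, so the sketch below substitutes adjusted R², a generic choice, and runs on synthetic data. Predictor names `a`, `b`, `c` and all function names are hypothetical.

```python
from itertools import combinations


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def ols_rss(columns, y):
    """Least-squares fit of y on an intercept plus columns; returns (beta, RSS)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
           for a in range(p)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    beta = solve_linear(XtX, Xty)
    rss = sum((y[i] - sum(bj * xj for bj, xj in zip(beta, X[i]))) ** 2
              for i in range(n))
    return beta, rss


def best_subset(predictors, y):
    """Exhaustive search over predictor subsets, scored by adjusted R^2."""
    n = len(y)
    mean_y = sum(y) / n
    tss = sum((v - mean_y) ** 2 for v in y)
    best_names, best_score = None, float("-inf")
    names = sorted(predictors)
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            _, rss = ols_rss([predictors[s] for s in subset], y)
            adj_r2 = 1.0 - (rss / (n - k - 1)) / (tss / (n - 1))
            if adj_r2 > best_score:
                best_names, best_score = subset, adj_r2
    return best_names, best_score


# Synthetic example: y depends on a and b only; c is an irrelevant predictor.
predictors = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "b": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
    "c": [1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0],
}
noise = [0.1, 0.0, -0.1, 0.0, 0.1, 0.0, -0.1, 0.0]
y = [2.0 * a - 3.0 * b + e
     for a, b, e in zip(predictors["a"], predictors["b"], noise)]

chosen, score = best_subset(predictors, y)
print(chosen, round(score, 4))
```

Exhaustive search evaluates 2^p - 1 subsets, so it is practical only for a modest number of candidate predictors, which fits the small predictor set mentioned in the abstract.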
Funding: Platform for Meteorological Prediction of Power Load in Guangdong Province.
Abstract: The variability of the daily power load in Guangdong from 2002 to 2004 and its connection to meteorological variables are analyzed with wavelet analysis and correlation analysis. Prediction equations are established using optimal subset regression. The results show a highly significant linear increasing trend and a clear seasonal cycle. The power load exhibits significant quasi-weekly (5–7 days), quasi-biweekly (10–20 days), and intraseasonal (30–60 days) oscillations. These oscillations are caused by atmospheric low-frequency oscillation and public holidays. The daily power load drops noticeably on Sundays and, in particular, takes a funnel shape during Chinese New Year. During the long Labor Day and National Day vacations, the minimum occurs on the first and second days, and the load gradually returns to its normal level after the third day. As in other areas of China, the Guangdong power load is most sensitive to temperature, which is the main influencing factor. The power load is also related to other meteorological elements to some extent in different seasons. The summer maximum, the minimum during Chinese New Year, and the variation during Labor Day and National Day are well fitted and predicted by the equation established with optimal subset regression, which accounts for the effect of workdays and holidays.
Funding: Natural Science Foundation of China (No. 70472016).
Abstract: We studied the information search behaviors of Chinese consumers of miniature automobiles. First, we identified the main sources from which consumers acquire or seek information about miniature automobiles and discussed the extent of their information search. Then, using logistic regression and optimal scaling regression, we studied how the characteristics of these consumers influence the extent of information search and Internet usage. The results indicate that consumers often use four sources to obtain information about miniature automobiles. The dominant source is friends and family, followed by dealers, newspapers, and TV. Age, occupation, education, and income significantly affect the extent of information search, but gender and city of residence do not. Age, city of residence, occupation, education, and income significantly influence Internet usage. Gender has no significant influence on whether a consumer uses the Internet to search for information.
Funding: Supported by the Key Science and Technology Project of Wuhan (No. 20106062327) and the Self-determined and Innovative Research Funds of WUT (No. 2010-YB-20).
Abstract: Object mathematical models from traditional control theory cannot solve the problem of forecasting the chemical components of sintered ore. To control the complicated chemical components in the manufacturing process of sintered ore, this paper studies key techniques for intelligent forecasting of those components. A new intelligent forecasting system based on SVM is proposed and implemented. The results show that the prediction accuracy for every component exceeds 90%. The system has been applied in related companies for more than one year with satisfactory results.
Abstract: To enhance the accuracy of short-term photovoltaic power output prediction and to address issues such as the insufficient spatial resolution of meteorological forecast data and the weak generalization ability of models, this paper proposes a prediction method that integrates spatially downscaled meteorological data with a convolutional neural network (CNN)-iTransformer-long short-term memory (LSTM) model. First, the rime-optimized random forest regression algorithm (RIME-RF) is employed to perform spatial downscaling on numerical weather prediction (NWP) data, thereby improving its local applicability. Second, a CNN-iTransformer-LSTM hybrid prediction model is constructed. This model uses a CNN as a spatial feature extractor to capture local patterns in meteorological data, employs an iTransformer to model the global dependencies among multiple variables, and leverages an LSTM to enhance the learning of short-term temporal dynamics, thereby achieving efficient collaborative mining of multi-scale features. Finally, experiments are conducted using actual data from a photovoltaic power station in Hebei, China, across various seasons and weather conditions. The results show that the proposed model outperforms the comparison models in terms of root mean square error (RMSE), mean absolute error (MAE), and R², maintaining high prediction accuracy and stability even under complex weather conditions such as overcast and rainy days. The downscaling process further enhances prediction performance, verifying the effectiveness and practicality of the method.
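For reference, the three evaluation metrics named in the abstract can be computed as in the sketch below. The definitions are the standard ones; the sample values are made up for illustration and are not results from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - RSS / TSS."""
    mean_t = sum(y_true) / len(y_true)
    rss = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    tss = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - rss / tss


# Made-up observed and predicted power values (arbitrary units).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]

print(rmse(observed, predicted))      # 1.0
print(mae(observed, predicted))       # 0.5
print(r2_score(observed, predicted))  # approximately 0.2
```

RMSE penalizes the single large error more heavily than MAE does, which is why the two differ here even though only one prediction is off.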