Assessing the stability of slopes is one of the crucial tasks of geotechnical engineering for assessing and managing risks related to natural hazards, directly affecting safety and sustainable development. This study primarily focuses on developing robust and practical hybrid models to predict the slope stability status of the circular failure mode. For this purpose, three robust models were developed using a database including 627 case histories of slope stability status. The models were developed using the random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGB) techniques, employing a 5-fold cross-validation approach. To enhance the performance of the models, this study employs a Bayesian optimizer (BO) to fine-tune their hyperparameters. The results indicate that the performance order of the three developed models is RF-BO > SVM-BO > XGB-BO. Furthermore, comparing the developed models with previous models, it was found that the RF-BO model can effectively determine the slope stability status with outstanding performance. This implies that the RF-BO model could serve as a dependable tool for project managers, assisting in the evaluation of slope stability during both the design and operational phases of projects, despite the inherent challenges in this domain. The results regarding the importance of influencing parameters indicate that cohesion, friction angle, and slope height exert the most significant impact on slope stability status. This suggests that concentrating on these parameters and employing the RF-BO model can effectively mitigate the severity of geohazards in the short term and contribute to the attainment of long-term sustainable development objectives.
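The 5-fold cross-validation protocol used above can be sketched as follows (the slope database and the RF-BO pipeline are not reproduced; the data and the majority-class stand-in classifier are hypothetical):

```python
import numpy as np

def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k roughly equal folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), k)

def cross_val_accuracy(fit, predict, X, y, k=5):
    """Average hold-out accuracy over k train/test splits."""
    folds = k_fold_indices(len(X), k)
    scores = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train_idx], y[train_idx])
        scores.append(float(np.mean(predict(model, X[test_idx]) == y[test_idx])))
    return float(np.mean(scores))

# Stand-in "classifier": always predict the majority class of the training fold.
fit = lambda X, y: int(round(float(np.mean(y))))
predict = lambda model, X: np.full(len(X), model)

X = np.arange(100).reshape(-1, 1)
y = (np.arange(100) < 70).astype(int)   # 70/30 class imbalance
acc = cross_val_accuracy(fit, predict, X, y, k=5)
```

Each sample appears in exactly one test fold, so the averaged score uses every case history once for evaluation.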
Edge computing (EC) combined with the Internet of Things (IoT) provides a scalable and efficient solution for smart homes. The rapid proliferation of IoT devices poses real-time data processing and security challenges. EC has become a transformative paradigm for addressing these challenges, particularly in intrusion detection and anomaly mitigation. The widespread connectivity of IoT edge networks has exposed them to various security threats, necessitating robust strategies to detect malicious activities. This research presents a privacy-preserving federated anomaly detection framework combined with Bayesian game theory (BGT) and double deep Q-learning (DDQL). The proposed framework integrates BGT to model attacker-defender interactions for dynamic adaptation to threat levels and resource availability, capturing the strategic interplay between attackers and defenders under uncertainty. DDQL is incorporated to optimize decision-making and to learn optimal defense policies at the edge. Federated learning (FL) enables decentralized anomaly detection without sharing sensitive data between devices. Data were collected from various sensors in a real-time EC-IoT network to identify irregularities caused by different attacks. The results reveal that the proposed model achieves a high detection accuracy of up to 98% while maintaining low resource consumption. This study demonstrates the synergy between game theory and FL in strengthening anomaly detection in EC-IoT networks.
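The double Q-learning rule at the core of DDQL decouples action selection from action evaluation to curb value overestimation. A tabular sketch on a hypothetical three-state toy problem (the paper's deep networks and EC-IoT environment are not modeled):

```python
import numpy as np

rng = np.random.default_rng(0)

def double_q_update(QA, QB, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One double Q-learning step: with probability 0.5, update QA using
    QB's evaluation of QA's greedy action (and symmetrically for QB)."""
    if rng.random() < 0.5:
        a_star = int(np.argmax(QA[s_next]))
        QA[s, a] += alpha * (r + gamma * QB[s_next, a_star] - QA[s, a])
    else:
        b_star = int(np.argmax(QB[s_next]))
        QB[s, a] += alpha * (r + gamma * QA[s_next, b_star] - QB[s, a])

# Toy MDP: from state 0, action 1 reaches terminal state 1 with reward 1,
# action 0 reaches terminal state 2 with reward 0.
QA = np.zeros((3, 2))
QB = np.zeros((3, 2))
for _ in range(4000):
    a = int(rng.integers(2))
    r, s_next = (1.0, 1) if a == 1 else (0.0, 2)
    double_q_update(QA, QB, 0, a, r, s_next)
```

The averaged tables converge to the correct action values: the rewarded action approaches value 1, the other stays at 0.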
PM_(2.5) constitutes a complex and diverse mixture that significantly impacts the environment, human health, and climate change. However, existing observation and numerical simulation techniques have limitations, such as a lack of data, high acquisition costs, and multiple uncertainties. These limitations hinder the acquisition of comprehensive information on PM_(2.5) chemical composition and the effective implementation of refined air pollution protection and control strategies. In this study, we developed an optimal deep learning model to acquire hourly mass concentrations of key PM_(2.5) chemical components without complex chemical analysis. The model was trained using a randomly partitioned multivariate dataset arranged in chronological order, including atmospheric state indicators, which previous studies did not consider. Our results showed that the correlation coefficients of key chemical components were no less than 0.96, and the root mean square errors ranged from 0.20 to 2.11 μg/m^(3) for the entire process (training and testing combined). The model accurately captured the temporal characteristics of key chemical components, outperforming typical machine-learning models, previous studies, and global reanalysis datasets (such as the Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2) and the Copernicus Atmosphere Monitoring Service ReAnalysis (CAMSRA)). We also quantified feature importance using a random forest model, which showed that PM_(2.5), PM_(1), visibility, and temperature were the most influential variables for key chemical components. In conclusion, this study presents a practical approach to accurately obtaining chemical composition information that can contribute to filling missing data and improving air pollution monitoring and source identification. This approach has the potential to enhance air pollution control strategies and promote public health and environmental sustainability.
Bifunctional oxide-zeolite-based composites (OXZEO) have emerged as promising materials for the direct conversion of syngas to olefins. However, experimental screening and optimization of reaction parameters remain resource-intensive. To address this challenge, we implemented a three-stage framework integrating machine learning, Bayesian optimization, and experimental validation, utilizing a carefully curated dataset from the literature. Our ensemble-tree model (R^(2) > 0.87) identified Zn-Zr and Cu-Mg binary mixed oxides as the most effective OXZEO systems, with their light-olefin space-time yields experimentally confirmed after physical mixing with HSAPO-34. Density functional theory calculations further elucidated the activity trends between the Zn-Zr and Cu-Mg mixed oxides. Among 16 catalyst and reaction-condition descriptors, the oxide/zeolite ratio, reaction temperature, and pressure emerged as the most significant factors. This interpretable, data-driven framework offers a versatile approach that can be applied to other catalytic processes, providing a powerful tool for experiment design and optimization in catalysis.
Extreme-mass-ratio inspiral (EMRI) signals pose significant challenges to gravitational wave (GW) data analysis, mainly owing to their highly complex waveforms and high-dimensional parameter space. Given their extended timescales of months to years and low signal-to-noise ratios, detecting and analyzing EMRIs with confidence generally relies on long-term observations. Besides the length of data, parameter estimation is particularly challenging due to non-local parameter degeneracies arising from multiple local maxima, as well as flat regions and ridges inherent in the likelihood function. These factors lead to exceptionally high time complexity for parameter analysis based on traditional matched filtering and random sampling methods. To address these challenges, the present study explores a machine learning approach to Bayesian posterior estimation of EMRI signals via flow matching posterior estimation (FMPE), leveraging the recently developed flow matching technique based on ordinary differential equation neural networks. To our knowledge, this is also the first instance of applying continuous normalizing flows to EMRI analysis. Our approach demonstrates an increase in computational efficiency by several orders of magnitude compared to traditional Markov chain Monte Carlo (MCMC) methods, while preserving the unbiasedness of results. However, we note that the posterior distributions generated by FMPE may exhibit broader uncertainty ranges than those obtained through full Bayesian sampling, requiring subsequent refinement via methods such as MCMC. Notably, when searching from large priors, our model rapidly approaches the true values while MCMC struggles to converge to the global maximum. Our findings highlight that machine learning has the potential to efficiently handle the vast EMRI parameter space of up to seventeen dimensions, offering new perspectives for advancing space-based GW detection and GW astronomy.
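The flow-matching objective underlying FMPE regresses a vector field onto the conditional velocity of a path between a base sample and a data sample. A one-dimensional Monte Carlo sketch with hypothetical Gaussians (no EMRI waveforms or neural networks involved):

```python
import numpy as np

rng = np.random.default_rng(0)

def flow_matching_loss(vfield, x0, x1, t):
    """Conditional flow-matching loss for the linear interpolation path
    x_t = (1 - t) * x0 + t * x1, whose conditional velocity is x1 - x0."""
    xt = (1.0 - t) * x0 + t * x1
    target = x1 - x0
    return float(np.mean((vfield(xt, t) - target) ** 2))

n = 100_000
x0 = rng.standard_normal(n)               # base distribution N(0, 1)
x1 = 2.0 + 0.5 * rng.standard_normal(n)   # toy "posterior" samples N(2, 0.25)
t = rng.uniform(0.0, 1.0, n)

# Zero field: loss is E[(x1 - x0)^2] = 2^2 + 1 + 0.25 = 5.25 in expectation.
loss_zero = flow_matching_loss(lambda x, t: 0.0, x0, x1, t)
# Constant field v = E[x1 - x0] = 2: loss drops to Var(x1 - x0) = 1.25.
loss_mean = flow_matching_loss(lambda x, t: 2.0, x0, x1, t)
```

In FMPE a neural network minimizes this loss; sampling then integrates the learned field as an ordinary differential equation from base to posterior.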
The optimization of process parameters in polyolefin production can bring significant economic benefits to the factory. However, the optimization process is often restricted by small data sets, the high costs of parameter verification cycles, and the difficulty of establishing an optimization model. To address this issue, we propose a transfer learning Bayesian optimization strategy to improve the efficiency of parameter optimization while minimizing resource consumption. Specifically, we leverage Gaussian process (GP) regression models to establish an integrated model that incorporates both source and target grade production task data. We then measure the similarity weights of each model by comparing their predicted trends, and utilize these weights to accelerate the search for optimal process parameters for producing target polyolefin grades. Recognizing that measuring similarity in a global search space may not effectively capture local similarity characteristics, we further propose a transfer learning optimization method that operates within a local space (LSTL-PBO). This method employs partial data acquired through random sampling from the target task data and utilizes Bayesian optimization techniques for model establishment. By focusing on a local search space, we aim to better discern and leverage the inherent similarities between the source tasks and the target task. Additionally, we incorporate parallelism to address multiple local search spaces simultaneously; exploring different regions of the parameter space in parallel increases the chances of finding optimal process parameters. This localized approach improves the precision and effectiveness of the optimization process. The performance of our method is validated through experiments on benchmark problems, and we discuss the sensitivity of its hyperparameters. The results show that the proposed method can significantly improve the efficiency of process parameter optimization, reduce the dependence on source tasks, and enhance the method's robustness. This has great potential for optimizing processes in industrial environments.
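A toy sketch of the trend-similarity weighting idea: each source model is weighted by how often the direction of its predicted changes agrees with the target observations, and predictions are blended by these weights. The data and the agreement measure are illustrative stand-ins, not the paper's GP-based construction:

```python
import numpy as np

def trend_similarity(pred_source, y_target):
    """Weight a source model by how often its predicted direction of change
    agrees with the direction observed on the target task."""
    return float(np.mean(np.sign(np.diff(pred_source)) == np.sign(np.diff(y_target))))

def weighted_prediction(preds, weights):
    """Blend model predictions by normalized similarity weights."""
    w = np.asarray(weights, dtype=float)
    return (w / w.sum()) @ np.asarray(preds)

x = np.linspace(0.0, 1.0, 20)
y_target = np.sin(2 * np.pi * x)        # observations on the target task
pred_a = np.sin(2 * np.pi * x + 0.1)    # source model with a similar trend
pred_b = -x                             # source model with an opposite trend

wa = trend_similarity(pred_a, y_target)
wb = trend_similarity(pred_b, y_target)
combined = weighted_prediction([pred_a, pred_b], [wa, wb])
```

The similar-trend source receives the larger weight, so the blended prediction stays closer to the target behavior than the dissimilar source alone.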
The multi-source passive localization problem is of great interest in signal processing, with many applications. In this paper, a sparse representation model based on the covariance matrix is constructed for the long-range localization scenario, and a sparse Bayesian learning algorithm based on a Laplace prior on the signal covariance is developed for the basis mismatch problem caused by targets deviating from the initial point grid. An adaptive grid sparse Bayesian learning target localization (AGSBL) algorithm is proposed. The AGSBL algorithm implements covariance-based sparse signal reconstruction and grid-adaptive localization dictionary learning. Simulation results show that the AGSBL algorithm outperforms traditional compressed sensing localization algorithms for different signal-to-noise ratios and different numbers of targets in long-range scenes.
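A generic sparse Bayesian learning iteration (an ARD-style Gaussian prior rather than the paper's Laplace prior on the signal covariance; the dictionary and sparse vector below are hypothetical) illustrating the sparse reconstruction step:

```python
import numpy as np

def sbl(Phi, y, beta=100.0, n_iter=50):
    """ARD-style sparse Bayesian learning: alternate the Gaussian weight
    posterior with evidence-based updates of the per-weight precisions."""
    n, m = Phi.shape
    alpha = np.ones(m)                       # per-weight prior precisions
    for _ in range(n_iter):
        Sigma = np.linalg.inv(beta * Phi.T @ Phi + np.diag(alpha))
        mu = beta * Sigma @ Phi.T @ y        # posterior mean of the weights
        alpha = 1.0 / (mu**2 + np.diag(Sigma))
    return mu

rng = np.random.default_rng(1)
Phi = rng.standard_normal((60, 20))          # toy localization dictionary
w_true = np.zeros(20)
w_true[[3, 11]] = [2.0, -1.5]                # two active grid points
y = Phi @ w_true + 0.05 * rng.standard_normal(60)
w_hat = sbl(Phi, y)
```

Precisions of inactive grid points grow without bound, pruning them automatically; in AGSBL the grid itself would additionally adapt toward the off-grid target positions.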
This study explores the application of Bayesian analysis based on neural networks and deep learning in data visualization. The background is that, with the increasing volume and complexity of data, traditional data analysis methods can no longer meet practical needs. The research methods include building neural network and deep learning models, optimizing and improving them through Bayesian analysis, and applying them to the visualization of large-scale data sets. The results show that combining neural networks with Bayesian analysis and deep learning can effectively improve the accuracy and efficiency of data visualization, and enhance the intuitiveness and depth of data interpretation. The significance of the research is that it provides a new solution for data visualization in the big data environment and helps to further promote the development and application of data science.
The use of dynamic programming (DP) algorithms to learn Bayesian network structures is limited by their high space complexity and difficulty in learning the structure of large-scale networks. Therefore, this study proposes a DP algorithm based on node block sequence constraints. The proposed algorithm constrains the traversal of the parent graph using the M-sequence matrix, and considerably reduces time consumption and space complexity by pruning the traversal of the order graph using the node block sequence. Experimental results show that, compared with existing DP algorithms, the proposed algorithm obtains learning results more efficiently with less than 1% loss of accuracy, and can be used to learn larger-scale networks.
Hyperparameters are important for machine learning algorithms since they directly control the behavior of training algorithms and have a significant effect on the performance of machine learning models. Several tuning techniques have been developed and successfully applied in certain application domains, but they demand professional knowledge and expert experience, and sometimes must resort to brute-force search. Therefore, an efficient hyperparameter optimization algorithm that can optimize any given machine learning method would greatly improve the efficiency of machine learning. In this paper, we model the relationship between the performance of a machine learning model and its hyperparameters with Gaussian processes. In this way, hyperparameter tuning is abstracted as an optimization problem, and Bayesian optimization is used to solve it. Bayesian optimization is based on Bayes' theorem: it places a prior over the objective function and uses information from previous samples to update the posterior over the objective; a utility function then selects the next sample point expected to maximize the objective. Several experiments were conducted on standard test datasets. The results show that the proposed method can find the best hyperparameters for widely used machine learning models, such as the random forest algorithm and neural networks, and even the multi-grained cascade forest, with time cost taken into account.
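The loop described above — Gaussian process surrogate, posterior update, utility-driven sampling — can be sketched in one dimension, using expected improvement as the utility (the quadratic objective, kernel length-scale, and candidate grid are hypothetical choices):

```python
import math
import numpy as np

def rbf(A, B, ls=0.2):
    """Squared-exponential kernel between 1-D point sets."""
    return np.exp(-0.5 * (A[:, None] - B[None, :]) ** 2 / ls**2)

def gp_posterior(X, y, Xs, noise=1e-6):
    """GP posterior mean/std at test points Xs given observations (X, y)."""
    K_inv = np.linalg.inv(rbf(X, X) + noise * np.eye(len(X)))
    Ks = rbf(X, Xs)
    mu = Ks.T @ K_inv @ y
    var = np.diag(rbf(Xs, Xs) - Ks.T @ K_inv @ Ks)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, best):
    """EI utility for maximization."""
    z = (mu - best) / sigma
    cdf = 0.5 * (1.0 + np.array([math.erf(v / math.sqrt(2)) for v in z]))
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2 * math.pi)
    return (mu - best) * cdf + sigma * pdf

f = lambda x: -(x - 0.3) ** 2            # black-box objective, maximum at 0.3
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, 3)                 # initial design
y = f(X)
grid = np.linspace(0, 1, 201)

for _ in range(10):                      # BO loop: fit surrogate, sample argmax EI
    mu, sigma = gp_posterior(X, y, grid)
    x_next = grid[int(np.argmax(expected_improvement(mu, sigma, y.max())))]
    X, y = np.append(X, x_next), np.append(y, f(x_next))

x_best = float(X[np.argmax(y)])
```

In hyperparameter tuning, f would be the cross-validated score of a model as a function of one hyperparameter, evaluated only at the points the acquisition function selects.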
When the training data are insufficient, especially when only a small sample is available, domain knowledge can be incorporated into the process of learning parameters to improve the performance of Bayesian networks. In this paper, a new monotonic constraint model is proposed to represent a common type of domain knowledge, and a monotonic constraint estimation algorithm is proposed to learn parameters under this model. To demonstrate the superiority of the proposed algorithm, a series of experiments is carried out. The results show that the proposed algorithm obtains more accurate parameters than several existing algorithms while its complexity is not the highest.
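A minimal illustration of enforcing a monotonic constraint on learned parameters: raw conditional-probability estimates from sparse counts are projected onto a non-decreasing sequence by pool-adjacent-violators. This is a generic stand-in, not the paper's monotonic constraint estimation algorithm, and the counts are hypothetical:

```python
import numpy as np

def pav_nondecreasing(p, w):
    """Pool-adjacent-violators: weighted projection of the estimates p
    onto a non-decreasing sequence (weights w are sample counts)."""
    blocks = [[v, wt, 1] for v, wt in zip(p, w)]  # [value, weight, span]
    i = 0
    while i < len(blocks) - 1:
        if blocks[i][0] > blocks[i + 1][0] + 1e-12:
            v0, w0, c0 = blocks[i]
            v1, w1, c1 = blocks[i + 1]
            blocks[i:i + 2] = [[(v0 * w0 + v1 * w1) / (w0 + w1), w0 + w1, c0 + c1]]
            i = max(i - 1, 0)                     # merged block may violate upstream
        else:
            i += 1
    return np.concatenate([[v] * c for v, _, c in blocks])

# P(Y=1 | X=x) believed non-decreasing in x; 10 samples per level.
ones = np.array([1, 4, 3, 9])
n = np.array([10, 10, 10, 10])
raw = ones / n                          # [0.1, 0.4, 0.3, 0.9] violates monotonicity
smoothed = pav_nondecreasing(raw, n)    # pools the violating middle pair
```

The violating pair (0.4, 0.3) is pooled to their weighted average 0.35, so the constrained estimates respect the expert's monotonicity judgment while staying close to the data.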
It is impractical to learn the optimal structure of a big Bayesian network (BN) by exhausting the feasible structures, since the number of feasible structures is super-exponential in the number of nodes. This paper proposes an approach to layer the nodes of a BN using conditional independence testing: the parents of a node in a layer may only belong to that layer or to layers with priority over it. Once a set of nodes has been layered, the number of feasible structures over the nodes can be remarkably reduced, which makes it possible to learn optimal BN structures for larger numbers of nodes with exact algorithms. Integrating the dynamic programming (DP) algorithm with the layering approach, we propose a hybrid algorithm, layered optimal learning (LOL), to learn BN structures. Thanks to the layering approach, the complexity of the DP algorithm is reduced from O(n·2^(n-1)) to O(ρ·2^(n-1)), where ρ < n. Meanwhile, the memory requirement for storing intermediate results is reduced from O(C(n, n/2)) to O(C(k#, k#/2)), where k# < n and C(·,·) denotes the binomial coefficient. A case study on learning a standard BN with 50 nodes is conducted. The results demonstrate the superiority of the LOL algorithm, with respect to the Bayesian information criterion (BIC) score, over the hill-climbing, max-min hill-climbing, PC, and three-phase dependency analysis algorithms.
Finding reasonable structures from bulky data is one of the difficulties in modeling Bayesian networks (BNs), and overcoming it is necessary for promoting the application of BNs. This paper proposes an immune-algorithm-based method (BN-IA) for learning the BN structure with the idea of vaccination. Furthermore, methods for extracting effective vaccines from locally optimal structures and root nodes are described in detail. Finally, simulation studies are implemented with the helicopter convertor BN model and the car start BN model. The comparison results show that the proposed vaccines and BN-IA can learn the BN structure effectively and efficiently.
Ordering-based search methods have an efficiency advantage over graph-based search methods for structure learning of Bayesian networks. With the aim of further increasing the accuracy of ordering-based search, we first propose to enlarge the search space, which facilitates escaping from local optima. We present search operators with majorizations, which are easy to implement. Experiments show that the proposed algorithm obtains significantly more accurate results. To counter the decrease in efficiency caused by the larger search space, we then propose adding path priors as constraints into the swap process. We analyze the coefficient that may influence the performance of the proposed algorithm; the experiments show that the constraints greatly enhance efficiency while having little effect on accuracy. The final experiments show that, compared to other competitive methods, the proposed algorithm finds better solutions while maintaining high efficiency on both synthetic and real data sets.
Bayesian networks (BNs) have become increasingly popular in recent years due to their wide-ranging applications in modeling uncertain knowledge. An essential problem for discrete BNs is learning the conditional probability table (CPT) parameters. If training data are sparse, purely data-driven methods often fail to learn accurate parameters; expert judgments can then be introduced to overcome this challenge. Parameter constraints deduced from expert judgments keep parameter estimates consistent with domain knowledge, and Dirichlet priors contain information that helps improve learning accuracy. This paper proposes a constrained Bayesian estimation approach to learn CPTs by incorporating constraints and Dirichlet priors. First, a posterior distribution of BN parameters is developed over a restricted parameter space based on the training data and Dirichlet priors. Then, the expectation of the posterior distribution is taken as the parameter estimate. As it is difficult to directly compute this expectation for a continuous distribution with an irregular feasible domain, we apply the Monte Carlo method to approximate it. In experiments on learning standard BNs, the proposed method outperforms competing methods, suggesting that it can facilitate solving real-world problems. Additionally, a case study on the Wine data demonstrates that the proposed method achieves the highest classification accuracy.
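The Monte Carlo step described above — taking the expectation of a posterior restricted by expert constraints — can be sketched with rejection sampling from a Dirichlet posterior (the counts, prior, and constraint below are toy stand-ins; the full BN setting is not reproduced):

```python
import numpy as np

def constrained_posterior_mean(counts, prior, constraint, n_samples=20000, seed=0):
    """Monte Carlo estimate of E[theta | data, constraint]: sample the
    unconstrained Dirichlet posterior, keep only feasible samples, average."""
    rng = np.random.default_rng(seed)
    theta = rng.dirichlet(np.asarray(counts) + np.asarray(prior), size=n_samples)
    mask = np.array([constraint(t) for t in theta])
    kept = theta[mask]
    return kept.mean(axis=0), float(mask.mean())

# Toy CPT column over 3 states with sparse data and a uniform Dirichlet prior;
# expert judgment: state 0 is at least as likely as state 2.
counts = [2, 5, 3]
prior = [1, 1, 1]
mean, accept_rate = constrained_posterior_mean(counts, prior, lambda t: t[0] >= t[2])
```

Because every accepted sample satisfies the constraint, the resulting estimate does too, while still averaging over the Dirichlet posterior's remaining uncertainty.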
Since orthogonal time-frequency space (OTFS) modulation can effectively handle the problems caused by the Doppler effect in high-mobility environments, it has gradually become a promising candidate modulation scheme for the next generation of mobile communication. However, the inter-Doppler interference (IDI) caused by fractional Doppler poses great challenges to channel estimation. To avoid this problem, this paper proposes a joint time and delay-Doppler (DD) domain channel estimation algorithm based on sparse Bayesian learning (SBL). First, we derive the original channel response (OCR) from the time-domain channel impulse response (CIR), which can reflect the channel variation during one OTFS symbol. Compared with the traditional channel model, the OCR can avoid the IDI problem. After that, the dimension of the OCR is reduced by using the basis expansion model (BEM) and the relationship between the time and DD domain channel models, turning the underdetermined problem into an overdetermined one. Finally, exploiting the sparsity of the channel in the delay domain, the SBL algorithm is used to estimate the basis coefficients in the BEM without any prior information about the channel. The simulation results show the effectiveness and superiority of the proposed channel estimation algorithm.
In the post-genomic era, reconstructing gene regulatory networks from microarray gene expression data is very important for understanding the underlying biological system, and it has been a challenging task in bioinformatics. The Bayesian network model has been used to reconstruct gene regulatory networks for its advantages, but how to determine the network structure and parameters remains an open problem. This paper proposes a two-stage structure learning algorithm that integrates an immune evolution algorithm to build a Bayesian network. The new algorithm is evaluated on both simulated and yeast cell cycle data. The experimental results indicate that the proposed algorithm can recover many of the regulatory relationships known from the literature and predict other, previously unknown relationships with high validity and accuracy.
A new method to evaluate the fitness of Bayesian networks against observed data is provided. The main advantage of this criterion is that it is suitable for both complete and incomplete cases, while the others are not; moreover, it greatly facilitates the computation. To reduce the search space, the notion of equivalence class proposed by David Chickering is adopted. Instead of using that method directly, the novel criterion, variable ordering, and equivalence classes are combined, and the proposed method avoids some problems caused by the previous one. A genetic algorithm, which allows global convergence and is lacking in most methods for searching Bayesian network structures, is then applied to search for a good model in this space. To speed up convergence, the genetic algorithm is combined with a greedy algorithm. Finally, simulation shows the validity of the proposed approach.
Structure learning of Bayesian networks is a well-researched but computationally hard task. This paper proposes an improved algorithm based on unconstrained optimization and ant colony optimization (U-ACO-B) to overcome the drawbacks of ant colony optimization (ACO-B). In this algorithm, an unconstrained optimization problem is first solved to obtain an undirected skeleton, and then the ACO algorithm is used to orient the edges, returning the final structure. In the experimental part of the paper, we compare the performance of the proposed algorithm with the ACO-B algorithm. The experimental results show that our method is effective and converges much faster than the ACO-B algorithm.
It is difficult to rapidly design the process parameters of copper alloys by the traditional trial-and-error method while simultaneously improving the conflicting mechanical and electrical properties. The purpose of this work is to develop a new Cu-Ni-Co-Si alloy that saves the scarce and expensive Co element, in which the Co content is less than half of the lower limit of the ASTM standard C70350 alloy while the properties reach the same level as the C70350 alloy. We adopted a strategy combining Bayesian optimization machine learning with experimental iteration and quickly designed the secondary deformation-aging parameters (cold rolling deformation 90%, aging temperature 450°C, and aging time 1.25 h) of the new copper alloy with only 32 experiments (27 basic sample data acquisition experiments and 5 iteration experiments), breaking through the barrier of low efficiency and high cost in trial-and-error design of deformation-aging parameters for precipitation-strengthened copper alloys. The experimental hardness, tensile strength, and electrical conductivity of the new copper alloy are HV (285±4), (872±3) MPa, and (44.2±0.7)% IACS (international annealed copper standard), reaching the property level of the commercial lead-frame C70350 alloy. This work provides a new idea for the rapid design of material process parameters and the simultaneous improvement of mechanical and electrical properties.
Funding: The authors extend their appreciation to the Deanship of Research and Graduate Studies at King Khalid University for funding this work through the Large Group Project under grant number RGP2/337/46. The research team thanks the Deanship of Graduate Studies and Scientific Research at Najran University for supporting the research project through the Nama'a program, with the project code NU/GP/SERC/13/352-4.
Funding: Supported by the National Key Research and Development Program for Young Scientists of China (No. 2022YFC3704000), the National Natural Science Foundation of China (No. 42275122), and the National Key Scientific and Technological Infrastructure project "Earth System Science Numerical Simulator Facility" (EarthLab).
Abstract: PM2.5 constitutes a complex and diverse mixture that significantly impacts the environment, human health, and climate change. However, existing observation and numerical simulation techniques have limitations, such as a lack of data, high acquisition costs, and multiple uncertainties. These limitations hinder the acquisition of comprehensive information on PM2.5 chemical composition and the effective implementation of refined air pollution protection and control strategies. In this study, we developed an optimal deep learning model to acquire hourly mass concentrations of key PM2.5 chemical components without complex chemical analysis. The model was trained using a randomly partitioned multivariate dataset arranged in chronological order, including atmospheric state indicators, which previous studies did not consider. Our results showed that the correlation coefficients of key chemical components were no less than 0.96, and the root mean square errors ranged from 0.20 to 2.11 μg/m³ for the entire process (training and testing combined). The model accurately captured the temporal characteristics of key chemical components, outperforming typical machine-learning models, previous studies, and global reanalysis datasets such as the Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2) and the Copernicus Atmosphere Monitoring Service ReAnalysis (CAMSRA). We also quantified feature importance using the random forest model, which showed that PM2.5, PM1, visibility, and temperature were the most influential variables for key chemical components. In conclusion, this study presents a practical approach to accurately obtaining chemical composition information, which can help fill missing data and improve air pollution monitoring and source identification. This approach has the potential to enhance air pollution control strategies and promote public health and environmental sustainability.
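The feature-importance analysis mentioned above (performed with a random forest in the study) can be illustrated with generic permutation importance: shuffle one input column and measure how much the model's error grows. The data and "model" below are synthetic stand-ins, not the study's variables:

```python
import random

def permutation_importance(predict, X, y, n_features, rng):
    """Importance of feature f = increase in MSE after shuffling column f."""
    def mse(Xs):
        return sum((predict(x) - t) ** 2 for x, t in zip(Xs, y)) / len(y)
    base = mse(X)
    scores = []
    for f in range(n_features):
        col = [x[f] for x in X]
        rng.shuffle(col)                              # break the feature-target link
        Xp = [x[:f] + [v] + x[f + 1:] for x, v in zip(X, col)]
        scores.append(mse(Xp) - base)
    return scores

rng = random.Random(0)
# Synthetic data: the target depends strongly on feature 0, weakly on
# feature 1, and not at all on feature 2.
X = [[rng.random() for _ in range(3)] for _ in range(500)]
y = [5 * x[0] + 0.5 * x[1] for x in X]
predict = lambda x: 5 * x[0] + 0.5 * x[1]             # a perfect toy model
imp = permutation_importance(predict, X, y, 3, rng)
print(imp.index(max(imp)))  # → 0 (feature 0 dominates)
```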
基金funded by the KRICT Project (KK2512-10) of the Korea Research Institute of Chemical Technology and the Ministry of Trade, Industry and Energy (MOTIE)the Korea Institute for Advancement of Technology (KIAT) through the Virtual Engineering Platform Program (P0022334)+1 种基金supported by the Carbon Neutral Industrial Strategic Technology Development Program (RS-202300261088) funded by the Ministry of Trade, Industry & Energy (MOTIE, Korea)Further support was provided by research fund of Chungnam National University。
Abstract: Bifunctional oxide-zeolite-based composites (OXZEO) have emerged as promising materials for the direct conversion of syngas to olefins. However, experimental screening and optimization of reaction parameters remain resource-intensive. To address this challenge, we implemented a three-stage framework integrating machine learning, Bayesian optimization, and experimental validation, utilizing a carefully curated dataset from the literature. Our ensemble-tree model (R² > 0.87) identified Zn-Zr and Cu-Mg binary mixed oxides as the most effective OXZEO systems, and their light-olefin space-time yields were confirmed experimentally by physically mixing them with HSAPO-34. Density functional theory calculations further elucidated the activity trends between the Zn-Zr and Cu-Mg mixed oxides. Among 16 catalyst and reaction-condition descriptors, the oxide/zeolite ratio, reaction temperature, and pressure emerged as the most significant factors. This interpretable, data-driven framework offers a versatile approach that can be applied to other catalytic processes, providing a powerful tool for experiment design and optimization in catalysis.
Funding: Supported by the National Key Research and Development Program of China (Grant Nos. 2021YFC2201901, 2021YFC2203004, 2020YFC2200100, and 2021YFC2201903), the International Partnership Program of the Chinese Academy of Sciences (Grant No. 025GJHZ2023106GC), and the Brazilian agencies Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP), Fundação de Amparo à Pesquisa do Estado do Rio Grande do Sul (FAPERGS), Fundação de Amparo à Pesquisa do Estado do Rio de Janeiro (FAPERJ), Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), and Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES).
Abstract: Extreme-mass-ratio inspiral (EMRI) signals pose significant challenges to gravitational wave (GW) data analysis, mainly owing to their highly complex waveforms and high-dimensional parameter space. Given their extended timescales of months to years and low signal-to-noise ratios, detecting and analyzing EMRIs with confidence generally relies on long-term observations. Besides the length of data, parameter estimation is particularly challenging due to non-local parameter degeneracies arising from multiple local maxima, as well as flat regions and ridges inherent in the likelihood function. These factors lead to exceptionally high time complexity for parameter analysis based on traditional matched filtering and random sampling methods. To address these challenges, the present study explores a machine learning approach to Bayesian posterior estimation of EMRI signals, leveraging the recently developed flow matching posterior estimation (FMPE) technique based on ordinary-differential-equation neural networks. To our knowledge, this is also the first application of continuous normalizing flows to EMRI analysis. Our approach improves computational efficiency by several orders of magnitude compared to traditional Markov chain Monte Carlo (MCMC) methods while preserving the unbiasedness of the results. However, we note that the posterior distributions generated by FMPE may exhibit broader uncertainty ranges than those obtained through full Bayesian sampling, requiring subsequent refinement via methods such as MCMC. Notably, when searching from large priors, our model rapidly approaches the true values while MCMC struggles to converge to the global maximum. Our findings highlight that machine learning can efficiently handle the vast EMRI parameter space of up to seventeen dimensions, offering new perspectives for advancing space-based GW detection and GW astronomy.
Funding: Supported by the National Natural Science Foundation of China (62394343), the Major Program of Qingyuan Innovation Laboratory (00122002), the Major Science and Technology Projects of Longmen Laboratory (231100220600), the Shanghai Committee of Science and Technology (23ZR1416000), and Shanghai AI Lab.
Abstract: The optimization of process parameters in polyolefin production can bring significant economic benefits to a plant. However, small data sets, the high cost of parameter-verification cycles, and the difficulty of establishing an optimization model often restrict the optimization process. To address this issue, we propose a transfer-learning Bayesian optimization strategy that improves the efficiency of parameter optimization while minimizing resource consumption. Specifically, we leverage Gaussian process (GP) regression models to establish an integrated model that incorporates both source and target grade production task data. We then measure the similarity weight of each model by comparing their predicted trends and utilize these weights to accelerate the search for optimal process parameters for producing target polyolefin grades. Because measuring similarity in a global search space may not effectively capture local similarity characteristics, we propose a transfer-learning optimization method that operates within a local space (LSTL-PBO). This method employs partial data acquired through random sampling from the target task data and uses Bayesian optimization for model establishment. By focusing on a local search space, we can better discern and leverage the inherent similarities between the source tasks and the target task. Additionally, we incorporate a parallel scheme to address multiple local search spaces simultaneously, exploring different regions of the parameter space in parallel and thereby increasing the chances of finding optimal process parameters. This localized approach improves the precision and effectiveness of the optimization process. The performance of our method is validated through experiments on benchmark problems, and we discuss the sensitivity of its hyperparameters. The results show that the proposed method can significantly improve the efficiency of process parameter optimization, reduce the dependence on source tasks, and enhance the method's robustness. This has great potential for optimizing processes in industrial environments.
Abstract: The multi-source passive localization problem is of great interest in signal processing, with many applications. In this paper, a sparse representation model based on the covariance matrix is constructed for the long-range localization scenario, and a sparse Bayesian learning algorithm based on a Laplace prior of the signal covariance is developed for the basis-mismatch problem caused by targets deviating from the initial grid points. An adaptive-grid sparse Bayesian learning target localization (AGSBL) algorithm is proposed. The AGSBL algorithm implements covariance-based sparse signal reconstruction and grid-adaptive localization dictionary learning. Simulation results show that the AGSBL algorithm outperforms traditional compressed-sensing localization algorithms for different signal-to-noise ratios and different numbers of targets in long-range scenes.
Abstract: This study explores the application of Bayesian analysis based on neural networks and deep learning to data visualization. The motivation is that, with the increasing volume and complexity of data, traditional data analysis methods can no longer meet practical needs. The research methods include building neural network and deep learning models, optimizing and improving them through Bayesian analysis, and applying them to the visualization of large-scale data sets. The results show that combining neural networks with Bayesian analysis and deep learning can effectively improve the accuracy and efficiency of data visualization and enhance the intuitiveness and depth of data interpretation. The significance of the research is that it provides a new solution for data visualization in the big-data environment and helps further promote the development and application of data science.
Funding: Shaanxi Science Fund for Distinguished Young Scholars, Grant/Award Number: 2024JC-JCQN-57; Xi'an Science and Technology Plan Project, Grant/Award Number: 2023JH-QCYJQ-0086; Scientific Research Program Funded by the Education Department of Shaanxi Provincial Government, Grant/Award Number: P23JP071; Engineering Technology Research Center of Shaanxi Province for Intelligent Testing and Reliability Evaluation of Electronic Equipments, Grant/Award Number: 2023-ZC-GCZX-0047; 2022 Shaanxi University Youth Innovation Team Project.
Abstract: The use of dynamic programming (DP) algorithms to learn Bayesian network structures is limited by their high space complexity and the difficulty of learning the structure of large-scale networks. Therefore, this study proposes a DP algorithm based on node-block sequence constraints. The proposed algorithm constrains the traversal of the parent graph using the M-sequence matrix, and considerably reduces time consumption and space complexity by pruning the traversal of the order graph using the node-block sequence. Experimental results show that, compared with existing DP algorithms, the proposed algorithm obtains learning results more efficiently with less than 1% loss of accuracy, and can be used to learn larger-scale networks.
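The DP family of algorithms that this work prunes can be sketched for a tiny network. The sketch below enumerates node subsets exactly as an unpruned order-graph traversal does, with made-up local scores standing in for data-derived ones; the paper's M-sequence and node-block pruning are omitted:

```python
from itertools import combinations

def exact_bn_structure(n, local_score):
    """Subset DP for exact Bayesian network structure learning:
    best[S] = optimal total score over the nodes in bitmask S, obtained
    by choosing which node of S comes last in a topological ordering."""
    def best_parent_score(v, candidates):
        # Cheapest parent set for v drawn from the bitmask `candidates`.
        bits = [b for b in range(n) if candidates >> b & 1]
        subsets = [0] + [sum(1 << b for b in c)
                         for r in range(1, len(bits) + 1)
                         for c in combinations(bits, r)]
        return min(local_score(v, ps) for ps in subsets)

    best = {0: 0.0}
    for size in range(1, n + 1):
        for bits in combinations(range(n), size):
            S = sum(1 << b for b in bits)
            best[S] = min(best[S & ~(1 << v)]
                          + best_parent_score(v, S & ~(1 << v))
                          for v in bits)
    return best[(1 << n) - 1]

# Hypothetical local scores (lower is better): node 1 "prefers" node 0 as its
# parent and node 2 prefers node 1, so the chain 0 -> 1 -> 2 is optimal.
def score(v, parents):
    ideal = {0: 0b000, 1: 0b001, 2: 0b010}[v]
    return 1.0 if parents == ideal else 2.0

print(exact_bn_structure(3, score))  # → 3.0
```

The `best` table over all 2^n subsets is exactly the space bottleneck the abstract refers to; constraining which orderings are traversed shrinks both the table and the parent-set enumeration.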
Funding: Supported in part by the National Natural Science Foundation of China under Grant No. 61503059.
Abstract: Hyperparameters are important for machine learning algorithms since they directly control the behavior of training algorithms and have a significant effect on the performance of machine learning models. Several tuning techniques have been developed and successfully applied in certain application domains, but this work demands professional knowledge and expert experience, and sometimes has to resort to brute-force search. Therefore, an efficient hyperparameter optimization algorithm that can optimize any given machine learning method would greatly improve the efficiency of machine learning. In this paper, we model the relationship between the performance of a machine learning model and its hyperparameters with Gaussian processes. In this way, hyperparameter tuning can be abstracted as an optimization problem, and Bayesian optimization is used to solve it. Bayesian optimization is based on Bayes' theorem: it places a prior over the optimization function and uses information from previous samples to update the posterior of the optimization function, while a utility function selects the next sample point to maximize the optimization function. Several experiments were conducted on standard test datasets. The results show that the proposed method can find the best hyperparameters for widely used machine learning models, such as the random forest algorithm and neural networks, and even multi-grained cascade forest, under the consideration of time cost.
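The loop described above (prior, posterior update, utility-driven sampling) can be sketched with a small Gaussian-process surrogate and an upper-confidence-bound utility, one common choice of utility function; the 1-D "validation score" and all constants below are illustrative assumptions, not the paper's setup:

```python
import math
import random

def rbf(a, b, ls=0.3):
    """Squared-exponential kernel on scalars."""
    return math.exp(-(a - b) ** 2 / (2 * ls ** 2))

def inverse(A):
    """Gauss-Jordan matrix inverse with partial pivoting (small matrices)."""
    n = len(A)
    M = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [v - f * w for v, w in zip(M[r], M[c])]
    return [row[n:] for row in M]

def bayes_opt(f, iters=10, kappa=2.0, seed=1):
    """Maximize f on [0, 1]: GP prior, posterior update, UCB utility."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(3)]     # initial random design
    ys = [f(x) for x in xs]
    grid = [i / 100 for i in range(101)]      # candidate sample points
    for _ in range(iters):
        n = len(xs)
        K = [[rbf(a, b) + (1e-4 if i == j else 0.0)   # jitter for stability
              for j, b in enumerate(xs)] for i, a in enumerate(xs)]
        Kinv = inverse(K)
        alpha = [sum(Kinv[i][j] * ys[j] for j in range(n)) for i in range(n)]
        def ucb(q):
            ks = [rbf(a, q) for a in xs]
            mean = sum(k * w for k, w in zip(ks, alpha))
            kv = [sum(Kinv[i][j] * ks[j] for j in range(n)) for i in range(n)]
            var = max(rbf(q, q) - sum(k * w for k, w in zip(ks, kv)), 0.0)
            return mean + kappa * math.sqrt(var)      # exploit + explore
        x_next = max(grid, key=ucb)
        xs.append(x_next)
        ys.append(f(x_next))
    return xs[ys.index(max(ys))]

# Toy "validation score" whose best hyperparameter value is 0.7.
best = bayes_opt(lambda x: -(x - 0.7) ** 2)
print("best x found:", best)
```

Each iteration conditions the GP on all evaluations so far (the posterior update) and then queries the point where the utility, posterior mean plus an uncertainty bonus, is largest, which is what lets the method spend few expensive evaluations.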
Funding: Supported by the National Natural Science Foundation of China (61305133, 61573285) and the Fundamental Research Funds for the Central Universities (3102016CG002).
Abstract: When training data are insufficient, especially when only a small sample of data is available, domain knowledge can be incorporated into the parameter-learning process to improve the performance of Bayesian networks. In this paper, a new monotonic constraint model is proposed to represent a common type of domain knowledge, and a monotonic constraint estimation algorithm is proposed to learn the parameters under this model. To demonstrate the superiority of the proposed algorithm, a series of experiments is carried out. The results show that the proposed algorithm obtains more accurate parameters than some existing algorithms while not having the highest complexity.
Funding: Supported by the National Natural Science Foundation of China (61573285).
Abstract: It is impractical to learn the optimal structure of a big Bayesian network (BN) by exhausting the feasible structures, since the number of feasible structures is super-exponential in the number of nodes. This paper proposes an approach to layer the nodes of a BN by using conditional independence testing. The parents of a node in a layer only belong to that layer or to layers with priority over it. Once a set of nodes has been layered, the number of feasible structures over the nodes can be remarkably reduced, which makes it possible to learn optimal BN structures for bigger node sets with exact algorithms. Integrating the dynamic programming (DP) algorithm with the layering approach, we propose a hybrid algorithm, layered optimal learning (LOL), to learn BN structures. Thanks to the layering approach, the complexity of the DP algorithm reduces from O(n·2^(n-1)) to O(ρ·2^(n-1)), where ρ < n. Meanwhile, the memory requirement for storing intermediate results is limited to O(C(k#, k#/2)) from O(C(n, n/2)), where k# < n. A case study on learning a standard BN with 50 nodes is conducted. The results demonstrate the superiority of the LOL algorithm, with respect to the Bayesian information criterion (BIC) score, over the hill-climbing, max-min hill-climbing, PC, and three-phase dependency analysis algorithms.
Funding: Supported by the National Natural Science Foundation of China (71101116, 71271170), the Program for New Century Excellent Talents in University (NCET-13-0475), and the Basic Research Foundation of NPU (JC20120228).
Abstract: Finding reasonable structures from bulky data is one of the difficulties in modeling Bayesian networks (BNs) and is necessary for promoting their application. This paper proposes an immune-algorithm-based method (BN-IA) for learning the BN structure with the idea of vaccination. The methods for extracting effective vaccines from the local optimal structure and root nodes are also described in detail. Finally, simulation studies are implemented with the helicopter converter BN model and the car start BN model. The comparison results show that the proposed vaccines and the BN-IA can learn the BN structure effectively and efficiently.
Funding: Supported by the National Natural Science Foundation of China (61573285) and the Doctoral Foundation of China (2013ZC53037).
Abstract: Ordering-based search methods have advantages over graph-based search methods for structure learning of Bayesian networks in terms of efficiency. With the aim of further increasing the accuracy of ordering-based search methods, we first propose to enlarge the search space, which facilitates escaping from local optima. We present search operators with majorizations, which are easy to implement. Experiments show that the proposed algorithm obtains significantly more accurate results. To counter the decrease in efficiency caused by the enlarged search space, we then propose to add path priors as constraints into the swap process. We analyze the coefficient that may influence the performance of the proposed algorithm; the experiments show that the constraints greatly enhance efficiency while having little effect on accuracy. The final experiments show that, compared with other competitive methods, the proposed algorithm finds better solutions while maintaining high efficiency on both synthetic and real data sets.
Funding: Supported by the National Natural Science Foundation of China (61573285) and the Innovation Foundation for Doctoral Dissertations of Northwestern Polytechnical University, China (CX201619).
Abstract: Bayesian networks (BNs) have become increasingly popular in recent years due to their wide-ranging applications in modeling uncertain knowledge. An essential problem for discrete BNs is learning the conditional probability table (CPT) parameters. If training data are sparse, purely data-driven methods often fail to learn accurate parameters; expert judgments can then be introduced to overcome this challenge. Parameter constraints deduced from expert judgments make parameter estimates consistent with domain knowledge, and Dirichlet priors contain information that helps improve learning accuracy. This paper proposes a constrained Bayesian estimation approach that learns CPTs by incorporating both constraints and Dirichlet priors. First, a posterior distribution of the BN parameters is developed over a restricted parameter space based on the training data and Dirichlet priors. Then, the expectation of the posterior distribution is taken as the parameter estimate. As it is difficult to directly compute this expectation for a continuous distribution with an irregular feasible domain, we apply the Monte Carlo method to approximate it. In experiments on learning standard BNs, the proposed method outperforms competing methods, suggesting that it can facilitate solving real-world problems. Additionally, a case study on the Wine data demonstrates that the proposed method achieves the highest classification accuracy.
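The Monte Carlo approximation described above can be sketched directly: draw samples from the unconstrained Dirichlet posterior, discard those violating the expert constraints, and average the rest. The counts, prior, and ordering constraint below are hypothetical stand-ins for one CPT column:

```python
import random

def dirichlet_sample(alphas, rng):
    """Draw one sample from a Dirichlet distribution via Gamma draws."""
    g = [rng.gammavariate(a, 1.0) for a in alphas]
    s = sum(g)
    return [x / s for x in g]

def constrained_posterior_mean(counts, priors, constraint,
                               n_samples=20000, seed=0):
    """Monte Carlo estimate of E[theta | data, constraint]: sample the
    unconstrained Dirichlet posterior (prior + counts) and keep only the
    samples that fall inside the expert-defined feasible region."""
    rng = random.Random(seed)
    post = [c + a for c, a in zip(counts, priors)]
    kept = [t for t in (dirichlet_sample(post, rng) for _ in range(n_samples))
            if constraint(t)]
    k = len(kept)
    return [sum(t[i] for t in kept) / k for i in range(len(counts))]

# Sparse data: 3 observations over three states, with a uniform Dirichlet prior.
counts, priors = [2, 1, 0], [1.0, 1.0, 1.0]
# Hypothetical expert judgment: state 0 is at least as likely as state 1,
# which is at least as likely as state 2.
mono = lambda t: t[0] >= t[1] >= t[2]
theta = constrained_posterior_mean(counts, priors, mono)
print([round(v, 2) for v in theta])
```

Rejection sampling sidesteps the irregular feasible domain entirely: the estimate is exact in the limit of many samples, at the cost of wasting the draws that land outside the constraint region.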
Funding: Supported by the Natural Science Foundation of Chongqing (No. cstc2019jcyj-msxmX0017).
Abstract: Since orthogonal time-frequency space (OTFS) modulation can effectively handle the problems caused by the Doppler effect in high-mobility environments, it has gradually become a promising candidate modulation scheme for the next generation of mobile communication. However, the inter-Doppler interference (IDI) problem caused by fractional Doppler poses great challenges to channel estimation. To avoid this problem, this paper proposes a joint time and delay-Doppler (DD) domain channel estimation algorithm based on sparse Bayesian learning (SBL). First, we derive the original channel response (OCR) from the time-domain channel impulse response (CIR), which can reflect the channel variation during one OTFS symbol. Compared with the traditional channel model, the OCR can avoid the IDI problem. We then reduce the dimension of the OCR by using the basis expansion model (BEM) and the relationship between the time and DD domain channel models, turning the underdetermined problem into an overdetermined one. Finally, exploiting the sparsity of the channel in the delay domain, the SBL algorithm is used to estimate the basis coefficients in the BEM without any prior channel information. The simulation results show the effectiveness and superiority of the proposed channel estimation algorithm.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 60433020, 60175024, and 60773095), the European Commission under Grant No. TH/Asia Link/010 (111084), the Key Science-Technology Project of the National Education Ministry of China (Grant No. 02090), and the Key Laboratory of Symbol Computation and Knowledge Engineering of the Ministry of Education, Jilin University, P. R. China.
Abstract: In the post-genomic era, reconstructing gene regulatory networks from microarray gene expression data is very important for understanding the underlying biological system, and it has been a challenging task in bioinformatics. The Bayesian network model has been used to reconstruct gene regulatory networks because of its advantages, but how to determine the network structure and parameters still needs to be explored. This paper proposes a two-stage structure learning algorithm that integrates an immune evolution algorithm to build a Bayesian network. The new algorithm is evaluated with both simulated and yeast cell cycle data. The experimental results indicate that the proposed algorithm can find many of the known regulatory relationships from the literature and predict unknown ones with high validity and accuracy.
Funding: This project was supported by the National Natural Science Foundation of China (70572045).
Abstract: A new method to evaluate the fitness of Bayesian networks against observed data is provided. The main advantage of this criterion is that it is suitable for both complete and incomplete cases, while most others are not; moreover, it greatly facilitates computation. To reduce the search space, the notion of equivalence classes proposed by David Chickering is adopted. Rather than using that method directly, the novel criterion, variable ordering, and equivalence classes are combined, which avoids some problems of the previous approach. A genetic algorithm, which offers the global convergence that most Bayesian network search methods lack, is then applied to search for a good model in this space. To speed up convergence, the genetic algorithm is combined with a greedy algorithm. Finally, simulations show the validity of the proposed approach.
Funding: Supported by the National Natural Science Foundation of China (60974082, 11171094), the Fundamental Research Funds for the Central Universities (K50510700004), the Foundation and Advanced Technology Research Program of Henan Province (102300410264), and the Basic Research Program of the Education Department of Henan Province (2010A110010).
Abstract: Structure learning of Bayesian networks is a well-researched but computationally hard task. This paper proposes an improved algorithm based on unconstrained optimization and ant colony optimization (U-ACO-B) to overcome the drawbacks of the ant colony optimization algorithm (ACO-B). In this algorithm, an unconstrained optimization problem is first solved to obtain an undirected skeleton, and then the ACO algorithm is used to orient the edges, returning the final structure. In the experimental part of the paper, we compare the performance of the proposed algorithm with the ACO-B algorithm. The experimental results show that our method is effective and converges much faster than the ACO-B algorithm.
Funding: Supported by the National Key Research and Development Program of China (No. 2021YFB3803101) and the National Natural Science Foundation of China (Nos. 52090041, 52022011, and 51974028).
Abstract: It is difficult to rapidly design the process parameters of copper alloys by the traditional trial-and-error method while simultaneously improving the conflicting mechanical and electrical properties. The purpose of this work is to develop a new type of Cu-Ni-Co-Si alloy that saves the scarce and expensive element Co, in which the Co content is less than half of the lower limit of the ASTM standard C70350 alloy while the properties remain at the same level as C70350. We adopted a strategy combining Bayesian optimization machine learning with experimental iteration and quickly designed the secondary deformation-aging parameters of the new copper alloy (cold rolling deformation 90%, aging temperature 450 °C, and aging time 1.25 h) with only 32 experiments (27 basic sample-data acquisition experiments and 5 iteration experiments), breaking through the barrier of low efficiency and high cost in the trial-and-error design of deformation-aging parameters for precipitation-strengthened copper alloys. The experimental hardness, tensile strength, and electrical conductivity of the new copper alloy are HV (285±4), (872±3) MPa, and (44.2±0.7)% IACS (International Annealed Copper Standard), reaching the property level of the commercial lead-frame C70350 alloy. This work provides a new approach to the rapid design of material process parameters and the simultaneous improvement of mechanical and electrical properties.