6 articles found
Fine-tuning large language models for domain adaptation: exploration of training strategies, scaling, model merging and synergistic capabilities (Cited by: 1)
1
Authors: Wei Lu, Rachel K. Luu, Markus J. Buehler. npj Computational Materials, 2025, Issue 1, pp. 858-900 (43 pages)
The advancement of Large Language Models (LLMs) for domain applications in fields such as materials science and engineering depends on the development of fine-tuning strategies that adapt models for specialized, technical capabilities. In this work, we explore the effects of Continued Pretraining (CPT), Supervised Fine-Tuning (SFT), and various preference-based optimization approaches, including Direct Preference Optimization (DPO) and Odds Ratio Preference Optimization (ORPO), on fine-tuned LLM performance. Our analysis shows how these strategies influence model outcomes and reveals that the merging of multiple fine-tuned models can lead to the emergence of capabilities that surpass the individual contributions of the parent models. We find that model merging is not merely a process of aggregation, but a transformative method that can drive substantial advancements in model capabilities characterized by highly nonlinear interactions between model parameters, resulting in new functionalities that neither parent model could achieve alone, leading to improved performance in domain-specific assessments. We study critical factors that influence the success of model merging, such as the diversity between parent models and the fine-tuning techniques employed. The insights underscore the potential of strategic model merging to unlock novel capabilities in LLMs, offering an effective tool for advancing AI systems to meet complex challenges. Experiments with different model architectures are presented, including the Llama 3.1 8B and Mistral 7B families of models, where similar behaviors are observed.
Exploring whether the results also hold for much smaller models, we use a tiny LLM with 1.7 billion parameters and show that very small LLMs do not necessarily feature emergent capabilities under model merging, suggesting that model scaling may be a key component. In open-ended yet consistent chat conversations between a human and AI models, our assessment reveals detailed insights into how different model variants perform, and shows that the smallest model achieves a high intelligence score across key criteria including reasoning depth, creativity, clarity, and quantitative precision. Other experiments include the development of image generation prompts that seek to reason over disparate biological material design concepts, to create new microstructures, architectural concepts, and urban design based on biological materials-inspired construction principles. We conclude with a series of questions about scaling and emergence that could be addressed in future research.
Keywords: continued pretraining; domain adaptation; training strategies; fine tuning; materials science; direct preference optimization; large language models; model merging
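As a rough illustration of what merging two fine-tuned models means at the parameter level, the following Python sketch linearly interpolates two toy parameter sets. This is an assumption made for illustration only: the abstract does not state which merging method the authors used, and the names `merge_linear`, `model_a`, `model_b` and all weight values are hypothetical.

```python
from typing import Dict, List

# A toy "model": parameter name -> flat list of weights (hypothetical layout).
Params = Dict[str, List[float]]


def merge_linear(parent_a: Params, parent_b: Params, t: float = 0.5) -> Params:
    """Interpolate two parameter sets key by key: (1 - t) * a + t * b."""
    if parent_a.keys() != parent_b.keys():
        raise ValueError("parents must share the same parameter names")
    merged: Params = {}
    for name, a_vals in parent_a.items():
        b_vals = parent_b[name]
        if len(a_vals) != len(b_vals):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [(1.0 - t) * a + t * b for a, b in zip(a_vals, b_vals)]
    return merged


# Made-up weights standing in for two fine-tuned parents of one base model.
model_a: Params = {"layer.weight": [0.0, 2.0], "layer.bias": [1.0]}
model_b: Params = {"layer.weight": [4.0, 2.0], "layer.bias": [3.0]}

merged = merge_linear(model_a, model_b, t=0.5)
print(merged)
```

Linear interpolation is only the simplest baseline. The nonlinear effects described in the abstract concern how the merged model behaves, which a parameter-level sketch like this cannot show.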
Application of neural network merging model in dam deformation analysis (Cited by: 5)
2
Authors: Zhang Fan, Hu Wusheng. Journal of Southeast University (English Edition) (EI, CAS), 2013, Issue 4, pp. 441-444 (4 pages)
To improve prediction accuracy and test the generalization ability of the dam deformation analysis model, the back-propagation (BP) neural network model for dam deformation analysis is studied, and a merging model is built based on the BP neural network algorithm and the traditional statistical model. The three models are calculated and analyzed using long-term deformation observation data from Chencun Dam. The results show that the average prediction accuracies of the statistical model and the BP neural network model are ±0.477 mm and ±0.390 mm, respectively, while the prediction accuracy of the merging model is ±0.318 mm, an improvement of 33% and 18% over the other two models, respectively. The merging model also has better generalization ability and broad applicability.
Keywords: dam deformation analysis; neural network; statistical model; merging model
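The improvement percentages quoted in the abstract can be checked from the three reported error magnitudes (0.477, 0.390 and 0.318 mm). The short Python sketch below assumes that "improved by" means the relative reduction in error with respect to each baseline; the function name `relative_improvement` is hypothetical.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Percentage reduction in error relative to the baseline."""
    return (baseline - improved) / baseline * 100.0


# Error magnitudes in mm, as reported in the abstract.
statistical_mm = 0.477
bp_network_mm = 0.390
merging_mm = 0.318

vs_statistical = relative_improvement(statistical_mm, merging_mm)
vs_bp_network = relative_improvement(bp_network_mm, merging_mm)

print(f"vs statistical model: {vs_statistical:.1f}%")
print(f"vs BP neural network: {vs_bp_network:.1f}%")
```

Under that assumption, both values round to the 33% and 18% stated in the abstract.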
Dam deformation analysis based on BPNN merging models (Cited by: 2)
3
Authors: Jingui Zou, Kien-Trinh Thi Bui, Yangxuan Xiao, Chinh Van Doan. Geo-Spatial Information Science (SCIE, CSCD), 2018, Issue 2, pp. 149-157 (9 pages)
Hydropower has made a significant contribution to the economic development of Vietnam, so it is important to monitor the safety of hydropower dams for the good of the country and the people. In this paper, dam horizontal displacement is analyzed and then forecasted using three methods: the multi-regression model, the seasonal integrated auto-regressive moving average (SARIMA) model and the back-propagation neural network (BPNN) merging models. The monitoring data of the Hoa Binh Dam in Vietnam, including horizontal displacement, time, reservoir water level, and air temperature, are used for the experiments. The results indicate that all three methods can approximately describe the trend of dam deformation despite their different forecast accuracies. Hence, their short-term forecasts can provide valuable references for dam safety.
Keywords: Dam deformation analysis; multi-regression model; Back-propagation Neural Network (BPNN); Seasonal Integrated Auto-regressive Moving Average (SARIMA) model; merging model
An Efficient Multi-area Networks-Merging Model for Power System Online Dynamic Modeling (Cited by: 3)
4
Authors: Chuan Qin, Ping Ju, Feng Wu, Yongkang Liu, Xiaohui Ye, Guoyang Wu. CSEE Journal of Power and Energy Systems (SCIE), 2015, Issue 4, pp. 22-28 (7 pages)
To improve accuracy and efficiency in power systems dynamic modeling, the distributed online modeling approach is a good option. In this approach, the power system is divided into sub-grids, and the dynamic models of the sub-grids are built independently within the distributed modeling system. The sub-grid models are subsequently merged, after which the dynamic model of the whole power system is finally constructed online. The merging of the networks plays an important role in the distributed online dynamic modeling of power systems. An efficient multi-area networks-merging model that can rapidly match the boundary power flow is proposed in this paper. The iterations of the boundary matching during network merging are eliminated due to the introduction of the merging model, and the dynamic models of the sub-grids can be directly "plugged in" with each other. The results of the calculations performed in a real power system demonstrate the accuracy of the integrated model under both steady and transient states.
Keywords: Boundary matching; distributed online modeling; dynamic modeling; multi-area networks merging model; power system
Deep learning for electrolysis process anode effect prediction based on long short-term memory network and stacked denoising autoencoder (Cited by: 5)
5
Authors: Gang Yin, Yi-Hui Li, Fei-Ya Yan, Peng-Cheng Quan, Min Wang, Wen-Qi Cao, Heng-Quan Xu, Jian Lu, Wen He. Rare Metals (CSCD), 2024, Issue 12, pp. 6730-6741 (12 pages)
The anode effect is a common failure in the aluminium electrolysis industry. If the anode effect cannot be accurately predicted, it will cause increased energy consumption, harmful gas generation and even equipment damage in aluminium electrolysis. In this paper, an anode effect prediction framework using multi-model merging based on deep learning technology is proposed. Different models are used to process aluminium electrolysis cell condition parameters with high dimensions and different characteristics, and hidden key fault information is deeply mined. A stacked denoising autoencoder is utilized to denoise and extract features from a large number of long-period parameter data. A long short-term memory network is implemented to identify the intrinsic links between the real-time voltage and current time series and the anode effect. By setting the model time step, the anode effect can be predicted precisely in advance, and the proposed method has good robustness and generalization. Moreover, the traditional Adam algorithm is improved, which enhances the performance and convergence speed of the model. The experimental results show that the classification accuracy and F1 score of the model are 97.14% and 0.9579, respectively. The prediction time can reach 15 min.
Keywords: Aluminium electrolysis; Anode effect prediction; Deep learning; Improved Adam algorithm; merging model
Graph Computing and Its Application in Power Grid Analysis (Cited by: 2)
6
Authors: Mike Zhou, Jianfeng Yan, Qianhong Wu. CSEE Journal of Power and Energy Systems (SCIE, EI, CSCD), 2022, Issue 6, pp. 1550-1557 (8 pages)
Approaches to applying graph computing to power grid analysis are systematically explained using real-world application examples. Through exploring the nature of the power grid and the characteristics of power grid analysis, the guidelines for selecting appropriate graph computing techniques for application to power grid analysis are outlined. A custom graph model for representing the power grid for analysis and simulation purposes and an in-memory computing (IMC) based graph-centric approach with a shared-everything architecture are introduced. Graph algorithms, including network topology processing and subgraph processing, and graph computing application scenarios, including in-memory computing, contingency analysis, and Common Information Model (CIM) model merge, are presented.
Keywords: CIM model merge; contingency analysis; Graph computing; in-memory computing; network topology processing; power grid analysis; subgraph processing