Developing chemical technologies is a multistage process that spans laboratory research, scale-up, and industrial deployment and requires interdisciplinary collaboration, so it often incurs substantial time and economic costs. To address these challenges, we report ChemELLM, a domain-specific large language model (LLM) with 70 billion parameters for chemical engineering. ChemELLM demonstrates state-of-the-art performance across critical tasks ranging from foundational understanding to professional problem-solving. It outperforms mainstream LLMs (e.g., O1-Preview, GPT-4o, and DeepSeek-R1) on ChemEBench, the first multidimensional benchmark for chemical engineering, which spans 15 dimensions and 101 distinct essential tasks. To support robust model development, we curated ChemEData, a purpose-built dataset containing 19 billion tokens for pre-training and 1 billion tokens for fine-tuning. This work establishes a new paradigm for artificial intelligence-driven innovation, bridging the gap between laboratory-scale innovation and industrial-scale implementation and thereby accelerating technological advancement in chemical engineering. ChemELLM is publicly available at https://chemindustry.iflytek.com/chat.