Funding: Project supported by the National Key R&D Program of China (Grant No. 2016YFB0700503), the National High Technology Research and Development Program of China (Grant No. 2015AA03420), the Beijing Municipal Science and Technology Project, China (Grant No. D161100002416001), the National Natural Science Foundation of China (Grant No. 51172018), and Kennametal Inc.
Abstract: Since its launch in 2011, the Materials Genome Initiative (MGI) has drawn the attention of researchers from academia, government, and industry worldwide. As one of the three tools of the MGI, materials data has, for the first time, emerged as an extremely significant approach in materials discovery. Data science has been applied across disciplines as an interdisciplinary field for extracting knowledge from data. The concept of materials data science is used here to demonstrate its application in materials science. To explore its potential as an active research branch in the big data era, a three-tier system is put forward to define the infrastructure for the classification, curation, and knowledge extraction of materials data.
Abstract: Small Language Models (SLMs) offer an efficient alternative for structured information extraction. We present SLM-MATRIX, a multi-path collaborative reasoning and verification framework based on SLMs, designed to extract material names, numerical values, and physical units from materials science literature. The framework integrates three complementary reasoning paths: a multi-agent collaborative path, a generator–discriminator path, and a dual cross-verification path. SLM-MATRIX achieves an accuracy of 92.85% on the BulkModulus dataset and 77.68% on the MatSynTriplet dataset, outperforming conventional methods and single-path models on both. Moreover, experiments on general reasoning benchmarks such as GSM8K and SVAMP validate the framework's strong generalization capability. Ablation studies evaluate the effects of agent number, Mixture-of-Agents (MoA) depth, and discriminator design on overall performance. Overall, SLM-MATRIX presents an effective approach for high-quality material information extraction in resource-constrained settings and offers new insights into structured scientific text understanding tasks.
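The abstract above does not specify how the outputs of the reasoning paths are reconciled, so the following is only a minimal sketch of one plausible cross-verification step: a simple majority vote over (material, value, unit) triplets. The names `Triplet`, `normalize`, and `cross_verify`, the `min_votes` threshold, and the normalization rules are all assumptions made for illustration and are not taken from SLM-MATRIX.

```python
from collections import Counter
from typing import List, NamedTuple, Optional


class Triplet(NamedTuple):
    """One extracted fact: material name, numerical value, physical unit."""
    material: str
    value: float
    unit: str


def normalize(t: Triplet) -> Triplet:
    """Canonicalize a triplet so trivially different spellings compare equal."""
    return Triplet(t.material.strip().lower(), round(t.value, 6), t.unit.strip())


def cross_verify(candidates: List[Optional[Triplet]],
                 min_votes: int = 2) -> Optional[Triplet]:
    """Return the triplet that at least `min_votes` paths agree on, else None.

    `candidates` holds one entry per reasoning path; None means the path
    produced no answer. This is plain majority voting (an assumption),
    standing in for whatever verification the framework actually uses.
    """
    votes = Counter(normalize(c) for c in candidates if c is not None)
    if not votes:
        return None
    best, count = votes.most_common(1)[0]
    return best if count >= min_votes else None
```

With three hypothetical path outputs, two of which agree after normalization, `cross_verify` returns the agreed triplet; if every path disagrees it returns `None`, which a caller could treat as "needs another round or manual review".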
Abstract: According to statistics of the Printing and Printing Equipment Industries Association of China (PEIAC), the total output value of the printing industry of China in 2007 reached 440 billion RMB, the total output value of printing equipment was
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 51205004, 51475003), the Beijing Municipal Natural Science Foundation of China (Grant No. 3152010), and the Beijing Municipal Education Committee Science and Technology Program, China (Grant No. KM201510009004).
Abstract: Flexible roll forming is a promising manufacturing method for producing parts with variable cross sections. The plastic strain in this forming process is much larger than that reached in the uniform deformation phase of a uniaxial tensile test, so the widely adopted practice of simulating the process with non-supplemented material data from the uniaxial tensile test leads to large errors. To reduce this error, the material data is supplemented based on three constitutive models. A finite element model of a six-pass flexible roll forming process is then established, using both the supplemented material data and the original material data from the uniaxial tensile test. A flexible roll forming experiment on a B-pillar reinforcing plate is carried out to verify the proposed method, and the final cross-section shapes of the experimental and simulated results are compared. The simulation calculated with supplemented material data based on the Swift model agrees well with the experimental results, while the simulation based on the original material data cannot predict the actual deformation accurately. The results indicate that this material supplement method is reliable and indispensable, and that the simulation model reflects the real metal forming process well. A detailed analysis of the distribution and history of plastic strain at different positions is performed. The proposed material data supplement method tackles a problem that is ignored in other roll forming simulations, and thus the accuracy of the forming process simulation can be greatly improved.
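As an illustration of what supplementing tensile data with the Swift model can look like, the sketch below fits the Swift hardening law, sigma = K * (eps0 + eps_p)^n, to flow-stress points from the uniform deformation range and then evaluates it at larger plastic strains. The abstract does not give the fitting procedure, so the log-space least-squares fit with a fixed `eps0`, and the function names `swift_stress` and `fit_swift`, are assumptions made for this example.

```python
import math
from typing import Sequence, Tuple


def swift_stress(strain: float, k: float, eps0: float, n: float) -> float:
    """Swift hardening law: flow stress = K * (eps0 + plastic strain) ** n."""
    return k * (eps0 + strain) ** n


def fit_swift(strains: Sequence[float], stresses: Sequence[float],
              eps0: float) -> Tuple[float, float]:
    """Fit K and n for a fixed eps0 by linear least squares in log space.

    Taking logs gives ln(sigma) = ln(K) + n * ln(eps0 + eps), a straight
    line whose slope is n and whose intercept is ln(K). Fixing eps0 is a
    simplifying assumption; a full fit would optimize it as well.
    """
    xs = [math.log(eps0 + e) for e in strains]
    ys = [math.log(s) for s in stresses]
    m = len(xs)
    x_mean = sum(xs) / m
    y_mean = sum(ys) / m
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    n = sxy / sxx
    k = math.exp(y_mean - n * x_mean)
    return k, n


if __name__ == "__main__":
    # Synthetic points generated from known parameters (not measured data).
    eps = [0.02 * i for i in range(1, 11)]            # up to 0.20 plastic strain
    sig = [swift_stress(e, 500.0, 0.01, 0.2) for e in eps]
    k_fit, n_fit = fit_swift(eps, sig, eps0=0.01)
    # Extrapolate beyond the range covered by the (synthetic) tensile data.
    print(k_fit, n_fit, swift_stress(0.8, k_fit, 0.01, n_fit))
```

The fitted curve can then be tabulated out to the strain levels expected in the forming simulation, which is the general idea behind extending tensile data past uniform elongation.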
Funding: Supported by the National Key Research and Development Program of China (2021YFA1500703, 2022YFA1503103, 2022YFB3807200), the Natural Science Foundation of China (22033002, T2321002, 22373013), the Natural Science Foundation of Jiangsu Province, Major Project (BK20232012, BK20222007), the Jiangsu Provincial Scientific Research Center of Applied Mathematics (BK20233002), and the Fundamental Research Funds for the Central Universities.
Abstract: CONSPECTUS: The data-driven paradigm, exemplified by machine learning, is revolutionizing the way materials are discovered. The inductive nature of the data-driven approach gives it great speed of prediction but also brings a heavy reliance on materials data. However, unlike text and images, where its success is supported by big data, materials data tend to be small data. Building a large database of materials is a good solution but not a permanent one. The cost of materials data is much higher than that of text or images, and the size of materials databases at this stage is far from sufficient. We will continue to face a shortage of materials data for a long time to come, making small-data approaches necessary for machine-learning-based materials discovery.
Funding: Funded by the German Research Foundation (DFG) through the NFDI consortium FAIRmat, project 460197019.
Abstract: With the rapidly increasing amount of materials data being generated in a variety of projects, efficient and accurate classification of atomistic structures is essential. A current barrier to effective database queries lies in the often ambiguous, inconsistent, or completely missing classification of existing data, highlighting the need for standardized, automated, and verifiable classification methods. This work proposes a robust solution for identifying and classifying a wide spectrum of materials through an iterative technique called symmetry-based clustering (SBC). Because SBC is not a machine-learning-based method, it requires no prior training. Instead, it identifies clusters in atomistic systems by automatically recognizing common unit cells. We demonstrate the potential of SBC to provide automated, reliable classification and to reveal well-known symmetry properties of various materials. Even noisy systems are shown to be classifiable, demonstrating the suitability of our algorithm for real-world data applications. The software implementation is provided in the open-source Python package MatID, exploiting synergies with popular atomic-structure manipulation libraries and extending the accessibility of those libraries through the NOMAD platform.
Funding: Supported by the National Natural Science Foundation of China (No. 60973079) and the Natural Science Foundation of Hebei Province (No. E2006000039).
Abstract: The representation of heterogeneous material information is one of the key technologies of heterogeneous object modeling, but almost none of the existing methods can represent non-uniform rational B-spline (NURBS) entities. According to the characteristics of NURBS, a novel data structure, named the NURBS material data structure, is proposed, in which the geometrical coordinates, weights, and material coordinates of NURBS heterogeneous objects can be represented simultaneously. Based on this data structure, both a direct representation method and an inverse construction method for heterogeneous NURBS objects are introduced. In the direct representation method, three forms of NURBS heterogeneous objects are introduced by giving the geometry and material information of the control points, among which the homogeneous coordinates form is employed for its brevity and ease of programming. In the inverse construction method, continuous heterogeneous curves and surfaces can be obtained by interpolating discrete points and curves with specified material information. Some examples are given to show the effectiveness of the proposed methods.
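The idea of carrying geometry, weights, and material coordinates together at each control point can be sketched with a standard rational B-spline curve evaluation. This is a generic illustration, not the data structure from the paper: the function names `basis` and `evaluate`, and the choice to treat material components as extra coordinates appended to each control point, are assumptions. The basis functions use the well-known Cox-de Boor recursion.

```python
from typing import List, Sequence


def basis(i: int, p: int, u: float, knots: Sequence[float]) -> float:
    """Cox-de Boor recursion for the B-spline basis function N_{i,p}(u)."""
    if p == 0:
        if knots[i] <= u < knots[i + 1]:
            return 1.0
        # Close the last non-empty span so u equal to the end knot is covered.
        if (u == knots[-1] and knots[i] < knots[i + 1]
                and knots[i + 1] == knots[-1]):
            return 1.0
        return 0.0
    left = right = 0.0
    d1 = knots[i + p] - knots[i]
    if d1 > 0:
        left = (u - knots[i]) / d1 * basis(i, p - 1, u, knots)
    d2 = knots[i + p + 1] - knots[i + 1]
    if d2 > 0:
        right = (knots[i + p + 1] - u) / d2 * basis(i + 1, p - 1, u, knots)
    return left + right


def evaluate(u: float, degree: int, knots: Sequence[float],
             points: Sequence[Sequence[float]],
             weights: Sequence[float]) -> List[float]:
    """Evaluate a rational curve whose control points hold geometry + material.

    Each entry of `points` is (x, y, ..., m1, m2, ...). Every component is
    blended with the same rational basis, i.e. the weighted sum is divided
    by the sum of the weighted basis functions (homogeneous form).
    """
    dim = len(points[0])
    num = [0.0] * dim
    den = 0.0
    for i, (pt, w) in enumerate(zip(points, weights)):
        b = basis(i, degree, u, knots) * w
        den += b
        for k in range(dim):
            num[k] += b * pt[k]
    return [c / den for c in num]
```

For example, a quadratic curve with knots `[0, 0, 0, 1, 1, 1]` and control points such as `(0, 0, 1.0, 0.0)`, `(1, 1, 0.5, 0.5)`, `(2, 0, 0.0, 1.0)` grades from pure material 1 at one end to pure material 2 at the other. Because every component shares the same rational basis, material fractions that sum to one at the control points also sum to one along the curve when the weights are positive.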
Funding: Supported by the National Key Research and Development Program of China (2021YFB2500300), the National Natural Science Foundation of China (T2322015, 92472101, 22393903, 22393900, 52394170), and the Beijing Municipal Natural Science Foundation (L247015, L233004).
Abstract: Artificial intelligence (AI) has become an increasingly important driver of energy materials and energy chemistry research, for example in accelerating the discovery of advanced energy materials [1], analyzing vast amounts of data from both experiments and computations [2], optimizing processes for materials synthesis, managing and monitoring energy storage devices such as lithium batteries, and algorithm-optimized grid load forecasting. Looking back at recent pioneering works in AI-driven energy chemistry research, constructing a dataset of both large quantity and high quality is almost always the first step, and it largely determines the subsequent success of training AI models and resolving the corresponding scientific issues.
Abstract: Over the last few decades, several all-optical circuits have been proposed to meet the need for high-speed data processing. In some information processing architectures, comparisons of analog and digital data play a very important role. In this letter, we propose a multi-bit data comparison scheme based on the switching property of an optical nonlinear material. Ultrafast operation at speeds above the gigahertz range can be expected from this all-optical scheme.
Funding: Supported by the National Natural Science Foundation of China (No. 11935010), the National Key R&D Program of China (No. 2023YFA1406900 and No. 2022YFA1404400), the Natural Science Foundation of Shanghai (No. 23ZR1481200), the Program of Shanghai Academic Research Leader (No. 23XD1423800), and the Opening Project of Shanghai Key Laboratory of Special Artificial Microstructure Materials and Technology.
Abstract: Data-driven machine learning (ML) has demonstrated tremendous potential in material property predictions. However, the scarcity of materials data with costly property labels in the vast chemical space presents a significant challenge for ML in efficiently predicting properties and uncovering structure-property relationships. Here, we propose a novel hierarchy-boosted funnel learning (HiBoFL) framework, which is successfully applied to identify semiconductors with ultralow lattice thermal conductivity (κ_L). By training on only a few hundred materials targeted by unsupervised learning from a pool of hundreds of thousands, we achieve efficient and interpretable supervised predictions of ultralow κ_L, thereby circumventing large-scale brute-force ab initio calculations without clear objectives. As a result, we provide a list of candidates with ultralow κ_L for potential thermoelectric applications and discover a new factor that significantly influences structural anharmonicity. This HiBoFL framework offers a novel practical pathway for accelerating the discovery of functional materials.
Funding: Funded by the Strategic Priority Research Program of the Chinese Academy of Sciences, Grant No. XDB 0470201.
Abstract: Data-driven material innovation has the potential to revolutionize the traditional Edisonian process and significantly shorten development cycles. However, the scarcity of data in materials science and the poor interpretability of machine learning pose serious obstacles to the adoption of this new paradigm. Here, we propose a pipeline that integrates data production, virtual screening, and theoretical innovation using high-throughput all-atom molecular dynamics (MD) as a data flywheel. Using this pipeline, we explored high-performance viscosity index improver (VII) polymers and constructed a dataset of 1166 VII entries starting from only five types of polymers. Under multiobjective constraints, 366 potential polymers with high viscosity-temperature performance were identified, and six representative polymers were validated through direct MD simulations. Starting from high-dimensional physical features, we conducted an unbiased systematic analysis of the quantitative structure-property relationships of VII polymers, providing an explicit mathematical model with promising applications in the VII industry. This work demonstrates the advanced capabilities and reliability of the proposed pipeline in initiating material innovation cycles in data-scarce fields, and the VII dataset and models will serve as a critical starting point for the data-driven design of polymers with high viscosity-temperature performance.