Funding: Supported in part by the National Natural Science Foundation of China (62125306), the Zhejiang Key Research and Development Project (2024C01163), and the State Key Laboratory of Industrial Control Technology, China (ICT2024A06).
Abstract: In recent decades, control performance monitoring (CPM) has experienced remarkable progress in research and industrial applications. While CPM research has drawn on various benchmarks, the historical data benchmark (HIS) has garnered the most attention due to its practicality and effectiveness. However, existing CPM reviews usually focus on the theoretical benchmark, and an in-depth review that thoroughly explores HIS-based methods is lacking. In this article, a comprehensive overview of HIS-based CPM is provided. First, we offer a novel static-dynamic perspective on the data-level manifestations of control performance that underlie the typical controller capacities of regulation and servo tracking: the static and dynamic properties. The static property portrays time-independent variability in the system output, and the dynamic property describes temporal behavior driven by closed-loop feedback. Accordingly, existing HIS-based CPM approaches and their underlying motivations are classified and analyzed from these two perspectives. Specifically, two mainstream solutions for CPM are summarized, static analysis and dynamic analysis, which match data-driven techniques with actual control behavior. Furthermore, this paper points out the opportunities and challenges that CPM faces in modern industry and suggests promising directions, in the context of artificial intelligence, to inspire future research.
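As an illustration of the static-dynamic distinction (a generic sketch, not code from the reviewed work), the static property can be read off as the time-independent variance of the controlled output, while one simple dynamic indicator is the lag-1 autocorrelation that closed-loop feedback leaves in the output; the series `y` below is invented data:

```python
import statistics

def static_variability(y):
    # Static property: time-independent spread of the controlled output
    return statistics.pvariance(y)

def lag1_autocorrelation(y):
    # Dynamic property: temporal dependence left in the output by feedback
    mean = statistics.fmean(y)
    num = sum((a - mean) * (b - mean) for a, b in zip(y, y[1:]))
    den = sum((v - mean) ** 2 for v in y)
    return num / den

y = [0.0, 0.8, 0.4, 0.9, 0.3, 0.7, 0.5, 0.6]  # made-up loop output samples
print(static_variability(y), lag1_autocorrelation(y))
```

A well-regulated loop would show small variance and little residual autocorrelation; pronounced negative lag-1 correlation, as in an oscillating output, hints at aggressive feedback.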
Funding: Supported by the National Key R&D Program of China (No. 2023YFB4502200), the National Natural Science Foundation of China (Nos. U22A2028, 61925208, 62222214, 62341411, 62102398, 62102399, U20A20227, 62302478, 62302482, 62302483, 62302480, 62302481), the Strategic Priority Research Program of the Chinese Academy of Sciences (Nos. XDB0660300, XDB0660301, XDB0660302), the Chinese Academy of Sciences Project for Young Scientists in Basic Research (No. YSBR-029), the Youth Innovation Promotion Association of the Chinese Academy of Sciences, and the Xplore Prize.
Abstract: Enhancing the generalization capacity of reinforcement learning (RL) agents remains a formidable challenge. Existing RL methods, despite achieving superhuman performance on certain benchmarks, often struggle in this respect. A potential reason is that the benchmarks used for training and evaluation may not offer an adequately diverse set of transferable tasks. Although recent studies have developed benchmarking environments to address this shortcoming, they typically fall short of providing tasks that both ensure a solid foundation for generalization and exhibit significant variability. To overcome these limitations, this work introduces the concept that 'objects are composed of more fundamental components' into environment design, as implemented in the proposed environment, summon the magic (StM). This environment generates tasks in which objects are derived from extensible and shareable basic components, facilitating strategy reuse and enhancing generalization. Furthermore, two new metrics, the adaptation sensitivity range (ASR) and the parameter correlation coefficient (PCC), are proposed to better capture and evaluate the generalization process of RL agents. Experimental results show that increasing the number of basic components per object reduces the proximal policy optimization (PPO) agent's training-testing gap by 60.9% (in episode reward), significantly alleviating overfitting. Additionally, linear variations in other environmental factors, such as the training monster set proportion and the total number of basic components, uniformly decrease the gap by at least 32.1%. These results highlight StM's effectiveness in benchmarking and probing the generalization capabilities of RL algorithms.
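The abstract does not spell out the PCC formula; assuming it follows the standard Pearson correlation coefficient its name suggests, a minimal sketch correlating an environment parameter with agent reward could look like this (all data values are invented for illustration):

```python
def pearson_correlation(xs, ys):
    # Standard Pearson correlation coefficient between two equal-length samples
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# invented data: a varied environment parameter vs. mean episode reward
params  = [1, 2, 3, 4, 5]
rewards = [10.0, 13.1, 15.8, 19.2, 22.0]
print(pearson_correlation(params, rewards))
```

A coefficient near +1 or -1 indicates that agent performance tracks the environmental factor linearly, which is the kind of relationship the metric is meant to expose.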
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62476278, 12434009, 12204533), the National Key R&D Program of China (Grant No. 2024YFA1408601), and the Innovation Program for Quantum Science and Technology (Grant No. 2021ZD0302402).
Abstract: The discovery of high-temperature superconducting materials holds great significance for industry and daily life. In recent years, research on predicting superconducting transition temperatures with artificial intelligence (AI) has gained popularity, with most of these tools claiming remarkable accuracy. However, the lack of widely accepted benchmark datasets in this field has severely hindered fair comparisons between AI algorithms and impeded further advancement of these methods. In this work, we present HTSC-2025, an ambient-pressure high-temperature superconducting benchmark dataset. This comprehensive compilation encompasses superconducting materials theoretically predicted from 2023 to 2025 on the basis of BCS superconductivity theory, including the renowned X₂YH₆ system, the perovskite MXH₃ system, the M₃H₈ system, cage-like BCN-doped metal atomic systems derived from the structural evolution of LaH₁₀, and two-dimensional honeycomb-structured systems evolving from MgB₂. In addition, we note a range of approaches inspired by physical intuition for designing high-temperature superconductors, such as hole doping, the introduction of light elements to form strong covalent bonds, and the tuning of spin-orbit coupling. The dataset presented in this paper is openly available at Science DB. The HTSC-2025 benchmark has been open-sourced on Hugging Face at https://huggingface.co/datasets/xiao-qi/HTSC-2025 and will be continuously updated, while the Electronic Laboratory for Material Science platform is available at https://in.iphy.ac.cn/eln/link.html#/124/V2s4.
Abstract: Most material distribution-based topology optimization methods work on a relaxed form of the optimization problem and then push the solution toward the binary limits. However, when benchmarking these methods, researchers use known solutions to only a single form of the benchmark problem. This paper proposes a comparison platform for the systematic benchmarking of topology optimization methods using both the binary and relaxed forms. A greyness measure is implemented to evaluate how far a solution is from the desired binary form. The well-known Zhou-Rozvany (ZR) problem is selected as the benchmark problem here, making use of the available global solutions for both its relaxed and binary forms. The recently developed non-penalization Smooth-Edged Material Distribution for Optimizing Topology (SEMDOT) method, the well-established Solid Isotropic Material with Penalization (SIMP) method, and continuation methods are studied on this platform. Interestingly, in most cases, the greyscale solutions obtained by SEMDOT handle the ZR problem better than those of SIMP. The reasons are investigated and attributed to the use of two different regularization techniques, namely the Heaviside smooth function in SEMDOT and the power-law penalty in SIMP. More importantly, a simple-to-use benchmarking graph is proposed for evaluating newly developed topology optimization methods.
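The abstract does not define its greyness measure; one widely used candidate is the measure of non-discreteness, M_nd = (1/n) Σ 4·x_e·(1 − x_e) over the n element densities, which is 0 for a fully binary design and 1 when every density sits at 0.5. The actual measure in the paper may differ; this is a sketch of that common formulation:

```python
def greyness(densities):
    # Measure of non-discreteness: 0 for a fully binary (0/1) design,
    # 1 when every element density is 0.5 (maximally grey)
    return sum(4.0 * x * (1.0 - x) for x in densities) / len(densities)

print(greyness([0.0, 1.0, 1.0, 0.0]))  # → 0.0 (fully binary)
print(greyness([0.5, 0.5, 0.5, 0.5]))  # → 1.0 (fully grey)
```

Tracking such a scalar alongside the objective makes it easy to compare how strongly different methods push intermediate densities toward 0/1.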
Funding: Supported by the National Natural Science Foundation of China (No. 12347126).
Abstract: Prompt fission neutron spectra (PFNS) play a significant role in nuclear science and technology. In this study, the PFNS for ²³⁹Pu are evaluated using both differential and integral experimental data. A method that leverages integral criticality benchmark experiments to constrain the PFNS data is introduced. The measured central values of the PFNS are perturbed by constructing a covariance matrix. The PFNS are sampled using two types of covariance matrices, either generated from an assumed correlation matrix combined with the experimental uncertainties or derived directly from experimental reports. The joint Monte Carlo transport code is employed to perform transport simulations on five criticality benchmark assemblies using the perturbed PFNS data. Extensive simulations yield an optimized PFNS that shows improved agreement with the integral criticality benchmark experiments. This study introduces a novel approach for optimizing differential experimental data through integral experiments, particularly when a covariance matrix is not provided.
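A minimal sketch of the sampling step, assuming multivariate-normal perturbation of the central PFNS values with a covariance built from the experimental uncertainties and an assumed AR(1)-style correlation between neighbouring energy groups (all numbers below are illustrative, not values from the evaluation):

```python
import numpy as np

rng = np.random.default_rng(0)

chi   = np.array([0.30, 0.45, 0.20, 0.05])   # central PFNS values (illustrative)
sigma = 0.05 * chi                            # assumed 5% experimental uncertainties
# Assumed correlation matrix: 0.5^|i-j| between energy groups i and j
corr = 0.5 ** np.abs(np.subtract.outer(np.arange(4), np.arange(4)))
cov  = np.outer(sigma, sigma) * corr          # covariance matrix

L = np.linalg.cholesky(cov)                   # factorize for correlated sampling
samples = chi + rng.standard_normal((1000, 4)) @ L.T  # 1000 perturbed spectra

print(samples.mean(axis=0))                   # close to the central values
```

Each sampled spectrum would then be renormalized and fed to the transport code, with agreement against the criticality benchmarks selecting the optimized PFNS.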
Funding: Financially supported by the National Natural Science Foundation of China (Nos. 92372126, 52373203) and the Excellent Young Scientists Fund Program.
Abstract: Advancing the integration of artificial intelligence and polymer science requires high-quality, open-source, and large-scale datasets. However, existing polymer databases often suffer from data sparsity, a lack of polymer-property labels, and limited accessibility, hindering systematic modeling across property prediction tasks. Here, we present OpenPoly, a curated experimental polymer database derived from extensive literature mining and manual validation, comprising 3985 unique polymer-property data points spanning 26 key properties. We further develop a multi-task benchmarking framework that evaluates property prediction using four encoding methods and eight representative models. Our results highlight that the optimized degree-of-polymerization encoding coupled with Morgan fingerprints achieves an optimal trade-off between computational cost and accuracy. In data-scarce conditions, XGBoost outperforms deep learning models on key properties such as dielectric constant, glass transition temperature, melting point, and mechanical strength, achieving R² scores of 0.65-0.87. To further showcase the practical utility of the database, we propose candidate polymers for two energy-relevant applications: high-temperature polymer dielectrics and fuel cell membranes. By offering a consistent and accessible benchmark and database, OpenPoly paves the way for more accurate polymer-property modeling and fosters data-driven advances in polymer genome engineering.
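The reported 0.65-0.87 scores use the standard coefficient of determination; a minimal reference implementation (the temperature values below are invented examples, not OpenPoly data):

```python
def r2_score(y_true, y_pred):
    # Coefficient of determination: 1 is a perfect fit,
    # 0 matches a predictor that always outputs the mean of y_true
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# invented glass-transition-temperature targets and model predictions (K)
t_true = [350.0, 420.0, 390.0, 310.0]
t_pred = [360.0, 410.0, 385.0, 325.0]
print(r2_score(t_true, t_pred))
```

Note that R² can be negative when a model is worse than the mean-only baseline, which is worth checking in the data-scarce regime the abstract describes.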
Funding: Supported by the National Natural Science Foundation of China (No. U2067205).
Abstract: Benchmark experiments are indispensable for the development of neutron nuclear data evaluation libraries. Given the lack of domestic benchmarking of nuclear data in the fission energy region, this study developed a neutron leakage spectrum measurement system using a spherical sample and a ²⁵²Cf spontaneous fission source. An EJ309 detector (for high-energy measurements) and a CLYC detector (for low-energy measurements) were combined to measure the time-of-flight spectrum using the γ-tagging method. To assess the performance of the system, the time-of-flight spectrum without a sample was measured first. The experimental spectra were consistent with those simulated using the Monte Carlo method and with the standard ²⁵²Cf spectrum from ISO 8529-1, demonstrating that the system can effectively measure neutron events in the 0.15-8.0 MeV range. A spherical polyethylene sample was then used as the standard to verify the accuracy of the system for the benchmark experiment. Simulation results were obtained using the Monte Carlo method with evaluated data from the ENDF/B-VIII.0, CENDL-3.2, JEFF-3.3, and JENDL-5 libraries. The measured neutron leakage spectra were compared with the corresponding simulated results in terms of the neutron spectrum shape and the calculated C/E values. The results showed that the simulated spectra based on the different data libraries reproduced the experimental results well in the 0.15-8.0 MeV range. This study confirms that the leakage neutron spectrum measurement system based on the ²⁵²Cf source can perform benchmarking and provides a foundation for evaluating neutron nuclear data through benchmark experiments.
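The time-of-flight technique recovers neutron energy from the flight path L and flight time t; in the non-relativistic limit, adequate for the 0.15-8.0 MeV range, E = ½ m_n (L/t)². A sketch (the 2 m flight path is an invented example, not the system's actual geometry):

```python
NEUTRON_MASS_MEV = 939.565   # neutron rest energy m_n c^2 in MeV
C_M_PER_NS = 0.299792458     # speed of light in m/ns

def tof_to_energy_mev(flight_path_m, tof_ns):
    # Non-relativistic kinetic energy: E = (1/2) m_n c^2 * (v/c)^2
    beta = flight_path_m / (tof_ns * C_M_PER_NS)
    return 0.5 * NEUTRON_MASS_MEV * beta ** 2

# over an invented 2 m flight path, a ~145 ns flight corresponds to ~1 MeV
print(tof_to_energy_mev(2.0, 144.6))
```

In practice the γ-tagging signal fixes the fission time, so the measured interval between the tag and the detector hit gives t directly.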
Funding: Supported by the National Key R&D Program of China (2019YFC1712200-2019YFC1712204).
Abstract: The World Federation of Acupuncture-Moxibustion Societies (WFAS) Technical Benchmark of Acupuncture and Moxibustion: Cupping, developed under the leadership of Tianjin University of Traditional Chinese Medicine, was approved by WFAS. This technical benchmark was issued on October 9, 2023, and implemented on December 31, 2023. Its main contents include the scope, normative references, terms and definitions, procedures and rules, and safety. This article focuses on the above contents, and an outlook on the application, popularization, and update plan of this technical benchmark is proposed.
Funding: Vingroup Innovation Foundation (Quỹ Đổi mới sáng tạo Vingroup), Grant/Award Number: VINIF.2020.ThS.BK.10.
Abstract: In recent years, visual facial forgery has reached a level of sophistication at which humans cannot identify the fraud, posing a significant threat to information security. A wide range of malicious applications have emerged, such as deepfakes, fake news, the defamation or blackmailing of celebrities, the impersonation of politicians in political warfare, and the spreading of rumours to attract views. As a result, a rich body of visual forensic techniques has been proposed in an attempt to stop this dangerous trend. However, there is no comprehensive, fair, and unified performance evaluation to enlighten the community on the best-performing methods. The authors present a systematic benchmark, beyond traditional surveys, that provides in-depth insights into facial forgery and facial forensics, grounded in robustness tests such as contrast, brightness, noise, resolution, missing information, and compression. The authors also provide a practical guideline to the benchmarking results, to determine the characteristics of the methods that serve as a comparative reference in this never-ending war between measures and countermeasures. The authors' source code is open to the public.
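The robustness tests named above (contrast, brightness, noise) can be sketched as simple perturbations of a normalized image array; these are generic formulations for illustration, not the authors' exact protocol:

```python
import numpy as np

def adjust_brightness(img, delta):
    # Shift all pixel values; img is a float array normalized to [0, 1]
    return np.clip(img + delta, 0.0, 1.0)

def adjust_contrast(img, factor):
    # Scale pixel values around mid-grey (0.5)
    return np.clip((img - 0.5) * factor + 0.5, 0.0, 1.0)

def add_gaussian_noise(img, sigma, rng):
    # Additive zero-mean Gaussian noise, then re-clip to the valid range
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

rng = np.random.default_rng(0)
face = rng.random((8, 8))   # stand-in for a face crop
perturbed = add_gaussian_noise(
    adjust_contrast(adjust_brightness(face, 0.1), 0.8), 0.05, rng)
print(perturbed.shape)
```

Running a detector on such perturbed copies, and comparing its accuracy against the clean baseline, is the essence of the robustness evaluation the benchmark performs.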