Black-Box Rare-Event Simulation for Safety Testing of AI Agents: An Overview

Abstract: This paper provides an overview of black-box rare-event simulation methods applicable to the safety testing of artificial intelligence agents. We explore the challenges and efficiency criteria in black-box simulation, with particular emphasis on how underestimation errors can arise subtly and how they can be controlled. The paper reviews various adaptive methods, such as the cross-entropy method and adaptive multilevel splitting, highlighting both their empirical effectiveness and their theoretical limitations. Additionally, it offers a comparative analysis of different confidence interval constructions for crude Monte Carlo methods, aiming to mitigate underestimation errors through effective uncertainty quantification. The paper concludes with a certifiable deep importance sampling approach, which uses deep neural networks to develop conservative estimators that address underestimation issues.

Authors: Yuan-Lu Bai, Zhi-Yuan Huang, Henry Lam, Ding Zhao

Affiliations: Department of Industrial Engineering and Operations Research; School of Economics and Management; Department of Mechanical Engineering

Source: Journal of the Operations Research Society of China (Chinese title: 中国运筹学会会刊(英文)), 2025, Issue 3, pp. 750-774 (25 pages)

Funding: Supported by the National Natural Science Foundation of China (No. 72301195), the Shanghai Rising-Star Program (No. 22YF1451100), and the Fundamental Research Funds for the Central Universities. Henry Lam's research is supported by the Columbia Innovation Hub Award, the InnoHK initiative, the Government of the HKSAR, and Laboratory for AI-Powered Financial Technologies.

Keywords: Rare-event simulation; Black-box systems; AI system safety; Underestimation

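Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]; TP309 [Automation and Computer Technology: Computer System Architecture]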
