Abstract
This paper provides an overview of black-box rare-event simulation methods applicable to the safety testing of artificial intelligence agents. We explore the challenges and efficiency criteria in black-box simulation, with particular emphasis on how underestimation errors can arise subtly and how they can be controlled. The paper reviews various adaptive methods, such as the cross-entropy method and adaptive multilevel splitting, highlighting both their empirical effectiveness and their theoretical limitations. Additionally, it offers a comparative analysis of different confidence interval constructions for crude Monte Carlo methods, aiming to mitigate underestimation errors through effective uncertainty quantification. The paper concludes with a certifiable deep importance sampling approach, which uses deep neural networks to develop conservative estimators that address underestimation issues.
Funding
Supported by the National Natural Science Foundation of China (No. 72301195), the Shanghai Rising-Star Program (No. 22YF1451100), and the Fundamental Research Funds for the Central Universities. Henry Lam’s research is supported by the Columbia Innovation Hub Award, the InnoHK initiative, the Government of the HKSAR, and Laboratory for AI-Powered Financial Technologies.