Artificial intelligence-enabled database technology, known as AI4DB (Artificial Intelligence for Databases), is an active research area attracting significant attention and innovation. This survey first introduces the background of learning-based database techniques. It then reviews advanced query optimization methods for learning-based databases, focusing on four popular directions: cardinality/cost estimation, learning-based join order selection, learning-based end-to-end optimizers, and text-to-SQL models. Cardinality/cost estimation methods are classified as supervised or unsupervised according to the learning model used, with illustrative examples explaining how they work. Detailed descriptions of various query optimizers are also given to show how each component of a learning-based query optimizer works. Additionally, we discuss the challenges and development opportunities of learning-based query optimizers. The survey further explores text-to-SQL models, a new research area within AI4DB. Finally, we consider the future development prospects of learning-based databases.
With the closer integration of databases and machine learning, running machine learning tasks inside the database can reduce data transmission and thus dramatically boost the runtime performance of the whole task. Moreover, if the machine learning models involved in similar tasks can be stored intelligently in the system, the computational resources and time spent on repeated training will be greatly reduced. However, an intelligent storage system for machine learning models has not yet been developed. To this end, a method is first proposed to measure the similarity of machine learning tasks. Second, an intelligent storage system for machine learning models is designed to manage models. Finally, the paper introduces the overall architecture and key technologies of the intelligent storage system of machine learning models based on task similarity (ISSMLM) and describes three demonstration scenarios of the system. The results show the validity of the proposed method.
Funding: partially supported by the National Natural Science Foundation of China (Grant No. 62272066); Open Research Fund of Guangxi Key Lab of Human-machine Interaction and Intelligent Decision (GXHIID2207); Sichuan Science and Technology Program (2025ZNSFSC0044, 2025YFHZ0194); Chengdu Technological Innovation Research and Development Project (2024-YF05-01217-SN); Chengdu Regional Science and Technology Innovation Cooperation Project (2025-YF11-00050-HZ); Open Foundation of Key Laboratory of Cyberspace Security, Ministry of Education of China and Henan Key Laboratory of Cyberspace Situation Awareness (KLCS20240106); Ant Group through CCF-Ant Research Fund (CCF-AFSG RF20240106); Open Research Fund of Key Laboratory of Cyberspace Big Data Intelligent Security (Chongqing University of Posts and Telecommunications), Ministry of Education of China (CBDIS202404).