Funding: Supported by the National Natural Science Foundation of China (NSFC) under Grant No. 42230103, and the State Key Laboratory of Geographic Information Engineering and the Key Laboratory of Surveying and Mapping Science and Geospatial Information Technology of the Ministry of Natural Resources Jointly Funded Project under Grant No. 2021-04-03.
Abstract: Visual Attention Prediction (VAP) is widely applied in GIS research, such as navigation task identification and driver assistance systems. Previous studies commonly used color information to detect the visual saliency of natural scene images; however, they rarely considered adaptive feature integration across different geospatial scenes in specific tasks. To better predict visual attention during driving tasks, this paper proposes an Adaptive Feature Integration Fully Convolutional Network (AdaFI-FCN) that uses Scene-Adaptive Weights (SAW) to integrate RGB-D, motion, and semantic features. Quantitative comparisons on the DR(eye)VE dataset show that the proposed framework achieves the best accuracy and robustness among state-of-the-art models (AUC-Judd = 0.971, CC = 0.767, KL = 1.046, SIM = 0.579). In addition, an ablation study demonstrates the positive effect of the SAW method on prediction robustness under scene changes. The proposed model has the potential to benefit adaptive VAP research in broader geospatial scenes, such as AR-aided navigation, indoor navigation, and street-view image reading.
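The abstract does not specify how the Scene-Adaptive Weights are computed, so the following is only a minimal sketch of one plausible reading, assuming a PyTorch implementation in which a small gating network predicts softmax-normalized per-branch weights from a global scene descriptor and the three feature branches (RGB-D, motion, semantic) are fused as a weighted sum. All names here (SceneAdaptiveFusion, gate, etc.) are hypothetical, not the paper's actual SAW formulation.

```python
# Hypothetical sketch of scene-adaptive weighted feature fusion;
# the paper's actual SAW design may differ.
import torch
import torch.nn as nn

class SceneAdaptiveFusion(nn.Module):
    """Fuses RGB-D, motion, and semantic feature maps with weights
    predicted from a global scene descriptor."""

    def __init__(self, channels: int, num_branches: int = 3):
        super().__init__()
        self.num_branches = num_branches
        # Gating network: pooled descriptor -> one weight logit per branch.
        self.gate = nn.Sequential(
            nn.Linear(channels * num_branches, 64),
            nn.ReLU(inplace=True),
            nn.Linear(64, num_branches),
        )

    def forward(self, branches):
        # branches: list of num_branches tensors, each (B, C, H, W)
        pooled = [b.mean(dim=(2, 3)) for b in branches]        # (B, C) each
        descriptor = torch.cat(pooled, dim=1)                  # (B, C * n)
        weights = torch.softmax(self.gate(descriptor), dim=1)  # (B, n)
        stacked = torch.stack(branches, dim=1)                 # (B, n, C, H, W)
        w = weights.view(-1, self.num_branches, 1, 1, 1)
        return (w * stacked).sum(dim=1)                        # (B, C, H, W)

# Usage: fuse three 64-channel feature maps from the three branches.
fusion = SceneAdaptiveFusion(channels=64)
feats = [torch.randn(2, 64, 28, 28) for _ in range(3)]
fused = fusion(feats)  # shape (2, 64, 28, 28)
```

Because the weights depend on the pooled scene descriptor, the relative contribution of each branch can shift as the driving scene changes, which is the adaptivity the ablation study attributes to SAW.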
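The reported scores (CC, KL, SIM) are standard saliency-evaluation metrics. The NumPy sketch below follows their common definitions: CC is the Pearson correlation between the predicted and ground-truth maps, KL is the divergence of the prediction from the ground truth treated as distributions, and SIM is the sum of element-wise minima of the two normalized maps. AUC-Judd is omitted because it additionally requires discrete fixation points.

```python
# Standard saliency metrics (common definitions, not the paper's code).
import numpy as np

EPS = 1e-8

def _as_distribution(x):
    """Normalize a saliency map so it sums to 1."""
    x = x.astype(np.float64)
    return x / (x.sum() + EPS)

def cc(pred, gt):
    """Pearson linear correlation coefficient between two maps."""
    p = (pred - pred.mean()) / (pred.std() + EPS)
    g = (gt - gt.mean()) / (gt.std() + EPS)
    return float((p * g).mean())

def kl_divergence(pred, gt):
    """KL(gt || pred) over distribution-normalized maps; lower is better."""
    p, g = _as_distribution(pred), _as_distribution(gt)
    return float((g * np.log(g / (p + EPS) + EPS)).sum())

def similarity(pred, gt):
    """SIM: sum of element-wise minima of the two distributions."""
    p, g = _as_distribution(pred), _as_distribution(gt)
    return float(np.minimum(p, g).sum())

# Usage with random stand-in maps:
pred = np.random.rand(64, 64)
gt = np.random.rand(64, 64)
print(cc(pred, gt), kl_divergence(pred, gt), similarity(pred, gt))
```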