Funding: Supported by the National Natural Science Foundation of China (62173253, 52272374), the Research and Practice Project of New Engineering in Ordinary Undergraduate Universities in Guangxi Zhuang Autonomous Region (XGK202310), educational reform projects (JGT202302, JGKQ202309), and the 2024 Guangxi Collegiate Innovation and Entrepreneurship Training Project "Eye-Smart Driving-Fatigue Driving Monitoring and Warning System Based on Computer Vision" (Project No. S202410595158).
Abstract: Driver behavior is a critical factor in road safety, highlighting the need for advanced methods in Distracted Driving Classification (DDC). In this study, we introduce DDC-Chat, a novel classification method based on a visual large language model (VLM). DDC-Chat is an interactive multimodal system built upon LLAVA-Plus and fine-tuned specifically for distracted driving detection. It uses logical reasoning chains to activate visual skills, including segmentation and pose detection, through end-to-end training. Furthermore, instruction tuning allows DDC-Chat to continuously incorporate new visual skills, enhancing its ability to classify distracted driving behavior. Our extensive experiments demonstrate that DDC-Chat achieves state-of-the-art performance on public DDC datasets, surpassing previous benchmarks. In evaluations on the 100-Driver dataset, the model exhibits superior results in both zero-shot and few-shot learning contexts, establishing it as a valuable tool for improving driving safety by accurately identifying driver distraction. Because inference is computationally intensive, DDC-Chat is optimized for deployment on remote servers, with data streamed from in-vehicle monitoring systems for real-time analysis.