Funding: Supported by the Ministry of Finance of the People’s Republic of China from 2022 to 2024 (grant number 102393220020070000016).
Abstract: Introduction: Traditional dietary surveys are time-consuming, and manual recording may lead to omissions. Improving data collection is essential to enhancing the accuracy of nutritional surveys. In recent years, large language models (LLMs) have developed rapidly; they provide text-processing functions and can assist investigators in conducting dietary surveys. Methods: Thirty-eight participants from 15 families in the Huangpu and Jiading districts of Shanghai were selected. A standardized 24-hour dietary recall protocol was conducted using an intelligent recording pen that simultaneously captured audio data. These recordings were then transcribed into text. After preprocessing, we used GLM-4 with prompt engineering and chain-of-thought prompting for collaborative reasoning to output structured data, and analyzed the integrity and consistency of that data. Model performance was evaluated using precision and F1 scores. Results: The overall integrity rate of the LLM-based structured data reached 92.5%, and the overall consistency rate compared with manual recording was 86%. The LLM accurately and completely recognized the names of ingredients and the dining and production locations in the transcriptions. The LLM achieved 94% precision and an F1 score of 89.7% for the full dataset. Conclusion: LLM-based text recognition and structured data extraction can serve as effective auxiliary tools to improve efficiency and accuracy in traditional dietary surveys. With the rapid advancement of artificial intelligence, more accurate and efficient auxiliary tools can be developed for data collection in nutrition research.
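The abstract reports precision and F1 but does not say how extracted items were matched against the manual records. The sketch below shows one common way to compute these metrics, assuming exact set matching of item names. The function names and the sample data are hypothetical and are not taken from the study.

```python
def precision_recall_f1(extracted, reference):
    """Set-based precision, recall and F1 of extracted items vs. a manual reference.

    Assumption: an extracted item counts as correct only if it matches a
    reference item exactly; the study's actual matching rule is not stated.
    """
    extracted, reference = set(extracted), set(reference)
    true_pos = len(extracted & reference)
    precision = true_pos / len(extracted) if extracted else 0.0
    recall = true_pos / len(reference) if reference else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def implied_recall(precision, f1):
    """Recall implied by a precision/F1 pair, assuming F1 = 2PR / (P + R)."""
    return f1 * precision / (2 * precision - f1)


# Hypothetical example data, not from the study.
manual = {"rice", "pork", "cabbage", "egg", "tofu"}
llm = {"rice", "pork", "cabbage", "egg", "noodles"}

p, r, f1 = precision_recall_f1(llm, manual)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")

# Recall consistent with the reported 94% precision and 89.7% F1,
# valid only if both figures are computed over the same pooled item set.
print(f"implied recall={implied_recall(0.94, 0.897):.3f}")
```

If the reported precision and F1 were both computed over the same pooled set of items with the standard harmonic-mean definition, they would correspond to a recall of roughly 86%. That figure is derived here for illustration and is not stated in the abstract; it would not hold if the scores were averaged per category or per participant.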