3 articles found
A Semi-automatic Method Based on Statistic for Mandarin Semantic Structures Extraction in Specific Domains (cited by: 1)
1
Authors: 熊英, 朱杰, 孙静. Journal of Shanghai Jiaotong University (Science), EI, 2004, No. 4, pp. 25-29 (5 pages)
This paper proposed a new method of semi-automatic extraction for semantic structures from unlabelled corpora in specific domains. The approach is statistical in nature. The extracted structures can be used for shallow parsing and semantic labeling. By iteratively extracting new words and clustering words, we get an initial semantic lexicon that groups words of the same semantic meaning together as a class. After that, a bootstrapping algorithm is adopted to extract semantic structures. Then the semantic structures are used to extract new and augment the semantic lexicon. The resultant semantic structures are interpreted by persons and are amenable to hand-editing for refinement. In this experiment the semi-automatically extracted structures S_SA provide recall rate of 84.
Large Language Model-Based Text Recognition and Structured Data Extraction for Dietary Surveys
2
Authors: Fangxu Guan, Ruixue Niu, Feifei Huang, Xiaofan Zhang, Yanli Wei, Jiguo Zhang, Xiaofang Jia, Yifei Ouyang, Jing Bai, Chang Su, Li Li, Wenwen Du, Honglei Liu, Huijun Wang. China CDC Weekly, 2026, No. 2, pp. 49-54 (6 pages)
Introduction: Traditional dietary surveys are time-consuming, and manual recording may lead to omissions. Improvement during data collection is essential to enhance accuracy of nutritional surveys. In recent years, large language models (LLMs) have been rapidly developed, which can provide text-processing functions and assist investigators in conducting dietary surveys. Methods: Thirty-eight participants from 15 families in the Huangpu and Jiading districts of Shanghai were selected. A standardized 24-hour dietary recall protocol was conducted using an intelligent recording pen that simultaneously captured audio data. These recordings were then transcribed into text. After preprocessing, we used GLM-4 for prompt engineering and chain-of-thought for collaborative reasoning, output structured data, and analyzed its integrity and consistency. Model performance was evaluated using precision and F1 scores. Results: The overall integrity rate of the LLM-based structured data reached 92.5%, and the overall consistency rate compared with manual recording was 86%. The LLM can accurately and completely recognize the names of ingredients and dining and production locations during the transcription. The LLM achieved 94% precision and an F1 score of 89.7% for the full dataset. Conclusion: LLM-based text recognition and structured data extraction can serve as effective auxiliary tools to improve efficiency and accuracy in traditional dietary surveys. With the rapid advancement of artificial intelligence, more accurate and efficient auxiliary tools can be developed for more precise and efficient data collection in nutrition research.
Keywords: enhance accuracy nutritional surveys in data collection large language models language models LLMs dietary surveys manual recording text recognition structured data extraction
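The abstract above reports precision and F1 but not recall. Since F1 is the harmonic mean of precision and recall, the implied recall can be back-calculated. The following is a minimal sketch of that arithmetic; the function names are illustrative, and the resulting recall is a derived estimate from the rounded figures in the abstract, not a value reported by the authors.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def recall_from_precision_f1(precision: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for R, giving R = F1 * P / (2P - F1)."""
    return f1 * precision / (2 * precision - f1)


# Figures quoted in the abstract: 94% precision, 89.7% F1.
implied_recall = recall_from_precision_f1(0.94, 0.897)
print(f"implied recall: {implied_recall:.3f}")
```

Under these assumptions the implied recall is roughly 0.86, which suggests the extraction missed more items than it wrongly added.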
ML-Parser: An Efficient and Accurate Online Log Parser (cited by: 1)
3
Authors: Yu-Qian Zhu, Jia-Ying Deng, Jia-Chen Pu, Peng Wang, Shen Liang, Wei Wang. Journal of Computer Science & Technology, SCIE, EI, CSCD, 2022, No. 6, pp. 1412-1426 (15 pages)
A log is a text message that is generated in various services, frameworks, and programs. The majority of log data mining tasks rely on log parsing as the first step, which transforms raw logs into formatted log templates. Existing log parsing approaches often fail to effectively handle the trade-off between parsing quality and performance. In view of this, in this paper, we present Multi-Layer Parser (ML-Parser), an online log parser that runs in a streaming manner. Specifically, we present a multi-layer structure in log parsing to strike a balance between efficiency and effectiveness. Coarse-grained tokenization and a fast similarity measure are applied for efficiency, while fine-grained tokenization and an accurate similarity measure are used for effectiveness. In experiments, we compare ML-Parser with two existing online log parsing approaches, Drain and Spell, on ten real-world datasets, five labeled and five unlabeled. On the five labeled datasets, we use the proportion of correctly parsed logs to measure the accuracy, and ML-Parser achieves the highest accuracy on four datasets. On all ten datasets, we use the Loss metric to measure the parsing quality. ML-Parser achieves the highest quality on seven out of the ten datasets while maintaining relatively high efficiency.
Keywords: log parsing; online approach; structure extraction; similarity measure
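To illustrate what "transforming raw logs into templates in a streaming manner" can look like, here is a deliberately simplified sketch. It is not the ML-Parser algorithm: the abstract does not specify its tokenizers or similarity measures, so whitespace tokenization, grouping by token count, the positional match ratio, the 0.5 threshold, and all names (`OnlineParser`, `Template`, `WILDCARD`) are assumptions made for this example.

```python
from dataclasses import dataclass

WILDCARD = "<*>"  # placeholder for a variable field in a template


@dataclass
class Template:
    tokens: list
    count: int = 1


def similarity(template_tokens, log_tokens):
    """Fraction of positions where template and log tokens are identical."""
    same = sum(1 for t, l in zip(template_tokens, log_tokens) if t == l)
    return same / len(log_tokens)


class OnlineParser:
    """Toy streaming parser: one pass, one log line at a time."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.groups = {}  # token count -> list of Template

    def parse(self, line):
        tokens = line.split()
        if not tokens:
            return ""
        # Coarse step: only compare against templates of the same length.
        candidates = self.groups.setdefault(len(tokens), [])
        best, best_sim = None, 0.0
        for tpl in candidates:
            sim = similarity(tpl.tokens, tokens)
            if sim > best_sim:
                best, best_sim = tpl, sim
        if best is not None and best_sim >= self.threshold:
            # Fine step: positions that differ become wildcards.
            best.tokens = [t if t == l else WILDCARD
                           for t, l in zip(best.tokens, tokens)]
            best.count += 1
            return " ".join(best.tokens)
        # No sufficiently similar template: the line starts a new one.
        candidates.append(Template(tokens))
        return " ".join(tokens)


parser = OnlineParser()
first = parser.parse("Connected to 10.0.0.1 port 22")
second = parser.parse("Connected to 10.0.0.2 port 22")
third = parser.parse("Disk full on /dev/sda1")
print(second)
```

The second line differs from the first in one of five tokens, so it merges into the existing template and the differing position becomes a wildcard; the third line has a different token count and starts its own template.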