3 articles found
A feature representation method for biomedical scientific data based on composite text description
1
Author: SUN Wei. Chinese Journal of Library and Information Science, 2009, Issue 4, pp. 43-53 (11 pages)
Feature representation is one of the key issues in data clustering. The existing feature representation of scientific data is insufficient, which to some extent affects the results of scientific data clustering. Therefore, the paper proposes the concept of composite text description (CTD) and a CTD-based feature representation method for biomedical scientific data. The method uses different feature-weighting algorithms to represent candidate features from each of two types of data sources, then combines and strengthens the two feature sets. Experiments show that the method is more effective than traditional methods and can significantly improve the performance of biomedical data clustering.
Keywords: Composite text description; Scientific data; Feature representation; Weight algorithm
LLM-Prop: predicting the properties of crystalline materials using large language models
2
Authors: Andre Niyongabo Rubungo, Craig Arnold (+1 more author), Barry P. Rand, Adji Bousso Dieng. npj Computational Materials, 2025, Issue 1, pp. 2003-2015 (13 pages)
The prediction of crystal properties plays a crucial role in materials science and applications. Current methods for predicting crystal properties focus on modeling crystal structures using graph neural networks (GNNs). However, accurately modeling the complex interactions between atoms and molecules within a crystal remains a challenge. Surprisingly, predicting crystal properties from crystal text descriptions is understudied, despite the rich information and expressiveness that text data offer. In this paper, we develop and make public a benchmark dataset (TextEdge) that contains crystal text descriptions with their properties. We then propose LLM-Prop, a method that leverages the general-purpose learning capabilities of large language models (LLMs) to predict properties of crystals from their text descriptions. LLM-Prop outperforms the current state-of-the-art GNN-based methods by approximately 8% on predicting band gap, 3% on classifying whether the band gap is direct or indirect, and 65% on predicting unit cell volume, and yields comparable performance on predicting formation energy per atom, energy per atom, and energy above hull. LLM-Prop also outperforms the fine-tuned MatBERT, a domain-specific pre-trained BERT model, despite having 3 times fewer parameters. We further fine-tune the LLM-Prop model directly on CIF files and on condensed structure information generated by Robocrystallographer, and find that LLM-Prop fine-tuned on text descriptions performs better on average. Our empirical results highlight the importance of natural language input to LLMs for accurately predicting crystal properties, and the current inability of GNNs to capture information pertaining to space group symmetry and Wyckoff sites for accurate crystal property prediction.
Keywords: text descriptions; graph neural networks; crystal properties; large language models; text data; benchmark dataset; modeling crystal structures
Natural Disasters Warning for Enterprises Through Fuzzy Keywords Search
3
Authors: Zewei Sun, Hanwen Liu (+1 more author), Chao Yan, Ran An. Tsinghua Science and Technology (SCIE, EI, CAS, CSCD), 2021, Issue 4, pp. 558-564 (7 pages)
With the ever-increasing number of natural disaster warning documents in document databases, the document database is becoming an economical and efficient way for enterprise staff to learn and understand the contents of natural disaster warnings by searching for the necessary text documents. Generally, the document database can recommend a large number of documents to enterprise staff by analyzing their precisely typed keywords. In practice, these recommended documents place a heavy burden on enterprise staff, who must study and select among them with little background knowledge about the contents of natural disaster warnings. As a result, enterprise staff fail to retrieve and select appropriate documents to achieve their desired goals. Considering the above drawbacks, in this paper we propose a fuzzy keywords-driven Natural Disasters Warning Documents retrieval approach (named NDWDkeyword). Through text description mining of documents and fuzzy keywords search technology, the retrieval approach can precisely capture the target requirements of enterprise staff and then return the necessary documents to them. Finally, a case study is presented to explain our retrieval approach step by step and to demonstrate the effectiveness and feasibility of our proposal.
Keywords: Natural Disasters Warning Documents (NDWD); fuzzy keywords search; text description mining