13 articles found
Global Smartphone Technological Innovation Capacity Analysis Based on Latent Semantic Indexing and Vector Space Model Method
1
Authors: ZHANG Yuwen, CHEN Wanming. Journal: Transactions of Nanjing University of Aeronautics and Astronautics, 2025, No. 3, pp. 395-410 (16 pages)
This paper analyzes the global competitive landscape of smartphone technological innovation capacity using latent semantic indexing (LSI) and the vector space model (VSM). It integrates the theory of technological ecological niches and evaluates four key dimensions: patent quality, energy efficiency engineering, technological modules, and intelligent computing power. The findings reveal that the USA has established strong technological barriers through standard-essential patents (SEPs) in wireless communication and integrated circuits. In contrast, Chinese mainland firms rely heavily on fundamental technologies. Qualcomm Inc. in the USA and Taiwan Semiconductor Manufacturing Company (TSMC) in Chinese Taiwan have built comprehensive patent portfolios in energy efficiency engineering, while the Chinese mainland faces challenges in advancing dynamic frequency modulation algorithms and high-end manufacturing processes. Huawei Inc. in the Chinese mainland leads in 5G module technology but struggles with ecosystem collaboration. Semiconductor manufacturing and radio frequency (RF) components still rely on external suppliers. This highlights the urgent need for innovation in new materials and open-source architectures. To enhance intelligent computing power, Chinese mainland firms must address coordination challenges. They should adopt scenario-driven technological strategies and build a comprehensive ecosystem that includes hardware, operating systems, and developer networks.
Keywords: smartphone chips; technological innovation capacity; latent semantic indexing (LSI); vector space model (VSM)
Orbit Weighting Scheme in the Context of Vector Space Information Retrieval
2
Authors: Ahmad Ababneh, Yousef Sanjalawe, Salam Fraihat, Salam Al-E’mari, Hamzah Alqudah. Journal: Computers, Materials & Continua (SCIE, EI), 2024, No. 7, pp. 1347-1379 (33 pages)
This study introduces the Orbit Weighting Scheme (OWS), a novel approach aimed at enhancing the precision and efficiency of Vector Space information retrieval (IR) models, which have traditionally relied on weighting schemes like tf-idf and BM25. These conventional methods often struggle with accurately capturing document relevance, leading to inefficiencies in both retrieval performance and index size management. OWS proposes a dynamic weighting mechanism that evaluates the significance of terms based on their orbital position within the vector space, emphasizing term relationships and distribution patterns overlooked by existing models. Our research focuses on evaluating OWS’s impact on model accuracy using Information Retrieval metrics like Recall, Precision, Interpolated Average Precision (IAP), and Mean Average Precision (MAP). Additionally, we assess OWS’s effectiveness in reducing the inverted index size, which is crucial for model efficiency. We compare OWS-based retrieval models against others using different schemes, including tf-idf variations and BM25Delta. Results reveal OWS’s superiority, achieving 54% Recall and 81% MAP, and a notable 38% reduction in the inverted index size. This highlights OWS’s potential in optimizing retrieval processes and underscores the need for further research in this underrepresented area to fully leverage OWS’s capabilities in information retrieval methodologies.
Keywords: information retrieval; orbit weighting scheme; semantic text analysis; tf-idf weighting scheme; vector space model
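As background for the baseline that this abstract compares against, the following is a minimal sketch of tf-idf weighting with cosine ranking in a vector space model. It is not the OWS method, whose formula the abstract does not give. The function names, the toy corpus, and the smoothed idf variant are assumptions made purely for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse tf-idf vector (term -> weight) per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    # Assumed idf variant (smoothed so it is always positive).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return document indices ordered by decreasing similarity to the query."""
    vectors, idf = tfidf_vectors(docs)
    # Query terms unseen in the corpus carry no weight and are dropped.
    q = {t: tf * idf[t] for t, tf in Counter(query).items() if t in idf}
    scores = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    # Ties are broken by document index so the ordering is deterministic.
    return [i for _, i in sorted(scores, key=lambda p: (-p[0], p[1]))]


if __name__ == "__main__":
    corpus = [
        ["vector", "space", "model"],
        ["cloud", "storage", "encryption"],
        ["vector", "space", "retrieval", "model"],
    ]
    print(rank(["cloud", "storage"], corpus))
```

A weighting scheme such as the one the abstract proposes would replace the weight computed in `tfidf_vectors`; the cosine ranking step would stay the same.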
Word Embeddings and Semantic Spaces in Natural Language Processing (cited by: 2)
3
Author: Peter J. Worth. Journal: International Journal of Intelligence Science, 2023, No. 1, pp. 1-21 (21 pages)
One of the critical hurdles, and breakthroughs, in the field of Natural Language Processing (NLP) in the last two decades has been the development of techniques for text representation that solve the so-called curse of dimensionality, a problem which plagues NLP in general given that the feature set for learning starts as a function of the size of the language in question, typically upwards of hundreds of thousands of terms. As such, much of the research and development in NLP in the last two decades has been in finding and optimizing solutions to this problem, in effect to feature selection in NLP. This paper looks at the development of these various techniques, leveraging a variety of statistical methods which rest on linguistic theories that were advanced in the middle of the last century, namely the distributional hypothesis, which suggests that words that are found in similar contexts generally have similar meanings. In this survey paper we look at the development of some of the most popular of these techniques from a mathematical as well as data structure perspective, from Latent Semantic Analysis to Vector Space Models to their more modern variants, which are typically referred to as word embeddings. In this review of algorithms such as Word2Vec, GloVe, ELMo and BERT, we explore the idea of semantic spaces more generally, beyond applicability to NLP.
Keywords: natural language processing; vector space models; semantic spaces; word embeddings; representation learning; text vectorization; machine learning; deep learning
Learning Hierarchical User Interest Models from Web Pages
4
Authors: YANG Feng-qin, SUN Tie-li, SUN Ji-gui. Journal: Wuhan University Journal of Natural Sciences (EI, CAS), 2006, No. 1, pp. 6-10 (5 pages)
We propose an algorithm for learning hierarchical user interest models according to the Web pages users have browsed. In this algorithm, the interests of a user are represented as a tree, called a user interest tree, whose content and structure can change simultaneously to adapt to changes in the user's interests. This representation expresses a user's specific and general interests as a continuum. In some sense, specific interests correspond to short-term interests, while general interests correspond to long-term interests, so the representation reflects users' interests more faithfully. The algorithm can automatically model a user's multiple interest domains, dynamically generate the interest models, and prune a user interest tree when the number of nodes in it exceeds a given value. Finally, we present experimental results on a Chinese Web site.
Keywords: personalization; user interest model; vector space model; agglomerate clustering method
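Research on Text Classification Techniques (cited by: 3)
5
Authors: 曹锋, 张代远. Journal: 电脑知识与技术(过刊), 2009, No. 11X, pp. 9023-9025 (3 pages)
Text classification, an interdisciplinary field between machine learning and information retrieval, draws on techniques from several areas, and its progress depends on advances in each of them. This paper introduces the key techniques and open problems in the text classification process, discusses text representation models, classification algorithms, and the principles and methods of classifier performance evaluation, and concludes with an outlook on future development.
Keywords: text classification; classification algorithms; VSM (vector space model); semantic network; feature extraction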
Hybrid Chinese Information Retrieval Model Based on the Combination of Keyword and Concept (cited by: 2)
6
Authors: 樊孝忠, 李宏乔, 李良富. Journal: Journal of Beijing Institute of Technology (EI, CAS), 2003, No. S1, pp. 120-123 (4 pages)
A hybrid model based on the combination of keywords and concepts is put forward. The hybrid model is built on the vector space model and a probabilistic reasoning network. It not only exploits the advantages of keyword retrieval and concept retrieval but also compensates for their shortcomings. Its parameters can be adjusted for different usage scenarios to obtain the best information retrieval result, as our experiments confirm.
Keywords: hybrid information retrieval model; concept retrieval; vector space model; probabilistic reasoning network
An improved algorithm for weighting keywords in web documents (cited by: 1)
7
Authors: 孙双, 贺樑, 杨静, 顾君忠. Journal: Journal of Shanghai University (English Edition) (CAS), 2008, No. 3, pp. 235-239 (5 pages)
In this paper, an improved algorithm, the web-based keyword weight algorithm (WKWA), is presented to weight keywords in web documents. WKWA takes into account the representation features of web documents and the advantages of the TF*IDF, TFC and ITC algorithms in order to make it more appropriate for web documents. The presented algorithm is also applied to an improved vector space model (IVSM). A real system has been implemented for calculating semantic similarities of web documents. Four experiments have been carried out: keyword weight calculation, feature item selection, semantic similarity calculation, and WKWA time performance. The results demonstrate the accuracy of the keyword weights, and semantic similarity is improved.
Keywords: improved vector space model (IVSM); representation feature; feature item; keyword weight; semantic similarity
Idea plagiarism detection with recurrent neural networks and vector space model (cited by: 1)
8
Authors: Azra Nazir, Roohie Naaz Mir, Shaima Qureshi. Journal: International Journal of Intelligent Computing and Cybernetics (EI), 2021, No. 3, pp. 321-332 (12 pages)
Purpose: Natural languages have a fundamental quality of suppleness that makes it possible to present a single idea in plenty of different ways. This feature is often exploited in the academic world, leading to the theft of work referred to as plagiarism. Many approaches have been put forward to detect such cases based on various text features and grammatical structures of languages. However, there is considerable scope for improvement in detecting intelligent plagiarism. Design/methodology/approach: To realize this, the paper introduces a hybrid model to detect intelligent plagiarism by breaking the entire process into three stages: (1) clustering, (2) vector formulation in each cluster based on semantic roles, normalization and similarity index calculation, and (3) summary generation using an encoder-decoder. An effective weighting scheme has been introduced to select terms used to build vectors based on K-means, which is calculated on the synonym set for the said term. Only if the value calculated in the last stage lies above a predefined threshold is the next semantic argument analyzed. When the similarity score for two documents is beyond the threshold, a short summary for plagiarized documents is created. Findings: Experimental results show that this method is able to detect connotation and concealment used in idea plagiarism besides detecting literal plagiarism. Originality/value: The proposed model can help academics stay updated by providing summaries of relevant articles. It would eliminate the practice of plagiarism infesting the academic community at an unprecedented pace. The model will also accelerate the process of reviewing academic documents, aiding in the speedy publishing of research articles.
Keywords: natural language processing; vector space model; recurrent neural networks; plagiarism detection
Encrypted Storage and Retrieval in Cloud Storage Applications (cited by: 1)
9
Authors: Huang Yongfeng, Zhang Jiuling, Li Xing. Journal: ZTE Communications, 2010, No. 4, pp. 31-33 (3 pages)
Problems with data security impede the widespread application of cloud computing. Although data can be protected through encryption, effective retrieval of encrypted data is difficult to achieve using traditional methods. This paper analyzes encrypted storage and retrieval technologies in cloud storage applications. A ranking method based on fully homomorphic encryption is proposed to meet the demands of encrypted storage. Results show this method can improve efficiency.
Keywords: cloud storage; vector space model; relevance ranking
Document Clustering Based on Constructing Density Tree
10
Authors: 戴维迪, 王文俊, 侯越先, 王英, 张璐. Journal: Transactions of Tianjin University (EI, CAS), 2008, No. 1, pp. 21-26 (6 pages)
This paper focuses on document clustering with the clustering algorithm based on a DEnsity Tree (CABDET) to improve the accuracy of clustering. The CABDET method constructs a density-based tree structure for every potential cluster by dynamically adjusting the radius of the neighborhood according to local density. It avoids the global density parameters of density-based spatial clustering of applications with noise (DBSCAN) and reduces the input parameters to one. Experimental results on real documents show that CABDET achieves better clustering accuracy than the DBSCAN method. The CABDET algorithm obtains a maximum F-measure of 0.347 with a root-node neighborhood radius of 0.80, which is higher than the 0.332 of DBSCAN with a neighborhood radius of 0.65 and a minimum number of objects of 6.
Keywords: document handling; clustering; tree structure; vector space model
A New Approach of Intelligent Data Retrieval Paradigm
11
Authors: Falah Al-akashi, Diana Inkpen. Journal: Artificial Intelligence Advances, 2021, No. 2, pp. 1-12 (12 pages)
What is a real-time agent, how does it remedy ongoing daily frustrations for users, and how does it improve retrieval performance on the World Wide Web? These are the main questions we focus on in this manuscript. In many distributed information retrieval systems, information in agents should be ranked based on a combination of multiple criteria. Linear combination of ranks has been the dominant approach due to its simplicity and effectiveness. Such a combination scheme in a distributed infrastructure requires that the ranks in resources or agents be comparable to each other before being combined. The main challenge is transforming the raw rank values of different criteria appropriately to make them comparable before any combination. The different ways in which agents are ranked make this strategy difficult. In this research, we demonstrate how to rank Web documents based on resource-provided information and how to combine several resources' ranking schemas at one time. The proposed system was implemented specifically on data provided by agents to create a comparable combination for different attributes. The proposed approach was tested on the queries provided by the Text Retrieval Conference (TREC). Experimental results showed that our approach is effective and robust compared with offline search platforms.
Keywords: intelligent agents; ranking schema; distributed approach; vector space model
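The abstract above describes a linear combination of ranks that first requires raw values from different sources to be made comparable. A minimal sketch of that general idea follows, using min-max normalization and a weighted sum. This is a generic illustration and not the authors' system: the function names, the choice of min-max scaling, the weights, and the sample scores are all assumptions.

```python
def min_max(scores):
    """Rescale one source's raw scores (doc -> score) into the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # A source that scores every document identically carries no ordering.
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def combine(sources, weights):
    """Fuse several sources by a weighted sum of their normalized scores.

    sources: source name -> {doc -> raw score}
    weights: source name -> weight (hypothetical values chosen by the caller)
    Returns document ids ordered by decreasing fused score.
    """
    fused = {}
    for name, scores in sources.items():
        for doc, s in min_max(scores).items():
            # A document missing from a source simply gets no contribution from it.
            fused[doc] = fused.get(doc, 0.0) + weights[name] * s
    # Ties are broken by document id so the ordering is deterministic.
    return sorted(fused, key=lambda d: (-fused[d], d))


if __name__ == "__main__":
    sources = {
        "agent_a": {"d1": 10, "d2": 20, "d3": 30},  # scores on a 0-30 scale
        "agent_b": {"d1": 0.9, "d2": 0.1},          # scores on a 0-1 scale
    }
    print(combine(sources, {"agent_a": 0.5, "agent_b": 0.5}))
```

Without the normalization step, the source with the larger numeric scale would dominate the sum regardless of the weights, which is the comparability problem the abstract points to.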
An Adaptive Approach to Schema Classification for Data Warehouse Modeling (cited by: 1)
12
Authors: 王宏鼎, 童云海, 谭少华, 唐世渭, 杨冬青, 孙国辉. Journal: Journal of Computer Science & Technology (SCIE, EI, CSCD), 2007, No. 2, pp. 252-260 (9 pages)
Data warehouse (DW) modeling is a complicated task, involving both knowledge of business processes and familiarity with the structure and behavior of operational information systems. Existing DW modeling techniques suffer from the following major drawbacks: the data-driven approach requires high levels of expertise and neglects the requirements of end users, while the demand-driven approach lacks enterprise-wide vision and disregards existing models of the underlying operational systems. To make up for those shortcomings, a method of classifying schema elements for DW modeling is proposed in this paper. We first put forward vector space models for subjects and schema elements, then present an adaptive approach with self-tuning theory to construct context vectors of subjects, and finally classify the source schema elements into different subjects of the DW automatically. Benefiting from the result of the schema element classification, designers can model and construct a DW more easily.
Keywords: data warehousing; schema elements classification; vector space model; adaptive
A New Indexing Method Based on Word Proximity for Chinese Text Retrieval (cited by: 1)
13
Authors: 杜林, 孙玉芳. Journal: Journal of Computer Science & Technology (SCIE, EI, CSCD), 2000, No. 3, pp. 280-286 (7 pages)
This paper proposes a novel text representation and matching scheme for Chinese text retrieval. At present, the indexing methods of Chinese retrieval systems are either character-based or word-based. The character-based indexing methods, such as bi-gram or tri-gram indexing, have high false drops due to the mismatches between queries and documents. On the other hand, it is difficult to efficiently identify all the proper nouns, terminology of different domains, and phrases in the word-based indexing systems. The new indexing method uses both proximity and mutual information of the word pairs to represent the text content so as to overcome the high false drop, new word and phrase problems that exist in the character-based and word-based systems. The evaluation results indicate that the average query precision of proximity-based indexing is 5.2% higher than the best results of TREC-5.
Keywords: information retrieval; vector space model; automatic indexing; proximity-based indexing