Funding: Supported by the Mexican Government (SNI, SIP-IPN, COFAA-IPN, and PIFI-IPN), CONACYT, and the Japanese Government.
Abstract: We present and analyze an unsupervised method for Word Sense Disambiguation (WSD). Our work is based on the method presented by McCarthy et al. in 2004 for finding the predominant sense of each word in an entire corpus. Their maximization algorithm lets weighted terms (similar words) from a distributional thesaurus accumulate a score for each sense of an ambiguous word; the sense with the highest score is chosen, based on votes from a weighted list of terms related to the ambiguous word. This list is obtained from a thesaurus built with the distributional similarity method proposed by Lin Dekang. In the method of McCarthy et al., every occurrence of the ambiguous word uses the same thesaurus entry, regardless of the context in which the word occurs. Our method takes context into account when determining the sense of an ambiguous word: it builds the list of distributionally similar words from the syntactic context of the ambiguous word. We obtain a top precision of 77.54%, versus 67.10% for the original method, when tested on SemCor. We also analyze the effect of the number of weighted terms on the tasks of finding the Most Frequent Sense (MFS) and WSD, and experiment with several corpora for building the Word Space Model.
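The weighted voting described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `rank_senses`, the sense labels, the neighbour weights, and the sense-to-term similarity table are all hypothetical, and a real system would take the similarities from a lexical resource such as WordNet. Each related term votes for every sense in proportion to its thesaurus weight and its normalized similarity to that sense.

```python
from collections import defaultdict


def rank_senses(neighbours, sense_sim):
    """Rank the senses of one ambiguous word by weighted votes.

    neighbours: list of (term, weight) pairs from a distributional thesaurus.
    sense_sim:  dict mapping (sense, term) to a similarity score (hypothetical
                values here; no lexical resource is queried).
    """
    senses = sorted({sense for sense, _ in sense_sim})
    scores = defaultdict(float)
    for term, weight in neighbours:
        # Normalize so each term distributes exactly `weight` across senses.
        total = sum(sense_sim.get((s, term), 0.0) for s in senses)
        if total == 0.0:
            continue  # the term says nothing about any sense
        for s in senses:
            scores[s] += weight * sense_sim.get((s, term), 0.0) / total
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical similarity table for the ambiguous word "star".
SENSE_SIM = {
    ("star/celestial", "planet"): 0.9, ("star/celebrity", "planet"): 0.1,
    ("star/celestial", "galaxy"): 0.8, ("star/celebrity", "galaxy"): 0.2,
    ("star/celestial", "actor"): 0.1, ("star/celebrity", "actor"): 0.9,
    ("star/celestial", "singer"): 0.1, ("star/celebrity", "singer"): 0.9,
}

# One corpus-wide neighbour list, as in the predominant-sense setting.
global_neighbours = [("planet", 0.4), ("galaxy", 0.3), ("actor", 0.2)]
# A neighbour list built for one specific syntactic context instead.
context_neighbours = [("actor", 0.5), ("singer", 0.3)]

global_ranking = rank_senses(global_neighbours, SENSE_SIM)
context_ranking = rank_senses(context_neighbours, SENSE_SIM)
print(global_ranking[0][0], context_ranking[0][0])
```

With these toy numbers the corpus-wide list favours the celestial sense, while the context-specific list favours the celebrity sense. That difference is what building the list from the syntactic context is meant to capture.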
Funding: Supported by the Applications and Systems-Driven Center for Energy-Efficient Integrated NanoTechnologies (ASCENT), one of six centers in the Joint University Microelectronics Program (JUMP), a Semiconductor Research Corporation (SRC) program sponsored by the Defense Advanced Research Projects Agency (DARPA), and in part by the US Department of Energy under Contract No. DE-AC02-05-CH11231 within the NonEquilibrium Magnetic Materials (NEMM) program.
Abstract: Materials with large spin–orbit torque (SOT) hold considerable significance for many spintronic applications because of their potential for energy-efficient magnetization switching. Unfortunately, most existing materials exhibit an SOT efficiency factor much less than unity and therefore require a large current for magnetization switching. The search for new materials with an SOT efficiency much greater than unity is a topic of active research, and only a few such materials have been identified using conventional approaches. In this paper, we present a machine learning-based approach using a word embedding model that can identify new results by deciphering non-trivial correlations among various items in a specialized scientific text corpus. We show that such a model can be used to identify materials likely to exhibit high SOT and to rank them according to their expected SOT strengths. The model captured the essential spintronics knowledge embedded in scientific abstracts from various materials science, physics, and engineering journals and identified 97 new materials as likely to exhibit high SOT. Among them, 16 candidate materials are expected to exhibit an SOT efficiency greater than unity, and one of them has recently been confirmed experimentally, in quantitative agreement with the model prediction.
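One common way to rank candidates with a word embedding model is to order them by cosine similarity to a query term. The sketch below assumes that scheme; the abstract does not state which similarity measure or ranking procedure the authors used. The function names, the three-dimensional vectors, and the placeholder material names are hypothetical and carry no physical meaning.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(query_vec, candidates):
    """Sort candidate names by similarity to the query vector, highest first."""
    scored = ((name, cosine(query_vec, vec)) for name, vec in candidates.items())
    return sorted(scored, key=lambda kv: kv[1], reverse=True)


# Hypothetical embedding of a query term such as "spin-orbit torque".
query = [1.0, 0.0, 1.0]
# Hypothetical embeddings of placeholder material names.
materials = {
    "material_A": [0.9, 0.1, 0.8],
    "material_B": [0.0, 1.0, 0.1],
    "material_C": [0.5, 0.5, 0.5],
}

ranking = rank_candidates(query, materials)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In practice the vectors would come from a model trained on the abstract corpus, and the ranking would be a shortlist for experimental follow-up, not a measurement of SOT efficiency.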