419 articles found
TONE MODELING BASED ON HIDDEN CONDITIONAL RANDOM FIELDS AND DISCRIMINATIVE MODEL WEIGHT TRAINING (cited by: 1)
1
Authors: 黄浩, 朱杰. Transactions of Nanjing University of Aeronautics and Astronautics (EI), 2008, No. 1, pp. 43-50, 8 pages
The use of hidden conditional random fields (HCRFs) for tone modeling is explored. Tone recognition performance is improved with HCRFs by taking advantage of intra-syllable dynamic, inter-syllable dynamic, and duration features. For integrating the tone model into continuous speech recognition, discriminative model weight training (DMWT) is proposed: acoustic and tone scores are scaled by model weights that are discriminatively trained with the minimum phone error (MPE) criterion. Two weight training schemes are evaluated, and a smoothing technique is used to make training robust to overtraining. Experiments show that the HCRF-based tone model improves the accuracy of both tone recognition and large vocabulary continuous speech recognition (LVCSR). Compared with the global weight scheme, the discriminatively trained weight combinations further improve continuous speech recognition.
Keywords: speech recognition; models; hidden conditional random fields; minimum phone error
Iterated Conditional Modes to Solve Simultaneous Localization and Mapping in Markov Random Fields Context (cited by: 3)
2
Authors: J. Gimenez, A. Amicarelli, J. M. Toibero, F. di Sciascio, R. Carelli. International Journal of Automation and Computing (EI, CSCD), 2018, No. 3, pp. 310-324, 15 pages
This paper models the complex simultaneous localization and mapping (SLAM) problem through a very flexible Markov random field and then solves it using the iterated conditional modes algorithm. Markovian models can incorporate any motion model; any observation model, regardless of the type of sensor chosen; prior information about the map through a map model; maps of diverse natures; and sensor fusion weighted according to accuracy. The iterated conditional modes algorithm, in turn, is a probabilistic optimizer widely used in image processing that has not yet been used to solve the SLAM problem. This iterative solver converges in theory regardless of the Markov random field chosen as the model. Its initialization can be performed on-line and improved by parallel iterations whenever deemed appropriate. It can be used as a post-processing methodology if it is initialized with estimates obtained from another SLAM solver. The methodology can be easily applied to other versions of the SLAM problem, such as the multi-robot version or SLAM in dynamic environments. Simulations and real experiments show the flexibility and the excellent results of this proposal.
Keywords: simultaneous localization and mapping; Markov random fields; iterated conditional modes; modelling; on-line solver
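To make the iterated conditional modes idea named in this abstract concrete, here is a minimal sketch on a toy problem. It is not the SLAM formulation of the paper: it assumes a 1-D chain Markov random field with a quadratic data term and a quadratic smoothness term, and the names `icm_chain`, `chain_energy`, and the weight `lam` are hypothetical.

```python
def icm_chain(obs, labels, lam=1.0, max_sweeps=20):
    """Iterated conditional modes on a toy 1-D chain MRF.

    Assumed energy (illustrative only):
        sum_i (x_i - obs_i)^2 + lam * sum_i (x_i - x_{i+1})^2
    where each x_i takes a value from the discrete set `labels`.
    """
    # Initialise every site with the label closest to its observation.
    x = [min(labels, key=lambda l: (l - o) ** 2) for o in obs]
    for _ in range(max_sweeps):
        changed = False
        for i, o in enumerate(obs):
            def local_energy(l):
                # Only terms that involve site i matter for its update.
                e = (l - o) ** 2
                if i > 0:
                    e += lam * (l - x[i - 1]) ** 2
                if i < len(x) - 1:
                    e += lam * (l - x[i + 1]) ** 2
                return e
            best = min(labels, key=local_energy)
            # Accept only strict improvements so the sweep loop terminates.
            if local_energy(best) < local_energy(x[i]):
                x[i] = best
                changed = True
        if not changed:
            break
    return x


def chain_energy(x, obs, lam=1.0):
    """Total energy of a labelling under the assumed toy model."""
    data = sum((a - b) ** 2 for a, b in zip(x, obs))
    smooth = sum((a - b) ** 2 for a, b in zip(x, x[1:]))
    return data + lam * smooth
```

Each site update can only lower the total energy, so the procedure stops at a local minimum that depends on the initialization and the visiting order. This is consistent with the abstract's remark that the solver can be initialized from another solver's estimates.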
Power entity recognition based on bidirectional long short-term memory and conditional random fields (cited by: 9)
3
Authors: Zhixiang Ji, Xiaohui Wang, Changyu Cai, Hongjian Sun. Global Energy Interconnection, 2020, No. 2, pp. 186-192, 7 pages
With the application of artificial intelligence technology in the power industry, the knowledge graph is expected to play a key role in power grid dispatch processes, intelligent maintenance, and customer service response provision. Knowledge graphs are usually constructed based on entity recognition. Specifically, based on the mining of entity attributes and relationships, domain knowledge graphs can be constructed through knowledge fusion. In this work, the entities and characteristics of power entity recognition are analyzed, the mechanism of entity recognition is clarified, and entity recognition techniques are analyzed in the context of the power domain. Power entity recognition based on the conditional random fields (CRF) and bidirectional long short-term memory (BLSTM) models is investigated, and the two methods are comparatively analyzed. The results indicate that the CRF model, with an accuracy of 83%, identifies power entities better than the BLSTM. The CRF approach can thus be applied to entity extraction for knowledge graph construction in the power field.
Keywords: knowledge graph; entity recognition; conditional random fields (CRF); bidirectional long short-term memory (BLSTM)
A CONDITIONAL RANDOM FIELDS APPROACH TO BIOMEDICAL NAMED ENTITY RECOGNITION (cited by: 4)
4
Authors: Wang Haochang, Zhao Tiejun, Li Sheng, Yu Hao. Journal of Electronics (China), 2007, No. 6, pp. 838-844, 7 pages
Named entity recognition is a fundamental task in biomedical data mining. In this letter, a named entity recognition system based on CRFs (conditional random fields) for biomedical texts is presented. The system makes extensive use of a diverse set of features, including local features, full-text features, and external resource features. All features incorporated in the system are described in detail, and the impact of different feature sets on system performance is evaluated. To improve performance, post-processing modules are used to handle abbreviations, cascaded named entities, and the identification of boundary errors. The evaluation shows that feature selection has an important impact on system performance and that the post-processing contributes substantially to achieving better results.
Keywords: conditional random fields (CRFs); named entity recognition; feature selection; post-processing
Conditional Random Field Tracking Model Based on a Visual Long Short Term Memory Network (cited by: 3)
5
Authors: Pei-Xin Liu, Zhao-Sheng Zhu, Xiao-Feng Ye, Xiao-Feng Li. Journal of Electronic Science and Technology (CAS, CSCD), 2020, No. 4, pp. 308-319, 12 pages
In dense pedestrian tracking, frequent object occlusions and close distances between objects make it difficult to estimate object trajectories accurately. In this study, a conditional random field tracking model is established by using a visual long short term memory network in three-dimensional (3D) space and by performing motion estimation jointly on object trajectory segments. Object visual field information is added to the long short term memory network to improve the accuracy of motion-related object pair selection and motion estimation. To address the uncertainty in the length and interval of trajectory segments, a multimode long short term memory network is proposed for object motion estimation. Tracking performance is evaluated on the PETS2009 dataset. The experimental results show that the proposed method performs better than tracking methods based on independent motion estimation.
Keywords: conditional random field (CRF); long short term memory network (LSTM); motion estimation; multiple object tracking (MOT)
Exploiting PLSA model and conditional random field for refining image annotation (cited by: 1)
6
Authors: 田东平. High Technology Letters (EI, CAS), 2015, No. 1, pp. 78-84, 7 pages
This paper presents a new method for refining image annotation by integrating probabilistic latent semantic analysis (PLSA) with a conditional random field (CRF). First, a PLSA model with asymmetric modalities is constructed to predict a candidate set of annotations with confidence scores; then the semantic relationships among the candidate annotations are modeled with a conditional random field. In the CRF, the confidence scores generated by the PLSA model and the Flickr distance between pairwise candidate annotations are treated as local evidence and contextual potentials, respectively. The novelty of the method lies mainly in two aspects: exploiting PLSA to predict a candidate set of annotations with confidence scores, and using the CRF to further explore the semantic context among candidate annotations for precise image annotation. To demonstrate the effectiveness of the proposed method, an experiment is conducted on the standard Corel dataset, and its results compare favorably with several state-of-the-art approaches.
Keywords: automatic image annotation; probabilistic latent semantic analysis (PLSA); expectation-maximization; conditional random field (CRF); Flickr distance; image retrieval
An Image Segmentation Algorithm Based on a Local Region Conditional Random Field Model (cited by: 1)
7
Authors: Xiao Jiang, Haibin Yu, Shuaishuai Lv. International Journal of Communications, Network and System Sciences, 2020, No. 9, pp. 139-159, 21 pages
To reduce the computation cost of combining a probabilistic graphical model with a deep neural network in semantic segmentation, a local region conditional random field (LRCRF) model is investigated, which selectively applies the conditional random field (CRF) to the most active region of the image. The fully convolutional network structure is optimized with the ResNet-18 structure and dilated convolution to expand the receptive field. The tracking networks are also improved based on SiameseFC by considering the frame relations in consecutive-frame traffic scene maps. Moreover, the segmentation results for greyscale input data sets are more stable and effective than those obtained with RGB images for deep neural network feature extraction. The experimental results show that the proposed method takes advantage of the image features directly and achieves good real-time performance and high segmentation accuracy.
Keywords: image segmentation; local region conditional random field model; deep neural network; consecutive shooting traffic scene
Standardization of Robot Instruction Elements Based on Conditional Random Fields and Word Embedding
8
Authors: Hengsheng Wang, Zhengang Zhang, Jin Ren, Tong Liu. Journal of Harbin Institute of Technology (New Series) (EI, CAS), 2019, No. 5, pp. 32-40, 9 pages
Natural language processing has made great progress recently, and controlling robots with spoken natural language has become feasible. With the reliability of this kind of control in mind, a confirmation step for a natural language instruction should be included before the robot carries it out autonomously. A prototype dialog system was designed for this purpose, which raised the standardization problem for natural and understandable language interaction. The application background is remotely navigating a mobile robot inside a building with spoken Chinese. Because a place name, an important navigation element in instructions, can be expressed with different lexical terms in spoken language, this paper proposes a model that substitutes the different alternatives of a place name with a standard one (called standardization). First, a CRF (conditional random fields) model is trained to label the term that needs to be standardized; then a trained word embedding model represents lexical terms as numeric vectors. Similarity between lexical terms is defined in the vector space and used to find the one most similar to the term picked out for standardization. Experiments show that the proposed method works well and that the dialog system's responses confirming the instructions are natural and understandable.
Keywords: word embedding; conditional random fields (CRFs); standardization; human-robot interaction; Chinese natural spoken language (CNSL); natural language processing (NLP)
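The second step described in this abstract, choosing the standard term closest to a labelled term in embedding space, can be sketched as follows. The abstract does not say which similarity measure is used, so cosine similarity is an assumption here; the function names and the 3-dimensional toy vectors are hypothetical and stand in for a trained word embedding model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def standardize(term, embeddings, standard_terms):
    """Return the standard term whose vector is most similar to `term`."""
    query = embeddings[term]
    return max(standard_terms, key=lambda s: cosine(query, embeddings[s]))


# Hypothetical toy embeddings, for illustration only.
toy_embeddings = {
    "meeting room": [0.9, 0.1, 0.0],
    "conference hall": [0.8, 0.2, 0.1],
    "laboratory": [0.1, 0.9, 0.2],
    "lab": [0.2, 0.8, 0.1],
}
```

With the toy vectors above, `standardize("lab", toy_embeddings, ["meeting room", "laboratory"])` selects `"laboratory"`.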
Rockhead profile simulation using an improved generation method of conditional random field (cited by: 6)
9
Authors: Liang Han, Lin Wang, Wengang Zhang, Boming Geng, Shang Li. Journal of Rock Mechanics and Geotechnical Engineering (SCIE, CSCD), 2022, No. 3, pp. 896-908, 13 pages
The rockhead profile is an important part of geological profiles and can have significant impacts on geotechnical engineering practice, so a useful method is needed to infer the rockhead profile from site investigation results. As a general method for reflecting the spatial distribution of geo-material properties based on field measurements, the conditional random field (CRF) is improved in this paper to simulate rockhead profiles. In geotechnical engineering practice, measurements are generally limited by budget and time, so the estimate of the mean value carries some uncertainty. Because Bayesian theory can effectively combine measurements and prior information to deal with uncertainty, the CRF is implemented within a Bayesian framework in this study. More importantly, the simulation procedure is obtained as an analytical solution, which avoids time-consuming sampling. The results show that the proposed method provides a reasonable estimate of the rockhead depth at various locations when checked against measurement data, and as a result the subjectivity in determining the prior mean can be minimized. Finally, both the measurement data and the selection of hyper-parameters affect the simulated rockhead profiles, although the influence of the latter is less significant than that of the former.
Keywords: rockhead profile; borehole; conditional random field (CRF); Bayesian; mean uncertainty
Fine-Grained Opinion Mining on Chinese Car Reviews with Conditional Random Field
10
Authors: WANG Yinglin. Journal of Shanghai Jiaotong University (Science) (EI), 2020, No. 3, pp. 325-332, 8 pages
Nowadays, the Internet has penetrated into all aspects of people's lives. A large number of online customer reviews have accumulated in product forums, and they are valuable resources for analysis. However, these reviews are unstructured textual data containing many ambiguities, so analyzing them is a challenging task. At present, effective deep semantic or fine-grained analysis of customer reviews is rare in the literature, and the analysis quality of most studies is low. Therefore, this paper introduces a fine-grained opinion mining method that extracts detailed semantic information about opinions from multiple perspectives and aspects in Chinese automobile reviews. The method uses the conditional random field (CRF) model, with semantic roles divided into two groups. One group relates to the objects being reviewed and includes the roles of manufacturer, brand, type, and aspects of cars. The other group concerns the opinions about the objects and includes the sentiment description, the aspect value, the conditions of opinions, and the sentiment tendency. The overall framework consists of three major steps. The first step distinguishes the relevant sentences from the irrelevant sentences in the reviews. In the second step, the relevant sentences are further classified into different aspects. In the third step, fine-grained semantic roles are extracted from the sentences of each aspect. The training data is manually annotated with fine-grained semantic roles. The features used in the CRF model include basic word features, part-of-speech (POS) features, position features, and dependency syntactic features. Different combinations of these features are investigated. Experimental results are analyzed and future directions are discussed.
Keywords: Chinese opinion mining; conditional random field (CRF); semantic role labelling; Chinese car reviews
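A Chinese Part-of-Speech Tagging Method Based on Conditional Random Fields (CRFs) (cited by: 58)
11
Authors: 洪铭材, 张阔, 唐杰, 李涓子. 《计算机科学》 (Computer Science) (CSCD, PKU Core), 2006, No. 10, pp. 148-151, 155, 5 pages
This paper proposes a Chinese part-of-speech tagging method based on the CRFs model. Exploiting the ability of CRFs to incorporate arbitrary features, the method uses the contextual information of words and adds new statistical features for words with multiple possible parts of speech and for out-of-vocabulary words. In closed and open tests on the January corpus of the People's Daily, the method achieves tagging accuracies of 98.56% and 96.60%, respectively.
Keywords: part-of-speech tagging; conditional random fields; Viterbi decoding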
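The keywords of this record mention Viterbi decoding, the standard way to read the best tag sequence out of a linear-chain model. The following is a minimal sketch under assumed inputs: per-position tag scores and tag-to-tag transition scores are given as plain lists in the log domain. The function name and the input layout are hypothetical and not taken from the paper.

```python
def viterbi(emission, transition):
    """Highest-scoring tag sequence for a linear-chain model.

    emission[t][y]    : score of tag y at position t (log domain)
    transition[p][y]  : score of moving from tag p to tag y (log domain)
    Returns (best_path, best_score).
    """
    n_tags = len(emission[0])
    score = list(emission[0])   # best score of any path ending in each tag
    backptr = []                # backptr[t-1][y] = best previous tag for y at t
    for t in range(1, len(emission)):
        new_score, ptr = [], []
        for y in range(n_tags):
            prev = max(range(n_tags), key=lambda p: score[p] + transition[p][y])
            ptr.append(prev)
            new_score.append(score[prev] + transition[prev][y] + emission[t][y])
        score = new_score
        backptr.append(ptr)
    # Trace back from the best final tag.
    last = max(range(n_tags), key=lambda y: score[y])
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, score[last]
```

In a trained CRF the two score tables would come from the learned feature weights; here they are simply passed in, which keeps the sketch independent of any particular toolkit.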
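SparkCRF: A Spark-Based Parallel Implementation of the CRFs Algorithm (cited by: 11)
12
Authors: 朱继召, 贾岩涛, 徐君, 乔建忠, 王元卓, 程学旗. 《计算机研究与发展》 (Journal of Computer Research and Development) (EI, CSCD, PKU Core), 2016, No. 8, pp. 1819-1828, 10 pages
Conditional random fields (CRFs) can be used to solve a variety of text analysis problems, such as sequence labeling, Chinese word segmentation, named entity recognition, and relation extraction between entities in natural language processing (NLP). Traditional CRFs running on a single node face a series of challenges when processing large-scale text. On the one hand, personal computers hit processing bottlenecks and cannot cope; on the other hand, servers execute inefficiently. Upgrading server hardware to increase computing power cannot fundamentally solve the problem for large-scale text analysis tasks. Therefore, following the divide-and-conquer idea, SparkCRF, a distributed CRFs implementation that runs in a cluster environment, is designed and implemented on the Apache Spark big data processing framework. Experiments show that in text analysis tasks SparkCRF offers efficient computation and good scalability, with accuracy at the same level as the traditional single-node CRF++.
Keywords: big data; machine learning; distributed computing; Spark; conditional random fields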
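Opinion Target Extraction from Chinese Sentences Based on Cascaded CRFs (cited by: 19)
13
Authors: 郑敏洁, 雷志城, 廖祥文, 陈国龙. 《中文信息学报》 (Journal of Chinese Information Processing) (CSCD, PKU Core), 2013, No. 3, pp. 69-76, 8 pages
Opinion target extraction from Chinese sentences refers to extracting, from a Chinese sentence, the object or object attribute that a comment is directed at. Existing research in China has not yet been able to effectively identify compound-word opinion targets and out-of-vocabulary opinion targets. To address these two cases, this paper proposes an opinion target extraction method for Chinese sentences based on cascaded conditional random fields. The method first obtains a set of candidate opinion targets through a low-level conditional random field. A denoising model then filters out noise, a supplement model adds missing candidate opinion targets, and a merging model merges compound-phrase candidates. Finally, a high-level model extracts the opinion targets. Experimental results show that, compared with a recognition method based on linear-chain conditional random fields, the method improves precision, recall, and F1 by 1.62%, 5.75%, and 4.17%, respectively. It effectively identifies compound-word and out-of-vocabulary opinion targets and thereby improves the accuracy of opinion target recognition in Chinese sentences.
Keywords: opinion target; cascaded conditional random fields; denoising model; supplement model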
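Research on Sensitive Topic Identification Based on the CRFs Model (cited by: 4)
14
Authors: 翟东海, 聂洪玉, 崔静静, 杜佳. 《计算机应用研究》 (Application Research of Computers) (CSCD, PKU Core), 2014, No. 4, pp. 993-996, 4 pages
Conditional random fields (CRFs) are a discriminative probabilistic undirected graphical learning model. This paper introduces them into sensitive topic identification and proposes a CRFs-based identification method. A randomly selected text s from the texts to be examined and the remaining texts to be examined serve as the observation sequence and the state sequence of the CRFs model, respectively, to compute relevance probability values between s and the remaining texts. The text with the highest relevance is then merged with s to represent one category, while the text with the lowest relevance serves as another category. These two categories become the new state sequence of the CRFs model and the remaining texts become the new observation sequence for further iteration, which yields the identification of sensitive topics. In experiments on the dataset, the method achieves a cost function value of 0.01943 and a macro-averaged F-measure of 0.8235, both of which are good results.
Keywords: conditional random fields; sensitive topic identification; relevance probability value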
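Automatic Word Segmentation of Medieval Chinese Based on CRFs and Dictionary Information (cited by: 29)
15
Authors: 王晓玉, 李斌. 《数据分析与知识发现》 (Data Analysis and Knowledge Discovery) (CSSCI, CSCD), 2017, No. 5, pp. 62-70, 9 pages
[Objective] To verify how segmentation consistency and corpus category in the medieval period affect the efficiency of CRFs word segmentation, and on that basis to further improve segmentation efficiency and reduce the workload of manual proofreading. [Methods] Taking medieval historical records, Buddhist sutras, and fiction as example corpora, the study addresses automatic word segmentation of Medieval Chinese by refining the segmentation principles and combining the CRFs model with a dictionary to eliminate the inconsistencies that tend to arise in manual segmentation of Medieval Chinese. Two kinds of features, character classification and dictionary information, are introduced into CRFs segmentation, and comparative experiments select the most suitable segmentation template for each feature. [Results] The overall F-score of the segmentation results exceeds 99% in the closed test and reaches 89%-95% in the combined tests of the open test. [Limitations] The study of segmentation inconsistency mainly targets two-character words, so recognition of words of three or more characters (multi-character words) is somewhat weaker. [Conclusions] Given effectively improved segmentation consistency, the character classification and dictionary tag features effectively improve the precision of CRFs segmentation of Medieval Chinese. The proposed segmentation system can also serve Chinese corpora of multiple categories from the medieval period.
Keywords: CRFs model; segmentation consistency; Medieval Chinese; automatic word segmentation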
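Chinese Word Segmentation Based on CRFs Marginal Probabilities (cited by: 19)
16
Authors: 罗彦彦, 黄德根. 《中文信息学报》 (Journal of Chinese Information Processing) (CSCD, PKU Core), 2009, No. 5, pp. 3-8, 6 pages
Casting word segmentation as a sequence labeling problem and labeling the sequence with a CRFs tagger has been a widely adopted segmentation approach in recent years. To address the labeling errors of CRFs in this approach, this paper proposes a segmentation method based on CRFs marginal probabilities. The method mines candidate words with high marginal probability from the labeling results, recombines candidate words with low marginal probability, and proposes an FMM reward mechanism to correct the recombined substrings. In closed tests on the simplified Chinese corpora SXU and NCC of the Fourth SIGHAN Bakeoff, the method reaches F1 values of 96.41% and 94.30%, respectively.
Keywords: computer application; Chinese information processing; Chinese word segmentation; conditional random fields (CRFs); marginal probability; forward maximum matching (FMM); global features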
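The marginal probabilities this abstract relies on can be computed for a linear-chain model with the forward-backward algorithm. The sketch below assumes the same plain-list, log-domain score tables as a generic linear-chain model; the function names are hypothetical, and how the paper thresholds or combines the marginals is not stated in the abstract, so that part is left out.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def tag_marginals(emission, transition):
    """Per-position tag marginals of a linear-chain model.

    emission[t][y] and transition[p][y] are log-domain scores.
    Returns marg[t][y] = P(tag at position t is y).
    """
    n, k = len(emission), len(emission[0])
    # Forward pass: alpha[t][y] sums over all prefixes ending in tag y at t.
    alpha = [list(emission[0])]
    for t in range(1, n):
        alpha.append([
            emission[t][y]
            + logsumexp([alpha[t - 1][p] + transition[p][y] for p in range(k)])
            for y in range(k)
        ])
    # Backward pass: beta[t][y] sums over all suffixes that follow tag y at t.
    beta = [[0.0] * k for _ in range(n)]
    for t in range(n - 2, -1, -1):
        for y in range(k):
            beta[t][y] = logsumexp([
                transition[y][q] + emission[t + 1][q] + beta[t + 1][q]
                for q in range(k)
            ])
    log_z = logsumexp(alpha[-1])  # log partition function
    return [
        [math.exp(alpha[t][y] + beta[t][y] - log_z) for y in range(k)]
        for t in range(n)
    ]
```

A position whose largest marginal is low marks a labeling decision the model is unsure about, which is the kind of signal the abstract describes using to decide which candidate words to keep and which to recombine.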
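Chinese Chunk Recognition Combining a Distributed Strategy with CRFs (cited by: 6)
17
Authors: 黄德根, 于静. 《中文信息学报》 (Journal of Chinese Information Processing) (CSCD, PKU Core), 2009, No. 1, pp. 16-22, 7 pages
This paper proposes a method for recognizing Chinese chunks based on a CRFs distributed strategy and error-driven learning. The method first divides the 11 types of Chinese chunks into groups and builds different CRFs-based chunk recognition models to recognize them. It then uses a CRFs-based error-driven technique to automatically re-recognize the grouped chunks. Finally, it resolves type conflicts in order of each group's F-score. Experimental results show that the approach is effective for recognizing Chinese chunks: in open tests the system reaches precision, recall, and F-score of 94.90%, 91.00%, and 92.91%, respectively, better than the CRFs method alone, the distributed strategy alone, and other combined methods.
Keywords: computer application; Chinese information processing; chunk recognition; conditional random fields (CRFs); distributed strategy; CRFs-based error-driven method; shallow parsing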
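High-Resolution Image Classification Based on a Multi-Level Spatial Context LR-CRFs Model
18
Authors: 杨耘, 徐丽, 贾鹏. 《地球科学与环境学报》 (Journal of Earth Sciences and Environment) (CAS), 2013, No. 4, pp. 119-126, 8 pages
Fully representing and exploiting the spatial context and semantic information of targets is key to improving the classification accuracy of high-spatial-resolution imagery, and conditional random fields (CRFs) have distinct advantages in modeling target spatial context and in classification prediction. However, a CRFs model based on single-scale analysis cannot reflect the multi-level spatial structure and semantic relations of targets. For land use/cover classification of high-resolution urban imagery, a multi-level spatial context LR-CRFs model is therefore proposed within an object-oriented classification framework. The model is defined as follows. First, the image is represented hierarchically at the object, target, and scene levels, features are extracted at each level, and the levels are linked in an "object-target-scene" chain. Second, a logistic regression (LR) classifier defines the association potential of the CRFs model, and a Potts function weighted by the hierarchical features defines the interaction potential. The max-product message passing algorithm performs approximate inference on the model. Experiments with IKONOS multispectral imagery and large-scale true-color aerial imagery show that the multi-level spatial context LR-CRFs model achieves higher classification accuracy than single-scale LR-CRFs models based on pixel-level or object-level segmentation, with average improvements of 4.63% and 2.22%, respectively. The method also reduces, to some extent, the dependence of object-oriented classification results on the segmentation scale.
Keywords: conditional random fields; multi-level spatial context; logistic regression; hierarchical graphical model; semantic information; high-resolution remote sensing; image classification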
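Vietnamese Word Segmentation Based on CRFs and an Ambiguity Model (cited by: 2)
19
Authors: 熊明明, 李英, 郭剑毅, 毛存礼, 余正涛. 《数据采集与处理》 (Journal of Data Acquisition and Processing) (CSCD, PKU Core), 2017, No. 3, pp. 636-642, 7 pages
Based on a study of the lexical characteristics of Vietnamese, the basic features of Vietnamese are incorporated into conditional random fields (CRFs), and a Vietnamese word segmentation method based on CRFs and an ambiguity model is proposed. A Vietnamese segmentation corpus of 25,981 entries, obtained by machine annotation followed by manual proofreading, serves as the CRFs training corpus. Overlapping ambiguities are widespread in Vietnamese sentences. To overcome their effect, 5,377 ambiguous fragments are extracted from the training corpus using dictionary-based forward and backward matching algorithms; an ambiguity model is trained on them with a maximum entropy model and integrated into the segmentation model. With the training corpus split evenly into 10 parts for cross-validation, segmentation accuracy reaches 96.55%. Compared with the existing Vietnamese segmentation tool VnTokenizer, the experimental results show that the method improves precision, recall, and F-score.
Keywords: conditional random field model; Vietnamese word segmentation; lexical analysis; basic features; maximum entropy; ambiguity model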
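The dictionary-based forward and backward matching mentioned in this abstract can be sketched as below. This is a generic greedy maximum-matching sketch under stated assumptions, not the paper's exact procedure: input is a list of space-separated syllables, the lexicon is a set of multi-syllable words joined by spaces, and a fragment is flagged as ambiguous when the two directions disagree. The function names, `max_len`, and the placeholder tokens are hypothetical.

```python
def forward_match(syllables, lexicon, max_len=4):
    """Greedy left-to-right longest match; single syllables always match."""
    out, i = [], 0
    while i < len(syllables):
        for size in range(min(max_len, len(syllables) - i), 0, -1):
            cand = " ".join(syllables[i:i + size])
            if size == 1 or cand in lexicon:
                out.append(cand)
                i += size
                break
    return out


def backward_match(syllables, lexicon, max_len=4):
    """Greedy right-to-left longest match; single syllables always match."""
    out, j = [], len(syllables)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            cand = " ".join(syllables[j - size:j])
            if size == 1 or cand in lexicon:
                out.append(cand)
                j -= size
                break
    out.reverse()
    return out


def is_ambiguous(syllables, lexicon):
    """Flag a fragment when the two matching directions segment it differently."""
    return forward_match(syllables, lexicon) != backward_match(syllables, lexicon)
```

With placeholder syllables `["a", "b", "c"]` and a lexicon containing both `"a b"` and `"b c"`, forward matching yields `["a b", "c"]` and backward matching yields `["a", "b c"]`, so the fragment is flagged as an overlapping ambiguity.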
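Business Name Recognition Based on CRFs and Domain Rules (cited by: 3)
20
Authors: 赵延平, 曹存根, 谢丽聪. 《计算机工程》 (Computer Engineering) (CAS, CSCD, PKU Core), 2011, No. 11, pp. 200-202, 3 pages
A business name recognition method based on conditional random fields (CRFs) and domain rules is proposed. Feature sets are selected by experimenting with different combinations of words and parts of speech; a CRFs model is trained on these features and applied to test data to obtain business terms. Similarity is computed with two measures, 2-gram and edit distance, and business names are obtained using domain rules together with the similarity computation. Experimental results demonstrate the effectiveness of the method.
Keywords: business name recognition; conditional random fields; text similarity; edit distance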
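The two similarity measures named in this abstract, 2-gram and edit distance, can be sketched as follows. The edit distance is the standard Levenshtein distance. The abstract does not say how the 2-gram measure is normalized, so the Dice coefficient over character 2-gram sets is an assumption here, and both function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]


def bigram_similarity(a, b):
    """Dice coefficient over character 2-gram sets (assumed normalization)."""
    ga = {a[i:i + 2] for i in range(len(a) - 1)}
    gb = {b[i:i + 2] for i in range(len(b) - 1)}
    if not ga and not gb:
        # Strings shorter than two characters have no 2-grams.
        return 1.0 if a == b else 0.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))
```

Both functions work on arbitrary Python strings, so they apply to Chinese text character by character without any change.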