Abstract: The use of hidden conditional random fields (HCRFs) for tone modeling is explored. HCRFs improve tone recognition performance by exploiting intra-syllable dynamic, inter-syllable dynamic, and duration features. For integrating the tone model into continuous speech recognition, discriminative model weight training (DMWT) is proposed: acoustic and tone scores are scaled by model weights trained discriminatively under the minimum phone error (MPE) criterion. Two weight training schemes are evaluated, and a smoothing technique is used to make training robust to overtraining. Experiments show that the HCRF-based tone model improves the accuracy of both tone recognition and large vocabulary continuous speech recognition (LVCSR). Compared with the global weight scheme, the discriminatively trained weight combinations further improve continuous speech recognition.
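The score scaling described above can be illustrated with a minimal sketch. It assumes a simple log-linear combination of per-hypothesis acoustic and tone log-scores; the function names, the example hypotheses, and the weight values are hypothetical, and the weights are fixed by hand here rather than trained with the MPE criterion.

```python
def combine_scores(acoustic_logp, tone_logp, w_acoustic=1.0, w_tone=0.5):
    """Weighted sum of two log-scores (hypothetical weights, not MPE-trained)."""
    return w_acoustic * acoustic_logp + w_tone * tone_logp


def rescore(hypotheses, w_acoustic=1.0, w_tone=0.5):
    """Return the label of the hypothesis with the highest combined score.

    hypotheses: list of (label, acoustic_logp, tone_logp) tuples.
    """
    best = max(
        hypotheses,
        key=lambda h: combine_scores(h[1], h[2], w_acoustic, w_tone),
    )
    return best[0]


# Two toy hypotheses: the second has a slightly worse acoustic score
# but a much better tone score.
hyps = [("ma1", -10.0, -4.0), ("ma3", -10.5, -1.0)]
print(rescore(hyps))              # tone model contributes
print(rescore(hyps, w_tone=0.0))  # acoustic score only
```

With the tone weight set to zero the acoustic score alone decides, which shows how the choice of weights can change the recognized hypothesis.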
Funding: Supported by the National Council for Scientific and Technological Research (CONICET) and the National University of San Juan (UNSJ).
Abstract: This paper models the simultaneous localization and mapping (SLAM) problem as a flexible Markov random field and solves it with the iterated conditional modes algorithm. Markovian models can incorporate any motion model; any observation model, regardless of the type of sensor chosen; prior information about the map through a map model; maps of diverse natures; and sensor fusion weighted according to accuracy. The iterated conditional modes algorithm is a probabilistic optimizer widely used in image processing that has not previously been applied to the SLAM problem. This iterative solver converges in theory regardless of the Markov random field chosen as the model. Its initialization can be performed online and improved by parallel iterations whenever deemed appropriate. It can also be used as a post-processing step if it is initialized with estimates obtained from another SLAM solver. The methodology can be readily applied to other versions of the SLAM problem, such as multi-robot SLAM or SLAM in dynamic environments. Simulations and real experiments show the flexibility and the excellent results of this proposal.
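To make the iterated conditional modes (ICM) idea concrete, here is a minimal sketch on a toy problem that is not the paper's SLAM model: a binary label chain with a data term and a smoothness term. The energy function, the parameter `beta`, and all names are assumptions made for illustration.

```python
def local_energy(labels, obs, i, value, beta):
    """Energy contribution of assigning `value` to site i, neighbours fixed."""
    energy = 0.0 if value == obs[i] else 1.0        # data term
    if i > 0 and labels[i - 1] != value:            # left neighbour
        energy += beta
    if i < len(labels) - 1 and labels[i + 1] != value:  # right neighbour
        energy += beta
    return energy


def icm(obs, beta=0.7, max_sweeps=20):
    """Iterated conditional modes on a binary chain MRF.

    Each site is set in turn to the label that minimises its local energy
    given the current labels of its neighbours; sweeps repeat until no
    label changes (a local minimum of the total energy).
    """
    labels = list(obs)  # initialise from the observations
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(labels)):
            best = min((0, 1), key=lambda v: local_energy(labels, obs, i, v, beta))
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels


noisy = [0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1]
print(icm(noisy))
```

Because every update can only lower or keep the total energy, the sweeps terminate, which mirrors the convergence property mentioned in the abstract; the result is a local optimum that depends on the initialization.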
Funding: Supported by the Science and Technology Project of State Grid Corporation (Research and Application of Intelligent Energy Meter Quality Analysis and Evaluation Technology Based on Full Chain Data).
Abstract: With the application of artificial intelligence technology in the power industry, the knowledge graph is expected to play a key role in power grid dispatch processes, intelligent maintenance, and customer service response provision. Knowledge graphs are usually constructed based on entity recognition. Specifically, based on the mining of entity attributes and relationships, domain knowledge graphs can be constructed through knowledge fusion. In this work, the entities and characteristics of power entity recognition are analyzed, the mechanism of entity recognition is clarified, and entity recognition techniques are analyzed in the context of the power domain. Power entity recognition based on conditional random field (CRF) and bidirectional long short-term memory (BLSTM) models is investigated, and the two methods are compared. The results indicate that the CRF model, with an accuracy of 83%, identifies power entities better than the BLSTM. The CRF approach can thus be applied to entity extraction for knowledge graph construction in the power field.
Funding: Supported by the National Natural Science Foundation of China (No. 60302021).
Abstract: Named entity recognition is a fundamental task in biomedical data mining. In this letter, a named entity recognition system for biomedical texts based on conditional random fields (CRFs) is presented. The system makes extensive use of a diverse set of features, including local features, full-text features, and external resource features. All features incorporated in the system are described in detail, and the impact of different feature sets on system performance is evaluated. To improve performance further, post-processing modules are used to handle abbreviations, cascaded named entities, and boundary error identification. The evaluation shows that feature selection has an important impact on system performance and that the post-processing contributes substantially to achieving better results.
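As a rough illustration of how a linear-chain CRF assigns entity labels to a token sequence, the sketch below runs Viterbi decoding over hand-set scores. The BIO label set, the emission and transition scores, and the three-token example are all hypothetical; a real system would learn these scores from the features the abstract describes.

```python
def viterbi(emissions, transitions, labels):
    """Find the highest-scoring label path for a linear-chain model.

    emissions:   list of dicts, one per token, mapping label -> score
    transitions: dict mapping (previous_label, label) -> score
    """
    best = [{y: emissions[0][y] for y in labels}]
    backpointers = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transitions[(p, y)])
            scores[y] = best[-1][prev] + transitions[(prev, y)] + emissions[t][y]
            pointers[y] = prev
        best.append(scores)
        backpointers.append(pointers)
    # Trace the best path backwards from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


LABELS = ["B", "I", "O"]
# Hypothetical transition scores: O -> I is strongly penalised.
TRANSITIONS = {(p, y): 0.0 for p in LABELS for y in LABELS}
TRANSITIONS[("B", "I")] = 1.0
TRANSITIONS[("I", "I")] = 0.5
TRANSITIONS[("O", "I")] = -5.0
# Hypothetical per-token scores for a three-token phrase.
EMISSIONS = [
    {"B": 2.0, "I": 0.5, "O": 1.0},
    {"B": 0.5, "I": 1.0, "O": 1.2},
    {"B": 0.1, "I": 0.2, "O": 2.0},
]
print(viterbi(EMISSIONS, TRANSITIONS, LABELS))
```

Choosing the best label per token independently would label the second token "O"; the transition scores pull it into the entity, which is the kind of sequence-level consistency a CRF provides.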
Abstract: In dense pedestrian tracking, frequent object occlusions and close distances between objects make it difficult to estimate object trajectories accurately. In this study, a conditional random field tracking model is established by using a visual long short-term memory network in three-dimensional (3D) space, with motion estimation performed jointly on object trajectory segments. Object visual field information is added to the long short-term memory network to improve the accuracy of motion-related object pair selection and motion estimation. To address the uncertainty in the length and interval of trajectory segments, a multimode long short-term memory network is proposed for object motion estimation. The tracking performance is evaluated on the PETS2009 dataset. The experimental results show that the proposed method outperforms tracking methods based on independent motion estimation.
Funding: Supported by the National Basic Research Priorities Programme (No. 2013CB329502), the National High Technology Research and Development Programme of China (No. 2012AA011003), the Natural Science Basic Research Plan in Shanxi Province of China (No. 2014JQ2-6036), and the Science and Technology R&D Program of Baoji City (No. 203020013, 2013R2-2).
Abstract: This paper presents a new method for refining image annotation by integrating probabilistic latent semantic analysis (PLSA) with a conditional random field (CRF). First, a PLSA model with asymmetric modalities is constructed to predict a candidate set of annotations with confidence scores; then the semantic relationships among the candidate annotations are modeled with a conditional random field. In the CRF, the confidence scores generated by the PLSA model and the Flickr distance between pairwise candidate annotations serve as local evidence and contextual potentials, respectively. The novelty of the method lies mainly in two aspects: exploiting PLSA to predict a candidate set of annotations with confidence scores, and using a CRF to further explore the semantic context among candidate annotations for precise image annotation. To demonstrate the effectiveness of the proposed method, an experiment is conducted on the standard Corel dataset, and the results compare favorably with several state-of-the-art approaches.
Abstract: To reduce the computation cost of combining a probabilistic graphical model with a deep neural network in semantic segmentation, a local region conditional random field (LRCRF) model is investigated, which selectively applies the conditional random field (CRF) to the most active region in the image. The fully convolutional network structure is optimized with the ResNet-18 structure and dilated convolution to expand the receptive field. The tracking networks are also improved based on SiameseFC by considering the frame relations in consecutive-frame traffic scene maps. Moreover, the segmentation results on greyscale input data sets are more stable and effective than those obtained using RGB images for deep neural network feature extraction. The experimental results show that the proposed method takes advantage of the image features directly and achieves good real-time performance and high segmentation accuracy.
Funding: Sponsored by the Basic Research Development Program of China (Grant No. 2013CB03554) and the Fundamental Research Funds for Universities, Central South University (Grant No. 2017zzts394).
Abstract: Natural language processing has made great progress recently, and controlling robots with spoken natural language has become feasible. Given the reliability problem of this kind of control, a confirmation step should be included before a natural language instruction is carried out autonomously by the robot. A prototype dialog system was designed for this purpose, which raised the standardization problem for natural and understandable language interaction. The application background is remotely navigating a mobile robot inside a building with spoken Chinese. Because a place name, an important navigation element in instructions, can be expressed with different lexical terms in spoken language, this paper proposes a model that substitutes the alternative forms of a place name with a standard one (called standardization). First, a conditional random field (CRF) model is trained to label the term that needs to be standardized; then a trained word embedding model represents lexical terms as numerical vectors. Similarity between lexical terms is defined in the vector space and used to find the standard term most similar to the term selected for standardization. Experiments show that the proposed method works well and that the dialog system's responses confirming the instructions are natural and understandable.
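The similarity step described above can be sketched as a nearest-neighbour lookup in embedding space. The sketch assumes cosine similarity as the similarity measure, which the abstract does not specify; the three-dimensional vectors and the English place names are made-up stand-ins for trained word embeddings of spoken Chinese terms.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def standardize(term, embeddings, standard_names):
    """Map a spoken place-name variant to its most similar standard name."""
    return max(
        standard_names,
        key=lambda name: cosine(embeddings[term], embeddings[name]),
    )


# Toy embeddings (hypothetical values, not from a trained model).
EMBEDDINGS = {
    "washroom": [0.8, 0.2, 0.1],
    "toilet": [1.0, 0.0, 0.1],
    "meeting room": [0.0, 1.0, 0.2],
}
STANDARD_NAMES = ["toilet", "meeting room"]
print(standardize("washroom", EMBEDDINGS, STANDARD_NAMES))
```

In a full pipeline, the `term` argument would be the span that the CRF labeller marked as needing standardization.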
Funding: Supported by the National Natural Science Foundation of China (Grant No. 52078086), the Program of Distinguished Young Scholars, Natural Science Foundation of Chongqing, China (Grant No. cstc2020jcyj-jq0087), and the State Education Ministry and the Fundamental Research Funds for the Central Universities (Grant No. 2019 CDJSK 04 XK23).
Abstract: The rockhead profile is an important part of geological profiles and can significantly affect geotechnical engineering practice, so a practical method is needed to reconstruct the rockhead profile from site investigation results. As a general method for describing the spatial distribution of geo-material properties based on field measurements, the conditional random field (CRF) is improved in this paper to simulate rockhead profiles. In geotechnical engineering practice, measurements are generally limited by budget and time, so the estimated mean value carries some uncertainty. Because Bayesian theory can effectively combine measurements and prior information to deal with uncertainty, the CRF is implemented within a Bayesian framework in this study. More importantly, the simulation procedure is obtained as an analytical solution, which avoids time-consuming sampling. The results show that the proposed method provides a reasonable estimate of the rockhead depth at various locations when checked against measurement data, and as a result the subjectivity in determining the prior mean can be minimized. Finally, both the measurement data and the selection of hyper-parameters affect the simulated rockhead profiles, although the influence of the latter is less significant than that of the former.
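For readers unfamiliar with conditioning a random field on measurements, the following sketch shows the standard Gaussian conditioning formulas in one dimension. It is a simplified stand-in, not the paper's Bayesian formulation: the prior mean is treated as known, and the exponential covariance, its parameters `sigma` and `scale`, the borehole positions, and the depths are all assumed values.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def cov(x1, x2, sigma=2.0, scale=10.0):
    """Exponential covariance between two locations (assumed model)."""
    return sigma ** 2 * math.exp(-abs(x1 - x2) / scale)


def condition(x_star, xs, zs, prior_mean, sigma=2.0, scale=10.0):
    """Conditional mean and variance of the depth at x_star given boreholes."""
    K = [[cov(a, b, sigma, scale) for b in xs] for a in xs]
    k = [cov(x_star, a, sigma, scale) for a in xs]
    w = solve(K, k)  # weights w = K^{-1} k
    mean = prior_mean + sum(wi * (zi - prior_mean) for wi, zi in zip(w, zs))
    var = cov(x_star, x_star, sigma, scale) - sum(wi * ki for wi, ki in zip(w, k))
    return mean, var


boreholes = [0.0, 20.0, 50.0]   # hypothetical borehole positions (m)
depths = [12.0, 15.0, 9.0]      # hypothetical rockhead depths (m)
print(condition(30.0, boreholes, depths, prior_mean=12.0))
```

At a borehole location the conditional mean reproduces the measured depth with zero variance, and far from all boreholes it falls back to the prior mean, which is why the choice of prior mean matters when measurements are sparse.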
Funding: Supported by the National Natural Science Foundation of China (No. 61375053) and the Project of Shanghai University of Finance and Economics (Nos. 2018110565 and 2016110743).
Abstract: The Internet has penetrated all aspects of people's lives. A large number of online customer reviews have accumulated in product forums and are valuable resources for analysis. However, these reviews are unstructured textual data containing many ambiguities, so analyzing them is a challenging task. Effective deep semantic or fine-grained analysis of customer reviews is rare in the existing literature, and the analysis quality of most studies is low. Therefore, this paper introduces a fine-grained opinion mining method that extracts detailed semantic information about opinions from multiple perspectives and aspects in Chinese automobile reviews. The method uses a conditional random field (CRF) model in which semantic roles are divided into two groups. One group relates to the objects being reviewed and includes the manufacturer, the brand, the type, and the aspects of cars. The other group concerns the opinions about the objects and includes the sentiment description, the aspect value, the conditions of opinions, and the sentiment tendency. The overall framework consists of three major steps. The first step separates relevant sentences from irrelevant ones in the reviews. The second step classifies the relevant sentences into different aspects. The third step extracts fine-grained semantic roles from the sentences of each aspect. The training data are manually annotated at the fine granularity of semantic roles. The features used in the CRF model include basic word features, part-of-speech (POS) features, position features, and dependency syntactic features. Different combinations of these features are investigated. Experimental results are analyzed and future directions are discussed.
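The four feature groups listed above (word, POS, position, dependency) can be pictured as a per-token feature dictionary, a common input format for CRF toolkits. The sketch below is an assumed layout, not the paper's feature templates; the feature names, the English example sentence, and the tag values are illustrative only.

```python
def token_features(sentence, i):
    """Build a feature dict for token i.

    sentence: list of (word, pos_tag, dependency_relation) tuples.
    """
    word, pos, dep_rel = sentence[i]
    feats = {
        "word": word,                       # basic word feature
        "pos": pos,                         # part-of-speech feature
        "dep_rel": dep_rel,                 # dependency syntactic feature
        "position": i,                      # position features
        "is_first": i == 0,
        "is_last": i == len(sentence) - 1,
    }
    if i > 0:                               # context from the previous token
        feats["prev_word"] = sentence[i - 1][0]
        feats["prev_pos"] = sentence[i - 1][1]
    if i < len(sentence) - 1:               # context from the next token
        feats["next_word"] = sentence[i + 1][0]
        feats["next_pos"] = sentence[i + 1][1]
    return feats


def sentence_features(sentence):
    """Feature dicts for every token in the sentence."""
    return [token_features(sentence, i) for i in range(len(sentence))]


example = [("engine", "NN", "nsubj"), ("is", "VBZ", "cop"), ("quiet", "JJ", "root")]
for feats in sentence_features(example):
    print(feats)
```

Dropping or adding keys in this dictionary is one simple way to try the different feature combinations that the abstract mentions.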
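Abstract: Conditional random fields (CRFs) can be used to solve a variety of text analysis problems, such as sequence labeling, Chinese word segmentation, named entity recognition, and relation extraction between entities in natural language processing (NLP). Traditional CRFs running on a single node face a series of challenges when processing large-scale text. On the one hand, personal computers hit processing bottlenecks and cannot cope; on the other hand, servers execute inefficiently. Upgrading server hardware to increase computing power does not fundamentally solve the problem for large-scale text analysis tasks. Therefore, following a divide-and-conquer approach, SparkCRF, a distributed CRF that runs in a cluster environment, is designed and implemented on the Apache Spark big data processing framework. Experiments show that in text analysis tasks SparkCRF offers efficient computation and good scalability, with accuracy on par with the traditional single-node CRF++.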
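Abstract: Based on a study of the lexical characteristics of Vietnamese, the basic features of Vietnamese are incorporated into conditional random fields (CRFs), and a Vietnamese word segmentation method based on CRFs and an ambiguity model is proposed. A Vietnamese word segmentation corpus of 25,981 entries was obtained through machine annotation followed by manual proofreading and used as the CRF training corpus. Overlapping ambiguity is widespread in Vietnamese sentences. To overcome its influence, 5,377 ambiguous fragments were extracted from the training corpus using dictionary-based forward and backward matching algorithms, and an ambiguity model was trained with a maximum entropy model and integrated into the segmentation model. The training corpus was split evenly into 10 parts for cross-validation experiments, and the segmentation accuracy reached 96.55%. Compared with the existing Vietnamese word segmentation tool VnTokenizer, the experimental results show that the method improves the precision, recall, and F-score of Vietnamese word segmentation.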
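One plausible reading of the forward and backward matching step is sketched below: segment a syllable sequence with greedy longest matching from the left and from the right, and flag the sequence as ambiguous when the two segmentations disagree. The tiny lexicon, the `max_len` limit, and the disagreement test are assumptions; the paper's exact fragment extraction procedure is not given in the abstract.

```python
def forward_mm(syllables, lexicon, max_len=4):
    """Greedy longest-match segmentation scanning left to right."""
    words, i = [], 0
    while i < len(syllables):
        for n in range(min(max_len, len(syllables) - i), 0, -1):
            candidate = " ".join(syllables[i:i + n])
            if n == 1 or candidate in lexicon:  # single syllable is the fallback
                words.append(candidate)
                i += n
                break
    return words


def backward_mm(syllables, lexicon, max_len=4):
    """Greedy longest-match segmentation scanning right to left."""
    words, j = [], len(syllables)
    while j > 0:
        for n in range(min(max_len, j), 0, -1):
            candidate = " ".join(syllables[j - n:j])
            if n == 1 or candidate in lexicon:
                words.append(candidate)
                j -= n
                break
    return words[::-1]


def is_ambiguous(syllables, lexicon):
    """Flag a fragment when the two directions segment it differently."""
    return forward_mm(syllables, lexicon) != backward_mm(syllables, lexicon)


# Illustrative lexicon: "học sinh" (pupil) and "sinh học" (biology) overlap.
LEXICON = {"học sinh", "sinh học"}
fragment = "học sinh học".split()
print(forward_mm(fragment, LEXICON))
print(backward_mm(fragment, LEXICON))
print(is_ambiguous(fragment, LEXICON))
```

Fragments flagged this way would then be the training examples for a separate disambiguation model, such as the maximum entropy model the abstract describes.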