Funding: Supported by the National Natural Science Foundation of China (Nos. 51677171, 51637009, 51577166, and 51827810), the National Key R&D Program of China (No. 2018YFB0606000), the China Scholarship Council (No. 201708330502), the Fund of Shuohuang Railway Development Limited Liability Company (No. SHTL-2020-13), and the Fund of State Key Laboratory of Industrial Control Technology (No. ICT2022B29), China.
Abstract: Rod insulators are vital parts of the catenary of high-speed railways (HSRs). There are many different catenary insulators, and the background of the insulator image is complicated, so it is difficult to recognise insulators and detect defects automatically. In this paper, we propose an intelligent catenary defect detection algorithm based on the Mask region-based convolutional neural network (Mask R-CNN) and an image processing model. Vertical projection is used to position individual sheds and to cut the insulator precisely. Gradient, texture, and gray feature fusion (GTGFF) and a K-means clustering analysis model (KCAM) are proposed to detect broken insulators, dirt, foreign bodies, and flashover. Using this model, insulator recognition and defect detection achieve a high recall rate and accuracy, and support generalized defect detection. The algorithm is tested and verified on a dataset of realistic insulator images, and its accuracy and reliability satisfy current requirements for automatic inspection and intelligent maintenance of HSR catenaries.
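The vertical projection step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the insulator has already been segmented into a binary image (a list of rows of 0/1) and that sheds are separated by columns with no foreground pixels. The function names, the toy image, and the threshold are hypothetical; the paper's actual procedure may differ.

```python
def vertical_projection(binary):
    """Sum each column of a binary image given as a list of rows of 0/1."""
    return [sum(column) for column in zip(*binary)]


def split_at_valleys(profile, threshold):
    """Return (start, end) column ranges where the profile exceeds threshold.

    `end` is exclusive, so a range can be used directly as a slice.
    """
    segments = []
    start = None
    for x, value in enumerate(profile):
        if value > threshold and start is None:
            start = x  # entering a foreground run
        elif value <= threshold and start is not None:
            segments.append((start, x))  # leaving a foreground run
            start = None
    if start is not None:
        segments.append((start, len(profile)))
    return segments


# Toy binary image (assumption): two blobs separated by empty columns.
image = [
    [0, 1, 1, 0, 0, 1, 1, 1, 0],
    [0, 1, 1, 0, 0, 1, 1, 1, 0],
    [0, 1, 1, 0, 0, 0, 1, 0, 0],
]
profile = vertical_projection(image)
segments = split_at_valleys(profile, threshold=0)
print(profile)
print(segments)
```

Each returned range marks one candidate shed; cropping the image columns to a range would give the single-shed cut described in the abstract.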
Funding: Supported by the School Doctoral Fund of Zhengzhou University of Light Industry (No. 2015BSJJ051).
Abstract: Automatic road damage detection using image processing is an important aspect of road maintenance. It is also a challenging problem because road damage is inhomogeneous and the background in road images is complicated. In recent years, methods based on deep convolutional neural networks have been used to address the challenges of road damage detection and classification. In this paper, we propose a new approach to address those challenges. The approach uses densely connected convolutional networks as the backbone of Mask R-CNN to extract image features effectively, a feature pyramid network to combine multi-scale features, a region proposal network to generate road damage regions, and a fully convolutional neural network to classify each road damage region and refine its bounding box. The method not only detects and classifies road damage but also creates a mask of the damage. Experimental results show that the proposed approach achieves better results than other existing methods.
Abstract: This paper proposes a solution for localizing and classifying rice grains in an image. All existing related works rely on conventional machine learning approaches. However, those techniques do not perform well on the problem addressed in this paper because different types of rice grains are highly similar. The proposed solution is therefore based on deep learning. It contains pre-processing steps: data annotation using the watershed algorithm, auto-alignment using the major axis orientation, and image enhancement using the contrast-limited adaptive histogram equalization (CLAHE) technique. Then, a Mask region-based convolutional neural network (Mask R-CNN) is trained to localize and classify rice grains in an input image. Performance is enhanced by transfer learning and by dropout regularization to prevent overfitting. The proposed method is validated in many experimental scenarios, with results reported as mean average precision (mAP) and a confusion matrix. It achieves above 80% mAP in the main scenarios of the experiments. It is also shown to perform outstandingly when compared with human experts.
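The "auto-alignment using the major axis orientation" step can be illustrated with the standard second-order central moment formula, theta = 0.5 * atan2(2 * mu11, mu20 - mu02). This is a sketch under the assumption that a grain is given as a list of (x, y) foreground pixel coordinates; the function name and sample points are hypothetical, and the paper may compute the orientation differently.

```python
import math


def major_axis_angle(points):
    """Return the major-axis orientation in degrees for a set of (x, y) points.

    Uses second-order central moments. The angle is measured from the
    x-axis towards the y-axis of whatever coordinate system the points use.
    """
    n = len(points)
    if n == 0:
        raise ValueError("at least one point is required")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    mu20 = sum((x - mean_x) ** 2 for x, _ in points)
    mu02 = sum((y - mean_y) ** 2 for _, y in points)
    mu11 = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return math.degrees(0.5 * math.atan2(2 * mu11, mu20 - mu02))


# Illustrative inputs (assumption): pixels along a diagonal and along a row.
diagonal = [(0, 0), (1, 1), (2, 2)]
horizontal = [(0, 0), (1, 0), (2, 0)]
print(major_axis_angle(diagonal))
print(major_axis_angle(horizontal))
```

Rotating each grain crop by the negative of this angle would bring its major axis onto the x-axis, which is one way to obtain consistently aligned inputs.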
Funding: Partially supported by the National Natural Science Foundation of China (Nos. 11590772, 11590770) and the Pre-research Project for Equipment of General Information System (No. JZX2017-0994/Y306).
Abstract: This paper presents a deep neural network (DNN)-based speech enhancement algorithm that uses soft audible noise masking for single-channel wind noise reduction. To reduce low-frequency residual noise, a psychoacoustic model is adopted to calculate the masking threshold from the estimated clean speech spectrum. The gain for noise suppression is obtained through soft audible noise masking by comparing the estimated wind noise spectrum with the masking threshold. To deal with abruptly time-varying noisy signals, two separate DNN models are used to estimate the spectra of the clean speech and wind noise components. Experimental results from subjective and objective quality tests show that the proposed algorithm achieves better performance than the conventional DNN-based wind noise reduction method.
Abstract: In pursuit of a safe high-tech society at a time when traffic accidents are frequent, traffic sign detection has become an essential research topic, both now and for the future. The ultimate goal of this research is to identify and classify the types of traffic signs in a panoramic image. To accomplish this goal, the paper proposes a new model for traffic sign detection based on a convolutional neural network for comprehensive traffic sign classification and a Mask region-based convolutional neural network (Mask R-CNN) implementation for identifying and extracting signs in panoramic images. Data augmentation and image normalization are also applied to improve classification even when old traffic signs are degraded, and to considerably reduce the rate of extra detected boxes. The proposed model is tested on both the testing dataset and real images. It achieves a correct sign recognition rate of 94.5%; the classification rate for the detected signs is 99.41%, and the false sign rate is only around 0.11.
Funding: This study was supported by a Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (NRF-2020R1A2C1014829) and by the Soonchunhyang University Research Fund.
Abstract: An otoscope is traditionally used to examine the eardrum and ear canal. A diagnosis of otitis media (OM) relies on the experience of clinicians. If an examiner lacks experience, the examination may be difficult and time-consuming. This paper presents an ear disease classification method using middle ear images based on a convolutional neural network (CNN). Specifically, segmentation and classification networks are used to classify an otoscopic image into six classes: normal, acute otitis media (AOM), otitis media with effusion (OME), chronic otitis media (COM), congenital cholesteatoma (CC), and traumatic perforations (TMPs). Mask R-CNN is used as the segmentation network to extract the region of interest (ROI) from otoscopic images. The extracted ROIs are used as guiding features for the classification. The classification is based on transfer learning with an ensemble of two CNN classifiers: EfficientNetB0 and Inception-V3. The proposed model was trained with 5-fold cross-validation and achieved a classification accuracy of 97.29%.
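The abstract does not say how the two classifiers are combined. One common option is soft voting, which averages the per-class probabilities and picks the largest. The sketch below assumes that scheme; the function names and the probability vectors are invented for illustration and are not outputs of EfficientNetB0 or Inception-V3.

```python
# Class order is an assumption; the labels follow the six classes in the abstract.
CLASSES = ["normal", "AOM", "OME", "COM", "CC", "TMP"]


def soft_vote(probs_a, probs_b):
    """Average two per-class probability vectors element by element."""
    if len(probs_a) != len(probs_b):
        raise ValueError("probability vectors must have the same length")
    return [(a + b) / 2 for a, b in zip(probs_a, probs_b)]


def ensemble_predict(probs_a, probs_b, classes=CLASSES):
    """Return the class label with the highest averaged probability."""
    averaged = soft_vote(probs_a, probs_b)
    best_index = max(range(len(averaged)), key=averaged.__getitem__)
    return classes[best_index]


# Made-up outputs of two classifiers for one image (assumption).
model_a = [0.10, 0.50, 0.30, 0.05, 0.03, 0.02]  # favours AOM
model_b = [0.05, 0.30, 0.55, 0.05, 0.03, 0.02]  # favours OME
label = ensemble_predict(model_a, model_b)
print(label)
```

In this toy case the two models disagree, and the averaged probabilities favour OME (0.425 against 0.40 for AOM), which shows how an ensemble can settle a disagreement between its members.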
Funding: Supported by the Construction Industry Council (Grant No. CICR/01/22) and the General Research Fund (Grant No. 17206822) of the Research Grants Council (Hong Kong).
Abstract: Rock classification plays a crucial role in various fields such as geology, engineering, and environmental studies. Deep learning artificial intelligence (AI) methods have high potential to significantly improve the accuracy and efficiency of this task. The paper explores two cutting-edge AI techniques, Mask DINO and Mask R-CNN (region-based convolutional neural network), as means to identify rock weathering grades and rock types. The results demonstrate that Mask DINO, which is a Detection Transformer (DETR), outperforms Mask R-CNN for both purposes. Mask DINO achieved F1 scores of 91% and 86% in weathering grade detection and rock type detection, as opposed to Mask R-CNN's F1 scores of 84% and 75%, respectively. These findings underscore the substantial potential of DETR algorithms like Mask DINO for automatic classification of both rock type and weathering state. Although the study examines only two AI models, the data processing and other techniques developed in it may serve as a foundation for future advancements in the field. With these AI techniques, logging personnel can obtain valuable references to aid their work, ultimately contributing to the advancement of geology and related fields.
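The F1 scores quoted in this abstract are the harmonic mean of precision and recall, F1 = 2PR / (P + R). The following minimal sketch computes it from detection counts; the counts are invented for illustration and do not come from the paper.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Compute F1 as the harmonic mean of precision and recall.

    Returns 0.0 when there are no true positives, which avoids a
    division by zero in the degenerate case.
    """
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Made-up counts (assumption): 8 correct detections, 2 spurious, 2 missed.
example_f1 = f1_score(true_positives=8, false_positives=2, false_negatives=2)
print(round(example_f1, 3))
```

Because F1 penalises both spurious and missed detections, the gaps reported above (91% against 84%, and 86% against 75%) reflect a combined improvement rather than a gain in precision or recall alone.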
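Abstract: To address the difficulty of mining entity-relation semantics and the bias in predicted relations in relation extraction (RE) tasks, an RE method based on masked prompts and gated memory network calibration (MGMNC) is proposed. First, the masks in the prompt are used to learn the latent semantics between entities in the semantic space of a pre-trained language model (PLM), and a mask attention weight matrix is constructed to associate the discrete mask semantic spaces with one another. Second, a gated calibration network fuses the mask representations, which carry entity and relation semantics, into the global semantics of the sentence. Third, these representations are used as relation prompts to calibrate the relation information, and the final sentence representation is then mapped to the corresponding relation class. Finally, by making better use of the masks in the prompt and combining this with the strength of traditional fine-tuning in learning global sentence semantics, the potential of the PLM is fully exploited. Experimental results show that the proposed method achieves an F1 score of 91.4% on the SemEval (SemEval-2010 Task 8) dataset, 1.0 percentage point higher than the generative method RELA (Relation Extraction with Label Augmentation). On the SciERC (Entities, Relations, and Coreference for Scientific knowledge graph construction) and CLTC (Chinese Literature Text Corpus) datasets, the F1 scores reach 91.0% and 82.8%, respectively. The proposed method clearly outperforms the compared methods on all three datasets, which verifies its effectiveness. Compared with generative approaches, the proposed method achieves better extraction performance.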
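The abstract does not give the equations of the gated calibration network. As a generic illustration of gating only, the sketch below blends a sentence vector s with a mask vector m through an element-wise gate g = sigmoid(w_s * s + w_m * m + b), giving g * m + (1 - g) * s. All names, weights, and vectors here are hypothetical and should not be read as the MGMNC architecture.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(sentence_vec, mask_vec, w_sentence, w_mask, bias):
    """Blend two equal-length vectors with an element-wise learned gate.

    A gate near 1 keeps the mask representation; a gate near 0 keeps
    the sentence representation.
    """
    fused = []
    for s, m, ws, wm, b in zip(sentence_vec, mask_vec, w_sentence, w_mask, bias):
        gate = sigmoid(ws * s + wm * m + b)
        fused.append(gate * m + (1.0 - gate) * s)
    return fused


# Toy vectors (assumption). Zero weights and bias give a gate of 0.5,
# so the output is the plain average of the two inputs.
sentence = [1.0, 0.0]
mask = [0.0, 2.0]
zeros = [0.0, 0.0]
fused = gated_fusion(sentence, mask, zeros, zeros, zeros)
print(fused)
```

In a trained model the gate parameters would be learned, letting the network decide per dimension how much mask-derived relation information to inject into the global sentence representation.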