Pill image recognition is an important field in computer vision.It has become a vital technology in healthcare and pharmaceuticals due to the necessity for precise medication identification to prevent errors and ensur...Pill image recognition is an important field in computer vision.It has become a vital technology in healthcare and pharmaceuticals due to the necessity for precise medication identification to prevent errors and ensure patient safety.This survey examines the current state of pill image recognition,focusing on advancements,methodologies,and the challenges that remain unresolved.It provides a comprehensive overview of traditional image processing-based,machine learning-based,deep learning-based,and hybrid-based methods,and aims to explore the ongoing difficulties in the field.We summarize and classify the methods used in each article,compare the strengths and weaknesses of traditional image processing-based,machine learning-based,deep learning-based,and hybrid-based methods,and review benchmark datasets for pill image recognition.Additionally,we compare the performance of proposed methods on popular benchmark datasets.This survey applies recent advancements,such as Transformer models and cutting-edge technologies like Augmented Reality(AR),to discuss potential research directions and conclude the review.By offering a holistic perspective,this paper aims to serve as a valuable resource for researchers and practitioners striving to advance the field of pill image recognition.展开更多
Considering the difficulty of integrating the depth points of nautical charts of the East China Sea into a global high-precision Grid Digital Elevation Model(Grid-DEM),we proposed a“Fusion based on Image Recognition(...Considering the difficulty of integrating the depth points of nautical charts of the East China Sea into a global high-precision Grid Digital Elevation Model(Grid-DEM),we proposed a“Fusion based on Image Recognition(FIR)”method for multi-sourced depth data fusion,and used it to merge the electronic nautical chart dataset(referred to as Chart2014 in this paper)with the global digital elevation dataset(referred to as Globalbath2002 in this paper).Compared to the traditional fusion of two datasets by direct combination and interpolation,the new Grid-DEM formed by FIR can better represent the data characteristics of Chart2014,reduce the calculation difficulty,and be more intuitive,and,the choice of different interpolation methods in FIR and the influence of the“exclusion radius R”parameter were discussed.FIR avoids complex calculations of spatial distances among points from different sources,and instead uses spatial exclusion map to perform one-step screening based on the exclusion radius R,which greatly improved the fusion status of a reliable dataset.The fusion results of different experiments were analyzed statistically with root mean square error and mean relative error,showing that the interpolation methods based on Delaunay triangulation are more suitable for the fusion of nautical chart depth of China,and factors such as the point density distribution of multiple source data,accuracy,interpolation method,and various terrain conditions should be fully considered when selecting the exclusion radius R.展开更多
Fine-grained Image Recognition(FGIR)task is dedicated to distinguishing similar sub-categories that belong to the same super-category,such as bird species and car types.In order to highlight visual differences,existin...Fine-grained Image Recognition(FGIR)task is dedicated to distinguishing similar sub-categories that belong to the same super-category,such as bird species and car types.In order to highlight visual differences,existing FGIR works often follow two steps:discriminative sub-region localization and local feature representation.However,these works pay less attention on global context information.They neglect a fact that the subtle visual difference in challenging scenarios can be highlighted through exploiting the spatial relationship among different subregions from a global view point.Therefore,in this paper,we consider both global and local information for FGIR,and propose a collaborative teacher-student strategy to reinforce and unity the two types of information.Our framework is implemented mainly by convolutional neural network,referred to Teacher-Student Based Attention Convolutional Neural Network(T-S-ACNN).For fine-grained local information,we choose the classic Multi-Attention Network(MA-Net)as our baseline,and propose a type of boundary constraint to further reduce background noises in the local attention maps.In this way,the discriminative sub-regions tend to appear in the area occupied by fine-grained objects,leading to more accurate sub-region localization.For fine-grained global information,we design a graph convolution based Global Attention Network(GA-Net),which can combine extracted local attention maps from MA-Net with non-local techniques to explore spatial relationship among subregions.At last,we develop a collaborative teacher-student strategy to adaptively determine the attended roles and optimization modes,so as to enhance the cooperative reinforcement of MA-Net and GA-Net.Extensive experiments on CUB-200-2011,Stanford Cars and FGVC Aircraft datasets illustrate the promising performance of our framework.展开更多
Expanding photovoltaic(PV)resources in rural-grid areas is an essential means to augment the share of solar energy in the energy landscape,aligning with the“carbon peaking and carbon neutrality”objectives.However,ru...Expanding photovoltaic(PV)resources in rural-grid areas is an essential means to augment the share of solar energy in the energy landscape,aligning with the“carbon peaking and carbon neutrality”objectives.However,rural power grids often lack digitalization;thus,the load distribution within these areas is not fully known.This hinders the calculation of the available PV capacity and deduction of node voltages.This study proposes a load-distribution modeling approach based on remote-sensing image recognition in pursuit of a scientific framework for developing distributed PV resources in rural grid areas.First,houses in remote-sensing images are accurately recognized using deep-learning techniques based on the YOLOv5 model.The distribution of the houses is then used to estimate the load distribution in the grid area.Next,equally spaced and clustered distribution models are used to adaptively determine the location of the nodes and load power in the distribution lines.Finally,by calculating the connectivity matrix of the nodes,a minimum spanning tree is extracted,the topology of the network is constructed,and the node parameters of the load-distribution model are calculated.The proposed scheme is implemented in a software package and its efficacy is demonstrated by analyzing typical remote-sensing images of rural grid areas.The results underscore the ability of the proposed approach to effectively discern the distribution-line structure and compute the node parameters,thereby offering vital support for determining PV access capability.展开更多
Asparagus stem blight,also known as“asparagus cancer”,is a serious plant disease with a regional distribution.The widespread occurrence of the disease has had a negative impact on the yield and quality of asparagus ...Asparagus stem blight,also known as“asparagus cancer”,is a serious plant disease with a regional distribution.The widespread occurrence of the disease has had a negative impact on the yield and quality of asparagus and has become one of the main problems threatening asparagus production.To improve the ability to accurately identify and localize phenotypic lesions of stem blight in asparagus and to enhance the accuracy of the test,a YOLOv8-CBAM detection algorithm for asparagus stem blight based on YOLOv8 was proposed.The algorithm aims to achieve rapid detection of phenotypic images of asparagus stem blight and to provide effective assistance in the control of asparagus stem blight.To enhance the model’s capacity to capture subtle lesion features,the Convolutional Block AttentionModule(CBAM)is added after C2f in the head.Simultaneously,the original CIoU loss function in YOLOv8 was replaced with the Focal-EIoU loss function,ensuring that the updated loss function emphasizes higher-quality bounding boxes.The YOLOv8-CBAM algorithm can effectively detect asparagus stem blight phenotypic images with a mean average precision(mAP)of 95.51%,which is 0.22%,14.99%,1.77%,and 5.71%higher than the YOLOv5,YOLOv7,YOLOv8,and Mask R-CNN models,respectively.This greatly enhances the efficiency of asparagus growers in identifying asparagus stem blight,aids in improving the prevention and control of asparagus stem blight,and is crucial for the application of computer vision in agriculture.展开更多
Complex plasma widely exists in thin film deposition,material surface modification,and waste gas treatment in industrial plasma processes.During complex plasma discharge,the configuration,distribution,and size of part...Complex plasma widely exists in thin film deposition,material surface modification,and waste gas treatment in industrial plasma processes.During complex plasma discharge,the configuration,distribution,and size of particles,as well as the discharge glow,strongly depend on discharge parameters.However,traditional manual diagnosis methods for recognizing discharge parameters from discharge images are complicated to operate with low accuracy,time-consuming and high requirement of instruments.To solve these problems,by combining the two mechanisms of attention mechanism(strengthening the extraction of the channel feature)and shortcut connection(enabling the input information to be directly transmitted to deep networks and avoiding the disappearance or explosion of gradients),the network of squeeze and excitation convolution with shortcut(SECS)for complex plasma image recognition is proposed to effectively improve the model performance.The results show that the accuracy,precision,recall and F1-Score of our model are superior to other models in complex plasma image recognition,and the recognition accuracy reaches 97.38%.Moreover,the recognition accuracy for the Flowers and Chest X-ray publicly available data sets reaches 97.85%and 98.65%,respectively,and our model has robustness.This study shows that the proposed model provides a new method for the diagnosis of complex plasma images and also provides technical support for the application of plasma in industrial production.展开更多
This paper introduces an intelligent image recognition system integrated into a wheelchair based on deep learning in cold environments,aiming to improve the convenience and safety of disabled individuals.The system ad...This paper introduces an intelligent image recognition system integrated into a wheelchair based on deep learning in cold environments,aiming to improve the convenience and safety of disabled individuals.The system adopts advanced image recognition technology to monitor road conditions in real-time through the camera and to detect and measure distance to foreign objects on the road.The system visualizes the detection results on the wheelchair screen to assist the user in avoiding and improving the safety of their daily travel.In addition,the system also includes crawler tracks,seat heating,snow and rain protection,and other functions.The wheelchair has a wide range of application prospects and development potential.It is expected to be widely used in the future,providing a strong guarantee for the safe travel of disabled individuals in China.展开更多
The automated picking technology of tea is an important part of the development of smart agriculture, which affects the development of the tea industry to a certain extent. Tea leaf recognition and robotic tea picking...The automated picking technology of tea is an important part of the development of smart agriculture, which affects the development of the tea industry to a certain extent. Tea leaf recognition and robotic tea picking end-effector are the key technologies for automated tea picking. This paper proposes a set of algorithms for tea leaf differentiation and recognition based on the principle of colour difference. And on the basis of this algorithm, a tea picking end-effector is designed. The experiments show that the designed tea picking end-effector has good recognition ability and high tea picking speed.展开更多
The traditional synthetic aperture radar(SAR) image recognition techniques focus on the electro magnetic (EM) scattering centers, ignoring the important role of the shadow information on the SAR image recognition....The traditional synthetic aperture radar(SAR) image recognition techniques focus on the electro magnetic (EM) scattering centers, ignoring the important role of the shadow information on the SAR image recognition. It is difficult to classify targets by the shadow information independently, because the shadow shape is dependent on the radar aspect angle, the depression angle and the resolution. Moreover, the shadow shapes of different targets are similar. When the multiple SAR images of one target from different aspects are available, the performance of the target recognition can be improved. Aimed at the problem, a multi-aspect SAR image recognition technique based on the shadow information is developed. It extracts shadow profiles from SAR images, and takes chain codes as the feature vectors of targets. Then, feature vectors on multiple aspects of the same target are combined with feature sequences, and the hidden Markov model (HMM) is applied to the feature sequences for the target recognition. The simulation result shows the effectiveness of the method.展开更多
Image recognition has always been a hot research topic in the scientific community and industry.The emergence of convolutional neural networks(CNN)has made this technology turned into research focus on the field of co...Image recognition has always been a hot research topic in the scientific community and industry.The emergence of convolutional neural networks(CNN)has made this technology turned into research focus on the field of computer vision,especially in image recognition.But it makes the recognition result largely dependent on the number and quality of training samples.Recently,DCGAN has become a frontier method for generating images,sounds,and videos.In this paper,DCGAN is used to generate sample that is difficult to collect and proposed an efficient design method of generating model.We combine DCGAN with CNN for the second time.Use DCGAN to generate samples and training in image recognition model,which based by CNN.This method can enhance the classification model and effectively improve the accuracy of image recognition.In the experiment,we used the radar profile as dataset for 4 categories and achieved satisfactory classification performance.This paper applies image recognition technology to the meteorological field.展开更多
Based on the Fourier transform, a new shape descriptor was proposed to represent the flame image. By employing the shape descriptor as the input, the flame image recognition was studied by the methods of the artificia...Based on the Fourier transform, a new shape descriptor was proposed to represent the flame image. By employing the shape descriptor as the input, the flame image recognition was studied by the methods of the artificial neural network(ANN) and the support vector machine(SVM) respectively. And the recognition experiments were carried out by using flame image data sampled from an alumina rotary kiln to evaluate their effectiveness. The results show that the two recognition methods can achieve good results, which verify the effectiveness of the shape descriptor. The highest recognition rate is 88.83% for SVM and 87.38% for ANN, which means that the performance of the SVM is better than that of the ANN.展开更多
With the continuous progress of The Times and the development of technology,the rise of network social media has also brought the“explosive”growth of image data.As one of the main ways of People’s Daily communicati...With the continuous progress of The Times and the development of technology,the rise of network social media has also brought the“explosive”growth of image data.As one of the main ways of People’s Daily communication,image is widely used as a carrier of communication because of its rich content,intuitive and other advantages.Image recognition based on convolution neural network is the first application in the field of image recognition.A series of algorithm operations such as image eigenvalue extraction,recognition and convolution are used to identify and analyze different images.The rapid development of artificial intelligence makes machine learning more and more important in its research field.Use algorithms to learn each piece of data and predict the outcome.This has become an important key to open the door of artificial intelligence.In machine vision,image recognition is the foundation,but how to associate the low-level information in the image with the high-level image semantics becomes the key problem of image recognition.Predecessors have provided many model algorithms,which have laid a solid foundation for the development of artificial intelligence and image recognition.The multi-level information fusion model based on the VGG16 model is an improvement on the fully connected neural network.Different from full connection network,convolutional neural network does not use full connection method in each layer of neurons of neural network,but USES some nodes for connection.Although this method reduces the computation time,due to the fact that the convolutional neural network model will lose some useful feature information in the process of propagation and calculation,this paper improves the model to be a multi-level information fusion of the convolution calculation method,and further recovers the discarded feature information,so as to improve the recognition rate of the image.VGG divides the network into five groups(mimicking the five layers of AlexNet),yet it USES 3*3 filters and combines them as a convolution sequence.Network deeper DCNN,channel number is bigger.The recognition rate of the model was verified by 0RL Face Database,BioID Face Database and CASIA Face Image Database.展开更多
With the development of Deep Convolutional Neural Networks(DCNNs),the extracted features for image recognition tasks have shifted from low-level features to the high-level semantic features of DCNNs.Previous studies h...With the development of Deep Convolutional Neural Networks(DCNNs),the extracted features for image recognition tasks have shifted from low-level features to the high-level semantic features of DCNNs.Previous studies have shown that the deeper the network is,the more abstract the features are.However,the recognition ability of deep features would be limited by insufficient training samples.To address this problem,this paper derives an improved Deep Fusion Convolutional Neural Network(DF-Net)which can make full use of the differences and complementarities during network learning and enhance feature expression under the condition of limited datasets.Specifically,DF-Net organizes two identical subnets to extract features from the input image in parallel,and then a well-designed fusion module is introduced to the deep layer of DF-Net to fuse the subnet’s features in multi-scale.Thus,the more complex mappings are created and the more abundant and accurate fusion features can be extracted to improve recognition accuracy.Furthermore,a corresponding training strategy is also proposed to speed up the convergence and reduce the computation overhead of network training.Finally,DF-Nets based on the well-known ResNet,DenseNet and MobileNetV2 are evaluated on CIFAR100,Stanford Dogs,and UECFOOD-100.Theoretical analysis and experimental results strongly demonstrate that DF-Net enhances the performance of DCNNs and increases the accuracy of image recognition.展开更多
In recent years,deep convolution neural network has exhibited excellent performance in computer vision and has a far-reaching impact.Traditional plant taxonomic identification requires high expertise,which is time-con...In recent years,deep convolution neural network has exhibited excellent performance in computer vision and has a far-reaching impact.Traditional plant taxonomic identification requires high expertise,which is time-consuming.Most nature reserves have problems such as incomplete species surveys,inaccurate taxonomic identification,and untimely updating of status data.Simple and accurate recognition of plant images can be achieved by applying convolutional neural network technology to explore the best network model.Taking 24 typical desert plant species that are widely distributed in the nature reserves in Xinjiang Uygur Autonomous Region of China as the research objects,this study established an image database and select the optimal network model for the image recognition of desert plant species to provide decision support for fine management in the nature reserves in Xinjiang,such as species investigation and monitoring,by using deep learning.Since desert plant species were not included in the public dataset,the images used in this study were mainly obtained through field shooting and downloaded from the Plant Photo Bank of China(PPBC).After the sorting process and statistical analysis,a total of 2331 plant images were finally collected(2071 images from field collection and 260 images from the PPBC),including 24 plant species belonging to 14 families and 22 genera.A large number of numerical experiments were also carried out to compare a series of 37 convolutional neural network models with good performance,from different perspectives,to find the optimal network model that is most suitable for the image recognition of desert plant species in Xinjiang.The results revealed 24 models with a recognition Accuracy,of greater than 70.000%.Among which,Residual Network X_8GF(RegNetX_8GF)performs the best,with Accuracy,Precision,Recall,and F1(which refers to the harmonic mean of the Precision and Recall values)values of 78.33%,77.65%,69.55%,and 71.26%,respectively.Considering the demand factors of hardware equipment and inference time,Mobile NetworkV2 achieves the best balance among the Accuracy,the number of parameters and the number of floating-point operations.The number of parameters for Mobile Network V2(MobileNetV2)is 1/16 of RegNetX_8GF,and the number of floating-point operations is 1/24.Our findings can facilitate efficient decision-making for the management of species survey,cataloging,inspection,and monitoring in the nature reserves in Xinjiang,providing a scientific basis for the protection and utilization of natural plant resources.展开更多
A new image recognition method based on fuzzy rough sets theory is proposed, and its implementation discussed. The performance of this method as applied to ferrography image recognition is evaluated. It is shown that...A new image recognition method based on fuzzy rough sets theory is proposed, and its implementation discussed. The performance of this method as applied to ferrography image recognition is evaluated. It is shown that the new method gives better results than fuzzy or rough sets method when used alone.展开更多
Fast and accurate determination of effective bentonite content in used clay bonded sand is very important for selecting the correct mixing ratio and mixing process to obtain high-performance molding sand. Currently, t...Fast and accurate determination of effective bentonite content in used clay bonded sand is very important for selecting the correct mixing ratio and mixing process to obtain high-performance molding sand. Currently, the effective bentonite content is determined by testing the ethylene blue absorbed in used clay bonded sand, which is usually a manual operation with some disadvantages including complicated process, long testing time and low accuracy. A rapid automatic analyzer of the effective bentonite content in used clay bonded sand was developed based on image recognition technology. The instrument consists of auto stirring, auto liquid removal, auto titration, step-rotation and image acquisition components, and processor. The principle of the image recognition method is first to decompose the color images into three-channel gray images based on the photosensitive degree difference of the light blue and dark blue in the three channels of red, green and blue, then to make the gray values subtraction calculation and gray level transformation of the gray images, and finally, to extract the outer circle light blue halo and the inner circle blue spot and calculate their area ratio. The titration process can be judged to reach the end-point while the area ratio is higher than the setting value.展开更多
A new gray-spatial histogram is proposed, which incorporates spatial informatio n with gray compositions without sacrificing the robustness of traditional gray histograms. The purpose is to consider the representation...A new gray-spatial histogram is proposed, which incorporates spatial informatio n with gray compositions without sacrificing the robustness of traditional gray histograms. The purpose is to consider the representation role of gray compositi ons and spatial information simultaneously. Each entry in the gray-spatial hist ogram is the gray frequency and corresponding position information of images. In the experiments of sonar image recognition, the results show that the gray-spa tial histogram is effective in practical use.展开更多
The fine-grained ship image recognition task aims to identify various classes of ships.However,small inter-class,large intra-class differences between ships,and lacking of training samples are the reasons that make th...The fine-grained ship image recognition task aims to identify various classes of ships.However,small inter-class,large intra-class differences between ships,and lacking of training samples are the reasons that make the task difficult.Therefore,to enhance the accuracy of the fine-grained ship image recognition,we design a fine-grained ship image recognition network based on bilinear convolutional neural network(BCNN)with Inception and additive margin Softmax(AM-Softmax).This network improves the BCNN in two aspects.Firstly,by introducing Inception branches to the BCNN network,it is helpful to enhance the ability of extracting comprehensive features from ships.Secondly,by adding margin values to the decision boundary,the AM-Softmax function can better extend the inter-class differences and reduce the intra-class differences.In addition,as there are few publicly available datasets for fine-grained ship image recognition,we construct a Ship-43 dataset containing 47,300 ship images belonging to 43 categories.Experimental results on the constructed Ship-43 dataset demonstrate that our method can effectively improve the accuracy of ship image recognition,which is 4.08%higher than the BCNN model.Moreover,comparison results on the other three public fine-grained datasets(Cub,Cars,and Aircraft)further validate the effectiveness of the proposed method.展开更多
As the COVID-19 epidemic spread across the globe,people around the world were advised or mandated to wear masks in public places to prevent its spreading further.In some cases,not wearing a mask could result in a fine...As the COVID-19 epidemic spread across the globe,people around the world were advised or mandated to wear masks in public places to prevent its spreading further.In some cases,not wearing a mask could result in a fine.To monitor mask wearing,and to prevent the spread of future epidemics,this study proposes an image recognition system consisting of a camera,an infrared thermal array sensor,and a convolutional neural network trained in mask recognition.The infrared sensor monitors body temperature and displays the results in real-time on a liquid crystal display screen.The proposed system reduces the inefficiency of traditional object detection by providing training data according to the specific needs of the user and by applying You Only Look Once Version 4(YOLOv4)object detection technology,which experiments show has more efficient training parameters and a higher level of accuracy in object recognition.All datasets are uploaded to the cloud for storage using Google Colaboratory,saving human resources and achieving a high level of efficiency at a low cost.展开更多
In this paper, the characters of the ferrography and image recognitiontechnology are analyzed. The fault diagnosis system for the power device based on the ferrographyand image recognition technology is designed. At t...In this paper, the characters of the ferrography and image recognitiontechnology are analyzed. The fault diagnosis system for the power device based on the ferrographyand image recognition technology is designed. At the same time, the structure, the design andimplementing method, and the functions of each module of this system are described in detail.展开更多
文摘Pill image recognition is an important field in computer vision.It has become a vital technology in healthcare and pharmaceuticals due to the necessity for precise medication identification to prevent errors and ensure patient safety.This survey examines the current state of pill image recognition,focusing on advancements,methodologies,and the challenges that remain unresolved.It provides a comprehensive overview of traditional image processing-based,machine learning-based,deep learning-based,and hybrid-based methods,and aims to explore the ongoing difficulties in the field.We summarize and classify the methods used in each article,compare the strengths and weaknesses of traditional image processing-based,machine learning-based,deep learning-based,and hybrid-based methods,and review benchmark datasets for pill image recognition.Additionally,we compare the performance of proposed methods on popular benchmark datasets.This survey applies recent advancements,such as Transformer models and cutting-edge technologies like Augmented Reality(AR),to discuss potential research directions and conclude the review.By offering a holistic perspective,this paper aims to serve as a valuable resource for researchers and practitioners striving to advance the field of pill image recognition.
基金Supported by the National Key R&D Program of China (No.2023YFC3008100)the National Natural Science Foundation of China (No.U23A2033)
文摘Considering the difficulty of integrating the depth points of nautical charts of the East China Sea into a global high-precision Grid Digital Elevation Model(Grid-DEM),we proposed a“Fusion based on Image Recognition(FIR)”method for multi-sourced depth data fusion,and used it to merge the electronic nautical chart dataset(referred to as Chart2014 in this paper)with the global digital elevation dataset(referred to as Globalbath2002 in this paper).Compared to the traditional fusion of two datasets by direct combination and interpolation,the new Grid-DEM formed by FIR can better represent the data characteristics of Chart2014,reduce the calculation difficulty,and be more intuitive,and,the choice of different interpolation methods in FIR and the influence of the“exclusion radius R”parameter were discussed.FIR avoids complex calculations of spatial distances among points from different sources,and instead uses spatial exclusion map to perform one-step screening based on the exclusion radius R,which greatly improved the fusion status of a reliable dataset.The fusion results of different experiments were analyzed statistically with root mean square error and mean relative error,showing that the interpolation methods based on Delaunay triangulation are more suitable for the fusion of nautical chart depth of China,and factors such as the point density distribution of multiple source data,accuracy,interpolation method,and various terrain conditions should be fully considered when selecting the exclusion radius R.
基金supported by the National Natural Science Foundation of China,China (Grants No.62171232)the Priority Academic Program Development of Jiangsu Higher Education Institutions,China。
文摘Fine-grained Image Recognition(FGIR)task is dedicated to distinguishing similar sub-categories that belong to the same super-category,such as bird species and car types.In order to highlight visual differences,existing FGIR works often follow two steps:discriminative sub-region localization and local feature representation.However,these works pay less attention on global context information.They neglect a fact that the subtle visual difference in challenging scenarios can be highlighted through exploiting the spatial relationship among different subregions from a global view point.Therefore,in this paper,we consider both global and local information for FGIR,and propose a collaborative teacher-student strategy to reinforce and unity the two types of information.Our framework is implemented mainly by convolutional neural network,referred to Teacher-Student Based Attention Convolutional Neural Network(T-S-ACNN).For fine-grained local information,we choose the classic Multi-Attention Network(MA-Net)as our baseline,and propose a type of boundary constraint to further reduce background noises in the local attention maps.In this way,the discriminative sub-regions tend to appear in the area occupied by fine-grained objects,leading to more accurate sub-region localization.For fine-grained global information,we design a graph convolution based Global Attention Network(GA-Net),which can combine extracted local attention maps from MA-Net with non-local techniques to explore spatial relationship among subregions.At last,we develop a collaborative teacher-student strategy to adaptively determine the attended roles and optimization modes,so as to enhance the cooperative reinforcement of MA-Net and GA-Net.Extensive experiments on CUB-200-2011,Stanford Cars and FGVC Aircraft datasets illustrate the promising performance of our framework.
基金supported by the State Grid Science&Technology Project of China(5400-202224153A-1-1-ZN).
文摘Expanding photovoltaic(PV)resources in rural-grid areas is an essential means to augment the share of solar energy in the energy landscape,aligning with the“carbon peaking and carbon neutrality”objectives.However,rural power grids often lack digitalization;thus,the load distribution within these areas is not fully known.This hinders the calculation of the available PV capacity and deduction of node voltages.This study proposes a load-distribution modeling approach based on remote-sensing image recognition in pursuit of a scientific framework for developing distributed PV resources in rural grid areas.First,houses in remote-sensing images are accurately recognized using deep-learning techniques based on the YOLOv5 model.The distribution of the houses is then used to estimate the load distribution in the grid area.Next,equally spaced and clustered distribution models are used to adaptively determine the location of the nodes and load power in the distribution lines.Finally,by calculating the connectivity matrix of the nodes,a minimum spanning tree is extracted,the topology of the network is constructed,and the node parameters of the load-distribution model are calculated.The proposed scheme is implemented in a software package and its efficacy is demonstrated by analyzing typical remote-sensing images of rural grid areas.The results underscore the ability of the proposed approach to effectively discern the distribution-line structure and compute the node parameters,thereby offering vital support for determining PV access capability.
基金supported by the Feicheng Artificial Intelligence Robot and Smart Agriculture Service Platform(381387).
文摘Asparagus stem blight,also known as“asparagus cancer”,is a serious plant disease with a regional distribution.The widespread occurrence of the disease has had a negative impact on the yield and quality of asparagus and has become one of the main problems threatening asparagus production.To improve the ability to accurately identify and localize phenotypic lesions of stem blight in asparagus and to enhance the accuracy of the test,a YOLOv8-CBAM detection algorithm for asparagus stem blight based on YOLOv8 was proposed.The algorithm aims to achieve rapid detection of phenotypic images of asparagus stem blight and to provide effective assistance in the control of asparagus stem blight.To enhance the model’s capacity to capture subtle lesion features,the Convolutional Block AttentionModule(CBAM)is added after C2f in the head.Simultaneously,the original CIoU loss function in YOLOv8 was replaced with the Focal-EIoU loss function,ensuring that the updated loss function emphasizes higher-quality bounding boxes.The YOLOv8-CBAM algorithm can effectively detect asparagus stem blight phenotypic images with a mean average precision(mAP)of 95.51%,which is 0.22%,14.99%,1.77%,and 5.71%higher than the YOLOv5,YOLOv7,YOLOv8,and Mask R-CNN models,respectively.This greatly enhances the efficiency of asparagus growers in identifying asparagus stem blight,aids in improving the prevention and control of asparagus stem blight,and is crucial for the application of computer vision in agriculture.
基金This study was supported by a grand from the National Natural Science Foundation of China(No.12075315).
文摘Complex plasma widely exists in thin film deposition,material surface modification,and waste gas treatment in industrial plasma processes.During complex plasma discharge,the configuration,distribution,and size of particles,as well as the discharge glow,strongly depend on discharge parameters.However,traditional manual diagnosis methods for recognizing discharge parameters from discharge images are complicated to operate with low accuracy,time-consuming and high requirement of instruments.To solve these problems,by combining the two mechanisms of attention mechanism(strengthening the extraction of the channel feature)and shortcut connection(enabling the input information to be directly transmitted to deep networks and avoiding the disappearance or explosion of gradients),the network of squeeze and excitation convolution with shortcut(SECS)for complex plasma image recognition is proposed to effectively improve the model performance.The results show that the accuracy,precision,recall and F1-Score of our model are superior to other models in complex plasma image recognition,and the recognition accuracy reaches 97.38%.Moreover,the recognition accuracy for the Flowers and Chest X-ray publicly available data sets reaches 97.85%and 98.65%,respectively,and our model has robustness.This study shows that the proposed model provides a new method for the diagnosis of complex plasma images and also provides technical support for the application of plasma in industrial production.
文摘This paper introduces an intelligent image recognition system integrated into a wheelchair based on deep learning in cold environments,aiming to improve the convenience and safety of disabled individuals.The system adopts advanced image recognition technology to monitor road conditions in real-time through the camera and to detect and measure distance to foreign objects on the road.The system visualizes the detection results on the wheelchair screen to assist the user in avoiding and improving the safety of their daily travel.In addition,the system also includes crawler tracks,seat heating,snow and rain protection,and other functions.The wheelchair has a wide range of application prospects and development potential.It is expected to be widely used in the future,providing a strong guarantee for the safe travel of disabled individuals in China.
文摘The automated picking technology of tea is an important part of the development of smart agriculture, which affects the development of the tea industry to a certain extent. Tea leaf recognition and robotic tea picking end-effector are the key technologies for automated tea picking. This paper proposes a set of algorithms for tea leaf differentiation and recognition based on the principle of colour difference. And on the basis of this algorithm, a tea picking end-effector is designed. The experiments show that the designed tea picking end-effector has good recognition ability and high tea picking speed.
文摘The traditional synthetic aperture radar(SAR) image recognition techniques focus on the electro magnetic (EM) scattering centers, ignoring the important role of the shadow information on the SAR image recognition. It is difficult to classify targets by the shadow information independently, because the shadow shape is dependent on the radar aspect angle, the depression angle and the resolution. Moreover, the shadow shapes of different targets are similar. When the multiple SAR images of one target from different aspects are available, the performance of the target recognition can be improved. Aimed at the problem, a multi-aspect SAR image recognition technique based on the shadow information is developed. It extracts shadow profiles from SAR images, and takes chain codes as the feature vectors of targets. Then, feature vectors on multiple aspects of the same target are combined with feature sequences, and the hidden Markov model (HMM) is applied to the feature sequences for the target recognition. The simulation result shows the effectiveness of the method.
文摘Image recognition has always been a hot research topic in the scientific community and industry.The emergence of convolutional neural networks(CNN)has made this technology turned into research focus on the field of computer vision,especially in image recognition.But it makes the recognition result largely dependent on the number and quality of training samples.Recently,DCGAN has become a frontier method for generating images,sounds,and videos.In this paper,DCGAN is used to generate sample that is difficult to collect and proposed an efficient design method of generating model.We combine DCGAN with CNN for the second time.Use DCGAN to generate samples and training in image recognition model,which based by CNN.This method can enhance the classification model and effectively improve the accuracy of image recognition.In the experiment,we used the radar profile as dataset for 4 categories and achieved satisfactory classification performance.This paper applies image recognition technology to the meteorological field.
基金Project(60634020) supported by the National Natural Science Foundation of China
文摘Based on the Fourier transform, a new shape descriptor was proposed to represent the flame image. By employing the shape descriptor as the input, the flame image recognition was studied by the methods of the artificial neural network(ANN) and the support vector machine(SVM) respectively. And the recognition experiments were carried out by using flame image data sampled from an alumina rotary kiln to evaluate their effectiveness. The results show that the two recognition methods can achieve good results, which verify the effectiveness of the shape descriptor. The highest recognition rate is 88.83% for SVM and 87.38% for ANN, which means that the performance of the SVM is better than that of the ANN.
文摘With the continuous progress of The Times and the development of technology,the rise of network social media has also brought the“explosive”growth of image data.As one of the main ways of People’s Daily communication,image is widely used as a carrier of communication because of its rich content,intuitive and other advantages.Image recognition based on convolution neural network is the first application in the field of image recognition.A series of algorithm operations such as image eigenvalue extraction,recognition and convolution are used to identify and analyze different images.The rapid development of artificial intelligence makes machine learning more and more important in its research field.Use algorithms to learn each piece of data and predict the outcome.This has become an important key to open the door of artificial intelligence.In machine vision,image recognition is the foundation,but how to associate the low-level information in the image with the high-level image semantics becomes the key problem of image recognition.Predecessors have provided many model algorithms,which have laid a solid foundation for the development of artificial intelligence and image recognition.The multi-level information fusion model based on the VGG16 model is an improvement on the fully connected neural network.Different from full connection network,convolutional neural network does not use full connection method in each layer of neurons of neural network,but USES some nodes for connection.Although this method reduces the computation time,due to the fact that the convolutional neural network model will lose some useful feature information in the process of propagation and calculation,this paper improves the model to be a multi-level information fusion of the convolution calculation method,and further recovers the discarded feature information,so as to improve the recognition rate of the image.VGG divides the network into five groups(mimicking the five layers of AlexNet),yet it USES 3*3 filters and combines them as a convolution sequence.Network deeper DCNN,channel number is bigger.The recognition rate of the model was verified by 0RL Face Database,BioID Face Database and CASIA Face Image Database.
基金This work is partially supported by National Natural Foundation of China(Grant No.61772561)the Key Research&Development Plan of Hunan Province(Grant No.2018NK2012)+2 种基金the Degree&Postgraduate Education Reform Project of Hunan Province(Grant No.2019JGYB154)the Postgraduate Excellent teaching team Project of Hunan Province(Grant[2019]370-133)Teaching Reform Project of Central South University of Forestry and Technology(Grant No.20180682).
文摘With the development of Deep Convolutional Neural Networks(DCNNs),the extracted features for image recognition tasks have shifted from low-level features to the high-level semantic features of DCNNs.Previous studies have shown that the deeper the network is,the more abstract the features are.However,the recognition ability of deep features would be limited by insufficient training samples.To address this problem,this paper derives an improved Deep Fusion Convolutional Neural Network(DF-Net)which can make full use of the differences and complementarities during network learning and enhance feature expression under the condition of limited datasets.Specifically,DF-Net organizes two identical subnets to extract features from the input image in parallel,and then a well-designed fusion module is introduced to the deep layer of DF-Net to fuse the subnet’s features in multi-scale.Thus,the more complex mappings are created and the more abundant and accurate fusion features can be extracted to improve recognition accuracy.Furthermore,a corresponding training strategy is also proposed to speed up the convergence and reduce the computation overhead of network training.Finally,DF-Nets based on the well-known ResNet,DenseNet and MobileNetV2 are evaluated on CIFAR100,Stanford Dogs,and UECFOOD-100.Theoretical analysis and experimental results strongly demonstrate that DF-Net enhances the performance of DCNNs and increases the accuracy of image recognition.
基金supported by the West Light Foundation of the Chinese Academy of Sciences(2019-XBQNXZ-A-007)the National Natural Science Foundation of China(12071458,71731009).
文摘In recent years,deep convolution neural network has exhibited excellent performance in computer vision and has a far-reaching impact.Traditional plant taxonomic identification requires high expertise,which is time-consuming.Most nature reserves have problems such as incomplete species surveys,inaccurate taxonomic identification,and untimely updating of status data.Simple and accurate recognition of plant images can be achieved by applying convolutional neural network technology to explore the best network model.Taking 24 typical desert plant species that are widely distributed in the nature reserves in Xinjiang Uygur Autonomous Region of China as the research objects,this study established an image database and select the optimal network model for the image recognition of desert plant species to provide decision support for fine management in the nature reserves in Xinjiang,such as species investigation and monitoring,by using deep learning.Since desert plant species were not included in the public dataset,the images used in this study were mainly obtained through field shooting and downloaded from the Plant Photo Bank of China(PPBC).After the sorting process and statistical analysis,a total of 2331 plant images were finally collected(2071 images from field collection and 260 images from the PPBC),including 24 plant species belonging to 14 families and 22 genera.A large number of numerical experiments were also carried out to compare a series of 37 convolutional neural network models with good performance,from different perspectives,to find the optimal network model that is most suitable for the image recognition of desert plant species in Xinjiang.The results revealed 24 models with a recognition Accuracy,of greater than 70.000%.Among which,Residual Network X_8GF(RegNetX_8GF)performs the best,with Accuracy,Precision,Recall,and F1(which refers to the harmonic mean of the Precision and Recall values)values of 78.33%,77.65%,69.55%,and 71.26%,respectively.Considering the demand factors of hardware equipment and inference time,Mobile NetworkV2 achieves the best balance among the Accuracy,the number of parameters and the number of floating-point operations.The number of parameters for Mobile Network V2(MobileNetV2)is 1/16 of RegNetX_8GF,and the number of floating-point operations is 1/24.Our findings can facilitate efficient decision-making for the management of species survey,cataloging,inspection,and monitoring in the nature reserves in Xinjiang,providing a scientific basis for the protection and utilization of natural plant resources.
文摘A new image recognition method based on fuzzy rough sets theory is proposed, and its implementation discussed. The performance of this method as applied to ferrography image recognition is evaluated. It is shown that the new method gives better results than fuzzy or rough sets method when used alone.
基金financially supported by the Natural Science Foundation of Hubei Province of China(2014CFB582)
文摘Fast and accurate determination of effective bentonite content in used clay bonded sand is very important for selecting the correct mixing ratio and mixing process to obtain high-performance molding sand. Currently, the effective bentonite content is determined by testing the ethylene blue absorbed in used clay bonded sand, which is usually a manual operation with some disadvantages including complicated process, long testing time and low accuracy. A rapid automatic analyzer of the effective bentonite content in used clay bonded sand was developed based on image recognition technology. The instrument consists of auto stirring, auto liquid removal, auto titration, step-rotation and image acquisition components, and processor. The principle of the image recognition method is first to decompose the color images into three-channel gray images based on the photosensitive degree difference of the light blue and dark blue in the three channels of red, green and blue, then to make the gray values subtraction calculation and gray level transformation of the gray images, and finally, to extract the outer circle light blue halo and the inner circle blue spot and calculate their area ratio. The titration process can be judged to reach the end-point while the area ratio is higher than the setting value.
文摘A new gray-spatial histogram is proposed, which incorporates spatial informatio n with gray compositions without sacrificing the robustness of traditional gray histograms. The purpose is to consider the representation role of gray compositi ons and spatial information simultaneously. Each entry in the gray-spatial hist ogram is the gray frequency and corresponding position information of images. In the experiments of sonar image recognition, the results show that the gray-spa tial histogram is effective in practical use.
基金This work is supported by the National Natural Science Foundation of China(61806013,61876010,62176009,and 61906005)General project of Science and Technology Planof Beijing Municipal Education Commission(KM202110005028)+2 种基金Beijing Municipal Education Commission Project(KZ201910005008)Project of Interdisciplinary Research Institute of Beijing University of Technology(2021020101)International Research Cooperation Seed Fund of Beijing University of Technology(2021A01).
文摘The fine-grained ship image recognition task aims to identify various classes of ships.However,small inter-class,large intra-class differences between ships,and lacking of training samples are the reasons that make the task difficult.Therefore,to enhance the accuracy of the fine-grained ship image recognition,we design a fine-grained ship image recognition network based on bilinear convolutional neural network(BCNN)with Inception and additive margin Softmax(AM-Softmax).This network improves the BCNN in two aspects.Firstly,by introducing Inception branches to the BCNN network,it is helpful to enhance the ability of extracting comprehensive features from ships.Secondly,by adding margin values to the decision boundary,the AM-Softmax function can better extend the inter-class differences and reduce the intra-class differences.In addition,as there are few publicly available datasets for fine-grained ship image recognition,we construct a Ship-43 dataset containing 47,300 ship images belonging to 43 categories.Experimental results on the constructed Ship-43 dataset demonstrate that our method can effectively improve the accuracy of ship image recognition,which is 4.08%higher than the BCNN model.Moreover,comparison results on the other three public fine-grained datasets(Cub,Cars,and Aircraft)further validate the effectiveness of the proposed method.
文摘As the COVID-19 epidemic spread across the globe,people around the world were advised or mandated to wear masks in public places to prevent its spreading further.In some cases,not wearing a mask could result in a fine.To monitor mask wearing,and to prevent the spread of future epidemics,this study proposes an image recognition system consisting of a camera,an infrared thermal array sensor,and a convolutional neural network trained in mask recognition.The infrared sensor monitors body temperature and displays the results in real-time on a liquid crystal display screen.The proposed system reduces the inefficiency of traditional object detection by providing training data according to the specific needs of the user and by applying You Only Look Once Version 4(YOLOv4)object detection technology,which experiments show has more efficient training parameters and a higher level of accuracy in object recognition.All datasets are uploaded to the cloud for storage using Google Colaboratory,saving human resources and achieving a high level of efficiency at a low cost.
文摘In this paper, the characters of the ferrography and image recognitiontechnology are analyzed. The fault diagnosis system for the power device based on the ferrographyand image recognition technology is designed. At the same time, the structure, the design andimplementing method, and the functions of each module of this system are described in detail.