Objectives This study aimed to design and evaluate a detection system for the accidental dislodgement of head-and-neck medical supplies through hand position recognition and tracking in Intensive Care Unit (ICU) patients. Methods We conducted a single-center, prospective, parallel-group feasibility randomized controlled trial. We recruited 80 participants using convenience sampling from the ICU of a hospital in Ningbo City, Zhejiang Province, between March 2025 and June 2025, and randomly assigned them to either the control group (routine care) or the intervention group (routine care plus an image recognition-based detection system). The system continuously tracked patients' hand positions via bedside cameras and generated real-time alarms when hands entered predefined risk zones, notifying on-duty nurses to enable early intervention. System stability was assessed by continuous system uptime; system performance and clinical feasibility were evaluated by the frequencies of risk actions and accidental dislodgement of medical supplies (ADMS). Results All 80 participants completed the intervention, with 40 patients in each group. The baseline characteristics and median observation time of the two groups were balanced (intervention group: 48 h/patient vs. control group: 49 h/patient). Compared with the control group, the intervention group showed fewer ADMS (2/40 vs. 9/40) and detected more risk actions per 100 h (36 vs. 25); all system-detected events had corroborating images with complete concordance on manual review, and all nurse-recorded hand-contact events were accurately captured. Conclusions The study demonstrated that the image recognition-based detection system can function stably in clinical settings, providing accurate and continuous surveillance while supporting the early detection of risk actions. By reducing the observation burden and offering real-time cognitive support, the system complements routine nursing care and serves as an additional safety measure in ICU practice. With further optimization and larger multicenter validation, this approach could contribute significantly to the development of smart ICUs and the broader digital transformation of nursing care.
Pill image recognition is an important field in computer vision. It has become a vital technology in healthcare and pharmaceuticals due to the necessity for precise medication identification to prevent errors and ensure patient safety. This survey examines the current state of pill image recognition, focusing on advancements, methodologies, and the challenges that remain unresolved. It provides a comprehensive overview of traditional image processing-based, machine learning-based, deep learning-based, and hybrid methods, and aims to explore the ongoing difficulties in the field. We summarize and classify the methods used in each article, compare the strengths and weaknesses of the four families of methods, and review benchmark datasets for pill image recognition. Additionally, we compare the performance of proposed methods on popular benchmark datasets. The survey also draws on recent advancements, such as Transformer models, and cutting-edge technologies like Augmented Reality (AR) to discuss potential research directions and conclude the review. By offering a holistic perspective, this paper aims to serve as a valuable resource for researchers and practitioners striving to advance the field of pill image recognition.
The autocollimator is an important device for achieving precise, small-angle, non-contact measurements. It obtains the angular parameters of a plane target mirror indirectly by detecting the position of the imaging spot. There are few reports on the core algorithmic techniques used in current commercial products and recent scientific research. This paper addresses the performance requirements of coordinate-reading accuracy and operational speed in autocollimator image positioning. It proposes two cross-image center recognition schemes: one based on the Hough transform, and another based on Zernike moments and the least squares method. Through experimental evaluation of the accuracy and speed of both schemes, the optimal image recognition scheme balancing measurement accuracy and speed for the autocollimator is determined. The center recognition method based on Zernike moments and the least squares method offers higher measurement accuracy and stability, while the Hough transform-based method provides faster measurement speed.
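To make the least-squares stage concrete, here is a minimal sketch of recovering a cross-image center by fitting the two arms of the cross with total least squares and intersecting the fitted lines. It assumes subpixel edge points for each arm have already been extracted (e.g. by the Zernike-moment step); `fit_line` and `cross_center` are illustrative names, not the paper's implementation.

```python
import numpy as np

def fit_line(points):
    """Fit a 2D line a*x + b*y = c (with ||(a,b)|| = 1) via total least squares."""
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    # The smallest singular vector of the centered data is the line normal.
    _, _, vt = np.linalg.svd(pts - centroid)
    normal = vt[-1]
    return normal[0], normal[1], normal @ centroid

def cross_center(points_h, points_v):
    """Intersect the two fitted lines of a cross target."""
    a1, b1, c1 = fit_line(points_h)
    a2, b2, c2 = fit_line(points_v)
    A = np.array([[a1, b1], [a2, b2]])
    return np.linalg.solve(A, np.array([c1, c2]))  # (x, y) of the center

# Synthetic cross: horizontal arm near y = 2, vertical arm near x = 3.
rng = np.random.default_rng(0)
xs = np.linspace(0, 6, 50)
horiz = np.column_stack([xs, 2 + 0.001 * rng.standard_normal(50)])
ys = np.linspace(0, 4, 50)
vert = np.column_stack([3 + 0.001 * rng.standard_normal(50), ys])
cx, cy = cross_center(horiz, vert)
print(round(cx, 2), round(cy, 2))  # close to 3.0 2.0
```

Total least squares is used rather than ordinary regression so that near-vertical arms pose no numerical problem.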
Marine environments present significant challenges for image processing due to factors such as low light intensity, suspended particles, and varying degrees of water turbidity. These conditions severely degrade the clarity and quality of captured marine images, making accurate image recognition difficult. The problem is further compounded by the limited availability of high-quality, labeled training samples, which restricts the effectiveness of conventional recognition algorithms. Existing techniques in both academic and industrial settings, such as Principal Component Analysis (PCA), neural networks, and wavelet transforms, typically involve converting color images to grayscale prior to feature extraction. While this simplifies processing, it also results in the loss of essential color information, which is often critical for distinguishing features in marine imagery. To address these issues, this paper proposes a novel approach that preserves and utilizes the full color information of marine images during processing and recognition. The method combines color image representation with Hu's invariant moments to extract stable, rotation-invariant features. These features are then input into a Back Propagation Neural Network (BPNN), which is trained to recognize and classify various marine targets. The integration of color-based feature extraction with the BPNN significantly improves recognition performance, particularly under complex environmental conditions. Experimental results show that the proposed system achieves a recognition accuracy exceeding 98%, demonstrating its effectiveness and potential for practical applications in marine exploration, environmental monitoring, and underwater robotics.
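As a sketch of the feature-extraction step, the following computes the first two of Hu's seven invariant moments for a single-channel image and checks their rotation invariance. The paper applies the moments to full color images and feeds the features to a BPNN; this minimal version covers one channel only, and the L-shaped test target is invented for illustration.

```python
import numpy as np

def central_moment(img, p, q):
    """Central moment mu_pq of a grayscale/binary image."""
    ys, xs = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()
    xb, yb = (xs * img).sum() / m00, (ys * img).sum() / m00
    return ((xs - xb) ** p * (ys - yb) ** q * img).sum()

def hu_first_two(img):
    """First two of Hu's seven rotation-invariant moments."""
    m00 = img.sum()
    def eta(p, q):  # normalized central moment
        return central_moment(img, p, q) / m00 ** (1 + (p + q) / 2)
    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return phi1, phi2

# An asymmetric binary blob and its 90-degree rotation share the invariants.
img = np.zeros((32, 32))
img[5:20, 8:12] = 1.0
img[18:20, 8:25] = 1.0  # L-shaped target
p1, p2 = hu_first_two(img)
r1, r2 = hu_first_two(np.rot90(img))
print(np.allclose([p1, p2], [r1, r2]))  # True
```

Under a 90-degree rotation, mu20 and mu02 swap and mu11 flips sign, so phi1 and phi2 are preserved exactly; this is what makes the features stable under target orientation changes.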
Fine-grained Image Recognition (FGIR) is dedicated to distinguishing similar sub-categories that belong to the same super-category, such as bird species and car types. To highlight visual differences, existing FGIR works often follow two steps: discriminative sub-region localization and local feature representation. However, these works pay less attention to global context information. They neglect the fact that subtle visual differences in challenging scenarios can be highlighted by exploiting the spatial relationships among different sub-regions from a global viewpoint. Therefore, in this paper, we consider both global and local information for FGIR, and propose a collaborative teacher-student strategy to reinforce and unify the two types of information. Our framework is implemented mainly with convolutional neural networks and is referred to as the Teacher-Student Based Attention Convolutional Neural Network (T-S-ACNN). For fine-grained local information, we choose the classic Multi-Attention Network (MA-Net) as our baseline and propose a type of boundary constraint to further reduce background noise in the local attention maps. In this way, the discriminative sub-regions tend to appear in the area occupied by fine-grained objects, leading to more accurate sub-region localization. For fine-grained global information, we design a graph convolution based Global Attention Network (GA-Net), which combines the local attention maps extracted by MA-Net with non-local techniques to explore spatial relationships among sub-regions. Finally, we develop a collaborative teacher-student strategy to adaptively determine the attended roles and optimization modes, so as to enhance the cooperative reinforcement of MA-Net and GA-Net. Extensive experiments on the CUB-200-2011, Stanford Cars and FGVC Aircraft datasets illustrate the promising performance of our framework.
With rapid urbanization, fires pose significant challenges in urban governance. Traditional fire detection methods often struggle to detect smoke in complex urban scenes due to environmental interference and variations in viewing angle. This study proposes a novel multimodal smoke detection method that fuses infrared and visible imagery using a transformer-based deep learning model. By capturing both thermal and visual cues, our approach significantly enhances the accuracy and robustness of smoke detection in business-park scenes. We first established a dual-view dataset comprising infrared and visible-light videos, implemented an innovative image feature fusion strategy, and designed a deep learning model based on the transformer architecture and attention mechanism for smoke classification. Experimental results demonstrate that our method outperforms existing methods: under multi-view input, it achieves an accuracy of 90.88%, a precision of 98.38%, a recall of 92.41%, and false positive and false negative rates both below 5%, underlining the effectiveness of the proposed multimodal, multi-view fusion approach. The attention mechanism plays a crucial role in improving detection performance, particularly in identifying subtle smoke features.
In-optical-sensor computing architectures based on neuro-inspired optical sensor arrays have become key milestones for in-sensor artificial intelligence (AI) technology, enabling intelligent vision sensing and extensive data processing. These architectures must demonstrate potential advantages in terms of mass production and complementary metal oxide semiconductor compatibility. Here, we introduce a visible-light-driven neuromorphic vision system that integrates front-end retinomorphic photosensors with a back-end artificial neural network (ANN), employing a single neuro-inspired indium-gallium-zinc-oxide phototransistor (NIP) featuring an aluminum sensitization layer (ASL). By methodically adjusting the ASL coverage on IGZO phototransistors, both a fast-switching-response type and a synaptic-response type of IGZO phototransistor are successfully developed. Notably, the fabricated NIP shows remarkable retina-like photoinduced synaptic plasticity at wavelengths up to 635 nm, with over 256 states, weight-update nonlinearity below 0.1, and a dynamic range of 64.01. Owing to this technology, a 6×6 neuro-inspired optical image sensor array built from NIPs can perform highly integrated sensing, memory, and preprocessing functions, including contrast enhancement and handwritten digit image recognition. The demonstrated prototype highlights the potential for efficient hardware implementations of in-sensor AI technologies.
Considering the difficulty of integrating the depth points of nautical charts of the East China Sea into a global high-precision Grid Digital Elevation Model (Grid-DEM), we proposed a "Fusion based on Image Recognition (FIR)" method for multi-source depth data fusion, and used it to merge the electronic nautical chart dataset (referred to as Chart2014 in this paper) with the global digital elevation dataset (referred to as Globalbath2002 in this paper). Compared with the traditional fusion of two datasets by direct combination and interpolation, the new Grid-DEM formed by FIR better represents the data characteristics of Chart2014, reduces computational difficulty, and is more intuitive. The choice of interpolation method in FIR and the influence of the "exclusion radius R" parameter were also discussed. FIR avoids complex calculations of spatial distances among points from different sources, and instead uses a spatial exclusion map to perform one-step screening based on the exclusion radius R, which greatly improves the fusion of reliable datasets. The fusion results of different experiments were analyzed statistically with root mean square error and mean relative error, showing that interpolation methods based on Delaunay triangulation are more suitable for the fusion of Chinese nautical chart depths, and that factors such as the point density distribution of the source data, accuracy, interpolation method, and terrain conditions should be fully considered when selecting the exclusion radius R.
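The one-step screening idea can be sketched on integer grid coordinates: stamp a radius-R disk around each high-priority chart sounding into a boolean exclusion map, then drop background points that fall inside it, avoiding pairwise distance computations. The function name, data layout, and point values below are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def fuse_with_exclusion(chart_pts, bg_pts, R, shape):
    """FIR-style screening: keep all chart points, and keep a background
    point only if it lies outside radius R of every chart point,
    decided with a rasterized exclusion map rather than distances."""
    excl = np.zeros(shape, dtype=bool)
    # Precompute the integer offsets of a radius-R disk once.
    r = int(np.ceil(R))
    dy, dx = np.mgrid[-r:r + 1, -r:r + 1]
    mask = dy ** 2 + dx ** 2 <= R ** 2
    disk = np.column_stack([dy[mask], dx[mask]])
    for y, x in chart_pts:  # stamp a disk around each chart sounding
        ys = np.clip(y + disk[:, 0], 0, shape[0] - 1)
        xs = np.clip(x + disk[:, 1], 0, shape[1] - 1)
        excl[ys, xs] = True
    kept_bg = [p for p in bg_pts if not excl[p[0], p[1]]]
    return list(chart_pts) + kept_bg

chart = [(10, 10)]
bg = [(10, 12), (10, 20)]  # one inside R=3 of the chart point, one outside
fused = fuse_with_exclusion(chart, bg, R=3, shape=(32, 32))
print(fused)  # [(10, 10), (10, 20)]
```

The map is built once per fusion pass, so screening each background point is a single array lookup regardless of how many chart points there are.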
Rapid and accurate recognition of coal and rock is an important prerequisite for safe and efficient coal mining. In this paper, a novel coal-rock recognition method is proposed based on fusing laser point clouds and images, named Multi-Modal Frustum PointNet (MMFP). First, MobileNetV3 is used as the backbone network of Mask R-CNN to reduce the network parameters and compress the model volume. The dilated convolutional block attention mechanism (Dilated CBAM) and an inception structure are combined with MobileNetV3 to further enhance detection accuracy. Subsequently, the 2D target candidate box is calculated by the improved Mask R-CNN, and the frustum point cloud within the 2D candidate box is extracted to reduce the calculation scale and spatial search range. Then, a self-attention PointNet is constructed to segment the fused point cloud within the frustum range, and a bounding box regression network is used to predict the bounding box parameters. Finally, an experimental platform for shearer coal-wall cutting is established, and multiple comparative experiments are conducted. Experimental results indicate that the proposed coal-rock recognition method is superior to other advanced models.
Traditional synthetic aperture radar (SAR) image recognition techniques focus on the electromagnetic (EM) scattering centers, ignoring the important role of shadow information in SAR image recognition. It is difficult to classify targets by shadow information alone, because the shadow shape depends on the radar aspect angle, the depression angle, and the resolution; moreover, the shadow shapes of different targets are similar. When multiple SAR images of one target from different aspects are available, recognition performance can be improved. To address this problem, a multi-aspect SAR image recognition technique based on shadow information is developed. It extracts shadow profiles from SAR images and takes chain codes as the feature vectors of targets. Then, feature vectors from multiple aspects of the same target are combined into feature sequences, and a hidden Markov model (HMM) is applied to the feature sequences for target recognition. Simulation results show the effectiveness of the method.
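The chain-code feature can be sketched in a few lines: given an ordered 8-connected contour (such as an extracted shadow profile), each step between consecutive pixels maps to one of eight Freeman directions. The shadow extraction and HMM stages are omitted here, and the square contour is an invented example.

```python
# 8-connected Freeman directions in (row, col) steps: 0 = east,
# numbered counter-clockwise (1 = northeast, 2 = north, ...).
DIRS = {(0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
        (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7}

def chain_code(contour):
    """Freeman chain code of an ordered (row, col) contour."""
    return [DIRS[(r1 - r0, c1 - c0)]
            for (r0, c0), (r1, c1) in zip(contour, contour[1:])]

# A small square traced clockwise from its top-left corner.
square = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2),
          (2, 1), (2, 0), (1, 0), (0, 0)]
print(chain_code(square))  # [0, 0, 6, 6, 4, 4, 2, 2]
```

The resulting symbol sequence is a natural observation sequence for an HMM, which is why chain codes pair well with the multi-aspect recognition scheme described above.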
Face recognition (FR) is a practical application of pattern recognition (PR) and remains a compelling topic in the study of computer vision. However, in real-world FR systems, interferences in images, including illumination conditions, occlusion, facial expression, and pose variation, make the recognition task challenging. This study explored the impact of those interferences on FR performance and attempted to alleviate it by taking face symmetry into account. A novel and robust FR method was proposed by combining multi-mirror symmetry with the local binary pattern (LBP), namely the multi-mirror local binary pattern (MMLBP). To enhance FR performance under various interferences, MMLBP can 1) adaptively compensate lighting under heterogeneous lighting conditions, and 2) generate extracted image features that are much closer to those obtained under well-controlled conditions (i.e., frontal facial images without expression). Therefore, in contrast with later variations of LBP, the symmetrical singular value decomposition representation (SSVDR) algorithm that utilizes facial symmetry, and a state-of-the-art non-LBP method, the MMLBP method successfully handles various image interferences common in FR applications without preprocessing operations or a large number of training images. The proposed method was validated on four public data sets. According to our analysis, the MMLBP method achieves robust performance regardless of image interferences.
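For reference, the baseline LBP operator that MMLBP builds on can be sketched as follows; the paper's mirror-symmetry and lighting-compensation extensions are not included, and the neighbor ordering is one common convention among several.

```python
import numpy as np

def lbp(img):
    """Basic 3x3 local binary pattern for the interior pixels of a
    grayscale image: each neighbor >= center contributes one bit."""
    c = img[1:-1, 1:-1]
    # Neighbors in a fixed clockwise order starting at the top-left.
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    out = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        nb = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        out |= (nb >= c).astype(np.uint8) << bit
    return out

img = np.array([[5, 5, 5],
                [5, 4, 3],
                [5, 5, 5]], dtype=np.uint8)
# Every neighbor of the center 4 is >= 4 except the 3 (bit 3),
# so the code is 255 - 8 = 247.
print(lbp(img)[0, 0])  # 247
```

Because each pixel's code depends only on sign comparisons with its neighbors, LBP is invariant to monotonic grayscale changes, which is the property the mirror-based extensions inherit.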
With continuous social progress and the development of technology, the rise of online social media has brought "explosive" growth in image data. As one of the main carriers of daily communication, images are widely used because of their rich content and intuitiveness. Image recognition based on convolutional neural networks was one of the first applications in the field: a series of operations such as image feature extraction, recognition, and convolution are used to identify and analyze different images. The rapid development of artificial intelligence has made machine learning increasingly important in this research field, using algorithms to learn from each piece of data and predict outcomes; this has become a key to opening the door of artificial intelligence. In machine vision, image recognition is fundamental, and associating the low-level information in an image with high-level image semantics is the key problem of image recognition. Prior work has provided many model algorithms, laying a solid foundation for the development of artificial intelligence and image recognition. The multi-level information fusion model proposed here is based on the VGG16 model and improves on the fully connected neural network. Unlike a fully connected network, a convolutional neural network does not fully connect each layer of neurons, but connects only some nodes. Although this reduces computation time, the convolutional neural network loses some useful feature information during propagation and calculation. This paper therefore improves the model into a multi-level information fusion convolution method that recovers the discarded feature information, so as to improve the image recognition rate. VGG divides the network into five groups (mimicking the five layers of AlexNet), but uses 3×3 filters and combines them into convolution sequences; the deeper the DCNN, the larger the number of channels. The recognition rate of the model was verified on the ORL Face Database, the BioID Face Database, and the CASIA Face Image Database.
Image recognition has always been a hot research topic in the scientific community and industry. The emergence of convolutional neural networks (CNNs) has made it a research focus in the field of computer vision, especially in image recognition, but recognition results remain largely dependent on the number and quality of training samples. Recently, DCGAN has become a leading method for generating images, sounds, and videos. In this paper, DCGAN is used to generate samples that are difficult to collect, and an efficient design method for the generative model is proposed. We combine DCGAN with a CNN: DCGAN generates samples that are then used to train a CNN-based image recognition model. This method can enhance the classification model and effectively improve the accuracy of image recognition. In the experiment, we used radar profiles as a four-category dataset and achieved satisfactory classification performance. This paper applies image recognition technology to the meteorological field.
In forest variety registration, visual traits of plant appearance are widely used to discern different tree species. A new leaf-image recognition system based on a neural network was established to administer a hierarchical list of leaf images; edge detection is performed to identify the individual tokens of each image, and the outline of the leaf is obtained to differentiate tree species. An approach based on a back-propagation neural network is proposed, and an implementation in Java is also given. Numerical simulation results show that the proposed leaf strategy is effective and feasible.
To address the difficulty of accurately recognizing the distribution features of the gas flow center at the blast furnace throat, and to determine the relationship between gas flow center distribution and gas utilization rate, a method for recognizing the distribution features of the blast furnace gas flow center was proposed based on infrared image processing. Distribution features of the gas flow center and the corresponding gas utilization rates were categorized using fuzzy C-means clustering and statistical methods, and a concept of gas flow center offset was introduced. The results showed that when the percentage of gas flow center without offset exceeded 85%, the average blast furnace gas utilization rate was as high as 41%; when the percentage of gas flow center without offset exceeded 50%, the gas utilization rate was primarily the center gas utilization rate and was positively correlated with the degree of no center offset; when the percentage of gas flow center without offset was below 50% but the sum of the percentages of gas flow center without offset and with small offset exceeded 86%, the gas utilization rate depended on both the center and the edges, and was primarily the edge gas utilization rate. The proposed method accurately and effectively recognized the gas flow center distribution state and its relationship with the gas utilization rate, providing evidence in favor of on-line blast furnace control.
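The clustering step can be sketched with a plain fuzzy C-means implementation on one-dimensional feature values; the infrared feature extraction and offset categorization are omitted, and the data below are invented for illustration.

```python
import numpy as np

def fcm(data, k, m=2.0, iters=100, seed=0):
    """Plain fuzzy C-means on 1D data: returns cluster centers and
    the (k x n) membership matrix with columns summing to 1."""
    rng = np.random.default_rng(seed)
    x = np.asarray(data, float)
    u = rng.random((k, len(x)))
    u /= u.sum(axis=0)                      # memberships sum to 1 per sample
    for _ in range(iters):
        w = u ** m                          # fuzzified memberships
        centers = (w @ x) / w.sum(axis=1)   # weighted means
        d = np.abs(x[None, :] - centers[:, None]) + 1e-12
        u = 1.0 / d ** (2 / (m - 1))        # inverse-distance update
        u /= u.sum(axis=0)
    return centers, u

# Two well-separated groups of gas-flow feature values (invented).
data = [0.1, 0.12, 0.09, 0.9, 0.95, 0.88]
centers, _ = fcm(data, k=2)
centers = np.sort(centers)
print(np.round(centers, 2))  # approximately [0.1, 0.91]
```

The soft memberships, rather than hard assignments, are what let borderline gas-flow states contribute to both categories, which suits gradual center-offset transitions.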
Based on the Fourier transform, a new shape descriptor was proposed to represent the flame image. By employing the shape descriptor as the input, flame image recognition was studied with an artificial neural network (ANN) and a support vector machine (SVM), respectively. Recognition experiments were carried out on flame image data sampled from an alumina rotary kiln to evaluate their effectiveness. The results show that both recognition methods achieve good results, which verifies the effectiveness of the shape descriptor. The highest recognition rate is 88.83% for the SVM and 87.38% for the ANN, indicating that the SVM performs better than the ANN.
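A common way to build a Fourier-based shape descriptor, sketched below, treats the closed contour as a complex signal and normalizes the FFT magnitudes; the abstract does not specify the paper's exact construction, so treat this as an illustrative variant rather than the authors' descriptor.

```python
import numpy as np

def fourier_descriptor(contour, n_coeffs=8):
    """Shape descriptor from the Fourier transform of a closed contour,
    invariant to translation, scale, rotation, and starting point."""
    z = contour[:, 0] + 1j * contour[:, 1]   # complex boundary signal
    F = np.fft.fft(z)
    F[0] = 0                                  # drop DC term: translation invariance
    mag = np.abs(F)                           # magnitudes: rotation/start-point invariance
    mag /= mag[1] if mag[1] > 0 else 1.0      # scale normalization
    return mag[1:1 + n_coeffs]

# A circle sampled at 64 points, at two different scales, gives the
# same descriptor after normalization.
t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
small = np.column_stack([np.cos(t), np.sin(t)])
large = 5.0 * np.column_stack([np.cos(t), np.sin(t)])
print(np.allclose(fourier_descriptor(small), fourier_descriptor(large)))  # True
```

Such a fixed-length, invariant vector is a convenient input for both the ANN and the SVM classifiers compared in the paper.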
Recognition and analysis of dynamic information about population images during wheat growth periods can serve as the basis of quantitative diagnosis of wheat growth. A recognition system based on a self-learning BP neural network for feature data of wheat population images, such as total green area and leaf area, was designed in this paper. In addition, some techniques to create favorable conditions for image recognition were discussed: (1) a method of collecting images with a digital camera and assistant equipment under natural field conditions; (2) a pixel-labeling algorithm used to segment images and extract features; (3) a high-pass filter based on the Laplacian used to strengthen image information. The results showed that the ANN system was effective for image recognition of wheat population features.
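The Laplacian high-pass enhancement in step (3) can be sketched as follows: convolve with a 3×3 Laplacian kernel and subtract the response from the image, which overshoots on both sides of edges and thereby strengthens detail. The kernel choice and step-edge example are illustrative assumptions.

```python
import numpy as np

# 4-neighbor Laplacian kernel (one common choice).
LAPLACIAN = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]], dtype=float)

def sharpen(img, kernel=LAPLACIAN):
    """High-pass sharpening: img - Laplacian(img), interior pixels only."""
    h, w = img.shape
    out = img.astype(float).copy()
    resp = np.zeros_like(out)
    for dy in range(3):                      # 3x3 correlation (kernel is symmetric)
        for dx in range(3):
            resp[1:h - 1, 1:w - 1] += kernel[dy, dx] * img[dy:h - 2 + dy, dx:w - 2 + dx]
    return out - resp

# A vertical step edge: sharpening overshoots on both sides of the boundary.
img = np.zeros((5, 5))
img[:, 3:] = 10.0
s = sharpen(img)
print(s[2].tolist())  # [0.0, 0.0, -10.0, 20.0, 10.0]
```

The darker undershoot and brighter overshoot around the step are exactly what makes segmented boundaries easier to label in the subsequent pixel-labeling step.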
In the sorting system of a production line, object movement, a fixed angle of view, light intensity, and other factors lead to blurred images, resulting in a low bar code recognition rate and poor real-time performance. To address these problems, a progressive bar code compressed recognition algorithm is proposed. First, assuming that the source image is not tilted, the direct recognition method is used to quickly identify the compressed source image; failure indicates that the compression ratio is improper or the image is skewed. Then, the source image is enhanced and recognized directly. Finally, the inclination of the compressed image is detected by the barcode region recognition method, and the source image is corrected to locate the barcode information in the barcode region. Experiments on multiple image types show that the proposed method improves computational efficiency more than fivefold compared with previous methods and recognizes blurred images better.
In industrial flotation, froth layer plays an important role and reflects directly whether coal, air, water and reagents match each other properly or not and whether the quality of flotation is good or not. So the supervision and recognition of the state of froth layer is very important in the flotation process. The ash content of clean coal froth was predicted through extracting the features of images of flotation froth. The froth images were classified according to their structure. A control system of adding flotation reagents was established based on the LVQ neural net.
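The abstract does not specify the LVQ variant, so the sketch below uses plain LVQ1 with invented froth-feature data: the winning prototype moves toward same-class samples and away from others, and classification assigns the label of the nearest prototype.

```python
import numpy as np

def train_lvq1(X, y, prototypes, proto_labels, lr=0.2, epochs=20):
    """LVQ1: pull the winning prototype toward same-class samples,
    push it away from samples of other classes."""
    P = np.asarray(prototypes, float)
    for _ in range(epochs):
        for x, label in zip(X, y):
            j = np.argmin(((P - x) ** 2).sum(axis=1))  # nearest prototype
            step = lr if proto_labels[j] == label else -lr
            P[j] += step * (x - P[j])
    return P

def classify(x, P, proto_labels):
    return proto_labels[np.argmin(((P - x) ** 2).sum(axis=1))]

# Two froth-feature clusters (e.g. mean gray level, bubble-size proxy),
# two froth classes; all values are invented for illustration.
X = np.array([[0.1, 0.2], [0.2, 0.1], [0.8, 0.9], [0.9, 0.8]])
y = [0, 0, 1, 1]
P = train_lvq1(X, y, prototypes=[[0.4, 0.4], [0.6, 0.6]], proto_labels=[0, 1])
print(classify(np.array([0.15, 0.15]), P, [0, 1]),
      classify(np.array([0.85, 0.85]), P, [0, 1]))  # 0 1
```

The learned prototypes act as interpretable reference froth states, which suits a reagent-dosing control loop keyed to froth class.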
With the development of Deep Convolutional Neural Networks (DCNNs), the features extracted for image recognition tasks have shifted from low-level features to the high-level semantic features of DCNNs. Previous studies have shown that the deeper the network is, the more abstract the features are. However, the recognition ability of deep features can be limited by insufficient training samples. To address this problem, this paper derives an improved Deep Fusion Convolutional Neural Network (DF-Net) which makes full use of the differences and complementarities that arise during network learning and enhances feature expression under the condition of limited datasets. Specifically, DF-Net organizes two identical subnets to extract features from the input image in parallel, and a well-designed fusion module is introduced into the deep layers of DF-Net to fuse the subnets' features at multiple scales. Thus, more complex mappings are created, and more abundant and accurate fusion features can be extracted to improve recognition accuracy. Furthermore, a corresponding training strategy is proposed to speed up convergence and reduce the computational overhead of network training. Finally, DF-Nets based on the well-known ResNet, DenseNet and MobileNetV2 are evaluated on CIFAR100, Stanford Dogs, and UECFOOD-100. Theoretical analysis and experimental results strongly demonstrate that DF-Net enhances the performance of DCNNs and increases the accuracy of image recognition.
Abstract: Objectives This study aimed to design and evaluate a detection system for the accidental dislodgement of head-and-neck medical supplies through hand position recognition and tracking in Intensive Care Unit (ICU) patients. Methods We conducted a single-center, prospective, parallel-group feasibility randomized controlled trial. We recruited 80 participants using convenience sampling from the ICU of a hospital in Ningbo City, Zhejiang Province, between March 2025 and June 2025, and they were randomly assigned to either the control group (routine care) or the intervention group (routine care plus an image recognition-based detection system). The system continuously tracked patients' hand positions via bedside cameras and generated real-time alarms when hands entered predefined risk zones, notifying on-duty nurses to enable early intervention. System stability was assessed by continuous system uptime; system performance and clinical feasibility were evaluated by the frequencies of risk actions and accidental dislodgement of medical supplies (ADMS). Results All 80 participants completed the intervention, with 40 patients in each group. The baseline characteristics and median observation time of the two groups were balanced (intervention group: 48 h/patient vs. control group: 49 h/patient). Compared with the control group, the intervention group showed fewer ADMS (2/40 vs. 9/40) and more detected risk actions per 100 h (36 vs. 25); all system-detected events had corroborating images with complete concordance on manual review, and all nurse-recorded hand-contact events were accurately captured. Conclusions The study demonstrated that the image recognition-based detection system can function stably in clinical settings, providing accurate and continuous surveillance while supporting the early detection of risk actions. By reducing the observation burden and offering real-time cognitive support, the system complements routine nursing care and serves as an additional safety measure in ICU practice. With further optimization and larger multicenter validation, this approach could contribute significantly to the development of smart ICUs and the broader digital transformation of nursing care.
Abstract: Pill image recognition is an important field in computer vision. It has become a vital technology in healthcare and pharmaceuticals due to the necessity of precise medication identification to prevent errors and ensure patient safety. This survey examines the current state of pill image recognition, focusing on advancements, methodologies, and the challenges that remain unresolved. It provides a comprehensive overview of traditional image processing-based, machine learning-based, deep learning-based, and hybrid methods, and aims to explore the ongoing difficulties in the field. We summarize and classify the methods used in each article, compare the strengths and weaknesses of these four families of methods, and review benchmark datasets for pill image recognition. Additionally, we compare the performance of the proposed methods on popular benchmark datasets. The survey also draws on recent advancements, such as Transformer models, and cutting-edge technologies, such as Augmented Reality (AR), to discuss potential research directions and conclude the review. By offering a holistic perspective, this paper aims to serve as a valuable resource for researchers and practitioners striving to advance the field of pill image recognition.
Funding: Supported by the National Natural Science Foundation of China (No. 62375031) and the Natural Science Foundation of Chongqing Municipality (No. 2024NSCQ-LZX0041).
Abstract: The autocollimator is an important device for achieving precise, small-angle, non-contact measurements. It obtains the angular parameters of a plane target mirror indirectly by detecting the position of the imaging spot. Little has been reported on the core algorithmic techniques in current commercial products and recent scientific research. This paper addresses the performance requirements of coordinate reading accuracy and operational speed in autocollimator image positioning. It proposes a cross-image center recognition scheme based on the Hough transform and another based on Zernike moments and the least squares method. Through experimental evaluation of the accuracy and speed of both schemes, the optimal image recognition scheme balancing measurement accuracy and speed for the autocollimator is determined. The center recognition method based on Zernike moments and the least squares method offers higher measurement accuracy and stability, while the Hough transform-based method provides faster measurement speed.
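As a rough illustration of the least-squares part of such a scheme (a toy sketch, not the paper's implementation: `fit_line` and `cross_center` are hypothetical names, and the subpixel Zernike-moment edge detection is omitted), the center of a cross image can be estimated by fitting each arm with a least-squares line and intersecting the two fits:

```python
def fit_line(points):
    """Least-squares fit of y = a*x + b to a list of (x, y) points."""
    n = len(points)
    sx = sum(p[0] for p in points)
    sy = sum(p[1] for p in points)
    sxx = sum(p[0] * p[0] for p in points)
    sxy = sum(p[0] * p[1] for p in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

def cross_center(horizontal_pts, vertical_pts):
    """Intersect the two fitted arm lines to estimate the cross center.
    The near-vertical arm is fitted as x = c*y + d to stay well-conditioned."""
    a, b = fit_line(horizontal_pts)                      # y = a*x + b
    c, d = fit_line([(y, x) for x, y in vertical_pts])   # x = c*y + d
    x = (c * b + d) / (1 - a * c)
    y = a * x + b
    return x, y
```

In a real autocollimator pipeline the input points would come from subpixel edge detection on the cross arms rather than from ideal coordinates.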
Abstract: Marine environments present significant challenges for image processing due to factors such as low light intensity, suspended particles, and varying degrees of water turbidity. These conditions severely degrade the clarity and quality of captured marine images, making accurate image recognition difficult. The problem is further compounded by the limited availability of high-quality, labeled training samples, which restricts the effectiveness of conventional recognition algorithms. Existing techniques in both academic and industrial settings, such as Principal Component Analysis (PCA), neural networks, and wavelet transforms, typically involve converting color images to grayscale prior to feature extraction. While this simplifies processing, it also results in the loss of essential color information, which is often critical for distinguishing features in marine imagery. To address these issues, this paper proposes a novel approach that preserves and utilizes the full color information of marine images during processing and recognition. The method combines color image representation with Hu's invariant moments to extract stable and rotation-invariant features. These features are then input into a Back Propagation Neural Network (BPNN), which is trained to recognize and classify various marine targets. The integration of color-based feature extraction with the BPNN significantly improves recognition performance, particularly under complex environmental conditions. Experimental results show that the proposed system achieves a recognition accuracy exceeding 98%, demonstrating its effectiveness and potential for practical applications in marine exploration, environmental monitoring, and underwater robotics.
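The rotation invariance that Hu's moments provide can be seen in a minimal pure-Python sketch (first invariant only, on a toy binary image; the helper names are illustrative, not from the paper):

```python
def hu1(img):
    """First Hu invariant moment of a binary image (list of rows of 0/1):
    hu1 = eta20 + eta02, invariant to translation, scale and rotation."""
    m00 = m10 = m01 = 0.0
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            m00 += v; m10 += x * v; m01 += y * v
    cx, cy = m10 / m00, m01 / m00        # centroid
    mu20 = mu02 = 0.0
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            mu20 += (x - cx) ** 2 * v
            mu02 += (y - cy) ** 2 * v
    # normalized central moments: eta_pq = mu_pq / m00 ** ((p+q)/2 + 1)
    return (mu20 + mu02) / m00 ** 2

def rot90(img):
    """Rotate a row-major image 90 degrees."""
    return [list(row) for row in zip(*img[::-1])]
```

Rotating the image leaves `hu1` unchanged, which is what makes such moments useful as stable features for the BPNN classifier.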
Funding: Supported by the National Natural Science Foundation of China (Grant No. 62171232) and the Priority Academic Program Development of Jiangsu Higher Education Institutions, China.
Abstract: The Fine-grained Image Recognition (FGIR) task is dedicated to distinguishing similar sub-categories that belong to the same super-category, such as bird species and car types. In order to highlight visual differences, existing FGIR works often follow two steps: discriminative sub-region localization and local feature representation. However, these works pay less attention to global context information, neglecting the fact that subtle visual differences in challenging scenarios can be highlighted by exploiting the spatial relationships among different sub-regions from a global viewpoint. Therefore, in this paper, we consider both global and local information for FGIR, and propose a collaborative teacher-student strategy to reinforce and unify the two types of information. Our framework is implemented mainly with convolutional neural networks, and is referred to as the Teacher-Student Based Attention Convolutional Neural Network (T-S-ACNN). For fine-grained local information, we choose the classic Multi-Attention Network (MA-Net) as our baseline, and propose a type of boundary constraint to further reduce background noise in the local attention maps. In this way, the discriminative sub-regions tend to appear in the area occupied by fine-grained objects, leading to more accurate sub-region localization. For fine-grained global information, we design a graph convolution based Global Attention Network (GA-Net), which combines the local attention maps extracted from MA-Net with non-local techniques to explore spatial relationships among sub-regions. Finally, we develop a collaborative teacher-student strategy to adaptively determine the attended roles and optimization modes, so as to enhance the cooperative reinforcement of MA-Net and GA-Net. Extensive experiments on the CUB-200-2011, Stanford Cars and FGVC Aircraft datasets illustrate the promising performance of our framework.
Funding: Supported by the National Natural Science Foundation of China (32171797) and the Chunhui Project Foundation of the Education Department of China (HZKY20220026).
Abstract: With rapid urbanization, fires pose significant challenges in urban governance. Traditional fire detection methods often struggle to detect smoke in complex urban scenes due to environmental interference and variations in viewing angle. This study proposes a novel multimodal smoke detection method that fuses infrared and visible imagery using a transformer-based deep learning model. By capturing both thermal and visual cues, our approach significantly enhances the accuracy and robustness of smoke detection in business park scenes. We first established a dual-view dataset comprising infrared and visible light videos, implemented an innovative image feature fusion strategy, and designed a deep learning model based on the transformer architecture and attention mechanism for smoke classification. Experimental results demonstrate that our method outperforms existing methods: under multi-view input, it achieves an accuracy of 90.88%, a precision of 98.38%, a recall of 92.41%, and false positive and false negative rates both below 5%, underlining the effectiveness of the proposed multimodal, multi-view fusion approach. The attention mechanism plays a crucial role in improving detection performance, particularly in identifying subtle smoke features.
Funding: Supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (Grant No. RS-2023-00256917) and Samsung Display.
Abstract: In-optical-sensor computing architectures based on neuro-inspired optical sensor arrays have become key milestones for in-sensor artificial intelligence (AI) technology, enabling intelligent vision sensing and extensive data processing. These architectures must demonstrate potential advantages in terms of mass production and complementary metal-oxide-semiconductor compatibility. Here, we introduce a visible-light-driven neuromorphic vision system that integrates front-end retinomorphic photosensors with a back-end artificial neural network (ANN), employing a single neuro-inspired indium-gallium-zinc-oxide phototransistor (NIP) featuring an aluminum sensitization layer (ASL). By methodically adjusting the ASL coverage on IGZO phototransistors, fast-switching response-type and synaptic response-type IGZO phototransistors are successfully developed. Notably, the fabricated NIP shows remarkable retina-like photoinduced synaptic plasticity under wavelengths up to 635 nm, with over 256 states, weight update nonlinearity below 0.1, and a dynamic range of 64.01. Owing to this technology, a 6×6 neuro-inspired optical image sensor array with the NIP can perform highly integrated sensing, memory, and preprocessing functions, including contrast enhancement and handwritten digit image recognition. The demonstrated prototype highlights the potential for efficient hardware implementations of in-sensor AI technologies.
Funding: Supported by the National Key R&D Program of China (No. 2023YFC3008100) and the National Natural Science Foundation of China (No. U23A2033).
Abstract: Considering the difficulty of integrating the depth points of nautical charts of the East China Sea into a global high-precision Grid Digital Elevation Model (Grid-DEM), we proposed a "Fusion based on Image Recognition" (FIR) method for multi-source depth data fusion, and used it to merge an electronic nautical chart dataset (referred to as Chart2014 in this paper) with a global digital elevation dataset (referred to as Globalbath2002 in this paper). Compared with the traditional fusion of two datasets by direct combination and interpolation, the new Grid-DEM formed by FIR better represents the data characteristics of Chart2014, reduces the calculation difficulty, and is more intuitive. The choice of different interpolation methods in FIR and the influence of the "exclusion radius R" parameter were also discussed. FIR avoids complex calculations of spatial distances among points from different sources; instead, it uses a spatial exclusion map to perform one-step screening based on the exclusion radius R, which greatly improves the fusion of the more reliable dataset. The fusion results of different experiments were analyzed statistically with root mean square error and mean relative error, showing that interpolation methods based on Delaunay triangulation are more suitable for the fusion of Chinese nautical chart depths, and that factors such as the point density distribution of the multi-source data, accuracy, interpolation method, and terrain conditions should be fully considered when selecting the exclusion radius R.
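The one-step screening idea can be sketched with a toy version (assuming integer grid coordinates; the function names and grid setup are illustrative, not the paper's code): rasterize the reliable chart points into an exclusion map, then keep only background points that fall outside it, with no pairwise distance computation between the two datasets.

```python
def exclusion_map(chart_pts, R, nx, ny):
    """Rasterize chart points into an nx-by-ny boolean exclusion map:
    every grid cell within radius R of some chart point is marked excluded."""
    m = [[False] * nx for _ in range(ny)]
    r = int(R)
    for cx, cy in chart_pts:
        for y in range(max(0, cy - r), min(ny, cy + r + 1)):
            for x in range(max(0, cx - r), min(nx, cx + r + 1)):
                if (x - cx) ** 2 + (y - cy) ** 2 <= R * R:
                    m[y][x] = True
    return m

def merge(chart_pts, background_pts, R, nx, ny):
    """Keep all chart points, plus background points outside the exclusion map."""
    m = exclusion_map(chart_pts, R, nx, ny)
    return list(chart_pts) + [(x, y) for x, y in background_pts if not m[y][x]]
```

The screening cost scales with the number of chart points times the stamped neighborhood, rather than with the product of the two dataset sizes.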
Funding: Supported in part by the National Natural Science Foundation of China (Nos. 52174152 and 52074271), in part by the Xuzhou Basic Research Program Project (No. KC23051), in part by the Shandong Province Technology Innovation Guidance Plan (Central Guidance for Local Scientific and Technological Development Fund) (No. YDZX2024119), in part by the Graduate Innovation Program of China University of Mining and Technology (No. 2025WLKXJ088), and in part by the Postgraduate Research & Practice Innovation Program of Jiangsu Province (No. KYCX252830).
Abstract: Rapid and accurate recognition of coal and rock is an important prerequisite for safe and efficient coal mining. In this paper, a novel coal-rock recognition method is proposed based on fusing laser point clouds and images, named Multi-Modal Frustum PointNet (MMFP). Firstly, MobileNetV3 is used as the backbone network of Mask R-CNN to reduce the network parameters and compress the model volume. The dilated convolutional block attention mechanism (Dilated CBAM) and an inception structure are combined with MobileNetV3 to further enhance detection accuracy. Subsequently, the 2D target candidate box is calculated by the improved Mask R-CNN, and the frustum point cloud within the 2D target candidate box is extracted to reduce the calculation scale and spatial search range. Then, a self-attention PointNet is constructed to segment the fused point cloud within the frustum range, and a bounding box regression network is used to predict the bounding box parameters. Finally, an experimental platform for shearer coal wall cutting is established, and multiple comparative experiments are conducted. Experimental results indicate that the proposed coal-rock recognition method is superior to other advanced models.
Abstract: Traditional synthetic aperture radar (SAR) image recognition techniques focus on the electromagnetic (EM) scattering centers, ignoring the important role of shadow information in SAR image recognition. It is difficult to classify targets by shadow information alone, because the shadow shape depends on the radar aspect angle, the depression angle and the resolution; moreover, the shadow shapes of different targets are similar. When multiple SAR images of one target from different aspects are available, recognition performance can be improved. To address this problem, a multi-aspect SAR image recognition technique based on shadow information is developed. It extracts shadow profiles from SAR images and takes chain codes as the feature vectors of targets. Then, feature vectors from multiple aspects of the same target are combined into feature sequences, and a hidden Markov model (HMM) is applied to the feature sequences for target recognition. The simulation results show the effectiveness of the method.
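The chain-code feature extraction step can be sketched as follows (a minimal Freeman chain code for an 8-connected boundary; the shadow segmentation and HMM stages are omitted, and the y-up direction convention is one common choice):

```python
# 8-direction Freeman codes: 0=E, 1=NE, 2=N, 3=NW, 4=W, 5=SW, 6=S, 7=SE
DIRS = {(1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
        (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7}

def chain_code(boundary):
    """Freeman chain code of an ordered, 8-connected boundary point list."""
    return [DIRS[(x2 - x1, y2 - y1)]
            for (x1, y1), (x2, y2) in zip(boundary, boundary[1:])]
```

The resulting code sequence is compact and translation invariant, which is why it is a natural feature vector to feed into a per-aspect HMM.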
基金supported by National Natural Science Foundation of China (No. 51305392)Youth Funds of the State Key Laboratory of Fluid Power Transmission and Control (No. SKLoFP_QN_1501)+1 种基金Zhejiang Provincial Natural Science Foundation of China (Nos. LY17E050009 and LZ15E050001)the Fundamental Rsesearch Funds for the Central Universities (No. 2018QNA4008)
Abstract: Face recognition (FR) is a practical application of pattern recognition (PR) and remains a compelling topic in the study of computer vision. However, in real-world FR systems, interferences in images, including illumination conditions, occlusion, facial expression and pose variation, make the recognition task challenging. This study explored the impact of those interferences on FR performance and attempted to alleviate it by taking face symmetry into account. A novel and robust FR method was proposed by combining multi-mirror symmetry with the local binary pattern (LBP), namely the multi-mirror local binary pattern (MMLBP). To enhance FR performance under various interferences, MMLBP can 1) adaptively compensate lighting under heterogeneous lighting conditions, and 2) generate extracted image features that are much closer to those obtained under well-controlled conditions (i.e., frontal facial images without expression). In contrast with later variations of LBP, with the symmetrical singular value decomposition representation (SSVDR) algorithm that utilizes facial symmetry, and with a state-of-the-art non-LBP method, the MMLBP method is shown to successfully handle various image interferences common in FR applications without preprocessing operations or a large number of training images. The proposed method was validated on four public data sets. According to our analysis, the MMLBP method achieves robust performance regardless of image interferences.
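The classic 3×3 LBP operator that MMLBP builds on can be sketched in a few lines (a textbook version, not the MMLBP variant itself; the clockwise-from-top-left bit ordering shown is one common convention):

```python
def lbp_code(img, x, y):
    """Classic 3x3 LBP: threshold the 8 neighbors against the center pixel
    and pack the bits clockwise from the top-left into one byte."""
    c = img[y][x]
    nbrs = [img[y-1][x-1], img[y-1][x], img[y-1][x+1], img[y][x+1],
            img[y+1][x+1], img[y+1][x], img[y+1][x-1], img[y][x-1]]
    code = 0
    for i, v in enumerate(nbrs):
        if v >= c:
            code |= 1 << i
    return code
```

A histogram of these per-pixel codes over image blocks is the usual LBP feature; MMLBP additionally exploits mirrored facial halves before this encoding step.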
Abstract: With the continuous progress of the times and the development of technology, the rise of social media networks has brought explosive growth in image data. As one of the main channels of daily communication, images are widely used as carriers of information because of their rich content and intuitive nature. Image recognition based on convolutional neural networks was among the first applications in the field: a series of operations such as feature extraction, convolution and classification is used to identify and analyze different images. The rapid development of artificial intelligence has made machine learning increasingly important, using algorithms to learn from each piece of data and predict outcomes. In machine vision, image recognition is the foundation, but associating the low-level information in an image with high-level image semantics is the key problem. Previous work has provided many model algorithms, laying a solid foundation for the development of artificial intelligence and image recognition. The multi-level information fusion model based on VGG16 proposed in this paper is an improvement on the fully connected neural network. Unlike a fully connected network, a convolutional neural network does not fully connect each layer of neurons, but connects only some nodes. Although this reduces computation time, the convolutional neural network loses some useful feature information during propagation and calculation; this paper therefore improves the model into a multi-level information fusion convolution method that recovers the discarded feature information, so as to improve the recognition rate. VGG divides the network into five groups (mimicking the five layers of AlexNet), yet it uses 3×3 filters and combines them into convolution sequences; the deeper the DCNN, the larger the channel number. The recognition rate of the model was verified on the ORL Face Database, the BioID Face Database and the CASIA Face Image Database.
Abstract: Image recognition has always been a hot research topic in the scientific community and industry. The emergence of convolutional neural networks (CNNs) has made this technology a research focus in the field of computer vision, especially in image recognition, but it also makes the recognition result largely dependent on the number and quality of training samples. Recently, DCGAN has become a leading method for generating images, sounds and videos. In this paper, DCGAN is used to generate samples that are difficult to collect, and an efficient design method for the generative model is proposed. We combine DCGAN with a CNN: DCGAN generates samples that are then used to train a CNN-based image recognition model. This method augments the classification model and effectively improves the accuracy of image recognition. In the experiment, we used radar profiles as a four-category dataset and achieved satisfactory classification performance, applying image recognition technology to the meteorological field.
Funding: This paper was supported by the National Natural Science Foundation of China (No. 30371126).
Abstract: In forest variety registration, visual traits of a plant's appearance are widely used to discern different tree species. A new leaf image recognition system based on a neural network was established to administer a hierarchical list of leaf images: edge detection is performed to identify the individual tokens of each image, and the outline of the leaf is obtained to differentiate tree species. An approach based on a back-propagation neural network is proposed, and the implementation in Java is also given. Numerical simulation results show that the proposed leaf strategy is effective and feasible.
Funding: Item sponsored by the National Natural Science Foundation of China (61263015).
Abstract: To address the difficulty of accurately recognizing the distribution features of the gas flow center at the blast furnace throat, and to determine the relationship between gas flow center distribution and gas utilization rate, a method for recognizing distribution features of the blast furnace gas flow center was proposed based on infrared image processing, and the distribution features of the gas flow center and the corresponding gas utilization rates were categorized using fuzzy C-means clustering and statistical methods. A concept of gas flow center offset was introduced. The results showed that when the percentage of gas flow center without offset exceeded 85%, the average blast furnace gas utilization rate was as high as 41%; when the percentage of gas flow center without offset exceeded 50%, the gas utilization rate was primarily the center gas utilization rate and exhibited a positive correlation with the degree of no center offset; and when the percentage of gas flow center without offset was below 50% but the sum of the percentages of gas flow center without offset and with small offset exceeded 86%, the gas utilization rate depended on both the center and the edges, and was primarily the edge gas utilization rate. The proposed method accurately and effectively recognizes the gas flow center distribution state and its relationship with the gas utilization rate, providing support for on-line blast furnace control.
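The fuzzy C-means step can be illustrated with a minimal 1-D implementation (generic FCM, not the paper's infrared feature pipeline; the initialization, fuzzifier and iteration count are illustrative choices):

```python
def fuzzy_cmeans(data, k, m=2.0, iters=50):
    """Fuzzy C-means on 1-D data; returns the sorted cluster centers.
    m is the fuzzifier: membership u_ij is proportional to 1/dist^(2/(m-1))."""
    centers = list(data[:k])  # naive init from the first k points
    for _ in range(iters):
        # membership of each point in each cluster
        u = []
        for x in data:
            d = [abs(x - c) + 1e-12 for c in centers]  # avoid divide-by-zero
            row = [1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1)) for j in range(k))
                   for i in range(k)]
            u.append(row)
        # update centers as membership-weighted means
        centers = [sum(u[n][i] ** m * data[n] for n in range(len(data))) /
                   sum(u[n][i] ** m for n in range(len(data)))
                   for i in range(k)]
    return sorted(centers)
```

Unlike hard k-means, every point contributes to every center with a graded weight, which suits gradually varying quantities such as offset degrees.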
Funding: Project (60634020) supported by the National Natural Science Foundation of China.
Abstract: Based on the Fourier transform, a new shape descriptor was proposed to represent the flame image. Employing the shape descriptor as the input, flame image recognition was studied with an artificial neural network (ANN) and a support vector machine (SVM), and recognition experiments were carried out using flame image data sampled from an alumina rotary kiln to evaluate their effectiveness. The results show that both recognition methods achieve good results, which verifies the effectiveness of the shape descriptor. The highest recognition rate is 88.83% for the SVM and 87.38% for the ANN, indicating that the SVM performs better than the ANN.
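A minimal Fourier-based shape descriptor along these lines can be sketched as follows (a generic construction, not necessarily the paper's exact descriptor): treat the boundary points as complex numbers, take DFT magnitudes, and normalize so the result is invariant to translation, rotation, scale and start point.

```python
import cmath

def fourier_descriptor(boundary, n_coeffs=4):
    """Shape descriptor from an ordered boundary: points become complex
    numbers, the DFT is taken, and magnitudes are normalized by |F1|.
    Dropping F0 removes translation; magnitudes remove rotation/start point;
    dividing by |F1| removes scale."""
    z = [complex(x, y) for x, y in boundary]
    N = len(z)
    F = [sum(z[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
         for k in range(N)]
    scale = abs(F[1]) or 1.0
    return [abs(F[k]) / scale for k in range(2, 2 + n_coeffs)]
```

The low-order coefficients capture the coarse flame outline, giving a short, fixed-length vector suitable as input to an ANN or SVM.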
Funding: Supported by the National Natural Science Foundation of China (990427) and the "863" Program Opening Item (001A110-02).
Abstract: Recognition and analysis of dynamic information in population images during wheat growth periods can serve as the basis for quantitative diagnosis of wheat growth. A recognition system based on a self-learning BP neural network for feature data of wheat population images, such as total green area and leaf area, was designed in this paper. In addition, techniques to create favorable conditions for image recognition were discussed: (1) a method of collecting images with a digital camera and assistant equipment under natural field conditions; (2) a pixel-labeling algorithm used to segment images and extract features; (3) a high-pass filter based on the Laplacian used to strengthen image information. The results showed that the ANN system was effective for image recognition of wheat population features.
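The pixel-labeling step in (2) can be sketched as a standard connected-component labeling pass (a generic flood-fill version on a binary vegetation mask; the function name is illustrative):

```python
def label_regions(img):
    """4-connected component labeling of a binary image via flood fill.
    Returns a label image (0 = background) and the number of regions."""
    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    current = 0
    for sy in range(h):
        for sx in range(w):
            if img[sy][sx] and not labels[sy][sx]:
                current += 1                  # start a new region
                labels[sy][sx] = current
                stack = [(sx, sy)]
                while stack:                  # flood fill the region
                    x, y = stack.pop()
                    for nx, ny in ((x+1, y), (x-1, y), (x, y+1), (x, y-1)):
                        if 0 <= nx < w and 0 <= ny < h and img[ny][nx] \
                                and not labels[ny][nx]:
                            labels[ny][nx] = current
                            stack.append((nx, ny))
    return labels, current
```

Per-region pixel counts from the label image give area features such as total green area directly.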
Funding: This work was supported by the Scientific Research Starting Project of SWPU [Zheng, D., No. 0202002131604], the Major Science and Technology Project of Sichuan Province [Zheng, D., No. 8ZDZX0143], the Ministry of Education Collaborative Education Project of China [Zheng, D., No. 952], and the Fundamental Research Project [Zheng, D., Nos. 549, 550].
Abstract: In production-line sorting systems, object movement, fixed viewing angles, light intensity and other factors lead to blurred images, resulting in low bar code recognition rates and poor real-time performance. To address these problems, a progressive compressed bar code recognition algorithm is proposed. First, assuming that the source image is not tilted, the direct recognition method is used to quickly identify the compressed source image; failure indicates that the compression ratio is improper or the image is skewed. Then, the source image is enhanced and recognized directly. Finally, the inclination of the compressed image is detected by the bar code region recognition method, and the source image is corrected to locate the bar code information within the recognized bar code region. Experiments on multiple image types show that the proposed method improves computational efficiency by more than 5 times compared with previous methods and recognizes blurred images better.
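The lowest-level primitive behind direct 1-D bar code recognition, run-length encoding a scanline into alternating bar/space widths, can be sketched as follows (a generic helper, not the paper's progressive algorithm; the threshold is an assumed parameter):

```python
def bar_widths(scanline, threshold=128):
    """Run-length encode one grayscale scanline into alternating run widths.
    Pixels below the threshold count as bars (dark). Returns the kind of the
    first run (1 = bar, 0 = space) and the list of run widths."""
    bits = [1 if p < threshold else 0 for p in scanline]
    widths, count = [], 1
    for a, b in zip(bits, bits[1:]):
        if a == b:
            count += 1
        else:
            widths.append(count)
            count = 1
    widths.append(count)
    return bits[0], widths
```

A decoder then matches the width pattern against a symbology table; the progressive scheme in the abstract decides when this direct path fails and enhancement or deskewing is needed.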
Funding: Supported by the National Natural Science Foundation of China (59974032).
Abstract: In industrial flotation, the froth layer plays an important role, directly reflecting whether coal, air, water and reagents match each other properly and whether the flotation quality is good. The supervision and recognition of the state of the froth layer is therefore very important in the flotation process. The ash content of clean coal froth was predicted by extracting features from images of the flotation froth, and the froth images were classified according to their structure. A control system for adding flotation reagents was established based on an LVQ neural network.
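The LVQ network behind such a controller can be sketched with a minimal LVQ1 implementation (the generic algorithm on toy data, not the paper's trained system; all names, prototypes and parameters are illustrative):

```python
def train_lvq(samples, labels, prototypes, proto_labels, lr=0.1, epochs=20):
    """LVQ1: pull the winning prototype toward same-class samples and
    push it away from different-class samples."""
    protos = [list(p) for p in prototypes]
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            # find the nearest prototype (squared Euclidean distance)
            i = min(range(len(protos)),
                    key=lambda j: sum((a - b) ** 2 for a, b in zip(x, protos[j])))
            sign = 1.0 if proto_labels[i] == y else -1.0
            protos[i] = [p + sign * lr * (a - p) for a, p in zip(x, protos[i])]
    return protos

def classify(x, protos, proto_labels):
    """Assign x the label of its nearest prototype."""
    i = min(range(len(protos)),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(x, protos[j])))
    return proto_labels[i]
```

In the froth application, the input vectors would be the extracted froth image features, and the class labels would drive the reagent-dosing decision.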
Funding: This work is partially supported by the National Natural Science Foundation of China (Grant No. 61772561), the Key Research & Development Plan of Hunan Province (Grant No. 2018NK2012), the Degree & Postgraduate Education Reform Project of Hunan Province (Grant No. 2019JGYB154), the Postgraduate Excellent Teaching Team Project of Hunan Province (Grant [2019] 370-133), and the Teaching Reform Project of Central South University of Forestry and Technology (Grant No. 20180682).