Journal Articles
13,327 articles found
1. Boruta-LSTMAE: Feature-Enhanced Depth Image Denoising for 3D Recognition
Authors: Fawad Salam Khan, Noman Hasany, Muzammil Ahmad Khan, Shayan Abbas, Sajjad Ahmed, Muhammad Zorain, Wai Yie Leong, Susama Bagchi, Sanjoy Kumar Debnath. Computers, Materials & Continua, 2026, No. 4, pp. 2181-2206 (26 pages).
The noise present in depth images obtained with RGB-D sensors stems from a combination of hardware limitations and environmental factors; the limited capabilities of the sensors also degrade downstream computer vision results. Common image denoising techniques based on spatial and frequency filtering tend to remove significant image details along with the noise. The framework presented in this paper is a novel denoising model that combines Boruta-driven feature selection with a Long Short-Term Memory Autoencoder (LSTMAE). The Boruta algorithm identifies the most useful depth features, which are used to maximize spatial structural integrity and reduce redundancy. An LSTMAE then processes these selected features and models depth pixel sequences to generate robust, noise-resistant representations. The encoder compresses the input into a latent space, which is then decoded to recover the clean image. Experiments on a benchmark dataset show that the proposed technique attains a PSNR of 45 dB and an SSIM of 0.90, 10 dB higher than conventional convolutional autoencoders and 15 times higher than wavelet-based models. Moreover, the feature selection step decreases the input dimensionality by 40%, resulting in a 37.5% reduction in training time and a real-time inference rate of 200 FPS. The Boruta-LSTMAE framework therefore offers an efficient and scalable system for depth image denoising, with strong potential for close-range 3D applications such as robotic manipulation and gesture-based interfaces.
Keywords: Boruta, LSTM autoencoder, feature fusion, denoising, 3D object recognition, depth images
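The shadow-feature test at the heart of Boruta can be sketched in a few lines. This is an illustrative toy, not the authors' pipeline: it substitutes a Pearson-correlation importance proxy for the random-forest importances Boruta normally uses, and the feature names and data are hypothetical.

```python
import random
import statistics

def correlation(xs, ys):
    # Pearson correlation, used here as a simple feature-importance proxy.
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den if den else 0.0

def boruta_style_select(features, target, rng):
    # Boruta idea: shuffle each feature to make a "shadow" copy, and keep a
    # real feature only if its importance beats the strongest shadow score.
    shadow_scores = []
    for vals in features.values():
        shuffled = vals[:]
        rng.shuffle(shuffled)
        shadow_scores.append(abs(correlation(shuffled, target)))
    threshold = max(shadow_scores)
    return [name for name, vals in features.items()
            if abs(correlation(vals, target)) > threshold]

rng = random.Random(0)
n = 200
target = [rng.random() for _ in range(n)]
features = {
    "informative": [t + 0.1 * rng.random() for t in target],  # correlated with target
    "noise_a": [rng.random() for _ in range(n)],
    "noise_b": [rng.random() for _ in range(n)],
}
selected = boruta_style_select(features, target, rng)
print(selected)  # the informative feature should survive the shadow test
```

In the paper's setting, the surviving depth features would then be fed to the LSTM autoencoder instead of the full-dimensional input.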
2. Image recognition-based detection system for preventing accidental dislodgement of head-and-neck medical supplies in ICU patients: A feasibility randomized controlled trial
Authors: Zhongjie Shi, Taotao Shi, Xin Gao, Jian Li, Hong Xu, Xiaojun Li, Zhanxiang Wang, Sifang Chen. International Journal of Nursing Sciences, 2026, No. 1, pp. 3-10, I0001 (9 pages).
Objectives: This study aimed to design and evaluate a detection system for the accidental dislodgement of head-and-neck medical supplies through hand position recognition and tracking in Intensive Care Unit (ICU) patients. Methods: We conducted a single-center, prospective, parallel-group feasibility randomized controlled trial. We recruited 80 participants by convenience sampling from the ICU of a hospital in Ningbo City, Zhejiang Province, between March 2025 and June 2025; they were randomly assigned to either the control group (routine care) or the intervention group (routine care plus the image recognition-based detection system). The system continuously tracked patients' hand positions via bedside cameras and generated real-time alarms when hands entered predefined risk zones, notifying on-duty nurses to enable early intervention. System stability was assessed by continuous system uptime; system performance and clinical feasibility were evaluated by the frequencies of risk actions and accidental dislodgement of medical supplies (ADMS). Results: All 80 participants completed the intervention, with 40 patients in each group. The baseline characteristics and median observation time of the two groups were balanced (intervention group: 48 h/patient vs. control group: 49 h/patient). Compared with the control group, the intervention group showed fewer ADMS (2/40 vs. 9/40) and detected more risk actions per 100 h (36 vs. 25); all system-detected events had corroborating images with complete concordance on manual review, and all nurse-recorded hand-contact events were accurately captured. Conclusions: The study demonstrated that the image recognition-based detection system can function stably in clinical settings, providing accurate and continuous surveillance while supporting the early detection of risk actions. By reducing the observation burden and offering real-time cognitive support, the system complements routine nursing care and serves as an additional safety measure in ICU practice. With further optimization and larger multicenter validation, this approach has the potential to contribute significantly to the development of smart ICUs and the broader digital transformation of nursing care.
Keywords: accidental dislodgement of medical supplies, feasibility randomized trial, image recognition, Intensive Care Unit, risk monitoring
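The risk-zone alarm logic described above reduces to a point-in-region test per tracked hand. A minimal sketch, with hypothetical zone names and image coordinates; the actual system's camera input and hand-pose tracking are not reproduced here.

```python
def in_zone(point, zone):
    # zone: (x_min, y_min, x_max, y_max) axis-aligned box in image coordinates.
    x, y = point
    x0, y0, x1, y1 = zone
    return x0 <= x <= x1 and y0 <= y <= y1

def check_alarms(hand_positions, risk_zones):
    # Return the sorted ids of every risk zone any tracked hand has entered.
    return sorted({name for name, zone in risk_zones.items()
                   for hand in hand_positions if in_zone(hand, zone)})

# Hypothetical zones drawn around two head-and-neck supplies.
risk_zones = {"et_tube": (100, 40, 180, 120), "ng_tube": (200, 60, 260, 140)}
hands = [(150, 90), (400, 300)]   # one hand near a tube, one far away
print(check_alarms(hands, risk_zones))  # ['et_tube']
```

A real deployment would run this check per video frame and debounce alarms before notifying the on-duty nurse.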
3. Novel image segmentation model of multi-view sheep face for identity recognition
Authors: Suhui Liu, Guangpu Wang, Chuanzhong Xuan, Zhaohui Tang, Junze Jia. International Journal of Agricultural and Biological Engineering, 2025, No. 6, pp. 260-268 (9 pages).
Traditional sheep identification is based on ear tags. However, ear tags not only cause stress to the animals but can also be lost, which affects correct recognition of sheep identity. In contrast, the acquisition of sheep face images is non-invasive and stress-free for the animals. Nevertheless, existing convolutional neural network-based sheep face identification models are prone to insufficient refinement, which makes on-farm deployment challenging. To address this issue, this study presents a novel sheep face recognition model that employs advanced feature fusion techniques and precise image segmentation strategies. The images were preprocessed and accurately segmented using deep learning techniques, and a dataset was constructed containing sheep face images from multiple viewpoints (left, front, and right faces). In particular, the model employs a segmentation algorithm to delineate the sheep face region accurately, utilizes an Improved Convolutional Block Attention Module (I-CBAM) to emphasize the salient features of the sheep face, and achieves multi-scale fusion of the features through a Feature Pyramid Network (FPN). This process ensures that features captured from disparate viewpoints are efficiently integrated to enhance recognition accuracy. Furthermore, the model guarantees precise delineation of sheep facial contours by streamlining the image segmentation procedure, establishing a robust basis for accurate identification of sheep identity. The findings demonstrate that the recognition accuracy of the Sheep Face Mask Region-based Convolutional Neural Network (SFMask RCNN) model is enhanced by 9.64% to 98.65% compared with the original model. The method offers a novel technological approach to animal identity management in sheep husbandry.
Keywords: image segmentation, sheep face, deep learning, multi-view, feature fusion
4. A Novel Semi-Supervised Multi-View Picture Fuzzy Clustering Approach for Enhanced Satellite Image Segmentation
Authors: Pham Huy Thong, Hoang Thi Canh, Nguyen Tuan Huy, Nguyen Long Giang, Luong Thi Hong Lan. Computers, Materials & Continua, 2026, No. 3, pp. 1092-1117 (26 pages).
Satellite image segmentation plays a crucial role in remote sensing, supporting applications such as environmental monitoring, land use analysis, and disaster management. However, traditional segmentation methods often rely on large amounts of labeled data, which are costly and time-consuming to obtain, especially in large-scale or dynamic environments. To address this challenge, we propose the Semi-Supervised Multi-View Picture Fuzzy Clustering (SS-MPFC) algorithm, which improves segmentation accuracy and robustness, particularly in complex and uncertain remote sensing scenarios. SS-MPFC unifies three paradigms: semi-supervised learning, multi-view clustering, and picture fuzzy set theory. This integration allows the model to effectively utilize a small number of labeled samples, fuse complementary information from multiple data views, and handle the ambiguity and uncertainty inherent in satellite imagery. We design a novel objective function that jointly incorporates picture fuzzy membership functions across multiple views of the data and embeds pairwise semi-supervised constraints (must-link and cannot-link) directly into the clustering process to enhance segmentation accuracy. Experiments conducted on several benchmark satellite datasets demonstrate that SS-MPFC significantly outperforms existing state-of-the-art methods in segmentation accuracy, noise robustness, and semantic interpretability. On the Augsburg dataset, SS-MPFC achieves a Purity of 0.8158 and an Accuracy of 0.6860, highlighting its robustness and efficiency. These results demonstrate that SS-MPFC offers a scalable and effective solution for real-world satellite-based monitoring systems, particularly in scenarios where rapid annotation is infeasible, such as wildfire tracking, agricultural monitoring, and dynamic urban mapping.
Keywords: multi-view clustering, satellite image segmentation, semi-supervised learning, picture fuzzy sets, remote sensing
5. A Comprehensive Review of Pill Image Recognition
Authors: Linh Nguyen Thi My, Viet-Tuan Le, Tham Vo, Vinh Truong Hoang. Computers, Materials & Continua, 2025, No. 3, pp. 3693-3740 (48 pages).
Pill image recognition is an important field in computer vision. It has become a vital technology in healthcare and pharmaceuticals due to the need for precise medication identification to prevent errors and ensure patient safety. This survey examines the current state of pill image recognition, focusing on advancements, methodologies, and the challenges that remain unresolved. It provides a comprehensive overview of traditional image processing-based, machine learning-based, deep learning-based, and hybrid methods, and explores the ongoing difficulties in the field. We summarize and classify the methods used in each article, compare the strengths and weaknesses of these four method families, and review benchmark datasets for pill image recognition. Additionally, we compare the performance of proposed methods on popular benchmark datasets. The survey also draws on recent advancements, such as Transformer models, and emerging technologies such as Augmented Reality (AR), to discuss potential research directions before concluding. By offering a holistic perspective, this paper aims to serve as a valuable resource for researchers and practitioners striving to advance the field of pill image recognition.
Keywords: pill image recognition, pill image identification, pill recognition, pill identification, pill image retrieval, pill retrieval, computer vision
6. Research on the balance optimization algorithm of image recognition accuracy and speed based on autocollimator measurement
Authors: LI Renpu, MA Long, CUI Jiwen, GUO Junqi, Andrei KULIKOV, WEN Dandan. Optoelectronics Letters, 2025, No. 2, pp. 121-128 (8 pages).
The autocollimator is an important device for precise, small-angle, non-contact measurement. It obtains the angular parameters of a plane target mirror indirectly by detecting the position of the imaging spot. There are few reports on the core algorithmic techniques used in current commercial products and recent scientific research. This paper addresses the performance requirements of coordinate-reading accuracy and operational speed in autocollimator image positioning. It proposes two cross-image center recognition schemes, one based on the Hough transform and another based on Zernike moments and the least squares method. Through experimental evaluation of the accuracy and speed of both schemes, the optimal image recognition scheme balancing measurement accuracy and speed for the autocollimator is determined. The center recognition method based on Zernike moments and the least squares method offers higher measurement accuracy and stability, while the Hough transform-based method provides faster measurement.
Keywords: image, optimization, recognition
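The least-squares half of the scheme reduces, in essence, to fitting a line to each arm of the imaged cross and intersecting the two fits. A minimal sketch under stated assumptions: the exactly collinear edge points below are hypothetical stand-ins for the subpixel edges a Zernike-moment detector would supply.

```python
def fit_line(points):
    # Ordinary least-squares fit of y = a*x + b to (x, y) points.
    n = len(points)
    sx = sum(x for x, _ in points); sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points); sxy = sum(x * y for x, y in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return a, (sy - a * sx) / n

# Hypothetical edge points along the two arms of the imaged cross.
arm_h = [(float(x), 50.0 + 0.01 * x) for x in range(10, 90)]   # near-horizontal arm
arm_v = [(60.0 + 0.01 * y, float(y)) for y in range(10, 90)]   # near-vertical arm
a_h, b_h = fit_line(arm_h)                       # y = a_h*x + b_h
a_v, b_v = fit_line([(y, x) for x, y in arm_v])  # x = a_v*y + b_v (axes swapped
                                                 # so the steep arm's slope stays finite)
# Intersect the two fitted lines to get the cross center.
x_c = (a_v * b_h + b_v) / (1.0 - a_v * a_h)
y_c = a_h * x_c + b_h
print(round(x_c, 3), round(y_c, 3))  # center near (60.5, 50.6)
```

Swapping axes for the steep arm is a standard trick to keep both fits numerically well-conditioned.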
7. Integrated Application Research on Marine Image Recognition Models
Authors: Chih-Chen Kao, Yu-Fen Peng, Bo-Wen Wu. Sustainable Marine Structures, 2025, No. 2, pp. 38-44 (7 pages).
Marine environments present significant challenges for image processing due to factors such as low light intensity, suspended particles, and varying degrees of water turbidity. These conditions severely degrade the clarity and quality of captured marine images, making accurate image recognition difficult. The problem is further compounded by the limited availability of high-quality, labeled training samples, which restricts the effectiveness of conventional recognition algorithms. Existing techniques in both academic and industrial settings, such as Principal Component Analysis (PCA), neural networks, and wavelet transforms, typically convert color images to grayscale prior to feature extraction. While this simplifies processing, it also discards essential color information, which is often critical for distinguishing features in marine imagery. To address these issues, this paper proposes a novel approach that preserves and utilizes the full color information of marine images during processing and recognition. The method combines color image representation with Hu's invariant moments to extract stable, rotation-invariant features. These features are then input into a Back Propagation Neural Network (BPNN), which is trained to recognize and classify various marine targets. The integration of color-based feature extraction with the BPNN significantly improves recognition performance, particularly under complex environmental conditions. Experimental results show that the proposed system achieves a recognition accuracy exceeding 98%, demonstrating its effectiveness and potential for practical applications in marine exploration, environmental monitoring, and underwater robotics.
Keywords: marine image, color preprocessing, pattern recognition, BPNN, invariant moments
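Hu's invariant moments are computed from scale-normalized central moments of the image. A sketch of the first two invariants on a toy grayscale patch, checking their invariance under a 90-degree rotation; the patch is hypothetical, and a full pipeline in the spirit of the paper would compute all seven invariants, per color channel.

```python
def hu_first_two(img):
    # img: 2D list of grayscale intensities.
    h, w = len(img), len(img[0])
    m00 = sum(sum(row) for row in img)
    xbar = sum(x * img[y][x] for y in range(h) for x in range(w)) / m00
    ybar = sum(y * img[y][x] for y in range(h) for x in range(w)) / m00
    def mu(p, q):  # central moment about the intensity centroid
        return sum(((x - xbar) ** p) * ((y - ybar) ** q) * img[y][x]
                   for y in range(h) for x in range(w))
    def eta(p, q):  # scale-normalized central moment
        return mu(p, q) / m00 ** (1 + (p + q) / 2)
    hu1 = eta(2, 0) + eta(0, 2)
    hu2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return hu1, hu2

img = [[0, 1, 2, 1],
       [1, 3, 5, 2],
       [0, 2, 4, 1],
       [0, 0, 1, 0]]
rot = [list(row) for row in zip(*img[::-1])]   # rotate the patch 90 degrees
a, b = hu_first_two(img)
c, d = hu_first_two(rot)
print(abs(a - c) < 1e-12, abs(b - d) < 1e-12)  # invariants match under rotation
```

This rotation invariance is what makes the features stable across arbitrarily oriented marine targets.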
8. A teacher-student based attention network for fine-grained image recognition
Authors: Ang Li, Xueyi Zhang, Peilin Li, Bin Kang. Digital Communications and Networks, 2025, No. 1, pp. 52-59 (8 pages).
The Fine-grained Image Recognition (FGIR) task is dedicated to distinguishing similar sub-categories that belong to the same super-category, such as bird species or car types. To highlight visual differences, existing FGIR works often follow two steps: discriminative sub-region localization and local feature representation. However, these works pay less attention to global context information, neglecting the fact that subtle visual differences in challenging scenarios can be highlighted by exploiting the spatial relationships among different sub-regions from a global viewpoint. Therefore, in this paper, we consider both global and local information for FGIR and propose a collaborative teacher-student strategy to reinforce and unify the two types of information. Our framework is implemented mainly with convolutional neural networks and is referred to as the Teacher-Student Based Attention Convolutional Neural Network (T-S-ACNN). For fine-grained local information, we choose the classic Multi-Attention Network (MA-Net) as our baseline and propose a type of boundary constraint to further reduce background noise in the local attention maps. In this way, the discriminative sub-regions tend to appear in the area occupied by fine-grained objects, leading to more accurate sub-region localization. For fine-grained global information, we design a graph convolution based Global Attention Network (GA-Net), which combines the local attention maps extracted by MA-Net with non-local techniques to explore spatial relationships among sub-regions. Finally, we develop a collaborative teacher-student strategy to adaptively determine the attended roles and optimization modes, so as to enhance the cooperative reinforcement of MA-Net and GA-Net. Extensive experiments on the CUB-200-2011, Stanford Cars, and FGVC Aircraft datasets illustrate the promising performance of our framework.
Keywords: fine-grained image recognition, collaborative teacher-student strategy, multi-attention, global attention
9. Transformer-Based Fusion of Infrared and Visible Imagery for Smoke Recognition in Commercial Areas
Authors: Chongyang Wang, Qiongyan Li, Shu Liu, Pengle Cheng, Ying Huang. Computers, Materials & Continua, 2025, No. 9, pp. 5157-5176 (20 pages).
With rapid urbanization, fires pose significant challenges for urban governance. Traditional fire detection methods often struggle to detect smoke in complex urban scenes due to environmental interference and variations in viewing angle. This study proposes a novel multimodal smoke detection method that fuses infrared and visible imagery using a transformer-based deep learning model. By capturing both thermal and visual cues, our approach significantly enhances the accuracy and robustness of smoke detection in business-park scenes. We first established a dual-view dataset comprising infrared and visible light videos, implemented an innovative image feature fusion strategy, and designed a deep learning model based on the transformer architecture and attention mechanism for smoke classification. Experimental results demonstrate that our method outperforms existing methods: under multi-view input, it achieves an accuracy of 90.88%, a precision of 98.38%, a recall of 92.41%, and false positive and false negative rates both below 5%, underlining the effectiveness of the proposed multimodal and multi-view fusion approach. The attention mechanism plays a crucial role in improving detection performance, particularly in identifying subtle smoke features.
Keywords: multimodal image processing, smoke recognition, urban safety, environmental monitoring
10. Swiftly accessible retinomorphic hardware for in-sensor image preprocessing and recognition: IGZO-based neuro-inspired optical image sensor arrays with metallic sensitization island
Authors: Kyungmoon Kwak, Kyungho Park, Jae Seong Han, Byung Ha Kang, Dong Hyun Choi, Kunho Moon, Seok Min Hong, Gwan In Kim, Ju Hyun Lee, Hyun Jae Kim. International Journal of Extreme Manufacturing, 2025, No. 6, pp. 494-510 (17 pages).
In-optical-sensor computing architectures based on neuro-inspired optical sensor arrays have become key milestones for in-sensor artificial intelligence (AI) technology, enabling intelligent vision sensing and extensive data processing. These architectures must demonstrate potential advantages in terms of mass production and complementary metal-oxide-semiconductor compatibility. Here, we introduce a visible-light-driven neuromorphic vision system that integrates front-end retinomorphic photosensors with a back-end artificial neural network (ANN), employing a single neuro-inspired indium-gallium-zinc-oxide phototransistor (NIP) featuring an aluminum sensitization layer (ASL). By methodically adjusting the ASL coverage on IGZO phototransistors, fast-switching and synaptic response types of IGZO phototransistors are successfully developed. Notably, the fabricated NIP shows remarkable retina-like photoinduced synaptic plasticity under wavelengths up to 635 nm, with over 256 states, weight-update nonlinearity below 0.1, and a dynamic range of 64.01. Owing to this technology, a 6×6 neuro-inspired optical image sensor array with the NIP can perform highly integrated sensing, memory, and preprocessing functions, including contrast enhancement and handwritten digit image recognition. The demonstrated prototype highlights the potential for efficient hardware implementations of in-sensor AI technologies.
Keywords: retinomorphic hardware, in-sensor preprocessing, image recognition, neuro-inspired optical sensors, indium-gallium-zinc-oxide, metallic sensitization layer
11. Fusion method for water depth data from multiple sources based on image recognition
Authors: Huiyu HAN, Feng ZHOU. Journal of Oceanology and Limnology, 2025, No. 4, pp. 1093-1105 (13 pages).
Considering the difficulty of integrating the depth points of nautical charts of the East China Sea into a global high-precision Grid Digital Elevation Model (Grid-DEM), we propose a "Fusion based on Image Recognition" (FIR) method for multi-source depth data fusion and use it to merge an electronic nautical chart dataset (referred to as Chart2014 in this paper) with a global digital elevation dataset (referred to as Globalbath2002 in this paper). Compared to the traditional fusion of two datasets by direct combination and interpolation, the new Grid-DEM formed by FIR better represents the data characteristics of Chart2014, reduces the calculation difficulty, and is more intuitive. The choice of interpolation method in FIR and the influence of the "exclusion radius R" parameter are also discussed. FIR avoids complex calculations of spatial distances among points from different sources; instead, it uses a spatial exclusion map to perform one-step screening based on the exclusion radius R, which greatly improves the fusion of a reliable dataset. The fusion results of different experiments were analyzed statistically with root mean square error and mean relative error, showing that interpolation methods based on Delaunay triangulation are more suitable for the fusion of Chinese nautical chart depths, and that factors such as the point density distribution of the multi-source data, accuracy, interpolation method, and terrain conditions should be fully considered when selecting the exclusion radius R.
Keywords: water depth, fusion method, Grid Digital Elevation Model (Grid-DEM), image recognition, Delaunay triangulation
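The one-step screening with an exclusion map can be sketched as follows. This is an illustrative sketch only: the cell size, radius, and (lon, lat, depth) samples are hypothetical, and the paper's FIR rasterizes in chart image space rather than the simple lon/lat grid used here.

```python
def build_exclusion_map(chart_pts, cell, radius):
    # Rasterize an exclusion map: a grid cell is blocked if it lies within the
    # exclusion radius R of any chart point. Screening global points later is
    # then a single set lookup, not a pairwise distance computation.
    blocked = set()
    r_cells = int(radius / cell) + 1
    for x, y, _depth in chart_pts:
        ci, cj = round(x / cell), round(y / cell)
        for di in range(-r_cells, r_cells + 1):
            for dj in range(-r_cells, r_cells + 1):
                if (di * cell) ** 2 + (dj * cell) ** 2 <= radius ** 2:
                    blocked.add((ci + di, cj + dj))
    return blocked

def fuse(chart_pts, global_pts, blocked, cell):
    # Keep every chart point; keep a global point only if its cell is free.
    kept = [g for g in global_pts
            if (round(g[0] / cell), round(g[1] / cell)) not in blocked]
    return chart_pts + kept

cell, radius = 0.001, 0.01                 # hypothetical grid size and radius R
chart = [(122.000, 30.000, -12.0), (122.100, 30.050, -15.5)]
globalbath = [(122.001, 30.001, -10.0),    # within R of a chart point -> screened out
              (122.500, 30.500, -40.0)]    # far from all chart points -> kept
blocked = build_exclusion_map(chart, cell, radius)
fused = fuse(chart, globalbath, blocked, cell)
print(len(fused))  # 3
```

Building the map is a single pass over the chart points, which is the sense in which FIR avoids per-pair distance calculations at fusion time.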
12. A novel coal-rock recognition method in coal mining face based on fusing laser point cloud and images
Authors: Yang Liu, Lei Si, Zhongbin Wang, Miao Chen, Xin Li, Dong Wei, Jinheng Gu. International Journal of Mining Science and Technology, 2025, No. 7, pp. 1057-1071 (15 pages).
Rapid and accurate recognition of coal and rock is an important prerequisite for safe and efficient coal mining. In this paper, a novel coal-rock recognition method based on fusing laser point clouds and images, named Multi-Modal Frustum PointNet (MMFP), is proposed. First, MobileNetV3 is used as the backbone network of Mask R-CNN to reduce the network parameters and compress the model volume. A dilated convolutional block attention mechanism (Dilated CBAM) and an inception structure are combined with MobileNetV3 to further enhance detection accuracy. Subsequently, the 2D target candidate box is computed by the improved Mask R-CNN, and the frustum point cloud within the 2D candidate box is extracted to reduce the calculation scale and spatial search range. Then, a self-attention PointNet is constructed to segment the fused point cloud within the frustum range, and a bounding box regression network is used to predict the bounding box parameters. Finally, an experimental platform for shearer coal-wall cutting is established and multiple comparative experiments are conducted. Experimental results indicate that the proposed coal-rock recognition method is superior to other advanced models.
Keywords: coal mining face, coal-rock recognition, deep learning, laser point cloud and image fusion, Multi-Modal Frustum PointNet (MMFP)
13. SAR Image Recognition Based on Multi-Aspect of Shadow Information (cited 2 times)
Authors: 杨露菁, 郝威, 王德石. Transactions of Nanjing University of Aeronautics and Astronautics (EI), 2009, No. 4, pp. 320-326 (7 pages).
Traditional synthetic aperture radar (SAR) image recognition techniques focus on the electromagnetic (EM) scattering centers, ignoring the important role of shadow information in SAR image recognition. It is difficult to classify targets by shadow information alone, because the shadow shape depends on the radar aspect angle, the depression angle, and the resolution; moreover, the shadow shapes of different targets are similar. When multiple SAR images of one target from different aspects are available, recognition performance can be improved. Aimed at this problem, a multi-aspect SAR image recognition technique based on shadow information is developed. It extracts shadow profiles from SAR images and takes chain codes as the feature vectors of targets. Feature vectors from multiple aspects of the same target are then combined into feature sequences, and a hidden Markov model (HMM) is applied to the feature sequences for target recognition. Simulation results show the effectiveness of the method.
Keywords: image recognition, synthetic aperture radar (SAR), shadow information, chain code
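Turning a shadow profile into a chain code, as described above, can be sketched as follows. The traced contour is hypothetical, and the HMM classification stage is not shown; the chain codes would serve as its observation sequences.

```python
# 8-direction Freeman codes: 0=E, 1=NE, 2=N, 3=NW, 4=W, 5=SW, 6=S, 7=SE
# (with the y axis pointing up).
DIRS = {(1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
        (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7}

def chain_code(contour):
    # contour: ordered list of 8-connected boundary pixels.
    return [DIRS[(x1 - x0, y1 - y0)]
            for (x0, y0), (x1, y1) in zip(contour, contour[1:])]

# Hypothetical shadow profile: a small square traced counter-clockwise.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1), (0, 0)]
print(chain_code(square))  # [0, 0, 2, 2, 4, 4, 6, 6]
```

One such code sequence per aspect, concatenated across aspects of the same target, would form the multi-aspect feature sequence fed to the HMM.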
14. Hybrid Quantum Gate Enabled CNN Framework with Optimized Features for Human-Object Detection and Recognition
Authors: Nouf Abdullah Almujally, Tanvir Fatima Naik Bukht, Shuaa S. Alharbi, Asaad Algarni, Ahmad Jalal, Jeongmin Park. Computers, Materials & Continua, 2026, No. 4, pp. 2254-2271 (18 pages).
Recognising human-object interactions (HOI) is a challenging task for traditional machine learning models, including convolutional neural networks (CNNs). Existing models show limited transferability across complex datasets such as D3D-HOI and SYSU 3D HOI. The conventional architecture of CNNs restricts their ability to handle HOI scenarios with high complexity, and HOI recognition requires improved feature extraction methods to overcome the current limitations in accuracy and scalability. This work proposes a novel quantum gate-enabled hybrid CNN (QEH-CNN) for effective HOI recognition. The model enhances CNN performance by integrating quantum computing components. The framework begins with bilateral image filtering, followed by multi-object tracking (MOT) and Felzenszwalb superpixel segmentation. A watershed algorithm refines object boundaries by cleaning merged superpixels. Feature extraction combines a histogram of oriented gradients (HOG), Global Image Statistics for Texture (GIST) descriptors, and a novel 23-joint keypoint extraction method using relative joint angles and joint proximity measures. A fuzzy optimization process refines the extracted features before feeding them into the QEH-CNN model. The proposed model achieves 95.06% accuracy on the D3D-HOI dataset and 97.29% on the SYSU 3D HOI dataset. The integration of quantum computing enhances feature optimization, leading to improved accuracy and overall model efficiency.
Keywords: pattern recognition, image segmentation, computer vision, object detection
15. Human Activity Recognition Using Weighted Average Ensemble by Selected Deep Learning Models
Authors: Waseem Akhtar, Mahwish Ilyas, Romana Aziz, Ghadah Aldehim, Tassawar Iqbal, Muhammad Ramzan. Computer Modeling in Engineering & Sciences, 2026, No. 2, pp. 971-989 (19 pages).
Human Activity Recognition (HAR) is an active area of computer vision. It has a great impact on healthcare, smart environments, and surveillance, as it can automatically detect human behavior. It plays a vital role in many applications, such as smart homes, healthcare, human-computer interaction, sports analysis, and, especially, intelligent surveillance. In this paper, we propose a robust and efficient HAR system by leveraging deep learning paradigms, including pre-trained models, CNN architectures, and their weighted-average fusion. Due to the diversity of human actions, various environmental influences, and the scarcity of data and resources, high recognition accuracy remains elusive. In this work, a weighted average ensemble technique is employed to fuse three deep learning models: EfficientNet, ResNet50, and a custom CNN. The results of this study indicate that a weighted average ensemble strategy is a promising approach for building more effective HAR models for the detection and classification of human activities. Experiments on a benchmark dataset show that the proposed weighted ensemble approach outperforms existing approaches in terms of accuracy and other key performance measures. The combined weighted-average ensemble of the pre-trained and CNN models obtained an accuracy of 98%, compared to 97%, 96%, and 95% for the customized CNN, EfficientNet, and ResNet50 models, respectively.
Keywords: artificial intelligence, computer vision, deep learning, recognition, human activity classification, image processing
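The weighted-average fusion step itself is simple to sketch. The softmax vectors and weights below are hypothetical stand-ins for the outputs of EfficientNet, ResNet50, and the custom CNN; in practice the weights might be chosen from validation accuracy.

```python
def weighted_ensemble(prob_sets, weights):
    # prob_sets: one softmax vector per model; weights should sum to 1.
    n_classes = len(prob_sets[0])
    fused = [sum(w * probs[c] for w, probs in zip(weights, prob_sets))
             for c in range(n_classes)]
    return fused.index(max(fused)), fused

# Hypothetical per-model softmax outputs for one clip (3 activity classes).
efficientnet = [0.70, 0.20, 0.10]
resnet50     = [0.40, 0.50, 0.10]
custom_cnn   = [0.60, 0.30, 0.10]
label, fused = weighted_ensemble([efficientnet, resnet50, custom_cnn],
                                 weights=[0.40, 0.25, 0.35])
print(label)  # 0: the ensemble overrides ResNet50's disagreeing vote
```

Because the weights sum to 1, the fused vector remains a valid probability distribution over the activity classes.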
16. A Robust Face Recognition Method Combining LBP with Multi-mirror Symmetry for Images with Various Face Interferences (cited 8 times)
Authors: Shui-Guang Tong, Yuan-Yuan Huang, Zhe-Ming Tong. International Journal of Automation and Computing (EI, CSCD), 2019, No. 5, pp. 671-682 (12 pages).
Face recognition(FR) is a practical application of pattern recognition(PR) and remains a compelling topic in the study of computer vision. However, in real-world FR systems, interferences in images, including illumina... Face recognition(FR) is a practical application of pattern recognition(PR) and remains a compelling topic in the study of computer vision. However, in real-world FR systems, interferences in images, including illumination condition, occlusion, facial expression and pose variation, make the recognition task challenging. This study explored the impact of those interferences on FR performance and attempted to alleviate it by taking face symmetry into account. A novel and robust FR method was proposed by combining multi-mirror symmetry with local binary pattern(LBP), namely multi-mirror local binary pattern(MMLBP). To enhance FR performance with various interferences, the MMLBP can 1) adaptively compensate lighting under heterogeneous lighting conditions, and 2) generate extracted image features that are much closer to those under well-controlled conditions(i.e., frontal facial images without expression). Therefore, in contrast with the later variations of LBP, the symmetrical singular value decomposition representation(SSVDR) algorithm utilizing the facial symmetry and a state-of-art non-LBP method, the MMLBP method is shown to successfully handle various image interferences that are common in FR applications without preprocessing operation and a large number of training images. The proposed method was validated with four public data sets. According to our analysis, the MMLBP method was demonstrated to achieve robust performance regardless of image interferences. 展开更多
Keywords: face recognition (FR), local binary pattern (LBP), facial symmetry, image interferences, multi-mirror average
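The basic LBP operator that MMLBP builds on compares each pixel with its eight neighbours and histograms the resulting 8-bit codes. The following is a minimal NumPy sketch of plain LBP, not the proposed MMLBP; function names are illustrative:

```python
import numpy as np

def lbp_image(gray):
    """Basic 8-neighbour LBP on a 2-D grayscale array: each interior pixel
    is compared with its 8 neighbours, and neighbours >= centre set a bit
    of an 8-bit code."""
    g = np.asarray(gray, dtype=np.int32)
    centre = g[1:-1, 1:-1]
    # Neighbour offsets ordered clockwise from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(centre)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        codes |= (neighbour >= centre).astype(np.int32) << bit
    return codes

def lbp_histogram(gray):
    """Normalized 256-bin histogram of LBP codes, the usual face descriptor."""
    codes = lbp_image(gray)
    hist, _ = np.histogram(codes, bins=256, range=(0, 256))
    return hist / max(hist.sum(), 1)
```

On a uniform patch every neighbour equals the centre, so every code is 255 and the histogram concentrates in the last bin.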
Face Image Recognition Based on Convolutional Neural Network (Cited: 16)
17
Authors: Guangxin Lou, Hongzhen Shi 《China Communications》 SCIE CSCD 2020, No. 2, pp. 117-124 (8 pages)
With the continuous progress of the times and the development of technology, the rise of online social media has brought "explosive" growth of image data. As one of the main channels of people's daily communication, the image is widely used as a carrier of information because of its rich content and intuitiveness. Image recognition based on convolutional neural networks was among the first applications in this field: a series of operations such as image feature extraction, recognition and convolution are used to identify and analyze different images. The rapid development of artificial intelligence has made machine learning ever more important in this research area; algorithms learn from each piece of data and predict the outcome, which has become an important key to opening the door of artificial intelligence. In machine vision, image recognition is the foundation, but associating the low-level information in an image with high-level image semantics is the key problem. Predecessors have provided many model algorithms, which laid a solid foundation for the development of artificial intelligence and image recognition. The multi-level information fusion model based on VGG16 is an improvement on the fully connected neural network. Unlike a fully connected network, a convolutional neural network does not fully connect each layer of neurons but connects only some nodes. Although this reduces computation time, the convolutional model loses some useful feature information during propagation and calculation; this paper therefore improves the model into a multi-level information fusion of the convolution calculation, recovering the discarded feature information so as to improve the recognition rate. VGG divides the network into five groups (mimicking the five layers of AlexNet), yet it uses 3×3 filters and combines them into convolution sequences; the deeper the DCNN, the larger the channel count. The recognition rate of the model was verified on the ORL Face Database, the BioID Face Database and the CASIA Face Image Database.
Keywords: convolutional neural network, face image recognition, machine learning, artificial intelligence, multilayer information fusion
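The 3×3 convolution-plus-activation step that VGG-style networks stack can be sketched in a few lines. This is a minimal NumPy sketch of the basic operation only, not the paper's multi-level fusion model; `edge_filter` is an illustrative kernel:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """'Valid' 2-D cross-correlation, the core operation of a CNN layer."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh, ow = ih - kh + 1, iw - kw + 1
    out = np.zeros((oh, ow))
    for y in range(oh):
        for x in range(ow):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

def relu(x):
    """Rectified linear activation applied after the convolution."""
    return np.maximum(x, 0.0)

# A 3x3 horizontal-edge filter standing in for one learned VGG-style filter.
edge_filter = np.array([[-1., -1., -1.],
                        [ 0.,  0.,  0.],
                        [ 1.,  1.,  1.]])
```

On a constant image the edge response cancels to zero everywhere, illustrating why such filters respond only to intensity changes.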
A Method for Improving CNN-Based Image Recognition Using DCGAN (Cited: 17)
18
Authors: Wei Fang, Feihong Zhang, Victor S. Sheng, Yewen Ding 《Computers, Materials & Continua》 SCIE EI 2018, No. 10, pp. 167-178 (12 pages)
Image recognition has always been a hot research topic in the scientific community and industry. The emergence of convolutional neural networks (CNN) has made this technology a research focus in the field of computer vision, especially in image recognition, but it makes the recognition result depend largely on the number and quality of training samples. Recently, DCGAN has become a frontier method for generating images, sounds, and videos. In this paper, DCGAN is used to generate samples that are difficult to collect, and an efficient design method for the generating model is proposed. We combine DCGAN with CNN: DCGAN generates samples that are then used to train a CNN-based image recognition model. This method augments the classification model and effectively improves the accuracy of image recognition. In the experiment, we used radar profiles as a four-category dataset and achieved satisfactory classification performance. This paper applies image recognition technology to the meteorological field.
Keywords: DCGAN, image recognition, CNN, samples
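At the data level, the augmentation step the abstract describes — mixing DCGAN-generated samples into the CNN's training set — reduces to something like the following hedged NumPy sketch; `generator` is a stand-in for a trained DCGAN generator (any callable from latent codes to images) and is an assumption, not the paper's model:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment_with_generated(real_x, real_y, generator, n_fake, fake_label):
    """Concatenate generator output with the real training set and shuffle.

    `generator`: hypothetical trained DCGAN generator mapping latent
    vectors of shape (n_fake, 100) to images shaped like `real_x[0]`.
    """
    z = rng.standard_normal((n_fake, 100))            # latent codes
    fake_x = generator(z)                             # synthetic samples
    x = np.concatenate([real_x, fake_x], axis=0)
    y = np.concatenate([real_y, np.full(n_fake, fake_label)], axis=0)
    perm = rng.permutation(len(x))                    # mix real and fake
    return x[perm], y[perm]
```

The CNN is then trained on the mixed set exactly as on purely real data.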
Recognition system of leaf images based on neuronal network (Cited: 5)
19
Authors: WANG Dai-lin, ZHANG Xiu-mei, LIU Ya-qiu 《Journal of Forestry Research》 SCIE CAS CSCD 2006, No. 3, pp. 243-246 (4 pages)
In forest variety registration, visual traits of a plant's appearance are widely used to discern different tree species. A new leaf-image recognition system based on a neural network was established to administer a hierarchical list of leaf images; edge detection is performed to identify the individual tokens of every image, and the outline of the leaf is obtained to differentiate tree species. An approach based on a back-propagation neural network is proposed, and the implementation, written in Java, is also given. Numerical simulation results show that the proposed leaf-recognition strategy is effective and feasible.
Keywords: neuronal network, edge detection, leaf images, pattern recognition
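A one-hidden-layer back-propagation network of the kind the abstract describes can be sketched as follows. This is a NumPy sketch with illustrative layer sizes; the original implementation was in Java, and the class name is an assumption:

```python
import numpy as np

rng = np.random.default_rng(1)

class BPNet:
    """One-hidden-layer back-propagation network with sigmoid units."""

    def __init__(self, n_in, n_hidden, n_out, lr=0.5):
        self.w1 = rng.normal(0.0, 0.5, (n_in, n_hidden))
        self.w2 = rng.normal(0.0, 0.5, (n_hidden, n_out))
        self.lr = lr

    @staticmethod
    def _sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def forward(self, x):
        self.h = self._sigmoid(x @ self.w1)   # hidden activations
        self.o = self._sigmoid(self.h @ self.w2)
        return self.o

    def train_step(self, x, t):
        """One gradient-descent step on squared error; returns the loss."""
        o = self.forward(x)
        # Error deltas propagated back through the sigmoid derivatives.
        d_o = (o - t) * o * (1 - o)
        d_h = (d_o @ self.w2.T) * self.h * (1 - self.h)
        self.w2 -= self.lr * self.h.T @ d_o
        self.w1 -= self.lr * x.T @ d_h
        return float(np.mean((o - t) ** 2))
```

In the paper, the inputs would be edge features extracted from the leaf images and the outputs the tree-species labels.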
Recognition of Blast Furnace Gas Flow Center Distribution Based on Infrared Image Processing (Cited: 8)
20
Authors: Lin SHI, You-bin WEN, Guang-sheng ZHAO, Tao YU 《Journal of Iron and Steel Research International》 SCIE EI CAS CSCD 2016, No. 3, pp. 203-209 (7 pages)
To address the difficulty of accurately recognizing the distribution features of the gas flow center at the blast furnace throat, and to determine the relationship between gas flow center distribution and gas utilization rate, a method for recognizing the distribution features of the blast furnace gas flow center was proposed based on infrared image processing; the distribution features and corresponding gas utilization rates were categorized using fuzzy C-means clustering and statistical methods. A concept of gas flow center offset was introduced. The results showed that when the percentage of gas flow center without offset exceeded 85%, the average blast furnace gas utilization rate was as high as 41%; when the percentage of gas flow center without offset exceeded 50%, the gas utilization rate was primarily the center gas utilization rate and exhibited a positive correlation with the no-offset degree; when the percentage was below 50% but the sum of the percentages without offset and with small offset exceeded 86%, the gas utilization rate depended on both the center and the edges and was primarily the edge gas utilization rate. The proposed method accurately and effectively recognized the gas flow center distribution state and its relationship with the gas utilization rate, providing evidence in favor of on-line blast furnace control.
Keywords: infrared image processing, gas flow center recognition, gas utilization rate, fuzzy C-means clustering
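Fuzzy C-means, the clustering step used to categorize the gas-flow-center features, can be sketched as follows (plain NumPy on toy 2-D points rather than furnace imagery; parameter names are illustrative):

```python
import numpy as np

def fuzzy_c_means(x, c, m=2.0, n_iter=100, seed=0):
    """Standard fuzzy C-means: alternate weighted-centroid and
    membership updates until n_iter passes.

    x : (n, d) data points;  c : number of clusters;
    m : fuzziness exponent (> 1).
    Returns (centres, memberships) with memberships summing to 1 per point.
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    u = rng.random((n, c))
    u /= u.sum(axis=1, keepdims=True)        # random fuzzy partition
    for _ in range(n_iter):
        um = u ** m
        centres = (um.T @ x) / um.sum(axis=0)[:, None]
        # Distances of every point to every centre (epsilon avoids /0).
        d = np.linalg.norm(x[:, None, :] - centres[None, :, :], axis=2) + 1e-12
        # u_ik = 1 / sum_j (d_ik / d_ij)^(2/(m-1))
        ratio = (d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1))
        u = 1.0 / ratio.sum(axis=2)
    return centres, u
```

On well-separated clusters the memberships approach hard 0/1 assignments, which is how the paper's offset categories would be read off.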