Real-time detection of object size has become a hot topic in the testing field, and image processing is its core algorithm. This paper focuses on processing and displaying collected dynamic images to achieve real-time image processing for moving objects. First, median filtering, gain calibration, image segmentation, image binarization, corner detection and edge fitting are applied to the images of the moving objects so that the processed image closely matches the real object. Then, the processed images are displayed in real time, which makes them easier to analyze, understand and identify and thus reduces the computational complexity. Finally, a human-computer interaction (HCI)-friendly interface based on VC++ is designed to perform the digital logic transform, image processing and real-time display of the objects. The experiment shows that the proposed algorithm and software design achieve better real-time performance and accuracy, which can meet industrial needs.
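The first two stages named in the abstract above, median filtering and binarization, can be illustrated with a minimal sketch. This is not the authors' VC++ implementation: the 3x3 window, the edge-replication border handling, the fixed threshold and the function names `median_filter3` and `binarize` are all assumptions chosen for illustration.

```python
from statistics import median


def median_filter3(img):
    """Apply a 3x3 median filter to a 2-D list of grey levels.

    Border pixels are handled by clamping indices (edge replication),
    which is an assumption; the abstract does not specify border handling.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            out[y][x] = median(window)
    return out


def binarize(img, threshold):
    """Map each pixel to 1 if it exceeds the threshold, else 0."""
    return [[1 if p > threshold else 0 for p in row] for row in img]


# Hypothetical 5x5 frame: dark background (10), bright object (200),
# plus one salt-noise pixel (255) inside the dark region.
frame = [[10, 10, 200, 200, 200] for _ in range(5)]
frame[2][0] = 255
mask = binarize(median_filter3(frame), threshold=128)
```

In this toy frame the isolated noise pixel is removed by the median filter before thresholding, so the binary mask contains only the bright object columns.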
A novel dual-branch decoding fusion convolutional neural network model (DDFNet), specifically designed for real-time salient object detection (SOD) on steel surfaces, is proposed. DDFNet is based on a standard encoder-decoder architecture and integrates three key innovations. First, we introduce a novel, lightweight multi-scale progressive aggregation residual network that effectively suppresses background interference and refines defect details, enabling efficient salient feature extraction. Then, we propose an innovative dual-branch decoding fusion structure, comprising the refined defect representation branch and the enhanced defect representation branch, which enhances accuracy in defect region identification and feature representation. Additionally, to further improve the detection of small and complex defects, we incorporate a multi-scale attention fusion module. Experimental results on the public ESDIs-SOD dataset show that DDFNet, with only 3.69 million parameters, achieves detection performance comparable to current state-of-the-art models, demonstrating its potential for real-time industrial applications. Furthermore, our DDFNet-L variant consistently outperforms leading methods in detection performance. The code is available at https://github.com/13140W/DDFNet.
To address the challenges of low accuracy and insufficient real-time performance in dynamic object detection for UAV surveillance, this paper proposes a novel tracking framework that integrates a lightweight improved YOLOv5s model with adaptive motion compensation. A UAV-view dynamic feature enhancement strategy is introduced, and a lightweight detection network combining attention mechanisms and multi-scale fusion is constructed. The robustness of tracking under motion blur is also optimized. Experimental results demonstrate that the proposed method achieves a mAP@0.5 of 68.2% on the VisDrone dataset and reaches an inference speed of 32 FPS on the NVIDIA Jetson TX2 platform. This significantly improves the balance between accuracy and efficiency in complex scenes, offering reliable technical support for real-time applications such as emergency response.
Visible-infrared object detection leverages the day-night stable object perception capability of infrared images to enhance detection robustness in low-light environments by fusing the complementary information of visible and infrared images. However, the inherent differences in the imaging mechanisms of visible and infrared modalities make effective cross-modal fusion challenging. Furthermore, constrained by the physical characteristics of sensors and thermal diffusion effects, infrared images generally suffer from blurred object contours and missing details, making it difficult to extract object features effectively. To address these issues, we propose an infrared-visible image fusion network that realizes multimodal information fusion of infrared and visible images through a carefully designed multiscale fusion strategy. First, we design an adaptive gray-radiance enhancement (AGRE) module to strengthen the detail representation in infrared images, improving their usability in complex lighting scenarios. Next, we introduce a channel-spatial feature interaction (CSFI) module, which achieves efficient complementarity between the RGB and infrared (IR) modalities via dynamic channel switching and a spatial attention mechanism. Finally, we propose a multi-scale enhanced cross-attention fusion (MSECA) module, which optimizes the fusion of multi-level features through dynamic convolution and gating mechanisms and captures long-range complementary relationships of cross-modal features on a global scale, thereby enhancing the expressiveness of the fused features. Experiments on the KAIST, M3FD, and FLIR datasets demonstrate that our method delivers outstanding performance in daytime and nighttime scenarios. On the KAIST dataset, the miss rate drops to 5.99%, and further to 4.26% in night scenes. On the FLIR and M3FD datasets, it achieves AP50 scores of 79.4% and 88.9%, respectively.
When detecting objects in images taken by Unmanned Aerial Vehicles (UAVs), the large number of objects and the high proportion of small objects pose major challenges for detection algorithms based on the You Only Look Once (YOLO) framework, making it hard for them to handle tasks that demand high precision. To address these problems, this paper proposes a high-precision object detection algorithm based on YOLOv10s. First, a Multi-branch Enhancement Coordinate Attention (MECA) module is proposed to enhance feature extraction capability. Second, a Multilayer Feature Reconstruction (MFR) mechanism is designed to fully exploit multilayer features, which enriches object information and removes redundant information. Finally, an MFR Path Aggregation Network (MFR-Neck) is constructed, which integrates multi-scale features to improve the network's ability to perceive objects of varying sizes. The experimental results demonstrate that the proposed algorithm increases the average detection accuracy by 14.15% on the VisDrone dataset compared to YOLOv10s, effectively enhancing object detection precision in UAV-taken images.
Recent years have seen a surge in interest in object detection on remote sensing images for applications such as surveillance and management. However, challenges like small object detection, scale variation, and the presence of closely packed objects in these images hinder accurate detection. Additionally, the motion blur effect further complicates the identification of such objects. To address these issues, we propose an enhanced YOLOv9 with a transformer head (YOLOv9-TH). The model introduces an additional prediction head for detecting objects of varying sizes and swaps the original prediction heads for transformer heads to leverage self-attention mechanisms. We further improve YOLOv9-TH using several strategies, including data augmentation, multi-scale testing, multi-model integration, and the introduction of an additional classifier. The cross-stage partial (CSP) method and the ghost convolution hierarchical graph (GCHG) are combined to improve detection accuracy by better utilizing feature maps, widening the receptive field, and precisely extracting multi-scale objects. Additionally, we incorporate the E-SimAM attention mechanism to address low-resolution feature loss. Extensive experiments on the VisDrone2021 and DIOR datasets demonstrate the effectiveness of YOLOv9-TH, showing good improvement in mAP compared to the best existing methods. YOLOv9-TH-e achieved an mAP50 of 54.2% on the VisDrone2021 dataset and an mAP of 92.3% on the DIOR dataset. The results confirm the model's robustness and suitability for real-world applications, particularly for small object detection in remote sensing images.
Modern manufacturing processes have become more reliant on automation because of the accelerated transition from Industry 3.0 to Industry 4.0. Manual inspection of products on assembly lines remains inefficient, error-prone and inconsistent, emphasizing the need for a reliable, automated inspection system. Leveraging both object detection and image segmentation, this research proposes a vision-based solution for detecting various kinds of tools in a toolkit using deep learning (DL) models. Two Intel RealSense D455f depth cameras were arranged in a top-down configuration to capture both RGB and depth images of the toolkits. After applying multiple constraints and enhancing the images through preprocessing and augmentation, a dataset of 3300 annotated RGB-D images was generated. Several DL models were selected through a comprehensive assessment of mean Average Precision (mAP), precision-recall balance, inference latency (target ≥ 30 FPS), and computational burden, resulting in a preference for YOLO and Region-based Convolutional Neural Network (R-CNN) variants over ViT-based models due to the latter's higher latency and resource requirements. YOLOv5, YOLOv8, YOLOv11, Faster R-CNN, and Mask R-CNN were trained on the annotated dataset and evaluated using key performance metrics (recall, accuracy, F1-score, and precision). YOLOv11 delivered balanced performance with 93.0% precision, 89.9% recall, and a 90.6% F1-score in object detection, as well as 96.9% precision, 95.3% recall, and a 96.5% F1-score in instance segmentation, with an average inference time of 25 ms per frame (≈40 FPS), demonstrating real-time performance. Building on these results, a YOLOv11-based Windows application was deployed in a real-time assembly-line environment, where it accurately processed live video streams to detect and segment tools within toolkits, demonstrating its practical effectiveness in industrial automation. In addition to detection and segmentation, the application can precisely measure socket dimensions by applying edge detection techniques to YOLOv11 segmentation masks. This enables specification-level quality control directly on the assembly line and improves real-time inspection capability. The implementation is a significant step forward for intelligent manufacturing in the Industry 4.0 paradigm, providing a scalable, efficient, and accurate approach to automated inspection and dimensional verification.
Automatic analysis of student behavior in classrooms has gained importance with the rise of smart education and vision technologies. However, the limited real-time accuracy of existing methods severely constrains their practical classroom deployment. To address this low accuracy, we propose an improved YOLOv11-based detector that integrates CARAFE upsampling, DySnakeConv, DyHead, and SMFA fusion modules. This new model for real-time classroom behavior detection captures fine-grained student behaviors with low latency. Additionally, we have developed a visualization system that presents data through intuitive dashboards. This system enables teachers to dynamically grasp classroom engagement by tracking student participation and involvement. The enhanced YOLOv11 model achieves an mAP@0.5 of 87.2% on the evaluated datasets, surpassing baseline models. The significance of this work lies in two aspects. First, it provides a practical technical route for deployable live classroom behavior monitoring and engagement feedback systems. Second, by integrating the proposed system, educators could make data-informed and fine-grained teaching decisions, ultimately improving instructional quality and learning outcomes.
The YOLO (You Only Look Once) series, a leading single-stage object detection framework, has gained significant prominence in medical-image analysis due to its real-time efficiency and robust performance. Recent iterations of YOLO have further enhanced its accuracy and reliability in critical clinical tasks such as tumor detection, lesion segmentation, and microscopic image analysis, thereby accelerating the development of clinical decision support systems. This paper systematically reviews advances in YOLO-based medical object detection from 2018 to 2024. It compares YOLO's performance with other models (e.g., Faster R-CNN, RetinaNet) in medical contexts, summarizes standard evaluation metrics (e.g., mean Average Precision (mAP), sensitivity), and analyzes hardware deployment strategies using public datasets such as LUNA16, BraTS, and CheXpert. The review highlights the impressive performance of YOLO models, particularly from YOLOv5 to YOLOv8, in achieving high precision (up to 99.17%), sensitivity (up to 97.5%), and mAP exceeding 95% in tasks such as lung nodule, breast cancer, and polyp detection. These results demonstrate the significant potential of YOLO models for early disease detection and real-time clinical applications, indicating their ability to enhance clinical workflows. However, the study also identifies key challenges, including high small-object miss rates, limited generalization in low-contrast images, scarcity of annotated data, and model interpretability issues. Finally, potential future research directions are proposed to address these challenges and further advance the application of YOLO models in healthcare.
Deep learning-based intelligent recognition algorithms are increasingly recognized for their potential to address the labor-intensive challenge of manual pest detection. However, their deployment on mobile devices has been constrained by high computational demands. Here, we developed GBiDC-PEST, a mobile application that incorporates an improved, lightweight detection algorithm based on the You Only Look Once (YOLO) series single-stage architecture, for real-time detection of four tiny pests (wheat mites, sugarcane aphids, wheat aphids, and rice planthoppers). GBiDC-PEST incorporates several innovative modules, including GhostNet for lightweight feature extraction and architecture optimization by reconstructing the backbone, the bi-directional feature pyramid network (BiFPN) for enhanced multiscale feature fusion, depthwise convolution (DWConv) layers to reduce computational load, and the convolutional block attention module (CBAM) to enable precise feature focus. The newly developed GBiDC-PEST was trained and validated using a multitarget agricultural tiny pest dataset (Tpest-3960) that covered various field environments. GBiDC-PEST (2.8 MB) reduced the model size to only 20% of the original model size, offering a smaller size than the YOLO series (v5-v10), higher detection accuracy than YOLOv10n and v10s, and faster detection speed than v8s, v9c, v10m and v10b. In Android deployment experiments, GBiDC-PEST demonstrated enhanced performance in detecting pests against complex backgrounds, and the accuracy for wheat mites and rice planthoppers was improved by 4.5-7.5% compared with the original model. The GBiDC-PEST optimization algorithm and its mobile deployment proposed in this study offer a robust technical framework for the rapid, onsite identification and localization of tiny pests. This advancement provides valuable insights for effective pest monitoring, counting, and control in various agricultural settings.
To address the long and narrow geometric features of damage and the poor generalization of damage detection in X-ray images of steel-cord conveyor belts, a damage detection method for X-ray images is proposed based on an improved fully convolutional one-stage object detection (FCOS) algorithm. The regression performance of bounding boxes is optimized by introducing the complete intersection over union (CIoU) loss function into the improved algorithm. The feature fusion network is modified by adding adaptive fusion paths, which makes full use of both the accurate localization features and the semantic features of multi-scale feature fusion networks. The network was then trained and validated on an X-ray image dataset of damage in steel-cord conveyor belts provided by a flaw detection equipment manufacturer. In addition, data augmentation methods such as rotation, mirroring, and scaling were employed to enrich the image dataset so that the model is adequately trained. Experimental results show that the improved FCOS algorithm raised the precision rate and the recall rate by 20.9% and 14.8%, respectively, compared with the original algorithm. Compared with Fast R-CNN, Faster R-CNN, SSD, and YOLOv3, the improved FCOS algorithm also has clear advantages: the precision rate and recall rate of the modified network reached 95.8% and 97.0%, respectively. Furthermore, it achieved higher detection accuracy without affecting the speed. The results of this work provide a reference for the automatic identification and detection of damage in steel-cord conveyor belts.
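The complete intersection over union (CIoU) loss mentioned in the abstract above penalizes three things: low overlap, distance between box centres, and aspect-ratio mismatch, the last of which is relevant to long, narrow damage regions. The sketch below follows the commonly used CIoU formulation, L = 1 - IoU + ρ²/c² + αv; it is not taken from the paper, and the `(x1, y1, x2, y2)` box layout, the `eps` stabilizer and the function name `ciou_loss` are assumptions.

```python
import math


def ciou_loss(box, gt, eps=1e-9):
    """Return 1 - CIoU for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Plain intersection over union.
    iw = max(0.0, min(x2, gx2) - max(x1, gx1))
    ih = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = iw * ih
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # rho^2: squared distance between the two box centres.
    rho2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 \
        + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2

    # c^2: squared diagonal of the smallest box enclosing both boxes.
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # v measures aspect-ratio mismatch; alpha is its trade-off weight.
    v = (4 / math.pi ** 2) * (
        math.atan(gw / (gh + eps)) - math.atan(w / (h + eps))
    ) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v
```

For identical boxes the loss is approximately zero; for two disjoint unit squares with matching aspect ratio, only the IoU and centre-distance terms contribute.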
Efficient banana crop detection is crucial for precision agriculture; however, traditional remote sensing methods often lack the spatial resolution required for accurate identification. This study utilizes low-altitude Unmanned Aerial Vehicle (UAV) images and deep learning-based object detection models to enhance banana plant detection. A comparative analysis of Faster Region-Based Convolutional Neural Network (Faster R-CNN), You Only Look Once Version 3 (YOLOv3), Retina Network (RetinaNet), and Single Shot MultiBox Detector (SSD) was conducted to evaluate their effectiveness. Results show that RetinaNet achieved the highest detection accuracy, with a precision of 96.67%, a recall of 71.67%, and an F1 score of 81.33%. The study further highlights the impact of scale variation, occlusion, and vegetation density on detection performance. Unlike previous studies, this research systematically evaluates multi-scale object detection models for banana plant identification, offering insights into the advantages of UAV-based deep learning applications in agriculture. In addition, this study compares five evaluation metrics across the four detection models using both RGB and grayscale images. Specifically, RetinaNet exhibited the best overall performance with grayscale images, achieving the highest values across all five metrics. Compared to its performance with RGB images, these results represent a marked improvement, confirming the potential of grayscale preprocessing to enhance detection capability.
A fast and effective method is proposed to adaptively detect weld defects in various types of real-time X-ray images obtained under different conditions. After weld extraction and noise reduction, a median filter with a suitable template is used to estimate the weld background. After the weld background is subtracted from the original image, an adaptive threshold segmentation algorithm is applied to obtain the binary image; morphological close and open operations, a labeling algorithm and a false-alarm elimination algorithm are then applied to the binary image to obtain the defect detection result. Finally, a fast implementation procedure for the proposed method is developed. The proposed method is tested on real-time X-ray images obtained from different X-ray imaging systems. Experimental results show that the proposed method is effective in detecting low-contrast weld defects with few false alarms and adapts to various types of real-time X-ray imaging systems.
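The core idea in the abstract above, estimating a smooth background with a wide median filter, subtracting it, and thresholding the residual adaptively, can be sketched on a one-dimensional intensity profile. This is a deliberate simplification and not the paper's algorithm: the 1-D setting, the window half-width, the mean-plus-k-standard-deviations threshold rule and the names `running_median` and `detect_defects` are assumptions, and the morphological, labeling and false-alarm steps are omitted.

```python
from statistics import mean, median, pstdev


def running_median(signal, half_width):
    """Estimate the background with a sliding median; indices are clamped at the ends."""
    n = len(signal)
    return [
        median(
            signal[min(max(i + k, 0), n - 1)]
            for k in range(-half_width, half_width + 1)
        )
        for i in range(n)
    ]


def detect_defects(signal, half_width=3, k=2.0):
    """Return indices whose background-subtracted value is an outlier.

    The threshold adapts to the data: a sample is flagged when its residual
    deviates from the mean residual by more than k standard deviations.
    """
    background = running_median(signal, half_width)
    residual = [s - b for s, b in zip(signal, background)]
    mu, sigma = mean(residual), pstdev(residual)
    return [i for i, r in enumerate(residual) if abs(r - mu) > k * sigma]


# Hypothetical profile: a slow intensity ramp with a low-contrast dip at 9..10.
profile = [100 + i for i in range(20)]
profile[9] -= 15
profile[10] -= 15
defects = detect_defects(profile)
```

The median window must be wider than the defect so that the defect does not leak into the background estimate; here a 7-sample window covers a 2-sample dip.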
Cone photoreceptor cell identification is important for the early diagnosis of retinopathy. In this study, an object detection algorithm is used for cone cell identification in confocal adaptive optics scanning laser ophthalmoscope (AOSLO) images. An effectiveness evaluation of identification using the proposed method reveals a precision, recall, and F1-score of 95.8%, 96.5%, and 96.1%, respectively, taking manual identification as the ground truth. Various object detection and identification results from images with different cone photoreceptor cell distributions further demonstrate the performance of the proposed method. Overall, the proposed method can accurately identify cone photoreceptor cells in confocal AOSLO images, with results comparable to manual identification.
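The three metrics reported in the abstract above are related by the standard definitions: precision = TP / (TP + FP), recall = TP / (TP + FN), and F1 as their harmonic mean. The sketch below is a generic illustration; the function names and the example counts are hypothetical and not taken from the study.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from detection counts.

    tp: detections matching a manually marked cell
    fp: detections with no matching manual mark
    fn: manually marked cells that were missed
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_from(precision, recall)


# Hypothetical counts, for illustration only.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)

# Consistency check on the reported figures (95.8% precision, 96.5% recall).
reported_f1 = f1_from(0.958, 0.965)
```

Plugging in the reported precision and recall gives an F1 of about 96.1%, which agrees with the figure stated in the abstract.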
Because remote sensing sensors capture a bird's-eye view, the orientation of an object is a key factor in object detection. To obtain rotated bounding boxes, existing studies rely either on rotated anchoring schemes or on complex rotating ROI transfer layers, leading to increased computational demand and reduced detection speed. In this study, we propose a novel internal-external optimized convolutional neural network for arbitrarily oriented object detection in optical remote sensing images. For the internal optimization, we designed an anchor-based single-shot head detector that adopts the coarse-to-fine detection concept of two-stage object detection networks. The refined rotating anchors are generated by the coarse detection head module and fed into the refining detection head module through an embedded deformable convolutional layer. For the external optimization, we propose an IOU balanced loss that addresses the regression challenges of arbitrarily oriented bounding boxes. Experimental results on the DOTA and HRSC2016 benchmark datasets show that our proposed method outperforms the selected methods.
Transformer-based models have facilitated significant advances in object detection. However, their extensive computational consumption and suboptimal detection of dense small objects curtail their applicability in unmanned aerial vehicle (UAV) imagery. Addressing these limitations, we propose a hybrid transformer-based detector, H-DETR, and enhance it for dense small objects, leading to an accurate and efficient model. Firstly, we introduce a hybrid transformer encoder, which integrates a convolutional neural network-based cross-scale fusion module with the original encoder to handle multi-scale feature sequences more efficiently. Furthermore, we propose two novel strategies to enhance detection performance without incurring additional inference computation. Query filter is designed to cope with the dense clustering inherent in drone-captured images by counteracting similar queries with a training-aware non-maximum suppression. Adversarial denoising learning is a novel enhancement method inspired by adversarial learning, which improves the detection of numerous small targets by counteracting the effects of artificial spatial and semantic noise. Extensive experiments on the VisDrone and UAVDT datasets substantiate the effectiveness of our approach, achieving a significant improvement in accuracy with a reduction in computational complexity. Our method achieves 31.9% and 21.1% AP on the VisDrone and UAVDT datasets, respectively, and has a faster inference speed, making it a competitive model in UAV image object detection.
Small-object detection has long been a challenge. High-megapixel cameras are used to solve this problem in industry. However, current detectors are inefficient for high-resolution images. In this work, we propose a new module called Pre-Locate Net, a plug-and-play structure that can be combined with most popular detectors. Inspired by classification, we use it to obtain candidate regions in images, greatly reducing the amount of calculation and thus achieving rapid detection in high-resolution images. Pre-Locate Net mainly includes two parts: candidate region classification and behavior classification. Candidate region classification is used to obtain a candidate region, and behavior classification is used to estimate the scale of an object. Different follow-up processing is adopted for different scales to balance the variance of the network input. Unlike popular candidate region generation methods, we abandon bounding-box regression and adopt classification instead, so that a candidate region can be predicted in the shallow network. We build a high-resolution dataset of aircraft and landing gears covering complex scenes to verify the effectiveness of our method. Compared to state-of-the-art detectors (e.g., Guided Anchoring, Libra-RCNN, and FASF), our method achieves the best mAP of 94.5 on 1920×1080 images at 16.7 FPS.
Remote sensing image object detection is one of the core tasks of remote sensing image processing. In recent years, with the development of deep learning, great progress has been made in object detection in remote sensing. However, dense small targets, complex backgrounds and poor target positioning accuracy still make the detection of remote sensing targets difficult. To solve these problems, this research proposes a remote sensing image object detection algorithm based on an improved YOLOX-S. First, the Efficient Channel Attention (ECA) module is introduced to improve the network's ability to extract image features and suppress useless information such as background. Second, the loss function is optimized to improve the regression accuracy of the target bounding box. We evaluate the effectiveness of our algorithm on the NWPU VHR-10 remote sensing image dataset. The experimental results show that the detection accuracy of the algorithm reaches 95.5% without increasing the number of parameters. This is a significant improvement over the original YOLOX-S network, and the detection performance is much better than that of some other mainstream remote sensing image detection methods. In addition, our method shows good generalization in experiments on aircraft images in the RSOD dataset.
Aerial image sequence mosaicking is one of the challenging research fields in computer vision. To obtain large-scale orthophoto maps with object detection information, we propose a vision-based image mosaicking algorithm that requires no extra location data. Based on object detection results, we define a complexity factor to describe the importance of each input image and dynamically optimize the feature extraction process. The feature point extraction and matching processes are mainly guided by the speeded-up robust features (SURF) and the grid motion statistic (GMS) algorithms, respectively. A robust reference frame selection method is proposed to eliminate transformation distortion by searching for the center area based on overlaps. In addition, the sparse Levenberg-Marquardt (LM) algorithm and a method for removing heavily occluded frames are applied to reduce accumulated errors and further improve the mosaicking performance. The proposed algorithm is run with multithreading and graphics processing unit (GPU) acceleration on several aerial image datasets. Extensive experimental results demonstrate that our algorithm outperforms most existing aerial image mosaicking methods in visual quality while guaranteeing a high calculation speed.
In image processing, one of the most important steps is image segmentation. The objects in remote sensing images often have to be detected before the next steps in image processing can be performed. Remote sensing images are usually large and have various spatial resolutions, so detecting objects in them is very complicated. In this paper, we develop a model to detect objects in remote sensing images based on the combination of picture fuzzy clustering and the MapReduce method (denoted as MPFC). First, picture fuzzy clustering is applied to segment the input images. Then, MapReduce is used to reduce the runtime while preserving quality. To convert data for MapReduce processing, two new procedures are introduced: Map_PFC and Reduce_PFC. The formal representation and details of these two procedures are presented in this paper. Experiments on satellite image and remote sensing image datasets are given to evaluate the proposed model. Validity indices and running time are used to compare the proposed model with the picture fuzzy clustering model. The values of the validity indices show that picture fuzzy clustering integrated with MapReduce achieves better segmentation quality than picture fuzzy clustering alone. Moreover, on two selected image datasets, the run time of the MPFC model is much less than that of picture fuzzy clustering.
Funding: National Natural Science Foundation of China (Nos. 61302159, 61227003, 61301259); Natural Science Foundation of Shanxi Province (No. 2012021011-2); Specialized Research Fund for the Doctoral Program of Higher Education, China (No. 20121420110006); Top Science and Technology Innovation Teams of Higher Learning Institutions of Shanxi Province, China; Project Sponsored by Scientific Research for the Returned Overseas Chinese Scholars, Shanxi Province (No. 2013-083).
Abstract: Real-time detection of object size has become a hot topic in the testing field, and image processing is its core algorithm. This paper focuses on processing and displaying the collected dynamic images to achieve real-time image processing for moving objects. First, median filtering, gain calibration, image segmentation, image binarization, corner detection and edge fitting are applied to the images of the moving objects so that the processed image closely matches the real object. Then, the processed images are displayed in real time, which makes them easier to analyze, understand and identify and thus reduces the computational complexity. Finally, a human-computer interaction (HCI)-friendly interface based on VC++ is designed to accomplish the digital logic transform, image processing and real-time display of the objects. Experiments show that the proposed algorithm and software design achieve better real-time performance and accuracy and can meet industrial needs.
Funding: Supported in part by the National Key R&D Program of China (Grant No. 2023YFB3307604); the Shanxi Province Basic Research Program Youth Science Research Project (Grant Nos. 202303021212054 and 202303021212046); the Key Projects Supported by Hebei Natural Science Foundation (Grant No. E2024203125); the National Science Foundation of China (Grant No. 52105391); the Hebei Provincial Science and Technology Major Project (Grant No. 23280101Z); and the National Key Laboratory of Metal Forming Technology and Heavy Equipment Open Fund (Grant No. S2308100.W17).
Abstract: A novel dual-branch decoding fusion convolutional neural network model (DDFNet), specifically designed for real-time salient object detection (SOD) on steel surfaces, is proposed. DDFNet is based on a standard encoder–decoder architecture and integrates three key innovations. First, we introduce a novel, lightweight multi-scale progressive aggregation residual network that effectively suppresses background interference and refines defect details, enabling efficient salient feature extraction. Then, we propose an innovative dual-branch decoding fusion structure, comprising the refined defect representation branch and the enhanced defect representation branch, which enhances accuracy in defect region identification and feature representation. Additionally, to further improve the detection of small and complex defects, we incorporate a multi-scale attention fusion module. Experimental results on the public ESDIs-SOD dataset show that DDFNet, with only 3.69 million parameters, achieves detection performance comparable to current state-of-the-art models, demonstrating its potential for real-time industrial applications. Furthermore, our DDFNet-L variant consistently outperforms leading methods in detection performance. The code is available at https://github.com/13140W/DDFNet.
Abstract: To address the challenges of low accuracy and insufficient real-time performance in dynamic object detection for UAV surveillance, this paper proposes a novel tracking framework that integrates a lightweight improved YOLOv5s model with adaptive motion compensation. A UAV-view dynamic feature enhancement strategy is introduced, and a lightweight detection network combining attention mechanisms and multi-scale fusion is constructed. The robustness of tracking under motion blur is also optimized. Experimental results demonstrate that the proposed method achieves a mAP@0.5 of 68.2% on the VisDrone dataset and reaches an inference speed of 32 FPS on the NVIDIA Jetson TX2 platform. This significantly improves the balance between accuracy and efficiency in complex scenes, offering reliable technical support for real-time applications such as emergency response.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 62302086); the Natural Science Foundation of Liaoning Province (Grant No. 2023-MSBA-070); and the Fundamental Research Funds for the Central Universities (Grant No. N2317005).
Abstract: Visible-infrared object detection leverages the day-night stable object perception capability of infrared images to enhance detection robustness in low-light environments by fusing the complementary information of visible and infrared images. However, the inherent differences in the imaging mechanisms of the visible and infrared modalities make effective cross-modal fusion challenging. Furthermore, constrained by the physical characteristics of sensors and thermal diffusion effects, infrared images generally suffer from blurred object contours and missing details, making it difficult to extract object features effectively. To address these issues, we propose an infrared-visible image fusion network that realizes multimodal information fusion of infrared and visible images through a carefully designed multiscale fusion strategy. First, we design an adaptive gray-radiance enhancement (AGRE) module to strengthen the detail representation in infrared images, improving their usability in complex lighting scenarios. Next, we introduce a channel-spatial feature interaction (CSFI) module, which achieves efficient complementarity between the RGB and infrared (IR) modalities via dynamic channel switching and a spatial attention mechanism. Finally, we propose a multi-scale enhanced cross-attention fusion (MSECA) module, which optimizes the fusion of multi-level features through dynamic convolution and gating mechanisms and captures long-range complementary relationships of cross-modal features on a global scale, thereby enhancing the expressiveness of the fused features. Experiments on the KAIST, M3FD, and FLIR datasets demonstrate that our method delivers outstanding performance in daytime and nighttime scenarios. On the KAIST dataset, the miss rate drops to 5.99%, and further to 4.26% in night scenes. On the FLIR and M3FD datasets, it achieves AP50 scores of 79.4% and 88.9%, respectively.
Funding: Co-supported by the National Natural Science Foundation of China (No. 62103190) and the Natural Science Foundation of Jiangsu Province, China (No. BK20230923).
Abstract: When detecting objects in images taken by Unmanned Aerial Vehicles (UAVs), the large number of objects and the high proportion of small objects pose major challenges for detection algorithms based on the You Only Look Once (YOLO) framework, making it difficult for them to handle tasks that demand high precision. To address these problems, this paper proposes a high-precision object detection algorithm based on YOLOv10s. First, a Multi-branch Enhancement Coordinate Attention (MECA) module is proposed to enhance feature extraction capability. Second, a Multilayer Feature Reconstruction (MFR) mechanism is designed to fully exploit multilayer features, which enriches object information and removes redundant information. Finally, an MFR Path Aggregation Network (MFR-Neck) is constructed, which integrates multi-scale features to improve the network's ability to perceive objects of varying sizes. The experimental results demonstrate that the proposed algorithm increases the average detection accuracy by 14.15% on the VisDrone dataset compared to YOLOv10s, effectively enhancing object detection precision in UAV-taken images.
Abstract: Recent years have seen a surge in interest in object detection on remote sensing images for applications such as surveillance and management. However, challenges like small object detection, scale variation, and the presence of closely packed objects in these images hinder accurate detection. Additionally, motion blur further complicates the identification of such objects. To address these issues, we propose an enhanced YOLOv9 with a transformer head (YOLOv9-TH). The model introduces an additional prediction head for detecting objects of varying sizes and swaps the original prediction heads for transformer heads to leverage self-attention mechanisms. We further improve YOLOv9-TH using several strategies, including data augmentation, multi-scale testing, multi-model integration, and the introduction of an additional classifier. The cross-stage partial (CSP) method and the ghost convolution hierarchical graph (GCHG) are combined to improve detection accuracy by better utilizing feature maps, widening the receptive field, and precisely extracting multi-scale objects. Additionally, we incorporate the E-SimAM attention mechanism to address low-resolution feature loss. Extensive experiments on the VisDrone2021 and DIOR datasets demonstrate the effectiveness of YOLOv9-TH, showing a good improvement in mAP compared to the best existing methods. YOLOv9-TH-e achieved an mAP50 of 54.2% on the VisDrone2021 dataset and an mAP of 92.3% on the DIOR dataset. The results confirm the model’s robustness and suitability for real-world applications, particularly for small object detection in remote sensing images.
Funding: National Science and Technology Council, Republic of China, under grant NSTC 113-2221-E-194-011-MY3, and the Research Center on Artificial Intelligence and Sustainability, National Chung Cheng University, under the research project grant titled “Generative Digital Twin System Design for Sustainable Smart City Development in Taiwan.”
Abstract: Modern manufacturing processes have become more reliant on automation because of the accelerated transition from Industry 3.0 to Industry 4.0. Manual inspection of products on assembly lines remains inefficient, error-prone and inconsistent, emphasizing the need for a reliable, automated inspection system. Leveraging both object detection and image segmentation approaches, this research proposes a vision-based solution for detecting various kinds of tools in a toolkit using deep learning (DL) models. Two Intel RealSense D455f depth cameras were arranged in a top-down configuration to capture both RGB and depth images of the toolkits. After applying multiple constraints and enhancing the images through preprocessing and augmentation, a dataset of 3300 annotated RGB-D photos was generated. Several DL models were selected through a comprehensive assessment of mean Average Precision (mAP), precision-recall balance, inference latency (target ≥ 30 FPS), and computational burden, resulting in a preference for YOLO and Region-based Convolutional Neural Network (R-CNN) variants over ViT-based models due to the latter’s higher latency and resource requirements. YOLOv5, YOLOv8, YOLOv11, Faster R-CNN, and Mask R-CNN were trained on the annotated dataset and evaluated using key performance metrics (Recall, Accuracy, F1-score, and Precision). YOLOv11 demonstrated balanced excellence, with 93.0% precision, 89.9% recall, and a 90.6% F1-score in object detection, as well as 96.9% precision, 95.3% recall, and a 96.5% F1-score in instance segmentation, at an average inference time of 25 ms per frame (≈40 FPS), demonstrating real-time performance. Leveraging these results, a YOLOv11-based Windows application was successfully deployed in a real-time assembly line environment, where it accurately processed live video streams to detect and segment tools within toolkits, demonstrating its practical effectiveness in industrial automation. In addition to detection and segmentation, the application can precisely measure socket dimensions by applying edge detection techniques to YOLOv11 segmentation masks. This enables specification-level quality control directly on the assembly line and improves real-time inspection capability. The implementation is a significant step forward for intelligent manufacturing in the Industry 4.0 paradigm, providing a scalable, efficient, and accurate approach to automated inspection and dimensional verification.
Abstract: Automatic analysis of student behavior in classrooms has gained importance with the rise of smart education and vision technologies. However, the limited real-time accuracy of existing methods severely constrains their practical classroom deployment. To address this issue of low accuracy, we propose an improved YOLOv11-based detector that integrates CARAFE upsampling, DySnakeConv, DyHead, and SMFA fusion modules. This new model for real-time classroom behavior detection captures fine-grained student behaviors with low latency. Additionally, we have developed a visualization system that presents data through intuitive dashboards. This system enables teachers to dynamically grasp classroom engagement by tracking student participation and involvement. The enhanced YOLOv11 model achieves an mAP@0.5 of 87.2% on the evaluated datasets, surpassing baseline models. The significance of this work lies in two aspects. First, it provides a practical technical route for deployable live classroom behavior monitoring and engagement feedback systems. Second, by integrating the proposed system, educators could make data-informed and fine-grained teaching decisions, ultimately improving instructional quality and learning outcomes.
Funding: Supported by the National Natural Science Foundation of China under grant number 62066016; the Natural Science Foundation of Hunan Province of China under grant number 2024JJ7395; the Scientific Research Project of Education Department of Hunan Province of China under grant number 22B0549; the International and Regional Science and Technology Cooperation and Exchange Program of the Hunan Association for Science and Technology under grant number 025SKX-KJ-04; and the Hunan Province Undergraduate Innovation and Entrepreneurship Training Program (grant number S202410531015).
Abstract: The YOLO (You Only Look Once) series, a leading single-stage object detection framework, has gained significant prominence in medical image analysis due to its real-time efficiency and robust performance. Recent iterations of YOLO have further enhanced its accuracy and reliability in critical clinical tasks such as tumor detection, lesion segmentation, and microscopic image analysis, thereby accelerating the development of clinical decision support systems. This paper systematically reviews advances in YOLO-based medical object detection from 2018 to 2024. It compares YOLO’s performance with other models (e.g., Faster R-CNN, RetinaNet) in medical contexts, summarizes standard evaluation metrics (e.g., mean Average Precision (mAP), sensitivity), and analyzes hardware deployment strategies using public datasets such as LUNA16, BraTS, and CheXpert. The review highlights the impressive performance of YOLO models, particularly from YOLOv5 to YOLOv8, in achieving high precision (up to 99.17%), sensitivity (up to 97.5%), and mAP exceeding 95% in tasks such as lung nodule, breast cancer, and polyp detection. These results demonstrate the significant potential of YOLO models for early disease detection and real-time clinical applications, indicating their ability to enhance clinical workflows. However, the study also identifies key challenges, including high small-object miss rates, limited generalization in low-contrast images, scarcity of annotated data, and model interpretability issues. Finally, potential future research directions are proposed to address these challenges and further advance the application of YOLO models in healthcare.
Funding: Supported by the Natural Science Foundation of Jiangsu Province, China (BK20240977); the China Scholarship Council (201606850024); the National High Technology Research and Development Program of China (2016YFD0701003); and the Postgraduate Research & Practice Innovation Program of Jiangsu Province, China (SJCX23_1488).
Abstract: Deep learning-based intelligent recognition algorithms are increasingly recognized for their potential to address the labor-intensive challenge of manual pest detection. However, their deployment on mobile devices has been constrained by high computational demands. Here, we developed GBiDC-PEST, a mobile application that incorporates an improved, lightweight detection algorithm based on the You Only Look Once (YOLO) series single-stage architecture, for real-time detection of four tiny pests (wheat mites, sugarcane aphids, wheat aphids, and rice planthoppers). GBiDC-PEST incorporates several innovative modules, including GhostNet for lightweight feature extraction and architecture optimization by reconstructing the backbone, the bi-directional feature pyramid network (BiFPN) for enhanced multiscale feature fusion, depthwise convolution (DWConv) layers to reduce computational load, and the convolutional block attention module (CBAM) to enable precise feature focus. GBiDC-PEST was trained and validated using a multitarget agricultural tiny pest dataset (Tpest-3960) that covers various field environments. GBiDC-PEST (2.8 MB) reduced the model size to only 20% of the original model size, offering a smaller size than the YOLO series (v5-v10), higher detection accuracy than YOLOv10n and v10s, and faster detection speed than v8s, v9c, v10m and v10b. In Android deployment experiments, GBiDC-PEST demonstrated enhanced performance in detecting pests against complex backgrounds, and the accuracy for wheat mites and rice planthoppers was improved by 4.5-7.5% compared with the original model. The GBiDC-PEST optimization algorithm and its mobile deployment proposed in this study offer a robust technical framework for the rapid, onsite identification and localization of tiny pests. This advancement provides valuable insights for effective pest monitoring, counting, and control in various agricultural settings.
Abstract: To address the long, narrow geometry of the damage and the poor generalization ability of damage detection in X-ray images of conveyor belts with steel rope cores, a damage detection method for X-ray images is proposed based on an improved fully convolutional one-stage object detection (FCOS) algorithm. The regression performance of bounding boxes is optimized by introducing the complete intersection over union loss function into the improved algorithm. The feature fusion network is modified by adding adaptive fusion paths, which makes full use of the accurate localization and semantic features of multi-scale feature fusion networks. The network is then trained and validated on an X-ray image dataset of damage in conveyor belts with steel rope cores provided by a flaw detection equipment manufacturer. In addition, data enhancement methods such as rotating, mirroring, and scaling are employed to enrich the image dataset so that the model is adequately trained. Experimental results show that the improved FCOS algorithm raises the precision rate and the recall rate by 20.9% and 14.8%, respectively, compared with the original algorithm. Compared with Fast R-CNN, Faster R-CNN, SSD, and YOLOv3, the improved FCOS algorithm has obvious advantages; the detection precision rate and recall rate of the modified network reach 95.8% and 97.0%, respectively. Furthermore, it demonstrates higher detection accuracy without affecting the speed. The results of this work have some reference significance for the automatic identification and detection of steel core conveyor belt damage.
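The complete intersection over union (CIoU) loss mentioned in this abstract can be illustrated with a minimal sketch. The code below uses the commonly published CIoU definition for two axis-aligned boxes; the function name `ciou_loss` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration and are not taken from the paper.

```python
import math

def ciou_loss(pred, target):
    """CIoU loss for two axis-aligned boxes (x1, y1, x2, y2) with positive size.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Plain IoU: overlap area divided by union area.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    iou = inter / (pw * ph + tw * th - inter)

    # Squared distance between the two box centers.
    rho2 = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (ty1 + ty2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    c2 = cw ** 2 + ch ** 2

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(tw / th) - math.atan(pw / ph)) ** 2
    alpha = v / (1 - iou + v) if v > 0 else 0.0

    return 1 - iou + rho2 / c2 + alpha * v
```

Unlike a plain IoU loss, the center-distance and aspect-ratio terms remain informative when the boxes do not overlap or differ strongly in shape, which is plausibly why such a loss suits long, narrow targets.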
Abstract: Efficient banana crop detection is crucial for precision agriculture; however, traditional remote sensing methods often lack the spatial resolution required for accurate identification. This study utilizes low-altitude Unmanned Aerial Vehicle (UAV) images and deep learning-based object detection models to enhance banana plant detection. A comparative analysis of Faster Region-Based Convolutional Neural Network (Faster R-CNN), You Only Look Once Version 3 (YOLOv3), Retina Network (RetinaNet), and Single Shot MultiBox Detector (SSD) was conducted to evaluate their effectiveness. Results show that RetinaNet achieved the highest detection accuracy, with a precision of 96.67%, a recall of 71.67%, and an F1 score of 81.33%. The study further highlights the impact of scale variation, occlusion, and vegetation density on detection performance. Unlike previous studies, this research systematically evaluates multi-scale object detection models for banana plant identification, offering insights into the advantages of UAV-based deep learning applications in agriculture. In addition, this study compares five evaluation metrics across the four detection models using both RGB and grayscale images. Specifically, RetinaNet exhibited the best overall performance with grayscale images, achieving the highest values across all five metrics. Compared to its performance with RGB images, these results represent a marked improvement, confirming the potential of grayscale preprocessing to enhance detection capability.
Abstract: A fast and effective method is proposed to adaptively detect weld defects in various types of real-time X-ray images obtained under different conditions. After weld extraction and noise reduction, a median filter with a proper template is used to estimate the weld background. After the weld background is subtracted from the original image, an adaptive threshold segmentation algorithm is proposed to obtain the binary image, and then morphological close and open operations, a labeling algorithm and a false alarm elimination algorithm are applied to the binary image to obtain the defect detection result. Finally, a fast realization procedure for the proposed method is developed. The proposed method is tested on real-time X-ray images obtained with different X-ray imaging systems. Experimental results show that the proposed method is effective in detecting low-contrast weld defects with few false alarms and is adaptive to various types of real-time X-ray imaging systems.
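The background-estimation and thresholding steps described in this abstract can be sketched on a single intensity profile across the weld. This is a simplified 1-D illustration under stated assumptions: the window half-width, the `mean + k * std` threshold rule, and the function names are hypothetical choices, not the paper's adaptive threshold, and the morphological, labeling and false-alarm steps are omitted.

```python
from statistics import mean, median, pstdev

def median_background(profile, half_width):
    """Estimate a slowly varying background with a sliding median.

    The window is truncated at both ends of the profile.
    """
    n = len(profile)
    return [
        median(profile[max(0, i - half_width):min(n, i + half_width + 1)])
        for i in range(n)
    ]

def detect_defects(profile, half_width=3, k=3.0):
    """Return a 0/1 mask marking samples that deviate from the background.

    The threshold (mean + k * std of the residual) is an assumed rule
    chosen for illustration only.
    """
    background = median_background(profile, half_width)
    residual = [abs(p - b) for p, b in zip(profile, background)]
    threshold = mean(residual) + k * pstdev(residual)
    return [1 if r > threshold else 0 for r in residual]

# A flat weld profile with one darker sample standing in for a defect.
profile = [100] * 20
profile[10] = 60
mask = detect_defects(profile)
```

The median window must be wider than the defect for the background estimate to ignore it; otherwise the defect is absorbed into the background and disappears after subtraction.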
Funding: Natural Science Foundation of Jiangsu Province (BK20200214); National Key R&D Program of China (2017YFB0403701); Jiangsu Province Key R&D Program (BE2019682 and BE2018667); National Natural Science Foundation of China (61605210, 61675226, and 62075235); Youth Innovation Promotion Association of Chinese Academy of Sciences (2019320); Frontier Science Research Project of the Chinese Academy of Sciences (QYZDB-SSW-JSC03); Strategic Priority Research Program of the Chinese Academy of Sciences (XDB02060000); and Entrepreneurship and Innovation Talents in Jiangsu Province (Innovation of Scientific Research Institutes).
Abstract: Cone photoreceptor cell identification is important for the early diagnosis of retinopathy. In this study, an object detection algorithm is used for cone cell identification in confocal adaptive optics scanning laser ophthalmoscope (AOSLO) images. An effectiveness evaluation of identification using the proposed method reveals a precision, recall, and F1-score of 95.8%, 96.5%, and 96.1%, respectively, taking manual identification as the ground truth. Various object detection and identification results from images with different cone photoreceptor cell distributions further demonstrate the performance of the proposed method. Overall, the proposed method can accurately identify cone photoreceptor cells in confocal AOSLO images, with performance comparable to manual identification.
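As a quick consistency check on the figures reported in this abstract, the F1-score is the harmonic mean of precision and recall. The helper name below is an illustrative assumption.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both as fractions in (0, 1])."""
    return 2 * precision * recall / (precision + recall)

# Precision 95.8% and recall 96.5%, as reported in the abstract.
print(round(100 * f1_score(0.958, 0.965), 1))  # 96.1
```

The computed value matches the reported F1-score of 96.1%.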
Funding: This work is supported by the National Natural Science Foundation of China (grant numbers 41890820, 41771452, 41771454, and 41901340).
Abstract: Due to the bird’s-eye view of remote sensing sensors, the orientation of an object is a key factor that has to be considered in object detection. To obtain rotated bounding boxes, existing studies either rely on rotated anchoring schemes or add complex rotating ROI transfer layers, leading to increased computational demand and reduced detection speed. In this study, we propose a novel internal-external optimized convolutional neural network for arbitrarily oriented object detection in optical remote sensing images. For the internal optimization, we design an anchor-based single-shot head detector that adopts the coarse-to-fine detection concept of two-stage object detection networks. The refined rotating anchors are generated by the coarse detection head module and fed into the refining detection head module through an embedded deformable convolutional layer. For the external optimization, we propose an IoU-balanced loss that addresses the regression challenges related to arbitrarily oriented bounding boxes. Experimental results on the DOTA and HRSC2016 benchmark datasets show that our proposed method outperforms the selected methods.
Funding: This research was funded by the Natural Science Foundation of Hebei Province (F2021506004).
Abstract: Transformer-based models have facilitated significant advances in object detection. However, their heavy computational cost and suboptimal detection of dense small objects limit their applicability to unmanned aerial vehicle (UAV) imagery. To address these limitations, we propose a hybrid transformer-based detector, H-DETR, and enhance it for dense small objects, leading to an accurate and efficient model. First, we introduce a hybrid transformer encoder, which integrates a convolutional neural network-based cross-scale fusion module with the original encoder to handle multi-scale feature sequences more efficiently. Furthermore, we propose two novel strategies to enhance detection performance without incurring additional inference computation. The query filter is designed to cope with the dense clustering inherent in drone-captured images by suppressing similar queries with a training-aware non-maximum suppression. Adversarial denoising learning is a novel enhancement method inspired by adversarial learning, which improves the detection of numerous small targets by counteracting the effects of artificial spatial and semantic noise. Extensive experiments on the VisDrone and UAVDT datasets substantiate the effectiveness of our approach, achieving a significant improvement in accuracy with a reduction in computational complexity. Our method achieves 31.9% and 21.1% AP on the VisDrone and UAVDT datasets, respectively, and has a faster inference speed, making it a competitive model in UAV image object detection.
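For readers unfamiliar with non-maximum suppression, the following is a minimal sketch of the standard greedy variant applied to scored boxes. It is not the training-aware query filter of this paper, whose details the abstract does not give; the function names, the `(x1, y1, x2, y2)` box layout and the 0.5 threshold are illustrative assumptions.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Keep the highest-scoring boxes, dropping any box that overlaps
    an already kept box by more than iou_threshold.

    Returns the indices of the kept boxes in descending score order.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)  # the second box overlaps the first and is dropped
```

In densely clustered scenes a fixed threshold can also suppress genuine neighbouring objects, which is one plausible motivation for filtering similar queries during training instead.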
Funding: National Science Fund for Distinguished Young Scholars of China (No. 51625501); Aeronautical Science Foundation of China (No. 201946051002).
Abstract: Small-object detection has long been a challenge. In industry, high-megapixel cameras are used to address this problem; however, current detectors are inefficient on high-resolution images. In this work, we propose a new module called Pre-Locate Net, a plug-and-play structure that can be combined with most popular detectors. Drawing on classification ideas, we use a classifier to obtain candidate regions in images, which greatly reduces the amount of computation and thus achieves rapid detection in high-resolution images. Pre-Locate Net mainly includes two parts: candidate region classification and behavior classification. Candidate region classification is used to obtain a candidate region, and behavior classification is used to estimate the scale of an object. Different follow-up processing is adopted for different scales to balance the variance of the network input. Unlike popular candidate region generation methods, we abandon bounding box regression and adopt classification instead, so that candidate regions can be predicted in the shallow layers of the network. We build a high-resolution dataset of aircraft and landing gears covering complex scenes to verify the effectiveness of our method. Compared to state-of-the-art detectors (e.g., Guided Anchoring, Libra-RCNN, and FASF), our method achieves the best mAP of 94.5 on 1920×1080 images at 16.7 FPS.
Funding: Supported by the National Natural Science Foundation of China (72174172, 71774134) and the Fundamental Research Funds for Central University, Southwest Minzu University (2022NYXXS094).
Abstract: Remote sensing image object detection is one of the core tasks of remote sensing image processing. In recent years, with the development of deep learning, great progress has been made in object detection in remote sensing. However, dense small targets, complex backgrounds and poor target positioning accuracy still make the detection of remote sensing targets difficult. To solve these problems, this research proposes a remote sensing image object detection algorithm based on an improved YOLOX-S. First, the Efficient Channel Attention (ECA) module is introduced to improve the network's ability to extract image features and suppress useless information such as background. Second, the loss function is optimized to improve the regression accuracy of the target bounding box. We evaluate the effectiveness of our algorithm on the NWPU VHR-10 remote sensing image dataset. The experimental results show that the detection accuracy of the algorithm reaches 95.5% without increasing the number of parameters. This is a significant improvement over the original YOLOX-S network, and the detection performance is much better than that of some other mainstream remote sensing image detection methods. In addition, our method shows good generalization in experiments on aircraft images in the RSOD dataset.
Funding: Supported by the National Natural Science Foundation of China (61603040, 61973036).
Abstract: Aerial image sequence mosaicking is one of the challenging research fields in computer vision. To obtain large-scale orthophoto maps with object detection information, we propose a vision-based image mosaicking algorithm that requires no extra location data. Based on object detection results, we define a complexity factor to describe the importance of each input image and dynamically optimize the feature extraction process. Feature point extraction and matching are mainly guided by the speeded-up robust features (SURF) and the grid motion statistic (GMS) algorithms, respectively. A robust reference frame selection method is proposed to eliminate transformation distortion by searching for the center area based on overlaps. In addition, the sparse Levenberg-Marquardt (LM) algorithm and a method for removing heavily occluded frames are applied to reduce accumulated errors and further improve the mosaicking performance. The proposed algorithm is implemented with multithreading and graphics processing unit (GPU) acceleration and run on several aerial image datasets. Extensive experimental results demonstrate that our algorithm outperforms most existing aerial image mosaicking methods in visual quality while maintaining a high calculation speed.
Funding: Funded by the Thuyloi University Foundation for Science and Technology under Grant Number TLU.STF.19-02.
Abstract: In image processing, one of the most important steps is image segmentation. Objects in remote sensing images often have to be detected before the next image processing steps can be performed. Remote sensing images are usually large and have various spatial resolutions, so detecting objects in them is very complicated. In this paper, we develop a model to detect objects in remote sensing images based on the combination of picture fuzzy clustering and the MapReduce method (denoted as MPFC). First, picture fuzzy clustering is applied to segment the input images. Then, MapReduce is used to reduce the runtime while preserving quality. To convert data for MapReduce processing, two new procedures are introduced, Map_PFC and Reduce_PFC. The formal representation and details of these two procedures are presented in this paper. Experiments on satellite image and remote sensing image datasets are given to evaluate the proposed model. Validity indices and running time are used to compare the proposed model with the picture fuzzy clustering model. The values of the validity indices show that picture fuzzy clustering integrated with MapReduce achieves better segmentation quality than picture fuzzy clustering alone. Moreover, on two selected image datasets, the run time of the MPFC model is much less than that of picture fuzzy clustering.
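The idea of splitting a clustering update into map and reduce steps can be sketched as follows. This illustration uses standard fuzzy c-means on scalar pixel intensities rather than picture fuzzy clustering, and it runs the map step in a plain loop instead of on a MapReduce cluster; the functions `memberships`, `map_chunk`, `reduce_partials` and `update_centers` are hypothetical and are not the Map_PFC and Reduce_PFC procedures of the paper.

```python
from functools import reduce

def memberships(x, centers, m=2.0):
    """Standard fuzzy c-means membership of value x in each cluster."""
    dists = [abs(x - c) for c in centers]
    zeros = [d == 0.0 for d in dists]
    if any(zeros):
        # x coincides with one or more centers: share full membership among them.
        share = 1.0 / sum(zeros)
        return [share if z else 0.0 for z in zeros]
    inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
    total = sum(inv)
    return [v / total for v in inv]

def map_chunk(chunk, centers, m=2.0):
    """Map step: per-cluster partial sums (sum of u^m * x, sum of u^m) for one chunk."""
    partial = [(0.0, 0.0)] * len(centers)
    for x in chunk:
        weights = [u ** m for u in memberships(x, centers, m)]
        partial = [(n + w * x, d + w) for (n, d), w in zip(partial, weights)]
    return partial

def reduce_partials(a, b):
    """Reduce step: add the partial sums of two chunks cluster by cluster."""
    return [(na + nb, da + db) for (na, da), (nb, db) in zip(a, b)]

def update_centers(chunks, centers, m=2.0):
    """One center update computed from independently processed chunks."""
    partials = [map_chunk(chunk, centers, m) for chunk in chunks]
    return [n / d for n, d in reduce(reduce_partials, partials)]

# Two image tiles processed independently, then combined.
centers = [0.0, 12.0]
for _ in range(20):
    centers = update_centers([[0, 1, 2], [10, 11, 12]], centers)
```

Because the center update is a ratio of sums, each chunk only has to emit two numbers per cluster, so the result does not depend on how the pixels are partitioned. This is the property that makes this kind of clustering amenable to MapReduce.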