Accurate and efficient detection of building changes in remote sensing imagery is crucial for urban planning, disaster emergency response, and resource management. However, existing methods face challenges such as spectral similarity between buildings and backgrounds, sensor variations, and insufficient computational efficiency. To address these challenges, this paper proposes a novel Multi-scale Efficient Wavelet-based Change Detection Network (MewCDNet), which integrates the advantages of Convolutional Neural Networks and Transformers, balances computational costs, and achieves high-performance building change detection. The network employs EfficientNet-B4 as the backbone for hierarchical feature extraction, integrates multi-level feature maps through a multi-scale fusion strategy, and incorporates two key modules: Cross-temporal Difference Detection (CTDD) and Cross-scale Wavelet Refinement (CSWR). CTDD adopts a dual-branch architecture that combines pixel-wise differencing with semantic-aware Euclidean distance weighting to enhance the distinction between true changes and background noise. CSWR integrates a Haar-based Discrete Wavelet Transform with multi-head cross-attention mechanisms, enabling cross-scale feature fusion while significantly improving edge localization and suppressing spurious changes. Extensive experiments on four benchmark datasets show that MewCDNet outperforms the comparison methods, achieving F1 scores of 91.54% on LEVIR, 93.70% on WHUCD, and 64.96% on S2Looking for building change detection. MewCDNet also achieves the best performance on the multi-class SYSU dataset (F1: 82.71%), indicating strong generalization capability.
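To make the Haar-based Discrete Wavelet Transform mentioned above concrete, the following is a minimal sketch of a one-level orthonormal 2D Haar decomposition in plain Python. It is not the MewCDNet or CSWR implementation; the function name `haar_dwt2` and the sub-band names are assumptions chosen for illustration. It shows how a feature map splits into a half-resolution approximation plus three detail bands, which is where edge information concentrates.

```python
from typing import List, Tuple

Grid = List[List[float]]


def haar_dwt2(img: Grid) -> Tuple[Grid, Grid, Grid, Grid]:
    """One-level orthonormal 2D Haar DWT (hypothetical helper, for illustration).

    Returns (approx, col_diff, row_diff, diag), each half the input size.
    The input must have even height and even width.
    """
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must both be even")
    approx, col_diff, row_diff, diag = [], [], [], []
    for r in range(0, h, 2):
        a_row, c_row, r_row, d_row = [], [], [], []
        for c in range(0, w, 2):
            # The four pixels of the current 2x2 block.
            tl, tr = img[r][c], img[r][c + 1]
            bl, br = img[r + 1][c], img[r + 1][c + 1]
            a_row.append((tl + tr + bl + br) / 2.0)  # low-pass average
            c_row.append((tl - tr + bl - br) / 2.0)  # left minus right columns
            r_row.append((tl + tr - bl - br) / 2.0)  # top minus bottom rows
            d_row.append((tl - tr - bl + br) / 2.0)  # diagonal detail
        approx.append(a_row)
        col_diff.append(c_row)
        row_diff.append(r_row)
        diag.append(d_row)
    return approx, col_diff, row_diff, diag


if __name__ == "__main__":
    # A vertical edge inside the right-hand 2x2 block.
    bands = haar_dwt2([[1, 1, 1, 5], [1, 1, 1, 5]])
    for name, band in zip(("approx", "col_diff", "row_diff", "diag"), bands):
        print(name, band)
```

In this toy input only the left-minus-right band is non-zero, and only in the block that contains the edge, which illustrates why wavelet detail bands are a natural cue for edge localization.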
With the rapid expansion of drone applications, accurate detection of objects in aerial imagery has become crucial for intelligent transportation, urban management, and emergency rescue missions. However, existing methods face numerous challenges in practical deployment, including scale variation handling, feature degradation, and complex backgrounds. To address these issues, we propose Edge-enhanced and Detail-Capturing You Only Look Once (EHDC-YOLO), a novel framework for object detection in Unmanned Aerial Vehicle (UAV) imagery. Based on the You Only Look Once version 11 nano (YOLOv11n) baseline, EHDC-YOLO systematically introduces several architectural enhancements: (1) a Multi-Scale Edge Enhancement (MSEE) module that leverages multi-scale pooling and edge information to enhance boundary feature extraction; (2) an Enhanced Feature Pyramid Network (EFPN) that integrates P2-level features with Cross Stage Partial (CSP) structures and OmniKernel convolutions for better fine-grained representation; and (3) a Dynamic Head (DyHead) with multi-dimensional attention mechanisms for enhanced cross-scale modeling and perspective adaptability. Comprehensive experiments on the Vision meets Drones for Detection (VisDrone-DET) 2019 dataset demonstrate that EHDC-YOLO achieves significant improvements, increasing mean Average Precision (mAP)@0.5 from 33.2% to 46.1% (an absolute improvement of 12.9 percentage points) and mAP@0.5:0.95 from 19.5% to 28.0% (an absolute improvement of 8.5 percentage points) compared with the YOLOv11n baseline, while maintaining a reasonable parameter count (2.81 M vs. the baseline's 2.58 M). Further ablation studies confirm the effectiveness of each proposed component, while visualization results highlight EHDC-YOLO's superior performance in detecting objects and handling occlusions in complex drone scenarios.
Impact craters are important for understanding lunar geological evolution and surface erosion rates, among other functions. However, micro impact craters are numerous and their morphological characteristics are not obvious, resulting in low detection accuracy by deep learning models. Therefore, we proposed a new multi-scale fusion crater detection algorithm (MSF-CDA) based on YOLO11 to improve the accuracy of lunar impact crater detection, especially for small craters with a diameter of <1 km. Using images taken by the Lunar Reconnaissance Orbiter Camera (LROC) at the Chang'e-4 (CE-4) landing area, we constructed three separate datasets for craters with diameters of 0-70 m, 70-140 m, and >140 m. We then trained three submodels separately on these three datasets. Additionally, we designed a slicing-amplifying-slicing strategy to enhance the ability to extract features from small craters. To handle redundant predictions, we proposed a new Non-Maximum Suppression with Area Filtering method to fuse the results for overlapping targets across the multi-scale submodels. Our new MSF-CDA method achieved high detection performance, with Precision, Recall, and F1 score values of 0.991, 0.987, and 0.989, respectively, effectively addressing the problems caused by the sparse features and sample imbalance of small craters. MSF-CDA can provide strong data support for more in-depth study of the geological evolution of the lunar surface and for finer geological age estimations. This strategy can also be used to detect other small objects with sparse features and sample imbalance problems. We detected approximately 500,000 impact craters in an area of approximately 214 km² around the CE-4 landing area. By statistically analyzing the new data, we updated the distribution function of the number and diameter of impact craters. Finally, we identified the most suitable lighting conditions for detecting impact crater targets by analyzing the effect of different lighting conditions on the detection accuracy.
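As a quick consistency check on the figures reported above, the F1 score is the harmonic mean of precision and recall. The short sketch below (the helper name `f1_score` is an assumption for illustration, not code from the paper) recomputes it from the reported Precision of 0.991 and Recall of 0.987.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Reported values from the abstract: Precision 0.991, Recall 0.987.
    print(round(f1_score(0.991, 0.987), 3))  # 0.989
```

The result rounds to 0.989, which agrees with the F1 score stated in the abstract.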
Detecting abnormal cervical cells is crucial for early identification and timely treatment of cervical cancer. However, this task is challenging due to the morphological similarities between abnormal and normal cells and the significant variations in cell size. Pathologists often refer to surrounding cells to identify abnormalities. To emulate this slide examination behavior, this study proposes a Multi-Scale Feature Fusion Network (MSFF-Net) for detecting abnormal cervical cells. MSFF-Net employs a Cross-Scale Pooling Model (CSPM) to capture diverse features and contextual information, ranging from local details to the overall structure. Additionally, a Multi-Scale Fusion Attention (MSFA) module is introduced to mitigate the impact of cell size variations by adaptively fusing local and global information at different scales. To handle the complexity of cervical cell images, such as cell adhesion and overlapping, the Inner-CIoU loss function is used to measure the overlap between bounding boxes more precisely, thereby improving detection accuracy in such scenarios. Experimental results on the Comparison Detector dataset demonstrate that MSFF-Net achieves a mean average precision (mAP) of 63.2%, outperforming state-of-the-art methods while maintaining a relatively small number of parameters (26.8 M). This study highlights the effectiveness of multi-scale feature fusion in enhancing the detection of abnormal cervical cells, contributing to more accurate and efficient cervical cancer screening.
With the rapid growth of social media, the spread of fake news has become a growing problem, misleading the public and causing significant harm. As social media content is often composed of both images and text, the use of multimodal approaches for fake news detection has gained significant attention. To solve the problems of previous multimodal fake news detection algorithms, such as insufficient feature extraction and insufficient use of semantic relations between modalities, this paper proposes the MFFFND-Co (Multimodal Feature Fusion Fake News Detection with Co-Attention Block) model. First, the model deeply explores the textual content, image content, and frequency domain features. Then, it employs a Co-Attention mechanism for cross-modal fusion. Additionally, a semantic consistency detection module is designed to quantify semantic deviations, thereby enhancing the performance of fake news detection. In experiments on two commonly used datasets, Twitter and Weibo, the model achieved F1 scores of 90.0% and 94.0%, respectively, significantly outperforming the original MFFFND (Multimodal Feature Fusion Fake News Detection with Attention Block) model and surpassing other baseline models. This improves the accuracy of detecting fake information in artificial intelligence detection and engineering software detection.
Transportation systems are experiencing a significant transformation due to the integration of advanced technologies, including artificial intelligence and machine learning. In the context of intelligent transportation systems (ITS) and Advanced Driver Assistance Systems (ADAS), the development of efficient and reliable traffic light detection mechanisms is crucial for enhancing road safety and traffic management. This paper presents an optimized convolutional neural network (CNN) framework designed to detect traffic lights in real-time within complex urban environments. Leveraging multi-scale pyramid feature maps, the proposed model addresses key challenges such as the detection of small, occluded, and low-resolution traffic lights amidst complex backgrounds. The integration of dilated convolutions, Region of Interest (ROI) alignment, and Soft Non-Maximum Suppression (Soft-NMS) further improves detection accuracy and reduces false positives. By optimizing computational efficiency and parameter complexity, the framework is designed to operate seamlessly on embedded systems, ensuring robust performance in real-world applications. Extensive experiments using real-world datasets demonstrate that our model significantly outperforms existing methods, providing a scalable solution for ITS and ADAS applications. This research contributes to the advancement of Artificial Intelligence-driven (AI-driven) pattern recognition in transportation systems and offers a mathematical approach to improving efficiency and safety in logistics and transportation networks.
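The Soft-NMS step named above can be illustrated with a minimal sketch of the Gaussian variant, in which overlapping boxes have their scores decayed rather than being discarded outright. This is a generic illustration under stated assumptions, not the paper's code: the function names, the `(x1, y1, x2, y2)` box layout, and the default `sigma` and `score_thresh` values are all hypothetical choices.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    sigma: float = 0.5,          # assumed Gaussian decay width
    score_thresh: float = 0.001,  # assumed cut-off for dropping boxes
) -> List[Tuple[Box, float]]:
    """Gaussian Soft-NMS: decay scores by exp(-IoU^2 / sigma) instead of deleting."""
    remaining = list(zip(boxes, scores))
    kept: List[Tuple[Box, float]] = []
    while remaining:
        # Pick the highest-scoring remaining box.
        best_idx = max(range(len(remaining)), key=lambda i: remaining[i][1])
        best_box, best_score = remaining.pop(best_idx)
        kept.append((best_box, best_score))
        # Decay the scores of the others according to their overlap with it.
        rescored = []
        for box, score in remaining:
            score *= math.exp(-(iou(best_box, box) ** 2) / sigma)
            if score >= score_thresh:
                rescored.append((box, score))
        remaining = rescored
    return kept


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (0, 0, 10, 10), (20, 20, 30, 30)]
    for box, score in soft_nms(boxes, [0.9, 0.8, 0.7]):
        print(box, round(score, 3))
```

In the usage example the duplicate box is kept with a heavily reduced score rather than removed, which is the behaviour that helps when closely spaced objects, such as adjacent traffic lights, overlap.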
Focusing on the task of fast and accurate armored target detection on the ground battlefield, a detection method based on a multi-scale representation network (MS-RN) and a shape-fixed Guided Anchor (SF-GA) scheme is proposed. Firstly, considering the large scale variation and camouflage of armored targets, a new MS-RN that integrates contextual information from the battlefield environment is designed. The MS-RN extracts deep features from templates at different scales and strengthens the ability to detect small targets. Armored targets of different sizes are detected on different representation features. Secondly, to meet the accuracy and real-time detection requirements, an improved shape-fixed Guided Anchor is used on feature maps of different scales to recommend regions of interest (ROIs). Unlike sliding or random anchors, the SF-GA can filter out 80% of the regions while still improving recall. A dedicated detection dataset for armored targets, named the Armored Target Dataset (ARTD), is constructed, on which comparative experiments with state-of-the-art detection methods are conducted. Experimental results show that the proposed method achieves outstanding detection accuracy and efficiency, especially when small armored targets are involved.
This paper proposes a multi-scale self-recovery (MSSR) approach to protect images against content forgery. The main idea is to provide more resistance against image tampering while enabling the recovery process in a multi-scale quality manner. In the proposed approach, the reference data is composed of several parts, and each part is protected by a channel coding rate according to its importance. The first part, which is used to reconstruct a rough approximation of the original image, is highly protected in order to resist higher tampering rates. Other parts are protected with lower rates according to their importance, leading to a lower tolerable tampering rate (TTR) but higher quality of the recovered images. The proposed MSSR approach is an efficient solution to the main disadvantage of current methods, which either recover a tampered image at low tampering rates or fail when the tampering rate is above the TTR value. Simulation results on 10,000 test images demonstrate the efficiency of the multi-scale self-recovery feature of the proposed approach in comparison with existing methods.
This paper introduces a multi-scale morphological edge detection algorithm for extracting edges from SAR images, which suffer seriously from noise. Combining the basic ideas of morphology with those of multi-scale analysis, the algorithm is both accurate and robust. Comparative experiments confirm its good performance.
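The abstract does not give the algorithm's details, so the following is only a generic sketch of the underlying idea: compute the morphological gradient (dilation minus erosion) with flat structuring elements of several sizes and average the responses. It works on a 1D signal for brevity, and the function names and the choice of radii are assumptions for illustration.

```python
from typing import List, Sequence


def dilate(signal: Sequence[float], radius: int) -> List[float]:
    """Flat dilation: maximum over a window of half-width `radius`."""
    n = len(signal)
    return [max(signal[max(0, i - radius): min(n, i + radius + 1)]) for i in range(n)]


def erode(signal: Sequence[float], radius: int) -> List[float]:
    """Flat erosion: minimum over a window of half-width `radius`."""
    n = len(signal)
    return [min(signal[max(0, i - radius): min(n, i + radius + 1)]) for i in range(n)]


def multiscale_gradient(signal: Sequence[float], radii: Sequence[int] = (1, 2, 3)) -> List[float]:
    """Average of morphological gradients computed at several scales."""
    total = [0.0] * len(signal)
    for radius in radii:
        dilated, eroded = dilate(signal, radius), erode(signal, radius)
        for i in range(len(signal)):
            total[i] += dilated[i] - eroded[i]
    return [value / len(radii) for value in total]


if __name__ == "__main__":
    step = [0, 0, 0, 0, 10, 10, 10, 10]
    print(multiscale_gradient(step, radii=(1, 2)))
    # [0.0, 0.0, 5.0, 10.0, 10.0, 5.0, 0.0, 0.0]
```

Small structuring elements localize the edge tightly, while larger ones respond over a wider band and are less sensitive to isolated noisy samples; combining scales trades off the two, which is the usual motivation for a multi-scale design on noisy imagery such as SAR.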
Remote sensing imagery, because it is captured from high altitude, presents inherent challenges: multiple scales, limited target areas, and intricate backgrounds. These traits often lead to increased miss and false detection rates when applying object recognition algorithms to remote sensing imagery. They also contribute to inaccuracies in target localization and hinder precise target categorization. This paper addresses these challenges by proposing the YOLO-MFD model (YOLO-MFD: Remote Sensing Image Object Detection with Multi-scale Fusion Dynamic Head). Before presenting our method, we examine the prevalent issues in remote sensing imagery analysis, in particular the difficulty existing object recognition algorithms have in comprehensively capturing critical image features amidst varying scales and complex backgrounds. To resolve these issues, we introduce a novel approach. First, we propose a lightweight multi-scale module called CEF. This module significantly improves the model's ability to capture important image features by merging multi-scale feature information, and it effectively addresses the missed detections and false alarms that are common in remote sensing imagery. Second, an additional small-target detection head is added, and a residual link is established with the higher-level feature extraction module in the backbone. This allows the model to incorporate shallower information, significantly improving the accuracy of target localization in remote sensing images. Finally, a dynamic head attention mechanism is introduced, which allows the model to recognize shapes and targets of different sizes with greater flexibility and accuracy. Consequently, the precision of object detection is significantly improved. The experimental results show that the YOLO-MFD model improves on the original YOLOv8 model by 6.3%, 3.5%, and 2.5% in Precision, mAP@0.5, and mAP@0.5:0.95, respectively. These results illustrate the clear advantages of the method.
Inspired by the coarse-to-fine visual perception process of the human vision system, a new approach based on Gaussian multi-scale space for defect detection of industrial products was proposed. By selecting different scale parameters of the Gaussian kernel, a multi-scale representation of the original image data could be obtained and used to constitute a multivariate image, in which each channel represents a perceptual observation of the original image at a different scale. Multivariate Image Analysis (MIA) techniques were used to extract defect feature information. The MIA was combined with Principal Component Analysis (PCA) to obtain the principal component scores of the multivariate test image. The Q-statistic image, derived from the residuals after the extraction of the first principal component score and noise, could be used to efficiently reveal surface defects with an appropriate threshold value determined from training images. Experimental results show that the proposed method performs better than the gray histogram-based method. It is less sensitive to inhomogeneous illumination, and it detects defects more robustly and reliably, with a lower pseudo reject rate.
Accurately identifying small objects in high-resolution aerial images presents a complex and crucial task in the field of small object detection on unmanned aerial vehicles (UAVs). This task is challenging due to variations in UAV flight altitude, differences in object scales, and factors such as flight speed and motion blur. To enhance the detection of small targets in drone aerial imagery, we propose an enhanced You Only Look Once version 7 (YOLOv7) algorithm based on multi-scale spatial context. We build the MSC-YOLO model, which incorporates an additional prediction head, denoted as P2, to improve adaptability for small objects. We replace conventional downsampling with a Spatial-to-Depth Convolutional Combination (CSPDC) module to mitigate the loss of intricate feature details related to small objects. Furthermore, we propose a Spatial Context Pyramid with Multi-Scale Attention (SCPMA) module, which captures spatial and channel-dependent features of small targets across multiple scales. This module enhances the perception of spatial contextual features and the utilization of multi-scale feature information. On the Visdrone2023 and UAVDT datasets, MSC-YOLO outperforms the baseline method YOLOv7 by 3.0% in terms of mean average precision (mAP). The MSC-YOLO algorithm proposed in this paper has demonstrated satisfactory performance in detecting small targets in UAV aerial photography, providing strong support for practical applications.
This work addresses the problem of inaccurate detection of the α phase contour of TB6 titanium alloy. By combining computer vision technology with human vision mechanisms, the spatial characteristics of the α phase can be simulated to obtain the contour accurately. Therefore, an algorithm for α phase contour detection of TB6 titanium alloy fused with multi-scale fretting features is proposed. Firstly, through the response of the classical receptive field model based on fretting and the suppression of a new non-classical receptive field model based on fretting, information maps of the α phase contour of the TB6 titanium alloy at different scales are obtained. Then, the information map of the smallest-scale contour is used as a benchmark, a neighborhood is constructed to judge the deviation of the contour information at other scales, and the corresponding weight values are calculated. Finally, a Gaussian function is used to weight and fuse the deviation information, yielding the contour detection result for the TB6 titanium alloy α phase. In the Visual Studio 2013 environment, 484 metallographic images with different temperatures, strain rates, and magnifications were tested. The results show that the performance evaluation F value of the proposed algorithm is 0.915, indicating that it can effectively improve the accuracy of α phase contour detection of TB6 titanium alloy.
Face detection is applied to many tasks such as auto focus control, surveillance, user interfaces, and face recognition. The processing speed and detection accuracy of face detection have been improved continuously. This paper describes a novel method of fast face detection with multi-scale window search free from image resizing. We adopt statistics of gradient images (SGI) as image features and append an overlapping cell array to improve detection accuracy. The SGI feature is scale invariant and insensitive to small differences in pixel value. These characteristics enable the multi-scale window search without image resizing. Experimental results show that the processing speed of our method is 3.66 times faster than that of a conventional method, which adopts HOG features combined with an SVM classifier, without accuracy degradation.
The detection of ash content in coal slime flotation tailings using deep learning can be hindered by various factors such as foam, impurities, and changing lighting conditions that disrupt the collection of tailings images. To address this challenge, we present a method for ash content detection in coal slime flotation tailings. This method utilizes chromatographic filter paper sampling and a multi-scale residual network, which we refer to as MRCN. Initially, tailings are sampled using chromatographic filter paper to obtain static tailings images, effectively isolating interference factors at the flotation site. Subsequently, the MRCN, consisting of a multi-scale residual network, is employed to extract image features and compute ash content. Within the MRCN structure, tailings images undergo convolution operations through two parallel branches that utilize convolution kernels of different sizes, enabling the extraction of image features at various scales and capturing a more comprehensive representation of the ash content information. Furthermore, a channel attention mechanism is integrated to enhance the performance of the model. The combination of the multi-scale residual structure and the channel attention mechanism within MRCN results in robust capabilities for image feature extraction and ash content detection. Comparative experiments demonstrate that this proposed approach, based on chromatographic filter paper sampling and the multi-scale residual network, exhibits significantly superior performance in the detection of ash content in coal slime flotation tailings.
Corner detection is a key step in computer vision. A new corner detection algorithm for planar curves is proposed. Firstly, based on human perception, two key characteristics are given as an amendment to the traditional corner properties. Based on these two properties, the concept of the fuzzy set is introduced into corner detection. Secondly, extraction formulae for three groups of features, including the corner subject degree, are derived. By synthesizing the three groups of features, the judgments for corner detection, location, and optimization are obtained. Finally, the algorithm is applied to several examples, and the detection results, the feature curves for some parts of interest, and the detection results for test images used in the literature are given. Results show that the algorithm is easy to implement after adopting the fuzzy set, and that its detection performance is very good.
Most local feature descriptors assume that the scene is planar. In real scenes, however, the captured images come from the 3-D world. The 3-D corner, as a novel invariant feature, is important for image matching and object detection, but automatically discriminating 3-D corners from ordinary corners is difficult. A novel method for 3-D corner detection based on the image graph grammar is proposed, and it can detect the 3-D features of corners to some extent. Experimental results show that the method is valid and that the 3-D corner is useful for image matching.
Drone-based small object detection is of great significance in practical applications such as military actions, disaster rescue, transportation, etc. However, the severe scale differences in objects captured by drones and lack of detail information for small-scale objects make drone-based small object detection a formidable challenge. To address these issues, we first develop a mathematical model to explore how changing receptive fields impacts the polynomial fitting results. Subsequently, based on the obtained conclusions, we propose a simple but effective Hybrid Receptive Field Network (HRFNet), whose modules include Hybrid Feature Augmentation (HFA), Hybrid Feature Pyramid (HFP) and Dual Scale Head (DSH). Specifically, HFA employs parallel dilated convolution kernels of different sizes to extend shallow features with different receptive fields, committed to improving the multi-scale adaptability of the network; HFP enhances the perception of small objects by capturing contextual information across layers, while DSH reconstructs the original prediction head utilizing a set of high-resolution features and ultrahigh-resolution features. In addition, in order to train HRFNet, the corresponding dual-scale loss function is designed. Finally, comprehensive evaluation results on public benchmarks such as VisDrone-DET and TinyPerson demonstrate the robustness of the proposed method. Most impressively, the proposed HRFNet achieves a mAP of 51.0 on VisDrone-DET with 29.3 M parameters, which outperforms the extant state-of-the-art detectors. HRFNet also performs excellently in complex scenarios captured by drones, achieving the best performance on the CS-Drone dataset we built.
Subpixel-accurate location of the V-groove center in robot welding is studied, and a software method for increasing the accuracy of laser seam tracking is presented. The LoG (Laplacian of Gaussian) operator is adopted to detect image edges. The V-groove center is extracted by corner detection based on curvature extrema. The subpixel position is obtained by a Lagrange polynomial interpolation algorithm. Experimental results show that the method is simple and practical, and that it meets the real-time requirements of robot welding with laser sensors.
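The subpixel step described above can be sketched as follows. The abstract does not state the polynomial order used, so this example assumes the common three-point (quadratic) Lagrange interpolation around an integer extremum; the function name `subpixel_extremum` is hypothetical. Fitting a quadratic through the samples at i-1, i, and i+1 and setting its derivative to zero gives the offset (a - c) / (2(a - 2b + c)).

```python
from typing import Sequence


def subpixel_extremum(values: Sequence[float], i: int) -> float:
    """Refine an integer extremum index using a three-point quadratic fit.

    `values` holds samples on a unit-spaced grid and `i` is the index of the
    discrete extremum (it must have a neighbour on each side).
    """
    if not 0 < i < len(values) - 1:
        raise ValueError("index must have a neighbour on each side")
    a, b, c = values[i - 1], values[i], values[i + 1]
    curvature = a - 2 * b + c
    if curvature == 0:
        # The three samples are collinear; no refinement is possible.
        return float(i)
    return i + (a - c) / (2 * curvature)


if __name__ == "__main__":
    # Samples of a parabola whose true peak lies at x = 2.3.
    samples = [-(x - 2.3) ** 2 for x in range(5)]
    peak_index = max(range(len(samples)), key=lambda k: samples[k])
    print(subpixel_extremum(samples, peak_index))  # approximately 2.3
```

For an exactly quadratic profile the fit recovers the true extremum; for real curvature profiles it is an approximation whose quality depends on how closely the response near the extremum resembles a parabola.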
Funding: Supported by the Henan Province Key R&D Project under Grant 241111210400; the Henan Provincial Science and Technology Research Project under Grants 252102211047, 252102211062, 252102211055 and 232102210069; the Jiangsu Provincial Scheme Double Initiative Plan JSS-CBS20230474; the XJTLU RDF-21-02-008; the Science and Technology Innovation Project of Zhengzhou University of Light Industry under Grant 23XNKJTD0205; and the Higher Education Teaching Reform Research and Practice Project of Henan Province under Grant 2024SJGLX0126.
Funding: the National Key Research and Development Program of China (Grant No. 2022YFF0711400), which provided valuable financial support and resources for my research and made it possible for me to deeply explore the unknown mysteries in the field of lunar geology; the National Space Science Data Center Youth Open Project (Grant No. NSSDC2302001), which has not only facilitated the smooth progress of my research but has also built a platform for me to communicate and cooperate with experts in the field.
Abstract: Impact craters are important for understanding lunar geologic evolution and surface erosion rates, among other functions. However, the morphological characteristics of micro impact craters are not obvious and the craters are numerous, resulting in low detection accuracy by deep learning models. Therefore, we proposed a new multi-scale fusion crater detection algorithm (MSF-CDA) based on YOLO11 to improve the accuracy of lunar impact crater detection, especially for small craters with a diameter of <1 km. Using images taken by the LROC (Lunar Reconnaissance Orbiter Camera) at the Chang’e-4 (CE-4) landing area, we constructed three separate datasets for craters with diameters of 0-70 m, 70-140 m, and >140 m. We then trained three submodels separately with these three datasets. Additionally, we designed a slicing-amplifying-slicing strategy to enhance the ability to extract features from small craters. To handle redundant predictions, we proposed a new Non-Maximum Suppression with Area Filtering method to fuse the results for overlapping targets across the multi-scale submodels. Finally, our new MSF-CDA method achieved high detection performance, with Precision, Recall, and F1 score values of 0.991, 0.987, and 0.989, respectively, effectively addressing the problems induced by the limited features and sample imbalance of small craters. Our MSF-CDA can provide strong data support for more in-depth study of the geological evolution of the lunar surface and finer geological age estimations. This strategy can also be used to detect other small objects with limited features and sample imbalance problems. We detected approximately 500,000 impact craters in an area of approximately 214 km² around the CE-4 landing area. By statistically analyzing the new data, we updated the distribution function of the number and diameter of impact craters. Finally, we identified the most suitable lighting conditions for detecting impact crater targets by analyzing the effect of different lighting conditions on the detection accuracy.
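As a quick consistency check on the crater-detection figures above, the F1 score is the harmonic mean of precision and recall. The short sketch below recomputes it from the reported Precision and Recall; the function name `f1_score` is illustrative and not taken from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Values reported in the abstract: Precision 0.991, Recall 0.987.
    print(round(f1_score(0.991, 0.987), 3))
```

Rounded to three decimals, the result agrees with the reported F1 score of 0.989.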
Funding: funded by the China Chongqing Municipal Science and Technology Bureau, grant numbers 2024TIAD-CYKJCXX0121 and 2024NSCQ-LZX0135; the Chongqing Municipal Commission of Housing and Urban-Rural Development, grant number CKZ2024-87; the Chongqing University of Technology graduate education high-quality development project, grant number gzlsz202401; the Chongqing University of Technology-Chongqing LINGLUE Technology Co., Ltd., Electronic Information (Artificial Intelligence) graduate joint training base; the Postgraduate Education and Teaching Reform Research Project in Chongqing, grant number yjg213116; and the Chongqing University of Technology-CISDI Chongqing Information Technology Co., Ltd., Computer Technology graduate joint training base.
Abstract: Detecting abnormal cervical cells is crucial for early identification and timely treatment of cervical cancer. However, this task is challenging due to the morphological similarities between abnormal and normal cells and the significant variations in cell size. Pathologists often refer to surrounding cells to identify abnormalities. To emulate this slide examination behavior, this study proposes a Multi-Scale Feature Fusion Network (MSFF-Net) for detecting abnormal cervical cells. MSFF-Net employs a Cross-Scale Pooling Model (CSPM) to effectively capture diverse features and contextual information, ranging from local details to the overall structure. Additionally, a Multi-Scale Fusion Attention (MSFA) module is introduced to mitigate the impact of cell size variations by adaptively fusing local and global information at different scales. To handle the complex environment of cervical cell images, such as cell adhesion and overlapping, the Inner-CIoU loss function is utilized to more precisely measure the overlap between bounding boxes, thereby improving detection accuracy in such scenarios. Experimental results on the Comparison detector dataset demonstrate that MSFF-Net achieves a mean average precision (mAP) of 63.2%, outperforming state-of-the-art methods while maintaining a relatively small number of parameters (26.8 M). This study highlights the effectiveness of multi-scale feature fusion in enhancing the detection of abnormal cervical cells, contributing to more accurate and efficient cervical cancer screening.
Funding: supported by Communication University of China (HG23035) and partly supported by the Fundamental Research Funds for the Central Universities (CUC230A013).
Abstract: With the rapid growth of social media, the spread of fake news has become a growing problem, misleading the public and causing significant harm. As social media content is often composed of both images and text, the use of multimodal approaches for fake news detection has gained significant attention. To solve the problems of previous multimodal fake news detection algorithms, such as insufficient feature extraction and insufficient use of semantic relations between modalities, this paper proposes the MFFFND-Co (Multimodal Feature Fusion Fake News Detection with Co-Attention Block) model. First, the model deeply explores the textual content, image content, and frequency domain features. Then, it employs a Co-Attention mechanism for cross-modal fusion. Additionally, a semantic consistency detection module is designed to quantify semantic deviations, thereby enhancing the performance of fake news detection. Experimentally verified on two commonly used datasets, Twitter and Weibo, the model achieved F1 scores of 90.0% and 94.0%, respectively, significantly outperforming the pre-modified MFFFND (Multimodal Feature Fusion Fake News Detection with Attention Block) model and surpassing other baseline models. This improves the accuracy of detecting fake information in artificial intelligence detection and engineering software detection.
Funding: funded by the Deanship of Scientific Research at Northern Border University, Arar, Saudi Arabia, through research group No. (RG-NBU-2022-1234).
Abstract: Transportation systems are experiencing a significant transformation due to the integration of advanced technologies, including artificial intelligence and machine learning. In the context of intelligent transportation systems (ITS) and Advanced Driver Assistance Systems (ADAS), the development of efficient and reliable traffic light detection mechanisms is crucial for enhancing road safety and traffic management. This paper presents an optimized convolutional neural network (CNN) framework designed to detect traffic lights in real time within complex urban environments. Leveraging multi-scale pyramid feature maps, the proposed model addresses key challenges such as the detection of small, occluded, and low-resolution traffic lights amidst complex backgrounds. The integration of dilated convolutions, Region of Interest (ROI) alignment, and Soft Non-Maximum Suppression (Soft-NMS) further improves detection accuracy and reduces false positives. By optimizing computational efficiency and parameter complexity, the framework is designed to operate seamlessly on embedded systems, ensuring robust performance in real-world applications. Extensive experiments using real-world datasets demonstrate that our model significantly outperforms existing methods, providing a scalable solution for ITS and ADAS applications. This research contributes to the advancement of Artificial Intelligence-driven (AI-driven) pattern recognition in transportation systems and offers a mathematical approach to improving efficiency and safety in logistics and transportation networks.
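Soft-NMS, named in the abstract above, decays the scores of overlapping boxes instead of discarding them outright. The following is a minimal sketch of the Gaussian variant under stated assumptions: boxes are `(x1, y1, x2, y2)` tuples, and `sigma` and `score_threshold` are illustrative defaults, not values from the paper.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS.

    Returns (original_index, rescored_confidence) pairs in selection order.
    """
    remaining = list(enumerate(scores))
    kept = []
    while remaining:
        # Select the box with the highest current score.
        best = max(range(len(remaining)), key=lambda k: remaining[k][1])
        best_idx, best_score = remaining.pop(best)
        kept.append((best_idx, best_score))
        rescored = []
        for idx, score in remaining:
            # Decay each remaining score according to its overlap with the selected box.
            overlap = iou(boxes[best_idx], boxes[idx])
            score *= math.exp(-(overlap ** 2) / sigma)
            if score >= score_threshold:
                rescored.append((idx, score))
        remaining = rescored
    return kept


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    print(soft_nms(boxes, [0.9, 0.8, 0.7]))
```

In the usage example the second box overlaps the first heavily, so its score is reduced and it is ranked below the isolated third box rather than being removed, which is the behavior that helps with partially occluded targets.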
Funding: supported by the National Key Research and Development Program of China under grant 2016YFC0802904; the National Natural Science Foundation of China under grant 61671470; and the Postdoctoral Science Foundation Funded Project of China under grant 2017M623423.
Abstract: Focusing on the task of fast and accurate armored target detection on the ground battlefield, a detection method based on a multi-scale representation network (MS-RN) and a shape-fixed Guided Anchor (SF-GA) scheme is proposed. Firstly, considering the large scale variation and camouflage of armored targets, a new MS-RN integrating contextual information in the battlefield environment is designed. The MS-RN extracts deep features from templates with different scales and strengthens the detection ability for small targets. Armored targets of different sizes are detected on different representation features. Secondly, aiming at the accuracy and real-time detection requirements, an improved shape-fixed Guided Anchor is used on feature maps of different scales to recommend regions of interest (ROIs). Unlike sliding or random anchors, the SF-GA can filter out 80% of the regions while still improving the recall. A special detection dataset for armored targets, named Armored Target Dataset (ARTD), is constructed, on which comparative experiments with state-of-the-art detection methods are conducted. Experimental results show that the proposed method achieves outstanding performance in detection accuracy and efficiency, especially when small armored targets are involved.
Abstract: This paper proposes a multi-scale self-recovery (MSSR) approach to protect images against content forgery. The main idea is to provide more resistance against image tampering while enabling the recovery process in a multi-scale quality manner. In the proposed approach, the reference data is composed of several parts, and each part is protected by a channel coding rate according to its importance. The first part, which is used to reconstruct a rough approximation of the original image, is highly protected in order to resist higher tampering rates. Other parts are protected with lower rates according to their importance, leading to a lower tolerable tampering rate (TTR) but a higher quality of the recovered images. The proposed MSSR approach is an efficient solution to the main disadvantage of current methods, which either recover a tampered image only at low tampering rates or fail when the tampering rate is above the TTR value. The simulation results on 10000 test images demonstrate the efficiency of the multi-scale self-recovery feature of the proposed approach in comparison with existing methods.
Funding: Supported by the National Natural Science Foundation of China (No. 69831040)
Abstract: This paper introduces a multi-scale morphological edge detection algorithm to extract edges from SAR images, which suffer seriously from noise. Combining the basic ideas of morphology with those of multi-scale analysis, the algorithm offers outstanding accuracy and robustness. Comparative experiments reveal its fine performance.
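The abstract above does not specify its operators, so the following is only a generic sketch of one common way to build a multi-scale morphological edge detector: compute the morphological gradient (dilation minus erosion) with square structuring elements of several radii and average the results. The function names and the choice of radii are assumptions for illustration.

```python
def _window_values(image, r, c, radius):
    """Pixel values inside a square window clipped to the image borders."""
    rows, cols = len(image), len(image[0])
    return [
        image[rr][cc]
        for rr in range(max(0, r - radius), min(rows, r + radius + 1))
        for cc in range(max(0, c - radius), min(cols, c + radius + 1))
    ]


def morphological_gradient(image, radius):
    """Grayscale dilation minus erosion with a (2*radius+1) square element."""
    rows, cols = len(image), len(image[0])
    gradient = []
    for r in range(rows):
        row = []
        for c in range(cols):
            values = _window_values(image, r, c, radius)
            row.append(max(values) - min(values))
        gradient.append(row)
    return gradient


def multiscale_edges(image, radii=(1, 2, 3)):
    """Average the morphological gradients obtained at several scales."""
    gradients = [morphological_gradient(image, radius) for radius in radii]
    rows, cols = len(image), len(image[0])
    return [
        [sum(g[r][c] for g in gradients) / len(gradients) for c in range(cols)]
        for r in range(rows)
    ]


if __name__ == "__main__":
    step = [[0, 0, 0, 10, 10, 10] for _ in range(5)]
    for line in multiscale_edges(step):
        print(line)
```

Pixels next to the step respond at every scale, while pixels far from it respond only at the largest scale, so the averaged map peaks at the edge. Small structuring elements localize edges well and large ones are less sensitive to isolated noise, which is the usual motivation for combining scales.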
Funding: the Scientific Research Fund of Hunan Provincial Education Department (23A0423).
Abstract: Remote sensing imagery, because it is captured from high altitude, presents inherent challenges characterized by multiple scales, limited target areas, and intricate backgrounds. These inherent traits often lead to increased miss and false detection rates when applying object recognition algorithms tailored for remote sensing imagery. Additionally, these complexities contribute to inaccuracies in target localization and hinder precise target categorization. This paper addresses these challenges by proposing the YOLO-MFD model (YOLO-MFD: Remote Sensing Image Object Detection with Multi-scale Fusion Dynamic Head). Before presenting our method, we delve into the prevalent issues faced in remote sensing imagery analysis. Specifically, we emphasize the struggles of existing object recognition algorithms in comprehensively capturing critical image features amidst varying scales and complex backgrounds. To resolve these issues, we introduce a novel approach. First, we propose a lightweight multi-scale module called CEF. This module significantly improves the model’s ability to comprehensively capture important image features by merging multi-scale feature information. It effectively addresses the missed detections and false alarms that are common in remote sensing imagery. Second, an additional layer of small target detection heads is added, and a residual link is established with the higher-level feature extraction module in the backbone section. This allows the model to incorporate shallower information, significantly improving the accuracy of target localization in remotely sensed images. Finally, a dynamic head attention mechanism is introduced. This allows the model to exhibit greater flexibility and accuracy in recognizing shapes and targets of different sizes. Consequently, the precision of object detection is significantly improved. The experimental results show that the YOLO-MFD model achieves improvements of 6.3%, 3.5%, and 2.5% over the original YOLOv8 model in Precision, mAP@0.5, and mAP@0.5:0.95, respectively. These results illustrate the clear advantages of the method.
Funding: supported in part by the Natural Science Foundation of China (NSFC) (Grant No. 50875240).
Abstract: Inspired by the coarse-to-fine visual perception process of the human visual system, a new approach based on Gaussian multi-scale space for defect detection of industrial products was proposed. By selecting different scale parameters of the Gaussian kernel, the multi-scale representation of the original image data could be obtained and used to constitute the multivariate image, in which each channel could represent a perceptual observation of the original image at a different scale. Multivariate Image Analysis (MIA) techniques were used to extract defect feature information. The MIA was combined with Principal Component Analysis (PCA) to obtain the principal component scores of the multivariate test image. The Q-statistic image, derived from the residuals after the extraction of the first principal component score and noise, could be used to efficiently reveal the surface defects with an appropriate threshold value decided by training images. Experimental results show that the proposed method performs better than the gray histogram-based method. It is less sensitive to inhomogeneous illumination and detects defects more robustly and reliably, with a lower pseudo reject rate.
Funding: the Key Research and Development Program of Hainan Province (Grant Nos. ZDYF2023GXJS163, ZDYF2024GXJS014); National Natural Science Foundation of China (NSFC) (Grant Nos. 62162022, 62162024); the Major Science and Technology Project of Hainan Province (Grant No. ZDKJ2020012); Hainan Provincial Natural Science Foundation of China (Grant No. 620MS021); Youth Foundation Project of Hainan Natural Science Foundation (621QN211).
Abstract: Accurately identifying small objects in high-resolution aerial images presents a complex and crucial task in the field of small object detection on unmanned aerial vehicles (UAVs). This task is challenging due to variations in UAV flight altitude, differences in object scales, as well as factors like flight speed and motion blur. To enhance the detection efficacy of small targets in drone aerial imagery, we propose an enhanced You Only Look Once version 7 (YOLOv7) algorithm based on multi-scale spatial context. We build the MSC-YOLO model, which incorporates an additional prediction head, denoted as P2, to improve adaptability for small objects. We replace conventional downsampling with a Spatial-to-Depth Convolutional Combination (CSPDC) module to mitigate the loss of intricate feature details related to small objects. Furthermore, we propose a Spatial Context Pyramid with Multi-Scale Attention (SCPMA) module, which captures spatial and channel-dependent features of small targets across multiple scales. This module enhances the perception of spatial contextual features and the utilization of multi-scale feature information. On the Visdrone2023 and UAVDT datasets, MSC-YOLO achieves remarkable results, outperforming the baseline method YOLOv7 by 3.0% in terms of mean average precision (mAP). The MSC-YOLO algorithm proposed in this paper has demonstrated satisfactory performance in detecting small targets in UAV aerial photography, providing strong support for practical applications.
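The abstract above does not describe the internals of the CSPDC module, so the sketch below shows only the generic space-to-depth rearrangement that such modules are typically built around: each 2x2 spatial block is moved into the channel dimension, halving the resolution without discarding any values. The function name and the height x width x channels nested-list layout are assumptions for illustration.

```python
def space_to_depth(feature_map, block=2):
    """Rearrange block x block spatial patches into channels.

    feature_map is a nested list indexed as [row][col][channel]. The output
    has 1/block the height and width and block*block times the channels.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    if rows % block or cols % block:
        raise ValueError("height and width must be divisible by block")
    output = []
    for r in range(0, rows, block):
        out_row = []
        for c in range(0, cols, block):
            channels = []
            # Concatenate the channel vectors of every pixel in the patch.
            for dr in range(block):
                for dc in range(block):
                    channels.extend(feature_map[r + dr][c + dc])
            out_row.append(channels)
        output.append(out_row)
    return output


if __name__ == "__main__":
    tiny = [[[1], [2]], [[3], [4]]]  # 2 x 2 map with one channel
    print(space_to_depth(tiny))
```

Unlike strided convolution or pooling, this rearrangement keeps every input value, which is the usual argument for using it when small-object detail must survive downsampling. A convolution would normally follow to mix the stacked channels.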
Funding: Supported by Hebei Provincial Key Laboratory for Software Engineering (Grant No. 22567637H) and the "Rail Vehicle Application Engineering" National International Science and Technology Cooperation Base Open Project Fund (Grant No. BMRV21KF09).
Abstract: To address the inaccuracy in detecting the α phase contour of TB6 titanium alloy, computer vision technology is combined with human vision mechanisms so that the spatial characteristics of the α phase can be simulated and the contour obtained accurately. Therefore, an algorithm for α phase contour detection of TB6 titanium alloy fused with multi-scale fretting features is proposed. Firstly, through the response of the classical receptive field model based on fretting and the suppression of a new non-classical receptive field model based on fretting, the information maps of the α phase contour of the TB6 titanium alloy at different scales are obtained; then the information map of the smallest-scale contour is used as a benchmark, a neighborhood is constructed to judge the deviation of the contour information at other scales, and the corresponding weight value is calculated; finally, a Gaussian function is used to weight and fuse the deviation information, and the contour detection result of the TB6 titanium alloy α phase is obtained. In the Visual Studio 2013 environment, 484 metallographic images with different temperatures, strain rates, and magnifications were tested. The results show that the performance evaluation F value of the proposed algorithm is 0.915, indicating that it can effectively improve the accuracy of α phase contour detection of TB6 titanium alloy.
Abstract: Face detection is applied to many tasks such as auto focus control, surveillance, user interfaces, and face recognition. The processing speed and detection accuracy of face detection have been improved continuously. This paper describes a novel method of fast face detection with multi-scale window search free from image resizing. We adopt statistics of gradient images (SGI) as image features and append an overlapping cell array to improve detection accuracy. The SGI feature is scale invariant and insensitive to small differences in pixel value. These characteristics enable the multi-scale window search without image resizing. Experimental results show that the processing speed of our method is 3.66 times faster than that of a conventional method, which adopts HOG features combined with an SVM classifier, without accuracy degradation.
Funding: This work was supported by the National Natural Science Foundation of China (Grant No. 62106048).
Abstract: The detection of ash content in coal slime flotation tailings using deep learning can be hindered by various factors such as foam, impurities, and changing lighting conditions that disrupt the collection of tailings images. To address this challenge, we present a method for ash content detection in coal slime flotation tailings. This method utilizes chromatographic filter paper sampling and a multi-scale residual network, which we refer to as MRCN. Initially, tailings are sampled using chromatographic filter paper to obtain static tailings images, effectively isolating interference factors at the flotation site. Subsequently, the MRCN, consisting of a multi-scale residual network, is employed to extract image features and compute ash content. Within the MRCN structure, tailings images undergo convolution operations through two parallel branches that utilize convolution kernels of different sizes, enabling the extraction of image features at various scales and capturing a more comprehensive representation of the ash content information. Furthermore, a channel attention mechanism is integrated to enhance the performance of the model. The combination of the multi-scale residual structure and the channel attention mechanism within MRCN results in robust capabilities for image feature extraction and ash content detection. Comparative experiments demonstrate that this proposed approach, based on chromatographic filter paper sampling and the multi-scale residual network, exhibits significantly superior performance in the detection of ash content in coal slime flotation tailings.
Abstract: Corner detection is a key step in computer vision. A new corner detection algorithm for planar curves is proposed. Firstly, based on human perception, two key characteristics are given as an amendment to the traditional corner properties. Based on these two properties, the concept of the fuzzy set is introduced into the detection. Secondly, extraction formulae for three groups of features, including the corner subject degree, are derived. By synthesizing the three groups of features, the criteria for corner detection, location, and optimization are obtained. Finally, the detection results of the algorithm on several examples, the feature curves for some parts of interest, and the detection results on the test images used in the references are given. Results show that the algorithm is easy to implement after adopting the fuzzy set and that the detection performance is very good.
Abstract: Most local feature descriptors assume that the scene is planar. In real scenes, however, the captured images come from the 3-D world. The 3-D corner, as a novel invariant feature, is important for image matching and object detection, but automatically discriminating 3-D corners from ordinary corners is difficult. A novel method for 3-D corner detection is proposed based on the image graph grammar, and it can detect the 3-D features of corners to some extent. Experimental results show that the method is valid and that the 3-D corner is useful for image matching.
Funding: supported by the National Natural Science Foundation of China (Nos. 62276204 and 62203343); the Fundamental Research Funds for the Central Universities (No. YJSJ24011); the Natural Science Basic Research Program of Shanxi, China (Nos. 2022JM-340 and 2023-JC-QN-0710); and the China Postdoctoral Science Foundation (Nos. 2020T130494 and 2018M633470).
Abstract: Drone-based small object detection is of great significance in practical applications such as military actions, disaster rescue, and transportation. However, the severe scale differences among objects captured by drones and the lack of detail information for small-scale objects make drone-based small object detection a formidable challenge. To address these issues, we first develop a mathematical model to explore how changing receptive fields impacts the polynomial fitting results. Subsequently, based on the obtained conclusions, we propose a simple but effective Hybrid Receptive Field Network (HRFNet), whose modules include Hybrid Feature Augmentation (HFA), Hybrid Feature Pyramid (HFP), and Dual Scale Head (DSH). Specifically, HFA employs parallel dilated convolution kernels of different sizes to extend shallow features with different receptive fields, aiming to improve the multi-scale adaptability of the network; HFP enhances the perception of small objects by capturing contextual information across layers, while DSH reconstructs the original prediction head utilizing a set of high-resolution features and ultrahigh-resolution features. In addition, in order to train HRFNet, a corresponding dual-scale loss function is designed. Finally, comprehensive evaluation results on public benchmarks such as VisDrone-DET and TinyPerson demonstrate the robustness of the proposed method. Most impressively, the proposed HRFNet achieves a mAP of 51.0 on VisDrone-DET with 29.3 M parameters, which outperforms the existing state-of-the-art detectors. HRFNet also performs excellently in complex scenarios captured by drones, achieving the best performance on the CS-Drone dataset we built.
Funding: This work is financially supported by the National Natural Science Foundation of China (Grant No. 50175027).
Abstract: Subpixel accuracy for the V-groove center in robot welding is researched, and a software measure to increase the accuracy of laser seam tracking is presented. The LoG (Laplacian of Gaussian) operator is adopted to detect image edges. The V-groove center is extracted by corner detection at the curvature extremum. The subpixel position is obtained by a Lagrange polynomial interpolation algorithm. Experimental results show that the method is simple and practical and meets the real-time requirements of robot welding with laser sensors.
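The abstract above does not state the degree of the Lagrange polynomial, so the following sketch assumes the common three-point case: fit the quadratic through the discrete extremum and its two neighbours, then take the vertex of that parabola as the subpixel position. The function name `subpixel_extremum` is illustrative.

```python
def subpixel_extremum(values, index):
    """Refine a discrete extremum at `index` to subpixel precision.

    Uses the vertex of the quadratic (three-point Lagrange) polynomial
    through values[index - 1], values[index], values[index + 1].
    """
    if index <= 0 or index >= len(values) - 1:
        raise ValueError("index must have a neighbour on both sides")
    left, centre, right = values[index - 1], values[index], values[index + 1]
    curvature = left - 2.0 * centre + right
    if curvature == 0:
        # The three samples are collinear, so no refinement is possible.
        return float(index)
    return index + 0.5 * (left - right) / curvature


if __name__ == "__main__":
    # Samples of a parabola whose true peak lies at x = 2.3.
    samples = [-(x - 2.3) ** 2 for x in range(5)]
    print(subpixel_extremum(samples, 2))
```

For samples that really come from a parabola the refinement is exact up to floating-point error; for a general curvature profile it is an approximation whose quality depends on how well a quadratic fits near the extremum.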