Accurate and efficient detection of building changes in remote sensing imagery is crucial for urban planning, disaster emergency response, and resource management. However, existing methods face challenges such as spectral similarity between buildings and backgrounds, sensor variations, and insufficient computational efficiency. To address these challenges, this paper proposes a novel Multi-scale Efficient Wavelet-based Change Detection Network (MewCDNet), which integrates the advantages of Convolutional Neural Networks and Transformers, balances computational costs, and achieves high-performance building change detection. The network employs EfficientNet-B4 as the backbone for hierarchical feature extraction, integrates multi-level feature maps through a multi-scale fusion strategy, and incorporates two key modules: Cross-temporal Difference Detection (CTDD) and Cross-scale Wavelet Refinement (CSWR). CTDD adopts a dual-branch architecture that combines pixel-wise differencing with semantic-aware Euclidean distance weighting to enhance the distinction between true changes and background noise. CSWR integrates Haar-based Discrete Wavelet Transform with multi-head cross-attention mechanisms, enabling cross-scale feature fusion while significantly improving edge localization and suppressing spurious changes. Extensive experiments on four benchmark datasets demonstrate MewCDNet’s superiority over comparison methods: it achieves F1 scores of 91.54% on LEVIR, 93.70% on WHUCD, and 64.96% on S2Looking for building change detection. Furthermore, MewCDNet achieves the best performance on the multi-class SYSU dataset (F1: 82.71%), highlighting its exceptional generalization capability.
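The MewCDNet abstract above names a Haar-based Discrete Wavelet Transform as one ingredient of the CSWR module. As a minimal sketch of that single ingredient only (not the authors' module, and with the hypothetical function name `haar_dwt2`), a one-level orthonormal 2D Haar decomposition on a plain nested list could look like this. It assumes an even-sized single-channel input and labels the detail bands by what they difference rather than by a particular LH/HL convention.

```python
def haar_dwt2(img):
    """One-level orthonormal 2D Haar DWT of an even-sized 2D list.

    Returns four half-resolution bands:
      approx   - scaled local average (low-pass in both directions)
      col_diff - difference between left and right columns of each 2x2 block
      row_diff - difference between top and bottom rows of each 2x2 block
      diag     - diagonal difference
    """
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    approx, col_diff, row_diff, diag = [], [], [], []
    for y in range(0, h, 2):
        ra, rc, rr, rd = [], [], [], []
        for x in range(0, w, 2):
            a, b = img[y][x], img[y][x + 1]
            c, d = img[y + 1][x], img[y + 1][x + 1]
            ra.append((a + b + c + d) / 2)
            rc.append((a - b + c - d) / 2)
            rr.append((a + b - c - d) / 2)
            rd.append((a - b - c + d) / 2)
        approx.append(ra)
        col_diff.append(rc)
        row_diff.append(rr)
        diag.append(rd)
    return approx, col_diff, row_diff, diag


# A vertical edge inside a 2x2 block shows up only in the column-difference band.
bands = haar_dwt2([[0, 10], [0, 10]])
```

The detail bands respond to local intensity jumps, which is one plausible reason a wavelet step can help edge localization; how CSWR combines these bands with cross-attention is not specified in the abstract.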
Deep learning has made significant progress in the field of oriented object detection for remote sensing images. However, existing methods still face challenges when dealing with difficult tasks such as multi-scale targets, complex backgrounds, and small objects in remote sensing. Keeping models lightweight to meet resource constraints in remote sensing scenarios while improving task performance remains a research hotspot. Therefore, we propose EM-YOLO, an enhanced multi-scale feature extraction lightweight network based on the YOLOv8s architecture, specifically optimized for the large target scale variations, diverse orientations, and numerous small objects found in remote sensing images. Our innovations lie in the following aspects. First, a dynamic snake convolution (DSC) is introduced into the backbone network to enhance the model’s feature extraction capability for oriented targets. Second, an innovative focusing-diffusion module is designed in the feature fusion neck to effectively integrate multi-scale feature information. Finally, we introduce the Layer-Adaptive Sparsity for magnitude-based Pruning (LASP) method to prune the network so that it can better complete tasks in resource-constrained scenarios. Experimental results on the lightweight platform Orin demonstrate that the proposed method significantly outperforms the original YOLOv8s model in oriented remote sensing object detection tasks, and achieves comparable or superior performance to state-of-the-art methods on three authoritative remote sensing datasets (DOTA v1.0, DOTA v1.5, and HRSC2016).
Desert shrubs are indispensable in maintaining ecological stability by reducing soil erosion, enhancing water retention, and boosting soil fertility, which are critical factors in mitigating desertification processes. Due to the complex topography, variable climate, and challenges in field surveys in desert regions, this paper proposes YOLO-Desert-Shrub (YOLO-DS), a detection method for identifying desert shrubs in UAV remote sensing images based on an enhanced YOLOv8n framework. This method accurately identifies shrub species, locations, and coverage. To address the issue of small individual plants dominating the dataset, the SPDconv convolution module is introduced in the Backbone and Neck layers of the YOLOv8n model, replacing conventional convolutions. This structural optimization mitigates information degradation in fine-grained data while strengthening discriminative feature capture across spatial scales within desert shrub datasets. Furthermore, a structured state-space model is integrated into the main network, and the MambaLayer is designed to dynamically extract and refine shrub-specific features from remote sensing images, effectively filtering out background noise and irrelevant interference to enhance feature representation. Benchmark evaluations reveal that the YOLO-DS framework attains 79.56% mAP40weight, a 2.2% absolute gain over the baseline YOLOv8n architecture, with statistically significant advantages over contemporary detectors in cross-validation trials. The predicted plant coverage exhibits strong consistency with manually measured coverage, with a coefficient of determination (R²) of 0.9148 and a Root Mean Square Error (RMSE) of 1.8266%. The proposed UAV-based remote sensing method utilizing YOLO-DS can effectively identify and locate desert shrubs, monitor canopy sizes and distribution, and provide technical support for automated desert shrub monitoring.
An improved model based on you only look once version 8 (YOLOv8) is proposed to solve the problem of low detection accuracy caused by the diversity of object sizes in optical remote sensing images. Firstly, the feature pyramid network (FPN) structure of the original YOLOv8 model is replaced by the generalized-FPN (GFPN) structure in GiraffeDet to realize "cross-layer" and "cross-scale" adaptive feature fusion, enriching the semantic and spatial information on the feature map and improving the target detection ability of the model. Secondly, a pyramid-pool module of multi atrous spatial pyramid pooling (MASPP) is designed using the ideas of atrous convolution and the feature pyramid structure to extract multi-scale features, improving the model's ability to handle multi-scale objects. The experimental results show that the detection accuracy of the improved YOLOv8 model on the DIOR dataset is 92% and the mean average precision (mAP) is 87.9%, which are 3.5% and 1.7% higher, respectively, than those of the original model. This demonstrates that the detection and classification ability of the proposed model on multi-dimensional optical remote sensing targets has improved.
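The MASPP module above builds on atrous (dilated) convolution. The abstract does not give the module's layout, so the following is only a sketch of the underlying operation: a 1D dilated convolution and the standard effective kernel size k + (k - 1)(d - 1). The function names and the dilation rates in the usage line are illustrative assumptions, not values from the paper.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1D dilated cross-correlation (no padding, stride 1)."""
    span = (len(kernel) - 1) * dilation
    return [
        sum(k * signal[i + j * dilation] for j, k in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


# Hypothetical parallel branches with different dilation rates, as in
# pyramid-pooling designs; a 3-tap kernel covers 3, 5, and 13 samples.
spans = [effective_kernel_size(3, d) for d in (1, 2, 6)]
```

Running several such branches in parallel lets one layer see small and large contexts at once without adding kernel parameters, which is the usual motivation for atrous pyramid pooling.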
Recent years have seen a surge in interest in object detection on remote sensing images for applications such as surveillance and management. However, challenges like small object detection, scale variation, and the presence of closely packed objects in these images hinder accurate detection. Additionally, the motion blur effect further complicates the identification of such objects. To address these issues, we propose an enhanced YOLOv9 with a transformer head (YOLOv9-TH). The model introduces an additional prediction head for detecting objects of varying sizes and swaps the original prediction heads for transformer heads to leverage self-attention mechanisms. We further improve YOLOv9-TH using several strategies, including data augmentation, multi-scale testing, multi-model integration, and the introduction of an additional classifier. The cross-stage partial (CSP) method and the ghost convolution hierarchical graph (GCHG) are combined to improve detection accuracy by better utilizing feature maps, widening the receptive field, and precisely extracting multi-scale objects. Additionally, we incorporate the E-SimAM attention mechanism to address low-resolution feature loss. Extensive experiments on the VisDrone2021 and DIOR datasets demonstrate the effectiveness of YOLOv9-TH, showing an improvement in mAP over the best existing methods. YOLOv9-TH-e achieved 54.2% mAP50 on the VisDrone2021 dataset and 92.3% mAP on the DIOR dataset. The results confirm the model’s robustness and suitability for real-world applications, particularly for small object detection in remote sensing images.
Cloud detection is a critical preprocessing step in remote sensing image processing, as the presence of clouds significantly affects the accuracy of remote sensing data and limits its applicability across various domains. This study presents an enhanced cloud detection method based on the U-Net architecture, designed to address the challenges of multi-scale cloud features and long-range dependencies inherent in remote sensing imagery. A Multi-Scale Dilated Attention (MSDA) module is introduced to effectively integrate multi-scale information and model long-range dependencies across different scales, enhancing the model’s ability to detect clouds of varying sizes. Additionally, a Multi-Head Self-Attention (MHSA) mechanism is incorporated to improve the model’s capacity for capturing finer details, particularly in distinguishing thin clouds from surface features. A multi-path supervision mechanism is also devised to ensure the model learns cloud features at multiple scales, further boosting the accuracy and robustness of cloud mask generation. Experimental results demonstrate that the enhanced model achieves superior performance compared to other benchmarked methods in complex scenarios. It significantly improves cloud detection accuracy, highlighting its strong potential for practical applications in cloud detection tasks.
Fine-grained aircraft target detection in remote sensing holds significant research value and practical applications, particularly in military defense and precision strikes. Given the complexity of remote sensing images, where targets are often small and similar within categories, detecting these fine-grained targets is challenging. To address this, we constructed a fine-grained dataset of remotely sensed airplanes. For the problems of remote sensing fine-grained targets with obvious head-to-tail distributions and large variations in target sizes, we propose the DWDet fine-grained target detection and recognition algorithm. First, to handle the unbalanced category distribution, we adopt an adaptive sampling strategy. In addition, we construct a deformable convolutional block and improve the decoupling head structure to improve the model's detection of deformed targets. Then, we design a localization loss function to improve the model’s localization ability for targets of different scales. The experimental results show that our algorithm improves the overall accuracy of the model by 4.1% compared to the baseline model, and improves the detection accuracy of small targets by 12.2%. The ablation and comparison experiments also demonstrate the effectiveness of our algorithm.
In recent years, convolutional neural networks (CNN) and Transformer architectures have made significant progress in the field of remote sensing (RS) change detection (CD). Most existing methods directly stack multiple layers of Transformer blocks, which achieves considerable improvement in capturing variations, but at a rather high computational cost. We propose a channel-Efficient Change Detection Network (CE-CDNet) to address the problems of high computational cost and imbalanced detection accuracy in remote sensing building change detection. The adaptive multi-scale feature fusion module (CAMSF) and lightweight Transformer decoder (LTD) are introduced to improve change detection. The CAMSF module adaptively fuses multi-scale features to improve the model’s ability to detect building changes in complex scenes. In addition, the LTD module reduces computational costs and maintains high detection accuracy through an optimized self-attention mechanism and a dimensionality reduction operation. Experimental results on three commonly used remote sensing building change detection datasets show that CE-CDNet reduces computational overhead while maintaining detection accuracy comparable to existing mainstream models.
In response to challenges posed by complex backgrounds, diverse target angles, and numerous small targets in remote sensing images, alongside the issue of high resource consumption hindering model deployment, we propose an enhanced, lightweight you only look once version 8 small (YOLOv8s) detection algorithm. Regarding network improvements, we first replace traditional horizontal boxes with rotated boxes for target detection, effectively addressing difficulties in feature extraction caused by varying target angles. Second, we design a module integrating convolutional neural networks (CNN) and Transformer components to replace specific C2f modules in the backbone network, thereby expanding the model’s receptive field and enhancing feature extraction in complex backgrounds. Finally, we introduce a feature calibration structure to mitigate potential feature mismatches during feature fusion. For model compression, we employ a lightweight channel pruning technique based on localized mean average precision (LMAP) to eliminate redundancies in the enhanced model. Although this approach results in some loss of detection accuracy, it effectively reduces the number of parameters, computational load, and model size. Additionally, we employ channel-level knowledge distillation to recover accuracy in the pruned model, further enhancing detection performance. Experimental results indicate that the enhanced algorithm achieves a 6.1% increase in mAP50 compared to YOLOv8s, while simultaneously reducing parameters, computational load, and model size by 57.7%, 28.8%, and 52.3%, respectively.
The objective of this study is to address semantic misalignment and insufficient accuracy in edge detail and discrimination detection, which are common issues in deep learning-based change detection methods relying on encoding and decoding frameworks. In response to this, we propose a model called FlowDual-PixelClsObjectMec (FPCNet), which innovatively incorporates dual flow alignment technology in the decoding stage to rectify semantic discrepancies through streamlined feature correction fusion. Furthermore, the model employs an object-level similarity measurement coupled with pixel-level classification in the PixelClsObjectMec (PCOM) module during the final discrimination stage, significantly enhancing edge detail detection and overall accuracy. Experimental evaluations on the change detection dataset (CDD) and building CDD demonstrate superior performance, with F1 scores of 95.1% and 92.8%, respectively. Our findings indicate that FPCNet outperforms the existing algorithms in stability, robustness, and other key metrics.
Fracture identification is important for the evaluation of carbonate reservoirs. However, conventional logging equipment has a small depth of investigation and cannot detect rock fractures more than three meters away from the borehole. Remote acoustic logging uses phase-controlled array-transmitting and long sound probes that increase the depth of investigation. The interpretation of logging data with respect to fractures is typically guided by practical experience rather than theory and is often ambiguous. We use remote acoustic reflection logging data and high-order finite-difference approximations in the forward modeling and prestack reverse-time migration to image fractures. First, we perform forward modeling of the fracture responses as a function of the fracture-borehole wall distance, aperture, and dip angle. Second, we extract the energy intensity within the imaging area to determine whether the fracture can be identified as the formation velocity is varied. Finally, we evaluate the effect of the fracture-borehole distance, fracture aperture, and dip angle on fracture identification.
In recent years, with the rapid advancement of artificial intelligence, object detection algorithms have made significant strides in accuracy and computational efficiency. Notably, research and applications of Anchor-Free models have opened new avenues for real-time target detection in optical remote sensing images (ORSIs). However, in the realm of adversarial attacks, developing adversarial techniques tailored to Anchor-Free models remains challenging. Adversarial examples generated based on Anchor-Based models often exhibit poor transferability to these new model architectures. Furthermore, the growing diversity of Anchor-Free models poses additional hurdles to achieving robust transferability of adversarial attacks. This study presents an improved cross-conv-block feature fusion You Only Look Once (YOLO) architecture, engineered to facilitate the extraction of more comprehensive semantic features during the backpropagation process. To address the asymmetry between densely distributed objects in ORSIs and the corresponding detector outputs, a novel dense bounding box attack strategy is proposed. This approach leverages a dense target bounding box loss in the calculation of adversarial loss functions. Furthermore, by integrating translation-invariant (TI) and momentum-iteration (MI) adversarial methodologies, the proposed framework significantly improves the transferability of adversarial attacks. Experimental results demonstrate that our method achieves superior adversarial attack performance, with adversarial transferability rates (ATR) of 67.53% on the NWPU VHR-10 dataset and 90.71% on the HRSC2016 dataset. Compared to ensemble adversarial attack and cascaded adversarial attack approaches, our method generates adversarial examples in an average of 0.64 s, representing an approximately 14.5% improvement in efficiency under equivalent conditions.
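The adversarial-attack abstract above mentions momentum-iteration (MI) methods. As a rough illustration of the generic momentum update only (not the paper's dense bounding box loss, its YOLO surrogate, or the TI smoothing step), the sketch below accumulates an L1-normalized gradient with decay `mu` and takes sign steps inside an L-infinity ball of radius `eps`. The function names and the toy linear loss are hypothetical stand-ins for a real detector loss.

```python
def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def momentum_iterative_attack(x0, grad_fn, eps, steps, mu=1.0):
    """Generic momentum-iterative sign attack within an L-infinity ball.

    x0      : list of input values (e.g. flattened pixels)
    grad_fn : callable returning d(loss)/dx for the current input
    eps     : maximum per-element perturbation
    steps   : number of iterations; the step size is eps / steps
    mu      : momentum decay factor
    """
    alpha = eps / steps
    x = list(x0)
    g = [0.0] * len(x0)
    for _ in range(steps):
        grad = grad_fn(x)
        l1 = sum(abs(v) for v in grad) or 1.0  # avoid division by zero
        g = [mu * gi + v / l1 for gi, v in zip(g, grad)]
        x = [xi + alpha * sign(gi) for xi, gi in zip(x, g)]
        # Project back into the eps-ball around the clean input.
        x = [min(max(xi, ci - eps), ci + eps) for xi, ci in zip(x, x0)]
    return x


# Toy stand-in for a detector loss: loss(x) = sum(w_i * x_i), so grad = w.
toy_weights = [1.0, -2.0, 0.5]


def toy_gradient(x):
    return toy_weights


adversarial = momentum_iterative_attack([0.5, 0.5, 0.5], toy_gradient, eps=0.1, steps=10)
```

With this toy loss the gradient direction never changes, so each element simply moves by `eps` in the direction that raises the loss; with a real detector the momentum term is what smooths out noisy per-step gradients.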
The global population is rapidly expanding, driving an increasing demand for intelligent healthcare systems. Artificial intelligence (AI) applications in remote patient monitoring and diagnosis have achieved remarkable progress and are emerging as a major development trend. Among these applications, mouth motion tracking and mouth-state detection represent an important direction, providing valuable support for diagnosing neuromuscular disorders such as dysphagia, Bell’s palsy, and Parkinson’s disease. In this study, we focus on developing a real-time system capable of monitoring and detecting mouth state that can be efficiently deployed on edge devices. The proposed system integrates the Facial Landmark Detection technique with an optimized model combining a Bidirectional Gated Recurrent Unit (BiGRU) and Comprehensive Learning Particle Swarm Optimization (CLPSO). We conducted a comprehensive comparison and evaluation of the proposed model against several traditional models using multiple performance metrics, including accuracy, precision, recall, F1-score, cosine similarity, ROC–AUC, and the precision–recall curve. The proposed method achieved an accuracy of 96.57% with a precision of 98.25% on our self-collected dataset, outperforming traditional models and related works in the same field. These findings highlight the potential of the proposed approach for implementation in real-time patient monitoring systems, contributing to improved diagnostic accuracy and supporting healthcare professionals in patient treatment and care.
INTRODUCTION. On May 1st, 2024, around 2:10 a.m., a catastrophic collapse occurred along the Meilong Expressway near Meizhou City, Guangdong Province, China, at coordinates 24°29′24″N and 116°40′25″E. This collapse resulted in a pavement failure of approximately 17.9 m in length and covering an area of about 184.3 m² (Chinanews, 2024).
Given their important applications in the military and civilian domains, ship detection and classification based on optical remote sensing images have attracted considerable attention in the sea surface remote sensing field. This article collects methods of ship detection and classification that have been practically tested on optical remote sensing images, and provides their corresponding feature extraction strategies and statistical data. Basic feature extraction strategies and algorithms are analyzed with respect to their performance and application in ship detection and classification. Furthermore, publicly available datasets that can serve as benchmarks to verify the effectiveness and objectivity of ship detection and classification methods are summarized in this paper. Based on the analysis, the remaining problems and future development trends are presented for ship detection and classification methods based on optical remote sensing images.
Maize tassel detection is essential for future agronomic management in maize planting and breeding, with applications in yield estimation, growth monitoring, intelligent picking, and disease detection. However, detecting maize tassels in the field poses prominent challenges, as they are often obscured by widespread occlusions and differ in size, morphology, and color at different growth stages. This study proposes the SEYOLOX-tiny model, which detects maize tassels in the field more accurately and robustly. Firstly, the data acquisition method balances image quality against image acquisition efficiency and uses an unmanned aerial vehicle (UAV) to obtain maize tassel images from different periods to enrich the dataset. Moreover, the robust detection network extends YOLOX by embedding an attention mechanism to extract critical features and suppress the noise caused by adverse factors (e.g., occlusions and overlaps), making it more suitable and robust for operation in complex natural environments. Experimental results verify the research hypothesis and show a mean average precision (mAP@0.5) of 95.0%. The mAP@0.5, mAP@0.5-0.95, mAP@0.5-0.95 (area=small), and mAP@0.5-0.95 (area=medium) average values increased by 1.5%, 1.8%, 5.3%, and 1.7%, respectively, compared to the original model. The proposed method can effectively meet the precision and robustness requirements of the vision system in maize tassel detection.
A large number of publications have incorporated deep learning in the process of remote sensing change detection. In these Deep Learning Change Detection (DLCD) publications, deep learning methods have demonstrated their superiority over conventional change detection methods. However, the theoretical underpinnings of why deep learning improves the performance of change detection remain unresolved. As of today, few in-depth reviews have investigated the mechanisms of DLCD. Without such a review, five critical questions remain unclear. Does DLCD provide improved information representation for change detection? If so, how? How should an appropriate DLCD method be selected, and why? How much does each type of change benefit from DLCD in terms of its performance? What are the major limitations of existing DLCD methods and what are the prospects for DLCD? To address these five questions, we conducted the review as follows. We grouped the DLCD information assemblages into the four unique dimensions of remote sensing: spectral, spatial, temporal, and multi-sensor. For the extraction of information in each dimension, the difference between DLCD and conventional change detection methods was compared. We proposed a taxonomy of existing DLCD methods by dividing them into two distinctive pools: separate and coupled models. Their advantages, limitations, applicability, and performance were thoroughly investigated and explicitly presented. We examined the variations in performance between DLCD and conventional change detection. We described two limitations of DLCD, i.e., the training sample dilemma and the hardware and software dilemma. Based on these analyses, we identified directions for future developments. As a result of our review, we found that DLCD’s advantages over conventional change detection can be attributed to three factors: improved information representation; improved change detection methods; and performance enhancements. DLCD has to overcome its limitations with regard to training samples and computing infrastructure. We hope this review can boost developments of deep learning in change detection applications.
Building detection plays an important role in urban planning, smart cities, and military applications. To address the high overlap ratio of detection boxes in dense building detection in high-resolution remote sensing images, we present an effective YOLOv3 framework, corner regression-based YOLOv3 (Correg-YOLOv3), to localize dense buildings accurately. This improved YOLOv3 algorithm establishes a vertex regression mechanism and an additional loss term for building vertex offsets relative to the center point of the bounding box. By extending the output dimensions, the trained model is able to output the rectangular bounding boxes and the building vertices simultaneously. Finally, we evaluate the performance of Correg-YOLOv3 on our self-produced dataset and provide a qualitative and quantitative comparative analysis. The experimental results show high performance in precision (96.45%), recall (95.75%), F1 score (96.10%), and average precision (98.05%), which are 2.73%, 5.4%, 4.1%, and 4.73% higher, respectively, than those of YOLOv3. Therefore, our proposed algorithm effectively tackles the problem of dense building detection in high-resolution images.
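The precision, recall, and F1 figures reported for Correg-YOLOv3 above can be cross-checked with the standard definition of F1 as the harmonic mean of precision and recall. The short sketch below is only an arithmetic check on the numbers quoted in the abstract; the function name `f1_score` is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract, in percent.
f1 = f1_score(96.45, 95.75)
print(round(f1, 2))  # 96.1, matching the reported 96.10%
```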
Object detection in Remote Sensing (RS) has achieved tremendous advances in recent years, but rotated object detection remains challenging due to cluttered backgrounds, dense object arrangements, and the wide range of size variations among objects. To tackle this problem, a Dense Context Feature Pyramid Network (DCFPN) and a power α-Gaussian loss are designed for rotated object detection in this paper. The proposed DCFPN can extract multi-scale information densely and accurately by leveraging a dense multi-path dilation layer to cover all sizes of objects in remote sensing scenarios. For more accurate detection while avoiding bottlenecks such as boundary discontinuity in rotated bounding box regression, the α-Gaussian loss, a unified power generalization of existing Gaussian modeling losses, is proposed. Furthermore, the properties of the α-Gaussian loss are analyzed comprehensively for a wider range of applications. Experimental results on four datasets (UCAS-AOD, HRSC2016, DIOR-R, and DOTA) show the effectiveness of the proposed method with different detectors, and show that it is superior to existing methods in both feature extraction and bounding box regression.
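The abstract above refers to "Gaussian modeling losses" for rotated boxes. The α-Gaussian loss itself is not defined in the abstract, so the sketch below covers only the usual first step of such losses, under the common convention that a rotated box (cx, cy, w, h, θ) maps to a 2D Gaussian with mean (cx, cy) and covariance R diag(w²/4, h²/4) Rᵀ. The function name `box_to_gaussian` is hypothetical, and the convention is an assumption rather than something the abstract states.

```python
import math


def box_to_gaussian(cx, cy, w, h, theta):
    """Map a rotated box to (mean, covariance) of a 2D Gaussian.

    Covariance = R * diag(w^2 / 4, h^2 / 4) * R^T, where R rotates by theta
    (radians). Returned as ((cx, cy), ((sxx, sxy), (sxy, syy))).
    """
    c, s = math.cos(theta), math.sin(theta)
    a, b = w * w / 4.0, h * h / 4.0
    sxx = a * c * c + b * s * s
    sxy = (a - b) * c * s
    syy = a * s * s + b * c * c
    return (cx, cy), ((sxx, sxy), (sxy, syy))


# The same physical box described two ways: (w, h, theta) and
# (h, w, theta + 90 degrees). Both yield the same Gaussian.
g1 = box_to_gaussian(0.0, 0.0, 4.0, 2.0, 0.3)
g2 = box_to_gaussian(0.0, 0.0, 2.0, 4.0, 0.3 + math.pi / 2)
```

Because two parameterizations of the same box give the same Gaussian, a loss computed on the Gaussians does not jump when the angle wraps around, which is the boundary-discontinuity issue the abstract mentions.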
Using a single fixed morphological structuring element to detect edges in remote sensing images leads to problems such as incomplete edges and poor noise suppression. For this reason, a morphological edge detection method for remote sensing images based on variable structuring elements is proposed. Firstly, structuring elements with different scales and multiple directions are constructed according to the diversity of remote sensing imagery targets. To suppress the noise of the target background and highlight the edges of image targets through adaptive Top-hat and Bottom-hat transforms, the corresponding adaptive morphological operations are constructed based on variable structuring elements. Secondly, adaptive morphological edge detection is used to obtain multiple images with edge features at different scales and directions. Finally, the image edges are obtained by weighted summation of the edges in each direction, and least squares fitting is then applied to the edges to accurately locate the edge contour of the target. The experimental results show that the proposed method not only detects the complete edges of remote sensing images, but also has high edge detection accuracy and superior anti-noise performance. Compared with classical edge detection and morphological edge detection with a single fixed structuring element, the proposed method performs better in edge detection, and its detection accuracy can reach 95%.
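The abstract above combines edges obtained with several directional structuring elements by weighted summation. The sketch below illustrates that idea in a simplified form: it uses the plain morphological gradient (dilation minus erosion) instead of the paper's adaptive Top-hat and Bottom-hat operations, flat three-pixel horizontal and vertical structuring elements, and equal weights. All of these choices and all names are assumptions made for illustration.

```python
def _morph(img, offsets, pick):
    """Apply min (erosion) or max (dilation) over a flat structuring element."""
    h, w = len(img), len(img[0])
    return [
        [
            pick(
                img[y + dy][x + dx]
                for dy, dx in offsets
                if 0 <= y + dy < h and 0 <= x + dx < w
            )
            for x in range(w)
        ]
        for y in range(h)
    ]


def morph_gradient(img, offsets):
    """Morphological gradient: dilation minus erosion."""
    dil = _morph(img, offsets, max)
    ero = _morph(img, offsets, min)
    return [[d - e for d, e in zip(dr, er)] for dr, er in zip(dil, ero)]


def fused_edges(img, elements, weights):
    """Weighted sum of gradients from several structuring elements."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for offsets, weight in zip(elements, weights):
        grad = morph_gradient(img, offsets)
        for y in range(h):
            for x in range(w):
                out[y][x] += weight * grad[y][x]
    return out


# Symmetric offsets (dy, dx) that include the origin, so every window is non-empty.
HORIZONTAL = [(0, -1), (0, 0), (0, 1)]
VERTICAL = [(-1, 0), (0, 0), (1, 0)]

# A vertical step edge between columns 2 and 3.
image = [[0, 0, 0, 10, 10, 10] for _ in range(4)]
edges = fused_edges(image, [HORIZONTAL, VERTICAL], [0.5, 0.5])
```

Only the horizontal element responds to this vertical step, so the fused response is concentrated on the two columns adjacent to the edge; adding more directions and scales extends the same pattern.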
Funding: Supported by the Henan Province Key R&D Project under Grant 241111210400; the Henan Provincial Science and Technology Research Project under Grants 252102211047, 252102211062, 252102211055, and 232102210069; the Jiangsu Provincial Scheme Double Initiative Plan JSS-CBS20230474; the XJTLU RDF-21-02-008; the Science and Technology Innovation Project of Zhengzhou University of Light Industry under Grant 23XNKJTD0205; and the Higher Education Teaching Reform Research and Practice Project of Henan Province under Grant 2024SJGLX0126.
Abstract: Accurate and efficient detection of building changes in remote sensing imagery is crucial for urban planning, disaster emergency response, and resource management. However, existing methods face challenges such as spectral similarity between buildings and backgrounds, sensor variations, and insufficient computational efficiency. To address these challenges, this paper proposes a novel Multi-scale Efficient Wavelet-based Change Detection Network (MewCDNet), which combines the advantages of Convolutional Neural Networks and Transformers while balancing computational cost against detection performance. The network employs EfficientNet-B4 as the backbone for hierarchical feature extraction, integrates multi-level feature maps through a multi-scale fusion strategy, and incorporates two key modules: Cross-temporal Difference Detection (CTDD) and Cross-scale Wavelet Refinement (CSWR). CTDD adopts a dual-branch architecture that combines pixel-wise differencing with semantic-aware Euclidean distance weighting to better separate true changes from background noise. CSWR integrates the Haar-based Discrete Wavelet Transform with multi-head cross-attention, enabling cross-scale feature fusion while improving edge localization and suppressing spurious changes. Extensive experiments on four benchmark datasets show that MewCDNet outperforms the comparison methods, achieving F1 scores of 91.54% on LEVIR, 93.70% on WHUCD, and 64.96% on S2Looking for building change detection. MewCDNet also achieves the best performance on the multi-class SYSU dataset (F1: 82.71%), indicating strong generalization capability.
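The Haar-based Discrete Wavelet Transform mentioned in the abstract above can be illustrated with a single-level 2D decomposition. This is only a minimal sketch of the standard orthonormal Haar transform, not the authors' CSWR module; the function name `haar_dwt2`, the sub-band names, and the toy image are assumptions made for illustration, and sub-band naming conventions differ between libraries.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D orthonormal Haar DWT of an array with even height and width.

    Returns four half-resolution sub-bands: one approximation band and
    three detail bands that respond to intensity changes (edges).
    """
    x = np.asarray(x, dtype=float)
    # The four pixels of every non-overlapping 2x2 block.
    a = x[0::2, 0::2]  # top-left
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0          # low-pass in both directions
    detail_rows = (a + b - c - d) / 2.0     # change between the two rows
    detail_cols = (a - b + c - d) / 2.0     # change between the two columns
    detail_diag = (a - b - c + d) / 2.0     # diagonal change
    return approx, detail_rows, detail_cols, detail_diag

# Toy image (hypothetical data): a vertical edge between column 0 and column 1.
image = np.ones((4, 4))
image[:, 0] = 0.0
approx, d_rows, d_cols, d_diag = haar_dwt2(image)
```

Because the transform is orthonormal, the total energy of the four sub-bands equals that of the input, and the vertical edge shows up only in the column-difference band. That separation of edge detail from coarse content is the property that makes wavelet sub-bands attractive for edge refinement.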
Funding: Funded by the Hainan Province Science and Technology Special Fund under Grant ZDYF2024GXJS292.
Abstract: Deep learning has made significant progress in oriented object detection for remote sensing images. However, existing methods still struggle with multi-scale targets, complex backgrounds, and small objects. Keeping the model lightweight enough for resource-constrained remote sensing scenarios while improving detection performance remains an active research topic. We therefore propose EM-YOLO, a lightweight network with enhanced multi-scale feature extraction based on the YOLOv8s architecture and optimized for the large scale variations, diverse orientations, and numerous small objects found in remote sensing images. Our contributions are threefold. First, dynamic snake convolution (DSC) is introduced into the backbone network to enhance feature extraction for oriented targets. Second, a focusing-diffusion module is designed in the feature fusion neck to integrate multi-scale feature information effectively. Finally, we apply the Layer-Adaptive Sparsity for magnitude-based Pruning (LASP) method to prune the network so that it performs better in resource-constrained scenarios. Experimental results on the lightweight Orin platform demonstrate that the proposed method significantly outperforms the original YOLOv8s model in oriented remote sensing object detection and achieves comparable or superior performance to state-of-the-art methods on three authoritative remote sensing datasets (DOTA v1.0, DOTA v1.5, and HRSC2016).
Funding: Supported by the National Public Welfare Forest Desert Shrubbery Monitoring Project.
Abstract: Desert shrubs are indispensable in maintaining ecological stability by reducing soil erosion, enhancing water retention, and boosting soil fertility, which are critical factors in mitigating desertification. Because complex topography, variable climate, and difficult access hamper field surveys in desert regions, this paper proposes YOLO-Desert-Shrub (YOLO-DS), a method based on an enhanced YOLOv8n framework for detecting desert shrubs in UAV remote sensing images. The method accurately identifies shrub species, locations, and coverage. To address the dominance of small individual plants in the dataset, the SPDconv convolution module replaces conventional convolutions in the Backbone and Neck layers of the YOLOv8n model. This structural change reduces the loss of fine-grained information and strengthens discriminative feature capture across spatial scales. Furthermore, a structured state-space model is integrated into the main network, and a MambaLayer is designed to dynamically extract and refine shrub-specific features from remote sensing images, filtering out background noise and irrelevant interference to enhance feature representation. Benchmark evaluations show that YOLO-DS attains 79.56% mAP40weight, a 2.2% absolute gain over the baseline YOLOv8n architecture, with statistically significant advantages over contemporary detectors in cross-validation trials. The predicted plant coverage is strongly consistent with manually measured coverage, with a coefficient of determination (R^2) of 0.9148 and a Root Mean Square Error (RMSE) of 1.8266%. The proposed UAV-based remote sensing method using YOLO-DS effectively identifies and locates desert shrubs, monitors canopy sizes and distribution, and provides technical support for automated desert shrub monitoring.
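The two agreement metrics quoted in the abstract above, the coefficient of determination and the RMSE, can be computed as in the following minimal sketch. The function name `r2_and_rmse` and the coverage values are hypothetical and are used only to show the arithmetic; they are not data from the study.

```python
import math

def r2_and_rmse(measured, predicted):
    """Return (R^2, RMSE) of predictions against measured reference values."""
    n = len(measured)
    mean_measured = sum(measured) / n
    # Residual sum of squares: error left after prediction.
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    # Total sum of squares: spread of the measured values around their mean.
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    return r2, rmse

# Hypothetical plot coverage in percent (illustrative numbers only).
measured_coverage = [10.0, 20.0, 30.0, 40.0]
predicted_coverage = [12.0, 18.0, 31.0, 39.0]
r2, rmse = r2_and_rmse(measured_coverage, predicted_coverage)
```

When coverage is expressed in percent, the RMSE carries the same unit, which is why the abstract reports it as a percentage.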
Funding: Supported by the National Natural Science Foundation of China (No. 62241109) and the Tianjin Science and Technology Commissioner Project (No. 20YDTPJC01110).
Abstract: An improved model based on you only look once version 8 (YOLOv8) is proposed to address the low detection accuracy caused by the diversity of object sizes in optical remote sensing images. First, the feature pyramid network (FPN) structure of the original YOLOv8 model is replaced by the generalized-FPN (GFPN) structure from GiraffeDet to realize "cross-layer" and "cross-scale" adaptive feature fusion, enriching the semantic and spatial information in the feature maps and improving the model's target detection ability. Second, a multi atrous spatial pyramid pooling (MASPP) module is designed by combining atrous convolution with a feature pyramid structure to extract multi-scale features, improving the model's handling of multi-scale objects. The experimental results show that the improved YOLOv8 model achieves a detection accuracy of 92% and a mean average precision (mAP) of 87.9% on the DIOR dataset, which are 3.5% and 1.7% higher, respectively, than those of the original model. These results show that the proposed model improves the detection and classification of multi-scale optical remote sensing targets.
Abstract: Recent years have seen a surge of interest in object detection on remote sensing images for applications such as surveillance and management. However, challenges such as small objects, scale variation, and closely packed objects hinder accurate detection, and motion blur further complicates the identification of such objects. To address these issues, we propose an enhanced YOLOv9 with a transformer head (YOLOv9-TH). The model introduces an additional prediction head for detecting objects of varying sizes and swaps the original prediction heads for transformer heads to leverage self-attention mechanisms. We further improve YOLOv9-TH using several strategies, including data augmentation, multi-scale testing, multi-model integration, and an additional classifier. The cross-stage partial (CSP) method and the ghost convolution hierarchical graph (GCHG) are combined to improve detection accuracy by better utilizing feature maps, widening the receptive field, and precisely extracting multi-scale objects. Additionally, we incorporate the E-SimAM attention mechanism to address low-resolution feature loss. Extensive experiments on the VisDrone2021 and DIOR datasets demonstrate the effectiveness of YOLOv9-TH, which improves mAP over the best existing methods. YOLOv9-TH-e achieved 54.2% mAP50 on the VisDrone2021 dataset and 92.3% mAP on the DIOR dataset. The results confirm the model's robustness and suitability for real-world applications, particularly for small object detection in remote sensing images.
Abstract: Cloud detection is a critical preprocessing step in remote sensing image processing, as the presence of clouds significantly affects the accuracy of remote sensing data and limits its applicability across various domains. This study presents an enhanced cloud detection method based on the U-Net architecture, designed to address the challenges of multi-scale cloud features and long-range dependencies inherent in remote sensing imagery. A Multi-Scale Dilated Attention (MSDA) module is introduced to integrate multi-scale information and model long-range dependencies across different scales, enhancing the model's ability to detect clouds of varying sizes. Additionally, a Multi-Head Self-Attention (MHSA) mechanism is incorporated to improve the model's capacity for capturing finer details, particularly in distinguishing thin clouds from surface features. A multi-path supervision mechanism is also devised to ensure the model learns cloud features at multiple scales, further improving the accuracy and robustness of cloud mask generation. Experimental results demonstrate that the enhanced model outperforms the other benchmarked methods in complex scenarios and significantly improves cloud detection accuracy, highlighting its potential for practical cloud detection tasks.
Funding: Supported by the National Natural Science Foundation of China (No. 62471034) and the Hebei Natural Science Foundation (No. F2023105001).
Abstract: Fine-grained aircraft target detection in remote sensing holds significant research value and has practical applications, particularly in military defense and precision strikes. Given the complexity of remote sensing images, where targets are often small and similar within categories, detecting these fine-grained targets is challenging. To address this, we constructed a fine-grained dataset of remotely sensed airplanes. For fine-grained remote sensing targets, which exhibit a pronounced long-tailed category distribution and large variations in target size, we propose the DWDet fine-grained target detection and recognition algorithm. First, to handle the unbalanced category distribution, we adopt an adaptive sampling strategy. In addition, we construct a deformable convolutional block and improve the decoupled head structure to improve the model's detection of deformed targets. We then design a localization loss function to improve the model's localization ability for targets of different scales. The experimental results show that our algorithm improves the overall accuracy by 4.1% compared to the baseline model and improves the detection accuracy of small targets by 12.2%. The ablation and comparison experiments also confirm the effectiveness of our algorithm.
Funding: Supported by the Henan Province Key R&D Project (241111210400); the Henan Provincial Science and Technology Research Project (242102211007 and 242102211020); the Jiangsu Science and Technology Programme-General Programme (BK20221260); and the Science and Technology Innovation Project of Zhengzhou University of Light Industry (23XNKJTD0205).
Abstract: In recent years, convolutional neural networks (CNN) and Transformer architectures have made significant progress in remote sensing (RS) change detection (CD). Most existing methods directly stack multiple layers of Transformer blocks, which considerably improves the capture of variations but at a rather high computational cost. We propose a channel-Efficient Change Detection Network (CE-CDNet) to address the high computational cost and imbalanced detection accuracy in remote sensing building change detection. An adaptive multi-scale feature fusion module (CAMSF) and a lightweight Transformer decoder (LTD) are introduced to improve change detection. The CAMSF module adaptively fuses multi-scale features to improve the model's ability to detect building changes in complex scenes. The LTD module reduces computational cost while maintaining high detection accuracy through an optimized self-attention mechanism and a dimensionality reduction operation. Experimental results on three commonly used remote sensing building change detection datasets show that CE-CDNet reduces computational overhead while maintaining detection accuracy comparable to existing mainstream models.
Funding: Supported in part by the National Natural Science Foundation of China (Nos. 52472334, U2368204).
Abstract: In response to the challenges posed by complex backgrounds, diverse target angles, and numerous small targets in remote sensing images, alongside the high resource consumption that hinders model deployment, we propose an enhanced, lightweight you only look once version 8 small (YOLOv8s) detection algorithm. Regarding network improvements, we first replace traditional horizontal boxes with rotated boxes for target detection, addressing the difficulties in feature extraction caused by varying target angles. Second, we design a module integrating convolutional neural network (CNN) and Transformer components to replace specific C2f modules in the backbone network, thereby expanding the model's receptive field and enhancing feature extraction in complex backgrounds. Finally, we introduce a feature calibration structure to mitigate potential feature mismatches during feature fusion. For model compression, we employ a lightweight channel pruning technique based on localized mean average precision (LMAP) to eliminate redundancies in the enhanced model. Although this approach results in some loss of detection accuracy, it reduces the number of parameters, the computational load, and the model size. We then employ channel-level knowledge distillation to recover accuracy in the pruned model, further enhancing detection performance. Experimental results indicate that the enhanced algorithm achieves a 6.1% increase in mAP50 compared to YOLOv8s while reducing parameters, computational load, and model size by 57.7%, 28.8%, and 52.3%, respectively.
Abstract: This study addresses semantic misalignment and insufficient accuracy in edge detail and change discrimination, which are common issues in deep learning-based change detection methods built on encoder-decoder frameworks. We propose a model called FlowDual-PixelClsObjectMec (FPCNet), which incorporates dual flow alignment in the decoding stage to rectify semantic discrepancies through streamlined feature correction and fusion. Furthermore, the model couples object-level similarity measurement with pixel-level classification in the PixelClsObjectMec (PCOM) module during the final discrimination stage, significantly enhancing edge detail detection and overall accuracy. Experimental evaluations on the change detection dataset (CDD) and building CDD demonstrate superior performance, with F1 scores of 95.1% and 92.8%, respectively. Our findings indicate that FPCNet outperforms existing algorithms in stability, robustness, and other key metrics.
Funding: Supported by the National Petroleum Major Project (Grant No. 2011ZX05020-008).
Abstract: Fracture identification is important for the evaluation of carbonate reservoirs. However, conventional logging equipment has a small depth of investigation and cannot detect rock fractures more than three meters away from the borehole. Remote acoustic logging uses phase-controlled array transmission and long sound probes that increase the depth of investigation. The interpretation of logging data with respect to fractures is typically guided by practical experience rather than theory and is often ambiguous. We use remote acoustic reflection logging data and high-order finite-difference approximations in forward modeling and prestack reverse-time migration to image fractures. First, we perform forward modeling of the fracture responses as a function of the fracture-borehole wall distance, aperture, and dip angle. Second, we extract the energy intensity within the imaging area to determine whether the fracture can be identified as the formation velocity is varied. Finally, we evaluate the effect of the fracture-borehole distance, fracture aperture, and dip angle on fracture identification.
Abstract: In recent years, with the rapid advancement of artificial intelligence, object detection algorithms have made significant strides in accuracy and computational efficiency. Notably, research on and applications of Anchor-Free models have opened new avenues for real-time target detection in optical remote sensing images (ORSIs). However, in the realm of adversarial attacks, developing adversarial techniques tailored to Anchor-Free models remains challenging. Adversarial examples generated from Anchor-Based models often transfer poorly to these new architectures, and the growing diversity of Anchor-Free models poses additional hurdles to robust transferability. This study presents an improved cross-conv-block feature fusion You Only Look Once (YOLO) architecture, engineered to facilitate the extraction of more comprehensive semantic features during backpropagation. To address the asymmetry between densely distributed objects in ORSIs and the corresponding detector outputs, a novel dense bounding box attack strategy is proposed, which uses a dense target bounding box loss when computing the adversarial loss function. Furthermore, by integrating translation-invariant (TI) and momentum-iteration (MI) adversarial methods, the proposed framework significantly improves the transferability of adversarial attacks. Experimental results demonstrate that our method achieves superior adversarial attack performance, with adversarial transferability rates (ATR) of 67.53% on the NWPU VHR-10 dataset and 90.71% on the HRSC2016 dataset. Compared to ensemble and cascaded adversarial attack approaches, our method generates adversarial examples in an average of 0.64 s, an approximately 14.5% improvement in efficiency under equivalent conditions.
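The momentum-iteration (MI) component named in the abstract above can be sketched as follows. This is the generic MI update (accumulate the L1-normalized gradient, then step along its sign and clip to a perturbation budget), not the authors' dense bounding box attack; the toy quadratic loss, the function name `mi_step`, and the values of `mu`, `alpha`, and `eps` are assumptions chosen for illustration. The translation-invariant (TI) part, which smooths the gradient with a kernel before this step, is omitted.

```python
import numpy as np

def mi_step(x, grad, momentum, mu=1.0, alpha=0.01):
    """One momentum-iterative step: accumulate the L1-normalized gradient,
    then move along the sign of the accumulated direction."""
    momentum = mu * momentum + grad / (np.abs(grad).sum() + 1e-12)
    return x + alpha * np.sign(momentum), momentum

# Toy stand-in for a detector loss (hypothetical): J(x) = 0.5 * ||x - target||^2,
# whose gradient is simply x - target. The attack ascends this loss.
target = np.zeros(3)
x_clean = np.array([0.5, -0.2, 0.1])
eps = 0.03  # maximum allowed perturbation per element

x_adv = x_clean.copy()
momentum = np.zeros_like(x_clean)
for _ in range(5):
    grad = x_adv - target
    x_adv, momentum = mi_step(x_adv, grad, momentum)
    # Keep the perturbation inside the eps-ball around the clean input.
    x_adv = np.clip(x_adv, x_clean - eps, x_clean + eps)
```

Accumulating the gradient across iterations stabilizes the update direction, which is the usual argument for why momentum helps adversarial examples transfer between models.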
Funding: Supported by the National Science and Technology Council, Taiwan, under grant numbers NSTC 114-2622-8-992-007-TD1 and 112-2811-E-992-003-MY3.
Abstract: The global population is rapidly expanding, driving an increasing demand for intelligent healthcare systems. Artificial intelligence (AI) applications in remote patient monitoring and diagnosis have achieved remarkable progress and are emerging as a major development trend. Among these applications, mouth motion tracking and mouth-state detection represent an important direction, providing valuable support for diagnosing neuromuscular disorders such as dysphagia, Bell's palsy, and Parkinson's disease. In this study, we focus on developing a real-time system for monitoring and detecting mouth state that can be deployed efficiently on edge devices. The proposed system integrates Facial Landmark Detection with an optimized model combining a Bidirectional Gated Recurrent Unit (BiGRU) and Comprehensive Learning Particle Swarm Optimization (CLPSO). We compared and evaluated the proposed model against several traditional models using multiple performance metrics, including accuracy, precision, recall, F1-score, cosine similarity, ROC-AUC, and the precision-recall curve. The proposed method achieved an accuracy of 96.57% and a precision of 98.25% on our self-collected dataset, outperforming traditional models and related works in the same field. These findings highlight the potential of the proposed approach for real-time patient monitoring systems, contributing to improved diagnostic accuracy and supporting healthcare professionals in patient treatment and care.
Funding: Supported by the National Natural Science Foundation of China (Nos. 42371094, 41907253), and partially supported by the Interdisciplinary Cultivation Program of Xidian University (No. 21103240005) and the Postdoctoral Fellowship Program of CPSF (No. GZB20240589).
Abstract: INTRODUCTION. On May 1st, 2024, around 2:10 a.m., a catastrophic collapse occurred along the Meilong Expressway near Meizhou City, Guangdong Province, China, at coordinates 24°29′24″N and 116°40′25″E. This collapse resulted in a pavement failure approximately 17.9 m in length and covering an area of about 184.3 m² (Chinanews, 2024).
Abstract: Given its important applications in the military and civilian domains, ship detection and classification based on optical remote sensing images has attracted considerable attention in the field of sea surface remote sensing. This article collects methods of ship detection and classification that have been tested in practice on optical remote sensing images and presents their corresponding feature extraction strategies and statistical data. Basic feature extraction strategies and algorithms are analyzed in terms of their performance and application in ship detection and classification. Furthermore, publicly available datasets that can serve as benchmarks to verify the effectiveness and objectivity of ship detection and classification methods are summarized. Based on this analysis, the remaining problems and future development trends of ship detection and classification methods based on optical remote sensing images are discussed.
Funding: Supported by the Chinese Universities Scientific Fund (2022TC169).
Abstract: Maize tassel detection is essential for future agronomic management in maize planting and breeding, with applications in yield estimation, growth monitoring, intelligent picking, and disease detection. However, detecting maize tassels in the field poses prominent challenges, as they are often obscured by widespread occlusions and differ in size, morphology, and color at different growth stages. This study proposes the SEYOLOX-tiny model, which detects maize tassels in the field more accurately and robustly. First, the data acquisition method balances image quality against acquisition efficiency and uses an unmanned aerial vehicle (UAV) to obtain maize tassel images from different periods to enrich the dataset. Second, the detection network extends YOLOX by embedding an attention mechanism to extract critical features and suppress the noise caused by adverse factors (e.g., occlusions and overlaps), making it more suitable and robust for operation in complex natural environments. Experimental results verify the research hypothesis and show a mean average precision (mAP@0.5) of 95.0%. The mAP@0.5, mAP@0.5-0.95, mAP@0.5-0.95 (area=small), and mAP@0.5-0.95 (area=medium) values increased by 1.5, 1.8, 5.3, and 1.7%, respectively, compared to the original model. The proposed method can meet the precision and robustness requirements of the vision system in maize tassel detection.
Abstract: A large number of publications have incorporated deep learning into remote sensing change detection. In these Deep Learning Change Detection (DLCD) publications, deep learning methods have demonstrated their superiority over conventional change detection methods. However, the theoretical underpinnings of why deep learning improves change detection performance remain unresolved. To date, few in-depth reviews have investigated the mechanisms of DLCD. Without such a review, five critical questions remain unclear. Does DLCD provide improved information representation for change detection? If so, how? How should an appropriate DLCD method be selected, and why? How much does each type of change benefit from DLCD in terms of performance? What are the major limitations of existing DLCD methods, and what are the prospects for DLCD? To address these five questions, we conducted our review according to the following strategies. We grouped the DLCD information assemblages into the four unique dimensions of remote sensing: spectral, spatial, temporal, and multi-sensor. For the extraction of information in each dimension, we compared DLCD with conventional change detection methods. We proposed a taxonomy of existing DLCD methods by dividing them into two distinct pools: separate and coupled models. Their advantages, limitations, applicability, and performance were thoroughly investigated and explicitly presented. We examined the variations in performance between DLCD and conventional change detection. We described two limitations of DLCD: the training sample dilemma and the hardware and software dilemma. Based on these analyses, we identified directions for future developments. As a result of our review, we found that DLCD's advantages over conventional change detection can be attributed to three factors: improved information representation, improved change detection methods, and performance enhancements. DLCD has to overcome its limitations with regard to training samples and computing infrastructure. We envision that this review can boost the development of deep learning in change detection applications.
Funding: National Natural Science Foundation of China (No. 41871305); National Key Research and Development Program of China (No. 2017YFC0602204); Fundamental Research Funds for the Central Universities, China University of Geosciences (Wuhan) (No. CUGQY1945); Open Fund of Key Laboratory of Geological Survey and Evaluation of Ministry of Education and the Fundamental Research Funds for the Central Universities (No. GLAB2019ZR02); Open Fund of Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Natural Resources, China (No. KF-2020-05-068).
Abstract: Building detection plays an important role in urban planning, smart cities, and military applications. To address the high overlap ratio of detection boxes in dense building detection on high-resolution remote sensing images, we present an effective YOLOv3 framework, corner regression-based YOLOv3 (Correg-YOLOv3), to localize dense buildings accurately. This improved YOLOv3 algorithm establishes a vertex regression mechanism and an additional loss term for building vertex offsets relative to the center point of the bounding box. By extending the output dimensions, the trained model outputs the rectangular bounding boxes and the building vertices simultaneously. Finally, we evaluate the performance of Correg-YOLOv3 on our self-produced dataset and provide a qualitative and quantitative comparative analysis. The experimental results show high performance in precision (96.45%), recall (95.75%), F1 score (96.10%), and average precision (98.05%), which are 2.73%, 5.4%, 4.1%, and 4.73% higher, respectively, than those of YOLOv3. The proposed algorithm therefore effectively tackles the problem of dense building detection in high-resolution images.
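The F1 score reported in the abstract above follows directly from the stated precision and recall, since F1 is their harmonic mean. The short sketch below reproduces that arithmetic; the function name `f1_score` is an illustrative choice and does not refer to any particular library.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given in the same units."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall (in percent) as quoted in the abstract.
f1 = f1_score(96.45, 95.75)
print(round(f1, 2))  # 96.1, consistent with the reported 96.10%
```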
Abstract: Object detection in Remote Sensing (RS) has achieved tremendous advances in recent years, but rotated object detection remains challenging due to cluttered backgrounds, dense object arrangements, and the wide range of size variations among objects. To tackle this problem, this paper designs a Dense Context Feature Pyramid Network (DCFPN) and a power α-Gaussian loss for rotated object detection. The proposed DCFPN extracts multi-scale information densely and accurately by leveraging a dense multi-path dilation layer to cover all object sizes in remote sensing scenarios. For more accurate detection while avoiding bottlenecks such as boundary discontinuity in rotated bounding box regression, the α-Gaussian loss, a unified power generalization of existing Gaussian modeling losses, is proposed. Furthermore, the properties of the α-Gaussian loss are analyzed comprehensively for a wider range of applications. Experimental results on four datasets (UCAS-AOD, HRSC2016, DIOR-R, and DOTA) show the effectiveness of the proposed method with different detectors and its superiority over existing methods in both feature extraction and bounding box regression.
Funding: National Natural Science Foundation of China (No. 61761027); Postgraduate Education Reform Project of Lanzhou Jiaotong University (No. 1600120101).
Abstract: Using a single fixed morphological structuring element to detect edges in remote sensing images leads to problems such as incomplete edges and poor noise suppression. For this reason, a morphological edge detection method for remote sensing images based on variable structuring elements is proposed. First, structuring elements with different scales and multiple directions are constructed according to the diversity of targets in remote sensing imagery, and the corresponding adaptive morphological operations are built on these variable structuring elements so that adaptive top-hat and bottom-hat transforms suppress background noise and highlight target edges. Second, adaptive morphological edge detection is used to obtain multiple images with edge features at different scales and directions. Finally, the image edges are obtained by weighted summation of the edges in each direction, and least squares fitting is then applied to the edges to locate the target's edge contour accurately. The experimental results show that the proposed method not only detects complete edges in remote sensing images but also offers high edge detection accuracy and superior anti-noise performance. Compared with classical edge detection and morphological edge detection with a single fixed structuring element, the proposed method performs better in edge detection, and the detection accuracy can reach 95%.
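The idea of combining edges from several directional structuring elements by weighted summation, as described in the abstract above, can be sketched with a plain morphological gradient (dilation minus erosion). This is a simplified illustration under stated assumptions, not the paper's method: the adaptive top-hat and bottom-hat transforms, the multi-scale elements, and the least squares contour fitting are omitted, and the names `line_element`, `multi_direction_edges`, the four directions, and the equal weights are all hypothetical.

```python
import numpy as np

def _morph(img, offsets, reduce_fn):
    """Apply a flat structuring element, given as (dy, dx) offsets,
    using reduce_fn=np.max for dilation or np.min for erosion."""
    r = max(max(abs(dy), abs(dx)) for dy, dx in offsets)
    padded = np.pad(img, r, mode="edge")
    h, w = img.shape
    shifted = [padded[r + dy : r + dy + h, r + dx : r + dx + w] for dy, dx in offsets]
    return reduce_fn(np.stack(shifted), axis=0)

def line_element(length, direction):
    """Offsets of a line-shaped structuring element centered on the pixel."""
    half = length // 2
    return [(k * direction[0], k * direction[1]) for k in range(-half, half + 1)]

def multi_direction_edges(img, length=3, weights=None):
    """Weighted sum of morphological gradients over four line directions."""
    img = np.asarray(img, dtype=float)
    directions = [(0, 1), (1, 0), (1, 1), (1, -1)]  # horizontal, vertical, two diagonals
    if weights is None:
        weights = [0.25] * len(directions)  # equal weights (assumption)
    edge = np.zeros_like(img)
    for weight, direction in zip(weights, directions):
        element = line_element(length, direction)
        gradient = _morph(img, element, np.max) - _morph(img, element, np.min)
        edge += weight * gradient
    return edge

# Toy image (hypothetical data): a bright 4x4 square on a dark background.
image = np.zeros((8, 8))
image[2:6, 2:6] = 1.0
edges = multi_direction_edges(image)
```

Flat regions produce a zero response in every direction, while a boundary pixel responds only in the directions that cross the boundary, so the weights control how much each orientation contributes to the final edge map.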