This paper aims at providing multi-source remote sensing images registered in geometric space for image fusion. Focusing on the characteristics and differences of multi-source remote sensing images, a feature-based registration algorithm is implemented. The key technologies include image scale-space for implementing multi-scale properties, Harris corner detection for keypoint extraction, and the partial intensity invariant feature descriptor (PIIFD) for keypoint description. Eventually, a multi-scale Harris-PIIFD image registration algorithm framework is proposed. The experimental results of fifteen sets of representative real data show that the algorithm has excellent, stable performance in multi-source remote sensing image registration and can achieve accurate spatial alignment, which has strong practical application value and certain generalization ability.
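The Harris detector at the heart of this framework scores each pixel by how strongly the local gradient structure varies in two directions. A minimal sketch with plain NumPy (not the paper's implementation; the 3×3 box smoothing and k=0.05 are illustrative choices standing in for the usual Gaussian window):

```python
import numpy as np

def harris_response(img, k=0.05):
    # Image gradients via central differences (np.gradient: axis 0 = y, axis 1 = x)
    Iy, Ix = np.gradient(img.astype(float))

    def box3(a):
        # 3x3 box (mean) filter standing in for the usual Gaussian window
        p = np.pad(a, 1, mode="edge")
        h, w = a.shape
        return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0

    # Smoothed structure-tensor entries
    Sxx, Syy, Sxy = box3(Ix * Ix), box3(Iy * Iy), box3(Ix * Iy)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    # Harris response: positive at corners, negative on edges, ~0 on flat areas
    return det - k * trace ** 2

# Toy scene: a bright square whose corners should fire, edges should not
img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0
R = harris_response(img)
```

On this toy image the square's corners produce positive responses while edge midpoints go negative and the flat background stays near zero, which is exactly the selectivity that makes Harris points usable as registration keypoints.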
The automatic registration of multi-source remote sensing images (RSI) is currently a research hotspot of remote sensing image preprocessing. A special automatic image registration module named Image Autosync has been embedded into the ERDAS IMAGINE software of version 9.0 and above. The registration accuracies of the module were verified for remote sensing images obtained from different platforms or with different spatial resolutions. Four registration experiments are discussed in this article to analyze the accuracy differences based on remote sensing data with different spatial resolutions. The impact factors inducing the differences in registration accuracy are also analyzed.
Remote sensing image super-resolution technology is pivotal for enhancing image quality in critical applications including environmental monitoring, urban planning, and disaster assessment. However, traditional methods exhibit deficiencies in detail recovery and noise suppression, particularly when processing complex landscapes (e.g., forests, farmlands), leading to artifacts and spectral distortions that limit practical utility. To address this, we propose an enhanced Super-Resolution Generative Adversarial Network (SRGAN) framework featuring three key innovations: (1) replacement of the L1/L2 loss with a robust Charbonnier loss to suppress noise while preserving edge details via adaptive gradient balancing; (2) a multi-loss joint optimization strategy dynamically weighting the Charbonnier loss (β=0.5), Visual Geometry Group (VGG) perceptual loss (α=1), and adversarial loss (γ=0.1) to synergize pixel-level accuracy and perceptual quality; (3) a multi-scale residual network (MSRN) capturing cross-scale texture features (e.g., forest canopies, mountain contours). Validated on Sentinel-2 (10 m) and SPOT-6/7 (2.5 m) datasets covering 904 km² in Motuo County, Xizang, our method outperforms the SRGAN baseline (SR4RS) with Peak Signal-to-Noise Ratio (PSNR) gains of 0.29 dB and Structural Similarity Index (SSIM) improvements of 3.08% on forest imagery. Visual comparisons confirm enhanced texture continuity despite marginal Learned Perceptual Image Patch Similarity (LPIPS) increases. The method significantly improves noise robustness and edge retention in complex geomorphology, demonstrating 18% faster response in forest fire early warning and providing high-resolution support for agricultural/urban monitoring. Future work will integrate spectral constraints and lightweight architectures.
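The Charbonnier loss named in innovation (1) is a smooth approximation of L1 that behaves like L2 near zero error, and innovation (2) takes a fixed weighted sum of the three terms. A hedged sketch of both (NumPy; ε=1e-3 is a typical choice, not taken from the paper):

```python
import numpy as np

def charbonnier(pred, target, eps=1e-3):
    # sqrt(err^2 + eps^2): smooth near zero like L2, linear like L1 for
    # large errors, so noise and outliers do not dominate the gradient
    return float(np.mean(np.sqrt((pred - target) ** 2 + eps ** 2)))

def total_loss(l_charb, l_vgg, l_adv, alpha=1.0, beta=0.5, gamma=0.1):
    # Weighted joint objective with the weights reported in the abstract:
    # alpha * VGG perceptual + beta * Charbonnier + gamma * adversarial
    return alpha * l_vgg + beta * l_charb + gamma * l_adv
```

For example, component losses of 2.0 (Charbonnier), 1.0 (VGG), and 3.0 (adversarial) combine to 1×1.0 + 0.5×2.0 + 0.1×3.0 = 2.3 under the reported weighting.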
With the increasing frequency of floods, in-depth flood event analyses are essential for effective disaster relief and prevention. Satellite-based flood event datasets have become the primary data source for flood event analyses instead of limited disaster maps due to their enhanced availability. Nevertheless, despite the vast amount of available remote sensing images, existing flood event datasets continue to pose significant challenges in flood event analyses due to the uneven geographical distribution of data, the scarcity of time series data, and the limited availability of flood-related semantic information. There has been a surge in acceptance of deep learning models for flood event analyses, but some existing flood datasets do not align well with model training, and distinguishing flooded areas has proven difficult with limited data modalities and semantic information. Moreover, efficient retrieval and pre-screening of flood-related imagery from vast satellite data impose notable obstacles, particularly within large-scale analyses. To address these issues, we propose a Multimodal Flood Event Dataset (MFED) for deep-learning-based flood event analyses and data retrieval. It consists of 18 years of multi-source remote sensing imagery and heterogeneous textual information covering flood-prone areas worldwide. Incorporating optical and radar imagery can exploit the correlation and complementarity between distinct image modalities to capture the pixel features in flood imagery. It is worth noting that text modality data, including auxiliary hydrological information extracted from the Global Flood Database and text information refined from online news records, can also offer a semantic supplement to the images for flood event retrieval and analysis. To verify the applicability of the MFED in deep learning models, we carried out experiments with different models using a single modality and different combinations of modalities, which fully verified the effectiveness of the dataset. Furthermore, we also verify the efficiency of the MFED in comparative experiments with existing multimodal datasets and diverse neural network structures.
High-resolution remote sensing images (HRSIs) are now an essential data source for gathering surface information due to advancements in remote sensing data capture technologies. However, their significant scale changes and wealth of spatial details pose challenges for semantic segmentation. While convolutional neural networks (CNNs) excel at capturing local features, they are limited in modeling long-range dependencies. Conversely, transformers utilize multi-head self-attention to integrate global context effectively, but this approach often incurs a high computational cost. This paper proposes a global-local multiscale context network (GLMCNet) to extract both global and local multiscale contextual information from HRSIs. A detail-enhanced filtering module (DEFM) is proposed at the end of the encoder to refine the encoder outputs further, thereby enhancing the key details extracted by the encoder and effectively suppressing redundant information. In addition, a global-local multiscale transformer block (GLMTB) is proposed in the decoding stage to enable the modeling of rich multiscale global and local information. We also design a stair fusion mechanism to transmit deep semantic information from deep to shallow layers progressively. Finally, we propose the semantic awareness enhancement module (SAEM), which further enhances the representation of multiscale semantic features through spatial attention and covariance channel attention. Extensive ablation analyses and comparative experiments were conducted to evaluate the performance of the proposed method. Specifically, our method achieved a mean Intersection over Union (mIoU) of 86.89% on the ISPRS Potsdam dataset and 84.34% on the ISPRS Vaihingen dataset, outperforming existing models such as ABCNet and BANet.
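For reference, the mIoU metric reported above averages the per-class Intersection over Union computed from a confusion matrix. A minimal sketch (NumPy; the 2×2 matrix is toy data, not the paper's results):

```python
import numpy as np

def mean_iou(conf):
    # conf[i, j] = number of pixels of true class i predicted as class j
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf)                                # true positives per class
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp  # TP + FP + FN per class
    return float(np.mean(tp / union))

# Toy 2-class confusion matrix: IoU = 8/11 for class 0 and 9/12 for class 1
conf = np.array([[8, 2],
                 [1, 9]])
```

Here mean_iou(conf) averages 8/11 and 9/12, giving about 0.739.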
Accurate estimation of understory terrain has significant scientific importance for maintaining ecosystem balance and biodiversity conservation. Addressing the inadequate representation of spatial heterogeneity when traditional forest topographic inversion methods treat the entire forest as the inversion unit, this study proposes a differentiated modeling approach for forest types based on refined land cover classification. Taking Puerto Rico and Maryland as study areas, a multi-dimensional feature system is constructed by integrating multi-source remote sensing data: ICESat-2 spaceborne LiDAR is used to obtain benchmark values for understory terrain, topographic factors such as slope and aspect are extracted based on SRTM data, and vegetation cover characteristics are analyzed using Landsat-8 multispectral imagery. This study incorporates forest type as a classification modeling condition and applies the random forest algorithm to build differentiated topographic inversion models. Experimental results indicate that, compared to traditional whole-area modeling methods (RMSE = 5.06 m), forest type-based classification modeling significantly improves the accuracy of understory terrain estimation (RMSE = 2.94 m), validating the effectiveness of spatial heterogeneity modeling. Further sensitivity analysis reveals that canopy structure parameters (with RMSE variation reaching 4.11 m) exert a stronger regulatory effect on estimation accuracy compared to forest cover, providing important theoretical support for optimizing remote sensing models of forest topography.
Mudflat vegetation plays a crucial role in the ecological function of wetland environments, and obtaining its fine spatial distribution is of great significance for wetland protection and management. Remote sensing techniques can realize the rapid extraction of wetland vegetation over a large area. However, the imaging of optical sensors is easily restricted by weather conditions, and the backscattered information reflected by Synthetic Aperture Radar (SAR) images is easily disturbed by many factors. Although both data sources have been applied in wetland vegetation classification, there is a lack of comparative study on how the selection of data sources affects the classification effect. This study takes the vegetation of the tidal flat wetland in Chongming Island, Shanghai, China, in 2019, as the research subject. A total of 22 optical feature parameters and 11 SAR feature parameters were extracted from the optical data source (Sentinel-2) and SAR data source (Sentinel-1), respectively. The performance of optical and SAR data and their feature parameters in wetland vegetation classification was quantitatively compared and analyzed by different feature combinations. Furthermore, by simulating the scenario of missing optical images, the impact of optical image absence on vegetation classification accuracy and the compensatory effect of integrating SAR data were revealed. Results show that: 1) under the same classification algorithm, the Overall Accuracy (OA) of the combined use of optical and SAR images was the highest, reaching 95.50%. The OA of using only optical images was slightly lower, while using only SAR images yielded the lowest accuracy, but still achieved 86.48%. 2) Compared to using the spectral reflectance of optical data and the backscattering coefficient of SAR data directly, the constructed optical and SAR feature parameters contributed to improving classification accuracy. The inclusion of optical (vegetation index, spatial texture, and phenology features) and SAR feature parameters (SAR index and SAR texture features) in the classification algorithm resulted in an OA improvement of 4.56% and 9.47%, respectively. SAR backscatter, SAR index, optical phenological features, and vegetation index were identified as the top-ranking important features. 3) When the optical data were missing continuously for six months, the OA dropped to a minimum of 41.56%. However, when combined with SAR data, the OA could be improved to 71.62%. This indicates that the incorporation of SAR features can effectively compensate for the loss of accuracy caused by missing optical images, especially in regions with long-term cloud cover.
A novel CNN-Mamba hybrid architecture was proposed to address intra-class variance and inter-class similarity in remote sensing imagery. The framework integrates: (1) parallel CNN and visual state space (VSS) encoders, (2) multi-scale cross-attention feature fusion, and (3) a boundary-constrained decoder. This design overcomes CNN's limited receptive fields and ViT's quadratic complexity while efficiently capturing both local features and global dependencies. Evaluations on the LoveDA and ISPRS Vaihingen datasets demonstrate superior segmentation accuracy and boundary preservation compared to existing approaches, with the dual-branch structure maintaining computational efficiency throughout the process.
In response to challenges posed by complex backgrounds, diverse target angles, and numerous small targets in remote sensing images, alongside the issue of high resource consumption hindering model deployment, we propose an enhanced, lightweight you only look once version 8 small (YOLOv8s) detection algorithm. Regarding network improvements, we first replace traditional horizontal boxes with rotated boxes for target detection, effectively addressing difficulties in feature extraction caused by varying target angles. Second, we design a module integrating convolutional neural networks (CNN) and Transformer components to replace specific C2f modules in the backbone network, thereby expanding the model's receptive field and enhancing feature extraction in complex backgrounds. Finally, we introduce a feature calibration structure to mitigate potential feature mismatches during feature fusion. For model compression, we employ a lightweight channel pruning technique based on localized mean average precision (LMAP) to eliminate redundancies in the enhanced model. Although this approach results in some loss of detection accuracy, it effectively reduces the number of parameters, computational load, and model size. Additionally, we employ channel-level knowledge distillation to recover accuracy in the pruned model, further enhancing detection performance. Experimental results indicate that the enhanced algorithm achieves a 6.1% increase in mAP50 compared to YOLOv8s, while simultaneously reducing parameters, computational load, and model size by 57.7%, 28.8%, and 52.3%, respectively.
Recent years have seen a surge in interest in object detection on remote sensing images for applications such as surveillance and management. However, challenges like small object detection, scale variation, and the presence of closely packed objects in these images hinder accurate detection. Additionally, the motion blur effect further complicates the identification of such objects. To address these issues, we propose enhanced YOLOv9 with a transformer head (YOLOv9-TH). The model introduces an additional prediction head for detecting objects of varying sizes and swaps the original prediction heads for transformer heads to leverage self-attention mechanisms. We further improve YOLOv9-TH using several strategies, including data augmentation, multi-scale testing, multi-model integration, and the introduction of an additional classifier. The cross-stage partial (CSP) method and the ghost convolution hierarchical graph (GCHG) are combined to improve detection accuracy by better utilizing feature maps, widening the receptive field, and precisely extracting multi-scale objects. Additionally, we incorporate the E-SimAM attention mechanism to address low-resolution feature loss. Extensive experiments on the VisDrone2021 and DIOR datasets demonstrate the effectiveness of YOLOv9-TH, showing good improvement in mAP compared to the best existing methods. YOLOv9-TH-e achieved 54.2% mAP50 on the VisDrone2021 dataset and 92.3% mAP on the DIOR dataset. The results confirm the model's robustness and suitability for real-world applications, particularly for small object detection in remote sensing images.
The objective of this study is to address semantic misalignment and insufficient accuracy in edge detail and discrimination detection, which are common issues in deep learning-based change detection methods relying on encoding and decoding frameworks. In response, we propose a model called FlowDual-PixelClsObjectMec (FPCNet), which innovatively incorporates dual flow alignment technology in the decoding stage to rectify semantic discrepancies through streamlined feature correction fusion. Furthermore, the model employs an object-level similarity measurement coupled with pixel-level classification in the PixelClsObjectMec (PCOM) module during the final discrimination stage, significantly enhancing edge detail detection and overall accuracy. Experimental evaluations on the change detection dataset (CDD) and building CDD demonstrate superior performance, with F1 scores of 95.1% and 92.8%, respectively. Our findings indicate that FPCNet outperforms existing algorithms in stability, robustness, and other key metrics.
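The F1 scores reported for change detection are the harmonic mean of precision and recall over changed pixels; a one-function sketch (pure Python, with illustrative counts only):

```python
def f1_score(tp, fp, fn):
    # Harmonic mean of precision and recall over detected changes
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

With 90 true positives and 10 each of false positives and false negatives, both precision and recall are 0.9, so F1 is 0.9.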
Image registration is an indispensable component in multi-source remote sensing image processing. In this paper, we put forward a remote sensing image registration method by including an improved multi-scale and multi-direction Harris algorithm and a novel compound feature. Multi-scale circle Gaussian combined invariant moments and the multi-direction gray level co-occurrence matrix are extracted as features for image matching. The proposed algorithm is evaluated on numerous multi-source remote sensor images with noise and illumination changes. Extensive experimental studies prove that our proposed method is capable of producing a stable and even distribution of key points as well as obtaining robust and accurate correspondence matches. It is a promising scheme in multi-source remote sensing image registration.
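The gray level co-occurrence matrix used here as a matching feature counts how often pairs of gray levels co-occur at a fixed pixel offset; statistics such as contrast are then derived from the normalized matrix. A small sketch (NumPy; the 3×3 image and horizontal offset are illustrative, and the paper's multi-direction variant would repeat this over several (dx, dy) offsets):

```python
import numpy as np

def glcm(img, dx, dy, levels):
    # Count gray-level pairs (img[y, x], img[y+dy, x+dx]), then normalize
    m = np.zeros((levels, levels))
    h, w = img.shape
    for y in range(max(0, -dy), min(h, h - dy)):
        for x in range(max(0, -dx), min(w, w - dx)):
            m[img[y, x], img[y + dy, x + dx]] += 1
    return m / m.sum()

def glcm_contrast(p):
    # Contrast statistic: expected squared gray-level difference
    i, j = np.indices(p.shape)
    return float(np.sum(p * (i - j) ** 2))

# Toy 2-level image, horizontal offset (dx=1, dy=0)
img = np.array([[0, 0, 1],
                [0, 1, 1],
                [1, 1, 0]])
p = glcm(img, dx=1, dy=0, levels=2)
```

For this image the six horizontal pairs give p = [[1/6, 1/3], [1/6, 1/3]] and a contrast of 0.5.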
Numerous coal fires burn underneath the Datong coalfield because of indiscriminate mining. Landsat TM/ETM, unmanned aerial vehicle (UAV), and infrared thermal imager data were employed to monitor underground coal fires in the Majiliang mining area. The thermal field distributions of this area in 2000, 2002, 2006, 2007, and 2009 were obtained using Landsat TM/ETM. The changes in the distribution were then analyzed to approximate the locations of the coal fires. Through UAV imagery employed at a very high resolution (0.2 m), the texture information, linear features, and brightness of the ground fissures in the coal fire area were determined. All these data were combined to build a knowledge model for determining fissures and were used to support underground coal fire detection. An infrared thermal imager was used to map the thermal field distribution of areas where coal fire is serious. Results were analyzed to identify the hot spot trend and the depth of the burning point.
This study establishes a forest resources management information system for forest farms based on B/S-structured WebGIS, with the trial forest farm of the Hunan Academy of Forestry as the research field and forest resources field survey data, ETM+ remote sensing data, and basic geographical information data as research materials. Through the extraction of forest resource data in the forest farm, requirement analysis of the system functions, and the establishment of the required software and hardware environment, the system aims to realize the management, query, editing, analysis, statistics, and other functions of forest resources information in order to manage forest resources.
The large-scale acquisition and widespread application of remote sensing image data have led to increasingly severe challenges in information security and privacy protection during transmission and storage. Urban remote sensing images, characterized by complex content and well-defined structures, are particularly vulnerable to malicious attacks and information leakage. To address this issue, the author proposes an encryption method based on the enhanced single-neuron dynamical system (ESNDS). ESNDS generates high-quality pseudo-random sequences with complex dynamics and intense sensitivity to initial conditions, which drive a multi-stage cipher structure comprising permutation, ring-wise diffusion, and mask perturbation. Using representative GF-2 Panchromatic and Multispectral Scanner (PMS) urban scenes, the author conducts systematic evaluations in terms of inter-pixel correlation, information entropy, histogram uniformity, and number of pixel change rate (NPCR)/unified average changing intensity (UACI). The results demonstrate that the proposed scheme effectively resists statistical analysis, differential attacks, and known-plaintext attacks while maintaining competitive computational efficiency for high-resolution urban images. In addition, the cipher is lightweight and hardware-friendly, integrates readily with on-board and ground processing, and thus offers tangible engineering utility for real-time, large-volume remote sensing data protection.
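The NPCR and UACI metrics evaluated above have simple closed forms: NPCR is the percentage of pixel positions whose cipher values differ between two ciphertexts, and UACI is the mean absolute intensity change normalized by the gray-level range. A sketch (NumPy; the 2×2 arrays are toy ciphertexts, not the paper's data):

```python
import numpy as np

def npcr(c1, c2):
    # Percentage of pixel positions whose cipher values differ
    return 100.0 * float(np.mean(c1 != c2))

def uaci(c1, c2, max_val=255):
    # Mean absolute intensity change normalized by the gray-level range
    d = np.abs(c1.astype(float) - c2.astype(float))
    return 100.0 * float(np.mean(d / max_val))

# Toy 2x2 "ciphertexts"; for reference, a strong 8-bit cipher is expected
# to score near NPCR ~99.61% and UACI ~33.46% on one-pixel plaintext changes
c1 = np.array([[0, 255], [128, 128]], dtype=np.uint8)
c2 = np.array([[255, 255], [128, 0]], dtype=np.uint8)
```

Here two of four positions differ, so NPCR is 50%, and UACI is 100 × (255/255 + 128/255) / 4 ≈ 37.55%.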
This paper introduces a lightweight remote sensing image dehazing network called the multidimensional weight regulation network (MDWR-Net), which addresses the high computational cost of existing methods. Previous works, often based on the encoder-decoder structure and utilizing multiple upsampling and downsampling layers, are computationally expensive. To improve efficiency, the paper proposes two modules: the efficient spatial resolution recovery module (ESRR) for upsampling and the efficient depth information augmentation module (EDIA) for downsampling. These modules not only reduce model complexity but also enhance performance. Additionally, the partial feature weight learning module (PFWL) is introduced to reduce the computational burden by applying weight learning across partial dimensions, rather than using full-channel convolution. To overcome the limitations of convolutional neural network (CNN)-based networks, the haze distribution index transformer (HDIT) is integrated into the decoder. We also propose the physics-based non-adjacent feature fusion module (PNFF), which leverages the atmospheric scattering model to improve the generalization of MDWR-Net. The MDWR-Net achieves superior dehazing performance with a computational cost of just 2.98×10⁹ multiply-accumulate operations (MACs), which is less than one-tenth of previous methods. Experimental results validate its effectiveness in balancing performance and computational efficiency.
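The atmospheric scattering model that PNFF leverages writes an observed hazy pixel as I = J·t + A·(1−t), where J is the scene radiance, t the transmission, and A the airlight; dehazing inverts it for J. A minimal sketch of the model and its inversion (NumPy; the t_min clamp is a common safeguard, not a detail from the paper):

```python
import numpy as np

def hazy(J, t, A):
    # Atmospheric scattering model: I = J*t + A*(1 - t)
    return J * t + A * (1.0 - t)

def dehaze(I, t, A, t_min=0.1):
    # Invert the model for scene radiance J; clamp t so dense haze
    # (t -> 0) does not blow up noise
    t = np.maximum(t, t_min)
    return (I - A * (1.0 - t)) / t

# Round trip on toy radiances with airlight A = 1
J = np.array([0.2, 0.8])
t = np.array([0.5, 0.9])
I = hazy(J, t, 1.0)
```

The round trip recovers J exactly when t is known; real dehazers must instead estimate t and A from the image, which is where learned components like HDIT come in.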
Remote sensing image segmentation has a wide range of applications in land cover classification, urban building recognition, crop monitoring, and other fields. In recent years, with the booming development of deep learning, remote sensing image segmentation models based on deep learning have gradually emerged and produced a large number of scientific research achievements. This article reviews the latest achievements in remote sensing image segmentation based on deep learning and explores future development directions. Firstly, the basic concepts, characteristics, classification, tasks, and commonly used datasets of remote sensing images are presented. Secondly, the segmentation models based on deep learning are classified and summarized, and the principles, characteristics, and applications of various models are presented. Then, the key technologies involved in deep learning remote sensing image segmentation are introduced. Finally, the future development directions and application prospects of remote sensing image segmentation are discussed. This review of the latest research achievements from the perspective of deep learning can provide reference and inspiration for research on remote sensing image segmentation.
Semantic segmentation provides important technical support for land cover/land use (LCLU) research. By calculating the cosine similarity between feature vectors, transformer-based models can effectively capture the global information of high-resolution remote sensing images. However, the diversity of detailed and edge features within the same class of ground objects in high-resolution remote sensing images leads to a dispersed embedding distribution. The dispersed feature distribution enlarges feature vector angles and reduces cosine similarity, weakening the attention mechanism's ability to identify the same class of ground objects. To address this challenge, a remote sensing image information granulation transformer for semantic segmentation is proposed. The model employs adaptive granulation to extract common semantic features among objects of the same class, constructing an information granule to replace the detailed feature representation of these objects. Then, the Laplacian operator of the information granule is applied to extract the edge features of the object as represented by the information granule. In the experiments, the proposed model was validated on the Beijing Land-Use (BLU), Gaofen Image Dataset (GID), and Potsdam Dataset (PD). In particular, the model achieves 88.81% mOA, 82.64% mF1, and 71.50% mIoU on the GID dataset. Experimental results show that the model effectively handles high-resolution remote sensing images. Our code is available at https://github.com/sjmp525/RSIGT (accessed on 16 April 2025).
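The weakening effect described here is easy to see numerically: cosine similarity drops as same-class feature vectors disperse, and replacing them with one shared granule representation restores it. A toy sketch (NumPy; the vectors, and using the mean as the granule, are illustrative assumptions, not the model's actual granulation):

```python
import numpy as np

def cosine(u, v):
    # Attention affinity: cosine of the angle between feature vectors
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Two dispersed same-class feature vectors: wide angle, low affinity
a = np.array([1.0, 0.2])
b = np.array([0.2, 1.0])
# Information granule: one shared representation (here, simply their mean)
g = (a + b) / 2.0
```

cosine(a, b) is about 0.38, while each vector's affinity with the shared granule g rises to about 0.83, illustrating why replacing dispersed detail features with a common granule strengthens same-class attention.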
The convolutional neural network (CNN) method based on DeepLabv3+ has some problems in the semantic segmentation of high-resolution remote sensing images, such as a fixed receptive field size in feature extraction, lack of semantic information, high decoder magnification, and insufficient detail retention ability. A hierarchical feature fusion network (HFFNet) was proposed. Firstly, a combination of transformer and CNN architectures was employed for feature extraction from images of varying resolutions, and the extracted features were processed independently. Subsequently, the features from the transformer and the CNN were fused under the guidance of features from different sources. This fusion process assisted in restoring information more comprehensively during the decoding stage. Furthermore, a spatial channel attention module was designed in the final stage of decoding to refine features and reduce the semantic gap between shallow CNN features and deep decoder features. The experimental results showed that HFFNet had superior performance on the UAVid, LoveDA, Potsdam, and Vaihingen datasets, and its cross-linking index was better than that of DeepLabv3+ and other competing methods, showing strong generalization ability.
Secured access is studied in this paper for image remote sensing networks. Each sensor in such a network encounters information security issues when wirelessly uploading image information from the sensor to the central collection point. In order to enhance the sensing quality of the remote uploading, the passive reflection surface technique is employed. If an eavesdropper near this sensor keeps accessing the same network, it may receive the same image from this sensor. Our goal in this paper is to improve the SNR of the legitimate collection unit while reducing the SNR of the eavesdropper as much as possible by adaptively adjusting the uploading power of this sensor, thereby enhancing the security of the remote sensing images. To achieve this goal, the secured energy efficiency performance is theoretically analyzed with respect to the number of passive reflection elements by calculating the instantaneous performance over the channel fading coefficients. Based on this theoretical result, the secured access is formulated as a mathematical optimization problem with the sensor uploading power as the unknown variable and the objective of energy efficiency maximization, while satisfying a required maximum data rate for the eavesdropper. Finally, an analytical expression is theoretically derived for the optimum uploading power. Numerical simulations verify the design approach.
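The design goal of raising the legitimate SNR while capping the eavesdropper's corresponds to maximizing the secrecy rate, the gap between the two channel capacities. A hedged sketch of that quantity (pure Python; the SNR values are illustrative, and the paper's energy-efficiency objective additionally divides such a rate by the uploading power):

```python
import math

def secrecy_rate(snr_legit, snr_eve):
    # Bits/s/Hz deliverable in secrecy: capacity gap between the legitimate
    # collection unit and the eavesdropper, floored at zero
    return max(0.0, math.log2(1.0 + snr_legit) - math.log2(1.0 + snr_eve))
```

With a legitimate SNR of 15 and an eavesdropper SNR of 3, the secrecy rate is log2(16) − log2(4) = 2 bits/s/Hz; if the eavesdropper's channel is the stronger one, the achievable secrecy rate is zero.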
Abstract: The automatic registration of multi-source remote sensing images (RSI) is currently a research hotspot in remote sensing image preprocessing. A dedicated automatic image registration module named Image Autosync has been embedded in the ERDAS IMAGINE software since version 9.0. The registration accuracy of the module was verified for remote sensing images obtained from different platforms or at different spatial resolutions. Four registration experiments are discussed in this article to analyze the accuracy differences across remote sensing data of different spatial resolutions. The factors inducing these differences in registration accuracy are also analyzed.
Funding: This study was supported by the Inner Mongolia Academy of Forestry Sciences Open Research Project (Grant No. KF2024MS03), the Project to Improve the Scientific Research Capacity of the Inner Mongolia Academy of Forestry Sciences (Grant No. 2024NLTS04), and the Innovation and Entrepreneurship Training Program for Undergraduates of Beijing Forestry University (Grant No. X202410022268).
Abstract: Remote sensing image super-resolution technology is pivotal for enhancing image quality in critical applications including environmental monitoring, urban planning, and disaster assessment. However, traditional methods exhibit deficiencies in detail recovery and noise suppression, particularly when processing complex landscapes (e.g., forests, farmlands), leading to artifacts and spectral distortions that limit practical utility. To address this, we propose an enhanced Super-Resolution Generative Adversarial Network (SRGAN) framework featuring three key innovations: (1) replacement of the L1/L2 loss with a robust Charbonnier loss to suppress noise while preserving edge details via adaptive gradient balancing; (2) a multi-loss joint optimization strategy dynamically weighting the Charbonnier loss (β=0.5), Visual Geometry Group (VGG) perceptual loss (α=1), and adversarial loss (γ=0.1) to combine pixel-level accuracy with perceptual quality; (3) a multi-scale residual network (MSRN) capturing cross-scale texture features (e.g., forest canopies, mountain contours). Validated on Sentinel-2 (10 m) and SPOT-6/7 (2.5 m) datasets covering 904 km² in Motuo County, Xizang, our method outperforms the SRGAN baseline (SR4RS) with Peak Signal-to-Noise Ratio (PSNR) gains of 0.29 dB and Structural Similarity Index (SSIM) improvements of 3.08% on forest imagery. Visual comparisons confirm enhanced texture continuity despite marginal increases in Learned Perceptual Image Patch Similarity (LPIPS). The method significantly improves noise robustness and edge retention over complex geomorphology, demonstrating an 18% faster response in forest fire early warning and providing high-resolution support for agricultural and urban monitoring. Future work will integrate spectral constraints and lightweight architectures.
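The Charbonnier loss and the quoted loss weights (α=1 for VGG, β=0.5 for Charbonnier, γ=0.1 for adversarial) can be sketched as below. The perceptual and adversarial terms appear as placeholder scalars here; in the actual network they would come from a VGG feature extractor and the discriminator.

```python
# Charbonnier loss (a smooth, outlier-robust L1 variant) and a weighted
# multi-loss combination in the spirit of the abstract. Sketch only.
import math

def charbonnier(pred, target, eps=1e-3):
    """Mean of sqrt((p - t)^2 + eps^2) over all pixels."""
    n = len(pred)
    return sum(math.sqrt((p - t) ** 2 + eps ** 2)
               for p, t in zip(pred, target)) / n

def total_loss(pred, target, perceptual, adversarial,
               alpha=1.0, beta=0.5, gamma=0.1):
    # Weighted joint objective: perceptual + pixel-level + adversarial.
    return alpha * perceptual + beta * charbonnier(pred, target) + gamma * adversarial

pred = [0.2, 0.5, 0.9]
target = [0.25, 0.5, 0.8]
loss = total_loss(pred, target, perceptual=0.02, adversarial=0.7)
```

Unlike L2, the Charbonnier gradient is bounded for large residuals, which is what gives the noise-suppression behavior described above.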
Funding: Supported by the National Natural Science Foundation of China [Grant No. 42071413] and the GHfund C [Grant No. 202302039381].
Abstract: With the increasing frequency of floods, in-depth flood event analyses are essential for effective disaster relief and prevention. Satellite-based flood event datasets have become the primary data source for flood event analyses, replacing limited disaster maps, due to their enhanced availability. Nevertheless, despite the vast amount of available remote sensing images, existing flood event datasets continue to pose significant challenges for flood event analyses due to the uneven geographical distribution of data, the scarcity of time series data, and the limited availability of flood-related semantic information. Deep learning models have surged in acceptance for flood event analyses, but some existing flood datasets do not align well with model training, and distinguishing flooded areas has proven difficult with limited data modalities and semantic information. Moreover, efficient retrieval and pre-screening of flood-related imagery from vast satellite data pose notable obstacles, particularly in large-scale analyses. To address these issues, we propose a Multimodal Flood Event Dataset (MFED) for deep-learning-based flood event analyses and data retrieval. It consists of 18 years of multi-source remote sensing imagery and heterogeneous textual information covering flood-prone areas worldwide. Incorporating optical and radar imagery exploits the correlation and complementarity between distinct image modalities to capture the pixel features in flood imagery. Notably, text modality data, including auxiliary hydrological information extracted from the Global Flood Database and text information refined from online news records, also offer a semantic supplement to the images for flood event retrieval and analysis. To verify the applicability of the MFED to deep learning models, we carried out experiments with different models using a single modality and different combinations of modalities, which fully verified the effectiveness of the dataset. Furthermore, we also verify the efficiency of the MFED in comparative experiments with existing multimodal datasets and diverse neural network structures.
Funding: Provided by the Science Research Project of the Hebei Education Department under Grant No. BJK2024115.
Abstract: High-resolution remote sensing images (HRSIs) are now an essential data source for gathering surface information, thanks to advancements in remote sensing data capture technologies. However, their significant scale changes and wealth of spatial details pose challenges for semantic segmentation. While convolutional neural networks (CNNs) excel at capturing local features, they are limited in modeling long-range dependencies. Conversely, transformers utilize multi-head self-attention to integrate global context effectively, but this approach often incurs a high computational cost. This paper proposes a global-local multiscale context network (GLMCNet) to extract both global and local multiscale contextual information from HRSIs. A detail-enhanced filtering module (DEFM) is placed at the end of the encoder to further refine the encoder outputs, enhancing the key details extracted by the encoder and effectively suppressing redundant information. In addition, a global-local multiscale transformer block (GLMTB) is proposed in the decoding stage to enable the modeling of rich multiscale global and local information. We also design a stair fusion mechanism to progressively transmit deep semantic information from deep to shallow layers. Finally, we propose a semantic awareness enhancement module (SAEM), which further enhances the representation of multiscale semantic features through spatial attention and covariance channel attention. Extensive ablation analyses and comparative experiments were conducted to evaluate the performance of the proposed method. Specifically, our method achieved a mean Intersection over Union (mIoU) of 86.89% on the ISPRS Potsdam dataset and 84.34% on the ISPRS Vaihingen dataset, outperforming existing models such as ABCNet and BANet.
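The mIoU figures reported above follow the standard definition: per-class intersection over union, averaged over the classes present. A minimal reference computation on flattened label maps:

```python
# Mean Intersection over Union for integer label maps of equal length.
# Classes absent from both prediction and ground truth are skipped.

def mean_iou(pred, target, num_classes):
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)

pred   = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
miou = mean_iou(pred, target, num_classes=3)  # (1/3 + 2/3 + 1/2) / 3 = 0.5
```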
Funding: Supported by the National Natural Science Foundation of China (42401488, 42071351), the National Key Research and Development Program of China (2020YFA0608501, 2017YFB0504204), the Liaoning Revitalization Talents Program (XLYC1802027), the Talent Recruitment Program of the Chinese Academy of Sciences (Y938091), the Discipline Innovation Team Project of Liaoning Technical University (LNTU20TD-23), the Liaoning Province Doctoral Research Initiation Fund Program (2023-BS-202), and the Basic Research Projects of the Liaoning Department of Education (JYTQN2023202).
Abstract: Accurate estimation of understory terrain has significant scientific importance for maintaining ecosystem balance and biodiversity conservation. Addressing the inadequate representation of spatial heterogeneity when traditional forest topographic inversion methods treat the entire forest as a single inversion unit, this study proposes a differentiated modeling approach for forest types based on refined land cover classification. Taking Puerto Rico and Maryland as study areas, a multi-dimensional feature system is constructed by integrating multi-source remote sensing data: ICESat-2 spaceborne LiDAR is used to obtain benchmark values for understory terrain, topographic factors such as slope and aspect are extracted from SRTM data, and vegetation cover characteristics are analyzed using Landsat-8 multispectral imagery. This study incorporates forest type as a classification modeling condition and applies the random forest algorithm to build differentiated topographic inversion models. Experimental results indicate that, compared to traditional whole-area modeling methods (RMSE = 5.06 m), forest type-based classification modeling significantly improves the accuracy of understory terrain estimation (RMSE = 2.94 m), validating the effectiveness of spatial heterogeneity modeling. Further sensitivity analysis reveals that canopy structure parameters (with RMSE variation reaching 4.11 m) exert a stronger regulatory effect on estimation accuracy than forest cover, providing important theoretical support for optimizing remote sensing models of forest topography.
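The core idea of the stratified modeling above, fitting one model per forest type instead of one whole-area model, can be illustrated with a toy example. A trivial per-class mean predictor stands in for the random forest regressor used in the study, and the sample values are synthetic.

```python
# Whole-area vs. forest type-stratified modeling, compared by RMSE.
# Stand-in data and predictor; the study uses random forests on
# ICESat-2 / SRTM / Landsat-8 derived features.
import math

def rmse(y_true, y_pred):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

# (forest_type, terrain_elevation) samples -- synthetic values.
samples = [("conifer", 10.0), ("conifer", 12.0),
           ("broadleaf", 30.0), ("broadleaf", 33.0)]
y = [e for _, e in samples]

# Whole-area model: one global mean for every sample.
global_mean = sum(y) / len(y)
pred_global = [global_mean] * len(y)

# Stratified model: one mean per forest type.
by_type = {}
for t, e in samples:
    by_type.setdefault(t, []).append(e)
type_mean = {t: sum(v) / len(v) for t, v in by_type.items()}
pred_strat = [type_mean[t] for t, _ in samples]

assert rmse(y, pred_strat) < rmse(y, pred_global)  # heterogeneity captured
```

When elevation statistics differ systematically between forest types, the stratified model captures that spatial heterogeneity and its RMSE drops, which mirrors the 5.06 m to 2.94 m improvement reported.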
Funding: Under the auspices of the National Key Research and Development Program of China (No. 2023YFC3208500), the Shanghai Municipal Natural Science Foundation (No. 22ZR1421500), the National Natural Science Foundation of China (No. U2243207), the National Science and Technology Basic Resources Survey Project (No. 2023FY01001), the Open Research Fund of the State Key Laboratory of Estuarine and Coastal Research (No. SKLEC-KF202406), and a project of the Science and Technology Commission of Shanghai Municipality (No. 22DZ1202700).
Abstract: Mudflat vegetation plays a crucial role in the ecological function of wetland environments, and obtaining its fine spatial distribution is of great significance for wetland protection and management. Remote sensing techniques enable rapid extraction of wetland vegetation over large areas. However, the imaging of optical sensors is easily restricted by weather conditions, and the backscattered information in Synthetic Aperture Radar (SAR) images is easily disturbed by many factors. Although both data sources have been applied in wetland vegetation classification, there is a lack of comparative study on how the choice of data source affects classification performance. This study takes the vegetation of the tidal flat wetland on Chongming Island, Shanghai, China, in 2019 as the research subject. A total of 22 optical feature parameters and 11 SAR feature parameters were extracted from the optical data source (Sentinel-2) and the SAR data source (Sentinel-1), respectively. The performance of optical and SAR data and their feature parameters in wetland vegetation classification was quantitatively compared and analyzed across different feature combinations. Furthermore, by simulating scenarios of missing optical imagery, the impact of missing optical images on vegetation classification accuracy and the compensatory effect of integrating SAR data were revealed. Results show that: 1) under the same classification algorithm, the Overall Accuracy (OA) of the combined use of optical and SAR images was the highest, reaching 95.50%; the OA of using only optical images was slightly lower, while using only SAR images yielded the lowest accuracy, though still 86.48%. 2) Compared to directly using the spectral reflectance of optical data and the backscattering coefficient of SAR data, the constructed optical and SAR feature parameters contributed to improving classification accuracy. Including the optical features (vegetation index, spatial texture, and phenology features) and SAR features (SAR index and SAR texture features) in the classification algorithm improved OA by 4.56% and 9.47%, respectively. SAR backscatter, the SAR index, optical phenological features, and the vegetation index were identified as the top-ranking important features. 3) When optical data were missing continuously for six months, the OA dropped to a minimum of 41.56%; when combined with SAR data, however, the OA could be raised to 71.62%. This indicates that incorporating SAR features can effectively compensate for the accuracy loss caused by missing optical images, especially in regions with long-term cloud cover.
Abstract: A novel CNN-Mamba hybrid architecture is proposed to address intra-class variance and inter-class similarity in remote sensing imagery. The framework integrates: (1) parallel CNN and visual state space (VSS) encoders, (2) multi-scale cross-attention feature fusion, and (3) a boundary-constrained decoder. This design overcomes CNNs' limited receptive fields and ViT's quadratic complexity while efficiently capturing both local features and global dependencies. Evaluations on the LoveDA and ISPRS Vaihingen datasets demonstrate superior segmentation accuracy and boundary preservation compared to existing approaches, with the dual-branch structure maintaining computational efficiency throughout.
Funding: Supported in part by the National Natural Science Foundation of China (Nos. 52472334, U2368204).
Abstract: In response to the challenges posed by complex backgrounds, diverse target angles, and numerous small targets in remote sensing images, alongside the high resource consumption that hinders model deployment, we propose an enhanced, lightweight you only look once version 8 small (YOLOv8s) detection algorithm. Regarding network improvements, we first replace traditional horizontal boxes with rotated boxes for target detection, effectively addressing difficulties in feature extraction caused by varying target angles. Second, we design a module integrating convolutional neural network (CNN) and Transformer components to replace specific C2f modules in the backbone network, thereby expanding the model's receptive field and enhancing feature extraction in complex backgrounds. Finally, we introduce a feature calibration structure to mitigate potential feature mismatches during feature fusion. For model compression, we employ a lightweight channel pruning technique based on localized mean average precision (LMAP) to eliminate redundancies in the enhanced model. Although this approach incurs some loss of detection accuracy, it effectively reduces the number of parameters, the computational load, and the model size. Additionally, we employ channel-level knowledge distillation to recover accuracy in the pruned model, further enhancing detection performance. Experimental results indicate that the enhanced algorithm achieves a 6.1% increase in mAP50 compared to YOLOv8s, while simultaneously reducing parameters, computational load, and model size by 57.7%, 28.8%, and 52.3%, respectively.
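The general shape of channel pruning can be sketched as follows. Note the stand-in criterion: channels are ranked here by a simple L1 weight-magnitude score, whereas the paper's actual criterion is an LMAP sensitivity that requires the full detector and validation data.

```python
# Magnitude-based channel pruning sketch: score each output channel,
# keep the top fraction, drop the rest. Importance criterion is a
# placeholder (L1 norm), not the paper's LMAP-based score.

def prune_channels(weights, keep_ratio):
    """weights: list of per-channel weight lists; returns kept channel indices."""
    scores = [sum(abs(w) for w in ch) for ch in weights]  # L1 importance
    order = sorted(range(len(weights)), key=lambda i: scores[i], reverse=True)
    n_keep = max(1, int(len(weights) * keep_ratio))
    return sorted(order[:n_keep])

# Four channels; channel 2 is nearly dead and should be pruned first.
conv_weights = [[0.5, -0.4], [0.9, 0.1], [0.01, -0.02], [0.3, 0.6]]
kept = prune_channels(conv_weights, keep_ratio=0.75)
```

After pruning, the retained channels are fine-tuned (here, with channel-level knowledge distillation) to recover the lost accuracy.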
Abstract: Recent years have seen a surge in interest in object detection on remote sensing images for applications such as surveillance and management. However, challenges like small object detection, scale variation, and the presence of closely packed objects in these images hinder accurate detection. Additionally, the motion blur effect further complicates the identification of such objects. To address these issues, we propose an enhanced YOLOv9 with a transformer head (YOLOv9-TH). The model introduces an additional prediction head for detecting objects of varying sizes and swaps the original prediction heads for transformer heads to leverage self-attention mechanisms. We further improve YOLOv9-TH using several strategies, including data augmentation, multi-scale testing, multi-model integration, and the introduction of an additional classifier. The cross-stage partial (CSP) method and the ghost convolution hierarchical graph (GCHG) are combined to improve detection accuracy by better utilizing feature maps, widening the receptive field, and precisely extracting multi-scale objects. Additionally, we incorporate the E-SimAM attention mechanism to address low-resolution feature loss. Extensive experiments on the VisDrone2021 and DIOR datasets demonstrate the effectiveness of YOLOv9-TH, showing good improvement in mAP compared to the best existing methods. YOLOv9-TH-e achieved 54.2% mAP50 on the VisDrone2021 dataset and 92.3% mAP on the DIOR dataset. The results confirm the model's robustness and suitability for real-world applications, particularly for small object detection in remote sensing images.
Abstract: The objective of this study is to address semantic misalignment and insufficient accuracy in edge detail and discrimination detection, common issues in deep learning-based change detection methods that rely on encoder-decoder frameworks. In response, we propose a model called FlowDual-PixelClsObjectMec (FPCNet), which innovatively incorporates dual flow alignment technology in the decoding stage to rectify semantic discrepancies through streamlined feature correction fusion. Furthermore, the model employs an object-level similarity measurement coupled with pixel-level classification in the PixelClsObjectMec (PCOM) module during the final discrimination stage, significantly enhancing edge detail detection and overall accuracy. Experimental evaluations on the change detection dataset (CDD) and the building CDD demonstrate superior performance, with F1 scores of 95.1% and 92.8%, respectively. Our findings indicate that FPCNet outperforms existing algorithms in stability, robustness, and other key metrics.
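The F1 scores quoted above follow the standard binary change/no-change definition, the harmonic mean of precision and recall over changed pixels. A minimal reference computation:

```python
# F1 score for binary change maps (1 = changed, 0 = unchanged).

def f1_score(pred, target):
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)

pred   = [1, 1, 0, 1, 0, 0]
target = [1, 0, 0, 1, 1, 0]
f1 = f1_score(pred, target)  # precision = recall = 2/3, so F1 = 2/3
```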
Funding: Supported by the National Natural Science Foundation of China (Nos. 61462046 and 61762052), the Natural Science Foundation of Jiangxi Province (Nos. 20161BAB202049 and 20161BAB204172), the Bidding Project of the Key Laboratory of Watershed Ecology and Geographical Environment Monitoring, NASG (Nos. WE2016003, WE2016013 and WE2016015), the Science and Technology Research Projects of the Jiangxi Province Education Department (Nos. GJJ160741, GJJ170632 and GJJ170633), and the Art Planning Project of Jiangxi Province (Nos. YG2016250 and YG2017381).
Abstract: Image registration is an indispensable component of multi-source remote sensing image processing. In this paper, we put forward a remote sensing image registration method comprising an improved multi-scale, multi-direction Harris algorithm and a novel compound feature. Multi-scale circle Gaussian combined invariant moments and a multi-direction gray level co-occurrence matrix are extracted as features for image matching. The proposed algorithm is evaluated on numerous multi-source remote sensing images with noise and illumination changes. Extensive experimental studies show that the proposed method obtains stable, evenly distributed keypoints as well as robust and accurate correspondence matches. It is a promising scheme for multi-source remote sensing image registration.
Funding: Project (201412016) supported by the Special Fund for Public Projects of the National Administration of Surveying, Mapping and Geoinformation of China; Project (51174287) supported by the National Natural Science Foundation of China.
Abstract: Numerous coal fires burn underneath the Datong coalfield because of indiscriminate mining. Landsat TM/ETM, an unmanned aerial vehicle (UAV), and an infrared thermal imager were employed to monitor underground coal fires in the Majiliang mining area. The thermal field distributions of this area in 2000, 2002, 2006, 2007, and 2009 were obtained using Landsat TM/ETM. Changes in the distribution were then analyzed to approximate the locations of the coal fires. Through UAV imagery acquired at very high resolution (0.2 m), the texture information, linear features, and brightness of the ground fissures in the coal fire area were determined. All these data were combined to build a knowledge model for identifying fissures and were used to support underground coal fire detection. An infrared thermal imager was used to map the thermal field distribution of areas where coal fires are serious. The results were analyzed to identify the hot spot trend and the depth of the burning point.
Abstract: This study established a forest resources management information system for forest farms based on a B/S-structured WebGIS, taking the trial forest farm of the Hunan Academy of Forestry as the research field, with forest resources field survey data, ETM+ remote sensing data, and basic geographical information data as research materials. Through extraction of forest resource data from the forest farm, requirement analysis of the system functions, and establishment of the required software and hardware environment, the system realizes the management, query, editing, analysis, statistics, and other functions of forest resources information needed to manage the forest resources.
Abstract: The large-scale acquisition and widespread application of remote sensing image data have led to increasingly severe challenges in information security and privacy protection during transmission and storage. Urban remote sensing images, characterized by complex content and well-defined structures, are particularly vulnerable to malicious attacks and information leakage. To address this issue, the author proposes an encryption method based on an enhanced single-neuron dynamical system (ESNDS). ESNDS generates high-quality pseudo-random sequences with complex dynamics and intense sensitivity to initial conditions, which drive a multi-stage cipher structure comprising permutation, ring-wise diffusion, and mask perturbation. Using representative GF-2 Panchromatic and Multispectral Scanner (PMS) urban scenes, the author conducts systematic evaluations in terms of inter-pixel correlation, information entropy, histogram uniformity, and the number of pixels change rate (NPCR)/unified average changing intensity (UACI). The results demonstrate that the proposed scheme effectively resists statistical analysis, differential attacks, and known-plaintext attacks while maintaining competitive computational efficiency for high-resolution urban images. In addition, the cipher is lightweight and hardware-friendly, integrates readily with on-board and ground processing, and thus offers tangible engineering utility for real-time, large-volume remote sensing data protection.
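The NPCR and UACI metrics used in the differential-attack evaluation have standard definitions: the percentage of pixels that differ between two cipher images, and the mean absolute pixel difference normalized by 255. A minimal computation on flattened 8-bit images:

```python
# NPCR: percentage of differing pixel positions between two cipher images.
# UACI: mean absolute difference as a percentage of the 8-bit range.
# For a strong cipher, values approach ~99.6% (NPCR) and ~33.46% (UACI).

def npcr(c1, c2):
    diff = sum(1 for a, b in zip(c1, c2) if a != b)
    return 100.0 * diff / len(c1)

def uaci(c1, c2):
    return 100.0 * sum(abs(a - b) for a, b in zip(c1, c2)) / (255 * len(c1))

# Tiny illustrative cipher images (flattened, 8-bit values).
c1 = [12, 200, 37, 255]
c2 = [12, 100, 40, 0]
```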
Abstract: This paper introduces a lightweight remote sensing image dehazing network called the multidimensional weight regulation network (MDWR-Net), which addresses the high computational cost of existing methods. Previous works, often based on the encoder-decoder structure and utilizing multiple upsampling and downsampling layers, are computationally expensive. To improve efficiency, the paper proposes two modules: the efficient spatial resolution recovery module (ESRR) for upsampling and the efficient depth information augmentation module (EDIA) for downsampling. These modules not only reduce model complexity but also enhance performance. Additionally, the partial feature weight learning module (PFWL) is introduced to reduce the computational burden by applying weight learning across partial dimensions, rather than using full-channel convolution. To overcome the limitations of convolutional neural network (CNN)-based networks, the haze distribution index transformer (HDIT) is integrated into the decoder. We also propose the physical-based non-adjacent feature fusion module (PNFF), which leverages the atmospheric scattering model to improve the generalization of MDWR-Net. MDWR-Net achieves superior dehazing performance at a computational cost of just 2.98×10^9 multiply-accumulate operations (MACs), less than one-tenth that of previous methods. Experimental results validate its effectiveness in balancing performance and computational efficiency.
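The atmospheric scattering model that the PNFF module leverages is the standard formulation I = J·t + A·(1 − t), where I is the hazy observation, J the clear scene radiance, t the transmission, and A the atmospheric light. Given t and A, the clear radiance can be inverted directly; a minimal per-pixel sketch (the network itself learns these quantities rather than assuming them known):

```python
# Forward haze model and its direct inversion, per pixel.
# t is clamped from below to avoid division blow-up in near-opaque haze.

def haze(J, t, A):
    return [j * ti + A * (1 - ti) for j, ti in zip(J, t)]

def dehaze(I, t, A, t_min=0.1):
    return [(i - A) / max(ti, t_min) + A for i, ti in zip(I, t)]

J = [0.2, 0.6, 0.9]   # clear radiance
t = [0.8, 0.5, 0.3]   # transmission per pixel
A = 1.0               # atmospheric light
I = haze(J, t, A)     # synthesize haze, then invert it
J_rec = dehaze(I, t, A)
```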
Abstract: Remote sensing image segmentation has a wide range of applications in land cover classification, urban building recognition, crop monitoring, and other fields. In recent years, with the booming development of deep learning, remote sensing image segmentation models based on deep learning have gradually emerged and produced a large body of research. This article reviews the latest achievements in deep-learning-based remote sensing image segmentation and explores future development directions. First, the basic concepts, characteristics, classification, tasks, and commonly used datasets of remote sensing images are presented. Second, segmentation models based on deep learning are classified and summarized, and the principles, characteristics, and applications of the various models are presented. Then, the key technologies involved in deep learning remote sensing image segmentation are introduced. Finally, future development directions and application prospects of remote sensing image segmentation are discussed. This review of the latest research achievements from the perspective of deep learning can provide reference and inspiration for research on remote sensing image segmentation.
Funding: Supported by the National Natural Science Foundation of China (62462040), the Yunnan Fundamental Research Projects (202501AT070345), the Major Science and Technology Projects in Yunnan Province (202202AD080013), the Sichuan Provincial Key Laboratory of Philosophy and Social Science Key Program on Language Intelligence Special Education (YYZN-2024-1), and the Photosynthesis Fund Class A (ghfund202407010460).
Abstract: Semantic segmentation provides important technical support for land cover/land use (LCLU) research. By calculating the cosine similarity between feature vectors, transformer-based models can effectively capture the global information of high-resolution remote sensing images. However, the diversity of detailed and edge features within the same class of ground objects in high-resolution remote sensing images leads to a dispersed embedding distribution. The dispersed feature distribution enlarges feature vector angles and reduces cosine similarity, weakening the attention mechanism's ability to identify objects of the same class. To address this challenge, a remote sensing image information granulation transformer for semantic segmentation is proposed. The model employs adaptive granulation to extract common semantic features among objects of the same class, constructing an information granule to replace the detailed feature representation of these objects. Then, the Laplacian operator of the information granule is applied to extract the edge features of the object as represented by the information granule. In the experiments, the proposed model was validated on the Beijing Land-Use (BLU), Gaofen Image Dataset (GID), and Potsdam Dataset (PD). In particular, the model achieves 88.81% mOA, 82.64% mF1, and 71.50% mIoU on the GID dataset. Experimental results show that the model handles high-resolution remote sensing images effectively. Our code is available at https://github.com/sjmp525/RSIGT (accessed on 16 April 2025).
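The cosine-similarity effect described above can be demonstrated directly: dispersed intra-class features enlarge the angle between vectors and lower the similarity score, which is what the granulation step counteracts. The feature vectors below are illustrative 2-D stand-ins for high-dimensional embeddings.

```python
# Cosine similarity between feature vectors, the quantity underlying
# transformer attention scores discussed in the abstract.
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

tight     = cosine_similarity([1.0, 0.1], [1.0, 0.2])  # compact intra-class pair
dispersed = cosine_similarity([1.0, 0.1], [0.3, 1.0])  # same class, dispersed
assert tight > dispersed  # dispersion weakens the attention score
```

Replacing each object's scattered detail features with a single shared information granule pulls same-class vectors together, restoring high intra-class cosine similarity.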
Funding: Supported by the National Natural Science Foundation of China (No. 52374155) and the Anhui Provincial Natural Science Foundation (No. 2308085MF218).
Abstract: The convolutional neural network (CNN) method based on DeepLabv3+ has several problems in the semantic segmentation of high-resolution remote sensing images, such as a fixed receptive field size in feature extraction, lack of semantic information, high decoder upsampling magnification, and insufficient ability to retain detail. A hierarchical feature fusion network (HFFNet) is proposed. First, a combination of transformer and CNN architectures is employed for feature extraction from images of varying resolutions, and the extracted features are processed independently. Subsequently, the features from the transformer and the CNN are fused under the guidance of features from different sources; this fusion helps restore information more comprehensively during the decoding stage. Furthermore, a spatial channel attention module is designed for the final stage of decoding to refine features and reduce the semantic gap between shallow CNN features and deep decoder features. Experimental results show that HFFNet performs well on the UAVid, LoveDA, Potsdam, and Vaihingen datasets, with intersection-over-union metrics better than DeepLabv3+ and other competing methods, demonstrating strong generalization ability.
Funding: Supported in part by the Jiangsu Province High Level "333" Program (0401206044), the National Natural Science Foundation of China (61801243, 62072255), the Program for Scientific Research Foundation for Talented Scholars of Jinling Institute of Technology (JIT-B-202031), the University Incubator Foundation of Jinling Institute of Technology (JIT-FHXM-202110), the Open Project of the Fujian Provincial Key Laboratory of Network Security and Cryptology (NSCL-KF2021-02), the Open Foundation of the National Railway Intelligent Transportation System Engineering Technology Research Center (RITS2021KF02), and the China Postdoctoral Science Foundation (2019M651914).
Abstract: Secure access is studied in this paper for image remote sensing networks. Each sensor in such a network faces information security risks when wirelessly uploading image information from the sensor to the central collection point. To enhance the sensing quality of the remote uploading, the passive reflection surface technique is employed. If an eavesdropper near the sensor keeps accessing the same network, it may receive the same image from the sensor. Our goal in this paper is to improve the SNR of the legitimate collection unit while reducing the SNR of the eavesdropper as much as possible by adaptively adjusting the sensor's uploading power, thereby enhancing the security of the remote sensing images. To achieve this goal, the secrecy energy efficiency is theoretically analyzed with respect to the number of passive reflection elements by calculating the instantaneous performance over the channel fading coefficients. Based on this theoretical result, secure access is formulated as a mathematical optimization problem with the sensor uploading power as the unknown variable, the objective being energy efficiency maximization while satisfying a required maximum data rate for the eavesdropper. Finally, an analytical expression is derived for the optimum uploading power. Numerical simulations verify the design approach.
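The shape of the optimization described above can be sketched numerically: choose the uploading power P to maximize the secrecy energy efficiency (legitimate rate minus eavesdropper rate, per unit consumed power) subject to a cap on the eavesdropper's rate. The channel gains, circuit power, and rate cap below are illustrative placeholders, not values from the paper, and a grid search stands in for the paper's closed-form solution.

```python
# Secrecy energy-efficiency maximization over uploading power P, with an
# eavesdropper rate constraint. All constants are assumed for illustration.
import math

g_leg, g_eve, Pc = 5.0, 0.5, 0.1   # channel gains and circuit power (assumed)
R_eve_max = 1.0                     # eavesdropper rate cap (bits/s/Hz)

def secrecy_ee(P):
    r_leg = math.log2(1 + g_leg * P)   # legitimate collection-unit rate
    r_eve = math.log2(1 + g_eve * P)   # eavesdropper rate
    return max(r_leg - r_eve, 0.0) / (P + Pc)

# Grid search over feasible powers (the paper derives the optimum analytically).
candidates = [p / 100 for p in range(1, 501)]
feasible = [p for p in candidates if math.log2(1 + g_eve * p) <= R_eve_max]
P_opt = max(feasible, key=secrecy_ee)
```

The efficiency first rises with P (the legitimate rate grows faster than the eavesdropper's) and then falls as the power cost dominates, so an interior optimum exists, consistent with the closed-form result the paper reports.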