This paper aims to provide multi-source remote sensing images registered in geometric space for image fusion. Focusing on the characteristics and differences of multi-source remote sensing images, a feature-based registration algorithm is implemented. The key technologies include an image scale-space for multi-scale properties, Harris corner detection for keypoint extraction, and the partial intensity invariant feature descriptor (PIIFD) for keypoint description. Together these form a multi-scale Harris-PIIFD image registration framework. Experimental results on fifteen sets of representative real data show that the algorithm performs well and stably in multi-source remote sensing image registration and achieves accurate spatial alignment, giving it strong practical value and a degree of generalization ability.
The automatic registration of multi-source remote sensing images (RSI) is currently a research hotspot in remote sensing image preprocessing. A dedicated automatic image registration module named Image Autosync has been embedded in ERDAS IMAGINE version 9.0 and above. The registration accuracy of the module is verified for remote sensing images obtained from different platforms or with different spatial resolutions. Four registration experiments are discussed in this article to analyze the accuracy differences across remote sensing data with different spatial resolutions. The factors that cause the differences in registration accuracy are also analyzed.
Mudflat vegetation plays a crucial role in the ecological function of wetland environments, and obtaining its fine spatial distribution is of great significance for wetland protection and management. Remote sensing techniques enable rapid extraction of wetland vegetation over large areas. However, optical imaging is easily restricted by weather conditions, and the backscatter information in Synthetic Aperture Radar (SAR) images is easily disturbed by many factors. Although both data sources have been applied to wetland vegetation classification, comparative studies of how the choice of data source affects classification performance are lacking. This study takes the vegetation of the tidal flat wetland on Chongming Island, Shanghai, China, in 2019 as the research subject. A total of 22 optical feature parameters and 11 SAR feature parameters were extracted from the optical data source (Sentinel-2) and the SAR data source (Sentinel-1), respectively. The performance of optical and SAR data and their feature parameters in wetland vegetation classification was quantitatively compared and analyzed using different feature combinations. Furthermore, by simulating scenarios with missing optical images, the impact of missing optical images on vegetation classification accuracy and the compensatory effect of integrating SAR data were revealed. Results show that: 1) Under the same classification algorithm, the Overall Accuracy (OA) of the combined use of optical and SAR images was the highest, reaching 95.50%. The OA of using only optical images was slightly lower, while using only SAR images yielded the lowest accuracy but still achieved 86.48%. 2) Compared with directly using the spectral reflectance of optical data and the backscattering coefficient of SAR data, the constructed optical and SAR feature parameters improved classification accuracy. Including optical feature parameters (vegetation index, spatial texture, and phenology features) and SAR feature parameters (SAR index and SAR texture features) in the classification algorithm improved the OA by 4.56% and 9.47%, respectively. SAR backscatter, SAR index, optical phenological features, and vegetation index were identified as the top-ranking important features. 3) When optical data were missing continuously for six months, the OA dropped to a minimum of 41.56%. However, when combined with SAR data, the OA could be improved to 71.62%. This indicates that incorporating SAR features can effectively compensate for the accuracy loss caused by missing optical images, especially in regions with long-term cloud cover.
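The Overall Accuracy figures quoted in the mudflat vegetation abstract are, in the usual definition, the share of correctly classified samples among all samples. The following is a minimal sketch of that calculation, not the study's own code; the function name `overall_accuracy` and the 3-class confusion matrix are hypothetical values chosen only for illustration.

```python
def overall_accuracy(confusion):
    """Overall Accuracy: correctly classified samples divided by all samples.

    `confusion` is a square list of lists; rows are reference classes and
    columns are predicted classes (an assumed convention for this sketch).
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Correct predictions sit on the main diagonal.
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Hypothetical 3-class confusion matrix, not data from the study.
example = [
    [50, 2, 3],
    [4, 40, 1],
    [0, 5, 45],
]

oa = overall_accuracy(example)
print(f"OA = {oa:.2%}")  # 135 of 150 samples are on the diagonal
```

Comparing the OA of two feature combinations on the same test samples then reduces to calling this function on each confusion matrix and taking the difference.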
A novel CNN-Mamba hybrid architecture was proposed to address intra-class variance and inter-class similarity in remote sensing imagery. The framework integrates: (1) parallel CNN and visual state space (VSS) encoders, (2) multi-scale cross-attention feature fusion, and (3) a boundary-constrained decoder. This design overcomes CNN's limited receptive fields and ViT's quadratic complexity while efficiently capturing both local features and global dependencies. Evaluations on the LoveDA and ISPRS Vaihingen datasets demonstrate superior segmentation accuracy and boundary preservation compared to existing approaches, with the dual-branch structure maintaining computational efficiency throughout the process.
In response to challenges posed by complex backgrounds, diverse target angles, and numerous small targets in remote sensing images, alongside the issue of high resource consumption hindering model deployment, we propose an enhanced, lightweight you only look once version 8 small (YOLOv8s) detection algorithm. Regarding network improvements, we first replace traditional horizontal boxes with rotated boxes for target detection, effectively addressing difficulties in feature extraction caused by varying target angles. Second, we design a module integrating convolutional neural network (CNN) and Transformer components to replace specific C2f modules in the backbone network, thereby expanding the model's receptive field and enhancing feature extraction in complex backgrounds. Finally, we introduce a feature calibration structure to mitigate potential feature mismatches during feature fusion. For model compression, we employ a lightweight channel pruning technique based on localized mean average precision (LMAP) to eliminate redundancies in the enhanced model. Although this approach results in some loss of detection accuracy, it effectively reduces the number of parameters, computational load, and model size. Additionally, we employ channel-level knowledge distillation to recover accuracy in the pruned model, further enhancing detection performance. Experimental results indicate that the enhanced algorithm achieves a 6.1% increase in mAP50 compared to YOLOv8s, while simultaneously reducing parameters, computational load, and model size by 57.7%, 28.8%, and 52.3%, respectively.
Recent years have seen a surge of interest in object detection on remote sensing images for applications such as surveillance and management. However, challenges such as small object detection, scale variation, and closely packed objects hinder accurate detection, and motion blur further complicates the identification of such objects. To address these issues, we propose an enhanced YOLOv9 with a transformer head (YOLOv9-TH). The model introduces an additional prediction head for detecting objects of varying sizes and swaps the original prediction heads for transformer heads to leverage self-attention mechanisms. We further improve YOLOv9-TH using several strategies, including data augmentation, multi-scale testing, multi-model integration, and the introduction of an additional classifier. The cross-stage partial (CSP) method and the ghost convolution hierarchical graph (GCHG) are combined to improve detection accuracy by better utilizing feature maps, widening the receptive field, and precisely extracting multi-scale objects. Additionally, we incorporate the E-SimAM attention mechanism to address low-resolution feature loss. Extensive experiments on the VisDrone2021 and DIOR datasets demonstrate the effectiveness of YOLOv9-TH, showing an improvement in mAP over the best existing methods. YOLOv9-TH-e achieved an mAP50 of 54.2% on the VisDrone2021 dataset and an mAP of 92.3% on the DIOR dataset. The results confirm the model's robustness and suitability for real-world applications, particularly for small object detection in remote sensing images.
The objective of this study is to address semantic misalignment and insufficient accuracy in edge detail and discrimination detection, which are common issues in deep learning-based change detection methods that rely on encoding and decoding frameworks. To this end, we propose a model called FlowDual-PixelClsObjectMec (FPCNet), which incorporates dual flow alignment technology in the decoding stage to rectify semantic discrepancies through streamlined feature correction fusion. Furthermore, the model employs an object-level similarity measurement coupled with pixel-level classification in the PixelClsObjectMec (PCOM) module during the final discrimination stage, significantly enhancing edge detail detection and overall accuracy. Experimental evaluations on the change detection dataset (CDD) and the building CDD demonstrate superior performance, with F1 scores of 95.1% and 92.8%, respectively. Our findings indicate that FPCNet outperforms existing algorithms in stability, robustness, and other key metrics.
This paper introduces a lightweight remote sensing image dehazing network called the multidimensional weight regulation network (MDWR-Net), which addresses the high computational cost of existing methods. Previous works, often based on the encoder-decoder structure and utilizing multiple upsampling and downsampling layers, are computationally expensive. To improve efficiency, the paper proposes two modules: the efficient spatial resolution recovery module (ESRR) for upsampling and the efficient depth information augmentation module (EDIA) for downsampling. These modules not only reduce model complexity but also enhance performance. Additionally, the partial feature weight learning module (PFWL) is introduced to reduce the computational burden by applying weight learning across partial dimensions rather than using full-channel convolution. To overcome the limitations of networks based on convolutional neural networks (CNN), the haze distribution index transformer (HDIT) is integrated into the decoder. We also propose the physical-based non-adjacent feature fusion module (PNFF), which leverages the atmospheric scattering model to improve the generalization of MDWR-Net. MDWR-Net achieves superior dehazing performance with a computational cost of just $2.98\times10^{9}$ multiply-accumulate operations (MACs), which is less than one-tenth of previous methods. Experimental results validate its effectiveness in balancing performance and computational efficiency.
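To make the MAC argument in the dehazing abstract concrete, the sketch below counts multiply-accumulate operations for a standard 2D convolution and for a convolution applied to only a fraction of the channels. This is a generic illustration of why restricting computation to partial channels is cheaper; it is an assumption of this sketch and not the PFWL module itself, and the layer sizes, the 0.25 ratio, and both function names are hypothetical.

```python
def conv2d_macs(h_out, w_out, c_in, c_out, kernel):
    """MACs of a standard 2D convolution (no bias, no grouping).

    Each of the h_out * w_out * c_out output values needs
    c_in * kernel * kernel multiply-accumulate operations.
    """
    return h_out * w_out * c_out * c_in * kernel * kernel


def partial_conv2d_macs(h_out, w_out, channels, kernel, ratio):
    """MACs when only a fraction `ratio` of the channels is convolved.

    Assumption for this sketch: the remaining channels pass through
    untouched and therefore cost no MACs.
    """
    c_part = max(1, round(channels * ratio))
    return conv2d_macs(h_out, w_out, c_part, c_part, kernel)


# Hypothetical layer: 128 x 128 feature map, 64 channels, 3 x 3 kernel.
full = conv2d_macs(128, 128, 64, 64, 3)
part = partial_conv2d_macs(128, 128, 64, 3, 0.25)
print(f"full-channel: {full:,} MACs")
print(f"partial (1/4 of channels): {part:,} MACs")
print(f"reduction factor: {full / part:.0f}x")
```

Because the cost scales with the product of input and output channels, convolving a quarter of the channels cuts the MACs of this layer by a factor of sixteen, not four.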
Semantic segmentation provides important technical support for land cover/land use (LCLU) research. By calculating the cosine similarity between feature vectors, transformer-based models can effectively capture the global information of high-resolution remote sensing images. However, the diversity of detail and edge features within the same class of ground objects in high-resolution remote sensing images leads to a dispersed embedding distribution. The dispersed feature distribution enlarges the angles between feature vectors and reduces cosine similarity, weakening the attention mechanism's ability to identify ground objects of the same class. To address this challenge, a remote sensing image information granulation transformer for semantic segmentation is proposed. The model employs adaptive granulation to extract common semantic features among objects of the same class, constructing an information granule that replaces the detailed feature representation of these objects. The Laplacian operator of the information granule is then applied to extract the edge features of the object represented by the granule. In the experiments, the proposed model was validated on the Beijing Land-Use (BLU), Gaofen Image Dataset (GID), and Potsdam Dataset (PD). In particular, the model achieves 88.81% mOA, 82.64% mF1, and 71.50% mIoU on the GID dataset. Experimental results show that the model effectively handles high-resolution remote sensing images. Our code is available at https://github.com/sjmp525/RSIGT (accessed on 16 April 2025).
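The link between a dispersed embedding distribution and lower cosine similarity, as described in the information granulation abstract, can be checked with a few lines of arithmetic. The following is a minimal sketch using 2D unit vectors; the angles of 10 and 60 degrees and the helper names are hypothetical and only illustrate that a larger angle between same-class feature vectors gives a smaller similarity score.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def unit_vector(angle_deg):
    """2D unit vector at the given angle, standing in for a feature embedding."""
    rad = math.radians(angle_deg)
    return (math.cos(rad), math.sin(rad))


reference = unit_vector(0)
# Tightly clustered class: embeddings only 10 degrees apart.
tight = cosine_similarity(reference, unit_vector(10))
# Dispersed class: embeddings 60 degrees apart.
dispersed = cosine_similarity(reference, unit_vector(60))

print(f"10 degrees apart: {tight:.4f}")
print(f"60 degrees apart: {dispersed:.4f}")
```

Replacing per-object detail features with a shared representation, as the abstract describes, would correspond here to pulling the two vectors toward a common direction so that the similarity rises again.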
The convolutional neural network (CNN) method based on DeepLabv3+ has several problems in the semantic segmentation of high-resolution remote sensing images, such as a fixed receptive field size in feature extraction, a lack of semantic information, high decoder magnification, and insufficient detail retention. A hierarchical feature fusion network (HFFNet) was proposed. First, a combination of transformer and CNN architectures was employed for feature extraction from images of varying resolutions, and the extracted features were processed independently. Subsequently, the features from the transformer and the CNN were fused under the guidance of features from different sources. This fusion process helped restore information more comprehensively during the decoding stage. Furthermore, a spatial channel attention module was designed in the final stage of decoding to refine features and reduce the semantic gap between shallow CNN features and deep decoder features. The experimental results showed that HFFNet had superior performance on the UAVid, LoveDA, Potsdam, and Vaihingen datasets, and its cross-linking index was better than that of DeepLabv3+ and other competing methods, showing strong generalization ability.
This paper studies secured access for image remote sensing networks. Each sensor in such a network faces an information security risk when wirelessly uploading image information to the central collection point. To enhance the sensing quality of remote uploading, the passive reflection surface technique is employed. If an eavesdropper near the sensor keeps accessing the same network, it may receive the same image from the sensor. Our goal is to improve the SNR of the legitimate collection unit while reducing the SNR of the eavesdropper as much as possible by adaptively adjusting the sensor's uploading power, thereby enhancing the security of the remote sensing images. To achieve this goal, the secured energy efficiency performance is theoretically analyzed with respect to the number of passive reflection elements by calculating the instantaneous performance over the channel fading coefficients. Based on this theoretical result, secured access is formulated as a mathematical optimization problem with the sensor uploading power as the unknown variable and energy efficiency maximization as the objective, subject to a required maximum data rate of the eavesdropper. Finally, an analytical expression for the optimum uploading power is derived. Numerical simulations verify the design approach.
The classification of Chinese traditional settlements (CTSs) is extremely important for their differentiated development and protection. The innovative double-branch classification model developed in this study comprehensively utilized the features of remote sensing (RS) images and building facade pictures (BFPs). This approach was able to overcome the limitations of previous methods that used only building facade images to classify settlements. First, the features of the roofs and walls were extracted using a double-branch structure, which consisted of an RS image branch and a BFP branch. Then, a feature fusion module was designed to fuse the features of the roofs and walls. The precision, recall, and F1-score of the proposed model were improved by more than 4% compared with the classification model using only RS images or BFPs. The same three indexes of the proposed model were improved by more than 2% compared with other deep learning models. The results demonstrated that the proposed model performed well in the classification of architectural styles in CTSs.
Significant advancements have been achieved in road surface extraction based on high-resolution remote sensing image processing. Most current methods rely on fully supervised learning, which requires enormous human effort to label images. Other research in this field uses weakly supervised methods, which aim to reduce annotation costs by leveraging sparsely annotated data such as scribbles. This paper presents a novel technique called a weakly supervised network using scribble supervision and edge masks (WSSE-net). The network has a three-branch architecture in which each branch is equipped with a distinct decoder module dedicated to road extraction. One branch generates edge masks using edge detection algorithms and optimizes road edge details. The other two branches supervise the model's training by employing scribble labels and spreading scribble information throughout the image. To address the long-standing flaw that generated pseudo-labels are not updated as the network trains, we use mixup to blend prediction results dynamically and continually update the pseudo-labels that steer network training. Our solution operates efficiently by simultaneously exploiting both edge-mask assistance and dynamic pseudo-label support. The experiments are conducted on three separate road datasets, which consist primarily of high-resolution remote sensing satellite photos and drone images. The experimental findings suggest that our method performs better than advanced scribble-supervised approaches and some traditional fully supervised methods.
The frequent occurrence of extreme weather events has made landslides a global natural disaster issue. It is crucial to rapidly and accurately determine the boundaries of landslides for geohazard evaluation and emergency response. Therefore, the Skip Connection DeepLab neural network (SCDnn), a deep learning model based on 770 optical remote sensing images of landslides, is proposed to improve the accuracy of landslide boundary detection. The SCDnn model is optimized for the over-segmentation issue that occurs in conventional deep learning models when topographical geomorphic features are highly similar. SCDnn exhibits notable improvements in landslide feature extraction and semantic segmentation by combining an enhanced Atrous Spatial Pyramid Convolutional Block (ASPC) with a coding structure that reduces model complexity. The experimental results demonstrate that SCDnn identifies landslide boundaries with MIoU values between 0.8 and 0.9 in 119 images and with MIoU values exceeding 0.9 in 52 images, which exceeds the identification accuracy of existing techniques. This work offers a novel technique for the automatic, extensive identification of landslide boundaries in remote sensing images and establishes the groundwork for future investigations and applications in related domains.
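The MIoU values reported in the landslide abstract refer to mean intersection over union, which in its common form averages the per-class IoU between a predicted mask and a reference mask. The sketch below computes it for a two-class (background/landslide) case; the abstract does not state the exact averaging convention used, so this is one common definition under that assumption, and the 4 x 4 masks and function names are hypothetical.

```python
def iou(pred, truth, label):
    """Intersection over union for one class label; None if the class is absent."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            p_hit = p == label
            t_hit = t == label
            inter += int(p_hit and t_hit)
            union += int(p_hit or t_hit)
    return inter / union if union else None


def mean_iou(pred, truth, labels=(0, 1)):
    """Average the per-class IoU over every class present in either mask."""
    scores = [iou(pred, truth, label) for label in labels]
    scores = [s for s in scores if s is not None]
    if not scores:
        raise ValueError("no labelled pixels found")
    return sum(scores) / len(scores)


# Hypothetical 4 x 4 masks: 1 marks landslide pixels, 0 marks background.
truth = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
pred = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

print(f"MIoU = {mean_iou(pred, truth):.4f}")
```

In this toy case two boundary pixels disagree, so each class has an intersection of 7 pixels and a union of 9, which shows how even small boundary errors pull the score below 0.8.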
Information on Land Use and Land Cover Maps (LULCM) is essential for environmental and socioeconomic applications. Such maps are generally derived from Multispectral Remote Sensing Images (MRSI) via classification. The classification process can be described as an information flow from images to maps through a trained classifier. Characterizing the information flow is essential for understanding the classification mechanism and for answering theoretical questions such as "what is the maximum number of classes that can be classified from a given MRSI?" and "how much information gain can be obtained?" Consequently, two questions naturally arise: (i) How can we characterize the information flow? and (ii) What is the mathematical form of the information flow? To answer these two questions, this study first hypothesizes that thermodynamic entropy is the appropriate measure of information for both MRSI and LULCM. This hypothesis is then supported by kinetic-theory-based experiments. Thereafter, based on this entropy, a generalized Jarzynski equation is formulated to mathematically model the information flow; its parameters include the thermodynamic entropy of MRSI, the thermodynamic entropy of LULCM, the weighted F1-score (classification accuracy), and the total number of classes. This generalized Jarzynski equation has been validated by hypothesis-driven experiments in which 694 Sentinel-2 images are classified into 10 classes by four classical classifiers. This study provides a way of linking thermodynamic laws and concepts to the characterization and understanding of information flow in land cover classification, opening a new door for constructing domain knowledge.
Source identification and deformation analysis of disaster bodies are the main components of high-steep slope risk assessment, and establishing a high-precision model and quantifying the fine geometric features of the slope are prerequisites for this work. In this study, building on the ability of UAV remote sensing to acquire refined models and quantitative parameters, a semi-automatic dangerous rock identification method based on multi-source data is proposed. For periodic UAV-based deformation monitoring, the monitoring accuracy is defined according to the relative accuracy of the multi-temporal point clouds. Taking a high-steep slope as the research object, a UAV equipped with special sensors was used to obtain multi-source and multi-temporal data, including a high-precision DOM and multi-temporal 3D point clouds. The geometric features of the outcrop were extracted and superimposed on the DOM images to carry out semi-automatic identification of dangerous rock masses, closing the loop between identification and accuracy verification. Change detection on the multi-temporal 3D point clouds was conducted to capture slope deformation with centimeter accuracy. The results show that the data sources in the multi-source semi-automatic dangerous rock identification method complement each other, improving the efficiency and accuracy of identification, and that UAV-based multi-temporal monitoring can reveal the near real-time deformation state of slopes.
This research systematically investigates urban three-dimensional greening layout optimization and smart eco-city construction using deep learning and remote sensing technology. An improved U-Net++ architecture combined with multi-source remote sensing data achieved high-precision recognition of urban three-dimensional greening with 92.8% overall accuracy. Analysis of spatiotemporal evolution patterns in Shanghai, Hangzhou, and Nanjing revealed that three-dimensional greening shows a development trend from demonstration to popularization, with a 16.5% annual growth rate. The study quantitatively assessed the ecological benefits of various three-dimensional greening types. Results indicate that modular vertical greening and intensive roof gardens yield the highest ecological benefits, while climbing-type vertical greening and extensive roof gardens offer optimal benefit-cost ratios. Integration of multiple forms generates 15-22% synergistic enhancement. Compared with traditional planning, the multi-objective optimization-based layout achieved a 27.5% increase in carbon sequestration, a 32.6% improvement in temperature regulation, a 35.8% enhancement in stormwater management, and a 42.3% rise in the biodiversity index. Three pilot projects validated that actual ecological benefits reached 90.3-102.3% of predicted values. Multi-scenario simulations indicate that optimized layouts can reduce urban heat island intensity by 15.2-18.7%, increase the carbon neutrality contribution to 8.6-10.2%, and decrease stormwater runoff peaks by 25.3-32.6%. The findings provide technical methods for urban three-dimensional greening optimization and smart eco-city construction, promoting sustainable urban development.
Image registration is an indispensable component of multi-source remote sensing image processing. In this paper, we put forward a remote sensing image registration method that combines an improved multi-scale and multi-direction Harris algorithm with a novel compound feature. Multi-scale circle Gaussian combined invariant moments and a multi-direction gray level co-occurrence matrix are extracted as features for image matching. The proposed algorithm is evaluated on numerous multi-source remote sensing images with noise and illumination changes. Extensive experimental studies show that the proposed method obtains stable, evenly distributed keypoints as well as robust and accurate correspondence matches. It is a promising scheme for multi-source remote sensing image registration.
Numerous coal fires burn underneath the Datong coalfield because of indiscriminate mining. Landsat TM/ETM, an unmanned aerial vehicle (UAV), and an infrared thermal imager were employed to monitor underground coal fires in the Majiliang mining area. The thermal field distributions of this area in 2000, 2002, 2006, 2007, and 2009 were obtained using Landsat TM/ETM. The changes in the distribution were then analyzed to approximate the locations of the coal fires. Through UAV imagery at a very high resolution (0.2 m), the texture information, linear features, and brightness of the ground fissures in the coal fire area were determined. All these data were combined to build a knowledge model for determining fissures and were used to support underground coal fire detection. An infrared thermal imager was used to map the thermal field distribution of areas where coal fires are serious. Results were analyzed to identify the hot spot trend and the depth of the burning point.
Artificial Intelligence (AI) and Machine Learning (ML) technologies, particularly Deep Learning (DL), have demonstrated significant potential in the interpretation of Remote Sensing (RS) imagery, covering tasks such as scene classification, object detection, land-cover/land-use classification, change detection, and multi-view stereo reconstruction. Large-scale training samples are essential for ML/DL models to achieve optimal performance. However, the current organization of training samples is ad hoc and vendor-specific, lacking an integrated approach that can effectively manage training samples from different vendors to meet the demands of various RS AI tasks. This article proposes a solution to address these challenges by designing and implementing LuoJiaSET, a large-scale training sample database system for intelligent interpretation of RS imagery. LuoJiaSET accommodates over five million training samples, providing support for cross-dataset queries and serving as a comprehensive training data store for RS AI model training and calibration. It overcomes challenges related to label semantic categories, structural heterogeneity in label representation, and interoperable data access.
Funding: Under the auspices of the National Key Research and Development Program of China (No. 2023YFC3208500), Shanghai Municipal Natural Science Foundation (No. 22ZR1421500), National Natural Science Foundation of China (No. U2243207), National Science and Technology Basic Resources Survey Project (No. 2023FY01001), Open Research Fund of State Key Laboratory of Estuarine and Coastal Research (No. SKLEC-KF202406), and Project from Science and Technology Commission of Shanghai Municipality (No. 22DZ1202700).
Abstract: Mudflat vegetation plays a crucial role in the ecological function of the wetland environment, and obtaining its fine spatial distribution is of great significance for wetland protection and management. Remote sensing techniques enable the rapid extraction of wetland vegetation over large areas. However, optical imaging is easily restricted by weather conditions, and the backscatter information in Synthetic Aperture Radar (SAR) images is easily disturbed by many factors. Although both data sources have been applied to wetland vegetation classification, comparative studies on how the choice of data source affects classification performance are lacking. This study takes the vegetation of the tidal flat wetland on Chongming Island, Shanghai, China, in 2019 as the research subject. A total of 22 optical feature parameters and 11 SAR feature parameters were extracted from the optical data source (Sentinel-2) and the SAR data source (Sentinel-1), respectively. The performance of optical and SAR data and their feature parameters in wetland vegetation classification was quantitatively compared and analyzed using different feature combinations. Furthermore, by simulating scenarios of missing optical images, the impact of missing optical images on vegetation classification accuracy and the compensatory effect of integrating SAR data were revealed. Results show that: 1) under the same classification algorithm, the Overall Accuracy (OA) of the combined use of optical and SAR images was the highest, reaching 95.50%. The OA of using only optical images was slightly lower, while using only SAR images yielded the lowest accuracy but still achieved 86.48%. 2) Compared with directly using the spectral reflectance of optical data and the backscattering coefficient of SAR data, the constructed optical and SAR feature parameters improved classification accuracy. Including optical feature parameters (vegetation index, spatial texture, and phenology features) and SAR feature parameters (SAR index and SAR texture features) in the classification algorithm improved the OA by 4.56% and 9.47%, respectively. SAR backscatter, SAR index, optical phenological features, and vegetation index were identified as the top-ranking important features. 3) When the optical data were missing continuously for six months, the OA dropped to a minimum of 41.56%. However, when combined with SAR data, the OA could be improved to 71.62%. This indicates that incorporating SAR features can effectively compensate for the accuracy loss caused by missing optical images, especially in regions with long-term cloud cover.
Abstract: A novel CNN-Mamba hybrid architecture was proposed to address intra-class variance and inter-class similarity in remote sensing imagery. The framework integrates: (1) parallel CNN and visual state space (VSS) encoders, (2) multi-scale cross-attention feature fusion, and (3) a boundary-constrained decoder. This design overcomes CNN's limited receptive fields and ViT's quadratic complexity while efficiently capturing both local features and global dependencies. Evaluations on the LoveDA and ISPRS Vaihingen datasets demonstrate superior segmentation accuracy and boundary preservation compared with existing approaches, with the dual-branch structure maintaining computational efficiency throughout the process.
Funding: Supported in part by the National Natural Foundation of China (Nos. 52472334, U2368204).
Abstract: In response to the challenges posed by complex backgrounds, diverse target angles, and numerous small targets in remote sensing images, along with the high resource consumption that hinders model deployment, we propose an enhanced, lightweight you only look once version 8 small (YOLOv8s) detection algorithm. Regarding network improvements, we first replace traditional horizontal boxes with rotated boxes for target detection, effectively addressing difficulties in feature extraction caused by varying target angles. Second, we design a module integrating convolutional neural network (CNN) and Transformer components to replace specific C2f modules in the backbone network, thereby expanding the model's receptive field and enhancing feature extraction in complex backgrounds. Finally, we introduce a feature calibration structure to mitigate potential feature mismatches during feature fusion. For model compression, we employ a lightweight channel pruning technique based on localized mean average precision (LMAP) to eliminate redundancies in the enhanced model. Although this approach results in some loss of detection accuracy, it effectively reduces the number of parameters, computational load, and model size. Additionally, we employ channel-level knowledge distillation to recover accuracy in the pruned model, further enhancing detection performance. Experimental results indicate that the enhanced algorithm achieves a 6.1% increase in mAP50 compared to YOLOv8s, while simultaneously reducing parameters, computational load, and model size by 57.7%, 28.8%, and 52.3%, respectively.
Abstract: Recent years have seen a surge in interest in object detection on remote sensing images for applications such as surveillance and management. However, challenges like small object detection, scale variation, and the presence of closely packed objects in these images hinder accurate detection. Additionally, motion blur further complicates the identification of such objects. To address these issues, we propose an enhanced YOLOv9 with a transformer head (YOLOv9-TH). The model introduces an additional prediction head for detecting objects of varying sizes and swaps the original prediction heads for transformer heads to leverage self-attention mechanisms. We further improve YOLOv9-TH using several strategies, including data augmentation, multi-scale testing, multi-model integration, and the introduction of an additional classifier. The cross-stage partial (CSP) method and the ghost convolution hierarchical graph (GCHG) are combined to improve detection accuracy by better utilizing feature maps, widening the receptive field, and precisely extracting multi-scale objects. Additionally, we incorporate the E-SimAM attention mechanism to address low-resolution feature loss. Extensive experiments on the VisDrone2021 and DIOR datasets demonstrate the effectiveness of YOLOv9-TH, showing an improvement in mAP over the best existing methods. YOLOv9-TH-e achieved an mAP50 of 54.2% on the VisDrone2021 dataset and an mAP of 92.3% on the DIOR dataset. The results confirm the model's robustness and suitability for real-world applications, particularly for small object detection in remote sensing images.
Abstract: The objective of this study is to address semantic misalignment and insufficient accuracy in edge detail and discrimination detection, which are common issues in deep learning-based change detection methods relying on encoding and decoding frameworks. In response, we propose a model called FlowDual-PixelClsObjectMec (FPCNet), which incorporates dual flow alignment technology in the decoding stage to rectify semantic discrepancies through streamlined feature correction fusion. Furthermore, the model employs an object-level similarity measurement coupled with pixel-level classification in the PixelClsObjectMec (PCOM) module during the final discrimination stage, significantly enhancing edge detail detection and overall accuracy. Experimental evaluations on the change detection dataset (CDD) and building CDD demonstrate superior performance, with F1 scores of 95.1% and 92.8%, respectively. Our findings indicate that FPCNet outperforms existing algorithms in stability, robustness, and other key metrics.
Abstract: This paper introduces a lightweight remote sensing image dehazing network called the multidimensional weight regulation network (MDWR-Net), which addresses the high computational cost of existing methods. Previous works, often based on the encoder-decoder structure and utilizing multiple upsampling and downsampling layers, are computationally expensive. To improve efficiency, the paper proposes two modules: the efficient spatial resolution recovery module (ESRR) for upsampling and the efficient depth information augmentation module (EDIA) for downsampling. These modules not only reduce model complexity but also enhance performance. Additionally, the partial feature weight learning module (PFWL) is introduced to reduce the computational burden by applying weight learning across partial dimensions rather than using full-channel convolution. To overcome the limitations of convolutional neural network (CNN)-based networks, the haze distribution index transformer (HDIT) is integrated into the decoder. We also propose the physical-based non-adjacent feature fusion module (PNFF), which leverages the atmospheric scattering model to improve the generalization of MDWR-Net. MDWR-Net achieves superior dehazing performance with a computational cost of just 2.98×10^9 multiply-accumulate operations (MACs), which is less than one-tenth of that of previous methods. Experimental results validate its effectiveness in balancing performance and computational efficiency.
Funding: Supported by the National Natural Science Foundation of China (62462040), the Yunnan Fundamental Research Projects (202501AT070345), the Major Science and Technology Projects in Yunnan Province (202202AD080013), the Sichuan Provincial Key Laboratory of Philosophy and Social Science Key Program on Language Intelligence Special Education (YYZN-2024-1), and the Photosynthesis Fund Class A (ghfund202407010460).
Abstract: Semantic segmentation provides important technical support for land cover/land use (LCLU) research. By calculating the cosine similarity between feature vectors, transformer-based models can effectively capture the global information of high-resolution remote sensing images. However, the diversity of detailed and edge features within the same class of ground objects in high-resolution remote sensing images leads to a dispersed embedding distribution. The dispersed feature distribution enlarges the angles between feature vectors and reduces cosine similarity, weakening the attention mechanism's ability to identify ground objects of the same class. To address this challenge, a remote sensing image information granulation transformer for semantic segmentation is proposed. The model employs adaptive granulation to extract common semantic features among objects of the same class, constructing an information granule to replace the detailed feature representation of these objects. Then, the Laplacian operator is applied to the information granule to extract the edge features of the object it represents. In the experiments, the proposed model was validated on the Beijing Land-Use (BLU), Gaofen Image Dataset (GID), and Potsdam Dataset (PD). In particular, the model achieves 88.81% mOA, 82.64% mF1, and 71.50% mIoU on the GID dataset. Experimental results show that the model effectively handles high-resolution remote sensing images. Our code is available at https://github.com/sjmp525/RSIGT (accessed on 16 April 2025).
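The relationship this abstract relies on, that a wider angle between two feature vectors means a lower cosine similarity, can be checked with a few lines of code. The sketch below is only an illustration under assumed data: the function names and the three toy 2-D "embeddings" are hypothetical and are not taken from the paper or its repository.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def angle_deg(u, v):
    """Angle between two vectors in degrees (cosine clamped against rounding)."""
    c = max(-1.0, min(1.0, cosine_similarity(u, v)))
    return math.degrees(math.acos(c))


# Hypothetical embeddings of three pixels from the same ground-object class.
anchor = [1.0, 0.1]
close = [1.0, 0.2]      # tightly clustered with the anchor
dispersed = [0.2, 1.0]  # same class, but a very different detail/edge feature

print(angle_deg(anchor, close), cosine_similarity(anchor, close))
print(angle_deg(anchor, dispersed), cosine_similarity(anchor, dispersed))
```

The dispersed pair has the larger angle and the smaller cosine similarity, so a similarity-based attention score between two pixels of the same class would be weaker, which is the effect the abstract describes.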
Funding: Supported by the National Natural Science Foundation of China (No. 52374155) and the Anhui Provincial Natural Science Foundation (No. 2308085MF218).
Abstract: The convolutional neural network (CNN) method based on DeepLabv3+ has several problems in the semantic segmentation of high-resolution remote sensing images, such as a fixed receptive field size for feature extraction, a lack of semantic information, high decoder magnification, and insufficient detail retention. To address them, a hierarchical feature fusion network (HFFNet) was proposed. Firstly, a combination of transformer and CNN architectures was employed for feature extraction from images of varying resolutions. The extracted features were processed independently. Subsequently, the features from the transformer and CNN were fused under the guidance of features from different sources. This fusion process assisted in restoring information more comprehensively during the decoding stage. Furthermore, a spatial channel attention module was designed in the final stage of decoding to refine features and reduce the semantic gap between shallow CNN features and deep decoder features. The experimental results showed that HFFNet had superior performance on the UAVid, LoveDA, Potsdam, and Vaihingen datasets, and its cross-linking index was better than that of DeepLabv3+ and other competing methods, showing strong generalization ability.
Funding: Supported in part by the Jiangsu Province High Level "333" Program (0401206044), the National Natural Science Foundation of China (61801243, 62072255), the Program for Scientific Research Foundation for Talented Scholars of Jinling Institute of Technology (JIT-B-202031), the University Incubator Foundation of Jinling Institute of Technology (JIT-FHXM-202110), the Open Project of Fujian Provincial Key Lab. of Network Security and Cryptology (NSCL-KF2021-02), the Open Foundation of National Railway Intelligence Transportation System Engineering Tech. Research Center (RITS2021KF02), and the China Postdoctoral Science Foundation (2019M651914).
Abstract: This paper studies secured access for image remote sensing networks. Each sensor in such a network faces an information security risk when wirelessly uploading image information to the central collection point. To enhance the sensing quality of remote uploading, the passive reflection surface technique is employed. If an eavesdropper near the sensor keeps accessing the same network, it may receive the same image from the sensor. Our goal is to improve the SNR of the legitimate collection unit while reducing the SNR of the eavesdropper as much as possible by adaptively adjusting the sensor's uploading power, thereby enhancing the security of the remote sensing images. To achieve this goal, the secured energy efficiency performance is theoretically analyzed with respect to the number of passive reflection elements by calculating the instantaneous performance over the channel fading coefficients. Based on this theoretical result, the secured access is formulated as a mathematical optimization problem with the sensor uploading power as the unknown variable and energy efficiency maximization as the objective, subject to any required maximum data rate of the eavesdropper. Finally, an analytical expression is derived for the optimum uploading power. Numerical simulations verify the design approach.
Funding: The Science and Technology Project of Hebei Education Department, No. BJK2022031; The Open Fund of Hebei Key Laboratory of Geological Resources and Environmental Monitoring and Protection, No. JCYKT202310.
Abstract: The classification of Chinese traditional settlements (CTSs) is extremely important for their differentiated development and protection. The double-branch classification model developed in this study comprehensively utilized the features of remote sensing (RS) images and building facade pictures (BFPs). This approach overcame the limitations of previous methods that used only building facade images to classify settlements. First, the features of the roofs and walls were extracted using a double-branch structure consisting of an RS image branch and a BFP branch. Then, a feature fusion module was designed to fuse the features of the roofs and walls. The precision, recall, and F1-score of the proposed model were improved by more than 4% compared with classification models using only RS images or only BFPs. The same three indexes were improved by more than 2% compared with other deep learning models. The results demonstrated that the proposed model performed well in the classification of architectural styles in CTSs.
Funding: The National Natural Science Foundation of China (42001408, 61806097).
Abstract: Significant advancements have been achieved in road surface extraction based on high-resolution remote sensing image processing. Most current methods rely on fully supervised learning, which necessitates enormous human effort to label the images. Within this field, other research endeavors utilize weakly supervised methods. These approaches aim to reduce the expense of annotation by leveraging sparsely annotated data, such as scribbles. This paper presents a novel technique called a weakly supervised network using scribble-supervised and edge-mask (WSSE-net). The network has a three-branch architecture in which each branch is equipped with a distinct decoder module dedicated to road extraction tasks. One of the branches is dedicated to generating edge masks using edge detection algorithms and optimizing road edge details. The other two branches supervise the model's training by employing scribble labels and spreading scribble information throughout the image. To address the longstanding flaw that generated pseudo-labels are not updated as the network trains, we use mixup to blend prediction results dynamically and continually update the pseudo-labels that steer network training. Our solution operates efficiently by simultaneously considering both edge-mask aid and dynamic pseudo-label support. The studies are conducted on three separate road datasets, which consist primarily of high-resolution remote sensing satellite photos and drone images. The experimental findings suggest that our methodology performs better than advanced scribble-supervised approaches and specific traditional fully supervised methods.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 42090054, 41931295) and the Natural Science Foundation of Hubei Province of China (2022CFA002).
Abstract: The frequent occurrence of extreme weather events has turned numerous landslides into a global natural disaster issue. It is crucial to rapidly and accurately determine the boundaries of landslides for geohazard evaluation and emergency response. Therefore, the Skip Connection DeepLab neural network (SCDnn), a deep learning model based on 770 optical remote sensing images of landslides, is proposed to improve the accuracy of landslide boundary detection. The SCDnn model is optimized for the over-segmentation issue that occurs in conventional deep learning models when topographical and geomorphic features are highly similar. SCDnn exhibits notable improvements in landslide feature extraction and semantic segmentation by combining an enhanced Atrous Spatial Pyramid Convolutional Block (ASPC) with a coding structure that reduces model complexity. The experimental results demonstrate that SCDnn identifies landslide boundaries in 119 images with MIoU values between 0.8 and 0.9 and in 52 images with MIoU values exceeding 0.9, which exceeds the identification accuracy of existing techniques. This work offers a novel technique for the automatic, extensive identification of landslide boundaries in remote sensing images and establishes the groundwork for future investigations and applications in related domains.
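For readers unfamiliar with the MIoU figures quoted in this abstract, the following minimal sketch shows one common way to compute mean intersection over union for a two-class (landslide versus background) mask. The abstract does not state the paper's exact evaluation protocol, so the class set, the helper names `iou` and `mean_iou`, and the tiny example masks are assumptions for illustration only.

```python
def iou(pred, truth, cls):
    """Intersection over union of one class between two label masks (2-D lists)."""
    inter = union = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            inter += (p == cls and t == cls)
            union += (p == cls or t == cls)
    # A class absent from both masks has no defined IoU.
    return inter / union if union else None


def mean_iou(pred, truth, classes=(0, 1)):
    """Average the per-class IoU over the classes that actually occur."""
    scores = [s for s in (iou(pred, truth, c) for c in classes) if s is not None]
    return sum(scores) / len(scores)


# Hypothetical 2x4 masks: 1 = landslide, 0 = background.
truth = [[0, 0, 1, 1],
         [0, 0, 1, 1]]
pred = [[0, 0, 1, 1],
        [0, 1, 1, 1]]  # one background pixel wrongly predicted as landslide

print(iou(pred, truth, 1), iou(pred, truth, 0), mean_iou(pred, truth))
```

Here the landslide class scores 4/5 and the background class 3/4, so a single misclassified boundary pixel already lowers the mean to 0.775 on this small mask.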
Funding: Supported by the National Natural Science Foundation of China [grant number 41930104] and by the Research Grants Council of Hong Kong [grant number PolyU 152219/18E].
Abstract: Information on Land Use and Land Cover Maps (LULCM) is essential for environmental and socioeconomic applications. Such maps are generally derived from Multispectral Remote Sensing Images (MRSI) via classification. The classification process can be described as an information flow from images to maps through a trained classifier. Characterizing the information flow is essential for understanding the classification mechanism, providing solutions to such theoretical issues as "what is the maximum number of classes that can be classified from a given MRSI?" and "how much information gain can be obtained?" Consequently, two questions naturally arise: (i) How can we characterize the information flow? and (ii) What is the mathematical form of the information flow? To answer these two questions, this study first hypothesizes that thermodynamic entropy is the appropriate measure of information for both MRSI and LULCM. This hypothesis is then supported by kinetic-theory-based experiments. Thereafter, based on this entropy, a generalized Jarzynski equation is formulated to mathematically model the information flow; its parameters include the thermodynamic entropy of MRSI, the thermodynamic entropy of LULCM, the weighted F1-score (classification accuracy), and the total number of classes. This generalized Jarzynski equation has been validated by hypothesis-driven experiments in which 694 Sentinel-2 images are classified into 10 classes by four classical classifiers. This study provides a way of linking thermodynamic laws and concepts to the characterization and understanding of information flow in land cover classification, opening a new door for constructing domain knowledge.
Funding: Financially supported by the Youth Innovation Promotion Association CAS (No. 2021325), the National Natural Science Foundation of China (Nos. 52179117, U21A20159), and the Research Project of Panzhihua Iron and Steel Group Mining Co., Ltd. (No. 2021-P6-D2-05).
Abstract: Source identification and deformation analysis of disaster bodies are the main contents of high-steep slope risk assessment. Establishing a high-precision model and quantifying the fine geometric features of the slope are prerequisites for this work. In this study, based on UAV remote sensing technology for acquiring refined models and quantitative parameters, a semi-automatic dangerous rock identification method based on multi-source data is proposed. For periodic UAV-based deformation monitoring, the monitoring accuracy is defined according to the relative accuracy of multi-temporal point clouds. Taking a high-steep slope as the research object, a UAV equipped with special sensors was used to obtain multi-source and multi-temporal data, including a high-precision DOM and multi-temporal 3D point clouds. The geometric features of the outcrop were extracted and superimposed on DOM images to carry out semi-automatic identification of dangerous rock masses, closing the loop between identification and accuracy verification. Change detection on multi-temporal 3D point clouds was conducted to capture slope deformation with centimeter accuracy. The results show that, in the multi-source data-based semi-automatic dangerous rock identification method, the data sources complement each other to improve the efficiency and accuracy of identification, and that UAV-based multi-temporal monitoring can reveal the near real-time deformation state of slopes.
Abstract: This research systematically investigates urban three-dimensional greening layout optimization and smart eco-city construction using deep learning and remote sensing technology. An improved U-Net++ architecture combined with multi-source remote sensing data achieved high-precision recognition of urban three-dimensional greening with 92.8% overall accuracy. Analysis of spatiotemporal evolution patterns in Shanghai, Hangzhou, and Nanjing revealed that three-dimensional greening shows a development trend from demonstration to popularization, with a 16.5% annual growth rate. The study quantitatively assessed the ecological benefits of various three-dimensional greening types. Results indicate that modular vertical greening and intensive roof gardens yield the highest ecological benefits, while climbing-type vertical greening and extensive roof gardens offer optimal benefit-cost ratios. Integration of multiple forms generates a 15-22% synergistic enhancement. Compared with traditional planning, the multi-objective optimization-based layout achieved a 27.5% increase in carbon sequestration, a 32.6% improvement in temperature regulation, a 35.8% enhancement in stormwater management, and a 42.3% rise in the biodiversity index. Three pilot projects validated that actual ecological benefits reached 90.3-102.3% of predicted values. Multi-scenario simulations indicate that optimized layouts can reduce urban heat island intensity by 15.2-18.7%, increase the carbon neutrality contribution to 8.6-10.2%, and decrease stormwater runoff peaks by 25.3-32.6%. The findings provide technical methods for urban three-dimensional greening optimization and smart eco-city construction, promoting sustainable urban development.
Funding: Supported by the National Nature Science Foundation of China (Nos. 61462046 and 61762052), the Natural Science Foundation of Jiangxi Province (Nos. 20161BAB202049 and 20161BAB204172), the Bidding Project of the Key Laboratory of Watershed Ecology and Geographical Environment Monitoring, NASG (Nos. WE2016003, WE2016013 and WE2016015), the Science and Technology Research Projects of Jiangxi Province Education Department (Nos. GJJ160741, GJJ170632 and GJJ170633), and the Art Planning Project of Jiangxi Province (Nos. YG2016250 and YG2017381).
Abstract: Image registration is an indispensable component of multi-source remote sensing image processing. In this paper, we put forward a remote sensing image registration method that includes an improved multi-scale and multi-direction Harris algorithm and a novel compound feature. Multi-scale circle Gaussian combined invariant moments and a multi-direction gray level co-occurrence matrix are extracted as features for image matching. The proposed algorithm is evaluated on numerous multi-source remote sensing images with noise and illumination changes. Extensive experimental studies show that the proposed method obtains a stable and even distribution of keypoints as well as robust and accurate correspondence matches. It is a promising scheme for multi-source remote sensing image registration.
Funding: Project (201412016) supported by the Special Fund for Public Projects of National Administration of Surveying, Mapping and Geoinformation of China; Project (51174287) supported by the National Natural Science Foundation of China.
Abstract: Numerous coal fires burn underneath the Datong coalfield because of indiscriminate mining. Landsat TM/ETM, an unmanned aerial vehicle (UAV), and an infrared thermal imager were employed to monitor underground coal fires in the Majiliang mining area. The thermal field distributions of this area in 2000, 2002, 2006, 2007, and 2009 were obtained using Landsat TM/ETM. The changes in the distribution were then analyzed to approximate the locations of the coal fires. From UAV imagery at a very high resolution (0.2 m), the texture information, linear features, and brightness of the ground fissures in the coal fire area were determined. All these data were combined to build a knowledge model for determining fissures and were used to support underground coal fire detection. An infrared thermal imager was used to map the thermal field distribution of areas where coal fires are serious. Results were analyzed to identify the hot spot trend and the depth of the burning point.
Funding: Supported by the National Natural Science Foundation of China [grant number 42071354] and the Fundamental Research Funds for the Central Universities [grant numbers 2042022dx0001 and WUT:223108001].
Abstract: Artificial Intelligence (AI) and Machine Learning (ML) technologies, particularly Deep Learning (DL), have demonstrated significant potential in the interpretation of Remote Sensing (RS) imagery, covering tasks such as scene classification, object detection, land-cover/land-use classification, change detection, and multi-view stereo reconstruction. Large-scale training samples are essential for ML/DL models to achieve optimal performance. However, the current organization of training samples is ad hoc and vendor-specific, lacking an integrated approach that can effectively manage training samples from different vendors to meet the demands of various RS AI tasks. This article addresses these challenges by designing and implementing LuoJiaSET, a large-scale training sample database system for intelligent interpretation of RS imagery. LuoJiaSET accommodates over five million training samples, supports cross-dataset queries, and serves as a comprehensive training data store for RS AI model training and calibration. It overcomes challenges related to label semantic categories, structural heterogeneity in label representation, and interoperable data access.