Funding: This work was funded by the National Natural Science Foundation of China (grant number 62262045), the Fundamental Research Funds for the Central Universities (grant number 2023CDJYGRH-YB11), and the Open Funding of SUGON Industrial Control and Security Center (grant number CUIT-SICSC-2025-03).
Abstract: Automatic segmentation of landslides from remote sensing imagery is challenging because traditional machine learning and early CNN-based models often fail to generalize across heterogeneous landscapes, where segmentation maps contain sparse and fragmented landslide regions under diverse geographical conditions. To address these issues, we propose a lightweight dual-stream siamese deep learning framework that integrates optical and topographical data fusion with an adaptive decoder, guided multimodal fusion, and deep supervision. The framework is built upon the synergistic combination of cross-attention, gated fusion, and sub-pixel upsampling within a unified dual-stream architecture specifically optimized for landslide segmentation, enabling efficient context modeling and robust feature exchange between modalities. The decoder captures long-range context at deeper levels using lightweight cross-attention and refines spatial details at shallower levels through attention-gated skip fusion, enabling precise boundary delineation and fewer false positives. The gated fusion further enhances multimodal integration of optical and topographical cues, and the deep supervision stabilizes training and improves generalization. Moreover, to mitigate checkerboard artifacts, a learnable sub-pixel upsampling is devised to replace the traditional transposed convolution. Despite its compact design with fewer parameters, the model consistently outperforms state-of-the-art baselines. Experiments on two benchmark datasets, Landslide4Sense and Bijie, confirm the effectiveness of the framework. On the Bijie dataset, it achieves an F1-score of 0.9110 and an intersection over union (IoU) of 0.8839. These results highlight its potential for accurate large-scale landslide inventory mapping and real-time disaster response. The implementation is publicly available at https://github.com/mishaown/DiGATe-UNet-LandSlide-Segmentation (accessed on 3 November 2025).
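The abstract does not give the details of its learnable sub-pixel upsampling layer. A common realization of sub-pixel upsampling is a convolution that produces C·r² channels, followed by a fixed rearrangement of those channels into an r-times larger spatial grid (often called pixel shuffle). The following is a minimal sketch of the rearrangement step only, in plain Python on nested lists; the function name `pixel_shuffle` and the channel ordering are assumptions for illustration and are not taken from the paper or its repository.

```python
def pixel_shuffle(x, r):
    """Rearrange a [C*r*r][H][W] nested list into [C][H*r][W*r].

    Assumed channel ordering (illustrative, not from the paper):
    out[c][y*r + i][x*r + j] = in[c*r*r + i*r + j][y][x]
    """
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = c_in // (r * r)

    # Allocate the enlarged output grid.
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]

    for c in range(c_out):
        for i in range(r):          # vertical offset inside each r x r cell
            for j in range(r):      # horizontal offset inside each r x r cell
                src = x[c * r * r + i * r + j]
                for y in range(h):
                    for col in range(w):
                        out[c][y * r + i][col * r + j] = src[y][col]
    return out


# Four 1x1 channels become one 2x2 channel when r = 2.
low_res = [[[1]], [[2]], [[3]], [[4]]]
high_res = pixel_shuffle(low_res, 2)
print(high_res)  # [[[1, 2], [3, 4]]]
```

Because every output pixel is written exactly once from exactly one input value, this rearrangement has no overlapping kernel footprints, which is the usual explanation for why it avoids the checkerboard patterns associated with transposed convolution.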
Abstract: Landslides constitute one of the most destructive geological hazards worldwide, causing thousands of casualties and billions in economic losses annually. To mitigate these risks, accurate and efficient pixel-wise mapping of landslides through automatic semantic segmentation is of paramount importance. While recent advances in deep learning, particularly with transformer architectures and large pre-trained models like the Segment Anything Model (SAM), have shown promise, their application to landslide mapping is often hindered by high computational costs, prompt dependency, and challenges with data imbalance. To address these limitations, we propose GeoNeXt, a novel semantic segmentation architecture for intelligent landslide recognition. It combines a scalable, pre-trained ConvNeXt V2 encoder with a decoder that utilizes Pyramid Squeeze Attention (PSA) and Atrous Spatial Pyramid Pooling (ASPP) to capture multi-scale features. Through domain adaptation on the large-scale CAS landslide dataset, we refined the encoder’s general pre-trained features to learn robust, landslide-specific features. GeoNeXt exhibited zero-shot transferability, achieving 74-78% F1 and 64-66% mIoU across three distinct test datasets from diverse regions, which were entirely excluded from the training process. Ablation studies on decoder variants validated the PSA-ASPP synergy, achieving a superior F1 of 90.39% and mIoU of 83.18% on the CAS dataset. Comparative analysis confirmed that GeoNeXt outperformed SAM-based methods, achieving F1 scores of 94.25%, 86.43%, and 92.27% (mIoU: 89.51%, 78.21%, 86.02%) on the Bijie, Landslide4Sense, and GVLM datasets, respectively, with 10× fewer parameters than SAM-based methods and lower computational demands. We showed that modernized convolutions, paired with strategic training, were a viable alternative to resource-intensive transformers. This efficiency facilitated their use in operational intelligent landslide recognition and geohazard monitoring systems.
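The F1 and IoU figures quoted in these abstracts are overlap measures derived from pixel counts. As a minimal sketch, assuming binary masks in which 1 marks landslide pixels and 0 marks background, the per-class quantities can be computed as below. The function names are illustrative and not from either paper, and the abstracts do not state their exact averaging protocol (mIoU, for instance, is an average over classes), so this shows only the single-class definitions.

```python
def confusion_counts(pred, target):
    """Count true positives, false positives and false negatives
    for two binary masks given as equally sized nested lists."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn


def f1_score(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def iou_score(tp, fp, fn):
    """IoU = TP / (TP + FP + FN); returns 0.0 when undefined."""
    denom = tp + fp + fn
    return tp / denom if denom else 0.0


# Toy 2x3 masks (hypothetical data, for illustration only).
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]

tp, fp, fn = confusion_counts(pred, target)
print(tp, fp, fn)                        # 2 1 1
print(round(f1_score(tp, fp, fn), 4))    # 0.6667
print(iou_score(tp, fp, fn))             # 0.5
```

For a single class the two measures are tied by F1 = 2·IoU / (1 + IoU), so F1 is never lower than IoU on the same mask; this identity does not carry over directly to class-averaged values such as mIoU.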