Funding: Supported by the Military Commission Science and Technology Committee Leading Fund [grant number 18-163-00-TS-004-080-01].
Abstract: Remote sensing Change Detection (CD) involves identifying changed regions of interest in bi-temporal remote sensing images. CD technology has developed rapidly in recent years thanks to the powerful learning ability of Convolutional Neural Networks (CNNs), which enables complex feature extraction. However, the local receptive fields of CNNs limit the modeling of long-range contextual relationships in semantic changes. This work therefore explores the potential of Siamese Transformers in CD tasks and proposes a general CD model, STCD, built on Swin Transformers. In the encoding process, pure Transformers without CNNs model the long-range context of semantic tokens, reducing computational overhead and improving model efficiency compared with current methods. In the decoding process, a 3D convolution block obtains the change features in the time series, and the predicted change map is generated in the deconvolution layer with axial attention. Extensive experiments on three binary CD datasets and one semantic CD dataset demonstrate that the proposed STCD model outperforms several popular benchmark methods in terms of both performance and number of parameters. Among the STCD variants, Base-STCD reached F1-Scores of 89.85%, 54.72%, and 93.75% on the three binary CD datasets LEVIR, DSIFN, and SVCD, respectively, and an mF1-Score and mIoU of 75.60% and 66.19% on the semantic CD dataset SECOND.
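For readers unfamiliar with the metrics quoted above, the following is a minimal sketch of how an F1-Score and an IoU are commonly computed for a binary change map. It assumes the standard pixel-wise definitions on the "changed" class; the abstract does not state the exact evaluation protocol, and the function name `binary_change_scores` and the toy masks are hypothetical.

```python
def binary_change_scores(pred, ref):
    """Return (precision, recall, f1, iou) for the 'changed' class.

    pred, ref: 2D lists of 0/1 labels with the same shape (1 = changed).
    Standard pixel-wise definitions are assumed, not taken from the paper.
    """
    tp = fp = fn = 0
    for pred_row, ref_row in zip(pred, ref):
        for p, r in zip(pred_row, ref_row):
            if p == 1 and r == 1:
                tp += 1  # changed pixel correctly detected
            elif p == 1 and r == 0:
                fp += 1  # false alarm
            elif p == 0 and r == 1:
                fn += 1  # missed change
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    iou = tp / (tp + fp + fn) if tp else 0.0
    return precision, recall, f1, iou


# Toy 3x4 masks (hypothetical data): 3 hits, 1 false alarm, 1 miss.
pred = [[1, 1, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1]]
ref = [[1, 0, 0, 0],
       [0, 1, 1, 0],
       [0, 0, 0, 1]]
precision, recall, f1, iou = binary_change_scores(pred, ref)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f} iou={iou:.2f}")
```

Unchanged pixels that are correctly predicted do not enter either score, which is why F1 and IoU are preferred over plain accuracy when changed pixels are a small fraction of the image.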
Funding: Supported by the National Key Research and Development Program of China [grant number 2017YFB0504203] and the Xinjiang Production and Construction Corps Science and Technology Project [grant number 2017DB005].
Abstract: Recent change detection (CD) methods focus on extracting deep semantic change features. However, existing methods overlook fine-grained features and are poor at capturing long-range space–time information, which causes micro changes to be missed and the edges of change types to be smoothed. This paper proposes a transformer-based semantic change detection (SCD) model, Pyramid-SCDFormer, which precisely recognizes small changes and the fine edge details of changes. The SCD model selectively merges different semantic tokens in the multi-head self-attention block to obtain multiscale features, which is crucial for extracting information from remote sensing images (RSIs) containing multiple changes at different scales. Moreover, we create a well-annotated SCD dataset, Landsat-SCD, with unprecedented time series and change types in complex scenarios. Compared with three Convolutional Neural Network-based, one attention-based, and two transformer-based networks, experimental results demonstrate that Pyramid-SCDFormer consistently outperforms the existing state-of-the-art CD models, improving MIoU/F1 by 1.11/0.76%, 0.57/0.50%, and 8.75/8.59% on the LEVIR-CD, WHU_CD, and Landsat-SCD datasets, respectively. For change classes whose proportion is less than 1%, the proposed model improves the MIoU by 7.17–19.53% on the Landsat-SCD dataset. The recognition performance for small-scale changes and fine edges of change types is greatly improved.
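The emphasis on rare change classes can be illustrated with a small sketch of mean IoU over several classes. It assumes the usual definition (per-class intersection over union, averaged with equal weight per class); the abstract does not spell out its exact formula, and the helper names and the toy labels below are hypothetical.

```python
def per_class_iou(pred, ref, num_classes):
    """Per-class IoU for flattened label lists; None if a class never occurs."""
    inter = [0] * num_classes
    pred_count = [0] * num_classes
    ref_count = [0] * num_classes
    for p, r in zip(pred, ref):
        pred_count[p] += 1
        ref_count[r] += 1
        if p == r:
            inter[p] += 1
    ious = []
    for c in range(num_classes):
        union = pred_count[c] + ref_count[c] - inter[c]
        ious.append(inter[c] / union if union else None)
    return ious


def mean_iou(ious):
    """Average the IoU of every class that appears, with equal weight."""
    valid = [v for v in ious if v is not None]
    return sum(valid) / len(valid)


# Hypothetical labels: 0 = no change, 1 = common change type, 2 = rare change type.
ref = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]
pred = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]  # the single rare-class pixel is missed

ious = per_class_iou(pred, ref, num_classes=3)
miou = mean_iou(ious)
accuracy = sum(p == r for p, r in zip(pred, ref)) / len(ref)
print(f"per-class IoU={ious} mIoU={miou:.3f} accuracy={accuracy:.2f}")
```

In this toy case the pixel accuracy is 0.80, but missing the one rare-class pixel gives that class an IoU of 0 and pulls the mean IoU below 0.5. Because every class carries the same weight in the mean, gains on classes covering under 1% of the pixels can move the MIoU substantially.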