Purpose: Classification of remote sensing images (RSI) is a challenging task in computer vision. Recently, researchers have proposed a variety of creative methods for automatic recognition of RSI, and feature fusion is a research hotspot because of its great potential to boost performance. However, RSI has unique imaging conditions and cluttered scenes with complicated backgrounds. These marked differences from natural images mean that previous feature fusion methods have yielded only insignificant performance improvements.
Design/methodology/approach: This work proposes a two-convolutional neural network (CNN) fusion method, named the main and branch CNN fusion network (MBC-Net), as an improved solution for classifying RSI. In detail, MBC-Net employs an EfficientNet-B3 as its main CNN stream and an EfficientNet-B0 as a branch, named MC-B3 and BC-B0, respectively. In particular, MBC-Net includes a long-range derivation (LRD) module, which is specially designed to learn the dependence between different features. MBC-Net also incorporates several design choices to tackle the problems arising from the two-CNN fusion and from the inherent nature of RSI.
Findings: Extensive experiments on three RSI sets show that MBC-Net outperforms 38 other state-of-the-art (SOTA) methods published from 2020 to 2023, with a noticeable increase in overall accuracy (OA). On the most confusing NWPU set, MBC-Net not only achieves a 0.7% higher OA but also has 62% fewer parameters than the leading approach in the literature.
Originality/value: MBC-Net is a more effective and efficient feature fusion approach than the other SOTA methods in the literature. Visualizations from gradient-weighted class activation mapping (Grad-CAM) reveal that MBC-Net can learn long-range dependence between features that a single CNN cannot. The t-distributed stochastic neighbor embedding (t-SNE) results demonstrate that the feature representation of MBC-Net is more effective than that of other methods. In addition, the ablation tests indicate that MBC-Net is effective and efficient for fusing features from two CNNs.
Funding: This work was funded by Hunan University of Arts and Science (No: 16BSQD23).
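The Findings above are reported in terms of overall accuracy (OA). The abstract does not spell out the formula, so the following is only a minimal sketch that assumes the conventional definition used in scene classification: the fraction of test images whose predicted class matches the ground-truth class. The function name `overall_accuracy` and the sample labels are hypothetical and are not taken from the paper.

```python
def overall_accuracy(y_true, y_pred):
    """Return the fraction of samples whose predicted label equals the true label.

    Assumes the conventional OA definition (correct predictions / total samples);
    the abstract itself does not state the formula.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels for six scene images (illustrative only, not paper data).
y_true = ["airport", "beach", "forest", "forest", "harbor", "beach"]
y_pred = ["airport", "beach", "forest", "meadow", "harbor", "beach"]

oa = overall_accuracy(y_true, y_pred)
print(f"OA = {oa:.1%}")  # prints "OA = 83.3%" (5 of 6 correct)
```

Under this definition, and reading the reported figure as percentage points, a 0.7% gain in OA corresponds to roughly seven additional correctly classified images per thousand test images.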