To address the low detection accuracy and untimely detection of small pedestrian targets in complex backgrounds, an improved Mamba-based small pedestrian target detection method is proposed. First, standard convolutions in the backbone network are replaced with Receptive-Field Attention Convolution (RFAConv), whose dynamic receptive field adjusts the model's ability to capture multi-scale features while also improving computational efficiency. Second, an attention mechanism is integrated into the Visual State Space Model (VSSM) to extract multi-scale features of small pedestrian targets. Finally, a Feature Enhancement Module (FEM) and a bidirectional pyramid model are employed in the neck to fuse multi-scale features. Experimental results show that on the HIT-UAV dataset, the improved Mamba model achieves an accuracy of 81.25% (measured by mAP@0.5), exceeding existing large models such as YOLOv5, YOLOv8, and YOLOv11 by more than 15%.
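The core idea behind receptive-field attention convolution is to re-weight the values inside each convolution window with attention before they are aggregated. The following is a minimal toy sketch of that idea at a single output position; the real RFAConv learns its attention weights from the features, whereas this stand-in simply takes a softmax over the receptive-field values themselves:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a flat list of values."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def rfa_conv_pixel(patch, kernel):
    """Attention-weighted convolution at one output position (toy sketch).

    The k x k receptive field `patch` is re-weighted by a softmax over its
    own values before the ordinary convolution sum with `kernel` is taken.
    (RFAConv proper derives the weights from a learned sub-network; the
    value-softmax here is only an illustrative stand-in.)
    """
    flat = [v for row in patch for v in row]
    attn = softmax(flat)                      # one attention weight per receptive-field entry
    kflat = [w for row in kernel for w in row]
    return sum(a * v * w for a, v, w in zip(attn, flat, kflat))
```

On a uniform 3x3 patch the attention weights are all 1/9, so a kernel of ones yields exactly the patch mean times the kernel sum / 9, i.e. 1.0.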
Remote sensing data captured by satellites are easily affected by suspended particles during imaging, producing haze that greatly degrades image clarity, which makes remote sensing image dehazing (RSID) essential. Inspired by the recently emerged State Space Model (SSM), which excels at modeling long-range dependencies with linear complexity, the authors design a remote sensing image dehazing technique based on the CSC-Mamba (Cross-Shaped Convolutional Mamba Model) vision model. The technique builds an RSMamba module on the SSM, exploiting its linear complexity to achieve global context encoding and thereby greatly reducing model complexity. In addition, a CSwin module, built from a convolutional neural network (CNN) and a self-attention mechanism, aggregates features across different directional domains to effectively perceive the spatially varying distribution of haze. In this way, CSC-Mamba extracts haze features more effectively and removes the influence of haze on remote sensing images. Experiments on the public SateHaze1K dataset show that the CSC-Mamba dehazing technique is both lightweight and achieves strong dehazing performance.
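The linear complexity that the abstract credits to the SSM comes from its recurrent form: a discrete state-space model processes a sequence in a single pass, one constant-cost update per token. A minimal scalar sketch (coefficients a, b, c are arbitrary illustrative constants, not values from the paper):

```python
def ssm_scan(xs, a=0.9, b=0.5, c=1.0):
    """Discrete state-space recurrence (toy, scalar form):

        h_t = a * h_{t-1} + b * x_t
        y_t = c * h_t

    One pass over the sequence means O(n) time, which is the property
    that lets Mamba-style blocks encode global context cheaply. Real
    SSM layers use matrix-valued, input-dependent coefficients.
    """
    h, ys = 0.0, []
    for x in xs:
        h = a * h + b * x     # state carries information from all earlier inputs
        ys.append(c * h)      # readout at every step
    return ys
```

Note how an impulse input keeps echoing through later outputs with decay `a`, which is exactly the long-range dependency the recurrence provides.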
Change detection (CD) plays a crucial role in numerous fields, where both convolutional neural networks (CNNs) and Transformers have demonstrated exceptional performance in CD tasks. However, CNNs suffer from limited receptive fields, hindering their ability to capture global features, while Transformers are constrained by high computational complexity. Recently, the Mamba architecture, which is based on state space models (SSMs), has shown powerful global modeling capabilities while achieving linear computational complexity. Although some researchers have incorporated Mamba into CD tasks, existing Mamba-based remote sensing CD methods struggle to effectively perceive the inherent locality of changed regions when flattening and scanning remote sensing images, limiting their ability to extract change features. To address these issues, we propose a novel Mamba-based CD method, termed the difference feature fusion Mamba model (DFFMamba), that mitigates the loss of feature locality caused by traditional Mamba-style scanning. Specifically, two distinct difference feature extraction modules are designed: Difference Mamba (DMamba) and Local Difference Mamba (LDMamba). DMamba extracts difference features by calculating the difference in coefficient matrices between the state-space equations of the bi-temporal features. Building upon DMamba, LDMamba combines a locally adaptive state-space scanning (LASS) strategy to enhance feature locality and accurately extract difference features. Additionally, a Fusion Mamba (FMamba) module is proposed, which employs a spatial-channel token modeling SSM (SCTMS) unit to integrate multi-dimensional spatio-temporal interactions of change features, thereby capturing their dependencies across both spatial and channel dimensions. To verify the effectiveness of the proposed DFFMamba, extensive experiments are conducted on three datasets: WHU-CD, LEVIR-CD, and CLCD. The results demonstrate that DFFMamba significantly outperforms state-of-the-art CD methods, achieving intersection over union (IoU) scores of 90.67%, 85.04%, and 66.56% on the three datasets, respectively.
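The bi-temporal differencing idea can be sketched very simply. The paper's DMamba differences the coefficient matrices of the two branches' state-space equations; the simplified stand-in below instead differences the state outputs of an identical scan applied to each temporal branch, which preserves the key intuition that unchanged regions cancel while changed regions survive (the scan coefficients here are illustrative, not the paper's):

```python
def ssm_scan(xs, a=0.8, b=1.0):
    """Shared scalar state-space scan: h_t = a*h_{t-1} + b*x_t."""
    h, ys = 0.0, []
    for x in xs:
        h = a * h + b * x
        ys.append(h)
    return ys

def dmamba_difference(x_t1, x_t2):
    """Toy reading of DMamba's differencing (an assumption, not the
    paper's exact formulation): encode each temporal branch with the
    same state-space scan, then take the elementwise difference, so
    identical (unchanged) inputs yield exactly zero change features.
    """
    y1, y2 = ssm_scan(x_t1), ssm_scan(x_t2)
    return [b - a for a, b in zip(y1, y2)]
```

With identical bi-temporal inputs the difference features are all zero; a change at one position propagates into later difference features with decay `a`, mirroring how state-space scans spread local evidence along the scan order.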
Brain tumors, one of the most lethal diseases with low survival rates, require early detection and accurate diagnosis to enable effective treatment planning. While deep learning architectures, particularly Convolutional Neural Networks (CNNs), have shown significant performance improvements over traditional methods, they struggle to capture the subtle pathological variations between different brain tumor types. Recent attention-based models have attempted to address this by focusing on global features, but they come with high computational costs. To address these challenges, this paper introduces a novel parallel architecture, ParMamba, which uniquely integrates Convolutional Attention Patch Embedding (CAPE) and the ConvMamba block comprising a CNN, Mamba, and a channel enhancement module, marking a significant advancement in the field. The unique design of the ConvMamba block enhances the model's ability to capture both local features and long-range dependencies, improving the detection of subtle differences between tumor types. The channel enhancement module refines feature interactions across channels. Additionally, CAPE is employed as a downsampling layer that extracts both local and global features, further improving classification accuracy. Experimental results on two publicly available brain tumor datasets demonstrate that ParMamba achieves classification accuracies of 99.62% and 99.35%, outperforming existing methods. Notably, ParMamba surpasses vision transformers (ViT) by 1.37% in accuracy, with a throughput improvement of over 30%. These results demonstrate that ParMamba delivers superior performance while operating faster than traditional attention-based methods.
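The parallel local/global design described above can be illustrated with a toy 1-D sketch. This is an assumption-laden simplification, not ParMamba's actual design: a fixed smoothing kernel stands in for the CNN branch, a decayed cumulative scan stands in for the Mamba branch, and the fusion is a plain elementwise sum:

```python
def local_conv1d(xs, k=(0.25, 0.5, 0.25)):
    """Tiny zero-padded 1-D convolution: the 'local feature' branch."""
    pad = [0.0] + list(xs) + [0.0]
    return [sum(w * pad[i + j] for j, w in enumerate(k)) for i in range(len(xs))]

def global_scan(xs, a=0.9):
    """Decayed cumulative sum: a stand-in for the global Mamba branch,
    giving every output a dependency on all earlier inputs in O(n)."""
    h, ys = 0.0, []
    for x in xs:
        h = a * h + x
        ys.append(h)
    return ys

def parallel_block(xs):
    """Parallel fusion sketch: local and global branches are computed
    side by side and summed, so each output carries both a neighborhood
    view and a long-range view of the input."""
    return [l + g for l, g in zip(local_conv1d(xs), global_scan(xs))]
```

Running both branches side by side, rather than stacking them, is what lets this style of block keep throughput high: the two paths can execute concurrently and neither adds depth to the other.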
Funding: supported by the National Natural Science Foundation of China (Nos. 42371449, 41801386).
Funding: supported by the Outstanding Youth Science and Technology Innovation Team Project of Colleges and Universities in Hubei Province (Grant No. T201923), the Key Science and Technology Project of Jingmen (Grant Nos. 2021ZDYF024, 2022ZDYF019), and the Cultivation Project of Jingchu University of Technology (Grant No. PY201904).