Funding: the National Natural Science Foundation of China (No. 62266025).
Abstract: Segmentation of the retinal vessels in the fundus is crucial for diagnosing ocular diseases. Retinal vessel images often suffer from class imbalance and large scale variations, which result in incomplete vessel segmentation and poor continuity. In this study, we propose CT-MFENet to address these issues. First, a context transformer (CT) integrates contextual feature information, which helps establish connections between pixels and mitigates the problem of broken vessel continuity. Second, multi-scale dense residual networks are used instead of a traditional CNN to address inadequate local feature extraction when the model encounters vessels at multiple scales. In the decoding stage, we introduce a local-global fusion module that enhances the localization of vascular information and reduces the semantic gap between high- and low-level features. To address the class imbalance in retinal images, we propose a hybrid loss function that enhances the model's ability to segment topological structures. We conducted experiments on the publicly available DRIVE, CHASEDB1, STARE, and IOSTAR datasets. The experimental results show that CT-MFENet performs better than most existing methods, including the baseline U-Net.
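The abstract does not say which terms the hybrid loss combines. As a rough illustration only, the sketch below assumes a common choice for imbalanced segmentation: a weighted sum of binary cross-entropy and soft Dice loss. The function names and the `alpha` and `smooth` parameters are hypothetical and are not taken from CT-MFENet.

```python
import math

def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over flattened pixel probabilities."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)

def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss; the overlap term is insensitive to the many background pixels."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denom + smooth)

def hybrid_loss(probs, targets, alpha=0.5):
    """Assumed form: alpha * BCE + (1 - alpha) * Dice. Not the paper's definition."""
    return alpha * bce_loss(probs, targets) + (1.0 - alpha) * dice_loss(probs, targets)

if __name__ == "__main__":
    # One vessel pixel among three background pixels (toy class imbalance).
    targets = [1, 0, 0, 0]
    print(hybrid_loss([0.9, 0.1, 0.1, 0.1], targets))
    print(hybrid_loss([0.1, 0.1, 0.1, 0.1], targets))  # missed vessel, larger loss
```

The Dice term is driven by overlap with the vessel class rather than by per-pixel averages, which is why it is often paired with cross-entropy when foreground pixels are rare.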
Funding: jointly supported by the National Science and Technology Major Project (2022ZD0117103), the National Natural Science Foundation of China (62272364), the Provincial Key Research and Development Program of Shaanxi (2024GH-ZDXM-47), the Research Project on Higher Education Teaching Reform of Shaanxi Province (23JG003), and the Natural Science Basic Research Program of Shaanxi (2024JC-YBQN0639).
Abstract: Fine-grained visual classification (FGVC) is a very challenging task because it requires distinguishing subcategories under the same super-category. Recent works mainly localize discriminative image regions and capture subtle inter-class differences using attention-based methods. However, at a given layer, most attention-based works consider only large-scale attention blocks of the same size as the feature maps and ignore small-scale attention blocks that are smaller than the feature maps. To distinguish subcategories, it is important to exploit small local regions. In this work, a novel multi-scale attention network (MSANet) is proposed to capture large and small regions at the same layer in fine-grained visual classification. Specifically, a novel multi-scale attention layer (MSAL) is proposed, which generates multiple groups in each feature map to capture discriminative regions at different scales. The groups based on large-scale regions exploit global features, and the groups based on small-scale regions extract subtle local features. Then, a simple feature fusion strategy integrates the global features and subtle local features to mine information that is more conducive to FGVC. Comprehensive experiments on the Caltech-UCSD Birds-200-2011 (CUB), FGVC-Aircraft (AIR), and Stanford Cars (Cars) datasets show that our method achieves competitive performance, which demonstrates its effectiveness.
Funding: Project supported by the National Key R&D Program of China (No. 2020YFF01015000ZL) and the Fundamental Research Funds for the Central Universities, China (No. 3072022CF0806).
Abstract: To improve the accuracy of modulated signal recognition in variable environments and to reduce the impact of factors such as a lack of prior knowledge on recognition results, researchers have gradually adopted deep learning techniques in place of traditional modulated signal processing techniques. To address the low recognition accuracy of modulated signals at low signal-to-noise ratios, we designed a novel modulation recognition network that combines multi-scale analysis with deep threshold noise elimination, and trained it to recognize actually collected modulated signals under a symmetric cross-entropy function with label smoothing. The network consists of a denoising encoder with deep adaptive threshold learning and a decoder with multi-scale feature fusion. The two modules are skip-connected and work together to improve the robustness of the overall network. Experimental results show that this method achieves better recognition accuracy at low signal-to-noise ratios than previous methods. The network demonstrates a flexible self-learning capability for different noise thresholds, and the designed feature fusion module proves effective in multi-scale feature acquisition for various modulation types.
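The abstract names a symmetric cross-entropy function with label smoothing but gives no formula. The sketch below assumes the commonly used form, a weighted sum of the ordinary cross-entropy and the reverse cross-entropy, computed against uniformly smoothed labels. The weights `alpha` and `beta`, the smoothing factor `epsilon`, and all function names are illustrative assumptions, not values from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def smooth_labels(true_class, num_classes, epsilon=0.1):
    """Uniform label smoothing: (1 - epsilon) * one_hot + epsilon / num_classes."""
    off = epsilon / num_classes
    return [1.0 - epsilon + off if k == true_class else off
            for k in range(num_classes)]

def symmetric_cross_entropy(logits, true_class, alpha=1.0, beta=1.0,
                            epsilon=0.1, eps=1e-12):
    """Assumed form: alpha * CE(q, p) + beta * CE(p, q), with q the smoothed labels."""
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must be in (0, 1) so that log(q) stays finite")
    p = softmax(logits)
    q = smooth_labels(true_class, len(logits), epsilon)
    ce = -sum(qk * math.log(max(pk, eps)) for qk, pk in zip(q, p))
    rce = -sum(pk * math.log(qk) for qk, pk in zip(q, p))  # reverse term
    return alpha * ce + beta * rce

if __name__ == "__main__":
    logits = [5.0, 0.0, 0.0]  # hypothetical scores for three modulation types
    print(symmetric_cross_entropy(logits, true_class=0))
    print(symmetric_cross_entropy(logits, true_class=1))  # wrong class, larger loss
```

One convenient property of this pairing is that smoothed labels are never exactly zero, so the reverse term needs no special handling of `log(0)`.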
Funding: fully supported by the National Natural Science Foundation of China (52072412).
Abstract: With the development of smart agriculture, accurately identifying crop diseases through visual recognition techniques instead of by eye has been a significant challenge. This study focused on apple leaf disease, which is closely related to the final yield of apples. A multiscale fusion dense network combined with an efficient multiscale attention (EMA) mechanism, called Incept_EMA_DenseNet, was developed to better identify images of eight complex apple leaf diseases. Incept_EMA_DenseNet consists of three crucial parts: the inception module, which replaces the convolution layer with multiscale fusion methods in the shallow feature extraction layer; the EMA mechanism, which is used to obtain appropriate weights for the different dense blocks; and the improved DenseNet based on DenseNet_121. First, to find an appropriate multiscale fusion method, the residual module and the inception module were compared to determine the performance of each technique, and Incept_EMA_DenseNet achieved an accuracy of 95.38%. Second, this work compared three attention mechanisms, and the efficient multiscale attention mechanism obtained the best performance. Third, the convolution layers and bottlenecks were modified without performance degradation, halving the computational load compared with the original models. Incept_EMA_DenseNet, as proposed in this paper, achieves an accuracy of 96.76%, which is 2.93%, 3.44%, and 4.16% better than Resnet50, DenseNet_121, and GoogLeNet, respectively. It proved to be reliable and beneficial, and it can effectively and conveniently assist apple growers with leaf disease identification in the field.
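Abstract: With the continued development of machine learning, especially deep learning theory and algorithms, and the accumulation of large amounts of video data, unsupervised learning algorithms that use unlabeled video information have made considerable progress. This paper proposes a two-stream unsupervised video object segmentation network that fuses optical flow information. First, a random frame from the video sequence and the corresponding optical flow map generated by an optical flow network are fed separately into a residual network (ResNet) backbone, producing a frame feature map and a corresponding inter-frame optical flow feature map. Second, to overcome the effect of co-moving background information on segmentation accuracy, a position information fusion (PIF) module is designed to fuse the position information of the input video frame and the optical flow, which locates the primary object while reducing the effect of background noise on segmentation. Finally, a spatial channel context information fusion (SCCF) attention module is designed, which fuses the context information of the frame features and optical flow features with the classic spatial-channel attention mechanism. Experiments on the DAVIS-16 dataset show that the proposed network reaches a mean region similarity of 89.6 and a mean boundary accuracy of 87.0, both of which are the best reported in the field.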
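For readers unfamiliar with the region similarity metric, the sketch below shows how it is usually computed on DAVIS: as the Jaccard index (intersection over union) between a predicted binary mask and the ground-truth mask, averaged over frames. The helper names are hypothetical, and scoring two empty masks as 1.0 is an assumed convention. This is not the benchmark's official evaluation code.

```python
def region_similarity(pred_mask, gt_mask):
    """Jaccard index (IoU) between two binary masks given as nested lists."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred_mask, gt_mask):
        for p, g in zip(pred_row, gt_row):
            p, g = bool(p), bool(g)
            inter += p and g   # pixel is foreground in both masks
            union += p or g    # pixel is foreground in at least one mask
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union

def mean_region_similarity(pred_masks, gt_masks):
    """Average the per-frame Jaccard index over a sequence of frames."""
    scores = [region_similarity(p, g) for p, g in zip(pred_masks, gt_masks)]
    return sum(scores) / len(scores)

if __name__ == "__main__":
    pred = [[[1, 1], [0, 0]], [[0, 1], [0, 1]]]
    gt = [[[1, 0], [0, 0]], [[0, 1], [0, 1]]]
    print(mean_region_similarity(pred, gt))  # average of 0.5 and 1.0
```

A reported value such as 89.6 corresponds to this mean expressed as a percentage.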