Segmentation of the retinal vessels in the fundus is crucial for diagnosing ocular diseases. Retinal vessel images often suffer from category imbalance and large scale variations. This ultimately results in incomplete vessel segmentation and poor continuity. In this study, we propose CT-MFENet to address the aforementioned issues. First, the use of context transformer (CT) allows for the integration of contextual feature information, which helps establish the connection between pixels and solve the problem of incomplete vessel continuity. Second, multi-scale dense residual networks are used instead of traditional CNN to address the issue of inadequate local feature extraction when the model encounters vessels at multiple scales. In the decoding stage, we introduce a local-global fusion module. It enhances the localization of vascular information and reduces the semantic gap between high- and low-level features. To address the class imbalance in retinal images, we propose a hybrid loss function that enhances the segmentation ability of the model for topological structures. We conducted experiments on the publicly available DRIVE, CHASEDB1, STARE, and IOSTAR datasets. The experimental results show that our CT-MFENet performs better than most existing methods, including the baseline U-Net.
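The abstract does not specify the form of the hybrid loss; a common choice for class-imbalanced segmentation is a combination of binary cross-entropy (per-pixel) and Dice loss (overlap-based). A minimal numpy sketch under that assumption:

```python
import numpy as np

def dice_bce_loss(pred, target, smooth=1.0, eps=1e-7):
    """Hybrid loss sketch: binary cross-entropy + Dice loss.

    pred:   predicted vessel probabilities in (0, 1)
    target: binary ground-truth mask, same shape
    NOTE: an illustrative stand-in, not CT-MFENet's actual loss.
    """
    pred = np.clip(pred, eps, 1.0 - eps)
    # BCE term: penalizes per-pixel misclassification
    bce = -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))
    # Dice term: overlap-based, less sensitive to class imbalance
    intersection = np.sum(pred * target)
    dice = (2.0 * intersection + smooth) / (np.sum(pred) + np.sum(target) + smooth)
    return bce + (1.0 - dice)

mask = np.array([0.0, 1.0, 1.0, 0.0])
print(dice_bce_loss(mask.copy(), mask))  # near 0 for a perfect prediction
```

The Dice term keeps thin vessels from being dominated by the background majority, which is the usual motivation for mixing it with cross-entropy.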
Feature extraction of signals plays an important role in classification problems because of its data dimension reduction property and the potential improvement of the classification accuracy rate. Principal component analysis (PCA), wavelet transform or Fourier transform methods are often used for feature extraction. In this paper, we propose a multi-scale PCA, which combines the discrete wavelet transform and PCA for feature extraction of signals in both the spatial and temporal domains. Our study shows that the multi-scale PCA combined with the proposed new classification methods leads to high classification accuracy for the considered signals.
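The pipeline above (discrete wavelet transform, then PCA on the coefficients) can be sketched with a hand-rolled one-level Haar transform and an SVD-based PCA; the two-scale decomposition and component count here are illustrative choices, not the paper's settings:

```python
import numpy as np

def haar_dwt(x):
    """One-level Haar wavelet transform: approximation and detail coefficients."""
    x = x[: len(x) // 2 * 2]                      # truncate to even length
    approx = (x[0::2] + x[1::2]) / np.sqrt(2)     # low-pass (trend)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2)     # high-pass (fluctuation)
    return approx, detail

def multiscale_pca(signals, n_components=2):
    """Multi-scale PCA sketch: stack wavelet coefficients across scales, then PCA."""
    feats = []
    for s in signals:
        a, d = haar_dwt(s)
        a2, d2 = haar_dwt(a)                      # second decomposition level
        feats.append(np.concatenate([a2, d2, d])) # multi-scale feature vector
    X = np.asarray(feats)
    X = X - X.mean(axis=0)                        # center before PCA
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:n_components].T                # project onto top components

rng = np.random.default_rng(0)
Z = multiscale_pca(rng.normal(size=(10, 32)))
print(Z.shape)  # (10, 2)
```

In practice a library such as PyWavelets would replace the hand-written Haar step and support deeper decompositions and other wavelet families.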
Accurately identifying small objects in high-resolution aerial images presents a complex and crucial task in the field of small object detection on unmanned aerial vehicles (UAVs). This task is challenging due to variations in UAV flight altitude, differences in object scales, as well as factors like flight speed and motion blur. To enhance the detection efficacy of small targets in drone aerial imagery, we propose an enhanced You Only Look Once version 7 (YOLOv7) algorithm based on multi-scale spatial context. We build the MSC-YOLO model, which incorporates an additional prediction head, denoted as P2, to improve adaptability for small objects. We replace conventional downsampling with a Spatial-to-Depth Convolutional Combination (CSPDC) module to mitigate the loss of intricate feature details related to small objects. Furthermore, we propose a Spatial Context Pyramid with Multi-Scale Attention (SCPMA) module, which captures spatial and channel-dependent features of small targets across multiple scales. This module enhances the perception of spatial contextual features and the utilization of multi-scale feature information. On the Visdrone2023 and UAVDT datasets, MSC-YOLO achieves remarkable results, outperforming the baseline method YOLOv7 by 3.0% in terms of mean average precision (mAP). The MSC-YOLO algorithm proposed in this paper has demonstrated satisfactory performance in detecting small targets in UAV aerial photography, providing strong support for practical applications.
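The core idea behind replacing strided downsampling with a spatial-to-depth step is that no pixels are discarded: each 2x2 patch is folded into the channel axis. A minimal numpy sketch of that rearrangement (the convolutional part of the CSPDC module is omitted):

```python
import numpy as np

def space_to_depth(x, block=2):
    """Rearrange (H, W, C) -> (H/block, W/block, C*block^2).

    Downsamples spatially without discarding information: every
    block x block patch is stacked along the channel axis, preserving
    fine detail that plain strided convolution would drop.
    """
    h, w, c = x.shape
    assert h % block == 0 and w % block == 0
    x = x.reshape(h // block, block, w // block, block, c)
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(h // block, w // block, c * block * block)

img = np.arange(4 * 4 * 3).reshape(4, 4, 3)
out = space_to_depth(img)
print(out.shape)  # (2, 2, 12)
```

Because the operation is a pure permutation of values, every input pixel survives into the output, which is why it suits small-object features.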
Aiming at the difficulty of fault identification caused by manual extraction of fault features of rotating machinery, a one-dimensional multi-scale convolutional auto-encoder fault diagnosis model is proposed, based on the standard convolutional auto-encoder. In this model, the parallel convolutional and deconvolutional kernels of different scales are used to extract the features from the input signal and reconstruct the input signal; then the feature map extracted by multi-scale convolutional kernels is used as the input of the classifier; and finally the parameters of the whole model are fine-tuned using labeled data. Experiments on one set of simulation fault data and two sets of rolling bearing fault data are conducted to validate the proposed method. The results show that the model can achieve 99.75%, 99.3% and 100% diagnostic accuracy, respectively. In addition, the diagnostic accuracy and reconstruction error of the one-dimensional multi-scale convolutional auto-encoder are compared with traditional machine learning, convolutional neural networks and a traditional convolutional auto-encoder. The final results show that the proposed model has a better recognition effect for rolling bearing fault data.
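The "parallel kernels of different scales" idea can be illustrated with fixed 1-D filters of several widths applied to the same signal; the kernel sizes and averaging filters below are placeholders for the learned kernels of the actual auto-encoder:

```python
import numpy as np

def multi_scale_conv1d(signal, kernel_sizes=(3, 7, 15)):
    """Parallel 1-D convolutions at several kernel scales.

    Each branch uses a simple averaging kernel as a stand-in for a
    learned filter; 'same' padding keeps every branch output the same
    length so the branches can be stacked into one feature map.
    """
    branches = []
    for k in kernel_sizes:
        kernel = np.ones(k) / k                 # placeholder for a learned filter
        branches.append(np.convolve(signal, kernel, mode="same"))
    return np.stack(branches)                   # shape: (n_scales, len(signal))

sig = np.sin(np.linspace(0, 4 * np.pi, 64))
feat = multi_scale_conv1d(sig)
print(feat.shape)  # (3, 64)
```

Short kernels track fast transients (e.g. bearing impacts) while long kernels capture slower trends, which is the motivation for fusing the scales before classification.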
Tea leaf picking is a crucial stage in tea production that directly influences the quality and value of the tea. Traditional tea-picking machines may compromise the quality of the tea leaves. High-quality teas are often handpicked and need more delicate operations in intelligent picking machines. Compared with traditional image processing techniques, deep learning models have stronger feature extraction capabilities, and better generalization and are more suitable for practical tea shoot harvesting. However, current research mostly focuses on shoot detection and cannot directly accomplish end-to-end shoot segmentation tasks. We propose a tea shoot instance segmentation model based on multi-scale mixed attention (Mask2FusionNet) using a dataset from the tea garden in Hangzhou. We further analyzed the characteristics of the tea shoot dataset, where the proportion of small to medium-sized targets is 89.9%. Our algorithm is compared with several mainstream object segmentation algorithms, and the results demonstrate that our model achieves an accuracy of 82% in recognizing the tea shoots, showing a better performance compared to other models. Through ablation experiments, we found that ResNet50, PointRend strategy, and the Feature Pyramid Network (FPN) architecture can improve performance by 1.6%, 1.4%, and 2.4%, respectively. These experiments demonstrated that our proposed multi-scale and point selection strategy optimizes the feature extraction capability for overlapping small targets. The results indicate that the proposed Mask2FusionNet model can perform the shoot segmentation in unstructured environments, realizing the individual distinction of tea shoots, and complete extraction of the shoot edge contours with a segmentation accuracy of 82.0%. The research results can provide algorithmic support for the segmentation and intelligent harvesting of premium tea shoots at different scales.
In order to extract the richer feature information of ship targets from sea clutter, and address the high dimensional data problem, a method termed as multi-scale fusion kernel sparse preserving projection (MSFKSPP) based on the maximum margin criterion (MMC) is proposed for recognizing the class of ship targets utilizing the high-resolution range profile (HRRP). Multi-scale fusion is introduced to capture the local and detailed information in small-scale features, and the global and contour information in large-scale features, offering help to extract the edge information from sea clutter and further improving the target recognition accuracy. The proposed method can maximally preserve the multi-scale fusion sparsity of the data and maximize the class separability in the reduced dimensionality by reproducing kernel Hilbert space. Experimental results on the measured radar data show that the proposed method can effectively extract the features of ship targets from sea clutter, further reduce the feature dimensionality, and improve target recognition performance.
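The maximum margin criterion underlying the method seeks a projection maximizing between-class scatter minus within-class scatter, i.e. the leading eigenvectors of Sb - Sw. A numpy sketch of that MMC step alone (the kernel and sparse-preserving components of MSFKSPP are omitted):

```python
import numpy as np

def mmc_projection(X, y, n_dims=2):
    """Maximum margin criterion (MMC) dimensionality reduction.

    Projects onto the top eigenvectors of Sb - Sw (between-class minus
    within-class scatter). Only the MMC core is shown; the paper's
    kernel sparse-preserving machinery is not reproduced here.
    """
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sb = np.zeros((d, d))
    Sw = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * diff @ diff.T           # between-class scatter
        Sw += (Xc - mc).T @ (Xc - mc)           # within-class scatter
    vals, vecs = np.linalg.eigh(Sb - Sw)        # symmetric matrix -> eigh
    W = vecs[:, np.argsort(vals)[::-1][:n_dims]]
    return X @ W

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (20, 5)), rng.normal(3, 1, (20, 5))])
y = np.array([0] * 20 + [1] * 20)
Z = mmc_projection(X, y)
print(Z.shape)  # (40, 2)
```

Unlike Fisher discriminant analysis, MMC needs no inversion of Sw, which matters when the HRRP feature dimension exceeds the sample count.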
In this study, an underwater image enhancement method based on a multi-scale adversarial network was proposed to solve the problem of detail blur and color distortion in underwater images. Firstly, the local features of each layer were enhanced into the global features by the proposed residual dense block, which ensured that the generated images retain more details. Secondly, a multi-scale structure was adopted to extract multi-scale semantic features of the original images. Finally, the features obtained from the dual channels were fused by an adaptive fusion module to further optimize the features. The discriminant network adopted the structure of the Markov discriminator. In addition, by constructing mean square error, structural similarity, and perceived color loss functions, the generated image is kept consistent with the reference image in structure, color, and content. The experimental results showed that the enhanced underwater image deblurring effect of the proposed algorithm was good and the problem of underwater image color bias was effectively improved. In both subjective and objective evaluation indexes, the experimental results of the proposed algorithm are better than those of the comparison algorithm.
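A composite objective of the kind described can be sketched as a weighted sum of terms. The color term below (distance between per-channel means) is a simplified stand-in for the paper's perceived color loss, and the structural-similarity term is omitted for brevity:

```python
import numpy as np

def enhancement_loss(pred, ref, w_mse=1.0, w_color=0.5):
    """Sketch of a composite image-enhancement loss: MSE + a color term.

    pred, ref: (H, W, 3) images in [0, 1]. The color term compares
    per-channel means, a crude proxy for global color cast; weights
    w_mse / w_color are illustrative, not the paper's values.
    """
    mse = np.mean((pred - ref) ** 2)
    color = np.mean(np.abs(pred.mean(axis=(0, 1)) - ref.mean(axis=(0, 1))))
    return w_mse * mse + w_color * color

rng = np.random.default_rng(2)
ref = rng.uniform(size=(8, 8, 3))
print(enhancement_loss(ref, ref))  # 0.0 when prediction matches reference
```

In an adversarial setup this reconstruction objective would be added to the generator's adversarial loss from the Markov (PatchGAN-style) discriminator.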
Ship detection in synthetic aperture radar (SAR) images is crucial for marine surveillance and navigation. The application of detection networks based on deep learning has achieved promising results in SAR ship detection. However, existing networks encounter challenges due to the complex backgrounds, diverse scales and irregular distribution of ship targets. To address these issues, this article proposes a detection algorithm that integrates the global context of the images (GCF-Net). First, we construct a global feature extraction module in the backbone network of GCF-Net, which encodes features along different spatial directions. Then, we incorporate a bi-directional feature pyramid network (BiFPN) in the neck network to fuse the multi-scale features selectively. Finally, we design a convolution and transformer mixed (CTM) detection head to obtain contextual information of targets and concentrate network attention on the most informative regions of the images. Experimental results demonstrate that the proposed method achieves more accurate detection of ship targets in SAR images.
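BiFPN's selective fusion is usually implemented as "fast normalized fusion": each input feature map gets a learnable non-negative weight, and the weights are normalized to sum to one. A minimal numpy sketch (weights here are fixed for illustration, not learned):

```python
import numpy as np

def fast_normalized_fusion(features, weights, eps=1e-4):
    """BiFPN-style fast normalized fusion of same-shape feature maps.

    Weights are clipped to be non-negative and normalized, so the fused
    map is a convex combination of the inputs; eps avoids division by
    zero when all weights vanish.
    """
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)  # ReLU keeps w >= 0
    w = w / (w.sum() + eps)
    return sum(wi * f for wi, f in zip(w, features))

f1 = np.ones((4, 4))
f2 = np.full((4, 4), 3.0)
fused = fast_normalized_fusion([f1, f2], [1.0, 1.0])
print(fused[0, 0])  # ~2.0, the equally weighted average
```

Compared with a softmax over the weights, this normalization is cheaper while behaving similarly, which is why BiFPN adopts it.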
To address three problems in tiny-object detection in remote sensing images, namely shallow refined features, deep semantic representation, and multi-scale information extraction, a cross-scale YOLOv7 (CSYOLOv7) network that combines several techniques is proposed. First, a cross-stage feature extraction module (CFEM) and a receptive field feature enhancement module (RFFEM) are designed. CFEM improves the model's ability to extract refined features and suppresses feature loss during shallow downsampling, while RFFEM strengthens the extraction of deep semantic features and enhances the model's ability to capture target context. Second, a cross-gradient space pyramid pooling module (CSPPM) is designed to effectively fuse the multi-scale global and local features of tiny targets. Finally, shape-aware intersection over union (Shape IoU) replaces complete intersection over union (CIoU) to improve the model's accuracy in bounding-box localization. Experimental results show that the CSYOLOv7 network achieves detection accuracies of 74% and 89.6% on the DIOR (dataset for image object recognition) and NWPU VHR-10 (Northwestern Polytechnical University Very High Resolution-10) datasets, respectively, effectively improving the detection of tiny targets in remote sensing images.
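Both Shape IoU and CIoU build on the same overlap ratio and differ in the penalty terms they add (shape/scale awareness versus center distance and aspect ratio). Only the shared IoU core is sketched here:

```python
def box_iou(a, b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Shape IoU and CIoU both extend this ratio with extra penalty terms;
    those terms are not reproduced here.
    """
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])   # intersection corners
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1/7 ≈ 0.142857
```

For tiny targets, small localization errors cause large IoU drops, which motivates shape- and scale-aware variants of this loss.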
To solve the problem that existing event extraction methods struggle to fully exploit contextual information in the entity extraction subtask, which leads to low event extraction accuracy, a document-level event extraction model based on spans and graph convolutional networks (DEESG) is proposed. First, an intermediate linear layer is designed to linearly transform the encoded vectors, and the optimal spans are computed in combination with the annotation information; improving the accuracy of judging span start and end positions raises the precision of entity extraction. Next, a heterogeneous graph construction method is proposed: a pooling strategy represents entities and sentences as graph nodes, and the heterogeneous graph is built according to the proposed edge-construction rules to establish interaction of global information. A multi-layer graph convolutional network (GCN) is then applied to the heterogeneous graph to obtain entity and sentence representations enriched with contextual information, addressing the under-utilization of context. Event types are then detected with a multi-head attention mechanism. Finally, argument roles are assigned to the entities in each combination, completing the event extraction task. Experiments on the Chinese financial announcements (ChFinAnn) dataset show that, compared with the graph-based interaction model with a tracker (GIT), the DEESG model improves the F1 score by 1.3 percentage points. This confirms that the DEESG model can be effectively applied to document-level event extraction.
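The GCN convolution over the heterogeneous graph follows the standard propagation rule relu(D^-1/2 (A+I) D^-1/2 X W); the tiny graph and weights below are illustrative, not the model's actual entity/sentence graph:

```python
import numpy as np

def gcn_layer(A, X, W):
    """One graph-convolution layer: relu(D^-1/2 (A+I) D^-1/2 X W).

    A: (n, n) adjacency of the entity/sentence graph
    X: (n, d) node features
    W: (d, k) learned weight matrix (random/fixed here for illustration)
    """
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ X @ W, 0.0)         # ReLU non-linearity

A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # path graph
X = np.eye(3)                                      # one-hot node features
W = np.ones((3, 2))
H = gcn_layer(A, X, W)
print(H.shape)  # (3, 2)
```

Stacking several such layers lets each entity node aggregate context from sentences multiple hops away, which is how the model injects document-level context into entity representations.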
To address the small targets, object occlusion, and complex backgrounds in UAV aerial images, an object detection network based on efficient feature extraction and a large receptive field (EFLF-Net) is proposed. The detection-layer architecture is optimized to reduce the miss rate for small targets; new building blocks are fused into the backbone network to improve feature extraction efficiency; a content-aware feature reassembly module and a large selective kernel network are introduced to enhance the neck network's ability to perceive the context of occluded targets; and the Wise-IoU loss function is adopted to stabilize bounding-box regression. Experimental results on the VisDrone2019 dataset show that EFLF-Net improves average precision by 5.2% over the baseline model. Compared with representative existing object detection algorithms, the method achieves better detection on UAV aerial images containing small targets, mutual occlusion, and complex backgrounds.
Funding: the National Natural Science Foundation of China (No. 62266025).
Funding: the Key Research and Development Program of Hainan Province (Grant Nos. ZDYF2023GXJS163, ZDYF2024GXJS014); the National Natural Science Foundation of China (NSFC) (Grant Nos. 62162022, 62162024); the Major Science and Technology Project of Hainan Province (Grant No. ZDKJ2020012); the Hainan Provincial Natural Science Foundation of China (Grant No. 620MS021); the Youth Foundation Project of Hainan Natural Science Foundation (621QN211).
Funding: The National Natural Science Foundation of China (No. 51675098).
Funding: This research was supported by the National Natural Science Foundation of China (No. 62276086); the National Key R&D Program of China (No. 2022YFD2000100); Zhejiang Provincial Natural Science Foundation of China under Grant No. LTGN23D010002.
Funding: supported by the National Natural Science Foundation of China (62271255, 61871218); the Fundamental Research Funds for the Central University (3082019NC2019002); the Aeronautical Science Foundation (ASFC-201920007002); the Program of Remote Sensing Intelligent Monitoring and Emergency Services for Regional Security Elements.
Funding: supported by the National Science Fund for Distinguished Young Scholars (No. 62325104).