Funding: Supported by the Scientific Research Fund of Hunan Provincial Education Department, China (Grant Nos. 21C0439, 22A0408).
Abstract: Pneumonia image recognition is complicated by complex lesion morphology, blurred lesion edges, and the limited hardware resources available for deploying recognition models. To address these problems, an improved EfficientNetV2 pneumonia recognition model based on multiscale attention is proposed. First, the number of stacked main modules is reduced to avoid overfitting, and dilated convolution is introduced in the first convolutional layer to expand the model’s receptive field. Second, a redesigned improved mobile inverted bottleneck convolution (IMBConv) module is proposed, in which GSConv is introduced to enhance the model’s attention to inter-channel information and a SimAM module is introduced to reduce the number of parameters while maintaining recognition performance. Finally, an improved multi-scale efficient local attention (MELA) module is proposed to ensure the model’s recognition ability for pneumonia images with complex lesion regions. Experimental results show that the improved model has a computational complexity of 1.96 GFLOPs, 32% lower than the baseline model, and fewer parameters. It achieves an accuracy of 86.67% on the three-class classification task of the public Chest X-ray dataset, an improvement of 2.74% over the baseline model. On the same dataset, ResNet50, Inception-V4, and Swin Transformer V2 reach accuracies of 84.36%, 85.98%, and 83.42%, respectively, and all three have higher computational complexity and parameter counts than the proposed model. This indicates that the proposed model is well suited to deployment in edge computing or mobile healthcare systems. In addition, the improved model achieved the highest accuracy among the compared models, 90.98%, on a public four-class dataset, indicating better recognition accuracy and generalization ability for pneumonia image recognition.
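As a rough illustration of why dilated convolution expands the receptive field without adding weights, the sketch below computes the effective span of a dilated kernel using the standard relation k_eff = k + (k - 1)(d - 1). The kernel size, dilation rate, and channel count used here are assumptions for illustration only; the abstract does not state the values used in the model.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span (in input pixels) covered by one side of a dilated kernel.

    Standard relation: k_eff = k + (k - 1) * (d - 1).
    """
    if kernel_size < 1 or dilation < 1:
        raise ValueError("kernel_size and dilation must be >= 1")
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def weights_per_filter(kernel_size, in_channels):
    """Weights in one 2D convolution filter; independent of the dilation rate."""
    return kernel_size * kernel_size * in_channels


# Hypothetical settings: a 3x3 kernel on a 3-channel input (not from the paper).
for d in (1, 2, 3):
    span = effective_kernel_size(3, d)
    print(f"dilation={d}: covers {span}x{span} pixels, "
          f"{weights_per_filter(3, 3)} weights per filter")
```

With these assumed values, the covered area grows with the dilation rate while the weight count per filter stays the same, which is the trade-off the abstract relies on.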
Funding: Supported by the Science and Technology Innovation Program of Hunan Province (Grant No. 2022RC1021), the Hunan Provincial Natural Science Foundation Project (Grant No. 2023JJ60124), the Changsha Natural Science Foundation Project (Grant No. kq2202265), and the key project of the Hunan Provincial of Education (Grant No. 22A0255).
Abstract: Automatic pancreas segmentation plays a pivotal role in assisting physicians with diagnosing pancreatic diseases, evaluating treatments, and designing surgical plans. Because the pancreas is small, varies significantly in shape and location, and has low contrast with surrounding tissues, achieving high segmentation accuracy remains challenging. To improve segmentation precision, we propose a novel network that uses EfficientNetV2 and multi-branch structures to automatically segment the pancreas from CT images. First, an EfficientNetV2 encoder is employed to extract complex, multi-level features, enhancing the model’s ability to capture the pancreas’s intricate morphology. Then, a residual multi-branch dilated attention (RMDA) module is designed to suppress irrelevant background noise and highlight useful pancreatic features. In the decoder, re-parameterization Visual Geometry Group (RepVGG) blocks with a multi-branch structure are introduced to effectively integrate deep features and low-level details, improving segmentation accuracy. Furthermore, we apply re-parameterization to the model, reducing computation and parameter count while accelerating inference and lowering memory usage. Our approach achieves an average Dice similarity coefficient (DSC) of 85.59%, intersection over union (IoU) of 75.03%, precision of 85.09%, and recall of 86.57% on the NIH pancreas dataset. Compared with other methods, our model has fewer parameters and faster inference, demonstrating strong potential for practical pancreatic segmentation.
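For readers unfamiliar with the four metrics reported above, the following is a minimal sketch of how DSC, IoU, precision, and recall are commonly computed for one pair of binary masks. The function name, the flat 0/1 list representation, and the handling of empty masks are assumptions made for this illustration, not the authors' evaluation code.

```python
def segmentation_metrics(pred, target):
    """Compute DSC, IoU, precision, and recall for two flat binary (0/1) masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")

    true_positive = sum(p & t for p, t in zip(pred, target))
    pred_sum = sum(pred)      # pixels predicted as pancreas
    target_sum = sum(target)  # pixels labelled as pancreas
    union = pred_sum + target_sum - true_positive

    # Convention chosen for this sketch: two empty masks count as a perfect match.
    dice = 2 * true_positive / (pred_sum + target_sum) if union else 1.0
    iou = true_positive / union if union else 1.0
    precision = true_positive / pred_sum if pred_sum else 0.0
    recall = true_positive / target_sum if target_sum else 0.0
    return {"dice": dice, "iou": iou, "precision": precision, "recall": recall}


# Toy 4-pixel example (hypothetical data).
print(segmentation_metrics([1, 1, 0, 0], [1, 0, 1, 0]))
```

For a single pair of masks, IoU and DSC are tied by IoU = DSC / (2 - DSC). Averages taken over many scans, such as the figures reported in the abstract, need not satisfy this relation exactly.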
Abstract: Automatic detection of student engagement levels from videos, a spatio-temporal classification problem, is crucial for enhancing the quality of online education. This paper addresses this challenge by proposing four novel hybrid end-to-end deep learning models for automatically detecting student engagement levels in e-learning videos. The models are evaluated on the DAiSEE dataset, a public repository capturing student affective states in e-learning scenarios. The first model integrates EfficientNetV2-L with a Gated Recurrent Unit (GRU) and attains an accuracy of 61.45%. The second model combines EfficientNetV2-L with a bidirectional GRU (Bi-GRU), yielding an accuracy of 61.56%. The third and fourth models combine EfficientNetV2-L with Long Short-Term Memory (LSTM) and bidirectional LSTM (Bi-LSTM), achieving accuracies of 62.11% and 61.67%, respectively. Our findings demonstrate the viability of these models for discerning student engagement levels, with the EfficientNetV2-L + LSTM model performing best at 62.11% accuracy. This study underscores the potential of hybrid spatio-temporal networks for automating the detection of student engagement, thereby contributing to improvements in the quality of online education.
Abstract: Existing malware classification methods extract features from a single source and neglect channel weights. To address these limitations, this paper introduces a novel classification method based on EfficientNetV2 and feature fusion. The method uses both Byte and Asm files to extract feature images from multiple perspectives and fuses them into three-channel images, giving a more comprehensive representation of malware features. An EfficientNetV2 deep learning model then performs the classification, capturing similarities among malware samples more precisely and thereby improving classification accuracy. Experiments on the BIG2015 dataset show a classification accuracy of 99.14%, indicating that the method classifies malware families effectively and highlighting the significant potential of feature fusion and deep learning models in malware classification.
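The abstract does not say which feature images are derived from the Byte and Asm files or how they are fused. Purely to illustrate the general idea of a three-channel image, the sketch below reshapes a raw byte sequence into a fixed-width grayscale grid and stacks three such grids pixel by pixel. The function names, the zero padding, and the image width are all hypothetical choices, not the paper's method.

```python
def bytes_to_grid(data, width):
    """Reshape raw bytes into rows of `width` grayscale values (0-255).

    The last row is zero-padded; this padding is an assumption of the sketch.
    """
    if width < 1:
        raise ValueError("width must be >= 1")
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        row += [0] * (width - len(row))
        rows.append(row)
    return rows


def fuse_channels(first, second, third):
    """Stack three equally shaped grayscale grids into an H x W x 3 image."""
    if not (len(first) == len(second) == len(third)):
        raise ValueError("grids must have the same height")
    fused = []
    for row_a, row_b, row_c in zip(first, second, third):
        if not (len(row_a) == len(row_b) == len(row_c)):
            raise ValueError("grids must have the same width")
        fused.append([[a, b, c] for a, b, c in zip(row_a, row_b, row_c)])
    return fused


# Hypothetical input: the same tiny byte string reused for all three channels.
grid = bytes_to_grid(b"\x01\x02\x03", width=2)
image = fuse_channels(grid, grid, grid)
print(image)
```

In practice each channel would carry a different feature image, and the fused result would be resized to the input resolution expected by the classifier.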