Federated learning is an effective distributed learning framework that protects privacy by allowing multiple edge devices to train models jointly without exchanging data. However, edge devices usually have limited computing capabilities, and limited network bandwidth is often a major bottleneck. To reduce communication and computing costs, we introduce a horizontal pruning mechanism, combine federated learning with progressive learning, and propose a progressive federated learning scheme based on model pruning. It gradually trains from simple models to more complex ones and horizontally trims the models uploaded by clients. Our approach effectively reduces computational and bidirectional communication costs while maintaining model performance. We conducted several image classification experiments on different models, and the results demonstrate that our approach saves approximately 10% of the computational cost and 48% of the communication cost compared with FedAvg.
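As a rough illustration of how a client could shrink its upload, below is a minimal sketch of magnitude-based pruning applied to a model's state dict before it is sent to the server. It assumes a PyTorch model; the `prune_update` name, the keep ratio, and the simple magnitude-threshold rule are illustrative assumptions, since the abstract does not spell out the paper's exact horizontal pruning criterion.

```python
import torch
import torch.nn as nn

def prune_update(state_dict, keep_ratio=0.5):
    """Zero out the smallest-magnitude entries of each weight matrix before upload.

    Illustrative only: the paper's horizontal pruning rule may differ.
    """
    pruned = {}
    for name, w in state_dict.items():
        if w.dim() < 2:                      # leave biases and norm parameters intact
            pruned[name] = w
            continue
        k = max(1, int(w.numel() * keep_ratio))
        # threshold = k-th largest magnitude = (numel - k + 1)-th smallest
        threshold = w.abs().flatten().kthvalue(w.numel() - k + 1).values
        pruned[name] = w * (w.abs() >= threshold)
    return pruned

# toy client model: only the mostly-zero pruned update would be communicated
client_model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
upload = prune_update(client_model.state_dict(), keep_ratio=0.5)
```

Sparse updates of this kind can be encoded compactly (for example, as index-value pairs), which is where the upload-side communication saving comes from.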
The end-to-end object detection Transformer (DETR) established the paradigm of the Transformer architecture in object detection. Its end-to-end detection pipeline and its idea of set prediction have made it one of the most popular network architectures in recent years, and an abundance of work has improved upon DETR. However, DETR and its variants require substantial memory and computational resources, and their vast number of parameters is unfavorable for model deployment. To address this issue, a greedy pruning (GP) algorithm is proposed and applied to the denoising DETR (DN-DETR) variant to eliminate redundant parameters in its Transformer architecture. Considering the different roles of the multi-head attention (MHA) module and the feed-forward network (FFN) module in the Transformer architecture, a modular greedy pruning (MGP) algorithm is further proposed; it handles the two modules separately and applies the optimal strategy and parameters to each. The effectiveness of the proposed algorithm is validated on the COCO 2017 dataset. Compared with the Transformer architecture of DN-DETR, the model obtained through the MGP algorithm reduces the parameters by 49% and the number of floating point operations (FLOPs) by 44%, while the mean average precision (mAP) increases from 44.1% to 45.3%. Funding: Shanghai Municipal Commission of Economy and Information Technology, China (No. 202301054).
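To make the modular idea concrete, here is a small sketch of how importance scores for attention heads and FFN hidden units might be computed separately before a greedy procedure removes the weakest ones. The weight-norm heuristic, the function names, and the standard PyTorch Transformer layer used as a stand-in are assumptions for illustration; the paper's greedy pruning algorithm and its exact importance criterion are not reproduced here.

```python
import torch
import torch.nn as nn

def head_importance(mha: nn.MultiheadAttention):
    """One score per attention head: L2 norm of its slice of the output projection."""
    d, h = mha.embed_dim, mha.num_heads
    w = mha.out_proj.weight.view(d, h, d // h)   # (embed_dim, heads, head_dim)
    return w.norm(dim=(0, 2))

def ffn_importance(linear1: nn.Linear):
    """One score per FFN hidden unit: norm of its incoming weight row."""
    return linear1.weight.norm(dim=1)

# a standard Transformer encoder layer stands in for one DN-DETR Transformer layer
layer = nn.TransformerEncoderLayer(d_model=256, nhead=8, dim_feedforward=2048,
                                   batch_first=True)
print(head_importance(layer.self_attn))   # 8 scores, one per head
print(ffn_importance(layer.linear1))      # 2048 scores, one per hidden unit
```

A greedy loop would then repeatedly drop the lowest-scoring head or hidden unit, re-evaluate accuracy on a validation split, and stop when the budget is reached; treating MHA and FFN separately lets each module use its own pruning strategy and ratio.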
In the field of natural language processing (NLP), various pre-trained language models have emerged in recent years, with question answering systems gaining significant attention. However, as algorithms, data, and computing power advance, models have grown ever larger with an increasing number of parameters, so training has become more costly and less efficient. To enhance the efficiency and accuracy of training while reducing model volume, this paper proposes PAL-BERT, a first-order pruning model based on ALBERT that is tailored to the characteristics of question-answering (QA) systems and language models. First, a first-order network pruning method based on the ALBERT model is designed, forming the PAL-BERT model. Then, a parameter optimization strategy for PAL-BERT is formulated, and the Mish function is used as the activation function instead of ReLU to improve performance. Finally, comparison experiments with the traditional deep learning models TextCNN and BiLSTM confirm that PAL-BERT is a pruning-based model compression method that significantly reduces training time and improves training efficiency. Compared with traditional models, PAL-BERT significantly improves performance on the NLP task. Funding: Supported by the Sichuan Science and Technology Program (2021YFQ0003, 2023YFSY0026, 2023YFH0004).
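For illustration, the snippet below shows the two ingredients named in the abstract in their generic forms: the Mish activation and a first-order (weight times gradient) importance score of the kind commonly used for pruning. The toy linear layer and the median-based mask are placeholders; PAL-BERT's actual pruning granularity and schedule are not described in the abstract.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Mish(nn.Module):
    """Mish activation: x * tanh(softplus(x)), used in place of ReLU."""
    def forward(self, x):
        return x * torch.tanh(F.softplus(x))

def first_order_importance(param):
    """First-order (Taylor) importance: |weight * gradient| after a backward pass."""
    assert param.grad is not None, "call loss.backward() first"
    return (param * param.grad).abs()

# toy example on a single linear layer standing in for an ALBERT sub-layer
layer = nn.Linear(8, 4)
x, target = torch.randn(16, 8), torch.randn(16, 4)
loss = F.mse_loss(Mish()(layer(x)), target)
loss.backward()
scores = first_order_importance(layer.weight)
keep_mask = scores >= scores.flatten().median()   # keep roughly the top half
```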
A two-stage deep learning algorithm for detecting and recognizing spray-printed codes on can bottoms is proposed to address the small character areas and fast production line speeds involved in can bottom code recognition. In the code detection stage, the Differentiable Binarization Network is used as the backbone, combined with an Attention and Dilation Convolutions Path Aggregation Network feature fusion structure to enhance detection performance. For text recognition, the Scene Visual Text Recognition network is trained end to end, which alleviates recognition errors caused by image color distortion due to lighting variations and background noise. In addition, model pruning and quantization are used to reduce the number of model parameters so that the model meets deployment requirements in resource-constrained environments. A comparative experiment was conducted on a dataset of can bottom codes collected on site, and a transfer experiment was conducted on a dataset of packaging box production dates. The results show that the proposed algorithm can locate the codes of cans at different positions on the roller conveyor and accurately identify the code numbers at high production line speeds. The Hmean of code detection is 97.32%, and the accuracy of code recognition is 98.21%, verifying that the proposed algorithm achieves high accuracy in code detection and recognition.
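As one hedged example of the compression step, the following applies post-training dynamic quantization to a small stand-in network using PyTorch's built-in utility; the actual detection and recognition networks, their pruning procedure, and the bit widths chosen in the paper may differ.

```python
import torch
import torch.nn as nn

# toy recognition head standing in for the code-recognition network
model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 64))

# dynamic quantization: weights stored as int8, activations quantized on the fly
quantized = torch.ao.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

print(sum(p.numel() for p in model.parameters()))  # float32 parameter count
x = torch.randn(1, 512)
print(quantized(x).shape)                           # quantized model still runs
```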
In response to the challenges posed by complex backgrounds, diverse target angles, and numerous small targets in remote sensing images, as well as the high resource consumption that hinders model deployment, we propose an enhanced, lightweight you only look once version 8 small (YOLOv8s) detection algorithm. For the network improvements, we first replace traditional horizontal boxes with rotated boxes for target detection, effectively addressing the difficulty of feature extraction caused by varying target angles. Second, we design a module integrating convolutional neural network (CNN) and Transformer components to replace specific C2f modules in the backbone, thereby expanding the model's receptive field and enhancing feature extraction in complex backgrounds. Finally, we introduce a feature calibration structure to mitigate potential feature mismatches during feature fusion. For model compression, we employ a lightweight channel pruning technique based on localized mean average precision (LMAP) to eliminate redundancy in the enhanced model. Although this pruning causes some loss of detection accuracy, it effectively reduces the number of parameters, the computational load, and the model size. We then apply channel-level knowledge distillation to recover the accuracy of the pruned model, further enhancing detection performance. Experimental results indicate that the enhanced algorithm achieves a 6.1% increase in mAP50 compared with YOLOv8s while reducing parameters, computational load, and model size by 57.7%, 28.8%, and 52.3%, respectively. Funding: Supported in part by the National Natural Science Foundation of China (Nos. 52472334, U2368204).
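The distillation step can be pictured with the generic channel-wise distillation loss below, which matches each channel's spatial distribution between a teacher and the pruned student. The feature shapes, the temperature, and the assumption that channel counts already match (for example, via a 1x1 adapter) are illustrative, and the paper's LMAP pruning criterion is not shown.

```python
import torch
import torch.nn.functional as F

def channel_distill_loss(student_feat, teacher_feat, tau=4.0):
    """Channel-level distillation: KL divergence between per-channel spatial softmaxes."""
    n, c, _, _ = student_feat.shape
    s = F.log_softmax(student_feat.reshape(n, c, -1) / tau, dim=-1)
    t = F.softmax(teacher_feat.reshape(n, c, -1) / tau, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * tau * tau

# toy feature maps from one detection-neck level (same shape assumed after pruning)
student = torch.randn(2, 64, 20, 20, requires_grad=True)
teacher = torch.randn(2, 64, 20, 20)
loss = channel_distill_loss(student, teacher)
loss.backward()
```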
Deep neural network (DNN) models have achieved remarkable performance across diverse tasks, leading to widespread commercial adoption. However, training high-accuracy models demands extensive data, substantial computational resources, and significant time investment, making them valuable assets vulnerable to unauthorized exploitation. To address this issue, this paper proposes an intellectual property (IP) protection framework for DNN models based on feature layer selection and hyper-chaotic mapping. First, a sensitivity-based importance evaluation algorithm is used to identify the key feature layers for encryption, effectively protecting the core components of the model. Next, the L1 regularization criterion is applied to further select high-weight features that significantly impact the model's performance, ensuring that the encryption process minimizes performance loss. Finally, a dual-layer encryption mechanism is designed that introduces perturbations into the weight values and uses hyper-chaotic mapping to disrupt channel information, further enhancing the model's security. Experimental results demonstrate that encrypting only a small subset of parameters effectively reduces model accuracy to random-guessing levels while ensuring full recoverability. The scheme exhibits strong robustness against model pruning and fine-tuning attacks and maintains consistent performance across multiple datasets, providing an efficient and practical solution for authorization-based DNN IP protection. Funding: Supported in part by the National Natural Science Foundation of China under Grant No. 62172280, in part by the Key Scientific Research Projects of Colleges and Universities in Henan Province, China, under Grant No. 23A520006, and in part by the Henan Provincial Science and Technology Research Project under Grant No. 222102210199.
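The channel-scrambling idea can be sketched as follows: a secret key seeds a chaotic trajectory, the trajectory is sorted to obtain a permutation, and the permutation shuffles a convolution's output channels so the model is unusable without the key. A simple 1-D logistic map is used here purely as a stand-in for the paper's hyper-chaotic system, and the weight-perturbation half of the dual-layer mechanism is omitted.

```python
import numpy as np
import torch
import torch.nn as nn

def chaotic_permutation(n, key=0.6137, r=3.99, burn_in=100):
    """Derive a permutation of n channel indices from a logistic-map trajectory."""
    x, trajectory = key, []
    for i in range(burn_in + n):
        x = r * x * (1.0 - x)
        if i >= burn_in:
            trajectory.append(x)
    return torch.as_tensor(np.argsort(trajectory))

def scramble_channels(conv, key=0.6137):
    """Permute a conv layer's output channels in place; only the key holder can undo it."""
    perm = chaotic_permutation(conv.out_channels, key)
    with torch.no_grad():
        conv.weight.copy_(conv.weight[perm])
        if conv.bias is not None:
            conv.bias.copy_(conv.bias[perm])
    return perm

conv = nn.Conv2d(3, 16, kernel_size=3)
perm = scramble_channels(conv)      # model now misbehaves without the key
inverse = torch.argsort(perm)       # the inverse permutation restores the original order
```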
Diseases in tea trees can result in significant losses in both the quality and quantity of tea production. Regular monitoring can help to prevent the occurrence of large-scale diseases in tea plantations. However, existing methods face challenges such as a high number of parameters and low recognition accuracy, which hinders their application in tea plantation monitoring equipment. To address these challenges, this paper presents a lightweight I-MobileNetV2 model for identifying diseases in tea leaves. The proposed method first embeds a Coordinate Attention (CA) module into the original MobileNetV2 network, enabling the model to locate disease regions accurately. Second, a Multi-branch Parallel Convolution (MPC) module is employed to extract disease features across multiple scales, improving the model's adaptability to different disease scales. Finally, AutoML for Model Compression (AMC) is used to compress the model and reduce computational complexity. Experimental results indicate that the proposed algorithm attains an average accuracy of 96.12% on our self-built tea leaf disease dataset, surpassing the original MobileNetV2 by 1.91%. Furthermore, the number of model parameters has been reduced by 40%, making the model more suitable for practical application in tea plantation environments. Funding: Supported by the National Key Research and Development Program (No. 2016YFD0201305-07), the Guizhou Provincial Basic Research Program (Natural Science) (No. ZK[2023]060), and the Open Fund Project of the Semiconductor Power Device Reliability Engineering Center of the Ministry of Education (No. ERCMEKFJJ2019-06).
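As context for the attention component, the following is a compact Coordinate Attention block in the spirit of the CA module the paper embeds into MobileNetV2; the reduction ratio, activation choice, and pooling details are assumptions, and the MPC module and AMC compression step are not shown.

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Minimal Coordinate Attention block: separate pooling along height and width."""
    def __init__(self, channels, reduction=32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        n, c, h, w = x.shape
        x_h = x.mean(dim=3, keepdim=True)                        # (n, c, h, 1)
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)    # (n, c, w, 1)
        y = self.act(self.bn(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                      # (n, c, h, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # (n, c, 1, w)
        return x * a_h * a_w

feat = torch.randn(1, 64, 32, 32)
out = CoordinateAttention(64)(feat)   # same shape as the input
```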
Age-related Macular Degeneration (AMD) and Diabetic Macular Edema (DME) are two common retinal diseases in elderly people that may ultimately cause irreversible blindness. Timely and accurate diagnosis is essential for the treatment of these diseases. In recent years, computer-aided diagnosis (CAD) has been deeply investigated and effectively used for rapid and early diagnosis. In this paper, we propose a CAD method that uses a vision transformer to analyze optical coherence tomography (OCT) images and automatically discriminate AMD, DME, and normal eyes. A classification accuracy of 99.69% was achieved. After model pruning, the recognition time reached 0.010 s and the classification accuracy did not drop. Compared with convolutional neural network (CNN) image classification models (VGG16, ResNet50, DenseNet121, and EfficientNet), the pruned vision transformer exhibited better recognition ability. The results show that the vision transformer is an improved alternative for diagnosing retinal diseases more accurately. Funding: This work was supported by the Science and Technology Innovation Project of the Shanghai Science and Technology Commission (19441905800), the National Natural Science Foundation of China (62175156, 81827807, 8210041176, 82101177, 61675134), the Project of the State Key Laboratory of Ophthalmology, Optometry and Visual Science, Wenzhou Medical University (K181002), and the Key R&D Program Projects in Zhejiang Province (2019C03045).
Deep learning technology is widely used in computer vision. Generally, a large amount of data is used to train model weights in deep learning so as to obtain a model with higher accuracy. However, massive data and complex model structures require more computing resources. Since in many application scenarios people can only carry and use mobile and portable devices, neural networks face limitations in computing resources, size, and power consumption. Therefore, this study uses the efficient lightweight MobileNet model as the base network for optimization. First, the accuracy of the MobileNet model is improved by adding the convolutional block attention module (CBAM) and dilated convolution. Then, the MobileNet model is compressed using pruning and weight quantization algorithms based on weight magnitude. Afterwards, a garbage classification dataset is created using Python crawlers and data augmentation. Based on the above model optimization strategy, a garbage classification mobile application is deployed on mobile phones and Raspberry Pi devices, making the garbage classification task more convenient to complete.
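A minimal sketch of the compression step, assuming a PyTorch convolutional model: magnitude-based unstructured pruning via torch.nn.utils.prune followed by a simple simulated 8-bit weight quantization. The toy block, the 50% sparsity, and the per-tensor linear quantizer are illustrative stand-ins for the study's exact settings.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# toy stand-in for a MobileNet block (pointwise, depthwise, pointwise convolutions)
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1, groups=32), nn.ReLU(),   # depthwise conv
    nn.Conv2d(32, 64, 1),
)

# magnitude-based pruning: zero out the 50% smallest weights of each conv layer
for module in model.modules():
    if isinstance(module, nn.Conv2d):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")            # make the zeros permanent

def quantize_weights(w, bits=8):
    """Simulated per-tensor linear quantization: round to a bits-wide integer grid."""
    scale = w.abs().max() / (2 ** (bits - 1) - 1)
    q = torch.round(w / scale).clamp(-(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return q * scale                              # dequantized values for simulation

with torch.no_grad():
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            module.weight.copy_(quantize_weights(module.weight))
```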