Weather phenomenon recognition plays an important role in meteorology. Weather radars and weather sensors are now widely used for weather recognition. However, given the high cost of deploying and maintaining these devices, it is difficult to apply them to intensive weather phenomenon recognition. Moreover, advanced machine learning models such as Convolutional Neural Networks (CNNs) have shown considerable promise in meteorology, but they require intensive computation and large amounts of memory, which makes them difficult to use in practice. Lightweight models are often used to address such problems, but they frequently incur significant performance losses. To this end, after studying a large number of lightweight models in depth and summarizing their shortcomings, we propose a novel lightweight CNN model constructed from new building blocks. The experimental results show that the proposed model achieves performance comparable to the mainstream non-lightweight model while reducing memory consumption by a factor of 25. This memory reduction is also better than that of existing lightweight models.
As the use of facial attributes continues to expand, research into facial age estimation is also developing. Because face images are easily affected by factors such as illumination and occlusion, estimating age from faces is a challenging task. In view of the complexity of the environment and the limited computing ability of devices, this paper proposes a face age estimation algorithm based on a lightweight convolutional neural network. To improve face age estimation based on the Soft Stagewise Regression Network (SSR-Net) and facial images, this paper employs the Center Symmetric Local Binary Pattern (CSLBP) method to obtain a feature image and then combines the face image and the feature image as the network input. Adding feature images to the convolutional neural network can improve accuracy as well as increase the robustness of the network model. The experimental results on the IMDB-WIKI and MORPH 2 datasets show that the proposed lightweight convolutional neural network method reduces model complexity and increases the accuracy of face age estimation.
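The CSLBP feature image mentioned in the abstract above can be illustrated with a minimal sketch. It follows the usual center-symmetric definition, in which each of the four opposite neighbour pairs around a pixel contributes one bit. The function name `cslbp`, the `threshold` parameter, and the choice to skip border pixels are assumptions made for illustration; the abstract does not say how the paper computes the feature image or how it is combined with the face image.

```python
def cslbp(image, threshold=0):
    """Return CS-LBP codes (0..15) for the interior pixels of a 2D grayscale image.

    `image` is a list of rows of numbers. Border pixels are skipped, so the
    result is two rows and two columns smaller than the input (an assumption).
    """
    # Eight neighbours in circular order, as (row offset, column offset).
    # ring[i] and ring[i + 4] lie on opposite sides of the centre pixel.
    ring = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    rows, cols = len(image), len(image[0])
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            code = 0
            for bit in range(4):
                dr1, dc1 = ring[bit]
                dr2, dc2 = ring[bit + 4]
                # One bit per centre-symmetric pair of neighbours.
                if image[r + dr1][c + dc1] - image[r + dr2][c + dc2] > threshold:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes


# Toy 3x3 patch: bright top row, dark bottom row.
patch = [[9, 9, 9],
         [0, 5, 0],
         [1, 1, 1]]
print(cslbp(patch))  # [[7]]
```

Because only four comparisons are made per pixel, the code map has 16 possible values rather than the 256 of the classic LBP, which keeps the extra input channel compact.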
In the field of agricultural information, the identification and prediction of rice leaf disease has long been a focus of research, and deep learning (DL) is currently a hot research topic in pattern recognition. Developing high-efficiency, high-quality, and low-cost automatic identification methods for rice diseases that can replace human inspection is an important technical means of dealing with the current situation. This paper focuses on the very large number of parameters in Convolutional Neural Network (CNN) models and proposes a recognition model that combines a multi-scale convolution module with a neural network model based on Visual Geometry Group (VGG). The accuracy and loss on the training set and the test set are used to evaluate the performance of the model. The test accuracy of this model is 97.1%, an increase of 5.87% over VGG. Furthermore, the memory requirement is 26.1M, only 1.6% of that of VGG. Experimental results show that this model performs better in terms of accuracy, recognition speed, and memory size.
Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 employs a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To maintain the trade-off between complexity and model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, the depthwise dilated convolution in the DDSC layer effectively expands the field of view of the filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses a parallel multi-resolution branch architecture to process the input feature map and extract multi-scale feature information from the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining the network performance compared to the MobileNetV1 baseline.
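The parameter saving that depthwise separable convolution provides, and the wider field of view that dilation adds at no parameter cost, can be checked with simple arithmetic. The sketch below uses the standard textbook formulas with biases ignored; the function names and the example layer sizes (3x3 kernel, 64 input channels, 128 output channels, dilation 2) are hypothetical and are not taken from the paper.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def effective_kernel_size(k, dilation):
    """Side length of the input region covered by a dilated k x k filter."""
    return k + (k - 1) * (dilation - 1)


# Hypothetical layer, chosen only to make the arithmetic concrete.
k, c_in, c_out = 3, 64, 128
print(standard_conv_params(k, c_in, c_out))        # 73728
print(depthwise_separable_params(k, c_in, c_out))  # 8768
print(effective_kernel_size(k, dilation=2))        # 5
```

Dilation leaves the weight count unchanged, so a dilated depthwise filter sees a 5x5 region here while storing only 3x3 weights per channel.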
Detecting small forest fire targets in unmanned aerial vehicle (UAV) images is difficult, as flames typically cover only a very limited portion of the visual scene. This study proposes the Context-guided Compact Lightweight Network (CCLNet), an end-to-end lightweight model designed to detect small forest fire targets while ensuring efficient inference on devices with constrained computational resources. CCLNet employs a three-stage network architecture with three key modules. The C3F-Convolutional Gated Linear Unit (C3F-CGLU) performs selective local feature extraction while preserving fine-grained, high-frequency flame details. The Context-Guided Feature Fusion Module (CGFM) replaces plain concatenation with triplet-attention interactions to emphasize subtle flame patterns. Lightweight Shared Convolution with Separated Batch Normalization Detection (LSCSBD) reduces parameters through separated batch normalization while maintaining scale-specific statistics. We build TF-11K, an 11,139-image dataset combining 9139 self-collected UAV images from subtropical forests and 2000 re-annotated frames from the FLAME dataset. On TF-11K, CCLNet attains 85.8% mean Average Precision (mAP)@0.5, 45.5% mAP@[0.5:0.95], 87.4% precision, and 79.1% recall with 2.21 M parameters and 5.7 giga floating-point operations (GFLOPs). The ablation study confirms that each module contributes to both accuracy and efficiency. Cross-dataset evaluation on DFS yields 77.5% mAP@0.5 and 42.3% mAP@[0.5:0.95], indicating good generalization to unseen scenes. These results suggest that CCLNet offers a practical balance between accuracy and speed for small-target forest fire monitoring with UAVs.
Semantic segmentation of eye images is a complex task with important applications in human–computer interaction, cognitive science, and neuroscience. Achieving real-time, accurate, and robust segmentation is crucial for computationally limited portable devices such as those used for augmented reality and virtual reality. With the rapid advancements in deep learning, many network models have been developed specifically for eye image segmentation. Some methods divide the segmentation process into multiple stages to miniaturize the model parameters and rely on post-processing techniques to improve segmentation accuracy. These approaches significantly increase the inference time. Other networks adopt more complex encoding and decoding modules to achieve end-to-end output, which requires substantial computation. Therefore, balancing the model's size, accuracy, and computational complexity is essential. To address these challenges, we propose a lightweight asymmetric UNet architecture and a projection loss function. We utilize ResNet-3 layer blocks to enhance feature extraction efficiency in the encoding stage. In the decoding stage, we employ regular convolutions and skip connections to upscale the feature maps from the latent space to the original image size, balancing the model size and segmentation accuracy. In addition, we leverage the geometric features of the eye region and design a projection loss function to further improve the segmentation accuracy without adding any inference cost. We validate our approach on the OpenEDS2019 dataset for virtual reality and achieve state-of-the-art performance with 95.33% mean intersection over union (mIoU). Our model has only 0.63M parameters and runs at 350 FPS, which are 68% and 200% of the corresponding figures for the state-of-the-art model RITNet, respectively.
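The mIoU metric reported in the abstract above can be computed as in the following sketch, which averages the per-class intersection over union of two label maps. The function name `mean_iou` and the decision to skip classes that appear in neither map are assumptions made for illustration; the OpenEDS2019 evaluation protocol may handle such cases differently.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union between two 2D label maps (lists of rows).

    Classes that occur in neither map are skipped rather than counted as 0 or 1
    (an assumption). Returns 0.0 if no class occurs at all.
    """
    ious = []
    for cls in range(num_classes):
        intersection = 0
        union = 0
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                in_pred, in_target = (p == cls), (t == cls)
                intersection += int(in_pred and in_target)
                union += int(in_pred or in_target)
        if union > 0:
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x2 example with two classes.
pred = [[0, 1],
        [1, 1]]
target = [[0, 1],
          [0, 1]]
print(mean_iou(pred, target, num_classes=2))
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.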
Memristor-based neuromorphic computing shows great potential for high-speed and high-throughput signal processing applications, such as electroencephalogram (EEG) signal processing. Nonetheless, the size of one-transistor one-resistor (1T1R) memristor arrays is limited by the non-ideality of the devices, which prevents the hardware implementation of large and complex networks. In this work, we propose the depthwise separable convolution and bidirectional gated recurrent unit (DSC-BiGRU) network, a lightweight and highly robust hybrid neural network based on 1T1R arrays that enables efficient processing of EEG signals in the temporal, frequency, and spatial domains by hybridizing DSC and BiGRU blocks. The network size is reduced and the network robustness is improved while the classification accuracy is maintained. In the simulation, the measured non-idealities of the 1T1R array are introduced into the network through statistical analysis. Compared with traditional convolutional networks, the network parameters are reduced by 95% and the classification accuracy is improved by 21% at a 95% array yield rate and 5% tolerable error. This work demonstrates that lightweight and highly robust networks based on memristor arrays hold great promise for applications that rely on low consumption and high efficiency.
Deep kernel mapping support vector machines have achieved good results in numerous tasks by mapping features from a low-dimensional space to a high-dimensional space and then using support vector machines for classification. However, the deep kernel mapping support vector machine does not take into account the connections between spaces of different dimensions, and it increases the number of model parameters. To further improve the recognition capability of deep kernel mapping support vector machines while reducing the number of model parameters, this paper proposes a framework of Lightweight Deep Convolutional Cross-Connected Kernel Mapping Support Vector Machines (LC-CKMSVM). The framework consists of a feature extraction module and a classification module. The feature extraction module first maps the data from a low-dimensional to a high-dimensional space by fusing the representations of different dimensional spaces through cross-connections; it then uses depthwise separable convolution to replace part of the original convolution to reduce the number of parameters in the module. The classification module uses a soft-margin support vector machine for classification. The results on 6 different visual datasets show that LC-CKMSVM obtains better classification accuracy in most cases than the other five models.
The tradeoff between efficiency and model size of the convolutional neural network (CNN) is an essential issue when applying CNN-based algorithms to diverse real-world tasks. Although deep learning-based methods have achieved significant improvements in image super-resolution (SR), current CNN-based techniques mostly involve a massive number of parameters and high computational complexity, limiting their practical applications. In this paper, we present a fast and lightweight framework, named weighted multi-scale residual network (WMRN), for a better tradeoff between SR performance and computational efficiency. With the modified residual structure, depthwise separable convolutions (DS Convs) are employed to improve the efficiency of convolutional operations. Furthermore, several weighted multi-scale residual blocks (WMRBs) are stacked to enhance the multi-scale representation capability. In the reconstruction subnetwork, a group of Conv layers is introduced to filter the feature maps and reconstruct the final high-quality image. Extensive experiments were conducted to evaluate the proposed model, and the comparative results with several state-of-the-art algorithms demonstrate the effectiveness of WMRN.
YOLOv3, which aims to improve on the detection speed and accuracy of current detection models, predicts the center coordinates (x, y) of the bounding box and its length and width through multiple layers of a VGG Convolutional Neural Network (VGG-CNN), and uses the lightweight Darknet framework to process images at a faster speed. More specifically, our model removes part of YOLOv3's complex and computationally intensive procedures and improves its algorithms to maintain the efficiency and accuracy of object detection. With this method, it achieves higher quality on mass object detection tasks with fewer detection errors.
Real-time detection is difficult to achieve in road surface damage detection. This paper proposes an improved lightweight model based on you only look once version 5 (YOLOv5). First, the convolutional neural network (CNN) + ghosting bottleneck (G_bneck) architecture is used to reduce redundant feature maps. Next, the original upsampling algorithm is upgraded to content-aware reassembly of features (CARAFE), which increases the receptive field. Finally, the spatial pyramid pooling fast (SPPF) module is replaced with the basic receptive field block (Basic RFB) pooling module, and dilated convolution is added. Comparative experiments show that the number of parameters and the model size of the improved algorithm are reduced by nearly half compared to YOLOv5s. The frame rate (frames per second, FPS) is increased by 3.25 times. The mean average precision (mAP@0.5:0.95) is 8% to 17% higher than that of other lightweight algorithms.
In viticulture, there is an increasing demand for automatic winter grapevine pruning devices, for which detecting the pruning location in vineyard images is a necessary task that can be automated with computer vision methods. In this study, a novel 2D grapevine winter pruning location detection method was proposed for automatic winter pruning with a Y-shaped cultivation system. The method consists of four steps. First, the vineyard image was segmented by thresholding the two times Red minus Green minus Blue (2R−G−B) channel and the S channel. Second, the grapevine skeleton was extracted by the Improved Enhanced Parallel Thinning Algorithm (IEPTA). Third, the structure of each grapevine was found by judging the angle and distance relationships between branches. Fourth, bounding boxes were obtained from these grapevines, and a pre-trained MobileNetV3_small×0.75 was used to classify each bounding box and finally find the pruning location. In the detection experiment, the method achieved a precision of 98.8% and a recall of 92.3% for bud detection, an accuracy of 83.4% for pruning location detection, and a total time of 0.423 s. The results indicate that the proposed 2D pruning location detection method has decent robustness as well as high precision and could guide automatic devices to perform winter pruning efficiently.
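The first step of the pruning method above, thresholding the 2R−G−B channel, can be sketched as follows. Only the 2R−G−B index is shown; the S-channel test and the way the two are combined are omitted because the abstract does not specify them. The function name `two_r_minus_g_minus_b_mask` and the threshold value 100 are hypothetical.

```python
def two_r_minus_g_minus_b_mask(rgb_image, threshold):
    """Binary mask that is 1 where 2R - G - B exceeds `threshold`.

    `rgb_image` is a list of rows, each a list of (R, G, B) tuples with
    values in 0..255. The threshold is a free parameter (an assumption).
    """
    mask = []
    for row in rgb_image:
        mask_row = []
        for r, g, b in row:
            index = 2 * r - g - b  # large for reddish or brownish pixels
            mask_row.append(1 if index > threshold else 0)
        mask.append(mask_row)
    return mask


# Toy 1x3 image: a reddish pixel, a greenish pixel, and a gray pixel.
image = [[(200, 50, 50), (50, 200, 50), (128, 128, 128)]]
print(two_r_minus_g_minus_b_mask(image, threshold=100))  # [[1, 0, 0]]
```

Gray pixels score exactly zero on this index, so any positive threshold excludes neutral backgrounds.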
Funding (weather phenomenon recognition paper): This paper is supported by the following funds: National Key R&D Program of China (2018YFF01010100); National Natural Science Foundation of China (61672064); Basic Research Program of Qinghai Province under Grant No. 2020-ZJ-709; Advanced Information Network Beijing Laboratory (PXM2019_014204_500029).
Funding (facial age estimation paper): This work was funded by the Foundation of Liaoning Educational Committee under Grant No. 2019LNJC03.
Funding (rice leaf disease recognition paper): Supported by National key research and development program sub-topics [2018YFF0213606-03 (Mu Y., Hu T.L., Gong H., Li S.J. and Sun Y.H.) http://www.most.gov.cn]; Jilin Province Science and Technology Development Plan focuses on research and development projects [20200402006NC (Mu Y., Hu T.L., Gong H. and Li S.J.) http://kjt.jl.gov.cn]; Science and technology support project for key industries in southern Xinjiang [2018DB001 (Gong H. and Li S.J.) http://kjj.xjbt.gov.cn]; Key technology R&D project of Changchun Science and Technology Bureau of Jilin Province [21ZGN29 (Mu Y., Bao H.P., Wang X.B.) http://kjj.changchun.gov.cn].
Funding (CCLNet forest fire detection paper): Funded by the Natural Science Foundation of Hunan Province (Grant No. 2025JJ80352) and the National Natural Science Foundation Project of China (Grant No. 32271879).
Funding (eye image segmentation paper): Supported by the HFIPS Director's Foundation (YZJJ202207-TS); the National Natural Science Foundation of China (82371931); the Natural Science Foundation of Anhui Province (2008085MC69); the Natural Science Foundation of Hefei City (2021033); the General Scientific Research Project of Anhui Provincial Health Commission (AHWJ2021b150); the Collaborative Innovation Program of Hefei Science Center, CAS (2021HSC-CIP013); and the Anhui Province Key Research and Development Project (202204295107020004).
Funding (DSC-BiGRU memristor network paper): Project supported by the National Key Research and Development Program of China (Grant No. 2019YFB2205102) and the National Natural Science Foundation of China (Grant Nos. 61974164, 62074166, 61804181, 62004219, 62004220, and 62104256).
Funding (LC-CKMSVM paper): This work is supported by the National Natural Science Foundation of China (61806013, 61876010, 61906005, 62166002); General Project of Science and Technology Plan of Beijing Municipal Education Commission (KM202110005028); Project of Interdisciplinary Research Institute of Beijing University of Technology (2021020101); International Research Cooperation Seed Fund of Beijing University of Technology (2021A01).
Funding (WMRN super-resolution paper): The National Natural Science Foundation of China (61772149, 61866009, 61762028, U1701267, 61702169); Guangxi Science and Technology Project (2019GXNSFFA245014, ZY20198016, AD18281079, AD18216004); the Natural Science Foundation of Hunan Province (2020JJ3014); Guangxi Colleges and Universities Key Laboratory of Intelligent Processing of Computer Images and Graphics (GIIP202001).
Abstract: The tradeoff between efficiency and model size of the convolutional neural network (CNN) is an essential issue for applications of CNN-based algorithms to diverse real-world tasks. Although deep learning-based methods have achieved significant improvements in image super-resolution (SR), current CNN-based techniques mostly involve massive numbers of parameters and high computational complexity, which limits their practical applications. In this paper, we present a fast and lightweight framework, named weighted multi-scale residual network (WMRN), for a better tradeoff between SR performance and computational efficiency. With the modified residual structure, depthwise separable convolutions (DS Convs) are employed to improve the efficiency of convolutional operations. Furthermore, several weighted multi-scale residual blocks (WMRBs) are stacked to enhance the multi-scale representation capability. In the reconstruction subnetwork, a group of Conv layers is introduced to filter feature maps and reconstruct the final high-quality image. Extensive experiments were conducted to evaluate the proposed model, and the comparative results with several state-of-the-art algorithms demonstrate the effectiveness of WMRN.
Abstract: YOLOv3 aims to improve on the detection speed and accuracy of existing detection models. It predicts the center coordinates (x, y) of the bounding box, together with its length and width, through multiple layers of a VGG Convolutional Neural Network (VGG-CNN), and uses the lightweight Darknet framework to process images at a faster speed. More specifically, our model removes part of YOLOv3's complex and computationally intensive procedures and improves its algorithms to maintain the efficiency and accuracy of object detection. With this method, it achieves higher quality on mass object detection tasks, with fewer detection errors.
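As a rough illustration of how a bounding box's center and size are predicted, the sketch below decodes raw network outputs using the parameterization from the original YOLOv3 formulation (sigmoid offsets inside a grid cell, exponential scaling of an anchor prior). It is not confirmed to be the variant used by the authors' reduced model; the function name and the sample values are assumptions.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    """Decode raw outputs (tx, ty, tw, th) into a box in grid-cell units.

    (cx, cy) is the top-left corner of the grid cell and (pw, ph) is the
    anchor prior's width and height.
    """
    bx = sigmoid(tx) + cx      # center x stays inside the cell
    by = sigmoid(ty) + cy      # center y stays inside the cell
    bw = pw * math.exp(tw)     # width scales the anchor prior
    bh = ph * math.exp(th)     # height scales the anchor prior
    return bx, by, bw, bh


# Hypothetical values: zero outputs place the center in the middle of cell (3, 4)
# and leave the anchor size unchanged.
box = decode_box(0.0, 0.0, 0.0, 0.0, cx=3, cy=4, pw=2.0, ph=5.0)
print(box)
```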
Funding: Supported by the Shanghai Sailing Program, China (No. 20YF1447600), the Research Start-Up Project of Shanghai Institute of Technology (No. YJ2021-60), the Collaborative Innovation Project of Shanghai Institute of Technology (No. XTCX2020-12), and the Science and Technology Talent Development Fund for Young and Middle-Aged Teachers at Shanghai Institute of Technology (No. ZQ2022-6).
Abstract: Real-time detection is difficult in road surface damage detection. This paper proposes an improved lightweight model based on You Only Look Once version 5 (YOLOv5). First, we use a convolutional neural network (CNN) + ghost bottleneck (G_bneck) architecture to reduce redundant feature maps. Next, we replace the original upsampling algorithm with content-aware reassembly of features (CARAFE), which increases the receptive field. Finally, we replace the spatial pyramid pooling fast (SPPF) module with the basic receptive field block (Basic RFB) pooling module and add dilated convolution. Comparative experiments show that the number of parameters and the model size of the improved algorithm are reduced by nearly half compared with YOLOv5s. The frames per second (FPS) increase by 3.25 times. The mean average precision (mAP@0.5:0.95) increases by 8% to 17% compared with other lightweight algorithms.
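The effect of dilated convolution on the receptive field can be sketched with the standard formula for the effective kernel size, k_eff = k + (k − 1)(d − 1). The dilation rates below are assumptions for illustration; the abstract does not state which rates the Basic RFB module uses.

```python
def effective_kernel_size(k: int, dilation: int) -> int:
    """Span covered by a k-tap kernel whose taps are `dilation` pixels apart."""
    return k + (k - 1) * (dilation - 1)


def stacked_receptive_field(layers):
    """Receptive field of stride-1 convolutions applied in sequence.

    `layers` is a list of (kernel_size, dilation) pairs.
    """
    rf = 1
    for k, d in layers:
        rf += effective_kernel_size(k, d) - 1
    return rf


# Hypothetical stack: three 3x3 layers with dilation rates 1, 2 and 3.
print(effective_kernel_size(3, 2))                       # 5
print(stacked_receptive_field([(3, 1), (3, 2), (3, 3)])) # 13
```

The same three layers without dilation would cover only 7 pixels, which is why dilation enlarges the receptive field at no extra parameter cost.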
Funding: This work was financially supported by the Basic Public Welfare Research Project of Zhejiang Province (Grant No. LGN20E050007).
Abstract: In viticulture, there is an increasing demand for automatic winter grapevine pruning devices, for which detecting the pruning location in vineyard images is a necessary task that can be automated with computer vision methods. In this study, a novel 2D grapevine winter pruning location detection method was proposed for automatic winter pruning with a Y-shaped cultivation system. The method can be divided into four steps. First, the vineyard image was segmented by thresholding the two-times-red-minus-green-minus-blue (2R−G−B) channel and the S channel. Second, the grapevine skeleton was extracted with the Improved Enhanced Parallel Thinning Algorithm (IEPTA). Third, the structure of each grapevine was found by judging the angle and distance relationships between branches. Fourth, bounding boxes were obtained from these grapevines, and a pre-trained MobileNetV3_small×0.75 was used to classify each bounding box and find the pruning location. In the detection experiment, the method achieved a precision of 98.8% and a recall of 92.3% for bud detection, an accuracy of 83.4% for pruning location detection, and a total time of 0.423 s. These results indicate that the proposed 2D pruning location detection method has decent robustness as well as high precision and could guide automatic devices to winter prune efficiently.
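The first step, thresholding the 2R−G−B channel, can be sketched as follows. This is a minimal illustration on a nested list of RGB tuples; the threshold value and the sample pixels are assumptions, and the S-channel threshold that the abstract also mentions is omitted.

```python
def excess_red(pixel):
    """Return 2R - G - B for an (R, G, B) pixel."""
    r, g, b = pixel
    return 2 * r - g - b


def segment(image, threshold):
    """Binary mask: 1 where 2R - G - B exceeds the threshold, else 0."""
    return [[1 if excess_red(px) > threshold else 0 for px in row]
            for row in image]


# Hypothetical 1 x 2 image: a reddish-brown pixel and a bluish-grey pixel.
image = [[(200, 100, 90), (50, 60, 70)]]
mask = segment(image, threshold=100)  # threshold chosen for illustration only
print(mask)  # [[1, 0]]
```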