Abstract: Convolutional neural networks (CNNs) are widely used in image classification tasks, but their growing model size and computational cost make them challenging to deploy on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise separable convolution to reduce network complexity. MobileNetV1 uses a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational cost. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To balance complexity against model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, the depthwise dilated convolution in the DDSC layer effectively expands the field of view of the filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that processes the input feature map through parallel multi-resolution branches in order to extract multi-scale feature information from the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining network performance compared to the MobileNetV1 baseline.
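The parameter savings of depthwise separable convolution and the wider field of view gained through dilation can be illustrated with simple arithmetic. The sketch below is not the paper's DDSC layer; it only applies the standard counting formulas, ignores biases, and uses hypothetical function names and layer sizes.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def effective_kernel_size(k, dilation):
    """Side length of the window covered by a k x k kernel at a given dilation rate."""
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    # Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
    print(standard_conv_params(3, 64, 128))
    print(depthwise_separable_params(3, 64, 128))
    print(effective_kernel_size(3, 2))
```

For this hypothetical layer the standard convolution needs 73,728 weights against 8,768 for the separable form, and a dilation rate of 2 lets the 3×3 kernel cover a 5×5 window without adding weights.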
Funding: Supported by the National Key Research and Development Program sub-topics [2018YFF0213606-03 (Mu Y., Hu T.L., Gong H., Li S.J. and Sun Y.H.), http://www.most.gov.cn]; the Jilin Province Science and Technology Development Plan, key research and development projects [20200402006NC (Mu Y., Hu T.L., Gong H. and Li S.J.), http://kjt.jl.gov.cn]; the Science and Technology Support Project for Key Industries in Southern Xinjiang [2018DB001 (Gong H. and Li S.J.), http://kjj.xjbt.gov.cn]; and the Key Technology R&D Project of the Changchun Science and Technology Bureau of Jilin Province [21ZGN29 (Mu Y., Bao H.P., Wang X.B.), http://kjj.changchun.gov.cn].
Abstract: In the field of agricultural information, the identification and prediction of rice leaf disease have long been a focus of research, and deep learning (DL) is currently a hot research topic in pattern recognition. Developing efficient, high-quality, and low-cost automatic identification methods for rice diseases that can replace human inspection is an important technical response to this situation. This paper addresses the very large parameter count of the convolutional neural network (CNN) model and proposes a recognition model that combines a multi-scale convolution module with a neural network model based on the Visual Geometry Group (VGG) network. The accuracy and loss on the training set and the test set are used to evaluate the performance of the model. The test accuracy of this model is 97.1%, an increase of 5.87% over VGG. Furthermore, the memory requirement is 26.1M, only 1.6% of that of VGG. Experimental results show that this model performs better in terms of accuracy, recognition speed, and memory size.
Funding: This work was funded by the Foundation of the Liaoning Educational Committee under Grant No. 2019LNJC03.
Abstract: As the use of facial attributes continues to expand, research into facial age estimation is also developing. Because face images are easily affected by factors such as illumination and occlusion, estimating age from faces is a challenging task. In view of the complexity of the environment and the limited computing ability of devices, this paper proposes a face age estimation algorithm based on a lightweight convolutional neural network. Building on the Soft Stagewise Regression Network (SSR-Net) and facial images, this paper employs the Center Symmetric Local Binary Pattern (CSLBP) method to obtain a feature image and then combines the face image and the feature image as the network input. Adding feature images to the convolutional neural network can improve accuracy as well as the robustness of the network model. Experimental results on the IMDB-WIKI and MORPH 2 datasets show that the proposed lightweight convolutional neural network method reduces model complexity and increases the accuracy of face age estimation.
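The CSLBP feature image mentioned above can be sketched as follows. This is a minimal illustration assuming the common 8-neighbor, radius-1 formulation, in which the four pairs of opposite neighbors are compared and each comparison contributes one bit; the pair ordering, the threshold, and the function name are assumptions rather than details taken from the paper.

```python
def cs_lbp(image, threshold=0.0):
    """Center-symmetric LBP codes (0..15) for the interior pixels of a 2D grayscale image.

    `image` is a list of equal-length rows of numbers; border pixels are skipped.
    """
    height, width = len(image), len(image[0])
    # Four pairs of opposite neighbors as (dy, dx) offsets; the ordering is a convention.
    pairs = [
        ((0, 1), (0, -1)),    # right vs. left
        ((1, 1), (-1, -1)),   # lower-right vs. upper-left
        ((1, 0), (-1, 0)),    # below vs. above
        ((1, -1), (-1, 1)),   # lower-left vs. upper-right
    ]
    codes = [[0] * (width - 2) for _ in range(height - 2)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            code = 0
            for bit, ((dy1, dx1), (dy2, dx2)) in enumerate(pairs):
                difference = image[y + dy1][x + dx1] - image[y + dy2][x + dx2]
                if difference > threshold:
                    code |= 1 << bit
            codes[y - 1][x - 1] = code
    return codes


if __name__ == "__main__":
    gradient = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(cs_lbp(gradient))
```

Because only four comparisons are made per pixel, the code takes 16 possible values instead of the 256 of a standard 8-neighbor LBP, which keeps the extra input channel compact.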
Funding: Supported by the National Natural Science Foundation of China (52001340), the Henan Province Science and Technology Key Research Project (242102110332), and the Henan Province Teaching Reform Project (2022SYJXLX087).
Abstract: To address the slow diagnostic speed, low accuracy, and poor generalization of traditional rolling bearing fault diagnosis methods, we propose a rolling bearing fault diagnosis method based on Markov Transition Field (MTF) image encoding combined with a lightweight convolutional neural network that integrates a Convolutional Block Attention Module (CBAM-LCNN). Specifically, we first use the Markov Transition Field to convert the original one-dimensional vibration signals of rolling bearings into two-dimensional images. Then, we construct a lightweight convolutional neural network incorporating the convolutional block attention module (CBAM-LCNN). Finally, the two-dimensional images obtained from MTF mapping are fed into the CBAM-LCNN network for image feature extraction and fault diagnosis. We validate the effectiveness of the proposed method on the bearing fault datasets from Guangdong University of Petrochemical Technology’s multi-stage centrifugal fan and from Case Western Reserve University. Experimental results show that, compared to other advanced baseline methods, the proposed method offers faster diagnostic speed and higher diagnostic accuracy. In addition, we conducted experiments on the Xi’an Jiaotong University rolling bearing dataset, achieving excellent results in bearing fault diagnosis. These results validate the strong generalization performance of the proposed method. The method presented in this paper not only effectively diagnoses faults in rolling bearings but also serves as a reference for fault diagnosis in other equipment.
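The conversion of a one-dimensional signal into a two-dimensional MTF image can be sketched as below. This is an illustrative version under stated assumptions: samples are quantized into equal-width bins (quantile bins are also common), the transition matrix is first-order and row-normalized, and the function name and bin count are hypothetical. It is not the authors' implementation.

```python
def markov_transition_field(series, n_bins=4):
    """Return an N x N Markov Transition Field for a sequence of N numbers."""
    low, high = min(series), max(series)
    span = (high - low) or 1.0  # avoid division by zero for a constant signal
    # Assumption: equal-width bins; the maximum value is clamped into the last bin.
    bins = [min(int((x - low) / span * n_bins), n_bins - 1) for x in series]

    # Count first-order transitions between the bins of consecutive samples.
    counts = [[0] * n_bins for _ in range(n_bins)]
    for current_bin, next_bin in zip(bins, bins[1:]):
        counts[current_bin][next_bin] += 1

    # Row-normalize the counts into transition probabilities.
    transition = []
    for row in counts:
        total = sum(row)
        transition.append([c / total if total else 0.0 for c in row])

    # Entry (i, j) is the probability of moving from the bin of sample i to the bin of sample j.
    return [[transition[bi][bj] for bj in bins] for bi in bins]


if __name__ == "__main__":
    for row in markov_transition_field([0.0, 1.0, 0.0, 1.0], n_bins=2):
        print(row)
```

The resulting matrix has one row and one column per time step, so a signal segment of length N yields an N×N image that a two-dimensional CNN can consume directly.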
Funding: This work was financially supported by the Basic Public Welfare Research Project of Zhejiang Province (Grant No. LGN20E050007).
Abstract: In viticulture, there is an increasing demand for automatic winter grapevine pruning devices, for which the detection of pruning locations in vineyard images is a necessary task that can be automated with computer vision methods. In this study, a novel 2D grapevine winter pruning location detection method is proposed for automatic winter pruning in a Y-shaped cultivation system. The method consists of four steps. First, the vineyard image is segmented by thresholding the two-times-red-minus-green-minus-blue (2R−G−B) channel and the S channel. Second, the grapevine skeleton is extracted with the Improved Enhanced Parallel Thinning Algorithm (IEPTA). Third, the structure of each grapevine is found by judging the angle and distance relationships between branches. Fourth, bounding boxes are obtained from these grapevines, and a pre-trained MobileNetV3_small×0.75 is used to classify each bounding box and finally locate the pruning position. In the detection experiment, the method achieved a precision of 98.8% and a recall of 92.3% for bud detection, an accuracy of 83.4% for pruning location detection, and a total time of 0.423 s. The results indicate that the proposed 2D pruning location detection method has decent robustness as well as high precision and could guide automatic devices to winter prune efficiently.
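The first step, thresholding on the 2R−G−B channel and the S channel, can be sketched as follows. The threshold values, the decision to combine the two tests with a logical AND, and the function names are all assumptions made for illustration; the abstract does not specify them.

```python
import colorsys


def excess_red(r, g, b):
    """The 2R - G - B value of a pixel with 0..255 channels."""
    return 2 * r - g - b


def saturation(r, g, b):
    """HSV saturation (the S channel) of a pixel with 0..255 channels, in the range 0..1."""
    return colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[1]


def segment(pixels, exr_threshold=20, s_threshold=0.2):
    """Boolean mask: True where a pixel passes both (hypothetical) thresholds.

    `pixels` is a list of rows, each row a list of (r, g, b) tuples.
    """
    return [
        [
            excess_red(*p) > exr_threshold and saturation(*p) > s_threshold
            for p in row
        ]
        for row in pixels
    ]


if __name__ == "__main__":
    row = [(150, 100, 80), (128, 128, 128)]  # a brownish pixel and a gray pixel
    print(segment([row]))
```

A gray pixel has 2R−G−B equal to zero and zero saturation, so it is rejected, whereas reddish-brown cane colors score high on both measures.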
Funding: Supported by the National Natural Science Foundation of China (No. 52102416), the Natural Science Foundation of Shanghai (No. 22ZR1466000), and the Fundamental Research Funds for the Central Universities of China (No. 22120240159).
Abstract: Speeding is one of the primary contributors to rural road crashes. Self-explaining theory offers a way to reduce speeding: it suggests that well-designed facility environments (i.e., road facilities and surrounding landscapes) can automatically guide drivers to choose appropriate speeds on different road categories. This study proposes an improved lightweight convolutional neural network (LW-CNN) that incorporates drivers’ visual perception characteristics (i.e., depth perception and dynamic vision) to conduct a self-explaining analysis of the facility environment on 2-lane rural roads. Data for this study were gathered through naturalistic driving experiments on 2-lane rural roads across five Chinese provinces. A total of 3502 visual facility environment images, together with their corresponding operation speeds and speed limits, were collected. The improved LW-CNN exhibits high accuracy and efficiency in predicting operation speeds from these images, achieving a training loss of 0.05% and a validation loss of 0.15%. The semantics of facility environments affecting operation speeds are further identified by combining this LW-CNN with the gradient-weighted class activation mapping (Grad-CAM) algorithm and a semantic segmentation network. Then, six typical 2-lane rural road categories perceived by drivers, with different operation speeds and speeding probabilities (SP), are summarized using k-means clustering. An objective and comprehensive analysis of each category’s semantic composition and depth features is conducted to evaluate their influence on drivers’ speeding probability and road category perception. The findings of this study can be used directly to optimize facility environments from the perspective of drivers’ visual perception in order to decrease speeding-related crashes.
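The k-means step that groups road segments by operation speed and speeding probability can be illustrated with a small implementation of Lloyd's algorithm. This is a generic sketch, not the study's pipeline: the initialization (the first k points), the sample values, and the function name are hypothetical, and in practice features on different scales would normally be standardized first.

```python
def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; returns (labels, centroids).

    Assumption: centroids are initialized to the first k points, which keeps the
    example deterministic but is not a recommended initialization in general.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    # Hypothetical (operation speed in km/h, speeding probability) pairs.
    samples = [(40, 0.10), (80, 0.60), (42, 0.12), (78, 0.55)]
    print(kmeans(samples, k=2))
```

With the study's setting, k would be 6 and each point would describe one road image rather than the four made-up samples used here.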
Funding: Project supported by the Directorate of Advanced Studies, Research & Technological Development, University of Engineering and Technology Taxila (No. UET/ASRTD/RG-1002-3).
Abstract: Automated analysis of sports video for summarization is challenging due to variations in cameras, replay speed, illumination conditions, editing effects, game structure, genre, etc. To address these challenges, we propose an effective video summarization framework based on shot classification and replay detection for field sports videos. Accurate shot classification is essential for structuring the input video for further processing, i.e., key event or replay detection. Therefore, we present a shot classification method based on a lightweight convolutional neural network. We then analyze each shot for replay detection and specifically detect the successive batches of logo transition frames that mark the replay segments in sports videos. For this purpose, we propose local octa-pattern features to represent video frames and train an extreme learning machine to classify frames as replay or non-replay. The proposed framework is robust to variations in cameras, replay speed, shot speed, illumination conditions, game structure, sports genre, broadcasters, logo designs and placement, frame transitions, and editing effects. The performance of our framework is evaluated on a dataset containing diverse YouTube sports videos of soccer, baseball, and cricket. Experimental results demonstrate that the proposed framework can reliably be used for shot classification and replay detection to summarize field sports videos.