Automatic crack detection of cement pavement chiefly benefits from the rapid development of deep learning, with convolutional neural networks (CNNs) playing an important role in this field. However, as the performance of crack detection in cement pavement improves, the depth and width of the network structure increase significantly, which demands more computing power and storage space. This limitation hampers the practical deployment of crack detection models on various platforms, particularly portable devices such as small mobile devices. To solve these problems, we propose a dual-encoder-based network architecture that focuses on extracting more comprehensive fracture feature information and combines cross-fusion modules and coordinate attention mechanisms for more efficient feature fusion. First, we use small-channel convolution to construct a shallow feature extraction module (SFEM) that extracts low-level feature information of cracks in cement pavement images, in order to obtain more information about cracks from the shallow features of the images. In addition, we construct a large kernel atrous convolution (LKAC) module to enhance crack information; it incorporates a coordinate attention mechanism to filter out non-crack information and uses large kernel atrous convolutions with different kernels, and therefore different receptive fields, to extract more detailed edge and context information. Finally, the three-stage feature map outputs from the shallow feature extraction module are cross-fused with the two-stage feature map outputs from the large kernel atrous convolution module, and the shallow features and detailed edge features are fully fused to obtain the final crack prediction map. We evaluate our method on three public crack datasets: DeepCrack, CFD, and Crack500. Experimental results on the DeepCrack dataset demonstrate the effectiveness of our proposed method compared to state-of-the-art crack detection methods: it achieves a Precision (P) of 87.2%, a Recall (R) of 87.7%, and an F-score (F1) of 87.4%. Thanks to our lightweight crack detection model, the parameter count of the model in real-world detection scenarios is reduced to less than 2M. This advancement also provides technical support for portable scene detection.
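The reported F-score can be checked against the reported precision and recall, since F1 is the harmonic mean of the two. The short Python sketch below performs that check; the function name `f1_score` is chosen here for illustration and does not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (both in [0, 1])."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for the DeepCrack dataset.
f1 = f1_score(0.872, 0.877)
print(f"F1 = {f1:.3f}")  # prints: F1 = 0.874
```

The result agrees with the 87.4% F-score stated in the abstract.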
Accurately identifying defect patterns in wafer maps can help engineers find abnormal failure factors in production lines. During the wafer testing stage, deep learning methods are widely used in wafer defect detection due to their powerful feature extraction capabilities. However, most current wafer defect pattern classification models have high complexity and slow detection speed, which makes them difficult to apply in the actual wafer production process. In addition, data imbalance in the wafer dataset seriously affects the training results of the model. To reduce the complexity of the deep model without affecting the wafer feature expression, this paper adjusts the structure of the dense block in the PeleeNet network and proposes a lightweight network, WM-PeleeNet, based on the PeleeNet module. In addition, to reduce the impact of data imbalance on model training, this paper proposes a wafer data augmentation method based on a convolutional autoencoder that adds random Gaussian noise to the hidden layer. The proposed method achieves an average accuracy of 95.4% on the WM-811K wafer dataset with only 173.643 KB of parameters and 316.194 M FLOPs, and takes only 22.99 s to detect 1000 wafer pictures. Compared with the original PeleeNet network without optimization, the number of parameters and FLOPs are reduced by 92.68% and 58.85%, respectively. Data augmentation on the minority-class wafer maps improves the average classification accuracy by 1.8% on the WM-811K dataset. At the same time, the recognition accuracy of minority classes such as the Scratch pattern and the Donut pattern is significantly improved.
Deployable mechanisms with good deployment performance, strong expansibility, and light weight have attracted much attention because of their potential in aerospace. A basic deployable pyramid unit with good deployability and expandability is proposed to construct a sizeable deployable mechanism. First, the folding principle and expansion method of the basic unit are proposed. The configuration synthesis method of adding constraint chains to a spatial closed-loop mechanism is used to synthesize the basic unit. Then, the degree of freedom of the basic unit is analyzed using screw theory and the link dismantling method. Next, three-dimensional models of the pyramid unit, expansion unit, and array unit are established, and a folding motion simulation analysis is carried out. Based on the number of components, weight reduction rate, and deployable rate, the performance characteristics of the three types of mechanisms are described in detail. Finally, prototypes of the pyramid unit, combination unit, and expansion unit are developed to further verify the correctness of the configuration synthesis based on the pyramid. The proposed deployable mechanism provides a reference for the design and application of antennas with a large aperture, high deployable rate, and light weight. It has good application prospects in the aerospace field.
Current you only look once (YOLO)-based algorithm models face the challenge of excessive parameters and computational complexity in the printed circuit board (PCB) defect detection application scenario. To solve this problem, we propose a new method that combines the lightweight network mobile vision transformer (MobileViT) with the convolutional block attention module (CBAM) mechanism and a new regression loss function. This method needs fewer computational resources, making it more suitable for embedded edge detection devices. Meanwhile, the new loss function improves the positioning accuracy of the bounding box and enhances the robustness of the model. In addition, experiments on public datasets demonstrate that the improved model achieves an average accuracy of 87.9% across six typical defect detection tasks, while reducing computational costs by nearly 90%. It significantly reduces the model's computational requirements while maintaining accuracy, ensuring reliable performance for edge deployment.
The use of deep convolutional neural networks has recently contributed to significant progress in single image super-resolution (SISR) research. However, the high computational demands of most SR techniques hinder their applicability to edge devices, despite their satisfactory reconstruction performance. These methods commonly use standard convolutions, which increase the convolutional operation cost of the model. In this paper, a lightweight Partial Separation and Multiscale Fusion Network (PSMFNet) is proposed to alleviate this problem. Specifically, this paper introduces partial convolution (PConv), which reduces redundant convolution operations throughout the model by separating some of the features of an image while retaining features useful for image reconstruction. Additionally, existing methods have not fully utilized the rich feature information, leading to information loss that reduces the ability to learn feature representations. Inspired by self-attention, this paper develops a multiscale feature fusion block (MFFB), which can better utilize the non-local features of an image. The MFFB can learn long-range dependencies in the spatial dimension and extract features in the channel dimension, thereby obtaining more comprehensive and richer feature information. As the role of the MFFB is to capture rich global features, this paper further introduces an efficient inverted residual block (EIRB) to supplement the local feature extraction ability of PSMFNet. A comprehensive analysis of the experimental results shows that PSMFNet achieves better performance with fewer parameters than state-of-the-art models.
Accurately identifying crop pests and diseases ensures agricultural productivity and safety. Although current YOLO-based detection models offer real-time capabilities, their conventional convolutional layers involve high computational redundancy and a fixed receptive field, making it challenging to capture local details and global semantics simultaneously in complex scenarios. This leads to significant issues such as missed detections of small targets and heightened sensitivity to background interference. To address these challenges, this paper proposes a lightweight adaptive detection network, StarSpark-AdaptiveNet (SSANet), which optimizes features through a dual-module collaborative mechanism. Specifically, the StarNet module uses depthwise separable convolutions (DW-Conv) and dynamic star operations to establish multi-stage feature extraction pathways, enhancing local detail perception within a lightweight framework. Moreover, the Multi-scale Adaptive Spatial Attention Gate (MASAG) module integrates cross-layer feature fusion and dynamic weight allocation to capture multi-scale global contextual information, effectively suppressing background noise. These modules jointly form a “local enhancement-global calibration” bidirectional optimization mechanism, significantly improving the model’s adaptability to complex disease patterns. Furthermore, the proposed Scale-based Dynamic Loss (SD Loss) dynamically adjusts the weights of the scale and localization losses, improving regression stability and localization accuracy, especially for small targets. Experiments on the eggplant fruit disease dataset demonstrate that SSANet achieves an mAP50 of 83.9% and a detection speed of 273.5 FPS with only 2.11 M parameters and a computational cost of 5.1 GFLOPs, outperforming the baseline YOLO11 model by reducing parameters by 18.1%, increasing mAP50 by 1.3%, and improving inference speed by 9.1%. Ablation studies further confirm the effectiveness and complementarity of the modules. SSANet offers a high-accuracy, low-cost solution suitable for real-time pest and disease detection in crops, facilitating edge device deployment and promoting precision agriculture.
In recent years, the country has devoted significant manpower and material resources to preventing traffic accidents, particularly those caused by fatigued driving. Current studies mainly concentrate on driver physiological signals, driving behavior, and vehicle information. However, most of these approaches are computationally intensive and inconvenient for real-time detection. Therefore, this paper designs a network that balances precision, speed, and light weight, and proposes an algorithm for facial fatigue detection based on multi-feature fusion. Specifically, the face detection model takes YOLOv8 (You Only Look Once version 8) as the basic framework and replaces its backbone network with MobileNetv3. To focus on the significant regions in the image, CPCA (Channel Prior Convolution Attention) is adopted to enhance the network’s capacity for feature extraction. Meanwhile, the network training phase employs the Focal-EIOU (Focal and Efficient Intersection Over Union) loss function, which makes the network lightweight and increases the accuracy of target detection. Finally, the Dlib toolkit is used to annotate 68 facial feature points. This study establishes an evaluation metric for facial fatigue and develops a novel fatigue detection algorithm to assess the driver’s condition. A series of comparative experiments were carried out on a self-built dataset. The proposed method’s mAP (mean Average Precision) values for object detection and fatigue detection are 96.71% and 95.75%, respectively, and the detection speed is 47 FPS (Frames Per Second). This method balances the trade-off between computational complexity and model accuracy. Furthermore, it can be ported to the NVIDIA Jetson Orin NX and quickly detect the driver’s state while maintaining a high degree of accuracy. It contributes to the development of automobile safety systems and reduces the occurrence of traffic accidents.
Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 employs a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To maintain the trade-off between complexity and model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, the depthwise dilated convolution in the DDSC layer effectively expands the field of view of the filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses a parallel multi-resolution branch architecture to process the input feature map and extract the multi-scale feature information of the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining the network performance compared to the MobileNetV1 baseline.
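To make the parameter saving of depthwise separable convolution concrete, the Python sketch below counts weights (biases ignored) for a standard convolution and for a depthwise plus pointwise pair, and computes the effective kernel size of a dilated filter. The function names and the channel sizes in the example are hypothetical illustrations, not values from the paper, and the sketch does not reproduce the DDSC layer itself.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (biases ignored)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def effective_kernel_size(k: int, dilation: int) -> int:
    """Spatial extent covered by a k x k kernel with the given dilation."""
    return k + (k - 1) * (dilation - 1)


# Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernel.
std = standard_conv_params(64, 128, 3)
sep = depthwise_separable_params(64, 128, 3)
print(std, sep, round(sep / std, 3))   # prints: 73728 8768 0.119
print(effective_kernel_size(3, 2))     # prints: 5
```

Dilation leaves the weight count unchanged while widening the area each filter covers, which is why a dilated depthwise layer can take in more context at no extra parameter cost.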
In minimally invasive surgery, endoscopes or laparoscopes equipped with miniature cameras and tools enter the human body through small incisions or natural cavities for therapeutic purposes. However, in clinical operating environments, endoscopic images often suffer from low texture, uneven illumination, and non-rigid structures, which affect feature observation and extraction. Missing feature points in endoscopic images can severely impact surgical navigation or clinical diagnosis, leading to treatment and postoperative recovery issues for patients. To address these challenges, this paper introduces, for the first time, a Cross-Channel Multi-Modal Adaptive Spatial Feature Fusion (ASFF) module based on the lightweight architecture of EfficientViT. Additionally, a novel lightweight feature extraction and matching network based on an attention mechanism is proposed. This network dynamically adjusts attention weights for cross-modal information from grayscale images and optical flow images through a dual-branch Siamese network. It extracts static and dynamic information features ranging from low-level to high-level and from local to global, ensuring robust feature extraction across different widths, noise levels, and blur scenarios. Global and local matching are performed through a multi-level cascaded attention mechanism, with cross-channel attention introduced to extract low-level and high-level features simultaneously. Extensive ablation experiments and comparative studies are conducted on the HyperKvasir, EAD, M2caiSeg, CVC-ClinicDB, and UCL synthetic datasets. Experimental results demonstrate that the proposed network improves upon the baseline EfficientViT-B3 model by 75.4% in accuracy (Acc), while also enhancing runtime performance and storage efficiency. When compared with the complex DenseDescriptor feature extraction network, the difference in Acc is less than 7.22%, and IoU results on specific datasets outperform those of complex dense models. Furthermore, this method increases the F1 score by 33.2% and accelerates runtime by 70.2%. Notably, CMMCAN is faster than the comparative lightweight models, and its feature extraction and matching performance is comparable to that of existing complex models while offering faster speed and higher cost-effectiveness.
In pursuit of cost-effective manufacturing, enterprises are increasingly using recycled semiconductor chips. To ensure consistent chip orientation during packaging, a circular marker on the front side is used for pin alignment following successful functional testing. However, recycled chips often exhibit substantial surface wear, and identifying the relatively small marker is challenging. Moreover, the complexity of generic target detection algorithms hampers seamless deployment. To address these issues, this paper introduces a lightweight YOLOv8s-based network tailored for detecting markings on recycled chips, termed Van-YOLOv8. First, to reduce the influence of small, low-resolution markings on the precision of deep learning models, we use an upscaling approach to enhance resolution. This technique relies on the Super-Resolution Generative Adversarial Network with Extended Training (SRGANext), which reconstructs high-fidelity images that meet the input specifications. Next, we replace the backbone feature extraction network of the original YOLOv8s model with the lightweight VanillaNetwork (VanillaNet), simplifying the branch structure to reduce network parameters. Finally, a Hybrid Attention Mechanism (HAM) is implemented to capture essential details from input images, improving feature representation while also speeding up model inference. Experimental results demonstrate that the Van-YOLOv8 network outperforms the original YOLOv8s on a recycled chip dataset in various aspects. In particular, it is superior in parameter count, computational complexity, target identification precision, and speed when compared with several prevalent algorithms. The proposed approach is promising for real-time detection of recycled chips in practical factory settings.
In frequency division duplex (FDD) massive multiple-input multiple-output (MIMO) systems, a bidirectional positional attention network (BPANet) is proposed to address the high computational complexity and low accuracy of existing deep learning-based channel state information (CSI) feedback methods. Specifically, a bidirectional position attention module (BPAM) is designed in BPANet to improve the network performance. The BPAM captures the distribution characteristics of the CSI matrix by integrating channel and spatial dimension information, thereby enhancing the feature representation of the CSI matrix. Furthermore, channel attention is decomposed into two one-dimensional (1D) feature encoding processes, effectively reducing computational costs. Simulation results demonstrate that, compared with the existing representative method, the complex input lightweight neural network (CLNet), BPANet reduces computational complexity by an average of 19.4% and improves accuracy by an average of 7.1%. Additionally, it performs better in terms of running time delay and cosine similarity.
Accurate cloud classification plays a crucial role in aviation safety, climate monitoring, and localized weather forecasting. Current research focuses on machine learning techniques, particularly deep learning-based models, for cloud type identification. However, traditional approaches such as convolutional neural networks (CNNs) have difficulty capturing global contextual information. In addition, they are computationally expensive, which restricts their usability in resource-limited environments. To tackle these issues, we present the Cloud Vision Transformer (CloudViT), a lightweight model that integrates CNNs with Transformers. The integration enables an effective balance between local and global feature extraction. Specifically, CloudViT comprises two innovative modules: Feature Extraction (E_Module) and Downsampling (D_Module). These modules significantly reduce the number of model parameters and the computational complexity while maintaining translation invariance and enhancing contextual comprehension. Overall, CloudViT has 0.93×10^6 parameters, more than ten times fewer than the SOTA (State-of-the-Art) model CloudNet. Comprehensive evaluations conducted on the HBMCD and SWIMCAT datasets showcase the outstanding performance of CloudViT, which achieves classification accuracies of 98.45% and 100%, respectively. Moreover, the efficiency and scalability of CloudViT make it an ideal candidate for deployment in mobile cloud observation systems, enabling real-time cloud image classification. The proposed hybrid architecture of CloudViT offers a promising approach for advancing ground-based cloud image classification. It holds significant potential for both optimizing performance and facilitating practical deployment scenarios.
This paper introduces a lightweight remote sensing image dehazing network called the multidimensional weight regulation network (MDWR-Net), which addresses the high computational cost of existing methods. Previous works, often based on the encoder-decoder structure and using multiple upsampling and downsampling layers, are computationally expensive. To improve efficiency, the paper proposes two modules: the efficient spatial resolution recovery module (ESRR) for upsampling and the efficient depth information augmentation module (EDIA) for downsampling. These modules not only reduce model complexity but also enhance performance. Additionally, the partial feature weight learning module (PFWL) is introduced to reduce the computational burden by applying weight learning across partial dimensions rather than using full-channel convolution. To overcome the limitations of convolutional neural network (CNN)-based networks, the haze distribution index transformer (HDIT) is integrated into the decoder. We also propose the physical-based non-adjacent feature fusion module (PNFF), which leverages the atmospheric scattering model to improve the generalization of MDWR-Net. MDWR-Net achieves superior dehazing performance with a computational cost of just 2.98×10^9 multiply-accumulate operations (MACs), which is less than one-tenth of that of previous methods. Experimental results validate its effectiveness in balancing performance and computational efficiency.
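For readers unfamiliar with the atmospheric scattering model mentioned above, its commonly used form is given below. This is the standard formulation from the dehazing literature, stated here as background; the abstract does not say which exact variant the PNFF module uses, and the symbols are the conventional ones rather than the paper's notation.

```latex
% I(x): observed hazy image,  J(x): haze-free scene radiance,
% A: global atmospheric light, t(x): transmission map,
% \beta: scattering coefficient, d(x): scene depth.
\begin{align}
  I(x) &= J(x)\,t(x) + A\,\bigl(1 - t(x)\bigr), \\
  t(x) &= e^{-\beta d(x)}, \\
  J(x) &= \frac{I(x) - A}{t(x)} + A \quad \text{for } t(x) > 0 .
\end{align}
```

The third line follows by solving the first for J(x), which is how a dehazed image is recovered once A and t(x) have been estimated.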
Unauthorized operations of unmanned aerial vehicles (UAVs), referred to as “black flights”, pose a significant danger to public safety, and existing low-altitude object detection algorithms have difficulty balancing detection precision and speed. Additionally, their accuracy is insufficient, particularly for small objects in complex environments. To solve these problems, we propose a lightweight feature-enhanced convolutional neural network that detects low-altitude flying objects with high precision in real time, providing guidance information for suppressing black-flying UAVs. The proposed network consists of three modules. A lightweight and stable feature extraction module reduces the computational load and stably extracts more low-level features, an enhanced feature processing module significantly improves the feature extraction ability of the model, and an accurate detection module integrates low-level and advanced features to improve multiscale detection accuracy in complex environments, particularly for small objects. The proposed method achieves a detection speed of 147 frames per second (FPS) and a mean average precision (mAP) of 90.97% on a dataset composed of flying objects, indicating its potential for low-altitude object detection. Furthermore, evaluation results on Microsoft Common Objects in Context (MS COCO) indicate that the proposed method is also applicable to object detection in general.
To address the difficulty of identifying apple diseases in the natural environment and the low application rate of deep learning recognition networks, a lightweight ResNet (LW-ResNet) model for apple disease recognition is proposed. Based on the deep residual network ResNet18, a multi-scale feature extraction layer is constructed using group convolution to compress the model and improve the extraction of lesion features of different sizes. The identity mapping structure is improved to reduce information loss, and the efficient channel attention module (ECANet) is introduced to suppress noise from complex backgrounds. The experimental results show that the average precision, recall, and F1-score of LW-ResNet on the test set are 97.80%, 97.92%, and 97.85%, respectively. The parameter memory is 2.32 MB, which is 94% less than that of ResNet18. Compared with the classic lightweight networks SqueezeNet and MobileNetV2, LW-ResNet has clear advantages in recognition performance, speed, parameter memory requirement, and time complexity. The proposed model offers low computational cost, low storage cost, strong real-time performance, high identification accuracy, and strong practicability, and can meet the needs of real-time apple leaf disease identification on resource-constrained devices.
The diagnosis of COVID-19 requires chest computed tomography (CT). High-resolution CT images can provide more diagnostic information to help doctors better diagnose the disease, so it is of clinical importance to study super-resolution (SR) algorithms that improve the resolution of CT images. However, most existing SR algorithms are developed for natural images and are not suitable for medical images, and most of them improve the reconstruction quality by increasing the network depth, which is not suitable for machines with limited resources. To alleviate these issues, we propose a residual feature attentional fusion network for lightweight chest CT image super-resolution (RFAFN). Specifically, we design a contextual feature extraction block (CFEB) that can extract CT image features more efficiently and accurately than ordinary residual blocks. In addition, we propose a feature-weighted cascading strategy (FWCS) based on attentional feature fusion blocks (AFFB) to utilize the high-frequency detail information extracted by the CFEB as much as possible by selectively fusing feature information from adjacent levels. Finally, we propose a global hierarchical feature fusion strategy (GHFFS), which utilizes the hierarchical features more effectively than dense concatenation by progressively aggregating the feature information at various levels. Numerous experiments show that our method performs better than most state-of-the-art (SOTA) methods on the COVID-19 chest CT dataset. In detail, the peak signal-to-noise ratio (PSNR) is 0.11 dB and 0.47 dB higher on CTtest1 and CTtest2 at ×3 SR compared to the suboptimal method, while the number of parameters and multi-adds are reduced by 22K and 0.43G, respectively. Our method can better recover chest CT image quality with fewer computational resources and effectively assist in COVID-19 diagnosis.
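Since the comparison above is expressed in PSNR, a minimal Python sketch of how that metric is computed may help. It treats images as flat lists of pixel values. The function name `psnr` and the default peak value of 255 (8-bit images) are assumptions made for illustration; the paper's own evaluation code and the intensity range of its CT data are not given in the abstract.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(reconstructed) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the two images.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


# Toy example with hypothetical pixel values (not data from the paper).
ground_truth = [52, 55, 61, 59]
restored = [54, 55, 60, 57]
print(f"{psnr(ground_truth, restored):.2f} dB")
```

Because the scale is logarithmic, a gain of a few tenths of a dB, such as the 0.11 dB and 0.47 dB reported above, corresponds to a modest reduction in mean squared error.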
Convolutional neural networks depend on deep network architectures to extract accurate information for image super-resolution. However, the information obtained by these convolutional neural networks cannot completely express predicted high-quality images for complex scenes. A dynamic network for image super-resolution (DSRNet) is presented, which contains a residual enhancement block, a wide enhancement block, a feature refinement block, and a construction block. The residual enhancement block is composed of a residual enhanced architecture to facilitate hierarchical features for image super-resolution. To enhance the robustness of the obtained super-resolution model for complex scenes, the wide enhancement block uses a dynamic architecture to learn more robust information, improving the applicability of the model to varying scenes. To prevent interference between components in the wide enhancement block, the refinement block uses a stacked architecture to accurately learn the obtained features. In addition, a residual learning operation is embedded in the refinement block to prevent the long-term dependency problem. Finally, the construction block is responsible for reconstructing high-quality images. The designed heterogeneous architecture not only facilitates richer structural information but is also lightweight, which makes it suitable for mobile digital devices. Experimental results show that our method is more competitive in terms of performance, recovery time for image super-resolution, and complexity. The code of DSRNet can be obtained at https://github.com/hellloxiaotian/DSRNet.
With the continuous development of medical informatics and digital diagnosis, deep-learning-based classification of tuberculosis (TB) cases from computed tomography (CT) images of the lung is an important aid in clinical diagnosis and treatment. Because of its potential application in medical image classification, this task has received extensive research attention. Existing neural network techniques still struggle to extract global contextual information from images and to keep network complexity low when classifying images. To address these issues, this paper proposes a lightweight medical image classification network that combines a Transformer with a convolutional neural network (CNN) to classify TB cases from lung CT. The method fuses a CNN module with a Transformer module and exploits the advantages of both to classify more accurately. On the one hand, the CNN branch supplements the Transformer branch with basic local feature information at the low level; on the other hand, at the middle and high levels of the model, the CNN branch also provides the Transformer architecture with different local and global feature information, which enhances the model's ability to obtain feature information and improves classification accuracy. A shortcut is used in each module of the network to address the poor results caused by gradient divergence and to improve TB classification. The proposed lightweight model addresses the long training time of TB classification on lung CT and improves classification speed. The method was validated on a CT image data set provided by the First Hospital of Lanzhou University. The experimental results show that the proposed lightweight TB classification network fully extracts the feature information of the input lung CT images and obtains high-accuracy classification results.
Target detection technology is widely used, but it is less often applied on portable equipment because of its hardware requirements. For instance, rebar inventory is still counted manually at present. In this paper, a lightweight network adapted to mobile devices is proposed to accomplish this task more intelligently and efficiently. The rebar recognition work builds on an existing method for detecting and recognizing dense small objects. After a multi-resolution input model was designed and trained on a rebar data set, detection efficiency improved significantly. Experiments show that the proposed method offers higher detection accuracy, fewer model parameters, and shorter training time for rebar recognition.
Funding: Supported by the National Natural Science Foundation of China (No. 62176034), the Science and Technology Research Program of Chongqing Municipal Education Commission (No. KJZD-M202300604), and the Natural Science Foundation of Chongqing (Nos. cstc2021jcyj-msxmX0518, 2023NSCQ-MSX1781).
Abstract: Automatic crack detection of cement pavement chiefly benefits from the rapid development of deep learning, with convolutional neural networks (CNN) playing an important role in this field. However, as crack detection performance improves, the depth and width of the network structure increase significantly, which demands more computing power and storage space. This limitation hampers the practical deployment of crack detection models on various platforms, particularly portable devices such as small mobile devices. To solve these problems, we propose a dual-encoder-based network architecture that focuses on extracting more comprehensive crack feature information and combines cross-fusion modules and coordinated attention mechanisms for more efficient feature fusion. Firstly, we use small-channel convolution to construct a shallow feature extraction module (SFEM) that extracts low-level crack features from cement pavement images, in order to obtain more crack information from the shallow features of the images. In addition, we construct a large kernel atrous convolution (LKAC) module to enhance crack information. It incorporates a coordinated attention mechanism to filter out non-crack information, and it applies large kernel atrous convolutions with different kernels, using different receptive fields to extract more detailed edge and context information. Finally, the three-stage feature map outputs from the shallow feature extraction module are cross-fused with the two-stage feature map outputs from the large kernel atrous convolution module, and the shallow features and detailed edge features are fully fused to obtain the final crack prediction map. We evaluate our method on three public crack datasets: DeepCrack, CFD, and Crack500. Experimental results on the DeepCrack dataset demonstrate the effectiveness of the proposed method compared with state-of-the-art crack detection methods: it achieves Precision (P) 87.2%, Recall (R) 87.7%, and F-score (F1) 87.4%. Thanks to the lightweight design, the parameter count of the model in real-world detection scenarios is reduced to less than 2M. This also provides technical support for detection in portable scenarios.
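The reported F-score can be checked against the reported Precision and Recall. The following is a minimal sketch, assuming F1 is the usual harmonic mean of P and R (the abstract does not state the formula; the function name `f1_score` is illustrative only):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for the DeepCrack dataset in the abstract above.
f1 = f1_score(0.872, 0.877)
print(f"F1 = {f1:.3f}")
```

Under that assumption the result rounds to 87.4%, which agrees with the reported figure.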
Funding: Supported by a project jointly funded by the Beijing Municipal Education Commission and Municipal Natural Science Foundation under grant KZ202010005004.
Abstract: Accurately identifying defect patterns in wafer maps can help engineers find abnormal failure factors in production lines. During the wafer testing stage, deep learning methods are widely used in wafer defect detection because of their powerful feature extraction capabilities. However, most current wafer defect pattern classification models have high complexity and slow detection speed, which makes them difficult to apply in actual wafer production. In addition, data imbalance in the wafer dataset seriously affects the training results of the model. To reduce the complexity of the deep model without affecting the wafer feature expression, this paper adjusts the structure of the dense block in the PeleeNet network and proposes a lightweight network, WM‐PeleeNet, based on the PeleeNet module. In addition, to reduce the impact of data imbalance on model training, this paper proposes a wafer data augmentation method based on a convolutional autoencoder, which adds random Gaussian noise to the hidden layer. The proposed method has an average accuracy of 95.4% on the WM‐811K wafer dataset with only 173.643 KB of parameters and 316.194 M FLOPs, and takes only 22.99 s to detect 1000 wafer pictures. Compared with the original PeleeNet network without optimization, the number of parameters and FLOPs are reduced by 92.68% and 58.85%, respectively. Data augmentation on the minority-class wafer maps improves the average classification accuracy by 1.8% on the WM‐811K dataset. At the same time, the recognition accuracy of minority classes such as the Scratch pattern and the Donut pattern is significantly improved.
Funding: Supported by National Natural Science Foundation of China (Grant No. 52075467), Jiangsu Provincial Natural Science Foundation of China (Grant No. BK20220649), Natural Science Foundation of the Jiangsu Higher Education Institutions of China (Grant No. 23KJB460010), and Jiangsu Provincial Key R&D Project (Grant No. BE2022062).
Abstract: Deployable mechanisms with good deployment performance, strong expandability, and low weight have attracted much attention because of their potential in aerospace. A basic deployable pyramid unit with good deployability and expandability is proposed to construct a sizeable deployable mechanism. Firstly, the folding principle and expansion method of the basic unit are proposed. The configuration synthesis method of adding constraint chains to a spatial closed-loop mechanism is used to synthesize the basic unit. Then, the degree of freedom of the basic unit is analyzed using screw theory and the link dismantling method. Next, three-dimensional models of the pyramid unit, expansion unit, and array unit are established, and a folding motion simulation analysis is carried out. Based on the number of components, weight reduction rate, and deployable rate, the performance characteristics of the three types of mechanisms are described in detail. Finally, prototypes of the pyramid unit, combination unit, and expansion unit are developed to further verify the correctness of the pyramid-based configuration synthesis. The proposed deployable mechanism provides a reference for the design and application of lightweight antennas with a large aperture and high deployable rate. It has good application prospects in the aerospace field.
Funding: Supported by the National Natural Science Foundation of China (Nos. 62373215, 62373219 and 62073193), the Natural Science Foundation of Shandong Province (No. ZR2023MF100), the Key Projects of the Ministry of Industry and Information Technology (No. TC220H057-2022), and the Independently Developed Instrument Funds of Shandong University (No. zy20240201).
Abstract: Current you only look once (YOLO)-based models face an overwhelming number of parameters and high computational complexity in printed circuit board (PCB) defect detection. To solve this problem, we propose a new method that combines the lightweight mobile vision transformer (MobileViT) network with the convolutional block attention module (CBAM) mechanism and a new regression loss function. The method needs fewer computational resources, which makes it more suitable for embedded edge detection devices. Meanwhile, the new loss function improves the positioning accuracy of the bounding box and enhances the robustness of the model. Experiments on public datasets demonstrate that the improved model achieves an average accuracy of 87.9% across six typical defect detection tasks while reducing computational costs by nearly 90%. It significantly reduces the model's computational requirements while maintaining accuracy, ensuring reliable performance for edge deployment.
Funding: Guangdong Science and Technology Program under Grant No. 202206010052; Foshan Province R&D Key Project under Grant No. 2020001006827; Guangdong Academy of Sciences Integrated Industry Technology Innovation Center Action Special Project under Grant No. 2022GDASZH-2022010108.
Abstract: The employment of deep convolutional neural networks has recently contributed to significant progress in single image super-resolution (SISR) research. However, the high computational demands of most SR techniques hinder their applicability to edge devices, despite their satisfactory reconstruction performance. These methods commonly use standard convolutions, which increase the convolutional operation cost of the model. In this paper, a lightweight Partial Separation and Multiscale Fusion Network (PSMFNet) is proposed to alleviate this problem. Specifically, this paper introduces partial convolution (PConv), which reduces the redundant convolution operations throughout the model by separating some of the features of an image while retaining features useful for image reconstruction. Additionally, existing methods have not fully utilized the rich feature information, leading to information loss, which reduces the ability to learn feature representations. Inspired by self-attention, this paper develops a multiscale feature fusion block (MFFB), which can better utilize the non-local features of an image. MFFB can learn long-range dependencies from the spatial dimension and extract features from the channel dimension, thereby obtaining more comprehensive and rich feature information. As the role of the MFFB is to capture rich global features, this paper further introduces an efficient inverted residual block (EIRB) to supplement the local feature extraction ability of PSMFNet. A comprehensive analysis of the experimental results shows that PSMFNet maintains better performance with fewer parameters than the state-of-the-art models.
Funding: Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT (NRF-2022R1A2C2012243).
Abstract: Accurately identifying crop pests and diseases ensures agricultural productivity and safety. Although current YOLO-based detection models offer real-time capabilities, their conventional convolutional layers involve high computational redundancy and a fixed receptive field, making it challenging to capture local details and global semantics simultaneously in complex scenarios. This leads to significant issues such as missed detections of small targets and heightened sensitivity to background interference. To address these challenges, this paper proposes a lightweight adaptive detection network, StarSpark-AdaptiveNet (SSANet), which optimizes features through a dual-module collaborative mechanism. Specifically, the StarNet module utilizes depthwise separable convolutions (DW-Conv) and dynamic star operations to establish multi-stage feature extraction pathways, enhancing local detail perception within a lightweight framework. Moreover, the Multi-scale Adaptive Spatial Attention Gate (MASAG) module integrates cross-layer feature fusion and dynamic weight allocation to capture multi-scale global contextual information, effectively suppressing background noise. These modules jointly form a "local enhancement-global calibration" bidirectional optimization mechanism, significantly improving the model's adaptability to complex disease patterns. Furthermore, the proposed Scale-based Dynamic Loss (SD Loss) dynamically adjusts the weight of scale and localization losses, improving regression stability and localization accuracy, especially for small targets. Experiments on the eggplant fruit disease dataset demonstrate that SSANet achieves an mAP50 of 83.9% and a detection speed of 273.5 FPS with only 2.11 M parameters and a computational cost of 5.1 GFLOPs, outperforming the baseline YOLO11 model by reducing parameters by 18.1%, increasing mAP50 by 1.3%, and improving inference speed by 9.1%. Ablation studies further confirm the effectiveness and complementarity of the modules. SSANet offers a high-accuracy, low-cost solution suitable for real-time pest and disease detection in crops, facilitating edge device deployment and promoting precision agriculture.
Funding: Supported by the Science and Technology Bureau of Xi'an project (24KGDW0049), the Key Research and Development Program of Shaanxi (2023-YBGY-264), and the Key Research and Development Program of Guangxi (GK-AB20159032).
Abstract: In recent years, the country has devoted significant workforce and material resources to preventing traffic accidents, particularly those caused by fatigued driving. Current studies mainly concentrate on driver physiological signals, driving behavior, and vehicle information. However, most of these approaches are computationally intensive and inconvenient for real-time detection. Therefore, this paper designs a lightweight network that balances precision and speed, and proposes an algorithm for facial fatigue detection based on multi-feature fusion. Specifically, the face detection model takes YOLOv8 (You Only Look Once version 8) as the basic framework and replaces its backbone network with MobileNetv3. To focus on the significant regions in the image, CPCA (Channel Prior Convolution Attention) is adopted to enhance the network's capacity for feature extraction. Meanwhile, the network training phase employs the Focal-EIOU (Focal and Efficient Intersection Over Union) loss function, which makes the network lightweight and increases the accuracy of target detection. Finally, the Dlib toolkit is employed to annotate 68 facial feature points. This study establishes an evaluation metric for facial fatigue and develops a novel fatigue detection algorithm to assess the driver's condition. A series of comparative experiments were carried out on a self-built dataset. The proposed method's mAP (mean Average Precision) values for object detection and fatigue detection are 96.71% and 95.75%, respectively, and the detection speed is 47 FPS (Frames Per Second). The method balances computational complexity against model accuracy. Furthermore, it can be ported to the NVIDIA Jetson Orin NX and quickly detect the driver's state while maintaining a high degree of accuracy. It contributes to the development of automobile safety systems and helps reduce the occurrence of traffic accidents.
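The Focal-EIOU loss named above builds on the intersection over union (IoU) of a predicted and a ground-truth box. As a hedged illustration, the sketch below computes plain IoU only; it is not the Focal-EIOU loss and not the paper's implementation, and the `(x1, y1, x2, y2)` corner format and the function name `iou` are assumptions made for the example:

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed format).
    """
    # Corners of the overlapping rectangle, if any.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes give zero intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: intersection 1, union 7.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

IoU-based regression losses are typically written as one minus this quantity plus penalty terms on box geometry; those extra terms are omitted here.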
Abstract: Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 employs a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To maintain the trade-off between complexity and model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, the depthwise dilated convolution in the DDSC layer effectively expands the field of view of the filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses a parallel multi-resolution branch architecture to process the input feature map and extract multi-scale feature information from the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining the network performance compared to the MobileNetV1 baseline.
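To make the parameter saving of depthwise separable convolution and the wider field of view of dilated convolution concrete, here is a minimal sketch using the standard textbook formulas. It ignores bias terms, the layer sizes are hypothetical, and it does not reproduce the DDSC layer or the network-level 65.02% figure reported above:

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def effective_kernel_size(k: int, dilation: int) -> int:
    """Spatial extent covered by a k x k kernel with the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


# Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernel.
std = standard_conv_params(64, 128, 3)
sep = depthwise_separable_params(64, 128, 3)
print(std, sep, effective_kernel_size(3, 2))
```

For this hypothetical layer the separable form needs roughly one eighth of the weights, and a dilation rate of 2 lets a 3 x 3 kernel cover a 5 x 5 window without adding weights.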
Funding: This work was supported by Science and Technology Cooperation Special Project of Shijiazhuang (SJZZXA23005).
Abstract: In minimally invasive surgery, endoscopes or laparoscopes equipped with miniature cameras and tools enter the human body through small incisions or natural cavities for therapeutic purposes. However, in clinical operating environments, endoscopic images often suffer from low texture, uneven illumination, and non-rigid structures, which affect feature observation and extraction. Missing feature points in endoscopic images can severely impact surgical navigation or clinical diagnosis, leading to treatment and postoperative recovery issues for patients. To address these challenges, this paper introduces, for the first time, a Cross-Channel Multi-Modal Adaptive Spatial Feature Fusion (ASFF) module based on the lightweight architecture of EfficientViT. Additionally, a novel lightweight feature extraction and matching network based on an attention mechanism is proposed. This network dynamically adjusts attention weights for cross-modal information from grayscale images and optical flow images through a dual-branch Siamese network. It extracts static and dynamic information features ranging from low-level to high-level and from local to global, ensuring robust feature extraction across different widths, noise levels, and blur scenarios. Global and local matching are performed through a multi-level cascaded attention mechanism, with cross-channel attention introduced to extract low-level and high-level features simultaneously. Extensive ablation experiments and comparative studies are conducted on the HyperKvasir, EAD, M2caiSeg, CVC-ClinicDB, and UCL synthetic datasets. Experimental results demonstrate that the proposed network improves upon the baseline EfficientViT-B3 model by 75.4% in accuracy (Acc), while also enhancing runtime performance and storage efficiency. When compared with the complex DenseDescriptor feature extraction network, the difference in Acc is less than 7.22%, and IoU results on specific datasets outperform complex dense models. Furthermore, this method increases the F1 score by 33.2% and accelerates runtime by 70.2%. Notably, CMMCAN is faster than the comparative lightweight models, and its feature extraction and matching performance is comparable to existing complex models at higher speed and better cost-effectiveness.
Funding: The Liaoning Provincial Department of Education 2021 Annual Scientific Research Funding Program (Grant Numbers LJKZ0535, LJKZ0526), the 2021 Annual Comprehensive Reform of Undergraduate Education Teaching (Grant Numbers JGLX2021020, JCLX2021008), and the Graduate Innovation Fund of Dalian Polytechnic University (Grant Number 2023CXYJ13).
Abstract: In pursuit of cost-effective manufacturing, enterprises are increasingly using recycled semiconductor chips. To ensure consistent chip orientation during packaging, a circular marker on the front side is used for pin alignment after successful functional testing. However, recycled chips often exhibit substantial surface wear, and the relatively small marker is difficult to identify. Moreover, the complexity of generic target detection algorithms hampers deployment. To address these issues, this paper introduces a lightweight YOLOv8s-based network for detecting markings on recycled chips, termed Van-YOLOv8. First, to reduce the effect of small, low-resolution markings on the precision of deep learning models, we upscale the images to a higher resolution. This step relies on the Super-Resolution Generative Adversarial Network with Extended Training (SRGANext) network, which reconstructs high-fidelity images that match the input specifications. Next, we replace the backbone feature extraction network of the original YOLOv8s model with the lightweight VanillaNetwork (VanillaNet), simplifying the branch structure to reduce network parameters. Finally, a Hybrid Attention Mechanism (HAM) is implemented to capture essential details from input images, improving feature representation while also speeding up model inference. Experimental results demonstrate that the Van-YOLOv8 network outperforms the original YOLOv8s on a recycled chip dataset in various aspects. Compared with several prevalent algorithms, it shows advantages in parameter count, computational complexity, target detection precision, and speed. The proposed approach is promising for real-time detection of recycled chips in practical factory settings.
Funding: Supported by the National Natural Science Foundation of China (12005108), the Shandong Provincial Natural Science Foundation Youth Project (ZR2020QF016), and the National Natural Science Foundation of China (U2006222).
Abstract: For frequency division duplex (FDD) massive multiple-input multiple-output (MIMO) systems, a bidirectional positional attention network (BPANet) is proposed to address the high computational complexity and low accuracy of existing deep learning-based channel state information (CSI) feedback methods. Specifically, a bidirectional position attention module (BPAM) is designed in BPANet to improve network performance. The BPAM captures the distribution characteristics of the CSI matrix by integrating channel and spatial dimension information, thereby enhancing the feature representation of the CSI matrix. Furthermore, channel attention is decomposed into two one-dimensional (1D) feature encoding processes, which effectively reduces computational costs. Simulation results demonstrate that, compared with the existing representative method, the complex input lightweight neural network (CLNet), BPANet reduces computational complexity by an average of 19.4% and improves accuracy by an average of 7.1%. Additionally, it performs better in terms of running time delay and cosine similarity.
Funding: Funded by Innovation and Development Special Project of China Meteorological Administration (CXFZ2022J038, CXFZ2024J035), Sichuan Science and Technology Program (No. 2023YFQ0072), Key Laboratory of Smart Earth (No. KF2023YB03-07), and Automatic Software Generation and Intelligent Service Key Laboratory of Sichuan Province (CUIT-SAG202210).
Abstract: Accurate cloud classification plays a crucial role in aviation safety, climate monitoring, and localized weather forecasting. Current research focuses on machine learning techniques, particularly deep learning-based models, for cloud type identification. However, traditional approaches such as convolutional neural networks (CNNs) have difficulty capturing global contextual information. In addition, they are computationally expensive, which restricts their usability in resource-limited environments. To tackle these issues, we present the Cloud Vision Transformer (CloudViT), a lightweight model that integrates CNNs with Transformers. The integration enables an effective balance between local and global feature extraction. Specifically, CloudViT comprises two innovative modules: Feature Extraction (E_Module) and Downsampling (D_Module). These modules significantly reduce the number of model parameters and the computational complexity while maintaining translation invariance and enhancing contextual comprehension. Overall, CloudViT has 0.93×10^(6) parameters, more than ten times fewer than the SOTA (State-of-the-Art) model CloudNet. Comprehensive evaluations on the HBMCD and SWIMCAT datasets show that CloudViT achieves classification accuracies of 98.45% and 100%, respectively. Moreover, the efficiency and scalability of CloudViT make it a good candidate for deployment in mobile cloud observation systems, enabling real-time cloud image classification. The proposed hybrid architecture offers a promising approach for advancing ground-based cloud image classification, with potential both for optimizing performance and for practical deployment.
Abstract: This paper introduces a lightweight remote sensing image dehazing network called the multidimensional weight regulation network (MDWR-Net), which addresses the high computational cost of existing methods. Previous works, often based on the encoder-decoder structure and utilizing multiple upsampling and downsampling layers, are computationally expensive. To improve efficiency, the paper proposes two modules: the efficient spatial resolution recovery module (ESRR) for upsampling and the efficient depth information augmentation module (EDIA) for downsampling. These modules not only reduce model complexity but also enhance performance. Additionally, the partial feature weight learning module (PFWL) is introduced to reduce the computational burden by applying weight learning across partial dimensions, rather than using full-channel convolution. To overcome the limitations of convolutional neural network (CNN)-based networks, the haze distribution index transformer (HDIT) is integrated into the decoder. We also propose the physical-based non-adjacent feature fusion module (PNFF), which leverages the atmospheric scattering model to improve the generalization of MDWR-Net. MDWR-Net achieves superior dehazing performance with a computational cost of just 2.98×10^(9) multiply-accumulate operations (MACs), which is less than one-tenth of previous methods. Experimental results validate its effectiveness in balancing performance and computational efficiency.
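For readers unfamiliar with the atmospheric scattering model mentioned above, the sketch below shows its commonly used per-pixel form, I = J·t + A·(1 − t), where J is the haze-free intensity, t the transmission, and A the atmospheric light. This is only the generic physical model under assumed scalar inputs; it is not the PNFF module, and the function names and the `t_min` clamp are illustrative choices:

```python
def add_haze(j: float, t: float, a: float) -> float:
    """Forward model: hazy intensity from clear intensity j,
    transmission t in (0, 1], and atmospheric light a."""
    return j * t + a * (1.0 - t)


def remove_haze(i: float, t: float, a: float, t_min: float = 0.1) -> float:
    """Invert the model for j; t is clamped from below by t_min
    (an illustrative choice) so tiny transmissions do not amplify noise."""
    return (i - a) / max(t, t_min) + a


# Round trip on one hypothetical pixel value.
hazy = add_haze(0.4, 0.6, 0.9)
print(hazy, remove_haze(hazy, 0.6, 0.9))
```

When t and A are known exactly and t is above the clamp, the inversion recovers the original intensity; in practice both must be estimated, which is where learned modules come in.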
Funding: Supported by the National Natural Science Foundation of China (52075027) and the Fundamental Research Funds for the Central Universities (2020XJJD03).
Abstract: Unauthorized operations of unmanned aerial vehicles (UAVs), referred to as "black flights", pose a significant danger to public safety, and existing low-altitude object detection algorithms have difficulty balancing detection precision and speed. Additionally, their accuracy is insufficient, particularly for small objects in complex environments. To solve these problems, we propose a lightweight feature-enhanced convolutional neural network that detects low-altitude flying objects with high precision in real time, providing guidance information to suppress black-flying UAVs. The proposed network consists of three modules. A lightweight and stable feature extraction module reduces the computational load and stably extracts more low-level features; an enhanced feature processing module significantly improves the feature extraction ability of the model; and an accurate detection module integrates low-level and advanced features to improve multiscale detection accuracy in complex environments, particularly for small objects. The proposed method achieves a detection speed of 147 frames per second (FPS) and a mean average precision (mAP) of 90.97% on a dataset composed of flying objects, indicating its potential for low-altitude object detection. Furthermore, evaluation results on Microsoft Common Objects in Context (MS COCO) indicate that the proposed method is also applicable to object detection in general.
Funding: Funded by the Science and Technology Development Program of Jilin Province (20190301024NY) and the Precision Agriculture and Big Data Engineering Research Center of Jilin Province (2020C005).
Abstract: To address the difficulty of identifying apple diseases in the natural environment and the low application rate of deep learning recognition networks, a lightweight ResNet (LW-ResNet) model for apple disease recognition is proposed. Based on the deep residual network ResNet18, a multi-scale feature extraction layer is constructed with group convolution to compress the model and improve the extraction of lesion features of different sizes. The identity mapping structure is improved to reduce information loss. The efficient channel attention module (ECANet) is introduced to suppress noise from complex backgrounds. The experimental results show that the average precision, recall, and F1-score of LW-ResNet on the test set are 97.80%, 97.92%, and 97.85%, respectively. The parameter memory is 2.32 MB, which is 94% less than that of ResNet18. Compared with the classic lightweight networks SqueezeNet and MobileNetV2, LW-ResNet has clear advantages in recognition performance, speed, parameter memory requirement, and time complexity. The proposed model offers low computational cost, low storage cost, strong real-time performance, high identification accuracy, and strong practicability, and it can meet the needs of real-time apple leaf disease identification on resource-constrained devices.
Funding: Supported by the General Project of Natural Science Foundation of Hebei Province of China (H2019201378), the Foundation of the President of Hebei University (XZJJ201917), and the Special Project for Cultivating Scientific and Technological Innovation Ability of University and Middle School Students of Hebei Province (2021H060306).
Abstract: The diagnosis of COVID-19 requires chest computed tomography (CT). High-resolution CT images can provide more diagnostic information to help doctors better diagnose the disease, so it is of clinical importance to study super-resolution (SR) algorithms that improve the resolution of CT images. However, most existing SR algorithms are developed on natural images and are not well suited to medical images, and most of them improve reconstruction quality by increasing the network depth, which is not suitable for machines with limited resources. To alleviate these issues, we propose a residual feature attentional fusion network for lightweight chest CT image super-resolution (RFAFN). Specifically, we design a contextual feature extraction block (CFEB) that can extract CT image features more efficiently and accurately than ordinary residual blocks. In addition, we propose a feature-weighted cascading strategy (FWCS) based on attentional feature fusion blocks (AFFB) to utilize the high-frequency detail information extracted by CFEB as much as possible by selectively fusing adjacent-level feature information. Finally, we suggest a global hierarchical feature fusion strategy (GHFFS), which utilizes the hierarchical features more effectively than dense concatenation by progressively aggregating the feature information at various levels. Numerous experiments show that our method performs better than most state-of-the-art (SOTA) methods on the COVID-19 chest CT dataset. In detail, the peak signal-to-noise ratio (PSNR) is 0.11 dB and 0.47 dB higher on CTtest1 and CTtest2 at ×3 SR compared to the suboptimal method, while the number of parameters and multi-adds are reduced by 22K and 0.43G, respectively. Our method can better recover chest CT image quality with fewer computational resources and effectively assist in COVID-19 diagnosis.
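The PSNR figures above follow the standard definition, PSNR = 10·log10(MAX² / MSE). A minimal pure-Python sketch is given below for reference; it assumes 8-bit intensities (`max_value=255`) and flat lists of pixel values, neither of which is specified in the abstract, and the function name `psnr` is illustrative:

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length
    sequences of pixel intensities."""
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical two-pixel example with an error of 10 on each pixel.
print(round(psnr([100, 100], [110, 90]), 2))
```

Because the scale is logarithmic, a gain of 0.11 dB corresponds to a mean squared error roughly 2.5% lower, and 0.47 dB to roughly 10% lower.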
Funding: The TCL Science and Technology Innovation Fund; the Youth Science and Technology Talent Promotion Project of Jiangsu Association for Science and Technology, Grant/Award Number: JSTJ‐2023‐017; Shenzhen Municipal Science and Technology Innovation Council, Grant/Award Number: JSGG20220831105002004; National Natural Science Foundation of China, Grant/Award Number: 62201468; Postdoctoral Research Foundation of China, Grant/Award Number: 2022M722599; the Fundamental Research Funds for the Central Universities, Grant/Award Number: D5000210966; the Guangdong Basic and Applied Basic Research Foundation, Grant/Award Number: 2021A1515110079.
Abstract: Convolutional neural networks depend on deep network architectures to extract accurate information for image super‐resolution. However, the information obtained by these convolutional neural networks cannot completely express the predicted high‐quality images for complex scenes. A dynamic network for image super‐resolution (DSRNet) is presented, which contains a residual enhancement block, a wide enhancement block, a feature refinement block and a construction block. The residual enhancement block is composed of a residual enhanced architecture to facilitate hierarchical features for image super‐resolution. To make the obtained super‐resolution model more robust for complex scenes, the wide enhancement block uses a dynamic architecture to learn more robust information, which improves the applicability of the model to varying scenes. To prevent interference between components in the wide enhancement block, the refinement block uses a stacked architecture to accurately learn the obtained features. A residual learning operation is also embedded in the refinement block to prevent the long‐term dependency problem. Finally, the construction block is responsible for reconstructing high‐quality images. The designed heterogeneous architecture not only facilitates richer structural information but is also lightweight, which makes it suitable for mobile digital devices. Experimental results show that our method is more competitive in terms of performance, recovery time for image super‐resolution and complexity. The code of DSRNet can be obtained at https://github.com/hellloxiaotian/DSRNet.
Abstract: With the continuous development of medical informatics and digital diagnosis, the deep-learning-based classification of tuberculosis (TB) cases from computed tomography (CT) images of the lung is an important aid in clinical diagnosis and treatment. Because of its potential application in medical image classification, this task has received extensive research attention. Existing neural network techniques still face challenges in extracting the global contextual information of images and in the network complexity required for image classification. To address these issues, this paper proposes a lightweight medical image classification network that combines a Transformer with a convolutional neural network (CNN) for the classification of TB cases from lung CT. The method mainly consists of a fusion of the CNN module and the Transformer module, exploiting the advantages of both to achieve more accurate classification. On the one hand, the CNN branch supplements the Transformer branch with basic local feature information at the low level; on the other hand, at the middle and high levels of the model, the CNN branch also provides the Transformer architecture with different local and global feature information, which enhances the model's ability to obtain feature information and improves the accuracy of image classification. A shortcut is used in each module of the network to address poor model results caused by gradient divergence and to improve the effectiveness of TB classification. The proposed lightweight model addresses the long training time of TB classification on lung CT and improves classification speed. The proposed method was validated on a CT image data set provided by the First Hospital of Lanzhou University. The experimental results show that the proposed lightweight TB classification network for lung CT images can fully extract the feature information of the input images and obtain high-accuracy classification results.
Funding: Hainan Science and Technology Project "Research and development of intelligent customer service system based on deep learning" (No. ZDYF2018017). Thanks to Professor Caimao Li, the corresponding author of this paper.
Abstract: Target detection technology has been widely used, but it is less often applied on portable equipment because of its hardware requirements. For instance, rebar inventory is still counted manually at present. In this paper, a lightweight network adapted to mobile devices is proposed to accomplish this task more intelligently and efficiently. The research on rebar recognition builds on an existing method for the detection and recognition of dense small objects. After designing a multi-resolution input model and training on a rebar data set, detection efficiency improved significantly. Experiments show that the proposed method offers higher detection accuracy, fewer model parameters, and shorter training time for rebar recognition.