The surge of large-scale models in recent years has led to breakthroughs in numerous fields, but it has also introduced higher computational costs and more complex network architectures. These increasingly large and intricate networks pose challenges for deployment and execution while also exacerbating the issue of network over-parameterization. To address this issue, various network compression techniques have been developed, such as network pruning. A typical pruning algorithm follows a three-step pipeline involving training, pruning, and retraining. Existing methods often directly set the pruned filters to zero during retraining, significantly reducing the parameter space. However, this direct pruning strategy frequently results in irreversible information loss. In the early stages of training, a network still contains much uncertainty, and evaluating filter importance may not be sufficiently rigorous. To manage the pruning process effectively, this paper proposes a flexible neural network pruning algorithm based on the logistic growth differential equation, considering the characteristics of network training. Unlike other pruning algorithms that directly reduce filter weights, this algorithm introduces a three-stage adaptive weight decay strategy inspired by the logistic growth differential equation. It employs a gentle decay rate in the initial training stage, a rapid decay rate during the intermediate stage, and a slower decay rate in the network convergence stage. Additionally, the decay rate is adjusted adaptively based on the filter weights at each stage. By controlling the adaptive decay rate at each stage, the pruning of neural network filters can be effectively managed. In experiments conducted on the CIFAR-10 and ILSVRC-2012 datasets, the pruning of neural networks significantly reduces the floating-point operations while maintaining the same pruning rate. Specifically, when implementing a 30% pruning rate on the ResNet-110 network, the pruned neural network not only decreases floating-point operations by 40.8% but also enhances the classification accuracy by 0.49% compared to the original network.
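The three-stage schedule described above (gentle, then rapid, then slow decay) matches the shape of the logistic growth curve, the solution of dN/dt = r·N·(1 − N/K). The following is a minimal sketch of that shape only, not the paper's actual update rule: the function names and the parameters `r`, `n0`, and `k` are assumptions chosen for illustration, and the per-filter adaptive adjustment mentioned in the abstract is not modeled.

```python
import math


def logistic_growth(t, r=1.0, n0=0.01, k=1.0):
    """Closed-form solution of dN/dt = r*N*(1 - N/k) with N(0) = n0."""
    return k / (1.0 + ((k - n0) / n0) * math.exp(-r * t))


def decay_scales(num_epochs, r=1.0, n0=0.01):
    """Hypothetical per-epoch multipliers for filters selected for pruning.

    The removed fraction follows the logistic curve, so the multiplier
    1 - N(t) shrinks slowly at first, quickly in the middle, and slowly
    again as it approaches zero.
    """
    return [1.0 - logistic_growth(t, r, n0) for t in range(num_epochs + 1)]


if __name__ == "__main__":
    for epoch, scale in enumerate(decay_scales(10)):
        print(f"epoch {epoch:2d}: scale = {scale:.4f}")
```

Multiplying a selected filter's weights by the scale for the current epoch drives them toward zero gradually instead of zeroing them in one step, which is the intuition behind avoiding the irreversible information loss of direct pruning.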
The dynamic routing mechanism in evolvable networks enables adaptive reconfiguration of topological structures and transmission pathways based on real-time task requirements and data characteristics. However, the heightened architectural complexity and expanded parameter dimensionality in evolvable networks present significant implementation challenges when deployed in resource-constrained environments. Because they ignore critical paths, traditional pruning strategies cannot achieve a desirable trade-off between accuracy and efficiency. For this reason, a critical path retention pruning (CPRP) method is proposed. By deeply traversing the computational graph, the dependency relationships among nodes are derived. The nodes are then grouped and sorted according to their contribution values. Redundant operations are removed as far as possible while ensuring that the critical path is not affected. As a result, computational efficiency is improved while a higher accuracy is maintained. On the CIFAR benchmark, the experimental results demonstrate that CPRP-induced pruning incurs accuracy degradation below 4.00%, while outperforming traditional feature-agnostic grouping methods by an average of 8.98% in accuracy. At the same time, the pruned model attains a 2.41-fold inference acceleration while achieving 48.92% parameter compression and a 53.40% reduction in floating-point operations (FLOPs).
Human pose estimation is a critical research area in the field of computer vision, playing a significant role in applications such as human-computer interaction, behavior analysis, and action recognition. In this paper, we propose a U-shaped keypoint detection network (DAUNet) based on an improved ResNet subsampling structure and spatial grouping mechanism. This network addresses key challenges in traditional methods, such as information loss, large network redundancy, and insufficient sensitivity to low-resolution features. DAUNet is composed of three main components. First, we introduce an improved BottleNeck block that employs partial convolution and strip pooling to reduce computational load and mitigate feature loss. Second, after upsampling, the network eliminates redundant features, improving the overall efficiency. Finally, a lightweight spatial grouping attention mechanism is applied to enhance low-resolution semantic features within the feature map, allowing for better restoration of the original image size and higher accuracy. Experimental results demonstrate that DAUNet achieves superior accuracy compared to most existing keypoint detection models, with a mean PCKh@0.5 score of 91.6% on the MPII dataset and an AP of 76.1% on the COCO dataset. Moreover, real-world experiments further validate the robustness and generalizability of DAUNet for detecting human bodies in unknown environments, highlighting its potential for broader applications.
Chinese named entity recognition (CNER) has received widespread attention as an important task of Chinese information extraction. Most previous research has focused on individually studying flat CNER, overlapped CNER, or discontinuous CNER. However, a unified CNER is often needed in real-world scenarios. Recent studies have shown that grid tagging-based methods based on character-pair relationship classification hold great potential for achieving unified NER. Nevertheless, how to enrich Chinese character-pair grid representations and capture deeper dependencies between character pairs to improve entity recognition performance remains an unresolved challenge. In this study, we enhance the character-pair grid representation by incorporating both local and global information. Significantly, we introduce a new approach by considering the character-pair grid representation matrix as a specialized image, converting the classification of character-pair relationships into a pixel-level semantic segmentation task. We devise a U-shaped network to extract multi-scale and deeper semantic information from the grid image, allowing for a more comprehensive understanding of associative features between character pairs. This approach leads to improved accuracy in predicting their relationships, ultimately enhancing entity recognition performance. We conducted experiments on two public CNER datasets in the biomedical domain, namely CMeEE-V2 and Diakg. The results demonstrate the effectiveness of our approach, which achieves F1-score improvements of 7.29 percentage points and 1.64 percentage points compared to the current state-of-the-art (SOTA) models, respectively.
To address the problems faced by medical image semantic segmentation, namely multi-scale variation of segmentation targets, noise interference, coarse segmentation results, and slow training, a multi-scale residual aggregation U-shaped attention network, MAAUNet (MultiRes aggregation attention UNet), is proposed based on MultiResUNet. First, an aggregate connection is introduced from the original feature aggregation at the same level. Skip connections are redesigned to aggregate features of different semantic scales at the decoder subnet, further addressing the semantic gaps that may exist between skip connections. Second, after the multi-scale convolution module, a convolution block attention module is added to focus and integrate features along the two attention directions of channel and space, adaptively optimizing the intermediate feature map. Finally, the original convolution block is improved. The convolution channels are expanded with a series convolution structure so that they complement each other and extract richer spatial features. Residual connections are retained and the convolution block is turned into a multi-channel convolution block, enabling the model to extract multi-scale spatial features. The experimental results show that MAAUNet is highly competitive on challenging datasets and shows good segmentation performance and stability when dealing with multi-scale input and noise interference.
The development of intelligent algorithms for controlling autonomous mobile robots in real-time activities has increased dramatically in recent years. However, conventional intelligent algorithms currently fail to accurately predict unexpected obstacles involved in tour paths and thereby suffer from inefficient tour trajectories. The present study addresses these issues by proposing a potential field integrated pruned adaptive resonance theory (PPART) neural network for effectively managing the touring process of autonomous mobile robots in real time. The proposed system is implemented using the AlphaBot platform, and the performance of the system is evaluated according to the obstacle prediction accuracy, path detection accuracy, time-lapse, tour length, and the overall accuracy of the system. The proposed system provides a very high obstacle prediction accuracy of 99.61%. Accordingly, the proposed tour planning design effectively predicts unexpected obstacles in the environment and thereby increases the overall efficiency of tour navigation.
Parkinson’s disease is a serious disease that causes death. Recently, a new dataset has been introduced on this disease. The aim of this study is to improve the predictive performance of the model designed for Parkinson’s disease diagnosis. By and large, original DNN models were designed using a specific or random number of neurons and layers. This study analyzed the effects of parameters, i.e., neuron number and activation function, on model performance based on a growing and pruning approach. In other words, this study addressed the optimum numbers of hidden layers and neurons and the ideal activation and optimization functions in order to find the best Deep Neural Network model. In the context of this study, several models were designed and evaluated. The overall results revealed that the Deep Neural Networks were significantly successful, with an accuracy of 99.34% on test data. This is also the highest prediction performance reported so far. Therefore, this study presents a promising model for more accurate Parkinson’s disease diagnosis.
To address the high computational complexity of the optimal brain surgeon (OBS) process, a pruning algorithm with a penalty OBS process is presented. Compared with sensitivity-based and regularization methods, the penalty OBS algorithm not only avoids the time-consuming computation and low pruning efficiency of the OBS process, but also maintains higher generalization and pruning accuracy than the Levenberg-Marquardt method.
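For context, the cost referred to above comes from the inverse Hessian used by OBS. The block below restates the standard OBS quantities from the original optimal brain surgeon formulation, not from this abstract; the penalty variant is not specified in the source and is not reproduced here. The symbols are assumptions of this note: $w_q$ is the candidate weight, $\mathbf{H}$ the Hessian of the training error with respect to the weights, and $\mathbf{e}_q$ the unit vector selecting weight $q$.

```latex
% Saliency of weight q: estimated increase in error if w_q is removed
L_q = \frac{w_q^{2}}{2\,[\mathbf{H}^{-1}]_{qq}}

% Compensating update applied to all remaining weights when w_q is removed
\delta\mathbf{w} = -\frac{w_q}{[\mathbf{H}^{-1}]_{qq}}\,\mathbf{H}^{-1}\mathbf{e}_q
```

Both expressions require $\mathbf{H}^{-1}$, whose size grows quadratically with the number of weights, which is why the plain OBS process is expensive for larger networks.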
Deep stochastic configuration networks (DSCNs) produce redundant hidden nodes and connections during training, which complicates their model structures. To address these problems, this paper proposes a double pruning structure design algorithm for DSCNs based on mutual information and relevance. During the training process, the mutual information algorithm is used to calculate and sort the importance scores of the nodes in each hidden layer in a layer-by-layer manner, the node pruning rate of each layer is set according to the depth of the DSCN at the current time, the nodes that contribute little to the model are deleted, and the network-related parameters are updated. When the model completes the configuration procedure, the correlation evaluation strategy is used to sort the global connection weights and delete insignificant connections; then, the network parameters are updated after pruning is completed. The experimental results show that the proposed structure design method can effectively compress the scale of a DSCN model and improve its modeling speed; the model accuracy loss is small, and fine-tuning for accuracy restoration is not needed. The obtained DSCN model has certain application value in the field of regression analysis.
Filter pruning effectively compresses the neural network by reducing both its parameters and computational cost. Existing pruning methods typically rely on pre-designed pruning criteria to measure filter importance and remove those deemed unimportant. However, different layers of the neural network exhibit varying filter distributions, making it inappropriate to apply the same pruning criterion to all layers. Additionally, some approaches apply different criteria from a set of pre-defined pruning rules to different layers, but the limited space makes it difficult to cover all layers. If criteria for all layers are manually designed, the process is costly and difficult to generalize to other networks. To solve this problem, we present a novel neural network pruning method based on the Criterion Learner and Attention Distillation (CLAD). Specifically, CLAD develops a differentiable criterion learner, which is integrated into each layer of the network. The learner can automatically learn the appropriate pruning criterion according to the filter parameters of each layer, thus eliminating the need for manual design. Furthermore, the criterion learner is trained end-to-end by the gradient optimization algorithm to achieve efficient pruning. In addition, attention distillation, which fully utilizes the knowledge of unpruned networks to guide the optimization of the learner and improve the pruned network performance, is introduced in the process of learner optimization. Experiments conducted on various datasets and networks demonstrate the effectiveness of the proposed method. Notably, CLAD reduces the FLOPs of ResNet-110 by about 53% on the CIFAR-10 dataset while simultaneously improving the network's accuracy by 0.05%. Moreover, it reduces the FLOPs of ResNet-50 by about 46% on the ImageNet-1K dataset and maintains a top-1 accuracy of 75.45%.
The use of blended acquisition technology in marine seismic exploration has the advantages of high acquisition efficiency and low exploration costs. However, during acquisition, the primary source may be disturbed by adjacent sources, resulting in blended noise that can adversely affect data processing and interpretation. Therefore, a de-blending method is needed to suppress blended noise and improve the quality of subsequent processing. Conventional de-blending methods, such as denoising and inversion methods, encounter challenges in parameter selection and entail high computational costs. In contrast, deep learning-based de-blending methods demonstrate reduced reliance on manual intervention and provide rapid calculation speeds post-training. In this study, we propose a Uformer network using a nonoverlapping window multihead attention mechanism designed for de-blending blended data in the common shot domain. We add depthwise convolution to the feedforward network to improve Uformer’s ability to capture local context information. The loss function comprises SSIM and L1 loss. Our test results indicate that the Uformer outperforms convolutional neural networks and traditional denoising methods across various evaluation metrics, thus highlighting the effectiveness and advantages of Uformer in de-blending blended data.
Bearing pitting, one of the common faults in mechanical systems, is a research hotspot in both academia and industry. Traditional fault diagnosis methods for bearings are based on manual experience with low diagnostic efficiency. This study proposes a novel bearing fault diagnosis method based on deep separable convolution and spatial dropout regularization. Deep separable convolution extracts features from the raw bearing vibration signals, during which a 3×1 convolutional kernel with a one-step size selects effective features by adjusting its weights. The similarity pruning process of the channel convolution and point convolution can reduce the number of parameters and calculation quantities by evaluating the size of the weights and removing the feature maps of smaller weights. The spatial dropout regularization method focuses on bearing signal fault features, improving the independence between the bearing signal features and enhancing the robustness of the model. A batch normalization algorithm is added to the convolutional layer for gradient explosion control and network stability improvement. To validate the effectiveness of the proposed method, we collect raw vibration signals from bearings in eight different health states. The experimental results show that the proposed method can effectively distinguish different pitting faults in the bearings with a better accuracy than that of other typical deep learning methods.
Overfitting is one of the important problems that restrain the application of neural networks. The traditional OBD (Optimal Brain Damage) algorithm can avoid overfitting effectively, but it needs to train the network repeatedly and has low computational efficiency. In this paper, the Marquardt algorithm is incorporated into the OBD algorithm, and a new method for pruning networks, Dynamic Optimal Brain Damage (DOBD), is introduced. This algorithm simplifies a network and obtains good generalization by dynamically deleting weight parameters with low sensitivity, where sensitivity is defined as the change of the error function value with respect to the change of the weights. A simplified method is also presented through which sensitivities can be calculated during training with little computation. A rule for determining the lower limit of sensitivity for deleting unnecessary weights, along with other control methods used during pruning and training, is introduced. The training process is analyzed theoretically, and the reason why the DOBD algorithm can train much faster than the OBD algorithm while avoiding overfitting effectively is given.
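To make the notion of sensitivity concrete, here is a minimal sketch using the classic OBD saliency, s_k = ½·h_kk·w_k², where h_kk is the diagonal Hessian entry for weight w_k. This is the standard OBD definition, not the simplified DOBD sensitivity, which the abstract does not spell out; the function names and the `threshold` parameter are hypothetical.

```python
def obd_saliencies(weights, hessian_diag):
    """Classic OBD saliency: 0.5 * h_kk * w_k**2 for each weight."""
    return [0.5 * h * w * w for w, h in zip(weights, hessian_diag)]


def prune_below(weights, hessian_diag, threshold):
    """Zero out every weight whose saliency is below the threshold.

    `threshold` stands in for the lower limit of sensitivity mentioned
    in the abstract; how DOBD actually chooses it is not given there.
    """
    saliencies = obd_saliencies(weights, hessian_diag)
    return [0.0 if s < threshold else w for w, s in zip(weights, saliencies)]


if __name__ == "__main__":
    weights = [0.5, -2.0, 0.1]
    hessian_diag = [4.0, 1.0, 10.0]
    print(obd_saliencies(weights, hessian_diag))
    print(prune_below(weights, hessian_diag, threshold=0.1))
```

Note that a small weight is not automatically unimportant: the third weight above has the largest curvature but still the lowest saliency, because the squared magnitude dominates.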
In the first part of the article, a new algorithm for pruning networks, Dynamic Optimal Brain Damage (DOBD), is introduced. In this part, two cases and an industrial application are worked out to test the new algorithm. It is verified that the algorithm can obtain good generalization by dynamically deleting weight parameters with low sensitivities, and that it gets better results than the Marquardt algorithm or the cross-validation method. Although the initial construction of the network may differ, the final number of free weights left after pruning by the DOBD algorithm is similar and is close to the optimal number of free weights. The algorithm is also helpful for designing the optimal structure of a network.
Intelligence is very important for command decision models, and it is also the key to improving the quality of simulation training and combat experiments. Decision-making content becomes more complex during task execution and the nature of the problems differs, so the demand for intelligence is high. To better address this problem, this paper presents a game method and establishes a game neural network model. The model has been successfully applied in a classification experiment on the winning rate of chess games, and it has good theoretical significance and application value.
The rapidly advancing Convolutional Neural Networks (CNNs) have brought about a paradigm shift in various computer vision tasks, while also garnering increasing interest and application in sensor-based Human Activity Recognition (HAR) efforts. However, the significant computational demands and memory requirements hinder the practical deployment of deep networks in resource-constrained systems. This paper introduces a novel network pruning method based on the energy spectral density of data in the frequency domain, which reduces the model’s depth and accelerates activity inference. Unlike traditional pruning methods that focus on the spatial domain and the importance of filters, this method converts sensor data, such as HAR data, to the frequency domain for analysis. It emphasizes the low-frequency components by calculating their energy spectral density values. Subsequently, filters that meet the predefined thresholds are retained, and redundant filters are removed, leading to a significant reduction in model size without compromising performance or incurring additional computational costs. Notably, the proposed algorithm’s effectiveness is empirically validated on a standard five-layer CNN backbone architecture. The computational feasibility and data sensitivity of the proposed scheme are thoroughly examined. Impressively, the classification accuracy on three benchmark HAR datasets, UCI-HAR, WISDM, and PAMAP2, reaches 96.20%, 98.40%, and 92.38%, respectively. Concurrently, our strategy achieves a reduction in Floating Point Operations (FLOPs) of 90.73%, 93.70%, and 90.74%, respectively, along with a corresponding decrease in memory consumption of 90.53%, 93.43%, and 90.05%.
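As an illustration of the frequency-domain quantity involved, the sketch below computes a one-sided energy spectral density |X[k]|² with a plain discrete Fourier transform and reports the share of energy in the lowest bins. It is a sketch under stated assumptions, not the paper's procedure: the function names and the `cutoff_bin` parameter are hypothetical, and how the paper maps such values to filter-retention thresholds is not described in the abstract.

```python
import cmath
import math


def energy_spectral_density(signal):
    """One-sided ESD: |X[k]|**2 for DFT bins k = 0 .. n // 2."""
    n = len(signal)
    esd = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(signal)
        )
        esd.append(abs(coeff) ** 2)
    return esd


def low_frequency_ratio(signal, cutoff_bin):
    """Share of one-sided spectral energy in bins 0 .. cutoff_bin."""
    esd = energy_spectral_density(signal)
    total = sum(esd)
    if total == 0.0:
        return 0.0
    return sum(esd[: cutoff_bin + 1]) / total


if __name__ == "__main__":
    n = 32
    slow = [math.sin(2 * math.pi * i / n) for i in range(n)]
    fast = [math.sin(2 * math.pi * 12 * i / n) for i in range(n)]
    print(low_frequency_ratio(slow, cutoff_bin=2))
    print(low_frequency_ratio(fast, cutoff_bin=2))
```

A slowly varying signal concentrates almost all of its energy in the low bins, while a rapidly oscillating one does not; a threshold on this ratio is one plausible way to express "emphasizing low-frequency components".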
Network fault diagnosis methods play a vital role in maintaining network service quality and enhancing user experience as an integral component of intelligent network management. Considering the unique characteristics of edge networks, such as limited resources, complex network faults, and the need for high real-time performance, enhancing and optimizing existing network fault diagnosis methods is necessary. Therefore, this paper proposes a lightweight edge-side fault diagnosis approach based on a spiking neural network (LSNN). First, we use the Izhikevich neuron model to replace the Leaky Integrate and Fire (LIF) neuron model in the LSNN model. Izhikevich neurons inherit the simplicity of LIF neurons but also possess richer behavioral characteristics and flexibility to handle diverse data inputs. Inspired by Fast Spiking Interneurons (FSIs) with a high-frequency firing pattern, we use the parameters of FSIs. Second, inspired by the connection mode based on spiking dynamics in the basal ganglia (BG) area of the brain, we propose a pruning approach based on the FSIs of the BG in LSNN to improve computational efficiency and reduce the demand for computing resources and energy consumption. Furthermore, we propose a multiple iterative Dynamic Spike Timing Dependent Plasticity (DSTDP) algorithm to enhance the accuracy of the LSNN model. Experiments on two server fault datasets demonstrate significant precision, recall, and F1 improvements across three diagnosis dimensions. At the same time, lightweight indicators such as Params and FLOPs are significantly reduced, showcasing the LSNN’s advanced performance and model efficiency. To conclude, experimental results on the two datasets indicate that the LSNN model surpasses traditional models and achieves cutting-edge outcomes in network fault diagnosis tasks.
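For readers unfamiliar with the neuron model named above, the following sketch simulates a single Izhikevich neuron with forward Euler integration. The equations and the fast-spiking parameter set (a = 0.1, b = 0.2, c = −65, d = 2) are the commonly cited ones from Izhikevich's simple spiking model; whether the LSNN paper uses exactly these values is an assumption, and the function name, time step, and input current are illustrative only.

```python
def simulate_izhikevich(current, duration_ms=200.0, dt=0.5,
                        a=0.1, b=0.2, c=-65.0, d=2.0):
    """Count spikes of one Izhikevich neuron driven by a constant current.

    Dynamics: v' = 0.04*v**2 + 5*v + 140 - u + I,  u' = a*(b*v - u),
    with the reset v <- c, u <- u + d whenever v reaches 30 mV.
    """
    v = c          # membrane potential (mV)
    u = b * v      # recovery variable
    spikes = 0
    for _ in range(int(duration_ms / dt)):
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + current)
        u += dt * a * (b * v - u)
        if v >= 30.0:
            v = c
            u += d
            spikes += 1
    return spikes


if __name__ == "__main__":
    print("spikes without input:", simulate_izhikevich(0.0))
    print("spikes with I = 10:  ", simulate_izhikevich(10.0))
```

Compared with a leaky integrate-and-fire unit, the extra recovery variable `u` is what lets the same two equations reproduce different firing patterns by changing only the four parameters.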
This paper focuses mainly on the application of a Partially Connected Backpropagation Neural Network (PCBP) instead of the typical Fully Connected Neural Network (FCBP). The initial neural network is fully connected. After training with sample data using cross-entropy as the error function, a clustering method is employed to cluster the weights from the input layer to the hidden layer and from the hidden layer to the output layer, and connections that are relatively unnecessary are deleted, so that the initial network becomes a PCBP network. The PCBP can then be used in prediction or data mining by training it with data from a database. At the end of this paper, several experiments are conducted on the Iris data set to illustrate the effects of PCBP.
Funding (flexible pruning based on the logistic growth differential equation): Supported by the National Natural Science Foundation of China under Grant No. 62172132.
Funding (critical path retention pruning, CPRP): Supported by the National Key Research and Development Program of China (No. 2022ZD0119003) and the National Natural Science Foundation of China (No. 61834005).
Funding (DAUNet): Supported by the Natural Science Foundation of Hubei Province of China under grant number 2022CFB536, the National Natural Science Foundation of China under grant number 62367006, and the 15th Graduate Education Innovation Fund of Wuhan Institute of Technology under grant number CX2023579.
Funding: Supported by the Yunnan Provincial Major Science and Technology Special Plan Projects (Grant Nos. 202202AD080003, 202202AE090008, 202202AD080004, 202302AD080003), the National Natural Science Foundation of China (Grant Nos. U21B2027, 62266027, 62266028, 62266025), and the Yunnan Province Young and Middle-Aged Academic and Technical Leaders Reserve Talent Program (Grant No. 202305AC160063).
Abstract: Chinese named entity recognition (CNER) has received widespread attention as an important task in Chinese information extraction. Most previous research has studied flat CNER, overlapped CNER, or discontinuous CNER individually. However, a unified CNER is often needed in real-world scenarios. Recent studies have shown that grid-tagging methods based on character-pair relationship classification hold great potential for achieving unified NER. Nevertheless, how to enrich Chinese character-pair grid representations and capture deeper dependencies between character pairs to improve entity recognition performance remains an unresolved challenge. In this study, we enhance the character-pair grid representation by incorporating both local and global information. Significantly, we introduce a new approach that treats the character-pair grid representation matrix as a specialized image, converting the classification of character-pair relationships into a pixel-level semantic segmentation task. We devise a U-shaped network to extract multi-scale and deeper semantic information from the grid image, allowing a more comprehensive understanding of associative features between character pairs. This approach improves the accuracy of predicting their relationships, ultimately enhancing entity recognition performance. We conducted experiments on two public CNER datasets in the biomedical domain, namely CMeEE-V2 and Diakg. The results demonstrate the effectiveness of our approach, which achieves F1-score improvements of 7.29 percentage points and 1.64 percentage points, respectively, over the current state-of-the-art (SOTA) models.
Funding: National Natural Science Foundation of China (No. 61806006); Jiangsu University Superior Discipline Construction Project.
Abstract: Medical image semantic segmentation faces multi-scale variation of segmentation targets, noise interference, rough segmentation results, and slow training. To address these problems, a multi-scale residual aggregation U-shaped attention network, MAAUNet (MultiRes aggregation attention UNet), is proposed based on MultiResUNet. First, aggregate connections are introduced on top of the original same-level feature aggregation: the skip connections are redesigned to aggregate features of different semantic scales in the decoder subnetwork, further narrowing the semantic gap that may exist across skip connections. Second, a convolutional block attention module is added after the multi-scale convolution module to focus and integrate features along the two attention directions of channel and space, adaptively refining the intermediate feature map. Finally, the original convolution block is improved. The convolution channels are expanded with a serial convolution structure so that they complement each other and extract richer spatial features. Residual connections are retained, and the convolution block is turned into a multi-channel convolution block, enabling the model to extract multi-scale spatial features. The experimental results show that MAAUNet is highly competitive on challenging datasets and shows good segmentation performance and stability when dealing with multi-scale input and noise interference.
Abstract: The development of intelligent algorithms for controlling autonomous mobile robots in real-time activities has increased dramatically in recent years. However, conventional intelligent algorithms currently fail to accurately predict unexpected obstacles on tour paths and therefore suffer from inefficient tour trajectories. The present study addresses these issues by proposing a potential field integrated pruned adaptive resonance theory (PPART) neural network for effectively managing the touring process of autonomous mobile robots in real time. The proposed system is implemented on the AlphaBot platform, and its performance is evaluated according to obstacle prediction accuracy, path detection accuracy, time-lapse, tour length, and overall accuracy. The proposed system provides a very high obstacle prediction accuracy of 99.61%. Accordingly, the proposed tour planning design effectively predicts unexpected obstacles in the environment and thereby increases the overall efficiency of tour navigation.
Abstract: Parkinson’s disease is a serious disease that causes death. Recently, a new dataset on this disease has been introduced. The aim of this study is to improve the predictive performance of a model designed for Parkinson’s disease diagnosis. By and large, original DNN models have been designed with a specific or random number of neurons and layers. This study analyzed the effects of parameters, i.e., the number of neurons and the activation function, on model performance based on a growing and pruning approach. In other words, the study sought the optimum numbers of hidden layers and neurons and the ideal activation and optimization functions in order to find the best deep neural network model. In the context of this study, several models were designed and evaluated. The overall results revealed that the deep neural networks were significantly successful, with an accuracy of 99.34% on test data, which is also the highest prediction performance reported so far. Therefore, this study presents a promising model for more accurate Parkinson’s disease diagnosis.
Abstract: To address the high computational complexity of the optimal brain surgeon (OBS) process, a pruning algorithm with a penalty OBS process is presented. Compared with sensitivity-based and regularization methods, the penalty OBS algorithm not only avoids the time-consuming computation and low pruning efficiency of the OBS process, but also achieves better generalization and pruning accuracy than the Levenberg-Marquardt method.
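For background, the classical (unpenalized) OBS step that this abstract builds on can be written as below. These are the standard OBS expressions, not the penalty variant proposed in the paper, whose exact form the abstract does not give. Here $\mathbf{H}$ denotes the Hessian of the training error with respect to the weights, and $\mathbf{e}_q$ is the unit vector that selects weight $w_q$.

```latex
L_q = \frac{w_q^{2}}{2\,[\mathbf{H}^{-1}]_{qq}},
\qquad
\delta\mathbf{w} = -\frac{w_q}{[\mathbf{H}^{-1}]_{qq}}\,\mathbf{H}^{-1}\mathbf{e}_q
```

The weight with the smallest saliency $L_q$ is deleted, and the remaining weights are adjusted by $\delta\mathbf{w}$. Forming and inverting $\mathbf{H}$ is the expensive part of this procedure, which is the computational burden the abstract refers to.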
Funding: Supported by the National Natural Science Foundation of China (62073006) and the Beijing Natural Science Foundation of China (4212032).
Abstract: Deep stochastic configuration networks (DSCNs) produce redundant hidden nodes and connections during training, which complicates their model structures. To address this problem, this paper proposes a double pruning structure design algorithm for DSCNs based on mutual information and relevance. During training, the mutual information algorithm is used to calculate and sort the importance scores of the nodes in each hidden layer, layer by layer; the node pruning rate of each layer is set according to the current depth of the DSCN; the nodes that contribute little to the model are deleted; and the network-related parameters are updated. When the model completes the configuration procedure, the correlation evaluation strategy is used to sort the global connection weights and delete insignificant connections; the network parameters are then updated after pruning is completed. The experimental results show that the proposed structure design method can effectively compress the scale of a DSCN model and improve its modeling speed; the loss in model accuracy is small, and fine-tuning to restore accuracy is not needed. The resulting DSCN model has application value in the field of regression analysis.
Funding: Supported in part by the National Natural Science Foundation of China under grants 62073085, 61973330 and 62350055, in part by the Shenzhen Science and Technology Program, China under grant JCYJ20230807093513027, and in part by the Fundamental Research Funds for the Central Universities, China under grant 1243300008.
Abstract: Filter pruning effectively compresses a neural network by reducing both its parameters and its computational cost. Existing pruning methods typically rely on pre-designed pruning criteria to measure filter importance and remove the filters deemed unimportant. However, different layers of a neural network exhibit different filter distributions, so applying the same pruning criterion to all layers is inappropriate. Some approaches select different criteria for different layers from a set of pre-defined pruning rules, but the limited size of this set makes it difficult to cover all layers. Designing criteria for all layers manually is costly and difficult to generalize to other networks. To solve this problem, we present a novel neural network pruning method based on the Criterion Learner and Attention Distillation (CLAD). Specifically, CLAD develops a differentiable criterion learner, which is integrated into each layer of the network. The learner automatically learns the appropriate pruning criterion from the filter parameters of each layer, eliminating the need for manual design. Furthermore, the criterion learner is trained end-to-end by gradient optimization to achieve efficient pruning. In addition, attention distillation, which fully utilizes the knowledge of the unpruned network to guide the optimization of the learner and improve the performance of the pruned network, is introduced into the learner optimization process. Experiments conducted on various datasets and networks demonstrate the effectiveness of the proposed method. Notably, CLAD reduces the FLOPs of ResNet-110 by about 53% on the CIFAR-10 dataset while improving the network's accuracy by 0.05%. Moreover, it reduces the FLOPs of ResNet-50 by about 46% on the ImageNet-1K dataset while maintaining a top-1 accuracy of 75.45%.
Funding: Supported by the National Natural Science Foundation of China (Research on Dynamic Location of Receiving Points and Wave Field Separation Technology Based on Deep Learning in OBN Seismic Exploration, No. 42074140) and the Sinopec Geophysical Corporation, Project of OBC/OBN Seismic Data Wave Field Characteristics Analysis and Ghost Wave Suppression (No. SGC-202206).
Abstract: The use of blended acquisition technology in marine seismic exploration has the advantages of high acquisition efficiency and low exploration costs. However, during acquisition, the primary source may be disturbed by adjacent sources, resulting in blended noise that can adversely affect data processing and interpretation. Therefore, a de-blending method is needed to suppress blended noise and improve the quality of subsequent processing. Conventional de-blending methods, such as denoising and inversion methods, encounter challenges in parameter selection and entail high computational costs. In contrast, deep learning-based de-blending methods rely less on manual intervention and compute rapidly once trained. In this study, we propose a Uformer network using a nonoverlapping window multihead attention mechanism designed for de-blending blended data in the common shot domain. We add depthwise convolution to the feedforward network to improve Uformer’s ability to capture local context information. The loss function comprises SSIM and L1 loss. Our test results indicate that the Uformer outperforms convolutional neural networks and traditional denoising methods across various evaluation metrics, highlighting the effectiveness and advantages of Uformer in de-blending blended data.
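A loss that combines SSIM and L1 terms, as this abstract mentions, might look like the sketch below. Several things here are assumptions rather than details from the paper: the mixing weight `alpha`, the use of a single global SSIM window instead of the usual sliding window, and the `data_range` used for the stabilizing constants. The function names are hypothetical, and plain Python lists stand in for seismic gathers.

```python
def mean(values):
    return sum(values) / len(values)


def global_ssim(x, y, data_range=1.0):
    """SSIM computed over the whole signal as one window (a simplification)."""
    c1 = (0.01 * data_range) ** 2  # standard SSIM stabilizing constants
    c2 = (0.03 * data_range) ** 2
    mx, my = mean(x), mean(y)
    vx = mean([(a - mx) ** 2 for a in x])
    vy = mean([(b - my) ** 2 for b in y])
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    numerator = (2 * mx * my + c1) * (2 * cov + c2)
    denominator = (mx * mx + my * my + c1) * (vx + vy + c2)
    return numerator / denominator


def l1_loss(x, y):
    """Mean absolute error between prediction and target."""
    return mean([abs(a - b) for a, b in zip(x, y)])


def deblend_loss(pred, target, alpha=0.5, data_range=1.0):
    """Weighted sum of (1 - SSIM) and L1; alpha is an assumed hyperparameter."""
    structural = 1.0 - global_ssim(pred, target, data_range)
    return alpha * structural + (1.0 - alpha) * l1_loss(pred, target)


clean = [0.0, 0.5, 1.0, 0.5, 0.0, 0.25]
noisy = [0.1, 0.4, 0.9, 0.7, 0.2, 0.25]
loss_same = deblend_loss(clean, clean)   # identical signals
loss_noisy = deblend_loss(noisy, clean)  # perturbed prediction
```

The SSIM term rewards structural agreement while the L1 term penalizes amplitude errors sample by sample, which is a common reason for pairing the two.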
Funding: The National Key Research and Development Program of China (No. 2019YFB1704500); the State Ministry of Science and Technology Innovation Fund of China (No. 2018IM030200); the National Natural Foundation of China (No. U1708255); the China Scholarship Council (No. 201906080059).
Abstract: Bearing pitting, one of the common faults in mechanical systems, is a research hotspot in both academia and industry. Traditional fault diagnosis methods for bearings are based on manual experience and have low diagnostic efficiency. This study proposes a novel bearing fault diagnosis method based on deep separable convolution and spatial dropout regularization. Deep separable convolution extracts features from the raw bearing vibration signals, during which a 3×1 convolutional kernel with a stride of one selects effective features by adjusting its weights. The similarity pruning process of the channel convolution and point convolution reduces the number of parameters and the amount of computation by evaluating the size of the weights and removing the feature maps with smaller weights. The spatial dropout regularization method focuses on bearing signal fault features, improving the independence between the bearing signal features and enhancing the robustness of the model. A batch normalization algorithm is added to the convolutional layer to control gradient explosion and improve network stability. To validate the effectiveness of the proposed method, we collect raw vibration signals from bearings in eight different health states. The experimental results show that the proposed method can effectively distinguish different pitting faults in the bearings with better accuracy than other typical deep learning methods.
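To illustrate why splitting a convolution into a channel (depthwise) step and a point (pointwise) step saves parameters, the sketch below compares weight counts for a 1-D layer with a kernel of length 3. It assumes that "deep separable convolution" in this abstract refers to the standard depthwise separable convolution; the channel sizes are made up, biases are ignored, and the function names are hypothetical.

```python
def standard_conv_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights in an ordinary 1-D convolution (bias terms ignored)."""
    return kernel_size * in_channels * out_channels


def separable_conv_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights in a depthwise step (one kernel per input channel)
    followed by a pointwise 1x1 step that mixes channels."""
    depthwise = kernel_size * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise


# Hypothetical layer: 3x1 kernel, 16 input channels, 32 output channels.
standard = standard_conv_params(3, 16, 32)
separable = separable_conv_params(3, 16, 32)
print(standard, separable)  # 1536 560
```

The pruning step described in the abstract would reduce these counts further by dropping low-weight feature maps; that step is not modeled here.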
Abstract: Overfitting is one of the important problems that limit the application of neural networks. The traditional OBD (Optimal Brain Damage) algorithm can avoid overfitting effectively, but it needs to train the network repeatedly, which is computationally inefficient. In this paper, the Marquardt algorithm is incorporated into the OBD algorithm, and a new network pruning method, Dynamic Optimal Brain Damage (DOBD), is introduced. The algorithm simplifies a network and obtains good generalization by dynamically deleting weight parameters with low sensitivity, where sensitivity is defined as the change in the error function value with respect to the change in the weights. A simplified method is also presented through which sensitivities can be calculated during training with little computation. A rule for determining the lower limit of sensitivity below which unnecessary weights are deleted, and other control methods used during pruning and training, are introduced. The training process is analyzed theoretically, and the reason why the DOBD algorithm trains much faster than the OBD algorithm while avoiding overfitting effectively is given.
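As a rough illustration of sensitivity-based deletion, the sketch below uses the classical OBD saliency, which approximates the error increase from removing weight w_k as 0.5 * h_kk * w_k^2 with h_kk a diagonal Hessian entry. This is the textbook OBD quantity, not the simplified DOBD sensitivity, which the abstract does not spell out; the threshold, the numbers, and the function names are hypothetical.

```python
def obd_saliencies(weights, hessian_diag):
    """Classical OBD saliency per weight: 0.5 * h_kk * w_k**2."""
    return [0.5 * h * w * w for w, h in zip(weights, hessian_diag)]


def prune_low_sensitivity(weights, hessian_diag, lower_limit):
    """Zero out weights whose saliency falls below the lower limit."""
    saliencies = obd_saliencies(weights, hessian_diag)
    return [0.0 if s < lower_limit else w for w, s in zip(weights, saliencies)]


weights = [2.0, 0.1, -1.0]
hessian_diag = [1.0, 4.0, 0.5]  # made-up second derivatives of the error
saliencies = obd_saliencies(weights, hessian_diag)
pruned = prune_low_sensitivity(weights, hessian_diag, lower_limit=0.1)
```

With these numbers only the second weight falls below the limit, so it is the one set to zero. In a dynamic scheme like the one described, this check would run during training rather than after it.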
Abstract: In the first part of the article, a new network pruning algorithm, Dynamic Optimal Brain Damage (DOBD), is introduced. In this part, two cases and an industrial application are worked out to test the new algorithm. It is verified that the algorithm obtains good generalization by dynamically deleting weight parameters with low sensitivities and gives better results than the Marquardt algorithm or the cross-validation method. Although the initial construction of the network may differ, the final number of free weights obtained by the DOBD algorithm is similar, and it is close to the optimal number of free weights. The algorithm also helps in designing the optimal structure of a network.
Abstract: Intelligence is very important for command decision models and is key to improving the quality of simulation training and combat experiments. Decision-making content is complex during task execution and the nature of the problems varies, so the demand for intelligence is high. To better solve this problem, this paper presents a game method and establishes a game neural network model. The model has been successfully applied in a classification experiment on winning rates in chess games, and it has good theoretical significance and application value.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61902158 and 62202210).
Abstract: The rapidly advancing Convolutional Neural Networks (CNNs) have brought about a paradigm shift in various computer vision tasks, while also attracting increasing interest and application in sensor-based Human Activity Recognition (HAR). However, the significant computational demands and memory requirements hinder the practical deployment of deep networks in resource-constrained systems. This paper introduces a novel network pruning method based on the energy spectral density of data in the frequency domain, which reduces the model’s depth and accelerates activity inference. Unlike traditional pruning methods that focus on the spatial domain and the importance of filters, this method converts sensor data, such as HAR data, to the frequency domain for analysis. It emphasizes the low-frequency components by calculating their energy spectral density values. Subsequently, filters that meet the predefined thresholds are retained and redundant filters are removed, leading to a significant reduction in model size without compromising performance or incurring additional computational costs. Notably, the proposed algorithm’s effectiveness is empirically validated on a standard five-layer CNN backbone architecture. The computational feasibility and data sensitivity of the proposed scheme are thoroughly examined. The classification accuracy on three benchmark HAR datasets, UCI-HAR, WISDM, and PAMAP2, reaches 96.20%, 98.40%, and 92.38%, respectively. Concurrently, our strategy reduces Floating Point Operations (FLOPs) by 90.73%, 93.70%, and 90.74%, respectively, along with a corresponding decrease in memory consumption of 90.53%, 93.43%, and 90.05%.
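The idea of scoring signals by their low-frequency energy spectral density could be sketched as follows. This is only a guess at the mechanics: the abstract does not say which signals are transformed, how the cutoff is chosen, or what the thresholds are, so the sketch assumes one 1-D response per filter, a fixed number of low-frequency bins, and a ratio threshold. All names and values are hypothetical, and a naive DFT is used to stay within the standard library.

```python
import cmath
import math


def energy_spectral_density(signal):
    """One-sided energy spectral density |X[k]|^2 via a naive O(n^2) DFT."""
    n = len(signal)
    esd = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        esd.append(abs(coeff) ** 2)
    return esd


def low_frequency_ratio(signal, cutoff_bins):
    """Share of spectral energy that lies in the first cutoff_bins bins."""
    esd = energy_spectral_density(signal)
    total = sum(esd)
    return sum(esd[:cutoff_bins]) / total if total > 0 else 0.0


def select_filters(responses, cutoff_bins, threshold):
    """Keep the indices of filters whose response is dominated by low frequencies."""
    return [i for i, r in enumerate(responses)
            if low_frequency_ratio(r, cutoff_bins) >= threshold]


n = 32
slow = [math.sin(2 * math.pi * 1 * t / n) for t in range(n)]   # energy in bin 1
fast = [math.sin(2 * math.pi * 12 * t / n) for t in range(n)]  # energy in bin 12
kept = select_filters([slow, fast], cutoff_bins=4, threshold=0.5)
```

In practice an FFT routine would replace the naive transform; the selection logic would stay the same.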
Funding: Supported by the National Key R&D Program of China (2019YFB2103202).
Abstract: As an integral component of intelligent network management, network fault diagnosis methods play a vital role in maintaining network service quality and enhancing user experience. Given the unique characteristics of edge networks, such as limited resources, complex network faults, and the need for high real-time performance, existing network fault diagnosis methods need to be enhanced and optimized. Therefore, this paper proposes a lightweight edge-side fault diagnosis approach based on a spiking neural network (LSNN). First, we use the Izhikevich neuron model to replace the Leaky Integrate and Fire (LIF) neuron model in the LSNN model. Izhikevich neurons inherit the simplicity of LIF neurons while possessing richer behavioral characteristics and the flexibility to handle diverse data inputs. Inspired by Fast Spiking Interneurons (FSIs), which have a high-frequency firing pattern, we use the parameters of FSIs. Second, inspired by the connection mode based on spiking dynamics in the basal ganglia (BG) area of the brain, we propose a pruning approach based on the FSIs of the BG in LSNN to improve computational efficiency and reduce the demand for computing resources and energy. Furthermore, we propose a multiple iterative Dynamic Spike Timing Dependent Plasticity (DSTDP) algorithm to enhance the accuracy of the LSNN model. Experiments on two server fault datasets demonstrate significant improvements in precision, recall, and F1 across three diagnosis dimensions. At the same time, lightweight indicators such as Params and FLOPs are significantly reduced, showcasing the LSNN’s performance and model efficiency. In conclusion, the experimental results on the two datasets indicate that the LSNN model surpasses traditional models and achieves state-of-the-art results in network fault diagnosis tasks.
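For readers unfamiliar with the Izhikevich model mentioned in this abstract, a minimal single-neuron simulation is sketched below. It uses the commonly cited fast-spiking parameter set (a = 0.1, b = 0.2, c = -65, d = 2); whether the paper uses exactly these values is an assumption, as are the input currents, the Euler step size, and the function name.

```python
def simulate_izhikevich(current, steps, dt=0.5, a=0.1, b=0.2, c=-65.0, d=2.0):
    """Forward-Euler simulation of one Izhikevich neuron.

    Returns the spike times in milliseconds. The defaults are the commonly
    cited fast-spiking parameters (assumed here, not taken from the paper).
    """
    v = c          # membrane potential (mV)
    u = b * v      # recovery variable
    spike_times = []
    for step in range(steps):
        # Both derivatives are evaluated from the state at the start of the step.
        dv = 0.04 * v * v + 5.0 * v + 140.0 - u + current
        du = a * (b * v - u)
        v += dt * dv
        u += dt * du
        if v >= 30.0:          # spike: record it, then reset
            spike_times.append(step * dt)
            v = c
            u += d
    return spike_times


driven = simulate_izhikevich(current=10.0, steps=400)  # 200 ms with constant input
silent = simulate_izhikevich(current=0.0, steps=400)   # no input current
```

With a constant input the neuron fires repeatedly, while with no input it settles to rest without spiking. The quadratic voltage term is what gives the model richer dynamics than an LIF neuron at similar cost.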
Abstract: This paper focuses mainly on the application of a Partially Connected Backpropagation Neural Network (PCBP) in place of the typical Fully Connected Backpropagation Neural Network (FCBP). The initial neural network is fully connected. After it is trained on sample data with cross-entropy as the error function, a clustering method is employed to cluster the weights from the input layer to the hidden layer and from the hidden layer to the output layer, and connections that are relatively unnecessary are deleted, so that the initial network becomes a PCBP network. The PCBP network can then be used for prediction or data mining by training it with data from a database. At the end of the paper, several experiments on the Iris data set are conducted to illustrate the effects of PCBP.
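One way the cluster-then-delete step could work is sketched below. The abstract does not name the clustering method, so this sketch assumes a one-dimensional two-means clustering of absolute weight values and deletes every connection that lands in the low-magnitude cluster. The function names and the weight matrix are hypothetical.

```python
def two_means_threshold(values, iterations=50):
    """1-D k-means with k=2; returns the midpoint between the two centroids."""
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        low = [v for v in values if v <= mid]
        high = [v for v in values if v > mid]
        if not high:  # all values identical: nothing to separate
            break
        new_lo = sum(low) / len(low)
        new_hi = sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):  # centroids stopped moving
            break
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2.0


def partially_connect(weights):
    """Zero out connections whose magnitude falls in the low cluster."""
    magnitudes = [abs(w) for row in weights for w in row]
    threshold = two_means_threshold(magnitudes)
    mask = [[abs(w) > threshold for w in row] for row in weights]
    pruned = [[w if keep else 0.0 for w, keep in zip(row, mask_row)]
              for row, mask_row in zip(weights, mask)]
    return pruned, mask


# Hypothetical input-to-hidden weights after training the fully connected net.
trained = [[0.9, -0.05, 0.8],
           [0.02, -1.1, 0.01]]
pruned, mask = partially_connect(trained)
```

The mask would then be held fixed while the partially connected network is retrained, so that deleted connections stay at zero.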