期刊文献+
共找到3,903篇文章
< 1 2 196 >
每页显示 20 50 100
ACtriplet:An improved deep learning model for activity cliffs prediction by integrating triplet loss and pre-training 被引量:1
1
作者 Xinxin Yu Yimeng Wang +3 位作者 Long Chen Weihua Li Yun Tang Guixia Liu 《Journal of Pharmaceutical Analysis》 2025年第8期1837-1847,共11页
Activity cliffs(ACs)are generally defined as pairs of similar compounds that only differ by a minor structural modification but exhibit a large difference in their binding affinity for a given target.ACs offer crucial... Activity cliffs(ACs)are generally defined as pairs of similar compounds that only differ by a minor structural modification but exhibit a large difference in their binding affinity for a given target.ACs offer crucial insights that aid medicinal chemists in optimizing molecular structures.Nonetheless,they also form a major source of prediction error in structure-activity relationship(SAR)models.To date,several studies have demonstrated that deep neural networks based on molecular images or graphs might need to be improved further in predicting the potency of ACs.In this paper,we integrated the triplet loss in face recognition with pre-training strategy to develop a prediction model ACtriplet,tailored for ACs.Through extensive comparison with multiple baseline models on 30 benchmark datasets,the results showed that ACtriplet was significantly better than those deep learning(DL)models without pretraining.In addition,we explored the effect of pre-training on data representation.Finally,the case study demonstrated that our model's interpretability module could explain the prediction results reasonably.In the dilemma that the amount of data could not be increased rapidly,this innovative framework would better make use of the existing data,which would propel the potential of DL in the early stage of drug discovery and optimization. 展开更多
关键词 Activity cliff Triplet loss Deep learning pre-training
暂未订购
Synaptic pruning mechanisms and application of emerging imaging techniques in neurological disorders
2
作者 Yakang Xing Yi Mo +1 位作者 Qihui Chen Xiao Li 《Neural Regeneration Research》 2026年第5期1698-1714,共17页
Synaptic pruning is a crucial process in synaptic refinement,eliminating unstable synaptic connections in neural circuits.This process is triggered and regulated primarily by spontaneous neural activity and experience... Synaptic pruning is a crucial process in synaptic refinement,eliminating unstable synaptic connections in neural circuits.This process is triggered and regulated primarily by spontaneous neural activity and experience-dependent mechanisms.The pruning process involves multiple molecular signals and a series of regulatory activities governing the“eat me”and“don't eat me”states.Under physiological conditions,the interaction between glial cells and neurons results in the clearance of unnecessary synapses,maintaining normal neural circuit functionality via synaptic pruning.Alterations in genetic and environmental factors can lead to imbalanced synaptic pruning,thus promoting the occurrence and development of autism spectrum disorder,schizophrenia,Alzheimer's disease,and other neurological disorders.In this review,we investigated the molecular mechanisms responsible for synaptic pruning during neural development.We focus on how synaptic pruning can regulate neural circuits and its association with neurological disorders.Furthermore,we discuss the application of emerging optical and imaging technologies to observe synaptic structure and function,as well as their potential for clinical translation.Our aim was to enhance our understanding of synaptic pruning during neural development,including the molecular basis underlying the regulation of synaptic function and the dynamic changes in synaptic density,and to investigate the potential role of these mechanisms in the pathophysiology of neurological diseases,thus providing a theoretical foundation for the treatment of neurological disorders. 展开更多
关键词 CHEMOKINE COMPLEMENT experience-dependent driven synaptic pruning imaging techniques NEUROGLIA signaling pathways synapse elimination synaptic pruning
暂未订购
Modeling Pruning as a Phase Transition:A Thermodynamic Analysis of Neural Activations
3
作者 Rayeesa Mehmood Sergei Koltcov +1 位作者 Anton Surkov Vera Ignatenko 《Computers, Materials & Continua》 2026年第3期2304-2327,共24页
Activation pruning reduces neural network complexity by eliminating low-importance neuron activations,yet identifying the critical pruning threshold—beyond which accuracy rapidly deteriorates—remains computationally... Activation pruning reduces neural network complexity by eliminating low-importance neuron activations,yet identifying the critical pruning threshold—beyond which accuracy rapidly deteriorates—remains computationally expensive and typically requires exhaustive search.We introduce a thermodynamics-inspired framework that treats activation distributions as energy-filtered physical systems and employs the free energy of activations as a principled evaluation metric.Phase-transition-like phenomena in the free-energy profile—such as extrema,inflection points,and curvature changes—yield reliable estimates of the critical pruning threshold,providing a theoretically grounded means of predicting sharp accuracy degradation.To further enhance efficiency,we propose a renormalized free energy technique that approximates full-evaluation free energy using only the activation distribution of the unpruned network.This eliminates repeated forward passes,dramatically reducing computational overhead and achieving speedups of up to 550×for MLPs.Extensive experiments across diverse vision architectures(MLP,CNN,ResNet,MobileNet,Vision Transformer)and text models(LSTM,BERT,ELECTRA,T5,GPT-2)on multiple datasets validate the generality,robustness,and computational efficiency of our approach.Overall,this work establishes a theoretically grounded and practically effective framework for activation pruning,bridging the gap between analytical understanding and efficient deployment of sparse neural networks. 展开更多
关键词 THERMODYNAMICS activation pruning model compression SPARSITY free energy RENORMALIZATION
在线阅读 下载PDF
Mitigating Attribute Inference in Split Learning via Channel Pruning and Adversarial Training
4
作者 Afnan Alhindi Saad Al-Ahmadi Mohamed Maher Ben Ismail 《Computers, Materials & Continua》 2026年第3期1465-1489,共25页
Split Learning(SL)has been promoted as a promising collaborative machine learning technique designed to address data privacy and resource efficiency.Specifically,neural networks are divided into client and server subn... Split Learning(SL)has been promoted as a promising collaborative machine learning technique designed to address data privacy and resource efficiency.Specifically,neural networks are divided into client and server subnetworks in order to mitigate the exposure of sensitive data and reduce the overhead on client devices,thereby making SL particularly suitable for resource-constrained devices.Although SL prevents the direct transmission of raw data,it does not alleviate entirely the risk of privacy breaches.In fact,the data intermediately transmitted to the server sub-model may include patterns or information that could reveal sensitive data.Moreover,achieving a balance between model utility and data privacy has emerged as a challenging problem.In this article,we propose a novel defense approach that combines:(i)Adversarial learning,and(ii)Network channel pruning.In particular,the proposed adversarial learning approach is specifically designed to reduce the risk of private data exposure while maintaining high performance for the utility task.On the other hand,the suggested channel pruning enables the model to adaptively adjust and reactivate pruned channels while conducting adversarial training.The integration of these two techniques reduces the informativeness of the intermediate data transmitted by the client sub-model,thereby enhancing its robustness against attribute inference attacks without adding significant computational overhead,making it wellsuited for IoT devices,mobile platforms,and Internet of Vehicles(IoV)scenarios.The proposed defense approach was evaluated using EfficientNet-B0,a widely adopted compact model,along with three benchmark datasets.The obtained results showcased its superior defense capability against attribute inference attacks compared to existing state-of-the-art methods.This research’s findings demonstrated the effectiveness of the proposed channel pruning-based adversarial training approach in achieving the intended compromise between utility and privacy within SL frameworks.In fact,the classification accuracy attained by the attackers witnessed a drastic decrease of 70%. 展开更多
关键词 Split learning privacy-preserving split learning distributed collaborative machine learning channel pruning adversarial learning resource-constrained devices
在线阅读 下载PDF
Effective distributed convolutional neural network architecture for remote sensing images target classification with a pre-training approach 被引量:3
5
作者 LI Binquan HU Xiaohui 《Journal of Systems Engineering and Electronics》 SCIE EI CSCD 2019年第2期238-244,共7页
How to recognize targets with similar appearances from remote sensing images(RSIs) effectively and efficiently has become a big challenge. Recently, convolutional neural network(CNN) is preferred in the target classif... How to recognize targets with similar appearances from remote sensing images(RSIs) effectively and efficiently has become a big challenge. Recently, convolutional neural network(CNN) is preferred in the target classification due to the powerful feature representation ability and better performance. However,the training and testing of CNN mainly rely on single machine.Single machine has its natural limitation and bottleneck in processing RSIs due to limited hardware resources and huge time consuming. Besides, overfitting is a challenge for the CNN model due to the unbalance between RSIs data and the model structure.When a model is complex or the training data is relatively small,overfitting occurs and leads to a poor predictive performance. To address these problems, a distributed CNN architecture for RSIs target classification is proposed, which dramatically increases the training speed of CNN and system scalability. It improves the storage ability and processing efficiency of RSIs. Furthermore,Bayesian regularization approach is utilized in order to initialize the weights of the CNN extractor, which increases the robustness and flexibility of the CNN model. It helps prevent the overfitting and avoid the local optima caused by limited RSI training images or the inappropriate CNN structure. In addition, considering the efficiency of the Na¨?ve Bayes classifier, a distributed Na¨?ve Bayes classifier is designed to reduce the training cost. Compared with other algorithms, the proposed system and method perform the best and increase the recognition accuracy. The results show that the distributed system framework and the proposed algorithms are suitable for RSIs target classification tasks. 展开更多
关键词 convolutional NEURAL network (CNN) DISTRIBUTED architecture REMOTE SENSING images (RSIs) TARGET classification pre-training
在线阅读 下载PDF
Knowledge Enhanced Pre-Training Model for Vision-Language-Navigation Task 被引量:1
6
作者 HUANG Jitao ZENG Guohui +3 位作者 HUANG Bo GAO Yongbin LIU Jin SHI Zhicai 《Wuhan University Journal of Natural Sciences》 CAS CSCD 2021年第2期147-155,共9页
Vision-Language-Navigation(VLN) task is a cross-modality task that combines natural language processing and computer vision. This task requires the agent to automatically move to the destination according to the natur... Vision-Language-Navigation(VLN) task is a cross-modality task that combines natural language processing and computer vision. This task requires the agent to automatically move to the destination according to the natural language instruction and the observed surrounding visual information. To make the best decision, in every step during the navigation, the agent should pay more attention to understanding the objects, the object attributes, and the object relationships. But most current methods process all received textual and visual information equally. Therefore, this paper integrates more detailed semantic connections between visual and textual information through three pre-training tasks(object prediction, object attributes prediction, and object relationship prediction). The model will learn better fusion representation and alignment between these two types of information to improve the success rate(SR) and generalization. The experiments show that compared with the former baseline models, the SR on the unseen validation set(Val Unseen) increased by 7%, and the SR weighted by path length(SPL) increased by 7%;the SR on the test set(Test) increased 4%, SPL increased by 3%. 展开更多
关键词 pre-training cross-modality deep learning scene graph
原文传递
Pre-training Assessment Through the Web
7
作者 Kenneth Wong Reggie Kwan Jimmy SF Chan 《厦门大学学报(自然科学版)》 CAS CSCD 北大核心 2002年第S1期297-,共1页
Web-based training is growing quickly in popularit y for professionals in industrial organizations and large enterprises. The savings in cost and time are significant. The instructor-led trainings are bounded by time ... Web-based training is growing quickly in popularit y for professionals in industrial organizations and large enterprises. The savings in cost and time are significant. The instructor-led trainings are bounded by time and place, not to mention the cost involved in traveling, accommodation and training venue. However, in the most online training courses, all trainees are given same training materials and teaching paradigms. The problem of differentia ting the trainees’ abilities is the main concern. We need a pre-training test t o identify and classify of the weaknesses and strengths of differentiate trainee s so as to devise an appropriate training programs for the trainees. Adaptation of a Web-based Computer adaptive Test (CAT) for the pre-training test make the web-based training more efficient. The advantages of CAT are self-pacing, eff iciency, time and cost saving, immediate scoring and feedback, accuracy and secu rity, etc (Rudner, 1998; UMN, 1999; Novell, 2000; Linacre, 2000; Windowsglore, 2 000). Moreover, Web-based CAT also gives greater flexibility and convenience. T his paper describes how this CAT tool is built, how it helps instructor identify the strengths and weaknesses of trainees, and how to assure quality on the CAT system. 展开更多
关键词 CAT TEST pre-training Assessment Through the Web
在线阅读 下载PDF
A Modified CycleGAN for Multi-Organ Ultrasound Image Enhancement via Unpaired Pre-Training
8
作者 Haonan Han Bingyu Yang +2 位作者 Weihang Zhang Dongwei Li Huiqi Li 《Journal of Beijing Institute of Technology》 EI CAS 2024年第3期194-203,共10页
Handheld ultrasound devices are known for their portability and affordability,making them widely utilized in underdeveloped areas and community healthcare for rapid diagnosis and early screening.However,the image qual... Handheld ultrasound devices are known for their portability and affordability,making them widely utilized in underdeveloped areas and community healthcare for rapid diagnosis and early screening.However,the image quality of handheld ultrasound devices is not always satisfactory due to the limited equipment size,which hinders accurate diagnoses by doctors.At the same time,paired ultrasound images are difficult to obtain from the clinic because imaging process is complicated.Therefore,we propose a modified cycle generative adversarial network(cycleGAN) for ultrasound image enhancement from multiple organs via unpaired pre-training.We introduce an ultrasound image pre-training method that does not require paired images,alleviating the requirement for large-scale paired datasets.We also propose an enhanced block with different structures in the pre-training and fine-tuning phases,which can help achieve the goals of different training phases.To improve the robustness of the model,we add Gaussian noise to the training images as data augmentation.Our approach is effective in obtaining the best quantitative evaluation results using a small number of parameters and less training costs to improve the quality of handheld ultrasound devices. 展开更多
关键词 ultrasound image enhancement handheld devices unpaired images pre-train and finetune cycleGAN
在线阅读 下载PDF
GeoNER:Geological Named Entity Recognition with Enriched Domain Pre-Training Model and Adversarial Training
9
作者 MA Kai HU Xinxin +4 位作者 TIAN Miao TAN Yongjian ZHENG Shuai TAO Liufeng QIU Qinjun 《Acta Geologica Sinica(English Edition)》 SCIE CAS CSCD 2024年第5期1404-1417,共14页
As important geological data,a geological report contains rich expert and geological knowledge,but the challenge facing current research into geological knowledge extraction and mining is how to render accurate unders... As important geological data,a geological report contains rich expert and geological knowledge,but the challenge facing current research into geological knowledge extraction and mining is how to render accurate understanding of geological reports guided by domain knowledge.While generic named entity recognition models/tools can be utilized for the processing of geoscience reports/documents,their effectiveness is hampered by a dearth of domain-specific knowledge,which in turn leads to a pronounced decline in recognition accuracy.This study summarizes six types of typical geological entities,with reference to the ontological system of geological domains and builds a high quality corpus for the task of geological named entity recognition(GNER).In addition,Geo Wo BERT-adv BGP(Geological Word-base BERTadversarial training Bi-directional Long Short-Term Memory Global Pointer)is proposed to address the issues of ambiguity,diversity and nested entities for the geological entities.The model first uses the fine-tuned word granularitybased pre-training model Geo Wo BERT(Geological Word-base BERT)and combines the text features that are extracted using the Bi LSTM(Bi-directional Long Short-Term Memory),followed by an adversarial training algorithm to improve the robustness of the model and enhance its resistance to interference,the decoding finally being performed using a global association pointer algorithm.The experimental results show that the proposed model for the constructed dataset achieves high performance and is capable of mining the rich geological information. 展开更多
关键词 geological named entity recognition geological report adversarial training confrontation training global pointer pre-training model
在线阅读 下载PDF
SFPBL:Soft Filter Pruning Based on Logistic Growth Differential Equation for Neural Network 被引量:1
10
作者 Can Hu Shanqing Zhang +2 位作者 Kewei Tao Gaoming Yang Li Li 《Computers, Materials & Continua》 2025年第3期4913-4930,共18页
The surge of large-scale models in recent years has led to breakthroughs in numerous fields,but it has also introduced higher computational costs and more complex network architectures.These increasingly large and int... The surge of large-scale models in recent years has led to breakthroughs in numerous fields,but it has also introduced higher computational costs and more complex network architectures.These increasingly large and intricate networks pose challenges for deployment and execution while also exacerbating the issue of network over-parameterization.To address this issue,various network compression techniques have been developed,such as network pruning.A typical pruning algorithm follows a three-step pipeline involving training,pruning,and retraining.Existing methods often directly set the pruned filters to zero during retraining,significantly reducing the parameter space.However,this direct pruning strategy frequently results in irreversible information loss.In the early stages of training,a network still contains much uncertainty,and evaluating filter importance may not be sufficiently rigorous.To manage the pruning process effectively,this paper proposes a flexible neural network pruning algorithm based on the logistic growth differential equation,considering the characteristics of network training.Unlike other pruning algorithms that directly reduce filter weights,this algorithm introduces a three-stage adaptive weight decay strategy inspired by the logistic growth differential equation.It employs a gentle decay rate in the initial training stage,a rapid decay rate during the intermediate stage,and a slower decay rate in the network convergence stage.Additionally,the decay rate is adjusted adaptively based on the filter weights at each stage.By controlling the adaptive decay rate at each stage,the pruning of neural network filters can be effectively managed.In experiments conducted on the CIFAR-10 and ILSVRC-2012 datasets,the pruning of neural networks significantly reduces the floating-point operations while maintaining the same pruning rate.Specifically,when implementing a 30%pruning rate on the ResNet-110 network,the pruned neural network not only decreases floating-point operations by 40.8%but also enhances the classification accuracy by 0.49%compared to the original network. 展开更多
关键词 Filter pruning channel pruning CNN complexity deep neural networks filtering theory logistic model
在线阅读 下载PDF
KitWaSor:Pioneering pre-trained model for kitchen waste sorting with an innovative million-level benchmark dataset
11
作者 Leyuan Fang Shuaiyu Ding +3 位作者 Hao Feng Junwu Yu Lin Tang Pedram Ghamisi 《CAAI Transactions on Intelligence Technology》 2025年第1期94-114,共21页
Intelligent sorting is an important prerequisite for the full quantitative consumption and harmless disposal of kitchen waste.The existing object detection method based on an ImageNet pre-trained model is an effective... Intelligent sorting is an important prerequisite for the full quantitative consumption and harmless disposal of kitchen waste.The existing object detection method based on an ImageNet pre-trained model is an effective way of sorting.Owing to significant domain gaps between natural images and kitchen waste images,it is difficult to reflect the characteristics of diverse scales and dense distribution in kitchen waste based on an ImageNet pre-trained model,leading to poor generalisation.In this article,the authors propose the first pre-trained model for kitchen waste sorting called KitWaSor,which combines both contrastive learning(CL)and masked image modelling(MIM)through self-supervised learning(SSL).First,to address the issue of diverse scales,the authors propose a mixed masking strategy by introducing an incomplete masking branch based on the original random masking branch.It prevents the complete loss of small-scale objects while avoiding excessive leakage of large-scale object pixels.Second,to address the issue of dense distribution,the authors introduce semantic consistency constraints on the basis of the mixed masking strategy.That is,object semantic reasoning is performed through semantic consistency constraints to compensate for the lack of contextual information.To train KitWaSor,the authors construct the first million-level kitchen waste dataset across seasonal and regional distributions,named KWD-Million.Extensive experiments show that KitWaSor achieves state-of-the-art(SOTA)performance on the two most relevant downstream tasks for kitchen waste sorting(i.e.image classification and object detection),demonstrating the effectiveness of the proposed KitWaSor. 展开更多
关键词 contrastive learning kitchen waste masked image modeling pre-trained model self-supervised learning
在线阅读 下载PDF
DPCIPI: A pre-trained deep learning model for predicting cross-immunity between drifted strains of Influenza A/H3N2
12
作者 Yiming Du Zhuotian Li +8 位作者 Qian He Thomas Wetere Tulu Kei Hang Katie Chan Lin Wang Sen Pei Zhanwei Du Zhen Wang Xiao-Ke Xu Xiao Fan Liu 《Journal of Automation and Intelligence》 2025年第2期115-124,共10页
Predicting cross-immunity between viral strains is vital for public health surveillance and vaccine development.Traditional neural network methods,such as BiLSTM,could be ineffective due to the lack of lab data for mo... Predicting cross-immunity between viral strains is vital for public health surveillance and vaccine development.Traditional neural network methods,such as BiLSTM,could be ineffective due to the lack of lab data for model training and the overshadowing of crucial features within sequence concatenation.The current work proposes a less data-consuming model incorporating a pre-trained gene sequence model and a mutual information inference operator.Our methodology utilizes gene alignment and deduplication algorithms to preprocess gene sequences,enhancing the model’s capacity to discern and focus on distinctions among input gene pairs.The model,i.e.,DNA Pretrained Cross-Immunity Protection Inference model(DPCIPI),outperforms state-of-theart(SOTA)models in predicting hemagglutination inhibition titer from influenza viral gene sequences only.Improvement in binary cross-immunity prediction is 1.58%in F1,2.34%in precision,1.57%in recall,and 1.57%in Accuracy.For multilevel cross-immunity improvements,the improvement is 2.12%in F1,3.50%in precision,2.19%in recall,and 2.19%in Accuracy.Our study showcases the potential of pre-trained gene models to improve predictions of antigenic variation and cross-immunity.With expanding gene data and advancements in pre-trained models,this approach promises significant impacts on vaccine development and public health. 展开更多
关键词 Cross-immunity prediction pre-trained model Deep learning Influenza strains Hemagglutination inhibition
在线阅读 下载PDF
Big Texture Dataset Synthesized Based on Gradient and Convolution Kernels Using Pre-Trained Deep Neural Networks
13
作者 Farhan A.Alenizi Faten Khalid Karim +1 位作者 Alaa R.Al-Shamasneh Mohammad Hossein Shakoor 《Computer Modeling in Engineering & Sciences》 2025年第8期1793-1829,共37页
Deep neural networks provide accurate results for most applications.However,they need a big dataset to train properly.Providing a big dataset is a significant challenge in most applications.Image augmentation refers t... Deep neural networks provide accurate results for most applications.However,they need a big dataset to train properly.Providing a big dataset is a significant challenge in most applications.Image augmentation refers to techniques that increase the amount of image data.Common operations for image augmentation include changes in illumination,rotation,contrast,size,viewing angle,and others.Recently,Generative Adversarial Networks(GANs)have been employed for image generation.However,like image augmentation methods,GAN approaches can only generate images that are similar to the original images.Therefore,they also cannot generate new classes of data.Texture images presentmore challenges than general images,and generating textures is more complex than creating other types of images.This study proposes a gradient-based deep neural network method that generates a new class of texture.It is possible to rapidly generate new classes of textures using different kernels from pre-trained deep networks.After generating new textures for each class,the number of textures increases through image augmentation.During this process,several techniques are proposed to automatically remove incomplete and similar textures that are created.The proposed method is faster than some well-known generative networks by around 4 to 10 times.In addition,the quality of the generated textures surpasses that of these networks.The proposed method can generate textures that surpass those of someGANs and parametric models in certain image qualitymetrics.It can provide a big texture dataset to train deep networks.A new big texture dataset is created artificially using the proposed method.This dataset is approximately 2 GB in size and comprises 30,000 textures,each 150×150 pixels in size,organized into 600 classes.It is uploaded to the Kaggle site and Google Drive.This dataset is called BigTex.Compared to other texture datasets,the proposed dataset is the largest and can serve as a comprehensive texture dataset for training more powerful deep neural networks and mitigating overfitting. 展开更多
关键词 Big texture dataset data generation pre-trained deep neural network
在线阅读 下载PDF
Hierarchical Shape Pruning for 3D Sparse Convolution Networks
14
作者 Haiyan Long Chonghao Zhang +2 位作者 Xudong Qiu Hai Chen Gang Chen 《Computers, Materials & Continua》 2025年第8期2975-2988,共14页
3D sparse convolution has emerged as a pivotal technique for efficient voxel-based perception in autonomous systems,enabling selective feature extraction from non-empty voxels while suppressing computational waste.Des... 3D sparse convolution has emerged as a pivotal technique for efficient voxel-based perception in autonomous systems,enabling selective feature extraction from non-empty voxels while suppressing computational waste.Despite its theoretical efficiency advantages,practical implementations face under-explored limitations:the fixed geometric patterns of conventional sparse convolutional kernels inevitably process non-contributory positions during sliding-window operations,particularly in regions with uneven point cloud density.To address this,we propose Hierarchical Shape Pruning for 3D Sparse Convolution(HSP-S),which dynamically eliminates redundant kernel stripes through layer-adaptive thresholding.Unlike static soft pruning methods,HSP-S maintains trainable sparsity patterns by progressively adjusting pruning thresholds during optimization,enlarging original parameter search space while removing redundant operations.Extensive experiments validate effectiveness of HSP-S acrossmajor autonomous driving benchmarks.On KITTI’s 3D object detection task,our method reduces 93.47%redundant kernel computations whilemaintaining comparable accuracy(1.56%mAP drop).Remarkably,on themore complexNuScenes benchmark,HSP-S achieves simultaneous computation reduction(21.94%sparsity)and accuracy gains(1.02%mAP(mean Average Precision)and 0.47%NDS(nuScenes detection score)improvement),demonstrating its scalability to diverse perception scenarios.This work establishes the first learnable shape pruning framework that simultaneously enhances computational efficiency and preserves detection accuracy in 3D perception systems. 展开更多
关键词 Shape pruning model compressing 3D sparse convolution
在线阅读 下载PDF
Computation graph pruning based on critical path retention in evolvable networks
15
作者 XIE Xiaoyan YANG Tianjiao +4 位作者 ZHU Yun LUO Xing JIN Luochen YU Jinhao REN Xun 《High Technology Letters》 2025年第3期266-272,共7页
The dynamic routing mechanism in evolvable networks enables adaptive reconfiguration of topol-ogical structures and transmission pathways based on real-time task requirements and data character-istics.However,the heig... The dynamic routing mechanism in evolvable networks enables adaptive reconfiguration of topol-ogical structures and transmission pathways based on real-time task requirements and data character-istics.However,the heightened architectural complexity and expanded parameter dimensionality in evolvable networks present significant implementation challenges when deployed in resource-con-strained environments.Due to the critical paths ignored,traditional pruning strategies cannot get a desired trade-off between accuracy and efficiency.For this reason,a critical path retention pruning(CPRP)method is proposed.By deeply traversing the computational graph,the dependency rela-tionship among nodes is derived.Then the nodes are grouped and sorted according to their contribu-tion value.The redundant operations are removed as much as possible while ensuring that the criti-cal path is not affected.As a result,computational efficiency is improved while a higher accuracy is maintained.On the CIFAR benchmark,the experimental results demonstrate that CPRP-induced pruning incurs accuracy degradation below 4.00%,while outperforming traditional feature-agnostic grouping methods by an average 8.98%accuracy improvement.Simultaneously,the pruned model attains a 2.41 times inference acceleration while achieving 48.92%parameter compression and 53.40%floating-point operations(FLOPs)reduction. 展开更多
关键词 evolvable network computation graph traversing dynamic routing critical path retention pruning
在线阅读 下载PDF
Optimizing BERT for Bengali Emotion Classification: Evaluating Knowledge Distillation, Pruning, and Quantization
16
作者 Md Hasibur Rahman Mohammed Arif Uddin +1 位作者 Zinnat Fowzia Ria Rashedur M.Rahman 《Computer Modeling in Engineering & Sciences》 2025年第2期1637-1666,共30页
The rapid growth of digital data necessitates advanced natural language processing(NLP)models like BERT(Bidi-rectional Encoder Representations from Transformers),known for its superior performance in text classificati... The rapid growth of digital data necessitates advanced natural language processing(NLP)models like BERT(Bidi-rectional Encoder Representations from Transformers),known for its superior performance in text classification.However,BERT’s size and computational demands limit its practicality,especially in resource-constrained settings.This research compresses the BERT base model for Bengali emotion classification through knowledge distillation(KD),pruning,and quantization techniques.Despite Bengali being the sixth most spoken language globally,NLP research in this area is limited.Our approach addresses this gap by creating an efficient BERT-based model for Bengali text.We have explored 20 combinations for KD,quantization,and pruning,resulting in improved speedup,fewer parameters,and reduced memory size.Our best results demonstrate significant improvements in both speed and efficiency.For instance,in the case of mBERT,we achieved a 3.87×speedup and 4×compression ratio with a combination of Distil+Prune+Quant that reduced parameters from 178 to 46 M,while the memory size decreased from 711 to 178 MB.These results offer scalable solutions for NLP tasks in various languages and advance the field of model compression,making these models suitable for real-world applications in resource-limited environments. 展开更多
关键词 Bengali NLP black-box distillation emotion classification model compression post-training quantization unstructured pruning
在线阅读 下载PDF
CLAD:Criterion learner and attention distillation for automated CNN pruning
17
作者 Zheng Li Jiaxin Li +2 位作者 Shaojie Liu Bo Zhao Derong Liu 《Journal of Automation and Intelligence》 2025年第4期254-265,共12页
Filter pruning effectively compresses the neural network by reducing both its parameters and computational cost.Existing pruning methods typically rely on pre-designed pruning criteria to measure filter importance and... Filter pruning effectively compresses the neural network by reducing both its parameters and computational cost.Existing pruning methods typically rely on pre-designed pruning criteria to measure filter importance and remove those deemed unimportant.However,different layers of the neural network exhibit varying filter distributions,making it inappropriate to implement the same pruning criterion for all layers.Additionally,some approaches apply different criteria from the set of pre-defined pruning rules for different layers,but the limited space leads to the difficulty of covering all layers.If criteria for all layers are manually designed,it is costly and difficult to generalize to other networks.To solve this problem,we present a novel neural network pruning method based on the Criterion Learner and Attention Distillation(CLAD).Specifically,CLAD develops a differentiable criterion learner,which is integrated into each layer of the network.The learner can automatically learn the appropriate pruning criterion according to the filter parameters of each layer,thus the requirement of manual design is eliminated.Furthermore,the criterion learner is trained end-to-end by the gradient optimization algorithm to achieve efficient pruning.In addition,attention distillation,which fully utilizes the knowledge of unpruned networks to guide the optimization of the learner and improve the pruned network performance,is introduced in the process of learner optimization.Experiments conducted on various datasets and networks demonstrate the effectiveness of the proposed method.Notably,CLAD reduces the FLOPs of Res Net-110 by about 53%on the CIFAR-10 dataset,while simultaneously improves the network's accuracy by 0.05%.Moreover,it reduces the FLOPs of Res Net-50 by about 46%on the Image Net-1K dataset,and maintains a top-1 accuracy of 75.45%. 展开更多
关键词 Neural network pruning Model compression Knowledge distillation Feature attention Polar regularization
在线阅读 下载PDF
Greedy Pruning Algorithm for DETR Architecture Networks Based on Global Optimization
18
作者 HUANG Qiubo XU Jingsai +2 位作者 ZHANG Yakui WANG Mei CHEN Dehua 《Journal of Donghua University(English Edition)》 2025年第1期96-105,共10页
End-to-end object detection Transformer(DETR)successfully established the paradigm of the Transformer architecture in the field of object detection.Its end-to-end detection process and the idea of set prediction have ... End-to-end object detection Transformer(DETR)successfully established the paradigm of the Transformer architecture in the field of object detection.Its end-to-end detection process and the idea of set prediction have become one of the hottest network architectures in recent years.There has been an abundance of work improving upon DETR.However,DETR and its variants require a substantial amount of memory resources and computational costs,and the vast number of parameters in these networks is unfavorable for model deployment.To address this issue,a greedy pruning(GP)algorithm is proposed,applied to a variant denoising-DETR(DN-DETR),which can eliminate redundant parameters in the Transformer architecture of DN-DETR.Considering the different roles of the multi-head attention(MHA)module and the feed-forward network(FFN)module in the Transformer architecture,a modular greedy pruning(MGP)algorithm is proposed.This algorithm separates the two modules and applies their respective optimal strategies and parameters.The effectiveness of the proposed algorithm is validated on the COCO 2017 dataset.The model obtained through the MGP algorithm reduces the parameters by 49%and the number of floating point operations(FLOPs)by 44%compared to the Transformer architecture of DN-DETR.At the same time,the mean average precision(mAP)of the model increases from 44.1%to 45.3%. 展开更多
关键词 model pruning object detection Transformer(DETR) Transformer architecture object detection
在线阅读 下载PDF
Multilingual Text Summarization in Healthcare Using Pre-Trained Transformer-Based Language Models
19
作者 Josua Käser Thomas Nagy +1 位作者 Patrick Stirnemann Thomas Hanne 《Computers, Materials & Continua》 2025年第4期201-217,共17页
We analyze the suitability of existing pre-trained transformer-based language models(PLMs)for abstractive text summarization on German technical healthcare texts.The study focuses on the multilingual capabilities of t... We analyze the suitability of existing pre-trained transformer-based language models(PLMs)for abstractive text summarization on German technical healthcare texts.The study focuses on the multilingual capabilities of these models and their ability to perform the task of abstractive text summarization in the healthcare field.The research hypothesis was that large language models could perform high-quality abstractive text summarization on German technical healthcare texts,even if the model is not specifically trained in that language.Through experiments,the research questions explore the performance of transformer language models in dealing with complex syntax constructs,the difference in performance between models trained in English and German,and the impact of translating the source text to English before conducting the summarization.We conducted an evaluation of four PLMs(GPT-3,a translation-based approach also utilizing GPT-3,a German language Model,and a domain-specific bio-medical model approach).The evaluation considered the informativeness using 3 types of metrics based on Recall-Oriented Understudy for Gisting Evaluation(ROUGE)and the quality of results which is manually evaluated considering 5 aspects.The results show that text summarization models could be used in the German healthcare domain and that domain-independent language models achieved the best results.The study proves that text summarization models can simplify the search for pre-existing German knowledge in various domains. 展开更多
关键词 Text summarization pre-trained transformer-based language models large language models technical healthcare texts natural language processing
在线阅读 下载PDF
A Novel Reduced Error Pruning Tree Forest with Time-Based Missing Data Imputation(REPTF-TMDI)for Traffic Flow Prediction
20
作者 Yunus Dogan Goksu Tuysuzoglu +4 位作者 Elife Ozturk Kiyak Bita Ghasemkhani Kokten Ulas Birant Semih Utku Derya Birant 《Computer Modeling in Engineering & Sciences》 2025年第8期1677-1715,共39页
Accurate traffic flow prediction(TFP)is vital for efficient and sustainable transportation management and the development of intelligent traffic systems.However,missing data in real-world traffic datasets poses a sign... Accurate traffic flow prediction(TFP)is vital for efficient and sustainable transportation management and the development of intelligent traffic systems.However,missing data in real-world traffic datasets poses a significant challenge to maintaining prediction precision.This study introduces REPTF-TMDI,a novel method that combines a Reduced Error Pruning Tree Forest(REPTree Forest)with a newly proposed Time-based Missing Data Imputation(TMDI)approach.The REP Tree Forest,an ensemble learning approach,is tailored for time-related traffic data to enhance predictive accuracy and support the evolution of sustainable urbanmobility solutions.Meanwhile,the TMDI approach exploits temporal patterns to estimate missing values reliably whenever empty fields are encountered.The proposed method was evaluated using hourly traffic flow data from a major U.S.roadway spanning 2012-2018,incorporating temporal features(e.g.,hour,day,month,year,weekday),holiday indicator,and weather conditions(temperature,rain,snow,and cloud coverage).Experimental results demonstrated that the REPTF-TMDI method outperformed conventional imputation techniques across various missing data ratios by achieving an average 11.76%improvement in terms of correlation coefficient(R).Furthermore,REPTree Forest achieved improvements of 68.62%in RMSE and 70.52%in MAE compared to existing state-of-the-art models.These findings highlight the method’s ability to significantly boost traffic flow prediction accuracy,even in the presence of missing data,thereby contributing to the broader objectives of sustainable urban transportation systems. 展开更多
关键词 Machine learning traffic flow prediction missing data imputation reduced error pruning tree(REPTree) sustainable transportation systems traffic management artificial intelligence
在线阅读 下载PDF
上一页 1 2 196 下一页 到第
使用帮助 返回顶部