Funding: Supported by the National Natural Science Foundation of China (NSFC) (Grant Nos. 62306342, 62176076); the Excellent Young Scientists Fund in Hunan Province (2024JJ4070); the Natural Science Foundation of Guangdong (2023A1515012922); Shenzhen Foundational Research Funding (JCYJ20220818102415032); the Major Key Project of PCL (PCL2023A09); and the Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies (2022B1212010005k).
Abstract: Consistency identification in task-oriented dialogue (CI-ToD) can prevent inconsistent dialogue response generation, and it has recently emerged as an important and growing research area. This paper takes the first step toward exploring a pre-training paradigm for CI-ToD. Nevertheless, pre-training for CI-ToD is non-trivial because it requires a large amount of multi-turn KB-grounded dialogues, which are extremely hard to collect. To alleviate the data scarcity problem for pre-training, we introduce a modularized pre-training framework (MPFToD) that is capable of utilizing large amounts of KB-free dialogues. Specifically, such modularization allows us to decouple CI-ToD into three sub-modules and to propose three pre-training tasks: (i) query-response matching pre-training; (ii) dialogue history consistency identification pre-training; and (iii) KB masked language modeling, which enhance different abilities of the CI-ToD model. As the different sub-tasks are solved separately, MPFToD can learn each module from large amounts of KB-free dialogues, which are much easier to obtain. Results on the CI-ToD benchmark show that MPFToD pushes the state-of-the-art performance from 56.3% to 61.0%. Furthermore, we show its transferability with promising performance on other downstream tasks (i.e., dialog act recognition, sentiment classification, and table fact checking).
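To make the modular idea concrete, the sketch below wires three toy pre-training heads onto one shared encoder; it is a minimal illustration under assumed shapes and random stand-in batches, not the MPFToD implementation, and the head names and hyperparameters are hypothetical.

```python
# Minimal sketch: one shared encoder with three pre-training heads.
# Shapes, names, and the toy random batches are illustrative assumptions.
import torch
import torch.nn as nn

VOCAB, DIM, MAXLEN, MASK_ID = 1000, 128, 32, 999

class SharedEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, DIM)
        layer = nn.TransformerEncoderLayer(d_model=DIM, nhead=4, batch_first=True)
        self.enc = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, ids):                      # ids: (batch, seq)
        return self.enc(self.emb(ids))           # (batch, seq, DIM)

encoder = SharedEncoder()
qrm_head = nn.Linear(DIM, 2)      # (i) query-response matching: match / mismatch
his_head = nn.Linear(DIM, 2)      # (ii) dialogue-history consistency: consistent / inconsistent
mlm_head = nn.Linear(DIM, VOCAB)  # (iii) KB masked language modeling
ce = nn.CrossEntropyLoss()
params = (list(encoder.parameters()) + list(qrm_head.parameters())
          + list(his_head.parameters()) + list(mlm_head.parameters()))
opt = torch.optim.AdamW(params, lr=1e-4)

# One toy step on a random "KB-free dialogue" batch (a stand-in for real data).
ids = torch.randint(0, VOCAB - 1, (8, MAXLEN))
cls = encoder(ids)[:, 0]                                         # pooled representation
loss_qrm = ce(qrm_head(cls), torch.randint(0, 2, (8,)))
loss_his = ce(his_head(cls), torch.randint(0, 2, (8,)))

mask = torch.rand(ids.shape) < 0.15                              # mask 15% of tokens
masked = ids.clone(); masked[mask] = MASK_ID
loss_mlm = ce(mlm_head(encoder(masked))[mask], ids[mask])        # score only masked positions

loss = loss_qrm + loss_his + loss_mlm
loss.backward(); opt.step(); opt.zero_grad()
print(float(loss))
```

Because each head only needs its own supervision signal, the three objectives can in principle be trained on separate KB-free corpora, which is the property such a modularized framework exploits.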
Funding: Supported by the National Natural Science Foundation of China, Nos. 31760290 and 82160688, and the Key Development Areas Project of Ganzhou Science and Technology, No. 2022B-SF9554 (all to XL).
Abstract: Synaptic pruning is a crucial process in synaptic refinement, eliminating unstable synaptic connections in neural circuits. This process is triggered and regulated primarily by spontaneous neural activity and experience-dependent mechanisms. The pruning process involves multiple molecular signals and a series of regulatory activities governing the "eat me" and "don't eat me" states. Under physiological conditions, the interaction between glial cells and neurons results in the clearance of unnecessary synapses, maintaining normal neural circuit functionality via synaptic pruning. Alterations in genetic and environmental factors can lead to imbalanced synaptic pruning, thus promoting the occurrence and development of autism spectrum disorder, schizophrenia, Alzheimer's disease, and other neurological disorders. In this review, we examine the molecular mechanisms responsible for synaptic pruning during neural development and focus on how synaptic pruning can regulate neural circuits and its association with neurological disorders. Furthermore, we discuss the application of emerging optical and imaging technologies to observe synaptic structure and function, as well as their potential for clinical translation. Our aim is to enhance the understanding of synaptic pruning during neural development, including the molecular basis underlying the regulation of synaptic function and the dynamic changes in synaptic density, and to investigate the potential role of these mechanisms in the pathophysiology of neurological diseases, thus providing a theoretical foundation for the treatment of neurological disorders.
Funding: Supported by the National Natural Science Foundation of China (U1435220).
Abstract: How to recognize targets with similar appearances from remote sensing images (RSIs) effectively and efficiently has become a big challenge. Recently, the convolutional neural network (CNN) has been preferred for target classification due to its powerful feature representation ability and better performance. However, the training and testing of a CNN mainly rely on a single machine, which has natural limitations and bottlenecks in processing RSIs due to limited hardware resources and huge time consumption. Besides, overfitting is a challenge for the CNN model due to the imbalance between the RSI data and the model structure. When a model is complex or the training data is relatively small, overfitting occurs and leads to poor predictive performance. To address these problems, a distributed CNN architecture for RSI target classification is proposed, which dramatically increases the training speed of the CNN and the system scalability, and improves the storage ability and processing efficiency for RSIs. Furthermore, a Bayesian regularization approach is utilized to initialize the weights of the CNN extractor, which increases the robustness and flexibility of the CNN model. It helps prevent overfitting and avoid the local optima caused by limited RSI training images or an inappropriate CNN structure. In addition, considering the efficiency of the Naïve Bayes classifier, a distributed Naïve Bayes classifier is designed to reduce the training cost. Compared with other algorithms, the proposed system and method perform the best and increase the recognition accuracy. The results show that the distributed system framework and the proposed algorithms are suitable for RSI target classification tasks.
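The final classification stage can be pictured as a Gaussian Naive Bayes model fitted on feature vectors produced by a CNN extractor. The sketch below, using scikit-learn on synthetic features, only illustrates that single-machine step; the distributed training, the Bayesian weight initialization, and the actual RSI features are outside its scope, and the array shapes are assumptions.

```python
# Sketch: Gaussian Naive Bayes over CNN feature vectors (synthetic stand-ins).
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n_classes, n_per_class, feat_dim = 5, 200, 64

# Pretend these are pooled CNN features for 5 target classes.
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(n_per_class, feat_dim))
               for c in range(n_classes)])
y = np.repeat(np.arange(n_classes), n_per_class)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = GaussianNB().fit(X_tr, y_tr)        # cheap to train and easy to parallelize per class
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```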
Funding: Supported by the National Natural Science Foundation of China (62006150), the Songjiang District Science and Technology Research Project (19SJKJGG83), and the Shanghai Young Science and Technology Talents Sailing Program (19YF1418400).
Abstract: The Vision-Language-Navigation (VLN) task is a cross-modality task that combines natural language processing and computer vision. This task requires the agent to automatically move to the destination according to the natural language instruction and the observed surrounding visual information. To make the best decision at every step during navigation, the agent should pay more attention to understanding the objects, the object attributes, and the object relationships. However, most current methods process all received textual and visual information equally. Therefore, this paper integrates more detailed semantic connections between visual and textual information through three pre-training tasks (object prediction, object attribute prediction, and object relationship prediction). The model learns better fusion representations and alignment between these two types of information to improve the success rate (SR) and generalization. The experiments show that, compared with the former baseline models, the SR on the unseen validation set (Val Unseen) increased by 7%, and the SR weighted by path length (SPL) increased by 7%; the SR on the test set (Test) increased by 4%, and the SPL increased by 3%.
Abstract: Web-based training is growing quickly in popularity for professionals in industrial organizations and large enterprises. The savings in cost and time are significant. Instructor-led trainings are bounded by time and place, not to mention the cost involved in traveling, accommodation, and the training venue. However, in most online training courses, all trainees are given the same training materials and teaching paradigms. The problem of differentiating the trainees' abilities is the main concern. We need a pre-training test to identify and classify the weaknesses and strengths of different trainees so as to devise appropriate training programs for them. Adapting a Web-based Computer Adaptive Test (CAT) for the pre-training test makes web-based training more efficient. The advantages of CAT are self-pacing, efficiency, time and cost savings, immediate scoring and feedback, accuracy, and security, among others (Rudner, 1998; UMN, 1999; Novell, 2000; Linacre, 2000; Windowsglore, 2000). Moreover, Web-based CAT also gives greater flexibility and convenience. This paper describes how this CAT tool is built, how it helps instructors identify the strengths and weaknesses of trainees, and how to assure quality on the CAT system.
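A computer adaptive test of this kind is commonly driven by an item response model: after each answer the ability estimate is updated, and the next item is the one that is most informative at that estimate. The sketch below uses a one-parameter (Rasch) model with a simple grid-based ability estimate; the item bank and the simulated trainee are invented for illustration, and the paper's actual scoring rules may differ.

```python
# Sketch of adaptive item selection under a Rasch (1-PL) model.
import numpy as np

rng = np.random.default_rng(1)
item_difficulty = rng.normal(0.0, 1.0, size=50)   # hypothetical item bank
true_ability = 0.8                                # simulated trainee
grid = np.linspace(-4, 4, 161)                    # ability grid for the likelihood

def p_correct(theta, b):
    return 1.0 / (1.0 + np.exp(-(theta - b)))

asked, responses = [], []
theta_hat = 0.0
for _ in range(10):                               # a 10-item adaptive session
    # Fisher information of a Rasch item is p*(1-p); pick the most informative unasked item.
    p = p_correct(theta_hat, item_difficulty)
    info = p * (1 - p)
    info[asked] = -np.inf
    j = int(np.argmax(info))
    asked.append(j)
    responses.append(int(rng.random() < p_correct(true_ability, item_difficulty[j])))
    # Grid maximum-likelihood update of the ability estimate.
    ll = np.zeros_like(grid)
    for jj, r in zip(asked, responses):
        pj = p_correct(grid, item_difficulty[jj])
        ll += r * np.log(pj) + (1 - r) * np.log(1 - pj)
    theta_hat = float(grid[np.argmax(ll)])

print("estimated ability:", theta_hat)
```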
Abstract: Handheld ultrasound devices are known for their portability and affordability, making them widely utilized in underdeveloped areas and community healthcare for rapid diagnosis and early screening. However, the image quality of handheld ultrasound devices is not always satisfactory due to the limited equipment size, which hinders accurate diagnoses by doctors. At the same time, paired ultrasound images are difficult to obtain from the clinic because the imaging process is complicated. Therefore, we propose a modified cycle generative adversarial network (cycleGAN) for ultrasound image enhancement from multiple organs via unpaired pre-training. We introduce an ultrasound image pre-training method that does not require paired images, alleviating the requirement for large-scale paired datasets. We also propose an enhanced block with different structures in the pre-training and fine-tuning phases, which helps achieve the goals of the different training phases. To improve the robustness of the model, we add Gaussian noise to the training images as data augmentation. Our approach is effective in obtaining the best quantitative evaluation results with a small number of parameters and lower training costs, improving the image quality of handheld ultrasound devices.
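The Gaussian-noise augmentation mentioned above is straightforward to reproduce. The snippet below shows one plausible version of such a transform for single-channel ultrasound frames; the noise level and clipping range are assumptions, not the paper's settings.

```python
# Sketch: additive Gaussian noise augmentation for grayscale ultrasound images.
import numpy as np

def add_gaussian_noise(image, sigma=0.02, rng=None):
    """image: float array in [0, 1]; sigma: assumed noise std (not the paper's value)."""
    rng = rng or np.random.default_rng()
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)

frame = np.random.default_rng(0).random((256, 256))   # stand-in for an ultrasound frame
augmented = add_gaussian_noise(frame, sigma=0.02)
print(augmented.shape, float(augmented.min()), float(augmented.max()))
```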
Funding: Financially supported by the Natural Science Foundation of China (Grant No. 42301492); the National Key R&D Program of China (Grant Nos. 2022YFF0711600, 2022YFF0801201, 2022YFF0801200); the Major Special Project of Xinjiang (Grant No. 2022A03009-3); the Open Fund of the Key Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Natural Resources (Grant No. KF-2022-07014); the Opening Fund of the Key Laboratory of the Geological Survey and Evaluation of the Ministry of Education (Grant No. GLAB 2023ZR01); and the Fundamental Research Funds for the Central Universities.
Abstract: As important geological data, a geological report contains rich expert and geological knowledge, but the challenge facing current research into geological knowledge extraction and mining is how to render an accurate understanding of geological reports guided by domain knowledge. While generic named entity recognition models/tools can be utilized for the processing of geoscience reports/documents, their effectiveness is hampered by a dearth of domain-specific knowledge, which in turn leads to a pronounced decline in recognition accuracy. This study summarizes six types of typical geological entities, with reference to the ontological system of geological domains, and builds a high-quality corpus for the task of geological named entity recognition (GNER). In addition, GeoWoBERT-advBGP (Geological Word-base BERT with adversarial training, Bi-directional Long Short-Term Memory, and Global Pointer) is proposed to address the issues of ambiguity, diversity, and nested entities for geological entities. The model first uses the fine-tuned word-granularity-based pre-training model GeoWoBERT (Geological Word-base BERT) and combines the text features extracted using BiLSTM (Bi-directional Long Short-Term Memory), followed by an adversarial training algorithm to improve the robustness of the model and enhance its resistance to interference, with the decoding finally being performed using a global association pointer algorithm. The experimental results show that the proposed model achieves high performance on the constructed dataset and is capable of mining rich geological information.
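One widely used way to realize the adversarial-training step described above is the fast gradient method (FGM) applied to the embedding layer: perturb the embeddings along the gradient direction, add the adversarial loss, then restore the weights. The sketch below shows that loop on a toy tagger; it is a generic FGM recipe under assumed shapes, not the exact GeoWoBERT-advBGP procedure.

```python
# Sketch: FGM-style adversarial training on the embedding layer of a toy sequence tagger.
import torch
import torch.nn as nn

VOCAB, DIM, TAGS = 500, 64, 7
emb = nn.Embedding(VOCAB, DIM)
lstm = nn.LSTM(DIM, DIM, batch_first=True, bidirectional=True)
head = nn.Linear(2 * DIM, TAGS)
params = list(emb.parameters()) + list(lstm.parameters()) + list(head.parameters())
opt = torch.optim.Adam(params, lr=1e-3)
ce = nn.CrossEntropyLoss()

def forward(ids):
    h, _ = lstm(emb(ids))
    return head(h)                                 # per-token tag logits

ids = torch.randint(0, VOCAB, (4, 20))
tags = torch.randint(0, TAGS, (4, 20))

loss = ce(forward(ids).reshape(-1, TAGS), tags.reshape(-1))
loss.backward()                                    # gradients for the clean batch

# FGM: perturb the embedding weights along the gradient direction (epsilon is assumed).
epsilon, backup = 1.0, emb.weight.data.clone()
grad = emb.weight.grad
norm = grad.norm()
if norm > 0:
    emb.weight.data.add_(epsilon * grad / norm)
adv_loss = ce(forward(ids).reshape(-1, TAGS), tags.reshape(-1))
adv_loss.backward()                                # accumulate adversarial gradients
emb.weight.data.copy_(backup)                      # restore the original embeddings

opt.step(); opt.zero_grad()
print(float(loss), float(adv_loss))
```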
Funding: Supported by the National Natural Science Foundation of China under Grant No. 62172132.
Abstract: The surge of large-scale models in recent years has led to breakthroughs in numerous fields, but it has also introduced higher computational costs and more complex network architectures. These increasingly large and intricate networks pose challenges for deployment and execution while also exacerbating the issue of network over-parameterization. To address this issue, various network compression techniques have been developed, such as network pruning. A typical pruning algorithm follows a three-step pipeline involving training, pruning, and retraining. Existing methods often directly set the pruned filters to zero during retraining, significantly reducing the parameter space. However, this direct pruning strategy frequently results in irreversible information loss. In the early stages of training, a network still contains much uncertainty, and evaluating filter importance may not be sufficiently rigorous. To manage the pruning process effectively, this paper proposes a flexible neural network pruning algorithm based on the logistic growth differential equation, considering the characteristics of network training. Unlike other pruning algorithms that directly reduce filter weights, this algorithm introduces a three-stage adaptive weight decay strategy inspired by the logistic growth differential equation. It employs a gentle decay rate in the initial training stage, a rapid decay rate during the intermediate stage, and a slower decay rate in the network convergence stage. Additionally, the decay rate is adjusted adaptively based on the filter weights at each stage. By controlling the adaptive decay rate at each stage, the pruning of neural network filters can be effectively managed. In experiments conducted on the CIFAR-10 and ILSVRC-2012 datasets, the proposed pruning significantly reduces floating-point operations while maintaining the same pruning rate. Specifically, when implementing a 30% pruning rate on the ResNet-110 network, the pruned neural network not only decreases floating-point operations by 40.8% but also enhances the classification accuracy by 0.49% compared to the original network.
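The three-stage behaviour described above (gentle, then rapid, then slow decay) matches the shape of a logistic curve, so the decay applied to a filter marked for pruning can be pictured as a multiplier driven by the logistic function of training progress. The following sketch is one plausible reading of that idea with made-up constants; the paper's precise decay law and its weight-dependent adaptation are not reproduced.

```python
# Sketch: a logistic-growth-shaped decay schedule for weights of filters marked for pruning.
import numpy as np

def logistic_decay_rate(epoch, total_epochs, k=10.0, max_rate=0.2):
    """Decay rate per epoch: slow early, fastest at mid-training, slow again near convergence.
    k and max_rate are illustrative constants, not values from the paper."""
    t = epoch / total_epochs                       # training progress in [0, 1]
    s = 1.0 / (1.0 + np.exp(-k * (t - 0.5)))       # logistic curve centred at mid-training
    # The derivative of the logistic curve peaks at t = 0.5, giving the three-stage profile.
    return max_rate * 4.0 * s * (1.0 - s)

total_epochs = 100
w = 1.0                                            # norm of one filter selected for pruning
history = []
for epoch in range(total_epochs):
    w *= (1.0 - logistic_decay_rate(epoch, total_epochs))
    history.append(w)

print("weight after early/mid/late stages:",
      round(history[19], 4), round(history[59], 4), round(history[99], 4))
```

Because the weights are decayed rather than zeroed outright, a mistakenly selected filter can still recover during the early, gentle stage, which is the flexibility the abstract emphasizes.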
Funding: Supported by the Bill & Melinda Gates Foundation and the Minderoo Foundation.
Abstract: Predicting cross-immunity between viral strains is vital for public health surveillance and vaccine development. Traditional neural network methods, such as BiLSTM, can be ineffective due to the lack of lab data for model training and the overshadowing of crucial features within sequence concatenation. The current work proposes a less data-consuming model incorporating a pre-trained gene sequence model and a mutual information inference operator. Our methodology utilizes gene alignment and deduplication algorithms to preprocess gene sequences, enhancing the model's capacity to discern and focus on distinctions among input gene pairs. The model, i.e., the DNA Pretrained Cross-Immunity Protection Inference model (DPCIPI), outperforms state-of-the-art (SOTA) models in predicting hemagglutination inhibition titer from influenza viral gene sequences only. The improvement in binary cross-immunity prediction is 1.58% in F1, 2.34% in precision, 1.57% in recall, and 1.57% in accuracy. For multilevel cross-immunity prediction, the improvement is 2.12% in F1, 3.50% in precision, 2.19% in recall, and 2.19% in accuracy. Our study showcases the potential of pre-trained gene models to improve predictions of antigenic variation and cross-immunity. With expanding gene data and advancements in pre-trained models, this approach promises significant impacts on vaccine development and public health.
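The deduplication part of such preprocessing can be illustrated in a few lines: already aligned sequences are hashed and only one representative per identical sequence (or per identical unordered pair) is kept. This toy sketch ignores the alignment step itself, which would normally be done with a dedicated tool, and the example strings are invented.

```python
# Sketch: deduplicating (already aligned) gene sequences and sequence pairs.
toy_sequences = [
    "ATGCCGTA-ACGT",
    "ATGCCGTA-ACGT",   # exact duplicate, should be removed
    "ATGCCGAA-ACGT",
    "ATGCAGTA-ACGT",
]

def deduplicate(seqs):
    seen, unique = set(), []
    for s in seqs:
        if s not in seen:          # hashing the full string is enough for exact duplicates
            seen.add(s)
            unique.append(s)
    return unique

unique_seqs = deduplicate(toy_sequences)

# Unordered pairs of distinct sequences, deduplicated the same way.
pairs = {tuple(sorted((a, b))) for a in unique_seqs for b in unique_seqs if a != b}
print(len(unique_seqs), "unique sequences,", len(pairs), "unique pairs")
```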
Funding: National Key Research and Development Program of China, Grant/Award Number: 2021YFC1910402.
Abstract: Intelligent sorting is an important prerequisite for the full quantitative consumption and harmless disposal of kitchen waste. The existing object detection methods based on ImageNet pre-trained models are an effective way of sorting. However, owing to significant domain gaps between natural images and kitchen waste images, it is difficult for an ImageNet pre-trained model to reflect the characteristics of diverse scales and dense distribution in kitchen waste, leading to poor generalisation. In this article, the authors propose the first pre-trained model for kitchen waste sorting, called KitWaSor, which combines both contrastive learning (CL) and masked image modelling (MIM) through self-supervised learning (SSL). First, to address the issue of diverse scales, the authors propose a mixed masking strategy by introducing an incomplete masking branch alongside the original random masking branch. It prevents the complete loss of small-scale objects while avoiding excessive leakage of large-scale object pixels. Second, to address the issue of dense distribution, the authors introduce semantic consistency constraints on the basis of the mixed masking strategy. That is, object semantic reasoning is performed through semantic consistency constraints to compensate for the lack of contextual information. To train KitWaSor, the authors construct the first million-level kitchen waste dataset across seasonal and regional distributions, named KWD-Million. Extensive experiments show that KitWaSor achieves state-of-the-art (SOTA) performance on the two most relevant downstream tasks for kitchen waste sorting (i.e. image classification and object detection), demonstrating the effectiveness of the proposed KitWaSor.
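The mixed masking idea can be sketched as two mask generators over image patches: the usual random branch hides whole patches at a high ratio, while the incomplete branch only partially hides each selected patch so that small objects are never erased entirely. The ratios and the "partial" rule below are assumptions used purely to make the idea concrete, not KitWaSor's actual strategy.

```python
# Sketch: a random masking branch plus an "incomplete" masking branch over image patches.
import numpy as np

rng = np.random.default_rng(0)
grid, patch = 14, 16                           # 14x14 grid of 16x16 patches (224x224 image)

def random_branch(mask_ratio=0.75):
    """Standard MIM-style masking: hide whole patches at a high ratio."""
    m = np.zeros((grid, grid), dtype=bool)
    idx = rng.choice(grid * grid, int(mask_ratio * grid * grid), replace=False)
    m.flat[idx] = True
    return m

def incomplete_branch(mask_ratio=0.75, keep_fraction=0.25):
    """Assumed variant: for each selected patch, keep a fraction of its pixels visible."""
    patch_mask = random_branch(mask_ratio)
    pixel_mask = np.repeat(np.repeat(patch_mask, patch, 0), patch, 1)  # to pixel level
    keep = rng.random(pixel_mask.shape) < keep_fraction
    return pixel_mask & ~keep                  # masked pixels minus the kept ones

full = random_branch()
partial = incomplete_branch()
print("fully masked pixels:", int(np.repeat(np.repeat(full, patch, 0), patch, 1).sum()))
print("incompletely masked pixels:", int(partial.sum()))
```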
Abstract: 3D sparse convolution has emerged as a pivotal technique for efficient voxel-based perception in autonomous systems, enabling selective feature extraction from non-empty voxels while suppressing computational waste. Despite its theoretical efficiency advantages, practical implementations face under-explored limitations: the fixed geometric patterns of conventional sparse convolutional kernels inevitably process non-contributory positions during sliding-window operations, particularly in regions with uneven point cloud density. To address this, we propose Hierarchical Shape Pruning for 3D Sparse Convolution (HSP-S), which dynamically eliminates redundant kernel stripes through layer-adaptive thresholding. Unlike static soft pruning methods, HSP-S maintains trainable sparsity patterns by progressively adjusting pruning thresholds during optimization, enlarging the original parameter search space while removing redundant operations. Extensive experiments validate the effectiveness of HSP-S across major autonomous driving benchmarks. On KITTI's 3D object detection task, our method removes 93.47% of redundant kernel computations while maintaining comparable accuracy (a 1.56% mAP drop). Remarkably, on the more complex NuScenes benchmark, HSP-S achieves simultaneous computation reduction (21.94% sparsity) and accuracy gains (1.02% mAP (mean Average Precision) and 0.47% NDS (nuScenes detection score) improvement), demonstrating its scalability to diverse perception scenarios. This work establishes the first learnable shape pruning framework that simultaneously enhances computational efficiency and preserves detection accuracy in 3D perception systems.
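The core mechanism can be approximated as follows: for each 3D convolution kernel, score every "stripe" (here taken as the vector of weights across input channels at one kernel position) by its L1 norm, and zero the stripes whose score falls below a per-layer threshold that is tightened as training progresses. This is a simplified stand-in for HSP-S; the stripe definition, the threshold schedule, and the shapes are assumptions.

```python
# Sketch: magnitude-based stripe pruning of a 3D conv weight with a layer-adaptive
# threshold that grows over training.
import torch

def prune_stripes(weight, quantile):
    """weight: (out_c, in_c, kd, kh, kw). A stripe = weights over in_c at one (out_c, kd, kh, kw)."""
    out_c, in_c, kd, kh, kw = weight.shape
    stripes = weight.permute(0, 2, 3, 4, 1).reshape(-1, in_c)     # (num_stripes, in_c)
    scores = stripes.abs().sum(dim=1)                             # L1 norm per stripe
    threshold = torch.quantile(scores, quantile)                  # layer-adaptive cut-off
    keep = (scores > threshold).float().unsqueeze(1)              # 0/1 per stripe
    pruned = (stripes * keep).reshape(out_c, kd, kh, kw, in_c).permute(0, 4, 1, 2, 3)
    return pruned, float(1.0 - keep.mean())

w = torch.randn(32, 16, 3, 3, 3)
for epoch, q in enumerate([0.1, 0.3, 0.5]):        # progressively tightened threshold
    w, sparsity = prune_stripes(w, q)
    print(f"epoch {epoch}: stripe sparsity = {sparsity:.2f}")
```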
Funding: Supported via funding from Prince Sattam bin Abdulaziz University (PSAU/2025/R/1446), Princess Nourah bint Abdulrahman University (PNURSP2025R300), and Prince Sultan University.
Abstract: Deep neural networks provide accurate results for most applications. However, they need a big dataset to train properly, and providing a big dataset is a significant challenge in most applications. Image augmentation refers to techniques that increase the amount of image data. Common operations for image augmentation include changes in illumination, rotation, contrast, size, viewing angle, and others. Recently, Generative Adversarial Networks (GANs) have been employed for image generation. However, like image augmentation methods, GAN approaches can only generate images that are similar to the original images; therefore, they also cannot generate new classes of data. Texture images present more challenges than general images, and generating textures is more complex than creating other types of images. This study proposes a gradient-based deep neural network method that generates new classes of textures. It is possible to rapidly generate new classes of textures using different kernels from pre-trained deep networks. After generating new textures for each class, the number of textures is increased through image augmentation. During this process, several techniques are proposed to automatically remove incomplete and similar textures that are created. The proposed method is around 4 to 10 times faster than some well-known generative networks, and the quality of the generated textures surpasses that of these networks. The proposed method can generate textures that surpass those of some GANs and parametric models in certain image quality metrics, and it can provide a big texture dataset for training deep networks. A new big texture dataset, called BigTex, is created artificially using the proposed method. This dataset is approximately 2 GB in size and comprises 30,000 textures, each 150×150 pixels, organized into 600 classes. It is uploaded to the Kaggle site and Google Drive. Compared to other texture datasets, the proposed dataset is the largest and can serve as a comprehensive texture dataset for training more powerful deep neural networks and mitigating overfitting.
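The gradient-based generation step can be pictured as activation maximization: start from noise and ascend the gradient of a chosen channel of a pre-trained convolutional layer so the input turns into a texture that strongly excites that kernel. The sketch below does this with a torchvision VGG-16 (a weight download is required) and arbitrary layer/channel choices; the paper's exact objective, its filtering of incomplete or duplicate textures, and its augmentation pipeline are not reproduced, and input normalization is omitted for brevity.

```python
# Sketch: synthesizing a texture by maximizing one channel of a pre-trained VGG-16 layer.
import torch
from torchvision import models

vgg = models.vgg16(weights="IMAGENET1K_V1").features.eval()   # downloads ImageNet weights
for p in vgg.parameters():
    p.requires_grad_(False)

layer_idx, channel = 10, 42                                   # arbitrary choices for illustration
img = torch.rand(1, 3, 150, 150, requires_grad=True)          # 150x150, like the BigTex textures
opt = torch.optim.Adam([img], lr=0.05)

for step in range(100):
    x = img
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i == layer_idx:
            break
    loss = -x[0, channel].mean()                # ascend the mean activation of one kernel
    opt.zero_grad()
    loss.backward()
    opt.step()
    img.data.clamp_(0.0, 1.0)                   # keep the image in a valid range

texture = img.detach().squeeze(0)               # seed image for a new synthetic texture "class"
print(texture.shape)
```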
Abstract: The rapid growth of digital data necessitates advanced natural language processing (NLP) models like BERT (Bidirectional Encoder Representations from Transformers), known for its superior performance in text classification. However, BERT's size and computational demands limit its practicality, especially in resource-constrained settings. This research compresses the BERT base model for Bengali emotion classification through knowledge distillation (KD), pruning, and quantization techniques. Despite Bengali being the sixth most spoken language globally, NLP research in this area is limited. Our approach addresses this gap by creating an efficient BERT-based model for Bengali text. We have explored 20 combinations of KD, quantization, and pruning, resulting in improved speedup, fewer parameters, and reduced memory size. Our best results demonstrate significant improvements in both speed and efficiency. For instance, in the case of mBERT, we achieved a 3.87× speedup and a 4× compression ratio with a Distil+Prune+Quant combination that reduced the parameters from 178 M to 46 M, while the memory size decreased from 711 MB to 178 MB. These results offer scalable solutions for NLP tasks in various languages and advance the field of model compression, making these models suitable for real-world applications in resource-limited environments.
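Two of the compression steps mentioned above, magnitude pruning and post-training dynamic quantization, can be applied to a Hugging Face BERT checkpoint with standard PyTorch utilities, as in the sketch below. It uses mBERT and a 30% pruning ratio purely as placeholders (and needs the transformers package plus a model download); the distillation stage and the exact 20 configurations studied in the paper are not shown.

```python
# Sketch: magnitude pruning + dynamic quantization of a BERT classifier (PyTorch utilities).
import torch
import torch.nn.utils.prune as prune
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=6)        # placeholder checkpoint / label count

# 1) Unstructured L1 magnitude pruning of every Linear layer (30% is an assumed ratio).
for module in model.modules():
    if isinstance(module, torch.nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")                    # make the zeros permanent

# 2) Post-training dynamic quantization of the Linear layers to int8.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8)

def param_count(m):
    return sum(p.numel() for p in m.parameters())

# Quantized Linear layers store packed weights outside parameters(), so the count drops sharply.
print("float parameters:", param_count(model), "| remaining float parameters:", param_count(quantized))
```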
Funding: Shanghai Municipal Commission of Economy and Information Technology, China (No. 202301054).
Abstract: The end-to-end object detection Transformer (DETR) successfully established the paradigm of the Transformer architecture in the field of object detection. Its end-to-end detection process and the idea of set prediction have made it one of the hottest network architectures in recent years, and there has been an abundance of work improving upon DETR. However, DETR and its variants require a substantial amount of memory resources and computational costs, and the vast number of parameters in these networks is unfavorable for model deployment. To address this issue, a greedy pruning (GP) algorithm is proposed and applied to the variant denoising DETR (DN-DETR), which can eliminate redundant parameters in the Transformer architecture of DN-DETR. Considering the different roles of the multi-head attention (MHA) module and the feed-forward network (FFN) module in the Transformer architecture, a modular greedy pruning (MGP) algorithm is proposed. This algorithm separates the two modules and applies their respective optimal strategies and parameters. The effectiveness of the proposed algorithm is validated on the COCO 2017 dataset. The model obtained through the MGP algorithm reduces the parameters by 49% and the number of floating-point operations (FLOPs) by 44% compared to the Transformer architecture of DN-DETR. At the same time, the mean average precision (mAP) of the model increases from 44.1% to 45.3%.
Abstract: We analyze the suitability of existing pre-trained transformer-based language models (PLMs) for abstractive text summarization of German technical healthcare texts. The study focuses on the multilingual capabilities of these models and their ability to perform abstractive text summarization in the healthcare field. The research hypothesis was that large language models could perform high-quality abstractive text summarization of German technical healthcare texts, even if the model is not specifically trained in that language. Through experiments, the research questions explore the performance of transformer language models in dealing with complex syntax constructs, the difference in performance between models trained on English and German, and the impact of translating the source text to English before conducting the summarization. We conducted an evaluation of four PLM approaches (GPT-3, a translation-based approach also utilizing GPT-3, a German language model, and a domain-specific biomedical model). The evaluation considered informativeness, using three types of metrics based on Recall-Oriented Understudy for Gisting Evaluation (ROUGE), and the quality of the results, which was manually evaluated with respect to five aspects. The results show that text summarization models can be used in the German healthcare domain and that domain-independent language models achieved the best results. The study demonstrates that text summarization models can simplify the search for pre-existing German knowledge in various domains.
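The ROUGE-based part of such an evaluation can be reproduced with the rouge-score package, as in the small sketch below; the German sentences are invented examples, and the package's stemmer is English-oriented, which is one of the limitations a German-language evaluation has to keep in mind.

```python
# Sketch: computing ROUGE-1/2/L for a generated German summary against a reference.
from rouge_score import rouge_scorer   # pip install rouge-score

reference = "Der Patient zeigt eine deutliche Besserung nach der Therapie."
candidate = "Nach der Therapie zeigt der Patient eine deutliche Besserung."

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=False)
scores = scorer.score(reference, candidate)

for name, result in scores.items():
    print(f"{name}: precision={result.precision:.2f} "
          f"recall={result.recall:.2f} f1={result.fmeasure:.2f}")
```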
Abstract: Accurate traffic flow prediction (TFP) is vital for efficient and sustainable transportation management and the development of intelligent traffic systems. However, missing data in real-world traffic datasets poses a significant challenge to maintaining prediction precision. This study introduces REPTF-TMDI, a novel method that combines a Reduced Error Pruning Tree Forest (REPTree Forest) with a newly proposed Time-based Missing Data Imputation (TMDI) approach. The REPTree Forest, an ensemble learning approach, is tailored for time-related traffic data to enhance predictive accuracy and support the evolution of sustainable urban mobility solutions. Meanwhile, the TMDI approach exploits temporal patterns to estimate missing values reliably whenever empty fields are encountered. The proposed method was evaluated using hourly traffic flow data from a major U.S. roadway spanning 2012-2018, incorporating temporal features (e.g., hour, day, month, year, weekday), a holiday indicator, and weather conditions (temperature, rain, snow, and cloud coverage). Experimental results demonstrated that the REPTF-TMDI method outperformed conventional imputation techniques across various missing data ratios by achieving an average 11.76% improvement in the correlation coefficient (R). Furthermore, the REPTree Forest achieved improvements of 68.62% in RMSE and 70.52% in MAE compared to existing state-of-the-art models. These findings highlight the method's ability to significantly boost traffic flow prediction accuracy, even in the presence of missing data, thereby contributing to the broader objectives of sustainable urban transportation systems.
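The time-based imputation idea, filling a missing hourly count from the typical value observed in the same time slot, can be sketched with pandas: group the series by (weekday, hour), replace missing entries with the group mean, and fall back to interpolation where a slot has no history. The column names, the synthetic series, and the fallback rule are assumptions for illustration, not the paper's exact TMDI procedure.

```python
# Sketch: time-based imputation of missing hourly traffic volumes with pandas.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
idx = pd.date_range("2017-01-01", periods=24 * 60, freq="h")         # 60 days of hourly data
volume = 3000 + 1500 * np.sin(2 * np.pi * idx.hour / 24) + rng.normal(0, 200, len(idx))
df = pd.DataFrame({"volume": volume}, index=idx)
df.loc[rng.random(len(df)) < 0.1, "volume"] = np.nan                  # knock out ~10% of values

df["weekday"], df["hour"] = df.index.weekday, df.index.hour
slot_mean = df.groupby(["weekday", "hour"])["volume"].transform("mean")
df["volume_imputed"] = df["volume"].fillna(slot_mean)                 # same (weekday, hour) mean
df["volume_imputed"] = df["volume_imputed"].interpolate(limit_direction="both")  # fallback

print("remaining missing values:", int(df["volume_imputed"].isna().sum()))
```

A tree-ensemble regressor (for example, scikit-learn's RandomForestRegressor as a rough stand-in for the REPTree Forest) can then be trained on the completed table with the temporal, holiday, and weather features as inputs.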
Funding: Funded by NIH grants HL154720-03S1, AG057587, AG074283, and DK122708-03S1; BrightFocus ADR A20183775; and the Brown Foundation 2020 Healthy Aging Initiative (to WC).
Abstract: Microglia are the macrophages that populate the brain parenchyma. Research in the past decades has identified them as both essential guardians of the brain and significant contributors to various neurological diseases. A highly versatile cell type, microglia have been shown to fulfill a multitude of critical roles in the central nervous system, including facilitating neurogenesis and myelination, pruning synapses, removing debris and waste, modulating neuronal activity, supporting the blood-brain barrier, repairing tissue damage, and surveilling against microbial invasions under physiological conditions (Prinz et al., 2021; Paolicelli et al., 2022).