854 articles found
Knowledge graph-enhanced long-tail learning approach for traditional Chinese medicine syndrome differentiation
1
Authors: Weikang Kong, Chuanbiao Wen, Yue Luo. Digital Chinese Medicine, 2026, No. 1, pp. 57-67 (11 pages)
Objective: To address the dual challenges of long-tail distribution and feature sparsity in traditional Chinese medicine (TCM) syndrome differentiation within real clinical settings, we propose a data-efficient learning framework enhanced by knowledge graphs. Methods: We developed Agent-GNN, a three-stage decoupled learning framework, and validated it on the Traditional Chinese Medicine Syndrome Diagnosis (TCM-SD) dataset containing 54152 clinical records across 148 syndrome categories. First, we constructed a comprehensive medical knowledge graph encoding the complete TCM reasoning system. Second, we proposed a Functional Patient Profiling (FPP) method that utilizes large language models (LLMs) combined with Graph Retrieval-Augmented Generation (RAG) to extract structured symptom-etiology-pathogenesis subgraphs from medical records. Third, we employed heterogeneous graph neural networks to learn structured combination patterns explicitly. We compared our method against multiple baselines including BERT, ZY-BERT, ZY-BERT+Know, GAT, and GPT-4 Few-shot, using macro-F1 score as the primary evaluation metric. Additionally, ablation experiments were conducted to validate the contribution of each key component to model performance. Results: Agent-GNN achieved an overall macro-F1 score of 72.4%, an improvement of 8.7 percentage points over ZY-BERT+Know (63.7%), the strongest baseline among traditional methods. For long-tail syndromes with fewer than 10 samples, Agent-GNN reached a macro-F1 score of 58.6%, compared with 39.3% for ZY-BERT+Know and 41.2% for GPT-4 Few-shot, representing relative improvements of 49.2% and 42.2%, respectively. Ablation experiments confirmed that the explicit modeling of etiology-pathogenesis nodes contributed 12.4 percentage points to this enhanced long-tail syndrome performance. Conclusion: This study proposes Agent-GNN, a knowledge graph-enhanced framework that effectively addresses the long-tail distribution challenge in TCM syndrome differentiation. By explicitly modeling manifestation-mechanism-essence patterns through structured knowledge graphs, our approach achieves superior performance in data-scarce scenarios while providing interpretable reasoning paths for TCM intelligent diagnosis.
Keywords: Syndrome differentiation; Medical knowledge graph; Graph neural networks; Long-tail learning; Data-efficient learning
Improving long-tail classification via decoupling and regularisation
2
Authors: Shuzheng Gao, Chaozheng Wang, Cuiyun Gao, Wenjian Luo, Peiyi Han, Qing Liao, Guandong Xu. CAAI Transactions on Intelligence Technology, 2025, No. 1, pp. 62-71 (10 pages)
Real-world data always exhibit an imbalanced and long-tailed distribution, which leads to poor performance for neural network-based classification. Existing methods mainly tackle this problem by reweighting the loss function or rebalancing the classifier. However, one crucial aspect overlooked by previous research studies is the imbalanced feature space problem caused by the imbalanced angle distribution. In this paper, the authors shed light on the significance of the angle distribution in achieving a balanced feature space, which is essential for improving model performance under long-tailed distributions. Nevertheless, it is challenging to effectively balance both the classifier norms and angle distribution due to problems such as the low feature norm. To tackle these challenges, the authors first thoroughly analyse the classifier and feature space by decoupling the classification logits into three key components: classifier norm (i.e. the magnitude of the classifier vector), feature norm (i.e. the magnitude of the feature vector), and cosine similarity between the classifier vector and feature vector. In this way, the authors analyse the change of each component in the training process and reveal three critical problems that should be solved, that is, the imbalanced angle distribution, the lack of feature discrimination, and the low feature norm. Drawing from this analysis, the authors propose a novel loss function that incorporates hyperspherical uniformity, additive angular margin, and feature norm regularisation. Each component of the loss function addresses a specific problem and synergistically contributes to achieving a balanced classifier and feature space. The authors conduct extensive experiments on three popular benchmark datasets including CIFAR-10/100-LT, ImageNet-LT, and iNaturalist 2018. The experimental results demonstrate that the authors' loss function outperforms several previous state-of-the-art methods in addressing the challenges posed by imbalanced and long-tailed datasets, that is, by improving upon the best-performing baselines on CIFAR-100-LT by 1.34, 1.41, 1.41 and 1.33, respectively.
Keywords: computer vision; image classification; long-tailed data; machine learning
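The three-way decomposition of a classification logit described in this abstract follows from the identity w·f = ‖w‖ ‖f‖ cos θ. A minimal sketch in plain Python, assuming a bias-free linear classifier; the function name `decompose_logit` and the toy vectors are illustrative and not taken from the paper:

```python
import math

def decompose_logit(w, f):
    """Split the logit w.f into (classifier norm, feature norm, cosine similarity)."""
    dot = sum(wi * fi for wi, fi in zip(w, f))
    w_norm = math.sqrt(sum(wi * wi for wi in w))   # magnitude of the classifier vector
    f_norm = math.sqrt(sum(fi * fi for fi in f))   # magnitude of the feature vector
    cosine = dot / (w_norm * f_norm)               # cosine of the angle between w and f
    return w_norm, f_norm, cosine

# Toy example (hypothetical values): one class weight vector and one feature vector.
w = [3.0, 4.0]
f = [1.0, 0.0]
w_norm, f_norm, cosine = decompose_logit(w, f)
logit = sum(wi * fi for wi, fi in zip(w, f))
# The product of the three components reproduces the original logit.
print(w_norm, f_norm, cosine, logit)
```

Tracking each of the three returned values separately over training is one way to observe the norm and angle imbalances the abstract refers to.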
Diurnal brooding behavior of long-tailed tits (Aegithalos caudatus glaucogularis) (Cited by: 2)
3
Authors: Jin YU, Peng-Cheng WANG, Lei LU, Zheng-Wang ZHANG, Yong WANG, Ji-Liang XU, Jian-Qiang LI, Bo XI, Jia-Gui ZHU, Zhi-Yong DU. Zoological Research, 2016, No. 2, pp. 84-89 (6 pages). Indexed in: CAS, CSCD
Brooding is a major breeding investment of parental birds during the early nestling stage, and has important effects on the development and survival of nestlings. Investigating brooding behavior can help to understand avian breeding investment strategies. From January to June in 2013 and 2014, we studied the brooding behaviors of long-tailed tits (Aegithalos caudatus glaucogularis) in Dongzhai National Nature Reserve, Henan Province, China. We analyzed the relationships between parental diurnal brooding duration and nestling age, brood size, temperature, relative breeding season, time of day and nestling frequencies during brooding duration. Results showed that female and male long-tailed tit parents had different breeding investment strategies during the early nestling stage. Female parents bore most of the brooding investment, while male parents performed most of the nestling feedings. In addition, helpers were not found to brood nestlings at the two cooperative breeding nests. Parental brooding duration was significantly associated with the food delivered to nestlings (F=86.10, df=1, 193.94, P<0.001), and was longer when the nestlings received more food. We found that parental brooding duration declined significantly as nestlings aged (F=5.99, df=1, 50.13, P=0.018). When nestlings were six days old, daytime parental brooding almost ceased, implying that long-tailed tit nestlings might be able to maintain their own body temperature by this age. In addition, brooding duration was affected by both brood size (F=12.74, df=1, 32.08, P=0.001) and temperature (F=5.83, df=1, 39.59, P=0.021), with it being shorter in larger broods and when ambient temperature was higher.
Keywords: long-tailed tit; Aegithalos caudatus glaucogularis; brooding; daytime; early nestling stage
Jinfengopteryx Compared to Archaeopteryx, with Comments on the Mosaic Evolution of Long-tailed Avialan Birds (Cited by: 1)
4
Authors: JI Shu'an, JI Qiang. Acta Geologica Sinica (English Edition), 2007, No. 3, pp. 337-343 (7 pages). Indexed in: SCIE, CAS, CSCD
Jinfengopteryx is a newly uncovered Archaeopteryx-like avialan bird outside Germany, which was found in the Jehol Biota of northern Hebei in northeastern China. It shares many characters only with Archaeopteryx, namely the possession of three fenestrae in the antorbital cavity, 23 caudal vertebrae and long tail feathers attached to all the caudal vertebrae. But the former differs from the latter in the relatively short and high preorbital region of the skull, more numerous and closely packed teeth, and a much shorter forelimb compared to the hindlimb. Such differences indicate Jinfengopteryx is even slightly more primitive than Archaeopteryx, although both birds can be placed at the root position of the avialan tree based on cladistic analysis. Shenzhouraptor is suggested to be slightly more advanced than Jinfengopteryx + Archaeopteryx, supported by some derived features in teeth, shoulder girdles and forelimbs such as the reduction of tooth number, dorsolaterally directed glenoid facet, very long forelimb and comparatively short manus. Meanwhile, the tail of Shenzhouraptor shows more primitive characters than those of Jinfengopteryx and Archaeopteryx, e.g., the strikingly longer tail composed of more caudal vertebrae and the long tail feathers attached only to distal caudal segments. The mixed primitive and advanced characters reveal the evident mosaic evolution among long-tailed avialan birds.
Keywords: Jinfengopteryx; Archaeopteryx; long-tailed avialans; mosaic evolution; Mesozoic
Semi-supervised Long-tail Endoscopic Image Classification
5
Authors: Runnan Cao, Mengjie Fang, Hailing Li, Jie Tian, Di Dong. Chinese Medical Sciences Journal, 2022, No. 3, pp. 171-180, I0002 (11 pages). Indexed in: CAS, CSCD
Objective: To explore the semi-supervised learning (SSL) algorithm for long-tail endoscopic image classification with limited annotations. Method: We explored semi-supervised long-tail endoscopic image classification in HyperKvasir, the largest gastrointestinal public dataset with 23 diverse classes. Semi-supervised learning algorithm FixMatch was applied based on consistency regularization and pseudo-labeling. After splitting the training dataset and the test dataset at a ratio of 4:1, we sampled 20%, 50%, and 100% labeled training data to test the classification with limited annotations. Results: The classification performance was evaluated by micro-average and macro-average evaluation metrics, with the Matthews correlation coefficient (MCC) as the overall evaluation. SSL algorithm improved the classification performance, with MCC increasing from 0.8761 to 0.8850, from 0.8983 to 0.8994, and from 0.9075 to 0.9095 with 20%, 50%, and 100% ratio of labeled training data, respectively. With a 20% ratio of labeled training data, SSL improved both the micro-average and macro-average classification performance; while for the ratio of 50% and 100%, SSL improved the micro-average performance but hurt macro-average performance. Through analyzing the confusion matrix and labeling bias in each class, we found that the pseudo-labeling-based SSL algorithm exacerbated the classifier's preference for the head class, resulting in improved performance in the head class and degenerated performance in the tail class. Conclusion: SSL can improve the classification performance for semi-supervised long-tail endoscopic image classification, especially when the labeled data is extremely limited, which may benefit the building of assisted diagnosis systems for low-volume hospitals. However, the pseudo-labeling strategy may amplify the effect of class imbalance, which hurts the classification performance for the tail class.
Keywords: endoscopic image; artificial intelligence; semi-supervised learning; long-tail distribution; image classification
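The head-class bias attributed to pseudo-labeling in this abstract can be illustrated with confidence-thresholded pseudo-label selection, the general mechanism FixMatch-style methods rely on. This is only a sketch under stated assumptions: the function name `select_pseudo_labels`, the threshold value, and the probability vectors are hypothetical and do not come from the paper.

```python
def select_pseudo_labels(probs, threshold=0.95):
    """Keep (sample index, predicted class) for predictions at or above the threshold."""
    selected = []
    for i, p in enumerate(probs):
        confidence = max(p)
        if confidence >= threshold:
            selected.append((i, p.index(confidence)))
    return selected

# Hypothetical softmax outputs on unlabeled images; class 0 plays the head class.
probs = [
    [0.97, 0.02, 0.01],  # confident head-class prediction: kept
    [0.60, 0.30, 0.10],  # not confident enough: dropped
    [0.10, 0.15, 0.75],  # tail-class prediction below threshold: dropped
    [0.96, 0.03, 0.01],  # confident head-class prediction: kept
]
print(select_pseudo_labels(probs))
```

In this toy input only head-class predictions pass the threshold, so the pseudo-labeled set is more imbalanced than the unlabeled pool, which is the kind of amplification the abstract reports.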
M^(2)LC-Net: A Multi-Modal Multi-Disease Long-Tailed Classification Network for Real Clinical Scenes
6
Authors: Zhonghong Ou, Wenjun Chai, Lifei Wang, Ruru Zhang, Jiawen He, Meina Song, Lifei Yuan, Shengjuan Zhang, Yanhui Wang, Huan Li, Xin Jia, Rujian Huang. China Communications, 2021, No. 9, pp. 210-220 (11 pages). Indexed in: SCIE, CSCD
Leveraging deep learning-based techniques to classify diseases has attracted extensive research interest in recent years. Nevertheless, most of the current studies only consider single-modal medical images, and the number of ophthalmic diseases that can be classified is relatively small. Moreover, imbalanced data distribution of different ophthalmic diseases is not taken into consideration, which limits the application of deep learning techniques in realistic clinical scenes. In this paper, we propose a Multi-modal Multi-disease Long-tailed Classification Network (M^(2)LC-Net) in response to the challenges mentioned above. M^(2)LC-Net leverages ResNet18-CBAM to extract features from fundus images and Optical Coherence Tomography (OCT) images, respectively, and conducts feature fusion to classify 11 common ophthalmic diseases. Moreover, Class Activation Mapping (CAM) is employed to visualize each mode to improve the interpretability of M^(2)LC-Net. We conduct comprehensive experiments on a realistic dataset collected from a Grade III Level A ophthalmology hospital in China, including 34,396 images of 11 disease labels. Experimental results demonstrate the effectiveness of our proposed model M^(2)LC-Net. Compared with the state-of-the-art, various performance metrics have been improved significantly. Specifically, Cohen's kappa coefficient κ has been improved by 3.21%, which is a remarkable improvement.
Keywords: deep learning; multi modal; long-tail; ophthalmic disease classification
Dual Channel with Involution for Long-Tailed Visual Recognition
7
Author: Mengxue Li. Open Journal of Applied Sciences, 2022, No. 4, pp. 421-433 (13 pages)
With the rapid increase of large-scale problems, the distribution of real-world datasets tends to be long-tailed. Existing solutions typically involve re-balancing strategies (i.e., re-sampling and re-weighting). Although they can significantly promote the classifier learning of deep networks, they will unexpectedly impair the representative ability of the learned deep features to a certain extent. Therefore, this paper proposes a dual-channel learning algorithm with involution neural networks (DC-Invo) to take care of representation learning and classifier learning concurrently. The key idea of this work is to combine ResNet and involution to obtain higher classification accuracy, because involution has wider coverage in the spatial dimension. The paper conducted extensive experiments on several benchmark vision tasks including CIFAR-LT, ImageNet-LT, and Places-LT, showing that DC-Invo is able to achieve significant performance gains on long-tailed datasets.
Keywords: Long-tailed recognition; Deep neural network; Dual-channel structure; Involution
Robust long-tailed learning under label noise
8
Authors: Tong WEI, Jiang-Xin SHI, Min-Ling ZHANG, Yu-Feng LI. Frontiers of Computer Science, 2026, No. 1, pp. 1-12 (12 pages)
Long-tailed learning aims to enhance the generalization performance of underrepresented tail classes. However, previous methods have largely overlooked the prevalence of noisy labels in training data. In this paper, we address the challenge of noisy labels in long-tailed learning. We identify a critical issue: the commonly used small-loss noisy label detection criterion fails to perform effectively in long-tailed class distributions. This failure arises from the inherent bias of deep neural networks, which tend to misclassify tail class examples as head classes, leading to unreliable loss calculations. To mitigate this, we propose a novel small-distance criterion that leverages the robustness of learned representations, enabling more accurate identification of correctly-labeled examples across both head and tail classes. Additionally, to improve training for tail classes, we replace discrete pseudo-labels with label distributions for examples flagged as noisy, resulting in significant performance gains. Based on these contributions, we introduce the robust long-tail learning framework, designed to train models that are resilient to both class imbalance and noisy labels. Extensive experiments on benchmark and real-world datasets demonstrate that our approach outperforms previous methods, offering substantial performance improvements. Our source code is available at github.com/Stomach-ache/RoLT.
Keywords: long-tail learning; noisy labels; semi-supervised learning
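One possible reading of a distance-based clean-sample criterion is sketched below: compute a prototype (mean feature) per labeled class and treat an example as likely correctly labeled when it lies close to the prototype of its own label. This is an assumption-heavy illustration, not the RoLT implementation; the function names, the fixed `max_distance` cutoff, and the toy features are all hypothetical.

```python
import math
from collections import defaultdict

def class_prototypes(features, labels):
    """Mean feature vector per class label."""
    sums = {}
    counts = defaultdict(int)
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + xi for s, xi in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def flag_likely_clean(features, labels, max_distance):
    """True where an example is within max_distance of its own class prototype."""
    protos = class_prototypes(features, labels)
    return [math.dist(x, protos[y]) <= max_distance for x, y in zip(features, labels)]

# Toy features: the last example carries label 0 but sits far from the other label-0 points.
features = [[0.0, 0.0], [0.2, 0.0], [0.1, 0.1], [5.0, 5.0]]
labels = [0, 0, 0, 0]
print(flag_likely_clean(features, labels, max_distance=3.0))
```

Unlike a loss-based criterion, this check depends only on the learned representation, so it is not directly tied to a classifier head that may favor head classes; a real system would need a principled way to set the cutoff per class.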
Decoupling incremental classifier and representation learning based continual learning machinery fault diagnosis framework under long-tailed distribution
9
Authors: Changqing Shen, Yao Liu, Bojian Chen, Xuyang Tao, Yifan Huangfu, Dong Wang. Chinese Journal of Mechanical Engineering, 2026, No. 1, pp. 74-87 (14 pages)
Continual learning fault diagnosis (CLFD) has gained growing interest in mechanical systems for its ability to accumulate and transfer knowledge in dynamic fault diagnosis scenarios. However, existing CLFD methods typically assume balanced task distributions, neglecting the long-tailed nature of real-world fault occurrences, where certain faults dominate while others are rare. Due to the long-tailed distribution among different mechanical conditions, excessive attention has been focused on the dominant type, leading to performance degradation in rarer types. In this paper, decoupling incremental classifier and representation learning (DICRL) is proposed to address the dual challenges of catastrophic forgetting introduced by incremental tasks and the bias in long-tailed CLFD (LT-CLFD). The core innovation lies in the structural decoupling of incremental classifier learning and representation learning. An instance-balanced sampling strategy is employed to learn more discriminative deep representations from the exemplars selected by the herding algorithm and new data. Then, the previous classifiers are frozen to prevent damage to representation learning during backward propagation. A cosine normalization classifier with learnable weight scaling is trained using a class-balanced sampling strategy to enhance classification accuracy. Experimental results demonstrate that DICRL outperforms existing continual learning methods across multiple benchmarks, demonstrating superior performance and robustness in both LT-CLFD and conventional CLFD. DICRL effectively tackles both catastrophic forgetting and long-tailed distribution in CLFD, enabling more reliable fault diagnosis in industrial applications.
Keywords: Fault diagnosis; Continual learning; Long-tailed distribution; Catastrophic forgetting
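The two sampling strategies named in this abstract are commonly written as a single family in which the probability of drawing class j is proportional to n_j^q, where n_j is the class size. A minimal sketch, assuming that common formulation rather than anything specific to DICRL; `sampling_probs` and the class counts are illustrative:

```python
def sampling_probs(class_counts, q):
    """Probability of sampling each class, proportional to count ** q.

    q = 1 gives instance-balanced sampling (every sample equally likely);
    q = 0 gives class-balanced sampling (every class equally likely).
    """
    weights = [n ** q for n in class_counts]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical fault dataset: one dominant condition and two rare ones.
counts = [900, 90, 10]
print(sampling_probs(counts, q=1))  # follows the long-tailed class frequencies
print(sampling_probs(counts, q=0))  # uniform over classes
```

Using the first setting for representation learning and the second for classifier training mirrors the decoupled schedule the abstract describes.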
Semi-Discrete Optimal Transport for Long-Tailed Classification
10
Authors: Lian-Bao Jin, Na Lei, Zhong-Xuan Luo, Jin Wu, Chao Ai, Xianfeng Gu. Journal of Computer Science & Technology, 2025, No. 1, pp. 252-266 (15 pages)
The long-tailed data distribution poses an enormous challenge for training neural networks in classification. A classification network can be decoupled into a feature extractor and a classifier. This paper takes a semi-discrete optimal transport (OT) perspective to analyze the long-tailed classification problem, where the feature space is viewed as a continuous source domain, and the classifier weights are viewed as a discrete target domain. The classifier's task is in effect to find a cell decomposition of the feature space with each cell corresponding to one class. An imbalanced training set causes the more frequent classes to have larger volume cells, which means that the classifier's decision boundary is biased towards less frequent classes, resulting in reduced classification performance in the inference phase. Therefore, we propose a novel OT-dynamic softmax loss, which dynamically adjusts the decision boundary in the training phase to avoid overfitting in the tail classes. In addition, our method incorporates the supervised contrastive loss so that the feature space can satisfy the uniform distribution condition. Extensive and comprehensive experiments demonstrate that our method achieves state-of-the-art performance on multiple long-tailed recognition benchmarks, including CIFAR-LT, ImageNet-LT, iNaturalist 2018, and Places-LT.
Keywords: semi-discrete optimal transport; long-tailed classification; decision boundary; supervised contrastive loss
Balanced Representation Learning for Long-tailed Skeleton-based Action Recognition
11
Authors: Hongda Liu, Yunlong Wang, Min Ren, Junxing Hu, Zhengquan Luo, Guangqi Hou, Zhenan Sun. Machine Intelligence Research, 2025, No. 3, pp. 466-483 (18 pages)
Skeleton-based action recognition has recently made significant progress. However, data imbalance is still a great challenge in real-world scenarios. The performance of current action recognition algorithms declines sharply when training data suffers from heavy class imbalance. The imbalanced data actually degrades the representations learned by these methods and becomes the bottleneck for action recognition. How to learn unbiased representations from imbalanced action data is the key to long-tailed action recognition. In this paper, we propose a novel balanced representation learning method to address the long-tailed problem in action recognition. Firstly, a spatial-temporal action exploration strategy is presented to expand the sample space effectively, generating more valuable samples in a rebalanced manner. Secondly, we design a detached action-aware learning schedule to further mitigate the bias in the representation space. The schedule detaches the representation learning of tail classes from training and proposes an action-aware loss to impose more effective constraints. Additionally, a skip-type representation is proposed to provide complementary structural information. The proposed method is validated on four skeleton datasets, NTU RGB+D 60, NTU RGB+D 120, NW-UCLA and Kinetics. It not only achieves consistently large improvement compared to the state-of-the-art (SOTA) methods, but also demonstrates a superior generalization capacity through extensive experiments. Our code is available at https://github.com/firework8/BRL.
Keywords: Action recognition; skeleton sequence; long-tailed visual recognition; imbalance learning
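PlutonicRocks-13: a class-imbalanced rock image dataset of plutonic intrusive rocks
12
Authors: 陈忠良, 胡召齐, 郑超杰. China Scientific Data (中国科学数据(中英文网络版)), 2026, No. 1, pp. 3-18 (16 pages)
Lithology identification is one of the basic skills of geologists. With the rise of artificial intelligence, turning geologists' ability to identify lithology into AI models that provide intelligent lithology identification services, so that geoscience enthusiasts and non-specialists can also identify lithology fairly accurately, has become one of the demands for intelligent services in the geosciences. Under natural conditions, because rocks are unevenly distributed at the surface, rock image datasets typically follow a long-tailed distribution. Taking plutonic intrusive rocks as an example, this study adopts the rock classification and nomenclature scheme from Petrology (《岩石学》), edited by Yu Bingsong (于炳松) et al., to build PlutonicRocks-13, a class-imbalanced dataset for rock image recognition research. The dataset covers 13 common plutonic intrusive rocks with 4785 images in total (2.49 GB of original images). The rock types are peridotite, pyroxenite, hornblendite, gabbro, diorite, monzonite, syenite, nepheline syenite, granodiorite, monzogranite, syenogranite, plagiogranite, and graphic granite. The images are mainly outcrop and hand-specimen photographs collected in the field and from collection-holding institutions, supplemented by images gathered from the web. After screening, processing, and annotation, the images form a dataset that provides base data for rock image classification. Annotation quality was controlled and assessed through thin-section identification and dataset bias detection based on deep learning interpretability analysis. By converting annotation labels into question-answer pairs, fine-tuning instructions for rock image classification can also be constructed, providing an instruction fine-tuning dataset for multimodal models. The dataset offers reliable data support for research on automatic rock image recognition and is a valuable reference for geological surveys, ground substrate surveys, and geoscience outreach.
Keywords: igneous rock; intrusive rock; long-tailed distribution; class imbalance; image classification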
SDNet: A self-supervised bird recognition method based on large language models and diffusion models for improving long-term bird monitoring
13
Authors: Zhongde Zhang, Nan Su, Chenxun Deng, Yandong Zhao, Weiping Liu, Qiaoling Han. Avian Research, 2026, No. 1, pp. 200-215 (16 pages)
The collection and annotation of large-scale bird datasets are resource-intensive and time-consuming processes that significantly limit the scalability and accuracy of biodiversity monitoring systems. While self-supervised learning (SSL) has emerged as a promising approach for leveraging unannotated data, current SSL methods face two critical challenges in bird species recognition: (1) long-tailed data distributions that result in poor performance on underrepresented species; and (2) domain shift issues caused by data augmentation strategies designed to mitigate class imbalance. Here we present SDNet, a novel SSL-based bird recognition framework that integrates diffusion models with large language models (LLMs) to overcome these limitations. SDNet employs LLMs to generate semantically rich textual descriptions for tail-class species by prompting the models with species taxonomy, morphological attributes, and habitat information, producing detailed natural language priors that capture fine-grained visual characteristics (e.g., plumage patterns, body proportions, and distinctive markings). These textual descriptions are subsequently used by a conditional diffusion model to synthesize new bird image samples through cross-attention mechanisms that fuse textual embeddings with intermediate visual feature representations during the denoising process, ensuring generated images preserve species-specific morphological details while maintaining photorealistic quality. Additionally, we incorporate a Swin Transformer as the feature extraction backbone whose hierarchical window-based attention mechanism and shifted windowing scheme enable multi-scale local feature extraction that proves particularly effective at capturing fine-grained discriminative patterns (such as beak shape and feather texture) while mitigating domain shift between synthetic and original images through consistent feature representations across both data sources. SDNet is validated on both a self-constructed dataset (Bird_BXS) and a publicly available benchmark (Birds_25), demonstrating substantial improvements over conventional SSL approaches. Our results indicate that the synergistic integration of LLMs, diffusion models, and the Swin Transformer architecture contributes significantly to recognition accuracy, particularly for rare and morphologically similar species. These findings highlight the potential of SDNet for addressing fundamental limitations of existing SSL methods in avian recognition tasks and establishing a new paradigm for efficient self-supervised learning in large-scale ornithological vision applications.
Keywords: Biodiversity conservation; Bird intelligent monitoring; Diffusion models; Large-scale language models; Long-tailed learning; Self-supervised learning
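A data augmentation method for Tibetan-Chinese machine translation based on the distribution of long-tail words
14
Authors: 格桑加措, 尼玛扎西, 群诺, 嘎玛扎西, 道吉扎西, 罗桑益西, 拉毛吉, 钱木吉. Computer Science (计算机科学), 2026, No. 1, pp. 224-230 (7 pages). Indexed in: PKU Core
Existing Tibetan-Chinese machine translation corpora suffer from an imbalanced distribution of data across domains, so the trained models translate unevenly across domains. Back-translation, a common data augmentation method, improves model performance by supplying more diverse pseudo-data. However, conventional back-translation does not adequately account for the domain imbalance, so overall performance improves while translation quality in resource-scarce domains is hard to raise. To address this, the method analyzes the distribution of long-tail words in the corpus, uses the long-tail words of the existing Tibetan-Chinese bilingual corpus to select monolingual data in a targeted way, and builds pseudo-data through back-translation for augmentation. The strategy aims to improve the overall performance of the Tibetan-Chinese translation model while also improving translation in data-scarce domains. Experiments show that accounting for domain imbalance and combining it with long-tail-word data augmentation effectively improves translation performance in scarce domains, offering a targeted strategy for the domain imbalance problem.
Keywords: long-tail words; data augmentation; Tibetan-Chinese machine translation; domain data imbalance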
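The selection step described here (find long-tail words in the bilingual corpus, then pick monolingual sentences that contain them as back-translation candidates) can be sketched as follows. This is an illustration under simplifying assumptions: whitespace tokenization stands in for a real Tibetan or Chinese tokenizer, the frequency cutoff `max_count` is arbitrary, and the function names and English toy sentences are hypothetical.

```python
from collections import Counter

def long_tail_words(corpus_sentences, max_count=1):
    """Words that occur at most max_count times in the corpus."""
    counts = Counter(word for sentence in corpus_sentences for word in sentence.split())
    return {word for word, count in counts.items() if count <= max_count}

def select_monolingual(monolingual_sentences, tail_words):
    """Monolingual sentences containing at least one long-tail word."""
    return [s for s in monolingual_sentences if tail_words & set(s.split())]

# Toy stand-in data (English used only for readability).
bilingual_side = ["the cat sat", "the dog sat", "the yak grazed"]
monolingual = ["a yak on the hill", "the sat", "dog barks"]

tail = long_tail_words(bilingual_side)
print(sorted(tail))
print(select_monolingual(monolingual, tail))
```

The selected sentences would then be passed through a reverse-direction translation model to produce the pseudo-parallel pairs; that step is outside this sketch.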
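An SSD-SMR write cache strategy with Q-Learning-based long-tail latency optimization
15
Authors: 刘健, 章步镐, 方匡弛, 刘宣锋, 孙国道, 梁荣华, 梁浩然. Computer Engineering (计算机工程), 2026, No. 3, pp. 287-298 (12 pages). Indexed in: PKU Core
As the global volume of data keeps growing, improving data access performance at low cost is a major challenge for storage systems. Building a cache system from low-latency, high-bandwidth solid-state drives (SSDs) and low-cost, high-density shingled magnetic recording (SMR) disks is an effective solution. However, the mechanical motion and overlapping-track layout inherent to SMR give it poor write performance, and the large number of read-modify-write (RMW) operations caused by frequently writing dirty data from the SSD back to the SMR disk can produce severe long-tail latency. To address this, a cache replacement optimization strategy combining the Q-Learning reinforcement learning algorithm is proposed for the SSD-SMR hybrid storage architecture. The strategy learns the empirical relationship between the I/O load of the SMR device and latency to control writes to the SMR disk: when the SMR load is high, it controls the eviction of dirty data from the cache to reduce the RMW operations triggered by write-back, thereby optimizing tail latency under different loads. Q-Learning is combined with LRU, a popularity-based cache algorithm, and SAC, an SMR-aware cache algorithm, and tested with real enterprise traces and simulated traces generated by YCSB. The results show that the proposed method effectively improves the performance of existing cache algorithms and can reduce average latency by 57.06% and tail latency by 87.49%.
Keywords: Q-Learning algorithm; I/O load; long-tail latency; cache replacement algorithm; hybrid storage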
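The learning component can be pictured with the standard tabular Q-Learning update, Q(s, a) ← Q(s, a) + α [r + γ max Q(s', ·) − Q(s, a)]. The sketch below assumes a hypothetical state space (SMR load level), action set (write dirty data back or hold it in cache), and latency-based reward; none of these specifics come from the paper.

```python
def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-Learning update and return the new Q(state, action)."""
    best_next = max(q_table[next_state].values())
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]

# Hypothetical states and actions for illustration only.
states = ["low_load", "high_load"]
actions = ["write_back", "hold"]
q_table = {s: {a: 0.0 for a in actions} for s in states}

# Writing back under high load is assumed to cause a latency spike (negative reward).
new_value = q_update(q_table, "high_load", "write_back", reward=-1.0, next_state="high_load")
print(new_value)
```

After many such updates, a policy that picks the highest-valued action per state would tend to defer write-back while the SMR disk is busy, which is the behavior the abstract describes.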
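Fault diagnosis on long-tailed motor data integrating multi-sensor spatiotemporal feature evolution and head-tail gradient competition balancing
16
Authors: 石佳, 郭鹏, 张志瑶, 王义涛, 梁峻欣, 蔡茗茜. Control and Decision (控制与决策), 2026, No. 2, pp. 393-404 (12 pages). Indexed in: PKU Core
To address insufficient feature learning for tail classes and the imbalance in gradient competition between head and tail classes in fault diagnosis on long-tailed motor data, a motor long-tail fault diagnosis model (STMPNNCDLR) is proposed that combines a spatiotemporal message passing neural network (STMPNN) with a cosine annealing warm-restart decaying learning rate strategy (CDLR). First, a multi-node topology models the spatiotemporal dependencies among sensors across different time stages (sub-windows) within the same time window of each multivariate time series sample, and a dynamic spatiotemporal relation weighting matrix describes how sensor features evolve over time, strengthening the feature representation of latent spatiotemporal interactions for head and tail classes. Second, the message passing mechanism of STMPNN updates features across sub-windows, improving the model's awareness of local and global information. Finally, the CDLR strategy periodically restarts and decays the learning rate, easing the gradient competition imbalance caused by the long-tailed distribution and improving model stability and sensitivity to tail classes. In four groups of motor fault diagnosis experiments with different long-tail ratios, the proposed method shows excellent diagnostic performance and stability on tail fault classes without sacrificing performance on the normal head class, with accuracy above 94.57%, confirming its effectiveness and superiority for motor long-tail fault diagnosis.
Keywords: motor; long-tail fault diagnosis; multivariate time series; multi-node topology; spatiotemporal feature evolution; gradient competition balancing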
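A learning rate that periodically restarts and decays can be sketched as cosine annealing within each cycle, with the peak rate shrinking by a constant factor at every restart. The abstract does not give the exact CDLR schedule, so the fixed cycle length, the decay factor, and the function name `cdlr` below are assumptions made for illustration.

```python
import math

def cdlr(step, base_lr=0.1, min_lr=0.0, period=10, decay=0.5):
    """Cosine-annealed learning rate with warm restarts and a decaying peak.

    Every `period` steps the rate restarts at a peak that is `decay` times the
    previous peak, then follows a half cosine down toward `min_lr`.
    """
    cycle = step // period                 # index of the current restart cycle
    position = step % period               # step offset inside the cycle
    peak = base_lr * (decay ** cycle)      # peak rate for this cycle
    return min_lr + 0.5 * (peak - min_lr) * (1 + math.cos(math.pi * position / period))

# Print the schedule over two and a half cycles.
for step in range(0, 25, 5):
    print(step, round(cdlr(step), 5))
```

The restarts give the optimizer repeated chances to move away from solutions dominated by head-class gradients, while the decaying peak keeps later cycles from undoing earlier progress; whether this matches the paper's motivation in detail is not stated in the abstract.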
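Secure weighted aggregation for vertical federated learning with malicious passive parties
17
Authors: 张政胤, 王玲玲, 黄梅, 张玉兴, 宋佼蓉. Journal of Shandong University (Natural Science) (山东大学学报(理学版)), 2026, No. 3, pp. 29-43 (15 pages). Indexed in: PKU Core
To address untrusted participants in vertical federated learning who launch data poisoning attacks that hinder model training, and semi-honest participants who launch privacy inference attacks to steal other participants' private data, a secure weighted aggregation scheme for vertical federated learning with malicious passive parties is proposed. First, a utility evaluation algorithm is designed to resist data poisoning attacks: it computes a maximum tolerated distance and filters out the embedding vectors corresponding to poisoned samples. Then, an adaptive weight computation algorithm is proposed so that data poisoning attacks are still resisted effectively on long-tailed data while the model keeps a high convergence rate and accuracy. Finally, a masking mechanism and a symmetric homomorphic encryption algorithm protect the privacy of embedding vectors against privacy inference attacks. Theoretical analysis and simulation results show that the scheme has good computational efficiency and model performance, resists both privacy inference and data poisoning attacks, and improves model accuracy by about 5% to 10% over the latest related work.
Keywords: vertical federated learning; data poisoning attack; privacy inference attack; long-tailed data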
Long-tailed object detection of kitchen waste with class-instance balanced detector (Cited by: 3)
18
Authors: FANG LeYuan, TANG Qi, OUYANG LiHan, YU JunWu, LIN JiaXing, DING ShuaiYu, TANG Lin. Science China (Technological Sciences), 2023, No. 8, pp. 2361-2372 (12 pages). Indexed in: SCIE, EI, CAS, CSCD
Intelligent detection and classification of kitchen waste can promote ecological sustainability by replacing inefficient manual processes. However, the presence of non-degradable waste mixed in kitchen waste often follows a long-tailed distribution, making it challenging to train convolutional neural network-based object detectors, which results in the unsatisfactory detection of tail-class waste. To address this challenge, we propose a class-instance balanced detector (CIB-Det) for intelligent detection and classification of kitchen waste. CIB-Det implements two strategies for the loss function: the class-balanced strategy (CBS) and the instance-balanced strategy (IBS). The CBS focuses more on tail classes, and the IBS concentrates on hard-to-classify instances adaptively during training. Consequently, CIB-Det comprehensively and adaptively addresses the long-tailed issue. Our experiments on a real dataset of kitchen waste images support the effectiveness of CIB-Det for kitchen waste detection.
Keywords: kitchen waste detection and classification; object detection; long-tailed distribution; convolutional neural networks
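A large language model-based method for adapting business processes to long-tail changes
19
Authors: 邵欣怡, 朱经纬, 张亮. Computer Science (计算机科学), 2026, No. 1, pp. 29-38 (10 pages). Indexed in: PKU Core
Business process adaptation is an important task in business process management. It aims to respond to a changing environment by adjusting process models and instance behavior, improving flexibility and achieving business goals. Long-tail changes caused by residual uncertainty at modeling time are unavoidable and challenge traditional business process adaptation techniques. The most effective current approach to long-tail changes is based on a three-party collaboration framework: front-end business staff who perceive long-tail changes and propose adaptation strategies, back-end technical staff and management who provide service interfaces and compliance requirements, and a tool system that assists in carrying out the adaptation work together to handle long-tail changes and secure business goals. However, the diversity and complexity of long-tail changes under different spatiotemporal conditions, together with the urgency of adaptation, may well exceed front-end staff's ability to understand the current situation, their expertise in devising situation-specific adaptation strategies, and their proficiency in expressing those strategies in a domain-specific language. To close this gap and extend the framework, LLM-Adapt, a large language model-based method for adapting business processes to long-tail changes, is proposed. It exploits the generalization ability and strong content generation ability of large language models, together with an embedded knowledge base of events and countermeasures, to form a more efficient and flexible adaptation mechanism. First, prompt engineering based on the characteristics of long-tail changes lets front-end business staff interact with the large language model in natural language and obtain adaptation plans. Second, the adaptation plans are checked against functional constraints derived from the business baseline goals set by back-end management; the proposed SSDT-Lane algorithm screens adaptation plans by process structure similarity, removing the hallucination risks that large language models face in process adjustment and in matching business and organizational structures. Experiments on typical cases from synthetic data and real open-source datasets show that LLM-Adapt has clear advantages over existing methods in adaptation accuracy, efficiency, and applicability.
Keywords: business process adaptation; long-tail changes; large language model; business process compliance checking; process structure similarity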
Federated learning on non-IID and long-tailed data via dual-decoupling (Cited by: 1)
20
Authors: Zhaohui WANG, Hongjiao LI, Jinguo LI, Renhao HU, Baojin WANG. Frontiers of Information Technology & Electronic Engineering, 2024, No. 5, pp. 728-741 (14 pages). Indexed in: SCIE, EI, CSCD
Federated learning (FL), a cutting-edge distributed machine learning training paradigm, aims to generate a global model by collaborating on the training of client models without revealing local private data. The co-occurrence of non-independent and identically distributed (non-IID) and long-tailed distribution in FL is one challenge that substantially degrades aggregate performance. In this paper, we present a corresponding solution called federated dual-decoupling via model and logit calibration (FedDDC) for non-IID and long-tailed distributions. The model is characterized by three aspects. First, we decouple the global model into the feature extractor and the classifier to fine-tune the components affected by the joint problem. For the biased feature extractor, we propose a client confidence re-weighting scheme to assist calibration, which assigns optimal weights to each client. For the biased classifier, we apply the classifier re-balancing method for fine-tuning. Then, we calibrate and integrate the client confidence re-weighted logits with the re-balanced logits to obtain the unbiased logits. Finally, we use decoupled knowledge distillation for the first time in the joint problem to enhance the accuracy of the global model by extracting the knowledge of the unbiased model. Numerous experiments demonstrate that on non-IID and long-tailed data in FL, our approach outperforms state-of-the-art methods.
Keywords: Federated learning; Non-IID; Long-tailed data; Decoupling learning; Knowledge distillation