Journal Articles
268 articles found
Evaluating chat generative pretrained transformer in answering questions on endoscopic mucosal resection and endoscopic submucosal dissection (Cited: 1)
Authors: Shi-Song Wang, Hui Gao, Peng-Yao Lin, Tian-Chen Qian, Ying Du, Lei Xu. World Journal of Gastrointestinal Oncology, 2025, Issue 10, pp. 290-303 (14 pages)
BACKGROUND: With the rising use of endoscopic submucosal dissection (ESD) and endoscopic mucosal resection (EMR), patients are increasingly questioning various aspects of these endoscopic procedures. At the same time, conversational artificial intelligence (AI) tools like chat generative pretrained transformer (ChatGPT) are rapidly emerging as sources of medical information. AIM: To evaluate ChatGPT's reliability and usefulness regarding ESD and EMR for patients and healthcare professionals. METHODS: In this study, 30 specific questions related to ESD and EMR were identified. These questions were repeatedly entered into ChatGPT, with two independent answers generated for each question. A Likert scale was used to rate the accuracy, completeness, and comprehensibility of the responses. Meanwhile, a binary category (high/low) was used to evaluate each aspect of the two responses generated by ChatGPT and the response retrieved from Google. RESULTS: By analyzing the average scores of the three raters, our findings indicated that the responses generated by ChatGPT received high ratings for accuracy (mean score of 5.14 out of 6), completeness (mean score of 2.34 out of 3), and comprehensibility (mean score of 2.96 out of 3). Kendall's coefficients of concordance indicated good agreement among raters (all P < 0.05). For the responses generated by Google, more than half were classified by experts as having low accuracy and low completeness. CONCLUSION: ChatGPT provided accurate and reliable answers in response to questions about ESD and EMR. Future studies should address ChatGPT's current limitations by incorporating more detailed and up-to-date medical information. This could establish AI chatbots as a significant resource for both patients and healthcare professionals.
Keywords: Endoscopic submucosal dissection; Endoscopic mucosal resection; Artificial intelligence; Chat generative pretrained transformer; Patient education; Google
SRS-Net: Training object detectors from scratch for remote sensing images without pretraining (Cited: 2)
Authors: Haining WANG, Yang LI, Yuqiang FANG, Yurong LIAO, Bitao JIANG, Xitao ZHANG, Shuyan NI. Chinese Journal of Aeronautics (SCIE, EI, CAS, CSCD), 2023, Issue 8, pp. 269-283 (15 pages)
Most of the current object detection algorithms use pretrained models that are trained on ImageNet and then fine-tuned in the network, which can achieve good performance in terms of general object detectors. However, in the field of remote sensing image object detection, as pretrained models are significantly different from remote sensing data, it is meaningful to explore a train-from-scratch technique for remote sensing images. This paper proposes an object detection framework trained from scratch, SRS-Net, and describes the design of a densely connected backbone network to provide integrated hidden layer supervision for the convolution module. Then, two necessary improvement principles are proposed: studying the role of normalization in the network structure, and improving data augmentation methods for remote sensing images. To evaluate the proposed framework, we performed many ablation experiments on the DIOR, DOTA, and AS datasets. The results show that whether using the improved backbone network, the normalization method, or the training data enhancement strategy, the performance of the object detection network trained from scratch increased. These principles compensate for the lack of pretrained models. Furthermore, we found that SRS-Net could achieve similar to or slightly better performance than baseline methods, and surpassed most advanced general detectors.
Keywords: Dense connection; Object detection; Pretraining; Remote sensing image; Train from scratch
Swin3D++: Effective Multi-Source Pretraining for 3D Indoor Scene Understanding (Cited: 1)
Authors: Yu-Qi Yang, Yu-Xiao Guo, Yang Liu. Computational Visual Media, 2025, Issue 3, pp. 465-481 (17 pages)
Data diversity and abundance are essential for improving the performance and generalization of models in natural language processing and 2D vision. However, the 3D vision domain suffers from a lack of 3D data, and simply combining multiple 3D datasets for pretraining a 3D backbone does not yield significant improvement, due to the domain discrepancies among different 3D datasets that impede effective feature learning. In this work, we identify the main sources of the domain discrepancies between 3D indoor scene datasets, and propose Swin3D++, an enhanced architecture based on Swin3D for efficient pretraining on multi-source 3D point clouds. Swin3D++ introduces domain-specific mechanisms to Swin3D's modules to address domain discrepancies and enhance the network capability on multi-source pretraining. Moreover, we devise a simple source-augmentation strategy to increase the pretraining data scale and facilitate supervised pretraining. We validate the effectiveness of our design, and demonstrate that Swin3D++ surpasses the state-of-the-art 3D pretraining methods on typical indoor scene understanding tasks.
Keywords: 3D scenes; Indoor; Pretraining; Multi-source data; Data augmentation
Pretrained E(3)-equivariant message-passing neural networks with multi-level representations for organic molecule spectra prediction
Authors: Yuzhi Xu, Daqian Bian, Cheng-Wei Ju, Fanyu Zhao, Pujun Xie, Yuanqing Wang, Wei Hu, Zhenrong Sun, John Z. H. Zhang, Tong Zhu. npj Computational Materials, 2025, Issue 1, pp. 2126-2135 (10 pages)
Fast and accurate spectral prediction plays a crucial role in molecular design within fields such as pharmaceutical and materials science. Nevertheless, predicting molecular spectra typically requires quantum chemistry calculations, posing significant challenges for fast predictions and high-throughput screening. In this paper, we propose an equivariant, fast, and robust model, named EnviroDetaNet, which integrates molecular environment information. EnviroDetaNet employs an E(3)-equivariant message-passing neural network combining intrinsic atomic properties, spatial features, and environmental information, allowing it to comprehensively capture both local and global molecular information. Compared to state-of-the-art machine learning models, EnviroDetaNet excels in various predictive tasks and maintains high accuracy even with a 50% reduction in training data, demonstrating strong generalization capabilities. Ablation studies confirm that molecular environment information is crucial for improving model stability and accuracy. EnviroDetaNet also shows outstanding performance in spectral predictions for complex molecular systems, making it a powerful tool for accelerating molecular discovery.
Keywords: Pretrained; Molecular design; Spectral prediction; Quantum chemistry calculations; Message-passing neural networks; Molecular environment information; EnviroDetaNet; E(3)-equivariant
Multimodal Pretrained Knowledge for Real-world Object Navigation
Authors: Hui Yuan, Yan Huang, Naigong Yu, Dongbo Zhang, Zetao Du, Ziqi Liu, Kun Zhang. Machine Intelligence Research, 2025, Issue 4, pp. 713-729 (17 pages)
Most visual-language navigation (VLN) research focuses on simulated environments, but applying these methods to real-world scenarios is challenging because of misalignments between vision and language in complex environments, leading to path deviations. To address this, we propose a novel vision-and-language object navigation strategy that uses multimodal pretrained knowledge as a cross-modal bridge to link semantic concepts in both images and text. This improves navigation supervision at key-points and enhances robustness. Specifically, we 1) randomly generate key-points within a specific density range and optimize them on the basis of challenging locations; 2) use pretrained multimodal knowledge to efficiently retrieve target objects; 3) combine depth information with simultaneous localization and mapping (SLAM) map data to predict optimal positions and orientations for accurate navigation; and 4) implement the method on a physical robot, successfully conducting navigation tests. Our approach achieves a maximum success rate of 66.7%, outperforming existing VLN methods in real-world environments.
Keywords: Vision-and-language object navigation; Key-points; Multimodal pretrained knowledge; Optimal positions and orientations; Physical robot
Swin3D: A pretrained transformer backbone for 3D indoor scene understanding
Authors: Yu-Qi Yang, Yu-Xiao Guo, Jian-Yu Xiong, Yang Liu, Hao Pan, Peng-Shuai Wang, Xin Tong, Baining Guo. Computational Visual Media, 2025, Issue 1, pp. 83-101 (19 pages)
The use of pretrained backbones with fine-tuning has shown success for 2D vision and natural language processing tasks, with advantages over task-specific networks. In this paper, we introduce a pretrained 3D backbone, called Swin3D, for 3D indoor scene understanding. We designed a 3D Swin Transformer as our backbone network, which enables efficient self-attention on sparse voxels with linear memory complexity, making the backbone scalable to large models and datasets. We also introduce a generalized contextual relative positional embedding scheme to capture various irregularities of point signals for improved network performance. We pretrained a large Swin3D model on a synthetic Structured3D dataset, which is an order of magnitude larger than the ScanNet dataset. Our model pretrained on the synthetic dataset not only generalizes well to downstream segmentation and detection on real 3D point datasets but also outperforms state-of-the-art methods on downstream tasks, with +2.3 mIoU and +2.2 mIoU on S3DIS Area 5 and 6-fold semantic segmentation, respectively, +1.8 mIoU on ScanNet segmentation (val), +1.9 mAP@0.5 on ScanNet detection, and +8.1 mAP@0.5 on S3DIS detection. A series of extensive ablation studies further validated the scalability, generality, and superior performance enabled by our approach.
Keywords: 3D pretraining; Point cloud analysis; Transformer backbone; Swin Transformer; 3D semantic segmentation; 3D object detection
An efficient forgetting-aware fine-tuning framework for pretrained universal machine-learning interatomic potentials
Authors: Jisu Kim, Jiho Lee, Sangmin Oh, Yutack Park, Seungwoo Hwang, Seungwu Han, Sungwoo Kang, Youngho Kang. npj Computational Materials, 2025, Issue 1, pp. 4519-4535 (17 pages)
Pretrained universal machine-learning interatomic potentials (MLIPs) have revolutionized computational materials science by enabling rapid atomistic simulations as efficient alternatives to ab initio methods. Fine-tuning pretrained MLIPs offers a practical approach to improving accuracy for materials and properties where predictive performance is insufficient. However, this approach often induces catastrophic forgetting, undermining the generalizability that is a key advantage of pretrained MLIPs. Herein, we propose reEWC, an advanced fine-tuning strategy that integrates Experience Replay and Elastic Weight Consolidation (EWC) to effectively balance forgetting prevention with fine-tuning efficiency. Using Li6PS5Cl (LPSC), a sulfide-based Li solid-state electrolyte, as a fine-tuning target, we show that reEWC significantly improves the accuracy of a pretrained MLIP, resolving well-known issues of potential energy surface softening and overestimated Li diffusivities. Moreover, reEWC preserves the generalizability of the pretrained MLIP and enables knowledge transfer to chemically distinct systems, including other sulfide, oxide, nitride, and halide electrolytes. Compared to Experience Replay and EWC used individually, reEWC delivers clear synergistic benefits, mitigating their respective limitations while maintaining computational efficiency. These results establish reEWC as a robust and effective solution for continual learning in MLIPs, enabling universal models that can advance materials research through large-scale, high-throughput simulations across diverse chemistries.
Keywords: Fine-tuning; Computational materials science; Pretrained machine-learning interatomic potentials; Forgetting-aware; Ab initio methods; Atomistic simulations
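The Elastic Weight Consolidation component that reEWC builds on has a simple quadratic form; a minimal sketch follows, with illustrative parameter names and toy numbers that are not taken from the paper:

```python
import numpy as np

def ewc_penalty(theta, theta_star, fisher, lam=1.0):
    """Quadratic penalty anchoring parameters to their pretrained values,
    weighted by (an estimate of) the per-parameter Fisher information."""
    return 0.5 * lam * float(np.sum(fisher * (theta - theta_star) ** 2))

def fine_tune_loss(task_loss, theta, theta_star, fisher, lam=1.0):
    # Total objective: new-task loss plus the forgetting-prevention term.
    return task_loss + ewc_penalty(theta, theta_star, fisher, lam)

theta_star = np.array([1.0, -2.0, 0.5])   # pretrained weights
theta      = np.array([1.2, -2.0, 0.0])   # weights during fine-tuning
fisher     = np.array([10.0, 1.0, 0.1])   # importance estimates

# 0.5 * (10*0.04 + 1*0 + 0.1*0.25) ≈ 0.2125: drifting on a high-Fisher
# parameter costs far more than the same drift on a low-Fisher one.
penalty = ewc_penalty(theta, theta_star, fisher, lam=1.0)
```

The replay half of reEWC (mixing pretraining samples back into the fine-tuning batches) is not shown here; this sketch only illustrates why EWC discourages moving parameters that mattered for the pretraining data.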
Multimodal Pretraining from Monolingual to Multilingual (Cited: 1)
Authors: Liang Zhang, Ludan Ruan, Anwen Hu, Qin Jin. Machine Intelligence Research (EI, CSCD), 2023, Issue 2, pp. 220-232 (13 pages)
Multimodal pretraining has made convincing achievements in various downstream tasks in recent years. However, since the majority of the existing works construct models based on English, their applications are limited by language. In this work, we address this issue by developing models with multimodal and multilingual capabilities. We explore two types of methods to extend multimodal pretraining models from monolingual to multilingual. Specifically, we propose a pretraining-based model named multilingual multimodal pretraining (MLMM), and two generalization-based models named multilingual CLIP (M-CLIP) and multilingual acquisition (MLA). In addition, we further extend the generalization-based models to incorporate the audio modality and develop the multilingual CLIP for vision, language, and audio (CLIP4VLA). Our models achieve state-of-the-art performances on multilingual vision-text retrieval, visual question answering, and image captioning benchmarks. Based on the experimental results, we discuss the pros and cons of the two types of models and their potential practical applications.
Keywords: Multilingual pretraining; Multimodal pretraining; Cross-lingual transfer; Multilingual generation; Cross-modal retrieval
Unsupervised Acoustic Anomaly Detection for Hydro-Turbine Generator Units Combining a Pretrained Audio Foundation Model with Density Estimation
Authors: 武亭, 闻疏琳, 阎兆立, 付高原, 李林峰, 刘绪都, 程晓斌, 杨军. 《电子与信息学报》 (Journal of Electronics & Information Technology), 2026, Issue 2, pp. 772-783 (12 pages)
As the core power equipment of a hydropower station, the safe and stable operation of hydro-turbine generator units matters for the entire plant. In recent years, non-contact acoustic measurement has attracted wide attention as an effective inspection approach; however, abnormal acoustic signals from hydro-turbine generator units in actual operation are difficult to collect, which limits the application of traditional anomaly detection methods and supervised-learning classification strategies in this field. To address these challenges, this paper proposes an unsupervised acoustic anomaly detection method for hydro-turbine generators that combines a pretrained audio foundation model with density-estimation k-nearest neighbors (k-NN). The effectiveness of general-purpose audio features extracted by the pretrained audio model for anomaly detection is first verified; a parameter fine-tuning strategy fusing attentive statistics pooling with warm-up is then designed to achieve transfer optimization of the model; and a density-estimation k-NN is designed at the inference stage for robust distance measurement. Experimental results show that the method achieves a multi-metric harmonic mean of 98.7% in the wind-tunnel environment and as high as 99.9% in the slip-ring chamber, providing a practical and high-performing solution for acoustic anomaly detection in hydropower stations.
Keywords: Pretrained audio foundation model; Hydro-turbine generator unit; Anomaly detection; Unsupervised deep learning
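The k-NN scoring stage described above can be illustrated in a minimal, hypothetical form: embeddings of normal machine sounds form a reference set, and a test embedding is scored by its mean distance to its k nearest neighbors. All data below are synthetic, and the paper's exact density-estimation variant may differ:

```python
import numpy as np

def knn_anomaly_score(x, reference, k=3):
    """Mean Euclidean distance from x to its k nearest reference embeddings;
    larger scores indicate the clip lies off the manifold of normal sounds."""
    dists = np.linalg.norm(reference - x, axis=1)
    return float(np.mean(np.sort(dists)[:k]))

rng = np.random.default_rng(0)
# Stand-ins for pretrained-model embeddings of normal operating sounds.
normal_embeddings = rng.normal(0.0, 1.0, size=(200, 8))

normal_test = rng.normal(0.0, 1.0, size=8)     # drawn from the same distribution
anomalous_test = rng.normal(5.0, 1.0, size=8)  # shifted, i.e. "abnormal" clip

s_normal = knn_anomaly_score(normal_test, normal_embeddings)
s_anomal = knn_anomaly_score(anomalous_test, normal_embeddings)
```

Thresholding the score then yields the normal/abnormal decision; with these synthetic draws the shifted test point scores far higher than the in-distribution one.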
Exploring Fragment Adding Strategies to Enhance Molecule Pretraining in AI-Driven Drug Discovery (Cited: 3)
Authors: Zhaoxu Meng, Cheng Chen, Xuan Zhang, Wei Zhao, Xuefeng Cui. Big Data Mining and Analytics (EI, CSCD), 2024, Issue 3, pp. 565-576 (12 pages)
The effectiveness of AI-driven drug discovery can be enhanced by pretraining on small molecules. However, the conventional masked language model pretraining techniques are not suitable for molecule pretraining due to the limited vocabulary size and the non-sequential structure of molecules. To overcome these challenges, we propose FragAdd, a strategy that involves adding a chemically implausible molecular fragment to the input molecule. This approach allows for the incorporation of rich local information and the generation of a high-quality graph representation, which is advantageous for tasks like virtual screening. Consequently, we have developed a virtual screening protocol that focuses on identifying binders of estrogen receptor alpha, a nuclear receptor. Our results demonstrate a significant improvement in the binding capacity of the retrieved molecules. Additionally, we demonstrate that the FragAdd strategy can be combined with other self-supervised methods to further expedite the drug discovery process.
Keywords: Pretraining; Information retrieval; Drug discovery; Virtual screening; Molecule property prediction
MVContrast: Unsupervised Pretraining for Multi-view 3D Object Recognition (Cited: 2)
Authors: Luequan Wang, Hongbin Xu, Wenxiong Kang. Machine Intelligence Research (EI, CSCD), 2023, Issue 6, pp. 872-883 (12 pages)
3D shape recognition has drawn much attention in recent years. The view-based approach performs best of all. However, the current multi-view methods are almost all fully supervised, and the pretraining models are almost all based on ImageNet. Although the pretraining results of ImageNet are quite impressive, there is still a significant discrepancy between multi-view datasets and ImageNet. Multi-view datasets naturally retain rich 3D information. In addition, large-scale datasets such as ImageNet require considerable cleaning and annotation work, so it is difficult to regenerate a second dataset. In contrast, unsupervised learning methods can learn general feature representations without any extra annotation. To this end, we propose a three-stage unsupervised joint pretraining model. Specifically, we decouple the final representations into three fine-grained representations. Data augmentation is utilized to obtain pixel-level representations within each view, and we boost the spatially invariant features from the view level. Finally, we exploit global information at the shape level through a novel extract-and-swap module. Experimental results demonstrate that the proposed method gains significantly in 3D object classification and retrieval tasks, and shows generalization to cross-dataset tasks.
Keywords: Multi-view; Unsupervised pretraining; Contrastive learning; 3D vision; Shape recognition
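Contrastive pretraining of the kind MVContrast performs typically optimizes an InfoNCE-style objective that pulls two views of the same shape together and pushes other shapes away. A minimal sketch with synthetic embeddings follows; the paper's exact loss and its three-level (pixel/view/shape) decoupling are not reproduced here:

```python
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE loss for one anchor: low when the anchor is close to its
    positive view and far from all negatives (vectors pre-normalized)."""
    pos = np.exp(anchor @ positive / tau)
    neg = np.sum(np.exp(negatives @ anchor / tau))
    return float(-np.log(pos / (pos + neg)))

rng = np.random.default_rng(1)
a = normalize(rng.normal(size=4))                  # embedding of one rendered view
p = normalize(a + 0.05 * rng.normal(size=4))       # augmented view of the same shape
negs = normalize(rng.normal(size=(16, 4)))         # views of other shapes

loss_aligned = info_nce(a, p, negs)                # matched pair: low loss
loss_random = info_nce(a, normalize(rng.normal(size=4)), negs)  # mismatched pair
```

With these seeds the matched pair yields the lower loss, which is exactly the gradient signal that makes views of the same object cluster in embedding space.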
Pretrained Models and Evaluation Data for the Khmer Language (Cited: 1)
Authors: Shengyi Jiang, Sihui Fu, Nankai Lin, Yingwen Fu. Tsinghua Science and Technology (SCIE, EI, CAS, CSCD), 2022, Issue 4, pp. 709-718 (10 pages)
Trained on a large corpus, pretrained models (PTMs) can capture different levels of concepts in context and hence generate universal language representations, which greatly benefit downstream natural language processing (NLP) tasks. In recent years, PTMs have been widely used in most NLP applications, especially for high-resource languages, such as English and Chinese. However, scarce resources have discouraged the progress of PTMs for low-resource languages. Transformer-based PTMs for the Khmer language are presented in this work for the first time. We evaluate our models on two downstream tasks: part-of-speech tagging and news categorization. The dataset for the latter task is self-constructed. Experiments demonstrate the effectiveness of the Khmer models. In addition, we find that the current Khmer word segmentation technology does not aid performance improvement. We aim to release our models and datasets to the community in hopes of facilitating the future development of Khmer NLP applications.
Keywords: Pretrained models; Khmer language; Word segmentation; Part-of-speech (POS) tagging; News categorization
A Survey of Prompt-Based Few-Shot Sentiment Analysis
Authors: 姜鑫, 马宏伟, 张展峰. 《软件导刊》 (Software Guide), 2026, Issue 1, pp. 213-220 (8 pages)
With the rapid development of multimedia platforms and large-scale language models, prompt-based methods for few-shot sentiment analysis are of great significance for analyzing user needs and improving system services. Research on prompt-based few-shot sentiment analysis seeks to make reasonable use of pretrained language models in different application scenarios to understand classification tasks and infer sentiment categories. This survey first describes the problem background of few-shot sentiment analysis; it then introduces the procedures of prompt-based fine-tuning, prompt tuning, and in-context learning; it further systematically compares recent mainstream techniques for prompt-based few-shot sentiment analysis and summarizes the relevant corpora, pretrained language models, and prompt templates; finally, it discusses possible future research directions, providing a reference for research on sentence-level text classification and prompt learning.
Keywords: Few-shot sentiment analysis; Pretrained language model; Prompt-based fine-tuning; Prompt tuning; In-context learning
MATCH: A Multimodal Stock Prediction Framework Integrating Time-Frequency Features and Hybrid Text
Authors: 魏涵玥, 郭晨娟, 梅杰源, 田锦东, 陈鹏, 徐榕荟, 杨彬. 《计算机应用》 (Journal of Computer Applications), 2026, Issue 2, pp. 427-436 (10 pages)
Most existing stock prediction models rely on a single modality, overlooking inter-industry linkage effects and information heterogeneity; some studies introduce a text modality but still handle poorly the time lags and multiple granularities caused by modality heterogeneity. This paper therefore proposes MATCH (Multimodal stock prediction frAmework inTegrating time-frequenCy features and Hybrid text). On one hand, a mixture-of-experts (MoE) pretraining strategy builds an industry-specific pretrained representation model for each industry, dynamically selects the matching expert network during prediction, and injects industry feature information. On the other hand, a frequency-domain decomposition and hierarchical fusion mechanism obtains representations of high-frequency future fluctuations and low-frequency future trends through a dual-stream pretraining architecture, and performs cross-modal interaction between them and textual information at different time scales, capturing market dynamics more precisely and enabling effective temporal-textual interaction in multi-granularity scenarios. On two real stock datasets, S&P 500 and CMIN-US, MATCH was compared with mainstream methods such as ESTIMATE (Efficient STock Integration with teMporal generative filters and wavelet hypergraph ATtEntions) and PatchTST. On S&P 500, MATCH improves the Sharpe ratio (SR) by 50.5% over the second-best baseline Adv-ALSTM; on the more challenging CMIN-US dataset, MATCH improves SR by 2.35% and achieves the best results on the remaining metrics. The clear performance gains of MATCH offer a novel and efficient solution for financial multimodal data fusion.
Keywords: Financial time series; Multimodal; Mixture-of-experts model; Pretrained model; Time-frequency analysis
Learning Top-K Subtask Planning Tree Based on Discriminative Representation Pretraining for Decision-making
Authors: Jingqing Ruan, Kaishen Wang, Qingyang Zhang, Dengpeng Xing, Bo Xu. Machine Intelligence Research (EI, CSCD), 2024, Issue 4, pp. 782-800 (19 pages)
Decomposing complex real-world tasks into simpler subtasks and devising a subtask execution plan is critical for humans to achieve effective decision-making. However, replicating this process remains challenging for AI agents, and naturally raises two questions: (1) How to extract discriminative knowledge representation from priors? (2) How to develop a rational plan to decompose complex problems? To address these issues, we introduce a groundbreaking framework that incorporates two main contributions. First, our multiple-encoder and individual-predictor regime goes beyond traditional architectures to extract nuanced task-specific dynamics from datasets, enriching the feature space for subtasks. Second, we innovate in planning by introducing a top-K subtask planning tree generated through an attention mechanism, which allows for dynamic adaptability and forward-looking decision-making. Our framework is empirically validated against the challenging BabyAI benchmarks, including multiple combinatorially rich synthetic tasks (e.g., GoToSeq, SynthSeq, BossLevel), where it not only outperforms competitive baselines but also demonstrates superior adaptability and effectiveness in complex task decomposition.
Keywords: Reinforcement learning; Representation learning; Subtask planning; Task decomposition; Pretraining
Generative pretrained transformer 4: an innovative approach to facilitate value-based healthcare
Authors: Han Lyu, Zhixiang Wang, Jia Li, Jing Sun, Xinghao Wang, Pengling Ren, Linkun Cai, Zhenchang Wang, Max Wintermark. Intelligent Medicine (EI, CSCD), 2024, Issue 1, pp. 10-15 (6 pages)
Objective: Appropriate medical imaging is important for value-based care. We aim to evaluate the performance of generative pretrained transformer 4 (GPT-4), an innovative natural language processing model, in providing appropriate medical imaging automatically in different clinical scenarios. Methods: Institutional Review Board (IRB) approval was not required due to the use of nonidentifiable data. Instead, we used 112 questions from the American College of Radiology (ACR) Radiology-TEACHES Program as prompts, which is an open-sourced question and answer program to guide appropriate medical imaging. We included 69 free-text case vignettes and 43 simplified cases. For the performance evaluation of GPT-4 and GPT-3.5, we considered the recommendations of ACR guidelines as the gold standard, and then three radiologists analyzed the consistency of the responses from the GPT models with those of the ACR. We set a five-score criterion for the evaluation of the consistency. A paired t-test was applied to assess the statistical significance of the findings. Results: For the performance of the GPT models in free-text case vignettes, the accuracy of GPT-4 was 92.9%, whereas the accuracy of GPT-3.5 was just 78.3%. GPT-4 can provide more appropriate suggestions to reduce the overutilization of medical imaging than GPT-3.5 (t=3.429, P=0.001). For the performance of the GPT models in simplified scenarios, the accuracy of GPT-4 and GPT-3.5 was 66.5% and 60.0%, respectively. The differences were not statistically significant (t=1.858, P=0.070). GPT-4 was characterized by longer reaction times (27.1 s on average) and more extensive responses (137.1 words on average) than GPT-3.5. Conclusion: As an advanced tool for improving value-based healthcare in clinics, GPT-4 may guide appropriate medical imaging accurately and efficiently.
Keywords: Generative pretrained transformer 4 model; Natural language processing; Medical imaging; Appropriateness
A Self-Supervised Pretraining Method for Single-Cell Type Annotation
Authors: 张晴, 吴晓晓, 李想, 马威, 吴通权, 谢诒诚, 吴兴隆. 《武汉工程大学学报》 (Journal of Wuhan Institute of Technology), 2026, Issue 1, pp. 103-110 (8 pages)
To address the challenge of accurately annotating cell types in single-cell RNA sequencing, this paper proposes ScLabel-Net, a deep learning network for single-cell label annotation based on transfer learning and the Transformer, aiming at efficient and accurate cell type annotation for large-scale single-cell datasets of mouse lung. ScLabel-Net is first pretrained on a single-cell lung dataset of about 100,000 cells, capturing similarities among genes through self-supervised learning; the model is then transferred to relatively smaller datasets and fine-tuned for specific cell type annotation tasks. Considering the imbalanced cell type distributions common in single-cell data, random over-sampling is applied when fine-tuning to mitigate the impact of the imbalance on annotation results. Experimental results show that ScLabel-Net achieves cell type annotation accuracies of 0.955, 0.922, and 0.986 on three mouse lung datasets (GSE267861, GSE264032, and Quake). In addition, ScLabel-Net generalizes well to single-cell datasets of other mouse organs (trachea, kidney, and pancreas), with accuracies of 0.981, 0.951, and 0.987, respectively, verifying its cross-organ applicability and demonstrating its broad potential for research on complex biological systems and diseases.
Keywords: Cell type annotation; Self-supervised pretraining; Deep learning; Single-cell RNA sequencing
Pretraining Enhanced RNN Transducer
Authors: Junyu Lu, Rongzhong Lian, Di Jiang, Yuanfeng Song, Zhiyang Su, Victor Junqiu Wei, Lin Yang. CAAI Artificial Intelligence Research, 2024, Issue 1, pp. 74-81 (8 pages)
Recurrent neural network transducer (RNN-T) is an important branch of current end-to-end automatic speech recognition (ASR). Various promising approaches have been designed for boosting the RNN-T architecture; however, few studies exploit the effectiveness of pretrained methods in this framework. In this paper, we introduce the pretrained acoustic extractor (PAE) and the pretrained linguistic network (PLN) to enhance the Conformer long short-term memory (Conformer-LSTM) transducer. First, we construct the input of the acoustic encoder with two different latent representations: one extracted by the PAE from the raw waveform, and the other obtained from filter-bank transformation. Second, we fuse an extra semantic feature from the PLN into the joint network to reduce illogical and homophonic errors. Compared with previous works, our approaches are able to obtain pretrained representations for better model generalization. Evaluation on two large-scale datasets has demonstrated that our proposed approaches yield better performance than existing approaches.
Keywords: Pretraining; Automatic speech recognition; Self-supervised learning
An Active Vibration Reduction Technique for Mixed-Spectrum Noise (Cited: 1)
Authors: 钟志, 牛国标, 刘磊, 单明广. 《实验技术与管理》 (Experimental Technology and Management), 2025, Issue 6, pp. 46-54 (9 pages)
In ships, offshore engineering equipment, and related fields, vibration-noise conditions exhibit complex broadband-narrowband composite characteristics. Previous active control techniques suppressed only a single type of noise, yielding poor overall vibration reduction. To solve this problem, a mixed-spectrum hybrid vibration and noise control (MSN-HVNC) algorithm capable of suppressing broadband-narrowband composite noise is designed and experimentally validated on an X-type small floating-raft machine platform. The MSN-HVNC algorithm consists of two subsystems, a narrowband noise control subsystem (NBCS) and a wideband noise control subsystem (WBCS), which cooperate to suppress mixed-spectrum noise. The WBCS adopts a filtered-x least mean square (FxLMS) algorithm with a pretrained coefficient-selection model to suppress broadband noise, while the NBCS applies adaptive notch filtering to suppress energy-concentrated narrowband line-spectrum noise. The residual vibration noise after reduction both measures the reduction level and serves as the error signal for updating the controller weights. Finally, an experimental platform was built on the X-type small floating-raft structure to carry out active vibration-noise control experiments. The results show that the MSN-HVNC algorithm achieves average reductions of 23.6 dB and 21.3 dB for single-frequency narrowband vibration noise at 50 Hz and 75 Hz, respectively, and an average reduction of 12.4 dB for mixed-excitation vibration signals in a simulated multi-source coupled vibration scenario, outperforming traditional control algorithms and providing good suppression of broadband-narrowband composite mixed-spectrum noise.
Keywords: Active control; Mixed-spectrum noise; Pretrained model; Cooperative control
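The FxLMS core underlying the WBCS can be sketched for a single tone. The paths, signals, and step size below are synthetic assumptions; the pretrained coefficient-selection model and the narrowband notch subsystem are omitted:

```python
import numpy as np

fs, f0, n = 1000, 50, 4000
t = np.arange(n) / fs
x = np.sin(2 * np.pi * f0 * t)               # reference signal
d = 0.8 * np.sin(2 * np.pi * f0 * t + 0.3)   # primary disturbance at the error sensor
s = np.array([0.0, 0.9])                     # secondary-path impulse response (assumed known)

L, mu = 8, 0.01
w = np.zeros(L)          # adaptive control filter weights
xbuf = np.zeros(L)       # reference history feeding the control filter
fxbuf = np.zeros(L)      # filtered-reference history for the LMS update
ybuf = np.zeros(len(s))  # control-output history entering the secondary path
err = np.zeros(n)

for i in range(n):
    xbuf = np.roll(xbuf, 1); xbuf[0] = x[i]
    ybuf = np.roll(ybuf, 1); ybuf[0] = w @ xbuf      # anti-noise sample
    err[i] = d[i] + s @ ybuf                         # residual at the sensor
    xf = 0.9 * (x[i - 1] if i > 0 else 0.0)          # reference filtered through s
    fxbuf = np.roll(fxbuf, 1); fxbuf[0] = xf
    w -= mu * err[i] * fxbuf                         # filtered-x LMS weight update

early = float(np.mean(err[:400] ** 2))
late = float(np.mean(err[-400:] ** 2))
```

As the filter adapts, the residual power `late` should fall well below `early`; the filtered-x step (passing the reference through the secondary-path estimate before the update) is what keeps the adaptation stable despite the path between actuator and sensor.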
Research Progress in Multimodal Continual Learning Methods (Cited: 1)
Authors: 张伟, 钱龙玥, 张林, 李腾. 《数据采集与处理》 (Journal of Data Acquisition and Processing), 2025, Issue 5, pp. 1122-1138 (17 pages)
Multimodal continual learning (MMCL), an important research direction in machine learning and artificial intelligence, aims to achieve continuous knowledge accumulation and task adaptation by fusing data of multiple modalities (such as images, text, and speech). Compared with traditional unimodal learning methods, MMCL can not only process multi-source heterogeneous data in parallel but also adapt to new tasks while effectively retaining existing knowledge, showing great application potential in intelligent systems. This paper systematically surveys multimodal continual learning. It first describes the foundational framework of MMCL from three perspectives: basic concepts, evaluation systems, and classic unimodal continual learning methods. It then analyzes the advantages and challenges of MMCL in practice: despite its clear strengths in multimodal information fusion, it still faces key challenges such as modality imbalance and heterogeneous fusion, which both limit the performance of current methods and point to future research directions. On this basis, the paper comprehensively reviews the state of the art in MMCL methods from four main categories: replay-based, regularization-based, parameter-isolation, and large-model-based approaches. Finally, it offers a forward-looking outlook on future trends in MMCL.
Keywords: Multimodal continual learning; Modality alignment; Catastrophic forgetting; Pretrained model; Task adaptability