Journal Articles
481 articles found
1. Global open source and international standards promote the inclusive development of large models
Authors: Lin Yonghua. China Standardization, 2025, No. 5, p. 25.
In the era of AI, especially large models, the importance of open source has become increasingly prominent. First, open source allows innovation to avoid starting from scratch; through iterative innovation, it promotes technical exchanges and learning globally. Second, the resources required for large model R&D are difficult for a single institution to obtain, and the evaluation of general large models also requires the participation of experts from various industries. Third, without open source collaboration, it is difficult to form a unified upper-layer software ecosystem. Therefore, open source has become an important cooperation mechanism for promoting the development of AI and large models. Two cases illustrate how open source and international standards interact with each other.
Keywords: open source; large models; international standards; inclusive development; iterative innovation
2. PKME-MLM: A Novel Multimodal Large Model for Sarcasm Detection
Authors: Jian Luo, Yaling Li, Xueyu Li, Xuliang Hu. Computers, Materials & Continua, 2025, No. 4, pp. 877-896.
Sarcasm detection in Natural Language Processing (NLP) has become increasingly important, particularly with the rise of social media and non-textual emotional expressions, such as images. Existing methods often rely on separate image and text modalities, which may not fully utilize the information available from both sources. To address this limitation, we propose a novel multimodal large model, the PKME-MLM (Prior Knowledge and Multi-label Emotion analysis based Multimodal Large Model for sarcasm detection). The PKME-MLM enhances sarcasm detection by integrating prior knowledge to extract useful textual information from images, which is then combined with text data for deeper analysis. This improves the integration of image and text data, addressing the limitation of previous models that process these modalities separately. Additionally, we incorporate multi-label sentiment analysis, refining sentiment labels to improve sarcasm recognition accuracy. This design overcomes the limitation of prior models that treated sentiment classification as a single-label problem, improving sarcasm recognition by distinguishing subtle emotional cues in the text. Experimental results demonstrate that our approach achieves significant performance improvements in multimodal sarcasm detection, with an accuracy of 94.35% and macro-average precision and recall of 93.92% and 94.21%, respectively. These results highlight the potential of multimodal models for sarcasm detection and suggest that further integration of modalities could advance future research. This work also paves the way for incorporating multimodal sentiment analysis into sarcasm detection.
Keywords: sarcasm detection; multimodal large model; prior knowledge; multi-label fusion
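The shift from single-label to multi-label sentiment classification that the abstract describes can be sketched in a few lines. The label set, logits, and threshold below are illustrative assumptions, not the paper's actual configuration:

```python
import numpy as np

# Hypothetical fine-grained emotion labels; the paper's real label set is not given here.
LABELS = ["joy", "anger", "surprise", "contempt"]

def single_label(logits):
    # Conventional classification: exactly one emotion per utterance (argmax).
    return [LABELS[int(np.argmax(logits))]]

def multi_label(logits, threshold=0.5):
    # Independent sigmoid per label: several co-occurring emotions may fire,
    # which is what lets mixed cues (e.g. joy together with contempt) signal sarcasm.
    probs = 1.0 / (1.0 + np.exp(-logits))
    return [label for label, p in zip(LABELS, probs) if p > threshold]

logits = np.array([2.0, -1.5, 0.3, 1.1])
print(single_label(logits))  # ['joy']
print(multi_label(logits))   # ['joy', 'surprise', 'contempt']
```

A single-label head forces a choice between emotions, whereas the multi-label head can flag the contradictory emotional mixture that sarcasm often carries.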
3. A new human-computer interaction paradigm: Agent interaction model based on large models and its prospects
Authors: Yang LIU. Virtual Reality & Intelligent Hardware, 2025, No. 3, pp. 237-266.
This study examines the advent of agent interaction (AIx) as a transformative paradigm in human-computer interaction (HCI), signifying a notable evolution beyond traditional graphical interfaces and touchscreen interactions. In the context of large models, AIx is characterized by innovative interaction patterns and a wealth of promising application scenarios. The study underscores the pivotal role of AIx in shaping the future trajectory of the large model industry, emphasizing its adoption and necessity from a user-centric perspective. The fundamental drivers of AIx include the introduction of novel capabilities, replication of capabilities (both anthropomorphic and superhuman), migration of capabilities, aggregation of intelligence, and multiplication of capabilities. These elements are essential for propelling innovation, expanding the frontiers of capability, and realizing the exponential superposition of capabilities, thereby mitigating labor redundancy and addressing a spectrum of human needs. Furthermore, this study provides an in-depth analysis of the structural components and operational mechanisms of agents supported by large models. Such advancements significantly enhance the capacity of agents to tackle complex problems and provide intelligent services, facilitating more intuitive, adaptive, and personalized engagement between humans and machines. The study further delineates four principal categories of interaction patterns, encompassing eight distinct interaction modalities and twenty-one specific scenarios, including applications in smart home systems, health assistance, and elderly care. This emphasizes the significance of the new paradigm in advancing HCI, fostering technological progress, and redefining user experiences. However, the study also acknowledges the challenges and ethical considerations that accompany this paradigm shift, recognizing the need for a balanced approach to harness the full potential of AIx in modern society.
Keywords: interaction paradigm; agent interaction; large models
4. An Online Judgement System Based on a Code-Generating Large Model
Authors: Xudong Lu, Zaixuan Wang, He Zhou, Chen Yu, Lizhen Cui, Wei Guo. Computer Education, 2025, No. 3, pp. 122-129.
For computer science majors in higher education institutions, programming courses are among the most important professional foundation courses. Proficiency in independent programming greatly benefits students' subsequent coursework and personal development. In the teaching of programming courses, online judgement systems are often used to improve students' programming skills. Traditional online judgement systems, however, offer no guidance, and inexperienced students often struggle to find and correct errors in their code by themselves. We propose an online judgement system that integrates an error-correcting large model to help students locate errors and improve their programming skills.
Keywords: online judgement system; code-generating large model; AI assistant
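The core loop of such a system, running a submission against test cases and collecting failure context for a model to explain, can be sketched as below. `model_hint()` is a hypothetical stand-in for the paper's large-model call, and the buggy submission is an invented example:

```python
# Minimal online-judge core: run a submission against test cases and, on
# failure, collect the context an error-correction model could turn into hints.
def judge(solution_fn, test_cases):
    failures = []
    for args, expected in test_cases:
        try:
            got = solution_fn(*args)
        except Exception as e:
            failures.append({"input": args, "error": repr(e)})
            continue
        if got != expected:
            failures.append({"input": args, "expected": expected, "got": got})
    return "Accepted" if not failures else ("Wrong Answer", failures)

def model_hint(failures):
    # Stand-in: a real system would prompt the error-correcting LLM here.
    return f"Review the {len(failures)} failing case(s) above."

# A buggy student submission: integer division drops the fractional part.
def student_avg(a, b):
    return (a + b) // 2  # bug: should be true division

verdict = judge(student_avg, [((2, 4), 3.0), ((1, 2), 1.5)])
print(verdict[0], "->", model_hint(verdict[1]))
```

The value over a traditional judge is that `failures` carries structured input/expected/got triples, exactly the context a model needs to generate a targeted hint rather than a bare "Wrong Answer".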
5. Special Topic on Security of Large Models
Authors: SU Zhou, DU Linkang. ZTE Communications, 2025, No. 3, pp. 1-2.
Large models, such as large language models (LLMs), vision-language models (VLMs), and multimodal agents, have become key elements in artificial intelligence (AI) systems. Their rapid development has greatly improved perception, generation, and decision-making in various fields. However, their vast scale and complexity bring about new security challenges. Issues such as backdoor vulnerabilities during training, jailbreaking in multimodal reasoning, and data provenance and copyright auditing have made security a critical focus for both academia and industry.
Keywords: large models; security; multimodal agents; large language models (LLMs); vision-language models (VLMs); data provenance; copyright auditing; backdoor vulnerabilities
6. Research status and application of artificial intelligence large models in the oil and gas industry [Cited by 2]
Authors: LIU He, REN Yili, LI Xin, DENG Yue, WANG Yongtao, CAO Qianwen, DU Jinyang, LIN Zhiwei, WANG Wenjie. Petroleum Exploration and Development (SCIE), 2024, No. 4, pp. 1049-1065.
This article elucidates the concept of large model technology, summarizes its research status both domestically and internationally, provides an overview of the application status of large models in vertical industries, outlines the challenges confronted in applying large models in the oil and gas sector, and offers prospects for their application in the oil and gas industry. Existing large models can be broadly divided into three categories: large language models, visual large models, and multimodal large models. The application of large models in the oil and gas industry is still in its infancy. Based on open-source large language models, some oil and gas enterprises have released large language model products using methods such as fine-tuning and retrieval-augmented generation. Scholars have attempted to develop scenario-specific models for oil and gas operations using visual/multimodal foundation models, and a few researchers have constructed pre-trained foundation models for seismic data processing and interpretation, as well as core analysis. The application of large models in this industry faces challenges including data quantity and quality insufficient to support large-model training, high research and development costs, and limited algorithmic autonomy and controllability. Applications should be guided by the needs of the oil and gas business, taking the adoption of large models as an opportunity to improve data lifecycle management, enhance data governance capabilities, promote the construction of computing power, strengthen the building of "artificial intelligence + energy" interdisciplinary teams, and boost the autonomy and controllability of large model technology.
Keywords: foundation model; large language model; visual large model; multimodal large model; large models for the oil and gas industry; pre-training; fine-tuning
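The retrieval-augmented generation pattern mentioned in the abstract can be sketched with a toy keyword retriever; real systems use embedding search and an LLM API, so the corpus, `retrieve`, and `generate` below are all illustrative stand-ins:

```python
# Minimal retrieval-augmented generation loop over a toy domain corpus.
CORPUS = [
    "Well A-12 logged porosity of 18% in the upper sandstone interval.",
    "Seismic survey Block 7 was reprocessed with pre-stack depth migration.",
    "Core analysis of Well A-12 indicates permeability near 120 mD.",
]

def retrieve(query, corpus, k=2):
    # Score documents by word overlap with the query (embedding search in practice).
    q = set(query.lower().split())
    scored = sorted(corpus, key=lambda d: -len(q & set(d.lower().split())))
    return scored[:k]

def generate(prompt):
    # Stand-in for an LLM call; here it just reports how much context it was given.
    return f"[answer grounded in {prompt.count('CONTEXT:')} retrieved passages]"

def rag_answer(query):
    passages = retrieve(query, CORPUS)
    prompt = "".join(f"CONTEXT: {p}\n" for p in passages) + f"QUESTION: {query}"
    return generate(prompt)

print(rag_answer("porosity of well A-12"))
```

The design point is that domain knowledge lives in the retrievable corpus rather than the model weights, which is why enterprises can build on open-source LLMs without retraining them on proprietary well data.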
7. Key Technologies and Application Prospects of Railway Natural Language Large Model
Authors: SHI Tianyun, LI Xinqin, DAI Mingrui, SHI Weifeng, LI Guohua, DU Wenran; translated by SHEN Meiying. Chinese Railways, 2024, No. 2, pp. 11-20.
The emergence of natural language large models in artificial intelligence has brought a new dawn for deep empowerment of the industry. Research on the key technologies and applications of a railway natural language large model is of great significance to promoting and coordinating the development of railway artificial intelligence. This paper puts forward application scenarios for a railway natural language large model based on the application requirements of railway artificial intelligence; designs the overall architecture of the model on top of the railway artificial intelligence platform; studies the key technologies of natural language large models; builds a railway industry large model oriented to intelligent question answering and verifies it with real data; and finally discusses prospects for the development and application of railway natural language large models in railway traffic organization, railway operation safety, and passenger service.
Keywords: intelligent high-speed rail; artificial intelligence; railway natural language large model; application scenarios; large model architecture; large model fine-tuning; retrieval-augmented generation; railway knowledge question answering
8. Envisioning the blueprint: Aeronautics in large models era
Authors: Weiwei ZHANG, Shule ZHAO. Chinese Journal of Aeronautics, 2025, No. 8, pp. 139-141.
Following the groundbreaking introduction of the Transformer architecture in 2017, the development of Large Language Models (LLMs) formally commenced. In May 2020, GPT-3, with over one hundred billion parameters, entered the public eye, marking a significant milestone in LLM advancement.
Keywords: aeronautics; large language models (LLMs); Transformer architecture; ChatGPT
9. Artificial intelligence large model for logging curve reconstruction
Authors: CHEN Zhangxing, ZHANG Yongan, LI Jian, HUI Gang, SUN Youzhuang, LI Yizheng, CHEN Yuntian, ZHANG Dongxiao. Petroleum Exploration and Development, 2025, No. 3, pp. 842-854.
To improve the accuracy and generalization of well-logging curve reconstruction, this paper proposes an artificial intelligence large language model, "Gaia", and conducts model evaluation experiments. By fine-tuning a pre-trained large language model, Gaia significantly improves the extraction of sequential patterns and spatial features from well-log curves. Leveraging the adapter method for fine-tuning, the model required training only about 1/70 of its original parameters, greatly improving training efficiency. Comparative, ablation, and generalization experiments were designed and conducted using well-log data from 250 wells. In the comparative experiment, the Gaia model was benchmarked against cutting-edge small deep learning models and conventional large language models, reducing the mean absolute error (MAE) by at least 20%. In the ablation experiments, the synergistic effect of the Gaia model's multiple components was validated, with its MAE at least 30% lower than that of single-component variants. In the generalization experiments, the superior performance of the Gaia model in blind-well predictions was further confirmed. Compared to traditional models, Gaia is significantly superior in accuracy and generalization for logging curve reconstruction, showcasing the potential of large language models in well logging and providing a new approach for intelligent logging data processing.
Keywords: logging curve reconstruction; large language model; adapter; pre-trained model; fine-tuning
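The adapter idea behind training only a small fraction of parameters can be sketched as follows. The layer sizes are illustrative, not Gaia's actual dimensions, and the frozen weight stands in for a whole pre-trained layer:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_adapter = 512, 8  # illustrative sizes, not the paper's configuration

# Frozen pretrained weight: stays fixed during fine-tuning.
W_frozen = rng.normal(size=(d_model, d_model))

# Trainable bottleneck adapter: down-project, nonlinearity, up-project.
W_down = rng.normal(scale=0.01, size=(d_model, d_adapter))
W_up = np.zeros((d_adapter, d_model))  # zero init: adapter starts as a no-op

def layer_with_adapter(x):
    h = x @ W_frozen
    # Residual adapter branch added on top of the frozen transformation.
    return h + np.maximum(h @ W_down, 0.0) @ W_up

trainable = W_down.size + W_up.size        # 2 * 512 * 8 = 8192
total = W_frozen.size + trainable          # 262144 + 8192 = 270336
print(f"trainable fraction: {trainable / total:.4f}")
```

Because only the bottleneck matrices receive gradients, the trainable fraction shrinks roughly with `d_adapter / d_model`, which is the mechanism that lets adapter fine-tuning touch a small slice of a large model's parameters.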
10. Dataset Copyright Auditing for Large Models: Fundamentals, Open Problems, and Future Directions
Authors: DU Linkang, SU Zhou, YU Xinyi. ZTE Communications, 2025, No. 3, pp. 38-47.
The unprecedented scale of large models, such as large language models (LLMs) and text-to-image diffusion models, has raised critical concerns about the unauthorized use of copyrighted data during model training. These concerns have spurred a growing demand for dataset copyright auditing techniques, which aim to detect and verify potential infringements in the training data of commercial AI systems. This paper presents a survey of existing auditing solutions, categorizing them across key dimensions: data modality, model training stage, data overlap scenarios, and model access levels. We highlight major trends, including the prevalence of black-box auditing methods and the emphasis on fine-tuning rather than pre-training. Through an in-depth analysis of 12 representative works, we extract four key observations that reveal the limitations of current methods. Furthermore, we identify three open challenges and propose future directions for robust, multimodal, and scalable auditing solutions. Our findings underscore the urgent need to establish standardized benchmarks and develop auditing frameworks that are resilient to low watermark densities and applicable in diverse deployment settings.
Keywords: dataset copyright auditing; large language models; diffusion models; multimodal auditing; membership inference
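The membership-inference primitive underlying many black-box audits can be sketched with synthetic losses: training members tend to have lower loss than unseen data. The loss distributions and threshold calibration below are illustrative assumptions, not a method from the survey:

```python
import numpy as np

rng = np.random.default_rng(1)
member_losses = rng.normal(0.4, 0.1, 1000)     # losses on suspected training data
nonmember_losses = rng.normal(1.2, 0.3, 1000)  # losses on held-out reference data

def audit(losses, threshold):
    # Fraction of suspect samples flagged as "likely seen during training".
    return float(np.mean(losses < threshold))

# Calibrate the threshold from the reference (non-member) distribution.
threshold = nonmember_losses.mean() - 2 * nonmember_losses.std()
print(f"flagged members:     {audit(member_losses, threshold):.2f}")
print(f"flagged non-members: {audit(nonmember_losses, threshold):.2f}")
```

The separation between the two flagged fractions is what an auditor reports as evidence; in practice the gap is far noisier than in this synthetic setup, which is one reason the survey stresses standardized benchmarks.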
11. The Synergy of Seeing and Saying: Revolutionary Advances in Multi-modality Medical Vision-Language Large Models
Authors: Xiang LI, Yu SUN, Jia LIN, Like LI, Ting FENG, Shen YIN. Artificial Intelligence Science and Engineering, 2025, No. 2, pp. 79-97.
The application of vision-language large models in medical health has gradually become a research focus. These models combine image understanding with natural language processing and can simultaneously handle multi-modality data such as medical images and medical reports. They can not only recognize images but also understand the semantic relationships between images and text, effectively integrating medical information and providing strong support for clinical decision-making and disease diagnosis. Vision-language large models perform well on specific medical tasks and also show strong potential and high intelligence as general task models. This paper provides a comprehensive review of vision-language large models in medical health. Specifically, it first introduces the theoretical foundations and technical principles, then surveys specific application scenarios, including modality fusion, semi-supervised learning, weakly supervised learning, unsupervised learning, cross-domain models, and general models. Finally, challenges including insufficient data, interpretability, and practical deployment are discussed, and four potential future development directions are given.
Keywords: large language models; vision-language models; medical health; multi-modality models
12. Neuromorphic Computing in the Era of Large Models
Authors: Haoxuan SHAN, Chiyue WEI, Nicolas RAMOS, Xiaoxuan YANG, Cong GUO, Hai (Helen) LI, Yiran CHEN. Artificial Intelligence Science and Engineering, 2025, No. 1, pp. 17-30.
The rapid advancement of deep learning and the emergence of large-scale neural models, such as bidirectional encoder representations from transformers (BERT), generative pre-trained transformer (GPT), and large language model Meta AI (LLaMA), have brought significant computational and energy challenges. Neuromorphic computing presents a biologically inspired approach to addressing these issues, leveraging event-driven processing and in-memory computation for enhanced energy efficiency. This survey explores the intersection of neuromorphic computing and large-scale deep learning models, focusing on neuromorphic models, learning methods, and hardware. We highlight transferable techniques from deep learning to neuromorphic computing and examine the memory-related scalability limitations of current neuromorphic systems. Furthermore, we identify potential directions to enable neuromorphic systems to meet the growing demands of modern AI workloads.
Keywords: neuromorphic computing; spiking neural networks; large deep learning models
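The event-driven processing the abstract refers to is built from spiking neurons; a leaky integrate-and-fire (LIF) neuron, the standard primitive in spiking networks, can be sketched in a few lines. The threshold, leak factor, and input drive are illustrative constants:

```python
# Leaky integrate-and-fire neuron: membrane potential leaks each step,
# integrates input current, and emits a discrete spike on threshold crossing.
def lif_run(input_current, v_thresh=1.0, leak=0.9, v_reset=0.0):
    v, spikes = 0.0, []
    for i in input_current:
        v = leak * v + i       # leaky integration of the input current
        if v >= v_thresh:      # threshold crossing emits a spike event
            spikes.append(1)
            v = v_reset        # reset the membrane after firing
        else:
            spikes.append(0)
    return spikes

# Constant drive of 0.3 per step: the membrane charges up, fires, resets, repeats.
print(lif_run([0.3] * 10))  # [0, 0, 0, 1, 0, 0, 0, 1, 0, 0]
```

Energy efficiency comes from this sparsity: downstream computation happens only at the 1s, unlike a dense activation vector that is processed at every step.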
13. Implementation of Digital Classroom State System for Teachers and Students Based on Large Models
Authors: Wenbo Lyu, Guangmin Zhu, Ziyi Qin, Mengting Yan, Jie Zhang. Journal of Contemporary Educational Research, 2024, No. 11, pp. 101-106.
Deep learning has become a hot field of artificial intelligence, and deep learning large model frameworks have become a strategic focus for technology companies in China and abroad. Large models play a significant role in applications, greatly improving the efficiency of training and optimization and enabling many innovative artificial intelligence tools. Based on China's PaddlePaddle large model framework, an application system is designed for the intelligent classroom teaching scenario. It uses machine vision algorithms to distinguish and present teachers' and students' behaviors, that is, a scheme for digitizing and classifying classroom states. With the digitized data, analyses can be carried out to evaluate the classroom status of teachers and students, upgrading traditionally subjective judgments, such as coursework grades and teaching ability, to objective judgments made by artificial intelligence.
Keywords: large models; machine vision; digitalization; classroom status; technical realization
14. A Survey of Large-Scale Deep Learning Models in Medicine and Healthcare
Authors: Zhiwei Chen, Runze Liu, Shitao Huang, Yangyang Guo, Yongjun Ren. Computer Modeling in Engineering & Sciences, 2025, No. 7, pp. 37-81.
The rapid advancement of artificial intelligence technology is driving transformative changes in medical diagnosis, treatment, and management systems through large-scale deep learning models, a process that brings both groundbreaking opportunities and multifaceted challenges. This study focuses on the medical and healthcare applications of large-scale deep learning architectures, conducting a comprehensive survey to categorize and analyze their diverse uses. The survey reveals that current applications of large models in healthcare encompass medical data management, healthcare services, medical devices, and preventive medicine, among others. Concurrently, large models demonstrate significant advantages in the medical domain, especially in high-precision diagnosis and prediction, data analysis and knowledge discovery, and enhanced operational efficiency. Nevertheless, we identify several challenges that need urgent attention, including improving the interpretability of large models, strengthening privacy protection, and handling incomplete data. This research systematically elucidates the deep collaborative mechanisms between artificial intelligence and healthcare, providing theoretical references and practical guidance for both academia and industry.
Keywords: large models; healthcare; artificial intelligence; data management; medical applications
15. Theories and applications of financial large models
Authors: Shuoling LIU, Xiaojun ZENG, Xiu LI, Qiang YANG. Frontiers of Information Technology & Electronic Engineering, 2025, No. 10, pp. 1767-1770.
1 Background and motivation: Recent advances in foundation models have ushered in a paradigm shift across the field of artificial intelligence (AI), with profound implications for financial technology (FinTech). Foundation models refer to large-scale neural networks trained on vast and heterogeneous corpora using self-supervised or instruction-driven objectives, which endow them with strong generalization and transfer capabilities across downstream tasks. Representative classes of such models, including large language models (LLMs), multimodal foundation models, and time-series foundation models, exhibit emergent abilities in semantic understanding, reasoning, and multimodal representation learning.
Keywords: financial large models; artificial intelligence (AI); foundation models; neural networks; financial technology (FinTech); large language models
16. Spatial-temporal large models: A super hub linking multiple scientific areas with artificial intelligence
Authors: Zezhi Shao, Tangwen Qian, Tao Sun, Fei Wang, Yongjun Xu. The Innovation, 2025, No. 2, pp. 9-10.
Intelligent spatial-temporal data analysis, leveraging data such as multivariate time series and geographic information, provides researchers with powerful tools to uncover multiscale patterns and enhance decision-making processes. As artificial intelligence advances, intelligent spatial-temporal algorithms have found extensive applications across various disciplines, such as geosciences, biology, and public health. Compared to traditional methods, these algorithms are data driven, making them well suited for addressing the complexities of modeling real-world systems. However, their reliance on substantial domain-specific expertise limits their broader applicability. Recently, significant advancements have been made in spatial-temporal large models. Trained on large-scale data, these models exhibit a vast parameter scale, superior generalization capabilities, and multitasking advantages over previous methods. Their high versatility and scalability position them as promising super hubs for multidisciplinary research, integrating knowledge, intelligent algorithms, and research communities from different fields. Nevertheless, achieving this vision will require overcoming numerous critical challenges, offering an expansive and profound space for future exploration.
Keywords: spatial-temporal large models; multivariate time series; intelligent spatial-temporal data analysis; geographic information; artificial intelligence
17. Training Large Models on Heterogeneous and Geo-Distributed Resource with Constricted Networks
Authors: Zan Zong, Minkun Guo, Mingshu Zhai, Yinan Tang, Jianjiang Li, Jidong Zhai. Big Data Mining and Analytics, 2025, No. 4, pp. 966-980.
As the computational demands driven by large model technologies continue to grow rapidly, leveraging GPU hardware to expedite parallel training has become a commonly used strategy. When computational resources within a single cluster are insufficient for large-model training, hybrid utilization of heterogeneous acceleration hardware has emerged as a promising technical solution, and the scheduling of diverse cloud resources has become a focal point of considerable interest. However, these computing resources are often geographically distributed. Lacking awareness of heterogeneous devices and network topologies, existing parallel training frameworks struggle to leverage mixed GPU resources effectively across constrained networks. To boost the computing capability of connected heterogeneous clusters, we propose HGTrainer, an optimizer that plans heterogeneous parallel strategies across distributed clusters for large model training. HGTrainer can adaptively saturate heterogeneous clusters thanks to an expanded tunable parallelism space for heterogeneous accelerators and awareness of the relatively low inter-cluster bandwidth. To achieve this, we formulate the model partitioning problem over heterogeneous hardware and introduce a hierarchical search algorithm to solve the optimization problem. In addition, a mixed-precision pipeline method is used to reduce the cost of inter-cluster communication. We evaluate HGTrainer on heterogeneous connected clusters with popular large language models. Experimental results show that HGTrainer improves training throughput by 1.49x on average for mixed heterogeneous clusters compared with the state-of-the-art Metis.
Keywords: deep learning systems; large model training; heterogeneous; geo-distributed clusters
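The partitioning problem the abstract formulates can be illustrated with a toy version: split a chain of layers into contiguous pipeline stages, one per cluster, minimizing the bottleneck stage time. The layer costs and cluster speeds are made-up numbers, and the brute-force search stands in for HGTrainer's hierarchical algorithm:

```python
from itertools import combinations

layer_cost = [4, 3, 6, 2, 5, 1]   # relative compute cost per layer (illustrative)
cluster_speed = [1.0, 2.0, 0.5]   # relative throughput of each cluster (illustrative)

def best_partition(costs, speeds):
    # Enumerate cut points between layers; each stage i runs on cluster i,
    # and pipeline throughput is limited by the slowest (bottleneck) stage.
    n, k = len(costs), len(speeds)
    best, best_bounds = float("inf"), None
    for cuts in combinations(range(1, n), k - 1):
        bounds = [0, *cuts, n]
        stage_time = max(
            sum(costs[bounds[i]:bounds[i + 1]]) / speeds[i] for i in range(k)
        )
        if stage_time < best:
            best, best_bounds = stage_time, bounds
    return best, best_bounds

bottleneck, bounds = best_partition(layer_cost, cluster_speed)
print(bottleneck, bounds)  # 7.0 [0, 2, 5, 6]
```

Note how the optimum gives the slow cluster only the final one-layer stage: heterogeneity awareness changes the partition, which is precisely why topology-blind frameworks underuse mixed hardware.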
18. Multiscale Information Fusion Based on Large Model Inspired Bacterial Detection
Authors: Zongduo Liu, Yan Huang, Jian Wang, Genji Yuan, Junjie Pang. Big Data Mining and Analytics, 2025, No. 1, pp. 1-17.
Accurate and efficient bacterial detection is essential for public health and medical diagnostics. However, traditional detection methods are constrained by limited dataset size, complex bacterial morphology, and diverse detection environments, hindering their effectiveness. In this study, we present EagleEyeNet, a novel multi-scale information fusion model designed to address these challenges. EagleEyeNet leverages large models as teacher networks in a knowledge distillation framework, significantly improving detection performance. Additionally, a newly designed feature fusion architecture integrating Transformer modules enables efficient fusion of global and multi-scale features, overcoming the bottlenecks posed by Feature Pyramid Network (FPN) structures and reducing information transmission loss between feature layers. To improve the model's adaptability across scenarios, we create our own QingDao Bacteria Detection (QDBD) dataset as a comprehensive evaluation benchmark for bacterial detection. Experimental results demonstrate that EagleEyeNet achieves remarkable performance improvements, with mAP50 increases of 3.1% on the QDBD dataset and 4.9% on the AGRA dataset, outperforming state-of-the-art (SOTA) methods in detection accuracy. These findings underscore the transformative potential of integrating large models and deep learning for advancing bacterial detection technologies.
Keywords: bacterial detection; large models; feature fusion; global information
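The teacher-student mechanism behind such a distillation framework is commonly the temperature-softened KL divergence between teacher and student outputs; the logits and temperature below are illustrative, not EagleEyeNet's actual training recipe:

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature T > 1 softens the distribution, exposing the teacher's
    # "dark knowledge" about relative similarities between wrong classes.
    e = np.exp(z / T - np.max(z / T))
    return e / e.sum()

def distill_loss(student_logits, teacher_logits, T=4.0):
    # KL divergence between softened teacher and student distributions;
    # the T**2 factor keeps gradient magnitudes comparable across temperatures.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(T * T * np.sum(p * (np.log(p) - np.log(q))))

teacher = np.array([8.0, 2.0, 1.0])
student = np.array([5.0, 3.0, 2.0])
print(round(distill_loss(student, teacher), 4))
```

Minimizing this loss pulls the small detector toward the large teacher's output distribution, which is how a compact model inherits performance without the teacher's inference cost.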
19. Large model enhanced computational ghost imaging
Authors: Yifan CHEN, Hongjun AN, Zhe SUN, Tong TIAN, Mingliang CHEN, Christian SPIELMANN, Xuelong LI. Science China Technological Sciences, 2025, No. 11, pp. 271-282.
Ghost imaging (GI) enables 2D image reconstruction by leveraging high-order correlation between 1D bucket signals and 2D light field information. It demonstrates enhanced detection sensitivity and high-quality image reconstruction via efficient photon collection in scattering media. Recent studies have established that deep learning (DL) can substantially enhance GI reconstruction quality. Furthermore, with the emergence of large models such as SDXL and GPT-4, the constraints of conventional DL in parameters and architecture have been transcended, enabling models to comprehensively explore relationships among all distinct positions within feature sequences. This paradigm shift has significantly advanced the capability of DL in restoring severely degraded and low-resolution imagery, making it particularly advantageous for noise-robust image reconstruction in GI applications. In this paper, we propose the first large imaging model with 1.4 billion parameters that incorporates the physical principles of GI (GILM). The proposed GILM implements a skip connection mechanism to mitigate gradient explosion challenges inherent in deep architectures, ensuring sufficient parametric capacity to capture intricate correlations between single-pixel measurements and the object. Moreover, GILM leverages a multi-head attention mechanism to learn spatial dependencies across pixel points during image reconstruction, facilitating the extraction of comprehensive object information for subsequent reconstruction. We validated the effectiveness of GILM through a series of experiments, including simulated object imaging, imaging objects in free space, and imaging objects located 52 m away in an underwater environment. The experimental results demonstrate that GILM effectively captures the fluctuation trends of the collected signals, thereby facilitating accurate reconstruction of the object's image from the acquired data. Finally, GILM was successfully deployed on a portable computing platform, demonstrating its feasibility for practical engineering applications.
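The classical second-order correlation that underlies computational GI (before any deep-learning refinement) can be sketched in a few lines of numpy. Everything below is an illustrative assumption — the 8×8 binary object, the number of speckle patterns, and the mean threshold are invented for the sketch and are not the paper's GILM setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 8x8 binary object (illustration only)
obj = np.zeros((8, 8))
obj[2:6, 3:5] = 1.0

M = 20000                                  # number of speckle patterns
patterns = rng.random((M, 8, 8))           # known 2D light fields I_m(x, y)
bucket = np.tensordot(patterns, obj, 2)    # 1D bucket signals B_m = sum_xy I_m * T

# Second-order correlation: G(x, y) = <B * I(x, y)> - <B> * <I(x, y)>
G = (bucket[:, None, None] * patterns).mean(0) - bucket.mean() * patterns.mean(0)

# Object pixels correlate with the bucket signal, so thresholding G at its
# mean recovers the object's support from 1D measurements alone.
est = (G > G.mean()).astype(float)
```

The deep-learning approaches the abstract describes replace this fixed correlation-and-threshold step with a learned mapping from bucket signals to images, which is what makes them robust at low sampling ratios and high noise.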
Keywords: computational ghost imaging; large model; speckle pattern
Large models in medical imaging: Advances and prospects (cited 3 times)
Authors: Mengjie Fang, Zipei Wang, Sitian Pan, Xin Feng, Yunpeng Zhao, Dongzhi Hou, Ling Wu, Xuebin Xie, Xu-Yao Zhang, Jie Tian, Di Dong 《Chinese Medical Journal》 2025, Issue 14, pp. 1647-1664 (18 pages)
Recent advances in large models demonstrate significant prospects for transforming the field of medical imaging. These models, including large language models, large visual models, and multimodal large models, offer unprecedented capabilities in processing and interpreting complex medical data across various imaging modalities. By leveraging self-supervised pretraining on vast unlabeled datasets, cross-modal representation learning, and domain-specific medical knowledge adaptation through fine-tuning, large models can achieve higher diagnostic accuracy and more efficient workflows for key clinical tasks. This review summarizes the concepts, methods, and progress of large models in medical imaging, highlighting their potential in precision medicine. The article first outlines the integration of multimodal data under large model technologies, approaches for training large models with medical datasets, and the need for robust evaluation metrics. It then explores how large models can revolutionize applications in critical tasks such as image segmentation, disease diagnosis, personalized treatment strategies, and real-time interactive systems, thus pushing the boundaries of traditional imaging analysis. Despite their potential, the practical implementation of large models in medical imaging faces notable challenges, including the scarcity of high-quality medical data, the need for optimized perception of imaging phenotypes, safety considerations, and seamless integration with existing clinical workflows and equipment. As research progresses, the development of more efficient, interpretable, and generalizable models will be critical to ensuring their reliable deployment across diverse clinical environments. This review aims to provide insights into the current state of the field and directions for future research to facilitate the broader adoption of large models in clinical practice.
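The pretrain-then-fine-tune pattern this review describes can be sketched schematically in numpy. Everything here is a stand-in assumption: PCA on unlabeled data substitutes for self-supervised objectives such as masked autoencoding, the "images" are random 64-dimensional vectors, and the label rule is invented — only the workflow (unlabeled pretraining → frozen encoder → small labeled head) mirrors the text:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data (not from the review): 200 unlabeled and 40 labeled
# "images" flattened to 64-dim vectors, with most variance in 2 dimensions.
scale = np.ones(64)
scale[:2] = 5.0
unlabeled = rng.normal(size=(200, 64)) * scale
labeled = rng.normal(size=(40, 64)) * scale
labels = (labeled[:, 0] + labeled[:, 1] > 0).astype(int)

# "Pretraining": learn a low-dimensional representation from unlabeled data
# alone (PCA as a stand-in for self-supervised pretraining).
mu = unlabeled.mean(0)
_, _, Vt = np.linalg.svd(unlabeled - mu, full_matrices=False)
encoder = Vt[:8].T                         # frozen 64 -> 8 projection

def encode(x):
    return (x - mu) @ encoder

# "Fine-tuning": fit a tiny linear head on the small labeled set's features.
feats = encode(labeled)
A = np.hstack([feats, np.ones((len(feats), 1))])   # add a bias column
w, *_ = np.linalg.lstsq(A, 2.0 * labels - 1.0, rcond=None)
pred = (A @ w > 0).astype(int)
train_acc = (pred == labels).mean()
```

The point of the pattern, as in the review, is that the representation is learned without labels, so the scarce labeled medical data only has to fit a small task-specific head.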
Keywords: artificial intelligence; large language model; large vision model; multimodal data; segmentation; diagnosis; interactive system