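Research Progress of Convolutional Neural Networks and Vision Transformer in Glioma
1
Authors: 杨浩辉, 徐涛, 王伟, 安良良, 敖用芳, 朱家宝. 《磁共振成像》, PKU Core, 2026, No. 1, pp. 168-174 (7 pages)
Glioma is highly heterogeneous, strongly invasive, and carries a poor prognosis, so conventional diagnosis and treatment face major challenges. Deep learning offers a new route to precise diagnosis and treatment, with the convolutional neural network (CNN) and the Vision Transformer (ViT) as its core tools. Through hierarchical convolution, CNNs have an inherent advantage in local feature extraction (e.g., tumor margins and texture details), whereas ViTs, built on self-attention, excel at global context modeling (e.g., cross-region tumor heterogeneity and multimodal associations). Fusion strategies that combine fine local features with global relational information show clear advantages on clinical difficulties such as blurred glioma boundaries and heterogeneous cross-modal data. This article reviews progress with both architectures in key clinical tasks, including glioma detection and segmentation, pathological grading, molecular subtyping, and prognosis assessment, and describes their principles, standalone applications, and fusion strategies. It also discusses current challenges, such as heavy dependence on annotated data and limited model interpretability, and outlines future directions, such as lightweight architectures, self-supervised learning, and multi-omics fusion, with the aim of providing a systematic reference for intelligent glioma diagnosis.
Keywords: glioma; deep learning; convolutional neural network; Vision Transformer; magnetic resonance imaging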
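Standard Plane Recognition for Fetal Cranial Ultrasound Based on a Conditional Generative Adversarial Network and Vision Transformer
2
Authors: 李惠莲, 林艺榕, 刘中华, 柳培忠. 《临床超声医学杂志》, 2026, No. 2, pp. 164-169 (6 pages)
Fetal cranial ultrasound is a vital part of routine prenatal screening, and accurate identification of standard planes is important for assessing fetal brain development. However, variation in ultrasound image quality and the complexity of plane acquisition make accurate standard-plane recognition challenging. This paper proposes a recognition method for fetal cranial ultrasound standard planes based on a conditional generative adversarial network (CGAN) and Vision Transformer. The CGAN augments the original data by generating additional standard and non-standard plane images, addressing data scarcity; a YOLOv9 model automatically crops the skull region in each ultrasound image, removing irrelevant information so that the model focuses on the key region. The classification model uses a Vision Transformer; all input images are normalized and resized, and data augmentation is applied, including random horizontal or vertical flipping, contrast adjustment, center cropping, and saturation adjustment. The results show that, compared with the current best-performing model, CSwin Transformer, the proposed method performs well on this task, with precision, recall, F1 score, and accuracy of 92.5%, 92.3%, 92.4%, and 93.3%, respectively. The method offers a clear advantage in recognition accuracy and provides effective technical support for clinical ultrasound examination.
Keywords: conditional generative adversarial network; Vision Transformer; cranial ultrasound; fetus; standard plane recognition method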
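The precision, recall, and F1 figures quoted in this abstract can be cross-checked with the standard definition of F1 as the harmonic mean of precision and recall. The sketch below is only an illustration: the function name `f1_score` is hypothetical, and it assumes the abstract reports a single overall precision and recall rather than per-class values averaged separately, which the listing does not state.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Percentages quoted in the abstract; treating them as overall values is an assumption.
reported_precision, reported_recall = 92.5, 92.3
print(round(f1_score(reported_precision, reported_recall), 1))
```

Under that assumption the rounded result agrees with the F1 score of 92.4% given in the abstract.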
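An Intelligent Blast Furnace Tuyere Monitoring Model Based on Vision Transformer and Its Application
3
Authors: 王浩男, 韩明博, 但家云, 李强. 《钢铁研究学报》, PKU Core, 2026, No. 1, pp. 25-37 (13 pages)
The peephole of a tuyere in the lower part of a blast furnace allows real-time monitoring of key smelting-state information, such as combustion characteristics in the raceway and the state of pulverized coal injection, from which important parameters such as gas flow distribution and hearth activity can be judged. To address the subjectivity and time lag of tuyere monitoring, this work builds an intelligent blast furnace tuyere monitoring model, TI-ViT, from unstructured big data of tuyere images and the Vision Transformer architecture. First, the collected tuyere images are preprocessed, and a dataset of typical furnace conditions is formed through feature discrimination and label calibration. Next, the TI-ViT tuyere image recognition model is built on the Vision Transformer architecture. Finally, the performance of TI-ViT is evaluated, with a focus on how model depth affects accuracy, parameter count, training time, and run time, and the model is compared with conventional convolutional neural network models. In validation, TI-ViT reaches an accuracy of 97.7%, an improvement of 9.1% over the CNN-based model, with an inference time of only 15.75 ms per image. The "Smart Eye" (“智慧眼”) system developed from this model was applied on site, where its recognition accuracy reached 95.2%. This indicates that the system achieves real-time monitoring, recognition, and early warning for blast furnace tuyeres, helps reduce the cost to steel enterprises of monitoring and diagnosing abnormal tuyere states, and offers a new direction for intelligent blast furnace ironmaking.
Keywords: blast furnace tuyere; computer vision; Vision Transformer; image recognition; blast furnace ironmaking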
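A Rolling Bearing Fault Diagnosis Method Using the Effective Diagnosis Vision Transformer Network
4
Authors: 罗志勇, 李明周, 董鑫. 《重庆邮电大学学报(自然科学版)》, PKU Core, 2026, No. 1, pp. 146-155 (10 pages)
To address incomplete feature extraction and low diagnostic efficiency in rolling bearing fault diagnosis, an effective diagnosis Vision Transformer (EDViT) network is proposed. A kurtosis-based weighted fusion strategy merges the sensor information; the short-time Fourier transform converts the fused signal into a time-frequency image; EDViT's dual-attention convolution module and dual-branch patch vision transformer module are then applied in sequence to extract local and global features, and a classifier performs fault classification. Experimental validation was carried out on the Case Western Reserve University bearing dataset. The results show that the EDViT model has excellent feature extraction capability, fast convergence, and high diagnostic accuracy. Comparison with other methods indicates that EDViT has strong generalization ability and robustness.
Keywords: effective diagnosis Vision Transformer network; rolling bearing; fault diagnosis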
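The first two steps of this abstract, kurtosis-weighted sensor fusion followed by a short-time Fourier transform, can be sketched roughly as follows. This is not the authors' implementation: the abstract does not give the weighting formula, so weights proportional to each channel's kurtosis are an assumption, and the function names, frame length, hop size, and Hann window are all hypothetical choices. Only the Python standard library is used.

```python
import cmath
import math


def kurtosis(x):
    """Pearson kurtosis: fourth central moment divided by squared variance."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 ** 2)


def fuse_by_kurtosis(channels):
    """Weighted sum of equal-length channels; weights proportional to kurtosis (assumed rule)."""
    k = [kurtosis(c) for c in channels]
    total = sum(k)
    weights = [v / total for v in k]
    length = len(channels[0])
    fused = [sum(w * c[i] for w, c in zip(weights, channels)) for i in range(length)]
    return fused, weights


def stft_magnitude(x, frame_len=16, hop=8):
    """Magnitude spectrogram from Hann-windowed frames and a naive DFT."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    frames = []
    for start in range(0, len(x) - frame_len + 1, hop):
        seg = [x[start + n] * window[n] for n in range(frame_len)]
        spectrum = [
            abs(sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(frame_len // 2 + 1)  # keep non-negative frequency bins only
        ]
        frames.append(spectrum)
    return frames  # frames[t][k]: time frame t, frequency bin k


# Two synthetic "sensor" channels: a clean sine and the same sine with periodic impulses.
smooth = [math.sin(2 * math.pi * n / 8) for n in range(64)]
impulsive = [v + (5.0 if n % 16 == 0 else 0.0) for n, v in enumerate(smooth)]

fused, weights = fuse_by_kurtosis([smooth, impulsive])
spectrogram = stft_magnitude(fused)
print(weights)
print(len(spectrogram), len(spectrogram[0]))
```

The impulsive channel has the higher kurtosis and therefore receives the larger weight, which is the usual motivation for kurtosis weighting in bearing diagnostics. The resulting 2-D magnitude array plays the role of the time-frequency image that a network such as EDViT would take as input.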
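Gait-ViT: A Cross-View Gait Recognition Method Based on Vision Transformer
5
Authors: 沈澍, 王森, 黄苏岩, 张秉睿. 《小型微型计算机系统》, PKU Core, 2026, No. 3, pp. 646-652 (7 pages)
Gait recognition, a remote biometric identification technique, shows broad application prospects in medical rehabilitation, criminal investigation, and public security. In recent years, with the rapid development of deep learning, gait recognition methods have gradually shifted from conventional convolutional neural networks (CNNs) to the more advanced Transformer architecture. Although CNNs perform well on image processing tasks, their ability to attend to key image regions is limited, whereas attention mechanisms can learn more discriminative features by focusing on local image regions. This paper therefore proposes a Vision Transformer model incorporating an attention mechanism (Gait-ViT) for gait recognition. The method first divides the gait silhouette into small patches and converts them into a patch sequence; it then rearranges and encodes the positional information in the sequence through position embedding and class embedding; finally, the vector sequence is fed to the Vision Transformer for prediction. Gait-ViT achieves recognition accuracies of 98.1% and 91.2% on the public gait datasets CASIA-B and OU-MVLP, respectively, validating the effectiveness of the proposed model.
Keywords: gait recognition; Vision Transformer; convolutional neural network; feature extraction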
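The tokenization steps named in this abstract (splitting a silhouette into patches, forming a patch sequence, then adding class and position embeddings) follow the general Vision Transformer recipe and can be sketched as below. This is a simplified illustration, not the Gait-ViT code: the function names are hypothetical, the learned linear projection of each patch is omitted, and small integers stand in for embeddings that would be learned floating-point vectors in a real model.

```python
def to_patch_sequence(image, patch):
    """Split an H x W image (list of rows) into flattened patch x patch blocks, row-major."""
    h, w = len(image), len(image[0])
    assert h % patch == 0 and w % patch == 0, "image size must be divisible by patch size"
    seq = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            seq.append([image[top + r][left + c]
                        for r in range(patch) for c in range(patch)])
    return seq


def add_class_and_position(seq, class_token, pos_embed):
    """Prepend a class token, then add one position vector to each token."""
    tokens = [class_token] + seq
    assert len(tokens) == len(pos_embed), "need one position vector per token"
    return [[t + p for t, p in zip(tok, pos)] for tok, pos in zip(tokens, pos_embed)]


# Toy 4 x 4 binary silhouette, 2 x 2 patches -> 4 patches of 4 values each.
silhouette = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
patches = to_patch_sequence(silhouette, patch=2)
class_token = [0, 0, 0, 0]                    # placeholder for a learned vector
pos_embed = [[10 * i] * 4 for i in range(5)]  # placeholder for learned position vectors
tokens = add_class_and_position(patches, class_token, pos_embed)
print(len(tokens), tokens[2])
```

The class token occupies position 0, so the sequence has one more element than the number of patches; in a standard ViT the encoder's output at that position is what the classification head reads.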
Total score of the computer vision syndrome questionnaire predicts refractive errors and binocular vision anomalies
6
Authors: Mosaad Alhassan, Tasneem Samman, Hatoun Badukhen, Muhamad Alrashed, Balsam Alabdulkader, Essam Almutleb, Tahani Alqahtani, Ali Almustanyir. 《International Journal of Ophthalmology(English edition)》, 2026, No. 1, pp. 90-96 (7 pages)
AIM: To evaluate the efficacy of the total computer vision syndrome questionnaire (CVS-Q) score as a predictive tool for identifying individuals with symptomatic binocular vision anomalies and refractive errors. METHODS: A total of 141 healthy computer users underwent comprehensive clinical visual function assessments, including evaluations of refractive errors, accommodation (amplitude of accommodation, positive relative accommodation, negative relative accommodation, accommodative accuracy, and accommodative facility), and vergence (phoria, positive and negative fusional vergence, near point of convergence, and vergence facility). Total CVS-Q scores were recorded to explore potential associations between symptom scores and the aforementioned clinical visual function parameters. RESULTS: The cohort included 54 males (38.3%) with a mean age of 23.9±0.58 y and 87 age-matched females (61.7%) with a mean age of 23.9±0.53 y. The multiple regression model was statistically significant [R²=0.60, F=13.28, degrees of freedom (DF=17122), P<0.001]. This indicates that 60% of the variance in total CVS-Q scores (reflecting reported symptoms) could be explained by four clinical measurements: amplitude of accommodation, positive relative accommodation, exophoria at distance and near, and positive fusional vergence at near. CONCLUSION: The total CVS-Q score is a valid and reliable tool for predicting the presence of various nonstrabismic binocular vision anomalies and refractive errors in symptomatic computer users.
Keywords: computer vision syndrome; refractive errors; accommodation; vergence; binocular vision; symptoms
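A Siamese Multi-Level Vision Transformer Method for Change Detection in High-Resolution Remote Sensing Imagery
7
Authors: 黄英杰. 《测绘与空间地理信息》, 2026, No. 2, pp. 123-126, 130 (5 pages)
Existing remote sensing change detection models capture features incompletely and make insufficient use of deep and shallow features, which leads to low segmentation accuracy. To address this, a remote sensing image change detection model combining Vision Transformer with a Siamese architecture is proposed. In the encoder, a Siamese multi-level Vision Transformer performs spatial feature extraction and global context modeling, while a Haar wavelet downsampling layer compresses the feature map size and reduces the loss of detail features. During feature decoding, a full-scale feature connection mechanism is introduced to make full use of deep and shallow features from different sources. Experimental results show that the proposed model outperforms current mainstream models in segmentation accuracy and accurately captures the boundaries and details of changed targets.
Keywords: remote sensing change detection; Siamese architecture; Vision Transformer; Haar wavelet downsampling; full-scale feature connection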
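The Haar wavelet downsampling mentioned in this abstract halves the spatial size of a feature map while moving the discarded detail into extra sub-bands instead of throwing it away, which is why it loses less information than pooling or strided sampling. The sketch below shows one level of the orthonormal 2-D Haar transform on a single-channel map. The function and sub-band names are hypothetical, and how the paper combines the sub-bands afterwards (for example by stacking them as channels) is not stated in the abstract.

```python
def haar_downsample(fmap):
    """One-level orthonormal 2-D Haar transform of an H x W map (H and W even).

    Returns four H/2 x W/2 sub-bands: the scaled local average plus three
    detail bands (column difference, row difference, diagonal difference).
    """
    h, w = len(fmap), len(fmap[0])
    assert h % 2 == 0 and w % 2 == 0, "height and width must be even"
    approx, col_diff, row_diff, diag = [], [], [], []
    for i in range(0, h, 2):
        r_a, r_c, r_r, r_d = [], [], [], []
        for j in range(0, w, 2):
            a, b = fmap[i][j], fmap[i][j + 1]          # top-left, top-right
            c, d = fmap[i + 1][j], fmap[i + 1][j + 1]  # bottom-left, bottom-right
            r_a.append((a + b + c + d) / 2)  # low-pass in both directions
            r_c.append((a - b + c - d) / 2)  # left column minus right column
            r_r.append((a + b - c - d) / 2)  # top row minus bottom row
            r_d.append((a - b - c + d) / 2)  # diagonal detail
        approx.append(r_a)
        col_diff.append(r_c)
        row_diff.append(r_r)
        diag.append(r_d)
    return approx, col_diff, row_diff, diag


feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
bands = haar_downsample(feature_map)
energy_in = sum(v * v for row in feature_map for v in row)
energy_out = sum(v * v for band in bands for row in band for v in row)
print(len(bands[0]), len(bands[0][0]), energy_in, energy_out)
```

Because the transform is orthonormal, the total energy of the four sub-bands equals that of the input, so the original map can be reconstructed exactly from them.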
A Hybrid Vision Transformer with Attention Architecture for Efficient Lung Cancer Diagnosis
8
Authors: Abdu Salam, Fahd M. Aldosari, Donia Y. Badawood, Farhan Amin, Isabel de la Torre, Gerardo Mendez Mezquita, Henry Fabian Gongora. 《Computers, Materials & Continua》, 2026, No. 4, pp. 1129-1147 (19 pages)
Lung cancer remains a major global health challenge, with early diagnosis crucial for improved patient survival. Traditional diagnostic techniques, including manual histopathology and radiological assessments, are prone to errors and variability. Deep learning methods, particularly Vision Transformers (ViT), have shown promise for improving diagnostic accuracy by effectively extracting global features. However, ViT-based approaches face challenges related to computational complexity and limited generalizability. This research proposes the DualSet ViT-PSO-SVM framework, integrating a ViT with dual attention mechanisms, Particle Swarm Optimization (PSO), and Support Vector Machines (SVM), aiming for efficient and robust lung cancer classification across multiple medical image datasets. The study utilized three publicly available datasets: LIDC-IDRI, LUNA16, and TCIA, encompassing computed tomography (CT) scans and histopathological images. Data preprocessing included normalization, augmentation, and segmentation. Dual attention mechanisms enhanced ViT’s feature extraction capabilities. PSO optimized feature selection, and SVM performed classification. Model performance was evaluated on individual and combined datasets, benchmarked against CNN-based and standard ViT approaches. The DualSet ViT-PSO-SVM significantly outperformed existing methods, achieving superior accuracy rates of 97.85% (LIDC-IDRI), 98.32% (LUNA16), and 96.75% (TCIA). Cross-dataset evaluations demonstrated strong generalization capabilities and stability across similar imaging modalities. The proposed framework effectively bridges advanced deep learning techniques with clinical applicability, offering a robust diagnostic tool for lung cancer detection, reducing complexity, and improving diagnostic reliability and interpretability.
Keywords: deep learning; artificial intelligence; healthcare; medical imaging; vision transformer
From microstructure to performance optimization:Innovative applications of computer vision in materials science
9
Authors: Chunyu Guo, Xiangyu Tang, Yu’e Chen, Changyou Gao, Qinglin Shan, Heyi Wei, Xusheng Liu, Chuncheng Lu, Meixia Fu, Enhui Wang, Xinhong Liu, Xinmei Hou, Yanglong Hou. 《International Journal of Minerals,Metallurgy and Materials》, 2026, No. 1, pp. 94-115 (22 pages)
The rapid advancements in computer vision (CV) technology have transformed the traditional approaches to material microstructure analysis. This review outlines the history of CV and explores the applications of deep-learning (DL)-driven CV in four key areas of materials science: microstructure-based performance prediction, microstructure information generation, microstructure defect detection, and crystal structure-based property prediction. CV has significantly reduced the cost of traditional experimental methods used in material performance prediction. Moreover, recent progress made in generating microstructure images and detecting microstructural defects using CV has led to increased efficiency and reliability in material performance assessments. DL-driven CV models can accelerate the design of new materials with optimized performance by integrating predictions based on both crystal and microstructural data, thereby allowing for the discovery and innovation of next-generation materials. Finally, the review provides insights into the rapid interdisciplinary developments in the field of materials science and future prospects.
Keywords: microstructure; deep learning; computer vision; performance prediction; image generation
A comprehensive analysis of artificial intelligence,machine learning,deep learning and computer vision in food science
10
Authors: Premkumar Borugadda, Hemantha Kumar Kalluri. 《Journal of Future Foods》, 2026, No. 6, pp. 975-991 (17 pages)
Providing safe and quality food is crucial for every household and is of extreme significance in the growth of any society. It is a complex procedure that deals with all issues focusing on the development of food processing from seed to harvest, storage, preparation, and consumption. This current paper seeks to demystify the importance of artificial intelligence, machine learning (ML), deep learning (DL), and computer vision (CV) in ensuring food safety and quality. By stressing the importance of these technologies, the audience will feel reassured and confident in their potential. These are very handy for such problems, giving assurance over food safety. CV is incredibly noble in today's generation because it improves food processing quality and positively impacts firms and researchers. Thus, at the present production stage, rich in image processing and computer visioning is incorporated into all facets of food production. In this field, DL and ML are implemented to identify the type of food in addition to quality. Concerning data and result-oriented perceptions, one has found similarities regarding various approaches. As a result, the findings of this study will be helpful for scholars looking for a proper approach to identify the quality of food offered. It helps to indicate which food products have been discussed by other scholars and lets the reader know papers by other scholars inclined to research further. Also, DL is accurately integrated with identifying the quality and safety of foods in the market. This paper describes the current practices and concerns of ML, DL, and probable trends for its future development.
Keywords: artificial intelligence; computer vision; deep learning; food quality; food recognition; machine learning
Functional outcome and patient satisfaction 5y after laser vision correction
11
Authors: Ran Gao, Yu Han, Jie Qin, Yu-Shan Xu, Yu Li, Xiao-Tong Lyu, Feng-Ju Zhang. 《International Journal of Ophthalmology(English edition)》, 2026, No. 1, pp. 123-131 (9 pages)
AIM: To investigate the association between functional outcomes and postoperative patient satisfaction 5 y after small incision lenticule extraction (SMILE) and femtosecond laser-assisted in situ keratomileusis (FS-LASIK). METHODS: This is a cross-sectional study. The patients underwent basic ophthalmic examinations, axial length measurement, wide-field fundus photography, and accommodation function testing. Behavioral habits data were collected using a self-administered questionnaire, and visual symptoms were assessed with the Quality of Vision (QoV) questionnaire. Postoperative satisfaction was also recorded. RESULTS: Totally 410 subjects [820 eyes, 160 males (39.02%) and 250 females (60.98%)] who had undergone SMILE or FS-LASIK 5 y ago were enrolled. The mean (standard deviation, SD) age of all patients was 29.83 y (6.69). The mean (SD) preoperative manifest SE was -5.80 (2.04) diopters (D; range: -0.88 to -13.75). Patient satisfaction at 5 y after undergoing SMILE or FS-LASIK was 91.70%. Patients were categorized into two groups: dissatisfied group and satisfied group. Significant differences were observed between the two groups in terms of age (P=0.012), sex (P=0.021), preoperative degree of myopia (P=0.049), postoperative visual symptoms (frequency, P=0.043; severity, P<0.001; bothersome, P=0.018), difficulty driving at night (P=0.001), and accommodative amplitude (AMP, P=0.020). Multivariate analysis confirmed that female sex (P=0.024), severity of visual symptoms (P=0.009), and difficulty driving at night (P=0.006) were significantly associated with lower satisfaction. The dissatisfied group showed higher rates of starbursts, double or multiple images, and high myopia, but lower age. The frequency, severity, and bothersomeness of distortion decreased with increasing age. CONCLUSION: Patient satisfaction 5 y after SMILE and FS-LASIK is high and stable. Difficulty driving at night, sex, and severity of visual symptoms are important factors influencing patient satisfaction. Special attention should be paid to younger highly myopic female patients, particularly those with starbursts and double or multiple images. It is crucial to monitor postoperative visual outcomes and provide patients with comprehensive preoperative counseling to enhance long-term satisfaction.
Keywords: patient satisfaction; myopia; vision; small incision lenticule extraction; femtosecond laser-assisted in situ keratomileusis
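Application of a Vision Transformer with an Optimized Attention Mechanism to Cordyceps Grade Recognition
12
Authors: 刘惠文. 《消费电子》, 2026, No. 4, pp. 248-250 (3 pages)
In the digital era, deep learning has driven innovation in image recognition, yet research on cordyceps grade recognition still relies mainly on manual experience, which is inefficient and highly subjective. This article uses a Vision Transformer (ViT) model to grade cordyceps images. First, it sets out the theoretical basis of visual perception, attention mechanisms, and hierarchical grading, and adjusts and optimizes ViT from two angles: the attention mechanism and the model structure. On this basis, 5-fold cross-validation is performed with the PyTorch framework on a dataset of 5000 images. The experimental results show that the model reaches a precision of 95.2%, a recall of 94.5%, and an F1 score of 94.8%, providing technical support for the intelligent development of the cordyceps industry.
Keywords: Vision Transformer; cordyceps grade recognition; image classification; deep learning; computer vision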
Advances and Prospects in Body-Size Measurement of Sheep:From 2D Vision to 3D Reconstruction and 2D-3D Fusion
13
Authors: DAI Weijiao, LIANG Yudongchen, ZHOU Yong, YAO Chao, ZHANG Cheng, SONG Yongjian, LI Guoliang, TIAN Fang. 《智慧农业(中英文)》, 2026, No. 1, pp. 120-147 (28 pages)
[Significance] In alignment with the national germplasm security strategy, current research efforts are accelerating the adoption of precision breeding in sheep. Within whole-genome selection, accurate phenotyping of body morphometrics is critical for assessing growth performance and breeding value. Traditional manual measurements are inefficient, prone to human error, and may cause stress to sheep, limiting their suitability for precision sheep management. By summarizing the applications of sheep body size measurement technologies and analyzing their development directions, this paper provides theoretical references and practical guidance for the research and application of non-contact sheep body size measurement. [Progress] This review synthesizes progress across three principal methodological paradigms: two-dimensional (2D) image-based techniques, three-dimensional (3D) point cloud-based approaches, and integrated 2D-3D fusion systems. 2D methods, employing either handcrafted geometric features or deep learning-based keypoint detector algorithms, are cost-effective and operationally simple but sensitive to variation in imaging conditions and unable to capture critical circumference metrics. 3D point-cloud approaches enable precise reconstruction of full animal morphology, supporting comprehensive body-size acquisition with higher accuracy, yet face challenges including high hardware costs, complex data workflows, and sensitivity to posture variability. Hybrid 2D-3D fusion systems combine semantic richness from RGB imagery with geometric completeness from point clouds. Having been effectively validated in other livestock species, e.g., cattle and pigs, these fusion systems have demonstrated excellent performance, providing important technical references and practical insights for sheep body size measurement. [Conclusions and Prospects] Firstly, future research should focus on constructing large-scale, high-quality datasets for sheep body size measurement that encompass diverse breeds, growth stages, and environmental conditions, thereby enhancing model robustness and generalization. Secondly, the development of lightweight artificial intelligence models is essential. Techniques such as model compression, quantization, and algorithmic optimization can substantially reduce computational complexity and storage requirements, facilitating deployment in resource-constrained environments. Thirdly, the 3D point cloud processing pipeline should be streamlined to improve the efficiency of data acquisition, filtering, registration, and segmentation, while promoting the integration of low-cost, high-resilience vision systems into practical farming scenarios. Fourthly, specific emphasis should be placed on improving the accuracy of curved-dimensional measurements, such as chest circumference, abdominal circumference, and shank circumference, through advances in pose standardization, refined 3D segmentation strategies, and multimodal data fusion. Finally, the cross-fertilization of sheep body size measurement technologies with analogous methods for other livestock species offers a promising pathway for mutual learning and collaborative innovation, accelerating the industrialization of automated sheep morphometric systems and supporting the development of intelligent, data-driven pasture management practices.
Keywords: smart breeding; computer vision; image recognition; three-dimensional reconstruction; 2D-3D body measurement
Comparison of binocular vision indices in Parkinson’s disease patients vs age-sex-matched healthy controls
14
Authors: Reyhaneh Shariati-Moghaddam, Ali Shoeibi, Morad Amir Ahmad, Hadi Ostadimoghaddam, Hassan Hashemi, Akbar Derakhshan, Zahra Hemmatian, Abbasali Yekta, Mehdi Khabazkhoob. 《International Journal of Ophthalmology(English edition)》, 2026, No. 3, pp. 549-555 (7 pages)
AIM: To evaluate the differences in near point of convergence (NPC), fusional vergence, saccadic eye movements, versional eye movements, and heterophoria between patients diagnosed with Parkinson’s disease (PD) and healthy subjects. METHODS: A cross-sectional comparative study was conducted, enrolling two cohorts: a PD group and a healthy control group. The PD group was recruited via non-random convenience sampling, while the control group was selected randomly from individuals without PD. All participants were screened according to predefined inclusion and exclusion criteria before undergoing a comprehensive optometric assessment, which included measurements of uncorrected visual acuity, corrected visual acuity, and objective and subjective refraction. Subsequently, binocular vision function evaluations were performed, covering NPC measurement, fusional vergence reserve assessment at both distance and near, saccadic eye movement testing, and versional eye movement and heterophoria assessment. RESULTS: A total of 42 PD patients and 41 healthy controls were included in the final analysis. The two groups were well-matched in terms of sex distribution [29 males (69.0%) in the PD group vs 29 males (70.7%) in the control group, P=0.867] and mean age (55.3±9.6 y in the PD group vs 54.9±9.8 y in the control group, P=0.866). The prevalence of abnormal versional eye movements was significantly higher in the PD group than in the control group (23.81%, 95%CI: 12.05%-39.45% vs 7.32%, 95%CI: 1.54%-19.92%; P=0.025). Near exophoria was more prevalent in PD patients (61.90%, 95%CI: 45.64%-76.43%) than in controls (17.07%, 95%CI: 7.15%-32.06%), with a significant difference [odds ratio (OR)=7.99; 95%CI: 2.83-21.99; P<0.001]. The mean NPC was significantly greater (more receded) in the PD group than in the control group (9.01±3.74 cm vs 7.20±2.15 cm; P=0.007). A statistically significant positive correlation was observed between PD severity and NPC values (Pearson’s correlation coefficient=0.309; P=0.046). Except for distance base-out break and distance base-out recovery values, all other fusional vergence parameters were significantly lower in the PD group than in the control group (P<0.05). The mean saccadic test score was significantly lower in PD patients than in controls (3.29±0.57 vs 3.78±0.42; P<0.001). Among all fusional vergence indices, near base-in blur yielded the highest area under the curve (AUC=0.877), with a sensitivity of 69% and specificity of 90%, followed by distance base-out blur (AUC=0.824, sensitivity=97.6%, specificity=66.7%), near base-out blur (AUC=0.814, sensitivity=76.2%, specificity=72.7%), near base-out break (AUC=0.749, sensitivity=78.6%, specificity=67.6%), and near base-out recovery (AUC=0.749, sensitivity=95.2%, specificity=50%). CONCLUSION: PD is associated with significant binocular vision function impairment, with receded NPC and reduced near fusional vergence reserves being the most prominent disorders. These findings highlight the potential value of binocular vision assessment as a non-invasive biomarker for the early detection and clinical monitoring of PD.
Keywords: Parkinson’s disease; binocular vision; near point of convergence; fusional vergence; saccadic eye movement; heterophoria
Privacy-Preserving Gender-Based Customer Behavior Analytics in Retail Spaces Using Computer Vision
15
Authors: Ginanjar Suwasono Adi, Samsul Huda, Griffani Megiyanto Rahmatullah, Dodit Suprianto, Dinda Qurrota Aini Al-Sefy, Ivon Sandya Sari Putri, Lalu Tri Wijaya Nata Kusuma. 《Computers, Materials & Continua》, 2026, No. 1, pp. 1839-1861 (23 pages)
In the competitive retail industry of the digital era, data-driven insights into gender-specific customer behavior are essential. They support the optimization of store performance, layout design, product placement, and targeted marketing. However, existing computer vision solutions often rely on facial recognition to gather such insights, raising significant privacy and ethical concerns. To address these issues, this paper presents a privacy-preserving customer analytics system through two key strategies. First, we deploy a deep learning framework using YOLOv9s, trained on the RCA-TVGender dataset. Cameras are positioned perpendicular to observation areas to reduce facial visibility while maintaining accurate gender classification. Second, we apply AES-128 encryption to customer position data, ensuring secure access and regulatory compliance. Our system achieved overall performance, with 81.5% mAP@50, 77.7% precision, and 75.7% recall. Moreover, a 90-min observational study confirmed the system’s ability to generate privacy-protected heatmaps revealing distinct behavioral patterns between male and female customers. For instance, women spent more time in certain areas and showed interest in different products. These results confirm the system’s effectiveness in enabling personalized layout and marketing strategies without compromising privacy.
Keywords: business intelligence; customer behavior; privacy-preserving analytics; computer vision; deep learning; smart retail; gender recognition; heatmap; privacy; RCA-TVGender dataset
Advancing Breast Cancer Molecular Subtyping:A Comparative Study of Convolutional Neural Networks and Vision Transformers on Mammograms
16
Authors: Chee Chin Lim, Hui Wen Tiu, Qi Wei Oung, Chiew Chea Lau, Xiao Jian Tan. 《Computers, Materials & Continua》, 2026, No. 3, pp. 1287-1308 (22 pages)
critical for guiding treatment and improving patient outcomes. Traditional molecular subtyping via immuno-histochemistry (IHC) test is invasive, time-consuming, and may not fully represent tumor heterogeneity. This study proposes a non-invasive approach using digital mammography images and deep learning algorithm for classifying breast cancer molecular subtypes. Four pretrained models, including two Convolutional Neural Networks (MobileNet_V3_Large and VGG-16) and two Vision Transformers (ViT_B_16 and ViT_Base_Patch16_Clip_224), were fine-tuned to classify images into HER2-enriched, Luminal, Normal-like, and Triple Negative subtypes. Hyperparameter tuning, including learning rate adjustment and layer freezing strategies, was applied to optimize performance. Among the evaluated models, ViT_Base_Patch16_Clip_224 achieved the highest test accuracy (94.44%), with equally high precision, recall, and F1-score of 0.94, demonstrating excellent generalization. MobileNet_V3_Large achieved the same accuracy but showed less training stability. In contrast, VGG-16 recorded the lowest performance, indicating a limitation in its generalizability for this classification task. The study also highlighted the superior performance of the Vision Transformer models over CNNs, particularly due to their ability to capture global contextual features and the benefit of CLIP-based pretraining in ViT_Base_Patch16_Clip_224. To enhance clinical applicability, a graphical user interface (GUI) named “BCMS Dx” was developed for streamlined subtype prediction. Deep learning applied to mammography has proven effective for accurate and non-invasive molecular subtyping. The proposed Vision Transformer-based model and supporting GUI offer a promising direction for augmenting diagnostic workflows, minimizing the need for invasive procedures, and advancing personalized breast cancer management.
Keywords: artificial intelligence; breast cancer classification; convolutional neural network; deep learning; hyperparameter tuning; mammography; medical imaging; molecular subtypes; vision transformer
KPA-ViT:Key Part-Level Attention Vision Transformer for Foreign Body Classification on Coal Conveyor Belt
17
Authors: Haoxuanye Ji, Zhiliang Chen, Pengfei Jiang, Ziyue Wang, Ting Yu, Wei Zhang. 《Computers, Materials & Continua》, 2026, No. 3, pp. 656-671 (16 pages)
Foreign body classification on coal conveyor belts is a critical component of intelligent coal mining systems. Previous approaches have primarily utilized convolutional neural networks (CNNs) to effectively integrate spatial and semantic information. However, the performance of CNN-based methods remains limited in classification accuracy, primarily due to insufficient exploration of local image characteristics. Unlike CNNs, Vision Transformer (ViT) captures discriminative features by modeling relationships between local image patches. However, such methods typically require a large number of training samples to perform effectively. In the context of foreign body classification on coal conveyor belts, the limited availability of training samples hinders the full exploitation of Vision Transformer’s (ViT) capabilities. To address this issue, we propose an efficient approach, termed Key Part-level Attention Vision Transformer (KPA-ViT), which incorporates key local information into the transformer architecture to enrich the training information. It comprises three main components: a key-point detection module, a key local mining module, and an attention module. To extract key local regions, a key-point detection strategy is first employed to identify the positions of key points. Subsequently, the key local mining module extracts the relevant local features based on these detected points. Finally, an attention module composed of self-attention and cross-attention blocks is introduced to integrate global and key part-level information, thereby enhancing the model’s ability to learn discriminative features. Compared to recent transformer-based frameworks, such as ViT, Swin-Transformer, and EfficientViT, the proposed KPA-ViT achieves performance improvements of 9.3%, 6.6%, and 2.8%, respectively, on the CUMT-BelT dataset, demonstrating its effectiveness.
Keywords: foreign body classification; global and part-level key information; coal conveyor belt; vision transformer (ViT); self and cross attention
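Lightweight Monocular Depth Estimation Based on Vision Transformer
18
Authors: 张凯, 唐嘉宁, 李叶嘉, 马孟星, 周思达. 《现代电子技术》, PKU Core, 2026, No. 4, pp. 64-72 (9 pages)
Depth estimation gives unmanned aerial vehicles accurate three-dimensional perception of their environment, and for edge devices real-time inference and very low computational resource consumption are essential. Most current monocular depth estimation networks focus on improving accuracy when run on high-end GPUs and struggle to meet the real-time requirements of edge devices. To solve this problem, a new encoder-decoder network is proposed for real-time monocular depth estimation on edge devices. The network uses an efficient semantic module to incorporate global semantic information, supplying more object edge detail for depth estimation, and integrates a Transformer-based module at the lowest-resolution level of the encoder-decoder architecture, which greatly reduces the parameters of the vision transformer (ViT). An Upconv layer for depth decoding is also proposed. The network achieves a good trade-off between accuracy and speed; optimized with TensorRT, it delivers real-time inference on NVIDIA Jetson Orin devices and outperforms most current state-of-the-art real-time algorithms.
Keywords: monocular depth estimation network; edge device; encoder; decoder; Transformer technology; vision transformer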