Journal articles: 38,341 results found
1. Geometric parameter identification of bridge precast box girder sections based on deep learning and computer vision (cited by 2)
Authors: JIA Jingwei, NI Youhao, MAO Jianxiao, XU Yinfei, WANG Hao. Journal of Southeast University (English Edition), 2025, Issue 3, pp. 278-285 (8 pages).
To overcome the limitations of low efficiency and reliance on manual processes in the measurement of geometric parameters for bridge prefabricated components, a method based on deep learning and computer vision is developed to identify the geometric parameters. The study utilizes a common precast element for highway bridges as the research subject. First, edge feature points of the bridge component section are extracted from images of the precast component cross-sections by combining the Canny operator with mathematical morphology. Subsequently, a deep learning model is developed to identify the geometric parameters of the precast components, using the extracted edge coordinates from the images as input and the predefined control parameters of the bridge section as output. A dataset is generated by varying the control parameters and noise levels for model training. Finally, field measurements are conducted to validate the accuracy of the developed method. The results indicate that the developed method effectively identifies the geometric parameters of bridge precast components, with an error rate maintained within 5%.
Keywords: bridge precast components; section geometric parameters; size identification; computer vision; deep learning
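The edge-extraction step described in this abstract (Canny operator plus mathematical morphology) can be sketched in miniature. The snippet below is a pure-Python toy on a binary mask, not the paper's implementation (a real pipeline would run on photographs, e.g. with OpenCV's `cv2.Canny` and `cv2.morphologyEx`); all function names are our own. It applies a morphological closing (dilation then erosion) to repair a small gap, then keeps foreground pixels that touch the background as edge feature points.

```python
# Toy stand-in for Canny + morphology edge-point extraction on a binary mask.

def neighbors4(img, r, c):
    """Values of the 4-neighborhood, out-of-bounds treated as background."""
    h, w = len(img), len(img[0])
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        yield img[rr][cc] if 0 <= rr < h and 0 <= cc < w else 0

def dilate(img):
    return [[1 if img[r][c] or any(neighbors4(img, r, c)) else 0
             for c in range(len(img[0]))] for r in range(len(img))]

def erode(img):
    return [[1 if img[r][c] and all(neighbors4(img, r, c)) else 0
             for c in range(len(img[0]))] for r in range(len(img))]

def edge_points(img):
    """Foreground pixels that still touch the background after cleanup."""
    return sorted((r, c) for r in range(len(img)) for c in range(len(img[0]))
                  if img[r][c] and not all(neighbors4(img, r, c)))

# A square cross-section mask with a one-pixel hole; closing fills the hole,
# then the outline pixels become the edge feature points.
mask = [[0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 0, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0]]
closed = erode(dilate(mask))   # morphological closing
edges = edge_points(closed)
```

In the paper, coordinates like `edges` would then feed the deep learning model as input.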
2. Research progress on convolutional neural networks and Vision Transformers in glioma
Authors: 杨浩辉, 徐涛, 王伟, 安良良, 敖用芳, 朱家宝. 《磁共振成像》 (PKU Core), 2026, Issue 1, pp. 168-174 (7 pages).
Gliomas pose major challenges to conventional diagnosis and treatment because of their marked heterogeneity, strong invasiveness, and poor prognosis. Deep learning offers a new path toward precise diagnosis and treatment, with convolutional neural networks (CNNs) and Vision Transformers (ViTs) as the core tools. CNNs, through hierarchical convolution operations, have a natural advantage in extracting local features (e.g., tumor margins and texture details), whereas ViTs, built on self-attention, excel at modeling global context (e.g., cross-regional tumor heterogeneity and multimodal associations). Fusion strategies that integrate fine-grained local features with global contextual information show clear advantages on clinical problems such as blurred glioma boundaries and heterogeneous cross-modal data. This review surveys progress on key clinical tasks including glioma detection and segmentation, pathological grading, molecular subtyping, and prognosis assessment, covering the principles, standalone applications, and fusion strategies of the two architectures. It also discusses open challenges, such as heavy reliance on annotated data and limited model interpretability, and outlines future directions, including lightweight architectures, self-supervised learning, and multi-omics fusion, to provide a systematic reference for intelligent glioma diagnosis.
Keywords: glioma; deep learning; convolutional neural network; Vision Transformer; magnetic resonance imaging
3. A standard-plane recognition method for fetal cranial ultrasound based on a conditional generative adversarial network and Vision Transformer
Authors: 李惠莲, 林艺榕, 刘中华, 柳培忠. 《临床超声医学杂志》, 2026, Issue 2, pp. 164-169 (6 pages).
Fetal cranial ultrasound is a vital part of routine prenatal screening, and accurately identifying standard planes is essential for assessing fetal brain development. However, variable image quality and the complexity of plane acquisition make this identification challenging. This paper proposes a method for recognizing standard fetal cranial ultrasound planes based on a conditional generative adversarial network (CGAN) and a Vision Transformer. The CGAN augments the original data by generating additional standard and non-standard plane images, alleviating data scarcity, while a YOLOv9 model automatically crops the skull region in the ultrasound images to remove irrelevant information and keep the model focused on the key area. In the classification stage, a Vision Transformer operates on inputs that are normalized and resized, with augmentation techniques such as random horizontal or vertical flipping, contrast adjustment, center cropping, and saturation adjustment. Compared with the current best-performing CSWin Transformer baseline, the proposed method performs well on standard-plane recognition, achieving precision, recall, F1 score, and accuracy of 92.5%, 92.3%, 92.4%, and 93.3%, respectively. The method offers a clear advantage in recognition accuracy and provides effective technical support for clinical ultrasound examination.
Keywords: conditional generative adversarial network; Vision Transformer; cranial ultrasound; fetus; standard plane recognition
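The augmentation operations named in this abstract (flips, contrast adjustment, center cropping; saturation needs color channels and is omitted here) are standard. As a hedged illustration, a pure-Python toy on a grayscale image stored as nested lists, with our own function names, not the authors' code (which presumably uses a framework such as torchvision):

```python
# Toy grayscale versions of the augmentations listed in the abstract.

def hflip(img):
    """Horizontal-flip building block: mirror each row."""
    return [row[::-1] for row in img]

def adjust_contrast(img, factor):
    """Scale each pixel's deviation from the global mean, clamped to 0-255."""
    flat = [p for row in img for p in row]
    mean = sum(flat) / len(flat)
    return [[max(0, min(255, round(mean + factor * (p - mean))))
             for p in row] for row in img]

def center_crop(img, size):
    """Keep a size x size window around the image center."""
    h, w = len(img), len(img[0])
    r0, c0 = (h - size) // 2, (w - size) // 2
    return [row[c0:c0 + size] for row in img[r0:r0 + size]]

img = [[0, 128, 255],
       [10, 128, 246],
       [0, 128, 255]]
flipped = hflip(img)
flat_gray = adjust_contrast(img, 0)   # factor 0 collapses pixels to the mean
cropped = center_crop(img, 1)
```

A training pipeline would apply such transforms randomly per sample; here factor 0 is used only to make the contrast behavior easy to verify.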
4. Approximate-Guided Representation Learning in Vision Transformer
Authors: Kaili Wang, Xinwei Sun, Huijie He, Fenhua Bai, Tao Shen. CAAI Transactions on Intelligence Technology, 2025, Issue 5, pp. 1459-1477 (19 pages).
In recent years, the transformer model has demonstrated excellent performance in computer vision (CV) applications. The key lies in its guided representation attention mechanism, which uses dot products to describe complex feature relationships and builds a comprehensive understanding of contextual semantics to obtain feature weights; feature enhancement is then implemented by guiding the target matrix through these weights. However, feature uncertainty and inconsistency are widespread and easily confuse the relationship descriptions within dot-product attention. To solve this problem, this paper proposes a novel approximate-guided representation learning methodology for the Vision Transformer. A kernelised matroid fuzzy rough set is defined, wherein the closed sets inside kernelised fuzzy information granules of matroid structures constitute the lower-approximation subspace of rough sets. The kernel relation is thus employed to characterise image feature granules, which are reconstructed according to the independent sets of matroid theory. Then, exploiting the properties of closed sets within matroids, the feature attention weight is formed from the lower approximation to realise approximate guidance of features. The approximate-guided representation mechanism can be flexibly deployed as a plug-and-play component in a wide range of CV tasks. Extensive empirical results demonstrate that the proposed method outperforms the majority of advanced prevalent models, especially in terms of robustness.
Keywords: computer vision; deep learning; image representation; kernel methods; rough sets
5. The Role of Artificial Intelligence in Enhancing Financial Reporting Quality: Evidence from Saudi Arabia's Vision 2030 Transformation
Author: Amal Yamani. Journal of Modern Accounting and Auditing, 2025, Issue 4, pp. 237-251 (15 pages).
As it drives a significant transformation under Saudi Arabia's Vision 2030 initiative, artificial intelligence (AI) is changing the course of corporate systems, including financial reporting. This research examines the role of AI in advancing financial reporting quality (FRQ) amid the Kingdom's movement toward an improved economy and governance. Using a qualitative methodology informed by semi-structured interviews with senior finance leaders, auditors, and regulatory professionals in key sectors, the study details how AI technologies are being deployed today and how they can effectively improve reporting accuracy, timeliness, transparency, and regulatory compliance. It outlines several dimensions along which AI advances FRQ: automating complicated, data-intensive tasks, examining and identifying irregularities, and contributing to real-time decision making. Participants explained that AI reinforces FRQ by supporting ethical and transparent governance and enabling human-AI collaborative decision-making. The findings relate to agency and stakeholder theories, supporting the notion that AI reduces information asymmetry and builds trust with investors and regulators. This study adds to the small number of qualitative studies on AI and financial governance in emerging economies and has important implications for policymakers, corporate actors, and standard setters. Moreover, it demonstrates the need for a collaborative national AI governance approach to realize the full potential of digital transformation for financial reporting standards. Future studies may pursue longitudinal or cross-country comparative designs to further develop these insights.
Keywords: artificial intelligence; financial reporting quality; Vision 2030; AI governance; Saudi Arabia
6. Enhanced Plant Species Identification through Metadata Fusion and Vision Transformer Integration
Authors: Hassan Javed, Labiba Gillani Fahad, Syed Fahad Tahir, Mehdi Hassan, Hani Alquhayz. Computers, Materials & Continua, 2025, Issue 11, pp. 3981-3996 (16 pages).
Accurate plant species classification is essential for many applications, such as biodiversity conservation, ecological research, and sustainable agricultural practices. Traditional morphological classification methods are inherently slow, labour-intensive, and prone to inaccuracies, especially when distinguishing between species exhibiting visual similarities or high intra-species variability. To address these limitations and to overcome the constraints of image-only approaches, we introduce a novel artificial-intelligence-driven framework. This approach integrates robust Vision Transformer (ViT) models for advanced visual analysis with a multi-modal data fusion strategy, incorporating contextual metadata such as precise environmental conditions, geographic location, and phenological traits. This combination of visual and ecological cues significantly enhances classification accuracy and robustness, proving especially vital in complex, heterogeneous real-world environments. The proposed model achieves a test accuracy of 97.27% and a Mean Reciprocal Rank (MRR) of 0.9842, demonstrating strong generalization capabilities. Furthermore, efficient utilization of high-performance GPU resources (RTX 3090, 18 GB memory) ensures scalable processing of high-dimensional data. Comparative analysis consistently confirms that our metadata fusion approach substantially improves classification performance, particularly for morphologically similar species, and through principled self-supervised and transfer learning from ImageNet, the model adapts efficiently to new species, ensuring enhanced generalization. This comprehensive approach holds profound practical implications for precise conservation initiatives, rigorous ecological monitoring, and advanced agricultural management.
Keywords: Vision Transformers (ViTs); transformers; machine learning; deep learning; plant species classification; multi-organ
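The metadata fusion idea in this abstract can be sketched in its simplest form: concatenate a visual embedding with encoded metadata and score the fused vector. This is a toy under our own assumptions, the paper's model is a ViT with its own fusion design, and every name, dimension, and weight below is hypothetical:

```python
# Minimal late-fusion sketch: image embedding + encoded metadata -> one score.

def fuse(image_embedding, metadata_features):
    """Late fusion by simple concatenation."""
    return image_embedding + metadata_features

def linear_score(features, weights, bias):
    """One linear unit over the fused feature vector."""
    return sum(f * w for f, w in zip(features, weights)) + bias

image_embedding = [0.2, 0.7]    # stand-in for ViT [CLS] features
metadata = [1.0, 0.0, 0.5]      # e.g. one-hot season + normalized latitude
fused = fuse(image_embedding, metadata)
score = linear_score(fused, weights=[0.5, 0.5, 1.0, 1.0, 2.0], bias=0.1)
```

In a real model the linear unit would be a learned classification head producing one score per species; concatenation is only one of several fusion choices (cross-attention is another).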
7. High-precision copper-grade identification via a vision transformer with PGNAA
Authors: Jie Cao, Chong-Gui Zhong, Han-Ting You, Yan Zhang, Ren-Bo Wang, Shu-Min Zhou, Jin-Hui Qu, Rui Chen, Shi-Liang Liu. Nuclear Science and Techniques, 2025, Issue 7, pp. 89-99 (11 pages).
The identification of ore grades is a critical step in mineral resource exploration and mining. Prompt gamma neutron activation analysis (PGNAA) technology employs gamma rays generated by the nuclear reactions between neutrons and samples to achieve the qualitative and quantitative detection of sample components. In this study, we present a novel method for identifying copper grade by combining the vision transformer (ViT) model with the PGNAA technique. First, a Monte Carlo simulation is employed to determine the optimal sizes of the neutron moderator and thermal neutron absorption material and the dimensions of the device. Subsequently, based on the parameters obtained through optimization, a PGNAA copper ore measurement model is established. The gamma spectrum of the copper ore is analyzed using the ViT model, whose hyperparameters are optimized with a grid search. To ensure the reliability of the identification results, the test results are obtained through five repeated tenfold cross-validations. Long short-term memory and convolutional neural network models are compared with the ViT method. The results indicate that the ViT method is efficient in identifying copper ore grades, with average accuracy, precision, recall, F1-score, and F̄1-score values of 0.9795, 0.9637, 0.9614, 0.9625, and 0.9942, respectively. When identifying associated minerals, the ViT model can identify Pb, Zn, Fe, and Co minerals with identification accuracies of 0.9215, 0.9396, 0.9966, and 0.8311, respectively.
Keywords: copper-grade identification; vision transformer model; prompt gamma neutron activation analysis; Monte Carlo N-Particle
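The "five repeated tenfold cross-validations" protocol mentioned in this abstract can be sketched as index generation alone. A stdlib-only sketch; the authors presumably used a library splitter (e.g. scikit-learn's RepeatedKFold), and the seed and function names here are our own:

```python
# Index generation for repeated k-fold cross-validation.
import random

def repeated_kfold(n_samples, k=10, repeats=5, seed=0):
    """Yield (repeat, fold, test_indices); each sample lands in exactly one
    test fold per repeat, with a fresh shuffle every repeat."""
    rng = random.Random(seed)
    for rep in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        for fold in range(k):
            yield rep, fold, order[fold::k]   # every k-th shuffled index

splits = list(repeated_kfold(100, k=10, repeats=5))   # 5 x 10 = 50 test folds
```

For each of the 50 folds one would train on the complement of `test_indices` and evaluate on the fold, then average the metrics; grid search would wrap this loop over candidate hyperparameters.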
8. Total score of the computer vision syndrome questionnaire predicts refractive errors and binocular vision anomalies
Authors: Mosaad Alhassan, Tasneem Samman, Hatoun Badukhen, Muhamad Alrashed, Balsam Alabdulkader, Essam Almutleb, Tahani Alqahtani, Ali Almustanyir. International Journal of Ophthalmology (English edition), 2026, Issue 1, pp. 90-96 (7 pages).
AIM: To evaluate the efficacy of the total computer vision syndrome questionnaire (CVS-Q) score as a predictive tool for identifying individuals with symptomatic binocular vision anomalies and refractive errors. METHODS: A total of 141 healthy computer users underwent comprehensive clinical visual function assessments, including evaluations of refractive errors, accommodation (amplitude of accommodation, positive relative accommodation, negative relative accommodation, accommodative accuracy, and accommodative facility), and vergence (phoria, positive and negative fusional vergence, near point of convergence, and vergence facility). Total CVS-Q scores were recorded to explore potential associations between symptom scores and the aforementioned clinical visual function parameters. RESULTS: The cohort included 54 males (38.3%) with a mean age of 23.9±0.58y and 87 age-matched females (61.7%) with a mean age of 23.9±0.53y. The multiple regression model was statistically significant [R²=0.60, F=13.28, DF=(17, 122), P<0.001]. This indicates that 60% of the variance in total CVS-Q scores (reflecting reported symptoms) could be explained by the clinical measurements: amplitude of accommodation, positive relative accommodation, exophoria at distance and near, and positive fusional vergence at near. CONCLUSION: The total CVS-Q score is a valid and reliable tool for predicting the presence of various nonstrabismic binocular vision anomalies and refractive errors in symptomatic computer users.
Keywords: computer vision syndrome; refractive errors; accommodation; vergence; binocular vision; symptoms
9. Video action recognition meets vision-language models exploring human factors in scene interaction: a review
Authors: GUO Yuping, GAO Hongwei, YU Jiahui, GE Jinchao, HAN Meng, JU Zhaojie. Optoelectronics Letters, 2025, Issue 10, pp. 626-640 (15 pages).
Video action recognition (VAR) aims to analyze dynamic behaviors in videos and achieve semantic understanding. VAR faces challenges such as temporal dynamics, action-scene coupling, and the complexity of human interactions. Existing methods can be categorized into motion-level, event-level, and story-level ones based on spatiotemporal granularity. However, single-modal approaches struggle to capture complex behavioral semantics and human factors. Therefore, in recent years, vision-language models (VLMs) have been introduced into this field, providing new research perspectives for VAR. In this paper, we systematically review spatiotemporal hierarchical methods in VAR and explore how the introduction of large models has advanced the field. Additionally, we propose the concept of "Factor" to identify and integrate key information from both visual and textual modalities, enhancing multimodal alignment. We also summarize various multimodal alignment methods and provide in-depth analysis and insights into future research directions.
Keywords: human factors; video action recognition; vision-language models; spatiotemporal granularity; multimodal alignment; scene interaction
10. From microstructure to performance optimization: Innovative applications of computer vision in materials science
Authors: Chunyu Guo, Xiangyu Tang, Yu'e Chen, Changyou Gao, Qinglin Shan, Heyi Wei, Xusheng Liu, Chuncheng Lu, Meixia Fu, Enhui Wang, Xinhong Liu, Xinmei Hou, Yanglong Hou. International Journal of Minerals, Metallurgy and Materials, 2026, Issue 1, pp. 94-115 (22 pages).
The rapid advancements in computer vision (CV) technology have transformed the traditional approaches to material microstructure analysis. This review outlines the history of CV and explores the applications of deep-learning (DL)-driven CV in four key areas of materials science: microstructure-based performance prediction, microstructure information generation, microstructure defect detection, and crystal structure-based property prediction. CV has significantly reduced the cost of the traditional experimental methods used in material performance prediction. Moreover, recent progress in generating microstructure images and detecting microstructural defects using CV has increased the efficiency and reliability of material performance assessments. DL-driven CV models can accelerate the design of new materials with optimized performance by integrating predictions based on both crystal and microstructural data, thereby allowing for the discovery and innovation of next-generation materials. Finally, the review provides insights into the rapid interdisciplinary developments in the field of materials science and future prospects.
Keywords: microstructure; deep learning; computer vision; performance prediction; image generation
11. Ultrathin Gallium Nitride Quantum-Disk-in-Nanowire-Enabled Reconfigurable Bioinspired Sensor for High-Accuracy Human Action Recognition
Authors: Zhixiang Gao, Xin Ju, Huabin Yu, Wei Chen, Xin Liu, Yuanmin Luo, Yang Kang, Dongyang Luo, JiKai Yao, Wengang Gu, Muhammad Hunain Memon, Yong Yan, Haiding Sun. Nano-Micro Letters, 2026, Issue 2, pp. 439-453 (15 pages).
Human action recognition (HAR) is crucial for the development of efficient computer vision, where bioinspired neuromorphic perception visual systems have emerged as a vital solution to address transmission bottlenecks across sensor-processor interfaces. However, the absence of interactions among versatile biomimicking functionalities within a single device, which was developed for specific vision tasks, restricts the computational capacity, practicality, and scalability of in-sensor vision computing. Here, we propose a bioinspired vision sensor composed of a GaN/AlN-based ultrathin quantum-disks-in-nanowires (QD-NWs) array to mimic not only Parvo cells for high-contrast vision and Magno cells for dynamic vision in the human retina but also the synergistic activity between the two cell types for in-sensor vision computing. By simply tuning the applied bias voltage on each QD-NW-array-based pixel, we achieve two biosimilar photoresponse characteristics, with slow and fast reactions to light stimuli that enhance the in-sensor image quality and HAR efficiency, respectively. Strikingly, the interplay and synergistic interaction of the two photoresponse modes within a single device markedly increased the HAR accuracy from 51.4% to 81.4% owing to the integrated artificial vision system. The demonstrated intelligent vision sensor offers a promising device platform for the development of highly efficient HAR systems and future smart optoelectronics.
Keywords: GaN nanowire; quantum-confined Stark effect; voltage-tunable photoresponse; bioinspired sensor; artificial vision system
12. Functional outcome and patient satisfaction 5 years after laser vision correction
Authors: Ran Gao, Yu Han, Jie Qin, Yu-Shan Xu, Yu Li, Xiao-Tong Lyu, Feng-Ju Zhang. International Journal of Ophthalmology (English edition), 2026, Issue 1, pp. 123-131 (9 pages).
AIM: To investigate the association between functional outcomes and postoperative patient satisfaction 5 years after small incision lenticule extraction (SMILE) and femtosecond laser-assisted in situ keratomileusis (FS-LASIK). METHODS: This is a cross-sectional study. The patients underwent basic ophthalmic examinations, axial length measurement, wide-field fundus photography, and accommodation function testing. Behavioral habits data were collected using a self-administered questionnaire, and visual symptoms were assessed with the Quality of Vision (QoV) questionnaire. Postoperative satisfaction was also recorded. RESULTS: In total, 410 subjects [820 eyes; 160 males (39.02%) and 250 females (60.98%)] who had undergone SMILE or FS-LASIK 5 years earlier were enrolled. The mean (standard deviation, SD) age of all patients was 29.83 (6.69) years. The mean (SD) preoperative manifest SE was -5.80 (2.04) diopters (D; range: -0.88 to -13.75). Patient satisfaction at 5 years after undergoing SMILE or FS-LASIK was 91.70%. Patients were categorized into two groups: a dissatisfied group and a satisfied group. Significant differences were observed between the two groups in terms of age (P=0.012), sex (P=0.021), preoperative degree of myopia (P=0.049), postoperative visual symptoms (frequency, P=0.043; severity, P<0.001; bothersomeness, P=0.018), difficulty driving at night (P=0.001), and accommodative amplitude (AMP, P=0.020). Multivariate analysis confirmed that female sex (P=0.024), severity of visual symptoms (P=0.009), and difficulty driving at night (P=0.006) were significantly associated with lower satisfaction. The dissatisfied group showed higher rates of starbursts, double or multiple images, and high myopia, but lower age. The frequency, severity, and bothersomeness of distortion decreased with increasing age. CONCLUSION: Patient satisfaction 5 years after SMILE and FS-LASIK is high and stable. Difficulty driving at night, sex, and severity of visual symptoms are important factors influencing patient satisfaction. Special attention should be paid to younger, highly myopic female patients, particularly those with starbursts and double or multiple images. It is crucial to monitor postoperative visual outcomes and provide patients with comprehensive preoperative counseling to enhance long-term satisfaction.
Keywords: patient satisfaction; myopia; vision; small incision lenticule extraction; femtosecond laser-assisted in situ keratomileusis
13. Privacy-Preserving Gender-Based Customer Behavior Analytics in Retail Spaces Using Computer Vision
Authors: Ginanjar Suwasono Adi, Samsul Huda, Griffani Megiyanto Rahmatullah, Dodit Suprianto, Dinda Qurrota Aini Al-Sefy, Ivon Sandya Sari Putri, Lalu Tri Wijaya Nata Kusuma. Computers, Materials & Continua, 2026, Issue 1, pp. 1839-1861 (23 pages).
In the competitive retail industry of the digital era, data-driven insights into gender-specific customer behavior are essential. They support the optimization of store performance, layout design, product placement, and targeted marketing. However, existing computer vision solutions often rely on facial recognition to gather such insights, raising significant privacy and ethical concerns. To address these issues, this paper presents a privacy-preserving customer analytics system built on two key strategies. First, we deploy a deep learning framework using YOLOv9s, trained on the RCA-TVGender dataset. Cameras are positioned perpendicular to observation areas to reduce facial visibility while maintaining accurate gender classification. Second, we apply AES-128 encryption to customer position data, ensuring secure access and regulatory compliance. Our system achieved solid overall performance, with 81.5% mAP@50, 77.7% precision, and 75.7% recall. Moreover, a 90-min observational study confirmed the system's ability to generate privacy-protected heatmaps revealing distinct behavioral patterns between male and female customers. For instance, women spent more time in certain areas and showed interest in different products. These results confirm the system's effectiveness in enabling personalized layout and marketing strategies without compromising privacy.
Keywords: business intelligence; customer behavior; privacy-preserving analytics; computer vision; deep learning; smart retail; gender recognition; heatmap; privacy; RCA-TVGender dataset
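The encrypt-before-store step for position records in this abstract can be illustrated in shape only. The XOR keystream below (derived with SHA-256) is a deliberately simple, standard-library-only stand-in: it is NOT AES-128 and not the paper's code; a real deployment would use a vetted AES implementation (e.g. the `cryptography` package). The point is the pipeline: serialize the record, encrypt it, store only ciphertext.

```python
# Toy encrypt-before-store pipeline for a customer position record.
import hashlib
import json

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Counter-style keystream blocks from SHA-256 (illustration only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; applying it twice decrypts."""
    ks = keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

record = json.dumps({"x": 3.2, "y": 7.5, "gender": "F"}).encode()
ciphertext = xor_cipher(b"demo-key", b"nonce-01", record)   # store this
plaintext = xor_cipher(b"demo-key", b"nonce-01", ciphertext)  # authorized read
```

With AES-128 the `xor_cipher` calls would be replaced by an authenticated mode such as AES-GCM, with a unique nonce per record and keys held outside the analytics store.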
14. Design and simulation of a robotic machining system for tube-sheet hole grooving based on Roboguide virtual vision
Authors: 刘炜, 蒋立君, 唐嘉强, 刘学刚, 胡兴. 《现代制造工程》 (PKU Core), 2026, Issue 1, pp. 153-159, 107 (8 pages).
With China's push toward intelligent manufacturing and the integration of informatization and industrialization, virtual simulation of robots is becoming increasingly important. A robotic machining system for grooving tube-sheet holes was built on the iRVision virtual vision module of FANUC's Roboguide software. The paper describes camera calibration, vision-program configuration, and simulation procedures for the system, simulating a real camera to realize full-process simulation from image acquisition through image processing and position compensation to motion control. This avoids building a physical robot vision system, saving the hardware cost of cameras, industrial PCs, and robots, and simplifies secondary development of vision algorithms and hardware communication, making feasibility verification of robot vision projects faster and more convenient. Layout optimization and vision simulation verified the feasibility of the grooving system, which can be extended to processes such as tube-sheet hole expansion and welding, providing an important reference for automating and upgrading manufacturing processes for general machinery.
Keywords: Roboguide software; robot; vision system; tube-sheet hole grooving; simulation
15. Lightweight monocular depth estimation based on Vision Transformer
Authors: 张凯, 唐嘉宁, 李叶嘉, 马孟星, 周思达. 《现代电子技术》 (PKU Core), 2026, Issue 4, pp. 64-72 (9 pages).
Depth estimation gives UAVs precise 3D environmental perception, and for edge devices, real-time inference with very low computational cost is essential. Most monocular depth estimation networks focus on accuracy when running on high-end GPUs and cannot meet the real-time requirements of edge devices. To address this, a new encoder-decoder network is proposed for real-time monocular depth estimation on edge devices. The network merges global semantic information through an efficient semantic module, providing more object-edge detail for depth estimation, and integrates a Transformer-based module only at the lowest-resolution level of the encoder-decoder architecture, greatly reducing the parameters of the Vision Transformer (ViT). An Upconv layer is also proposed for depth decoding. The network achieves a good trade-off between accuracy and speed; with TensorRT optimization it runs in real time on an NVIDIA Jetson Orin device, outperforming most state-of-the-art real-time methods.
Keywords: monocular depth estimation network; edge devices; encoder; decoder; Transformer; Vision Transformer
16. Application of Computer Vision Technique to Maize Variety Identification (cited by 1)
Authors: 孙钟雷, 李宇, 何伟. Agricultural Science & Technology (CAS), 2013, Issue 5, pp. 783-786, 796 (5 pages).
Variety identification is important for maize breeding, processing, and trade, and the computer vision technique has been widely applied to it. In this paper, the technique is summarized from the following technical aspects: image acquisition, image processing, characteristic parameter extraction, pattern recognition, and programming software. In addition, the existing problems in applying this technique to maize variety identification are analyzed, and its development tendency is forecast.
Keywords: maize variety identification; computer vision; image processing; feature extraction; pattern recognition
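The pipeline this review surveys (image acquisition → image processing → feature extraction → pattern recognition) can be sketched minimally. A hypothetical stdlib example, not drawn from any surveyed system: extract two shape features from a binary kernel silhouette, then classify by nearest centroid; the variety names and centroid values are invented for illustration.

```python
# Miniature feature-extraction + pattern-recognition stage.

def shape_features(mask):
    """(area, bounding-box fill ratio) of a binary silhouette."""
    pts = [(r, c) for r, row in enumerate(mask)
           for c, v in enumerate(row) if v]
    area = len(pts)
    rs = [r for r, _ in pts]
    cs = [c for _, c in pts]
    box = (max(rs) - min(rs) + 1) * (max(cs) - min(cs) + 1)
    return (area, area / box)

def nearest_centroid(x, centroids):
    """Label whose centroid is closest to feature vector x (squared Euclidean)."""
    return min(centroids,
               key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])))

kernel = [[0, 1, 1],
          [0, 1, 1],
          [0, 0, 0]]
feats = shape_features(kernel)
label = nearest_centroid(feats, {"variety_a": (4, 1.0), "variety_b": (10, 0.5)})
```

Real systems would add color and texture descriptors and a trained classifier (e.g. an SVM or neural network) in place of the centroid rule.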
17. The Keil μVision2 IDE integrated development environment and simulated debugging of microcontroller programs (Part 1) (cited by 4)
Author: 严天峰. 《电子世界》, 2005, Issue 1, pp. 28-30 (3 pages).
Keywords: IDE; integrated development environment; 51-series microcontroller; kernel; C language; development platform; development target
18. The Keil μVision2 IDE integrated development environment and simulated debugging of microcontroller programs (Part 2) (cited by 1)
Authors: 李敏之, 严天峰. 《电子世界》, 2005, Issue 2, pp. 23-25 (3 pages).
Keywords: port; I/O port; microcontroller; integrated development environment; IDE; kernel; byte; bidirectional; latch; address bus
19. A Video Spectrum Splitting Encoding Scheme Based on Human Vision and Its Computer Simulation
Authors: 赵宇, 李华, 俞斯乐, 滕建辅. Transactions of Tianjin University (EI, CAS), 1995, Issue 1, pp. 76-79 (5 pages).
In this paper, a 3-D video encoding scheme suitable for digital TV/HDTV (high-definition television) is studied through computer simulation. The encoding scheme is designed to provide a good match to human vision. Basically, this involves transmission of low-frequency luminance information at full frame rate for good motion rendition and transmission of the high-frequency luminance signal at reduced frame rate for good detail in static images.
Keywords: 3-D video encoding; discrete wavelet transform; human vision; computer simulation
20. Remaining useful life prediction of gears based on a residual-attention TCN and Vision Transformer
Authors: 胡爱军, 李晨阳, 邢磊, 周卓浩, 向玲. 《航空动力学报》 (PKU Core), 2025, Issue 12, pp. 14-24 (11 pages).
The operating condition of a gear system is affected by multiple factors that exhibit long-term temporal dependencies and differ between local and global features. To capture the temporal dependencies in the data and adaptively adjust attention to features, a temporal convolutional network with a residual convolutional block attention mechanism (RCMTCN) is proposed. By introducing residual connections into the convolutional block attention module, the model attends to both the original input and the attention-weighted information, improving its perception of local information. On this basis, a Vision Transformer (ViT) is combined with the RCMTCN to predict the remaining useful life (RUL) of gears; the ViT effectively captures global information in the data. The fusion exploits the strengths of both local feature extraction and global attention for time-series data, improving perception of multi-dimensional features. The model was validated on gear degradation datasets under two operating conditions, training on pitting-fault data and testing on both pitting and tooth-breakage faults. Experimental results show that, compared with other methods, the proposed approach extracts key feature information more fully, achieving scoring-function values of 0.8898 on pitting faults and 0.8587 on tooth-breakage faults, demonstrating good adaptability across operating conditions and fault types.
Keywords: gear; remaining useful life; temporal network; attention mechanism; Vision Transformer model