9 articles found
Adaptive implementation of multi-branch convolution with fusion coefficients based on reconfigurable array
1
Authors: Liu Dongyue, Jiang Lin, Wang Mei, Li Yuancheng, Hao Juan. High Technology Letters, 2026, No. 1, pp. 39-48 (10 pages)
Reconfigurable array architectures have become an important hardware platform for edge-side deployment of convolutional neural networks due to their high parallelism and flexible programmability. However, traditional multi-branch convolutional networks suffer from computational redundancy, high memory access overhead, and inefficient branch fusion. Therefore, this paper proposes an adaptive multi-branch convolutional module (AMBC) that integrates software-hardware co-optimization. During training, learnable fusion coefficients are introduced to enable adaptive fusion of multi-scale features, while in the inference phase, the multiple branches and their normalization parameters are merged with the fusion coefficients into a single 3×3 convolutional kernel through operator fusion. On the SIREA-288 reconfigurable platform, compared with unoptimized multi-branch networks, the proposed AMBC reduces external memory accesses by 47.91% and inference latency by 47.20%, achieving a 1.90× speedup. This approach maximizes the utilization of the reconfigurable logic while minimizing both reconfiguration and data-movement overheads in edge inference.
Keywords: reconfigurable array processor; structural re-parameterization; model compression; fusion coefficients; edge-side inference acceleration; hardware-software co-optimization
AI Image Enhancement and Object Recognition Modules for Consumer-Grade Optical Devices
2
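Author: Luo An (骆安). 《科学技术创新》 (Scientific and Technological Innovation), 2026, No. 6, pp. 45-48 (4 pages)
As consumer electronics demand ever better imaging performance, AI image enhancement and object recognition have gradually become important components of optical devices. Drawing on the hardware characteristics of consumer-grade optical systems and development trends in AI algorithms, this paper analyzes the role of the image enhancement module in noise reduction, dynamic range optimization, and color correction, and discusses the application of the object recognition module to scene understanding, real-time detection, and interactive experience. The study shows that hardware-software co-optimization can effectively improve imaging quality and recognition accuracy, providing efficient vision solutions for mobile terminals, wearable devices, and smart home products.
Keywords: consumer-grade optical devices; AI image enhancement; object recognition; hardware-software collaboration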
Adaptive Filtering Method with Joint Transmit-Receive Optimization in Clutter Environments (cited by: 3)
3
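Authors: Wu Xuzi (吴旭姿), Liu Zheng (刘峥), Liu Yunfo (刘韵佛). 《电子与信息学报》 (Journal of Electronics & Information Technology; EI, CSCD, PKU Core), 2013, No. 11, pp. 2657-2663 (7 pages)
To improve the accuracy of amplitude estimation for fluctuating targets in clutter environments, this paper proposes an adaptive filtering method with joint transmit-receive optimization based on the minimum mean square error (MMSE) criterion. First, a set of probing signals is transmitted to obtain amplitude estimates of the scattering centers outside the receive window. This information is then used to adaptively optimize the phase-modulated signal so as to suppress sidelobe interference from strong scattering centers outside the receive window. Finally, the echoes are adaptively filtered according to the amplitude statistics of each scattering center. The method realizes closed-loop feedback from the receiver to the transmitter and, in processing multi-pulse echoes, improves estimation accuracy while reducing computational complexity. Simulation results demonstrate the effectiveness of the method.
Keywords: cognitive radar; adaptive filtering; joint transmit-receive optimization; waveform design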
Technology Evolution of Advanced CMOS Manufacturing Processes and Reflections on Independent Development
4
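Authors: Zhang Wei (张卫), Xu Min (徐敏), Chen Kun (陈鲲), Liu Tao (刘桃), Yang Jingwen (杨静雯), Sun Xin (孙新), Huang Ziqiang (黄自强), Wang Dawei (汪大伟), Wu Chunlei (吴春蕾), Wang Chen (王晨), Xu Saisheng (徐赛生). 《前瞻科技》 (Science and Technology Foresight), 2022, No. 3, pp. 52-60 (9 pages)
The rapid development of the information society has greatly increased the demand for high-performance computing. Advanced complementary metal-oxide-semiconductor (CMOS) manufacturing processes underpin the fabrication of high-performance computing chips and have therefore become a technological high ground contested by the world's leading design companies and chip manufacturers. This article outlines why the technology inevitably evolves from the fin field-effect transistor (FinFET) to the gate-all-around field-effect transistor (GAAFET), and the challenges this brings in process modules, system integration, and non-destructive process characterization. Innovation in advanced CMOS manufacturing technology requires a shift in thinking from device development to system design, and design-technology co-optimization (DTCO) will play an increasingly important role. Looking toward the future development of domestic advanced CMOS manufacturing processes, the article offers suggestions and measures for technology development and talent cultivation.
Keywords: nanosheet; gate-all-around; parasitic channel; parasitic resistance/capacitance; channel stress; design-technology co-optimization; non-destructive characterization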
Collaborative Optimization of Minimum Assembly Time on Mixed-Model Assembly Lines (cited by: 2)
5
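Authors: Cui Yonghua (崔永华), Zuo Dunwen (左敦稳), Shen Bingmei (沈冰妹), Jiao Guangming (焦光明). 《中国制造业信息化(学术版)》 (Manufacture Information Engineering of China, Academic Edition), 2008, No. 12, pp. 42-45 (4 pages)
The paper first analyzes the cross-influence of the balancing problem and the sequencing problem of a mixed-model assembly line on the optimization objective. It then applies an improved parallel optimization algorithm and compares it with the traditional serial optimization algorithm. The analysis shows that the improved collaborative optimization algorithm overcomes the weak global search capability of the serial optimization algorithm and reaches the global optimum, thereby improving the optimization results.
Keywords: assembly line balancing; product sequencing; collaborative optimization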
Hardware-Software Collaborative Techniques for Runtime Profiling and Phase Transition Detection (cited by: 1)
6
Authors: Youfeng Wu, Yong-Fong Lee. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2005, No. 5, pp. 665-675 (11 pages)
Dynamic optimization relies on runtime profile information to improve the performance of program execution. Traditional profiling techniques incur significant overhead and are not suitable for dynamic optimization. In this paper, a new profiling technique is proposed that combines the strengths of both software and hardware to achieve near-zero-overhead profiling. The compiler passes profiling requests as a few bits of information in branch instructions to the hardware, and the processor executes profiling operations asynchronously in available free slots or on dedicated hardware. The compiler instrumentation of this technique is implemented using an Itanium research compiler. The results show that accurate block profiling incurs very little overhead to the user program in terms of program scheduling cycles. For example, the average overhead is 0.6% for the SPECint95 benchmarks. The hardware support required for the new profiling is practical. The technique is extended to collect edge profiles for continuous phase transition detection. It is believed that the hardware-software collaborative scheme will enable many profile-driven dynamic optimizations for EPIC processors such as the Itanium processors.
Keywords: runtime profiling; dynamic optimizations; phase transition detection; hardware-software collaboration
TikTak: A Scalable Simulator of Wireless Sensor Networks Including Hardware/Software Interaction
7
Authors: Francesco Menichelli, Mauro Olivieri. Wireless Sensor Network, 2010, No. 11, pp. 815-822 (8 pages)
We present a simulation framework for wireless sensor networks developed to allow design exploration and complete microprocessor-instruction-level debugging of network formation, data congestion, and node interaction, all in one simulation environment. A specifically innovative feature is the co-emulation of selected nodes at the clock-cycle-accurate hardware processing level, allowing code debugging and exact execution latency evaluation (considering both protocol stack and application), together with other nodes at the abstract protocol level, meeting a designer’s needs for simulation speed, scalability and reliability. The simulator is centered on the Zigbee protocol and can be retargeted for different node micro-architectures.
Keywords: WSN; simulation; hardware-software co-emulation
Feasibility study of large-scale mass customization 3D printing framework system with a case study on Nanjing Happy Valley East Gate
8
Authors: Philip F. Yuan, Hooi Shan Beh, Xuezhou Yang, Liming Zhang, Tianyi Gao. Frontiers of Architectural Research (CSCD), 2022, No. 4, pp. 670-680 (11 pages)
At present, the development and implementation of digital transformation are the keys to promoting high-quality industry development. The new digital fabrication method of robotic 3D printing is being widely studied as a way to tackle the declining productivity of traditional construction methods. Although many studies have been done, most current 3D printing projects face limitations in terms of scale. To bridge this gap, this article proposes a mass customization 3D printing framework system for large-scale projects. It discusses how mass customization is made possible through the joint operation of the FUROBOT software and 3D printing hardware. Taking the east gate of Nanjing Happy Valley Plaza as a case study, the article demonstrates and studies the feasibility of the large-scale mass customization 3D printing framework system.
Keywords: mass customization; 3D printing; hardware-software integration; human-machine collaboration; digital fabrication
FASS-pruner: customizing a fine-grained CNN accelerator-aware pruning framework via intra-filter splitting and inter-filter shuffling
9
Authors: Xiaohui Wei, Xinyang Zheng, Chenyang Wang, Guangli Li, Hengshan Yue. CCF Transactions on High Performance Computing, 2023, No. 3, pp. 292-303 (12 pages)
With the increasing depth of CNNs, the computation and storage requirements of their weights expand significantly, preventing wide deployment in resource-constrained application scenarios such as embedded systems. To improve the efficiency of the deep CNN inference stage, researchers have explored weight pruning techniques on CNN accelerators (e.g., systolic arrays) to avoid storing and computing unimportant weights. However, these attempts either incur expensive extra hardware costs to encode/decode the irregular sparse weight pattern on accelerators or bring limited performance improvement due to structured pruning’s modest compression ratio. To address this challenge, this paper proposes FASS-Pruner, a Fine-grained Accelerator-aware pruning framework via intra-filter Splitting and inter-filter Shuffling: (1) considering the round-by-round execution behavior of the CNN accelerator, FASS-Pruner splits filters into multiple rounds to perform column-wise weight pruning; (2) leveraging the calculation independence across filters on CNN accelerators, FASS-Pruner shuffles the filters to prune the unimportant row-wise weights on the CNN accelerator. Combining the sparse pattern of the pruned CNN and the dataflow of the systolic array, we modify the systolic array-based accelerator to enable it to execute the pruned sparse CNN with better performance and lower energy consumption. By condensing the pruned sparse weights in systolic arrays, FASS-Pruner achieves a comparable pruning ratio while preserving the original dataflow of CNN accelerators, thereby achieving significant performance and energy savings.
Keywords: CNN accelerator; model pruning; hardware-software co-design