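To meet the requirements that the active sonar system of an underwater unmanned vehicle (UUV) places on real-time signal processing, energy efficiency, and integration, this work adopts modular design and hardware/software co-design and proposes an acceleration scheme for real-time active sonar signal processing algorithms based on a heterogeneous multi-processor system on chip (MPSoC). First, sonar signal processing algorithms suitable for edge deployment are studied. Next, an MPSoC-based accelerated computing architecture is designed in which computationally intensive steps, such as digital down-conversion, forward and inverse fast Fourier transforms, and beamforming, are moved to the programmable logic, yielding significant acceleration. Finally, lower-complexity steps such as target detection are deployed on the processing system for greater flexibility. Simulation and lake trial results show that the proposed scheme processes one frame of echo data within 41% of the data update period and detects moving targets effectively and in real time in complex underwater environments. The scheme has broad application prospects in UUV active sonar detection.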
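Digital down-conversion is one of the steps the abstract assigns to the programmable logic. The abstract does not describe the authors' implementation, so the following is only a minimal software sketch of the general technique: mix the real echo with a complex exponential at the carrier frequency, then low-pass filter and decimate. The function name, the boxcar filter, and the sample rate and carrier values are all assumptions chosen for illustration.

```python
import cmath
import math

def digital_down_convert(samples, fs, f0, decim):
    """Mix real samples to baseband, average over `decim` samples, and decimate."""
    # Mix with a complex exponential at the carrier frequency f0.
    mixed = [x * cmath.exp(-2j * math.pi * f0 * n / fs)
             for n, x in enumerate(samples)]
    # Boxcar low-pass filter fused with decimation: one output per `decim` inputs.
    return [sum(mixed[i:i + decim]) / decim
            for i in range(0, len(mixed) - decim + 1, decim)]

fs, f0 = 8000.0, 1000.0   # hypothetical sample rate and carrier, in Hz
echo = [math.cos(2 * math.pi * f0 * n / fs) for n in range(64)]
baseband = digital_down_convert(echo, fs, f0, decim=8)
```

With these values the averaging window spans two full periods of the double-frequency mixing product, so each baseband sample keeps only the constant term of amplitude 0.5. A hardware version would typically use a proper FIR or CIC filter instead of a boxcar.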
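To study the potential of heterogeneous multi-core systems on chip (multi-processor system on chip, MPSoC) in dense parallel computing tasks, the article designs and implements a dynamic scheduling coprocessor for heterogeneous multi-core systems, suited to coarse-grained data and aimed at task-level parallel applications. It uses mechanisms including on-chip caching, multi-level write-back management of task outputs, automatic task mapping, and out-of-order execution of communication tasks. Experimental results show that the coprocessor not only achieves basic design goals such as task-level out-of-order execution but also has very low scheduling overhead: compared with a scheduler based on the dynamic scoreboard algorithm, it reduces the run time of multiple sub-aperture range compression algorithms by up to 17.13%. The results demonstrate that the coprocessor effectively improves task scheduling in the target scenario.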
We present a broadband and polarization-insensitive unidirectional imager that operates in the visible part of the spectrum, where image formation occurs in one direction, while in the opposite direction, it is blocked. This approach is enabled by deep learning-driven diffractive optical design with wafer-scale nano-fabrication using high-purity fused silica to ensure optical transparency and thermal stability. Our design achieves unidirectional imaging across three visible wavelengths (covering red, green, and blue parts of the spectrum), and we experimentally validated this broadband unidirectional imager by creating high-fidelity images in the forward direction and generating weak, distorted output patterns in the backward direction, in alignment with our numerical simulations. This work demonstrates wafer-scale production of diffractive optical processors, featuring 16 levels of nanoscale phase features distributed across two axially aligned diffractive layers for visible unidirectional imaging. This approach facilitates mass-scale production of ~0.5 billion nanoscale phase features per wafer, supporting high-throughput manufacturing of hundreds to thousands of multi-layer diffractive processors suitable for large apertures and parallel processing of multiple tasks. Beyond broadband unidirectional imaging in the visible spectrum, this study establishes a pathway for artificial-intelligence-enabled diffractive optics with versatile applications, signaling a new era in optical device functionality with industrial-level, massively scalable fabrication.
Dynamic voltage scaling (DVS), supported by many DVS-enabled processors, is an efficient technique for energy-efficient embedded systems. Many researchers work on DVS and have presented various DVS algorithms, some with quite good results. However, the previous algorithms either have a large time complexity or produce results that are sensitive to the number of voltage modes. Fine-grained voltage modes lead to optimal results, but coarse-grained voltage modes yield less optimal ones. A new algorithm based on ant colony optimization is presented, called ant colony optimization voltage and task scheduling (ACO-VTS), which achieves low time complexity through parallelization, together with its linear-time approximation algorithm. Both generate quite good results, saving up to 30% more energy than the previous algorithms under coarse-grained modes, and their results do not depend on the number of modes available.
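The sensitivity to coarse-grained voltage modes can be illustrated with a small sketch. This is not ACO-VTS, whose details the abstract does not give; it is a single-task mode selection under the common assumption that dynamic energy per cycle scales with the square of the supply voltage. The function name and the three (voltage, frequency) modes are hypothetical.

```python
def pick_voltage_mode(cycles, deadline, modes):
    """Return the lowest-energy (voltage, frequency) mode that meets the deadline.

    Assumes dynamic energy per cycle scales with V**2 (capacitance folded
    into the unit), so energy = cycles * V**2 and run time = cycles / f.
    """
    feasible = [(v, f) for v, f in modes if cycles / f <= deadline]
    if not feasible:
        return None  # no mode is fast enough for this deadline
    return min(feasible, key=lambda m: cycles * m[0] ** 2)

modes = [(1.0, 200e6), (1.2, 400e6), (1.5, 600e6)]  # hypothetical modes
choice = pick_voltage_mode(cycles=2e6, deadline=0.008, modes=modes)
```

Here the task needs only 250 MHz to finish in 8 ms, but the 200 MHz mode is too slow, so the selection jumps to the 1.2 V, 400 MHz mode. That gap between the ideal and the available operating point is the loss that coarse-grained modes introduce.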
This report presents the design and implementation of a Distributed Data Acquisition, Monitoring and Processing System (DDAMAP). It is assumed that the operations of a factory are organized into two levels: client machines at the plant level collect real-time raw data from sensors and measurement instruments and transfer them to a central processor over Ethernet, and the central processor handles real-time data processing and monitoring. The system uses the computing power of the Intel T2300 dual-core processor and parallel computation supported by multi-threading techniques. Our experiments show that these techniques can significantly improve system performance and are viable solutions for real-time high-speed data processing.
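The two-level organization described above can be sketched with threads and a shared queue. This is only a structural illustration under stated assumptions: the function names, the in-memory queue standing in for the Ethernet link, and the sample readings are all hypothetical, and the report's actual design is not given in the abstract.

```python
import queue
import threading

def acquire(client_id, readings, out_q):
    """Stand-in for a plant-level client: push raw readings to the central queue."""
    for value in readings:
        out_q.put((client_id, value))

def process(out_q, n_items, totals):
    """Stand-in for the central processor: aggregate readings per client."""
    for _ in range(n_items):
        client_id, value = out_q.get()
        totals[client_id] = totals.get(client_id, 0) + value

raw = {"client_1": [1, 2, 3], "client_2": [10, 20]}  # hypothetical sensor data
q, totals = queue.Queue(), {}
workers = [threading.Thread(target=acquire, args=(cid, vals, q))
           for cid, vals in raw.items()]
consumer = threading.Thread(target=process,
                            args=(q, sum(len(v) for v in raw.values()), totals))
for t in workers + [consumer]:
    t.start()
for t in workers + [consumer]:
    t.join()
```

The sketch models the data flow, not the speedup: in CPython, threads do not run CPU-bound work in parallel, so a performance-oriented version would use processes or native threads.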
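To schedule real-time mixed task sets composed of periodic and sporadic tasks, this paper proposes a real-time mixed-task scheduling algorithm based on Boundary Fair until Zero Laxity (BFZL). Building on the Improved Boundary Fair (I-BF) real-time mixed-task algorithm, it introduces the laxity parameter from the Least Laxity First (LLF) algorithm to refine how task priority is determined, and proposes a heuristic that combines laxity with heuristic strategies to improve task allocation. Experimental results show that BFZL meets the system's real-time requirements and achieves the intended optimization. Comparative data analysis shows that, relative to the original algorithm, the average response time of sporadic tasks falls by about 26%, context switches by about 28%, and migrations by about 50%. The algorithm also has some advantage in scheduling overhead.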
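The laxity parameter borrowed from LLF has a standard definition: the time remaining until the deadline minus the execution time still required. The sketch below shows only that definition and a least-laxity-first ordering; how BFZL combines laxity with boundary fairness is not described in the abstract, and the task names and values are hypothetical.

```python
def laxity(now, deadline, remaining):
    """Slack left before a task must run continuously to meet its deadline."""
    return deadline - now - remaining

def order_by_laxity(now, tasks):
    """Sort (name, deadline, remaining) tuples, least laxity first."""
    return sorted(tasks, key=lambda t: laxity(now, t[1], t[2]))

tasks = [("periodic_a", 20, 5), ("sporadic_b", 12, 8), ("periodic_c", 15, 3)]
ready = order_by_laxity(now=4, tasks=tasks)
# Tasks with zero laxity cannot be delayed any further.
urgent = [name for name, d, r in ready if laxity(4, d, r) == 0]
```

At time 4 the sporadic task has zero laxity (12 - 4 - 8), so it sorts first and must run immediately, while the two periodic tasks still have slack of 8 and 11 time units.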
Funding (broadband unidirectional imager work): The Ozcan Lab at UCLA acknowledges the U.S. Department of Energy (DOE), Office of Basic Energy Sciences, Division of Materials Sciences and Engineering, under award no. DE-SC0023088.
Funding (ACO-VTS dynamic voltage scaling work): the National "973" Basic Research Program of China (2004CB318202).