A highly c-axis-oriented aluminum nitride (AlN) thin film with a smooth, crack-free surface was fabricated by off-normal direct current (DC) sputtering in a pure nitrogen atmosphere. A rotatable substrate holder positioned in the middle of four side targets was the key to ensuring grain growth with no tilt. The effects of substrate angle on the c-axis orientation of the AlN films were investigated by varying the substrate angle from 0° to 90°. Moreover, theoretical analysis and Monte Carlo (MC) simulation reveal that an oblique or even vertical angle could increase the lateral kinetic energy of the sputtered atoms deposited on the growing film. Several characterization techniques, including X-ray diffraction (XRD), (002) peak rocking curves, and scanning electron microscopy (SEM), were used to evaluate the dependence of crystallographic orientation on substrate angle. The results indicate that a larger substrate angle favors (002) growth of the AlN thin film, and a fully c-axis-textured AlN thin film with a small surface roughness (R_a) of 3.32 nm is obtained at 90°.
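The pose requirements of a laser welding robot are derived from an analysis of the laser welding process requirements. The paper focuses on the principle and procedure for solving the robot pose during offline programming of the laser welding robot. The pose solution is implemented with UG Open Grip, the secondary development tool of the UG software, and offline programming and simulation of the laser welding robot are carried out in combination with the KUKASim software.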
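Although batch normalization effectively accelerates the convergence of deep convolutional network models, its data dependencies are complex and cause a severe "memory wall" bottleneck during training. For convolutional neural networks that use batch normalization, a training method is therefore proposed that fuses multiple layers and restructures the batch normalization layer, reducing the amount of memory access during model training. First, the data dependencies and memory access characteristics of the batch normalization layer during training, together with the memory access characteristics of model training as a whole, are analyzed to identify the key factors behind the memory access bottleneck. Second, following the idea of "trading computation for memory access", a method is proposed for fusing the "convolution layer + batch normalization layer + activation layer" structure. Based on its computation and memory access characteristics, the batch normalization layer is restructured into two sub-layers, each fused with its adjacent layer, which further reduces reads and writes to main memory during training. Models of the memory access volume and the computation volume during training are also constructed. Experimental results show that when ResNet-50, Inception V3, and DenseNet are trained on an NVIDIA TESLA V100 GPU, the memory access volume is reduced by 33%, 22%, and 31%, respectively, compared with the original training method, and the actual computational efficiency of the V100 improves by 20.5%, 18.5%, and 18.1%, respectively. The optimization exploits the network structure and the memory access characteristics of model training, and it can be combined with other memory access optimizations to reduce memory access during training even further.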
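The idea of restructuring batch normalization into two sub-layers can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not say exactly how the layer is split, so the sketch assumes a common decomposition in which the first sub-layer accumulates per-channel statistics in a single pass (work that could be fused with writing the convolution output) and the second sub-layer applies normalization, scale/shift, and ReLU in a single pass (work that could be fused with the following layer's input read). All function names are hypothetical.

```python
import math

def bn_relu_reference(x, gamma, beta, eps=1e-5):
    """Unfused baseline: separate passes for mean, variance, normalize, ReLU.
    x[c] holds all values of channel c (batch and spatial dims flattened)."""
    out = []
    for c, vals in enumerate(x):
        mean = sum(vals) / len(vals)                          # pass 1
        var = sum((v - mean) ** 2 for v in vals) / len(vals)  # pass 2
        normed = [gamma[c] * (v - mean) / math.sqrt(var + eps) + beta[c]
                  for v in vals]                              # pass 3
        out.append([max(0.0, v) for v in normed])             # pass 4 (ReLU)
    return out

def bn_stats_sublayer(x):
    """Assumed sub-layer 1: one pass that accumulates sum and sum of squares.
    In a fused kernel this would run while the convolution output is produced."""
    stats = []
    for vals in x:
        total, total_sq = 0.0, 0.0
        for v in vals:
            total += v
            total_sq += v * v
        n = len(vals)
        mean = total / n
        var = max(total_sq / n - mean * mean, 0.0)  # E[x^2] - E[x]^2, clamped
        stats.append((mean, var))
    return stats

def bn_apply_relu_sublayer(x, stats, gamma, beta, eps=1e-5):
    """Assumed sub-layer 2: normalize, scale/shift, and ReLU in one pass.
    In a fused kernel this would run while the next layer reads its input."""
    out = []
    for c, vals in enumerate(x):
        mean, var = stats[c]
        scale = gamma[c] / math.sqrt(var + eps)
        shift = beta[c] - mean * scale
        out.append([max(0.0, v * scale + shift) for v in vals])
    return out

# Two channels with four values each; gamma and beta are per-channel parameters.
x = [[1.0, 2.0, 3.0, 4.0], [-1.0, 0.5, 2.0, 0.0]]
gamma, beta = [1.0, 0.5], [0.0, 0.1]

reference = bn_relu_reference(x, gamma, beta)
fused = bn_apply_relu_sublayer(x, bn_stats_sublayer(x), gamma, beta)
```

The two-sub-layer version touches each value twice instead of four times while producing the same result up to floating-point rounding, which is the "trading computation for memory access" trade-off in miniature. The single-pass variance formula is less numerically robust than the two-pass form, so a production kernel might prefer a different accumulation scheme.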
Funding: financially supported by the National Natural Science Foundation of China (Nos. U1832131 and 51721005), the Beijing Municipal Natural Science Foundation (No. 3202034), and the Natural Science Foundation of Hebei Province (No. E2018402097).