A low-light-level night vision system is described that selectively images objects within a specific region of space. By combining an intensified charge-coupled device (ICCD) with a near-infrared laser illuminator whose operating wavelength matches the photocathode of the image tube, and by operating in gated mode with an adjustable time delay, the system images objects at selected distances while ignoring obstructions along the line of sight. A semiconductor laser diode with a peak power of 100 W is chosen for illumination. The laser and the image tube operate with a 150 ns pulse width at a 2 kHz repetition frequency. Using a sampling circuit and a delay control device, clear images of different objects at different distances within 100 m are obtained, including objects behind a grove.
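The gating parameters quoted in this abstract can be related to distance with textbook time-of-flight arithmetic. The sketch below is illustrative only: the function names are hypothetical, the slice-depth formula ignores the overlap between the laser pulse shape and the gate, and nothing here is taken from the authors' delay-control circuit.

```python
# Time-of-flight arithmetic for a range-gated imager.
# Assumption: ideal rectangular pulses and propagation at the vacuum speed of light.
C = 299_792_458.0  # speed of light, m/s


def gate_delay_s(range_m: float) -> float:
    """Round-trip delay between firing the laser and opening the gate."""
    return 2.0 * range_m / C


def gate_depth_m(gate_width_s: float) -> float:
    """Approximate depth of the range slice selected by a gate of this width."""
    return C * gate_width_s / 2.0


def duty_cycle(pulse_width_s: float, rep_rate_hz: float) -> float:
    """Fraction of time the laser is on."""
    return pulse_width_s * rep_rate_hz


if __name__ == "__main__":
    print(f"delay for a target at 100 m: {gate_delay_s(100.0) * 1e9:.0f} ns")
    print(f"slice depth for a 150 ns gate: {gate_depth_m(150e-9):.1f} m")
    print(f"duty cycle at 150 ns, 2 kHz: {duty_cycle(150e-9, 2e3):.1e}")
```

Under these assumptions, a target at 100 m calls for a delay of about 667 ns, and a 150 ns gate selects a slice roughly 22.5 m deep, which is why adjusting the delay lets the system step past foreground obstructions.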
This paper presents the discrete wavelet transform (DWT) and its inverse (IDWT) with Haar wavelets as tools for computing variable-size interpolated versions of an image at optimum computational load. As a human observer moves closer to or farther from a scene, the retinal image of the scene zooms in or out, respectively. This zooming can be modeled using variable-scale interpolation. The paper proposes a novel way of applying the DWT and IDWT in a piecewise manner, by non-uniform down- or up-sampling of the images, to obtain partially sampled versions of the images. The partially sampled versions are then aggregated to produce the final variable-scale interpolated images. The non-uniform down- or up-sampling is a function of the required interpolation scale. Appropriate zero padding makes the images suitable for the required non-uniform sampling and the subsequent interpolation to the required scale. The concept of a zeroth-level DWT is introduced, which serves as the basis for interpolating images to a size larger than the original. The main emphasis is on computing variable-size images at lower computational load without compromising image quality. The images interpolated to different sizes and the reconstructed images are benchmarked using statistical parameters and visual comparison. The proposed approach is found to perform better than bilinear and bicubic interpolation.
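For readers unfamiliar with the Haar transform pair, a minimal one-dimensional sketch follows. It shows one level of the orthonormal Haar DWT and IDWT, plus a simple wavelet-domain upsampling that treats the input as approximation coefficients with zero detail coefficients. That upsampling step is an assumption made for illustration: the abstract does not define the zeroth-level DWT or the piecewise non-uniform sampling, so this should not be read as the paper's method. All function names are hypothetical.

```python
import math

S = math.sqrt(2.0)


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT on an even-length sequence."""
    if len(signal) % 2:
        raise ValueError("length must be even; pad the signal first")
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / S for a, b in pairs]  # low-pass (averages)
    detail = [(a - b) / S for a, b in pairs]  # high-pass (differences)
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuild the signal from both coefficient sets."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / S, (a - d) / S])
    return out


def upsample2(signal):
    """Double the length via the IDWT with all detail coefficients set to zero.

    With Haar wavelets this reduces to sample replication.
    """
    return haar_idwt([S * v for v in signal], [0.0] * len(signal))


if __name__ == "__main__":
    row = [4.0, 6.0, 10.0, 12.0]
    approx, detail = haar_dwt(row)
    print(haar_idwt(approx, detail))  # reconstructs row up to rounding
    print(upsample2([1.0, 2.0]))
```

For images, the same transform is applied separably along rows and then columns. Keeping only the approximation coefficients halves the size, while the zero-detail IDWT doubles it.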
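To address the problem of accurate tea-leaf detection and picking by tea-picking robots, an improved YOLOv7 algorithm based on a bi-level routing dynamic sparse attention mechanism and FasterNet is proposed for the classification and detection of fresh tea leaves. The algorithm replaces the original network structure with PConv and FasterNet, reducing the number of floating-point operations and improving their efficiency; adds a dynamic sparse attention mechanism based on bi-level routing to the neck layer, making computation allocation and content awareness more flexible; and replaces the loss function with EIoU (efficient intersection over union), which accelerates convergence, improves regression accuracy, and reduces false detections. The results show that, compared with YOLOv7, the model produced by the improved algorithm raises precision by 4.8 percentage points, recall by 5.3 percentage points, the F1 (balanced) score by 5.0 percentage points, and mean average precision (mAP) by 2.6 percentage points. In external validation, the number of floating-point operations falls by 15.1 G, frames per second rise by 5.52%, and mAP rises by 2.4 percentage points. The improved model classifies and detects fresh tea leaves efficiently and accurately while offering a high recognition rate, low computational cost, and fast detection. The results lay a foundation for tea-picking robots in the mountainous plateau terrain of Yunnan.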
A no-reference quality assessment algorithm for screen content images based on a multi-task attention mechanism (MTA-SCI) is proposed. MTA-SCI first uses a self-attention mechanism to extract global features of the screen content image, strengthening its representation of the image's overall information. It then uses an integrated local attention mechanism to extract local features, so that the local features focus on the details of the image that attract more attention. Finally, it uses a dual-channel feature mapping module to predict the quality score of the image. On the SCID and SIQAD datasets, MTA-SCI achieves Spearman's rank order correlation coefficients (SRCC) of 0.9602 and 0.9233 and Pearson linear correlation coefficients (PLCC) of 0.9609 and 0.9294, respectively. The experimental results show that MTA-SCI predicts the quality of screen content images with high accuracy.
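The two correlation measures reported here can be computed as sketched below. This is a plain standard-library illustration with hypothetical function names and made-up sample scores. It applies no nonlinear fitting before PLCC, since the abstract does not say whether one was used.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC) of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank order correlation (SRCC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    predicted = [1.0, 2.0, 3.0, 4.0, 5.0]     # illustrative model scores
    subjective = [1.0, 4.0, 9.0, 16.0, 25.0]  # illustrative opinion scores
    print(f"SRCC = {spearman(predicted, subjective):.4f}")
    print(f"PLCC = {pearson(predicted, subjective):.4f}")
```

SRCC measures whether predictions preserve the ordering of the subjective scores, while PLCC measures how linear the relationship is, so a monotonic but nonlinear predictor can score a perfect SRCC with a lower PLCC.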
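Existing anomaly detection methods achieve high-precision detection in specific application scenarios, but they are difficult to apply to other scenarios and offer limited automation. Therefore, a pixel-level image anomaly detection method driven by a visual foundation model (VFM), SSMOD-Net (State Space Model driven-Omni Dimensional Net), is proposed, aiming at more accurate industrial defect detection. Unlike existing methods, SSMOD-Net prompts SAM (Segment Anything Model) automatically and does not require fine-tuning SAM, which makes it particularly suitable for scenarios that process large-scale industrial visual data. The core of SSMOD-Net is a novel prompt encoder, driven by a state space model, that generates prompts dynamically from SAM's input image. This design lets the model keep the SAM architecture unchanged while introducing additional guidance through the prompt encoder, thereby improving detection accuracy. The prompt encoder integrates a residual multi-scale module, built on the state space model, that makes combined use of multi-scale and global information. Through iterative search, this module looks for the optimal prompts in the prompt space and supplies them to SAM as high-dimensional tensors, strengthening the model's ability to recognize industrial anomalies. Moreover, the proposed method requires no modification of SAM, which avoids the need for complex fine-tuning of the training schedule. Experimental results on multiple datasets show that the proposed method performs excellently: compared with methods such as AutoSAM and SAM-EG (SAM with Edge Guidance framework for efficient polyp segmentation), it achieves better results on mE (mean E-measure), mean absolute error (MAE), Dice, and intersection over union (IoU).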