Funding: Supported by the Shandong Provincial Natural Science Foundation (project ZR2022MF330) and the National Natural Science Foundation of China under Grant No. 61701286.
Abstract: Synthetic speech detection is an essential task in voice security, aimed at identifying deceptive voice attacks generated by text-to-speech (TTS) or voice conversion (VC) systems. In this paper, we propose a synthetic speech detection model called TFTransformer, which improves detection by modeling both local and global dependencies. Structurally, the model has two main components: a front-end and a back-end. The front-end combines a SincLayer with two-dimensional (2D) convolution to extract high-level feature maps (HFM) that capture local dependencies in the input speech signal. The back-end uses a time-frequency Transformer module to process these feature maps and capture global dependencies. Furthermore, we propose TFTransformer-SE, which incorporates a channel attention mechanism within the 2D convolutional blocks to capture local dependencies more effectively and thereby improve performance. In experiments on the ASVspoof 2021 LA dataset, the model achieved an equal error rate (EER) of 3.37% without data augmentation. On the ASVspoof 2019 LA dataset, it achieved an EER of 0.84%, also without data augmentation. This demonstrates that combining local and global dependencies in the time-frequency domain can significantly improve detection accuracy.
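The equal error rate quoted above is the operating point at which the false acceptance rate (spoofed speech accepted) equals the false rejection rate (bona fide speech rejected). The following is a minimal sketch of that metric, not the authors' evaluation code: the function name `equal_error_rate`, the convention that a higher score means "more likely bona fide", and the toy scores are all assumptions made for illustration.

```python
import numpy as np

def equal_error_rate(bonafide_scores, spoof_scores):
    """Return (eer, threshold). Assumes higher score = more likely bona fide."""
    bonafide = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)
    # Candidate thresholds: every observed score, plus one above all scores
    # so that the "reject everything" operating point is also considered.
    thresholds = np.unique(np.concatenate([bonafide, spoof]))
    thresholds = np.append(thresholds, thresholds[-1] + 1.0)
    # False rejection rate: bona fide trials scored below the threshold.
    frr = np.array([(bonafide < t).mean() for t in thresholds])
    # False acceptance rate: spoofed trials scored at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])
    # Pick the threshold where the two error rates are closest and average them.
    idx = int(np.argmin(np.abs(far - frr)))
    return (far[idx] + frr[idx]) / 2.0, thresholds[idx]

# Toy scores (hypothetical): one bona fide trial scores low, one spoof scores high.
eer, thr = equal_error_rate([0.2, 0.6, 0.8, 0.9], [0.1, 0.3, 0.4, 0.7])
print(f"EER = {eer:.2%} at threshold {thr}")
```

On real score files the two error curves rarely cross exactly at an observed score, so evaluation toolkits typically interpolate between neighbouring thresholds; the averaging used here is a simplification.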
Abstract: Given limited computational capacity, a new finite element approach based on the local-global method is used to simulate the welding deformation of the side sill of a railroad car's bogie frame. First, a volumetric heat source defined by a double ellipsoid is adopted to simulate the thermal distribution of the arc welding process. Next, local models extracted from the global model are computed with refined meshes. The global distortion of the structure is then determined by transferring the inner forces of the computed local models to the global model. Comparison of the computed results with the corresponding measured values indicates that the local-global method is feasible for simulating large welded structures. The work provides a basis for optimizing the welding sequence and clamping conditions, and has theoretical value and engineering significance for the integral design and manufacturing technique selection of the bogie frame and of other large welded structures.
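A double-ellipsoid volumetric heat source is commonly written in the Goldak form, with separate front and rear semi-axes along the welding direction. The sketch below assumes that form; the abstract does not give the paper's exact expression or parameters, so the function name, the axis convention, and every numeric value are placeholders for illustration only.

```python
import math

def double_ellipsoid_flux(x, y, z, Q, a, b, c_front, c_rear, f_front=0.6):
    """Power density (W/m^3) at a point in a frame centred on the arc.

    Assumed convention: x along the welding direction (x >= 0 is ahead of
    the arc), y across the weld (semi-axis a), z in depth (semi-axis b).
    Q is the effective arc power; the heat fractions satisfy f_front + f_rear = 2.
    """
    f_rear = 2.0 - f_front
    # Use the front ellipsoid ahead of the arc and the rear one behind it.
    if x >= 0:
        c, f = c_front, f_front
    else:
        c, f = c_rear, f_rear
    coeff = 6.0 * math.sqrt(3.0) * f * Q / (a * b * c * math.pi * math.sqrt(math.pi))
    return coeff * math.exp(-3.0 * (x / c) ** 2 - 3.0 * (y / a) ** 2 - 3.0 * (z / b) ** 2)

# Placeholder parameters (hypothetical): 3 kW effective power, millimetre-scale pool.
params = dict(Q=3000.0, a=0.004, b=0.003, c_front=0.004, c_rear=0.008)
peak = double_ellipsoid_flux(0.0, 0.0, 0.0, **params)
edge = double_ellipsoid_flux(0.0, params["a"], 0.0, **params)
print(f"peak = {peak:.3e} W/m^3, at y = a: {edge / peak:.3f} of peak")
```

By construction the density falls to exp(-3), about 5% of its peak, at each semi-axis. Choosing f_front = 2 * c_front / (c_front + c_rear) makes the front and rear halves meet continuously at x = 0.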
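Abstract: To address the low resolution and blurred temperature distribution of current infrared images of power equipment, a super-resolution reconstruction method based on a local and global information attention generative adversarial network (LGIA-GAN) is proposed. First, a gated weight unit fuses the outputs of multiple convolutions to build a detail-enhanced fusion convolution, increasing the share of important information in the output feature map. Second, a dual attention module is built to model long-range pixel dependencies in the image and to capture information along the spatial and channel dimensions. Then, a generative adversarial network is constructed so that the network attends to both the local texture details and the global contour information of infrared images of power equipment. Finally, experiments show that LGIA-GAN achieves a peak signal-to-noise ratio of 30.266 dB and a structural similarity of 0.9197 on the dataset, with a reconstruction time of 0.120 s, clearly outperforming several other GAN algorithms and giving better subjective visual quality. The proposed method can effectively improve the resolution of thermal images of power equipment and supports power equipment fault diagnosis.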
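For readers unfamiliar with the peak signal-to-noise ratio reported above, it is 10·log10(MAX² / MSE) between a reference image and its reconstruction. The sketch below is a generic illustration of that formula, not the paper's evaluation code; the function name `psnr`, the 8-bit range `max_value=255`, and the toy arrays are assumptions.

```python
import numpy as np

def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    ref = np.asarray(reference, dtype=np.float64)
    rec = np.asarray(reconstructed, dtype=np.float64)
    # Mean squared error over all pixels.
    mse = np.mean((ref - rec) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

# Toy example (hypothetical data): a uniform error of 25.5 grey levels,
# i.e. one tenth of the 8-bit range, corresponds to 20 dB.
reference = np.zeros((4, 4))
reconstructed = np.full((4, 4), 25.5)
print(f"PSNR = {psnr(reference, reconstructed):.2f} dB")
```

Higher values indicate a reconstruction closer to the reference. Structural similarity, the second metric in the abstract, compares local luminance, contrast, and structure and is not covered by this sketch.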
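Abstract: Coal piling is one of the common faults of conveyors. To ensure the safety of coal mine production, coal piling on underground conveyors needs to be detected. However, existing detection methods are prone to false triggering and have poor detection reliability. To address these problems, a Transformer-based unified multi-scale temporal convolution network (unified multi-scale temporal ConvTransformer, UnMS-TCT) is proposed for conveyor coal piling detection. First, features extracted from RGB frames and optical flow frames are fused so that the network models spatiotemporal relationships more comprehensively. Then, in the temporal encoder, a global module composed of dynamic position embedding (DPE), a multi-head relation aggregator (MHRA), and a multilayer perceptron (MLP) is alternated with a local module composed of cross-attention (CA) to form a global-local relation module, strengthening the ability to capture global and local temporal relationships at multiple scales. Next, a residual global and local fusion (ResGLFus) module fuses the multi-scale features, effectively improving the stability of the fusion process and ultimately achieving high-accuracy coal piling prediction. Experimental results show that the method can detect conveyor coal piling, reaching an mAP of 98.17%.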