Funding: Supported by the Science and Technology Innovation Key R&D Program of Chongqing (CSTB2025TIAD-STX0032), the National Key Research and Development Program of China (2024YFF0908200), the Chongqing Technology Innovation and Application Development Special Key Project (CSTB2024TIAD-KPX0018), and the Southwest University Graduate Student Research Innovation (SWUB24051).
Abstract: Dear Editor, This letter proposes a tensor low-rank orthogonal compression (TLOC) model for a convolutional neural network (CNN), which facilitates its efficient and highly accurate low-rank representation. Model compression is crucial for deploying deep neural network (DNN) models on resource-constrained embedded devices.
Abstract: With the rapid growth in the complexity and functionality of modern electronic systems, creating precise behavioral models of nonlinear circuits has become an attractive topic. Deep neural networks (DNNs) have been recognized as a powerful tool for nonlinear system modeling. To characterize the behavior of nonlinear circuits, a DNN-based modeling approach is proposed in this paper. The procedure is illustrated by modeling a power amplifier (PA), which is a typical nonlinear circuit in electronic systems. The PA model is constructed as a feedforward neural network with three hidden layers, and the Multisim circuit simulator is used to generate the raw training data. Training and validation are carried out in the TensorFlow deep learning framework. Compared with the commonly used polynomial model, the proposed DNN model converges faster and improves the mean squared error by 13 dB. The results demonstrate that the proposed DNN model accurately depicts the input-output characteristics of nonlinear circuits on both the training and validation data sets.
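To put the reported 13 dB figure in linear terms, the following is a minimal sketch, assuming the improvement is expressed as a power-style ratio (10·log10), which is the usual convention for mean squared error. The function names are hypothetical and are not taken from the paper.

```python
import math


def db_to_ratio(db: float) -> float:
    """Convert a power-style decibel value to a linear ratio (assumes 10*log10)."""
    return 10.0 ** (db / 10.0)


def ratio_to_db(ratio: float) -> float:
    """Convert a linear ratio back to decibels (assumes 10*log10)."""
    return 10.0 * math.log10(ratio)


# The abstract reports a 13 dB improvement in mean squared error.
improvement_db = 13.0
factor = db_to_ratio(improvement_db)
print(f"{improvement_db} dB corresponds to an MSE about {factor:.1f} times smaller")
```

Under that assumption, a 13 dB improvement means the DNN model's mean squared error is roughly one twentieth of the polynomial model's.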
Abstract: This paper proposes a technique for synthesizing a pixel-based photo-realistic talking face animation using two-step synthesis with HMMs and DNNs. We introduce facial expression parameters as an intermediate representation that corresponds well to both the input contexts and the output pixel data of face images. The sequences of facial expression parameters are modeled using context-dependent HMMs with static and dynamic features. The mapping from the expression parameters to the target pixel images is trained using DNNs. We examine the amount of training data required for the HMMs and DNNs and compare the performance of the proposed technique with the conventional PCA-based technique through objective and subjective evaluation experiments.
Abstract: Deep learning, especially through convolutional neural networks (CNNs) such as the U-Net 3D model, has revolutionized fault identification from seismic data, representing a significant leap over traditional methods. Our review traces the evolution of CNNs, emphasizing the adaptation and capabilities of the U-Net 3D model in automating seismic fault delineation with unprecedented accuracy. We find: 1) The transition from basic neural networks to sophisticated CNNs has enabled remarkable advancements in image recognition, which are directly applicable to analyzing seismic data. The U-Net 3D model, with its innovative architecture, exemplifies this progress by providing a method for detailed and accurate fault detection with reduced manual interpretation bias. 2) The U-Net 3D model has demonstrated its superiority over traditional fault identification methods in several key areas: it has enhanced interpretation accuracy, increased operational efficiency, and reduced the subjectivity of manual methods. 3) Despite these achievements, challenges remain, such as the need for effective data preprocessing, the acquisition of high-quality annotated datasets, and model generalization across different geological conditions. Future research should therefore focus on developing more complex network architectures and innovative training strategies to further refine fault identification performance. Our findings confirm the transformative potential of deep learning, particularly CNNs like the U-Net 3D model, in geosciences, and we advocate its broader integration to revolutionize geological exploration and seismic analysis.
Funding: This work was supported by the National Key Research and Development Program of China (2018YFC2001302), the National Natural Science Foundation of China (91520202), the Chinese Academy of Sciences Scientific Equipment Development Project (YJKYYQ20170050), the Beijing Municipal Science and Technology Commission (Z181100008918010), the Youth Innovation Promotion Association of Chinese Academy of Sciences, and the Strategic Priority Research Program of Chinese Academy of Sciences (XDB32040200).
Abstract: Brain encoding and decoding via functional magnetic resonance imaging (fMRI) are two important aspects of visual perception neuroscience. Although previous researchers have made significant advances in brain encoding and decoding models, existing methods still require improvement using advanced machine learning techniques. For example, traditional methods usually build the encoding and decoding models separately, and are prone to overfitting on a small dataset. In fact, effectively unifying the encoding and decoding procedures may allow for more accurate predictions. In this paper, we first review the existing encoding and decoding methods and discuss the potential advantages of a "bidirectional" modeling strategy. Next, we show that there are correspondences between deep neural networks and human visual streams in terms of the architecture and computational rules. Furthermore, deep generative models (e.g., variational autoencoders (VAEs) and generative adversarial networks (GANs)) have produced promising results in studies on brain encoding and decoding. Finally, we propose that the dual learning method, which was originally designed for machine translation tasks, could help to improve the performance of encoding and decoding models by leveraging large-scale unpaired data.
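Abstract: To improve the accuracy of short-term wind power forecasting, an xLSTM (extended Long Short-Term Memory)-Transformer model based on Bayesian optimization and feature fusion is proposed. The model combines the time-series processing capability of the long short-term memory (LSTM) network with the dynamic feature fusion capability of the Transformer's self-attention mechanism. With Bayesian optimization, the model can tune its hyperparameters in relatively few iterations, significantly reducing its dependence on computing resources. Experimental results on a dataset from a wind farm in Inner Mongolia show that, compared with a standalone LSTM model, a Transformer model, a gated recurrent unit (GRU) model, and an xLSTM-Transformer model without Bayesian optimization and feature fusion, when the look-back step (LookBack) is 4 and 8, the coefficient of determination R² of the proposed model improves on the baseline models by 1.2% to 11.3% on average, the mean absolute error (MAE) decreases by 12.8% to 38.4% on average, and the root mean square error (RMSE) decreases by 8.6% to 35.8% on average. These results indicate that the proposed model achieves higher prediction accuracy and stability under short-history input conditions.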
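The evaluation setup named in this abstract (a LookBack window and the R², MAE, and RMSE metrics) can be illustrated with a minimal sketch. It uses the standard textbook definitions of the three metrics and a plain sliding window. The helper names and toy numbers are hypothetical, and nothing here reproduces the authors' model or data.

```python
import math


def make_windows(series, look_back):
    """Split a series into (inputs, targets): each input holds `look_back` past values."""
    inputs = [series[i:i + look_back] for i in range(len(series) - look_back)]
    targets = series[look_back:]
    return inputs, targets


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination; undefined when y_true is constant."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy example with LookBack = 4 (one of the two settings named in the abstract).
inputs, targets = make_windows([1, 2, 3, 4, 5, 6], look_back=4)
print(inputs, targets)

# Toy predictions, purely illustrative.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
print(mae(y_true, y_pred), rmse(y_true, y_pred), r2(y_true, y_pred))
```

A smaller LookBack gives the model less history per prediction, which is the "short-history input" condition the abstract refers to.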