This work proposes a Tensor Train Random Projection (TTRP) method for dimension reduction, where pairwise distances can be approximately preserved. Our TTRP is systematically constructed through a Tensor Train (TT) representation with TT-ranks equal to one. Based on the tensor train format, this random projection method can speed up the dimension reduction procedure for high-dimensional datasets and incurs lower storage costs, with little loss in accuracy, compared with existing methods. We provide a theoretical analysis of the bias and the variance of TTRP, which shows that this approach is an expected isometric projection with bounded variance, and we show that the scaling Rademacher variable is an optimal choice for generating the corresponding TT-cores. Detailed numerical experiments with synthetic datasets and the MNIST dataset are conducted to demonstrate the efficiency of TTRP.
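As a rough illustration of the construction described above, the sketch below builds a rank-one TTRP in plain numpy: with all TT-ranks equal to one, the projection matrix factors into a Kronecker product of small Rademacher cores, so the input is tensorized and contracted one mode at a time instead of ever forming the full projection matrix. The function name, the core shapes, and the 1/sqrt(M) scaling convention are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def make_ttrp(core_shapes, seed=0):
    """Build a rank-one TTRP map R^N -> R^M, N = prod(n_i), M = prod(m_i).

    With all TT-ranks equal to one, the projection matrix is the Kronecker
    product of small (m_i x n_i) cores with Rademacher (+/-1) entries, so
    the full M x N matrix is never formed: the input vector is tensorized
    and each mode is contracted with its own core.
    """
    rng = np.random.default_rng(seed)
    cores = [rng.choice([-1.0, 1.0], size=(m, n)) for m, n in core_shapes]
    ns = [n for _, n in core_shapes]
    scale = 1.0 / np.sqrt(np.prod([m for m, _ in core_shapes]))

    def project(x):
        t = x.reshape(ns)
        for i, core in enumerate(cores):
            t = np.tensordot(core, t, axes=([1], [i]))  # contract mode i
            t = np.moveaxis(t, 0, i)                    # restore mode order
        return scale * t.reshape(-1)

    return project

# E.g. f = make_ttrp([(2, 4), (2, 4), (2, 4)]) maps R^64 -> R^8; for any
# x, y in R^64, ||f(x) - f(y)|| approximates ||x - y|| in expectation.
```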
In this article, two new algorithms are presented that convert a given data tensor train into either a Tucker decomposition with orthogonal matrix factors or a multi-scale entanglement renormalization ansatz (MERA). The Tucker core tensor is never explicitly computed but stored as a tensor train instead, resulting in algorithms that are efficient in both computation and storage. Both the multilinear Tucker-ranks and the MERA-ranks are determined automatically by the algorithm for a given upper bound on the relative approximation error. In addition, an iterative algorithm with low computational complexity, based on solving an orthogonal Procrustes problem, is proposed for the first time to retrieve optimal rank-lowering disentangler tensors, which are a crucial component in the construction of a low-rank MERA. Numerical experiments demonstrate the effectiveness of the proposed algorithms together with the potential storage benefit of a low-rank MERA over a tensor train.
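The following is a minimal sketch of the TT-to-Tucker idea described above: each orthogonal factor is obtained from an SVD of the unfolding of a single TT core that isolates its physical index, and the remainder is absorbed back so the Tucker core stays in TT form. The paper's algorithm additionally orthogonalizes the TT so that local truncations satisfy a global relative-error bound; that step and the MERA construction are omitted here, and the truncation rule below is a simplified assumption.

```python
import numpy as np

def tt_to_tucker(cores, tol=1e-8):
    """Sketch of TT -> Tucker with orthogonal factors.

    For each TT core G_k of shape (r_left, n_k, r_right), an SVD of the
    unfolding that isolates the physical index n_k yields an orthogonal
    factor U_k; the remainder is absorbed back, so the Tucker core stays
    in TT form and is never formed explicitly. Singular values below
    tol * (largest singular value) are truncated.
    """
    factors, new_cores = [], []
    for G in cores:
        rl, n, rr = G.shape
        mat = G.transpose(1, 0, 2).reshape(n, rl * rr)   # isolate mode n_k
        U, s, Vt = np.linalg.svd(mat, full_matrices=False)
        keep = max(1, int(np.sum(s > tol * s[0])))       # auto Tucker-rank
        U, s, Vt = U[:, :keep], s[:keep], Vt[:keep]
        factors.append(U)                                 # orthogonal factor
        core = (s[:, None] * Vt).reshape(keep, rl, rr).transpose(1, 0, 2)
        new_cores.append(core)                            # TT core of the Tucker core
    return factors, new_cores
```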
Recent advances in language modeling show that large pre-trained models based on the Transformer architecture achieve excellent performance on natural language processing tasks. However, training large language models (LLMs) is challenging because of limited GPU memory. Standard tensor parallelism requires each GPU to store all activations and therefore cannot break through this memory bottleneck. To relieve the GPU-memory constraint on LLM training and improve training efficiency, this paper proposes a two-dimensional tensor parallel method (2D tensor parallelism, TP2D). TP2D partitions both the input data and the parameter matrices and distributes the blocks across four GPUs; high-speed inter-GPU data exchange is carried out with distributed communication, realizing truly distributed parallel training. Using GPT-2 as the benchmark model, the soft-scaling efficiency and training efficiency of the two methods were measured. Experiments show that with four GPUs, TP2D trains 1.84 times faster than standard tensor parallelism, reaches a soft-scaling efficiency of 86%, and reduces memory usage.
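To make the 2D data layout concrete, the toy sketch below simulates the partitioning on a 2x2 grid in a single process with numpy: both the activations and the weights are split into blocks, each of the four notional devices holds one block of each, and output blocks are formed by a reduction over the shared dimension. Real TP2D would perform the row/column exchanges with distributed collectives across four GPUs; this single-process simulation and its function name are illustrative assumptions.

```python
import numpy as np

def matmul_2d_blocked(X, W):
    """Toy single-process simulation of 2D tensor parallelism on a 2x2 grid.

    The activation matrix X and the weight matrix W are each split into
    2x2 blocks; notional device (i, j) holds only X[i][j] and W[i][j] and
    computes its block of Y = X @ W from row/column exchanges, so no
    device ever stores a full matrix. Here plain numpy stands in for the
    broadcast/reduce collectives a real 4-GPU implementation would use.
    """
    Xb = [np.hsplit(h, 2) for h in np.vsplit(X, 2)]  # blocks X[i][k]
    Wb = [np.hsplit(h, 2) for h in np.vsplit(W, 2)]  # blocks W[k][j]
    Yb = [[sum(Xb[i][k] @ Wb[k][j] for k in range(2))  # reduce over k
           for j in range(2)] for i in range(2)]
    return np.block(Yb)

# Sanity check against the unpartitioned product:
X = np.random.randn(8, 8); W = np.random.randn(8, 8)
assert np.allclose(matmul_2d_blocked(X, W), X @ W)
```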
In recent years, spectrum cartography based on tensor completion has been widely studied. Most tensor completion algorithms used for spectrum cartography implicitly assume that the tensor is balanced; for an unbalanced tensor it is difficult to exploit low-rankness to recover the complete tensor, which degrades completion performance. This paper proposes an unbalanced spectrum cartography algorithm based on Overlapping Ket Augmentation (OKA) and the Tensor Train (TT), to address the performance loss of conventional tensor completion algorithms on unbalanced tensors. First, OKA represents the low-order, high-dimensional tensor as a high-order, low-dimensional tensor without information loss, so that the low-rank structure of an unbalanced tensor becomes usable for completion. Then, TT matricization yields more balanced matrices, improving completion accuracy under these balanced dimensions. Finally, exploiting the low-rankness of the high-order, low-dimensional tensor, completion is carried out with parallel matrix factorization or an F-norm-based SVD-free (Singular Value Decomposition Free, SVDFree) algorithm. Simulation results show that, for unbalanced tensors, the proposed scheme produces more accurate radio maps than existing tensor completion algorithms, and the proposed SVDFree algorithm has lower computational complexity.
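The sketch below illustrates the balancing step described above, under simplifying assumptions: a plain reshape stands in for OKA (which additionally overlaps blocks so that neighbouring pixels stay adjacent), and only the sequential TT matricizations are listed; their near-square shapes are what make low-rank completion effective. The completion step itself (parallel matrix factorization or SVDFree) is not shown.

```python
import numpy as np

def tt_unfoldings(T, dims):
    """Sketch: recast a low-order tensor as a higher-order one and list
    its TT (sequential) matricizations.

    A 256 x 256 x 4 map tensor is unbalanced: its mode-1 unfolding is
    256 x 1024. Recast as a 10th-order tensor of small dims, the TT
    unfoldings of shape (prod dims[:k]) x (prod dims[k:]) are far more
    balanced. (Plain reshape stands in for OKA here.)
    """
    H = T.reshape(dims)
    return [H.reshape(int(np.prod(dims[:k])), -1)  # k-th TT matricization
            for k in range(1, H.ndim)]

T = np.random.randn(256, 256, 4)
for M in tt_unfoldings(T, [4, 4, 4, 4, 4, 4, 4, 4, 2, 2]):
    print(M.shape)   # middle unfoldings, e.g. (1024, 256), are near-square
```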
Funding (TTRP paper): supported by the National Natural Science Foundation of China (No. 12071291), the Science and Technology Commission of Shanghai Municipality (No. 20JC1414300), and the Natural Science Foundation of Shanghai (No. 20ZR1436200).
Funding (TT-to-Tucker/MERA paper): the Ministry of Education and Science of the Russian Federation (grant 14.756.31.0001).