An algorithm, Clustering Algorithm Based On Sparse Feature Vector (CABOSFV), was proposed for high-dimensional clustering of binary sparse data. The algorithm compresses the data effectively by using a tool called the 'Sparse Feature Vector', thereby reducing the data scale enormously, and can obtain the clustering result with only one data scan. Both theoretical analysis and empirical tests showed that CABOSFV has low computational complexity. The algorithm finds clusters in large, high-dimensional datasets efficiently and handles noise effectively.
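The abstract does not define the Sparse Feature Vector or the merge rule, so the following is only a minimal sketch of one-scan clustering over sparse binary objects under stated assumptions. The summary layout (`count`, `shared`, `partial`), the dissimilarity `|partial| / (count * |shared|)`, the `threshold` parameter, and all function names are hypothetical and chosen for illustration; they should not be read as the published CABOSFV definitions.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterSummary:
    """Compressed summary of a cluster (hypothetical layout, not from the abstract)."""
    count: int        # number of objects in the cluster
    shared: set       # attributes equal to 1 in every member
    partial: set      # attributes equal to 1 in some, but not all, members
    members: list = field(default_factory=list)


def merged(summary, obj_id, attrs):
    """Return the summary obtained by adding one object; the input is not mutated."""
    shared = summary.shared & attrs
    partial = (summary.shared | summary.partial | attrs) - shared
    return ClusterSummary(summary.count + 1, shared, partial,
                          summary.members + [obj_id])


def dissimilarity(summary):
    """Assumed measure: non-shared attributes relative to count * shared attributes."""
    if not summary.shared:
        return float("inf")
    return len(summary.partial) / (summary.count * len(summary.shared))


def one_pass_cluster(objects, threshold):
    """Scan the objects once; each joins the tightest cluster or starts a new one."""
    clusters = []
    for obj_id, attrs in objects:
        attrs = set(attrs)
        best_idx, best = None, None
        for i, cluster in enumerate(clusters):
            candidate = merged(cluster, obj_id, attrs)
            if best is None or dissimilarity(candidate) < dissimilarity(best):
                best_idx, best = i, candidate
        if best is not None and dissimilarity(best) <= threshold:
            clusters[best_idx] = best          # only the summary is updated
        else:
            clusters.append(ClusterSummary(1, attrs, set(), [obj_id]))
    return clusters


if __name__ == "__main__":
    # Each object is (identifier, set of attribute indices whose value is 1).
    data = [("a", {1, 2, 3}), ("b", {1, 2, 3, 4}), ("c", {7, 8}), ("d", {7, 8, 9})]
    for cluster in one_pass_cluster(data, threshold=0.5):
        print(cluster.members, cluster.shared, cluster.partial)
```

Because each cluster is represented only by its summary, an incoming object is compared against summaries rather than against every earlier object, which is the sense in which a single scan suffices in this sketch.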
For a data cube, there are always constraints between dimensions or among attributes within a dimension, such as functional dependencies. We address the problem of how to use functional dependencies, when they exist, to speed up the computation of sparse data cubes. A new algorithm, CFD (Computation by Functional Dependencies), is presented for this purpose. CFD determines the order of dimensions by considering the cardinalities of dimensions together with the functional dependencies between them, thereby reducing the number of partitions for such dimensions. CFD also combines bottom-up partitioning with top-down aggregate computation to speed up the computation further. CFD can efficiently compute a data cube with hierarchies in a dimension, from the finest granularity to the coarsest.
Key words: sparse data cube; functional dependency; dimension; partition; CFD
CLC number: TP 311
Foundation item: Supported by the E-Government Project of the Ministry of Science and Technology of China (2001BA110B01)
Biography: Feng Yu-cai (1945-), male, Professor, research direction: database system.
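To predict the consolidation behavior of saturated soft soil when only sparse excess pore water pressure measurements are available, the physics-informed deep operator network (PI-DeepONet) method is introduced, and sparse pore water pressure data from measurement points are used to predict, in real time, the distribution of excess pore water pressure over the entire domain of the saturated soil. Predictions are made for two cases, conventional consolidation of clay and large-strain consolidation of soft clay. Two evaluation metrics, the relative L2 error and R^2, are introduced to verify the performance of PI-DeepONet in predicting the evolution of excess pore water pressure over the entire domain, and the results are compared with those of the purely data-driven DeepONet. The results show that, with the same number of measurement points and the same amount of excess pore water pressure data at each point, the absolute error of the DeepONet prediction over the entire domain is roughly 10^(-2) to 10^(-1), whereas that of PI-DeepONet is roughly 10^(-3) to 10^(-2), indicating better predictive performance. Second, in the study of conventional clay consolidation, three different noise levels were added to the excess pore water pressure data to simulate field monitoring conditions; even at a noise level of 5%, PI-DeepONet still provided high-quality real-time predictions of excess pore water pressure over the entire domain from sparse, noisy pressure data. Finally, in the study of large-strain consolidation of soft clay, PI-DeepONet was applied to consolidation problems with different drainage rates at the upper and lower boundaries. The trained one-dimensional model, given a single measurement point, accurately predicted the distribution of excess pore water pressure over the entire domain of the saturated soil under other interface parameter conditions, indicating that PI-DeepONet can offer a new approach to related problems in geotechnical engineering.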
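The abstract names the relative L2 error and R^2 as its evaluation metrics but does not give formulas. The sketch below uses the standard definitions, which is an assumption about what the authors computed; the function names `relative_l2_error` and `r_squared` and the sample values are hypothetical.

```python
import math


def relative_l2_error(predicted, reference):
    """||predicted - reference||_2 / ||reference||_2 over paired samples."""
    diff_norm = math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference)))
    ref_norm = math.sqrt(sum(r ** 2 for r in reference))
    return diff_norm / ref_norm


def r_squared(predicted, reference):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for p, r in zip(predicted, reference))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Illustrative values only; these are not data from the study.
    reference = [1.0, 2.0, 3.0]
    predicted = [1.1, 1.9, 3.2]
    print(relative_l2_error(predicted, reference))
    print(r_squared(predicted, reference))
```

A relative L2 error near 0 and an R^2 near 1 both indicate close agreement between predicted and reference values; the relative L2 error is undefined when the reference is identically zero, and R^2 is undefined when the reference is constant.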
This paper proposes a novel method for testing the equality of high-dimensional means using a multiple hypothesis test. The proposed method is based on a statistic defined as the maximum of standardized partial sums of logarithmic p-values. Numerical studies show that the method performs well for both normal and non-normal data and has good power under both dense and sparse alternative hypotheses. For illustration, a real data analysis is presented.
Funding: Supported by a grant from the University Grants Council of Hong Kong; the National Natural Science Foundation of China (Grant No. 11471335); the Ministry of Education Project of Key Research Institute of Humanities and Social Sciences at Universities (Grant No. 16JJD910002); and the Fund for Building World-Class Universities (Disciplines) of Renmin University of China.