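To address poor signal reconstruction performance, a generalized orthogonal matching pursuit (gOMP) reconstruction algorithm based on the generalized Jaccard coefficient is proposed. The algorithm replaces the inner-product criterion used in gOMP with a similarity matching criterion based on the generalized Jaccard coefficient, improving how the atoms that best match the residual are selected through the sensing matrix. Experimental results show that the reconstruction success rate of the proposed algorithm is higher than that of gOMP, and also higher than those of OMP, StOMP, and other algorithms.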
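The abstract does not give the formula for the generalized Jaccard coefficient, so the following is only a minimal sketch of the atom-selection step. It assumes the commonly used Tanimoto form, J(x, y) = x·y / (‖x‖² + ‖y‖² − x·y). The function names `generalized_jaccard` and `select_atoms`, and the parameter `s` (number of atoms picked per iteration), are hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def generalized_jaccard(x: Sequence[float], y: Sequence[float]) -> float:
    """Generalized Jaccard (Tanimoto) coefficient of two real vectors.

    Assumed form: x.y / (|x|^2 + |y|^2 - x.y). The paper's exact
    definition may differ.
    """
    dot = sum(a * b for a, b in zip(x, y))
    denom = sum(a * a for a in x) + sum(b * b for b in y) - dot
    # The denominator is zero only when both vectors are zero.
    return dot / denom if denom != 0 else 0.0


def select_atoms(columns: List[Sequence[float]],
                 residual: Sequence[float],
                 s: int) -> List[int]:
    """Return the indices of the s columns (atoms) most similar to the residual.

    In standard gOMP the score would be |<column, residual>|; here the
    inner product is swapped for the generalized Jaccard coefficient.
    """
    scores = [abs(generalized_jaccard(col, residual)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:s])


# Toy sensing matrix given as a list of columns (illustrative data only).
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
picked = select_atoms(atoms, residual=[0.9, 0.1, 0.5], s=2)
```

In a full gOMP loop, the selected indices would be merged into the support set, the signal estimate refit by least squares on that support, and the residual updated before the next iteration; those steps are omitted here.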
The fundamental problem of similarity studies in data mining is to detect similar items in very large articles, papers, and books. In this paper, we are interested in the probabilistic, statistical, and algorithmic aspects of the study of texts. We use the approach of k-shinglings, a k-shingling being defined as a sequence of k consecutive characters extracted from a text (k ≥ 1). The main challenge in this field is to find accurate and fast algorithms for computing similarity. This is achieved by using approximation methods. The first approximation method is statistical and is based on the Glivenko-Cantelli theorem. The second is the banding technique. The third is a modification of the algorithm proposed by Rajaraman et al. [1], denoted here as RUM. The Jaccard index is the similarity measure used in this paper. Finally, we illustrate the results on the four Gospels. The results are very conclusive.
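As a small illustration of the definitions above, the following sketch computes the exact Jaccard index between the k-shingle sets of two short strings. It does not implement the approximation methods the abstract mentions (the Glivenko-Cantelli-based estimate, banding, or RUM). The function names `shingles` and `jaccard` are hypothetical, and returning 1.0 for two empty sets is a convention chosen here.

```python
from typing import Set


def shingles(text: str, k: int) -> Set[str]:
    """Return the set of all k-shingles (runs of k consecutive characters)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B| of two sets."""
    if not a and not b:
        return 1.0  # convention assumed here for two empty sets
    return len(a & b) / len(a | b)


# Two toy strings: they share the 2-shingles "bc" and "cd".
similarity = jaccard(shingles("abcd", 2), shingles("bcde", 2))
```

For the two toy strings, the shingle sets are {ab, bc, cd} and {bc, cd, de}, so the intersection has 2 elements, the union has 4, and the index is 0.5. On book-length texts the exact computation becomes costly, which is the motivation for approximation methods.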