Abstract: Data sparseness has been an inherent issue of statistical language models, and smoothing methods are usually used to resolve zero-count problems. In this paper, we empirically study and analyze the well-known Good-Turing and advanced Good-Turing smoothing methods for language models on large Chinese corpora. Ten models are generated sequentially on corpora of various sizes, from 30 M to 300 M Chinese words of the CGW corpus. In our experiments, the Good-Turing and advanced Good-Turing smoothing methods are evaluated by inside testing and outside testing. Based on the experimental results, we further analyze the perplexity trends of the smoothing methods, which is useful for choosing effective smoothing methods to alleviate data sparseness in language models of various sizes. Finally, some helpful observations are described in detail.
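As a rough illustration of the Good-Turing idea the abstract refers to (not the authors' implementation), the core move is to re-estimate the count of an n-gram seen c times as c* = (c+1)·N_{c+1}/N_c, where N_c is the number of distinct n-grams seen exactly c times, and to reserve N_1/N probability mass for unseen events. A minimal sketch, with the fallback-to-MLE behavior for empty N_{c+1} buckets being an assumption of this sketch:

```python
from collections import Counter

def good_turing_probs(counts):
    """Simple Good-Turing re-estimation from a count table.

    counts: dict mapping n-gram -> observed count.
    Returns (probs, p_unseen), where p_unseen = N_1 / N is the total
    probability mass reserved for zero-count (unseen) events.
    """
    N = sum(counts.values())
    # N_c = number of distinct n-grams seen exactly c times
    freq_of_freq = Counter(counts.values())
    p_unseen = freq_of_freq.get(1, 0) / N

    probs = {}
    for gram, c in counts.items():
        n_c = freq_of_freq[c]
        n_c1 = freq_of_freq.get(c + 1, 0)
        if n_c1 > 0:
            # discounted count c* = (c + 1) * N_{c+1} / N_c
            c_star = (c + 1) * n_c1 / n_c
        else:
            # no n-grams with count c + 1: fall back to the raw count
            c_star = c
        probs[gram] = c_star / N
    return probs, p_unseen
```

Real implementations (e.g. the simple Good-Turing variant) smooth the N_c values themselves before applying the formula, since high-count buckets are sparse; the advanced Good-Turing method studied in the paper refines this basic scheme.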
Abstract: The g-good-neighbor connectivity of G is a generalization of the concept of connectivity, to which it reduces for , and an important parameter in measuring the fault tolerance and reliability of interconnection networks. Many well-known networks can be constructed as Cartesian products of simple graphs. In this paper, we determine the g-good-neighbor connectivity of some Cartesian product graphs. We give the exact value of the g-good-neighbor connectivity of the Cartesian product of two complete graphs and for , of the mesh for , and of the cylindrical grid and torus for .
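The Cartesian product construction the abstract relies on can be stated concretely: in G □ H, the vertex set is V(G) × V(H), and (u, x) is adjacent to (v, y) iff u = v and xy ∈ E(H), or x = y and uv ∈ E(G). A minimal sketch (graphs represented as a node list plus a set of frozenset edges, a representation chosen for this sketch):

```python
def cartesian_product(G, H):
    """Cartesian product G □ H of two undirected graphs.

    Each graph is a pair (nodes, edges), with edges given as a set of
    2-element frozensets. Returns the product in the same representation.
    """
    g_nodes, g_edges = G
    h_nodes, h_edges = H
    nodes = [(u, x) for u in g_nodes for x in h_nodes]
    edges = set()
    for (u, x) in nodes:
        for (v, y) in nodes:
            # copy of H attached at each vertex of G
            if u == v and frozenset((x, y)) in h_edges:
                edges.add(frozenset(((u, x), (v, y))))
            # copy of G attached at each vertex of H
            if x == y and frozenset((u, v)) in g_edges:
                edges.add(frozenset(((u, x), (v, y))))
    return nodes, edges
```

For example, K2 □ K2 is the 4-cycle, and products of paths and cycles give the mesh, cylindrical grid, and torus mentioned in the abstract.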