In a crowdsourcing experiment, viewers (the "crowd") of a British Broadcasting Corporation (BBC) television show submitted estimates of the number of coins in a tumbler; an antecedent paper (Part 1) showed that these estimates follow a log-normal distribution Λ(m, s²). The coin-estimation experiment is an archetype of a broad class of image analysis and object counting problems suitable for solution by crowdsourcing. The objective of the current paper (Part 2) is to determine the location and scale parameters (m, s) of Λ(m, s²) by both Bayesian and maximum likelihood (ML) methods and to compare the results. One outcome of the analysis is the resolution, by means of Jeffreys' rule, of questions regarding the appropriate Bayesian prior. It is shown that Bayesian and ML analyses lead to the same expression for the location parameter but different expressions for the scale parameter, which become identical in the limit of an infinite sample size. A second outcome of the analysis concerns use of the sample mean as the measure of information of the crowd in applications where the distribution of responses is not sought or known. In the coin-estimation experiment, the sample mean was found to differ widely from the mean number of coins calculated from Λ(m, s²). This discordance raises critical questions concerning whether, and under what conditions, the sample mean provides a reliable measure of the information of the crowd. This paper resolves that problem by use of the principle of maximum entropy (PME). The PME yields a set of equations for finding the most probable distribution consistent with given prior information and only that information. If there is no solution to the PME equations for a specified sample mean and sample variance, then the sample mean is an unreliable statistic, since no measure can be assigned to its uncertainty. Parts 1 and 2 together demonstrate that the information content of crowdsourcing resides in the distribution of responses (very often log-normal in form), which can be obtained empirically or by appropriate modeling.
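The comparison between the sample mean and the mean implied by a fitted log-normal can be illustrated with a minimal Python sketch. It uses the standard ML estimates for a log-normal sample (the mean of the log data and the 1/n variance of the log data) and the standard log-normal mean exp(m + s²/2). The function names and the synthetic data are assumptions made for illustration; they are not the BBC data, and the sketch does not reproduce the paper's Bayesian expressions.

```python
import math
import random


def lognormal_ml_fit(samples):
    """Return ML estimates (m, s) for positive samples assumed log-normal."""
    logs = [math.log(x) for x in samples]
    n = len(logs)
    m = sum(logs) / n
    # The ML estimate of the variance divides by n, not n - 1.
    s2 = sum((v - m) ** 2 for v in logs) / n
    return m, math.sqrt(s2)


def lognormal_mean(m, s):
    """Mean of a log-normal with location m and scale s."""
    return math.exp(m + 0.5 * s * s)


# Synthetic "crowd" estimates (hypothetical parameters, illustration only).
random.seed(0)
guesses = [random.lognormvariate(6.0, 1.0) for _ in range(2000)]

m_hat, s_hat = lognormal_ml_fit(guesses)
sample_mean = sum(guesses) / len(guesses)
model_mean = lognormal_mean(m_hat, s_hat)

print(f"m = {m_hat:.3f}, s = {s_hat:.3f}")
print(f"sample mean = {sample_mean:.1f}, log-normal mean = {model_mean:.1f}")
```

Because the synthetic data here are drawn from an exact log-normal, the two means should roughly agree; with real crowd responses the two can diverge, which is the discordance the abstract describes.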
We devise an approach to Bayesian statistics and their applications in the analysis of the Monty Hall problem. We combine knowledge gained through applications of the Maximum Entropy Principle and Nash equilibrium strategies to provide results concerning the use of Bayesian approaches unique to the Monty Hall problem. We use a model to describe Monty's decision process and clarify that Bayesian inference results in an "irrelevant, therefore invariant" hypothesis. We discuss the advantages of Bayesian inference over frequentist inference in tackling the uneven prior probability Monty Hall variant. We demonstrate that the use of Bayesian statistics conforms to the Maximum Entropy Principle in information theory and that the Bayesian approach successfully resolves dilemmas in the uneven probability Monty Hall variant. Our findings have applications in decision making, information theory, bioinformatics, quantum game theory and beyond.
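As a rough illustration of Bayesian updating in the uneven-prior variant, the following Python sketch computes the posterior probability of the car's location after Monty opens a goat door. It assumes three doors, that Monty never reveals the car or opens the player's door, and that `tie_break` is the probability he opens the observed door when the car is behind the player's pick. The function name and the `tie_break` parameter are hypothetical; this is not the model of Monty's decision process used in the paper.

```python
from fractions import Fraction


def posterior_after_reveal(priors, picked, opened, tie_break=Fraction(1, 2)):
    """Posterior car-location probabilities for a three-door game.

    priors: dict mapping door -> prior probability of hiding the car.
    picked: the player's door; opened: the goat door Monty opened.
    tie_break: P(Monty opens `opened` | car is behind `picked`).
    """
    likelihood = {}
    for door in priors:
        if door == opened:
            likelihood[door] = Fraction(0)  # Monty never reveals the car
        elif door == picked:
            likelihood[door] = tie_break    # Monty had two goat doors to choose from
        else:
            likelihood[door] = Fraction(1)  # Monty was forced to open `opened`
    joint = {d: priors[d] * likelihood[d] for d in priors}
    evidence = sum(joint.values())
    return {d: joint[d] / evidence for d in priors}


uniform = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}
uneven = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}

print(posterior_after_reveal(uniform, picked=1, opened=3))
print(posterior_after_reveal(uneven, picked=1, opened=3))
```

With uniform priors the sketch recovers the familiar 2/3 advantage for switching; with the uneven priors chosen here, staying and switching come out equally likely, which shows how the prior changes the answer.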
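Traditional spatial interpolation methods can capture the overall distribution of precipitation across Fujian Province, but the region's meteorological stations are sparse and unevenly distributed, which leads to large errors in the interpolated precipitation fields. To improve interpolation accuracy, this paper uses TRMM satellite data to compensate for the shortage of station data: TRMM data are treated as "soft data" and station data as "hard data", and the two are combined in a Bayesian Maximum Entropy (BME) analysis of the spatiotemporal pattern of precipitation in Fujian Province. Using annual and monthly precipitation from 20 meteorological stations over the 13 years from 2000 to 2012 as base data, the spatiotemporal distribution of multi-year precipitation in Fujian is analyzed with both Ordinary Kriging (OK) and BME interpolation with TRMM as soft data, and the results of the two methods are compared. The results show that, in terms of spatiotemporal distribution, BME interpolation with TRMM data as an auxiliary variable better captures local variations in precipitation; in terms of error, it yields smaller MAE and RMSE, indicating that the BME method with TRMM as soft data can partly compensate for the shortage of station data and effectively reduce the absolute error of predictions. The analysis of the spatiotemporal distribution pattern and the error evaluation show that BME interpolation, by using both the base station data and TRMM satellite product data, makes the spatiotemporal analysis of precipitation more realistic and objective, and also offers a new approach to applying TRMM satellite precipitation data.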
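The two error measures named above, MAE and RMSE, can be computed as in the short Python sketch below. The helper names and the numbers are hypothetical and serve only to show how two sets of interpolated values might be scored against held-out station observations; they are not results from the study.

```python
import math


def mae(predicted, observed):
    """Mean absolute error between paired predictions and observations."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def rmse(predicted, observed):
    """Root-mean-square error between paired predictions and observations."""
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(observed))


# Made-up precipitation values in mm, for illustration only.
station_obs = [1500.0, 1620.0, 1480.0, 1710.0]
method_a = [1450.0, 1700.0, 1400.0, 1650.0]  # stand-in for one interpolator
method_b = [1490.0, 1640.0, 1470.0, 1690.0]  # stand-in for another

for name, pred in (("A", method_a), ("B", method_b)):
    print(name, round(mae(pred, station_obs), 1), round(rmse(pred, station_obs), 1))
```

RMSE penalizes large individual errors more heavily than MAE, so reporting both gives a fuller picture of interpolation quality.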
Sea surface temperature (SST) is an important variable for understanding interactions between the ocean and the atmosphere. SST fusion is crucial for acquiring SST products of high spatial resolution and coverage. This study introduces a Bayesian maximum entropy (BME) method for blending daily SSTs from multiple satellite sensors. A new spatiotemporal covariance model of an SST field is built to integrate not only single-day SSTs but also time-adjacent SSTs. In addition, AVHRR 30-year SST climatology data are introduced as soft data at the estimation points to improve the accuracy of blended results within the BME framework. The merged SSTs, with a spatial resolution of 4 km and a temporal resolution of 24 hours, are produced in the Western Pacific Ocean region to demonstrate and evaluate the proposed methodology. Comparisons with in situ drifting buoy observations show that the merged SSTs are accurate and the bias and root-mean-square errors for the comparison are 0.15 °C and 0.72 °C, respectively.
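Bayesian Maximum Entropy (BME) geostatistics is a spatiotemporal geostatistical method that has emerged in recent years. Compared with traditional kriging, it has a sound epistemological framework and methodological foundation. It does not require the assumptions of linear estimation, spatial homogeneity, or normal distribution; it can incorporate prior knowledge and soft data without losing the useful information they contain, which improves the accuracy of the analysis. This paper first introduces the basic theory of BME and its estimation method, then briefly describes the method's theoretical development and its applications in soil and environmental science, and finally summarizes the method's applications and their outlook. After years of development and practice by researchers abroad, BME has proven to be a theoretically fairly mature geostatistical method that can be applied in practical research, with broad prospects in resource and environmental assessment.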
Funding (SST fusion study): This study was supported by the National Natural Science Foundation of China (Grant Nos. 41201350 and 41371355). We sincerely thank the University of North Carolina Bayesian Maximum Entropy (UNC-BME) laboratory at the UNC at Chapel Hill for supplying the BME codes.