Funding: This study was partly supported by the Russian Fund for Basic Researches [Project No. 16-01-000213-a].
Abstract: In this paper, the geoecological information-modeling system (GIMS) is described as a possible improvement of the Big Data approach. The main function of GIMS is the use of algorithms and models that capture the fundamental processes controlling the evolution of the climate-nature-society (CNSS) system. The GIMS structure includes 24 blocks that implement a series of models and algorithms for global big data processing and analysis. The CNSS global model is the basic block of GIMS. The operational tools of GIMS are demonstrated by examining several scenarios associated with the reconstruction of forest areas. It is shown that significant impacts on forests can lead to large-scale global climate variations.
Abstract: Privacy protection for big data linking is discussed here in relation to the big data linking project of the Central Statistics Office (CSO), Ireland, titled the 'Structure of Earnings Survey - Administrative Data Project' (SESADP). The project produced datasets and statistical outputs for the years 2011 to 2014 to meet Eurostat's annual earnings statistics requirements and the Structure of Earnings Survey (SES) Regulation. Record linking across the Census and various public sector datasets made it possible to acquire the information needed to meet the Eurostat earnings requirements. However, the risk of statistical disclosure (i.e. identifying an individual in the dataset) is high unless privacy and confidentiality safeguards are built into the data matching process. This paper looks at the three methods of linking records on big datasets employed on the SESADP, and at how to anonymise the data to protect the identity of individuals where potentially disclosive variables exist.
Funding: Part of the "Study on Improving the Results of Targeted Poverty Alleviation Programs in Guangxi, Guizhou and Yunnan" (15BMZ057), a 2015 general research program funded by the National Social Sciences Fund of China; "Exploring the Implementation of the Targeted Poverty Alleviation Strategy and Study on Improving the Implementation Methods in Guangxi" (XBS16035), a program funded by the Guangxi University Research Fund; and "Study on Dynamic Management Model for Targeted Poverty Alleviation in the Age of Big Data" (201610593296), a program funded by Guangxi's College Student Innovation and Entrepreneurship Training Project.
Abstract: Since the State Council issued the Action Plan on Promoting the Development of the Big Data Industry, big data-enabled information integration and processing applications have increasingly become basic strategic resources for building a modern governance system in China. With poverty eradication now at a critical stage, it is important to apply the big data way of thinking and big data technology to the development and integration of poverty alleviation resources. This paper examines the need to apply big data technology in targeted poverty alleviation and discusses how big data technology can be integrated into targeted poverty alleviation programs and how the big data way of thinking meshes with the idea of targeted poverty alleviation. It argues that applying big data technology can significantly improve the results of targeted poverty alleviation programs and that building big data-powered poverty alleviation platforms is a new approach to implementing the targeted poverty alleviation strategy. The paper calls for a change in thinking about targeted poverty alleviation and points out directions for targeted poverty alleviation in the age of big data, with a view to promoting the extensive application of big data technology in poverty reduction and improving the results of poverty alleviation and eradication programs.
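Abstract: To effectively enhance the resilience of distribution networks, an optimal scheduling method for multiple types of mobile emergency resources is proposed, based on a hybrid data-model-driven approach. First, considering the impact of dynamically changing road conditions on the strategies of mobile energy storage systems (MESSs) and repair crews (RCs), a stochastic optimal scheduling model for multiple types of mobile emergency resources is constructed with the objective of minimizing the total loss cost of the coupled power-transportation network. Then, to obtain the optimal routing and scheduling strategies of MESSs and RCs accurately and in real time, a hybrid data-model-driven method is proposed to solve the resulting complex nonlinear stochastic optimization model. In the data-driven part, a graph attention network multi-agent reinforcement learning algorithm is proposed to find the optimal routing strategies of MESSs and RCs under uncertainties such as road repair times in the transportation network and dynamic changes in the adjacency relationships of mobile emergency resources. The proposed algorithm combines several improvement strategies with prioritized experience replay to improve sampling efficiency and training performance. In the model-driven part, second-order cone relaxation and the big-M method are used to formulate the scheduling problem as a mixed-integer second-order cone programming model, which yields the optimal scheduling strategies of MESSs and RCs under variations in renewable energy output and distribution network load. Finally, the effectiveness, generalization ability, and scalability of the proposed method are verified on two coupled power-transportation networks of different scales.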
Funding: Supported by the Program of Introducing Talents of Disciplines to Universities of the Ministry of Education and State Administration of the Foreign Experts Affairs of China (the 111 Project, Grant No. B08048) and the Special Basic Research Fund for Methodology in Hydrology of the Ministry of Sciences and Technology of China (Grant No. 2011IM011000).
Abstract: Choosing the copula model that best fits a given dataset is a major limitation of the copula approach, and the present study investigates goodness-of-fit tests for multi-dimensional copulas. A goodness-of-fit test based on Rosenblatt's transformation was mathematically extended from two dimensions to three dimensions, and procedures for a bootstrap version of the test were provided. Through stochastic copula simulation, an empirical application to historical drought data at the Lintong Gauge Station shows that the goodness-of-fit tests perform well, revealing that both trivariate Gaussian and Student t copulas are acceptable for modeling the dependence structures of the observed drought duration, severity, and peak. Goodness-of-fit tests for multi-dimensional copulas can support the application of a wider range of copulas to describe the associations of correlated hydrological variables. However, applying copulas with more than three dimensions requires greater computational effort as well as exploration and parameterization of the corresponding copulas.
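As a sketch of the transformation this abstract refers to (the notation below is the standard textbook form and is an assumption, not taken from the paper itself), Rosenblatt's transformation for a trivariate copula $C$ with uniform margins $(u_1, u_2, u_3)$ can be written as:

```latex
\begin{aligned}
e_1 &= u_1, \\
e_2 &= C(u_2 \mid u_1) = \frac{\partial C(u_1, u_2, 1)}{\partial u_1}, \\
e_3 &= C(u_3 \mid u_1, u_2)
     = \frac{\partial^2 C(u_1, u_2, u_3) / \partial u_1 \, \partial u_2}
            {\partial^2 C(u_1, u_2, 1) / \partial u_1 \, \partial u_2}.
\end{aligned}
```

If the hypothesized copula is correct, $e_1$, $e_2$, and $e_3$ are independent and uniformly distributed on $(0, 1)$, so a goodness-of-fit test reduces to checking the transformed sample against that property.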
Funding: Sponsored by the National Natural Science Foundation of China (41474017, 41274035), the National Key Basic Research Program of China (973 Program, 2012CB957703), the State Key Laboratory of Geodesy and Earth's Dynamics (SKLGED2013-3-2-Z, SKLGED2014-1-3-E), and the State Key Laboratory of Geo-Information Engineering (SKLGIE2014-M-1-2).
Abstract: In this paper we present a series of monthly gravity field solutions from Gravity Recovery and Climate Experiment (GRACE) range measurements using a modified short arc approach, in which the ambiguity of the range measurements is eliminated by differencing two adjacent range measurements. The data used to develop our monthly gravity field model are the same as those of the Tongji-GRACE01 model, except that range measurements replace the range rate measurements. Our model is truncated to degree and order 60 and spans Jan. 2004 to Dec. 2010, also the same as the Tongji-GRACE01 model. Comparing the C_(2,0), C_(2,1), S_(2,1), C_(15,15), and S_(15,15) time series, the global mass change signals, and the mass change time series in the Amazon area of our model with those of the Tongji-GRACE01 model, we conclude that our monthly gravity field model is comparable with the Tongji-GRACE01 monthly model.
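The differencing step can be illustrated with a simple observation model. The symbols here are assumptions for illustration, and the bias is assumed constant over a continuous arc; the paper's own formulation may differ:

```latex
\begin{aligned}
\rho^{\mathrm{obs}}(t_i) &= \rho(t_i) + b + \varepsilon_i, \\
\rho^{\mathrm{obs}}(t_{i+1}) - \rho^{\mathrm{obs}}(t_i)
  &= \rho(t_{i+1}) - \rho(t_i) + \varepsilon_{i+1} - \varepsilon_i .
\end{aligned}
```

Here $\rho$ is the true inter-satellite range, $b$ the unknown range ambiguity, and $\varepsilon_i$ the measurement noise. The ambiguity $b$ cancels in the difference, at the cost of correlated noise between neighbouring differenced observations.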
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 41431070, 41174016, 41274026, 41274024, 41321063), the National Key Basic Research Program of China (973 Program, 2012CB957703), the CAS/SAFEA International Partnership Program for Creative Research Teams (KZZD-EW-TZ-05), and the Chinese Academy of Sciences.
Abstract: As global warming continues, monitoring changes in terrestrial water storage becomes increasingly important, since it plays a critical role in understanding global change and in water resource management. In North America, as elsewhere in the world, changes in water resources strongly affect agriculture and animal husbandry. A combination of Gravity Recovery and Climate Experiment (GRACE) gravity data and Global Positioning System (GPS) data recently showed that water storage from August 2002 to March 2011 recovered after the extreme Canadian Prairies drought between 1999 and 2005. In this paper, we use GRACE Release 5 monthly gravity data to track the water storage change from August 2002 to June 2014. In the Canadian Prairies and Great Lakes areas, total water storage is found to have increased during the last decade at a rate of 73.8 ± 14.5 Gt/a, which is larger than that found in the previous study owing to the longer time span of GRACE observations used and the reduction of the leakage error. We also find a long-term decrease in water storage at a rate of -12.0 ± 4.2 Gt/a in the Ungava Peninsula, possibly due to permafrost degradation and less snow accumulation during winter in the region. In addition, the effect of the total mass gain in the surveyed area on present-day sea level amounts to -0.18 mm/a and should therefore be taken into account in studies of global sea level change.
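A back-of-envelope sketch of how a land mass rate in Gt/a maps to a sea-level rate in mm/a is given below. The ocean area and water density are assumed round values, and the function name is hypothetical; the paper's own computation may include corrections that this sketch ignores.

```python
# Assumed constants (not taken from the paper).
OCEAN_AREA_M2 = 3.61e14      # approximate global ocean surface area
WATER_DENSITY_KG_M3 = 1000.0  # fresh-water density


def sea_level_rate_mm_per_year(mass_rate_gt_per_year: float) -> float:
    """Convert a land water-mass rate (Gt/a) to a sea-level rate (mm/a).

    Mass gained on land is assumed to be lost by the ocean,
    so the sign is flipped.
    """
    mass_rate_kg = mass_rate_gt_per_year * 1e12  # 1 Gt = 1e12 kg
    layer_m = mass_rate_kg / (WATER_DENSITY_KG_M3 * OCEAN_AREA_M2)
    return -layer_m * 1000.0  # metres to millimetres


if __name__ == "__main__":
    for rate in (73.8, -12.0):
        print(rate, "Gt/a ->", round(sea_level_rate_mm_per_year(rate), 3), "mm/a")
```

Under these assumptions, a gain of 73.8 Gt/a corresponds to roughly -0.20 mm/a and a loss of 12.0 Gt/a to roughly +0.03 mm/a, the same order of magnitude as the -0.18 mm/a quoted in the abstract. The abstract does not state exactly which regions enter its total, so the sketch is not expected to reproduce that figure.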
Abstract: In this study, rural poverty in Iran is investigated by applying a multidimensional approach, association rule mining, and Levene, F, and Tukey tests to household data from 2008. The results indicate that multidimensional poverty is an epidemic problem in rural Iran. They also show 11 patterns of poverty in rural areas: four main patterns with 99.62% coverage and seven sub-patterns with nearly 0.38% coverage. In these patterns, housing and household education are the most important dimensions of poverty, and income poverty is the least important. The government's income support policy for households, adopted in enforcing the law on targeted subsidies, cannot be regarded as a pro-poor policy; rather, it pursues other political objectives.
Abstract: Challenges in Big Data analysis arise from the way the data are recorded, maintained, processed, and stored. We demonstrate that a hierarchical, multivariate, statistical machine learning algorithm, namely the Boosted Regression Tree (BRT), can address Big Data challenges to drive decision making. The challenge in this study is a lack of interoperability, since the data (a collection of GIS shapefiles, remotely sensed imagery, and aggregated and interpolated spatio-temporal information) are stored in monolithic hardware components. The modelling process required one common input file. Merging the data sources produced a structured but noisy input file with inconsistencies and redundancies. Here, it is shown that BRT can process different data granularities, heterogeneous data, and missingness. In particular, BRT handles missing data by default by allowing a split on whether or not a value is missing as well as on what the value is. Most importantly, BRT offers a wide range of possibilities for interpreting results, and variable selection is performed automatically by considering how frequently a variable is used to define a split in the tree. A comparison with two similar regression models (Random Forests and the Least Absolute Shrinkage and Selection Operator, LASSO) shows that BRT outperforms these in this instance. BRT can also be a starting point for sophisticated hierarchical modelling in real-world scenarios. For example, a single or ensemble BRT approach could be tested with existing models in order to improve results for a wide range of data-driven decisions and applications.
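The missing-value behaviour described above can be sketched for a single regression split. This is a minimal illustration and not the authors' implementation: `best_split` and `sse` are hypothetical helpers, and a real BRT would apply such splits recursively and across many boosted trees.

```python
from typing import List, Optional, Tuple


def sse(values: List[float]) -> float:
    """Sum of squared errors around the mean (0.0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(
    xs: List[Optional[float]], ys: List[float]
) -> Tuple[str, Optional[float], float]:
    """Return (kind, threshold, loss) for the best single split on one feature.

    kind == "missing": split on missing vs. observed (threshold is None).
    kind == "left"/"right": split on x <= threshold, with missing values
    routed to that side.
    """
    miss_y = [y for x, y in zip(xs, ys) if x is None]
    obs_y = [y for x, y in zip(xs, ys) if x is not None]
    observed = sorted({x for x in xs if x is not None})

    # Candidate 1: split purely on whether the value is missing.
    best = ("missing", None, sse(miss_y) + sse(obs_y))

    # Candidate 2: split on the value, trying both routes for missing rows.
    for lo, hi in zip(observed, observed[1:]):
        t = (lo + hi) / 2  # midpoint between neighbouring observed values
        left = [y for x, y in zip(xs, ys) if x is not None and x <= t]
        right = [y for x, y in zip(xs, ys) if x is not None and x > t]
        for side in ("left", "right"):
            l = left + miss_y if side == "left" else left
            r = right + miss_y if side == "right" else right
            loss = sse(l) + sse(r)
            if loss < best[2]:
                best = (side, t, loss)
    return best
```

When missingness itself is informative, for example `best_split([1.0, 2.0, None, None, 3.0, 4.0], [1.0, 1.0, 9.0, 9.0, 1.0, 1.0])`, the missing-versus-observed split wins; otherwise a value threshold is chosen and the missing rows follow whichever side reduces the loss more.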