Abstract: In atmospheric data assimilation systems, the forecast error covariance model is an important component. However, the parameters required by a forecast error covariance model are difficult to obtain because the truth is unknown. This study applies an error statistics estimation method to the Physical-space Statistical Analysis System (PSAS) height-wind forecast error covariance model. The method consists of two components: the first computes the error statistics using the National Meteorological Center (NMC) method, a lagged-forecast difference approach, within the framework of the PSAS height-wind forecast error covariance model; the second obtains a calibration formula to rescale the error standard deviations provided by the NMC method. The calibration is against the error statistics estimated by maximum-likelihood estimation (MLE) with rawinsonde height observed-minus-forecast residuals. A complete set of formulas for estimating the error statistics and for the calibration is applied to a one-month-long dataset generated by a general circulation model of the Global Modeling and Assimilation Office (GMAO), NASA. There is a clear constant relationship between the error statistics estimates of the NMC method and those of the MLE. The final product provides a full set of 6-hour error statistics required by the PSAS height-wind forecast error covariance model over the globe. The features of these error statistics are examined and discussed.
Abstract: Traffic tunnels include tunnel works for traffic and transport in the areas of railway, highway, and rail transit. With many mountains and nearly one fifth of the global population, China possesses numerous large cities and megapolises with rapidly growing economies and huge traffic demands. As a result, a great many railway, highway, and rail transit facilities are required in this country. In the past, the construction of these facilities mainly involved subgrade and bridge works; in recent years.
Abstract: It is well known that the nonparametric estimation of the regression function is highly sensitive to the presence of even a small proportion of outliers in the data. To solve the problem of atypical observations when the covariates of the nonparametric component are functional, robust estimates for the regression parameter and regression operator are introduced. The main purpose of the paper is to consider data-driven methods of selecting the number of neighbors in order to make the proposed processes fully automatic. We use the k Nearest Neighbors (kNN) procedure to construct the kernel estimator of the proposed robust model. Under some regularity conditions, we state consistency results for kNN functional estimators, which are uniform in the number of neighbors (UINN). Furthermore, a simulation study and an empirical application to a real data analysis of gasoline octane predictions are carried out to illustrate the higher predictive performance and the usefulness of the kNN approach.
Abstract: Predicting the seeing of astronomical observations can provide hints about the quality of optical imaging in the near future, and facilitate flexible scheduling of observation tasks to maximize the use of astronomical observatories. Traditional approaches to seeing prediction mostly rely on regional weather models to capture the in-dome optical turbulence patterns. Thanks to the development of data gathering and aggregation facilities at astronomical observatories in recent years, data-driven approaches are becoming increasingly feasible and attractive for predicting astronomical seeing. This paper systematically investigates data-driven approaches to seeing prediction by leveraging various big data techniques, from traditional statistical modeling and machine learning to newly emerging deep learning methods, on the monitoring data of the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST). The raw monitoring data are preprocessed to allow for big data modeling. Then we formulate the seeing prediction task under each type of modeling framework and develop seeing prediction models using representative big data techniques, including ARIMA and Prophet for statistical modeling, MLP and XGBoost for machine learning, and LSTM, GRU and Transformer for deep learning. We perform empirical studies on the developed models with a variety of feature configurations, yielding notable insights into the applicability of big data techniques to the seeing prediction task.
Abstract: A statistical map is usually used to indicate the quantitative features of various socioeconomic phenomena among regions, drawn on a base map of administrative divisions or on other base maps connected with the statistical units. Making use of geographic information system (GIS) techniques, and supported by AutoCAD software, the author of this paper has put forward a practical method for making statistical maps and has developed a software tool (SMT), written in C, for making small-scale statistical maps.
Abstract: The development of adaptation measures to climate change relies on data from climate models or impact models. In order to analyze these large data sets or an ensemble of these data sets, the use of statistical methods is required. In this paper, the methodological approach to collecting, structuring and publishing the methods, which have been used or developed by former or present adaptation initiatives, is described. The intention is to communicate achieved knowledge and thus support future users. A key component is the participation of users in the development process. The main elements of the approach are standardized, template-based descriptions of the methods, including the specific applications, references, and method assessment. All contributions have been quality checked, sorted, and placed in a larger context. The result is a report on statistical methods which is freely available in print or online. Examples of how to use the methods are presented in this paper and are also included in the brochure.
Abstract: In the new era, the impact of emerging productive forces has permeated every sector of industry. As the core production factor of these forces, data plays a pivotal role in industrial transformation and social development. Consequently, many domestic universities have introduced majors or courses related to big data. Among these, the Big Data Management and Applications major stands out for its interdisciplinary approach and emphasis on practical skills. However, as an emerging field, it has not yet accumulated a robust foundation in teaching theory and practice. Current instructional practices face issues such as unclear training objectives, inconsistent teaching methods and course content, insufficient integration of practical components, and a shortage of qualified faculty, all of which hinder both the development of the major and the overall quality of education. Taking the statistics course within the Big Data Management and Applications major as an example, this paper examines the challenges faced by statistics education in the context of emerging productive forces and proposes corresponding improvement measures. By introducing innovative teaching concepts and strategies, the teaching system for professional courses is optimized, and authentic classroom scenarios are recreated through illustrative examples. Questionnaire surveys and statistical analyses of data collected before and after the teaching reforms indicate that the curriculum changes effectively enhance instructional outcomes, promote the development of the major, and improve the quality of talent cultivation.
Abstract: To reduce the enormous pressure that false sewage monitoring data place on environmental monitoring work, the Grubbs method, box plots, the t test, and other methods are used to analyze the data in depth, providing a convenient and simple technical procedure for identifying false sewage monitoring data.
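As a rough illustration of two of the screening tools named in this abstract, the sketch below flags suspect readings with the box-plot fence rule and computes the Grubbs statistic G = max|x - mean| / s. The function names, the fence multiplier of 1.5, and the sample readings are assumptions made for illustration; the abstract does not describe the paper's actual procedure or thresholds. Deciding whether G is significant requires a critical value from a Grubbs table for the given sample size and significance level, which is deliberately left to the caller.

```python
import statistics


def boxplot_outliers(values, k=1.5):
    """Return the values lying outside the fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def grubbs_statistic(values):
    """Return (G, suspect): the Grubbs statistic and the point farthest from the mean."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 in the denominator)
    suspect = max(values, key=lambda v: abs(v - mean))
    return abs(suspect - mean) / s, suspect


# Hypothetical readings (made-up numbers, not data from the paper).
readings = [10.1, 10.3, 9.9, 10.2, 10.0, 10.4, 9.8, 10.1, 10.2, 25.0]
flagged = boxplot_outliers(readings)
g, suspect = grubbs_statistic(readings)
print(flagged, suspect, round(g, 2))
```

The two checks are complementary: the box-plot rule needs no distributional assumption, while the Grubbs test assumes approximately normal data and examines one suspect value at a time.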
Funding: Supported by the Research Fund of Key GIS Lab of the Education Ministry (No. 200610)
Abstract: In this paper we propose a service-oriented architecture for spatial data integration (SOA-SDI) for settings in which a large number of available spatial data sources are physically located in different places, and we develop web-based GIS systems based on SOA-SDI that allow client applications to pull in, analyze and present spatial data from those sources. The proposed architecture logically comprises four layers or components: a layer of multiple data provider services, a data integration layer, a layer of backend services, and a front-end graphical user interface (GUI) for spatial data presentation. On the basis of the four-layer SOA-SDI framework, WebGIS applications can be quickly deployed, which shows that SOA-SDI has the potential to reduce software development effort and shorten the development period.
Funding: Supported by the National Key Research and Development Program of China (No. 2019YFC0214800).
Abstract: Air quality monitoring is effective for timely understanding of the current air quality status of a region or city. Currently, the huge volume of environmental monitoring data, which has reasonable real-time performance, provides strong support for in-depth analysis of air pollution characteristics and causes. However, in the era of big data, to meet current demands for fine management of the atmospheric environment, it is important to explore the characteristics and causes of air pollution from multiple aspects for comprehensive and scientific evaluation of air quality. This study reviewed and summarized air quality evaluation methods on the basis of environmental monitoring data statistics during the 13th Five-Year Plan period, and evaluated the level of air pollution in the Beijing-Tianjin-Hebei region and its surrounding areas (i.e., the "2+26" region) during the period of the three-year action plan to fight air pollution. We suggest that air quality should be comprehensively, deeply, and scientifically evaluated from the aspects of air pollution characteristics, causes, and influences of meteorological conditions and anthropogenic emissions. It is also suggested that a three-year moving average be introduced as one of the evaluation indexes of long-term change of pollutants. Additionally, both temporal and spatial differences should be considered when removing confounding meteorological factors.
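The three-year moving average suggested in this abstract can be sketched as a trailing mean over consecutive annual values. The function name and the annual concentrations below are hypothetical and chosen only to show the arithmetic; the abstract does not specify which pollutant statistic is averaged or how missing years are handled.

```python
def moving_average(values, window=3):
    """Trailing moving average: one output per full window of consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical annual mean concentrations for five consecutive years (made-up values).
annual_means = [60.0, 54.0, 48.0, 45.0, 39.0]
print(moving_average(annual_means))  # [54.0, 49.0, 44.0]
```

Averaging over three years damps the effect of a single year with unusually favorable or unfavorable weather, which is why such an index is better suited to judging long-term change than year-on-year differences.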
Abstract: Recently, the use of mobile communication devices such as smartphones and cellular phones in field data collection has been increasing, owing to the emergence of embedded Global Positioning System (GPS) receivers and Wi-Fi Internet access. Accurate, timely, and convenient field data collection is required for disaster management and rapid emergency response. In this article, we introduce a web-based GIS system that collects field data from personal mobile phones through a Post Office Protocol (POP3) mail server. The main objective of this work is to demonstrate a real-time field data collection method to students, who use their mobile phones to collect field data in a timely and convenient manner, in either individual or group surveys, for research at local or global scale.
Funding: Supported by the State Key Research and Development Program (Grant Nos. 2017YFC0209803, 2016YFC0208504, 2016YFC0203303 and 2017YFC0210106) and the National Natural Science Foundation of China (Grant Nos. 91544230, 41575145, 41621005 and 41275128)
Abstract: Atmospheric chemistry models usually perform badly in forecasting wintertime air pollution because of their uncertainties. Generally, such uncertainties can be decreased effectively by techniques such as data assimilation (DA) and model output statistics (MOS). However, the relative importance and combined effects of the two techniques have not been clarified. Here, a one-month air quality forecast with the Weather Research and Forecasting-Chemistry (WRF-Chem) model was carried out in a virtually operational setup focusing on Hebei Province, China. Meanwhile, three-dimensional variational (3DVar) DA and MOS based on one-dimensional Kalman filtering were implemented separately and simultaneously to investigate their performance in improving the model forecast. Comparison with observations shows that the chemistry forecast with MOS outperforms that with 3DVar DA, which could be seen in all the species tested over the whole 72 forecast hours. Combined use of both techniques does not guarantee a better forecast than MOS only, with the improvements and degradations being small and appearing rather randomly. Results indicate that the implementation of MOS is more suitable than 3DVar DA in improving the operational forecasting ability of WRF-Chem.
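To make the idea of MOS based on one-dimensional Kalman filtering more concrete, the sketch below tracks a slowly varying forecast bias with a scalar Kalman filter and subtracts it from each new forecast. This is a generic textbook formulation, not the paper's configuration: the random-walk bias model, the noise variances `q` and `r`, the initial values, and the function name are all assumptions made for illustration.

```python
def kalman_bias_correct(forecasts, observations, q=0.01, r=1.0, b0=0.0, p0=1.0):
    """Correct forecasts with a scalar Kalman filter that tracks the forecast bias.

    Assumed model: bias_t = bias_{t-1} + w_t (variance q), and the observed
    error (forecast - observation) equals bias_t + v_t (variance r).
    Returns the corrected forecasts and the final bias estimate.
    """
    bias, p = b0, p0
    corrected = []
    for f, o in zip(forecasts, observations):
        # Correct with the bias learned from earlier time steps only.
        corrected.append(f - bias)
        # Predict: the bias follows a random walk, so only its variance grows.
        p_prior = p + q
        # Update with the newly observed forecast error.
        gain = p_prior / (p_prior + r)
        bias = bias + gain * ((f - o) - bias)
        p = (1.0 - gain) * p_prior
    return corrected, bias


# Hypothetical case: the model overpredicts by a constant 5 units.
obs = [50.0] * 50
fcst = [o + 5.0 for o in obs]
corrected, bias = kalman_bias_correct(fcst, obs)
print(round(bias, 2), round(corrected[-1], 2))
```

The ratio q/r controls how quickly the filter adapts: a larger ratio follows recent errors more closely, while a smaller one yields a smoother but slower bias estimate.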
Funding: National Major Scientific Instruments and Equipment Development Special Funds, China (No. 2016YFF0103303); National Science and Technology Support Program, China (No. 2014BAK02B03)
Abstract: This paper proposes a web-based system for the management and sharing of electron probe micro-analysis (EPMA) data in geology. A new web-based architecture that integrates the management and sharing functions is developed and implemented. Earth scientists can use this system not only to manage their data, but also to communicate and share it easily with other researchers. Data query methods provide the core functionality of the proposed management and sharing modules. The modules in this system have been developed using cloud GIS technologies, which help achieve real-time spatial area retrieval on a map. The system has been tested by approximately 263 users at Jilin University and Beijing SHRIMP Center. A survey was conducted among these users to estimate the usability of the primary functions of the system, and the assessment results are summarized and presented.
Abstract: This paper is concerned with the development of product data management (PDM) systems based on web technologies (WPDM systems). As a tool for integrating information, a traditional PDM system brings companies many benefits, such as improved design productivity and better control over projects. With the maturing of web technologies, the advantages of WPDM systems are obvious. We show these advantages in detail in Part 3. A WPDM system is built on a three-tier application model, consisting of a back end, a middle layer and a front end, to provide security and flexibility. The basic design of each layer is briefly introduced in Part 4. In the future, WPDM will be extended to integrate with other applications to provide a complete web-based engineering environment.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 12090054).
Abstract: Cryo-electron microscopy (cryo-EM) provides a powerful tool to resolve the structure of biological macromolecules in their natural state. One advantage of cryo-EM technology is that different conformational states of a protein complex can be built simultaneously, and the distribution of the different states can be measured. This makes it possible to push cryo-EM technology beyond resolving protein structures, toward obtaining the thermodynamic properties of protein machines. Here, we used a deep manifold learning framework to obtain the conformational landscape of KaiC proteins, and further obtained the thermodynamic properties of this central oscillator component in the circadian clock by means of statistical physics.
Abstract: Knowledge of probability is reflected throughout people's daily life and production, and it shapes how people come to understand the world. By using the tools of probability and mathematical statistics, people can analyze various complex problems and data scientifically and reasonably, thus significantly improving their quality of life. At the same time, they can accurately predict the laws and trends of how things develop based on existing data. Because of these advantages, probability theory and mathematical statistics have become the approach taken to many complicated problems. At present, people are in great need of big data analysis. Similarly, people also need better methods of big data analysis to deal with various difficult problems in actual production and life.
Abstract: The application of frequency distribution statistics to data provides an objective means to assess the nature of the data distribution and the viability of numerical models that are used to visualize and interpret data. Two commonly used tools are kernel density estimation and the reduced chi-squared statistic used in combination with a weighted mean. Due to the wide applicability of these tools, we present a Java-based computer application called KDX to facilitate the visualization of data and the utilization of these numerical tools.
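The tools named in this abstract can be sketched in a few lines using their standard textbook definitions: an inverse-variance weighted mean, the reduced chi-squared statistic about that mean, and a Gaussian kernel density estimate. These are not taken from KDX itself, whose formulas and options the abstract does not describe; the function names, the Gaussian kernel, and the sample values are assumptions for illustration.

```python
import math


def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its 1-sigma uncertainty."""
    weights = [1.0 / s ** 2 for s in sigmas]
    wsum = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / wsum
    return mean, math.sqrt(1.0 / wsum)


def reduced_chi_squared(values, sigmas):
    """Chi-squared about the weighted mean, divided by n - 1 degrees of freedom."""
    mean, _ = weighted_mean(values, sigmas)
    chi2 = sum(((v - mean) / s) ** 2 for v, s in zip(values, sigmas))
    return chi2 / (len(values) - 1)


def kde_at(values, bandwidth, x):
    """Gaussian kernel density estimate evaluated at the point x."""
    norm = len(values) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values) / norm


# Hypothetical measurements with 1-sigma uncertainties (made-up numbers).
values = [10.0, 12.0]
sigmas = [1.0, 1.0]
print(weighted_mean(values, sigmas))
print(reduced_chi_squared(values, sigmas))  # 2.0
print(kde_at(values, bandwidth=1.0, x=11.0))
```

A reduced chi-squared near 1 suggests the scatter is consistent with the stated uncertainties, so the weighted mean is a reasonable summary; values well above 1 point to excess scatter, and a density estimate can then show whether the data form more than one population.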
Abstract: The application of statistical tools in plant pathology has advanced significantly during the past four decades. These tools include multivariate analyses of disease dynamics, namely principal component analysis, cluster analysis, factor analysis, pattern analysis, discriminant analysis, multivariate analysis of variance, correspondence analysis, canonical correlation analysis, redundancy analysis, genetic diversity analysis, and stability analysis, the last of which involves joint regression, additive main effects and multiplicative interactions, and genotype-by-environment interaction biplot analysis. Advanced statistical tools such as non-parametric analysis of disease association, meta-analysis, Bayesian analysis, and decision theory play an important role in the analysis of disease dynamics. Disease forecasting with simulation models has great potential in practical disease control strategies. Common mathematical tools such as the monomolecular, exponential, logistic, and Gompertz models and linked differential equations play an important role in growth curve analysis of disease epidemics. Box-and-whisker plots have been suggested as a highly informative means of displaying a range of numerical data. The probable applications of recent linear and non-linear mixed models, such as the linear mixed model, the generalized linear model, and generalized linear mixed models, are presented. Recent technologies such as microarray analysis, though cost effective, provide estimates of gene expression for thousands of genes simultaneously and deserve attention from molecular biologists. Some of these advanced tools can be applied in different branches of rice research, including crop improvement, crop production, crop protection, the social sciences, and agricultural engineering. Rice research scientists should take full advantage of these opportunities by adopting these promising technologies when planning experimental designs and when collecting, analyzing, and interpreting their research data sets.
Abstract: In atmospheric data assimilation systems, the forecast error covariance model is an important component. However, the parameters required by a forecast error covariance model are difficult to obtain because the true state is unknown. This study applies an error statistics estimation method to the Physical-space Statistical Analysis System (PSAS) height-wind forecast error covariance model. The method consists of two components: the first computes the error statistics using the National Meteorological Center (NMC) method, which is a lagged-forecast difference approach, within the framework of the PSAS height-wind forecast error covariance model; the second obtains a calibration formula to rescale the error standard deviations provided by the NMC method. The calibration is performed against error statistics estimated by maximum-likelihood estimation (MLE) from rawinsonde height observed-minus-forecast residuals. A complete set of formulas for estimating the error statistics and for the calibration is applied to a one-month-long dataset generated by a general circulation model of the Global Modeling and Assimilation Office (GMAO), NASA. There is a clear constant relationship between the error statistics estimated by the NMC method and those estimated by MLE. The final product provides a full set of 6-hour error statistics required by the PSAS height-wind forecast error covariance model over the globe. The features of these error statistics are examined and discussed.
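The lagged-forecast idea behind the NMC method can be sketched as follows: take pairs of forecasts with different lead times that are valid at the same time, and use the spread of their differences as a proxy for forecast error. The abstract does not give the calibration formula, so the single multiplicative `scale` below is a placeholder assumption standing in for it. The function names and height values are hypothetical.

```python
import statistics


def nmc_error_std(forecast_long, forecast_short):
    """Sample standard deviation of lagged-forecast differences.

    Both sequences hold forecasts valid at the same times, e.g. 48-hour
    and 24-hour forecasts of geopotential height at one grid point.
    """
    if len(forecast_long) != len(forecast_short):
        raise ValueError("forecast series must have the same length")
    diffs = [a - b for a, b in zip(forecast_long, forecast_short)]
    return statistics.stdev(diffs)


def calibrate(nmc_std, scale):
    """Rescale the NMC estimate; `scale` stands in for the paper's calibration."""
    return scale * nmc_std


# Hypothetical 500 hPa heights (m) from 48 h and 24 h forecasts, same valid times.
f48 = [5520.0, 5535.0, 5510.0, 5548.0]
f24 = [5515.0, 5541.0, 5512.0, 5540.0]
raw_std = nmc_error_std(f48, f24)
calibrated_std = calibrate(raw_std, scale=0.5)  # 0.5 is an arbitrary example value
```

In the study described above, the rescaling is fitted against MLE estimates from observed-minus-forecast residuals, which this sketch does not attempt.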
Abstract: Traffic tunnels are tunnel works built for traffic and transport in railways, highways, and rail transit. With many mountains and nearly one fifth of the global population, China has numerous large cities and megalopolises with rapidly growing economies and huge traffic demands. As a result, a great many railway, highway, and rail transit facilities are required in the country. In the past, the construction of these facilities mainly involved subgrade and bridge works; in recent years.
Abstract: It is well known that nonparametric estimation of the regression function is highly sensitive to the presence of even a small proportion of outliers in the data. To address the problem of atypical observations when the covariates of the nonparametric component are functional, robust estimates of the regression parameter and the regression operator are introduced. The main purpose of the paper is to consider data-driven methods of selecting the number of neighbors so that the proposed procedures are fully automatic. We use the k Nearest Neighbors (kNN) procedure to construct the kernel estimator of the proposed robust model. Under some regularity conditions, we state consistency results for kNN functional estimators that are uniform in the number of neighbors (UINN). Furthermore, a simulation study and an empirical application to real data on gasoline octane prediction are carried out to illustrate the higher predictive performance and the usefulness of the kNN approach.
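To make the idea of a kNN kernel estimator with a data-driven number of neighbors concrete, here is a minimal sketch. It is not the robust estimator of the paper. It assumes curves sampled on a common grid, a Euclidean distance between them, a quadratic kernel, an ordinary (non-robust) weighted average, and leave-one-out cross-validation as the selection rule. All names and the toy data are hypothetical.

```python
import math


def l2_distance(curve_a, curve_b):
    """Euclidean distance between two curves sampled on the same grid."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(curve_a, curve_b)))


def knn_kernel_predict(curves, responses, new_curve, k):
    """Kernel-weighted average over the k nearest curves.

    The bandwidth is the distance to the k-th neighbour, so it adapts locally.
    """
    ranked = sorted((l2_distance(c, new_curve), y) for c, y in zip(curves, responses))
    nearest = ranked[:k]
    h = nearest[-1][0]
    weights = [1.0 - (d / h) ** 2 if h > 0 else 0.0 for d, _ in nearest]
    total = sum(weights)
    if total <= 0.0:  # all weights vanish (e.g. k = 1): fall back to a plain mean
        return sum(y for _, y in nearest) / len(nearest)
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / total


def select_k_loo(curves, responses, candidates):
    """Pick k by minimising leave-one-out squared prediction error."""
    n = len(curves)
    best_k, best_err = None, math.inf
    for k in candidates:
        if not 1 <= k <= n - 1:
            continue
        err = 0.0
        for i in range(n):
            rest_x = curves[:i] + curves[i + 1:]
            rest_y = responses[:i] + responses[i + 1:]
            err += (responses[i] - knn_kernel_predict(rest_x, rest_y, curves[i], k)) ** 2
        err /= n
        if err < best_err:
            best_k, best_err = k, err
    return best_k, best_err


# Toy data: curves a*t on a three-point grid, with response a**2.
grid = [0.0, 0.5, 1.0]
amplitudes = [i / 10 for i in range(20)]
curves = [[a * t for t in grid] for a in amplitudes]
responses = [a ** 2 for a in amplitudes]
best_k, best_err = select_k_loo(curves, responses, candidates=[1, 2, 3, 5, 8])
```

A robust variant would replace the weighted average with an M-type estimate, which is beyond this sketch.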
Funding: Supported by the National Natural Science Foundation of China (U1931207, 61602278 and 61702306), the Sci. & Tech. Development Fund of Shandong Province of China (2016ZDJS02A11, ZR2017BF015 and ZR2017MF027), the Humanities and Social Science Research Project of the Ministry of Education (18YJAZH017), the Taishan Scholar Program of Shandong Province, and the Science and Technology Support Plan of Youth Innovation Team of Shandong Higher School (2019KJN024).
Abstract: Predicting the seeing of astronomical observations can indicate the quality of optical imaging in the near future and facilitate flexible scheduling of observation tasks to maximize the use of astronomical observatories. Traditional approaches to seeing prediction mostly rely on regional weather models to capture in-dome optical turbulence patterns. Thanks to the development of data gathering and aggregation facilities at astronomical observatories in recent years, data-driven approaches to predicting astronomical seeing are becoming increasingly feasible and attractive. This paper systematically investigates data-driven approaches to seeing prediction by applying various big data techniques, from traditional statistical modeling and machine learning to newly emerging deep learning methods, to the monitoring data of the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST). The raw monitoring data are preprocessed to allow for big data modeling. We then formulate the seeing prediction task under each type of modeling framework and develop seeing prediction models using representative big data techniques, including ARIMA and Prophet for statistical modeling, MLP and XGBoost for machine learning, and LSTM, GRU and Transformer for deep learning. We perform empirical studies on the developed models with a variety of feature configurations, yielding notable insights into the applicability of big data techniques to the seeing prediction task.
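One common way to formulate a time-series prediction task for machine learning and deep learning models is a sliding window: the last few observations form the input and a later observation is the target. The abstract does not specify the formulation used for LAMOST, so the sketch below is an assumption. The function name, window sizes, and seeing values are hypothetical.

```python
def make_windows(series, lookback, horizon=1):
    """Turn a univariate series into (input window, target) training pairs.

    Each input holds `lookback` consecutive values; the target is the value
    `horizon` steps after the end of that window.
    """
    if lookback < 1 or horizon < 1:
        raise ValueError("lookback and horizon must be at least 1")
    inputs, targets = [], []
    last_start = len(series) - lookback - horizon
    for start in range(last_start + 1):
        inputs.append(series[start:start + lookback])
        targets.append(series[start + lookback + horizon - 1])
    return inputs, targets


# Hypothetical seeing measurements (arcseconds) at regular intervals.
seeing = [1.2, 1.1, 1.4, 1.3, 1.0, 0.9]
X, y = make_windows(seeing, lookback=3, horizon=1)
for window, target in zip(X, y):
    print(window, "->", target)
```

The resulting pairs can be fed to any regressor. Additional monitoring variables would extend each window with extra feature columns.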
Abstract: Statistical maps are usually used to show the quantitative features of various socioeconomic phenomena across regions, drawn on a base map of administrative divisions or on other base maps that correspond to the statistical units. Using geographic information system (GIS) techniques with the support of AutoCAD software, the author puts forward a practical method for making statistical maps and has developed a software tool (SMT), written in C, for making small-scale statistical maps.
Abstract: The development of climate change adaptation measures relies on data from climate models or impact models. Analyzing these large data sets, or an ensemble of them, requires statistical methods. This paper describes a methodological approach to collecting, structuring, and publishing the methods that have been used or developed by former or present adaptation initiatives. The intention is to communicate the knowledge gained and thus support future users. A key component is the participation of users in the development process. The main elements of the approach are standardized, template-based descriptions of the methods, including their specific applications, references, and a method assessment. All contributions have been quality checked, sorted, and placed in a larger context. The result is a report on statistical methods that is freely available in print or online. Examples of how to use the methods are presented in this paper and are also included in the brochure.