Fund: Supported by grants from the National Natural Science Foundation of China (T2293774, 72571269, 72201265), the National Key Research and Development Program of China (2022YFC3321104), the China Postdoctoral Science Foundation (2023T160635, 2022M723105), the Fundamental Research Funds for the Central Universities, and the MOE Social Science Laboratory of Digital Economic Forecasts and Policy Simulation at the University of Chinese Academy of Sciences.
Abstract: In a data-intensive environment, the ability to accurately identify and manage data risks is essential for maintaining cybersecurity, preventing potential threats, supporting decision-making, and enabling effective post-incident analysis. Existing approaches to data risk identification are typically structured around the stages of the data lifecycle, offering a broad perspective but often lacking alignment with the specific dynamics of business operations. This study proposes a data-driven framework for data risk identification that reflects practical business contexts. The framework incorporates 25 categorized risk sources and 13 defined risk-triggering events, using data analysis to examine their interactions and influence. The approach demonstrates strong alignment with documented risk incidents and effectively captures relevant risk factors across operational scenarios. Implementing this framework enables organizations to identify critical risk points more precisely, improve the accuracy and timeliness of risk response strategies, and strengthen data governance practices. It also facilitates more informed strategic planning and cross-functional coordination, contributing to improved resilience and operational efficiency.
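As a minimal sketch of how interactions between risk sources and triggering events could be examined with data analysis, the snippet below counts co-occurrences in a hypothetical incident log and ranks the most frequent risk points. The category labels and records are invented for illustration; they are not the paper's actual taxonomy of 25 sources and 13 events.

```python
from collections import Counter

# Hypothetical incident log: (risk source, triggering event) pairs.
# Labels are invented, not the framework's real categories.
incidents = [
    ("unauthorized_access", "credential_leak"),
    ("unauthorized_access", "credential_leak"),
    ("misconfiguration", "public_exposure"),
    ("unauthorized_access", "phishing"),
    ("misconfiguration", "public_exposure"),
]

def interaction_counts(records):
    """Count how often each risk source co-occurs with each event."""
    return Counter(records)

def top_risk_points(records, k=2):
    """Return the k most frequent (source, event) interactions."""
    return interaction_counts(records).most_common(k)

print(top_risk_points(incidents))
```

Ranking the pairs this way surfaces the "critical risk points" the abstract mentions; a real deployment would feed in logged incidents rather than a hand-written list.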
Fund: Supported by the Soft Science Research Project of Guizhou Province (R20142023) and the Key Youth Fund Project of Guizhou Academy of Sciences (J201402).
Abstract: Comprehensive evaluation and early warning are important and difficult problems in food safety. This paper focuses on the application of big data mining in the food safety warning field. We first introduce the concept of big data mining and three big data mining methods, and discuss the application of each method in the food safety area. We then compare these methods and propose how to apply a Back Propagation Neural Network to food safety risk warning.
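A Back Propagation Neural Network of the kind proposed can be sketched from scratch: a single hidden layer whose weights are updated by propagating the output error backwards. The two input indicators, their values, and all hyperparameters below are invented for illustration and are not the paper's actual food safety features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy samples: each row = [additive_level, contaminant_level] scaled to
# [0, 1]; label 1 = risky food, 0 = safe. Purely illustrative values.
X = np.array([[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]])
y = np.array([[1.0], [1.0], [0.0], [0.0]])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer with 4 units; weights learned by back-propagation.
W1 = rng.normal(scale=0.5, size=(2, 4))
W2 = rng.normal(scale=0.5, size=(4, 1))

def forward(X):
    hidden = sigmoid(X @ W1)
    return hidden, sigmoid(hidden @ W2)

loss_before = float(np.mean((forward(X)[1] - y) ** 2))

lr = 0.5
for _ in range(2000):
    hidden, out = forward(X)
    # Propagate the output error backwards through both layers.
    d_out = (out - y) * out * (1 - out)
    d_hidden = (d_out @ W2.T) * hidden * (1 - hidden)
    W2 -= lr * hidden.T @ d_out
    W1 -= lr * X.T @ d_hidden

loss_after = float(np.mean((forward(X)[1] - y) ** 2))
print(loss_before, loss_after)
```

The training loop is the whole of back-propagation for this architecture: chain-rule gradients of the squared error flow from the output layer into the hidden layer.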
Abstract: We consider the problem of selecting a model for the dependency between the terminal event and the non-terminal event under semi-competing risks data. When the relationship between the two events is unspecified, inference on the non-terminal event is not identifiable; we cannot make inference on the non-terminal event without extra assumptions. Thus, an association model for semi-competing risks data is necessary, and it is important to select an appropriate dependence model for a data set. We construct the likelihood function for semi-competing risks data to select an appropriate dependence model. Simulation studies show that the proposed approach performs well. Finally, we apply our method to a bone marrow transplant data set.
Fund: Supported by the National Natural Science Foundation of China (No. 70772021, 70372004) and the China Postdoctoral Science Foundation (No. 20060400077).
Abstract: This paper presents a methodology to determine three data quality (DQ) risk characteristics: accuracy, comprehensiveness, and nonmembership. The methodology provides a set of quantitative models to assess the information quality risks for the database of a geographical information system (GIS). Four quantitative measures are introduced to examine how the quality risks of source information affect the quality of information outputs produced using the relational algebra operations Selection, Projection, and Cubic Product, and thus to determine how quality risks associated with diverse data sources affect the derived data. In the construction business, the GIS is the prime source of information on the location of cables, and detection time strongly depends on whether maps indicate the presence of cables. Poor data quality in the GIS can contribute to increased risk or higher risk-avoidance costs. A case study provides a numerical example of the trade-off between risk and detection costs and of the calculation of the costs of data quality. We conclude that the model contributes valuable new insight.
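One simple way to see how source quality risk propagates through relational algebra operations is sketched below, under the simplifying assumption that record-level errors in different sources are independent; the operation semantics and accuracy values are illustrative, not the paper's exact quantitative models.

```python
# Accuracy = fraction of correct records in a table (illustrative).

def accuracy_selection(source_acc):
    # Selection keeps a subset of rows; per-record accuracy is unchanged.
    return source_acc

def accuracy_product(acc_a, acc_b):
    # A combined (Cartesian/Cubic) product record is correct only if
    # both contributing records are correct, assuming independent errors.
    return acc_a * acc_b

gis_cable_layer = 0.95   # hypothetical accuracy of a GIS cable map
land_registry = 0.90     # hypothetical accuracy of a second source

derived = accuracy_product(accuracy_selection(gis_cable_layer), land_registry)
print(round(derived, 3))
```

Even with two fairly accurate sources, the derived table's accuracy drops below either input, which is exactly why derived-data risk needs to be quantified rather than inherited from the best source.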
Abstract: Rough set theory is a relatively new area of soft computing for handling uncertain big data efficiently. It also provides a powerful way to calculate the importance degree of vague and uncertain big data to support decision making. Risk assessment is essential for safe and reliable investment; risk management involves assessing risk sources and designing strategies and procedures to mitigate those risks to an acceptable level. In this paper, we focus on classifying different types of risk factors and finding a simple and effective way to calculate risk exposure. The study uses the rough set method to classify and judge the safety attributes related to investment policy. The method, which is based on intelligent knowledge acquisition, provides an innovative way to perform risk analysis. With this approach, we can calculate the significance of each factor and its relative risk exposure from the original data, without assigning weights subjectively.
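The rough-set calculation of a factor's significance can be sketched as the drop in the dependency degree gamma when that attribute is removed from the condition set, which is how weights emerge from the data rather than from subjective assignment. The toy decision table below is invented; in it, attribute a1 alone determines the decision, so a2 gets zero significance.

```python
# Toy decision table: rows = investment cases with condition
# attributes a1, a2 (risk factor levels) and decision d.
U = [
    {"a1": "high", "a2": "low",  "d": "risky"},
    {"a1": "high", "a2": "high", "d": "risky"},
    {"a1": "low",  "a2": "low",  "d": "safe"},
    {"a1": "low",  "a2": "high", "d": "safe"},
]

def partition(attrs):
    """Equivalence classes of U under indiscernibility on attrs."""
    classes = {}
    for i, row in enumerate(U):
        classes.setdefault(tuple(row[a] for a in attrs), []).append(i)
    return list(classes.values())

def dependency(cond):
    """gamma(cond, d): fraction of objects in the positive region."""
    pos = 0
    for block in partition(cond):
        if len({U[i]["d"] for i in block}) == 1:  # block consistent w.r.t. d
            pos += len(block)
    return pos / len(U)

def significance(attr, cond=("a1", "a2")):
    """How much the dependency degree drops when attr is removed."""
    rest = tuple(a for a in cond if a != attr)
    return dependency(cond) - dependency(rest)

print(significance("a1"), significance("a2"))
```

A factor whose removal leaves gamma unchanged is superfluous; significance values normalized over all factors give the objective weights the abstract refers to.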
Abstract: This paper considers quantile regression analysis based on semi-competing risks data, in which a non-terminal event may be dependently censored by a terminal event. The major interest is the covariate effects on the quantiles of the non-terminal event time. Dependent censoring is handled by assuming that the joint distribution of the two event times follows a parametric copula model with unspecified marginal distributions. The technique of inverse probability weighting (IPW) is adopted to adjust for the selection bias. Large-sample properties of the proposed estimator are derived, and a model diagnostic procedure is developed to check the adequacy of the model assumption. Simulation results show that the proposed estimator performs well. For illustrative purposes, our method is applied to analyze the bone marrow transplant data in [1].
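The IPW idea can be caricatured in a few lines: observed non-terminal event times are reweighted by the inverse of each subject's probability of remaining uncensored, and a weighted quantile is taken. The times and weights below are supplied directly and are hypothetical; in the paper the weights would come from the assumed copula and censoring model, and the quantile enters through a regression rather than a marginal estimate.

```python
def weighted_quantile(times, weights, q):
    """Smallest time at which the cumulative weight reaches q of the total."""
    pairs = sorted(zip(times, weights))
    total = sum(weights)
    acc = 0.0
    for t, w in pairs:
        acc += w
        if acc >= q * total:
            return t
    return pairs[-1][0]

times = [3.0, 5.0, 7.0, 9.0]          # observed non-terminal event times
weights = [1.0, 1.25, 1.5, 2.0]       # 1 / P(uncensored), hypothetical
print(weighted_quantile(times, weights, 0.5))
```

Up-weighting later event times (which are more likely to be censored away) shifts the estimated median upward relative to the naive unweighted median, which is the selection-bias correction at work.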
Fund: Supported by the National Natural Science Foundation of China (Grant No. 71401052), the National Social Science Foundation of China (Grant No. 17BGL156), and the Key Project of the National Social Science Foundation of China (Grant No. 14AZD024).
Abstract: Identification of security risk factors for small reservoirs is the basis for implementing early warning systems, and the manner of identification is of practical significance when data are incomplete. Existing grey relational models have disadvantages in measuring the correlation between categorical data sequences. To this end, this paper introduces a new grey relational model to analyze heterogeneous data. In this study, a set of security risk factors for small reservoirs was first constructed based on theoretical analysis, and heterogeneous data on these factors were recorded as sequences. The sequences were regarded as random variables, and the information entropy and conditional entropy between sequences were measured to analyze the relational degree between risk factors. A new grey relational analysis model for heterogeneous data was then constructed, and a comprehensive security risk factor identification method was developed. A case study of small reservoirs in the Guangxi Zhuang Autonomous Region of China shows that the model is applicable to security risk factor identification for small reservoirs with heterogeneous and sparse data.
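The entropy-based relational degree between two categorical sequences can be sketched as a normalized reduction in uncertainty, 1 - H(X|Y)/H(X): it is 1 when Y fully determines X and 0 when they are independent. This normalization is one plausible reading of the construction, not necessarily the paper's exact model, and the reservoir factor data below are invented.

```python
from collections import Counter
from math import log2

def entropy(seq):
    """Shannon entropy of a categorical sequence, in bits."""
    n = len(seq)
    return -sum(c / n * log2(c / n) for c in Counter(seq).values())

def conditional_entropy(x, y):
    """H(X|Y) for paired categorical sequences x, y."""
    n = len(x)
    h = 0.0
    for y_val, cnt in Counter(y).items():
        sub = [xi for xi, yi in zip(x, y) if yi == y_val]
        h += cnt / n * entropy(sub)
    return h

def relational_degree(x, y):
    """1 - H(X|Y)/H(X): 1 = y determines x, 0 = independent."""
    hx = entropy(x)
    return 1.0 if hx == 0 else 1.0 - conditional_entropy(x, y) / hx

# Hypothetical categorical records for two reservoir risk factors.
seepage = ["none", "minor", "none", "severe", "minor", "none"]
crest_cracks = ["no", "yes", "no", "yes", "yes", "no"]

print(round(relational_degree(crest_cracks, seepage), 3))
```

Because the measure works directly on categorical labels, it avoids the numeric-sequence assumptions that make classical grey relational models awkward for heterogeneous data.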
Abstract: A construction project is not a standalone engineering maneuver; it is closely linked to the well-being of the local communities concerned. The city renovation of Beijing's downtown center for the 2008 Olympics transformed much antique architecture and regional landscape. It delivered a world-recognized achievement in China's modern development and marked a major milestone in China's economic development. In metro construction projects, there are substantial interwoven municipal structures influencing the success of the projects, including, but not limited to, underground cables and ducts, the sewage system, the power consumption of construction works, traffic diversion, air pollution, expatriate business activities, and social security. Many US and UK project insurance companies are moving into Asia Pacific, doing re-insurance business on major construction guarantees such as machinery damage, on-time completion, power consumption, and claims from contractors and communities. Environmental information, such as water quality, indoor and outdoor air quality, people inflow, and lift waiting time, plays a deterministic role in a construction's fitness for use. Big Data has been a contemporary buzzword since 2013, and its key competence is to respond in real time to heuristic signals in order to make short-term predictions. This paper attempts to develop a conceptual big data model for construction.
Abstract: This paper proposes a simple two-step nonparametric procedure to estimate the intraday jump tail and measure the jump tail risk in asset prices with noisy high-frequency data. We first propose a pre-averaging threshold approach to estimate the intraday jumps that occurred, and then use the peaks-over-threshold (POT) method and the generalized Pareto distribution (GPD) to model the intraday jump tail and measure the jump tail risk. An empirical example demonstrates the power of the proposed method to measure the jump tail risk under the effect of microstructure noise.
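The POT step can be sketched as extracting exceedances of detected jump sizes over a high threshold and fitting a GPD, here by the closed-form method of moments rather than the paper's estimation procedure. The jump series and threshold are synthetic, and the pre-averaging jump detection step is not reproduced.

```python
def pot_exceedances(jumps, threshold):
    """Amounts by which |jump| exceeds the threshold."""
    return [abs(j) - threshold for j in jumps if abs(j) > threshold]

def gpd_fit_mom(excesses):
    """Method-of-moments GPD fit: returns (shape xi, scale sigma).

    Uses mean m and variance v of the excesses:
    xi = (1 - m^2/v) / 2,  sigma = m * (1 + m^2/v) / 2.
    """
    n = len(excesses)
    m = sum(excesses) / n
    v = sum((e - m) ** 2 for e in excesses) / n
    xi = 0.5 * (1 - m * m / v)
    sigma = 0.5 * m * (1 + m * m / v)
    return xi, sigma

# Synthetic detected jump sizes for one trading day.
jumps = [0.2, -0.5, 1.1, -1.4, 0.9, 2.0, -0.3, 1.8]
excesses = pot_exceedances(jumps, threshold=0.8)
xi, sigma = gpd_fit_mom(excesses)
print(len(excesses), xi, sigma)
```

With the fitted (xi, sigma), tail quantiles of the GPD give the jump tail risk measure; a maximum-likelihood fit would normally replace the moments fit when enough exceedances are available.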
Abstract: This paper discusses three issues of statistical inference for a common risk ratio (RR) in sparse meta-analysis data. Firstly, the conventional log-risk-ratio estimator encounters a number of problems when the number of events in the experimental or control group of a 2 × 2 table is zero. An adjusted log-risk-ratio estimator is proposed, with continuity correction points based upon the minimum Bayes risk with respect to the uniform prior density over (0, 1) and the Euclidean loss function. Secondly, the interest is in finding the optimal weights of the pooled estimate that minimize its mean square error (MSE) subject to a constraint on the weights. Finally, the performance of this minimum-MSE weighted estimator, adjusted with various correction points, is compared with other popular estimators, such as the Mantel-Haenszel (MH) estimator and the weighted least squares (WLS) estimator (also known as the inverse-variance weighted estimator), in terms of point estimation and hypothesis testing via simulation studies. The results illustrate that, regardless of the true value of RR, the MH estimator achieves the best performance, with the smallest MSE, when the study size is rather large and the sample sizes within each study are small. The MSEs of the WLS estimator and the proposed weighted estimator are close together and are the best when the sample sizes are moderate to large while the study size is rather small.
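The two estimators under comparison can be sketched directly: a continuity-corrected log risk ratio for a sparse 2 × 2 table (using the classic c = 0.5 rather than the paper's Bayes-derived correction points) and the Mantel-Haenszel pooled risk ratio across studies. The tables below, given as (events, size) per arm, are invented.

```python
from math import log

def log_rr_corrected(a, n1, b, n0, c=0.5):
    """log RR with correction c applied when either arm has zero events.

    a, b = events in experimental / control arm; n1, n0 = arm sizes.
    """
    if a == 0 or b == 0:
        return log(((a + c) / (n1 + c)) / ((b + c) / (n0 + c)))
    return log((a / n1) / (b / n0))

def mantel_haenszel_rr(tables):
    """MH pooled RR over (a, n1, b, n0) study tables."""
    num = sum(a * n0 / (n1 + n0) for a, n1, b, n0 in tables)
    den = sum(b * n1 / (n1 + n0) for a, n1, b, n0 in tables)
    return num / den

tables = [(3, 50, 1, 50), (0, 40, 2, 40), (5, 60, 4, 60)]
print(round(mantel_haenszel_rr(tables), 3))
```

Note that the second table contributes zero events in the experimental arm: the MH estimator handles it without correction, whereas the per-study log RR needs the continuity adjustment, which is exactly the sparse-data problem the abstract addresses.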
Abstract: Cyberattacks are difficult to prevent because the targeted companies and organizations often rely on new and fundamentally insecure cloud-based technologies, such as the Internet of Things. With increasing industry adoption and migration of traditional computing services to the cloud, one of the main challenges in cybersecurity is to provide mechanisms to secure these technologies. This work proposes a Data Security Framework for cloud computing services (CCS) that evaluates and improves CCS data security from a software engineering perspective, evaluating the levels of security within the cloud computing paradigm using engineering methods and techniques applied to CCS. The framework is developed by means of a methodology based on a heuristic theory that incorporates knowledge generated by existing works as well as the experience of their implementation. The paper presents the design details of the framework, which consists of three stages: identification of data security requirements, management of data security risks, and evaluation of data security performance in CCS.
Abstract: Since the creation of spatial data is a costly and time-consuming process, researchers in this domain mostly rely on open-source spatial attributes for their specific purposes. Likewise, the present research aims at mapping landslide susceptibility in the metropolitan area of Chittagong district of Bangladesh, utilizing obtainable open-source spatial data from various web portals. We targeted a study region where rainfall-induced landslides reportedly cause casualties as well as property damage each year. We employed a multi-criteria evaluation (MCE) technique, i.e., a heuristic, knowledge-driven approach based on expert opinions from various disciplines, for landslide susceptibility mapping, combining nine causative factors (geomorphology, geology, land use/land cover [LULC], slope, aspect, plan curvature, drainage distance, relative relief, and vegetation) in a geographic information system (GIS) environment. The final susceptibility map was divided into five hazard classes, viz., very low, low, moderate, high, and very high, representing 22 km2 (13%), 90 km2 (53%), 24 km2 (15%), 22 km2 (13%), and 10 km2 (6%) of the area, respectively. This study might be beneficial to the local authorities and other stakeholders concerned with disaster risk reduction and mitigation activities, and can also be advantageous for risk-sensitive land use planning in the study area.
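A knowledge-driven MCE overlay of the kind described can be sketched as a weighted sum of factor ratings sliced into five hazard classes. The three factors, expert weights, ratings, and class breaks below are invented for illustration; the study itself combines nine factors.

```python
import numpy as np

# Hypothetical expert weights for three of the causative factors.
weights = {"slope": 0.4, "geology": 0.35, "lulc": 0.25}

# 2x3 toy rasters of factor ratings (1 = least, 5 = most susceptible).
rasters = {
    "slope":   np.array([[5, 4, 1], [2, 3, 5]]),
    "geology": np.array([[4, 4, 2], [1, 3, 5]]),
    "lulc":    np.array([[5, 2, 1], [1, 4, 5]]),
}

def susceptibility(rasters, weights):
    """Weighted linear overlay of the factor rating rasters."""
    return sum(w * rasters[name] for name, w in weights.items())

def classify(score):
    """Slice the 1-5 score range into five hazard classes.

    0 = very low, 1 = low, 2 = moderate, 3 = high, 4 = very high.
    """
    bins = [1.8, 2.6, 3.4, 4.2]
    return np.digitize(score, bins)

score = susceptibility(rasters, weights)
print(classify(score))
```

Replacing the toy arrays with the real factor rasters (reclassified to a common rating scale) and expert weights reproduces the standard GIS weighted-overlay workflow behind such susceptibility maps.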
Fund: This project was supported by Fubangs Science & Technology Company Ltd.
Abstract: In this paper, the credit-risk decision mechanism for banks is studied for the case where the loan interest rate is fixed under asymmetric information in the credit market. We give designs of rationing and non-rationing credit-risk decision mechanisms for the case where the collateral value provided by an entrepreneur is not less than the minimum demanded by the bank. It is shown that, under this mechanism, banks can efficiently identify the risk size of a project. Finally, the condition under which the bank investigates the project is given.