As a new type of production factor in healthcare, healthcare data elements have been rapidly integrated into various health production processes, such as clinical assistance, health management, biological testing, and operation and supervision [1,2]. Healthcare data elements include biological and clinical data related to disease, environmental health data associated with life, and operational and healthcare management data related to healthcare activities (Figure 1). Activities such as the construction of a data value assessment system, the development of a data circulation and sharing platform, and the authorization of data compliance and operation products support the strong growth momentum of the market for healthcare data elements in China [3].
With the advent of the digital economy, there has been a rapid proliferation of small-scale Internet data centers (SIDCs). By leveraging their spatiotemporal load regulation potential through data workload balancing, aggregated SIDCs have emerged as promising demand response (DR) resources for future power distribution systems. This paper presents an innovative framework for assessing the capacity value (CV) of aggregated SIDCs participating in DR programs (SIDC-DR). First, we delineate the concept of CV tailored to aggregated SIDC scenarios and establish a metric for its assessment. Considering the effects of data load dynamics, equipment constraints, and user behavior, we develop a sophisticated DR model for aggregated SIDCs using a data network aggregation method. Unlike existing studies, the proposed model captures the uncertainty in end tenants' decisions to opt into an SIDC-DR program through a novel uncertainty modeling approach, the Z-number formulation. This approach accounts for both the uncertainty in user participation intentions and the reliability of basic information during the DR process, enabling high-resolution profiling of SIDC-DR potential in the CV evaluation. Simulation results from numerical studies on a modified IEEE 33-node distribution system confirm the effectiveness of the proposed approach and highlight the potential benefits of SIDC-DR utilization for the efficient operation of future power systems.
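As a concrete illustration of the Z-number idea, the following is a minimal sketch in which each tenant's declared DR capacity is a fuzzy restriction (here triangular) paired with a reliability weight. The membership shape, the centroid defuzzification, and the simple reliability discount are illustrative assumptions, not the paper's formulation.

```python
from dataclasses import dataclass

@dataclass
class ZNumber:
    """Z-number (A, B): A is a fuzzy restriction on the DR capacity a tenant
    offers (triangular, in kW); B is the reliability of that claim."""
    low: float          # left foot of the triangular membership
    mode: float         # peak
    high: float         # right foot
    reliability: float  # B in [0, 1]

    def expected_capacity(self) -> float:
        # Centroid of the triangular fuzzy set, discounted by reliability.
        centroid = (self.low + self.mode + self.high) / 3.0
        return self.reliability * centroid

# Aggregate the DR capacity credited to a cluster of SIDC tenants.
tenants = [ZNumber(10, 25, 40, 0.9), ZNumber(5, 15, 30, 0.6)]
aggregated_kw = sum(t.expected_capacity() for t in tenants)
print(f"credited DR capacity: {aggregated_kw:.1f} kW")
```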
In the rapidly evolving landscape of digital health, the integration of data analytics and Internet health services has become a pivotal area of exploration. To meet pressing social needs, Prof. Shan Liu (Xi'an Jiaotong University) and Prof. Xing Zhang (Wuhan Textile University) have published the timely book Data-driven Internet Health Platform Service Value Co-creation with China Science Press. The book focuses on the provision of medical and health services from doctors to patients through Internet health platforms, where service value is co-created by the three parties.
Calorific value is one of the most important properties of coal. Machine learning (ML) can be used to predict calorific value and thereby reduce experimental costs. China is one of the world's largest coal-producing countries, and coal occupies an important position in its national energy structure. However, ML models backed by a large database covering all regions of China have been missing. Drawing on the extensive coal gasification practice at East China University of Science and Technology, we built ML models on a large database covering all regions of China. An AutoML model was proposed and achieved a minimum MSE of 1.021. The SHAP method was used to increase model interpretability, and model validity was verified with literature data and additional in-house experiments. Model adaptability was discussed using the databases of China and the USA, showing that geography-specific ML models are essential. This study integrates a large coal database with an AutoML method for accurate calorific value prediction and offers key tools for the Chinese coal industry.
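To make the pipeline concrete, here is a minimal sketch in the spirit of the study: fit a tree-ensemble regressor on proximate-analysis features and attribute predictions with SHAP. The synthetic data, the feature choice, and the use of scikit-learn's GradientBoostingRegressor in place of the paper's AutoML search are all illustrative assumptions.

```python
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(500, 4))   # toy proximate analysis: moisture, ash, volatiles, fixed carbon
y = 30 - 12 * X[:, 0] - 20 * X[:, 1] + rng.normal(0, 0.5, 500)  # toy calorific value, MJ/kg

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
print("test MSE:", np.mean((model.predict(X_te) - y_te) ** 2))

# SHAP attributes each prediction to the input features, as in the paper's
# interpretability analysis.
explainer = shap.Explainer(model, X_tr)
shap_values = explainer(X_te[:5])
print(shap_values.values.shape)        # (5 samples, 4 features)
```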
Opportunistic mobile crowdsensing (MCS), which non-intrusively exploits human mobility trajectories and uses participants' smart devices as sensors, has become a promising paradigm for various urban data acquisition tasks. In practice, however, opportunistic MCS faces several challenges from the perspectives of both the MCS participants and the data platform. On the one hand, participants face uncertainties in conducting MCS tasks, including their mobility and implicit interactions among participants, and the economic returns the MCS data platform gives participants are determined not only by their own actions but also by other participants' strategic actions. On the other hand, the platform can only observe the participants' uploaded sensing data, which depends on the unknown effort/action the participants exert, while, to optimize its overall objective, the platform needs to properly reward certain participants to incentivize them to provide high-quality data. To address the challenge of balancing individual incentives and platform objectives in MCS, this paper proposes MARCS, an online sensing policy based on multi-agent deep reinforcement learning (MADRL) with centralized training and decentralized execution (CTDE). Specifically, the interactions between MCS participants and the data platform are modeled as a partially observable Markov game, in which participants, acting as agents, use DRL-based policies to make decisions based on local observations such as task trajectories and platform payments. To align individual and platform goals effectively, the platform leverages the Shapley value to estimate the contribution of each participant's sensed data, using these estimates as immediate rewards to guide agent training. Experimental results on real mobility trajectory datasets indicate that the revenue of MARCS is almost 35%, 53%, and 100% higher than that of DDPG, Actor-Critic, and model predictive control (MPC), respectively, on the participant side, with similar results on the platform side, demonstrating superior performance over the baselines.
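The reward step can be illustrated in isolation: estimate each participant's Shapley contribution to the platform's value by permutation sampling and use it as the immediate reward. The stand-in value function and the participant qualities below are assumptions; the MADRL training loop itself is omitted.

```python
import random

def platform_value(coalition: frozenset) -> float:
    # Illustrative stand-in: diminishing returns in the amount of sensed data.
    quality = {0: 3.0, 1: 2.0, 2: 1.0}     # per-participant data quality (assumed)
    return sum(quality[i] for i in coalition) ** 0.5

def shapley_rewards(players, v, n_samples=2000, seed=0):
    """Monte Carlo Shapley values: average marginal contribution of each
    player over randomly ordered coalition build-ups."""
    rng = random.Random(seed)
    phi = {p: 0.0 for p in players}
    for _ in range(n_samples):
        order = list(players)
        rng.shuffle(order)
        coalition, prev = frozenset(), 0.0
        for p in order:
            coalition = coalition | {p}
            val = v(coalition)
            phi[p] += val - prev           # marginal contribution of p
            prev = val
    return {p: s / n_samples for p, s in phi.items()}

print(shapley_rewards([0, 1, 2], platform_value))
```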
China is vigorously carrying out the construction of green buildings and ecological cities during the 12th Five-Year Plan. Although identification systems for design and operation have been implemented in green building evaluation, an appropriate post-use evaluation is still lacking. The actual operational results of many buildings that have obtained a green building label are not satisfactory from the user's perspective. This paper discusses an evaluation method that combines actual building energy consumption with users' satisfaction, based on post-occupancy evaluation (POE) theory and big data technology. Through the comparison and analysis of objective building operation metrics and subjective user-perception indicators, the POE of green buildings is achieved. Finally, the study analyzes the assessed value of green buildings' POE over three time dimensions: short-term, medium-term, and long-term. Directions for follow-up study are also outlined.
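A minimal sketch of how such a combined score might be formed, assuming a 0.5/0.5 weighting and simple normalizations that are not taken from the paper:

```python
def poe_score(energy_kwh_m2: float, target_kwh_m2: float,
              satisfaction_1to5: float, w_energy: float = 0.5) -> float:
    """Combine an objective metric (measured vs. design-target energy use)
    with a subjective one (survey satisfaction) into one POE score in [0, 1]."""
    energy_norm = min(target_kwh_m2 / energy_kwh_m2, 1.0)    # 1.0 = meets design target
    satisfaction_norm = (satisfaction_1to5 - 1) / 4          # map 1..5 onto 0..1
    return w_energy * energy_norm + (1 - w_energy) * satisfaction_norm

# A building using 120 kWh/m2 against a 90 kWh/m2 target, rated 3.8/5 by users.
print(poe_score(energy_kwh_m2=120, target_kwh_m2=90, satisfaction_1to5=3.8))
```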
Background: This paper presents a case study of 100Credit, an Internet credit service provider in China. 100Credit began as an IT company specializing in e-commerce recommendation before entering the credit rating business. The company makes use of Big Data on multiple aspects of individuals' online activities to infer their potential credit risk. Methods: Based on 100Credit's business practices, this paper summarizes four aspects of the value of Big Data in Internet credit services. Results: 1) value from large data volume that provides access to more borrowers; 2) value from prediction correctness in reducing lenders' operational cost; 3) value from the variety of services catering to different needs of lenders; and 4) value from information protection to sustain credit service businesses. Conclusion: The paper also discusses the opportunities and challenges of Big Data-based credit risk analysis, which needs to be improved in future research and practice.
Due to continuously decreasing feature sizes and increasing device density, on-chip caches have become increasingly susceptible to single event upsets, which result in multi-bit soft errors. The increasing rate of multi-bit errors poses a high risk of data corruption and even application crashes. Traditionally, L1 D-caches have been protected from soft errors using simple parity to detect errors, with recovery performed by reading correct data from the L2 cache, which induces a performance penalty. This work proposes to exploit redundancy based on the characteristics of data values. For a small data value, the replica is stored in the upper half of the word. The replica of a big data value is stored in a dedicated cache line, which sacrifices some capacity of the data cache. Experimental results show that the reliability of the L1 D-cache is improved by 65% at a cost of 1% in performance.
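The small-value replication scheme can be sketched in a few lines: when a value fits in a half-word, its replica occupies the upper half, and a read compares the two halves. The encoding below is an illustration of the idea, not the hardware design.

```python
MASK16 = 0xFFFF

def encode(value: int) -> int:
    """Store a 16-bit value with its replica in the upper half of a 32-bit word."""
    assert 0 <= value <= MASK16, "only small values are self-replicated"
    return (value << 16) | value

def decode(word: int) -> int:
    """Read back; a mismatch between halves signals a soft error in one half."""
    lo, hi = word & MASK16, (word >> 16) & MASK16
    if lo != hi:
        raise ValueError("soft error detected: halves disagree, recover from L2")
    return lo

w = encode(0x1234)
w ^= 1 << 5            # simulate a single-bit upset in the lower half
try:
    decode(w)
except ValueError as e:
    print(e)
```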
In this paper, a phase-randomized surrogate data method is proposed to identify the random or chaotic nature of data obtained in dynamic analysis. The calculated results validate the phase-randomized method as useful, since it increases the accuracy of the results, and they show that the threshold values of random time series and nonlinear chaotic time series differ markedly.
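A minimal sketch of the phase-randomization procedure, assuming the standard construction (preserve the amplitude spectrum, randomize the Fourier phases, invert): surrogates share the linear correlations of the original series but destroy any nonlinear (chaotic) structure.

```python
import numpy as np

def phase_randomized_surrogate(x: np.ndarray, seed=None) -> np.ndarray:
    rng = np.random.default_rng(seed)
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0, 2 * np.pi, len(spectrum))
    phases[0] = 0.0     # keep the mean (DC bin) real
    phases[-1] = 0.0    # keep the Nyquist bin real for even-length series
    surrogate = np.abs(spectrum) * np.exp(1j * phases)
    return np.fft.irfft(surrogate, n=len(x))

t = np.linspace(0, 20 * np.pi, 1024)
x = np.sin(t) + 0.3 * np.random.default_rng(0).standard_normal(t.size)
s = phase_randomized_surrogate(x, seed=1)
print(x.std(), s.std())   # power is (approximately) preserved
```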
This paper explores the data theory of value along the reasoning line of epochal characteristics of data, theoretical innovation, and paradigmatic transformation. Through a comparison of hard and soft factors and observation of the peculiar features of data, it concludes that data have the epochal characteristics of non-competitiveness and non-exclusivity, decreasing marginal cost and increasing marginal return, non-physical and intangible form, and non-finiteness and non-scarcity. It is these epochal characteristics of data that undermine the traditional theory of value and innovate the“production-exchange”theory, covering data value generation, data value realization, data value rights determination and data value pricing. From the perspective of data value generation, the levels of data quality, processing, use and connectivity, as well as data application scenarios and data openness, influence data value. From the perspective of data value realization, data, as independent factors of production, show a value creation effect, create a value multiplier effect by empowering other factors of production, and substitute for other factors of production to create a zero-price effect. From the perspective of data value rights determination, based on the theory of property, the tragedy of the private outweighs the comedy of the private with respect to data, while based on the theory of the sharing economy, the comedy of the commons outweighs the tragedy of the commons. From the perspective of data pricing, standardized data products can be priced according to physical product attributes, and non-standardized data products according to virtual product attributes. Based on the epochal characteristics of data and this theoretical innovation, the“production-exchange”paradigm has transformed from“using tangible factors to produce tangible products and exchanging tangible products for tangible products”to“using intangible factors to produce tangible products and exchanging intangible products for tangible products”and ultimately to“using intangible factors to produce intangible products and exchanging intangible products for intangible products”.
The Growth Value Model (GVM) proposed theoretical closed-form formulas, consisting of Return on Equity (ROE) and the Price-to-Book ratio (P/B), for fair stock prices and expected rates of return. Although regression analysis can be employed to verify these closed-form formulas, they cannot be explored intuitively by classical quintile or decile sorting approaches, owing to their multi-factor and dynamic nature. This article uses visualization techniques to help explore GVM intuitively. The distinctive finding and contribution of this paper is the concept of the smart frontier, which can be regarded as the reasonable lower limit of P/B at a specific ROE, obtained by exploring fair P/B with ROE-P/B 2D dynamic process visualization. The coefficients in the formula can be determined by quantile regression analysis on market data. The moving paths of ROE and P/B in the current and subsequent quarters show that portfolios at the lower right of the curve approach the curve and stagnate there after the portfolios are formed. Furthermore, exploring expected rates of return with ROE-P/B-Return 3D dynamic process visualization shows that data outside the lower-right edge of the smart frontier have positive quarterly returns not only in quarter t+1 but also in quarter t+2. The farther the data in quarter t are from the smart frontier, the larger the returns in quarters t+1 and t+2.
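One way to make the frontier estimation concrete is quantile regression on a low conditional quantile of P/B given ROE. The log-linear specification and the synthetic data below are illustrative assumptions, not GVM's closed-form formula.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
roe = rng.uniform(-0.1, 0.3, 800)
# One-sided scatter above an assumed frontier ln(P/B) = 0.2 + 3.0 * ROE.
ln_pb = 0.2 + 3.0 * roe + rng.exponential(0.4, 800)

X = sm.add_constant(roe)
# The 5th conditional percentile serves as the reasonable lower limit of P/B.
frontier = sm.QuantReg(ln_pb, X).fit(q=0.05)
a, b = frontier.params
print(f"frontier: P/B >= exp({a:.2f} + {b:.2f} * ROE)")
```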
Today, the quantity of data continues to increase; furthermore, data are heterogeneous, come from multiple sources (structured, semi-structured, and unstructured), and vary in quality. It is therefore common to manipulate data without knowledge of their structure or semantics; in fact, the metadata may be insufficient or entirely absent. Data anomalies may be due to the poverty, or even the absence, of semantic descriptions. In this paper, we propose an approach to better understand the semantics and structure of data. Our approach automatically corrects both intra-column and inter-column anomalies. We aim to improve data quality by processing null values and the semantic dependencies between columns.
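One ingredient of such an approach can be sketched directly: test whether an inter-column functional dependency A → B holds on the non-null rows and, if so, use it to repair null values. The column names and data are illustrative, not from the paper.

```python
import pandas as pd

df = pd.DataFrame({
    "zip":  ["75001", "75001", "69001", "69001"],
    "city": ["Paris", None,    "Lyon",  "Lyon"],
})

def holds_fd(df: pd.DataFrame, a: str, b: str) -> bool:
    """True if each value of column a maps to at most one non-null value of b."""
    known = df.dropna(subset=[b])
    return bool((known.groupby(a)[b].nunique() <= 1).all())

if holds_fd(df, "zip", "city"):
    # Fill nulls in the dependent column from the value implied by the determinant.
    mapping = df.dropna(subset=["city"]).drop_duplicates("zip").set_index("zip")["city"]
    df["city"] = df["city"].fillna(df["zip"].map(mapping))

print(df)
```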
This work investigates the relationship between intellectual capital and value creation in the vehicle and auto-parts production and assembly sector in Brazil. By accessing the database of the annual industrial survey conducted by the Brazilian Institute of Geography and Statistics, we gathered 865 observations, from 2000 to 2006, of public and private Brazilian companies with more than 100 employees. The database allows the estimation of relevant aggregated variables such as national accounts, gross domestic product, and intermediate consumption, and facilitates a sectoral study of business strategies and performance, including the value added by individual companies. In particular, in this study we use data on variables associated with intellectual capital. Following Pulic (2000, 2002), we consider intellectual capital to include human capital and structural capital. For the analysis of business performance, we used Pulic's VAIC (Value Added Intellectual Coefficient) index as a measure of the efficiency of the employed financial and intellectual capital. Regression models were run to verify the relationship between the efficiency in the use of intellectual capital and the profitability of Brazilian companies. Gross income, calculated before selling, general and administrative expenses, depreciation, amortization, and interest expenses, was used as the measure of the flow of value creation, and profitability was measured as gross income over the total assets of the companies. Considering the constructs defined by Pulic (2000, 2002), we tested, for the Brazilian vehicle and auto-parts production and assembly sector, the following hypotheses: (1) there is a positive relationship between value creation and intellectual capital; (2) between value creation and the stock of intellectual capital; (3) between value creation and the efficiency of the employed capital; (4) between value creation and the efficiency of human capital; and (5) between value creation and the efficiency of structural capital. The results of the study, obtained through panel data analysis using both static and dynamic models, support the hypotheses that the intellectual capital of the companies, in both its flow and stock dimensions, is positively and significantly related to value creation.
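For readers unfamiliar with the measure, Pulic's VAIC decomposes into human capital, structural capital, and capital employed efficiencies. A worked computation with illustrative numbers:

```python
def vaic(va: float, hc: float, ce: float) -> float:
    """Pulic's Value Added Intellectual Coefficient.

    va: value added; hc: human capital (employee costs);
    ce: capital employed (book value of net assets).
    Structural capital is sc = va - hc.
    """
    hce = va / hc          # human capital efficiency
    sce = (va - hc) / va   # structural capital efficiency
    cee = va / ce          # capital employed efficiency
    return hce + sce + cee

print(vaic(va=500.0, hc=200.0, ce=1000.0))   # 2.5 + 0.6 + 0.5 = 3.6
```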
In this paper we construct optimal, in a certain sense, estimates of the values of linear functionals on solutions to two-point boundary value problems (BVPs) for systems of linear first-order ordinary differential equations, from observations that are linear transformations of the same solutions perturbed by additive random noise. It is assumed that the right-hand sides of the equations and the boundary data, as well as the statistical characteristics of the random noise in the observations, are unknown and belong to certain given sets in the corresponding functional spaces. This leads to a minimax statement of the estimation problem, in which optimal estimates are defined as linear (with respect to the observations) estimates for which the maximum of the mean square estimation error, taken over the above-mentioned sets, attains its minimal value. Such estimates are called minimax mean square, or guaranteed, estimates. We establish that the minimax mean square estimates are expressed via the solutions of certain systems of differential equations of a special type, and we determine the estimation errors.
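The criterion can be stated compactly. The notation below (observation y, weight function u, constant c, functional l of the solution x, and admissible sets G1 for right-hand sides and boundary data and G2 for noise characteristics) is assumed for illustration and may differ from the paper's:

```latex
\widehat{l}(y)=\int_{0}^{T} u^{\top}(t)\,y(t)\,dt+c,
\qquad
\sigma(u,c)=\sup_{f\in G_{1},\ \eta\in G_{2}}
\mathbb{E}\,\bigl|\,l(x)-\widehat{l}(y)\,\bigr|^{2},
```

and the minimax (guaranteed) mean square estimate is the linear estimate attaining $\min_{u,c}\sigma(u,c)$.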
It is important to effectively identify the data value of open-source scientific and technological information and to help intelligence analysts select high-value data from a large volume of such information. This paper proposes data value evaluation methods for scientific and technological information in the open-source environment. According to their characteristics, the evaluation methods are divided into three categories: methods based on informetrics, methods based on an economic perspective, and methods based on text analysis. For each category, the main ideas, application scenarios, advantages, and disadvantages are indicated.
As more and more application systems related to big data are developed, NoSQL (Not Only SQL) database systems are becoming increasingly popular. To add transaction features to some NoSQL database systems, many scholars have tried different techniques. Unfortunately, research on Redis transactions is lacking in the existing literature. This paper proposes a transaction model for key-value NoSQL databases, including Redis, that allows users to access data in the ACID (Atomicity, Consistency, Isolation and Durability) way; this model is vividly called the surfing concurrence transaction model. Its architecture, important features, and implementation principle are described in detail. The key algorithms are given as pseudocode, and the performance is evaluated. With the proposed model, transactions on key-value NoSQL databases can be performed in a lock-free and MVCC-free (Multi-Version Concurrency Control) manner. This result of further research on the topic fills a gap overlooked by scholars in the field and makes a modest contribution to the further development of NoSQL technology.
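While the surfing concurrence model's algorithms are given in the paper as pseudocode, the lock-free style it targets can be illustrated with the optimistic WATCH/MULTI/EXEC pattern that Redis itself exposes through redis-py. This shows the built-in mechanism, not the proposed model.

```python
import redis

r = redis.Redis()  # assumes a local Redis server

def transfer(src: str, dst: str, amount: int) -> None:
    with r.pipeline() as pipe:
        while True:
            try:
                pipe.watch(src, dst)      # abort EXEC if either key changes
                bal = int(pipe.get(src) or 0)
                if bal < amount:
                    pipe.unwatch()
                    raise ValueError("insufficient funds")
                pipe.multi()              # start queuing commands atomically
                pipe.decrby(src, amount)
                pipe.incrby(dst, amount)
                pipe.execute()            # atomic commit, or WatchError
                return
            except redis.WatchError:
                continue                  # another client raced us; retry
```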
NoSQL systems are widely used in big data management for their high performance and scalability, and the key-value (KV) model is the most widely used storage model in NoSQL systems. For KV local storage systems that use mechanical disks as persistent storage, many performance optimization techniques exist, but these techniques struggle to exploit current hardware trends such as multi-core processors, large memory, low-latency flash, and non-volatile memory (NVM). For example, data indexing, concurrency control, and transaction log management suffer from multi-core scalability problems, and data placement strategies that do not fit the characteristics of flash SSDs (Solid State Drives) lead to inefficient I/O utilization. Targeting multi-core processors, large memory, flash, NVM, and other new hardware trends, and against the background of current big data applications, this paper surveys the latest optimization techniques and research results for the core modules of KV local storage systems: indexing, concurrency control, transaction log management, and data placement. It summarizes the state of the art from the perspectives of the processor, memory, and persistent storage, identifies unresolved technical challenges, and discusses prospects for research on CPU cache efficiency, transaction log scalability, and high availability in KV local storage systems.
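The write path these systems optimize can be reduced to a few lines: append to a write-ahead log before touching the in-memory index, and replay the log on restart. The sketch below is a toy illustration; real engines add the batching, multicore partitioning, and flash/NVM-aware layouts the survey discusses.

```python
import json
import os

class TinyKV:
    def __init__(self, path: str = "kv.log"):
        self.path, self.index = path, {}
        if os.path.exists(path):                 # recovery: replay the log
            with open(path) as f:
                for line in f:
                    rec = json.loads(line)
                    self.index[rec["k"]] = rec["v"]

    def put(self, k, v):
        with open(self.path, "a") as f:
            f.write(json.dumps({"k": k, "v": v}) + "\n")
            f.flush()
            os.fsync(f.fileno())                 # durability before acknowledging
        self.index[k] = v                        # then update the in-memory index

    def get(self, k):
        return self.index.get(k)

db = TinyKV()
db.put("a", 1)
print(db.get("a"))
```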