Journal Articles
14,947 articles found
1. An Application of a Multi-Tier Data Warehouse in Oil and Gas Drilling Information Management (Cited by: 2)
Authors: 张宁生, 王志伟. Petroleum Science (SCIE, CAS, CSCD), 2004, No. 4, pp. 1-5.
Expenditure on wells constitutes a significant part of the operational costs for a petroleum enterprise, and most of that cost results from drilling. This has prompted drilling departments to continuously look for ways to reduce their drilling costs and be as efficient as possible. A system called the Drilling Comprehensive Information Management and Application System (DCIMAS) is developed and presented here, with the aim of collecting, storing and making full use of the valuable well data and information relating to all drilling activities and operations. The DCIMAS comprises three main parts: a data collection and transmission system, a data warehouse (DW) management system, and an integrated platform of core applications. With the support of the application platform, the DW management system is introduced, whereby the operation data are captured at well sites and transmitted electronically to a data warehouse via transmission equipment and ETL (extract, transform and load) tools. With the high quality of the data guaranteed, our central task is to make the best use of the operation data and information for drilling analysis and to provide further information to guide later production stages. Applications have been developed and integrated on a uniform platform to interface directly with different layers of the multi-tier DW. With the system, engineers in every department now spend less time on data handling and more time applying technology to their real work.
Keywords: drilling information management; multi-tier data warehouse; information processing; application system
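A minimal sketch of the extract-transform-load step this record describes, assuming a hypothetical CSV of well-site readings and a SQLite table standing in for the warehouse; the file layout, column names and unit conversion are illustrative, not the DCIMAS schema:

```python
import csv
import sqlite3

def etl_well_data(csv_path: str, conn: sqlite3.Connection) -> int:
    """Extract well-site readings from a CSV, transform units, load into a warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS drilling_ops "
        "(well_id TEXT, ts TEXT, depth_m REAL, rop_m_per_h REAL)"
    )
    rows = []
    with open(csv_path, newline="") as f:
        for rec in csv.DictReader(f):
            # Transform: convert feet to metres; skip records failing a basic sanity check.
            depth_m = float(rec["depth_ft"]) * 0.3048
            if depth_m < 0:
                continue
            rows.append((rec["well_id"], rec["timestamp"], depth_m, float(rec["rop_m_per_h"])))
    conn.executemany("INSERT INTO drilling_ops VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```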
2. Development of cardiovascular clinical research data warehouse and real-world research
Authors: Dan-Dan LI, Ya-Ni YU, Zhi-Jun SUN, Chang-Fu LIU, Tao CHEN, Dong-Kai SHAN, Xiao-Dan TUO, Jun GUO, Yun-Dai CHEN. Journal of Geriatric Cardiology, 2025, No. 7, pp. 678-689.
Background: Medical informatics has accumulated vast amounts of data for clinical diagnosis and treatment. However, limited access to follow-up data and the difficulty of integrating data across diverse platforms continue to pose significant barriers to clinical research progress. In response, our research team has developed a specialized clinical research database for cardiology, thereby establishing a comprehensive digital platform that facilitates both clinical decision-making and research endeavors. Methods: The database incorporates actual clinical data from patients who received treatment at the Cardiovascular Medicine Department of Chinese PLA General Hospital from 2012 to 2021. It includes comprehensive data on patients' basic information, medical history, non-invasive imaging studies and laboratory test results, as well as peri-procedural information related to interventional surgeries, extracted from the Hospital Information System. Additionally, an innovative artificial intelligence (AI)-powered interactive follow-up system has been developed, ensuring that nearly all myocardial infarction patients receive at least one post-discharge follow-up, thereby achieving comprehensive data management throughout the entire care continuum for high-risk patients. Results: The database integrates extensive cross-sectional and longitudinal patient data, with a focus on higher-risk acute coronary syndrome patients. It integrates structured and unstructured clinical data, while innovatively incorporating AI and automatic speech recognition technologies to enhance data integration and workflow efficiency. It creates a comprehensive patient view, thereby improving diagnostic and follow-up quality, and provides high-quality data to support clinical research. Despite limitations in unstructured data standardization and biological sample integrity, the database's development is accompanied by ongoing optimization efforts. Conclusion: The cardiovascular specialty clinical database is a comprehensive digital archive integrating clinical treatment and research, which facilitates the digital and intelligent transformation of clinical diagnosis and treatment processes. It supports clinical decision-making and offers data support and potential research directions for the specialized management of cardiovascular diseases.
Keywords: clinical decision making; medical informatics; data warehouse; patient data; cardiovascular clinical research; comprehensive digital platform; real-world research; integrating data
3. On the Data Quality and Imbalance in Machine Learning-based Design and Manufacturing: A Systematic Review
Authors: Jiarui Xie, Lijun Sun, Yaoyao Fiona Zhao. Engineering, 2025, No. 2, pp. 105-131.
Machine learning (ML) has recently enabled many modeling tasks in design, manufacturing, and condition monitoring due to its unparalleled learning ability using existing data. Data have become the limiting factor when implementing ML in industry. However, there has been no systematic investigation of how data quality can be assessed and improved for ML-based design and manufacturing. The aim of this survey is to uncover the data challenges in this domain and review the techniques used to resolve them. To establish the background for the subsequent analysis, crucial data terminologies in ML-based modeling are reviewed and categorized into data acquisition, management, analysis, and utilization. Thereafter, the concepts and frameworks established to evaluate data quality and imbalance, including data quality assessment, data readiness, information quality, data biases, fairness, and diversity, are further investigated. The root causes and types of data challenges, including human factors, complex systems, complicated relationships, lack of data quality, data heterogeneity, data imbalance, and data scarcity, are identified and summarized. Methods to improve data quality and mitigate data imbalance, and their applications in this domain, are reviewed. This literature review focuses on two promising methods: data augmentation and active learning. The strengths, limitations, and applicability of the surveyed techniques are illustrated, and the trends of data augmentation and active learning are discussed with respect to their applications, data types, and approaches. Based on this discussion, future directions for data quality improvement and data imbalance mitigation in this domain are identified.
Keywords: machine learning; design and manufacturing; data quality; data augmentation; active learning
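A minimal sketch of the uncertainty-sampling flavor of active learning this review covers, using scikit-learn on synthetic data; the dataset, seed set and labeling budget are illustrative:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
labeled = list(range(10))                      # small seed set of labeled indices
pool = [i for i in range(len(X)) if i not in labeled]

model = LogisticRegression(max_iter=1000)
for _ in range(20):                            # budget: query 20 extra labels
    model.fit(X[labeled], y[labeled])
    proba = model.predict_proba(X[pool])
    # Uncertainty sampling: query the pool point whose top-class probability is lowest.
    pick = pool[int(np.argmin(proba.max(axis=1)))]
    labeled.append(pick)
    pool.remove(pick)

model.fit(X[labeled], y[labeled])
print("accuracy on full set:", model.score(X, y))
```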
4. Sign language data quality improvement based on dual information streams
Authors: CAI Jialiang, YUAN Tiantian. Optoelectronics Letters, 2025, No. 6, pp. 342-347.
Sign language datasets are essential in sign language recognition and translation (SLRT). Current public sign language datasets are small and lack diversity, which does not meet the practical application requirements for SLRT. However, building a large-scale and diverse sign language dataset is difficult, as sign language data on the Internet are scarce, and in building such a dataset the quality of some sign language data is not up to standard. This paper proposes a two information streams transformer (TIST) model to judge whether the quality of sign language data is qualified. To verify that TIST effectively improves sign language recognition (SLR), we build two datasets, a screened dataset and an unscreened dataset. In the experiments, visual alignment constraint (VAC) is used as the baseline model. The experimental results show that the screened dataset achieves a better word error rate (WER) than the unscreened dataset.
Keywords: sign language dataset; data quality improvement; two information streams transformer; dual information streams; sign language data; sign language translation; sign language recognition
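Word error rate (WER), the metric reported above, is the word-level edit distance between hypothesis and reference divided by the reference word count. A small self-contained sketch; the sentences are illustrative:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    r, h = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between first i reference words and first j hypothesis words.
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    return d[len(r)][len(h)] / max(len(r), 1)

print(wer("the cat sat on the mat", "the cat sit on mat"))  # 2 edits / 6 words = 0.33
```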
5. Assessing the data quality and seismic monitoring capabilities of the Belt and Road GNSS network
Authors: Yu Li, Yinxing Shao, Tan Wang, Yuebing Wang, Hongbo Shi. Earthquake Science, 2025, No. 1, pp. 56-66.
The Belt and Road global navigation satellite system (B&R GNSS) network is the first large-scale deployment of Chinese GNSS equipment in a seismic system. Prior to this, there had been few systematic assessments of the data quality of Chinese GNSS equipment. In this study, data from four representative GNSS sites in different regions of China were analyzed using the G-Nut/Anubis software package. Four main indicators (data integrity rate, data validity ratio, multi-path error, and cycle slip ratio) were used to systematically analyze data quality, while the seismic monitoring capability of the network was evaluated by estimating earthquake magnitudes from high-frequency GNSS data. The results indicate that the quality of the data produced by the three types of Chinese receivers used in the network meets the needs of earthquake monitoring and the new seismic industry standards, which provides a reference for equipment selection in future projects. After the B&R GNSS network was established, the seismic monitoring capability for earthquakes with magnitudes greater than MW 6.5 in most parts of the Sichuan-Yunnan region improved by approximately 20%. In key areas such as the Sichuan-Yunnan Rhomboid Block, the monitoring capability increased by more than 25%, which has greatly improved the effectiveness of regional comprehensive earthquake management.
Keywords: Belt and Road; multi-system GNSS; data quality; seismic monitoring capability
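Of the four indicators, the data integrity rate is the simplest: it is commonly computed as recorded observation epochs over expected epochs for a session. A toy sketch under that assumption; the numbers are made up, and this is not the G-Nut/Anubis implementation:

```python
def integrity_rate(observed_epochs: int, interval_s: int, session_s: int) -> float:
    """Share of expected observation epochs actually recorded in a session."""
    expected = session_s // interval_s
    return observed_epochs / expected

# A 24 h session sampled every 30 s should yield 2880 epochs; 2736 recorded -> 95%.
print(f"{integrity_rate(2736, 30, 86400):.1%}")
```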
6. Development of Geological Data Warehouse (Cited by: 2)
Authors: Li Zhenhua, Hu Guangdao, Zhang Zhenfei. Journal of China University of Geosciences (SCIE, CSCD), 2003, No. 3, pp. 261-264.
Data warehouse (DW), a technology invented in the 1990s, is more useful for integrating and analyzing massive data than the traditional database. Its application in the geology field can be divided into three phases: 1992-1996, when the commercial data warehouse (CDW) appeared; 1996-1999, when the geological data warehouse (GDW) appeared and geologists and geographers realized the importance of the DW and began studies on it, though practical DWs still followed the framework of the DB; and 2000 to the present, when the GDW has grown and the theory of the geo-spatial data warehouse (GSDW) has been developed, but research in the geological area remains deficient except in geography. Although some developments of the GDW have been made, its core still follows the CDW, organizing data by time, which causes three problems: it is difficult to integrate the geological data, for the data feature space more than time; it is hard to store the massive data in different levels for the same reason; and spatial analysis is hardly supported if the data are organized by time as in the CDW. So the GDW should be redesigned: data should be organized by scale, in order to store massive data in different levels and synthesize the data in different granularities, and space control points should replace the former time control points so as to integrate different types of data by storing each type as one layer and then superposing the layers. In addition, the data cube, a widely used technology in the CDW, will be of little use in the GDW, for the causality among geological data is not as obvious as in commercial data, as the data are the mixed result of many complex rules, and their analysis always needs special geological methods and software; on the other hand, a data cube for massive and complex geo-data would devour too much storage space to be practical. On this point, the main purpose of the GDW may be data integration, unlike the CDW's focus on data analysis.
Keywords: data warehouse (DW); geological data warehouse (GDW); space control points; data cube
7. Modeling data quality for risk assessment of GIS (Cited by: 1)
Authors: Su Ying, Jin Zhanming, Peng Jie. Journal of Southeast University (English Edition) (EI, CAS), 2008, No. S1, pp. 37-42.
This paper presents a methodology to determine three data quality (DQ) risk characteristics: accuracy, comprehensiveness and nonmembership. The methodology provides a set of quantitative models to confirm the information quality risks for the database of a geographical information system (GIS). Four quantitative measures are introduced to examine how the quality risks of source information affect the quality of information outputs produced using the relational algebra operations Selection, Projection, and Cubic Product; they can be used to determine how quality risks associated with diverse data sources affect the derived data. In the construction business, the GIS is the prime source of information on the location of cables, and detection time strongly depends on whether maps indicate their presence. Poor data quality in the GIS can contribute to increased risk or higher risk-avoidance costs. A case study provides a numerical example of the calculation of the trade-offs between risk and detection costs, and of the costs of data quality. We conclude that the model contributes valuable new insight.
Keywords: risk assessment; data quality; geographical information system; probability; spatial data quality
8. On Statistical Measures for Data Quality Evaluation (Cited by: 1)
Author: Xiaoxia Han. Journal of Geographic Information System, 2020, No. 3, pp. 178-187.
Most GIS databases contain data errors. The quality of data sources, such as traditional paper maps or more recent remote sensing data, determines spatial data quality. In the past several decades, different statistical measures have been developed to evaluate data quality for different types of data, such as nominal categorical data, ordinal categorical data and numerical data. Although these methods were originally proposed for medical or psychological research, they have been widely used to evaluate spatial data quality. In this paper, we first review statistical methods for evaluating data quality and discuss under what conditions we should use them and how to interpret the results, followed by a brief discussion of statistical software and packages that can be used to compute these data quality measures.
Keywords: GIS; data quality; sensitivity; specificity; Kappa; weighted Kappa; Bland-Altman analysis; intra-class correlation coefficient
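Cohen's kappa, one of the measures reviewed here, corrects observed agreement between two classifications for agreement expected by chance; scikit-learn implements both the plain and the weighted variant. The label data below are illustrative:

```python
from sklearn.metrics import cohen_kappa_score

# Two classifications of the same 10 map units (e.g., reference vs. interpreted land cover).
rater_a = ["forest", "water", "urban", "forest", "water",
           "urban", "forest", "water", "forest", "urban"]
rater_b = ["forest", "water", "urban", "forest", "urban",
           "urban", "forest", "water", "water", "urban"]
print("kappa:", cohen_kappa_score(rater_a, rater_b))

# For ordinal categories, weighted kappa penalizes large disagreements more heavily.
grades_a = [1, 2, 3, 2, 1, 3]
grades_b = [1, 3, 3, 2, 1, 2]
print("weighted kappa:", cohen_kappa_score(grades_a, grades_b, weights="quadratic"))
```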
9. Uniform Representation Model for Metadata of Data Warehouse
Authors: 王建芬, 曹元大. Journal of Beijing Institute of Technology (EI, CAS), 2002, No. 1, pp. 85-88.
A uniform metadata representation is introduced for heterogeneous databases, multimedia information and other information sources. Some features of metadata are analyzed, and the limitations of existing metadata models are compared with the new one. The metadata model is described in XML, which is well suited to metadata denotation and exchange. Well-structured data, semi-structured data and unstructured external file data are all described in the metadata model. The model provides feasibility and extensibility for constructing a uniform metadata model of the data warehouse.
Keywords: data warehouse; metadata; data model; XML
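A minimal sketch of the kind of XML metadata record the abstract argues for, covering one structured and one semi-structured source; the element and attribute names are invented for illustration and are not the paper's model:

```python
import xml.etree.ElementTree as ET

catalog = ET.Element("metadata_catalog")
for name, kind, location in [
    ("sales_db", "structured", "jdbc://host/sales"),
    ("reports", "semi-structured", "file:///data/reports.xml"),
]:
    # One <source> element per heterogeneous information source.
    src = ET.SubElement(catalog, "source", {"name": name, "type": kind})
    ET.SubElement(src, "location").text = location

print(ET.tostring(catalog, encoding="unicode"))
```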
10. A trajectory data warehouse solution for workforce management decision-making (Cited by: 1)
Authors: Georgia Garani, Dimitrios Tolis, Ilias K. Savvas. Data Science and Management, 2023, No. 2, pp. 88-97.
In modern workforce management, the demand for new ways to maximize worker satisfaction, productivity, and security levels is endless. Workforce movement data, such as the source data from an access control system, can support this ongoing process with subsequent analysis. In this study, a solution to attaining this goal is proposed, based on the design and implementation of a data mart as part of a dimensional trajectory data warehouse (TDW) that acts as a repository for the management of movement data. A novel methodological approach is proposed for modeling multiple spatial and temporal dimensions in a logical model. The case study presented in this paper for modeling and analyzing workforce movement data supports human resource management decision-making, and the following discussion provides a representative example of the contribution of a TDW to information management and decision support systems. The entire process of exporting, cleaning, consolidating, and transforming data is implemented to achieve an appropriate format for final import. Structured query language (SQL) queries demonstrate the convenience of dimensional design for data analysis, and valuable information can be extracted from the movements of employees on company premises to manage the workforce efficiently and effectively. Visual analytics through data visualization supports the analysis and facilitates decision-making and business intelligence.
Keywords: business intelligence; decision-making; workforce management; trajectory data warehouse (TDW); moving object; semantic modeling
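A sketch of the kind of SQL analysis the abstract mentions, run against an in-memory SQLite stand-in for the TDW; the star-schema tables and columns are illustrative, not the paper's actual trajectory model:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_employee (emp_id INTEGER PRIMARY KEY, department TEXT);
CREATE TABLE fact_movement (emp_id INTEGER, zone TEXT, entered_at TEXT);
INSERT INTO dim_employee VALUES (1, 'IT'), (2, 'HR'), (3, 'IT');
INSERT INTO fact_movement VALUES
  (1, 'server_room', '2023-05-02 09:01'),
  (2, 'lobby',       '2023-05-02 09:05'),
  (3, 'server_room', '2023-05-02 18:40');
""")
# Roll movements up by department and zone: the kind of aggregate a dimensional design makes cheap.
for row in conn.execute("""
    SELECT d.department, f.zone, COUNT(*) AS visits
    FROM fact_movement f JOIN dim_employee d USING (emp_id)
    GROUP BY d.department, f.zone
    ORDER BY visits DESC"""):
    print(row)
```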
11. Refreshing File Aggregate of Distributed Data Warehouse in Sets of Electric Apparatus
Authors: 于宝琴, 王太勇, 张君, 周明, 何改云, 李国琴. Transactions of Tianjin University (EI, CAS), 2006, No. 3, pp. 174-179.
Integrating heterogeneous data sources is a precondition for sharing data among enterprises. Highly efficient data updating can both save system expenses and offer real-time data, and rapidly modifying data in the pre-processing area of the data warehouse is one of the hot issues. An extract-transform-load design is proposed based on a new data algorithm called Diff-Match, which is developed by utilizing mode matching and data-filtering technology. It can accelerate data renewal, filter the heterogeneous data, and seek out different sets of data. Its efficiency has been proved by its successful application in an enterprise of electric apparatus groups.
Keywords: distributed data warehouse; Diff-Match algorithm; KMP algorithm; file aggregates; extract transform loading
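The abstract does not spell out Diff-Match itself, so as a generic stand-in, here is a hash-based diff that separates inserted from changed records between a source extract and the warehouse copy, the kind of data filtering an incremental ETL refresh relies on; record layout and key field are illustrative:

```python
import hashlib

def row_hash(row: dict) -> str:
    """Stable fingerprint of a record's non-key payload."""
    payload = "|".join(f"{k}={row[k]}" for k in sorted(row) if k != "id")
    return hashlib.sha256(payload.encode()).hexdigest()

def diff_rows(source: list[dict], warehouse: list[dict]) -> tuple[list, list]:
    """Return (new records, changed records) to apply to the warehouse."""
    old = {r["id"]: row_hash(r) for r in warehouse}
    inserts = [r for r in source if r["id"] not in old]
    updates = [r for r in source if r["id"] in old and row_hash(r) != old[r["id"]]]
    return inserts, updates

src = [{"id": 1, "price": 9}, {"id": 2, "price": 5}, {"id": 3, "price": 7}]
dw  = [{"id": 1, "price": 9}, {"id": 2, "price": 4}]
print(diff_rows(src, dw))  # id 3 is new, id 2 changed
```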
12. Imagery Data Quality of ZY Satellite Reached International Level
Aerospace China, 2012, No. 2, p. 23.
The in-orbit commissioning of the ZY-1 02C satellite is proceeding smoothly. According to the relevant experts in this field, the imagery quality of the satellite has reached or nearly reached the level of international satellites of the same kind. The ZY-1 02C satellite and ZY-3 satellite were successfully launched on December 22, 2011 and January 9, 2012 respectively. China Centre for Resources Satellite Data and Application (CRSDA) was responsible for the building of a ground …
Keywords: imagery data quality; ZY satellite
13. Digital Continuity Guarantee Approach of Electronic Record Based on Data Quality Theory (Cited by: 7)
Authors: Yongjun Ren, Jian Qi, Yaping Cheng, Jin Wang, Osama Alfarraj. Computers, Materials & Continua (SCIE, EI), 2020, No. 6, pp. 1471-1483.
Since the British National Archives put forward the concept of digital continuity in 2007, several developed countries have worked out digital continuity action plans. However, technologies for guaranteeing digital continuity are still lacking. This paper first analyzes the requirements of a digital continuity guarantee for electronic records based on data quality theory, then points out the necessity of a data quality guarantee for electronic records. Moreover, we convert the digital continuity guarantee of electronic records into ensuring the consistency, completeness and timeliness of electronic records, and construct the first technology framework of the digital continuity guarantee for electronic records. Finally, temporal functional dependency technology is utilized to build the first integration method to ensure the consistency, completeness and timeliness of electronic records.
Keywords: electronic record; digital continuity; data quality
14. Prediction of blast furnace gas generation based on data quality improvement strategy (Cited by: 5)
Authors: Shu-han Liu, Wen-qiang Sun, Wei-dong Li, Bing-zhen Jin. Journal of Iron and Steel Research International (SCIE, EI, CAS, CSCD), 2023, No. 5, pp. 864-874.
The real-time energy flow data obtained in industrial production processes are usually of low quality. It is difficult to accurately predict the short-term energy flow profile using these field data, which diminishes the effect of industrial big data and artificial intelligence in industrial energy systems. The real-time data on blast furnace gas (BFG) generation collected at iron and steel sites are also of low quality. To tackle this problem, a three-stage data quality improvement strategy was proposed to predict BFG generation. In the first stage, the correlation principle was used to test the sample set. In the second stage, the original sample set was rectified and updated. In the third stage, a Kalman filter was employed to eliminate the noise of the updated sample set. The method was verified with an autoregressive integrated moving average model, a back propagation neural network model and a long short-term memory model. The results show that prediction models based on the proposed three-stage data quality improvement method perform well. The long short-term memory model has the best prediction performance, with a mean absolute error of 17.85 m3/min, a mean absolute percentage error of 0.21%, and an R squared of 95.17%.
Keywords: blast furnace gas; iron and steel industry; data quality improvement; artificial intelligence; gas generation prediction
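A minimal scalar Kalman filter of the sort the third stage describes, with a random-walk state model; the noise variances and the fake flow signal are illustrative, not the paper's tuning:

```python
import numpy as np

def kalman_smooth(z: np.ndarray, q: float = 1e-3, r: float = 1.0) -> np.ndarray:
    """Scalar Kalman filter: state model x_k = x_{k-1} + w, measurement z_k = x_k + v."""
    x, p = z[0], 1.0                 # initial state estimate and variance
    out = np.empty_like(z, dtype=float)
    for k, zk in enumerate(z):
        p = p + q                    # predict: variance grows by process noise q
        g = p / (p + r)              # Kalman gain, given measurement noise r
        x = x + g * (zk - x)         # update the estimate with the innovation
        p = (1 - g) * p
        out[k] = x
    return out

noisy = 100 + np.random.default_rng(0).normal(0, 5, 200)  # fake BFG flow readings
print(kalman_smooth(noisy)[-5:])
```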
15. Improvement of Wired Drill Pipe Data Quality via Data Validation and Reconciliation (Cited by: 2)
Authors: Dan Sui, Olha Sukhoboka, Bernt Sigve Aadnøy. International Journal of Automation and Computing (EI, CSCD), 2018, No. 5, pp. 625-636.
Wired drill pipe (WDP) technology is one of the most promising data acquisition technologies in today's oil and gas industry. For the first time, it allows sensors to be positioned along the drill string, which enables collecting and transmitting valuable data not only from the bottom hole assembly (BHA), but also along the entire length of the wellbore to the drill floor. The technology has received industry acceptance as a viable alternative to the typical logging while drilling (LWD) method. Recently, more and more WDP applications can be found in challenging drilling environments around the world, bringing many innovations to the industry. Nevertheless, most of the data acquired from WDP can be noisy and, in some circumstances, of very poor quality. Diverse factors contribute to the poor data quality; the most common sources include mis-calibrated sensors, sensor drift, errors during data transmission, and abnormal conditions in the well. The challenge of improving the data quality has attracted growing attention from researchers during the past decade. This paper proposes a promising solution to this challenge by correcting the raw WDP data and estimating unmeasurable parameters to reveal downhole behaviors. An advanced data processing method, data validation and reconciliation (DVR), has been employed, which makes use of the redundant data from multiple WDP sensors to filter/remove the noise from the measurements and ensures the coherence of all sensors and models. Moreover, it has the ability to distinguish accurate measurements from inaccurate ones. In addition, the data with improved quality can be used for estimating crucial parameters of the drilling process that are unmeasurable in the first place, hence providing better model calibrations for integrated well planning and real-time operations.
Keywords: data quality; wired drill pipe (WDP); data validation and reconciliation (DVR); drilling; models
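In its linear form, data validation and reconciliation adjusts redundant measurements as little as possible, weighted by sensor variance, while forcing them to satisfy linear balance constraints. A numpy sketch with a made-up three-sensor flow balance; this is not the paper's WDP model:

```python
import numpy as np

def reconcile(m: np.ndarray, sigma: np.ndarray, A: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Closed-form linear DVR: minimize (x-m)' W (x-m) s.t. A x = b, with W = diag(1/sigma^2).
    Solution: x = m - S A' (A S A')^-1 (A m - b), where S = diag(sigma^2)."""
    S = np.diag(sigma**2)
    lam = np.linalg.solve(A @ S @ A.T, A @ m - b)   # Lagrange multipliers
    return m - S @ A.T @ lam

# Three flow sensors that should balance: x0 - x1 - x2 = 0.
m = np.array([10.3, 6.1, 3.9])       # raw measurements
sigma = np.array([0.2, 0.1, 0.1])    # sensor standard deviations
A = np.array([[1.0, -1.0, -1.0]])
x_hat = reconcile(m, sigma, A, b=np.zeros(1))
print(x_hat, "balance residual:", A @ x_hat)  # reconciled values now balance exactly
```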
16. OpenStreetMap data quality enrichment through awareness raising and collective action tools: experiences from a European project (Cited by: 2)
Authors: Amin Mobasheri, Alexander Zipf, Louise Francis. Geo-Spatial Information Science (SCIE, CSCD), 2018, No. 3, pp. 234-246.
Nowadays, several research projects show interest in employing volunteered geographic information (VGI) to improve their systems by using up-to-date and detailed data. The European project CAP4Access is one of the successful examples of such international research projects; it aims to improve the accessibility of people with restricted mobility using crowdsourced data. In this project, OpenStreetMap (OSM) is used to extend OpenRouteService, a well-known routing platform. However, a basic challenge that the project tackled was the incompleteness of OSM data with regard to certain information required for wheelchair accessibility (e.g. sidewalk information, kerb data, etc.). In this article, we present the results of an initial assessment of sidewalk data in OSM at the beginning of the project, as well as our approach to awareness raising and using tools for tagging accessibility data into the OSM database to enrich sidewalk data completeness. Several experiments were carried out in different European cities, and a discussion of the results of the experiments as well as the lessons learned is provided. The lessons learned provide recommendations that help in organizing better mapping party events in the future. We conclude by reporting on how, and to what extent, OSM sidewalk data completeness in these study areas benefited from the mapping parties by the end of the project.
Keywords: accessibility; OpenStreetMap (OSM); data quality; data completeness; sidewalk; Wheelmap
17. Novel method for the evaluation of data quality based on fuzzy control (Cited by: 1)
Authors: Ban Xiaojuan, Ning Shurong, Xu Zhaolin, Cheng Peng. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2008, No. 3, pp. 606-610.
One of the goals of data collection is to prepare for decision-making, so high quality requirements must be satisfied. Rational evaluation of data quality is an effective way to identify data problems in time, so that the quality of data after this evaluation satisfies the requirements of decision makers. A fuzzy neural network-based method for data quality evaluation is proposed. First, the criteria for the evaluation of data quality are selected to construct the fuzzy sets of evaluation grades; then, using the learning ability of the neural network, an objective evaluation of membership is carried out, which can be used for the effective evaluation of data quality. This research has been applied in the platform of the "data report of national compulsory education outlay guarantee" of the Chinese Ministry of Education. The method can be used for the effective evaluation of data quality worldwide, allowing the data quality situation to be assessed more completely, objectively, and promptly.
Keywords: data quality; evaluation system; fuzzy control theory; neural network
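A toy sketch of the fuzzification step such a method builds on: triangular membership functions mapping a 0-100 quality score onto evaluation grades. The grade anchors are illustrative, and the paper's neural-network learning of memberships is not reproduced here:

```python
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Membership of x in a triangular fuzzy set peaking at b, with support (a, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

GRADES = {"poor": (0, 25, 50), "fair": (25, 50, 75), "good": (50, 75, 100)}

def fuzzify(score: float) -> dict[str, float]:
    """Degree of membership of a quality score in each evaluation grade."""
    return {g: round(triangular(score, *abc), 3) for g, abc in GRADES.items()}

print(fuzzify(62.0))  # partial membership in both 'fair' and 'good'
```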
18. A Tailor-made Data Quality Approach for Higher Educational Data (Cited by: 2)
Authors: Cinzia Daraio, Renato Bruni, Giuseppe Catalano, Alessandro Daraio, Giorgio Matteucci, Monica Scannapieco, Daniel Wagner-Schuster, Benedetto Lepori. Journal of Data and Information Science (CSCD), 2020, No. 3, pp. 129-160.
Purpose: This paper addresses the definition of data quality procedures for knowledge organizations such as higher education institutions. The main purpose is to present the flexible approach developed for monitoring the data quality of the European Tertiary Education Register (ETER) database, illustrating its functioning and highlighting the main challenges that still have to be faced in this domain. Design/methodology/approach: The proposed data quality methodology is based on two kinds of checks, one to assess the consistency of cross-sectional data and the other to evaluate the stability of multiannual data. This methodology has an operational and empirical orientation, meaning that the proposed checks do not assume any theoretical distribution for the determination of the threshold parameters that identify potential outliers, inconsistencies, and errors in the data. Findings: We show that the proposed cross-sectional and multiannual checks are helpful for identifying outliers and extreme observations and for detecting ontological inconsistencies not described in the available metadata. For this reason, they may be a useful complement to the processing of the available information. Research limitations: The coverage of the study is limited to European higher education institutions, and the cross-sectional and multiannual checks are not yet completely integrated. Practical implications: Considering the quality of the available data and information is important for enhancing data quality-aware empirical investigations, highlighting problems and areas where to invest for improving the coverage and interoperability of data in future data collection initiatives. Originality/value: The data-driven quality checks proposed in this paper may be useful as a reference for building and monitoring the data quality of new databases, or of existing databases for other countries or systems characterized by high heterogeneity and complexity of the units of analysis, without relying on pre-specified theoretical distributions.
Keywords: knowledge organization; development of data and information services; cross-sectional and multiannual quality checks; higher education institutions; information quality
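A minimal sketch of a multiannual stability check in the spirit described: flag a variable whose year-over-year ratio falls outside empirically chosen bounds, with no distributional assumption. The thresholds and figures are illustrative, not ETER's actual parameters:

```python
def stability_flags(series: dict[int, float], low: float = 0.5, high: float = 2.0) -> list[tuple]:
    """Flag year-over-year ratios outside [low, high] as potential errors or breaks."""
    years = sorted(series)
    flags = []
    for prev, curr in zip(years, years[1:]):
        if series[prev] > 0:
            ratio = series[curr] / series[prev]
            if not (low <= ratio <= high):
                flags.append((prev, curr, round(ratio, 2)))
    return flags

students = {2014: 12000, 2015: 12400, 2016: 31000, 2017: 12900}  # 2016 looks suspicious
print(stability_flags(students))  # [(2015, 2016, 2.5), (2016, 2017, 0.42)]
```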
19. A Short Review of the Literature on Automatic Data Quality (Cited by: 1)
Authors: Deepak R. Chandran, Vikram Gupta. Journal of Computer and Communications, 2022, No. 5, pp. 55-73.
Several organizations have migrated to the cloud for better quality in business engagements and security. Data quality is crucial in present-day activities: information is generated and collected from data representing real-time facts and activities. Poor data quality affects organizational decision-making policy and customer satisfaction, and negatively influences the organization's scheme of execution. Data quality also has a massive influence on the accuracy, complexity and efficiency of machine and deep learning tasks. There are several methods and tools to evaluate data quality and ensure smooth incorporation in model development. The bulk of data quality tools permit the assessment of data sources only at a certain point in time, so their arrangement and automation are consequently an obligation of the user. In ensuring automatic data quality, several steps are involved in gathering data from different sources and monitoring data quality, and any problems with the data quality must be adequately addressed. There was a gap in the literature, as no attempts had been made previously to collate all the advances in the different dimensions of automatic data quality. This limited narrative review of the existing literature sought to address this gap by correlating the different steps and advancements related to automatic data quality systems. The six crucial data quality dimensions in organizations are discussed, and big data are compared and classified. This review highlights existing data quality models and strategies that can contribute to the development of automatic data quality systems.
Keywords: data quality; monitoring; toolkit; dimension; organization
20. Preliminary Study in Spatial Data Warehouse of Flood Control and Disaster Mitigation in Yangtze River Basin
Author: ZHAN Xiaoguo (Senior Engineer, Yangtze River Scientific Research Institute, Changjiang Water Resources Commission, Wuhan 430010, China). 人民长江 (Yangtze River, PKU Core Journal), 2002, No. S1, pp. 90-92.
Since the 1990s, spatial data warehouse technology has been developing rapidly, but the complexity of multi-dimensional analysis has limited its extensive application. In the light of the characteristics of flood control and disaster mitigation in the Yangtze River basin, a scheme is proposed for the subjects and data distribution of the spatial data warehouse of flood control and disaster mitigation in the Yangtze River basin, i.e., a distributed scheme is adopted. The creation and development of the spatial data warehouse of flood control and disaster mitigation in the Yangtze River basin is presented. The necessity and urgency of establishing the spatial data warehouse is expounded from the viewpoint of the present shortage of available information for flood control and disaster mitigation in the Yangtze River basin.
Keywords: spatial data warehouse; distributional scheme; flood control and disaster mitigation; Yangtze River