116 articles found
On the Data Quality and Imbalance in Machine Learning-based Design and Manufacturing-A Systematic Review
1
Authors: Jiarui Xie, Lijun Sun, Yaoyao Fiona Zhao. Engineering, 2025, Issue 2, pp. 105-131 (27 pages)
Machine learning (ML) has recently enabled many modeling tasks in design, manufacturing, and condition monitoring due to its unparalleled learning ability using existing data. Data have become the limiting factor when implementing ML in industry. However, there is no systematic investigation on how data quality can be assessed and improved for ML-based design and manufacturing. The aim of this survey is to uncover the data challenges in this domain and review the techniques used to resolve them. To establish the background for the subsequent analysis, crucial data terminologies in ML-based modeling are reviewed and categorized into data acquisition, management, analysis, and utilization. Thereafter, the concepts and frameworks established to evaluate data quality and imbalance, including data quality assessment, data readiness, information quality, data biases, fairness, and diversity, are further investigated. The root causes and types of data challenges, including human factors, complex systems, complicated relationships, lack of data quality, data heterogeneity, data imbalance, and data scarcity, are identified and summarized. Methods to improve data quality and mitigate data imbalance and their applications in this domain are reviewed. This literature review focuses on two promising methods: data augmentation and active learning. The strengths, limitations, and applicability of the surveyed techniques are illustrated. The trends of data augmentation and active learning are discussed with respect to their applications, data types, and approaches. Based on this discussion, future directions for data quality improvement and data imbalance mitigation in this domain are identified.
Keywords: machine learning; design and manufacturing; data quality; data augmentation; active learning
Quality analysis of AIS data derived from Haiyang(HY)series satellites
2
Authors: Xi Ding, Songtao Ai, Jiajun Ling, Meng Cui, Jiachun An, Lei Huang. Acta Oceanologica Sinica, 2025, Issue 7, pp. 187-202 (16 pages)
With the globalization of the economy, maritime trade has surged, posing challenges in the supervision of marine vessel activities. An automatic identification system (AIS) is an effective means of shipping traffic service, but many uncertainties exist regarding its data quality. In this study, AIS data from the Haiyang (HY) series of satellites were used to assess data quality and to analyze the global ship trajectory distribution and update frequencies from 2019 to 2023. Through the analysis of maritime mobile service identity numbers, we identified 340,185 unique vessels, 80.1% of which adhered to the International Telecommunication Union standards. Approximately 49.7% of ships exhibit significant data gaps, and 1.1% show inconsistencies in their AIS data sources. In the central Pacific Ocean at low latitudes and along the coast of South America (30°-60°S), a heightened incidence of abnormal ship trajectories has been consistently observed, particularly in areas associated with fishing activities. According to the spatial distribution of ship trajectories, AIS data exhibit numerous deficiencies, particularly in high-traffic regions such as the East China Sea and South China Sea. In contrast, ship trajectories in the high-latitude polar regions are relatively comprehensive. With the increased number of HY satellites equipped with AIS receivers, the quantity of trajectory points displays a growing trend, leading to increasingly complete trajectories. This trend highlights the significant potential of using AIS data acquired from HY satellites to increase the accuracy of vessel tracking.
Keywords: AIS; ship trajectory; data quality; spatial distribution; Haiyang (HY) satellite
Assessing the data quality and seismic monitoring capabilities of the Belt and Road GNSS network
3
Authors: Yu Li, Yinxing Shao, Tan Wang, Yuebing Wang, Hongbo Shi. Earthquake Science, 2025, Issue 1, pp. 56-66 (11 pages)
The Belt and Road global navigation satellite system (B&R GNSS) network is the first large-scale deployment of Chinese GNSS equipment in a seismic system. Prior to this, there had been few systematic assessments of the data quality of Chinese GNSS equipment. In this study, data from four representative GNSS sites in different regions of China were analyzed using the G-Nut/Anubis software package. Four main indicators (data integrity rate, data validity ratio, multi-path error, and cycle slip ratio) were used to systematically analyze data quality, while the seismic monitoring capabilities of the network were evaluated based on earthquake magnitudes estimated from high-frequency GNSS data. The results indicate that the quality of the data produced by the three types of Chinese receivers used in the network meets the needs of earthquake monitoring and the new seismic industry standards, which provides a reference for the selection of equipment for future projects. After the B&R GNSS network was established, the seismic monitoring capability for earthquakes with magnitudes greater than M_W 6.5 in most parts of the Sichuan-Yunnan region improved by approximately 20%. In key areas such as the Sichuan-Yunnan Rhomboid Block, the monitoring capability increased by more than 25%, which has greatly improved the effectiveness of regional comprehensive earthquake management.
Keywords: Belt and Road; multi-system GNSS; data quality; seismic monitoring capability
Sign language data quality improvement based on dual information streams
4
Authors: CAI Jialiang, YUAN Tiantian. Optoelectronics Letters, 2025, Issue 6, pp. 342-347 (6 pages)
Sign language datasets are essential in sign language recognition and translation (SLRT). Current public sign language datasets are small and lack diversity, and so do not meet the practical application requirements for SLRT. However, making a large-scale and diverse sign language dataset is difficult because sign language data on the Internet is scarce, and some of the data collected for such a dataset are not of adequate quality. This paper proposes a two information streams transformer (TIST) model to judge whether the quality of sign language data is acceptable. To verify that TIST effectively improves sign language recognition (SLR), we make two datasets, the screened dataset and the unscreened dataset. In this experiment, the visual alignment constraint (VAC) model is used as the baseline. The experimental results show that the screened dataset achieves a better word error rate (WER) than the unscreened dataset.
Keywords: sign language dataset; data quality improvement; two information streams t; dual information streams; sign language data; sign language translation; sign language recognition; sign language datasets
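The abstract above compares datasets by word error rate (WER). As a minimal sketch of that metric, assuming the standard definition (word-level edit distance divided by the number of reference words), the computation might look as follows. The function name `word_error_rate` is illustrative and does not come from the paper.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words.

    Both arguments are lists of tokens (for sign language, glosses).
    Assumes the standard Levenshtein definition with unit costs.
    """
    if not reference:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(reference)
```

A lower value is better; a WER above 1.0 is possible when the hypothesis contains many insertions.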
Data trustworthiness and user reputation as indicators of VGI quality (cited by: 5)
5
Authors: Paolo Fogliaroni, Fausto D’Antonio, Eliseo Clementini. Geo-Spatial Information Science (SCIE, CSCD), 2018, Issue 3, pp. 213-233 (21 pages)
Volunteered geographic information (VGI) has entered a phase where there are both a substantial amount of crowdsourced information available and a big interest in using it by organizations. But the issue of deciding the quality of VGI without resorting to a comparison with authoritative data remains an open challenge. This article first formulates the problem of quality assessment of VGI data. It then presents a model to measure the trustworthiness of information and the reputation of contributors by analyzing geometric, qualitative, and semantic aspects of edits over time. An implementation of the model was run on a small dataset for a preliminary empirical validation. The results indicate that the computed trustworthiness provides a valid approximation of VGI quality.
Keywords: volunteered geographic information; data quality; trustworthiness; reputation
Digital Continuity Guarantee Approach of Electronic Record Based on Data Quality Theory (cited by: 7)
6
Authors: Yongjun Ren, Jian Qi, Yaping Cheng, Jin Wang, Osama Alfarraj. Computers, Materials & Continua (SCIE, EI), 2020, Issue 6, pp. 1471-1483 (13 pages)
Since the British National Archive put forward the concept of digital continuity in 2007, several developed countries have worked out their digital continuity action plans. However, technologies for guaranteeing digital continuity are still lacking. First, this paper analyzes the requirements of the digital continuity guarantee for electronic records based on data quality theory, and then points out the necessity of a data quality guarantee for electronic records. Moreover, we reduce the digital continuity guarantee of electronic records to ensuring their consistency, completeness and timeliness, and construct the first technology framework of the digital continuity guarantee for electronic records. Finally, temporal functional dependencies are used to build the first integration method to ensure the consistency, completeness and timeliness of electronic records.
Keywords: electronic record; digital continuity; data quality
Prediction of blast furnace gas generation based on data quality improvement strategy (cited by: 4)
7
Authors: Shu-han Liu, Wen-qiang Sun, Wei-dong Li, Bing-zhen Jin. Journal of Iron and Steel Research International (SCIE, EI, CAS, CSCD), 2023, Issue 5, pp. 864-874 (11 pages)
The real-time energy flow data obtained in industrial production processes are usually of low quality. It is difficult to accurately predict the short-term energy flow profile using these field data, which diminishes the effect of industrial big data and artificial intelligence in industrial energy systems. The real-time data of blast furnace gas (BFG) generation collected at iron and steel sites are also of low quality. To tackle this problem, a three-stage data quality improvement strategy was proposed to predict BFG generation. In the first stage, the correlation principle was used to test the sample set. In the second stage, the original sample set was rectified and updated. In the third stage, a Kalman filter was employed to eliminate the noise of the updated sample set. The method was verified with an autoregressive integrated moving average model, a back propagation neural network model and a long short-term memory model. The results show that the prediction model based on the proposed three-stage data quality improvement method performs well. The long short-term memory model has the best prediction performance, with a mean absolute error of 17.85 m³/min, a mean absolute percentage error of 0.21%, and an R squared of 95.17%.
Keywords: blast furnace gas; iron and steel industry; data quality improvement; artificial intelligence; gas generation prediction
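The third stage described above denoises a time series with a Kalman filter, and the results are reported as mean absolute error (MAE) and mean absolute percentage error (MAPE). The abstract does not give the filter's state model, so the sketch below assumes the simplest case: a scalar random-walk state observed with additive noise. The function names and the `process_var` and `meas_var` defaults are hypothetical placeholders, not values from the paper.

```python
def kalman_denoise(measurements, process_var=1e-3, meas_var=1.0):
    """Scalar Kalman filter, assuming a random-walk state model.

    process_var and meas_var are illustrative tuning values; a real
    application would estimate them from the sensor data.
    """
    estimate = measurements[0]   # initial state estimate
    error_var = 1.0              # initial estimate uncertainty (assumed)
    filtered = [estimate]
    for z in measurements[1:]:
        error_var += process_var                  # predict step
        gain = error_var / (error_var + meas_var)
        estimate += gain * (z - estimate)         # update with new sample
        error_var *= 1.0 - gain
        filtered.append(estimate)
    return filtered


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percentage_error(actual, predicted):
    # Undefined when an actual value is zero; callers must exclude those.
    ratios = (abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * sum(ratios) / len(actual)
```

A larger `meas_var` relative to `process_var` makes the filter trust new samples less and smooth more aggressively.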
Quality assessment of OpenStreetMap data using trajectory mining (cited by: 3)
8
Authors: Anahid Basiri, Mike Jackson, Pouria Amirian, Amir Pourabdollah, Monika Sester, Adam Winstanley, Terry Moore, Lijuan Zhang. Geo-Spatial Information Science (SCIE, EI, CSCD), 2016, Issue 1, pp. 56-68 (13 pages)
OpenStreetMap (OSM) data are widely used but their reliability is still variable. Many contributors to OSM have not been trained in geography or surveying and consequently their contributions, including geometry and attribute data inserts, deletions, and updates, can be inaccurate, incomplete, inconsistent, or vague. There are some mechanisms and applications dedicated to discovering bugs and errors in OSM data. Such systems can remove errors through user checks and by applying predefined rules, but they need an extra control process to check the real-world validity of suspected errors and bugs. This paper focuses on finding bugs and errors based on patterns and rules extracted from the tracking data of users. The underlying idea is that certain characteristics of user trajectories are directly linked to the type of feature. Using such rules, some sets of potential bugs and errors can be identified and stored for further investigation.
Keywords: spatial data quality; OpenStreetMap (OSM); trajectory data mining
OpenStreetMap data quality enrichment through awareness raising and collective action tools: experiences from a European project (cited by: 2)
9
Authors: Amin Mobasheri, Alexander Zipf, Louise Francis. Geo-Spatial Information Science (SCIE, CSCD), 2018, Issue 3, pp. 234-246 (13 pages)
Several research projects now show interest in employing volunteered geographic information (VGI) to improve their systems through the use of up-to-date and detailed data. The European project CAP4Access is one successful example of such an international research project; it aims to improve the accessibility of people with restricted mobility using crowdsourced data. In this project, OpenStreetMap (OSM) is used to extend OpenRouteService, a well-known routing platform. However, a basic challenge that this project tackled was the incompleteness of OSM data with regard to certain information that is required for wheelchair accessibility (e.g. sidewalk information, kerb data, etc.). In this article, we present the results of an initial assessment of sidewalk data in OSM at the beginning of the project, as well as our approach to awareness raising and to using tools for tagging accessibility data into the OSM database to improve sidewalk data completeness. Several experiments have been carried out in different European cities, and the results of the experiments and the lessons learned are discussed. The lessons learned provide recommendations that help in organizing better mapping party events in the future. We conclude by reporting on how and to what extent OSM sidewalk data completeness in these study areas had benefited from the mapping parties by the end of the project.
Keywords: accessibility; OpenStreetMap (OSM); data quality; data completeness; sidewalk; wheel map
Modeling data quality for risk assessment of GIS (cited by: 1)
10
Authors: Su, Ying; Jin, Zhanming; Peng, Jie. Journal of Southeast University (English Edition) (EI, CAS), 2008, Issue S1, pp. 37-42 (6 pages)
This paper presents a methodology to determine three data quality (DQ) risk characteristics: accuracy, comprehensiveness and nonmembership. The methodology provides a set of quantitative models to confirm the information quality risks for the database of a geographical information system (GIS). Four quantitative measures are introduced to examine how the quality risks of source information affect the quality of information outputs produced using the relational algebra operations Selection, Projection, and Cubic Product. It can be used to determine how quality risks associated with diverse data sources affect the derived data. The GIS is the prime source of information on the location of cables, and detection time strongly depends on whether maps indicate the presence of cables in the construction business. Poor data quality in the GIS can contribute to increased risk or higher risk avoidance costs. A case study provides a numerical example of the calculation of the trade-offs between risk and detection costs, and of the calculation of the costs of data quality. We conclude that the model contributes valuable new insight.
Keywords: risk assessment; data quality; geographical information system; probability; spatial data quality
Improvement of Wired Drill Pipe Data Quality via Data Validation and Reconciliation (cited by: 2)
11
Authors: Dan Sui, Olha Sukhoboka, Bernt Sigve Aadnøy. International Journal of Automation and Computing (EI, CSCD), 2018, Issue 5, pp. 625-636 (12 pages)
Wired drill pipe (WDP) technology is one of the most promising data acquisition technologies in today's oil and gas industry. For the first time, it allows sensors to be positioned along the drill string, which enables collecting and transmitting valuable data not only from the bottom hole assembly (BHA) but also along the entire length of the wellbore to the drill floor. The technology has received industry acceptance as a viable alternative to the typical logging while drilling (LWD) method. Recently, more and more WDP applications can be found in challenging drilling environments around the world, leading to many innovations in the industry. Nevertheless, most of the data acquired from WDP can be noisy and, in some circumstances, of very poor quality. Diverse factors contribute to the poor data quality. The most common sources include mis-calibrated sensors, sensor drift, errors during data transmission, and abnormal conditions in the well. The challenge of improving the data quality has attracted increasing attention from researchers during the past decade. This paper proposes a promising solution to this challenge: correcting the raw WDP data and estimating unmeasurable parameters to reveal downhole behaviors. An advanced data processing method, data validation and reconciliation (DVR), is employed, which makes use of the redundant data from multiple WDP sensors to filter out the noise from the measurements and ensures the coherence of all sensors and models. Moreover, it can distinguish accurate measurements from inaccurate ones. In addition, the data with improved quality can be used for estimating some crucial parameters in the drilling process that are unmeasurable in the first place, and hence provide better model calibrations for integrated well planning and real-time operations.
Keywords: data quality; wired drill pipe (WDP); data validation and reconciliation (DVR); drilling models
Novel method for the evaluation of data quality based on fuzzy control (cited by: 1)
12
Authors: Ban Xiaojuan, Ning Shurong, Xu Zhaolin, Cheng Peng. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2008, Issue 3, pp. 606-610 (5 pages)
One of the goals of data collection is to support decision-making, so high quality requirements must be satisfied. Rational evaluation of data quality is an effective way to identify data problems in time, so that the quality of the data after evaluation satisfies the requirements of the decision maker. A data quality evaluation method based on a fuzzy neural network is proposed. First, the criteria for the evaluation of data quality are selected to construct the fuzzy sets of evaluation grades; then, using the learning ability of the neural network (NN), an objective evaluation of membership is carried out, which can be used for the effective evaluation of data quality. This research has been applied in the platform of 'data report of national compulsory education outlay guarantee' from the Chinese Ministry of Education. The method can be used for the effective evaluation of data quality worldwide, and the data quality situation can be determined more completely, more objectively, and in a more timely manner.
Keywords: data quality; evaluation system; fuzzy control theory; neural network
On Statistical Measures for Data Quality Evaluation (cited by: 1)
13
Authors: Xiaoxia Han. Journal of Geographic Information System, 2020, Issue 3, pp. 178-187 (10 pages)
Most GIS databases contain data errors. The quality of the data sources, such as traditional paper maps or more recent remote sensing data, determines spatial data quality. In the past several decades, different statistical measures have been developed to evaluate data quality for different types of data, such as nominal categorical data, ordinal categorical data and numerical data. Although these methods were originally proposed for medical or psychological research, they have been widely used to evaluate spatial data quality. In this paper, we first review statistical methods for evaluating data quality and discuss under what conditions we should use them and how to interpret the results, followed by a brief discussion of statistical software and packages that can be used to compute these data quality measures.
Keywords: GIS; data quality; sensitivity; specificity; kappa; weighted kappa; Bland-Altman analysis; intra-class correlation coefficient
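Several of the measures named in the keywords above have short closed-form definitions. The sketch below assumes the textbook formulas for sensitivity, specificity and unweighted Cohen's kappa, applied to a reference labeling and a test labeling of the same features. The function names are illustrative and are not taken from the paper or from any particular statistics package.

```python
from collections import Counter


def sensitivity_specificity(reference, test, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    Assumes the reference contains both positive and negative cases;
    otherwise one of the ratios is undefined (division by zero).
    """
    pairs = list(zip(reference, test))
    tp = sum(r == positive and t == positive for r, t in pairs)
    fn = sum(r == positive and t != positive for r, t in pairs)
    tn = sum(r != positive and t != positive for r, t in pairs)
    fp = sum(r != positive and t == positive for r, t in pairs)
    return tp / (tp + fn), tn / (tn + fp)


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two nominal labelings.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e the agreement expected by chance. Undefined when p_e == 1.
    """
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (observed - expected) / (1 - expected)
```

Kappa corrects raw agreement for chance, which is why it is usually preferred over simple percent agreement for nominal map classes.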
Quality Analysis of Dairy Herd Improvement Data from Henan Province (cited by: 1)
14
Authors: Zhen ZHANG, Xiaoli REN, Lei YAN, Yuefei YAN, Fanjun GENG, Yanqin SUN. Agricultural Science & Technology (CAS), 2017, Issue 1, pp. 151-155, 188 (6 pages)
The dairy herd improvement (DHI) data from Henan Province were analyzed statistically to establish screening criteria for relevant data, thereby laying a foundation for genetic evaluation of dairy cows. Using the 2 152 451 test-day records of 155 893 Chinese Holstein dairy cows collected by the Henan Dairy Herd Improvement Center from January 2008 to April 2016, the dynamics of test times during a complete lactation, test interval during a complete lactation, days in milk (DIM) of first test-day record, daughter descendant number and herd number of bull, age at first calving and pedigree integrity rate among different years and different herd sizes were analyzed with the MEANS procedure of SAS 9.4. In addition, the data applicable to genetic evaluation were screened with an SQL program. The results showed that during 2008-2015, the number of cows participating in DHI in Henan Province increased from 7 379 to 93 706; the test-day milk yield increased from 19.91 to 24.05 kg; the somatic cell count fell from 411.09×10^3 to 277.08×10^3 cells/ml; the percentage of cows with DIM in the range 5-305 d reached 70.92%; the average test times increased from 3.20 to 6.31; the test interval decreased from 70.22 to 33.83 d; dairy cows with an age at first calving of 25 months were dominant, accounting for 12.57%; bulls with 20 or more daughters distributed across 10 or more farms accounted for 6.05%; the one-generation pedigree integrity rate was 82.54%; and the percentage of data usable for genetic evaluation after screening was 20.67%, which was lower than the results of other similar studies.
Keywords: dairy herd improvement; data quality; test times; test interval; DIM of first test-day record; daughter descendant number and herd number of bull; age at first calving; pedigree integrity rate
A Short Review of the Literature on Automatic Data Quality (cited by: 1)
15
Authors: Deepak R. Chandran, Vikram Gupta. Journal of Computer and Communications, 2022, Issue 5, pp. 55-73 (19 pages)
Several organizations have migrated to the cloud for better quality in business engagements and security. Data quality is crucial in present-day activities. Information is generated and collected from data representing real-time facts and activities. Poor data quality affects organizational decision-making policy and customer satisfaction, and negatively influences the organization's scheme of execution. Data quality also has a massive influence on the accuracy, complexity and efficiency of machine learning and deep learning tasks. There are several methods and tools to evaluate data quality to ensure smooth incorporation in model development. Most data quality tools permit the assessment of data sources only at a certain point in time, so orchestration and automation are left to the user. Ensuring automatic data quality involves several steps, from gathering data from different sources to monitoring data quality, and any problems with data quality must be adequately addressed. There was a gap in the literature, as no attempts had previously been made to collate all the advances in the different dimensions of automatic data quality. This limited narrative review of existing literature sought to address this gap by correlating the different steps and advancements related to automatic data quality systems. The six crucial data quality dimensions in organizations were discussed, and big data were compared and classified. This review highlights existing data quality models and strategies that can contribute to the development of automatic data quality systems.
Keywords: data quality; monitoring; toolkit; dimension; organization
Impact of Data Quality on Question Answering System Performances
16
Authors: Rachid Karra, Abdelali Lasfar. Intelligent Automation & Soft Computing (SCIE), 2023, Issue 1, pp. 335-349 (15 pages)
In contrast with research on new models, little attention has been paid to the impact of low- or high-quality data feeding a dialogue system. The present paper makes the first attempt to fill this gap by extending our previous work on question-answering (QA) systems, investigating the effect of misspelling on QA agents and how context changes can enhance the responses. Instead of using large language models trained on huge datasets, we propose a method that enhances the model's score by modifying only the quality and structure of the data fed to the model. It is important to identify the features that modify the agent's performance, because a high rate of wrong answers can make students lose interest in using the QA agent as an additional tool for distance learning. The results demonstrate that the accuracy of the proposed context simplification exceeds 85%. These findings shed light on the importance of question data quality and context complexity as key dimensions of the QA system. In conclusion, the experimental results on questions and contexts showed that controlling and improving the various aspects of data quality around the QA system can significantly enhance its robustness and performance.
Keywords: DataOps; data quality; QA system; NLP; context simplification
The quality examination of observative data at Geomagnetic observatories
17
Authors: 程安龙, 周锦屏, 高玉芬, 赵学敏, 赵永芬, 黄蔚北. Earthquake Science (CSCD), 1994, Issue S1, pp. 71-79 (9 pages)
The basic task of a geomagnetic observatory is to produce accurate, reliable, continuous and complete observational data. The aim of the examination is to judge the quality status of the data. Based on the operating principle of geomagnetic instruments and the operating status they should achieve, as well as geomagnetic activity and its spread characteristics in the time and location domains, the authors propose a complete set of data quality examinations. The paper discusses in turn the physical basis, examination method and results for scale values, base-line values, monthly mean values, daily mean values, maximum and minimum values in daily range, magnetic storms and the K index. Practice has proved that this set of examinations is feasible and useful for raising and guaranteeing the quality of observational data.
Keywords: geomagnetic observation; geomagnetic data; data quality; quality examination
Quality assessment of volunteered geographic information for outdoor activities:an analysis of OpenStreetMap data for names of peaks in Japan
18
Authors: Jun Yamashita, Toshikazu Seto, Nobusuke Iwasaki, Yuichiro Nishimura. Geo-Spatial Information Science (SCIE, EI, CSCD), 2023, Issue 3, pp. 333-345 (13 pages)
Geographical studies of outdoor activities have increased in recent years with the rise in popularity of these activities worldwide, including in Japan. Volunteered geographic information (VGI) is a key tool for organizing outdoor activities as it offers a means to determine the locational information and names of places. To evaluate the quality of VGI, geospatial data generated by land survey agencies and other VGI are often utilized as reference data. However, since these reference data may not be available, other methods are necessary to assure the quality of VGI. In this study, we examined five trust indicators based on the inherent characteristics of VGI through an empirical case study. We used mountain names extracted from OpenStreetMap in Japan as data because there were almost no other VGI in the vicinity. As a result, we isolated three trust indicators, namely versions, users, and tag corrections, to examine the thematic accuracy of VGI, because these were the only statistically significant indicators. However, we found that the prediction rate of thematic accuracy was very low. To improve thematic accuracy, this study recommends using the most accurate versions, applying correctly given tags, and considering the motivations and characteristics of the VGI contributors.
Keywords: outdoor activities; volunteered geographic information (VGI); OpenStreetMap (OSM); data quality assessment; trust; integrated usage
Data quality evaluation and calibration of on-road remote sensing systems based on exhaust plumes
19
Authors: Shijie Liu, Xinlu Zhang, Linlin Ma, Liqiang He, Shaojun Zhang, Miaomiao Cheng. Journal of Environmental Sciences (SCIE, EI, CAS, CSCD), 2023, Issue 1, pp. 317-326 (10 pages)
In recent years, with rapid increases in the number of vehicles in China, the contribution of vehicle exhaust emissions to air pollution has become increasingly prominent. To achieve the precise control of emissions, on-road remote sensing (RS) technology has been developed and applied for law enforcement and supervision. However, data quality is still an issue affecting the development and application of RS. In this study, the RS data from a cross-road RS system used at a single site (from 2012 to 2015) were collected, the data screening process was reviewed, the issues with data quality were summarized, a new method of data screening and calibration was proposed, and the effectiveness of the improved data quality control methods was finally evaluated. The results showed that this method reduces the skewness and kurtosis of the data distribution by up to nearly 67%, which restores the actual characteristics of exhaust diffusion and is conducive to the identification of actual clean and high-emission vehicles. The annual variability of emission factors of nitric oxide decreases by 60% on average, eliminating the annual drift of fleet emissions and improving data reliability.
Keywords: on-road remote sensing (RS); data quality; Spearman rank correlation; least-square regression with a non-zero intercept; Cook value
Correlation Analysis of Turbidity and Total Phosphorus in Water Quality Monitoring Data
20
Authors: Wenwu Tan, Jianjun Zhang, Xing Liu, Jiang Wu, Yifu Sheng, Ke Xiao, Li Wang, Haijun Lin, Guang Sun, Peng Guo. Journal on Big Data, 2023, Issue 1, pp. 85-97 (13 pages)
At present, water pollution has become an important factor affecting and restricting national and regional economic development. Total phosphorus is one of the main sources of water pollution and eutrophication, so the prediction of total phosphorus in water quality has good research significance. This paper selects total phosphorus and turbidity data for analysis by crawling the data of a water quality monitoring platform. By constructing the attribute object mapping relationship, the correlation between the two indicators was analyzed and used to predict future data. Firstly, the monthly mean and daily mean concentrations of total phosphorus and turbidity outliers were calculated after cleaning, and the correlation between them was analyzed. Secondly, the correlation coefficients at different times and frequencies were used to predict the values for the next five days, and the data trend was predicted with Python visualization. Finally, the real values were compared with the predicted values, and the results showed that the correlation between total phosphorus and turbidity was useful in predicting water quality.
Keywords: correlation analysis; cluster; water quality predict; water quality monitoring data
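The study above rests on the correlation between two monitored indicators. The abstract does not say which coefficient was used, so the sketch below assumes the Pearson product-moment coefficient as the most common choice. The function name `pearson_r` is illustrative, and any input series would need to come from the reader's own data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ValueError when the inputs differ in length, hold fewer than
    two points, or when either series is constant (zero variance).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)
```

Values near +1 or -1 indicate a strong linear relationship; a value near 0 does not rule out a nonlinear one.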