Journal literature: 479 articles found
Hybrid Scalable Researcher Recommendation System Using Azure Data Lake Analytics
1
Authors: Dinesh Kalla, Nathan Smith, Fnu Samaah, Kiran Polimetla 《Journal of Data Analysis and Information Processing》 2024, Issue 1, pp. 76-88 (13 pages)
This paper presents the methodology and design for implementing a hybrid author recommender system using Azure Data Lake Analytics and Power BI. The system recommends the top 1000 computer science authors across different fields of study. The technique handles inadequate citation information and so avoids the cold-start problem encountered by many other recommender systems. Abstracts, titles, and the Microsoft Academic Graph are used to produce a recommendation list for every document, combining content-based approaches with co-citation. Tunable system parameters allow each technique to be prioritized and blended, trading off the authority of the recommended results against paper novelty. We observe a direct correlation between the similarity rankings produced by the system and the participants' scores. The results from the associated analysis scripts and the user survey are made available through the recommendation system. Managers must gain the required expertise to fully utilize the benefits of business intelligence systems [1]. Data mining has become an important tool that gives managers insight into their daily operations and lets them leverage the information provided by decision support systems to improve customer relationships [2]. Additionally, managers require business intelligence systems that can rank output in order of priority. Ranking algorithms can replace the traditional data mining algorithms, which are discussed in depth in the literature review [3].
Keywords: Azure Data Lake; U-SQL; Author Recommendation System; Power BI; Microsoft Academic; Big Data; Word Embedding
Optimizing Multimodal Data Queries in Data Lakes
2
Authors: Runqun Xiong, Shiyuan Zhao, Ciyuan Chen, Zhuqing Xu 《Tsinghua Science and Technology》 2025, Issue 6, pp. 2625-2637 (13 pages)
This paper addresses the challenge of efficiently querying multimodal related data in data lakes, large-scale storage and management systems that support heterogeneous data formats, including structured, semi-structured, and unstructured data. Multimodal data queries are crucial because they enable seamless retrieval of related data across modalities, such as tables, images, and text, with applications in fields like e-commerce, healthcare, and education. However, existing methods primarily focus on single-modality queries, such as joinable or unionable table discovery, and struggle to handle the heterogeneity and lack of metadata in data lakes while balancing accuracy and efficiency. To tackle these challenges, we propose a Multimodal data Query mechanism for Data Lakes (MQDL), which employs a modality-adaptive indexing mechanism and contrastive-learning-based embeddings to unify representations across modalities. Additionally, we introduce product quantization to optimize candidate verification during queries, reducing computational overhead while maintaining precision. We evaluate MQDL using a table-image dataset across multiple business scenarios, measuring precision, recall, and F1-score. Results show that MQDL achieves an accuracy of approximately 90%, while demonstrating strong scalability and reduced query response time compared to traditional methods. These findings highlight MQDL's potential to enhance multimodal data retrieval in complex data lake environments.
Keywords: multimodal data query; data lake; contrastive learning; related data query
Data Lakes as a Centralized Integration Layer in Enterprise Environments:Approaches and Benefits for Scalability and Performance
3
Author: Carlos Diego Cavalcanti Pereira 《Journal of Data Analysis and Information Processing》 2025, Issue 4, pp. 467-486 (20 pages)
Enterprise application integration encounters substantial hurdles, particularly in intricate contexts that require high scalability and speed. Transactional applications accessed directly by many systems frequently overload databases, undermining process efficiency. This paper examines the use of data lakes, historically used for data analysis, as a centralized integration layer that accommodates various temporalities and consumption modalities. The suggested method reduces system interdependence and the burden on transactional databases, enhancing scalability and data governance in both monolithic and distributed frameworks.
Keywords: Application Integration; Data Lakes; Data Governance
Realising Data-Centric Scientific Workflows with Provenance-Capturing on Data Lakes
4
Authors: Hendrik Nolte, Philipp Wieder 《Data Intelligence》 EI 2022, Issue 2, pp. 426-438 (13 pages)
Since their introduction by James Dixon in 2010, data lakes have received increasing attention, driven by the promise of high reusability of the stored data due to schema-on-read semantics. Building on this idea, several additional requirements have been discussed in the literature to improve the general usability of the concept, such as a central metadata catalog including all provenance information, overarching data governance, or integration with (high-performance) processing capabilities. Although the need for a logical and a physical organisation of data lakes to meet those requirements is widely recognized, no concrete guidelines have yet been provided. The most common architecture implementing this conceptual organisation is the zone architecture, where data is assigned to a zone depending on its degree of processing. This paper discusses how FAIR Digital Objects can be used in a novel approach to organize a data lake based on data types instead of zones, how they can be used to abstract the physical implementation, and how they empower generic and portable processing capabilities based on a provenance-based approach.
Keywords: data lake; provenance; workflows; FAIR Digital Objects; CWFR
A Systematic Review of Automated Classification for Simple and Complex Query SQL on NoSQL Database
5
Authors: Nurhadi, Rabiah Abdul Kadir, Ely Salwana Mat Surin, Mahidur R. Sarker 《Computer Systems Science & Engineering》 2024, Issue 6, pp. 1405-1435 (31 pages)
A data lake (DL) is a vast repository of data. It accumulates substantial volumes of data and employs advanced analytics to correlate data from diverse origins containing semi-structured, structured, and unstructured information. These systems use a flat architecture and run different types of data analytics. NoSQL databases are nontabular and store data differently from relational tables. They come in various forms, including key-value pairs, documents, wide columns, and graphs, each based on its data model. They offer simpler scalability and generally outperform traditional relational databases. While NoSQL databases can store diverse data types, they lack full support for the atomicity, consistency, isolation, and durability (ACID) features found in relational databases. Consequently, machine learning approaches become necessary to categorize complex structured query language (SQL) queries. Results indicate that the most frequently used automatic classification technique in processing SQL queries on NoSQL databases is machine-learning-based classification. Overall, this study provides an overview of the automatic classification techniques used in processing SQL queries on NoSQL databases. Understanding these techniques can aid in the development of effective and efficient NoSQL database applications.
Keywords: NoSQL database; data lake; machine learning; ACID; complex query; smart city
Wetland vegetation biomass estimation and mapping from Landsat ETM data: a case study of Poyang Lake (Cited by: 3)
6
Authors: LI Ren-dong (1, 2), LIU Ji-yuan (2) (1. Institute of Geodesy and Geophysics, CAS, Wuhan 430077, China; 2. Institute of Geographic Sciences and Natural Resources Research, CAS, Beijing 100101, China) 《Journal of Geographical Sciences》 SCIE CSCD 2002, Issue 1, pp. 35-41 (7 pages)
Poyang Lake is the largest freshwater lake in China. This paper conducted a rapid digital survey of the lake's wetland vegetation biomass using Landsat ETM data acquired on April 16, 2000. First, using the false color composite derived from the ETM data as one of the main references, the authors designed a sampling route for field measurement of the biomass and carried it out on April 18-28, 2000. Then, after both the sampling data and the ETM data were geometrically corrected to an Albers equal-area projection, linear relationships were calculated between the sampling data and both ETM band 4 and several transformed datasets derived from the ETM data. The results show that the sampling data correlate best with the band 4 data, with a high correlation coefficient of 0.86, followed by the DVI and NDVI data with 0.83 and 0.80 respectively. Therefore, a linear regression model based on the field data and band 4 data was used to estimate the total biomass of the entire Poyang Lake, and a map of the biomass distribution was then compiled.
Keywords: Poyang Lake; biomass; wetland vegetation; Landsat ETM data
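The Poyang Lake abstract above relates field-measured biomass to ETM band 4, DVI, and NDVI through linear correlation and regression. A minimal Python sketch of that kind of workflow follows. The reflectance and biomass numbers are made up for illustration, the function names are hypothetical, and the DVI and NDVI definitions used here are the common ones (near-infrared minus red, and their normalized difference), which the abstract itself does not spell out.

```python
from statistics import mean

def dvi(nir, red):
    """Difference vegetation index: near-infrared minus red."""
    return nir - red

def ndvi(nir, red):
    """Normalized difference vegetation index."""
    return (nir - red) / (nir + red)

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical field samples: band 4 (NIR) and band 3 (red) reflectance,
# and measured biomass at the same sites. Not data from the paper.
band4 = [0.20, 0.25, 0.30, 0.35, 0.40]
band3 = [0.10, 0.09, 0.08, 0.08, 0.07]
biomass = [310.0, 420.0, 500.0, 640.0, 730.0]

r_band4 = pearson_r(band4, biomass)
r_ndvi = pearson_r([ndvi(n, r) for n, r in zip(band4, band3)], biomass)

# Regress biomass on band 4, then estimate biomass for an unsampled pixel.
slope, intercept = fit_line(band4, biomass)
estimated = slope * 0.33 + intercept
```

Applying the fitted line to every band 4 pixel and summing the per-pixel estimates is one plausible way to reach a lake-wide total, though the abstract does not describe that step in detail.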
Mapping of moraine dammed glacial lakes and assessment of their areal changes in the central and eastern Himalayas using satellite data (Cited by: 3)
7
Authors: Sazeda BEGAM, Dhrubajyoti SEN 《Journal of Mountain Science》 SCIE CSCD 2019, Issue 1, pp. 77-94 (18 pages)
The relatively rapid recession of glaciers in the Himalayas and the formation of moraine dammed glacial lakes (MDGLs) in the recent past have increased the risk of glacier lake outburst floods (GLOF) in Nepal and Bhutan and in the mountainous territory of Sikkim in India. As a product of climate change and global warming, this risk has not only raised the threat to habitation and infrastructure in the region, but has also worsened the balance of the unique ecosystem of this domain, which sustains several of the highest mountain peaks in the world. This study presents an up-to-date mapping of the MDGLs in the central and eastern Himalayan regions using remote sensing data, with the objective of analysing their surface area variations from 1990 through 2015, disaggregated over six episodes. The study also evaluates the susceptibility of MDGLs to GLOF with least criteria decision analysis (LCDA). Forty-two major MDGLs, each with a lake surface area greater than 0.2 km², were identified in the Himalayan ranges of Nepal, Bhutan, and Sikkim and categorized according to their surface area expansion rates in space and time. The lakes lie within the elevation range of 3800 m to 6800 m above mean sea level (amsl). With a total surface area of 37.9 km², these MDGLs as a whole were observed to have expanded by an astonishing 43.6% in area over the 25-year period of this study. A factor is introduced to numerically sort the lakes by their relative yearly expansion rates, based on interpretation of their surface area extents from satellite imagery. Verification of predicted past GLOF events using this factor, with the limited field data reported in the literature, indicates that the present analysis may be considered a sufficiently reliable and rapid technique for assessing the potential bursting susceptibility of the MDGLs. The analysis also indicates that, as of now, eight MDGLs in the region appear to be in highly vulnerable states and have a high chance of causing GLOF events in the near future.
Keywords: Glacier retreat; Lakes mapping; Moraine dammed glacial lake (MDGL); Surface area change of lakes; Landsat imagery data; Least criteria decision analysis (LCDA)
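Quantitative characterization and variation of hydrological drought in Poyang Lake based on inundation area (Cited by: 1)
8
Authors: 叶许春, 岳恩馨, 李相虎, 李传哲 《水科学进展》 PKU Core 2025, Issue 2, pp. 320-331 (12 pages)
Investigating the spatiotemporal heterogeneity of inundation dynamics in floodplain lakes, and the quantitative characterization of hydrological drought under its influence, is important for improving ecosystem management practice and flood and drought defense capacity in such lakes. Using multi-source remote sensing data and image fusion, the study builds a continuous inundation-area dataset with high spatial and temporal resolution for the Poyang Lake region over 2000-2023 and reveals the spatiotemporal heterogeneity of the lake's inundation dynamics. Drawing on the principle of the Standardized Precipitation Index (SPI), it proposes a standardized hydrological drought index based on inundation area and uses it to analyze changes in hydrological drought in Poyang Lake. The results show that: (1) inundation dynamics in Poyang Lake are markedly heterogeneous in space and time; intra-annual fluctuations of inundation area differ between the main lake and the dish-shaped sub-lakes, and their interannual trends are opposite; (2) for quantifying hydrological drought across the whole lake, the station-based standardized water level index carries considerable uncertainty, whereas the standardized inundation area index is comparatively more scientifically sound; (3) hydrological drought in Poyang Lake is somewhat complex in its spatiotemporal distribution; extreme droughts occur mainly from April to October and are more likely in the main lake. Combining remote sensing big data with image fusion enables fine-grained quantitative study of hydrological drought in large floodplain lakes and supports lake resource protection and use as well as flood and drought prevention.
Keywords: hydrological drought; inundation area; floodplain lake; data fusion; remote sensing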
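The entry above builds a standardized drought index from inundation area "on the SPI principle", that is, by converting each observation's cumulative probability into a standard normal score. The sketch below illustrates only that principle and is not the paper's formula. As a stated simplification, it uses empirical (Hazen) plotting positions instead of the fitted gamma distribution that SPI normally uses, and the function name and area values are hypothetical.

```python
from statistics import NormalDist

def standardized_index(values):
    """Map each value to a standard normal score via its empirical probability.

    Simplification: real SPI fits a distribution (typically gamma) to the
    series for each calendar period; here the cumulative probability is the
    Hazen plotting position (count_below + 0.5 * count_equal) / n.
    """
    n = len(values)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    scores = []
    for v in values:
        below = sum(1 for u in values if u < v)
        equal = sum(1 for u in values if u == v)  # includes v itself
        p = (below + 0.5 * equal) / n             # always strictly in (0, 1)
        scores.append(std_normal.inv_cdf(p))
    return scores

# Hypothetical inundation areas (km^2) for the same month in successive years.
areas = [2900.0, 3400.0, 1800.0, 3100.0, 2500.0]
index = standardized_index(areas)
driest_year = index.index(min(index))  # position of the most negative score
```

Negative scores mark periods with less inundation than usual. A drought classification would need thresholds on the score, which the abstract does not give.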
Study on the Applicability of ERA5 Reanalysis Data at Lake Taihu
9
Authors: Bo Wang, Dongmei Chen, Meiqi Song 《Journal of Geoscience and Environment Protection》 2022, Issue 12, pp. 1-16 (16 pages)
Lakes are an important component of the Earth's climate system. They play an important role in basin weather forecasting, air quality forecasting, and regional climate research. Accurate driving variables are a basic premise for sound lake model simulation. Based on in-situ observations at the Bifenggang site of the Lake Taihu Eddy Flux Network from 2012 to 2017, this paper investigated temporal variations in temperature, relative humidity, wind speed, and radiation components at different time scales (hourly, seasonal, and interannual). ERA5 reanalysis data were compared with the in-situ observations to quantify the error and evaluate the performance of the reanalysis data. The results show that: 1) On the hourly scale, the ERA5 reanalysis data described air temperature and downward long-wave radiation more accurately. 2) On the seasonal scale, the ERA5 reanalysis data described air temperature and downward long-wave radiation more accurately, but the descriptions of wind speed, relative humidity, and downward short-wave radiation have large deviations. 3) On the interannual scale, the ERA5 reanalysis data perform well for temperature, followed by downward long-wave radiation, downward short-wave radiation, and relative humidity.
Keywords: Lake Taihu; ERA5 Reanalysis Data; Meteorological Variables; Comparison; Applicability
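Customer data governance technology for small and medium-sized enterprises based on entropy reduction and Markov chains
10
Authors: 刘敏, 黄倚霄, 陈智扬, 张湛梅 《现代信息科技》 2025, Issue 3, pp. 140-145, 152 (7 pages)
To address the disorderly, unstandardized state of customer data in traditional small and medium-sized enterprises (SMEs), an innovative data governance technique is proposed. It aggregates multi-source heterogeneous data and combines optical character recognition (OCR) with other methods to build a standardized data lake of basic SME information, improving data quality at the source. It introduces the concept of "entropy reduction" and uses intelligent algorithms to quantitatively assess data quality, so that quality problems can be located and resolved promptly. It also sets up a time-series database and builds an entropy-reduction-based Markov chain model to predict future data quality trends and target governance at potential problem areas. The technique not only maximizes data value but also markedly lowers governance cost and improves the efficiency and accuracy of data governance, giving strong support to enterprises' cost reduction and efficiency gains.
Keywords: entropy reduction; data governance; Markov chain; SME data lake; time-series database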
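Evolution of hydrological connectivity in the Poyang Lake floodplain system and its influence on wetland vegetation growth (Cited by: 3)
11
Authors: 岳恩馨, 刘意, 李相虎, 赵华琼, 叶许春 《生态学报》 PKU Core 2025, Issue 4, pp. 1938-1949 (12 pages)
Hydrological connectivity is a key factor shaping hydrological processes and ecosystem structure and function in floodplains, and it is especially important for the growth and distribution of wetland vegetation. Using the ESTARFM (Enhanced Spatial and Temporal Adaptive Reflectance Fusion Model) model, the study reconstructs high-resolution datasets of the water index NDWI (Normalized Difference Water Index; 8 d, 30 m) and the vegetation index EVI (Enhanced Vegetation Index; 16 d, 30 m) for the Poyang Lake floodplain system over 2000-2022. Combined with a geostatistical hydrological connectivity function, it systematically examines how multidimensional hydrological connectivity in the Poyang Lake region has evolved and how it affects wetland vegetation growth. The results show that: 1) in every hydrological period, east-west and north-south connectivity vary highly dynamically with increasing distance, and the rate of change of the connectivity function curve ranks dry season > receding period > rising period > wet season; 2) over the study period, north-south connectivity in Poyang Lake was clearly higher than east-west connectivity, but by region, the dominant connectivity direction in the main lake and the Nanji reserve changed over time, while the dish-shaped sub-lakes and the Poyang Lake reserve were dominated by north-south connectivity; east-west connectivity showed a fairly consistent fluctuating decline across regions, whereas trends in north-south connectivity differed widely; 3) wetland vegetation EVI is significantly negatively correlated with hydrological connectivity: EVI in the main lake is controlled mainly by east-west connectivity, EVI in the dish-shaped sub-lakes and the Poyang Lake reserve is governed jointly by east-west and north-south connectivity, and EVI in the Nanji reserve is influenced more by north-south connectivity. Strengthening research on how hydrological connectivity affects the "structure-process-function" of wetland ecosystems under changing environments is essential for water resource management in lake systems and for wetland ecological protection.
Keywords: hydrological connectivity; long time series; Poyang Lake; wetland vegetation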
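Scientific and technological innovation research and industrialization of river and lake sediment treatment
12
Authors: 唐彤芝, 吴志强, 徐锴, 黄英豪, 关云飞, 陈海波 《水利水运工程学报》 PKU Core 2025, Issue 5, pp. 42-53 (12 pages)
Siltation of China's rivers, lakes, and reservoirs is steadily worsening, seriously endangering engineering safety and benefits, water quality, and the ecological environment, and sediment treatment has become an important part of the national river strategy. There is an urgent need to build a technology and industry chain of "precise detection, efficient dewatering, safe utilization", to strengthen the integration of multiple technologies, and to shift sediment from "safety hazard and treatment burden" to "strategic resource and industrialized new quality productive force". Scientifically planning and conducting research on sediment detection, rapid consolidation and hardening of sediment, and resource utilization will strengthen national water security, advance high-quality development of water conservancy and waterway construction in the new era, and promote the industrialization of sediment resources. The paper summarizes the overall tasks for developing in-situ sediment detection, treatment, and utilization technologies, analyzes the key scientific problems to be solved, and builds a technical framework that is innovative, practical, and backed by independent intellectual property and distinctive technology. The framework covers intelligent siltation detection technology and integrated equipment, a "space-air-ground-water-engineering" big data system, and distributed multifunctional rapid drainage-consolidation technology and resource utilization equipment. It can provide scientific and technological support for meeting major public-interest needs of the nation and the sector and for turning sediment industrialization into a new quality productive force.
Keywords: river, lake, and reservoir siltation; intelligent detection; big data; consolidation and hardening; industrialization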
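Practice of a "lakehouse" data middle platform empowering the digital transformation of universities
13
Authors: 李晓兰, 付睿 《武汉工程职业技术学院学报》 2025, Issue 3, pp. 25-28 (4 pages)
Amid the wave of digital transformation, universities face challenges such as insufficient data governance capability and fragmented business systems. The "lakehouse" data middle platform combines the strengths of the data lake and the data warehouse to give universities a unified data management foundation. The article describes its innovative application in university settings along three dimensions: architecture design, technical implementation, and practical cases. It proposes a four-layer "storage, computing, governance, service" architecture model, validates its value for cross-domain data fusion, intelligent analysis and decision-making, and collaborative research innovation using practice at one university, and finally summarizes implementation paths and strategies for meeting the challenges, offering a reference for university digital transformation.
Keywords: lakehouse; data middle platform; university digital transformation; data governance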
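Functional model and construction path of the enterprise "archival data lake" (Cited by: 1)
14
Authors: 冯泽宇, 郭若涵 《北京档案》 PKU Core 2025, Issue 1, pp. 31-37 (7 pages)
In the big data era, building an enterprise "archival data lake" follows the trend of enterprise archival data transformation, meets the needs of enterprise archival data governance, helps eliminate archival data silos, and fully releases the value of enterprise archival data. Taking the most typical current "data lake" architecture as a reference template and considering how enterprise archival data work is actually done, the paper builds a functional model of the enterprise "archival data lake" from six layers: "entering the lake: archival data ingestion module", "filling the lake: archival data storage module", "governing the lake: archival data management module", "measuring the lake: archival data computing module", "regulating the lake: archival data scheduling module", and "using the lake: archival data application module". Based on this functional model, it proposes a construction path built on five aspects: data inventory, technology selection, data ingestion, integrated governance, and business support. This offers a reference for enterprises building an "archival data lake" in practice and helps archival data realize its new potential to empower business development.
Keywords: enterprise archives; archival data; data lake; construction path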
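Landscape perception of Dianchi, a plateau lake, based on online big data analysis
15
Authors: 黄琳云, 李一姣, 宋钰红, 李银汇, 徐娅婷, 张龙 《西部林业科学》 PKU Core 2025, Issue 5, pp. 110-118 (9 pages)
As the most important blue spaces, urban lakes not only regulate climate and improve the environment; their surroundings are also major places where residents and visitors are active and where happiness and well-being are enhanced. Dianchi is a typical plateau lake, and how its landscape is perceived affects the sustainable development of Dianchi tourism in Kunming. To reveal the characteristics of public perception of the Dianchi lake landscape and explore the key factors affecting visitor satisfaction, the study takes Dianchi in Kunming as its subject and applies online big data analysis. It collects 12,104 reviews covering popular attractions on Xiaohongshu and on Ctrip, uses the LDA topic model and natural language processing (NLP) sentiment analysis to identify dimensions of landscape perception and quantify sentiment, and studies the feelings and preferences that users form on site. The results show that natural landscape and folk culture are the core elements affecting visitor satisfaction, and visitors generally rate these two dimensions positively. However, visitors express some negative sentiment about commerce that fails to reflect local Kunming character, insufficient supporting facilities, and service management. The results indicate that users' perception of the Dianchi lake landscape is a key factor in their satisfaction. The study's novelty lies in its combined research perspective and methods, and its results offer a reference for optimizing plateau lake landscapes and improving services.
Keywords: online big data; Dianchi; lake landscape perception; LDA topic model; sentiment analysis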
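Building a full-chain medical quality control indicator management system based on a data lake
16
Authors: 黄超仪, 李田英, 陆殷, 方莹, 高伟, 金从凯 《中国数字医学》 2025, Issue 10, pp. 102-106 (5 pages)
Objective: To improve the credibility and usability of medical quality control indicators through a full-chain indicator management system, and to build multidimensional intelligent analysis and a monitoring and early-warning system. Methods: On a data lake platform, a three-level indicator model, a four-layer data model, and an indicator subject management model were built. Combined with full-chain data quality control techniques and a cross-department collaboration mechanism based on the RACI responsibility matrix, they support intelligent analysis scenarios for medical quality control management. Results: The system standardized quality control data; scores on the six dimensions of business data quality rose substantially, indicator ambiguity was effectively resolved, the automatic statistics rate increased by 60%, and closed-loop management of multidimensional indicators against target values, with alerting within seconds, was achieved. Conclusion: The data-lake-driven full-chain indicator management system effectively solves the problem of scientifically managing quality control indicators, which make up the largest share of medical management indicators and are the hardest to compile. By strengthening the linkability, collectability, traceability, and reliability of indicators, it markedly improves the effectiveness of hospital medical quality management.
Keywords: data lake; hospital management; data governance; medical quality control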
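Water level changes of Qinghai Lake based on multi-source satellite altimetry data (Cited by: 1)
17
Authors: 毋梦艳, 陈鹏, 李祖峰, 杨新越 《西安科技大学学报》 PKU Core 2025, Issue 1, pp. 191-201 (11 pages)
To capture water level changes more accurately, water level monitoring methods with higher temporal and spatial resolution are needed. First, the three altimetry satellites Envisat, Cryosat-2, and Sentinel-3A are used to extract Qinghai Lake water levels for 2002-2010, 2011-2015, and 2016-2020 respectively, and a water level time series on a unified datum is constructed. Then the series is compared with in-situ water levels of Qinghai Lake, using root mean square error (RMSE) and the correlation coefficient (R) as accuracy metrics. Finally, the accuracy of the three radar altimetry satellites in retrieving the Qinghai Lake water level is verified, and a Kalman filter is used to fuse the multi-source altimetry data into a water level time series for 2002-2020. The results show that the water level of Qinghai Lake rises year by year, at up to 0.36 m/a. The RMSEs between retrieved and measured water levels for Envisat, Cryosat-2, and Sentinel-3A are 0.54, 0.13, and 0.14 m, and the correlation coefficients R are 0.36, 0.89, and 0.97 respectively. On this basis, the water level retrieved by Kalman-filter fusion of the multi-source data has an RMSE of 0.20 m and an R of 0.98, an RMSE 17.10% lower and an R 5.10% higher than the satellite-retrieved water levels. Kalman-filter fusion of multi-source altimetry data effectively compensates for the low temporal resolution of a single satellite, is markedly more accurate than the satellite retrievals, and lays a foundation for building high-resolution water level time series for more inland water bodies.
Keywords: satellite altimetry; multi-source data fusion; Kalman filter; Qinghai Lake; water level change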
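The Qinghai Lake entry above scores each altimetry series with RMSE and then fuses the series with a Kalman filter. The sketch below shows both ideas in their simplest form: a scalar Kalman filter with a random-walk state model, where each observation carries its own error variance. The state model, the `process_var` value, the use of each satellite's reported RMSE as its observation error, and the sample water levels are all assumptions made for illustration and are not the paper's configuration.

```python
from math import sqrt
from statistics import mean

def rmse(predicted, observed):
    """Root mean square error between two equal-length series."""
    return sqrt(mean((p - o) ** 2 for p, o in zip(predicted, observed)))

def kalman_fuse(observations, process_var=0.01):
    """Fuse time-ordered (level_m, obs_variance) pairs with a scalar Kalman filter.

    Assumed state model: the water level follows a random walk whose
    per-step variance is process_var.
    """
    x, p = observations[0]           # initialise from the first observation
    fused = [x]
    for z, r in observations[1:]:
        p = p + process_var          # predict: uncertainty grows between steps
        k = p / (p + r)              # Kalman gain: trust in the new observation
        x = x + k * (z - x)          # update the level estimate
        p = (1.0 - k) * p            # update the estimate variance
        fused.append(x)
    return fused

# Hypothetical levels (m); variances are the squared RMSEs quoted in the
# abstract for Envisat (0.54 m), Cryosat-2 (0.13 m), Sentinel-3A (0.14 m).
obs = [
    (3193.20, 0.54 ** 2),
    (3193.45, 0.13 ** 2),
    (3193.60, 0.14 ** 2),
]
fused_levels = kalman_fuse(obs)
```

A noisier source such as the Envisat-like first observation receives a smaller weight than the later, more precise ones, which is the property that makes this kind of fusion attractive when sensors differ in accuracy.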
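A layered data governance framework for data lake environments and its optimization
18
Authors: 张灵洁, 邓基伟, 李源 《移动信息》 2025, Issue 9, pp. 102-104 (3 pages)
In data lake environments, diverse data sources, uneven data quality, and complex security risks pose serious challenges for data governance. To address them, the paper designs a layered data governance framework for data lake environments and proposes optimization strategies for three key areas: metadata management, data quality management, and data security management. The strategies are metadata management optimized with distributed storage and hierarchical indexing, a multi-level data quality management method combining a rule base with machine learning, and a data security management mechanism combining layered protection with dynamic risk assessment. Together these offer a systematic solution for efficient, reliable data governance in data lake environments.
Keywords: data lake; data governance; layered architecture; big data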
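Research on a metadata model for an atmospheric environment monitoring data lake
19
Authors: 刘坤峄, 王志宝, 赵满, 罗源 《计算机与数字工程》 2025, Issue 8, pp. 2265-2271 (7 pages)
Existing metadata management models lack a comprehensive analysis of the characteristics of atmospheric environment monitoring data and so cannot manage such data well. Given that these data are large in scale and loosely organized, the paper proposes a metadata model dedicated to atmospheric environment monitoring data lakes (AEMDLM). The model divides atmospheric environment monitoring metadata into three classes, temporal, spatial, and business, each linked to the atmospheric environment monitoring data resource catalog for data matching and semantic reasoning. An atmospheric environment monitoring data lake metadata management system (AEMDLMS) was developed to show that the model effectively improves the organization and management of the data lake and eases data retrieval and analysis. The paper closes with an outlook on further research into the retrieval of atmospheric environment monitoring data.
Keywords: atmospheric environment monitoring; data lake; metadata; metadata model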
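A storage method for multi-source transmitted information based on blockchain and data lake technology (Cited by: 1)
20
Authors: 韩吉双, 曹锋, 曾广勇, 连智杰 《电子设计工程》 2025, Issue 18, pp. 157-160, 169 (5 pages)
Multi-source transmitted information is easily tracked by attackers during storage, which leads to low storage throughput and poor information integrity. To solve these problems, a storage method based on blockchain and data lake technology is proposed. The blockchain's "on-chain/off-chain" structure is used to encrypt data exchanged between storage nodes and to authenticate the identities of both transmitting parties. Based on the association between the receiver's and the sender's random information, a set of ring signatures is constructed by encryption, providing a secure environment for information storage. A data lake storage model is built that starts from the root of the root node and retrieves node data IDs. The frequency modulation degree of each node on the data lake is computed, data in the data lake are mapped to nodes, and a physical node is assigned to each logical node, achieving efficient distributed storage of multi-source transmitted information. Experimental results show a maximum throughput of 1140 Mbps and post-storage information integrity of up to 99.7%.
Keywords: blockchain; data lake; multi-source transmission; information storage; node mapping; distributed storage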