Funding: Supported by the National Natural Science Foundation of China (Nos. 72204169 and 81825007), Beijing Outstanding Young Scientist Program (No. BJJWZYJH01201910025030), Capital's Funds for Health Improvement and Research (No. 2022-2-2045), National Key R&D Program of China (Nos. 2022YFF1501500, 2022YFF1501501, 2022YFF1501502, 2022YFF1501503, 2022YFF1501504, and 2022YFF1501505), Youth Beijing Scholar Program (No. 010), Beijing Laboratory of Oral Health (No. PXM2021_014226_000041), Beijing Talent Project-Class A: Innovation and Development (No. 2018A12), National Ten-Thousand Talent Plan-Leadership of Scientific and Technological Innovation, and National Key R&D Program of China (Nos. 2017YFC1307900 and 2017YFC1307905).
Abstract: Differences among the imaging subgroups of cerebral small vessel disease (CSVD) need further exploration. First, propensity score matching is used to obtain balanced datasets. Random forest (RF) is then adopted to classify the subgroups and to select features, and is compared with support vector machine (SVM) and extreme gradient boosting (XGBoost). The top 10 features by importance are entered into stepwise logistic regression to obtain odds ratios (ORs) and 95% confidence intervals (CIs). The dataset comprises 41290 adult inpatient records with a CSVD diagnosis. RF reaches accuracy and area under the curve (AUC) close to 0.7, the best classification performance of the three models. The ORs (95% CIs) of hematocrit for white matter lesions (WMLs), lacunes, microbleeds, atrophy, and enlarged perivascular space (EPVS) are 0.9875 (0.9857–0.9893), 0.9728 (0.9705–0.9752), 0.9782 (0.9740–0.9824), 1.0093 (1.0081–1.0106), and 0.9716 (0.9597–0.9832), respectively. The ORs (95% CIs) of red cell distribution width for WMLs, lacunes, atrophy, and EPVS are 0.9600 (0.9538–0.9662), 0.9630 (0.9559–0.9702), 1.0751 (1.0686–1.0817), and 0.9304 (0.8864–0.9755), respectively. The ORs (95% CIs) of platelet distribution width for WMLs, lacunes, and microbleeds are 1.1796 (1.1636–1.1958), 1.1663 (1.1476–1.1853), and 1.0416 (1.0152–1.0687), respectively. This study proposes a new analytical framework that uses machine learning on a common data model to select important clinical markers for CSVD, with the advantages of low cost, fast speed, large sample size, and continuous data sources.
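As a reading aid for the odds ratios reported above, the following minimal sketch shows the standard relation between a logistic-regression coefficient and its OR with a Wald-type 95% CI. It assumes the reported intervals are Wald intervals, which the abstract does not state; the function names are hypothetical and are not taken from the study.

```python
import math

Z_95 = 1.959963984540054  # two-sided 95% quantile of the standard normal


def odds_ratio_ci(beta, se, z=Z_95):
    """Return (OR, lower, upper) for a logistic-regression coefficient.

    OR = exp(beta); the Wald interval is exp(beta -/+ z * se).
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coef_from_reported(or_value, lower, upper, z=Z_95):
    """Back out (beta, se) from a reported OR and CI (assumes a Wald interval)."""
    beta = math.log(or_value)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Reported hematocrit OR for WMLs: 0.9875 (0.9857-0.9893)
beta, se = coef_from_reported(0.9875, 0.9857, 0.9893)
or_value, lower, upper = odds_ratio_ci(beta, se)
print(f"beta={beta:.5f}, se={se:.5f}")
print(f"OR={or_value:.4f} ({lower:.4f}-{upper:.4f})")
```

An OR below 1 corresponds to a negative coefficient, so under this reading each unit increase in hematocrit is associated with slightly lower odds of WMLs.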
Abstract: Multidatabase systems are designed to achieve schema integration and data interoperation among distributed and heterogeneous database systems, but data model heterogeneity and schema heterogeneity make this a challenging task. A multidatabase common data model based on XML, named the XML-based Integration Data Model (XIDM), is first introduced; it is suitable for integrating different types of schemas. An approach to schema mapping based on XIDM in multidatabase systems is then presented. The mappings include global mappings, which deal with horizontal and vertical partitioning between global schemas and export schemas, and local mappings, which handle the transformation between export schemas and local schemas. Finally, the illustration and implementation of schema mappings in a multidatabase prototype, the Panorama system, are discussed. The implementation results demonstrate that XIDM is an efficient model for managing multiple heterogeneous data sources and that the XIDM-based schema mapping approaches behave very well when integrating relational and object-oriented database systems and other file systems.
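The two kinds of partitioning handled by the global mappings can be illustrated with a small sketch. This is not XIDM or the Panorama implementation: rows are plain Python dictionaries rather than XML, and the function names, column names, and sample rows are all hypothetical.

```python
def merge_horizontal(*fragments):
    """Horizontal partitioning: fragments share columns and hold different rows.

    The global relation is the concatenation (union) of the fragments.
    """
    rows = []
    for fragment in fragments:
        rows.extend(dict(row) for row in fragment)
    return rows


def merge_vertical(key, *fragments):
    """Vertical partitioning: each fragment holds the key plus some columns.

    The global relation is rebuilt by joining the fragments on the key.
    """
    merged = {}
    for fragment in fragments:
        for row in fragment:
            merged.setdefault(row[key], {}).update(row)
    return [merged[k] for k in sorted(merged)]


# Hypothetical export schemas from two sites.
site_a_rows = [{"id": 1, "name": "Ann"}]
site_b_rows = [{"id": 2, "name": "Bo"}]
names = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bo"}]
depts = [{"id": 1, "dept": "R&D"}, {"id": 2, "dept": "Ops"}]

print(merge_horizontal(site_a_rows, site_b_rows))
print(merge_vertical("id", names, depts))
```

A real global mapping would also have to reconcile naming and type conflicts between export schemas, which this sketch leaves out.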
Funding: Supported by the Space Core Technology Development Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT & Future Planning (NRF-2014M1A3A3A02034789); the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF-2013R1A1A2A10004743); and the Korea Meteorological Administration Research and Development Program under the Weather Information Service Engine (WISE) project grant KMA-2012-0001-A.
Abstract: To better understand hydrological interactions between the land surface and the atmosphere, land surface models are routinely used to simulate hydro-meteorological fluxes. However, few observations are available as model forcing for estimating hydro-meteorological fluxes in East Asia. In this study, the Common Land Model (CLM) was run in offline mode for the 2006 summer monsoon period in East Asia, with forcings from Asiaflux, the Korea Land Data Assimilation System (KLDAS), and the Global Land Data Assimilation System (GLDAS), at point and regional scales separately. The CLM results were compared with observations from Asiaflux sites. The estimated net radiation showed good agreement, with r = 0.99 at the point scale and 0.85 at the regional scale. The sensible and latent heat fluxes estimated using Asiaflux and KLDAS data showed reasonable agreement, with r = 0.70. The estimated soil moisture and soil temperature showed patterns similar to the observations, although the water fluxes estimated using KLDAS showed larger discrepancies than those using Asiaflux because of scale mismatch. The spatial distributions of hydro-meteorological fluxes from CLM with KLDAS for East Asia were compared with the CLM results with GLDAS and with the GLDAS product provided online. The spatial distributions of CLM with KLDAS were analogous to those of CLM with GLDAS and of the standalone GLDAS data. The results indicate that KLDAS is a good potential source of high-spatial-resolution forcing data. KLDAS is therefore a promising alternative product, capable of compensating for the lack of observations and the low-resolution grid data for East Asia.
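The agreement statistics quoted above are correlation coefficients. Assuming r denotes the Pearson correlation between simulated and observed series (the abstract does not name the estimator), it can be computed as in the sketch below. The function name is hypothetical and the sample values are made-up toy numbers, not data from the study.

```python
import math


def pearson_r(simulated, observed):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(simulated)
    if n != len(observed) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_s = sum(simulated) / n
    mean_o = sum(observed) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(simulated, observed))
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_s == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_s * var_o)


# Toy values only (not from the paper): simulated vs. observed flux.
toy_simulated = [100.0, 150.0, 200.0, 250.0]
toy_observed = [110.0, 140.0, 210.0, 240.0]
print(f"r = {pearson_r(toy_simulated, toy_observed):.2f}")
```

Because r measures linear association only, a high value does not rule out a systematic bias between the simulated and observed fluxes.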
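Abstract: Existing tools for parsing standard-format radar base data lack generality and abstraction in their design, which makes radar data inconvenient to parse and process. To address this problem, this paper designs and builds a base data model for Chinese weather radar on top of Unidata's CDM (Common Data Model), enabling access to standard-format weather radar base data at the data-model level. Building on Unidata's open-source NetCDF Java library and the IDV (Integrated Data Viewer) visualization software, it provides a CDM-based toolset for content extraction and visual analysis of standard-format weather radar base data. Using a comparison of base reflectivity data from the Guangzhou radar in the old and new formats as an example, the study demonstrates how the results apply to the evaluation of standard-format Doppler weather radar base data. The results show that this work makes standard-format radar base data easier to use and promotes its operational application. The work can also be applied to operational and research tasks involving radar base data processing and analysis, providing basic support for the application of radar data.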