Parkinson’s disease(PD)is a debilitating neurological disorder affecting over 10 million people worldwide.PD classification models using voice signals as input are common in the literature.It is believed that using d...Parkinson’s disease(PD)is a debilitating neurological disorder affecting over 10 million people worldwide.PD classification models using voice signals as input are common in the literature.It is believed that using deep learning algorithms further enhances performance;nevertheless,it is challenging due to the nature of small-scale and imbalanced PD datasets.This paper proposed a convolutional neural network-based deep support vector machine(CNN-DSVM)to automate the feature extraction process using CNN and extend the conventional SVM to a DSVM for better classification performance in small-scale PD datasets.A customized kernel function reduces the impact of biased classification towards the majority class(healthy candidates in our consideration).An improved generative adversarial network(IGAN)was designed to generate additional training data to enhance the model’s performance.For performance evaluation,the proposed algorithm achieves a sensitivity of 97.6%and a specificity of 97.3%.The performance comparison is evaluated from five perspectives,including comparisons with different data generation algorithms,feature extraction techniques,kernel functions,and existing works.Results reveal the effectiveness of the IGAN algorithm,which improves the sensitivity and specificity by 4.05%–4.72%and 4.96%–5.86%,respectively;and the effectiveness of the CNN-DSVM algorithm,which improves the sensitivity by 1.24%–57.4%and specificity by 1.04%–163%and reduces biased detection towards the majority class.The ablation experiments confirm the effectiveness of individual components.Two future research directions have also been suggested.展开更多
Accurate purchase prediction in e-commerce critically depends on the quality of behavioral features.This paper proposes a layered and interpretable feature engineering framework that organizes user signals into three ...Accurate purchase prediction in e-commerce critically depends on the quality of behavioral features.This paper proposes a layered and interpretable feature engineering framework that organizes user signals into three layers:Basic,Conversion&Stability(efficiency and volatility across actions),and Advanced Interactions&Activity(crossbehavior synergies and intensity).Using real Taobao(Alibaba’s primary e-commerce platform)logs(57,976 records for 10,203 users;25 November–03 December 2017),we conducted a hierarchical,layer-wise evaluation that holds data splits and hyperparameters fixed while varying only the feature set to quantify each layer’s marginal contribution.Across logistic regression(LR),decision tree,random forest,XGBoost,and CatBoost models with stratified 5-fold cross-validation,the performance improvedmonotonically fromBasic to Conversion&Stability to Advanced features.With LR,F1 increased from 0.613(Basic)to 0.962(Advanced);boosted models achieved high discrimination(0.995 AUC Score)and an F1 score up to 0.983.Calibration and precision–recall analyses indicated strong ranking quality and acknowledged potential dataset and period biases given the short(9-day)window.By making feature contributions measurable and reproducible,the framework complements model-centric advances and offers a transparent blueprint for production-grade behavioralmodeling.The code and processed artifacts are publicly available,and future work will extend the validation to longer,seasonal datasets and hybrid approaches that combine automated feature learning with domain-driven design.展开更多
Standardized datasets are foundational to healthcare informatization by enhancing data quality and unleashing the value of data elements.Using bibliometrics and content analysis,this study examines China's healthc...Standardized datasets are foundational to healthcare informatization by enhancing data quality and unleashing the value of data elements.Using bibliometrics and content analysis,this study examines China's healthcare dataset standards from 2011 to 2025.It analyzes their evolution across types,applications,institutions,and themes,highlighting key achievements including substantial growth in quantity,optimized typology,expansion into innovative application scenarios such as health decision support,and broadened institutional involvement.The study also identifies critical challenges,including imbalanced development,insufficient quality control,and a lack of essential metadata—such as authoritative data element mappings and privacy annotations—which hampers the delivery of intelligent services.To address these challenges,the study proposes a multi-faceted strategy focused on optimizing the standard system's architecture,enhancing quality and implementation,and advancing both data governance—through authoritative tracing and privacy protection—and intelligent service provision.These strategies aim to promote the application of dataset standards,thereby fostering and securing the development of new productive forces in healthcare.展开更多
When dealing with imbalanced datasets,the traditional support vectormachine(SVM)tends to produce a classification hyperplane that is biased towards the majority class,which exhibits poor robustness.This paper proposes...When dealing with imbalanced datasets,the traditional support vectormachine(SVM)tends to produce a classification hyperplane that is biased towards the majority class,which exhibits poor robustness.This paper proposes a high-performance classification algorithm specifically designed for imbalanced datasets.The proposed method first uses a biased second-order cone programming support vectormachine(B-SOCP-SVM)to identify the support vectors(SVs)and non-support vectors(NSVs)in the imbalanced data.Then,it applies the synthetic minority over-sampling technique(SV-SMOTE)to oversample the support vectors of the minority class and uses the random under-sampling technique(NSV-RUS)multiple times to undersample the non-support vectors of the majority class.Combining the above-obtained minority class data set withmultiple majority class datasets can obtainmultiple new balanced data sets.Finally,SOCP-SVM is used to classify each data set,and the final result is obtained through the integrated algorithm.Experimental results demonstrate that the proposed method performs excellently on imbalanced datasets.展开更多
Due to the development of cloud computing and machine learning,users can upload their data to the cloud for machine learning model training.However,dishonest clouds may infer user data,resulting in user data leakage.P...Due to the development of cloud computing and machine learning,users can upload their data to the cloud for machine learning model training.However,dishonest clouds may infer user data,resulting in user data leakage.Previous schemes have achieved secure outsourced computing,but they suffer from low computational accuracy,difficult-to-handle heterogeneous distribution of data from multiple sources,and high computational cost,which result in extremely poor user experience and expensive cloud computing costs.To address the above problems,we propose amulti-precision,multi-sourced,andmulti-key outsourcing neural network training scheme.Firstly,we design a multi-precision functional encryption computation based on Euclidean division.Second,we design the outsourcing model training algorithm based on a multi-precision functional encryption with multi-sourced heterogeneity.Finally,we conduct experiments on three datasets.The results indicate that our framework achieves an accuracy improvement of 6%to 30%.Additionally,it offers a memory space optimization of 1.0×2^(24) times compared to the previous best approach.展开更多
Detecting faces under occlusion remains a significant challenge in computer vision due to variations caused by masks,sunglasses,and other obstructions.Addressing this issue is crucial for applications such as surveill...Detecting faces under occlusion remains a significant challenge in computer vision due to variations caused by masks,sunglasses,and other obstructions.Addressing this issue is crucial for applications such as surveillance,biometric authentication,and human-computer interaction.This paper provides a comprehensive review of face detection techniques developed to handle occluded faces.Studies are categorized into four main approaches:feature-based,machine learning-based,deep learning-based,and hybrid methods.We analyzed state-of-the-art studies within each category,examining their methodologies,strengths,and limitations based on widely used benchmark datasets,highlighting their adaptability to partial and severe occlusions.The review also identifies key challenges,including dataset diversity,model generalization,and computational efficiency.Our findings reveal that deep learning methods dominate recent studies,benefiting from their ability to extract hierarchical features and handle complex occlusion patterns.More recently,researchers have increasingly explored Transformer-based architectures,such as Vision Transformer(ViT)and Swin Transformer,to further improve detection robustness under challenging occlusion scenarios.In addition,hybrid approaches,which aim to combine traditional andmodern techniques,are emerging as a promising direction for improving robustness.This review provides valuable insights for researchers aiming to develop more robust face detection systems and for practitioners seeking to deploy reliable solutions in real-world,occlusionprone environments.Further improvements and the proposal of broader datasets are required to developmore scalable,robust,and efficient models that can handle complex occlusions in real-world scenarios.展开更多
Climate change significantly affects environment,ecosystems,communities,and economies.These impacts often result in quick and gradual changes in water resources,environmental conditions,and weather patterns.A geograph...Climate change significantly affects environment,ecosystems,communities,and economies.These impacts often result in quick and gradual changes in water resources,environmental conditions,and weather patterns.A geographical study was conducted in Arizona State,USA,to examine monthly precipi-tation concentration rates over time.This analysis used a high-resolution 0.50×0.50 grid for monthly precip-itation data from 1961 to 2022,Provided by the Climatic Research Unit.The study aimed to analyze climatic changes affected the first and last five years of each decade,as well as the entire decade,during the specified period.GIS was used to meet the objectives of this study.Arizona experienced 51–568 mm,67–560 mm,63–622 mm,and 52–590 mm of rainfall in the sixth,seventh,eighth,and ninth decades of the second millennium,respectively.Both the first and second five year periods of each decade showed accept-able rainfall amounts despite fluctuations.However,rainfall decreased in the first and second decades of the third millennium.and in the first two years of the third decade.Rainfall amounts dropped to 42–472 mm,55–469 mm,and 74–498 mm,respectively,indicating a downward trend in precipitation.The central part of the state received the highest rainfall,while the eastern and western regions(spanning north to south)had significantly less.Over the decades of the third millennium,the average annual rainfall every five years was relatively low,showing a declining trend due to severe climate changes,generally ranging between 35 mm and 498 mm.The central regions consistently received more rainfall than the eastern and western outskirts.Arizona is currently experiencing a decrease in rainfall due to climate change,a situation that could deterio-rate further.This highlights the need to optimize the use of existing rainfall and explore alternative water sources.展开更多
Face recognition has emerged as one of the most prominent applications of image analysis and under-standing,gaining considerable attention in recent years.This growing interest is driven by two key factors:its extensi...Face recognition has emerged as one of the most prominent applications of image analysis and under-standing,gaining considerable attention in recent years.This growing interest is driven by two key factors:its extensive applications in law enforcement and the commercial domain,and the rapid advancement of practical technologies.Despite the significant advancements,modern recognition algorithms still struggle in real-world conditions such as varying lighting conditions,occlusion,and diverse facial postures.In such scenarios,human perception is still well above the capabilities of present technology.Using the systematic mapping study,this paper presents an in-depth review of face detection algorithms and face recognition algorithms,presenting a detailed survey of advancements made between 2015 and 2024.We analyze key methodologies,highlighting their strengths and restrictions in the application context.Additionally,we examine various datasets used for face detection/recognition datasets focusing on the task-specific applications,size,diversity,and complexity.By analyzing these algorithms and datasets,this survey works as a valuable resource for researchers,identifying the research gap in the field of face detection and recognition and outlining potential directions for future research.展开更多
The aim of this article is to explore potential directions for the development of artificial intelligence(AI).It points out that,while current AI can handle the statistical properties of complex systems,it has difficu...The aim of this article is to explore potential directions for the development of artificial intelligence(AI).It points out that,while current AI can handle the statistical properties of complex systems,it has difficulty effectively processing and fully representing their spatiotemporal complexity patterns.The article also discusses a potential path of AI development in the engineering domain.Based on the existing understanding of the principles of multilevel com-plexity,this article suggests that consistency among the logical structures of datasets,AI models,model-building software,and hardware will be an important AI development direction and is worthy of careful consideration.展开更多
Inferring phylogenetic trees from molecular sequences is a cornerstone of evolutionary biology.Many standard phylogenetic methods(such as maximum-likelihood[ML])rely on explicit models of sequence evolution and thus o...Inferring phylogenetic trees from molecular sequences is a cornerstone of evolutionary biology.Many standard phylogenetic methods(such as maximum-likelihood[ML])rely on explicit models of sequence evolution and thus often suffer from model misspecification or inadequacy.The on-rising deep learning(DL)techniques offer a powerful alternative.Deep learning employs multi-layered artificial neural networks to progressively transform input data into more abstract and complex representations.DL methods can autonomously uncover meaningful patterns from data,thereby bypassing potential biases introduced by predefined features(Franklin,2005;Murphy,2012).Recent efforts have aimed to apply deep neural networks(DNNs)to phylogenetics,with a growing number of applications in tree reconstruction(Suvorov et al.,2020;Zou et al.,2020;Nesterenko et al.,2022;Smith and Hahn,2023;Wang et al.,2023),substitution model selection(Abadi et al.,2020;Burgstaller-Muehlbacher et al.,2023),and diversification rate inference(Voznica et al.,2022;Lajaaiti et al.,2023;Lambert et al.,2023).In phylogenetic tree reconstruction,PhyDL(Zou et al.,2020)and Tree_learning(Suvorov et al.,2020)are two notable DNN-based programs designed to infer unrooted quartet trees directly from alignments of four amino acid(AA)and DNA sequences,respectively.展开更多
The spatial offset of bridge has a significant impact on the safety,comfort,and durability of high-speed railway(HSR)operations,so it is crucial to rapidly and effectively detect the spatial offset of operational HSR ...The spatial offset of bridge has a significant impact on the safety,comfort,and durability of high-speed railway(HSR)operations,so it is crucial to rapidly and effectively detect the spatial offset of operational HSR bridges.Drive-by monitoring of bridge uneven settlement demonstrates significant potential due to its practicality,cost-effectiveness,and efficiency.However,existing drive-by methods for detecting bridge offset have limitations such as reliance on a single data source,low detection accuracy,and the inability to identify lateral deformations of bridges.This paper proposes a novel drive-by inspection method for spatial offset of HSR bridge based on multi-source data fusion of comprehensive inspection train.Firstly,dung beetle optimizer-variational mode decomposition was employed to achieve adaptive decomposition of non-stationary dynamic signals,and explore the hidden temporal relationships in the data.Subsequently,a long short-term memory neural network was developed to achieve feature fusion of multi-source signal and accurate prediction of spatial settlement of HSR bridge.A dataset of track irregularities and CRH380A high-speed train responses was generated using a 3D train-track-bridge interaction model,and the accuracy and effectiveness of the proposed hybrid deep learning model were numerically validated.Finally,the reliability of the proposed drive-by inspection method was further validated by analyzing the actual measurement data obtained from comprehensive inspection train.The research findings indicate that the proposed approach enables rapid and accurate detection of spatial offset in HSR bridge,ensuring the long-term operational safety of HSR bridges.展开更多
Domain adaptation aims to reduce the distribution gap between the training data(source domain)and the target data.This enables effective predictions even for domains not seen during training.However,most conventional ...Domain adaptation aims to reduce the distribution gap between the training data(source domain)and the target data.This enables effective predictions even for domains not seen during training.However,most conventional domain adaptation methods assume a single source domain,making them less suitable for modern deep learning settings that rely on diverse and large-scale datasets.To address this limitation,recent research has focused on Multi-Source Domain Adaptation(MSDA),which aims to learn effectively from multiple source domains.In this paper,we propose Efficient Domain Transition for Multi-source(EDTM),a novel and efficient framework designed to tackle two major challenges in existing MSDA approaches:(1)integrating knowledge across different source domains and(2)aligning label distributions between source and target domains.EDTM leverages an ensemble-based classifier expert mechanism to enhance the contribution of source domains that are more similar to the target domain.To further stabilize the learning process and improve performance,we incorporate imitation learning into the training of the target model.In addition,Maximum Classifier Discrepancy(MCD)is employed to align class-wise label distributions between the source and target domains.Experiments were conducted using Digits-Five,one of the most representative benchmark datasets for MSDA.The results show that EDTM consistently outperforms existing methods in terms of average classification accuracy.Notably,EDTM achieved significantly higher performance on target domains such as Modified National Institute of Standards and Technolog with blended background images(MNIST-M)and Street View House Numbers(SVHN)datasets,demonstrating enhanced generalization compared to baseline approaches.Furthermore,an ablation study analyzing the contribution of each loss component validated the effectiveness of the framework,highlighting the importance of each module in achieving optimal performance.展开更多
Small datasets are often challenging due to their limited sample size.This research introduces a novel solution to these problems:average linkage virtual sample generation(ALVSG).ALVSG leverages the underlying data st...Small datasets are often challenging due to their limited sample size.This research introduces a novel solution to these problems:average linkage virtual sample generation(ALVSG).ALVSG leverages the underlying data structure to create virtual samples,which can be used to augment the original dataset.The ALVSG process consists of two steps.First,an average-linkage clustering technique is applied to the dataset to create a dendrogram.The dendrogram represents the hierarchical structure of the dataset,with each merging operation regarded as a linkage.Next,the linkages are combined into an average-based dataset,which serves as a new representation of the dataset.The second step in the ALVSG process involves generating virtual samples using the average-based dataset.The research project generates a set of 100 virtual samples by uniformly distributing them within the provided boundary.These virtual samples are then added to the original dataset,creating a more extensive dataset with improved generalization performance.The efficacy of the ALVSG approach is validated through resampling experiments and t-tests conducted on two small real-world datasets.The experiments are conducted on three forecasting models:the support vector machine for regression(SVR),the deep learning model(DL),and XGBoost.The results show that the ALVSG approach outperforms the baseline methods in terms of mean square error(MSE),root mean square error(RMSE),and mean absolute error(MAE).展开更多
This study investigated the impacts of random negative training datasets(NTDs)on the uncertainty of machine learning models for geologic hazard susceptibility assessment of the Loess Plateau,northern Shaanxi Province,...This study investigated the impacts of random negative training datasets(NTDs)on the uncertainty of machine learning models for geologic hazard susceptibility assessment of the Loess Plateau,northern Shaanxi Province,China.Based on randomly generated 40 NTDs,the study developed models for the geologic hazard susceptibility assessment using the random forest algorithm and evaluated their performances using the area under the receiver operating characteristic curve(AUC).Specifically,the means and standard deviations of the AUC values from all models were then utilized to assess the overall spatial correlation between the conditioning factors and the susceptibility assessment,as well as the uncertainty introduced by the NTDs.A risk and return methodology was thus employed to quantify and mitigate the uncertainty,with log odds ratios used to characterize the susceptibility assessment levels.The risk and return values were calculated based on the standard deviations and means of the log odds ratios of various locations.After the mean log odds ratios were converted into probability values,the final susceptibility map was plotted,which accounts for the uncertainty induced by random NTDs.The results indicate that the AUC values of the models ranged from 0.810 to 0.963,with an average of 0.852 and a standard deviation of 0.035,indicating encouraging prediction effects and certain uncertainty.The risk and return analysis reveals that low-risk and high-return areas suggest lower standard deviations and higher means across multiple model-derived assessments.Overall,this study introduces a new framework for quantifying the uncertainty of multiple training and evaluation models,aimed at improving their robustness and reliability.Additionally,by identifying low-risk and high-return areas,resource allocation for geologic hazard prevention and control can be optimized,thus ensuring that limited resources are directed toward the most effective prevention and control measures.展开更多
Benthic habitat mapping is an emerging discipline in the international marine field in recent years,providing an effective tool for marine spatial planning,marine ecological management,and decision-making applications...Benthic habitat mapping is an emerging discipline in the international marine field in recent years,providing an effective tool for marine spatial planning,marine ecological management,and decision-making applications.Seabed sediment classification is one of the main contents of seabed habitat mapping.In response to the impact of remote sensing imaging quality and the limitations of acoustic measurement range,where a single data source does not fully reflect the substrate type,we proposed a high-precision seabed habitat sediment classification method that integrates data from multiple sources.Based on WorldView-2 multi-spectral remote sensing image data and multibeam bathymetry data,constructed a random forests(RF)classifier with optimal feature selection.A seabed sediment classification experiment integrating optical remote sensing and acoustic remote sensing data was carried out in the shallow water area of Wuzhizhou Island,Hainan,South China.Different seabed sediment types,such as sand,seagrass,and coral reefs were effectively identified,with an overall classification accuracy of 92%.Experimental results show that RF matrix optimized by fusing multi-source remote sensing data for feature selection were better than the classification results of simple combinations of data sources,which improved the accuracy of seabed sediment classification.Therefore,the method proposed in this paper can be effectively applied to high-precision seabed sediment classification and habitat mapping around islands and reefs.展开更多
Land use/cover change is an important parameter in the climate and ecological simulations. Although they had been widely used in the community, SAGE dataset and HYDE dataset, the two representative global historical l...Land use/cover change is an important parameter in the climate and ecological simulations. Although they had been widely used in the community, SAGE dataset and HYDE dataset, the two representative global historical land use datasets, were little assessed about their accuracies in regional scale. Here, we carried out some assessments for the traditional cultivated region of China (TCRC) over last 300 years, by comparing SAGE2010 and HYDE (v3.1) with Chinese Historical Cropland Dataset (CHCD). The comparisons were performed at three spatial scales: entire study area, provincial area and 60 km by 60 km grid cell. The results show that (1) the cropland area from SAGE2010 was much more than that from CHCD moreover, the growth at a rate of 0.51% from 1700 to 1950 and -0.34% after 1950 were also inconsistent with that from CHCD. (2) HYDE dataset (v3.1) was closer to CHCD dataset than SAGE dataset on entire study area. However, the large biases could be detected at provincial scale and 60 km by 60 km grid cell scale. The percent of grid cells having biases greater than 70% (〈-70% or 〉70%) and 90% (〈-90% or 〉90%) accounted for 56%-63% and 40%-45% of the total grid cells respectively while those having biases range from -10% to 10% and from -30% to 30% account for only 5%-6% and 17% of the total grid cells respectively. (3) Using local historical archives to reconstruct historical dataset with high accuracy would be a valu- able way to improve the accuracy of climate and ecological simulation.展开更多
Rock mass quality serves as a vital index for predicting the stability and safety status of rock tunnel faces.In tunneling practice,the rock mass quality is often assessed via a combination of qualitative and quantita...Rock mass quality serves as a vital index for predicting the stability and safety status of rock tunnel faces.In tunneling practice,the rock mass quality is often assessed via a combination of qualitative and quantitative parameters.However,due to the harsh on-site construction conditions,it is rather difficult to obtain some of the evaluation parameters which are essential for the rock mass quality prediction.In this study,a novel improved Swin Transformer is proposed to detect,segment,and quantify rock mass characteristic parameters such as water leakage,fractures,weak interlayers.The site experiment results demonstrate that the improved Swin Transformer achieves optimal segmentation results and achieving accuracies of 92%,81%,and 86%for water leakage,fractures,and weak interlayers,respectively.A multisource rock tunnel face characteristic(RTFC)dataset includes 11 parameters for predicting rock mass quality is established.Considering the limitations in predictive performance of incomplete evaluation parameters exist in this dataset,a novel tree-augmented naive Bayesian network(BN)is proposed to address the challenge of the incomplete dataset and achieved a prediction accuracy of 88%.In comparison with other commonly used Machine Learning models the proposed BN-based approach proved an improved performance on predicting the rock mass quality with the incomplete dataset.By utilizing the established BN,a further sensitivity analysis is conducted to quantitatively evaluate the importance of the various parameters,results indicate that the rock strength and fractures parameter exert the most significant influence on rock mass quality.展开更多
We analyzed the spatial local accuracy of land cover (LC) datasets for the Qiangtang Plateau,High Asia,incorporating 923 field sampling points and seven LC compilations including the International Geosphere Biosphere ...We analyzed the spatial local accuracy of land cover (LC) datasets for the Qiangtang Plateau,High Asia,incorporating 923 field sampling points and seven LC compilations including the International Geosphere Biosphere Programme Data and Information System (IGBPDIS),Global Land cover mapping at 30 m resolution (GlobeLand30),MODIS Land Cover Type product (MCD12Q1),Climate Change Initiative Land Cover (CCI-LC),Global Land Cover 2000 (GLC2000),University of Maryland (UMD),and GlobCover 2009 (Glob- Cover).We initially compared resultant similarities and differences in both area and spatial patterns and analyzed inherent relationships with data sources.We then applied a geographically weighted regression (GWR) approach to predict local accuracy variation.The results of this study reveal that distinct differences,even inverse time series trends,in LC data between CCI-LC and MCD12Q1 were present between 2001 and 2015,with the exception of category areal discordance between the seven datasets.We also show a series of evident discrepancies amongst the LC datasets sampled here in terms of spatial patterns,that is,high spatial congruence is mainly seen in the homogeneous southeastern region of the study area while a low degree of spatial congruence is widely distributed across heterogeneous northwestern and northeastern regions.The overall combined spatial accuracy of the seven LC datasets considered here is less than 70%,and the GlobeLand30 and CCI-LC datasets exhibit higher local accuracy than their counterparts,yielding maximum overall accuracy (OA) values of 77.39% and 61.43%,respectively.Finally,5.63% of this area is characterized by both high assessment and accuracy (HH) values,mainly located in central and eastern regions of the Qiangtang Plateau,while most low accuracy regions are found in northern,northeastern,and western regions.展开更多
In this study, the upper ocean heat content (OHC) variations in the South China Sea (SCS) during 1993- 2006 were investigated by examining ocean temperatures in seven datasets, including World Ocean Atlas 2009 (W...In this study, the upper ocean heat content (OHC) variations in the South China Sea (SCS) during 1993- 2006 were investigated by examining ocean temperatures in seven datasets, including World Ocean Atlas 2009 (WOA09) (climatology), Ishii datasets, Ocean General Circulation ModeI for the Earth Simulator (OFES), Simple Ocean Data Assimilation system (SODA), Global Ocean Data Assimilation System (GODAS), China Oceanic ReAnalysis system (CORA) , and an ocean reanalysis dataset for the joining area of Asia and Indian-Pacific Ocean (AIPO1.0). Among these datasets, two were independent of any numerical model, four relied on data assimilation, and one was generated without any data assimilation. The annual cycles revealed by the seven datasets were similar, but the interannual variations were different. Vertical structures of temperatures along the 18~N, 12.75~N, and 120~E sections were compared with data collected during open cruises in 1998 and 2005-08. The results indicated that Ishii, OFES, CORA, and AIPO1.0 were more consistent with the observations. Through systematic shortcomings and advantages in presenting the upper comparisons, we found that each dataset had its own OHC in the SCS.展开更多
Based on C-LSAT2.0,using high-and low-frequency components reconstruction methods,combined with observation constraint masking,a reconstructed C-LSAT2.0 with 756 ensemble members from the 1850s to 2018 has been develo...Based on C-LSAT2.0,using high-and low-frequency components reconstruction methods,combined with observation constraint masking,a reconstructed C-LSAT2.0 with 756 ensemble members from the 1850s to 2018 has been developed.These ensemble versions have been merged with the ERSSTv5 ensemble dataset,and an upgraded version of the CMSTInterim dataset with 5°×5°resolution has been developed.The CMST-Interim dataset has significantly improved the coverage rate of global surface temperature data.After reconstruction,the data coverage before 1950 increased from 78%−81%of the original CMST to 81%−89%.The total coverage after 1955 reached about 93%,including more than 98%in the Northern Hemisphere and 81%−89%in the Southern Hemisphere.Through the reconstruction ensemble experiments with different parameters,a good basis is provided for more systematic uncertainty assessment of C-LSAT2.0 and CMSTInterim.In comparison with the original CMST,the global mean surface temperatures are estimated to be cooler in the second half of 19th century and warmer during the 21st century,which shows that the global warming trend is further amplified.The global warming trends are updated from 0.085±0.004℃(10 yr)^(–1)and 0.128±0.006℃(10 yr)^(–1)to 0.089±0.004℃(10 yr)^(–1)and 0.137±0.007℃(10 yr)^(–1),respectively,since the start and the second half of 20th century.展开更多
基金The work described in this paper was fully supported by a grant from Hong Kong Metropolitan University(RIF/2021/05).
文摘Parkinson’s disease(PD)is a debilitating neurological disorder affecting over 10 million people worldwide.PD classification models using voice signals as input are common in the literature.It is believed that using deep learning algorithms further enhances performance;nevertheless,it is challenging due to the nature of small-scale and imbalanced PD datasets.This paper proposed a convolutional neural network-based deep support vector machine(CNN-DSVM)to automate the feature extraction process using CNN and extend the conventional SVM to a DSVM for better classification performance in small-scale PD datasets.A customized kernel function reduces the impact of biased classification towards the majority class(healthy candidates in our consideration).An improved generative adversarial network(IGAN)was designed to generate additional training data to enhance the model’s performance.For performance evaluation,the proposed algorithm achieves a sensitivity of 97.6%and a specificity of 97.3%.The performance comparison is evaluated from five perspectives,including comparisons with different data generation algorithms,feature extraction techniques,kernel functions,and existing works.Results reveal the effectiveness of the IGAN algorithm,which improves the sensitivity and specificity by 4.05%–4.72%and 4.96%–5.86%,respectively;and the effectiveness of the CNN-DSVM algorithm,which improves the sensitivity by 1.24%–57.4%and specificity by 1.04%–163%and reduces biased detection towards the majority class.The ablation experiments confirm the effectiveness of individual components.Two future research directions have also been suggested.
基金supported by the research fund of Hanyang University(HY-202500000001616).
文摘Accurate purchase prediction in e-commerce critically depends on the quality of behavioral features.This paper proposes a layered and interpretable feature engineering framework that organizes user signals into three layers:Basic,Conversion&Stability(efficiency and volatility across actions),and Advanced Interactions&Activity(crossbehavior synergies and intensity).Using real Taobao(Alibaba’s primary e-commerce platform)logs(57,976 records for 10,203 users;25 November–03 December 2017),we conducted a hierarchical,layer-wise evaluation that holds data splits and hyperparameters fixed while varying only the feature set to quantify each layer’s marginal contribution.Across logistic regression(LR),decision tree,random forest,XGBoost,and CatBoost models with stratified 5-fold cross-validation,the performance improvedmonotonically fromBasic to Conversion&Stability to Advanced features.With LR,F1 increased from 0.613(Basic)to 0.962(Advanced);boosted models achieved high discrimination(0.995 AUC Score)and an F1 score up to 0.983.Calibration and precision–recall analyses indicated strong ranking quality and acknowledged potential dataset and period biases given the short(9-day)window.By making feature contributions measurable and reproducible,the framework complements model-centric advances and offers a transparent blueprint for production-grade behavioralmodeling.The code and processed artifacts are publicly available,and future work will extend the validation to longer,seasonal datasets and hybrid approaches that combine automated feature learning with domain-driven design.
文摘Standardized datasets are foundational to healthcare informatization by enhancing data quality and unleashing the value of data elements.Using bibliometrics and content analysis,this study examines China's healthcare dataset standards from 2011 to 2025.It analyzes their evolution across types,applications,institutions,and themes,highlighting key achievements including substantial growth in quantity,optimized typology,expansion into innovative application scenarios such as health decision support,and broadened institutional involvement.The study also identifies critical challenges,including imbalanced development,insufficient quality control,and a lack of essential metadata—such as authoritative data element mappings and privacy annotations—which hampers the delivery of intelligent services.To address these challenges,the study proposes a multi-faceted strategy focused on optimizing the standard system's architecture,enhancing quality and implementation,and advancing both data governance—through authoritative tracing and privacy protection—and intelligent service provision.These strategies aim to promote the application of dataset standards,thereby fostering and securing the development of new productive forces in healthcare.
基金supported by the Natural Science Basic Research Program of Shaanxi(Program No.2024JC-YBMS-026).
文摘When dealing with imbalanced datasets,the traditional support vectormachine(SVM)tends to produce a classification hyperplane that is biased towards the majority class,which exhibits poor robustness.This paper proposes a high-performance classification algorithm specifically designed for imbalanced datasets.The proposed method first uses a biased second-order cone programming support vectormachine(B-SOCP-SVM)to identify the support vectors(SVs)and non-support vectors(NSVs)in the imbalanced data.Then,it applies the synthetic minority over-sampling technique(SV-SMOTE)to oversample the support vectors of the minority class and uses the random under-sampling technique(NSV-RUS)multiple times to undersample the non-support vectors of the majority class.Combining the above-obtained minority class data set withmultiple majority class datasets can obtainmultiple new balanced data sets.Finally,SOCP-SVM is used to classify each data set,and the final result is obtained through the integrated algorithm.Experimental results demonstrate that the proposed method performs excellently on imbalanced datasets.
基金supported by Natural Science Foundation of China(Nos.62303126,62362008,author Z.Z,https://www.nsfc.gov.cn/,accessed on 20 December 2024)Major Scientific and Technological Special Project of Guizhou Province([2024]014)+2 种基金Guizhou Provincial Science and Technology Projects(No.ZK[2022]General149) ,author Z.Z,https://kjt.guizhou.gov.cn/,accessed on 20 December 2024)The Open Project of the Key Laboratory of Computing Power Network and Information Security,Ministry of Education under Grant 2023ZD037,author Z.Z,https://www.gzu.edu.cn/,accessed on 20 December 2024)Open Research Project of the State Key Laboratory of Industrial Control Technology,Zhejiang University,China(No.ICT2024B25),author Z.Z,https://www.gzu.edu.cn/,accessed on 20 December 2024).
文摘Due to the development of cloud computing and machine learning,users can upload their data to the cloud for machine learning model training.However,dishonest clouds may infer user data,resulting in user data leakage.Previous schemes have achieved secure outsourced computing,but they suffer from low computational accuracy,difficult-to-handle heterogeneous distribution of data from multiple sources,and high computational cost,which result in extremely poor user experience and expensive cloud computing costs.To address the above problems,we propose amulti-precision,multi-sourced,andmulti-key outsourcing neural network training scheme.Firstly,we design a multi-precision functional encryption computation based on Euclidean division.Second,we design the outsourcing model training algorithm based on a multi-precision functional encryption with multi-sourced heterogeneity.Finally,we conduct experiments on three datasets.The results indicate that our framework achieves an accuracy improvement of 6%to 30%.Additionally,it offers a memory space optimization of 1.0×2^(24) times compared to the previous best approach.
基金funded by A’Sharqiyah University,Sultanate of Oman,under Research Project grant number(BFP/RGP/ICT/22/490).
文摘Detecting faces under occlusion remains a significant challenge in computer vision due to variations caused by masks,sunglasses,and other obstructions.Addressing this issue is crucial for applications such as surveillance,biometric authentication,and human-computer interaction.This paper provides a comprehensive review of face detection techniques developed to handle occluded faces.Studies are categorized into four main approaches:feature-based,machine learning-based,deep learning-based,and hybrid methods.We analyzed state-of-the-art studies within each category,examining their methodologies,strengths,and limitations based on widely used benchmark datasets,highlighting their adaptability to partial and severe occlusions.The review also identifies key challenges,including dataset diversity,model generalization,and computational efficiency.Our findings reveal that deep learning methods dominate recent studies,benefiting from their ability to extract hierarchical features and handle complex occlusion patterns.More recently,researchers have increasingly explored Transformer-based architectures,such as Vision Transformer(ViT)and Swin Transformer,to further improve detection robustness under challenging occlusion scenarios.In addition,hybrid approaches,which aim to combine traditional andmodern techniques,are emerging as a promising direction for improving robustness.This review provides valuable insights for researchers aiming to develop more robust face detection systems and for practitioners seeking to deploy reliable solutions in real-world,occlusionprone environments.Further improvements and the proposal of broader datasets are required to developmore scalable,robust,and efficient models that can handle complex occlusions in real-world scenarios.
文摘Climate change significantly affects environment,ecosystems,communities,and economies.These impacts often result in quick and gradual changes in water resources,environmental conditions,and weather patterns.A geographical study was conducted in Arizona State,USA,to examine monthly precipi-tation concentration rates over time.This analysis used a high-resolution 0.50×0.50 grid for monthly precip-itation data from 1961 to 2022,Provided by the Climatic Research Unit.The study aimed to analyze climatic changes affected the first and last five years of each decade,as well as the entire decade,during the specified period.GIS was used to meet the objectives of this study.Arizona experienced 51–568 mm,67–560 mm,63–622 mm,and 52–590 mm of rainfall in the sixth,seventh,eighth,and ninth decades of the second millennium,respectively.Both the first and second five year periods of each decade showed accept-able rainfall amounts despite fluctuations.However,rainfall decreased in the first and second decades of the third millennium.and in the first two years of the third decade.Rainfall amounts dropped to 42–472 mm,55–469 mm,and 74–498 mm,respectively,indicating a downward trend in precipitation.The central part of the state received the highest rainfall,while the eastern and western regions(spanning north to south)had significantly less.Over the decades of the third millennium,the average annual rainfall every five years was relatively low,showing a declining trend due to severe climate changes,generally ranging between 35 mm and 498 mm.The central regions consistently received more rainfall than the eastern and western outskirts.Arizona is currently experiencing a decrease in rainfall due to climate change,a situation that could deterio-rate further.This highlights the need to optimize the use of existing rainfall and explore alternative water sources.
文摘Face recognition has emerged as one of the most prominent applications of image analysis and under-standing,gaining considerable attention in recent years.This growing interest is driven by two key factors:its extensive applications in law enforcement and the commercial domain,and the rapid advancement of practical technologies.Despite the significant advancements,modern recognition algorithms still struggle in real-world conditions such as varying lighting conditions,occlusion,and diverse facial postures.In such scenarios,human perception is still well above the capabilities of present technology.Using the systematic mapping study,this paper presents an in-depth review of face detection algorithms and face recognition algorithms,presenting a detailed survey of advancements made between 2015 and 2024.We analyze key methodologies,highlighting their strengths and restrictions in the application context.Additionally,we examine various datasets used for face detection/recognition datasets focusing on the task-specific applications,size,diversity,and complexity.By analyzing these algorithms and datasets,this survey works as a valuable resource for researchers,identifying the research gap in the field of face detection and recognition and outlining potential directions for future research.
文摘The aim of this article is to explore potential directions for the development of artificial intelligence(AI).It points out that,while current AI can handle the statistical properties of complex systems,it has difficulty effectively processing and fully representing their spatiotemporal complexity patterns.The article also discusses a potential path of AI development in the engineering domain.Based on the existing understanding of the principles of multilevel com-plexity,this article suggests that consistency among the logical structures of datasets,AI models,model-building software,and hardware will be an important AI development direction and is worthy of careful consideration.
基金supported by the National Key R&D Program of China(2022YFD1401600)the National Science Foundation for Distinguished Young Scholars of Zhejang Province,China(LR23C140001)supported by the Key Area Research and Development Program of Guangdong Province,China(2018B020205003 and 2020B0202090001).
文摘Inferring phylogenetic trees from molecular sequences is a cornerstone of evolutionary biology.Many standard phylogenetic methods(such as maximum-likelihood[ML])rely on explicit models of sequence evolution and thus often suffer from model misspecification or inadequacy.The on-rising deep learning(DL)techniques offer a powerful alternative.Deep learning employs multi-layered artificial neural networks to progressively transform input data into more abstract and complex representations.DL methods can autonomously uncover meaningful patterns from data,thereby bypassing potential biases introduced by predefined features(Franklin,2005;Murphy,2012).Recent efforts have aimed to apply deep neural networks(DNNs)to phylogenetics,with a growing number of applications in tree reconstruction(Suvorov et al.,2020;Zou et al.,2020;Nesterenko et al.,2022;Smith and Hahn,2023;Wang et al.,2023),substitution model selection(Abadi et al.,2020;Burgstaller-Muehlbacher et al.,2023),and diversification rate inference(Voznica et al.,2022;Lajaaiti et al.,2023;Lambert et al.,2023).In phylogenetic tree reconstruction,PhyDL(Zou et al.,2020)and Tree_learning(Suvorov et al.,2020)are two notable DNN-based programs designed to infer unrooted quartet trees directly from alignments of four amino acid(AA)and DNA sequences,respectively.
基金sponsored by the National Natural Science Foundation of China(Grant No.52178100).
文摘The spatial offset of bridge has a significant impact on the safety,comfort,and durability of high-speed railway(HSR)operations,so it is crucial to rapidly and effectively detect the spatial offset of operational HSR bridges.Drive-by monitoring of bridge uneven settlement demonstrates significant potential due to its practicality,cost-effectiveness,and efficiency.However,existing drive-by methods for detecting bridge offset have limitations such as reliance on a single data source,low detection accuracy,and the inability to identify lateral deformations of bridges.This paper proposes a novel drive-by inspection method for spatial offset of HSR bridge based on multi-source data fusion of comprehensive inspection train.Firstly,dung beetle optimizer-variational mode decomposition was employed to achieve adaptive decomposition of non-stationary dynamic signals,and explore the hidden temporal relationships in the data.Subsequently,a long short-term memory neural network was developed to achieve feature fusion of multi-source signal and accurate prediction of spatial settlement of HSR bridge.A dataset of track irregularities and CRH380A high-speed train responses was generated using a 3D train-track-bridge interaction model,and the accuracy and effectiveness of the proposed hybrid deep learning model were numerically validated.Finally,the reliability of the proposed drive-by inspection method was further validated by analyzing the actual measurement data obtained from comprehensive inspection train.The research findings indicate that the proposed approach enables rapid and accurate detection of spatial offset in HSR bridge,ensuring the long-term operational safety of HSR bridges.
基金supported by the National Research Foundation of Korea(NRF)grant funded by the Korea government(MSIT)(No.RS-2024-00406320)the Institute of Information&Communica-tions Technology Planning&Evaluation(IITP)-Innovative Human Resource Development for Local Intellectualization Program Grant funded by the Korea government(MSIT)(IITP-2026-RS-2023-00259678).
文摘Domain adaptation aims to reduce the distribution gap between the training data(source domain)and the target data.This enables effective predictions even for domains not seen during training.However,most conventional domain adaptation methods assume a single source domain,making them less suitable for modern deep learning settings that rely on diverse and large-scale datasets.To address this limitation,recent research has focused on Multi-Source Domain Adaptation(MSDA),which aims to learn effectively from multiple source domains.In this paper,we propose Efficient Domain Transition for Multi-source(EDTM),a novel and efficient framework designed to tackle two major challenges in existing MSDA approaches:(1)integrating knowledge across different source domains and(2)aligning label distributions between source and target domains.EDTM leverages an ensemble-based classifier expert mechanism to enhance the contribution of source domains that are more similar to the target domain.To further stabilize the learning process and improve performance,we incorporate imitation learning into the training of the target model.In addition,Maximum Classifier Discrepancy(MCD)is employed to align class-wise label distributions between the source and target domains.Experiments were conducted using Digits-Five,one of the most representative benchmark datasets for MSDA.The results show that EDTM consistently outperforms existing methods in terms of average classification accuracy.Notably,EDTM achieved significantly higher performance on target domains such as Modified National Institute of Standards and Technolog with blended background images(MNIST-M)and Street View House Numbers(SVHN)datasets,demonstrating enhanced generalization compared to baseline approaches.Furthermore,an ablation study analyzing the contribution of each loss component validated the effectiveness of the framework,highlighting the importance of each module in achieving optimal performance.
基金funding support from the National Science and Technology Council(NSTC),under Grant No.114-2410-H-011-026-MY3.
文摘Small datasets are often challenging due to their limited sample size.This research introduces a novel solution to these problems:average linkage virtual sample generation(ALVSG).ALVSG leverages the underlying data structure to create virtual samples,which can be used to augment the original dataset.The ALVSG process consists of two steps.First,an average-linkage clustering technique is applied to the dataset to create a dendrogram.The dendrogram represents the hierarchical structure of the dataset,with each merging operation regarded as a linkage.Next,the linkages are combined into an average-based dataset,which serves as a new representation of the dataset.The second step in the ALVSG process involves generating virtual samples using the average-based dataset.The research project generates a set of 100 virtual samples by uniformly distributing them within the provided boundary.These virtual samples are then added to the original dataset,creating a more extensive dataset with improved generalization performance.The efficacy of the ALVSG approach is validated through resampling experiments and t-tests conducted on two small real-world datasets.The experiments are conducted on three forecasting models:the support vector machine for regression(SVR),the deep learning model(DL),and XGBoost.The results show that the ALVSG approach outperforms the baseline methods in terms of mean square error(MSE),root mean square error(RMSE),and mean absolute error(MAE).
基金supported by a project entitled Loess Plateau Region-Watershed-Slope Geological Hazard Multi-Scale Collaborative Intelligent Early Warning System of the National Key R&D Program of China(2022YFC3003404)a project of the Shaanxi Youth Science and Technology Star(2021KJXX-87)public welfare geological survey projects of Shaanxi Institute of Geologic Survey(20180301,201918,202103,and 202413).
文摘This study investigated the impacts of random negative training datasets(NTDs)on the uncertainty of machine learning models for geologic hazard susceptibility assessment of the Loess Plateau,northern Shaanxi Province,China.Based on randomly generated 40 NTDs,the study developed models for the geologic hazard susceptibility assessment using the random forest algorithm and evaluated their performances using the area under the receiver operating characteristic curve(AUC).Specifically,the means and standard deviations of the AUC values from all models were then utilized to assess the overall spatial correlation between the conditioning factors and the susceptibility assessment,as well as the uncertainty introduced by the NTDs.A risk and return methodology was thus employed to quantify and mitigate the uncertainty,with log odds ratios used to characterize the susceptibility assessment levels.The risk and return values were calculated based on the standard deviations and means of the log odds ratios of various locations.After the mean log odds ratios were converted into probability values,the final susceptibility map was plotted,which accounts for the uncertainty induced by random NTDs.The results indicate that the AUC values of the models ranged from 0.810 to 0.963,with an average of 0.852 and a standard deviation of 0.035,indicating encouraging prediction effects and certain uncertainty.The risk and return analysis reveals that low-risk and high-return areas suggest lower standard deviations and higher means across multiple model-derived assessments.Overall,this study introduces a new framework for quantifying the uncertainty of multiple training and evaluation models,aimed at improving their robustness and reliability.Additionally,by identifying low-risk and high-return areas,resource allocation for geologic hazard prevention and control can be optimized,thus ensuring that limited resources are directed toward the most effective prevention and control measures.
基金Supported by the National Natural Science Foundation of China(Nos.42376185,41876111)the Shandong Provincial Natural Science Foundation(No.ZR2023MD073)。
文摘Benthic habitat mapping is an emerging discipline in the international marine field in recent years,providing an effective tool for marine spatial planning,marine ecological management,and decision-making applications.Seabed sediment classification is one of the main contents of seabed habitat mapping.In response to the impact of remote sensing imaging quality and the limitations of acoustic measurement range,where a single data source does not fully reflect the substrate type,we proposed a high-precision seabed habitat sediment classification method that integrates data from multiple sources.Based on WorldView-2 multi-spectral remote sensing image data and multibeam bathymetry data,constructed a random forests(RF)classifier with optimal feature selection.A seabed sediment classification experiment integrating optical remote sensing and acoustic remote sensing data was carried out in the shallow water area of Wuzhizhou Island,Hainan,South China.Different seabed sediment types,such as sand,seagrass,and coral reefs were effectively identified,with an overall classification accuracy of 92%.Experimental results show that RF matrix optimized by fusing multi-source remote sensing data for feature selection were better than the classification results of simple combinations of data sources,which improved the accuracy of seabed sediment classification.Therefore,the method proposed in this paper can be effectively applied to high-precision seabed sediment classification and habitat mapping around islands and reefs.
基金China Global Change Research Program, No.2010CB950901 National Natural Science Foundation of China, No.41271227 No.41001122
文摘Land use/cover change is an important parameter in the climate and ecological simulations. Although they had been widely used in the community, SAGE dataset and HYDE dataset, the two representative global historical land use datasets, were little assessed about their accuracies in regional scale. Here, we carried out some assessments for the traditional cultivated region of China (TCRC) over last 300 years, by comparing SAGE2010 and HYDE (v3.1) with Chinese Historical Cropland Dataset (CHCD). The comparisons were performed at three spatial scales: entire study area, provincial area and 60 km by 60 km grid cell. The results show that (1) the cropland area from SAGE2010 was much more than that from CHCD moreover, the growth at a rate of 0.51% from 1700 to 1950 and -0.34% after 1950 were also inconsistent with that from CHCD. (2) HYDE dataset (v3.1) was closer to CHCD dataset than SAGE dataset on entire study area. However, the large biases could be detected at provincial scale and 60 km by 60 km grid cell scale. The percent of grid cells having biases greater than 70% (〈-70% or 〉70%) and 90% (〈-90% or 〉90%) accounted for 56%-63% and 40%-45% of the total grid cells respectively while those having biases range from -10% to 10% and from -30% to 30% account for only 5%-6% and 17% of the total grid cells respectively. (3) Using local historical archives to reconstruct historical dataset with high accuracy would be a valu- able way to improve the accuracy of climate and ecological simulation.
基金supported by the National Natural Science Foundation of China(Nos.52279107 and 52379106)the Qingdao Guoxin Jiaozhou Bay Second Submarine Tunnel Co.,Ltd.,the Academician and Expert Workstation of Yunnan Province(No.202205AF150015)the Science and Technology Innovation Project of YCIC Group Co.,Ltd.(No.YCIC-YF-2022-15)。
文摘Rock mass quality serves as a vital index for predicting the stability and safety status of rock tunnel faces.In tunneling practice,the rock mass quality is often assessed via a combination of qualitative and quantitative parameters.However,due to the harsh on-site construction conditions,it is rather difficult to obtain some of the evaluation parameters which are essential for the rock mass quality prediction.In this study,a novel improved Swin Transformer is proposed to detect,segment,and quantify rock mass characteristic parameters such as water leakage,fractures,weak interlayers.The site experiment results demonstrate that the improved Swin Transformer achieves optimal segmentation results and achieving accuracies of 92%,81%,and 86%for water leakage,fractures,and weak interlayers,respectively.A multisource rock tunnel face characteristic(RTFC)dataset includes 11 parameters for predicting rock mass quality is established.Considering the limitations in predictive performance of incomplete evaluation parameters exist in this dataset,a novel tree-augmented naive Bayesian network(BN)is proposed to address the challenge of the incomplete dataset and achieved a prediction accuracy of 88%.In comparison with other commonly used Machine Learning models the proposed BN-based approach proved an improved performance on predicting the rock mass quality with the incomplete dataset.By utilizing the established BN,a further sensitivity analysis is conducted to quantitatively evaluate the importance of the various parameters,results indicate that the rock strength and fractures parameter exert the most significant influence on rock mass quality.
基金The Strategic Priority Research Program of the Chinese Academy of Sciences,Nos.XDA20040200,XDB03030500Key Foundation Project of Basic Work of the Ministry of Science and Technology of China,No.2012FY111400National Key Technologies R&D Program,No.2012BC06B00
文摘We analyzed the spatial local accuracy of land cover (LC) datasets for the Qiangtang Plateau,High Asia,incorporating 923 field sampling points and seven LC compilations including the International Geosphere Biosphere Programme Data and Information System (IGBPDIS),Global Land cover mapping at 30 m resolution (GlobeLand30),MODIS Land Cover Type product (MCD12Q1),Climate Change Initiative Land Cover (CCI-LC),Global Land Cover 2000 (GLC2000),University of Maryland (UMD),and GlobCover 2009 (Glob- Cover).We initially compared resultant similarities and differences in both area and spatial patterns and analyzed inherent relationships with data sources.We then applied a geographically weighted regression (GWR) approach to predict local accuracy variation.The results of this study reveal that distinct differences,even inverse time series trends,in LC data between CCI-LC and MCD12Q1 were present between 2001 and 2015,with the exception of category areal discordance between the seven datasets.We also show a series of evident discrepancies amongst the LC datasets sampled here in terms of spatial patterns,that is,high spatial congruence is mainly seen in the homogeneous southeastern region of the study area while a low degree of spatial congruence is widely distributed across heterogeneous northwestern and northeastern regions.The overall combined spatial accuracy of the seven LC datasets considered here is less than 70%,and the GlobeLand30 and CCI-LC datasets exhibit higher local accuracy than their counterparts,yielding maximum overall accuracy (OA) values of 77.39% and 61.43%,respectively.Finally,5.63% of this area is characterized by both high assessment and accuracy (HH) values,mainly located in central and eastern regions of the Qiangtang Plateau,while most low accuracy regions are found in northern,northeastern,and western regions.
基金supported by the National Basic Research Program of China (Grant Nos. 2010CB950400 and 2013CB430301)the National Natural Science Foundation of China (Grant Nos. 41276025 and 41176023)+2 种基金the R&D Special Fund for Public Welfare Industry (Meteorology) (Grant No. GYHY201106036)The OFES simulation was conducted on the Earth Simulator under the support of JAMSTECsupported by the Data Sharing Infrastructure of Earth System Science-Data Sharing Service Center of the South China Sea and adjacent regions
文摘In this study, the upper ocean heat content (OHC) variations in the South China Sea (SCS) during 1993- 2006 were investigated by examining ocean temperatures in seven datasets, including World Ocean Atlas 2009 (WOA09) (climatology), Ishii datasets, Ocean General Circulation ModeI for the Earth Simulator (OFES), Simple Ocean Data Assimilation system (SODA), Global Ocean Data Assimilation System (GODAS), China Oceanic ReAnalysis system (CORA) , and an ocean reanalysis dataset for the joining area of Asia and Indian-Pacific Ocean (AIPO1.0). Among these datasets, two were independent of any numerical model, four relied on data assimilation, and one was generated without any data assimilation. The annual cycles revealed by the seven datasets were similar, but the interannual variations were different. Vertical structures of temperatures along the 18~N, 12.75~N, and 120~E sections were compared with data collected during open cruises in 1998 and 2005-08. The results indicated that Ishii, OFES, CORA, and AIPO1.0 were more consistent with the observations. Through systematic shortcomings and advantages in presenting the upper comparisons, we found that each dataset had its own OHC in the SCS.
文摘Based on C-LSAT2.0,using high-and low-frequency components reconstruction methods,combined with observation constraint masking,a reconstructed C-LSAT2.0 with 756 ensemble members from the 1850s to 2018 has been developed.These ensemble versions have been merged with the ERSSTv5 ensemble dataset,and an upgraded version of the CMSTInterim dataset with 5°×5°resolution has been developed.The CMST-Interim dataset has significantly improved the coverage rate of global surface temperature data.After reconstruction,the data coverage before 1950 increased from 78%−81%of the original CMST to 81%−89%.The total coverage after 1955 reached about 93%,including more than 98%in the Northern Hemisphere and 81%−89%in the Southern Hemisphere.Through the reconstruction ensemble experiments with different parameters,a good basis is provided for more systematic uncertainty assessment of C-LSAT2.0 and CMSTInterim.In comparison with the original CMST,the global mean surface temperatures are estimated to be cooler in the second half of 19th century and warmer during the 21st century,which shows that the global warming trend is further amplified.The global warming trends are updated from 0.085±0.004℃(10 yr)^(–1)and 0.128±0.006℃(10 yr)^(–1)to 0.089±0.004℃(10 yr)^(–1)and 0.137±0.007℃(10 yr)^(–1),respectively,since the start and the second half of 20th century.