Earthquakes are highly destructive spatio-temporal phenomena whose analysis is essential for disaster preparedness and risk mitigation. Modern seismological research produces vast volumes of heterogeneous data from seismic networks, satellite observations, and geospatial repositories, creating the need for scalable infrastructures capable of integrating and analyzing such data to support intelligent decision-making. Data warehousing technologies provide a robust foundation for this purpose; however, existing earthquake-oriented data warehouses remain limited, often relying on simplified schemas, domain-specific analytics, or cataloguing efforts. This paper presents the design and implementation of a spatio-temporal data warehouse for seismic activity. The framework integrates spatial and temporal dimensions in a unified schema and introduces a novel array-based approach for managing many-to-many relationships between facts and dimensions without intermediate bridge tables. A comparative evaluation against a conventional bridge-table schema demonstrates that the array-based design improves fact-centric query performance, while the bridge-table schema remains advantageous for dimension-centric queries. To reconcile these trade-offs, a hybrid schema is proposed that retains both representations, ensuring balanced efficiency across heterogeneous workloads. The proposed framework demonstrates how spatio-temporal data warehousing can address schema complexity, improve query performance, and support multidimensional visualization. In doing so, it provides a foundation for integrating seismic analysis into broader big data-driven intelligent decision systems for disaster resilience, risk mitigation, and emergency management.
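The trade-off between the array-based and bridge-table representations described in this abstract can be illustrated with a small in-memory sketch. It is not the paper's schema: the event identifiers, the `region_ids` field, and all function names are hypothetical, and plain Python dictionaries stand in for warehouse tables and indexes.

```python
from collections import defaultdict

# Hypothetical toy data: each seismic event (fact) may relate to several
# regions (dimension members). Array-based design: the fact row itself
# carries an array of dimension keys.
facts_array = {
    "eq1": {"magnitude": 6.1, "region_ids": [10, 11]},
    "eq2": {"magnitude": 4.8, "region_ids": [11]},
    "eq3": {"magnitude": 5.5, "region_ids": [10, 12]},
}

# Bridge-table design: one (fact_id, region_id) row per relationship.
bridge = [
    (fact_id, region_id)
    for fact_id, row in facts_array.items()
    for region_id in row["region_ids"]
]


def regions_of_event(fact_id):
    """Fact-centric query: the array answers it with a single row lookup."""
    return facts_array[fact_id]["region_ids"]


def build_region_index(bridge_rows):
    """Group bridge rows by region, as an index on region_id would."""
    index = defaultdict(list)
    for fact_id, region_id in bridge_rows:
        index[region_id].append(fact_id)
    return index


region_index = build_region_index(bridge)


def events_in_region(region_id):
    """Dimension-centric query served from the bridge-side index."""
    return sorted(region_index.get(region_id, []))


def events_in_region_array_scan(region_id):
    """Same query using arrays only: every fact row has to be scanned."""
    return sorted(
        fact_id
        for fact_id, row in facts_array.items()
        if region_id in row["region_ids"]
    )
```

Keeping both `facts_array` and `region_index` side by side corresponds loosely to the hybrid idea: each query type is routed to the representation that answers it without a scan or a join, at the cost of storing the relationship twice.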
Ovarian cancer (OC) is one of the leading causes of death from gynecological cancer; its main difficulties are early diagnosis and the heterogeneous nature of tumor biomarkers. Machine learning (ML) has the potential to process complex datasets and support decision-making in OC diagnosis. Nevertheless, traditional ML models tend to suffer from bias, overfitting, and sensitivity to noise, and they generalize poorly. Moreover, their black-box nature reduces interpretability and limits their practical clinical applicability. In this study, we introduce an explainable ensemble learning (EL) model, TreeX-Stack, based on a stacking architecture that employs tree-based learners such as Decision Tree (DT), Random Forest (RF), Gradient Boosting (GB), and Extreme Gradient Boosting (XGBoost) as base learners, and Logistic Regression (LR) as the meta-learner, to enhance OC diagnosis. Local Interpretable Model-Agnostic Explanations (LIME) are used to explain individual predictions, making the model outputs more clinically interpretable and applicable. The model is trained on a dataset that includes demographic information, blood tests, general chemistry, and tumor markers. Extensive preprocessing includes handling missing data using iterative imputation with Bayesian Ridge and addressing multicollinearity by removing features with correlation coefficients above 0.7. Relevant features are then selected using the Boruta feature selection method. To obtain robust and unbiased performance estimates during hyperparameter tuning, nested cross-validation (CV) with grid search is employed, and all experiments are repeated five times to ensure statistical reliability. TreeX-Stack demonstrates excellent diagnostic performance, achieving an accuracy of 0.9027, a precision of 0.8673, a recall of 0.9391, and an F1-score of 0.9012. Feature-importance analyses using LIME and permutation importance highlight Human Epididymis Protein 4 (HE4) as the most significant biomarker for OC. The combination of high predictive performance and interpretability makes TreeX-Stack a reliable tool for clinical decision support in OC diagnosis.
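The multicollinearity step mentioned in this abstract (dropping features whose correlation exceeds 0.7) could look roughly like the following sketch. The abstract does not say which member of a correlated pair is removed, so the greedy keep-first rule here is an assumption, as are the feature names and values, which are made up purely for illustration. Constant columns (zero variance) are not handled.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def drop_correlated(columns, threshold=0.7):
    """Greedy filter (assumed rule): keep a feature only if its absolute
    correlation with every already-kept feature is at most `threshold`."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature table: f2 is almost a multiple of f1.
features = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.0, 4.0, 6.0, 8.0, 11.0],
    "f3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
selected = drop_correlated(features)
```

Because the rule is order-dependent, a real pipeline would normally fix the column order (or rank features first) before filtering.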
This work contributes to the theoretical foundation for pricing in data markets and offers practical insights for managing digital data exchanges in the era of big data. We propose a structured pricing model for data exchanges transitioning from quasi-public to market-oriented operations. To address the complex dynamics among data exchanges, suppliers, and consumers, we develop a three-stage Stackelberg game framework. In this model, the data exchange acts as the leader setting transaction commission rates, suppliers are intermediate leaders determining unit prices, and consumers are followers making purchasing decisions. Two pricing strategies are examined: the Independent Pricing Approach (IPA) and the novel Perfectly Competitive Pricing Approach (PCPA), which accounts for competition among data providers. Using backward induction, the study derives subgame-perfect equilibria and proves the existence and uniqueness of Stackelberg equilibria under both approaches. Extensive numerical simulations of the model demonstrate that PCPA enhances data demander utility, encourages supplier competition, increases transaction volume, and improves the overall profitability and sustainability of data exchanges. Social welfare analysis further confirms PCPA's superiority in promoting efficient and fair data markets.
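Backward induction in a leader/intermediate-leader/follower game of the kind described here can be sketched with a deliberately simple toy model. Nothing below is taken from the paper: the linear demand q = a - b·p, the supplier unit cost c, the single supplier, and all parameter values are assumptions chosen so that each stage has a closed-form or grid-search solution.

```python
def demand(p, a=10.0, b=1.0):
    """Stage 3 (followers): consumers buy q = a - b*p, never less than 0."""
    return max(a - b * p, 0.0)


def supplier_profit(p, r, a=10.0, b=1.0, c=1.0):
    """Supplier keeps a share (1 - r) of the price and pays unit cost c."""
    return ((1.0 - r) * p - c) * demand(p, a, b)


def supplier_best_price(r, a=10.0, b=1.0, c=1.0):
    """Stage 2: solve d(profit)/dp = 0 given the commission rate r,
    which yields p = a/(2b) + c/(2(1 - r))."""
    return a / (2.0 * b) + c / (2.0 * (1.0 - r))


def exchange_revenue(r, a=10.0, b=1.0, c=1.0):
    """Stage 1 payoff: commission on the turnover the later stages produce."""
    p = supplier_best_price(r, a, b, c)
    return r * p * demand(p, a, b)


def solve_by_backward_induction(steps=1000):
    """Grid-search the leader's rate, anticipating both later stages."""
    grid = [i / steps for i in range(steps)]  # r in [0, 1)
    best_r = max(grid, key=exchange_revenue)
    best_p = supplier_best_price(best_r)
    return best_r, best_p, demand(best_p)


rate, price, quantity = solve_by_backward_induction()
```

The order of the functions mirrors the induction: the follower's response is fixed first, the supplier optimizes against it, and only then does the exchange choose its commission rate.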
0 INTRODUCTION
Earth science is a natural science concerned with the composition, dynamics, spatiotemporal evolution, and formation mechanisms of Earth materials (Chen and Yang, 2023). Traditional Earth science research has largely been discipline-based, relying on field investigations, data collection, experimental analyses, and data interpretation to study individual components of the Earth system.
To address the severe challenges of PM2.5 and ozone co-control during the "14th Five-Year Plan" period and to enhance the precision and intelligence of air environment governance, it is imperative to build an efficient comprehensive management platform for regional air quality. This paper takes the practice in Zibo City, Shandong Province, as an example to systematically analyze the top-level design, technical implementation, and innovative application of a comprehensive regional air quality management platform that integrates "perception and monitoring, data fusion, early-warning research and judgment, source analysis, collaborative dispatching, and evaluation and assessment". Through the construction of a "sky-air-ground" integrated three-dimensional monitoring network, the platform integrates multi-source heterogeneous environmental data and employs big data, cloud computing, artificial intelligence, and numerical models such as CALPUFF and CMAQ to achieve comprehensive perception, precise prediction, intelligent source tracing, and closed-loop management of air pollution. The platform establishes a full-process closed-loop management mechanism of "data, early warning, disposition, evaluation", and achieves a fundamental transformation in environmental supervision from passive response to active anticipation and from experience-based judgment to data-driven decision-making. The application results show that the platform significantly improves the scientific decision-making ability and collaborative execution efficiency of air pollution governance in Zibo City, providing a replicable and scalable comprehensive solution for similar industrial cities seeking continuous improvement of air quality.
Accurately assessing the relationship between tree growth and climatic factors is of great importance in dendrochronology. This study evaluated the consistency between alternative climate datasets (including station and gridded data) and actual climate data (fixed-point observations near the sampling sites) in northeastern China’s warm temperate zone and analyzed differences in their correlations with the tree-ring width index. The results were: (1) Gridded temperature data, as well as precipitation and relative humidity data from the Huailai meteorological station, were more consistent with the actual climate data; in contrast, gridded soil moisture content data showed significant discrepancies. (2) Horizontal distance had a greater impact on the representativeness of actual climate conditions than vertical elevation differences. (3) Differences in consistency between alternative and actual climate data also affected their correlations with tree-ring width indices. In some growing season months, correlation coefficients differed significantly, in both magnitude and sign, from those based on actual data. The selection of different alternative climate datasets can lead to biased results in assessing forest responses to climate change, which is detrimental to the management of forest ecosystems in harsh environments. Therefore, the scientific and rational selection of alternative climate data is essential for dendroecological and climatological research.
tRNA-derived small RNAs (tsRNAs), a class of regulatory small noncoding RNAs, have been implicated in a wide variety of human diseases. A large number of tsRNA–disease associations have been identified in recent years from accumulating studies. However, repositories for cataloging detailed information on tsRNA–disease associations are scarce. In this study, we provide the tsRNADisease database, which integrates experimentally and computationally supported tsRNA–disease associations from manual curation of the literature and other related resources. tsRNADisease contains 5571 manually curated associations between 4759 tsRNAs and 166 diseases with experimental evidence from 346 studies. In addition, it contains 5013 predicted associations between 1297 tsRNAs and 111 diseases. tsRNADisease provides a user-friendly interface to browse, retrieve, and download data conveniently. This database can improve our understanding of tsRNA deregulation in diseases and serve as a valuable resource for investigating the mechanisms of disease-related tsRNAs. tsRNADisease is freely available at http://www.compgenelab.info/tsRNADisease.
Artificial Intelligence (AI) in healthcare enables diabetes prediction using data-driven methods in place of traditional screening techniques, which include hemoglobin A1c (HbA1c), oral glucose tolerance test (OGTT), and fasting plasma glucose (FPG) screening and which are invasive and limited in scale. Machine learning (ML) and deep neural network (DNN) models use large datasets to learn complex, nonlinear feature interactions, but conventional ML algorithms are data-sensitive and often show unstable predictive accuracy. Conversely, DNN models are more robust, though consistently reaching high accuracy on heterogeneous datasets is still an open challenge. For predicting diabetes, this work proposes a hybrid DNN approach that integrates a bidirectional long short-term memory (BiLSTM) network with a bidirectional gated recurrent unit (BiGRU). A robust deep learning (DL) model is developed by combining various datasets with weighted coefficients, dense operations connecting the deep layers, and output aggregation using batch normalization and dropout to avoid overfitting. The goal of this hybrid model is better generalization and consistency across datasets, which facilitates effective management and early intervention. The proposed DNN model exhibits excellent predictive performance compared with state-of-the-art and baseline ML and DNN models on diabetes prediction tasks. This robust performance indicates the potential usefulness of DL-based models for disease prediction in healthcare and in other areas that demand high-quality analytics.
Reducing carbon emissions is fundamental to achieving carbon neutrality. Existing studies have typically estimated emissions by predicting fossil fuel consumption across sectors under different socioeconomic scenarios; however, uncertainties in future development often lead to deviations from these assumptions. To address this limitation, this study proposes a data-driven approach for evaluating national carbon emissions using historical data. Countries with similar energy consumption patterns were selected as reference samples, and their emission pathways were analyzed to predict future emissions for countries that have not yet reached their peak. Key indicators, including peak levels, timing, plateau duration, and post-peak decline rates, were identified. The results indicate that the trends in unpeaked economies can be effectively assessed based on the emission patterns of countries with comparable energy structures. Applying this framework to China suggests a carbon peak between 2027 and 2030, in the range of 14.207 to 16.234 Gt, followed by a gradual decline from 2031 to 2036. Compared with the average results of existing studies, the predicted minimum and maximum emissions show error margins of 10.1% and 1.41%, respectively. This study proposes a top-down methodology that provides a transparent, reproducible, and empirical framework for forecasting carbon emission pathways, thereby offering a scientific basis for assessing countries that have not yet reached their emissions peak.
Modern intrusion detection systems (MIDS) face persistent challenges in coping with the rapid evolution of cyber threats, high-volume network traffic, and imbalanced datasets. Traditional models often lack the robustness and explainability required to detect novel and sophisticated attacks effectively. This study introduces an advanced, explainable machine learning framework for multi-class IDS using the KDD99 and IDS datasets, which reflect real-world network behavior through a blend of normal and diverse attack classes. The methodology begins with data preprocessing that incorporates both RobustScaler and QuantileTransformer to address outliers and skewed feature distributions, ensuring standardized and model-ready inputs. Dimensionality reduction is achieved via the Harris Hawks Optimization (HHO) algorithm, a nature-inspired metaheuristic modeled on hawks’ hunting strategies. HHO identifies the most informative features by optimizing a fitness function based on classification performance. Following feature selection, the Synthetic Minority Over-sampling Technique (SMOTE) is applied to the training data to resolve class imbalance by synthetically augmenting underrepresented attack types. A stacked architecture is then employed, combining the strengths of XGBoost, SVM, and RF as base learners. This layered approach improves prediction robustness and generalization by balancing bias and variance across diverse classifiers. The model was evaluated using standard classification metrics: precision, recall, F1-score, and overall accuracy. The best overall performance was recorded with an accuracy of 99.44% for UNSW-NB15, demonstrating the model’s effectiveness. After balancing, the model demonstrated a clear improvement in detecting the attacks. We tested the model on four datasets to show the effectiveness of the proposed approach and performed an ablation study to check the effect of each parameter. The proposed model is also computationally efficient. To support transparency and trust in decision-making, explainable AI (XAI) techniques are incorporated that provide both global and local insight into feature contributions and offer intuitive visualizations for individual predictions. This makes the framework suitable for practical deployment in cybersecurity environments that demand both precision and accountability.
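The class-balancing step named in this abstract, SMOTE, creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The following is a minimal standard-library sketch of that idea, not the implementation used in the study; the function name, the default `k`, and the toy points are assumptions.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate `n_new` synthetic samples from a list of minority-class
    feature vectors (needs at least two points)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


# Hypothetical minority-class points (e.g. a rare attack type).
rare_attacks = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
augmented = smote(rare_attacks, n_new=6)
```

As the abstract notes, the oversampling is applied to the training data only; synthesizing points before the train/test split would leak information into the evaluation.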
Amid the increasing demand for data sharing, the need for flexible, secure, and auditable access control mechanisms has garnered significant attention in the academic community. However, blockchain-based ciphertext-policy attribute-based encryption (CP-ABE) schemes still face cumbersome ciphertext re-encryption and insufficient oversight when handling dynamic attribute changes and cross-chain collaboration. To address these issues, we propose a dynamic permission attribute-encryption scheme for multi-chain collaboration. This scheme incorporates a multi-authority architecture for distributed attribute management and integrates an attribute revocation and granting mechanism that eliminates the need for ciphertext re-encryption, effectively reducing both computational and communication overhead. It leverages the InterPlanetary File System (IPFS) for off-chain data storage and constructs a cross-chain regulatory framework, comprising a Hyperledger Fabric business chain and a FISCO BCOS regulatory chain, to record changes in decryption privileges and access behaviors in an auditable manner. Security analysis shows that the scheme achieves selective indistinguishability under chosen-plaintext attack (sIND-CPA) under the decisional q-Parallel Bilinear Diffie-Hellman Exponent (q-PBDHE) assumption. In the performance and experimental evaluations, we compared the proposed scheme with several advanced schemes. The results show that, while preserving security, the proposed scheme achieves higher encryption/decryption efficiency and lower storage overhead for ciphertexts and keys.
The authors consider the issue of hypothesis testing in varying-coefficient regression models with high-dimensional data. Utilizing kernel smoothing techniques, the authors propose a locally concerned U-statistic method to assess the overall significance of the coefficients. The authors establish that the proposed test is asymptotically normal under both the null hypothesis and local alternatives. Based on the locally concerned U-statistic, the authors further develop a globally concerned U-statistic to test whether the coefficient function is zero. A stochastic perturbation method is employed to approximate the distribution of the globally concerned test statistic. Monte Carlo simulations demonstrate the validity of the proposed test in finite samples.
With the popularization of new technologies, telephone fraud has become the main means of stealing money and personal identity information. Taking inspiration from the website authentication mechanism, we propose an end-to-end data modem scheme that transmits the caller’s digital certificates through a voice channel so that the recipient can verify the caller’s identity. Encoding useful information through voice channels is very difficult without the assistance of telecommunications providers. For example, speech activity detection may quickly classify encoded signals as non-speech signals and reject input waveforms. To address this issue, we propose a novel modulation method based on linear frequency modulation that encodes 3 bits per symbol by varying its frequency, shape, and phase, alongside a lightweight MobileNetV3-Small-based demodulator for efficient and accurate signal decoding on resource-constrained devices. This method leverages the unique characteristics of linear frequency modulation signals, making them more easily transmitted and decoded in speech channels. To ensure reliable data delivery over unstable voice links, we further introduce a robust framing scheme with delimiter-based synchronization, a sample-level position remedying algorithm, and a feedback-driven retransmission mechanism. We have validated the feasibility and performance of our system through expanded real-world evaluations, demonstrating that it outperforms existing advanced methods in terms of robustness and data transfer rate. This technology establishes the foundational infrastructure for reliable certificate delivery over voice channels, which is crucial for achieving strong caller authentication and preventing telephone fraud at its root cause.
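The idea of carrying 3 bits per linear-FM (chirp) symbol can be sketched as follows. This is an illustration under stated assumptions, not the paper's modem: the sample rate, symbol length, and sub-bands are hypothetical; "shape" is interpreted here as sweep direction (up or down); and a plain normalized-correlation detector is swapped in for the MobileNetV3-Small demodulator described in the abstract. Framing, synchronization, and retransmission are omitted.

```python
import math
import random

SAMPLE_RATE = 8000     # Hz, narrowband telephony rate (assumption)
SYMBOL_SAMPLES = 160   # 20 ms per symbol (assumption)
BANDS = [(600.0, 1400.0), (1800.0, 2600.0)]  # hypothetical sub-bands in Hz


def chirp(bits):
    """Map 3 bits (band, sweep direction, initial phase) to one LFM symbol."""
    band_bit, down_bit, phase_bit = bits
    low, high = BANDS[band_bit]
    f_start, f_end = (high, low) if down_bit else (low, high)
    duration = SYMBOL_SAMPLES / SAMPLE_RATE
    sweep = (f_end - f_start) / duration  # Hz per second
    phase0 = math.pi if phase_bit else 0.0
    samples = []
    for n in range(SYMBOL_SAMPLES):
        t = n / SAMPLE_RATE
        phase = 2.0 * math.pi * (f_start * t + 0.5 * sweep * t * t) + phase0
        samples.append(math.cos(phase))
    return samples


ALPHABET = [(b, d, p) for b in (0, 1) for d in (0, 1) for p in (0, 1)]
TEMPLATES = {bits: chirp(bits) for bits in ALPHABET}


def _norm(x):
    return math.sqrt(sum(v * v for v in x))


def demodulate_symbol(samples):
    """Choose the template with the largest normalized correlation."""
    def score(bits):
        template = TEMPLATES[bits]
        dot = sum(a * b for a, b in zip(samples, template))
        return dot / (_norm(samples) * _norm(template))
    return max(ALPHABET, key=score)


def modulate(bitstream):
    """Encode a bit list whose length is a multiple of 3."""
    signal = []
    for i in range(0, len(bitstream), 3):
        signal.extend(chirp(tuple(bitstream[i:i + 3])))
    return signal


def demodulate(signal):
    """Decode a symbol-aligned signal back into a bit list."""
    bits = []
    for i in range(0, len(signal), SYMBOL_SAMPLES):
        bits.extend(demodulate_symbol(signal[i:i + SYMBOL_SAMPLES]))
    return bits


def add_noise(signal, sigma, seed=0):
    """Add seeded Gaussian noise, as a crude stand-in for channel effects."""
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, sigma) for s in signal]
```

A call such as `demodulate(add_noise(modulate([1, 0, 1, 0, 1, 0]), 0.3))` exercises the round trip. Real voice channels also apply codecs, filtering, and timing drift, which this sketch does not model.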
With the rapid growth of cloud computing, the number of data centers (DCs) is continuously increasing, leading to a high energy-consumption dilemma. Apart from IT equipment, cooling represents the largest share of energy consumption in DCs. Passive design (PD) and active design (AD) are two important approaches in architectural design to reduce energy consumption. However, for DC cooling, few studies have summarized AD, and there are almost no studies on PD. Based on existing international research (2005-2024), this paper summarizes the current state of cooling strategies for DCs. PD encompasses floors, ceilings, and the layout and zoning of racks. Additionally, other passive strategies not yet studied in DCs are critically examined. AD includes air, liquid, free, and two-phase cooling. This paper systematically compares the performance of different AD technologies on various KPIs, including energy, economic, and environmental indicators. It also explores the application of different cooling design strategies through best-practice examples and presents advanced algorithms for energy management in operational DCs. The study reveals that free cooling is widely employed, with Artificial Neural Networks emerging as the most popular algorithm for managing cooling energy. Finally, this paper suggests four future directions for reducing cooling energy in DCs, with a focus on the development of passive strategies. It provides an overview of and guide to DC energy-consumption issues, emphasizes the importance of implementing passive and active design strategies to reduce DC cooling energy consumption, and provides directions and references for future energy-efficient DC designs.
Missing data presents a crucial challenge in data analysis, especially in high-dimensional datasets, where it often leads to biased conclusions and degraded model performance. In this study, we present a novel autoencoder-based imputation framework that integrates a composite loss function to enhance robustness and precision. The proposed loss combines (i) a guided, masked mean squared error focusing on missing entries; (ii) a noise-aware regularization term to improve resilience against data corruption; and (iii) a variance penalty to encourage expressive yet stable reconstructions. We evaluate the proposed model across four missingness mechanisms, namely Missing Completely at Random, Missing at Random, Missing Not at Random, and Missing Not at Random with quantile censorship, under systematically varied feature counts, sample sizes, and missingness ratios ranging from 5% to 60%. Four publicly available real-world datasets (Stroke Prediction, Pima Indians Diabetes, Cardiovascular Disease, and Framingham Heart Study) were used, and the results show that the proposed model consistently outperforms baseline methods, including traditional and deep learning-based techniques. An ablation study reveals the additive value of each component in the loss function. Additionally, we assessed the downstream utility of imputed data through classification tasks, where datasets imputed by the proposed method yielded the highest receiver operating characteristic area under the curve scores across all scenarios. The model demonstrates strong scalability and robustness, improving performance with larger datasets and higher feature counts. These results underscore the capacity of the proposed method to produce not only numerically accurate but also semantically useful imputations, making it a promising solution for robust data recovery in clinical applications.
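Two ingredients named in this abstract, the Missing Completely at Random mechanism and a mean squared error restricted to masked entries, can be sketched in a few lines. The abstract does not define the noise-aware regularization or the variance penalty, so they are not attempted here; the function names and toy matrices are hypothetical, and plain nested lists stand in for tensors.

```python
import random


def mcar_mask(n_rows, n_cols, ratio, seed=0):
    """Return a boolean matrix in which True marks an entry hidden
    completely at random with probability `ratio`."""
    rng = random.Random(seed)
    return [[rng.random() < ratio for _ in range(n_cols)] for _ in range(n_rows)]


def masked_mse(truth, reconstruction, mask):
    """Mean squared error computed only over the masked (hidden) entries."""
    errors = [
        (t - r) ** 2
        for t_row, r_row, m_row in zip(truth, reconstruction, mask)
        for t, r, m in zip(t_row, r_row, m_row)
        if m
    ]
    return sum(errors) / len(errors) if errors else 0.0


# Hypothetical 2x2 example: only the top-right entry is treated as missing,
# so the large error in the bottom-left entry does not count.
truth = [[1.0, 2.0], [3.0, 4.0]]
reconstruction = [[1.0, 2.5], [0.0, 4.0]]
mask = [[False, True], [False, False]]
loss = masked_mse(truth, reconstruction, mask)
```

Restricting the error to hidden entries is what lets an evaluation measure imputation quality rather than the ability to copy observed values through the network.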
Distributed learning is a well-established method for estimation tasks over extensively distributed datasets. However, non-randomly stored data can introduce bias into local parameter estimates, leading to significant performance degradation in classical distributed algorithms. In this paper, the authors propose a novel Distributed Quasi-Newton Pilot (DQNP) method for distributed learning with non-randomly distributed data. The proposed approach accommodates both randomly and non-randomly distributed data settings and imposes no constraints on the uniformity of local sample sizes. Additionally, it avoids the need to transfer the Hessian matrix or compute its inversion, thereby greatly reducing computational and communication complexity. The authors theoretically demonstrate that the resulting estimator achieves statistical efficiency under mild conditions. Extensive numerical experiments on synthetic and real-world data validate the theoretical findings and illustrate the effectiveness of the proposed method.
With the advent of the big data era, modern statistics has enjoyed unprecedented development opportunities and also faced numerous new challenges. Traditional statistical computing methods are often limited by issues such as computer memory capacity and the distributed storage of data across different locations, and cannot be applied directly to large-scale data sets. Therefore, in the context of big data, designing efficient and theoretically guaranteed statistical learning and inference algorithms has become a key issue that the field of statistics urgently needs to address. In this paper, the application status of statistical analysis methods in the big data environment is systematically reviewed, and future development directions are analyzed to provide reference and support for the further development of the theory and methods of statistical analysis of big data.
Photoacoustic-computed tomography is a novel imaging technique that combines high absorption contrast and deep tissue penetration capability, enabling comprehensive three-dimensional imaging of biological targets. However, the increasing demand for higher resolution and real-time imaging results in significant data volume, limiting the data storage, transmission, and processing efficiency of the system. Therefore, there is an urgent need for an effective method to compress the raw data without compromising image quality. This paper presents a photoacoustic-computed tomography 3D data compression method and system based on Wavelet-Transformer. The method is based on a cooperative compression framework that integrates wavelet hard coding with deep learning-based soft decoding. It combines the multiscale analysis capability of wavelet transforms with the global feature modeling advantage of Transformers, achieving high-quality data compression and reconstruction. Experimental results using k-Wave simulation suggest that the proposed compression system has advantages under extreme compression conditions, achieving a raw data compression ratio of up to 1:40. Furthermore, a three-dimensional data compression experiment on in vivo mouse data demonstrated that the maximum peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) values of reconstructed images reached 38.60 and 0.9583, respectively, effectively overcoming the detail loss and artifacts introduced by raw data compression. All the results suggest that the proposed system can significantly reduce storage requirements and hardware cost while enhancing computational efficiency and image quality. These advantages support the development of photoacoustic-computed tomography toward higher efficiency, real-time performance, and intelligent functionality.
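The multiscale idea behind wavelet coding, and the PSNR figure used to score reconstructions, can be illustrated on a one-dimensional toy signal. This sketch is an assumption-laden stand-in: it uses a single-level, unnormalized Haar transform, zeroes small detail coefficients as a crude form of lossy coding, and reconstructs with the plain inverse transform in place of the learned Transformer decoder described in the abstract. The signal values, threshold, and peak value are made up.

```python
import math


def haar_forward(signal):
    """One-level Haar transform of an even-length signal:
    pairwise averages (approximation) and half-differences (detail)."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward exactly."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([a + d, a - d])
    return out


def hard_threshold(coeffs, threshold):
    """Zero every coefficient whose magnitude is below `threshold`."""
    return [c if abs(c) >= threshold else 0.0 for c in coeffs]


def psnr(reference, test, peak):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


raw = [4.0, 4.0, 8.0, 6.0, 1.0, 1.0, 2.0, 10.0]   # hypothetical samples
approx, detail = haar_forward(raw)
lossless = haar_inverse(approx, detail)
lossy = haar_inverse(approx, hard_threshold(detail, 2.0))
quality_db = psnr(raw, lossy, peak=10.0)
```

Only one of the four detail coefficients survives the threshold in this example, which is the sense in which discarding small coefficients trades reconstruction error for fewer values to store.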
Urban traffic generates massive and diverse data, yet most systems remain fragmented. Current approaches to congestion management suffer from weak data consistency and poor scalability. This study addresses this gap by proposing the Urban Traffic Congestion Unified Metadata Model (UTC-UMM). The goal is to provide a standardized and extensible framework for describing, extracting, and storing multisource traffic data in smart cities. The model defines a two-tier specification that organizes nine core traffic resource classes. It employs an eXtensible Markup Language (XML) Schema that connects general elements with resource-specific elements. This design ensures both syntactic and semantic interoperability across siloed datasets. Extension principles allow new elements or constraints to be introduced without breaking backward compatibility. A distributed pipeline is implemented using Hadoop Distributed File System (HDFS) and HBase. It integrates computer vision for video and natural language processing for text to automate metadata extraction. Optimized row-key designs enable low-latency queries. Performance is tested with the Yahoo! Cloud Serving Benchmark (YCSB), which shows linear scalability and high throughput. The results demonstrate that UTC-UMM can unify heterogeneous traffic data while supporting real-time analytics. The discussion highlights its potential to improve data reuse, portability, and scalability in urban congestion studies. Future research will explore integration with association rule mining and advanced knowledge representation to capture richer spatiotemporal traffic patterns.
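The abstract does not describe its row-key layout, so the following is only a sketch of two patterns commonly used when designing HBase row keys: a hash-derived salt prefix to spread writes across regions, and a reversed timestamp so that the newest record for a source sorts first. The field order, the separator, the bucket count, and the names `row_key` and `MAX_TS_MS` are all assumptions.

```python
import hashlib

MAX_TS_MS = 10 ** 13  # exceeds any millisecond Unix timestamp in this century


def row_key(resource_class, sensor_id, timestamp_ms, buckets=16):
    """Build a salted, time-reversed composite key (hypothetical layout).

    `sensor_id` and `resource_class` must not contain the '|' separator.
    """
    # A stable hash of the sensor id picks the salt bucket, so all rows of
    # one sensor share a prefix while different sensors spread out.
    salt = int(hashlib.md5(sensor_id.encode("utf-8")).hexdigest(), 16) % buckets
    # Newer timestamps give smaller reversed values and sort first.
    reversed_ts = MAX_TS_MS - timestamp_ms
    return f"{salt:02d}|{resource_class}|{sensor_id}|{reversed_ts:013d}"


older = row_key("video", "cam-7", 1_700_000_000_000)
newer = row_key("video", "cam-7", 1_700_000_060_000)
```

With this layout, a prefix scan on the salt, class, and sensor returns the latest metadata record first, which suits "most recent state" queries; time-range analytics across many sensors would call for a different ordering.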
With the accelerating aging process of China’s population,the demand for community elderly care services has shown diversified and personalized characteristics.However,problems such as insufficient total care service...With the accelerating aging process of China’s population,the demand for community elderly care services has shown diversified and personalized characteristics.However,problems such as insufficient total care service resources,uneven distribution,and prominent supply-demand contradictions have seriously affected service quality.Big data technology,with core advantages including data collection,analysis and mining,and accurate prediction,provides a new solution for the allocation of community elderly care service resources.This paper systematically studies the application value of big data technology in the allocation of community elderly care service resources from three aspects:resource allocation efficiency,service accuracy,and management intelligence.Combined with practical needs,it proposes optimal allocation strategies such as building a big data analysis platform and accurately grasping the elderly’s care needs,striving to provide operable path references for the construction of community elderly care service systems,promoting the early realization of the elderly care service goal of“adequate support and proper care for the elderly”,and boosting the high-quality development of China’s elderly care service industry.展开更多
Abstract: Earthquakes are highly destructive spatio-temporal phenomena whose analysis is essential for disaster preparedness and risk mitigation. Modern seismological research produces vast volumes of heterogeneous data from seismic networks, satellite observations, and geospatial repositories, creating the need for scalable infrastructures capable of integrating and analyzing such data to support intelligent decision-making. Data warehousing technologies provide a robust foundation for this purpose; however, existing earthquake-oriented data warehouses remain limited, often relying on simplified schemas, domain-specific analytics, or cataloguing efforts. This paper presents the design and implementation of a spatio-temporal data warehouse for seismic activity. The framework integrates spatial and temporal dimensions in a unified schema and introduces a novel array-based approach for managing many-to-many relationships between facts and dimensions without intermediate bridge tables. A comparative evaluation against a conventional bridge-table schema demonstrates that the array-based design improves fact-centric query performance, while the bridge-table schema remains advantageous for dimension-centric queries. To reconcile these trade-offs, a hybrid schema is proposed that retains both representations, ensuring balanced efficiency across heterogeneous workloads. The proposed framework demonstrates how spatio-temporal data warehousing can address schema complexity, improve query performance, and support multidimensional visualization. In doing so, it provides a foundation for integrating seismic analysis into broader big data-driven intelligent decision systems for disaster resilience, risk mitigation, and emergency management.
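The difference between the array-based and bridge-table representations can be illustrated with a small in-memory sketch. This is only a toy model, not the schema from the paper: the field names (`event_id`, `region_ids`), the sample rows, and the helper functions are all hypothetical, and no real database indexing is modeled.

```python
from collections import defaultdict

# Hypothetical toy facts: each earthquake event may affect several regions.
facts = [
    {"event_id": 1, "magnitude": 5.8, "region_ids": [10, 11]},
    {"event_id": 2, "magnitude": 4.2, "region_ids": [11]},
    {"event_id": 3, "magnitude": 6.1, "region_ids": [10, 12]},
]

# Bridge-table layout: one (event_id, region_id) row per fact-dimension pair.
bridge = [(f["event_id"], r) for f in facts for r in f["region_ids"]]

# Index over the bridge rows, keyed by the dimension member.
events_by_region = defaultdict(list)
for event_id, region_id in bridge:
    events_by_region[region_id].append(event_id)


def regions_of_event_array(event_id):
    """Fact-centric lookup: the answer is stored on the fact row itself."""
    for f in facts:
        if f["event_id"] == event_id:
            return list(f["region_ids"])
    return []


def events_in_region_bridge(region_id):
    """Dimension-centric lookup through the bridge index."""
    return events_by_region.get(region_id, [])


def events_in_region_array(region_id):
    """Same question on the array layout: every fact row is inspected."""
    return [f["event_id"] for f in facts if region_id in f["region_ids"]]
```

In this toy setting the fact-centric question needs no join on the array layout, while the dimension-centric question is a direct lookup on the bridge index but a full scan over the arrays, which is one way to read the trade-off that motivates keeping both representations in a hybrid schema.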
Funding: Supported and funded by the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU) under grant number IMSIU-DDRSP2601.
Abstract: Ovarian cancer (OC) is one of the leading causes of death related to gynecological cancer, with the main difficulties being its early diagnosis and the heterogeneous nature of tumor biomarkers. Machine learning (ML) has the potential to process complex datasets and support decision-making in OC diagnosis. Nevertheless, traditional ML models tend to be biased, prone to overfitting, sensitive to noise, and poorly generalizable. Moreover, their black-box nature reduces interpretability and limits their practical clinical applicability. In this study, we introduce an explainable ensemble learning (EL) model, TreeX-Stack, based on a stacking architecture that employs tree-based learners such as Decision Tree (DT), Random Forest (RF), Gradient Boosting (GB), and Extreme Gradient Boosting (XGBoost) as base learners, and Logistic Regression (LR) as the meta-learner to enhance OC diagnosis. Local Interpretable Model-Agnostic Explanations (LIME) are used to explain individual predictions, making the model outputs more clinically interpretable and applicable. The model is trained on a dataset that includes demographic information, blood tests, general chemistry, and tumor markers. Extensive preprocessing includes handling missing data using iterative imputation with Bayesian Ridge and addressing multicollinearity by removing features with correlation coefficients above 0.7. Relevant features are then selected using the Boruta feature selection method. To obtain robust and unbiased performance estimates during hyperparameter tuning, nested cross-validation (CV) with grid search is employed, and all experiments are repeated five times to ensure statistical reliability. TreeX-Stack demonstrates excellent diagnostic performance, achieving an accuracy of 0.9027, a precision of 0.8673, a recall of 0.9391, and an F1-score of 0.9012. Feature-importance analyses using LIME and permutation importance highlight Human Epididymis Protein 4 (HE4) as the most significant biomarker for OC. The combination of high predictive performance and interpretability makes TreeX-Stack a reliable tool for clinical decision support in OC diagnosis.
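The multicollinearity step (dropping features whose correlation coefficient exceeds 0.7) can be sketched as a greedy filter. This is a minimal illustration under stated assumptions: the abstract does not say which feature of a correlated pair is removed, so the sketch simply keeps the first one encountered, and the feature names and values are invented.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_correlated(features, threshold=0.7):
    """Keep a feature only if |r| with every already-kept feature is <= threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical values, not taken from the study's dataset.
features = {
    "marker_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "marker_a_scaled": [2.1, 3.9, 6.2, 8.0, 9.9],  # almost 2 * marker_a
    "age": [3.0, 1.0, 4.0, 1.0, 5.0],
}
kept = drop_correlated(features)
```

Here `marker_a_scaled` is dropped because it is nearly collinear with `marker_a`, while `age` (correlation about 0.35 with `marker_a`) is retained. A constant feature would cause a division by zero in `pearson` and would need to be filtered beforehand.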
Funding: Supported by the National Natural Science Foundation of China [grant numbers 12171158, 12371474 and 12571510] and the Fundamental Research Funds for the Central Universities [grant number 2025ECNU-WLJC006].
Abstract: This work contributes to the theoretical foundation for pricing in data markets and offers practical insights for managing digital data exchanges in the era of big data. We propose a structured pricing model for data exchanges transitioning from quasi-public to market-oriented operations. To address the complex dynamics among data exchanges, suppliers, and consumers, we develop a three-stage Stackelberg game framework. In this model, the data exchange acts as a leader setting transaction commission rates, suppliers are intermediate leaders determining unit prices, and consumers are followers making purchasing decisions. Two pricing strategies are examined: the Independent Pricing Approach (IPA) and the novel Perfectly Competitive Pricing Approach (PCPA), which accounts for competition among data providers. Using backward induction, the study derives subgame-perfect equilibria and proves the existence and uniqueness of Stackelberg equilibria under both approaches. Extensive numerical simulations demonstrate that PCPA enhances data demander utility, encourages supplier competition, increases transaction volume, and improves the overall profitability and sustainability of data exchanges. Social welfare analysis further confirms PCPA's superiority in promoting efficient and fair data markets.
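Backward induction in a three-stage leader-follower game can be illustrated with a deliberately simple toy model. Everything below is an assumption made for illustration and is not the paper's model: a single supplier and consumer, a quadratic consumer utility a*q - q^2/2 - p*q, a supplier unit cost k, and the parameter values a = 10 and k = 1.

```python
def consumer_quantity(p, a=10.0):
    """Stage 3 (follower): maximizing a*q - q**2/2 - p*q gives q = max(a - p, 0)."""
    return max(a - p, 0.0)


def supplier_price(c, a=10.0, k=1.0):
    """Stage 2: maximize ((1 - c) * p - k) * (a - p) given commission rate c.

    The first-order condition gives p = a/2 + k / (2 * (1 - c)).
    """
    p = a / 2 + k / (2 * (1 - c))
    return min(p, a)


def exchange_profit(c, a=10.0, k=1.0):
    """Stage 1 payoff: commission revenue c * p * q, anticipating both responses."""
    p = supplier_price(c, a, k)
    return c * p * consumer_quantity(p, a)


# Stage 1 (leader): grid search over commission rates in [0, 0.9).
grid = [i / 1000 for i in range(0, 900)]
best_c = max(grid, key=exchange_profit)
best_p = supplier_price(best_c)
best_q = consumer_quantity(best_p)
```

The order of solution mirrors the order of the functions: the consumer's best response is substituted into the supplier's problem, and the supplier's best response into the exchange's problem. In this toy setting the exchange's profit peaks at an interior commission rate, because a higher rate pushes the supplier's price up and demand down.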
Funding: Supported by the National Key R&D Program of China (No. 2021YFF0501301) and the National Natural Science Foundation of China (No. 42172231).
Abstract: 0 INTRODUCTION Earth science is a natural science concerned with the composition, dynamics, spatiotemporal evolution, and formation mechanisms of Earth materials (Chen and Yang, 2023). Traditional Earth science research has largely been discipline-based, relying on field investigations, data collection, experimental analyses, and data interpretation to study individual components of the Earth system.
Abstract: To address the severe challenges of PM2.5 and ozone co-control during the "14th Five-Year Plan" period and to enhance the precision and intelligence of air environment governance, it is imperative to build an efficient comprehensive management platform for regional air quality. In this paper, the practice in Zibo City, Shandong Province is taken as an example to systematically analyze the top-level design, technical implementation, and innovative application of a comprehensive management platform for regional air quality integrating "perception monitoring, data fusion, research judgment of early warnings, analysis of sources, collaborative dispatching, and evaluation assessment". Through the construction of a "sky-air-ground" integrated three-dimensional monitoring network, the platform integrates multi-source heterogeneous environmental data, and employs big data, cloud computing, artificial intelligence, CALPUFF/CMAQ, and other numerical model technologies to achieve comprehensive perception, precise prediction, intelligent source tracing, and closed-loop management of air pollution. The platform innovatively establishes a full-process closed-loop management mechanism of "data, early warning, disposition, evaluation", and achieves a fundamental transformation in environmental supervision from passive response to active anticipation and from experience-based judgment to data-driven judgment. The application results show that this platform significantly improves the scientific decision-making ability and collaborative execution efficiency of air pollution governance in Zibo City, providing a replicable and scalable comprehensive solution for similar industrial cities to achieve continuous improvement of air quality.
Funding: Supported by the International Partnership Program of the Chinese Academy of Sciences (170GJHZ2023074GC), the National Natural Science Foundation of China (42425706 and 42488201), the National Key Research and Development Program of China (2024YFF0807902), the Beijing Natural Science Foundation (8242041), and the China Postdoctoral Science Foundation (2025M770353).
Abstract: Accurately assessing the relationship between tree growth and climatic factors is of great importance in dendrochronology. This study evaluated the consistency between alternative climate datasets (including station and gridded data) and actual climate data (fixed-point observations near the sampling sites) in northeastern China's warm temperate zone and analyzed differences in their correlations with the tree-ring width index. The results were: (1) Gridded temperature data, as well as precipitation and relative humidity data from the Huailai meteorological station, were more consistent with the actual climate data; in contrast, gridded soil moisture content data showed significant discrepancies. (2) Horizontal distance had a greater impact on the representativeness of actual climate conditions than vertical elevation differences. (3) Differences in consistency between alternative and actual climate data also affected their correlations with tree-ring width indices. In some growing season months, correlation coefficients differed significantly, in both magnitude and sign, from those based on actual data. The selection of different alternative climate datasets can lead to biased results in assessing forest responses to climate change, which is detrimental to the management of forest ecosystems in harsh environments. Therefore, the scientific and rational selection of alternative climate data is essential for dendroecological and climatological research.
Funding: Supported by the National Natural Science Foundation of China (91959106), the Foundation of the Shanghai Municipal Education Commission (24RGZNC02), the Shanghai Key Laboratory of Intelligent Information Processing, Fudan University (IIPL-2025-RD3-02), the Key University Science Research Project of Anhui Province (2023AH030108), the Climbing Peak Training Program for Innovative Technology Team of Yijishan Hospital, Wannan Medical College (PF201904), the Peak Training Program for Scientific Research of Yijishan Hospital, Wannan Medical College (GF2019G15), and the talent project of the First Affiliated Hospital of Wannan Medical College (Yijishan Hospital of Wannan Medical College) (YR202422).
Abstract: tRNA-derived small RNAs (tsRNAs), a class of regulatory small noncoding RNAs, have been implicated in a wide variety of human diseases. A large number of tsRNA–disease associations have been identified in recent years from accumulating studies. However, repositories for cataloging the detailed information on tsRNA–disease associations are scarce. In this study, we provide the tsRNADisease database by integrating experimentally and computationally supported tsRNA–disease associations from manual curation of the literature and other related resources. tsRNADisease contains 5571 manually curated associations between 4759 tsRNAs and 166 diseases with experimental evidence from 346 studies. In addition, it contains 5013 predicted associations between 1297 tsRNAs and 111 diseases. tsRNADisease provides a user-friendly interface to browse, retrieve, and download data conveniently. This database can improve our understanding of tsRNA deregulation in diseases and serve as a valuable resource for investigating the mechanism of disease-related tsRNAs. tsRNADisease is freely available at http://www.compgenelab.info/tsRNADisease.
Funding: Supported by the School of Digital Science, Universiti Brunei Darussalam, Brunei.
Abstract: Artificial Intelligence (AI) in healthcare enables predicting diabetes using data-driven methods instead of the traditional screening techniques, which include hemoglobin A1c (HbA1c), the oral glucose tolerance test (OGTT), and fasting plasma glucose (FPG), and which are invasive and limited in scale. Machine learning (ML) and deep neural network (DNN) models use large datasets to learn complex, nonlinear feature interactions, but conventional ML algorithms are data sensitive and often show unstable predictive accuracy. Conversely, DNN models are more robust, though reaching a high accuracy rate consistently on heterogeneous datasets is still an open challenge. For predicting diabetes, this work proposes a hybrid DNN approach that integrates a bidirectional long short-term memory (BiLSTM) network with a bidirectional gated recurrent unit (BiGRU). A robust DL model is developed by combining various datasets with weighted coefficients, dense operations in the connection of deep layers, and output aggregation using batch normalization and dropout functions to avoid overfitting. The goal of this hybrid model is better generalization and consistency across various datasets, which facilitates effective management and early intervention. The proposed DNN model exhibits excellent predictive performance compared with state-of-the-art and baseline ML and DNN models for diabetes prediction tasks. The robust performance indicates the possible usefulness of DL-based models for disease prediction in healthcare and in other areas that demand high-quality analytics.
Funding: The National Natural Science Foundation of China (No. 52470211), the Special Foundation of Jiangsu Province Science and Technology Plan (No. BZ2024017), and the RECLAIM Network Plus Project (No. EP/W034034/1).
Abstract: Reducing carbon emissions is fundamental to achieving carbon neutrality. Existing studies have typically estimated emissions by predicting fossil fuel consumption across sectors under different socioeconomic scenarios; however, uncertainties in future development often lead to deviations from these assumptions. To address this limitation, this study proposes a data-driven approach for evaluating national carbon emissions using historical data. Countries with similar energy consumption patterns were selected as reference samples, and their emission pathways were analyzed to predict future emissions for countries that have not yet reached their peak. Key indicators, including peak levels, timing, plateau duration, and post-peak decline rates, were identified. The results indicate that the trends in unpeaked economies can be effectively assessed based on the emission patterns of countries with comparable energy structures. Applying this framework to China suggests a carbon peak between 2027 and 2030, in the range of 14.207 to 16.234 Gt, followed by a gradual decline from 2031 to 2036. Compared with the average results of the existing studies, the predicted minimum and maximum emissions show error margins of 10.1% and 1.41%, respectively. This study proposes a top-down methodology that provides a transparent, reproducible, and empirical framework for forecasting carbon emission pathways, thereby offering a scientific basis for assessing countries that have not yet reached their emissions peak.
Funding: Funded by the Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2025R104), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Modern intrusion detection systems (MIDS) face persistent challenges in coping with the rapid evolution of cyber threats, high-volume network traffic, and imbalanced datasets. Traditional models often lack the robustness and explainability required to detect novel and sophisticated attacks effectively. This study introduces an advanced, explainable machine learning framework for multi-class IDS using the KDD99 and IDS datasets, which reflect real-world network behavior through a blend of normal and diverse attack classes. The methodology begins with sophisticated data preprocessing, incorporating both RobustScaler and QuantileTransformer to address outliers and skewed feature distributions, ensuring standardized and model-ready inputs. Critical dimensionality reduction is achieved via the Harris Hawks Optimization (HHO) algorithm, a nature-inspired metaheuristic modeled on hawks' hunting strategies. HHO efficiently identifies the most informative features by optimizing a fitness function based on classification performance. Following feature selection, SMOTE is applied to the training data to resolve class imbalance by synthetically augmenting underrepresented attack types. A stacked architecture is then employed, combining the strengths of XGBoost, SVM, and RF as base learners. This layered approach improves prediction robustness and generalization by balancing bias and variance across diverse classifiers. The model was evaluated using standard classification metrics: precision, recall, F1-score, and overall accuracy. The best overall performance was recorded with an accuracy of 99.44% for UNSW-NB15, demonstrating the model's effectiveness. After balancing, the model demonstrated a clear improvement in detecting the attacks. We tested the model on four datasets to show the effectiveness of the proposed approach and performed an ablation study to check the effect of each parameter. The proposed model is also computationally efficient. To support transparency and trust in decision-making, explainable AI (XAI) techniques are incorporated that provide both global and local insight into feature contributions and offer intuitive visualizations for individual predictions. This makes the framework suitable for practical deployment in cybersecurity environments that demand both precision and accountability.
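The oversampling idea behind SMOTE, as applied to the training data here, can be sketched in a few lines: a synthetic minority sample is placed at a random point on the segment between an existing minority sample and one of its nearest minority neighbours. The function name `smote_like`, the neighbour count, and the toy points are assumptions for illustration; this is not the implementation used in the study.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating towards a near neighbour."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample (squared Euclidean distance).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic


# Hypothetical minority-class samples with two features each.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like(minority, n_new=6)
```

Because every synthetic point lies between two real minority samples, it stays inside the region already occupied by that class. Oversampling only the training split, as the abstract states, keeps synthetic points out of the evaluation data.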
Abstract: Amid the increasing demand for data sharing, the need for flexible, secure, and auditable access control mechanisms has garnered significant attention in the academic community. However, blockchain-based ciphertext-policy attribute-based encryption (CP-ABE) schemes still face cumbersome ciphertext re-encryption and insufficient oversight when handling dynamic attribute changes and cross-chain collaboration. To address these issues, we propose a dynamic permission attribute-encryption scheme for multi-chain collaboration. This scheme incorporates a multi-authority architecture for distributed attribute management and integrates an attribute revocation and granting mechanism that eliminates the need for ciphertext re-encryption, effectively reducing both computational and communication overhead. It leverages the InterPlanetary File System (IPFS) for off-chain data storage and constructs a cross-chain regulatory framework, comprising a Hyperledger Fabric business chain and a FISCO BCOS regulatory chain, to record changes in decryption privileges and access behaviors in an auditable manner. Security analysis shows that the scheme achieves selective indistinguishability under chosen-plaintext attack (sIND-CPA) under the decisional q-Parallel Bilinear Diffie-Hellman Exponent (q-PBDHE) assumption. In the performance and experimental evaluations, we compared the proposed scheme with several advanced schemes. The results show that, while preserving security, the proposed scheme achieves higher encryption/decryption efficiency and lower storage overhead for ciphertexts and keys.
Funding: Supported by the National Social Science Foundation of China under Grant No. 23&ZD126, the National Science Foundation of China under Grant No. 12471256, the Natural Science Foundation of Shanxi Province under Grant No. 202203021221219, and the Scientific and Technological Innovation Programs of Higher Education Institutions in Shanxi under Grant No. 2023L164.
Abstract: The authors consider the issue of hypothesis testing in varying-coefficient regression models with high-dimensional data. Utilizing kernel smoothing techniques, the authors propose a locally concerned U-statistic method to assess the overall significance of the coefficients. The authors establish that the proposed test is asymptotically normal under both the null hypothesis and local alternatives. Based on the locally concerned U-statistic, the authors further develop a globally concerned U-statistic to test whether the coefficient function is zero. A stochastic perturbation method is employed to approximate the distribution of the globally concerned test statistic. Monte Carlo simulations demonstrate the validity of the proposed test in finite samples.
Abstract: With the popularization of new technologies, telephone fraud has become the main means of stealing money and personal identity information. Taking inspiration from the website authentication mechanism, we propose an end-to-end data modem scheme that transmits the caller's digital certificates through a voice channel for the recipient to verify the caller's identity. Encoding useful information through voice channels is very difficult without the assistance of telecommunications providers. For example, speech activity detection may quickly classify encoded signals as non-speech signals and reject input waveforms. To address this issue, we propose a novel modulation method based on linear frequency modulation that encodes 3 bits per symbol by varying its frequency, shape, and phase, alongside a lightweight MobileNetV3-Small-based demodulator for efficient and accurate signal decoding on resource-constrained devices. This method leverages the unique characteristics of linear frequency modulation signals, making them more easily transmitted and decoded in speech channels. To ensure reliable data delivery over unstable voice links, we further introduce a robust framing scheme with delimiter-based synchronization, a sample-level position remedying algorithm, and a feedback-driven retransmission mechanism. We have validated the feasibility and performance of our system through expanded real-world evaluations, demonstrating that it outperforms existing advanced methods in terms of robustness and data transfer rate. This technology establishes the foundational infrastructure for reliable certificate delivery over voice channels, which is crucial for achieving strong caller authentication and preventing telephone fraud at its root cause.
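One way to picture a 3-bit-per-symbol linear frequency modulation (chirp) scheme is to let one bit select the frequency band, one the sweep direction (shape), and one the initial phase. The mapping below is a hypothetical sketch: the abstract does not give the actual bit assignment, and the sample rate, symbol duration, and band edges are assumed values. The neural demodulator is not modeled.

```python
import math

SAMPLE_RATE = 8000       # Hz, assumed narrowband voice sample rate
SYMBOL_SECONDS = 0.02    # assumed symbol duration
BANDS = {0: (600.0, 1200.0), 1: (1500.0, 2100.0)}  # assumed (low, high) edges in Hz


def modulate_symbol(bits):
    """Map (band_bit, shape_bit, phase_bit) to the samples of one linear chirp."""
    band_bit, shape_bit, phase_bit = bits
    f_lo, f_hi = BANDS[band_bit]
    # shape_bit 0 sweeps upward, shape_bit 1 sweeps downward.
    f_start, f_end = (f_lo, f_hi) if shape_bit == 0 else (f_hi, f_lo)
    sweep_rate = (f_end - f_start) / SYMBOL_SECONDS  # Hz per second
    phase = math.pi * phase_bit                      # 0 or pi
    n = round(SAMPLE_RATE * SYMBOL_SECONDS)
    samples = []
    for i in range(n):
        t = i / SAMPLE_RATE
        angle = 2 * math.pi * (f_start * t + 0.5 * sweep_rate * t * t) + phase
        samples.append(math.cos(angle))
    return samples


def modulate(bitstring):
    """Split a bit string into 3-bit symbols and concatenate their chirps."""
    if len(bitstring) % 3 != 0:
        raise ValueError("bit string length must be a multiple of 3")
    wave = []
    for i in range(0, len(bitstring), 3):
        wave.extend(modulate_symbol(tuple(int(b) for b in bitstring[i:i + 3])))
    return wave


wave = modulate("000111")  # two symbols
```

The instantaneous frequency of each symbol moves linearly from `f_start` to `f_end`, which is what makes the waveform a linear chirp; a receiver would need to recover band, direction, and phase from each symbol to get the three bits back.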
Abstract: With the rapid growth of cloud computing, the number of data centers (DCs) continuously increases, leading to a high energy consumption dilemma. Cooling, apart from IT equipment, represents the largest energy consumption in DCs. Passive design (PD) and active design (AD) are two important approaches in architectural design to reduce energy consumption. However, for DC cooling, few studies have summarized AD, and there are almost no studies on PD. Based on existing international research (2005-2024), this paper summarizes the current state of cooling strategies for DCs. PD encompasses floors, ceilings, and layout and zoning of racks. Additionally, other passive strategies not yet studied in DCs are critically examined. AD includes air, liquid, free, and two-phase cooling. This paper systematically compares the performance of different AD technologies on various KPIs, including energy, economic, and environmental indicators. This paper also explores the application of different cooling design strategies through best-practice examples and presents advanced algorithms for energy management in operational DCs. This study reveals that free cooling is widely employed, with Artificial Neural Networks emerging as the most popular algorithm for managing cooling energy. Finally, this paper suggests four future directions for reducing cooling energy in DCs, with a focus on the development of passive strategies. This paper provides an overview and guide to DC energy-consumption issues, emphasizes the importance of implementing passive and active design strategies to reduce DC cooling energy consumption, and provides directions and references for future energy-efficient DC designs.
Abstract: Missing data presents a crucial challenge in data analysis, especially in high-dimensional datasets, where missing data often leads to biased conclusions and degraded model performance. In this study, we present a novel autoencoder-based imputation framework that integrates a composite loss function to enhance robustness and precision. The proposed loss combines (i) a guided, masked mean squared error focusing on missing entries; (ii) a noise-aware regularization term to improve resilience against data corruption; and (iii) a variance penalty to encourage expressive yet stable reconstructions. We evaluate the proposed model across four missingness mechanisms, namely Missing Completely at Random, Missing at Random, Missing Not at Random, and Missing Not at Random with quantile censorship, under systematically varied feature counts, sample sizes, and missingness ratios ranging from 5% to 60%. Four publicly available real-world datasets (Stroke Prediction, Pima Indians Diabetes, Cardiovascular Disease, and Framingham Heart Study) were used, and the results show that the proposed model consistently outperforms baseline methods, including traditional and deep learning-based techniques. An ablation study reveals the additive value of each component in the loss function. Additionally, we assessed the downstream utility of imputed data through classification tasks, where datasets imputed by the proposed method yielded the highest receiver operating characteristic area under the curve scores across all scenarios. The model demonstrates strong scalability and robustness, improving performance with larger datasets and higher feature counts. These results underscore the capacity of the proposed method to produce not only numerically accurate but also semantically useful imputations, making it a promising solution for robust data recovery in clinical applications.
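The first component of the composite loss, a mean squared error restricted to masked entries, can be written down directly. The sketch below assumes a common training setup in which known values are hidden artificially so that the ground truth is available; the abstract does not give the exact form of the "guided" weighting, the noise-aware term, or the variance penalty, so those are left out rather than guessed.

```python
def masked_mse(x_true, x_pred, mask):
    """Mean squared error over entries where mask is 1 (values hidden from the model)."""
    total, count = 0.0, 0
    for row_true, row_pred, row_mask in zip(x_true, x_pred, mask):
        for t, p, m in zip(row_true, row_pred, row_mask):
            if m:
                total += (t - p) ** 2
                count += 1
    return total / count if count else 0.0


# Hypothetical 2 x 2 example: two entries were hidden and then reconstructed.
x_true = [[1.0, 2.0], [3.0, 4.0]]
x_pred = [[1.5, 2.0], [0.0, 3.0]]
mask = [[1, 0], [0, 1]]
loss = masked_mse(x_true, x_pred, mask)  # (0.25 + 1.0) / 2 = 0.625
```

Note that the large error at position (1, 0) does not contribute, since that entry is not masked; restricting the loss this way focuses training on reconstructing missing values rather than on copying observed ones.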
Funding: Supported by the National Natural Science Foundation of China under Grant No. 12271034 and the Open Fund Project of Key Laboratory of Market Regulation under Grant No. 2023SYSKF02003.
Abstract: Distributed learning is a well-established method for estimation tasks over extensively distributed datasets. However, non-randomly stored data can introduce bias into local parameter estimates, leading to significant performance degradation in classical distributed algorithms. In this paper, the authors propose a novel Distributed Quasi-Newton Pilot (DQNP) method for distributed learning with non-randomly distributed data. The proposed approach accommodates both randomly and non-randomly distributed data settings and imposes no constraints on the uniformity of local sample sizes. Additionally, it avoids the need to transfer the Hessian matrix or compute its inversion, thereby greatly reducing computational and communication complexity. The authors theoretically demonstrate that the resulting estimator achieves statistical efficiency under mild conditions. Extensive numerical experiments on synthetic and real-world data validate the theoretical findings and illustrate the effectiveness of the proposed method.
Abstract: With the advent of the big data era, modern statistics has enjoyed unprecedented development opportunities and also faced numerous new challenges. Traditional statistical computing methods are often limited by issues such as computer memory capacity and the distributed storage of data across different locations, and cannot be applied directly to large-scale data sets. Therefore, in the context of big data, designing efficient and theoretically guaranteed statistical learning and inference algorithms has become a key issue that the field of statistics urgently needs to address. In this paper, the application status of statistical analysis methods in the big data environment is systematically reviewed, and future development directions are analyzed to provide reference and support for the further development of the theory and methods of statistical analysis of big data.
Funding: Supported by the National Key R&D Program of China [Grant No. 2023YFF0713600], the National Natural Science Foundation of China [Grant No. 62275062], the Project of Shandong Innovation and Startup Community of High-end Medical Apparatus and Instruments [Grant No. 2023-SGTTXM-002 and 2024-SGTTXM-005], the Shandong Province Technology Innovation Guidance Plan (Central Leading Local Science and Technology Development Fund) [Grant No. YDZX2023115], the Taishan Scholar Special Funding Project of Shandong Province, and the Shandong Laboratory of Advanced Biomaterials and Medical Devices in Weihai [Grant No. ZL202402].
Abstract: Photoacoustic-computed tomography is a novel imaging technique that combines high absorption contrast and deep tissue penetration capability, enabling comprehensive three-dimensional imaging of biological targets. However, the increasing demand for higher resolution and real-time imaging results in significant data volume, limiting the data storage, transmission, and processing efficiency of the system. Therefore, there is an urgent need for an effective method to compress the raw data without compromising image quality. This paper presents a photoacoustic-computed tomography 3D data compression method and system based on Wavelet-Transformer. The method is based on a cooperative compression framework that integrates wavelet hard coding with deep learning-based soft decoding. It combines the multiscale analysis capability of wavelet transforms with the global feature modeling advantage of Transformers, achieving high-quality data compression and reconstruction. Experimental results using k-Wave simulation suggest that the proposed compression system has advantages under extreme compression conditions, achieving a raw data compression ratio of up to 1:40. Furthermore, three-dimensional data compression experiments using an in vivo mouse demonstrated that the maximum peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) values of reconstructed images reached 38.60 and 0.9583, effectively overcoming detail loss and artifacts introduced by raw data compression. All the results suggest that the proposed system can significantly reduce storage requirements and hardware cost, enhancing computational efficiency and image quality. These advantages support the development of photoacoustic-computed tomography toward higher efficiency, real-time performance, and intelligent functionality.
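The wavelet side of such a pipeline and the PSNR metric can be illustrated on a one-dimensional signal. This is a stand-in sketch only: it uses a single-level Haar transform with hard thresholding of detail coefficients, which is an assumption about what "wavelet hard coding" might look like in its simplest form, and it replaces the Transformer-based decoder with a plain inverse transform. The signal values and threshold are invented.

```python
import math


def haar_forward(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def hard_threshold(coeffs, tau):
    """Zero every coefficient whose magnitude is below tau."""
    return [c if abs(c) >= tau else 0.0 for c in coeffs]


def psnr(reference, estimate, peak):
    """Peak signal-to-noise ratio in dB for a given peak value."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


signal = [4.0, 4.2, 8.0, 7.9, 1.0, 1.1, 0.0, 0.0]
approx, detail = haar_forward(signal)
kept_detail = hard_threshold(detail, tau=0.5)   # all small details are discarded
restored = haar_inverse(approx, kept_detail)
quality_db = psnr(signal, restored, peak=8.0)
```

With every detail coefficient zeroed, only half of the coefficients need to be stored, and the restored signal replaces each pair of samples by its mean. A learned decoder, as described in the abstract, would aim to recover the detail that this plain inverse transform cannot.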
Funding: Supported by the National Natural Science Foundation of China (Grant No. 62172033).
Abstract: Urban traffic generates massive and diverse data, yet most systems remain fragmented. Current approaches to congestion management suffer from weak data consistency and poor scalability. This study addresses this gap by proposing the Urban Traffic Congestion Unified Metadata Model (UTC-UMM). The goal is to provide a standardized and extensible framework for describing, extracting, and storing multisource traffic data in smart cities. The model defines a two-tier specification that organizes nine core traffic resource classes. It employs an eXtensible Markup Language (XML) Schema that connects general elements with resource-specific elements. This design ensures both syntactic and semantic interoperability across siloed datasets. Extension principles allow new elements or constraints to be introduced without breaking backward compatibility. A distributed pipeline is implemented using the Hadoop Distributed File System (HDFS) and HBase. It integrates computer vision for video and natural language processing for text to automate metadata extraction. Optimized row-key designs enable low-latency queries. Performance is tested with the Yahoo! Cloud Serving Benchmark (YCSB), which shows linear scalability and high throughput. The results demonstrate that UTC-UMM can unify heterogeneous traffic data while supporting real-time analytics. The discussion highlights its potential to improve data reuse, portability, and scalability in urban congestion studies. Future research will explore integration with association rule mining and advanced knowledge representation to capture richer spatiotemporal traffic patterns.
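A row-key design for a sorted key-value store such as HBase typically combines a salt (to spread writes), the fields most queries filter on, and a reversed timestamp (so the newest rows sort first). The layout below is a hypothetical sketch of that general pattern; the abstract does not describe the actual key format used in UTC-UMM, and the field names, bucket count, and timestamp bound are assumptions.

```python
import zlib

MAX_TS = 10**13   # assumed upper bound for millisecond timestamps
BUCKETS = 16      # assumed number of salt buckets


def row_key(resource_class, source_id, ts_millis):
    """Build 'salt|class|source|reversed_ts' so one source's rows sort newest-first."""
    # Deterministic salt derived from the source id spreads sources across buckets.
    salt = zlib.crc32(source_id.encode("utf-8")) % BUCKETS
    reversed_ts = MAX_TS - ts_millis
    return f"{salt:02d}|{resource_class}|{source_id}|{reversed_ts:013d}"


older = row_key("video", "cam-001", 1_700_000_000_000)
newer = row_key("video", "cam-001", 1_700_000_060_000)
```

Since keys are compared lexicographically, all rows of one source share a prefix and can be read with a single prefix scan, and the zero-padded reversed timestamp places `newer` before `older` within that scan.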
Abstract: With the accelerating aging of China's population, the demand for community elderly care services has shown diversified and personalized characteristics. However, problems such as insufficient total care service resources, uneven distribution, and prominent supply-demand contradictions have seriously affected service quality. Big data technology, with core advantages including data collection, analysis and mining, and accurate prediction, provides a new solution for the allocation of community elderly care service resources. This paper systematically studies the application value of big data technology in the allocation of community elderly care service resources from three aspects: resource allocation efficiency, service accuracy, and management intelligence. Combined with practical needs, it proposes optimal allocation strategies such as building a big data analysis platform and accurately grasping the elderly's care needs, striving to provide operable path references for the construction of community elderly care service systems, promoting the early realization of the elderly care service goal of "adequate support and proper care for the elderly", and boosting the high-quality development of China's elderly care service industry.