Small datasets are often challenging due to their limited sample size.This research introduces a novel solution to these problems:average linkage virtual sample generation(ALVSG).ALVSG leverages the underlying data st...Small datasets are often challenging due to their limited sample size.This research introduces a novel solution to these problems:average linkage virtual sample generation(ALVSG).ALVSG leverages the underlying data structure to create virtual samples,which can be used to augment the original dataset.The ALVSG process consists of two steps.First,an average-linkage clustering technique is applied to the dataset to create a dendrogram.The dendrogram represents the hierarchical structure of the dataset,with each merging operation regarded as a linkage.Next,the linkages are combined into an average-based dataset,which serves as a new representation of the dataset.The second step in the ALVSG process involves generating virtual samples using the average-based dataset.The research project generates a set of 100 virtual samples by uniformly distributing them within the provided boundary.These virtual samples are then added to the original dataset,creating a more extensive dataset with improved generalization performance.The efficacy of the ALVSG approach is validated through resampling experiments and t-tests conducted on two small real-world datasets.The experiments are conducted on three forecasting models:the support vector machine for regression(SVR),the deep learning model(DL),and XGBoost.The results show that the ALVSG approach outperforms the baseline methods in terms of mean square error(MSE),root mean square error(RMSE),and mean absolute error(MAE).展开更多
Industrial data mining usually deals with data from different sources.These heterogeneous datasets describe the same object in different views.However,samples from some of the datasets may be lost.Then the remaining s...Industrial data mining usually deals with data from different sources.These heterogeneous datasets describe the same object in different views.However,samples from some of the datasets may be lost.Then the remaining samples do not correspond one-to-one correctly.Mismatched datasets caused by missing samples make the industrial data unavailable for further machine learning.In order to align the mismatched samples,this article presents a cooperative iteration matching method(CIMM)based on the modified dynamic time warping(DTW).The proposed method regards the sequentially accumulated industrial data as the time series.Mismatched samples are aligned by the DTW.In addition,dynamic constraints are applied to the warping distance of the DTW process to make the alignment more efficient.Then a series of models are trained with the cumulated samples iteratively.Several groups of numerical experiments on different missing patterns and missing locations are designed and analyzed to prove the effectiveness and the applicability of the proposed method.展开更多
In the era of big data,traditional statistical inference methods are faced with great challenges.Taking the two-sample distribution test scenario of big data as an example,this paper proposes the BB-KS test based on m...In the era of big data,traditional statistical inference methods are faced with great challenges.Taking the two-sample distribution test scenario of big data as an example,this paper proposes the BB-KS test based on m out of n bootstrap for solving a single-machine memory and computing constraints.It is verified to the feasibility and effectiveness of the proposed test method through theoretical analysis and numerical simulation.The results show that the BB-KS test can improve the calculation efficiency of the test to a certain extent in the single machine scenario.展开更多
With the increasing emphasis on personal information protection,encryption through security protocols has emerged as a critical requirement in data transmission and reception processes.Nevertheless,IoT ecosystems comp...With the increasing emphasis on personal information protection,encryption through security protocols has emerged as a critical requirement in data transmission and reception processes.Nevertheless,IoT ecosystems comprise heterogeneous networks where outdated systems coexist with the latest devices,spanning a range of devices from non-encrypted ones to fully encrypted ones.Given the limited visibility into payloads in this context,this study investigates AI-based attack detection methods that leverage encrypted traffic metadata,eliminating the need for decryption and minimizing system performance degradation—especially in light of these heterogeneous devices.Using the UNSW-NB15 and CICIoT-2023 dataset,encrypted and unencrypted traffic were categorized according to security protocol,and AI-based intrusion detection experiments were conducted for each traffic type based on metadata.To mitigate the problem of class imbalance,eight different data sampling techniques were applied.The effectiveness of these sampling techniques was then comparatively analyzed using two ensemble models and three Deep Learning(DL)models from various perspectives.The experimental results confirmed that metadata-based attack detection is feasible using only encrypted traffic.In the UNSW-NB15 dataset,the f1-score of encrypted traffic was approximately 0.98,which is 4.3%higher than that of unencrypted traffic(approximately 0.94).In addition,analysis of the encrypted traffic in the CICIoT-2023 dataset using the same method showed a significantly lower f1-score of roughly 0.43,indicating that the quality of the dataset and the preprocessing approach have a substantial impact on detection performance.Furthermore,when data sampling techniques were applied to encrypted traffic,the recall in the UNSWNB15(Encrypted)dataset improved by up to 23.0%,and in the CICIoT-2023(Encrypted)dataset by 20.26%,showing a similar level of improvement.Notably,in CICIoT-2023,f1-score and Receiver Operation Characteristic-Area Under the Curve(ROC-AUC)increased by 59.0%and 55.94%,respectively.These results suggest that data sampling can have a positive effect even in encrypted environments.However,the extent of the improvement may vary depending on data quality,model architecture,and sampling strategy.展开更多
Lightweight nodes are crucial for blockchain scalability,but verifying the availability of complete block data puts significant strain on bandwidth and latency.Existing data availability sampling(DAS)schemes either re...Lightweight nodes are crucial for blockchain scalability,but verifying the availability of complete block data puts significant strain on bandwidth and latency.Existing data availability sampling(DAS)schemes either require trusted setups or suffer from high communication overhead and low verification efficiency.This paper presents ISTIRDA,a DAS scheme that lets light clients certify availability by sampling small random codeword symbols.Built on ISTIR,an improved Reed–Solomon interactive oracle proof of proximity,ISTIRDA combines adaptive folding with dynamic code rate adjustment to preserve soundness while lowering communication.This paper formalizes opening consistency and prove security with bounded error in the random oracle model,giving polylogarithmic verifier queries and no trusted setup.In a prototype compared with FRIDA under equal soundness,ISTIRDA reduces communication by 40.65%to 80%.For data larger than 16 MB,ISTIRDA verifies faster and the advantage widens;at 128 MB,proofs are about 60%smaller and verification time is roughly 25%shorter,while prover overhead remains modest.In peer-to-peer emulation under injected latency and loss,ISTIRDA reaches confidence more quickly and is less sensitive to packet loss and load.These results indicate that ISTIRDA is a scalable and provably secure DAS scheme suitable for high-throughput,large-block public blockchains,substantially easing bandwidth and latency pressure on lightweight nodes.展开更多
The widespread usage of rechargeable batteries in portable devices,electric vehicles,and energy storage systems has underscored the importance for accurately predicting their lifetimes.However,data scarcity often limi...The widespread usage of rechargeable batteries in portable devices,electric vehicles,and energy storage systems has underscored the importance for accurately predicting their lifetimes.However,data scarcity often limits the accuracy of prediction models,which is escalated by the incompletion of data induced by the issues such as sensor failures.To address these challenges,we propose a novel approach to accommodate data insufficiency through achieving external information from incomplete data samples,which are usually discarded in existing studies.In order to fully unleash the prediction power of incomplete data,we have investigated the Multiple Imputation by Chained Equations(MICE)method that diversifies the training data through exploring the potential data patterns.The experimental results demonstrate that the proposed method significantly outperforms the baselines in the most considered scenarios while reducing the prediction root mean square error(RMSE)by up to 18.9%.Furthermore,we have also observed that the penetration of incomplete data benefits the explainability of the prediction model through facilitating the feature selection.展开更多
Dear Editor,This letter is concerned with stability analysis and stabilization design for sampled-data based load frequency control(LFC) systems via a data-driven method. By describing the dynamic behavior of LFC syst...Dear Editor,This letter is concerned with stability analysis and stabilization design for sampled-data based load frequency control(LFC) systems via a data-driven method. By describing the dynamic behavior of LFC systems based on a data-based representation, a stability criterion is derived to obtain the admissible maximum sampling interval(MSI) for a given controller and a design condition of the PI-type controller is further developed to meet the required MSI. Finally, the effectiveness of the proposed methods is verified by a case study.展开更多
Earthquakes are highly destructive spatio-temporal phenomena whose analysis is essential for disaster preparedness and risk mitigation.Modern seismological research produces vast volumes of heterogeneous data from sei...Earthquakes are highly destructive spatio-temporal phenomena whose analysis is essential for disaster preparedness and risk mitigation.Modern seismological research produces vast volumes of heterogeneous data from seismic networks,satellite observations,and geospatial repositories,creating the need for scalable infrastructures capable of integrating and analyzing such data to support intelligent decision-making.Data warehousing technologies provide a robust foundation for this purpose;however,existing earthquake-oriented data warehouses remain limited,often relying on simplified schemas,domain-specific analytics,or cataloguing efforts.This paper presents the design and implementation of a spatio-temporal data warehouse for seismic activity.The framework integrates spatial and temporal dimensions in a unified schema and introduces a novel array-based approach for managing many-to-many relationships between facts and dimensions without intermediate bridge tables.A comparative evaluation against a conventional bridge-table schema demonstrates that the array-based design improves fact-centric query performance,while the bridge-table schema remains advantageous for dimension-centric queries.To reconcile these trade-offs,a hybrid schema is proposed that retains both representations,ensuring balanced efficiency across heterogeneous workloads.The proposed framework demonstrates how spatio-temporal data warehousing can address schema complexity,improve query performance,and support multidimensional visualization.In doing so,it provides a foundation for integrating seismic analysis into broader big data-driven intelligent decision systems for disaster resilience,risk mitigation,and emergency management.展开更多
Portable ratiometric fluorescent platforms have emerged as promising tools for multifarious detection,yet remain unexplored for point-of-care monitoring doxorubicin(DOX),one of clinically antineoplastic drugs.To this ...Portable ratiometric fluorescent platforms have emerged as promising tools for multifarious detection,yet remain unexplored for point-of-care monitoring doxorubicin(DOX),one of clinically antineoplastic drugs.To this end,we herein develop a portable self-calibrating platform namely carbon dots(C-dots)-embedded hydrogel sensors with a smartphone-assisted high-throughput imaging device,for DOX sensing.The prepared green-emitting(λ_(em)=508 nm)and negatively-charged C-dots(−11.40±1.21 mV),which have sufficient spectral overlap with the absorption band of DOX(∼500 nm),can strongly bind with positively-charged DOX molecules by electrostatic attraction effects.As a result,DOX molecules are selectively and rapid(20 s)determined with a detection limit of 10.26 nmol/L via Förster resonance energy transfer processes,demonstrating a remarkably chromatic shift from green to red.Further integrated with a 3D-printed smartphone-assisted device,the platform enabled high-throughput quantification,achieving recoveries of 96.40%-101.85%in human urine/serum(RSDs<2.94%,n=3).Notably,the dual linear detection ranges of the platform align with the reported clinical DOX concentrations in urine and plasma(0-4 h post-administration),validating their capability for direct quantification of DOX in clinical samples without special pre-treatment processes.By virtue of attractive analytical performances and robust feasibility,this platform bridges laboratory precision and point-of-care testing needs,offering promising potential for personalized chemotherapy and multiplexed analyte screening.展开更多
Accurately assessing the relationship between tree growth and climatic factors is of great importance in dendrochronology.This study evaluated the consistency between alternative climate datasets(including station and...Accurately assessing the relationship between tree growth and climatic factors is of great importance in dendrochronology.This study evaluated the consistency between alternative climate datasets(including station and gridded data)and actual climate data(fixed-point observations near the sampling sites),in northeastern China’s warm temperate zone and analyzed differences in their correlations with tree-ring width index.The results were:(1)Gridded temperature data,as well as precipitation and relative humidity data from the Huailai meteorological station,was more consistent with the actual climate data;in contrast,gridded soil moisture content data showed significant discrepancies.(2)Horizontal distance had a greater impact on the representativeness of actual climate conditions than vertical elevation differences.(3)Differences in consistency between alternative and actual climate data also affected their correlations with tree-ring width indices.In some growing season months,correlation coefficients,both in magnitude and sign,differed significantly from those based on actual data.The selection of different alternative climate datasets can lead to biased results in assessing forest responses to climate change,which is detrimental to the management of forest ecosystems in harsh environments.Therefore,the scientific and rational selection of alternative climate data is essential for dendroecological and climatological research.展开更多
Photoacoustic-computed tomography is a novel imaging technique that combines high absorption contrast and deep tissue penetration capability,enabling comprehensive three-dimensional imaging of biological targets.Howev...Photoacoustic-computed tomography is a novel imaging technique that combines high absorption contrast and deep tissue penetration capability,enabling comprehensive three-dimensional imaging of biological targets.However,the increasing demand for higher resolution and real-time imaging results in significant data volume,limiting data storage,transmission and processing efficiency of system.Therefore,there is an urgent need for an effective method to compress the raw data without compromising image quality.This paper presents a photoacoustic-computed tomography 3D data compression method and system based on Wavelet-Transformer.This method is based on the cooperative compression framework that integrates wavelet hard coding with deep learning-based soft decoding.It combines the multiscale analysis capability of wavelet transforms with the global feature modeling advantage of Transformers,achieving high-quality data compression and reconstruction.Experimental results using k-wave simulation suggest that the proposed compression system has advantages under extreme compression conditions,achieving a raw data compression ratio of up to 1:40.Furthermore,three-dimensional data compression experiment using in vivo mouse demonstrated that the maximum peak signal-to-noise ratio(PSNR)and structural similarity index(SSIM)values of reconstructed images reached 38.60 and 0.9583,effectively overcoming detail loss and artifacts introduced by raw data compression.All the results suggest that the proposed system can significantly reduce storage requirements and hardware cost,enhancing computational efficiency and image quality.These advantages support the development of photoacoustic-computed tomography toward higher efficiency,real-time performance and intelligent functionality.展开更多
Amid the increasing demand for data sharing,the need for flexible,secure,and auditable access control mechanisms has garnered significant attention in the academic community.However,blockchain-based ciphertextpolicy a...Amid the increasing demand for data sharing,the need for flexible,secure,and auditable access control mechanisms has garnered significant attention in the academic community.However,blockchain-based ciphertextpolicy attribute-based encryption(CP-ABE)schemes still face cumbersome ciphertext re-encryption and insufficient oversight when handling dynamic attribute changes and cross-chain collaboration.To address these issues,we propose a dynamic permission attribute-encryption scheme for multi-chain collaboration.This scheme incorporates a multiauthority architecture for distributed attribute management and integrates an attribute revocation and granting mechanism that eliminates the need for ciphertext re-encryption,effectively reducing both computational and communication overhead.It leverages the InterPlanetary File System(IPFS)for off-chain data storage and constructs a cross-chain regulatory framework—comprising a Hyperledger Fabric business chain and a FISCO BCOS regulatory chain—to record changes in decryption privileges and access behaviors in an auditable manner.Security analysis shows selective indistinguishability under chosen-plaintext attack(sIND-CPA)security under the decisional q-Parallel Bilinear Diffie-Hellman Exponent Assumption(q-PBDHE).In the performance and experimental evaluations,we compared the proposed scheme with several advanced schemes.The results show that,while preserving security,the proposed scheme achieves higher encryption/decryption efficiency and lower storage overhead for ciphertexts and keys.展开更多
With the popularization of new technologies,telephone fraud has become the main means of stealing money and personal identity information.Taking inspiration from the website authentication mechanism,we propose an end-...With the popularization of new technologies,telephone fraud has become the main means of stealing money and personal identity information.Taking inspiration from the website authentication mechanism,we propose an end-to-end datamodem scheme that transmits the caller’s digital certificates through a voice channel for the recipient to verify the caller’s identity.Encoding useful information through voice channels is very difficult without the assistance of telecommunications providers.For example,speech activity detection may quickly classify encoded signals as nonspeech signals and reject input waveforms.To address this issue,we propose a novel modulation method based on linear frequency modulation that encodes 3 bits per symbol by varying its frequency,shape,and phase,alongside a lightweightMobileNetV3-Small-based demodulator for efficient and accurate signal decoding on resource-constrained devices.This method leverages the unique characteristics of linear frequency modulation signals,making them more easily transmitted and decoded in speech channels.To ensure reliable data delivery over unstable voice links,we further introduce a robust framing scheme with delimiter-based synchronization,a sample-level position remedying algorithm,and a feedback-driven retransmission mechanism.We have validated the feasibility and performance of our system through expanded real-world evaluations,demonstrating that it outperforms existing advanced methods in terms of robustness and data transfer rate.This technology establishes the foundational infrastructure for reliable certificate delivery over voice channels,which is crucial for achieving strong caller authentication and preventing telephone fraud at its root cause.展开更多
Missing data presents a crucial challenge in data analysis,especially in high-dimensional datasets,where missing data often leads to biased conclusions and degraded model performance.In this study,we present a novel a...Missing data presents a crucial challenge in data analysis,especially in high-dimensional datasets,where missing data often leads to biased conclusions and degraded model performance.In this study,we present a novel autoencoder-based imputation framework that integrates a composite loss function to enhance robustness and precision.The proposed loss combines(i)a guided,masked mean squared error focusing on missing entries;(ii)a noise-aware regularization term to improve resilience against data corruption;and(iii)a variance penalty to encourage expressive yet stable reconstructions.We evaluate the proposed model across four missingness mechanisms,such as Missing Completely at Random,Missing at Random,Missing Not at Random,and Missing Not at Random with quantile censorship,under systematically varied feature counts,sample sizes,and missingness ratios ranging from 5%to 60%.Four publicly available real-world datasets(Stroke Prediction,Pima Indians Diabetes,Cardiovascular Disease,and Framingham Heart Study)were used,and the obtained results show that our proposed model consistently outperforms baseline methods,including traditional and deep learning-based techniques.An ablation study reveals the additive value of each component in the loss function.Additionally,we assessed the downstream utility of imputed data through classification tasks,where datasets imputed by the proposed method yielded the highest receiver operating characteristic area under the curve scores across all scenarios.The model demonstrates strong scalability and robustness,improving performance with larger datasets and higher feature counts.These results underscore the capacity of the proposed method to produce not only numerically accurate but also semantically useful imputations,making it a promising solution for robust data recovery in clinical applications.展开更多
With the accelerating aging process of China’s population,the demand for community elderly care services has shown diversified and personalized characteristics.However,problems such as insufficient total care service...With the accelerating aging process of China’s population,the demand for community elderly care services has shown diversified and personalized characteristics.However,problems such as insufficient total care service resources,uneven distribution,and prominent supply-demand contradictions have seriously affected service quality.Big data technology,with core advantages including data collection,analysis and mining,and accurate prediction,provides a new solution for the allocation of community elderly care service resources.This paper systematically studies the application value of big data technology in the allocation of community elderly care service resources from three aspects:resource allocation efficiency,service accuracy,and management intelligence.Combined with practical needs,it proposes optimal allocation strategies such as building a big data analysis platform and accurately grasping the elderly’s care needs,striving to provide operable path references for the construction of community elderly care service systems,promoting the early realization of the elderly care service goal of“adequate support and proper care for the elderly”,and boosting the high-quality development of China’s elderly care service industry.展开更多
As one of the major volatile components in extraterrestrial materials,nitrogen(N_(2))isotopes serve not only as tracers for the formation and evolution of the solar system,but also play a critical role in assessing pl...As one of the major volatile components in extraterrestrial materials,nitrogen(N_(2))isotopes serve not only as tracers for the formation and evolution of the solar system,but also play a critical role in assessing planetary habitability and the search for extraterrestrial life.The integrated measurement of N_(2)and argon(Ar)isotopes by using noble gas mass spectrometry represents a state-of-the-art technique for such investigations.To support the growing demands of planetary science research in China,we have developed a high-efficiency,high-precision method for the integrated analysis of N_(2)and Ar isotopes.This was achieved by enhancing gas extraction and purification systems and integrating them with a static noble gas mass spectrometer.This method enables integrated N_(2)-Ar isotope measurements on submilligram samples,significantly improving sample utilization and reducing the impact of sample heterogeneity on volatile analysis.The system integrates CO_(2)laser heating,a modular two-stage Zr-Al getter pump,and a CuO furnace-based purification process,effectively reducing background levels(N_(2)blank as low as 0.35×10^(−6)cubic centimeters at standard temperature and pressure[ccSTP]).Analytical precision is ensured through calibration with atmospheric air and CO corrections.To validate the reliability of the method,we performed N_(2)-Ar isotope analyses on the Allende carbonaceous chondrite,one of the most extensively studied meteorites internationally.The measured N_(2)concentrations range from 19.2 to 29.8 ppm,withδ15N values between−44.8‰and−33.0‰.Concentrations of 40Ar,36Ar,and 38Ar are(12.5-21.1)×10^(−6)ccSTP/g,(90.9-150.3)×10^(−9)ccSTP/g,and(19.2-30.7)×10^(−9)ccSTP/g,respectively.These values correspond to cosmic-ray exposure ages of 4.5-5.7 Ma,consistent with previous reports.Step-heating experiments further reveal distinct release patterns of N and Ar isotopes,as well as their associations with specific mineral phases in the meteorite.In summary,the combined N_(2)-Ar isotopic system offers significant advantages for tracing volatile sources in extraterrestrial materials and will provide essential analytical support for upcoming Chinese planetary missions,such as Tianwen-2.展开更多
Multivariate anomaly detection plays a critical role in maintaining the stable operation of information systems.However,in existing research,multivariate data are often influenced by various factors during the data co...Multivariate anomaly detection plays a critical role in maintaining the stable operation of information systems.However,in existing research,multivariate data are often influenced by various factors during the data collection process,resulting in temporal misalignment or displacement.Due to these factors,the node representations carry substantial noise,which reduces the adaptability of the multivariate coupled network structure and subsequently degrades anomaly detection performance.Accordingly,this study proposes a novel multivariate anomaly detection model grounded in graph structure learning.Firstly,a recommendation strategy is employed to identify strongly coupled variable pairs,which are then used to construct a recommendation-driven multivariate coupling network.Secondly,a multi-channel graph encoding layer is used to dynamically optimize the structural properties of the multivariate coupling network,while a multi-head attention mechanism enhances the spatial characteristics of the multivariate data.Finally,unsupervised anomaly detection is conducted using a dynamic threshold selection algorithm.Experimental results demonstrate that effectively integrating the structural and spatial features of multivariate data significantly mitigates anomalies caused by temporal dependency misalignment.展开更多
As an important resource in data link,time slots should be strategically allocated to enhance transmission efficiency and resist eavesdropping,especially considering the tremendous increase in the number of nodes and ...As an important resource in data link,time slots should be strategically allocated to enhance transmission efficiency and resist eavesdropping,especially considering the tremendous increase in the number of nodes and diverse communication needs.It is crucial to design control sequences with robust randomness and conflict-freeness to properly address differentiated access control in data link.In this paper,we propose a hierarchical access control scheme based on control sequences to achieve high utilization of time slots and differentiated access control.A theoretical bound of the hierarchical control sequence set is derived to characterize the constraints on the parameters of the sequence set.Moreover,two classes of optimal hierarchical control sequence sets satisfying the theoretical bound are constructed,both of which enable the scheme to achieve maximum utilization of time slots.Compared with the fixed time slot allocation scheme,our scheme reduces the symbol error rate by up to 9%,which indicates a significant improvement in anti-interference and eavesdropping capabilities.展开更多
Data center industries have been facing huge energy challenges due to escalating power consumption and associated carbon emissions.In the context of carbon neutrality,the integration of data centers with renewable ene...Data center industries have been facing huge energy challenges due to escalating power consumption and associated carbon emissions.In the context of carbon neutrality,the integration of data centers with renewable energy has become a prevailing trend.To advance the renewable energy integration in data centers,it is imperative to thoroughly explore the data centers’operational flexibility.Computing workloads and refrigeration systems are recognized as two promising flexible resources for power regulationwithin data centermicro-grids.This paper identifies and categorizes delay-tolerant computing workloads into three types(long-running non-interruptible,long-running interruptible,and short-running)and develops mathematical time-shifting models for each.Additionally,this paper examines the thermal dynamics of the computer room and derives a time-varying temperature model coupled to refrigeration power.Building on these models,this paper proposes a two-stage,multi-time scale optimization scheduling framework that jointly coordinates computing workloads time-shift in day-ahead scheduling and refrigeration power control in intra-day dispatch to mitigate renewable variability.A case study demonstrates that the framework effectively enhances the renewable-energy utilization,improves the operational economy of the data center microgrid,and mitigates the impact of renewable power uncertainty.The results highlight the potential of coordinated computing workloads and thermal system flexibility to support greener,more cost-effective data center operation.展开更多
Modern intrusion detection systems(MIDS)face persistent challenges in coping with the rapid evolution of cyber threats,high-volume network traffic,and imbalanced datasets.Traditional models often lack the robustness a...Modern intrusion detection systems(MIDS)face persistent challenges in coping with the rapid evolution of cyber threats,high-volume network traffic,and imbalanced datasets.Traditional models often lack the robustness and explainability required to detect novel and sophisticated attacks effectively.This study introduces an advanced,explainable machine learning framework for multi-class IDS using the KDD99 and IDS datasets,which reflects real-world network behavior through a blend of normal and diverse attack classes.The methodology begins with sophisticated data preprocessing,incorporating both RobustScaler and QuantileTransformer to address outliers and skewed feature distributions,ensuring standardized and model-ready inputs.Critical dimensionality reduction is achieved via the Harris Hawks Optimization(HHO)algorithm—a nature-inspired metaheuristic modeled on hawks’hunting strategies.HHO efficiently identifies the most informative features by optimizing a fitness function based on classification performance.Following feature selection,the SMOTE is applied to the training data to resolve class imbalance by synthetically augmenting underrepresented attack types.The stacked architecture is then employed,combining the strengths of XGBoost,SVM,and RF as base learners.This layered approach improves prediction robustness and generalization by balancing bias and variance across diverse classifiers.The model was evaluated using standard classification metrics:precision,recall,F1-score,and overall accuracy.The best overall performance was recorded with an accuracy of 99.44%for UNSW-NB15,demonstrating the model’s effectiveness.After balancing,the model demonstrated a clear improvement in detecting the attacks.We tested the model on four datasets to show the effectiveness of the proposed approach and performed the ablation study to check the effect of each parameter.Also,the proposed model is computationaly efficient.To support transparency and trust in decision-making,explainable AI(XAI)techniques are incorporated that provides both global and local insight into feature contributions,and offers intuitive visualizations for individual predictions.This makes it suitable for practical deployment in cybersecurity environments that demand both precision and accountability.展开更多
基金funding support from the National Science and Technology Council(NSTC),under Grant No.114-2410-H-011-026-MY3.
文摘Small datasets are often challenging due to their limited sample size.This research introduces a novel solution to these problems:average linkage virtual sample generation(ALVSG).ALVSG leverages the underlying data structure to create virtual samples,which can be used to augment the original dataset.The ALVSG process consists of two steps.First,an average-linkage clustering technique is applied to the dataset to create a dendrogram.The dendrogram represents the hierarchical structure of the dataset,with each merging operation regarded as a linkage.Next,the linkages are combined into an average-based dataset,which serves as a new representation of the dataset.The second step in the ALVSG process involves generating virtual samples using the average-based dataset.The research project generates a set of 100 virtual samples by uniformly distributing them within the provided boundary.These virtual samples are then added to the original dataset,creating a more extensive dataset with improved generalization performance.The efficacy of the ALVSG approach is validated through resampling experiments and t-tests conducted on two small real-world datasets.The experiments are conducted on three forecasting models:the support vector machine for regression(SVR),the deep learning model(DL),and XGBoost.The results show that the ALVSG approach outperforms the baseline methods in terms of mean square error(MSE),root mean square error(RMSE),and mean absolute error(MAE).
基金the Key National Natural Science Foundation of China(No.U1864211)the National Natural Science Foundation of China(No.11772191)the Natural Science Foundation of Shanghai(No.21ZR1431500)。
文摘Industrial data mining usually deals with data from different sources.These heterogeneous datasets describe the same object in different views.However,samples from some of the datasets may be lost.Then the remaining samples do not correspond one-to-one correctly.Mismatched datasets caused by missing samples make the industrial data unavailable for further machine learning.In order to align the mismatched samples,this article presents a cooperative iteration matching method(CIMM)based on the modified dynamic time warping(DTW).The proposed method regards the sequentially accumulated industrial data as the time series.Mismatched samples are aligned by the DTW.In addition,dynamic constraints are applied to the warping distance of the DTW process to make the alignment more efficient.Then a series of models are trained with the cumulated samples iteratively.Several groups of numerical experiments on different missing patterns and missing locations are designed and analyzed to prove the effectiveness and the applicability of the proposed method.
文摘In the era of big data,traditional statistical inference methods are faced with great challenges.Taking the two-sample distribution test scenario of big data as an example,this paper proposes the BB-KS test based on m out of n bootstrap for solving a single-machine memory and computing constraints.It is verified to the feasibility and effectiveness of the proposed test method through theoretical analysis and numerical simulation.The results show that the BB-KS test can improve the calculation efficiency of the test to a certain extent in the single machine scenario.
基金supported by the Institute of Information&Communications Technology Planning&Evaluation(IITP)grant funded by the Korea government(MSIT)(No.RS-2023-00235509Development of security monitoring technology based network behavior against encrypted cyber threats in ICT convergence environment).
文摘With the increasing emphasis on personal information protection,encryption through security protocols has emerged as a critical requirement in data transmission and reception processes.Nevertheless,IoT ecosystems comprise heterogeneous networks where outdated systems coexist with the latest devices,spanning a range of devices from non-encrypted ones to fully encrypted ones.Given the limited visibility into payloads in this context,this study investigates AI-based attack detection methods that leverage encrypted traffic metadata,eliminating the need for decryption and minimizing system performance degradation—especially in light of these heterogeneous devices.Using the UNSW-NB15 and CICIoT-2023 dataset,encrypted and unencrypted traffic were categorized according to security protocol,and AI-based intrusion detection experiments were conducted for each traffic type based on metadata.To mitigate the problem of class imbalance,eight different data sampling techniques were applied.The effectiveness of these sampling techniques was then comparatively analyzed using two ensemble models and three Deep Learning(DL)models from various perspectives.The experimental results confirmed that metadata-based attack detection is feasible using only encrypted traffic.In the UNSW-NB15 dataset,the f1-score of encrypted traffic was approximately 0.98,which is 4.3%higher than that of unencrypted traffic(approximately 0.94).In addition,analysis of the encrypted traffic in the CICIoT-2023 dataset using the same method showed a significantly lower f1-score of roughly 0.43,indicating that the quality of the dataset and the preprocessing approach have a substantial impact on detection performance.Furthermore,when data sampling techniques were applied to encrypted traffic,the recall in the UNSWNB15(Encrypted)dataset improved by up to 23.0%,and in the CICIoT-2023(Encrypted)dataset by 20.26%,showing a similar level of improvement.Notably,in CICIoT-2023,f1-score and Receiver Operation Characteristic-Area Under the Curve(ROC-AUC)increased by 59.0%and 55.94%,respectively.These results suggest that data sampling can have a positive effect even in encrypted environments.However,the extent of the improvement may vary depending on data quality,model architecture,and sampling strategy.
基金supported in part by the Research Fund of Key Lab of Education Blockchain and Intelligent Technology,Ministry of Education(EBME25-F-08).
文摘Lightweight nodes are crucial for blockchain scalability,but verifying the availability of complete block data puts significant strain on bandwidth and latency.Existing data availability sampling(DAS)schemes either require trusted setups or suffer from high communication overhead and low verification efficiency.This paper presents ISTIRDA,a DAS scheme that lets light clients certify availability by sampling small random codeword symbols.Built on ISTIR,an improved Reed–Solomon interactive oracle proof of proximity,ISTIRDA combines adaptive folding with dynamic code rate adjustment to preserve soundness while lowering communication.This paper formalizes opening consistency and prove security with bounded error in the random oracle model,giving polylogarithmic verifier queries and no trusted setup.In a prototype compared with FRIDA under equal soundness,ISTIRDA reduces communication by 40.65%to 80%.For data larger than 16 MB,ISTIRDA verifies faster and the advantage widens;at 128 MB,proofs are about 60%smaller and verification time is roughly 25%shorter,while prover overhead remains modest.In peer-to-peer emulation under injected latency and loss,ISTIRDA reaches confidence more quickly and is less sensitive to packet loss and load.These results indicate that ISTIRDA is a scalable and provably secure DAS scheme suitable for high-throughput,large-block public blockchains,substantially easing bandwidth and latency pressure on lightweight nodes.
文摘The widespread usage of rechargeable batteries in portable devices,electric vehicles,and energy storage systems has underscored the importance for accurately predicting their lifetimes.However,data scarcity often limits the accuracy of prediction models,which is escalated by the incompletion of data induced by the issues such as sensor failures.To address these challenges,we propose a novel approach to accommodate data insufficiency through achieving external information from incomplete data samples,which are usually discarded in existing studies.In order to fully unleash the prediction power of incomplete data,we have investigated the Multiple Imputation by Chained Equations(MICE)method that diversifies the training data through exploring the potential data patterns.The experimental results demonstrate that the proposed method significantly outperforms the baselines in the most considered scenarios while reducing the prediction root mean square error(RMSE)by up to 18.9%.Furthermore,we have also observed that the penetration of incomplete data benefits the explainability of the prediction model through facilitating the feature selection.
基金supported in part by the National Natural Science Foundation of China(62373337,62373333)the 111 Project(B17040)State Key Laboratory of Advanced Electromagnetic Technology(2024KF002)
文摘Dear Editor,This letter is concerned with stability analysis and stabilization design for sampled-data based load frequency control(LFC) systems via a data-driven method. By describing the dynamic behavior of LFC systems based on a data-based representation, a stability criterion is derived to obtain the admissible maximum sampling interval(MSI) for a given controller and a design condition of the PI-type controller is further developed to meet the required MSI. Finally, the effectiveness of the proposed methods is verified by a case study.
文摘Earthquakes are highly destructive spatio-temporal phenomena whose analysis is essential for disaster preparedness and risk mitigation.Modern seismological research produces vast volumes of heterogeneous data from seismic networks,satellite observations,and geospatial repositories,creating the need for scalable infrastructures capable of integrating and analyzing such data to support intelligent decision-making.Data warehousing technologies provide a robust foundation for this purpose;however,existing earthquake-oriented data warehouses remain limited,often relying on simplified schemas,domain-specific analytics,or cataloguing efforts.This paper presents the design and implementation of a spatio-temporal data warehouse for seismic activity.The framework integrates spatial and temporal dimensions in a unified schema and introduces a novel array-based approach for managing many-to-many relationships between facts and dimensions without intermediate bridge tables.A comparative evaluation against a conventional bridge-table schema demonstrates that the array-based design improves fact-centric query performance,while the bridge-table schema remains advantageous for dimension-centric queries.To reconcile these trade-offs,a hybrid schema is proposed that retains both representations,ensuring balanced efficiency across heterogeneous workloads.The proposed framework demonstrates how spatio-temporal data warehousing can address schema complexity,improve query performance,and support multidimensional visualization.In doing so,it provides a foundation for integrating seismic analysis into broader big data-driven intelligent decision systems for disaster resilience,risk mitigation,and emergency management.
基金supported by the National NaturalScience Foundation of China(No.22274001)the Key Project of Natural Science Research of the Education Department of Anhui Province(No.2022AH051032)the Excellent Research and Innovation Team of Universities in Anhui Province(No.2024AH010016).
文摘Portable ratiometric fluorescent platforms have emerged as promising tools for multifarious detection,yet remain unexplored for point-of-care monitoring doxorubicin(DOX),one of clinically antineoplastic drugs.To this end,we herein develop a portable self-calibrating platform namely carbon dots(C-dots)-embedded hydrogel sensors with a smartphone-assisted high-throughput imaging device,for DOX sensing.The prepared green-emitting(λ_(em)=508 nm)and negatively-charged C-dots(−11.40±1.21 mV),which have sufficient spectral overlap with the absorption band of DOX(∼500 nm),can strongly bind with positively-charged DOX molecules by electrostatic attraction effects.As a result,DOX molecules are selectively and rapid(20 s)determined with a detection limit of 10.26 nmol/L via Förster resonance energy transfer processes,demonstrating a remarkably chromatic shift from green to red.Further integrated with a 3D-printed smartphone-assisted device,the platform enabled high-throughput quantification,achieving recoveries of 96.40%-101.85%in human urine/serum(RSDs<2.94%,n=3).Notably,the dual linear detection ranges of the platform align with the reported clinical DOX concentrations in urine and plasma(0-4 h post-administration),validating their capability for direct quantification of DOX in clinical samples without special pre-treatment processes.By virtue of attractive analytical performances and robust feasibility,this platform bridges laboratory precision and point-of-care testing needs,offering promising potential for personalized chemotherapy and multiplexed analyte screening.
基金supported by the International Partnership program of the Chinese Academy of Sciences(170GJHZ2023074GC)National Natural Science Foundation of China(42425706 and 42488201)+1 种基金National Key Research and Development Program of China(2024YFF0807902)Beijing Natural Science Foundation(8242041),and China Postdoctoral Science Foundation(2025M770353).
文摘Accurately assessing the relationship between tree growth and climatic factors is of great importance in dendrochronology.This study evaluated the consistency between alternative climate datasets(including station and gridded data)and actual climate data(fixed-point observations near the sampling sites),in northeastern China’s warm temperate zone and analyzed differences in their correlations with tree-ring width index.The results were:(1)Gridded temperature data,as well as precipitation and relative humidity data from the Huailai meteorological station,was more consistent with the actual climate data;in contrast,gridded soil moisture content data showed significant discrepancies.(2)Horizontal distance had a greater impact on the representativeness of actual climate conditions than vertical elevation differences.(3)Differences in consistency between alternative and actual climate data also affected their correlations with tree-ring width indices.In some growing season months,correlation coefficients,both in magnitude and sign,differed significantly from those based on actual data.The selection of different alternative climate datasets can lead to biased results in assessing forest responses to climate change,which is detrimental to the management of forest ecosystems in harsh environments.Therefore,the scientific and rational selection of alternative climate data is essential for dendroecological and climatological research.
基金supported by the National Key R&D Program of China[Grant No.2023YFF0713600]the National Natural Science Foundation of China[Grant No.62275062]+3 种基金Project of Shandong Innovation and Startup Community of High-end Medical Apparatus and Instruments[Grant No.2023-SGTTXM-002 and 2024-SGTTXM-005]the Shandong Province Technology Innovation Guidance Plan(Central Leading Local Science and Technology Development Fund)[Grant No.YDZX2023115]the Taishan Scholar Special Funding Project of Shandong Provincethe Shandong Laboratory of Advanced Biomaterials and Medical Devices in Weihai[Grant No.ZL202402].
文摘Photoacoustic-computed tomography is a novel imaging technique that combines high absorption contrast and deep tissue penetration capability,enabling comprehensive three-dimensional imaging of biological targets.However,the increasing demand for higher resolution and real-time imaging results in significant data volume,limiting data storage,transmission and processing efficiency of system.Therefore,there is an urgent need for an effective method to compress the raw data without compromising image quality.This paper presents a photoacoustic-computed tomography 3D data compression method and system based on Wavelet-Transformer.This method is based on the cooperative compression framework that integrates wavelet hard coding with deep learning-based soft decoding.It combines the multiscale analysis capability of wavelet transforms with the global feature modeling advantage of Transformers,achieving high-quality data compression and reconstruction.Experimental results using k-wave simulation suggest that the proposed compression system has advantages under extreme compression conditions,achieving a raw data compression ratio of up to 1:40.Furthermore,three-dimensional data compression experiment using in vivo mouse demonstrated that the maximum peak signal-to-noise ratio(PSNR)and structural similarity index(SSIM)values of reconstructed images reached 38.60 and 0.9583,effectively overcoming detail loss and artifacts introduced by raw data compression.All the results suggest that the proposed system can significantly reduce storage requirements and hardware cost,enhancing computational efficiency and image quality.These advantages support the development of photoacoustic-computed tomography toward higher efficiency,real-time performance and intelligent functionality.
文摘Amid the increasing demand for data sharing,the need for flexible,secure,and auditable access control mechanisms has garnered significant attention in the academic community.However,blockchain-based ciphertextpolicy attribute-based encryption(CP-ABE)schemes still face cumbersome ciphertext re-encryption and insufficient oversight when handling dynamic attribute changes and cross-chain collaboration.To address these issues,we propose a dynamic permission attribute-encryption scheme for multi-chain collaboration.This scheme incorporates a multiauthority architecture for distributed attribute management and integrates an attribute revocation and granting mechanism that eliminates the need for ciphertext re-encryption,effectively reducing both computational and communication overhead.It leverages the InterPlanetary File System(IPFS)for off-chain data storage and constructs a cross-chain regulatory framework—comprising a Hyperledger Fabric business chain and a FISCO BCOS regulatory chain—to record changes in decryption privileges and access behaviors in an auditable manner.Security analysis shows selective indistinguishability under chosen-plaintext attack(sIND-CPA)security under the decisional q-Parallel Bilinear Diffie-Hellman Exponent Assumption(q-PBDHE).In the performance and experimental evaluations,we compared the proposed scheme with several advanced schemes.The results show that,while preserving security,the proposed scheme achieves higher encryption/decryption efficiency and lower storage overhead for ciphertexts and keys.
文摘With the popularization of new technologies,telephone fraud has become the main means of stealing money and personal identity information.Taking inspiration from the website authentication mechanism,we propose an end-to-end datamodem scheme that transmits the caller’s digital certificates through a voice channel for the recipient to verify the caller’s identity.Encoding useful information through voice channels is very difficult without the assistance of telecommunications providers.For example,speech activity detection may quickly classify encoded signals as nonspeech signals and reject input waveforms.To address this issue,we propose a novel modulation method based on linear frequency modulation that encodes 3 bits per symbol by varying its frequency,shape,and phase,alongside a lightweightMobileNetV3-Small-based demodulator for efficient and accurate signal decoding on resource-constrained devices.This method leverages the unique characteristics of linear frequency modulation signals,making them more easily transmitted and decoded in speech channels.To ensure reliable data delivery over unstable voice links,we further introduce a robust framing scheme with delimiter-based synchronization,a sample-level position remedying algorithm,and a feedback-driven retransmission mechanism.We have validated the feasibility and performance of our system through expanded real-world evaluations,demonstrating that it outperforms existing advanced methods in terms of robustness and data transfer rate.This technology establishes the foundational infrastructure for reliable certificate delivery over voice channels,which is crucial for achieving strong caller authentication and preventing telephone fraud at its root cause.
文摘Missing data presents a crucial challenge in data analysis,especially in high-dimensional datasets,where missing data often leads to biased conclusions and degraded model performance.In this study,we present a novel autoencoder-based imputation framework that integrates a composite loss function to enhance robustness and precision.The proposed loss combines(i)a guided,masked mean squared error focusing on missing entries;(ii)a noise-aware regularization term to improve resilience against data corruption;and(iii)a variance penalty to encourage expressive yet stable reconstructions.We evaluate the proposed model across four missingness mechanisms,such as Missing Completely at Random,Missing at Random,Missing Not at Random,and Missing Not at Random with quantile censorship,under systematically varied feature counts,sample sizes,and missingness ratios ranging from 5%to 60%.Four publicly available real-world datasets(Stroke Prediction,Pima Indians Diabetes,Cardiovascular Disease,and Framingham Heart Study)were used,and the obtained results show that our proposed model consistently outperforms baseline methods,including traditional and deep learning-based techniques.An ablation study reveals the additive value of each component in the loss function.Additionally,we assessed the downstream utility of imputed data through classification tasks,where datasets imputed by the proposed method yielded the highest receiver operating characteristic area under the curve scores across all scenarios.The model demonstrates strong scalability and robustness,improving performance with larger datasets and higher feature counts.These results underscore the capacity of the proposed method to produce not only numerically accurate but also semantically useful imputations,making it a promising solution for robust data recovery in clinical applications.
文摘With the accelerating aging process of China’s population,the demand for community elderly care services has shown diversified and personalized characteristics.However,problems such as insufficient total care service resources,uneven distribution,and prominent supply-demand contradictions have seriously affected service quality.Big data technology,with core advantages including data collection,analysis and mining,and accurate prediction,provides a new solution for the allocation of community elderly care service resources.This paper systematically studies the application value of big data technology in the allocation of community elderly care service resources from three aspects:resource allocation efficiency,service accuracy,and management intelligence.Combined with practical needs,it proposes optimal allocation strategies such as building a big data analysis platform and accurately grasping the elderly’s care needs,striving to provide operable path references for the construction of community elderly care service systems,promoting the early realization of the elderly care service goal of“adequate support and proper care for the elderly”,and boosting the high-quality development of China’s elderly care service industry.
基金supported by the Bureau of Frontier Sciences and Basic Research,Chinese Academy of Sciences(Grant No.QYJ-2025-0103)the National Natural Science Foundation of China(Grant Nos.42441834,42241105,42441825,and 42203048)the Key Research Program of the Institute of Geology and Geophysics,Chinese Academy of Sciences(Grant No.IGGCAS-202401).
文摘As one of the major volatile components in extraterrestrial materials,nitrogen(N_(2))isotopes serve not only as tracers for the formation and evolution of the solar system,but also play a critical role in assessing planetary habitability and the search for extraterrestrial life.The integrated measurement of N_(2)and argon(Ar)isotopes by using noble gas mass spectrometry represents a state-of-the-art technique for such investigations.To support the growing demands of planetary science research in China,we have developed a high-efficiency,high-precision method for the integrated analysis of N_(2)and Ar isotopes.This was achieved by enhancing gas extraction and purification systems and integrating them with a static noble gas mass spectrometer.This method enables integrated N_(2)-Ar isotope measurements on submilligram samples,significantly improving sample utilization and reducing the impact of sample heterogeneity on volatile analysis.The system integrates CO_(2)laser heating,a modular two-stage Zr-Al getter pump,and a CuO furnace-based purification process,effectively reducing background levels(N_(2)blank as low as 0.35×10^(−6)cubic centimeters at standard temperature and pressure[ccSTP]).Analytical precision is ensured through calibration with atmospheric air and CO corrections.To validate the reliability of the method,we performed N_(2)-Ar isotope analyses on the Allende carbonaceous chondrite,one of the most extensively studied meteorites internationally.The measured N_(2)concentrations range from 19.2 to 29.8 ppm,withδ15N values between−44.8‰and−33.0‰.Concentrations of 40Ar,36Ar,and 38Ar are(12.5-21.1)×10^(−6)ccSTP/g,(90.9-150.3)×10^(−9)ccSTP/g,and(19.2-30.7)×10^(−9)ccSTP/g,respectively.These values correspond to cosmic-ray exposure ages of 4.5-5.7 Ma,consistent with previous reports.Step-heating experiments further reveal distinct release patterns of N and Ar isotopes,as well as their associations with specific mineral phases in the meteorite.In summary,the combined N_(2)-Ar isotopic system offers significant advantages for tracing volatile sources in extraterrestrial materials and will provide essential analytical support for upcoming Chinese planetary missions,such as Tianwen-2.
基金supported by Natural Science Foundation of Qinghai Province(2025-ZJ-994M)Scientific Research Innovation Capability Support Project for Young Faculty(SRICSPYF-BS2025007)National Natural Science Foundation of China(62566050).
文摘Multivariate anomaly detection plays a critical role in maintaining the stable operation of information systems.However,in existing research,multivariate data are often influenced by various factors during the data collection process,resulting in temporal misalignment or displacement.Due to these factors,the node representations carry substantial noise,which reduces the adaptability of the multivariate coupled network structure and subsequently degrades anomaly detection performance.Accordingly,this study proposes a novel multivariate anomaly detection model grounded in graph structure learning.Firstly,a recommendation strategy is employed to identify strongly coupled variable pairs,which are then used to construct a recommendation-driven multivariate coupling network.Secondly,a multi-channel graph encoding layer is used to dynamically optimize the structural properties of the multivariate coupling network,while a multi-head attention mechanism enhances the spatial characteristics of the multivariate data.Finally,unsupervised anomaly detection is conducted using a dynamic threshold selection algorithm.Experimental results demonstrate that effectively integrating the structural and spatial features of multivariate data significantly mitigates anomalies caused by temporal dependency misalignment.
基金supported by the National Science Foundation of China(No.62171387)the Science and Technology Program of Sichuan Province(No.2024NSFSC0468)the China Postdoctoral Science Foundation(No.2019M663475).
文摘As an important resource in data link,time slots should be strategically allocated to enhance transmission efficiency and resist eavesdropping,especially considering the tremendous increase in the number of nodes and diverse communication needs.It is crucial to design control sequences with robust randomness and conflict-freeness to properly address differentiated access control in data link.In this paper,we propose a hierarchical access control scheme based on control sequences to achieve high utilization of time slots and differentiated access control.A theoretical bound of the hierarchical control sequence set is derived to characterize the constraints on the parameters of the sequence set.Moreover,two classes of optimal hierarchical control sequence sets satisfying the theoretical bound are constructed,both of which enable the scheme to achieve maximum utilization of time slots.Compared with the fixed time slot allocation scheme,our scheme reduces the symbol error rate by up to 9%,which indicates a significant improvement in anti-interference and eavesdropping capabilities.
基金supported by Science and Technology Standard Project of Guangdong Electric Power Design Institute(ER11301W,ER11811W).
文摘Data center industries have been facing huge energy challenges due to escalating power consumption and associated carbon emissions.In the context of carbon neutrality,the integration of data centers with renewable energy has become a prevailing trend.To advance the renewable energy integration in data centers,it is imperative to thoroughly explore the data centers’operational flexibility.Computing workloads and refrigeration systems are recognized as two promising flexible resources for power regulationwithin data centermicro-grids.This paper identifies and categorizes delay-tolerant computing workloads into three types(long-running non-interruptible,long-running interruptible,and short-running)and develops mathematical time-shifting models for each.Additionally,this paper examines the thermal dynamics of the computer room and derives a time-varying temperature model coupled to refrigeration power.Building on these models,this paper proposes a two-stage,multi-time scale optimization scheduling framework that jointly coordinates computing workloads time-shift in day-ahead scheduling and refrigeration power control in intra-day dispatch to mitigate renewable variability.A case study demonstrates that the framework effectively enhances the renewable-energy utilization,improves the operational economy of the data center microgrid,and mitigates the impact of renewable power uncertainty.The results highlight the potential of coordinated computing workloads and thermal system flexibility to support greener,more cost-effective data center operation.
基金funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number(PNURSP2025R104)Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.
文摘Modern intrusion detection systems(MIDS)face persistent challenges in coping with the rapid evolution of cyber threats,high-volume network traffic,and imbalanced datasets.Traditional models often lack the robustness and explainability required to detect novel and sophisticated attacks effectively.This study introduces an advanced,explainable machine learning framework for multi-class IDS using the KDD99 and IDS datasets,which reflects real-world network behavior through a blend of normal and diverse attack classes.The methodology begins with sophisticated data preprocessing,incorporating both RobustScaler and QuantileTransformer to address outliers and skewed feature distributions,ensuring standardized and model-ready inputs.Critical dimensionality reduction is achieved via the Harris Hawks Optimization(HHO)algorithm—a nature-inspired metaheuristic modeled on hawks’hunting strategies.HHO efficiently identifies the most informative features by optimizing a fitness function based on classification performance.Following feature selection,the SMOTE is applied to the training data to resolve class imbalance by synthetically augmenting underrepresented attack types.The stacked architecture is then employed,combining the strengths of XGBoost,SVM,and RF as base learners.This layered approach improves prediction robustness and generalization by balancing bias and variance across diverse classifiers.The model was evaluated using standard classification metrics:precision,recall,F1-score,and overall accuracy.The best overall performance was recorded with an accuracy of 99.44%for UNSW-NB15,demonstrating the model’s effectiveness.After balancing,the model demonstrated a clear improvement in detecting the attacks.We tested the model on four datasets to show the effectiveness of the proposed approach and performed the ablation study to check the effect of each parameter.Also,the proposed model is computationaly efficient.To support transparency and trust in decision-making,explainable AI(XAI)techniques are incorporated that provides both global and local insight into feature contributions,and offers intuitive visualizations for individual predictions.This makes it suitable for practical deployment in cybersecurity environments that demand both precision and accountability.