With the increasing emphasis on personal information protection, encryption through security protocols has emerged as a critical requirement in data transmission and reception processes. Nevertheless, IoT ecosystems comprise heterogeneous networks where outdated systems coexist with the latest devices, ranging from non-encrypted to fully encrypted devices. Given the limited visibility into payloads in this context, this study investigates AI-based attack detection methods that leverage encrypted traffic metadata, eliminating the need for decryption and minimizing system performance degradation, especially in light of these heterogeneous devices. Using the UNSW-NB15 and CICIoT-2023 datasets, encrypted and unencrypted traffic were categorized according to security protocol, and AI-based intrusion detection experiments were conducted for each traffic type based on metadata. To mitigate the problem of class imbalance, eight different data sampling techniques were applied. The effectiveness of these sampling techniques was then comparatively analyzed using two ensemble models and three Deep Learning (DL) models from various perspectives. The experimental results confirmed that metadata-based attack detection is feasible using only encrypted traffic. In the UNSW-NB15 dataset, the F1-score of encrypted traffic was approximately 0.98, which is 4.3% higher than that of unencrypted traffic (approximately 0.94). In addition, analysis of the encrypted traffic in the CICIoT-2023 dataset using the same method showed a significantly lower F1-score of roughly 0.43, indicating that the quality of the dataset and the preprocessing approach have a substantial impact on detection performance. Furthermore, when data sampling techniques were applied to encrypted traffic, the recall in the UNSW-NB15 (Encrypted) dataset improved by up to 23.0%, and in the CICIoT-2023 (Encrypted) dataset by 20.26%, showing a similar level of improvement. Notably, in CICIoT-2023, F1-score and Receiver Operating Characteristic-Area Under the Curve (ROC-AUC) increased by 59.0% and 55.94%, respectively. These results suggest that data sampling can have a positive effect even in encrypted environments. However, the extent of the improvement may vary depending on data quality, model architecture, and sampling strategy.
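The class-imbalance step and the recall and F1 metrics mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not name its eight sampling techniques, so plain random oversampling is used here purely as an assumed stand-in, and the helper names `random_oversample`, `recall`, and `f1_score` are hypothetical rather than taken from the paper.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class is as
    large as the biggest class (assumed stand-in for the paper's samplers)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def recall(y_true, y_pred, positive=1):
    """Fraction of true positives that the detector actually flagged."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy flow-metadata rows: 8 benign (label 0) and 2 attack (label 1).
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2
balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

Resampling of this kind would normally be applied to the training split only, so that the reported recall and F1-score still reflect the original class distribution.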
Traditional anomaly detection methods often assume that data points are independent or exhibit regularly structured relationships, as in Euclidean data such as time series or image grids. However, real-world data frequently involve irregular, interconnected structures, requiring a shift toward non-Euclidean approaches. This study introduces a novel anomaly detection framework designed to handle non-Euclidean data by modeling transactions as graph signals. By leveraging graph convolution filters, we extract meaningful connection strengths that capture relational dependencies often overlooked in traditional methods. Utilizing the Graph Convolutional Networks (GCN) framework, we integrate graph-based embeddings with conventional anomaly detection models, enhancing performance through relational insights. Our method is validated on European credit card transaction data, demonstrating its effectiveness in detecting fraudulent transactions, particularly those with subtle patterns that evade traditional, amount-based detection techniques. The results highlight the advantages of incorporating temporal and structural dependencies into fraud detection, showcasing the robustness and applicability of our approach in complex, real-world scenarios.
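To make the idea of a graph convolution filter concrete, the sketch below applies one standard GCN propagation step, ReLU(D^(-1/2) (A + I) D^(-1/2) H W), in plain Python. How the paper builds the transaction graph is not stated in the abstract, so the adjacency matrix, features, and weights here are assumed toy values, and `gcn_layer` is a hypothetical helper name.

```python
import math


def normalized_adjacency(adj):
    """Return D^(-1/2) (A + I) D^(-1/2): add self-loops, then scale
    symmetrically by the node degrees."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Dense matrix product of two list-of-lists matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj, features, weights):
    """One graph-convolution layer followed by ReLU."""
    propagated = matmul(normalized_adjacency(adj), features)
    transformed = matmul(propagated, weights)
    return [[max(0.0, value) for value in row] for row in transformed]


# Two linked transactions, each with a single feature (e.g. a scaled amount).
adjacency = [[0.0, 1.0], [1.0, 0.0]]
node_features = [[2.0], [4.0]]
layer_weights = [[1.0]]  # assumed fixed weight; normally learned
print(gcn_layer(adjacency, node_features, layer_weights))
```

Each output row mixes a node's own feature with those of its neighbours, which is the relational signal that a downstream anomaly detector could consume as an embedding.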
Automated essay scoring (AES) systems have gained significant importance in educational settings, offering a scalable, efficient, and objective method for evaluating student essays. However, developing AES systems for Arabic poses distinct challenges due to the language's complex morphology, diglossia, and the scarcity of annotated datasets. This paper presents a hybrid approach to Arabic AES by combining text-based, vector-based, and embedding-based similarity measures to improve essay scoring accuracy while minimizing the training data required. Using a large Arabic essay dataset categorized into thematic groups, the study conducted four experiments to evaluate the impact of feature selection, data size, and model performance. Experiment 1 established a baseline using a non-machine learning approach, selecting top-N correlated features to predict essay scores. The subsequent experiments employed 5-fold cross-validation. Experiment 2 showed that combining embedding-based, text-based, and vector-based features in a Random Forest (RF) model achieved an R² of 88.92% and an accuracy of 83.3% within a 0.5-point tolerance. Experiment 3 further refined the feature selection process, demonstrating that 19 correlated features yielded optimal results, improving R² to 88.95%. In Experiment 4, an optimal data efficiency training approach was introduced, where training data portions increased from 5% to 50%. The study found that using just 10% of the data achieved near-peak performance, with an R² of 85.49%, emphasizing an effective trade-off between performance and computational costs. These findings highlight the potential of the hybrid approach for developing scalable Arabic AES systems, especially in low-resource environments, addressing linguistic challenges while ensuring efficient data usage.
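The two headline metrics in this abstract, R² and accuracy within a 0.5-point tolerance, can be computed as in the following sketch. The function names are hypothetical, and treating the tolerance as inclusive (an absolute error of exactly 0.5 counts as a hit) is an assumption, since the abstract does not say.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def accuracy_within(y_true, y_pred, tolerance=0.5):
    """Share of predictions whose absolute error is at most `tolerance`
    (inclusive bound is an assumption)."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tolerance)
    return hits / len(y_true)


# Toy essay scores: human grades versus model predictions.
human = [1.0, 2.0, 3.0, 4.0]
model = [1.0, 2.0, 3.0, 5.0]
print(r_squared(human, model), accuracy_within(human, model))
```

A tolerance-based accuracy is more forgiving than exact-match accuracy, which is why it is commonly reported alongside R² for scores on a continuous or half-point scale.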
In this paper, we consider a multiple-input single-output (MISO) Hammerstein system whose inputs and output are disturbed by unknown Gaussian white measurement noises. The parameter estimation of such a system is a typical errors-in-variables (EIV) nonlinear system identification problem. This paper proposes a bias-correction least squares (BCLS) identification method to compute a consistent estimate of EIV MISO Hammerstein systems from noisy data. To obtain unbiased parameter estimates of the EIV MISO Hammerstein system, the analytical expression of the estimation bias of the standard least squares (LS) algorithm is derived first, which is a function of the noise variances. Then a recursive algorithm is proposed to estimate the unknown noise-variance term from noisy data. Finally, based on the bias estimation scheme, the bias caused by the correlation between the input–output signals exciting the true system and the corresponding measurement noise is removed, resulting in unbiased parameter estimates of the EIV MISO Hammerstein system. The performance of the proposed method is demonstrated through a simulation example and a chemical continuously stirred tank reactor (CSTR) system.
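The bias-correction idea can be seen in a much simpler setting than the paper's MISO Hammerstein model. The sketch below uses a scalar linear system y = a·u + v in which the input is observed as x = u + w: ordinary least squares underestimates a because the denominator Σx² is inflated by the input-noise variance, and subtracting N·σ_w² removes that bias. This is only an illustrative reduction; the noise variance is assumed known here, whereas the paper estimates it recursively, and all function names are hypothetical.

```python
import random


def simulate(n, slope, input_noise_std, output_noise_std, seed=0):
    """Generate noisy input/output pairs for y = slope * u + v, x = u + w."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)                               # true (unobserved) input
        xs.append(u + rng.gauss(0.0, input_noise_std))        # measured input
        ys.append(slope * u + rng.gauss(0.0, output_noise_std))  # measured output
    return xs, ys


def least_squares_slope(xs, ys):
    """Standard LS estimate; biased toward zero when xs are noisy."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def bias_corrected_slope(xs, ys, input_noise_var):
    """LS estimate with the input-noise contribution removed from the
    denominator (assumes the input-noise variance is known)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx - len(xs) * input_noise_var)


xs, ys = simulate(n=20000, slope=2.0, input_noise_std=0.5, output_noise_std=0.1)
print("LS:  ", least_squares_slope(xs, ys))
print("BCLS:", bias_corrected_slope(xs, ys, input_noise_var=0.25))
```

With these settings the plain LS estimate should settle near 2/(1 + 0.25) = 1.6, while the corrected estimate should land close to the true slope of 2.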
In this paper, we propose a new privacy-aware transmission scheduling algorithm for 6G ad hoc networks. This system enables end nodes to select the optimum time and scheme to transmit private data safely. In 6G dynamic heterogeneous infrastructures, unstable links and non-uniform hardware capabilities create critical issues regarding security and privacy. Traditional protocols are often too computationally heavy to allow 6G services to achieve their expected Quality-of-Service (QoS). As the transport network is built of ad hoc nodes, there is no guarantee about their trustworthiness or behavior, and transversal functionalities are delegated to the end nodes. However, while security can be guaranteed in end-to-end solutions, privacy cannot, as all intermediate nodes still have to handle the data packets they are transporting. Besides, traditional schemes for private anonymous ad hoc communications are vulnerable to modern intelligent attacks based on learning models. The proposed scheme fills this gap. Findings show that, when the proposed technology is used, the probability of a successful intelligent attack is reduced by up to 65% compared to ad hoc networks with no privacy protection strategy, while congestion probability can remain below 0.001%, as required in 6G services.
Dear Editor, Aiming at the consensus tracking problem of a class of unknown heterogeneous nonlinear multiagent systems (MASs) with input constraints, a novel data-driven iterative learning consensus control (ILCC) protocol based on zeroing neural networks (ZNNs) is proposed. First, a dynamic linearization data model (DLDM) is acquired via dynamic linearization technology (DLT).
We propose an integrated method of data-driven and mechanism models for well logging formation evaluation, explicitly focusing on predicting reservoir parameters, such as porosity and water saturation. Accurately interpreting these parameters is crucial for effectively exploring and developing oil and gas. However, with the increasing complexity of geological conditions in this industry, there is a growing demand for improved accuracy in reservoir parameter prediction, leading to higher costs associated with manual interpretation. Conventional logging interpretation methods rely on empirical relationships between logging data and reservoir parameters, and suffer from low interpretation efficiency, strong subjectivity, and suitability only for ideal conditions. The application of artificial intelligence to the interpretation of logging data provides a new solution to the problems of traditional methods and is expected to improve the accuracy and efficiency of interpretation. If large and high-quality datasets exist, data-driven models can reveal relationships of arbitrary complexity. Nevertheless, constructing sufficiently large logging datasets with reliable labels remains challenging, making it difficult to apply data-driven models effectively in logging data interpretation. Furthermore, data-driven models often act as "black boxes" without explaining their predictions or ensuring compliance with basic physical constraints. This paper proposes a machine learning method with strong physical constraints by integrating mechanism and data-driven models. Prior knowledge of logging data interpretation is embedded into machine learning through the network structure, loss function, and optimization algorithm. We employ the Physically Informed Auto-Encoder (PIAE) to predict porosity and water saturation, which can be trained without labeled reservoir parameters using self-supervised learning techniques. This approach effectively achieves automated interpretation and facilitates generalization across diverse datasets.
Digital Elevation Model (DEM) refers to a digital map of the surface of the Earth that only shows the bare ground, without any buildings, plants, or other characteristics. However, obtaining unlimited access to DEM data at high and medium resolutions is very hard. Consequently, users often question the accuracy of freely available DEMs and their suitability for various applications. By comparing them to Global Positioning System (GPS) elevation data, this study aimed to identify the most reliable and widely available DEM for various terrains. The objectives of this study were to generate DEMs from different open sources and validate the accuracy of these DEMs using GPS elevation data. Various DEM types including Sentinel-1, ALOS PALSAR, SRTM, AW3D30, and ASTER were compared. Root Mean Square Error (RMSE) and Mean Error (ME) were used to measure the difference between the DEM-derived elevations and the GPS-measured elevations. The results showed that even though Sentinel-1 has higher resolution, the accuracy of the DEM from Sentinel-1 depends on issues including coherence and interferometry, surface features, and temporal stability. On the other hand, ALOS PALSAR could accurately represent surfaces in some situations. Additionally, DEMs with lower resolutions, such as SRTM and AW3D30, demonstrated greater consistency across various types of terrain. In contrast, the ASTER DEM showed more variability in complex terrains. While freely available DEMs are easy to use and accessible, their accuracy varies depending on the source and terrain features. Future improvements could include adding more ground control points and using advanced filtering methods to enhance precision.
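The two validation metrics named in this abstract can be computed as in the sketch below. The sign convention (DEM elevation minus GPS elevation) and the function names are assumptions made for illustration, and the elevations are made-up toy values rather than data from the study.

```python
import math


def mean_error(dem_elevations, gps_elevations):
    """Average signed difference (DEM - GPS); reveals systematic over- or
    under-estimation. The sign convention is an assumption."""
    diffs = [d - g for d, g in zip(dem_elevations, gps_elevations)]
    return sum(diffs) / len(diffs)


def rmse(dem_elevations, gps_elevations):
    """Root mean square of the differences; penalizes large deviations."""
    squared = [(d - g) ** 2 for d, g in zip(dem_elevations, gps_elevations)]
    return math.sqrt(sum(squared) / len(squared))


# Toy checkpoints in metres: DEM-derived versus GPS-measured elevation.
dem = [103.0, 98.0, 101.0, 96.0]
gps = [100.0, 100.0, 100.0, 100.0]
print("ME:", mean_error(dem, gps), "RMSE:", rmse(dem, gps))
```

Reporting both values is useful because positive and negative errors cancel in ME but not in RMSE, so a DEM can show a near-zero ME and still have a large RMSE.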
Current experimental and computational methods have limitations in accurately and efficiently classifying ion channels within vast protein spaces. Here we have developed a deep learning algorithm, GPT2 Ion Channel Classifier (GPT2-ICC), which effectively distinguishes ion channels from a test set containing approximately 239 times more non-ion-channel proteins. GPT2-ICC integrates representation learning with a large language model (LLM)-based classifier, enabling highly accurate identification of potential ion channels. Several potential ion channels were predicted from the unannotated human proteome, further demonstrating GPT2-ICC's generalization ability. This study marks a significant advancement in artificial-intelligence-driven ion channel research, highlighting the adaptability and effectiveness of combining representation learning with LLMs to address the challenges of imbalanced protein sequence data. Moreover, it provides a valuable computational tool for uncovering previously uncharacterized ion channels.
The increasing complexity of China's electricity market creates substantial challenges for settlement automation, data consistency, and operational scalability. Existing provincial settlement systems are fragmented, lack a unified data structure, and depend heavily on manual intervention to process high-frequency and retroactive transactions. To address these limitations, a graph-based unified settlement framework is proposed to enhance automation, flexibility, and adaptability in electricity market settlements. A flexible attribute-graph model is employed to represent heterogeneous multi-market data, enabling standardized integration, rapid querying, and seamless adaptation to evolving business requirements. An extensible operator library is designed to support configurable settlement rules, and a suite of modular tools (including dataset generation, formula configuration, billing templates, and task scheduling) facilitates end-to-end automated settlement processing. A robust refund-clearing mechanism is further incorporated, utilizing sandbox execution, data-version snapshots, dynamic lineage tracing, and real-time change-capture technologies to enable rapid and accurate recalculations under dynamic policy and data revisions. Case studies based on real-world data from regional Chinese markets validate the effectiveness of the proposed approach, demonstrating marked improvements in computational efficiency, system robustness, and automation. Moreover, enhanced settlement accuracy and high temporal granularity improve price-signal fidelity, promote cost-reflective tariffs, and incentivize energy-efficient and demand-responsive behavior among market participants. The method not only supports equitable and transparent market operations but also provides a generalizable, scalable foundation for modern electricity settlement platforms in increasingly complex and dynamic market environments.
The Fourth Industrial Revolution has endowed the concept of state sovereignty with new era-specific connotations, leading to the emergence of the theory of data sovereignty. While countries refine their domestic legislation to establish their data sovereignty, they are also actively engaging in the negotiation of cross-border data flow rules within international trade agreements to construct data sovereignty. During these negotiations, countries express differing regulatory claims, with some focusing on safeguarding sovereignty and protecting human rights, some prioritizing economic promotion and security assurance, and others targeting traditional and innovative digital trade barriers. These varied approaches reflect the tension between three pairs of values: collectivism and individualism, freedom and security, and tradition and innovation. Based on their distinct value pursuits, three representative models of data sovereignty construction have emerged globally. At the current juncture, when international rules for digital trade are still in their nascent stages, China should establish its data sovereignty rules in a timely manner, actively participate in global data sovereignty competition, and balance its sovereignty interests with other interests. Specifically, China should explore the scope of system-acceptable digital trade barriers through free trade zones; integrate domestic and international legal frameworks to ensure the alignment of China's data governance legislation with its obligations under international trade agreements; and use the development of the "Digital Silk Road" as a starting point to prioritize the formation of digital trade rules with countries participating in the Belt and Road Initiative, promoting Chinese solutions internationally.
Accurate Global Horizontal Irradiance (GHI) forecasting has become vital for successfully integrating solar energy into the electrical grid because of the expanding demand for green power and the worldwide shift favouring green energy resources. Particularly considering the implications of aggressive GHG emission targets, accurate GHI forecasting has become vital for developing, designing, and operationally managing solar energy systems. This research presents the core concepts of modelling and performance analysis of various forecasting models for multi-seasonal GHI forecasting: ARIMA (Autoregressive Integrated Moving Average), Elman NN (Elman Neural Network), RBFN (Radial Basis Function Neural Network), SVM (Support Vector Machine), LSTM (Long Short-Term Memory), Persistent, BPN (Back Propagation Neural Network), MLP (Multilayer Perceptron Neural Network), RF (Random Forest), and XGBoost (eXtreme Gradient Boosting). Data from the India region were used to evaluate the models' performance and forecasting ability. The models were applied to seasonal GHI forecasting in winter, spring, summer, monsoon, and autumn. Performance was assessed with the evaluation metrics Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and R-squared (R²), implemented in Python. The analysis showed that Random Forest and eXtreme Gradient Boosting are the superior and competing models, yielding the most accurate forecasts in all seasons compared to the other forecasting models. For winter, XGBoost is the best forecasting model with MAE: 1.6325, RMSE: 4.8338, and R²: 0.9998. For spring, XGBoost is the best forecasting model with MAE: 2.599599, RMSE: 5.58539, and R²: 0.999784. For summer, RF is the best forecasting model with MAE: 1.03843, RMSE: 2.116325, and R²: 0.999967. For the monsoon season, RF is the best forecasting model with MAE: 0.892385, RMSE: 2.417587, and R²: 0.999942. For autumn, RF is the best forecasting model with MAE: 0.810462, RMSE: 1.928215, and R²: 0.999958. Based on seasonal variations and computing constraints, the findings enable energy system operators to make helpful recommendations for choosing the most effective forecasting models.
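Among the models listed in this abstract, the simplest is the "Persistent" one. Assuming it refers to the usual persistence baseline (each forecast repeats the value observed one step earlier), it can be sketched together with the MAE metric as follows. The function names and the toy irradiance values are hypothetical, not taken from the study.

```python
def persistence_forecast(series, horizon=1):
    """Predict series[t] as series[t - horizon].
    Returns aligned (actual, predicted) lists."""
    if horizon < 1 or horizon >= len(series):
        raise ValueError("horizon must be between 1 and len(series) - 1")
    actual = series[horizon:]
    predicted = series[:-horizon]
    return actual, predicted


def mean_absolute_error(actual, predicted):
    """Average absolute deviation between observations and forecasts."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Toy hourly GHI readings in W/m^2 (made-up values).
ghi = [0.0, 100.0, 300.0, 250.0]
actual, predicted = persistence_forecast(ghi)
print("Persistence MAE:", mean_absolute_error(actual, predicted))
```

A baseline like this gives a reference error level: a learned model such as RF or XGBoost is only useful to the extent that it beats persistence on the same data.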
Astronomical spectra are vital for deriving stellar properties, yet low signal-to-noise ratio (SNR) spectra often obscure key features, complicating accurate analysis. This study presents spec-DDPM, a novel deep learning approach based on Denoising Diffusion Probabilistic Models (DDPM), aimed at denoising low SNR spectra to improve stellar parameter estimation. Leveraging the LAMOST DR10 data set, we developed spec-DDPM using a tailored U-Net architecture (spec-Unet) to iteratively predict and remove noise. The model was trained on 28,500 low and high SNR spectral pairs and benchmarked against conventional methods, including Principal Component Analysis, wavelet techniques, and a modified DnCNN model. The spec-DDPM demonstrated superior performance, with reduced Mean Absolute Error, elevated Structural Similarity Index Measure, and enhanced spectral loss metrics. It effectively preserved critical spectral features and corrected continuum distortions. Validation experiments further confirmed its ability to improve stellar parameter estimation with reduced errors. These results underscore spec-DDPM's potential to elevate spectral data quality, offering applications in restoring defective spectra and refining large-scale astronomical surveys. This work highlights the transformative role of deep learning in astronomical data processing.
The unprecedented scale of large models, such as large language models (LLMs) and text-to-image diffusion models, has raised critical concerns about the unauthorized use of copyrighted data during model training. These concerns have spurred a growing demand for dataset copyright auditing techniques, which aim to detect and verify potential infringements in the training data of commercial AI systems. This paper presents a survey of existing auditing solutions, categorizing them across key dimensions: data modality, model training stage, data overlap scenarios, and model access levels. We highlight major trends, including the prevalence of black-box auditing methods and the emphasis on fine-tuning rather than pre-training. Through an in-depth analysis of 12 representative works, we extract four key observations that reveal the limitations of current methods. Furthermore, we identify three open challenges and propose future directions for robust, multimodal, and scalable auditing solutions. Our findings underscore the urgent need to establish standardized benchmarks and develop auditing frameworks that are resilient to low watermark densities and applicable in diverse deployment settings.
The aim of this article is to explore potential directions for the development of artificial intelligence (AI). It points out that, while current AI can handle the statistical properties of complex systems, it has difficulty effectively processing and fully representing their spatiotemporal complexity patterns. The article also discusses a potential path of AI development in the engineering domain. Based on the existing understanding of the principles of multilevel complexity, this article suggests that consistency among the logical structures of datasets, AI models, model-building software, and hardware will be an important AI development direction and is worthy of careful consideration.
Background: The population of Fontan patients, patients born with a single functioning ventricle, is growing. There is a growing need to develop algorithms for this population that can predict health outcomes. Artificial intelligence models predicting short-term and long-term health outcomes for patients with the Fontan circulation are needed. Generative adversarial networks (GANs) provide a solution for generating realistic and useful synthetic data that can be used to train such models. Methods: Despite their promise, GANs have not been widely adopted in the congenital heart disease research community due, in part, to a lack of knowledge on how to employ them. In this research study, a GAN was used to generate synthetic data from the Pediatric Heart Network Fontan I dataset. A subset of data consisting of the echocardiographic and BNP measures collected from Fontan patients was used to train the GAN. Two sets of synthetic data were created to understand the effect of data missingness on synthetic data generation. Synthetic data was created from real data in which the missing values were imputed using Multiple Imputation by Chained Equations (MICE) (referred to as synthetic from imputed real samples). In addition, synthetic data was created from real data in which the missing values were dropped (referred to as synthetic from dropped real samples). Both synthetic datasets were evaluated for fidelity by using visual methods which involved comparing histograms and principal component analysis (PCA) plots. Fidelity was measured quantitatively by (1) comparing synthetic and real data using the Kolmogorov–Smirnov test to evaluate the similarity between two distributions and (2) training a neural network to distinguish between real and synthetic samples. Both synthetic datasets were evaluated for utility by training a neural network with synthetic data and testing the neural network on its ability to classify patients that have ventricular dysfunction using echocardiographic measures and serological measures. Results: Using histograms, associated probability density functions, and PCA, both synthetic datasets showed visual resemblance in distribution and variance to real Fontan data. Quantitatively, synthetic data from dropped real samples had higher similarity scores, as demonstrated by the Kolmogorov–Smirnov statistic, for all but one feature (age at Fontan) compared to synthetic data from imputed real samples, which demonstrated dissimilar scores for three features (Echo SV, Echo tda, and BNP). In addition, synthetic data from dropped real samples resembled real data to a larger extent (49.3% classification error) than synthetic data from imputed real samples (65.28% classification error). Classification errors approximating 50% represent datasets that are indistinguishable. In terms of utility, a neural network trained on synthetic data created from real data in which the missing values were imputed classified ventricular dysfunction in real data with a classification error of 10.99%. Similarly, a neural network trained on synthetic data derived from real data in which the missing values were dropped could classify ventricular dysfunction in real data with a classification error of 9.44%. Conclusions: Although representing a limited subset of the vast data available on the Pediatric Heart Network, generative adversarial networks can create synthetic data that mimics the probability distribution of real Fontan echocardiographic measures. Clinicians can use these synthetic data to create models that predict health outcomes for Fontan patients.
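The quantitative fidelity check in this abstract relies on the two-sample Kolmogorov–Smirnov statistic, which is the largest gap between the empirical cumulative distribution functions of two samples. A minimal sketch is given below; the function name and the toy measurements are hypothetical, and the p-value calculation that a full test would include is omitted.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: maximum absolute difference between the
    empirical CDFs of the two samples (0 = identical, 1 = fully separated)."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    max_gap = 0.0
    # The empirical CDFs only change at observed values, so checking
    # every distinct observation is sufficient.
    for value in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, value) / len(a)
        cdf_b = bisect_right(b, value) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


# Toy feature values: real versus synthetic measurements (made up).
real = [1.0, 2.0, 3.0, 4.0]
synthetic = [3.0, 4.0, 5.0, 6.0]
print("KS statistic:", ks_statistic(real, synthetic))
```

A smaller statistic indicates that the synthetic feature follows the real distribution more closely, so it would be computed once per feature when comparing the two synthetic datasets.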
Climate model prediction has been improved by enhancing model resolution as well as the implementation of sophisticated physical parameterization and refinement of data assimilation systems [section 6.1 in Wang et al. (2025)]. In relation to seasonal forecasting and climate projection in the East Asian summer monsoon season, proper simulation of the seasonal migration of rain bands by models is a challenging and limiting factor [section 7.1 in Wang et al. (2025)].
While the Ordos Basin is recognized for its substantial hydrocarbon exploration prospects, its rugged loess tableland terrain has rendered seismic exploration exceptionally challenging [1-3]. Persistent obstacles such as complex 3D survey planning, low signal-to-noise ratio raw data, inadequate near-surface velocity modeling, and imaging inaccuracy have long hindered the advancement of seismic exploration across this region. Through a problem-solving approach rooted in geological target analysis, this research systematically investigates the behavioral patterns of nodal seismometer-based high-density seismic acquisition in loess plateau terrain. Tailored advancements in waveform enhancement and depth velocity modelling methodologies have been engineered. Field validations confirm that the optimized workflow demonstrates marked improvements in amplitude preservation and imaging resolution, offering novel insights for future reservoir characterization endeavors.
This paper aims to conduct a systematic literature review (SLR) using an artificial intelligence (AI) approach to predict and diagnose diabetes mellitus. After reviewing the literature published from 2015–2025, the paper aims to identify the most effective AI techniques, the most used datasets, the most widely used data preprocessing techniques, and the most common issues. After analyzing the literature, it has been found that convolutional neural networks (CNNs) and long short-term memory (LSTM) networks are deep learning models that have shown high accuracy in diabetes prediction. Recursive feature elimination (RFE), a feature selection technique, and SMOTE, an oversampling technique, have significantly improved model accuracy, training time, and interpretability. Amidst this technological advancement, some existing issues persist: data imbalance, the inapplicability of techniques, computational limitations, and a lack of real-time application in a healthcare environment. The literature review has also identified the need for robust, interpretable, and scalable AI systems capable of handling large volumes of data, including real-world data, in the healthcare industry. Furthermore, it has been identified that the benefits should be integrated with wearable health monitoring systems and the development of privacy-preserving models to ensure continuous, secure, and proactive diabetes management.
Rural domestic sewage treatment is critical for environmental protection.This study defines the spatial pattern of villages from the perspective of rural sewage treatment and develops an integrated decision-making sys...Rural domestic sewage treatment is critical for environmental protection.This study defines the spatial pattern of villages from the perspective of rural sewage treatment and develops an integrated decision-making system to propose a sewage treatment mode and scheme suitable for local conditions.By considering the village spatial layout and terrain factors,a decision tree model of residential density and terrain type was constructed with accuracies of 76.47%and 96.00%,respectively.Combined with binary classification probability unit regression,an appropriate sewage treatment mode for the village was determined with 87.00%accuracy.The Analytic Hierarchy Process(AHP),combined with the Technique for Order Preference(TOPSIS)by Similarity to an Ideal Solution model,formed the basis for optimal treatment process selection under different emission standards.Verification was conducted in 542 villages across three counties of the Inner Mongolia Autonomous Region,focusing on the standard effluent effect(0.3773),low investment cost(0.3196),and high standard effluent effect(0.5115)to determine the best treatment process for the same emission standard under different needs.The annual environmental and carbon emission benefits of sewage treatment in these villages were estimated.This model matches village density,geographic feature,and social development level,and provides scientific support and a theoretical basis for rural sewage treatment decision-making.展开更多
Funding: Supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. RS-2023-00235509, Development of security monitoring technology based network behavior against encrypted cyber threats in ICT convergence environment).
Abstract: With the increasing emphasis on personal information protection, encryption through security protocols has emerged as a critical requirement in data transmission and reception processes. Nevertheless, IoT ecosystems comprise heterogeneous networks where outdated systems coexist with the latest devices, spanning a range from non-encrypted devices to fully encrypted ones. Given the limited visibility into payloads in this context, this study investigates AI-based attack detection methods that leverage encrypted traffic metadata, eliminating the need for decryption and minimizing system performance degradation, especially in light of these heterogeneous devices. Using the UNSW-NB15 and CICIoT-2023 datasets, encrypted and unencrypted traffic were categorized according to security protocol, and AI-based intrusion detection experiments were conducted for each traffic type based on metadata. To mitigate the problem of class imbalance, eight different data sampling techniques were applied. The effectiveness of these sampling techniques was then comparatively analyzed using two ensemble models and three Deep Learning (DL) models from various perspectives. The experimental results confirmed that metadata-based attack detection is feasible using only encrypted traffic. In the UNSW-NB15 dataset, the F1-score of encrypted traffic was approximately 0.98, which is 4.3% higher than that of unencrypted traffic (approximately 0.94). In addition, analysis of the encrypted traffic in the CICIoT-2023 dataset using the same method showed a significantly lower F1-score of roughly 0.43, indicating that the quality of the dataset and the preprocessing approach have a substantial impact on detection performance. Furthermore, when data sampling techniques were applied to encrypted traffic, the recall in the UNSW-NB15 (Encrypted) dataset improved by up to 23.0%, and in the CICIoT-2023 (Encrypted) dataset by 20.26%, showing a similar level of improvement. Notably, in CICIoT-2023, the F1-score and Receiver Operating Characteristic-Area Under the Curve (ROC-AUC) increased by 59.0% and 55.94%, respectively. These results suggest that data sampling can have a positive effect even in encrypted environments. However, the extent of the improvement may vary depending on data quality, model architecture, and sampling strategy.
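The abstract above does not name its eight sampling techniques. As a hedged illustration of the general idea only, the sketch below implements plain random oversampling, one common way to rebalance classes, together with a recall helper. The flow features, labels, and function names are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def recall(y_true, y_pred, positive):
    """Recall = TP / (TP + FN) for the given positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical flow metadata: (packet_count, mean_packet_size_bytes)
rows = [(10, 60.0), (12, 64.0), (11, 62.0), (9, 58.0), (300, 1400.0)]
labels = ["benign", "benign", "benign", "benign", "attack"]

balanced_rows, balanced_labels = random_oversample(rows, labels)
```

Oversampling should be applied to the training split only, so that duplicated rows do not leak into the evaluation data.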
Funding: Supported by the National Research Foundation of Korea (NRF) funded by the Korea government (RS-2023-00249743). Additionally, this research was supported by the Global-Learning & Academic Research Institution for Master’s, PhD Students, and Postdocs (LAMP) Program of the National Research Foundation of Korea (NRF) grant funded by the Ministry of Education (RS-2024-00443714). This research was also supported by the “Research Base Construction Fund Support Program” funded by Jeonbuk National University in 2025.
Abstract: Traditional anomaly detection methods often assume that data points are independent or exhibit regularly structured relationships, as in Euclidean data such as time series or image grids. However, real-world data frequently involve irregular, interconnected structures, requiring a shift toward non-Euclidean approaches. This study introduces a novel anomaly detection framework designed to handle non-Euclidean data by modeling transactions as graph signals. By leveraging graph convolution filters, we extract meaningful connection strengths that capture relational dependencies often overlooked in traditional methods. Utilizing the Graph Convolutional Networks (GCN) framework, we integrate graph-based embeddings with conventional anomaly detection models, enhancing performance through relational insights. Our method is validated on European credit card transaction data, demonstrating its effectiveness in detecting fraudulent transactions, particularly those with subtle patterns that evade traditional, amount-based detection techniques. The results highlight the advantages of incorporating temporal and structural dependencies into fraud detection, showcasing the robustness and applicability of our approach in complex, real-world scenarios.
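To make "graph convolution filter" concrete, the sketch below applies one standard GCN propagation step, ReLU(D^(-1/2) (A + I) D^(-1/2) H W), using plain Python lists. The abstract does not specify the layer form the authors use, so this is an assumption. The adjacency matrix, features, and weights are toy values.

```python
import math


def gcn_layer(adj, features, weights):
    """One graph-convolution step: add self-loops, symmetrically
    normalize the adjacency, aggregate neighbor features, apply a
    linear transform, then ReLU."""
    n = len(adj)
    # A_hat = A + I (self-loops keep each node's own signal)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric normalization: A_hat[i][j] / sqrt(deg_i * deg_j)
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    n_in, n_out = len(weights), len(weights[0])
    # Aggregate: norm @ features
    agg = [[sum(norm[i][k] * features[k][f] for k in range(n))
            for f in range(n_in)] for i in range(n)]
    # Transform and activate: ReLU(agg @ weights)
    return [[max(0.0, sum(agg[i][f] * weights[f][h] for f in range(n_in)))
             for h in range(n_out)] for i in range(n)]


# Toy graph: two transactions linked by one edge, one feature each.
adj = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0], [3.0]]
weights = [[1.0]]
embeddings = gcn_layer(adj, features, weights)
```

In this toy case each node ends up with the average of its own and its neighbor's feature, which is the smoothing effect that lets relational structure inform a downstream anomaly detector.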
Funding: Funded by the Deanship of Graduate Studies and Scientific Research at Jouf University under grant No. DGSSR-2024-02-01264.
Abstract: Automated essay scoring (AES) systems have gained significant importance in educational settings, offering a scalable, efficient, and objective method for evaluating student essays. However, developing AES systems for Arabic poses distinct challenges due to the language’s complex morphology, diglossia, and the scarcity of annotated datasets. This paper presents a hybrid approach to Arabic AES by combining text-based, vector-based, and embedding-based similarity measures to improve essay scoring accuracy while minimizing the training data required. Using a large Arabic essay dataset categorized into thematic groups, the study conducted four experiments to evaluate the impact of feature selection, data size, and model performance. Experiment 1 established a baseline using a non-machine learning approach, selecting top-N correlated features to predict essay scores. The subsequent experiments employed 5-fold cross-validation. Experiment 2 showed that combining embedding-based, text-based, and vector-based features in a Random Forest (RF) model achieved an R² of 88.92% and an accuracy of 83.3% within a 0.5-point tolerance. Experiment 3 further refined the feature selection process, demonstrating that 19 correlated features yielded optimal results, improving R² to 88.95%. In Experiment 4, a data-efficient training approach was introduced, in which the training data portion increased from 5% to 50%. The study found that using just 10% of the data achieved near-peak performance, with an R² of 85.49%, emphasizing an effective trade-off between performance and computational costs. These findings highlight the potential of the hybrid approach for developing scalable Arabic AES systems, especially in low-resource environments, addressing linguistic challenges while ensuring efficient data usage.
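The two metrics reported above can be sketched as follows. This is a minimal illustration with made-up scores. It assumes "within a 0.5-point tolerance" means an absolute error of at most 0.5, which the abstract does not state explicitly.

```python
def accuracy_within(y_true, y_pred, tol=0.5):
    """Share of predictions whose absolute error is at most `tol`
    (inclusive bound is an assumption)."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tol)
    return hits / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical human scores and model predictions.
human = [3.0, 4.0, 5.0, 2.0]
model = [3.4, 4.0, 4.0, 2.5]

tolerance_accuracy = accuracy_within(human, model)
r2 = r_squared(human, model)
```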
Funding: Supported in part by the National Natural Science Foundation of China (62373070 and 52272388), in part by the Chongqing Natural Science Foundation (CSTB2024NSCQ-QCXMX0054, CSTB2022NSCQ-MSX1225 and CSTC2024YCJH-BGZXM0042), and in part by the Key Research and Development Project of Anhui Province (202304a05020060).
Abstract: In this paper, we consider a multiple-input single-output (MISO) Hammerstein system whose inputs and output are disturbed by unknown Gaussian white measurement noises. The parameter estimation of such a system is a typical errors-in-variables (EIV) nonlinear system identification problem. This paper proposes a bias-correction least squares (BCLS) identification method to compute a consistent estimate of EIV MISO Hammerstein systems from noisy data. To obtain unbiased parameter estimates of the EIV MISO Hammerstein system, the analytical expression of the estimation bias of the standard least squares (LS) algorithm is derived first; it is a function of the noise variances. A recursive algorithm is then proposed to estimate the unknown noise variances from noisy data. Finally, based on the bias estimation scheme, the bias caused by the correlation between the input–output signals exciting the true system and the corresponding measurement noise is eliminated, resulting in unbiased parameter estimates of the EIV MISO Hammerstein system. The performance of the proposed method is demonstrated through a simulation example and a chemical continuously stirred tank reactor (CSTR) system.
Funding: Funding from the European Commission through the Ruralities project (grant agreement No. 101060876).
Abstract: In this paper, we propose a new privacy-aware transmission scheduling algorithm for 6G ad hoc networks. This system enables end nodes to select the optimum time and scheme to transmit private data safely. In 6G dynamic heterogeneous infrastructures, unstable links and non-uniform hardware capabilities create critical issues regarding security and privacy. Traditional protocols are often too computationally heavy to allow 6G services to achieve their expected Quality-of-Service (QoS). As the transport network is built of ad hoc nodes, there is no guarantee about their trustworthiness or behavior, and transversal functionalities are delegated to the end nodes. However, while security can be guaranteed in end-to-end solutions, privacy cannot, as all intermediate nodes still have to handle the data packets they are transporting. Besides, traditional schemes for private anonymous ad hoc communications are vulnerable to modern intelligent attacks based on learning models. The proposed scheme fills this gap. Findings show that, when the proposed technology is used, the probability of a successful intelligent attack is reduced by up to 65% compared to ad hoc networks with no privacy protection strategy, while the congestion probability can remain below 0.001%, as required in 6G services.
Funding: Supported by the National Nature Science Foundation of China (U21A20166), the Science and Technology Development Foundation of Jilin Province (20230508095RC), the Major Science and Technology Projects of Jilin Province and Changchun City (20220301033GX), the Development and Reform Commission Foundation of Jilin Province (2023C034-3), and the Interdisciplinary Integration and Innovation Project of JLU (JLUXKJC2020202).
Abstract: Dear Editor, To address the consensus tracking problem of a class of unknown heterogeneous nonlinear multiagent systems (MASs) with input constraints, a novel data-driven iterative learning consensus control (ILCC) protocol based on zeroing neural networks (ZNNs) is proposed. First, a dynamic linearization data model (DLDM) is acquired via dynamic linearization technology (DLT).
Funding: Supported by the National Key Research and Development Program (2019YFA0708301), the National Natural Science Foundation of China (51974337), the Strategic Cooperation Projects of CNPC and CUPB (ZLZX2020-03), the Science and Technology Innovation Fund of CNPC (2021DQ02-0403), and the Open Fund of Petroleum Exploration and Development Research Institute of CNPC (2022-KFKT-09).
Abstract: We propose an integrated method of data-driven and mechanism models for well logging formation evaluation, explicitly focusing on predicting reservoir parameters such as porosity and water saturation. Accurately interpreting these parameters is crucial for effectively exploring and developing oil and gas. However, with the increasing complexity of geological conditions in this industry, there is a growing demand for improved accuracy in reservoir parameter prediction, leading to higher costs associated with manual interpretation. Conventional logging interpretation methods rely on empirical relationships between logging data and reservoir parameters, and they suffer from low interpretation efficiency, strong subjectivity, and suitability only for ideal conditions. The application of artificial intelligence to the interpretation of logging data provides a new solution to the problems of traditional methods and is expected to improve the accuracy and efficiency of interpretation. If large and high-quality datasets exist, data-driven models can reveal relationships of arbitrary complexity. Nevertheless, constructing sufficiently large logging datasets with reliable labels remains challenging, making it difficult to apply data-driven models effectively in logging data interpretation. Furthermore, data-driven models often act as “black boxes” without explaining their predictions or ensuring compliance with primary physical constraints. This paper proposes a machine learning method with strong physical constraints by integrating mechanism and data-driven models. Prior knowledge of logging data interpretation is embedded into machine learning through the network structure, loss function, and optimization algorithm. We employ the Physically Informed Auto-Encoder (PIAE) to predict porosity and water saturation, which can be trained without labeled reservoir parameters using self-supervised learning techniques. This approach effectively achieves automated interpretation and facilitates generalization across diverse datasets.
Funding: Funded by the Ministry of Higher Education Malaysia (MOHE) through the Fundamental Research Grant Scheme (FRGS/1/2021/WAB07/UiTM/02/1).
Abstract: A Digital Elevation Model (DEM) is a digital map of the surface of the Earth that shows only the bare ground, without any buildings, plants, or other features. However, obtaining unlimited access to DEM data at high and medium resolutions is very hard. Consequently, users often question the accuracy of freely available DEMs and their suitability for various applications. By comparing them to Global Positioning System (GPS) elevation data, this study aimed to identify the most reliable and widely available DEM for various terrains. The objectives of this study were to generate DEMs from different open sources and validate the accuracy of these DEMs using GPS elevation data. Various DEM types, including Sentinel-1, ALOS PALSAR, SRTM, AW3D30, and ASTER, were compared. Root Mean Square Error (RMSE) and Mean Error (ME) were used to measure the difference between the DEM-derived elevations and the GPS-measured elevations. The results showed that even though Sentinel-1 has a higher resolution, the accuracy of the DEM from Sentinel-1 depends on issues including coherence and interferometry, surface features, and temporal stability. On the other hand, ALOS PALSAR could accurately represent surfaces in some situations. Additionally, DEMs with lower resolutions, such as SRTM and AW3D30, demonstrated greater consistency across various types of terrain. In contrast, the ASTER DEM showed more variability in complex terrains. While freely available DEMs are easy to use and accessible, their accuracy varies depending on the source and terrain features. Future improvements could include adding more ground control points and using advanced filtering methods to enhance precision.
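The two error measures named above are simple to compute once DEM and GPS elevations are paired at the same checkpoints. The sketch below uses invented elevations in metres. The sign convention (DEM minus GPS) is an assumption, since the abstract does not state one.

```python
import math


def mean_error(dem, gps):
    """Mean Error (bias): average of DEM minus GPS elevation.
    Positive values mean the DEM overestimates on average."""
    return sum(d - g for d, g in zip(dem, gps)) / len(gps)


def rmse(dem, gps):
    """Root Mean Square Error between DEM and GPS elevations."""
    return math.sqrt(sum((d - g) ** 2 for d, g in zip(dem, gps)) / len(gps))


# Hypothetical elevations (m) at three GPS checkpoints.
dem_elev = [101.0, 99.0, 103.0]
gps_elev = [100.0, 100.0, 100.0]

bias = mean_error(dem_elev, gps_elev)
spread = rmse(dem_elev, gps_elev)
```

ME reveals systematic over- or underestimation because signed errors can cancel, while RMSE penalizes large deviations regardless of sign, so the two are usually reported together.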
Funding: Funded by grants from the National Key Research and Development Program of China (Grant Nos.: 2022YFE0205600 and 2022YFC3400504), the National Natural Science Foundation of China (Grant Nos.: 82373792 and 82273857), the Fundamental Research Funds for the Central Universities, China, and the East China Normal University Medicine and Health Joint Fund, China (Grant No.: 2022JKXYD07001).
Abstract: Current experimental and computational methods have limitations in accurately and efficiently classifying ion channels within vast protein spaces. Here we have developed a deep learning algorithm, GPT2 Ion Channel Classifier (GPT2-ICC), which effectively distinguishes ion channels from a test set containing approximately 239 times more non-ion-channel proteins. GPT2-ICC integrates representation learning with a large language model (LLM)-based classifier, enabling highly accurate identification of potential ion channels. Several potential ion channels were predicted from the unannotated human proteome, further demonstrating GPT2-ICC’s generalization ability. This study marks a significant advancement in artificial-intelligence-driven ion channel research, highlighting the adaptability and effectiveness of combining representation learning with LLMs to address the challenges of imbalanced protein sequence data. Moreover, it provides a valuable computational tool for uncovering previously uncharacterized ion channels.
Funding: Funded by the Science and Technology Project of State Grid Corporation of China (5108-202355437A-3-2-ZN).
Abstract: The increasing complexity of China’s electricity market creates substantial challenges for settlement automation, data consistency, and operational scalability. Existing provincial settlement systems are fragmented, lack a unified data structure, and depend heavily on manual intervention to process high-frequency and retroactive transactions. To address these limitations, a graph-based unified settlement framework is proposed to enhance automation, flexibility, and adaptability in electricity market settlements. A flexible attribute-graph model is employed to represent heterogeneous multi-market data, enabling standardized integration, rapid querying, and seamless adaptation to evolving business requirements. An extensible operator library is designed to support configurable settlement rules, and a suite of modular tools (including dataset generation, formula configuration, billing templates, and task scheduling) facilitates end-to-end automated settlement processing. A robust refund-clearing mechanism is further incorporated, utilizing sandbox execution, data-version snapshots, dynamic lineage tracing, and real-time change-capture technologies to enable rapid and accurate recalculations under dynamic policy and data revisions. Case studies based on real-world data from regional Chinese markets validate the effectiveness of the proposed approach, demonstrating marked improvements in computational efficiency, system robustness, and automation. Moreover, enhanced settlement accuracy and high temporal granularity improve price-signal fidelity, promote cost-reflective tariffs, and incentivize energy-efficient and demand-responsive behavior among market participants. The method not only supports equitable and transparent market operations but also provides a generalizable, scalable foundation for modern electricity settlement platforms in increasingly complex and dynamic market environments.
Funding: This paper is a phased result of “Research on the Issue of China’s Data Export System” (24SFB3035), a 2024 ministerial-level research project of the Ministry of Justice of China on the construction of the rule of law and the study of legal theories.
Abstract: The Fourth Industrial Revolution has endowed the concept of state sovereignty with new era-specific connotations, leading to the emergence of the theory of data sovereignty. While countries refine their domestic legislation to establish their data sovereignty, they are also actively engaging in the negotiation of cross-border data flow rules within international trade agreements to construct data sovereignty. During these negotiations, countries express differing regulatory claims, with some focusing on safeguarding sovereignty and protecting human rights, some prioritizing economic promotion and security assurance, and others targeting traditional and innovative digital trade barriers. These varied approaches reflect the tension between three pairs of values: collectivism and individualism, freedom and security, and tradition and innovation. Based on their distinct value pursuits, three representative models of data sovereignty construction have emerged globally. At the current juncture, when international rules for digital trade are still in their nascent stages, China should establish its data sovereignty rules in a timely manner, actively participate in global data sovereignty competition, and balance its sovereignty interests with other interests. Specifically, China should explore the scope of system-acceptable digital trade barriers through free trade zones; integrate domestic and international legal frameworks to ensure the alignment of China’s data governance legislation with its obligations under international trade agreements; and use the development of the “Digital Silk Road” as a starting point to prioritize the formation of digital trade rules with countries participating in the Belt and Road Initiative, promoting Chinese solutions internationally.
Abstract: Accurate Global Horizontal Irradiance (GHI) forecasting has become vital for successfully integrating solar energy into the electrical grid because of the expanding demand for green power and the worldwide shift toward green energy resources. Particularly considering the implications of aggressive GHG emission targets, accurate GHI forecasting has become vital for developing, designing, and operationally managing solar energy systems. This research presents the core concepts of modelling and performance analysis for various forecasting models applied to multi-seasonal forecasting of GHI: ARIMA (Autoregressive Integrated Moving Average), Elman NN (Elman Neural Network), RBFN (Radial Basis Function Neural Network), SVM (Support Vector Machine), LSTM (Long Short-Term Memory), Persistence, BPN (Back Propagation Neural Network), MLP (Multilayer Perceptron Neural Network), RF (Random Forest), and XGBoost (eXtreme Gradient Boosting). Data from the India region were used to evaluate the models’ performance and forecasting ability. Seasonal GHI forecasts were produced for winter, spring, summer, monsoon, and autumn. Performance was assessed with evaluation metrics, namely Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and R-squared (R²), computed in Python. The experimental analysis indicated that Random Forest and eXtreme Gradient Boosting are the superior and competing models, yielding the most accurate forecasts in all seasons compared with the other forecasting models. For winter, XGBoost is the best forecasting model with MAE: 1.6325, RMSE: 4.8338, and R²: 0.9998. For spring, XGBoost is the best forecasting model with MAE: 2.599599, RMSE: 5.58539, and R²: 0.999784. For summer, RF is the best forecasting model with MAE: 1.03843, RMSE: 2.116325, and R²: 0.999967. For monsoon, RF is the best forecasting model with MAE: 0.892385, RMSE: 2.417587, and R²: 0.999942. For autumn, RF is the best forecasting model with MAE: 0.810462, RMSE: 1.928215, and R²: 0.999958. Taking seasonal variations and computing constraints into account, the findings help energy system operators choose the most effective forecasting models.
Funding: This study was supported by the National Natural Science Foundation of China (NSFC) under grant Nos. 11873037 and 11803016, the science research grants from the China Manned Space Project with Nos. CMS-CSST-2021-B05 and CMS-CSST-2021-A08, the Natural Science Foundation of Shandong Province under grant Nos. ZR2022MA076, ZR2022MA089 and ZR2024MA063, the Young Scholars Program of Shandong University, Weihai, under grant No. 2016WHWLJH09, and GHfund A (202202018107).
Abstract: Astronomical spectra are vital for deriving stellar properties, yet low signal-to-noise ratio (SNR) spectra often obscure key features, complicating accurate analysis. This study presents spec-DDPM, a novel deep learning approach based on Denoising Diffusion Probabilistic Models (DDPM), aimed at denoising low SNR spectra to improve stellar parameter estimation. Leveraging the LAMOST DR10 data set, we developed spec-DDPM using a tailored U-Net architecture (spec-Unet) to iteratively predict and remove noise. The model was trained on 28,500 low and high SNR spectral pairs and benchmarked against conventional methods, including Principal Component Analysis, wavelet techniques, and a modified DnCNN model. The spec-DDPM demonstrated superior performance, with reduced Mean Absolute Error, elevated Structural Similarity Index Measure, and enhanced spectral loss metrics. It effectively preserved critical spectral features and corrected continuum distortions. Validation experiments further confirmed its ability to improve stellar parameter estimation with reduced errors. These results underscore spec-DDPM’s potential to elevate spectral data quality, offering applications in restoring defective spectra and refining large-scale astronomical surveys. This work highlights the transformative role of deep learning in astronomical data processing.
Funding: Supported in part by NSFC under Grant Nos. 62402379, U22A2029 and U24A20237.
Abstract: The unprecedented scale of large models, such as large language models (LLMs) and text-to-image diffusion models, has raised critical concerns about the unauthorized use of copyrighted data during model training. These concerns have spurred a growing demand for dataset copyright auditing techniques, which aim to detect and verify potential infringements in the training data of commercial AI systems. This paper presents a survey of existing auditing solutions, categorizing them across key dimensions: data modality, model training stage, data overlap scenarios, and model access levels. We highlight major trends, including the prevalence of black-box auditing methods and the emphasis on fine-tuning rather than pre-training. Through an in-depth analysis of 12 representative works, we extract four key observations that reveal the limitations of current methods. Furthermore, we identify three open challenges and propose future directions for robust, multimodal, and scalable auditing solutions. Our findings underscore the urgent need to establish standardized benchmarks and develop auditing frameworks that are resilient to low watermark densities and applicable in diverse deployment settings.
Abstract: The aim of this article is to explore potential directions for the development of artificial intelligence (AI). It points out that, while current AI can handle the statistical properties of complex systems, it has difficulty effectively processing and fully representing their spatiotemporal complexity patterns. The article also discusses a potential path of AI development in the engineering domain. Based on the existing understanding of the principles of multilevel complexity, this article suggests that consistency among the logical structures of datasets, AI models, model-building software, and hardware will be an important AI development direction and is worthy of careful consideration.
Abstract: Background: The population of Fontan patients, patients born with a single functioning ventricle, is growing. There is a growing need to develop algorithms for this population that can predict health outcomes. Artificial intelligence models predicting short-term and long-term health outcomes for patients with the Fontan circulation are needed. Generative adversarial networks (GANs) provide a solution for generating realistic and useful synthetic data that can be used to train such models. Methods: Despite their promise, GANs have not been widely adopted in the congenital heart disease research community due, in some part, to a lack of knowledge on how to employ them. In this research study, a GAN was used to generate synthetic data from the Pediatric Heart Network Fontan I dataset. A subset of data consisting of the echocardiographic and BNP measures collected from Fontan patients was used to train the GAN. Two sets of synthetic data were created to understand the effect of data missingness on synthetic data generation. Synthetic data was created from real data in which the missing values were imputed using Multiple Imputation by Chained Equations (MICE) (referred to as synthetic from imputed real samples). In addition, synthetic data was created from real data in which the missing values were dropped (referred to as synthetic from dropped real samples). Both synthetic datasets were evaluated for fidelity by using visual methods, which involved comparing histograms and principal component analysis (PCA) plots. Fidelity was measured quantitatively by (1) comparing synthetic and real data using the Kolmogorov–Smirnov test to evaluate the similarity between two distributions and (2) training a neural network to distinguish between real and synthetic samples. Both synthetic datasets were evaluated for utility by training a neural network with synthetic data and testing the neural network on its ability to classify patients that have ventricular dysfunction using echocardiograph measures and serological measures. Results: Using histograms, associated probability density functions, and PCA, both synthetic datasets showed visual resemblance in distribution and variance to real Fontan data. Quantitatively, synthetic data from dropped real samples had higher similarity scores, as demonstrated by the Kolmogorov–Smirnov statistic, for all but one feature (age at Fontan) compared to synthetic data from imputed real samples, which demonstrated dissimilar scores for three features (Echo SV, Echo tda, and BNP). In addition, synthetic data from dropped real samples resembled real data to a larger extent (49.3% classification error) than synthetic data from imputed real samples (65.28% classification error). Classification errors approximating 50% represent datasets that are indistinguishable. In terms of utility, a neural network trained on synthetic data created from real data in which the missing values were imputed classified ventricular dysfunction in real data with a classification error of 10.99%. Similarly, a neural network trained on synthetic data derived from real data in which the missing values were dropped could classify ventricular dysfunction in real data with a classification error of 9.44%. Conclusions: Although representing a limited subset of the vast data available on the Pediatric Heart Network, generative adversarial networks can create synthetic data that mimics the probability distribution of real Fontan echocardiographic measures. Clinicians can use these synthetic data to create models that predict health outcomes for Fontan patients.
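The first fidelity check described above, the two-sample Kolmogorov–Smirnov comparison, reduces to the largest gap between two empirical cumulative distribution functions. The sketch below computes that statistic with the standard library only. The sample values are invented, and this is not the study's code. A smaller statistic indicates that the synthetic feature distribution lies closer to the real one.

```python
from bisect import bisect_right


def ks_statistic(real, synthetic):
    """Two-sample KS statistic: max |F_real(x) - F_synth(x)| taken
    over every observed value x."""
    a, b = sorted(real), sorted(synthetic)
    d = 0.0
    for x in sorted(set(a) | set(b)):
        # Empirical CDF = fraction of samples less than or equal to x
        fa = bisect_right(a, x) / len(a)
        fb = bisect_right(b, x) / len(b)
        d = max(d, abs(fa - fb))
    return d


# Hypothetical values of one feature in real and synthetic cohorts.
real_feature = [1.0, 2.0, 3.0, 4.0]
synthetic_feature = [3.0, 4.0, 5.0, 6.0]

d_stat = ks_statistic(real_feature, synthetic_feature)
```

In practice a library routine such as SciPy's `ks_2samp` would also return a p-value. The hand-rolled version here only shows what the statistic measures.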
Abstract: Climate model prediction has been improved by enhancing model resolution as well as by implementing sophisticated physical parameterization and refining data assimilation systems [section 6.1 in Wang et al. (2025)]. In relation to seasonal forecasting and climate projection in the East Asian summer monsoon season, proper simulation of the seasonal migration of rain bands by models is a challenging and limiting factor [section 7.1 in Wang et al. (2025)].
Abstract: While the Ordos Basin is recognized for its substantial hydrocarbon exploration prospects, its rugged loess tableland terrain has rendered seismic exploration exceptionally challenging [1-3]. Persistent obstacles such as complex 3D survey planning, low signal-to-noise ratio raw data, inadequate near-surface velocity modeling, and imaging inaccuracy have long hindered the advancement of seismic exploration across this region. Through a problem-solving approach rooted in geological target analysis, this research systematically investigates the behavioral patterns of nodal seismometer-based high-density seismic acquisition on the Loess Plateau. Tailored advancements in waveform enhancement and depth velocity modelling methodologies have been engineered. Field validations confirm that the optimized workflow demonstrates marked improvements in amplitude preservation and imaging resolution, offering novel insights for future reservoir characterization endeavors.
Abstract: This paper presents a systematic literature review (SLR) of artificial intelligence (AI) approaches to predicting and diagnosing diabetes mellitus. Reviewing the literature published from 2015–2025, the paper aims to identify the most effective AI techniques, the most used datasets, the most widely used data preprocessing techniques, and the most common issues. The analysis found that convolutional neural networks (CNNs) and long short-term memory (LSTM) networks are deep learning models that have shown high accuracy in diabetes prediction. Recursive feature elimination (RFE), a feature selection technique, and SMOTE, an oversampling technique, are preprocessing methods that have significantly improved model accuracy, training time, and interpretability. Despite these advances, several issues persist: data imbalance, the inapplicability of techniques, computational limitations, and a lack of real-time application in healthcare environments. The review also identified the need for robust, interpretable, and scalable AI systems capable of handling large volumes of data, including real-world data, in the healthcare industry. Furthermore, it identified that these benefits should be integrated with wearable health monitoring systems and with the development of privacy-preserving models to ensure continuous, secure, and proactive diabetes management.
Funding: Supported by the Central Government Guiding Local Science and Technology Development Fund Project (No. 2024SZY0343); the Joint Research Program for Ecological Conservation and High Quality Development of the Yellow River Basin (No. 2022-YRUC-01-050205); the Higher Education Scientific Research Project of Inner Mongolia Autonomous Region (No. NJZZ23078); the project of Inner Mongolia "Prairie Talents" Engineering Innovation Entrepreneurship Talent Team; the Major Projects of Erdos Science and Technology (No. 2022EEDSKJZDZX015); and the Innovation Team of the Inner Mongolia Academy of Science and Technology (No. CXTD2023-01-016).
Abstract: Rural domestic sewage treatment is critical for environmental protection. This study defines the spatial pattern of villages from the perspective of rural sewage treatment and develops an integrated decision-making system to propose a sewage treatment mode and scheme suitable for local conditions. By considering the village spatial layout and terrain factors, a decision tree model of residential density and terrain type was constructed with accuracies of 76.47% and 96.00%, respectively. Combined with binary classification probability unit regression, an appropriate sewage treatment mode for the village was determined with 87.00% accuracy. The Analytic Hierarchy Process (AHP), combined with the Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS) model, formed the basis for optimal treatment process selection under different emission standards. Verification was conducted in 542 villages across three counties of the Inner Mongolia Autonomous Region, focusing on the standard effluent effect (0.3773), low investment cost (0.3196), and high standard effluent effect (0.5115) to determine the best treatment process for the same emission standard under different needs. The annual environmental and carbon emission benefits of sewage treatment in these villages were estimated. This model matches village density, geographic features, and social development level, and provides scientific support and a theoretical basis for rural sewage treatment decision-making.
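To make the TOPSIS step named in the abstract above concrete, here is a minimal Python sketch of the standard TOPSIS ranking. It is an illustration under stated assumptions, not the study's implementation: the criterion weights are taken as given (in the study they would come from AHP), and the function name `topsis`, the two criteria, and all numbers are hypothetical.

```python
import math


def topsis(matrix, weights, benefit):
    """Return a TOPSIS closeness score in [0, 1] for each alternative.

    matrix  : one row per alternative, one column per criterion
    weights : criterion weights (assumed to be supplied, e.g. by AHP)
    benefit : True where larger is better, False where smaller is better
    """
    n_criteria = len(weights)
    # Vector-normalise each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix))
             for j in range(n_criteria)]
    weighted = [[weights[j] * row[j] / norms[j] if norms[j] else 0.0
                 for j in range(n_criteria)] for row in matrix]

    columns = list(zip(*weighted))
    # Ideal solution: best value per criterion; anti-ideal: worst value.
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]

    scores = []
    for row in weighted:
        d_ideal = math.dist(row, ideal)
        d_anti = math.dist(row, anti)
        total = d_ideal + d_anti
        # If all alternatives are identical, no ranking is possible.
        scores.append(d_anti / total if total else 0.5)
    return scores


if __name__ == "__main__":
    # Hypothetical treatment processes scored on two hypothetical criteria:
    # effluent quality (benefit) and investment cost (cost).
    processes = [[0.90, 120.0], [0.75, 80.0], [0.60, 60.0]]
    scores = topsis(processes, weights=[0.6, 0.4], benefit=[True, False])
    best = max(range(len(scores)), key=scores.__getitem__)
    print(scores, best)
```

Changing the weights shifts which alternative scores highest, which is one way the same emission standard could lead to different preferred processes under different needs.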