Impact of Data Processing Techniques on AI Models for Attack-Based Imbalanced and Encrypted Traffic within IoT Environments
1
Authors: Yeasul Kim, Chaeeun Won, Hwankuk Kim. Journal: Computers, Materials & Continua, 2026, Issue 1, pp. 247-274 (28 pages)
With the increasing emphasis on personal information protection, encryption through security protocols has emerged as a critical requirement in data transmission and reception processes. Nevertheless, IoT ecosystems comprise heterogeneous networks where outdated systems coexist with the latest devices, spanning a range of devices from non-encrypted ones to fully encrypted ones. Given the limited visibility into payloads in this context, this study investigates AI-based attack detection methods that leverage encrypted traffic metadata, eliminating the need for decryption and minimizing system performance degradation, especially in light of these heterogeneous devices. Using the UNSW-NB15 and CICIoT-2023 datasets, encrypted and unencrypted traffic were categorized according to security protocol, and AI-based intrusion detection experiments were conducted for each traffic type based on metadata. To mitigate the problem of class imbalance, eight different data sampling techniques were applied. The effectiveness of these sampling techniques was then comparatively analyzed using two ensemble models and three Deep Learning (DL) models from various perspectives. The experimental results confirmed that metadata-based attack detection is feasible using only encrypted traffic. In the UNSW-NB15 dataset, the F1-score of encrypted traffic was approximately 0.98, which is 4.3% higher than that of unencrypted traffic (approximately 0.94). In addition, analysis of the encrypted traffic in the CICIoT-2023 dataset using the same method showed a significantly lower F1-score of roughly 0.43, indicating that the quality of the dataset and the preprocessing approach have a substantial impact on detection performance. Furthermore, when data sampling techniques were applied to encrypted traffic, the recall in the UNSW-NB15 (Encrypted) dataset improved by up to 23.0%, and in the CICIoT-2023 (Encrypted) dataset by 20.26%, showing a similar level of improvement. Notably, in CICIoT-2023, F1-score and Receiver Operating Characteristic-Area Under the Curve (ROC-AUC) increased by 59.0% and 55.94%, respectively. These results suggest that data sampling can have a positive effect even in encrypted environments. However, the extent of the improvement may vary depending on data quality, model architecture, and sampling strategy.
Keywords: Encrypted traffic; attack detection; data sampling technique; AI-based detection; IoT environment
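The abstract above does not name its eight sampling techniques, so the following is only a minimal sketch of one common option, random oversampling, together with the recall and F1 metrics the abstract reports. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(features, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class (illustrative only)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(features), list(labels)
    for cls, n in counts.items():
        pool = [x for x, y in zip(features, labels) if y == cls]
        for _ in range(target - n):
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y


def recall_and_f1(y_true, y_pred, positive=1):
    """Recall and F1-score for the class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return recall, f1
```

In practice the oversampling would be applied to the training split only, so that duplicated rows never leak into the evaluation data.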
Non-Euclidean Models for Fraud Detection in Irregular Temporal Data Environments
2
Authors: Boram Kim, Guebin Choi. Journal: Computers, Materials & Continua, 2026, Issue 4, pp. 1771-1787 (17 pages)
Traditional anomaly detection methods often assume that data points are independent or exhibit regularly structured relationships, as in Euclidean data such as time series or image grids. However, real-world data frequently involve irregular, interconnected structures, requiring a shift toward non-Euclidean approaches. This study introduces a novel anomaly detection framework designed to handle non-Euclidean data by modeling transactions as graph signals. By leveraging graph convolution filters, we extract meaningful connection strengths that capture relational dependencies often overlooked in traditional methods. Utilizing the Graph Convolutional Networks (GCN) framework, we integrate graph-based embeddings with conventional anomaly detection models, enhancing performance through relational insights. Our method is validated on European credit card transaction data, demonstrating its effectiveness in detecting fraudulent transactions, particularly those with subtle patterns that evade traditional, amount-based detection techniques. The results highlight the advantages of incorporating temporal and structural dependencies into fraud detection, showcasing the robustness and applicability of our approach in complex, real-world scenarios.
Keywords: Anomaly detection; credit card transactions; fraud detection; graph convolutional networks; non-Euclidean data
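To make "modeling transactions as graph signals" concrete, here is a minimal sketch of one graph filtering step using the symmetrically normalized adjacency with self-loops that is standard in GCNs. How the paper actually builds its graph and filters is not stated in the abstract; the adjacency matrix, the signal, and the function names below are assumptions for illustration.

```python
import math


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a non-negative, symmetric
    adjacency matrix given as a list of lists."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]  # >= 1 because of the self-loop
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def graph_filter(adj, signal):
    """Apply one propagation step: each node's value becomes a
    degree-weighted average over itself and its neighbours."""
    norm = normalized_adjacency(adj)
    return [sum(w * s for w, s in zip(row, signal)) for row in norm]
```

For two connected nodes carrying amounts 2 and 4, `graph_filter([[0, 1], [1, 0]], [2.0, 4.0])` smooths both values to 3.0; a node whose value stays far from its filtered value is a candidate anomaly under this assumed setup.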
Efficient Arabic Essay Scoring with Hybrid Models: Feature Selection, Data Optimization, and Performance Trade-Offs
3
Authors: Mohamed Ezz, Meshrif Alruily, Ayman Mohamed Mostafa, Alaa S. Alaerjan, Bader Aldughayfiq, Hisham Allahem, Abdulaziz Shehab. Journal: Computers, Materials & Continua, 2026, Issue 1, pp. 2274-2301 (28 pages)
Automated essay scoring (AES) systems have gained significant importance in educational settings, offering a scalable, efficient, and objective method for evaluating student essays. However, developing AES systems for Arabic poses distinct challenges due to the language’s complex morphology, diglossia, and the scarcity of annotated datasets. This paper presents a hybrid approach to Arabic AES by combining text-based, vector-based, and embedding-based similarity measures to improve essay scoring accuracy while minimizing the training data required. Using a large Arabic essay dataset categorized into thematic groups, the study conducted four experiments to evaluate the impact of feature selection, data size, and model performance. Experiment 1 established a baseline using a non-machine learning approach, selecting top-N correlated features to predict essay scores. The subsequent experiments employed 5-fold cross-validation. Experiment 2 showed that combining embedding-based, text-based, and vector-based features in a Random Forest (RF) model achieved an R² of 88.92% and an accuracy of 83.3% within a 0.5-point tolerance. Experiment 3 further refined the feature selection process, demonstrating that 19 correlated features yielded optimal results, improving R² to 88.95%. In Experiment 4, an optimal data efficiency training approach was introduced, where training data portions increased from 5% to 50%. The study found that using just 10% of the data achieved near-peak performance, with an R² of 85.49%, emphasizing an effective trade-off between performance and computational costs. These findings highlight the potential of the hybrid approach for developing scalable Arabic AES systems, especially in low-resource environments, addressing linguistic challenges while ensuring efficient data usage.
Keywords: Automated essay scoring; text-based features; vector-based features; embedding-based features; feature selection; optimal data efficiency
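The two evaluation measures quoted in the abstract, R² and accuracy within a 0.5-point tolerance, can be sketched as follows. This is an illustrative reading of those metrics, assuming R² is the usual coefficient of determination; the function names and the sample scores are hypothetical.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def accuracy_within(y_true, y_pred, tol=0.5):
    """Share of predictions whose absolute error is at most tol points."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tol)
    return hits / len(y_true)
```

With true scores `[1, 2, 3, 4]` and predictions `[1, 2, 3, 5]`, R² is 0.8 and three of the four predictions fall within the tolerance, giving 0.75.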
Noisy data-driven identification for errors-in-variables MISO Hammerstein nonlinear models
4
Authors: Jie Hou, Haoran Wang, Penghua Li, Hao Su. Journal: Control Theory and Technology, 2026, Issue 1, pp. 111-126 (16 pages)
In this paper, we consider a multiple-input single-output (MISO) Hammerstein system whose inputs and output are disturbed by unknown Gaussian white measurement noises. The parameter estimation of such a system is a typical errors-in-variables (EIV) nonlinear system identification problem. This paper proposes a bias-correction least squares (BCLS) identification method to compute a consistent estimate of EIV MISO Hammerstein systems from noisy data. To obtain unbiased parameter estimates of the EIV MISO Hammerstein system, the analytical expression of the estimated bias for the standard least squares (LS) algorithm is derived first, which is a function of the noise variances. Then a recursive algorithm is proposed to estimate the unknown noise-variance term from noisy data. Finally, based on the bias estimation scheme, the bias caused by the correlation between the input–output signals exciting the true system and the corresponding measurement noise is eliminated, resulting in unbiased parameter estimates of the EIV MISO Hammerstein system. The performance of the proposed method is demonstrated through a simulation example and a chemical continuously stirred tank reactor (CSTR) system.
Keywords: Bias-corrected least squares; errors-in-variables; MISO Hammerstein models; parameter estimation; system identification
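The idea of bias correction can be shown in the simplest errors-in-variables setting, a scalar gain y = a·u0 observed through a noisy input u = u0 + e. This is not the paper's BCLS algorithm for MISO Hammerstein systems: the sketch below assumes the input-noise variance is already known, whereas the paper estimates it recursively, and the function name is hypothetical.

```python
def ls_and_bias_corrected(u_noisy, y_noisy, input_noise_var):
    """Estimate the gain a in y = a * u0 when only u = u0 + e is observed.

    Plain least squares divides by sum(u^2), which is inflated by roughly
    N * var(e) and so shrinks the estimate toward zero. Subtracting that
    term from the denominator compensates for the bias.
    """
    n = len(u_noisy)
    s_uy = sum(u * y for u, y in zip(u_noisy, y_noisy))
    s_uu = sum(u * u for u in u_noisy)
    ls_estimate = s_uy / s_uu
    corrected = s_uy / (s_uu - n * input_noise_var)
    return ls_estimate, corrected
```

With zero input noise the two estimates coincide; for a positive noise variance the corrected estimate is larger in magnitude than the plain LS estimate, undoing the attenuation.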
Information Diffusion Models and Fuzzing Algorithms for a Privacy-Aware Data Transmission Scheduling in 6G Heterogeneous ad hoc Networks
5
Authors: Borja Bordel Sánchez, Ramón Alcarria, Tomás Robles. Journal: Computer Modeling in Engineering & Sciences, 2026, Issue 2, pp. 1214-1234 (21 pages)
In this paper, we propose a new privacy-aware transmission scheduling algorithm for 6G ad hoc networks. This system enables end nodes to select the optimum time and scheme to transmit private data safely. In 6G dynamic heterogeneous infrastructures, unstable links and non-uniform hardware capabilities create critical issues regarding security and privacy. Traditional protocols are often too computationally heavy to allow 6G services to achieve their expected Quality-of-Service (QoS). As the transport network is built of ad hoc nodes, there is no guarantee about their trustworthiness or behavior, and transversal functionalities are delegated to the end nodes. However, while security can be guaranteed in end-to-end solutions, privacy cannot, as all intermediate nodes still have to handle the data packets they are transporting. Besides, traditional schemes for private anonymous ad hoc communications are vulnerable to modern intelligent attacks based on learning models. The proposed scheme fills this gap. Findings show that, when the proposed technology is used, the probability of a successful intelligent attack is reduced by up to 65% compared to ad hoc networks with no privacy protection strategy. Meanwhile, congestion probability can remain below 0.001%, as required in 6G services.
Keywords: 6G networks; ad hoc networks; privacy; scheduling algorithms; diffusion models; fuzzing algorithms
Data-Driven Iterative Learning Consensus Tracking Based on Robust Neural Models for Unknown Heterogeneous Nonlinear Multiagent Systems With Input Constraints
6
Authors: Chong Zhang, Yunfeng Hu, TingTing Wang, Xun Gong, Hong Chen. Journal: IEEE/CAA Journal of Automatica Sinica, 2025, Issue 10, pp. 2153-2155 (3 pages)
Dear Editor, Aiming at the consensus tracking problem of a class of unknown heterogeneous nonlinear multiagent systems (MASs) with input constraints, a novel data-driven iterative learning consensus control (ILCC) protocol based on zeroing neural networks (ZNNs) is proposed. First, a dynamic linearization data model (DLDM) is acquired via dynamic linearization technology (DLT).
Keywords: dynamic linearization data model (DLDM); consensus tracking problem; input constraints; consensus tracking; unknown heterogeneous nonlinear multiagent systems; robust neural models; data-driven iterative learning; zeroing neural networks (ZNNs)
An integrated method of data-driven and mechanism models for formation evaluation with logs (Cited by: 1)
7
Authors: Meng-Lu Kang, Jun Zhou, Juan Zhang, Li-Zhi Xiao, Guang-Zhi Liao, Rong-Bo Shao, Gang Luo. Journal: Petroleum Science, 2025, Issue 3, pp. 1110-1124 (15 pages)
We propose an integrated method of data-driven and mechanism models for well logging formation evaluation, explicitly focusing on predicting reservoir parameters, such as porosity and water saturation. Accurately interpreting these parameters is crucial for effectively exploring and developing oil and gas. However, with the increasing complexity of geological conditions in this industry, there is a growing demand for improved accuracy in reservoir parameter prediction, leading to higher costs associated with manual interpretation. The conventional logging interpretation methods rely on empirical relationships between logging data and reservoir parameters, which suffer from low interpretation efficiency, intense subjectivity, and suitability only for ideal conditions. The application of artificial intelligence in the interpretation of logging data provides a new solution to the problems existing in traditional methods. It is expected to improve the accuracy and efficiency of the interpretation. If large and high-quality datasets exist, data-driven models can reveal relationships of arbitrary complexity. Nevertheless, constructing sufficiently large logging datasets with reliable labels remains challenging, making it difficult to apply data-driven models effectively in logging data interpretation. Furthermore, data-driven models often act as “black boxes” without explaining their predictions or ensuring compliance with primary physical constraints. This paper proposes a machine learning method with strong physical constraints by integrating mechanism and data-driven models. Prior knowledge of logging data interpretation is embedded into machine learning regarding network structure, loss function, and optimization algorithm. We employ the Physically Informed Auto-Encoder (PIAE) to predict porosity and water saturation, which can be trained without labeled reservoir parameters using self-supervised learning techniques. This approach effectively achieves automated interpretation and facilitates generalization across diverse datasets.
Keywords: Well log; reservoir evaluation; label scarcity; mechanism model; data-driven model; physically informed model; self-supervised learning; machine learning
Evaluation of Different Digital Elevation Models with Elevation Data
8
Authors: Muhamad Ammar Hanif Arif, Amir Sharifuddin Ab Latip, Siti Balqis Mohd Tun, Nur Azlina Hariffin, Adel Gohari, Mohd Hakimi Abdu. Journal: Revue Internationale de Géomatique, 2025, Issue 1, pp. 691-705 (15 pages)
Digital Elevation Model (DEM) refers to a digital map of the surface of the Earth that only shows the bare ground, without any buildings, plants, or other characteristics. However, obtaining unlimited access to DEM data at high and medium resolutions is very hard. Consequently, users often question the accuracy of freely available DEMs and their suitability for various applications. By comparing them to Global Positioning System (GPS) elevation data, this study aimed to identify the most reliable and widely available DEM for various terrains. The objectives of this study were to generate DEMs from different open sources and validate the accuracy of these DEMs using GPS elevation data. Various DEM types including Sentinel-1, ALOS PALSAR, SRTM, AW3D30, and ASTER were compared. Root Mean Square Error (RMSE) and Mean Error (ME) were used to measure the difference between the DEM-derived elevations and the GPS-measured elevations. The results showed that even though Sentinel-1 has higher resolutions, the accuracy of the DEM from Sentinel-1 depends on issues including coherence and interferometry, surface features, and temporal stability. On the other hand, ALOS PALSAR could accurately represent surfaces in some situations. Additionally, DEMs with lower resolutions, such as SRTM and AW3D30, demonstrated greater consistency across various types of terrain. In contrast, the ASTER DEM showed more variability in complex terrains. While freely available DEMs are easy to use and accessible, their accuracy varies depending on the source and terrain features. Future improvements could include adding more ground control points and using advanced filtering methods to enhance precision.
Keywords: Digital elevation model; vertical accuracy; GPS data
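The two accuracy measures named in the abstract can be sketched as below. The sign convention (DEM elevation minus GPS elevation, so a positive ME means the DEM sits above the GPS points on average) is an assumption, as are the function names and the sample elevations.

```python
import math


def mean_error(dem_elev, gps_elev):
    """Mean Error: average signed difference DEM - GPS (vertical bias)."""
    diffs = [d - g for d, g in zip(dem_elev, gps_elev)]
    return sum(diffs) / len(diffs)


def rmse(dem_elev, gps_elev):
    """Root Mean Square Error between DEM and GPS elevations."""
    sq = [(d - g) ** 2 for d, g in zip(dem_elev, gps_elev)]
    return math.sqrt(sum(sq) / len(sq))
```

ME exposes systematic offset while RMSE also penalizes scatter, which is why a DEM can have near-zero ME and still a large RMSE in rugged terrain.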
GPT2-ICC: A data-driven approach for accurate ion channel identification using pre-trained large language models (Cited by: 1)
9
Authors: Zihan Zhou, Yang Yu, Chengji Yang, Leyan Cao, Shaoying Zhang, Junnan Li, Yingnan Zhang, Huayun Han, Guoliang Shi, Qiansen Zhang, Juwen Shen, Huaiyu Yang. Journal: Journal of Pharmaceutical Analysis, 2025, Issue 8, pp. 1800-1809 (10 pages)
Current experimental and computational methods have limitations in accurately and efficiently classifying ion channels within vast protein spaces. Here we have developed a deep learning algorithm, GPT2 Ion Channel Classifier (GPT2-ICC), which effectively distinguishes ion channels from a test set containing approximately 239 times more non-ion-channel proteins. GPT2-ICC integrates representation learning with a large language model (LLM)-based classifier, enabling highly accurate identification of potential ion channels. Several potential ion channels were predicted from the unannotated human proteome, further demonstrating GPT2-ICC’s generalization ability. This study marks a significant advancement in artificial-intelligence-driven ion channel research, highlighting the adaptability and effectiveness of combining representation learning with LLMs to address the challenges of imbalanced protein sequence data. Moreover, it provides a valuable computational tool for uncovering previously uncharacterized ion channels.
Keywords: Ion channel; artificial intelligence; representation learning; GPT2; protein language model
Graph-Based Unified Settlement Framework for Complex Electricity Markets: Data Integration and Automated Refund Clearing
10
Authors: Xiaozhe Guo, Suyan Long, Ziyu Yue, Yifan Wang, Guanting Yin, Yuyang Wang, Zhaoyuan Wu. Journal: Energy Engineering, 2026, Issue 1, pp. 56-90 (35 pages)
The increasing complexity of China’s electricity market creates substantial challenges for settlement automation, data consistency, and operational scalability. Existing provincial settlement systems are fragmented, lack a unified data structure, and depend heavily on manual intervention to process high-frequency and retroactive transactions. To address these limitations, a graph-based unified settlement framework is proposed to enhance automation, flexibility, and adaptability in electricity market settlements. A flexible attribute-graph model is employed to represent heterogeneous multi-market data, enabling standardized integration, rapid querying, and seamless adaptation to evolving business requirements. An extensible operator library is designed to support configurable settlement rules, and a suite of modular tools (including dataset generation, formula configuration, billing templates, and task scheduling) facilitates end-to-end automated settlement processing. A robust refund-clearing mechanism is further incorporated, utilizing sandbox execution, data-version snapshots, dynamic lineage tracing, and real-time change-capture technologies to enable rapid and accurate recalculations under dynamic policy and data revisions. Case studies based on real-world data from regional Chinese markets validate the effectiveness of the proposed approach, demonstrating marked improvements in computational efficiency, system robustness, and automation. Moreover, enhanced settlement accuracy and high temporal granularity improve price-signal fidelity, promote cost-reflective tariffs, and incentivize energy-efficient and demand-responsive behavior among market participants. The method not only supports equitable and transparent market operations but also provides a generalizable, scalable foundation for modern electricity settlement platforms in increasingly complex and dynamic market environments.
Keywords: Electricity market; market settlement; data model; graph database; market refund clearing
Data Sovereignty Construction in International Trade Agreements: Causes, Models, and China’s Choices - Based on the Study of Cross-border Data Flow Rules
11
Author: ZHANG Qianwen. Journal: The Journal of Human Rights, 2025, Issue 3, pp. 589-614 (26 pages)
The Fourth Industrial Revolution has endowed the concept of state sovereignty with new era-specific connotations, leading to the emergence of the theory of data sovereignty. While countries refine their domestic legislation to establish their data sovereignty, they are also actively engaging in the negotiation of cross-border data flow rules within international trade agreements to construct data sovereignty. During these negotiations, countries express differing regulatory claims, with some focusing on safeguarding sovereignty and protecting human rights, some prioritizing economic promotion and security assurance, and others targeting traditional and innovative digital trade barriers. These varied approaches reflect the tension between three pairs of values: collectivism and individualism, freedom and security, and tradition and innovation. Based on their distinct value pursuits, three representative models of data sovereignty construction have emerged globally. At the current juncture, when international rules for digital trade are still in their nascent stages, China should establish its data sovereignty rules in a timely manner, actively participate in global data sovereignty competition, and balance its sovereignty interests with other interests. Specifically, China should explore the scope of system-acceptable digital trade barriers through free trade zones; integrate domestic and international legal frameworks to ensure the alignment of China’s data governance legislation with its obligations under international trade agreements; and use the development of the “Digital Silk Road” as a starting point to prioritize the formation of digital trade rules with countries participating in the Belt and Road Initiative, promoting the Chinese solutions internationally.
Keywords: data sovereignty; cross-border data flow; international trade agreements; digital trade rules
Performance Analysis of Various Forecasting Models for Multi-Seasonal Global Horizontal Irradiance Forecasting Using the India Region Dataset
12
Author: Manoharan Madhiarasan. Journal: Energy Engineering, 2025, Issue 8, pp. 2993-3011 (19 pages)
Accurate Global Horizontal Irradiance (GHI) forecasting has become vital for successfully integrating solar energy into the electrical grid because of the expanding demand for green power and the worldwide shift favouring green energy resources. Particularly considering the implications of the aggressive GHG emission targets, accurate GHI forecasting has become vital for developing, designing, and operationally managing solar energy systems. This research presented the core concepts of modelling and performance analysis of the application of various forecasting models such as ARIMA (Autoregressive Integrated Moving Average), Elman NN (Elman Neural Network), RBFN (Radial Basis Function Neural Network), SVM (Support Vector Machine), LSTM (Long Short-Term Memory), Persistent, BPN (Back Propagation Neural Network), MLP (Multilayer Perceptron Neural Network), RF (Random Forest), and XGBoost (eXtreme Gradient Boosting) for assessing multi-seasonal forecasting of GHI. India region data were used to evaluate the models’ performance and forecasting ability. The forecasting models were applied to seasonal GHI forecasting in winter, spring, summer, monsoon, and autumn. Performance effectiveness was substantiated through evaluation metrics, such as Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and R-squared (R²), coded in Python. The performance analysis indicated that Random Forest and eXtreme Gradient Boosting are the superior, competing models, yielding the most accurate forecasts in all seasons compared to the other forecasting models. For winter, XGBoost is the best forecasting model, with MAE: 1.6325, RMSE: 4.8338, and R²: 0.9998. For spring, XGBoost is the best forecasting model, with MAE: 2.599599, RMSE: 5.58539, and R²: 0.999784. For summer, RF is the best forecasting model, with MAE: 1.03843, RMSE: 2.116325, and R²: 0.999967. For monsoon, RF is the best forecasting model, with MAE: 0.892385, RMSE: 2.417587, and R²: 0.999942. For autumn, RF is the best forecasting model, with MAE: 0.810462, RMSE: 1.928215, and R²: 0.999958. Based on seasonal variations and computing constraints, the findings enable energy system operators to make helpful recommendations for choosing the most effective forecasting models.
Keywords: Machine learning model; deep learning model; statistical model; seasonal; solar energy; Global Horizontal Irradiance; forecasting
Enhancing Stellar Spectra with Diffusion Probabilistic Models: A Novel Approach to Denoising Low SNR Astronomical Data
13
Authors: Jingzhen Sun, Yude Bu, Jiangchuan Zhang, Mengmeng Zhang, Shanshan Li, Ke Wang, Yuhang Zhang, Zhenping Yi, Xiaoming Kong, Meng Liu, Minglei Wu. Journal: Research in Astronomy and Astrophysics, 2025, Issue 10, pp. 69-77 (9 pages)
Astronomical spectra are vital for deriving stellar properties, yet low signal-to-noise ratio (SNR) spectra often obscure key features, complicating accurate analysis. This study presents spec-DDPM, a novel deep learning approach based on Denoising Diffusion Probabilistic Models (DDPM), aimed at denoising low SNR spectra to improve stellar parameter estimation. Leveraging the LAMOST DR10 data set, we developed spec-DDPM using a tailored U-Net architecture (spec-Unet) to iteratively predict and remove noise. The model was trained on 28,500 low and high SNR spectral pairs and benchmarked against conventional methods, including Principal Component Analysis, wavelet techniques, and a modified DnCNN model. The spec-DDPM demonstrated superior performance, with reduced Mean Absolute Error, elevated Structural Similarity Index Measure, and enhanced spectral loss metrics. It effectively preserved critical spectral features and corrected continuum distortions. Validation experiments further confirmed its ability to improve stellar parameter estimation with reduced errors. These results underscore spec-DDPM’s potential to elevate spectral data quality, offering applications in restoring defective spectra and refining large-scale astronomical surveys. This work highlights the transformative role of deep learning in astronomical data processing.
Keywords: techniques: spectroscopic; methods: statistical; methods: data analysis
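For readers unfamiliar with DDPMs, the noise that such a network learns to predict comes from the standard forward process x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε. The sketch below shows that generic step on a toy flux vector. The linear beta schedule and its endpoints are the common defaults from the original DDPM formulation, assumed here for illustration; the abstract does not say which schedule spec-DDPM uses, and all function names are hypothetical.

```python
import math
import random


def linear_betas(steps, start=1e-4, end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed schedule)."""
    if steps == 1:
        return [start]
    return [start + (end - start) * i / (steps - 1) for i in range(steps)]


def alpha_bar(betas):
    """Cumulative products of (1 - beta_t): the signal fraction kept at step t."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def forward_noise(x0, t, abar, rng):
    """Sample x_t from a clean vector x0 in closed form.

    Returns the noised vector and the Gaussian noise eps that was mixed in;
    a denoising network would be trained to predict eps from x_t and t.
    """
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    keep = math.sqrt(abar[t])
    mix = math.sqrt(1.0 - abar[t])
    xt = [keep * x + mix * e for x, e in zip(x0, eps)]
    return xt, eps
```

A call such as `forward_noise(flux, t, alpha_bar(linear_betas(1000)), random.Random(0))` would produce a noisier copy of `flux` as `t` grows, since `alpha_bar` decreases monotonically.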
Dataset Copyright Auditing for Large Models: Fundamentals, Open Problems, and Future Directions
14
Authors: DU Linkang, SU Zhou, YU Xinyi. Journal: ZTE Communications, 2025, Issue 3, pp. 38-47 (10 pages)
The unprecedented scale of large models, such as large language models (LLMs) and text-to-image diffusion models, has raised critical concerns about the unauthorized use of copyrighted data during model training. These concerns have spurred a growing demand for dataset copyright auditing techniques, which aim to detect and verify potential infringements in the training data of commercial AI systems. This paper presents a survey of existing auditing solutions, categorizing them across key dimensions: data modality, model training stage, data overlap scenarios, and model access levels. We highlight major trends, including the prevalence of black-box auditing methods and the emphasis on fine-tuning rather than pre-training. Through an in-depth analysis of 12 representative works, we extract four key observations that reveal the limitations of current methods. Furthermore, we identify three open challenges and propose future directions for robust, multimodal, and scalable auditing solutions. Our findings underscore the urgent need to establish standardized benchmarks and develop auditing frameworks that are resilient to low watermark densities and applicable in diverse deployment settings.
Keywords: dataset copyright auditing; large language models; diffusion models; multimodal auditing; membership inference
The Development of Artificial Intelligence: Toward Consistency in the Logical Structures of Datasets, AI Models, Model Building, and Hardware?
15
Authors: Li Guo, Jinghai Li. Journal: Engineering, 2025, Issue 7, pp. 13-17 (5 pages)
The aim of this article is to explore potential directions for the development of artificial intelligence (AI). It points out that, while current AI can handle the statistical properties of complex systems, it has difficulty effectively processing and fully representing their spatiotemporal complexity patterns. The article also discusses a potential path of AI development in the engineering domain. Based on the existing understanding of the principles of multilevel complexity, this article suggests that consistency among the logical structures of datasets, AI models, model-building software, and hardware will be an important AI development direction and is worthy of careful consideration.
Keywords: consistency; datasets; model building; AI models; artificial intelligence (AI); explore potential directions; hardware
Generating Synthetic Data for Machine Learning Models from the Pediatric Heart Network Fontan I Dataset
16
Authors: Vatche Bahudian, John Valdovinos. Journal: Congenital Heart Disease, 2025, Issue 1, pp. 115-127 (13 pages)
Background: The population of Fontan patients, patients born with a single functioning ventricle, is growing. There is a growing need to develop algorithms for this population that can predict health outcomes. Artificial intelligence models predicting short-term and long-term health outcomes for patients with the Fontan circulation are needed. Generative adversarial networks (GANs) provide a solution for generating realistic and useful synthetic data that can be used to train such models. Methods: Despite their promise, GANs have not been widely adopted in the congenital heart disease research community due, in some part, to a lack of knowledge on how to employ them. In this research study, a GAN was used to generate synthetic data from the Pediatric Heart Network Fontan I dataset. A subset of data consisting of the echocardiographic and BNP measures collected from Fontan patients was used to train the GAN. Two sets of synthetic data were created to understand the effect of data missingness on synthetic data generation. Synthetic data was created from real data in which the missing values were imputed using Multiple Imputation by Chained Equations (MICE) (referred to as synthetic from imputed real samples). In addition, synthetic data was created from real data in which the missing values were dropped (referred to as synthetic from dropped real samples). Both synthetic datasets were evaluated for fidelity by using visual methods which involved comparing histograms and principal component analysis (PCA) plots. Fidelity was measured quantitatively by (1) comparing synthetic and real data using the Kolmogorov-Smirnov test to evaluate the similarity between two distributions and (2) training a neural network to distinguish between real and synthetic samples. Both synthetic datasets were evaluated for utility by training a neural network with synthetic data and testing the neural network on its ability to classify patients that have ventricular dysfunction using echocardiographic measures and serological measures. Results: Using histograms, associated probability density functions, and PCA, both synthetic datasets showed visual resemblance in distribution and variance to real Fontan data. Quantitatively, synthetic data from dropped real samples had higher similarity scores, as demonstrated by the Kolmogorov-Smirnov statistic, for all but one feature (age at Fontan) compared to synthetic data from imputed real samples, which demonstrated dissimilar scores for three features (Echo SV, Echo tda, and BNP). In addition, synthetic data from dropped real samples resembled real data to a larger extent (49.3% classification error) than synthetic data from imputed real samples (65.28% classification error). Classification errors approximating 50% represent datasets that are indistinguishable. In terms of utility, synthetic data created from real data in which the missing values were imputed classified ventricular dysfunction in real data with a classification error of 10.99%. Similarly, a neural network trained on synthetic data derived from real data in which the missing values were dropped could classify ventricular dysfunction in real data with a classification error of 9.44%. Conclusions: Although representing a limited subset of the vast data available on the Pediatric Heart Network, generative adversarial networks can create synthetic data that mimics the probability distribution of real Fontan echocardiographic measures. Clinicians can use these synthetic data to create models that predict health outcomes for Fontan patients.
Keywords: synthetic data; congenital heart disease; Fontan circulation
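The abstract above compares real and synthetic feature distributions with the Kolmogorov-Smirnov test but does not say how it was computed. As a minimal sketch only, the two-sample KS statistic is the largest gap between the two empirical cumulative distribution functions. The function name `ks_statistic` and the sample values are hypothetical illustrations, not the authors' code or data.

```python
from bisect import bisect_right


def ks_statistic(real, synthetic):
    """Two-sample Kolmogorov-Smirnov statistic (illustrative sketch).

    Returns the maximum absolute difference between the empirical CDFs
    of the two samples: 0.0 means identical ECDFs, 1.0 means the samples
    do not overlap at all.
    """
    a, b = sorted(real), sorted(synthetic)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # Both ECDFs are right-continuous step functions, so the largest gap
    # is attained at one of the observed sample values.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


if __name__ == "__main__":
    # Hypothetical values for one feature, for demonstration only.
    print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))
```

A smaller statistic indicates that the synthetic values for a feature follow the real distribution more closely. A library routine such as SciPy's `ks_2samp` would also return a p-value.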
Do Higher Horizontal Resolution Models Perform Better?
17
Author: Shoji KUSUNOKI. Advances in Atmospheric Sciences, 2026, No. 1, pp. 259-262 (4 pages)
Climate model prediction has been improved by enhancing model resolution as well as the implementation of sophisticated physical parameterization and refinement of data assimilation systems [section 6.1 in Wang et al. (2025)]. In relation to seasonal forecasting and climate projection in the East Asian summer monsoon season, proper simulation of the seasonal migration of rain bands by models is a challenging and limiting factor [section 7.1 in Wang et al. (2025)].
Keywords: enhancing model resolution refinement data assimilation systems section climate model climate projection higher horizontal resolution seasonal forecasting simulation seasonal migration rain bands model resolution
Data Processing Solutions on Low Signal-to-noise Data in Loess Plateau Area: A Case Study in Ordos Basin, China
18
Authors: GAO Rongtao, CHENG Yun, TANG Ziqi, LIU Zhao. 《CT理论与应用研究(中英文)》, 2026, No. 1, pp. 154-162 (9 pages)
While the Ordos Basin is recognized for its substantial hydrocarbon exploration prospects, its rugged loess tableland terrain has rendered seismic exploration exceptionally challenging [1-3]. Persistent obstacles such as complex 3D survey planning, low signal-to-noise ratio raw data, inadequate near-surface velocity modeling, and imaging inaccuracy have long hindered the advancement of seismic exploration across this region. Through a problem-solving approach rooted in geological target analysis, this research systematically investigates the behavioral patterns of nodal seismometer-based high-density seismic acquisition in the loess plateau. Tailored advancements in waveform enhancement and depth velocity modelling methodologies have been engineered. Field validations confirm that the optimized workflow demonstrates marked improvements in amplitude preservation and imaging resolution, offering novel insights for future reservoir characterization endeavors.
Keywords: loess plateau; acquisition; low signal to noise ratio; data processing; depth modeling
AI-based learning models for the life cycle prediction and detection of diabetes disorders: A comprehensive perspective
19
Authors: Mohd. Nazim, Mohd. Aquib Ansari, Shahnawaz Ahmad, Mohd. Arif. Medical Data Mining, 2026, No. 2, pp. 43-56 (14 pages)
This paper aims to conduct a systematic literature review (SLR) using an artificial intelligence (AI) approach to predict and diagnose diabetes mellitus. After reviewing the literature published from 2015-2025, the paper aims to identify the most effective AI techniques, the most used datasets, the most widely used data preprocessing techniques, and the most common issues. After analyzing the literature, it has been found that convolutional neural networks (CNNs) and long short-term memory (LSTM) networks are deep learning models that have shown high accuracy in diabetes prediction. Recursive feature elimination (RFE) and SMOTE are feature selection techniques that have significantly improved model accuracy, training time, and interpretability. Amidst this technological advancement, some existing issues persist: data imbalance, the inapplicability of techniques, computational limitations, and a lack of real-time application in a healthcare environment. The literature review has also identified the need for robust, interpretable, and scalable AI systems capable of handling large volumes of data, including real-world data, in the healthcare industry. Furthermore, it has been identified that the benefits should be integrated with wearable health monitoring systems and the development of privacy-preserving models to ensure continuous, secure, and proactive diabetes management.
Keywords: artificial intelligence; machine learning; diabetes prediction; deep learning models; healthcare data analytics
A decision framework for rural domestic sewage treatment models and process: Evidence from Inner Mongolia Autonomous Region, China (Cited by: 1)
20
Authors: Ying Yan, Pengyu Li, Zixuan Wang, Yubo Tan, Tianlong Zheng, Jianguo Liu, Xiaoxia Yang, Junxin Liu. Journal of Environmental Sciences, 2026, No. 1, pp. 302-311 (10 pages)
Rural domestic sewage treatment is critical for environmental protection. This study defines the spatial pattern of villages from the perspective of rural sewage treatment and develops an integrated decision-making system to propose a sewage treatment mode and scheme suitable for local conditions. By considering the village spatial layout and terrain factors, a decision tree model of residential density and terrain type was constructed with accuracies of 76.47% and 96.00%, respectively. Combined with binary classification probability unit regression, an appropriate sewage treatment mode for the village was determined with 87.00% accuracy. The Analytic Hierarchy Process (AHP), combined with the Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS) model, formed the basis for optimal treatment process selection under different emission standards. Verification was conducted in 542 villages across three counties of the Inner Mongolia Autonomous Region, focusing on the standard effluent effect (0.3773), low investment cost (0.3196), and high standard effluent effect (0.5115) to determine the best treatment process for the same emission standard under different needs. The annual environmental and carbon emission benefits of sewage treatment in these villages were estimated. This model matches village density, geographic feature, and social development level, and provides scientific support and a theoretical basis for rural sewage treatment decision-making.
Keywords: rural domestic sewage; sewage treatment model; decision-making; environmental-economic benefits; Inner Mongolia
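The abstract above names AHP combined with TOPSIS for ranking treatment processes but gives no computational detail. The following is a minimal sketch of the standard TOPSIS ranking step only, assuming the criterion weights have already been obtained (for example from AHP). The function name `topsis`, the decision matrix, and the weights are hypothetical illustrations, not values from the study.

```python
import math


def topsis(matrix, weights, benefit):
    """Closeness scores for each alternative (illustrative TOPSIS sketch).

    matrix[i][j] is the raw score of alternative i on criterion j,
    weights[j] is the criterion weight (assumed given, e.g. from AHP),
    benefit[j] is True if larger is better and False for cost criteria.
    Assumes no criterion column is all zeros and that the alternatives
    are not all identical.
    """
    n_crit = len(weights)
    # Vector-normalise each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    cols = list(zip(*v))
    # Ideal and anti-ideal points depend on the criterion direction.
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)  # distance to the ideal solution
        d_neg = math.dist(row, anti)   # distance to the anti-ideal solution
        scores.append(d_neg / (d_pos + d_neg))
    return scores


if __name__ == "__main__":
    # Hypothetical processes scored on effluent quality (benefit) and cost.
    processes = [[9.0, 100.0], [5.0, 300.0], [7.0, 150.0]]
    print(topsis(processes, weights=[0.5, 0.5], benefit=[True, False]))
```

The alternative with the highest closeness score is the preferred one. Changing the weights, as the study does for different needs, can change the ranking.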