Large-scale point cloud datasets form the basis for training various deep learning networks and achieving high-quality network processing tasks. Due to the diversity and robustness constraints of the data, data augmentation (DA) methods are utilised to expand dataset diversity and scale. However, because LiDAR point cloud data from different platforms (such as missile-borne and vehicular LiDAR data) have complex and distinct characteristics, directly applying traditional 2D visual-domain DA methods to 3D data can yield networks that fail to perform the corresponding tasks robustly. To address this issue, the present study explores DA for missile-borne LiDAR point clouds using a Monte Carlo (MC) simulation method that closely resembles practical application. First, a model of the multi-sensor imaging system is established, taking into account the joint errors arising from the platform itself and from relative motion during the imaging process. A distortion simulation method based on MC simulation for augmenting missile-borne LiDAR point cloud data is then proposed, underpinned by an analysis of the combined errors between different modal sensors, achieving high-quality augmentation of point cloud data. The effectiveness of the proposed method in addressing imaging system errors and distortion simulation is validated using the imaging scene dataset constructed in this paper. Comparative experiments between the proposed point cloud DA algorithm and current state-of-the-art algorithms on point cloud detection and single object tracking tasks demonstrate that the proposed method can improve the performance of networks trained on unaugmented datasets by over 17.3% and 17.9%, respectively, surpassing the SOTA performance of current point cloud DA algorithms.
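As a rough illustration of the MC distortion idea, the sketch below perturbs a point cloud with sampled platform attitude/position errors and per-return ranging jitter. The error model, magnitudes, and axis conventions are simplified assumptions for illustration, not the paper's actual multi-sensor error model.

```python
import numpy as np

def rotation_matrix(roll, pitch, yaw):
    """Rotation from small attitude errors (rad), Z-Y-X convention."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def mc_augment(points, n_draws=8, att_sigma=2e-3, pos_sigma=0.05,
               range_sigma=0.02, seed=0):
    """Generate distorted copies of an (N, 3) point cloud by sampling
    platform attitude/position errors and per-return ranging noise."""
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(n_draws):
        R = rotation_matrix(*rng.normal(0.0, att_sigma, size=3))  # attitude error
        t = rng.normal(0.0, pos_sigma, size=3)                    # position error
        jitter = rng.normal(0.0, range_sigma, size=points.shape)  # ranging noise
        out.append(points @ R.T + t + jitter)
    return out

# Example: 8 distorted variants of a synthetic 1000-point cloud
clouds = mc_augment(np.random.rand(1000, 3) * 50.0)
```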
Well logging technology has accumulated a large amount of historical data through four generations of technological development, which forms the basis of well logging big data and digital assets. However, the value of these data has not been well stored, managed and mined. The development of cloud computing technology provides a rare opportunity for a logging big data private cloud. The traditional petrophysical evaluation and interpretation model has encountered great challenges in the face of new evaluation objects, and research on integrating the distributed storage, processing and learning functions of logging big data in a private cloud has not yet been carried out. This study establishes a distributed logging big-data private cloud platform centered on a unified learning model, which achieves the distributed storage and processing of logging big data and facilitates the learning of novel knowledge patterns via a unified logging learning model integrating physical simulation and data models in a large-scale function space, thus resolving the geo-engineering evaluation problem of geothermal fields. Following the research idea of "logging big data cloud platform - unified logging learning model - large function space - knowledge learning & discovery - application", the theoretical foundation of the unified learning model, the cloud platform architecture, data storage and learning algorithms, computing power allocation and platform monitoring, platform stability, and data security are analyzed. The designed logging big data cloud platform realizes parallel distributed storage and processing of data and learning algorithms. The feasibility of constructing a well logging big data cloud platform based on a unified learning model of physics and data is analyzed in terms of the structure, ecology, management and security of the cloud platform. The case study shows that the logging big data cloud platform has obvious technical advantages over traditional logging evaluation methods in terms of knowledge discovery method, data, software and results sharing, accuracy, speed and complexity.
With the rise of remote collaboration, the demand for advanced storage and collaboration tools has rapidly increased. However, traditional collaboration tools primarily rely on access control, leaving data stored on cloud servers vulnerable due to insufficient encryption. This paper introduces a novel mechanism that encrypts data in 'bundle' units, designed to meet the dual requirements of efficiency and security for frequently updated collaborative data. Each bundle includes updated information, allowing only the updated portions to be re-encrypted when changes occur. The encryption method proposed in this paper addresses the inefficiencies of traditional encryption modes, such as Cipher Block Chaining (CBC) and Counter (CTR), which require decrypting and re-encrypting the entire dataset whenever updates occur. The proposed method leverages update-specific information embedded within data bundles and metadata that maps the relationship between these bundles and the plaintext data. By utilizing this information, the method accurately identifies the modified portions and applies algorithms to selectively re-encrypt only those sections. This approach significantly enhances the efficiency of data updates while maintaining high performance, particularly in large-scale data environments. To validate this approach, we conducted experiments measuring execution time as both the size of the modified data and the total dataset size varied. Results show that the proposed method significantly outperforms CBC and CTR modes in execution speed, with greater performance gains as data size increases. Additionally, our security evaluation confirms that this method provides robust protection against both passive and active attacks.
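A minimal sketch of bundle-wise selective re-encryption, assuming each bundle is encrypted independently with AES-GCM and that the metadata simply maps bundle index to nonce; the paper's actual cipher choice, bundle layout, and metadata format are not specified here.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

BUNDLE = 4096  # illustrative bundle size in bytes

def split_bundles(data: bytes):
    return [data[i:i + BUNDLE] for i in range(0, len(data), BUNDLE)]

def encrypt_all(bundles, key):
    """Encrypt each bundle independently; metadata maps index -> nonce."""
    aead = AESGCM(key)
    meta, blobs = [], []
    for b in bundles:
        nonce = os.urandom(12)
        meta.append(nonce)
        blobs.append(aead.encrypt(nonce, b, None))
    return meta, blobs

def update(bundles, meta, blobs, changed, key):
    """Re-encrypt only the bundles whose indices appear in `changed`."""
    aead = AESGCM(key)
    for i in changed:
        meta[i] = os.urandom(12)                      # fresh nonce per rewrite
        blobs[i] = aead.encrypt(meta[i], bundles[i], None)
    return meta, blobs

key = AESGCM.generate_key(bit_length=256)
bundles = split_bundles(os.urandom(3 * BUNDLE))
meta, blobs = encrypt_all(bundles, key)
bundles[1] = os.urandom(BUNDLE)                       # simulate an edit to bundle 1
meta, blobs = update(bundles, meta, blobs, {1}, key)  # bundles 0 and 2 stay untouched
```

The key point the example tries to convey is that an update touches one ciphertext and one metadata entry rather than the whole file, which is where the reported speed-up over whole-file CBC/CTR re-encryption comes from.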
With the continuous advancement of the tiered diagnosis and treatment system, the medical consortium model has gained increasing attention as an important approach to promoting the vertical integration of healthcare resources. Within this context, laboratory data, as a key component of healthcare information systems, urgently requires efficient sharing and intelligent analysis. This paper designs and constructs an intelligent early warning system for laboratory data based on a cloud platform tailored to the medical consortium model. Through standardized data formats and unified access interfaces, the system enables the integration and cleaning of laboratory data across multiple healthcare institutions. By combining medical rule sets with machine learning models, the system achieves graded alerts and rapid responses to abnormal key indicators and potential outbreaks of infectious diseases. Practical deployment results demonstrate that the system significantly improves the utilization efficiency of laboratory data, strengthens public health event monitoring, and optimizes inter-institutional collaboration. The paper also discusses challenges encountered during system implementation, such as inconsistent data standards, security and compliance concerns, and model interpretability, and proposes corresponding optimization strategies. These findings provide a reference for the broader application of intelligent medical early warning systems.
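A toy sketch of how a graded alert might combine a medical rule set with a model-estimated probability; the indicator names, thresholds, and alert levels below are hypothetical, not the system's actual configuration.

```python
def grade_alert(result: dict, model_prob: float) -> str:
    """Combine a rule set with a model-estimated abnormality/outbreak probability
    to produce a graded alert. All thresholds are illustrative."""
    rules_critical = result.get("wbc", 0) > 30 or result.get("crp", 0) > 200
    rules_warning = result.get("wbc", 0) > 15 or result.get("crp", 0) > 100
    if rules_critical or model_prob >= 0.9:
        return "RED"      # immediate notification to clinicians
    if rules_warning or model_prob >= 0.6:
        return "ORANGE"   # reviewed within the hour
    if model_prob >= 0.3:
        return "YELLOW"   # flagged on the monitoring dashboard
    return "NORMAL"

print(grade_alert({"wbc": 18.2, "crp": 40.0}, model_prob=0.41))  # -> ORANGE
```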
Airborne LiDAR (Light Detection and Ranging) is an evolving high-tech active remote sensing technology that has the capability to acquire large-area topographic data and can quickly generate DEM (Digital Elevation Model) products. Combined with image data, this technology can further enrich and extract spatial geographic information. In practice, however, due to the limited operating range of airborne LiDAR and the large area of the task, it is necessary to register and stitch the point clouds of adjacent flight strips. After gross errors are eliminated, the systematic errors in the data need to be effectively reduced. This paper therefore investigates point cloud registration methods in urban building areas, aiming to improve the accuracy and processing efficiency of airborne LiDAR data. An improved post-ICP (Iterative Closest Point) point cloud registration method is proposed to achieve accurate registration and efficient stitching of point clouds, which can provide potential technical support for applications in related fields.
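A compact, library-free sketch of the classic ICP loop that such strip registration builds on (nearest-neighbour correspondences plus an SVD rigid transform); the paper's improved post-ICP refinements are not reproduced here.

```python
import numpy as np
from scipy.spatial import cKDTree

def best_rigid_transform(src, dst):
    """Least-squares rotation/translation mapping src onto dst (Kabsch/SVD)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:            # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cd - R @ cs

def icp(source, target, iters=30, tol=1e-6):
    """Align a source strip to a target strip; returns the transformed source."""
    tree = cKDTree(target)
    src = source.copy()
    prev_err = np.inf
    for _ in range(iters):
        dist, idx = tree.query(src)             # closest-point correspondences
        R, t = best_rigid_transform(src, target[idx])
        src = src @ R.T + t
        err = dist.mean()
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return src
```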
Anomaly detection is an important task for maintaining the performance of a cloud data center. Traditional anomaly detection primarily examines individual Virtual Machine (VM) behavior, neglecting the impact of interactions among multiple VMs on Key Performance Indicator (KPI) data, e.g., memory utilization. Furthermore, the non-stationarity, high complexity, and uncertain periodicity of KPI data in VMs also bring difficulties to deep learning-based anomaly detection tasks. To address these challenges, this paper proposes MCBiWGAN-GTN, a multi-channel semi-supervised time series anomaly detection algorithm based on the Bidirectional Wasserstein Generative Adversarial Network with Graph-Time Network (BiWGAN-GTN) and the Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN). (a) The BiWGAN-GTN algorithm is proposed to extract spatiotemporal information from data. (b) The loss function of BiWGAN-GTN is redesigned to solve the abnormal data intrusion problem during the training process. (c) MCBiWGAN-GTN is designed to reduce data complexity through CEEMDAN for time series decomposition and utilizes BiWGAN-GTN to train the different components. (d) To adapt the proposed algorithm to an entire cloud data center, a cloud data center anomaly detection framework based on Swarm Learning (SL) is designed. The evaluation results on a real-world cloud data center dataset show that MCBiWGAN-GTN outperforms the baseline, with an F1-score of 0.96, an accuracy of 0.935, a precision of 0.954, a recall of 0.967, and an FPR of 0.203. The experiments also verify the stability of MCBiWGAN-GTN, the impact of parameter configurations, and the effectiveness of the proposed SL framework.
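The sketch below conveys only the decompose-then-detect idea behind MCBiWGAN-GTN: the KPI series is split into components and each component is scored separately before the scores are fused. A moving-average split stands in for CEEMDAN and a rolling z-score stands in for the BiWGAN-GTN detectors; both substitutions are deliberate simplifications.

```python
import numpy as np

def decompose(series, window=12):
    """Stand-in for CEEMDAN: split a KPI series into a smooth trend and a
    high-frequency residual (the real method yields several IMFs)."""
    kernel = np.ones(window) / window
    trend = np.convolve(series, kernel, mode="same")
    return trend, series - trend

def component_scores(component, window=12):
    """Score each point by its deviation from a rolling estimate of the component."""
    pad = np.concatenate([np.repeat(component[0], window), component])
    scores = np.empty_like(component)
    for i in range(len(component)):
        ref = pad[i:i + window]
        scores[i] = abs(component[i] - ref.mean()) / (ref.std() + 1e-8)
    return scores

def detect(series, threshold=4.0):
    trend, resid = decompose(series)
    score = component_scores(trend) + component_scores(resid)  # fuse channels
    return score > threshold

kpi = np.sin(np.linspace(0, 20, 500)) + 0.05 * np.random.randn(500)
kpi[300] += 2.0                      # inject a memory-utilization spike
print(np.where(detect(kpi))[0])      # indices flagged as anomalous
```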
A basic procedure for transforming readable data into encoded forms is encryption, which ensures security when the right decryption keys are used. Hadoop is susceptible to possible cyber-attacks because it lacks built-in security measures, even though it can effectively handle and store enormous datasets using the Hadoop Distributed File System (HDFS). The increasing number of data breaches emphasizes how urgently creative encryption techniques are needed in cloud-based big data settings. This paper presents Adaptive Attribute-Based Honey Encryption (AABHE), a state-of-the-art technique that combines honey encryption with Ciphertext-Policy Attribute-Based Encryption (CP-ABE) to provide improved data security. Even if intercepted, AABHE makes sure that sensitive data cannot be accessed by unauthorized parties. With a focus on protecting huge files in HDFS, the suggested approach achieves 98% security robustness and 95% encryption efficiency, outperforming other encryption methods including Ciphertext-Policy Attribute-Based Encryption (CP-ABE), Key-Policy Attribute-Based Encryption (KP-ABE), and the Advanced Encryption Standard combined with Attribute-Based Encryption (AES+ABE). By fixing Hadoop's security flaws, AABHE fortifies its protections against data breaches and enhances Hadoop's dependability as a platform for processing and storing massive amounts of data.
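A toy illustration of the honey-encryption principle only (the CP-ABE policy layer and the adaptive parts of AABHE are omitted): the plaintext is encoded as an index into a table of plausible records, so decryption under a wrong key still yields a legitimate-looking value. The decoy table and key handling are purely illustrative.

```python
import hashlib, os

# Plausible candidate plaintexts (the "distribution-transforming encoder" table).
DECOYS = [b"salary=52000", b"salary=61000", b"salary=74000", b"salary=88000"]

def _mask(key: bytes, nonce: bytes) -> int:
    digest = hashlib.sha256(key + nonce).digest()
    return int.from_bytes(digest[:4], "big") % len(DECOYS)

def encrypt(message: bytes, key: bytes):
    index = DECOYS.index(message)                  # encode message as a table index
    nonce = os.urandom(16)
    return nonce, (index + _mask(key, nonce)) % len(DECOYS)

def decrypt(nonce: bytes, cipher: int, key: bytes) -> bytes:
    index = (cipher - _mask(key, nonce)) % len(DECOYS)
    return DECOYS[index]                           # any key yields a plausible record

nonce, c = encrypt(b"salary=74000", key=b"correct-key")
print(decrypt(nonce, c, b"correct-key"))           # b'salary=74000'
print(decrypt(nonce, c, b"guessed-key"))           # a decoy, indistinguishable in form
```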
The integration of the Internet of Things (IoT) into healthcare systems improves patient care, boosts operational efficiency, and contributes to cost-effective healthcare delivery. However, overcoming several associated challenges, such as data security, interoperability, and ethical concerns, is crucial to realizing the full potential of IoT in healthcare. Real-time anomaly detection plays a key role in protecting patient data and maintaining device integrity amidst the additional security risks posed by interconnected systems. In this context, this paper presents a novel method for healthcare data privacy analysis. The technique is based on the identification of anomalies in cloud-based IoT networks and is optimized using explainable artificial intelligence. For anomaly detection, the Radial Boltzmann Gaussian Temporal Fuzzy Network (RBGTFN) is used to perform the privacy analysis of healthcare data. Remora Colony Swarm Optimization is then used to optimize the network. The performance of the model in identifying anomalies across a variety of healthcare data is evaluated in an experimental study measuring the accuracy, precision, latency, Quality of Service (QoS), and scalability of the model. A remarkable 95% precision, 93% latency, 89% quality of service, 98% detection accuracy, and 96% scalability were obtained by the suggested model, as shown by the findings.
Cloud computing has become an essential technology for the management and processing of large datasets, offering scalability, high availability, and fault tolerance. However, optimizing data replication across multiple data centers poses a significant challenge, especially when balancing opposing goals such as latency, storage costs, energy consumption, and network efficiency. This study introduces a novel Dynamic Optimization Algorithm called Dynamic Multi-Objective Gannet Optimization (DMGO), designed to enhance data replication efficiency in cloud environments. Unlike traditional static replication systems, DMGO adapts dynamically to variations in network conditions, system demand, and resource availability. The approach utilizes multi-objective optimization approaches to efficiently balance data access latency, storage efficiency, and operational costs. DMGO consistently evaluates data center performance and adjusts replication algorithms in real time to guarantee optimal system efficiency. Experimental evaluations conducted in a simulated cloud environment demonstrate that DMGO significantly outperforms conventional static algorithms, achieving faster data access, lower storage overhead, reduced energy consumption, and improved scalability. The proposed methodology offers a robust and adaptable solution for modern cloud systems, ensuring efficient resource consumption while maintaining high performance.
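A simplified sketch of the kind of multi-objective trade-off DMGO navigates, scoring candidate data centers by a weighted sum of normalized objectives. The objective values and weights are made up, and simple ranking stands in for the gannet optimization step.

```python
import numpy as np

# Candidate data centers: [latency_ms, storage_cost, energy_kwh, bandwidth_util]
CENTERS = np.array([
    [12.0, 0.023, 1.8, 0.40],
    [35.0, 0.015, 1.2, 0.25],
    [20.0, 0.019, 1.5, 0.70],
    [55.0, 0.010, 0.9, 0.10],
])

def pick_replicas(centers, weights=(0.4, 0.2, 0.2, 0.2), k=2):
    """Rank data centers by a weighted sum of min-max normalized objectives
    (all minimized) and return the indices of the k best replica sites."""
    norm = (centers - centers.min(axis=0)) / (np.ptp(centers, axis=0) + 1e-12)
    score = norm @ np.asarray(weights)
    return np.argsort(score)[:k]

print(pick_replicas(CENTERS))   # indices of the two best sites under these weights
```

In a dynamic setting the objective values would be re-measured periodically and the placement re-scored, which is the adaptive behaviour the abstract attributes to DMGO.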
The development of machine learning and deep learning algorithms, as well as the improvement of hardware computing power, provides a rare opportunity for a logging big data private cloud. With the deepening of exploration and development and the requirements of low-carbon development, the focus of the oil and gas industry is gradually shifting to the exploration and development of renewable energy sources such as the deep sea, the deep earth and geothermal energy. The traditional petrophysical evaluation and interpretation model has encountered great challenges in the face of new evaluation objects. This study establishes a distributed logging big data private cloud platform with a unified learning model as the key, which realizes the distributed storage and processing of logging big data and enables the learning of brand-new knowledge patterns from multi-attribute data in a large function space within a unified logging learning model integrating expert knowledge and data models, so as to solve the problem of geo-engineering evaluation of geothermal fields. Following the research idea of "logging big data cloud platform - unified logging learning model - large function space - knowledge learning & discovery - application", the theoretical foundation of the unified learning model, the cloud platform architecture, data storage and learning algorithms, computing power allocation and platform monitoring, platform stability, and data security are analyzed. The designed logging big data cloud platform realizes parallel distributed storage and processing of data and learning algorithms. New knowledge of geothermal evaluation is found in a large function space and applied to the geo-engineering evaluation of geothermal fields. The examples show good application in the selection of logging series in geothermal fields, quality control of logging data, identification of complex lithology in geothermal fields, evaluation of reservoir fluids, checking of associated helium, evaluation of cementing quality, evaluation of well-side fractures, and evaluation of geothermal water recharge under the remote logging module of the cloud platform. The first and second cementing surfaces of cemented wells in geothermal fields were evaluated, as well as the development of well-side distal fractures and fracture extension orientation. Because well-side fracture communication forms a good fluid pathway, with large flow rates and long flow paths in the thermal-storage fissure system, the results support the design of the geothermal water recharge program.
Snow cover plays a critical role in global climate regulation and hydrological processes. Accurate monitoring is essential for understanding snow distribution patterns, managing water resources, and assessing the impacts of climate change. Remote sensing has become a vital tool for snow monitoring, with the widely used Moderate-resolution Imaging Spectroradiometer (MODIS) snow products from the Terra and Aqua satellites. However, cloud cover often interferes with snow detection, making cloud removal techniques crucial for reliable snow product generation. This study evaluated the accuracy of four MODIS snow cover datasets generated through different cloud removal algorithms. Using real-time field camera observations from four stations in the Tianshan Mountains, China, this study assessed the performance of these datasets during three distinct snow periods: the snow accumulation period (September-November), the snowmelt period (March-June), and the stable snow period (December-February of the following year). The findings showed that cloud-free snow products generated using the Hidden Markov Random Field (HMRF) algorithm consistently outperformed the others, particularly under cloud cover, while cloud-free snow products using near-day synthesis and the spatiotemporal adaptive fusion method with error correction (STAR) demonstrated varying performance depending on terrain complexity and cloud conditions. This study highlighted the importance of considering terrain features, land cover types, and snow dynamics when selecting cloud removal methods, particularly in areas with rapid snow accumulation and melting. The results suggested that future research should focus on improving cloud removal algorithms through the integration of machine learning, multi-source data fusion, and advanced remote sensing technologies. By expanding validation efforts and refining cloud removal strategies, more accurate and reliable snow products can be developed, contributing to enhanced snow monitoring and better management of water resources in alpine and arid areas.
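A small sketch of the kind of per-period validation described here: binary snow/no-snow agreement between a cloud-removed product and station camera observations. The daily series below are made up for illustration.

```python
import numpy as np

def validate(product, truth):
    """Binary snow/no-snow agreement between a cloud-removed snow product
    and field-camera observations at the same station and dates."""
    product, truth = np.asarray(product, bool), np.asarray(truth, bool)
    tp = np.sum(product & truth)
    tn = np.sum(~product & ~truth)
    fp = np.sum(product & ~truth)
    fn = np.sum(~product & truth)
    overall = (tp + tn) / len(truth)
    omission = fn / max(tp + fn, 1)      # snow the product missed
    commission = fp / max(tp + fp, 1)    # snow the product falsely reported
    return overall, omission, commission

# Illustrative daily series for one snowmelt-period station (1 = snow observed)
camera = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
hmrf   = [1, 1, 0, 0, 0, 1, 0, 0, 1, 0]
print(validate(hmrf, camera))   # overall accuracy, omission error, commission error
```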
Cloud storage, a core component of cloud computing, plays a vital role in the storage and management of data. Electronic Health Records (EHRs), which document users' health information, are typically stored on cloud servers. However, users' sensitive data would then become unregulated. In the event of data loss, cloud storage providers might conceal the fact that data has been compromised to protect their reputation and mitigate losses. Ensuring the integrity of data stored in the cloud remains a pressing issue that urgently needs to be addressed. In this paper, we propose a data auditing scheme for cloud-based EHRs that incorporates recoverability and batch auditing, alongside a thorough security and performance evaluation. Our scheme builds upon the indistinguishability-based privacy-preserving auditing approach proposed by Zhou et al. We identify that this scheme is insecure and vulnerable to forgery attacks on data storage proofs. To address these vulnerabilities, we enhanced the auditing process using masking techniques and designed new algorithms to strengthen security. We also provide formal proof of the security of the signature algorithm and the auditing scheme. Furthermore, our results show that our scheme effectively protects user privacy and is resilient against malicious attacks. Experimental results indicate that our scheme is not only secure and efficient but also supports batch auditing of cloud data. Specifically, when auditing 10,000 users, batch auditing reduces computational overhead by 101 s compared to normal auditing.
Cloud data centres have evolved with an energy management problem due to the constant increase in size, complexity and enormous consumption of energy. Energy management is a challenging and critical issue in cloud data centres and an important research concern. In this paper, we propose a cuckoo search (CS)-based optimisation technique for virtual machine (VM) selection and a novel placement algorithm considering different constraints. The energy consumption model and the simulation model have been implemented for the efficient selection of VMs. The proposed model, CSOA-VM, not only lessens service level agreement (SLA) violations but also minimises VM migrations. The proposed model also saves energy, and the performance analysis shows that the energy consumption obtained is 1.35 kWh, the SLA violation is 9.2 and the number of VM migrations is about 268. Thus, there is an improvement of about 1.8% in energy consumption and a 2.1% improvement (reduction) in SLA violations in comparison to existing techniques.
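A compact sketch of cuckoo-search-style VM-to-host selection with Lévy flights, minimizing an energy proxy plus an SLA-violation penalty. The host power figures, VM loads, and cost function are illustrative assumptions rather than the paper's CSOA-VM formulation.

```python
import numpy as np
from math import gamma

rng = np.random.default_rng(1)
HOST_POWER = np.array([210.0, 180.0, 250.0, 160.0])   # watts at full load (illustrative)
VM_LOAD = np.array([0.30, 0.25, 0.20, 0.15, 0.10])    # normalized CPU demand of 5 VMs

def cost(assign):
    """Energy proxy plus an SLA penalty for hosts pushed past 90% utilization."""
    util = np.zeros(len(HOST_POWER))
    np.add.at(util, assign, VM_LOAD)
    return float(np.sum(HOST_POWER * util) + 500.0 * np.sum(np.maximum(util - 0.9, 0.0)))

def levy(size, beta=1.5):
    """Lévy-flight step lengths via Mantegna's algorithm."""
    sigma = (gamma(1 + beta) * np.sin(np.pi * beta / 2) /
             (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    return rng.normal(0, sigma, size) / np.abs(rng.normal(0, 1, size)) ** (1 / beta)

def cuckoo_search(n_nests=15, iters=200, pa=0.25):
    nests = rng.integers(0, len(HOST_POWER), size=(n_nests, len(VM_LOAD)))
    fitness = np.array([cost(n) for n in nests])
    for _ in range(iters):
        best = nests[fitness.argmin()]
        for i in range(n_nests):
            step = levy(len(VM_LOAD))
            cand = np.clip(np.round(nests[i] + step * (nests[i] - best)),
                           0, len(HOST_POWER) - 1).astype(int)
            if cost(cand) < fitness[i]:
                nests[i], fitness[i] = cand, cost(cand)
        worst = fitness.argsort()[-int(pa * n_nests):]   # abandon the worst nests
        nests[worst] = rng.integers(0, len(HOST_POWER), size=(len(worst), len(VM_LOAD)))
        fitness[worst] = [cost(n) for n in nests[worst]]
    return nests[fitness.argmin()], fitness.min()

print(cuckoo_search())   # best VM-to-host assignment and its cost
```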
Early detection of convective clouds is vital for minimizing hazardous impacts. Forecasting convective initiation (CI) using current multispectral geostationary meteorological satellites is often challenged by high false-alarm rates and missed detections caused by limited resolution. In contrast, high-resolution earth observation satellites offer more detailed texture information, improving early detection capabilities. The authors propose a novel methodology that integrates the advanced features of China's latest-generation satellites, Gaofen-4 (GF-4) and Fengyun-4A (FY-4A). This fusion method retains GF-4's high-resolution details and FY-4A's multispectral information. Two cases from different observational scenarios and weather conditions under GF-4's staring mode were carried out to compare the CI forecast results based on fused data and solely on FY-4A data. The fused data demonstrated superior performance in detecting smaller-scale convective clouds, enabling earlier forecasting with a lead time of 15-30 minutes, and more accurate location identification. Integrating high-resolution earth observation satellites into early convective cloud detection provides valuable insights for forecasters and decision-makers, particularly given the current resolution limitations of geostationary meteorological satellites.
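A generic high-pass-filter fusion sketch showing how a coarse multispectral band can be combined with high-resolution texture on a common grid; it is not the paper's fusion algorithm, and the grid sizes and resolution ratio are illustrative.

```python
import numpy as np
from scipy.ndimage import zoom, uniform_filter

def hpf_fuse(gf_pan, fy_band, ratio=8):
    """Inject high-resolution spatial detail from a GF-4-like image into an
    upsampled FY-4A-like band via high-pass-filter fusion."""
    upsampled = zoom(fy_band, ratio, order=1)              # coarse band on the fine grid
    upsampled = upsampled[:gf_pan.shape[0], :gf_pan.shape[1]]
    detail = gf_pan - uniform_filter(gf_pan, size=ratio)   # high-frequency texture
    return upsampled + detail

gf = np.random.rand(256, 256)          # stand-in for a high-resolution GF-4 image
fy = np.random.rand(32, 32)            # stand-in for a coarse FY-4A channel
fused = hpf_fuse(gf, fy)
print(fused.shape)                     # (256, 256)
```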
In this study, a variety of high-resolution satellite data were used to analyze the similarities and differences in the horizontal and vertical cloud microphysical characteristics of 11 tropical cyclones (TCs) in three different ocean basins. The results show that for the 11 TCs, regardless of the season in which they were generated, their melting layers were all distributed at a height of about 5 km in the vertical direction when they reached or approached Category 4. The high values of ice water content in the vertical direction of the 11 TCs all reach or approach about 2000 g cm⁻³. The total attenuated backscattering coefficient at 532 nm, TAB-532, can successfully characterize the distribution of areas with high ice water content when its vertical distribution is concentrated near 0.1 km⁻¹ sr⁻¹, possibly because the diameter distribution of the corresponding range of aerosol particles has a more favorable effect on the formation of ice nuclei, indicating that aerosols have a significant impact on ice-phase processes and characteristics. Moreover, the analysis of the horizontal cloud water content and of the distributions of cloud water path (CWP) and ice water path (IWP) shows that when the sea surface temperature is relatively high and the vertical wind shear is relatively small, the CWP and the IWP can reach relatively high values, which also demonstrates the importance of environmental field factors in influencing TC cloud microphysical characteristics.
In the task of inspecting underwater suspended pipelines, multi-beam sonar (MBS) can provide two-dimensional water column images (WCIs). However, systematic interferences (e.g., sidelobe effects) may induce misdetection in WCIs. To address this issue and improve detection accuracy, we developed a density-based clustering method for three-dimensional water column point clouds. During the processing of WCIs, sidelobe effects are mitigated using a bilateral filter and brightness transformation. The cross-sectional point cloud of the pipeline is then extracted using the Canny operator. In the detection phase, the target is identified using density-based spatial clustering of applications with noise (DBSCAN). However, the selection of appropriate DBSCAN parameters is complicated by the uneven distribution of the water column point cloud. To overcome this, we propose an improved DBSCAN based on a parameter interval estimation method (PIE-DBSCAN). First, kernel density estimation (KDE) is used to determine the candidate interval of the parameters, after which the exact cluster number is determined via density peak clustering (DPC). Finally, the optimal parameters are selected by comparing the mean silhouette coefficients. To validate the performance of PIE-DBSCAN, we collected water column point clouds from an anechoic tank and the South China Sea. PIE-DBSCAN successfully detected both the target points of the suspended pipeline and non-target points on the seafloor surface. Compared to the K-Means and Mean-Shift algorithms, PIE-DBSCAN demonstrates superior clustering performance and shows feasibility in practical applications.
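A simplified sketch of the parameter-interval idea: a KDE over k-nearest-neighbour distances suggests candidate eps values, and the final choice is made by the mean silhouette coefficient of the resulting DBSCAN clustering. The density-peak-clustering step for fixing the cluster number is omitted, and synthetic blobs stand in for a water-column point cloud.

```python
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.neighbors import NearestNeighbors

def select_eps(points, min_pts=8, n_candidates=15):
    """Candidate eps interval from the density of k-NN distances; final choice
    by the mean silhouette coefficient of the non-noise points."""
    dists, _ = NearestNeighbors(n_neighbors=min_pts).fit(points).kneighbors(points)
    kdist = dists[:, -1]
    grid = np.linspace(kdist.min(), kdist.max(), 200)
    density = gaussian_kde(kdist)(grid)
    lo, hi = grid[density.argmax()], np.percentile(kdist, 90)   # candidate interval
    best = (None, -1.0)
    for eps in np.linspace(lo, hi, n_candidates):
        labels = DBSCAN(eps=eps, min_samples=min_pts).fit_predict(points)
        mask = labels != -1
        if len(set(labels[mask])) < 2:
            continue
        score = silhouette_score(points[mask], labels[mask])
        if score > best[1]:
            best = (eps, score)
    return best

pts, _ = make_blobs(n_samples=500, centers=3, n_features=3, cluster_std=0.5, random_state=0)
print(select_eps(pts))   # (selected eps, its mean silhouette coefficient)
```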
The spatial distribution of discontinuities and the size of rock blocks are key indicators for rock mass quality evaluation and rockfall risk assessment. Traditional manual measurement is often dangerous or unreachable on some high and steep rock slopes. In contrast, unmanned aerial vehicle (UAV) photogrammetry is not limited by terrain conditions and can efficiently collect high-precision three-dimensional (3D) point clouds of rock masses through all-round, multi-angle photography for rock mass characterization. In this paper, a new method based on a 3D point cloud is proposed for discontinuity identification and refined rock block modeling. The method consists of four steps: (1) establish a point cloud spatial topology, and calculate the point cloud normal vectors and average point spacing based on several machine learning algorithms; (2) extract discontinuities using the density-based spatial clustering of applications with noise (DBSCAN) algorithm and fit the discontinuity planes by combining principal component analysis (PCA) with the natural breaks (NB) method; (3) generate an embedded discontinuity point cloud by inserting points along line segments; and (4) adopt a Poisson reconstruction method for refined rock block modeling. The proposed method was applied to an outcrop of an ultrahigh, steep rock slope and compared with the results of previous studies and manual surveys. The results show that the method can eliminate the influence of discontinuity undulations on orientation measurement and capture local concave-convex characteristics in the modeling of rock blocks. The calculation results are accurate and reliable and can meet the practical requirements of engineering.
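A minimal sketch in the spirit of step (2): fitting a plane to one clustered discontinuity patch by PCA/SVD and converting its normal to dip direction and dip angle. The clustering, natural-breaks, topology, and Poisson reconstruction steps are not shown, and the synthetic patch is illustrative.

```python
import numpy as np

def fit_discontinuity(points):
    """Fit a plane to one clustered discontinuity patch by PCA and convert the
    normal to dip direction / dip angle (degrees, Z up, Y north, X east)."""
    centered = points - points.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    normal = vt[-1]                       # direction of smallest variance
    if normal[2] < 0:                     # make the normal point upward
        normal = -normal
    dip = np.degrees(np.arccos(np.clip(normal[2], -1.0, 1.0)))
    dip_dir = (np.degrees(np.arctan2(normal[0], normal[1])) + 360.0) % 360.0
    return normal, dip_dir, dip

# Synthetic patch dipping about 30 degrees toward the east (azimuth ~090)
x, y = np.meshgrid(np.linspace(0, 5, 20), np.linspace(0, 5, 20))
z = -np.tan(np.radians(30)) * x + 0.01 * np.random.randn(*x.shape)
patch = np.column_stack([x.ravel(), y.ravel(), z.ravel()])
print(fit_discontinuity(patch))          # dip direction ~90, dip ~30
```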
This study introduces a new ocean surface friction velocity scheme and a modified Thompson cloud microphysics parameterization scheme into the CMA-TYM model. The impact of these two parameterization schemes on the prediction of the movement track and intensity of Typhoon Kompasu in 2021 is examined. Additionally, the possible reasons for their effects on tropical cyclone (TC) intensity prediction are analyzed. Statistical results show that both parameterization schemes improve the predictions of Typhoon Kompasu's track and intensity. The influence on track prediction becomes evident after 60 h of model integration, while the significant positive impact on intensity prediction is observed after 66 h. Further analysis reveals that these two schemes affect the timing and magnitude of extreme TC intensity values by influencing the evolution of the TC's warm-core structure.
Missing data presents a crucial challenge in data analysis, especially in high-dimensional datasets, where missing data often leads to biased conclusions and degraded model performance. In this study, we present a novel autoencoder-based imputation framework that integrates a composite loss function to enhance robustness and precision. The proposed loss combines (i) a guided, masked mean squared error focusing on missing entries; (ii) a noise-aware regularization term to improve resilience against data corruption; and (iii) a variance penalty to encourage expressive yet stable reconstructions. We evaluate the proposed model across four missingness mechanisms, namely Missing Completely at Random, Missing at Random, Missing Not at Random, and Missing Not at Random with quantile censorship, under systematically varied feature counts, sample sizes, and missingness ratios ranging from 5% to 60%. Four publicly available real-world datasets (Stroke Prediction, Pima Indians Diabetes, Cardiovascular Disease, and Framingham Heart Study) were used, and the obtained results show that our proposed model consistently outperforms baseline methods, including traditional and deep learning-based techniques. An ablation study reveals the additive value of each component in the loss function. Additionally, we assessed the downstream utility of imputed data through classification tasks, where datasets imputed by the proposed method yielded the highest receiver operating characteristic area under the curve scores across all scenarios. The model demonstrates strong scalability and robustness, improving performance with larger datasets and higher feature counts. These results underscore the capacity of the proposed method to produce not only numerically accurate but also semantically useful imputations, making it a promising solution for robust data recovery in clinical applications.
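A sketch of the composite loss under stated assumptions about the form of each term: a masked MSE on missing entries, a noise-aware consistency term, and a variance penalty. The exact weighting, network architecture, and noise model of the paper are not reproduced.

```python
import torch

def composite_loss(x_hat, x_true, miss_mask, x_hat_noisy, lam_noise=0.1, lam_var=0.01):
    """Composite imputation loss:
    (i)   masked MSE computed on the missing entries only,
    (ii)  a noise-aware term keeping reconstructions stable under input corruption,
    (iii) a variance penalty discouraging collapsed (near-constant) outputs.
    Weights and the corruption model are illustrative."""
    masked_mse = ((x_hat - x_true) ** 2 * miss_mask).sum() / miss_mask.sum().clamp(min=1)
    noise_reg = ((x_hat - x_hat_noisy) ** 2).mean()
    var_penalty = (x_true.var(dim=0) - x_hat.var(dim=0)).abs().mean()
    return masked_mse + lam_noise * noise_reg + lam_var * var_penalty

# One illustrative step: reconstruct from clean and from noise-corrupted inputs
x_true = torch.randn(64, 10)
miss_mask = (torch.rand(64, 10) < 0.3).float()     # 1 where the value was missing
x_in = x_true * (1 - miss_mask)                    # zeros at missing positions
autoencoder = torch.nn.Sequential(
    torch.nn.Linear(10, 8), torch.nn.ReLU(), torch.nn.Linear(8, 10))
x_hat = autoencoder(x_in)
x_hat_noisy = autoencoder(x_in + 0.05 * torch.randn_like(x_in))
loss = composite_loss(x_hat, x_true, miss_mask, x_hat_noisy)
loss.backward()
```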
Modern intrusion detection systems (MIDS) face persistent challenges in coping with the rapid evolution of cyber threats, high-volume network traffic, and imbalanced datasets. Traditional models often lack the robustness and explainability required to detect novel and sophisticated attacks effectively. This study introduces an advanced, explainable machine learning framework for multi-class IDS using the KDD99 and IDS datasets, which reflect real-world network behavior through a blend of normal and diverse attack classes. The methodology begins with sophisticated data preprocessing, incorporating both RobustScaler and QuantileTransformer to address outliers and skewed feature distributions, ensuring standardized and model-ready inputs. Critical dimensionality reduction is achieved via the Harris Hawks Optimization (HHO) algorithm, a nature-inspired metaheuristic modeled on hawks' hunting strategies. HHO efficiently identifies the most informative features by optimizing a fitness function based on classification performance. Following feature selection, SMOTE is applied to the training data to resolve class imbalance by synthetically augmenting underrepresented attack types. A stacked architecture is then employed, combining the strengths of XGBoost, SVM, and RF as base learners. This layered approach improves prediction robustness and generalization by balancing bias and variance across diverse classifiers. The model was evaluated using standard classification metrics: precision, recall, F1-score, and overall accuracy. The best overall performance was recorded with an accuracy of 99.44% for UNSW-NB15, demonstrating the model's effectiveness. After balancing, the model demonstrated a clear improvement in detecting the attacks. We tested the model on four datasets to show the effectiveness of the proposed approach and performed an ablation study to check the effect of each parameter. The proposed model is also computationally efficient. To support transparency and trust in decision-making, explainable AI (XAI) techniques are incorporated that provide both global and local insight into feature contributions and offer intuitive visualizations for individual predictions. This makes the framework suitable for practical deployment in cybersecurity environments that demand both precision and accountability.
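A runnable sketch of the balancing-plus-stacking part of such a pipeline (SMOTE on the training split, then XGBoost/SVM/RF stacked under a logistic-regression meta-learner). The HHO feature-selection step and the QuantileTransformer stage are omitted, and synthetic data stand in for UNSW-NB15/KDD99.

```python
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import RobustScaler
from sklearn.svm import SVC
from xgboost import XGBClassifier

# Imbalanced stand-in for an IDS dataset (real work would use HHO-selected features)
X, y = make_classification(n_samples=4000, n_features=25, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

scaler = RobustScaler().fit(X_tr)                    # fit on training data only
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_tr, y_tr)  # balance the training split

stack = StackingClassifier(
    estimators=[
        ("xgb", XGBClassifier(n_estimators=200, eval_metric="logloss")),
        ("svm", SVC(probability=True)),
        ("rf", RandomForestClassifier(n_estimators=200)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
)
stack.fit(X_bal, y_bal)
print(stack.score(X_te, y_te))   # accuracy on the untouched (still imbalanced) test split
```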
基金Postgraduate Innovation Top notch Talent Training Project of Hunan Province,Grant/Award Number:CX20220045Scientific Research Project of National University of Defense Technology,Grant/Award Number:22-ZZCX-07+2 种基金New Era Education Quality Project of Anhui Province,Grant/Award Number:2023cxcysj194National Natural Science Foundation of China,Grant/Award Numbers:62201597,62205372,1210456foundation of Hefei Comprehensive National Science Center,Grant/Award Number:KY23C502。
文摘Large-scale point cloud datasets form the basis for training various deep learning networks and achieving high-quality network processing tasks.Due to the diversity and robustness constraints of the data,data augmentation(DA)methods are utilised to expand dataset diversity and scale.However,due to the complex and distinct characteristics of LiDAR point cloud data from different platforms(such as missile-borne and vehicular LiDAR data),directly applying traditional 2D visual domain DA methods to 3D data can lead to networks trained using this approach not robustly achieving the corresponding tasks.To address this issue,the present study explores DA for missile-borne LiDAR point cloud using a Monte Carlo(MC)simulation method that closely resembles practical application.Firstly,the model of multi-sensor imaging system is established,taking into account the joint errors arising from the platform itself and the relative motion during the imaging process.A distortion simulation method based on MC simulation for augmenting missile-borne LiDAR point cloud data is proposed,underpinned by an analysis of combined errors between different modal sensors,achieving high-quality augmentation of point cloud data.The effectiveness of the proposed method in addressing imaging system errors and distortion simulation is validated using the imaging scene dataset constructed in this paper.Comparative experiments between the proposed point cloud DA algorithm and the current state-of-the-art algorithms in point cloud detection and single object tracking tasks demonstrate that the proposed method can improve the network performance obtained from unaugmented datasets by over 17.3%and 17.9%,surpassing SOTA performance of current point cloud DA algorithms.
基金supported By Grant (PLN2022-14) of State Key Laboratory of Oil and Gas Reservoir Geology and Exploitation (Southwest Petroleum University)。
文摘Well logging technology has accumulated a large amount of historical data through four generations of technological development,which forms the basis of well logging big data and digital assets.However,the value of these data has not been well stored,managed and mined.With the development of cloud computing technology,it provides a rare development opportunity for logging big data private cloud.The traditional petrophysical evaluation and interpretation model has encountered great challenges in the face of new evaluation objects.The solution research of logging big data distributed storage,processing and learning functions integrated in logging big data private cloud has not been carried out yet.To establish a distributed logging big-data private cloud platform centered on a unifi ed learning model,which achieves the distributed storage and processing of logging big data and facilitates the learning of novel knowledge patterns via the unifi ed logging learning model integrating physical simulation and data models in a large-scale functional space,thus resolving the geo-engineering evaluation problem of geothermal fi elds.Based on the research idea of“logging big data cloud platform-unifi ed logging learning model-large function space-knowledge learning&discovery-application”,the theoretical foundation of unified learning model,cloud platform architecture,data storage and learning algorithm,arithmetic power allocation and platform monitoring,platform stability,data security,etc.have been carried on analysis.The designed logging big data cloud platform realizes parallel distributed storage and processing of data and learning algorithms.The feasibility of constructing a well logging big data cloud platform based on a unifi ed learning model of physics and data is analyzed in terms of the structure,ecology,management and security of the cloud platform.The case study shows that the logging big data cloud platform has obvious technical advantages over traditional logging evaluation methods in terms of knowledge discovery method,data software and results sharing,accuracy,speed and complexity.
基金supported by the Institute of Information&communications Technology Planning&Evaluation(IITP)grant funded by the Korea government(MSIT)(RS-2024-00399401,Development of Quantum-Safe Infrastructure Migration and Quantum Security Verification Technologies).
文摘With the rise of remote collaboration,the demand for advanced storage and collaboration tools has rapidly increased.However,traditional collaboration tools primarily rely on access control,leaving data stored on cloud servers vulnerable due to insufficient encryption.This paper introduces a novel mechanism that encrypts data in‘bundle’units,designed to meet the dual requirements of efficiency and security for frequently updated collaborative data.Each bundle includes updated information,allowing only the updated portions to be reencrypted when changes occur.The encryption method proposed in this paper addresses the inefficiencies of traditional encryption modes,such as Cipher Block Chaining(CBC)and Counter(CTR),which require decrypting and re-encrypting the entire dataset whenever updates occur.The proposed method leverages update-specific information embedded within data bundles and metadata that maps the relationship between these bundles and the plaintext data.By utilizing this information,the method accurately identifies the modified portions and applies algorithms to selectively re-encrypt only those sections.This approach significantly enhances the efficiency of data updates while maintaining high performance,particularly in large-scale data environments.To validate this approach,we conducted experiments measuring execution time as both the size of the modified data and the total dataset size varied.Results show that the proposed method significantly outperforms CBC and CTR modes in execution speed,with greater performance gains as data size increases.Additionally,our security evaluation confirms that this method provides robust protection against both passive and active attacks.
文摘With the continuous advancement of the tiered diagnosis and treatment system,the medical consortium model has gained increasing attention as an important approach to promoting the vertical integration of healthcare resources.Within this context,laboratory data,as a key component of healthcare information systems,urgently requires efficient sharing and intelligent analysis.This paper designs and constructs an intelligent early warning system for laboratory data based on a cloud platform tailored to the medical consortium model.Through standardized data formats and unified access interfaces,the system enables the integration and cleaning of laboratory data across multiple healthcare institutions.By combining medical rule sets with machine learning models,the system achieves graded alerts and rapid responses to abnormal key indicators and potential outbreaks of infectious diseases.Practical deployment results demonstrate that the system significantly improves the utilization efficiency of laboratory data,strengthens public health event monitoring,and optimizes inter-institutional collaboration.The paper also discusses challenges encountered during system implementation,such as inconsistent data standards,security and compliance concerns,and model interpretability,and proposes corresponding optimization strategies.These findings provide a reference for the broader application of intelligent medical early warning systems.
基金Guangxi Key Laboratory of Spatial Information and Geomatics(21-238-21-12)Guangxi Young and Middle-aged Teachers’Research Fundamental Ability Enhancement Project(2023KY1196).
文摘Airborne LiDAR(Light Detection and Ranging)is an evolving high-tech active remote sensing technology that has the capability to acquire large-area topographic data and can quickly generate DEM(Digital Elevation Model)products.Combined with image data,this technology can further enrich and extract spatial geographic information.However,practically,due to the limited operating range of airborne LiDAR and the large area of task,it would be necessary to perform registration and stitching process on point clouds of adjacent flight strips.By eliminating grow errors,the systematic errors in the data need to be effectively reduced.Thus,this paper conducts research on point cloud registration methods in urban building areas,aiming to improve the accuracy and processing efficiency of airborne LiDAR data.Meanwhile,an improved post-ICP(Iterative Closest Point)point cloud registration method was proposed in this study to determine the accurate registration and efficient stitching of point clouds,which capable to provide a potential technical support for applicants in related field.
基金supported in part by National Natural Science Foundation of China under Grant 62071078in part by Sichuan Province Science and Technology Program under Grant 2021YFQ0053。
文摘Anomaly detection is an important task for maintaining the performance of cloud data center.Traditional anomaly detection primarily examines individual Virtual Machine(VM)behavior,neglecting the impact of interactions among multiple VMs on Key Performance Indicator(KPI)data,e.g.,memory utilization.Furthermore,the nonstationarity,high complexity,and uncertain periodicity of KPI data in VM also bring difficulties to deep learningbased anomaly detection tasks.To settle these challenges,this paper proposes MCBiWGAN-GTN,a multi-channel semi-supervised time series anomaly detection algorithm based on the Bidirectional Wasserstein Generative Adversarial Network with Graph-Time Network(BiWGAN-GTN)and the Complete Ensemble Empirical Mode Decomposition with Adaptive Noise(CEEMDAN).(a)The BiWGAN-GTN algorithm is proposed to extract spatiotemporal information from data.(b)The loss function of BiWGAN-GTN is redesigned to solve the abnormal data intrusion problem during the training process.(c)MCBiWGAN-GTN is designed to reduce data complexity through CEEMDAN for time series decomposition and utilizes BiWGAN-GTN to train different components.(d)To adapt the proposed algorithm for the entire cloud data center,a cloud data center anomaly detection framework based on Swarm Learning(SL)is designed.The evaluation results on a real-world cloud data center dataset show that MCBiWGAN-GTN outperforms the baseline,with an F1-score of 0.96,an accuracy of 0.935,a precision of 0.954,a recall of 0.967,and an FPR of 0.203.The experiments also verify the stability of MCBiWGAN-GTN,the impact of parameter configurations,and the effectiveness of the proposed SL framework.
基金funded by Princess Nourah bint Abdulrahman UniversityResearchers Supporting Project number (PNURSP2024R408), Princess Nourah bint AbdulrahmanUniversity, Riyadh, Saudi Arabia.
文摘A basic procedure for transforming readable data into encoded forms is encryption, which ensures security when the right decryption keys are used. Hadoop is susceptible to possible cyber-attacks because it lacks built-in security measures, even though it can effectively handle and store enormous datasets using the Hadoop Distributed File System (HDFS). The increasing number of data breaches emphasizes how urgently creative encryption techniques are needed in cloud-based big data settings. This paper presents Adaptive Attribute-Based Honey Encryption (AABHE), a state-of-the-art technique that combines honey encryption with Ciphertext-Policy Attribute-Based Encryption (CP-ABE) to provide improved data security. Even if intercepted, AABHE makes sure that sensitive data cannot be accessed by unauthorized parties. With a focus on protecting huge files in HDFS, the suggested approach achieves 98% security robustness and 95% encryption efficiency, outperforming other encryption methods including Ciphertext-Policy Attribute-Based Encryption (CP-ABE), Key-Policy Attribute-Based Encryption (KB-ABE), and Advanced Encryption Standard combined with Attribute-Based Encryption (AES+ABE). By fixing Hadoop’s security flaws, AABHE fortifies its protections against data breaches and enhances Hadoop’s dependability as a platform for processing and storing massive amounts of data.
基金funded by Deanship of Scientific Research(DSR)at King Abdulaziz University,Jeddah under grant No.(RG-6-611-43)the authors,therefore,acknowledge with thanks DSR technical and financial support.
文摘The integration of the Internet of Things(IoT)into healthcare systems improves patient care,boosts operational efficiency,and contributes to cost-effective healthcare delivery.However,overcoming several associated challenges,such as data security,interoperability,and ethical concerns,is crucial to realizing the full potential of IoT in healthcare.Real-time anomaly detection plays a key role in protecting patient data and maintaining device integrity amidst the additional security risks posed by interconnected systems.In this context,this paper presents a novelmethod for healthcare data privacy analysis.The technique is based on the identification of anomalies in cloud-based Internet of Things(IoT)networks,and it is optimized using explainable artificial intelligence.For anomaly detection,the Radial Boltzmann Gaussian Temporal Fuzzy Network(RBGTFN)is used in the process of doing information privacy analysis for healthcare data.Remora Colony SwarmOptimization is then used to carry out the optimization of the network.The performance of the model in identifying anomalies across a variety of healthcare data is evaluated by an experimental study.This evaluation suggested that themodel measures the accuracy,precision,latency,Quality of Service(QoS),and scalability of themodel.A remarkable 95%precision,93%latency,89%quality of service,98%detection accuracy,and 96%scalability were obtained by the suggested model,as shown by the subsequent findings.
文摘Cloud computing has become an essential technology for the management and processing of large datasets,offering scalability,high availability,and fault tolerance.However,optimizing data replication across multiple data centers poses a significant challenge,especially when balancing opposing goals such as latency,storage costs,energy consumption,and network efficiency.This study introduces a novel Dynamic Optimization Algorithm called Dynamic Multi-Objective Gannet Optimization(DMGO),designed to enhance data replication efficiency in cloud environments.Unlike traditional static replication systems,DMGO adapts dynamically to variations in network conditions,system demand,and resource availability.The approach utilizes multi-objective optimization approaches to efficiently balance data access latency,storage efficiency,and operational costs.DMGO consistently evaluates data center performance and adjusts replication algorithms in real time to guarantee optimal system efficiency.Experimental evaluations conducted in a simulated cloud environment demonstrate that DMGO significantly outperforms conventional static algorithms,achieving faster data access,lower storage overhead,reduced energy consumption,and improved scalability.The proposed methodology offers a robust and adaptable solution for modern cloud systems,ensuring efficient resource consumption while maintaining high performance.
文摘The development of machine learning and deep learning algorithms as well as the improvement ofhardware arithmetic power provide a rare opportunity for logging big data private cloud.With the deepeningof exploration and development and the requirements of low-carbon development,the focus of exploration anddevelopment in the oil and gas industry is gradually shifting to the exploration and development of renewableenergy sources such as deep sea,deep earth and geothermal energy.The traditional petrophysical evaluation andinterpretation model has encountered great challenges in the face of new evaluation objects.To establish a distributedlogging big data private cloud platform with a unified learning model as the key,which realizes the distributed storageand processing of logging big data,and enables the learning of brand-new knowledge patterns from multi-attributedata in the large function space in the unified logging learning model integrating the expert knowledge and the datamodel,so as to solve the problem of geoengineering evaluation of geothermal fields.Based on the research ideaof“logging big data cloud platform---unified logging learning model---large function space---knowledge learning&discovery---application”,the theoretical foundation of unified learning model,cloud platform architecture,datastorage and learning algorithm,arithmetic power allocation and platform monitoring,platform stability,data security,etc.have been carried on analysis.The designed logging big data cloud platform realizes parallel distributed storageand processing of data and learning algorithms.New knowledge of geothermal evaluation is found in a large functionspace and applied to Geo-engineering evaluation of geothermal fields.The examples show its good application in theselection of logging series in geothermal fields,quality control of logging data,identification of complex lithologyin geothermal fields,evaluation of reservoir fluids,checking of associated helium,evaluation of cementing quality,evaluation of well-side fractures,and evaluation of geothermal water recharge under the remote logging module ofthe cloud platform.The first and second cementing surfaces of cemented wells in geothermal fields were evaluated,as well as the development of well-side distal fractures,fracture extension orientation.According to the well-sidefracture communication to form a good fluid pathway and large flow rate and long flow diameter of the thermalstorage fi ssure system,the design is conducive to the design of the recharge program of geothermal water.
基金funded by the Third Xinjiang Scientific Expedition Program(2021xjkk1400)the National Natural Science Foundation of China(42071049)+2 种基金the Natural Science Foundation of Xinjiang Uygur Autonomous Region(2019D01C022)the Xinjiang Uygur Autonomous Region Innovation Environment Construction Special Project&Science and Technology Innovation Base Construction Project(PT2107)the Tianshan Talent-Science and Technology Innovation Team(2022TSYCTD0006).
文摘Snow cover plays a critical role in global climate regulation and hydrological processes.Accurate monitoring is essential for understanding snow distribution patterns,managing water resources,and assessing the impacts of climate change.Remote sensing has become a vital tool for snow monitoring,with the widely used Moderate-resolution Imaging Spectroradiometer(MODIS)snow products from the Terra and Aqua satellites.However,cloud cover often interferes with snow detection,making cloud removal techniques crucial for reliable snow product generation.This study evaluated the accuracy of four MODIS snow cover datasets generated through different cloud removal algorithms.Using real-time field camera observations from four stations in the Tianshan Mountains,China,this study assessed the performance of these datasets during three distinct snow periods:the snow accumulation period(September-November),snowmelt period(March-June),and stable snow period(December-February in the following year).The findings showed that cloud-free snow products generated using the Hidden Markov Random Field(HMRF)algorithm consistently outperformed the others,particularly under cloud cover,while cloud-free snow products using near-day synthesis and the spatiotemporal adaptive fusion method with error correction(STAR)demonstrated varying performance depending on terrain complexity and cloud conditions.This study highlighted the importance of considering terrain features,land cover types,and snow dynamics when selecting cloud removal methods,particularly in areas with rapid snow accumulation and melting.The results suggested that future research should focus on improving cloud removal algorithms through the integration of machine learning,multi-source data fusion,and advanced remote sensing technologies.By expanding validation efforts and refining cloud removal strategies,more accurate and reliable snow products can be developed,contributing to enhanced snow monitoring and better management of water resources in alpine and arid areas.
Funding: Supported by the National Natural Science Foundation of China (No. 62172436), the Natural Science Foundation of Shaanxi Province (No. 2023-JC-YB-584), and the Engineering University of PAP’s Funding for Scientific Research Innovation Team and Key Researcher (No. KYGG202011).
Abstract: Cloud storage, a core component of cloud computing, plays a vital role in the storage and management of data. Electronic Health Records (EHRs), which document users’ health information, are typically stored on cloud servers. However, users’ sensitive data are then beyond their direct control. In the event of data loss, cloud storage providers might conceal the fact that data has been compromised in order to protect their reputation and mitigate losses. Ensuring the integrity of data stored in the cloud therefore remains a pressing issue that urgently needs to be addressed. In this paper, we propose a data auditing scheme for cloud-based EHRs that incorporates recoverability and batch auditing, alongside a thorough security and performance evaluation. Our scheme builds upon the indistinguishability-based privacy-preserving auditing approach proposed by Zhou et al. We show that this scheme is insecure and vulnerable to forgery attacks on data storage proofs. To address these vulnerabilities, we enhance the auditing process using masking techniques and design new algorithms to strengthen security. We also provide formal proofs of the security of the signature algorithm and the auditing scheme. Furthermore, our results show that our scheme effectively protects user privacy and is resilient against malicious attacks. Experimental results indicate that our scheme is not only secure and efficient but also supports batch auditing of cloud data. Specifically, when auditing 10,000 users, batch auditing reduces computational overhead by 101 s compared with normal auditing.
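The paper’s construction is pairing-based with masking and is not reproduced in the abstract; the snippet below is only a toy, non-cryptographic sketch of the generic batch-verification idea (checking one random linear combination of many per-block equations instead of each one separately), included to illustrate why batch auditing amortises verification cost. The linear "tag" and all names are hypothetical, deliberately insecure simplifications.

```python
# Toy, non-cryptographic sketch of the batch-auditing idea: instead of checking
# each storage-proof equation separately, the auditor checks one random linear
# combination of all equations.  This is NOT the paper's pairing-based scheme;
# it only illustrates why batching amortises verification cost.
import random

P = (1 << 61) - 1  # Mersenne prime used as a toy field modulus

def tag(block, key):
    """Toy linear authenticator for one data block."""
    return (key * block) % P

def batch_audit(blocks, tags, key):
    """Check sum(r_i * tag(block_i)) == sum(r_i * tag_i) for random r_i."""
    r = [random.randrange(1, P) for _ in blocks]
    lhs = sum(ri * tag(b, key) for ri, b in zip(r, blocks)) % P
    rhs = sum(ri * t for ri, t in zip(r, tags)) % P
    return lhs == rhs

key = random.randrange(1, P)
blocks = [random.randrange(P) for _ in range(10_000)]
tags = [tag(b, key) for b in blocks]
print(batch_audit(blocks, tags, key))   # True for intact data
tags[42] = (tags[42] + 1) % P           # corrupt one stored tag
print(batch_audit(blocks, tags, key))   # False with overwhelming probability
```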
Abstract: Cloud data centres face a growing energy-management problem due to their constant increase in size and complexity and their enormous energy consumption. Energy management is a critical challenge in cloud data centres and an important research concern. In this paper, we propose a cuckoo search (CS)-based optimisation technique for virtual machine (VM) selection and a novel placement algorithm that considers different constraints. An energy consumption model and a simulation model are implemented for efficient VM selection. The proposed CSOA-VM model not only reduces service level agreement (SLA) violations but also minimises VM migrations. It also saves energy: the performance analysis shows an energy consumption of 1.35 kWh, an SLA violation of 9.2, and about 268 VM migrations. Thus, energy consumption improves by about 1.8% and SLA violations are reduced by 2.1% in comparison with existing techniques.
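The abstract does not describe the solution encoding, fitness function, or parameters of CSOA-VM; the sketch below is a hedged illustration of a generic cuckoo-search loop applied to a toy VM-to-host assignment problem, minimising an invented energy-plus-SLA cost. The load model, power model, and all constants are assumptions for illustration only.

```python
# Hedged sketch of a cuckoo-search loop for VM-to-host assignment, minimising a
# toy energy/SLA cost.  Encoding, fitness and parameters are illustrative
# assumptions, not the CSOA-VM model from the paper.
import numpy as np

rng = np.random.default_rng(1)
N_VMS, N_HOSTS, N_NESTS, ITERS, PA = 20, 5, 15, 200, 0.25
vm_load = rng.uniform(0.05, 0.3, N_VMS)          # CPU demand per VM

def fitness(assign):
    """Toy cost: idle + dynamic host power plus a penalty for overloaded hosts."""
    util = np.zeros(N_HOSTS)
    np.add.at(util, assign, vm_load)
    active = util > 0
    power = np.sum(active * (70 + 30 * np.minimum(util, 1.0)))   # watts
    sla_penalty = 1000 * np.sum(np.maximum(util - 1.0, 0.0))
    return power + sla_penalty

def levy_step(size, beta=1.5):
    """Mantegna's algorithm for Levy-distributed step lengths."""
    from math import gamma, sin, pi
    sigma = (gamma(1 + beta) * sin(pi * beta / 2) /
             (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0, sigma, size)
    v = rng.normal(0, 1, size)
    return u / np.abs(v) ** (1 / beta)

nests = rng.integers(0, N_HOSTS, (N_NESTS, N_VMS))
costs = np.array([fitness(n) for n in nests])

for _ in range(ITERS):
    best = nests[np.argmin(costs)]
    for i in range(N_NESTS):
        # Levy flight around the current best, rounded back to host indices.
        cand = np.clip(np.round(best + levy_step(N_VMS)), 0, N_HOSTS - 1).astype(int)
        if fitness(cand) < costs[i]:
            nests[i], costs[i] = cand, fitness(cand)
    # Abandon a fraction PA of the worst nests and rebuild them randomly.
    worst = np.argsort(costs)[-int(PA * N_NESTS):]
    nests[worst] = rng.integers(0, N_HOSTS, (len(worst), N_VMS))
    costs[worst] = [fitness(n) for n in nests[worst]]

print("best cost:", costs.min())
```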
Funding: Supported by the Demonstration System for High Resolution Meteorological Application (II) [grant number 32-Y30F08-9001-20/22] and the National Natural Science Foundation of China [grant numbers 12292981 and 12292984].
Abstract: Early detection of convective clouds is vital for minimizing hazardous impacts. Forecasting convective initiation (CI) with current multispectral geostationary meteorological satellites is often challenged by high false-alarm rates and missed detections caused by their limited resolution. In contrast, high-resolution earth observation satellites offer more detailed texture information, improving early detection capabilities. The authors propose a novel methodology that integrates the advanced features of China’s latest-generation satellites, Gaofen-4 (GF-4) and Fengyun-4A (FY-4A). The fusion method retains GF-4’s high-resolution details and FY-4A’s multispectral information. Two cases from different observational scenarios and weather conditions under GF-4’s staring mode were examined to compare CI forecasts based on the fused data with those based solely on FY-4A data. The fused data demonstrated superior performance in detecting smaller-scale convective clouds, enabling earlier forecasting with a lead time of 15–30 minutes and more accurate location identification. Integrating high-resolution earth observation satellites into early convective cloud detection provides valuable insights for forecasters and decision-makers, particularly given the current resolution limitations of geostationary meteorological satellites.
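The abstract does not state how the GF-4 and FY-4A observations are combined; as a hedged illustration of one generic fusion strategy, the snippet below performs simple high-pass-filter detail injection: the coarse multispectral band is upsampled and the high-frequency texture of the fine-resolution band is added back. The resolution ratio, array sizes, and synthetic data are assumptions, not the paper’s method.

```python
# Hedged sketch of one generic way to fuse a fine-resolution band with a coarser
# multispectral band (high-pass-filter injection); the actual GF-4/FY-4A fusion
# method is not specified in the abstract, so shapes and ratios are illustrative.
import numpy as np
from scipy.ndimage import zoom, uniform_filter

def hpf_fusion(fine_band, coarse_band, ratio):
    """Upsample the coarse band, then inject high-frequency detail from the fine band."""
    coarse_up = zoom(coarse_band, ratio, order=1)                # bilinear upsampling
    coarse_up = coarse_up[:fine_band.shape[0], :fine_band.shape[1]]
    detail = fine_band - uniform_filter(fine_band, size=ratio)   # high-pass of fine band
    return coarse_up + detail

rng = np.random.default_rng(0)
gf_like = rng.normal(300, 5, (400, 400))   # fine-resolution texture proxy
fy_like = rng.normal(280, 3, (100, 100))   # coarse multispectral proxy
fused = hpf_fusion(gf_like, fy_like, ratio=4)
print(fused.shape)
```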
Funding: National Natural Science Foundation of China (42192554, 42175008), Shanghai Typhoon Research Foundation (TFJJ202201), S&T Development Fund of CAMS (2022KJ012), and Basic Research Fund of CAMS (2022Y006).
Abstract: In this study, a variety of high-resolution satellite data were used to analyze the similarities and differences in the horizontal and vertical cloud microphysical characteristics of 11 tropical cyclones (TCs) in three different ocean basins. The results show that, regardless of the season in which they were generated, when the 11 TCs reached or approached Category 4 their melting layers were all located at a height of about 5 km. The maximum ice water contents in the vertical direction of the 11 TCs all reach or approach about 2000 g cm^(–3). The total attenuated backscatter coefficient at 532 nm (TAB-532) can successfully characterize the distribution of areas with high ice water content when its vertical distribution is concentrated near 0.1 km^(–1) sr^(–1), possibly because the diameter distribution of the corresponding range of aerosol particles is more favorable for the formation of ice nuclei, indicating that aerosols have a significant impact on ice-phase processes and characteristics. Moreover, analysis of the horizontal distributions of the cloud water path (CWP) and ice water path (IWP) shows that when the sea surface temperature is relatively high and the vertical wind shear is relatively small, the CWP and IWP can reach relatively high values, which also demonstrates the importance of environmental field factors in shaping TC cloud microphysical characteristics.
Funding: The National Natural Science Foundation of China (Nos. 42176188 and 42176192), the Hainan Provincial Natural Science Foundation of China (No. 421CXTD442), the Stable Supporting Fund of Acoustic Science and Technology Laboratory (No. JCKYS2024604SSJS007), the Fundamental Research Funds for the Central Universities (No. 3072024CFJ0504), and the Harbin Engineering University Doctoral Research and Innovation Fund (No. XK2050021034).
Abstract: In the task of inspecting underwater suspended pipelines, multi-beam sonar (MBS) can provide two-dimensional water column images (WCIs). However, systematic interferences (e.g., sidelobe effects) may cause misdetections in WCIs. To address this issue and improve detection accuracy, we developed a density-based clustering method for three-dimensional water column point clouds. During the processing of WCIs, sidelobe effects are mitigated using a bilateral filter and a brightness transformation. The cross-sectional point cloud of the pipeline is then extracted using the Canny operator. In the detection phase, the target is identified using density-based spatial clustering of applications with noise (DBSCAN). However, the selection of appropriate DBSCAN parameters is complicated by the uneven distribution of the water column point cloud. To overcome this, we propose an improved DBSCAN based on a parameter interval estimation method (PIE-DBSCAN). First, kernel density estimation (KDE) is used to determine the candidate interval of the parameters, after which the exact cluster number is determined via density peak clustering (DPC). Finally, the optimal parameters are selected by comparing the mean silhouette coefficients. To validate the performance of PIE-DBSCAN, we collected water column point clouds from an anechoic tank and the South China Sea. PIE-DBSCAN successfully identified both the target points of the suspended pipeline and the non-target points on the seafloor surface. Compared with the K-Means and Mean-Shift algorithms, PIE-DBSCAN demonstrates superior clustering performance and shows feasibility in practical applications.
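The exact KDE formulation, the DPC step, and the parameter grids are not given in the abstract; the snippet below is a hedged, simplified sketch of the parameter-selection loop it describes: candidate eps values are derived from a kernel density estimate of k-nearest-neighbour distances, DBSCAN is run for each candidate, and the parameters with the best mean silhouette coefficient are kept. The DPC step is omitted, and the synthetic "water column" point cloud is an invented stand-in for real sonar data.

```python
# Hedged sketch of the PIE-DBSCAN idea: derive candidate eps values from a KDE of
# k-NN distances, run DBSCAN for each candidate, and keep the parameters with the
# best mean silhouette coefficient.  (The DPC step is simplified away here.)
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.cluster import DBSCAN
from sklearn.metrics import silhouette_score
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
# Synthetic "water column" cloud: a pipeline-like cluster plus a seafloor patch.
pipeline = rng.normal([0, 0, -5], [2.0, 0.2, 0.2], (400, 3))
seafloor = rng.normal([0, 0, -10], [1.5, 1.5, 0.15], (400, 3))
points = np.vstack([pipeline, seafloor])

# Candidate eps interval from the KDE of 4th-nearest-neighbour distances.
d4 = NearestNeighbors(n_neighbors=5).fit(points).kneighbors(points)[0][:, -1]
kde = gaussian_kde(d4)
grid = np.linspace(d4.min(), d4.max(), 200)
peak = grid[np.argmax(kde(grid))]
candidates = np.linspace(0.5 * peak, 3.0 * peak, 12)

best = None
for eps in candidates:
    labels = DBSCAN(eps=eps, min_samples=8).fit_predict(points)
    mask = labels >= 0                                # ignore noise points
    if mask.sum() > 10 and len(set(labels[mask])) > 1:
        score = silhouette_score(points[mask], labels[mask])
        if best is None or score > best[0]:
            best = (score, eps, labels)

if best is not None:
    n_clusters = len(set(best[2])) - (1 if -1 in best[2] else 0)
    print("selected eps:", round(float(best[1]), 3), "clusters found:", n_clusters)
else:
    print("no candidate eps produced more than one cluster")
```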
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 41941017 and 42177139) and the Graduate Innovation Fund of Jilin University (Grant No. 2024CX099).
Abstract: The spatial distribution of discontinuities and the size of rock blocks are key indicators for rock mass quality evaluation and rockfall risk assessment. Traditional manual measurement is often dangerous or infeasible on high and steep rock slopes. In contrast, unmanned aerial vehicle (UAV) photogrammetry is not limited by terrain conditions and can efficiently collect high-precision three-dimensional (3D) point clouds of rock masses through all-round, multi-angle photography for rock mass characterization. In this paper, a new method based on 3D point clouds is proposed for discontinuity identification and refined rock block modeling. The method consists of four steps: (1) establish the spatial topology of the point cloud and calculate the point normal vectors and average point spacing using several machine learning algorithms; (2) extract discontinuities using the density-based spatial clustering of applications with noise (DBSCAN) algorithm and fit the discontinuity planes by combining principal component analysis (PCA) with the natural breaks (NB) method; (3) insert points along line segments to generate an embedded discontinuity point cloud; and (4) adopt a Poisson reconstruction method for refined rock block modeling. The proposed method was applied to an outcrop of an ultrahigh, steep rock slope and compared with the results of previous studies and manual surveys. The results show that the method can eliminate the influence of discontinuity undulations on orientation measurements and capture local concave-convex characteristics in the modeling of rock blocks. The calculation results are accurate and reliable and can meet practical engineering requirements.
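The paper’s full pipeline (including the natural-breaks refinement and Poisson reconstruction) is not detailed in the abstract; the snippet below is a hedged, simplified sketch of the first part of such a workflow: point normals are estimated by local PCA over k-nearest neighbours, points are grouped into discontinuity sets by clustering the normals with DBSCAN, and a mean plane is fitted to each set with PCA. The synthetic two-plane "rock face", the parameter values, and the helper functions are illustrative assumptions.

```python
# Hedged, simplified sketch of steps (1)-(2) in the abstract: estimate normals by
# local PCA, group points into discontinuity sets by clustering the normals with
# DBSCAN, and fit a mean plane to each set.  The NB refinement and the Poisson
# reconstruction step are omitted.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.neighbors import NearestNeighbors

def estimate_normals(points, k=15):
    """Normal of each point = eigenvector of the smallest eigenvalue of its k-NN covariance."""
    idx = NearestNeighbors(n_neighbors=k).fit(points).kneighbors(points)[1]
    normals = np.empty_like(points)
    for i, nb in enumerate(idx):
        w, v = np.linalg.eigh(np.cov(points[nb].T))
        n = v[:, 0]                            # smallest-eigenvalue eigenvector
        normals[i] = n if n[2] >= 0 else -n    # consistent orientation
    return normals

def fit_plane(pts):
    """PCA plane fit: returns (centroid, unit normal)."""
    c = pts.mean(axis=0)
    _, _, vt = np.linalg.svd(pts - c)
    return c, vt[-1]

# Synthetic rock face made of two planar discontinuity sets plus small noise.
rng = np.random.default_rng(2)
u1 = rng.uniform(0, 5, (500, 2))
u2 = rng.uniform(5, 10, (500, 2))
set1 = np.c_[u1, 0.3 * u1[:, 0] + rng.normal(0, 0.01, 500)]
set2 = np.c_[u2, 2.0 - 0.8 * u2[:, 1] + rng.normal(0, 0.01, 500)]
points = np.vstack([set1, set2])

normals = estimate_normals(points)
labels = DBSCAN(eps=0.05, min_samples=20).fit_predict(normals)
for lab in sorted(set(labels) - {-1}):
    centroid, normal = fit_plane(points[labels == lab])
    print(f"set {lab}: normal = {np.round(normal, 3)}")
```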
Funding: Supported by the National Key R&D Program of China [grant number 2023YFC3008004].
Abstract: This study introduces a new ocean surface friction velocity scheme and a modified Thompson cloud microphysics parameterization scheme into the CMA-TYM model. The impact of these two parameterization schemes on the prediction of the track and intensity of Typhoon Kompasu in 2021 is examined, and the possible reasons for their effects on tropical cyclone (TC) intensity prediction are analyzed. Statistical results show that both parameterization schemes improve the predictions of Typhoon Kompasu’s track and intensity. The influence on track prediction becomes evident after 60 h of model integration, while the significant positive impact on intensity prediction is observed after 66 h. Further analysis reveals that the two schemes affect the timing and magnitude of extreme TC intensity values by influencing the evolution of the TC’s warm-core structure.
Abstract: Missing data present a crucial challenge in data analysis, especially in high-dimensional datasets, where they often lead to biased conclusions and degraded model performance. In this study, we present a novel autoencoder-based imputation framework that integrates a composite loss function to enhance robustness and precision. The proposed loss combines (i) a guided, masked mean squared error focusing on missing entries; (ii) a noise-aware regularization term to improve resilience against data corruption; and (iii) a variance penalty to encourage expressive yet stable reconstructions. We evaluate the proposed model across four missingness mechanisms, namely Missing Completely at Random, Missing at Random, Missing Not at Random, and Missing Not at Random with quantile censorship, under systematically varied feature counts, sample sizes, and missingness ratios ranging from 5% to 60%. Four publicly available real-world datasets (Stroke Prediction, Pima Indians Diabetes, Cardiovascular Disease, and Framingham Heart Study) were used, and the results show that the proposed model consistently outperforms baseline methods, including traditional and deep learning-based techniques. An ablation study reveals the additive value of each component of the loss function. Additionally, we assessed the downstream utility of the imputed data through classification tasks, where datasets imputed by the proposed method yielded the highest area under the receiver operating characteristic curve scores across all scenarios. The model demonstrates strong scalability and robustness, improving performance with larger datasets and higher feature counts. These results underscore the capacity of the proposed method to produce not only numerically accurate but also semantically useful imputations, making it a promising solution for robust data recovery in clinical applications.
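The exact form and weighting of the three loss terms are not given in the abstract; the snippet below is a hedged PyTorch sketch of one plausible composition of a masked MSE on missing entries, a noise-aware stability term, and a variance penalty. The network architecture, the weights `lam_noise` and `lam_var`, the corruption level `sigma`, and the synthetic data are all assumptions for illustration.

```python
# Hedged PyTorch sketch of a composite imputation loss with the three ingredients
# named in the abstract: a masked MSE on missing entries, a noise-aware term that
# penalises sensitivity to input corruption, and a variance penalty.
import torch
import torch.nn as nn

class ImputationAE(nn.Module):
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU())
        self.dec = nn.Linear(hidden, n_features)

    def forward(self, x):
        return self.dec(self.enc(x))

def composite_loss(model, x_true, x_obs, miss_mask, lam_noise=0.1, lam_var=0.01, sigma=0.05):
    """x_obs = x_true with missing entries zero-filled; miss_mask = 1 where missing."""
    recon = model(x_obs)
    # (i) guided, masked MSE evaluated on the missing entries only
    mse_missing = ((recon - x_true) ** 2 * miss_mask).sum() / miss_mask.sum().clamp(min=1)
    # (ii) noise-aware regularisation: reconstruction should be stable under corruption
    recon_noisy = model(x_obs + sigma * torch.randn_like(x_obs))
    noise_term = ((recon_noisy - recon) ** 2).mean()
    # (iii) variance penalty discouraging collapsed (near-constant) reconstructions
    var_term = (recon.var(dim=0) - x_true.var(dim=0)).abs().mean()
    return mse_missing + lam_noise * noise_term + lam_var * var_term

# Toy usage with synthetic data and 30% MCAR missingness.
torch.manual_seed(0)
x_true = torch.randn(256, 10)
miss_mask = (torch.rand_like(x_true) < 0.3).float()
x_obs = x_true * (1 - miss_mask)
model = ImputationAE(10)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = composite_loss(model, x_true, x_obs, miss_mask)
    loss.backward()
    opt.step()
print(float(loss))
```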
Funding: Funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2025R104), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Abstract: Modern intrusion detection systems (MIDS) face persistent challenges in coping with the rapid evolution of cyber threats, high-volume network traffic, and imbalanced datasets. Traditional models often lack the robustness and explainability required to detect novel and sophisticated attacks effectively. This study introduces an advanced, explainable machine learning framework for multi-class IDS using the KDD99 and IDS datasets, which reflect real-world network behavior through a blend of normal traffic and diverse attack classes. The methodology begins with data preprocessing that applies both RobustScaler and QuantileTransformer to address outliers and skewed feature distributions, ensuring standardized and model-ready inputs. Dimensionality reduction is achieved via the Harris Hawks Optimization (HHO) algorithm, a nature-inspired metaheuristic modeled on hawks’ hunting strategies. HHO efficiently identifies the most informative features by optimizing a fitness function based on classification performance. Following feature selection, the synthetic minority oversampling technique (SMOTE) is applied to the training data to resolve class imbalance by synthetically augmenting underrepresented attack types. A stacked architecture is then employed, combining XGBoost, a support vector machine (SVM), and a random forest (RF) as base learners. This layered approach improves prediction robustness and generalization by balancing bias and variance across diverse classifiers. The model was evaluated using standard classification metrics: precision, recall, F1-score, and overall accuracy. The best overall performance was an accuracy of 99.44% on UNSW-NB15, demonstrating the model’s effectiveness. After balancing, the model showed a clear improvement in detecting attacks. We tested the model on four datasets to demonstrate the effectiveness of the proposed approach and performed an ablation study to check the effect of each parameter. The proposed model is also computationally efficient. To support transparency and trust in decision-making, explainable AI (XAI) techniques are incorporated that provide both global and local insight into feature contributions and offer intuitive visualizations for individual predictions. This makes the framework suitable for practical deployment in cybersecurity environments that demand both precision and accountability.
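The abstract names the pipeline stages but not their exact configuration, and the HHO metaheuristic itself would be too long to reproduce here; the sketch below is therefore a hedged illustration of the overall flow on synthetic data, with SelectKBest standing in for HHO-based feature selection and scikit-learn’s GradientBoostingClassifier standing in for XGBoost. All datasets, parameters, and substitutions are assumptions, not the paper’s implementation.

```python
# Hedged sketch of the described training pipeline: robust scaling + quantile
# transform, (placeholder) feature selection, SMOTE on the training split only,
# then a stacked ensemble of gradient boosting, SVM and random forest.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import RobustScaler, QuantileTransformer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier, StackingClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from imblearn.over_sampling import SMOTE

# Synthetic, imbalanced multi-class stand-in for an intrusion dataset.
X, y = make_classification(n_samples=4000, n_features=30, n_informative=12,
                           n_classes=4, weights=[0.7, 0.15, 0.1, 0.05],
                           n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Preprocessing: robust scaling followed by a quantile transform.
scaler = RobustScaler().fit(X_tr)
qt = QuantileTransformer(output_distribution="normal", random_state=0).fit(scaler.transform(X_tr))
X_tr_p, X_te_p = qt.transform(scaler.transform(X_tr)), qt.transform(scaler.transform(X_te))

# Placeholder for HHO-based feature selection.
selector = SelectKBest(mutual_info_classif, k=15).fit(X_tr_p, y_tr)
X_tr_s, X_te_s = selector.transform(X_tr_p), selector.transform(X_te_p)

# Balance only the training data with SMOTE.
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_tr_s, y_tr)

stack = StackingClassifier(
    estimators=[("gb", GradientBoostingClassifier(random_state=0)),
                ("svm", SVC(probability=True, random_state=0)),
                ("rf", RandomForestClassifier(random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000))
stack.fit(X_bal, y_bal)
print(classification_report(y_te, stack.predict(X_te_s)))
```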