Material identification is critical for understanding the relationship between mechanical properties and the associated mechanical functions. However, material identification is a challenging task, especially when the characteristic of the material is highly nonlinear in nature, as is common in biological tissue. In this work, we identify unknown material properties in continuum solid mechanics via physics-informed neural networks (PINNs). To improve the accuracy and efficiency of PINNs, we develop efficient strategies to nonuniformly sample observational data. We also investigate different approaches to enforce Dirichlet-type boundary conditions (BCs) as soft or hard constraints. Finally, we apply the proposed methods to a diverse set of time-dependent and time-independent solid mechanics examples that span linear elastic and hyperelastic material space. The estimated material parameters achieve relative errors of less than 1%. As such, this work is relevant to diverse applications, including optimizing structural integrity and developing novel materials.
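As an illustration of nonuniform observational sampling, the sketch below draws observation points with probability proportional to the local gradient magnitude of a toy 1-D displacement field; the field `u`, the gradient-based weighting, and all numerical values are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

# Toy 1-D "measured" displacement field on a dense grid (illustrative only).
x = np.linspace(0.0, 1.0, 1000)
u = np.tanh(20.0 * (x - 0.6))

# Weight candidate observation points by local gradient magnitude, so regions of
# rapid variation receive more observations than smooth regions.
weight = np.abs(np.gradient(u, x)) + 1e-3
p = weight / weight.sum()

rng = np.random.default_rng(0)
obs_idx = rng.choice(len(x), size=50, replace=False, p=p)
x_obs, u_obs = x[obs_idx], u[obs_idx]   # nonuniform observation set for the PINN data loss
print(f"{len(x_obs)} observation points, concentrated near x = 0.6")
```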
The Fourier transform is the basis of the analysis. This paper presents a method that determines the profile of the inverted object in inverse scattering from a minimum of sampled data.
In the face of data scarcity in the optimization of maintenance strategies for civil aircraft, traditional failure data-driven methods are encountering challenges owing to the increasing reliability of aircraft design. This study addresses this issue by presenting a novel combined data fusion algorithm, which serves to enhance the accuracy and reliability of failure rate analysis for a specific aircraft model by integrating historical failure data from similar models as supplementary information. Through a comprehensive analysis of two different maintenance projects, this study illustrates the application process of the algorithm. Building upon the analysis results, this paper introduces the innovative equal integral value method as a replacement for the conventional equal interval method in the context of maintenance schedule optimization. A Monte Carlo simulation example validates that the equal integral value method surpasses the traditional method by over 20% in terms of inspection efficiency ratio. This finding indicates that the equal integral value method not only upholds maintenance efficiency but also substantially decreases workload and maintenance costs. The findings of this study open up novel perspectives for airlines grappling with data scarcity, offer fresh strategies for the optimization of aviation maintenance practices, and chart a new course toward achieving more efficient and cost-effective maintenance schedule optimization through refined data analysis.
Imbalance is a distinctive feature of many datasets, and how to make a dataset balanced has become a hot topic in the machine learning field. The Synthetic Minority Oversampling Technique (SMOTE) is the classical method to solve this problem. Although much research has been conducted on SMOTE, the problem of synthetic sample singularity remains. To address class imbalance and the diversity of generated samples, this paper proposes a hybrid resampling method for binary imbalanced data sets, RE-SMOTE, which is designed based on improvements of two oversampling methods: parameter-free SMOTE (PF-SMOTE) and SMOTE-Weighted Ensemble Nearest Neighbor (SMOTE-WENN). Initially, minority class samples are divided into safe and boundary minority categories. Boundary minority samples are regenerated through linear interpolation with the nearest majority class samples. In contrast, safe minority samples are randomly generated within a circular range centered on the initial safe minority samples, with a radius determined by the distance to the nearest majority class samples. Furthermore, we use Weighted Edited Nearest Neighbor (WENN) and relative density methods to clean the generated samples and remove low-quality samples. Relative density is calculated as the ratio of majority to minority samples among the reverse k-nearest-neighbor samples. To verify the effectiveness and robustness of the proposed model, we conducted a comprehensive experimental study on 40 datasets selected from real applications. The experimental results show the superiority of radius estimation-SMOTE (RE-SMOTE) over other state-of-the-art methods. Code is available at: https://github.com/blue9792/RE-SMOTE (accessed on 30 September 2024).
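A toy sketch of the two generation rules described above is given below, assuming a crude distance-based test to label a minority point as "boundary" or "safe"; the actual RE-SMOTE split, as well as the WENN and relative-density cleaning stages, are not reproduced here.

```python
import numpy as np

def generate_minority_samples(X_min, X_maj, n_new, rng=None):
    """Toy RE-SMOTE-style generation (illustrative assumptions, not the published code)."""
    rng = np.random.default_rng(rng)
    new_pts = []
    for _ in range(n_new):
        x = X_min[rng.integers(len(X_min))]
        d_maj = np.linalg.norm(X_maj - x, axis=1)
        d_min = np.linalg.norm(X_min - x, axis=1)
        d_min[d_min == 0] = np.inf                     # ignore the point itself
        nearest_maj = X_maj[np.argmin(d_maj)]
        if d_maj.min() < d_min.min():
            # "Boundary" point: interpolate toward the nearest majority sample.
            lam = rng.uniform(0.0, 0.5)
            new_pts.append(x + lam * (nearest_maj - x))
        else:
            # "Safe" point: sample uniformly in a ball whose radius is the
            # distance to the nearest majority sample.
            direction = rng.standard_normal(x.shape)
            direction /= np.linalg.norm(direction)
            new_pts.append(x + rng.uniform(0.0, d_maj.min()) * direction)
    return np.vstack(new_pts)

rng = np.random.default_rng(0)
X_maj = rng.normal(0.0, 1.0, size=(200, 2))
X_min = rng.normal(2.5, 0.5, size=(20, 2))
X_new = generate_minority_samples(X_min, X_maj, n_new=180, rng=1)
print(X_new.shape)   # (180, 2) synthetic minority samples
```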
Freeform surface measurement is a key basic technology for product quality control and reverse engineering in the aerospace field. Surface measurement technology based on multi-sensor fusion, such as a laser scanner combined with a contact probe, exploits the complementary characteristics of different sensors and has attracted wide attention in industry and academia. The number and distribution of measurement points significantly affect the efficiency of multi-sensor fusion and the accuracy of surface reconstruction. An aggregation-value-based active sampling method for multi-sensor freeform surface measurement and reconstruction is proposed. Based on game-theoretic iteration, probe measurement points are generated actively, and the importance of each measurement point on the freeform surface to multi-sensor fusion is defined as the Shapley value of that measurement point. Thus, the problem of obtaining the optimal measurement point set is transformed into the problem of maximizing the aggregation value of the sample set. Simulation and real measurement results verify that the proposed method can significantly reduce the required probe sample size while ensuring the measurement accuracy of multi-sensor fusion.
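The sketch below estimates the Shapley value of candidate probe points by Monte Carlo permutation sampling, using a placeholder `value_fn` (here, the negative surface reconstruction error of a simple least-squares fit) that stands in for the fusion quality measure; the value function, the surface, and the sample sizes are assumptions, not the paper's formulation.

```python
import numpy as np

def shapley_values(n_points, value_fn, n_perm=200, seed=0):
    """Monte Carlo permutation estimate of each point's Shapley value."""
    rng = np.random.default_rng(seed)
    phi = np.zeros(n_points)
    for _ in range(n_perm):
        order = rng.permutation(n_points)
        chosen = []
        prev = value_fn(chosen)
        for idx in order:
            chosen.append(idx)
            cur = value_fn(chosen)
            phi[idx] += cur - prev      # marginal contribution of point idx
            prev = cur
    return phi / n_perm

# Placeholder value function: negative RMS error of a quadratic surface fit
# using only the chosen probe points (illustrative stand-in for fusion quality).
rng = np.random.default_rng(1)
xy = rng.uniform(-1, 1, size=(12, 2))
z = xy[:, 0] ** 2 + 0.5 * xy[:, 1] ** 2 + 0.01 * rng.standard_normal(12)
basis = np.column_stack([np.ones(12), xy, xy ** 2])

def value_fn(subset):
    if len(subset) < basis.shape[1]:
        return -1.0                     # not enough points to fit the surface
    coef, *_ = np.linalg.lstsq(basis[subset], z[subset], rcond=None)
    return -np.sqrt(np.mean((basis @ coef - z) ** 2))

print(np.round(shapley_values(12, value_fn), 3))
```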
The capability of accurately predicting mineralogical brittleness index (BI) from basic suites of well logs is desirable, as it provides a useful indicator of the fracability of tight formations. Measuring mineralogical components in rocks is expensive and time consuming. However, the basic well log curves are not well correlated with BI, so correlation-based machine-learning methods are not able to derive highly accurate BI predictions from such data. A correlation-free, optimized data-matching algorithm is configured to predict BI on a supervised basis from well log and core data available from two published wells in the Lower Barnett Shale Formation (Texas). This transparent open box (TOB) algorithm matches data records by calculating the sum of squared errors between their variables and selecting the best matches as those with the minimum squared errors. It then applies optimizers to adjust the weights applied to individual variable errors to minimize the root mean square error (RMSE) between calculated and predicted BI. The prediction accuracy achieved by TOB using just five well logs (Gr, ρb, Ns, Rs, Dt) to predict BI depends on the density of data records sampled. At a sampling density of about one sample per 0.5 ft, BI is predicted with RMSE ~0.056 and R^(2) ~0.790. At a sampling density of about one sample per 0.1 ft, BI is predicted with RMSE ~0.008 and R^(2) ~0.995. Adding a stratigraphic height index as an additional (sixth) input variable improves BI prediction accuracy to RMSE ~0.003 and R^(2) ~0.999 for the two wells, with only 1 record in 10,000 yielding a BI prediction error of >±0.1. The model has the potential to be applied on an unsupervised basis to predict BI from basic well log data in surrounding wells lacking mineralogical measurements but with similar lithofacies and burial histories. The method could also be extended to predict elastic rock properties and seismic attributes from well and seismic data to improve the precision of brittleness index and fracability mapping spatially.
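A minimal sketch of the data-matching core described above (weighted squared-error matching of a query record against training records, with the BI of the best matches combined) is shown below; the weight values, the number of matches k, and the averaging rule are assumptions, since in the paper the weights are tuned by an optimizer to minimize RMSE.

```python
import numpy as np

def tob_like_predict(X_train, y_train, X_query, weights, k=3):
    """Match each query record to its k lowest weighted-SSE training records."""
    preds = []
    for q in X_query:
        sse = ((X_train - q) ** 2 * weights).sum(axis=1)   # weighted squared errors
        best = np.argsort(sse)[:k]                          # best-matching records
        preds.append(y_train[best].mean())                  # combine their BI values
    return np.array(preds)

# Toy data: five "well log" variables and a BI value per record (illustrative).
rng = np.random.default_rng(0)
X_train = rng.standard_normal((500, 5))
y_train = 0.5 + 0.1 * X_train[:, 0] - 0.05 * X_train[:, 3]
X_query = rng.standard_normal((10, 5))
weights = np.ones(5)                 # starting point; an optimizer would tune these
print(tob_like_predict(X_train, y_train, X_query, weights))
```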
In real industrial scenarios, equipment cannot be operated in a faulty state for a long time, resulting in a very limited number of available fault samples, and data augmentation using generative adversarial networks for small-sample data has therefore achieved a wide range of applications. However, the generative adversarial networks currently applied in industrial processes do not impose realistic physical constraints on data generation, so the generated data lack realistic physical consistency. To address this problem, this paper proposes a physical-consistency-based WGAN, designs a loss function containing physical constraints for industrial processes, and validates the effectiveness of the method using a common dataset in the field of industrial process fault diagnosis. The experimental results show that the proposed method not only makes the generated data consistent with the physical constraints of the industrial process, but also has better fault diagnosis performance than existing GAN-based methods.
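A compact sketch of a generator loss with an added physical-consistency penalty is shown below (PyTorch); the linear constraint A x = b is a hypothetical stand-in for a process constraint such as a mass balance, and the penalty weight `lam` is an assumed value, since the abstract does not specify the paper's actual constraint terms.

```python
import torch

def physics_residual(x, A, b):
    # Hypothetical linear process constraint A x = b; the residual measures
    # how strongly each generated sample violates it.
    return x @ A.T - b

def generator_loss(critic, x_fake, A, b, lam=10.0):
    adv = -critic(x_fake).mean()                          # standard WGAN generator term
    phys = physics_residual(x_fake, A, b).pow(2).mean()   # physical-consistency penalty
    return adv + lam * phys

# Minimal usage with a toy critic and random "generated" samples.
critic = torch.nn.Sequential(torch.nn.Linear(6, 16), torch.nn.ReLU(), torch.nn.Linear(16, 1))
x_fake = torch.randn(32, 6, requires_grad=True)
A, b = torch.randn(2, 6), torch.zeros(2)
loss = generator_loss(critic, x_fake, A, b)
loss.backward()
print(float(loss))
```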
As sandstone layers in thin interbedded sections are difficult to identify, conventional model-driven seismic inversion and data-driven seismic prediction methods have low precision in predicting them. To solve this problem, a model-data-driven seismic AVO (amplitude variation with offset) inversion method based on a space-variant objective function has been worked out. In this method, the zero-delay cross-correlation function and the F-norm are used to establish the objective function. Based on inverse distance weighting theory, the objective function is varied according to the location of the target CDP (common depth point), changing the constraint weights of training samples, initial low-frequency models, and seismic data on the inversion. Hence, the proposed method can obtain high-resolution, high-accuracy velocity and density from inversion of small-sample data, and is suitable for identifying thin interbedded sand bodies. Tests with thin interbedded geological models show that the proposed method has high inversion accuracy and resolution for small-sample data, and can identify sandstone and mudstone layers of about one-thirtieth of the dominant wavelength in thickness. Tests on field data from the Lishui sag show that the inversion results of the proposed method have small relative error with respect to well-log data, and can identify thin interbedded sandstone layers of about one-fifteenth of the dominant wavelength in thickness with small-sample data.
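The inverse-distance-weighting step can be sketched as below: constraint weights for the training samples (and, analogously, for the low-frequency-model and seismic-data terms of the objective) decay with distance from the target CDP; the power exponent and the toy coordinates are assumptions.

```python
import numpy as np

def idw_weights(target_xy, sample_xy, power=2.0, eps=1e-6):
    """Inverse-distance weights of the known samples relative to the target CDP."""
    d = np.linalg.norm(sample_xy - target_xy, axis=1) + eps
    w = 1.0 / d ** power
    return w / w.sum()

# Toy layout: three wells (training samples) and one target CDP location.
wells = np.array([[100.0, 200.0], [400.0, 250.0], [900.0, 800.0]])
target = np.array([350.0, 300.0])
print(idw_weights(target, wells))   # nearer wells constrain the objective more strongly
```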
Volatile nitrosamines (VNAs) are a group of compounds classified as probable (group 2A) and possible (group 2B) carcinogens in humans. Along with certain foods and contaminated drinking water, VNAs are detected at high levels in tobacco products and in both mainstream and side-stream smoke. Our laboratory monitors six urinary VNAs—N-nitrosodimethylamine (NDMA), N-nitrosomethylethylamine (NMEA), N-nitrosodiethylamine (NDEA), N-nitrosopiperidine (NPIP), N-nitrosopyrrolidine (NPYR), and N-nitrosomorpholine (NMOR)—using isotope dilution GC-MS/MS (QQQ) for large population studies such as the National Health and Nutrition Examination Survey (NHANES). In this paper, we report for the first time a new automated sample preparation method to more efficiently quantitate these VNAs. Automation is done using Hamilton STAR™ and Caliper Staccato™ workstations. This new automated method reduces sample preparation time from 4 hours to 2.5 hours while maintaining precision (inter-run CV < 10%) and accuracy (85% - 111%). More importantly, this method increases sample throughput while maintaining a low limit of detection (<10 pg/mL) for all analytes. A streamlined sample data flow was created in parallel to the automated method, in which samples can be tracked from receiving to final LIMS output with minimal human intervention, further minimizing human error in the sample preparation process. This new automated method and the sample data flow are currently applied in bio-monitoring of VNAs in the US non-institutionalized population in the NHANES 2013-2014 cycle.
Seismic data interpolation, especially irregularly sampled data interpolation, is a critical task for seismic processing and subsequent interpretation. Recently, with the development of machine learning and deep learning, convolutional neural networks (CNNs) have been applied to interpolating irregularly sampled seismic data. CNN-based approaches can address the apparent defects of traditional interpolation methods, such as low computational efficiency and the difficulty of parameter selection. However, current CNN-based methods only consider the temporal and spatial features of irregularly sampled seismic data and fail to consider the frequency features of seismic data, i.e., the multi-scale features. To overcome these drawbacks, we propose a wavelet-based convolutional block attention deep learning (W-CBADL) network for irregularly sampled seismic data reconstruction. We first introduce the discrete wavelet transform (DWT) and the inverse wavelet transform (IWT) into the commonly used U-Net to account for the multi-scale features of irregularly sampled seismic data. Moreover, we adopt the convolutional block attention module (CBAM) to precisely restore sampled seismic traces, applying attention in both the channel and spatial dimensions. Finally, we apply the proposed W-CBADL model to synthetic and pre-stack field data to evaluate its validity and effectiveness. The results demonstrate that the proposed W-CBADL model reconstructs irregularly sampled seismic data more effectively and more efficiently than state-of-the-art contrastive CNN-based models.
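The DWT/IWT building block can be illustrated with PyWavelets as below: a 2-D gather is split into four subbands (which would be stacked as channels inside the U-Net encoder) and then reconstructed losslessly by the inverse transform; the wavelet choice ('haar') and the synthetic gather are assumptions, and the full W-CBADL network is not reproduced.

```python
import numpy as np
import pywt

gather = np.random.default_rng(0).standard_normal((128, 128))   # stand-in seismic gather

# One level of 2-D DWT: approximation + horizontal/vertical/diagonal detail subbands.
cA, (cH, cV, cD) = pywt.dwt2(gather, 'haar')
subband_channels = np.stack([cA, cH, cV, cD], axis=0)            # shape (4, 64, 64)

# The inverse transform restores the original resolution, so multi-scale features
# can be injected into the decoder without losing information.
recon = pywt.idwt2((cA, (cH, cV, cD)), 'haar')
print(subband_channels.shape, np.allclose(recon, gather))
```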
In this paper, the consensus problem with position sampled data for second-order multi-agent systems is investigated. The interaction topology among the agents is depicted by a directed graph. Full-order and reduced-order observers with position sampled data are proposed, from which two kinds of sampled-data-based consensus protocols are constructed. With the provided sampled protocols, the consensus convergence analysis of a continuous-time multi-agent system is equivalently transformed into that of a discrete-time system. Then, by using matrix theory and a sampled control analysis method, some sufficient and necessary consensus conditions based on the coupling parameters, the spectrum of the Laplacian matrix, and the sampling period are obtained. As the sampling period tends to zero, the established necessary and sufficient conditions degenerate to the continuous-time protocol case, consistent with the existing result for the continuous-time case. Finally, the effectiveness of the established results is illustrated by a simple simulation example.
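A simplified simulation of position-sampled consensus for double integrators is sketched below; it uses an undirected path graph and replaces the paper's full-order/reduced-order observers with a finite-difference velocity estimate from consecutive position samples, so the graph, gains, and sampling period are all illustrative assumptions.

```python
import numpy as np

# Four double-integrator agents on an undirected path graph (illustrative topology).
L = np.array([[ 1., -1.,  0.,  0.],
              [-1.,  2., -1.,  0.],
              [ 0., -1.,  2., -1.],
              [ 0.,  0., -1.,  1.]])
h, gamma = 0.05, 1.5                      # sampling period and velocity gain (assumed)

rng = np.random.default_rng(1)
x = rng.uniform(-5.0, 5.0, 4)             # positions
v = rng.uniform(-1.0, 1.0, 4)             # velocities (never measured directly)
x_prev = x.copy()

for k in range(400):
    # Velocity estimate from two consecutive position samples (stand-in for an observer).
    v_hat = (x - x_prev) / h if k > 0 else np.zeros(4)
    u = -L @ x - gamma * (L @ v_hat)      # protocol built from sampled positions only
    x_prev = x.copy()
    # Exact zero-order-hold update of xdot = v, vdot = u over one sampling period.
    x = x + h * v + 0.5 * h ** 2 * u
    v = v + h * u

print("final position spread:", x.max() - x.min())   # shrinks toward zero (consensus)
```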
Data from the 2013 Canadian Tobacco, Alcohol and Drugs Survey, and two other surveys, are used to determine the effects of cannabis use on self-reported physical and mental health. Daily or almost daily marijuana use is shown to be detrimental to both measures of health for some age groups but not all. The age-group-specific effects depend on gender: males and females respond differently to cannabis use. The health costs of regularly using cannabis are significant, but they are much smaller than those associated with tobacco use. These costs are attributed both to the presence of delta-9-tetrahydrocannabinol and to the fact that smoking cannabis is itself a health hazard because of the toxic properties of the smoke ingested. Cannabis use is costlier for regular smokers, and first use below the age of 15 or 20, as well as being a former user, leads to reduced physical and mental capacities which are permanent. These results strongly suggest that the legalization of marijuana be accompanied by educational programs, counseling services, and a delivery system that minimizes juvenile and young adult usage.
A new and useful method of technology economics, a parameter estimation method, is presented in this paper in light of the stability of the center of gravity of an object. This method can handle the fitting and forecasting of economic volume and can greatly decrease the errors of the fitting and forecasting results. Moreover, the strict hypothetical conditions of the least squares method are not necessary in the method presented here, which overcomes the shortcomings of the least squares method and expands the application of the data barycentre method. An application to steel consumption volume forecasting is presented, and the fitting and forecasting results are shown to be satisfactory. From the comparison between the data barycentre forecasting method and the least squares method, we conclude that the fitting and forecasting results of the data barycentre method are more stable than those of least squares regression forecasting, and that the computation of the data barycentre forecasting method is simpler than that of the least squares method. As a result, the data barycentre method is convenient to use in technical economy studies.
The basis of accurate mineral resource estimates is a geological model which replicates the nature and style of the orebody. Key inputs into the generation of a good geological model are the sample data and mapping information. The Obuasi Mine sample data, which carried many legacy issues, were subjected to a robust validation process and integrated with mapping information to generate an accurate geological orebody model for mineral resource estimation in Block 8 Lower. Validation of the sample data focused on replacing missing collar coordinates and missing assays, correcting the magnetic declination that was used to convert the downhole surveys from true to magnetic, fixing missing lithology, and finally assigning confidence numbers to all the sample data. The missing coordinates which were replaced ensured that the sample data plotted at their correct location in space as intended from the planning stage. Magnetic declination data, which had been kept constant over the years even though it changes every year, were also corrected in the validation project. The corrected magnetic declination ensured that the drillholes were plotted on their accurate trajectory as per the planned azimuth and also reflected the true position of the intercepted mineralized fissure(s), which was previously not the case and had marked a major blot in the modelling of the Obuasi orebody. The incorporation of mapped data with the validated sample data in the wireframes resulted in a better interpretation of the orebody. The updated mineral resource, generated by domaining quartz from the sulphides and compared with the old resource, showed that the sulphide tonnes in the old resource estimates were overestimated by 1% and the grade overestimated by 8.5%.
In this paper, the authors consider a sparse parameter estimation problem in continuous-time linear stochastic regression models using sampled data. Based on the compressed sensing (CS) method, the authors propose a compressed least squares (LS) algorithm to deal with the challenges of parameter sparsity. At each sampling time instant, the proposed compressed LS algorithm first compresses the original high-dimensional regressor using a sensing matrix and obtains a low-dimensional LS estimate of the compressed unknown parameter. Then, the original high-dimensional sparse unknown parameter is recovered by a reconstruction method. By introducing a compressed excitation assumption and employing stochastic Lyapunov function and martingale estimation methods, the authors establish the performance analysis of the compressed LS algorithm under a condition on the sampling time interval, without using independence or stationarity conditions on the system signals. Finally, a simulation example is provided to verify the theoretical results by comparing the standard and the compressed LS algorithms for estimating a high-dimensional sparse unknown parameter.
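A batch-style sketch of the compress, estimate, and reconstruct idea is given below, assuming i.i.d. regressors, a sensing matrix with orthonormal rows, and orthogonal matching pursuit as the reconstruction method; the paper's recursive algorithm, excitation condition, and continuous-time sampling setup are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k, T = 200, 40, 5, 3000            # ambient dim, compressed dim, sparsity, samples

# Sparse unknown parameter.
theta = np.zeros(n)
support_true = rng.choice(n, size=k, replace=False)
theta[support_true] = rng.uniform(1.0, 2.0, k) * rng.choice([-1.0, 1.0], k)

# Sensing matrix with orthonormal rows, so the compressed parameter is beta = A @ theta.
A = np.linalg.qr(rng.standard_normal((n, d)))[0].T          # shape (d, n)

# Regression data and compressed (low-dimensional) regressors.
Phi = rng.standard_normal((T, n))
y = Phi @ theta + 0.1 * rng.standard_normal(T)
Psi = Phi @ A.T

# Low-dimensional LS estimate of the compressed parameter.
beta_hat = np.linalg.lstsq(Psi, y, rcond=None)[0]

def omp(D, b, sparsity):
    """Orthogonal matching pursuit: greedy recovery of a sparse x with D @ x ~= b."""
    support, coef, r = [], None, b.copy()
    for _ in range(sparsity):
        j = int(np.argmax(np.abs(D.T @ r)))
        if j not in support:
            support.append(j)
        coef = np.linalg.lstsq(D[:, support], b, rcond=None)[0]
        r = b - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

theta_hat = omp(A, beta_hat, k)                              # recover the sparse parameter
print("true support:     ", sorted(support_true.tolist()))
print("recovered support:", sorted(np.flatnonzero(theta_hat).tolist()))
```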
With the increasing emphasis on personal information protection, encryption through security protocols has emerged as a critical requirement in data transmission and reception processes. Nevertheless, IoT ecosystems comprise heterogeneous networks where outdated systems coexist with the latest devices, spanning a range from non-encrypted devices to fully encrypted ones. Given the limited visibility into payloads in this context, this study investigates AI-based attack detection methods that leverage encrypted traffic metadata, eliminating the need for decryption and minimizing system performance degradation—especially in light of these heterogeneous devices. Using the UNSW-NB15 and CICIoT-2023 datasets, encrypted and unencrypted traffic were categorized according to security protocol, and AI-based intrusion detection experiments were conducted for each traffic type based on metadata. To mitigate the problem of class imbalance, eight different data sampling techniques were applied. The effectiveness of these sampling techniques was then comparatively analyzed using two ensemble models and three Deep Learning (DL) models from various perspectives. The experimental results confirmed that metadata-based attack detection is feasible using only encrypted traffic. In the UNSW-NB15 dataset, the f1-score of encrypted traffic was approximately 0.98, which is 4.3% higher than that of unencrypted traffic (approximately 0.94). In addition, analysis of the encrypted traffic in the CICIoT-2023 dataset using the same method showed a significantly lower f1-score of roughly 0.43, indicating that the quality of the dataset and the preprocessing approach have a substantial impact on detection performance. Furthermore, when data sampling techniques were applied to encrypted traffic, the recall in the UNSW-NB15 (Encrypted) dataset improved by up to 23.0%, and in the CICIoT-2023 (Encrypted) dataset by 20.26%, showing a similar level of improvement. Notably, in CICIoT-2023, the f1-score and Receiver Operating Characteristic-Area Under the Curve (ROC-AUC) increased by 59.0% and 55.94%, respectively. These results suggest that data sampling can have a positive effect even in encrypted environments. However, the extent of the improvement may vary depending on data quality, model architecture, and sampling strategy.
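The simplest of the class-balancing ideas compared in such studies, random over-sampling of the minority class on the metadata features, can be sketched as below; the feature matrix is synthetic, and the eight specific techniques used in the paper (such as SMOTE variants and under-sampling) are not reproduced.

```python
import numpy as np

def random_oversample(X, y, seed=None):
    """Duplicate minority-class rows at random until all classes match the largest class."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [X], [y]
    for c, cnt in zip(classes, counts):
        if cnt < target:
            idx = rng.choice(np.flatnonzero(y == c), size=target - cnt, replace=True)
            X_parts.append(X[idx])
            y_parts.append(y[idx])
    return np.concatenate(X_parts), np.concatenate(y_parts)

# Toy imbalanced "flow metadata": 950 benign vs. 50 attack records.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 8))
y = np.array([0] * 950 + [1] * 50)
X_bal, y_bal = random_oversample(X, y, seed=1)
print(np.bincount(y_bal))   # [950 950]
```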
Computer clusters with the shared-nothing architecture are the major computing platforms for big data processing and analysis. In cluster computing, data partitioning and sampling are two fundamental strategies to speed up the computation of big data and increase scalability. In this paper, we present a comprehensive survey of the methods and techniques of data partitioning and sampling with respect to big data processing and analysis. We start with an overview of the mainstream big data frameworks on Hadoop clusters. The basic methods of data partitioning are then discussed, including three classical horizontal partitioning schemes: range, hash, and random partitioning. Data partitioning on Hadoop clusters is also discussed, with a summary of new strategies for big data partitioning, including the new Random Sample Partition (RSP) distributed model. The classical methods of data sampling are then investigated, including simple random sampling, stratified sampling, and reservoir sampling. Two common methods of big data sampling on computing clusters are also discussed: record-level sampling and block-level sampling. Record-level sampling is not as efficient as block-level sampling on big distributed data. On the other hand, block-level sampling on data blocks generated with the classical data partitioning methods does not necessarily produce good representative samples for approximate computing of big data. In this survey, we also summarize the prevailing strategies and related work on sampling-based approximation on Hadoop clusters. We believe that data partitioning and sampling should be considered together to build approximate cluster computing frameworks that are reliable in both the computational and statistical respects.
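Two of the classical building blocks surveyed, hash partitioning and reservoir sampling (Algorithm R), can be sketched in a few lines; the keys and partition counts below are illustrative.

```python
import random

def hash_partition(key, num_partitions):
    # Classical hash partitioning: records with the same key land in the same partition.
    return hash(key) % num_partitions

def reservoir_sample(stream, k, seed=None):
    # Algorithm R: one pass, O(k) memory, uniform sample from a stream of unknown length.
    rnd = random.Random(seed)
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            sample.append(item)
        else:
            j = rnd.randint(0, i)          # inclusive on both ends
            if j < k:
                sample[j] = item
    return sample

print(hash_partition("user_42", num_partitions=8))
print(reservoir_sample(range(1_000_000), k=5, seed=7))
```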
To study the capability of artificial neural networks (ANNs) for battlefield target classification and the resulting classification performance, and in accordance with the characteristics of battlefield target acoustic and seismic signals, an on-the-spot experiment was carried out to acquire acoustic and seismic signals of a tank and a jeep with a dedicated experimental system. Experimental data processed by the fast Fourier transform (FFT) were used to train the ANN to distinguish the two battlefield targets. The ANN classifier was implemented by a dedicated program based on a modified back propagation (BP) algorithm. The ANN classifier achieves high correct identification rates for acoustic and seismic signals of battlefield targets and is suitable for the classification of battlefield targets. The modified BP algorithm eliminates the oscillations and local minima of the standard BP algorithm and enhances the convergence rate of the ANN.
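A compact sketch of the pipeline described above, FFT magnitude features feeding a small network trained by back propagation with a momentum term (one common "modified BP" choice; the abstract does not specify the paper's exact modification), is given below; the two synthetic signals standing in for the tank and jeep signatures are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_signal(f0, n=1024):
    # Synthetic stand-in for an acoustic signature dominated by f0 and a harmonic.
    t = np.arange(n) / n
    return np.sin(2*np.pi*f0*t) + 0.5*np.sin(2*np.pi*2.3*f0*t) + 0.3*rng.standard_normal(n)

def fft_features(sig, n_bins=32):
    spec = np.abs(np.fft.rfft(sig))[:n_bins]
    return spec / (spec.max() + 1e-12)

X = np.array([fft_features(make_signal(f)) for f in [8]*50 + [20]*50])
y = np.array([0]*50 + [1]*50, dtype=float).reshape(-1, 1)

# One hidden layer, sigmoid activations, BP with a momentum term.
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))
W1 = 0.1 * rng.standard_normal((32, 8)); V1 = np.zeros_like(W1)
W2 = 0.1 * rng.standard_normal((8, 1));  V2 = np.zeros_like(W2)
lr, momentum = 0.5, 0.9
for _ in range(2000):
    H = sigmoid(X @ W1)
    P = sigmoid(H @ W2)
    dP = (P - y) * P * (1 - P)              # output-layer delta (squared-error loss)
    dH = (dP @ W2.T) * H * (1 - H)          # hidden-layer delta
    V2 = momentum * V2 - lr * (H.T @ dP) / len(X); W2 += V2
    V1 = momentum * V1 - lr * (X.T @ dH) / len(X); W1 += V1
print("training accuracy:", float(((P > 0.5) == y).mean()))
```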
Power transmission lines are a critical component of the entire power system, and ice accretion incidents in various types of power systems can result in immeasurable harm. Currently, network models used for ice detection on power transmission lines require a substantial amount of sample data to support their training, and their drawback is that detection accuracy is significantly affected by inaccurate annotations in the training dataset. Therefore, we propose a transformer-based detection model, structured into two stages, to collectively address the impact of inaccurate datasets on model training. In the first stage, a spatial similarity enhancement (SSE) module is designed to leverage spatial information to enhance the construction of the detection framework, thereby improving the accuracy of the detector. In the second stage, a target similarity enhancement (TSE) module is introduced to enhance object-related features and reduce the impact of inaccurate data on model training, thereby expanding global correlation. Additionally, by incorporating a multi-head adaptive attention window (MAAW), spatial information is combined with category information to achieve information interaction. Simultaneously, a quasi-wavelet structure, compatible with deep learning, is employed to highlight subtle features at different scales. Experimental results indicate that the proposed model outperforms existing mainstream detection models, demonstrating superior performance and stability.
Based on the multi-model principle, fuzzy identification for nonlinear systems with multirate sampled data is studied. First, the nonlinear system with multirate sampled data is represented as a nonlinear weighted combination of linear models at multiple local operating points. On this basis, the fuzzy model of the multirate sampled nonlinear system is built. The premise structure of the fuzzy model is determined using fuzzy competitive learning, and the consequent parameters of the fuzzy model are estimated by a stochastic gradient descent algorithm. The convergence of the proposed identification algorithm is established using the martingale theorem and related lemmas. The fuzzy model of the pH neutralization process of acid-base titration for hair quality detection is constructed to demonstrate the effectiveness of the proposed method.