In the realm of video understanding, the demand for accurate and contextually rich video captioning has surged with the increasing volume and complexity of multimedia content. This research introduces a video captioning solution built on a Convolutional Bidirectional Long Short-Term Memory (BiLSTM) based Variational Sequence-to-Sequence (CBVSS) approach. The proposed framework captures intricate temporal dependencies within video sequences, enabling more nuanced and contextually relevant descriptions of dynamic scenes. However, optimizing its parameters for improved performance remains a crucial challenge. In response, Golden Eagle Optimization (GEO), a metaheuristic optimization technique, is used to fine-tune the parameters of the convolutional BiLSTM variational sequence-to-sequence model. Applying GEO aims to enhance the ability of CBVSS to produce more exact and contextually rich video captions. The proposed method attains a higher overall recall of 59.75% and precision of 63.78% across both datasets. Additionally, the proposed CBVSS method demonstrated superior performance on both datasets, achieving the highest METEOR (25.67) and CIDEr (39.87) scores on the ActivityNet dataset and outperforming all compared models on the YouCook2 dataset with METEOR (28.67) and CIDEr (43.02), highlighting its effectiveness in generating semantically rich and contextually accurate video captions.
To address the difficulty of fault identification caused by manual extraction of fault features of rotating machinery, a one-dimensional multi-scale convolutional auto-encoder fault diagnosis model is proposed, based on the standard convolutional auto-encoder. In this model, parallel convolutional and deconvolutional kernels of different scales extract features from the input signal and reconstruct it; the feature map extracted by the multi-scale convolutional kernels is then used as the input of the classifier; finally, the parameters of the whole model are fine-tuned using labeled data. Experiments on one set of simulated fault data and two sets of rolling bearing fault data are conducted to validate the proposed method. The results show that the model achieves 99.75%, 99.3%, and 100% diagnostic accuracy, respectively. In addition, the diagnostic accuracy and reconstruction error of the one-dimensional multi-scale convolutional auto-encoder are compared with those of traditional machine learning, convolutional neural networks, and a traditional convolutional auto-encoder. The final results show that the proposed model recognizes rolling bearing faults more effectively.
Single nucleotide polymorphism (SNP) is an important factor in the study of genetic variation in human families and in animal and plant strains. It is therefore widely used in the study of population genetics and disease-related genes. In pharmacogenomics research, identifying the association between SNP sites and drugs is the key to clinical precision medication. A predictive model of SNP site and drug association based on a denoising variational auto-encoder (DVAE-SVM) is therefore proposed. First, the k-mer algorithm is used to construct the initial SNP site feature vector, while the MACCS molecular fingerprint is introduced to generate the feature vector of the drug module. Then, the DVAE extracts the effective features from the initial feature vector of the SNP site. Finally, the effective feature vector of the SNP site and the feature vector of the drug module are fused and fed to a support vector machine (SVM) to predict the relationship between SNP site and drug module. The results of five-fold cross-validation experiments indicate that the proposed algorithm performs better than random forest (RF) and logistic regression (LR) classification. Further experiments show that the proposed algorithm gives better prediction results than the feature extraction algorithms of principal component analysis (PCA), denoising auto-encoder (DAE), and variational auto-encoder (VAE).
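The k-mer step named in the abstract above can be illustrated with a minimal sketch. It assumes a plain DNA string around the SNP site, a four-letter alphabet, and normalized k-mer frequencies as the feature vector; the function name `kmer_feature_vector`, the default `k=3`, and the normalization are illustrative assumptions, not details taken from the paper.

```python
from itertools import product


def kmer_feature_vector(sequence, k=3, alphabet="ACGT"):
    """Return normalized k-mer frequencies, ordered lexicographically by k-mer.

    Hypothetical helper: the paper's exact k and encoding are not given here.
    """
    # Enumerate all len(alphabet)**k possible k-mers in a fixed order.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}

    counts = [0] * len(kmers)
    total = 0
    sequence = sequence.upper()

    # Slide a window of width k over the sequence, one base at a time.
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        position = index.get(window)
        if position is not None:  # skip windows with ambiguous bases such as N
            counts[position] += 1
            total += 1

    if total == 0:
        return [0.0] * len(kmers)
    return [count / total for count in counts]


if __name__ == "__main__":
    vector = kmer_feature_vector("ACGTAC", k=2)
    print(len(vector), vector[1])  # 16 features; index 1 is the k-mer "AC"
```

A fixed k-mer ordering keeps the vector length constant (4^k for DNA), which is what lets sequences of different lengths feed the same downstream encoder.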
Real-time 6 Degree-of-Freedom (DoF) pose estimation is of paramount importance for various on-orbit tasks. Benefiting from the development of deep learning, Convolutional Neural Networks (CNNs) used for feature extraction have yielded impressive achievements in spacecraft pose estimation. To improve the robustness and interpretability of CNNs, this paper proposes a Pose Estimation approach based on a Variational Auto-Encoder structure (PE-VAE) and a Feature-Aided pose estimation approach based on a Variational Auto-Encoder structure (FA-VAE), both of which aim to accurately estimate the 6 DoF pose of a target spacecraft. Both methods treat the pose vector as latent variables and employ an encoder-decoder network with a Variational Auto-Encoder (VAE) structure. To enhance the precision of pose estimation, PE-VAE uses the VAE structure to introduce a reconstruction mechanism over the whole image. FA-VAE further enforces feature shape constraints by reconstructing only the segment of the target spacecraft with the desired shape. Comparative evaluation against leading methods on public datasets reveals similar accuracy with a threefold improvement in processing speed, showcasing the significant contribution of VAE structures to accuracy enhancement and the additional benefit of incorporating global shape prior features.
To address the difficulties in fusing multi-mode sensor data for complex industrial machinery, an adaptive deep coupling convolutional auto-encoder (ADCCAE) fusion method was proposed. First, the multi-mode features extracted synchronously by the CCAE were stacked and fed to the multi-channel convolution layers for fusion. Then, the fused data was passed to fully connected layers for compression and fed to the Softmax module for classification. Finally, the coupling loss function coefficients and the network parameters were optimized adaptively using the gray wolf optimization (GWO) algorithm. Experimental comparisons showed that the proposed ADCCAE fusion model was superior to existing models for multi-mode data fusion.
Existing graph convolution methods usually suffer from high computational burdens, large memory requirements, and intractable batch processing. In this paper, we propose a highly efficient variational gridded graph convolution network (VG-GCN) to encode non-regular graph data, which overcomes all of these problems. To capture graph topology structures efficiently, we propose a hierarchically-coarsened random walk (hcr-walk) that takes advantage of the classic random walk and node/edge encapsulation. The hcr-walk greatly mitigates the exponential explosion of sampling times that occurs in the classic version, while preserving graph structures well. To efficiently encode the local hcr-walk around one reference node, we project the hcr-walk into an ordered space to form image-like grid data, which suits conventional convolution networks. Instead of direct 2-D convolution filtering, a variational convolution block (VCB) is designed to model the distribution of the randomly sampled hcr-walk, inspired by well-formulated variational inference. We experimentally validate the efficiency and effectiveness of the proposed VG-GCN, which has high computation speed and comparable or even better performance than baseline GCNs.
A deep-sea riser is a crucial component of the mining system used to lift seafloor mineral resources to the vessel. Even minor damage to the riser can lead to substantial financial losses, environmental impacts, and safety hazards. However, identifying modal parameters for structural health monitoring remains a major challenge due to the riser's large deformations and flexibility. Vibration signal-based methods are essential for detecting damage and enabling timely maintenance to minimize losses. However, accurately extracting features from one-dimensional (1D) signals is often hindered by various environmental factors and measurement noise. To address this challenge, a novel approach based on a residual convolutional auto-encoder (RCAE) is proposed for detecting damage in deep-sea mining risers, incorporating a data fusion strategy. First, principal component analysis (PCA) is applied to reduce environmental fluctuations and fuse multisensor strain readings. Subsequently, a 1D-RCAE is used to extract damage-sensitive features (DSFs) from the fused dataset. A Mahalanobis distance indicator is established to compare the DSFs of the testing and healthy risers. The threshold for these distances is determined using the 3σ criterion and is used to assess whether damage has occurred in the testing riser. The effectiveness and robustness of the proposed approach are verified through numerical simulations of a 500-m riser and experimental tests on a 6-m riser. Moreover, the impact of noise contamination and environmental fluctuations is examined. Results show that the proposed PCA-1D-RCAE approach can effectively detect damage and is resilient to measurement noise and environmental fluctuations. The accuracy exceeds 98% under noise-free conditions and remains above 90% even with 10 dB noise. This novel approach has the potential to establish a new standard for evaluating the health and integrity of risers during mining operations, thereby reducing the high costs and risks associated with failures. By enabling early and accurate detection of riser damage, it allows maintenance activities to be scheduled more efficiently, minimizing downtime and avoiding catastrophic failures.
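The Mahalanobis distance indicator and 3σ threshold described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions: the damage-sensitive features are already available as small numeric vectors, the healthy-state covariance matrix is non-singular, and the threshold is taken as the mean plus three standard deviations of the healthy distances. All function names are hypothetical, and the paper's own implementation may differ.

```python
import math
import statistics


def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(column) / len(rows) for column in zip(*rows)]


def covariance_matrix(rows, mean):
    """Sample covariance matrix (divides by n - 1)."""
    dim = len(mean)
    cov = [[0.0] * dim for _ in range(dim)]
    for row in rows:
        diff = [x - m for x, m in zip(row, mean)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += diff[i] * diff[j]
    return [[value / (len(rows) - 1) for value in line] for line in cov]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting.

    Assumes `a` is square and non-singular.
    """
    n = len(b)
    m = [line[:] + [b[i]] for i, line in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def mahalanobis(x, mean, cov):
    """sqrt((x - mean)^T cov^{-1} (x - mean)), without forming the inverse."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    weighted = solve(cov, diff)
    return math.sqrt(sum(d * w for d, w in zip(diff, weighted)))


def three_sigma_threshold(healthy_distances):
    """Assumed form of the 3-sigma criterion: mean + 3 * standard deviation."""
    return (statistics.mean(healthy_distances)
            + 3 * statistics.stdev(healthy_distances))


if __name__ == "__main__":
    healthy = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]  # toy features
    mu = mean_vector(healthy)
    cov = covariance_matrix(healthy, mu)
    limit = three_sigma_threshold([mahalanobis(h, mu, cov) for h in healthy])
    print(mahalanobis([10.0, -10.0], mu, cov) > limit)  # flagged as damaged
```

Solving the linear system instead of inverting the covariance matrix keeps the sketch short and numerically safer; with real feature vectors a regularized covariance estimate may be needed.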
Future 6G communications will open up opportunities for innovative applications, including Cyber-Physical Systems, edge computing, support for Industry 5.0, and digital agriculture. While automation creates efficiencies, it can also create new cyber threats, such as vulnerabilities in trust and malicious node injection. Denial-of-Service (DoS) attacks can stop many forms of operations by overwhelming networks and systems with data noise. Current anomaly detection methods require extensive software changes and only detect static threats. Data collection is important for accuracy, but it is often a slow, tedious, and sometimes inefficient process. This paper proposes a new wavelet transform-assisted Bayesian deep learning based probabilistic (WT-BDLP) approach to mitigate malicious data injection attacks in 6G edge networks. The proposed approach combines outlier detection based on a Bayesian learning conditional variational autoencoder (Bay-LCVariAE) with traffic pattern analysis based on the continuous wavelet transform (CWT). The Bay-LCVariAE framework allows probabilistic modelling of generative features, which helps capture how features of interest change over time and space and helps recognize anomalies. Similarly, the CWT emphasizes multi-resolution spectral analysis and permits recognition of temporally relevant frequency patterns. Experimental testing showed that the flexibility of the Bayesian probabilistic framework offers a large improvement in anomaly detection accuracy over existing methods, with a maximum accuracy of 98.21% in recognizing anomalies.
Existing segmentation and augmentation techniques for convolutional neural networks (CNNs) have produced remarkable progress in object detection. However, nominal accuracy and performance may degrade under photometric variations of images that are ignored in the training process, depending on the individual CNN algorithm. In this paper, we investigate the effect of photometric variations such as brightness and sharpness on different CNNs. We observe that random augmentation of images weakens performance unless the augmentation stays within weak limits of photometric variation. Our approach is supported by experimental results obtained on the PASCAL VOC 2007 dataset with object detection CNN algorithms such as YOLOv3 (You Only Look Once), Faster R-CNN (Region-based CNN), and SSD (Single Shot Multibox Detector). Each CNN model shows performance loss as sharpness and brightness vary over a range from −80% to 80%. It was further shown that, compared to random augmentation, the dataset augmented with weak photometric changes delivered high performance, but the suitable photometric augmentation range differs for each model. We also discuss some research questions that inform the direction of the study. The results demonstrate the importance of adaptive augmentation for each individual CNN model in improving the robustness of object detection.
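A brightness change of the kind studied in the abstract above can be illustrated with a small sketch. It assumes an 8-bit grayscale image stored as nested lists and interprets a change of p percent as multiplying each pixel by (1 + p/100); both the helper name `adjust_brightness` and this interpretation are assumptions, since the abstract does not define how the −80% to 80% range is applied.

```python
def adjust_brightness(image, change_percent):
    """Scale 8-bit pixel intensities by (1 + change_percent / 100).

    Hypothetical helper: results are rounded and clipped to [0, 255].
    """
    if not -100 <= change_percent <= 100:
        raise ValueError("change_percent must lie in [-100, 100]")
    factor = 1.0 + change_percent / 100.0
    return [
        [min(255, max(0, round(pixel * factor))) for pixel in row]
        for row in image
    ]


if __name__ == "__main__":
    sample = [[100, 200]]
    print(adjust_brightness(sample, 80))   # brighter, clipped at 255
    print(adjust_brightness(sample, -80))  # darker
```

Restricting `change_percent` to a narrow interval per detector would correspond to the weak-limit augmentation the abstract recommends; the appropriate interval is model-specific according to the authors.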
Classification through deep learning aims to build a model that separates a dataset of closely related images into different classes on the basis of small but continuous variations that take place in physical systems over time and have substantial effects. This study identifies ozone depletion through classification using a Faster Region-Based Convolutional Neural Network (F-RCNN). The main advantage of F-RCNN is that it places bounding boxes on images to differentiate depleted from non-depleted regions. The primary goal of the image classification is to accurately predict the target class of each minutely varied case in the dataset based on ozone saturation. Permanent changes in climate are of serious concern. The leading causes behind these destructive variations are ozone layer depletion, greenhouse gas release, deforestation, pollution, contamination of water resources, and UV radiation. This research focuses on prediction by identifying ozone layer depletion because it causes many health and environmental problems, e.g., skin cancer, damage to marine life, crop damage, and impacts on the immune systems of living beings. We classify the ozone image dataset into two major classes, depleted and non-depleted regions, and extract the required discriminative features through F-RCNN. In the existing literature, a CNN has been used for feature extraction, and the extracted regions of interest (RoIs) are passed on to the CNN for grouping. It is difficult to manage and differentiate those RoIs after grouping, which negatively affects the results. The classification outcomes of the F-RCNN approach are proficient and show that overall accuracy lies between 91% and 93% in identifying climate variation through ozone concentration classification, that is, in deciding whether the region in the image under consideration is depleted or non-depleted. Our proposed model achieved 93% accuracy and outperforms the prevailing techniques.
The influenza virus changes its antigenicity frequently due to rapid mutations, leading to immune escape and vaccine failure. Rapid determination of influenza antigenicity could help identify antigenic variants in time. Here, we built a stacked auto-encoder (SAE) model for predicting antigenic variants of human influenza A(H3N2) viruses based on hemagglutinin (HA) protein sequences. The model achieved an accuracy of 0.95 in five-fold cross-validation, better than the logistic regression model. Further analysis of the model shows that most of the active nodes in the hidden layer reflected the combined contribution of multiple residues to antigenic variation. In addition, some features (residues on the HA protein) in the input layer were observed to take part in multiple active nodes, such as residues 189, 145, and 156, which have also been reported to largely determine the antigenic variation of influenza A(H3N2) viruses. Overall, this work is not only useful for rapidly identifying antigenic variants in influenza prevention, but is also an interesting attempt at inferring the mechanisms of a biological process through analysis of an SAE model, which may give some insights into the interpretation of deep learning.
Non-intrusive load monitoring (NILM) can infer load profiles for each individual appliance from aggregated power consumption signals without installing extra sub-meters. However, the performance of traditional energy disaggregation methods deteriorates in complex environments and is especially susceptible to the presence of other appliances with high power consumption. Practicality is also limited by the diversity of household load patterns and by measurement errors. To address these problems, a two-step hybrid deep learning model is proposed in this paper. First, an improved variational autoencoder (VAE) structure is introduced for preliminary energy disaggregation, in which the encoder and decoder layers are long short-term memory (LSTM) networks that extract temporal characteristics of active power signals. Afterward, a post-processing method based on a Siamese one-dimensional convolutional neural network (S-1D-CNN) is adopted to remove incorrectly predicted activation segments of target appliances. Experiments are conducted on two public datasets, and the results show remarkable improvements in prediction accuracy over other deep learning methods. Both the transferability and the stability of the proposed model are verified under different working conditions.
Plant breeding stands as a cornerstone of agricultural productivity and the safeguarding of food security. The advent of Genomic Selection heralds a new epoch in breeding, characterized by its capacity to harness whole-genome variation for genomic prediction. This approach removes the need for prior knowledge of genes associated with specific traits. Nonetheless, the vast dimensionality of genomic data, set against the relatively limited number of phenotypic samples, often leads to the "curse of dimensionality", where traditional statistical, machine learning, and deep learning methods are prone to overfitting and suboptimal predictive performance. To surmount this challenge, we introduce a unified Variational auto-encoder based Multi-task Genomic Prediction model (VMGP) that integrates self-supervised genomic compression and reconstruction with multiple prediction tasks. This approach provides a robust predictive framework that has been rigorously validated on public datasets for wheat, rice, and maize. Our model demonstrates exceptional capabilities in multi-phenotype and multi-environment genomic prediction, successfully navigating the complexities of cross-population genomic selection and underscoring its unique strengths and utility. Furthermore, by integrating VMGP with model interpretability, we can effectively triage relevant single nucleotide polymorphisms, thereby enhancing prediction performance and proposing potential cost-effective genotyping solutions. The VMGP framework, with its simplicity, stable predictive performance, and open-source code, is well suited for broad dissemination within plant breeding programs. It is particularly advantageous for breeders who prioritize phenotype prediction yet may not possess extensive knowledge of deep learning or proficiency in parameter tuning.
The Proton Exchange Membrane Fuel Cell (PEMFC) converts the chemical energy of hydrogen fuel directly into electrical energy and has broad application prospects. Understanding how current density is distributed in PEMFC systems is crucial, as it is a key factor influencing system performance. However, direct modeling of the current distribution may encounter the curse of dimensionality owing to the high dimensionality of the data. This paper uses a high-resolution segmented measurement device with 396 points to conduct experimental tests on the current distribution of a PEMFC with a reactive area of 406 cm² during a stepwise increase in load current. The current distribution is modeled based on the test results to learn the mapping between the experimental parameters and the current distribution. The proposed model uses a Conditional Variational Auto-Encoder (CVAE) to generate current distributions. The mean-square error (MSE) of the trained CVAE model reaches 9.2×10⁻⁵, and the comparison results show that the 222.9 A current distribution has the largest MSE, 6.36×10⁻⁴, with a Kullback-Leibler (KL) divergence of 9.55×10⁻⁴, both of which are low. This model enables the direct determination of the current distribution from the experimental parameters, thereby establishing a technical foundation for investigating the impact of experimental conditions on fuel cells. This model is also of great significance for research on fuel cell system control strategies and fault diagnosis.
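The two error measures quoted in the abstract above, MSE and KL divergence, can be sketched for a pair of measured and generated current maps. The sketch assumes each map is a flat list of non-negative segment currents and that both maps are normalized to sum to one before the KL divergence is computed; that normalization, the `eps` floor, and the function names are assumptions, since the abstract does not state how the divergence was evaluated.

```python
import math


def mse(predicted, measured):
    """Mean-square error between two equal-length current maps."""
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return sum(squared) / len(squared)


def kl_divergence(measured, predicted, eps=1e-12):
    """KL(P || Q), with P from `measured` and Q from `predicted`.

    Assumption: both maps are normalized to sum to 1 and Q is floored
    at `eps` so the logarithm stays finite.
    """
    p_total = sum(measured)
    q_total = sum(predicted)
    divergence = 0.0
    for m, q in zip(measured, predicted):
        p = m / p_total
        q = max(q / q_total, eps)
        if p > 0:  # terms with p == 0 contribute nothing
            divergence += p * math.log(p / q)
    return divergence


if __name__ == "__main__":
    measured = [1.0, 1.0]   # toy two-segment map, not data from the paper
    generated = [1.0, 3.0]
    print(mse(generated, measured), kl_divergence(measured, generated))
```

MSE penalizes absolute per-segment errors, while the KL divergence compares the shapes of the two normalized distributions, which is why reporting both gives complementary information.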
Generative AI models for music and the arts in general are increasingly complex and hard to understand. The field of explainable AI (XAI) seeks to make complex and opaque AI models such as neural networks more understandable to people. One approach to making generative AI models more understandable is to impose a small number of semantically meaningful attributes on them. This paper contributes a systematic examination of the impact that different combinations of variational auto-encoder models (measureVAE and adversarialVAE), configurations of latent space in the AI model (from 4 to 256 latent dimensions), and training datasets (Irish folk, Turkish folk, classical, and pop) have on music generation performance when 2 or 4 meaningful musical attributes are imposed on the generative model. To date, there have been no systematic comparisons of such models at this level of combinatorial detail. Our findings show that measureVAE has better reconstruction performance than adversarialVAE, which in turn has better musical attribute independence. Results demonstrate that measureVAE was able to generate music across music genres with interpretable musical dimensions of control, and that it performs best with low-complexity music such as pop and rock. We recommend a 32- or 64-dimensional latent space as optimal for 4 regularised dimensions when using measureVAE to generate music across genres. Our results are the first detailed comparisons of configurations of state-of-the-art generative AI models for music and can be used to help select and configure AI models, musical features, and datasets for more understandable generation of music.
Exposure to poor indoor air conditions poses significant risks to human health, increasing morbidity and mortality rates. Soft measurement modeling is suitable for stable and accurate monitoring of air pollutants and for improving air quality. Based on partial least squares (PLS), we propose an indoor air quality prediction model that uses a variational auto-encoder regression (VAER) algorithm. To reduce the negative effects of noise, latent variables in the original data are first extracted by PLS. The extracted variables are then used as inputs to VAER, which improves the accuracy and robustness of the model. Through comparative analysis with traditional methods, we demonstrate the superior performance of our PLS-VAER model, which exhibits improved prediction performance and stability. The root mean square error (RMSE) of PLS-VAER is reduced by 14.71%, 26.47%, and 12.50% compared to single VAER, PLS-SVR, and PLS-ANN, respectively. Additionally, the coefficient of determination (R²) of PLS-VAER improves by 13.70%, 30.09%, and 11.25% compared to single VAER, PLS-SVR, and PLS-ANN, respectively. This research offers an innovative and environmentally friendly approach to monitoring and improving indoor air quality.
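The two metrics reported in the abstract above follow their standard definitions, sketched here together with a helper for the percentage comparisons. The function names are hypothetical, and reading "reduced by 14.71%" as a change relative to the baseline model's value is an assumption about how the authors computed their percentages.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def relative_change_percent(baseline, proposed):
    """Signed change of `proposed` relative to `baseline`, in percent.

    Assumed convention: negative values mean a reduction.
    """
    return (proposed - baseline) / baseline * 100.0


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0]   # toy values, not data from the paper
    predicted = [1.0, 2.0, 5.0]
    print(rmse(observed, predicted), r_squared(observed, predicted))
    print(relative_change_percent(2.0, 1.5))  # a 25% reduction
```

Lower RMSE and higher R² both indicate a better fit, so a negative relative change in RMSE and a positive one in R² point in the same direction.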
Funding (one-dimensional multi-scale convolutional auto-encoder fault diagnosis study): National Natural Science Foundation of China (No. 51675098).
Funding (DVAE-SVM study of SNP site and drug association): Lanzhou Talent Innovation and Entrepreneurship Project (No. 2020-RC-14).
Funding (PE-VAE and FA-VAE spacecraft pose estimation study): National Natural Science Foundation of China (No. 52272390), Natural Science Foundation of Heilongjiang Province of China (No. YQ2022A009), and Shanghai Sailing Program, China (No. 20YF1417300).
Abstract: To address the difficulties in fusing multi-mode sensor data for complex industrial machinery, an adaptive deep coupling convolutional auto-encoder (ADCCAE) fusion method was proposed. First, the multi-mode features extracted synchronously by the CCAE were stacked and fed to the multi-channel convolution layers for fusion. Then, the fused data were passed to fully connected layers for compression and fed to the Softmax module for classification. Finally, the coupling loss function coefficients and the network parameters were optimized adaptively using the gray wolf optimization (GWO) algorithm. Experimental comparisons showed that the proposed ADCCAE fusion model was superior to existing models for multi-mode data fusion.
Funding: Supported by the Natural Science Foundation of Jiangsu Province (BK20190019, BK20190452), the National Natural Science Foundation of China (62072244, 61906094), and the Natural Science Foundation of Shandong Province (ZR2020LZH008).
Abstract: Existing graph convolution methods usually suffer from high computational burdens, large memory requirements, and intractable batch processing. In this paper, we propose a highly efficient variational gridded graph convolution network (VG-GCN) to encode non-regular graph data, which overcomes all of these problems. To capture graph topology structures efficiently, we propose a hierarchically-coarsened random walk (hcr-walk) that takes advantage of the classic random walk and node/edge encapsulation. The hcr-walk greatly mitigates the exponentially explosive sampling times that occur in the classic version, while preserving graph structures well. To efficiently encode the local hcr-walk around one reference node, we project the hcr-walk into an ordered space to form image-like grid data, which favors conventional convolution networks. Instead of direct 2-D convolution filtering, a variational convolution block (VCB) is designed to model the distribution of the random-sampling hcr-walk, inspired by well-formulated variational inference. We experimentally validate the efficiency and effectiveness of the proposed VG-GCN, which has high computation speed and comparable or even better performance than baseline GCNs.
Funding: Supported by the National Key Research and Development Program of China (No. 2023YFC2811600), the National Natural Science Foundation of China (Nos. 52301349, 52088102), the Major Science and Technology Innovation Program of Qingdao (No. 223-3-hygg-10-hy), and the Qingdao Science Foundation for Post-doctoral Scientists (Nos. QDBSH20220202070, QDBSH20220201015).
Abstract: A deep-sea riser is a crucial component of the mining system used to lift seafloor mineral resources to the vessel. Even minor damage to the riser can lead to substantial financial losses, environmental impacts, and safety hazards. However, identifying modal parameters for structural health monitoring remains a major challenge because of the riser's large deformations and flexibility. Vibration signal-based methods are essential for detecting damage and enabling timely maintenance to minimize losses, but accurately extracting features from one-dimensional (1D) signals is often hindered by environmental factors and measurement noise. To address this challenge, a novel approach based on a residual convolutional auto-encoder (RCAE) and a data fusion strategy is proposed for detecting damage in deep-sea mining risers. First, principal component analysis (PCA) is applied to reduce environmental fluctuations and fuse multisensor strain readings. Subsequently, a 1D-RCAE is used to extract damage-sensitive features (DSFs) from the fused dataset. A Mahalanobis distance indicator is established to compare the DSFs of the testing and healthy risers. The threshold for these distances is determined using the 3σ criterion and is used to assess whether damage has occurred in the testing riser. The effectiveness and robustness of the proposed approach are verified through numerical simulations of a 500-m riser and experimental tests on a 6-m riser. Moreover, the impact of noise contamination and environmental fluctuations is examined. Results show that the proposed PCA-1D-RCAE approach can effectively detect damage and is resilient to measurement noise and environmental fluctuations. The accuracy exceeds 98% under noise-free conditions and remains above 90% even with 10 dB noise. This approach has the potential to establish a new standard for evaluating the health and integrity of risers during mining operations, thereby reducing the high costs and risks associated with failures. By enabling early and accurate detection of riser damage, it allows maintenance activities to be scheduled more efficiently, minimizing downtime and avoiding catastrophic failures.
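The Mahalanobis distance indicator and 3σ threshold described in the abstract above can be sketched as follows. This is an illustrative sketch in pure Python, not the authors' implementation: the function names are hypothetical, the covariance is estimated from healthy-state feature vectors, and the threshold is assumed to be the mean plus three sample standard deviations of the healthy distances, which is one common reading of the 3σ criterion.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance(rows, mu):
    """Sample covariance matrix (divides by n - 1)."""
    n, d = len(rows), len(mu)
    cov = [[0.0] * d for _ in range(d)]
    for r in rows:
        diff = [x - m for x, m in zip(r, mu)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += diff[i] * diff[j] / (n - 1)
    return cov


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def mahalanobis(x, mu, cov):
    """sqrt((x - mu)^T cov^{-1} (x - mu)), without forming the inverse."""
    diff = [a - b for a, b in zip(x, mu)]
    y = solve(cov, diff)
    return math.sqrt(sum(d * v for d, v in zip(diff, y)))


def three_sigma_threshold(distances):
    """Assumed rule: mean + 3 * sample standard deviation of healthy distances."""
    n = len(distances)
    m = sum(distances) / n
    sd = math.sqrt(sum((d - m) ** 2 for d in distances) / (n - 1))
    return m + 3.0 * sd
```

In use, the healthy-state features would give `mu`, `cov`, and the threshold; a test feature vector whose distance exceeds the threshold would then be flagged as possible damage.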
Abstract: Future 6G communications will open up opportunities for innovative applications, including Cyber-Physical Systems, edge computing, support for Industry 5.0, and digital agriculture. While automation is creating efficiencies, it can also create new cyber threats, such as vulnerabilities in trust and malicious node injection. Denial-of-Service (DoS) attacks can stop many forms of operations by overwhelming networks and systems with data noise. Current anomaly detection methods require extensive software changes and only detect static threats. Data collection is important for accuracy, but it is often a slow, tedious, and sometimes inefficient process. This paper proposes a new wavelet transform-assisted Bayesian deep learning based probabilistic (WT-BDLP) approach to mitigate malicious data injection attacks in 6G edge networks. The proposed approach combines outlier detection based on a Bayesian learning conditional variational autoencoder (Bay-LCVariAE) and traffic pattern analysis based on the continuous wavelet transform (CWT). The Bay-LCVariAE framework allows probabilistic modelling of generative features, which helps capture how features of interest change over time and space and supports the recognition of anomalies. Similarly, CWT emphasizes multi-resolution spectral analysis and permits temporally relevant frequency pattern recognition. Experimental testing showed that the flexibility of the Bayesian probabilistic framework offers a vast improvement in anomaly detection accuracy over existing methods, with a maximum accuracy of 98.21% in recognizing anomalies.
Abstract: Existing segmentation and augmentation techniques for convolutional neural networks (CNNs) have produced remarkable progress in object detection. However, nominal accuracy and performance may degrade under photometric variations of images that are ignored in the training process, depending on the individual CNN algorithm. In this paper, we investigate the effect of photometric variations such as brightness and sharpness on different CNNs. We observe that random augmentation of images weakens performance unless the augmentation stays within weak limits of photometric variation. Our approach is supported by experimental results obtained on the PASCAL VOC 2007 dataset with object detection CNN algorithms such as YOLOv3 (You Only Look Once), Faster R-CNN (Region-based CNN), and SSD (Single Shot Multibox Detector). Each CNN model shows performance loss for sharpness and brightness variations ranging from −80% to 80%. It was further shown that, compared to random augmentation, the augmented dataset with weak photometric changes delivered high performance, but the suitable photometric augmentation range differs for each model. We also discuss some research questions that can guide future study. The results demonstrate the importance of adaptive augmentation for individual CNN models in improving the robustness of object detection.
Abstract: The aim of classification through deep learning is to build a model that separates a dataset of closely related images into different classes, where the differences arise from small but continuous variations that take place in physical systems over time and have substantial effects. This study identifies ozone depletion through classification using a Faster Region-Based Convolutional Neural Network (F-RCNN). The main advantage of F-RCNN is that it places bounding boxes on images to differentiate the depleted and non-depleted regions. Furthermore, the primary goal of image classification is to accurately predict the target class of each minutely varied case in the dataset based on ozone saturation. Permanent changes in climate are of serious concern. The leading causes behind these destructive variations are ozone layer depletion, greenhouse gas release, deforestation, pollution, contamination of water resources, and UV radiation. This research focuses on prediction by identifying ozone layer depletion, because it causes many health and environmental issues, e.g., skin cancer, damage to marine life, crop damage, and impacts on living beings' immune systems. We classify the ozone image dataset into two major classes, depleted and non-depleted regions, and extract the required features through F-RCNN. In the existing literature, CNNs have been used for feature extraction, and the extracted regions of interest (RoIs) are passed on to the CNN for grouping. It is difficult to manage and differentiate those RoIs after grouping, which negatively affects the results. The classification outcomes of the F-RCNN approach show that overall accuracy lies between 91% and 93% in identifying climate variation through ozone concentration classification, that is, in deciding whether the region in the image under consideration is depleted or non-depleted. Our proposed model achieved 93% accuracy and outperforms the prevailing techniques.
Abstract: The influenza virus changes its antigenicity frequently due to rapid mutations, leading to immune escape and failure of vaccination. Rapid determination of influenza antigenicity could help identify antigenic variants in time. Here, we built a stacked auto-encoder (SAE) model for predicting the antigenic variants of human influenza A(H3N2) viruses based on hemagglutinin (HA) protein sequences. The model achieved an accuracy of 0.95 in five-fold cross-validation, better than the logistic regression model. Further analysis of the model shows that most of the active nodes in the hidden layer reflected the combined contribution of multiple residues to antigenic variation. In addition, some features (residues on the HA protein) in the input layer were observed to take part in multiple active nodes, such as residues 189, 145, and 156, which were also reported to largely determine the antigenic variation of influenza A(H3N2) viruses. Overall, this work is not only useful for rapidly identifying antigenic variants in influenza prevention, but also an interesting attempt at inferring the mechanisms of a biological process through analysis of an SAE model, which may give some insights into interpretation of the deep learning
Abstract: Non-intrusive load monitoring (NILM) can infer load profiles for each individual appliance from aggregated power consumption signals without installing extra sub-meters. However, the performance of traditional energy disaggregation methods deteriorates in complex environments and is especially susceptible to the presence of other high-power appliances. Practicality is also limited by the diversity of household load patterns and by measurement errors. To address these problems, a two-step hybrid deep learning model is proposed in this paper. First, an improved variational autoencoder (VAE) structure is introduced for preliminary energy disaggregation, where the encoder and decoder layers are long short-term memory (LSTM) networks that extract temporal characteristics of active power signals. Afterward, a post-processing method based on a Siamese one-dimensional convolutional neural network (S-1D-CNN) is adopted to remove incorrectly predicted activation segments of target appliances. Experiments are conducted on two public datasets, and the results show remarkable improvements in prediction accuracy over other deep learning methods. Both the transferability and the stability of the proposed model are verified under different working conditions.
Funding: Supported by the National Key Research and Development Program of China (No. 2024YFD1201500), the Key Research and Development Program of Jiangsu Province, China (Nos. BE2022337, BE2023302, and BE2023315), and the National Innovation Center for Digital Seed Industry, Beijing, China, 100097.
Abstract: Plant breeding stands as a cornerstone for agricultural productivity and the safeguarding of food security. The advent of Genomic Selection heralds a new epoch in breeding, characterized by its capacity to harness whole-genome variation for genomic prediction. This approach transcends the need for prior knowledge of genes associated with specific traits. Nonetheless, the vast dimensionality of genomic data juxtaposed with the relatively limited number of phenotypic samples often leads to the "curse of dimensionality", where traditional statistical, machine learning, and deep learning methods are prone to overfitting and suboptimal predictive performance. To surmount this challenge, we introduce a unified Variational auto-encoder based Multi-task Genomic Prediction model (VMGP) that integrates self-supervised genomic compression and reconstruction with multiple prediction tasks. This approach provides a robust solution, offering a formidable predictive framework that has been rigorously validated across public datasets for wheat, rice, and maize. Our model demonstrates exceptional capabilities in multi-phenotype and multi-environment genomic prediction, successfully navigating the complexities of cross-population genomic selection and underscoring its unique strengths and utility. Furthermore, by integrating VMGP with model interpretability, we can effectively triage relevant single nucleotide polymorphisms, thereby enhancing prediction performance and proposing potential cost-effective genotyping solutions. The VMGP framework, with its simplicity, stable predictive prowess, and open-source code, is exceptionally well-suited for broad dissemination within plant breeding programs. It is particularly advantageous for breeders who prioritize phenotype prediction yet may not possess extensive knowledge in deep learning or proficiency in parameter tuning.
Funding: Sponsored by the Science and Technology Program of Sichuan Province (2024ZDZX0035 and 2024ZHCG0072).
Abstract: The Proton Exchange Membrane Fuel Cell (PEMFC) converts the chemical energy of hydrogen fuel directly into electrical energy and has broad application prospects. Understanding how current density is distributed in PEMFC systems is crucial, as it is a key factor influencing system performance. However, direct modeling of the current distribution may encounter the curse of dimensionality owing to the high dimensionality of the data. This paper uses a high-resolution segmented measurement device with 396 points to conduct experimental tests on the current distribution of a PEMFC with a reactive area of 406 cm² during a stepwise increase in load current. The current distribution is modeled based on the test results to learn the mapping between the experimental parameters and the current distribution. The proposed model utilizes a Conditional Variational Auto-Encoder (CVAE) to generate current distributions. The mean-square error (MSE) of the trained CVAE model reaches 9.2×10⁻⁵, and the comparison results show that the current distribution at 222.9 A has the largest error, with an MSE of 6.36×10⁻⁴ and a Kullback-Leibler (KL) divergence of 9.55×10⁻⁴, both of which are at a low level. This model enables the direct determination of the current distribution from the experimental parameters, thereby establishing a technical foundation for investigating the impact of experimental conditions on fuel cells. This model is also of great significance for research on fuel cell system control strategies and fault diagnosis.
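The two error measures reported in the abstract above, MSE and KL divergence, can be computed as in the following sketch. The abstract does not say how the segment currents are turned into probability distributions for the KL divergence; here each vector is assumed to be normalized to sum to one, and the function names and the `eps` guard are illustrative choices.

```python
import math


def mse(pred, target):
    """Mean-square error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) for two non-negative vectors.

    Assumption: each vector is normalized to sum to 1 before comparison;
    eps guards against division by zero when q has an empty segment.
    """
    sp, sq = sum(p), sum(q)
    total = 0.0
    for a, b in zip(p, q):
        a, b = a / sp, b / sq
        if a > 0:  # terms with a == 0 contribute nothing
            total += a * math.log(a / max(b, eps))
    return total
```

With a 396-point measurement, `target` would hold the measured segment currents and `pred` the generated ones; both metrics are zero when the two distributions coincide.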
Abstract: Generative AI models for music and the arts in general are increasingly complex and hard to understand. The field of explainable AI (XAI) seeks to make complex and opaque AI models such as neural networks more understandable to people. One approach to making generative AI models more understandable is to impose a small number of semantically meaningful attributes on generative AI models. This paper contributes a systematic examination of the impact that different combinations of variational auto-encoder models (measureVAE and adversarialVAE), configurations of latent space in the AI model (from 4 to 256 latent dimensions), and training datasets (Irish folk, Turkish folk, classical, and pop) have on music generation performance when 2 or 4 meaningful musical attributes are imposed on the generative model. To date, there have been no systematic comparisons of such models at this level of combinatorial detail. Our findings show that measureVAE has better reconstruction performance than adversarialVAE, which has better musical attribute independence. Results demonstrate that measureVAE was able to generate music across music genres with interpretable musical dimensions of control, and performs best with low complexity music such as pop and rock. We recommend that a 32 or 64 latent dimensional space is optimal for 4 regularised dimensions when using measureVAE to generate music across genres. Our results are the first detailed comparisons of configurations of state-of-the-art generative AI models for music and can be used to help select and configure AI models, musical features, and datasets for more understandable generation of music.
Funding: Supported by the Opening Project of Guangxi Key Laboratory of Clean Pulp & Papermaking and Pollution Control, China (No. 2021KF11), the Shandong Provincial Natural Science Foundation, China (No. ZR2021MF135), the National Natural Science Foundation of China (No. 52170001), and the Natural Science Foundation of Jiangsu Provincial Universities, China (No. 22KJA530003).
Abstract: Exposure to poor indoor air conditions poses significant risks to human health, increasing morbidity and mortality rates. Soft measurement modeling is suitable for stable and accurate monitoring of air pollutants and for improving air quality. Based on partial least squares (PLS), we propose an indoor air quality prediction model that utilizes a variational auto-encoder regression (VAER) algorithm. To reduce the negative effects of noise, latent variables in the original data are first extracted by PLS. Then, the extracted variables are used as inputs to VAER, which improves the accuracy and robustness of the model. Through comparative analysis with traditional methods, we demonstrate the superior performance of our PLS-VAER model, which exhibits improved prediction performance and stability. The root mean square error (RMSE) of PLS-VAER is reduced by 14.71%, 26.47%, and 12.50% compared to single VAER, PLS-SVR, and PLS-ANN, respectively. Additionally, the coefficient of determination (R²) of PLS-VAER improves by 13.70%, 30.09%, and 11.25% compared to single VAER, PLS-SVR, and PLS-ANN, respectively. This research offers an innovative and environmentally friendly approach to monitoring and improving indoor air quality.
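The comparison metrics quoted in the abstract above (RMSE, R², and percentage reduction relative to a baseline) follow standard definitions, sketched below. The function names are illustrative, and the abstract does not report the underlying RMSE or R² values, so no attempt is made here to reproduce its percentages.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline`."""
    return 100.0 * (baseline - proposed) / baseline
```

For example, with hypothetical RMSE values of 2.0 for a baseline and 1.5 for a proposed model, `relative_reduction(2.0, 1.5)` gives a 25% reduction.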