The aerial deployment method enables Unmanned Aerial Vehicles(UAVs)to be directly positioned at the required altitude for their mission.This method typically employs folding technology to improve loading efficiency,wi...The aerial deployment method enables Unmanned Aerial Vehicles(UAVs)to be directly positioned at the required altitude for their mission.This method typically employs folding technology to improve loading efficiency,with applications such as the gravity-only aerial deployment of high-aspect-ratio solar-powered UAVs,and aerial takeoff of fixed-wing drones in Mars research.However,the significant morphological changes during deployment are accompanied by strong nonlinear dynamic aerodynamic forces,which result in multiple degrees of freedom and an unstable character.This hinders the description and analysis of unknown dynamic behaviors,further leading to difficulties in the design of deployment strategies and flight control.To address this issue,this paper proposes an analysis method for dynamic behaviors during aerial deployment based on the Variational Autoencoder(VAE).Focusing on the gravity-only deployment problem of highaspect-ratio foldable-wing UAVs,the method encodes the multi-degree-of-freedom unstable motion signals into a low-dimensional feature space through a data-driven approach.By clustering in the feature space,this paper identifies and studies several dynamic behaviors during aerial deployment.The research presented in this paper offers a new method and perspective for feature extraction and analysis of complex and difficult-to-describe extreme flight dynamics,guiding the research on aerial deployment drones design and control strategies.展开更多
Geochemical survey data are essential across Earth Science disciplines but are often affected by noise,which can obscure important geological signals and compromise subsequent prediction and interpretation.Quantifying...Geochemical survey data are essential across Earth Science disciplines but are often affected by noise,which can obscure important geological signals and compromise subsequent prediction and interpretation.Quantifying prediction uncertainty is hence crucial for robust geoscientific decision-making.This study proposes a novel deep learning framework,the Spatially Constrained Variational Autoencoder(SC-VAE),for denoising geochemical survey data with integrated uncertainty quantification.The SC-VAE incorporates spatial regularization,which enforces spatial coherence by modeling inter-sample relationships directly within the latent space.The performance of the SC-VAE was systematically evaluated against a standard Variational Autoencoder(VAE)using geochemical data from the gold polymetallic district in the northwestern part of Sichuan Province,China.Both models were optimized using Bayesian optimization,with objective functions specifically designed to maintain essential geostatistical characteristics.Evaluation metrics include variogram analysis,quantitative measures of spatial interpolation accuracy,visual assessment of denoised maps,and statistical analysis of data distributions,as well as decomposition of uncertainties.Results show that the SC-VAE achieves superior noise suppression and better preservation of spatial structure compared to the standard VAE,as demonstrated by a significant reduction in the variogram nugget effect and an increased partial sill.The SC-VAE produces denoised maps with clearer anomaly delineation and more regularized data distributions,effectively mitigating outliers and reducing kurtosis.Additionally,it delivers improved interpolation accuracy and spatially explicit uncertainty estimates,facilitating more reliable and interpretable assessments of prediction confidence.The SC-VAE framework thus provides a robust,geostatistically informed solution for enhancing the quality and interpretability of geochemical data,with broad applicability in mineral exploration,environmental geochemistry,and other Earth Science domains.展开更多
Future 6G communications will open up opportunities for innovative applications,including Cyber-Physical Systems,edge computing,supporting Industry 5.0,and digital agriculture.While automation is creating efficiencies...Future 6G communications will open up opportunities for innovative applications,including Cyber-Physical Systems,edge computing,supporting Industry 5.0,and digital agriculture.While automation is creating efficiencies,it can also create new cyber threats,such as vulnerabilities in trust and malicious node injection.Denialof-Service(DoS)attacks can stop many forms of operations by overwhelming networks and systems with data noise.Current anomaly detection methods require extensive software changes and only detect static threats.Data collection is important for being accurate,but it is often a slow,tedious,and sometimes inefficient process.This paper proposes a new wavelet transformassisted Bayesian deep learning based probabilistic(WT-BDLP)approach tomitigate malicious data injection attacks in 6G edge networks.The proposed approach combines outlier detection based on a Bayesian learning conditional variational autoencoder(Bay-LCVariAE)and traffic pattern analysis based on continuous wavelet transform(CWT).The Bay-LCVariAE framework allows for probabilistic modelling of generative features to facilitate capturing how features of interest change over time,spatially,and for recognition of anomalies.Similarly,CWT allows emphasizing the multi-resolution spectral analysis and permits temporally relevant frequency pattern recognition.Experimental testing showed that the flexibility of the Bayesian probabilistic framework offers a vast improvement in anomaly detection accuracy over existing methods,with a maximum accuracy of 98.21%recognizing anomalies.展开更多
Quantitative analysis of aluminum-silicon(Al-Si)alloy microstructure is crucial for evaluating and controlling alloy performance.Conventional analysis methods rely on manual segmentation,which is inefficient and subje...Quantitative analysis of aluminum-silicon(Al-Si)alloy microstructure is crucial for evaluating and controlling alloy performance.Conventional analysis methods rely on manual segmentation,which is inefficient and subjective,while fully supervised deep learning approaches require extensive and expensive pixel-level annotated data.Furthermore,existing semi-supervised methods still face challenges in handling the adhesion of adjacent primary silicon particles and effectively utilizing consistency in unlabeled data.To address these issues,this paper proposes a novel semi-supervised framework for Al-Si alloy microstructure image segmentation.First,we introduce a Rotational Uncertainty Correction Strategy(RUCS).This strategy employs multi-angle rotational perturbations andMonte Carlo sampling to assess prediction consistency,generating a pixel-wise confidence weight map.By integrating this map into the loss function,the model dynamically focuses on high-confidence regions,thereby improving generalization ability while reducing manual annotation pressure.Second,we design a Boundary EnhancementModule(BEM)to strengthen boundary feature extraction through erosion difference and multi-scale dilated convolutions.This module guides the model to focus on the boundary regions of adjacent particles,effectively resolving particle adhesion and improving segmentation accuracy.Systematic experiments were conducted on the Aluminum-Silicon Alloy Microstructure Dataset(ASAD).Results indicate that the proposed method performs exceptionally well with scarce labeled data.Specifically,using only 5%labeled data,our method improves the Jaccard index and Adjusted Rand Index(ARI)by 2.84 and 1.57 percentage points,respectively,and reduces the Variation of Information(VI)by 8.65 compared to stateof-the-art semi-supervised models,approaching the performance levels of 10%labeled data.These results demonstrate that the proposed method significantly enhances the accuracy and robustness of quantitative microstructure analysis while reducing annotation costs.展开更多
Traditional electroencephalograph(EEG)-based emotion recognition requires a large number of calibration samples to build a model for a specific subject,which restricts the application of the affective brain computer i...Traditional electroencephalograph(EEG)-based emotion recognition requires a large number of calibration samples to build a model for a specific subject,which restricts the application of the affective brain computer interface(BCI)in practice.We attempt to use the multi-modal data from the past session to realize emotion recognition in the case of a small amount of calibration samples.To solve this problem,we propose a multimodal domain adaptive variational autoencoder(MMDA-VAE)method,which learns shared cross-domain latent representations of the multi-modal data.Our method builds a multi-modal variational autoencoder(MVAE)to project the data of multiple modalities into a common space.Through adversarial learning and cycle-consistency regularization,our method can reduce the distribution difference of each domain on the shared latent representation layer and realize the transfer of knowledge.Extensive experiments are conducted on two public datasets,SEED and SEED-IV,and the results show the superiority of our proposed method.Our work can effectively improve the performance of emotion recognition with a small amount of labelled multi-modal data.展开更多
In this study,the hourly directions of eight banking stocks in Borsa Istanbul were predicted using linear-based,deep-learning(LSTM)and ensemble learning(Light-GBM)models.These models were trained with four different f...In this study,the hourly directions of eight banking stocks in Borsa Istanbul were predicted using linear-based,deep-learning(LSTM)and ensemble learning(Light-GBM)models.These models were trained with four different feature sets and their performances were evaluated in terms of accuracy and F-measure metrics.While the first experiments directly used the own stock features as the model inputs,the second experiments utilized reduced stock features through Variational AutoEncoders(VAE).In the last experiments,in order to grasp the effects of the other banking stocks on individual stock performance,the features belonging to other stocks were also given as inputs to our models.While combining other stock features was done for both own(named as allstock_own)and VAE-reduced(named as allstock_VAE)stock features,the expanded dimensions of the feature sets were reduced by Recursive Feature Elimination.As the highest success rate increased up to 0.685 with allstock_own and LSTM with attention model,the combination of allstock_VAE and LSTM with the attention model obtained an accuracy rate of 0.675.Although the classification results achieved with both feature types was close,allstock_VAE achieved these results using nearly 16.67%less features compared to allstock_own.When all experimental results were examined,it was found out that the models trained with allstock_own and allstock_VAE achieved higher accuracy rates than those using individual stock features.It was also concluded that the results obtained with the VAE-reduced stock features were similar to those obtained by own stock features.展开更多
Generative Models have been shown to be extremely useful in learning features from unlabeled data. In particular, variational autoencoders are capable of modeling highly complex natural distributions such as images, w...Generative Models have been shown to be extremely useful in learning features from unlabeled data. In particular, variational autoencoders are capable of modeling highly complex natural distributions such as images, while extracting natural and human-understandable features without labels. In this paper we combine two highly useful classes of models, variational ladder autoencoders, and MMD variational autoencoders, to model face images. In particular, we show that we can disentangle highly meaningful and interpretable features. Furthermore, we are able to perform arithmetic operations on faces and modify faces to add or remove high level features.展开更多
In modern industry,process monitoring plays a significant role in improving the quality of process conduct.With the higher dimensional of the industrial data,the monitoring methods based on the latent variables have b...In modern industry,process monitoring plays a significant role in improving the quality of process conduct.With the higher dimensional of the industrial data,the monitoring methods based on the latent variables have been widely applied in order to decrease the wasting of the industrial database.Nevertheless,these latent variables do not usually follow the Gaussian distribution and thus perform unsuitable when applying some statistics indices,especially the T^(2) on them.Variational AutoEncoders(VAE),an unsupervised deep learning algorithm using the hierarchy study method,has the ability to make the latent variables follow the Gaussian distribution.The partial least squares(PLS)are used to obtain the information between the dependent variables and independent variables.In this paper,we will integrate these two methods and make a comparison with other methods.The superiority of this proposed method will be verified by the simulation and the Trimethylchlorosilane purification process in terms of the multivariate control charts.展开更多
Learning disentangled representation of data is a key problem in deep learning.Specifically,disentangling 2D facial landmarks into different factors(e.g.,identity and expression)is widely used in the applications of f...Learning disentangled representation of data is a key problem in deep learning.Specifically,disentangling 2D facial landmarks into different factors(e.g.,identity and expression)is widely used in the applications of face reconstruction,face reenactment and talking head et al..However,due to the sparsity of landmarks and the lack of accurate labels for the factors,it is hard to learn the disentangled representation of landmarks.To address these problem,we propose a simple and effective model named FLD-VAE to disentangle arbitrary facial landmarks into identity and expression latent representations,which is based on a Variational Autoencoder framework.Besides,we propose three invariant loss functions in both latent and data levels to constrain the invariance of representations during training stage.Moreover,we implement an identity preservation loss to further enhance the representation ability of identity factor.To the best of our knowledge,this is the first work to end-to-end disentangle identity and expression factors simultaneously from one single facial landmark.展开更多
Label propagation is an essential semi-supervised learning method based on graphs,which has a broad spectrum of applications in pattern recognition and data mining.This paper proposes a quantum semi-supervised classif...Label propagation is an essential semi-supervised learning method based on graphs,which has a broad spectrum of applications in pattern recognition and data mining.This paper proposes a quantum semi-supervised classifier based on label propagation.Considering the difficulty of graph construction,we develop a variational quantum label propagation(VQLP)method.In this method,a locally parameterized quantum circuit is created to reduce the parameters required in the optimization.Furthermore,we design a quantum semi-supervised binary classifier based on hybrid Bell and Z bases measurement,which has a shallower circuit depth and is more suitable for implementation on near-term quantum devices.We demonstrate the performance of the quantum semi-supervised classifier on the Iris data set,and the simulation results show that the quantum semi-supervised classifier has higher classification accuracy than the swap test classifier.This work opens a new path to quantum machine learning based on graphs.展开更多
In aerospace industry,gears are the most common parts of a mechanical transmission system.Gear pitting faults could cause the transmission system to crash and give rise to safety disaster.It is always a challenging pr...In aerospace industry,gears are the most common parts of a mechanical transmission system.Gear pitting faults could cause the transmission system to crash and give rise to safety disaster.It is always a challenging problem to diagnose the gear pitting condition directly through the raw signal of vibration.In this paper,a novel method named augmented deep sparse autoencoder(ADSAE)is proposed.The method can be used to diagnose the gear pitting fault with relatively few raw vibration signal data.This method is mainly based on the theory of pitting fault diagnosis and creatively combines with both data augmentation ideology and the deep sparse autoencoder algorithm for the fault diagnosis of gear wear.The effectiveness of the proposed method is validated by experiments of six types of gear pitting conditions.The results show that the ADSAE method can effectively increase the network generalization ability and robustness with very high accuracy.This method can effectively diagnose different gear pitting conditions and show the obvious trend according to the severity of gear wear faults.The results obtained by the ADSAE method proposed in this paper are compared with those obtained by other common deep learning methods.This paper provides an important insight into the field of gear fault diagnosis based on deep learning and has a potential practical application value.展开更多
As a major component of speech signal processing, speech emotion recognition has become increasingly essential to understanding human communication. Benefitting from deep learning, many researchers have proposed vario...As a major component of speech signal processing, speech emotion recognition has become increasingly essential to understanding human communication. Benefitting from deep learning, many researchers have proposed various unsupervised models to extract effective emotional features and supervised models to train emotion recognition systems. In this paper, we utilize semi-supervised ladder networks for speech emotion recognition. The model is trained by minimizing the supervised loss and auxiliary unsupervised cost function. The addition of the unsupervised auxiliary task provides powerful discriminative representations of the input features, and is also regarded as the regularization of the emotional supervised task. We also compare the ladder network with other classical autoencoder structures. The experiments were conducted on the interactive emotional dyadic motion capture (IEMOCAP) database, and the results reveal that the proposed methods achieve superior performance with a small number of labelled data and achieves better performance than other methods.展开更多
Contemporary attackers,mainly motivated by financial gain,consistently devise sophisticated penetration techniques to access important information or data.The growing use of Internet of Things(IoT)technology in the co...Contemporary attackers,mainly motivated by financial gain,consistently devise sophisticated penetration techniques to access important information or data.The growing use of Internet of Things(IoT)technology in the contemporary convergence environment to connect to corporate networks and cloud-based applications only worsens this situation,as it facilitates multiple new attack vectors to emerge effortlessly.As such,existing intrusion detection systems suffer from performance degradation mainly because of insufficient considerations and poorly modeled detection systems.To address this problem,we designed a blended threat detection approach,considering the possible impact and dimensionality of new attack surfaces due to the aforementioned convergence.We collectively refer to the convergence of different technology sectors as the internet of blended environment.The proposed approach encompasses an ensemble of heterogeneous probabilistic autoencoders that leverage the corresponding advantages of a convolutional variational autoencoder and long short-term memory variational autoencoder.An extensive experimental analysis conducted on the TON_IoT dataset demonstrated 96.02%detection accuracy.Furthermore,performance of the proposed approach was compared with various single model(autoencoder)-based network intrusion detection approaches:autoencoder,variational autoencoder,convolutional variational autoencoder,and long short-term memory variational autoencoder.The proposed model outperformed all compared models,demonstrating F1-score improvements of 4.99%,2.25%,1.92%,and 3.69%,respectively.展开更多
针对人体姿态估计中遮挡带来的缺乏图像低级特征指导和预测姿势与人体生理结构的不一致性问题,提出了一种新颖的生成式人体姿态估计方法(generative human pose estimation,GenPose)。该模型使用多尺度信息融合和条件生成模块解决了严...针对人体姿态估计中遮挡带来的缺乏图像低级特征指导和预测姿势与人体生理结构的不一致性问题,提出了一种新颖的生成式人体姿态估计方法(generative human pose estimation,GenPose)。该模型使用多尺度信息融合和条件生成模块解决了严重遮挡问题。多尺度模块从尺度和通道上细粒度融合图像特征,能捕捉到更多肢体细节,从而推理出遮挡关键点的特征信息。条件生成模块通过建模遮挡场景与姿态间的对应关系,根据标记编码器特征动态调整生成姿态,在保证可见点准确率的同时,在一定程度上减少了遮挡对非遮挡的干扰,提升了对遮挡姿态的生成效果。在公开的COCO和MPII数据集上,同以往方法相比,有了更好的结果,同时在CrowdPose、OCHuman以及SyncOCC数据集上验证了泛化能力。该模型在一定程度上能够解决严重遮挡下的姿态估计问题,提高了预测姿态的合理性,取得了更加优异的效果。展开更多
故障根因分析旨在找到导致特定问题、故障或事件发生的原因,是多个领域中追踪溯源的重要支撑技术,但现有方法在效率、准确性和稳定性等方面仍不能满足故障根因分析任务的实际需求。对此,将贝叶斯网作为相关属性之间依赖关系表示和推理...故障根因分析旨在找到导致特定问题、故障或事件发生的原因,是多个领域中追踪溯源的重要支撑技术,但现有方法在效率、准确性和稳定性等方面仍不能满足故障根因分析任务的实际需求。对此,将贝叶斯网作为相关属性之间依赖关系表示和推理的知识框架,提出基于贝叶斯网的故障根因分析方法。首先,针对高维数据和稀疏样本带来的挑战,提出基于向量量化自编码器的高维属性约简算法,并给出α-BIC评分准则,高效地学习根因贝叶斯网(Root Cause Bayesian Network,RCBN)。随后,基于贝叶斯网嵌入技术实现RCBN的高效推理,高效计算各原因条件下故障产生的可能性,进而使用因果模型中的Blame机制度量各原因对给定故障的贡献度,从而实现故障根因分析。在3个公共数据集和3个合成数据集上的实验结果表明,所提方法的平均检测准确性和效率明显优于对比方法,在CHILD数据集上精度提升了7%,运行时间快了60%。展开更多
基金co-supported by the Natural Science Basic Research Program of Shaanxi,China(No.2023-JC-QN-0043)the ND Basic Research Funds,China(No.G2022WD).
文摘The aerial deployment method enables Unmanned Aerial Vehicles(UAVs)to be directly positioned at the required altitude for their mission.This method typically employs folding technology to improve loading efficiency,with applications such as the gravity-only aerial deployment of high-aspect-ratio solar-powered UAVs,and aerial takeoff of fixed-wing drones in Mars research.However,the significant morphological changes during deployment are accompanied by strong nonlinear dynamic aerodynamic forces,which result in multiple degrees of freedom and an unstable character.This hinders the description and analysis of unknown dynamic behaviors,further leading to difficulties in the design of deployment strategies and flight control.To address this issue,this paper proposes an analysis method for dynamic behaviors during aerial deployment based on the Variational Autoencoder(VAE).Focusing on the gravity-only deployment problem of highaspect-ratio foldable-wing UAVs,the method encodes the multi-degree-of-freedom unstable motion signals into a low-dimensional feature space through a data-driven approach.By clustering in the feature space,this paper identifies and studies several dynamic behaviors during aerial deployment.The research presented in this paper offers a new method and perspective for feature extraction and analysis of complex and difficult-to-describe extreme flight dynamics,guiding the research on aerial deployment drones design and control strategies.
基金supported by the National Natural Science Foundation of China(Nos.42530801,42425208)the Natural Science Foundation of Hubei Province(China)(No.2023AFA001)+1 种基金the MOST Special Fund from State Key Laboratory of Geological Processes and Mineral Resources,China University of Geosciences(No.MSFGPMR2025-401)the China Scholarship Council(No.202306410181)。
文摘Geochemical survey data are essential across Earth Science disciplines but are often affected by noise,which can obscure important geological signals and compromise subsequent prediction and interpretation.Quantifying prediction uncertainty is hence crucial for robust geoscientific decision-making.This study proposes a novel deep learning framework,the Spatially Constrained Variational Autoencoder(SC-VAE),for denoising geochemical survey data with integrated uncertainty quantification.The SC-VAE incorporates spatial regularization,which enforces spatial coherence by modeling inter-sample relationships directly within the latent space.The performance of the SC-VAE was systematically evaluated against a standard Variational Autoencoder(VAE)using geochemical data from the gold polymetallic district in the northwestern part of Sichuan Province,China.Both models were optimized using Bayesian optimization,with objective functions specifically designed to maintain essential geostatistical characteristics.Evaluation metrics include variogram analysis,quantitative measures of spatial interpolation accuracy,visual assessment of denoised maps,and statistical analysis of data distributions,as well as decomposition of uncertainties.Results show that the SC-VAE achieves superior noise suppression and better preservation of spatial structure compared to the standard VAE,as demonstrated by a significant reduction in the variogram nugget effect and an increased partial sill.The SC-VAE produces denoised maps with clearer anomaly delineation and more regularized data distributions,effectively mitigating outliers and reducing kurtosis.Additionally,it delivers improved interpolation accuracy and spatially explicit uncertainty estimates,facilitating more reliable and interpretable assessments of prediction confidence.The SC-VAE framework thus provides a robust,geostatistically informed solution for enhancing the quality and interpretability of geochemical data,with broad applicability in mineral exploration,environmental geochemistry,and other Earth Science domains.
文摘Future 6G communications will open up opportunities for innovative applications,including Cyber-Physical Systems,edge computing,supporting Industry 5.0,and digital agriculture.While automation is creating efficiencies,it can also create new cyber threats,such as vulnerabilities in trust and malicious node injection.Denialof-Service(DoS)attacks can stop many forms of operations by overwhelming networks and systems with data noise.Current anomaly detection methods require extensive software changes and only detect static threats.Data collection is important for being accurate,but it is often a slow,tedious,and sometimes inefficient process.This paper proposes a new wavelet transformassisted Bayesian deep learning based probabilistic(WT-BDLP)approach tomitigate malicious data injection attacks in 6G edge networks.The proposed approach combines outlier detection based on a Bayesian learning conditional variational autoencoder(Bay-LCVariAE)and traffic pattern analysis based on continuous wavelet transform(CWT).The Bay-LCVariAE framework allows for probabilistic modelling of generative features to facilitate capturing how features of interest change over time,spatially,and for recognition of anomalies.Similarly,CWT allows emphasizing the multi-resolution spectral analysis and permits temporally relevant frequency pattern recognition.Experimental testing showed that the flexibility of the Bayesian probabilistic framework offers a vast improvement in anomaly detection accuracy over existing methods,with a maximum accuracy of 98.21%recognizing anomalies.
基金funded by the National Natural Science Foundation of China (52061020).
文摘Quantitative analysis of aluminum-silicon(Al-Si)alloy microstructure is crucial for evaluating and controlling alloy performance.Conventional analysis methods rely on manual segmentation,which is inefficient and subjective,while fully supervised deep learning approaches require extensive and expensive pixel-level annotated data.Furthermore,existing semi-supervised methods still face challenges in handling the adhesion of adjacent primary silicon particles and effectively utilizing consistency in unlabeled data.To address these issues,this paper proposes a novel semi-supervised framework for Al-Si alloy microstructure image segmentation.First,we introduce a Rotational Uncertainty Correction Strategy(RUCS).This strategy employs multi-angle rotational perturbations andMonte Carlo sampling to assess prediction consistency,generating a pixel-wise confidence weight map.By integrating this map into the loss function,the model dynamically focuses on high-confidence regions,thereby improving generalization ability while reducing manual annotation pressure.Second,we design a Boundary EnhancementModule(BEM)to strengthen boundary feature extraction through erosion difference and multi-scale dilated convolutions.This module guides the model to focus on the boundary regions of adjacent particles,effectively resolving particle adhesion and improving segmentation accuracy.Systematic experiments were conducted on the Aluminum-Silicon Alloy Microstructure Dataset(ASAD).Results indicate that the proposed method performs exceptionally well with scarce labeled data.Specifically,using only 5%labeled data,our method improves the Jaccard index and Adjusted Rand Index(ARI)by 2.84 and 1.57 percentage points,respectively,and reduces the Variation of Information(VI)by 8.65 compared to stateof-the-art semi-supervised models,approaching the performance levels of 10%labeled data.These results demonstrate that the proposed method significantly enhances the accuracy and robustness of quantitative microstructure analysis while reducing annotation costs.
基金National Natural Science Foundation of China(61976209,62020106015,U21A20388)in part by the CAS International Collaboration Key Project(173211KYSB20190024)in part by the Strategic Priority Research Program of CAS(XDB32040000)。
文摘Traditional electroencephalograph(EEG)-based emotion recognition requires a large number of calibration samples to build a model for a specific subject,which restricts the application of the affective brain computer interface(BCI)in practice.We attempt to use the multi-modal data from the past session to realize emotion recognition in the case of a small amount of calibration samples.To solve this problem,we propose a multimodal domain adaptive variational autoencoder(MMDA-VAE)method,which learns shared cross-domain latent representations of the multi-modal data.Our method builds a multi-modal variational autoencoder(MVAE)to project the data of multiple modalities into a common space.Through adversarial learning and cycle-consistency regularization,our method can reduce the distribution difference of each domain on the shared latent representation layer and realize the transfer of knowledge.Extensive experiments are conducted on two public datasets,SEED and SEED-IV,and the results show the superiority of our proposed method.Our work can effectively improve the performance of emotion recognition with a small amount of labelled multi-modal data.
文摘In this study,the hourly directions of eight banking stocks in Borsa Istanbul were predicted using linear-based,deep-learning(LSTM)and ensemble learning(Light-GBM)models.These models were trained with four different feature sets and their performances were evaluated in terms of accuracy and F-measure metrics.While the first experiments directly used the own stock features as the model inputs,the second experiments utilized reduced stock features through Variational AutoEncoders(VAE).In the last experiments,in order to grasp the effects of the other banking stocks on individual stock performance,the features belonging to other stocks were also given as inputs to our models.While combining other stock features was done for both own(named as allstock_own)and VAE-reduced(named as allstock_VAE)stock features,the expanded dimensions of the feature sets were reduced by Recursive Feature Elimination.As the highest success rate increased up to 0.685 with allstock_own and LSTM with attention model,the combination of allstock_VAE and LSTM with the attention model obtained an accuracy rate of 0.675.Although the classification results achieved with both feature types was close,allstock_VAE achieved these results using nearly 16.67%less features compared to allstock_own.When all experimental results were examined,it was found out that the models trained with allstock_own and allstock_VAE achieved higher accuracy rates than those using individual stock features.It was also concluded that the results obtained with the VAE-reduced stock features were similar to those obtained by own stock features.
文摘Generative Models have been shown to be extremely useful in learning features from unlabeled data. In particular, variational autoencoders are capable of modeling highly complex natural distributions such as images, while extracting natural and human-understandable features without labels. In this paper we combine two highly useful classes of models, variational ladder autoencoders, and MMD variational autoencoders, to model face images. In particular, we show that we can disentangle highly meaningful and interpretable features. Furthermore, we are able to perform arithmetic operations on faces and modify faces to add or remove high level features.
文摘In modern industry,process monitoring plays a significant role in improving the quality of process conduct.With the higher dimensional of the industrial data,the monitoring methods based on the latent variables have been widely applied in order to decrease the wasting of the industrial database.Nevertheless,these latent variables do not usually follow the Gaussian distribution and thus perform unsuitable when applying some statistics indices,especially the T^(2) on them.Variational AutoEncoders(VAE),an unsupervised deep learning algorithm using the hierarchy study method,has the ability to make the latent variables follow the Gaussian distribution.The partial least squares(PLS)are used to obtain the information between the dependent variables and independent variables.In this paper,we will integrate these two methods and make a comparison with other methods.The superiority of this proposed method will be verified by the simulation and the Trimethylchlorosilane purification process in terms of the multivariate control charts.
基金Supported by the National Natural Science Foundation of China(61210007).
文摘Learning disentangled representation of data is a key problem in deep learning.Specifically,disentangling 2D facial landmarks into different factors(e.g.,identity and expression)is widely used in the applications of face reconstruction,face reenactment and talking head et al..However,due to the sparsity of landmarks and the lack of accurate labels for the factors,it is hard to learn the disentangled representation of landmarks.To address these problem,we propose a simple and effective model named FLD-VAE to disentangle arbitrary facial landmarks into identity and expression latent representations,which is based on a Variational Autoencoder framework.Besides,we propose three invariant loss functions in both latent and data levels to constrain the invariance of representations during training stage.Moreover,we implement an identity preservation loss to further enhance the representation ability of identity factor.To the best of our knowledge,this is the first work to end-to-end disentangle identity and expression factors simultaneously from one single facial landmark.
基金Project supported by the Open Fund of Advanced Cryptography and System Security Key Laboratory of Sichuan Province(Grant No.SKLACSS-202108)the National Natural Science Foundation of China(Grant No.U162271070)Scientific Research Fund of Zaozhuang University(Grant No.102061901).
文摘Label propagation is an essential semi-supervised learning method based on graphs,which has a broad spectrum of applications in pattern recognition and data mining.This paper proposes a quantum semi-supervised classifier based on label propagation.Considering the difficulty of graph construction,we develop a variational quantum label propagation(VQLP)method.In this method,a locally parameterized quantum circuit is created to reduce the parameters required in the optimization.Furthermore,we design a quantum semi-supervised binary classifier based on hybrid Bell and Z bases measurement,which has a shallower circuit depth and is more suitable for implementation on near-term quantum devices.We demonstrate the performance of the quantum semi-supervised classifier on the Iris data set,and the simulation results show that the quantum semi-supervised classifier has higher classification accuracy than the swap test classifier.This work opens a new path to quantum machine learning based on graphs.
基金supported by the Natural Science Foundation of China(No.51675089).
文摘In aerospace industry,gears are the most common parts of a mechanical transmission system.Gear pitting faults could cause the transmission system to crash and give rise to safety disaster.It is always a challenging problem to diagnose the gear pitting condition directly through the raw signal of vibration.In this paper,a novel method named augmented deep sparse autoencoder(ADSAE)is proposed.The method can be used to diagnose the gear pitting fault with relatively few raw vibration signal data.This method is mainly based on the theory of pitting fault diagnosis and creatively combines with both data augmentation ideology and the deep sparse autoencoder algorithm for the fault diagnosis of gear wear.The effectiveness of the proposed method is validated by experiments of six types of gear pitting conditions.The results show that the ADSAE method can effectively increase the network generalization ability and robustness with very high accuracy.This method can effectively diagnose different gear pitting conditions and show the obvious trend according to the severity of gear wear faults.The results obtained by the ADSAE method proposed in this paper are compared with those obtained by other common deep learning methods.This paper provides an important insight into the field of gear fault diagnosis based on deep learning and has a potential practical application value.
基金supported by National Natural Science Foundation of China(Nos.61425017 and 61773379)the National Key Research&Development Plan of China(No.2017YFB1002804)
文摘As a major component of speech signal processing, speech emotion recognition has become increasingly essential to understanding human communication. Benefitting from deep learning, many researchers have proposed various unsupervised models to extract effective emotional features and supervised models to train emotion recognition systems. In this paper, we utilize semi-supervised ladder networks for speech emotion recognition. The model is trained by minimizing the supervised loss and auxiliary unsupervised cost function. The addition of the unsupervised auxiliary task provides powerful discriminative representations of the input features, and is also regarded as the regularization of the emotional supervised task. We also compare the ladder network with other classical autoencoder structures. The experiments were conducted on the interactive emotional dyadic motion capture (IEMOCAP) database, and the results reveal that the proposed methods achieve superior performance with a small number of labelled data and achieves better performance than other methods.
基金This work was supported by the National Research Foundation of Korea(NRF)grant funded by the Korean government(MSIT)(No.2021R1A2C2011391)was supported by the Institute of Information&communications Technology Planning&Evaluation(IITP)grant funded by the Korea government(MSIT)(No.2021-0-01806Development of security by design and security management technology in smart factory).
文摘Contemporary attackers,mainly motivated by financial gain,consistently devise sophisticated penetration techniques to access important information or data.The growing use of Internet of Things(IoT)technology in the contemporary convergence environment to connect to corporate networks and cloud-based applications only worsens this situation,as it facilitates multiple new attack vectors to emerge effortlessly.As such,existing intrusion detection systems suffer from performance degradation mainly because of insufficient considerations and poorly modeled detection systems.To address this problem,we designed a blended threat detection approach,considering the possible impact and dimensionality of new attack surfaces due to the aforementioned convergence.We collectively refer to the convergence of different technology sectors as the internet of blended environment.The proposed approach encompasses an ensemble of heterogeneous probabilistic autoencoders that leverage the corresponding advantages of a convolutional variational autoencoder and long short-term memory variational autoencoder.An extensive experimental analysis conducted on the TON_IoT dataset demonstrated 96.02%detection accuracy.Furthermore,performance of the proposed approach was compared with various single model(autoencoder)-based network intrusion detection approaches:autoencoder,variational autoencoder,convolutional variational autoencoder,and long short-term memory variational autoencoder.The proposed model outperformed all compared models,demonstrating F1-score improvements of 4.99%,2.25%,1.92%,and 3.69%,respectively.
文摘针对人体姿态估计中遮挡带来的缺乏图像低级特征指导和预测姿势与人体生理结构的不一致性问题,提出了一种新颖的生成式人体姿态估计方法(generative human pose estimation,GenPose)。该模型使用多尺度信息融合和条件生成模块解决了严重遮挡问题。多尺度模块从尺度和通道上细粒度融合图像特征,能捕捉到更多肢体细节,从而推理出遮挡关键点的特征信息。条件生成模块通过建模遮挡场景与姿态间的对应关系,根据标记编码器特征动态调整生成姿态,在保证可见点准确率的同时,在一定程度上减少了遮挡对非遮挡的干扰,提升了对遮挡姿态的生成效果。在公开的COCO和MPII数据集上,同以往方法相比,有了更好的结果,同时在CrowdPose、OCHuman以及SyncOCC数据集上验证了泛化能力。该模型在一定程度上能够解决严重遮挡下的姿态估计问题,提高了预测姿态的合理性,取得了更加优异的效果。
文摘故障根因分析旨在找到导致特定问题、故障或事件发生的原因,是多个领域中追踪溯源的重要支撑技术,但现有方法在效率、准确性和稳定性等方面仍不能满足故障根因分析任务的实际需求。对此,将贝叶斯网作为相关属性之间依赖关系表示和推理的知识框架,提出基于贝叶斯网的故障根因分析方法。首先,针对高维数据和稀疏样本带来的挑战,提出基于向量量化自编码器的高维属性约简算法,并给出α-BIC评分准则,高效地学习根因贝叶斯网(Root Cause Bayesian Network,RCBN)。随后,基于贝叶斯网嵌入技术实现RCBN的高效推理,高效计算各原因条件下故障产生的可能性,进而使用因果模型中的Blame机制度量各原因对给定故障的贡献度,从而实现故障根因分析。在3个公共数据集和3个合成数据集上的实验结果表明,所提方法的平均检测准确性和效率明显优于对比方法,在CHILD数据集上精度提升了7%,运行时间快了60%。