Accurately predicting the Remaining Useful Life (RUL) of lithium-ion batteries is crucial for battery management systems. Deep learning-based methods have been shown to be effective in predicting RUL by leveraging battery capacity time series data. However, the representation learning of features such as long-distance sequence dependencies and mutations in capacity time series still needs to be improved. To address this challenge, this paper proposes a novel deep learning model, the MLP-Mixer and Mixture of Experts (MMMe) model, for RUL prediction. The MMMe model leverages a Gated Recurrent Unit and a Multi-Head Attention mechanism to encode the sequential battery capacity data and capture its temporal features, and a re-zero MLP-Mixer to capture high-level features. Additionally, we devise an ensemble predictor based on a Mixture-of-Experts (MoE) architecture to generate reliable RUL predictions. Experimental results on public datasets demonstrate that our proposed model significantly outperforms existing methods, providing more reliable and precise RUL predictions while also accurately tracking the capacity degradation process. Our code and dataset are publicly available on GitHub.
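The MoE-based ensemble predictor described above can be sketched as follows. This is a minimal illustration only: the abstract does not specify the expert or gating architectures, so linear experts and a linear gating network are assumed here for clarity.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def moe_predict(features, experts, gate_weights):
    """Combine several expert RUL estimates via a softmax gating network.

    features:     shared feature vector produced by the encoder (assumed 1-D).
    experts:      list of (weights, bias) pairs; each expert is assumed linear.
    gate_weights: matrix scoring each expert from the same features.
    """
    # Each expert maps the shared features to a scalar RUL estimate.
    expert_preds = np.array([float(w @ features + b) for w, b in experts])
    # The gate scores the experts and softmax-normalizes the scores.
    gate = softmax(gate_weights @ features)
    # The final prediction is the gate-weighted (convex) combination.
    return float(gate @ expert_preds)
```

Because the gate weights are non-negative and sum to one, the ensemble output always lies within the range spanned by the individual expert predictions, which is one reason MoE ensembles tend to produce stable estimates.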
Visual-inertial odometry (VIO) estimates pose by fusing visual and inertial data. In complex environments, inertial data are corrupted by noise and long-duration motion accumulates error; moreover, most VIO methods ignore local information interaction between modalities and fail to fully exploit their complementarity, degrading pose estimation accuracy. To address these problems, this paper proposes an attention and local interaction-based visual-inertial odometry (ALVIO) model. First, the model extracts visual features and inertial features separately. Second, it preserves the historical temporal information of the inertial features and applies a channel attention mechanism based on the discrete cosine transform (DCT) to enhance informative low-frequency features while suppressing high-frequency noise. Next, a multimodal local-interaction and global-fusion module is designed: an improved split attention mechanism and an MLP-Mixer progressively realize local interaction and global fusion between modalities, adjusting local feature weights according to each modality's contribution to achieve cross-modal complementarity, and then integrating the features at the global level into a unified representation. Finally, the fused features undergo temporal modeling and pose regression to obtain the relative pose. To verify the model's effectiveness in complex environments, experiments were conducted on degraded versions of the public KITTI and EuRoC datasets. Compared with a direct feature-concatenation model, a multi-head attention fusion model, and a soft-mask fusion model, ALVIO reduces translation error by 49.92%, 32.82%, and 37.74%, and rotation error by 51.34%, 25.96%, and 29.54%, respectively, while achieving higher efficiency and robustness.
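The DCT-based channel attention idea, enhancing low-frequency content and damping high-frequency noise, can be sketched roughly as below. This is a hedged illustration, not the paper's implementation: the number of retained low-frequency coefficients (`num_low_freq`) and the sigmoid gating on the energy ratio are assumptions made for the sketch.

```python
import numpy as np

def dct2(x):
    """Naive (unnormalized) DCT-II of a 1-D signal."""
    n = np.arange(len(x))
    return np.array([np.sum(x * np.cos(np.pi / len(x) * (n + 0.5) * k))
                     for k in range(len(x))])

def dct_channel_attention(feats, num_low_freq=4):
    """Gate each channel by the share of its energy in low DCT frequencies.

    feats: (channels, time) array of per-channel inertial feature sequences.
    Channels dominated by slowly varying (low-frequency, informative) content
    receive weights near 1; noise-dominated channels are damped.
    """
    coeffs = np.stack([dct2(c) for c in feats])
    low = np.abs(coeffs[:, :num_low_freq]).sum(axis=1)
    total = np.abs(coeffs).sum(axis=1) + 1e-8
    # Sigmoid centered at a 0.5 low-frequency energy ratio.
    weights = 1.0 / (1.0 + np.exp(-(low / total - 0.5)))
    return feats * weights[:, None], weights
```

On a constant channel (all energy at DCT index 0) this sketch yields a higher weight than on a rapidly alternating, noise-like channel, which is the qualitative behavior the module is designed to achieve.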
Nowadays, inspired by the great success of Transformers in Natural Language Processing, many applications of Vision Transformers (ViTs) have been investigated in the field of medical image analysis, including breast ultrasound (BUS) image segmentation and classification. In this paper, we propose an efficient multi-task framework to segment and classify tumors in BUS images using a hybrid convolutional neural network (CNN)-ViT architecture and a Multi-Layer Perceptron (MLP)-Mixer. The proposed method uses a two-encoder architecture with an EfficientNetV2 backbone and an adapted ViT encoder to extract tumor regions in BUS images. The self-attention (SA) mechanism in the Transformer encoder allows capturing a wide range of high-level and complex features, while the EfficientNetV2 encoder preserves local information in the image. To fuse the extracted features, a Channel Attention Fusion (CAF) module is introduced. The CAF module selectively emphasizes important features from both encoders, improving the integration of high-level and local information. The resulting feature maps are reconstructed into segmentation maps using a decoder. Our method then classifies the segmented tumor regions into benign and malignant using a simple and efficient classifier based on MLP-Mixer, which, to the best of our knowledge, is applied for the first time to the task of lesion classification in BUS images. Experimental results demonstrate that our framework outperforms recent works, achieving a Dice coefficient of 83.42% for segmentation and an accuracy of 86% for classification.
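The MLP-Mixer classifier mentioned above is built from blocks that alternate a token-mixing MLP (acting across patches) with a channel-mixing MLP (acting across feature channels). A minimal sketch of one such block follows; the hidden sizes, the GELU activation, and the residual layout are taken from the standard Mixer design, not from this paper's specific configuration.

```python
import numpy as np

def gelu(x):
    """Tanh approximation of the GELU activation."""
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x ** 3)))

def mixer_block(tokens, w_tok, w_chan):
    """One MLP-Mixer block with residual connections.

    tokens: (num_patches, channels) patch embeddings.
    w_tok:  (W1, W2) weights of the token-mixing MLP; W1 @ tokens mixes
            information across patches independently for each channel.
    w_chan: (W1, W2) weights of the channel-mixing MLP, applied per patch.
    """
    # Token mixing across the patch dimension, plus residual.
    y = tokens + w_tok[1] @ gelu(w_tok[0] @ tokens)
    # Channel mixing across the feature dimension, plus residual.
    return y + gelu(y @ w_chan[0]) @ w_chan[1]
```

Stacking a few such blocks over the patch embeddings of a segmented tumor region, followed by global average pooling and a linear head, yields a lightweight benign/malignant classifier of the kind the abstract describes.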
Funding: This work was supported by the National Natural Science Foundation of China (Grant Nos. 62102191, 61872114, and 61871020).