69 articles found
ACSF-ED: Adaptive Cross-Scale Fusion Encoder-Decoder for Spatio-Temporal Action Detection
1
Authors: Wenju Wang, Zehua Gu, Bang Tang, Sen Wang, Jianfei Hao. 《Computers, Materials & Continua》, 2025, Issue 2, pp. 2389-2414 (26 pages)
Current spatio-temporal action detection methods lack sufficient capabilities in extracting and comprehending spatio-temporal information. This paper introduces an end-to-end Adaptive Cross-Scale Fusion Encoder-Decoder (ACSF-ED) network to predict the action and locate the object efficiently. In the Adaptive Cross-Scale Fusion Spatio-Temporal Encoder (ACSF ST-Encoder), the Asymptotic Cross-scale Feature-fusion Module (ACCFM) is designed to address the information degradation caused by the propagation of high-level semantic information, thereby extracting high-quality multi-scale features for subsequent spatio-temporal information modeling. Within the Shared-Head Decoder structure, a shared classification and regression detection head is constructed. A multi-constraint loss function composed of one-to-one, one-to-many, and contrastive denoising losses is designed to address the insufficient constraint on predictions in traditional methods. This loss function improves the accuracy of classification predictions and brings regression position predictions closer to the ground-truth objects. The proposed model is evaluated on the popular UCF101-24 and JHMDB-21 datasets. Experimental results demonstrate that the proposed method achieves 81.52% on the Frame-mAP metric, surpassing existing methods.
Keywords: spatio-temporal action detection; encoder-decoder; cross-scale fusion; multi-constraint loss function
Encoder-Decoder Multi-Step 4D Short-Term Trajectory Prediction Based on Spatio-Temporal Feature Fusion  Cited by: 3
2
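Authors: 石庆研, 张泽中, 韩萍. 《信号处理》, CSCD, PKU Core, 2023, Issue 11, pp. 2037-2048 (12 pages)
Trajectory prediction plays a vital role in ensuring safe and efficient air traffic operations. Predicted trajectory information is the input to decision tools such as trajectory optimization and conflict alerting, and prediction accuracy depends on the model's ability to extract features from trajectory sequences. Trajectory sequence data are multivariate time series with rich spatio-temporal features: each variable exhibits long- and short-term temporal patterns, and the variables also carry mutually dependent spatial information. To fully extract these spatio-temporal features, this paper proposes a Spatio-Temporal Encoder-Decoder (STED) trajectory prediction model based on fused spatio-temporal features. In the encoder, a dual-channel network built from a Gated Recurrent Unit (GRU), a Convolutional Neural Network (CNN), and an attention mechanism (AT) extracts the temporal and spatial features of the trajectory separately. The decoder concatenates and fuses the spatio-temporal features and uses a GRU to learn the fused features and produce recursive output, predicting trajectory information multiple steps ahead. The algorithm is validated on real trajectory data. Experimental results show that the proposed STED network achieves high-precision short-term trajectory prediction within a 10 min prediction horizon and is more accurate than data-driven trajectory prediction models such as LSTM, CNN-LSTM, and AT-LSTM. In addition, the STED network takes 0.002 s on average to predict one trajectory point, giving good real-time performance.
Keywords: 4D trajectory prediction; spatio-temporal features; encoder-decoder; gated recurrent unit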
Effluent Quality Prediction for Urban Wastewater Treatment Plants Based on the Encoder-Decoder Framework  Cited by: 5
3
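Authors: 史红伟, 陈祺, 王云龙, 李鹏程. 《中国农村水利水电》, PKU Core, 2023, Issue 11, pp. 93-99 (7 pages)
Because wastewater treatment plant effluent has many quality indicators, the reactions in the treatment process are complex, and the time series is highly nonlinear, prediction methods based on mechanistic models cannot achieve satisfactory results. To address this, a deep-learning-based method for predicting effluent quality is proposed. Using monitored water quality data from a wastewater treatment plant in Jilin Province, several neural networks combined with an encoder-decoder structure are used to predict water quality. The results show that the proposed structure improves the predictive ability of both LSTM and GRU networks, with a more pronounced improvement for long-term prediction. The ED-GRU model performs best. In short-term prediction, the root mean square errors (RMSE) of the four effluent quality indicators are 0.7551, 0.2197, 0.0734, and 0.3146, and the goodness-of-fit values (R²) are 0.9013, 0.9332, 0.9167, and 0.9532, so local changes in water quality can be predicted. In long-term prediction, the RMSE values of the four indicators are 1.7204, 1.7689, 0.4478, and 0.8316, and the R² values are 0.4849, 0.5507, 0.4502, and 0.7595, so the trend of water quality can be predicted. Compared with the sequential structure, short-term RMSE decreases by more than 10% and R² increases by more than 2%, while long-term RMSE decreases by more than 25% and R² increases by more than 15%. The results indicate that neural networks based on the encoder-decoder structure can accurately predict wastewater treatment plant effluent quality and provide technical support for improving treatment processes.
Keywords: wastewater treatment plant effluent; encoder-decoder; multi-indicator water quality prediction; GRU model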
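The effluent-quality abstract above reports RMSE and R² for each indicator. As a reading aid, the following is a minimal sketch of how these two metrics are commonly computed. The function names and the sample numbers are illustrative assumptions, not data or code from the paper, and the paper may define its goodness of fit differently.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed and predicted values for one indicator.
observed = [10.0, 12.0, 11.0, 13.0]
predicted = [10.5, 11.5, 11.0, 12.0]

print(f"RMSE = {rmse(observed, predicted):.4f}")
print(f"R2   = {r2_score(observed, predicted):.4f}")
```

A lower RMSE and an R² closer to 1 both indicate a better fit, which is how the short-term and long-term figures in the abstract should be read.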
LSTM Runoff Forecasting Model Coupled with Encoder-Decoder  Cited by: 14
4
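Authors: 林康聆, 陈华, 陈清勇, 罗宇轩, 刘峰, 陈杰. 《武汉大学学报(工学版)》, CAS, CSCD, PKU Core, 2022, Issue 8, pp. 755-761 (7 pages)
A long short-term memory neural network (LSTM) is coupled with the Encoder-Decoder structure to form the LSTM-ED model, which is compared with the LSTM artificial-intelligence runoff forecasting model. Application in the Jianxi basin of the Minjiang River shows that, compared with LSTM, LSTM-ED has higher accuracy and stability over the whole validation period and at each lead time, with smaller peak errors for typical floods. Its distinctive semantic vector preserves the continuity of hydrological information, so the forecast runoff process is less susceptible to rainfall fluctuations. The forecasting ability of both models is closely related to the maximum concentration time of the basin: when the lead time is shorter than the maximum concentration time, both models forecast well; when the lead time exceeds it, forecasting ability degrades markedly; when the lead time far exceeds it, both models lose forecasting reliability.
Keywords: runoff forecasting; encoder-decoder structure; long short-term memory neural network; deep learning; artificial neural network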
Diffraction Wave Separation and Imaging Using a Deep Learning Network with the Encoder-Decoder Framework  Cited by: 4
5
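Authors: 马铭, 包乾宗. 《石油地球物理勘探》, EI, CSCD, PKU Core, 2023, Issue 1, pp. 56-64 (9 pages)
Using the pure diffraction wavefield to identify subsurface geological anomalies has a solid theoretical basis; the corresponding methods have been widely studied and effectively applied in practical exploration. However, existing techniques have had little success in imaging small-scale anomalies. Most related studies are based on ray propagation theory, and their analysis of the factors affecting the accuracy of diffraction separation and imaging is incomplete. Compared with reflections, diffractions generated by discontinuous structures are weak in energy and interfere with one another, and environmental noise buries them further. Higher-precision wavefield separation and separate imaging are therefore a key research direction for diffraction-based ultra-high-resolution processing and interpretation. To this end, aiming at the accurate localization of geological anomalies in geophysical exploration, this work first takes diffractions, which carry high-resolution information, as the research object, systematically analyzes their energy and morphological characteristics for anomalies of different scales and physical parameters, and determines the specific forms in which diffractions superpose with other wave types. Then, based on these characteristics, a deep-learning-based diffraction separation and imaging method is proposed: a dilated convolutional network with the Encoder-Decoder framework captures diffraction wavefield features to separate the diffractions, and a migration velocity model for the pure diffraction wavefield is built on the principle of velocity continuity to complete the final imaging. Data tests show that the method can meet the requirements of high-precision identification of small geological anomalies.
Keywords: diffraction separation and imaging; deep neural network; encoder-decoder framework; varimax norm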
Attention-Based Encoder-Decoder Photovoltaic Power Generation Prediction Model  Cited by: 12
6
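Authors: 宋良才, 索贵龙, 胡军涛, 窦艳梅, 崔志永. 《计算机与现代化》, 2020, Issue 9, pp. 112-117 (6 pages)
The weather factors that affect the output of photovoltaic power generation systems are highly volatile and discontinuous, so a suitable prediction model is needed to predict photovoltaic output characteristics accurately and thereby ensure effective operation of the power grid. This paper uses the maximal information coefficient to select suitable historical photovoltaic generation data, uses it as one of the features to reconstruct the input data, and introduces an attention mechanism into an Encoder-Decoder model built from LSTM units, yielding an Encoder-Decoder photovoltaic power generation prediction model combined with the attention mechanism. Case analysis on an actual photovoltaic power plant verifies the accuracy and applicability of the proposed model for photovoltaic power generation prediction.
Keywords: photovoltaic power generation; maximal information coefficient; long short-term memory neural network; encoder-decoder framework; attention mechanism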
Rethinking the Encoder-decoder Structure in Medical Image Segmentation from Releasing Decoder Structure  Cited by: 1
7
Authors: Jiajia Ni, Wei Mu, An Pan, Zhengming Chen. 《Journal of Bionic Engineering》, SCIE, EI, CSCD, 2024, Issue 3, pp. 1511-1521 (11 pages)
Medical image segmentation has witnessed rapid advancements with the emergence of encoder-decoder based methods. In the encoder-decoder structure, the primary goal of the decoding phase is not only to restore feature map resolution, but also to mitigate the loss of feature information incurred during the encoding phase. However, this approach gives rise to a challenge: multiple up-sampling operations in the decoder segment result in the loss of feature information. To address this challenge, we propose a novel network (CBL-Net) that removes the decoding structure to reduce feature information loss. In particular, we introduce a Parallel Pooling Module (PPM) to counteract the feature information loss stemming from conventional and pooling operations during the encoding stage. Furthermore, we incorporate a Multiplexed Dilation Convolution (MDC) module to expand the network's receptive field. Although we have removed the decoding stage, we still need to recover the feature map resolution; we therefore introduce the Global Feature Recovery (GFR) module, which uses an attention mechanism to recover the feature map resolution and effectively reduces the loss of feature information. We conduct extensive experimental evaluations on three publicly available medical image segmentation datasets: DRIVE, CHASEDB, and MoNuSeg. Experimental results show that our proposed network outperforms state-of-the-art methods in medical image segmentation. In addition, by eliminating the decoding component, it achieves higher efficiency than current encoder-decoder networks.
Keywords: medical image segmentation; encoder-decoder architecture; attention mechanisms; releasing decoder architecture; neural network
A Road Extraction Method for Remote Sensing Image Based on Encoder-Decoder Network  Cited by: 30
8
Authors: Hao HE, Shuyang WANG, Shicheng WANG, Dongfang YANG, Xing LIU. 《Journal of Geodesy and Geoinformation Science》, 2020, Issue 2, pp. 16-25 (10 pages)
According to the characteristics of road features, an Encoder-Decoder deep semantic segmentation network is designed for road extraction from remote sensing images. First, because road targets are rich in local detail and simple in semantic features, an Encoder-Decoder network with shallow layers and high resolution is designed to improve the ability to represent detail information. Second, because road areas make up a small proportion of remote sensing images, the cross-entropy loss function is improved to address the imbalance between positive and negative samples during training. Experiments on large road extraction datasets show that the proposed method achieves a recall of 83.9%, a precision of 82.5%, and an F1-score of 82.9%, and can extract road targets in remote sensing images completely and accurately. The Encoder-Decoder network designed in this paper performs well in the road extraction task and requires less manual intervention, so it has good application prospects.
Keywords: remote sensing; road extraction; deep learning; semantic segmentation; encoder-decoder network
Gas Concentration Prediction Based on the Encoder-Decoder-ILSTM Model  Cited by: 1
9
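Authors: 陈小建. 《能源与节能》, 2023, Issue 12, pp. 102-105, 176 (5 pages)
In recent years, neural networks have played a major role in many fields, including the prediction of gas concentration in coal mines. To improve the prediction accuracy and real-time performance of the model, a new neural network is proposed that combines the Encoder-Decoder structure, long short-term memory, and the snake optimization algorithm, providing technical support for safe coal mine production.
Keywords: neural network; encoder-decoder; snake optimization algorithm; gas concentration prediction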
Runoff Forecasting Model Coupling Encoder-Decoder with RFR  Cited by: 1
10
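Authors: 张健. 《水利科学与寒区工程》, 2024, Issue 7, pp. 80-82 (3 pages)
To address the low reliability of traditional runoff forecasting models, a runoff forecasting model coupling Encoder-Decoder with RFR is proposed: a deep learning module with the Encoder-Decoder architecture encodes and decodes runoff and meteorological data to extract new semantic features, which are then used as input variables for random forest regression (RFR) fitting. An empirical runoff forecast for Fuyang City shows that the Encoder-Decoder and RFR model achieves R² = 0.75, with MAE and RMSE of 3.75 × 10⁸ m³ and 4.26 × 10⁸ m³, respectively; compared with the RFR model, R² improves by 12.67%, while MAE and RMSE decrease by 17.40% and 16.63%, respectively.
Keywords: encoder-decoder architecture; RFR model; runoff forecasting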
Encoder-Decoder Based LSTM Model to Advance User QoE in 360-Degree Video
11
Authors: Muhammad Usman Younus, Rabia Shafi, Ammar Rafiq, Muhammad Rizwan Anjum, Sharjeel Afridi, Abdul Aleem Jamali, Zulfiqar Ali Arain. 《Computers, Materials & Continua》, SCIE, EI, 2022, Issue 5, pp. 2617-2631 (15 pages)
The development of multimedia content has resulted in a massive increase in network traffic for video streaming, which demands solutions that secure the user's Quality-of-Experience (QoE). 360-degree videos have already taken user behavior by storm. However, users focus on only a part of a 360-degree video, known as the viewport. Despite the immense hype, 360-degree videos carry an unpleasant side effect related to viewport prediction, which makes viewers feel uncomfortable because the user viewport needs to be pre-fetched in advance. Ideally, bandwidth consumption can be minimized if the user's motion is known in advance. To address this problem, we propose an Encoder-Decoder based Long Short-Term Memory (LSTM) model to capture more accurately the non-linear relationship between past and future viewport positions. The model takes transformed data rather than the direct input to predict future user movement. This prediction model is then combined with a rate adaptation approach that assigns bitrates to the tiles of 360-degree video frames under a given network capacity. Our proposed work thus aims to improve system performance when QoE parameters are jointly optimized. Experiments were carried out and compared with existing work to demonstrate the performance of the proposed model. Finally, the experimental implementation of our proposed work provides higher user QoE than its competitors.
Keywords: encoder-decoder based LSTM; 360-degree video streaming; LSTM; QoE; viewport prediction
Classification of Arrhythmia Based on Convolutional Neural Networks and Encoder-Decoder Model
12
Authors: Jian Liu, Xiaodong Xia, Chunyang Han, Jiao Hui, Jim Feng. 《Computers, Materials & Continua》, SCIE, EI, 2022, Issue 10, pp. 265-278 (14 pages)
As a common and high-risk type of disease, heart disease seriously threatens people's health. At the same time, in the era of the Internet of Things (IoT), smart medical devices have strong practical significance for medical workers and patients because of their ability to assist in the diagnosis of diseases. Research on real-time diagnosis and classification algorithms for arrhythmia can therefore help to improve diagnostic efficiency. In this paper, we design an automatic arrhythmia classification model based on a Convolutional Neural Network (CNN) and an Encoder-Decoder model. The model uses Long Short-Term Memory (LSTM) to account for the influence of time series features on classification results, and it is trained and tested on the MIT-BIH arrhythmia database. In addition, Generative Adversarial Networks (GAN) are adopted as a data equalization method to address the data imbalance problem. The simulation results show that, for inter-patient arrhythmia classification, the hybrid model combining CNN and the Encoder-Decoder model has the best classification accuracy, reaching 94.05%. It performs especially well on supraventricular ectopic beats (class S) and fusion beats (class F).
Keywords: electroencephalography; convolutional neural network; long short-term memory; encoder-decoder model; generative adversarial network
Underwater Acoustic Signal Noise Reduction Based on a Fully Convolutional Encoder-Decoder Neural Network
13
Authors: SONG Yongqiang, CHU Qian, LIU Feng, WANG Tao, SHEN Tongsheng. 《Journal of Ocean University of China》, SCIE, CAS, CSCD, 2023, Issue 6, pp. 1487-1496 (10 pages)
Noise reduction analysis of signals is essential for modern underwater acoustic detection systems. Traditional noise reduction techniques gradually lose efficacy because the target signal is masked by biological and natural noise in the marine environment. A feature extraction method combining time-frequency spectrograms and deep learning can effectively separate noise from target signals. A fully convolutional encoder-decoder neural network (FCEDN) is proposed to address noise reduction in underwater acoustic signals. During denoising, the time-domain waveform of the underwater acoustic signal is converted into a wavelet low-frequency analysis recording spectrogram to preserve as many signal characteristics as possible. The FCEDN is built to learn the spectrogram mapping between noise and target signals that can be learned at each time level. Transposed convolutions are introduced to transform the spectrogram features of the signals into listenable audio files. Evaluated on the ShipsEar dataset, the proposed method increases SNR and SI-SNR by 10.02 dB and 9.5 dB, respectively.
Keywords: deep learning; convolutional encoder-decoder neural network; wavelet low-frequency analysis recording spectrogram
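The underwater-acoustic abstract above reports gains in SI-SNR (scale-invariant signal-to-noise ratio). The following is a minimal sketch of the commonly used SI-SNR definition, assuming zero-mean normalization of both signals. The function name and the sample values are illustrative assumptions, and the paper's exact evaluation procedure may differ.

```python
import math

def si_snr(estimate, target):
    """Scale-invariant SNR in dB between an estimated and a reference signal.

    Assumes equal-length sequences and an imperfect estimate
    (a perfect estimate gives zero residual energy and would divide by zero).
    """
    n = len(target)
    mean_e = sum(estimate) / n
    mean_t = sum(target) / n
    e = [x - mean_e for x in estimate]  # zero-mean estimate
    t = [x - mean_t for x in target]    # zero-mean reference

    # Project the estimate onto the reference to obtain the target component.
    scale = sum(a * b for a, b in zip(e, t)) / sum(b * b for b in t)
    s_target = [scale * b for b in t]
    residual = [a - s for a, s in zip(e, s_target)]

    target_energy = sum(s * s for s in s_target)
    residual_energy = sum(r * r for r in residual)
    return 10.0 * math.log10(target_energy / residual_energy)

# Hypothetical clean signal and a slightly distorted estimate of it.
clean = [1.0, -1.0, 1.0, -1.0]
denoised = [1.1, -0.9, 0.9, -1.1]
rescaled = [3.0 * x for x in denoised]

print(f"SI-SNR = {si_snr(denoised, clean):.2f} dB")
print(f"SI-SNR after rescaling = {si_snr(rescaled, clean):.2f} dB")
```

Rescaling the estimate leaves the value unchanged, which is the property that distinguishes SI-SNR from plain SNR.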
Action-Aware Encoder-Decoder Network for Pedestrian Trajectory Prediction
14
Authors: 傅家威, 赵旭. 《Journal of Shanghai Jiaotong University (Science)》, EI, 2023, Issue 1, pp. 20-27 (8 pages)
Accurate pedestrian trajectory predictions are critical in self-driving systems, as they are fundamental to the response and decision-making of ego vehicles. In this study, we focus on the problem of predicting the future trajectory of pedestrians from a first-person perspective. Most existing trajectory prediction methods for the first-person view copy those for the bird's-eye view, neglecting the differences between the two. To this end, we clarify the differences between the two views and highlight the importance of action-aware trajectory prediction in the first-person view. We propose a new action-aware network based on an encoder-decoder framework with an action prediction branch and a goal estimation branch at the end of the encoder. In the decoder, bidirectional long short-term memory (Bi-LSTM) blocks are adopted to generate the final prediction of pedestrians' future trajectories. Our method was evaluated on a public dataset and achieved competitive performance compared with other approaches. An ablation study demonstrates the effectiveness of the action prediction branch.
Keywords: pedestrian trajectory prediction; first-person view; action prediction; encoder-decoder; bidirectional long short-term memory (Bi-LSTM)
Robust Cultivated Land Extraction Using Encoder-Decoder
15
Authors: Aziguli Wulamu, Jingyue Sang, Dezheng Zhang, Zuxian Shi. 《Journal of New Media》, 2020, Issue 4, pp. 149-155 (7 pages)
Cultivated land extraction is essential for sustainable development and agriculture. In this paper, we propose a semantic segmentation neural network based on the encoder-decoder structure that extracts cultivated land from satellite images and is used for agricultural automation solutions. The encoder consists of two parts: the first is a modified Xception, which serves as the feature extraction network, and the second is atrous convolution, which expands the receptive field and the context information to extract richer features. The decoder uses conventional upsampling to restore the original resolution. In addition, we use a combination of BCE and Lovász hinge as the loss function to optimize the Intersection over Union (IoU). Experimental results show that the proposed network structure can solve the problem of cultivated land extraction in Yinchuan City.
Keywords: semantic segmentation; encoder-decoder; cultivated land extraction; atrous convolution
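The cultivated-land abstract above optimizes Intersection over Union (IoU). The following is a minimal sketch of how IoU is typically computed for a pair of binary segmentation masks. The function name and the tiny example masks are illustrative assumptions, not material from the paper.

```python
def iou(pred_mask, true_mask):
    """Intersection over Union for two binary masks given as nested lists."""
    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                intersection += 1  # pixel is foreground in both masks
            if p or t:
                union += 1         # pixel is foreground in at least one mask
    # Two empty masks agree completely; treat that case as a perfect score.
    return intersection / union if union else 1.0

# Hypothetical 2x3 masks: 1 marks cultivated land, 0 marks background.
predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]

print(f"IoU = {iou(predicted, reference):.2f}")
```

Because IoU counts only foreground pixels, it is less flattered by large background regions than plain pixel accuracy.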
Objective Evaluation Method for the MRED of Visible-Light Imaging Systems  Cited by: 1
16
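Authors: 王萌, 聂亮. 《科技创新与应用》, 2018, Issue 6, pp. 77-78 (2 pages)
The minimum resolvable color difference (MRED) is an important technical indicator for evaluating visible-light imaging systems. Traditional measurement methods rely on the subjective judgment of the operator, so accuracy and repeatability are low. This article proposes an objective evaluation method for the minimum resolvable color difference based on image processing and verifies it experimentally. Digital image processing is used to identify and judge the minimum resolvable color difference in place of subjective judgment by the human eye. The MRED measured by the objective method is 0.6764 color-difference units. The experimental results show that the measurement is accurate and highly repeatable, and agrees with the MRED measured subjectively by the human eye under the same measurement conditions.
Keywords: computer image processing; minimum resolvable color difference; visible-light imaging system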
Global Spatial-Temporal Information Encoder-Decoder Based Action Segmentation in Untrimmed Video  Cited by: 1
17
Authors: Yichao Liu, Yiyang Sun, Zhide Chen, Chen Feng, Kexin Zhu. 《Tsinghua Science and Technology》, 2025, Issue 1, pp. 290-302 (13 pages)
Action segmentation has made significant progress, but segmenting and recognizing actions in untrimmed long videos remains a challenging problem. Most state-of-the-art methods focus on designing models based on temporal convolution. However, the limitations of modeling long-term temporal dependencies and the inflexibility of temporal convolutions restrict the potential of these models. To address the over-segmentation in existing action segmentation methods, which leads to classification errors and reduced segmentation quality, this paper proposes an action segmentation method based on a global spatial-temporal information encoder-decoder. The method uses the global temporal information captured by a refinement layer to help the Encoder-Decoder (ED) structure judge action segmentation points more accurately and, at the same time, to suppress the over-segmentation caused by the ED structure. The method achieves 93% frame accuracy on a constructed real Tai Chi action dataset. The experimental results show that the method can accurately and efficiently complete the long-video action segmentation task.
Keywords: encoder-decoder (ED); Bidirectional Long Short-Term Memory (BiLSTM); Tai Chi action segmentation; untrimmed video
Efficient Time-Series Neural Network Algorithm for High-Power Microwave Inversion
18
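Authors: 董纯志, 黄志祥, 冯乃星. 《现代应用物理》, 2025, Issue 1, pp. 151-157 (7 pages)
The columnar plasma array (CPA) has been shown to be an excellent means of protection against high-power microwaves (HPM). However, not every HPM is sufficient to excite the CPA into producing an electromagnetic shielding effect. A new time-series model, iiTransformer (improved iTransformer), based on the Encoder-Decoder framework and a multivariate attention mechanism, is therefore proposed and implemented to model mathematically the complex nonlinear process between HPM and CPA and to perform HPM inversion. The finite element method (FEM) is used for simulation and data acquisition, and HPM inversion is performed with both the iiTransformer model and a ResNet-18 model. In the iiTransformer model, the encoder and decoder apply multi-head self-attention to the data sequence and the target sequence, respectively, to extract the dependencies among multiple variables across multiple channels. By contrast, the ResNet-18 model fits a nonlinear mapping between the heat map derived from the data sequence and the target sequence. The results show that the iiTransformer model has strong representation learning and nonlinear fitting ability, generalizes well, and is robust: its loss is 3.4548×10^(-7) on the training set and 1.804×10^(-7) on the validation set, and its accuracy on the test set is 99.923%, far higher than the inversion accuracy of the ResNet-18 model.
Keywords: high-power microwave protection; columnar plasma array; time-series model; encoder-decoder framework; multivariate attention mechanism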
BDMFuse:Multi-scale network fusion for infrared and visible images based on base and detail features
19
Authors: SI Hai-Ping, ZHAO Wen-Rui, LI Ting-Ting, LI Fei-Tao, Fernando Bacao, SUN Chang-Xia, LI Yan-Ling. 《红外与毫米波学报》, PKU Core, 2025, Issue 2, pp. 289-298 (10 pages)
The fusion of infrared and visible images should emphasize the salient targets in the infrared image while preserving the textural details of the visible image. To meet these requirements, an autoencoder-based method for infrared and visible image fusion is proposed. The encoder, designed according to the optimization objective, consists of a base encoder and a detail encoder, which extract low-frequency and high-frequency information from the image. Because this extraction may leave some information uncaptured, a compensation encoder is proposed to supplement the missing information. Multi-scale decomposition is also employed to extract image features more comprehensively. The decoder combines low-frequency, high-frequency, and supplementary information to obtain multi-scale features. Subsequently, an attention strategy and a fusion module are introduced to perform multi-scale fusion for image reconstruction. Experimental results on three datasets show that the fused images generated by this network effectively retain salient targets while being more consistent with human visual perception.
Keywords: infrared image; visible image; image fusion; encoder-decoder; multi-scale features
FS-MSFormer:Image Dehazing Based on Frequency Selection and Multi-Branch Efficient Transformer
20
Authors: Chunming Tang, Yu Wang. 《Computers, Materials & Continua》, 2025, Issue 6, pp. 5115-5128 (14 pages)
Image dehazing aims to generate clear images critical for subsequent visual tasks. CNNs have made significant progress in the field of image dehazing. However, due to the inherent limitations of convolution operations, it is challenging to model global context and long-range spatial dependencies effectively. Although the Transformer can address this issue, it faces the challenge of excessive computational requirements. Therefore, we propose the FS-MSFormer network, an asymmetric encoder-decoder architecture that combines the advantages of CNNs and Transformers to improve dehazing performance. Specifically, the encoding process employs two branches for multi-scale feature extraction. One branch integrates an improved Transformer to enrich local and global contextual information while achieving linear complexity, and the other branch dynamically selects the most suitable frequency components in the frequency domain for enhancement. A single decoding branch is used for feature recovery in the decoding process. After local and global features are enhanced, they are fused with the encoded features, which reduces information loss and enhances the model's robustness. A perceptual consistency loss function is also designed to minimize image color distortion. We conducted experiments on the synthetic datasets SOTS-Indoor and Foggy Cityscapes and on the real-world dataset Dense-Haze, showing improved dehazing results. Compared with FSNet, our method improves PSNR by 0.95 dB and SSIM by 0.007 on the SOTS-Indoor dataset, and improves PSNR by 1.89 dB and SSIM by 0.0579 on the Dense-Haze dataset, demonstrating the effectiveness of our method.
Keywords: asymmetric encoder-decoder architecture; perceived consistency loss; unified transformer