Dense captioning aims to simultaneously localize and describe regions of interest (RoIs) in images in natural language. Specifically, we identify three key problems: 1) dense and highly overlapping RoIs, which make accurate localization of each target region challenging; 2) visually ambiguous target regions that are hard to recognize by appearance alone; and 3) the need for an extremely deep image representation, which is of central importance for visual recognition. To tackle these three challenges, we propose a novel end-to-end dense captioning framework consisting of a joint localization module, a contextual reasoning module, and a deep convolutional neural network (CNN). We also evaluate five deep CNN structures to explore the benefits of each. Extensive experiments on the Visual Genome (VG) dataset demonstrate the effectiveness of our approach, which compares favorably with state-of-the-art methods.
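The abstract does not say how overlap between RoIs is measured. A common choice is intersection over union (IoU), and the minimal sketch below assumes that measure together with axis-aligned boxes given as `(x1, y1, x2, y2)`. The function name `iou` and the box format are illustrative assumptions, not details taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlapping rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

Two RoIs with an IoU close to 1 cover nearly the same pixels, which is the situation the first problem describes: the captions must differ even though the boxes barely do.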
Aspect-based sentiment analysis aims to detect sentiment polarities and classify them as negative, positive, or neutral while associating them with the aspects identified in the corresponding context. Prior methods widely use either word embeddings or tree-based representations. However, using these deep features separately has become a significant cause of information loss. Generally, word embeddings preserve the syntactic and semantic relations between pairs of terms in a sentence, while the tree-based structure conserves the grammatical and logical dependencies of the context. In addition, the position of a word within a sentence is a critical factor that influences the contextual information of the targeted sentence, so position-oriented information is considered significant. In this study, we combine word embeddings, a tree-based representation, and contextual position information to evaluate whether the combination improves the effectiveness of the results. Their joint use enhances the accurate identification and extraction of targeted aspect terms, which in turn influences their classification. We propose a method named Attention-Based Multi-Channel Convolutional Neural Network (Att-MC-CNN) that jointly uses these three deep features. The three features are fed to a Multi-Channel Convolutional Neural Network (MC-CNN) that identifies and extracts the potential terms and classifies their polarities. These terms are further filtered by an attention mechanism, which determines the most significant words. The empirical analysis demonstrates the effectiveness of the proposed approach compared with existing techniques when evaluated on standard datasets. The experimental results show that our approach outperforms them in the F1 measure, achieving 94% overall in identifying aspects and 92% in sentiment classification.
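For readers unfamiliar with the F1 measure reported above, the sketch below computes it for aspect-term extraction. It assumes exact-match scoring between predicted and gold aspect terms. The abstract does not state the matching criterion, so that choice and the function name `f1_score` are illustrative assumptions.

```python
def f1_score(predicted, gold):
    """Harmonic mean of precision and recall over two collections of terms."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)  # share of predictions that are correct
    recall = true_positives / len(gold)          # share of gold terms that were found
    return 2 * precision * recall / (precision + recall)
```

For example, predicting three terms that are all correct when the gold set holds four gives a precision of 1.0 and a recall of 0.75, so F1 is 6/7.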
This paper proposes Flex-QUIC, an AI-empowered Quick UDP Internet Connections (QUIC) enhancement framework that addresses the degraded transmission efficiency caused by the static parameterization of acknowledgment (ACK) mechanisms, loss detection, and forward error correction (FEC) in dynamic wireless networks. Unlike the standard QUIC protocol, Flex-QUIC systematically integrates machine learning across three critical modules to achieve high-efficiency operation. First, a contextual multi-armed bandit-based ACK adaptation mechanism optimizes the ACK ratio to reduce wireless channel contention. Second, the adaptive loss detection module uses a long short-term memory (LSTM) model to predict the reordering displacement and thereby optimize the packet reordering tolerance. Third, the FEC transmission scheme adjusts the redundancy level jointly based on the LSTM-predicted loss rate and the congestion window state. Extensive evaluations across Wi-Fi, 5G, and satellite network scenarios demonstrate that Flex-QUIC significantly improves throughput and reduces latency compared with standard QUIC and other enhanced QUIC variants, highlighting its adaptability to diverse and dynamic network conditions. Finally, we discuss open issues in deploying AI-native transport protocols.
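To make the idea of a contextual bandit choosing an ACK ratio concrete, here is a minimal sketch. It swaps in a plain epsilon-greedy policy over discrete contexts, which is not necessarily the algorithm Flex-QUIC uses. The class name `AckRatioBandit`, the candidate ratios, the context labels, and the reward signal are all hypothetical.

```python
import random
from collections import defaultdict


class AckRatioBandit:
    """Epsilon-greedy contextual bandit over candidate ACK ratios (illustrative only)."""

    def __init__(self, ack_ratios=(2, 4, 8, 16), epsilon=0.1, seed=0):
        self.ack_ratios = tuple(ack_ratios)  # hypothetical arms: packets per ACK
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # Running mean reward and pull count for each (context, arm) pair.
        self.mean = defaultdict(float)
        self.count = defaultdict(int)

    def select(self, context):
        """Explore with probability epsilon, otherwise pick the best-known arm."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.ack_ratios)
        return max(self.ack_ratios, key=lambda arm: self.mean[(context, arm)])

    def update(self, context, ack_ratio, reward):
        """Fold an observed reward (e.g. normalized throughput) into the running mean."""
        key = (context, ack_ratio)
        self.count[key] += 1
        self.mean[key] += (reward - self.mean[key]) / self.count[key]
```

A sender would call `select` with a coarse description of current channel conditions, apply the returned ratio for one measurement interval, and then call `update` with whatever reward it observed. Keeping separate statistics per context is what lets the policy prefer different ratios under different conditions.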
The diagnosis of COVID-19 requires chest computed tomography (CT). High-resolution CT images provide more diagnostic information and help doctors diagnose the disease, so it is clinically important to study super-resolution (SR) algorithms that improve the resolution of CT images. However, most existing SR algorithms are developed on natural images and are not well suited to medical images; moreover, most of them improve reconstruction quality by increasing network depth, which is unsuitable for machines with limited resources. To alleviate these issues, we propose a residual feature attentional fusion network for lightweight chest CT image super-resolution (RFAFN). Specifically, we design a contextual feature extraction block (CFEB) that extracts CT image features more efficiently and accurately than ordinary residual blocks. In addition, we propose a feature-weighted cascading strategy (FWCS) based on attentional feature fusion blocks (AFFB), which makes the most of the high-frequency detail extracted by the CFEB by selectively fusing feature information from adjacent levels. Finally, we suggest a global hierarchical feature fusion strategy (GHFFS), which uses hierarchical features more effectively than dense concatenation by progressively aggregating feature information at various levels. Numerous experiments show that our method performs better than most state-of-the-art (SOTA) methods on the COVID-19 chest CT dataset. In detail, the peak signal-to-noise ratio (PSNR) is 0.11 dB and 0.47 dB higher on CTtest1 and CTtest2 at ×3 SR than that of the suboptimal method, while the number of parameters and multi-adds are reduced by 22K and 0.43G, respectively. Our method can better recover chest CT image quality with fewer computational resources and effectively assist in COVID-19 diagnosis.
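The PSNR figures quoted above follow the standard definition, PSNR = 10 · log10(MAX² / MSE). The sketch below implements that formula for two equally sized grayscale images stored as nested lists. The default peak value of 255 assumes 8-bit intensities, which the abstract does not confirm for the CT data.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two 2-D images (lists of rows)."""
    ref = [pixel for row in reference for pixel in row]
    rec = [pixel for row in reconstructed for pixel in row]
    if not ref or len(ref) != len(rec):
        raise ValueError("images must be non-empty and have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because the scale is logarithmic, a gain of 0.47 dB corresponds to roughly a 10% reduction in mean squared error.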
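When an imaging device is subjected to external laser interference, the interference spot occludes useful information about salient targets, causing a marked drop in image quality and complicating downstream tasks. To address laser-interference image restoration, a restoration network model based on global semantic awareness and texture frequency-domain constraints is proposed. The model consists of two parts, a global semantic guidance stage and a local detail enhancement stage. The global semantic guidance stage combines sliding-window self-attention with a hierarchically structured HBES (Hybrid Block of ESA and STL) module to progressively enlarge the receptive field and extract global contextual information, so that plausible content for the interfered region can be inferred accurately. The local detail enhancement stage takes the prediction of the global semantic guidance stage as input and, by analyzing the similarity between uninterfered and interfered regions, combines the correlated information of the background and the interfered region to generate a high-quality restoration. In addition, to increase the network's attention to texture details, a cosine-transform loss function is designed that emphasizes the restoration of image details, making the reconstructed interfered region clear and coherent. Experimental results show that the model achieves good restoration results on the laser-interference image restoration task and effectively improves image quality.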
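The abstract names a cosine-transform loss but does not give its form. One plausible reading, sketched below under that assumption, compares the restored and target images in the discrete cosine transform (DCT) domain and averages the absolute coefficient differences. The unnormalized DCT-II, the L1 distance, and the function names are illustrative choices rather than the paper's definition.

```python
import math


def dct_1d(signal):
    """Unnormalized DCT-II of a 1-D sequence."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(signal))
        for k in range(n)
    ]


def dct_2d(image):
    """Separable 2-D DCT-II: transform each row, then each column."""
    rows = [dct_1d(row) for row in image]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row-major order


def dct_l1_loss(restored, target):
    """Mean absolute difference between the DCT coefficients of two images."""
    a, b = dct_2d(restored), dct_2d(target)
    diffs = [abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)
```

Fine texture shows up in the high-frequency coefficients, so penalizing differences there is one way a frequency-domain loss can push a network toward sharper detail than a pixel-wise loss alone.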
Funding (dense captioning framework): Project (2020A1515010718) supported by the Basic and Applied Basic Research Foundation of Guangdong Province, China.
Funding (Att-MC-CNN): Supported by the Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia [Grant No. 3418].
Funding (Flex-QUIC): Supported in part by the National Key R&D Program of China under Grant No. 2019YFB1803400.
Funding (RFAFN): Supported by the General Project of the Natural Science Foundation of Hebei Province of China (H2019201378), the Foundation of the President of Hebei University (XZJJ201917), and the Special Project for Cultivating Scientific and Technological Innovation Ability of University and Middle School Students of Hebei Province (2021H060306).