Journal Articles
3 articles found
Fine-tuning a large multimodal neural network with clinical and dermoscopic images for accurate detection and histopathological subtype prediction of basal cell carcinoma in long-tailed distributions
Authors: Yukun Wang, Xingkuo Zhang, Hongjun Wu, Li Xiao, Jie Liu. Chinese Medical Journal, 2025, Issue 10, pp. 1251-1253 (3 pages)
To the Editor: Basal cell carcinoma (BCC) is the most prevalent skin malignancy, with an increasing incidence and economic burden worldwide.[1] Various histopathological subtypes of BCC have been well described, and subtype confirmation is essential for classifying BCC according to the risk of recurrence.[2] Early diagnosis and intervention are important, especially because the incidence of aggressive BCC subtypes is increasing faster than that of indolent subtypes.
Keywords: cell carcinoma; BCC; dermoscopic images; skin malignancy; multimodal neural network; histopathological subtype; subtype confirmation; basal cell carcinoma; clinical images
An Intelligent Visibility Retrieval Framework Combining Meteorological Factors and Image Features
Authors: MU Xi-yu, ZHOU Yu-feng, XU Qi, FENG Yi-fei, LIU Ze-zhong, CHENG Xiao-gang, YAN Shu-qi, YU Kun, WU Hao, YANG Hua-dong. Journal of Tropical Meteorology, 2025, Issue 5, pp. 545-555 (11 pages)
Video imagery enables both qualitative characterization and quantitative retrieval of low-visibility conditions. These phenomena exhibit complex nonlinear dependencies on atmospheric processes, particularly during moisture-driven weather events such as fog, rain, and snow. To address this challenge, we propose a dual-branch neural architecture that synergistically processes optical imagery and multi-source meteorological data (temperature, humidity, and wind speed). The framework employs a convolutional neural network (CNN) branch to extract visibility-related visual features from video imagery sequences, while a parallel artificial neural network (ANN) branch decodes nonlinear relationships among the meteorological factors. Cross-modal feature fusion is achieved through an adaptive weighting layer. To validate the framework, multimodal Backpropagation-VGG (BP-VGG) and Backpropagation-ResNet (BP-ResNet) models are developed and trained and tested on historical imagery and meteorological observations from Nanjing Lukou International Airport. The results demonstrate that the multimodal networks reduce retrieval errors by approximately 8%-10% compared with unimodal networks relying solely on imagery. Among the multimodal models, BP-ResNet performs best, with a mean absolute percentage error (MAPE) of 8.5%. Analysis of typical cases reveals that visibility fluctuates rapidly while meteorological factors change gradually, highlighting the crucial role of high-frequency imaging data in intelligent visibility retrieval models. The superior performance of BP-ResNet over BP-VGG is attributed to its residual blocks, which allow it to exploit data complementarity for synergistic improvements in multimodal processing. This study presents an end-to-end intelligent visibility inversion framework that directly retrieves visibility values, enhancing its applicability across industries. However, while this approach improves accuracy and applicability, its performance in critical low-visibility scenarios remains suboptimal, necessitating further research into more advanced retrieval techniques, particularly under extreme visibility conditions.
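The dual-branch design described in this abstract (a CNN branch for imagery, an ANN branch for meteorological factors, and an adaptive weighting layer for cross-modal fusion) can be illustrated with a minimal sketch. All layer sizes, the random weights, the toy inputs, and the softmax-gated fusion below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def image_branch(img):
    """Stand-in for the CNN branch: global average pooling plus one dense layer."""
    pooled = img.mean(axis=(0, 1))               # (channels,)
    W = rng.standard_normal((pooled.size, 16))
    return relu(pooled @ W)                      # 16-dim visual feature

def met_branch(met):
    """Stand-in for the ANN branch over temperature, humidity, wind speed."""
    W = rng.standard_normal((met.size, 16))
    return relu(met @ W)                         # 16-dim meteorological feature

def adaptive_fusion(f_img, f_met):
    """Adaptive weighting layer: a softmax gate over the two modalities."""
    gate = rng.standard_normal(2)                # learnable gate logits (fixed here)
    w = np.exp(gate) / np.exp(gate).sum()        # modality weights summing to 1
    fused = w[0] * f_img + w[1] * f_met
    W_out = rng.standard_normal((fused.size, 1))
    return (fused @ W_out).item()                # scalar visibility estimate

img = rng.random((32, 32, 3))                    # toy video frame
met = np.array([18.5, 0.93, 2.1])                # temperature, humidity, wind speed
visibility = adaptive_fusion(image_branch(img), met_branch(met))
```

In a trained model the gate logits and branch weights would be learned jointly, letting the network shift weight toward the imagery branch when visibility changes faster than the meteorological factors, as the case studies describe.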
Keywords: multimodal neural network; multi-source factors; intelligent visibility retrieval
Deep Multimodal Reinforcement Network with Contextually Guided Recurrent Attention for Image Question Answering (Cited by 2)
Authors: Ai-Wen Jiang, Bo Liu, Ming-Wen Wang. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2017, Issue 4, pp. 738-748 (11 pages)
Image question answering (IQA) has emerged as a promising interdisciplinary topic at the intersection of computer vision and natural language processing. In this paper, we propose a contextually guided recurrent attention model for IQA: a deep-reinforcement-learning-based multimodal recurrent neural network. Guided by compositional contextual information, it recurrently decides where to look using a reinforcement learning strategy. Unlike traditional 'static' soft attention, this is a form of 'dynamic' attention whose objective is designed around reinforcement rewards tailored to IQA. The learned compositional representation incorporates both global context and local informative details, which is shown to benefit answer generation. The proposed method is compared with several state-of-the-art methods on two public IQA datasets, COCO-QA and VQA, both built on MS COCO. The experimental results demonstrate that our model outperforms those methods.
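The 'dynamic' attention idea in this abstract, deciding where to look via reinforcement rewards rather than a fixed soft-attention distribution, can be sketched with a toy REINFORCE loop. The region count, the reward definition, the learning rate, and the notion that exactly one region answers the question are all illustrative assumptions, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setup: 4 image regions; assume the answer-relevant evidence is in region 2.
n_regions = 4
target_region = 2

theta = np.zeros(n_regions)                  # policy logits over glimpse locations

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# "Dynamic" attention trained with REINFORCE: sample where to look,
# receive reward 1 when the glimpse hits the informative region,
# then nudge the policy toward rewarded glimpses.
lr = 0.5
for _ in range(200):
    probs = softmax(theta)
    glimpse = rng.choice(n_regions, p=probs)
    reward = 1.0 if glimpse == target_region else 0.0
    grad = -probs                            # d log pi(glimpse) / d theta ...
    grad[glimpse] += 1.0                     # ... for a categorical policy
    theta += lr * reward * grad

best = int(np.argmax(softmax(theta)))        # where the learned policy looks
```

In the paper's setting the reward would come from answer correctness and the policy would be conditioned on the question and on previously attended context, but the same policy-gradient update drives the attention toward informative locations.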
Keywords: image question answering; recurrent attention; deep reinforcement learning; multimodal recurrent neural network; multimodal fusion