Funding: This work was supported by grants from the CAMS Innovation Fund for Medical Sciences (CIFMS) (No. 2022-I2M-C&T-A-007), the National Natural Science Foundation of China (No. 92354307), and the Fundamental Research Funds for the Central Universities (No. 2023RC09).
Abstract: To the Editor: Basal cell carcinoma (BCC) is the most prevalent skin malignancy, with an increasing incidence and economic burden worldwide.[1] Various histopathological subtypes of BCC have been well described, and subtype confirmation is essential for classifying BCC according to the risk of recurrence.[2] Early diagnosis and intervention are important, especially because the incidence of aggressive BCC subtypes is rising faster than that of indolent subtypes.
Funding: Foundation of the Key Laboratory of Smart Earth (KF2023ZD03-02), the China Meteorological Administration Innovation Development Project (CXFZ2025J116), the National Natural Science Foundation of China (42205197), and the Basic Research Fund of CAMS (2022Y023, 2022Y025).
Abstract: Video imagery enables both qualitative characterization and quantitative retrieval of low-visibility conditions. These phenomena exhibit complex nonlinear dependencies on atmospheric processes, particularly during moisture-driven weather events such as fog, rain, and snow. To address this challenge, we propose a dual-branch neural architecture that jointly processes optical imagery and multi-source meteorological data (temperature, humidity, and wind speed). The framework employs a convolutional neural network (CNN) branch to extract visibility-related visual features from video imagery sequences, while a parallel artificial neural network (ANN) branch decodes nonlinear relationships among the meteorological factors. Cross-modal feature fusion is achieved through an adaptive weighting layer. To validate the framework, multimodal Backpropagation-VGG (BP-VGG) and Backpropagation-ResNet (BP-ResNet) models are developed and trained and tested on historical imagery and meteorological observations from Nanjing Lukou International Airport. The results show that the multimodal networks reduce retrieval errors by approximately 8%-10% compared with unimodal networks relying solely on imagery. Among the multimodal models, BP-ResNet performs best, with a mean absolute percentage error (MAPE) of 8.5%. Analysis of typical case studies reveals that visibility fluctuates rapidly while meteorological factors change gradually, highlighting the crucial role of high-frequency imaging data in intelligent visibility retrieval models. The advantage of BP-ResNet over BP-VGG is attributed to its residual blocks, which allow it to better exploit the complementarity of the two modalities. This study presents an end-to-end intelligent visibility retrieval framework that directly outputs visibility values, enhancing its applicability across industries. However, while this approach improves accuracy and applicability, its performance in critical low-visibility scenarios remains suboptimal, necessitating further research into more advanced retrieval techniques, particularly under extreme visibility conditions.
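The abstract does not include code, but the dual-branch design lends itself to a short sketch. The following is a minimal PyTorch sketch, not the authors' implementation: it assumes a ResNet-18 backbone for the CNN branch, a small MLP for the three meteorological inputs, and a learned scalar gate standing in for the adaptive weighting layer; all class and variable names are hypothetical.

```python
# Minimal sketch of the dual-branch visibility-retrieval idea (hypothetical names,
# not the authors' code): a CNN branch for imagery, an MLP branch for the three
# meteorological factors, and a learned gate acting as the "adaptive weighting layer".
import torch
import torch.nn as nn
from torchvision.models import resnet18

class DualBranchVisibility(nn.Module):
    def __init__(self, met_dim: int = 3):
        super().__init__()
        # CNN branch: ResNet-18 backbone with the classifier removed (512-d feature).
        backbone = resnet18(weights=None)
        backbone.fc = nn.Identity()
        self.cnn_branch = backbone
        # ANN branch: small MLP over temperature, humidity, and wind speed.
        self.ann_branch = nn.Sequential(
            nn.Linear(met_dim, 64), nn.ReLU(),
            nn.Linear(64, 512), nn.ReLU(),
        )
        # Adaptive weighting: a scalar gate computed from both feature vectors.
        self.gate = nn.Sequential(nn.Linear(1024, 1), nn.Sigmoid())
        self.head = nn.Linear(512, 1)        # regress a single visibility value

    def forward(self, image, met):
        f_img = self.cnn_branch(image)       # (B, 512) visual features
        f_met = self.ann_branch(met)         # (B, 512) meteorological features
        w = self.gate(torch.cat([f_img, f_met], dim=1))   # (B, 1) fusion weight
        fused = w * f_img + (1.0 - w) * f_met
        return self.head(fused).squeeze(1)   # predicted visibility per sample

# Example forward pass on dummy data; MAPE would then be
# mean(|y_pred - y_true| / y_true) * 100% over the test set.
model = DualBranchVisibility()
vis = model(torch.randn(4, 3, 224, 224), torch.randn(4, 3))
print(vis.shape)  # torch.Size([4])
```

A single sigmoid gate is only the simplest reading of the "adaptive weighting layer"; per-feature or attention-based weights would be equally plausible realizations of the fusion described in the abstract.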
Abstract: Image question answering (IQA) has emerged as a promising interdisciplinary topic spanning computer vision and natural language processing. In this paper, we propose a contextually guided recurrent attention model for IQA. It is a deep reinforcement learning based multimodal recurrent neural network. Based on compositional contextual information, it recurrently decides where to look using a reinforcement learning strategy. Unlike traditional "static" soft attention, it can be viewed as a form of "dynamic" attention whose objective is designed around reinforcement rewards tailored to IQA. The final learned compositional representation incorporates both global context and local informative details, which is shown to benefit answer generation. The proposed method is compared with several state-of-the-art methods on two public IQA datasets, COCO-QA and VQA, both built on MS COCO images. The experimental results demonstrate that our model outperforms these methods.
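As a rough illustration of the "dynamic" attention idea, the sketch below is a hypothetical PyTorch reading of one recurrent step, not the authors' implementation: a GRU state conditioned on the question and the current glimpse emits a stochastic "where to look next" location, whose log-probability can be plugged into a REINFORCE-style objective rewarded by answer correctness. All names and dimensions are assumptions for illustration only.

```python
# One step of a stochastic recurrent attention policy (hypothetical sketch):
# the sampled location makes the attention "dynamic" and trainable with
# reinforcement rewards rather than differentiable soft weights.
import torch
import torch.nn as nn

class RecurrentAttentionStep(nn.Module):
    def __init__(self, feat_dim: int = 512, q_dim: int = 512, hid_dim: int = 512):
        super().__init__()
        self.rnn = nn.GRUCell(feat_dim + q_dim, hid_dim)
        self.loc_head = nn.Linear(hid_dim, 2)     # mean of a 2-D gaze location in [-1, 1]
        self.log_std = nn.Parameter(torch.zeros(2))

    def forward(self, glimpse_feat, q_feat, h):
        # Update the recurrent state from the current glimpse and the question feature.
        h = self.rnn(torch.cat([glimpse_feat, q_feat], dim=1), h)
        mean = torch.tanh(self.loc_head(h))
        dist = torch.distributions.Normal(mean, self.log_std.exp())
        loc = dist.sample()                        # stochastic "where to look" decision
        log_prob = dist.log_prob(loc).sum(dim=1)   # used in the REINFORCE objective
        return h, loc.clamp(-1.0, 1.0), log_prob

# One recurrent step on dummy features; the policy loss would be
# -(reward * log_prob).mean(), with the reward given by answer correctness.
step = RecurrentAttentionStep()
h = torch.zeros(4, 512)
h, loc, logp = step(torch.randn(4, 512), torch.randn(4, 512), h)
print(loc.shape, logp.shape)  # torch.Size([4, 2]) torch.Size([4])
```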