Embodied visual exploration is critical for building intelligent visual agents. This paper presents neural exploration with feature-based visual odometry and a tracking-failure-reduction policy (NeOR), a framework for embodied visual exploration that retains the efficient exploration capabilities of deep reinforcement learning (DRL)-based exploration policies and leverages feature-based visual odometry (VO) for more accurate mapping and positioning. An improved local policy is also proposed to reduce tracking failures of feature-based VO in weakly textured scenes through a refined multi-discrete action space, keyframe fusion, and an auxiliary task. The experimental results demonstrate that NeOR achieves better mapping and positioning accuracy than other entirely learning-based exploration frameworks and improves the robustness of feature-based VO by significantly reducing tracking failures in weakly textured scenes.
Despite global attention, pollution remains a significant threat and challenge for both developed and developing countries. Urbanization and economic development influence different types of pollution. Visual pollution is considered a new phenomenon, referring to the impact of existing and growing mainstream pollution that impairs an individual's ability to enjoy visits or views. Recently, Jordanian cities have expanded in response to urbanization and ongoing development. Irbid City has the second largest population in Jordan, after the capital Amman, and the highest population density in the country. In the modern era, Irbid City has grown dramatically in population and size. This population growth has been significant and has led to overpopulation, rapid urbanization, and unresolved problems associated with spatial planning and infrastructure, producing different types of pollution, including visual pollution. The study area is the city center, the most crowded part of the city, examined through field visits and direct observation. The study technique is descriptive and analytical, relying on meticulous monitoring and a follow-up questionnaire as the study tool, and involving data collection, classification, presentation, analysis, interpretation, and exploration to identify new facts and generalizations that can help solve current issues of visual pollution. The study provides recommendations for Irbid Municipality to eliminate visual pollution, together with stricter municipal supervision during the building process to ensure proper implementation of the new rules, and the adoption of an integrated policy for the city that covers the social, political, sensory, cultural, economic, and functional aspects and applies in both the short and long term.
Throughout its lifespan, an animal can encounter predators frequently, so the ability to avoid attacks from predators is crucial for its survival. The chances of evading danger can be greatly improved if the animal can respond immediately to the threat. Therefore, when an animal detects a threat through its visual system, it must quickly direct its gaze and attention toward the source of danger, assess the threat level, and take appropriate action.
Synaptic plasticity is essential for maintaining neuronal function in the central nervous system and serves as a critical indicator of the effects of neurodegenerative disease. Glaucoma directly impairs retinal ganglion cells and their axons, leading to axonal transport dysfunction and subsequently causing secondary damage to the anterior or posterior ends of the visual system. Accordingly, recent evidence indicates that glaucoma is a degenerative disease of the central nervous system that causes damage throughout the visual pathway. However, the effects of glaucoma on synaptic plasticity in the primary visual cortex remain unclear. In this study, we established a mouse model of unilateral chronic ocular hypertension by injecting magnetic microbeads into the anterior chamber of one eye. We found that, after 4 weeks of chronic ocular hypertension, the neuronal somas were smaller in the superior colliculus and lateral geniculate body regions of the brain contralateral to the affected eye. This was accompanied by glial cell activation and increased expression of inflammatory factors. After 8 weeks of ocular hypertension, we observed a reduction in the number of excitatory and inhibitory synapses and dendritic spines, along with activation of glial cells, in the primary visual cortex contralateral to the affected eye. These findings suggest that glaucoma not only directly damages the retina but also induces alterations in synapses and dendritic spines in the primary visual cortex, providing new insights into the pathogenesis of glaucoma.
In recent years, the rapid development of artificial intelligence has driven the widespread deployment of visual systems in complex environments such as autonomous driving, security surveillance, and medical diagnosis. However, existing image sensors, such as CMOS and CCD devices, are intrinsically limited by a fixed spectral response. Especially in environments with strong glare, haze, or dust, external spectral conditions often severely mismatch the device's design range, leading to significant degradation in image quality and a sharp drop in target recognition accuracy. While algorithmic post-processing (such as color bias correction or background suppression) can mitigate these issues, algorithmic approaches typically introduce computational latency and increased energy consumption, making them unsuitable for edge computing or high-speed scenarios.
Image captioning, the task of generating descriptive sentences for images, has advanced significantly with the integration of semantic information. However, traditional models still rely on static visual features that do not evolve with the changing linguistic context, which can hinder the ability to form meaningful connections between the image and the generated captions. This limitation often leads to captions that are less accurate or descriptive. In this paper, we propose a novel approach to enhance image captioning by introducing dynamic interactions where visual features continuously adapt to the evolving linguistic context. Our model strengthens the alignment between visual and linguistic elements, resulting in more coherent and contextually appropriate captions. Specifically, we introduce two innovative modules: the Visual Weighting Module (VWM) and the Enhanced Features Attention Module (EFAM). The VWM adjusts visual features using partial attention, enabling dynamic reweighting of the visual inputs, while the EFAM further refines these features to improve their relevance to the generated caption. By continuously adjusting visual features in response to the linguistic context, our model bridges the gap between static visual features and dynamic language generation. We demonstrate the effectiveness of our approach through experiments on the MS-COCO dataset, where our method outperforms state-of-the-art techniques in terms of caption quality and contextual relevance. Our results show that dynamic visual-linguistic alignment significantly enhances image captioning performance.
The dorsal and ventral visual streams have been considered to play distinct roles in visual processing for action: the dorsal stream is assumed to support real-time actions, while the ventral stream facilitates memory-guided actions. However, recent evidence suggests a more integrated function of these streams. We investigated the neural dynamics and functional connectivity between them during memory-guided actions using intracranial EEG. We tracked neural activity in the inferior parietal lobule in the dorsal stream and the ventral temporal cortex in the ventral stream, as well as the hippocampus, during a delayed action task involving object identity and location memory. We found increased alpha power in both streams during the delay, indicating their role in maintaining spatial visual information. In addition, we recorded increased alpha power in the hippocampus during the delay, but only when both object identity and location needed to be remembered. We also recorded an increase in theta band phase synchronization between the inferior parietal lobule and ventral temporal cortex and between the inferior parietal lobule and hippocampus during the encoding and delay. Granger causality analysis indicated dynamic and frequency-specific directional interactions among the inferior parietal lobule, ventral temporal cortex, and hippocampus that varied across task phases. Our study provides unique electrophysiological evidence for close interactions between dorsal and ventral streams, supporting an integrated processing model in which both streams contribute to memory-guided actions.
In the visual 'teach-and-repeat' task, a mobile robot is expected to perform path following based on visual memory acquired along a route that it has traversed. Following a visually familiar route is also a critical navigation skill for foraging insects, which they accomplish robustly despite tiny brains. Inspired by the mushroom body structure in the insect brain and its well-understood associative learning ability, we develop an embodied model that can accomplish visual teach-and-repeat efficiently. Critical to the performance is steering the robot body reflexively based on the relative familiarity of left and right visual fields, eliminating the need for stopping and scanning regularly for optimal directions. The model is robust against noise in visual processing and motor control and can produce performance comparable to pure pursuit or visual localisation methods that rely heavily on the estimation of positions. The model is tested on a real robot and also shown to be able to correct for significant intrinsic steering bias.
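The reflexive steering idea in the abstract above can be illustrated with a very small sketch. It assumes that two familiarity scores (one per visual half-field) are already available from some memory model, and that the robot turns toward the more familiar side in proportion to their normalised difference. The function name `steering_command`, the `gain` parameter, and the sign convention are hypothetical and are not taken from the paper.

```python
def steering_command(familiarity_left, familiarity_right, gain=1.0, eps=1e-9):
    """Return a signed turn rate from left/right familiarity scores.

    Hypothetical convention: positive means turn left, negative means turn
    right. Scores are assumed to be non-negative, with larger values
    meaning "more familiar".
    """
    total = familiarity_left + familiarity_right
    if total < eps:
        # No usable signal in either half-field: keep going straight.
        return 0.0
    # Normalised difference in [-1, 1], scaled by the gain.
    return gain * (familiarity_left - familiarity_right) / total


if __name__ == "__main__":
    # The left field looks more familiar, so the command is positive.
    print(steering_command(0.8, 0.2))
```

Because the command is computed from a single pair of scores per frame, a controller of this shape never needs to stop and scan, which is the property the abstract highlights.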
Fig. 1. The GenomeSyn tool for visualizing genome synteny and characterizing structural variations. A: The first synteny visualization map showed the detailed information of two or three genomes and can display structural variations and other annotation information. B: The second type of visualization map was simple and only showed the synteny relationship between the chromosomes of two or three genomes. C: Multiplatform general GenomeSyn submission page, applicable to Windows, Mac, and web platforms; other analysis files can be entered in the "other" option. The publisher would like to apologise for any inconvenience caused.
Visual entailment (VE) is a prototypical task in multimodal visual reasoning, where current methods frequently utilize large language models (LLMs) as the knowledge base to assist in answering questions. These methods rely heavily on the textual modality, which inherently cannot capture the full extent of information contained within images. We propose a context-aware visual entailment (CAVE) model, which introduces a novel aggregation module designed to extract high-level semantic features from images. This module integrates lower-level semantic image features into high-level visual tokens, formatting them similarly to text tokens so that they can serve as inputs for LLMs. The CAVE model compensates for the loss of image information and integrates it more effectively with textual comprehension. Additionally, the CAVE model incorporates a new input format and training methodology rooted in instruction tuning and in-context learning techniques. The objective of this research is to maximize the inherent logical reasoning capabilities of LLMs. Experimental results on the E-SNLI-VE dataset show that the proposed CAVE model exhibits outstanding performance.
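Objective: Wind-power flanges come in many fine-grained categories and specifications, with large diameters and many holes, which makes coordinate calculation for multi-hole machining laborious and data entry inefficient. Existing machining approaches such as polar coordinates, rotated coordinates, macro programs, and secondary development struggle to meet the actual production needs of flange manufacturers, so an efficient solution is proposed. Methods: Based on the Visual Studio 2022 development platform, a dedicated CAM system was developed that is efficient and practical and can flexibly and quickly generate bolt-hole machining programs. The system follows a modular design in which part information, machining parameters, and other data are handled independently by their respective modules, so that the system can be adjusted promptly as flange design standards change and can automatically generate bolt-hole machining programs for wind-power flanges of different specifications. Results: The developed CAM system for wind-power flange bolt-hole machining achieves rapid automatic generation of multi-hole machining programs, significantly reducing the workload of CNC programmers and improving the production efficiency of flange hole machining. Conclusions: In the future, secondary development on the AutoCAD and NX platforms could be pursued, building on their powerful 2D and 3D graphic design foundations, to develop a small-to-medium-sized CAD/CAM system that integrates design and manufacturing for flange parts and meets enterprises' evolving production management needs.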
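The multi-hole coordinate calculation that the abstract above describes as laborious can be sketched in a few lines. The example below is only an illustration under stated assumptions: holes are equally spaced on a single bolt circle centred at the origin, and the output uses a Fanuc-style `G81` drilling cycle cancelled by `G80`. The function names, parameters, and output dialect are hypothetical and do not describe the CAM system reported in the paper.

```python
import math


def bolt_hole_coordinates(bolt_circle_diameter, hole_count, start_angle_deg=0.0):
    """Return (x, y) centres of equally spaced holes on a bolt circle.

    The circle is centred at the origin; angles are measured
    counter-clockwise from the +X axis.
    """
    if hole_count <= 0:
        raise ValueError("hole_count must be positive")
    radius = bolt_circle_diameter / 2.0
    step_deg = 360.0 / hole_count
    coords = []
    for i in range(hole_count):
        angle = math.radians(start_angle_deg + i * step_deg)
        # Adding 0.0 turns a rounded -0.0 into 0.0 for cleaner output.
        x = round(radius * math.cos(angle), 3) + 0.0
        y = round(radius * math.sin(angle), 3) + 0.0
        coords.append((x, y))
    return coords


def drilling_program(coords, depth, retract_height, feed_rate):
    """Emit drilling lines in an assumed Fanuc-style dialect (G81 ... G80)."""
    if not coords:
        return []
    lines = []
    for i, (x, y) in enumerate(coords):
        if i == 0:
            # The first hole carries the modal cycle parameters.
            lines.append(
                f"G81 X{x:.3f} Y{y:.3f} Z{-abs(depth):.3f} "
                f"R{retract_height:.3f} F{feed_rate:.1f}"
            )
        else:
            # Later holes only need their position while the cycle is active.
            lines.append(f"X{x:.3f} Y{y:.3f}")
    lines.append("G80")  # cancel the canned cycle
    return lines


if __name__ == "__main__":
    holes = bolt_hole_coordinates(bolt_circle_diameter=100.0, hole_count=4)
    for line in drilling_program(holes, depth=30.0, retract_height=5.0, feed_rate=120.0):
        print(line)
```

Keeping the geometry (`bolt_hole_coordinates`) separate from the program text (`drilling_program`) loosely mirrors the modular split between part information and machining parameters mentioned in the abstract, so a change in a flange standard would only touch the inputs to the first function.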
BACKGROUND: Esophageal and gastric variceal bleeding is a catastrophic complication of portal hypertension, most commonly caused by cirrhosis of various etiologies. Although a considerable body of research has been conducted in this area, the complexity of the disease and the lack of standardized treatment strategies have led to fragmented findings, insufficient information, and a lack of systematic investigation. Bibliometric analysis can help clarify research trends, identify core topics, and reveal potential future directions. Therefore, this study aims to use bibliometric methods to conduct an in-depth exploration of research progress in this field, with the expectation of providing new insights for both clinical practice and scientific research. AIM: To evaluate research trends and advancements in esophagogastric variceal bleeding (EGVB) over the past twenty years. METHODS: Relevant publications on EGVB were retrieved from the Web of Science Core Collection. VOSviewer, Pajek, CiteSpace, and the bibliometrix package were then employed to perform bibliometric visualizations of publication volume, countries, institutions, journals, authors, keywords, and citation counts. RESULTS: The analysis focused on original research articles and review papers. From 2004 to 2023, a total of 2097 records on EGVB were retrieved. The number of relevant publications has increased significantly over the past two decades, especially in China and the United States. The leading contributors in this field, in terms of countries, institutions, authors, and journals, were China, Assistance Publique-Hôpitaux de Paris, Bosch Jaime, and World Journal of Gastroenterology, respectively. Core keywords in this field include portal hypertension, management, liver cirrhosis, risk, prevention, and diagnosis. Future research directions may focus on optimizing diagnostic methods, personalized treatment, and multidisciplinary collaboration. CONCLUSION: Using bibliometric methods, this study reveals the developmental trajectory and trends in research on EGVB, underscoring risk assessment and diagnostic optimization as the core areas of current focus. The study provides an innovative and systematic perspective for this field, indicating that future research could center on multidisciplinary collaboration, personalized treatment approaches, and the development of new diagnostic tools. Moreover, this work offers practical research directions for both the academic community and clinical practice, driving continued advancement in this domain.
Rapid diagnosis of Salmonella is crucial for the effective control of food safety incidents, especially in regions with poor hygiene conditions. Polymerase chain reaction (PCR), a promising tool for Salmonella detection, lacks simple and fast sensing methods that are compatible with field applications in resource-limited areas. In this work, we developed a sensing approach to identify PCR-amplified Salmonella genomic DNA with the naked eye in a snapshot. Based on the ratiometric fluorescence signals from SYBR Green I and hydroxynaphthol blue, positive samples stood out from negative ones with a distinct color pattern under UV exposure. The proposed sensing scheme enabled highly specific identification of Salmonella with a detection limit at the single-copy level. In addition, as a supplement to the intuitive naked-eye visualization results, numerical analysis was available through a smartphone app that extracts RGB values from the colored images. This work provides a simple, rapid, and user-friendly solution for PCR identification, which shows great potential for molecular diagnosis of Salmonella and other pathogens in the field.
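The numerical analysis step mentioned in the abstract above (extracting RGB values from a photographed sample) can be sketched as follows. This is a minimal illustration that assumes the pixels of the sample region have already been read into a list of `(r, g, b)` tuples. The function names are hypothetical, and no decision threshold is included because the abstract does not report one.

```python
def mean_rgb(pixels):
    """Average the R, G and B channels over a list of (r, g, b) pixels."""
    n = len(pixels)
    if n == 0:
        raise ValueError("pixels must not be empty")
    r = sum(p[0] for p in pixels) / n
    g = sum(p[1] for p in pixels) / n
    b = sum(p[2] for p in pixels) / n
    return (r, g, b)


def chromaticity(rgb):
    """Normalise an RGB triple so that its three components sum to 1.

    This removes overall brightness, so two photos of the same colour
    taken under different exposure give similar values.
    """
    total = sum(rgb)
    if total == 0:
        return (0.0, 0.0, 0.0)
    return tuple(channel / total for channel in rgb)


if __name__ == "__main__":
    # Hypothetical pixel values from one reaction tube region.
    region = [(10, 20, 30), (30, 40, 50)]
    print(chromaticity(mean_rgb(region)))
```

Comparing normalised channel fractions rather than raw values is one common way to make a ratiometric colour readout less sensitive to lighting; whether the app described in the abstract does this is not stated there.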
Objective: The study of medicine formulas is a core component of traditional Chinese medicine (TCM), yet traditional learning methods often lack interactivity and contextual understanding, making it challenging for beginners to grasp the intricate composition rules of formulas. To address this gap, we introduce Formula-S, a situated visualization method for TCM formula learning in augmented reality (AR), and evaluate its performance. This study aims to evaluate the effectiveness of Formula-S in enhancing TCM formula learning for beginners by comparing it with traditional text-based formula learning and web-based visualization. Methods: Formula-S is an interactive AR tool designed for TCM formula learning, featuring three modes (3D, Web, and Table). The dataset included TCM formulas and herb properties extracted from authoritative references, including a textbook and the SymMap database. In Formula-S, the hierarchical visualization of formulas as herbal medicine compositions is linked to the multidimensional herb attribute visualization and embedded in the real world, where real herb samples are presented. To evaluate its effectiveness, a controlled study (n = 30) was conducted. Participants who had no formal TCM knowledge were tasked with herbal medicine identification, formula composition, and recognition. In the study, participants interacted with the AR tool through HoloLens 2. Data were collected on both task performance (accuracy and response time) and user experience, with a focus on task efficiency, accuracy, and user preference across the different learning modes. Results: The situated visualization method of Formula-S had comparable accuracy to other methods but shorter response time for herbal formula learning tasks. Regarding user experience, the new approach demonstrated the highest system usability and lowest task load, effectively reducing cognitive load and allowing users to complete tasks with greater ease and efficiency. Participants reported that Formula-S enhanced their learning experience through its intuitive interface and immersive AR environment, suggesting this approach offers usability advantages for TCM education. Conclusions: The situated visualization method in Formula-S offers more efficient and accurate searching capabilities compared to traditional and web-based methods. Additionally, it provides superior contextual understanding of TCM formulas, making it a promising new solution for TCM learning.
The hierarchical and coordinated processing of visual information by the brain demonstrates its superior ability to minimize energy consumption and maximize signal transmission efficiency. Therefore, it is crucial to develop artificial visual synapses that integrate optical sensing and synaptic functions. This study leverages the excellent photoresponsivity of the PM6:Y6 system to construct a vertical photo-tunable organic memristor and investigates in depth its resistive switching performance, photodetection capability, and simulation of photo-synaptic behavior, showcasing its excellent performance in processing visual information and simulating neuromorphic behaviors. The device achieves stable and gradual resistance change, successfully simulating voltage-controlled long-term potentiation/depression (LTP/LTD), and exhibits various forms of photoelectric synergistic regulation of synaptic plasticity. Moreover, the device has successfully simulated the image perception and recognition functions of the human visual nervous system. The non-volatile Au/PM6:Y6/ITO memristor is used to model artificial synapses and neurons in a hierarchical, coordinated SLP-CNN cascade neural network for visual image recognition training, in which its linearly tunable photoconductivity serves as the weight update of the network, achieving a recognition accuracy of up to 93.4%. Compared with the single-layer visual target recognition model, this scheme improves the recognition accuracy by 19.2%.
The year 2024 marks the 60th anniversary of Title IX and 25 years since the New York Times revealed bias against female faculty members at the Massachusetts Institute of Technology. We take an opportunity here to examine the state of gender bias in a relatively new yet already prominent field, neural regeneration in the visual system, for which there is a well-defined context useful for this purpose. The National Eye Institute (NEI) provided the first round of research funding for its Audacious Goals Initiative (AGI) on visual neural regeneration in 2013 and the last round in 2021. Therefore, we focus on this timespan. Data sources included PubMed, the National Science Foundation (NSF), the NEI, the Blue Ridge Institute for Medical Research, and data from the major professional organization for eye and vision research, the Association for Research in Vision and Ophthalmology (ARVO).
Funding (NeOR embodied visual exploration study): Supported by the National Natural Science Foundation of China (No. 62202137), the China Postdoctoral Science Foundation (No. 2023M730599), and the Zhejiang Provincial Natural Science Foundation of China (No. LMS25F020009).
Funding (visual threat detection and predator avoidance study): Supported by the National Natural Science Foundation of China (32471055 and 82171090); the Shanghai Municipal Science and Technology Major Project (2018SHZDZX01); ZJLab; the Shanghai Center for Brain Science and Brain-Inspired Technology; and the Lingang Laboratory (LG-QS-202203-12).
Funding (glaucoma and primary visual cortex study): Supported by the National Natural Science Foundation of China, No. 82271115 (to MY).
Funding (fixed-spectral-response image sensor study): Supported in part by STI 2030-Major Projects (2022ZD0209200); in part by the National Natural Science Foundation of China (62374099); in part by the Beijing Natural Science Foundation-Xiaomi Innovation Joint Fund (L233009); in part by the Beijing Natural Science Foundation (L248104); in part by the Independent Research Program of the School of Integrated Circuits, Tsinghua University; and in part by the Tsinghua University Fuzhou Data Technology Joint Research Institute.
Funding (image captioning study): Supported by the National Natural Science Foundation of China (Nos. U22A2034, 62177047), the High Caliber Foreign Experts Introduction Plan funded by MOST, and the Central South University Research Programme of Advanced Interdisciplinary Studies (No. 2023QYJC020).
Funding: Supported by the European Union–Next Generation EU (LX22NPO5107 (MEYS)), the Czech Science Foundation (20-21339S), the Grant Agency of Charles University (GAUK 248122 and 272221), ERDF-Project Brain Dynamics (CZ.02.01.01/00/22_008/0004643), and the Ministry of Health of the Czech Republic Project NU21J-08-00081.
Abstract: The dorsal and ventral visual streams have been considered to play distinct roles in visual processing for action: the dorsal stream is assumed to support real-time actions, while the ventral stream facilitates memory-guided actions. However, recent evidence suggests a more integrated function of these streams. We investigated the neural dynamics and functional connectivity between them during memory-guided actions using intracranial EEG. We tracked neural activity in the inferior parietal lobule in the dorsal stream and the ventral temporal cortex in the ventral stream, as well as in the hippocampus, during a delayed action task involving object identity and location memory. We found increased alpha power in both streams during the delay, indicating their role in maintaining spatial visual information. In addition, we recorded increased alpha power in the hippocampus during the delay, but only when both object identity and location needed to be remembered. We also recorded an increase in theta band phase synchronization between the inferior parietal lobule and ventral temporal cortex and between the inferior parietal lobule and hippocampus during encoding and the delay. Granger causality analysis indicated dynamic and frequency-specific directional interactions among the inferior parietal lobule, ventral temporal cortex, and hippocampus that varied across task phases. Our study provides unique electrophysiological evidence for close interactions between dorsal and ventral streams, supporting an integrated processing model in which both streams contribute to memory-guided actions.
Funding: Support from Huawei Technologies Co., Ltd. [grant number YBN2020045132].
Abstract: In the visual ‘teach-and-repeat’ task, a mobile robot is expected to perform path following based on visual memory acquired along a route that it has traversed. Following a visually familiar route is also a critical navigation skill for foraging insects, which they accomplish robustly despite tiny brains. Inspired by the mushroom body structure in the insect brain and its well-understood associative learning ability, we develop an embodied model that can accomplish visual teach-and-repeat efficiently. Critical to the performance is steering the robot body reflexively based on the relative familiarity of left and right visual fields, eliminating the need for stopping and scanning regularly for optimal directions. The model is robust against noise in visual processing and motor control and can produce performance comparable to pure pursuit or visual localisation methods that rely heavily on the estimation of positions. The model is tested on a real robot and is also shown to be able to correct for significant intrinsic steering bias.
Abstract: Fig. 1. The GenomeSyn tool for visualizing genome synteny and characterizing structural variations. A: The first synteny visualization map shows the detailed information of two or three genomes and can display structural variations and other annotation information. B: The second type of visualization map is simple and shows only the synteny relationship between the chromosomes of two or three genomes. C: Multiplatform general GenomeSyn submission page, applicable to Windows, MAC and web platforms; other analysis files can be entered in the "other" option. The publisher would like to apologise for any inconvenience caused.
Funding: Fundamental Research Funds for the Central Universities, China (No. 2232021A-10); Shanghai Pujiang Program, China (No. 22PJ1423400).
Abstract: Visual entailment (VE) is a prototypical task in multimodal visual reasoning, where current methods frequently utilize large language models (LLMs) as the knowledge base to assist in answering questions. These methods rely heavily on the textual modality, which inherently cannot capture the full extent of information contained within images. We propose a context-aware visual entailment (CAVE) model, which introduces a novel aggregation module designed to extract high-level semantic features from images. This module integrates lower-level semantic image features into high-level visual tokens, formatting them similarly to text tokens so that they can serve as inputs for LLMs. The CAVE model compensates for the loss of image information and integrates it more effectively with textual comprehension. Additionally, the CAVE model incorporates a new input format and training methodology rooted in instruction tuning and in-context learning techniques. The objective of this research is to maximize the inherent logical reasoning capabilities of LLMs. Experimental results on the E-SNLIVE dataset show that the proposed CAVE model exhibits outstanding performance.
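Abstract: [Objective] Wind-power flanges come in finely divided categories and many specifications, with large diameters and many holes. This makes coordinate calculation for multi-hole machining laborious and data entry inefficient, and machining approaches based on polar coordinates, coordinate rotation, macro programs, or secondary development struggle to meet the actual production needs of flange manufacturers. An efficient solution to these problems is proposed. [Methods] A dedicated CAM system that is efficient, practical, and able to generate bolt-hole machining programs flexibly and quickly was developed on the Visual Studio 2022 platform. The system follows a modular design in which part information, machining parameters, and similar data are handled independently in their own modules, so that the system can be adjusted promptly when flange design standards change and can automatically generate bolt-hole machining programs for wind-power flanges of different specifications. [Results] The developed CAM system for wind-power flange bolt-hole machining achieves rapid automatic generation of multi-hole machining programs, significantly reducing the workload of NC programmers and improving the production efficiency of flange hole machining. [Conclusions] In future work, secondary development on the AutoCAD and NX platforms could draw on their strong 2D and 3D graphic design foundations to build a small-to-medium CAD/CAM system that integrates design and manufacturing for flange parts, meeting enterprises' evolving production management needs.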
Funding: Supported by the National Natural Science Foundation of China, No. 81874390 and No. 81573948; Shanghai Natural Science Foundation, No. 21ZR1464100; Science and Technology Innovation Action Plan of Shanghai Science and Technology Commission, No. 22S11901700; and the Shanghai Key Specialty of Traditional Chinese Clinical Medicine, No. shslczdzk01201.
Abstract: BACKGROUND: Esophageal and gastric variceal bleeding is a catastrophic complication of portal hypertension, most commonly caused by cirrhosis of various etiologies. Although a considerable body of research has been conducted in this area, the complexity of the disease and the lack of standardized treatment strategies have led to fragmented findings, insufficient information, and a lack of systematic investigation. Bibliometric analysis can help clarify research trends, identify core topics, and reveal potential future directions. Therefore, this study uses bibliometric methods to conduct an in-depth exploration of research progress in this field, with the expectation of providing new insights for both clinical practice and scientific research. AIM: To evaluate research trends and advancements in esophagogastric variceal bleeding (EGVB) over the past twenty years. METHODS: Relevant publications on EGVB were retrieved from the Web of Science Core Collection. VOSviewer, Pajek, CiteSpace, and the bibliometrix package were then employed to perform bibliometric visualizations of publication volume, countries, institutions, journals, authors, keywords, and citation counts. RESULTS: The analysis focused on original research articles and review papers. From 2004 to 2023, a total of 2097 records on EGVB were retrieved. The number of relevant publications has increased significantly over the past two decades, especially in China and the United States. The leading contributors in this field, in terms of countries, institutions, authors, and journals, were China, Assistance Publique-Hôpitaux de Paris, Bosch Jaime, and World Journal of Gastroenterology, respectively. Core keywords in this field include portal hypertension, management, liver cirrhosis, risk, prevention, and diagnosis. Future research directions may focus on optimizing diagnostic methods, personalized treatment, and multidisciplinary collaboration. CONCLUSION: Using bibliometric methods, this study reveals the developmental trajectory and trends in research on EGVB, underscoring risk assessment and diagnostic optimization as the core areas of current focus. The study provides a systematic perspective on this field, indicating that future research could center on multidisciplinary collaboration, personalized treatment approaches, and the development of new diagnostic tools. Moreover, this work offers practical research directions for both the academic community and clinical practice, driving continued advancement in this domain.
Funding: Supported by the Macao Science and Technology Development Fund (FDCT) (Nos. FDCT 0029/2021/A1, FDCT0002/2021/AKP, 004/2023/SKL, 0036/2021/APD), the University of Macao (No. MYRG-GRG2023-00034-IME, SRG2024-00057IME), the Dr. Stanley Ho Medical Development Foundation (No. SHMDF-OIRFS/2024/001), Zhuhai Huafa Group (No. HF-006-2021), and the Guangdong Science and Technology Department (No. 2022A0505030022).
Abstract: Rapid diagnosis of Salmonella is crucial for the effective control of food safety incidents, especially in regions with poor hygiene conditions. Polymerase chain reaction (PCR), a promising tool for Salmonella detection, lacks simple and fast sensing methods that are compatible with field applications in resource-limited areas. In this work, we developed a sensing approach to identify PCR-amplified Salmonella genomic DNA with the naked eye in a snapshot. Based on the ratiometric fluorescence signals from SYBR Green I and Hydroxyl naphthol blue, positive samples stood out from negative ones with a distinct color pattern under UV exposure. The proposed sensing scheme enabled highly specific identification of Salmonella with a detection limit at the single-copy level. In addition, as a supplement to the intuitive naked-eye visualization results, numerical analysis was available through a smartphone app that extracts RGB values from the colored images. This work provides a simple, rapid, and user-friendly solution for PCR identification, which shows great potential for molecular diagnosis of Salmonella and other pathogens in the field.
Abstract: Objective: The study of medicine formulas is a core component of traditional Chinese medicine (TCM), yet traditional learning methods often lack interactivity and contextual understanding, making it challenging for beginners to grasp the intricate composition rules of formulas. To address this gap, we introduce Formula-S, a situated visualization method for TCM formula learning in augmented reality (AR), and evaluate its performance. This study aims to evaluate the effectiveness of Formula-S in enhancing TCM formula learning for beginners by comparing it with traditional text-based formula learning and web-based visualization. Methods: Formula-S is an interactive AR tool designed for TCM formula learning, featuring three modes (3D, Web, and Table). The dataset included TCM formulas and herb properties extracted from authoritative references, including a textbook and the SymMap database. In Formula-S, the hierarchical visualization of formulas as herbal medicine compositions is linked to the multidimensional herb attribute visualization and embedded in the real world, where real herb samples are presented. To evaluate its effectiveness, a controlled study (n=30) was conducted. Participants who had no formal TCM knowledge were tasked with herbal medicine identification, formula composition, and recognition. In the study, participants interacted with the AR tool through HoloLens 2. Data were collected on both task performance (accuracy and response time) and user experience, with a focus on task efficiency, accuracy, and user preference across the different learning modes. Results: The situated visualization method of Formula-S had comparable accuracy to other methods but shorter response time for herbal formula learning tasks. Regarding user experience, the new approach demonstrated the highest system usability and lowest task load, effectively reducing cognitive load and allowing users to complete tasks with greater ease and efficiency. Participants reported that Formula-S enhanced their learning experience through its intuitive interface and immersive AR environment, suggesting this approach offers usability advantages for TCM education. Conclusions: The situated visualization method in Formula-S offers more efficient and accurate searching capabilities compared to traditional and web-based methods. Additionally, it provides superior contextual understanding of TCM formulas, making it a promising new solution for TCM learning.
Funding: National Natural Science Foundation of China (62111540271); Natural Science Foundation of Anhui Province (2308085MF207).
Abstract: The hierarchical and coordinated processing of visual information by the brain demonstrates its superior ability to minimize energy consumption and maximize signal transmission efficiency. Therefore, it is crucial to develop artificial visual synapses that integrate optical sensing and synaptic functions. This study leverages the excellent photoresponsivity of the PM6:Y6 system to construct a vertical photo-tunable organic memristor and investigates in depth its resistive switching performance, photodetection capability, and simulation of photo-synaptic behavior, showcasing its excellent performance in processing visual information and simulating neuromorphic behaviors. The device achieves stable and gradual resistance change, successfully simulating voltage-controlled long-term potentiation/depression (LTP/LTD), and exhibits various forms of photoelectric synergistic regulation of synaptic plasticity. Moreover, the device has successfully simulated the image perception and recognition functions of the human visual nervous system. The non-volatile Au/PM6:Y6/ITO memristor is used to model an artificial synapse and neuron in a hierarchical, coordinated SLP-CNN cascade neural network trained for visual image recognition. Its linearly tunable photoconductivity serves as the weight update of the network, achieving a recognition accuracy of up to 93.4%. Compared with the single-layer visual target recognition model, this scheme improves the recognition accuracy by 19.2%.
Abstract: The year 2024 marks the 60th anniversary of Title IX and 25 years since the New York Times revealed bias against female faculty members at the Massachusetts Institute of Technology. We take an opportunity here to examine the state of gender bias in a relatively new yet already prominent field, neural regeneration in the visual system, for which there is a well-defined context useful for this purpose. The National Eye Institute (NEI) provided the first round of research funding for its Audacious Goals Initiative (AGI) on visual neural regeneration in 2013 and the last round in 2021. Therefore, we focus on this timespan. Data sources included PubMed, the National Science Foundation (NSF), the NEI, the Blue Ridge Institute for Medical Research, and data from the major professional organization for eye and vision research, the Association for Research in Vision and Ophthalmology (ARVO).