Funding: Supported by the Foundation Strengthening Project (2021-JCJQ-JJ-1042) and the Foundation of National Key Laboratory of Human Factors Engineering (6142222210301).
Abstract: To compare the effects of visual and auditory instructions on crew members guided through procedural tasks in the space station, this study recruited subjects to complete a programmed task: starting from the node module, locating the scientific cabinet and the spectrometer, and finally operating the orbital replaceable unit on the spectrometer. Task performance, eye movement parameters, and the cognitive load induced by the 2 kinds of instructions were statistically analyzed. The results showed highly significant differences between the 2 instructions in task completion time, NASA-TLX (Task Load Index) total score, and eye movement index (P < 0.01). There were also significant differences in error rate and effort (P < 0.05). This study proves that visual instruction interaction is better than auditory instruction. Our work provides an important reference for selecting the human–computer interaction mode for procedural tasks on space stations. It also provides the experience and theoretical evidence missing so far and proves the benefits of augmented reality assistance in terms of task performance and human factors.
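For readers unfamiliar with the NASA-TLX total score mentioned above, the following is a minimal sketch of how such a score is commonly computed. The abstract does not state whether the raw or the weighted variant was used, so both are shown; the function names and the sample ratings and weights are hypothetical illustrations, not data or code from the study.

```python
# The six NASA-TLX subscales, each rated on a 0-100 scale.
SUBSCALES = ("mental", "physical", "temporal", "performance", "effort", "frustration")


def raw_tlx(ratings):
    """Raw TLX: the unweighted mean of the six subscale ratings."""
    return sum(ratings[s] for s in SUBSCALES) / len(SUBSCALES)


def weighted_tlx(ratings, weights):
    """Weighted TLX: weights are tallies from 15 pairwise comparisons, so they sum to 15."""
    if sum(weights[s] for s in SUBSCALES) != 15:
        raise ValueError("pairwise-comparison weights must sum to 15")
    return sum(ratings[s] * weights[s] for s in SUBSCALES) / 15


# Hypothetical ratings and weights for one participant (illustrative only).
ratings = {"mental": 60, "physical": 30, "temporal": 45,
           "performance": 20, "effort": 55, "frustration": 30}
weights = {"mental": 5, "physical": 1, "temporal": 3,
           "performance": 2, "effort": 3, "frustration": 1}

print(raw_tlx(ratings))
print(weighted_tlx(ratings, weights))
```

Per-participant totals computed this way for each instruction type could then be compared with whichever statistical test suits the experimental design.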
Funding: Support of the "Priority 2030" program at the Bauman Moscow State Technical University. O.L.A. and M.V.S. acknowledge the financial support of the Ministry of Science and Higher Education of the Russian Federation grant (Agreement dated 06.03.2024, number 075-02-2024-1519) for the experimental research, carried out using the infrastructure of the Educational Design Center for Opto- and Microelectronics of the Bauman Moscow State Technical University; support of the Ministry of Science and Higher Education of the Russian Federation (Passport No. 2019-0903).
Abstract: This review considers the modern industrial applications of augmented reality headsets. It draws upon a synthesis of information from open sources and company press releases, as well as the first-hand experiences of industry representatives. The research also incorporates insights from industry-specific events and in-depth discussions with skilled professionals. A specific focus is placed on the ergonomic characteristics of headsets: image quality, user-friendliness, etc. To provide an objective evaluation of the various headsets, a metric is proposed that depends on the specific application case. This enables a comprehensive comparison of the devices in terms of their quantitative characteristics, which is of particular importance for a rapidly developing industry.
Funding: Supported by the National Key R&D Program of China (Nos. 2022ZD0160102 and 2022ZD0161300) and the National Natural Science Foundation of China (Nos. 62376134 and 62372223).
Abstract: Multi-modal large language models (MLLMs) have demonstrated impressive performance in vision-language tasks across a wide range of domains. However, the large model scale and associated high computational cost pose significant challenges for training and deploying MLLMs on consumer-grade GPUs or edge devices, thereby hindering their widespread application. In this work, we introduce Mini-InternVL, a series of MLLMs with parameters ranging from 1 billion to 4 billion, which achieve 90% of the performance with only 5% of the parameters. This significant improvement in efficiency and effectiveness makes our models more accessible and applicable in various real-world scenarios. To further promote the adoption of our models, we are developing a unified adaptation framework for Mini-InternVL, which enables our models to transfer to and outperform specialized models in downstream tasks, including autonomous driving, medical image processing, and remote sensing. We believe that our models can provide valuable insights and resources to advance the development of efficient and effective MLLMs.