Human-computer interactions constitute an important subject for the development and popularization of information technologies, as they are not only an important frontier technology in computer science but also an important auxiliary technology in virtual reality (VR). In recent years, Chinese researchers have made significant advances in human-computer interactions. To systematically display China's latest advances in human-computer interactions and thus provide an impetus for the development of VR and other related fields, we have solicited articles for this special issue from experts in this area, who also participated in the review process. The following articles have been selected for publication in this special issue.
In this paper, we investigate methodologies to improve direct-touch interaction with invisible and intangible spatial input. We first discuss the motivation for seeking a new input method for whole-body interaction and why it can be meaningful, and we describe the role spatial interaction can play in increasing a user's freedom of interaction. We propose a method of space-centered interaction using invisible and intangible spatial inputs. However, given the lack of tactile feedback and visual representation, direct-touch interaction with such input can be confusing. As a step toward understanding the causes of and solutions to this problem, we conducted two user experiments. In the first, we tested five helper setups that convey the location of the input by constraining the dimension in which it is located. The results show that markers on the ground, combined with a relationship to the height of the user's body, significantly improve performance on the locative task. In the second experiment, we created a dancing game using invisible and intangible spatial inputs and stress-tested the findings of the first experiment in this cognitively demanding context. The results show that the same helper setup still performs very well in that context.
With the development of virtual reality (VR) and human-computer interaction technology, how to interact naturally and efficiently in virtual environments has become a hot research topic. Gesture is one of the most important human communication methods and can effectively express users' demands. Over the past few decades, gesture-based interaction has made significant progress. This article focuses on gesture interaction technology: it discusses the definition and classification of gestures, input devices for gesture interaction, and gesture recognition technology. The application of gesture interaction in virtual reality is studied, existing problems in current gesture interaction are summarized, and future developments are discussed.
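As a minimal, hypothetical illustration of the kind of gesture recognition surveyed above, the sketch below classifies a "pinch" from hand landmarks. The 21-point landmark convention (index 4 = thumb tip, 8 = index tip) and the distance threshold are assumptions for this example, not taken from the article.

```python
import math

# Hypothetical hand landmarks as (x, y, z) tuples, normalized to [0, 1].
# Indices follow the common 21-point convention (4 = thumb tip, 8 = index tip).
def detect_pinch(landmarks, threshold=0.05):
    """Classify a 'pinch' gesture from the thumb-index fingertip distance."""
    tx, ty, tz = landmarks[4]
    ix, iy, iz = landmarks[8]
    dist = math.sqrt((tx - ix) ** 2 + (ty - iy) ** 2 + (tz - iz) ** 2)
    return dist < threshold

# Fingertips touching -> pinch; fingertips far apart -> no pinch.
touching = [(0.0, 0.0, 0.0)] * 21
touching[4] = (0.50, 0.50, 0.0)
touching[8] = (0.51, 0.50, 0.0)
apart = list(touching)
apart[8] = (0.80, 0.20, 0.0)
print(detect_pinch(touching), detect_pinch(apart))  # True False
```

Real systems replace the hand-written rule with learned classifiers over such landmark features, but the feature-then-decision structure is the same.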
Background In recent years, the demand for interactive photorealistic three-dimensional (3D) environments has increased in various fields, including architecture, engineering, and entertainment. However, balancing quality and efficiency in high-performance 3D applications and virtual reality (VR) remains challenging. Methods This study addresses this issue by revisiting and extending view interpolation for image-based rendering (IBR), which enables the exploration of spacious open environments in 3D and VR. We introduce multimorphing, a novel rendering method based on a spatial data structure of 2D image patches called the image graph. Using this approach, novel views can be rendered with up to six degrees of freedom using only a sparse set of views. The rendering process requires neither a 3D reconstruction of the geometry nor per-pixel depth information; all relevant data for the output are extracted from the local morphing cells of the image graph. The detection of parallax image regions during preprocessing reduces rendering artifacts by extrapolating image patches from adjacent cells in real time. In addition, a GPU-based solution is presented to resolve exposure inconsistencies within a dataset, enabling seamless brightness transitions when moving between areas with varying light intensities. Results Experiments on multiple real-world and synthetic scenes demonstrate that the presented method achieves high "VR-compatible" frame rates, even on mid-range and legacy hardware. While achieving adequate visual quality even for sparse datasets, it outperforms other IBR and current neural rendering approaches. Conclusions Using the correspondence-based decomposition of input images into morphing cells of 2D image patches, multidimensional image morphing provides high-performance novel view generation, supporting open 3D and VR environments. Nevertheless, the handling of morphing artifacts in parallax image regions remains a topic for future research.
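The view-interpolation idea underlying multimorphing can be loosely illustrated by the blending step alone. The sketch below is our simplification: real multimorphing warps 2D image patches along correspondences inside morphing cells, whereas here two already-aligned views are simply cross-dissolved.

```python
import numpy as np

# Toy view interpolation by weighted blending: a novel view between two
# captured views is approximated as a convex combination of the (here,
# pre-aligned) neighbouring images. Real multimorphing additionally warps
# patches along correspondences; this sketch shows only the blending step.
def interpolate_views(view_a, view_b, t):
    """t = 0 returns view_a, t = 1 returns view_b."""
    return (1.0 - t) * view_a + t * view_b

a = np.zeros((2, 2), dtype=float)   # dark view
b = np.full((2, 2), 200.0)          # bright view
mid = interpolate_views(a, b, 0.5)  # halfway view
print(mid[0, 0])  # 100.0
```

Per-patch weights (rather than one global `t`) and correspondence-based warping are what allow such blending to survive parallax between views.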
With the advancement of computer vision techniques in surveillance systems, more proficient, intelligent, and sustainable facial expression and age recognition is needed. The main purpose of this study is to develop a facial expression and age recognition system capable of error-free recognition of human expression and age in both indoor and outdoor environments. The proposed system first takes an input image, pre-processes it, and then detects faces in the entire image. Landmark localization then supports the formation of a synthetic face mask prediction. A novel set of features is extracted and passed to a classifier for accurate classification of expression and age group. The proposed system is tested on two benchmark datasets, namely the Gallagher collection person dataset and the Images of Groups dataset, and achieves remarkable results on both in terms of recognition accuracy and computational time. The proposed system is also applicable in consumer application domains such as online business negotiations, consumer behavior analysis, e-learning environments, and emotion robotics.
Pen-based user interfaces, which leverage the affordances of the pen, provide users with more flexible and natural interaction. However, it is difficult to construct usable pen-based user interfaces because of the lack of support for their development. Toolkit-level support has been exploited to solve this problem, but this approach makes it hard to achieve platform independence, easy maintenance, and easy extension. In this paper a context-aware infrastructure, called WEAVER, is created to provide pen interaction services for both novel pen-based applications and legacy GUI-based applications. WEAVER aims to support the pen as another standard interactive device alongside the keyboard and mouse, and to present a high-level access interface to pen input. It employs application context to tailor its service to different applications. By modeling the application context and registering the relevant action adapters, WEAVER can offer services such as gesture recognition, continuous handwriting, and other fundamental ink manipulations to appropriate applications. One of the distinct features of WEAVER is that off-the-shelf GUI-based software packages can easily be enhanced with pen interaction without modifying existing code. In this paper, the architecture and components of WEAVER are described, and examples and feedback from its use are presented.
Background Owing to recent advances in virtual reality (VR) technologies, effective user interaction with dynamic content in 3D scenes has become a research hotspot. Moving-target selection is a basic interactive task, and research on user performance in such tasks is significant for user interface design in VR. Unlike the well-studied static case, moving-target selection in VR is affected by changes in target speed, angle, and size, and some of these key factors remain under-researched. Methods This study designed an experimental scenario in which users play badminton in VR. By varying seven modality cues (vision, audio, haptics, and their combinations), five movement speeds, and four serving angles, the effects of these factors on performance and subjective experience in moving-target selection in VR were studied. Results The results show that the moving speed of the shuttlecock has a significant impact on user performance. The serving angle has a significant impact on hitting rate but not on hitting distance. Under combined modalities, user performance on the moving-target task is driven mainly by vision, and adding additional modalities can improve performance. Although hitting distance increased in the trimodal condition, the hitting rate decreased. Conclusion This study analyses user performance and subjective perception, and provides suggestions for combining modality cues in different scenarios.
Background Crossing-based target selection can attain lower error rates and higher interaction speed in some cases. Most research on target selection focuses on analyzing interaction results. Moreover, because trajectories play a much more important role in crossing-based target selection than in other interaction techniques, an accurate trajectory model can help designers predict interaction results during the selection process rather than only at its end. Methods In this paper, a trajectory prediction model for crossing-based target selection tasks is proposed with reference to dynamic model theory. Results Simulation results demonstrate that our model predicts trajectories, endpoints, and hitting time well for target-selection motion, with average errors of 17.28%, 2.73 mm, and 11.50%, respectively.
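The abstract does not specify the underlying dynamic model, so purely as a stand-in illustration, the sketch below drives a pointer toward a target with a critically damped second-order (spring-damper) system; the model choice and all parameters are our assumptions, not the authors' method.

```python
# Illustrative stand-in for a dynamic trajectory model: the pointer is pulled
# toward the target by a critically damped spring-damper system, integrated
# with semi-implicit Euler steps. Not the paper's actual model.
def simulate_trajectory(start, target, omega=8.0, dt=0.01, steps=100):
    pos, vel = start, 0.0
    path = [pos]
    for _ in range(steps):
        # critical damping: damping coefficient = 2 * omega
        acc = omega * omega * (target - pos) - 2.0 * omega * vel
        vel += acc * dt
        pos += vel * dt
        path.append(pos)
    return path

path = simulate_trajectory(0.0, 100.0)      # 100 mm reach over 1 s
print(round(path[-1], 2))  # settles near the 100 mm target
```

A model of this shape yields a full predicted trajectory at every time step, which is what enables endpoint and hitting-time prediction before the motion completes.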
Federated learning is an emerging privacy-preserving distributed learning paradigm in which many clients collaboratively train a shared global model under the orchestration of a remote server. Most current work on federated learning has focused on fully supervised settings, assuming that all data are annotated with ground-truth labels. This work instead considers a more realistic and challenging setting, Federated Semi-Supervised Learning (FSSL), where clients hold a large amount of unlabeled data and only the server hosts a small number of labeled samples. How to reasonably utilize the server-side labeled data and the client-side unlabeled data is the core challenge in this setting. In this paper, we propose a new FSSL algorithm for image classification based on consistency regularization and ensemble knowledge distillation, called EKDFSSL. Our algorithm uses the global model as the teacher in consistency regularization to enhance both the accuracy and stability of client-side unsupervised learning on unlabeled data. In addition, we introduce an ensemble knowledge distillation loss to mitigate model overfitting during server-side retraining on labeled data. Extensive experiments on several image classification datasets show that EKDFSSL outperforms current baseline methods.
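As a loose sketch of the two loss terms named in this abstract, the snippet below shows confidence-thresholded pseudo-labeling from a global-model teacher and distillation toward an averaged client ensemble. Function names, the threshold, and all numbers are illustrative assumptions, not the paper's code.

```python
import numpy as np

def cross_entropy(p, q, eps=1e-12):
    return -np.sum(p * np.log(q + eps))

def consistency_loss(teacher_probs, student_probs, threshold=0.9):
    """Only confident global-model (teacher) predictions yield a pseudo-label loss."""
    if teacher_probs.max() < threshold:
        return 0.0
    pseudo = np.eye(len(teacher_probs))[teacher_probs.argmax()]  # one-hot label
    return cross_entropy(pseudo, student_probs)

def ensemble_distillation_loss(client_probs_list, server_probs):
    """Distil the averaged client ensemble into the server model's output."""
    ensemble = np.mean(client_probs_list, axis=0)
    return cross_entropy(ensemble, server_probs)

teacher = np.array([0.95, 0.03, 0.02])   # confident global-model output
student = np.array([0.70, 0.20, 0.10])
print(round(consistency_loss(teacher, student), 3))  # 0.357 (= -log 0.7)
```

The thresholding is what stabilizes client-side training: uncertain teacher outputs contribute nothing rather than a noisy target.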
This article describes an immersive virtual reality tool for reconstructing root system architectures from 3D scans of soil columns. In practical scenarios, experimental conditions are adapted to fit the needs of the data analysis pipeline, including sieving and drying the soil before scanning. Motivated by previous reports of automatic systems failing to reproduce what experts would annotate, we developed a virtual reality system to assist with root system extraction in cases where automated approaches fall short of expert knowledge. The aim of the present study is to evaluate whether our immersive method is superior to classical annotation approaches when tested on synthetic datasets with untrained participants. Our laboratory user study evaluates participants' root extractions along with their ratings on central user experience and usability measures. We show significant improvement in F1 score across conditions (noisy or clean data) as well as improved usability. Our study highlights that using virtual reality for root extraction improves accuracy, and we provide an in-depth evaluation of the biases that occur when users trace roots in soil volumes.
Many bottlenecks limit the computing power of mobile Web3D and must be addressed before a public fire evacuation system can be implemented on this platform. In this study, we focus on three key problems: (1) the scene data for large-scale building information modeling (BIM) are huge, making them difficult to transmit via the Internet and visualize on the Web; (2) the raw Fire Dynamics Simulator (FDS) smoke diffusion data are also very large, making them likewise difficult to transmit and visualize; and (3) a smart AI fire evacuation app for the public must be accurate and real-time. To address these problems, we propose the following solutions: (1) the large-scale scene model is made lightweight; (2) the dynamic smoke data are also made lightweight; and (3) dynamic obstacle maps built from the scene model and smoke data are used for optimal path planning with a heuristic method. We propose a real-time fire evacuation system based on an ant colony optimization algorithm (RFES-ACO) that reuses dynamic pheromones. Simulation results show that the public could use mobile Web3D devices to experience fire evacuation drills smoothly in real time. The real-time fire evacuation system (RFES) is efficient, and its evacuation rate is better than those of two baseline algorithms, the leader-follower fire evacuation algorithm and the random fire evacuation algorithm.
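As a toy illustration of ant-colony path search over an obstacle map with pheromone reuse across iterations, consider the sketch below. The grid, parameters, and update rule are assumptions for the example, not the paper's RFES-ACO implementation.

```python
import random

# Toy ant-colony path search on a small grid (0 = free, 1 = obstacle).
# Pheromone is kept between iterations (evaporated, then reinforced along
# the best path), loosely mirroring the reuse idea described above.
def aco_path(grid, start, goal, ants=30, iters=30, evap=0.5, seed=1):
    rng = random.Random(seed)
    rows, cols = len(grid), len(grid[0])
    pher = {}   # pheromone per cell, reused across iterations
    best = None
    for _ in range(iters):
        for _ in range(ants):
            pos, path, seen = start, [start], {start}
            while pos != goal and len(path) < rows * cols:
                r, c = pos
                moves = [(r + dr, c + dc)
                         for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                         if 0 <= r + dr < rows and 0 <= c + dc < cols
                         and grid[r + dr][c + dc] == 0
                         and (r + dr, c + dc) not in seen]
                if not moves:
                    break
                # pheromone-weighted random step
                pos = rng.choices(moves, weights=[pher.get(m, 1.0) for m in moves])[0]
                path.append(pos)
                seen.add(pos)
            if pos == goal and (best is None or len(path) < len(best)):
                best = path
        # evaporate, then deposit pheromone along the best path so far
        pher = {cell: p * evap for cell, p in pher.items()}
        if best:
            for cell in best:
                pher[cell] = pher.get(cell, 1.0) + 1.0 / len(best)
    return best

grid = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
path = aco_path(grid, (0, 0), (2, 3))
print(len(path))  # 6: the shortest route around the obstacle block
```

In a dynamic setting, the obstacle map would be rebuilt as smoke spreads, while the retained pheromone biases the search toward previously good corridors.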
Activity recognition is a core aspect of ubiquitous computing applications. To deploy activity recognition systems in the real world, we need simple sensing systems with lightweight computational modules that accurately analyze sensed data. In this paper, we propose a simple method to recognize human activities using information about the objects involved in those activities. We apply activity theory to represent complex human activities and propose a penalized naive Bayes classifier for activity recognition. Our results show that our method reduces computation by up to an order of magnitude in both learning and inference without penalizing accuracy, compared to hidden Markov models and conditional random fields.
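The penalized naive Bayes idea can be sketched as follows: objects an activity's model explains contribute their log-likelihood, while observed objects it does not explain incur a fixed log-penalty. All activities, probabilities, and the penalty value below are illustrative, not the paper's data.

```python
import math

# Hypothetical object-likelihood tables for two activities.
object_given_activity = {
    "make_tea":   {"kettle": 0.5, "cup": 0.4, "spoon": 0.1},
    "make_toast": {"toaster": 0.6, "plate": 0.3, "knife": 0.1},
}
prior = {"make_tea": 0.5, "make_toast": 0.5}
PENALTY = math.log(1e-3)  # cost of an observed object the activity can't explain

def classify(observed_objects):
    scores = {}
    for act, likelihood in object_given_activity.items():
        score = math.log(prior[act])
        for obj in observed_objects:
            score += math.log(likelihood[obj]) if obj in likelihood else PENALTY
        scores[act] = score
    return max(scores, key=scores.get)

print(classify(["kettle", "cup"]))     # make_tea
print(classify(["toaster", "knife"]))  # make_toast
```

Because scoring is a single pass over observed objects per activity, inference stays linear in the observation length, in contrast to the sequence decoding required by HMMs or CRFs.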
The challenge of coping with non-frontal head poses during facial expression recognition results in considerable reductions of accuracy and robustness when capturing expressions that occur during natural communication. In this paper, we attempt to recognize facial expressions under poses with large rotation angles from 2D videos. A depth-patch-based 4D expression representation model is proposed. It is reconstructed from 2D dynamic images to delineate continuous spatial changes and temporal context in non-frontal cases. Furthermore, we present an effective deep neural network classifier that can accurately capture pose-variant expression features from the depth patches and recognize non-frontal expressions. Experimental results on the BU-4DFE database show that the proposed method achieves a high recognition accuracy of 86.87% for non-frontal facial expressions with head rotation angles of up to 52°, outperforming existing methods. We also present a quantitative analysis of the components contributing to the performance gain through tests on the BU-4DFE and Multi-PIE datasets.
Conventional vision-based systems, such as cameras, have demonstrated enormous versatility in sensing human activities and developing interactive environments. However, these systems have long been criticized for incurring privacy, power, and latency issues due to their underlying structure of pixel-wise analog signal acquisition, computation, and communication. In this research, we overcome these limitations by introducing in-sensor analog computation through the distribution of interconnected photodetectors in space, each with a weighted responsivity, to create what we call a computational photodetector. Computational photodetectors can extract mid-level vision features as a single continuous analog signal measured via a two-pin connection. We develop computational photodetectors using thin and flexible low-noise organic photodiode arrays coupled with a self-powered wireless system, and demonstrate a set of designs that capture position, orientation, direction, speed, and identification information in a range of applications, from explicit interactions on everyday surfaces to implicit activity detection.
Emotion plays a crucial role in gratifying users' needs during their experience of movies and TV series, and may be underutilized as a framework for exploring and analyzing video content. In this paper, we present EmotionMap, a novel way of presenting emotion to everyday users as 2D geography, fusing spatio-temporal information with emotional data. The interface is composed of novel, interconnected visualization elements that facilitate video content exploration, understanding, and searching. EmotionMap conveys the overall emotion at a glance while also giving rapid access to details. First, we developed EmotionDisc, an effective tool for collecting audiences' emotions based on emotion representation models. We collected audience and character emotional data, and then applied the metaphor of a map to visualize video content and emotion in a hierarchical structure. EmotionMap incorporates sketch interaction, providing a natural approach for users' active exploration. The novelty and effectiveness of EmotionMap are demonstrated by a user study and experts' feedback.
Funding (gesture interaction survey): National Key Research and Development Program (2016YFB1001405); Key Research Program of Frontier Sciences (QYZDY-SSW-JSC041); Chinese Academy of Sciences Hundred Talents Program; National Natural Science Foundation of China (61572479).
Funding (multimorphing study): Supported by the Bavarian Academic Forum (BayWISS), as part of the joint academic partnership digitalization program.
Funding (facial expression and age recognition study): Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (No. 2018R1D1A1A02085645); Korea Medical Device Development Fund grant funded by the Korean government (Ministry of Science and ICT; Ministry of Trade, Industry and Energy; Ministry of Health & Welfare; Ministry of Food and Drug Safety) (Project No. 202012D05-02).
Funding (WEAVER study): This research was initiated in a project supported by the National High-Tech Research and Development Program of China (863 Program) and a Japanese research project.
Funding (moving-target selection study): National Key Research and Development Program (2016YFB1001405); Key Research Program of Frontier Sciences (QYZDY-SSW-JSC041); National Natural Science Foundation of China (61802379).
Funding (trajectory prediction study): National Key R&D Program of China (2016YFB1001405); National Natural Science Foundation of China (61802379); Key Research Program of Frontier Sciences, CAS (QYZDY-SSW-JSC041).
Funding: Supported by the National Natural Science Foundation of China (Nos. 62032017 and 62272368), the Key Talent Project of Xidian University (No. QTZX24004), the Innovation Capability Support Program of Shaanxi (No. 2023-CX-TD-08), the Shaanxi Qinchuangyuan "Scientists + Engineers" Team (No. 2023KXJ-040), and the Science and Technology Program of Xi'an (No. 23KGDW0005-2022).
Abstract: Federated learning is an emerging privacy-preserving distributed learning paradigm in which many clients collaboratively train a shared global model under the orchestration of a remote server. Most current work on federated learning has focused on fully supervised settings, assuming that all data are annotated with ground-truth labels. This work considers a more realistic and challenging setting, Federated Semi-Supervised Learning (FSSL), in which clients hold a large amount of unlabeled data and only the server hosts a small number of labeled samples. How to reasonably utilize the server-side labeled data and the client-side unlabeled data is the core challenge in this setting. In this paper, we propose a new FSSL algorithm for image classification based on consistency regularization and ensemble knowledge distillation, called EKDFSSL. Our algorithm uses the global model as the teacher in consistency regularization to enhance both the accuracy and stability of client-side unsupervised learning on unlabeled data. In addition, we introduce an ensemble knowledge distillation loss to mitigate model overfitting during server-side retraining on labeled data. Extensive experiments on several image classification datasets show that EKDFSSL outperforms current baseline methods.
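The two loss terms named in the abstract can be illustrated in miniature. This is a hedged sketch, not the paper's specification: we assume a FixMatch-style confidence-thresholded pseudo-label for the consistency term and a KL-divergence term for ensemble distillation; the function names, the 0.95 threshold, and the exact loss forms are our own illustrative choices.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def consistency_loss(teacher_logits, student_logits, threshold=0.95):
    """Consistency regularization with the global model as teacher:
    if the teacher is confident on an unlabeled sample, train the
    student toward the teacher's pseudo-label (cross-entropy);
    otherwise the sample contributes nothing."""
    probs = softmax(teacher_logits)
    conf = max(probs)
    if conf < threshold:
        return 0.0  # low-confidence unlabeled sample is skipped
    pseudo = probs.index(conf)
    student_probs = softmax(student_logits)
    return -math.log(student_probs[pseudo] + 1e-12)

def ensemble_distillation_loss(client_logits_list, server_logits):
    """KL(ensemble || server): keep the retrained server model close to
    the averaged soft predictions of the client models, limiting
    overfitting on the small labeled set."""
    client_probs = [softmax(l) for l in client_logits_list]
    k = len(client_probs[0])
    ensemble = [sum(p[i] for p in client_probs) / len(client_probs)
                for i in range(k)]
    server_probs = softmax(server_logits)
    return sum(ensemble[i] * math.log(ensemble[i] / (server_probs[i] + 1e-12) + 1e-12)
               for i in range(k))
```

A confident teacher yields a small positive cross-entropy on a matching student, a uniform teacher yields zero, and the distillation term vanishes when the server already matches the client ensemble.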
Funding: The authors would like to acknowledge funding provided by the German government to the Gauss Centre for Supercomputing via the InHPC-DE project (01-H17001). This work has been partly funded by the EUROCC2 project, funded by the European High-Performance Computing Joint Undertaking (JU) and EU/EEA states under grant agreement No. 101101903; by the German Research Foundation under Germany's Excellence Strategy, EXC-2070-390732324-PhenoRob; and by the German Federal Ministry of Education and Research (BMBF) in the framework of the funding initiative "Soil as a Sustainable Resource for the Bioeconomy - BonaRes", project BonaRes (Module A): Sustainable Subsoil Management - Soil3, subproject 3 (grant 031B1066C).
Abstract: This article describes an immersive virtual reality reconstruction tool for root system architectures from 3D scans of soil columns. In practical scenarios, experimental conditions are adapted to fit the needs of the data analysis pipeline, including sieving and drying the soil before scanning. Motivated by previous reports that automatic systems do not reproduce what experts would annotate, we developed a virtual reality system to assist the extraction of root systems in cases where automated approaches fall short of expert knowledge. The aim of the present study is to evaluate whether our immersive method is superior to classical annotation approaches when tested on synthetic datasets with untrained participants. Our laboratory user study evaluates participants' root extractions along with their ratings on central user experience and usability measures. We show a significant improvement in F1 score across conditions (noisy or clear data) as well as improved usability. Our study highlights that using virtual reality for root extraction improves accuracy, and we perform an in-depth evaluation of the biases that occur when users trace roots in soil volumes.
Funding: Project supported by the Key Research Projects of the Central University of Basic Scientific Research Funds for Cross Cooperation, China (No. 201510-02), the Research Fund for the Doctoral Program of Higher Education, China (No. 2013007211-0035), and the Key Project in Science and Technology of Jilin Province, China (No. 20140204088GX).
Abstract: Many bottlenecks limit the computing power of Mobile Web3D, and they must be resolved before a public fire evacuation system can be implemented on this platform. In this study, we focus on three key problems: (1) the scene data for large-scale building information modeling (BIM) are huge, making them difficult to transmit via the Internet and visualize on the Web; (2) the raw fire dynamics simulator (FDS) smoke diffusion data are also very large, making them extremely difficult to transmit via the Internet and visualize on the Web; (3) a smart, artificial-intelligence fire evacuation app for the public should be accurate and real-time. To address these problems, the following solutions are proposed: (1) the large-scale scene model is made lightweight; (2) the dynamic smoke data are also made lightweight; (3) dynamic obstacle maps established from the scene model and smoke data are used for optimal path planning with a heuristic method. We propose a real-time fire evacuation system based on the ant colony optimization (RFES-ACO) algorithm with reused dynamic pheromones. Simulation results show that the public can use Mobile Web3D devices to experience fire evacuation drills smoothly in real time. The real-time fire evacuation system (RFES) is efficient, and its evacuation rate is better than those of two other algorithms, the leader-follower fire evacuation algorithm and the random fire evacuation algorithm.
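The path-planning core described above can be sketched as a plain ant colony search on a grid whose blocked cells stand in for walls or smoke. This is a minimal illustration under our own assumptions: the function name, the parameter values, and the simple evaporate-and-deposit update are ours, and the paper's reused-pheromone scheduling and heuristic weighting are not reproduced here.

```python
import random

def aco_evacuate(grid, start, exit_, n_ants=30, n_iters=40,
                 evaporation=0.5, alpha=1.0, seed=0):
    """Minimal ant colony search for an evacuation path on a grid.
    grid[y][x] == 1 marks a blocked cell (e.g. a wall or smoke-filled
    cell from the dynamic obstacle map); pheromone is kept per cell
    across iterations. Returns the shortest successful path found."""
    rng = random.Random(seed)
    h, w = len(grid), len(grid[0])
    tau = [[1.0] * w for _ in range(h)]  # pheromone per cell
    best = None
    for _ in range(n_iters):
        paths = []
        for _ in range(n_ants):
            pos, path, seen = start, [start], {start}
            while pos != exit_ and len(path) < h * w:
                x, y = pos
                moves = [(x + dx, y + dy)
                         for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                         if 0 <= x + dx < w and 0 <= y + dy < h
                         and grid[y + dy][x + dx] == 0
                         and (x + dx, y + dy) not in seen]
                if not moves:
                    break  # ant is trapped; discard this walk
                weights = [tau[my][mx] ** alpha for mx, my in moves]
                pos = rng.choices(moves, weights=weights)[0]
                path.append(pos)
                seen.add(pos)
            if pos == exit_:
                paths.append(path)
                if best is None or len(path) < len(best):
                    best = path
        # evaporate, then deposit pheromone inversely proportional to length
        tau = [[t * (1 - evaporation) for t in row] for row in tau]
        for path in paths:
            for (x, y) in path:
                tau[y][x] += 1.0 / len(path)
    return best
```

On an open 4x4 grid this converges quickly toward a shortest corner-to-corner route; in the full system the grid would be refreshed each time the lightweight smoke data update the obstacle map.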
Funding: Supported by the Korea Research Foundation under Grant No. KRF-2008-357-D00221.
Abstract: Activity recognition is a core aspect of ubiquitous computing applications. To deploy activity recognition systems in the real world, we need simple sensing systems with lightweight computational modules that accurately analyze sensed data. In this paper, we propose a simple method to recognize human activities using information about the objects involved in those activities. We apply activity theory to represent complex human activities and propose a penalized naive Bayes classifier for performing activity recognition. Our results show that our method reduces computation by up to an order of magnitude in both learning and inference without penalizing accuracy, compared to hidden Markov models and conditional random fields.
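The abstract does not specify the exact penalty, so the following is one plausible sketch of a penalized naive Bayes over object observations: a Bernoulli naive Bayes in which the log-likelihood of each expected-but-absent object is weighted by a `penalty` factor. The function names, the smoothing scheme, and the penalty form are our assumptions for illustration.

```python
import math

def train(examples, smoothing=1.0):
    """Fit per-activity object likelihoods with Laplace smoothing.
    examples: list of (activity, set_of_objects_observed)."""
    vocab = sorted({o for _, objs in examples for o in objs})
    acts = sorted({a for a, _ in examples})
    n_act = {a: 0 for a in acts}
    counts = {a: {o: 0 for o in vocab} for a in acts}
    for a, objs in examples:
        n_act[a] += 1
        for o in objs:
            counts[a][o] += 1
    n = len(examples)
    return {
        "vocab": vocab,
        "prior": {a: n_act[a] / n for a in acts},
        "p_obj": {a: {o: (counts[a][o] + smoothing) / (n_act[a] + 2 * smoothing)
                      for o in vocab} for a in acts},
    }

def classify(model, observed, penalty=2.0):
    """Penalized naive Bayes: standard Bernoulli scoring for observed
    objects, with the contribution of each *missing* object weighted
    by `penalty` (a hypothetical weighting for illustration)."""
    best_act, best_score = None, -math.inf
    for a, p_prior in model["prior"].items():
        score = math.log(p_prior)
        for o in model["vocab"]:
            p = model["p_obj"][a][o]
            if o in observed:
                score += math.log(p)
            else:
                score += penalty * math.log(1 - p)  # penalize absent evidence
        if score > best_score:
            best_act, best_score = a, score
    return best_act
```

Because scoring is a single pass over the object vocabulary, both training and inference stay linear in the number of objects, which is the lightweight-computation property the abstract emphasizes.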
Funding: This work was supported by the National Key Research and Development Program of China under Grant No. 2016YFB1001405, and the National Natural Science Foundation of China under Grant Nos. 61232013, 61422212, and 61661146002.
Abstract: Coping with non-frontal head poses during facial expression recognition considerably reduces accuracy and robustness when capturing expressions that occur during natural communication. In this paper, we attempt to recognize facial expressions under poses with large rotation angles from 2D videos. A depth-patch based 4D expression representation model is proposed. It is reconstructed from 2D dynamic images to delineate continuous spatial changes and temporal context in non-frontal cases. Furthermore, we present an effective deep neural network classifier, which accurately captures pose-variant expression features from the depth patches and recognizes non-frontal expressions. Experimental results on the BU-4DFE database show that the proposed method achieves a high recognition accuracy of 86.87% for non-frontal facial expressions with head rotation angles of up to 52°, outperforming existing methods. We also present a quantitative analysis of the components contributing to the performance gain through tests on the BU-4DFE and Multi-PIE datasets.
Funding: Supported by the Georgia Tech CRNCH (Center for Research into Novel Computing Hierarchies) Ph.D. Fellowship.
Abstract: Conventional vision-based systems, such as cameras, have demonstrated enormous versatility in sensing human activities and enabling interactive environments. However, these systems have long been criticized for incurring privacy, power, and latency issues due to their underlying structure of pixel-wise analog signal acquisition, computation, and communication. In this research, we overcome these limitations by introducing in-sensor analog computation: interconnected photodetectors with weighted responsivity are distributed in space to create what we call a computational photodetector. Computational photodetectors can extract mid-level vision features as a single continuous analog signal measured via a two-pin connection. We develop computational photodetectors using thin, flexible, low-noise organic photodiode arrays coupled with a self-powered wireless system, and demonstrate a set of designs that capture position, orientation, direction, speed, and identification information in applications ranging from explicit interactions on everyday surfaces to implicit activity detection.
Funding: This work was supported by the National Key Research and Development Program of China under Grant No. 2016YFB1001200, the National Natural Science Foundation of China under Grant No. 61872346, and the Strategic Priority Research Program of the Chinese Academy of Sciences under Grant No. 19080102.
Abstract: Emotion plays a crucial role in gratifying users' needs as they experience movies and TV series, yet it is underutilized as a framework for exploring and analyzing video content. In this paper, we present EmotionMap, a novel way of presenting emotion to everyday users as 2D geography, fusing spatio-temporal information with emotional data. The interface is composed of novel, interconnected visualization elements that facilitate video content exploration, understanding, and searching. EmotionMap conveys the overall emotion at a glance while also giving rapid access to the details. First, we developed EmotionDisc, an effective tool for collecting audiences' emotions based on emotion representation models. We collected audience and character emotional data, and then applied the metaphor of a map to visualize video content and emotion in a hierarchical structure. EmotionMap incorporates sketch interaction, providing a natural approach for users' active exploration. The novelty and effectiveness of EmotionMap have been demonstrated by our user study and experts' feedback.