Porous hydrogel sensors have attracted significant attention in fields such as smart wearables and medical monitoring due to their high sensitivity. However, existing fabrication methods typically degrade the surface smoothness of hydrogels when introducing porous structures and face significant challenges in removing fillers completely. To address these challenges, we introduce a one-step, thermosensitive spray-coating technique for preparing aircell hydrogel (ACH). The method leverages the rapid cooling of a thermoresponsive gelatin methacryloyl solution through atomization, enabling cross-linking within seconds and in situ encapsulation of air bubbles. Additionally, the transient flow of the pre-gel repairs voids formed by ruptured surface bubbles, yielding an ACH with uniformly distributed inner air bubbles and a smooth outer surface. The mold-free fabrication method is independent of substrate surface properties and can produce porous hydrogel films as thin as 163 µm. Furthermore, the dual-crosslinked network endows the ACH with excellent anti-swelling properties, and the physical crosslinking between gelatin molecules allows the ACH to self-heal. The ACH is highly sensitive in deformation sensing and can track even minor external forces, enabling tasks such as facial expression recognition, pitch differentiation, and motion detection. By integrating the ACH into a sensing glove, we also demonstrate its potential for human-machine interaction and tactile sensing. Finally, the ACH sensors are applied to motion mapping and machine tactile feedback, further indicating their promise in human-machine interaction.
Pedestrian trajectory prediction can significantly enhance the perception and decision-making capabilities of camera-based autonomous driving and intelligent surveillance systems by predicting the states and behavioral intentions of surrounding pedestrians. However, existing trajectory prediction methods still fail to model the diverse and complex interactions of the real world effectively, including pedestrian-pedestrian and pedestrian-environment interactions. In addition, these methods do not capture and characterize the multimodal nature of future trajectories well. To address these challenges, we devise a hand-designed graph convolution and a spatial cross-attention mechanism to dynamically capture the diverse spatial interactions between pedestrians. To explore the impact of the scene on pedestrian trajectories, we build a pedestrian map that reflects scene constraints and pedestrian motion preferences. We also construct a trajectory multimodality-aware module that captures the different potential modes implicit in diverse social behaviors, accounting for the uncertainty of pedestrians' future trajectories. Finally, we compare the proposed method with trajectory prediction baselines on commonly used public pedestrian benchmarks, demonstrating the superior performance of our approach.
Rock discontinuities control rock mechanical behavior and significantly influence the stability of rock masses. However, existing discontinuity mapping algorithms are susceptible to noise, and their results cannot be fed back to users in a timely manner. To address this issue, we propose a human-machine interaction (HMI) method for discontinuity mapping, in which users help the algorithm identify noise, judge results in real time, and adjust parameters. A regular cube was selected to illustrate the workflow: (1) a point cloud was acquired using remote sensing; (2) the HMI method was employed to select reference points and angle thresholds to detect group discontinuities; (3) individual discontinuities were extracted from each group discontinuity using a density-based clustering algorithm; and (4) the orientation of each discontinuity was measured using a plane-fitting algorithm. The method was applied to a well-studied highway road cut and a complex natural slope. The computed results are consistent with field measurements, demonstrating good accuracy: the average error in dip direction and dip angle was less than 3° in both cases. Finally, the computational time of the proposed method was compared with that of two other popular algorithms; it was lower by a factor of tens, demonstrating high computational efficiency. The method offers geologists and geological engineers a new way to map rock structures rapidly and accurately in the presence of heavy noise or unclear features.
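The orientation measurement in step (4) of the workflow above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an ordinary least-squares fit of z = a*x + b*y + c (which cannot represent vertical planes), a coordinate frame with x = east, y = north, z = up, and the hypothetical helper names `det3`, `fit_plane`, and `orientation`.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c to (x, y, z) points.

    Assumption: the plane is not vertical, so z can be written as a
    function of x and y. Returns the coefficients (a, b, c).
    """
    n = len(points)
    sx = sum(x for x, _, _ in points)
    sy = sum(y for _, y, _ in points)
    sz = sum(z for _, _, z in points)
    sxx = sum(x * x for x, _, _ in points)
    sxy = sum(x * y for x, y, _ in points)
    syy = sum(y * y for _, y, _ in points)
    sxz = sum(x * z for x, _, z in points)
    syz = sum(y * z for _, y, z in points)
    # Normal equations of the least-squares problem.
    lhs = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    d = det3(lhs)
    if abs(d) < 1e-12:
        raise ValueError("points are degenerate (collinear or too few)")
    coeffs = []
    for col in range(3):  # Cramer's rule, one unknown per column
        m = [row[:] for row in lhs]
        for r in range(3):
            m[r][col] = rhs[r]
        coeffs.append(det3(m) / d)
    return tuple(coeffs)


def orientation(a, b):
    """Return (dip_direction, dip_angle) in degrees for z = a*x + b*y + c.

    The upward normal is (-a, -b, 1); its horizontal projection points
    down-dip. Dip direction is an azimuth measured clockwise from north
    and is not meaningful for a horizontal plane (a = b = 0).
    """
    dip_angle = math.degrees(math.atan(math.hypot(a, b)))
    dip_direction = math.degrees(math.atan2(-a, -b)) % 360.0
    return dip_direction, dip_angle


# Synthetic patch lying on a plane that descends toward the east (+x).
patch = [(float(x), float(y), 5.0 - x) for x in range(3) for y in range(3)]
a, b, c = fit_plane(patch)
print(orientation(a, b))
```

For noisy point clouds or steep discontinuities, an orthogonal (total least-squares) fit would usually be preferable to this vertical-offset fit; the simpler form is used here only to keep the sketch dependency-free.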
Electromyography (EMG) is already widely used in human-machine interaction (HMI) applications. Decoding the information in EMG signals robustly and accurately is a key problem that urgently needs a solution. Recently, many EMG pattern recognition tasks have been addressed using deep learning methods. In this paper, we analyze recent work and present a literature review describing the role that deep learning plays in EMG-based HMI. We provide an overview of typical network structures and processing schemes, and introduce recent progress in typical tasks such as movement classification, joint angle prediction, and force/torque estimation. We also discuss emerging issues, including multimodal sensing, inter-subject/inter-session variability, and robustness to disturbances. We attempt to provide a comprehensive analysis of current research by discussing the advantages, challenges, and opportunities brought by deep learning. We hope that deep learning can help eliminate the factors that hinder the development of EMG-based HMI systems. Finally, possible directions for future research are presented.
With the development of globalization, intercultural communicative competence has become one of the core qualities of modern college students. Because college English teaching is an important platform for cultivating students’ language skills and cultural literacy, innovation in its teaching mode is essential. This paper therefore discusses methods for effectively cultivating students’ intercultural communicative competence in college English teaching from the perspective of the multimodal interactive teaching mode, with the aim of providing a reference for improving the quality of college English teaching and students’ overall competence.
Hydrogel-based triboelectric nanogenerators (TENGs) have promising application prospects in wearable electronic devices. However, their low performance, poor stability, insufficient recyclability, and inferior self-healing seriously hinder their development. Herein, we report a robust route to a liquid metal (LM)/polyvinyl alcohol (PVA) hydrogel-based TENG (LP-TENG). Owing to the intrinsically liquid nature of the conductive LM within the flexible PVA hydrogel, the as-prepared LP-TENG exhibited comprehensive advantages of adaptability, biocompatibility, outstanding electrical performance, superior stability, recyclability, and diverse applications, which were unattainable with traditional systems. Specifically, the LP-TENG delivered an open-circuit voltage of 250 V, a short-circuit current of 4 μA, and a transferred charge of 120 nC with high stability, outperforming most advanced TENG systems. The LP-TENG was successfully employed in versatile applications, including human motion detection, handwriting recognition, energy collection, message transmission, and human-machine interaction. This work presents significant prospects for crafting advanced materials and devices in the fields of wearable electronics, flexible skin, and smart robots.
Disentangling the influence of multiple signal components on receivers and elucidating the general processes that shape complex signal evolution are difficult tasks. In this study we test the mate preferences of female squirrel treefrogs Hyla squirella and female túngara frogs Physalaemus pustulosus for similar combinations of acoustic and visual components of their multimodal courtship signals. In a two-choice playback experiment with squirrel treefrogs, the visual stimulus of a male model significantly increased the attractiveness of a relatively unattractive slow call rate. A previous study demonstrated that faster call rates are more attractive to female squirrel treefrogs and that, all else being equal, models of male frogs with large body stripes are more attractive. In a similar experiment with female túngara frogs, the visual stimulus of a robotic frog failed to increase the attractiveness of a relatively unattractive call. Females also showed no preference for the distinct stripe on the robot that males commonly bear on their throat. Thus, features of conspicuous signal components such as body stripes are not universally important, and signal function is likely to differ even among species with similar ecologies and communication systems. Finally, we discuss the putative information content of anuran signals and suggest that the categorization of redundant versus multiple messages may not be sufficient as a general explanation for the evolution of multimodal signaling. Instead of relying on untested assumptions concerning the information content of signals, we discuss the value of initially collecting comparative empirical data sets related to receiver responses.
Background: Augmented reality classrooms have become an interesting research topic in education, but they have limitations. First, most researchers use cards to operate experiments, and a large number of cards is difficult and inconvenient for users. Second, most users conduct experiments in the visual modality only, and such single-modal interaction greatly reduces the user's sense of real interaction. To solve these problems, we propose the Multimodal Interaction Algorithm based on Augmented Reality (ARGEV), which relies on visual and tactile feedback in augmented reality. In addition, we design a Virtual and Real Fusion Interactive Tool Suite (VRFITS) with gesture recognition and intelligent equipment. Methods: The ARGEV method fuses gestures, intelligent equipment, and virtual models. We use a gesture recognition model trained with a convolutional neural network to recognize gestures in AR and to trigger vibration feedback after recognizing a five-finger grasp gesture. We establish a coordinate mapping between real hands and the virtual model to fuse gestures with the virtual model. Results: The average accuracy of gesture recognition was 99.04%. We verified and applied VRFITS in the Augmented Reality Chemistry Lab (ARCL); the overall operation load of ARCL was 29.42% lower than that of traditional simulated virtual experiments. Conclusions: We achieve real-time fusion of gestures, the virtual model, and intelligent equipment in ARCL. Compared with the NOBOOK virtual simulation experiment, ARCL improves the user's sense of real operation and interaction efficiency.
The speech recognition rate deteriorates greatly in human-machine interaction when the speaker's speech mixes with a bystander's voice. This paper proposes a time-frequency approach to Blind Source Separation (BSS) for intelligent Human-Machine Interaction (HMI). The main idea of the algorithm is to simultaneously diagonalize the correlation matrices of the pre-whitened signals at different time delays for every frequency bin in the time-frequency domain. The proposed method has two merits: (1) fast convergence and (2) a high signal-to-interference ratio in the separated signals. Numerical evaluations compare the performance of the proposed algorithm with that of two other deconvolution algorithms. An efficient algorithm for resolving permutation ambiguity is also proposed. With properly selected parameters, the proposed algorithm saves more than 10% of computational time and performs well on both simulated convolutive mixtures and speech recorded in a real room.
Teleoperation is of great importance in robotics, especially when people cannot be present in the robot workshop. It allows people to control robots remotely using human intelligence. In this paper, a robotic teleoperation system for precise robotic manipulation is established. A data glove and a 7-degree-of-freedom (DOF) force feedback controller are used for remote control interaction. A control system and a monitor system are designed for precise remote manipulation. The monitor system contains an image acquisition system and a human-machine interaction module, and aims to simulate and detect the running state of the robot. In addition, a visual object tracking algorithm is developed to estimate the states of the dynamic system from noisy observations. The established robotic teleoperation system is applied in a series of experiments, and high-precision results are obtained, showing the effectiveness of the physical system.
As the Internet of Things advances, gesture recognition is emerging as a prominent domain in human-machine interaction (HMI). However, interactive wearables based on conductive hydrogels for individuals with single-arm functionality or disabilities remain underexplored. Here, we devise a wearable one-handed keyboard with gesture recognition, employing machine learning algorithms and hydrogel-based mechanical sensors to boost productivity. The PCG (PAM/CMC/rGO) hydrogels are composed of polyacrylamide (PAM), sodium carboxymethyl cellulose (CMC), and reduced graphene oxide (rGO), and function as strain sensors, pressure sensors, and electrode material. The covalently cross-linked PAM chains give the gel its elasticity, while the biocompatible CMC improves the dispersion of rGO and enhances the electromechanical properties. Integrating rGO sheets into the polymer matrix facilitates cross-linking and generates supplementary conductive pathways, thereby augmenting the gel system’s elasticity, sensitivity, and durability. Our hydrogel sensors exhibit high strain sensitivity (gauge factor (GF) = 8.18 over the 395.6%-551.96% strain range) and superior pressure sensing capabilities (sensitivity (S) = 0.3116 kPa⁻¹ over 0-9.82 kPa). Furthermore, we developed a wearable keyboard with up to 98.13% accuracy using convolutional neural networks and a custom data acquisition system. This study lays the groundwork for creating multifunctional gel sensors for intelligent machines, wearable devices, and brain-computer interfaces.
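The abstract above reports a gauge factor and a pressure sensitivity without defining them. The sketch below uses the conventional resistance-based definitions, GF = (ΔR/R0)/ε and S = (ΔR/R0)/ΔP. These are an assumption here, since the abstract does not say which electrical quantity was measured; the function names and all numeric inputs are hypothetical illustrations, not data from the study.

```python
def relative_change(r0, r):
    """Relative resistance change (R - R0) / R0 for baseline r0 > 0."""
    if r0 <= 0:
        raise ValueError("baseline resistance must be positive")
    return (r - r0) / r0


def gauge_factor(r0, r, strain):
    """Gauge factor GF = (dR/R0) / strain.

    `strain` is a fraction, e.g. 0.25 for 25 % elongation.
    """
    if strain == 0:
        raise ValueError("strain must be non-zero")
    return relative_change(r0, r) / strain


def pressure_sensitivity(r0, r, delta_p_kpa):
    """Pressure sensitivity S = (dR/R0) / dP, in kPa^-1."""
    if delta_p_kpa == 0:
        raise ValueError("pressure change must be non-zero")
    return relative_change(r0, r) / delta_p_kpa


# Hypothetical readings: 100 ohm at rest, 150 ohm at 25 % strain.
print(gauge_factor(100.0, 150.0, 0.25))
# Hypothetical readings: 100 ohm at rest, 110 ohm under 2 kPa.
print(pressure_sensitivity(100.0, 110.0, 2.0))
```

Because both quantities are slopes, a value quoted for a range (such as a GF over a given strain interval) is normally obtained by fitting a line to many readings within that range rather than from a single pair of points as shown here.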
Biography videos portray the lives and achievements of prominent historical figures. This paper proposes a novel interactive video summarization method for biography videos based on multimodal fusion, which visualizes the specific features of biography videos and supports interaction with video content by exploiting multiple modalities. In general, the story of a movie progresses through the characters' dialogue, and the subtitles are produced from that dialogue, so they contain all the information related to the movie. JGibbsLDA is applied to extract keywords from the subtitles, because a biography video depicts a character's whole life from several different aspects. To fuse keywords and key-frames, affinity propagation is adopted to calculate the similarity between each key-frame cluster and the keywords. In this way, a video summarization based on multimodal fusion is produced that describes the video content more completely. To reduce the time spent searching for video content of interest and to reveal the relationships between main characters, a map is adopted to visualize the video content and interact with the video summarization. An experiment is conducted to evaluate the video summarization, and the results demonstrate that the system facilitates the exploration of video content while improving interaction and helping users find events of interest efficiently.
With the popularization of social media, stickers have become an important tool for young students to express themselves and resist mainstream culture, owing to their unique visual and emotional expressiveness. Most existing studies focus on the negative impacts of spoof stickers while paying insufficient attention to their positive functions. From the perspective of multimodal metaphor, this paper uses methods such as virtual ethnography and image-text analysis to clarify the connotation of stickers, trace the evolution of their digital dissemination forms, and explore the multiple functions of subcultural stickers in social interactions between teachers and students. Young students use stickers to convey emotions and information. The expressive, social, and cultural metaphor functions of stickers build on one another progressively. This not only shapes students’ values but also promotes self-expression and teacher-student interaction. It also reminds teachers that stickers can be used to correct students’ negative thoughts, achieving the effect of “cultivating and influencing people through culture.”
To address the problems of traditional guide devices, such as single-mode environmental perception and poor terrain adaptability, this paper proposes an intelligent guide system based on a quadruped robot platform. Data from a millimeter-wave radar (with an accuracy of ±0.1°) and an RGB-D camera are fused through multisensor spatiotemporal registration, and a dataset suitable for guide dog robots is constructed. For the edge-deployment scenario of guide dog robots, a lightweight CA-YOLOv11 target detection model with an integrated attention mechanism is adopted, achieving an overall recognition accuracy of 95.8% in complex scenarios, 2.2% higher than that of the baseline YOLOv11 network. The system supports navigation on complex terrain such as stairs (25 cm steps) and slopes (35° gradient), and the response time to sudden disturbances is shortened to 100 ms. Field tests show that the navigation success rate reaches 95% across eight types of scenarios, the user satisfaction score is 4.8/5.0, and the cost is 50% lower than that of traditional guide dogs.
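The abstract above does not describe how the spatiotemporal registration is carried out. As a minimal sketch of the temporal half only, assume each sensor delivers timestamped frames and that pairing every camera frame with the nearest radar frame within a tolerance is acceptable. The function `match_nearest`, the tolerance, and the sample timestamps are all hypothetical.

```python
from bisect import bisect_left


def match_nearest(camera_ts, radar_ts, tolerance):
    """Pair each camera timestamp with the closest radar timestamp.

    Both lists must be sorted in ascending order (seconds). Returns a
    list of (camera_index, radar_index) pairs whose time gap does not
    exceed `tolerance`; camera frames without a close match are skipped.
    """
    pairs = []
    for i, t in enumerate(camera_ts):
        j = bisect_left(radar_ts, t)
        # The nearest radar frame is either just before or at/after t.
        candidates = [k for k in (j - 1, j) if 0 <= k < len(radar_ts)]
        if not candidates:
            continue
        best = min(candidates, key=lambda k: abs(radar_ts[k] - t))
        if abs(radar_ts[best] - t) <= tolerance:
            pairs.append((i, best))
    return pairs


# Hypothetical timestamps in seconds.
camera = [0.00, 0.10, 0.20, 0.50]
radar = [0.01, 0.12, 0.33]
print(match_nearest(camera, radar, tolerance=0.05))
```

Spatial registration (transforming radar detections into the camera frame with calibrated extrinsics) would be a separate step and is not shown, since the abstract gives no calibration details.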
Accurate prediction of drug responses in cancer cell lines (CCLs) and transferable prediction of clinical drug responses using CCLs are two major tasks in personalized medicine. Despite rapid advances in computational methods for preclinical and clinical cancer drug response (CDR) prediction, generalization to new drugs that are unseen in the training set remains a challenge. Herein, we propose a multimodal fusion deep learning (DL) model called drug-target and single-cell language based CDR (DTLCDR) to predict preclinical and clinical CDRs. The model integrates chemical descriptors, molecular graph representations, predicted protein target profiles of drugs, and cell line expression profiles with general knowledge from single cells. Among these features, a well-trained drug-target interaction (DTI) prediction model is used to generate the target profiles of drugs, and a pretrained single-cell language model is integrated to provide general genomic knowledge. Comparison experiments on the cell line drug sensitivity dataset demonstrated that DTLCDR generalizes better and is more robust in predicting unseen drugs than previous state-of-the-art baseline methods. Ablation studies verified the effectiveness of each component of the model, highlighting the significant contribution of target information to generalizability. The ability of DTLCDR to predict novel molecules was then validated through in vitro cell experiments, demonstrating its potential for real-world applications. Moreover, DTLCDR was transferred to clinical datasets, where it performed satisfactorily regardless of whether the drugs were included in the cell line dataset. Overall, our results suggest that DTLCDR is a promising tool for personalized drug discovery.
With the growing application of intelligent robots in service, manufacturing, and medical fields, efficient and natural interaction between humans and robots has become key to improving collaboration efficiency and user experience. Gesture recognition, as an intuitive and contactless interaction method, can overcome the limitations of traditional interfaces and enable real-time control of and feedback on robot movements and behaviors. This study first reviews mainstream gesture recognition algorithms and their application on different sensing platforms (RGB cameras, depth cameras, and inertial measurement units). It then proposes a gesture recognition method based on multimodal feature fusion and a lightweight deep neural network that balances recognition accuracy with computational efficiency. At the system level, a modular human-robot interaction architecture is constructed, comprising perception, decision, and execution layers, and gesture commands are transmitted and mapped to robot actions in real time via the ROS communication protocol. Comparative experiments on public gesture datasets and a self-collected dataset validate the proposed method’s superiority in accuracy, response latency, and system robustness, while user-experience tests assess the interface’s usability. The results provide a reliable technical foundation for robot collaboration and service in complex scenarios, with broad prospects for practical application and deployment.
Video action recognition (VAR) aims to analyze dynamic behaviors in videos and achieve semantic understanding. VAR faces challenges such as temporal dynamics, action-scene coupling, and the complexity of human interactions. Existing methods can be categorized as motion-level, event-level, or story-level according to their spatiotemporal granularity. However, single-modal approaches struggle to capture complex behavioral semantics and human factors. In recent years, therefore, vision-language models (VLMs) have been introduced into this field, providing new research perspectives for VAR. In this paper, we systematically review spatiotemporal hierarchical methods in VAR and explore how the introduction of large models has advanced the field. Additionally, we propose the concept of “Factor” to identify and integrate key information from both visual and textual modalities, enhancing multimodal alignment. We also summarize various multimodal alignment methods and provide in-depth analysis of and insights into future research directions.
The fusion of VISI (visual identity system Internet), digital maps and Web GIS is presented. The interactive design of a Web GIS interface with VISI must take additional new factors into account. VISI can provide the design principles, elements and contents for the Web GIS. The Wuhan Bus Search System was designed to confirm the validity and practicability of the fusion.
Background: As an increasing number of vehicles become autonomous, intelligent, and connected, the future use of car human-machine interfaces (HMIs) in these vehicles deserves more attention. Several studies have addressed car HMI but paid less attention to designing and implementing interactive glazing for everyday (autonomous) driving contexts. Methods: Reflecting on the literature, we describe an engineering psychology practice and the design of six novel future user scenarios, which envision the application of a specific set of augmented reality (AR) supported user interactions. Additionally, we evaluate specific scenarios and experiential prototypes; the evaluations reveal that these AR scenarios help the target user groups experience a new type of interaction. The overall evaluation is positive, with valuable assessment results and suggestions. Conclusions: This study may interest applied psychology educators who wish to teach students with minimal pre-existing expertise or scientific knowledge in engineering psychology how AR can be operationalized in a human-centered design process.
Funding (porous aircell hydrogel sensor study): financially supported by the National Key R&D Program of China (Grant No. 2023YFE0108900) and EU HORIZON 2021 L4DNANO (No. 101086227).
Funding (rock discontinuity mapping study): supported by the National Key R&D Program of China (No. 2023YFC3081200), the National Natural Science Foundation of China (No. 42077264), and the Scientific Research Project of PowerChina Huadong Engineering Corporation Limited (HDEC-2022-0301).
Funding: Supported in part by the National Natural Science Foundation of China (U1813214, 61773369, 61903360), the Self-planned Project of the State Key Laboratory of Robotics (2020-Z12), and a China Postdoctoral Science Foundation funded project (2019M661155).
Abstract: Electromyography (EMG) is already widely used in human-machine interaction (HMI) applications. How to decode the information in EMG signals robustly and accurately is a key problem that urgently needs a solution. Recently, many EMG pattern recognition tasks have been addressed using deep learning methods. In this paper, we analyze recent papers and present a literature review describing the role that deep learning plays in EMG-based HMI. We provide an overview of typical network structures and processing schemes, and introduce recent progress in typical tasks such as movement classification, joint angle prediction, and force/torque estimation. New issues, including multimodal sensing, inter-subject/inter-session, and robustness toward disturbances, are discussed. We attempt to provide a comprehensive analysis of current research by discussing the advantages, challenges, and opportunities brought by deep learning. We hope that deep learning can help eliminate the factors that hinder the development of EMG-based HMI systems. Finally, possible directions for future research are presented.
Abstract: With the development of globalization, intercultural communicative competence has become one of the core qualities of modern college students. College English teaching is an important platform for cultivating students' language skills and cultural literacy, so innovation in its teaching mode is essential. This paper therefore discusses methods to effectively cultivate students' intercultural communicative competence in college English teaching from the perspective of the multimodal interactive teaching mode, hoping to provide a reference for improving the quality of college English teaching and students' overall quality.
Funding: Financially supported by the Natural Science Foundation of China (Nos. 22109120, 62104170 and 82202757) and the Zhejiang Provincial Natural Science Foundation of China (Nos. LQ21B030002 and LY23F040001).
Abstract: Hydrogel-based triboelectric nanogenerators (TENGs) have promising application prospects in wearable electronic devices. However, their low performance, poor stability, insufficient recyclability and inferior self-healing seriously hinder their development. Herein, we report a robust route to a liquid metal (LM)/polyvinyl alcohol (PVA) hydrogel-based TENG (LP-TENG). Owing to the intrinsically liquid nature of the conductive LM within the flexible PVA hydrogel, the as-prepared LP-TENG exhibited the comprehensive advantages of adaptability, biocompatibility, outstanding electrical performance, superior stability, recyclability and diverse applications, which were unattainable by traditional systems. Concretely, the LP-TENG delivered an appealing open-circuit voltage of 250 V, a short-circuit current of 4 μA and a transferred charge of 120 nC with high stability, outperforming most advanced TENG systems. The LP-TENG was successfully employed in versatile applications, including human motion detection, handwriting recognition, energy collection, message transmission and human-machine interaction. This work presents significant prospects for crafting advanced materials and devices in the fields of wearable electronics, flexible skin and smart robots.
Abstract: Disentangling the influence of multiple signal components on receivers and elucidating general processes influencing complex signal evolution are difficult tasks. In this study we test mate preferences of female squirrel treefrogs Hyla squirella and female tungara frogs Physalaemus pustulosus for similar combinations of acoustic and visual components of their multimodal courtship signals. In a two-choice playback experiment with squirrel treefrogs, the visual stimulus of a male model significantly increased the attractiveness of a relatively unattractive slow call rate. A previous study demonstrated that faster call rates are more attractive to female squirrel treefrogs, and all else being equal, models of male frogs with large body stripes are more attractive. In a similar experiment with female tungara frogs, the visual stimulus of a robotic frog failed to increase the attractiveness of a relatively unattractive call. Females also showed no preference for the distinct stripe on the robot that males commonly bear on their throat. Thus, features of conspicuous signal components such as body stripes are not universally important and signal function is likely to differ even among species with similar ecologies and communication systems. Finally, we discuss the putative information content of anuran signals and suggest that the categorization of redundant versus multiple messages may not be sufficient as a general explanation for the evolution of multimodal signaling. Instead of relying on untested assumptions concerning the information content of signals, we discuss the value of initially collecting comparative empirical data sets related to receiver responses.
Funding: Supported by the National Key R&D Program of China (2018YFB1004901) and the Independent Innovation Team Project of Jinan City (2019GXRC013).
Abstract: Background: Augmented reality classrooms have become an interesting research topic in the field of education, but there are some limitations. First, most researchers use cards to operate experiments, and a large number of cards is difficult and inconvenient for users. Second, most users conduct experiments only in the visual modality, and such single-modal interaction greatly reduces the users' real sense of interaction. To solve these problems, we propose the Multimodal Interaction Algorithm based on Augmented Reality (ARGEV), which is based on visual and tactile feedback in augmented reality. In addition, we design a Virtual and Real Fusion Interactive Tool Suite (VRFITS) with gesture recognition and intelligent equipment. Methods: The ARGEV method fuses gestures, intelligent equipment, and virtual models. We use a gesture recognition model trained with a convolutional neural network to recognize gestures in AR and to trigger vibration feedback after recognizing a five-finger grasp gesture. We establish a coordinate mapping between real hands and the virtual model to fuse gestures with the virtual model. Results: The average accuracy of gesture recognition was 99.04%. We verify and apply VRFITS in the Augmented Reality Chemistry Lab (ARCL); the overall operation load of ARCL is reduced by 29.42% in comparison to traditional simulated virtual experiments. Conclusions: We achieve real-time fusion of gestures, the virtual model, and intelligent equipment in ARCL. Compared with the NOBOOK virtual simulation experiment, ARCL improves the users' real sense of operation and interaction efficiency.
Abstract: The speech recognition rate deteriorates greatly in human-machine interaction when the speaker's speech mixes with a bystander's voice. This paper proposes a time-frequency approach to Blind Source Separation (BSS) for intelligent Human-Machine Interaction (HMI). The main idea of the algorithm is to simultaneously diagonalize the correlation matrices of the pre-whitened signals at different time delays for every frequency bin in the time-frequency domain. The proposed method has two merits: (1) fast convergence; (2) a high signal-to-interference ratio of the separated signals. Numerical evaluations are used to compare the performance of the proposed algorithm with two other deconvolution algorithms. An efficient algorithm to resolve permutation ambiguity is also proposed. With properly selected parameters, the proposed algorithm saves more than 10% of computational time and achieves good performance for both simulated convolutive mixtures and speech recorded in a real room.
Funding: Supported by the NSFC-Shenzhen Robotics Research Center Project (No. U2013207) and the Beijing Science and Technology Plan Project (No. Z191100008019008).
Abstract: Teleoperation is of great importance in robotics, especially when people cannot be present in the robot workshop. It provides a way for people to control robots remotely using human intelligence. In this paper, a robotic teleoperation system for precise robotic manipulation is established. A data glove and a 7-degree-of-freedom (DOF) force feedback controller are used for remote control interaction. A control system and a monitor system are designed for precise remote manipulation. The monitor system contains an image acquisition system and a human-machine interaction module, and aims to simulate and detect the running state of the robot. In addition, a visual object tracking algorithm is developed to estimate the states of the dynamic system from noisy observations. The established robotic teleoperation system is applied in a series of experiments, and high-precision results are obtained, showing the effectiveness of the physical system.
Funding: Supported by the China Postdoctoral Science Foundation (No. 2022BG011), the Fundamental Research Funds for Central Universities (No. 2020CDJ-LHZZ-077), the Natural Science Foundation of Chongqing, China (No. cstc2020jcyj-msxmX0397), and the Fundamental Research Funds for Central Universities (No. 00007717).
Abstract: As the Internet of Things advances, gesture recognition has emerged as a prominent domain in human-machine interaction (HMI). However, interactive wearables based on conductive hydrogels for individuals with single-arm functionality or disabilities remain underexplored. Here, we devised a wearable one-handed keyboard with gesture recognition, employing machine learning algorithms and hydrogel-based mechanical sensors to boost productivity. PCG (PAM/CMC/rGO) hydrogels are composed of polyacrylamide (PAM), sodium carboxymethyl cellulose (CMC), and reduced graphene oxide (rGO), and function as strain sensors, pressure sensors, and electrode materials. The PAM chains provide the gel's elasticity through covalent cross-linking, while the biocompatible CMC improves the dispersion of rGO and promotes electromechanical properties. Integrating rGO sheets into the polymer matrix facilitates cross-linking and generates supplementary conductive pathways, thereby augmenting the gel system's elasticity, sensitivity, and durability. Our hydrogel sensors exhibit high strain sensitivity (gauge factor (GF) = 8.18, 395.6%-551.96%) and superior pressure sensing capabilities (sensitivity (S) = 0.3116 kPa^(-1), 0-9.82 kPa). Furthermore, we developed a wearable keyboard with up to 98.13% accuracy using convolutional neural networks and a custom data acquisition system. This study establishes the groundwork for creating multifunctional gel sensors for intelligent machines, wearable devices, and brain-computer interfaces.
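The two figures of merit quoted in this abstract, the gauge factor and the pressure sensitivity, are usually reported as slopes of a relative signal change over a stated strain or pressure range. The sketch below shows that arithmetic under stated assumptions: the abstract does not say which electrical signal was measured, so the relative resistance change ΔR/R0 is assumed here, and the function names and all numeric inputs are hypothetical illustrations rather than data from the study.

```python
def slope(y_start, y_end, x_start, x_end):
    """Average slope of y over the interval [x_start, x_end]."""
    if x_end == x_start:
        raise ValueError("interval must have non-zero width")
    return (y_end - y_start) / (x_end - x_start)


def gauge_factor(rel_change_start, rel_change_end, strain_start, strain_end):
    """GF = d(dR/R0) / d(strain); strain is a fraction (4.0 means 400%)."""
    return slope(rel_change_start, rel_change_end, strain_start, strain_end)


def pressure_sensitivity(rel_change_start, rel_change_end, p_start_kpa, p_end_kpa):
    """S = d(dR/R0) / dP, in kPa^-1 (assumed definition)."""
    return slope(rel_change_start, rel_change_end, p_start_kpa, p_end_kpa)


# Hypothetical readings, chosen only to make the arithmetic easy to follow.
gf = gauge_factor(20.0, 32.0, 4.0, 5.5)        # (32 - 20) / (5.5 - 4.0)
s = pressure_sensitivity(0.0, 1.5, 0.0, 5.0)   # 1.5 / 5.0 kPa^-1
```

Because both quantities are range-specific slopes, a value such as GF = 8.18 applies only within the strain range reported alongside it.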
Funding: Supported by the National Key Research and Development Plan (2016YFB1001200), the Natural Science Foundation of China (U1435220, 61232013), and the Natural Science Research Projects of Universities in Jiangsu Province (16KJA520003).
Abstract: Biography videos, based on the life performances of prominent historical figures, aim to describe the lives of great men. In this paper, a novel interactive video summarization method for biography videos based on multimodal fusion is proposed. It visualizes the specific features of a biography video and supports interaction with the video content by taking advantage of multimodality. In general, the story of a movie progresses through the dialogues of its characters, and the subtitles are produced from these dialogues, which contain all the information related to the movie. JGibbsLDA is applied to extract keywords from the subtitles, because a biography video consists of different aspects that together depict a character's whole life. To fuse keywords and key-frames, affinity propagation is adopted to calculate the similarity between each key-frame cluster and the keywords. Through this method, a video summarization based on multimodal fusion is produced that describes the video content more completely. To reduce the time spent searching for video content of interest and to obtain the relationships between the main characters, a kind of map is adopted to visualize the video content and interact with the video summarization. An experiment is conducted to evaluate the video summarization, and the results demonstrate that the system can facilitate the exploration of video content while improving interaction and finding events of interest efficiently.
Abstract: With the popularization of social media, stickers have become an important tool for young students to express themselves and resist mainstream culture, owing to their unique visual and emotional expressiveness. Most existing studies focus on the negative impacts of spoof stickers while paying insufficient attention to their positive functions. From the perspective of multimodal metaphor, this paper uses methods such as virtual ethnography and image-text analysis to clarify the connotation of stickers, understand the evolution of their digital dissemination forms, and explore the multiple functions of subcultural stickers in social interactions between teachers and students. Young students use stickers to convey emotions and information. The expressive function, social function, and cultural metaphor function of stickers build on one another progressively. This not only shapes students' values but also promotes self-expression and teacher-student interaction. It also reminds teachers to correct students' negative thoughts by using stickers, achieving the effect of "cultivating and influencing people through culture."
Abstract: To address the problems of traditional guide devices, such as single-mode environmental perception and poor terrain adaptability, this paper proposes an intelligent guide system based on a quadruped robot platform. Data fusion between a millimeter-wave radar (with an accuracy of ±0.1°) and an RGB-D camera is achieved through multisensor spatiotemporal registration, and a dataset suitable for guide dog robots is constructed. For the application scenario of edge-deployed guide dog robots, a lightweight CA-YOLOv11 target detection model integrated with an attention mechanism is adopted, achieving a comprehensive recognition accuracy of 95.8% in complex scenarios, which is 2.2% higher than that of the benchmark YOLOv11 network. The system supports navigation on complex terrain such as stairs (25 cm steps) and slopes (35° gradient), and the response time to sudden disturbances is shortened to 100 ms. Actual tests show that the navigation success rate reaches 95% in eight types of scenarios, the user satisfaction score is 4.8/5.0, and the cost is 50% lower than that of traditional guide dogs.
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2023YFC2605002), the National Key R&D Program of China (Grant No. 2022YFF1203003), the Beijing AI Health Cultivation Project, China (Grant No. Z221100003522022), the National Natural Science Foundation of China (Grant No. 82273772), and the Beijing Natural Science Foundation, China (Grant No. 7212152).
Abstract: Accurate prediction of drug responses in cancer cell lines (CCLs) and transferable prediction of clinical drug responses using CCLs are two major tasks in personalized medicine. Despite rapid advances in computational methods for preclinical and clinical cancer drug response (CDR) prediction, challenges remain in generalizing to new drugs that are unseen in the training set. Herein, we propose a multimodal fusion deep learning (DL) model called drug-target and single-cell language based CDR (DTLCDR) to predict preclinical and clinical CDRs. The model integrates chemical descriptors, molecular graph representations, predicted protein target profiles of drugs, and cell line expression profiles with general knowledge from single cells. Among these features, a well-trained drug-target interaction (DTI) prediction model is used to generate target profiles of drugs, and a pretrained single-cell language model is integrated to provide general genomic knowledge. Comparison experiments on the cell line drug sensitivity dataset demonstrated that DTLCDR exhibited improved generalizability and robustness in predicting unseen drugs compared with previous state-of-the-art baseline methods. Further ablation studies verified the effectiveness of each component of our model, highlighting the significant contribution of target information to generalizability. Subsequently, the ability of DTLCDR to predict novel molecules was validated through in vitro cell experiments, demonstrating its potential for real-world applications. Moreover, DTLCDR was transferred to clinical datasets and performed satisfactorily on the clinical data, regardless of whether the drugs were included in the cell line dataset. Overall, our results suggest that DTLCDR is a promising tool for personalized drug discovery.
Abstract: With the growing application of intelligent robots in service, manufacturing, and medical fields, efficient and natural interaction between humans and robots has become key to improving collaboration efficiency and user experience. Gesture recognition, as an intuitive and contactless interaction method, can overcome the limitations of traditional interfaces and enable real-time control and feedback of robot movements and behaviors. This study first reviews mainstream gesture recognition algorithms and their application on different sensing platforms (RGB cameras, depth cameras, and inertial measurement units). It then proposes a gesture recognition method based on multimodal feature fusion and a lightweight deep neural network that balances recognition accuracy with computational efficiency. At the system level, a modular human-robot interaction architecture is constructed, comprising perception, decision, and execution layers, and gesture commands are transmitted and mapped to robot actions in real time via the ROS communication protocol. Comparative experiments on public gesture datasets and a self-collected dataset validate the proposed method's superiority in terms of accuracy, response latency, and system robustness, while user-experience tests assess the interface's usability. The results provide a reliable technical foundation for robot collaboration and service in complex scenarios, offering broad prospects for practical application and deployment.
Funding: Supported by the Zhejiang Provincial Natural Science Foundation of China (No. LQ23F030001), the National Natural Science Foundation of China (No. 62406280), the Autism Research Special Fund of Zhejiang Foundation for Disabled Persons (No. 2023008), the Liaoning Province Higher Education Innovative Talents Program Support Project (No. LR2019058), the Liaoning Province Joint Open Fund for Key Scientific and Technological Innovation Bases (No. 2021-KF-12-05), the Central Guidance on Local Science and Technology Development Fund of Liaoning Province (No. 2023JH6/100100066), the Key Laboratory for Biomedical Engineering of Ministry of Education, Zhejiang University, China, and in part by the Open Research Fund of the State Key Laboratory of Cognitive Neuroscience and Learning.
Abstract: Video action recognition (VAR) aims to analyze dynamic behaviors in videos and achieve semantic understanding. VAR faces challenges such as temporal dynamics, action-scene coupling, and the complexity of human interactions. Existing methods can be categorized into motion-level, event-level, and story-level methods based on spatiotemporal granularity. However, single-modal approaches struggle to capture complex behavioral semantics and human factors. In recent years, therefore, vision-language models (VLMs) have been introduced into this field, providing new research perspectives for VAR. In this paper, we systematically review spatiotemporal hierarchical methods in VAR and explore how the introduction of large models has advanced the field. Additionally, we propose the concept of "Factor" to identify and integrate key information from both visual and textual modalities, enhancing multimodal alignment. We also summarize various multimodal alignment methods and provide in-depth analysis of and insights into future research directions.
Funding: Supported by the National Natural Science Foundation of China (No. 40071071).
Abstract: The fusion of VISI (visual identity system Internet), digital maps and Web GIS is presented. Interactive design of a Web GIS interface with VISI needs to take additional new factors into account. VISI can provide the design principles, elements and contents for the Web GIS. The design of the Wuhan Bus Search System is carried out to confirm the validity and practicability of the fusion.
Funding: Supported by the "Automotive Glazing Application in Intelligent Cockpit Human-Machine Interface" project (SKHX2021049), a collaboration between Saint-Gobain Research and Beijing Normal University.
Abstract: Background: With an increasing number of vehicles becoming autonomous, intelligent, and connected, the future use of the car human-machine interface (HMI) in these vehicles is becoming more relevant. Several studies have addressed car HMI but have paid less attention to designing and implementing interactive glazing for everyday (autonomous) driving contexts. Methods: Reflecting on the literature, we describe an engineering psychology practice and the design of six novel future user scenarios, which envision the application of a specific set of augmented reality (AR)-supported user interactions. Additionally, we conduct evaluations on specific scenarios and experiential prototypes, which reveal that these AR scenarios help the target user groups experience a new type of interaction. The overall evaluation is positive, with valuable assessment results and suggestions. Conclusions: This study may interest applied psychology educators who aspire to teach how AR can be operationalized in a human-centered design process to students with minimal pre-existing expertise or minimal scientific knowledge in engineering psychology.