Although generative conversational artificial intelligence (AI) can answer questions well and hold conversations like a person, the semantic ambiguity inherent in text-based communication poses challenges to effective use. Effective use reflects users' utilization of generative conversational AI to achieve their goals, which has not been previously studied. Drawing on media naturalness theory, we examined how generative conversational AI's content and style naturalness affect effective use. A two-wave survey was conducted to collect data from 565 users of generative conversational AI. Two techniques were used in this study. First, partial least squares structural equation modeling (PLS-SEM) was applied to determine the variables that significantly affected the mechanisms (i.e., cognitive effort and communication ambiguity) and effective use. Second, an artificial neural network (ANN) model was used to evaluate the relative importance of the significant predictors of the mechanisms and effective use identified in the PLS-SEM analysis. The results revealed that the naturalness of content and style differed in their effects on cognitive effort and communication ambiguity. Additionally, cognitive effort and communication ambiguity negatively affected effective use. This study advances the literature on effective use by uncovering the psychological mechanisms underlying effective use and their antecedents. It also offers insights into the design of generative conversational AI.
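The abstract does not specify how the ANN stage derives relative importance from the trained network. A common choice in two-stage PLS-SEM/ANN studies is Garson's algorithm, which apportions importance from the absolute connection weights of a one-hidden-layer network. The sketch below illustrates that algorithm only; the weight values and the mapping of predictor 0 to "content naturalness" are invented for illustration.

```python
def garson_importance(W_ih, w_ho):
    """Relative importance of each input via Garson's algorithm.

    W_ih[i][j]: weight from input i to hidden unit j.
    w_ho[j]:    weight from hidden unit j to the output.
    """
    n_in, n_hid = len(W_ih), len(w_ho)
    # contribution of input i routed through hidden unit j
    c = [[abs(W_ih[i][j]) * abs(w_ho[j]) for j in range(n_hid)]
         for i in range(n_in)]
    # normalise within each hidden unit, then sum each input's shares
    col = [sum(c[i][j] for i in range(n_in)) for j in range(n_hid)]
    s = [sum(c[i][j] / col[j] for j in range(n_hid)) for i in range(n_in)]
    total = sum(s)
    return [x / total for x in s]  # sums to 1 across inputs

# hypothetical trained weights: predictor 0 dominates by construction
W = [[2.0, 1.5],
     [0.3, 0.4],
     [0.2, 0.1]]
v = [1.0, 0.8]
ri = garson_importance(W, v)
```

The resulting vector ranks the PLS-SEM-significant predictors by their share of the network's connection weights, which is how such studies typically report "relative importance".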
This paper presents an innovative approach to enhance the querying capability of ChatGPT, a conversational artificial intelligence model, by incorporating voice-based interaction and a convolutional neural network (CNN)-based impaired-vision detection model. The proposed system aims to improve user experience and accessibility by allowing users to interact with ChatGPT using voice commands. Additionally, a CNN-based model is employed to detect impairments in users' vision, enabling the system to adapt its responses and provide appropriate assistance. This research tackles head-on the challenges of user experience and inclusivity in artificial intelligence (AI), underscoring our commitment to making ChatGPT more accessible and valuable for a broader audience. The integration of voice-based interaction and impaired-vision detection represents a novel approach to conversational AI. Notably, this innovation goes beyond novelty; it has the potential to profoundly impact the lives of users, particularly those with visual impairments. The modular approach to system design ensures adaptability and scalability, which are critical for the practical implementation of these advancements. Crucially, the solution places the user at its core: customizing responses for those with visual impairments demonstrates AI's potential not only to understand but also to accommodate individual needs and preferences.
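The paper's CNN architecture is not described in this abstract, so no faithful reimplementation is possible here. As a minimal stand-in, the sketch below shows the convolution building block such a model rests on, applied as a classic variance-of-Laplacian sharpness check (a non-learned heuristic sometimes used to flag degraded or blurred image captures). All names and thresholds are illustrative assumptions, not the authors' model.

```python
import random

def conv2d(img, kernel):
    # valid-mode 2D convolution over nested lists (no padding)
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(img), len(img[0])
    return [[sum(img[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(w - kw + 1)]
            for i in range(h - kh + 1)]

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

LAPLACIAN = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]

def sharpness_score(img):
    # variance of the Laplacian response: low values suggest a blurred capture
    return variance([v for row in conv2d(img, LAPLACIAN) for v in row])

random.seed(0)
sharp = [[random.random() for _ in range(16)] for _ in range(16)]
BOX = [[1 / 9] * 3 for _ in range(3)]  # crude box blur as a degraded stand-in
blurry = conv2d(sharp, BOX)
```

A learned CNN would replace the fixed Laplacian kernel with trained filters, but the sliding-window computation is the same.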
Around 2010, academic circles witnessed a surge in AI research fueled by breakthroughs such as the ImageNet project, a publicly available large-scale image database. The field reached a tipping point in 2016, when Google's AlphaGo defeated Go world champion Lee Se-dol, and gained widespread public attention with the release of OpenAI's ChatGPT in November 2022. Just one year after ChatGPT's debut, Chinese AI firm DeepSeek launched its open-source general large model, a milestone in the evolution of AI technology.
People interact with each other through conversation; in particular, we communicate through dialogue and exchange emotions and information in it. Emotions are essential characteristics of natural language. Conversational artificial intelligence is an integral part of the technologies that allow computers to communicate like humans. For a computer to interact like a human being, it must understand the emotions inherent in a conversation and generate appropriate responses. However, existing dialogue systems focus only on improving the quality of natural language understanding or natural language generation, excluding emotions. We propose a chatbot based on emotion, an essential element of conversation. EP-Bot (an Empathetic PolarisX-based chatbot) is an empathetic chatbot that can better understand a person's utterance by utilizing PolarisX, an auto-growing knowledge graph. PolarisX extracts new relationship information and expands the knowledge graph automatically, which helps computers understand a person's common sense. The proposed EP-Bot extracts knowledge graph embeddings using PolarisX and detects the emotion and dialog act of an utterance; it then generates the next utterance using the embeddings. EP-Bot can thus understand and create a conversation that incorporates a person's common sense, emotion, and intention. We verify the novelty and accuracy of EP-Bot through experiments.
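EP-Bot's pipeline (detect emotion and dialog act, then condition the response on both) can be sketched without its learned components. The toy version below substitutes keyword lexicons and templates for PolarisX embeddings and the neural generator; every lexicon entry and template here is an invented placeholder, shown only to make the pipeline's structure concrete.

```python
# toy lexicons standing in for EP-Bot's learned classifiers
EMOTION_LEXICON = {"sad": "sadness", "happy": "joy", "angry": "anger"}
QUESTION_WORDS = ("what", "why", "how", "who", "when", "where")

def detect_emotion(utterance):
    for word in utterance.lower().split():
        if word in EMOTION_LEXICON:
            return EMOTION_LEXICON[word]
    return "neutral"

def detect_dialog_act(utterance):
    u = utterance.lower().strip()
    if u.endswith("?") or u.split()[0] in QUESTION_WORDS:
        return "question"
    return "statement"

# response selection conditioned on (emotion, dialog act), as in the pipeline
RESPONSE_TEMPLATES = {
    ("sadness", "statement"): "I'm sorry to hear that. Do you want to talk about it?",
    ("joy", "statement"): "That's wonderful to hear!",
}

def respond(utterance):
    key = (detect_emotion(utterance), detect_dialog_act(utterance))
    return RESPONSE_TEMPLATES.get(key, "I see. Tell me more.")
```

In the actual system, the two detectors are models over knowledge-graph embeddings and the templates are replaced by a generator, but the information flow is the same: utterance → (emotion, dialog act) → next utterance.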
Google's Bard has emerged as a formidable competitor to OpenAI's ChatGPT in the field of conversational AI. Notably, Bard has recently been updated to handle visual inputs alongside text prompts during conversations. Given Bard's impressive track record in handling textual inputs, we explore its capabilities in understanding and interpreting visual data (images) conditioned on text questions. This exploration holds the potential to unveil new insights and challenges for Bard and other forthcoming multi-modal generative models, especially in addressing complex computer vision problems that demand accurate visual and language understanding. Specifically, in this study, we focus on 15 diverse task scenarios encompassing regular, camouflaged, medical, underwater, and remote sensing data to comprehensively evaluate Bard's performance. Our primary finding is that Bard still struggles in these vision scenarios, highlighting the significant gap in vision-based understanding that needs to be bridged in future developments. We expect that this empirical study will prove valuable in advancing future models, leading to enhanced capabilities in comprehending and interpreting fine-grained visual data. Our project is released at https://github.com/htqin/GoogleBard-VisUnderstand.
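A multi-scenario evaluation of this kind reduces to looping a model over per-scenario (image, question, answer) cases and reporting per-scenario accuracy. The harness below is a generic sketch, not the authors' released code; the stub model and the scenario data are invented placeholders.

```python
def evaluate(model, scenarios):
    """Per-scenario accuracy for a vision-language model.

    scenarios: {scenario_name: [(image, question, expected_answer), ...]}
    model:     callable (image, question) -> answer string
    """
    results = {}
    for name, cases in scenarios.items():
        correct = sum(model(img, q) == a for img, q, a in cases)
        results[name] = correct / len(cases)
    return results

# stub that always answers "cat" stands in for the model under test
stub = lambda image, question: "cat"

scenarios = {
    "regular": [("img1", "what animal?", "cat"),
                ("img2", "what animal?", "dog")],
    "camouflaged": [("img3", "what animal?", "cat")],
}
```

Real runs would replace exact-match comparison with the task-appropriate metric (e.g., IoU for segmentation scenarios), but the scenario-wise aggregation is the part that lets a study contrast, say, regular versus camouflaged performance.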
Funding: supported by the National Natural Science Foundation of China (NSFC) (Grant No. 72171095), the National Social Science Foundation of China (Grant No. 22VRC153), and the Wuhan Textile University Fund (Grant Nos. 2024289 and 2024380).
Funding: This work was supported and funded by the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU) (Grant Number: IMSIU-RP23008).
Funding: supported by the Basic Science Research Program through the NRF (National Research Foundation of Korea), by the MSIT (Ministry of Science and ICT), Korea, under the National Program for Excellence in SW supervised by the IITP (Institute for Information & Communications Technology Promotion), and by the Gachon University research fund of 2019 (Nos. NRF2019R1A2C1008412, 2015-0-00932, GCU-2019-0773).