Abstract: With the explosive growth of false information on social media platforms, the automatic detection of multimodal false information has received increasing attention. Recent research has contributed significantly to multimodal information exchange and fusion, with many methods attempting to integrate unimodal features to generate multimodal news representations. However, these methods have yet to fully explore the hierarchical and complex semantic correlations between the contents of different modalities, which severely limits their performance in detecting multimodal false information. This work proposes a two-stage framework for multimodal false information detection, called ASMFD, which segments on the basis of image aesthetic similarity and explores the consistency and inconsistency features of images and texts. Specifically, we first use the Contrastive Language-Image Pre-training (CLIP) model to learn the relationship between text and images through label awareness, and train an image aesthetic attribute scorer using an aesthetic attribute dataset. Then, we calculate the aesthetic similarity between the image and related images and use this similarity as a threshold to divide the multimodal correlation matrix into consistency and inconsistency matrices. Finally, a fusion module is designed to identify the features essential for detecting multimodal false information. In extensive experiments on four datasets, ASMFD outperforms state-of-the-art baseline methods.
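The thresholding step can be illustrated with a minimal sketch. It assumes that entries at or above the threshold are treated as consistent and the rest as inconsistent, with non-members zeroed out; the abstract does not specify these details, and the function name `split_by_threshold` and the sample values are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def split_by_threshold(corr: Matrix, threshold: float) -> Tuple[Matrix, Matrix]:
    """Split a correlation matrix into consistency and inconsistency parts.

    Assumption (not from the abstract): entries >= threshold are "consistent",
    the others "inconsistent"; entries outside a part are set to 0.0.
    """
    consistency = [[v if v >= threshold else 0.0 for v in row] for row in corr]
    inconsistency = [[v if v < threshold else 0.0 for v in row] for row in corr]
    return consistency, inconsistency


# Toy text-image correlation matrix; in ASMFD the threshold would come from
# the aesthetic similarity score rather than a fixed constant.
corr = [[0.9, 0.2], [0.4, 0.7]]
cons, incons = split_by_threshold(corr, threshold=0.5)
```

Each entry lands in exactly one of the two matrices, so their element-wise sum reproduces the original matrix.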
Funding: Project (61003140) supported by the National Natural Science Foundation of China; Project (013/2010/A) supported by the Macao Science and Technology Development Fund; Project (10YJC630236) supported by the Social Science Foundation for the Youth Scholars of the Ministry of Education of China.
Abstract: Role mining and setup affect the usage of role-based access control (RBAC). Traditionally, the assignment of users' roles and permissions is handled by the system's security administrator. However, this is costly and the operating process is complex. A new role analysis method was proposed that generates mappings and uses them to provide recommendations for systems. The relation among the sets of permissions, roles, and users was explored by generating mappings, and the relation between the sets of users and attributes was analyzed by means of the concept lattice model, generating a critical mapping between the attribute and permission sets and making the meaning of a role natural and operational. Thus, a role is determined by a permission set and the user's attributes. The generated mappings were used to automatically assign permissions and roles to new users. Experimental results show that the proposed algorithm is effective and efficient.
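The idea of an attribute-to-permission mapping that drives recommendations for new users can be sketched as follows. This is a deliberately simplified stand-in, not the concept-lattice construction of the paper: it assumes an attribute maps to the permissions shared by every existing user holding that attribute. All names and sample data are hypothetical.

```python
from typing import Dict, Set


def attribute_permission_map(
    user_attrs: Dict[str, Set[str]],
    user_perms: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Map each attribute to the permissions common to all users holding it."""
    mapping: Dict[str, Set[str]] = {}
    for user, attrs in user_attrs.items():
        perms = user_perms.get(user, set())
        for attr in attrs:
            if attr in mapping:
                mapping[attr] &= perms  # keep only permissions shared so far
            else:
                mapping[attr] = set(perms)
    return mapping


def recommend(attrs: Set[str], mapping: Dict[str, Set[str]]) -> Set[str]:
    """Recommend the union of permissions mapped from a new user's attributes."""
    result: Set[str] = set()
    for attr in attrs:
        result |= mapping.get(attr, set())
    return result


user_attrs = {"alice": {"dept:hr", "level:manager"}, "bob": {"dept:hr"}}
user_perms = {"alice": {"read_records", "approve_leave"}, "bob": {"read_records"}}
mapping = attribute_permission_map(user_attrs, user_perms)
new_user_perms = recommend({"dept:hr"}, mapping)
```

Intersecting permissions per attribute is a conservative choice: a new user receives only what every existing holder of that attribute already has.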
Funding: This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (Grant Number 2020R1A6A1A03040583).
Abstract: Time series forecasting has become an important aspect of data analysis and has many real-world applications. However, undesirable missing values are often encountered, which may adversely affect many forecasting tasks. In this study, we evaluate and compare the effects of imputation methods for estimating missing values in a time series. Our approach does not include a simulation to generate pseudo-missing data, but instead performs imputation on actual missing data and measures the performance of the forecasting model created therefrom. In the experiment, therefore, several time series forecasting models are trained using different training datasets, each prepared with one of the imputation methods. Subsequently, the performance of the imputation methods is evaluated by comparing the accuracy of the forecasting models. The results obtained from a total of four experimental cases show that the k-nearest neighbor technique is the most effective in reconstructing missing data and contributes positively to time series forecasting compared with other imputation methods.
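As a rough illustration of k-nearest-neighbor imputation, the sketch below fills each missing point with the mean of the k observed points closest in time. Using time-index distance as the neighbor metric is an assumption made here for brevity; the study may define neighbors over other features. The function name `knn_impute` and the sample series are hypothetical.

```python
from typing import List, Optional


def knn_impute(series: List[Optional[float]], k: int = 2) -> List[float]:
    """Fill None entries with the mean of the k nearest observed values.

    "Nearest" is measured by distance in time index (an assumption).
    """
    observed = [(i, v) for i, v in enumerate(series) if v is not None]
    if not observed:
        raise ValueError("series has no observed values")
    filled: List[float] = []
    for i, v in enumerate(series):
        if v is not None:
            filled.append(v)
            continue
        # Sort observed points by how far they are from the missing index.
        nearest = sorted(observed, key=lambda p: abs(p[0] - i))[:k]
        filled.append(sum(val for _, val in nearest) / len(nearest))
    return filled


series = [1.0, None, 3.0, 4.0, None, 6.0]
filled = knn_impute(series, k=2)
```

Following the evaluation design in the abstract, the imputed series would then be used to train a forecasting model, and imputation methods would be compared by the resulting forecast accuracy rather than by reconstruction error on simulated gaps.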
Funding: Supported by the National Key Research and Development Plan (2016YFB1001200), the Natural Science Foundation of China (U1435220, 61232013), and the Natural Science Research Projects of Universities in Jiangsu Province (16KJA520003).
Abstract: Biography videos based on the life performances of prominent historical figures aim to describe great men's lives. In this paper, a novel interactive video summarization for biography video based on multimodal fusion is proposed. It visualizes the specific features of biography video and supports interaction with video content by taking advantage of multimodality. In general, the story of a movie progresses through the dialogues of its characters, and the subtitles are produced on the basis of these dialogues, which contain all the information related to the movie. JGibbsLDA is applied to extract keywords from subtitles because a biography video consists of different aspects that depict a character's whole life. To fuse keywords and key-frames, affinity propagation is adopted to calculate the similarity between each key-frame cluster and the keywords. Through this method, a video summarization based on multimodal fusion is presented that describes video content more completely. To reduce the time spent searching for video content of interest and to obtain the relationships between main characters, a kind of map is adopted to visualize video content and interact with the video summarization. An experiment is conducted to evaluate the video summarization, and the results demonstrate that this system can formally facilitate the exploration of video content while improving interaction and finding events of interest efficiently.
Funding: Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (No. NRF-2022R1I1A1A01069526).
Abstract: Deep learning-based action classification technology has been applied to various fields, such as social safety, medical services, and sports. Analyzing an action on a practical level requires tracking multiple human bodies in an image in real time and simultaneously classifying their actions. There are various related studies on the real-time classification of actions in an image. However, existing deep learning-based action classification models have slow response speeds, which limits real-time analysis. In addition, their accuracy in classifying each object's action is low when multiple objects appear in the image, and they incur a memory overhead when processing image data. Deep learning-based action classification using one-shot object detection is proposed to overcome the limitations of multiframe-based analysis technology. The proposed method uses a one-shot object detection model and a multi-object tracking algorithm to detect and track multiple objects in the image. Then, a deep learning-based pattern classification model is used to classify the body action of each object in the image by reducing the data for each object to an action vector. Compared with existing studies, the constructed model shows a higher accuracy of 74.95%, and in terms of speed, it performs better than current studies at 0.234 s per frame. The proposed model makes it possible to classify some actions through action vector learning alone, without additional image learning, because of the vector learning feature of the posterior neural network. Therefore, it is expected to contribute significantly to commercializing realistic streaming data analysis technologies, such as CCTV.
Funding: Supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2022R1F1A1068828).
Abstract: Object tracking, an important technology in the field of image processing and computer vision, is used to continuously track a specific object or person in an image. This technology may be effective in identifying the same person within one image, but it has limitations in handling multiple images owing to the difficulty of identifying whether an object appearing in other images is the same. When tracking the same object using two or more images, there must be a way to determine that objects existing in different images are the same object. Therefore, this paper attempts to determine the same object present in different images using color information, one of the unique pieces of information about the object. Thus, this study proposes a multiple-object-tracking method using histogram stamp extraction in closed-circuit television applications. The proposed method determines the presence or absence of a target object in an image by comparing the similarity between the image containing the target object and other images. To this end, a unique color value of the target object is extracted based on its color distribution in the image using three methods: mean, mode, and interquartile range. The Top-N accuracy method is used to analyze the accuracy of each method, and the results show that the mean method had an accuracy of 93.5% (Top-2). Furthermore, the positive predictive value experiment shows that the accuracy of the mean method was 65.7%. The analysis shows that it is possible to detect and track the same object present in different images using the unique color of the object. These results indicate that the same object can be tracked across different images with minimal manpower and without using personal information. Finally, the response speed experiment showed that, when the mean was used, the color of the object could be extracted in real time, in 0.016954 s. Thus, the proposed method makes it possible to detect and track the same object in real time.
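The three color statistics named above can be sketched for a single color channel as follows. How the original work turns the interquartile range into a single color value is not stated; the sketch assumes the mean of the values lying between the first and third quartiles. The function name and sample pixel values are hypothetical.

```python
import statistics
from typing import Dict, List


def representative_values(channel: List[int]) -> Dict[str, float]:
    """Return candidate 'unique color' values for one channel of an object.

    iqr_mean (an assumption): mean of the values between Q1 and Q3, which
    discards outliers such as background pixels inside the bounding box.
    """
    q1, _, q3 = statistics.quantiles(channel, n=4)
    inside = [v for v in channel if q1 <= v <= q3]
    return {
        "mean": statistics.mean(channel),
        "mode": statistics.mode(channel),
        "iqr_mean": statistics.mean(inside),
    }


# Toy channel values for a tracked object; 200 mimics a stray bright pixel.
pixels = [10, 12, 12, 13, 14, 200]
values = representative_values(pixels)
```

In this toy example the plain mean is pulled upward by the single outlier, while the mode and the interquartile variant stay close to the dominant color, which illustrates why the three statistics can rank differently in practice.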
Funding: Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (2020R1A6A1A03040583), and by Kyonggi University’s Graduate Research Assistantship 2023.
Abstract: Artificial intelligence is increasingly being applied in the field of video analysis, particularly in the area of public safety, where video surveillance equipment such as closed-circuit television (CCTV) is used and automated analysis of video information is required. However, various issues such as data size limitations and low processing speeds make real-time extraction of video data challenging. Video analysis technology applies object classification, detection, and relationship analysis to continuous 2D frame data, and the various meanings within the video are then analyzed based on the extracted basic data. Motion recognition is key in this analysis. It is a challenging field that analyzes human body movements, requiring the interpretation of complex movements of human joints and the relationships between various objects. The deep learning-based human skeleton detection algorithm is a representative motion recognition algorithm. Recently, motion analysis models such as the SlowFast network algorithm have also been developed with excellent performance. However, these models do not operate properly in most outdoor wide-angle video environments and display low response speeds, as expected for motion classification extraction from high-resolution images. The proposed method achieves a high level of extraction and accuracy by improving SlowFast’s input data preprocessing and data structure methods. The input data are preprocessed through object tracking and background removal using YOLO and DeepSORT. A higher performance than that of a single model is achieved by restructuring the existing SlowFast’s data into a frame-unit structure. Based on the confusion matrix, accuracies of 70.16% and 70.74% were obtained for the existing SlowFast and the proposed model, respectively, a 0.58 percentage-point increase in accuracy. Comparing detection based on behavioral classification, the existing SlowFast detected 2,341,164 cases, whereas the proposed model detected 3,119,323 cases, which is an increase of 33.23%.
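The two reported gains are of different kinds: the accuracy gain is an absolute difference between two percentages, while the detection gain is a relative increase over the baseline count. A small sketch of the arithmetic, using only the figures quoted in the abstract (the helper names are hypothetical):

```python
def percentage_point_gain(baseline_pct: float, proposed_pct: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return proposed_pct - baseline_pct


def relative_increase_pct(baseline: float, proposed: float) -> float:
    """Relative increase over the baseline, expressed in percent."""
    return (proposed - baseline) / baseline * 100.0


# Accuracy: 70.16% (existing SlowFast) vs 70.74% (proposed model).
acc_gain = percentage_point_gain(70.16, 70.74)

# Detected cases: 2,341,164 (existing SlowFast) vs 3,119,323 (proposed model).
det_gain = relative_increase_pct(2_341_164, 3_119_323)
```

Here `acc_gain` comes to about 0.58 percentage points and `det_gain` to roughly 33.2%, in line with the figures stated in the abstract.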
Funding: Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (2020R1A6A1A03040583).
Abstract: With the increasing number of digital devices generating vast amounts of video data, the recognition of abnormal image patterns has become more important. Accordingly, it is necessary to develop a method that achieves this task using object and behavior information within video data. Existing methods for detecting abnormal behaviors focus only on simple motions and therefore cannot determine the overall behavior occurring throughout a video. In this study, an abnormal behavior detection method that uses deep learning (DL)-based video-data structuring is proposed. Objects and motions are first extracted from continuous images by combining existing DL-based image analysis models. The weight of the continuous data pattern is then analyzed through data structuring to classify the overall video. The performance of the proposed method was evaluated using varying parameter settings, such as the size of the action clip and the interval between action clips. The model achieved an accuracy of 0.9817, indicating excellent performance. Therefore, we conclude that the proposed data structuring method is useful in detecting and classifying abnormal behaviors.
Funding: Supported by the Program of the National Natural Science Foundation of China (No. 62001320) and the special fund for Science and Technology Innovation Teams of Shanxi Province (No. 202304051001035).
Abstract: Purpose - The performance of vehicular users and cellular users (CUE) in vehicular networks is strongly affected by the resources allocated to them. The purpose of this paper is to investigate resource allocation for vehicular communications when multiple V2V links and a V2I link share spectrum with CUE in uplink communication under different Quality of Service (QoS) requirements. Design/methodology/approach - An optimization model to maximize the V2I capacity is established based on slowly varying large-scale fading channel information. Multiple V2V links are clustered using the sparrow search algorithm (SSA) to reduce interference. Then, a weighted tripartite graph is constructed by jointly optimizing the power of the CUE, V2I, and V2V clusters. Finally, spectrum resources are allocated using a weighted 3D matching algorithm. Findings - The performance of the proposed algorithm is tested. Simulation results show that the proposed algorithm can maximize the channel capacity of V2I while ensuring the reliability of V2V and the quality of service of CUE. Originality/value - There is a lack of research on resource allocation algorithms for CUE, V2I, and multiple V2V links under different QoS requirements. To solve this problem, a new resource allocation algorithm is proposed in this paper. First, multiple V2V links are clustered using SSA to reduce interference. Second, the power allocation of CUE, V2I, and V2V is jointly optimized. Finally, the weighted 3D matching algorithm is used to allocate spectrum resources.