Video data comprise multimodal information streams, including visual, auditory, and textual streams. This paper describes a multimodal approach to story segmentation for news video. The proposed approach detects topic-caption frames and integrates them with silence-clip detection results and shot segmentation results to locate news story boundaries. Integrating audio-visual features with text information overcomes the weakness of approaches that rely on image analysis alone. On test data of 135 400 frames, story boundary detection achieves an accuracy rate of 85.8% and a recall rate of 97.5%. The experimental results show that the approach is valid and robust.
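The integration step can be illustrated with a small sketch. The abstract does not state the fusion rule, so the rule below is an assumption made only for illustration: a shot cut is kept as a story boundary when it falls near a silence clip and a topic-caption frame appears shortly after it. The function name, the frame-based inputs, and the `tolerance` and `caption_window` parameters are all hypothetical.

```python
def locate_story_boundaries(shot_cuts, silence_clips, caption_frames,
                            tolerance=25, caption_window=250):
    """Fuse three detector outputs into story boundaries (assumed rule).

    shot_cuts      -- frame indices of detected shot boundaries
    silence_clips  -- (start_frame, end_frame) pairs of detected silence
    caption_frames -- frame indices where a topic caption was detected
    """
    boundaries = []
    for cut in shot_cuts:
        # Assumption: a story change coincides with a pause in the audio.
        near_silence = any(start - tolerance <= cut <= end + tolerance
                           for start, end in silence_clips)
        # Assumption: a topic caption appears soon after a new story starts.
        caption_follows = any(cut <= frame <= cut + caption_window
                              for frame in caption_frames)
        if near_silence and caption_follows:
            boundaries.append(cut)
    return boundaries


cuts = locate_story_boundaries(
    shot_cuts=[100, 500, 900],
    silence_clips=[(90, 110), (880, 905)],
    caption_frames=[150, 1300],
)
print(cuts)
```

Requiring agreement between the visual, audio, and text cues is one simple way to combine them; a weighted vote or a learned classifier would be equally plausible readings of the abstract.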
To ensure the safety, quality, and efficiency of computer numerical control (CNC) machine tool processing, a real-time monitoring and visualization solution for CNC machine tools based on HyperText Markup Language (HTML) 5 is proposed. The characteristics of real-time monitoring technology for CNC machine tools under the traditional Client/Server (C/S) structure are compared and analyzed, and its technical drawbacks are identified. Web real-time communication technology and browser drawing technology are studied in depth. A real-time monitoring and visualization system for CNC machine tool data is developed on the Metro platform, combining WebSocket real-time communication technology and Canvas drawing technology. The system architecture is given, and the functions and implementation methods of the system are described in detail. Practical application results show that WebSocket real-time communication can effectively reduce bandwidth and network delay and save server resources. The CNC machine data monitoring system reflects machine data intuitively, with a good visual effect. It enables timely monitoring of equipment alarms and promptly notifies maintenance and management personnel.
In recent years, text visualization has been widely acknowledged as an effective approach for understanding the structure and patterns hidden in complicated textual information. In this paper, we propose a new visualization system called TextInsight, with two contributions. First, a textual entropy theory is introduced to encode the distribution of semantic importance in the corpus. Based on the proposed multidimensional joint probability histogram in vector fields, the improved algorithm provides a novel way to accurately locate valuable information in massive collections of short texts. Second, a map-like metaphor is generated to visualize the textual topics and their relationships. To address over-segmentation in the layout and clustering procedure, we propose an optimization algorithm combining Affinity Propagation (AP) and MultiDimensional Scaling (MDS); the improved geographical representation is more comprehensible and aesthetically appealing. Our experimental results and initial user feedback suggest that the system is effective in aiding text analysis.
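The abstract does not define its textual entropy measure. As a point of reference only, the sketch below computes ordinary Shannon entropy over a word-frequency distribution, a standard quantity on which entropy-based importance scores are commonly built. It is not the TextInsight formulation, and the function name is hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(tokens):
    """Shannon entropy (in bits) of the empirical token distribution."""
    counts = Counter(tokens)
    total = sum(counts.values())
    # H = -sum(p * log2(p)) over the observed tokens.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# A uniform two-word distribution carries exactly one bit of entropy.
print(shannon_entropy(["map", "topic"]))
```

A text that repeats one word has zero entropy, while a text whose words are all equally frequent reaches the maximum for its vocabulary size.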
Text visualization is concerned with representing text in graphical form to facilitate comprehension of large textual data. Its aim is to improve the ability to understand and utilize the wealth of text-based information available. An essential task in any scientific research is the study and review of previous work in the specified domain, a process referred to as the literature survey. This process involves identifying prior work and evaluating its relevance to the research question. With the enormous number of published studies available online in digital form, this becomes a cumbersome task for the researcher. This paper presents the design and implementation of a tool that aims to facilitate this process by identifying relevant work and suggesting clusters of articles through conceptual modeling, providing different options that enable the researcher to visualize a large number of articles in a graphical, easy-to-analyze form. The tool helps the researcher analyze and synthesize the literature and build a conceptual understanding of the designated research area. The evaluation shows that researchers found the tool useful and that it supported the analysis of relevant work for a specific research question; 70% of the evaluators found it very useful.
In this paper, the visualization of special features in “The Tale of Genji”, a representative work of Japanese classical literature, is studied by text mining the auxiliary verbs and examining the similarity in sentence style through correspondence analysis with clustering. The result shows that the text mining error in the number of auxiliary verbs can be as small as 15%. The features extracted in this study support the hypothesis that “The Tale of Genji” had multiple authors, which agrees well with the result of Murakami and Imanishi [1]. It is also found that the extracted features are robust to the text mining error, which suggests that the classification error is little affected by the text mining error and that this technique could be used for further statistical study of classical literature.
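Existing medical image synthesis techniques fall short in accurately capturing complex anatomical structures and pathological states, and therefore generate low-quality chest X-rays that are inconsistent with reality. To address this problem, this paper proposes an innovative medical latent diffusion model, Chest-Chat. The proposed model improves on prior research results by introducing a multimodal text encoder, MedA-BERT (Medical Attention Strategy Pre-training of Deep Bidirectional Transformers for Language Understanding). The encoder is built with a cross-modal vision-language pre-training strategy that strengthens the deep semantic connection between chest X-ray images and their corresponding text reports; combined with a bidirectional cross-attention mechanism and contrastive learning, it significantly enhances the model's ability to understand and process the semantics of medical imaging reports. Combining MedA-BERT with the visual module of the latent diffusion model enables Chest-Chat to generate high-quality chest X-rays with detailed anatomical and pathological descriptions. The model was evaluated extensively on two public datasets, CheXpert and MIMIC-CXR (Chest X-ray). The experimental results show that Chest-Chat achieves an FID InceptionV3 (Fréchet Inception Distance) of 58.38, an FID XRV of 3.69, and an MS-SSIM (Multi-Scale Structural Similarity) of 0.12±0.11, outperforming existing methods.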
Visualization methods for single documents are either too simple, considering word frequency only, or depend on syntactic and semantic information bases to be more useful. This paper presents an intermediate approach, based on H. P. Luhn’s automatic abstract creation algorithm, that aims to bring more information into document visualization than word-counting methods do, without the need for external sources. The method takes pairs of relevant words and computes the linkage force between them. Relevant words become vertices and links become edges in the resulting graph.
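The graph construction described above can be sketched as follows. The abstract does not define how relevance or linkage force is computed, so two assumptions are made here purely for illustration: a word is relevant when it is not a stopword and occurs at least `min_freq` times, and the linkage force between two relevant words is the number of sentences in which both appear. The stopword list and all names are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are", "on"}


def build_word_graph(text, min_freq=2):
    """Return (vertices, edges); edges maps a word pair to its linkage force."""
    sentences = [s for s in re.split(r"[.!?]+", text.lower()) if s.strip()]
    tokenized = [[w for w in re.findall(r"[a-z]+", s) if w not in STOPWORDS]
                 for s in sentences]

    # Assumed relevance test: frequent non-stopwords, in the spirit of Luhn.
    freq = Counter(w for sentence in tokenized for w in sentence)
    vertices = {w for w, count in freq.items() if count >= min_freq}

    # Assumed linkage force: number of sentences containing both words.
    edges = Counter()
    for sentence in tokenized:
        present = sorted(set(sentence) & vertices)
        for pair in combinations(present, 2):
            edges[pair] += 1
    return vertices, dict(edges)


vertices, edges = build_word_graph(
    "Graphs show words. Words link graphs. Cats sleep."
)
print(vertices, edges)
```

Sorting the words before pairing them keeps each undirected edge under a single key, so ("graphs", "words") and ("words", "graphs") are never counted separately.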
A two-stage deep learning algorithm for detecting and recognizing spray-code numbers on can bottoms is proposed to address the problems of small character areas and fast production line speeds in can bottom spray code number recognition. In the code number detection stage, the Differentiable Binarization Network is used as the backbone network, combined with the Attention and Dilation Convolutions Path Aggregation Network feature fusion structure to improve detection performance. For text recognition, end-to-end training of the Scene Visual Text Recognition network for code numbers alleviates recognition errors caused by image color distortion due to variations in lighting and background noise. In addition, model pruning and quantization are used to reduce the number of model parameters to meet deployment requirements in resource-constrained environments. A comparative experiment was conducted on a dataset of tank bottom spray code numbers collected on-site, and a transfer experiment was conducted on a dataset of packaging box production dates. The experimental results show that the proposed algorithm can effectively locate the codes on cans at different positions on the roller conveyor and can accurately identify the code numbers at high production line speeds. The Hmean of code number detection is 97.32%, and the accuracy of code number recognition is 98.21%. This verifies that the proposed algorithm achieves high accuracy in code number detection and recognition.
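In text detection benchmarks, Hmean conventionally denotes the harmonic mean of detection precision and recall (the F1 score). The sketch below shows that calculation; the abstract reports only the final Hmean, so the input values are illustrative and are not taken from the paper.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall, as used to score text detection."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing is detected correctly
    return 2 * precision * recall / (precision + recall)


# Illustrative values only: half of the detections are correct, all codes found.
print(hmean(0.5, 1.0))
```

Because the harmonic mean is dominated by the smaller of its two inputs, a high Hmean requires both precision and recall to be high.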
Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (No. 2013AA7013033)