In this paper, we present a complete set of procedures for automatically extracting a music snippet, defined as the most representative or highlighted excerpt of a music clip. We first generate a modified, compact similarity matrix based on selected features and distance metrics. Because a music snippet is usually part of the repeated melody, main theme, or chorus, we then apply several improved techniques for discovering repeated musical patterns. During this process, redundant and wrongly detected patterns are discarded, boundaries are corrected using beat information, and the final clusters are sorted by occurrence frequency and energy. Following these methods, we designed a music snippet extraction system that allows users to detect snippets. Experiments performed on the system show the superiority of the proposed approach.
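The abstract does not specify the features, the distance metric, or how the similarity matrix is modified and compacted. As a rough illustration of the general idea only, the sketch below builds a plain self-similarity (distance) matrix over per-frame feature vectors and picks the time lag whose diagonal is most self-similar, a common starting point for finding repeated sections. The cosine distance, the toy two-dimensional features, and the function names `similarity_matrix` and `best_repeat_lag` are all assumptions made for this example, not the authors' method.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # assumption: treat all-zero (silent) frames as dissimilar
    return 1.0 - dot / (norm_a * norm_b)


def similarity_matrix(frames):
    """Build a symmetric frame-by-frame distance matrix (0 = identical)."""
    n = len(frames)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = cosine_distance(frames[i], frames[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


def best_repeat_lag(matrix, min_lag=1):
    """Return the lag whose off-diagonal has the lowest mean distance.

    A low-distance diagonal at lag k suggests the signal repeats every
    k frames. Long lags average over few entries, so a real system would
    need extra filtering; this sketch does none.
    """
    n = len(matrix)
    best_lag, best_mean = None, float("inf")
    for lag in range(min_lag, n):
        diagonal = [matrix[i][i + lag] for i in range(n - lag)]
        mean = sum(diagonal) / len(diagonal)
        if mean < best_mean:
            best_lag, best_mean = lag, mean
    return best_lag


# Toy input: two alternating "notes" encoded as hypothetical 2-D features.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
matrix = similarity_matrix(frames)
lag = best_repeat_lag(matrix)
```

On the toy input the pattern repeats every two frames, so the lag-2 diagonal has zero distance. The steps the abstract goes on to list (discarding redundant patterns, beat-based boundary correction, sorting clusters by frequency and energy) are not modeled here.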
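In open-source software and open-source platforms, developers can submit issues to record the software bugs they find or to request new features. Owing to lack of experience, limited expertise, and similar factors, users may be unable to summarize issue content accurately and effectively, which leads to low-quality issue titles and in turn reduces the efficiency of issue resolution. In addition, existing methods for automatic issue title generation mainly target English-language open-source platforms such as GitHub and perform poorly when applied to Chinese open-source platforms such as Gitee. Moreover, existing methods mainly take the issue body description as input and ignore other important information in the issue, such as code snippets. To address this, this paper proposes GITG (Gitee Issue Title Generation), an automatic issue title generation method for the Gitee platform. For issues containing both Chinese and English text, GITG fine-tunes the Chinese-capable pre-trained model Chinese BART (Bidirectional and Auto-Regressive Transformers) on a constructed Gitee issue dataset, and uses bimodal information from the issue body description and code snippets to generate issue titles automatically. To validate GITG, a dataset of 18,242 Gitee issue samples was constructed. Experimental results show that GITG improves on iTAPE and iTiger by at least 13.09%, 10.18%, and 12.84% on the ROUGE-1, ROUGE-2, and ROUGE-L metrics, respectively, and also achieves gains on the BLEU and METEOR metrics. Human evaluation shows that the average scores of GITG-generated titles exceed those of iTAPE and iTiger by at least 26.7%, 20.8%, 24.2%, and 20.0% on the four criteria of overall score, fluency, informativeness, and conciseness, respectively.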
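For readers unfamiliar with the ROUGE-1 and ROUGE-2 metrics reported above, the following minimal sketch computes an n-gram overlap F1 score between a generated title and a reference title. It assumes both titles are already tokenized; the abstract does not say how mixed Chinese and English text is tokenized or which ROUGE implementation the authors used, so the function name `rouge_n` and the example titles are illustrative only.

```python
from collections import Counter


def rouge_n(candidate_tokens, reference_tokens, n=1):
    """Return the ROUGE-N F1 score between two token lists.

    Overlap is the clipped count of n-grams shared by both lists.
    """
    def ngrams(tokens):
        return Counter(
            tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )

    cand = ngrams(candidate_tokens)
    ref = ngrams(reference_tokens)
    overlap = sum((cand & ref).values())  # multiset intersection
    if overlap == 0:
        return 0.0  # also covers empty candidate or reference
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical titles, whitespace-tokenized for simplicity.
generated = "fix crash on login".split()
reference = "fix login crash".split()
rouge_1 = rouge_n(generated, reference, n=1)
rouge_2 = rouge_n(generated, reference, n=2)
```

Here all three reference unigrams appear in the generated title, giving a precision of 3/4 and a recall of 1, while no bigram is shared, so the ROUGE-2 score is zero. ROUGE-L, which is based on the longest common subsequence, is not covered by this sketch.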
Funding: Supported by the National Natural Science Foundation of China (Grant No. 60873098)