Abstract: Reading, especially reading at the text level, is a process of continuously repeating, consolidating, and understanding words, phrases, and sentences. From the perspective of psycholinguistics, however, the comparison and contrast of empirical data from experiments and research provide evidence that reading at the text level is influenced mainly by readers' ability to identify a text's cohesion and coherence, to balance the activation and suppression of background knowledge, and to make use of working memory. Readers can draw on these three aspects to advance their text-level reading comprehension.
Abstract: This paper presents a description and performance evaluation of a new bit-level, lossless, adaptive, and asymmetric data compression scheme based on the adaptive character wordlength (ACW(n)) algorithm. The proposed scheme improves the compression ratio of the ACW(n) algorithm by dividing the binary sequence into a number of subsequences (s), each satisfying the condition that the number of decimal values (d) of its n-bit characters is at most 256. The new scheme is therefore referred to as ACW(n, s), where n is the adaptive character wordlength and s is the number of subsequences. The scheme was used to compress a number of text files from standard corpora. The results demonstrate that ACW(n, s) achieves a higher compression ratio than many widely used compression algorithms and performs competitively against state-of-the-art compression tools.
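A minimal sketch of the subsequence-splitting step may make the condition concrete. The hypothetical `partition_acw` helper below (not from the paper) reads the condition as bounding the count of distinct n-bit character values per subsequence at 256; the adaptive choice of n and the per-subsequence coding of the actual ACW(n, s) codec are omitted.

```python
def partition_acw(bits: str, n: int, max_symbols: int = 256):
    """Split a binary string into subsequences of n-bit characters so that
    each subsequence contains at most max_symbols distinct n-bit values.

    Illustrative assumption only: the real ACW(n, s) codec also chooses n
    adaptively and encodes each subsequence; trailing bits shorter than n
    are ignored here for simplicity.
    """
    subsequences = []              # each entry: list of character values
    current, seen = [], set()
    for i in range(0, len(bits) - len(bits) % n, n):
        value = int(bits[i:i + n], 2)  # decimal value of the n-bit character
        if value not in seen and len(seen) == max_symbols:
            # admitting this character would exceed the distinct-value
            # budget, so close the current subsequence and open a new one
            subsequences.append(current)
            current, seen = [], set()
        seen.add(value)
        current.append(value)
    if current:
        subsequences.append(current)
    return subsequences

# e.g. with n = 9, each subsequence's characters can be remapped onto
# single bytes because at most 256 distinct values occur within it
print(len(partition_acw('0101' * 2048, n=9)))
```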
Abstract: In existing sentence-level text sentiment classification, simple bag-of-words models capture only part of the contextual relations and semantic dependencies in a text when building a sentence-level global semantic representation, which lowers classification accuracy. To address this, a sentence-level text sentiment classification method based on recurrent and convolutional neural networks (Convolutional Neural Network, CNN) is proposed. The sentence-level text is first preprocessed: stop words that occur frequently but carry no sentiment are removed, and word vectors are trained with the Word2Vec embedding technique and the Skip-gram model. A Long Short-Term Memory (LSTM) network then models the preprocessed text to obtain a sentence-level global semantic representation. A CNN extracts sentence-level semantic features: convolution kernels with configured parameters perform the convolution, and piecewise pooling combines the results into rich feature vectors. A softmax function serves as the classification mechanism, converting the feature vectors into a conditional probability distribution over sentiment classes to determine the sentiment category of the sentence. Experimental results show that the proposed method outperforms the comparison methods on both the confusion matrix and the Area Under Curve (AUC) value, achieving more accurate sentence-level text sentiment classification.
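The described pipeline can be summarized in a short sketch. The PyTorch model below follows the stated stages (Word2Vec-initialized embeddings, an LSTM for the global representation, parallel convolutions with piecewise pooling, and a softmax classifier); all layer sizes, kernel widths, and the number of pooling segments are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LstmCnnSentiment(nn.Module):
    """Sketch of the pipeline: embeddings -> LSTM global representation ->
    parallel convolutions -> piecewise pooling -> softmax. All sizes below
    are assumed, not taken from the paper."""

    def __init__(self, vocab_size, embed_dim=128, hidden_dim=128,
                 n_filters=64, kernel_sizes=(3, 4, 5), n_segments=3,
                 n_classes=2):
        super().__init__()
        # in the paper's setup these would be Word2Vec/Skip-gram vectors
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # bidirectional LSTM builds the sentence-level global semantics
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        # parallel 1-D convolutions extract local semantic features
        self.convs = nn.ModuleList(
            nn.Conv1d(2 * hidden_dim, n_filters, k) for k in kernel_sizes)
        self.n_segments = n_segments
        self.fc = nn.Linear(len(kernel_sizes) * n_filters * n_segments,
                            n_classes)

    def forward(self, token_ids):                    # (batch, seq_len)
        h, _ = self.lstm(self.embed(token_ids))      # (batch, seq_len, 2H)
        h = h.transpose(1, 2)                        # (batch, 2H, seq_len)
        feats = []
        for conv in self.convs:
            c = F.relu(conv(h))                      # (batch, F, L')
            # piecewise (segmented) max pooling: pool each of n_segments
            # pieces of the time axis separately, then concatenate
            feats.append(F.adaptive_max_pool1d(c, self.n_segments).flatten(1))
        logits = self.fc(torch.cat(feats, dim=1))
        # softmax yields the conditional probability of each sentiment class
        # (training would typically apply cross-entropy to the raw logits)
        return F.softmax(logits, dim=1)
```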
Abstract: To automate the generation of titles for English texts, a sentence-level LSTM encoding strategy based on long short-term memory networks is developed, and an attention mechanism is introduced into the title generation model to obtain context vectors over the English text and retain its important information. The model is then trained with a negative log-likelihood loss. Finally, the proposed English title generation algorithm is evaluated on the Byte Cup 2018 dataset, with title quality measured by ROUGE-N metrics. The experiments show that the proposed sentence-level LSTM encoding scheme has a clear advantage in English title generation accuracy over other conventional abstractive summarization models.
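A compact sketch of the encoder-attention-decoder loop may clarify the training objective. The model below uses an LSTM encoder, dot-product attention to form the context vector at each decoding step, and a summed negative log-likelihood loss under teacher forcing; the dimensions and the specific attention form are assumptions, since the abstract does not fix them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttnTitleGenerator(nn.Module):
    """Sketch of an LSTM encoder with attention for title generation.
    Dot-product attention and all sizes are illustrative assumptions."""

    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTMCell(embed_dim, hidden_dim)
        self.out = nn.Linear(2 * hidden_dim, vocab_size)

    def forward(self, src_ids, tgt_ids):
        # encode the source text into per-token states, shape (B, S, H)
        enc, (h, c) = self.encoder(self.embed(src_ids))
        h, c = h.squeeze(0), c.squeeze(0)
        nll = 0.0
        for t in range(tgt_ids.size(1) - 1):         # teacher forcing
            h, c = self.decoder(self.embed(tgt_ids[:, t]), (h, c))
            # dot-product attention: context vector over encoder states
            scores = torch.bmm(enc, h.unsqueeze(2)).squeeze(2)   # (B, S)
            context = torch.bmm(F.softmax(scores, 1).unsqueeze(1),
                                enc).squeeze(1)                  # (B, H)
            logits = self.out(torch.cat([h, context], dim=1))
            # negative log-likelihood of the next reference token
            nll = nll + F.cross_entropy(logits, tgt_ids[:, t + 1])
        return nll  # training minimizes the summed NLL
```

Minimizing `F.cross_entropy` over the reference tokens is exactly the negative log-likelihood objective the abstract names; evaluation with ROUGE-N would be done separately on decoded titles.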