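Abstract: Existing automated essay scoring (AES) methods based on pre-trained language models (PLMs) tend to use the global semantic features extracted from the PLM directly to represent essay quality, while overlooking the relationship between essay quality and finer-grained features. Focusing on Chinese AES, this work analyzes and evaluates essay quality from multiple textual perspectives and proposes a Chinese AES method that uses graph neural networks (GNNs) to jointly learn multi-scale essay features. First, GNNs are used to obtain discourse features of an essay at the sentence level and at the paragraph level. Then, these discourse features are learned jointly with the essay's global semantic features to score the essay more accurately. Finally, a Chinese AES dataset is constructed to provide a data foundation for Chinese AES research. Experimental results on the constructed dataset show that the average quadratic weighted kappa (QWK) of the proposed method across 6 essay topics is 1.1 percentage points higher than that of R2-BERT (Bidirectional Encoder Representations from Transformers model with Regression and Ranking), which verifies the effectiveness of multi-scale joint feature learning for AES. Ablation results further show how essay features at each scale contribute to scoring performance. To demonstrate the advantage of small models in task-specific scenarios, the method is also compared with the popular general-purpose large language models GPT-3.5-turbo and DeepSeek-V3. The BERT (Bidirectional Encoder Representations from Transformers) model using the proposed method achieves an average QWK across the 6 essay topics that is 65.8 and 45.3 percentage points higher than that of GPT-3.5-turbo and DeepSeek-V3, respectively, supporting the view that large language models (LLMs) perform poorly on domain-oriented, discourse-level essay scoring because they lack large-scale supervised fine-tuning data.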
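The QWK metric reported in the abstract above can be illustrated with a minimal sketch. This is the standard textbook definition of quadratic weighted kappa, not code from the paper; the function name, the integer rating scale, and the example scores are assumptions made for illustration.

```python
def quadratic_weighted_kappa(y_true, y_pred, min_rating, max_rating):
    """Quadratic weighted kappa between two sequences of integer ratings.

    Hypothetical helper: assumes every rating lies in [min_rating, max_rating].
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("rating sequences must be non-empty and equally long")
    n = max_rating - min_rating + 1
    if n < 2:
        raise ValueError("the rating scale needs at least two levels")
    total = len(y_true)

    # Observed confusion matrix: rows are human scores, columns model scores.
    observed = [[0] * n for _ in range(n)]
    for t, p in zip(y_true, y_pred):
        observed[t - min_rating][p - min_rating] += 1

    hist_true = [sum(row) for row in observed]
    hist_pred = [sum(observed[i][j] for i in range(n)) for j in range(n)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n):
        for j in range(n):
            weight = (i - j) ** 2 / (n - 1) ** 2  # quadratic disagreement penalty
            expected = hist_true[i] * hist_pred[j] / total  # chance agreement
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0.0:
        # Degenerate case: both raters used one and the same single score.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Illustrative scores only, not data from the paper.
human = [1, 2, 3]
model = [1, 2, 2]
print(quadratic_weighted_kappa(human, model, min_rating=1, max_rating=3))
```

A value of 1 means perfect agreement with the human scores, 0 means chance-level agreement, and negative values mean systematic disagreement. A difference of "1.1 percentage points" in QWK therefore corresponds to 0.011 on this scale.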
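Abstract: Artificial intelligence can effectively identify risk and improve decision efficiency in credit risk assessment. However, existing credit risk data commonly suffer from class imbalance, which biases models toward the majority class at prediction time and undermines the accuracy and reliability of the assessment. To address this imbalance, a hybrid generative model (VCTGAN) that combines a variational autoencoder (VAE) with a conditional tabular generative adversarial network (CTGAN) is proposed for synthesizing high-quality balanced datasets. The latent variables of the VAE learn the key features and latent distribution of the real data, producing structured latent variables that serve as the input to the original CTGAN; a self-attention mechanism is introduced into the generator to better capture the salient features of the imbalanced data; and a contrastive loss module is added to the discriminator to increase the inter-class differences in the generated data, thereby improving its quality. Systematic experiments on two benchmark datasets, Taiwan Credit and Give Me Some Credit, yield best classification accuracies of 89.91% and 96.89%, respectively, showing that the improved method clearly outperforms traditional methods in handling imbalanced credit data. Ablation experiments further verify the contribution of each component to performance, confirming that the proposed method is sound and effective. The method not only generates high-quality balanced datasets but also improves the model's ability to identify minority classes, offering a new technical solution to data imbalance in the financial domain.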
Funding: Supported by the National Natural Science Foundation of China (Nos. 31771467, 31571360 and 31371291).
Abstract: Maize (Zea mays) is the most widely grown grain crop in the world and plays important roles in agriculture and industry. However, the functions of maize genes remain largely unknown. High-quality genome-wide transcriptome datasets provide important biological knowledge and have been widely and successfully used in plants, not only to measure gene expression levels but also to enable co-expression analysis for predicting gene functions and modules related to agronomic traits. Recently, thousands of maize transcriptomic datasets have become available across different inbred lines, developmental stages, tissues, and treatments, or even across different tissue sections and cell lines. Here, we integrated 701 transcriptomic and 108 epigenomic datasets and studied the different conditional networks at multi-dimensional omics levels. We constructed a searchable, integrative, one-stop online platform, the maize conditional co-expression network (MCENet) platform. MCENet provides 10 global/conditional co-expression networks, 5 network accessional analysis toolkits (i.e., Network Search, Network Remodel, Module Finder, Network Comparison, and Dynamic Expression View) and multiple network functional support toolkits (e.g., motif and module enrichment analysis). We hope that our database will help plant research communities identify maize functional genes or modules that regulate important agronomic traits.
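As a rough illustration of what a co-expression network is, the sketch below links two genes when their expression profiles are strongly correlated across samples. The abstract does not state how MCENet builds its networks, so the Pearson correlation, the 0.9 threshold, the function names, and the toy expression values are all assumptions chosen for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expression, threshold=0.9):
    """Return (gene1, gene2, r) for every gene pair with |r| >= threshold.

    `expression` maps a gene name to its expression values across samples.
    The threshold is a hypothetical cut-off, not a value from the paper.
    """
    edges = []
    for gene1, gene2 in combinations(sorted(expression), 2):
        r = pearson(expression[gene1], expression[gene2])
        if abs(r) >= threshold:
            edges.append((gene1, gene2, r))
    return edges


# Toy profiles over four samples; gene names and values are made up.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 1.0, 3.0, 2.0],
}
print(coexpression_edges(profiles))
```

A conditional network, in this simplified picture, would be the same construction restricted to the samples from one condition, such as a single tissue or treatment.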
Abstract: This paper tests various scenarios of feature selection and feature reduction, with the objective of building a real-time anomaly-based intrusion detection system. These scenarios are evaluated on the realistic Kyoto 2006+ dataset. For each scenario, the influence of reducing the number of features on classification performance and execution time is measured. The so-called HVS feature selection technique detailed in this paper shows many advantages in terms of consistency, classification performance and execution time.
Funding: This project was funded by the Deanship of Scientific Research (DSR) at King Abdulaziz University, Jeddah, under Grant No. RG-91-611-42.
Abstract: Detecting anomalous entities in real-time network traffic has been a popular area of research in recent times. Very few studies have focused on creating malware that fools the intrusion detection system, and this paper focuses on this topic. We use Deep Convolutional Generative Adversarial Networks (DCGAN) to trick the malware classifier into believing that a malicious sample is a normal entity. In this work, a new dataset is created to fool Artificial Intelligence (AI) based malware detectors; it consists of different types of attacks such as Denial of Service (DoS), scan 11, scan 44, botnet, spam, User Datagram Protocol (UDP) scan, and ssh scan. The discriminator used in the DCGAN discriminates between two different attack classes (anomaly and synthetic) and one normal class. The mode collapse, instability, and vanishing gradient issues associated with the DCGAN are overcome using the proposed hybrid Aquila optimizer-based Mine blast harmony search algorithm (AO-MBHS). This algorithm helps the generator create realistic malware samples that go undetected by the discriminator. The performance of the proposed methodology is evaluated using different performance metrics such as training time, detection rate, F-score, loss function, accuracy, and false alarm rate. The superiority of the hybrid AO-MBHS based DCGAN model is evident when the detection rate drops to 0 after retraining, which makes the technique hard for the malware detection system to notice. A support vector machine (SVM) is used as the malicious traffic detection application, and its true positive rate (TPR) goes from 80% to 0% after retraining the proposed model, which shows the efficiency of the proposed model in hiding the samples.
Abstract: Anomaly-based approaches to network intrusion detection suffer from difficulties in evaluation, comparison and deployment that originate from the scarcity of adequate publicly available network trace datasets. Moreover, publicly available datasets are either outdated or generated in a controlled environment. Given the ubiquity of cloud computing environments in commercial and government internet services, there is a need to assess the impact of network attacks on cloud data centers. To the best of our knowledge, there is no publicly available dataset that captures the normal and anomalous network traces in the interactions between cloud users and cloud data centers. In this paper, we present an experimental platform designed to represent a practical interaction between cloud users and cloud services and to collect the network traces resulting from this interaction for anomaly detection. We use the Amazon Web Services (AWS) platform to conduct our experiments.
Abstract: This paper introduces a Convolutional Neural Network (CNN) model for Arabic Sign Language (AASL) recognition, using the AASL dataset. Recognizing the fundamental importance of communication for the hearing-impaired, especially within the Arabic-speaking deaf community, the study emphasizes the critical role of sign language recognition systems. The proposed CNN model reaches 99.9% accuracy on the training set and a validation accuracy of 97.4%. This study not only establishes a high-accuracy AASL recognition model but also provides insights into effective dropout strategies. The high accuracy rates position the proposed model as a significant advancement in the field, holding promise for improved communication accessibility for the Arabic-speaking deaf community.
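Abstract: To address the drop in vehicle tracking accuracy in occlusion scenarios, an occluded vehicle tracking algorithm based on convolutional kernel optimization (Convolutional Kernel Optimization for Occluded Vehicle Tracking, CKO-OVT) is proposed. CKO-OVT uses a convolutional kernel selection strategy to adaptively pick the convolution operators that are most sensitive to the vehicle target for feature extraction, and uses a discriminative Siamese network to evaluate the tracking results and relocate the target when tracking fails, further improving the robustness and accuracy of tracking. For the experiments, an Occluded Vehicle Tracking (OVT) dataset was constructed, and CKO-OVT was compared on the Object Tracking Benchmark (OTB) dataset, the public TColor-128 dataset and the self-built OVT dataset against 9 mainstream algorithms: Efficient Convolution Operators for Tracking (ECO), the lightweight ECO variant Efficient Convolution Operators for Tracking Using HOG and CN (ECOHC), Kernelized Correlation Filters Tracker (KCF), Discriminative Scale Space Tracker (DSST), Circulant Structure Kernel Tracker (CSK), Hierarchical Convolutional Features for Visual Tracking (HCFT), Robust Visual Tracking via Hierarchical Convolutional Features (HCFTstar), Fully-Convolutional Siamese Networks for Object Tracking (SiameseFC) and Distractor-Aware Siamese Networks for Object Tracking (DaSiam). The results show that CKO-OVT improves distance precision by 2.2% and overlap success rate by 1.8% on the OTB dataset, by 0.4% and 0.9% on the TColor-128 dataset, and by 1.7% and 1.2% on the OVT dataset. Through adaptive convolutional kernel selection and the discriminative Siamese network, CKO-OVT significantly improves the robustness and accuracy of vehicle tracking under occlusion and outperforms the mainstream tracking algorithms in distance precision and overlap success rate on all three datasets, providing an effective solution for vehicle tracking in intelligent transportation and autonomous driving.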
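The two metrics quoted in the abstract above, distance precision and overlap success rate, can be sketched as follows. This is a generic formulation in the style of common tracking benchmarks, not the paper's evaluation code; the `(x, y, width, height)` box format, the 20-pixel and 0.5 IoU thresholds, and the sample boxes are assumptions made for illustration.

```python
from math import hypot


def center_error(box_a, box_b):
    """Euclidean distance between the centers of two (x, y, w, h) boxes."""
    ax, ay = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx, by = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    return hypot(ax - bx, ay - by)


def iou(box_a, box_b):
    """Intersection over union of two (x, y, w, h) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[0] + box_a[2], box_b[0] + box_b[2])
    y2 = min(box_a[1] + box_a[3], box_b[1] + box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


def distance_precision(predicted, ground_truth, max_error=20.0):
    """Fraction of frames whose center error is at most max_error pixels."""
    hits = sum(center_error(p, g) <= max_error
               for p, g in zip(predicted, ground_truth))
    return hits / len(ground_truth)


def overlap_success(predicted, ground_truth, min_iou=0.5):
    """Fraction of frames whose IoU with the ground truth exceeds min_iou."""
    hits = sum(iou(p, g) > min_iou for p, g in zip(predicted, ground_truth))
    return hits / len(ground_truth)


# Three made-up frames: an exact hit, a partial overlap, and a miss.
truth = [(0, 0, 10, 10), (0, 0, 10, 10), (0, 0, 10, 10)]
tracked = [(0, 0, 10, 10), (5, 0, 10, 10), (30, 0, 10, 10)]
print(distance_precision(tracked, truth), overlap_success(tracked, truth))
```

Benchmarks often report the success rate as the area under the curve over a range of IoU thresholds rather than at a single threshold; the single-threshold version is used here only to keep the sketch short.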