Forecasting energy demand is essential for optimizing energy generation and effectively predicting power system needs. Recently, many researchers have developed various models on tabular datasets to enhance the effectiveness of demand prediction, including neural networks, machine learning, deep learning, and advanced architectures such as CNN and LSTM. However, research on CNN models has struggled to provide reliable outcomes due to insufficient dataset sizes, repeated investigations, and inappropriate baseline selection. To address these challenges, we propose a Tabular data-based Lightweight Convolutional Neural Network (TLCNN) model for predicting energy demand. It frames the problem as a regression task that effectively captures complex data trends for accurate forecasting. The BanE-16 dataset is preprocessed using normalization techniques for categorical and numerical data before training the model. The proposed approach dynamically selects relevant features through a two-dimensional convolutional structure that improves adaptability. The model’s performance is evaluated using MSE, MAE, and Accuracy metrics. Experimental results show that TLCNN achieves a 10.89% lower MSE than traditional ML algorithms, demonstrating superior predictive capability. Additionally, TLCNN’s lightweight structure enhances generalization while reducing computational costs, making it suitable for real-world energy forecasting tasks. This study contributes to energy informatics by introducing an optimized deep-learning framework that improves demand prediction by ensuring robustness and adaptability for tabular data.
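To make the reported metrics concrete, the following is a minimal sketch of how MSE, MAE, and a relative error reduction such as "10.89% lower MSE" can be computed. The function names and the toy numbers are illustrative assumptions and do not come from the BanE-16 experiments.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_reduction(baseline_error, model_error):
    """Percentage by which model_error is lower than baseline_error."""
    return 100.0 * (baseline_error - model_error) / baseline_error


# Toy demand values (hypothetical, for illustration only).
y_true = [10.0, 12.0, 14.0, 16.0]
baseline_pred = [11.0, 11.0, 15.0, 18.0]
model_pred = [10.5, 12.5, 14.5, 15.0]

baseline_mse = mse(y_true, baseline_pred)
model_mse = mse(y_true, model_pred)
reduction = relative_reduction(baseline_mse, model_mse)
```

A statement of the form "X% lower MSE" corresponds to `relative_reduction(baseline_mse, model_mse)` under this reading; the abstract does not say which baseline the 10.89% figure is measured against.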
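To address the drop in transaction rates caused by inaccurate price prediction on online freight platforms, a local context learning model with a tabular prior-data fitted network (TabPFN) based on shingling retrieval (ShinglingPFN) is proposed. First, the model uses the w-shingling retrieval algorithm to match the historical orders most similar to the order being predicted, building a locally relevant context dataset. Then, a pretrained TabPFN model instance is loaded and initialized, and the selected orders are fed into it so that TabPFN learns the relationship between freight features and freight charges from this context. Finally, the model outputs the predicted freight charge for the sample. The results show that ShinglingPFN reduces the mean absolute error (MAE) by 30.98% compared with a random forest (RF) model. A global sensitivity analysis further improves the model's interpretability. ShinglingPFN can support platforms' decisions when optimizing pricing strategies.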
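The retrieval step can be sketched as follows, assuming each order is represented as a sequence of tokens and that similarity is the Jaccard overlap of w-shingle sets. The tokenization, the function names, and the toy orders are assumptions made for illustration; the abstract does not specify them.

```python
def shingles(tokens, w=2):
    """Return the set of contiguous w-token shingles of a token sequence."""
    if len(tokens) < w:
        return {tuple(tokens)}
    return {tuple(tokens[i:i + w]) for i in range(len(tokens) - w + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def retrieve_context(query_tokens, history, k=2, w=2):
    """Return the k historical (tokens, price) pairs most similar to the query.

    Ties keep their original order because Python's sort is stable.
    """
    query_set = shingles(query_tokens, w)
    ranked = sorted(
        history,
        key=lambda item: jaccard(query_set, shingles(item[0], w)),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical historical orders: (tokenized order description, freight price).
history = [
    (["shanghai", "beijing", "steel", "30t"], 5200.0),
    (["shanghai", "beijing", "steel", "20t"], 4100.0),
    (["guangzhou", "chengdu", "fruit", "10t"], 6300.0),
]
query = ["shanghai", "beijing", "steel", "25t"]
context = retrieve_context(query, history, k=2)
```

In the workflow described above, the rows in `context` would then serve as the local context passed to the pretrained TabPFN model; that step is omitted here to avoid guessing at the authors' setup.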
The Saraikistan area (South Punjab and its surroundings) is located in central Pakistan. The area exposes Triassic-Jurassic to Recent marine and terrestrial sedimentary strata. Most of the Mesozoic and Early Cenozoic is represented by marine strata with rare terrestrial deposits, while the Late Cenozoic is represented by continental fluvial deposits. The area hosts significant mineral deposits, and their development can play a significant role in the development of the Saraikistan region and ultimately of Pakistan. Data on recently discovered biotas of Cambrian to Miocene age are tabulated for quick reference. Mesozoic biotas show a prominent paleobiogeographic link with Gondwana, whereas Cenozoic biotas show a Eurasian link. The phylogeny and hypodigm of Poripuchian titanosaurs from India and Pakistan are also hinted at.
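Imaging genomics holds that neuroimaging and genes are correlated to some degree, and researchers are paying increasing attention to disease analysis that combines genetic variation with imaging data. In practice, clinicians often have only small datasets but still want to use deep learning to solve real problems. Given growing data volumes and expensive annotation costs, it is necessary to build unsupervised learning methods that can exploit multimodal data. To meet these needs, a representation learning method based on contrastive learning over multimodal imaging and genetic tabular data (multimodal tabular data with contrastive learning, MTCL) is proposed. The model uses resting-state functional magnetic resonance imaging (rs-fMRI) and single nucleotide polymorphism (SNP) data and requires no label information. To enhance interpretability, the model first converts the rs-fMRI and SNP data into a tabular structure through a feature extraction module, then fuses the modalities through a multimodal tabular contrastive learning module to obtain a fused representation. On major depressive disorder (MDD) data, the proposed method effectively improves MDD diagnostic performance. In addition, MTCL incorporates model attribution methods to mine imaging and genetic biomarkers associated with MDD, which improves interpretability and helps researchers understand the pathogenesis of the disease.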
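As a rough illustration of label-free contrastive learning across two modalities, the sketch below computes an InfoNCE-style loss in which the imaging embedding of subject i should match the SNP embedding of the same subject and differ from those of other subjects. This is a commonly used contrastive objective and is an assumption here; the abstract does not state which loss MTCL uses, and the embeddings shown are toy values.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss where anchors[i] is paired with positives[i].

    Every positives[j] with j != i acts as a negative for anchors[i].
    """
    n = len(anchors)
    total = 0.0
    for i in range(n):
        logits = [cosine(anchors[i], positives[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_sum - logits[i]
    return total / n


# Toy embeddings for two subjects (hypothetical values).
imaging = [[1.0, 0.0], [0.0, 1.0]]
snp = [[0.9, 0.1], [0.1, 0.9]]

aligned_loss = info_nce(imaging, snp)
shuffled_loss = info_nce(imaging, snp[::-1])
```

When the two modalities of the same subject are embedded close together, `aligned_loss` is small; pairing each subject with another subject's SNP embedding, as in `shuffled_loss`, yields a much larger value, which is the signal such an objective trains on without labels.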
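To address two problems, namely that class imbalance in clinical medical scale data easily affects models and that deep learning frameworks struggle to match traditional machine learning methods on scale-data tasks, a Transformer network model based on cascaded undersampling (layer by layer Transformer, LLT) is proposed. LLT uses cascaded undersampling to prune majority-class data layer by layer, balancing the classes and reducing the impact of class imbalance on the classifier. It also uses the attention mechanism to assess the relevance of input features and thereby perform feature selection, which refines feature extraction and improves model performance. Rheumatoid arthritis (RA) data were used as test samples. Experiments show that, without changing the sample distribution, the proposed cascaded undersampling method increases the recognition rate of the minority class by 6.1%, which is 1.4% and 10.4% higher than the commonly used NEARMISS and ADASYN, respectively. On the RA scale data, LLT reaches an accuracy of 72.6% and an F1-score of 71.5%, with an AUC of 0.89 and an mAP of 0.79, outperforming mainstream scale-data classification models such as RF, XGBoost, and GBDT. Finally, the model's process is visualized and the features that influence RA are analyzed, which offers useful guidance for clinical RA diagnosis.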
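The idea of pruning the majority class layer by layer can be sketched as below. This sketch removes majority samples at random on a geometric schedule until the class sizes match; both the random selection and the schedule are assumptions for illustration, since the abstract does not describe how LLT chooses which samples to drop at each layer.

```python
import random


def cascade_undersample(majority, minority, layers=3, seed=0):
    """Shrink the majority class over several layers until it matches the minority size.

    Returns the kept majority samples and the class size after each layer.
    Random selection is a placeholder for whatever criterion the real method uses.
    """
    rng = random.Random(seed)
    current = list(majority)
    target = len(minority)
    sizes = [len(current)]
    for layer in range(layers):
        if len(current) <= target:
            break
        remaining = layers - layer
        # Geometric schedule: the same keep ratio per remaining layer reaches the target.
        ratio = (target / len(current)) ** (1.0 / remaining)
        keep = max(target, round(len(current) * ratio))
        current = rng.sample(current, keep)
        sizes.append(len(current))
    return current, sizes


# Hypothetical sample identifiers: 800 majority-class and 100 minority-class rows.
majority_ids = list(range(800))
minority_ids = list(range(800, 900))
kept, sizes = cascade_undersample(majority_ids, minority_ids, layers=3)
```

With these toy sizes the majority class is halved at each of the three layers, ending at the minority-class size. In a layered model, a classifier trained at each stage could guide which samples to discard instead of the random choice used here.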