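Abstract: To better characterize the uncertainty of photovoltaic (PV) output, this paper proposes a probabilistic PV power forecasting model based on a temporal convolutional network (TCN) and bidirectional long short-term memory (BiLSTM). First, the historical dataset is divided into three scenarios (sunny, cloudy, and overcast or rainy days) according to the cloud cover and rainfall in the numerical weather prediction data, generating test and training sample sets with similar weather types. Then, the TCN is applied to extract integrated feature dimensions, and a BiLSTM neural network is used to fit the time series of output power and weather data in both directions. To address the non-differentiability of the quantile loss function used in conventional interval prediction, the Huber norm is introduced as an approximate replacement for the original loss function and optimized by gradient descent, yielding an improved differentiable quantile regression (QR) model that generates confidence intervals. Finally, kernel density estimation (KDE) is used to produce the probability density forecasts. Taking a distributed PV power station in a region of East China as the study object, the proposed short-term forecasting algorithm improves on every power interval evaluation metric compared with existing probabilistic forecasting methods, verifying the reliability of the proposed method.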
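The abstract does not give the exact form of the Huber-smoothed quantile loss, so the following is only a minimal sketch under stated assumptions: the standard pinball loss has its absolute-value kink replaced by a Huber function with a hypothetical smoothing width `eps`, which makes the gradient continuous at zero residual so that plain gradient descent can be used. All function names and the toy constant-quantile fit are illustrative and not taken from the paper.

```python
import numpy as np

def huber(u, eps):
    """Huber function: quadratic for |u| <= eps, linear beyond."""
    a = np.abs(u)
    return np.where(a <= eps, u ** 2 / (2.0 * eps), a - eps / 2.0)

def pinball_loss(y, y_hat, tau):
    """Standard (non-smooth) quantile loss, averaged over samples."""
    u = y - y_hat
    return np.mean(np.where(u >= 0, tau * u, (tau - 1.0) * u))

def smooth_pinball_loss(y, y_hat, tau, eps=0.05):
    """Assumed smoothing: asymmetric weight times Huber(residual)."""
    u = y - y_hat
    w = np.where(u >= 0, tau, 1.0 - tau)
    return np.mean(w * huber(u, eps))

def smooth_pinball_grad(y, y_hat, tau, eps=0.05):
    """Per-sample gradient of the smoothed loss with respect to y_hat."""
    u = y - y_hat
    w = np.where(u >= 0, tau, 1.0 - tau)
    dh_du = np.clip(u / eps, -1.0, 1.0)  # derivative of the Huber function
    return -w * dh_du

def fit_constant_quantile(y, tau, eps=0.05, lr=0.1, steps=3000):
    """Toy use: estimate one quantile of y by gradient descent on a scalar."""
    q = float(np.mean(y))
    for _ in range(steps):
        q -= lr * float(np.mean(smooth_pinball_grad(y, q, tau, eps)))
    return q
```

In a full model, `y_hat` would be the network output for a given quantile level `tau`, and a lower and an upper quantile together would form the interval. The smoothed loss never exceeds the pinball loss and differs from it by at most `max(tau, 1 - tau) * eps / 2` per sample, so a small `eps` keeps the estimate close to the true quantile.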
Funding: Funded by the Ongoing Research Funding program (ORF-2025-867), King Saud University, Riyadh, Saudi Arabia.
Abstract: Digital twin technology is revolutionizing personalized healthcare by creating dynamic virtual replicas of individual patients. This paper presents a novel multi-modal architecture leveraging digital twins to enhance precision in predictive diagnostics and treatment planning of phoneme labeling. By integrating real-time images, electronic health records, and genomic information, the system enables personalized simulations for disease progression modeling, treatment response prediction, and preventive care strategies. In dysarthric speech, which is characterized by articulation imprecision, temporal misalignments, and phoneme distortions, existing models struggle to capture these irregularities. Traditional approaches, often relying solely on audio features, fail to address the full complexity of phoneme variations, leading to increased phoneme error rates (PER) and word error rates (WER). To overcome these challenges, we propose a novel multi-modal architecture that integrates both audio and articulatory data through a combination of Temporal Convolutional Networks (TCNs), Graph Convolutional Networks (GCNs), Transformer Encoders, and a cross-modal attention mechanism. The audio branch of the model utilizes TCNs and Transformer Encoders to capture both short- and long-term dependencies in the audio signal, while the articulatory branch leverages GCNs to model spatial relationships between articulators, such as the lips, jaw, and tongue, allowing the model to detect subtle articulatory imprecisions. A cross-modal attention mechanism fuses the encoded audio and articulatory features, enabling dynamic adjustment of the model’s focus depending on input quality, which significantly improves phoneme labeling accuracy. The proposed model consistently outperforms existing methods, achieving lower Phoneme Error Rates (PER), Word Error Rates (WER), and Articulatory Feature Misclassification Rates (AFMR). Specifically, across all datasets, the model achieves an average PER of 13.43%, an average WER of 21.67%, and an average AFMR of 12.73%. By capturing both the acoustic and articulatory intricacies of speech, this comprehensive approach not only improves phoneme labeling precision but also marks substantial progress in speech recognition technology for individuals with dysarthria.
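The abstract does not specify how the cross-modal attention is wired, so the sketch below is only one plausible reading: encoded audio frames act as queries over encoded articulatory frames (keys and values) using scaled dot-product attention, and the attended articulatory context is concatenated with the audio features. The function names, the projection matrices `w_q`, `w_k`, `w_v`, the fusion by concatenation, and all dimensions are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def cross_modal_attention(audio, artic, w_q, w_k, w_v):
    """Fuse audio (T_a, d_a) with articulatory (T_r, d_r) features.

    Assumption: audio frames query the articulatory sequence.
    """
    q = audio @ w_q                          # (T_a, d)
    k = artic @ w_k                          # (T_r, d)
    v = artic @ w_v                          # (T_r, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])  # (T_a, T_r)
    weights = softmax(scores, axis=-1)       # each row sums to 1
    context = weights @ v                    # (T_a, d)
    fused = np.concatenate([audio, context], axis=-1)
    return fused, weights

# Toy usage with random features and projections (shapes are illustrative).
rng = np.random.default_rng(0)
audio = rng.normal(size=(50, 64))   # 50 audio frames, 64-dim encoding
artic = rng.normal(size=(20, 32))   # 20 articulatory frames, 32-dim encoding
w_q = rng.normal(size=(64, 16))
w_k = rng.normal(size=(32, 16))
w_v = rng.normal(size=(32, 16))
fused, weights = cross_modal_attention(audio, artic, w_q, w_k, w_v)
```

Because the attention weights are recomputed for every audio frame, the contribution of the articulatory stream varies with the input, which is one way the "dynamic adjustment of the model's focus" mentioned in the abstract could be realized.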