Deformation monitoring is a critical means of directly reflecting the operational behavior of a dam. However, deformation monitoring data are often incomplete because of environmental changes, monitoring instrument faults, and human operational errors, which hinders accurate assessment of actual deformation patterns. This study proposes a method for quantifying deformation similarity between measurement points by recognizing the spatiotemporal characteristics of concrete dam deformation monitoring data. It introduces a spatiotemporal clustering analysis of concrete dam deformation behavior and employs a support vector machine model to fill in missing deformation monitoring data. The proposed method was validated on a concrete dam project, with the model error remaining within 5%, demonstrating its effectiveness in processing missing deformation data. This approach strengthens early-warning systems and contributes to improved dam safety management.
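The idea of using similar measurement points to fill gaps can be sketched as follows. This is a minimal illustration only: it assumes Pearson correlation as the similarity measure and swaps in a plain least-squares line for the support vector machine model named in the abstract. The point names and readings are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation over positions where both series have a value."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)

def fill_gaps(target, donor):
    """Fill None entries of target using a least-squares line fitted on donor.

    Assumes the donor series has a value wherever the target has a gap.
    """
    pairs = [(d, t) for d, t in zip(donor, target) if d is not None and t is not None]
    n = len(pairs)
    md = sum(d for d, _ in pairs) / n
    mt = sum(t for _, t in pairs) / n
    slope = sum((d - md) * (t - mt) for d, t in pairs) / sum((d - md) ** 2 for d, _ in pairs)
    intercept = mt - slope * md
    return [t if t is not None else slope * d + intercept for d, t in zip(donor, target)]

# Hypothetical displacement readings (mm) at three measurement points.
series = {
    "P1": [1.0, 2.0, None, 4.0, 5.0],
    "P2": [2.1, 4.0, 6.1, 8.0, 9.9],
    "P3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
target = "P1"
# Pick the most similar other point as the donor for gap filling.
donor = max((k for k in series if k != target),
            key=lambda k: pearson(series[target], series[k]))
filled = fill_gaps(series[target], series[donor])
print(donor, filled)
```

In practice the regression step would be replaced by the trained model, and the donor set would come from the clustering result rather than a single best match.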
The advent of the digital era has provided unprecedented opportunities for businesses to collect and analyze customer behavior data. Precision marketing, as a key means of improving marketing efficiency, depends heavily on a deep understanding of customer behavior. This study proposes a theoretical framework for multi-dimensional customer behavior analysis, aiming to comprehensively capture customer behavioral characteristics in the digital environment. The framework integrates multi-source data concepts, including transaction history, browsing trajectories, social media interactions, and location information, to construct a theoretically more comprehensive customer profile. The research discusses potential applications of the framework in precision marketing scenarios such as personalized recommendations, cross-selling, and customer churn prevention. The analysis suggests that multi-dimensional analysis may significantly improve the targeting and theoretical conversion rates of marketing activities. The research also explores theoretical challenges that may arise in application, such as data privacy and information overload, and proposes corresponding conceptual coping strategies. This study provides a new theoretical perspective on how businesses can optimize marketing decisions using big data thinking while respecting customer privacy, laying a foundation for future empirical research.
To improve the effectiveness of dam safety monitoring database systems, the development process of a multi-dimensional conceptual data model was analyzed and a logical design was achieved in multi-dimensional database mode. The optimal data model was confirmed by identifying data objects, defining relations, and reviewing entities. The conversion of relations among entities to foreign keys, and of entities and physical attributes to tables and fields, was worked through completely. On this basis, a multi-dimensional database that reflects how a dam safety monitoring system manages and analyzes monitoring data has been established, for which fact tables and dimension tables have been designed. Finally, based on service design and user interface design, the dam safety monitoring system has been developed with Delphi as the development tool. This development project shows that the multi-dimensional database can simplify the development process and minimize hidden dangers in the database structure design. It is superior to other dam safety monitoring system development models and can provide a new research direction for system developers.
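The fact-table and dimension-table layout can be illustrated with a small star schema. This is a sketch under assumptions: the table names, columns, and readings below are hypothetical, and SQLite stands in for whatever database engine the Delphi system used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the "who/where/when" of each reading.
CREATE TABLE dim_point (point_id INTEGER PRIMARY KEY, name TEXT, dam_section TEXT);
CREATE TABLE dim_date  (date_id  INTEGER PRIMARY KEY, day TEXT, year INTEGER);
-- The fact table holds the measurements and points at the dimensions
-- through foreign keys.
CREATE TABLE fact_reading (
    point_id INTEGER REFERENCES dim_point(point_id),
    date_id  INTEGER REFERENCES dim_date(date_id),
    displacement_mm REAL
);
""")
conn.executemany("INSERT INTO dim_point VALUES (?, ?, ?)",
                 [(1, "P1", "crest"), (2, "P2", "foundation")])
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                 [(1, "2020-01-01", 2020), (2, "2020-01-02", 2020)])
conn.executemany("INSERT INTO fact_reading VALUES (?, ?, ?)",
                 [(1, 1, 1.0), (1, 2, 3.0), (2, 1, 0.5), (2, 2, 0.7)])

# A typical multi-dimensional query: average displacement by section and year.
rows = conn.execute("""
SELECT p.dam_section, d.year, AVG(f.displacement_mm)
FROM fact_reading f
JOIN dim_point p ON p.point_id = f.point_id
JOIN dim_date  d ON d.date_id  = f.date_id
GROUP BY p.dam_section, d.year
ORDER BY p.dam_section
""").fetchall()
print(rows)
```

Keeping measurements in one narrow fact table and descriptive attributes in dimension tables is what lets a query slice the same readings by section, instrument, or period without changing the schema.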
By modeling the spatiotemporal data of the power grid, it is possible to better understand its operational status, identify potential issues and risks, and take timely measures to adjust and optimize the system. Compared to the bus-branch model, the node-breaker model provides higher granularity in describing grid components and can dynamically reflect changes in equipment status, thus improving the efficiency of grid dispatching and operation. This paper proposes a spatiotemporal data modeling method based on a graph database. It elaborates on constructing graph nodes, graph ontology models, and graph entity models from grid dispatch data, describing the construction of the spatiotemporal node-breaker graph model and the transformation to the bus-branch model. Subsequently, by integrating spatiotemporal data attributes into the pre-built static grid graph model, a spatiotemporal evolving graph of the power grid is constructed. Furthermore, the concept of the “Power Grid One Graph” and its requirements in modern power systems are elucidated. Leveraging the constructed spatiotemporal node-breaker graph model and graph computing technology, the paper explores the feasibility of grid situational awareness. Finally, typical applications in an operational provincial grid are showcased, and potential scenarios of the proposed spatiotemporal graph model are discussed.
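The transformation from a node-breaker model to a bus-branch model can be sketched as merging all nodes joined by closed breakers into one bus. The example below uses a generic union-find for that merge; it is not the paper's graph-database procedure, and the node names, breaker states, and branches are hypothetical.

```python
def to_bus_branch(nodes, breakers, branches):
    """Collapse nodes joined by closed breakers into buses.

    breakers: (node_a, node_b, is_closed) tuples.
    branches: (node_a, node_b) tuples for lines or transformers.
    Returns (bus_of, bus_branches), where bus_of maps node -> bus label.
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Path-halving lookup of the set representative.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b, closed in breakers:
        if closed:  # open breakers do not connect their terminals
            parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    # Label each bus by its alphabetically smallest member node.
    bus_of = {n: min(members) for members in groups.values() for n in members}
    bus_branches = sorted(
        {tuple(sorted((bus_of[a], bus_of[b])))
         for a, b in branches if bus_of[a] != bus_of[b]}
    )
    return bus_of, bus_branches

nodes = ["n1", "n2", "n3", "n4"]
breakers = [("n1", "n2", True), ("n3", "n4", False)]
branches = [("n2", "n3"), ("n1", "n4")]
bus_of, bus_branches = to_bus_branch(nodes, breakers, branches)
print(bus_of, bus_branches)
```

Re-running the merge whenever a breaker state changes is what makes the bus-branch view follow equipment status over time.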
Data acquisition technologies used in power systems have been continuously improving, laying a solid foundation for data-driven operation analysis of power systems. However, existing methods for analyzing the relationships between operational variables mainly depend on the mathematical model and element parameters of the power system. Therefore, a thoroughly data-based analysis method is required to investigate the spatiotemporal characteristics of power system operation, especially for new types of power systems. The causal inference method, which has been successfully applied in many fields, is a powerful tool for investigating the interaction of data variables. In this study, a causal inference method based on supervisory control and data acquisition (SCADA) data is proposed for investigating the spatiotemporal causal relationships in power systems. First, a multiple data-sequence regression model is proposed to analyze the relationships among operation data variables. Next, the linear non-Gaussian acyclic model (LiNGAM) is used to calculate the causal index of the operational variables, and its limitations are analyzed. Furthermore, a new causal index, "full variable amplitude LiNGAM (FVA-LiNGAM)", is proposed by incorporating prior knowledge of causal direction and considering the effect of real variable amplitude. Using the FVA-LiNGAM causal index, the causal relationships of operation variables can be investigated with higher spatiotemporal accuracy than with the original LiNGAM index. The validity of the FVA-LiNGAM causal index is verified on a real SCADA data subset of a provincial power system, and the variation patterns in spatiotemporal causality are explored using actual SCADA data sequences. The results show that spatiotemporal causality variation patterns do exist between the operating variables of the power system.
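For orientation, the basic LiNGAM model referred to above is usually written as below. This is the standard textbook formulation, not the FVA-LiNGAM index, whose definition the abstract does not give; the symbols $\mathbf{x}$, $\mathbf{B}$, and $\mathbf{e}$ are the conventional ones rather than notation taken from the paper.

```latex
% Each observed variable is a linear function of the variables that
% precede it in some causal order, plus an independent disturbance.
\mathbf{x} = \mathbf{B}\,\mathbf{x} + \mathbf{e}
\quad\Longleftrightarrow\quad
\mathbf{x} = (\mathbf{I} - \mathbf{B})^{-1}\,\mathbf{e}
% B can be permuted to a strictly lower-triangular matrix (acyclicity),
% and the components of e are mutually independent and non-Gaussian.
```

The non-Gaussianity of the disturbances is what allows the causal order to be identified from observational data alone.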
Developing a comprehensive understanding of inter-city interactions is crucial for regional planning. We therefore examined spatiotemporal patterns of population migration across the Qinghai-Tibet Plateau (QTP) using migration big data from Tencent for the period between 2015 and 2019. We first used decomposition and breakpoint detection methods to examine time-series migration data and to identify the two seasons with the strongest and weakest population migration levels, between June 18th and August 18th and between October 8th and February 15th, respectively. Population migration within the former period was 2.03 times that seen in the latter. We then used a variety of network analysis methods to examine population flow directions as well as the importance of each individual city in migration. The two capital cities on the QTP, Lhasa and Xining, form centers for population migration and are also transfer hubs through which migrants from cities off the plateau enter and leave this region. Data show that these two cities contribute more than 35% of total population migration. The majority of migrants tend to move within the province, particularly during the weakest migration season. We also used interactive relationship force and radiation models to examine the interaction strength and the radiating energy of each individual city. Results show that Lhasa and Xining exhibit the strongest interactions with other cities and have the largest radiating energies. Indeed, the radiating energy of the QTP cities correlates with their gross domestic product (GDP) (Pearson correlation coefficient: 0.754 in the weakest migration season, WMS, versus 0.737 in the strongest migration season, SMS), while changes in radiating energy correlate with tourism-related revenue (Pearson correlation coefficient: 0.685). These outcomes suggest that the level of economic development and the level of tourism are the two most important factors driving QTP population migration. The results of this analysis help clarify the large differences in development across the QTP.
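One common way to gauge a city's importance in a migration network is its weighted in- and out-strength. The sketch below assumes that measure, which may differ from the network analysis methods the study actually applied. The cities "A", "B", "C" and the flow volumes are placeholders, not values from the Tencent data.

```python
from collections import defaultdict

# (origin, destination, migrants) records; all values are hypothetical.
flows = [("A", "B", 120), ("B", "A", 100), ("B", "C", 60), ("C", "A", 20)]

out_strength = defaultdict(int)
in_strength = defaultdict(int)
for origin, dest, volume in flows:
    out_strength[origin] += volume
    in_strength[dest] += volume

total = sum(volume for _, _, volume in flows)
cities = sorted(set(out_strength) | set(in_strength))

# Net inflow shows the dominant flow direction for each city.
net_inflow = {c: in_strength[c] - out_strength[c] for c in cities}
# Share of all trips that start or end in the city.
involvement = {c: (in_strength[c] + out_strength[c]) / total for c in cities}

for c in cities:
    print(c, net_inflow[c], round(involvement[c], 3))
```

Because every trip touches two cities, the involvement shares sum to 2 rather than 1; dividing by `2 * total` would give shares that sum to 1.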
Lightning is a significant natural hazard that poses considerable risks to both human safety and industrial operations. Accurate, fine-scale lightning forecasting is crucial for effective disaster prevention. Traditional forecasting methods primarily rely on numerical weather prediction (NWP), which demands substantial computational resources to solve complex atmospheric evolution equations. Recently, deep learning-based weather prediction models, particularly weather foundation models (WFMs), have demonstrated promising results, achieving performance comparable to NWP while requiring substantially fewer computational resources. However, existing WFMs are unable to directly generate lightning forecasts and struggle to satisfy the high spatial resolution required for fine-scale prediction. To address these limitations, this paper investigates a fine-scale lightning forecasting approach based on WFMs and proposes a dual-source data-driven forecasting framework that integrates the strengths of both WFMs and recent lightning observations to enhance predictive performance. Furthermore, a gated spatiotemporal fusion network (gSTFNet) is designed to address the challenges of cross-temporal and cross-modal fusion inherent in dual-source data integration. gSTFNet employs a dual-encoding structure to separately encode features from WFMs and lightning observations, effectively narrowing the modal gap in the latent feature space. A gated spatiotemporal fusion module is then introduced to model the spatiotemporal correlations between the two types of features, facilitating seamless cross-temporal fusion. The fused features are subsequently processed by a deconvolutional network to generate lightning forecasts. We evaluate the proposed gSTFNet using real-world lightning observation data collected in Guangdong from 2018 to 2022. Experimental results demonstrate that: (1) in terms of the ETS score, the dual-source framework achieves a 50% improvement over models trained solely on WFMs, and a 300% improvement over the HRES lightning forecasting product released by the European Centre for Medium-Range Weather Forecasts (ECMWF); (2) gSTFNet outperforms several state-of-the-art deep learning baselines that use dual-source inputs, demonstrating superior forecasting accuracy.
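The ETS (equitable threat score) used for evaluation can be computed from a binary contingency table. The sketch below follows the standard definition of the score; the forecast and observation grids are made-up values, and the flattening of a grid into a list is an assumption for brevity.

```python
def contingency(forecast, observed):
    """Count hits, misses, false alarms, correct negatives for 0/1 sequences."""
    hits = sum(1 for f, o in zip(forecast, observed) if f and o)
    misses = sum(1 for f, o in zip(forecast, observed) if not f and o)
    false_alarms = sum(1 for f, o in zip(forecast, observed) if f and not o)
    correct_neg = sum(1 for f, o in zip(forecast, observed) if not f and not o)
    return hits, misses, false_alarms, correct_neg

def equitable_threat_score(hits, misses, false_alarms, correct_neg):
    """ETS = (H - H_rand) / (H + M + F - H_rand)."""
    total = hits + misses + false_alarms + correct_neg
    # Hits expected from a random forecast with the same event frequencies.
    hits_random = (hits + misses) * (hits + false_alarms) / total
    denom = hits + misses + false_alarms - hits_random
    return (hits - hits_random) / denom if denom else 0.0

# Hypothetical flattened grid: 1 = lightning, 0 = no lightning.
forecast = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
observed = [1, 0, 0, 0, 1, 1, 0, 0, 0, 0]
table = contingency(forecast, observed)
ets = equitable_threat_score(*table)
print(table, round(ets, 4))
```

Subtracting the random-hit term is what makes the score "equitable": a forecast that simply predicts lightning everywhere gets no credit for the hits it would score by chance.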
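High-precision maritime vessel trajectory prediction is an important foundation for reducing collision risk and improving search-and-rescue efficiency. The variability of the maritime navigation environment makes vessel trajectory data highly complex in both time and space, and existing methods pay insufficient attention to trajectory data quality and motion information, so they struggle to fully capture the spatiotemporal features and correlations in trajectories. This paper therefore proposes a Ship Maritime Trajectory Prediction Method Integrating Data Quality Enhancement and Spatio-Temporal Information Encoding Network (DQE-STIEN). First, based on the characteristics of vessel trajectory data, a data quality enhancement algorithm is designed that combines hash-map classification with local outlier hash value anomaly detection to improve the quality of problematic data. Then, for multi-attribute vessel trajectory data, a spatiotemporal information encoding network with dual encoding channels is designed to fully extract and integrate position information and motion features. Finally, spatiotemporal correlation information is extracted from the spatiotemporal encoding and decoded to generate complete trajectory predictions. Experiments on AIS datasets from five different regions show that DQE-STIEN performs well. DQE-STIEN also generalizes to some extent and can effectively analyze time-series data in fields such as energy, sales, environment, and finance.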
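With the rapid development of positioning technology and sensors, user mobility trajectory data have become increasingly abundant, but most are scattered across different platforms. To make full use of these data and accurately reflect users' real behavior, research on trajectory-user linking has become essential. The task aims to accurately associate user identities with trajectories in massive check-in data. In recent years, researchers have applied recurrent neural networks, attention mechanisms, and similar methods to mine trajectory data in depth. However, current methods face two major challenges when processing user check-in trajectories: first, the limited spatiotemporal features in check-in data are insufficient to model check-in points comprehensively from both subjective and objective perspectives; second, a user's check-in trajectory often revolves around a specific theme. To address these two challenges, a natural-language-augmented trajectory-user linking model (Natural Language Augmented Trajectory User Link, NLATUL) is proposed. First, a set of natural language templates and soft prompt tokens is designed to describe check-in trajectories, and a language model is used to understand the subjective intent behind check-in points and fuse it with the user's spatiotemporal state, providing a way to model check-in points from both subjective and objective perspectives. On this basis, prompt learning is used to infer the theme of a check-in trajectory, the trajectory represented by the modeled check-in points is encoded bidirectionally, and the trajectory theme is combined with the trajectory encoding to achieve an accurate understanding of the user's check-in trajectory. Experimental results on two real-world check-in datasets show that NLATUL matches check-in trajectories to their corresponding users more accurately.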
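The Madden-Julian Oscillation (MJO) is the dominant mode of tropical intraseasonal variability, and its accurate prediction is essential for improving subseasonal forecasting. However, the MJO exhibits multi-scale evolution and highly nonlinear dynamics, and existing prediction methods still fall short in capturing its complex spatiotemporal structure. This paper therefore proposes an MJO prediction model that fuses multimodal data and spatiotemporal features (Multimodal data and Integrated Spatiotemporal features for MJO prediction, MISM). The model takes the historical Real-time Multivariate MJO index (RMM) and multiple meteorological factors as joint input, builds a spatial feature extraction module from squeeze-and-excitation, convolution, and Swin Transformer modules, and introduces an autoregressive attention mechanism for nonlinear time-series modeling. Experimental results show that the prediction skill of MISM extends beyond 30 d and that, for long-range forecasts beyond 25 d, it outperforms traditional dynamical and statistical methods. In addition, saliency maps are used to analyze the regions where meteorological factors contribute; the results show that the western Pacific and the Indonesian archipelago are highly sensitive at all lead times, and that ocean regions generally contribute more than land. Water vapor and sea surface temperature anomalies play a more prominent role at short and medium lead times, low-level winds and convective activity contribute more at long lead times, and upper-level circulation maintains a stable influence at all lead times, reflecting the model's ability to identify MJO evolution mechanisms.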
Dissolved oxygen (DO) in the ocean plays an indispensable role in supporting biological respiration, maintaining ecological balance, and promoting nutrient cycling. According to existing research, total DO has declined by 2% over the past 50 a, and the tropical Pacific Ocean holds the largest oxygen minimum zone (OMZ) areas. However, sparse observation data limit the understanding of dynamic variation and trends in the ocean when traditional interpolation methods are used. In this study, we applied different machine learning algorithms to fit regression models between measured DO, ocean reanalysis physical variables, and spatiotemporal variables. We demonstrate that the extreme gradient boosting (XGBoost) model has the best performance and use it to reconstruct a four-dimensional DO dataset of the tropical Pacific Ocean from 1920 to 2023. The results reveal that XGBoost significantly improves the reconstruction performance in the tropical Pacific Ocean, with a 35.3% reduction in root mean-squared error and a 39.5% decrease in mean absolute error. Additionally, we compare the results with data from three Coupled Model Intercomparison Project Phase 6 (CMIP6) models to confirm the high accuracy of the four-dimensional reconstruction. Overall, the OMZ mainly dominates the eastern tropical Pacific Ocean, with a slow expansion. This study uses XGBoost to efficiently reconstruct four-dimensional DO, enhancing the understanding of hypoxic expansion in the tropical Pacific Ocean, and we foresee that this approach could be extended to reconstruct more ocean elements.
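The reported percentage reductions in root mean-squared error and mean absolute error can be read as relative changes against a baseline reconstruction. The sketch below shows that arithmetic under the assumption that the reduction is defined as (baseline - model) / baseline; the abstract does not state the baseline, and all DO values here are hypothetical.

```python
from math import sqrt

def rmse(pred, obs):
    """Root mean-squared error between predictions and observations."""
    return sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def mae(pred, obs):
    """Mean absolute error between predictions and observations."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)

def relative_reduction(baseline_error, model_error):
    """Fraction by which the model error is lower than the baseline error."""
    return (baseline_error - model_error) / baseline_error

# Hypothetical DO values at four held-out locations.
observed = [200.0, 180.0, 150.0, 120.0]
baseline = [210.0, 170.0, 160.0, 110.0]   # e.g. an interpolation method
model    = [206.0, 176.0, 152.0, 118.0]   # e.g. a regression model

rmse_drop = relative_reduction(rmse(baseline, observed), rmse(model, observed))
mae_drop = relative_reduction(mae(baseline, observed), mae(model, observed))
print(f"RMSE reduced by {rmse_drop:.1%}, MAE reduced by {mae_drop:.1%}")
```

RMSE penalizes large individual errors more heavily than MAE, so the two reductions generally differ even on the same data.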
Funding (concrete dam deformation monitoring study): supported by the National Key R&D Program of China (Grant No. 2022YFC3005401), the Fundamental Research Funds for the Central Universities (Grant No. B230201013), the National Natural Science Foundation of China (Grants No. 52309152, U2243223, and U23B20150), the Natural Science Foundation of Jiangsu Province (Grant No. BK20220978), and the Open Fund of National Dam Safety Research Center (Grant No. CX2023B03).
Funding (dam safety monitoring database study): supported by the National Natural Science Foundation of China (Grant Nos. 50539010, 50539110, 50579010, 50539030, and 50809025).
Funding (power grid spatiotemporal graph modeling study): supported by the Project of China Southern Power Grid Digital Grid Research Institute Co., Ltd. (210002KK52222026).
Funding (power system causal inference study): supported by the National Natural Science Foundation of China (51877034).
Funding (QTP population migration study): National Natural Science Foundation of China (41590845); Strategic Priority Research Program of the Chinese Academy of Sciences (XDA19040501); Strategic Priority Research Program of the Chinese Academy of Sciences (XDA20040401); National Key Research and Development Program of China (2017YFB0503605); National Key Research and Development Program of China (2017YFC1503003).
Funding (fine-scale lightning forecasting study): supported by the Open Grants of Key Laboratory of Lightning, China Meteorological Administration (Grant Nos. 2023KELL-B002 and 2024KELL-A001), the National Natural Science Foundation of China (Grant Nos. 62306028, 42075088, and U2342215), and the Science and Technology Program of Shenzhen, China (Grant No. KJZD20240903102742055).
Funding (tropical Pacific dissolved oxygen study): the National Natural Science Foundation of China under contract Nos. T2421002, 623B2071, and 42125601; the National Key R&D Program of China under contract No. 2023YFF0805300.