Funding: Supported by the Meteorological Soft Science Project (Grant No. 2023ZZXM29), the Natural Science Fund Project of Tianjin, China (Grant No. 21JCYBJC00740), and the Key Research and Development-Social Development Program of Jiangsu Province, China (Grant No. BE2021685).
Abstract: As climate change and the growth of the aviation industry intensify the risks associated with air turbulence, it has become imperative to monitor and mitigate these threats to ensure civil aviation safety. The eddy dissipation rate (EDR) has been established as the standard metric for quantifying turbulence in civil aviation. This study explores a universally applicable symbolic classification approach based on genetic programming to detect turbulence anomalies using quick access recorder (QAR) data. The detection of atmospheric turbulence is treated as an anomaly detection problem. Comparative evaluations demonstrate that this approach performs on par with direct EDR calculation methods in identifying turbulence events. Moreover, comparisons with alternative machine learning techniques indicate that the proposed technique performs best among the methods compared. In summary, symbolic classification via genetic programming enables accurate turbulence detection from QAR data, comparable to established EDR approaches and surpassing the machine learning algorithms tested. This finding highlights the potential of integrating symbolic classifiers into turbulence monitoring systems to enhance civil aviation safety amid rising environmental and operational hazards.
Abstract: To improve the accuracy and completeness of mining data records from the web, the concepts of isomorphic page and directory page are introduced and three algorithms are proposed. Isomorphic web pages are a set of web pages that share a uniform structure and differ only in their main information. A web page that contains many links to isomorphic web pages is called a directory page. Algorithm 1 finds directory pages in a website using adjacent-link similarity analysis. It first sorts the links and then counts the links in each directory. If the count is greater than a given threshold, it finds the similar sub-page links in the directory and returns the results. A function for judging whether web pages are isomorphic is also proposed. Algorithm 2 mines data records from an isomorphic page using a noise information filter. It is based on the fact that the noise information is the same in two isomorphic pages and only the main information differs. Algorithm 3 mines data records from an entire website using spider (crawler) technology. The experiment shows that the proposed algorithms mine data records more completely than existing algorithms. Mining data records from isomorphic pages is an efficient method.
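The counting step of Algorithm 1 can be illustrated with a short sketch. It is not the authors' implementation: the function name `find_directory_groups`, the choice to group links by host and URL directory, and the example URLs are all assumptions made for illustration, and the link-similarity analysis that the abstract mentions is left out because the abstract does not specify it.

```python
from collections import defaultdict
from urllib.parse import urlparse
import posixpath


def find_directory_groups(links, threshold=3):
    """Group links by (host, directory); keep groups with more than `threshold` links.

    Hypothetical helper: grouping by URL directory is an assumed reading of
    "counts the links in each directory".
    """
    groups = defaultdict(list)
    for link in sorted(links):                      # step 1: sort the links
        parsed = urlparse(link)
        directory = posixpath.dirname(parsed.path)  # e.g. "/news" for "/news/a1.html"
        groups[(parsed.netloc, directory)].append(link)
    # step 2: count the links per directory and compare against the threshold
    return {key: urls for key, urls in groups.items() if len(urls) > threshold}


# Made-up example links: four pages under /news and one unrelated page.
links = [
    "http://example.com/news/a1.html",
    "http://example.com/news/a2.html",
    "http://example.com/news/a3.html",
    "http://example.com/news/a4.html",
    "http://example.com/about.html",
]
groups = find_directory_groups(links, threshold=3)
```

With a threshold of 3, only the `/news` directory survives, which makes the linking page a directory-page candidate under these assumptions.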
Funding: This work is supported by the NSFC (Nos. 61772280, 61772454), the Changzhou Sci&Tech Program (No. CJ20179027), and the PAPD fund from NUIST. Prof. Jin Wang is the corresponding author.
Abstract: Since the British National Archive put forward the concept of digital continuity in 2007, several developed countries have worked out digital continuity action plans. However, technologies for guaranteeing digital continuity are still lacking. This paper first analyzes the requirements of the digital continuity guarantee for electronic records based on data quality theory, and then points out the necessity of a data quality guarantee for electronic records. It then recasts the digital continuity guarantee of electronic records as ensuring their consistency, completeness, and timeliness, and constructs the first technology framework for the digital continuity guarantee of electronic records. Finally, temporal functional dependencies are used to build the first integrated method for ensuring the consistency, completeness, and timeliness of electronic records.
Funding: Supported by the National Major Scientific and Technological Special Project for "Significant New Drugs Development" (No. 2018ZX09201008) and the Special Fund Project for Information Development from Shanghai Municipal Commission of Economy and Information (No. 201701013).
Abstract: Regional healthcare platforms collect clinical data from hospitals in specific areas for the purpose of healthcare management. Reusing the data for clinical research is a common requirement. However, this faces challenges such as inconsistent terminology in electronic health records (EHR) and the complexity of data quality and data formats on regional healthcare platforms. In this paper, we propose a methodology and process for constructing large-scale cohorts, which form the basis for studying causality and comparative effectiveness in epidemiology. First, we constructed a Chinese terminology knowledge graph to deal with the diversity of vocabularies on the regional platform. Second, we built disease-specific case repositories (e.g., a heart failure repository) that use the graph to search for related patients and to normalize the data. Based on the requirements of a clinical study exploring the effect of statin use on 180-day readmission in patients with heart failure, we built a large-scale retrospective cohort of 29647 heart failure patients from the heart failure repository. After propensity score matching, a study group (n=6346) and a control group (n=6346) with comparable clinical characteristics were obtained. Logistic regression analysis showed that statin use was negatively correlated with 180-day readmission in heart failure patients. This paper presents the workflow and an application example of big data mining based on regional EHR data.
Funding: This research was supported by the National Natural Science Foundation of China under Grant (No. 42050102), the Postgraduate Education Reform Project of Jiangsu Province under Grant (No. SJCX22_0343), and the Dou Wanchun Expert Workstation of Yunnan Province (No. 202205AF150013).
Abstract: With the rapid development of information technology, the electronification of medical records has gradually become a trend. In China, the population base is huge and the supporting medical institutions are numerous, which drives the conversion of paper medical records to electronic medical records. Electronic medical records are the basis for establishing a smart hospital and an important guarantee for achieving medical intelligence, and the massive amount of electronic medical record data is also an important data set for research in the medical field. However, electronic medical records contain a large amount of private patient information, which must be desensitized before they are used as open resources. To solve this problem, data masking for Chinese electronic medical records with named entity recognition is proposed in this paper. First, the text is vectorized to satisfy the input format required by the model. Second, because input sentences vary in length and the contextual relationship between sentences is not negligible, a neural network model for named entity recognition based on bidirectional long short-term memory (BiLSTM) with conditional random fields (CRF) is constructed. Finally, the data masking operation is performed based on the named entity recognition results, mainly using regular expression filtering encryption and principal component analysis (PCA) word vector compression and replacement. In addition, comparison experiments with the hidden Markov model (HMM), LSTM-CRF model, and BiLSTM model are conducted. The experimental results show that the method used in this paper achieves 92.72% Accuracy, 92.30% Recall, and 92.51% F1_score, which is higher than the other models.
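The three reported figures can be checked against the usual definition of the F1 score as the harmonic mean of precision and recall. The sketch below assumes that the figure the abstract labels "Accuracy" plays the role of precision; the abstract does not confirm this, so the check is illustrative only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent here)."""
    return 2 * precision * recall / (precision + recall)


# Assumption: the abstract's "Accuracy" (92.72%) is treated as precision.
f1 = f1_score(92.72, 92.30)
print(round(f1, 2))  # prints 92.51
```

Under that assumption the computed value agrees with the reported 92.51% F1_score.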
Funding: National Basic Research Program of China (973 Program) (2005CD312904)
Abstract: To solve the problem of workflow data consistency in a distributed environment, an invalidation strategy based on a timely updated record list is put forward. By updating the record list and adding a recovery mechanism for update messages, the strategy improves on the classical invalidation strategy. When the request cycle of a replica is too long, the strategy updates the record list to pause the sending of update messages; when the long-cycle replica is requested again, it uses the recovery mechanism to resume the update messages. This strategy not only ensures the consistency of the workflow data but also reduces unnecessary network traffic. A theoretical comparison with common strategies shows that this strategy generates less unnecessary network traffic and is more stable. The simulation results validate this conclusion.
Abstract: This article studies the fault recorder in power systems and introduces the COMTRADE format. It uses C++ to read recorded fault data and adopts Fourier analysis and the symmetrical component method to filter the data and extract the fundamental waves. Finally, the effectiveness of the data processing method introduced in this paper is verified with CAAP software.
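The two signal-processing steps named in this abstract, Fourier extraction of the fundamental and the symmetrical component method, can be sketched as follows. This is a minimal illustration in Python rather than the paper's C++ code; it does not parse COMTRADE files, and the function names, the 32-samples-per-cycle window, and the synthetic waveforms are assumptions.

```python
import cmath
import math


def fundamental_phasor(samples):
    """Single-bin DFT over exactly one cycle; returns the peak-value fundamental phasor."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * k / n) for k, x in enumerate(samples))
    return 2 * acc / n


def symmetrical_components(va, vb, vc):
    """Return (zero, positive, negative) sequence phasors of a three-phase set."""
    a = cmath.exp(2j * math.pi / 3)  # 120-degree rotation operator
    zero = (va + vb + vc) / 3
    positive = (va + a * vb + a * a * vc) / 3
    negative = (va + a * a * vb + a * vc) / 3
    return zero, positive, negative


def synthetic_wave(phase, n=32):
    """Made-up test signal: unit fundamental plus a 20% third harmonic."""
    return [
        math.cos(2 * math.pi * k / n + phase) + 0.2 * math.cos(3 * 2 * math.pi * k / n)
        for k in range(n)
    ]


# Balanced three-phase set (phases 0, -120, +120 degrees) with harmonic distortion.
va = fundamental_phasor(synthetic_wave(0.0))
vb = fundamental_phasor(synthetic_wave(-2 * math.pi / 3))
vc = fundamental_phasor(synthetic_wave(2 * math.pi / 3))
zero, positive, negative = symmetrical_components(va, vb, vc)
```

For this balanced input the one-cycle DFT rejects the third harmonic, so the positive-sequence phasor has magnitude 1 and the zero and negative sequences vanish up to rounding error. Under a fault the three phasors would no longer be balanced, and the negative or zero sequence would become nonzero.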
Funding: Supported in part by the National Science Foundation of China (41075055).
Abstract: The calibration of paleoclimate proxies is currently one of the key problems in the study of paleoclimate. Historical documentary records of climate are suitable for calibrating both the dating and the climatic implication of proxy data in a climatological sense. A test calibration of the Delingha tree-ring precipitation series against Chinese historical documentary records shows that, of the 44 extreme dry cases in 1401-1950 AD, 42 cases (or 95.5%) are believable. Thus the long precipitation series derived from the Delingha tree rings is highly reliable. Another test, which uses historical records to validate the monsoon intensity proxy based on sediments from Huguangyan Maar Lake in Zhanjiang, indicates that the winter monsoon intensities designated by the lake's Ti content series are entirely opposite to the harsh winters depicted in historical documents for 800-900 AD. As a result, serious doubt is raised about the climatic implication of this paleo-monsoon proxy series.
Abstract: With advances in artificial intelligence, blockchain, cloud computing, and big data, there is a need for secure, decentralized medical record storage and retrieval systems. While cloud storage solves storage issues, it is challenging to realize secure sharing of records over the network. The Medi-block record has brought a new digitalization method for patients' medical records in the healthcare system. This centralized technology provides a symmetrical process between the hospital and doctors when patients urgently need to go to a different or nearby hospital. It makes electronic medical records available with the correct authentication and restricts access to medical data retrieval. The Medi-block record is a consumer-centered healthcare data system that provides reliable and transparent datasets for medical records. This study presents an extensive review of proposed solutions that aim to protect the privacy and integrity of medical data by securing data sharing for Medi-block records. It also provides a comprehensive investigation of recent advances in methods of securing data sharing, such as blockchain technology, access control, privacy preservation, proxy re-encryption, and the service-on-chain approach. Finally, we highlight the open issues and identify the challenges regarding secure data sharing for Medi-block records in healthcare systems.
Abstract: In data management software, records of different lengths need to be stored in an array, and the number of records often increases while the software is in use. A universal data structure is presented in the design. It provides a unified interface for dynamically storing records of different lengths, so that developers can call the unified interface directly for data storage, which simplifies the design of the data management system.
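One way to realize such a unified interface is a growable byte buffer plus an offset table, sketched below. The abstract does not describe the actual structure, so the class name `RecordStore`, its methods, and the buffer-plus-offsets layout are assumptions for illustration.

```python
class RecordStore:
    """Growable store for variable-length byte records (hypothetical design)."""

    def __init__(self):
        self._data = bytearray()  # all record bytes, stored back to back
        self._offsets = [0]       # _offsets[i] is the start of record i; last entry is the end

    def append(self, record: bytes) -> int:
        """Store a record of any length and return its index."""
        self._data.extend(record)
        self._offsets.append(len(self._data))
        return len(self._offsets) - 2

    def get(self, index: int) -> bytes:
        """Return the record stored at `index`."""
        if not 0 <= index < len(self):
            raise IndexError(index)
        start, end = self._offsets[index], self._offsets[index + 1]
        return bytes(self._data[start:end])

    def __len__(self) -> int:
        return len(self._offsets) - 1


store = RecordStore()
first = store.append(b"ab")     # short record
second = store.append(b"hello") # longer record, same call
```

Callers use the same `append` and `get` calls regardless of record length, and the store grows as records are added, which matches the two requirements stated in the abstract.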
Funding: This work is supported by the NSFC (61772280), the national training programs of innovation and entrepreneurship for undergraduates (Nos. 201910300123Y, 202010300200), and the PAPD fund from NUIST.
Abstract: In the field of electronic record management, especially in the current big data environment, data continuity has become a new topic that is as important as security and needs to be studied. This paper decomposes the data continuity guarantee of electronic records into a set of data protection requirements consisting of data relevance, traceability, and comprehensibility, and proposes using associated data technology to provide an integrated guarantee mechanism that meets these three requirements.
Abstract: Without proper security mechanisms, medical records stored electronically can be accessed more easily than physical files. Patient health information is scattered throughout the hospital environment, including laboratories, pharmacies, and daily medical status reports. The electronic format of medical reports ensures that all information is available in a single place. However, it is difficult to store and manage large amounts of data. Dedicated servers and a data center are needed to store and manage patient data, but self-managed data centers are expensive for hospitals. Storing data in a cloud is a cheaper alternative. The advantage of storing data in a cloud is that it can be retrieved anywhere and anytime using any device connected to the Internet. Therefore, doctors can easily access the medical history of a patient and diagnose diseases according to the context. It also helps prescribe the correct medicine to a patient in an appropriate way. The systematic storage of medical records could help reduce medical errors in hospitals. The challenge is to store medical records on a third-party cloud server while addressing privacy and security concerns. These servers are often semi-trusted. Thus, sensitive medical information must be protected. Open access to records and modifications performed on the information in those records may even cause patient fatalities. Patient-centric health-record security is a major concern. End-to-end file encryption before outsourcing data to a third-party cloud server ensures security. This paper presents a method that combines the advanced encryption standard and the elliptic-curve Diffie-Hellman method and is designed to increase the efficiency of medical record security for users. Comparisons of existing and proposed techniques are presented at the end of the article, with a focus on analyzing the security approaches of the elliptic-curve and secret-sharing methods. This study aims to provide a high level of security for patient health records.
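Abstract: Smart substations undergoing retrofit have complex sampling modes, and relay protection devices struggle to detect slight anomalies in the sampling circuit, so hidden circuit defects are exposed only after a serious delay. To address this problem, the sampling modes and secondary equipment configuration of smart substations during the retrofit period are analyzed, and a method for detecting anomalies in relay protection sampling circuits based on comparison of same-source recorded waveform data is proposed. First, the bidirectional encoder representations from transformers (BERT) language model and the cosine similarity algorithm are used to match the channels of same-source recorded data. Then, resampling and the Manhattan distance are used to unify the sampling frequency of the waveforms and align them in the time domain. Finally, an improved algorithm based on dynamic time warping (DTW) is proposed and combined with the sampling point offset to set the anomaly criterion for the sampling circuit. Case analysis shows that the method can match same-source channels of recorded data and align the waveforms consistently, and that the improved DTW algorithm identifies abnormal states with higher sensitivity and accuracy than the traditional DTW algorithm. The anomaly criterion effectively detects abnormal states of the relay protection sampling circuit, helping to ensure the safe and reliable operation of smart substations.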
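The building blocks named in this abstract (cosine similarity, Manhattan distance, and DTW) can be sketched in plain Python. This shows the textbook versions only: the paper's improved DTW and its anomaly criterion are not described in the abstract and are not reproduced here. The helper `best_shift` is a hypothetical reading of how a Manhattan distance could drive time-domain alignment, and all sample values are made up.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def manhattan_distance(x, y):
    """Sum of absolute sample differences between two equal-length waveforms."""
    return sum(abs(a - b) for a, b in zip(x, y))


def best_shift(reference, signal, max_shift):
    """Shift s minimizing the mean absolute difference of reference[i] vs signal[i + s]."""
    best, best_score = 0, float("inf")
    for s in range(-max_shift, max_shift + 1):
        pairs = [(reference[i], signal[i + s])
                 for i in range(len(reference)) if 0 <= i + s < len(signal)]
        if not pairs:
            continue
        score = sum(abs(a - b) for a, b in pairs) / len(pairs)
        if score < best_score:
            best, best_score = s, score
    return best


def dtw_distance(x, y):
    """Classic dynamic time warping distance with absolute-difference local cost."""
    inf = float("inf")
    cost = [[inf] * (len(y) + 1) for _ in range(len(x) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(x) + 1):
        for j in range(1, len(y) + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[-1][-1]


reference = [0, 0, 1, 2, 1, 0, 0]
delayed = [0, 0, 0, 1, 2, 1, 0]   # same waveform, one sample later
shift = best_shift(reference, delayed, max_shift=2)
warped = dtw_distance([0, 1, 2, 1, 0], [0, 0, 1, 2, 1, 0])
```

In this toy case `best_shift` recovers the one-sample delay and the DTW distance between the two time-stretched waveforms is zero. A real criterion would compare such a distance against a threshold chosen for the substation's data, which the abstract does not give.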
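Abstract: Because recording standards for fault-recorder data differ across periods, and manufacturers interpret the standards differently, channel names in same-source recorded data vary from vendor to vendor and channel index numbers differ, which makes same-source matching of recorded data difficult. To address this problem, a same-source recorded data matching method based on the sentence-masked language model as correction bidirectional encoder representations from transformers (Sentence-MacBERT) is proposed. First, the record format of recording files is analyzed, and a check information table is built from the format characteristics. Then, recording files are automatically verified against the check information table. Finally, a Sentence-MacBERT same-source channel matching model is built on the bidirectional encoder representations from transformers (BERT) model to complete same-source recorded data matching. Case analysis shows that the check information table enables automatic verification of recording files and raises an alarm for files that fail to parse. Channel name matching with the Sentence-MacBERT model performs well and effectively completes same-source matching of recorded data, helping operators with fault analysis.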