Abstract: This paper presents a procedure for purifying training data sets (i.e., past occurrences of slope failures) for the inverse estimation of unobserved trigger factors of "different types of simultaneous slope failures". Because trigger factors are difficult to observe pixel by pixel, the authors had previously proposed, as one countermeasure, an inverse analysis algorithm for trigger factors based on SEM (structural equation modeling). Through a measurement equation, the trigger factor is inversely estimated, and a TFI (trigger factor influence) map can also be produced. As a subsequent subject, a purification procedure for the training data set should be constructed to improve the accuracy of the TFI map, which depends on how representative the given training data sets are of the different types of slope failures. The proposed procedure resamples the pixels that match between the original groups of past slope failures (i.e., surface slope failures, deep-seated slope failures, landslides) and the three groups obtained by K-means clustering of all pixels corresponding to those slope failures. For all three types of slope failures, an improvement in success rates with the resampled training data sets was confirmed. As a final outcome, the differences between the TFI maps produced from the original and the resampled training data sets are delineated on a DIF (difference) map, which is useful for analyzing trigger factor influence in terms of "risky-side" and "safe-side" assessment sub-areas with respect to "different types of simultaneous slope failures".
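The resampling step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the K-means cluster labels are already available, and it assumes a cluster "matches" the original group it overlaps most, a criterion the abstract does not spell out. The function name `resample_matched` and the toy labels are hypothetical.

```python
from collections import Counter

def resample_matched(original_groups, cluster_labels):
    """Return indices of pixels whose cluster agrees with their original group."""
    # Count how often each cluster co-occurs with each original group.
    overlap = {}
    for group, cluster in zip(original_groups, cluster_labels):
        overlap.setdefault(cluster, Counter())[group] += 1
    # Assumption: a cluster corresponds to the group it overlaps most.
    cluster_to_group = {
        cluster: counts.most_common(1)[0][0] for cluster, counts in overlap.items()
    }
    # Keep only the pixels where both labelings agree.
    return [
        i
        for i, (group, cluster) in enumerate(zip(original_groups, cluster_labels))
        if cluster_to_group[cluster] == group
    ]

# Hypothetical toy data: nine pixels, three failure types, three clusters.
original = ["surface", "surface", "surface",
            "deep", "deep", "deep",
            "landslide", "landslide", "landslide"]
clusters = [0, 0, 1, 1, 1, 2, 2, 2, 0]

kept = resample_matched(original, clusters)
print(kept)
```

Pixels 2, 5, and 8 are dropped here because their cluster is dominated by a different failure type; the remaining indices would form the resampled training set.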
Funding: Supported by a Korean Agency for Infrastructure Technology Advancement (KAIA) grant funded by the Ministry of Land, Infrastructure and Transport (grant no. RS-2023-00239464).
Abstract: Urban railways are a vital means of public transportation in Korea. More than 30% of metropolitan residents use the railways, and this proportion is expected to increase. To enhance safety, the government has mandated the installation of closed-circuit televisions in all carriages by 2024. However, the cameras are still monitored by humans. To address this limitation, we developed a dataset of risk factors and a smart detection system that enables an immediate response to any abnormal behavior and intensive monitoring thereof. We created an innovative learning dataset that takes into account seven unique risk factors specific to Korean railway passengers. Detailed data collection was conducted across the Shinbundang Line of the Incheon Transportation Corporation, and the Ui-Shinseol Line. We observed several behavioral characteristics and assigned unique annotations to them. We also considered carriage congestion. Recognition performance was evaluated by camera placement and number, and the camera installation plan was then optimized. The dataset will find immediate applications in domestic railway operations. The artificial intelligence algorithms will be verified shortly.
Funding: Supported by the National Natural Science Foundation of China (No. 60772087, No. 50803016, No. 60975004, No. 60902023), the Foundation for the Author of National Excellent Doctoral Dissertation of P.R. China (No. 200341), the National 863 High-Tech R&D Program (No. 2007AA01Z228), and the open research fund of the Key Laboratory of Information Coding and Transmission, Southwest Jiaotong University.
Abstract: Compared with channel estimation methods based on explicit training sequences, methods using superimposed training sequences save bandwidth, but bandwidth is still wasted when a Cyclic Prefix (CP) is added. In previous work by McLernon, the Mean Square Error (MSE) performance of Data-Dependent Superimposed Training (DDST) without CP for a Single-Input Single-Output (SISO) system was analyzed under the assumption that the data-dependent sequence matrix was circulant and free of interference. In fact, for a system without CP, the data-dependent sequence matrix is no longer circulant and is subject to interference. This paper derives the exact expression of the MSE for the system without CP and extends it to the Multiple-Input Multiple-Output (MIMO) system without CP.
Funding: Supported by the National Natural Science Foundation of China (No. 52204065, No. ZX20230398) and by a grant from the Human Resources Development Program (No. 20216110100070) of the Korea Institute of Energy Technology Evaluation and Planning (KETEP).
Abstract: In the realm of subsurface flow simulations, deep-learning-based surrogate models have emerged as a promising alternative to traditional simulation methods, especially for complex optimization problems. However, constructing these deep-learning models requires numerous high-fidelity training simulations, which limits their application to field-scale problems. To overcome this limitation, we introduce a training procedure that leverages transfer learning with multi-fidelity training data to construct surrogate models efficiently. The procedure begins by pre-training the surrogate model on a relatively large amount of data that can be generated efficiently from upscaled coarse-scale models. Subsequently, the model parameters are fine-tuned with a much smaller set of high-fidelity simulation data. For the cases considered in this study, this method leads to about a 75% reduction in total computational cost compared with the traditional training approach, without any sacrifice of prediction accuracy. In addition, a dedicated well-control embedding model is added to the traditional U-Net architecture to improve the surrogate model's prediction accuracy, which is shown to be particularly effective for large-scale reservoir models under time-varying well control parameters. Comprehensive results and analyses are presented for the prediction of well rates and of the pressure and saturation states of a 3D synthetic reservoir system. Finally, the proposed procedure is applied to a field-scale production optimization problem. The trained surrogate model is shown to generalize well during the optimization process, in which the final optimized net present value is much higher than the values within the training data range.
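The two-stage idea (pre-train on plentiful low-fidelity data, then fine-tune on a few high-fidelity samples) can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration: a one-variable linear model stands in for the U-Net surrogate, the data are synthetic, and the names `fit_linear` and `mse` are hypothetical. It only shows the mechanics of reusing pre-trained parameters as the starting point for a short second training run.

```python
def fit_linear(xs, ys, w=0.0, b=0.0, lr=0.1, steps=2000):
    """Fit y ~ w*x + b by full-batch gradient descent on the mean squared error."""
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def mse(w, b, xs, ys):
    """Mean squared error of the linear model on the given samples."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical low-fidelity data: cheap to generate, but systematically biased.
low_x = [i / 49 for i in range(50)]
low_y = [2.0 * x + 0.5 for x in low_x]

# Hypothetical high-fidelity data: accurate, but only five samples.
high_x = [0.0, 0.25, 0.5, 0.75, 1.0]
high_y = [2.0 * x + 1.0 for x in high_x]

# Stage 1: pre-train on the low-fidelity set.
w_pre, b_pre = fit_linear(low_x, low_y)

# Stage 2: fine-tune from the pre-trained parameters with far fewer steps.
w_ft, b_ft = fit_linear(high_x, high_y, w=w_pre, b=b_pre, steps=200)

error_before = mse(w_pre, b_pre, high_x, high_y)
error_after = mse(w_ft, b_ft, high_x, high_y)
print(error_before, error_after)
```

In this toy setting the pre-trained model already has the right slope, so the short fine-tuning run mainly corrects the offset; the cost saving reported in the abstract comes from the analogous effect at much larger scale.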
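Abstract: Weakly supervised relation extraction uses existing relation entity pairs to acquire training data automatically from a text collection, which effectively addresses the shortage of training data. Weakly supervised training data, however, is noisy, feature-poor, and imbalanced, which degrades relation extraction performance. To address this, the paper proposes NF-Tri-training (Tri-training with Noise Filtering), a weakly supervised relation extraction algorithm. It uses undersampling to handle class imbalance, iteratively learns new samples from unlabeled data with Tri-training to improve the generalization ability of the classifiers, and applies data editing techniques to identify and remove mislabeled samples from both the initial training data and the samples produced in each iteration. Experiments on a dataset collected from Hudong Baike (互动百科) show that NF-Tri-training effectively improves the performance of the relation classifier.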
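The core Tri-training step, in which an unlabeled example is pseudo-labeled for one classifier when the other two classifiers agree on it, can be sketched as below. This is a simplified illustration under stated assumptions: the three threshold functions are hypothetical stand-ins for trained classifiers, and the error-rate conditions that standard Tri-training checks before accepting new examples, as well as the undersampling and noise filtering of NF-Tri-training, are omitted.

```python
def pseudo_label_for(target, classifiers, unlabeled):
    """Collect (example, label) pairs on which the two *other* classifiers agree."""
    others = [clf for j, clf in enumerate(classifiers) if j != target]
    picked = []
    for x in unlabeled:
        first, second = others[0](x), others[1](x)
        if first == second:
            # Both peers agree, so the example is offered to the target classifier.
            picked.append((x, first))
    return picked

# Hypothetical stand-ins for three classifiers trained on bootstrap samples.
classifiers = [
    lambda x: int(x > 0.4),
    lambda x: int(x > 0.5),
    lambda x: int(x > 0.6),
]
unlabeled = [0.1, 0.45, 0.55, 0.9]

new_for_0 = pseudo_label_for(0, classifiers, unlabeled)
print(new_for_0)
```

The example 0.55 is skipped because the two peer classifiers disagree on it. Agreement does not guarantee correctness, which is why the abstract adds a data editing step to remove mislabeled samples after each iteration.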
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 60702033 and 60772076; the National High-Tech Research and Development Plan of China (863 Program) under Grant No. 2007AA01Z171; the Science Fund for Distinguished Young Scholars of Heilongjiang Province of China under Grant No. JC200611; the Natural Science Foundation of Heilongjiang Province of China under Grant No. ZJG0705; and the Foundation of Harbin Institute of Technology of China under Grant No. HIT.2003.53.
Abstract: This study proposes the use of the MERISE conceptual data model to create indicators for monitoring and evaluating the effectiveness of vocational training in the Republic of Congo. It underlines the importance of MERISE for structuring and analyzing data, since the model makes it possible to measure how well training matches the needs of the labor market. The innovation of the study lies in the adaptation of the MERISE model to the local context, the development of innovative indicators, and the integration of a participatory approach including all relevant stakeholders:
• Contextual adaptation and local innovation: The study suggests adapting MERISE to the specific context of the Republic of Congo, considering the local particularities of the labor market.
• Development of innovative indicators and new measurement tools: It proposes creating indicators to assess skills matching and employer satisfaction, which are crucial for evaluating the effectiveness of vocational training.
• Participatory approach and inclusion of stakeholders: The study emphasizes actively involving training centers, employers, and recruitment agencies in the evaluation process. This participatory approach ensures that the perspectives of all stakeholders are considered, leading to more relevant and practical outcomes.
Using the MERISE model allows for:
• Rigorous data structuring, organization, and standardization: Clearly defining entities and relationships facilitates data organization and standardization, crucial for effective data analysis.
• Facilitation of monitoring, analysis, and relevant indicators: Developing both quantitative and qualitative indicators helps measure the effectiveness of training in relation to the labor market, allowing for a comprehensive evaluation.
• Improved communication and common language: By providing a common language for different stakeholders, MERISE enhances communication and collaboration, ensuring that all parties have a shared understanding.
The study’s approach and contribution to existing research lie in:
• Structured theoretical and practical framework and holistic approach: The study offers a structured framework for data collection and analysis, covering both quantitative and qualitative aspects, thus providing a comprehensive view of the training system.
• Reproducible methodology and international comparison: The proposed methodology can be replicated in other contexts, facilitating international comparison and the adoption of best practices.
• Extension of knowledge and new perspective: By integrating a participatory approach and developing indicators adapted to local needs, the study extends existing research and offers new perspectives on vocational training evaluation.
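Abstract: Tri-training can effectively exploit unlabeled examples to improve generalization. During its iterations, however, unlabeled examples are often mislabeled, which introduces noise into the training set and makes performance unstable. To address this shortcoming, the paper proposes a new algorithm, ADE-Tri-training (Tri-training with Adaptive Data Editing). It not only uses the Remove Only editing operation to identify and remove examples that may have been mislabeled in each iteration but, more importantly, adopts an adaptive strategy to determine the appropriate moments to trigger or suppress Remove Only. The paper proves that, under PAC theory, a series of sufficient conditions in the adaptive strategy simultaneously guarantee that the new training set grows across iterations and that the classification error rate of the new hypothesis decreases further across iterations. Experimental results on UCI datasets show that ADE-Tri-training achieves better classification generalization and robustness.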
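A data editing pass of the Remove Only kind can be sketched as follows. The abstract does not state the criterion used to flag a mislabeled example, so this sketch assumes a k-nearest-neighbour majority vote, a common choice in the data editing literature. The one-dimensional features, the function name `remove_only`, and the toy data are hypothetical, and the adaptive trigger-or-suppress strategy of ADE-Tri-training is not modelled.

```python
def remove_only(examples, k=3):
    """Drop examples whose label disagrees with the majority of their k nearest neighbours."""
    kept = []
    for i, (x, label) in enumerate(examples):
        # Sort all other examples by distance and keep the k closest.
        neighbours = sorted(
            (abs(x - other_x), other_label)
            for j, (other_x, other_label) in enumerate(examples)
            if j != i
        )[:k]
        votes = sum(1 for _, other_label in neighbours if other_label == label)
        # Keep the example only if a strict majority of neighbours share its label.
        if votes * 2 > k:
            kept.append((x, label))
    return kept

# Hypothetical training set: (feature, label). The point at 0.25 is labeled 1
# but sits among class-0 points, as a mislabeled pseudo-example might.
examples = [(0.0, 0), (0.1, 0), (0.2, 0), (0.25, 1),
            (0.8, 1), (0.9, 1), (1.0, 1)]

cleaned = remove_only(examples)
print(cleaned)
```

The operation only removes suspicious examples and never relabels them, so each pass can shrink the training set; this is one reason deciding when to trigger the edit matters.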
Funding: Supported by the Vietnamese MOET’s project “Researching and applying blockchain technology to the problem of authenticating the certificate issuing process in Vietnam”, No. B2020-BKA-14.
Abstract: In recent years, blockchain technology has been applied in the educational domain because of its salient advantages, i.e., transparency, decentralization, and immutability. Available systems typically use public blockchain networks such as Ethereum and Bitcoin to store learning results. However, the cost of writing data on these networks is significant, so educational institutions limit the data sent to the target network, which typically contains only hash codes of the issued certificates. In this paper, we present B4E (Blockchain For Education), a system based on a private blockchain network for lifelong learning data authentication and management. B4E stores not only certificates but also learners’ training data, such as transcripts and educational programs, in order to create a complete record of each user's lifelong education and to verify the certificates they have obtained. As a result, B4E can address two types of fake certificates, i.e., certificates printed by unlawful organizations and certificates issued by educational institutions to learners who have not met the training requirements. In addition, B4E is designed to allow all participants to easily deploy software packages to manage, share, and check stored information without depending on a single point of access. As such, the system enhances the transparency and reliability of the stored data. Our experiments show that B4E meets expectations for real-world deployment.
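The hash-only approach that the abstract attributes to public-chain systems can be illustrated with a short sketch. It is not B4E's design: the in-memory set is a hypothetical stand-in for on-chain storage, SHA-256 is an assumed hash function, and the certificate format is invented for the example.

```python
import hashlib

def certificate_hash(certificate_bytes):
    """Return the SHA-256 digest of a certificate as a hex string."""
    return hashlib.sha256(certificate_bytes).hexdigest()

# Hypothetical stand-in for the ledger: only digests are stored, not documents.
ledger = set()

def issue(certificate_bytes):
    """Record the digest of a newly issued certificate."""
    digest = certificate_hash(certificate_bytes)
    ledger.add(digest)
    return digest

def verify(certificate_bytes):
    """A certificate is accepted only if its digest was recorded at issue time."""
    return certificate_hash(certificate_bytes) in ledger

genuine = b"name=A. Learner;degree=BSc;year=2020"
tampered = b"name=A. Learner;degree=MSc;year=2020"
issue(genuine)

print(verify(genuine), verify(tampered))
```

A digest check of this kind detects forged or altered documents, but it cannot tell whether the learner actually met the training requirements, which is the gap the abstract says B4E addresses by also storing transcripts and program data.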