One of the most useful Information Extraction (IE) solutions to Web information harnessing is Named Entity Recognition (NER). Hand-coded rule methods are still the best performers. These methods and statistical methods exploit Natural Language Processing (NLP) features and characteristics (e.g. capitalization) to extract Named Entities (NEs) such as personal and company names. For entities that have multiple sub-entities of higher cardinality (e.g. a Linux command or a citation) and that are non-speech, these systems fail to deliver efficiently. Promising Machine Learning (ML) methods would require large amounts of training examples, which are impossible to produce manually. We call these entities Named High Cardinality Entities (NHCEs). We propose a sequence-validation-based approach for the extraction and validation of NHCEs. In this approach, the sub-entities of NHCE candidates are statistically and structurally characterized during a top-down annotation process and guided, using an ML model, toward transformation into either value types (v-types) or user-defined types (u-types). Treated as sequences of sub-entities, NHCE candidates with transformed sub-entities are then validated (and subsequently labeled) using a series of validation operators. We present a case study to demonstrate the approach and show how it helps to bridge the gap between IE and Intelligent Systems (IS) through the use of transformed sub-entities in supervised learning.
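The pipeline described above (transform each sub-entity into a type, then validate the candidate as a sequence) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no concrete types or operators, so the type names (`CMD`, `OPTION`, `PATH`, `NUMBER`), the command lexicon, and the three operators are all hypothetical, and hand-written rules stand in for the ML model that guides the transformation in the paper.

```python
import re
from typing import Callable, List

# Hypothetical user-defined type (u-type): a small lexicon of command names.
U_TYPE_COMMANDS = {"ls", "grep", "tar", "chmod"}

# Hypothetical value types (v-types), recognised by the shape of the token.
V_TYPE_PATTERNS = [
    ("OPTION", re.compile(r"--?[A-Za-z][\w-]*")),
    ("PATH", re.compile(r"(?:\.{0,2}/|~/)[\w./-]*")),
    ("NUMBER", re.compile(r"\d+")),
]


def transform(token: str) -> str:
    """Map one sub-entity to a type label (rules replace the paper's ML model)."""
    if token in U_TYPE_COMMANDS:
        return "CMD"
    for label, pattern in V_TYPE_PATTERNS:
        if pattern.fullmatch(token):
            return label
    return "UNKNOWN"


# A validation operator inspects the whole sequence of type labels.
Operator = Callable[[List[str]], bool]


def starts_with_command(types: List[str]) -> bool:
    return bool(types) and types[0] == "CMD"


def no_unknown(types: List[str]) -> bool:
    return "UNKNOWN" not in types


def single_command(types: List[str]) -> bool:
    return types.count("CMD") == 1


OPERATORS: List[Operator] = [starts_with_command, no_unknown, single_command]


def validate(candidate: str) -> bool:
    """Accept a candidate only if every operator accepts its type sequence."""
    types = [transform(token) for token in candidate.split()]
    return all(operator(types) for operator in OPERATORS)
```

Under these assumed rules, `validate("ls -la /tmp")` accepts the candidate, while `validate("ls the files")` rejects it because two of its sub-entities cannot be transformed into any known type.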
When a disaster occurs, the demand for information and communication technology (ICT) services increases drastically. To meet such demand, a national project was undertaken in Japan to develop the Movable and Deployable ICT Resource Unit (MDRU). One challenge regarding the MDRU is securing operators to run the units in emergency situations. Because ICT service users have diverse and frequently changing demands, the administration of MDRUs requires strong technical skills and practical knowledge. In this paper, we propose a knowledge-based network management system to alleviate the burden on administrators. To deal with the structural changes to network systems that frequently accompany changes in ICT service demand, we introduce modularization techniques into our previous research. The proposed system can be easily reconfigured by joining or disjoining modules in response to changes in the system configuration of the MDRU. The results of our experiments with the implemented experimental system confirm that the proposed system can be applied to MDRU operation and effectively supports administrators.