A rough set probabilistic data association (RS-PDA) algorithm is proposed to reduce the complexity and time consumption of data association and to enhance the accuracy of tracking results in multi-target tracking applications. In the new algorithm, measurements lying in the intersection of two or more validation regions are allocated to the corresponding targets through rough set theory, so that the multi-target tracking problem is transformed into several single-target tracking problems once the measurements in the intersection regions have been classified. Several typical multi-target tracking applications are given. The simulation results show that the algorithm not only reduces complexity and time consumption but also enhances the accuracy and stability of the tracking results.
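The gating step that produces these intersection regions can be sketched as follows. This is a generic ellipsoidal-gate illustration, not the paper's algorithm: the gate threshold, target states, and covariances are made-up values, chosen to show how one measurement can validate against more than one target.

```python
import numpy as np

def in_gate(z, z_pred, S, gamma=9.21):
    """True if measurement z lies inside the validation gate of a target
    with predicted measurement z_pred and innovation covariance S
    (gamma = chi-square threshold, 2 d.o.f., ~99%)."""
    v = z - z_pred
    return float(v @ np.linalg.inv(S) @ v) <= gamma

# Two hypothetical targets: predicted measurements and innovation covariances.
preds = [np.array([0.0, 0.0]), np.array([3.0, 0.0])]
covs = [np.eye(2), np.eye(2)]

measurements = [np.array([1.5, 0.0]),   # midway: validates for both targets
                np.array([-5.0, 0.0]),  # clutter: validates for neither
                np.array([3.5, 0.5])]   # validates for target 1 only

# Targets whose gate contains each measurement; a list with two or more
# entries marks a measurement lying in an intersection region.
assoc = [[t for t in range(2) if in_gate(z, preds[t], covs[t])]
         for z in measurements]
# assoc == [[0, 1], [], [1]]
```

Measurements whose association list has two or more entries are exactly the ones the RS-PDA algorithm must resolve before the problem decomposes into single-target tracking.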
Data mining (also known as knowledge discovery in databases, KDD) is defined as the nontrivial extraction of implicit, previously unknown, and potentially useful information from data. Its aim is to discover knowledge that is of interest to user needs. Data mining is a genuinely useful tool in many domains, such as marketing and decision making. However, some basic issues of data mining are often ignored. What is data mining? What is the product of a data mining process? What are we doing in a data mining process? Are there rules we should obey in a data mining process? In order to discover patterns and knowledge that are really interesting and actionable in the real world, Zhang et al. proposed a domain-driven, human-machine-cooperated data mining process, and Zhao and Yao proposed an interactive user-driven classification method using the granule network. In our work, we find that data mining is a kind of knowledge-transforming process that converts knowledge from a data format into a symbol format. Thus, no new knowledge can be generated in a data mining process; knowledge is merely transformed from the data format, which is not understandable to humans, into the symbol format, which is understandable to humans and easy to use. The process is similar to translating a book from Chinese into English: the knowledge in the book should remain unchanged, and only its format changes.
That is, the knowledge in the English book should be the same as the knowledge in the Chinese one; otherwise, there must be mistakes in the translation. In other words, in a data mining process we transform knowledge from one format into another while not producing new knowledge. The knowledge is originally stored in data (data is one representation format of knowledge); unfortunately, we cannot read, understand, or use it directly, since we cannot understand raw data. With this understanding of data mining, we proposed a data-driven knowledge acquisition method based on rough sets, which also improved the performance of classical knowledge acquisition methods. In fact, we find that domain-driven data mining and user-driven data mining do not conflict with our data-driven data mining; they can be integrated into domain-oriented data-driven data mining. This is analogous to views of a database: users with different views can look at different parts of the data, so users with different tasks or objectives may wish to, and can, discover different (partial) knowledge from the same database. However, all of this partial knowledge must already exist in the database. A domain-oriented data-driven data mining method would therefore help us to extract the knowledge that really exists in a database and that is really interesting and actionable in the real world.
This article addresses incomplete ("poor") databases, which are very common in practice, whereas applications demand databases that are comprehensive and effectively collected. When the available database is incomplete, it degrades the performance of decision support applications. To address this problem, the author proposes a solution based on rough set theory that can scientifically, correctly, and effectively supplement an incomplete database, and can greatly help to strengthen applications of data and artificial intelligence.
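A minimal sketch of this kind of rough-set-style completion is shown below: a missing value is filled with the most common value among objects that are indiscernible on the remaining attributes. The table and attribute names are hypothetical, and the paper's exact method may differ.

```python
from collections import Counter

# Toy incomplete decision table: '?' marks a missing value (hypothetical data).
table = [
    {"fever": "yes", "cough": "yes", "flu": "yes"},
    {"fever": "yes", "cough": "?",   "flu": "yes"},
    {"fever": "no",  "cough": "no",  "flu": "no"},
    {"fever": "yes", "cough": "yes", "flu": "yes"},
]

def fill_missing(rows, attr):
    """Replace '?' in `attr` with the most common value among rows that
    agree with the incomplete row on every other attribute."""
    others = [a for a in rows[0] if a != attr]
    for r in rows:
        if r[attr] == "?":
            peers = [s[attr] for s in rows
                     if s[attr] != "?" and all(s[a] == r[a] for a in others)]
            if peers:  # leave the gap if no indiscernible peer exists
                r[attr] = Counter(peers).most_common(1)[0][0]
    return rows

fill_missing(table, "cough")
# Row 1 is indiscernible from rows 0 and 3, so its cough becomes "yes".
```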
Rough set theory provides a useful mathematical foundation for developing automated computational systems that can help understand and make use of imperfect knowledge. Despite its relative recency, the theory and its extensions have been widely applied to many problems, including decision analysis, data mining, intelligent control, and pattern recognition. This paper presents an outline of the basic concepts of rough sets and their major extensions, covering variable precision, tolerance, and fuzzy rough sets. It also shows the diversity of successful applications these theories have enabled, ranging from finance and business, through biology and medicine, to physics, art, and meteorology.
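The basic notion such a survey starts from, the lower and upper approximations of a concept with respect to an indiscernibility partition, can be written in a few lines; the universe and concept below are toy values chosen for illustration.

```python
def approximations(blocks, X):
    """Pawlak lower/upper approximations of concept X, given the
    equivalence classes (blocks) of an indiscernibility relation."""
    lower = {x for b in blocks if b <= X for x in b}   # classes fully inside X
    upper = {x for b in blocks if b & X for x in b}    # classes meeting X
    return lower, upper

blocks = [{1, 2}, {3}, {4, 5}]   # indiscernibility classes of the universe
X = {1, 2, 3, 4}                 # the concept to approximate
lower, upper = approximations(blocks, X)
# lower == {1, 2, 3}; upper == {1, 2, 3, 4, 5}; the boundary {4, 5} is
# exactly where membership in X cannot be decided from the attributes.
```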
Interval-valued data and incomplete data are two key problems in the failure analysis of thruster experimental data, and both are substantially addressed by the methods proposed in this paper. Firstly, information acquired from the simulation and evaluation system, organized as an interval-valued information system (IIS), is classified by an interval similarity relation. Then, as an improvement on the classical rough set, a new kind of generalized information entropy called "H'-information entropy" is suggested for measuring the uncertainty and the classification ability of an IIS. An innovative information-filling technique uses the properties of H'-information entropy to replace missing data with smaller estimation intervals. Finally, an improved failure analysis method synthesizing the above results is presented to classify the thruster experimental data, complete the information, and extract the failure rules. The feasibility and advantages of this method are demonstrated in an actual failure analysis application, whose performance is evaluated by the quantification of E-condition entropy.
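The abstract does not spell out the interval similarity relation used for classifying the IIS; one common, assumed form is a Jaccard-style overlap ratio, sketched here purely for illustration (the paper's relation may differ).

```python
def interval_sim(a, b):
    """Overlap length divided by union length of two closed intervals,
    given as (low, high) pairs; an assumed similarity measure."""
    overlap = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    span = max(a[1], b[1]) - min(a[0], b[0])
    return overlap / span if span > 0 else 1.0

# Objects whose intervals are similar enough would fall in the same class.
sim = interval_sim((0.0, 2.0), (1.0, 3.0))   # overlap 1, union span 3
```

Thresholding such a measure (e.g., `sim >= 0.5`) yields a similarity relation that partitions the interval-valued objects into classes.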
This paper presents a generalized method for updating the approximations of a concept incrementally, which can be used as an effective tool for dealing with dynamic attribute generalization. By combining this method with the LERS inductive learning algorithm, it also introduces a generalized quasi-incremental algorithm for learning classification rules from databases.
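The incremental idea, namely that when a new object arrives only its own indiscernibility class needs to be re-examined rather than the whole universe, can be sketched as follows. This is a generic illustration, not the paper's exact update rules.

```python
def insert_and_update(classes, lower, upper, X, obj, key):
    """Add obj, whose attribute signature is `key`, and update the lower
    and upper approximations of concept X in place, re-examining only
    the single equivalence class that changed."""
    block = classes.setdefault(key, set())
    block.add(obj)
    lower -= block                 # drop this class's stale contribution
    upper -= block
    if block <= X:                 # class now lies entirely inside X
        lower |= block
    if block & X:                  # class still meets X
        upper |= block

classes = {("a",): {1, 2}}
X = {1, 2, 3}
lower, upper = {1, 2}, {1, 2}
insert_and_update(classes, lower, upper, X, 3, ("a",))  # class grows, stays in X
insert_and_update(classes, lower, upper, X, 4, ("a",))  # 4 is outside X
# lower == set(); upper == {1, 2, 3, 4}
```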
The article is a comprehensive review of two major approaches to rough set theory: the classic rough set model introduced by Pawlak, and the probabilistic approaches. The classic model is presented as a staging ground for the discussion of two varieties of the probabilistic approach, i.e., the variable precision and Bayesian rough set models. Both of these models extend the classic model to deal with stochastic interactions while preserving the basic ideas of the original rough set theory, such as set approximations, data dependencies, and reducts. The probabilistic models are able to handle weaker data interactions than the classic model, thus extending the applicability of the rough set paradigm. The extended models are presented in considerable detail, with some illustrative examples.
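The variable precision idea can be shown in one function: a class enters the β-lower approximation when at least a fraction β of it lies in the concept. The data below are toy values, and β is a user-chosen precision parameter.

```python
def vprs_lower(blocks, X, beta=0.8):
    """Variable precision lower approximation: keep the equivalence
    classes whose overlap with X is at least beta (beta = 1 recovers
    Pawlak's classic lower approximation)."""
    return {x for b in blocks if len(b & X) / len(b) >= beta for x in b}

blocks = [{1, 2, 3, 4, 5}, {6, 7}]
X = {1, 2, 3, 4, 6}
result = vprs_lower(blocks, X, beta=0.8)
# {1,...,5} passes (4/5 overlap); {6, 7} fails (1/2). Note the classic
# Pawlak lower approximation of X here would be empty, since neither
# class is fully contained in X.
```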
Recently, much interest has been given to multi-granulation rough sets (MGRS), and various types of MGRS models have been developed from different viewpoints. In this paper, we introduce two techniques for the classification of MGRS. Firstly, we generate multiple topologies from multiple relations defined on the universe; a novel approximation space is then established by leveraging the underlying topological structure, the characteristics of this newly proposed approximation space are discussed, and an algorithm for the reduction of multi-relations is introduced. Secondly, a new approach to the classification of MGRS based on neighborhood concepts is introduced. Finally, a real-life application to medical records illustrates our approach to the classification of MGRS.
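For readers new to MGRS, the standard optimistic and pessimistic lower approximations over several equivalence relations can be sketched like this (toy partitions for illustration; the paper's topological construction is more general).

```python
def mgrs_lower(partitions, X, mode="optimistic"):
    """Multi-granulation lower approximation of X. `partitions` holds one
    list of equivalence classes per relation. Optimistic: some relation's
    class of x fits inside X; pessimistic: every relation's class does."""
    def block_of(part, x):
        return next(b for b in part if x in b)
    universe = set().union(*partitions[0])
    quantifier = any if mode == "optimistic" else all
    return {x for x in universe
            if quantifier(block_of(p, x) <= X for p in partitions)}

P1 = [{1, 2}, {3, 4}]            # classes under the first relation
P2 = [{1}, {2, 3}, {4}]          # classes under the second relation
X = {1, 2, 3}
opt = mgrs_lower([P1, P2], X)                      # {1, 2, 3}
pes = mgrs_lower([P1, P2], X, mode="pessimistic")  # {1, 2}
```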
It is well known that rough set theory can be applied successfully to rough classification and knowledge discovery. Our work is concerned with methods for using rough sets to identify classes in datasets, to find dependencies in relations, and to discover rules hidden in databases by means of decision tables and algorithm D. We use these methods to analyze and control aspects of nuclear energy generation.
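Rule extraction from a decision table of the kind mentioned here can be illustrated on a toy, hypothetical table: condition classes that map to a single decision yield certain rules, while inconsistent classes are left to the boundary.

```python
from collections import defaultdict

# Toy, hypothetical decision table: (condition attributes) -> decision.
rows = [
    (("high", "yes"), "risk"),
    (("high", "yes"), "risk"),
    (("low",  "yes"), "safe"),
    (("low",  "yes"), "risk"),   # conflicts with the row above
    (("high", "no"),  "safe"),
]

# Group objects by their condition class and collect the decisions seen.
groups = defaultdict(set)
for cond, dec in rows:
    groups[cond].add(dec)

# Certain rules come only from consistent (single-decision) classes;
# the inconsistent class ("low", "yes") yields no certain rule.
rules = {cond: next(iter(decs))
         for cond, decs in groups.items() if len(decs) == 1}
# rules == {("high", "yes"): "risk", ("high", "no"): "safe"}
```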
A fault diagnosis method for power transformer insulation, based on rough sets (RS) and a radial basis function neural network (RBFNN), is proposed, combining data mining with diagnosis. On one hand, rough sets serve as a front end to the RBFNN, simplifying its inputs and mining rules; mined rules whose confidence and support exceed the required thresholds are used directly for transformer fault diagnosis. On the other hand, the samples corresponding to rules whose confidence and support fall below the thresholds form the RBFNN training set; these samples are clustered by rough sets, and the center of each cluster is used as the center of a radial basis function, i.e., as a hidden-layer neuron. The RBFNN constructed in this way diagnoses the cases that cannot be handled by the mined rules alone. The advantages and effectiveness of the method are verified by testing.
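The role of the cluster centers as hidden-layer neurons can be sketched as a Gaussian RBF layer; the centers and width below are illustrative values, not values from the paper.

```python
import numpy as np

def rbf_layer(x, centers, sigma=1.0):
    """Hidden-layer activations: one Gaussian unit per cluster center,
    with the cluster centers playing the role of RBF centers."""
    d = np.linalg.norm(centers - x, axis=1)
    return np.exp(-(d ** 2) / (2 * sigma ** 2))

centers = np.array([[0.0, 0.0], [3.0, 0.0]])   # e.g. two rough-set clusters
act = rbf_layer(np.array([0.0, 0.0]), centers)
# The unit whose center coincides with the input responds most strongly;
# an output layer trained on these activations would complete the RBFNN.
```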
Rough set theory is a relatively new addition to the area of soft computing for handling uncertain big data efficiently. It also provides a powerful way to calculate the importance degree of vague and uncertain big data to aid decision making. Risk assessment is very important for safe and reliable investment; risk management involves assessing the risk sources and designing strategies and procedures to mitigate those risks to an acceptable level. In this paper, we emphasize the classification of different types of risk factors and find a simple and effective way to calculate the risk exposure. The study uses a rough set method to classify and judge the safety attributes related to investment policy. The method, which is based on intelligent knowledge acquisition, provides an innovative way of performing risk analysis. With this approach, we are able to calculate the significance of each factor and the relative risk exposure from the original data, without assigning weights subjectively.
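The objective significance computation described here is usually done via the dependency degree γ, with and without each attribute; the sketch below uses a toy, hypothetical table in which attribute 0 turns out to matter more than attribute 1.

```python
from collections import defaultdict

def gamma(rows, cond_idx, dec_idx):
    """Dependency degree: the fraction of objects whose condition class
    (over the attributes in cond_idx) determines the decision uniquely,
    i.e. the relative size of the positive region."""
    groups = defaultdict(set)
    for r in rows:
        groups[tuple(r[i] for i in cond_idx)].add(r[dec_idx])
    consistent = {k for k, v in groups.items() if len(v) == 1}
    pos = sum(1 for r in rows if tuple(r[i] for i in cond_idx) in consistent)
    return pos / len(rows)

rows = [                 # (attr0, attr1, decision)
    ("a", "x", 0),
    ("a", "y", 1),
    ("b", "x", 1),
    ("b", "x", 1),
]
full = gamma(rows, [0, 1], 2)       # both attributes: 1.0
sig0 = full - gamma(rows, [1], 2)   # significance of attr0: 0.75
sig1 = full - gamma(rows, [0], 2)   # significance of attr1: 0.5
```

The drop in γ when an attribute is removed gives its significance directly from the data, with no subjectively assigned weights.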
Many real-life data sets are incomplete, i.e., affected by missing attribute values. Three interpretations of missing attribute values are discussed in the paper: lost values (erased values), attribute-concept values (such a value may be replaced by any value from the attribute domain restricted to the concept), and "do not care" conditions (a missing attribute value may be replaced by any value from the attribute domain). For incomplete data sets, three definitions of lower and upper approximations are discussed. Experiments were conducted on six typical data sets with missing attribute values, using the three interpretations of missing attribute values and the same definition of concept lower and upper approximations. The conclusion is that the best approach to missing attribute values is the lost-value interpretation.
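The difference between two of these interpretations can be made concrete with characteristic sets. The sketch below follows the common convention that '*' ("do not care") matches any value while '?' (a lost value) matches nothing; the paper's exact definitions may differ in details, and the table is hypothetical.

```python
def characteristic_set(rows, i):
    """Indices of objects compatible with row i. A '*' in another row
    matches anything; a '?' (lost value) in another row never matches;
    a '?' or '*' in row i itself imposes no constraint."""
    x = rows[i]
    return {j for j, y in enumerate(rows)
            if all(x[a] in ("?", "*")
                   or y[a] == "*"
                   or (y[a] != "?" and y[a] == x[a])
                   for a in range(len(x)))}

rows = [("yes", "no"), ("yes", "*"), ("?", "no"), ("no", "no")]
K0 = characteristic_set(rows, 0)   # {0, 1}: row 2's lost value blocks it
K2 = characteristic_set(rows, 2)   # {0, 1, 2, 3}: only attribute 1 constrains
```

Lower and upper approximations for incomplete data are then defined from these characteristic sets in place of equivalence classes.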
Funding: This work was partly supported by the UK EPSRC Grant (No. GR/S98603/01). (Revised May 14, 2007.)
Funding: jointly supported by the National Natural Science Foundation (Nos. 61175008, 60935001), the National Basic Research Program of China (No. 2009CB824900), the Space Foundation of Supporting-Technology (No. 2011-HTSHJD002), and the Aeronautical Science Foundation of China (No. 20105557007).
Funding: the National Natural Science Foundation of China (Grant No. 50128706).