Funding: Supported by the National Natural Science Foundation of China under Grant No. 60873077/F020107 and the Science Research Project of Zhangzhou Normal University under Grant No. SK09002.
Abstract: Covering-based rough sets process data organized by a covering of the universe. A soft set is a parameterized family of subsets of the universe. Both theories can deal with uncertainty in data. Soft sets place no restrictions on the approximate description of objects, so the subsets of a soft set may form a covering of the universe. From this viewpoint, we establish a connection between the two theories. Specifically, we propose a complementary parameter for this purpose. With this parameter, the soft covering approximation space is established and the two theories are bridged. Furthermore, we study relations between the covering and the soft covering approximation space and obtain several results. Finally, we define the notion of a combine parameter, which helps to simplify the set of parameters and reduce the storage requirement of a soft covering approximation space.
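The link between soft sets and coverings described in this abstract can be illustrated with a minimal sketch. It assumes a soft set is stored as a dictionary from parameters to subsets, and it uses one common pair of covering approximations (union of blocks contained in the target, and union of blocks that meet the target); the paper may use different definitions. The function names and sample data are hypothetical, and the complementary and combine parameters are not reproduced because the abstract does not define them.

```python
from typing import Dict, Hashable, Set


def is_covering(soft_set: Dict[Hashable, Set], universe: Set) -> bool:
    """True if every subset is non-empty and the subsets together cover the universe."""
    blocks = list(soft_set.values())
    return all(blocks) and set().union(*blocks) == universe


def lower_approx(soft_set: Dict[Hashable, Set], target: Set) -> Set:
    """Union of the blocks fully contained in the target (assumed definition)."""
    result: Set = set()
    for block in soft_set.values():
        if block <= target:
            result |= block
    return result


def upper_approx(soft_set: Dict[Hashable, Set], target: Set) -> Set:
    """Union of the blocks that intersect the target (assumed definition)."""
    result: Set = set()
    for block in soft_set.values():
        if block & target:
            result |= block
    return result


# Hypothetical example: three parameters over a five-element universe.
universe = {1, 2, 3, 4, 5}
soft_set = {"e1": {1, 2}, "e2": {2, 3, 4}, "e3": {4, 5}}
target = {1, 2, 3}

print(is_covering(soft_set, universe))   # True
print(lower_approx(soft_set, target))    # {1, 2}
print(upper_approx(soft_set, target))    # {1, 2, 3, 4}
```

When the subsets fail to cover the universe, `is_covering` returns `False`; the abstract says the complementary parameter is introduced for the purpose of connecting the two theories.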
Funding: Supported by the National Natural Science Foundation of China under Grant No. 60873077/F020107.
Abstract: In many machine learning applications, data are not free, and there is a test cost for each data item. For economic reasons, some existing works try to minimize the test cost while preserving a particular property of a given decision system. In this paper, we point out that the test cost one can afford is limited in some applications. Hence, one has to sacrifice such properties to keep the test cost within a budget. To formalize this issue, we define the test cost constraint attribute reduction problem, where the optimization objective is to minimize the conditional information entropy. This problem is an essential generalization of both the test-cost-sensitive attribute reduction problem and the 0-1 knapsack problem, and is therefore more challenging. We propose a heuristic algorithm based on information gain and test costs to deal with the new problem. The algorithm is tested on four UCI (University of California, Irvine) datasets with various test cost settings. Experimental results indicate an appropriate setting for λ, the only user-specified parameter.
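The problem stated in this abstract, minimizing conditional entropy subject to a test cost budget, can be sketched as a greedy selection. This is an illustration under stated assumptions, not the authors' algorithm: the weighting `gain * cost ** lam` with `lam <= 0` is an assumed way to combine information gain and test cost, and the function names, data layout (a list of dictionaries), and sample values are hypothetical.

```python
from collections import Counter
from math import log2


def conditional_entropy(rows, attrs, decision):
    """H(decision | attrs) in bits, for rows given as a list of dicts."""
    n = len(rows)
    joint = Counter((tuple(r[a] for a in attrs), r[decision]) for r in rows)
    marginal = Counter(tuple(r[a] for a in attrs) for r in rows)
    return -sum(c / n * log2(c / marginal[key]) for (key, _), c in joint.items())


def greedy_reduct(rows, costs, decision, budget, lam=-1.0):
    """Greedily add the affordable attribute with the best cost-weighted gain.

    Assumes every cost is positive and lam <= 0, so costly tests are penalized.
    """
    selected, spent = [], 0.0
    while True:
        base = conditional_entropy(rows, selected, decision)
        best, best_score = None, 0.0
        for attr, cost in costs.items():
            if attr in selected or spent + cost > budget:
                continue  # already chosen, or it would exceed the budget
            gain = base - conditional_entropy(rows, selected + [attr], decision)
            score = gain * cost ** lam  # assumed weighting, not from the paper
            if score > best_score:
                best, best_score = attr, score
        if best is None:  # nothing affordable reduces the entropy further
            return selected
        selected.append(best)
        spent += costs[best]


# Hypothetical decision system: attribute "a" determines the decision "d".
rows = [
    {"a": 0, "b": 0, "d": 0},
    {"a": 0, "b": 1, "d": 0},
    {"a": 1, "b": 0, "d": 1},
    {"a": 1, "b": 1, "d": 1},
]
costs = {"a": 5, "b": 1}

print(greedy_reduct(rows, costs, "d", budget=5))  # ['a']
print(greedy_reduct(rows, costs, "d", budget=4))  # []
```

With a budget of 4, the informative attribute is unaffordable and the cheap one carries no information, so the result is empty. This is the situation the abstract describes, where a property of the decision system is sacrificed to stay within the budget.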