In the area of pattern recognition and machine learning, features play a key role in prediction. Well-known applications of features include medical imaging and image classification, to name a few. With the exponential growth of information investments in medical data repositories and health service provision, medical institutions are collecting large volumes of data. These data repositories contain detailed information essential to support medical diagnostic decisions and to improve patient care quality. On the other hand, this growth has also made it difficult to comprehend and utilize data for various purposes. The results of imaging analyses can become biased because of extraneous features present in larger datasets. Feature selection offers a way to decrease the number of components in such large datasets: selection techniques discard the unimportant features and retain a subset of components that yields superior classification precision. A correct decision in identifying good attributes produces a precise classification model, which improves learning speed and predictive power. This paper presents a review of feature selection techniques and attribute selection measures for medical imaging. The review describes feature selection techniques in the medical domain with their pros and cons, and illustrates their application to imaging data and data mining algorithms. It reveals the shortcomings of existing feature and attribute selection techniques for multi-sourced data. Moreover, it explains the importance of feature selection for the correct classification of medical infections. In the end, a critical analysis and future directions are provided.
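As a minimal illustration of the filter-style selection such reviews survey (not a method from any specific paper above), the sketch below ranks features by mutual information with the class label and keeps the top k; the dataset and the value of k are placeholders.

```python
# Minimal filter-style feature selection sketch: rank features by mutual
# information with the class label and keep the top k. Dataset and k are
# placeholders, not taken from the reviewed papers.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, mutual_info_classif

X, y = load_breast_cancer(return_X_y=True)      # stand-in medical dataset
selector = SelectKBest(mutual_info_classif, k=10)
X_reduced = selector.fit_transform(X, y)        # keep the 10 most informative features
print(X.shape, "->", X_reduced.shape)           # (569, 30) -> (569, 10)
```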
This paper presents a conceptual data model, the STA-model, for handling the spatial, temporal and attribute aspects of objects in GIS. The model is developed on the basis of an object-oriented modeling approach and includes two major parts: (a) modeling the single objects by STA-object elements, and (b) modeling relationships between STA-objects. As an example, the STA-model is applied to modeling land cover change data with spatial, temporal and attribute components.
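A hedged sketch of how an STA-object might be encoded, assuming hypothetical field and relationship names (the paper defines the model conceptually, not as code):

```python
# Illustrative encoding of an STA-object: one identity with spatial, temporal,
# and attribute components, plus typed relationships to other STA-objects.
# All field and relationship names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class STAObject:
    oid: str
    geometry: list[tuple[float, float]]        # spatial component (e.g., polygon vertices)
    valid_from: str                            # temporal component (ISO dates)
    valid_to: str
    attributes: dict[str, str] = field(default_factory=dict)  # thematic component
    relations: list[tuple[str, "STAObject"]] = field(default_factory=list)

# A land-cover parcel that changes from forest to cropland:
before = STAObject("p1", [(0, 0), (1, 0), (1, 1)], "1990-01-01", "2000-01-01",
                   {"land_cover": "forest"})
after = STAObject("p1", [(0, 0), (1, 0), (1, 1)], "2000-01-01", "9999-12-31",
                  {"land_cover": "cropland"}, relations=[("succeeds", before)])
```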
In order to increase fault diagnosis efficiency and make fault data mining realizable, the decision table containing numerical attributes must be discretized for further calculations. The discernibility matrix-based reduction method depends on whether the numerical attributes can be properly discretized or not. A discretization algorithm based on particle swarm optimization (PSO) is therefore proposed. Moreover, hybrid weights are adopted in the particle evolution process. Comparative calculations for certain equipment are completed to demonstrate the effectiveness of the proposed algorithm. The results indicate that the proposed algorithm performs better than other popular algorithms such as the class-attribute interdependence maximization (CAIM) discretization method and the entropy-based discretization method.
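A compressed sketch of PSO-driven cut-point search under simple assumptions: the fitness below is a plain class-purity score and the coefficients are constant, not the paper's hybrid-weight scheme.

```python
# PSO discretization sketch: each particle is a vector of cut points for one
# numerical attribute; fitness rewards class purity of the resulting bins.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)                      # numerical attribute
y = (x > 0.3).astype(int)                     # class labels

def fitness(cuts):
    bins = np.digitize(x, np.sort(cuts))
    majority = sum(np.bincount(y[bins == b], minlength=2).max()
                   for b in np.unique(bins))
    return majority / y.size                  # fraction of samples in their bin's majority class

n_particles, n_cuts = 20, 3
pos = rng.uniform(x.min(), x.max(), (n_particles, n_cuts))
vel = np.zeros_like(pos)
pbest, pbest_f = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_f.argmax()].copy()

for _ in range(50):
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos += vel
    f = np.array([fitness(p) for p in pos])
    improved = f > pbest_f
    pbest[improved], pbest_f[improved] = pos[improved], f[improved]
    gbest = pbest[pbest_f.argmax()].copy()

print("cut points:", np.sort(gbest), "fitness:", pbest_f.max())
```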
To improve the efficiency of attribute reduction, we present an attribute reduction algorithm based on background knowledge and information entropy, making use of background knowledge from the research field. Given known background knowledge, the algorithm can not only greatly improve the efficiency of attribute reduction but also avoid the defect of information entropy being partial to attributes with many values. The experimental results verify that the algorithm is effective. Finally, the algorithm produces better results when applied to the classification of star spectra data.
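A minimal sketch of greedy entropy-based attribute reduction; here the background knowledge is represented only by the candidate attribute list passed in, which is an assumption for illustration.

```python
# Greedy entropy-based reduct sketch: repeatedly add the attribute that most
# reduces the conditional entropy H(D | selected attributes). Background
# knowledge is modeled only as a pre-filtered candidate list.
import numpy as np

def cond_entropy(X, y, attrs):
    """H(y | X[:, attrs]) for categorical data."""
    n, h = len(y), 0.0
    keys = [tuple(row) for row in X[:, attrs]]
    for key in set(keys):
        labels = y[[i for i, k in enumerate(keys) if k == key]]
        p = np.bincount(labels) / labels.size
        p = p[p > 0]
        h += (labels.size / n) * -(p * np.log2(p)).sum()
    return h

def entropy_reduct(X, y, candidates):
    selected = []
    while candidates:
        best = min(candidates, key=lambda a: cond_entropy(X, y, selected + [a]))
        if selected and cond_entropy(X, y, selected + [best]) >= cond_entropy(X, y, selected):
            break                             # no further information gain
        selected.append(best)
        candidates = [a for a in candidates if a != best]
        if cond_entropy(X, y, selected) == 0:
            break                             # labels fully determined
    return selected

rng = np.random.default_rng(1)
X = rng.integers(0, 3, size=(100, 5))
y = (X[:, 0] + X[:, 2]) % 2                   # depends only on attributes 0 and 2
print(entropy_reduct(X, y, [0, 1, 2, 3, 4]))  # expected: [0, 2] (order may vary)
```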
At present, most signal-to-noise ratio (SNR) estimation methods can only calculate the global, not the local, SNR of seismic data. This paper proposes a calculation method for a structure-oriented seismic SNR attribute, whose purpose is to characterize the temporal and spatial variation of the seismic data SNR. First, the local slope parameters of the seismic events are calculated using a plane-wave decomposition filter. Then, the singular value decomposition method is used to calculate the local seismic data SNR, thereby obtaining it in time and space. The proposed method overcomes the inability of a conventional global SNR to characterize local seismic data features and uses the SNR as an attribute of seismic data to describe more accurately the signal-noise energy distribution characteristics of seismic data in time and space. The results of a theoretical model test and real data processing show that the SNR attribute can be used not only for seismic data quality evaluation but also for the analysis and evaluation of denoising methods.
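A simplified sketch of the SVD step, assuming events have already been flattened (the paper's structure-oriented slope estimation with a plane-wave decomposition filter is omitted): the dominant singular component is taken as coherent signal and the remainder as noise.

```python
# Simplified local-SNR-by-SVD sketch: within a window of traces, take the
# rank-1 SVD component as the coherent signal and the residual as noise.
import numpy as np

def local_snr_db(window):
    u, s, vt = np.linalg.svd(window, full_matrices=False)
    signal_energy = s[0] ** 2                 # energy of the dominant (coherent) component
    noise_energy = (s[1:] ** 2).sum() + 1e-12
    return 10 * np.log10(signal_energy / noise_energy)

# Synthetic example: an identical event across 8 traces plus random noise.
rng = np.random.default_rng(0)
trace = np.sin(np.linspace(0, 6 * np.pi, 64))
window = np.tile(trace, (8, 1)) + 0.2 * rng.normal(size=(8, 64))
print(f"local SNR: {local_snr_db(window):.1f} dB")
```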
The Internet is now a large-scale platform with big data. Finding truth in a huge dataset has attracted extensive attention, since it can maintain the quality of data collected from users and provide users with accurate and efficient data. However, current truth finder algorithms are unsatisfactory because of their low accuracy and complexity. This paper proposes a truth finder algorithm based on entity attributes (TFAEA). Building on the iterative computation of source reliability and fact accuracy, TFAEA considers the degree of interaction among facts and the degree of dependence among sources to simplify the typical truth finder algorithms. To improve accuracy, TFAEA combines one-way text similarity with factual conflict to calculate the mutual support degree among facts. Furthermore, TFAEA utilizes the symmetric saturation of data sources to calculate the degree of dependence among sources. The experimental results show that TFAEA is not only more stable but also more accurate than the typical truth finder algorithms.
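A bare skeleton of the source-reliability / fact-accuracy iteration that TFAEA builds on; the toy claims, initial trust value, and update rules are illustrative, and the paper's mutual-support and source-dependence terms are omitted.

```python
# Skeleton of the iterative truth-finding loop: source reliability and fact
# confidence reinforce each other. claims[s] is the value source s asserts
# for one entity attribute; all numbers are made up.
import math

claims = {"s1": "Paris", "s2": "Paris", "s3": "Lyon"}
facts = set(claims.values())
trust = {s: 0.8 for s in claims}              # initial source reliability

for _ in range(10):
    # fact confidence: sum of log-odds contributions of supporting sources
    conf = {f: sum(-math.log(1 - trust[s]) for s, v in claims.items() if v == f)
            for f in facts}
    total = sum(conf.values())
    conf = {f: c / total for f, c in conf.items()}
    # source reliability: confidence of the fact the source asserts
    trust = {s: conf[v] for s, v in claims.items()}

print(max(conf, key=conf.get), conf)          # most likely true value
```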
In uncertain data management, lineages are often used for the probability computation of result tuples. However, most existing works focus on tuple-level lineage, which results in imprecise data derivation; moreover, correlations among attributes cannot be captured. In this paper, for base tuples with multiple uncertain attributes, we define attribute-level annotations to annotate each attribute. Utilizing these annotations to generate the lineages of result tuples enables more precise derivation. Simultaneously, the annotations can be used for dependency graph construction. Using the dependency graph, we can represent not only constraints on schemas but also correlations among attributes. Combining the dependency graph with attribute-level lineage, we can correctly compute the probabilities of result tuples and precisely derive data. Experiments comparing lineage at the tuple level and at the attribute level show that our method has advantages in derivation precision and storage cost.
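A toy contrast of tuple-level versus attribute-level probability computation, with made-up per-attribute probabilities, showing why attribute-level lineage is tighter when a query references only some attributes:

```python
# A base tuple has two independent uncertain attributes; a query result
# depends only on the first. Tuple-level lineage multiplies in the second
# attribute's uncertainty; attribute-level lineage does not.
p_attr = {"city": 0.9, "age": 0.6}              # per-attribute correctness probabilities

p_tuple_level = p_attr["city"] * p_attr["age"]  # whole tuple must be correct: 0.54
p_attr_level = p_attr["city"]                   # only the referenced attribute: 0.9

print(f"tuple-level: {p_tuple_level:.2f}, attribute-level: {p_attr_level:.2f}")
```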
Decision rule mining is an important issue in machine learning and data mining. However, most proposed algorithms mine categorical data at a single level, and the resulting rules are not easily understandable or really useful for users. Thus, a new approach to hierarchical decision rule mining is provided in this paper, in which a similarity direction measure is introduced to deal with hybrid data. This approach can mine hierarchical decision rules by adjusting the similarity measure parameters and the level of the concept hierarchy trees.
Learning unlabeled data is a significant challenge that requires handling complicated relationships between nominal values and attributes. Increasingly, recent research on learning value relations within and between attributes has shown significant improvement in clustering, outlier detection, and other tasks. However, typical existing work relies on learning pairwise value relations but weakens or overlooks the direct couplings between multiple attributes. This paper thus proposes two novel and flexible multi-attribute couplings-based distance (MCD) metrics, which learn the multi-attribute couplings and their strengths in nominal data based on information theory (self-information, entropy, and mutual information) for measuring both numerical and nominal distances. MCD enables the application of numerical and nominal clustering methods to nominal data and quantifies the influence of involving and filtering multi-attribute couplings on distance learning and clustering performance. Substantial experiments support these conclusions on 15 data sets against seven state-of-the-art distance measures with various feature selection methods for both numerical and nominal clustering.
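To convey the flavor only: the sketch below computes a pairwise value-coupling distance from conditional co-occurrence profiles; MCD itself generalizes this to multi-attribute couplings weighted by self-information, entropy, and mutual information.

```python
# Simplified coupled nominal distance: represent each value of attribute j by
# its conditional distribution over another attribute k, and measure the
# distance between value profiles (total variation). Illustrative only.
import numpy as np

data = np.array([["red", "small"], ["red", "small"],
                 ["blue", "large"], ["green", "large"]])

def value_profile(v, j, k):
    rows = data[data[:, j] == v]
    vals, counts = np.unique(rows[:, k], return_counts=True)
    return {val: c / rows.shape[0] for val, c in zip(vals, counts)}

def coupled_distance(v1, v2, j, k):
    p1, p2 = value_profile(v1, j, k), value_profile(v2, j, k)
    keys = set(p1) | set(p2)
    return sum(abs(p1.get(x, 0) - p2.get(x, 0)) for x in keys) / 2

print(coupled_distance("blue", "green", 0, 1))  # 0.0: identical co-occurrence profiles
print(coupled_distance("red", "blue", 0, 1))    # 1.0: disjoint co-occurrence profiles
```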
Image Aesthetic Assessment (IAA) is a widely studied problem given its usefulness in a wide range of applications, such as the evaluation of image capture pipelines and of media sharing and storage techniques, but the intrinsic mechanism of aesthetic evaluation has seldom been explored due to its subjective nature and the lack of interpretability of deep neural networks. Noticing that the photographic style annotations of images (i.e., the scores of aesthetic attributes) are more objective and interpretable than the Mean Opinion Score (MOS) annotations used for IAA, we study the problem of Aesthetic Attributes Assessment (AAA) as a source of complementary information for IAA. We first introduce the learning of data covariance to the field of Aesthetic Attributes Assessment and propose a regression model that jointly learns from the MOS and the scores of all aesthetic attributes at the same time. We construct our method by extending the scheme of data uncertainty learning to data covariance learning. Our method achieves state-of-the-art performance on AAA without special architectural design, trains in a totally end-to-end manner, and can easily be extended to existing IAA methods.
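A minimal sketch of the diagonal-variance data-uncertainty loss that covariance learning generalizes; the toy network, feature shapes, and attribute count are assumptions.

```python
# Data-uncertainty regression loss in the style that the paper extends to a
# full covariance over aesthetic attributes: the network predicts a mean and
# a per-output log-variance, and the Gaussian NLL down-weights noisy samples.
import torch
import torch.nn as nn

n_attrs = 5                                    # MOS plus aesthetic attribute scores
net = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 2 * n_attrs))

features = torch.randn(32, 128)                # stand-in image features
targets = torch.rand(32, n_attrs)

out = net(features)
mu, log_var = out[:, :n_attrs], out[:, n_attrs:]
# Gaussian negative log-likelihood with learned (diagonal) variance:
loss = (0.5 * torch.exp(-log_var) * (targets - mu) ** 2 + 0.5 * log_var).mean()
loss.backward()
print(float(loss))
```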
Decision forest is a well-known machine learning technique for detection and prediction problems in clinical data. However, traditional decision forest (DF) algorithms have lower classification accuracy and cannot handle high-dimensional feature spaces effectively. In this work, we propose a bootstrap decision forest using penalizing attributes (BFPA) algorithm to predict heart disease with higher accuracy. This work integrates a significance-based attribute selection (SAS) algorithm with the BFPA classifier to improve the performance of the diagnostic system in identifying cardiac illness. The proposed SAS algorithm is used to determine the correlation among attributes and to select the optimal subset of the feature space for the learning and testing processes. BFPA selects the optimal number of learning and testing data points, as well as the density of trees in the forest, to realize higher prediction accuracy in classifying imbalanced datasets effectively. The effectiveness of the developed classifier is carefully verified on a real-world database (the heart disease dataset from the UCI repository) by comparing its performance with many advanced approaches with respect to accuracy, sensitivity, specificity, precision, and intersection over union (IoU). The empirical results demonstrate that the proposed classification approach outperforms the other approaches, with an accuracy, precision, sensitivity, specificity, and IoU of 94.7%, 99.2%, 90.1%, 91.1%, and 90.4%, respectively. Additionally, we carry out Wilcoxon's rank-sum test to determine whether our proposed classifier with the feature selection method yields a noteworthy improvement over other classifiers. From the experimental results, we conclude that the integration of SAS and BFPA outperforms other classifiers recently reported in the literature.
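For illustration of the correlation-filter pattern only (the SAS criterion is the paper's own), the sketch below ranks synthetic attributes by absolute Pearson correlation with the class label:

```python
# Illustrative significance-based attribute ranking: score each attribute by
# the absolute point-biserial (Pearson) correlation with the class label and
# keep the strongest ones. Data and attribute count are made up.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 13))                 # stand-in for 13 heart-disease attributes
y = (X[:, 2] + 0.5 * X[:, 7] + rng.normal(scale=0.5, size=300) > 0).astype(float)

scores = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
top = np.argsort(scores)[::-1][:4]
print("top attributes:", top, "scores:", scores[top].round(2))
```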
The theory of rough sets, proposed by Zdzislaw Pawlak in 1982, is a model of approximate reasoning. In applications, rough set methodology focuses on the approximate representation of knowledge derivable from data. It leads to significant results in many areas including, for example, finance, industry, multimedia, medicine, and most recently bioinformatics.
The advent of the digital age has consistently provided impetus for facilitating global trade, as evidenced by the numerous customs clearance documents and participants involved in the international trade process, including enterprises, agents, and government departments. However, an urgent issue that requires immediate attention is how to achieve secure and efficient cross-border data sharing among these government departments and enterprises in complex trade processes. To address this need, this paper proposes a data exchange architecture employing Multi-Authority Attribute-Based Encryption (MA-ABE) in combination with blockchain technology. The scheme supports proxy decryption, attribute revocation, and policy updates, while allowing each participating entity to manage its keys autonomously, ensuring system security and enhancing trust among participants. To enhance decentralization, the architecture includes a mechanism in which multiple institutions interact with smart contracts and jointly participate in the generation of public parameters. Integration with the multi-party process execution engine Caterpillar boosts the transparency of cross-border information flow and the cooperation between different organizations. The scheme ensures the auditability of data access control information and the visualization of on-chain data sharing. The MA-ABE scheme is statically secure under the q-Decisional Parallel Bilinear Diffie-Hellman Exponent (q-DPBDHE2) assumption in the random oracle model and can resist ciphertext rollback attacks to achieve true backward and forward security. Theoretical analysis and experimental results demonstrate the suitability of the scheme for cross-border data collaboration between different institutions.
To solve the problems of data sharing in social networks, such as overly loose management of private data, unclear access permissions, and a single rigid mode of data sharing, we design a hierarchical access control scheme for private data based on attribute encryption. First, we construct a new algorithm based on attribute encryption which divides encryption into two phases, and we design two types of attribute encryption strategies to ensure that different users obtain the decryption keys corresponding to their permissions. We encrypt the private data hierarchically with our algorithm to realize the four management modes "precise", "more accurate", "fuzzy" and "private", so that users with higher permissions can access the private data at and below their permission level. We also outsource some complex decryption operations to the DSP to ensure high efficiency on the premise of privacy protection. Finally, we analyze the efficiency and the security of our scheme.
Because of the incompleteness and uncertainty of information on overseas oil and gas projects, project evaluation needs models able to deal with such problems. A new model is therefore presented in this paper based on interval multi-attribute decision-making theory. The important attributes (indexes) were analyzed, and the relationships linking the basic factors to the project's economic results were described. Interval numbers are used to describe the information on overseas oil and gas projects. On this basis, an improved TOPSIS model is introduced for the evaluation and ranking of overseas oil and gas projects. The new model was applied in practice for an oil company selecting promising blocks from 13 oil and gas blocks in eight different countries in the Middle East. Based on these innovative studies, conclusions are given from both theoretical and application aspects. The practical application shows that introducing interval numbers into the evaluation and ranking of overseas oil and gas projects can lead to more reasonable decisions. Users can perform project evaluation based on the comprehensive values as well as on preferred indexes in the project evaluation and ranking.
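A plain TOPSIS sketch; to keep it short, the interval attribute values are collapsed to their midpoints, whereas the paper's improved model ranks with the interval arithmetic intact. All numbers and weights are made up.

```python
# Plain TOPSIS on interval midpoints: normalize, weight, measure distances to
# the ideal and anti-ideal solutions, and rank by relative closeness.
import numpy as np

intervals = np.array([                         # 3 projects x 2 benefit criteria, [lo, hi]
    [[50, 70], [0.30, 0.40]],
    [[40, 55], [0.45, 0.55]],
    [[65, 80], [0.20, 0.30]],
])
m = intervals.mean(axis=2)                     # collapse intervals to midpoints
norm = m / np.linalg.norm(m, axis=0)           # vector-normalize each criterion
w = np.array([0.6, 0.4])                       # criterion weights
v = norm * w

ideal, anti = v.max(axis=0), v.min(axis=0)     # both criteria treated as benefits
d_pos = np.linalg.norm(v - ideal, axis=1)
d_neg = np.linalg.norm(v - anti, axis=1)
closeness = d_neg / (d_pos + d_neg)
print("ranking (best first):", np.argsort(closeness)[::-1])
```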
Most image saliency detection models depend on prior knowledge and demand high computational cost. However, the spectral residual (SR) and phase spectrum of the Fourier transform (PFT) models are simple and fast saliency detection approaches based on the two-dimensional Fourier transform that require no prior knowledge. For seismic data, the geological structure of the underground rock formation changes most obviously in the time direction, so a one-dimensional Fourier transform is more suitable for seismic saliency detection. The fractional Fourier transform (FrFT) is an improved algorithm for the Fourier transform; we therefore propose seismic SR and PFT models in the one-dimensional FrFT domain to obtain more detailed saliency maps. These two models use the amplitude and phase information in the FrFT domain to construct the corresponding saliency maps in the spatial domain. By means of these two models, several saliency maps at different fractional orders can be obtained for seismic attribute analysis. These saliency maps characterize detailed features and highlight object areas, which is conducive to determining the location of reservoirs. The performance of the proposed method is assessed on both simulated and real seismic data. The results indicate that our method is effective and convenient for seismic attribute extraction, with good noise immunity.
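A one-dimensional spectral-residual sketch using the ordinary FFT, i.e., with the fractional order fixed at 1 (the paper sweeps FrFT orders), applied trace-by-trace along the time axis:

```python
# 1-D spectral residual saliency: subtract a smoothed log-amplitude spectrum
# from the original, keep the phase, and reconstruct; anomalies stand out.
import numpy as np

def spectral_residual_1d(trace, smooth=5):
    spec = np.fft.fft(trace)
    log_amp = np.log(np.abs(spec) + 1e-12)
    kernel = np.ones(smooth) / smooth
    residual = log_amp - np.convolve(log_amp, kernel, mode="same")
    saliency = np.abs(np.fft.ifft(np.exp(residual + 1j * np.angle(spec)))) ** 2
    return saliency / saliency.max()

# Synthetic trace: background oscillation with one anomalous reflection event.
t = np.linspace(0, 1, 512)
trace = np.sin(2 * np.pi * 30 * t)
trace[250:260] += 2.0
print("most salient sample:", spectral_residual_1d(trace).argmax())  # near samples 250-260
```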
Feature selection (FS) aims to determine a minimal feature (attribute) subset of a problem domain while retaining a suitably high accuracy in representing the original features. Rough set theory (RST) has been used as such a tool with much success: it enables the discovery of data dependencies and the reduction of the number of attributes contained in a dataset using the data alone, requiring no additional information. This paper describes the fundamental ideas behind RST-based approaches, reviews related FS methods built on these ideas, and analyses the more frequently used traditional RST-based FS algorithms, namely the QuickReduct algorithm, the entropy-based reduct algorithm, and the relative reduct algorithm. Some drawbacks are found in the existing algorithms, and our proposed improved algorithms can overcome them. Experimental analyses have been carried out to establish the efficiency of the proposed algorithms.
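A textbook QuickReduct sketch using the rough-set dependency degree gamma(R, D) = |POS_R(D)| / |U|, greedily adding the attribute that raises gamma the most:

```python
# QuickReduct sketch: grow the reduct until its dependency degree matches that
# of the full attribute set. gamma counts objects in consistent (single-label)
# equivalence classes induced by the chosen attributes.
from collections import defaultdict

def gamma(rows, labels, attrs):
    blocks = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].add(i)
    pos = sum(len(b) for b in blocks.values()
              if len({labels[i] for i in b}) == 1)   # consistent equivalence classes
    return pos / len(rows)

def quickreduct(rows, labels):
    all_attrs = list(range(len(rows[0])))
    target, reduct = gamma(rows, labels, all_attrs), []
    while gamma(rows, labels, reduct) < target:
        best = max((a for a in all_attrs if a not in reduct),
                   key=lambda a: gamma(rows, labels, reduct + [a]))
        reduct.append(best)
    return reduct

rows = [(0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 1, 1)]
labels = ["n", "y", "n", "y"]                  # decided by attribute 2 alone
print(quickreduct(rows, labels))               # [2]
```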
This article investigates the impact of CEO attributes on corporate reputation, financial performance, and corporate sustainable growth in India. Using a static panel data methodology for a sample of 138 leading NSE-listed non-financial companies over the time frame 2011 to 2018, we find that CEO remuneration and tenure maintain significant positive associations with corporate reputation, while duality and CEO busyness are negatively associated with corporate reputation. The results also show that female CEOs and CEO remuneration are positively associated with corporate financial performance, whereas CEO busyness, as expected, holds a significant negative relationship with corporate financial performance. Moreover, the results demonstrate that CEO age is negatively associated with corporate sustainable growth, while tenure appears to have a significant positive association with corporate sustainable growth. The results are robust to various tests and suggest that in the Indian context, demographic and job-specific attributes of CEOs exert significant influence on corporate reputation, financial performance, and corporate sustainable growth. The empirical findings provide a basis for shareholders and companies to identify areas of consideration when appointing CEOs and determining their roles and responsibilities.
Recently, the ensemble classifier has been shown to be an effective way to enhance prediction performance. However, it usually suffers from the problem of how to construct an appropriate classifier from a set of complex data, for example, data with many dimensions or hierarchical attributes. This study proposes a method to construct an ensemble classifier based on the key attributes. In addition to the high precision shared by common ensemble classifiers, the calculation results are highly intelligible and thus easy to understand. Furthermore, experimental results based on real data collected from China Mobile show that the key-attributes-based ensemble classifier performs well in both classifier construction and customer churn prediction.
This paper proposes two new algorithms for classifying objects with categorical attributes. These algorithms are derived from the assumption that the attributes of different object classes have different probability distributions. One algorithm classifies objects based on the distribution of attribute frequencies, and the other based on the distribution of pairwise attribute frequencies described by a matrix of pairwise frequencies. Both algorithms are based on the method of invariants, which offers the simplest dependencies for estimating the probability of an object belonging to each class from the average frequency of its attributes; the estimated object class is the one with the maximum probability. This method reflects models of the sensory processes of animals and is aimed at recognizing an object's class by searching for a prototype in information accumulated in the brain. Because these matrices may be sparse, the solution cannot be determined for some objects. For these objects, an analog of the k-nearest neighbors method is provided, in which for each attribute value the class to which the majority of the k nearest objects in the training sample belong is determined, and the most likely class value is calculated. The efficiency of these two algorithms was confirmed on five databases.
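A sketch of the first algorithm's idea under the stated assumption: score each class by the average per-class training frequency of the object's attribute values and pick the maximum; the paper's exact invariant-based estimates may differ in detail.

```python
# Frequency-based classification sketch: tabulate per-class frequencies of
# attribute values, then classify by the class with the highest average
# frequency over the object's attribute values. Toy data is made up.
from collections import defaultdict

train = [((("color", "red"), ("size", "small")), "A"),
         ((("color", "red"), ("size", "large")), "A"),
         ((("color", "blue"), ("size", "large")), "B")]

freq = defaultdict(lambda: defaultdict(int))   # freq[class][attribute value]
count = defaultdict(int)
for attrs, cls in train:
    count[cls] += 1
    for av in attrs:
        freq[cls][av] += 1

def classify(attrs):
    def score(cls):
        return sum(freq[cls][av] / count[cls] for av in attrs) / len(attrs)
    return max(count, key=score)

print(classify((("color", "red"), ("size", "large"))))  # "A"
```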