Funding: Projects (61001188, 1161140319) supported by the National Natural Science Foundation of China; Project (2012ZX03001034) supported by the National Science and Technology Major Project; Project (YETP1202) supported by the Beijing Higher Education Young Elite Teacher Project, China.
Abstract: Objective speech quality is difficult to measure without the input reference speech. Mapping methods using data mining are investigated and designed to improve the output-based speech quality assessment algorithm. The degraded speech is first separated into three classes (unvoiced, voiced and silence); the consistency between the degraded speech signal and the pre-trained reference model for each class is then calculated and mapped to an objective speech quality score using data mining. A fuzzy Gaussian mixture model (GMM) is used to generate the artificial reference model, trained on perceptual linear predictive (PLP) features. The mean opinion score (MOS) mapping methods, including multivariate non-linear regression (MNLR), fuzzy neural network (FNN) and support vector regression (SVR), are designed and compared with the standard ITU-T P.563 method. Experimental results show that the assessment methods with data mining perform better than ITU-T P.563. Moreover, FNN and SVR are more efficient than MNLR, and FNN performs best, with a 14.50% increase in the correlation coefficient and a 32.76% decrease in the root-mean-square MOS error.
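The two figures of merit quoted in this abstract (correlation coefficient and root-mean-square MOS error) can be illustrated with a minimal sketch. The function names and the sample scores below are hypothetical and are not taken from the paper; the sketch only shows how such metrics and a relative change against a baseline are typically computed.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

def rmse(x, y):
    """Root-mean-square error between predicted and subjective scores."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def relative_change(baseline, new):
    """Percentage change of `new` relative to `baseline`."""
    return 100.0 * (new - baseline) / baseline

# Hypothetical scores (illustration only, not data from the paper).
subjective_mos = [1.5, 2.0, 3.0, 4.0, 4.5]
baseline_pred = [2.5, 2.0, 2.5, 3.0, 4.0]   # stands in for a baseline method
mapped_pred = [1.8, 2.2, 2.9, 3.7, 4.4]     # stands in for a learned mapping

corr_gain = relative_change(pearson(subjective_mos, baseline_pred),
                            pearson(subjective_mos, mapped_pred))
rmse_gain = relative_change(rmse(subjective_mos, baseline_pred),
                            rmse(subjective_mos, mapped_pred))
print(f"correlation change: {corr_gain:+.2f}%  RMSE change: {rmse_gain:+.2f}%")
```

A positive correlation change together with a negative RMSE change corresponds to the kind of improvement the abstract reports.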
Abstract: In modern workforce management, there is a constant demand for new ways to maximize worker satisfaction, productivity, and security. Workforce movement data, such as source data from an access control system, can support this ongoing process through subsequent analysis. In this study, a solution is proposed based on the design and implementation of a data mart as part of a dimensional trajectory data warehouse (TDW) that acts as a repository for the management of movement data. A novel methodological approach is proposed for modeling multiple spatial and temporal dimensions in a logical model. The case study presented in this paper, on modeling and analyzing workforce movement data, supports human resource management decision-making, and the discussion that follows provides a representative example of the contribution of a TDW to information management and decision support systems. The entire process of exporting, cleaning, consolidating, and transforming data is implemented to achieve an appropriate format for final import. Structured query language (SQL) queries demonstrate the convenience of dimensional design for data analysis, and valuable information can be extracted from the movements of employees on company premises to manage the workforce efficiently and effectively. Visual analytics through data visualization supports the analysis and facilitates decision-making and business intelligence.
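To make the idea of querying a dimensional design concrete, here is a minimal sketch of a star schema for movement data, using Python's built-in `sqlite3` module. The table names, columns, and rows are all hypothetical; the abstract does not describe the actual schema of the TDW.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical star schema: one fact table surrounded by three dimensions.
conn.executescript("""
CREATE TABLE dim_employee (employee_id INTEGER PRIMARY KEY, department TEXT);
CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, zone TEXT);
CREATE TABLE dim_time     (time_id INTEGER PRIMARY KEY, day TEXT, hour INTEGER);
CREATE TABLE fact_movement (
    employee_id  INTEGER REFERENCES dim_employee(employee_id),
    location_id  INTEGER REFERENCES dim_location(location_id),
    time_id      INTEGER REFERENCES dim_time(time_id),
    stay_minutes INTEGER
);
""")

conn.executemany("INSERT INTO dim_employee VALUES (?, ?)",
                 [(1, "R&D"), (2, "Sales")])
conn.executemany("INSERT INTO dim_location VALUES (?, ?)",
                 [(10, "Lab"), (20, "Office")])
conn.executemany("INSERT INTO dim_time VALUES (?, ?, ?)",
                 [(100, "2024-01-08", 9), (101, "2024-01-08", 14)])
conn.executemany("INSERT INTO fact_movement VALUES (?, ?, ?, ?)",
                 [(1, 10, 100, 120), (1, 20, 101, 60), (2, 20, 100, 200)])

# Roll up the time spent per department and zone by joining the fact
# table to its dimensions.
rows = conn.execute("""
SELECT e.department, l.zone, SUM(f.stay_minutes) AS total_minutes
FROM fact_movement AS f
JOIN dim_employee AS e ON e.employee_id = f.employee_id
JOIN dim_location AS l ON l.location_id = f.location_id
GROUP BY e.department, l.zone
ORDER BY e.department, l.zone
""").fetchall()

for department, zone, total_minutes in rows:
    print(department, zone, total_minutes)
```

The appeal of the dimensional layout is that each analytical question becomes a join from the fact table to the relevant dimensions followed by a `GROUP BY`.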
Abstract: With the rapid development of the Internet, CRM has become one of the most important factors in making enterprises competitive. At the same time, analytical CRM based on the data warehouse (DW) is the kernel of a CRM system. This paper mainly explains the idea of CRM and the DW model of an analytical CRM system.
Abstract: Training is the main business of the grass-roots stations of the fire rescue team and the basis of firefighting and rescue work. Current training has several problems, such as a lack of scientific method, a lack of competitiveness, and a lack of archives. To meet the functional and task requirements of "full disaster and large emergency" national emergency rescue in the new period, this paper puts forward a design for an intelligent training system suited to the fire rescue team. Modern technologies such as data warehouse (DW), data mining (DM), online analytical processing (OLAP) and decision support systems are used to guide, promote and improve the training of team stations, so as to achieve more scientific training, more active participation in training, and continuous technological innovation.
Abstract: Recent advances in computing, communications, digital storage technologies, and high-throughput data-acquisition technologies make it possible to gather and store incredible volumes of data. This creates unprecedented opportunities for large-scale knowledge discovery from databases. Data mining is an emerging area of computational intelligence that offers new theories, techniques, and tools for processing large volumes of data, for purposes such as data analysis and decision making. Many researchers are working on designing efficient data mining techniques, methods, and algorithms. Unfortunately, most data mining researchers pay much attention to technical problems in developing data mining models and methods, and little to the basic issues of data mining. In this paper, we propose a new understanding of data mining, namely the domain-oriented data-driven data mining (3DM) model. Some data-driven data mining algorithms developed in our lab are also presented to show its validity.
Abstract: The paper introduces data mining and issues related to it. Data mining is a technique by which useful knowledge can be extracted from a huge set of data. Data mining tasks are used to perform various operations and to solve various problems related to data mining. A data warehouse is a collection of different methods and techniques used to extract useful information from raw data. The genetic algorithm is based on Darwin's theory, in which low-standard chromosomes are removed from the population due to their inability to survive the process of selection. The high-standard chromosomes survive and are mixed by recombination to form more appropriate individuals. In this approach, a huge amount of data is used to predict future results by following several steps.
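The selection-and-recombination loop described in this abstract can be sketched as follows. This is a generic toy genetic algorithm, not the paper's method: the fitness function (count of 1-bits), the population size, and the mutation rate are all assumptions chosen for illustration.

```python
import random

def fitness(chromosome):
    """Toy fitness (an assumption): number of 1-bits in the chromosome."""
    return sum(chromosome)

def evolve(pop_size=20, length=16, generations=30, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        # Selection: the weaker half of the population does not survive.
        population.sort(key=fitness, reverse=True)
        best_history.append(fitness(population[0]))
        survivors = population[:pop_size // 2]
        # Recombination: survivors are mixed by one-point crossover.
        children = []
        while len(survivors) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(survivors, 2)
            cut = rng.randint(1, length - 1)
            child = parent_a[:cut] + parent_b[cut:]
            # Mutation: flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = survivors + children
    population.sort(key=fitness, reverse=True)
    best_history.append(fitness(population[0]))
    return population[0], best_history

best, history = evolve()
print("best fitness per generation:", history)
```

Because survivors are carried over unchanged, the best fitness in this sketch never decreases from one generation to the next.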
Funding: Supported by the National 973 Program of China (No. 2006CB701305, No. 2007CB310804), the National Natural Science Foundation of China (No. 60743001), the Best National Thesis Foundation (No. 2005047), and the National New Century Excellent Talent Foundation (No. NCET-06-0618).
Abstract: Advanced data mining technologies and the large quantities of remotely sensed imagery provide a data mining opportunity with high potential for useful results. Extracting interesting patterns and rules from data sets composed of images and associated ground data can be important in object identification, community planning, resource discovery and other areas. In this paper, a data field is presented to express the observed spatial objects and conduct behavior mining on them. First, the most important aspects of behavior mining and its implications for the future of data mining are discussed, and an ideal framework for a behavior mining system in a network environment is proposed. Second, a model of behavior mining on the observed spatial objects is given, in which the objects are described by the first feature data field and the main feature data field by means of the potential function. Finally, a case study on object identification in public is given and analyzed. The experimental results show that the new model is feasible for behavior mining.
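The abstract does not state which potential function is used. As a rough illustration only, the sketch below assumes a Gaussian-like potential of the kind often seen in data field literature, where each object with mass m at position x_i contributes m * exp(-(d / sigma)^2) at distance d; the function name, the masses, and the impact factor `sigma` are hypothetical.

```python
import math

def potential(point, objects, sigma=1.0):
    """Sum the contributions of all objects at `point`.

    `objects` is a list of (position, mass) pairs. The Gaussian-like
    decay is an assumption, not a form confirmed by the abstract.
    """
    total = 0.0
    for position, mass in objects:
        distance = math.dist(point, position)
        total += mass * math.exp(-(distance / sigma) ** 2)
    return total

# Hypothetical spatial objects: (position, mass).
objects = [((0.0, 0.0), 1.0), ((3.0, 4.0), 2.0)]
print(potential((0.0, 0.0), objects, sigma=5.0))
```

Points near several heavy objects receive a high potential, which is what lets such a field summarize how observed objects cluster in feature space.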
Abstract: This paper focuses on object-oriented data modelling in computer-aided design (CAD) databases. Starting with a discussion of data modelling requirements for CAD applications, appropriate data modelling features are introduced. A feasible approach to selecting the "best" data model for an application is to analyze the data that has to be stored in the database. A data model is appropriate for modelling a given task if the information of the application environment can be easily mapped to the data model. Thus, the data involved are analyzed, and an object-oriented data model appropriate for CAD applications is derived. Based on a review of object-oriented techniques applied in CAD, object-oriented data modelling in CAD is addressed in detail. Finally, 3D geometrical data models and their implementation using the object-oriented method are presented.
Abstract: Various code development platforms, such as the ATHENA Framework [1] of the ATLAS [2] experiment, encounter lengthy compilation and linking times. To improve this situation, the IRIS Development Platform was built as a software development framework acting as compiler, cross-project linker and data fetcher, which allows hot-swaps in order to compare various versions of the software under test. The flexibility fostered by IRIS allowed modular exchange of software libraries among developers, making it a powerful development tool. The IRIS platform used ROOT ntuples [3] as input data; however, a new data model is sought, in line with the facilities offered by IRIS. The schematic of a possible new data structure, a user-implemented object-oriented database, is presented.
Abstract: This paper presents the semantic analysis of queries written in natural language (French) and addressed to object-oriented databases. The studied queries include one or two nominal groups (NG) articulated around a verb. An NG consists of one or several keywords (application-dependent nouns or values). Simple semantic filters are defined for identifying these keywords, which can have one of the following semantic values: class, simple attribute, composed attribute, key value or non-key value. Coherence rules and coherence constraints are introduced to check the validity of the co-occurrence of two consecutive nouns in a complex NG. If a query consists of a single NG, no further analysis is required. Otherwise, if a query contains two valid NGs, the semantic coherence of the verb and the two NGs attached to it is studied.
Abstract: A model called "Entity-Roles" is proposed in this paper, in which the world of interest is viewed as a mathematical structure. With respect to this structure, a first-order (three-valued) logic language is constructed. Any world to be modelled can be logically specified in this language. The integrity constraints on the database and the deduction rules within the database world are derived from the proper axioms of the world being modelled.
Funding: Supported by the National Natural Science Foundation of China (70471037) and the 211 Project Foundation of Shanghai University (8011040506).
Abstract: In this paper, we design a customer-centered data warehouse system with five subjects (listing, bidding, transaction, accounts, and customer contact) based on the business process of online auction companies. For each subject, we analyze its fact indexes and dimensions. Taking the transaction subject as an example, we then analyze the data warehouse model in detail and obtain the multi-dimensional analysis structure of the transaction subject. Finally, using data mining for customer segmentation, we divide customers into four types: impulse customers, prudent customers, potential customers, and ordinary customers. Using the results of multi-dimensional customer data analysis, online auction companies can carry out more targeted marketing and increase customer loyalty.
Funding: Fundamental Research Funds for the Central Universities, China (No. 22D111207).
Abstract: With the rapid progress of the Internet, it is easier for people to get information about the objects they are interested in. However, this information often contains conflicts. To resolve conflicts and obtain the true information, truth discovery has been proposed and has received widespread attention. Many algorithms have been proposed to adapt to different scenarios. This paper investigates these algorithms and summarizes them from the perspective of algorithm models and specific concepts. Some classic datasets and evaluation metrics are given, and some future directions are provided to help readers better understand the field of truth discovery.
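To illustrate the general idea behind truth discovery, the sketch below alternates between weighted voting on each object's value and re-estimating each source's reliability. It is a generic toy scheme, not one of the algorithms surveyed in the paper; the source names, claims, and the smoothed reliability formula are assumptions.

```python
from collections import defaultdict

def discover_truth(claims, iterations=10):
    """Alternate between weighted voting and source-weight updates.

    `claims` maps each source to a dict of {object: claimed value}.
    """
    weights = {source: 1.0 for source in claims}
    truths = {}
    for _ in range(iterations):
        # Step 1: each object's truth is the value with the most weighted votes.
        votes = defaultdict(lambda: defaultdict(float))
        for source, observations in claims.items():
            for obj, value in observations.items():
                votes[obj][value] += weights[source]
        truths = {obj: max(values, key=values.get)
                  for obj, values in votes.items()}
        # Step 2: a source's weight is its smoothed agreement rate with
        # the current truths (add-one smoothing is an assumption).
        for source, observations in claims.items():
            agreed = sum(1 for obj, value in observations.items()
                         if truths[obj] == value)
            weights[source] = (agreed + 1) / (len(observations) + 2)
    return truths, weights

# Hypothetical conflicting claims from three sources about three objects.
claims = {
    "A": {"x": 1, "y": 2, "z": 3},
    "B": {"x": 1, "y": 2, "z": 4},
    "C": {"x": 5, "y": 6, "z": 4},
}
truths, weights = discover_truth(claims)
print(truths)
print(weights)
```

Sources that agree with the consensus more often end up with larger weights, so their claims count for more in later rounds.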
Abstract: Most international accreditation bodies in engineering education (e.g., ABET) and outcome-based educational systems base their assessments on learning outcomes and program educational objectives. However, mapping program educational objectives (PEOs) to student outcomes (SOs) is a challenging and time-consuming task, especially for a new program applying for ABET-EAC (Accreditation Board for Engineering and Technology, Engineering Accreditation Commission) accreditation. In addition, ABET needs to automatically ensure that the mapping (classification) is reasonable and correct. The classification also plays a vital role in the assessment of students' learning. Since PEOs are expressed as short text, they do not contain enough semantic meaning and information, and consequently they suffer from high sparseness, multidimensionality and the curse of dimensionality. In this work, a novel associative short text classification technique is proposed to map PEOs to SOs. The datasets are extracted from 152 self-study reports (SSRs) that were produced in operational settings in an engineering program accredited by ABET-EAC. The datasets are processed and transformed into a representational form appropriate for association rule mining. The extracted rules are utilized as delegate classifiers to map PEOs to SOs. The proposed associative classification of the mapping of PEOs to SOs has shown promising results; it can simplify the classification of short text and avoid many problems caused by enriching short text with external resources that are not related or relevant to the dataset.
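The general idea of associative classification, mining "token implies label" rules and then using them as classifiers, can be sketched as below. This is a simplified single-token version and not the paper's technique: the thresholds, the tokenized PEO-like samples, and the SO labels are all hypothetical.

```python
from collections import Counter

def mine_rules(samples, min_support=2, min_confidence=0.6):
    """Mine single-token class association rules of the form token -> label.

    `samples` is a list of (tokens, label) pairs. Support is the number
    of samples containing the token with that label; confidence is that
    count divided by the number of samples containing the token.
    """
    token_count = Counter()
    pair_count = Counter()
    for tokens, label in samples:
        for token in set(tokens):
            token_count[token] += 1
            pair_count[(token, label)] += 1
    rules = []
    for (token, label), support in pair_count.items():
        confidence = support / token_count[token]
        if support >= min_support and confidence >= min_confidence:
            rules.append((token, label, support, confidence))
    # Strongest rules first: by confidence, then support, then token name.
    rules.sort(key=lambda rule: (-rule[3], -rule[2], rule[0]))
    return rules

def classify(tokens, rules, default=None):
    """Return the label of the first (strongest) rule that matches."""
    present = set(tokens)
    for token, label, _, _ in rules:
        if token in present:
            return label
    return default

# Hypothetical tokenized short texts with hypothetical outcome labels.
samples = [
    (["design", "engineering", "solutions"], "SO2"),
    (["design", "systems"], "SO2"),
    (["communicate", "effectively"], "SO3"),
    (["communicate", "teams"], "SO3"),
    (["teams", "leadership"], "SO5"),
    (["teams", "collaborate"], "SO5"),
]
rules = mine_rules(samples)
print(classify(["design", "products"], rules))
print(classify(["work", "teams"], rules))
```

Because the rules come only from the labeled dataset itself, this style of classifier does not need external resources to enrich the short text.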