An excellent cardinality estimation can make the query optimiser produce a good execution plan. Although there are some studies on cardinality estimation, the prediction results of existing cardinality estimators are inaccurate, and query efficiency cannot be guaranteed either. In particular, they struggle to accurately capture the complex relationships between multiple tables in complex database systems, and they perform poorly on complex queries. In this study, a novel cardinality estimator is proposed. Its core technique is a BiLSTM network structure augmented with an attention mechanism. First, the columns involved in the query statements in the training set are sampled and compressed into bitmaps. Then, a Word2vec model is used to embed the query statements as word vectors. Finally, the BiLSTM network and attention mechanism are employed to process the word vectors. The proposed model takes into consideration not only the correlation between tables but also the processing of complex predicates. Extensive experiments evaluating the BiLSTM-Attention Cardinality Estimator (BACE) on the IMDB datasets are conducted. The results show that the deep learning model can significantly improve the quality of cardinality estimation, which plays a vital role in query optimisation for complex databases.
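The abstract gives the pipeline (bitmap sampling, Word2vec embedding, BiLSTM plus attention) but not the exact architecture; the sketch below is one plausible reading in PyTorch, with all class and parameter names hypothetical:

```python
import torch
import torch.nn as nn

class BiLSTMAttnEstimator(nn.Module):
    """Minimal sketch of a BiLSTM-with-attention regressor for cardinality."""
    def __init__(self, vocab_size, embed_dim=64, hidden=128):
        super().__init__()
        # Stand-in for pretrained Word2vec vectors; in practice one would use
        # nn.Embedding.from_pretrained(word2vec_matrix).
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)    # scores each time step
        self.out = nn.Linear(2 * hidden, 1)     # predicts log-cardinality

    def forward(self, tokens):                  # tokens: (batch, seq_len) int ids
        h, _ = self.lstm(self.embed(tokens))    # (batch, seq_len, 2*hidden)
        w = torch.softmax(self.attn(h), dim=1)  # attention weights over time
        ctx = (w * h).sum(dim=1)                # weighted context vector
        return self.out(ctx).squeeze(-1)

model = BiLSTMAttnEstimator(vocab_size=1000)
pred_log_card = model(torch.randint(0, 1000, (4, 12)))  # 4 queries, 12 tokens each
```

Predicting the logarithm of the cardinality (rather than the raw count) is a common choice in learned estimators because true cardinalities span many orders of magnitude.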
A DMVOCC-MVDA (distributed multiversion optimistic concurrency control with multiversion dynamic adjustment) protocol was presented to process mobile distributed real-time transactions in mobile broadcast environments. At the mobile hosts, all transactions perform local pre-validation. The local pre-validation process is carried out against the transactions committed at the server in the last broadcast cycle. Transactions that survive local pre-validation must be submitted to the server for final validation. The new protocol eliminates conflicts between mobile read-only and mobile update transactions, and resolves data conflicts flexibly by using multiversion dynamic adjustment of the serialization order to avoid unnecessary transaction restarts. Mobile read-only transactions can be committed without blocking, and their response time is greatly shortened. The tolerance of mobile transactions to disconnections from the broadcast channel is also increased. In global validation, distributed mobile transactions must be checked to ensure distributed serializability across all participants. The simulation results show that the proposed concurrency control protocol offers better performance than other protocols in terms of miss rate, restart rate, and commit rate. Under a high workload (think time of 1 s), the miss rate of DMVOCC-MVDA is only 14.6%, significantly lower than that of the other protocols. The restart rate of DMVOCC-MVDA is only 32.3%, showing that DMVOCC-MVDA can effectively reduce the restart rate of mobile transactions. And the commit rate of DMVOCC-MVDA reaches 61.2%, which is markedly higher than that of the other protocols.
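The local pre-validation step can be sketched as classical backward validation: the mobile host checks its read set against the write sets that the server broadcast as committed in the last cycle. A minimal, hypothetical sketch (the real protocol additionally adjusts the serialization order dynamically rather than always restarting):

```python
class MobileTransaction:
    def __init__(self, tid):
        self.tid = tid
        self.read_set, self.write_set = set(), set()

def local_prevalidate(txn, committed_write_sets):
    """Backward validation against transactions committed at the server
    during the last broadcast cycle: the mobile transaction survives only
    if nothing it read was overwritten by an already-committed writer."""
    for ws in committed_write_sets:
        if txn.read_set & ws:
            return False          # conflict: restart (or adjust) locally
    return True

t = MobileTransaction("T7")
t.read_set = {"x", "y"}
print(local_prevalidate(t, [{"z"}, {"y", "w"}]))  # False: y was overwritten
```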
Database systems have consistently been prime targets for cyber-attacks and threats due to the critical nature of the data they store. Despite the increasing reliance on database management systems, this field continues to face numerous cyber-attacks. Database management systems serve as the foundation of any information system or application. Any cyber-attack can result in significant damage to the database system and loss of sensitive data. Consequently, cyber risk classifications and assessments play a crucial role in risk management and establish an essential framework for identifying and responding to cyber threats. Risk assessment aids in understanding the impact of cyber threats and developing appropriate security controls to mitigate risks. The primary objective of this study is to conduct a comprehensive analysis of cyber risks in database management systems, including classifying threats, vulnerabilities, impacts, and countermeasures. This classification helps to identify suitable security controls to mitigate cyber risks for each type of threat. Additionally, this research aims to explore technical countermeasures to protect database systems from cyber threats. This study employs the content analysis method to collect, analyze, and classify data in terms of types of threats, vulnerabilities, and countermeasures. The results indicate that SQL injection attacks and Denial of Service (DoS) attacks were the most prevalent technical threats in database systems, each accounting for 9% of incidents. Vulnerable audit trails, intrusion attempts, and ransomware attacks were classified as the second level of technical threats in database systems, comprising 7% and 5% of incidents, respectively. Furthermore, the findings reveal that insider threats were the most common non-technical threats in database systems, accounting for 5% of incidents. Moreover, the results indicate that weak authentication, unpatched databases, weak audit trails, and multiple usage of an account were the most common technical vulnerabilities in database systems, each accounting for 9% of vulnerabilities. Additionally, software bugs, insecure coding practices, weak security controls, insecure networks, password misuse, weak encryption practices, and weak data masking were classified as the second level of security vulnerabilities in database systems, each accounting for 4% of vulnerabilities. The findings from this work can assist organizations in understanding the types of cyber threats and developing robust strategies against cyber-attacks.
Most of the proposed concurrency control protocols for real-time database systems are based on the serializability theorem. Owing to the unique characteristics of real-time database applications and the importance of satisfying the timing constraints of transactions, serializability is too strong as a correctness criterion and is not suitable for real-time databases in most cases. On the other hand, relaxed notions of serializability, including epsilon serializability and similarity serializability, can allow more real-time transactions to satisfy their timing constraints, but database consistency may be sacrificed to some extent. We therefore propose the use of weak serializability (WSR), which is more relaxed than conflict serializability while database consistency is maintained. In this paper, we first formally define the new notion of correctness called weak serializability. After the necessary and sufficient conditions for weak serializability are shown, the corresponding concurrency control protocol WDHP (weak serializable distributed high priority protocol) is outlined for distributed real-time databases, where a new lock mode called the mask lock mode is proposed to simplify the condition of global consistency. Finally, through a series of simulation studies, it is shown that the new concurrency control protocol greatly improves the performance of distributed real-time databases.
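As background for the serializability theorem these protocols build on: a schedule is conflict serializable exactly when its precedence graph (an edge T_i → T_j for each pair of conflicting operations in which T_i acts first) is acyclic. A minimal sketch of that classical test, not of the WDHP protocol itself, with all names hypothetical:

```python
from collections import defaultdict

def conflict_serializable(schedule):
    """schedule: time-ordered list of (txn, op, item), op in {'R', 'W'}.
    Build the precedence graph and report whether it is acyclic."""
    edges = defaultdict(set)
    for i, (ti, oi, xi) in enumerate(schedule):
        for tj, oj, xj in schedule[i + 1:]:
            if ti != tj and xi == xj and 'W' in (oi, oj):
                edges[ti].add(tj)          # earlier conflicting op comes first: ti -> tj
    color = {}                             # unvisited / 'gray' (on stack) / 'black' (done)
    def has_cycle(u):
        color[u] = 'gray'
        for v in edges[u]:
            c = color.get(v)
            if c == 'gray' or (c is None and has_cycle(v)):
                return True
        color[u] = 'black'
        return False
    txns = {t for t, _, _ in schedule}
    return not any(color.get(t) is None and has_cycle(t) for t in txns)

# r1(x) w2(x) w1(x): edges T1->T2 and T2->T1 form a cycle, so not serializable
print(conflict_serializable([("T1", "R", "x"), ("T2", "W", "x"), ("T1", "W", "x")]))
```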
This paper formally defines and analyses the new notion of correctness called quasi serializability, and then outlines the corresponding concurrency control protocol QDHP for distributed real-time databases. Finally, through a series of simulation studies, it shows that with the new concurrency control protocol the performance of distributed real-time databases can be much improved.
In the Engine CAD application system (ECAD), the engineering database management system (ECAD-EDBMS) is the kernel. ECAD-EDBMS can manage and process multimedia data such as graphics, numeric data, text, sound, images and video. It provides an integrated environment and richer functionality for the many subsystems of ECAD and for engine designers, thereby improving design efficiency.
In the context of a proliferation of Database Management Systems (DBMSs), we have envisioned and produced an OWL 2 ontology able to provide a high-level machine-processable description of the DBMSs domain. This conceptualization aims to facilitate the proper execution of various software engineering processes and database-focused administration tasks. It can also be used to improve the decision-making process for determining/selecting the appropriate DBMS, subject to specific requirements. The proposed model describes the most important features and aspects of the DBMS domain, including the support for various paradigms (relational, graph-based, key-value, tree-like, etc.), query languages, platforms (servers), plus running environments (desktop, Web, cloud) and specific contexts—i.e., focusing on optimizing queries, redundancy, security, performance, schema vs. schema-less approaches, programming languages/paradigms, and others. The process of populating the ontology with significant individuals (actual DBMSs) benefits from the existing knowledge exposed by free and open machine-processable knowledge bases, using structured data from Wikipedia and related sources. The pragmatic use of our ontology is demonstrated by two educational software solutions based on current practices in Web application development, providing support for learning about and experimenting with key features of current semantic Web technologies and tools. This approach is also an example of combining knowledge from the database systems, semantic Web technologies, and software engineering areas.
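To make the idea concrete, here is a small, hypothetical fragment in the same spirit, built with Python's rdflib; the class and property names are illustrative, not the ontology's actual vocabulary:

```python
from rdflib import Graph, Namespace
from rdflib.namespace import OWL, RDF, RDFS

DBMS = Namespace("http://example.org/dbms#")   # hypothetical namespace

g = Graph()
g.bind("dbms", DBMS)
# Classes: a small DBMS paradigm hierarchy
g.add((DBMS.DBMS, RDF.type, OWL.Class))
g.add((DBMS.RelationalDBMS, RDF.type, OWL.Class))
g.add((DBMS.RelationalDBMS, RDFS.subClassOf, DBMS.DBMS))
# A property linking systems to query languages
g.add((DBMS.supportsQueryLanguage, RDF.type, OWL.ObjectProperty))
# An individual, as might be populated from Wikipedia-derived structured data
g.add((DBMS.PostgreSQL, RDF.type, DBMS.RelationalDBMS))
g.add((DBMS.PostgreSQL, DBMS.supportsQueryLanguage, DBMS.SQL))

print(g.serialize(format="turtle"))
```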
Secretion systems, which can mediate the passage of macromolecules across cellular membranes, are essential for virulence and genetic material exchange among bacterial species[1]. The Type IV secretion system (T4SS) is one of these secretion systems, and it usually consists of 12 genes: VirB1, VirB2 ... VirB11, and VirD4[2]. The structure and molecular mechanisms of these genes have been well analyzed in Gram-negative strains[3], and Gram-positive strains were once believed to lack T4SS. However, some recent studies revealed that one or more virB/D genes also exist in some kinds of Gram-positive bacteria, play a similar role, and form a T4SS-like system[3]. The VirB1-like, VirB4, VirB6, and VirD4 genes were identified in the chromosome of the Gram-positive bacterium Streptococcus suis in our previous studies, and their role as important mobile elements for horizontal transfer to recipients in an 89K pathogenicity island (PAI) was demonstrated[4,5]. However, their structure and molecular mechanisms in other strains, especially in Gram-positive strains, remain unclear.
The necessity and feasibility of introducing attribute weights into a digital fingerprinting system are presented. A weighted algorithm for fingerprinting relational databases for traitor tracing is proposed. Higher weights are assigned to more significant attributes, so important attributes are fingerprinted more frequently than other ones. Finally, the robustness of the proposed algorithm, such as its performance against collusion attacks, is analyzed. Experimental results prove the superiority of the algorithm.
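The abstract does not give the embedding details; the sketch below, loosely following LSB-based relational watermarking schemes, shows one way weighted attribute selection could work. All names, the 1/gamma marking rate, and the keyed-hash construction are assumptions for illustration:

```python
import hashlib
import random

def embed_fingerprint(rows, weights, secret_key, fp_bits, gamma=3):
    """Mark roughly 1/gamma of the tuples; pick the attribute to mark with
    probability proportional to its weight, and hide one fingerprint bit in
    the least-significant bit of that (integer) attribute."""
    attrs = list(weights)
    marked = []
    for pk, row in rows:
        h = int(hashlib.sha256(f"{secret_key}|{pk}".encode()).hexdigest(), 16)
        row = dict(row)
        if h % gamma == 0:
            rnd = random.Random(h)              # key- and tuple-dependent choice
            attr = rnd.choices(attrs, weights=[weights[a] for a in attrs])[0]
            bit = fp_bits[h % len(fp_bits)]
            row[attr] = (row[attr] & ~1) | bit  # overwrite least-significant bit
        marked.append((pk, row))
    return marked

rows = [(i, {"age": 30 + i, "salary": 5000 + 10 * i}) for i in range(6)]
out = embed_fingerprint(rows, {"age": 1, "salary": 3}, "key", [1, 0, 1, 1])
```

Here "salary" carries three times the weight of "age", so it is chosen for marking about three times as often, matching the paper's intent that significant attributes are fingerprinted more frequently.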
The process of constructing a database of average cross-sections of Chinese proximal femurs is described. The main goal of creating the database is to support the design of hip stems for Chinese patients. Methods for constructing the database are introduced. Using existing software and programs developed by the authors, a database of average cross-sections of Chinese proximal femurs was built from CT images of eighty femur specimens. The 3-D shape of a patient's proximal femur can be reconstructed from the database and X-ray radiographs. Theoretical analyses and results of clinical application indicate that the database can be used to design hip stems for Chinese patients.
Until recently, many computational materials scientists have shown little interest in materials databases. This is now changing because the amount of computational data is rapidly increasing and the potential for data mining provides unique opportunities for discovery and optimization. Here, a few examples of such opportunities are discussed, relating to structural analysis and classification, discovery of correlations between materials properties, and discovery of unsuspected compounds.
There has been increasing interest in integrating decision support systems (DSS) and expert systems (ES) to provide decision makers with a more accessible, productive and domain-independent information and computing environment. This paper aims at designing a multiple-expert-system-integrated decision support system (MESIDSS) to enhance decision makers' ability in more complex cases. The basic framework, the management system for multiple ESs, and the functions of MESIDSS are presented. The applications of MESIDSS in large-scale decision-making processes are discussed in terms of problem decomposition, dynamic combination of multiple ESs, linking of multiple bases, and decision coordination. Finally, a summary and some ideas for the future are presented.
This paper studied an integrative fault diagnostic system for power transformers. The on-line monitored items were the grounded current of the iron core, internal partial discharge, and dissolved gas in oil. The diagnostic techniques were simple rule-based judgment, fuzzy logic reasoning, and neural network classification. Considering that much of the fault information is interrelated, intelligent diagnosis was implemented by integrating the neural network with the expert system. Holistic integration strategies were realized through information-based integration of monitoring devices, a shared information database on several levels, and fusion diagnosis software following these lines of reasoning. The expert system performs logical reasoning; the neural network realizes pattern recognition by model matching; integrating the two yields the diagnostic conclusions. A diagnosis example showed that the integrative diagnostic system is reasonable and practical.
This paper presents a rule merging and simplifying method and an improved deviation analysis algorithm. Fuzzy equivalence theory avoids the rigid either/or decisions of traditional equivalence theory. During a data cleaning task, some rules stand in inclusion relations with each other (one rule includes, or is included in, another). The equivalence degree of an included rule is smaller than that of the including rule, so a rule merging and simplifying method is introduced to reduce total computing time. Such inclusion relations also affect the deviation of the fuzzy equivalence degree, so an improved deviation analysis algorithm that omits the influence of the included rules' equivalence degrees is also presented. Normally, duplicate records are logged in a file, and users have to check and verify them one by one, which is time-consuming. The proposed algorithm saves users' labor during duplicate-record checking. Finally, an experiment is presented that demonstrates the feasibility of the approach.
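A minimal sketch of the two ideas, fuzzy equivalence degree as weighted field similarity and merging away included rules, assuming records are dicts and a rule maps attribute names to weights (all names and the similarity measure are illustrative, not the paper's):

```python
from difflib import SequenceMatcher

def field_sim(a, b):
    """Fuzzy similarity of two field values in [0, 1]."""
    return SequenceMatcher(None, str(a), str(b)).ratio()

def equivalence_degree(r1, r2, rule):
    """rule: dict attribute -> weight; equivalence as weighted mean similarity."""
    total = sum(rule.values())
    return sum(w * field_sim(r1[a], r2[a]) for a, w in rule.items()) / total

def merge_rules(rules):
    """Drop every rule whose attribute set is strictly included in another
    rule's attribute set (the included rule has the smaller degree anyway)."""
    return [r for r in rules if not any(set(r) < set(o) for o in rules)]

r1 = {"name": "John Smith", "city": "Boston"}
r2 = {"name": "Jon Smith", "city": "Boston"}
rules = [{"name": 2}, {"name": 2, "city": 1}]        # first rule is included
kept = merge_rules(rules)                            # -> [{"name": 2, "city": 1}]
print(equivalence_degree(r1, r2, kept[0]))           # high score: likely duplicates
```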
A prototype of fault diagnosis based on Petri nets, developed for a satellite tele-control subsystem, is introduced in this paper. Its structure is first given, with emphasis on a Petri net modeling tool designed using object-oriented methods. The prototype connects to the database with the DAO (Data Access Object) technique, and packages the Petri net's firing mechanism and its analysis methods as DLL (Dynamic Link Library) modules. Compared with the rule-based expert system method, the Petri net-based approach can store knowledge in mathematical matrices and make inferences more quickly and effectively.
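To illustrate "storing the knowledge in mathematical matrices": a Petri net's structure is captured by pre- and post-incidence matrices, and firing a transition is just vector arithmetic on the marking. A toy example with a hypothetical three-place, two-transition fault-propagation net:

```python
import numpy as np

# Rows = places, columns = transitions (a made-up 3-place, 2-transition net).
pre = np.array([[1, 0],    # tokens each transition consumes from each place
                [0, 1],
                [0, 0]])
post = np.array([[0, 0],   # tokens each transition produces in each place
                 [1, 0],
                 [0, 1]])
C = post - pre             # incidence matrix

def enabled(marking, t):
    return np.all(marking >= pre[:, t])

def fire(marking, t):
    assert enabled(marking, t), "transition not enabled"
    return marking + C[:, t]   # state equation: m' = m + C * firing vector

m0 = np.array([1, 0, 0])   # initial fault token
m1 = fire(m0, 0)           # -> [0, 1, 0]
m2 = fire(m1, 1)           # -> [0, 0, 1]: symptom place is now marked
print(m1, m2)
```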
Data mining (DM) usually means efficient knowledge discovery from databases, and the immune algorithm is a biologically inspired, global searching algorithm. A novel induction algorithm is proposed here which integrates the power of individual immunity with the evolutionary mechanism of a population. This algorithm does not focus on discovering specific classification information, but rather on unknown knowledge and prediction with higher-level rules. Theoretical analysis and simulations both show that this algorithm promotes the stabilization of a population and the improvement of overall capability, while also keeping a high degree of precision during rule induction.
Frequent pattern mining plays an essential role in data mining. Most of the previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is still costly, especially when there exist prolific patterns and/or long patterns. We introduce a novel frequent pattern growth (FP-growth) method, which is efficient and scalable for mining both long and short frequent patterns without candidate generation, and build a new projection frequent pattern tree (PFP-tree) algorithm, which not only inherits all the advantages of the FP-growth method but also avoids its bottleneck of database-size dependence when constructing the frequent pattern tree (FP-tree). Mining efficiency is achieved by introducing the projection technique, which avoids serially scanning each frequent item in the database; the cost is mainly related to the depth of the tree, namely the number of frequent items in the longest transaction in the database, not the sum of all the frequent items in the database, which greatly shortens tree-construction time. Our performance study shows that the PFP-tree method is efficient and scalable for mining large databases or data warehouses, and is even about an order of magnitude faster than the FP-growth method.
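The projection technique itself can be shown compactly: for each frequent item, restrict the database to the transactions containing it (keeping only items that sort after it) and recurse, so the work tracks the recursion depth rather than repeated full scans. A minimal sketch of projection-based pattern growth, not the authors' PFP-tree implementation:

```python
from collections import defaultdict

def mine_frequent(transactions, min_support, prefix=(), results=None):
    """Recursively mine frequent itemsets by database projection.
    transactions: lists of unique, sorted items; min_support: absolute count."""
    if results is None:
        results = {}
    counts = defaultdict(int)
    for t in transactions:
        for item in t:
            counts[item] += 1
    for item in sorted(counts):
        support = counts[item]
        if support < min_support:
            continue
        itemset = prefix + (item,)
        results[itemset] = support
        # Project: keep transactions containing `item`, truncated to items after it.
        projected = [[j for j in t if j > item] for t in transactions if item in t]
        projected = [p for p in projected if p]
        mine_frequent(projected, min_support, itemset, results)
    return results

db = [["a", "b", "c"], ["a", "b"], ["b", "c"]]
print(mine_frequent([sorted(set(t)) for t in db], min_support=2))
# {('a',): 2, ('a', 'b'): 2, ('b',): 3, ('b', 'c'): 2, ('c',): 2}
```

Imposing a total order on items and only recursing on later items guarantees every frequent itemset is enumerated exactly once.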
Nowadays, many kinds of computer network data management systems have been built in China, and it is widely recognized that management information systems (MIS) have revolutionized management mechanisms. Moreover, company managers increasingly and urgently need wide-ranging, comprehensive decision information, a defining characteristic of the information explosion era. User requirements in MIS design are becoming ever more demanding, and these requirements pose new problems for MIS designers. Furthermore, traditional database development methods cannot solve such large and complex problems of wide-ranging, comprehensive information processing. This paper proposes the adoption of a parallel processing mode and the construction of a new decision support system (DSS), and discusses and analyzes the problems of information collection and processing and the acquisition of full-merit information across domains and across very large databases (VLDBs).
In solving the clustering problem in the context of knowledge discovery in databases (KDD), traditional methods, for example the K-means algorithm and its variants, usually require the users to provide the number of clusters in advance based on prior information. Unfortunately, the number of clusters is in general unknown to the users, who are usually short of such prior information. The clustering calculation therefore becomes a tedious trial-and-error exercise, and the result is often not globally optimal, especially when the number of clusters is large. In this paper, a new dynamic clustering method based on genetic algorithms (GA) is proposed and applied to auto-clustering of data entities in large databases. The algorithm can automatically cluster the data according to their similarities and find the exact number of clusters. Experimental results indicate that the method achieves global optimization through its dynamic clustering logic.
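The abstract does not specify the chromosome encoding. One common way to let a GA discover the number of clusters is a fixed-length genome of candidate centers plus per-center activation flags, scored by a cluster-validity index; toggling flags during mutation lets the cluster count evolve. A sketch under those assumptions (silhouette score as fitness is my choice, not necessarily the paper's):

```python
import numpy as np
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)

def decode(genome, X):
    """Assign each point to its nearest active center."""
    centers = genome["centers"][genome["active"]]
    return np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2).argmin(axis=1)

def fitness(genome, X):
    if genome["active"].sum() < 2:
        return -1.0
    labels = decode(genome, X)
    if len(np.unique(labels)) < 2:
        return -1.0                      # degenerate partition
    return silhouette_score(X, labels)   # cluster validity as fitness

def random_genome(X, k_max):
    idx = rng.choice(len(X), size=k_max, replace=False)
    active = rng.random(k_max) < 0.5
    active[:2] = True                    # keep at least two clusters switched on
    return {"centers": X[idx].copy(), "active": active}

def mutate(genome):
    g = {"centers": genome["centers"] + rng.normal(0, 0.1, genome["centers"].shape),
         "active": genome["active"].copy()}
    g["active"] ^= rng.random(len(g["active"])) < 0.1   # toggling flags lets k evolve
    while g["active"].sum() < 2:
        g["active"][rng.integers(len(g["active"]))] = True
    return g

def ga_cluster(X, k_max=8, pop_size=30, generations=40):
    pop = [random_genome(X, k_max) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda g: fitness(g, X), reverse=True)
        elite = pop[: pop_size // 2]     # truncation selection
        pop = elite + [mutate(elite[rng.integers(len(elite))])
                       for _ in range(pop_size - len(elite))]
    best = max(pop, key=lambda g: fitness(g, X))
    return decode(best, X), int(best["active"].sum())
```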
A partition checkpoint strategy based on data segment priority is presented to meet the timing constraints of the data and the transactions in embedded real-time main memory database systems (ERTMMDBS), as well as to reduce the number of transactions missing their deadlines and the recovery time. The partition checkpoint strategy takes into account the characteristics of the data and the transactions associated with it; moreover, it partitions the database according to data segment priority and sets a corresponding checkpoint frequency for each partition, so that each partition is checkpointed independently. The simulation results show that the partition checkpoint strategy decreases the ratio of transactions missing their deadlines.
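A toy illustration of the scheduling idea: each partition gets its own checkpoint period derived from its data segment priority (the inverse-priority mapping below is an assumption; the paper does not specify the formula), and partitions are checkpointed independently:

```python
import heapq

def schedule_checkpoints(partitions, horizon):
    """partitions: dict name -> (priority, base_period).
    Assumed mapping: higher priority -> shorter checkpoint period."""
    events = []
    for name, (priority, base) in partitions.items():
        period = base / priority
        heapq.heappush(events, (period, name, period))
    log = []
    while events and events[0][0] <= horizon:
        t, name, period = heapq.heappop(events)
        log.append((round(t, 2), name))                 # checkpoint this partition
        heapq.heappush(events, (t + period, name, period))
    return log

parts = {"hot": (4, 10.0), "warm": (2, 10.0), "cold": (1, 10.0)}
for t, p in schedule_checkpoints(parts, horizon=10):
    print(f"t={t}: checkpoint {p}")   # "hot" is checkpointed four times as often
```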