Today, the quantity of data continues to increase; furthermore, the data are heterogeneous, come from multiple sources (structured, semi-structured and unstructured), and vary in quality. It is therefore very likely that data are manipulated without knowledge of their structure and semantics. In fact, the metadata may be insufficient or totally absent. Data anomalies may be due to poor semantic descriptions, or even to the absence of any description. In this paper, we propose an approach to better understand the semantics and the structure of the data. Our approach helps to automatically correct intra-column and inter-column anomalies. We aim to improve data quality by processing null values and the semantic dependencies between columns.
We present a new method for estimating missing values or correcting unreliable observed values of time-dependent physical fields. This method, named PROFHMM_UNC, is based on Hidden Markov Models and Self-Organizing Maps. PROFHMM_UNC combines the knowledge of the physical process under study, provided by an already known dynamic model, with the truncated time series of observations of the phenomenon. To generate the states of the Hidden Markov Model, Self-Organizing Maps are used to discretize the available data. We modify the Viterbi algorithm so that it takes into account a priori information on the quality of the observed data when selecting the optimum reconstruction. The validity of PROFHMM_UNC was confirmed by a twin experiment with the outputs of the ocean biogeochemical NEMO-PISCES model.
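As an illustration of how a priori quality information can enter Viterbi decoding, here is a minimal sketch. It is not the PROFHMM_UNC modification itself, whose details the abstract does not give: the weighting rule (scaling each emission log-likelihood by a confidence value in [0, 1]), the function name `weighted_viterbi`, and the two-state toy model are all assumptions made for the example.

```python
import math

def weighted_viterbi(log_init, log_trans, log_emit, obs, conf):
    """Most likely state path. obs[t] is an observation index or None;
    conf[t] in [0, 1] says how much obs[t] is trusted (0 = ignore it)."""
    n_states = len(log_init)

    def emit(state, t):
        # Assumed weighting rule: scale the emission log-likelihood by conf[t].
        if obs[t] is None or conf[t] == 0:
            return 0.0
        return conf[t] * log_emit[state][obs[t]]

    score = [log_init[s] + emit(s, 0) for s in range(n_states)]
    back = []  # back[t - 1][s] = best predecessor of state s at time t
    for t in range(1, len(obs)):
        prev, score, pointers = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s] + emit(s, t))
        back.append(pointers)

    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Hypothetical two-state model with "sticky" transitions.
log = math.log
log_init = [log(0.5), log(0.5)]
log_trans = [[log(0.8), log(0.2)], [log(0.2), log(0.8)]]
log_emit = [[log(0.9), log(0.1)], [log(0.1), log(0.9)]]
obs = [0, 0, 1, 1, 0, 0]

trusted = weighted_viterbi(log_init, log_trans, log_emit, obs, [1, 1, 1, 1, 1, 1])
doubted = weighted_viterbi(log_init, log_trans, log_emit, obs, [1, 1, 0.2, 0.2, 1, 1])
```

When the two middle observations are fully trusted, the decoded path follows them; when their confidence is lowered, the transition model dominates and the path stays in the initial state.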
This paper presents, without altering the AADL meta-model, a formal description of the static and behavioral aspects of the AADL thread component. This active and concurrent applicative component of AADL poses many challenges to formalization and analysis, including instantaneous and/or delayed communications, concurrent tasks and time-dependent features, and the need to analyze correctness. This formalization, based on real-time object-oriented theories, not only allows a precise description of the semantics of thread composition with respect to timing requirements but also makes possible the formal verification of behavioral properties.
In this paper, we propose a stochastic Petri net model, P-timed Workflow (WPTSPN), to specify, verify, and analyze a business process (BP) of a Flexible Manufacturing System (FMS). After formalizing the semantics of our model, we illustrate how to verify some of its properties (reachability, safety, boundedness, liveness, correctness, alive tokens, and security) in the P-Timed context. Next, we validate the relevance of the proposed model with MATLAB simulation through a specific FMS case study. Finally, we use a generalized truncated density function to predict the duration of a token’s sojourn (residence) in a timed place with respect to the sequence states of the global FMS workflow.
A main advantage of Architecture Description Languages (ADLs) is their ability to facilitate formal analysis and verification of complex software architectures. Since some researchers try to extend them with new techniques, we show in this paper how the use of tile logic as an extension of rewriting logic can strengthen the ability of existing ADL formalisms to cope with the hierarchy and composition features that are increasingly present in such software architectures. In order to cover key and generic ADL concepts, our approach is explained through LfP (Language for rapid Prototyping), an ADL that makes it possible to specify the hierarchical behaviour of software components. The goal of our contribution is thus to exploit a suitable logic that allows natural reasoning about software system behaviour, possibly hierarchical and modular, in terms of its basic components and their interactions.
Content syndication has become a popular way to deliver frequently updated information on the Web in a timely manner. Today, web syndication technologies such as RSS or Atom are used in a wide variety of applications, ranging from large-scale news broadcasting to medium-scale information sharing in scientific and professional communities. However, they exhibit serious limitations in dealing with information overload in Web 2.0. There is a vital need for efficient real-time filtering methods across feeds, to allow users to effectively follow personally interesting information. In this paper, we investigate three indexing techniques for users' subscriptions, based on inverted lists or on an ordered trie, for exact and partial matching. We present analytical models for memory requirements and matching time, and we conduct a thorough experimental evaluation to exhibit the impact of critical parameters of realistic web syndication workloads.
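To make the inverted-list idea concrete, the following is a minimal sketch of count-based matching of a feed item against keyword subscriptions. It is an illustration under assumptions, not one of the three index structures evaluated in the paper: the class name `SubscriptionIndex`, the `min_fraction` parameter, and the reading of "partial matching" as "at least a given fraction of the subscription's terms occur in the item" are all hypothetical.

```python
from collections import defaultdict

class SubscriptionIndex:
    """Count-based inverted-list index over keyword subscriptions (illustrative)."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of subscriptions containing it
        self.sizes = {}                   # subscription id -> number of distinct terms

    def add(self, sub_id, terms):
        terms = set(terms)
        if not terms:
            raise ValueError("a subscription needs at least one term")
        self.sizes[sub_id] = len(terms)
        for term in terms:
            self.postings[term].add(sub_id)

    def match(self, item_terms, min_fraction=1.0):
        """Ids of subscriptions whose fraction of matched terms reaches min_fraction.

        min_fraction=1.0 means every term of the subscription occurs in the item.
        """
        hits = defaultdict(int)
        for term in set(item_terms):
            # Only the posting lists of terms present in the item are visited.
            for sub_id in self.postings.get(term, ()):
                hits[sub_id] += 1
        return sorted(s for s, n in hits.items() if n >= min_fraction * self.sizes[s])

index = SubscriptionIndex()
index.add("s1", ["python", "release"])
index.add("s2", ["python", "security", "patch"])
index.add("s3", ["football"])

item = "new python security release announced".split()
exact = index.match(item)
partial = index.match(item, min_fraction=0.5)
```

The matching cost depends on the lengths of the posting lists for the item's terms rather than on the total number of subscriptions, which is the usual motivation for this layout.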
We present the AS-Index, a new index structure for exact string search in disk-resident databases. AS-Index relies on a classical inverted file structure; its main innovation is a probabilistic search based on the properties of algebraic signatures, which are used for both n-gram hashing and pattern search. Specifically, the properties of our signatures allow a search to be carried out by inspecting only two of the posting lists. The algorithm thus enjoys the unique feature of requiring a constant number of disk accesses, independent of both the pattern size and the database size. We conduct extensive experiments on large datasets to evaluate the behavior of our index. They confirm that it steadily provides a search performance proportional to the two disk accesses necessary to obtain the posting lists. This makes our structure a choice of interest for the class of applications that require very fast lookups in large textual databases. We describe the index structure, our use of algebraic signatures, and the search algorithm. We discuss the operational trade-offs based on the parameters that affect the behavior of our structure, and present the theoretical and experimental performance analysis. We then compare the AS-Index with state-of-the-art alternatives and show that 1) its construction time matches that of its competitors, due to the similarity of the structures, and 2) its search time consistently outperforms the standard approach, thanks to the economical access to data complemented by signature calculations, which is at the core of our search method.
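The idea of answering a pattern query from just two posting lists plus a signature check can be sketched in memory as follows. This is a toy under stated assumptions and not the AS-Index implementation: a polynomial hash modulo a prime stands in for algebraic signatures over a Galois field, the prefix signatures are kept in a plain list rather than on disk, and the names `ToySignatureIndex`, `BASE` and `MOD` are hypothetical. As in any signature-based scheme, a hit is probabilistic because the matched text is never re-read.

```python
from collections import defaultdict

BASE, MOD = 257, (1 << 61) - 1  # stand-in hash parameters, not taken from the paper

class ToySignatureIndex:
    def __init__(self, text, n=3):
        self.n = n
        self.postings = defaultdict(list)  # n-gram -> start positions in the text
        # prefix[i] = signature of text[:i] = sum(ord(c_j) * BASE**j) mod MOD
        self.prefix = [0]
        power = 1
        for ch in text:
            self.prefix.append((self.prefix[-1] + ord(ch) * power) % MOD)
            power = power * BASE % MOD
        for i in range(len(text) - n + 1):
            self.postings[text[i:i + n]].append(i)

    @staticmethod
    def signature(s):
        sig, power = 0, 1
        for ch in s:
            sig = (sig + ord(ch) * power) % MOD
            power = power * BASE % MOD
        return sig

    def search(self, pattern):
        """Start positions whose window signature equals the pattern signature."""
        m, n = len(pattern), self.n
        if m < n:
            raise ValueError("pattern must be at least n characters long")
        # Only two posting lists are read: first and last n-gram of the pattern.
        first = self.postings.get(pattern[:n], [])
        last = set(self.postings.get(pattern[m - n:], []))
        target = self.signature(pattern)
        hits = []
        for start in first:
            if start + m - n not in last:
                continue
            # prefix[start + m] - prefix[start] = BASE**start * signature(window)
            window = (self.prefix[start + m] - self.prefix[start]) % MOD
            if window == target * pow(BASE, start, MOD) % MOD:
                hits.append(start)
        return hits

index = ToySignatureIndex("abracadabra abracadabra", n=3)
```

For example, `index.search("cadab")` consults only the posting lists of `"cad"` and `"dab"`, whatever the pattern length.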
In this study, we present a framework based on a prediction model that facilitates user access to a number of services in a smart living environment. Users equipped with mobile devices or smart objects must be able to access all available services continuously, without being impacted by technical constraints such as performance or memory issues, regardless of their physical location and mobility. To achieve this goal, we propose the use of a cloudlet-based architecture that serves as distributed cloud resources with specific ranges of influence, together with a real-time processing framework that tracks the events and preferences of end consumers, predicts their requirements, and recommends services to optimize resource utilization and service response time.
Funding: supported by the National Institute of Standards and Technology (NIST).