A novel data stream partitioning method is proposed to resolve the problems of range-aggregation continuous queries over parallel streams in the power industry. The first step of the method is to sample the data in parallel, implemented as an extended reservoir-sampling algorithm. A skip factor based on the change ratio of data values is introduced to describe the distribution characteristics of the data values adaptively. The second step is to partition the stream loads evenly, implemented with two alternative equal-depth histogram generation algorithms that fit different cases: one performs incremental maintenance based on heuristics, and the other performs periodic updates to generate an approximate partition vector. Experimental results on real data show that the method is efficient, practical, and well suited to processing time-varying data streams.
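As a point of reference for the first step, here is a minimal sketch of classic reservoir sampling (Vitter's Algorithm R), which the abstract's extended algorithm builds on; the skip factor and the histogram stage are not reproduced here:

```python
import random

def reservoir_sample(stream, k):
    """Keep a uniform random sample of k items from a stream of
    unknown length, using O(k) memory."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (i + 1)
            j = random.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir

# Example: sample 5 readings from a simulated data stream
print(reservoir_sample(range(10_000), 5))
```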
A novel Hilbert curve is introduced for parallel spatial data partitioning, taking into account the huge volume of spatial information and the variable-length nature of vector data items. Based on the improved Hilbert curve, an algorithm can be designed to achieve almost-uniform spatial data partitioning among multiple disks in parallel spatial databases. Data imbalance is thereby largely avoided, and search and query efficiency are enhanced.
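The paper's improved curve is not specified in the abstract, but the standard Hilbert mapping it builds on can be sketched as follows; `assign_disk` illustrates the declustering idea of spreading consecutive curve segments across disks:

```python
def hilbert_index(n, x, y):
    """Map a point (x, y) on an n x n grid (n a power of two) to its
    position along the Hilbert curve, so that points close in 2-D space
    tend to receive close 1-D indices."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) > 0 else 0
        ry = 1 if (y & s) > 0 else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/reflect the quadrant so every sub-square is traversed
        # in the same canonical orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d

def assign_disk(n, x, y, num_disks):
    """Assign a spatial object to a disk by its Hilbert order."""
    return hilbert_index(n, x, y) % num_disks
```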
To improve the effectiveness of dam safety monitoring database systems, the development process of a multi-dimensional conceptual data model was analyzed and a logical design was achieved in multi-dimensional database mode. The optimal data model was confirmed by identifying data objects, defining relations, and reviewing entities. The conversion of relations among entities into foreign keys, and of entities and physical attributes into tables and fields, is explained in full. On this basis, a multi-dimensional database reflecting how a dam safety monitoring system manages and analyzes monitoring data was established, for which fact tables and dimension tables were designed. Finally, based on the service design and user interface design, the dam safety monitoring system was developed with Delphi as the development tool. The project shows that the multi-dimensional database simplifies the development process and minimizes hidden dangers in the database structure design. It is superior to other dam safety monitoring system development models and offers a new research direction for system developers.
Many classical clustering algorithms perform well under their own prerequisites but do not scale to very large data sets (VLDS). In this work, a novel division-and-partition clustering method (DP) is proposed to solve this problem. DP cuts the source data set into data blocks and extracts an eigenvector from each block to form a local feature set. The local feature set is used in a second round of characteristic aggregation over the source data to find the global eigenvectors. Finally, according to the global eigenvectors, the data are assigned by the minimum-distance criterion. Experimental results show that the method is more robust than conventional clustering algorithms. Its insensitivity to data dimensionality, distribution, and the number of natural clusters gives it a wide range of applications in clustering VLDS.
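A rough sketch of the two-round idea, assuming the data sit in an (n, d) NumPy array; k-means centroids stand in for the eigenvector extraction, which the abstract does not specify:

```python
import numpy as np
from sklearn.cluster import KMeans

def dp_cluster(data, block_size, k_local, k_global):
    """Two-round scheme in the spirit of the DP method: summarize each
    data block by local centers, then cluster those centers to obtain
    global ones, and assign every point by minimum distance."""
    # Round 1: cut the source data into blocks and extract local features
    local_centers = []
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        k = min(k_local, len(block))
        local_centers.append(
            KMeans(n_clusters=k, n_init=10).fit(block).cluster_centers_)
    local_centers = np.vstack(local_centers)

    # Round 2: aggregate the local feature set into global centers
    global_centers = KMeans(
        n_clusters=k_global, n_init=10).fit(local_centers).cluster_centers_

    # Final assignment by the minimum-distance criterion
    dists = np.linalg.norm(
        data[:, None, :] - global_centers[None, :, :], axis=2)
    return np.argmin(dists, axis=1), global_centers
```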
The question of how to choose a copula model that best fits a given dataset is a predominant limitation of the copula approach, and the present study investigates goodness-of-fit tests for multi-dimensional copulas. A goodness-of-fit test based on Rosenblatt's transformation was mathematically extended from two dimensions to three, and procedures for a bootstrap version of the test are provided. Through stochastic copula simulation, an empirical application to historical drought data at the Lintong Gauge Station shows that the goodness-of-fit tests perform well, revealing that both the trivariate Gaussian and Student t copulas are acceptable for modeling the dependence structure of the observed drought duration, severity, and peak. Goodness-of-fit tests for multi-dimensional copulas can provide further support for the application of a wider range of copulas to describe the associations of correlated hydrological variables. However, applying copulas of more than three dimensions requires more demanding computation as well as the exploration and parameterization of the corresponding copulas.
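For reference, the three-dimensional Rosenblatt transformation at the heart of the test takes the standard form below, where C denotes the fitted copula and u_1, u_2, u_3 are the probability-integral transforms of the three variables; under the null hypothesis the Z_i are independent and uniform on (0, 1):

```latex
\[
\begin{aligned}
Z_1 &= u_1, \\
Z_2 &= C(u_2 \mid u_1) = \frac{\partial C(u_1, u_2)}{\partial u_1}, \\
Z_3 &= C(u_3 \mid u_1, u_2)
     = \left.\frac{\partial^2 C(u_1, u_2, u_3)}{\partial u_1 \, \partial u_2}
       \right/ \frac{\partial^2 C(u_1, u_2)}{\partial u_1 \, \partial u_2}.
\end{aligned}
\]
```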
The advent of the digital era has provided unprecedented opportunities for businesses to collect and analyze customer behavior data. Precision marketing, as a key means to improve marketing efficiency, depends heavily on a deep understanding of customer behavior. This study proposes a theoretical framework for multi-dimensional customer behavior analysis, aiming to comprehensively capture customer behavioral characteristics in the digital environment. The framework integrates multi-source data, including transaction history, browsing trajectories, social media interactions, and location information, to construct a theoretically more comprehensive customer profile. The research discusses the potential applications of this framework in precision marketing scenarios such as personalized recommendations, cross-selling, and customer churn prevention. The study points out that multi-dimensional analysis may significantly improve the targeting and theoretical conversion rates of marketing activities. However, it also explores theoretical challenges that may arise in application, such as data privacy and information overload, and proposes corresponding conceptual coping strategies. This study provides a new theoretical perspective on how businesses can optimize marketing decisions using big-data thinking while respecting customer privacy, laying a foundation for future empirical research.
To enable quality scalability and further improve the reconstructed video quality in rate shaping, a rate-distortion optimized packet dropping scheme for H.264 data-partitioned video bitstreams is proposed in this paper. Side information is generated for each video bitstream in advance; during streaming, this side information is exploited by a greedy algorithm to drop partitions in a rate-distortion optimized way. Quality scalability is supported by adopting the data partition, rather than the whole frame, as the dropping unit. Simulation results show that the proposed scheme achieves a large gain in reconstructed video quality over two typical frame-dropping schemes, thanks to the fine granularity of the dropping unit as well as the rate-distortion optimization.
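The greedy selection can be pictured as repeatedly removing the partition with the lowest distortion cost per bit saved. A hedged sketch follows; the `bits` and `distortion` fields model the precomputed side information and are illustrative names, not the paper's:

```python
def rd_optimized_drop(partitions, target_bits):
    """Greedy rate shaping: repeatedly drop the data partition whose
    removal costs the least distortion per bit saved, until the stream
    fits the target rate. Each partition dict carries its (positive)
    size in bits and the distortion its loss would cause."""
    kept = list(partitions)
    dropped = []
    total_bits = sum(p["bits"] for p in kept)
    while total_bits > target_bits and kept:
        # Cost of dropping = distortion increase per bit saved
        victim = min(kept, key=lambda p: p["distortion"] / p["bits"])
        kept.remove(victim)
        dropped.append(victim)
        total_bits -= victim["bits"]
    return kept, dropped
```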
Since its inception in the 1970s, multi-dimensional magnetic resonance (MR) has emerged as a powerful tool for non-invasive investigations of structures and molecular interactions. MR spectroscopy beyond one dimension allows the study of correlation, exchange processes, and the separation of overlapping spectral information. The multi-dimensional concept has been re-implemented over the last two decades to explore molecular motion and spin dynamics in porous media. Apart from the Fourier transform, methods have been developed for processing multi-dimensional time-domain data, identifying fluid components, and estimating pore surface permeability via joint relaxation and diffusion spectra. Through the resolution of spectroscopic signals with spatial encoding gradients, multi-dimensional MR imaging has been widely used to investigate the microscopic environment of living tissues and to distinguish diseases. Signals in each voxel are usually expressed as a multi-exponential decay, representing microstructures or environments across multiple pore scales. Separating the contributions of different environments is a common ill-posed problem, which can be resolved numerically; moreover, the inversion methods and experimental parameters determine the resolution of the multi-dimensional spectra. This paper reviews the algorithms that have been proposed to process multi-dimensional MR datasets in different scenarios. Detailed information at the microscopic level, such as tissue components, fluid types, and food structures in multi-disciplinary sciences, can be revealed through multi-dimensional MR.
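The ill-posed separation problem mentioned above is, in the one-dimensional case, the inversion of a Laplace-type integral; a common regularized formulation (one of several used in practice, not a specific method from the review) is:

```latex
\[
s(t) = \int_0^{\infty} f(T)\, e^{-t/T}\, \mathrm{d}T
\qquad\Longrightarrow\qquad
\hat{f} = \arg\min_{f \ge 0}\;
\lVert K f - s \rVert_2^2 + \alpha \lVert f \rVert_2^2 ,
\]
```

where K is the discretized exponential kernel, f the relaxation-time distribution, and α a regularization parameter; multi-dimensional spectra use Kronecker products of such kernels.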
A partition checkpoint strategy based on data segment priority is presented to meet the timing constraints of data and transactions in embedded real-time main memory database systems (ERTMMDBS), as well as to reduce the number of transactions missing their deadlines and the recovery time. The strategy takes into account the characteristics of the data and of the transactions associated with it; moreover, it partitions the database according to data segment priority and sets a corresponding checkpoint frequency for each partition, which is checkpointed independently. Simulation results show that the partition checkpoint strategy decreases the ratio of transactions missing their deadlines.
Peer-to-peer (P2P) networking is a distributed architecture that partitions tasks or data between peer nodes. In this paper, an efficient Hypercube Sequential Matrix Partition (HS-MP) for data sharing in P2P networks using a tokenizer method is proposed to resolve the problems of larger P2P networks. The availability of data is first measured by the tokenizer using a Dynamic Hypercube Organization, which efficiently coordinates and assists the peers in the P2P network, ensuring data availability at many locations. Each data item in a peer is then assigned a valid ID by the tokenizer using a Sequential Self-Organizing (SSO) ID generation model. This ensures data sharing with other nodes in a large P2P network within a minimum time interval, obtained through the proximity of data availability. To validate the HS-MP framework, its performance is evaluated using traffic traces collected from data sharing applications. Simulations conducted with Network Simulator-2 show that the proposed framework outperforms conventional streaming models. The performance of the proposed system is analyzed in terms of energy consumption, average latency, and average data availability rate with respect to the number of peer nodes, data size, amount of data shared, and execution time. The proposed method reduces energy consumption by 43.35% for transpose traffic, 35.29% for bitrev traffic, and 25% for bitcomp traffic patterns.
Building on the advantages of the MapReduce programming model for parallel computing and for processing data and tasks on large-scale clusters, a data-aware partitioning schema in MapReduce for large-scale high-dimensional data is proposed. It optimizes the partitioning of data into blocks that make the same contribution to the computation in MapReduce. Using a two-stage data partitioning strategy, the data are uniformly distributed into data blocks by clustering and partitioning. Experiments show that the data-aware partitioning schema is effective and extensible for improving the query efficiency of high-dimensional data.
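The abstract does not fix the exact two-stage algorithm; one plausible reading, clustering similar points into micro-clusters and then packing them into balanced blocks, can be sketched as:

```python
import numpy as np
from sklearn.cluster import KMeans

def data_aware_partition(data, num_blocks, num_microclusters=64):
    """Two-stage sketch of a data-aware partition: (1) group similar
    points into micro-clusters, (2) greedily pack micro-clusters into
    the currently lightest block so block sizes stay balanced."""
    k = min(num_microclusters, len(data))
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(data)
    clusters = [np.flatnonzero(labels == c) for c in range(k)]
    clusters.sort(key=len, reverse=True)        # pack largest first
    blocks = [[] for _ in range(num_blocks)]
    loads = np.zeros(num_blocks, dtype=int)
    for members in clusters:
        b = int(np.argmin(loads))               # lightest block so far
        blocks[b].extend(members.tolist())
        loads[b] += len(members)
    return blocks
```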
Steganalysis is a technique for detecting the existence of secret information embedded in cover media such as images and videos. With today's higher Internet speeds, video has become a principal medium for transferring information. The latest video coding standard, High Efficiency Video Coding (HEVC), shows better coding performance than the earlier H.264/AVC standard, so since HEVC was published, HEVC videos have been widely used as carriers of hidden information. In this paper, a steganalysis algorithm is proposed to detect the latest HEVC video steganography method, which is based on the modification of Prediction Unit (PU) partition modes. To detect the embedded data, all PU partition modes are extracted from P pictures, and the probability of each PU partition mode in cover videos and stego videos is adopted as the classification feature. Furthermore, feature optimization is applied so that the 25-dimensional steganalysis feature is reduced to a 3-dimensional one. A Support Vector Machine (SVM) is then used to identify stego videos. Experimental results demonstrate that the proposed steganalysis algorithm can effectively detect stego videos, achieving much higher classification accuracy than state-of-the-art work.
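A minimal sketch of the feature-and-classifier pipeline described above, assuming the PU partition modes have already been parsed from the P pictures; the mode alphabet and the use of the full histogram (rather than the paper's optimized 3-dimensional feature) are simplifications:

```python
import numpy as np
from collections import Counter
from sklearn.svm import SVC

# Illustrative mode alphabet; HEVC inter PUs use partitions such as
# 2Nx2N, 2NxN, Nx2N, NxN and the asymmetric (AMP) modes.
PU_MODES = ["2Nx2N", "2NxN", "Nx2N", "NxN",
            "2NxnU", "2NxnD", "nLx2N", "nRx2N"]

def mode_histogram(pu_modes):
    """Turn a video's list of PU partition modes into a normalized
    probability vector, the classification feature."""
    counts = Counter(pu_modes)
    total = max(len(pu_modes), 1)
    return np.array([counts[m] / total for m in PU_MODES])

def train_detector(features, y):
    """features: one histogram per training video; y: 0 = cover, 1 = stego."""
    return SVC(kernel="rbf").fit(np.vstack(features), y)
```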
Efficient data management in healthcare is essential for providing timely and accurate patient care, yet traditional partitioning methods in relational databases often struggle with the high volume, heterogeneity, and regulatory complexity of healthcare data. This research introduces a tailored partitioning strategy leveraging the MD5 hashing algorithm to enhance data insertion, query performance, and load balancing in healthcare systems. By applying a consistent hash function to patient IDs, our approach achieves uniform distribution of records across partitions, optimizing retrieval paths and reducing access latency while ensuring data integrity and compliance. We evaluated the method through experiments focusing on partitioning efficiency, scalability, and fault tolerance. The partitioning efficiency analysis compared our MD5-based approach with standard round-robin methods, measuring insertion times, query latency, and data distribution balance. Scalability tests assessed system performance across increasing dataset sizes and varying partition counts, while fault tolerance experiments examined data integrity and retrieval performance under simulated partition failures. The experimental results demonstrate that the MD5-based partitioning strategy significantly reduces query retrieval times by optimizing data access patterns, achieving up to X% better performance compared to round-robin methods. It also scales effectively with larger datasets, maintaining low latency and ensuring robust resilience under failure scenarios. This novel approach offers a scalable, efficient, and fault-tolerant solution for healthcare systems, facilitating faster clinical decision-making and improved patient care in complex data environments.
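The core routing rule is straightforward to state in code; a minimal sketch, with the partition count as a free parameter:

```python
import hashlib

def partition_for(patient_id: str, num_partitions: int) -> int:
    """Map a patient ID to a partition by hashing it with MD5 and
    taking the digest modulo the partition count; the near-uniform
    MD5 output is what yields the even record distribution."""
    digest = hashlib.md5(patient_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions

# Example: route a record to one of 16 partitions
print(partition_for("patient-000123", 16))
```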
In order to discover the main causes of elevator group accidents in an edge computing environment, a multi-dimensional data model of elevator accident data is established using data cube technology, and a method combining the classical Apriori algorithm with the model is proposed and implemented, mining frequent itemsets in the elevator accident data to explore the main reasons for elevator accidents. In addition, a collaborative edge model for elevator accidents is set up to achieve data sharing, making it possible to inspect the details behind each cause and confirm the causes of elevator accidents. Finally, association rules are applied to find the patterns of elevator accidents.
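A compact version of the classical Apriori algorithm used in the method, with a toy accident-record example; the record fields are invented for illustration:

```python
def apriori(transactions, min_support):
    """Plain Apriori: grow frequent itemsets one item at a time,
    pruning candidates whose support falls below min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    current = {frozenset([i]) for t in transactions for i in t}
    current = {s for s in current if support(s) >= min_support}
    frequent = {}
    k = 1
    while current:
        frequent.update({s: support(s) for s in current})
        # Join step: build (k+1)-item candidates from unions of k-itemsets
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k + 1}
        current = {c for c in candidates if support(c) >= min_support}
        k += 1
    return frequent

# Toy example: each record is the set of attributes of one accident
records = [{"door fault", "worn roller", "minor injury"},
           {"door fault", "worn roller", "major injury"},
           {"overload", "brake failure", "major injury"}]
print(apriori(records, min_support=0.6))
```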
Similarity measure design for discrete data groups is proposed, and similarity measure design for continuous membership functions is also carried out. The proposed similarity measures are designed on the basis of fuzzy numbers and distance measures, and are proved. To calculate the degree of similarity of discrete data, the relative degree between each datum and the total distribution is obtained, and the discrete-data similarity measure is completed by combining these relative degrees. A power interconnected system with multiple characteristics is considered as an application of the discrete similarity measure. The similarity measure is then naturally extended to the multi-dimensional case and applied to a bus clustering problem.
This paper focuses on the design process for reconfigurable architectures. Our contribution is a new temporal partitioning algorithm, based on a typical mathematical flow for solving the temporal partitioning problem. The algorithm optimizes the transfer of data required between design partitions and the reconfiguration overhead. Results show that our algorithm considerably decreases the communication cost and the latency compared with other well-known algorithms.
Outlier detection is an important task in data mining. In fact, it is difficult to find the clustering centers in some sophisticated multidimensional datasets and to measure the deviation degree of each potential outlier. In this work, an effective outlier detection method based on multi-dimensional clustering and local density (ODBMCLD) is proposed. ODBMCLD first identifies the center objects by the local density peaks of the data objects and clusters the whole dataset around these centers. Then, outlier objects belonging to different clusters are marked as candidate abnormal data. Finally, the top N points among these candidates, those with the highest outlier factors, are chosen as the final anomaly objects. The feasibility and effectiveness of the method are verified by experiments.
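A sketch of this pipeline under stated assumptions: centers are density peaks in the usual sense (high local density rho and large distance delta to any denser point), and the outlier factor is taken as the distance to the assigned center; the paper's exact factor is not given in the abstract:

```python
import numpy as np

def top_n_outliers(data, d_c, n_centers, n_outliers):
    """Pick cluster centers as density peaks, assign every point to
    its nearest center, and report the N points farthest from their
    own center as outliers. data: (n, d) array; d_c: cutoff radius."""
    dist = np.linalg.norm(data[:, None, :] - data[None, :, :], axis=2)
    rho = (dist < d_c).sum(axis=1) - 1        # local density (cutoff kernel)
    delta = np.empty(len(data))
    for i in range(len(data)):
        denser = np.flatnonzero(rho > rho[i])
        delta[i] = dist[i, denser].min() if denser.size else dist[i].max()
    centers = np.argsort(rho * delta)[-n_centers:]   # density peaks
    nearest = centers[np.argmin(dist[:, centers], axis=1)]
    outlier_factor = dist[np.arange(len(data)), nearest]
    return np.argsort(outlier_factor)[-n_outliers:]
```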
The amount of data available for decision making has increased tremendously in the age of the digital economy. Decision makers who fail to manipulate the data proficiently may make incorrect decisions and thereby harm their business. Thus, extracting and classifying useful information efficiently and effectively from huge amounts of computational data is of special importance. In this paper, we consider that the attributes of data can be both crisp and fuzzy. By examining suitable partial data, segments with different classes are formed; a multithreaded computation is then performed to generate crisp rules (where possible); and finally, the fuzzy partition technique is employed to handle the fuzzy attributes for classification. The rules generated in classifying the overall data can be used to gain more knowledge from the data collected.
Online Transaction Processing (OLTP) systems rely on data partitioning to achieve better performance and scalability, and the primary objective of database and application developers is to provide scalable and reliable database systems. This research presents a novel method of data partitioning and load balancing for scalable transactions. Data is efficiently partitioned using a hybrid graph partitioning method, and an optimized load balancing (OLB) approach is applied to calculate the weight factor, average workload, and partition efficiency. The presented approach suits various online data transaction applications. Its quality is examined using an OLTP database benchmark, where the proposed methodology significantly outperforms alternatives on metrics such as throughput, response time, and CPU utilization.
The paper presents a novel Graphics Interchange Format (GIF) steganography system. The algorithm applies a secured, variable image-partition scheme for data embedding to an animated GIF video file. The secret data can be any text, image, audio file, or video file, converted into a bit stream. The algorithm estimates the capacity of the cover GIF image frames to embed data bits. Our method builds variable partition blocks in a separate empty frame and incorporates them into randomly selected GIF frames, so that each selected GIF frame is divided into the same variable blocks as the empty frame. The algorithm then embeds the secret data into appropriate pixels of the GIF frame. Each partition block selected for data embedding can store a different number of data bits depending on its block size, so intruders can never learn the exact position of the secret data in the stego frame. All the GIF frames are then rebuilt to form the animated stego GIF video. The performance of the proposed GIF algorithm has been experimentally evaluated on different input parameters, such as Mean Square Error (MSE) and Peak Signal-to-Noise Ratio (PSNR) values. The results are compared with some existing methods, and our method shows promising results.
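For completeness, the PSNR metric used in the evaluation, computed from the MSE between a cover frame and its stego counterpart:

```python
import numpy as np

def psnr(cover, stego, peak=255.0):
    """Peak signal-to-noise ratio (dB) between a cover frame and its
    stego version; higher values mean less visible embedding distortion."""
    mse = np.mean((cover.astype(np.float64) - stego.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")          # identical frames
    return 10.0 * np.log10(peak ** 2 / mse)
```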