Funding: the Fifth Batch of Innovation Teams of Wuzhou Meteorological Bureau "Wuzhou Innovation Team for Enhancing the Comprehensive Meteorological Observation Ability through Digitization and Intelligence"; Wuzhou Science and Technology Planning Project (202402122, 202402119).
Abstract: [Objective] In response to the insufficient integrity of hourly routine meteorological element data files, this paper aims to improve the availability and reliability of the data files and to provide high-quality data file support for meteorological forecasting and services. [Method] An efficient and accurate method for data file quality control and fusion processing is developed. After locating the times of missing measurements, data are extracted from the "AWZ.db" database and the minute routine meteorological element data file, and merged into the hourly routine meteorological element data file. [Result] Data processing efficiency and accuracy are significantly improved, and the problem of incomplete hourly routine meteorological element data files is solved. The paper also emphasizes the importance of ensuring the accuracy of the source files and of carefully checking and verifying the fusion results, and proposes strategies to improve data quality. [Conclusion] This method is convenient for observation personnel and effectively improves the integrity and accuracy of data files. In the future, it is expected to provide more reliable data support for meteorological forecasting and services.
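The paper's code and the AWZ.db schema are not published, so the Python sketch below only illustrates the described workflow: locate the missing hours, query the station database for replacement records, and hand them back for merging into the hourly file. The table and column names (hourly_obs, obs_time, temperature, ...) are assumptions.

```python
import sqlite3
from datetime import datetime, timedelta

def find_missing_hours(observed_hours, day):
    """Return the hourly timestamps of `day` absent from the hourly file."""
    expected = {datetime(day.year, day.month, day.day) + timedelta(hours=h)
                for h in range(24)}
    return sorted(expected - set(observed_hours))

def fetch_from_awz(conn, missing):
    """Pull replacement records for the missing hours from AWZ.db
    (hypothetical schema)."""
    recovered = {}
    for ts in missing:
        cur = conn.execute(
            "SELECT temperature, humidity, pressure FROM hourly_obs "
            "WHERE obs_time = ?",
            (ts.strftime("%Y-%m-%d %H:%M:%S"),))
        row = cur.fetchone()
        if row is not None:
            recovered[ts] = row          # found in the station database
    return recovered

# Example: hour 13 is missing from the hourly element file for 2024-05-01.
observed = [datetime(2024, 5, 1, h) for h in range(24) if h != 13]
with sqlite3.connect("AWZ.db") as conn:
    missing = find_missing_hours(observed, datetime(2024, 5, 1).date())
    rows = fetch_from_awz(conn, missing)  # merge these into the hourly file
```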
Abstract: In this paper, we analyze the complexity and entropy of different data compression algorithms: LZW, Huffman, fixed-length code (FLC), and Huffman applied after fixed-length code (HFLC). We test these algorithms on files of different sizes and conclude that LZW performs best at every compression scale we tested, especially on large files, followed by Huffman, HFLC, and FLC, respectively. Data compression remains an important research topic with many applications. We therefore suggest continuing work in this field, combining two techniques to obtain a better one, or using another source mapping (Hamming), such as embedding a linear array into a hypercube, together with proven techniques such as Huffman.
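As a concrete reference point for the entropy analysis, the following snippet computes the Shannon entropy of a byte stream, which lower-bounds how far any of the four coders can compress a given file. The helper is ours, not the paper's; the test files and figures are the paper's.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: H = -sum(p_i * log2(p_i))."""
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

text = b"abracadabra abracadabra"
h = shannon_entropy(text)
# A file of n bytes cannot be losslessly compressed below about n * h / 8
# bytes, the baseline the LZW/Huffman/FLC/HFLC comparison is measured against.
print(f"{h:.3f} bits/byte; lower bound ~ {len(text) * h / 8:.1f} bytes")
```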
Funding: Supported by the Natural Science Foundation of Hubei Province (2012FFC034, 2014CFC1100).
Abstract: In this paper, we present a distributed multi-level cache system based on cloud storage, which addresses the low access efficiency of small spatio-temporal data files in the information service system of a Smart City. Taking the classification attributes of small spatio-temporal data files as the basis for cache content selection, the cache system adopts different cache-pool management strategies at different cache levels. Experimental results on a prototype system indicate that the multi-level cache effectively increases the access bandwidth for small spatio-temporal files and greatly improves the service quality of multiple concurrent accesses.
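A minimal sketch of the multi-level idea: two cache pools with different management strategies, a small hot level and a larger backing level, with promotion on hit and demotion on overflow. The paper's actual per-level policies and classification-based selection are not specified here, so LRU and FIFO are assumptions.

```python
from collections import OrderedDict

class TwoLevelCache:
    """Illustrative two-level cache: a small LRU level over a larger FIFO level."""
    def __init__(self, l1_size=4, l2_size=16):
        self.l1 = OrderedDict()          # hot level, LRU
        self.l2 = OrderedDict()          # warm level, FIFO
        self.l1_size, self.l2_size = l1_size, l2_size

    def get(self, key):
        if key in self.l1:
            self.l1.move_to_end(key)     # refresh LRU position
            return self.l1[key]
        if key in self.l2:               # promote from L2 to L1
            value = self.l2.pop(key)
            self.put(key, value)
            return value
        return None                      # miss: caller fetches from cloud storage

    def put(self, key, value):
        self.l1[key] = value
        self.l1.move_to_end(key)
        if len(self.l1) > self.l1_size:  # demote the LRU victim to L2
            old_key, old_val = self.l1.popitem(last=False)
            self.l2[old_key] = old_val
            if len(self.l2) > self.l2_size:
                self.l2.popitem(last=False)   # FIFO eviction

cache = TwoLevelCache()
cache.put("tile_42", b"...")             # a small spatio-temporal file
assert cache.get("tile_42") is not None
```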
Funding: Supported by the National Natural Science Foundation of China (60672089, 60772075).
Abstract: The consultative committee for space data systems (CCSDS) file delivery protocol (CFDP) recommendation for reliable transmission gives no detailed transmission procedure or delay calculation for the prompted negative-acknowledgment and asynchronous negative-acknowledgment models. CFDP is designed to provide data and storage management, store and forward, custody transfer, and reliable end-to-end delivery over deep-space links characterized by huge latency, intermittent connectivity, asymmetric bandwidth, and high bit error rate (BER). Four reliable transmission models are analyzed, and the expected file-delivery time is calculated for different transmission rates, numbers and sizes of packet data units, BERs, frequencies of external events, etc. By comparing the four CFDP models, the BER requirement for typical deep-space missions is obtained, and rules for choosing CFDP models under different uplink state information are given, which provides a reference for protocol model selection, utilization, and modification.
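The expected-delivery-time calculation can be illustrated with a simplified NAK/retransmission model. This is not the paper's exact derivation for the four CFDP models; the loss model (independent bit errors) and the RTT accounting are simplifying assumptions.

```python
def pdu_loss_probability(ber: float, pdu_bytes: int) -> float:
    """A PDU is lost if any of its bits is corrupted."""
    return 1.0 - (1.0 - ber) ** (pdu_bytes * 8)

def expected_delivery_time(file_bytes, pdu_bytes, rate_bps, rtt_s, ber):
    """Rough expected file-delivery time under a NAK/retransmit model."""
    n_pdus = -(-file_bytes // pdu_bytes)            # ceiling division
    p = pdu_loss_probability(ber, pdu_bytes)
    # Each PDU needs on average 1/(1-p) transmissions (geometric distribution).
    total_tx = n_pdus / (1.0 - p)
    tx_time = total_tx * pdu_bytes * 8 / rate_bps
    # Retransmission rounds for lost PDUs each cost at least one extra RTT.
    retries = total_tx - n_pdus
    return tx_time + rtt_s * (1 + retries / max(n_pdus, 1))

# 1 MB file, 1 KB PDUs, 1 Mbit/s link, 10 min Mars-like RTT, BER = 1e-6
print(expected_delivery_time(1_000_000, 1024, 1e6, 600, 1e-6))
```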
Funding: Supported by the National Natural Science Foundation of China (No. 50475117) and the Tianjin Natural Science Foundation (No. 06YFJMJC03700).
Abstract: Integrating heterogeneous data sources is a precondition for data sharing among enterprises. Highly efficient data updating can both save system expense and offer real-time data, and rapid data modification in the pre-processing area of the data warehouse is a hot issue. An extract-transform-load design is proposed based on a new algorithm called Diff-Match, developed using schema matching and data-filtering technology. It accelerates data renewal, filters the heterogeneous data, and identifies the differing data sets. Its efficiency has been proved by successful application in an electric apparatus group enterprise.
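A toy version of the change-detection step conveys the idea: compare the freshly extracted rows against the warehouse snapshot by key and emit only inserts, updates, and deletes. Field names and the exact matching logic of Diff-Match are assumptions here.

```python
def diff_match(source_rows, warehouse_rows, key="id"):
    """Toy change detection in the spirit of Diff-Match: emit only deltas."""
    src = {r[key]: r for r in source_rows}
    dst = {r[key]: r for r in warehouse_rows}
    inserts = [r for k, r in src.items() if k not in dst]
    deletes = [r for k, r in dst.items() if k not in src]
    updates = [r for k, r in src.items() if k in dst and r != dst[k]]
    return inserts, updates, deletes

source = [{"id": 1, "v": "a"}, {"id": 2, "v": "b2"}, {"id": 3, "v": "c"}]
warehouse = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
ins, upd, dels = diff_match(source, warehouse)
# Only `ins` and `upd` need loading, so the warehouse refreshes incrementally.
```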
Funding: Supported by ZTE Industry-Academia-Research Cooperation Funds.
Abstract: Data layout in a file system is the organization of data stored on external storage, and it has a huge impact on the performance of storage systems. We survey the three main kinds of data layout in traditional file systems: in-place-update file systems, log-structured file systems, and copy-on-write file systems. Each has its own strengths and weaknesses under different circumstances. We also include a recent persistent layout in a file system that combines flash memory with byte-addressable non-volatile memory. From this survey, we conclude that persistent data layout in file systems may evolve dramatically in the era of emerging non-volatile memory.
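The three layouts differ mainly in where an update lands. The sketch below contrasts them on a toy block array; it illustrates the disciplines, not any particular file system's code.

```python
# Minimal sketches of the three update disciplines on a flat "disk" of blocks.
disk = ["A0", "B0", "C0", None, None, None]

def update_in_place(idx, data):
    """In-place update (e.g., ext2-style): overwrite the old block directly."""
    disk[idx] = data

def update_log_structured(log_head, data):
    """Log-structured (LFS-style): always append at the log head; the old
    block becomes garbage for a cleaner to reclaim."""
    disk[log_head] = data
    return log_head + 1      # new head; an index maps file -> latest block

def update_copy_on_write(idx, free_idx, data):
    """Copy-on-write (btrfs/ZFS-style): write a new copy, then atomically
    repoint the metadata; the old block stays valid until the switch."""
    disk[free_idx] = data
    return free_idx          # caller flips the block pointer to free_idx

update_in_place(0, "A1")                 # disk[0] overwritten
head = update_log_structured(3, "B1")    # B1 appended at slot 3, B0 now garbage
ptr = update_copy_on_write(2, 4, "C1")   # C1 at slot 4, pointer flips 2 -> 4
```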
Funding: Funded by the National Natural Science Foundation of China under Grant No. 62362019 and the Hainan Provincial Natural Science Foundation of China under Grant No. 624RC482.
Abstract: Metadata prefetching and data placement play a critical role in enhancing access performance for file systems operating over wide-area networks. However, developing effective strategies for metadata prefetching under concurrent workloads and for data placement across distributed networks remains a significant challenge. This study introduces novel and efficient methodologies for metadata prefetching and data placement, leveraging fine-grained control of prefetching strategies and variable-sized data-fragment writing to optimize the I/O bandwidth of distributed file systems. The proposed metadata prefetching technique employs dynamic workload analysis to identify dominant workload patterns and adaptively refines prefetching policies, thereby boosting metadata access efficiency in concurrent scenarios. Meanwhile, the data placement strategy improves write performance by storing data fragments locally within the nearest data center and transmitting only the fragment-location metadata to the remote data center hosting the original file. Experimental evaluations using real-world system traces demonstrate that the proposed approaches reduce metadata access times by up to 33.5% and application data access times by 17.19% compared to state-of-the-art techniques.
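A minimal sketch of workload-adaptive prefetching: observe a sliding window of metadata requests, detect a dominant pattern, and switch the prefetch policy accordingly. The pattern classes and the 0.7 threshold are illustrative assumptions, not the paper's parameters.

```python
from collections import Counter, deque

class AdaptivePrefetcher:
    """Illustrative workload-adaptive metadata prefetching."""
    def __init__(self, window=64):
        self.history = deque(maxlen=window)   # sliding window of request paths

    def record(self, path):
        self.history.append(path)

    def dominant_pattern(self):
        if not self.history:
            return ("mixed", None)
        dirs = [p.rsplit("/", 1)[0] for p in self.history]
        top_dir, hits = Counter(dirs).most_common(1)[0]
        if hits / len(dirs) > 0.7:
            return ("directory-scan", top_dir)   # prefetch the whole directory
        return ("mixed", None)                   # fall back to a conservative policy

    def prefetch_candidates(self, listdir):
        kind, top_dir = self.dominant_pattern()
        if kind == "directory-scan":
            return listdir(top_dir)              # fetch sibling metadata eagerly
        return []

pf = AdaptivePrefetcher()
for i in range(50):
    pf.record(f"/data/logs/file{i}.meta")
print(pf.dominant_pattern())    # ('directory-scan', '/data/logs')
```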
Funding: Supported in part by the Beijing Natural Science Foundation under grants M21032 and 19L2029, in part by the National Natural Science Foundation of China under grants U1836106 and 81961138010, and in part by the Scientific and Technological Innovation Foundation of Foshan under grants BK21BF001 and BK20BF010.
Abstract: Nowadays, short texts are widely found in social data related to the 5G-enabled Internet of Things (IoT). Short text classification is a challenging task due to sparsity and lack of context. Previous studies mainly tackle these problems by enhancing either the semantic information or the statistical information individually. However, the improvement achievable from a single type of information is limited, while fusing various information may improve classification accuracy more effectively. To fuse various information for short text classification, this article proposes a feature fusion method that integrates the statistical feature and a comprehensive semantic feature by using a weighting mechanism and deep learning models. In the proposed method, we apply Bidirectional Encoder Representations from Transformers (BERT) to generate word vectors at the sentence level automatically, and then obtain the statistical feature, the local semantic feature, and the overall semantic feature using the Term Frequency-Inverse Document Frequency (TF-IDF) weighting approach, a Convolutional Neural Network (CNN), and a Bidirectional Gated Recurrent Unit (BiGRU). The fusion feature is then obtained for classification. Experiments conducted on five popular short text classification datasets and a 5G-enabled IoT social dataset show that the proposed method effectively improves classification performance.
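A condensed PyTorch sketch of the described fusion, assuming BERT word vectors are precomputed: a CNN branch extracts the local semantic feature, a BiGRU extracts the overall semantic feature, and both are concatenated with a TF-IDF statistical vector before the classifier. All dimensions are illustrative, not the paper's settings.

```python
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    """Sketch of statistical + semantic feature fusion over BERT word vectors."""
    def __init__(self, bert_dim=768, tfidf_dim=1000, n_classes=5):
        super().__init__()
        self.conv = nn.Conv1d(bert_dim, 128, kernel_size=3, padding=1)
        self.bigru = nn.GRU(bert_dim, 64, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(128 + 2 * 64 + tfidf_dim, n_classes)

    def forward(self, bert_vecs, tfidf_vec):
        # bert_vecs: (batch, seq_len, bert_dim); tfidf_vec: (batch, tfidf_dim)
        local = self.conv(bert_vecs.transpose(1, 2)).max(dim=2).values  # CNN + max-pool
        _, h = self.bigru(bert_vecs)                    # h: (2, batch, 64)
        overall = torch.cat([h[0], h[1]], dim=1)        # both GRU directions
        fused = torch.cat([local, overall, tfidf_vec], dim=1)   # feature fusion
        return self.fc(fused)

model = FusionClassifier()
logits = model(torch.randn(8, 32, 768), torch.randn(8, 1000))  # (8, n_classes)
```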
Funding: Supported by the Beforehand Research for National Defense of China (94J3.4.2.JW0515).
Abstract: Integration between file systems and multidatabase systems is a necessary approach to support data sharing from distributed and heterogeneous data sources. We first analyze the problems of data integration between file systems and multidatabase systems. Then a common, XML-oriented data model named XIDM (XML-based Integrating Data Model) is presented. XIDM is based on a series of XML standards, especially XML Schema, and can describe semistructured data well, making it highly practical and versatile.
Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (2009AA01Z434) and the "Core Electronic Devices, High-End General Chip, and Fundamental Software" Major Project (2013JH00103).
Abstract: Protecting the security of sensitive information has become a matter of great concern to everyone. Data hiding solves the problem to some extent, but shortcomings remain. To improve the capability of hiding huge data files on disk with high efficiency, this paper proposes a novel approach called CryptFS, which hides data by exploiting the file access mechanism and modifying the cluster chain structure. CryptFS can hide a gigabyte-sized data file in less than 0.1 s. The time needed to hide and recover data is independent of the file size, and the reliability of the hidden file is high: it will not be overwritten by newly created files or by disk defragmentation.
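The cluster-chain idea can be modeled in a few lines: the hidden file's clusters remain marked as allocated in the table, but the directory entry pointing at the chain's head is removed, so ordinary tools neither list the file nor reclaim its clusters. This is a toy model of the mechanism, not CryptFS's on-disk implementation.

```python
# Toy FAT-like cluster-chain simulation of the hiding idea.
FREE, EOF = 0, -1

fat = {1: 2, 2: 3, 3: EOF, 4: FREE, 5: FREE}   # cluster -> next cluster
directory = {"report.txt": 1}                   # filename -> head cluster

def hide(name, stash):
    head = directory.pop(name)   # remove the visible directory entry
    stash[name] = head           # keep the head out-of-band (e.g., encrypted)
    # The chain 1 -> 2 -> 3 stays allocated in `fat`, so the allocator
    # will not hand these clusters to newly created files.

def recover(name, stash):
    directory[name] = stash.pop(name)   # relink the saved head cluster

stash = {}
hide("report.txt", stash)
assert "report.txt" not in directory and fat[1] == 2   # data chain intact
recover("report.txt", stash)
```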
Funding: Supported by the Major Program of the Shanghai Science and Technology Commission under Grant No. 10DZ1500200 and the Collaborative Applied Research and Development Project between Morgan Stanley and Shanghai Jiao Tong University, China.
Abstract: Nowadays, an increasing number of people choose to outsource their computing and storage demands to the Cloud. To ensure the integrity of data in the untrusted Cloud, especially dynamic files that can be updated online, we propose an improved dynamic provable data possession model. We use homomorphic tags to verify the integrity of a file, and hash values generated from secret values and tags to prevent replay and forgery attacks. Compared with previous works, our proposal reduces the computational and communication complexity from O(log n) to O(1). We conducted experiments to confirm this improvement and extended the model to the file-sharing setting.
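Why homomorphic tags allow constant-size verification can be seen in a toy example: one aggregated response and one equation cover an arbitrary subset of blocks. Real PDP schemes additionally bind tags to block indices and sign the secrets; those parts are omitted here.

```python
import random

# Toy provable-data-possession check with multiplicatively homomorphic tags.
p = 2**127 - 1          # a Mersenne prime, fine for a demo
g = 3

blocks = [17, 42, 99, 7]                     # file blocks as integers
tags = [pow(g, m, p) for m in blocks]        # tags stored alongside the file

# Verifier sends random challenge coefficients for a subset of blocks.
idx = [0, 2, 3]
coeff = [random.randrange(1, 1 << 16) for _ in idx]

# Prover (the cloud) answers with one aggregated value over the real data.
mu = sum(a * blocks[i] for a, i in zip(coeff, idx))

# Verifier checks the single equation prod(tag_i^a_i) == g^mu (mod p).
lhs = 1
for a, i in zip(coeff, idx):
    lhs = lhs * pow(tags[i], a, p) % p
assert lhs == pow(g, mu, p)                  # passes only if blocks are intact
```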
Abstract: In the last decade, building energy simulation (BES) became a central component in the design and optimization of building energy systems. For each building location, BES requires one year of hourly weather data. Most buildings are designed to last 50+ years; consequently, the building design phase should include BES with future weather files that account for climate change. This paper presents a comparative study of two methods for producing future-climate hourly data files for BES: morphing and the typical meteorological year of future climate (F-TMY). The study uses data from a high-resolution (9 km) regional climate atmospheric model simulation of Iberia, spanning 10 years of historical and future hourly data. It compares both methods by analyzing anomalies in air temperature and the impact on BES predictions of annual and peak energy consumption for space heating, cooling, and ventilation in four buildings. Additionally, a sensitivity analysis of the morphing method is performed. The analysis shows that F-TMY is representative of the multi-year simulation for BES applications. A high-quality morphed TMY weather file performs similarly to F-TMY (average difference: 8% versus 7%). Morphing based on different baseline climates, low grid resolution, and/or outdated climate projections leads to BES average differences of 16%–20%.
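Morphing is commonly described as a "shift and stretch" of each baseline weather variable using monthly climate-change signals. The sketch below applies that transform to a toy temperature series; the delta and alpha values are illustrative, not the paper's projections.

```python
# Morphing "shift and stretch" of an hourly temperature series:
#   T_future(h) = T(h) + delta_m + alpha_m * (T(h) - T_mean_m)
# where delta_m is the monthly mean change and alpha_m scales the variance,
# both taken from climate-model projections.
def morph_hour(t_obs, t_month_mean, delta_m, alpha_m):
    return t_obs + delta_m + alpha_m * (t_obs - t_month_mean)

january_hours = [2.0, 4.5, 7.0, 5.5]     # observed baseline temperatures (degC)
t_mean = sum(january_hours) / len(january_hours)
delta, alpha = 1.8, 0.1                   # assumed shift and stretch for January
morphed = [morph_hour(t, t_mean, delta, alpha) for t in january_hours]
# The monthly mean rises by delta while daily variability stretches by alpha,
# which is how a baseline TMY is pushed toward the future climate.
```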
Funding: Supported by the National Natural Science Foundation of China (62072325), the Shanxi Scholarship Council of China (HGKY2019081), the Fundamental Research Program of Shanxi Province (202103021224272), and TYUST SRIF (20212039).
Abstract: Data hiding (DH) is an important technology for securely transmitting secret data in networks, and it has increasingly become a research hotspot throughout the world. However, for Joint Photographic Experts Group (JPEG) images, existing data hiding schemes struggle to balance embedding capacity, visual quality, and file-size increment. To deal with this problem, a high-imperceptibility data hiding scheme for JPEG images is proposed based on direction modification. First, the scheme sorts all quantized discrete cosine transform (DCT) blocks in ascending order by the number of non-consecutive-zero alternating current (AC) coefficients. It then selects non-consecutive-zero AC coefficients with absolute values less than or equal to 1 at the same frequency position in two adjacent blocks for pairing. Finally, 2 bits of secret data can be embedded into each coefficient pair by using the filled reference matrix and the designed direction modification rules. Experiments were conducted on 5 standard test images and 1000 images from the BOSSbase dataset. The results show that the visual quality of the proposed scheme improves by 1–4 dB over the comparison schemes, and the file-size increment is reduced to at most 15% of that of the comparison schemes.
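The block-sorting and coefficient-pairing steps can be sketched as follows. The count of non-consecutive-zero AC coefficients is simplified here to a count of non-zero AC coefficients, and the final 2-bit embedding via the reference matrix and direction rules is omitted, so this shows candidate-carrier selection only.

```python
import numpy as np

def nonzero_ac_count(block):
    """Non-zero AC coefficients in one 8x8 quantized DCT block
    (index 0 of the flattened block is treated as the DC coefficient)."""
    flat = block.flatten()
    return int(np.count_nonzero(flat[1:]))

def candidate_pairs(blocks):
    """Sort blocks ascending by non-zero AC count, then pair coefficients with
    |value| <= 1 at matching frequencies in adjacent blocks."""
    order = sorted(range(len(blocks)), key=lambda i: nonzero_ac_count(blocks[i]))
    pairs = []
    for a, b in zip(order[::2], order[1::2]):
        for pos in range(1, 64):                      # skip the DC term
            va = blocks[a].flatten()[pos]
            vb = blocks[b].flatten()[pos]
            if va != 0 and vb != 0 and abs(va) <= 1 and abs(vb) <= 1:
                pairs.append((a, b, pos))             # one 2-bit carrier
    return pairs

blocks = [np.random.randint(-2, 3, (8, 8)) for _ in range(6)]
print(len(candidate_pairs(blocks)), "coefficient pairs available")
```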
Abstract: In this paper, an energy-conserving Hadoop Distributed File System (HDFS) oriented to massive news data is proposed based on news access patterns, in order to reduce the energy consumption of big news data storage systems. First, all data nodes are divided into real-time-responding hot data nodes and standby cold data nodes. To balance data access performance against energy conservation, two priority-allocation strategies are adopted, Active State Node Priority (ASNP) and Lower-Than-Average-Utilization-Rate Node Priority (LANP), which largely preserve balanced data distribution across the cluster and thereby good data access performance. The system also decides when to move data from hot nodes to cold nodes based on the access pattern of news data, and a simulation platform is developed that can evaluate the energy consumption of any file access operation under different storage strategies and parameters. Simulation experiments show that the proposed strategies save 20%–35% of energy compared with traditional HDFS, while 99.9% of file-read response times are unaffected, with an average time delay of only 0.008%–0.036%.
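The two allocation priorities can be illustrated with a small sketch over assumed node descriptors: ASNP avoids waking standby cold nodes, while LANP steers new blocks toward nodes below the cluster's average utilization.

```python
# Node fields are assumptions for illustration; the paper's policies are richer.
nodes = [
    {"name": "dn1", "active": True,  "util": 0.62},
    {"name": "dn2", "active": True,  "util": 0.35},
    {"name": "dn3", "active": False, "util": 0.20},   # standby cold node
]

def choose_asnp(nodes):
    """Active State Node Priority: avoid waking cold nodes for new blocks."""
    active = [n for n in nodes if n["active"]]
    return min(active or nodes, key=lambda n: n["util"])

def choose_lanp(nodes):
    """Lower-Than-Average-Utilization Node Priority: balance the cluster."""
    avg = sum(n["util"] for n in nodes) / len(nodes)
    below = [n for n in nodes if n["util"] < avg]
    return min(below or nodes, key=lambda n: n["util"])

print(choose_asnp(nodes)["name"], choose_lanp(nodes)["name"])   # dn2 dn3
```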
Abstract: MIXED is a digital preservation project that uses a strategy of converting data to intermediate XML. In this paper we position this strategy with respect to the well-known emulation and migration strategies. We then detail the MIXED strategy and explain why it is an optimized, economical form of migration. Finally, we describe how DANS is implementing a software tool that can perform the migrations needed for this strategy.
Funding: Supported in part by the National Key Basic Research and Development (973) Program of China (Nos. 2013CB228206 and 2012CB315801), the National Natural Science Foundation of China (Nos. 61233016 and 61140320), and the Intel Research Council under the title "Security Vulnerability Analysis Based on Cloud Platform with Intel IA Architecture".
Abstract: China Unicom, the largest WCDMA 3G operator in China, faces the historic explosion of mobile Internet traffic from mobile terminals. According to China Unicom's internal statistics, mobile user traffic has grown rapidly at a compound annual growth rate (CAGR) of 135%. China Unicom currently stores more than 2 trillion records per month, with total data volume over 525 TB and a peak of 5 PB. Since October 2009, China Unicom has been developing a home-grown big data storage and analysis platform based on the open-source Hadoop Distributed File System (HDFS), as part of a long-term strategy to make full use of this big data. All mobile Internet traffic is served by this platform. Currently, the writing speed has reached 1,390,000 records per second, and record retrieval time in tables containing trillions of records is less than 100 ms. To seize the opportunity of becoming a big data operator, China Unicom has developed new functions and multiple innovations to solve the space and time constraints of data processing. In this paper, we introduce this big data platform in detail. On top of it, China Unicom is building an industry ecosystem based on mobile Internet big data, and believes that a telecom-operator-centric ecosystem can be formed that is critical to achieving prosperity in the modern communications business.
Abstract: To better understand different users' access intentions, a novel clustering and supervising method based on access paths is presented. The method partitions the users' interest space to express the distribution of user interests, and directly guides the construction of web page indexes for better performance.