Abstract: Big Data (BD), a collection of very large volumes of data, is used extensively in fields such as finance, industry, business, and medicine. However, processing such massive amounts of data is complicated and time-consuming. To design a Distribution Preserving Framework for BD, a novel methodology is proposed that combines Manhattan Distance (MD)-centered Partition Around Medoid (MD-PAM) with a Conjugate Gradient Artificial Neural Network (CG-ANN) and proceeds through several steps to reduce the complexity of BD. First, in the pre-processing phase, repeated data are removed using the map-reduce function; missing data are then handled by substituting or ignoring the missing values, after which the data are normalized. Next, to improve classification performance, the dimensionality of the data is reduced using Gaussian Kernel (GK)-Fisher Discriminant Analysis (GK-FDA). The processed data are then converted into a structured format and passed to the partitioning phase, where MD-PAM partitions the data and groups them into clusters. Finally, in the classification phase, CG-ANN classifies the data so that the user can easily retrieve the data needed. The publicly available NSL-KDD dataset is used to compare the results of CG-ANN with existing methodologies. The experimental results show that the proposed CG-ANN achieves efficient results at a reduced computation cost, and that the proposed work outperforms existing systems in terms of accuracy, sensitivity, and specificity.
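The partitioning step is described only at a high level. As a rough illustration, and not the authors' MD-PAM implementation, the sketch below shows a simplified swap-based Partition Around Medoids using Manhattan distance. It assumes a naive initialisation (the first k points) and first-improvement swaps, whereas full PAM uses a dedicated BUILD phase. The names `manhattan`, `total_cost`, and `pam` are hypothetical.

```python
def manhattan(a, b):
    """L1 (Manhattan) distance between two equal-length numeric sequences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def total_cost(points, medoids):
    """Sum of each point's distance to its nearest medoid (medoids are indices)."""
    return sum(min(manhattan(p, points[m]) for m in medoids) for p in points)


def pam(points, k, max_iter=100):
    """Simplified PAM: returns (medoid indices, cluster label per point)."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    medoids = list(range(k))  # assumption: naive start from the first k points
    best = total_cost(points, medoids)
    for _ in range(max_iter):
        improved = False
        for i in range(k):
            for cand in range(len(points)):
                if cand in medoids:
                    continue
                # Try replacing medoid i with a non-medoid candidate.
                trial = medoids[:i] + [cand] + medoids[i + 1:]
                cost = total_cost(points, trial)
                if cost < best:  # accept any swap that lowers the total cost
                    medoids, best, improved = trial, cost, True
        if not improved:
            break  # no swap helps any more: local optimum reached
    # Label each point with the position of its nearest medoid.
    labels = [
        min(range(k), key=lambda j: manhattan(p, points[medoids[j]]))
        for p in points
    ]
    return medoids, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(pam(data, 2))
```

Because medoids are actual data points and the cost is a sum of absolute differences, this style of clustering tends to be less sensitive to outliers than squared-Euclidean k-means, which is one common motivation for pairing PAM with Manhattan distance.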
Funding: Supported in part by the National Key Basic Research and Development (973) Program of China (Nos. 2013CB228206 and 2012CB315801) and the National Natural Science Foundation of China (Nos. 61233016 and 61140320), and by the Intel Research Council under the title "Security Vulnerability Analysis Based on Cloud Platform with Intel IA Architecture".
Abstract: China Unicom, the largest WCDMA 3G operator in China, is meeting the demands of the historic Mobile Internet Explosion, that is, the surge in Mobile Internet Traffic from mobile terminals. According to China Unicom's internal statistics, mobile user traffic has grown rapidly, with a Compound Annual Growth Rate (CAGR) of 135%. China Unicom currently stores more than 2 trillion records per month; the data volume exceeds 525 TB and has reached a peak of 5 PB. Since October 2009, China Unicom has been developing a home-brewed big data storage and analysis platform based on the open source Hadoop Distributed File System (HDFS), as part of a long-term strategy to make full use of this Big Data. All Mobile Internet Traffic is served by this big data platform. Currently, the write speed has reached 1 390 000 records per second, and retrieving a record from a table containing trillions of records takes less than 100 ms. To take advantage of this opportunity to become a Big Data Operator, China Unicom has developed new functions and multiple innovations to address the space and time constraints of data processing. In this paper, we introduce our big data platform in detail. On top of this platform, China Unicom is building an industry ecosystem based on Mobile Internet Big Data, and considers that a telecom-operator-centric ecosystem can be formed that is critical to prosperity in the modern communications business.
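The figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below assumes a 30-day month and treats "more than 2 trillion records" as a lower bound; the function names are hypothetical and only illustrate the calculation.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def records_per_month(records_per_second, days=30):
    """Records written in a month if the given rate were sustained."""
    return records_per_second * SECONDS_PER_DAY * days


def implied_average_rate(monthly_records, days=30):
    """Average records per second implied by a monthly record count."""
    return monthly_records / (SECONDS_PER_DAY * days)


def cagr_growth_factor(cagr, years):
    """Total growth multiple after `years` at a compound annual growth rate."""
    return (1 + cagr) ** years


if __name__ == "__main__":
    # Reported write speed, sustained for an assumed 30-day month.
    print(records_per_month(1_390_000))
    # Average rate implied by the 2 trillion records/month lower bound.
    print(implied_average_rate(2e12))
    # A CAGR of 135% multiplies traffic by 2.35 each year.
    print(cagr_growth_factor(1.35, 2))
```

Under these assumptions, the reported write speed sustained for a full month corresponds to about 3.6 trillion records, while 2 trillion records per month implies an average of roughly 0.77 million records per second. The reported write speed is therefore consistent with the monthly volume, leaving headroom of a little under 2x for traffic peaks.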
Abstract: In the data retrieval process of a data recommendation system, matching prediction and similarity identification play a major role in the ontology. Several methods exist to improve retrieval accuracy and reduce search time. However, in a data recommendation system, searching for the best match for given query data becomes complex, and the accuracy of the query recommendation process suffers. To improve the performance of data validation, this paper proposes a novel data similarity estimation and clustering method that retrieves the relevant, best-matching data in big data processing. An advanced model of the Logarithmic Directionality Texture Pattern (LDTP) method, combined with a Metaheuristic Pattern Searching (MPS) system, is used to estimate the similarity of the query data against the entire database. The overall work is implemented for the data recommendation application. The data are indexed and grouped into clusters to form a paged database structure, which reduces computation time during searching. In addition, a neural network predicts the relevance of feature attributes in the database, and the matching index is sorted to provide the recommended data for the given query data. This is achieved using the Distributional Recurrent Neural Network (DRNN), an enhanced neural network model that finds relevance based on the correlation factor of the feature set. The DRNN classifier is trained by estimating the correlation factor of the dataset's attributes, which are formed into clusters and paged with proper indexing based on the MPS similarity metric parameter. The overall performance of the proposed work is evaluated by varying the size of the training database (60%, 70%, and 80%). The parameters considered for performance analysis are precision, recall, F1-score, the accuracy of data retrieval, the query recommendation output, and comparison with other state-of-the-art methods.
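The abstract names its evaluation metrics without defining them. The sketch below uses the standard definitions of precision, recall, F1-score, and accuracy, computed from counts of true/false positives and negatives. The function name `retrieval_metrics` and the example counts are hypothetical and are not results from the paper.

```python
def retrieval_metrics(tp, fp, fn, tn):
    """Standard classification/retrieval metrics from confusion counts.

    tp: relevant items retrieved      fp: irrelevant items retrieved
    fn: relevant items missed         tn: irrelevant items correctly skipped
    Ratios with a zero denominator are reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    print(retrieval_metrics(tp=8, fp=2, fn=2, tn=8))
```

Reporting all four together is useful because accuracy alone can look high on imbalanced data, where most items are irrelevant to a given query, while precision and recall expose how well the relevant items are actually retrieved.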
Funding: Shanxi Energy Internet Research Institute, Grant/Award Number: SXEI2023ZD002.
Abstract: Pseudo-measurement production for distribution networks is an important technology for ensuring the visibility of distribution networks. This paper discusses the concepts, databases, algorithms, and practical engineering applications of pseudo-measurements. First, the basic concept of pseudo-measurement in distribution networks is introduced. Next, various pseudo-measurement databases and algorithms are discussed in detail, and the advantages and limitations of these methods in handling the historical and operational data for pseudo-measurement of medium-voltage distribution networks are compared and analysed. The main applications of pseudo-measurement are then analysed. Finally, the main challenges in the field of pseudo-measurement for distribution networks are discussed, including data quality, computational complexity, model accuracy, and feasibility in practical application, and directions for future research are outlined.