In machine learning, randomness is a crucial factor in the success of ensemble learning, and it can be injected into tree-based ensembles by rotating the feature space. In common practice, however, the feature space is rotated randomly, so a large number of trees is required to ensure the performance of the ensemble model. This random rotation method is theoretically feasible, but it requires massive computing resources, which potentially restricts its applications. To solve this problem, this paper proposes a multimodal genetic algorithm based rotation forest (MGARF) algorithm. It is a tree-based ensemble learning algorithm for classification that takes advantage of the characteristic of trees to inject randomness by feature rotation. Unlike random rotation, it attempts to select a subset of more diverse and accurate base learners using a multimodal optimization method. The classification accuracy of the proposed MGARF algorithm was evaluated by comparing it with the original random forest and random rotation ensemble methods on 23 UCI classification datasets. Experimental results show that the MGARF method outperforms the other methods and that MGARF models need far fewer base learners.
BACKGROUND: Down syndrome (DS) is one of the most common chromosomal aneuploidy diseases. Prenatal screening and diagnostic tests can aid early diagnosis and appropriate management of affected fetuses, and give parents an informed choice about whether or not to terminate a pregnancy. In recent years, investigations have been conducted to achieve a high detection rate (DR) and reduce the false positive rate (FPR). Hospitals have accumulated large numbers of screened cases. However, artificial intelligence methods are rarely used in the risk assessment of prenatal screening for DS. AIM: To use a support vector machine algorithm, a classification and regression tree algorithm, and the AdaBoost algorithm for modeling and analysis of prenatal DS screening. METHODS: The dataset was from the Center for Prenatal Diagnosis at the First Hospital of Jilin University. We designed and developed intelligent algorithms based on the synthetic minority over-sampling technique (SMOTE)-Tomek and adaptive synthetic sampling over-sampling techniques to preprocess the dataset of prenatal screening information. The machine learning model was then established. Finally, the feasibility of artificial intelligence algorithms in DS screening evaluation is discussed. RESULTS: The database contained 31 diagnosed DS cases, accounting for 0.03% of all patients. The dataset showed a large difference between the numbers of DS-affected and non-affected cases. A combination of over-sampling and under-sampling techniques can greatly increase the performance of the algorithm on imbalanced datasets. As the number of iterations increases, the combination of the classification and regression tree algorithm and the SMOTE-Tomek over-sampling technique can obtain a high DR while keeping the FPR to a minimum. CONCLUSION: The support vector machine algorithm and the classification and regression tree algorithm achieved good results on the DS screening dataset. When the T21 risk cutoff value was set to 270, machine learning methods had a higher DR and a lower FPR than statistical methods.
Cloud computing (CC) networks are distributed and dynamic, as signals appear, disappear, or lose significance. Machine learning techniques (MLTs) are trained on datasets that sometimes contain too few samples to infer information from. DevMLOps (Development Machine Learning Operations), a dynamic strategy for the automatic selection and tuning of MLTs, results in significant performance differences. However, the scheme has many disadvantages, including the need for continuous training, more samples and training time for feature selection, and increased classification execution times. Recursive feature elimination (RFE) is computationally very expensive because it traverses each feature without considering the correlations between them. This problem can be overcome by using wrappers, which select better features by accounting for test and train datasets. The aim of this paper is to use DevQLMLOps for automated tuning and selection based on orchestration and messaging between containers. The proposed Adaptive Kernel Firefly Algorithm (AKFA) selects features for cloud network monitoring (CNM) operations. The AKFA methodology is demonstrated on the Cloud Network Security Dataset (CNSD), with satisfactory results in the performance metrics used: precision, recall, F-measure, and accuracy.
In this investigation, the Gradient Boosting (GB), Linear Regression (LR), Decision Tree (DT), and Voting algorithms were applied to predict the distribution pattern of Au geochemical data. Trace and indicator elements, including Mo, Cu, Pb, Zn, Ag, Ni, Co, Mn, Fe, and As, were used with these machine learning algorithms (MLAs) to predict Au concentration values in the Doostbigloo porphyry Cu-Au-Mo mineralization area. The performance of the models was evaluated using the Mean Absolute Percentage Error (MAPE) and Root Mean Square Error (RMSE) metrics. The proposed ensemble Voting algorithm outperformed the other models, yielding more accurate predictions according to both metrics. The predicted data from the GB, LR, DT, and Voting MLAs were modeled using the Concentration-Area fractal method, and Au geochemical anomalies were mapped. To compare and validate the results, factors such as the location of the mineral deposits, their surface extent, and the mineralization trend were considered. The results indicate that integrating hybrid MLAs with fractal modeling significantly improves geochemical prospectivity mapping. Among the four models, three (DT, GB, Voting) accurately identified both mineral deposits. The LR model, however, identified only Deposit I (central), and its mineralization trend diverged from the field data. The GB and Voting models produced similar results, with their final maps derived from fractal modeling showing the same anomalous areas. The anomaly boundaries identified by these two models are consistent with the two known reserves in the region. The results and plots related to prediction indicators and error rates for these two models also show high similarity, with lower error rates than the other models. Notably, the Voting model demonstrated superior performance in accurately delineating mineral deposit locations and identifying realistic mineralization trends while minimizing false anomalies.
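The two error metrics named in this abstract can be sketched in a few lines of Python. This is only an illustration of the standard MAPE and RMSE definitions; the concentration values in the usage note are made up and do not come from the study.

```python
from math import sqrt


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error, in the same unit as the inputs."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical measured and predicted Au values (illustrative only).
measured = [100.0, 200.0]
estimated = [110.0, 180.0]
print(mape(measured, estimated), rmse(measured, estimated))
```

MAPE is scale-free, which makes it convenient for comparing models across elements with different concentration ranges, while RMSE penalizes large individual errors more heavily.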
In the aluminum reduction process, aluminum fluoride (AlF3) is added to lower the liquidus temperature of the electrolyte and increase the electrolytic efficiency. Making the decision on the amount of AlF3 addition (referred to in this work as MDAAA) is a complex and knowledge-based task that must take into consideration a variety of interrelated functions; in practice, this decision-making step is performed manually. Due to technician subjectivity and the complexity of the aluminum reduction cell, it is difficult to guarantee the accuracy of MDAAA based on knowledge-driven or data-driven methods alone. Existing strategies for MDAAA have difficulty covering these complex causalities. In this work, a data and knowledge collaboration strategy for MDAAA based on augmented fuzzy cognitive maps (FCMs) is proposed. In the proposed strategy, the fuzzy rules are extracted by extended fuzzy k-means (EFKM) and fuzzy decision trees, and are used to amend the initial structure provided by experts. The state transition algorithm (STA) is introduced to detect weight matrices that lead the FCMs to desired steady states. This study then experimentally compares the proposed strategy with some existing research. The results of the comparison show that FCMs based on the STA converge into a stable region faster under the proposed strategy than under the differential Hebbian learning (DHL), particle swarm optimization (PSO), or genetic algorithm (GA) strategies. In addition, the accuracy of MDAAA based on the proposed method is better than that of the other methods. Accordingly, this paper provides a feasible and effective strategy for MDAAA.
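To make the notion of an FCM reaching a steady state concrete, here is a minimal Python sketch. It assumes the common sigmoid inference rule, in which each concept's next activation is the squashed weighted sum of the other concepts; the abstract does not state which rule the paper uses, the three-concept weight matrix is made up, and the STA weight search itself is not implemented.

```python
from math import exp


def sigmoid(x, lam=1.0):
    """Squashing function that keeps activations in (0, 1)."""
    return 1.0 / (1.0 + exp(-lam * x))


def fcm_step(state, weights):
    """One inference step; weights[j][i] is the influence of concept j on concept i."""
    n = len(state)
    return [sigmoid(sum(weights[j][i] * state[j] for j in range(n))) for i in range(n)]


def run_fcm(state, weights, tol=1e-6, max_iter=200):
    """Iterate until the largest activation change drops below tol."""
    for step in range(1, max_iter + 1):
        new_state = fcm_step(state, weights)
        if max(abs(a - b) for a, b in zip(new_state, state)) < tol:
            return new_state, step
        state = new_state
    return state, max_iter


# Hypothetical three-concept map (illustrative weights only).
W = [[0.0, 0.6, -0.4],
     [0.3, 0.0, 0.5],
     [-0.2, 0.7, 0.0]]
steady, steps = run_fcm([0.5, 0.5, 0.5], W)
print(steady, steps)
```

A weight-learning method such as the STA would search over matrices like `W` so that the resulting steady state matches a desired target; the sketch only shows the inference loop that such a search would evaluate.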
Abstract: Under the modern education system of China, the annual scholarship evaluation is vital for many college students. This paper adopts the C4.5 decision tree classification algorithm, an improvement on the ID3 algorithm, and constructs a data set for the scholarship evaluation system by analyzing the related attributes in scholarship evaluation information. Through analysis and research of moral education, intellectual education, and culture and PE, it also identifies some factors that play a significant role in the development of college students.
Abstract: The ID3 algorithm is a classical decision tree learning algorithm in data mining. The algorithm tends to choose attributes with more values, which affects the efficiency of classification and prediction when building a decision tree. This article proposes a new approach based on an improved ID3 algorithm. The new algorithm introduces an importance factor λ when calculating the information entropy. It can strengthen the label of important attributes of a tree and reduce the label of non-important attributes. The algorithm overcomes the flaw of the traditional ID3 algorithm, which tends to choose attributes with more values, and also improves the efficiency and flexibility of generating decision trees.
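The bias toward many-valued attributes can be illustrated with a small Python sketch. It compares ID3's information gain with the gain ratio used by C4.5 on a made-up six-row data set; the attribute names `record_id` and `useful` are hypothetical. The paper's λ-weighted entropy is not reproduced here, because the abstract does not give its formula.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())


def information_gain(values, labels):
    """ID3 criterion: class entropy minus the weighted entropy after splitting on values."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """C4.5 criterion: information gain divided by the entropy of the split itself."""
    split_info = entropy(values)
    return information_gain(values, labels) / split_info if split_info > 0 else 0.0


labels = ["yes", "yes", "no", "no", "yes", "no"]
record_id = ["a", "b", "c", "d", "e", "f"]      # unique per row, no predictive value
useful = ["hi", "hi", "lo", "lo", "hi", "hi"]   # two values, partly predictive

for name, column in (("record_id", record_id), ("useful", useful)):
    print(name, information_gain(column, labels), gain_ratio(column, labels))
```

On this toy data, information gain ranks the unique-valued `record_id` column highest, while gain ratio prefers `useful`. Gain ratio is C4.5's correction for the bias; the importance factor λ proposed in the article is a different remedy for the same problem.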
Funding: Supported by the Science and Technology Plan of Mudanjiang City (G200920064) and the Teaching Reform Construction of Mudanjiang Normal University (10-xj11080).
Abstract: Based on a discussion of the basic concepts of data mining technology and the decision tree method, and using data samples of wind and hailstorm disasters in some counties of the Mudanjiang region, a forecasting model of agro-meteorological disaster grade was established with the C4.5 decision tree classification algorithm. The model can forecast the degree of direct economic loss, providing a rational data mining model and effective analysis results.
Abstract: The productivity and quality of the turning process can be improved by utilizing the predicted performance of the cutting tools. This research incorporates condition monitoring of a non-carbide tool insert using vibration analysis along with machine learning and a fuzzy logic approach. A non-carbide tool insert is considered for the cutting operation in a semi-automatic lathe, where the condition of the tool is monitored using vibration characteristics. The vibration signals for conditions such as healthy, damaged, thermal, and flank were acquired with a piezoelectric transducer and a data acquisition system. Descriptive statistical features were extracted from the acquired vibration signals using feature extraction techniques. The extracted statistical features were selected through the J48 decision tree algorithm. The selected features were classified using the J48 decision tree and fuzzy logic to develop the fault diagnosis model for improved predictive analysis. The decision tree model produced a classification accuracy of 94.78% with five selected features. The developed fuzzy model produced a classification accuracy of 94.02% with five membership functions. Hence, the decision tree is proposed as a suitable fault diagnosis model for predicting the tool insert health condition under different fault conditions.
Funding: Supported by a grant from the Universidad Nacional Autonoma de Mexico (SDI.PTID.05.6).
Abstract: AIM: To assess the usefulness of FibroTest to forecast scores by constructing decision trees in patients with chronic hepatitis C. METHODS: We used the C4.5 classification algorithm to construct decision trees with data from 261 patients with chronic hepatitis C without a liver biopsy. The FibroTest attributes of age, gender, bilirubin, apolipoprotein, haptoglobin, α2 macroglobulin, and γ-glutamyl transpeptidase were used as predictors, and the FibroTest score as the target. For testing, 10-fold cross validation was used. RESULTS: The overall classification error was 14.9% (accuracy 85.1%). FibroTest cases with true scores of F0 and F4 were classified with very high accuracy (18/20 for F0, 9/9 for F0-1, and 92/96 for F4), and the largest confusion centered on F3. The algorithm produced a set of compound rules out of the ten classification trees, which was used to classify the 261 patients. The rules for the classification of patients in F0 and F4 were effective in more than 75% of the cases in which they were tested. CONCLUSION: The recognition of clinical subgroups should help to enhance our ability to assess differences in fibrosis scores in clinical studies and improve our understanding of fibrosis progression.
Funding: Project (J2008X011) supported by the Natural Science Foundation of Ministry of Railway and Tsinghua University, China.
Abstract: A high-speed corner detection algorithm based on a fuzzy ID3 decision tree was proposed. In the algorithm, the Bresenham circle with a 3-pixel radius was used as the test mask, overlapping the candidate corners with the nucleus. The intensity values of connected pixels on the circle were compared with that of the nucleus, with a membership function used to give the fuzzy result. The pixel with maximum information gain was chosen as the parent node to build a binary decision tree. Thus, the corner detector was derived. Pictures taken in Fengtai Railway Station in Beijing were used to test the method. The experimental results show that the best result is obtained when the number of pixels on the test mask is chosen to be 9. The corner detector significantly outperforms the existing detector in computational efficiency without sacrificing quality, and the method also provides high performance against Poisson noise and Gaussian blur.
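The test mask can be pictured with a short Python sketch of a crisp segment test on the 16-pixel Bresenham circle of radius 3. This is an assumption-laden simplification: the paper replaces the hard threshold with a fuzzy membership function and learns the pixel order with a fuzzy ID3 tree, neither of which is shown here, and the threshold `t` and the function name `is_corner` are hypothetical.

```python
# Offsets (dx, dy) of the 16 pixels on a Bresenham circle of radius 3,
# listed in order around the circle.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def is_corner(img, row, col, t=20, n=9):
    """True if n contiguous circle pixels are all brighter or all darker than the nucleus by more than t."""
    nucleus = img[row][col]
    states = []
    for dx, dy in CIRCLE:
        p = img[row + dy][col + dx]
        states.append(1 if p > nucleus + t else -1 if p < nucleus - t else 0)
    for target in (1, -1):
        run = 0
        for s in states + states:  # doubled list lets a run wrap around the circle
            run = run + 1 if s == target else 0
            if run >= n:
                return True
    return False


# Flat 7x7 patch: no corner. Brightening 9 contiguous circle pixels creates one.
patch = [[100] * 7 for _ in range(7)]
print(is_corner(patch, 3, 3))
for dx, dy in CIRCLE[:9]:
    patch[3 + dy][3 + dx] = 200
print(is_corner(patch, 3, 3))
```

The default `n=9` mirrors the setting the abstract reports as giving the best result; a decision tree speeds this test up by choosing which circle pixel to examine first, so that most non-corners are rejected after a few comparisons.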
Funding: Supported by the National Natural Science Foundation of China (Grant No. 40671127), the National Hi-Tech Research and Development Program of China ("863" Project) (Grant Nos. 2006AA120101, 2006AA120102), the National Key Technology Research and Development Program (Grant No. 2008BAK49B04), and the National China Next General Internet Program (Grant No. CNGI–09–01–07).
Abstract: Extraction of impervious surfaces is one of the necessary processes in urban change detection. This paper derived a unified conceptual model (UCM) from the vegetation-impervious surface-soil (VIS) model to make the extraction more effective and accurate. UCM uses the decision tree algorithm with indices of spectrum, texture, etc. In this model, we found both dependent and independent indices for multi-source satellite imagery according to their similarity and dissimilarity. The purpose of the indices is to remove the other land-use and land-cover types (e.g., vegetation and soil) from the imagery and delineate the impervious surfaces as the result. UCM follows the same steps as the decision tree algorithm. The Landsat-5 TM image (30 m) and the Satellite Probatoire d’Observation de la Terre (SPOT-4) image (20 m) of Chaoyang District (Beijing) in 2007 were used in this paper. The results show that the overall accuracy is 88% for the Landsat-5 TM image and 86.75% for the SPOT-4 image. It is an appropriate method to meet the demand of urban change detection.
Abstract: In today's digital age, the popularity and development of online education systems provide students with more flexible and convenient ways of learning. However, students' adaptation to the online education system is affected by a variety of factors, including gender, age, educational background, and field of specialisation. Through in-depth analyses and studies of these factors, the following conclusions can be drawn. Gender has little influence on students' adaptation to online education: male and female students perform similarly overall, but the proportion of male students at high adaptation levels is significantly higher than that of females. The majority of students show medium adaptability, indicating that the overall effect of online education is average. Students in the age groups of 6-10, 16-20, and 26-30 years have lower adaptability levels, and there are more low-adaptability groups among students in colleges and universities. Students majoring in IT are better adapted to the online education system, and students not majoring in IT have relatively poorer adaptability. Local students are more adaptable to online education than foreign students. In areas with unstable electricity, students' adaptability is usually lower. The decision tree algorithm predictions showed good overall model accuracy, with higher prediction accuracy for students with high, low, and medium levels of adaptability. The test set accuracy was 93.27%, and the precision and recall were both 93.33%, indicating excellent model predictions. In summary, deeply analysing the influence of various factors on students' degree of adaptation to online education and using the random forest algorithm to make predictions can provide an important reference for improving the effectiveness of online education systems and useful insights for personalised education.
Abstract: In order to classify packets, we propose a novel IP classification algorithm based on the non-collision hash and jumping table trie-tree (NHJTTT), which builds on the non-collision hash trie-tree and on the 2-dimensional classification algorithm proposed by Lakshman and Stiliadis (LS algorithm). The core of the algorithm consists of two parts: constructing the non-collision hash function, which is based mainly on the destination/source port and protocol type fields so that the hash function can avoid the space explosion problem; and introducing a jumping table trie-tree based on the LS algorithm in order to reduce time complexity. The test results show that the classification rate of the NHJTTT algorithm is up to 1 million packets per second and the maximum memory consumed is 9 MB for 10 000 rules.
Key words: IP classification - lookup algorithm - trie-tree - non-collision hash - jumping table
CLC number: TN 393.06
Foundation item: Supported by the Chongqing of Posts and Telecommunications Younger Teacher Foundation (A2003-03).
Biography: SHANG Feng-jun (1972-), male, Ph.D. candidate, lecturer, research direction: smart instruments and networks.
Abstract: Fault detection and classification is a key challenge for the protection of high-voltage DC (HVDC) transmission lines. In this paper, the Teager-Kaiser Energy Operator (TKEO) algorithm, combined with a decision tree-based fault classifier, is proposed to detect and classify various DC faults. The Change Identification Filter is applied to the average and differential current components to detect the first instant of fault occurrence (above threshold) and register a Change Identified Point (CIP). If a CIP is registered for the positive or negative line, only three current samples (the CIP and one sample on each side of it) are sent to the proposed TKEO algorithm, which produces their respective 8 indices, through which the fault can be detected and classified. The new approach enables quicker detection, allowing utility grids to be restored as soon as possible. It also reduces the computational complexity and the time required to identify and classify faults. The importance and accuracy of the proposed scheme are thoroughly tested and compared with other methods for various faults on HVDC transmission lines.
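The three-sample window mentioned in the abstract matches the standard discrete Teager-Kaiser operator, which needs only a sample and its two neighbours. The following is a minimal sketch of that textbook operator, not the paper's implementation; the function names and the example current values are hypothetical, and the paper's 8 indices and thresholds are not reproduced because the abstract does not define them.

```python
def tkeo(prev_sample, curr_sample, next_sample):
    """Standard discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    return curr_sample * curr_sample - prev_sample * next_sample


def tkeo_series(samples):
    """Apply the operator to every interior sample of a sequence."""
    return [
        tkeo(samples[i - 1], samples[i], samples[i + 1])
        for i in range(1, len(samples) - 1)
    ]


if __name__ == "__main__":
    # Hypothetical line current: flat, then a sudden jump at one sample.
    current = [1.0, 1.0, 1.0, 5.0, 1.0]
    print(tkeo_series(current))
```

A constant signal gives zero energy, while an abrupt change produces a large value at the changed sample, which is why the operator is attractive for spotting a fault instant from very few samples.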
Funding: The Planning Program of Science and Technology of Hunan Province (No. 05JT1039)
Abstract: To overcome the limitation that rank learning algorithms cannot process complex data types with nominal attributes, a new rank learning algorithm is designed. In the learning algorithm based on the decision tree, the splitting rule of the decision tree is revised with a new definition of rank impurity. The resulting rank learning algorithm can be explained intuitively, and its theoretical basis is provided. Experimental results show that, in terms of average rank loss, the ranking tree algorithm outperforms the perceptron ranking and ordinal regression algorithms and also converges faster. The rank learning algorithm based on the decision tree is able to process categorical data and select relevant features.
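The abstract compares algorithms by average rank loss without defining it. A common definition in the ordinal ranking literature is the mean absolute difference between predicted and true ranks, and the sketch below assumes that definition; the paper's exact measure and its rank impurity criterion may differ, and the function name is hypothetical.

```python
def average_rank_loss(true_ranks, predicted_ranks):
    """Mean absolute difference between true and predicted integer ranks.

    Assumed definition; the source abstract does not state its formula.
    """
    if len(true_ranks) != len(predicted_ranks):
        raise ValueError("rank lists must have the same length")
    if not true_ranks:
        raise ValueError("rank lists must not be empty")
    total = sum(abs(t - p) for t, p in zip(true_ranks, predicted_ranks))
    return total / len(true_ranks)


if __name__ == "__main__":
    print(average_rank_loss([1, 2, 3, 4], [1, 3, 3, 2]))
```

Unlike plain classification error, this loss penalises a prediction more the further it lands from the true rank.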
Funding: Project (61603274) supported by the National Natural Science Foundation of China; Project (2017KJ249) supported by the Research Project of Tianjin Municipal Education Commission, China.
Abstract: In machine learning, randomness is a crucial factor in the success of ensemble learning, and it can be injected into tree-based ensembles by rotating the feature space. However, it is common practice to rotate the feature space randomly, so a large number of trees is required to ensure the performance of the ensemble model. This random rotation method is theoretically feasible, but it requires massive computing resources, potentially restricting its applications. A multimodal genetic algorithm based rotation forest (MGARF) algorithm is proposed in this paper to solve this problem. It is a tree-based ensemble learning algorithm for classification that takes advantage of the characteristics of trees to inject randomness by feature rotation. Unlike random rotation, however, it attempts to select a subset of more diverse and accurate base learners using the multimodal optimization method. The classification accuracy of the proposed MGARF algorithm was evaluated by comparing it with the original random forest and random rotation ensemble methods on 23 UCI classification datasets. Experimental results show that the MGARF method outperforms the other methods and that MGARF models use far fewer base learners.
Funding: Supported by Science and Technology Department of Jilin Province, No. 20190302073GX.
Abstract: BACKGROUND: Down syndrome (DS) is one of the most common chromosomal aneuploidy diseases. Prenatal screening and diagnostic tests can aid early diagnosis and appropriate management of affected fetuses, and give parents an informed choice about whether or not to terminate a pregnancy. In recent years, investigations have sought to achieve a high detection rate (DR) and reduce the false positive rate (FPR). Hospitals have accumulated large numbers of screened cases. However, artificial intelligence methods are rarely used in the risk assessment of prenatal screening for DS. AIM: To use a support vector machine algorithm, a classification and regression tree algorithm, and the AdaBoost algorithm for modeling and analysis of prenatal DS screening. METHODS: The dataset was from the Center for Prenatal Diagnosis at the First Hospital of Jilin University. We designed and developed intelligent algorithms based on the synthetic minority over-sampling technique (SMOTE)-Tomek and adaptive synthetic sampling over-sampling techniques to preprocess the dataset of prenatal screening information. The machine learning model was then established. Finally, the feasibility of artificial intelligence algorithms in DS screening evaluation is discussed. RESULTS: The database contained 31 DS diagnosed cases, accounting for 0.03% of all patients. The dataset showed a large difference between the numbers of DS affected and non-affected cases. A combination of over-sampling and under-sampling techniques can greatly increase the performance of the algorithm on imbalanced datasets. As the number of iterations increases, the combination of the classification and regression tree algorithm and the SMOTE-Tomek over-sampling technique can obtain a high DR while keeping the FPR to a minimum. CONCLUSION: The support vector machine algorithm and the classification and regression tree algorithm achieved good results on the DS screening dataset. When the T21 risk cutoff value was set to 270, machine learning methods had a higher DR and a lower FPR than statistical methods.
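The two screening metrics used in this abstract can be computed from a binary confusion matrix: DR is the share of affected cases flagged as positive, and FPR is the share of unaffected cases flagged as positive. The sketch below uses those standard definitions; the function name and the toy labels are hypothetical and are not data from the study.

```python
def detection_and_false_positive_rate(y_true, y_pred):
    """Return (DR, FPR) for binary labels where 1 = affected, 0 = unaffected.

    DR  = TP / (TP + FN)   (sensitivity)
    FPR = FP / (FP + TN)
    """
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    dr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return dr, fpr


if __name__ == "__main__":
    # Toy labels only: 2 affected and 4 unaffected cases.
    truth = [1, 1, 0, 0, 0, 0]
    predicted = [1, 0, 1, 0, 0, 0]
    print(detection_and_false_positive_rate(truth, predicted))
```

With very few affected cases, as in the screening dataset described above, plain accuracy is uninformative, which is why DR and FPR are reported instead.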
Abstract: Cloud computing (CC) networks are distributed and dynamic, as signals appear, disappear or lose significance. Machine learning techniques (MLTs) are trained on datasets that sometimes contain too few samples to infer information. DevMLOps (Development Machine Learning Operations), a dynamic strategy used for the automatic selection and tuning of MLTs, results in significant performance differences. However, the scheme has many disadvantages, including the need for continuous training, more samples and longer training time for feature selection, and increased classification execution times. Recursive feature elimination (RFE) is computationally very expensive because it traverses each feature without considering the correlations between them. This problem can be overcome with wrappers, which select better features by accounting for the test and training datasets. The aim of this paper is to use DevQLMLOps for automated tuning and selection based on orchestration and messaging between containers. The proposed Adaptive Kernel Firefly Algorithm (AKFA) selects features for cloud network monitoring (CNM) operations. The AKFA methodology is demonstrated on the Cloud Network Security Dataset (CNSD), with satisfactory results on the performance metrics used: precision, recall, F-measure and accuracy.
Abstract: In this investigation, the Gradient Boosting (GB), Linear Regression (LR), Decision Tree (DT), and Voting algorithms were applied to predict the distribution pattern of Au geochemical data. Trace and indicator elements, including Mo, Cu, Pb, Zn, Ag, Ni, Co, Mn, Fe, and As, were used with these machine learning algorithms (MLAs) to predict Au concentration values in the Doostbigloo porphyry Cu-Au-Mo mineralization area. The performance of the models was evaluated using the Mean Absolute Percentage Error (MAPE) and Root Mean Square Error (RMSE) metrics. The proposed ensemble Voting algorithm outperformed the other models, yielding more accurate predictions according to both metrics. The predicted data from the GB, LR, DT, and Voting MLAs were modeled using the Concentration-Area fractal method, and Au geochemical anomalies were mapped. To compare and validate the results, factors such as the location of the mineral deposits, their surface extent, and mineralization trend were considered. The results indicate that integrating hybrid MLAs with fractal modeling significantly improves geochemical prospectivity mapping. Among the four models, three (DT, GB, Voting) accurately identified both mineral deposits. The LR model, however, only identified Deposit I (central), and its mineralization trend diverged from the field data. The GB and Voting models produced similar results, with their final maps derived from fractal modeling showing the same anomalous areas. The anomaly boundaries identified by these two models are consistent with the two known reserves in the region. The results and plots related to prediction indicators and error rates for these two models also show high similarity, with lower error rates than the other models. Notably, the Voting model demonstrated superior performance in accurately delineating mineral deposit locations and identifying realistic mineralization trends while minimizing false anomalies.
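The two error metrics named in this abstract have standard definitions, sketched below in plain Python. The function names and the sample concentration values are hypothetical and are not taken from the study; MAPE is undefined when an observed value is zero, so the sketch rejects that case.

```python
import math


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error, in the units of the data."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


if __name__ == "__main__":
    # Hypothetical observed and predicted concentrations.
    observed = [100.0, 200.0]
    estimated = [110.0, 180.0]
    print(mape(observed, estimated), rmse(observed, estimated))
```

MAPE is scale-free, so it treats low and high concentrations alike, while RMSE is dominated by the largest absolute errors; reporting both gives a fuller picture of model error.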
Abstract: In the aluminum reduction process, aluminum fluoride (AlF3) is added to lower the liquidus temperature of the electrolyte and increase the electrolytic efficiency. Making the decision on the amount of AlF3 addition (referred to in this work as MDAAA) is a complex and knowledge-based task that must take into consideration a variety of interrelated functions; in practice, this decision-making step is performed manually. Due to technician subjectivity and the complexity of the aluminum reduction cell, it is difficult to guarantee the accuracy of MDAAA based on knowledge-driven or data-driven methods alone. Existing strategies for MDAAA have difficulty covering these complex causalities. In this work, a data and knowledge collaboration strategy for MDAAA based on augmented fuzzy cognitive maps (FCMs) is proposed. In the proposed strategy, the fuzzy rules are extracted by extended fuzzy k-means (EFKM) and fuzzy decision trees, which are used to amend the initial structure provided by experts. The state transition algorithm (STA) is introduced to detect weight matrices that lead the FCMs to desired steady states. This study then experimentally compares the proposed strategy with some existing research. The results of the comparison show that the speed of FCMs convergence into a stable region based on the STA using the proposed strategy is faster than when using the differential Hebbian learning (DHL), particle swarm optimization (PSO), or genetic algorithm (GA) strategies. In addition, the accuracy of MDAAA based on the proposed method is better than those based on other methods. Accordingly, this paper provides a feasible and effective strategy for MDAAA.
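To illustrate what it means for an FCM to reach a steady state, the sketch below iterates a common textbook update rule, A_i(t+1) = f(sum_j w[j][i] * A_j(t)) with a sigmoid f, until the concept activations stop changing. This is an assumed generic formulation, not the paper's augmented FCM or its STA weight search; the function names, the weight matrix and the tolerance are hypothetical.

```python
import math


def sigmoid(x, steepness=1.0):
    """Squash an activation into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-steepness * x))


def fcm_step(state, weights):
    """One synchronous update; weights[j][i] is the influence of concept j on i."""
    n = len(state)
    return [
        sigmoid(sum(weights[j][i] * state[j] for j in range(n)))
        for i in range(n)
    ]


def run_fcm(state, weights, tol=1e-6, max_iter=1000):
    """Iterate until the largest activation change falls below tol.

    Returns (final_state, number_of_steps_taken).
    """
    for step in range(1, max_iter + 1):
        next_state = fcm_step(state, weights)
        if max(abs(a - b) for a, b in zip(next_state, state)) < tol:
            return next_state, step
        state = next_state
    return state, max_iter


if __name__ == "__main__":
    # Hypothetical 3-concept map with mixed positive and negative influences.
    w = [
        [0.0, 0.6, -0.4],
        [0.3, 0.0, 0.5],
        [-0.2, 0.7, 0.0],
    ]
    final_state, steps = run_fcm([0.2, 0.8, 0.5], w)
    print(final_state, steps)
```

In a learning setting, a weight-search method would adjust `w` so that the converged state matches a desired target; the step count returned here is the kind of convergence speed the abstract compares across STA, DHL, PSO and GA.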