Abstract: The feasibility of constructing shallow foundations on saturated sands remains uncertain. Seismic design standards simply stipulate that geotechnical investigations for a shallow foundation on such soils shall be conducted to mitigate the effects of the liquefaction hazard. This study investigates the seismic behavior of strip foundations on typical two-layered soil profiles: a natural loose sand layer supported by a dense sand layer. Coupled nonlinear dynamic analyses have been conducted to calculate response parameters, including seismic settlement, the acceleration response on the ground surface, and excess pore pressure beneath strip foundations. A novel liquefaction potential index (LPI_footing), based on excess pore pressure ratios across a given region of soil mass beneath footings, is introduced to classify liquefaction severity into three distinct levels: minor, moderate, and severe. To validate the proposed LPI_footing, the foundation settlement is evaluated for the different liquefaction potential classes. A classification tree model has been grown to predict liquefaction susceptibility from input variables including earthquake intensity on the ground surface, foundation pressure, sand permeability, and top layer thickness. Moreover, a nonlinear regression function has been established to map LPI_footing in relation to these input predictors. The models have been constructed using a substantial dataset comprising 13,824 excess pore pressure ratio time histories. The performance of the developed models has been examined using various methods, including 10-fold cross-validation. The predictive capability of the tree has also been validated against existing experimental studies. The results indicate that the classification tree is not only interpretable but also highly predictive, with a testing accuracy of 78.1%. The decision tree provides valuable insights for engineers assessing liquefaction potential beneath strip foundations.
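The 10-fold cross-validation mentioned in this abstract can be illustrated with a minimal sketch of the index split alone. It assumes, purely for illustration, that each of the 13,824 time histories is one sample; the function name `k_fold_indices`, the shuffling, and the seed are hypothetical choices, not details taken from the study.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once so folds are not ordered
    # Spread the remainder so fold sizes differ by at most one sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Assumption: one sample per time history (13,824 in total).
folds = list(k_fold_indices(13824, k=10))
for fold_number, (train_idx, test_idx) in enumerate(folds, start=1):
    print(fold_number, len(train_idx), len(test_idx))
```

Each sample lands in exactly one test fold, so averaging the per-fold accuracy uses every sample for testing once and for training nine times.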
Funding: Under the auspices of the National Natural Science Foundation of China (No. 40871188) and the National Key Technologies R&D Program of China (No. 2006BAD23B03)
Abstract: The main objective of this research is to determine the capacity of land cover classification that combines spectral and textural features of Landsat TM imagery with ancillary geographical data in wetlands of the Sanjiang Plain, Heilongjiang Province, China. Semi-variograms and Z-test values were calculated to assess the separability of grey-level co-occurrence texture measures and to maximize the difference between land cover types. The degree of spatial autocorrelation showed that window sizes of 3×3 pixels and 11×11 pixels were most appropriate for Landsat TM image texture calculations. The texture analysis showed that the co-occurrence entropy, dissimilarity, and variance texture measures derived from the Landsat TM spectral bands and vegetation indices provided the most significant statistical differentiation between land cover types. Subsequently, a Classification and Regression Tree (CART) algorithm was applied to three different combinations of predictors: 1) TM imagery alone (TM-only); 2) TM imagery plus image texture (TM+TXT model); and 3) all predictors, including TM imagery, image texture, and additional ancillary GIS information (TM+TXT+GIS model). Compared with traditional supervised Maximum Likelihood Classification (MLC), the three classification tree models reduced the overall error rate significantly. Image texture measures and ancillary geographical variables effectively suppressed speckle noise and markedly reduced the classification error rate for marsh. For the classification tree model using all available predictors, the omission error rate was 12.90% and the commission error rate was 10.99% for marsh. The developed method is portable, relatively easy to implement, and should be applicable in other settings and over larger extents.
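For readers unfamiliar with the omission and commission error rates quoted for marsh, the sketch below computes both from a confusion matrix. The matrix values and class names are made up for illustration and are not the study's data; `omission_commission` is a hypothetical helper name.

```python
def omission_commission(confusion, labels, target):
    """Return (omission, commission) error rates for one class.

    confusion[i][j] counts samples whose reference class is labels[i]
    and whose mapped (predicted) class is labels[j].
    """
    t = labels.index(target)
    correct = confusion[t][t]
    reference_total = sum(confusion[t])              # row sum: all true `target` samples
    mapped_total = sum(row[t] for row in confusion)  # column sum: all samples mapped as `target`
    omission = 1 - correct / reference_total         # true target samples that were missed
    commission = 1 - correct / mapped_total          # mapped target samples that are wrong
    return omission, commission


# Illustrative counts only (not from the paper).
labels = ["marsh", "cropland", "forest"]
confusion = [
    [80, 15, 5],
    [6, 90, 4],
    [4, 2, 94],
]
om, com = omission_commission(confusion, labels, "marsh")
print(f"omission = {om:.2%}, commission = {com:.2%}")
```

Omission error is the complement of producer's accuracy, and commission error is the complement of user's accuracy, which is why the two rates for the same class generally differ.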
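Abstract: To address the tendency of chi-squared automatic interaction detection (CHAID) decision trees to overfit, a CHAID random forest method (CHAID-RF) is proposed. The method combines random sampling, random feature selection, and ensembling, using CHAID decision trees as base classifiers to form CHAID-RF. To verify the effectiveness of CHAID-RF, CART, CHAID, SVM, and RF are selected as comparison algorithms. Accuracy, weighted precision, weighted recall, and weighted F-score are used as evaluation metrics for the classification models, and root mean square error is used for the regression models. Ten classification datasets and seven regression datasets are used for validation. The experimental results show that CHAID-RF is feasible and effective.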
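CHAID chooses splits with chi-squared tests between a categorical predictor and the target. The sketch below shows only that ingredient, as a simplified stand-in: it ranks features by the raw Pearson chi-squared statistic, whereas CHAID proper compares adjusted p-values and merges categories. The bootstrap sampling and random feature subsets that the abstract describes for CHAID-RF are omitted. The toy data and the helper names are hypothetical.

```python
from collections import Counter


def contingency_table(feature_values, targets):
    """Cross-tabulate one categorical feature against the class labels."""
    counts = Counter(zip(feature_values, targets))
    categories = sorted(set(feature_values))
    classes = sorted(set(targets))
    return [[counts[(cat, cls)] for cls in classes] for cat in categories]


def chi_square_statistic(table):
    """Pearson chi-squared statistic; assumes no empty row or column."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def best_feature(rows, targets):
    """Return (name, statistic) of the feature most associated with the target."""
    scores = {
        name: chi_square_statistic(contingency_table([r[name] for r in rows], targets))
        for name in rows[0]
    }
    name = max(scores, key=scores.get)
    return name, scores[name]


# Made-up toy data: "colour" is associated with the label, "size" is not.
groups = [
    ("red", "big", "yes", 15), ("red", "small", "yes", 15),
    ("red", "big", "no", 5), ("red", "small", "no", 5),
    ("blue", "big", "yes", 5), ("blue", "small", "yes", 5),
    ("blue", "big", "no", 15), ("blue", "small", "no", 15),
]
rows, targets = [], []
for colour, size, label, count in groups:
    rows += [{"colour": colour, "size": size}] * count
    targets += [label] * count

name, score = best_feature(rows, targets)
print(name, score)
```

In a forest variant, a function like `best_feature` would be called on a bootstrap sample and a random subset of feature names at each node, and the resulting trees would vote.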
Abstract: To address the high redundancy, poor targeting, and low efficiency of fuzz testing for application-layer network protocols, this paper introduces NetFuzz, a novel multi-objective fuzz testing tool based on a classification tree. NetFuzz uses a four-tiered classification tree protocol description method for application-layer network protocols, encompassing the protocol under test, its functions, specific commands, parameters, and variant markers. A multi-objective optimization model is designed to enhance the coverage, validity, and diversity of test cases, and a genetic algorithm is employed to generate these cases. The tool is evaluated by testing File Transfer Protocol (FTP) and Hypertext Transfer Protocol (HTTP) servers with known vulnerabilities and comparing its performance against the Peach fuzzing tool. NetFuzz can effectively detect security vulnerabilities that the Peach fuzzer failed to identify. While still detecting the vulnerabilities, it reduces testing time by approximately 58% and increases the number of valid test cases by 47.67%.
Abstract: Urban tree species provide various essential ecosystem services in cities, such as regulating urban temperatures, reducing noise, capturing carbon, and mitigating the urban heat island effect. The quality of these services is influenced by species diversity, tree health, and the distribution and composition of trees. Traditionally, data on urban trees has been collected through field surveys and manual interpretation of remote sensing images. In this study, we evaluated the effectiveness of multispectral airborne laser scanning (ALS) data in classifying 24 common urban roadside tree species in Espoo, Finland. Tree crown structure information, intensity features, and spectral data were used for classification. Eight different machine learning algorithms were tested, with the extra trees (ET) algorithm performing the best, achieving an overall accuracy of 71.7% using multispectral LiDAR data. This result highlights that integrating structural and spectral information within a single framework can improve the classification accuracy. Future research will focus on identifying the most important features for species classification and developing algorithms with greater efficiency and accuracy.
Abstract: Despite the widespread use of decision trees (DTs) across various applications, their performance tends to suffer on imbalanced datasets, where certain classes significantly outnumber others. Cost-sensitive learning is one strategy for solving this problem, and several cost-sensitive DT algorithms have been proposed to date. However, existing algorithms are heuristic: they greedily select either a better splitting point or a better feature node, which leads to local optima for tree nodes and ignores the cost of the whole tree. In addition, determining the costs is difficult and often requires domain expertise. This study proposes a DT for imbalanced data, called Swarm-based Cost-sensitive DT (SCDT), using the cost-sensitive learning strategy and an enhanced swarm-based algorithm. The DT is encoded using a hybrid individual representation. A hybrid artificial bee colony approach is designed to optimize rules, considering specified costs in an F-measure-based fitness function. Experimental comparisons with state-of-the-art DT algorithms show that the SCDT method achieved the highest performance on most datasets. Moreover, SCDT also excels in other critical performance metrics, namely recall, precision, F1-score, and AUC, with average values of 83%, 87.3%, 85%, and 80.7%, respectively.
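As a quick consistency check on the reported averages, the sketch below applies the standard F-measure definition to the average precision (87.3%) and recall (83%). This is only approximate reasoning: the F1 of averaged precision and recall need not equal the average of per-dataset F1 scores. The cost weighting inside SCDT's fitness function is not described in the abstract and is not modelled here; `f_measure` is a hypothetical helper.

```python
def f_measure(precision, recall, beta=1.0):
    """F-beta score: weighted harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Average precision and recall as reported in the abstract.
f1_from_averages = f_measure(0.873, 0.83)
print(round(f1_from_averages, 3))  # 0.851
```

The result is close to the reported average F1-score of 85%, so the three figures are mutually plausible.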
Funding: The National Natural Science Foundation of China (No. 60473045), the Technology Research Project of Hebei Province (No. 05213573), and the Research Plan of the Education Office of Hebei Province (No. 2004406)
Abstract: The conventional fuzzy class-association method scans the classifier repeatedly to classify new texts, which is inefficient. To address this problem, a new approach to text categorization based on the FCR-tree (fuzzy classification rules tree) is proposed. The compactness of the FCR-tree saves significant space when storing a large set of rules in which many words are repeated. In contrast to classification rules, fuzzy classification rules contain not only words but also the fuzzy sets corresponding to the frequencies with which words appear in texts. Therefore, the construction and structure of an FCR-tree differ from those of a CR-tree. To reduce the difficulty of FCR-tree construction and rule retrieval, multiple k-FCR-trees are built. When classifying a new text, it is not necessary to search the paths of sub-trees led by words that do not appear in the text, which reduces the number of rules traversed. Experimental results show that the proposed approach clearly outperforms the conventional method in efficiency.
Funding: The National Basic Research Program of China (973 Program) (No. 2006CB303105) and the Research Foundation of Beijing Jiaotong University (No. K06J0170)
Abstract: To classify unknown 3-D objects into a set of predetermined object classes, a part-level object classification method based on an improved interpretation tree is presented. A part-level representation is implemented, which enables a more compact shape description of 3-D objects. The proposed classification method consists of two key processing stages: an improved constrained search on an interpretation tree, followed by computation of a shape similarity measure. The method achieves both whole match and partial match with shape similarity ranks. In particular, it supports focus match, in which different key parts may be labeled and all matched models containing the corresponding key parts may be retrieved. A series of experiments shows the effectiveness of the presented 3-D object classification method.
Funding: Supported by the National Natural Science Foundation of China (Nos. 60604021, 60874054)
Abstract: To solve multi-class fault diagnosis tasks, a decision tree support vector machine (DTSVM), which combines SVM and decision tree using the concept of dichotomy, is proposed. The classification performance of DTSVM depends heavily on its structure. To cluster the classes so that the distance between the clustering centers of the two sub-classes is maximized, a genetic algorithm is introduced into the formation of the decision tree, so that the most separable classes are separated at each node of the decision tree. Numerical simulations on three datasets, compared with the "one-against-all" and "one-against-one" methods, demonstrate that the proposed method has better performance and higher generalization ability than the two conventional methods.
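The node-splitting criterion described in this abstract, dividing the classes into two groups whose centers are as far apart as possible, can be sketched for a handful of classes. Note the substitution: the paper uses a genetic algorithm, while this sketch uses exhaustive search, which is only practical for small class counts. Taking a group center as the mean of its class centroids and using Euclidean distance are both assumptions, and the centroid values are made up.

```python
import math
from itertools import combinations


def group_center(labels, centroids):
    """Mean of the class centroids in a group (an assumed definition of its center)."""
    points = [centroids[label] for label in labels]
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def best_bipartition(centroids):
    """Exhaustively find the two-group split with the most distant group centers."""
    labels = sorted(centroids)
    first, rest = labels[0], labels[1:]
    best = None
    # Keep `first` in the left group so each split is considered exactly once;
    # r stops short of len(rest) so the right group is never empty.
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            left = (first,) + extra
            right = tuple(label for label in rest if label not in extra)
            distance = math.dist(group_center(left, centroids),
                                 group_center(right, centroids))
            if best is None or distance > best[0]:
                best = (distance, left, right)
    return best


# Made-up class centroids in a 2-D feature space.
centroids = {"A": (0.0, 0.0), "B": (0.0, 1.0), "C": (10.0, 0.0), "D": (10.0, 1.0)}
distance, left, right = best_bipartition(centroids)
print(left, right, distance)
```

Applying the same search recursively to each group would give the tree structure, with one binary SVM trained at every internal node. The number of candidate splits grows as 2^(n-1) - 1, which is the reason a heuristic search such as a genetic algorithm becomes attractive for many classes.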