Cyberbullying on social media poses significant psychological risks, yet most detection systems oversimplify the task by focusing on binary classification, ignoring nuanced categories such as passive-aggressive remarks or indirect slurs. To address this gap, we propose a hybrid framework combining Term Frequency-Inverse Document Frequency (TF-IDF), word-to-vector (Word2Vec), and Bidirectional Encoder Representations from Transformers (BERT) based models for multi-class cyberbullying detection. Our approach integrates TF-IDF for lexical specificity and Word2Vec for semantic relationships, fused with BERT's contextual embeddings to capture syntactic and semantic complexities. We evaluate the framework on a publicly available dataset of 47,000 annotated social media posts across five cyberbullying categories: age, ethnicity, gender, religion, and indirect aggression. Among the BERT variants tested, BERT Base Uncased achieved the highest performance with 93% accuracy (standard deviation of ±1% across 5-fold cross-validation) and an average AUC of 0.96, outperforming standalone TF-IDF (78%) and Word2Vec (82%) models. Notably, it achieved near-perfect AUC scores (0.99) for age- and ethnicity-based bullying. A comparative analysis with state-of-the-art benchmarks, including Generative Pre-trained Transformer 2 (GPT-2) and Text-to-Text Transfer Transformer (T5) models, highlights BERT's superiority in handling ambiguous language. This work advances cyberbullying detection by demonstrating how hybrid feature extraction and transformer models improve multi-class classification, offering a scalable solution for moderating nuanced harmful content.
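The abstract above does not say how the lexical and dense features are fused. As a minimal sketch, assuming simple concatenation (early fusion), the following computes smoothed TF-IDF vectors in plain Python and appends a dense vector to each. The function names, the smoothing formula, and the two-dimensional `toy_embeddings` (standing in for Word2Vec or BERT outputs) are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocab, vectors): one TF-IDF vector per tokenized document."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF (an assumed variant) keeps every weight finite and positive.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


def fuse(lexical, dense):
    """Early fusion (assumed): concatenate lexical and dense features."""
    return list(lexical) + list(dense)


# Toy, made-up inputs: two tokenized posts and placeholder embeddings.
docs = [["you", "are", "old"], ["you", "are", "kind"]]
vocab, tfidf = tfidf_vectors(docs)
toy_embeddings = [[0.1, -0.2], [0.3, 0.4]]  # hypothetical dense vectors
fused = [fuse(v, e) for v, e in zip(tfidf, toy_embeddings)]
```

The fused vectors could then be passed to any multi-class classifier. Terms that appear in only one post ("old", "kind") receive a higher weight than terms shared by both, which is the lexical specificity the abstract attributes to TF-IDF.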
Accurate lithofacies classification in low-permeability sandstone reservoirs remains challenging due to class imbalance in well-log data and the difficulty of modeling vertical lithological dependencies. Traditional core-based interpretation introduces subjectivity, while conventional deep learning models often fail to capture stratigraphic sequences effectively. To address these limitations, we propose a hybrid CNN-GRU framework that integrates spatial feature extraction and sequential modeling. Heat Kernel Imputation is applied to reconstruct missing log data, and Borderline SMOTE (BSMOTE) improves class balance by augmenting boundary-case minority samples. The CNN component extracts localized petrophysical features, and the GRU component captures depth-wise lithological transitions, enabling spatial-sequential feature fusion. Experiments on real-well datasets from tight sandstone reservoirs show that the proposed model achieves an average accuracy of 93.3% and a macro F1-score of 0.934. It outperforms baseline models, including RF (87.8%), GBDT (81.8%), CNN-only (87.5%), and GRU-only (86.1%). Leave-one-well-out validation further confirms strong generalization ability. These results demonstrate that the proposed approach effectively addresses data imbalance and enhances classification robustness, offering a scalable and automated solution for lithofacies interpretation under complex geological conditions.
The rapid growth in the volume and number of malware-based cyber threats is not the main danger; the real threat lies in the obfuscation of these attacks, which constantly change their behavior and thereby make detection more difficult. Numerous researchers and developers have devoted considerable attention to this topic; however, the field still lacks sufficient high-quality studies that address these problems. For this reason, this paper presents a novel multi-objective Markov-enhanced adaptive whale optimization (MOMEAWO) cybersecurity model to improve the classification of binary and multi-class malware threats. The proposed model aims to provide an innovative solution for analyzing, detecting, and classifying the behavior of obfuscated malware within their respective families. It covers three classification tasks: binary classification and two multi-class settings (four malware families and 16 malware families). To evaluate the performance of this model, we used a recently published dataset, the Canadian Institute for Cybersecurity Malware Memory Analysis dataset (CIC-MalMem-2022), which contains balanced data. The results show near-perfect accuracy in binary classification and high accuracy in multi-class classification compared with related work using the same dataset.
A new arrival and departure flight classification method based on the transitive closure algorithm (TCA) is proposed. First, fuzzy set theory and the transitive closure algorithm are introduced. Then four different factors are selected to establish the flight classification model, and a method is given to calculate the delay cost for each class. Finally, the proposed method is applied to the sequencing of flights in a terminal area, and the results are compared with those of the traditional classification method (TCM). The results show that the new classification model is effective in reducing the cost of flight delays, thus optimizing the sequences of arrival and departure flights and improving the efficiency of air traffic control.
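The transitive closure step named in the abstract above can be illustrated with the standard fuzzy-clustering construction: a reflexive, symmetric similarity matrix is squared under max-min composition until it stops changing, and a threshold (lambda-cut) then yields the classes. This is a generic sketch; the similarity values are made up, and the four factors used in the paper to build the matrix are not reproduced.

```python
def max_min_compose(r, s):
    """Max-min composition of two square fuzzy relations."""
    n = len(r)
    return [
        [max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(relation):
    """Square a reflexive fuzzy relation repeatedly until it is stable."""
    current = relation
    while True:
        squared = max_min_compose(current, current)
        if squared == current:  # max/min only copy values, so == is exact
            return current
        current = squared


def lambda_cut_classes(closure, lam):
    """Group indices whose closure similarity is at least `lam`."""
    classes, assigned = [], set()
    for i in range(len(closure)):
        if i in assigned:
            continue
        group = [j for j in range(len(closure)) if closure[i][j] >= lam]
        assigned.update(group)
        classes.append(group)
    return classes


# Hypothetical similarity matrix for three flights (illustrative values).
similarity = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.5],
    [0.2, 0.5, 1.0],
]
closure = transitive_closure(similarity)
strict = lambda_cut_classes(closure, 0.8)
loose = lambda_cut_classes(closure, 0.5)
```

Raising the threshold splits the flights into more, tighter classes; lowering it merges them. Choosing the threshold is a modeling decision that the abstract does not specify.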
This paper presents a fuzzy logic approach to efficiently perform unsupervised character classification, improving the robustness, correctness, and speed of a character recognition system. The characters are first split into eight typographical categories. The classification scheme uses pattern matching to classify the characters in each category into a set of fuzzy prototypes based on a nonlinear weighted similarity function. The fuzzy unsupervised character classification, which is natural in the repre...
To reduce the amount of data storage and improve the processing capacity of the system, this paper proposes a new method for classifying data sources that combines the phase synchronization model from network clustering with the cloud model. First, the data source is treated as a complex network; after the network topology is obtained, the cloud model of each node's data is determined by the fuzzy analytic hierarchy process (AHP). Second, the expectation, entropy, and hyper-entropy of the cloud model are calculated to obtain a comprehensive coupling strength, which is then used as the edge weight of the topology. Finally, the distribution curve is obtained by iterating the phase of each node with the phase synchronization model, which completes the classification of the data source. This method not only facilitates the storage, cleaning, and compression of data but also improves the efficiency of data analysis.
Coalbed methane has been explored in many basins worldwide for 30 years and has been developed commercially in some of them. Many researchers have systematically described the characteristics of coalbed methane geology and technology. Based on these investigations, a coalbed methane reservoir can be defined as 'a coal seam that contains some coalbed methane and is isolated from other fluid units'. On the basis of dissection, analysis, and comparison of typical coalbed methane reservoirs, coalbed methane reservoirs can be divided into two classes: hydrodynamic sealing coalbed methane reservoirs and self-sealing coalbed methane reservoirs. The former can be further divided into two sub-classes: hydrodynamic capping coalbed methane reservoirs, which comprise five types, and hydrodynamic driving coalbed methane reservoirs, which comprise three types. The latter can be divided into three types. Currently, hydrodynamic sealing reservoirs are the main target for coalbed methane exploration and development; self-sealing reservoirs are unsuitable for exploration and development, but they are closely related to coal mine gas hazards. Finally, a model for hydrodynamic sealing coalbed methane reservoirs is established.
A Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) has driven tremendous improvements over acoustic models based on the Gaussian Mixture Model (GMM). However, models based on the hybrid method require a forced-aligned Hidden Markov Model (HMM) state sequence obtained from the GMM-based acoustic model. Training therefore takes a long time, because both the GMM-based acoustic model and the deep learning-based acoustic model must be trained. To solve this problem, an acoustic model using the connectionist temporal classification (CTC) algorithm is proposed. The CTC algorithm does not require the GMM-based acoustic model because it does not use the forced-aligned HMM state sequence. However, previous work on LSTM RNN-based acoustic models using CTC relied on small-scale training corpora. In this paper, an LSTM RNN-based acoustic model using CTC is trained on a large-scale training corpus and its performance is evaluated. The implemented acoustic model achieves a Word Error Rate (WER) of 6.18% for clean speech and 15.01% for noisy speech. This is similar to the performance of the acoustic model based on the hybrid method.
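The WER figures quoted above follow the usual definition: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the recognizer output, divided by the number of reference words. A minimal sketch of that metric is shown below; the function name and the example sentences are illustrative and not taken from the paper.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.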
The Sanjiang Plain, where nearly 20 kinds of wetlands now exist, is one of the largest wetland distribution areas in China. Identifying each wetland type and extracting it separately through automatic interpretation of Landsat TM remote sensing images is extremely important. However, most wetland types cannot be separated from one another because their spectra in TM images are similar and indistinct. Special processing of the remote sensing images includes spectral enhancement of wetland information, pseudo-color compositing of different TM bands, and algebraic enhancement of TM images. In this way, some kinds of wetlands, such as Sparganium stoloniferum and Bolboschoenus maritimus, can be identified. In many cases, however, these methods are still insufficient because of noise introduced by atmospheric transmission and other factors. The physical features of wetlands that produce the diversity of their spectral information must therefore be taken into consideration, including the spatio-temporal characteristics of wetland distribution, the seasonal differences in wetland landscapes, the growing environment, and the vertical structure of wetland vegetation. In addition, human alteration of the spatial structure of wetlands, such as the exploitation of some wetland types, can also serve as an important indicator for identifying wetlands in remote sensing images. On the basis of this geographical analysis, a set of remote sensing classification models for wetlands can be established, and many wetland types, such as paddy field, reed swamp, peat mire, meadow, Carex marsh, and paludified meadow, can consequently be distinguished. The methods of geographical analysis and model establishment are described in detail in this article.
In this paper, several properties of the one-way classification model with skew-normal random effects are obtained, such as the moment generating function, the density function, and the noncentral skew chi-square distribution. Based on the EM algorithm, we discuss the maximum likelihood (ML) estimation of the unknown parameters. For the problem of testing the fixed effect, a parametric bootstrap (PB) approach is developed. Finally, simulation results on the Type I error rates and powers of the PB approach are obtained, which show that the PB approach performs satisfactorily on both, even for small samples. For illustration, our main results are applied to a real data problem.
A Fisher discriminant analysis (FDA) model for predicting the classification of rockburst in deep-buried long tunnels was established based on Fisher discriminant theory and the actual characteristics of the project. First, the major factors of rockburst, namely the maximum tangential stress of the cavern wall σθ, the uniaxial compressive strength σc, the uniaxial tensile strength σt, and the elastic energy index of rock Wet, were taken into account in the analysis. Three factors, the stress coefficient σθ/σc, the rock brittleness coefficient σc/σt, and the elastic energy index Wet, were defined as the criterion indices for rockburst prediction in the proposed model. After training and testing on 12 sets of measured data, the discriminant functions of the FDA were solved, and the misdiscrimination ratio was zero. Moreover, the proposed model was used to predict rockbursts in the Qinling tunnel along the Xi'an-Ankang railway. The three forecast results are identical with the actual situation. Therefore, the prediction accuracy of the FDA model is acceptable.
Accurate simulation of tropical cyclone tracks is a prerequisite for tropical cyclone risk assessment. Based on the spatial characteristics of tropical cyclone tracks in the Northwest Pacific region, a stochastic simulation method built on a classification model is used to simulate tropical cyclone tracks in this region. The simulation includes the classification method, the genesis model, the traveling model, and the lysis model. Tropical cyclone tracks in the Northwest Pacific region are classified into five categories on the basis of their movement characteristics and steering positions. In the genesis model, Gaussian kernel probability density functions with the biased cross-validation method are used to simulate the annual occurrence number and genesis positions. The traveling model is established on the basis of the mean and mean square error of the historical 6 h latitude and longitude displacements. The termination probability is used as the discrimination standard in the lysis model. This stochastic simulation method is then applied and qualitatively evaluated with different diagnostics. The results show that tropical cyclone tracks in the Northwest Pacific can be satisfactorily simulated with this classification model.
Lithofacies classification is essential for oil and gas reservoir exploration and development. The traditional method of lithofacies classification is based on "core calibration logging" and the experience of geologists. This approach has strong subjectivity, low efficiency, and high uncertainty, and this uncertainty may be one of the key factors affecting the results of 3D modeling of tight sandstone reservoirs. In recent years, deep learning, a cutting-edge artificial intelligence technology, has attracted attention from various fields. However, deep-learning techniques have not been sufficiently studied in the field of lithofacies classification. Therefore, this paper proposes a novel hybrid deep-learning model that combines the efficient feature-extraction ability of convolutional neural networks (CNN) with the ability of long short-term memory networks (LSTM) to describe time-dependent features, and uses it to conduct lithofacies classification experiments. The results of a series of experiments show that the hybrid CNN-LSTM model had an average accuracy of 87.3% and the best classification performance compared with the CNN, the LSTM, and three commonly used machine learning models (support vector machine, random forest, and gradient boosting decision tree). In addition, the borderline synthetic minority oversampling technique (BSMOTE) is introduced to address the class-imbalance issue in the raw data. The results show that balancing the data can significantly improve the accuracy of lithofacies classification. Furthermore, based on the fine lithofacies constraints, the sequential indicator simulation method is used to establish a three-dimensional lithofacies model, which provides a fine description of the spatial distribution of tight sandstone reservoirs in the study area. According to this comprehensive analysis, the proposed CNN-LSTM model, with class imbalance eliminated, can be effectively applied to lithofacies classification and is expected to improve the realism of the geological model for tight sandstone reservoirs.
The development of deep learning has revolutionized image recognition technology. Designing faster and more accurate image classification algorithms has become our research interest. In this paper, we propose a new algorithm called stochastic depth networks with deep energy model (SADIE), which improves the stochastic depth neural network with a deep energy model to provide attributes of images and analyze their characteristics. First, a Bernoulli distribution is used to select the current layer of the neural network in order to prevent vanishing gradients during training. Then, in the backpropagation process, an energy function is designed to optimize the target loss function of the neural network. We also explore the possibility of combining the Adam and SGD optimizers in deep neural networks. Finally, we use training data to train our network based on the deep energy model and testing data to verify the performance of the model. The results obtained in this research include the classification labels of the images and show that our model achieves high accuracy and performance.
The complexity of the sand-casting process, combined with the interactions between process parameters, makes it difficult to control casting quality, resulting in a high scrap rate. A strategy based on a data-driven model was proposed to reduce casting defects and improve production efficiency; it comprises a random forest (RF) classification model, feature importance analysis, and process parameter optimization with Monte Carlo simulation. The collected data, which cover four types of defects and the corresponding process parameters, were used to construct the RF model. The classification results show a recall rate above 90% for all categories. The Gini index was used to assess the importance of the process parameters in the formation of the various defects in the RF model. Finally, the classification model was applied to different production conditions for quality prediction. In the case of process parameter optimization for gas porosity defects, the model serves as the experimental process in the Monte Carlo method to estimate a better temperature distribution. When applied in the factory, the prediction model greatly improved the efficiency of defect detection, and the scrap rate decreased from 10.16% to 6.68%.
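Gini-based feature importance in a random forest is usually built from the impurity decrease achieved by each split on a given parameter, accumulated over all trees. The sketch below shows that building block only, under the assumption that the standard mean-decrease-in-impurity definition is what the abstract refers to; the defect labels are made up.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Gini decrease obtained by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted_children = (
        len(left) / n * gini_impurity(left)
        + len(right) / n * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


# Hypothetical node: two castings with gas porosity, two without defects.
parent = ["porosity", "porosity", "ok", "ok"]
# A split on some process parameter that separates the classes perfectly.
perfect_gain = impurity_decrease(parent, ["porosity", "porosity"], ["ok", "ok"])
# A split that leaves both children as mixed as the parent.
useless_gain = impurity_decrease(parent, ["porosity", "ok"], ["porosity", "ok"])
```

A parameter whose splits repeatedly produce large decreases ranks as important; one whose splits leave the classes mixed contributes almost nothing.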
Objective: Debris flows are cohesive sediment gravity flows that occur in both subaerial and subaqueous settings. Compared with subaerial debris flows, which have been well studied as a geological hazard, subaqueous debris flows, which show complicated sediment composition and sedimentary processes, are poorly understood. The main objective of this work is to establish a classification scheme and facies sequence models of subaqueous debris flows in order to better understand their sedimentary processes and depositional characteristics.
Prognosis is a key technology for improving the reliability, safety, and maintainability of products, and many researchers have devoted themselves to it. However, improving the prediction accuracy of the remaining life of products has been difficult. To predict the lifetime of pneumatic cylinders, which have high reliability and long lifetimes and for which only small samples are available, this paper puts forward a prognosis algorithm based on the path classification and estimation (PACE) model. The PACE model is based entirely on failure data instead of a failure threshold. The typical failure mechanism of pneumatic cylinders is wear and tear. Since the minimum working pressure increases with the number of working cycles, it is chosen as the degradation signal. The PACE model is fundamentally composed of two operations: path classification and remaining useful life (RUL) estimation. Path classification classifies the current degradation path as belonging to one or more previously collected exemplary degradation paths. RUL estimation uses the resulting memberships to estimate the remaining useful life. To verify and validate the PACE prognostic method, six pneumatic cylinders are tested, and the test data are analyzed by PACE prognostics. It is found that the PACE-based prognosis method has higher prediction accuracy and smaller variance, and that the PACE model significantly outperforms population-based prognostics, especially under small-sample conditions. The PACE-based method thus addresses the problem of prediction accuracy in the prognosis of small-sample pneumatic cylinders.
In this study, eight different varieties of maize seeds were used as the research objects. Eighty-one combined preprocessing schemes were applied to the original spectra. Through comparison, Savitzky-Golay (SG) smoothing followed by multivariate scattering correction (MSC) and maximum-minimum normalization (MN) was identified as the optimal preprocessing technique. Competitive adaptive reweighted sampling (CARS), the successive projections algorithm (SPA), and their combination were employed to extract feature wavelengths. Classification models based on back propagation (BP), support vector machine (SVM), random forest (RF), and partial least squares (PLS) were established using full-band data and feature wavelengths. Among all models, the (CARS-SPA)-BP model achieved the highest accuracy of 98.44%. This study offers novel insights and methodologies for the rapid and accurate identification of maize seeds as well as other crop seeds.
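Two of the three preprocessing steps named above can be sketched compactly. The version below assumes the common formulation of MSC (regress each spectrum on the mean spectrum, then remove the fitted offset and slope) followed by per-spectrum min-max scaling. The Savitzky-Golay smoothing step is omitted, the spectra are made-up values, and flat spectra (which would cause division by zero) are assumed not to occur.

```python
def mean_spectrum(spectra):
    """Point-wise mean of equally long spectra, used as the MSC reference."""
    n = len(spectra)
    return [sum(column) / n for column in zip(*spectra)]


def msc(spectra):
    """Scatter correction: fit x = a + b * reference, return (x - a) / b."""
    ref = mean_spectrum(spectra)
    ref_mean = sum(ref) / len(ref)
    ref_var = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for x in spectra:
        x_mean = sum(x) / len(x)
        # Ordinary least-squares slope and intercept against the reference.
        slope = sum((r - ref_mean) * (v - x_mean) for r, v in zip(ref, x)) / ref_var
        intercept = x_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in x])
    return corrected


def min_max(x):
    """Scale one spectrum to the range [0, 1]."""
    lo, hi = min(x), max(x)
    return [(v - lo) / (hi - lo) for v in x]


# Toy spectra: the second is the first with a scale (x2) and offset (+1).
spectra = [[1.0, 2.0, 4.0, 3.0], [3.0, 5.0, 9.0, 7.0]]
corrected = msc(spectra)
normalized = [min_max(x) for x in corrected]
```

Because the two toy spectra differ only by a multiplicative and an additive effect, MSC maps both onto the same curve, which is the kind of variation the correction is meant to remove.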
Early detection of hepatocellular carcinoma (HCC) is critical for effective treatment. The serum level of alpha-fetoprotein (AFP) is currently used for HCC screening, but the cutoff of the AFP test has limited sensitivity (~50%), indicating a high false-negative rate. We have successfully demonstrated that cancer-derived DNA biomarkers can be detected in the urine of patients with cancer and can be used for the early detection of cancer (Jain et al., 2015; Lin et al., 2011; Song et al., 2012; Su, Lin, Song, & Jain, 2014; Su, Wang, Norton, Brenner, & Block, 2008). By combining urine biomarker (uBMK) values and the serum AFP (sAFP) level, a new classification model is proposed for more efficient HCC screening. Several criteria are discussed for optimizing the cutoffs of the uBMK score and the sAFP score. A joint distribution of sAFP and uBMK with a point mass is fitted using the maximum likelihood method. Numerical results show that the sAFP and uBMK data are very well described by the proposed model. A tree-structured sequential test can be optimized by selecting the cutoffs. Bootstrap simulations also show robust classification results with the optimal cuto...
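The abstract above mentions several criteria for choosing cutoffs without naming them. As one commonly used example, and purely as an assumption about what such a criterion might look like, the sketch below picks the cutoff that maximizes Youden's J (sensitivity + specificity - 1) for a single score. The scores and labels are made-up toy values, not patient data, and the paper's two-marker sequential test is not reproduced.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Treat score >= cutoff as a positive call; labels are 1 (case) or 0."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff_by_youden(scores, labels):
    """Return (cutoff, J) for the lowest observed score that maximizes J."""
    best = None
    for cutoff in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, labels, cutoff)
        j = sens + spec - 1.0
        if best is None or j > best[1]:
            best = (cutoff, j)
    return best


# Toy biomarker scores with made-up case (1) / control (0) labels.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, youden_j = best_cutoff_by_youden(scores, labels)
```

In a screening setting, a criterion that weights false negatives more heavily than false positives may be preferable to Youden's J, which treats both errors equally.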
A comprehensive understanding of village development patterns and the identification of different village types are crucial for formulating tailored planning for rural revitalization. However, a model for large-scale village classification to support such planning is still lacking. This study aims to develop a large-scale village classification model using the Gaussian Mixture Model (GMM) to support tailored rural revitalization efforts. First, we propose a multi-dimensional index system to capture the diverse features of a large number of villages. Second, the GMM clustering algorithm is applied to identify distinct village types based on their unique features. The model was employed to classify the 25,409 villages in Hubei Province, China, into four classes. Villages in these classes exhibit discernible differences in spatial distribution, topography, location, economic development level, industrial structure, infrastructure, and resource endowment. In the empirical study, the GMM-based village classification model achieves an overall accuracy of 95.29% against evaluations made by planning experts, indicating substantial concordance between the experts' classifications and the model's results and confirming the model's accuracy and reliability. Based on the identified features, tailored paths are proposed for each village class for rural revitalization efforts.
Funding (cyberbullying detection study): Funded by the Scientific Research Deanship at the University of Hail, Saudi Arabia, through Project Number RG-23092.
Funding (CNN-GRU lithofacies classification study): Supported by the Langfang Science and Technology Program with self-raised funds under the project "Application of Deep Learning-Based Joint Well-Seismic Analysis in Lithology Prediction" (Project No. 2024011013) and by the Science and Technology Innovation Program for Postgraduate Students in IDP, subsidized by the Fundamental Research Funds for the Central Universities, under the project "Research on CNN Algorithm Enhanced by Physical Information for Lithofacies Prediction in Tight Sandstone Reservoirs" (Project No. ZY20250328).
文摘Accurate lithofacies classification in low-permeability sandstone reservoirs remains challenging due to class imbalance in well-log data and the difficulty of the modeling vertical lithological dependencies.Traditional core-based interpretation introduces subjectivity,while conventional deep learning models often fail to capture stratigraphic sequences effectively.To address these limitations,we propose a hybrid CNN–GRU framework that integrates spatial feature extraction and sequential modeling.Heat Kernel Imputation is applied to reconstruct missing log data,and Borderline SMOTE(BSMOTE)improves class balance by augmenting boundary-case minority samples.The CNN component extracts localized petrophysical features,and the GRU component captures depth-wise lithological transitions,to enable spatial-sequential feature fusion.Experiments on real-well datasets from tight sandstone reservoirs show that the proposed model achieves an average accuracy of 93.3%and a Macro F1-score of 0.934.It outperforms baseline models,including RF(87.8%),GBDT(81.8%),CNN-only(87.5%),and GRU-only(86.1%).Leave-one-well-out validation further confirms strong generalization ability.These results demonstrate that the proposed approach effectively addresses data imbalance and enhances classification robustness,offering a scalable and automated solution for lithofacies interpretation under complex geological conditions.
Abstract: The rapid growth in the volume and number of malware-based cyber threats is not the real danger; the real threat lies in the obfuscation of these cyberattacks, which constantly change their behavior and thereby make detection more difficult. Numerous researchers and developers have devoted considerable attention to this topic; however, the field still lacks enough high-quality studies that address these problems. For this reason, this paper presents a novel multi-objective Markov-enhanced adaptive whale optimization (MOMEAWO) cybersecurity model to improve the classification of binary and multi-class malware threats. The model aims to provide an innovative solution for analyzing, detecting, and classifying the behavior of obfuscated malware within their respective families. It covers three classification tasks: binary classification and multi-class classification (four families and 16 malware families). To evaluate the performance of the model, we used a recently published, balanced dataset, the Canadian Institute for Cybersecurity Malware Memory Analysis dataset (CIC-MalMem-2022). The results show near-perfect accuracy in binary classification and high accuracy in multi-class classification compared with related work using the same dataset.
Abstract: A new arrival and departure flight classification method based on the transitive closure algorithm (TCA) is proposed. First, fuzzy set theory and the transitive closure algorithm are introduced. Then four different factors are selected to establish the flight classification model, and a method is given to calculate the delay cost for each class. Finally, the proposed method is applied to flight sequencing problems in a terminal area, and the results are compared with those of the traditional classification method (TCM). The results show that the new classification model is effective in reducing the cost of flight delays, thus optimizing the sequences of arrival and departure flights and improving the efficiency of air traffic control.
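The abstract does not give the four factors or the similarity measure, so the following is only a minimal sketch of the general fuzzy transitive-closure step, assuming a reflexive, symmetric fuzzy similarity matrix is already available. The matrix values, the threshold, and the function names are hypothetical and chosen for illustration.

```python
def maxmin_compose(a, b):
    """Max-min composition of two square fuzzy relation matrices."""
    n = len(a)
    return [[max(min(a[i][k], b[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]


def transitive_closure(relation):
    """Square the relation repeatedly (R, R^2, R^4, ...) until it stops changing."""
    current = [row[:] for row in relation]
    while True:
        squared = maxmin_compose(current, current)
        if squared == current:
            return current
        current = squared


def lambda_cut_classes(closure, lam):
    """Group items whose closure similarity is at least lam into classes."""
    n = len(closure)
    assigned = [False] * n
    classes = []
    for i in range(n):
        if assigned[i]:
            continue
        members = [j for j in range(n) if closure[i][j] >= lam]
        for j in members:
            assigned[j] = True
        classes.append(members)
    return classes


# Hypothetical similarity among four flights (not data from the paper).
similarity = [
    [1.0, 0.8, 0.2, 0.1],
    [0.8, 1.0, 0.4, 0.1],
    [0.2, 0.4, 1.0, 0.6],
    [0.1, 0.1, 0.6, 1.0],
]
closure = transitive_closure(similarity)
print(lambda_cut_classes(closure, 0.5))  # [[0, 1], [2, 3]]
```

Lowering the threshold merges classes and raising it splits them, so the threshold acts as the knob that sets how many flight classes are produced.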
Abstract: This paper presents a fuzzy logic approach to efficiently perform unsupervised character classification for improvement in robustness, correctness and speed of a character recognition system. The characters are first split into eight typographical categories. The classification scheme uses pattern matching to classify the characters in each category into a set of fuzzy prototypes based on a nonlinear weighted similarity function. The fuzzy unsupervised character classification, which is natural in the repre...
Funding: National Natural Science Foundation of China (No. 61171057, No. 61503345); Science Foundation for North University of China (No. 110246); Specialized Research Fund for the Doctoral Program of Higher Education of China (No. 20121420110004); International Office of Shanxi Province Education Department of China; and Basic Research Project in Shanxi Province (Young Foundation).
Abstract: To reduce the amount of data storage and improve the processing capacity of the system, this paper proposes a new method for classifying data sources that combines the phase synchronization model from network clustering with the cloud model. First, the data source is treated as a complex network; after the network topology is obtained, the cloud model of each node's data is determined by the fuzzy analytic hierarchy process (AHP). Second, the expectation, entropy, and hyper-entropy of the cloud model are calculated to obtain a comprehensive coupling strength, which is then used as the edge weight of the topology. Finally, a distribution curve is obtained by iterating the phase of each node with the phase synchronization model, which completes the classification of the data source. This method not only facilitates the storage, cleaning, and compression of data, but also improves the efficiency of data analysis.
Funding: We wish to thank the Ministry of Science and Technology of China for its financial support of the “Project 973” (No. 2002CB211705) and the Science and Technology Administration of Henan Province.
Abstract: Coalbed methane has been explored in many basins worldwide for 30 years and has been developed commercially in some of them. Many researchers have systematically described the characteristics of coalbed methane geology and technology. According to these investigations, a coalbed methane reservoir can be defined as follows: 'a coal seam that contains some coalbed methane and is isolated from other fluid units is called a coalbed methane reservoir'. On the basis of dissection, analysis, and comparison of typical coalbed methane reservoirs, coalbed methane reservoirs can be divided into two classes: hydrodynamic sealing coalbed methane reservoirs and self-sealing coalbed methane reservoirs. The former can be further divided into two sub-classes: hydrodynamic capping coalbed methane reservoirs, which can be divided into five types, and hydrodynamic driving coalbed methane reservoirs, which can be divided into three types. The latter can be divided into three types. Currently, hydrodynamic sealing reservoirs are the main target for coalbed methane exploration and development; self-sealing reservoirs are unsuitable for coalbed methane exploration and development, but they are closely related to coal mine gas hazards. Finally, a model for hydrodynamic sealing coalbed methane reservoirs is established.
Funding: Supported by the Ministry of Trade, Industry & Energy (MOTIE, Korea) under the Industrial Technology Innovation Program (No. 10063424, 'development of distant speech recognition and multi-task dialog processing technologies for in-door conversational robots').
Abstract: Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNNs) have driven tremendous improvements over acoustic models based on the Gaussian Mixture Model (GMM). However, models based on the hybrid method require a forced-aligned Hidden Markov Model (HMM) state sequence obtained from the GMM-based acoustic model. Training therefore takes a long time, because both the GMM-based acoustic model and the deep learning-based acoustic model must be trained. To solve this problem, an acoustic model using the connectionist temporal classification (CTC) algorithm is proposed. The CTC algorithm does not require the GMM-based acoustic model because it does not use the forced-aligned HMM state sequence. However, previous work on LSTM RNN-based acoustic models using CTC relied on small-scale training corpora. In this paper, an LSTM RNN-based acoustic model using CTC is trained on a large-scale training corpus and its performance is evaluated. The implemented acoustic model achieves a Word Error Rate (WER) of 6.18% for clean speech and 15.01% for noisy speech, which is similar to the performance of the acoustic model based on the hybrid method.
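For readers unfamiliar with the two quantities named in this abstract, here is a minimal sketch, not the authors' implementation. It shows the standard CTC collapsing rule (merge repeated frame labels, then drop blanks) and the usual WER definition (word-level edit distance divided by reference length). The blank symbol, function names, and example strings are assumptions made for illustration.

```python
def ctc_collapse(frame_labels, blank="-"):
    """Merge consecutive repeated labels, then remove the blank symbol."""
    output = []
    previous = None
    for label in frame_labels:
        if label != previous and label != blank:
            output.append(label)
        previous = label
    return output


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    previous_row = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current_row = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous_row[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = previous_row[j] + 1
            insertion = current_row[j - 1] + 1
            current_row.append(min(substitution, deletion, insertion))
        previous_row = current_row
    return previous_row[-1] / len(ref)


print(ctc_collapse(list("-hh-ell-lo")))                       # ['h', 'e', 'l', 'l', 'o']
print(word_error_rate("turn on the light", "turn the lights"))  # 0.5
```

The blank between the two runs of `l` is what lets CTC emit a genuine double letter, which is why blanks are removed only after repeats are merged.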
Abstract: The Sanjiang Plain, where nearly 20 kinds of wetlands now exist, is one of the largest wetland areas in China. Identifying each wetland type and extracting it separately through automatic interpretation of Landsat TM remote sensing images is extremely important. However, most wetland types cannot be distinguished from one another because their spectra in TM images are similar and indistinct. Special processing of the remote sensing images includes spectral enhancement of wetland information, pseudo-color compositing of different TM bands, and algebraic enhancement of TM images. In this way, some kinds of wetlands, such as Sparganium stoloniferum and Bolboschoenus maritimus, can be identified. In many cases, however, these methods are still insufficient because of noise introduced by atmospheric transmission and other sources. The physical features of wetlands that underlie the diversity of their spectral information must be taken into consideration; these include the spatial and temporal characteristics of wetland distribution, the seasonal differences in wetland landscapes, the growing environment, and the vertical structure of wetland vegetation. In addition, human alteration of the spatial structure of wetlands, such as the exploitation of some wetland types, can also serve as an important indicator for identifying wetlands in remote sensing images. On the basis of this geographical analysis, a set of remote sensing classification models for wetlands can be established, and many wetland types, such as paddy field, reed swamp, peat mire, meadow, Carex marsh, and paludification meadow, can consequently be distinguished. The methods of geographical analysis and model establishment are described in detail in this article.
Funding: Supported by the Zhejiang Provincial Philosophy and Social Science Planning Zhijiang Youth Project of China (Grant No. 16ZJQN017YB); Ministry of Education of China, Humanities and Social Science Projects (Grant No. 19YJA910006); Zhejiang Provincial Natural Science Foundation of China (Grant No. LY20A010019); Fundamental Research Funds for the Provincial Universities of Zhejiang (Grant No. GK199900299012-204); and Zhejiang Provincial Statistical Science Research Base Project of China (Grant No. 19TJJD08).
Abstract: In this paper, several properties of the one-way classification model with skew-normal random effects are obtained, such as the moment generating function, the density function, and the noncentral skew chi-square distribution. Based on the EM algorithm, we discuss the maximum likelihood (ML) estimation of the unknown parameters. For the problem of testing the fixed effect, a parametric bootstrap (PB) approach is developed. Finally, simulation results on the Type I error rates and powers of the PB approach are obtained; they show that the PB approach performs satisfactorily on both, even for small samples. For illustration, our main results are applied to a real data problem.
Funding: Supported by the National 11th Five-Year Science and Technology Supporting Plan of China (2006BAB02A02) and Central South University Innovation Funded Projects (2009ssxt230, 2009ssxt234).
Abstract: A Fisher discriminant analysis (FDA) model for predicting the classification of rockbursts in deep-buried long tunnels was established based on Fisher discriminant theory and the actual characteristics of the project. First, the major factors of rockburst, namely the maximum tangential stress of the cavern wall σθ, the uniaxial compressive strength σc, the uniaxial tensile strength σt, and the elastic energy index of rock Wet, were taken into account in the analysis. Three factors, the stress coefficient σθ/σc, the rock brittleness coefficient σc/σt, and the elastic energy index Wet, were defined as the criterion indices for rockburst prediction in the proposed model. After training and testing on 12 sets of measured data, the discriminant functions of the FDA were solved, and the misdiscrimination ratio was zero. Moreover, the proposed model was used to predict rockbursts in the Qinling tunnel along the Xi'an-Ankang railway. The results show that the three forecast results are consistent with the actual situation. Therefore, the prediction accuracy of the FDA model is acceptable.
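The abstract does not report the fitted discriminant functions, so the following only restates the three criterion indices as a feature vector and gives the textbook Fisher criterion they would be used with. The symbols x, w, S_b, and S_w are notation introduced here for illustration and are not taken from the paper.

```latex
% Criterion indices collected into a feature vector
x = \left( \frac{\sigma_\theta}{\sigma_c},\; \frac{\sigma_c}{\sigma_t},\; W_{et} \right)^{\mathsf T}

% Fisher criterion: choose the projection direction w that maximizes the ratio of
% between-class scatter S_b to within-class scatter S_w of the projected samples
J(w) = \frac{w^{\mathsf T} S_b \, w}{w^{\mathsf T} S_w \, w}
```

A new sample is then assigned to the rockburst class whose projected centroid lies closest to its own projection, which is the usual decision rule for this kind of model.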
Funding: National Natural Science Foundation of China (51408174); Provincial Undergraduate Innovation and Entrepreneurship Training Program of Hefei University of Technology (S201910359302).
Abstract: Accurate simulation of tropical cyclone tracks is a prerequisite for tropical cyclone risk assessment. Based on the spatial characteristics of tropical cyclone tracks in the Northwest Pacific region, a stochastic simulation method built on a classification model is used to simulate tropical cyclone tracks in this region. The simulation comprises the classification method, the genesis model, the traveling model, and the lysis model. Tropical cyclone tracks in the Northwest Pacific region are classified into five categories on the basis of their movement characteristics and steering positions. In the genesis model, Gaussian kernel probability density functions with the biased cross-validation method are used to simulate the annual occurrence number and genesis positions. The traveling model is established on the basis of the mean and mean square error of the historical 6 h latitude and longitude displacements. The termination probability is used as the discrimination standard in the lysis model. This stochastic simulation method is then applied and qualitatively evaluated with different diagnostics. The results show that tropical cyclone tracks in the Northwest Pacific can be satisfactorily simulated with this classification model.
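As a rough illustration of how a traveling model and a lysis model can be chained, the sketch below advances a track in 6 h steps and stops it with a fixed termination probability. It assumes, beyond what the abstract states, that each displacement is drawn from a normal distribution with the given mean and spread, and that the termination probability is constant. All names and numbers are hypothetical.

```python
import random


def simulate_track(start, mean_step, std_step, termination_prob,
                   max_steps=120, seed=None):
    """Simulate one track as a list of (lat, lon) points at 6 h intervals.

    mean_step and std_step are (lat, lon) pairs in degrees per 6 h step.
    The normal-displacement assumption is illustrative, not the paper's model.
    """
    rng = random.Random(seed)
    lat, lon = start
    track = [(lat, lon)]
    for _ in range(max_steps):
        # Lysis model: stop the track with the given probability at each step.
        if rng.random() < termination_prob:
            break
        # Traveling model: add a random 6 h displacement to the current position.
        lat += rng.gauss(mean_step[0], std_step[0])
        lon += rng.gauss(mean_step[1], std_step[1])
        track.append((lat, lon))
    return track


# Hypothetical genesis point and displacement statistics for one track category.
track = simulate_track(start=(15.0, 140.0), mean_step=(0.5, -1.0),
                       std_step=(0.3, 0.4), termination_prob=0.03, seed=1)
print(len(track), track[-1])
```

In a classification-based scheme, one such set of displacement statistics would be estimated per track category, so each of the five categories gets its own parameters.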
Funding: Supported by the Fundamental Research Funds for the Central Universities (Grant No. 300102278402).
Abstract: Lithofacies classification is essential for oil and gas reservoir exploration and development. The traditional method of lithofacies classification is based on "core calibration logging" and the experience of geologists. This approach is highly subjective, inefficient, and uncertain, and this uncertainty may be one of the key factors affecting the results of 3D modeling of tight sandstone reservoirs. In recent years, deep learning, a cutting-edge artificial intelligence technology, has attracted attention from various fields. However, deep-learning techniques have not yet been studied sufficiently in the field of lithofacies classification. Therefore, this paper proposes a novel hybrid deep-learning model for lithofacies classification experiments that combines the efficient feature-extraction ability of convolutional neural networks (CNN) with the ability of long short-term memory networks (LSTM) to describe time-dependent features. The results of a series of experiments show that the hybrid CNN-LSTM model achieved an average accuracy of 87.3% and the best classification performance compared with the CNN, the LSTM, and three commonly used machine learning models (support vector machine, random forest, and gradient boosting decision tree). In addition, the borderline synthetic minority oversampling technique (BSMOTE) is introduced to address the class-imbalance issue in the raw data. The results show that balancing the data can significantly improve the accuracy of lithofacies classification. Furthermore, based on the fine lithofacies constraints, the sequential indicator simulation method is used to establish a three-dimensional lithofacies model, which provides a fine description of the spatial distribution of tight sandstone reservoirs in the study area. According to this comprehensive analysis, the proposed CNN-LSTM model, combined with the elimination of class imbalance, can be effectively applied to lithofacies classification and is expected to improve the realism of the geological model for tight sandstone reservoirs.
Abstract: The development of deep learning has revolutionized image recognition technology. Designing faster and more accurate image classification algorithms is the focus of our research. In this paper, we propose a new algorithm called stochastic depth networks with deep energy model (SADIE), which augments the stochastic depth neural network with a deep energy model to provide image attributes and analyze their characteristics. First, a Bernoulli distribution is used to decide whether the current layer of the neural network is selected, which prevents vanishing gradients during training. Then, in the backpropagation process, an energy function is designed to optimize the target loss function of the neural network. We also explore the possibility of combining the Adam and SGD optimizers in deep neural networks. Finally, we use training data to train our network based on the deep energy model and testing data to verify the performance of the model. The output of the model is the classification label of each image, and the results show that the model achieves high accuracy and performance.
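The layer-selection step can be illustrated with the general stochastic-depth idea: during training each residual block is kept with a Bernoulli survival probability, and at test time its output is scaled by that probability. The sketch below is an assumption-laden toy that uses scalar "layers" and hypothetical names. It does not include the energy-model part of SADIE, which the abstract does not specify.

```python
import random


def stochastic_depth_forward(x, layers, survival_probs, training, rng=random):
    """Forward pass through residual blocks with Bernoulli layer dropping.

    layers: callables computing each block's residual branch.
    survival_probs: probability of keeping each block during training.
    """
    for layer, keep_prob in zip(layers, survival_probs):
        if training:
            # Bernoulli draw: keep the residual branch or fall back to identity.
            if rng.random() < keep_prob:
                x = x + layer(x)
        else:
            # At test time every block is active, scaled by its survival probability.
            x = x + keep_prob * layer(x)
    return x


# Toy scalar "blocks" standing in for convolutional residual branches.
blocks = [lambda v: 2 * v, lambda v: 1.0]
print(stochastic_depth_forward(1.0, blocks, [0.5, 1.0], training=False))  # 3.0
```

Because a dropped block reduces to the identity, gradients always have a short path back to early layers, which is the usual argument for why this helps with vanishing gradients.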
Funding: Financially supported by the National Key Research and Development Program of China (2022YFB3706800, 2020YFB1710100) and the National Natural Science Foundation of China (51821001, 52090042, 52074183).
Abstract: The complexity of the sand-casting process, combined with the interactions between process parameters, makes it difficult to control casting quality and results in a high scrap rate. A data-driven strategy was proposed to reduce casting defects and improve production efficiency; it comprises a random forest (RF) classification model, feature importance analysis, and process parameter optimization with Monte Carlo simulation. The collected data, which include four types of defects and the corresponding process parameters, were used to construct the RF model. The classification results show a recall rate above 90% for all categories. The Gini index was used to assess the importance of the process parameters in the formation of the various defects in the RF model. Finally, the classification model was applied to different production conditions for quality prediction. In the case of process parameter optimization for gas porosity defects, the model serves as the experimental process in the Monte Carlo method to estimate a better temperature distribution. When applied in the factory, the prediction model greatly improved the efficiency of defect detection, and the scrap rate decreased from 10.16% to 6.68%.
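Gini-based feature importance in a random forest is built from the impurity decrease achieved at each split. As a minimal sketch of that building block, with hypothetical label names and a hypothetical split (only gas porosity is named in the abstract), the calculation looks like this:

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) over the class proportions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity decrease of splitting parent into left and right children."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted_children


# Hypothetical defect labels before and after a split on one process parameter.
parent = ["gas_porosity", "gas_porosity", "other_defect", "other_defect"]
left, right = parent[:2], parent[2:]
print(gini_impurity(parent))               # 0.5
print(gini_decrease(parent, left, right))  # 0.5
```

Summing these decreases over every split that uses a given parameter, across all trees, yields that parameter's importance score.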
Funding: Jointly funded by the National Natural Science Foundation of China (grants No. 41172104, 41202078 and 41372117) and the Major National S&T Program of China (grant No. 2011ZX05009-002).
Abstract: Objective: Debris flows are cohesive sediment gravity flows that occur in both subaerial and subaqueous settings. Compared with subaerial debris flows, which have been well studied as a geological hazard, subaqueous debris flows, with their complicated sediment composition and sedimentary processes, are poorly understood. The main objective of this work is to establish a classification scheme and facies sequence models for subaqueous debris flows in order to better understand their sedimentary processes and depositional characteristics.
Funding: Supported by the Laboratory of Aviation Safety Technical Analysis and Appraisal of the China Academy of Civil Aviation Science and Technology (Grant No. 2009-02).
Abstract: Prognosis is a key technology for improving the reliability, safety, and maintainability of products, and many researchers have devoted themselves to it. However, improving the accuracy of remaining-life prediction has proved difficult. To predict the lifetime of pneumatic cylinders, which are characterized by high reliability, long lifetime, and small sample sizes, this paper puts forward a prognosis algorithm based on the path classification and estimation (PACE) model. The PACE model is based entirely on failure data rather than on a failure threshold. The failure mechanism of pneumatic cylinders is typically wear and tear. Because the minimum working pressure increases with the number of working cycles, it is chosen as the degradation signal. The PACE model fundamentally consists of two operations: path classification and remaining useful life (RUL) estimation. Path classification assigns the current degradation path to one or more previously collected exemplar degradation paths. RUL estimation uses the resulting memberships to estimate the remaining useful life. To verify and validate the PACE prognostic method, six pneumatic cylinders were tested and the test data were analyzed with PACE prognostics. The PACE-based prognosis method shows higher prediction accuracy and smaller variance, and it significantly outperforms population-based prognostics, especially under small-sample conditions. The PACE-based method thus addresses the problem of prediction accuracy in the prognosis of small-sample pneumatic cylinders.
Funding: Supported by the Science and Technology Development Plan Project of Jilin Provincial Department of Science and Technology (No. 20220203112S) and the Jilin Provincial Department of Education Science and Technology Research Project (No. JJKH20210039KJ).
Abstract: In this study, eight different varieties of maize seeds were used as the research objects. Eighty-one combinations of preprocessing methods were applied to the original spectra. Through comparison, Savitzky-Golay (SG) smoothing followed by multivariate scattering correction (MSC) and maximum-minimum normalization (MN) was identified as the optimal preprocessing technique. Competitive adaptive reweighted sampling (CARS), the successive projections algorithm (SPA), and their combination were employed to extract feature wavelengths. Classification models based on back propagation (BP), support vector machine (SVM), random forest (RF), and partial least squares (PLS) were established using full-band data and feature wavelengths. Among all models, the (CARS-SPA)-BP model achieved the highest accuracy, 98.44%. This study offers novel insights and methodologies for the rapid and accurate identification of corn seeds as well as other crop seeds.
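Two of the three preprocessing steps named above have simple closed forms, sketched here in plain Python under the usual textbook definitions. MSC regresses each spectrum on the mean spectrum and removes the fitted offset and slope; max-min normalization rescales each spectrum to [0, 1]. The SG smoothing step is omitted, and the function names and toy spectra are hypothetical.

```python
def min_max_normalize(spectrum):
    """Rescale one spectrum linearly so its minimum is 0 and its maximum is 1."""
    low, high = min(spectrum), max(spectrum)
    if high == low:
        return [0.0] * len(spectrum)
    return [(value - low) / (high - low) for value in spectrum]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.

    Each spectrum s is fitted as s = intercept + slope * reference by least
    squares and then corrected to (s - intercept) / slope. Assumes the
    reference is not constant and every fitted slope is non-zero.
    """
    n_bands = len(spectra[0])
    reference = [sum(s[k] for s in spectra) / len(spectra) for k in range(n_bands)]
    ref_mean = sum(reference) / n_bands
    ref_var = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / n_bands
        slope = sum((r - ref_mean) * (v - s_mean)
                    for r, v in zip(reference, s)) / ref_var
        intercept = s_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in s])
    return corrected


# Two toy spectra that differ only by an offset and a scale factor.
toy = [[1.0, 2.0, 3.0], [3.0, 5.0, 7.0]]
print(msc(toy))                            # both rows close to [2.0, 3.5, 5.0]
print(min_max_normalize([2.0, 4.0, 6.0]))  # [0.0, 0.5, 1.0]
```

After MSC the two toy spectra coincide, which is the intended effect: scatter-induced offset and scaling differences are removed while the spectral shape is kept.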
Abstract: Early detection of hepatocellular carcinoma (HCC) is critical for effective treatment. The alpha-fetoprotein (AFP) serum level is currently used for HCC screening, but the cutoff of the AFP test has limited sensitivity (~50%), indicating a high false negative rate. We have successfully demonstrated that cancer-derived DNA biomarkers can be detected in the urine of patients with cancer and can be used for the early detection of cancer (Jain et al., 2015; Lin et al., 2011; Song et al., 2012; Su, Lin, Song, & Jain, 2014; Su, Wang, Norton, Brenner, & Block, 2008). By combining urine biomarker (uBMK) values and the serum AFP (sAFP) level, a new classification model is proposed for more efficient HCC screening. Several criteria are discussed for optimizing the cutoffs of the uBMK score and the sAFP score. A joint distribution of sAFP and uBMK with a point mass is fitted using the maximum likelihood method. Numerical results show that the sAFP and uBMK data are very well described by the proposed model. A tree-structured sequential test can be optimized by selecting the cutoffs. Bootstrap simulations also show that the classification results are robust with the optimal cutoffs.
Funding: National Natural Science Foundation of China, No. 42293271, No. 41971336.
Abstract: A comprehensive understanding of village development patterns and the identification of different village types are crucial for formulating tailored planning for rural revitalization. However, a model for large-scale village classification that supports such planning is still lacking. This study aims to develop a large-scale village classification model using the Gaussian mixture model (GMM) to support tailored rural revitalization efforts. First, we propose a multi-dimensional index system to capture the diverse features of a massive number of villages. Second, the GMM clustering algorithm is applied to identify distinct village types based on their unique features. The model was employed to classify the 25,409 villages in Hubei Province, China, into four classes. Villages in these classes exhibit discernible differences in spatial distribution, topography, location, economic development level, industrial structure, infrastructure, and resource endowment. In addition, the GMM-based village classification model shows a high level of agreement with evaluations made by planning experts, confirming its accuracy and reliability. In the empirical study, our model achieves an overall accuracy of 95.29%, signifying substantial concordance between the classifications made by planning experts and the results generated by our model. Based on the identified features, tailored paths for rural revitalization are proposed for each village class.