Funding: Supported by the Ningxia Key R&D Program (Key) Project (2023BDE02001); the Ningxia Key R&D Program (Talent Introduction Special) Project (2022YCZX0013); the North Minzu University 2022 School-Level Research Platform "Digital Agriculture Empowering Ningxia Rural Revitalization Innovation Team" (Project Number: 2022PT_S10); the Yinchuan City School-Enterprise Joint Innovation Project (2022XQZD009); and the "Innovation Team for Imaging and Intelligent Information Processing" of the National Ethnic Affairs Commission.
Abstract: Widely used deep neural networks currently struggle to achieve optimal performance in purchase intention prediction because of constraints on data volume and hyperparameter selection. To address this issue, this paper builds on the deep forest algorithm, further integrates evolutionary ensemble learning methods, and proposes a novel Deep Adaptive Evolutionary Ensemble (DAEE) model. The model introduces model diversity into the cascade layer, allowing it to adaptively adjust its structure to accommodate complex and evolving purchasing behavior patterns. Moreover, this paper optimizes how feature vectors, enhancement vectors, and prediction results are obtained within the deep forest algorithm to improve the model's predictive accuracy. Results demonstrate that the improved deep forest model is more robust and achieves a 5.02% higher AUC than the baseline model. Furthermore, it trains 6 times faster than deep models, and its accuracy is 0.9% higher than that of other improved models.
Funding: Funded by the National Natural Science Foundation of China (Grant No. 42002134) and the China Postdoctoral Science Foundation (Grant No. 2021T140735).
Abstract: Identifying fractures along a well trajectory is of great significance in determining the subsurface fracture network distribution. Conventional logs typically respond to fracture zones, and almost all wells have such logs. However, detecting fractures through logging responses can be challenging because the log response is weak and complex. To address this problem, we propose a deep learning model for fracture identification using deep forest, which is based on a cascade structure comprising multi-layer random forests. Deep forest can extract complex nonlinear features of fractures from conventional logs through ensemble learning and deep learning. The proposed approach is tested on a dataset from the Oligocene to Miocene tight carbonate reservoirs in D oilfield, Zagros Basin, Middle East. Eight logs are selected to construct the fracture identification model based on a sensitivity analysis of logging curves against fractures: the gamma-ray, caliper, density, compensated neutron, acoustic transit time, and shallow, deep, and flushed zone resistivity logs. Experiments show that the deep forest obtains high recall and accuracy (>92%). In a blind well test, results from the deep forest model correlate well with fracture observations from cores. Compared to the random forest method, a widely used ensemble learning method, the proposed deep forest model improves accuracy by approximately 4.6%.
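The cascade structure described in this abstract can be illustrated with a minimal sketch. It assumes the usual deep forest convention, in which each level's forests emit class-probability vectors that are concatenated with the original features before being passed to the next level. The names `augment`, `run_cascade`, and the two toy "forests" are hypothetical stand-ins for trained random forests, not the authors' implementation.

```python
from typing import Callable, List, Sequence

# A "forest" is modelled as any callable mapping a feature vector
# to a list of class probabilities (assumption for illustration).
Forest = Callable[[Sequence[float]], List[float]]


def augment(original: Sequence[float], class_vectors: List[List[float]]) -> List[float]:
    """Concatenate the original features with every forest's class vector."""
    return list(original) + [p for vec in class_vectors for p in vec]


def run_cascade(features: Sequence[float], levels: List[List[Forest]]) -> List[float]:
    """Pass one sample through all levels; average the last level's class vectors."""
    if not levels or any(not forests for forests in levels):
        raise ValueError("every cascade level needs at least one forest")
    current = list(features)
    class_vectors: List[List[float]] = []
    for forests in levels:
        class_vectors = [forest(current) for forest in forests]
        # The next level sees the original features plus this level's outputs.
        current = augment(features, class_vectors)
    n_classes = len(class_vectors[0])
    return [
        sum(vec[c] for vec in class_vectors) / len(class_vectors)
        for c in range(n_classes)
    ]


def toy_forest_a(x: Sequence[float]) -> List[float]:
    """Hypothetical forest: treats the first feature as P(fracture)."""
    p = min(max(x[0], 0.0), 1.0)
    return [1.0 - p, p]


def toy_forest_b(x: Sequence[float]) -> List[float]:
    """Hypothetical forest: completely uninformative."""
    return [0.5, 0.5]
```

Calling `run_cascade([0.8, 0.1], [[toy_forest_a, toy_forest_b]] * 2)` averages the two forests' outputs at the second level. In a real model each forest would be trained per level, typically with cross-validated class vectors to limit overfitting.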
Funding: Financially supported by the National Key R&D Program of China (No. 2017YFB0702100) and the National Natural Science Foundation of China (No. 51871024).
Abstract: The paper proposes a new deep structure model, called Densely Connected Cascade Forest-Weighted K Nearest Neighbors (DCCF-WKNNs), for corrosion data modelling and corrosion knowledge mining. First, we collect 409 outdoor atmospheric corrosion samples of low-alloy steels as experimental datasets. Then, we describe the proposed methods, including random forests-K nearest neighbors (RF-WKNNs) and DCCF-WKNNs. Finally, we use the collected datasets to verify the performance of the proposed method. The results show that, compared with commonly used and advanced machine-learning algorithms such as artificial neural networks (ANN), support vector regression (SVR), random forests (RF), and cascade forests (cForest), the proposed method obtains the best prediction results. In addition, the method can predict how the corrosion rate varies with any single environmental variable, such as pH, temperature, relative humidity, SO2, rainfall, or Cl-. In this way, the threshold of each variable at which the corrosion rate changes sharply can be obtained.
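The weighted K nearest neighbors component can be sketched as follows. The abstract does not state the weighting scheme, so inverse-distance weighting is assumed here purely for illustration. The function name `weighted_knn_predict` and its parameters are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def weighted_knn_predict(
    train: List[Tuple[Sequence[float], float]],
    query: Sequence[float],
    k: int = 3,
    eps: float = 1e-9,
) -> float:
    """Predict a target as the inverse-distance-weighted mean of the k nearest samples.

    `train` holds (feature_vector, target) pairs, e.g. environmental
    variables paired with a measured corrosion rate.
    """
    if not train or k < 1:
        raise ValueError("need at least one training sample and k >= 1")
    # Sort all training samples by Euclidean distance to the query.
    by_distance = sorted(
        ((math.dist(x, query), y) for x, y in train),
        key=lambda pair: pair[0],
    )
    neighbours = by_distance[:k]
    # Closer neighbours get larger weights; eps avoids division by zero.
    weights = [1.0 / (d + eps) for d, _ in neighbours]
    return sum(w * y for w, (_, y) in zip(weights, neighbours)) / sum(weights)
```

Sweeping a single coordinate of `query` while holding the others fixed gives a prediction curve for one variable, which is the kind of single-variable analysis the abstract describes.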
Funding: Fundamental Research Foundation for Universities of Heilongjiang Province, Grant/Award Number: LGYC2018JQ003.
Abstract: With the continuous development of machine learning and the increasing complexity of financial data analysis, machine learning models are increasingly used to solve pressing and difficult problems in the financial industry. To improve the effectiveness of stock trend prediction and address problems in time series data processing, this paper analyses the S&P 500 index and combines the fuzzy affiliation (membership) function with stock-related technical indicators to obtain nominal data that broadly reflect the constituent stocks as the time series changes. Meanwhile, because the setting and adjustment of hyperparameters in current machine learning algorithms rely too heavily on empirical knowledge, this paper uses the deep forest model to train on the stock data separately. The experimental results show that (1) the extreme random forest and the multi-grained cascade forest are both more accurate than the gated recurrent unit (GRU) model when the un-fuzzified index-adjusted dataset is used as input features; (2) the accuracy of both the extreme random forest and the multi-grained cascade forest improves when the fuzzy index-adjusted dataset is used as input features; (3) the accuracy of the extreme random forest with the fuzzy index-adjusted dataset is 18.89% higher than with the un-fuzzified dataset; and (4) the average accuracy of the multi-grained cascade forest with the fuzzy index-adjusted dataset increases by 5.67%.
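How a membership function turns a numeric technical indicator into nominal data can be sketched as below. The abstract does not give the function's shape, the indicators, or the breakpoints, so the triangular form, the 0 to 100 indicator range, and the three labels are all assumptions made for illustration.

```python
from typing import Dict, Tuple

# Hypothetical fuzzy sets for an indicator scaled to 0..100:
# each entry is (left foot, peak, right foot) of a triangle.
FUZZY_SETS: Dict[str, Tuple[float, float, float]] = {
    "low": (-50.0, 0.0, 50.0),
    "medium": (0.0, 50.0, 100.0),
    "high": (50.0, 100.0, 150.0),
}


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership: 0 at or outside a and c, rising to 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzify(x: float) -> Dict[str, float]:
    """Degree of membership of x in every fuzzy set."""
    return {label: triangular(x, *abc) for label, abc in FUZZY_SETS.items()}


def to_nominal(x: float) -> str:
    """Nominal label: the fuzzy set in which x has the highest membership."""
    memberships = fuzzify(x)
    return max(memberships, key=memberships.get)
```

For example, an indicator value of 80 belongs mostly to "high" and partly to "medium", so `to_nominal(80.0)` yields "high". The resulting labels can then be fed to a tree-based model as categorical features.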
Funding: The work is supported by the National Key R&D Program of China (No. 2021YFB2700500, 2021YFB2700503). Tao Wang received the grant; the sponsor's website is https://service.most.gov.cn/.
Abstract: In bearing fault diagnosis research, classical deep learning models suffer from too many parameters and high computing cost. In addition, they are not effective when data are scarce. Deep forest, proposed in recent years, has fewer hyperparameters and adapts the depth of the model automatically. Weighted deep forest (WDF) further improves deep forest by assigning each decision tree a weight based on its accuracy. This paper proposes a weighted deep forest model-based bearing fault diagnosis method (WDBM). The WDBM is a novel bearing fault diagnosis method that not only inherits the advantages of WDF, including strong robustness, good generalization, fewer parameters, and faster convergence, but also achieves effective, high-precision, low-cost diagnosis with small samples. To verify the performance of the WDBM, experiments are carried out on the Case Western Reserve University (CWRU) bearing data set. Experimental results demonstrate that WDBM achieves comparable recognition accuracy with less computational overhead and faster convergence.
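The tree-weighting idea behind WDF can be sketched as a weighted vote. The abstract only says that weights are based on each tree's accuracy, so the accuracy-proportional normalization below is an assumption, and `weighted_class_vector` is a hypothetical helper rather than the published algorithm.

```python
from typing import List, Sequence


def weighted_class_vector(
    tree_probs: List[Sequence[float]],
    tree_accuracies: Sequence[float],
) -> List[float]:
    """Combine per-tree class probabilities with accuracy-proportional weights.

    tree_probs[i] is tree i's class-probability vector for one sample;
    tree_accuracies[i] is that tree's accuracy on held-out data.
    """
    if not tree_probs or len(tree_probs) != len(tree_accuracies):
        raise ValueError("need one accuracy per tree and at least one tree")
    total = sum(tree_accuracies)
    if total <= 0:
        raise ValueError("accuracies must sum to a positive value")
    # Normalize so the weights sum to 1 and the output stays a distribution.
    weights = [acc / total for acc in tree_accuracies]
    n_classes = len(tree_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, tree_probs))
        for c in range(n_classes)
    ]
```

With equal accuracies this reduces to the plain average used by an unweighted forest. A more accurate tree pulls the combined vector toward its own prediction.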
Funding: Funded by the Deputyship for Research & Innovation, Ministry of Education in Saudi Arabia, through Project Number IFP2021-043.
Abstract: This article introduces a new medical Internet of Things (IoT) framework for an intelligent fall detection system for senior people, based on our proposed deep forest model. The cascade multi-layer structure of the deep forest classifier generates new features at each level with minimal hyperparameters compared to deep neural networks. Moreover, the optimal number of deep forest layers is estimated automatically using an early stopping criterion on the validation accuracy at each generated layer. The proposed classifier was tested and evaluated on the public SmartFall dataset, which is acquired from a three-axis accelerometer in a smartwatch. It includes 92,781 training samples and 91,025 testing samples with two labeled classes, namely non-fall and fall. Our deep forest classifier achieved the best accuracy score of 98.0%, outperforming three machine learning models (K-nearest neighbors, decision trees, and traditional random forest) and two deep learning models (dense neural networks and convolutional neural networks). Once security and privacy aspects are addressed in future work, the proposed medical IoT framework for fall detection of elderly people is suitable for real-time healthcare application deployment.
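The automatic choice of depth can be sketched as a simple early-stopping loop over validation accuracy. The abstract does not specify the stopping rule, so the `patience` and `min_gain` parameters are assumptions. `validate_layer` is a hypothetical callback that trains the cascade up to the given depth and returns its validation accuracy.

```python
from typing import Callable


def choose_depth(
    validate_layer: Callable[[int], float],
    max_layers: int = 20,
    patience: int = 2,
    min_gain: float = 0.0,
) -> int:
    """Return the number of cascade layers to keep.

    Layers are added one at a time; growth stops once `patience`
    consecutive layers fail to beat the best accuracy by more than `min_gain`.
    """
    best_acc = float("-inf")
    best_depth = 0
    stale = 0
    for depth in range(1, max_layers + 1):
        acc = validate_layer(depth)
        if acc > best_acc + min_gain:
            best_acc, best_depth, stale = acc, depth, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_depth
```

The function returns the depth with the best validation accuracy seen so far, not the depth at which growth stopped, so trailing layers that did not help are discarded.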
Funding: Funded by the Henan Provincial Key R&D and Promotion Special Project (Science and Technology Tackling) (212102210165); the National Social Science Foundation Key Project (20AZD114); the Henan Provincial Higher Education Key Research Project Program (20B520008); and the Public Security Behavior Scientific Research and Technological Innovation Project of the Chinese People's Public Security University (2020SYS08).
Abstract: The dark web is a shadow area hidden in the depths of the Internet that is difficult to access through common search engines. Because of its anonymity, the dark web has gradually become a hotbed for a variety of cyber-crimes. Although research based on machine learning or deep learning has proven effective for analyzing dark web traffic in recent years, pain points remain, such as low accuracy, insufficient real-time performance, and limited application scenarios. To address the difficulties faced by existing automated dark web traffic analysis methods, a novel method named Dark-Forest is proposed to analyze the behavior of dark web traffic. In this method, a particle swarm optimization algorithm is first used to filter out redundant features of dark web traffic data, which shortens the model's training and inference time to meet the real-time requirements of dark web detection. Then, the selected traffic features are analyzed and classified using the DeepForest model as a backbone classifier. A comparison with current mainstream methods shows that Dark-Forest combines the advantages of statistical machine learning and deep learning and achieves an accuracy of 87.84%. The method not only outperforms baseline methods such as Random Forest, MLP, CNN, and the original DeepForest on both large-scale and small-scale datasets, but can also detect normal network traffic, tunnel network traffic, and anonymous network traffic, which may close the gap between different network traffic analysis tasks. It therefore has wider application scenarios and higher practical value.
Funding: Supported by grants from the Industry Prospecting and Common Key Technology Key Projects of the Jiangsu Province Science and Technology Department (Grant no. BE2020721); the Special Guidance Funds for Service Industry of the Jiangsu Province Development and Reform Commission (Grant no. (2019)1089); the Big Data Industry Development Pilot Demonstration Project of the Ministry of Industry and Information Technology of China (Grant no. (2019)243, (2020)84); the Industrial and Information Industry Transformation and Upgrading Guiding Fund of the Jiangsu Economy and Information Technology Commission (Grant no. (2018)0419); the Research Project of Jiangsu Province Sciences (Grant no. 2019-2020ZZWKT15); the fund of the Jiangsu Engineering Research Center of the Jiangsu Province Development and Reform Commission (Grant no. (2020)1460); and the fund of the Jiangsu Digital Future Integration Innovation Center (Grant no. (2018)498).
Abstract: Along with the development of 5G networks and Internet of Things technologies, there has been an explosion in personalized healthcare systems. Introducing 5G and Artificial Intelligence (AI) into a diabetes management architecture can increase the efficiency of existing systems and allow complications of diabetes to be handled more effectively. In this article, we propose a 5G-based Artificial Intelligence Diabetes Management architecture (AIDM), which can help physicians and patients manage both acute and chronic complications. The AIDM contains five layers: the sensing layer, the transmission layer, the storage layer, the computing layer, and the application layer. We build a test bed for the transmission and application layers. Specifically, in the transmission layer we apply a delay-aware RA optimization based on a double-queue model to improve access efficiency in smart hospital wards. In the application layer, we build a prediction model using a deep forest algorithm. Results on real-world data show that our AIDM can enhance the efficiency of diabetes management and improve the diabetes screening rate.
Abstract: The connectivity of sandbodies is a key constraint on the exploration effectiveness of Bohai A Oilfield. Conventional connectivity studies often use methods such as seismic attribute fusion, but the development of contiguous composite sandbodies in this area makes it challenging to characterize connectivity changes with conventional seismic attributes. To address this problem in the Bohai A Oilfield, this study proposes a big data analysis method based on the Deep Forest algorithm to predict sandbody connectivity. First, typical sandbodies with reliable connectivity were selected by compiling the abundant exploration and development sandbody data in the study area. Then, sensitive seismic attributes were extracted to obtain training samples. Finally, a mapping model between attribute combinations and sandbody connectivity was established with the Deep Forest algorithm through machine learning. This method achieves the first quantitative determination of connectivity for continuous composite sandbodies in the Bohai Oilfield. Compared with conventional connectivity discrimination methods such as high-resolution processing and seismic attribute analysis, this method can incorporate the sandbody characteristics of the study area during machine learning and judge connectivity jointly from multiple seismic attributes. The results show that the method is accurate and timely in predicting connectivity for continuous composite sandbodies. Applied to the Bohai A Oilfield, it successfully identified multiple sandbody connectivity relationships and provided strong support for the subsequent exploration potential assessment and well placement optimization. The method also offers a new approach to studying sandbody connectivity under similarly complex geological conditions.
Abstract: To efficiently mine threat intelligence from the vast array of open-source cybersecurity analysis reports on the web, we have developed the Parallel Deep Forest-based Multi-Label Classification (PDFMLC) algorithm. Initially, open-source cybersecurity analysis reports are collected and converted into a standardized text format. Subsequently, five tactics category labels are annotated, creating a multi-label dataset for tactics classification. To address the low execution efficiency and limited scalability of the sequential deep forest algorithm, our PDFMLC algorithm employs broadcast variables and the Lempel-Ziv-Welch (LZW) algorithm, significantly enhancing its acceleration ratio. Furthermore, the PDFMLC algorithm incorporates label mutual information from the established dataset as input features. This captures latent label associations and significantly improves classification accuracy. Finally, we present the PDFMLC-based Threat Intelligence Mining (PDFMLC-TIM) method. Experimental results demonstrate that the PDFMLC algorithm exhibits exceptional node scalability and execution efficiency. In addition, the PDFMLC-TIM method proficiently classifies the text of cybersecurity analysis reports and extracts tactics entities to construct comprehensive threat intelligence, which is then output as correctly formatted STIX 2.1 threat intelligence.
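The Lempel-Ziv-Welch (LZW) algorithm named in this abstract can be sketched as below. This is the textbook byte-oriented variant with an unbounded dictionary. How PDFMLC applies it (for example, to shrink data shared between nodes) is not stated in the abstract, so the sketch makes no claim about that integration.

```python
from typing import Dict, List


def lzw_compress(data: bytes) -> List[int]:
    """Encode bytes as a list of dictionary codes (codes 0..255 are single bytes)."""
    table: Dict[bytes, int] = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes: List[int] = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate  # keep extending the known phrase
        else:
            codes.append(table[current])
            table[candidate] = next_code  # learn the new phrase
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: List[int]) -> bytes:
    """Rebuild the original bytes by replaying the dictionary construction."""
    if not codes:
        return b""
    table: Dict[int, bytes] = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # Special case: the phrase being defined is used immediately.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        output.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(output)
```

Repetitive input compresses to fewer codes than input bytes, and `lzw_decompress(lzw_compress(data))` returns `data` unchanged. A production codec would also cap the dictionary size and pack codes into a bit stream.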