Funding: Supported by the Basic Science Research Program through the National Research Foundation of Korea, funded by the Ministry of Education, No. NRF-2018R1D1A1B07041091 and NRF-2019S1A5A8034211.
Abstract: BACKGROUND: Despite the frequent progression from Parkinson’s disease (PD) to Parkinson’s disease dementia (PDD), the basis for diagnosing early-onset Parkinson dementia (EOPD) at an early stage is still insufficient. AIM: To explore the prediction accuracy of sociodemographic factors, Parkinson’s motor symptoms, Parkinson’s non-motor symptoms, and rapid eye movement sleep disorder for diagnosing EOPD using PD multicenter registry data. METHODS: This study analyzed 342 Parkinson patients younger than 65 years (66 EOPD patients and 276 PD patients with normal cognition). An EOPD prediction model was developed using a random forest algorithm, and its accuracy was compared with that of a naive Bayesian model and discriminant analysis. RESULTS: The overall accuracy of the random forest was 89.5%, higher than that of discriminant analysis (78.3%) and that of the naive Bayesian model (85.8%). In the random forest model, the Korean Mini Mental State Examination (K-MMSE) score, the Korean Montreal Cognitive Assessment (K-MoCA) score, the sum of boxes in the Clinical Dementia Rating (CDR), the global score of the CDR, the motor score of the Unified Parkinson’s Disease Rating Scale (UPDRS), and the Korean Instrumental Activities of Daily Living (K-IADL) score were confirmed as the major variables with high weight for EOPD prediction. Among them, the K-MMSE score was the most important factor in the final model. CONCLUSION: Parkinson-related motor symptoms (e.g., motor score of the UPDRS) and instrumental daily performance (e.g., K-IADL score), in addition to cognitive screening indicators (e.g., K-MMSE score and K-MoCA score), were predictors with high accuracy in EOPD prediction.
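To put the reported accuracies in context, the class counts given in METHODS imply a majority-class baseline: the accuracy a trivial rule would reach by always predicting "PD with normal cognition". The sketch below is arithmetic only. It uses the counts and accuracies stated in the abstract; the variable names are illustrative and do not come from the study.

```python
# Class counts as reported in METHODS.
n_eopd = 66      # EOPD patients
n_pd_nc = 276    # PD patients with normal cognition
n_total = n_eopd + n_pd_nc

# Accuracy of a trivial rule that always predicts the larger class.
baseline_accuracy = max(n_eopd, n_pd_nc) / n_total

# Overall accuracies as reported in RESULTS (percent).
reported = {
    "random forest": 89.5,
    "naive Bayesian model": 85.8,
    "discriminant analysis": 78.3,
}

print(f"majority-class baseline: {baseline_accuracy:.1%}")
for name, acc in reported.items():
    diff = acc - 100 * baseline_accuracy
    print(f"{name}: {acc:.1f}% ({diff:+.1f} points vs. baseline)")
```

Because the two groups are imbalanced, overall accuracy is best read against this baseline. The abstract reports overall accuracy only, so per-class figures cannot be derived here.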
Abstract: Machine learning algorithms have emerged as a powerful approach in cybersecurity for identifying the characteristics of Distributed Denial of Service (DDoS) attacks. DDoS attacks, which aim to overwhelm a network or service with a flood of malicious traffic, pose significant threats to online systems. Traditional detection and mitigation methods often struggle to keep pace with the evolving nature of these attacks. Machine learning, with its ability to analyze vast amounts of data and recognize patterns, offers a robust response to this challenge. This paper demonstrates the application of two ML algorithms, K-Means and KNN, as an ensemble forming a dual clustering mechanism, run on PySpark to reach 99% accuracy. Used together, the algorithms identify distinctive features of DDoS attacks that closely reflect real traffic, which makes them a good combination for this aim. After the data were preprocessed, both algorithms on the PySpark foundation achieved 99% accuracy when tuned on the features of a large DDoS dataset. The semi-supervised dataset tabulates traffic anomalies in terms of packet size distribution in relation to Flow Duration. By first training K-Means clustering and then applying KNN to the dataset, the algorithms learn to characterize activity more fully and make traffic density easy to display. The study evaluates the effectiveness of K-Means clustering combined with KNN as ensemble algorithms that adapt well to detecting complex patterns. Ultimately, the results across environments indicate that ML-based approaches significantly improve detection rates compared with traditional methods. Furthermore, ensemble learning methods, which combine two or more models to improve prediction accuracy, handle the complexity and variability of big datasets well, especially when implemented with PySpark.
The findings suggest that the gain in accuracy derives from newer software designed to reflect real conditions. However, challenges remain in deploying these systems, including the need for large, high-quality datasets and the potential for adversarial attacks that attempt to deceive the ML models. Future research should continue to improve the robustness and efficiency of combined algorithms and integrate them with existing security frameworks to provide comprehensive protection against DDoS attacks and in other areas. The dataset was originally created by the University of New Brunswick to analyze DDoS data. It is based on logs of the university’s servers, which recorded various DoS attacks over the publicly available period, and it comprises 80 attributes in total with a size of 6.40 GB. In this dataset, the binary label column is an important part of the final classification: this last column distinguishes normal traffic from attack traffic. Further analysis remains open for investigation. Finally, malicious-traffic alert software, for example, should be trained on the dependence of packet influx on Flow Duration, which provides a basis for working with averages. In achieving such high accuracy, the project serves as an illustration (referenced in the form of excerpts from my Google Colab account) of many tuning attempts. Cybersecurity calls for more work on the character of brute-force attack traffic and normal traffic features overall, since most human investments are digitally based in work, recreational, and social environments.
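The two-stage idea described in this abstract (cluster first with K-Means, then apply KNN) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the paper's PySpark pipeline. The feature pairs are synthetic and assumed already scaled to [0, 1], the initial centroids are fixed for reproducibility, and chaining the stages by using cluster indices as KNN labels is one possible reading of the abstract. The function names `kmeans` and `knn_predict` are hypothetical.

```python
import math
from collections import Counter


def kmeans(points, initial_centroids, iterations=10):
    """Lloyd's algorithm. Returns (centroids, cluster index per point)."""
    centroids = list(initial_centroids)
    assignment = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        assignment = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignment


def knn_predict(train_points, train_labels, query, k=3):
    """Majority vote among the k training points nearest to the query."""
    nearest = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )[:k]
    return Counter(train_labels[i] for i in nearest).most_common(1)[0][0]


# Synthetic (packet size, flow duration) pairs, made up for illustration.
flows = [
    (0.05, 0.10), (0.08, 0.12), (0.06, 0.09), (0.07, 0.15),
    (0.90, 0.85), (0.95, 0.80), (0.88, 0.92), (0.92, 0.88),
]

# Stage 1: K-Means groups the flows into two clusters.
centroids, clusters = kmeans(flows, initial_centroids=[flows[0], flows[-1]])

# Stage 2: KNN assigns a new flow to a cluster using the stage-1 indices as labels.
new_flow = (0.10, 0.10)
print(clusters, knn_predict(flows, clusters, new_flow))
```

The cluster indices carry no meaning by themselves. Deciding which cluster corresponds to attack traffic would require the dataset's binary label column.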
Funding: Supported by the Young Researcher Grant of the National Astronomical Observatories, Chinese Academy of Sciences, the National Basic Research Program of China (973 Program, Grant No. 2011CB811406), and the National Natural Science Foundation of China (Grant Nos. 10733020, 10921303, 11003026 and 11078010).
Abstract: An ensemble prediction model of solar proton events (SPEs), combining the information of solar flares and coronal mass ejections (CMEs), is built. In this model, solar flares are parameterized by the peak flux, the duration and the longitude. In addition, CMEs are parameterized by the width, the speed and the measurement position angle. The importance of each parameter for the occurrence of SPEs is estimated by the information gain ratio. We find that the CME width and speed are more informative than the flare’s peak flux and duration. As the physical mechanism of SPEs is not very clear, a hidden naive Bayes approach, which is a probability-based calculation method from the field of machine learning, is used to build the prediction model from the observational data. As is known, SPEs originate from solar flares and/or shock waves associated with CMEs. Hence, we first build two base prediction models using the properties of solar flares and CMEs, respectively. Then the outputs of these models are combined to generate the ensemble prediction model of SPEs. The ensemble prediction model incorporating the complementary information of solar flares and CMEs achieves better performance than each base prediction model taken separately.
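The information gain ratio mentioned in this abstract can be illustrated with a short sketch. It uses the standard definition (information gain divided by the split information of the feature) on discretized values. The abstract does not say how the authors binned the flare and CME parameters, so the bins and all data below are made up for illustration and do not reproduce any result from the paper.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, divided by split information."""
    total = len(labels)
    conditional = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        conditional += len(subset) / total * entropy(subset)
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)
    return info_gain / split_info if split_info > 0 else 0.0


# Made-up event labels (1 = event occurred) and two hypothetical binned features.
event = [1, 1, 1, 0, 0, 0, 0, 0]
informative_bin = ["high", "high", "high", "low", "low", "low", "low", "low"]
noisy_bin = ["a", "b", "a", "b", "a", "b", "a", "b"]

print(gain_ratio(informative_bin, event))
print(gain_ratio(noisy_bin, event))
```

A feature whose bins separate the events cleanly scores close to 1, while a feature that is nearly independent of the label scores close to 0. This is the sense in which one parameter is called more informative than another.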