Funding: CITRIS and the Banatao Institute at the University of California, Grant/Award Number: CITRIS-2018-0257.
Abstract:
Background: The association between cancer and venous thromboembolism (VTE) is well established, with cancer patients accounting for approximately 20% of all VTE incidents. In this paper, we compare machine learning (ML) methods with traditional clinical scoring models for predicting the occurrence of VTE in a cancer patient population, identify important features (clinical biomarkers) for ML model predictions, and examine how different approaches to reducing the number of features used in the model affect model performance.
Methods: We developed an ML pipeline that includes three separate feature selection processes and applied it to routine patient care data from the electronic health records of 1910 cancer patients at the University of California Davis Medical Center.
Results: Our ML-based prediction model achieved an area under the receiver operating characteristic curve of 0.778 ± 0.006 (mean ± SD) when trained on a set of 15 features. This is comparable to the model performance when trained on all features in our feature pool [0.779 ± 0.006 (mean ± SD) with 29 features]. Our result surpasses the most validated clinical scoring system for VTE risk assessment in cancer patients by 16.1%. We additionally found that cancer stage information remained a useful predictor after every feature selection process we performed, despite not being used in existing score-based approaches.
Conclusion: These findings indicate that ML can offer new insights and a significant improvement over the most validated clinical VTE risk scoring systems in cancer patients. The results also gave us insight into our feature pool and let us identify the features with the most utility for developing an efficient ML classifier. While a model trained on our entire pool of 29 features significantly outperformed the traditionally used clinical scoring system, we achieved equivalent performance with a subset of only 15 features through strategic feature selection. These results are encouraging for potential applications of ML to predicting cancer-associated VTE in clinical settings, such as bedside decision support systems, where feature availability may be limited.
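The reported metric, area under the receiver operating characteristic curve summarized as mean ± SD, can be illustrated with a minimal sketch. It assumes binary labels (1 = VTE, 0 = no VTE) and one score per patient, and it computes the AUC with the pairwise rank formulation. The function names `roc_auc` and `summarize` are hypothetical, and the abstract does not say how the authors computed or repeated their evaluation, so this is only one plausible reading.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative one.
    Tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarize(aucs):
    """Mean and sample standard deviation over repeated evaluation runs
    (assumption: one AUC per run, at least two runs)."""
    return mean(aucs), stdev(aucs)


if __name__ == "__main__":
    # Toy data for illustration only; these are not values from the study.
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    print(roc_auc(labels, scores))
```

The pairwise form is quadratic in the number of samples, which is acceptable for a cohort of a few thousand patients; a sort-based rank computation would be the usual choice for larger datasets.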
Funding: This work was sponsored in part by the National Natural Science Foundation of China (Nos. 91638204, 61571265, 61621091).
Abstract: In this paper, we present a user-complaint prediction system for mobile access networks based on network monitoring data. By applying machine-learning models, the proposed system relates user complaints to network performance indicators and alarm reports in a data-driven fashion, and predicts complaint events in a fine-grained spatial area within a specific time window. The system uses several special designs to handle the particularities of complaint prediction. Complaint bursts are extracted using linear filtering and threshold detection to reduce the noisy fluctuation in raw complaint events. A fuzzy gridding method is proposed to resolve the inaccuracy in verbally described complaint locations. Furthermore, we combine up-sampling with down-sampling to combat the severe skewness towards negative samples. The proposed system is evaluated using a real dataset collected from a major Chinese mobile operator, in which events due to complaint bursts account for only approximately 0.3% of all recorded events. Results show that our system can detect 30% of complaint bursts 3 h ahead with more than 80% precision. This would yield a corresponding proportion of quality of experience improvement if all predicted complaint events could be handled in advance through proper network maintenance.
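Two of the design steps named in the abstract, burst extraction by linear filtering plus threshold detection and the combination of up-sampling with down-sampling, can be sketched as follows. The abstract gives no filter type, window length, threshold, or sampling ratios, so everything here is an assumption: a causal moving average stands in for the linear filter, positives are up-sampled by replication, negatives are down-sampled at random, and the names `moving_average`, `detect_bursts`, and `rebalance` are hypothetical.

```python
import random


def moving_average(counts, window):
    """Causal moving average, a simple linear (FIR) filter.
    The first few outputs average over the samples seen so far."""
    if window < 1:
        raise ValueError("window must be >= 1")
    smoothed, running = [], 0.0
    for i, c in enumerate(counts):
        running += c
        if i >= window:
            running -= counts[i - window]  # drop the sample leaving the window
        smoothed.append(running / min(i + 1, window))
    return smoothed


def detect_bursts(counts, window=3, threshold=2.0):
    """Return inclusive (start, end) index pairs where the smoothed
    complaint count stays at or above the threshold."""
    bursts, start = [], None
    for i, v in enumerate(moving_average(counts, window)):
        if v >= threshold and start is None:
            start = i
        elif v < threshold and start is not None:
            bursts.append((start, i - 1))
            start = None
    if start is not None:
        bursts.append((start, len(counts) - 1))
    return bursts


def rebalance(samples, labels, pos_factor=3, neg_keep=0.5, seed=0):
    """Replicate each positive pos_factor times (up-sampling) and keep
    each negative with probability neg_keep (down-sampling)."""
    rng = random.Random(seed)
    out = []
    for x, y in zip(samples, labels):
        if y == 1:
            out.extend([(x, y)] * pos_factor)
        elif rng.random() < neg_keep:
            out.append((x, y))
    rng.shuffle(out)
    return out


if __name__ == "__main__":
    # Toy hourly complaint counts for illustration only.
    print(detect_bursts([0, 1, 0, 0, 5, 6, 4, 0, 0, 0, 1, 0]))
```

Smoothing before thresholding keeps a single stray complaint from registering as a burst, at the cost of a short detection delay introduced by the causal window.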