Abstract: Customer attrition in the banking industry occurs when consumers stop using the goods and services offered by the bank for some time and, after that, end their connection with the bank. Customer retention is therefore essential in today's extremely competitive banking market. Additionally, a solid customer base helps attract new consumers by fostering confidence and referrals from the current clientele. These factors make reducing client attrition a crucial step that banks must pursue. In our research, we aim to examine bank data and forecast which users are most likely to discontinue using the bank's services and cease to be paying customers. We use various machine learning algorithms to analyze the data and present a comparative analysis across different evaluation metrics. In addition, we developed a data visualization RShiny app to support data science and management work on customer churn analysis. Analyzing this data will help the bank identify attrition trends and act to retain customers on the verge of leaving.
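As a concrete illustration of the comparative workflow the abstract describes, the following minimal sketch trains two commonly used classifiers on a tabular churn dataset and reports several evaluation metrics. The file name and the `Exited` label column are assumptions about the dataset layout, not details from the paper.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

# Illustrative only: column names assume a typical bank-churn dataset.
df = pd.read_csv("bank_churn.csv")
X = pd.get_dummies(df.drop(columns=["Exited"]))  # one-hot encode categoricals
y = df["Exited"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)

for name, model in [("LogReg", LogisticRegression(max_iter=1000)),
                    ("RandomForest", RandomForestClassifier(n_estimators=200, random_state=0))]:
    model.fit(X_tr, y_tr)
    proba = model.predict_proba(X_te)[:, 1]   # churn probability
    pred = (proba >= 0.5).astype(int)
    print(name,
          f"acc={accuracy_score(y_te, pred):.3f}",
          f"f1={f1_score(y_te, pred):.3f}",
          f"auc={roc_auc_score(y_te, proba):.3f}")
```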
Abstract: Data production elements are driving profound transformations in the real economy across production objects, methods, and tools, generating significant economic effects such as industrial structure upgrading. This paper aims to reveal the impact mechanism of data elements on the “three transformations” (high-end, intelligent, and green) in the manufacturing sector, theoretically elucidating the intrinsic mechanisms by which data elements influence these transformations. The study finds that data elements significantly enhance the high-end, intelligent, and green levels of China's manufacturing industry. In terms of impact pathways, data elements primarily influence the development of high-tech industries and overall green technological innovation, thereby affecting the high-end, intelligent, and green transformation of the industry.
Funding: Supported in part by the National Key Research and Development Program of China under Grant 2024YFE0200600; in part by the National Natural Science Foundation of China under Grant 62071425; in part by the Zhejiang Key Research and Development Plan under Grant 2022C01093; in part by the Zhejiang Provincial Natural Science Foundation of China under Grant LR23F010005; in part by the National Key Laboratory of Wireless Communications Foundation under Grant 2023KP01601; and in part by the Big Data and Intelligent Computing Key Lab of CQUPT under Grant BDIC-2023-B-001.
Abstract: Semantic communication (SemCom) aims to achieve high-fidelity information delivery under low communication consumption by guaranteeing only semantic accuracy. Nevertheless, semantic communication still suffers from unexpected channel volatility, so developing a re-transmission mechanism (e.g., hybrid automatic repeat request [HARQ]) becomes indispensable. In that regard, instead of discarding previously transmitted information, incremental knowledge-based HARQ (IK-HARQ) is deemed a more effective mechanism that can sufficiently utilize the information semantics. However, considering the possible existence of semantic ambiguity in image transmission, a simple bit-level cyclic redundancy check (CRC) might compromise the performance of IK-HARQ. Therefore, there is a strong incentive to revolutionize the CRC mechanism and more effectively reap the benefits of both SemCom and HARQ. In this paper, built on top of Swin Transformer-based joint source-channel coding (JSCC) and IK-HARQ, we propose a semantic image transmission framework, SC-TDA-HARQ. In particular, unlike the conventional CRC, we introduce a topological data analysis (TDA)-based error detection method, which digs out the inner topological and geometric information of images, to capture semantic information and determine the necessity of re-transmission. Extensive numerical results validate the effectiveness and efficiency of the proposed SC-TDA-HARQ framework, especially under limited bandwidth conditions, and manifest the superiority of the TDA-based error detection method in image transmission.
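The abstract does not spell out the TDA mechanics, but a plausible minimal sketch of topology-based error detection is to compare persistence diagrams of the source image and the received reconstruction, and trigger re-transmission when they diverge. The `ripser`/`persim` pipeline and the threshold below are assumptions for illustration, not the paper's exact SC-TDA-HARQ design.

```python
import numpy as np
from ripser import lower_star_img   # 0-dim sublevel-set persistence of an image
from persim import bottleneck       # distance between persistence diagrams

rng = np.random.default_rng(0)
original = rng.random((32, 32))
received = original + 0.3 * rng.random((32, 32))  # simulated channel corruption

d0 = lower_star_img(original)
d1 = lower_star_img(received)
# Drop the essential (infinite-death) point before computing the distance.
d0 = d0[np.isfinite(d0).all(axis=1)]
d1 = d1[np.isfinite(d1).all(axis=1)]
dist = bottleneck(d0, d1)
print("retransmit" if dist > 0.1 else "accept", f"(bottleneck distance {dist:.3f})")
```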
Abstract: The widespread use of rechargeable batteries in portable devices, electric vehicles, and energy storage systems has underscored the importance of accurately predicting their lifetimes. However, data scarcity often limits the accuracy of prediction models, a problem exacerbated by incomplete data arising from issues such as sensor failures. To address these challenges, we propose a novel approach that accommodates data insufficiency by extracting additional information from incomplete data samples, which are usually discarded in existing studies. To fully unleash the predictive power of incomplete data, we investigate the Multiple Imputation by Chained Equations (MICE) method, which diversifies the training data by exploring the potential data patterns. The experimental results demonstrate that the proposed method significantly outperforms the baselines in most of the considered scenarios, reducing the prediction root mean square error (RMSE) by up to 18.9%. Furthermore, we observe that incorporating incomplete data benefits the explainability of the prediction model by facilitating feature selection.
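A minimal sketch of the MICE idea described above, using scikit-learn's `IterativeImputer` as a chained-equations imputer; the synthetic feature matrix and the 15% missingness rate are illustrative stand-ins for real battery data.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))           # stand-in for battery cycling features
mask = rng.random(X.shape) < 0.15       # simulate ~15% sensor dropouts
X_missing = X.copy()
X_missing[mask] = np.nan

# MICE-style chained-equation imputation; sample_posterior=True draws from the
# conditional distribution, so repeated runs yield multiple diversified imputations.
imputations = [
    IterativeImputer(sample_posterior=True, random_state=s).fit_transform(X_missing)
    for s in range(5)
]
X_augmented = np.vstack(imputations)    # diversified training data, as in the paper's idea
print(X_augmented.shape)                # (1000, 5)
```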
Abstract: Face recognition has emerged as one of the most prominent applications of image analysis and understanding, gaining considerable attention in recent years. This growing interest is driven by two key factors: its extensive applications in law enforcement and the commercial domain, and the rapid advancement of practical technologies. Despite significant advancements, modern recognition algorithms still struggle in real-world conditions such as varying lighting, occlusion, and diverse facial poses. In such scenarios, human perception remains well above the capabilities of present technology. Using a systematic mapping study, this paper presents an in-depth review of face detection and face recognition algorithms, offering a detailed survey of advancements made between 2015 and 2024. We analyze key methodologies, highlighting their strengths and limitations in the application context. Additionally, we examine various datasets used for face detection and recognition, focusing on task-specific applications, size, diversity, and complexity. By analyzing these algorithms and datasets, this survey serves as a valuable resource for researchers, identifying research gaps in the field of face detection and recognition and outlining potential directions for future research.
Abstract: Electric Vehicle Charging Systems (EVCS) are increasingly vulnerable to cybersecurity threats as they integrate deeply into smart grids and Internet of Things (IoT) environments, raising significant security challenges. Most existing research primarily emphasizes network-level anomaly detection, leaving critical vulnerabilities at the host level underexplored. This study introduces a novel forensic analysis framework leveraging host-level data, including system logs, kernel events, and Hardware Performance Counters (HPC), to detect and analyze sophisticated cyberattacks such as cryptojacking, Denial-of-Service (DoS), and reconnaissance activities targeting EVCS. Using comprehensive forensic analysis and machine learning models, the proposed framework significantly outperforms existing methods, achieving an accuracy of 98.81%. The findings offer insights into distinct behavioral signatures associated with specific cyber threats, enabling improved cybersecurity strategies and actionable recommendations for robust EVCS infrastructure protection.
Abstract: Accurate capacity and State of Charge (SOC) estimation are crucial for ensuring the safety and longevity of lithium-ion batteries in electric vehicles. This study examines ten machine learning architectures, including Deep Belief Network (DBN), Bidirectional Recurrent Neural Network (BiDirRNN), Gated Recurrent Unit (GRU), and others, using the NASA B0005 dataset of 591,458 instances. Results indicate that the DBN excels in capacity estimation, achieving orders-of-magnitude lower error values and explaining over 99.97% of the predicted variable's variance. When computational efficiency is paramount, the Deep Neural Network (DNN) offers a strong alternative, delivering near-competitive accuracy with significantly reduced prediction times. The GRU achieves the best overall performance for SOC estimation, attaining an R² of 0.9999, while the BiDirRNN provides a marginally lower error at a slightly higher computational cost. In contrast, Convolutional Neural Networks (CNN) and Radial Basis Function Networks (RBFN) exhibit relatively high error rates, making them less viable for real-world battery management. Analyses of error distributions reveal that the top-performing models cluster most predictions within tight bounds, limiting the risk of overcharging or deep discharging. These findings highlight the trade-off between accuracy and computational overhead, offering valuable guidance for battery management system (BMS) designers seeking optimal performance under constrained resources. Future work may further explore advanced data augmentation and domain adaptation techniques to enhance these models' robustness in diverse operating conditions.
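For readers unfamiliar with how a GRU would be wired up for SOC regression, here is a minimal Keras sketch. The window length, channel count, and synthetic data are assumptions for illustration; they do not reproduce the NASA B0005 setup.

```python
import numpy as np
import tensorflow as tf

# Toy stand-in for windowed battery signals: 1000 windows of 30 time steps,
# 3 channels (e.g., voltage, current, temperature); target is SOC in [0, 1].
X = np.random.rand(1000, 30, 3).astype("float32")
y = np.random.rand(1000, 1).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(30, 3)),
    tf.keras.layers.GRU(64),                         # recurrent encoder over the window
    tf.keras.layers.Dense(1, activation="sigmoid"),  # SOC bounded in [0, 1]
])
model.compile(optimizer="adam", loss="mse", metrics=["mae"])
model.fit(X, y, epochs=5, batch_size=64, validation_split=0.2, verbose=0)
print(model.evaluate(X, y, verbose=0))
```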
Funding: Supported by funding (No. UGC/IDS(R)11/21) from the Hong Kong SAR Government.
Abstract: In trying to explain why Hong Kong, China ranks highest in life expectancy in the world, we review what various experts have hypothesized and how data science methods may be used to provide more evidence-based conclusions. As more data become available, we find some data analysis studies too simplistic, and others too overwhelming, in answering this challenging question. We find the approach that analyzes life expectancy-related data (mortality causes and rates for different cohorts) inspiring, and use this approach to study a carefully selected set of targets for comparison. In discussing the factors that matter, we argue that it is more reasonable to try to identify a set of factors that together explain the phenomenon.
Funding: The Trøndelag Health Study (HUNT) is a collaboration between HUNT Research Centre (Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology), Trøndelag County Council, Central Norway Regional Health Authority, and the Norwegian Institute of Public Health. The coordination of the European Prospective Investigation into Cancer and Nutrition - Spain study (EPIC) is financially supported by the International Agency for Research on Cancer (IARC) and by the Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, which has additional infrastructure support provided by the NIHR Imperial Biomedical Research Centre (BRC); it is supported by the Health Research Fund (FIS) - Instituto de Salud Carlos III (ISCIII), the Regional Governments of Andalucía, Asturias, Basque Country, Murcia and Navarra, and the Catalan Institute of Oncology - ICO (Spain). Further funding came from The Netherlands Organisation for Health Research and Development, ZonMw (Grant No. 531-00141-3). Funding for the SHIP study has been provided by the Federal Ministry for Education and Research (BMBF; identification codes 01 ZZ96030, 01 ZZ0103, and 01 ZZ0701). Additional support came from the Swedish Research Council (2018-02527 and 2019-00193) and from the Helmholtz Zentrum München - German Research Center for Environmental Health, which is funded by the German Federal Ministry of Education and Research (BMBF) and by the State of Bavaria.
Abstract: Background: There is insufficient evidence to provide recommendations for leisure-time physical activity among workers across various occupational physical activity levels. This study aimed to assess the association of leisure-time physical activity with cardiovascular and all-cause mortality across occupational physical activity levels. Methods: This study utilized individual participant data from 21 cohort studies, comprising both published and unpublished data. Eligibility criteria included individual-level data on leisure-time and occupational physical activity (categorized as sedentary, low, moderate, and high) along with data on all-cause and/or cardiovascular mortality. A two-stage individual participant data meta-analysis was conducted, with separate analysis of each study using Cox proportional hazards models (Stage 1). These results were combined using random-effects models (Stage 2). Results: Higher leisure-time physical activity levels were associated with lower all-cause and cardiovascular mortality risk across most occupational physical activity levels, for both males and females. Among males with sedentary work, high compared to sedentary leisure-time physical activity was associated with lower all-cause (hazard ratio (HR) = 0.77, 95% confidence interval (95% CI): 0.70-0.85) and cardiovascular mortality (HR = 0.76, 95% CI: 0.66-0.87) risk. Among males with high levels of occupational physical activity, high compared to sedentary leisure-time physical activity was associated with lower all-cause (HR = 0.84, 95% CI: 0.74-0.97) and cardiovascular mortality (HR = 0.79, 95% CI: 0.60-1.04) risk, while HRs for low and moderate levels of leisure-time physical activity ranged between 0.87 and 0.97 and were not statistically significant. Among females, most effects were similar but more imprecise, especially at the higher occupational physical activity levels. Conclusion: Higher levels of leisure-time physical activity were generally associated with lower mortality risks. However, results for workers with moderate and high occupational physical activity levels, especially women, were more imprecise. Our findings suggest that workers may benefit from engaging in high levels of leisure-time physical activity, irrespective of their level of occupational physical activity.
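The two-stage procedure can be made concrete with a short sketch: Stage 1 fits a Cox proportional hazards model per cohort (here via the `lifelines` package on toy data), and Stage 2 pools the log hazard ratios with a DerSimonian-Laird random-effects model. The cohort data, covariates, and effect sizes are synthetic stand-ins, not the study's results.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

# Stage 1: fit a Cox model in each cohort (toy data; real analyses adjust for covariates).
log_hrs, variances = [], []
for seed in range(3):  # three illustrative cohorts
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "time": rng.exponential(10, 500),
        "event": rng.integers(0, 2, 500),
        "high_ltpa": rng.integers(0, 2, 500),  # high vs sedentary leisure-time PA
    })
    cph = CoxPHFitter().fit(df, duration_col="time", event_col="event")
    log_hrs.append(cph.params_["high_ltpa"])
    variances.append(cph.standard_errors_["high_ltpa"] ** 2)

# Stage 2: DerSimonian-Laird random-effects pooling of the log hazard ratios.
log_hrs, variances = np.array(log_hrs), np.array(variances)
w = 1 / variances
q = np.sum(w * (log_hrs - np.sum(w * log_hrs) / np.sum(w)) ** 2)
tau2 = max(0.0, (q - (len(w) - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
w_star = 1 / (variances + tau2)
pooled = np.sum(w_star * log_hrs) / np.sum(w_star)
se = np.sqrt(1 / np.sum(w_star))
print(f"pooled HR = {np.exp(pooled):.2f} "
      f"(95% CI {np.exp(pooled - 1.96*se):.2f}-{np.exp(pooled + 1.96*se):.2f})")
```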
Abstract: Myocarditis is a significant public health concern because of its potential to cause heart failure and sudden death. The standard invasive diagnostic method, endomyocardial biopsy, is typically reserved for cases with severe complications, limiting its widespread use. Conversely, non-invasive cardiac magnetic resonance (CMR) imaging presents a promising alternative for detecting and monitoring myocarditis, because of its high signal contrast that reveals myocardial involvement. To assist medical professionals via artificial intelligence, the authors introduce generative adversarial networks-multi discriminator (GAN-MD), a deep learning model that uses binary classification to diagnose myocarditis from CMR images. Their approach employs a series of convolutional neural networks (CNNs) that extract and combine feature vectors for accurate diagnosis. The authors suggest a novel technique for improving the classification precision of CNNs. Using generative adversarial networks (GANs) to create synthetic images for data augmentation, the authors address challenges such as mode collapse and unstable training. Incorporating a reconstruction loss into the GAN loss function requires the generator to produce images reflecting the discriminator features, thus enhancing the quality of the generated images to more accurately replicate authentic data patterns. Moreover, combining this loss function with other regularisation methods, such as gradient penalty, has proven to further improve the performance of diverse GAN models. A significant challenge in myocarditis diagnosis is class imbalance, where one class dominates the other. To mitigate this, the authors introduce a focal loss-based training method that effectively trains the model on the minority class samples. The GAN-MD approach, evaluated on the Z-Alizadeh Sani myocarditis dataset, achieves superior results (F-measure 86.2%; geometric mean 91.0%) compared with other deep learning models and traditional machine learning methods.
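The focal loss-based training mentioned above can be summarized in a few lines; this is a standard binary focal loss sketch in PyTorch (the α and γ values are common defaults, not necessarily the authors' settings).

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Binary focal loss: down-weights easy examples so training
    concentrates on the minority (e.g., myocarditis-positive) class."""
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = torch.exp(-bce)                                   # probability of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)  # class-balancing weight
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()

# Toy usage: 8 logits and binary labels.
logits = torch.randn(8)
targets = torch.randint(0, 2, (8,)).float()
print(focal_loss(logits, targets))
```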
Funding: Funded by Woosong University Academic Research 2024.
Abstract: Machine fault diagnostics are essential for industrial operations, and advancements in machine learning have significantly advanced these systems by providing accurate predictions and expedited solutions. Machine learning models, especially those utilizing complex algorithms like deep learning, have demonstrated major potential in extracting important information from large operational datasets. Despite their efficiency, machine learning models face challenges, making Explainable AI (XAI) crucial for improving their understandability and fine-tuning. This study examines the importance of feature contribution and selection using XAI in the diagnosis of machine faults. The technique is applied to evaluate different machine learning algorithms: Extreme Gradient Boosting, Support Vector Machine, Gaussian Naive Bayes, and Random Forest classifiers are used alongside Logistic Regression (LR) as a baseline model, and their efficacy and simplicity are evaluated thoroughly through empirical analysis. XAI is used as a targeted feature selection technique to select among 29 time- and frequency-domain features. The XAI approach is lightweight, trained with only the targeted features, and achieves results similar to the traditional approach: accuracy without XAI on the baseline LR is 79.57%, whereas the approach with XAI on LR reaches 80.28%.
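The abstract does not name a specific XAI method, so the sketch below assumes SHAP as the attribution technique: rank the 29 features by mean absolute SHAP value, keep the top ones, and retrain the baseline logistic regression. The synthetic dataset and the top-10 cutoff are illustrative.

```python
import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Toy stand-in for 29 time/frequency-domain vibration features.
X, y = make_classification(n_samples=600, n_features=29, n_informative=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_tr)
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

base = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("all 29 features:", base.score(X_te, y_te))

# Rank features by mean |SHAP| and retrain on the top 10 only.
expl = shap.LinearExplainer(base, X_tr)
shap_vals = expl.shap_values(X_tr)
top = np.argsort(np.abs(shap_vals).mean(axis=0))[::-1][:10]
lean = LogisticRegression(max_iter=1000).fit(X_tr[:, top], y_tr)
print("top-10 SHAP features:", lean.score(X_te[:, top], y_te))
```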
Abstract: Health data and cutting-edge technologies empower medicine and improve healthcare. This has become even more true during the COVID-19 pandemic. Through coronavirus data sharing and worldwide collaboration, the speed of vaccine development for COVID-19 has been unprecedented. Digital and data technologies were quickly adopted during the pandemic, showing how those technologies can be harnessed to enhance public health and healthcare. A wide range of digital data sources are being utilized and visually presented to enhance the epidemiological surveillance of COVID-19. Digital contact tracing mobile apps have been adopted by many countries to control community transmission. Deep learning has been utilized to deliver various solutions to COVID-19 disruption, including outbreak prediction and virus spread tracking.
Abstract: Improving population health by creating more equitable health systems is a major focus of health policy and planning today. However, before we can achieve equity in health, we must first begin by leveraging all we have learned, and are continuing to discover, about the many social, structural, and environmental determinants of health. We must fully consider the conditions in which people are born, grow, learn, work, play, and age. The study of social determinants of health has made tremendous strides in recent decades. At the same time, we have seen huge advances in how health data are collected, analyzed, and used to inform action in the health sector. It is time to merge these two fields, to harness the best from both and to improve decision-making to accelerate evidence-based action toward greater health equity.
Funding: Supported in part by the National Natural Science Foundation of China (62173002, 52301408, 62173255) and the Beijing Natural Science Foundation (4222045).
Abstract: Dear Editor, in this letter, a novel data-driven adaptive predictive control method is proposed using the triangular dynamic linearization technique. The proposed method contains only one time-varying parameter with explicit physical meaning, which can prevent severe deviation in parameter estimation. Specifically, a triangular dynamic linearization (TDL) data model is employed to predict future system outputs, and a feedback regulator is then designed to correct inaccurate predicted outputs. An auto-tuned weighting factor is introduced to alleviate the computational burden in practical applications and further improve output tracking performance. Closed-loop stability conditions are derived by rigorous analysis. Simulation results are provided to demonstrate the efficacy of the proposed method.
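The letter's triangular dynamic linearization details are not given in the abstract, so the sketch below shows the simpler compact-form dynamic-linearization adaptive control loop that TDL generalizes: a single pseudo-gradient estimate is updated from observed input/output increments and then drives the control update. The plant, gains, and set-point are illustrative assumptions, not the letter's scheme.

```python
import numpy as np

# Compact-form dynamic-linearization adaptive control (a simpler relative of
# the letter's TDL method; all parameters below are illustrative).
eta, mu, rho, lam = 0.5, 1.0, 0.6, 1.0
phi = 1.0                      # single time-varying pseudo-gradient estimate
y, u, y_prev, u_prev = 0.0, 0.0, 0.0, 0.0
ref = 1.0                      # constant set-point

for k in range(50):
    # Parameter update from the last observed increments.
    du = u - u_prev
    if abs(du) > 1e-8:
        phi += eta * du / (mu + du**2) * ((y - y_prev) - phi * du)
    # Control update driven by the tracking error.
    u_prev, y_prev = u, y
    u = u + rho * phi / (lam + phi**2) * (ref - y)
    # Unknown plant (stand-in): a mildly nonlinear first-order system.
    y = 0.7 * y + 0.4 * np.tanh(u)

print(f"output after 50 steps: {y:.3f} (target {ref})")
```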
Funding: Supported by the NSFC (12201557) and the Foundation of Zhejiang Provincial Education Department, China (Y202249921).
Abstract: This work presents an advanced and detailed analysis of the mechanisms of hepatitis B virus (HBV) propagation in an environment characterized by variability and stochasticity. Based on some biological features of the virus and the stated assumptions, the corresponding deterministic model is formulated, which takes into consideration the effect of vaccination. This deterministic model is extended to a stochastic framework by considering a new form of disturbance that makes it possible to simulate strong and significant fluctuations. The long-term behaviors of the virus are predicted using stochastic differential equations with second-order multiplicative α-stable jumps. By developing the assumptions and employing novel theoretical tools, the threshold parameter responsible for ergodicity (persistence) and extinction is provided. The theoretical results of the current study are validated by numerical simulations, and parameter estimation is also performed. Moreover, we obtain the following new and interesting findings: (a) in each class, the average time depends on the value of α; (b) the second-order noise has an inverse effect on the spread of the virus; (c) the shapes of the population densities at the stationary level change quickly at certain values of α. These three conclusions can provide a solid research base for further investigation in the field of biological and ecological modeling.
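A minimal Euler-Maruyama-style sketch of simulating an epidemic compartment driven by multiplicative α-stable jumps, using `scipy.stats.levy_stable` for the jump increments; the reduction to two compartments and all parameter values are illustrative assumptions, not the paper's calibrated HBV model.

```python
import numpy as np
from scipy.stats import levy_stable

# Illustrative susceptible-infected pair with multiplicative alpha-stable noise on I.
alpha, sigma, dt, T = 1.8, 0.05, 0.01, 50.0
beta, gamma, Lam, mu = 0.4, 0.1, 0.2, 0.05
S, I = 0.9, 0.1
rng = np.random.default_rng(0)

for _ in range(int(T / dt)):
    # alpha-stable increment over dt scales as dt**(1/alpha).
    jump = sigma * levy_stable.rvs(alpha, 0, random_state=rng) * dt**(1 / alpha)
    dS = (Lam - beta * S * I - mu * S) * dt
    dI = (beta * S * I - (mu + gamma) * I) * dt + I * jump
    S, I = max(S + dS, 0.0), max(I + dI, 0.0)  # keep densities non-negative

print(f"final susceptible/infected densities: {S:.3f}, {I:.3f}")
```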
Abstract: In the landscape of Cloud Computing (CC), the Cloud Datacenter (DC) is a conglomerate of physical servers whose performance can be hindered by bottlenecks as CC services proliferate. A linchpin of CC performance, the Cloud Service Broker (CSB) orchestrates DC selection. Failure to route user requests to suitable DCs turns the CSB into a bottleneck, endangering service quality. To tackle this, deploying an efficient CSB policy becomes imperative, optimizing DC selection to meet stringent Quality-of-Service (QoS) demands. Among the many CSB policies, implementation grapples with challenges such as cost and availability. This article undertakes a holistic review of diverse CSB policies while surveying the predicaments confronting current policies. The foremost objective is to pinpoint research gaps and remedies to inform future policy development. Additionally, it clarifies the various DC selection methodologies employed in CC, serving practitioners and researchers alike. Employing synthetic analysis, the article systematically assesses and compares a range of DC selection techniques, giving decision-makers a pragmatic framework for discerning the technique apt for their needs. In sum, this review underscores the importance of efficient CSB policies in DC selection and their role in optimizing CC performance, contributing to both the general modeling discourse and its practical applications in the CC domain.
Abstract: INTRODUCTION: Artificial intelligence (AI) has brought about revolutionary changes in the medical field, including clinical practice, basic research, and health monitoring. Its powerful computing capabilities and intelligent algorithms enable it to process and analyse large-scale medical data.
Abstract: This article analyzes the performance and use of Support Vector Machines (SVMs) for the critical task of forest fire detection using image datasets. With the increasing threat of forest fires to ecosystems and human settlements, the need for rapid and accurate detection systems is of utmost importance. SVMs, renowned for their strong classification capabilities, are proficient at recognizing patterns associated with fire within images. By training on labeled data, SVMs acquire the ability to identify distinctive attributes associated with fire, such as flames, smoke, or alterations in the visual characteristics of the forest area. The article thoroughly examines the use of SVMs, covering crucial elements such as data preprocessing, feature extraction, and model training, and rigorously evaluates accuracy, efficiency, and practical applicability. The knowledge gained from this study aids the development of efficient forest fire detection systems, enabling prompt responses and improving disaster management. Moreover, the correlation between SVM accuracy and the difficulties presented by high-dimensional datasets is carefully investigated and demonstrated through a case study. The relationship between accuracy scores and the different resolutions used for resizing the training datasets is also discussed. These studies yield a definitive overview of the difficulties faced and the areas requiring further improvement and focus.
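A minimal sketch of the SVM pipeline the article analyzes: load labeled images, resize them to a fixed resolution, flatten to feature vectors, and train an RBF-kernel SVM. The folder layout, 64x64 resolution, and hyperparameters are assumptions; varying `size` is one way to reproduce the resolution-versus-accuracy study mentioned above.

```python
import numpy as np
from pathlib import Path
from PIL import Image
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumes a folder layout like data/fire/*.jpg and data/no_fire/*.jpg
# (paths and resolution are illustrative).
def load_split(root="data", size=(64, 64)):
    X, y = [], []
    for label, cls in enumerate(["no_fire", "fire"]):
        for p in Path(root, cls).glob("*.jpg"):
            img = Image.open(p).convert("RGB").resize(size)
            X.append(np.asarray(img, dtype=np.float32).ravel() / 255.0)
            y.append(label)
    return train_test_split(np.array(X), np.array(y), test_size=0.2, random_state=0)

X_tr, X_te, y_tr, y_te = load_split()
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10, gamma="scale"))
clf.fit(X_tr, y_tr)
print("accuracy:", clf.score(X_te, y_te))
```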
Funding: Partly supported by the Technology Development Program of MSS (No. S3033853) and by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2021R1A4A1031509).
Abstract: Multi-label learning is a generalization of supervised single-label learning based on the assumption that each sample in a dataset may belong to more than one class simultaneously. The main objective of this work is to create a novel framework for learning and classifying imbalanced multi-label data. The proposed framework has two phases. In phase 1, the imbalanced distribution of the multi-label dataset is addressed through the proposed Borderline MLSMOTE resampling method. In phase 2, an adaptive weighted l2,1-norm regularized (Elastic-net) multi-label logistic regression is used to predict unseen samples. In contrast to conventional MLSMOTE, the proposed Borderline MLSMOTE resampling method focuses on samples with concurrent high labels. The minority labels in these samples are called difficult minority labels and are more likely to penalize classification performance. The concurrence measure is considered borderline, and labels associated with such samples are regarded as borderline labels at the decision boundary. In phase 2, a novel adaptive l2,1-norm regularized weighted multi-label logistic regression is used to handle the balanced data with differently weighted synthetic samples. Experimentation on various benchmark datasets shows that the proposed method outperforms existing conventional state-of-the-art multi-label methods, with powerful predictive performance.
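The core MLSMOTE step, interpolating a minority sample toward one of its nearest minority neighbours, can be sketched in a few lines. The Borderline variant proposed in the paper additionally restricts the seed set to borderline samples with difficult minority labels, which this illustration does not implement.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def smote_like(X_min, n_new, k=5, seed=0):
    """Generate synthetic minority samples by interpolating each seed toward
    one of its k nearest minority neighbours (the core MLSMOTE step)."""
    rng = np.random.default_rng(seed)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    _, idx = nn.kneighbors(X_min)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        j = idx[i][rng.integers(1, k + 1)]   # skip neighbour 0 (the point itself)
        lam = rng.random()                   # interpolation coefficient in [0, 1)
        out.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(out)

X_min = np.random.default_rng(1).normal(size=(20, 4))  # toy minority-label samples
print(smote_like(X_min, n_new=10).shape)               # (10, 4)
```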
Funding: Partly supported by the National Natural Science Foundation of China (No. 71771204, 72231010) and the Fundamental Research Funds for the Central Universities (No. E0E48946X2).
Abstract: The literature shows that both market data and financial media impact stock prices; however, using only one kind of data may lead to information bias. Therefore, this study uses market data and news to investigate their joint impact on stock price trends. Combining these two types of information is difficult, however, because of their completely different characteristics. This study develops a hybrid model called MVL-SVM for stock price trend prediction by integrating multi-view learning with a support vector machine (SVM). It works by simply inputting heterogeneous multi-view data simultaneously, which may reduce information loss. Compared with the ARIMA and classic SVM models based on single- and multi-view data, our hybrid model shows statistically significant advantages. In the robustness test, our model outperforms the others by at least 10% accuracy when the sliding windows of news and market data are set to 1-5 days, which confirms our model's effectiveness. Finally, trading strategies based on single stocks and investment portfolios are constructed separately, and the simulations show that MVL-SVM has better profitability and risk control performance than the benchmarks.
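One simple way to realize a multi-view SVM, shown below, is to build one kernel per view and combine them into a single precomputed kernel for the SVM. Whether MVL-SVM uses this particular combination rule is an assumption, and the market/news features here are random stand-ins for real indicators and news embeddings.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 300
X_market = rng.normal(size=(n, 10))   # view 1: e.g., price/volume indicators
X_news = rng.normal(size=(n, 50))     # view 2: e.g., news text embeddings
y = rng.integers(0, 2, n)             # up/down trend label

# Average of per-view RBF kernels = one precomputed multi-view kernel.
K = 0.5 * rbf_kernel(X_market) + 0.5 * rbf_kernel(X_news)
tr, te = np.arange(0, 240), np.arange(240, n)
clf = SVC(kernel="precomputed").fit(K[np.ix_(tr, tr)], y[tr])
print("accuracy:", clf.score(K[np.ix_(te, tr)], y[te]))
```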