Abstract: The scarcity of experimental data poses a significant challenge in predicting the self-healing capacity of bacteria-driven concrete. To address this issue, we explored the use of synthetic data generation to augment the limited available dataset. By creating a synthetic dataset derived from real-world data, we substantially expanded the original data volume. We then trained and evaluated multiple machine learning (ML) models, encompassing both probabilistic and ensemble methods, for predicting self-healing capacity. Our comparative analysis revealed that ensemble methods, specifically the random forest (RF) algorithm, achieved the highest performance, with an accuracy and F1-score of 0.863, surpassing the probabilistic models. Furthermore, when applied to real-world cases, the models maintained high predictive accuracy. This work confirms the value of synthetic data for enhancing the accuracy and reliability of predictive models in civil engineering, especially in data-scarce contexts. Our findings underscore the potential of machine learning and artificial intelligence to transform concrete research and highlight the role of synthetic data in overcoming common data limitations.
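The abstract does not state how the synthetic dataset was generated or how the metrics were computed, so the following is only an illustrative sketch of the general workflow, not the authors' method. It assumes a simple Gaussian-jitter augmentation as a stand-in generator and a binary healed/not-healed label; the function names, the `noise_scale` parameter, and the toy feature values are all hypothetical.

```python
import random
from statistics import stdev


def augment_with_jitter(rows, n_new, noise_scale=0.05, seed=0):
    """Create n_new synthetic rows by perturbing randomly chosen real rows.

    rows: list of (features, label) pairs, features being a list of floats.
    Each synthetic feature is the real value plus Gaussian noise whose
    standard deviation is noise_scale times that feature's sample std.
    Labels are copied unchanged from the source row (an assumption).
    """
    rng = random.Random(seed)
    n_features = len(rows[0][0])
    # Per-feature spread of the real data, used to scale the noise.
    spreads = [stdev([f[j] for f, _ in rows]) for j in range(n_features)]
    synthetic = []
    for _ in range(n_new):
        features, label = rng.choice(rows)
        noisy = [x + rng.gauss(0.0, noise_scale * s)
                 for x, s in zip(features, spreads)]
        synthetic.append((noisy, label))
    return synthetic


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels, with F1 on the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical toy records: two made-up numeric features and a binary label.
real_rows = [([1.0, 10.0], 0), ([2.0, 20.0], 1), ([3.0, 30.0], 1)]
synthetic_rows = augment_with_jitter(real_rows, n_new=20)
```

In a workflow of this kind, the classifiers would be trained on the real and synthetic rows together and scored on held-out real cases, so that the reported accuracy and F1-score reflect performance on measured data rather than on the generator's output.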