Abstract
The integration of machine learning (ML) models enhances the efficiency, affordability, and reliability of feature detection in microscopy, yet their development and applicability are hindered by a dependency on scarce, often flawed, manually labeled datasets and by a lack of domain awareness. We addressed these challenges by creating a physics-based synthetic image and data generator, resulting in an ML model that achieves precision (0.86), recall (0.63), F1 score (0.71), and engineering property predictions (R² = 0.82) comparable to those of a model trained on human-labeled data. We enhanced both models by using feature prediction confidence scores to derive an image-wide confidence metric, enabling simple thresholding to eliminate ambiguous and out-of-domain images, which resulted in performance boosts of 5–30% at a filtering-out rate of 25%. Our study demonstrates that synthetic data can eliminate the reliance on human labeling in ML and provides a means for domain awareness in cases where many feature detections per image are needed.
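The confidence-based filtering described above can be illustrated with a minimal sketch. The abstract does not specify how per-feature scores are combined, so the mean used here is an assumption, as are the function names `image_confidence` and `filter_images`, the threshold of 0.7, and the example scores, all of which are hypothetical.

```python
from statistics import mean


def image_confidence(scores):
    """Aggregate per-detection confidence scores into one image-wide value.

    Assumption: a plain mean is used; the study may use another aggregate.
    An image with no detections is given a confidence of 0.0.
    """
    return mean(scores) if scores else 0.0


def filter_images(detections_by_image, threshold):
    """Split images into kept and rejected sets by image-wide confidence."""
    kept, rejected = {}, {}
    for image_id, scores in detections_by_image.items():
        conf = image_confidence(scores)
        if conf >= threshold:
            kept[image_id] = conf
        else:
            rejected[image_id] = conf
    return kept, rejected


# Hypothetical per-detection confidence scores for four images.
example = {
    "img_a": [0.95, 0.90, 0.85],
    "img_b": [0.40, 0.55, 0.35],
    "img_c": [0.80, 0.75],
    "img_d": [],
}

kept, rejected = filter_images(example, threshold=0.7)
print("kept:", sorted(kept))
print("rejected:", sorted(rejected))
```

An aggregate of this kind only becomes stable when each image contains many detections, which is consistent with the abstract's note that the approach suits cases where many feature detections per image are needed.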
Funding
This work was funded by the Electric Power Research Institute (EPRI) under award numbers 10012138 and 10012245.
Support for M. J. Lynch was provided by the NASA Space Technology Graduate Research Opportunities (NSTGRO) under grant number 80NSSC21K1246. M. J. Lynch and K. G. Field would like to acknowledge critical feedback from research group members on the content and visuals used in this manuscript.