AUTHOR=Allen Cody , Aryal Shiva , Do Tuyen , Gautum Rishav , Hasan Md Mahmudul , Jasthi Bharat K. , Gnimpieba Etienne , Gadhamshetty Venkataramana TITLE=Deep learning strategies for addressing issues with small datasets in 2D materials research: Microbial Corrosion JOURNAL=Frontiers in Microbiology VOLUME=Volume 13 - 2022 YEAR=2022 URL=https://www.frontiersin.org/journals/microbiology/articles/10.3389/fmicb.2022.1059123 DOI=10.3389/fmicb.2022.1059123 ISSN=1664-302X ABSTRACT=Corrosion coatings based on 2D materials such as graphene and hexagonal boron nitride have gained a lot of traction in recent years. Their impermeability, inertness, excellent bonding with underlying substrates, and amenability to functionalization enhance their ability to be customized for a range of environmental, chemical, and medical domains. Material and computer scientists use material databases and machine learning tools to predict novel 2D materials, but unfortunately many of these predicted materials cannot withstand harsh microbially corrosive environments. Therefore, the addition of electrochemical parameters needs to be investigated to help machine learning models improve their predictions of stable materials in these environments. Machine learning models are commonly trained on datasets ranging from tens of thousands of labels to, for state-of-the-art models, on the order of millions. Unfortunately, electrochemical data generation via electrochemical impedance spectroscopy (EIS) and linear polarization resistance (LPR) is complex and time-intensive. Therefore, large datasets are not available, and training of a classifier is difficult and often results in overfitting. Deep learning data augmentation methods help significantly in this regard by generating synthetic electrochemical data resembling the training data classes. We investigated two different deep generative models, a variational autoencoder (VAE) and a generative adversarial network (GAN), for synthetic data generation. 
Experiments demonstrate that GAN-generated synthetic data yielded 3-5% higher performance than VAE-generated synthetic data when used with a neural network, whereas VAE-generated data outperformed GAN-generated data by 5-6% when used with XGBoost. Here, we show how synthetic data from VAE and GAN models can be used for electrochemical modeling of microbially corrosive systems using electrochemical parameters.
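
The augmentation idea summarized in the abstract (enlarging a small labeled electrochemical dataset with class-conditional synthetic samples before training a classifier) can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' method: the generator here is a simple per-class Gaussian sampler standing in for a trained VAE or GAN, and the feature values, feature meanings, labels, and function names (`fit_class_gaussians`, `generate_synthetic`, `augment`) are all hypothetical.

```python
import random
import statistics


def fit_class_gaussians(samples, labels):
    """Estimate a per-feature (mean, standard deviation) pair for each class."""
    stats = {}
    for label in set(labels):
        rows = [x for x, y in zip(samples, labels) if y == label]
        columns = list(zip(*rows))  # transpose rows into feature columns
        stats[label] = [(statistics.mean(c), statistics.stdev(c)) for c in columns]
    return stats


def generate_synthetic(stats, n_per_class, rng):
    """Draw n_per_class synthetic samples per class from the fitted Gaussians.

    Stand-in for sampling from a trained VAE decoder or GAN generator.
    """
    samples, labels = [], []
    for label, feature_stats in sorted(stats.items()):
        for _ in range(n_per_class):
            samples.append([rng.gauss(mu, sigma) for mu, sigma in feature_stats])
            labels.append(label)
    return samples, labels


def augment(samples, labels, n_per_class, seed=0):
    """Return the real data followed by class-conditional synthetic data."""
    rng = random.Random(seed)
    stats = fit_class_gaussians(samples, labels)
    syn_x, syn_y = generate_synthetic(stats, n_per_class, rng)
    return samples + syn_x, labels + syn_y


# Made-up toy data (NOT from the paper): two hypothetical features per sample,
# e.g. log10 of polarization resistance and corrosion potential in volts.
real_x = [[5.1, -0.62], [5.3, -0.60], [4.9, -0.65],
          [3.2, -0.80], [3.0, -0.78], [3.4, -0.83]]
real_y = [1, 1, 1, 0, 0, 0]  # hypothetical labels: 1 = protective, 0 = corroding

aug_x, aug_y = augment(real_x, real_y, n_per_class=50, seed=0)
print(len(real_x), "real samples ->", len(aug_x), "samples after augmentation")
```

In a fuller pipeline, the augmented set would be used only for training, with the classifier (for example a neural network or XGBoost, as compared in the abstract) evaluated on held-out real measurements so that synthetic samples never leak into the test set.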