Computational Discovery of TTF Molecules with Deep Generative Models

We present a computational workflow for the discovery of novel materials that combines quantum chemical calculations with generative models based on deep neural networks. We apply the workflow to search for molecules suitable for the fusion of triplet-triplet excitations (triplet-triplet fusion, TTF) in blue OLED devices. By applying generative machine learning models, we were able to pinpoint the most promising regions of the chemical space for further exploration. Another neural network, based on graph convolutions, was trained to predict excitation energies; with this network, we estimate the alignment of energy levels and filter molecules before running time-consuming quantum chemical calculations. We present a comprehensive computational evaluation of several generative models and select a modification of the Junction Tree VAE (JT-VAE) as the best one for this application. The proposed approach can be useful for computer-aided design of materials with energy level alignment favorable for efficient energy transfer, triplet harvesting, and exciton fusion processes, which are crucial for the development of next-generation OLED materials.

Distributions of S1 (blue), T1 (orange) and T2 (green) energies for the training (left) and validation (right) data sets, which were used to evaluate the performance of the different ML models.

The JT-VAE generative neural network
The JT-VAE neural network was constructed with hidden and latent space dimensionalities of 450 and 56, respectively. The depths of the graphs and the trees were set to 3 and 20, respectively. To ensure better performance of the autoencoder, the first 10000 steps (warmup) were performed without the Kullback-Leibler divergence loss. One of the most important characteristics of the autoencoder is its reconstruction accuracy. Reconstruction accuracy as a function of the training timestep is presented in Fig. S2 for all three networks, corresponding to TTF cores with zero, one and two side chains. Figure S2 shows that for all datasets the reconstruction accuracy is close to 60-65%, except for the dataset with 2 side chains, for which the accuracy is about 5 percentage points lower. Overall, the values agree well with the reconstruction accuracy reported in the original work on the Junction Tree VAE (Jin et al., 2018).
Figure S2. Reconstruction accuracy for JT-VAE NNs. Results for networks trained on datasets with 0, 1, and 2 side chains are shown in panels a), b) and c), respectively. Results for the training and validation sets are shown in red and green, respectively.
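The training setup described above can be illustrated with a minimal Python sketch. It is not the code used in this work: the class and function names are hypothetical, the KL weight applied after the warmup is an assumption (the text only states that the KL term is switched off for the first 10000 steps), and reconstruction accuracy is assumed here to be the fraction of molecules whose decoded string matches the input string exactly.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class JTVAEConfig:
    # Values quoted in the text.
    hidden_size: int = 450
    latent_size: int = 56
    graph_depth: int = 3       # depth used for the molecular graphs
    tree_depth: int = 20       # depth used for the junction trees
    kl_warmup_steps: int = 10_000
    # ASSUMPTION: the KL weight after warmup is not given in the text.
    beta_after_warmup: float = 1.0


def kl_weight(step: int, cfg: JTVAEConfig) -> float:
    """KL weight for a training step: zero during warmup, constant afterwards."""
    return 0.0 if step < cfg.kl_warmup_steps else cfg.beta_after_warmup


def vae_loss(recon_loss: float, kl_div: float, step: int, cfg: JTVAEConfig) -> float:
    """Total loss = reconstruction loss + (step-dependent weight) * KL divergence."""
    return recon_loss + kl_weight(step, cfg) * kl_div


def reconstruction_accuracy(inputs: Sequence[str], decoded: Sequence[str]) -> float:
    """Fraction of inputs reproduced exactly (ASSUMPTION: canonical strings)."""
    if len(inputs) != len(decoded) or not inputs:
        raise ValueError("inputs and decoded must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(inputs, decoded) if a == b)
    return matches / len(inputs)
```

Keeping the KL term off during the warmup lets the autoencoder first learn to reconstruct its inputs before the latent space is regularized.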

The JT-E model for energy prediction
Out-of-sample validation results for the three JT-E models trained on the sets with 0, 1 and 2 side chains are presented in Table S1. The highest accuracy is achieved for the dataset with no side chains. Datasets with 1 and 2 side chains feature higher chemical diversity, which makes predictions more difficult. As a result, the accuracy decreases by ≈ 50% for molecules with 2 side chains compared to those with no side chains. Singlet excitation energies also turn out to be more difficult to predict than triplet ones. Nevertheless, one can conclude that the prediction performance of the JT-E NN is very good, nearly comparable to the accuracy of the PM3 method, which can be estimated to be on the order of ≈ 0.05 eV for the molecules in question (see the discussion in Section 2.1).
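For reference, the two error metrics reported in Table S1 (mean absolute error and root-mean-squared error) can be computed as in the following sketch. The function names are illustrative and are not taken from the code used in this work; energies are assumed to be given in eV.

```python
import math
from typing import Sequence


def _check(pred: Sequence[float], ref: Sequence[float]) -> None:
    if len(pred) != len(ref) or not pred:
        raise ValueError("pred and ref must be non-empty and of equal length")


def mean_absolute_error(pred: Sequence[float], ref: Sequence[float]) -> float:
    """MAE: average magnitude of the prediction error."""
    _check(pred, ref)
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(pred)


def root_mean_squared_error(pred: Sequence[float], ref: Sequence[float]) -> float:
    """RMSE: square root of the mean squared error; penalizes outliers more than MAE."""
    _check(pred, ref)
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(pred))
```

The RMSE is never smaller than the MAE, and a large gap between the two indicates that a few molecules carry most of the error.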

Benchmarking the methods for computing excitation energies
Results of the validation are presented in Fig. S4. The linear fit used for bias correction is shown with dotted lines. The compounds used in the validation fall into three classes: pure hydrocarbons, nitrogen-containing molecules and oxygen-containing molecules. These groups follow similar trends in Fig. S4 and were used in the statistical analysis without distinction. Fig. S4 also illustrates a surprisingly poor performance of the multireference approach. While the triplet energies are reasonably good, the prediction of singlet transitions fails entirely. We attribute this issue to the larger active spaces required to describe singlet excited states.
Table S1. Mean absolute and root-mean-squared errors for JT-E models trained on datasets each comprising ∼ 450,000 molecules with 0, 1 and 2 side chains (columns: 0 side chains, 1 side chain, 2 side chains).
Figure S3. Compounds constituting the validation dataset with known experimental S1 or T1 excitation energies (referenced in Section 2.3 of the main text). Nitrogen atoms are highlighted in blue, oxygen atoms in red.
Although multireference calculations with the (12,12) active space are feasible for molecules of moderate size, blind screening without manual inspection of every particular case seems to be impossible.
The actual values of the thresholds δa, δb and δc are affected by possible inaccuracies in the calculated energy levels. To find a reasonable estimate of δc, we can validate the predictions of corrected PM3 against the experimental dataset (see Fig. S5). The deviations are distributed non-uniformly, with larger errors for larger values of δc. The worst results are obtained for the six compounds with only one aromatic ring (shown in red in Fig. S5). In our search, we have focused on (presumably) larger molecules and lower values of δc, so we need to be concerned only with the left part of the plot in Fig. S5, which shows a standard deviation of about 0.4 eV.
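The linear bias correction and its validation against experiment can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in this work: an ordinary least-squares fit of experimental against calculated energies is assumed, the residual standard deviation is computed with n − 2 degrees of freedom, and the numbers in the usage lines are made-up toy values, not data from the validation set.

```python
import math
from typing import Sequence, Tuple


def fit_linear(calc: Sequence[float], exp: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit exp ≈ slope * calc + intercept."""
    n = len(calc)
    if n != len(exp) or n < 3:
        raise ValueError("need at least three paired values")
    mean_x = sum(calc) / n
    mean_y = sum(exp) / n
    sxx = sum((x - mean_x) ** 2 for x in calc)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(calc, exp))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def correct(e_calc: float, slope: float, intercept: float) -> float:
    """Apply the empirical linear correction to a calculated energy (eV)."""
    return slope * e_calc + intercept


def residual_std(calc: Sequence[float], exp: Sequence[float],
                 slope: float, intercept: float) -> float:
    """Standard deviation of exp minus corrected calc (n - 2 degrees of freedom)."""
    res = [y - correct(x, slope, intercept) for x, y in zip(calc, exp)]
    return math.sqrt(sum(r * r for r in res) / (len(res) - 2))


# Toy values in eV (hypothetical, for illustration only).
toy_calc = [2.0, 2.5, 3.0, 3.5]
toy_exp = [2.1, 2.5, 2.9, 3.3]
toy_slope, toy_intercept = fit_linear(toy_calc, toy_exp)
toy_sigma = residual_std(toy_calc, toy_exp, toy_slope, toy_intercept)
```

A residual standard deviation obtained in this way is the kind of quantity that can guide how much slack the thresholds should leave for errors in the computed energy levels.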
Unfortunately, the same approach cannot be applied to find δb, since experimental values of the second triplet energy level are unavailable. The possible error margin here is relatively high because we use empirical factors obtained for T1 to correct the PM3 results for T2. To define constraints limiting the relevant area of chemical space, we suggest employing a more qualitative approach. Fig. S6 presents core fragments of compounds with registered TTF activity (Wang et al., 2020), plotted according to their energy losses. For a select few compounds, the external quantum efficiency of the device was reported to exceed the statistical limit, thus indicating a favorable alignment of the T1 and T2 energy levels. These compounds are rubrene (RUB; Cheng et al., 2010) and perylene (PER; Hoseinkhani et al., 2015), shown with red markers in Fig. S6. We also added tetracene (TET, square marker) to this group, since the required alignment of T1 and T2 in this case is suggested by independent experimental evidence (Völcker et al., 1989; Komfort et al., 1990; Fallon et al., 2020). It can be seen that the assumption δb = δc = 0 fails completely, leaving aside the majority of black points and all red ones. Introduction of a tighter criterion δc = −0.3 eV does not change the situation, while allowing some room for statistical errors in calculations. The adequacy of the TTF search therefore relies solely on δb. For δb = −0.8 eV almost all points are included in the target area, while lower values ignore many compounds, introducing an obvious error in the important case of perylene.
Figure S4. Validation results for energy prediction models; linear fit is shown with dotted lines.
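How such thresholds act as a filter can be illustrated with a short sketch. The exact definitions of the energy-alignment quantities compared against δb and δc are given in the main text and are not reproduced here, so the sketch takes them as precomputed inputs (`margin_b`, `margin_c`, hypothetical names). It further assumes that a compound passes when each quantity is greater than or equal to its threshold, which is consistent with zero thresholds excluding most known compounds and negative thresholds admitting them. The compound values in the usage lines are invented.

```python
from typing import Iterable, List, NamedTuple


class Margins(NamedTuple):
    name: str
    margin_b: float  # eV; quantity compared against delta_b (definition: main text)
    margin_c: float  # eV; quantity compared against delta_c (definition: main text)


def passes_thresholds(m: Margins, delta_b: float = -0.8, delta_c: float = -0.3) -> bool:
    """ASSUMPTION: a compound passes if each margin is >= its threshold."""
    return m.margin_b >= delta_b and m.margin_c >= delta_c


def select(compounds: Iterable[Margins], delta_b: float = -0.8,
           delta_c: float = -0.3) -> List[str]:
    """Names of the compounds that satisfy both thresholds."""
    return [m.name for m in compounds if passes_thresholds(m, delta_b, delta_c)]


# Hypothetical compounds with invented margins (eV).
toy = [Margins("mol_A", -0.5, -0.1), Margins("mol_B", 0.1, 0.2), Margins("mol_C", -1.0, 0.0)]
relaxed = select(toy)                            # thresholds discussed in the text
strict = select(toy, delta_b=0.0, delta_c=0.0)   # the delta_b = delta_c = 0 assumption
```

With these toy inputs the strict criteria keep only one compound, whereas the relaxed ones also admit a compound whose margins are slightly negative, mirroring the reasoning above about leaving room for calculation errors.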

Details of the screening
The first generation of the skeleton frames consisted of one 5-membered and one 6-membered ring. After applying steps I-III of the structure generation algorithm three times in succession, we obtained all possible frames with at most 4 rings.
Step IV produced all possible core compounds within the constraints listed in Section 2.2 of the main text. The corresponding region of the chemical space contains 472505 non-equivalent structures. This subspace is small enough to be treated with SE methods of quantum chemistry, yet large enough to be searched for promising TTF candidates and to be used for the development and validation of ML models. After that, we conducted geometry optimization using PM3. For the optimized structures, the first three singlet and triplet excitation energies were calculated in Gaussian 16 at the configuration interaction singles (CIS) level using the PM3 Hamiltonian. For 10035 structures (about 2% of the total), the simulations failed, primarily because the optimization procedure did not converge. After application of the TTF criteria, the majority of core compounds were filtered out, leaving only 5690 candidates.
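The screening statistics quoted above can be checked with a few lines of arithmetic. The counts are taken from the text; the variable names are illustrative, and it is assumed here that the 5690 candidates are counted among the structures whose calculations succeeded.

```python
# Counts quoted in the text.
total_structures = 472_505
failed_structures = 10_035
ttf_candidates = 5_690

# Share of structures whose calculations failed (quoted as "about 2%").
failed_fraction = failed_structures / total_structures

# ASSUMPTION: TTF criteria were applied only to successfully computed structures.
evaluated_structures = total_structures - failed_structures
candidate_fraction = ttf_candidates / evaluated_structures

print(f"failed: {100 * failed_fraction:.1f}% of all structures")
print(f"evaluated: {evaluated_structures} structures")
print(f"candidates: {100 * candidate_fraction:.1f}% of evaluated structures")
```

Under this assumption, only slightly more than one in a hundred core compounds survives the TTF criteria.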