A Neural Network Reveals Motoric Effects of Maternal Preconception Exposure to Nicotine on Rat Pup Behavior: A New Approach for Movement Disorders Diagnosis

Neurodevelopmental disorders can stem from pharmacological, genetic, or environmental causes, and early diagnosis is often key to successful treatment. To improve early detection of neurological motor impairments, we developed a deep neural network for data-driven analyses. The network was applied to study the effect of maternal nicotine exposure prior to conception on 10-day-old rat pup motor behavior in an open field task. Female Long-Evans rats were administered nicotine (15 mg/L) in sweetened drinking water (1% sucralose) for seven consecutive weeks immediately prior to mating. The neural network outperformed human-expert-designed animal locomotion measures in distinguishing rat pups born to nicotine-exposed dams vs. control dams (87% vs. 64% classification accuracy). Notably, the network discovered novel movement alterations in posture, movement initiation and a stereotypy in “warm-up” behavior (repeated movements along specific body dimensions) that were predictive of nicotine exposure. The results suggest that maternal preconception nicotine exposure delays and alters offspring motor development. Similar behavioral symptoms are associated with drug-related causes of disorders such as autism spectrum disorder and attention-deficit/hyperactivity disorder in human children. Thus, the identification of motor impairments in at-risk offspring here shows how neural networks can guide the development of more accurate behavioral tests for diagnosing symptoms of neurodevelopmental disorders earlier in infants and children.


INTRODUCTION
Many neurological disorders, such as attention-deficit/hyperactivity disorder (ADHD) and autism spectrum disorder (ASD), have an early life onset. Although the successful treatment of the consequences of childhood onset disorders depends upon the early diagnosis of at-risk children, the methodology related to early diagnosis is underdeveloped. For example, mothers outperform experts in the early diagnosis of conditions such as ASD, but the way that they do so is ad hoc (Sacrey et al., 2018). Many methods and tools have been introduced to address the problem of diagnosis and quantification of human disorders in animal models (Basso et al., 1995;Kabra et al., 2013;Berman et al., 2014;Machado et al., 2015;Wiltschko et al., 2015;Ben-Shaul, 2017;Markowitz et al., 2018;Mathis et al., 2018;Arac et al., 2019;Graving et al., 2019;Pereira et al., 2019). Nevertheless, for animal models and for human childhood disorders, early detection is difficult because symptomology must be detected within the limited motor repertoire displayed by infants (Schamhardt et al., 1993). To address the problem of early diagnosis, we introduce a deep neural network that automatically classifies spontaneous behavior and extracts, in a data-driven way, movements that distinguish control and experimental groups of animals.
We applied our network to study rat pups born to mothers with maternal preconception nicotine exposure (MPNE). Nicotine is one of the drugs of abuse most widely used by preconception parents, and it is capable of perturbing many aspects of development (Dwyer et al., 2008;Devoto et al., 2020). Preconception nicotine can influence offspring development via three main mechanisms: it may induce physiological changes in the mother that alter the fetal environment, it may induce epigenetic modifications in the oocyte that shape ontogeny (Bohacek and Mansuy, 2013), and it may change the quality of maternal care, thereby resulting in the behavioral transmission of an altered developmental trajectory. Nicotine also influences brain development, e.g., by interacting with nicotinic acetylcholine receptors (nAChRs), affecting neuronal proliferation, differentiation, and maturation (Dwyer et al., 2008;Blood-Siegfried and Rende, 2010). There is limited research into the effects of MPNE on behavior (Holloway et al., 2007;Vassoler et al., 2014;Zhu et al., 2014;Yohn et al., 2015;Renaud and Fountain, 2016), and currently no studies consider its impact on early postnatal development. Therefore, the current research addresses two gaps in our understanding of nicotine's impact on early infant behavior. First, does nicotine administration during the preconception period in prospective dams, as opposed to the prenatal, preconception + prenatal, or paternal preconception period, affect behavior? Second, are offspring affected at an early stage of infant development, thus demonstrating an early impact of MPNE on offspring locomotion and its sensitivity to experimental detection?
To address these questions, we first analyzed neonatal (10-days-old) rat pup video recordings using standard locomotor-derived kinematic measures. Then we showed that a neural network can improve on this conventional analysis by identifying causative symptomology of the effect of MPNE. Importantly, we also present how to extract knowledge from the deep neural network in order to identify novel behavioral components that distinguished the nicotine exposed group from the control group.

Effect of Maternal Nicotine Exposure Prior to Conception on Offspring: Analyses of Behavior Using Expert Selected Measures
Standard "exploratory" locomotor measures were used to investigate the effect of MPNE on offspring locomotor development (Methods). Of 351 rat pups, 191 were from preconception sucralose-exposed dams, and 160 were from preconception nicotine-exposed dams (Methods). Ten-day-old (P10) pups were placed singly in the open field for one minute and their behavior was videotaped to investigate locomotor development (Methods). The movement of an animal was described in terms of two movement kinematics (Mychasiuk et al., 2013;Jenkins et al., 2018). (1) Total activity was the total number of square entries for either front paw of the animal during exploration. (2) Novel activity was the number of unique square entries, which relates to locomotor complexity. These measures were calculated separately for the inner and outer parts of the open field (Figure 1A, Methods). A statistical comparison of the above movement measures between the MPNE (nicotine-exposed dam) and control (sucralose-exposed dam) groups is shown in Figure 1B. The MPNE group was less active, entered fewer squares and explored fewer novel squares than did the control group (Total Control = 57.0 ± 1.8, Total Nicotine = 42.2 ± 1.6; Total Inner Control = 32.0 ± 1.2, Total Inner Nicotine = 26.4 ± 2.0; Total Outer Control = 24.0 ± 1.2, Total Outer Nicotine = 15.0 ± 1.0; Novel Control = 21.0 ± 0.6, Novel Nicotine = 15.5 ± 0.6; Novel Inner Control = 11.1 ± 0.4, Novel Inner Nicotine = 9.1 ± 0.3; Novel Outer Control = 9.9 ± 0.5, Novel Outer Nicotine = 6.3 ± 0.3; "±" represents SEM; p < 0.001 for all comparisons using t-test; the non-parametric Mann-Whitney U test also gave significant results for all comparisons with p < 0.003). We did not detect significant sex differences on any of the above measures (t-test, p > 0.05 for all measures). In summary, on all measures the MPNE offspring showed less exploration than the control offspring.
Combining Movement Measures to Distinguish the Control vs. Nicotine Groups
As described above, we quantified the behavior by using typical kinematic measures employed in an open field task. This approach requires assumptions regarding which features of the behavior will be useful in distinguishing between treatment groups. To estimate the reliability of these expert-selected features, we used machine learning algorithms to predict treatment groups using all six behavioral measures described above. We used five different algorithms to ensure that our results were not dependent on a specific data analysis method. For all algorithms we used fivefold cross-validation, where we trained the model on 80% of trials and predicted the treatment group for the remaining 20% of trials. We repeated this process 5 times to predict the group category for every trial. The algorithms discriminated between the two groups with accuracy between 57-64% (Decision tree: 57%; Random forest: 61%; Logistic regression: 61%; K-nearest neighbors: 63%; Support vector machine: 64%) (Supplementary Text 1). This means that, based on the described movement measures, it is possible to tell with at most about 64% accuracy whether an animal belongs to the control or the nicotine group (chance level is 50%). We then applied principal component analysis (PCA) to the movement measures. The distributions of points for both classes largely overlapped in PC space (Supplementary Figure 1). This indicates weak discriminability between classes, which is consistent with the above result using machine learning algorithms.
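The fivefold cross-validation procedure can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic stand-ins for the six movement measures, the helper names (kfold_indices, knn_predict, cross_validated_predictions) are hypothetical, and a plain k-nearest-neighbors vote stands in for the five algorithms named in the text. The point is only to show how every trial receives a prediction from a model that never saw it during training.

```python
import random

def kfold_indices(n_samples, n_folds=5, seed=0):
    """Shuffle sample indices and deal them into n_folds disjoint test folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]

def knn_predict(train_x, train_y, query, k=5):
    """Majority vote among the k nearest training samples (squared Euclidean distance)."""
    order = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    votes = [train_y[i] for i in order[:k]]
    return max(set(votes), key=votes.count)

def cross_validated_predictions(x, y, n_folds=5, k=5, seed=0):
    """Predict every sample exactly once, using a model fit on the other folds."""
    preds = [None] * len(x)
    for test_idx in kfold_indices(len(x), n_folds, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        train_x = [x[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        for i in test_idx:
            preds[i] = knn_predict(train_x, train_y, x[i], k)
    return preds

# Synthetic data (assumption): 0 = control, 1 = nicotine, six measures per pup.
rng = random.Random(1)
labels = [0] * 60 + [1] * 60
features = [[rng.gauss(20.0 - 4.0 * lab, 8.0) for _ in range(6)] for lab in labels]

preds = cross_validated_predictions(features, labels)
accuracy = sum(p == t for p, t in zip(preds, labels)) / len(labels)
print(f"cross-validated accuracy: {accuracy:.2f}")
```

With five folds, each model is trained on 80% of the trials and tested on the remaining 20%, matching the split described above.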

Using Deep Neural Network to Distinguish the Control vs. Nicotine Groups
To investigate whether additional information could be extracted from the rat pups' behavior, we used a deep neural network to examine the same videos of MPNE and control animals in the open field task (Figure 2A). This approach does not require specifying which behavioral measures should be used. Rather, the neural network discovers by itself which features in the video (e.g., shapes, movements, etc.) are the most predictive of the treatment groups. Specifically, we used a convolutional network (ConvNet) (Szegedy et al., 2016) to convert each video frame (400 × 350 pixels) to a set of 2,048 features. Those features may loosely correspond to object edges. Features from 150 video frames from a single video clip were then combined and passed to a recurrent neural network (RNN). This allowed the analysis of animal movements throughout each trial (1 trial = 1 video clip consisting of 150 frames corresponding to 50 s). The network was then trained to assign the correct group category to each video clip (Figure 2A), and then information was extracted from the network to investigate its decisions (Figure 2B, see next section). After training, the network was able to distinguish videos of the MPNE and control groups with 87% accuracy. This accuracy is higher than the classification accuracy obtained from kinematically defined movement features (57-64%). Figure 2C shows the average activity of the output neuron for the control group (mean = 0.82 ± 0.02 SEM) and the nicotine group (mean = 0.13 ± 0.017 SEM). The activity of the output neuron was bounded between 0-1, with 1 corresponding to the control category. For example, a value of the output neuron of 0.9 can be interpreted as the network indicating that it is 90% "confident" of identifying a control animal, and only 10% "confident" that it is an MPNE animal.
For calculating the network's prediction accuracy, values of the output neuron above 0.5 were considered as identifying a rat pup in the control group, and values below 0.5 as identifying a rat pup in the MPNE group.
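The thresholding rule can be written out as a short sketch. The output values below are invented for illustration, and the function names are hypothetical. The text does not say how a value of exactly 0.5 was handled, so assigning it to the MPNE group here is an assumption.

```python
def classify(output_activity, threshold=0.5):
    """Map the output neuron's activity (0-1, 1 = control) to a group label."""
    # Assumption: a value exactly at the threshold is labeled MPNE.
    return "control" if output_activity > threshold else "MPNE"

def accuracy(outputs, true_groups, threshold=0.5):
    """Fraction of clips whose thresholded output matches the true group."""
    hits = sum(classify(o, threshold) == g for o, g in zip(outputs, true_groups))
    return hits / len(outputs)

# Hypothetical output-neuron values for six clips and their true groups.
outputs = [0.91, 0.78, 0.42, 0.08, 0.15, 0.63]
groups = ["control", "control", "control", "MPNE", "MPNE", "MPNE"]

print(accuracy(outputs, groups))  # two of the six clips are misclassified
```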
To verify that our network does not require fine parameter tuning for robust performance, we also tested four variations of the network. In particular, we modified the number of neurons and layers in the RNN, and we repeated the training and testing on the same data. The modified networks produced results similar to those of the original network (Supplementary Figure 2). To ensure that network accuracy is not a result of overfitting and that our network can generalize to new animals, all predictions were obtained using fivefold cross-validation as described above. Thus, no videos of the predicted animal were included in the training dataset. Altogether, these results indicate that there is information about MPNE in the behavior that is not accounted for by the standard movement analyses.

Extracting Knowledge From the Neural Network
Considering that the network classified the animal groups from videos with higher accuracy than the kinematic measures, we investigated what movement features were the most informative for the network. We applied a recently developed Layer-wise Relevance Propagation method (LRP) to extract knowledge from deep neural networks (Bach et al., 2015; 2019) (Methods). First, we identified which features extracted from the videos were contributing the most to the predictions made by the RNN (Figure 2B-features importance array), and then we investigated which parts of each frame corresponded to those most informative features (Figure 2B-left side). This knowledge extraction method reveals the network's focus for decision making.

FIGURE 2 | Neural network architecture for data-driven analyses. (A) The network is trained using video clips of single trials (each consisting of 150 video frames). Frames are then passed through a convolutional network (ConvNet) to extract 2,048 high level image features from each frame. The features from 150 successive video frames are then given as an input to a recurrent neural network (RNN) to analyze temporal information across frames. Based on this information, RNN predicts a group category for each video clip (Output). (B) After the network is trained, information is extracted from the network weights in order to identify image features and the parts of each video frame that were the most important to network decision making. For visualization, only every 20th feature is shown. (C) Average activity of the output neuron for animal videos from each group.

Examination of the feature importance matrix revealed that certain video frames were particularly informative for the network decision. For example, the first frames had multiple features which contributed to network classification more strongly than subsequent frames (Figure 3A and Supplementary Figure 3). Plotting the average value of the features separately for each animal group showed the high discriminative power of those initial frames (Figure 3B). To investigate why the first video frames were singled out, we closely inspected those frames. We found that, on average, there was a meaningful difference in the starting posture and starting movement between the MPNE and control animals. Figures 3C,D illustrate the difference in their starting posture as soon as the pups were placed in the open field box. The MPNE animals sprawled, with the fore and hind legs extended, whereas the control animals had their limbs beneath them in a posture supporting the body. In short, the MPNE animals displayed reduced postural support. The lack of postural support indicated by extended limbs could also be observed as the MPNE animals initiated movement. Once the animals were moving, the temporal features of movement were also different between MPNE and control animals. Notably, the control animals began to move as soon as they were placed in the open field. They collected their body by bringing their limbs to a weight-bearing posture and made small lateral movements of their head as they initiated movement. The MPNE animals mostly lingered (not moving), then took more time to establish postural support and only then initiated movement.
Because knowledge extraction from the network revealed that the initial posture is a highly discriminative feature between MPNE and control pups, we developed a measure to quantify it. First, we used DeepLabCut software (Mathis et al., 2018), which allowed for semi-automatic marking of the position of multiple body parts (four legs, nose, tail base and center of the body; Figure 4A). Next, from the x and y coordinates of the resulting marks, we estimated pose by calculating the average distance between front and hind limbs in the initial frame. Consistent with our visual observation, MPNE animals displayed a significantly larger distance between front and hind limbs as compared with controls, indicating reduced postural support (DistNicotine = 96.2 pixels ± 1.53 SEM, DistControl = 87.57 pixels ± 1.46 SEM, p < 0.0001, t-test; Figure 4B).
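A posture measure of this kind can be sketched as follows. The text does not specify which limb pairs were averaged, so this sketch assumes same-side pairs (left front to left hind, right front to right hind). The body-part keys, the function name limb_spread, and the coordinates are all hypothetical.

```python
import math

def limb_spread(marks):
    """Mean front-to-hind limb distance (in pixels) for one video frame.

    marks maps a body-part name to its (x, y) pixel coordinates.
    Assumption: distances are taken between limbs on the same side of the body.
    """
    pairs = [("front_left", "hind_left"), ("front_right", "hind_right")]
    return sum(math.dist(marks[a], marks[b]) for a, b in pairs) / len(pairs)

# Hypothetical marks for the initial frame of one pup.
initial_frame = {
    "front_left": (100.0, 50.0),
    "hind_left": (100.0, 140.0),
    "front_right": (140.0, 50.0),
    "hind_right": (140.0, 150.0),
}

spread = limb_spread(initial_frame)
print(spread)
```

A larger value corresponds to more extended limbs, which the text interprets as reduced postural support.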
Visualization of feature importance also showed unexpected periodic movement changes (Figure 3B). Specifically, features occurring about every 11th frame, corresponding to a period of about 3.7 s, were informative for the network's distinction between the MPNE and the control groups. This was also confirmed with spectral analyses shown in Supplementary Figures 4, 5. To investigate the behavior underlying the distinguishing movements, we divided the videos into 11-frame segments and aligned the segments (Figure 5). This revealed a stereotypical, repetitive behavior in MPNE animals. The animals made repeated lateral movements that returned the animal to its initial position. For comparison, Figure 6 illustrates a typical temporal sequence of two control animals at the same time. The control animals also make lateral movements, but the amplitude and frequency of movement are different from those of the MPNE animals. For example, in control rat #1, the lateral head movement begins at frame 10 and ends at frame 18. Its next lateral movement increases in amplitude, thus modifying the sequence of movement (i.e., frames 21 and 31 are not the same in Figure 6 for rat #1). Moreover, some of the control animals pivoted as part of the lateral movement (Figure 6, rat #2). Note that although our analyses showed that feature importance peaks at frames 10, 21, etc., this should not be interpreted as meaning that only those specific frames are of significance to the network. Rather, it should be seen as an indication that at those times the network recognized a periodic stereotypical sequence. Thus, the network identified from raw video data a stereotypical behavior as a distinguishing feature of MPNE pups.
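The link between an 11-frame repeat and a period of about 3.7 s can be checked with a short sketch. The text reports spectral analyses. This sketch swaps in a simple autocorrelation as an easier-to-read alternative, and it runs on a synthetic importance trace with peaks at frames 10, 21, and so on, not on the study's data. The sampling rate of 3 frames per second comes from the Methods. The function names are hypothetical.

```python
FPS = 3  # every 10th frame of a 30 frames-per-second recording (Methods)

def autocorrelation(signal, lag):
    """Unnormalized autocorrelation of the mean-centered signal at a given lag."""
    mean = sum(signal) / len(signal)
    centered = [v - mean for v in signal]
    return sum(centered[i] * centered[i + lag] for i in range(len(signal) - lag))

def dominant_period(signal, min_lag=2, max_lag=40):
    """Lag (in frames) at which the signal best matches a shifted copy of itself."""
    return max(range(min_lag, max_lag + 1), key=lambda lag: autocorrelation(signal, lag))

# Synthetic importance trace (assumption): one peak every 11 frames in a 150-frame clip.
importance = [1.0 if frame % 11 == 10 else 0.0 for frame in range(150)]

lag = dominant_period(importance)
period_s = lag / FPS
print(lag, round(period_s, 2))
```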
To test whether the network-uncovered differences in stereotypical behavior distinguished the control and nicotine groups, we conducted additional analyses. Using DeepLabCut marks (Figure 4A), we tracked nose position for the first 16 frames for all rats (Figure 7A). Next, we calculated the distance between nose positions in the 1st and 11th frames. This allowed us to estimate the relationship between the starting positions of the 1st and 2nd sequences of movements. Consistent with the results presented in Figures 5, 6, the average distance for the MPNE group was significantly smaller (mean DistContr = 37.7 pixels ± 2.9 SEM, DistNicotine = 60.4 pixels ± 3.1 SEM; p < 0.0001, Kolmogorov-Smirnov test; Figure 7B). Repeating the analyses using the 3rd and 14th frames gave similar results. This confirmed greater stereotypy in making repeated movements in the nicotine-exposed group.
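The nose-displacement measure reduces to a distance between two tracked points. A minimal sketch is given below with an invented 16-frame track. The function name return_distance is hypothetical, and frames are indexed from 0, so the 1st and 11th frames are indices 0 and 10.

```python
import math

def return_distance(nose_track, first=0, later=10):
    """Distance (pixels) between nose positions in two frames (0-based indices)."""
    return math.dist(nose_track[first], nose_track[later])

# Hypothetical track: the nose moves out laterally and comes back near its start.
x_positions = (0, 5, 10, 15, 20, 25, 20, 15, 10, 5, 3, 8, 13, 18, 23, 28)
track = [(float(x), 0.0) for x in x_positions]

d = return_distance(track)
print(d)
```

A small value means the nose is close to its starting position again at the start of the second movement sequence.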
We also tested whether the coordinates of body parts marked with DeepLabCut could, on their own, provide better features than the ConvNet for predicting the animal group. For that, in all video frames we tracked the position of the nose, limbs, tail base and body center as illustrated in Figure 4A. All points corresponding to frames from one trial were combined as one input to the RNN (similar to ConvNet features in Figure 2; RNN with 256 LSTM units). Thus, each video frame was represented by the x- and y-coordinates of seven marked body parts. This procedure resulted in a 62% accuracy in distinguishing the control and MPNE animals. This is a lower accuracy than that obtained using the original videos (87%), suggesting that ConvNet features selected in a data-driven way contain additional information useful for behavioral classification. Increasing the number of selected body parts for digitization may result in improved RNN performance with DeepLabCut features. It is noteworthy, however, that the advantage of our network is that it can directly predict movement deficits from raw videos and does not require human decisions on which body parts to select.
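How seven tracked body parts become one RNN input per trial can be sketched as follows. The ordering of body parts, the key names, and the helper functions are assumptions made for illustration. The only facts taken from the text are the seven parts, the two coordinates per part, and the 150 frames per trial.

```python
BODY_PARTS = [
    "nose", "front_left", "front_right",
    "hind_left", "hind_right", "tail_base", "body_center",
]  # assumed ordering of the seven marked parts

def frame_to_vector(marks):
    """Flatten one frame's marks into [x1, y1, x2, y2, ...] (14 values)."""
    vector = []
    for part in BODY_PARTS:
        x, y = marks[part]
        vector.extend([x, y])
    return vector

def trial_to_matrix(frames):
    """Stack per-frame vectors into a (frames x 14) matrix for one trial."""
    return [frame_to_vector(marks) for marks in frames]

# Hypothetical trial: 150 frames in which every part sits at (frame index, 0).
frames = [{part: (float(t), 0.0) for part in BODY_PARTS} for t in range(150)]
matrix = trial_to_matrix(frames)
print(len(matrix), len(matrix[0]))
```

Compared with the 2,048 ConvNet features per frame, this representation gives the RNN only 14 values per frame.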

DISCUSSION
The neural network described here revealed motoric impairments in at-risk rat pups whose mothers received preconception nicotine. Impairments include a reduction in postural support, slower maturation of warm-up movements, and stereotypy in the component movements of warm-up. These motoric abnormalities in the development of the rat pups may be the first symptoms of what might later become abnormalities in adult behavior. Thus, they may be diagnostic of early symptoms of conditions analogous to those of human developmental disorders. The network analysis also provided insights into the hypoactivity of the MPNE pups, as their reduced postural support, slow warm-up and stereotypy would be expected to compete with locomotion. Thus, by analyzing the network's decision-making process, new insights into behavioral differences were obtained.
We suggest that this network methodology could be useful for the analysis of behavior of other animal analogs of human movement disorders as well as for the analysis of at-risk human infants.

Significance of Behavioral Results
Our data-driven approach identified impaired postural support, reduced warm-up and increased stereotypical behavior of warm-up components in the MPNE pups. The interpretation of the results of our network analysis capitalized on previous work showing that the development of infant rat movement is organized. When an adult rat is placed on a horizontal surface in an open environment, it sequentially makes lateral, forward and dorsoventral movements that escalate in amplitude into forward locomotion (Golani et al., 1981;Golani, 1992). Such warm-up behavior is also a feature of the ontogeny of motor development, in which the topographic dimensions of movement emerge and escalate as maturation proceeds (Golani et al., 1981). Thus, in terms of warm-up, the MPNE animals display a lower level of maturation, featuring reduced postural support and movement relative to the control pups.
The infant MPNE pups also displayed stereotyped behavior in that, once they had made one lateral head movement, they repeated the movement at regular intervals rather than escalating warm-up movements into locomotion. Stereotypy, as featured in tics, repetitive movements and compulsive behavior, is a feature of many developmental conditions including ASD and ADHD. Stereotypy is also symptomatic of use of drugs of abuse and many adult neurological conditions, especially those that affect the basal ganglia (Eilam et al., 2006;Lelard et al., 2006;Singer, 2009;Wolgin, 2012;Berman, 2018;Martino and Hedderly, 2019). That the infant MPNE rat pups displayed stereotypy suggests that the MPNE treatments resulted in neural changes in the rat pups that may be analogous to those that produce stereotypy in human conditions. Future work could examine both the neural basis of MPNE-based stereotypy and its influence on adult motor and cognitive behavior.
In the present study we also used conventional locomotor measures with the MPNE rat pups and found that the pups displayed reduced locomotion. A reduction in locomotion can have many causes, but our network results suggest that reduced postural support, immaturity in warm-up and stereotypy could all contribute. Thus, an important feature of the network analysis is that it pointed to potential first causes of the more behaviorally holistic symptomology of locomotor measures. Many tools have been constructed for the diagnosis of developmental disabilities, but most are compromised by questions related to reliability. The reliability of measurements of activity changes in conditions such as ADHD is illustrative (Egger and Emde, 2011;Wolraich et al., 2019). The utility of more detailed methodology for symptom detection is that it improves the reliability of some behavioral measures and may actually serve as a more valid replacement.
Nicotine exposure during development has long been associated with changes in locomotor behavior. Prenatal, preconception + prenatal, and paternal preconception nicotine have all been linked with hyperactivity in young adult offspring that can propagate through multiple generations (Bruin et al., 2010;McCarthy et al., 2020). However, here we report a decrease in locomotion following maternal preconception nicotine exposure. An important distinction between the study presented here and previously reported results is the age at which the pups were tested; we analyzed the emergence of locomotor behavior in 10-day-old pups. To our knowledge, no other studies have explored the impact of preconception nicotine on young offspring locomotion. One study examined the effects of prenatal nicotine on 19-day-old offspring and found a similar decrease in activity in the nicotine-exposed pups compared to control pups (LeSage et al., 2006). Furthermore, the authors reported that pups prenatally exposed to nicotine were less active than controls during the initial exploration of the open field but were equally active after 10 min of exploration. This finding is reminiscent of our observation that MPNE pups had slowed warm-up and increased stereotypy relative to control pups. Therefore, it may be that the effects of nicotine on locomotor development are age-dependent; nicotine may delay maturation in early life, leading to decreased locomotion, but eventually lead to hyperactivity in later life. Further research is required to understand the cross-generational effects of maternal and biparental preconception nicotine exposure on the emergence of locomotion in early life.

Comparison With Other Methods
One approach to improving the analysis of exploratory behavior is the use of tracking systems such as those using markers that are automatically or manually attached to body parts (Zhou et al., 2008;Parmiani et al., 2019). In the last few years several methods such as LocoMouse (Machado et al., 2015), DeepLabCut (Mathis et al., 2018), JAABA (Kabra et al., 2013), Optimouse (Ben-Shaul, 2017), LEAP (Pereira et al., 2019) and DLCAnalyzer (Sturman et al., 2020) have been developed to allow users to identify key points in videos, such as the location of a paw, and then automatically track the movement of those key points across video frames. For instance, in DeepLabCut (Mathis et al., 2018), the experimenter manually labels body parts (e.g., the snout, left ear, right ear, and tail) in selected video frames using virtual markers, and then the network is trained to automatically predict the position of those body parts in the rest of the video. Although this method is useful, it requires investigator decisions about relevant body parts, and it requires separate analyses to determine whether the measures are relevant. Here we have shown that using whole-frame video is informative about behavior that a selective investigator may not have predicted a priori. Nevertheless, DeepLabCut did play a valuable role in quantifying the behavior, allowing us to validate the results we obtained from the knowledge extraction method. Thus, our network approach adds to the armamentarium of behavioral analysis.

FIGURE 6 | Sample behavior of typical control animals. The movements are less repetitive compared to nicotine animals and more diverse as exemplified by pivoting (rat #2).

FIGURE 7 | (A) Position of rat nose during first 16 frames for control (red) and MPNE pups (blue). Each animal is represented by a line connecting 16 points. For visualization, only every 2nd animal is shown. All trajectories are aligned such that nose position in frame 1 is set to point (0, 0). (B) Distribution of distances between nose position in frame 1 and 11. The shift of distribution to the left for the MPNE group shows that nicotine exposed pups were more likely to return to the same position after first sequence of movements.

The second category of automated methods such as MoSeq (Wiltschko et al., 2015), MotionMapper (Berman et al., 2014) and B-SOiD (Hsu and Yttri, 2020) first reduce the dimensionality of the video data, and then relate the results to the behavioral components. These methods require image pre-processing and proper image alignment, and additional methods must be applied for classifications. Our method also offers an alternative to these approaches. First, the convolutional network works with raw images without the need of pre-processing and without the difficult task of image alignment, and second, the network automatically identifies the most relevant behavior for predictions. In short, our approach offers a one-step solution for feature selection and animal group classification.
The method presented here also provides a significant advance on our previous network used for analyses of the skilled reaching behavior of stroke rats (Ryait et al., 2019). Specifically, here we introduced analyses in the time dimension, which allowed us to identify repetitive movements. As movement timing is a crucial component of animal behavior, the temporal analyses presented here can help to provide more sensitive measures of neurological deficits.
We suggest that the analysis used here could be applied to the behavior of at-risk human infants in the way that it was applied here to infant rats. The development of behavior of human infants is also organized, and this organization is widely used to assess the attainment of developmental milestones (Harris and Heriza, 1987;Sacrey et al., 2020). The assessment of milestones, however, depends upon the accuracy of the rating tool, the expertise of the rater, and it can be confounded by the normal developmental variability of infants. Nevertheless, brief video records of infant behavior could be subject to network analysis to confirm development milestones and to pinpoint the significance of variability as was done here for the behavior of infant rats.
In the future, the behavioral analyses described here could be combined with histological analyses (Faraji et al., 2013) or with electrophysiological recordings (Schjetnan and Luczak, 2011;Ponjavic-Conte et al., 2012;Schjetnan et al., 2019). Most neuronal analyses rely on using expert-selected features of brain activity (e.g., spike timing, correlations, firing rate in a specific time window) to relate it to behavior or sensory stimuli (Luczak et al., 2004;Luczak and Narayanan, 2005;Quiroga and Panzeri, 2009). Applying the data-driven approach presented here to electrophysiological data may uncover novel features of neuronal activity patterns that are more predictive of animal behavior.
In conclusion, the experimental results answer the questions posed in the introduction of this study. Nicotine administration during the preconception period in prospective dams altered the behavior of the infant offspring by reducing locomotion, reducing postural support, slowing the development of warm-up, and inducing stereotypy in component movements of warm-up. The MPNE offspring were affected at an early stage of infant development, thus demonstrating an early impact of MPNE on offspring locomotion and its sensitivity to experimental detection. These findings suggest that the measurement of infant behavior using a neural network analysis can improve the identification of behavioral irregularities in at-risk infant rats and that, in the same way, it could be applied to the early identification of signs of symptomology in at-risk human infants.

Animals
Procedures were conducted in accordance with the Canadian Council of Animal Care and were approved by the University of Lethbridge Animal Care and Use committee. Animals were given food and water ad libitum and were maintained on a 12-h light/dark schedule (lights on from 07:30 to 19:30) in a temperature- and humidity-controlled (21°C/50%) breeding room. A total of 45 female Long Evans rats born in-house from 11 different litters were used. Nicotine-exposed dams (n = 23) received 15 mg nicotine hydrogen tartrate salt (Sigma) per liter of drinking water sweetened with 1% sucralose to increase palatability (Nesil et al., 2011;Collins et al., 2012). Control dams (n = 22) received 1% sucralose only. Nicotine was administered for seven consecutive weeks beginning in adulthood (90-days-old); 7 weeks is the length of the spermatogenic cycle in male Long Evans rats and was chosen to mirror the complementary paternal studies. Nicotine consumption was calculated as mg of nicotine per kg of body weight. The volume of water consumed each day was measured by weighing the water bottles at the same time each day and dividing the change in volume by the number of females with access to the bottle. The mg of nicotine consumed was then calculated from the volume and divided by the average weight of the females with access to the bottle. On average, nicotine-exposed dams consumed 2.4 mg of nicotine per kg of body weight per day across the 7 weeks. Females were bred with non-drug-exposed male Long Evans rats (n = 45) the day following completion of nicotine administration. Animals in this analysis were pups from 32 successful litters for a total of 351 pups. Eighteen litters (191 pups: 102 female and 89 male) were from sucralose-exposed dams, and 14 litters (160 pups: 76 female and 84 male) were from nicotine-exposed dams. Females in both conditions reared their own litters (i.e., pups were not cross-fostered) until pups were weaned on postnatal day 22.
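The dose calculation described above can be followed in a short worked sketch. The bottle weight loss, the number of females per bottle, and the body weight are invented values chosen only to land near the reported average of 2.4 mg/kg/day. The conversion from grams of water to milliliters assumes a density of 1 g/mL, and the function name is hypothetical.

```python
NICOTINE_MG_PER_L = 15.0  # concentration in the drinking water (from the text)

def daily_dose_mg_per_kg(bottle_loss_g, n_females, mean_weight_kg, density_g_per_ml=1.0):
    """Daily nicotine intake per kg of body weight for females sharing one bottle."""
    # Assumption: the sweetened water weighs about 1 g per mL.
    volume_l_per_female = bottle_loss_g / density_g_per_ml / 1000.0 / n_females
    nicotine_mg = NICOTINE_MG_PER_L * volume_l_per_female
    return nicotine_mg / mean_weight_kg

# Hypothetical day: two females share a bottle that lost 96 g, mean weight 0.30 kg.
dose = daily_dose_mg_per_kg(bottle_loss_g=96.0, n_females=2, mean_weight_kg=0.30)
print(round(dose, 2))
```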

Behavioral Testing
Pups were tested in the open field task on postnatal day 10. The testing apparatus was a clear Plexiglas box measuring 20 cm × 30 cm with a grid of 150 squares (10 squares × 15 squares) on the floor, each measuring 2 cm × 2 cm (Figure 1A). Pups were placed individually in the center four squares (shaded black) and left to explore the box for 1 min while being recorded from above. The open field was cleaned with Virkon between animals.
Kinematic movement measures and their definitions in the scoring procedure are as follows:
Novel = the number of unique squares that either front paw of the pup enters, up to a maximum of 146 (i.e., the 150 squares of the 10 × 15 grid minus the four shaded squares).
Total = the total number of square entries for either front paw (i.e., the number of times a front paw goes from one square to another).
Novel Inner = the number of unique squares in the inner portion of the field that either front paw of the pup enters (i.e., squares within the 6 × 11 block in the center of the box, minus the four shaded squares).
Novel Outer = the number of unique squares in the outer portion that either front paw of the pup enters (i.e., the two rows of squares that make up the perimeter of the open field).
Total Inner = the total number of square entries in the inner portion for either front paw.
Total Outer = the total number of square entries in the outer portion for either front paw.
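The six measures can be computed from a record of which square each front paw occupies over time, as in the sketch below. Several details are assumptions made for illustration and are not given in the text: the (row, column) coding of squares, the exact position of the four shaded start squares, the representation of each paw as a list of squares, and the choice to count entries into shaded squares toward the Total measures.

```python
N_ROWS, N_COLS = 10, 15
# Hypothetical location of the four shaded start squares (row, col).
SHADED = {(4, 6), (4, 7), (5, 6), (5, 7)}


def is_outer(square):
    """True if the square lies in the two-square-wide perimeter band."""
    row, col = square
    return row < 2 or row >= N_ROWS - 2 or col < 2 or col >= N_COLS - 2


def score(paw_tracks):
    """Compute the movement measures from one track of squares per front paw."""
    entries = []
    for track in paw_tracks:
        # An entry is counted each time a paw moves into a different square.
        for prev, curr in zip(track, track[1:]):
            if curr != prev:
                entries.append(curr)
    novel = set(entries) - SHADED  # unique squares, start squares excluded
    return {
        "Novel": len(novel),
        "Total": len(entries),
        "Novel Inner": sum(not is_outer(s) for s in novel),
        "Novel Outer": sum(is_outer(s) for s in novel),
        "Total Inner": sum(not is_outer(s) for s in entries),
        "Total Outer": sum(is_outer(s) for s in entries),
    }


# Made-up tracks for the left and right front paws.
left = [(4, 6), (4, 5), (4, 4), (4, 5), (3, 5)]
right = [(4, 7), (4, 7), (3, 7), (2, 7), (1, 7)]
measures = score([left, right])
print(measures)
```

With this coding the inner region spans rows 2 to 7 and columns 2 to 12, which reproduces the 6 × 11 block named in the definitions.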

Deep Neural Network Training and Architecture
For training the neural network we used 351 videos: 160 from MPNE animals and 191 from control animals. The original frame rate was 30 frames per second with a resolution of 720 × 480 pixels. To reduce the amount of data, we used only every 10th video frame (three per second). From each video we excluded the initial period showing the experimenter's hand releasing the pup, and used the following 50 s of recording (150 frames).
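The down-sampling arithmetic can be checked with a short sketch. The function name and the example release time (frame 90, i.e., 3 s into the recording) are hypothetical; only the frame rate, the step of 10 frames, and the 150 retained frames come from the text.

```python
FPS = 30        # original frame rate
STEP = 10       # keep every 10th frame
N_FRAMES = 150  # frames retained per video


def sampled_frame_indices(release_end_frame):
    """Indices of the original frames kept after the release period ends."""
    return [release_end_frame + i * STEP for i in range(N_FRAMES)]


# Hypothetical example: the experimenter's hand leaves the field at frame 90.
idx = sampled_frame_indices(release_end_frame=90)
effective_rate = FPS / STEP            # frames per second after down-sampling
duration_s = N_FRAMES / effective_rate  # seconds of behavior covered
print(idx[:3], idx[-1], effective_rate, duration_s)
```

In this example the last retained frame index stays within the 1,800 frames of a 1-min recording, so a 50-s window fits as long as the release period is short.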
The general network architecture is shown in Figure 2. First, a pre-trained convolutional neural network (ConvNet) known as Inception-V3 (Szegedy et al., 2016) was used to extract 2,048 features from each video frame. This reduced each video to a 2D matrix of size 2,048 features × 150 frames. This matrix was then given as input to a recurrent neural network (RNN) to predict animal groups. We used an RNN composed of 256 long short-term memory (LSTM) units, which allowed for the extraction of temporal relations between frames. The LSTM layer was followed by a dropout layer with a rate of 0.2 to prevent overfitting, and then a dense layer of two neurons with the softmax activation function classified the animal's behavior. We used "Group K-Fold" cross-validation (scikit-learn) to split the data randomly and uniformly (to prevent the train and test data from being biased) into 5 folds. Each run was initialized with a random set of weights. The batch size was 100, and the Adam optimizer was used with binary cross-entropy as the loss function. The code for our network, including all parameters, is available in the GitHub repository as the Behaviour_Recognizer toolbox: https://github.com/rezatorabi13/Behaviour_Recognizer.
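To give a sense of the size of the trainable part of this architecture, the parameter count can be worked out by hand. The sketch assumes a standard LSTM formulation (four gates, each with input weights, recurrent weights, and a bias, as in the default Keras layer) and assumes that the pre-trained Inception-V3 feature extractor is kept fixed and therefore not counted; neither detail is stated explicitly in the text.

```python
N_FRAMES = 150     # time steps per video
N_FEATURES = 2048  # Inception-V3 features per frame
LSTM_UNITS = 256
N_CLASSES = 2      # MPNE vs. control

# Shape of one input sample to the RNN: (time steps, features).
input_shape = (N_FRAMES, N_FEATURES)

# Standard LSTM: 4 gates, each with input weights, recurrent weights and a bias.
lstm_params = 4 * (LSTM_UNITS * (N_FEATURES + LSTM_UNITS) + LSTM_UNITS)

# Dropout has no trainable parameters.
dropout_params = 0

# Dense softmax layer: one weight per LSTM unit per class, plus a bias per class.
dense_params = LSTM_UNITS * N_CLASSES + N_CLASSES

total_params = lstm_params + dropout_params + dense_params
print(input_shape, lstm_params, dense_params, total_params)
```

Under these assumptions almost all trainable parameters sit in the LSTM layer, which is large relative to a training set of 351 videos and is one reason a dropout layer is a sensible precaution.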

Knowledge Extraction Method
After the network was trained, information was extracted from the network weights to identify the image features and the parts of each video frame that contributed most to the network decision. For this knowledge extraction, we used the Layer-wise Relevance Propagation (LRP) method (Bach et al., 2015; Lapuschkin et al., 2019) available in the DeepExplain package (Braitenberg and Schüz, 1998). This method uses the strength of synaptic weights and neuronal activity in the previous layer to recursively calculate the contribution (importance) of each neuron to the output score. Because our network is composed of two parts, ConvNet and RNN (Figure 2A), we first investigated which features were most informative for the RNN to classify animal groups (Figure 2B middle panel). Next, we propagated feature importance back to pixels in the video through the Inception-V3 network (Figure 2B left panel, Supplementary Figure 6, and Supplementary Text 2). This showed which parts of the image the network was "attending to" when making classifications, and allowed us to check whether the network was using rat movements rather than spurious features, such as the amount of light, to discriminate between the treatment groups. Other knowledge extraction methods, such as gradient-based methods (Shrikumar et al., 2017; Ancona et al., 2018), gave qualitatively similar results.
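The recursive redistribution of relevance can be illustrated for a single dense layer. The sketch below uses the epsilon rule, one common LRP variant; the text does not state which propagation rule was used, and the toy weights, activations, and function name are hypothetical. Biases are omitted so that relevance is conserved from outputs to inputs.

```python
def lrp_epsilon(activations, weights, relevance_out, eps=1e-9):
    """Redistribute output relevance to the inputs of one bias-free dense layer.

    activations[i] is the activity of input neuron i, weights[i][j] connects
    input i to output j, and relevance_out[j] is the relevance of output j.
    """
    n_in, n_out = len(activations), len(relevance_out)
    # Pre-activation of each output neuron: z_j = sum_i a_i * w_ij.
    z = [sum(activations[i] * weights[i][j] for i in range(n_in))
         for j in range(n_out)]
    relevance_in = []
    for i in range(n_in):
        r = 0.0
        for j in range(n_out):
            # Small stabilizer with the sign of z_j avoids division by zero.
            stabilizer = eps if z[j] >= 0 else -eps
            r += activations[i] * weights[i][j] / (z[j] + stabilizer) * relevance_out[j]
        relevance_in.append(r)
    return relevance_in


# Toy layer: 3 inputs, 2 outputs; all relevance starts on output neuron 0.
a = [1.0, 2.0, 0.5]
W = [[0.2, -0.1], [0.4, 0.3], [-0.6, 0.8]]
R_out = [1.0, 0.0]
R_in = lrp_epsilon(a, W, R_out)
print(R_in, sum(R_in))
```

Each input receives relevance in proportion to its contribution to the output pre-activation, so inputs that pushed the output up get positive relevance and inputs that pushed it down get negative relevance. Applying the same step layer by layer carries the output score back to features and, eventually, to pixels.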

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found at the following link: https://github.com/rezatorabi13/Behaviour_Recognizer.

ETHICS STATEMENT
The procedures of the animal study were approved by the University of Lethbridge Animal Care and Use Committee in accordance with the Canadian Council of Animal Care.

AUTHOR CONTRIBUTIONS
RT, SJ, IW, RG, and AL: conceptualization. SJ and AH: data acquisition and data scoring. RT, AL, and IW: analyses and interpretation of data. RT, SJ, AH, IW, RG, and AL: writing, review, and editing. All authors contributed to the article and approved the submitted version.

[...] correlations between principal components and movement measures (B). We found that the first principal component has the largest correlation with the "Novel" measure, followed by the "Total" measure. This indicates that control animals have a greater tendency to explore new places (i.e., enter a greater number of unique squares) as well as to explore more (i.e., enter more squares overall) than MPNE animals.
Supplementary Figure 2 | (A) Different architectures of recurrent neural network (RNN) tested. For this manuscript we selected the top network (one layer of LSTM with 256 neurons). However, all tested networks produced similar results, as shown in the table. The network performance was also robust to changes in video preprocessing. Specifically, down-sampling video by taking every 9th frame instead of every 10th frame (Methods) gave a similar accuracy of ∼89%. This shows that our network does not need fine-tuning to outperform machine learning methods using expert-selected movement measures. (B) Sample learning curve from training the top RNN with one layer of 256 long short-term memory (LSTM) (Hochreiter and Schmidhuber, 1997; Greff et al., 2017) units. Note that on training data we achieved 100% accuracy (red line), whereas on testing data accuracy was reduced (green line). This suggests that with a larger dataset the performance of the network may improve further.

Supplementary Figure 3 | Most informative features for network decisions. (A) Average feature importance over all videos, as shown in the main text in Figure 3A. The recurring peaks in feature importance show that the network is identifying periodic behavior, especially in the first 50 frames (∼17 s), to distinguish MPNE animals from control animals. Our RNN was able to detect a repetitive pattern in the behavior because it is composed of LSTM units, which have memory and are specialized in identifying sequences of activity. (B) Average relevance (importance) of each of the 2,048 features across all video data. Average relevance was obtained by averaging columns in the matrix shown in panel (A). (C) The same average feature importance as in (B), but sorted from the highest to the lowest value. It illustrates that about 20 features had a disproportionate effect on network decision making.
Supplementary Figure 4 | Power spectral analysis. To investigate the periodic behavior shown in Figure 3B, we calculated power spectra of the average feature importance. Blue and orange lines denote MPNE and control animals, respectively. A peak in the power spectra at a frequency of 0.27 Hz confirmed that video features oscillate with a period of 1/0.27 = 3.7 s. Note that this periodic behavior was seen mostly in MPNE animals. This indicates that MPNE animals have much more stereotypical behavior, while control animals have more diverse and less repetitive movements.
Supplementary Figure 5 | Power spectral analysis for the 20 most important features. The blue and orange lines indicate MPNE and control animals, respectively. After identifying the most important features as illustrated in Supplementary Figure 3C, the power spectrum of each of the 20 top features was calculated in each video. Then, we averaged the spectra of each feature, separately for MPNE and control animals, which led to the 20 graphs shown above. As can be seen, periodic behavior with a frequency of about 0.27 Hz is clearly visible in nearly all of the features.
Supplementary Figure 6 | (Left column) Representative video frames, and the same frames with superimposed network focus (Right column). The red color scale denotes the most informative pixels used by the network to make decisions. This allowed us to verify that the network used features related to rat posture to discriminate control from MPNE animals. This analysis also ensures that the network does not "cheat" by using spurious features like the clock display. To superimpose pixel importance on the frames, the importance values were rescaled to the range 50-250. The pixel importance was obtained using the LRP method described above.
Supplementary Text 1 | Applying machine learning algorithms on movement measures.
Supplementary Text 2 | Notes on knowledge extraction from neural networks.
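The power spectral analysis described for Supplementary Figures 4 and 5 can be sketched with a plain discrete Fourier transform. The example below is illustrative only: it uses a synthetic sinusoid at 0.27 Hz in place of the real feature-importance traces, and the function name and normalization are assumptions. The sampling rate (3 frames per second) and trace length (150 frames) follow the Methods.

```python
import cmath
import math

SAMPLE_RATE_HZ = 3.0  # every 10th frame of a 30 frames-per-second video
N_FRAMES = 150


def power_spectrum(signal, sample_rate_hz):
    """Return (frequencies, power) for the positive-frequency DFT bins."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the constant offset
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        freqs.append(k * sample_rate_hz / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


# Synthetic stand-in for a feature-importance trace oscillating at 0.27 Hz.
signal = [math.sin(2 * math.pi * 0.27 * t / SAMPLE_RATE_HZ) for t in range(N_FRAMES)]
freqs, power = power_spectrum(signal, SAMPLE_RATE_HZ)
peak_freq = freqs[power.index(max(power))]
print(f"peak near {peak_freq:.2f} Hz, period about {1 / peak_freq:.1f} s")
```

With 150 samples at 3 Hz the frequency resolution is 0.02 Hz, so a peak reported at 0.27 Hz is localized only to within roughly one bin on either side.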