Direct Metatranscriptome RNA-seq and Multiplex RT-PCR Amplicon Sequencing on Nanopore MinION - Promising Strategies for Multiplex Identification of Viable Pathogens in Food.

Viable pathogenic bacteria are major biohazards that pose a significant threat to food safety. Despite the recent developments in detection platforms, multiplex identification of viable pathogens in food remains a major challenge. A novel strategy is developed through direct metatranscriptome RNA-seq and multiplex RT-PCR amplicon sequencing on Nanopore MinION to achieve real-time multiplex identification of viable pathogens in food. Specifically, this study reports an optimized universal Nanopore sample extraction and library preparation protocol applicable to both Gram-positive and Gram-negative pathogenic bacteria, demonstrated using a cocktail culture of E. coli O157:H7, Salmonella enteritidis, and Listeria monocytogenes, which were selected based on their impact on economic loss or prevalence in recent outbreaks. Further evaluation and validation confirmed the accuracy of direct metatranscriptome RNA-seq and multiplex RT-PCR amplicon sequencing using Sanger sequencing and selective media. The study also included a comparison of different bioinformatic pipelines for metatranscriptomic and amplicon genomic analysis. MEGAN without rRNA mapping showed the highest accuracy of multiplex identification using the metatranscriptomic data. EPI2ME also demonstrated high accuracy using multiplex RT-PCR amplicon sequencing. In addition, a systematic comparison was drawn between Nanopore sequencing of the direct metatranscriptome RNA-seq and RT-PCR amplicons. Both methods are comparable in accuracy and time. Nanopore sequencing of RT-PCR amplicons has higher sensitivity, but Nanopore metatranscriptome sequencing excels in read length and in dealing with complex microbiome and non-bacterial transcriptome backgrounds.


INTRODUCTION
Biological threats, including bacteria, viruses, and parasites, remain the top food safety challenge in the United States. According to the most recent annual report from the CDC surveillance for foodborne disease outbreaks, outbreaks attributed to bacterial infection in 2016 comprised 44% of the total 645 outbreaks and caused 76% of the 847 hospitalization cases (Centers for Disease Control and Prevention, 2018). More importantly, a recent report published by the U.S. Department of Agriculture Economic Research Service (USDA ERS) stated that food safety challenges caused an annual loss of $15.5 billion to the economy and that the top 10 infectious bacteria alone contribute $10 billion in economic loss (Hoffmann et al., 2015). These statistics reveal that bacterial infection is the primary concern among all biological threats. To cope with the threat of bacterial infection to public health, the demand for a rapid and highly sensitive method to detect and identify bacterial pathogens in food is enormous and becoming more urgent, especially after the implementation of the Food Safety Modernization Act (FSMA) in 2011.
Commercial food safety testing methods include traditional plate counting methods, immunological techniques such as enzyme-linked immunosorbent assay (ELISA), lateral flow immunoassay chip, electrochemical biosensors, and chromatography, as well as nucleic acid-based approaches (Heiat et al., 2014). As summarized in Supplementary Table 1, the widely recognized cultivation methods and commercial rapid detection systems that are available for food defense applications have major limitations, such as large sample size, long turnaround time, and intensive labor demands. Rapid detection systems have also failed to address the unique challenges of food safety and food defense, despite their recent success in medical and clinical diagnostics.
There are two major unique challenges in food defense, especially in identifying biohazards in food. First, viable bacteria are the etiological agents of foodborne illnesses. Most rapid detection methods have limited discriminatory power to identify bacterial viability (summarized in Supplementary Table 1). Current genome-based technologies include polymerase chain reaction (PCR), real-time PCR (qPCR), fluorescence in situ hybridization (FISH), nucleic acid sequence-based amplification (NASBA), and loop-mediated isothermal amplification (LAMP). PCR-based methods are sensitive and specific but easily generate cross contamination between pre-PCR and post-PCR products. Insufficient permeability of cell walls and the inherent autofluorescence of the substrate decrease the efficiency of FISH (Baschien et al., 2008). NASBA and LAMP do not require a thermocycler; however, NASBA shows a size range limitation of target RNA, and LAMP requires complex primer design that cannot be adapted to multiplex amplification (Zanoli and Spoto, 2013). In addition, all these approaches suffer from a limited capability to identify viable pathogens (Malhotra et al., 2014). False positive results are a major issue for DNA-based approaches. This is due to the inability to differentiate DNA molecules in viable bacterial cells from the genomic background, which is comprised of stable DNA molecules from the microbiota, the food matrices, and dead pathogens inactivated during food processing and storage (Sheridan et al., 1998; Hellyer et al., 1999). In contrast, transcriptome-based technologies, which utilize RNA as an alternative biomarker for bacterial viability, hold more promise, because RNA molecules tend to have a shorter half-life than DNA in the environment when cells are inactivated (Sheridan et al., 1998; Hellyer et al., 1999). Recent progress was made using reverse transcription PCR (RT-PCR), but false positives also plague RT-PCR approaches (Lu et al., 2016). This can be explained by non-specific amplification of RNA molecules from food matrices and microbiota (Ju et al., 2016; Takahashi et al., 2018). Subsequent sequencing of the RT-PCR amplicons has the potential to significantly improve the accuracy of the transcriptome-based approach by identifying the origin of the amplicons.
The second prominent challenge is multiplex identification without the need for assay customization to each individual microbial threat. Each food commodity often faces multiple, and sometimes random, threats from dozens of major etiological agents (Crowe et al., 2015). A monitoring and inspection system should therefore be capable of multiplex identification. Nonetheless, conventional systems depend on the customization of recognition elements, like antibodies or enzymes, to achieve multiplex detection, which can be economically prohibitive (Faggioli et al., 2017; Wang et al., 2019). Therefore, a feasible strategy should enable multiplex identification without the need to customize for individual threats, which can be of great importance and benefit to food defense. Several multiplex RT-PCR methods were developed for S. aureus, Salmonella and Listeria using food models over the last decade (Bao et al., 2008; Kawasaki et al., 2010; Ruiz-Rueda et al., 2011; Garrido et al., 2013; Salihah et al., 2016; Ding et al., 2017). However, a recent validation study suggests that multiplex RT-PCR may also generate false positive results in real food samples, especially if rRNA is the target template (Ju et al., 2016). Very recently, Next Generation Sequencing (NGS) platforms, such as Illumina, have emerged as a new strategy for food defense (Diaz-Sanchez et al., 2013; Solieri et al., 2013; Mayo et al., 2014; Moran-Gilad, 2017; Taboada et al., 2017), but their applications in food testing remain very limited. NGS does not permit timely analysis, as these platforms generate sequence reads in parallel and not in series, so data analysis can add a significant burden to the total turnaround time. Additionally, NGS relies on non-portable and expensive equipment, which is also economically prohibitive for the food industry.
The novel Oxford Nanopore MinION sequencer has emerged as a promising method of food pathogen detection based on its rapid, cost effective, portable, and high-throughput RNA and DNA sequencing workflows (Pritchard et al., 2016; Walsh et al., 2017; Hyeon et al., 2018; Taylor et al., 2019). Nanopore sequencing is a third-generation sequencing platform that can produce long reads on DNA and RNA molecules and perform real-time metagenomic and metatranscriptomic sequence analysis on the pocket-sized Nanopore MinION device (Garalde et al., 2018). This technology can be used to identify viral pathogens, as well as microorganisms such as bacteria and fungi (Greninger et al., 2015; Juul et al., 2015; Cheng et al., 2018). A few studies have demonstrated Nanopore's potential for food safety application using metagenomic sequencing in clinical and food samples (Quick et al., 2015); however, as with other genomic approaches, stable DNA molecules can cause false-positive identification, and the studies did not validate whether the nanopore metagenomic sequencing data correlate only with viable pathogens. Direct RNA sequencing was successfully developed in 2018 on the Nanopore MinION (Garalde et al., 2018).
Therefore, for the first time, RNA-enabled Nanopore sequencing is evaluated for its potential in achieving multiplex identification of viable pathogens in this study (Figure 1). Specifically, an optimized universal RNA extraction and DNA digestion method is developed to simplify and standardize the RNA preparation for both Gram-positive and Gram-negative bacteria. In addition, the work also includes an accuracy evaluation of different bioinformatic pipelines using whole metatranscriptome datasets, which includes MEGAN as a tool to compare the impact of rRNA on the accuracy of taxonomic analysis. Direct metatranscriptomic RNA sequencing and multiplex RT-PCR amplicon sequencing were evaluated and compared using a cocktail culture of Escherichia coli O157:H7 (E. coli O157:H7), Salmonella enteritidis (S. enteritidis), and Listeria monocytogenes (L. monocytogenes) in both standard general-purpose media and a food model. The three bacteria were selected based on their impact on economic loss or prevalence in recent outbreaks.

Verification of mRNA as Biomarkers for Bacteria Viability
Figure 2 shows qPCR and RT-qPCR results of E. coli O157:H7 samples collected at 5 time points. The E. coli O157:H7 growth curve (Figure 2A) resembles a typical microbial growth curve with an exponential phase from 0 to 24 h and a stationary phase from 24 to 72 h. Bacterial counts at 72 h showed a slight decrease from 24 h, which could indicate the start of the death phase. The RT-qPCR of mRNA collected at different time points (Figure 2B) showed that the greatest amount of RNA was found in the 8-h and 24-h samples, followed by a decline of mRNA concentration at 72 h. The mRNA concentration (Figure 2B) closely tracks the viable cell density (Figure 2A). The results indicate that mRNA has a good correlation with viable bacterial count. The melt curve (Figure 2E) showed 5 peaks from the 5 different time points, which indicates that the same mRNA was amplified. In the negative control, no colony was identified on BHI agar and no amplicon was detected by gel electrophoresis.
The qPCR of E. coli O157:H7 DNA (Figure 2D) showed that the amount of DNA in the 72-h samples was greater than the amount in the 24-h samples, which contradicted the data from the viable bacterial counts. This indicated that DNA accumulated from nonviable cells was present in the 72-h samples, which was consistent with other studies (del Mar Lleò et al., 2000; Delgado-Viscogliosi et al., 2009). Hence, DNA was not a reliable indicator of bacterial viability. Additionally, the same qPCR amplicon was detected by gel electrophoresis in the negative control of sodium hypochlorite treated E. coli O157:H7. Therefore, the results demonstrate that the global transcriptome, especially mRNA, of bacteria could be a robust indicator of cell viability.

Direct Metatranscriptome RNA-seq on Nanopore MinION and NGS iSeq 100
Tables 1, 2 show the results of direct metatranscriptome RNA-seq of the E. coli O157:H7, S. enteritidis and L. monocytogenes cocktail in BHI and LJE 24-h culture using different sequencing approaches and bioinformatic pipelines. EPI2ME, MG-RAST, and MEGAN all misidentified the three pathogens as other species (Table 2). MEGAN with non-rRNA mapping successfully identified the three bacteria without misidentification as Listeria, E. coli and Salmonella at 91.1, 5.4, and 3.6% in BHI and 67.5, 20, and 12.5% in LJE, respectively (Table 2). The results agreed with the plate counting confirmation and the growth curve of the cocktail culture, which showed that all three bacteria were present (Supplementary Figure 1 and Supplementary Table 2). The mean read length was close to 1,200 bp (Table 1 and Supplementary Figure 2), which agrees with the size of 16S rRNA in bacteria. The average quality scores were 83.4% in BHI and 83.7% in LJE. There was no misidentification of the bacteria with a quality score cut-off of 7.0 (80%) in the MEGAN analysis with non-rRNA mapping. No false positive identification of any bacteria was found in the negative control of sodium hypochlorite treated cocktail culture. In addition, the three pathogens were identified at the genus level by NGS in the positive control, with no other species reported (Table 2). Therefore, the results strongly support that direct metatranscriptome RNA-seq on Nanopore MinION can achieve multiplex identification of viable pathogens.

Multiplex RT-PCR Amplicon Sequencing on Nanopore MinION
Similarly, multiplex RT-PCR amplicon sequencing also successfully identified the three bacteria in the 4-h cocktail culture sample (Figure 3). E. coli O157:H7, S. enteritidis and L. monocytogenes were observed in a real-time phylogenetic tree generated by EPI2ME in less than 15 min and the distribution was, respectively, 84.7, 13.7, and 1.7% in the BHI sample, and 50.9, 43.4, and 5.7% in the LJE sample (Figures 3A,B and Table 2). The average read quality was 90.9 and 89.3%. A total of 29,279 reads were analyzed in the BHI culture with a 3-h running time (early termination due to high quality score) and 442,325 reads in the LJE cocktail culture with a 24-h running time (Table 1). The average sequence length was 534 bp in the BHI sample and 432 bp in the LJE sample (Supplementary Table 3).

Quality Control and Comparison of Bioinformatic Pipelines
In this study, the quality score from the MinKNOW QC report was 83.4 and 83.7% for direct metatranscriptome RNA-seq, compared with 90.9 and 89.3% for multiplex RT-PCR amplicon sequencing (Table 1). Raw data were collected by the nanopore real-time sequencing software MinKNOW and analyzed with different bioinformatic databases and pipelines.
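The results quote a quality score cut-off of 7.0 alongside 80%, which matches the standard Phred relationship between a quality score Q and per-base accuracy, 1 - 10^(-Q/10). The sketch below assumes that the percentage scores reported here follow the same convention (the text does not state this explicitly) and converts between the two scales. The function names are illustrative only.

```python
import math


def phred_to_accuracy(q: float) -> float:
    """Per-base accuracy implied by a Phred quality score: 1 - 10^(-Q/10)."""
    return 1.0 - 10.0 ** (-q / 10.0)


def accuracy_to_phred(accuracy: float) -> float:
    """Inverse conversion; accuracy must lie strictly between 0 and 1."""
    if not 0.0 < accuracy < 1.0:
        raise ValueError("accuracy must be in the open interval (0, 1)")
    return -10.0 * math.log10(1.0 - accuracy)


# Cut-off quoted in the text: Q7.0, reported as 80%.
print(f"Q7.0 -> {phred_to_accuracy(7.0):.1%}")

# Percentage scores quoted in the text, converted under the Phred assumption.
reported = [
    ("direct RNA-seq, first value", 83.4),
    ("direct RNA-seq, second value", 83.7),
    ("amplicon, first value", 90.9),
    ("amplicon, second value", 89.3),
]
for label, pct in reported:
    print(f"{label}: {pct}% -> Q{accuracy_to_phred(pct / 100.0):.1f}")
```

Under this assumption, the amplicon runs sit roughly two to three Q units above the direct RNA-seq runs, which is consistent with the higher read quality reported for amplicon sequencing.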
In metatranscriptomic direct RNA-seq, MinKNOW misidentified Bacillus as the top genus in both the BHI and LJE samples (Table 2). This error could be caused by the similarity between Listeria and Bacillus, especially in their housekeeping genes and rRNA (Borezee et al., 2000; Ferreira et al., 2004). Although MG-RAST eliminated this misreading, other untargeted bacteria accounted for a proportion close to that of S. enteritidis (1.5%) (Table 2). MEGAN was able to eliminate other untargeted bacteria except Bacillus (still misidentified at 63.6%), which again was likely caused by rRNA or other housekeeping genes. Therefore, MEGAN with non-rRNA mapping was performed and successfully identified all three primary bacteria, Listeria, E. coli O157:H7 and Salmonella, without any misidentification (Table 2).
The results of multiplex RT-PCR sequencing showed that the three targeted bacteria were anchored accurately by MinKNOW (Table 2), and no misidentification appeared in the results.

Gel Electrophoresis of RT-PCR and PCR Amplicon
RT-PCR was used to verify the presence of all three bacteria in the cocktail culture, and PCR was used to verify the complete removal of DNA contamination using the protocol described above. Supplementary Figure 3A shows the verification of the 24-h LJE cocktail culture, which was used in metatranscriptomic direct RNA-seq. The results showed the RT-PCR product of three expected bands for stx, invA and inlA in the 24-h cocktail culture, which confirms the presence of all three target pathogens. No bands appeared in the negative controls using only PCR without the RT step, which indicates the absence of DNA contamination.
Further validation was performed for the 4-h LJE cocktail culture sample, which was used in the multiplex RT-PCR amplicon sequencing. Supplementary Figure 3B shows the multiplex RT-PCR amplicon with three bands. The sizes of the amplicons are consistent with previous reports of 520 (stx), 244 (invA) and 153 (inlA) bp (Supplementary Figure 3B, lane 2). No RT-PCR products were detected in the negative controls (lanes 3, 4, and 5) using only PCR without the RT step, which indicates that there was no DNA contamination in the sample.

Viability and Multiplex Identification
In this study, RNA-enabled Nanopore sequencing is evaluated, for the first time, for its potential in achieving multiplex identification of viable pathogens. The optimized universal RNA extraction and DNA digestion method was developed to simplify and standardize the RNA preparation for both Gram-positive and Gram-negative bacteria. Direct metatranscriptome RNA-seq and multiplex RT-PCR amplicon sequencing were evaluated and compared using a cocktail culture of E. coli O157:H7, S. enteritidis, and L. monocytogenes in both standard general-purpose media and a food model.
False positives are a major issue for DNA-based approaches. This is due to the inability to differentiate DNA molecules in viable bacterial cells from the genomic background, which is comprised of stable DNA molecules from the microbiota, the food matrices, and dead pathogens inactivated during food processing and storage. Both approaches developed in this study only utilize RNA, especially mRNA, as the ultimate sequencing target, which eliminated false positive identification typically caused by DNA contamination.
Random and unknown threats from multiple infectious bacteria pose a significant threat to the safety and security of food supplies worldwide. A feasible strategy should enable multiplex identification without the need to customize for an individual threat. Therefore, the developed universal protocol is applicable to both Gram-positive and Gram-negative bacteria. RNA from multiple pathogens in one food sample can be collected in one extraction and library preparation step, followed by the universal sequencing protocol.

Comparison of Direct Metatranscriptome RNA-seq and Multiplex RT-PCR Amplicon Sequencing
The developed method successfully identified all three bacteria from cocktail culture in BHI and LJE by direct metatranscriptome RNA-seq and multiplex RT-PCR amplicon sequencing. Nonetheless, the two sequencing approaches entail different capacities and challenges. Direct metatranscriptome RNA-seq does not require assay customization for an individual biohazard if the bioinformatic database includes the target microbiota. Multiplex RT-PCR amplicons comprise the target gene copies, and they can be easily captured by the motor membrane protein when passing through the nanopores. As a result, it shows higher accuracy, greater quality score, better quality control, and less turnaround time.
The two strategies result in different read lengths. In this study, we extracted total bacterial RNA, which is comprised of a majority of rRNA and a small amount of mRNA and tRNA, for nanopore sequencing. The bioanalyzer results showed that the majority of RNA from the E. coli O157:H7, S. enteritidis and L. monocytogenes cocktail culture was 16S rRNA (1250-2100 nucleotides) and 23S rRNA (2250-3950 nucleotides). Direct metatranscriptome RNA-seq successfully identified all three bacteria. The RNA read length ranged from 0 to 3000 nucleotides, with the most abundant read lengths between 400 and 1600 nucleotides. The read length from direct metatranscriptome RNA-seq is approximately the full length of the RNA (Cho et al., 2014; Byrne et al., 2017). Thus, this method can provide approximately full-length RNA sequences. In multiplex RT-PCR amplicon sequencing, however, the amplicons for each strain have different expected sizes, which were observed in the sequencing read length results. In the real-time analysis of multiplex RT-PCR amplicon sequencing, all three bacteria were identified within 15 min and the resulting read lengths were 510 bp (stx), 244 bp (invA) and 153 bp (inlA), respectively. Some reports suggest that Nanopore excels in long RNA reads of up to thousands of nucleotides, and that sequencing of short reads tends to be more challenging due to their higher and non-uniform error profiles, which might result in a large fraction of reads remaining unmapped or unused (Grabherr et al., 2011; Madoui et al., 2015; de Lannoy et al., 2017). However, amplicon sequencing showed less error than metatranscriptomic direct RNA-seq. Multiplex RT-PCR amplicon sequencing successfully identified all three target bacteria using MinKNOW (Table 2) in real time. Direct metatranscriptome RNA-seq could experience misidentification when the assay relies on a less efficient library preparation or an inappropriate bioinformatic pipeline. Therefore, more work is warranted to improve library preparation and bioinformatic pipelines for practical applications.
Both methods are comparable in their total turnaround time. Direct metatranscriptome RNA-seq does not include an additional RT-PCR step, but the library preparation, bioinformatic analysis, and mapping could easily offset the time difference. The total turnaround time for direct metatranscriptome RNA-seq is approximately 6.5 h, which includes RNA purification (3.5 h), library preparation (1.5 h), Nanopore sequencing (1 h), and bioinformatic analysis (0.5 h). Multiplex amplicon sequencing takes approximately 6 h, which includes RNA purification (3.5 h), RT-PCR (2 h), library preparation (0.5 h), Nanopore sequencing (15 min), and bioinformatics analysis (0.5 h).
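As a quick arithmetic check of the turnaround times above, the sketch below sums the per-step durations exactly as they are listed in the text. The dictionary names are illustrative. Note that the listed amplicon steps add up to 6.75 h, so the "approximately 6 h" figure is best read as a rounded estimate.

```python
# Step durations in hours, as listed in the text.
rna_seq_steps = {
    "RNA purification": 3.5,
    "library preparation": 1.5,
    "Nanopore sequencing": 1.0,
    "bioinformatic analysis": 0.5,
}
amplicon_steps = {
    "RNA purification": 3.5,
    "RT-PCR": 2.0,
    "library preparation": 0.5,
    "Nanopore sequencing": 0.25,  # 15 min
    "bioinformatic analysis": 0.5,
}


def total_hours(steps: dict) -> float:
    """Sum the duration of all workflow steps, in hours."""
    return sum(steps.values())


for name, steps in [("direct RNA-seq", rna_seq_steps),
                    ("multiplex amplicon", amplicon_steps)]:
    print(f"{name}: {total_hours(steps):.2f} h")
```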
Multiplex RT-PCR amplicon sequencing requires substantially less RNA input, which could translate into less microbial input. The method only requires 36.5 ng RNA input for multiplex RT-PCR, and 33.8 ng amplicon for library preparation and sequencing (Supplementary Table 2). The amplicon sequencing method is more sensitive and could be applicable to food commodities with low bacterial loads of around 10^1 to 10^4 CFU/g. For metatranscriptomic direct RNA-seq, the supplier recommends 500 ng RNA input on the Nanopore MinION. However, significant RNA loss was observed during the library preparation due to the three purification steps. The initial purified RNA amount before library preparation was 3490 ng and 1338 ng in BHI and LJE, respectively, and only 744 and 130 ng were recovered for Nanopore sequencing. RNA loss can be as high as 80-90%, which significantly restricted the sensitivity of the assay. Redesigning the library preparation protocol to minimize RNA loss could have profound significance for assay sensitivity and feasibility for clinical applications.
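The RNA loss quoted above can be reproduced directly from the reported input and recovered amounts. The following is a minimal sketch of that calculation; the helper name `percent_loss` is illustrative and not part of any protocol.

```python
def percent_loss(input_ng: float, recovered_ng: float) -> float:
    """Percentage of RNA lost between purification and sequencing."""
    if input_ng <= 0:
        raise ValueError("input_ng must be positive")
    return 100.0 * (input_ng - recovered_ng) / input_ng


# Amounts reported in the text (ng before library preparation, ng recovered).
samples = {"BHI": (3490.0, 744.0), "LJE": (1338.0, 130.0)}

for medium, (before, after) in samples.items():
    print(f"{medium}: {percent_loss(before, after):.1f}% RNA lost")
```

With the reported values this gives a loss of about 79% in BHI and about 90% in LJE, in line with the 80-90% range stated in the text.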
The two strategies pose different levels of complexities. Direct metatranscriptome RNA-seq may be applicable in foods with a complex microbiome (e.g. cultured food). Direct metatranscriptome RNA-seq does not require assay customization for an individual biohazard, if the bioinformatic database includes the target microbiota. The multiplex RT-PCR amplicon sequencing requires complex primer design and validation. Not all RT-PCR primers work in multiplex RT-PCR, due to potential primer interaction, non-specific amplification, and amplification bias. The amplicon sequencing may be more suitable for high-throughput and continuous monitoring of foodborne pathogens with high risk factors.

Comparison Between Different Bioinformatic Pipelines
Bioinformatic analysis has a significant impact on the accuracy of Nanopore sequencing. Different computational pipelines applied to the same nanopore data may lead to different results. Normally, a MinION pipeline contains primer trimming, alignment, variant calling and consensus generation (Loman and Quinlan, 2014; Wood and Salzberg, 2014; Menzel et al., 2016; Kerkhof et al., 2017), and EPI2ME conducts real-time surveillance of nanopore sequencing. First, reads containing raw data are base called by MinKNOW and then extracted into a FASTQ file for mapping to a reference transcriptome or genome (Li and Durbin, 2010; Mitsuhashi et al., 2017; Rang et al., 2018), with primer trimming and coverage normalization applied during alignment. During this process, low quality or low coverage reads (read hits) are filtered out to generate the final sequence for BLAST at NCBI (available at: https://www.ncbi.nlm.nih.gov, accessed January 6, 2019). MinION chemistry provides a simplified and rapid report of the nanopore run, including read number, read length, cumulative reads, taxonomy tree and quality control. MEGAN and MG-RAST are popular software or services for metagenome or metatranscriptome analysis. The similarity between them is that they perform computational analysis of multiple datasets for taxonomic content at the family and genus level. In contrast, MEGAN is able to perform taxonomical, functional and interactive analyses, that is, the comparison of taxonomic and functional contents based on the SEED hierarchy and KEGG pathways (Borthong et al., 2018; Huson et al., 2018). In this study, MG-RAST was used in taxonomic analysis, and MEGAN was selected for mRNA analysis by removing rRNA from the whole metatranscriptome datasets. Taxonomical classification using MEGAN on non-rRNA data showed higher fidelity to the types of pathogens inoculated in the sample and was consistent with our separate plate count and NGS validation. Moreover, both MG-RAST and MEGAN showed higher accuracy compared with EPI2ME for taxonomical classification using the whole metatranscriptome data. The results agree with previous reports on bioinformatic analysis using non-rRNA, because different bacteria can have very similar housekeeping genes in the form of rRNA, which leads to errors in taxonomical classification (Klappenbach et al., 2000; Schmieder et al., 2011). Multiplex RT-PCR amplicon sequencing obtained rapid and accurate taxonomic content because this nanopore sequencing method offers high sensitivity. In addition, an adequate and complete BLAST database may further improve the accuracy, speed, and quality of the multiplex identification of viable pathogens in food.

CONCLUSION
Novel strategies are developed through direct metatranscriptome RNA-seq and multiplex RT-PCR amplicon sequencing on Nanopore MinION to achieve real-time multiplex identification of viable pathogens in food. This study reports an optimized universal Nanopore sample extraction and library preparation protocol applicable to both Gram-positive and Gram-negative bacteria. Further evaluation and validation confirmed the accuracy of direct metatranscriptome RNA-seq and multiplex RT-PCR amplicon sequencing using Sanger sequencing and selective media. The result of NGS (positive control) from this study confirmed the identification of the three pathogens performed by Nanopore direct metatranscriptome RNA-seq, even though NGS had time-consuming and cost prohibitive limitations. The study also included a comparison of different bioinformatic pipelines for metatranscriptomic and amplicon genomic analysis. In addition, direct metatranscriptome RNAseq and RT-PCR amplicon sequencing were compared for their respective advantages in sample inputs, accuracy, sensitivity, and time effectiveness for potential applications.
Both direct metatranscriptome RNA-seq and multiplex RT-PCR amplicon sequencing require further development to address some pressing challenges. (A) Optimization of direct metatranscriptome RNA-seq may include minimizing RNA loss in the library preparation step; comparison of bioinformatic pipelines to eliminate misidentified and unclassified targets; and cross-domain identification of prokaryotes, eukaryotes, and viruses. This is especially important in complex mixtures. For example, less RNA loss during library preparation often leads to increased RNA yield and input on the Nanopore device, which improves data quality and the accuracy of the taxonomical identification. (B) Multiplex RT-PCR amplicon sequencing can benefit from multiplex primer development, inclusivity/exclusivity evaluations, and reduced amplification bias.
To the best of our knowledge, this is the first report of metatranscriptome sequencing of cocktail microbial RNAs on the emerging Nanopore platform. Direct RNA-seq and RT-PCR amplicon sequencing of the metatranscriptome enable the direct identification of nucleotide analogs in RNAs, which is highly informative for determining microbial identities while detecting ecologically relevant processes. The information obtained in this study could be important for future research, including predicting antibiotic resistance, elucidating host-pathogen interactions, prognosing disease progression, and investigating microbial ecology.

Bacterial Strains and Culturing
Escherichia coli O157:H7 (ATCC 43895), S. enteritidis (ATCC 13076), and L. monocytogenes (ATCC 19115) were acquired from ATCC (Manassas, VA). The three bacteria were cultured using Brain Heart Infusion (BHI) broth and agar (BD, Franklin Lakes, NJ, United States) at 37 °C for 24 h either in separate individual cultures or in cocktail cultures. Romaine lettuce (Lactuca sativa L. var. longifolia) juice extract (LJE) was used as a food model in this study. Romaine LJE was prepared according to our previous publication (Shen et al., 2012). Briefly, 250 g fresh Romaine lettuce heart (Fresh Express) and 200 ml DI water were blended in a Waring 7011G Commercial Blender for 1 min. The blended mixture was then filtered through a Büchner funnel with P5 filter paper. The filtrate was centrifuged at 2300 × g for 10 min (low speed centrifugation); the supernatant from low speed centrifugation was then centrifuged at 3200 × g for 30 min (high speed centrifugation). The high speed centrifugation supernatant was then filtered through a 0.2-micron filter membrane (vacuum filter 0.2-micron, Thermo Fisher Scientific) and diluted to 4% using sterilized DI water (COD = 800 ppm) to grow bacteria.
Cocktail cultures of E. coli O157:H7, S. enteritidis, and L. monocytogenes in BHI or LJE were obtained by inoculating an appropriate volume of 24-h stock culture of the individual bacteria to achieve the initial concentrations shown in Supplementary Table 2. The concentration of each bacterium was determined by plate counting methods using selective agars. Oxford Listeria selective agar base (Oxford formulation) with Oxford modified Listeria selective supplement was used for the selective quantification of L. monocytogenes. MacConkey agar (BD, Franklin Lakes, NJ, United States) was used to differentiate and quantify E. coli O157:H7 and S. enteritidis. Both cultures were incubated at 37 °C for 24 h before quantification using an automated plate counter (Scan 300, Interscience Laboratories Inc., Woburn, MA, United States).

Verification of mRNA as Biomarkers for Bacteria Viability
Escherichia coli O157:H7 was selected as a model organism to demonstrate that mRNA is a valid indicator of bacterial viability. An aliquot of overnight E. coli O157:H7 culture was inoculated at 3-log CFU/mL in BHI broth and incubated at 37 °C. The culture was sampled at 0, 4, 8, 24, and 72 h, and fractions from the sampled culture were plated on BHI agar to determine viable bacterial counts. The remaining fractions were used for DNA and RNA purification using the DNeasy blood and tissue kit (Qiagen, Germantown, MD, United States) and the Monarch Total RNA Miniprep Kit (New England Biolabs, Ipswich, MA, United States), respectively, following the supplier protocols. In RNA preparation, to lyse the cells, the cell pellet obtained from the initial centrifugation was incubated at 37 °C for 1 h with 300 rpm mixing in 250 µL of 3 mg/mL lysozyme (Alfa Aesar, Haverhill, MA, United States) in Tris-EDTA buffer (Sigma-Aldrich, St. Louis, MO, United States). Purified DNA and RNA from these time points were quantified using the Qubit dsDNA HS Assay Kit and Qubit RNA HS Assay Kit (Invitrogen, Carlsbad, CA, United States), and also quantified using the NEB Luna Universal qPCR Master Mix and Luna Universal One-Step RT-qPCR Kit (New England Biolabs, Ipswich, MA, United States) following supplier protocols. A Bio-Rad CFX-96 Touch real-time PCR detection system was used for qPCR and RT-qPCR testing. The primer pair used in this test was designated Stx1A, and its sequence is listed in Supplementary Table 3. E. coli O157:H7 inactivated with 13.4 mmol/L of sodium hypochlorite was used as the negative control to test whether mRNA and/or DNA can be used as viability biomarkers (Skinner et al., 2018).

Direct Metatranscriptome RNA-seq on Nanopore MinION
One-dimensional direct metatranscriptome RNA-seq was performed using 24-h cocktail cultures of E. coli O157:H7, S. enteritidis, and L. monocytogenes in BHI or LJE. RNA was extracted from the cocktail culture using the Monarch Total RNA Miniprep Kit with DNase I (NEB, Ipswich, MA, United States), and the extraction was confirmed by multiplex RT-PCR and gel electrophoresis. DNA was completely digested using DNase I (NEB # T2010S, working concentration: 0.1 U/µl), as verified by multiplex PCR and gel electrophoresis. The primers stx, invA, and LisA2 were selected for E. coli O157:H7, S. enteritidis, and L. monocytogenes, respectively. RT-PCR products were analyzed by gel electrophoresis on a 1.2% agarose gel. The One Taq One-Step RT-PCR Kit (NEB # E5310S) was used for nucleic acid amplification. The thermal cycler conditions were: reverse transcription at 48°C for 15 min; initial denaturation at 94°C for 1 min; 40 cycles of denaturation at 94°C for 15 s, annealing at 53°C for 30 s, and extension at 68°C for 40 s; final extension at 68°C for 5 min.
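As an illustration, the one-step RT-PCR program described above can be written down as a small data structure from which the programmed hold time is computed. This is only a sketch: the class and variable names are hypothetical, and ramp times between temperatures, which depend on the instrument, are not included.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


@dataclass(frozen=True)
class Stage:
    steps: Tuple[Step, ...]
    cycles: int = 1

    def hold_seconds(self) -> int:
        """Programmed hold time of this stage, ignoring ramping."""
        return self.cycles * sum(step.seconds for step in self.steps)


# Temperatures, durations, and cycle number are taken from the text above.
RT_PCR_PROGRAM = (
    Stage((Step("reverse transcription", 48, 15 * 60),)),
    Stage((Step("initial denaturation", 94, 60),)),
    Stage((Step("denaturation", 94, 15),
           Step("annealing", 53, 30),
           Step("extension", 68, 40)), cycles=40),
    Stage((Step("final extension", 68, 5 * 60),)),
)


def total_hold_seconds(program: Sequence[Stage]) -> int:
    """Sum of all programmed holds across stages."""
    return sum(stage.hold_seconds() for stage in program)


print(f"programmed hold time: {total_hold_seconds(RT_PCR_PROGRAM) / 60:.1f} min")
```

The programmed holds add up to 4,660 s, or roughly 78 min before ramping is taken into account.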
The prepared RNA samples were then poly(A)-tailed and prepared as libraries following the suppliers' protocols. Direct metatranscriptome RNA-seq was developed based on the supplier's direct RNA-seq protocol (RNA Kit SQK-RNA001, Oxford Nanopore Technologies, Oxford, United Kingdom). The MinION flow cell was primed using a priming mix, and then 75 µl of sample was loaded dropwise into the SpotON sample port to avoid bubbles. After the sample was added, a sequencing run was started in the MinKNOW software. Cocktail culture inactivated with 13.4 mmol/L of sodium hypochlorite was used as the negative control to test whether metatranscriptome sequencing can eliminate false positive identification.
Briefly, 100 ng of FFPE RNA was diluted in 12 µl of Nuclease-free Water for RNA probe hybridization. Twelve microliters of total RNA was mixed with 1 µl of NEBNext rRNA Depletion Solution and 2 µl of Probe Hybridization Buffer and incubated under the following thermal conditions: heated lid at 105°C; 2 min at 95°C; ramp down to 22°C at 0.1°C/s; 5 min hold at 22°C. After that, RNase H Digestion and DNase I Digestion were performed. First, 15 µl of sample was mixed with 5 µl of RNase H master mix (2 µl of NEBNext RNase H, 2 µl of NEBNext RNase H Reaction Buffer, 1 µl of Nuclease-free Water) and incubated for 30 min at 37°C (heated lid at 40°C); then 20 µl of sample was added to 30 µl of DNase I digestion master mix (5 µl of DNase I Reaction Buffer, 2.5 µl of DNase I, 22.5 µl of Nuclease-free Water) and incubated for another 30 min under the same conditions. After RNA purification with beads (NEBNext RNA Sample Purification Beads, NEB # E7104S, 2.2X beads), 5 µl of sample in Nuclease-free Water was incubated with 1 µl of random primers for 5 min at 65°C (heated lid at 105°C, hold at 4°C) for priming.
The 6 µl of primed RNA was mixed with 4 µl of NEBNext First Strand Synthesis Reaction Buffer, 8 µl of NEBNext Strand Specificity Reagent, and 2 µl of NEBNext First Strand Synthesis Enzyme Mix to synthesize the first strand cDNA (thermal conditions: heated lid at 80°C, 10 min at 25°C, 15 min at 42°C, 15 min at 70°C, and hold at 4°C). Second strand cDNA synthesis followed, by reaction with 8 µl of NEBNext Second Strand Synthesis Reaction Buffer with dUTP, 4 µl of NEBNext Second Strand Synthesis Enzyme Mix, and 48 µl of Nuclease-free Water for 1 h at 16°C. After cDNA synthesis, a bead purification (1.8X) was performed as before and 50 µl of purified cDNA was eluted. End Prep of cDNA was carried out in a reaction of 50 µl of Second Strand Synthesis Product, 7 µl of NEBNext Ultra II End Prep Reaction Buffer, and 3 µl of NEBNext Ultra II End Prep Enzyme Mix (thermal conditions: heated lid at 75°C, 30 min at 20°C, 30 min at 65°C, and hold at 4°C). Adaptor Ligation was carried out in a reaction of 60 µl of End Prepped DNA, 2.5 µl of 25-fold Diluted Adapter, 1 µl of NEBNext Ligation Enhancer, and 30 µl of NEBNext Ultra II Ligation Master Mix with incubation for 15 min at 20°C, followed by another 15 min incubation at 37°C after adding 3 µl of USER Enzyme. DNA libraries were purified with 0.9X DNA purification beads. Finally, 15 µl of adaptor-ligated DNA, 25 µl of NEBNext Ultra II Q5 Master Mix, 5 µl of Universal PCR Primer/i5 Primer, and 5 µl of Index (X) Primer/i7 Primer (NEBNext Multiplex Oligos for Illumina Set 2, NEB # E7500) were mixed for PCR Enrichment (thermal conditions: initial denaturation at 98°C for 30 s; 16 cycles of denaturation at 98°C for 10 s and annealing/extension at 65°C for 75 s; final extension at 65°C for 5 min). After PCR enrichment, 0.9X beads were used to purify the DNA, which was eluted in 0.1X TE for library quality assessment on an Agilent 2100 Bioanalyzer.
RNA from LJE and BHI samples was treated identically, except that the LJE samples underwent an additional RNA Fragmentation step, in which 5 µl of sample was mixed with 4 µl of NEBNext First Strand Synthesis Reaction Buffer and 1 µl of random primers and then incubated at 94°C for 15 min.
Library quantification was performed on a CFX-96 Touch system (Bio-Rad Laboratories, Hercules, CA, United States) using the NEBNext Library Quant Kit (NEB # E7630), following the NEBNext Library Quant Kit protocol. Libraries were diluted 1:1,000, 1:10,000, and 1:100,000, and 4 µl of template was used in each 20 µl reaction. The thermal conditions were: initial denaturation at 95°C for 1 min; 35 cycles of denaturation at 95°C for 15 s and extension at 63°C for 45 s. Library DNA concentrations were calculated using NEBioCalculator v1.10.0. An iSeq 100 system (Illumina, San Diego, CA, United States) was used to sequence 24-h cocktail cultures of BHI-derived and LJE-derived E. coli O157:H7, L. monocytogenes, and S. enteritidis as a control dataset. All eight libraries were normalized to a concentration of 1 nM, and 5 µl of the library pool was diluted to a loading concentration of 50 pM. PhiX was spiked into the library pool at 10% to increase base diversity. Paired-end reads were generated at 151 cycles each with a 6-nucleotide indexing read.
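The normalization and dilution arithmetic above can be sketched in a few lines. This is an illustrative calculation, not the NEBioCalculator implementation: the function names are hypothetical, the conversion assumes the commonly used average mass of 660 g/mol per base pair of dsDNA, and the dilution assumes a single step from the 1 nM pool.

```python
import math

AVG_BP_MASS_G_PER_MOL = 660.0  # assumed average mass of one dsDNA base pair


def dsdna_ng_per_ul_to_nm(conc_ng_per_ul: float, mean_length_bp: float) -> float:
    """Convert a dsDNA mass concentration (ng/µl) to molarity (nM)."""
    if mean_length_bp <= 0:
        raise ValueError("mean fragment length must be > 0")
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS_G_PER_MOL * mean_length_bp)


def diluent_volume_ul(stock_conc: float, stock_volume_ul: float,
                      target_conc: float) -> float:
    """Diluent to add so the stock reaches target_conc (C1*V1 = C2*V2).

    Both concentrations must be expressed in the same unit.
    """
    if not 0 < target_conc <= stock_conc:
        raise ValueError("target must be > 0 and no higher than the stock")
    final_volume_ul = stock_conc * stock_volume_ul / target_conc
    return final_volume_ul - stock_volume_ul


# 5 µl of a 1 nM (1000 pM) pool diluted to 50 pM, as in the text.
to_add = diluent_volume_ul(1000.0, 5.0, 50.0)
print(f"add {to_add:.0f} µl diluent for a final volume of {to_add + 5.0:.0f} µl")
```

Under the single-step assumption, this is a 20-fold dilution: 95 µl of diluent added to the 5 µl of pooled library.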
Prior to RT-PCR amplicon sequencing, an end repair/dA-tailing step (NEBNext End Repair/dA-tailing Module) was carried out on the RT-PCR products, followed by a ligation step using NEB Ultra II ligation master mix. The Oxford Nanopore Ligation Sequencing Kit SQK-LSK108 and Library Loading Bead Kit EXP-LLB001 were used for library preparation of the RT-PCR amplicons. To validate the Nanopore DNA sequencing results, the RT-PCR products were also sequenced using an Applied Biosystems 3130xl genetic analyzer by following a protocol at NEB. The DNA sequencing data collected by the sequencer were analyzed using EPI2ME and MG-RAST to confirm the identities of the three bacteria. Cocktail culture inactivated with 13.4 mmol/L of sodium hypochlorite was used as the negative control to test whether multiplex RT-PCR amplicon sequencing can eliminate false positive identification. Detailed protocols for RT-PCR are provided in the Supplementary Material.

Data Analysis and Bioinformatics for Oxford Nanopore Sequencing Run
Sequencing reads were base-called using the local base-calling algorithm in the MinKNOW software (v. 1.4.3). All FASTQ files of passed base-called reads were collected and combined into one file for analysis. EPI2ME, MG-RAST, and MEGAN were used for metagenomic and taxonomic analysis.
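Combining the per-batch FASTQ files of passed reads into a single file can be done with a short script. The sketch below is one possible approach, not the authors' tooling: the function name, the `*.fastq` file pattern, and the assumption of strict four-line FASTQ records with no blank lines are all illustrative choices.

```python
from pathlib import Path
from typing import Union

PathLike = Union[str, Path]


def merge_fastq(input_dir: PathLike, output_path: PathLike,
                pattern: str = "*.fastq") -> int:
    """Concatenate FASTQ files from input_dir into output_path.

    Assumes four-line records. Returns the number of records written.
    """
    input_dir, output_path = Path(input_dir), Path(output_path)
    # Collect sources first so an existing output file is never read back in.
    sources = sorted(p for p in input_dir.glob(pattern)
                     if p.resolve() != output_path.resolve())
    n_records = 0
    with output_path.open("w") as out:
        for src in sources:
            with src.open() as handle:
                for i, line in enumerate(handle):
                    if i % 4 == 0:
                        if not line.startswith("@"):
                            raise ValueError(f"{src}: malformed record at line {i + 1}")
                        n_records += 1
                    # Guard against a missing newline at the end of a file.
                    out.write(line if line.endswith("\n") else line + "\n")
    return n_records
```

A call such as `merge_fastq("pass", "combined.fastq")`, with a hypothetical `pass` directory of base-called reads, writes the combined file and returns the read count, which is a convenient sanity check before uploading to an analysis platform.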

Data Analysis and Bioinformatics for iSeq100 Control Run
Sequencing reads were demultiplexed by unique six-nucleotide barcodes using the bcl2fastq program. Sequencing reads were uploaded to Galaxy, and adapter sequences were trimmed using Cutadapt (Galaxy) with a quality cutoff of 20. The program trimmed the adapter sequence in both the 5′ and 3′ orientations. Forward and reverse reads were joined using FASTQjoiner (Galaxy). FASTQ files were converted to FASTA files using the FASTQ to FASTA program (Galaxy). FASTA files were then aligned and mapped on NCBI BLAST using the blastn program against the refseq_rna database. The Common Taxonomic tree and Alignment Summary Viewer tools were used to analyze alignments per bacterial species and to generate phylogenetic trees.
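For readers unfamiliar with the FASTQ to FASTA step, the conversion simply keeps each read's identifier and sequence and drops the quality string. The following is a minimal stand-alone sketch of that idea, not the Galaxy tool itself: the function names are hypothetical, and four-line FASTQ records are assumed.

```python
from typing import Iterable, Iterator, Tuple


def parse_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, sequence, quality) from four-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\r\n")
        if not header:
            continue  # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"expected '@' header, got: {header!r}")
        seq, plus, qual = next(it, None), next(it, None), next(it, None)
        if seq is None or plus is None or qual is None:
            raise ValueError(f"truncated record: {header!r}")
        seq, qual = seq.rstrip("\r\n"), qual.rstrip("\r\n")
        if not plus.startswith("+") or len(seq) != len(qual):
            raise ValueError(f"malformed record: {header!r}")
        yield header[1:], seq, qual


def fastq_to_fasta(lines: Iterable[str], width: int = 60) -> Iterator[str]:
    """Yield FASTA lines (without newlines), wrapping sequences at `width`."""
    for name, seq, _qual in parse_fastq(lines):
        yield ">" + name
        for start in range(0, len(seq), width):
            yield seq[start:start + width]


record = ["@read1 sample\n", "ACGTACGT\n", "+\n", "IIIIIIII\n"]
print("\n".join(fastq_to_fasta(record, width=4)))
```

Because both functions are generators, a file handle can be passed in directly and large files are converted without being loaded into memory.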

AUTHOR'S NOTE
This manuscript has been released as a Pre-Print at bioRxiv (Yang et al., 2019).

AUTHOR CONTRIBUTIONS
MY, AC, XL, DS, SL, TG, and LS provided assistance and conducted the experiments throughout the project. MY, HD, and JL aided in NGS validation. YL, MX, and BZ provided guidance and wrote the manuscript.

FUNDING
The project is partially supported by the U.S. Department of Agriculture (S51600000035794).