Complete Genome Sequence of Lactobacillus casei LC5, a Potential Probiotics for Atopic Dermatitis


BACKGROUND
Probiotics are living microorganisms that confer health benefits on the host (1). Probiotics have been used for the treatment or prevention of various conditions related to diarrhea (2), cholesterol (3), immune function (4), and inflammatory bowel disease (5). In addition, a recent study reports that probiotic bacteria of the genera Bifidobacterium and Lactobacillus can have therapeutic effects in patients with psychological disorders, such as depression, anxiety, and memory-related disorders (6).
Lactobacillus casei is a Gram-positive bacterium that naturally inhabits the gastrointestinal tract and mouth of humans and animals (7). As its name implies, this heterofermentative microorganism is the dominant species present in ripening cheddar cheese (8). As a probiotic, L. casei has shown beneficial roles in the activation of the gut mucosal immune system (9) and in the treatment of diabetes (10) and chronic constipation (11). In a previous study, we isolated L. casei strain LC5 from fermented dairy products; it showed immune regulatory functions, in particular a therapeutic effect on atopic dermatitis as a member of a complex probiotic preparation (12–14).
To gain better insight into the probiotic effect on atopic dermatitis, we analyzed the genome sequence of L. casei LC5. According to NCBI Genome (footnote 1), more than 200 Lactobacillus organisms have been sequenced, and beneficial properties inferred from their genomic information are used in the food industry. However, the number of available genomes of L. casei strains used as health-promoting probiotics is still insufficient. Furthermore, L. casei strains are frequently confused with closely related species such as Lactobacillus paracasei and Lactobacillus rhamnosus. Therefore, a comparative study at the whole-genome scale is required to clarify the taxonomic position of L. casei LC5 as well as its functional characteristics. The genomic information of L. casei LC5 will serve as a basis for further in-depth analysis of the probiotic function of L. casei strains.

Bacterial Strains and DNA Preparation
Lactobacillus casei LC5 was isolated from fermented dairy products and is used commercially as a probiotic in Korea (15). L. casei LC5 was cultured aerobically in MRS medium (Difco, USA) at 37°C for 18 h. Genomic DNA was extracted and purified using a QIAamp DNA Mini Kit (Qiagen, Germany). The concentration of the genomic DNA was measured with a NanoDrop 2000 UV-vis spectrophotometer (Thermo Scientific, USA) and a Qubit 2.0 fluorometer (Life Technologies, USA).

Genome Sequencing, Assembly, and Annotation
Whole-genome sequencing of L. casei LC5 was carried out on the PacBio RS II platform. A 20 kb DNA library was constructed according to the manufacturer's instructions and sequenced using single-molecule real-time (SMRT) sequencing technology with the P6 DNA polymerase and C4 chemistry. A total of 138,180 subreads (1.04 Gb) were obtained, giving 400-fold coverage. The average subread length was 7,550 bp and the N50 was 10,940 bp. Genome assembly was performed using HGAP 3.0 (16) with default options. Annotation was carried out with the NCBI Prokaryotic Genome Annotation Pipeline (17) through the NCBI Genome submission portal (GenomeSubmit at http://ncbi.nlm.nih.gov). The chromosome map was drawn using DNAPlotter (18). Clusters of orthologous groups (COG) categories were assigned to the coding genes using BLASTP (e-value: 1e−3) against the COG database (19).
The evolutionary history was inferred using the maximum likelihood method based on the Tamura-Nei model (20). All positions containing gaps and missing data were eliminated, leaving a total of 1,521 positions in the final dataset. These phylogenetic analyses were conducted in MEGA6 (21).
To compute genomic distances, we first computed orthologous average nucleotide identity (OrthoANI) values using the Orthologous Average Nucleotide Identity Tool (22). The OrthoANI values were converted to distance values using the following formula: distance = 1 − (OrthoANI/100). A phylogenetic tree was then inferred from these distances using the neighbor-joining method in MEGA6 (21). The tree is drawn to scale, with branch lengths in the same units as the evolutionary distances used to infer it. A pan-genome analysis using Panseq (23) was performed to investigate genomic conservation and to find novel regions in the sequenced genome.
Footnote 2: http://www.ncbi.nlm.nih.gov/genome/
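Two of the calculations mentioned in this section are simple enough to illustrate directly. The Python sketch below computes an N50 from a list of read lengths and converts OrthoANI percentages into distances with the formula distance = 1 − (OrthoANI/100). The function names are hypothetical and the code is not part of the pipeline used for LC5; the actual analyses were run with the cited tools.

```python
def n50(lengths):
    """Return the N50 of a collection of read (or contig) lengths.

    The N50 is the length L such that sequences of length >= L
    together contain at least half of all bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


def orthoani_to_distance(ani_percent):
    """Convert an OrthoANI value (in percent) to a distance in [0, 1]."""
    if not 0 <= ani_percent <= 100:
        raise ValueError("OrthoANI must be a percentage between 0 and 100")
    return 1 - ani_percent / 100


def distance_matrix(ani_matrix):
    """Apply the conversion to every cell of a square OrthoANI matrix."""
    return [[orthoani_to_distance(value) for value in row] for row in ani_matrix]
```

A matrix produced by `distance_matrix` has zeros on the diagonal (a genome compared with itself has an OrthoANI of 100) and could then be passed to any neighbor-joining implementation.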

Genome Characteristics of L. casei LC5
We obtained a complete genome sequence of L. casei LC5 using SMRT sequencing. The genome consists of a single chromosome with no extrachromosomal sequences. The total size of the genome is 3,132,867 bp and its GC content is 47.9%. A total of 2,925 genes were detected in the genome sequence, comprising 2,817 protein-coding sequences (CDSs), 31 pseudogenes, and 77 RNA genes (15 rRNAs, 59 tRNAs, and 3 non-coding RNAs). No repeat regions or CRISPR arrays were identified. Genomic features of L. casei LC5 are shown in Figure 1A.
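Summary figures of this kind can be checked with a few lines of code. The sketch below shows one way to compute GC content from a nucleotide string and confirms that the feature counts reported above add up to the total gene count. The function name `gc_content` is an illustrative assumption; this is not the annotation pipeline itself.

```python
def gc_content(sequence):
    """Return GC content in percent, ignoring ambiguous bases such as N."""
    seq = sequence.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    gc = seq.count("G") + seq.count("C")
    return 100 * gc / acgt


# Feature counts as reported for L. casei LC5 in the text above.
reported_features = {
    "CDS": 2817,
    "pseudogene": 31,
    "rRNA": 15,
    "tRNA": 59,
    "ncRNA": 3,
}
total_genes = sum(reported_features.values())
rna_genes = sum(reported_features[k] for k in ("rRNA", "tRNA", "ncRNA"))
```

Here `total_genes` evaluates to 2,925 and `rna_genes` to 77, matching the totals given in the text.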
Although L. casei LC5 was identified as a strain of L. casei, its genomic features differ from those of other published L. casei strains. According to the summary of 37 L. casei genomes deposited in NCBI Assembly, the median genome length is 3.01993 Mb, the median number of coding genes is 2,712, and the median GC content is 46.4%. Interestingly, these genomes can be split into two groups by GC content: a high-GC group (47.7–47.9%) and a low-GC group (46.2–46.6%). Five genomes (ATCC 393, N87, 867_LCAS, Lbs2, and JCM 1134) and L. casei LC5 belong to the high-GC group, and the other genomes belong to the low-GC group (Table 1).
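The split into two groups can be written as a simple rule on GC content. The sketch below uses the two ranges reported above; the function name `gc_group` and the choice to return `None` for values outside both ranges are assumptions made for illustration.

```python
# GC content ranges (percent) of the two groups, as reported in the text.
HIGH_GC_RANGE = (47.7, 47.9)
LOW_GC_RANGE = (46.2, 46.6)


def gc_group(gc_percent):
    """Assign a genome to the high-GC or low-GC group by its GC content.

    Values are rounded to one decimal place, the precision used in the text.
    Returns None when the value falls outside both reported ranges.
    """
    gc = round(gc_percent, 1)
    if HIGH_GC_RANGE[0] <= gc <= HIGH_GC_RANGE[1]:
        return "high-GC"
    if LOW_GC_RANGE[0] <= gc <= LOW_GC_RANGE[1]:
        return "low-GC"
    return None
```

With this rule, the LC5 genome (47.9% GC) falls in the high-GC group, while the median L. casei genome (46.4% GC) falls in the low-GC group.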

Comparative Study of L. casei Group
Comparative analysis of both 16S rRNA genes and whole-genome sequences revealed that the genome closest to L. casei LC5 was L. casei ATCC 393 and the second closest was Lactobacillus zeae DSM 20178. These three genomes, which were clearly distinguishable from the others in the comparative analysis, belong to the high-GC group described in the section above. The phylogenetic distances based on the 16S rRNA gene were below 0.001 within the high-GC group, whereas the distances between the high-GC group and the low-GC group were above 0.003 (Figure 1C). This was also supported by the whole-genome comparison: average nucleotide identity (ANI) values within the high-GC group were above 94%, whereas ANI values between the two groups were below 80% (Figure 1D). All L. casei and L. paracasei strains belonging to the low-GC group showed high genomic similarity (98% or higher).

Functional Classification
Functional classification based on COG assigned 2,334 CDSs to 1,309 COGs. Comparing functional categories against 19 L. casei group genomes, we found that L. casei LC5 contains a high number of proteins associated with "carbohydrate transport and metabolism (G)" (376 proteins) and "transcription (K)" (239 proteins), when the two uninformative categories "general function prediction only (R)" and "function unknown (S)" are excluded (Figure 1B). L. casei LC5 has at least 36 more proteins than the other genomes in category G and at least 8 more proteins than the other genomes in category K. This gene expansion in the two functional categories is not found in the other members of the high-GC group. Although the genomes in the high-GC group are highly similar to one another, they do not, as a group, contain more proteins in categories G and K than the genomes in the low-GC group. Moreover, L. casei ATCC 393, the genome most similar to LC5, has fewer proteins than average in those categories (223 proteins in category G and 192 proteins in category K).
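The per-category comparison described above amounts to counting proteins per COG functional category. A minimal sketch follows, assuming each protein has already been assigned a string of one or more category letters. This input format, the function name, and the example protein identifiers are hypothetical, and the BLASTP-based assignment itself is not reproduced here.

```python
from collections import Counter

# Categories excluded from the comparison in the text.
UNINFORMATIVE = frozenset({"R", "S"})


def count_cog_categories(assignments, exclude=UNINFORMATIVE):
    """Count proteins per COG category letter.

    `assignments` maps a protein identifier to a string of category
    letters (for example "G" or "GK"). A protein assigned to several
    categories is counted once in each of them.
    """
    counts = Counter()
    for letters in assignments.values():
        for letter in set(letters):
            if letter not in exclude:
                counts[letter] += 1
    return counts


# Toy input for illustration only; not taken from the LC5 annotation.
example = {
    "prot_1": "G",
    "prot_2": "K",
    "prot_3": "GK",
    "prot_4": "S",
    "prot_5": "G",
}
example_counts = count_cog_categories(example)
```

Running the same function on each genome gives one count per category and genome, which can then be compared across the group.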
In a previous study, the probiotic strain LC5, isolated from a Korean fermented dairy product, showed a marked therapeutic effect on atopic dermatitis. Here, we report a genomic overview and distinguishing gene features of LC5 based on comparative genomic analysis of 20 related strains. The genomic data presented in this report will broaden our knowledge about roles and mechanisms