Papain-Like Cysteine Protease Gene Family in Fig (Ficus carica L.): Genome-Wide Analysis and Expression Patterns

The papain-like cysteine proteases (PLCPs) are the most abundant family of cysteine proteases in plants, with essential roles in biotic/abiotic stress responses, growth and senescence. Papain, bromelain and ficin are widely used in food, medicine and other industries. In this study, 31 PLCP genes (FcPLCPs) were identified in the fig (Ficus carica L.) genome by HMM search and manual screening, and assigned to one of nine subfamilies based on gene structure and conserved motifs. SAG12 and RD21 were the largest subfamilies with 10 and 7 members, respectively. The FcPLCPs ranged from 1,128 to 5,075 bp in length, containing 1–10 introns, and the coding sequence ranged from 624 to 1,518 bp, encoding 207–505 amino acids. Subcellular localization analysis indicated that 24, 2, and 5 PLCP proteins were targeted to the lysosome/vacuole, cytoplasm and extracellular matrix, respectively. Promoter (2,000 bp upstream) analysis of FcPLCPs revealed a high number of plant hormone and low temperature response elements. RNA-seq revealed differential expression of 17 FcPLCPs in the inflorescence and receptacle, and RD21 subfamily members were the major PLCPs expressed in the fruit; 16 and 5 FcPLCPs responded significantly to ethylene and light, respectively. Proteome analyses revealed 18 and 5 PLCPs in the fruit cell soluble proteome and fruit latex, respectively. Ficins were the major PLCP in fig fruit, with decreased abundance in inflorescences, but increased abundance in receptacles of commercial-ripe fruit. FcRD21B/C and FcALP1 were identified by alignment as the genes encoding the main ficin isoforms. Our study provides valuable multi-omics information on the FcPLCP family and lays the foundation for further functional studies.


INTRODUCTION
Cysteine proteases, which contain a cysteine residue at their active catalytic site, catalyze the hydrolysis of peptides and proteins. Cysteine proteases can be divided into 11 clans with different evolutionary routes; different families in the same clan are diverse in sequence and structure (Bah et al., 2006; Rawlings et al., 2016). C1A is the largest family of the CA clan of papain-like cysteine proteases (PLCPs) (Rawlings et al., 2010). Members of the PLCP family in plants, such as papain, chymopapain, caricain (Carica papaya), bromelain (Ananas comosus) and ficin (Ficus carica), have broad substrate specificity and strong proteolytic activity, and the enzymes are of high commercial value in cheese making (Mazri et al., 2018), meat tenderization (Sullivan and Calkins, 2010; Bekhit et al., 2014), beer stabilization, biscuit baking and leather softening, as versatile biocatalysts (Huang et al., 2008) and in making digestive drugs (Zhalehjoo and Mostafaie, 2012).
In plants, PLCPs act as an immunity hub, playing critical roles in plant-pathogen/pest interactions and abiotic stress responses (Shindo and Van Der Hoorn, 2008; Misas-Villamil et al., 2016). PLCPs are first synthesized as inactive precursors with a signal peptide, an N-terminal self-inhibiting predomain, and a mature catalytic domain (Richau et al., 2012; Wang et al., 2014). The mature PLCPs are monomeric proteases consisting of an α-helix domain and a β-sheet domain of similar size, which form an active cleft that specifically binds the substrate. The active catalytic site, the highly conserved catalytic triad Cys-His-Asn, is located at the cleft and is a conserved characteristic of PLCPs (Polaina and MacCabe, 2007). Most PLCPs have small molecular masses, ranging from 20 to 35 kDa, and a few are 50-75 kDa. The optimum pH for catalytic activity is 5.0-8.0 (Dubey et al., 2007). In plants, PLCPs are divided into nine subfamilies according to the propeptide domain and characteristic motifs: RD21A-like, CEP1-like, XCP2-like, XBCP3-like, THI1-like, SAG12-like, RD19A-like, ALP-like, and CTB-like (Richau et al., 2012).
(Raskovic et al., 2016). The latex participates in defense against fungi and insects (Mnif et al., 2015) and has historically been used to treat skin diseases. On the other hand, the proteases in latex damage the skin of fig pickers and workers in the orchard, and in commercial-ripe fig fruit, the latex needs to be drained after harvesting (Flaishman et al., 2008).
PLCPs, dominated by ficin isoforms, are the major protein component of fig latex (Zare et al., 2013; Raskovic et al., 2014). Ficin (EC 3.4.22.3), also known as ficain, is widely found in Ficus species. Our previous study revealed that multiple cysteine proteases, i.e., ficins A, B, C, D, and cysteine protease RD21A, make up a large proportion of the fig fruit's soluble proteome in both commercial-ripe and tree-ripe fruit, but transcripts of ficin isoforms were not identified by RNA-seq (Cui et al., 2019).
Publication of the fig (F. carica) genome (Usai et al., 2020) has provided the necessary information for bioinformatics analyses of the FcPLCP family. In the present study, gene structures, conserved motifs, phylogenetic relationships, and promoter cis-elements of FcPLCPs were analyzed, and gene transcription patterns and protein abundance in fig fruit were revealed. This combined genomic, transcriptomic and proteomic study provides a matrix of information on the PLCP family in fig, laying a sound foundation for the identification of important PLCPs for further studies of biological function.

Identification of PLCP Genes From the Genome of Ficus carica
Genomic data of F. carica and Morus notabilis were downloaded from NCBI; genomic data of Arabidopsis thaliana were downloaded from the TAIR database; and data of Ficus hispida and Ficus microcarpa were downloaded from the Genome Sequence Archive (GSA) and Genome Warehouse (GWH) databases. The gene and coding sequences (CDSs) were extracted from contig-level sequences (BioProject: PRJNA565858) using TBtools, according to gene location information in Usai et al. (2020). The protein sequences were translated based on the CDSs.
The Hidden Markov Model (HMM) file corresponding to the peptidase C1 domain (PF00112) was downloaded from the Pfam database (El-Gebali et al., 2019). HMMER was used with default parameters to search for PLCPs in the F. carica, A. thaliana, M. notabilis, F. hispida, and F. microcarpa genome databases. All candidate proteins were confirmed by Pfam and the conserved domain database (Marchler-Bauer and Bryant, 2004; NCBI CDD).

Multiple Sequence Alignment, and Phylogenetic and Sequence Feature Analyses
The PLCP sequences of F. carica, A. thaliana, M. notabilis, F. hispida, and F. microcarpa were subjected to multiple sequence alignment using ClustalW with default parameters. An unrooted phylogenetic tree based on this alignment was constructed using the neighbor-joining method by MEGA X, with the following parameters: Poisson model, pairwise deletion, 1,000 bootstrap replications. PLCPs were named according to their homology with Arabidopsis.
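The distance underlying the neighbor-joining tree can be illustrated with a short sketch. The Poisson model corrects the observed proportion of differing sites p to d = −ln(1 − p), and pairwise deletion ignores only those alignment columns where either of the two compared sequences has a gap. The function name and the gap characters below are assumptions made for illustration; this is not the MEGA X implementation.

```python
import math


def poisson_distance(seq_a: str, seq_b: str, gap_chars: str = "-?") -> float:
    """Poisson-corrected distance between two aligned protein sequences.

    Pairwise deletion: a column is skipped only if one of the two
    sequences being compared has a gap or missing-data symbol there.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue  # pairwise deletion of this column
        compared += 1
        if a != b:
            differences += 1

    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")

    p = differences / compared  # observed proportion of differing sites
    if p >= 1.0:
        return math.inf  # correction is undefined when every site differs
    return -math.log(1.0 - p)
```

A matrix of such pairwise distances is what a neighbor-joining algorithm takes as input; bootstrap support is then obtained by resampling alignment columns and rebuilding the tree.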
Exon-intron positions were obtained from the genome annotation. The conserved motifs of PLCPs of F. carica were computed by the MEME program (Bailey et al., 2009), with the following parameters: classic mode; site distribution, zero or one occurrence per sequence; number of motifs, 20; width of motifs, between 6 and 50 residues. PROSITE (Sigrist et al., 2012) and NCBI CDD were used for motif analysis. The diagrams were drawn with TBtools.

Chromosomal Location, Gene Duplication, and Promoter Analysis
Papain-like cysteine protease genes were mapped to F. carica chromosomes using TBtools. The Multiple Collinearity Scan toolkit (MCScanX) was used to conduct the syntenic analysis among F. carica, F. hispida, and F. microcarpa (Wang et al., 2012).
An all-against-all BLASTP alignment was run to reveal potential gene duplication. The criteria for duplicated pairs were: (a) length of the aligned sequence covers >75% of the longer gene, and (b) >75% similarity of aligned regions (Vatansever et al., 2016). Corresponding coding regions were aligned using ClustalW. The number of non-synonymous substitutions per non-synonymous site (Ka) and the number of synonymous substitutions per synonymous site (Ks) were calculated by KaKs_Calculator 2.0 (Wang et al., 2010). The gene duplications were dated using the formula T = Ks/2r, where r, the rate of divergence for nuclear genes, was taken to be 7 × 10⁻⁹ synonymous substitutions per site per year according to a previous report in A. thaliana (Ossowski et al., 2010).
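The duplication criteria and the dating formula can be written out as a small sketch. The function and parameter names are hypothetical and chosen only for illustration; the thresholds (75% coverage of the longer gene, 75% similarity) and the rate r = 7 × 10⁻⁹ are those stated above.

```python
SUBSTITUTION_RATE = 7e-9  # synonymous substitutions per site per year (r)


def is_duplicated_pair(aligned_len: int, len_a: int, len_b: int,
                       similarity_pct: float) -> bool:
    """Apply the two criteria for a duplicated gene pair.

    (a) the aligned region covers more than 75% of the longer gene;
    (b) the aligned region shows more than 75% similarity.
    """
    longer = max(len_a, len_b)
    coverage = aligned_len / longer
    return coverage > 0.75 and similarity_pct > 75.0


def divergence_time_mya(ks: float, rate: float = SUBSTITUTION_RATE) -> float:
    """Date a duplication with T = Ks / 2r, returned in million years."""
    if ks < 0 or rate <= 0:
        raise ValueError("Ks must be non-negative and the rate positive")
    years = ks / (2.0 * rate)
    return years / 1e6
```

Under this rate, a Ks of 0.014 corresponds to about 1 million years, so each 0.1 unit of Ks adds roughly 7.1 million years to the estimated age of a duplication.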
Promoter analysis was conducted using PlantCARE (Lescot et al., 2002), based on the 2,000-bp sequence upstream of each gene.

Plant Materials
(Flaishman et al., 2008), we subdivided the fig development process into six stages: stages 1 and 2 belonging to phase I, a rapid growth stage; stages 3 and 4 belonging to phase II, during which fruit size and hardness remain almost unchanged; stages 5 and 6 belonging to phase III, the mature stage, where stage 5 corresponds to commercial ripeness. Inflorescences and receptacles at each stage were separated, marked as F1-F6 and R1-R6, respectively, and stored at −80 °C for further RNA-seq and proteomic analyses.
Fig fruit latex was collected by cutting the stage 5 fruit peel with a scalpel; the latex that flowed out was collected into centrifuge tubes and frozen in liquid nitrogen, then stored at −80 °C for protein identification.

RNA-Seq and Quantitative Real-Time PCR (qRT-PCR) Validation
RNA was isolated from samples using the modified CTAB method (Chai et al., 2014). Library construction and RNA-seq methods were as described previously (Wang et al., 2017). Gene expression was calculated as fragments per kilobase of transcript per million mapped reads (FPKM) and is displayed in heat maps. Significant differential expression was defined by p-adjust < 0.05 and |log2(fold change)| ≥ 1.
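The expression unit and the significance threshold can be sketched as follows. FPKM is computed here from its standard definition, and the differential-expression test applies the two cutoffs given above. The function names are hypothetical; the actual quantification and statistical testing were done with the pipeline described in the cited work, not with this code.

```python
def fpkm(fragments: int, transcript_length_bp: int,
         total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


def is_differentially_expressed(p_adjust: float, log2_fold_change: float,
                                alpha: float = 0.05,
                                min_abs_log2fc: float = 1.0) -> bool:
    """True when p-adjust < 0.05 and |log2(fold change)| >= 1."""
    return p_adjust < alpha and abs(log2_fold_change) >= min_abs_log2fc
```

A |log2(fold change)| of 1 corresponds to a twofold change in either direction, so the filter keeps genes that at least double or halve their expression.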
Transcriptome data of fig fruit after ethylene application (Cui et al., 2020) and under light deprivation were re-mined to explore the changes in PLCP gene expression under these treatments. Briefly, fig fruit in phase II were injected with 1 mL of 250 mg/L ethephon solution through the ostiole, RNA-seq was carried out on the inflorescence and receptacle at 2, 4, and 6 days after treatment, and the transcriptome data were stored at NCBI (SRA accession: PRJNA606407) (Cui et al., 2020). For the light-deprivation treatment, fig fruit in stage 2 were covered with opaque paper bags and the transcriptomes of the inflorescence and receptacle of light-deprived and control fruit were determined in commercial-ripe fruit. The complete dataset can be found in the NCBI SRA database (accession number PRJNA494945).
The expression of eight PLCP genes was validated by qRT-PCR. The PrimeScript™ RT reagent kit (RR037Q, Takara, Dalian, China) was used to reverse transcribe the total RNA. The 18S gene was used for normalization. Primers were designed with Beacon Designer 7 software (Supplementary Table 1). The qPCR was performed with ChamQ Universal SYBR qPCR Master Mix (Q711-02, Vazyme, Nanjing, China). A 15-µL reaction mixture was added to each well. The PCR program was as follows: 95 °C for 30 s, and 40 cycles of 95 °C for 10 s and 60 °C for 30 s. The 2^−ΔΔCt method (Livak and Schmittgen, 2001) was used for relative quantification analysis with three replicates for each sample.
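The relative quantification of Livak and Schmittgen (2001) can be sketched in a few lines. The target Ct is first normalized to the reference gene (18S here) within each sample, and the normalized value is then compared with that of a calibrator sample. The function name and argument layout are assumptions for illustration, and the choice of calibrator sample is not specified in the text.

```python
from statistics import mean
from typing import Sequence


def relative_expression(ct_target_sample: Sequence[float],
                        ct_ref_sample: Sequence[float],
                        ct_target_calibrator: Sequence[float],
                        ct_ref_calibrator: Sequence[float]) -> float:
    """Relative expression by the 2^-ddCt method.

    Each argument holds the replicate Ct values (three per sample in
    this study) for the target or reference gene.
    """
    # Delta Ct: target normalized to the reference gene within a sample
    delta_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    delta_calibrator = mean(ct_target_calibrator) - mean(ct_ref_calibrator)

    # Delta-delta Ct: sample relative to the calibrator
    delta_delta = delta_sample - delta_calibrator
    return 2.0 ** (-delta_delta)
```

A target that reaches threshold two cycles earlier than in the calibrator, at equal reference Ct, gives a fourfold relative expression. The method assumes amplification efficiencies close to 100% for both genes.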

Quantitative Proteomic Analysis
Stage 1, 3, and 5 inflorescences and receptacles were used for proteomic analysis. Protein extraction and quantitative analysis were as described in Cui et al. (2019). Three biological replicates were performed for each sample. The digested peptides were labeled with the TMT10plex™ Isobaric Label Reagent Set (Thermo Scientific) according to the manufacturer's instructions. A Q-Exactive mass spectrometer (Thermo Fisher Scientific, Waltham, MA, United States) was used to detect peptide signals. The MS scans were run as described in our previous publication (Cui et al., 2019). The MS results were input into PD software (Proteome Discoverer 1.4, Thermo) to screen the spectra. The selected peptides were identified using Mascot (version 2.3.01) and annotated according to the UniProt database. The peptides were then quantified by PD software based on their annotation and spectrum. ANOVA was performed to evaluate the significance of the differences. Proteins with a p-value less than 0.05 and a fold change ≥ 1.2 or ≤ 0.83 were considered differentially abundant proteins (DAPs). The mass spectrometry proteomics data of inflorescences and receptacles were deposited in the ProteomeXchange Consortium via the PRIDE (Perez-Riverol et al., 2019) partner repository with the dataset identifier PXD025170.
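The DAP criterion can be expressed as a simple classification rule. The sketch below uses a hypothetical function name and assumes that the fold change is the ratio of abundance between two compared stages and that the p-value has already been obtained from the ANOVA.

```python
def classify_dap(p_value: float, fold_change: float,
                 alpha: float = 0.05,
                 up_cutoff: float = 1.2,
                 down_cutoff: float = 0.83) -> str:
    """Label a protein as 'increased', 'decreased' or 'unchanged'.

    A protein is differentially abundant when p < 0.05 and the fold
    change is >= 1.2 (increased) or <= 0.83 (decreased).
    """
    if p_value >= alpha:
        return "unchanged"
    if fold_change >= up_cutoff:
        return "increased"
    if fold_change <= down_cutoff:
        return "decreased"
    return "unchanged"
```

The two cutoffs are approximately reciprocal (1/1.2 ≈ 0.83), so increases and decreases of the same relative size are treated symmetrically.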
The tris-phenol method (Xie et al., 2009) was used to extract latex protein. A 2-D Quant kit was used to quantify the protein, and 20 µg protein was applied to SDS-PAGE. The latex protein was excised from the gel and cut into 10 pieces, digested with trypsin diluted in NH4HCO3 solution, and the peptides obtained from the digestion were separated by multidimensional liquid chromatography (Dionex Ultimate 3000 nano-LC system), then detected and analyzed by tandem MS (Thermo Fisher Q-Exactive). Mascot (version 2.3.01), MaxQuant (version 1.5.2.8), Thermo Scientific Proteome Discoverer (version 1.3/1.4), and the UniProt database were used for protein identification and quantification. The mass spectrometry proteomics data of latex were deposited in the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD025485.

Ka/Ks Analysis
Genome replication, segmental duplication, and tandem duplication are considered the main evolutionary forces. We identified six duplicated gene pairs in the fig PLCP family (Table 1). The Ka/Ks ratios of FcRD21A/FcRD21B, FcRD21A/FcRD21C, FcRD21B/FcRD21C, and FcSAG12F/FcSAG12G were <1, suggesting negative selection on these four pairs. The Ka/Ks ratios of FcRD19A/FcRD19B and FcRD19B/FcRD19C were >1, suggesting positive selection for these two pairs. In general, negative selection (Ka/Ks < 1) eliminates deleterious mutations, retaining the original function of the protein. Positive selection (Ka/Ks > 1) changes the protein and is usually related to the coevolution of immune system genes and parasites (Hurst, 2002). Duplication of these six PLCP gene pairs was estimated to have occurred between 1.76 and 59.98 million years ago.
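The interpretation of the ratio described above amounts to a three-way rule, sketched here with a hypothetical function name. The values passed in would come from a tool such as KaKs_Calculator; the code does not estimate Ka or Ks itself.

```python
def selection_mode(ka: float, ks: float) -> str:
    """Interpret a Ka/Ks ratio for a duplicated gene pair.

    ratio < 1: negative (purifying) selection
    ratio > 1: positive selection
    ratio = 1: neutral evolution
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    ratio = ka / ks
    if ratio < 1.0:
        return "negative"
    if ratio > 1.0:
        return "positive"
    return "neutral"
```

Ratios very close to 1 are best read with caution, since the estimate carries sampling error and a simple cutoff does not test whether the departure from neutrality is significant.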

Gene Structure and Sequence Features
Gene structure analysis showed that the PLCP genes of F. carica, A. thaliana, and M. notabilis differ in intron numbers, whereas the intron numbers within a subfamily were conserved (Supplementary Figure 1). Subfamilies RD21 and RD19 featured 2-5 introns and 2-4 introns, respectively. The other subfamilies had 1-10 introns. The full lengths of the FcPLCPs were from 1,128 to 5,075 bp with CDSs ranging from 624 to 1,518 bp. Motifs 2, 3, 4, 6, 8, 9, 10, 11, 13, 14, and 19 were identified as belonging to the peptidase C1 domain (PF00112). Motifs 2, 10, and 8 had Asn, Cys, and His catalytic sites, respectively. Motif 10 of some SAG12 subfamily members had a double Cys in the catalytic site. The distribution of the specific conserved motifs varied among subfamilies. In RD19, ALP, and CTB subfamilies, motif 16 replaced motifs 11 and 6. In the RD21 and XBCP3 subfamilies, motifs 18 and 15 were usually located at the C terminus (Figure 3).
Motif 15 appeared in two members of the RD21 subfamily and two members of the XBCP3 subfamily (Figure 3). It was identified as a granulin domain (PF00396). In plants, the granulin motif has been found at the C terminus of some cysteine proteases whose expression is upregulated under environmental stress (Bateman and Bennett, 2009). Not every XBCP3 and RD21 subfamily member carried the granulin motif, suggesting that the granulin polymorphism evolved by domain loss (Richau et al., 2012).
All CEP subfamily FcPLCPs had the KDEL sequence at the C terminus for retention in the endoplasmic reticulum. Than et al. (2004) found that removing the N-terminal propeptide and C-terminal KDEL sequence under acidic conditions results in ricinosome maturation. We also found that the N terminus of FcALP1 carries an NPIR signal, known as a vacuolar-targeting sequence. Fig PLCPs are 207 to 505 amino acids in length, and have a molecular mass of 23.09-56.19 kDa and pI of 3.88 to 9.14 (Supplementary Table 2). The GRAVY index of FcPLCPs ranged from −0.59 to −0.14, indicating that they are hydrophilic; 19 FcPLCPs, 31 AtPLCPs, and 20 MnPLCPs were predicted to carry N-terminal signal peptides (Figure 3). In several PLCP sequences annotated in the present study, neither signal peptides nor propeptides were identified. This could be due to the assembly quality of the fig genome used (Usai et al., 2020, 74-fold coverage of the cv. Dottato haploid genome) and the possible existence of pseudogenes. Most FcPLCPs were predicted to be soluble proteins. Subcellular localization analysis showed that 24, 5, and 2 FcPLCPs were located in the lysosome/vacuole, extracellular matrix and cytoplasm, respectively. All 32 AtPLCPs were predicted to be localized to the lysosome/vacuole. MnPLCPs were located in the lysosome/vacuole (21 PLCPs), extracellular matrix (1 PLCP), plastid (1 PLCP), and cytoplasm (1 PLCP) (Supplementary Table 3).
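The GRAVY (grand average of hydropathy) index reported above is, by its standard definition, the mean Kyte-Doolittle hydropathy value over all residues of a protein; negative values indicate a hydrophilic protein. The text does not state which tool produced the values, so the following is only an illustrative sketch with a hypothetical function name.

```python
# Kyte-Doolittle hydropathy scale for the 20 standard amino acids
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(protein: str) -> float:
    """Grand average of hydropathy: mean hydropathy over all residues."""
    residues = protein.upper().rstrip("*")  # drop a trailing stop symbol
    if not residues:
        raise ValueError("empty protein sequence")
    try:
        total = sum(KYTE_DOOLITTLE[aa] for aa in residues)
    except KeyError as err:
        raise ValueError(f"non-standard residue: {err.args[0]}") from None
    return total / len(residues)
```

Applied to a translated FcPLCP sequence, a result between −0.59 and −0.14 would fall in the range reported here.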

Promoter Analysis
All FcPLCP promoters contained at least one putative biotic/abiotic stress response element; 25 PLCPs contained an abscisic acid response element, 22 contained a methyl jasmonate response element, 19 contained a salicylic acid response element, 17 contained a low-temperature response element, 16 contained an auxin response element, 14 had a gibberellin response element, 16 had a defense and stress response element and 14 had a drought response element (Figure 4), suggesting that FcPLCPs are involved in stress responses. Moreover, all of the FcPLCP promoters contained light response elements, and 29 of them contained anaerobic induction elements, indicating that their expression is regulated by light and oxygen (Supplementary Table 4).

Expression Pattern of PLCPs in Fig Fruit Development
Seventeen PLCPs were identified in the inflorescence and receptacle transcriptomes of the six fig fruit developmental stages; no tissue-specific PLCP was found in the inflorescence or receptacle. The RD21 subfamily made up the largest proportion of detected expressed PLCPs (Figure 5).

Change in Abundance of PLCP Proteins During Fig Fruit Development
Protein identification and quantification were performed at three developmental stages (stages 1, 3, and 5) for inflorescences and receptacles. Eighteen PLCPs were annotated, and 13 of them were identified as DAPs. Their fold changes are shown in Table 3 and Supplementary Table 5.

PLCP Proteins in Fig Fruit Latex
Seventy-four proteins were identified in the latex, with molecular masses ranging from 8.9 to 206.1 kDa and pI values of 4.84 to 10.78. Latex contained many stress-response proteins, including trypsin-like protease inhibitor, chitinase, endochitinase, and pathogenesis-related (PR) protein isoforms, indicating that it plays an essential role in resistance to insects and microbes. PLCPs were the most abundant protein component in fig latex at the commercial-ripe stage, accounting for 38.93% of the total protein content. The identified PLCPs included ficin 4 (14.93%), ficin 1B (8.63%), ficin 1A (8.04%), W9RY43 (5.82%, FcRD21G), and ficin D (1.50%) (Table 4 and Supplementary Table 6).

Expression Pattern of PLCPs in Fig Fruit Treated With Ethephon and Light Deprivation
Phase II fig fruit were treated with 1 mL of 250 mg/L ethephon (Cui et al., 2020). The RNA-seq data for the inflorescence and receptacle at 0, 2, 4, and 6 days after treatment were analyzed. Sixteen PLCPs showed differential expression (Figure 6). Eleven PLCPs were upregulated in the inflorescence or receptacle; among them, c39535_g2 (FcRD19D), c43917_g2
Detailed information can be found in Supplementary Table 6. PSMs, peptide-spectrum matches.
The RNA-seq data for the inflorescences and receptacles of ripe fruit revealed 3 and 4 differentially expressed PLCPs, respectively: all of them were downregulated under light deprivation (Table 5). Genes c47131_g1 (FcRD21A, FcRD21B, and FcRD21C) and c8892_g1 (FcRD21E) were repressed in both the inflorescence and receptacle, whereas c56852_g1 (FcSAG12F, FcSAG12G) was only downregulated in the inflorescence, and c39535_g2 (FcRD19D) and c13373_g1 (FcXBCP3A) were only downregulated in the receptacle.
The genus has about 700 species, most of them evergreens, including trees, shrubs and climbers growing under different climatic conditions (Flaishman et al., 2008; Zhang et al., 2020). In three sequenced Ficus species genomes, 31, 35 and 34 PLCP genes were identified (F. carica, F. hispida and F. microcarpa, respectively). The number of PLCPs varied among species and subfamilies, possibly due to whole-genome duplication, tandem duplication, and large-scale segmental duplication (Liu et al., 2018). In eukaryotic species, gene duplications are estimated to occur at an average rate of 0.01 per gene per million years (Lynch and Conery, 2000).

PLCPs in Ficus Species
PLCPs of the three Ficus species differed in the structure of their subfamilies. In F. carica, SAG12 (10 members) and RD21 (7) were the two largest subfamilies, whereas in F. hispida, SAG12 (10) and RD19 (10) were the two largest; the RD21 subfamily had 5 PLCPs, and no THI subfamily members were identified, possibly being lost during evolution: the THI subfamily has been reported as lost in poplar (Zou et al., 2017). In F. microcarpa, RD21 (11) and SAG12 (10) were the two largest subfamilies. The three Ficus species all had 10 SAG12 subfamily members, whereas in mulberry and Arabidopsis, SAG12 had 9 and 6 members, respectively. It is speculated that SAG12 PLCPs may have formed before the formation of the individual species. SAG12 PLCPs are senescence-associated genes and their expression increases in senescing leaves (James et al., 2018).
To further analyze the PLCP family's phylogenetic mechanism, we constructed comparative syntenic maps of F. carica associated with F. hispida and F. microcarpa. The three Ficus species' genomes showed high homology (Figure 7). A total of 16 and 20 FcPLCPs showed syntenic relationships with those in F. hispida and F. microcarpa, respectively (Supplementary Table 7), indicating that F. carica may have a closer evolutionary relationship with F. microcarpa than with F. hispida. Some syntenic gene pairs were conserved in the three Ficus species, including some SAG12 and RD21 subfamily members, indicating that their expansion occurred before the three species' divergence. A chromosome fusion or fission event occurred between F. microcarpa chr3 and its homologs, chr3 and chr7 of F. carica and F. hispida, respectively. FcRD19A, FcXBCP3C, and FcRD21F on chr7 were identified as syntenic genes. F. carica chr2, F. microcarpa chr2 and 7, and F. hispida chr2 and 14 may also have undergone fusion or fission events. Several inversions occurred in the three genomes' chromosomal fragments, calling for further study.

PLCP Expression Patterns
Ficus is characterized by unique aggregate fruit that develop from an enclosed urn-shaped inflorescence; the receptacle serves as a physical barrier, protecting the enclosed inflorescence and small drupelets from disease and insects. A few recent studies have revealed that the inflorescence and surrounding receptacle of figs differ with respect to ripening (Freiman et al., 2015), and their response to gibberellin (Chai et al., 2018, 2019) and ethephon treatment (Cui et al., 2020).
In the present study, among the 31 sequences recruited as PLCP-encoding genes from the published fig genome, 17 PLCP transcripts and 18 PLCP proteins were identified, and their spatiotemporal expression/abundance pattern in the inflorescence and receptacle was revealed by transcriptome and proteome analysis, respectively. Limited by the fact that only fig fruit material was used in our study, we are not in a position to suggest which remaining sequences are transcribed/translated in other tissues or, in other words, are pseudogenes. The present results emphasize the divergent roles of the inflorescence and receptacle in fig fruit reproductive biology. The spatial expression pattern of PLCPs has been reported in Arabidopsis (Richau et al., 2012), rubber tree (Zou et al., 2017), cotton (Zhang et al., 2019), castor bean and physic nut (Zou et al., 2018).
Pollination of the Ficus inflorescence relies on species-specific wasps (Blastophaga psenes L.) (Flaishman et al., 2008; Zhang et al., 2020). (Stover et al., 2007). Transcriptome analyses of the inflorescence demonstrated high CEP1 levels in the young fruit. AtCEP1 has been reported to be essential in tapetum programmed cell death and pollen development (Liu et al., 2018). The relatively high expression of members of the FcRD21 and FcRD19 subfamilies, both involved in resistance to biotic stress, suggests that tissue-specific PLCP expression could play a role in supporting fig inflorescences' control of the biological risk associated with the wasp pollinator entering the syconium at the early stage of fruit development.
Among the 18 PLCPs identified in the fig's soluble proteome, 6 were identified as DAPs in the inflorescence, all of them ficins. Significantly decreased abundance (p < 0.05, fold change ≤ 0.83) was found for ficin 1C, ficin 2C, ficin 4, ficin 5, ficin 6A and ficin isoform D during inflorescence development. When the fig fruit ripens, the decreased abundance of ficin and other PLCPs in the inflorescence, the major edible part of fig fruit, may indicate a decreased requirement for antibacterial agents, thereby facilitating the feeding of dispersing organisms.
A large number of PLCPs, especially ficins, are toxic to herbivorous insects (Konno et al., 2004). The biotic risk faced by the fig receptacle is different from that faced by the inflorescence, the latter being well protected from most insects by the receptacle and scales covering the ostiole. PLCPs of subfamilies RD21, RD19, ALP1, and CTB2 showed high transcription levels in the receptacle. In papaya fruit, subfamily III PLCP genes, including the papain gene CpXCP5, are expressed at high levels in stage I fruit; the papain and other major PLCPs in papaya latex provide defense against herbivorous insects as the papaya fruit develops (Liu et al., 2018). In our study, only one putative FcXCP1 transcript was identified in fig fruit and it showed basal expression in the receptacle, whereas one RD21 member (c47131_g1), previously shown to have roles in plant immunity and resistance to necrotrophic fungal pathogens and arthropod crustaceans (Rustgi et al., 2018), exhibited extremely high expression in the receptacle before fig fruit ripening.
Our fig fruit receptacle proteomic data also supported a significant increase in abundance of ficin 4, ficin 1A, ficin 1C, ficin 6A, ficin isoform D, CTB2 and XBCP3C from mid-stage fig development to near commercial ripeness. Latex collected from the receptacle of commercial-ripe figs was rich in ficin 4, ficin 1A, ficin 1B and ficin isoform D. The major PLCP components in the receptacle latex were similar to those found in the receptacle. The presence of ficins in the latex confirms their role in plant resistance to microbes and herbivores (Kitajima et al., 2018). High transcription and protein abundance of the major PLCPs in the fig receptacle in the commercial-ripe fruit suggest strong and persistent PLCP protection of the receptacle against biotic stresses.
Moreover, PLCP expression has been reported to be modulated by plant hormones and environmental stimuli. Ethephon is regularly applied to rubber trees to increase the yield of rubber latex (Zhu and Zhang, 2009). In our study, most of the PLCPs were upregulated following ethephon application. The common occurrence of light-responsive elements in PLCP promoters, together with a comparison of the transcriptomes of light-deprived and naturally grown fig fruit, strongly suggests that light is a positive signal in the expression of most PLCPs. In support of this, a study with smyrna-type fig cultivars found the highest protease activity in the late afternoon after long light exposure (Lazreg-Aref et al., 2018). Recently, different patterns of change in protease activity have been reported among fig types and cultivars. In the common-type cultivar Kahli and the San Pedro-type cultivar Bither Abiadh, protease activity decreases with maturity, whereas in the smyrna-type cultivars Njali and Temri, and the caprifig cv. Besbessi, protease activity increases with maturity (Lazreg-Aref et al., 2018). This shows that in addition to the stage of development, cultivar may also be an important factor affecting PLCP expression in fig fruit, warranting further study.

CONCLUSION
In this study, the PLCP family was analyzed in fruit of the edible fig F. carica at the level of gene structure, sequence characteristics, promoter cis-elements, expression patterns and proteomics. Species-specific PLCP subfamily duplication was revealed, which could be relevant to the uniqueness of the edible fig: it is the only deciduous species of Ficus and, as one of the earliest domesticated fruit trees, has been under long selection pressure by humans. High expression of disease- and herbivore-resistance/repelling PLCP genes and a high abundance of ficins in the inflorescence,

DATA AVAILABILITY STATEMENT
The datasets generated for this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: NCBI, accession numbers: PRJNA606407, PRJNA494945, PRJNA723733, and ProteomeXchange consortium, via PRIDE partner repository: PXD025170 and PXD025485.

AUTHOR CONTRIBUTIONS
YZ, YC, and MS conducted the experiments and data analyses. YZ, AV, SC, and HM prepared the manuscript. All authors have read and approved the manuscript for publication.