Genome-wide analysis of bromodomain gene family in Arabidopsis and rice

The bromodomain-containing proteins (BRD-proteins) belongs to family of ‘epigenetic mark readers’, integral to epigenetic regulation. The BRD-members contain a conserved ‘bromodomain’ (BRD/BRD-fold: interacts with acetylated-lysine in histones), and several additional domains, making them structurally/functionally diverse. Like animals, plants also contain multiple Brd-homologs, however the extent of their diversity and impact of molecular events (genomic duplications, alternative splicing, AS) therein, is relatively less explored. The present genome-wide analysis of Brd-gene families of Arabidopsis thaliana and Oryza sativa showed extensive diversity in structure of genes/proteins, regulatory elements, expression pattern, domains/motifs, and the bromodomain (w.r.t. length, sequence, location) among the Brd-members. Orthology analysis identified thirteen ortholog groups (OGs), three paralog groups (PGs) and four singleton members (STs). While more than 40% Brd-genes were affected by genomic duplication events in both plants, AS-events affected 60% A. thaliana and 41% O. sativa genes. These molecular events affected various regions (promoters, untranslated regions, exons) of different Brd-members with potential impact on expression and/or structure-function characteristics. RNA-Seq data analysis indicated differences in tissue-specificity and stress response of Brd-members. Analysis by RT-qPCR revealed differential abundance and salt stress response of duplicate A. thaliana and O. sativa Brd-genes. Further analysis of AtBrd gene, AtBrdPG1b showed salinity-induced modulation of splicing pattern. Bromodomain (BRD)-region based phylogenetic analysis placed the A. thaliana and O. sativa homologs into clusters/sub-clusters, mostly consistent with ortholog/paralog groups. The bromodomain-region displayed several conserved signatures in key BRD-fold elements (α-helices, loops), along with variations (1-20 sites) and indels among the BRD-duplicates. Homology modeling and superposition identified structural variations in BRD-folds of divergent and duplicate BRD-members, which might affect their interaction with the chromatin histones, and associated functions. The study also showed contribution of various duplication events in Brd-gene family expansion among diverse plants, including several monocot and dicot plant species.


Introduction
Gene expression in higher animals and plants is highly complex and regulated at multiple levels in response to cellular/physiological requirements, and unfavorable environmental conditions (Floris et al., 2009;Haak et al., 2017;Merchante et al., 2017;Withers and Dong, 2017). Abiotic stresses negatively affect plant physiology and growth, leading to substantial loss in productivity. Unfavorable environmental conditions (viz. salinity, drought, heat) induces complex transcriptional programing in plants, leading to stress adaptive responses to mitigate detrimental impact (Kreps et al., 2002;Gao et al., 2008). The transcription status of genes is influenced by both genetic and epigenetic components (Singh, 1998;Kim et al., 2015), and unlike the genetic-elements (core promoter, cis-elements, enhancers, silencers), the epigenetic controls involve non-sequence-based modifications to alter the expression of genes (Gibney and Nolan, 2010;Lämke and Bäurle, 2017;Rendina Gonzaĺez et al., 2018). Epigenetic modifications of DNA and/or histones affect the state of chromatin and transcription activity (Iwasaki and Paszkowski, 2014), leading to the enhanced response potential of the genetic material (Strahl and Allis, 2000;Loidl, 2004). Post-translational modifications (PTMs: acetylation, methylation, phosphorylation, ubiquitylation etc.) affect the characteristics of several cellular proteins including histones, where such modifications modulate the nucleosome dynamics (Bowman and Poirier, 2015). Among these, acetylation of lysine residues plays important roles in protein-protein interactions, nuclear transport, as well as modulation of chromatin state due to impact on positive charge and steric bulk of a nucleosome (Bowman and Poirier, 2015). The cellular epigenetic regulation is based on a system of 'writer', 'reader' and 'eraser' proteins for dynamic management of PTM marks (Musselman et al., 2012;Zhao et al., 2018). For acetylation/deacetylation of lysine residues, lysine acetyltransferases (KATs) and lysine deacetylases/histone deacetylases (KDAC/HDACs) perform 'writer' and 'eraser' functions (Pandey et al., 2002;Musselman et al., 2012), while the conserved bromodomain (BRD/BRD-fold), an important component of several chromatin-associated proteins, serve as 'reader' of lysine acetylation on histones (Drazic et al., 2016).
The bromodomain containing genes (Brd-genes) were first identified in the Drosophila melanogaster (Tamkun et al., 1992), and subsequently reported in diverse eukaryotes (Rao et al., 2014). The~110 amino acid bromodomain (BRD)-region folds into four ahelices (aZ, aA, aB, aC) connected by three loops (ZA, AB, BC) to form a conserved BRD/BRD-fold (a hydrophobic pocket) to recognize acetylated lysine residues in the histones (Marmorstein and Berger, 2001;Bottomley, 2004;Mujtaba et al., 2007;Ferri et al., 2016). A single BRD-domain is capable of recognizing acetylated lysine residues on different histones (Josling et al., 2012). BRDcontaining proteins (alone or as multi-protein complexes) are involved in regulation of gene expression by different mechanisms viz. chromatin remodeling, histone modifications, transcriptional machinery regulation (Florence and Faller, 2001;Fujisawa and Filippakopoulos, 2017). The Brd-gene family, which is a large and diverse family among different organisms, contains a total of 46 Brd-members in humans, divided into eight structurally and functionally distinct groups (Filippakopoulos et al., 2012). Human Brd-members have been studied well for their involvement in chromatin dynamics and diverse cellular functions, and have gained attention as promising drug targets for different disease conditions (Zeng and Zhou, 2002;Sanchez and Zhou, 2009;Taniguchi, 2016;Cochran et al., 2019;Uppal et al., 2019;Boyson et al., 2021).
Epigenetic-changes in chromatin structure and organization, and its impact on expression of genes is equally important for cellular and physiological requirements, and stress responses in plants. Salinity is a complex condition, which along with ionic imbalance-mediated toxicity, also leads to osmotic and oxidative stress, and hence multiple mechanisms including epigenetic regulation are activated (Fransz and De Jong, 2002;Rosa and Shaw, 2013;Kim et al., 2015;Vergara and Gutierrez, 2017;Bhadouriya et al., 2021;Pei et al., 2021;Yung et al., 2021). Like animals, plants also harbor large Brd-gene families, however the extent of divergence of Brd-members, and their functional significance (and role of genomic duplications and alternative splicing events) is relatively less explored. Studies on few Brdhomologs from Arabidopsis and other plants have shown their involvement in functions like seed germination (Duque and Chua, 2003), leaf development (Chua et al., 2005), mitotic cell cycle (Airoldi et al., 2010), sugar and abscisic acid responses , growth and development (Rao et al., 2014;Martel et al., 2017), pathogen perception and immune response (Sukarta et al., 2020;Zhou et al., 2022), and as important subunits of SWI/SNF chromatin remodelers (Jarończyk et al., 2021).
The number of Brd-gene family members varies in different organisms (Rao et al., 2014). An important feature of most plants genomes is genomic duplication events that have contributed towards generation of additional copies of several genes, leading to divergence towards regulatory, structural and functional differences (Flagel and Wendel, 2009;Barker et al., 2012;Qiao et al., 2019). In addition to the diversity of genes in multi-member families, it is also important to identify the conserved orthologs as well as species-specific paralogs to get insights into the evolutionary trends, and lineage-specific events (Altenhoff et al., 2019). Further, a large number of reports have shown the involvement of alternative splicing (AS) mechanism in regulation of expression of genes in diverse conditions, and alteration of key features of the alternative protein isoforms (Reddy et al., 2013). Involvement of both these mechanisms on diversity of Brd-genes among plants has not been explored well, and is worth investigating.
In the present study, we carried out genome-wide analysis of Brd-gene family in model plants A. thaliana (dicot) and O. sativa (monocot), to understand the extent of diversity of genes and proteins, regulatory regions, tissue-and stress-induced expression and splicing dynamics, domains-motifs architecture, and variations in the BRD-fold. Analysis of conserved orthologs, species-specific paralogs and singleton Brd-members, revealed the differential evolutionary trend of Brd-genes in the two species. Result also showed the effect of genome duplication and AS-events on the characteristics of Brd genes, proteins as well as the BRD-fold, in both the species. Moreover, analysis also showed substantial contribution of various duplication events in Brd-gene copy number increase among lower and higher plants, including monocot and dicot species. To our knowledge, this is the first study on analysis of diversity of Brd-homologs of model plants, A. thaliana and O. sativa, which will be useful for further studies on deciphering their functional significance in chromatin dynamics and response to diverse cellular conditions and stress responses.

In silico analysis of characteristics of genes, proteins and transcripts
The structure and organization of AtBrd and OsBrd genes in terms of untranslated regions (UTRs), exons, and introns was analyzed using the Gene Structure Display Server (GSDS, http://gsds.gao-lab.org/, Hu et al., 2015). The important characteristics (molecular weight, MW; isoelectric point, pI etc.) of the AtBRD and OsBRD proteins were estimated using the ProtParam tool on the ExPASy website (http:// web.expasy.org/protparam/). The alternative isoforms of the AtBrd and OsBrd genes were retrieved from the respective databases, and compared with the corresponding constitutive isoforms by pair-wise alignment using ClustalX (Thompson et al., 1997).

Identification of orthologs and paralogs
For the identification of orthologs and paralogs among the A. thaliana and O. sativa Brd-family members, OrthoVenn2 online tool was used (https://orthovenn2.bioinfotoolkits.net/home, Xu et al., 2019). In brief, the full-length sequences of BRD-containing sequences were analysed at OrthoVenn2 portal (E-value, 1e-2; inflation value, 1.5) to identify the shared ortholog groups (OGs, representation of both species), species-specific paralog groups (PGs, representation of one species) and singleton sequences (STs, not part of ortholog/paralog groups).

In silico analysis of conserved domains, functional sites and motifs
Presence of conserved domains, important functional sites in the BRD proteins was analysed at CDD-NCBI and PROSITE (https://prosite.expasy.org/, Sigrist et al., 2012). Domain analysis was carried out using default search parameters and only significant hits were considered for analysis. Conserved motifs were analysed at MEME online Suite (version 5.4.1, http://meme-suite.org/tools/ meme, Bailey et al., 2015) using following parameters; minimum and maximum motif width: 6-50, number of motifs: 15.

2.6
In silico analysis of gene expression using RNA-Seq data The RNA-Seq expression data (as FPKM values, Fragments Per Kilobase of transcript, per Million mapped reads) of respective Brdgenes was retrieved from the Arabidopsis RNA-seq Database (V2, http://ipf.sustech.edu.cn/pub/athrna/) for different tissues (shoot, root, stem, meristem, seedling, embryo, leaf, silique, endosperm, seed, flower and pollen) and two stress conditions (cold and drought). Rice Expression Database (http://expression.ic4r.org/) was used for retrieving data for rice tissues (root, shoot, panicle (3 stages), anther (2 stages), pistil, aleurone and seed) and two stress conditions (drought and cadmium). The gene/locus names were used for search and retrieval of FPKM data. In case of multiple libraries, the average FPKM values were used for analysis. FPKM values were log2-transformed and used for generation of heat mapbased transcript profiles by Heatmap Illustrator software (HemI, version 1.0, Deng et al., 2014).
15-day old seedlings were transferred to MS-media containing 150 mM sodium chloride (NaCl). Seeds of rice genotype NSICRc106 (obtained from International Rice Research Institute, Philippines) were grown in Hoagland media (Himedia, India) in Sanyo MLR-351H plant growth chamber as detailed previously (Sanyal et al., 2018). Six-day-old seedlings were subjected to salt-stress (150 mM NaCl). Tissue samples of both the plants were collected at 24 h timepoint, frozen in liquid nitrogen, and stored at -70°C. Three-five seedlings were pooled for total RNA isolation by TRIzol (Invitrogen, USA), which was assessed for quality and quantity, and treated with DNase I (Roche Diagnostics, Germany) to remove DNA contamination. Total RNA (10 µg) was reverse transcribed using SuperScript II reverse transcriptase (Invitrogen, USA) using anchored oligo(dT)23 and random nonamers (New England Biolabs, USA), as per the protocol recommended by the manufacturer.
Transcript levels of Brd-genes (six AtBrd-gene pairs and five OsBrdgene pairs), and constitutive and alternative splice variants of one AtBrd-gene (AtBrdPG1b) were analyzed by RT-qPCR analysis, using oligonucleotide primers designed utilizing the exon-intron information available from RGAP and TAIR databases (Supplementary Table 1). Briefly, RT-qPCR assays were carried out on LightCycler LC480 II (Roche Diagnostics, Germany) using SYBR Green Jumpstart Taq Ready mix (Sigma-Aldrich, USA) using following cycling settings: 94°C (2 min), 45 cycles of 94°C (15 sec), 60°C (15 sec), 68°C (20 sec), followed by melting curve analysis to assess the amplicon specificity. Three independent replicate sets were used, and analysis was carried out as per Schmittgen and Livak (2008) using AtActin and OselF1a as reference genes. Statistical analysis was carried out by Student's t-test and differences were considered significant only when the P < 0.05.

Homology modelling and comparison
The homology models of BRD-fold of several A. thaliana and O. sativa BRD proteins were generated at SWISS-MODEL workspace (http://swissmodel.expasy.org) using automated mode option, and compared for structural differences. For identification of structural differences due to variations among BRD-folds, the homology models of BRD-folds of duplicate Brd-pairs or divergent Brdmembers were superposed using structure comparison tools at SWISS-MODEL workspace.

Analysis of duplication events among plant genomes
The Brd-gene members from A. thaliana and O. sativa were analyzed for block and tandem duplication events at PLAZA (version 4.5) dicots and monocots databases (https:// bioinformatics.psb.ugent.be/plaza/, Van Bel et al., 2018). InterPro identifier IPR001487 (bromodomain) was used to identify the chromosomal locations of all Brd-genes (including duplicatepairs), and represented using the Circle Plot tool available at PLAZA server. Additional 79 plant genomes including seven lower photosynthetic organisms, 27 monocots and 45 dicots (available at PLAZA monocots and dicots databases) were analyzed for prevalence of different duplication events (block, tandem, combined tandem + block events) leading to multiple Brd-genes.  Figure 1A) and O. sativa (Chr2: 04, Chr3/Chr6/Chr8: 03 each, Chr1/Chr4/Chr7/Chr9: 02 each, Chr10: 01, Chr5/Chr11/ Chr12: nil; Table 2 and Figure 1B). The Brd-genes and encoded proteins in both the species showed extensive diversity in characteristics viz. number of alternative isoforms, molecular weight, isoelectric point (Tables 1, 2).
The OGs showed variable representation of species-specific Brd-members and displayed different configurations relative to AtBRD vs osBRD-members viz. many-to-one (OG1, OG2, OG6, The locus numbers are as per TAIR database (The Arabidopsis Information Resource at https://www.arabidopsis.org/); 2 Simplified designation of the genes as per clustering in different ortholog/paralog groups or singleton category; 3 Ortholog group (OG)/paralog group (PG)/singleton (ST) category association of Brd-members; BD1-6: Block duplication events 1-6; Chr No: Chromosome number; CDS: Coding DNA sequence; Alternatively spliced transcripts with differences in UTR (*) exon (**) or both regions (***) are indicated.

Heterogeneity of AtBRD and OsBRDmembers: Impact of duplication and splicing events
The AtBrd-members showed considerable heterogeneity in length of the gene (1,919 -10,519 bp), coding region (1,110 -6,579 bp) and encoded proteins (369 -2,192 aa) (Table 1), while the OsBrd-members displayed relatively higher variability (gene: 2,119 -16,548 bp; coding region: 717 -6,603 bp; protein: 238 -2,200 aa) ( Table 2), which was attributed to the length and number of exons, introns and 5′-/3′-UTRs. The Brd-members in most OGs/PGs showed similarity in length and exon-intron organization in the  Figure 2). Duplication events resulted in the diversity of the Brd-genes in both A. thaliana and O. sativa. The AtBrd-genes originated due to six block duplication events (BD1-BD6) displayed variations in length and organization of exons, introns and UTRs (Table 1 and Figure 2A). The five duplicated OsBrd-gene pairs due to one tandem and four block events displayed relatively higher heterogeneity than AtBrds (Table 2 and Figure 2B). The Brd-members specific to certain OGs displayed species-specific duplication events viz. AtBrd-genes in OG1, OG2, and OG3 and OsBrds in OG5. Higher heterogeneity in gene structure was observed among the OsBrd paralogs (PG3) than AtBrd paralogs (PG1, PG2) (Figures 2A, B).

Duplication events affected the cis-element diversity and promoter structure
The tandem and block duplications affected the cis-element diversity among duplicated Brd-gene pairs in both the species. For example, Brd-gene pair BD5 AtBrd1a-BD5 AtBrd1b (OG1) differed in cis-elements for light, abiotic and biotic stress (wound, defense), phytohormones (gibberellin, jasmonic acid) and physiological functions (meristem and endosperm-specific expression, circadian control), while BD1 AtBrdPG1a-BD1 AtBrdPG1b (PG1) differed in elements for light, defense, abiotic stress, phytohormones (ethylene, gibberellin), TF-binding and physiological functions ( Figure  Alternative splicing (AS)-mediated changes in domain architecture of BRD isoforms of A. thaliana (A) and O. sativa (B) belonging to certain ortholog groups (OGs), paralog groups (PGs), and singleton category (STs). The members specific to each group are arranged side-by-side for comparison, and the designations 'BD' and 'TD' in the names indicate the block or tandem duplication. Domains/important functional sites among different isoforms (constitutive:.1 and alternative: .2 to .6) are shown by different color codes. Scale on the top indicates the protein length (number of amino acids). copy numbers, and BD1 OsBrdST1 also lacked elements for abiotic and biotic stress. Also, OG5-specific BD3 OsBrd5a-BD3 OsBrd5b displayed differences in elements for light, abiotic and biotic stresses, and certain physiological mechanisms ( Figure 5B and Supplementary Figure 4). The duplications did not affected the length of promoter region of AtBrd-gene pairs, however substantial length differences were evident among most of the duplicate OsBrd-pairs ( Figure 6). In addition, duplicated AtBrd and OsBrd-pairs displayed differences in the arrangement of TFBS (all duplicate pairs), repeat OsBrd2-BD4 OsBrd13), and CpG islands ( BD3 AtBrd2a-BD3 AtBrd2b; BD5 AtBrd1a-BD5 AtBrd1b; All OsBrd-pairs) ( Figure 6). Overall, the duplication event seems to have affected the promoters of the Brdmembers, which might be important for their responsiveness towards diverse intrinsic/extrinsic factors.

Sequence divergence and key conserved sites in bromodomain (BRD) region of BRD-homologs
The bromodomain (BRD) region showed more length variation among OsBRDs (range: 57-133 aa) than AtBRDs (range: 94-133 aa), particularly due to two OsBRDs (OsBRD3a, OsBRDST2), which harbored long N-terminal deletion leading to exceptionally small BRD-regions (Supplementary Figure 5). Also, three-pairs of AtBRDs Expression analysis of Brd-genes: Heatmap-based analysis of RNA-Seq data for tissue-specific and abiotic stress-responsive expression pattern of Brd-genes of A. thaliana (A) and O. sativa (B), belonging to thirteen ortholog groups (OG1-13), three paralog groups (PG1-3), and singleton category (STs). Different tissues and stress conditions are indicated on the top, and names of Brd-genes are shown on the sides, with designations 'BD' and 'TD' indicating block or tandem duplication event. A continuous color gradient scale is indicative of the expression level (blue: low levels; red: high). RT-qPCR analysis of transcript levels of six duplicated AtBrd-gene pairs (C) and five OsBrd-gene pairs (D) in seedling tissues (top panels), and in response to salt stress (NaCl, 150 mM, bottom panels), using reference genes (AtActin; OselF1a). Designations 'BD' and 'TD' indicate block or tandem duplication event. (E) Expression pattern of two isoforms of AtBrdPG1b (constitutive:.1; alternative:.2) in Arabidopsis seedlings (top panel), and in response to salt stress (bottom panel). The analysis was carried out in triplicate, data is represented as mean ± SD, and statistical significance is indicated by *(p <0.05), **(p <0.01), ***(p <0.001), ns (no significant difference).

Duplication events affected the Brdgene numbers among higher plants
Based on the results obtained in A. thaliana and O. sativa, the impact of duplication events was evaluated on Brd-genes among genomes of 79 photosynthetic organisms, including monocots and dicots. Brd-gene copies among four lower organisms ranged from 09-16, while P. abies harboured 28 copies with no evidence of duplications ( Figure 11A). A. trichopoda harbored one tandemduplicate, while P. patens showed four block and eight tandem duplicated Brd-genes ( Figure 11A). Among the monocots, Brd-gene copies ranged from 14 (A. shenzhenica) to 79 (T. aestivum), and except three, all genomes showed different duplication types, a) block events (BD), b) tandem events (TD), c) both tandem and block events (TD, BD), d) tandem and combined events (TD, TD + BD), e) block and combined events (BD, TD + BD), and f) all events ( Figure 11B). Events BD, TD and TD + BD were responsible for higher Brd-genes in several monocots viz. Z. mays, M. acuminata, E. guineensis, M. sinesis, T. turgidum, S. spontaneum and T. aestivum ( Figure 11B). Likewise, dicots also showed several combinations of events (BD, BD and TD, TD + BD) leading to Brd-genes from 20 (B. vulgaris) to 62 (G. max). Events BD, TD and TD + BD were major contributors to higher Brd-genes among D. carota, P. trichocarpa, M. esculanta, C. arietinum, B. rapa, B. oleracea, C. quinoa, P. bretscheneideri, A. chinensis and G. max ( Figure 11C). The duplication events seem to have contributed towards expansion of Brd-gene copies among plants.
Present comparative analysis of A. thaliana and O. sativa Brdhomologs provided insights into diversity of genes/proteins/ regulatory elements, orthologs and paralogs, along with duplication and AS-mediated effects on key BRD-features. Response of Brd-members to salinity is indicative of their involvement in stress-induced epigenetic regulation (Bhadouriya et al., 2021;Yung et al., 2021). Recently, GCN5 Brd-type has been reported to be involved in salt tolerance response in A. thaliana (Yung et al., 2021). The salinity-induced AS-modulation generated AtBrdPG1b alternative isoform (lacks C-ter Ser RR), which might differ in key features affecting its function/interaction (Reddy et al., 2013). As several At-/Os-Brd-homologs are affected by AS (Figure 4), it is important to decipher their functional significance. Furthermore, genomic duplications also contributed towards the Brd-gene family expansion among the plants, and hence understanding its significance in Bromodomain-diversity is important. Recently, three AtBRDs has been identified as subunits of SWI/SNF multi-protein chromatin remodeler, with role in binding of BRM-ATPase to the target genes (Jarończyk et al., 2021). Present study placed these AtBRDs to the OG3 (BD2 duplicates AtBRD3a-AtBRD3b and AtBRD3c), which also suggests similar roles for corresponding O. sativa homologs (OsBRD3a, OsBRD3b). Presence of multiple At-and OsBrdmembers suggests their involvement in diverse cellular functions (as in humans), however, the number and domain diversity of plant homologs was substantially less (Filippakopoulos et al., 2012;Fujisawa and Filippakopoulos, 2017). Further, in both plants, the Brd-members lacked dual/poly BRD architecture like human BRDs (Sanchez and Zhou, 2009;Filippakopoulos et al., 2012), except BD2 OsBRDPG3c that was predicted to harbor an additional BRD- region with high heterogeneity and lack of key BRD-fold elements. Few lower photosynthetic organisms do harbor Brd-members with more than one BRD-region viz. MCO15G409l (M. commoda) and Cre05.g247000BRD (C. reinhardtii).
A notable feature of A. thaliana and O. sativa Brd-members was enhanced diversity due to genomic duplications, important for evolution of multi-gene families among plants (Flagel and Wendel, 2009;Barker et al., 2012;Qiao et al., 2019). In O. sativa, the OsBrdduplicates displayed higher divergence, as well as different outcomes for the tandem duplication (TD) events. While, the TD1 event generated a Brd-copy in a 3-member PG3 group, another event affected the OsBrdST2 (LOC_Os02g09920, domains: BRD-PHD-WHIM1-ZnF) and generated LOC_Os02g09910 encoding protein lacking BRDdomain (Supplementary Figure 7). Maintenance of single-copy Brdmembers (in both species) from OG8-OG13 (GTE1 TF, GTE12 TF, GCN5, BRM, ATPase-family, Figure 1C), and their comparable expression levels (Figure 7) indicates involvement in essential conserved functions (Panchy et al., 2016). On the contrary, speciesspecific duplications enhanced the copies of Brd-genes primarily encoding for TFs of GTE-type (A. thaliana, OG1: GTE8; PG1: GTE3; PG2: GTE2 and O. sativa, PG3: GTE7), BRPF3 (A. thaliana, OG3) and BDF2 (O. sativa, OG5). Among plants, retention of duplicated genes involved in certain functions (transcription regulation, signalling, stress responses) is likely to be associated with gene-dosage imbalance or paralog interference (Panchy et al., 2016). Post-speciation duplication, and post-duplication loss can also lead to differences in copy-number of genes (Altenhoff et al., 2019;Qiao et al., 2019), and possibility of both these mechanisms cannot be ruled out for differences in AtBrd or OsBrd members. Such events may result into divergence of certain gene-copies affecting regulatory, structural and functional characteristics.
Changes in promoter sequence/structure including the CpG islands (initiates dispersed transcription initiation events, Deaton and Bird, 2011) may affect expression dynamics of duplicate AtBrd and OsBrd-genes, which may modulate the relative levels of Brdhomologs affecting the chromatin dynamics during response to metabolic and environmental cues (Lämke and Bäurle, 2017;Ojolo et al., 2018;Zhang et al., 2018;Chang et al., 2020). As epigenetic regulation is integral to plants responses to different stress conditions (Kim et al., 2015;Yung et al., 2021), differential response of certain At-and OsBrds indicates their roles as both positive and negative epigenetic modulators during stress-response. Intriguingly, several At-and OsBrds displayed AS-events, known to enhance transcriptome and/or proteome diversity (Ali et al., 2007;Reddy et al., 2013;Laloum et al., 2018). The Brd-genes with relatively conserved gene/protein organization seems to have Comparative assessment of duplication events affecting Brd-gene copies among different plant genomes: (A) lower photosynthetic organisms, (B) monocots, and (C) dicots, as per analysis at PLAZA database (version 4.5). Types of duplication events are indicated by different grey shades and designations ND (no duplication), TD (tandem duplication), BD (block duplication), and TD + BD (combined tandem and block duplication).
evolved towards different splicing patterns. Further, if the related Brds of both species (including the duplicated Brd-members) showed AS-events, the impact on the transcript and/or protein isoforms was different (Figure 4). While AS-mediated differences in UTRs may affect the stability, translation, localization of transcripts (Mignone et al., 2002), events in exons can alter structuralfunctional characteristics. Higher abundance of AS-isoforms of certain OsBrds ( BD1 OsBrd4a.2, OsBrd4b.2, BD3 OsBrd5a.2) (Supplementary Figure 8) and salinity induced AtBrdPG1b.2 ( Figure 7E) may have some functional importance, which needs further investigation for better insights. It is reported that different duplicated genes in plants may diverge to undergo independent, functionally shared, or accelerated AS-modes (Iñiguez and Hernańdez, 2017). Our analysis shows that A. thaliana and O. sativa Brd-duplicates generates non-shared isoforms, indicative of evolution towards AS-mediated sub-functionalization. In a recent study AS-mediated impact on fate and interaction of two GCN5 isoforms was reported in B. distachyon (Martel et al., 2017). In our analysis, the GCN5 Brd-member in A. thaliana and O. sativa belong to OG10 (AtBRD10, OsBRD10, Figure  Structural variations in the BRD-fold are known to alter its interaction with acetylated lysine on histones, and associated functions of the Brd-proteins (Josling et al., 2012). The At-and OsBRD-members (including duplicates) harbored variations (substitutions at key sites, additional secondary elements, and partial/complete loss of BRD-fold elements), which might affect their interaction capability/affinity with the chromatin. It is therefore important to decipher their structural-functional characteristics vis-à-vis other BRD-members. BRD-region similar to OsBRD3a and OsBRDST2 with characteristic long N-ter deletion (caused loss of aZ-helix, ZA-loop) was not observed among AtBRDs, however an uncharacterized human protein showed similar deletion and loss of elements (Supplementary Figure 6). Interestingly, the At-and OsBrd BRD-fold elements harboured several conserved signatures (e.g., leucine repeat pattern in aZ and sites in ZA-loop) suggesting similar roles in interaction with other helices/loops, as reported in human-BRDs (Filippakopoulos et al., 2012). Plant-specific amino acid variations in aB, aC and ZA loop (particularly among BRD-duplicates) are also likely to affect their interaction with chromatin, and associated functions. The consistency between the BRD-region based relationships, and ortholog-paralog clustering, show its utility in deciphering the divergence of Brd-family in a species, and to overcome issues related to the analysis of such multi-domain proteins (Nakano et al., 2006). Although, the At/OsBRD-homologs lacked dual-BRDs like certain human BRD-homologs (Filippakopoulos et al., 2012), similar domains were identified among different At/OsBRDs, and it would be interesting to find out if they differ in their interaction capabilities (Miller et al., 2016).
The present analysis revealed extensive diversity among important aspects of A. thaliana and O. sativa Brd-members. Functional aspects are likely to be conserved among Brdorthologs maintained as single copy in both species viz. TF GTE1 (OG9), GCN5 (OG10), BRM, ATP-dependent helicase (OG11), GTE12 (OG12), ATPase family-AAA domain (OG13), and an uncharacterized BRD (OG8). In both the species, genomic duplications and alternative splicing have contributed towards the Brd-homolog diversity. Species-specific evolutionary trends were also identified in the two species, like generation of four extensively diverse AtBrds due to two block duplications in OG2 (compared to single OsBrd-member), and unequal number of Brds due to duplication events in species-specific PGs (PG1, PG2 and PG3), most of which are still not completely characterized. The Brd-gene copies were substantially enhanced among several complex photosynthetic organisms with history of duplication events. Overall, the plant Brd-gene family is relatively less studied, however its diversity, impact of duplication and AS-events, domain signatures, suggest involvement in diverse cellular mechanisms, which advocates a thorough analysis for understanding their functional significance.

Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

Author contributions
TVA: analysis of gene/proteins, RNA-Seq data, and sequence divergence; RPS: analysis of transcripts and domain heterogeneity; HSM: data analysis and review, manuscript writing; AS: planning and execution, in silico and experimental analysis, data review, manuscript compilation and communication. All authors contributed to the article and approved the submitted version.

Funding
This work was supported by the institutional funding of Bhabha Atomic Research Centre, Mumbai, Maharashtra, India. No separate funding was obtained from any other National/International funding body for this study.