ORIGINAL RESEARCH article

Front. Genet., 30 March 2023

Sec. Genetics of Common and Rare Diseases

Volume 14 - 2023 | https://doi.org/10.3389/fgene.2023.1114774

Frequencies of variants in genes associated with dyslipidemias identified in Costa Rican genomes

  • 1. Centro de Investigación en Biología Celular y Molecular, University of Costa Rica, San José, Costa Rica

  • 2. Escuela de Biología, University of Costa Rica, San José, Costa Rica

Abstract

Dyslipidemias are risk factors in diseases of significant importance to public health, such as atherosclerosis, a condition that contributes to the development of cardiovascular disease. Unhealthy lifestyles, the pre-existence of diseases, and the accumulation of genetic variants in some loci contribute to the development of dyslipidemia. The genetic causality behind these diseases has been studied primarily in populations with extensive European ancestry. Only some studies have explored this topic in Costa Rica, and none have focused on identifying variants that can alter blood lipid levels and quantifying their frequency. To fill this gap, this study focused on identifying variants in 69 genes involved in lipid metabolism using genomes from two studies in Costa Rica. We contrasted the allelic frequencies with those of groups reported in the 1000 Genomes Project and gnomAD and identified potential variants that could influence the development of dyslipidemias. In total, we detected 2,600 variants in the evaluated regions. After various filtering steps, we obtained 18 variants with the potential to alter the function of 16 genes; nine variants with pharmacogenomic or protective implications; eight variants classified as high risk by Variant Effect Predictor; and eight variants found in other Latin American genetic studies of lipid alterations and the development of dyslipidemia. Some of these variants have been linked to changes in blood lipid levels in other global studies and databases. In future studies, we propose to confirm at least 40 variants of interest from 23 genes in a larger cohort from Costa Rica and Latin American populations to determine their relevance regarding the genetic burden for dyslipidemia. Additionally, more complex studies should arise that include diverse clinical, environmental, and genetic data from patients and controls, as well as functional validation of the variants.

1 Introduction

Dyslipidemias are a group of conditions characterized by abnormal lipid levels. High lipid profiles include hyperlipidemias or hyperlipoproteinemia. These are worldwide diseases affecting many people. In Latin American cities such as Barquisimeto, Lima, and Bogotá, this condition has been recorded in >70% of men and >50% of women (Vinueza et al., 2010). Costa Rica is no exception. In a study conducted in the 2000s involving 107,000 inhabitants of San José, it was reported that 36% of men and 22% of women had hypercholesterolemia, while 48% of men and 52% of women reported hypertriglyceridemia (). These conditions have been closely linked to the development of complex ailments such as cardiovascular diseases and acute pancreatitis (; Pretis et al., 2018; Paredes et al., 2019), making hyperlipidemia a public health problem in the 21st century.

A sedentary lifestyle and poor eating habits can profoundly impact the development of these diseases (). The clinical approach to these cases usually includes the implementation of exercise regimens and caloric restriction. Additionally, multiple pieces of evidence have shown that the genetic characteristics of an individual play a leading role in the development of hyperlipidemias (Johansen et al., 2011; ; Wierzbicki and Reynolds, 2019). Currently, the diseases are considered mostly polygenic. However, variants in genes such as the lipoprotein lipase (LPL), the low-density lipoprotein receptor (LDLR), and apolipoprotein B (APOB) tend to have more marked effects than other genes involved in lipid metabolism (Johansen et al., 2011, 2014; Lewis et al., 2015; , ).

Most of the studies aimed at identifying the effect of the genetic component on the presence of alterations in lipid metabolism and the development of dyslipidemia have been performed mainly in Anglo-Saxon and European countries. The study by Andaleon et al. (2019) on Latin American populations is one of the most exhaustive of this kind in this region, including Central Americans. However, little is currently known in Latin American populations about the genetic variants and frequencies in genes previously linked to these conditions in other global studies.

Particularly in Costa Rica, few studies on this matter have been published. In one study from the Dietary Fat and Heart Disease in Costa Rica project (also known as the Costa Rica Heart Study), researchers quantified the allelic frequencies of specific variants in the APOC, LPL, APOE, PCSK9, FADS1-2-3, and USF1 genes in 4,000 individuals from the Costa Rican Central Valley. They reported an association of some of these variants with an increased risk of coronary heart disease and hyperlipidemia (; ; Yang et al., 2004; Ruiz-Narváez et al., 2005, 2008; ; ; Yu et al., 2017). Two other research projects have focused on identifying genetic variants in regions of interest, such as the LPL gene and the APOCII promoter region, in a group of 38 Costa Ricans with hypertriglyceridemia (; ).

Here, we used data from 258 whole genomes from the Central Valley of Costa Rica to identify genetic variants in genes linked to the incidence of dyslipidemia and estimate their allelic frequencies as a proxy of genetic burden. This is the first national portrait of the frequency of previously reported risk variants in genes associated with this group of diseases obtained from genomic data. Additionally, we report the allelic frequencies of variants in genes of interest previously identified in Costa Ricans (i.e., LDLR and APOCII) and Latin American populations. The information generated in this study will help guide and contextualize future studies on dyslipidemia in Costa Rica and the region; possible next steps include validation of 40 variants of interest in a larger population and determining the impact of these findings on the national healthcare system. Moreover, this study reflects the importance of studies that include clinical, environmental, and genetic data from patients and controls.

2 Materials and methods

2.1 Samples and genomic data

We used anonymized whole genome sequence (WGS) data from two collections. One is the PSYCH-CV repository, a collection of Costa Rican WGS from the NIMH-funded (National Institute of Mental Health) study U01MH105630-04S1, which included subjects with mania and psychosis and their relatives recruited under different studies and anonymized in the WGS data repository (). From these families, we selected only unrelated individuals without a mental disorder diagnosis, for a total of 23 individuals. Sequencing was carried out on the Illumina HiSeq 2000 with paired-end reads. The data had a minimum coverage of 30x and a read length of 100 bp. The data were previously aligned with the BWA-MEM tool of the BWA v0.7.15 package against the GRCh38 reference genome and stored in CRAM format.

The second data set was from the project The Genetic Epidemiology of Asthma in Costa Rica (dbGAP phs000988.V4.P1). Unrelated individuals without an asthma diagnosis were selected using the dbGAP metadata. In total, 234 subjects met these criteria (called dbGAP-CV, Supplementary Table S1), and their CRAM files were downloaded from the database. The genomes from both collections were combined into a single group of 258 subjects called CR-WGS for the variant annotation.

2.2 Variant discovery and genotype

The analysis was limited to all coordinates corresponding to the transcriptome according to the GFF3 of Ensembl 106 for the GRCh38 genome, including miRNAs and lncRNAs. We call these regions the exome. Additionally, we extracted two sets of ancestry informative markers (AIMs) reported by Campos-Sánchez et al. (2013) and by Galanter et al. (2012). Each coordinate interval was extended by 300 bp upstream and downstream (Table 1).

TABLE 1

Use in the study | Identifier | Source of coordinates | Source of identifiers
Quality control analysis | RNA coding regions from Ensembl Release 106 | |
Variant training set for GATK, 'Variant Quality Score Recalibration' (VQSR) | RNA coding regions from Ensembl Release 106 | |
Ancestry estimates based on Costa Rican studies | 78 variants from dbSNP | dbSNP variants: Ensembl Genes 106 database, GRCh38.p13; genome coordinates extracted from BioMart |
Ancestry estimates compared to American groups from 1KGP phase 3 | 446 variants from dbSNP | dbSNP variants: Ensembl Genes 106 database, GRCh38.p13; genome coordinates extracted from BioMart |
Exonic variants in genes involved in lipid metabolism and dyslipidemias | ABCA1, ABCG1, ABCG4, ABCG5, ABCG8, ABHD5, ANGPTL3, APOA1, APOA2, APOA4, APOA5, APOB, APOC1, APOC2, APOC3, APOC4, APOD, APOE, APOF, APOH, APOL1, APOL2, APOL3, APOL4, APOL5, APOL6, APOM, APOO, CD36, CELSR2, CETP, CILP2, CREB3L3, CYP26A1, FADS1, FADS2, FADS3, GALNT2, GCKR, GPD1, GPIHBP1, HMGCR, KLHL8, LCAT, LDLR, LDLRAP1, LIPA, LIPC, LIPE, LIPG, LMF1, LPL, LRP1, MLXIPL, MTTP, MYLIP, NCAN, NPC1L1, PCSK9, PLA2G7, PLIN1, PLTP, PNPLA2, PPARA, SCARB1, SORT1, STAP1, TRIB1, USF1 | Genetic symbols: Ensembl Genes 106 database, GRCh38.p13; genome coordinates extracted from BioMart | Plaisier et al. (2009), Nakayama et al. (2010), Johansen et al. (2011, 2014), Johansen and Hegele (2011), Vasquez-Vidal (2014), Lewis et al. (2015), Sarraju and Knowles (2019)

Genomic coordinates selected for variant calling.
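The 300 bp extension of each coordinate interval can be illustrated with a minimal Python sketch. The function name `pad_and_merge`, the tuple layout, and the decision to merge intervals that overlap after padding are assumptions made for illustration; the study does not state how overlapping padded intervals were handled, and the sketch does not clip intervals at chromosome ends.

```python
from typing import List, Tuple

# (chromosome, start, end) with 1-based inclusive coordinates (an assumed layout)
Interval = Tuple[str, int, int]


def pad_and_merge(intervals: List[Interval], pad: int = 300) -> List[Interval]:
    """Extend each interval by `pad` bp on both sides, then merge overlaps.

    Merging is an illustrative choice, not a step described in the study.
    """
    padded = sorted(
        (chrom, max(1, start - pad), end + pad) for chrom, start, end in intervals
    )
    merged: List[Interval] = []
    for chrom, start, end in padded:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2] + 1:
            # Overlapping or adjacent to the previous interval: extend it.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


if __name__ == "__main__":
    toy = [("1", 1000, 1200), ("1", 1700, 1900), ("2", 100, 200)]
    print(pad_and_merge(toy))
```

Merging avoids calling variants twice in regions where two padded transcripts overlap; a pipeline that passes intervals directly to a caller that deduplicates them could skip that step.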

As a quality control measure on the reads, duplicate reads were first removed using the MarkDuplicates tool, which is part of the GATK package. Next, to adjust for observed systematic errors caused by the sequencer, the GATK machine learning model called Base Quality Score Recalibrator was implemented using the BaseRecalibrator and ApplyBQSR commands.

We used HaplotypeCaller, GenomicsDBImport, GenotypeGVCFs, and MergeVcfs to call indel-like and SNV-like variants. During this process, GRCh38/hg38 was selected as the reference genome and the dbSNP Build 151 variant database was used as the reference source for known variants.
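As a rough illustration of how the calling steps chain together, the following Python sketch only assembles GATK4 command lines; it does not run them. All file names are hypothetical placeholders, and the use of GVCF mode (`-ERC GVCF`) is an assumption inferred from the use of GenomicsDBImport, not a setting reported in the study.

```python
from typing import List


def haplotypecaller_cmd(ref: str, cram: str, intervals: str,
                        dbsnp: str, out_gvcf: str) -> List[str]:
    """Per-sample calling restricted to the target intervals (GVCF mode assumed)."""
    return ["gatk", "HaplotypeCaller", "-R", ref, "-I", cram, "-L", intervals,
            "--dbsnp", dbsnp, "-ERC", "GVCF", "-O", out_gvcf]


def genomicsdbimport_cmd(gvcfs: List[str], intervals: str,
                         workspace: str) -> List[str]:
    """Consolidate the per-sample GVCFs into one GenomicsDB workspace."""
    cmd = ["gatk", "GenomicsDBImport",
           "--genomicsdb-workspace-path", workspace, "-L", intervals]
    for gvcf in gvcfs:
        cmd += ["-V", gvcf]
    return cmd


def genotypegvcfs_cmd(ref: str, workspace: str, out_vcf: str) -> List[str]:
    """Joint genotyping across all samples stored in the workspace."""
    return ["gatk", "GenotypeGVCFs", "-R", ref,
            "-V", f"gendb://{workspace}", "-O", out_vcf]


if __name__ == "__main__":
    # Hypothetical paths, shown only to illustrate the order of the steps.
    print(" ".join(haplotypecaller_cmd("GRCh38.fa", "s1.cram", "targets.bed",
                                       "dbsnp151.vcf.gz", "s1.g.vcf.gz")))
    print(" ".join(genomicsdbimport_cmd(["s1.g.vcf.gz", "s2.g.vcf.gz"],
                                        "targets.bed", "cohort_db")))
    print(" ".join(genotypegvcfs_cmd("GRCh38.fa", "cohort_db", "cohort.vcf.gz")))
```

Each list could be passed to `subprocess.run(cmd, check=True)` in a workflow manager; keeping command construction separate from execution makes the steps easy to inspect and log.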

As a quality check on the identified variants, an error score referred to as VQSLOD was calculated for the identified variants using GATK’s machine learning model, Variant Quality Score Recalibrator (VQSR). To do this, metrics obtained for each variant are fed to the VQSR model, including variant depth, strand bias, and quality of the variant assigned in the previous stage, along with lists of variants with different degrees of confidence (). The evaluation of variant calling errors was performed for indels and SNVs separately.

The databases supplied to the VQSR model are stored in GATK’s repository “Resource bundle” “genomics-public-data”, except for the dbSNP v151 database, which was extracted from the FTP site of the National Center for Biotechnology Information of the United States (NCBI). To calculate the error score for indels, those highly validated in the Mills and 1000 Genomes gold standard data set (Mills et al., 2006) were considered true variants. The training data were the genotypes from the first phase of the 1000 Genomes Project (1KGP) study obtained with the Axiom Exome Plus chip. The dbSNP v151 database was also supplied to the model, but it was considered a database with a lower degree of validation.

To calculate error scores for SNVs, we considered true variants as those found in the HapMap database phase 3 release 3, part of the International HapMap Project (). The training databases were defined as the panel of phase 3 1KGP genotyped with the OMNI 2.5 chip and the database of genotypes with a high confidence level from phase 1 of 1KGP. Finally, the dbSNP database was the reference source for known variants. Using ApplyVQSR, we excluded from further analysis the variants whose VQSLOD fell below the cutoff corresponding to the 97.5% level for SNV-like variants and the 95% level for indel-like variants. This bioinformatics pipeline is summarized in Figure 1.

FIGURE 1

2.3 Evaluation of bioinformatics processing

Using the GATK CollectVariantCallingMetrics tool, we calculated the transition vs. transversion ratio (Ti/Tv) and the heterozygous vs. homozygous alternative allele ratio (Het/non-ref Hom), metrics commonly used to describe the quality of the variant calling process. These metrics were obtained separately for each chromosome and at the exome level. The values obtained were compared between both Costa Rican cohorts using a t-test.
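The two quality metrics are simple ratios, and a minimal Python sketch may make their definitions concrete. This is not how CollectVariantCallingMetrics is implemented; the function names and input layouts are hypothetical, and counting a genotype with two different non-reference alleles as heterozygous is an assumption of this sketch.

```python
from typing import Iterable, List, Tuple

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """A transition swaps purine for purine or pyrimidine for pyrimidine."""
    pair = {ref.upper(), alt.upper()}
    return pair <= PURINES or pair <= PYRIMIDINES


def titv_ratio(snvs: Iterable[Tuple[str, str]]) -> float:
    """Ti/Tv over (ref, alt) pairs; returns NaN when there are no transversions."""
    pairs: List[Tuple[str, str]] = [(r, a) for r, a in snvs if r != a]
    ti = sum(1 for r, a in pairs if is_transition(r, a))
    tv = len(pairs) - ti
    return ti / tv if tv else float("nan")


def het_hom_ratio(genotypes: Iterable[Tuple[int, int]]) -> float:
    """Het/non-ref Hom over allele-index pairs (0 = reference allele)."""
    calls = list(genotypes)
    het = sum(1 for a, b in calls if a != b)
    hom_alt = sum(1 for a, b in calls if a == b and a > 0)
    return het / hom_alt if hom_alt else float("nan")


if __name__ == "__main__":
    print(titv_ratio([("A", "G"), ("C", "T"), ("A", "C"), ("G", "A")]))
    print(het_hom_ratio([(0, 1), (0, 1), (1, 1), (0, 0), (1, 2)]))
```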

Additionally, to evaluate the concordance between the allele frequencies, a linear model was generated to contrast the frequencies previously reported in the Costa Rica Heart Study publications and those obtained for CR-WGS (; Ruiz-Narváez et al., 2005; Ruiz-Narvaez et al., 2010).
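A concordance check of this kind can be sketched with an ordinary least-squares fit and a Pearson correlation. The Python example below is an illustration only: the function name is hypothetical, the frequencies in the usage block are invented toy values rather than data from either study, and the exact model specification used by the authors is not given in the text.

```python
from math import sqrt
from typing import Sequence, Tuple


def linear_fit(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Return (slope, intercept, Pearson r) for y regressed on x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain more than one distinct value")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


if __name__ == "__main__":
    # Toy allele frequencies (hypothetical): previously reported vs. newly observed.
    reported = [0.05, 0.12, 0.30, 0.45, 0.60]
    observed = [0.06, 0.11, 0.31, 0.44, 0.61]
    slope, intercept, r = linear_fit(reported, observed)
    print(f"slope={slope:.3f} intercept={intercept:.3f} r={r:.3f}")
```

A slope near 1, an intercept near 0, and r close to 1 would together indicate that the two sets of frequencies agree, which is a stronger statement than a high correlation alone.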

2.4 Genetic ancestry analysis

To determine if the subjects included in both Costa Rican cohorts present an ancestry profile that fits within the pattern observed in other Latin American populations, we used the genotypes of 446 AIMs (Ancestry Informative Markers) described by Galanter et al. (2012), and the ancestral populations from 1KGP panel (European-EUR, African-AFR, and East Asian-EAS) (; Sudmant et al., 2015). We used the EAS group as a proxy of Native American ancestry since most of the ancestry of Native Americans comes from the East Asian population (Wang et al., 2019), given the scarcity of genomic data for this population group. Subjects from Barbados (ACB) and subjects with African ancestry from the South West of USA (ASW) were not considered members of AFR, nor were Utahns (CEUs) part of the EUR group since they are Americans. The CLM (Colombia), MXL (Mexico), PEL (Peru), and PUR (Puerto Rico) groups were considered Latin American.

The genotypes of the 446 AIMs were downloaded for 200 randomly selected individuals for each ancestral group (AFR, EUR, EAS) and all available samples for ACB, ASW, CEU, CLM, MXL, PEL, and PUR individuals. Genotypes were extracted for both Costa Rican cohorts, which were integrated with the 1KGP dataset. Principal component analysis (PCA) was performed using the number of alternative alleles by AIM. Only AIMs without missing genotypes were included. We estimated the similarity relationships between American populations and AFR, EUR, and EAS using the allelic frequencies in the TreeMix v1.13 program (Pickrell and Pritchard, 2012).
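The PCA input described above, the number of alternative alleles per AIM with incomplete markers dropped, can be sketched as follows. This Python example covers only the encoding and filtering steps, not the PCA itself; the nested-dictionary layout, the function names, and the choice to count any non-reference allele as "alternative" are assumptions for illustration.

```python
from typing import Dict, List, Optional, Tuple


def alt_allele_count(gt: str) -> Optional[int]:
    """Convert a VCF-style genotype to a count: '0/1' -> 1, '1|1' -> 2, './.' -> None."""
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return None
    return sum(1 for allele in alleles if allele != "0")


def complete_aim_matrix(
    genotypes: Dict[str, Dict[str, str]], aims: List[str]
) -> Tuple[List[str], Dict[str, List[int]]]:
    """Keep only AIMs genotyped in every sample and return per-sample count rows.

    `genotypes[sample][aim]` holds a genotype string; absent entries count as missing.
    """
    counts = {
        sample: {aim: alt_allele_count(calls.get(aim, "./.")) for aim in aims}
        for sample, calls in genotypes.items()
    }
    kept = [aim for aim in aims if all(counts[s][aim] is not None for s in counts)]
    matrix = {sample: [counts[sample][aim] for aim in kept] for sample in counts}
    return kept, matrix


if __name__ == "__main__":
    toy = {
        "S1": {"aim1": "0/1", "aim2": "1/1", "aim3": "0/0"},
        "S2": {"aim1": "0|0", "aim2": "./.", "aim3": "1|1"},
    }
    print(complete_aim_matrix(toy, ["aim1", "aim2", "aim3"]))
```

The resulting rows (one per subject, one column per retained AIM) are the kind of numeric matrix that a PCA routine expects after centering.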

To assess whether the ancestry of both Costa Rican cohorts was consistent with the profile previously reported for subjects from the Costa Rican Central Valley, we performed a genetic admixture analysis with STRUCTURE v2.3.4 (Hubisz et al., 2009) using 78 AIMs described by Campos-Sánchez et al. (2013). We used the same ancestral groups as before (AFR, EUR, EAS). We integrated the genotypes for these AIMs in both Costa Rican cohorts with those reported for Costa Rican groups from the North Region (2013-NR), South Region (2013-SR), the Caribbean Region (2013-CR), and the Central Valley (2013-CV) (). The integrated database contained 1,067 individuals for the analysis in STRUCTURE (Hubisz et al., 2009). The run parameters were: ‘Length of Burnin Period’, the number of iterations used to reduce the effect of the initial configuration, set to 50,000; ‘Number of MCMC Reps after Burnin’, the number of iterations of the model used to obtain accurate estimates, set to 100,000; genetically admixed individuals; groups allowed to have correlated allele frequencies; and EUR, AFR, and EAS as the ancestral groups. With these parameters, we performed ten simulations assuming that the population had three ancestral groups. These results were merged using CLUMPP and DISTRUCT through the CLUMPAK tool (Rosenberg, 2004; Jakobsson and Rosenberg, 2007; Kopelman et al., 2015). Three plots were generated: a genetic structure plot, a ternary plot of genetic admixture, and a principal component analysis (PCA) using the number of alternative alleles per variant. Only AIMs with complete genotypes were included. A Kruskal-Wallis test was applied to determine ancestry similarities among Costa Rican and Latin American populations. From there, we built 95% confidence intervals with Tukey correction to identify specific differences between pairs of populations.

2.5 Annotation of variants

We studied the variants identified within a set of 69 genes that have a key role in lipid metabolism or that contain variants that have been associated with changes in blood lipid levels (Table 1). We annotated the variants found in the regions of interest with information hosted in Ensembl release 109 using its REST API v15.5 (). Pathogenicity predictions, phenotypic associations, and population genetics information were extracted for each variant.

The variant type was determined using Variant Effect Predictor (VEP) v7 (). In silico predictions of pathogenicity for missense variants were generated using the traditional tools PolyPhen2 and SIFT () and two more recently developed tools, ClinPred and REVEL (Ioannidis et al., 2016; ; ). Phenotypic association annotations were obtained through the Ensembl REST API, which uses the ClinVar and NHGRI-EBI GWAS Catalog databases (Landrum et al., 2017; ).
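For readers unfamiliar with the Ensembl REST API, the following Python sketch shows one way a variant could be queried by rsID through the public `vep/:species/id/:id` endpoint and reduced to its impact categories. It is an assumption-laden illustration, not the authors' annotation code: the helper names are hypothetical, the public server returns annotations from the current Ensembl release rather than release 109, and the record in the usage block is a hand-written mock, not a real API response.

```python
import json
from typing import Any, Dict, List
from urllib.request import urlopen

ENSEMBL_REST = "https://rest.ensembl.org"


def vep_url(rsid: str, species: str = "human") -> str:
    """Build the VEP-by-identifier URL, requesting a JSON response."""
    return f"{ENSEMBL_REST}/vep/{species}/id/{rsid}?content-type=application/json"


def fetch_vep(rsid: str, timeout: float = 30.0) -> List[Dict[str, Any]]:
    """Query the REST server (network access required; not called in the demo)."""
    with urlopen(vep_url(rsid), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize_vep(record: Dict[str, Any]) -> Dict[str, Any]:
    """Collect the most severe consequence and the distinct impact labels."""
    impacts = sorted(
        {
            tc["impact"]
            for tc in record.get("transcript_consequences", [])
            if tc.get("impact")
        }
    )
    return {
        "id": record.get("id"),
        "most_severe_consequence": record.get("most_severe_consequence"),
        "impacts": impacts,
    }


if __name__ == "__main__":
    # Hand-written mock record (hypothetical), shaped like one element of a VEP reply.
    mock = {
        "id": "rs328",
        "most_severe_consequence": "stop_gained",
        "transcript_consequences": [
            {"gene_symbol": "LPL", "impact": "HIGH"},
            {"gene_symbol": "LPL", "impact": "MODIFIER"},
        ],
    }
    print(summarize_vep(mock))
```

Because one variant can overlap several transcripts, the summary keeps every distinct impact label rather than a single value.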

To contrast the population frequencies of the variants found in the CR-WGS group with those reported in extensively characterized populations, we collected the frequencies for the 1KGP EAS, EUR, AFR, and AMR groups and for all 1KGP subjects (ALL). Fisher’s exact tests were performed to determine which of the variants found have a different allelic frequency in the group of Costa Rican genomes compared to the 1KGP populations. A significance level of 0.05 adjusted with the Bonferroni correction was used as the threshold to determine whether the frequency differed between the two populations.
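The comparison of allele counts between two groups can be sketched with a two-sided Fisher's exact test on a 2x2 table followed by a Bonferroni-adjusted threshold. The Python code below is a minimal standard-library illustration under stated assumptions: the function names are hypothetical, the two-sided p-value sums all tables no more probable than the observed one (a common convention), and the allele counts in the usage block are invented toy values.

```python
from math import exp, lgamma
from typing import List, Sequence


def _log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided p-value for the table [[a, b], [c, d]].

    Rows are groups (e.g., two populations); columns are alt and ref allele counts.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def log_p(x: int) -> float:
        return _log_comb(row1, x) + _log_comb(row2, col1 - x) - _log_comb(total, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    observed = log_p(a)
    # Sum the probabilities of all tables at most as likely as the observed one.
    p_value = sum(
        exp(lp)
        for lp in (log_p(x) for x in range(lowest, highest + 1))
        if lp <= observed + 1e-7
    )
    return min(1.0, p_value)


def bonferroni_significant(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Flag p-values below alpha divided by the number of tests."""
    threshold = alpha / len(p_values) if p_values else alpha
    return [p < threshold for p in p_values]


if __name__ == "__main__":
    # Toy allele counts (hypothetical): (alt, ref) in group 1 and group 2.
    tables = [(37, 477, 463, 4545), (5, 509, 40, 4968)]
    p_values = [fisher_exact_two_sided(*t) for t in tables]
    print(p_values, bonferroni_significant(p_values))
```

Working in log space with `lgamma` keeps the computation stable for the thousands of alleles found in reference panels, where direct factorials would overflow.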

2.6 Identification and characterization of variants of interest

The study considered a polymorphic site as a variant of interest if (1) it was a risk variant according to three or more sources of functional annotation or if (2) the variant was previously reported in Costa Rica or Latin America within the context of metabolism of lipids and dyslipidemias. This produced two lists of variants of interest: one consisted of risk variants annotated by bioinformatic predictions found in the genes from Table 1, and the other includes the variants that have been reported in Costa Ricans and Latin Americans in the genes of interest in the context of lipid metabolism or dyslipidemia.

The risk variants determined by bioinformatic predictions had more than one count in CR-WGS and met at least three of the following criteria: (1) being categorized by PolyPhen2 as possibly damaging (P) or probably damaging (D), (2) being categorized by SIFT as deleterious by having a score below 0.05, (3) having a REVEL score greater than 0.5 (REVEL combines 13 predictive tools), (4) having a ClinPred score greater than 0.5, or (5) having a phenotype reported by ClinVar or the NHGRI-EBI GWAS Catalog that was related to lipid metabolism or an increased risk of developing dyslipidemia. The pharmacogenomic variants were identified from ClinVar and the NHGRI-EBI GWAS Catalog and annotated with PharmGKB (www.pharmgkb.org).
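The selection rule above amounts to counting how many of five criteria a variant satisfies. A minimal Python sketch follows; the dictionary keys (`polyphen`, `sift_score`, `revel`, `clinpred`, `lipid_phenotype`, `alt_count`) are hypothetical names chosen for illustration, and the example variants are invented rather than taken from the study's results.

```python
from typing import Any, Dict


def count_risk_criteria(variant: Dict[str, Any]) -> int:
    """Count how many of the five functional-annotation criteria are met."""
    sift = variant.get("sift_score")
    revel = variant.get("revel")
    clinpred = variant.get("clinpred")
    criteria = [
        variant.get("polyphen") in {"P", "D"},        # possibly/probably damaging
        sift is not None and sift < 0.05,             # SIFT deleterious
        revel is not None and revel > 0.5,            # REVEL score above 0.5
        clinpred is not None and clinpred > 0.5,      # ClinPred score above 0.5
        bool(variant.get("lipid_phenotype")),         # ClinVar/GWAS lipid phenotype
    ]
    return sum(criteria)


def is_risk_variant(variant: Dict[str, Any], min_criteria: int = 3,
                    min_count: int = 2) -> bool:
    """Require more than one allele count and at least `min_criteria` criteria."""
    return (variant.get("alt_count", 0) >= min_count
            and count_risk_criteria(variant) >= min_criteria)


if __name__ == "__main__":
    # Invented annotations for two hypothetical variants.
    candidate = {"polyphen": "D", "sift_score": 0.01, "revel": 0.72,
                 "clinpred": None, "lipid_phenotype": False, "alt_count": 3}
    singleton = dict(candidate, alt_count=1)
    print(is_risk_variant(candidate), is_risk_variant(singleton))
```

Missing scores are treated as unmet criteria, so a variant lacking REVEL or ClinPred values can still qualify through the remaining sources.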

We used the jVenn tool () to generate Venn diagrams to visualize the consensus between the different sources in determining risk variants.

We calculated the number of variants in homozygous and heterozygous states, and the total present per subject to reflect the genetic burden of dyslipidemia-related variants in the population. These metrics were obtained for the set of variants categorized by VEP as LOW, MODERATE, and HIGH risk, and the set of variants categorized as variants of interest in the present study. The data was represented in distribution plots.
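The per-subject burden described above can be sketched as a tally of heterozygous and homozygous non-reference genotypes within a chosen variant set. In this Python illustration, the nested-dictionary input, the function name, and the decision to skip missing genotypes and to count genotypes with two different non-reference alleles as heterozygous are all assumptions.

```python
from typing import Dict, Set


def burden_per_subject(
    genotypes: Dict[str, Dict[str, str]], variants_of_interest: Set[str]
) -> Dict[str, Dict[str, int]]:
    """Count het, hom (non-reference), and total variants carried by each subject.

    `genotypes[sample][variant_id]` holds a VCF-style genotype such as '0/1'.
    """
    burden: Dict[str, Dict[str, int]] = {}
    for sample, calls in genotypes.items():
        het = hom = 0
        for variant_id, gt in calls.items():
            if variant_id not in variants_of_interest:
                continue
            alleles = gt.replace("|", "/").split("/")
            if "." in alleles or all(allele == "0" for allele in alleles):
                continue  # missing call or homozygous reference
            if len(set(alleles)) == 1:
                hom += 1
            else:
                het += 1
        burden[sample] = {"het": het, "hom": hom, "total": het + hom}
    return burden


if __name__ == "__main__":
    toy = {
        "S1": {"rs328": "0/1", "rs7412": "1/1", "rs693": "0/0"},
        "S2": {"rs328": "./.", "rs7412": "0|1", "rs693": "1|1"},
    }
    print(burden_per_subject(toy, {"rs328", "rs7412"}))
```

Running the same function once per variant set (for example, one set per VEP impact category) yields the values that would feed the distribution plots.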

2.7 Code for bioinformatic analysis

In addition to the tools mentioned above, we used the free programming languages Python 3.7 and R 4.1.2. Python was used to manage the variant call workflow, annotate the variants, manipulate the data, and generate visualizations. R was used to generate the visualizations produced from the TreeMix results. All code can be found in the GitHub repository https://github.com/jcvalverdehernandez/cr_dislipidemia_2022.

3 Results

3.1 Variant call metrics met exome quality standards

The Ti/Tv ratio obtained for both datasets had a mean of 2.33 (Figure 2A). For exomes, Ti/Tv values around 3.0 are reported to indicate data of adequate quality (Wang et al., 2015). This metric is sensitive to the genome region and functionality; thus, including intronic regions could reduce this ratio, similar to what we observe in our data. We used transcriptome coordinates that include coding and non-coding sequences (miRNAs and lncRNAs), as specified in the transcript coordinates from Ensembl 106.

FIGURE 2

The average HET/non-ref HOM ratio observed for both cohorts was 1.66 (Figure 2B). The expected value of this index is 2.0 for whole-genome sequencing variants. However, this highly depends on ancestry (Wang et al., 2015). In the study by Wang et al. (2015), average exome estimates varied from 1.4 to 2.0 in Asians and Africans, respectively.

Additionally, an exome average of 137,593 SNVs and 13,273 indels was identified per individual for both cohorts (Figure 2C). All metrics per chromosome and cohort are in Supplementary Figure S1 and Supplementary Table S1. Moreover, PSYCH-CV and dbGAP-CV presented similar values for the three metrics (t-test p-value >0.05).

Finally, allelic frequencies previously reported at various polymorphic sites in the Costa Rica Heart Study were significantly correlated (r = 1.00, p = 1.8e-13) with those observed in CR-WGS. This result suggests a high similarity between these cohorts and that variant calling was accurate (Supplementary Figure S2).

3.2 The ancestry of Costa Rican genomes is consistent with previous studies

The ancestry analyses validated that PSYCH-CV and dbGAP-CV cohorts have a genetic profile consistent with that expected from a random sample of Costa Ricans from the Central Valley. They also reveal an ancestry profile similar to other Latin American groups in 1KGP, such as CLM, MXL, and PEL.

Principal component analysis (PCA) captured around 40.58% (between principal components 1 and 2) of the genetic variation using the panel of 446 AIMs in the three ancestral groups and the six American groups (Figure 3A). We observed that the PSYCH-CV and dbGAP-CV individuals appear to have more similarity with the Colombian (CLM) subjects in European and Asian ancestry, and in the AFR only for PSYCH-CV. Additionally, PSYCH-CV presented similarities with the AFR and EAS component of Mexicans (MXL) (Supplementary Table S3). These observations were verified by building 95% confidence intervals (Supplementary Table S4), which are also reflected in the genetic structure plot (Figure 3C). The genetic distance tree also groups Costa Rican genomes with Latin American and European groups (Figure 3B).

FIGURE 3

(A) Principal component analysis. (B) Genetic relationships between the populations included in the analysis according to TreeMix estimates. (C) Individual genetic structure plot. Featured 1KGP populations - EUR: Europe, AFR: Africa, EAS: East Asia, ACB: Barbados, ASW: African Ancestry in Southwest US, CEU: Utah, CLM: Colombia, MXL: Mexico, PEL: Peru, PUR: Puerto Rico, PSYCH-CV: Psychiatric study Central Valley, dbGAP-CV: dbGAP Central Valley.

When contrasting the genetic ancestry of PSYCH-CV and dbGAP-CV using 78 AIMs, we observed complete similarity between them in all three ancestry components. Using these same markers, we compared their ancestry with that of the previously described Costa Rican groups and observed the most significant similarity with the Central Valley group (2013-CV) in all three ancestry components for PSYCH-CV, but only for AFR and EAS for dbGAP-CV. Moreover, both groups showed similar AFR ancestry compared to the South Region (2013-SR), and similar EAS ancestry compared to the Caribbean Region (2013-CR). PSYCH-CV also presented AFR ancestry similar to 2013-CR (Figures 4A, B; Supplementary Table S3). These observations were verified by building 95% confidence intervals (Supplementary Table S4). The rest of the confidence intervals reflected statistically significant differences. The PCA captured approximately 36.33% of the genetic variation between principal components 1 and 2. These results provided confidence that CR-WGS represented the Central Valley population of Costa Rica.

FIGURE 4

(A) Principal component analysis. (B) Genetic admixture ternary diagram. (C) Individual genetic structure plot. AFR: Africa, EAS: East Asia, EUR: Europe, AMR: Latin America, 2013-CR: Costa Ricans from the Caribbean Region, 2013-NR: Costa Ricans from the North Zone, 2013-SR: Costa Ricans from the South Zone, 2013-CV: Costa Ricans from the Central Valley, PSYCH-CV: Psychiatric study Central Valley, dbGAP-CV: dbGAP Central Valley.

3.3 Polymorphic sites identified in genes of interest

We identified 2,600 polymorphic sites in CR-WGS in the 69 genes of interest (Table 1) consisting of 2,460 SNVs and 140 indels (Table 2). However, only 2,553 were annotated in dbSNP. We detected 47 new variants not reported previously in dbSNP. Multiallelic variants represented 2.9% of all variants detected.

TABLE 2

Metric | Total | SNVs | Indels
Variants identified | 2,600 | 2,460 | 140
Not in dbSNP | 47 | 44 | 3
In dbSNP | 2,553 | 2,416 | 137
Multiallelic | 75 | 37 | 38
Biallelic | 2,525 | 2,423 | 102

Variant calling statistics for the panel of 69 genes involved in lipid metabolism.

We classified 2,277 variants (unique rsIDs) into 2,769 impact annotations assigned in VEP based on the in silico consequence of the variant according to the Sequence Ontology (SO) term. This means that a variant could have different impact annotations depending on the region of the gene and the alternative transcript it belongs to. For example, rs5088 in APOA2 had five annotations: intron variant, synonymous variant, 3-prime UTR variant, downstream gene variant, and splice region variant; three had a MODIFIER impact, and two had a LOW impact. In summary, 349 variants had a LOW impact (low risk of affecting gene transcripts), 397 had a MODERATE impact, and eight had a HIGH impact. VEP could not assign an expected risk to the consequences of 1,941 of the variants; these consequences are referred to as MODIFIER (Supplementary Table S3). To estimate the genetic burden for dyslipidemia in our sample, we plotted the number of variants per individual (Figures 5A–C). On average, the subjects presented 56.22 LOW impact variants (34.9 and 21.36 in heterozygous and homozygous state, respectively), 47.29 MODERATE impact variants (27.23 and 20.06 in heterozygous and homozygous state, respectively), and 1.03 HIGH impact variants (0.82 and 0.43 in heterozygous and homozygous state, respectively).

FIGURE 5

According to Fisher’s exact tests implemented to contrast the allele frequencies of the 2,174 variants detected in CR-WGS and those of the groups belonging to 1KGP, we observed that the AMR, EUR, and ALL groups are the most similar to CR-WGS (Figure 6A). These differed individually from CR-WGS in the allelic frequencies of 54, 214, and 452 variants, respectively (Figure 6B). On the other hand, EAS and AFR presented statistically significant differences in the allele frequencies of 694 and 1,082 polymorphic sites compared to CR-WGS, respectively (Supplementary Figure S4).

FIGURE 6

The eight variants associated with high-risk consequences according to VEP are summarized in Table 3. These are located in eight genes and include stop gained and start lost annotations; most were heterozygous and presented 1 to 37 copies in CR-WGS. Interestingly, rs328G and rs132642T are homozygous in two different individuals each. SNV rs328 was reported as benign in other Latin American studies and ClinVar (Table 6), while rs132642 has no annotation in ClinVar. Allele frequencies from 1KGP and gnomAD exomes are low (up to 11%, Table 3).

TABLE 3

Gene | dbSNP rsID | Alleles (REF/ALT) | Impact | Alternative allele frequency in CR-WGS (count) | Samples homozygous for least frequent allele | Depth of REF:ALT in least frequent allele | 1KGP global frequency for least frequent allele | gnomAD exomes global frequency for least frequent allele
APOC4 | rs5164 | G/A | stop_gained | 0.0019 (1) | | | 0.0027 | 0.0004
APOL3 | rs132642 | T/A | start_lost | 0.9027 (464) | 2 | 0:30, 0:37 | 0.0584 | 0.1146
APOL4 | rs192225524 | C/A | stop_gained | 0.0311 (16) | | | 0.0009 | 0.0005
CD36 | rs3211938 | T/G | stop_gained | 0.0019 (1) | | | 0.0309 | 0.0061
GCKR | rs146053779 | C/T | stop_gained | 0.0096 (5) | | | 0.0014 | 0.0009
GPD1 | rs144009925 | A/G | start_lost | 0.0039 (2) | | | - | 0.0003
LPL | rs328 | C/G | stop_gained | 0.0719 (37) | 2 | 0:27, 1:34 | 0.0924 | 0.0921
SCARB1 | rs749801989 | T/C | start_lost | 0.0116 (6) | | | - | 0.0001

High-risk variants frequency and presence of homozygous individuals for the alternate allele in CR-WGS.

Forty-one variants in 21 genes were associated with phenotypic traits categorized as protective, drug response, association, risk factor, likely pathogenic, and pathogenic (Figure 7). The genes with more than one variant with phenotypic traits categorized as risk or pathogenic factors (i.e., risk factor, pathogenic or likely pathogenic) were APOA5, APOB, APOE, APOL1, CD36, GCKR, LDLR, LPL, PCSK9, and PLA2G7.

FIGURE 7

Seven variants were annotated with features associated with drug response and two with protective features in the APOB, APOE, HMGCR, and MTTP genes (Table 4). The allelic frequencies of the alternate allele ranged from 0.01 to 0.76. These nine variants are present in 1KGP populations, but we observed statistically significant differences in the allelic frequencies of seven of the variants. All variants presented annotations in ClinVar, including associations with traits such as warfarin, atorvastatin, and statin responses, and one protective against metabolic syndrome.

TABLE 4

Gene | dbSNP rsID | Alleles (REF/ALT) | CR-WGS | 1KGP Phase 3 ALL | 1KGP Phase 3 EUR | 1KGP Phase 3 EAS | 1KGP Phase 3 AFR | 1KGP Phase 3 AMR | Protective or pharmacogenetic traits
APOB | rs1042034 | C/T | 0.76163 | 0.62959* | 0.78230 | 0.27976* | 0.87594* | 0.74927 | Allele T per ClinVar: Warfarin response
APOB | rs1367117 | G/A | 0.34496 | 0.16932* | 0.29821 | 0.11507* | 0.07791* | 0.28674 | Allele A per ClinVar: Warfarin response. Allele A per NHGRI-EBI GWAS catalog: Medication use HMG CoA reductase inhibitors
APOB | rs679899 | G/A | 0.40116 | 0.48502 | 0.47415 | 0.86408* | 0.13010* | 0.39193 | Allele A per ClinVar: Warfarin response
APOB | rs693 | G/A | 0.44961 | 0.25099* | 0.44234 | 0.06150* | 0.20953* | 0.37752 | Allele A per ClinVar: Warfarin response
MTTP | rs3816873 | T/C | 0.27432 | 0.24980 | 0.26043 | 0.13591* | 0.26096 | 0.17867 | Allele C per ClinVar: Metabolic syndrome, protection against
APOE | rs429358 | T/C | 0.07004 | 0.15055* | 0.15506* | 0.08630 | 0.26777* | 0.10374 | Allele C per ClinVar: Warfarin response
APOE | rs7412 | C/T | 0.06615 | 0.07507 | 0.06262 | 0.10019 | 0.10287 | 0.04755 | Allele T per ClinVar: atorvastatin response - Efficacy, Warfarin response. Allele T per NHGRI-EBI GWAS catalog: Response to statins (LDL cholesterol change), Lipoprotein-associated phospholipase A2 activity change in response to darapladib treatment in cardiovascular disease
APOE | rs769450 | G/A | 0.31712 | 0.32727 | 0.41153 | 0.21825 | 0.35022 | 0.29682 | Allele A per ClinVar: Warfarin response
HMGCR | rs17238540 | T/G | 0.01362 | 0.03554 | 0.01689 | - | 0.10816* | 0.02449 | Allele G per ClinVar: Statins, attenuated cholesterol lowering by

Columns CR-WGS through AMR give the alternative allele frequency.

Variants found in genes of interest that are associated phenotypically with pharmacogenomic or protective traits against diseases. CR-WGS: Costa Rican genomes evaluated in this study, ALL: all Subjects from 1KGP phase 3, EAS: East Asia, EUR: Europe, AFR: Africa, AMR: Latin America. * Significantly different allelic frequency (p < 0.05) compared to CR-WGS.

Of the missense variants identified within the genes of interest listed in Table 1, 18 were categorized as risk variants by at least three sources used for functional annotation and had more than one count in CR-WGS (Figure 8; Table 5). These variants were located in 16 genes. The alternate allele frequencies ranged from 0.00389 to 0.09143 in CR-WGS and from 0.00001 to 0.08852 in ALL. Thirteen variants were only present in CR-WGS and ALL; three were reported in AMR and CR-WGS, one in EUR and AMR, one in AFR and AMR, and one in EAS and AMR. In this list, only rs1801689 in APOH presented allelic frequencies significantly different from AFR and EAS, and rs202022169 in CELSR2 showed statistical differences with ALL. Additionally, only nine variants had a phenotype association in ClinVar, GWAS, or Teslovich et al. (2010), including sitosterolemia, cholesterol levels, hypertriglyceridemia, apolipoproteinemia, and familial hypercholesterolemia, among others.

FIGURE 8

TABLE 5

Gene | dbSNP rsID | Alleles (REF/ALT) | CR-WGS | 1KGP ALL | 1KGP AFR | 1KGP EUR | 1KGP AMR | 1KGP EAS | Classified as functional | Phenotype association
ABCA1 | rs766619359 | C/T | 0.00778 | - | - | - | - | - | S, P, R, C |
ABCG8 | rs11887534 | G/C | 0.05038 | 0.06050 | 0.07639 | 0.07952 | 0.09654 | 0.01388 | S, P | ClinVar: SITOSTEROLEMIA. GWAS: C-reactive protein levels or LDL-cholesterol levels (pleiotropy). Teslovich: Cholesterol, total; Low-density lipoprotein cholesterol
ABCG8 | rs200433692 | C/T | 0.00581 | 0.00039 | - | - | 0.00288 | - | S, P, C |
APOA5 | rs3135506 | G/C | 0.09143 | 0.05571 | 0.06732 | 0.06759 | 0.11671 | - | S, P | ClinVar: Familial hypertriglyceridemia. GWAS: Low density lipoprotein cholesterol levels; High density lipoprotein cholesterol levels; Total cholesterol levels; Total triglycerides levels
APOE | rs7412 | C/T | 0.06614 | 0.07507 | 0.10287 | 0.06262 | 0.04755 | 0.10019 | S, P, R | ClinVar: Apolipoproteinemia E1; atorvastatin response - Efficacy, Familial type 3 hyperlipoproteinemia; Hypercholesterolemia. GWAS: Cholesterol, total; HDL cholesterol; High density lipoprotein cholesterol levels; LDL cholesterol; Lipid metabolism phenotypes; Lipoprotein A levels; Lipoprotein-associated phospholipase A2 activity change in response to darapladib treatment in; Response to statins (LDL cholesterol change); Triglyceride levels
APOH | rs1801689 | A/C | 0.03112 | 0.01637 | 0.00151* | 0.04075 | 0.03602 | 0.00099* | S, P, R |
APOL1 | rs775820342 | G/A | 0.00389 | - | - | - | - | - | S, P, C |
CD36 | rs146027667 | G/T | 0.00389 | - | - | - | - | - | S, P, R |
CELSR2 | rs202022169 | T/C | 0.01937 | 0.00079* | - | - | 0.00432 | - | S, P, R |
CELSR2 | rs1203365203 | G/A | 0.00387 | - | - | - | - | - | S, P, C |
CREB3L3 | rs779860332 | C/A | 0.00389 | - | - | - | - | - | S, P, R, C |
GCKR | rs146175795 | G/A | 0.01162 | 0.00439 | - | - | 0.02161 | 0.00694 | S, R, C | ClinVar: Hypertriglyceridemia
LCAT | rs4986970 | A/T | 0.00778 | 0.00838 | 0.00151 | 0.02683 | 0.00720 | - | S, P, R | ClinVar: LCAT deficiency. GWAS: Apolipoprotein A1 levels, Total cholesterol levels
LDLR | rs148698650 | G/A | 0.00389 | 0.00079 | 0.00075 | - | 0.00288 | - | S, R | ClinVar: Familial hypercholesterolemia
LIPE | rs1166099993 | G/A | 0.00389 | - | - | - | - | - | S, P, C |
LPL | rs118204057 | G/A | 0.00583 | 0.00019 | - | - | 0.00144 | - | P, R | ClinVar: Hyperlipidemia, familial combined, LPL related; Hyperlipoproteinemia, type I. GWAS: High density lipoprotein cholesterol levels; Triglyceride levels
PPARA | rs1800206 | C/G | 0.03501 | 0.02276 | 0.00529* | 0.05864 | 0.03458 | - | S, R | ClinVar: HYPERAPOBETALIPOPROTEINEMIA, SUSCEPTIBILITY TO
SCARB1 | rs748231262 | G/A | 0.00389 | - | - | - | - | - | S, P, R, C | ClinVar: Familial hypercholesterolemia

Allele frequency and annotation of variants that produce alterations in genes involved in lipid metabolism that are categorized as risky by more than three sources and with more than one count in CR-WGS. CR-WGS: Costa Rican genomes evaluated in this study, ALL: all Subjects from 1KGP phase 3, EAS: East Asia, EUR: Europe, AFR: Africa, AMR: Latin America, S: SIFT, P: PolyPhen2, R: REVEL, C: ClinPred. * Significantly different allelic frequency (p < 0.05) compared to CR-WGS.
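The asterisks in the frequency tables mark allele frequencies that differ from CR-WGS at p < 0.05. As an illustration of how such a comparison can be made from allele counts, the sketch below applies a two-sided Fisher's exact test to a 2 × 2 table of alternate and reference allele counts. The choice of test, the function name `fisher_exact_two_sided`, and the counts (10 of 514 alleles versus 4 of 5,008 alleles, chosen to approximate the rs202022169 frequencies of 0.01937 and 0.00079) are assumptions made for illustration; they are not the procedure or the exact counts used in this study.

```python
from math import comb


def fisher_exact_two_sided(alt_a, ref_a, alt_b, ref_b):
    """Two-sided Fisher's exact test on a 2x2 table of allele counts."""
    n_a = alt_a + ref_a          # alleles sampled in group A
    n_b = alt_b + ref_b          # alleles sampled in group B
    k = alt_a + alt_b            # total alternate alleles in both groups
    total = comb(n_a + n_b, k)

    def prob(x):
        # Hypergeometric probability of seeing x alternate alleles in group A
        return comb(n_a, x) * comb(n_b, k - x) / total

    p_obs = prob(alt_a)
    lo, hi = max(0, k - n_b), min(k, n_a)
    # Sum the probabilities of every table no more likely than the observed one
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


# Hypothetical counts (assumed, not taken from the study's data)
p_value = fisher_exact_two_sided(10, 504, 4, 5004)
print(f"p = {p_value:.3g}")
```

With real data, the counts would come from the called genotypes of each group, and some correction for testing many variants would normally be considered.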

Finally, only eight variants previously linked to lipid metabolism or the development of dyslipidemia in Costa Ricans and Latin Americans were found in CR-WGS (Table 6). These variants were in ABCA1, ABCG8, CELSR2, and LPL genes, with frequencies ranging from 0.004 to 0.031. The variant rs1231383321 in LPL is a private variant found in one individual (heterozygous, sequencing depth 16:21) from CR-WGS.

TABLE 6

Gene | dbSNP rsID | Alleles (REF/ALT) | CR-WGS | 1KGP ALL | 1KGP EUR | 1KGP EAS | 1KGP AFR | 1KGP AMR | Phenotypic association with Latin American populations
ABCA1 | rs9282541 | G/A | 0.05252 | 0.00599* | - | - | 0.00075* | 0.04178 | Allele A found mostly in Native Americans and their descendants. Negative correlation between the early development of coronary disease and HDL-C levels (Villarreal-Molina et al., 2012).
ABCG8 | rs4245791 | C/T | 0.74806 | 0.84105* | 0.68986 | 0.99603* | 0.89334* | 0.80259 | A GWAS shows an association between the C allele and levels of LDL in Latin Americans.
CELSR2 | rs12740374 | G/T | 0.21511 | 0.19548 | 0.21272 | 0.04265* | 0.24735 | 0.20461 | A GWAS shows an association between the T allele and levels of LDL and cholesterol in Latin Americans.
LPL | rs1231383321 | C/A | 0.00194** | - | - | - | - | - | Allele A found in Costa Ricans with severe hyperlipidemia.
LPL | rs118204057 | G/A | 0.00583 | 0.00019 | - | - | - | 0.00144 | Allele A found in Costa Ricans with severe hyperlipidemia.
LPL | rs268 | A/G | 0.03307 | 0.00519* | 0.01391 | - | 0.00075* | 0.01152 | Allele A found in Costa Ricans with severe hyperlipidemia.
LPL | rs316 | C/A | 0.19455 | 0.15255 | 0.12027 | 0.11210* | 0.23676 | 0.14553 | Allele A found in Costa Ricans with severe hyperlipidemia.
LPL | rs328 | C/G | 0.07198 | 0.09245 | 0.13021 | 0.12202 | 0.06127 | 0.06340 | Allele G is associated in Costa Ricans with a lower risk for myocardial infarction (Yang et al., 2004).

Variants previously reported in genes involved in lipid metabolism from Costa Rica and Latin America. CR-WGS: Costa Rican genomes evaluated in this study, ALL: all Subjects from 1KGP phase 3, EAS: East Asia, EUR: Europe, AFR: Africa, AMR: Latin America. * Significantly different allelic frequency (p < 0.05) compared to CR-WGS. ** Found in one individual.

In summary, we identified 40 variants of interest related to dyslipidemia in CR-WGS. Subjects in our sample presented on average 7.49 of these variants (Figure 5D). Moreover, 60% of the subjects have two or three variants in homozygous state and 20% of the subjects present five variants in heterozygous states.
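The per-subject summary above amounts to counting, for each individual, how many of the variants of interest are carried in the heterozygous or homozygous state. A minimal sketch of that tally follows; the genotype matrix, the variable names, and the helper `summarize_subject` are hypothetical and serve only to illustrate the bookkeeping, not to reproduce the study's data or pipeline.

```python
# Hypothetical genotypes: rows are subjects, columns are variants of interest.
# Each value is the number of alternate alleles carried (0, 1, or 2).
genotypes = [
    [0, 1, 2, 0, 1],
    [1, 1, 0, 0, 0],
    [2, 0, 2, 1, 1],
    [0, 0, 0, 1, 0],
]


def summarize_subject(row):
    """Count heterozygous and homozygous-alternate variants for one subject."""
    het = sum(1 for g in row if g == 1)
    hom = sum(1 for g in row if g == 2)
    return {"het": het, "hom": hom, "total": het + hom}


summaries = [summarize_subject(row) for row in genotypes]

# Average number of variants carried per subject
mean_variants = sum(s["total"] for s in summaries) / len(summaries)
# Fraction of subjects carrying at least one variant in the homozygous state
share_with_hom = sum(1 for s in summaries if s["hom"] > 0) / len(summaries)

print(f"mean variants per subject: {mean_variants:.2f}")
print(f"share of subjects with a homozygous variant: {share_with_hom:.0%}")
```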

4 Discussion

4.1 Exome quality metrics

The bioinformatics workflow used to perform variant calling on the PSYCH-CV and dbGAP-CV cohorts yielded metrics (Ti/Tv and HET/non-ref HOM ratios) within the values expected for exomes of adequate quality (Wang et al., 2015). Although Ti/Tv ratios were lower than the standard (Wang et al., 2015), the exome regions included the coordinates of mature transcripts, miRNAs, and lncRNAs in Ensembl 106, which could lower the values of this metric. Moreover, because the HET/non-ref HOM ratio is sensitive to ancestry, it is worth noting that the ratios for both cohorts were within the standard for Asians and Africans (Wang et al., 2015).
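For readers less familiar with these two quality metrics, the sketch below shows how they are defined from a list of single-nucleotide variants and a list of per-site genotypes. The function names and the toy data are assumptions for illustration; the study's own values come from its variant-calling workflow, not from this code.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """Transitions stay within purines (A<->G) or within pyrimidines (C<->T)."""
    # Assumes ref != alt, i.e. the record really is an SNV.
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def titv_ratio(snvs):
    """snvs: list of (ref, alt) tuples for single-nucleotide variants."""
    transitions = sum(1 for ref, alt in snvs if is_transition(ref, alt))
    transversions = len(snvs) - transitions
    return transitions / transversions


def het_to_hom_alt_ratio(alt_allele_counts):
    """alt_allele_counts: per-site alternate-allele counts (0, 1, or 2) for one sample."""
    het = sum(1 for g in alt_allele_counts if g == 1)
    hom_alt = sum(1 for g in alt_allele_counts if g == 2)
    return het / hom_alt


# Toy data (hypothetical, for illustration only)
toy_snvs = [("A", "G"), ("C", "T"), ("G", "A"), ("T", "C"), ("A", "C"), ("G", "T")]
toy_genotypes = [1, 1, 1, 2, 2, 0]

print(titv_ratio(toy_snvs))                 # 4 transitions / 2 transversions
print(het_to_hom_alt_ratio(toy_genotypes))  # 3 heterozygous / 2 homozygous-alternate sites
```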

On average, each individual from CR-WGS carried 137k SNVs per exome (210 Mb), but the regions included non-coding sequences that can accumulate more variants. According to the literature, the expected count of SNVs per exome (33 Mb) ranges between 15,000 and 20,000, depending mainly on the coordinates used to define the exome and on ancestry (Ng et al., 2009; Stitziel et al., 2011). In contrast, a genome contains about three million SNPs (Stitziel et al., 2011). Moreover, the average Ti/Tv ratio, HET/non-ref HOM ratio, and SNV count per individual did not differ significantly between PSYCH-CV and dbGAP-CV (t-test p-value > 0.05), which supports pooling both cohorts for variant annotation.
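Because the SNV counts above refer to target regions of different sizes, one way to put them on a common footing is to express them per megabase. The short calculation below does this using only the figures quoted in the paragraph (137k SNVs over 210 Mb, and 15,000 to 20,000 SNVs over 33 Mb); the helper name `snvs_per_mb` is hypothetical, and this is a back-of-the-envelope normalization rather than an analysis performed in the study.

```python
def snvs_per_mb(snv_count, region_mb):
    """Average number of SNVs per megabase of targeted sequence."""
    return snv_count / region_mb


cr_wgs_density = snvs_per_mb(137_000, 210)   # CR-WGS exome definition (210 Mb)
exome_low = snvs_per_mb(15_000, 33)          # lower bound for a 33 Mb exome
exome_high = snvs_per_mb(20_000, 33)         # upper bound for a 33 Mb exome

print(f"CR-WGS: {cr_wgs_density:.0f} SNVs/Mb")
print(f"33 Mb exome: {exome_low:.0f} to {exome_high:.0f} SNVs/Mb")
```

Under this normalization, the CR-WGS density (about 652 SNVs per Mb) lies somewhat above the upper end of the range for a 33 Mb exome (about 606 SNVs per Mb), in line with the observation that the included non-coding regions can accumulate more variants.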

4.2 Concordance with the ancestry of Costa Ricans from the Central Valley

The ancestry analysis showed that PSYCH-CV and dbGAP-CV samples have a genetic admixture consistent with Latin American populations and with ancestry studies from the Central Valley (). There is also a high concordance between the allele frequencies reported for CR-WGS and those of the sample of Costa Ricans from the Central Valley without diagnosed disease studied in the Costa Rica Heart Study. Together, these observations suggest that the allelic frequencies obtained from CR-WGS are representative of the general population of the Central Valley of Costa Rica and that conclusions from this study could inform healthcare policies.

CR-WGS presented an ancestry profile similar to some Latin American groups reported in 1KGP. Of the four Hispanic groups included in 1KGP, the Costa Rican group closely resembles the EUR and EAS component of Colombians (AFR also for PSYCH-CV), and the AFR and EAS component of Mexicans only for PSYCH-CV. This is consistent with previous studies as reviewed by (; Wang et al., 2019). The impact of this finding in the study of dyslipidemias in Latin America should be studied further to determine whether conclusions derived from Costa Rican populations apply to other Latin American groups with high European ancestry.

PSYCH-CV and dbGAP-CV samples have comparable admixture proportions to Central Valley samples from , which is consistent with the origin of both cohorts. Notably, the European component was lower in CR-WGS (mean 0.47) and the Asian (used as a proxy of Amerindian) was higher (mean 0.46) compared to (EUR 0.569 and EAS 0.364). This may be because, in the present study, the East Asian population (EAS) reported in 1KGP was used as the ancestral group instead of an Amerindian group, as in the study by . Although EAS has been used in previous ancestry studies as a group analogous to Native Americans due to their historical origin and because EAS is a broad and standardized group (Wang et al., 2019), it is recommended in future studies to use genomic information from Native Americans for ancestry estimations.

4.3 Pharmacogenomic variants

According to the functional annotation extracted from ClinVar and the GWAS Catalog, at least nine of the identified variants have been reported to affect the efficacy, safety, or metabolism of therapeutic agents (Table 4). Eight of these variants are found in PharmGKB, but for three of them the evidence is inconclusive or no association with a pharmacogenomic phenotype was found.

Four variants in APOB showed phenotypes associated with response to warfarin, according to ClinVar; all of them presented frequencies above 34%. The same variants are reported in PharmGKB, but only two have a significant association with warfarin. Variants rs1042034 and rs693 were evaluated for hemorrhage risk in Korean patients under warfarin treatment, but the T and G alleles, respectively, showed no association (Yee et al., 2019). However, in the same study, the G allele in rs1367117 and the G allele in rs6789899 were associated with an increased risk of hemorrhage when using warfarin in people with heart valve replacement.

It has been observed in previous studies that the variants rs429358 and rs7412 in APOE can alter the efficacy of statin-type drugs such as lovastatin, atorvastatin, or pravastatin to reduce blood cholesterol levels (Mega et al., 2009; ; ). A study in hypercholesterolemic Chilean patients showed that these variants impact statins response (Lagos et al., 2015). studied the interaction of APOE genotypes (using the HhaI enzyme) and fat plasma with lipoprotein levels and low-density lipoproteins in Costa Ricans. Moreover, rs7412 has shown protective effects against SARS-CoV-2 (). Due to their high allelic frequencies, these variants are candidates for further pharmacogenomic studies in Costa Ricans and Latin American populations (Table 4). On the other hand, rs769450 is an intron variant interpreted as a drug response to warfarin in ClinVar but without assertion criteria. However, in dbSNP, this variant is supported by Musunuru et al. (2012) and Son et al. (2015) associated with decreased risk of elevated triglycerides and LDL (low-density lipoprotein) phenotype, respectively. Additionally, in PharmGKB, allele A is not associated with the risk of hemorrhage during warfarin treatment in people with heart valve replacement compared to allele G.

In HMGCR, the genotype TT in rs17238540 is associated with reduced LDL cholesterol in patients treated with simvastatin (Krauss et al., 2008). Furthermore, the genotype GT, compared to TT, showed a decreased reduction in total cholesterol under pravastatin treatment (). This marker should be studied in more detail in patients under statin treatment.

The only protective variant found was rs3816873 in MTTP. This gene encodes the microsomal triglyceride transfer protein, which catalyzes the transport of triglyceride, cholesteryl ester, and phospholipid between phospholipid surfaces. This variant was associated with protection against metabolic syndrome in ClinVar and OMIM (https://omim.org/entry/157147#0009) and is classified as benign for abetalipoproteinemia.

4.4 Risk variants

Alterations in the expression levels or the functioning of the genes involved in lipid metabolism evaluated in this study can cause imbalances in the lipid profile and lead to the development of dyslipidemia. Eight variants presented high impact in VEP; only two were homozygous for the recessive allele (Table 3). For instance, rs132642 in APOL3 had no annotation in ClinVar, and rs328 in LPL is annotated as benign in the phenotype hyperlipoproteinemia type I. This mutation truncates the last two codons of the protein. Evidence from Kobayashi et al. (1992) was from a heterozygous individual and performed expression studies in Cos-1 cells. presented the case of two homozygous brothers in rs328 with another mutation Asp156Gly in LPL. They confirmed in vitro that the carboxyl terminus of LPL was not responsible for hyperlipoproteinemia type I. The minor allele frequencies of rs132642 and rs328 are 5.8% and 9.25% in dbSNP (1KGP Global group). All other five high-risk variants identified in Costa Ricans are presented as heterozygous, and only two have ClinVar annotations with uncertain or conflicting interpretations (CD36, GCKR, and GPD1). In dbSNP, five of these variants (rs5164, rs192225524, rs146053779, rs144009925, and rs749801989) have frequencies below 0.1% in the Global populations of 1KGP and gnomAD exomes. These deserve further study in Latin American populations because of their low allelic frequencies in the same databases (0.3%).

Sixteen out of the 69 genes evaluated contained risk variants defined by more than three bioinformatic tools (Figure 8; Table 5). The genes of the apolipoprotein family with risk variants include APOA5, APOE, APOH, and APOL1. According to Su and Peng (2020), APOA5 and APOE participate in the assembly of VLDLs. The study by Zhou et al. (2018) reported that variants in APOA tend to impact plasma triglyceride levels more than cholesterol. Several studies have linked the presence of the C allele in SNV rs3135506 with elevated plasma triglyceride levels (Ruiz-Narváez et al., 2005; Li et al., 2014). Surendran et al. (2012) found an allele frequency of 21% in patients with severe hypertriglyceridemia, while the control group presented a frequency of 9%. This variant reached an allelic frequency of 9% in the Costa Rican group and did not show significant differences with the other 1KGP groups.
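To give a sense of scale for the case and control frequencies attributed above to Surendran et al. (2012), the two allele frequencies (21% and 9%) can be converted into an allelic odds ratio. The sketch below is an illustrative calculation only: it treats the two percentages as exact, ignores sample sizes (so no confidence interval can be derived), and the function name `allelic_odds_ratio` is hypothetical. The resulting figure is not a value reported in this study.

```python
def allelic_odds_ratio(freq_cases, freq_controls):
    """Odds ratio for carrying the allele, computed from two allele frequencies."""
    odds_cases = freq_cases / (1 - freq_cases)
    odds_controls = freq_controls / (1 - freq_controls)
    return odds_cases / odds_controls


# Allele frequencies quoted in the text: 21% in cases, 9% in controls
or_c_allele = allelic_odds_ratio(0.21, 0.09)
print(f"allelic odds ratio = {or_c_allele:.2f}")
```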

On the other hand, several studies have associated the presence of the T allele of the rs7412 variant belonging to APOE with high blood cholesterol levels, mainly provided by LDLs, and with high body mass index (Thompson et al., 2009; Tejedor et al., 2014). Although the frequency of this variant in Costa Ricans is 6.6% while that of Latin Americans registered in 1KGP is 4.75%, no statistically significant differences were found between them; evaluating this in other parts of the country or increasing the size of the sample can help clarify whether this trend dissipates or becomes more robust. Although little is known about the molecular role of APOH in lipid metabolism, it has been observed in various populations that the presence of some variants associated with the functioning of this apolipoprotein affects LDL cholesterol levels (Willer et al., 2013). The C allele of the rs1801689 variant has been linked to changes in blood LDL levels; this variation alters the affinity of APOH with phospholipids (Mather et al., 2016). The variant rs775820342 in APOL1 presented low frequencies in CR-WGS and ALL and is not reported in ClinVar. This is a missense variant with computational pathogenic evidence that could be studied further.

Five risk variants were identified in three genes involved in lipid transport, ABCA1, ABCG5, and ABCG8, from the ABC transporter family. ABCA1 participates in the formation of HDLs by translocating cholesterol and phospholipids from the interior of the cell to nascent HDLs. The variant rs766619359 in this gene is a missense mutation. The alternate T allele is almost absent in 1KGP (0.004%) and gnomAD (0.0064% genomes, 0.0024% exomes), and no reports are available in ClinVar; together with the computational predictions (Table 5), this rarity suggests that the variant could be pathogenic.

On the other hand, ABCG5 forms a heterodimer with ABCG8 that mediates the absorption and excretion of sterols at multiple levels (). Of the risk variants identified, only rs11887534 in ABCG8 has been associated with changes in the levels of HDLs in the blood in response to statin treatment (Sałacka et al., 2021). Additionally, rs200433692 in ABCG8 is a missense mutation almost absent in population databases such as 1KGP (0.04%), gnomAD (0.0071% genomes, 0.0088% exomes), and ExAC (0.0116%).

Risk variants were found in four genes (CELSR2, CREB3L3, GCKR, and LCAT) with a regulatory or signaling role in lipid metabolism. No previous research was found associating the presence of the risk variants found in CELSR2 and CREB3L3 with alterations in the lipid profile or risk of suffering from dyslipidemia. Moreover, alternate allele frequencies of the variants rs1203365203 and rs779860332 were extremely low in ALL (0.001%–0.02%) and CR-WGS (0.4%, Table 5). Allele C in rs202022169, on the other hand, presented a statistical difference in the allele frequency with ALL, reaching 1.9% in CR-WGS compared to 0.08% in ALL and 0.4% in AMR (Table 5). By contrast, variant rs146175795 in GCKR is presented in ClinVar with conflicting interpretations of pathogenicity, including one associated with hypertriglyceridemia in two heterozygous individuals (Rees et al., 2012). LCAT rs4986970 was reported as benign in ClinVar and was associated with a reduction in HDL cholesterol (Haase et al., 2012); it presented a frequency of 0.78% in CR-WGS.

Five putative risk variants (0.3–3.5% frequency in CR-WGS) were found in CD36, LDLR, LIPE, PPARA, and SCARB1 genes, involved in lipid and lipoprotein sensing. Variant rs148698650 detected in LDLR has been linked to alterations in lipid profile according to ClinVar, rs1800206 in PPARA has been associated with lipid-altered phenotypes in three studies (Vohl et al., 2000; Tai et al., 2002; Robitaille et al., 2004), and rs748231262 in SCARB1 has one report in an Argentinian study of familial hypercholesterolemia (). The other two variants have frequencies below 0.4% in CR-WGS and are absent from ALL, AFR, EUR, AMR, and EAS.

Finally, LPL variant rs118204057 has multiple reports associated with hyperlipidemia and hyperlipoproteinemia pathology and protein function (Monsalve et al., 1990; Hata et al., 1992; Henderson et al., 1992; Mailly et al., 1997; ; Soto et al., 2015; ; ). Moreover, population frequencies are low (ALL 0.019%, 0.14% AMR, 0.58% CR-WGS), and it was detected in one individual with severe hyperlipidemia from Costa Rica (). This variant deserves further study in Costa Rica and Latin American countries.

4.5 Variants previously reported in the Latin American region

We detected in CR-WGS the ABCA1 variant rs9282541 that was considered a private variant in Native Americans and their descendants (Villarreal-Molina et al., 2012; ). Its allelic frequency resembles that observed in Latin Americans reported in 1KGP. Villarreal-Molina et al. (2012) reported in Mexican subjects that this variant was associated with lower levels of total cholesterol and HDL cholesterol in plasma. Additionally, they observed that the variant’s effect depends on the sex of the subject, probably interacting with other factors.

Two variants reported in the study by , which focused on identifying variants associated with changes in the lipid profile of Latin Americans living in the United States, were found in the Costa Rican cohort analyzed. The intron variant rs4245791 in ABCG8 is not annotated in ClinVar. However, several publications provide evidence of its relationship with total cholesterol (Ma et al., 2010); higher cholestanol-to-cholesterol levels -an estimate of cholesterol absorption- (Silbernagel et al., 2013), and increased plasma phytosterol concentrations, relatively elevated LDL-C; and increased coronary artery disease risk (). According to research, the variant rs12740374 in CELSR2 influences LDL cholesterol levels in Hispanics (Samani et al., 2007; ; Musunuru et al., 2010).

Although the research by detected genetic variants with a quantitative impact on plasma lipid levels for Latin Americans, it is essential to mention that the people included in that study reside in the United States. This means they were exposed to lifestyles and environmental conditions different from those of their country of origin. The environment alone can account for up to 21% of the variation in plasma total cholesterol levels and up to 29% of the variation in plasma triglyceride levels; approximately 6% of the variation is attributed to the interaction between environment and genetics ().

We detected in CR-WGS four of the 15 variants described by in LPL (Table 6). According to a meta-analysis, the G allele in the rs268 variant is associated with lower plasma HDL cholesterol levels (). This variant has a frequency of 3.3% in CR-WGS, significantly higher compared to ALL and AFR but not to AMR (1.1%) and EUR (1.3%). Variant rs316 is intronic, and according to Pirim et al. (2014), it is possibly located next to a regulatory site. The A allele in this variant has been repeatedly associated with an increase in HDL cholesterol (Schuster et al., 2011; Pirim et al., 2014, 2015), but it is benign in ClinVar. The missense variant rs1231383321 was detected in one individual in CR-WGS, and it is also reported in American gnomAD-exomes and genomes with a frequency of 0.023% and 0.051%, respectively. The rs118204057 variant was discussed previously.

On the other hand, we identified in CR-WGS the LPL variant rs328 (S447*), which was previously associated in a publication of the Costa Rica Heart Study with a reduction in the risk of myocardial infarction in Costa Ricans (Yang et al., 2004). The G allele prevents the encoding of the last two amino acids of LPL, increasing its lipase activity. Notably, this is associated with low levels of plasma triglycerides and increases in HDL cholesterol in healthy subjects. However, in subjects with obesity, this allele is instead associated with elevated levels of plasma triglycerides (Palacio-Rojas et al., 2017).

Overall, this study presents the reanalysis of Costa Ricans’ genomic data to estimate the baseline frequencies of dyslipidemia variants. The finding that the ancestry of these genomes closely resembles that of the Central Valley and some Latin American populations is relevant, considering the limited genomic data available in these populations to derive conclusions about the genetic burden in the general population.

The study identified 2,600 variants in 69 genes involved in lipid metabolism in the genomes of people from the Central Valley of Costa Rica. Among these, 33 variants have the potential to affect the functioning of these genes, some have been directly linked to the development of hyperlipidemia, and some could affect the performance of proteins involved in lipid metabolism according to bioinformatic analysis. However, some have not been directly associated with developing such conditions in the literature. On the other hand, we found seven variants with pharmacogenomic relevance, several of which can modulate the subject’s response to the application of statin-type drugs, therapies commonly used to treat cases of severe hyperlipidemia. Our analysis of the number of variants per individual for the 40 variants of interest suggests an important genetic burden for dyslipidemia in our sample; however, we could not determine the relationship of these variants with dyslipidemia phenotypes due to the lack of metadata associated with the datasets analyzed.

In the future, it is essential to develop studies that capture environmental, genotypic, and phenotypic data from Costa Ricans living in Costa Rica to understand more clearly the dynamics that participate in the incidence of dyslipidemia. These efforts can be focused on the 23 genes and 40 variants identified in this study, which can be analyzed with traditional genotyping methodologies (e.g., PCR, RFLP, Sanger sequencing), which would reduce costs. Alternatively, genetic analysis using genome sequencing, exome sequencing, or a panel of genes involved in lipid metabolism, such as the LipidSeq panel described by Johansen et al. (2014), could help to identify variants in affected individuals. This strategy has already been used in an Argentinian study (), in which only genes linked to lipid metabolism were sequenced. Additionally, copy number variants should be studied as they have been involved in certain dyslipidemia disorders (Iacocca and Hegele, 2018). Moreover, the abundant clinical information hosted in the Costa Rican Social Security System (Caja Costarricense del Seguro Social - C.C.S.S.) could strengthen this type of genomic study. Eventually, functional validation of the variants detected in patients should be performed to provide conclusive evidence of the association with dyslipidemia.

Statements

Data availability statement

Publicly available datasets were analyzed in this study. This data can be found here: phs000988.V4.P1 can be requested directly through dbGAP. can be requested through the original authors.

Ethics statement

The studies involving human participants were reviewed and approved by the Comité Ético Científico, Universidad de Costa Rica. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

Author contributions

RC-S and SS designed the study. RC-S and JV-H collected the genomics data. JV-H and AF-C performed the data analysis. JV-H, AF-C, GC-S, and RC-S wrote the manuscript. All authors read and approved the final manuscript.

Acknowledgments

This work was funded by the University of Costa Rica (project number B9-259). This research was partially supported by a machine allocation on the Kabré supercomputer at the Costa Rica National High Technology Center and the CICIMA high-performance computer cluster at the University of Costa Rica. This study was supported by NHLBI grant R37 HL066289. We wish to acknowledge the investigators at the Channing Division of Network Medicine at Brigham and Women’s Hospital, the investigators at the Hospital Nacional de Niños (HNN) in San José, Costa Rica, and the study subjects and their extended family members who contributed samples and genotypes to the study, and the NIH/NHLBI for its support in making this project possible. We also want to acknowledge Esteban Rodríguez (CIBCM, UCR) for Google Cloud assistance and Federico Muñoz Rojas (CICIMA, Centro de Investigación en Ciencia e Ingeniería de Materiales, UCR) for CICIMA cluster support.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2023.1114774/full#supplementary-material

References

  • 1

    Adhikari, K., Chacón-Duque, J. C., Mendoza-Revilla, J., Fuentes-Guajardo, M., Ruiz-Linares, A. (2017). The genetic diversity of the americas. Annu. Rev. Genom Hum. G. 18, 277–296. 10.1146/annurev-genom-083115-022331

  • 2

    Alirezaie, N., Kernohan, K. D., Hartley, T., Majewski, J., Hocking, T. D. (2018). ClinPred: Prediction tool to identify disease-relevant nonsynonymous single-nucleotide variants. Am. J. Hum. Genet. 103, 474–483. 10.1016/j.ajhg.2018.08.005

  • 3

    Andaleon, A., Mogil, L. S., Wheeler, H. E. (2019). Genetically regulated gene expression underlies lipid traits in Hispanic cohorts. Plos One 14, e0220827. 10.1371/journal.pone.0220827

  • 4

    Ashraf, A. P., Hurst, A. C. E., Garg, A. (2017). Extreme hypertriglyceridemia, pseudohyponatremia, and pseudoacidosis in a neonate with lipoprotein lipase deficiency due to segmental uniparental disomy. J. Clin. Lipidol. 11, 757–762. 10.1016/j.jacl.2017.03.015

  • 5

    Aslibekyan, S., Jensen, M. K., Campos, H., Linkletter, C. D., Loucks, E. B., Ordovas, J. M., et al. (2012). Fatty acid desaturase gene variants, cardiovascular risk factors, and myocardial infarction in the Costa Rica study. Front. Genet. 3, 72. 10.3389/fgene.2012.00072

  • 6

    Auton, A., Abecasis, G. R., Altshuler, D. M., Durbin, R. M., Abecasis, G. R., Bentley, D. R., et al. (2015). A global reference for human genetic variation. Nature 526, 68–74. 10.1038/nature15393

  • 7

    Bardou, P., Mariette, J., Escudié, F., Djemiel, C., Klopp, C. (2014). jvenn: an interactive Venn diagram viewer. Bmc Bioinforma. 15, 293. 10.1186/1471-2105-15-293

  • 8

    Boes, E., Coassin, S., Kollerits, B., Heid, I. M., Kronenberg, F. (2009). Genetic-epidemiological evidence on genes associated with HDL cholesterol levels: A systematic in-depth review. Exp. Gerontol. 44, 136–160. 10.1016/j.exger.2008.11.003

  • 9

    Brahm, A., Hegele, R. A. (2013). Hypertriglyceridemia. Nutrients 5, 981–1001. 10.3390/nu5030981

  • 10

    Brown, S., Ordovás, J. M., Campos, H. (2003). Interaction between the APOC3 gene promoter polymorphisms, saturated fat intake and plasma lipoproteins. Atherosclerosis 170, 307–313. 10.1016/s0021-9150(03)00293-4

  • 11

    Bruikman, C. S., Hovingh, G. K., Kastelein, J. J. P. (2017). Molecular basis of familial hypercholesterolemia. Curr. Opin. Cardiol. 32, 262–266. 10.1097/hco.0000000000000385

  • 12

    Buniello, A., MacArthur, J. A. L., Cerezo, M., Harris, L. W., Hayhurst, J., Malangone, C., et al. (2019). The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47, D1005–D1012. 10.1093/nar/gky1120

  • 13

    Caddeo, A., Mancina, R. M., Pirazzi, C., Russo, C., Sasidharan, K., Sandstedt, J., et al. (2018). Molecular analysis of three known and one novel LPL variants in patients with type I hyperlipoproteinemia. Nutr. Metab. Cardiovasc Dis. 28, 158–164. 10.1016/j.numecd.2017.11.003

  • 14

    Calandra, S., Tarugi, P., Speedy, H. E., Dean, A. F., Bertolini, S., Shoulders, C. C. (2011). Mechanisms and genetic determinants regulating sterol absorption, circulating LDL levels, and sterol elimination: Implications for classification and disease risk. J. Lipid Res. 52, 1885–1926. 10.1194/jlr.r017855

  • 15

    Campos, H., D’Agostino, M., Ordovás, J. M. (2001). Gene-diet interactions and plasma lipoproteins: Role of apolipoprotein E and habitual saturated fat intake. Genet. Epidemiol. 20, 117–128. 10.1002/1098-2272(200101)20:1<117::aid-gepi10>3.0.co;2-c

  • 16

    Campos-Sánchez, R., Raventós, H., Barrantes, R. (2013). Ancestry informative markers clarify the regional admixture variation in the Costa Rican population. Hum. Biol. 85, 721–740. 10.3378/027.085.0505

  • 17

    Chasman, D. I., Posada, D., Subrahmanyam, L., Cook, N. R., Stanton, V. P., Ridker, P. M. (2004). Pharmacogenetic study of statin therapy and cholesterol reduction. Acc. Curr. J. Rev. 13, 20–21. 10.1016/j.accreview.2004.07.109

  • 18

    Chavarria-Soley, G., Francis-Cartin, F., Jimenez-Gonzalez, F., Peralta, J. M., Blangero, J., Gur, R. E., et al. (2021). Identification of genetic risk variants for major psychiatric disorders in Costa Rican families using WGS. Eur. Neuropsychopharm 51, e16–e17. 10.1016/j.euroneuro.2021.07.042

  • 19

    Ciuculete, D. M., Bandstein, M., Benedict, C., Waeber, G., Vollenweider, P., Lind, L., et al. (2017). A genetic risk score is significantly associated with statin therapy response in the elderly population. Clin. Genet. 91, 379–385. 10.1111/cge.12890

  • 20

    Consortium, I. H., Altshuler, D. M., Gibbs, R. A., Peltonen, L., Altshuler, D. M., Gibbs, R. A., et al. (2010). Integrating common and rare genetic variation in diverse human populations. Nature 467, 52–58. 10.1038/nature09298

  • 21

    Consortium, M. I. G., Kathiresan, S., Voight, B. F., Purcell, S., Musunuru, K., Ardissino, D., et al. (2009). Genome-wide association of early-onset myocardial infarction with single nucleotide polymorphisms and copy number variants. Nat. Genet. 41, 334–341. 10.1038/ng.327

  • 22

    Corral, P., Geller, A. S., Polisecki, E. Y., Lopez, G. I., Bañares, V. G., Cacciagiu, L., et al. (2018). Unusual genetic variants associated with hypercholesterolemia in Argentina. Atherosclerosis 277, 256–261. 10.1016/j.atherosclerosis.2018.06.009

  • 23

    Cunningham, F., Allen, J. E., Allen, J., Alvarez-Jarreta, J., Amode, M. R., Armean, I. M., et al. (2021). Ensembl 2022. Nucleic Acids Res. 50, D988–D995. doi: 10.1093/nar/gkab1049

  • 24

    DePristo, M. A., Banks, E., Poplin, R., Garimella, K. V., Maguire, J. R., Hartl, C., et al. (2011). A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498. doi: 10.1038/ng.806

  • 25

    Dron, J. S., Dilliott, A. A., Lawson, A., McIntyre, A. D., Davis, B. D., Wang, J., et al. (2020a). Loss-of-Function CREB3L3 variants in patients with severe hypertriglyceridemia. Arterioscler. Thromb. Vasc. Biol. 40, 1935–1941. doi: 10.1161/atvbaha.120.314168

  • 26

    Dron, J. S., Wang, J., Cao, H., McIntyre, A. D., Iacocca, M. A., Menard, J. R., et al. (2019). Severe hypertriglyceridemia is primarily polygenic. J. Clin. Lipidol. 13, 80–88. doi: 10.1016/j.jacl.2018.10.006

  • 27

    Dron, J. S., Wang, J., McIntyre, A. D., Iacocca, M. A., Robinson, J. F., Ban, M. R., et al. (2020b). Six years’ experience with LipidSeq: Clinical and research learnings from a hybrid, targeted sequencing panel for dyslipidemias. BMC Med. Genomics 13, 23. doi: 10.1186/s12920-020-0669-2

  • 28

    Du, W., Hu, Z., Wang, L., Li, M., Zhao, D., Li, H., et al. (2020). ABCA1 variants rs1800977 (C69T) and rs9282541 (R230C) are associated with susceptibility to type 2 diabetes. Public Health Genomics 23, 20–25. doi: 10.1159/000505344

  • 29

    Elder, S. J., Lichtenstein, A. H., Pittas, A. G., Roberts, S. B., Fuss, P. J., Greenberg, A. S., et al. (2009). Genetic and environmental influences on factors associated with cardiovascular disease and the metabolic syndrome. J. Lipid Res. 50, 1917–1926. doi: 10.1194/jlr.p900033-jlr200

  • 30

    Espinosa-Salinas, I., Colmenarejo, G., Fernández-Díaz, C. M., de Cedrón, M. G., Martinez, J. A., Reglero, G., et al. (2022). Potential protective effect against SARS-CoV-2 infection by APOE rs7412 polymorphism. Sci. Rep. 12, 7247. doi: 10.1038/s41598-022-10923-4

  • 31

    Faustinella, F., Chang, A., Biervliet, J. P. V., Rosseneu, M., Vinaimont, N., Smith, L. C., et al. (1991). Catalytic triad residue mutation (Asp156→Gly) causing familial lipoprotein lipase deficiency. Co-inheritance with a nonsense mutation (Ser447→Ter) in a Turkish family. J. Biol. Chem. 266, 14418–14424. doi: 10.1016/s0021-9258(18)98701-6

  • 32

    Feingold, K. (2000). “Introduction to lipids and lipoproteins,” in Endotext [internet]. Editors Feingold, K., Anawalt, B., and Boyce, A. (South Dartmouth, MA: MDText.com, Inc). Available at: https://www.ncbi.nlm.nih.gov/books/NBK305896/ (Accessed January 19, 2021).

  • 33

    Flanagan, S. E., Patch, A.-M., and Ellard, S. (2010). Using SIFT and PolyPhen to predict loss-of-function and gain-of-function mutations. Genet. Test. Mol. Bioma. 14, 533–537. doi: 10.1089/gtmb.2010.0036

  • 34

    Galanter, J. M., Fernandez-Lopez, J. C., Gignoux, C. R., Barnholtz-Sloan, J., Fernandez-Rozadilla, C., Via, M., et al. (2012). Development of a panel of genome-wide ancestry informative markers to study admixture throughout the Americas. PLoS Genet. 8, e1002554. doi: 10.1371/journal.pgen.1002554

  • 35

    Gilbert, B., Rouis, M., Griglio, S., de Lumley, L., and Laplaud, P. M. (2001). Lipoprotein lipase (LPL) deficiency: A new patient homozygote for the preponderant mutation Gly188Glu in the human LPL gene and review of reported mutations: 75% are clustered in exons 5 and 6. Ann. De. Génétique 44, 25–32. doi: 10.1016/s0003-3995(01)01037-1

  • 36

    Gong, J., Campos, H., McGarvey, S., Wu, Z., Goldberg, R., and Baylin, A. (2011). Genetic variation in stearoyl-CoA desaturase 1 is associated with metabolic syndrome prevalence in Costa Rican adults. J. Nutr. 141, 2211–2218. doi: 10.3945/jn.111.143503

  • 37

    González-Cordero, M. (2018). Mutaciones en la región codificante del gen de la lipoproteína lipasa (LPL), en una muestra de pacientes con h-annotated.pdf.

  • 38

    Guan, Z., Wu, K., Li, R., Yin, Y., Li, X., Zhang, S., et al. (2019). Pharmacogenetics of statins treatment: Efficacy and safety. J. Clin. Pharm. Ther. 44, 858–867. doi: 10.1111/jcpt.13025

  • 39

    Gunning, A. C., Fryer, V., Fasham, J., Crosby, A. H., Ellard, S., Baple, E. L., et al. (2021). Assessing performance of pathogenicity predictors using clinically relevant variant datasets. J. Med. Genet. 58, 547–555. doi: 10.1136/jmedgenet-2020-107003

  • 40

    Gutiérrez-Ávila, J. D. (2019). Caracterización de la Región Promotora del Gen de la Apolipoproteína CII (APO CII), cofactor de la Lipoproteína Lipasa (LPL).

  • 41

    Gutiérrez-Peña, E. G., and Romero-Zúñiga, J. J. (2010). Dislipidemia y niveles de lípidos sanguíneos en pacientes tratados en centros de atención primaria de la zona este de San José, Costa Rica, año 2006. Rev. MHSalud 7. doi: 10.15359/mhs.7-2.1

  • 42

    Haase, C. L., Tybjærg-Hansen, A., Qayyum, A. A., Schou, J., Nordestgaard, B. G., and Frikke-Schmidt, R. (2012). LCAT, HDL cholesterol and ischemic cardiovascular disease: A mendelian randomization study of HDL cholesterol in 54,500 individuals. J. Clin. Endocrinol. Metab. 97, E248–E256. doi: 10.1210/jc.2011-1846

  • 43

    Hata, A., Ridinger, D. N., Sutherland, S. D., Emi, M., Kwong, L. K., Shuhua, J., et al. (1992). Missense mutations in exon 5 of the human lipoprotein lipase gene. Inactivation correlates with loss of dimerization. J. Biol. Chem. 267, 20132–20139. doi: 10.1016/s0021-9258(19)88676-3

  • 44

    Henderson, H. E., Hassan, F., Berger, G. M., and Hayden, M. R. (1992). The lipoprotein lipase Gly188→Glu mutation in South Africans of Indian descent: Evidence suggesting common origins and an increased frequency. J. Med. Genet. 29, 119–122. doi: 10.1136/jmg.29.2.119

  • 45

    Hubisz, M. J., Falush, D., Stephens, M., and Pritchard, J. K. (2009). Inferring weak population structure with the assistance of sample group information. Mol. Ecol. Resour. 9, 1322–1332. doi: 10.1111/j.1755-0998.2009.02591.x

  • 46

    Iacocca, M. A., and Hegele, R. A. (2018). Role of DNA copy number variation in dyslipidemias. Curr. Opin. Lipidol. 29, 125–132. doi: 10.1097/mol.0000000000000483

  • 47

    Ioannidis, N. M., Rothstein, J. H., Pejaver, V., Middha, S., McDonnell, S. K., Baheti, S., et al. (2016). REVEL: An ensemble method for predicting the pathogenicity of rare missense variants. Am. J. Hum. Genet. 99, 877–885. doi: 10.1016/j.ajhg.2016.08.016

  • 48

    Jakobsson, M., and Rosenberg, N. A. (2007). CLUMPP: A cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure. Bioinformatics 23, 1801–1806. doi: 10.1093/bioinformatics/btm233

  • 49

    Johansen, C. T., Dubé, J. B., Loyzer, M. N., MacDonald, A., Carter, D. E., McIntyre, A. D., et al. (2014). LipidSeq: A next-generation clinical resequencing panel for monogenic dyslipidemias. J. Lipid Res. 55, 765–772. doi: 10.1194/jlr.d045963

  • 50

    Johansen, C. T., and Hegele, R. A. (2011). Genetic bases of hypertriglyceridemic phenotypes. Curr. Opin. Lipidol. 22, 247–253. doi: 10.1097/mol.0b013e3283471972

  • 51

    Johansen, C. T., Kathiresan, S., and Hegele, R. A. (2011). Genetic determinants of plasma triglycerides. J. Lipid Res. 52, 189–206. doi: 10.1194/jlr.r009720

  • 52

    Kobayashi, J., Nishida, T., Ameis, D., Stahnke, G., Schotz, M. C., Hashimoto, H., et al. (1992). A heterozygous mutation (the codon for Ser447 → a stop codon) in lipoprotein lipase contributes to a defect in lipid interface recognition in a case with type I hyperlipidemia. Biochem. Bioph Res. Co. 182, 70–77. doi: 10.1016/s0006-291x(05)80113-5

  • 53

    Kopelman, N. M., Mayzel, J., Jakobsson, M., Rosenberg, N. A., and Mayrose, I. (2015). Clumpak: A program for identifying clustering modes and packaging population structure inferences across K. Mol. Ecol. Resour. 15, 1179–1191. doi: 10.1111/1755-0998.12387

  • 54

    Krauss, R. M., Mangravite, L. M., Smith, J. D., Medina, M. W., Wang, D., Guo, X., et al. (2008). Variation in the 3-hydroxyl-3-methylglutaryl coenzyme A reductase gene is associated with racial differences in low-density lipoprotein cholesterol response to simvastatin treatment. Circulation 117, 1537–1544. doi: 10.1161/circulationaha.107.708388

  • 55

    Lagos, J., Zambrano, T., Rosales, A., and Salazar, L. (2015). APOE polymorphisms contribute to reduced atorvastatin response in Chilean Amerindian subjects. Int. J. Mol. Sci. 16, 7890–7899. doi: 10.3390/ijms16047890

  • 56

    Landrum, M. J., Lee, J. M., Benson, M., Brown, G. R., Chao, C., Chitipiralla, S., et al. (2017). ClinVar: Improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 46, D1062–D1067. doi: 10.1093/nar/gkx1153

  • 57

    Lewis, G. F., Xiao, C., and Hegele, R. A. (2015). Hypertriglyceridemia in the genomic era: A new paradigm. Endocr. Rev. 36, 131–147. doi: 10.1210/er.2014-1062

  • 58

    Li, S., Hu, B., Wang, Y., Wu, D., Jin, L., and Wang, X. (2014). Influences of APOA5 variants on plasma triglyceride levels in Uyghur population. PLoS One 9, e110258. doi: 10.1371/journal.pone.0110258

  • 59

    Ma, L., Yang, J., Runesha, H. B., Tanaka, T., Ferrucci, L., Bandinelli, S., et al. (2010). Genome-wide association analysis of total cholesterol and high-density lipoprotein cholesterol levels using the Framingham Heart Study data. BMC Med. Genet. 11, 55. doi: 10.1186/1471-2350-11-55

  • 60

    Mailly, F., Palmen, J., Muller, D. P. R., Gibbs, T., Lloyd, J., Brunzell, J., et al. (1997). Familial lipoprotein lipase (LPL) deficiency: A catalogue of LPL gene mutations identified in 20 patients from the UK, Sweden, and Italy. Hum. Mutat. 10, 465–473. doi: 10.1002/(SICI)1098-1004(1997)10:6<465::AID-HUMU8>3.0.CO;2-C

  • 61

    Mather, K. A., Thalamuthu, A., Oldmeadow, C., Song, F., Armstrong, N. J., Poljak, A., et al. (2016). Genome-wide significant results identified for plasma apolipoprotein H levels in middle-aged and older adults. Sci. Rep. 6, 23675. doi: 10.1038/srep23675

  • 62

    Mega, J. L., Morrow, D. A., Brown, A., Cannon, C. P., and Sabatine, M. S. (2009). Identification of genetic variants associated with response to statin therapy. Arterioscler. Thromb. Vasc. Biol. 29, 1310–1315. doi: 10.1161/atvbaha.109.188474

  • 63

    Mills, R. E., Luttig, C. T., Larkins, C. E., Beauchamp, A., Tsui, C., Pittard, W. S., et al. (2006). An initial map of insertion and deletion (INDEL) variation in the human genome. Genome Res. 16, 1182–1190. doi: 10.1101/gr.4565806

  • 64

    Monsalve, M. V., Henderson, H., Roederer, G., Julien, P., Deeb, S., Kastelein, J. J., et al. (1990). A missense mutation at codon 188 of the human lipoprotein lipase gene is a frequent cause of lipoprotein lipase deficiency in persons of different ancestries. J. Clin. Invest. 86, 728–734. doi: 10.1172/jci114769

  • 65

    Musunuru, K., Romaine, S. P. R., Lettre, G., Wilson, J. G., Volcik, K. A., Tsai, M. Y., et al. (2012). Multi-Ethnic analysis of lipid-associated loci: The NHLBI CARe project. PLoS One 7, e36473. doi: 10.1371/journal.pone.0036473

  • 66

    Musunuru, K., Strong, A., Frank-Kamenetsky, M., Lee, N. E., Ahfeldt, T., Sachs, K. V., et al. (2010). From noncoding variant to phenotype via SORT1 at the 1p13 cholesterol locus. Nature 466, 714–719. doi: 10.1038/nature09266

  • 67

    Nakayama, K., Bayasgalan, T., Tazoe, F., Yanagisawa, Y., Gotoh, T., Yamanaka, K., et al. (2010). A single nucleotide polymorphism in the FADS1/FADS2 gene is associated with plasma lipid profiles in two genetically similar Asian ethnic groups with distinctive differences in lifestyle. Hum. Genet. 127, 685–690. doi: 10.1007/s00439-010-0815-6

  • 68

    Ng, S. B., Turner, E. H., Robertson, P. D., Flygare, S. D., Bigham, A. W., Lee, C., et al. (2009). Targeted capture and massively parallel sequencing of 12 human exomes. Nature 461, 272–276. doi: 10.1038/nature08250

  • 69

    Palacio-Rojas, M., Prieto, C., Bermúdez, V., Garicano, C., Nava, T. N., Martínez, M. S., et al. (2017). Dyslipidemia: Genetics, lipoprotein lipase and HindIII polymorphism. F1000Research 6, 2073. doi: 10.12688/f1000research.12938.2

  • 70

    Paredes, S., Fonseca, L., Ribeiro, L., Ramos, H., Oliveira, J. C., and Palma, I. (2019). Novel and traditional lipid profiles in Metabolic Syndrome reveal a high atherogenicity. Sci. Rep. 9, 11792. doi: 10.1038/s41598-019-48120-5

  • 71

    Pickrell, J. K., and Pritchard, J. K. (2012). Inference of population splits and mixtures from genome-wide allele frequency data. PLoS Genet. 8, e1002967. doi: 10.1371/journal.pgen.1002967

  • 72

    Pirim, D., Wang, X., Radwan, Z. H., Niemsiri, V., Bunker, C. H., Barmada, M. M., et al. (2015). Resequencing of LPL in African Blacks and associations with lipoprotein–lipid levels. Eur. J. Hum. Genet. 23, 1244–1253. doi: 10.1038/ejhg.2014.268

  • 73

    Pirim, D., Wang, X., Radwan, Z. H., Niemsiri, V., Hokanson, J. E., Hamman, R. F., et al. (2014). Lipoprotein lipase gene sequencing and plasma lipid profile. J. Lipid Res. 55, 85–93. doi: 10.1194/jlr.m043265

  • 74

    Plaisier, C. L., Horvath, S., Huertas-Vazquez, A., Cruz-Bautista, I., Herrera, M. F., Tusie-Luna, T., et al. (2009). A systems genetics approach implicates USF1, FADS3, and other causal candidate genes for familial combined hyperlipidemia. PLoS Genet. 5, e1000642. doi: 10.1371/journal.pgen.1000642

  • 75

    Poplin, R., Ruano-Rubio, V., DePristo, M. A., Fennell, T. J., Carneiro, M. O., der Auwera, G. A. V., et al. (2018). Scaling accurate genetic variant discovery to tens of thousands of samples. bioRxiv, 201178. doi: 10.1101/201178

  • 76

    Pretis, N., Amodio, A., and Frulloni, L. (2018). Hypertriglyceridemic pancreatitis: Epidemiology, pathophysiology and clinical management. UEG J. 6, 649–655. doi: 10.1177/2050640618755002

  • 77

    Rees, M. G., Ng, D., Ruppert, S., Turner, C., Beer, N. L., Swift, A. J., et al. (2012). Correlation of rare coding variants in the gene encoding human glucokinase regulatory protein with phenotypic, cellular, and kinetic outcomes. J. Clin. Invest. 122, 205–217. doi: 10.1172/jci46425

  • 78

    Robitaille, J., Brouillette, C., Houde, A., Lemieux, S., Pérusse, L., Tchernof, A., et al. (2004). Association between the PPARalpha-L162V polymorphism and components of the metabolic syndrome. J. Hum. Genet. 49, 482–489. doi: 10.1007/s10038-004-0177-9

  • 79

    Rosenberg, N. A. (2004). distruct: a program for the graphical display of population structure. Mol. Ecol. Notes 4, 137–138. doi: 10.1046/j.1471-8286.2003.00566.x

  • 80

    Ruiz-Narvaez, E. A., Bare, L., Arellano, A., Catanese, J., and Campos, H. (2010). West African and Amerindian ancestry and risk of myocardial infarction and metabolic syndrome in the Central Valley population of Costa Rica. Hum. Genet. 127, 629–638. doi: 10.1007/s00439-010-0803-x

  • 81

    Ruiz-Narváez, E. A., Sacks, F. M., and Campos, H. (2008). Abdominal obesity and hyperglycemia mask the effect of a common APOC3 haplotype on the risk of myocardial infarction. Am. J. Clin. Nutr. 87, 1932–1938. doi: 10.1093/ajcn/87.6.1932

  • 82

    Ruiz-Narváez, E. A., Yang, Y., Nakanishi, Y., Kirchdorfer, J., and Campos, H. (2005). APOC3/A5 haplotypes, lipid levels, and risk of myocardial infarction in the Central Valley of Costa Rica. J. Lipid Res. 46, 2605–2613. doi: 10.1194/jlr.m500040-jlr200

  • 83

    Sałacka, A., Boroń, A., Gorący, I., Hornowska, I., Safranow, K., and Ciechanowicz, A. (2021). An association of ABCG8: rs11887534 polymorphism and HDL-cholesterol response to statin treatment in the Polish population. Pharmacol. Rep. 73, 1781–1786. doi: 10.1007/s43440-021-00302-7

  • 84

    Samani, N. J., Erdmann, J., Hall, A. S., Hengstenberg, C., Mangino, M., Mayer, B., et al. (2007). Genomewide association analysis of coronary artery disease. New Engl. J. Med. 357, 443–453. doi: 10.1056/nejmoa072366

  • 85

    Sarraju, A., and Knowles, J. W. (2019). Genetic testing and risk scores: Impact on familial hypercholesterolemia. Front. Cardiovasc Med. 6, 5. doi: 10.3389/fcvm.2019.00005

  • 86

    Schuster, K. B., Wilfert, W., Evans, D., Thiery, J., and Teupser, D. (2011). Identification of mutations in the lipoprotein lipase (LPL) and apolipoprotein C-II (APOC2) genes using denaturing high performance liquid chromatography (DHPLC). Clin. Chim. Acta 412, 240–244. doi: 10.1016/j.cca.2010.10.006

  • 87

    Silbernagel, G., Chapman, M. J., Genser, B., Kleber, M. E., Fauler, G., Scharnagl, H., et al. (2013). High intestinal cholesterol absorption is associated with cardiovascular disease and risk alleles in ABCG8 and ABO: Evidence from the LURIC and YFS cohorts and from a meta-analysis. J. Am. Coll. Cardiol. 62, 291–299. doi: 10.1016/j.jacc.2013.01.100

  • 88

    Son, K. Y., Son, H.-Y., Chae, J., Hwang, J., Jang, S., Yun, J. M., et al. (2015). Genetic association of APOA5 and APOE with metabolic syndrome and their interaction with health-related behavior in Korean men. Lipids Health Dis. 14, 105. doi: 10.1186/s12944-015-0111-5

  • 89

    Soto, A. G., McIntyre, A., Agrawal, S., Bialo, S. R., Hegele, R. A., and Boney, C. M. (2015). Severe Hypertriglyceridemia due to a novel p.Q240H mutation in the Lipoprotein Lipase gene. Lipids Health Dis. 14, 102. doi: 10.1186/s12944-015-0107-1

  • 90

    Stitziel, N. O., Kiezun, A., and Sunyaev, S. (2011). Computational and statistical approaches to analyzing variants identified by exome sequencing. Genome Biol. 12, 227. doi: 10.1186/gb-2011-12-9-227

  • 91

    Su, X., and Peng, D. (2020). The exchangeable apolipoproteins in lipid metabolism and obesity. Clin. Chim. Acta 503, 128–135. doi: 10.1016/j.cca.2020.01.015

  • 92

    Sudmant, P. H., Rausch, T., Gardner, E. J., Handsaker, R. E., Abyzov, A., Huddleston, J., et al. (2015). An integrated map of structural variation in 2,504 human genomes. Nature 526, 75–81. doi: 10.1038/nature15394

  • 93

    Surendran, R. P., Visser, M. E., Heemelaar, S., Wang, J., Peter, J., Defesche, J. C., et al. (2012). Mutations in LPL, APOC2, APOA5, GPIHBP1 and LMF1 in patients with severe hypertriglyceridaemia. J. Intern Med. 272, 185–196. doi: 10.1111/j.1365-2796.2012.02516.x

  • 94

    Tai, E. S., Demissie, S., Cupples, L. A., Corella, D., Wilson, P. W., Schaefer, E. J., et al. (2002). Association between the PPARA L162V polymorphism and plasma lipid levels: The Framingham Offspring Study. Arterioscler. Thromb. Vasc. Biol. 22, 805–810. doi: 10.1161/01.atv.0000012302.11991.42

  • 95

    Tejedor, M. T., Garcia-Sobreviela, M. P., Ledesma, M., and Arbones-Mainar, J. M. (2014). The apolipoprotein E polymorphism rs7412 associates with body fatness independently of plasma lipids in middle aged men. PLoS One 9, e108605. doi: 10.1371/journal.pone.0108605

  • 96

    Teslovich, T. M., Musunuru, K., Smith, A. V., Edmondson, A. C., Stylianou, I. M., Koseki, M., et al. (2010). Biological, clinical and population relevance of 95 loci for blood lipids. Nature 466, 707–713. doi: 10.1038/nature09270

  • 97

    Thompson, J. F., Hyde, C. L., Wood, L. S., Paciga, S. A., Hinds, D. A., Cox, D. R., et al. (2009). Comprehensive whole-genome and candidate gene analysis for response to statin therapy in the Treating to New Targets (TNT) cohort. Circ. Cardiovasc Genet. 2, 173–181. doi: 10.1161/circgenetics.108.818062

  • 98

    Vasquez-Vidal, I. (2014). Impact of Fatty Acid Desaturase (FADS) genotypes on the relationship between serum lipids and dietary fat intake.

  • 99

    Villarreal-Molina, T., Posadas-Romero, C., Romero-Hidalgo, S., Antúnez-Argüelles, E., Bautista-Grande, A., Vargas-Alarcón, G., et al. (2012). The ABCA1 gene R230C variant is associated with decreased risk of premature coronary artery disease: The genetics of atherosclerotic disease (GEA) study. PLoS One 7, e49285. doi: 10.1371/journal.pone.0049285

  • 100

    Vinueza, R., Boissonnet, C. P., Acevedo, M., Uriza, F., Benitez, F. J., Silva, H., et al. (2010). Dyslipidemia in seven Latin American cities: CARMELA study. Prev. Med. 50, 106–111. doi: 10.1016/j.ypmed.2009.12.011

  • 101

    Vohl, M. C., Lepage, P., Gaudet, D., Brewer, C. G., Bétard, C., Perron, P., et al. (2000). Molecular scanning of the human PPARα gene: Association of the L162V mutation with hyperapobetalipoproteinemia. J. Lipid Res. 41, 945–952. doi: 10.1016/s0022-2275(20)32037-x

  • 102

    Wang, J., Raskin, L., Samuels, D. C., Shyr, Y., and Guo, Y. (2015). Genome measures used for quality control are dependent on gene function and ancestry. Bioinformatics 31, 318–323. doi: 10.1093/bioinformatics/btu668

  • 103

    Wang, L. J., Zhang, C. W., Su, S. C., Chen, H. I. H., Chiu, Y. C., Lai, Z., et al. (2019). An ancestry informative marker panel design for individual ancestry estimation of Hispanic population using whole exome sequencing data. BMC Genomics 20, 1007. doi: 10.1186/s12864-019-6333-6

  • 104

    Wierzbicki, A. S., and Reynolds, T. M. (2019). Genetic risk scores in lipid disorders. Curr. Opin. Cardiol. 34, 406–412. doi: 10.1097/hco.0000000000000623

  • 105

    Willer, C. J., Schmidt, E. M., Sengupta, S., Peloso, G. M., Gustafsson, S., Kanoni, S., et al. (2013). Discovery and refinement of loci associated with lipid levels. Nat. Genet. 45, 1274–1283. doi: 10.1038/ng.2797

  • 106

    Yang, Y., Ruiz-Narvaez, E., Niu, T., Xu, X., and Campos, H. (2004). Genetic variants of the lipoprotein lipase gene and myocardial infarction in the Central Valley of Costa Rica. J. Lipid Res. 45, 2106–2109. doi: 10.1194/jlr.m400202-jlr200

  • 107

    Yee, J., Kim, W., Chang, B. C., Chung, J. E., Lee, K. E., and Gwak, H. S. (2019). APOB gene polymorphisms may affect the risk of minor or minimal bleeding complications in patients on warfarin maintaining therapeutic INR. Eur. J. Hum. Genet. 27, 1542–1549. doi: 10.1038/s41431-019-0450-1

  • 108

    Yu, Z., Huang, T., Zheng, Y., Wang, T., Heianza, Y., Sun, D., et al. (2017). PCSK9 variant, long-chain n–3 PUFAs, and risk of nonfatal myocardial infarction in Costa Rican Hispanics. Am. J. Clin. Nutr. 105, 1198–1203. doi: 10.3945/ajcn.116.148106

  • 109

    Zhou, Y., Mägi, R., Milani, L., and Lauschke, V. M. (2018). Global genetic diversity of human apolipoproteins and effects on cardiovascular disease risk. J. Lipid Res. 59, 1987–2000. doi: 10.1194/jlr.p086710

Summary

Keywords

dyslipidemia, genetic variant, whole genome sequences (WGS), Costa Rica, allele frequencies, pharmacogenomic, Latin America

Citation

Valverde-Hernández JC, Flores-Cruz A, Chavarría-Soley G, Silva de la Fuente S and Campos-Sánchez R (2023) Frequencies of variants in genes associated with dyslipidemias identified in Costa Rican genomes. Front. Genet. 14:1114774. doi: 10.3389/fgene.2023.1114774

Received

02 December 2022

Accepted

14 March 2023

Published

30 March 2023

Volume

14 - 2023

Edited by

Alpo Juhani Vuorio, University of Helsinki, Finland

Reviewed by

Alexey N. Meshkov, National Research Center for Preventive Medicine, Russia

Edward Antonio Ruiz-Narvaez, University of Michigan, United States

Copyright

*Correspondence: Rebeca Campos-Sánchez,

Deceased

This article was submitted to Genetics of Common and Rare Diseases, a section of the journal Frontiers in Genetics

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
