Frequencies of variants in genes associated with dyslipidemias identified in Costa Rican genomes

Juan C. Valverde-Hernández1, Andrés Flores-Cruz1, Gabriela Chavarría-Soley2,1, Sandra Silva de la Fuente1, Rebeca Campos-Sánchez1*

1Center for Research in Cellular and Molecular Biology, Faculty of Sciences, University of Costa Rica, Costa Rica
2School of Biology, Faculty of Sciences, University of Costa Rica, Costa Rica

Submitted to Journal: Frontiers in Genetics
Specialty Section: Genetics of Common and Rare Diseases
Article type: Original Research Article
Manuscript ID: 1114774
Received on: 02 Dec 2022
Revised on: 13 Mar 2023
Journal website link: www.frontiersin.org
In review

Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Author contribution statement
RCS and SSF designed the study. RCS and JCV collected the genomics data. JCV and AFC performed the data analysis. JCV, AFC, GCS, and RCS wrote the manuscript. All authors read and approved the final manuscript.

Keywords
Dyslipidemia, Genetic variant, whole genome sequences (WGS), Costa Rica, allele frequencies, pharmacogenomic

Abstract
Word count: 283

Dyslipidemias are risk factors in diseases of significant importance to public health, such as atherosclerosis, a condition that contributes to the development of cardiovascular disease. Unhealthy lifestyles, the pre-existence of diseases, and the accumulation of genetic variants in some loci contribute to the development of dyslipidemia. The genetic causality behind these diseases has been studied primarily on populations with extensive European ancestry. Only some studies have explored this topic in Costa Rica, and none have focused on identifying variants that can alter blood lipid levels and quantifying their frequency.
To fill this gap, this study focused on identifying variants in 69 genes involved in lipid metabolism using genomes from two studies in Costa Rica. We contrasted the allelic frequencies with those of groups reported in the 1000 Genomes Project and gnomAD and identified potential variants that could influence the development of dyslipidemias. In total, we detected 2600 variants in the evaluated regions. After various filtering steps, we obtained 18 variants that have the potential to alter the function of 16 genes; nine variants have pharmacogenomic or protective implications, eight are categorized as high risk by VEP, and eight were found in other Latin American genetic studies of lipid alterations and the development of dyslipidemia. Some of these variants have been linked to changes in blood lipid levels in other global studies and databases. In future studies, we propose to confirm at least 40 variants of interest from 23 genes in a larger cohort from Costa Rica and Latin American populations to determine their relevance regarding the genetic burden for dyslipidemia. Additionally, more comprehensive studies are needed that include diverse clinical, environmental, and genetic data from patients and controls, as well as functional validation of the variants.

Contribution to the field
Dyslipidemias are risk factors in diseases of significant importance to public health, such as acute pancreatitis and atherosclerosis, conditions that contribute to the development of pancreatic cancer and cardiovascular diseases, respectively. Unhealthy lifestyles, the pre-existence of diseases, and the accumulation of genetic variants contribute to the development of dyslipidemia. Latin America is highly affected by these diseases. For instance, in Costa Rica, some studies have focused on particular genes and variants that could influence the development of dyslipidemia.
Here, we explored two collections of whole genome sequences from Costa Rica to extract variants of interest and their allelic frequencies from 69 genes. These collections resemble the Central Valley of Costa Rica and other Latin American populations, such as Colombia. This is important because we can reuse these genomes and others to address diverse genetic questions. Among all variants detected using a bioinformatics pipeline, we identified 40 variants in 23 genes that can potentially alter lipid function and that can be confirmed in future studies in Costa Rica and Latin America. This study contributes to the knowledge of the genetic burden of dyslipidemia, which should be complemented with environmental and phenotypic data of patients and controls, possibly in collaboration with the Costa Rican Social Security System (Caja Costarricense del Seguro Social - C.C.S.S.). Eventually, functional validation of the variants detected in patients should be performed to provide conclusive evidence of the association with dyslipidemia.

Funding information
Universidad de Costa Rica provided funds for a student assistant in the project B9-259. The University can also provide up to $1000 in publishing fees.

Ethics statements

Studies involving animal subjects
Generated Statement: No animal studies are presented in this manuscript.

Studies involving human subjects
Generated Statement: The studies involving human participants were reviewed and approved by Comité Ético Científico, Universidad de Costa Rica. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

Inclusion of identifiable human data
Generated Statement: No potentially identifiable human images or data is presented in this study.

Data availability statement
Generated Statement: Publicly available datasets were analyzed in this study.
This data can be found here: phs000988.V4.P1 can be requested directly through dbGaP. The data from Chavarria-Soley et al. (2021) can be requested from the original authors.

Frequencies of variants in genes associated with dyslipidemias identified in Costa Rican genomes

Juan Carlos Valverde-Hernández1, Andrés Flores-Cruz1, Gabriela Chavarría-Soley1,2, Sandra Silva de la Fuente1♰, Rebeca Campos-Sánchez1*

1 Centro de Investigación en Biología Celular y Molecular, University of Costa Rica, San José, Costa Rica.
2 Escuela de Biología, University of Costa Rica, San José, Costa Rica.
♰ Deceased
* Correspondence: Rebeca Campos-Sánchez, rebeca.campos@ucr.ac.cr

1 Introduction

Dyslipidemias are a group of conditions characterized by abnormal lipid levels. High lipid profiles include hyperlipidemias or hyperlipoproteinemia. These are worldwide diseases affecting many people. In Latin American cities such as Barquisimeto, Lima, and Bogotá, this condition has been recorded in >70% of men and >50% of women (Vinueza et al., 2010). Costa Rica is no exception. In a study conducted in the 2000s involving 107,000 inhabitants of San José, it was reported that 36% of men and 22% of women had hypercholesterolemia, while 48% of men and 52% of women reported hypertriglyceridemia (Gutiérrez-Peña and Romero-Zúñiga, 2010).
These conditions have been closely linked to the development of complex ailments such as cardiovascular diseases and acute pancreatitis (Bruikman et al., 2017; Pretis et al., 2018; Paredes et al., 2019), making hyperlipidemia a public health problem in the 21st century.

A sedentary lifestyle and poor eating habits can profoundly impact the development of these diseases (Brahm and Hegele, 2013). The clinical approach to these cases usually includes the implementation of exercise regimens and caloric restriction. Additionally, multiple pieces of evidence have shown that the genetic characteristics of an individual play a leading role in the development of hyperlipidemias (Johansen et al., 2011; Brahm and Hegele, 2013; Wierzbicki and Reynolds, 2019). Currently, the diseases are considered mostly polygenic. However, variants in genes such as the lipoprotein lipase (LPL), the low-density lipoprotein receptor (LDLR), and apolipoprotein B (APOB) tend to have more marked effects than other genes involved in lipid metabolism (Johansen et al., 2011, 2014; Lewis et al., 2015; Dron et al., 2020a, 2020b).

Most of the studies aimed at identifying the effect of the genetic component on the presence of alterations in lipid metabolism and the development of dyslipidemia have been performed mainly in Anglo-Saxon and European countries. The study by Andaleon et al. (2019) on Latin American populations is one of the most exhaustive of this kind in this region, including Central Americans.
However, little is currently known in Latin American populations about the genetic variants and frequencies in genes previously linked to these conditions in other global studies.

Particularly in Costa Rica, few studies on this matter have been published. In one set of studies, from the Dietary Fat and Heart Disease in Costa Rica project (also known as the Costa Rica Heart Study), researchers quantified the allelic frequencies of specific variants in the APOC, LPL, APOE, PCSK9, FADS1-2-3, and USF1 genes in 4000 individuals from the Costa Rican Central Valley. They reported an association of some of these variants with an increased risk of coronary heart disease and hyperlipidemia (Campos et al., 2001; Brown et al., 2003; Yang et al., 2004; Ruiz-Narváez et al., 2005, 2008; Gong et al., 2011; Aslibekyan et al., 2012; Yu et al., 2017). Two other research projects have focused on identifying genetic variants in regions of interest, such as the LPL gene and the APOCII promoter region, in a group of 38 Costa Ricans with hypertriglyceridemia (González-Cordero, 2018; Gutiérrez-Ávila, 2019).

Here, we used data from 258 whole genomes from the Central Valley of Costa Rica to identify genetic variants in genes linked to the incidence of dyslipidemia and estimate their allelic frequencies as a proxy of genetic burden. This is the first national portrait of the frequency of previously reported risk variants in genes associated with this group of diseases obtained from genomic data. Additionally, we report the allelic frequencies of variants in genes of interest previously identified in Costa Ricans (i.e., LDLR and APOCII) and Latin American populations.
The information generated in this study will help guide and contextualize future studies on dyslipidemia in Costa Rica and the region; possible next steps include validation of 43 variants of interest in a larger population and determining the impact of these findings on the national healthcare system. Moreover, this study reflects the importance of studies that include clinical, environmental, and genetic data from patients and controls.

2 Materials and methods

2.1 Samples and genomic data

We used anonymized whole genome sequence (WGS) data from two collections. One is from the repository PSYCH-CV, a collection of Costa Rican WGS from the NIMH-funded (National Institute of Mental Health) study U01MH105630-04S1, which included subjects with mania and psychosis and their relatives recruited under different studies and anonymized in the WGS data repository (Chavarria-Soley et al., 2021). From the families, we selected only unrelated individuals without a mental disorder diagnosis, for a total of 23 individuals. Sequencing was carried out on the Illumina HiSeq 2000 platform with paired-end reads. The data had a minimum coverage of 30x and a read length of 100 bp. The data were previously aligned with the BWA-MEM tool of the BWA v0.7.15 package using the GRCh38 reference genome and stored in CRAM format.
The second data set was from the project The Genetic Epidemiology of Asthma in Costa Rica (dbGaP phs000988.V4.P1). Unrelated individuals without an asthma diagnosis were selected using the dbGaP metadata. In total, 234 subjects met these criteria (Supplementary Table 1), and CRAM files were downloaded from the database. The genomes of both databases were combined into a single group of 258 subjects called CR-WGS for the variant annotation.

2.2 Variant discovery and genotype

The analysis was limited to all coordinates corresponding to the transcriptome according to the GFF3 of Ensembl 106 for the GRCh38 genome, including miRNAs and lncRNAs. We call these regions the exome. Additionally, we extracted two sets of ancestry informative markers (AIMs) reported by Campos-Sánchez et al. (2013) and by Galanter et al. (2012). Each coordinate interval was extended by 300 bp upstream and downstream (Table 1).

As a quality control measure on the reads, duplicate reads were first removed using the MarkDuplicates tool, which is part of the GATK package. Next, to adjust for observed systematic errors caused by the sequencer, the GATK machine learning procedure called Base Quality Score Recalibration (BQSR) was implemented using the BaseRecalibrator and ApplyBQSR commands.

We used HaplotypeCaller, GenomicsDBImport, GenotypeGVCFs, and MergeVcfs for indel-like or SNV-like variant calling. During this process, GRCh38/hg38 was selected as the reference genome, and the dbSNP Build 151 variant database was used as the reference source for variants.

As a quality check on the identified variants, a score referred to as VQSLOD was calculated for each variant using GATK's machine learning model, Variant Quality Score Recalibrator (VQSR). To do this, metrics obtained for each variant are fed to the VQSR model, including variant depth, strand bias, and the variant quality assigned in the previous stage, along with lists of variants with different degrees of confidence (DePristo et al., 2011). The evaluation of variant calling errors was performed for indels and SNVs separately.
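
The interval extension mentioned at the start of this section can be illustrated with a minimal sketch. It assumes 1-based inclusive coordinates (as in GFF3) and also merges intervals that overlap after padding; the merging step, the function name `pad_and_merge`, and the example coordinates are illustrative assumptions, not a description of the pipeline actually used. The sketch clamps the start at position 1 but does not clamp at chromosome ends.

```python
from collections import defaultdict

PADDING_BP = 300  # extension applied on each side, as stated in the text


def pad_and_merge(intervals, padding=PADDING_BP):
    """Extend each (chrom, start, end) interval and merge overlaps.

    Coordinates are assumed to be 1-based and inclusive (GFF3 style).
    Merging overlapping or abutting intervals is an assumption of this sketch.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        # Clamp at 1 so padding never yields a non-positive coordinate.
        by_chrom[chrom].append((max(1, start - padding), end + padding))

    merged = []
    for chrom in sorted(by_chrom):
        current_start = current_end = None
        for start, end in sorted(by_chrom[chrom]):
            if current_start is None:
                current_start, current_end = start, end
            elif start <= current_end + 1:
                # Overlapping or directly adjacent: grow the current interval.
                current_end = max(current_end, end)
            else:
                merged.append((chrom, current_start, current_end))
                current_start, current_end = start, end
        merged.append((chrom, current_start, current_end))
    return merged


# Hypothetical coordinates, for illustration only.
example = [
    ("chr8", 1000, 1200),
    ("chr8", 1700, 1900),
    ("chr8", 5000, 5100),
    ("chr1", 100, 250),
]
padded = pad_and_merge(example)
for chrom, start, end in padded:
    print(chrom, start, end)
```

In this toy input, the first two chr8 intervals overlap once padded and are reported as a single region, which avoids calling variants twice in the shared flank.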

The databases supplied to the VQSR model are stored in GATK's "Resource bundle" repository, "genomics-public-data", except for the dbSNP v151 database, which was extracted from the FTP site of the National Center for Biotechnology Information of the United States (NCBI). To calculate the error score for indels, those highly validated in the Mills and 1000 Genomes gold standard data set (Mills et al., 2006) were considered true variants. The training data were the genotypes from the first phase of the 1000 Genomes Project (1KGP) study obtained with the Axiom Exome Plus chip. The dbSNP v151 database was also supplied to the model, but it was considered a database with a lower degree of validation.

To calculate error scores for SNVs, we considered true variants as those found in the HapMap database phase 3 release 3, part of the International HapMap Project (Consortium et al., 2010). The training databases were defined as the panel of phase 3 1KGP genotyped with the OMNI 2.5 chip and the database of genotypes with a high confidence level from phase 1 of 1KGP. Finally, the dbSNP database was the reference source for known variants. Using ApplyVQSR, we excluded from further analysis variants whose VQSLOD fell below the threshold corresponding to the 97.5% tranche for SNV-like variants and the 95% tranche for indel-like variants.

2.3 Evaluation of bioinformatics processing

Using the GATK CollectVariantCallingMetrics tool, we calculated the transition vs. transversion ratio (Ti/Tv) and the heterozygous vs. homozygous alternative allele ratio (Het/non-ref Hom), metrics commonly used to describe the quality of the variant calling process. These metrics were obtained separately for each chromosome and at the exome level. The values obtained were compared between both Costa Rican cohorts using a t-test.
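
To make the two quality metrics concrete, the following minimal sketch computes a Ti/Tv ratio from (reference, alternative) base pairs and a Het/non-ref Hom ratio from VCF-style genotype strings. It is an illustration only and not the implementation inside CollectVariantCallingMetrics; the function names and the toy inputs are assumptions, and counting genotypes such as `1/2` as heterozygous is a choice made for this sketch.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    if ref == alt:
        raise ValueError("ref and alt must differ for an SNV")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ti_tv_ratio(snvs):
    """Ratio of transitions to transversions over (ref, alt) pairs."""
    transitions = sum(1 for ref, alt in snvs if is_transition(ref, alt))
    transversions = len(snvs) - transitions
    return transitions / transversions if transversions else float("nan")


def het_hom_ratio(genotypes):
    """Ratio of heterozygous calls to homozygous non-reference calls.

    Genotypes are diploid VCF-style strings such as "0/1" or "1|1".
    Calls with missing alleles (".") are skipped.
    """
    het = hom_alt = 0
    for genotype in genotypes:
        alleles = genotype.replace("|", "/").split("/")
        if "." in alleles or len(alleles) != 2:
            continue
        if alleles[0] != alleles[1]:
            het += 1  # includes multi-allelic hets such as 1/2 (sketch choice)
        elif alleles[0] != "0":
            hom_alt += 1
    return het / hom_alt if hom_alt else float("nan")


# Toy data, for illustration only.
toy_snvs = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "A"), ("T", "G")]
toy_genotypes = ["0/1", "1/1", "0|1", "0/0", "./.", "1/1", "0/1"]
print(ti_tv_ratio(toy_snvs))         # 3 transitions / 2 transversions = 1.5
print(het_hom_ratio(toy_genotypes))  # 3 het / 2 hom-alt = 1.5
```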

Additionally, to evaluate the concordance between the allele frequencies, a linear model was generated to contrast the frequencies previously reported in the Costa Rica Heart Study publications and those obtained for CR-WGS (Brown et al., 2003; Ruiz-Narváez et al., 2005; Ruiz-Narvaez et al., 2010).

2.4 Genetic ancestry analysis

To determine if the subjects included in both Costa Rican cohorts present an ancestry profile that fits within the pattern observed in other Latin American populations, we used the genotypes of 446 AIMs (Ancestry Informative Markers) described by Galanter et al. (2012), and the ancestral populations from the 1KGP panel (European-EUR, African-AFR, and East Asian-EAS) (Auton et al., 2015; Sudmant et al., 2015). Given the scarcity of genomic data for Native American populations, we used the EAS group as a proxy of Native American ancestry, since most of the ancestry of Native Americans comes from the East Asian population (Wang et al., 2019). Subjects from Barbados (ACB) and subjects with African ancestry from the South West of USA (ASW) were not considered members of AFR, nor were Utahns (CEU) part of the EUR group since they are Americans. The ACB, CLM (Colombia), MXL (Mexico), PEL (Peru), and PUR (Puerto Rico) groups were considered Latin American.

The genotypes of the 446 AIMs were downloaded for 200 randomly selected individuals for each ancestral group (AFR, EUR, EAS) and all available samples for ACB, ASW, CEU, CLM, MXL, PEL, and PUR individuals. Genotypes were extracted for both Costa Rican cohorts and integrated with the 1KGP dataset. Principal component analysis (PCA) was performed using the number of alternative alleles per AIM. Only AIMs without missing genotypes were included. We estimated the similarity relationships between American populations and AFR, EUR, and EAS using the allelic frequencies in the TreeMix v1.13 program (Pickrell and Pritchard, 2012).

To assess whether the ancestry of both Costa Rican cohorts was consistent with the profile previously reported for subjects from the Costa Rican Central Valley, we performed a genetic admixture analysis using STRUCTURE v2.3.4 (Hubisz et al., 2009) with 78 AIMs described by Campos-Sánchez et al. (2013). We used the same ancestral groups as before (AFR, EUR, EAS).
We integrated the genotypes for these AIMs in both Costa Rican cohorts and those reported for Costa Rican groups from the North Region (2013-NR), South Region (2013-SR), the Caribbean region (2013-CR), and the Central Valley (2013-CV) (Campos-Sánchez et al., 2013). The integrated database contained 1067 individuals for the analysis in STRUCTURE (Hubisz et al., 2009). The run parameters were as follows: the 'Length of Burnin Period' (the number of iterations used to reduce the effect of the initial configuration) was set to 50000; the 'Number of MCMC Reps after Burnin' (the number of iterations used to obtain accurate estimates) was set to 100000; individuals were assumed to be genetically admixed; allele frequencies were allowed to be correlated among groups; and the ancestral groups were EUR, AFR, and EAS. With these parameters, we performed ten simulations assuming that the population had three ancestral groups. These results were merged using CLUMPP and DISTRUCT through the CLUMPAK tool (Rosenberg, 2004; Jakobsson and Rosenberg, 2007; Kopelman et al., 2015). Three plots were generated: a genetic structure plot, a ternary plot of genetic admixture, and a principal component analysis (PCA) using the number of alternative alleles per variant. Only AIMs with complete genotypes were included. A Kruskal-Wallis test was applied to assess ancestry similarities among Costa Rican and Latin American populations; we then built 95% confidence intervals with Tukey correction to identify specific differences between pairs of populations.

2.5 Annotation of variants

We studied the variants identified within a set of 69 genes that have a key role in lipid metabolism or that contain variants that have been associated with changes in blood lipid levels (Table 1).
We annotated the variants found in the regions of interest with information hosted in Ensembl release 109 using its REST API v15.5 (Cunningham et al., 2021). Pathogenicity predictions, phenotypic associations, and population genetics information were extracted for each variant. The variant type was determined using Variant Effect Predictor (VEP) v7 (Cunningham et al., 2021). In silico predictions of pathogenicity for missense variants were generated using the traditional tools PolyPhen2 and SIFT (Flanagan et al., 2010) and two more recently developed tools, ClinPred and REVEL (Ioannidis et al., 2016; Alirezaie et al., 2018; Gunning et al., 2021). Phenotypic association annotations were obtained through the Ensembl REST API, which draws on the ClinVar and NHGRI-EBI GWAS Catalog databases (Landrum et al., 2017; Buniello et al., 2019).

To contrast the population frequencies of the variants found in the CR-WGS group with those reported in extensively characterized populations, we collected the frequencies of the 1KGP EAS, EUR, AFR, and AMR groups and of all 1KGP samples combined (ALL). Fisher's exact tests were performed to determine which of the variants found have a different allelic frequency in the group of Costa Rican genomes compared to the 1KGP populations. A significance level of 0.05 adjusted with the Bonferroni correction was used as the threshold to determine whether the frequency differed between the two populations.

2.6 Identification and characterization of variants of interest

The study considered a polymorphic site as a variant of interest if (1) it was a risk variant according to three or more sources of functional annotation or (2) the variant was previously reported in Costa Rica or Latin America within the context of lipid metabolism and dyslipidemias. This produced two lists of variants of interest: one consists of risk variants annotated by bioinformatic predictions found in the genes from Table 1, and the other includes the variants that have been reported in Costa Ricans and Latin Americans in the genes of interest in the context of lipid metabolism or dyslipidemia.
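The allele-frequency comparison described above (Fisher's exact test with a Bonferroni-adjusted threshold) can be illustrated with a minimal sketch. This is not the code from the study's repository: the function names, the input layout, and the toy counts are hypothetical, and the sketch assumes that the number of tests used for the Bonferroni adjustment equals the number of variants compared. It relies only on the standard library (`math.comb`, which requires Python 3.8 or later); in practice a library routine such as `scipy.stats.fisher_exact` would be the more usual choice.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)

    def weight(x):
        # Unnormalised hypergeometric weight of the table whose top-left cell is x.
        return comb(col1, x) * comb(n - col1, row1 - x)

    observed = weight(a)
    # Sum every table with the same margins that is no more likely than the observed one.
    extreme = sum(w for w in map(weight, range(lo, hi + 1)) if w <= observed)
    return extreme / comb(n, row1)


def differing_variants(allele_counts, alpha=0.05):
    """Return the IDs of variants whose allele frequency differs between two groups.

    allele_counts maps a variant ID to (alt_1, total_1, alt_2, total_2), i.e. the
    alternate-allele count and total allele count in each group (hypothetical layout).
    """
    threshold = alpha / len(allele_counts)  # Bonferroni-adjusted significance level
    flagged = []
    for variant_id, (alt_1, total_1, alt_2, total_2) in allele_counts.items():
        p = fisher_exact_two_sided(alt_1, total_1 - alt_1, alt_2, total_2 - alt_2)
        if p < threshold:
            flagged.append(variant_id)
    return flagged


# Hypothetical toy counts, not data from the study.
toy_counts = {
    "variant_A": (10, 100, 10, 100),  # identical frequencies in both groups
    "variant_B": (40, 100, 2, 100),   # strongly different frequencies
}
print(differing_variants(toy_counts))
```

Working with unnormalised integer weights keeps the comparison "no more likely than the observed table" exact, so no floating-point tolerance is needed when deciding which tables count as extreme.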

The risk variants determined by bioinformatic predictions had more than one count and met at least three of the following criteria: (1) being categorized by PolyPhen2 as possibly damaging (P) or probably damaging (D), (2) being categorized by SIFT as deleterious, with a score below 0.05, (3) having a REVEL score greater than 0.5 (REVEL combines 13 predictive tools), (4) having a ClinPred score greater than 0.5, or (5) having a phenotype reported in ClinVar or the NHGRI-EBI GWAS Catalog that was related to lipid metabolism or to an increased risk of developing dyslipidemia. The pharmacogenomic variants were identified from ClinVar and the NHGRI-EBI GWAS Catalog and annotated with PharmGKB (www.pharmgkb.org).

We used the jVenn tool (Bardou et al., 2014) to generate Venn diagrams to visualize the consensus between the different sources in determining risk variants.

We calculated the number of variants in homozygous and heterozygous states, and the total present per subject, to reflect the genetic burden of dyslipidemia-related variants in the population. These metrics were obtained for the set of variants categorized by VEP as LOW, MODERATE, and HIGH risk, and for the set of variants categorized as variants of interest in the present study. The data were represented in distribution plots.

2.7 Code for bioinformatic analysis

In addition to the tools mentioned above, we used the free programming languages Python 3.7 and R 4.1.2. Python was used to manage the variant call workflow, annotate the variants, manipulate the data, and generate visualizations. R was used to generate the visualizations produced from the TreeMix results. All code can be found in the GitHub repository https://github.com/jcvalverdehernandez/cr_dislipidemia_2022.
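As an illustration of the selection rule in Section 2.6 (more than one count and at least three of the five criteria), the following minimal Python sketch scores a single annotated variant. It is not taken from the repository above: the dictionary keys and function names are hypothetical, and the example records are invented for demonstration only.

```python
def count_risk_criteria(variant):
    """Count how many of the five Section 2.6 criteria an annotated variant meets.

    `variant` is a dict with hypothetical keys; missing scores count as unmet.
    """
    sift = variant.get("sift_score")
    revel = variant.get("revel_score")
    clinpred = variant.get("clinpred_score")
    checks = [
        variant.get("polyphen2_class") in {"P", "D"},  # (1) PolyPhen2 class P or D
        sift is not None and sift < 0.05,              # (2) SIFT score below 0.05
        revel is not None and revel > 0.5,             # (3) REVEL score above 0.5
        clinpred is not None and clinpred > 0.5,       # (4) ClinPred score above 0.5
        bool(variant.get("lipid_phenotype")),          # (5) lipid-related ClinVar/GWAS phenotype
    ]
    return sum(checks)


def is_risk_variant(variant, min_criteria=3):
    """Apply the rule: more than one count and at least `min_criteria` criteria met."""
    return variant.get("allele_count", 0) > 1 and count_risk_criteria(variant) >= min_criteria


# Invented example records, not variants from the study.
example = {
    "polyphen2_class": "D",
    "sift_score": 0.01,
    "revel_score": 0.62,
    "clinpred_score": 0.30,
    "lipid_phenotype": False,
    "allele_count": 3,
}
singleton = dict(example, allele_count=1)

print(count_risk_criteria(example), is_risk_variant(example), is_risk_variant(singleton))
```

Treating a missing score as an unmet criterion is a design assumption of this sketch; the manuscript does not state how missing predictions were handled.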

3 Results

3.1 Variant call metrics met exome quality standards

The Ti/Tv ratio obtained for both datasets had a mean of 2.33 (Fig 2A). For exomes, Ti/Tv values around 3.0 are reported to indicate that the data have adequate quality (Wang et al., 2015). This metric is sensitive to the genome region and functionality; thus, including intronic regions could reduce this ratio, similar to what we observe in our data. We used transcriptome coordinates that include coding and non-coding sequences (miRNAs and lncRNAs), as specified in the transcript coordinates from Ensembl 106.

The average HET/non-ref HOM ratio observed for both cohorts was 1.66 (Fig 2B). The expected value of this index is 2.0 for whole-genome sequencing variants. However, this highly depends on ancestry (Wang et al., 2015). In the study by Wang et al. (2015), average exome estimates varied from 1.4 to 2 in Asians and Africans, respectively.

Additionally, an average of 137593 SNVs and 13273 indels per exome was identified per individual for both cohorts (Fig 2C). All metrics per chromosome and cohort are in Supplementary Figure 1 and Supplementary Table 2. Moreover, PSYCH-CV and dbGAP-CV presented similar values for the three metrics (t-test p-value > 0.05).

Finally, allelic frequencies previously reported at various polymorphic sites in the Costa Rica Heart Study were significantly correlated (r=1.00, p=1.8e-13) with those observed in CR-WGS. This result suggests a high similarity between these cohorts and that variant calling was accurate (Supplementary Figure 2).

3.2 The ancestry of Costa Rican genomes is consistent with previous studies

The ancestry analyses validated that the PSYCH-CV and dbGAP-CV cohorts have a genetic profile consistent with that expected from a random sample of Costa Ricans from the Central Valley. They also reveal an ancestry profile similar to other Latin American groups in 1KGP, such as CLM, MXL, and PEL.

Principal component analysis (PCA) captured around 40.58% (between principal components 1 and 2) of the genetic variation using the panel of 446 AIMs in the three ancestral groups and the six American groups (Figure 3A). We observed that the PSYCH-CV and dbGAP-CV individuals appear most similar to the Colombian (CLM) subjects in European and Asian ancestry, and in AFR ancestry only for PSYCH-CV. Additionally, PSYCH-CV presented similarities with the AFR and EAS components of Mexicans (MXL) (Supplementary Table 3). These observations were verified by building 95% confidence intervals (Supplementary Table 4), which are also reflected in the genetic structure plot (Figure 3C).
The genetic distance tree also groups Costa Rican genomes with Latin American and European groups (Figure 3B).

When contrasting the genetic ancestry of PSYCH-CV and dbGAP-CV using 78 AIMs, we observed complete similarity between them in all three ancestry components. Using these same markers, we compared ancestry with the Costa Rican groups described by Campos-Sánchez et al. (2013) and observed the most significant similarity with the Central Valley group (2013-VC) in all three ancestry components for PSYCH-CV, but only for AFR and EAS for dbGAP-CV. Moreover, both groups showed similar AFR ancestry compared to the South (2013-SR), and EAS ancestry compared to the Caribbean Region (2013-CR). PSYCH-CV also presented AFR ancestry similar to 2013-CR (Figure 4A and B, Supplementary Table 3). These observations were verified by building 95% confidence intervals (Supplementary Table 4). The rest of the confidence intervals reflected statistically significant differences. The PCA captured approximately 36.33% of the genetic variation between principal components 1 and 2. These results provided confidence that CR-WGS represented the Central Valley population of Costa Rica.

3.3 Polymorphic sites identified in genes of interest

We identified 2600 polymorphic sites in CR-WGS in the 69 genes of interest (Table 1), consisting of 2460 SNVs and 140 indels (Table 2). Of these, 2553 were annotated in dbSNP; the remaining 47 are new variants not previously reported in dbSNP. Multiallelic variants represented 2.9% of all variants detected.

We classified 2277 variants (unique rsIDs) into 2769 impact annotations assigned in VEP based on the in silico consequence of the variant according to the Sequence Ontology (SO) term. This means that a variant could have different impact annotations depending on the region of the gene and the alternative transcript it belongs to.
For example, rs5088 in APOA2 had five annotations: intron variant, synonymous variant, 3-prime UTR variant, downstream gene variant, and splice region variant; three had a MODIFIER and two had a LOW impact. In summary, 349 variants had a LOW impact (low risk of affecting gene transcripts), 397 a MODERATE impact, and 8 a HIGH impact. It was impossible to assign an expected risk to the consequences assigned to 1941 of the variants using VEP; these consequences are referred to as MODIFIER (Supplementary Figure 3). To get an idea of the genetic burden for dyslipidemia in our sample, we plotted the number of variants per individual (Figure 5 A-C). The subjects presented on average 56.22 LOW impact variants (34.9 and 21.36 in heterozygous and homozygous state, respectively), 47.29 MODERATE impact variants (27.23 and 20.06 in heterozygous and homozygous state, respectively), and 1.03 HIGH impact variants (0.82 and 0.43 in heterozygous and homozygous state, respectively).

According to the Fisher's exact tests used to contrast the allele frequencies of the 2174 variants detected in CR-WGS with those of the 1KGP groups, the AMR, EUR, and ALL groups are the most similar to CR-WGS (Figure 6A). These differed from CR-WGS in the allelic frequencies of 54, 214, and 452 variants, respectively (Figure 6B). On the other hand, EAS and AFR presented statistically significant differences from CR-WGS in the allele frequencies of 694 and 1082 polymorphic sites, respectively (Supplementary Figure 4).

The eight variants associated with high-risk consequences according to VEP are summarized in Table 3. These are located in eight genes and include stop gained and start lost annotations; most were heterozygous and presented 1 to 37 copies in CR-WGS.
Interestingly, rs328G and rs132642T are homozygous in two different individuals each. SNV rs328 was reported as benign in other Latin American studies and ClinVar (Table 6), while rs132642 has no annotation in ClinVar. Allele frequencies from 1KGP and gnomAD exomes are low (up to 11%, Table 3).

Forty-one variants in 21 genes were associated with phenotypic traits categorized as protective, drug response, association, risk factor, likely pathogenic, and pathogenic (Figure 7). The genes with more than one variant with phenotypic traits categorized as risk or pathogenic factors (i.e., risk factor, pathogenic or likely pathogenic) were APOA5, APOB, APOE, APOL1, CD36, GCKR, LDLR, LPL, PCSK9, and PLA2G7.

Seven variants were annotated with features associated with drug response and two with protective features in APOB, APOE, and HMGCR genes (Table 4). The allelic frequencies of the alternate allele ranged from 0.01 to 0.76. These nine variants are present in 1KGP populations but we observed statistical differences in the allelic frequencies of seven of the variants. All variants presented annotations in ClinVar, including associations with traits such as warfarin, atorvastatin, and statins responses, and one protective against metabolic syndrome.

Of the missense variants identified within the genes of interest listed in Table 1, 18 were categorized as risk variants by more than three sources used for functional annotation and had more than one count in CR-WGS (Figure 8 and Table 5). These variants were located in 16 genes. The alternate allele frequencies ranged from 0.00389-0.09143 and 0.00001-0.08852 in CR-WGS and ALL, respectively. Thirteen variants were only present in CR-WGS and ALL; three were reported in AMR and CR-WGS, one in EUR and AMR, one in AFR and AMR, and one in EAS and AMR.
In this list, only rs1801689 in APOH presented allelic frequencies significantly different from AFR and EAS, and rs202022169 in CELSR2 showed statistical differences with ALL. Additionally, only nine variants had a phenotype association in ClinVar, GWAS, or Teslovich et al. (2010), including sitosterolemia, cholesterol levels, hypertriglyceridemia, apolipoproteinemia, and familial hypercholesterolemia, among others.

Finally, only eight variants previously linked to lipid metabolism or the development of dyslipidemia in Costa Ricans and Latin Americans were found in CR-WGS (Table 6). These variants were in the ABCA1, ABCG8, CELSR2, and LPL genes, with frequencies ranging from 0.004 to 0.031. The variant rs1231383321 in LPL is a private variant found in one individual (heterozygous, sequencing depth 16:21) from CR-WGS.

In summary, we identified 40 variants of interest related to dyslipidemia in CR-WGS. Subjects in our sample presented on average 7.49 of these variants (Figure 5D). Moreover, 60% of the subjects have 2 or 3 variants in homozygous state and 20% of the subjects present 5 variants in heterozygous state.

4 Discussion

4.1 Exome quality metrics

The bioinformatics workflow used to perform variant calling on the PSYCH-CV and dbGAP-CV cohorts revealed metrics (Ti/Tv and HET/non-ref HOM ratios) within expected values for adequate quality exomes (Wang et al., 2015). Although Ti/Tv ratios were lower than the standard (Wang et al., 2015), we must consider that the exome regions included the coordinates of mature transcripts, miRNAs, and lncRNAs in Ensembl 106, which could lower the values of this metric.
Moreover, HET/non-ref HOM ratios for both cohorts were within the standard for Asians and Africans since this metric is sensitive to ancestry (Wang et al., 2015).

On average, each individual from CR-WGS contained 137k SNVs per exome (210 Mb), but the regions included non-coding sequences that can accumulate more variants. According to the literature, the expected count of SNVs per exome (33 Mb) ranges between 15,000 and 20,000, the determining factors of this variation being the coordinates used to define the exome and the ancestry (Ng et al., 2009; Stitziel et al., 2011). In contrast, there are 3 million SNPs in a genome (Stitziel et al., 2011). Moreover, the average Ti/Tv ratio, HET/non-ref HOM ratio, and SNVs per individual were almost identical in PSYCH-CV and dbGAP-CV (t-test p-value > 0.05), supporting the pooling of both cohorts for variant annotation.

4.2 Concordance with the ancestry of Costa Ricans from the Central Valley

The results obtained from the ancestry analysis showed that the PSYCH-CV and dbGAP-CV samples show a genetic admixture consistent with Latin American populations and ancestry studies from the Central Valley (Campos-Sánchez et al., 2013). There is also a high concordance between the allele frequencies reported for CR-WGS and those of the sample of Costa Ricans from the Central Valley without diagnosed disease studied in the Costa Rica Heart Study. All this suggests that the allelic frequencies obtained from CR-WGS are representative of the general population of the Central Valley of Costa Rica and that conclusions from this study can have implications in health care policies.

CR-WGS presented an ancestry profile similar to some Latin American groups reported in 1KGP.
Of the four Hispanic groups included in 1KGP, the Costa Rican group closely resembles the EUR and EAS components of Colombians (AFR also for PSYCH-CV), and the AFR and EAS components of Mexicans only for PSYCH-CV. This is consistent with previous studies as reviewed by Adhikari et al. (2017) and Wang et al. (2019). The impact of this finding in the study of dyslipidemias in Latin America should be studied further to determine whether conclusions derived from Costa Rican populations apply to other Latin American groups with high European ancestry.

PSYCH-CV and dbGAP-CV samples have comparable admixture proportions to Central Valley samples from Campos-Sánchez et al. (2013), which is consistent with the origin of both cohorts. Notably, the European component was lower in CR-WGS (mean 0.47) and the Asian component (used as a proxy of Amerindian) was higher (mean 0.46) compared to Campos-Sánchez et al. (2013) (EUR 0.569 and EAS 0.364). This may be because, in the present study, the East Asian population (EAS) reported in 1KGP was used as the ancestral group instead of an Amerindian group, as in the study by Campos-Sánchez et al. (2013). Although EAS has been used in previous ancestry studies as a group analogous to Native Americans due to their historical origin and because EAS is a broad and standardized group (Wang et al., 2019), future studies should use genomic information from Native Americans for ancestry estimations.

4.3 Pharmacogenomic variants

According to the functional annotation extracted from ClinVar and the GWAS Catalog, at least nine identified variants have been reported to impact either the efficacy, safety, or metabolism of therapeutic agents (Table 4). Eight of these variants are found in PharmGKB, but three have no conclusive evidence, or no association was found with a pharmacogenomics phenotype.

Four variants in APOB showed phenotypes associated with response to warfarin, according to ClinVar; they all presented frequencies above 34%. The same variants are reported in PharmGKB, but only two have a significant association with warfarin. Variants rs1042034 and rs693 were studied in Korean patients under warfarin treatment in relation to the risk of hemorrhage, but the T and G alleles, respectively, were not associated with it (Yee et al., 2019). However, in the same study, the G allele in rs1367117 and the G allele in rs6789899 were associated with an increased risk of hemorrhage when using warfarin in people with heart valve replacement.

It has been observed in previous studies that the variants rs429358 and rs7412 in APOE can alter the efficacy of statin-type drugs such as lovastatin, atorvastatin, or pravastatin to reduce blood cholesterol levels (Mega et al., 2009; Ciuculete et al., 2017; Guan et al., 2019). A study in hypercholesterolemic Chilean patients showed that these variants impact statins response (Lagos et al., 2015). Campos et al. (2001) studied the interaction of APOE genotypes (using the HhaI enzyme) and fat plasma with lipoprotein levels and low-density lipoproteins in Costa Ricans. Moreover, rs7412 has shown protective effects against SARS-CoV-2 (Espinosa-Salinas et al., 2022). Due to their high allelic frequencies, these variants are candidates for further pharmacogenomic studies in Costa Ricans and Latin American populations (Table 4). On the other hand, rs769450 is an intron variant interpreted as a drug response to warfarin in ClinVar but without assertion criteria. However, in dbSNP, this variant is supported by Musunuru et al. (2012) and Son et al. (2015), who associated it with decreased risk of elevated triglycerides and with an LDL (low-density lipoprotein) phenotype, respectively.
Additionally, in PharmGKB, allele A is not associated with the risk of hemorrhage during warfarin treatment in people with heart valve replacement compared to allele G.

In HMGCR, the genotype TT in rs17238540 is associated with reduced LDL cholesterol in patients treated with simvastatin (Krauss et al., 2008). Furthermore, the genotype GT, compared to TT, showed a decreased reduction in total cholesterol under pravastatin treatment (Chasman et al., 2004). This marker should be studied in more detail in patients under statin treatment.

The only protective variant found was rs3816873 in MTTP. This gene encodes the microsomal triglyceride transfer protein, which catalyzes the transport of triglyceride, cholesteryl ester, and phospholipid between phospholipid surfaces. This variant was associated with protection against metabolic syndrome in ClinVar and OMIM (https://omim.org/entry/157147#0009) and is a benign variant in abetalipoproteinemia.

4.4 Risk variants

Alterations in the expression levels or the functioning of the genes involved in lipid metabolism evaluated in this study can cause imbalances in the lipid profile and lead to the development of dyslipidemia. Eight variants presented high impact in VEP; only two were homozygous for the recessive allele (Table 3). For instance, rs132642 in APOL3 had no annotation in ClinVar, and rs328 in LPL is annotated as benign in the phenotype hyperlipoproteinemia type I. This mutation truncates the last two codons of the protein. The evidence from Kobayashi et al. (1992) came from a heterozygous individual and from expression studies in Cos-1 cells. Faustinella et al. (1991) presented the case of two brothers homozygous for rs328 who also carried another LPL mutation, Asp156Gly. They confirmed in vitro that the carboxyl terminus of LPL was not responsible for hyperlipoproteinemia type I.
The minor allele frequencies of rs132642 and rs328 are 5.8% and 9.25% in dbSNP (1KGP Global group). All other five high-risk variants identified in Costa Ricans are presented as heterozygous, and only two have ClinVar annotations with uncertain or conflicting interpretations (CD36, GCKR, and GPD1). In dbSNP, five of these variants (rs5164, rs192225524, rs146053779, rs144009925, and rs749801989) have frequencies below 0.1% in the Global populations of 1KGP and gnomAD exomes. These deserve further study in Latin American populations because of their low allelic frequencies in the same databases (0.3%).

Sixteen out of the 69 genes evaluated contained risk variants defined by more than three bioinformatic tools (Figure 8 and Table 5). The genes of the apolipoprotein family with risk variants include APOA5, APOE, APOH, and APOL1. According to Su and Peng (2020), APOA5 and APOE participate in the assembly of VLDLs. The study by Zhou et al. (2018) reported that variants in APOA tend to impact plasma triglyceride levels more than cholesterol. Several studies have linked the presence of the C allele in SNV rs3135506 with elevated plasma triglyceride levels (Ruiz-Narváez et al., 2005; Li et al., 2014). Surendran et al. (2012) found an allele frequency of 21% in patients with severe hypertriglyceridemia, while the control group presented a frequency of 9%. This variant reached an allelic frequency of 9% in the Costa Rican group and did not show significant differences with the other 1KGP groups.

On the other hand, several studies have associated the presence of the T allele of the rs7412 variant in APOE with high blood cholesterol levels, mainly provided by LDLs, and with high body mass index (Thompson et al., 2009; Tejedor et al., 2014).
Although the frequency of this variant is 6.6% in Costa Ricans and 4.75% in the Latin Americans registered in 1KGP, no statistically significant differences were found between them; evaluating this in other parts of the country or increasing the sample size can help clarify whether this trend dissipates or becomes more robust.

Although little is known about the molecular role of APOH in lipid metabolism, it has been observed in various populations that the presence of some variants associated with the functioning of this apolipoprotein affects LDL cholesterol levels (Willer et al., 2013). The C allele of the rs1801689 variant has been linked to changes in blood LDL levels; this variation alters the affinity of APOH for phospholipids (Mather et al., 2016). The variant rs775820342 in APOL1 presented low frequencies in CR-WGS and ALL and is not reported in ClinVar. This is a missense variant with computational pathogenic evidence that could be studied further.

Five risk variants were identified in three genes involved in lipid transport, ABCA1, ABCG5, and ABCG8, from the ABC transporter family. ABCA1 participates in the formation of HDLs by translocating cholesterol and phospholipids from the interior of the cell to nascent HDLs. The variant rs766619359 in this gene is a missense mutation. The alternate T allele is almost absent in 1KGP (0.004%) and gnomAD (0.0064% genomes, 0.0024% exomes); no reports are available in ClinVar, suggesting that this is a pathogenic variant.

On the other hand, ABCG5 forms a heterodimer with ABCG8 that mediates the absorption and excretion of sterols at multiple levels (Feingold, 2000). Of the risk variants identified, only rs11887534 in ABCG8 has been associated with changes in the levels of HDLs in the blood in response to statin treatment (Sałacka et al., 2021).
Additionally, rs200433692 in ABCG8 is a missense mutation almost absent in population databases such as 1KGP (0.04%), gnomAD (0.0071% genomes, 0.0088% exomes), and ExAC (0.0116%).

Risk variants were found in four genes (CELSR2, CREB3L3, GCKR, and LCAT) with a regulatory or signaling role in lipid metabolism. No previous research was found associating the presence of the risk variants found in CELSR2 and CREB3L3 with alterations in the lipid profile or risk of suffering from dyslipidemia. Moreover, alternate allele frequencies of the variants rs1203365203 and rs779860332 were extremely low in ALL (0.001-0.02%) and CR-WGS (0.4%, Table 5). Allele C in rs202022169, on the other hand, presented a statistical difference in the allele frequency with ALL, reaching up to 1.9% in CR-WGS compared to 0.007% in ALL and 0.4% in AMR. However, variant rs146175795 in GCKR is presented in ClinVar with conflicting interpretations of pathogenicity, including one associated with hypertriglyceridemia in two heterozygous individuals (Rees et al., 2012). LCAT rs4986970 was reported as benign in ClinVar and was associated with a reduction in HDL cholesterol (Haase et al. 20); it presented a frequency of 0.7 in CR-WGS.

Five putative risk variants (0.3-3.5% frequency in CR-WGS) were found in the CD36, LDLR, LIPE, PPARA, and SCARB1 genes, involved in lipid and lipoprotein sensing. Variant rs148698650 detected in LDLR has been linked to alterations in lipid profile according to ClinVar, rs1800206 in PPARA has been associated with lipid-altered phenotypes in three studies (Vohl et al., 2000; Tai et al., 2002; Robitaille et al., 2004), and rs748231262 in SCARB1 has one report in an Argentinian study of familial hypercholesterolemia (Corral et al., 2018). The other two variants have frequencies below 0.4% in CR-WGS and are absent from ALL, AFR, EUR, AMR, and EAS.
Finally, LPL variant rs118204057 has been associated in multiple reports with hyperlipidemia and hyperlipoproteinemia pathology and with protein function (Monsalve et al., 1990; Hata et al., 1992; Henderson et al., 1992; Mailly et al., 1997; Gilbert et al., 2001; Soto et al., 2015; Ashraf et al., 2017; Caddeo et al., 2018). Moreover, population frequencies are low (0.019% in ALL, 0.14% in AMR, 0.58% in CR-WGS), and it was detected in one individual with severe hyperlipidemia from Costa Rica (González-Cordero, 2018). This variant deserves further study in Costa Rica and other Latin American countries.

4.5 Variants previously reported in the Latin American region

We detected in CR-WGS the ABCA1 variant rs9282541 that was considered a private variant in Native Americans and their descendants (Villarreal-Molina et al., 2012; Du et al., 2020).
Its allelic frequency resembles that observed in Latin Americans reported in 1KGP. Villarreal-Molina et al. (2012) reported that, in Mexican subjects, this variant was associated with lower plasma levels of total cholesterol and HDL cholesterol. Additionally, they observed that the variant’s effect depends on the sex of the subject, probably through interaction with other factors.

Two variants reported in the study by Andaleon et al. (2019), which focused on identifying variants associated with changes in the lipid profile of Latin Americans living in the United States, were found in the Costa Rican cohort analyzed. The intron variant rs4245791 in ABCG8 is not annotated in ClinVar. However, several publications provide evidence of its relationship with total cholesterol (Ma et al., 2010), with higher cholestanol-to-cholesterol ratios (an estimate of cholesterol absorption; Silbernagel et al., 2013), and with increased plasma phytosterol concentrations, relatively elevated LDL-C, and increased coronary artery disease risk (Calandra et al., 2011). The variant rs12740374 in CELSR2 has been reported to influence LDL cholesterol levels in Hispanics (Samani et al., 2007; Consortium et al., 2009; Musunuru et al., 2010).

Although the research by Andaleon et al. (2019) detected genetic variants with a quantitative impact on plasma lipid levels in Latin Americans, it is essential to mention that the people included in that study reside in the United States. This means they were exposed to lifestyles and environmental conditions different from those of their countries of origin. The environment alone can account for up to 21% of the variation in plasma total cholesterol levels and up to 29% of the variation in plasma triglyceride levels; approximately 6% of the variation is attributed to the interaction between environment and genetics (Elder et al., 2009).
We detected in CR-WGS four of the 15 variants described by González-Cordero (2018) in LPL (Table 6). According to a meta-analysis, the G allele in the rs268 variant is associated with lower plasma HDL cholesterol levels (Boes et al., 2009). This variant has a frequency of 3.3% in CR-WGS, significantly higher than in ALL and AFR but not than in AMR (1.1%) and EUR (1.3%). Variant rs316 is intronic, and according to Pirim et al. (2014), it is possibly located next to a regulatory site. The A allele in this variant has been repeatedly associated with an increase in HDL cholesterol (Schuster et al., 2011; Pirim et al., 2014, 2015), but it is classified as benign in ClinVar. The missense variant rs1231383321 was detected in one individual in CR-WGS, and it is also reported in American gnomAD exomes and genomes with frequencies of 0.023% and 0.051%, respectively. The rs118204057 variant was discussed previously.

On the other hand, we identified the LPL variant rs328 (S447*) in CR-WGS; in a publication of the Costa Rica Heart Study, this variant was previously associated with a reduction in the risk of myocardial infarction in Costa Ricans (Yang et al., 2004). The G allele suppresses the encoding of the last two amino acids of LPL, increasing its lipase activity. Notably, this is associated with low levels of plasma triglycerides and increases in HDL cholesterol in healthy subjects. However, in subjects with obesity, this allele is instead associated with elevated levels of plasma triglycerides (Palacio-Rojas et al., 2017).

Overall, this study presents the reanalysis of Costa Ricans' genomic data to estimate baseline frequencies of dyslipidemia variants.
The finding that the ancestry of these genomes closely resembles that of the Central Valley and some Latin American populations is relevant, considering the low amount of genomic data available for these populations from which to derive conclusions about the genetic burden in the general population.
The study identified 2600 variants in 69 genes involved in lipid metabolism in the genomes of people from the Central Valley of Costa Rica. Among these, 33 variants have the potential to affect the functioning of these genes, some have been directly linked to the development of hyperlipidemia, and some could affect the performance of proteins involved in lipid metabolism according to bioinformatic analysis. However, some have not been directly associated with developing such conditions in the literature.
On the other hand, we found seven variants with pharmacogenomic relevance, several of which can modulate the subject's response to statin-type drugs, therapies commonly used to treat cases of severe hyperlipidemia. Our analysis of the number of variants per individual for the 40 variants of interest suggests an important genetic burden for dyslipidemia in our sample; however, we could not determine the relationship of these variants with dyslipidemia phenotypes due to the lack of metadata associated with the datasets analyzed.

In the future, it is essential to develop studies that capture environmental, genotypic, and phenotypic data from Costa Ricans living in Costa Rica to understand more clearly the dynamics that participate in the incidence of dyslipidemia. These efforts can be focused on the 23 genes and 40 variants identified in this study, which can be analyzed with traditional genotyping methodologies (e.g., PCR, RFLP, Sanger sequencing), reducing costs. Alternatively, genetic analysis using genome sequencing, exome sequencing, or a panel of genes involved in lipid metabolism, such as the LipidSeq panel described by Johansen et al. (2014), could help to identify variants in affected individuals. This strategy has already been used in an Argentinian study (Corral et al., 2018), in which only genes linked to lipid metabolism were sequenced. Additionally, copy number variants should be studied, as they have been implicated in certain dyslipidemia disorders (Iacocca and Hegele, 2018). Moreover, the abundant clinical information hosted in the Costa Rican Social Security System (Caja Costarricense del Seguro Social, C.C.S.S.) could strengthen this type of genomic study. Eventually, functional validation of the variants detected in patients should be performed to provide conclusive evidence of the association with dyslipidemia.
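The per-individual tally of variants of interest mentioned above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: it assumes genotypes are available as alternate-allele counts (0, 1, or 2) per individual and variant, and that a missing call counts as zero. The function names, sample labels, and genotypes are hypothetical toy data; only the rsIDs appear in the text.

```python
from collections import Counter


def variants_per_individual(genotypes, variants_of_interest):
    """Return, per individual, the number of variants of interest carried.

    genotypes maps sample ID -> {rsID: alternate-allele count (0, 1, or 2)}.
    A variant is counted when at least one alternate allele is present.
    """
    return {
        sample: sum(1 for rsid in variants_of_interest if calls.get(rsid, 0) > 0)
        for sample, calls in genotypes.items()
    }


def burden_distribution(burden):
    """Return {number of variants carried: number of individuals}, sorted."""
    return dict(sorted(Counter(burden.values()).items()))


# Toy genotypes for illustration only; NOT data from the study.
toy_genotypes = {
    "S1": {"rs328": 1, "rs268": 0, "rs9282541": 2},
    "S2": {"rs328": 0, "rs268": 0},
    "S3": {"rs328": 1, "rs268": 1, "rs9282541": 1},
}
toy_variants = ["rs328", "rs268", "rs9282541"]

burden = variants_per_individual(toy_genotypes, toy_variants)
distribution = burden_distribution(burden)
print(burden)
print(distribution)
```

Counting carried variants rather than summing alternate alleles treats heterozygous and homozygous carriers alike; a dosage-based count would be an equally plausible alternative.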
5 Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

6 Author Contributions

RCS and SSF designed the study. RCS and JCV collected the genomics data. JCV and AFC performed the data analysis. JCV, AFC, GCS, and RCS wrote the manuscript. All authors read and approved the final manuscript.

7 Funding

This work was funded by the University of Costa Rica (project number B9-259).

8 Acknowledgments

This research was partially supported by a machine allocation on the Kabré supercomputer at the Costa Rica National High Technology Center and the CICIMA high-performance computer cluster at the University of Costa Rica.

This study was supported by NHLBI grant R37 HL066289. We wish to acknowledge the investigators at the Channing Division of Network Medicine at Brigham and Women's Hospital, the investigators at the Hospital Nacional de Niños in San José, Costa Rica, and the study subjects and their extended family members who contributed samples and genotypes to the study, and the NIH/NHLBI for its support in making this project possible.

We also want to acknowledge Esteban Rodríguez for Google Cloud assistance and Federico Muñoz for CICIMA cluster support.

9 References

Adhikari, K., Chacón-Duque, J. C., Mendoza-Revilla, J., Fuentes-Guajardo, M., and Ruiz-Linares, A. (2017). The Genetic Diversity of the Americas. Annu Rev Genom Hum G 18, 277–296.
doi: 10.1146/annurev-genom-083115-022331.
Alirezaie, N., Kernohan, K. D., Hartley, T., Majewski, J., and Hocking, T. D. (2018). ClinPred: Prediction Tool to Identify Disease-Relevant Nonsynonymous Single-Nucleotide Variants. Am J Hum Genetics 103, 474–483. doi: 10.1016/j.ajhg.2018.08.005.
Andaleon, A., Mogil, L. S., and Wheeler, H. E. (2019). Genetically regulated gene expression underlies lipid traits in Hispanic cohorts. Plos One 14, e0220827. doi: 10.1371/journal.pone.0220827.
Ashraf, A. P., Hurst, A. C. E., and Garg, A. (2017). Extreme hypertriglyceridemia, pseudohyponatremia, and pseudoacidosis in a neonate with lipoprotein lipase deficiency due to segmental uniparental disomy. J Clin Lipidol 11, 757–762. doi: 10.1016/j.jacl.2017.03.015.
Aslibekyan, S., Jensen, M. K., Campos, H., Linkletter, C. D., Loucks, E. B., Ordovas, J. M., et al. (2012). Fatty Acid Desaturase Gene Variants, Cardiovascular Risk Factors, and Myocardial Infarction in the Costa Rica Study. Frontiers Genetics 3, 72. doi: 10.3389/fgene.2012.00072.
Auton, A., Abecasis, G. R., Altshuler, D. M., Durbin, R. M., Abecasis, G. R., Bentley, D. R., et al. (2015). A global reference for human genetic variation. Nature 526, 68–74. doi: 10.1038/nature15393.
Bardou, P., Mariette, J., Escudié, F., Djemiel, C., and Klopp, C. (2014). jvenn: an interactive Venn diagram viewer. Bmc Bioinformatics 15, 293. doi: 10.1186/1471-2105-15-293.
Boes, E., Coassin, S., Kollerits, B., Heid, I. M., and Kronenberg, F. (2009). Genetic-epidemiological evidence on genes associated with HDL cholesterol levels: A systematic in-depth review. Exp Gerontol 44, 136–160. doi: 10.1016/j.exger.2008.11.003.
Brahm, A., and Hegele, R. A. (2013). Hypertriglyceridemia. Nutrients 5, 981–1001. doi: 10.3390/nu5030981.
Brown, S., Ordovás, J. M., and Campos, H. (2003).
Interaction between the APOC3 gene promoter polymorphisms, saturated fat intake and plasma lipoproteins. Atherosclerosis 170, 307–313. doi: 10.1016/s0021-9150(03)00293-4.
Bruikman, C. S., Hovingh, G. K., and Kastelein, J. J. P. (2017). Molecular basis of familial hypercholesterolemia. Curr Opin Cardiol 32, 262–266. doi: 10.1097/hco.0000000000000385.
Buniello, A., MacArthur, J. A. L., Cerezo, M., Harris, L. W., Hayhurst, J., Malangone, C., et al. (2019). The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res 47, D1005–D1012. doi: 10.1093/nar/gky1120.
Caddeo, A., Mancina, R. M., Pirazzi, C., Russo, C., Sasidharan, K., Sandstedt, J., et al. (2018). Molecular analysis of three known and one novel LPL variants in patients with type I hyperlipoproteinemia. Nutrition Metabolism Cardiovasc Dis 28, 158–164. doi: 10.1016/j.numecd.2017.11.003.
Calandra, S., Tarugi, P., Speedy, H. E., Dean, A. F., Bertolini, S., and Shoulders, C. C. (2011). Mechanisms and genetic determinants regulating sterol absorption, circulating LDL levels, and sterol elimination: implications for classification and disease risk. J Lipid Res 52, 1885–1926. doi: 10.1194/jlr.r017855.
Campos, H., D’Agostino, M., and Ordovás, J. M. (2001). Gene-diet interactions and plasma lipoproteins: Role of apolipoprotein E and habitual saturated fat intake. Genet. Epidemiol. 20, 117–128. doi: 10.1002/1098-2272(200101)20:1<117::aid-gepi10>3.0.co;2-c.
Campos-Sánchez, R., Raventós, H., and Barrantes, R. (2013). Ancestry informative markers clarify the regional admixture variation in the Costa Rican population. Hum Biol 85, 721–740. doi: 10.3378/027.085.0505.
Chasman, D. I., Posada, D., Subrahmanyam, L., Cook, N. R., Stanton, V. P., and Ridker, P. M. (2004). Pharmacogenetic study of statin therapy and cholesterol reduction. Acc Curr J Rev 13, 20–21.
doi: 10.1016/j.accreview.2004.07.109.
Chavarria-Soley, G., Francis-Cartin, F., Jimenez-Gonzalez, F., Peralta, J. M., Blangero, J., Gur, R. E., et al. (2021). Identification of genetic risk variants for major psychiatric disorders in Costa Rican families using WGS. Eur Neuropsychopharm 51, e16–e17. doi: 10.1016/j.euroneuro.2021.07.042.
Ciuculete, D. M., Bandstein, M., Benedict, C., Waeber, G., Vollenweider, P., Lind, L., et al. (2017). A genetic risk score is significantly associated with statin therapy response in the elderly population. Clin Genet 91, 379–385. doi: 10.1111/cge.12890.
Consortium, I. H. 3, Altshuler, D. M., Gibbs, R. A., Peltonen, L., Altshuler, D. M., Gibbs, R. A., et al. (2010). Integrating common and rare genetic variation in diverse human populations. Nature 467, 52–58. doi: 10.1038/nature09298.
Consortium, M. I. G., Kathiresan, S., Voight, B. F., Purcell, S., Musunuru, K., Ardissino, D., et al. (2009). Genome-wide association of early-onset myocardial infarction with single nucleotide polymorphisms and copy number variants. Nat Genet 41, 334–341. doi: 10.1038/ng.327.
Corral, P., Geller, A. S., Polisecki, E. Y., Lopez, G. I., Bañares, V. G., Cacciagiu, L., et al. (2018). Unusual genetic variants associated with hypercholesterolemia in Argentina. Atherosclerosis 277, 256–261. doi: 10.1016/j.atherosclerosis.2018.06.009.
Cunningham, F., Allen, J. E., Allen, J., Alvarez-Jarreta, J., Amode, M. R., Armean, I. M., et al. (2021). Ensembl 2022. Nucleic Acids Res 50, D988–D995. doi: 10.1093/nar/gkab1049.
DePristo, M. A., Banks, E., Poplin, R., Garimella, K. V., Maguire, J. R., Hartl, C., et al. (2011). A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet 43, 491–498. doi: 10.1038/ng.806.
Dron, J. S., Dilliott, A. A., Lawson, A., McIntyre, A. D., Davis, B.
D., Wang, J., et al. (2020a). Loss-of-Function CREB3L3 Variants in Patients With Severe Hypertriglyceridemia. Arteriosclerosis Thrombosis Vasc Biology 40, 1935–1941. doi: 10.1161/atvbaha.120.314168.
Dron, J. S., Wang, J., Cao, H., McIntyre, A. D., Iacocca, M. A., Menard, J. R., et al. (2019). Severe hypertriglyceridemia is primarily polygenic. J Clin Lipidol 13, 80–88. doi: 10.1016/j.jacl.2018.10.006.
Dron, J. S., Wang, J., McIntyre, A. D., Iacocca, M. A., Robinson, J. F., Ban, M. R., et al. (2020b). Six years’ experience with LipidSeq: clinical and research learnings from a hybrid, targeted sequencing panel for dyslipidemias. Bmc Med Genomics 13, 23. doi: 10.1186/s12920-020-0669-2.
Du, W., Hu, Z., Wang, L., Li, M., Zhao, D., Li, H., et al. (2020). ABCA1 Variants rs1800977 (C69T) and rs9282541 (R230C) Are Associated with Susceptibility to Type 2 Diabetes. Public Health Genomi 23, 20–25. doi: 10.1159/000505344.
Elder, S. J., Lichtenstein, A. H., Pittas, A. G., Roberts, S. B., Fuss, P. J., Greenberg, A. S., et al. (2009). Genetic and environmental influences on factors associated with cardiovascular disease and the metabolic syndrome. J Lipid Res 50, 1917–1926. doi: 10.1194/jlr.p900033-jlr200.
Espinosa-Salinas, I., Colmenarejo, G., Fernández-Díaz, C. M., Cedrón, M. G. de, Martinez, J. A., Reglero, G., et al. (2022). Potential protective effect against SARS-CoV-2 infection by APOE rs7412 polymorphism. Sci Rep-uk 12, 7247. doi: 10.1038/s41598-022-10923-4.
Faustinella, F., Chang, A., Biervliet, J. P. V., Rosseneu, M., Vinaimont, N., Smith, L. C., et al. (1991). Catalytic triad residue mutation (Asp156→Gly) causing familial lipoprotein lipase deficiency. Co-inheritance with a nonsense mutation (Ser447→Ter) in a Turkish family. J Biol Chem 266, 14418–14424. doi: 10.1016/s0021-9258(18)98701-6.
Feingold, K. (2000). Introduction to Lipids and Lipoproteins, eds. K. Feingold, B.
Anawalt, and A. Boyce. South Dartmouth (MA): MDText.com, Inc. Available at: https://www.ncbi.nlm.nih.gov/books/NBK305896/ [Accessed January 19, 2021].
Flanagan, S. E., Patch, A.-M., and Ellard, S. (2010). Using SIFT and PolyPhen to Predict Loss-of-Function and Gain-of-Function Mutations. Genet Test Mol Bioma 14, 533–537. doi: 10.1089/gtmb.2010.0036.
Galanter, J. M., Fernandez-Lopez, J. C., Gignoux, C. R., Barnholtz-Sloan, J., Fernandez-Rozadilla, C., Via, M., et al. (2012). Development of a Panel of Genome-Wide Ancestry Informative Markers to Study Admixture Throughout the Americas. PLoS Genet 8, e1002554. doi: 10.1371/journal.pgen.1002554.s004.
Gilbert, B., Rouis, M., Griglio, S., Lumley, L. de, and Laplaud, P.-M. (2001). Lipoprotein lipase (LPL) deficiency: a new patient homozygote for the preponderant mutation Gly188Glu in the human LPL gene and review of reported mutations: 75 % are clustered in exons 5 and 6. Ann De Génétique 44, 25–32. doi: 10.1016/s0003-3995(01)01037-1.
Gong, J., Campos, H., McGarvey, S., Wu, Z., Goldberg, R., and Baylin, A. (2011). Genetic Variation in Stearoyl-CoA Desaturase 1 Is Associated with Metabolic Syndrome Prevalence in Costa Rican Adults. J Nutrition 141, 2211–2218. doi: 10.3945/jn.111.143503.
González-Cordero, M. (2018). Mutaciones en la región codificante del gen de la lipoproteína lipasa (LPL), en una muestra de pacientes con h-annotated.pdf.
Guan, Z., Wu, K., Li, R., Yin, Y., Li, X., Zhang, S., et al. (2019). Pharmacogenetics of statins treatment: Efficacy and safety. J Clin Pharm Ther 44, 858–867. doi: 10.1111/jcpt.13025.
Gunning, A. C., Fryer, V., Fasham, J., Crosby, A. H., Ellard, S., Baple, E. L., et al. (2021). Assessing performance of pathogenicity predictors using clinically relevant variant datasets. J Med Genet 58, 547–555. doi: 10.1136/jmedgenet-2020-107003.
Gutiérrez-Ávila, J. D.
(2019). Caracterización de la Región Promotora del Gen de la Apolipoproteína CII (APO CII), cofactor de la Lipoproteína Lipasa (LPL).
Gutiérrez-Peña, E. G., and Romero-Zúñiga, J. J. (2010). Dislipidemia y niveles de lípidos sanguíneos en pacientes tratados en centros de atención primaria de la zona este de San José, Costa Rica, año 2006. Revista MHSalud 7. doi: 10.15359/mhs.7-2.1.
Haase, C. L., Tybjærg-Hansen, A., Qayyum, A. A., Schou, J., Nordestgaard, B. G., and Frikke-Schmidt, R. (2012). LCAT, HDL Cholesterol and Ischemic Cardiovascular Disease: A Mendelian Randomization Study of HDL Cholesterol in 54,500 Individuals. J Clin Endocrinol Metabolism 97, E248–E256. doi: 10.1210/jc.2011-1846.
Hata, A., Ridinger, D. N., Sutherland, S. D., Emi, M., Kwong, L. K., Shuhua, J., et al. (1992). Missense mutations in exon 5 of the human lipoprotein lipase gene. Inactivation correlates with loss of dimerization. J Biol Chem 267, 20132–20139. doi: 10.1016/s0021-9258(19)88676-3.
Henderson, H. E., Hassan, F., Berger, G. M., and Hayden, M. R. (1992). The lipoprotein lipase Gly188→Glu mutation in South Africans of Indian descent: evidence suggesting common origins and an increased frequency. J Med Genet 29, 119. doi: 10.1136/jmg.29.2.119.
Hubisz, M. J., Falush, D., Stephens, M., and Pritchard, J. K. (2009). Inferring weak population structure with the assistance of sample group information. Mol Ecol Resour 9, 1322–1332. doi: 10.1111/j.1755-0998.2009.02591.x.
Iacocca, M. A., and Hegele, R. A. (2018). Role of DNA copy number variation in dyslipidemias. Curr Opin Lipidol 29, 125–132. doi: 10.1097/mol.0000000000000483.
Ioannidis, N. M., Rothstein, J. H., Pejaver, V., Middha, S., McDonnell, S. K., Baheti, S., et al. (2016). REVEL: An Ensemble Method for Predicting the Pathogenicity of Rare Missense Variants. Am J Hum Genetics 99, 877–885. doi: 10.1016/j.ajhg.2016.08.016.
Jakobsson, M., and Rosenberg, N. A. (2007). CLUMPP: a cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure. Bioinformatics 23, 1801–1806. doi: 10.1093/bioinformatics/btm233.
Johansen, C. T., Dubé, J. B., Loyzer, M. N., MacDonald, A., Carter, D. E., McIntyre, A. D., et al. (2014). LipidSeq: a next-generation clinical resequencing panel for monogenic dyslipidemias[S]. J Lipid Res 55, 765–772. doi: 10.1194/jlr.d045963.
Johansen, C. T., and Hegele, R. A. (2011). Genetic bases of hypertriglyceridemic phenotypes. Curr Opin Lipidol 22, 247–253. doi: 10.1097/mol.0b013e3283471972.
Johansen, C. T., Kathiresan, S., and Hegele, R. A. (2011). Genetic determinants of plasma triglycerides. J Lipid Res 52, 189–206. doi: 10.1194/jlr.r009720.
Kobayashi, J., Nishida, T., Ameis, D., Stahnke, G., Schotz, M. C., Hashimoto, H., et al. (1992). A heterozygous mutation (the codon for Ser447→ a stop codon) in lipoprotein lipase contributes to a defect in lipid interface recognition in a case with type I hyperlipidemia. Biochem Bioph Res Co 182, 70–77. doi: 10.1016/s0006-291x(05)80113-5.
Kopelman, N. M., Mayzel, J., Jakobsson, M., Rosenberg, N. A., and Mayrose, I. (2015). Clumpak: a program for identifying clustering modes and packaging population structure inferences across K. Mol Ecol Resour 15, 1179–1191. doi: 10.1111/1755-0998.12387.
Krauss, R. M., Mangravite, L. M., Smith, J. D., Medina, M. W., Wang, D., Guo, X., et al. (2008). Variation in the 3-Hydroxyl-3-Methylglutaryl Coenzyme A Reductase Gene Is Associated With Racial Differences in Low-Density Lipoprotein Cholesterol Response to Simvastatin Treatment. Circulation 117, 1537–1544. doi: 10.1161/circulationaha.107.708388.
Lagos, J., Zambrano, T., Rosales, A., and Salazar, L. (2015).
APOE Polymorphisms Contribute to Reduced Atorvastatin Response in Chilean Amerindian Subjects. Int J Mol Sci 16, 7890–7899. doi: 10.3390/ijms16047890.
Landrum, M. J., Lee, J. M., Benson, M., Brown, G. R., Chao, C., Chitipiralla, S., et al. (2017). ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res 46, gkx1153-. doi: 10.1093/nar/gkx1153.
Lewis, G. F., Xiao, C., and Hegele, R. A. (2015). Hypertriglyceridemia in the Genomic Era: A New Paradigm. Endocr Rev 36, 131–147. doi: 10.1210/er.2014-1062.
Li, S., Hu, B., Wang, Y., Wu, D., Jin, L., and Wang, X. (2014). Influences of APOA5 Variants on Plasma Triglyceride Levels in Uyghur Population. Plos One 9, e110258. doi: 10.1371/journal.pone.0110258.
Ma, L., Yang, J., Runesha, H. B., Tanaka, T., Ferrucci, L., Bandinelli, S., et al. (2010). Genome-wide association analysis of total cholesterol and high-density lipoprotein cholesterol levels using the Framingham Heart Study data. Bmc Med Genet 11, 55. doi: 10.1186/1471-2350-11-55.
Mailly, F., Palmen, J., Muller, D. P. R., Gibbs, T., Lloyd, J., Brunzell, J., et al. (1997). Familial lipoprotein lipase (LPL) deficiency: A catalogue of LPL gene mutations identified in 20 patients from the UK, Sweden, and Italy. Hum. Mutat. 10, 465–473. doi: 10.1002/(sici)1098-1004(1997)10:6<465::aid-humu8>3.0.co;2-c.
Mather, K. A., Thalamuthu, A., Oldmeadow, C., Song, F., Armstrong, N. J., Poljak, A., et al. (2016). Genome-wide significant results identified for plasma apolipoprotein H levels in middle-aged and older adults. Sci Rep-uk 6, 23675. doi: 10.1038/srep23675.
Mega, J. L., Morrow, D. A., Brown, A., Cannon, C. P., and Sabatine, M. S. (2009). Identification of Genetic Variants Associated With Response to Statin Therapy. Arteriosclerosis Thrombosis Vasc Biology 29, 1310–1315. doi: 10.1161/atvbaha.109.188474.
Mills, R. E., Luttig, C. T., Larkins, C. E., Beauchamp, A., Tsui, C., Pittard, W. S., et al. (20