Clinical and genetic analysis of Costa Rican patients with Parkinson's disease.

Abstract Background: Parkinson's disease (PD) involves environmental risk and protective factors as well as genetic variance. Most of the research in genomics has been done in subjects of European ancestry leading to sampling bias and leaving Latin American populations underrepresented. Objective: We sought to phenotype and genotype Costa Rican PD cases and controls. Methods: We enrolled 118 PD patients with 97 unrelated controls. Collected information included demographics, exposure to risk and protective factors, motor and cognitive assessments. We sequenced coding and untranslated regions in familial PD and atypical parkinsonism-associated genes including GBA, SNCA, VPS35, LRRK2, GCH1, PRKN, PINK1, DJ-1, VPS13C, ATP13A2. Results: Mean age of PD probands was 62.12 {+/-} 13.51 years, 57.6% were male. Prevalence of risk and protective factors reached 30%. Physical activity significantly correlated with better motor performance despite years of disease. Increased years of education were significantly associated with better cognitive function, whereas hallucinations, falls, mood disorders and coffee consumption correlated with worse cognitive performance. We did not identify an association between tested genes and PD or any damaging homozygous or compound heterozygous variants. Rare variants in LRRK2 were nominally associated with PD, six were located between amino acids p.1620-1623 in the C-terminal-of-ROC (COR) domain of LRRK2. Nonsynonymous GBA variants (p.T369M, p.N370S, p.L444P) were identified in three healthy individuals. One PD patient carried a pathogenic GCH1 variant, p.K224R. Conclusion: This is the first study that reports on sociodemographic, risk factors, clinical presentation and genetics of Costa Rican patients with PD.


Introduction
Parkinson's disease (PD) is a complex and heterogeneous movement disorder caused by a progressive degeneration of dopaminergic neurons. Main clinical motor symptoms associated with PD include tremor, rigidity, bradykinesia and postural imbalance [1]. Years before motor symptoms are manifested there can be prodromal non-motor key features that include REM sleep disorders, anosmia and constipation [2]. Its pathophysiology involves environmental factors as well as genetic variance, which provide insight into its molecular pathogenesis. Among environmental factors that contribute to PD risk include pesticide and herbicide exposure, welding and well water consumption. There are also protective factors such as smoking, coffee consumption and performing physical activity that may reduce the risk of developing PD [3].
Since the description of PD associated mutations in the SNCA [4], other genes have been linked to autosomal dominant (AD) forms of familial PD, including LRRK2 and VPS35. In addition, there are clinically and genetically diverse early-onset (EO) autosomal-recessive (AR) forms of PD with associated genes like PRKN, PINK1, and DJ-1 that exhibit phenotypes similar to idiopathic PD; while other associated genes such as VPS13C, ATP13A2 combine atypical features of parkinsonism like dystonia and early cognitive impairment, along with a poor response to levodopa [5]. Large-scale genome-wide association studies (GWASs) have established different GBA locus as strong risk factors for PD both in homozygous and heterozygous state; displaying a phenotype similar to idiopathic PD, yet with a faster rate of progression of cognitive and motor decline [6].
Variant calls with a genotype frequency of less than 25% of the reads or genotype quality of less than 30 were excluded. Samples and variants with more than 10% missingness were also excluded.

Statistics
We used Stata® (version 14) for the statistical analysis of sociodemographic and clinical variables. Normally distributed variables are reported as mean with its standard deviation (SD), whereas continuous but non-normally distributed variables are reported as median with the 25 th and 75 th percentile values (Q1-Q3). Normally distributed variables were compared with paired or unpaired t-tests, while non-normally distributed variables were compared with Mann-Whitney U test or Wilcoxon match-paired signed-rank test. Frequencies were compared with χ 2 and Fisher's exact test. Tests were 2-tailed, and significance was set at p<0.05. We modeled through linear regression the association between demographic and clinical variables with the severity of the disease, as indexed by UPDRS and MoCA.
For genetic analysis, common and rare variants were analyzed separately. Association of common variants was tested using logistic regression adjusted for age and sex in PLINK v1.9.
For rare variants' analysis, we examined the burden of rare variants in each gene using optimized sequence Kernel association test (SKAT-O) adjusted for age and sex [28]. Rare variants were separated into different categories based on their potential pathogenicity to examine specific enrichment in different variant subgroup as described previously [29]: 1) variants with Combined Annotation Dependent Depletion (CADD) score of potentially deleterious variants) [30]; 2) regulatory variants predicted by ENCODE [31]; 3) potentially functional variants including all nonsynonymous variants, stop gain/loss variants, frameshift variants and intronic splicing variants located within two base pairs of exon-intron junctions; 4) loss-of-function variants which includes stop gain/loss, frameshift and splicing variants; 5) only nonsynonymous variants. Bonferroni correction for multiple comparisons was applied as necessary.
This study was approved by the Ethics Committee of Hospital San Juan de Dios, Caja Costarricense de Seguro Social (CLOBI-HSJD #014-2015) and the University of Costa Rica (837-B5-304). Written informed consent was obtained from all participants.

Sociodemographic and clinical variables
At enrollment, PD probands had a mean age of 62.12 ± 13.51 years (range 25-86), and the mean age at onset was 54.62 ± 13.54 (range 16-83) years. Male PD patients comprised 57.63% of the sample with a mean age of 62.75 ± 12.17, while females reported a mean age of 61.26 ± 15.22 years at recruitment. Despite a significantly larger proportion of the male PD patients reported current or previous jobs involving agricultural activities (19.40% male, 2.08% female, p=0.01), the mean number of years of education of these men was significantly higher than women (10.74 ± 3.81 vs 8.86 ± 4.01, p=0.03). Table 1 details subjects' baseline characteristics along with the prevalence of exposure to main risk and protective factors for PD. Most of the risk and protective . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 30, 2020. . factors were more prevalent in men. Tables 1 and 2 detail the frequency of clinical manifestations as well as the standardized scale scores reported for PD cases. The most frequent initial symptomatology of the patients included: resting tremor (71.30%), rigidity (24.07%) and pain (10.19%). Most of the patients had asymmetric onset (94.12%) and a good response to levodopa (89.11%). Other frequently reported motor features comprised dystonia (46.08%), falls (39.22%), and dysphagia (36.27%). Common non-motor manifestations such as hiposmia, sleep disorders and depressive/anxious mood were seen in more than 50% of the PD cases. Overall median score of UPDRS 'ON' was 36 (22-60), most of our patients were graded in the '2.5' and '3' categories of the H&Y scale with a median for S&E score of 80% (80-90%). The median value for the MoCA test was 22 (17-25). There were no statistically significant differences between sex, regarding these scores.
We were able to establish through multivariate linear regression modeling that an increased disease duration along with the presence of orthostasis, dysphagia and mood disorders significantly correlated with increased scores in total ON UPDRS. Furthermore, we found an interaction between performing regular physical activity and duration of disease, where despite having increased years of evolution, patients that performed regular physical activity still scored less in the total ON UPDRS (see Supplementary Figure 1). Additionally, lower scores in MoCA testing significantly correlated with increased age, coffee consumption and the presence of hallucinations, falls and mood disorders, whereas increased years of education correlated with better MoCA scores (see Supplementary Figure 2).

Quality of coverage and identified variants
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The average coverage of the 10 genes analyzed in this study was >588X for all genes. The coverage per gene, and the percentage of nucleotides covered at >15X and >30X for each gene are detailed in Supplementary Table 1. There were no differences in the coverage across the samples (patients and controls). Overall, after quality control, we identified 163 rare variants (Supplementary Table 2 and 158 common variants (Supplementary Table 3 across all genes and all samples that were included in the analysis. Specific protein and DNA changes are listed in Supplementary Tables 4 and 5 for rare and common exonic variants, respectively.

Rare and common variants in Parkinson's disease and parkinsonism-related genes
Burden and SKAT-O analyses did not identify an association of any of the tested genes and PD (Table 3) after correction for multiple comparisons, as expected given the small sample size. We also did not identify any PD patients with potentially damaging homozygous or compound heterozygous variants in any of these genes. Rare variants in LRRK2 were nominally associated with PD, and 11 (9.2%) of patients carried a rare nonsynonymous variant, compared to four (4.1%) among controls. Interestingly, six of these rare nonsynonymous variants, all located between amino acids p.1620-1623 in the C-terminal-of-ROC (COR) domain of LRRK2, were found in six patients and none in controls (Table 4).
Nonsynonymous GBA variants were identified in three individuals: p.T369M was identified in a male patient with age at onset of 48 years, p.N370S was identified in a healthy female individual recruited at the age of 78 years, and p.L444P was identified in a healthy female individual . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 30, 2020. . recruited at the age of 64. While we cannot rule out that these healthy individuals will develop PD in the future, it is unlikely that GBA variants have a major role in PD among Costa Rican patients. One PD patient carried a pathogenic GCH1 variant, p.K224R, further emphasizing the role of this gene in PD.
In the analysis of common variants, none of the variants was associated with PD after correction for multiple comparisons (Supplementary Table 3), which set the corrected p value for statistical significance at p<0.00031. One nonsynonymous variant in LRRK2, p.I723V, was found with allele frequency of 0.01 in patients and 0.09 in controls (OR=0.11, 95% CI=0.02-0.52, p=0.005), yet this difference was not statistically significant after correction for multiple comparisons.

Clinical Features
Our sample average age of PD at diagnosis as well as the observed male predominance is consistent with prior literature. A history of non-potable water consumption along with exposure to pesticides and herbicides was reported in up to 40% of the patients. Previous exposure to pesticides and herbicides is associated with the development of PD [32]; similarly, exposure to welding and heavy metals such as manganese, copper, iron and mercury have been suggested to increase risk of PD. In this study, 22.1% and 11.6% of the patients reported frequent exposure to welding and other heavy metals, respectively.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 30, 2020. .
The majority of our patients fulfilled Group A Gelb criteria while up to 60% also reported at least one of Group B symptoms, being the most frequent dystonia, falls and dysphagia. Common non-motor manifestations such as hiposmia, sleep disorders and depressive/anxious mood were seen in more than half our PD cases.
Performing regular physical activity correlated with lower ON UPDRS scores in spite of increasing age. Physical activity has been established as a possible protective factor for incident Parkinsonism [33], our data would be suggesting that physical activity could determine reduced severity of disease, specifically concerning motor features. Although exercise has not proven to slow the progression of akinesia, rigidity and gait disturbances, it promotes a feeling of physical  [39]. Reduced education years also have been proposed as a risk factor for cognitive impairment in patients with PD [40]. Poor global cognition has been previously associated with a higher hazard of incident parkinsonism [41]. There have been inconsistent findings regarding the effects of coffee consumption on specific cognitive domains. It has been suggested to be in association with improved executive performance but smaller hippocampal volume and worse memory function [42]; however, this . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 30, 2020. . association is not sustained when cognition is analyzed longitudinally. Other literature suggested that coffee consumption might be slightly beneficial on memory without a dose response relationship [43]. Recent large-scale genetic analysis using mendelian randomization did not find any evidence supporting any beneficial or adverse long-term effect of coffee consumption on global cognition or memory function [44] or AD incidence [45]. To our knowledge, there is no literature evaluating the effect on cognition of coffee consumption, specifically for PD patients.
Our findings suggest a possible deleterious effect that should be further explored in this population.

Genetic assessment
After sequence coding familial PD and atypical parkinsonism-associated genes including GBA, SNCA, VPS35, LRRK2, GCH1, PRKN, PINK1, DJ-1, VPS13C, ATP13A2 and correcting for multiple comparisons, burden and SKAT-O analyses did not show an association of any of the tested genes and PD. We also did not identify any homozygous or compound heterozygous pathogenic variants in any of these genes. Rare variants in LRRK2 were nominally associated with PD, observing only in affected individuals, six of these rare nonsynonymous variants, all located between amino acids p.1620-1623 in the COR domain of LRRK2. LRRK2 encodes a multiple domain protein that includes a Roc-COR tandem domain, a tyrosine kinase-like protein kinase domain and at least four repeat domains located within the N-terminal and C-terminal regions. The Roc-COR domain classifies the LRRK2 protein as part of the ROCO superfamily of Ras-like G proteins [46]. Mutations in LRRK2 are the most common cause of late onset AD hereditary PD. Most frequently reported disease-causing mutations locate in the kinase domain . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 30, 2020. .
(i.e. G2019S) increasing kinase activity; and in the Roc-COR tandem domain (i.e. R1441C/G and Y1699C) impairing its GTPase function. Alterations of both kinase and GTPase activity may mediate neurodegeneration in these forms of PD [47]. Of the six patients found to have nonsynonymous variants in the COR domain, two had first degree relatives with dementia, one had a second degree relative with PD and one had two sisters with PD diagnosed at a very young age (20 and 30 years old) (see Table 4).
Nonsynonymous GBA variants were identified in three individuals including one patient and two unaffected controls. While we cannot rule out that these healthy individuals will develop PD in the future, it is unlikely that GBA variants have a major role in PD among Costa Rican patients.
Finally, one PD patient carried a pathogenic variant, p.K224R in GCH1 gene. GCH1 encodes for GTP cyclohydrolase 1 which is a key enzyme for dopamine production in nigrostriatal neurons.
Loss of function mutations such as p.K224R have been shown to cause Dopa-responsive dystonia (DRD), however variants in this gene, have also been implicated in PD, perhaps through regulation of GCH1 expression [48,49]. It has been suggested that late-onset DRD might present clinically with parkinsonism; or alternatively pathogenic GCH1 mutations may predispose to both diseases and carriers will develop any of both depending on other genetic or . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 30, 2020. . Therefore, we would have expected to observe a higher prevalence of mutations, similar to other European series reported. However, our sample is more representative of the metropolitan area where most of the patients were recruited, thus warranting in the future, for a more comprehensive study involving a wider and more representative population of the whole country.
Particularly, including more patients from the non-metropolitan and coastal zones.

Conclusions
This is the first study that reports on sociodemographic, risk factors, clinical presentation and genetics of Costa Rican patients with PD. We observed a high prevalence of exposure to both risk factors (pesticides, herbicides, non-potable water, low education) as well as protective factors (tobacco and coffee intake). Regular physical activity significantly correlated with better UPDRS scores despite years of evolution of the disease. Increased years of education were significantly associated with better MoCA test scores, whereas the presence of hallucinations, falls and mood disorders correlated with a worse performance in MoCA test. Interestingly, coffee consumption also correlated significantly with worse MoCA test scoring.
We did not find an association of any of the tested familial PD and atypical parkinsonismassociated genes including GBA, SNCA, VPS35, LRRK2, GCH1, PRKN, PINK1, DJ-1, VPS13C, ATP13A2 and PD. We also did not identify any homozygous or compound heterozygous pathogenic variants in any of these genes. Rare variants in LRRK2 were nominally associated . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 30, 2020. . with PD, six of these rare nonsynonymous variants, all located in the COR domain of LRRK2.
One PD patient carried a pathogenic GCH1 variant, p.K224R, further emphasizing the role of this gene in PD.

Acknowledgements:
We thank the participants for their contribution to the study. ZGO is supported by the Fonds de recherche du Québec-Santé Chercheur-Boursier award and is a Parkinson Canada New Investigator awardee. We thank Jennifer Ruskey, Sandra Laurent, Lynne Krohn, Uladzislau Rudakou, D. Rochefort, H. Catoire, V. Zaharieva and Dr. Ellen Sylvie-Mas for their assistance.

Authors' Roles:
GTA and EY conceptualized the report and made substantial contributions to the design, drafting and revision of the work. EY and ZGO performed the genetic analysis. TLP, JRM, AGP, ZGO, KCC and JFT significantly contributed to drafting and critically reviewing the paper.

Sources of Funding:
This work was supported by the Research Fund from the Vicerrectoria de Investigacion of the Universidad de Costa Rica. The genetic analysis was supported by the Michael J. Fox Foundation, the Canadian Consortium on Neurodegeneration in Aging (CCNA), Parkinson . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 30, 2020. .

Canada the Canada First Research Excellence Fund (CFREF), awarded to McGill University for
the Healthy Brains for Healthy Lives (HBHL) program.  [ . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint

References
The copyright holder for this this version posted September 30, 2020. . https://doi.org/10.1101/2020.09.29.20202432 doi: medRxiv preprint [  [ N u t r i e n t s 7 , 9 5 9 0 -9 6 0 1 . [ . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 30, 2020. .  Abbreviations: Q1-Q3: first and third quartile.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 30, 2020. .  is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 30, 2020. . . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint
In this model, increasing years of evolution of the disease (A) along with the presence of orthostasis (C), dysphagia (D) and mood disorders (E) significantly correlated with increased scores in total ON UPDRS. Furthermore, we found an interaction between performing regular physical activity and years of disease (F), where despite having increased years of disease, patients that performed regular physical activity still scored less in the total ON UPDRS.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 30, 2020. . https://doi.org/10. 1101/2020 for MoCA test scores. In this model, lower scores in MoCA test significantly correlated with increasing age (A), coffee consumption (C) and the presence of hallucinations (D), falls (E) and mood disorders (F), whereas increasing years of education (B) correlated with better MoCA scores.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted September 30, 2020. . https://doi.org/10. 1101/2020