doi:10.29203/ka.2020.494
ORIGINAL RESEARCH
Karstenia, Volume 58 (2020), Issue 2, pages 190–200
www.karstenia.fi

Comparative analysis on datasets of myxomycetes associated with boreal, temperate and tropical regions of North America

Carlos Rojas1 * and Steven L. Stephenson2

1 Engineering Research Institute and Department of Biosystems Engineering, University of Costa Rica, San Pedro de Montes de Oca, 11501-Costa Rica
2 Department of Biological Sciences, University of Arkansas, Fayetteville, Arkansas 72701
* Corresponding author: carlos.rojasalvarado@ucr.ac.cr

Keywords: biogeography, biomes, datasets, macroecology, modelling, slime molds

Article info:
Received: 15 July 2020
Accepted: 11 September 2020
Published online: 13 October 2020
Corresponding Editor: Nikki Heherson A. Dagamac
Assistant Editor: Oleg Shchepin

Abstract

Datasets from boreal (Denali National Park, United States), temperate (Great Smoky Mountains National Park, United States) and tropical (La Selva Biological Station, Costa Rica) regions of North America were subjected to analysis. The complete dataset, composed primarily of field data, consisted of 3558 records, with 47% temperate, 29% boreal and 23% tropical. A total of 208 species were recorded for the three regions, with 69% temperate, 49% boreal and 40% tropical. A highly significant correlation between the number of records and the number of species (r² = 0.99, P = 0.001) suggested that the latter was a function of the former, independent of location. However, this relationship was stable at low survey efforts, as observed in a model obtained with 25 independent datasets from the northern hemisphere of the Americas.
Diversity values, calculated with the Shannon Index, ranged from 3.4 to 4.0 and were different for all pairwise combinations (all cases P<0.05) of the three datasets, but when calculated with the Simpson Index they were not different for the combination of temperate and boreal datasets. At the species level, the smallest value (0.38) for coefficient of community was observed for the boreal-tropical pair and the highest (0.56) for the temperate-tropical pair. The taxonomic diversity indices were 2.68 and 2.83 for the boreal and tropical datasets, but 3.76 for the temperate dataset. The latter may be an indication of higher fruiting propensity in temperate regions rather than an indication of intraspecific diversity, an idea that deserves further examination. The boreal dataset had the highest number of unique genera (7), followed by the temperate (6) and the tropical (2) datasets. However, the temperate dataset showed the highest number of unique species (57), followed by the boreal (37) and tropical (26) datasets. When analyzed in a comparative context, standard experiments with similar field efforts and techniques are still required to document patterns of reproductive occurrence of myxomycetes in different regions of the world. For macroecological purposes, all regions represented by the datasets analyzed herein still remain understudied.

Introduction

The myxomycetes comprise a group of amoeboid protists known to occur in all biomes on the planet (Novozhilov et al. 2017b). Most of the large-scale spatial research on these organisms during the last 200 years has been based on the reproductive stage of their life cycle (Schnittler et al. 2017). Despite the incomplete information that has originated from such a shortcoming, particularly in terms of species absence, there is a large amount of data showing the opposite pattern (i.e., species presence).

In terms of species distribution and biogeographical patterns, the classical survey methodology used with myxomycetes has the intrinsic constraint of generating false negatives (i.e., falsely determining the absence of a species). However, it is also known that modern molecular methods, commonly used in the detection of microorganisms, have the constraint of generating false positives (i.e., falsely determining the presence of taxa by overestimating biodiversity as noted by Guillera‐Arroita [2016]). In this context, critical evaluations of mixed methodologies are necessary to design survey strategies that are less susceptible to the intrinsic biases of techniques and sampling.

Presence-only data, despite providing limited information about species distributions, are still remarkably important for ecological purposes (see Bradley 2016). With myxomycetes, it is clear that the accumulated information on their distribution, albeit incomplete due to the constraint explained earlier, is also limited by the fact that only one stage of their life cycle, the reproductive one, provides the information (“the tip of the iceberg” according to Schnittler et al. [2017]). However, those issues do not affect the robustness of data equally at all levels and apply mostly to biodiversity matters. In this sense, morphospecies-based, presence-only datasets still represent valuable tools for evaluation of the reproductive stage and for the study of ecological and evolutionary pressures directed on myxomycetes during the reproductive part of their life cycle.

In macroecological terms, such uncovering of fundamental relationships between species and the systems they inhabit at large spatial scales is important to understand the principles that determine, to a certain extent, their distribution (see Kent 2005). As such, examinations of potential ecological associations of species of myxomycetes contribute to the knowledge of species distribution via species (or fruiting) occurrence. At large geographical scales, this is fundamental to examine biogeographical patterns and understand the basic drivers of myxomycete biodiversity (see Shoemaker et al. 2017). In simple terms, macroecological patterns can elucidate biogeographical ones.

One limitation, however, for comparative macroecological purposes using myxomycete data is the unavailability of reliable information for most ecosystems around the world. As such, available information on morphospecies presence is remarkably valuable within that context. During his career, the second author of this paper has generated a number of valuable datasets, including some in boreal, temperate, and tropical regions. Within that framework, the approach used herein was to examine one well-developed dataset in each one of three biomes in the northern hemisphere of the Americas. The objective of the latter was to analyze myxomycete occurrence in the three regions, using data for the reproductive stage of their life cycle, for a comparison of results that can contribute to the creation of a representative picture of the biological complexity present in them. This is an important task for the elucidation of macroecological dynamics that can potentially serve as the basis of monitoring.

Material and Methods

Three datasets, obtained from the database of worldwide myxomycete records at the University of Arkansas, were used for this study. All of them were selected on the basis of geographical location (i.e., representative of a particular biome) and the number of records contained. In this manner, datasets from Denali National Park (DNP hereafter, 63°N, Alaska, USA), the Great Smoky Mountains National Park (GSMNP, 35°N, Tennessee/North Carolina, USA) and the La Selva Biological Station (LSBS, 10°N, Costa Rica) were selected for the characterization of boreal (taiga), seasonal temperate and tropical rainforests, respectively. These three locations represented an extreme gradient of naturally occurring biomes, each one separated by 25°–28° of latitude, in the northern hemisphere of the Americas.

After correcting inconsistencies in the datasets (i.e., old names, records with incomplete information, and clearly wrong data), a calculation of alpha diversity for each one was carried out using both the Shannon's and the Simpson's diversity indices. Also, the Taxonomic Diversity Index (number of species/number of genera) was calculated and the diversity profile for each set was also constructed using Hill numbers. The Shannon Index was calculated for reference since a number of previous studies on myxomycetes have calculated this estimator; however, due to its difficult interpretation, the intuitive form of the Simpson's Index (1-D) was calculated as well. The diversity profiles in the sense of Chao et al. (2014) represent a more accurate strategy for interpretation of alpha diversity dynamics and in these, a Hill number q = 0 is associated with species richness, q = 1 indicates the number of species with typical abundances, and q = 2 refers to the effective number of dominant species.

Figure 1. Relative effort of myxomycete recording by time in the three datasets studied herein. Temporal sections with high collecting effort have been marked.

Table 1. Diversity estimators (and range) calculated for the three datasets analyzed in the present study. Highest values are shown in bold. The abbreviations for the datasets are provided in the Material and Methods.

Datasets                             DNP                GSMNP              LSBS
Shannon's Diversity Index            3.90 (3.88-3.98)   4.05 (4.08-4.19)   3.43 (3.37-3.53)
Simpson's Diversity Index            0.97 (0.96-0.97)   0.96 (0.95-0.97)   0.94 (0.93-0.95)
Chao 1 - Maximum Number of species   137 (110-140)      172 (150-180)      128 (90-131)
Taxonomic Diversity Index            2.68               3.76               2.83

A general rarefaction curve was created for each dataset for comparison purposes. With these curves, the relative accumulation of species can be compared between different sets at the same sampling effort. Similarly, a correlation between the number of records and the number of species was carried out to test the hypothesis that the relationship between the two variables is independent of location. With the use of 25 datasets of myxomycete records from North America (obtained from Lado & Rojas [2018] and Stephenson et al. [2020]), a general model was created to assess the last idea in a broader context and to evaluate the position of the three studied datasets in relation to the general trend of species accumulation.

For beta diversity analysis, two approaches were also followed. The commonly used Sørensen coefficient of community estimator was calculated for all pairwise combinations and a cluster analysis using the Jaccard algorithm was carried out with the complete three datasets. The latter followed the “regional” approach discussed by Chao and Chun-Huo (2016). In both cases, the number of records associated with the different species was used for the calculations (and also as a proxy for species abundance). In a similar manner, the unique number of species and genera was calculated for each dataset and the number of shared taxa between two datasets was determined. The effective proportion of non-shared species in the pooled community was calculated. A second cluster analysis with the data at the genus level was also carried out. The beta diversity approach is intended to create a picture of the ratio between regional and local diversity. Following the observations of Sreekar et al. (2018), a null hypothesis of no differences in beta diversity due to the large spatial scaling was established for all three locations. For the latter, the shared Jaccard estimator of beta diversity was used in the model as a reference of comparison and a Wilcoxon test was used.

All diversity-based calculations were carried out using the SpadeR program in R (Chao et al. 2016) and statistical analyses were performed in PAST v4.01 (Hammer et al. 2001). In all cases, the cutoff value to accept the null hypothesis was 0.05. The nomenclature was established according to Lado (2005–2020).

Results

The complete dataset contained 3558 records of myxomycetes with 1684 records from GSMNP (47%), 1043 from DNP (29%) and 831 from LSBS (23%). All records were obtained between 1963 and 2013 (Fig. 1). The collecting effort in the different locations took place at different times with about 75% of all records from DNP made between 1989 and 1995 (7 year period), 70% of records from the GSMNP made between 1999 and 2003 (5 year period) and about 94% of records from LSBS made between 1999 and 2012 (14 year period).

A total of 143 species were recorded in GSMNP, followed by 102 from DNP and 85 from LSBS. Using the number of records as a proxy for collecting effort, there was a highly significant correlation between effort and the number of species recorded (r² = 0.99, P = 0.001). The highest Shannon Index of Diversity was observed in GSMNP, followed by DNP and LSBS (Table 1) and all combinations were significantly different (P<0.05 in all cases). The highest Simpson Index of Diversity was calculated for DNP, followed by GSMNP and LSBS and the pairwise comparison of diversity values between GSMNP and DNP was not significant. The highest Taxonomic Diversity Index was observed in GSMNP, followed by LSBS and DNP.

The rarefaction curves constructed with the data (Fig. 2, top) showed that the lowest number of maximum expected species was associated with LSBS with approximately 110 species. The GSMNP was associated, as expected, with the highest values of accumulated recorded species based on a similar effort. For this dataset, however, the diversity profiles (Fig. 2, bottom) showed a lower number of species with typical abundances and dominant species (q = 1 and 2) than DNP. The LSBS dataset showed the lowest values at all Hill number values.

Figure 2. Rarefaction (top) and diversity profile (bottom) curves constructed with the three datasets studied herein. For the first one, dotted sections represent extrapolated values. For interpretation of q values see the Material and Methods section.

The values of records and species observed in the three datasets studied herein placed them in the growing part of the curve of species accumulation constructed with 25 datasets from the northern hemisphere of the Americas (Fig. 3). Such a model showed a remarkably high correlation value (r = 0.97) and a classical shape produced by equation 1.

$$ y = \frac{822.7 + 330.9\,x^{0.73}}{227.9 + x^{0.73}} \qquad \text{(equation 1)} $$

Where x is the number of records of myxomycetes in a dataset and 330.9 indicates the maximum number of species to be recorded (the asymptote) in a dataset at the maximum spatial scale studied herein (about 80 000 square kilometers).

The average deviation of the actual observed values obtained from the datasets and the model was -1.3 × 10⁻⁸ units (species, y axis) but the average deviation obtained for the three datasets studied herein was -31.5 units. This suggests that all three datasets are underestimating the number of species at their respective effort based on a general trend.

Figure 3. Model of species accumulation as a function of number of records observed for 25 independent datasets from North America (dotted line, high correlation) and real observed values (markers). The three datasets studied herein are shown in the graph along with the linear relationship among them (solid line, r = 0.99).

When the beta diversity analyses were carried out, results showed that, at the species level, the overall similarity among the three datasets was 0.29 (range between 0.26–0.32) when a classical species richness-based regional estimator such as Jaccard was used. With this approach, the most dissimilar dataset was DNP (Fig. 4, left) but no significant differences were found (W=5, P=0.28) in similarity across datasets. The coefficient of community values, which are focused more on local diversity (Fig. 4, left), showed a similar pattern, with an average of 0.55 (range between 0.51–0.59) and highest differences between DNP and LSBS.

When a similar evaluation was performed at the genus level, a different pattern was observed in the cluster analysis (Fig. 4, right). In this case, the most dissimilar dataset was LSBS. However, differences in the values of the Jaccard estimator as well as the coefficient of community were very small, as anticipated at a larger taxonomic scale. Since most genera were shared among datasets (22 on average or 45% of total number of genera), differences were smaller than at the species level.

Figure 4. Diagram of the relationships between the studied datasets using records of myxomycete species (left) and genera (right) and the Jaccard similarity as an estimator of beta diversity (top). The matrices (bottom) show the coefficient of community values observed for pairwise combinations (above) and the number of shared taxa (below). The abbreviations of the datasets are provided in Material and Methods.

With a total of 63 shared species, the pair DNP–GSMNP showed the highest value of taxonomic overlap. This was followed by 57 shared species between GSMNP–LSBS and only 36 between DNP–LSBS. The latter was similar to the 34 shared species across the three datasets, including Ceratiomyxa fruticulosa, Lycogala epidendrum and Stemonitis axifera. At the same time, GSMNP showed 57 unique species including Colloderma oculatum, Trichia subfusca and Lamproderma columbinum. This was followed by the 37 unique species in DNP, which included Mucilago crustacea, Trichia flavicoma and Craterium leucocephalum; and 26 unique species in LSBS, including Arcyria afroalpina, Physarella oblonga and Physarum melleum. Interestingly, the effective proportion of non-shared species in the pooled data from all three datasets was calculated as 56% of the species. At the genus level, DNP showed seven unique genera, with Calomyxa and Oligonema among them; GSMNP had six unique genera, with Barbeyella and Colloderma, and only Physarella and Stemonaria were unique to LSBS (see Table 2).

Table 2. Number of records of myxomycetes, arranged by genus, associated with the three datasets studied in the present investigation. Numbers in bold denote unique genera for particular datasets.

Genus, followed by number of records per dataset (DNP GSMNP LSBS)
Arcyodes 3 1
Arcyria 71 241 153
Badhamia 6 6
Badhamiopsis 3
Barbeyella 11
Calomyxa 11
Calonema 1
Ceratiomyxa 33 42 76
Clastoderma 1 10 2
Collaria 1 32 6
Colloderma 19
Comatricha 27 42 14
Craterium 18 11 2
Cribraria 22 141 41
Diachea 3 1
Diacheopsis 3
Dianema 1 3
Dictydiaethalium 4 2
Diderma 17 44 37
Didymium 36 99 111
Echinostelium 10 8 4
Elaeomyxa 8
Enerthenema 6 10
Enteridium 1
Fuligo 47 18 1
Hemitrichia 58 85 63
Lamproderma 4 30 26
Leocarpus 37 16
Lepidoderma 1 17
Licea 15 22 3
Lindbladia 1
Lycogala 61 49 14
Macbrideola 2
Metatrichia 6 22 2
Mucilago 47
Oligonema 2
Paradiacheopsis 1
Perichaena 89 35 34
Physarella 10
Physarum 173 325 193
Reticularia 2 1
Siphoptychium 3 1
Stemonaria 1 1
Stemonitis 58 73 22
Stemonitopsis 10 42 4
Symphytocarpus 1 1
Trichia 155 163 4
Tubifera 4 34 2
Willkommlangea 1

Discussion

This study was not intended to analyze myxomycete biodiversity and biogeography. As pointed out by Schnittler et al. (2017), those are topics for which molecular information would facilitate a robust examination of patterns. However, as mentioned before, the current available datasets on myxomycete occurrence (field-based presence-only data) are remarkably valuable for ecological analysis. In that manner, the purpose of this study was to examine myxomycete presence in three biomes of the world, to contribute, with comparative analyses, to the elucidation of macroecological patterns. Even though the datasets presented herein are primarily, but not entirely, generated with field records, it is important to note that an even higher level of representativeness could be accomplished in an analysis made with a completely known nature of records.

It is interesting to note that the shortest period of time associated with recording information for the datasets considered in the present study was that in the Great Smoky Mountains National Park. However, it makes perfect sense since the period between 1997–2003 was a very active time for biological research at that park. It was the peak of the All Taxa Biodiversity Inventory initiative (see Sharkey 2001), a project designed to document the biodiversity of a number of biological groups in the area. Both the datasets at Denali National Park and La Selva Biological Station followed a grant-based model of generation of information, with which data accumulated at a slower pace. In this sense, the datasets studied in the present investigation demonstrated that an increased effort of myxomycete recording in one general location had a positive effect on the faster accumulation of data.

Despite the latter, the high linear correlation of species accumulation and records (effort) observed herein, in the context of the general model constructed with independent data, clearly showed that none of the three datasets can be considered a complete picture of field-based information on myxomycetes for their respective biome. However, those results also showed that in terms of stimulus-response at the reproductive stage, all biomes are essentially providing equivalent conditions for myxomycetes to thrive. Such neutrality in the boreal-temperate-tropical gradient has been observed in plant responses to the environment (Bastias et al. 2017) and would deserve a finer exploration with other myxomycete data. In this sense, differences in myxomycete reproductive dynamics according to biome are more related to the length of growing seasons than to the biological quality of the systems (in the sense of Cavahaugh et al. 2014). It is clear that the length of the season with such conditions (i.e., non-winter months) follows the pattern DNP

Such a phenomenon, in a way, is also present in temperate forests. However, the higher number of compartments within the ecosystem in this biome (compared to the boreal one) allows for a potentially higher number of niches to be occupied. The Great Smoky Mountains have been known to be a good example of biological complexity in temperate areas for a long time (see Whittaker 1956). If, by mere probability, some species of myxomycete occupying these niches would also fruit during the favorable seasons after the winter, a higher number of species would be expected to be recorded. The higher Tax-