
portalr – An R package for using the Portal data

August 13, 2019

Much effort has been made over the years to keep the Portal data a continuous, consistent time series.

Nonetheless, every field project has its quirks. And in 40 years, a lot of interesting stuff can happen. So some of that consistency has to happen post-hoc. Naturally, over decades of researchers using the data, some ‘best practices’ have been developed to deal with data cleaning on multiple levels.

Special Cases

You have to stop setting traps halfway through a plot because it’s in the middle of a lightning storm. You trap with no fences at all, because they’re being replaced. You catch a skink, or a cactus wren, or a snake(!), in a rodent trap.

Within-time series

We have made several improvements over the years to the ways data are collected. While not always affecting the consistency of the time series, those changes may affect the way the data get summarized to mesh with the previous methods.

Across time series

Of course, we just really collect a lot of data, of all types. These data are collected in different ways and at different time scales, but they can all be woven into one time series matrix, if you know what you’re doing.

We want to share these ‘best practices’ publicly along with the dataset, because we want it to be easily accessible to anyone who might want to use it. Not just those of us who know all its ‘secrets.’ Or those of us who can yell down the hall to the senior grad student “Hey, there were no fences during a census? What should I do about THAT?”

The best way to do that seemed to be an R package, which we’ve published on CRAN.

portalr

Now you can install it easily from CRAN:

install.packages("portalr")

The development version is also available directly from GitHub by using remotes:

# install.packages("remotes")

remotes::install_github("weecology/portalr")

There are functions to download the data, or to load it into R (including straight from the GitHub repo):

download_observations(".")

data_tables <- load_rodent_data("repo")

You can summarize the rodent data in many different ways. There are arguments for the table shape, whether or not to include unknowns, which treatment types to use, and much more. The possible combinations are endless.

abundance(".", level = "site", shape = "crosstab", time = "period")
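To make the shape argument concrete, here is a small base-R sketch using made-up counts. The data frame `toy` and its values are hypothetical, not Portal data, and this is not necessarily how portalr builds its tables internally. A "flat" table has one row per period and species combination, while a "crosstab" table has one row per period and one column per species.

```r
# Toy data in "flat" (long) form: one row per period x species.
# The counts below are invented for illustration only.
toy <- data.frame(
  period    = c(1, 1, 2, 2),
  species   = c("DM", "PP", "DM", "PP"),
  abundance = c(5, 2, 7, 0)
)

# Reshape to "crosstab" (wide) form: one row per period, one column per species.
crosstab <- as.data.frame.matrix(xtabs(abundance ~ period + species, data = toy))
crosstab$period <- as.numeric(rownames(crosstab))
crosstab <- crosstab[, c("period", "DM", "PP")]
rownames(crosstab) <- NULL

print(crosstab)
```

The flat form tends to be convenient for plotting and grouping, while the crosstab form is the kind of sites-by-species matrix many community analyses expect.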


You can also get the data as biomass, or even energy, rather than abundance:

biomass("repo", level = "plot", type = "granivores", shape = "flat", time = "date")
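As a rough illustration of how abundance and biomass differ, the sketch below summarizes a handful of invented capture records with base R. The `captures` data frame, its weights, and the `toy_*` names are all hypothetical. The package itself may handle details (such as missing weights) that this toy version ignores.

```r
# Hypothetical capture records: one row per individual caught.
# Weights (in grams) are invented for illustration.
captures <- data.frame(
  period  = c(1, 1, 1, 2, 2),
  species = c("DM", "DM", "PP", "DM", "PP"),
  wgt     = c(42, 45, 17, 40, 16)
)

# Abundance: number of individuals per period and species.
toy_abundance <- aggregate(wgt ~ period + species, data = captures, FUN = length)
names(toy_abundance)[3] <- "abundance"

# Biomass: total mass per period and species.
toy_biomass <- aggregate(wgt ~ period + species, data = captures, FUN = sum)
names(toy_biomass)[3] <- "biomass"

# Combine the two summaries on their shared columns (period, species).
toy_summary <- merge(toy_abundance, toy_biomass)
print(toy_summary)
```

Two species with similar counts can differ a lot in biomass, which is why the choice of currency can change the story a summary tells.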


There are similar options for weather, plant, and ant data:

weather("Monthly", path = ".", fill = TRUE)


plant_abundance(".", shape = "flat", level = "quadrat")


There are more in-depth examples in the vignettes. Go check them out!

browseVignettes("portalr")

This is designed to be a quick way to get you off the ground, out of the data cleaning step, and into doing analyses. It also works as a good introduction to the data that are available. Of course, the raw data and their metadata are always available once you feel prepared to create more specific or complicated data summaries of your own. The methods and the data paper contain a great deal of detail, so you can always discover the provenance of our ‘best practices’ yourself, or decide to do something slightly differently if it fits your question better.

If you use the data in a way that we don’t provide, but you think may be generally useful, please feel free to submit a pull request, request an additional argument in a function, etc. We would love to know how you’re using it!
