Every so often, someone asks me for a Portal reading list to come up to speed on what we know about the site. This seems like a simple question, but it is actually pretty difficult. We define a “Portal Paper” as a paper using data collected at the study site (whether or not Portal Project people were involved), or data collected near the study site if that data was collected by the project or with substantive assistance from our project. Over the years, the site has contributed to over 120 papers and book chapters (our current estimate is 123, but we still find older papers that we didn’t know existed).
How the Portal Data is used has been changing in recent years. Historically, most papers were by people affiliated with the group. As we’ve blogged about before, starting in 2009 we have been working on making our data openly available through a number of venues. We post all of our data on the Portal GitHub Repo and we have also published two Data Papers through Ecology’s Ecological Archives. The nice thing about Data Papers is that they are indexed and cited just like regular scholarly papers, which is important because it allows us to 1) document that the Portal Project is a valuable scientific resource to the community (which theoretically may be helpful on grant applications) and 2) stay informed of results coming out of the site that we’re not involved in. So, how are scientists external to our group using our openly available data? Google Scholar lists 22 citations between the 2 data papers. (For the non-academics, citing other papers in our own papers is an important part of scientific publishing. It allows us to give credit to those whose ideas, data, or methods we are working with. It also allows us to provide proof or support that statements we make in our papers are supported by things other people have been finding. Google Scholar is a database that keeps track of these citations.) Here’s the breakdown of what Google Scholar says has been citing our Data Papers:
All the site-focused research (papers that use our data as the primary focus of their analysis) was done by or in collaboration with someone affiliated with the project. If other researchers are using our data, so far they tend to either use it as part of a meta-analysis (i.e., as one of many data points in the analysis) or to make a figure that gives an empirical example of what they are talking about in a statistical or conceptual paper. (Three papers cite a Data Paper for reasons that defy classification. After reading those papers, I have no idea why they cited us!) The number of citations listed by Google Scholar is probably a little lower than the database’s actual use in papers, because data citations for meta-analyses often get shoved off into the supplementary materials and are not indexed by Google Scholar as a result. Our usage in meta-analyses is probably higher, but it is unlikely that we’ve missed a paper focused solely on data from our site.
We are hoping to increase Portal’s usability and we have some things in the works, which we will blog about later, that we hope will make it easier for people to get the data they need to use Portal as part of their analyses. We love seeing the data used but know that the history of the site and all the manipulation changes can make it difficult to figure out how to extract the data you need.