This is a tutorial on how to download data from Jstor.
Jstor is a repository of academic journals. Recently, Jstor has allowed people to search and download bibliometric data.
One kind of bibliometric data available for download is word counts. Jstor has indexed the words that appear in its database of articles.
It is possible to search for, download, and analyze these data.
This tutorial will demonstrate how to accomplish this with R. In particular, we will search for the terms 'quantitative' and 'qualitative' in academic journal articles from 1900 to 2010. We will then graph how many times these two terms have been mentioned in Jstor over time.
Create a dataframe to house the data.
We will first create a dataframe to house the data. We will start off with a dataframe that has one column and 111 rows, one for each year from 1900 to 2010 inclusive.
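A minimal sketch of this step in base R follows. The names `n_years`, `jstor_counts`, and `V1` are assumptions chosen for illustration and do not come from the original tutorial.

```r
# Number of years from 1900 to 2010, counting both endpoints.
n_years <- length(1900:2010)

# One placeholder column filled with NA, one row per year.
# The column name V1 is a temporary, hypothetical name.
jstor_counts <- data.frame(V1 = rep(NA, n_years))

# Confirm the shape: rows, then columns.
dim(jstor_counts)
```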
Create a column of years.
Next, we will fill the column with the years in sequence and rename it to “YEAR”.
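One way to do this in base R is sketched below. The block rebuilds the placeholder dataframe so that it runs on its own; `jstor_counts` and `V1` are assumed names, not ones taken from the original tutorial.

```r
# Rebuild the one-column placeholder so this block is self-contained.
jstor_counts <- data.frame(V1 = rep(NA, 111))

# Fill the single column with the years in sequence.
jstor_counts[[1]] <- 1900:2010

# Rename that column to YEAR.
names(jstor_counts)[1] <- "YEAR"

# Check the first and last year.
range(jstor_counts$YEAR)
```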
Specify the terms to search for in Jstor.
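The two terms named in the introduction can be kept in a character vector so that later steps can loop over them. The variable name `search_terms` is an assumption.

```r
# The two terms this tutorial compares.
search_terms <- c("quantitative", "qualitative")

# How many terms will be searched.
length(search_terms)
```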
Loop through those search terms and pull in Jstor data.
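The loop might be structured as in the sketch below. The function `fetch_term_counts` is a hypothetical stand-in, not a real Jstor or R API: it only returns NA placeholders, and its body would need to be replaced with whatever call actually retrieves yearly counts for one term. The column names `YEAR` and `COUNT` in its return value are also assumptions.

```r
# HYPOTHETICAL stand-in for the real download step. Replace the body with
# the call that retrieves yearly counts for one term from Jstor.
# Assumed contract: returns a data frame with columns YEAR and COUNT.
fetch_term_counts <- function(term, years) {
  data.frame(YEAR = years, COUNT = NA_integer_)
}

years <- 1900:2010
search_terms <- c("quantitative", "qualitative")
jstor_counts <- data.frame(YEAR = years)

for (term in search_terms) {
  term_counts <- fetch_term_counts(term, years)
  # Align on YEAR so that any year missing from the download stays NA.
  matched_rows <- match(jstor_counts$YEAR, term_counts$YEAR)
  jstor_counts[[term]] <- term_counts$COUNT[matched_rows]
}

names(jstor_counts)
```

Matching on `YEAR` rather than assigning the downloaded column directly keeps rows aligned even if a download returns fewer years than expected.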
Inspect the data frame downloaded.
Look at the top six rows and the last six rows.
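In base R, `head()` and `tail()` return six rows by default. The sketch below assumes a dataframe named `jstor_counts` with a `YEAR` column and one column per term; the NA values are placeholders standing in for downloaded counts.

```r
# Placeholder dataframe with the assumed shape of the downloaded data.
jstor_counts <- data.frame(
  YEAR = 1900:2010,
  quantitative = NA_integer_,  # placeholder, not real counts
  qualitative = NA_integer_    # placeholder, not real counts
)

# First six rows (1900 to 1905).
first_rows <- head(jstor_counts)
first_rows

# Last six rows (2005 to 2010).
last_rows <- tail(jstor_counts)
last_rows
```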
Reshape the data to put it in a graph-friendly (long) format.
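The wide-to-long reshape can be sketched in base R as follows. The original tutorial may well have used a reshaping package instead; this version avoids dependencies. The values in `wide` are made-up toy numbers for illustration only, not Jstor data, and the column names `TERM` and `COUNT` are assumptions.

```r
# Toy wide table: one row per year, one column per term.
# These numbers are invented for illustration and are NOT Jstor counts.
wide <- data.frame(
  YEAR = c(1900, 1955, 2010),
  quantitative = c(10, 40, 90),
  qualitative = c(5, 20, 30)
)
terms <- c("quantitative", "qualitative")

# Long format: one row per (year, term) pair.
long <- data.frame(
  YEAR = rep(wide$YEAR, times = length(terms)),
  TERM = rep(terms, each = nrow(wide)),
  COUNT = unlist(wide[terms], use.names = FALSE)
)

long
```

In long format each row holds a single observation, which is the layout most plotting functions expect when drawing one line per group.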
Graph the data to observe trends over time.
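A line chart with one line per term can be drawn with base R graphics, as in the sketch below. The function name `plot_term_trends` and the column names `YEAR`, `TERM`, and `COUNT` are assumptions, and `long_toy` holds made-up toy values rather than Jstor data. The original tutorial may have used a different plotting library.

```r
# Draw one line per term from a long-format dataframe with
# columns YEAR, TERM, and COUNT (assumed names).
plot_term_trends <- function(long) {
  terms <- unique(as.character(long$TERM))
  colours <- seq_along(terms)

  # Empty frame sized to fit every term's values.
  plot(range(long$YEAR), range(long$COUNT, na.rm = TRUE), type = "n",
       xlab = "Year", ylab = "Mentions")

  # Add one line per term.
  for (i in seq_along(terms)) {
    rows <- long$TERM == terms[i]
    lines(long$YEAR[rows], long$COUNT[rows], col = colours[i])
  }

  legend("topleft", legend = terms, col = colours, lty = 1)
  invisible(terms)
}

# Toy values invented for illustration; NOT Jstor counts.
long_toy <- data.frame(
  YEAR = rep(c(1900, 1955, 2010), times = 2),
  TERM = rep(c("quantitative", "qualitative"), each = 3),
  COUNT = c(10, 40, 90, 5, 20, 30)
)

plot_term_trends(long_toy)
```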
As you can see from the graph below, academic journal articles have mentioned “quantitative” more frequently than “qualitative” throughout the period, and the gap in mentions widens over time.