Accessing CoinGecko's API for Cryptocurrency Market Data
Here I will give some simple examples of using CoinGecko’s API to access market data for a broad cross-section of cryptocurrencies and tokens from various blockchains. CoinGecko offers a basic subscription that is free for anyone interested in learning more about cryptocurrency data.
*** Please include a citation if you use this code for a research project. ***
Load Packages
# Install Packages
install.packages('tidyverse')
# Load Packages
library(tidyverse)
library(httr)
library(jsonlite)
library(glue)
A Basic GET Call
Set the generic URL that can be used in conjunction with any of CoinGecko’s endpoints to make a GET call. In layman’s terms, a GET call is your computer asking the API to send back a particular dataset, which is identified by its “endpoint”.
# Base URL
base <- 'https://api.coingecko.com/api/v3/'
We will use the coins/list endpoint as a simple example of a single GET call. It returns every cryptocurrency/token that is available on CoinGecko’s API.
##### List all assets #####
# set endpoint
endpoint <- 'coins/list'
# GET call
get <- GET(glue(base,endpoint))
# prepare data
dat <- fromJSON(rawToChar(get$content))
# build a data frame of names, tickers, and IDs
df1 <- data.frame(name=dat$name,ticker=dat$symbol,id=dat$id)
# vector of all asset IDs
IDs <- df1$id
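To see what fromJSON() does with this kind of response without spending an API call, here is a minimal sketch that parses a small hand-written JSON string. The three records are illustrative stand-ins, not live data. The only assumption is that the response is an array of objects with id, symbol, and name fields, which is what the code above relies on.

```r
library(jsonlite)

# Hand-written stand-in for a coins/list response (illustrative records, not live data)
sample_json <- '[
  {"id": "bitcoin",  "symbol": "btc",  "name": "Bitcoin"},
  {"id": "ethereum", "symbol": "eth",  "name": "Ethereum"},
  {"id": "tether",   "symbol": "usdt", "name": "Tether"}
]'

# fromJSON() simplifies an array of flat objects into a data frame
sample_dat <- fromJSON(sample_json)

# same reshaping as above: keep the name, ticker symbol, and ID
sample_df <- data.frame(name = sample_dat$name,
                        ticker = sample_dat$symbol,
                        id = sample_dat$id)

# vector of asset IDs
sample_ids <- sample_df$id
```

Because the parsed object is already a data frame, the columns can be pulled out with $ exactly as in the live example.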
Optimizing API Limits to get Full Historical Data
Each API has limits on the data you can retrieve in a single GET call, as well as limits on the number of GET calls you can send to the API in a given timeframe. For example, depending on your account with CoinGecko, you may be able to send somewhere between 10 and 50 GET calls per minute. For this reason, it is important to optimize the GET calls when getting data in bulk.
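To get a feel for what these limits mean in practice, here is a back-of-the-envelope sketch of the minimum run time when every asset costs one GET call. The function name and the input values are hypothetical and chosen only for illustration. They are not figures reported by the API.

```r
# Rough lower bound on run time (in minutes) when each asset costs one GET call.
# n_assets and calls_per_min are hypothetical inputs, not values reported by the API.
estimate_minutes <- function(n_assets, calls_per_min) {
  ceiling(n_assets / calls_per_min)
}

estimate_minutes(10000, 30)  # 334 minutes at 30 calls per minute
estimate_minutes(10000, 10)  # 1000 minutes at 10 calls per minute
```

Even under generous assumptions, a full download is measured in hours, which is why the loop further down is built around the per-minute limit.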
First, we will save a vector of crypto IDs that will go into our GET calls. We will also save the beginning and ending of the endpoint URL, which get wrapped around each ID.
##### Loop to get full daily historical data --> 1 request per asset #####
# blank list
l1 <- list()
# vector of all asset IDs
IDs <- df1$id
# start of endpoint
start <- 'https://api.coingecko.com/api/v3/coins/'
# end of endpoint
end <- '/market_chart?vs_currency=usd&days=4000&interval=daily' # notice 4,000 days which gives me more than 10 years
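As a quick illustration of how the two pieces wrap around an ID, here is a sketch that builds the request URLs for two example IDs without calling the API. The IDs are stand-ins for entries of the IDs vector. This sketch uses base R's paste0() rather than glue() because paste0() is vectorised over the IDs.

```r
# start and end of the endpoint, as defined above
start <- 'https://api.coingecko.com/api/v3/coins/'
end <- '/market_chart?vs_currency=usd&days=4000&interval=daily'

# two example IDs standing in for entries of the IDs vector
example_ids <- c('bitcoin', 'ethereum')

# paste0() is vectorised, so this yields one URL per ID
urls <- paste0(start, example_ids, end)
urls
```

Printing the URLs before looping is a cheap way to catch a missing slash or a mistyped query parameter.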
Create a loop to optimize API request limits
Note: I am only running one iteration of the outer loop (the first 30 IDs). To run the full set, replace 1:1 in the outer for loop with 1:x. This will take a very long time to run.
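One thing to be aware of: the index in the loop below is built with floor(), so any IDs left over after the last full group of 30 are never requested. Here is a minimal sketch of an alternative that keeps the final partial batch. make_batches is a hypothetical helper and the IDs are made up.

```r
# Split a vector of IDs into batches of at most `size`, keeping the final partial batch.
# make_batches is a hypothetical helper, not part of the original code.
make_batches <- function(ids, size = 30) {
  split(ids, ceiling(seq_along(ids) / size))
}

toy_ids <- paste0('coin', 1:65)  # 65 made-up IDs
batches <- make_batches(toy_ids)
lengths(batches)                 # batch sizes: 30, 30, 5
```

Each element of batches is a character vector of IDs, so an outer loop could iterate over the list directly instead of filtering an index data frame.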
### Loop over all batches of 30 IDs --> due to the request limit per minute ###
# create index of ID positions (30 requests per batch)
# number of full batches (floor() drops any leftover IDs beyond the last full batch)
x <- floor(length(IDs)/30) # 30 IDs per batch, 1 request per ID
# data frame mapping each ID position to its batch number
iters <- data.frame(tot.IDs=1:(x*30),iter=rep(1:x,each=30)) # 30 IDs per batch
# run loop over all iterations (30 per)
for(j in 1:1){
# select which iteration you want to perform
index <- iters %>% filter(iter==j)
### Loop over all IDs ###
for(n in ((min(index$tot.IDs)):(max(index$tot.IDs)))){ # can only make 50 requests per minute
# repeat this GET call until the status is something other than 429
repeat{
# GET call
get <- GET(glue(start,IDs[n],end))
# every API service rate-limits GET calls. The simplest way to deal with this constraint is
# to check for a `429` status, which signifies that you have reached the limit. Wait a minute and try again
if(get$status_code==429){
Sys.sleep(60)
} else {
# any other status (200 = success, or an error that retrying will not fix): stop retrying
break
}
}
# skip this asset if the request did not succeed
if(get$status_code!=200) next
# notice each call is in USD, up to 4,000 days of data (more than 10 years), and daily frequency
# prepare data
dat <- fromJSON(rawToChar(get$content))
# save
l1[[n]] <- data.frame(unix=as.POSIXct(dat$prices[,1]/1000,origin = '1970-01-01'),price=dat$prices[,2],
mkt.cap=dat$market_caps[,2],vol=dat$total_volumes[,2]) %>% try()
# save name of crypto
l1[[n]]$name <- IDs[n]
} # end inner loop (IDs within a batch)
} # end outer loop (batches)
# Combine the per-asset data frames into one, then export it before you lose it (piecemeal)
df.out <- do.call(rbind,l1)
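The wait-and-retry logic inside the loop can also be pulled out into a small function, which makes it possible to test it without touching the API. The sketch below is one way to do that. retry_on_429 and fake_fetch are hypothetical names, and the cap on attempts (max_tries) is an added assumption so that the function cannot loop forever.

```r
# Call fetch() until it returns a status other than 429, up to max_tries attempts.
# fetch is any zero-argument function returning a list with a status_code element
# (an httr response has one). retry_on_429 is a hypothetical helper.
retry_on_429 <- function(fetch, wait_seconds = 60, max_tries = 5) {
  for (i in seq_len(max_tries)) {
    res <- fetch()
    if (res$status_code != 429) return(res)  # success, or an error that is not a rate limit
    Sys.sleep(wait_seconds)                  # rate limited: wait, then try again
  }
  stop('still rate limited after ', max_tries, ' tries')
}

# Offline demonstration: a fake fetch that is rate limited twice, then succeeds
calls <- 0
fake_fetch <- function() {
  calls <<- calls + 1
  list(status_code = if (calls < 3) 429 else 200)
}
res <- retry_on_429(fake_fetch, wait_seconds = 0)
res$status_code  # 200, reached on the third call
```

In the loop above, the real call would be passed in as retry_on_429(function() GET(glue(start,IDs[n],end))), and the caller would still need to check that the returned status is 200 before parsing.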