APIs are a big buzzword in the tech industry. So what is an API, you ask? API stands for Application Programming Interface. Think of it as a protocol for how to make requests to, and communicate with, another server.
But before we get to APIs, we should have a general understanding of how HTTP requests work. Often, we just type a website's domain into the URL bar and hit go. Sometimes we don't even do that; we just google it and click the link. A lot is happening in the background. Let's explore this process a little further.
HTTP stands for Hypertext Transfer Protocol. This protocol (like many) is standardized by the Internet Engineering Task Force (IETF) through documents known as requests for comments (RFCs). We're going to start with a very simple HTTP method: the GET method.
To learn more about HTTP methods see:
https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods
The first thing to understand when dealing with APIs is how to make GET requests in general. To do this, we'll use the Python requests package.
import requests
response = requests.get('https://flatironschool.com')
print('Type:', type(response), '\n')
print('Response:', response, '\n')
print('Response text:\n', response.text[:500])
Voila! We just retrieved a web page using Python rather than our web browser! First, you can see the HTTP response code. From there, we accessed the web page itself (or more precisely the HTML document) via the .text attribute of the response object. As a quick aside, some other common HTTP response codes include 301 (Moved Permanently), 403 (Forbidden), 404 (Not Found), and 500 (Internal Server Error).
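As a rough guide to reading those codes, the sketch below uses the HTTPStatus enum from Python's standard library to look up the standard reason phrase for a few status codes. The helper name describe_status is made up for this illustration; with requests, the number itself is available as response.status_code.

```python
from http import HTTPStatus

def describe_status(code):
    """Return a short description such as '404 Not Found' for a status code."""
    try:
        status = HTTPStatus(code)
    except ValueError:
        # Codes outside the standard registry have no built-in phrase
        return '{} Unknown'.format(code)
    return '{} {}'.format(status.value, status.phrase)

for code in [200, 301, 403, 404, 500]:
    print(describe_status(code))
```

Broadly, codes in the 200s signal success, the 300s redirection, the 400s a problem with the request, and the 500s a problem on the server.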
Let's try retrieving another webpage:
#The Electronic Frontier Foundation (EFF) website; advocating for data privacy and an open internet
response = requests.get('https://www.eff.org')
print(response)
print(response.text[:2500])
Success! As you can see, response.text is the HTML code for the URL that we requested. In the background, this forms the basis for web browsers themselves. Every time you enter a new URL or click on a link, your computer makes a GET request for that particular page, and the browser then renders that page into a visual display on screen.
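To make that idea concrete, here is a minimal sketch of the text a client sends for a simple GET request. The helper name build_get_request is hypothetical, and real browsers send many more headers (cookies, accepted formats, and so on); this only shows the basic shape of an HTTP/1.1 request.

```python
from urllib.parse import urlsplit

def build_get_request(url):
    """Build the text of a bare-bones HTTP/1.1 GET request for a URL."""
    parts = urlsplit(url)
    path = parts.path or '/'          # an empty path means the site root
    if parts.query:
        path += '?' + parts.query
    lines = [
        'GET {} HTTP/1.1'.format(path),   # request line: method, path, version
        'Host: {}'.format(parts.netloc),  # which site on the server we want
        'Connection: close',
        '',                               # a blank line ends the headers
        '',
    ]
    return '\r\n'.join(lines)

print(build_get_request('https://www.eff.org'))
```

The requests package assembles and sends a message along these lines for you, then parses the server's reply into the response object.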
Some requests are a bit more complicated. Often, websites require identity verification such as logins. This helps with a variety of issues, such as protecting privacy, limiting access to content, and tracking users' history. OAuth takes this idea further by allowing third parties, such as apps, to access user information without being given the underlying password itself.
In the words of the Internet Engineering Task Force, "The OAuth 2.0 authorization framework enables a third-party application to obtain limited access to an HTTP service, either on behalf of a resource owner by orchestrating an approval interaction between the resource owner and the HTTP service, or by allowing the third-party application to obtain access on its own behalf. This specification replaces and obsoletes the OAuth 1.0 protocol described in RFC 5849."
See https://oauth.net/2/ or https://tools.ietf.org/html/rfc6749 for more details.
Alternatively, for a specific case, see Yelp's authentication guide, which we're about to work through!
With that, let's go grab an access token from an API site and make some API calls! Point your browser over to this Yelp page and start creating an app in order to obtain an API access token.
Now it's time to start making some API calls!
#As a general rule of thumb, don't store passwords in a main file like this!
#Instead, you would normally store those passwords under a sub file like passwords.py which you would then import.
#This code snippet might not work if repeatedly used.
client_id = '######'
api_key = '######'
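One common alternative to hard-coding credentials is to read them from environment variables. The sketch below assumes a variable named YELP_API_KEY has been set in your shell beforehand; both that name and the helper load_api_key are made up for this example and are not something Yelp requires.

```python
import os

def load_api_key(var_name='YELP_API_KEY'):
    """Read an API key from an environment variable (the name is an assumption)."""
    key = os.environ.get(var_name)
    if not key:
        # Fail early with a clear message instead of sending an empty token
        raise RuntimeError('Set the {} environment variable first.'.format(var_name))
    return key

# Example usage, once the variable is set in your shell:
# api_key = load_api_key()
```

This keeps the key out of the notebook itself, so the file can be shared without exposing it.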
Great! Now that we have our access tokens set up, we can make a request to the API. Every API has a specific format required to make requests, or calls, to the server. Here's how to implement the Yelp API using the Python requests package. In future lessons, you'll practice translating an API's documentation yourself! As a reference, here's the documentation to the Yelp API: https://www.yelp.com/developers/documentation/v3/get_started
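Part of that format is usually a set of query parameters appended to the URL. As a small illustration, the sketch below uses urlencode from Python's standard library to show how a dictionary of parameters becomes a query string. The requests package is assumed to encode its params argument the same way, and the parameter values here are just sample inputs.

```python
from urllib.parse import urlencode

params = {'term': 'Mexican', 'location': 'Astoria NY', 'limit': 10}

# Spaces are encoded automatically, so values can be passed as plain text
query_string = urlencode(params)
print(query_string)  # term=Mexican&location=Astoria+NY&limit=10

# A literal '+' in a value is escaped as %2B rather than treated as a space
print(urlencode({'location': 'Astoria+NY'}))  # location=Astoria%2BNY

full_url = 'https://api.yelp.com/v3/businesses/search?' + query_string
print(full_url)
```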
term = 'Mexican'
location = 'Astoria NY'
SEARCH_LIMIT = 10
url = 'https://api.yelp.com/v3/businesses/search'
headers = {
    'Authorization': 'Bearer {}'.format(api_key),
}
#requests URL-encodes parameter values itself, so pass them as plain text
url_params = {
    'term': term,
    'location': location,
    'limit': SEARCH_LIMIT
}
response = requests.get(url, headers=headers, params=url_params)
print(response)
print(type(response.text))
print(response.text[:1000])
Mmmm, look at that! We have a nifty little return now! As you can see, the contents of the response are formatted as a string, but what kind of data structure does this remind you of?
To start, there are the outer curly brackets:
{"businesses":
Hopefully you're thinking, 'Hey, that's just like a Python dictionary!'
Then within that we have what appears to be a list of dictionaries:
[{"id": "jeWIYbgBho9vBDhc5S1xvg",
This response is an example of the JSON (JavaScript Object Notation) format. You can read more about JSON here, but it's pretty similar to the data structures you've already seen in Python.
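To see that similarity directly, here is a small sketch using Python's built-in json module. The sample string is made up and much shorter than a real Yelp response, but it has the same outer shape: a dictionary whose 'businesses' key holds a list of dictionaries.

```python
import json

# A made-up, shortened stand-in for the text of an API response
sample_text = '{"businesses": [{"id": "abc123", "name": "Sample Taqueria", "rating": 4.5}], "total": 1}'

data = json.loads(sample_text)   # parse the JSON string into Python objects

print(type(data))                  # the outer curly brackets become a dict
print(type(data['businesses']))    # the square brackets become a list
print(data['businesses'][0]['name'])
```

With requests, calling response.json() performs this parsing step for you.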
We can also take JSON and convert it into a DataFrame, a spreadsheet-style object (à la Excel), using the Pandas package:
#import the package under an alias (short typing in the future)
import pandas as pd
#Create a dataframe
df = pd.DataFrame.from_dict(response.json())
Whoops, what's going on here!? Well, notice from our earlier preview of the response that there was a hierarchy within the response. Let's investigate further to see what the problem is.
First, recall that the overall structure of the response was a dictionary. Let's look at what its keys are.
response.json().keys()
Now let's go a bit further and start to preview what's stored in each of the values for these keys.
for key in response.json().keys():
    print(key)
    value = response.json()[key] #Use standard dictionary indexing
    print(type(value)) #What type is it?
    print('\n\n') #Separate out data
response.json()['businesses'][:2]
response.json()['total']
response.json()['region']
As you can see, we're primarily interested in the 'businesses' entry.
Let's go ahead and create a dataframe from that.
df = pd.DataFrame.from_dict(response.json()['businesses'])
print(len(df)) #Print how many rows
print(df.columns) #Print column names
df.head() #Previews the first five rows.
#You could also write df.head(10) to preview 10 rows or df.tail() to see the bottom
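Some fields in each business record are themselves dictionaries; coordinates, for instance, holds a latitude and a longitude. If you would rather have one flat value per column, one option is to flatten each record before building the DataFrame. The sketch below does this in plain Python on made-up sample records; the helper name flatten_business and the sample values are illustrative only.

```python
def flatten_business(business):
    """Copy a business record, pulling nested coordinates up to the top level."""
    flat = {key: value for key, value in business.items() if key != 'coordinates'}
    coords = business.get('coordinates') or {}   # tolerate a missing or empty entry
    flat['latitude'] = coords.get('latitude')
    flat['longitude'] = coords.get('longitude')
    return flat

# Made-up records shaped like entries in response.json()['businesses']
sample_businesses = [
    {'name': 'Sample Taqueria', 'rating': 4.5,
     'coordinates': {'latitude': 40.76, 'longitude': -73.92}},
    {'name': 'Example Cantina', 'rating': 4.0, 'coordinates': None},
]

flat_rows = [flatten_business(b) for b in sample_businesses]
print(flat_rows[0])
# The flattened rows can then be passed to pd.DataFrame(flat_rows)
```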
import folium
#Retrieve the Latitude and Longitude from the Yelp Response
lat_long = response.json()['region']['center']
lat = lat_long['latitude']
long = lat_long['longitude']
#Create a map of the area
yelp_map = folium.Map([lat, long])
yelp_map
#In a Jupyter notebook, a trailing ? displays the documentation for an object
folium.Marker?
Great. Now let's finish building the map!
for row in df.index:
    lat_long = df['coordinates'][row]
    lat = lat_long['latitude']
    long = lat_long['longitude']
    name = df['name'][row]
    rating = df['rating'][row]
    price = df['price'][row]
    details = '{} Price: {} Rating:{}'.format(name,price,rating)
    marker = folium.Marker([lat, long], popup=details)
    marker.add_to(yelp_map)
yelp_map
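One thing to watch for in a loop like this: not every business is guaranteed to include every field. As an assumption for illustration, suppose some records have no price. A small helper (the name popup_text is made up) can build the popup string with a fallback instead of failing on the missing key. It works on the dictionaries in response.json()['businesses'] rather than on DataFrame rows.

```python
def popup_text(business, missing='n/a'):
    """Build the marker popup text, substituting a placeholder for absent fields."""
    name = business.get('name', missing)
    price = business.get('price', missing)
    rating = business.get('rating', missing)
    return '{} Price: {} Rating: {}'.format(name, price, rating)

# Made-up records: the second one has no 'price' entry
print(popup_text({'name': 'Sample Taqueria', 'price': '$$', 'rating': 4.5}))
print(popup_text({'name': 'Example Cantina', 'rating': 4.0}))
```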
Congratulations! We've covered a lot here! We started with HTTP requests, one of the fundamental protocols underlying the internet that we know and love. From there, we looked at OAuth and saw how to get an access token to use with an API such as Yelp's. Then we made some requests to retrieve information that came back in JSON format. We then transformed this data into a DataFrame using the Pandas package. Finally, we created an initial visualization of the data that we retrieved using folium.