A brief introduction to Seaborn
RCKettel
Posted on October 18, 2020
Seaborn is a Python plotting library used to display and interpret data. For any data scientist presenting findings, a graph that is appealing to the eye and easy to understand is ideal. Matplotlib is a useful tool for exploring data and finding relationships, but its default output is plain and polishing a graph with it usually takes more code. Seaborn can be used for many of the same purposes, and since it is built on top of Matplotlib, much of the code needed to make plots with it will look familiar. In this tutorial we will go over how to make a basic graph and some basic customizations that make graphs look nicer: changing the x and y labels, adding a title, rotating the x-ticks, applying a built-in style, and removing the error bars from the graph.
The data for this tutorial is taken from a dataset called Pokemon With Stats. It contains data on 721 Pokemon up to the sixth generation, including names, numbers, types, basic stats, which generation each Pokemon is from, and whether or not it is legendary. I chose this particular dataset because its subject matter is well known, so it should be easy to relate to even for someone with only a passing exposure to it.
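The libraries that will be used are Matplotlib and Seaborn, so they should be imported before the graph is created. Note that the plotting functions live in Matplotlib's pyplot module, which is what gets the `plt` alias.
import matplotlib.pyplot as plt
import seaborn as sns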
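The code in this tutorial refers to a DataFrame named `pokemon`. A minimal sketch of how it might be loaded with pandas is shown here. The helper name `load_pokemon` is hypothetical, and the default file name `Pokemon.csv` is an assumption about what the Kaggle download is called, so adjust the path to wherever the file was saved.

```python
import pandas as pd


def load_pokemon(csv_path="Pokemon.csv"):
    """Read the Pokemon With Stats CSV into a pandas DataFrame.

    The default file name is an assumption; pass the real path if it differs.
    """
    return pd.read_csv(csv_path)


# Usage, once the file has been downloaded:
# pokemon = load_pokemon()
# pokemon[["Type 1", "Defense"]].head()
```

The two columns used throughout the rest of the tutorial are `Type 1` and `Defense`, so it is worth checking that they appear in the loaded data before plotting.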
We will begin with a basic barplot by entering the code:
sns.barplot(x='Type 1', y='Defense', data=pokemon);
which will create a graph that looks like this:
Even though this graph has its flaws, the bars already look pleasing to the eye without us having to insert x or y labels or do anything special to the colors. There is still a lot wrong with it, however: there is no title, the x-axis and y-axis labels should be changed, the x-axis tick labels are unreadable, there is a lot of white space in the plot's background, and what are those black lines at the top of each bar? All of this will be changed. First, we will add a title. There are two ways to do this. One is to call `.set(title=...)` on the result of `sns.barplot`, that is, after the final parenthesis in the above code. The other uses `plt.title`, which should already be familiar to anyone who knows Matplotlib.
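# Option 1: chain .set() onto the barplot call
sns.barplot(x='Type 1', y='Defense', data=pokemon).set(title='Pokemon Defense by Type');
# Option 2: use the Matplotlib function
sns.barplot(x='Type 1', y='Defense', data=pokemon)
plt.title('Pokemon Defense by Type');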
In either case the graph should look like this:
Now that the title has been added, the x and y axis labels should be changed to better communicate what information is being portrayed in the graph. This can be done by adding to the `.set()` call or by using the Matplotlib style of coding.
# .set() takes the axis labels as extra keywords after the title
sns.barplot(x='Type 1', y='Defense', data=pokemon).set(title='Pokemon Defense by Type', xlabel='Type', ylabel='Defense Statistic');
# The Matplotlib style uses one line of code per change
sns.barplot(x='Type 1', y='Defense', data=pokemon)
plt.title('Pokemon Defense by Type')
plt.xlabel('Type')
plt.ylabel('Defense Statistic');
As you might have guessed, the `.set()` method works just as well as the `plt` style of coding, and the choice comes down to the coder's own preference. One is written horizontally and can get a little long, while the other is written vertically and needs more lines of code. Since the Seaborn library is built to work with Matplotlib, some of the more complicated changes are often done with the latter style. For instance, up to this point all of the tick labels on the x-axis have been unreadable. This can be fixed by rotating them. The angle can be anything from 0 to 360 degrees and is set with the rotation argument of the `plt.xticks` function:
sns.barplot(x='Type 1', y='Defense', data=pokemon)
plt.title('Pokemon Defense by Type')
plt.xlabel('Type')
plt.ylabel('Defense Statistic')
plt.xticks(rotation=45);
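The two styles can also be combined. `sns.barplot` returns the Matplotlib Axes it drew on, so the result can be kept in a variable and customized step by step. The sketch below assumes a tiny hand-typed stand-in for the real dataset (the rows are illustrative only) and uses the name `ax` for the returned Axes.

```python
import pandas as pd
import seaborn as sns

# Small stand-in for the real dataset; these rows are illustrative only.
pokemon = pd.DataFrame({
    "Type 1": ["Grass", "Grass", "Fire", "Fire", "Water", "Water"],
    "Defense": [49, 63, 43, 58, 65, 80],
})

# barplot returns the Matplotlib Axes object that holds the bars.
ax = sns.barplot(x="Type 1", y="Defense", data=pokemon)

# Title and axis labels can be set in one call on that Axes.
ax.set(title="Pokemon Defense by Type", xlabel="Type", ylabel="Defense Statistic")

# Rotate the x tick labels, the Axes-level counterpart of plt.xticks(rotation=45).
ax.tick_params(axis="x", labelrotation=45)
```

Keeping a reference to the Axes is handy when a figure holds more than one plot, because each change is applied to a specific plot rather than to whichever one happens to be current.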
At this point the graph could be used in a presentation and would be easy to understand. Still, there are other changes that would help refine it further. The background of the graph contains a lot of white space that can be filled in by using one of three built-in styles: darkgrid, whitegrid, and dark. There are two other styles, white and ticks, which help make the colors or tick marks more prominent, but for the purposes of this tutorial whitegrid will be demonstrated.
# set_style should be called before the plotting code
sns.set_style('whitegrid')
sns.barplot(x='Type 1', y='Defense', data=pokemon)
plt.title('Pokemon Defense by Type')
plt.xlabel('Type')
plt.ylabel('Defense Statistic')
plt.xticks(rotation=45);
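`sns.set_style` changes the style for every plot that follows. When only one graph should be affected, `sns.axes_style` can be used instead, either to inspect what a style contains or as a context manager that applies it temporarily. This is a minimal sketch, again assuming a small illustrative stand-in for the `pokemon` DataFrame.

```python
import pandas as pd
import seaborn as sns

# Small stand-in for the real dataset; these rows are illustrative only.
pokemon = pd.DataFrame({
    "Type 1": ["Grass", "Grass", "Fire", "Fire", "Water", "Water"],
    "Defense": [49, 63, 43, 58, 65, 80],
})

# axes_style returns the settings of a named style without applying them.
whitegrid = sns.axes_style("whitegrid")
white = sns.axes_style("white")
print("whitegrid draws a grid:", whitegrid["axes.grid"])
print("white draws a grid:", white["axes.grid"])

# As a context manager, the style applies only to plots created inside the block.
with sns.axes_style("whitegrid"):
    ax = sns.barplot(x="Type 1", y="Defense", data=pokemon)
```

Plots created after the `with` block go back to whatever style was active before it.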
Finally, the odd black lines in the bar graph are called error bars, and they show the confidence interval for each group being graphed. A confidence interval is a statistical measure giving the lower and upper estimates between which the mean of the population should land. The smaller the range, the more precise the estimate. If these are not necessary for the graph, or they just seem unsightly, they can be removed by setting the `ci` parameter of `sns.barplot` to None. Seaborn 0.12 and later deprecate `ci` in favor of `errorbar`, so on newer versions `errorbar=None` does the same job.
sns.set_style('whitegrid')
sns.barplot(x='Type 1', y='Defense', data=pokemon, ci=None)
plt.title('Pokemon Defense by Type')
plt.xlabel('Type')
plt.ylabel('Defense Statistic')
plt.xticks(rotation=45);
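For anyone curious where the error bars come from, here is a rough sketch of the idea, assuming the default behaviour of a 95% interval around the mean estimated by bootstrapping. The `defense` array is a made-up sample rather than values from the dataset, and Seaborn's own computation may differ in detail.

```python
import numpy as np

# Made-up sample standing in for the Defense values of one Pokemon type.
defense = np.array([49, 63, 83, 123, 35, 45, 65, 80, 100, 55])

# Seeded generator so the resampling is repeatable.
rng = np.random.default_rng(seed=0)

# Resample the data with replacement many times, recording the mean each time.
boot_means = np.array([
    rng.choice(defense, size=defense.size, replace=True).mean()
    for _ in range(1000)
])

# The middle 95% of those resampled means forms the confidence interval.
lower, upper = np.percentile(boot_means, [2.5, 97.5])
print(f"mean = {defense.mean():.1f}, 95% interval = ({lower:.1f}, {upper:.1f})")
```

A type with only a few Pokemon, or with widely spread stats, produces a wider interval and therefore a longer black line on its bar.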
Now that the graph looks ready for presentation, it should be scaled to match. There are four scales, called contexts, built into Seaborn, and they are appropriately named. In order of size they are paper, notebook, talk, and poster, with notebook being the default. Since this graph is going to be used in a talk, the scaling should reflect that. Remember that this affects every element of the graph, so `sns.set_context` should be called before `sns.barplot`. The text size will also change, so the rotation of the x-ticks might need to be adjusted for readability.
sns.set_style('whitegrid')
sns.set_context('talk')
sns.barplot(x='Type 1', y='Defense', data=pokemon, ci=None)
plt.title('Pokemon Defense by Type')
plt.xlabel('Type')
plt.ylabel('Defense Statistic')
plt.xticks(rotation=65);
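To see how much each context scales the text, `sns.plotting_context` can be asked for the settings of a context without applying them. The sketch below collects the base font size of each one. The names `context_names` and `font_sizes` are only for illustration.

```python
import seaborn as sns

# The four built-in contexts, from smallest to largest.
context_names = ["paper", "notebook", "talk", "poster"]

# plotting_context returns the settings for a context without applying them.
font_sizes = {
    name: sns.plotting_context(name)["font.size"]
    for name in context_names
}

for name, size in font_sizes.items():
    print(f"{name}: base font size {size}")
```

Comparing the numbers makes it easier to guess in advance whether the tick labels will need a steeper rotation once a larger context is applied.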
As stated at the start, anyone presenting data benefits from being able to give professional and intuitive visualizations. Seaborn builds on Matplotlib and lets the user make these visualizations with less effort. In this tutorial we examined some simple changes that make a graph look much more pleasing to the eye in only a few steps. Seaborn offers many other features and types of graphs, so I encourage the aspiring programmer to become familiar with this library.
Links
The Pokemon database:
https://www.kaggle.com/abcsds/pokemon
Seaborn Documentation on barplots:
https://seaborn.pydata.org/generated/seaborn.barplot.html
Matplotlib documentation on the Axes class:
https://matplotlib.org/api/axes_api.html#axis-labels-title-and-legend
Seaborn Styling tutorial:
https://www.codecademy.com/articles/seaborn-design-i#:~:text=Seaborn%20has%20five%20built%2Din,better%20suit%20your%20presentation%20needs.