Stacked and Grouped Bar Charts Using Plotly (Python)
Fredrik Sjöstrand
Posted on January 8, 2020
In this post, I will cover how you can create a bar chart that has both grouped and stacked bars using plotly. It is quite easy to create a plot that is either stacked or grouped, as both are covered in the tutorial at https://plot.ly/python/bar-charts/. However, if you want both at once, you need to dig through the API documentation. Well, not anymore, as I have done it for you. I will assume you have a basic understanding of plotly, roughly at the level of the tutorial linked above. Finally, if you just want to check out the finished code, you can find it at the end of the post.
Example Data
To start with, I want an example to illustrate the use case. In this example, we have a project on GitHub with different types of issues, e.g. feature, bug, or documentation. From this project, we have taken some issues and created a system to classify them automatically. It has two parts, model 1 and model 2. If model 1 fails to make a prediction, model 2 is used.
Model 1 could be a simple rule-based model: if the name of any class appears in the text of the issue, the issue is assigned to that class. For example, if the word bug appears, the issue is classified as a bug, and if feature appears, it is classified as a feature. If none of the words appear, model 1 hands the issue to model 2, which uses machine learning to make the prediction and always produces a classification.
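The fallback logic described above can be sketched roughly as follows. This is only an illustration: the names rule_based_predict, ml_predict and classify are hypothetical, and the machine learning model is replaced by a stub that always returns a fixed label.

```python
from typing import Optional, Tuple

LABELS = ["feature", "question", "bug", "documentation", "maintenance"]


def rule_based_predict(text: str) -> Optional[str]:
    """Model 1 (hypothetical): return the first label found in the text, else None."""
    lowered = text.lower()
    for label in LABELS:
        if label in lowered:
            return label
    return None  # model 1 abstains


def ml_predict(text: str) -> str:
    """Model 2 (stub): stands in for a trained classifier and always answers."""
    return "question"  # placeholder prediction, not a real model


def classify(text: str) -> Tuple[str, str]:
    """Return (label, model_name), falling back to model 2 when model 1 abstains."""
    label = rule_based_predict(text)
    if label is not None:
        return label, "model_1"
    return ml_predict(text), "model_2"
```

Counting how often each model produces each label is what would give lists like model_1 and model_2 in the example data.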
Below I have defined a dictionary with some data I created based on this example. Note that all lists have the same length, so the data could also be represented as a pandas DataFrame. original is how many issues of each type exist in the dataset, based on the actual labels in the GitHub issue tracker. model_1 holds the predictions of the rule-based model and model_2 the predictions of the machine learning model. Finally, as the total number of issues doesn't change, the sum of all values in original (103) is the same as the sum of all values in model_1 and model_2 combined.
data = {
    "original": [15, 23, 32, 10, 23],
    "model_1": [4, 8, 18, 6, 0],
    "model_2": [11, 18, 18, 0, 20],
    "labels": [
        "feature",
        "question",
        "bug",
        "documentation",
        "maintenance",
    ],
}
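The two properties claimed for the data (equal list lengths and matching totals) are easy to check. The snippet below is a small sanity check using only the standard library; the variable names series, total_original and total_models are my own.

```python
data = {
    "original": [15, 23, 32, 10, 23],
    "model_1": [4, 8, 18, 6, 0],
    "model_2": [11, 18, 18, 0, 20],
    "labels": ["feature", "question", "bug", "documentation", "maintenance"],
}

series = ["original", "model_1", "model_2"]

# Every list should hold exactly one value per label.
assert all(len(data[key]) == len(data["labels"]) for key in series)

# Every issue is classified by exactly one of the two models,
# so the model totals should add up to the original total.
total_original = sum(data["original"])
total_models = sum(data["model_1"]) + sum(data["model_2"])
print(total_original, total_models)  # 103 103
```

Only the totals match. Per label, the model counts can differ from the original counts because predictions can be wrong: for example, 8 + 18 = 26 issues are predicted as question, while 23 actually carry that label.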
Plot
We will use this data to create the plot. First, we need to import graph_objects from plotly, which contains everything we will need. We can also write out the standard scaffold of a plotly graph built on the Figure object.
from plotly import graph_objects as go

fig = go.Figure(
    data=[
    ],
    layout=go.Layout(
        title="Issue Types - Original and Models",
        yaxis_title="Number of Issues",
    ),
)
In each step of the tutorial, we will add a graph object to the data parameter in the Figure constructor. We won't make any changes to the existing objects. Each of these will be an instance of the Bar class and use labels from the example data for the x-axis.
Step 1
In this first version of the plot, we will just show the values of original on the y-axis. The only difference from the plotly tutorial for bar charts is the offsetgroup parameter, which we set to zero. This doesn't have any visible effect at the moment, but it is important for later.
fig1 = go.Figure(
    data=[
        go.Bar(
            name="Original",
            x=data["labels"],
            y=data["original"],
            offsetgroup=0,
        ),
    ],
    layout=go.Layout(
        title="Issue Types - Original and Models",
        yaxis_title="Number of Issues",
    ),
)
fig1.show()
Step 2
For the next step, we add a Bar object using the data for model_1 as the y-values. We also set offsetgroup to 1 for this trace. This creates a bar chart with grouped bars. The result looks like the grouped bars from the tutorial, but it will allow us, in the next step, to stack another set of bars on top of these.
fig2 = go.Figure(
    data=[
        go.Bar(
            name="Original",
            x=data["labels"],
            y=data["original"],
            offsetgroup=0,
        ),
        go.Bar(
            name="Model 1",
            x=data["labels"],
            y=data["model_1"],
            offsetgroup=1,
        ),
    ],
    layout=go.Layout(
        title="Issue Types - Original and Models",
        yaxis_title="Number of Issues",
    ),
)
fig2.show()
Step 3
Now for the final step, we add a Bar with the data for model_2 as the y-values, stacking these bars on top of the bars for model_1. First, we give them the same position on the x-axis by using the same offsetgroup value, 1. Second, we offset the bars along the y-axis by setting the base parameter to the model_1 list, so each Model 2 bar starts where the corresponding Model 1 bar ends. That is it: we now have our grouped and stacked bar chart.
fig3 = go.Figure(
    data=[
        go.Bar(
            name="Original",
            x=data["labels"],
            y=data["original"],
            offsetgroup=0,
        ),
        go.Bar(
            name="Model 1",
            x=data["labels"],
            y=data["model_1"],
            offsetgroup=1,
        ),
        go.Bar(
            name="Model 2",
            x=data["labels"],
            y=data["model_2"],
            offsetgroup=1,
            base=data["model_1"],
        ),
    ],
    layout=go.Layout(
        title="Issue Types - Original and Models",
        yaxis_title="Number of Issues",
    ),
)
fig3.show()
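To make the effect of base concrete: a bar with a base is drawn from base up to base + y. The small calculation below, using only the standard library, shows where the top of each stacked column ends up for the example data; the name stack_tops is my own.

```python
labels = ["feature", "question", "bug", "documentation", "maintenance"]
model_1 = [4, 8, 18, 6, 0]
model_2 = [11, 18, 18, 0, 20]

# With base=model_1, each Model 2 bar starts at the Model 1 value
# and ends at base + y, which is the top of the stacked column.
stack_tops = [base + height for base, height in zip(model_1, model_2)]
print(stack_tops)  # [15, 26, 36, 6, 20]
```

These tops are the total number of issues predicted for each label, which is what sits next to the Original bars in the chart.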
Entire Example
from plotly import graph_objects as go

data = {
    "original": [15, 23, 32, 10, 23],
    "model_1": [4, 8, 18, 6, 0],
    "model_2": [11, 18, 18, 0, 20],
    "labels": [
        "feature",
        "question",
        "bug",
        "documentation",
        "maintenance",
    ],
}

fig = go.Figure(
    data=[
        # First group: the actual label counts.
        go.Bar(
            name="Original",
            x=data["labels"],
            y=data["original"],
            offsetgroup=0,
        ),
        # Second group: the bottom of the stack.
        go.Bar(
            name="Model 1",
            x=data["labels"],
            y=data["model_1"],
            offsetgroup=1,
        ),
        # Same group as Model 1, shifted up by the Model 1 values.
        go.Bar(
            name="Model 2",
            x=data["labels"],
            y=data["model_2"],
            offsetgroup=1,
            base=data["model_1"],
        ),
    ],
    layout=go.Layout(
        title="Issue Types - Original and Models",
        yaxis_title="Number of Issues",
    ),
)
fig.show()