Stacked and Grouped Bar Charts Using Plotly (Python)

fronkan

Fredrik Sjöstrand

Posted on January 8, 2020

In this post, I will show how to create a bar chart that has both grouped and stacked bars using plotly. Creating a plot that is either stacked or grouped is easy, as both are covered in the tutorial at https://plot.ly/python/bar-charts/. However, if you want both at once, you need to dig through the API documentation. Well, not anymore, as I have done it for you. I will assume you have a basic understanding of plotly, roughly at the level of the tutorial linked above. Finally, if you just want the finished code, you can find it at the end of the post.

Example Data

To start with, I want an example that illustrates the use case. In this example, we have a project on GitHub with different types of issues, e.g. feature, bug, or documentation. From this project, we have taken some issues and created a system to classify them automatically. The system has two parts, model 1 and model 2. If model 1 fails to make a prediction, model 2 is used.

Model 1 could be a simple rule-based model: if the name of any class appears in the text of the issue, the issue is assigned to that class. For example, if the word bug appears, the issue is classified as a bug, and if feature appears, it is classified as a feature. If none of the words appear, the issue is handed to model 2, which uses machine learning to make the prediction and always produces a classification.

Below I have defined a dictionary with some data I have created based on this example. Note that all lists have the same length, so the data could also be represented as a pandas DataFrame. original is how many issues of each type exist in the dataset, based on the actual labels in the GitHub issue tracker. model_1 holds the number of predictions per label from the rule-based model, and model_2 the number of predictions per label from the machine learning model. Finally, as the total number of issues doesn't change, the sum of all values in original is the same as the sum of all values in model_1 and model_2 combined.

data = {
    "original": [15, 23, 32, 10, 23],
    "model_1": [4, 8, 18, 6, 0],
    "model_2": [11, 18, 18, 0, 20],
    "labels": [
        "feature",
        "question",
        "bug",
        "documentation",
        "maintenance"
    ]
}
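
The two properties claimed for the example data (equal list lengths and matching totals) are easy to check before plotting. The snippet below is a small sanity check in plain Python, not part of the plotting code itself; the variable names lengths, total_original, and total_predicted are only illustrative.

```python
# Example data copied from the post.
data = {
    "original": [15, 23, 32, 10, 23],
    "model_1": [4, 8, 18, 6, 0],
    "model_2": [11, 18, 18, 0, 20],
    "labels": ["feature", "question", "bug", "documentation", "maintenance"],
}

# Every list should have one entry per label.
lengths = {key: len(values) for key, values in data.items()}
assert len(set(lengths.values())) == 1, lengths

# Each issue is classified by exactly one of the two models, so the
# model totals should add up to the number of original issues.
total_original = sum(data["original"])
total_predicted = sum(data["model_1"]) + sum(data["model_2"])
print(total_original, total_predicted)  # 103 103
```

Note that only the totals match. For a single label, the two models together can predict more or fewer issues than the original count, which is exactly what the finished plot makes visible.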

Plot

We will use this data to create the plot. First, we need to import graph_objects from plotly, which contains everything we will need. We can also write out the standard scaffold of a plotly graph built on the Figure object.

from plotly import graph_objects as go

fig = go.Figure(
    data=[
        # The Bar traces are added here in the steps below.
    ],
    layout=go.Layout(
        title="Issue Types - Original and Models",
        yaxis_title="Number of Issues"
    )
)

In each step of the tutorial, we will add a graph object to the data parameter of the Figure constructor, without changing the objects that are already there. Each graph object will be an instance of the Bar class and use labels from the example data for the x-axis.

Step 1

In this first version of the plot, we will just show the values of original on the y-axis. The only difference from the plotly tutorial for bar charts is the offsetgroup parameter, which we set to zero. This has no visible effect yet but becomes important later.

fig1 = go.Figure(
    data=[
        go.Bar(
            name="Original",
            x=data["labels"],
            y=data["original"],
            offsetgroup=0,
        ),
    ],
    layout=go.Layout(
        title="Issue Types - Original and Models",
        yaxis_title="Number of Issues"
    )
)
fig1.show()

Image showing a bar chart where each label has a single bar. The bar has a single color and represents the original values.

Step 2

For the next step, we add a Bar object using the data for model_1 on the y-axis. We also set offsetgroup to 1 for this object. This creates a bar chart with grouped bars. The result looks like the grouped bars from the tutorial, but it will allow us, in the next step, to add the next set of bars on top of these.

fig2 = go.Figure(
    data=[
        go.Bar(
            name="Original",
            x=data["labels"],
            y=data["original"],
            offsetgroup=0,
        ),
        go.Bar(
            name="Model 1",
            x=data["labels"],
            y=data["model_1"],
            offsetgroup=1,
        ),
    ],
    layout=go.Layout(
        title="Issue Types - Original and Models",
        yaxis_title="Number of Issues"
    )
)
fig2.show()

Image showing a bar chart where each label has two bars. The first bar is just one color and represents the original value. The second bar has another color and represents the predictions of model 1

Step 3

Now for the final step, we add a Bar with the data for model_2 on the y-axis, stacking it on top of the bars for model_1. First, we give the new bars the same position on the x-axis by using the same offsetgroup value, 1. Second, we offset the bars along the y-axis by setting the base parameter to the model_1 list, so that each model_2 bar starts where the corresponding model_1 bar ends. That is it: now we have our grouped and stacked bar chart.

fig3 = go.Figure(
    data=[
        go.Bar(
            name="Original",
            x=data["labels"],
            y=data["original"],
            offsetgroup=0,
        ),
        go.Bar(
            name="Model 1",
            x=data["labels"],
            y=data["model_1"],
            offsetgroup=1,
        ),
        go.Bar(
            name="Model 2",
            x=data["labels"],
            y=data["model_2"],
            offsetgroup=1,
            base=data["model_1"],
        )
    ],
    layout=go.Layout(
        title="Issue Types - Original and Models",
        yaxis_title="Number of Issues"
    )
)
fig3.show()

Image showing a bar chart where each label has two bars. The first bar is just one color and represents the original value. The second bar has two colors, the bottom one representing model 1 and the upper part representing model 2
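
To make the role of base more concrete, the sketch below computes in plain Python where each stacked layer starts and where the finished stack ends. The helper stack_bases is hypothetical (it is not a plotly function); it only illustrates the arithmetic, and it generalizes the idea to more than two layers on the assumption that each further layer would use the running sum of the layers below it as its base.

```python
# Values copied from the example data in the post.
labels = ["feature", "question", "bug", "documentation", "maintenance"]
model_1 = [4, 8, 18, 6, 0]
model_2 = [11, 18, 18, 0, 20]


def stack_bases(layers):
    """Hypothetical helper: return (bases, tops) for layers stacked in order.

    bases[i] is the list a layer would use as its base so that it starts
    where the layers below it end; tops is the height of the full stack.
    """
    bases = []
    running = [0] * len(layers[0])
    for layer in layers:
        bases.append(list(running))  # this layer starts at the current height
        running = [r + v for r, v in zip(running, layer)]
    return bases, running


bases, tops = stack_bases([model_1, model_2])

print(bases[1])  # [4, 8, 18, 6, 0], the same values as base=data["model_1"]
print(dict(zip(labels, tops)))
# {'feature': 15, 'question': 26, 'bug': 36, 'documentation': 6, 'maintenance': 20}
```

The second layer's base is simply the model_1 list, which is why Step 3 can pass data["model_1"] directly. The tops are the heights of the stacked bars that end up next to the original bars in the plot.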

Entire Example

from plotly import graph_objects as go

data = {
    "original": [15, 23, 32, 10, 23],
    "model_1": [4, 8, 18, 6, 0],
    "model_2": [11, 18, 18, 0, 20],
    "labels": [
        "feature",
        "question",
        "bug",
        "documentation",
        "maintenance"
    ]
}

fig = go.Figure(
    data=[
        go.Bar(
            name="Original",
            x=data["labels"],
            y=data["original"],
            offsetgroup=0,
        ),
        go.Bar(
            name="Model 1",
            x=data["labels"],
            y=data["model_1"],
            offsetgroup=1,
        ),
        go.Bar(
            name="Model 2",
            x=data["labels"],
            y=data["model_2"],
            offsetgroup=1,
            base=data["model_1"],
        )
    ],
    layout=go.Layout(
        title="Issue Types - Original and Models",
        yaxis_title="Number of Issues"
    )
)

fig.show()
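
The three Bar definitions above repeat the same x values and differ only in name, y, offsetgroup, and base. One possible way to cut the repetition, shown here as a sketch rather than as part of the original example, is a small helper that builds each trace as a plain dictionary. The helper name bar_trace is hypothetical; the dictionary keys mirror the Bar parameters used in the post, plus a "type" key that tells plotly which kind of trace the dictionary describes.

```python
data = {
    "original": [15, 23, 32, 10, 23],
    "model_1": [4, 8, 18, 6, 0],
    "model_2": [11, 18, 18, 0, 20],
    "labels": ["feature", "question", "bug", "documentation", "maintenance"],
}


def bar_trace(name, y, offsetgroup, labels, base=None):
    """Hypothetical helper: describe one bar trace as a plain dictionary."""
    trace = {
        "type": "bar",
        "name": name,
        "x": labels,
        "y": y,
        "offsetgroup": offsetgroup,
    }
    # Only stacked layers need a base; grouped bars leave it unset.
    if base is not None:
        trace["base"] = base
    return trace


traces = [
    bar_trace("Original", data["original"], 0, data["labels"]),
    bar_trace("Model 1", data["model_1"], 1, data["labels"]),
    bar_trace("Model 2", data["model_2"], 1, data["labels"], base=data["model_1"]),
]

for trace in traces:
    print(trace["name"], trace["offsetgroup"], "base" in trace)
```

Assuming plotly is installed, passing this list as the data argument of go.Figure, together with the same layout as above, should give the same chart, since Figure also accepts traces as dictionaries with a "type" key.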