Plotting a group-level comparison histogram in pandas

Let's say I have a DataFrame like the one below:

import pandas as pd

years = [2014, 2014, 2014, 2015, 2015, 2015, 2015]
vehicle_types = ['Truck', 'Truck', 'Car', 'Bike', 'Truck', 'Bike', 'Car']
companies = ["Mercedez", "Tesla", "Tesla", "Yamaha", "Tesla", "BMW", "Ford"]

# One row per vehicle, with its year, type and manufacturer
df = pd.DataFrame({'year': years,
                   'vehicle_type': vehicle_types,
                   'company': companies
                  })

df.head()

And I want to plot the distribution of vehicle types per year: a bar chart with one group of bars per year and, within each group, one bar per vehicle type.

It turns out this can be done in one line with pandas:

df.groupby(['year'])['vehicle_type'].value_counts().unstack().plot.bar()

It's amazing how a single statement takes care of:

Null counts (a vehicle type that never appears in a given year, such as Bike in 2014, simply gets no bar)
Plotting the histogram bars side by side
Aesthetics like labels, legends, etc.
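
To see what happens to the missing combinations, here is a minimal sketch on the same data. It assumes you would rather have explicit zeros than gaps; fill_value=0 is an optional argument to unstack, not something the one-liner needs, and the names with_gaps and with_zeros are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'year': [2014, 2014, 2014, 2015, 2015, 2015, 2015],
    'vehicle_type': ['Truck', 'Truck', 'Car', 'Bike', 'Truck', 'Bike', 'Car'],
})

# No Bike rows exist for 2014, so that cell is NaN after unstacking
with_gaps = df.groupby(['year'])['vehicle_type'].value_counts().unstack()

# Optional: ask unstack to fill the missing cells with an explicit zero count
with_zeros = df.groupby(['year'])['vehicle_type'].value_counts().unstack(fill_value=0)

print(with_gaps)
print(with_zeros)
```

Either frame can be followed by .plot.bar(). A missing value and a zero both show up as an empty slot in the 2014 group.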

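The critical part here is the unstack method and how well it fits the MultiIndex produced by the grouped value_counts(): it moves the inner vehicle_type level into columns, so each vehicle type becomes its own set of bars and each year becomes one group on the x-axis.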
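
Splitting the one-liner into steps makes the role of unstack easier to see. This is only a sketch on the same data; the intermediate names counts and wide are hypothetical and exist just for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'year': [2014, 2014, 2014, 2015, 2015, 2015, 2015],
    'vehicle_type': ['Truck', 'Truck', 'Car', 'Bike', 'Truck', 'Bike', 'Car'],
})

# Step 1: count vehicle types within each year.
# The result is a Series indexed by a two-level MultiIndex (year, vehicle_type).
counts = df.groupby(['year'])['vehicle_type'].value_counts()
print(counts.index.nlevels)

# Step 2: unstack pivots the innermost index level (vehicle_type) into columns,
# leaving one row per year and one column per vehicle type.
wide = counts.unstack()
print(wide)

# Step 3 (needs matplotlib): wide.plot.bar() draws one bar group per row
# and one bar per column.
```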


Satwik Kansal
