Plotting a group level comparison histogram in pandas
Satwik Kansal
Posted on May 6, 2020
Let's say I have a DataFrame like the one below:
import pandas as pd

years = [2014, 2014, 2014, 2015, 2015, 2015, 2015]
vehicle_types = ['Truck', 'Truck', 'Car', 'Bike', 'Truck', 'Bike', 'Car']
companies = ["Mercedes", "Tesla", "Tesla", "Yamaha", "Tesla", "BMW", "Ford"]
df = pd.DataFrame({'year': years,
                   'vehicle_type': vehicle_types,
                   'company': companies
                   })
df.head()
And I want to plot the distribution of vehicle types per year, with one group of bars per year and one bar per vehicle type.
Turns out, this can easily be done in one line with pandas:
df.groupby(['year'])['vehicle_type'].value_counts().unstack().plot.bar()
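It's amazing how a single statement takes care of:
- Null counts (a vehicle type that is missing from a year becomes NaN after unstack and shows up with no visible bar)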
- Plotting the histogram bars side by side
- And aesthetics like labels, legends, etc.
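To make the point about null counts concrete, here is a minimal sketch that rebuilds the relevant columns of the DataFrame above and inspects the table that gets plotted. The variable names `counts` and `filled` are only for illustration, and passing `fill_value=0` to `unstack` is an optional variation, not part of the original one-liner.

```python
import pandas as pd

df = pd.DataFrame({
    'year': [2014, 2014, 2014, 2015, 2015, 2015, 2015],
    'vehicle_type': ['Truck', 'Truck', 'Car', 'Bike', 'Truck', 'Bike', 'Car'],
})

# Same chain as the one-liner, stopping just before the plot.
counts = df.groupby('year')['vehicle_type'].value_counts().unstack()
# No bikes appear in 2014, so that cell is NaN rather than 0.
print(counts)

# Optional: fill_value=0 turns missing combinations into explicit zero counts.
filled = df.groupby('year')['vehicle_type'].value_counts().unstack(fill_value=0)
print(filled)
```

Either table can be handed to `.plot.bar()`; the filled version is mostly useful when the counts are reused for further arithmetic.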
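The critical part here is the unstack function and how well it fits with the multi-index created by value_counts(): it moves the inner vehicle_type level into columns, so every vehicle type gets its own bar within each year.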
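The one-liner can also be read as separate steps. The sketch below splits it up using the same sample data; the intermediate names `long_counts` and `wide_counts` are made up for this illustration, and the final plotting call is left as a comment so the snippet runs without a display.

```python
import pandas as pd

df = pd.DataFrame({
    'year': [2014, 2014, 2014, 2015, 2015, 2015, 2015],
    'vehicle_type': ['Truck', 'Truck', 'Car', 'Bike', 'Truck', 'Bike', 'Car'],
})

# Step 1: a Series of counts indexed by a (year, vehicle_type) MultiIndex.
long_counts = df.groupby('year')['vehicle_type'].value_counts()
print(long_counts.index.nlevels)

# Step 2: unstack() pivots the innermost index level (vehicle_type) into
# columns, leaving one row per year.
wide_counts = long_counts.unstack()
print(wide_counts)

# Step 3: plotting the wide table draws one group per row (year)
# and one bar per column (vehicle type).
# wide_counts.plot.bar()
```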