Recapping the AI, Machine Learning and Data Science Meetup

We just wrapped up the February AI, Machine Learning and Data Science Meetup, and if you missed it or want to revisit it, here’s a recap! In this blog post you’ll find the playback recordings, highlights from the presentations and Q&A, as well as the upcoming Meetup schedule so that you can join us at a future event.

First, Thanks for Voting for Your Favorite Charity!

In lieu of swag, we gave Meetup attendees the opportunity to help guide a $200 donation to charitable causes. The charity that received the highest number of votes this month was Oceana, which is focused on ocean conservation — protecting and restoring marine life and the world’s abundant and biodiverse oceans. We are sending this event’s charitable donation of $200 to Oceana on behalf of the Meetup members!

Missed the Meetup? No problem. Here are playbacks and talk abstracts from the event.

Towards Fair Computer Vision: Discover the Hidden Biases of an Image Classifier

Recent works find that AI algorithms learn biases from data. Therefore, it is urgent and vital to identify biases in AI algorithms. However, previous bias identification methods overly rely on human experts to conjecture potential biases, which may neglect other underlying biases not realized by humans. Is there an automatic way to assist human experts in finding biases in a broad domain of image classifiers? In this talk, I will introduce solutions.

Speaker: Chenliang Xu is an Associate Professor in the Department of Computer Science at the University of Rochester. His research originates in computer vision and tackles interdisciplinary topics, including video understanding, audio-visual learning, vision and language, and methods for trustworthy AI. He has authored over 90 peer-reviewed papers in computer vision, machine learning, multimedia, and AI venues.

Q&A

The first example from Prof. Xu (classification) sounds like a sensitivity analysis on the hyperplanes to ensure no single attribute has too much influence. Without additional training images, is the discriminator model susceptible to overfitting?

Resource links

Discover the Unknown Biased Attribute of an Image Classifier (ICCV 2021)
Discover and Mitigate Unknown Biases with Debiasing Alternate Networks (ECCV 2022)
GitHub Repo: Discover and Mitigate Unknown Biases with Debiasing Alternate Networks (ECCV 2022)

Food Waste Classification with AI

1/3 of all food gets wasted, with millions of tons of food being thrown away each day. Food does not mean the same thing everywhere in the world, there are thousands of different meals across the world, therefore a lot of different classes to distinguish between. In this talk we’ll see through challenges of food-waste classification and see how foundation models can be useful to this task. We will also explore how we use FiftyOne to test models during development.

Speaker: Luka Posilović is a computer scientist with a PhD from FER, Zagreb, Croatia, working as a Head of machine learning in Kitro. Luka and the team are trying to reduce the global food waste problem by using AI.

Q&A

Jump straight to the Q&A section on YouTube.

How well do the models handle different types of food vs different types of buckets? In particular things like sauces or blended foods, do you fine tune them per cuisine?
I volunteered at a school during lunch time and students were throwing their milk boxes as is, unopened. Is your system able to detect those?
Do you train the model on a standard / generic set of images, or on images provided by the customer for their specific case?
What’s the size of your datasets in terms of number of images?
Do the errors made by human labelers and automatic labeling tend to be correlated or not? Are there unique errors that automatic labeling is prone to?
How customizable is this algorithm based on an organization’s needs?
- Have you explored using the data to extend into inventory management?
Will the food weight be measured at the same time to increase weight prediction accuracy?
Does your algorithm also differentiate between rotten food vs normal food that is edible?
What method do you use regarding optimization of your model?
With such big datasets would approaches such as one shot learning be beneficial?
You’re obviously only sampling the top layer. How do you ensure it’s representative?
Do you plan to use LLM as part of your pipeline?

There was some additional Q&A that Luka ran out of time to answer, you can find it here:

Will the food weight be measured at the same time to increase weight prediction accuracy? “Yes, food weight is measured at the same time so the accuracy is quite high.”
Does your ML algo also differentiate between rotten food vs normal food that is edible? “It differentiates between avoidable (edible) and unavoidable food (e.g. bones, eggshells, vegetable peels). Rotten food, if visually very distinctive from not rotten, could in theory be recognized but for now we don’t have a food waste source “rotten/storage” as we’d need a bigger data set to train the models.”
From a business model point of view, how do customers pay for Kitro? “Kitro’s product and services are priced as a software subscription that includes the rental of the KITRO TARE, data analysis and consulting.”

Resource links

Learn more about KITRO Tare

Objects and Image Geo-localization from Visual Data

Localizing images and objects from visual information stands out as one of the most challenging and dynamic topics in computer vision, owing to its broad applications across different domains. In this talk, we will introduce and delve into several research directions aimed at advancing solutions to these complex problems.

Speaker: Safwan Wshah is an Associate Professor in the Department of Computer Science at the University of Vermont. His research interests encompass the intersection of machine learning theory and application, with a particular emphasis on geo-localization from visual information. Additionally, he maintains broader interests in deep learning, computer vision, data analytics, and image processing.

Resource links

Visual and Object Geo-localization: A Comprehensive Survey
In Action: Maplyzer

Lightning Talk: The Next Generation of Video Understanding with Twelve Labs

The evolution of video understanding has followed a similar trajectory to language and image understanding – with the rise of large pre-trained foundation models trained on a huge amount of data. Given the surge of multimodal research lately, video foundation models are becoming even more powerful to decipher the rich visual information embedded in videos. This talk will explore diverse use cases of video understanding and provide a glimpse of Twelve Labs offerings.

Speaker: James Le is the Head of Developer Experience at Twelve Labs, a startup building multimodal foundation models for video understanding. Previously, he worked at ML Infrastructure startups such as Superb AI and Snorkel AI, while contributing to the popular Full-Stack Deep Learning course series. He is also the host of Datacast, a podcast featuring conversations with founders, investors, and operators in the data and AI infrastructure space to unpack the narrative journeys of their careers.

Q&A

How many different object types can you search for inside a video currently?
Do you think these models could work on live video to interpret what is happening in real time?
To enable searching for content, how much did you have to train (like “drinking a Coca Cola,” or “people having fun”) vs. commercially available data (models?)

Resource links

Join the AI, Machine Learning and Data Science Meetup!

The AI, Machine Learning and Data Science Meetup membership has grown to almost 12,000 members! The goal of the Meetups is to bring together communities of data scientists, machine learning engineers, and open source enthusiasts who want to share and expand their knowledge of AI and complementary technologies.

Join one of the 12 Meetup locations closest to your timezone.

What’s Next?

Up next on March 21 at 10 AM Pacific we have a great line up speakers including:

Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models – Andrew Owens at the University of Michigan
Omnidirectional Computer Vision – Ciarán Eising at the University of Limerick & the D2iCE Research Group
Illuminating the Underground World with Multimodal Algorithms – Adonaí Vera, Machine Learning Engineer
The Role of AI in Fixing Hiring – Saurav Pandit, PhD at AI 2030

Get Involved!

There are a lot of ways to get involved in the Computer Vision Meetups. Reach out if you identify with any of these:

You’d like to speak at an upcoming Meetup
You have a physical meeting space in one of the Meetup locations and would like to make it available for a Meetup
You’d like to co-organize a Meetup
You’d like to co-sponsor a Meetup

Reach out to Meetup co-organizer Jimmy Guerrero on Meetup.com or ping me over LinkedIn to discuss how to get you plugged in.

These Meetups are sponsored by Voxel51, the company behind the open source FiftyOne computer vision toolset. FiftyOne enables data science teams to improve the performance of their computer vision models by helping them curate high quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster. It’s easy to get started, in just a few minutes.