[Optimise] App size reduction by 22% using Computer Vision

c6z3h

ZhiHong Chua

Posted on November 1, 2023

How it began

In a team code review session, someone asked, "Why is this image here? I thought we have saved it elsewhere, why didn't we just use that?"

My thoughts

True, it seems possible it was saved elsewhere. And... if someone could make this mistake here, it's likely others did too.

Wait but why?

Our React Native app used to have a multi-bundle architecture that loaded each bundle only when the user needed it, which was meant to optimise load times. Naturally, images were not shared between bundles. However, this resulted in plenty of bugs and random edge cases once Push Notifications, Deeplinks and other features were taken into account.

Today, the app is pretty much a monolith, but surprisingly, no one has done the optimisation to check for duplicate images!

Initial plan

It seemed perfect to use Computer Vision with the OpenCV library: convert each pixel in an image into 0/1, then compare the resulting pixel maps to see which images were similar and report them. I was too lazy to construct the algorithm myself, so I asked ChatGPT to do it:

ChatGPT's OpenCV Code
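
For readers who want to see the shape of the idea, here is a minimal sketch of "binarise, then compare pixel maps pairwise". It is not the ChatGPT/OpenCV code from this post: it uses plain Python lists as stand-ins for grayscale images, and the names `binarise`, `similarity`, `find_similar_pairs`, the threshold of 128 and the 0.95 cut-off are all assumptions made for illustration.

```python
from itertools import combinations


def binarise(gray, threshold=128):
    """Turn a 2D grid of 0-255 grayscale values into a 0/1 pixel map."""
    return [[1 if px >= threshold else 0 for px in row] for row in gray]


def similarity(map_a, map_b):
    """Fraction of positions where two pixel maps agree (0.0 if shapes differ)."""
    if len(map_a) != len(map_b):
        return 0.0
    if any(len(ra) != len(rb) for ra, rb in zip(map_a, map_b)):
        return 0.0
    total = sum(len(row) for row in map_a)
    if total == 0:
        return 0.0
    same = sum(a == b for ra, rb in zip(map_a, map_b) for a, b in zip(ra, rb))
    return same / total


def find_similar_pairs(images, min_similarity=0.95):
    """Compare every image with every other one: the slow, quadratic approach."""
    maps = {name: binarise(gray) for name, gray in images.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(maps), 2)
        if similarity(maps[a], maps[b]) >= min_similarity
    ]


# Hypothetical 2x2 "images" used only to exercise the functions.
images = {
    "a.png": [[0, 255], [255, 0]],
    "b.png": [[10, 250], [240, 5]],
    "c.png": [[255, 255], [255, 255]],
}
print(find_similar_pairs(images))
```

Every pair of images gets its own comparison here, so the amount of work grows with the square of the number of images.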

Problem

After letting it run for a bit, I realised it was taking really long, so I added print logs. With 1033 images to compare and each comparison taking roughly a second, it turned out it would take:

1033 * 1033 = 1,067,089 seconds = 296.41 hours = 12.35 days !!??!?!!??!

I figured the problem was that it compared each image against all the others. If 'n' denotes the number of images, this was an O(n^2) solution. We had to go faster.
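
The estimate can be reproduced with a few lines of arithmetic. This is only a sketch: the one-second-per-comparison figure is an assumption read off the numbers above, and `brute_force_estimate` is a hypothetical helper name.

```python
def brute_force_estimate(num_images, seconds_per_comparison=1.0):
    """Estimate the cost of comparing every image with every image."""
    comparisons = num_images * num_images            # n^2, as in the estimate above
    unique_pairs = num_images * (num_images - 1) // 2  # skipping self and mirrored pairs
    hours = comparisons * seconds_per_comparison / 3600
    days = hours / 24
    return comparisons, unique_pairs, hours, days


comparisons, unique_pairs, hours, days = brute_force_estimate(1033)
print(f"{comparisons:,} comparisons = {hours:.2f} hours = {days:.2f} days")
# prints "1,067,089 comparisons = 296.41 hours = 12.35 days"
print(f"{unique_pairs:,} unique pairs")
# prints "533,028 unique pairs"
```

Skipping self-comparisons and mirrored pairs only halves the work, so it would still take about six days. The growth rate itself has to change.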

Optimised plan

Rather than compare images against each other, I decided I wanted a one-pass solution: compute a pixel-map value (a hash) for each image, keep the values seen so far in a set / hashmap, and append each image to the group that has the same value. It took me less than 30 seconds to write it up:

ChatGPT's ImageHash Code
P.S. Thanks to ChatGPT
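
The one-pass idea can be sketched roughly like this. It is not the ChatGPT/ImageHash code from this post: `average_hash` below is a simplified, hand-rolled stand-in for a perceptual hash, it assumes each image has already been shrunk to a small grayscale grid, and both function names are hypothetical.

```python
from collections import defaultdict


def average_hash(gray):
    """Simplified average hash: one bit per pixel, 1 if at or above the mean.

    The grid dimensions are part of the key so grids of different shapes
    never land in the same bucket.
    """
    pixels = [px for row in gray for px in row]
    if not pixels:
        return (0, 0, "")
    mean = sum(pixels) / len(pixels)
    bits = "".join("1" if px >= mean else "0" for px in pixels)
    return (len(gray), len(gray[0]), bits)


def group_duplicates(images):
    """Single pass: bucket every image by its hash, keep buckets with 2+ images."""
    buckets = defaultdict(list)
    for name, gray in images.items():
        buckets[average_hash(gray)].append(name)
    return [sorted(names) for names in buckets.values() if len(names) > 1]


# Hypothetical 2x2 "images" used only to exercise the functions.
images = {
    "logo.png": [[10, 200], [220, 30]],
    "logo_copy.png": [[12, 198], [221, 29]],
    "banner.png": [[200, 10], [30, 220]],
}
print(group_duplicates(images))
# prints "[['logo.png', 'logo_copy.png']]"
```

Each image is hashed once and looked up once, so the work grows linearly with the number of images instead of quadratically. The trade-off is that only images with an identical hash end up in the same group.
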
How long did it take now?

5 seconds

Anyway, it turns out the app size would be reduced by a whopping

22.2%

Next Up:

All that's left is to get the go-ahead from the team lead, after which the work would be scheduled for December!

  • Short term: Manually check these duplicates and keep 1 copy, delete all others (just in case there are precision errors in the image-hash technique)
  • Medium term: Also identify and remove images that are present but no longer imported by code.
  • Long term: Add a check that runs on commits to warn of any duplicate image. Also include developer contact so any issues can be reported and rectified.
  • Long long term: Migrate these images to a CDN (edit 11 Nov: see part 2 on why this might be a bad idea)
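
As a rough illustration of the medium-term item, the sketch below lists image files whose names never appear in any source file. Everything in it is an assumption rather than our actual tooling: the extension lists, skipping `node_modules`, the helper name `find_unreferenced_images`, and the plain substring match (which means a reference to `gold.png` would also hide an unused `old.png`). It does strip React Native density suffixes such as `@2x`, since those files are referenced by their base name.

```python
import os
import re

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".webp")  # assumed set
SOURCE_EXTENSIONS = (".js", ".jsx", ".ts", ".tsx")             # assumed set
DENSITY_SUFFIX = re.compile(r"@\d+(\.\d+)?x(?=\.[^.]+$)")      # e.g. icon@2x.png


def find_unreferenced_images(project_root):
    """Return image paths whose base file name appears in no source file."""
    image_paths = []
    source_chunks = []
    for dirpath, dirnames, filenames in os.walk(project_root):
        # Do not descend into dependencies.
        dirnames[:] = [d for d in dirnames if d != "node_modules"]
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            lower = filename.lower()
            if lower.endswith(IMAGE_EXTENSIONS):
                image_paths.append(path)
            elif lower.endswith(SOURCE_EXTENSIONS):
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    source_chunks.append(handle.read())
    combined = "\n".join(source_chunks)
    unreferenced = []
    for path in image_paths:
        # icon@2x.png is imported in code as icon.png
        base_name = DENSITY_SUFFIX.sub("", os.path.basename(path))
        if base_name not in combined:
            unreferenced.append(path)
    return sorted(unreferenced)
```

A list like this would still need the same manual review as the duplicates, because images referenced through dynamically built paths would be reported as unused.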

References: https://apiumhub.com/tech-blog-barcelona/introduction-perceptual-hashes-measuring-similarity/
