Calculating Android App Size on disk: Overview

ilyavorobiev

Ilya Vorobiev

Posted on March 12, 2023

This is the first part of a series of articles describing how to calculate the uncompressed size of an app on disk and report it. This part covers the motivation and the high-level architecture of the system.

Why is it important?

As Google puts it, “App size is one of the biggest factors that can affect your app’s install and uninstall metrics”. There is a lot of discussion about why the compressed binary size matters and how it affects installs, but a lack of free storage space is often the reason people uninstall apps.

That means there are at least two reasons to care about the installed app size and how much we store on disk:

  • To stay off the top of the list of apps that occupy the most space
  • To avoid putting the user in a position where they have to decide what to delete

The Google Play Console gives some insight into the uncompressed app size, but it might not be granular enough and doesn’t show everything that is stored on disk. Building your own monitoring of what you store shows you which directories take the most space, and so where to act to reduce the chance that users uninstall your app.

Data collection requirements

There are a few requirements that we want to impose on the app size monitoring solution:

  • It should calculate how much space the app occupies on the disk. This is the system’s primary goal - we want to see how much space our user will gain by deleting the app, and we want to minimize this metric.
  • It should calculate how much space important folders take (e.g. caches, databases or unpacked code). The number for the overall app is good, but the goal is to be able to identify where the most impact comes from and to identify opportunities. For that we need to understand which folders contribute the most.
  • The solution should scale as the number of files and the nesting depth grow over time. We could simply report every file related to our app, but as the app and its users' engagement grow, we might run into edge cases with very deep nesting or a huge number of files.
  • The solution shouldn’t impair the user experience, especially when the app is in the foreground. Traversing the file system may require a lot of IO work and compete with the rest of the app for resources. The job needs to be executed when the user is not actively interacting with the app, and preferably when the phone is charging.
  • The solution shouldn’t violate any user’s privacy preferences. Even though I will not cover this in detail in this series of articles, always make sure your users are aware that you collect usage data and that they have the option to opt out.

What is counted towards the app size?

Basically, everything that gets deleted together with your app.

  • Internal cache
  • Internal files
  • Internal data
  • External cache
  • External files
  • External media (may not be needed if you have switched to the MediaStore API)

Note: anything saved to external storage that does not belong to your app (such as other apps’ external folders or the shared media folder) is not counted towards your app size and is not deleted with your app. Because of that, I don’t think it affects the user’s decision to remove your app in particular, so I will not include it in the calculation. Keeping an eye on how you use the shared media space is still a good idea: by overusing it you still push the user towards the space limit and closer to the decision to delete something from their device.
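
To make the list above concrete, here is a minimal sketch of how those roots could be gathered in one place. `AppStorageRoots` and its labels are hypothetical names for illustration, not part of the series' implementation. The class is plain Java so it can run anywhere, and the comment shows which `android.content.Context` calls would typically supply each directory on a device.

```java
import java.io.File;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical holder for the app-owned directories that count towards app size.
 *
 * On Android the directories would typically come from android.content.Context:
 *   internal cache -> context.getCacheDir()
 *   internal files -> context.getFilesDir()
 *   internal data  -> context.getDataDir()              (API 24+)
 *   external cache -> context.getExternalCacheDir()     (may be null)
 *   external files -> context.getExternalFilesDir(null) (may be null)
 *   external media -> context.getExternalMediaDirs()    (array, entries may be null)
 */
final class AppStorageRoots {
    private final Map<String, File> roots = new LinkedHashMap<>();

    /** Registers a root under a label; null directories (e.g. storage not mounted) are skipped. */
    AppStorageRoots add(String label, File dir) {
        if (dir != null) {
            roots.put(label, dir);
        }
        return this;
    }

    /** Read-only view of the registered roots, in insertion order. */
    Map<String, File> asMap() {
        return Collections.unmodifiableMap(roots);
    }
}
```

One thing to watch for: the internal data directory usually contains the internal cache and files directories, so summing all three independently would count some bytes twice.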

High-level architecture of the solution

The monitoring system will consist of multiple components so that it can gain some flexibility: e.g. be controlled from the backend or report to various systems.

AppSizeCollector is the class that scans the filesystem and calculates the sizes of the app's folders on the device.

Illustration of app collector and disk interaction
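
The core of such a collector is a folder-size computation. The sketch below is one possible way to do it, not the implementation from the next part: `FolderSizeCalculator` is a hypothetical name, it sums logical file lengths (which can differ from the blocks actually allocated on disk), and it walks iteratively so that very deep nesting cannot overflow the call stack.

```java
import java.io.File;
import java.nio.file.Files;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper: sums the lengths of all regular files under a root folder. */
final class FolderSizeCalculator {
    private FolderSizeCalculator() {}

    /** Returns the total size in bytes, or 0 if the root is null, missing or unreadable. */
    static long sizeOf(File root) {
        if (root == null) {
            return 0L;
        }
        long total = 0L;
        Deque<File> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            File current = pending.pop();
            // Skip symbolic links so a link cycle cannot trap the traversal.
            // (java.nio.file is available on Android from API 26.)
            if (Files.isSymbolicLink(current.toPath())) {
                continue;
            }
            if (current.isFile()) {
                total += current.length();
            } else {
                File[] children = current.listFiles(); // null if not a directory or on I/O error
                if (children != null) {
                    for (File child : children) {
                        pending.push(child);
                    }
                }
            }
        }
        return total;
    }
}
```

Calling `FolderSizeCalculator.sizeOf(dir)` for each app-owned root and adding the results gives an approximation of the overall number that the first requirement asks for.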

AppSizeCollectorConfig holds the settings for the collector. It may contain a map that specifies how deep the recursion should go for certain folders.

Illustration of relation between AppSizeCollectorConfig and AppSizeCollector
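
As a rough illustration of the depth map idea, the sketch below reports a size for every folder down to a configured depth and folds anything deeper into its ancestors' totals. The class shapes, method names and the "path to max depth" layout are assumptions made for this example. Recursion is used here only for brevity.

```java
import java.io.File;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical config: how many levels below each root get their own report entry. */
final class AppSizeCollectorConfig {
    private final Map<String, Integer> maxDepthByRoot = new HashMap<>();
    private final int defaultMaxDepth;

    AppSizeCollectorConfig(int defaultMaxDepth) {
        this.defaultMaxDepth = defaultMaxDepth;
    }

    AppSizeCollectorConfig withMaxDepth(File root, int maxDepth) {
        maxDepthByRoot.put(root.getPath(), maxDepth);
        return this;
    }

    int maxDepthFor(File root) {
        return maxDepthByRoot.getOrDefault(root.getPath(), defaultMaxDepth);
    }
}

/** Hypothetical collector that honours the per-root depth limit. */
final class DepthLimitedCollector {
    private DepthLimitedCollector() {}

    /** Maps folder path to total bytes for the root (depth 0) and folders up to the max depth. */
    static Map<String, Long> collect(File root, AppSizeCollectorConfig config) {
        Map<String, Long> report = new LinkedHashMap<>();
        walk(root, 0, config.maxDepthFor(root), report);
        return report;
    }

    private static long walk(File node, int depth, int maxDepth, Map<String, Long> report) {
        if (node.isFile()) {
            return node.length();
        }
        long total = 0L;
        File[] children = node.listFiles(); // null if not a directory or on I/O error
        if (children != null) {
            for (File child : children) {
                total += walk(child, depth + 1, maxDepth, report);
            }
        }
        // Deeper folders still contribute to the total, they just get no entry of their own.
        if (depth <= maxDepth) {
            report.put(node.getPath(), total);
        }
        return total;
    }
}
```

Capping the depth keeps the report size bounded even if the number of nested folders grows, which is what the scalability requirement is about.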

AppSizeCollectionScheduler is responsible for scheduling the AppSizeCollector work once a day and for making sure the job runs in the background with minimal impact on user experience.

Illustration of relation of scheduler and AppSizeCollector
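
The scheduling rule itself can be reduced to a small pure function, sketched below. `CollectionSchedulePolicy` and its parameters are hypothetical, and treating "charging" as a hard requirement rather than a preference is an assumption made to keep the example short. On Android this kind of condition is normally expressed as constraints on a background job (for example with WorkManager or JobScheduler) rather than checked by hand.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical policy: run at most once a day, never in the foreground, only while charging. */
final class CollectionSchedulePolicy {
    private static final long MIN_INTERVAL_MILLIS = TimeUnit.DAYS.toMillis(1);

    private CollectionSchedulePolicy() {}

    /**
     * @param nowMillis       current time in epoch milliseconds
     * @param lastRunMillis   time of the last successful collection, or 0 if it never ran
     * @param appInForeground true while the user is actively using the app
     * @param charging        true while the device is plugged in (assumed to be required here)
     */
    static boolean shouldRun(long nowMillis, long lastRunMillis,
                             boolean appInForeground, boolean charging) {
        if (appInForeground || !charging) {
            return false;
        }
        boolean neverRan = lastRunMillis <= 0L;
        return neverRan || nowMillis - lastRunMillis >= MIN_INTERVAL_MILLIS;
    }
}
```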

AppSizeReportTransport is a component responsible for sending information to a consumer (us). It may send the collected info to a backend endpoint, but for demo purposes, we will just output this into logcat.

Illustration of relation between app size report and the transport

AppSizeReportSchema is a schema of the final report.
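
To show how a report and a transport could fit together, here is a minimal sketch. The field names, the `AppSizeReport` class and the line format are assumptions for illustration, since the actual schema is covered in the next part. The transport writes to a generic line sink so the example runs as plain Java. On Android, the sink could be something like `line -> Log.d("AppSize", line)` to send the lines to logcat.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical report shape: a total plus a per-folder breakdown. */
final class AppSizeReport {
    final long collectedAtEpochMillis;
    final long totalBytes;
    final Map<String, Long> bytesByFolder;

    AppSizeReport(long collectedAtEpochMillis, long totalBytes, Map<String, Long> bytesByFolder) {
        this.collectedAtEpochMillis = collectedAtEpochMillis;
        this.totalBytes = totalBytes;
        // Defensive copy so the report cannot change after it is built.
        this.bytesByFolder = Collections.unmodifiableMap(new LinkedHashMap<>(bytesByFolder));
    }
}

/** Sends a finished report to whoever consumes it (backend, log, ...). */
interface AppSizeReportTransport {
    void send(AppSizeReport report);
}

/** Demo transport that emits one line for the total and one line per folder. */
final class LineLoggingTransport implements AppSizeReportTransport {
    private final Consumer<String> sink;

    LineLoggingTransport(Consumer<String> sink) {
        this.sink = sink;
    }

    @Override
    public void send(AppSizeReport report) {
        sink.accept("app_size total_bytes=" + report.totalBytes
                + " collected_at=" + report.collectedAtEpochMillis);
        for (Map.Entry<String, Long> entry : report.bytesByFolder.entrySet()) {
            sink.accept("app_size folder=" + entry.getKey() + " bytes=" + entry.getValue());
        }
    }
}
```

Keeping the transport behind an interface is what lets the same collector report to logcat in a demo and to a backend endpoint in production.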

I will dive deeper into the implementation of these components in the next part of this series and stitch everything together with a scheduler job in the third part.
