The real size of Android objects 📏

pyricau

Py ⚔

Posted on September 22, 2020


Header image: Deep Dive by Romain Guy.

I'm currently reimplementing how LeakCanary computes the retained heap size of objects. As a quick reminder:

Shallow heap size of an object: The size of the object itself in memory, not counting anything it references.

Retained heap size of an object: The shallow size of that object plus the shallow size of all the objects that are transitively held in memory by only that object. In other words, it's the amount of memory that will be freed when that object is garbage collected.
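To make the two definitions concrete, here is a minimal sketch on a toy object graph. Everything in it (`HeapObject`, `reachableFrom`, `retainedSize`) is hypothetical and made up for illustration. It uses a naive "what becomes unreachable if this object goes away" approach, which is not necessarily how LeakCanary or any other tool computes it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical toy heap: each object has a shallow size and outgoing
// references, stored as indices into the heap vector.
struct HeapObject {
  std::size_t shallowSize;
  std::vector<std::size_t> references;
};

// Marks every object reachable from the GC roots without ever
// traversing the object at index `excluded`.
std::vector<bool> reachableFrom(const std::vector<HeapObject>& heap,
                                const std::vector<std::size_t>& gcRoots,
                                std::size_t excluded) {
  std::vector<bool> visited(heap.size(), false);
  std::vector<std::size_t> stack;
  for (std::size_t root : gcRoots) {
    if (root != excluded && !visited[root]) {
      visited[root] = true;
      stack.push_back(root);
    }
  }
  while (!stack.empty()) {
    std::size_t current = stack.back();
    stack.pop_back();
    for (std::size_t next : heap[current].references) {
      if (next != excluded && !visited[next]) {
        visited[next] = true;
        stack.push_back(next);
      }
    }
  }
  return visited;
}

// Retained size of `target`: the shallow size of everything that is
// reachable today but would be unreachable if `target` went away.
std::size_t retainedSize(const std::vector<HeapObject>& heap,
                         const std::vector<std::size_t>& gcRoots,
                         std::size_t target) {
  const std::size_t kNone = heap.size();  // Not a valid index: excludes nothing.
  std::vector<bool> before = reachableFrom(heap, gcRoots, kNone);
  std::vector<bool> after = reachableFrom(heap, gcRoots, target);
  std::size_t total = 0;
  for (std::size_t i = 0; i < heap.size(); ++i) {
    if (before[i] && !after[i]) {
      total += heap[i].shallowSize;
    }
  }
  return total;
}
```

For example, take a root of 16 bytes that references A (24 bytes) and C (32 bytes), where A references B (8 bytes) and C. The retained size of A is 24 + 8 = 32 bytes: B is held only by A, but C stays alive through the root, so it does not count.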

One cannot trust a shallow size

As part of that work, I compared the shallow size of objects as reported in LeakCanary versus other heap dump tools such as YourKit, Eclipse Memory Analyzer Tool (MAT) and Android Studio Memory Analyzer. That's when I realized something was wrong: every tool provides a different answer.

I asked Jesse Wilson about it and he pointed me to this article by Aleksey Shipilёv: What Heap Dumps Are Lying To You About. Some takeaways:

  • Every Java VM lays out its memory in a slightly different way and performs various optimizations, such as changing field order, aligning bits, etc.
  • The heap dump format (.hprof) is a standard. A class dump record contains the list of fields and their types as well as the instance size of the class. Aleksey Shipilёv asked about having the instance size be the actual size of an instance in memory but the answer was nope: the sizes in the HPROF dump are VM and padding independent, to avoid breaking expectations from consuming tools.
  • There's a tool called JOL that instruments the JVM runtime to report the actual size of an object. Aleksey used it to compare the size reported by hprof-based tools with the actual size and found that each tool was wrong in a different way.

Exploring Java's Hidden Costs

In Exploring Java's Hidden Costs, Jake Wharton showed how to use JOL. Unfortunately, JOL only runs on JVMs, and not the Dalvik or ART runtimes. To quote Jake:

In this case, because the classes are exactly the same and the JVM 64 bit, and Android's now 64 bit, the number should be translatable. If that bothers you, treat them as an approximation and allow for some 20% variance. It's certainly a lot easier than figuring out how to get the object sizes on Android itself. Which is not impossible, it's just a lot easier this way.

Damn it, in LeakCanary all we have is the heap dump. Sounds like we need to go the not easy way!

I asked Romain Guy about this and he suggested looking at JVM TI, an agent interface that is implemented on Android 8+ as ART TI. JVM TI exposes a GetObjectSize API.

Read the Source, Luke


Learn to Read the Source, Luke - Coding Horror

Android is Open Source, so we can always find answers to our questions... as long as we know where to search!

JVM TI

Here's the implementation of JVM TI GetObjectSize():



jvmtiError ObjectUtil::GetObjectSize(env* env ATTRIBUTE_UNUSED,
                                     jobject jobject,
                                     jlong* size_ptr) {
  // ... (setup of `soa`, which is not part of this excerpt)
  art::ObjPtr<art::mirror::Object> object =
      soa.Decode<art::mirror::Object>(jobject);

  *size_ptr = object->SizeOf();
  return ERR(NONE);
}



The interesting code lives in Object::SizeOf():



template<VerifyObjectFlags kVFlags>
inline size_t Object::SizeOf() {
  size_t result;
  constexpr VerifyObjectFlags kNewFlags = RemoveThisFlags(kVFlags);
  if (IsArrayInstance<kVFlags>()) {
    result = AsArray<kNewFlags>()->template SizeOf<kNewFlags>();
  } else if (IsClass<kNewFlags>()) {
    result = AsClass<kNewFlags>()->template SizeOf<kNewFlags>();
  } else if (IsString<kNewFlags>()) {
    result = AsString<kNewFlags>()->template SizeOf<kNewFlags>();
  } else {
    result = GetClass<kNewFlags, kWithoutReadBarrier>()
        ->template GetObjectSize<kNewFlags>();
  }
  return result;
}



Instance size

Let's focus on instance size, which is the last else in that conditional. If the object is an instance, then its size is retrieved from GetClass()->GetObjectSize() which returns the value from object_size_ in class.h:



// Total object size; used when allocating storage on gc heap.
// (For interfaces and abstract classes this will be zero.)
// See also class_size_.
uint32_t object_size_;



Instance allocation

Let's double check that this is the actual size of instances in memory by looking at usages. We find that object_size_ is used by Class::Alloc() in class-alloc-inl.h:



template<bool kIsInstrumented, Class::AddFinalizer kAddFinalizer, 
    bool kCheckAddFinalizer>
inline ObjPtr<Object> Class::Alloc(Thread* self, 
    gc::AllocatorType allocator_type) {
  gc::Heap* heap = Runtime::Current()->GetHeap();
  return heap->AllocObjectWithAllocator<kIsInstrumented, false>(
      self, this, this->object_size_, allocator_type,
      VoidFunctor()
  );
}



So the actual memory allocated for an object is indeed defined by object_size_ in class.h.

Note: the memory allocated by Heap::AllocObjectWithAllocator() in heap-inl.h might be rounded up to a multiple of 8 when using a Thread-local bump allocator (TLAB, see Trash Talk by Chet Haase and Romain Guy). However the default CMS GC does not use that allocator.
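For a sense of what that rounding means in practice, here is a minimal sketch of rounding a size up to a multiple of 8. The `roundUp` helper is hypothetical, written for illustration; it is not copied from the ART sources.

```cpp
#include <cstddef>

// Rounds `size` up to the next multiple of `alignment`.
// `alignment` must be a power of two (8 in the case discussed above).
constexpr std::size_t roundUp(std::size_t size, std::size_t alignment) {
  return (size + alignment - 1) & ~(alignment - 1);
}
```

With this, `roundUp(21, 8)` is 24 and `roundUp(32, 8)` is 32: an instance whose object_size_ is 21 would occupy 24 bytes with such an allocator, while a 32 byte instance would be unaffected.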

Class linking

Looking at more usages of object_size_, we find that it's set by ClassLinker::LinkFields() in class_linker.cc when linking classes:



bool ClassLinker::LinkFields(Thread* self,
                             Handle<mirror::Class> klass,
                             bool is_static,
                             size_t* class_size) {
  MemberOffset field_offset(0);

  ObjPtr<mirror::Class> super_class = klass->GetSuperClass();
  if (super_class != nullptr) {
    field_offset = MemberOffset(super_class->GetObjectSize());
  }

  // ... code that increases field_offset as fields are added

  size_t size = field_offset.Uint32Value();
  klass->SetObjectSize(size);

  return true;
}



Back to heap dumps

Now that we know how to get the actual size of instances, let's compare that with the instance size reported in Android heap dumps.


When we trigger a heap dump via Debug.dumpHprofData(), the VM calls DumpHeap() in hprof.cc. Let's look at Hprof::DumpHeapClass(), more specifically the part where the instance size of a class is added:



// Instance size.
if (klass->IsClassClass()) {
  // As mentioned above, we will emit instance fields as
  // synthetic static fields. So the base object is "empty."
  __ AddU4(0);
} else if (klass->IsStringClass()) {
  // Strings are variable length with character data at the end 
  // like arrays. This outputs the size of an empty string.
  __ AddU4(sizeof(mirror::String));
} else if (klass->IsArrayClass() || klass->IsPrimitive()) {
  __ AddU4(0);
} else {
  __ AddU4(klass->GetObjectSize());  // instance size
}



The last else in that conditional is the instance size for most instances, and once again points to object_size_ in class.h.

So, unlike OpenJDK heap dumps, Android heap dumps contain the actual size of instances in memory.

Exploring heap dump records

In Exploring Java's Hidden Costs, Jake showed the output from JOL for android.util.SparseArray:



android.util.SparseArray object internals:
SIZE     TYPE DESCRIPTION 
4        (object header)
4        (object header)
4        (object header)
4        int SparseArray.mSize
1        boolean SparseArray.mGarbage
3        (alignment/padding gap)
4        int[] SparseArray.mKeys
4        Object[] SparseArray.mValues
4        (loss due to the next object alignment)

Instance size: 32 bytes



Let's use the LeakCanary heap dump parser (Shark) to see what Android heap dumps report:



val hprofFile = "heap_dump_android_o.hprof".classpathFile()
val sparseArraySize = hprofFile.openHeapGraph().use { graph ->
  graph.findClassByName("android.util.SparseArray")!!.instanceByteSize
}
println("Instance size: $sparseArraySize bytes")



Result:



Instance size: 21 bytes



Nice, that's way less than the 32 bytes reported by JOL!

Let's look at the details of the reported fields:



val description = hprofFile.openHeapGraph().use { graph ->
  graph.findClassByName("android.util.SparseArray")!!
      .classHierarchy
      .flatMap { clazz ->
        clazz.readRecord().fields.map { field ->
          val fieldSize = if (field.type == REFERENCE_HPROF_TYPE)
            graph.identifierByteSize
          else
            byteSizeByHprofType.getValue(field.type)
          val typeName =
            if (field.type == REFERENCE_HPROF_TYPE)
              "REF"
            else
              primitiveTypeByHprofType.getValue(field.type).name
          val className = clazz.name
          val fieldName = clazz.instanceFieldName(field)
          "$fieldSize $typeName $className#$fieldName"
        }.asSequence()
      }.joinToString("\n")
}
println(description)



Result:



1 BOOLEAN android.util.SparseArray#mGarbage
4 REF android.util.SparseArray#mKeys
4 INT android.util.SparseArray#mSize
4 REF android.util.SparseArray#mValues
4 REF java.lang.Object#shadow$_klass_
4 INT java.lang.Object#shadow$_monitor_



So every SparseArray instance has a shallow size of 21 bytes, which includes 8 bytes from the Object class and 13 bytes for its own fields... and 0 bytes wasted!
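The arithmetic is easy to check by hand, but here it is as a small sketch. `FieldRecord`, `sparseArrayFields` and `sumFieldSizes` are hypothetical names for illustration; the sizes and field names are the ones printed above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One instance field as listed in the heap dump: size in bytes and name.
struct FieldRecord {
  std::size_t byteSize;
  std::string name;
};

// The six fields reported above for SparseArray and its parent, Object.
std::vector<FieldRecord> sparseArrayFields() {
  return {
      {1, "android.util.SparseArray#mGarbage"},
      {4, "android.util.SparseArray#mKeys"},
      {4, "android.util.SparseArray#mSize"},
      {4, "android.util.SparseArray#mValues"},
      {4, "java.lang.Object#shadow$_klass_"},
      {4, "java.lang.Object#shadow$_monitor_"},
  };
}

// Sums the field sizes. This only equals the instance size when the
// layout has no gaps, which is the case for SparseArray.
std::size_t sumFieldSizes(const std::vector<FieldRecord>& fields) {
  std::size_t total = 0;
  for (const FieldRecord& field : fields) {
    total += field.byteSize;
  }
  return total;
}
```

Summing the six records gives 13 + 8 = 21, which matches the instance size stored in the heap dump. When the two numbers differ for a class, the difference is the number of bytes lost to gaps.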

Gaps and alignment

ClassLinker::LinkFields() in class_linker.cc determines the position of every field in memory, with the following rules:

  • The first N bytes are used to store the field values of the parent class, based on the parent Class::GetObjectSize(). N could be anything, even an odd number. If the parent class has gaps (unused bytes), these won't be touched.
  • Then fields are inserted, aligned on their size: longs are 8 byte aligned, ints are 4 byte aligned, etc.
  • The insertion order is references first, then primitive fields with largest field first. E.g. reference then long then int then char then boolean.
  • Because fields must be aligned on their own size, there may be gaps. Here's an example on a 32 bit ART VM:


open class Parent {
  val myChar = 'a'
  val myBool1 = true
}

class Child : Parent() {
  val ref1 = Any()
  val ref2 = Any()
  val myLong = 0L
}


# java.lang.Object is 8 bytes
4 REF     java.lang.Object#shadow$_klass_
4 INT     java.lang.Object#shadow$_monitor_
# com.example.Parent is 8 + 3 = 11 bytes
2 CHAR    com.example.Parent#myChar
1 BOOLEAN com.example.Parent#myBool1
# com.example.Child is 11 + 21 = 32 bytes
1 GAP for 4 byte alignment for refs
4 REF     com.example.Child#ref1
4 REF     com.example.Child#ref2
4 GAP for 8 byte alignment for long
8 LONG    com.example.Child#myLong


Here com.example.Child is 32 bytes, which includes 5 bytes wasted for field alignment.

  • If a field can fit in an existing gap, that field gets moved forward:


open class Parent {
  val myChar = 'a'
  val myBool1 = true
}

class Child : Parent() {
  val ref1 = Any()
  val ref2 = Any()
  val myLong = 0L
  // Added myInt and myBool3
  val myInt = 0
  val myBool3 = true
}


# java.lang.Object is 8 bytes
4 REF     java.lang.Object#shadow$_klass_
4 INT     java.lang.Object#shadow$_monitor_
# com.example.Parent is 8 + 3 = 11 bytes
2 CHAR    com.example.Parent#myChar
1 BOOLEAN com.example.Parent#myBool1
# com.example.Child is 11 + 21 = still 32 bytes!
1 BOOLEAN com.example.Child#myBool3 (1 byte gap)
4 REF     com.example.Child#ref1
4 REF     com.example.Child#ref2
4 INT     com.example.Child#myInt (4 byte gap)
8 LONG    com.example.Child#myLong



In this example we added an int and a boolean to Child and the instance size hasn't changed.
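The rules above can be sketched as a small layout function. This is a simplified illustration under my own assumptions, not the code from class_linker.cc: the types (`Field`, `Gap`, `Placement`, `Layout`) and functions (`alignUp`, `layoutFields`) are hypothetical, and gaps are filled first-fit, which may differ from the order ART uses to pick a gap.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

struct Field {
  std::string name;
  std::size_t size;  // In bytes: 1, 2, 4 or 8.
  bool isReference;
};

struct Placement {
  std::string name;
  std::size_t offset;
};

struct Gap {
  std::size_t offset;
  std::size_t size;
};

struct Layout {
  std::vector<Placement> placements;
  std::size_t objectSize;
  std::size_t wastedBytes;
};

std::size_t alignUp(std::size_t value, std::size_t alignment) {
  return (value + alignment - 1) / alignment * alignment;
}

Layout layoutFields(std::size_t parentSize, std::vector<Field> fields) {
  // Insertion order: references first, then primitives, largest first.
  std::stable_sort(fields.begin(), fields.end(),
                   [](const Field& a, const Field& b) {
                     if (a.isReference != b.isReference) return a.isReference;
                     return a.size > b.size;
                   });

  Layout layout{{}, parentSize, 0};
  std::vector<Gap> gaps;
  std::size_t offset = parentSize;  // Parent bytes are never touched.

  for (const Field& field : fields) {
    bool placed = false;
    // First, try to move the field forward into an existing gap.
    for (std::size_t i = 0; i < gaps.size() && !placed; ++i) {
      const Gap gap = gaps[i];
      const std::size_t start = alignUp(gap.offset, field.size);
      const std::size_t end = start + field.size;
      const std::size_t gapEnd = gap.offset + gap.size;
      if (end <= gapEnd) {
        layout.placements.push_back({field.name, start});
        gaps.erase(gaps.begin() + static_cast<std::ptrdiff_t>(i));
        // Keep whatever is left of the gap on either side of the field.
        if (start > gap.offset) gaps.push_back({gap.offset, start - gap.offset});
        if (end < gapEnd) gaps.push_back({end, gapEnd - end});
        placed = true;
      }
    }
    if (placed) continue;
    // Otherwise append it, aligned on its own size, recording any new gap.
    const std::size_t aligned = alignUp(offset, field.size);
    if (aligned > offset) gaps.push_back({offset, aligned - offset});
    layout.placements.push_back({field.name, aligned});
    offset = aligned + field.size;
  }

  layout.objectSize = offset;
  for (const Gap& gap : gaps) layout.wastedBytes += gap.size;
  return layout;
}
```

With an 11 byte parent, calling `layoutFields` with two 4 byte references and one 8 byte long gives an object size of 32 with 5 wasted bytes. Adding a 4 byte int and a 1 byte boolean still gives 32, with the int at offset 20 and the boolean at offset 11, and no bytes wasted.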

Conclusion

  • Unlike OpenJDK heap dumps, Android heap dumps contain the actual size of instances in memory.
  • We only looked into instance size, on the latest ART implementation. It'd be interesting to do a similar investigation for class size and array size, and also see if the same results can be observed on Dalvik.
  • Reading the Android sources is fun! Even when, like me, you have no idea how C++ works. Comments, symbol names and git history usually provide enough details to figure it out.

Many thanks to Romain Guy, Jesse Wilson and Artem Chubaryan who sent me so many pointers and entertained my stupid ideas.
