Understanding CBOR Encoded Data

CBOR, or Concise Binary Object Representation, is an efficient way of serializing data objects, similar to what you see in the protobuf standard.

I set out to understand it because it is used throughout the web authentication security protocol.

Why may it be of interest to you?
Maybe you want to:

Read dense and highly technical articles 🤷‍♀️.
Understand a byte encoding standard which is becoming more relevant for web security.
Write your own serialiser to avoid adding a dependency on one of the existing libraries which do it well.
Amaze your friends and family!

The Encoding Rules

CBOR-encoded data is structured as a sequence of header and data pairs. The header determines the format and the length of the data segment.

_{Table 1 - Header Structure}

1 Byte Header		Data
Major Type	Additional Info	Data
3 Bits	5 Bits	n Bytes
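As a minimal sketch of Table 1, the header byte can be split with a shift and a mask. The function name split_header is only an illustrative choice, not part of any CBOR library.

```python
def split_header(header: int) -> tuple[int, int]:
    """Split a one-byte CBOR header into (major type, additional info)."""
    if not 0 <= header <= 0xFF:
        raise ValueError("a header is exactly one byte")
    major_type = header >> 5         # top 3 bits
    additional_info = header & 0x1F  # bottom 5 bits
    return major_type, additional_info


print(split_header(0x82))  # (4, 2)
print(split_header(0x19))  # (0, 25)
```

The first call reads 0x82 as major type 4 with additional information 2; the second reads 0x19 as major type 0 with additional information 0x19 (25).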

All possible 3-bit values of the major type are shown in the table below.

_{Table 2 - Major Types}

Major Type	Semantics
`0x00`	Unsigned Integer
`0x01`	Negative Integer
`0x02`	Byte String
`0x03`	Text String
`0x04`	Array
`0x05`	Map
`0x06`	Tag
`0x07`	Simple/Float

Unsigned Integer (MT 0)

An unsigned integer may have a value which ranges from 0 to 2⁶⁴-1. The number of bytes used to express the integer is specified in the additional information part of the header.

_{Table 3 - Integer size}

Add. Info.	Semantics
`0x18`	8 bit
`0x19`	16 bit
`0x1A`	32 bit
`0x1B`	64 bit

So an 8-bit unsigned integer has a header of

(mt) 0b0000_0000 | (add. info.) 0b0001_1000 = (hdr) 0b0001_1000

And an unsigned integer value of 24 including the header would read as 0b0001_1000_0001_1000 or 0x1818.

A cool trick is that the numbers 0 to 23 can be represented by the header byte alone, with the value stored directly in the additional information bits. This means that the integer value of 23 can be represented as 0x1817 or just as 0x17 (0b0001_0111).
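The rules above can be sketched as a small encoder that always picks the shortest form. The name encode_head is hypothetical; the widths and codes come from Table 3.

```python
def encode_head(major_type: int, argument: int) -> bytes:
    """Encode a CBOR header and its argument in the shortest form."""
    if not 0 <= argument < (1 << 64):
        raise ValueError("argument must fit in 64 bits")
    if argument < 24:
        # Values 0..23 live directly in the additional information bits.
        return bytes([(major_type << 5) | argument])
    # Otherwise pick the smallest width listed in Table 3.
    for info, size in ((0x18, 1), (0x19, 2), (0x1A, 4), (0x1B, 8)):
        if argument < (1 << (8 * size)):
            header = (major_type << 5) | info
            return bytes([header]) + argument.to_bytes(size, "big")
    raise AssertionError("unreachable: argument was range-checked above")


print(encode_head(0, 23).hex())   # 17
print(encode_head(0, 24).hex())   # 1818
print(encode_head(0, 500).hex())  # 1901f4
```

The argument bytes are written big-endian, which is why 500 appears as 01F4.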

Negative Integer (MT 1)

A negative integer may have a value which ranges from -2⁶⁴ to -1, where value = -1 - argument.

Negative integers also use the same additional information as unsigned integers as seen in table 3.

So to represent -500, the header of 0b0011_1001 should be used, which means "negative integer with 16 bits", followed by an argument of -1 - (-500) = 499 (0b0000_0001_1111_0011); altogether 0b0011_1001_0000_0001_1111_0011 or 0x3901F3.

Just as before, arguments of 0 to 23 can be stored in the header byte alone, which covers the negative integers -1 to -24, where -24 is 0x37 (0b0011_0111).
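A minimal sketch of the negative integer rule, assuming the shortest width is always wanted. The function name encode_negative is illustrative only.

```python
def encode_negative(value: int) -> bytes:
    """Encode a negative integer as CBOR major type 1."""
    if not -(1 << 64) <= value <= -1:
        raise ValueError("value must be between -2**64 and -1")
    argument = -1 - value  # e.g. -500 becomes 499
    if argument < 24:
        return bytes([0x20 | argument])  # 0x20 is major type 1 shifted left by 5
    for info, size in ((0x18, 1), (0x19, 2), (0x1A, 4), (0x1B, 8)):
        if argument < (1 << (8 * size)):
            return bytes([0x20 | info]) + argument.to_bytes(size, "big")
    raise AssertionError("unreachable: value was range-checked above")


print(encode_negative(-500).hex())  # 3901f3
print(encode_negative(-24).hex())   # 37
```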

Byte String (MT 2)

Similar to the rules we have established for both unsigned integers and negative integers, we can specify a byte string length just with the header value if the length is from 0 to 23, and for longer strings, follow the additional information codes shown in table 3.

Encoding the byte string 0x01020304 results in 0x4401020304. The header 0x44 or 0b0100_0100, represents major type 2 and a length of 4.

A byte string of 500 octets would have a header that combines major type 2 with the additional information 0x19 (16 bits), giving 0x59, followed by 0x01F4 to specify the length of 500, then the 500 bytes of the value.
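Both byte string examples can be reproduced with a short sketch. The function name encode_byte_string is hypothetical, and the length is encoded with the same rules as an unsigned integer.

```python
def encode_byte_string(payload: bytes) -> bytes:
    """Encode a definite-length CBOR byte string (major type 2)."""
    length = len(payload)
    if length < 24:
        head = bytes([0x40 | length])  # 0x40 is major type 2 shifted left by 5
    else:
        for info, size in ((0x18, 1), (0x19, 2), (0x1A, 4), (0x1B, 8)):
            if length < (1 << (8 * size)):
                head = bytes([0x40 | info]) + length.to_bytes(size, "big")
                break
        else:
            raise ValueError("payload too long for a 64-bit length")
    return head + payload


print(encode_byte_string(bytes.fromhex("01020304")).hex())  # 4401020304
print(encode_byte_string(bytes(500))[:3].hex())             # 5901f4
```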

Text String (MT 3)

A CBOR text string is always UTF-8 and the number of bytes in the string is given by the length argument formatted in the same way as a byte string.

The UTF-8 string represented by the bytes 0x6C6F6FC9942073E1B4892073E1B489C9A5CA87 has a length of 19 bytes. Since the length is less than 24, we can represent the length with only the header byte of 0b0111_0011 or 0x73, encoding the string as 0x736C6F6FC9942073E1B4892073E1B489C9A5CA87.

73                                      # text(19)
   6C6F6FC9942073E1B4892073E1B489C9A5CA87 # "looɔ sᴉ sᴉɥʇ"
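Reading the example back is a useful check that the length counts bytes rather than characters. This is a sketch that only handles strings shorter than 24 bytes, and decode_short_text is a made-up name.

```python
def decode_short_text(data: bytes) -> str:
    """Decode a CBOR text string whose length fits in the header byte."""
    header = data[0]
    major_type, length = header >> 5, header & 0x1F
    if major_type != 3 or length >= 24:
        raise ValueError("sketch only handles text strings of 0..23 bytes")
    payload = data[1:1 + length]
    if len(payload) != length:
        raise ValueError("truncated text string")
    return payload.decode("utf-8")


encoded = bytes.fromhex("736C6F6FC9942073E1B4892073E1B489C9A5CA87")
text = decode_short_text(encoded)
print(len(text), len(text.encode("utf-8")))  # 12 19
```

The string has 12 characters but occupies 19 bytes, and it is the 19 that goes into the header.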

Array (MT 4)

Arrays have a length, and as expected the length is represented in the same way as all the other major types described so far. Furthermore, an array is more like a tuple, since it is not constrained to encapsulate a single type.

An array of 2 elements would look like:

0b100 MT4 bits
0b00010 Additional information for length of 2
0b0000_0001 Integer value of 1
0b0001_1001_0000_0001_1111_0100 Integer value of 500

The encoded result is 0x82011901F4.

82         # array(2)
   01      # unsigned(1)
   19 01F4 # unsigned(500)
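The array example can be reproduced with a small recursive sketch. It assumes the only element types are non-negative integers and nested lists; head and encode_item are illustrative names.

```python
def head(major_type: int, argument: int) -> bytes:
    """Shortest CBOR header and argument for the given major type."""
    if argument < 24:
        return bytes([(major_type << 5) | argument])
    for info, size in ((0x18, 1), (0x19, 2), (0x1A, 4), (0x1B, 8)):
        if argument < (1 << (8 * size)):
            return bytes([(major_type << 5) | info]) + argument.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def encode_item(item) -> bytes:
    """Encode a non-negative int or a (possibly nested) list of them."""
    if isinstance(item, int) and item >= 0:
        return head(0, item)
    if isinstance(item, list):
        # The header carries the element count, then each element follows.
        return head(4, len(item)) + b"".join(encode_item(x) for x in item)
    raise TypeError(f"unsupported item: {item!r}")


print(encode_item([1, 500]).hex())  # 82011901f4
```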

Map (MT 5)

A map (also known as a dictionary) is laid out like an array, except that its elements are read as a sequence of key/value pairs. If we read the array specified above as a map, the first array element would be the first key and the second element the first value. This means that a valid map always contains an even number of encoded items.

Since the map's keys are used to lookup the corresponding value, the map should not have any duplicate keys.

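Using the example array above and changing the major type to 5, the encoded result is 0xA1011901F4. Note that the length in the header drops from 2 to 1, because a map's length counts key/value pairs rather than individual items.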

A1         # map(1)
   01      # unsigned(1)
   19 01F4 # unsigned(500)
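A sketch of the same idea for maps, assuming keys and values are all non-negative integers. Using a Python dict as the input conveniently rules out duplicate keys; the names head and encode_map are hypothetical.

```python
def head(major_type: int, argument: int) -> bytes:
    """Shortest CBOR header and argument for the given major type."""
    if argument < 24:
        return bytes([(major_type << 5) | argument])
    for info, size in ((0x18, 1), (0x19, 2), (0x1A, 4), (0x1B, 8)):
        if argument < (1 << (8 * size)):
            return bytes([(major_type << 5) | info]) + argument.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def encode_map(pairs: dict[int, int]) -> bytes:
    """Encode a dict of non-negative ints as a CBOR map (major type 5)."""
    out = head(5, len(pairs))  # the count is the number of pairs
    for key, value in pairs.items():
        out += head(0, key) + head(0, value)  # key first, then its value
    return out


print(encode_map({1: 500}).hex())  # a1011901f4
```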

Tag (MT 6)

A tag operates as an extension point for an extended number of types (see IANA CBOR Tags for full list) where we can specify data structures that represent datetime, big numbers, URLs and much much more. The way it works is that the tag encodes a number which then describes a well known tag data type.

For example, the tag value of 0 corresponds to an RFC3339 datetime.
To represent the RFC3339 datetime "1985-04-12T23:20:50.52Z", we first start with the encoded string

77                                      # text(23)
   313938352D30342D31325432333A32303A35302E35325A # "1985-04-12T23:20:50.52Z"

and then prefix the encoding with the tag header 0b110 (tag) 0b00000 (0 value = RFC3339 datetime), or 0xC0.

C0                                      # tag(0)
   77                                   # text(23)
      313938352D30342D31325432333A32303A35302E35325A # "1985-04-12T23:20:50.52Z"

The decoder then knows the following string should be interpreted as an RFC3339 datetime.
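The two steps, encode the string and then prefix the tag header, can be sketched as follows. The helper tag_text is a made-up name, and it only handles tag numbers and string lengths from 0 to 23 so that each header stays a single byte.

```python
def tag_text(tag_number: int, text: str) -> bytes:
    """Wrap a short text string in a CBOR tag (major type 6)."""
    payload = text.encode("utf-8")
    if not 0 <= tag_number < 24 or len(payload) >= 24:
        raise ValueError("sketch only handles arguments 0..23")
    tag_header = 0xC0 | tag_number     # 0b110 followed by the tag number
    text_header = 0x60 | len(payload)  # 0b011 followed by the byte length
    return bytes([tag_header, text_header]) + payload


encoded = tag_text(0, "1985-04-12T23:20:50.52Z")
print(encoded[:2].hex())  # c077
```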

Simple/Float (MT 7)

A major type 7 is either a floating point number or a simple value depending on the additional information part of the header.

_{Table 4 - Simple/Float Additional Information}

Add. Info.	Semantics
`0x14`	false
`0x15`	true
`0x16`	null
`0x17`	undefined
`0x19`	IEEE 754 Half-Precision Float (16 bits follow)
`0x1A`	IEEE 754 Single-Precision Float (32 bits follow)
`0x1B`	IEEE 754 Double-Precision Float (64 bits follow)
`0x1F`	"break" stop code for indefinite-length items

A value of 3.14159 can be expressed as a 64 bit float as 0xFB400921F9F01B866E.
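A short sketch of Table 4 using Python's struct module. The function names are illustrative; 0xE0 is major type 7 shifted into the top three bits.

```python
import struct


def encode_simple(additional_info: int) -> bytes:
    """Encode a one-byte simple value such as false, true or null."""
    if not 0 <= additional_info < 24:
        raise ValueError("sketch only handles simple values 0..23")
    return bytes([0xE0 | additional_info])


def encode_double(value: float) -> bytes:
    """Encode a float as a big-endian IEEE 754 double (add. info 0x1B)."""
    return bytes([0xE0 | 0x1B]) + struct.pack(">d", value)


print(encode_simple(0x15).hex())    # f5 (true)
print(encode_double(3.14159).hex())  # fb400921f9f01b866e
```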

Indefinite Length Strings, Arrays & Maps

Byte strings, text strings, arrays and maps have another way of defining length. They can use the additional information code 0x1F, which signifies that the item has an "indefinite length". This is useful when the total length is not known up front, as is common when streaming.

An indefinite length sequence is terminated by the "break" or "stop" code, which is major type 7 with an additional information code of 0x1F; together the header is 0xFF.

Strings

For byte and text strings of an indefinite length, the data arrives in chunks. Each chunk has a specified length. The sequence of chunks is terminated by the break code 0xFF.

The CBOR encoded data 0x5F440102030443010203FF describes an indefinite length byte string with two chunks, respectively of length 4 and 3 bytes.

5F             # header for the indefinite length byte string
   44          # the header for a byte string of length 4
      01020304 # the first chunk of 4 bytes
   43          # the header for a byte string of length 3
      010203   # the second chunk of 3 bytes
   FF          # the indefinite length byte string is terminated with the "break" code
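Reassembling the chunks can be sketched as a simple loop. This version assumes every chunk is shorter than 24 bytes, so each chunk header is a single byte; decode_indefinite_bytes is a hypothetical name.

```python
def decode_indefinite_bytes(data: bytes) -> bytes:
    """Join the chunks of an indefinite-length byte string."""
    if not data or data[0] != 0x5F:
        raise ValueError("expected the indefinite-length header 0x5F")
    pos = 1
    chunks = []
    while True:
        if pos >= len(data):
            raise ValueError("missing break code")
        header = data[pos]
        pos += 1
        if header == 0xFF:  # the break code ends the sequence
            return b"".join(chunks)
        major_type, length = header >> 5, header & 0x1F
        if major_type != 2 or length >= 24:
            raise ValueError("sketch only handles short byte string chunks")
        chunk = data[pos:pos + length]
        if len(chunk) != length:
            raise ValueError("truncated chunk")
        chunks.append(chunk)
        pos += length


print(decode_indefinite_bytes(bytes.fromhex("5F440102030443010203FF")).hex())
# 01020304010203
```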

Arrays & Maps

The CBOR encoded data 0x9F000102030405060708090AFF describes an indefinite length array of unsigned integers.

9F    # header for the indefinite length array
   00 # unsigned(0)
   01 # unsigned(1)
   02 # unsigned(2)
   03 # unsigned(3)
   04 # unsigned(4)
   05 # unsigned(5)
   06 # unsigned(6)
   07 # unsigned(7)
   08 # unsigned(8)
   09 # unsigned(9)
   0A # unsigned(10)
   FF # the indefinite length array is terminated
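The streaming use case can be sketched with a generator that emits the header, then each element as it becomes available, then the break code. It assumes every value is in the range 0 to 23, and stream_array is an illustrative name.

```python
from typing import Iterable, Iterator


def stream_array(values: Iterable[int]) -> Iterator[bytes]:
    """Yield an indefinite-length array piece by piece."""
    yield b"\x9f"  # array header with additional information 0x1F
    for value in values:
        if not 0 <= value < 24:
            raise ValueError("sketch only handles values 0..23")
        yield bytes([value])  # a header-only unsigned integer
    yield b"\xff"  # break code


print(b"".join(stream_array(range(11))).hex())  # 9f000102030405060708090aff
```

The producer never needs to know how many elements there will be before it starts writing.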

These arrays can be nested with definite length arrays as well. The CBOR data 0x9F018202039F0405FFFF is an indefinite length array that is made up of:

a single unsigned integer
a definite length array of 2 elements
an indefinite length array of 2 elements

9F       # array(*)
   01    # unsigned(1)
   82    # array(2)
      02 # unsigned(2)
      03 # unsigned(3)
   9F    # array(*)
      04 # unsigned(4)
      05 # unsigned(5)
      FF # primitive(*)
   FF    # primitive(*)
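A recursive decoder makes the nesting explicit. This sketch only understands header-only unsigned integers, short definite length arrays and indefinite length arrays; decode_item and BREAK are made-up names, and truncated input simply raises IndexError.

```python
BREAK = object()  # sentinel returned when the break code 0xFF is read


def decode_item(data: bytes, pos: int = 0):
    """Decode one item starting at pos; return (value, next position)."""
    header = data[pos]
    pos += 1
    if header == 0xFF:
        return BREAK, pos
    major_type, info = header >> 5, header & 0x1F
    if major_type == 0 and info < 24:
        return info, pos
    if major_type == 4 and info < 24:  # definite length: read exactly info items
        items = []
        for _ in range(info):
            item, pos = decode_item(data, pos)
            if item is BREAK:
                raise ValueError("unexpected break inside a definite length array")
            items.append(item)
        return items, pos
    if major_type == 4 and info == 0x1F:  # indefinite length: read until break
        items = []
        while True:
            item, pos = decode_item(data, pos)
            if item is BREAK:
                return items, pos
            items.append(item)
    raise ValueError(f"unsupported header 0x{header:02X}")


value, end = decode_item(bytes.fromhex("9F018202039F0405FFFF"))
print(value, end)  # [1, [2, 3], [4, 5]] 10
```

Each break code closes only the innermost open indefinite length array, which is why the example ends with two 0xFF bytes.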

Final Thoughts

There are some interesting tricks used in the standard to really make the encoding economical. There are pros and cons to using CBOR when compared to other standards like protobuf.

With protobuf you need to distribute definition files in order to decode the data, which lets it encode class structures much more efficiently, whereas CBOR map keys are often strings. With CBOR you can instead use enums to represent the class properties as map keys, and then distribute the enum definitions. So from that point of view, CBOR can be pretty damn close to protobuf but a hell of a lot more readable (once you understand the standard).

If you are interested, I wrote a simple deserialiser in C#. If you would like to know more, feel free to reach out.

CBOR has also been extended with a way of signing and encrypting CBOR encoded objects, called COSE (CBOR Object Signing & Encryption), which is used by web authentication protocols like App-Attest.

Check out CDDL (Concise Data Definition Language) as a way to define your CBOR structures.
