Deserialising binary data in Rust

thiagomg

Thiago Massari Guedes

Posted on August 15, 2024

A common way to handle serialised data in C++ is to map the received bytes directly onto a struct. In Rust, with its ownership rules and its freedom to reorder struct fields to optimise memory layout, this takes a little more care.

For example, let's say a producer in C++ serialises this message:

struct book_msg {
    uint8_t major_version; // byte 0: message major version
    uint8_t minor_version; // byte 1: message minor version
    uint8_t msg_type;      // byte 2: message type
    uint8_t title[20];     // bytes 3-22: title of the book
};
// ....
auto msg = create_msg();
comm.send(&msg, sizeof(book_msg));

How can we deserialise that in Rust?

Creating the struct to be filled with the data

Rust does not guarantee that struct fields are laid out in declaration order, so we use #[repr(C, packed)] to tell the compiler to keep the fields in the declared order (repr(C)) and to insert no padding between them (packed).

#[repr(C, packed)]
struct BookMsg {
    pub major_version: u8, // byte 0: message major version
    pub minor_version: u8, // byte 1: message minor version
    pub msg_type: u8,      // byte 2: message type
    pub title: [u8; 20]    // bytes 3-22: title of the book
}
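
To double-check that the Rust struct really occupies the same 23 bytes as the C++ one, a compile-time assertion can help. This is a minimal sketch, assuming the struct is declared exactly as above; the assertions are an addition for illustration, not part of the original message definition.

```rust
use std::mem::{align_of, size_of};

/// Mirror of the C++ `book_msg`, declared as in the article.
#[repr(C, packed)]
pub struct BookMsg {
    pub major_version: u8, // byte 0
    pub minor_version: u8, // byte 1
    pub msg_type: u8,      // byte 2
    pub title: [u8; 20],   // bytes 3-22
}

// Compile-time layout checks: the build fails if the layout ever drifts.
const _: () = assert!(size_of::<BookMsg>() == 23); // 3 header bytes + 20 title bytes
const _: () = assert!(align_of::<BookMsg>() == 1); // packed: no alignment requirement
```

If a field is later added or resized on one side only, the size assertion stops the build instead of letting the two layouts diverge silently.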

Mapping the bytes to a struct

Let's say we received:

// 013854546865205369676e206f662074686520466f7572 in hex
let msg_vec: Vec<u8> = comm.recv();

To map this vector onto the struct, we have to step outside Rust's safety guarantees, because the compiler cannot verify at compile time that the received bytes match the size and field types of the struct.

Explanation of the two lines below:

  1. We cast the pointer from *const u8 to *const BookMsg
  2. Then we dereference it with * to access it as a value
  3. Finally, we take a reference to that value with &

let msg_bytes: *const u8 = msg_vec.as_ptr();  
let mapped_msg: &BookMsg = unsafe { &*(msg_bytes as *const BookMsg) };
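
The cast above trusts that the buffer holds at least 23 bytes. If the vector is shorter, reading through the reference is undefined behaviour. One way to guard against that is to check the length before casting. The sketch below assumes the same BookMsg layout; `view_book_msg` is a hypothetical helper name, not something from the original code.

```rust
use std::mem::size_of;

#[repr(C, packed)]
pub struct BookMsg {
    pub major_version: u8,
    pub minor_version: u8,
    pub msg_type: u8,
    pub title: [u8; 20],
}

/// Hypothetical helper: returns `None` when the buffer is too short
/// to contain a whole `BookMsg`.
pub fn view_book_msg(bytes: &[u8]) -> Option<&BookMsg> {
    if bytes.len() < size_of::<BookMsg>() {
        return None;
    }
    // SAFETY: the buffer holds at least size_of::<BookMsg>() bytes, BookMsg
    // has alignment 1 because it is packed, and every field is a u8 or an
    // array of u8, so any bit pattern is a valid value. The returned
    // reference borrows `bytes`, so it cannot outlive the buffer.
    Some(unsafe { &*(bytes.as_ptr() as *const BookMsg) })
}
```

Calling `view_book_msg(&msg_vec)` then yields `Some(&BookMsg)` for the 23-byte example message and `None` for a truncated one.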

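Beware: when accessing parts of the message, copy each value into a local variable instead of taking a reference to the field. Fields of a packed struct may be unaligned, and references to unaligned fields are not allowed. Every field here is a u8, so nothing is actually misaligned, but the habit matters as soon as the message contains wider types.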

let major = mapped_msg.major_version;
let minor = mapped_msg.minor_version;
let msg_type = mapped_msg.msg_type;
let title = mapped_msg.title; // [u8; 20] is Copy, so this copies the bytes

// Result:
// msg version=1.56, type=84
// msg title="The Sign of the Four"
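
The copied fields still have to be turned into the text shown in the result comment. A possible sketch follows; `describe` is a hypothetical helper, and it assumes the title is UTF-8 (plain ASCII in the example) with shorter titles padded by trailing NUL bytes, which the original message format does not specify.

```rust
#[repr(C, packed)]
pub struct BookMsg {
    pub major_version: u8,
    pub minor_version: u8,
    pub msg_type: u8,
    pub title: [u8; 20],
}

/// Hypothetical helper: formats a message like the "Result" comment above.
pub fn describe(msg: &BookMsg) -> String {
    // Copy each field out of the packed struct before using it.
    let major = msg.major_version;
    let minor = msg.minor_version;
    let msg_type = msg.msg_type;
    let title_bytes = msg.title;

    // Assumption: UTF-8 title, NUL-padded when shorter than 20 bytes.
    // Invalid UTF-8 sequences are replaced rather than causing an error.
    let title_text = String::from_utf8_lossy(&title_bytes);
    let title_text = title_text.trim_end_matches('\0');

    format!("msg version={major}.{minor}, type={msg_type}\nmsg title=\"{title_text}\"")
}
```

Printing `describe(mapped_msg)` for the example bytes gives the two result lines shown above.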

Summary

  1. Use structs with #[repr(C, packed)] to ensure the layout is preserved
  2. Use pointers to map to this struct
  3. Copy the value from the struct to an external variable to avoid memory alignment issues
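
The third point becomes more visible with a field wider than one byte. The sketch below uses a made-up `PricedMsg` struct (not part of the protocol above) whose u32 field starts at byte 1 and is therefore not 4-byte aligned. Copying the field by value works, and reading through a raw pointer with `read_unaligned` is an alternative.

```rust
use std::ptr::addr_of;

/// Hypothetical message, for illustration only: a one-byte type
/// followed directly by a four-byte value.
#[repr(C, packed)]
pub struct PricedMsg {
    pub msg_type: u8,
    pub price_cents: u32, // starts at byte 1, so it is not 4-byte aligned
}

/// Copying by value is fine: the compiler emits an unaligned load.
pub fn read_price(msg: &PricedMsg) -> u32 {
    // Writing `&msg.price_cents` here instead would be rejected by recent
    // compilers, because a `&u32` must always be 4-byte aligned.
    let price = msg.price_cents;
    price
}

/// Alternative: take a raw pointer to the field and read it unaligned.
pub fn read_price_via_ptr(msg: &PricedMsg) -> u32 {
    // SAFETY: the pointer is derived from a live reference, and
    // read_unaligned places no alignment requirement on it.
    unsafe { addr_of!(msg.price_cents).read_unaligned() }
}
```

Both functions return the value in the machine's native byte order; a message coming from a machine with a different endianness would need an explicit conversion, which this sketch leaves out.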

Source: Deserialising binary data in Rust
