Deserialising binary data in Rust
Thiago Massari Guedes
Posted on August 15, 2024
A common way to deal with serialised data in C++ is to map the bytes directly onto a struct. In Rust, with its ownership rules and its freedom to reorder struct fields as a layout optimisation, this is a little more complicated.
For example, let's say a producer in C++ serialises this message:
struct book_msg {
    uint8_t major_version; // byte 0: message major version
    uint8_t minor_version; // byte 1: message minor version
    uint8_t msg_type;      // byte 2: message type
    uint8_t title[20];     // bytes 3-22: title of the book
};
// ....
auto msg = create_msg();
comm.send(&msg, sizeof(book_msg));
How can we deserialise that in Rust?
Creating the struct to be filled with the data
Rust does not guarantee that struct fields are laid out in declaration order, so we need to use #[repr(C, packed)]: repr(C) tells the compiler to keep the fields in declaration order, and packed removes any padding between them.
#[repr(C, packed)]
struct BookMsg {
pub major_version: u8, // byte 0: message major version
pub minor_version: u8, // byte 1: message minor version
pub msg_type: u8, // byte 2: message type
pub title: [u8; 20], // bytes 3-22: title of the book
}
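As a sanity check on the layout, the size of the struct can be compared with the size of the message on the wire. The following is a minimal sketch; the constant BOOK_MSG_WIRE_SIZE is a hypothetical name introduced here for illustration, not something from the producer code.

```rust
use std::mem::{align_of, size_of};

// Same layout as the BookMsg struct above.
#[repr(C, packed)]
#[allow(dead_code)]
struct BookMsg {
    pub major_version: u8,
    pub minor_version: u8,
    pub msg_type: u8,
    pub title: [u8; 20],
}

// Hypothetical constant: 3 header bytes + 20 title bytes.
const BOOK_MSG_WIRE_SIZE: usize = 23;

// Compile-time check: the build fails if the struct size drifts from the wire size.
const _: () = assert!(size_of::<BookMsg>() == BOOK_MSG_WIRE_SIZE);

fn main() {
    // prints "size = 23, align = 1"
    println!("size = {}, align = {}", size_of::<BookMsg>(), align_of::<BookMsg>());
}
```

With the check at compile time, a field added on one side only shows up as a build error rather than as corrupted data at runtime.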
Mapping the bytes to a struct
Let's say we received:
// 013854546865205369676e206f662074686520466f7572 in hex
let msg_vec: Vec<u8> = comm.recv();
To map this vector to a struct, we need to step outside Rust's safety guarantees, as the compiler cannot verify at compile time that the vector's size and contents match the struct.
Explanation of the two lines below:
- We cast the pointer from *const u8 to *const BookMsg
- Then we dereference it with * to access it as a value
- And finally, we take a reference to this value with &
let msg_bytes: *const u8 = msg_vec.as_ptr();
let mapped_msg: &BookMsg = unsafe { &*(msg_bytes as *const BookMsg) };
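The cast above does not check that the buffer is long enough to hold a whole message. A guarded version could look roughly like the sketch below. The helper map_book_msg is a hypothetical name introduced here, and the sample bytes are the ones from the example message.

```rust
use std::mem::size_of;

#[repr(C, packed)]
#[allow(dead_code)]
struct BookMsg {
    pub major_version: u8,
    pub minor_version: u8,
    pub msg_type: u8,
    pub title: [u8; 20],
}

// Hypothetical helper (not from the original code): returns None when the
// buffer is too short to contain a full BookMsg.
fn map_book_msg(bytes: &[u8]) -> Option<&BookMsg> {
    if bytes.len() < size_of::<BookMsg>() {
        return None;
    }
    // SAFETY: the buffer holds at least size_of::<BookMsg>() bytes, BookMsg
    // has alignment 1 because it is packed, and every field is a u8 or an
    // array of u8, so any bit pattern is a valid value.
    Some(unsafe { &*(bytes.as_ptr() as *const BookMsg) })
}

fn main() {
    // Sample message: version 1.56, type 84, followed by the title.
    let msg_vec: Vec<u8> = b"\x01\x38\x54The Sign of the Four".to_vec();

    match map_book_msg(&msg_vec) {
        Some(msg) => {
            // Copy the fields into locals before using them.
            let (major, minor, msg_type) = (msg.major_version, msg.minor_version, msg.msg_type);
            println!("msg version={}.{}, type={}", major, minor, msg_type);
        }
        None => println!("buffer too short for BookMsg"),
    }

    // A truncated buffer is rejected instead of being read out of bounds.
    assert!(map_book_msg(&msg_vec[..10]).is_none());
}
```

Returning an Option keeps the unsafe block in one place, and the returned reference borrows from the input slice, so it cannot outlive the received buffer.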
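Beware: because the struct is packed, a multi-byte field may sit at an unaligned address, and taking a reference to such a field is not allowed. When accessing parts of the message, it's better to copy the value into a local variable first. In this message every field is a u8 or an array of u8, so the alignment is already 1, but copying is a good habit for messages that also carry wider fields.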
let major = mapped_msg.major_version;
let minor = mapped_msg.minor_version;
let msg_type = mapped_msg.msg_type;
let title = mapped_msg.title;
// Result:
// msg version=1.56, type=84
// msg title="The Sign of the Four"
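To print the title as text, the 20 title bytes still have to be converted to a string. The sketch below assumes the title is UTF-8 and that shorter titles are padded with NUL bytes; neither is stated by the producer code, so treat both as assumptions. The helper title_str is a hypothetical name.

```rust
use std::str::Utf8Error;

// Hypothetical helper: interprets the title bytes as UTF-8, stopping at the
// first NUL byte (assumed padding) or at the end of the array.
fn title_str(title: &[u8; 20]) -> Result<&str, Utf8Error> {
    let end = title.iter().position(|&b| b == 0).unwrap_or(title.len());
    std::str::from_utf8(&title[..end])
}

fn main() {
    // A copy of the title field, as taken from the mapped message.
    let title: [u8; 20] = *b"The Sign of the Four";

    match title_str(&title) {
        Ok(text) => println!("msg title=\"{}\"", text),
        Err(err) => println!("title is not valid UTF-8: {}", err),
    }
}
```

Returning a Result rather than unwrapping leaves it to the caller to decide what to do with a message whose title is not valid text.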
Summary
- Use structs with #[repr(C, packed)] to ensure the layout is preserved
- Use pointers to map to this struct
- Copy the value from the struct to an external variable to avoid memory alignment issues