Protocol buffer deep-dive
TECH SCHOOL
Posted on February 24, 2020
Welcome to the second hands-on lecture on protocol buffers. In the previous lecture, we learned some basic syntax and data types. Today we will dig deeper into them.
Here's the link to the full gRPC course playlist on YouTube
GitHub repository: pcbook-go and pcbook-java
GitLab repository: pcbook-go and pcbook-java
Here are the things that we're going to do:
- Define and use custom types in protocol-buffer message fields, such as enums or other messages.
- Discuss when to use nested types and when not to.
- Organise protobuf messages into multiple files, put them into a package, then import them into other places.
- Explore some well-known types that were already defined by Google.
- Learn about repeated fields and oneof fields.
- Use the go_package option to tell protoc to generate Go code with the package name that we want.
Multiple messages in 1 file
Let's start with the processor_message.proto file. We can define multiple messages in 1 file, so I will add a GPU message here. It makes sense because a GPU is also a processor.
syntax = "proto3";
message CPU {
  string brand = 1;
  string name = 2;
  uint32 number_cores = 3;
  uint32 number_threads = 4;
  double min_ghz = 5;
  double max_ghz = 6;
}
message GPU {
  string brand = 1;
  string name = 2;
  double min_ghz = 3;
  double max_ghz = 4;
  // memory ?
}
It shares several fields with the CPU, such as brand, name, and min and max frequency. The one difference is that it has its own memory.
Custom types: message and enum
Memory is a general concept that is also used in other places, such as the RAM or the storage (persistent drive). It can be measured in many different units, such as kilobyte, megabyte, gigabyte, or terabyte. So I will define it as a custom type in a separate memory_message.proto file, so that we can reuse it later.
pcbook
├── proto
│ ├── processor_message.proto
│ └── memory_message.proto
├── pb
│ └── processor_message.pb.go
├── main.go
└── Makefile
First, we need to define the measurement units. To do that, we will use enum. Because this unit should only exist within the context of the memory, we should define it as a nested type inside the Memory message.
syntax = "proto3";
message Memory {
  enum Unit {
    UNKNOWN = 0;
    BIT = 1;
    BYTE = 2;
    KILOBYTE = 3;
    MEGABYTE = 4;
    GIGABYTE = 5;
    TERABYTE = 6;
  }
  uint64 value = 1;
  Unit unit = 2;
}
In proto3, the first value of an enum must be assigned the number 0, and it acts as the default value. The convention is to reserve it for a special value such as UNKNOWN. Then we add the other units, from BIT to TERABYTE.
The Memory message will have 2 fields: one for the value and the other for the unit.
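To see why the 0 value matters, here is a minimal Go sketch. It does not use the real generated code: MemoryUnit and Memory are hand-written stand-ins (the names are assumptions for illustration) that only mirror the shape of an int32-based enum. A Memory whose unit was never set ends up with the 0 value, which is UNKNOWN.

```go
package main

import "fmt"

// MemoryUnit is a hand-written stand-in for the enum type that protoc
// would generate; it is not the generated code itself.
type MemoryUnit int32

const (
	MemoryUnknown  MemoryUnit = 0
	MemoryBit      MemoryUnit = 1
	MemoryByte     MemoryUnit = 2
	MemoryKilobyte MemoryUnit = 3
	MemoryMegabyte MemoryUnit = 4
	MemoryGigabyte MemoryUnit = 5
	MemoryTerabyte MemoryUnit = 6
)

var memoryUnitNames = map[MemoryUnit]string{
	MemoryUnknown:  "UNKNOWN",
	MemoryBit:      "BIT",
	MemoryByte:     "BYTE",
	MemoryKilobyte: "KILOBYTE",
	MemoryMegabyte: "MEGABYTE",
	MemoryGigabyte: "GIGABYTE",
	MemoryTerabyte: "TERABYTE",
}

// String returns the enum name, or the raw number for values that this
// code does not know about.
func (u MemoryUnit) String() string {
	if name, ok := memoryUnitNames[u]; ok {
		return name
	}
	return fmt.Sprintf("MemoryUnit(%d)", int32(u))
}

// Memory is a hand-written stand-in for the Memory message.
type Memory struct {
	Value uint64
	Unit  MemoryUnit
}

func main() {
	var unset Memory // Unit was never assigned, so it holds the zero value.
	ram := Memory{Value: 16, Unit: MemoryGigabyte}
	fmt.Println(unset.Unit, ram.Value, ram.Unit)
}
```

Because an unset field is indistinguishable from a field set to 0, keeping 0 for UNKNOWN avoids mistaking a missing unit for a real one such as BIT.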
Import proto files
Now let's go back to the processor_message.proto file. We have to import the memory_message.proto file in order to use the Memory type. And in the GPU message, we add a new memory field of type Memory.
syntax = "proto3";
import "memory_message.proto";
message CPU {
  string brand = 1;
  string name = 2;
  uint32 number_cores = 3;
  uint32 number_threads = 4;
  double min_ghz = 5;
  double max_ghz = 6;
}
message GPU {
  string brand = 1;
  string name = 2;
  double min_ghz = 3;
  double max_ghz = 4;
  Memory memory = 5;
}
Now if we try to generate Go code, there will be an error saying "inconsistent package names".
Because we haven't specified the package name in the proto files, protoc will by default use the file name as the Go package.
protoc throws an error here because the 2 generated Go files would belong to 2 different packages, but in Go we cannot put files of different packages in the same folder, in this case the pb folder.
Set package name
To fix it, we must tell protoc to put the generated code in the same package by adding this statement to our proto files: package techschool.pcbook;
The memory_message.proto file:
syntax = "proto3";
package techschool.pcbook;
message Memory {
  enum Unit {
    UNKNOWN = 0;
    BIT = 1;
    BYTE = 2;
    KILOBYTE = 3;
    MEGABYTE = 4;
    GIGABYTE = 5;
    TERABYTE = 6;
  }
  uint64 value = 1;
  Unit unit = 2;
}
The processor_message.proto file:
syntax = "proto3";
package techschool.pcbook;
import "memory_message.proto";
message CPU {
  string brand = 1;
  string name = 2;
  uint32 number_cores = 3;
  uint32 number_threads = 4;
  double min_ghz = 5;
  double max_ghz = 6;
}
message GPU {
  string brand = 1;
  string name = 2;
  double min_ghz = 3;
  double max_ghz = 4;
  Memory memory = 5;
}
Now if we run make gen again, it will work, and the 2 generated Go files will belong to the same package techschool_pcbook. Protoc uses an underscore here because a Go package name cannot contain a dot.
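The renaming described above can be sketched in a few lines of Go. The helper goPackageName is hypothetical and only models the dot-to-underscore substitution seen in this lecture; it is not the actual logic inside protoc or its Go plugin.

```go
package main

import (
	"fmt"
	"strings"
)

// goPackageName is a hypothetical helper that mimics the renaming seen in
// this lecture: every dot in the proto package becomes an underscore,
// because a Go package name must be a valid identifier.
func goPackageName(protoPackage string) string {
	return strings.ReplaceAll(protoPackage, ".", "_")
}

func main() {
	fmt.Println(goPackageName("techschool.pcbook"))
}
```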
File memory_message.pb.go:
package techschool_pcbook
import (
fmt "fmt"
proto "github.com/golang/protobuf/proto"
math "math"
)
// Reference imports to suppress errors if they are not otherwise used.
var _ = proto.Marshal
var _ = fmt.Errorf
var _ = math.Inf
// This is a compile-time assertion to ensure that this generated file
// is compatible with the proto package it is being compiled against.
// A compilation error at this line likely means your copy of the
// proto package needs to be updated.
const _ = proto.ProtoPackageIsVersion3 // please upgrade the proto package
type Memory_Unit int32
const (
Memory_UNKNOWN Memory_Unit = 0
Memory_BIT Memory_Unit = 1
Memory_BYTE Memory_Unit = 2
Memory_KILOBYTE Memory_Unit = 3
Memory_MEGABYTE Memory_Unit = 4
Memory_GIGABYTE Memory_Unit = 5
Memory_TERABYTE Memory_Unit = 6
)
...
File processor_message.pb.go:
package techschool_pcbook
import (
fmt "fmt"
proto "github.com/golang/protobuf/proto"
math "math"
)
// Reference imports to suppress errors if they are not otherwise used.
var _ = proto.Marshal
var _ = fmt.Errorf
var _ = math.Inf
// This is a compile-time assertion to ensure that this generated file
// is compatible with the proto package it is being compiled against.
// A compilation error at this line likely means your copy of the
// proto package needs to be updated.
const _ = proto.ProtoPackageIsVersion3 // please upgrade the proto package
type CPU struct {
Brand string `protobuf:"bytes,1,opt,name=brand,proto3" json:"brand,omitempty"`
Name string `protobuf:"bytes,2,opt,name=name,proto3" json:"name,omitempty"`
NumberCores uint32 `protobuf:"varint,3,opt,name=number_cores,json=numberCores,proto3" json:"number_cores,omitempty"`
NumberThreads uint32 `protobuf:"varint,4,opt,name=number_threads,json=numberThreads,proto3" json:"number_threads,omitempty"`
MinGhz float64 `protobuf:"fixed64,5,opt,name=min_ghz,json=minGhz,proto3" json:"min_ghz,omitempty"`
MaxGhz float64 `protobuf:"fixed64,6,opt,name=max_ghz,json=maxGhz,proto3" json:"max_ghz,omitempty"`
XXX_NoUnkeyedLiteral struct{} `json:"-"`
XXX_unrecognized []byte `json:"-"`
XXX_sizecache int32 `json:"-"`
}
...
Update proto_path setting for vscode
There's one thing I want to show you here. Let's get back to our processor_message.proto file. Although we have successfully generated the Go code, vscode still shows some red lines on the Memory type and the import statement.
The problem is that, by default, the vscode-proto3 extension uses our current working folder as the proto_path when it runs protoc for code analysis. So it looks for memory_message.proto directly in the pcbook folder and cannot find it there, because the file lives in the proto subfolder.
If we change the import path to proto/memory_message.proto then it won't complain anymore. However, I don't want to do that because later we will use these proto files in our Java project with a different directory structure.
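The lookup behind this can be pictured with a small Go sketch. It is a simplified model, not protoc's real implementation: the hypothetical helper importCandidates just joins each proto_path root with the import string, which is enough to show why the same import resolves with one root and not with the other.

```go
package main

import (
	"fmt"
	"path"
)

// importCandidates is a hypothetical helper. For an import statement it
// lists the file paths that would be tried, one per proto_path root, in
// this simplified model of how imports are resolved.
func importCandidates(protoPaths []string, importPath string) []string {
	candidates := make([]string, 0, len(protoPaths))
	for _, root := range protoPaths {
		candidates = append(candidates, path.Join(root, importPath))
	}
	return candidates
}

func main() {
	// Root is the workspace folder: the file is expected next to the Makefile.
	fmt.Println(importCandidates([]string{"."}, "memory_message.proto"))
	// Root is the proto folder: the file is expected where it really is.
	fmt.Println(importCandidates([]string{"proto"}, "memory_message.proto"))
}
```

With the workspace folder as the root, only the import string proto/memory_message.proto would match our layout, which is exactly the change we want to avoid.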
So I'm gonna show you how to fix this by changing the proto_path setting of the vscode-proto3 extension. Let's open the extension tab and look for vscode-proto3.
We copy these settings and paste them into the settings.json file of vscode.
{
  "workbench.colorTheme": "Material Theme Palenight",
  "workbench.iconTheme": "material-icon-theme",
  "editor.minimap.enabled": false,
  "editor.formatOnSave": true,
  "explorer.openEditors.visible": 0,
  "protoc": {
    "path": "/usr/local/bin/protoc",
    "options": [
      "--proto_path=proto"
    ]
  }
}
We can get the protoc path by running which protoc in the terminal. Normally it is /usr/local/bin/protoc. Then the --proto_path option should be set to proto. Now after we save the settings.json file and restart vscode, the error will be gone.
Install clang-format to automatically format code
By the way, in the last lecture, we installed the extension that calls the clang-format library. However, the code is not automatically formatted on save.
The reason is: we haven't installed the library yet. So let's install it with Homebrew.
brew install clang-format
Then restart Visual Studio Code. Now the code will be automatically formatted when we save the file.
Define Storage message
Let's continue with our project. I'm gonna create a new message for the storage in the storage_message.proto file.
pcbook
├── proto
│ ├── processor_message.proto
│ ├── memory_message.proto
│ └── storage_message.proto
├── pb
│ ├── processor_message.pb.go
│ └── memory_message.pb.go
├── main.go
└── Makefile
A storage device could be a hard disk drive (HDD) or a solid state drive (SSD). So we should define a Driver enum with these 2 values.
syntax = "proto3";
package techschool.pcbook;
import "memory_message.proto";
message Storage {
  enum Driver {
    UNKNOWN = 0;
    HDD = 1;
    SSD = 2;
  }
  Driver driver = 1;
  Memory memory = 2;
}
Then add 2 fields to the storage message: the driver type, and the memory size.
Use option to generate custom package name for Go
The Go package name techschool_pcbook that protoc generates for us is a bit too long, and doesn't match the name of the pb folder that contains the Go files.
So I want to tell it to use pb as the package name, but just for Go, because Java or other languages will use a different package naming convention.
We can do that by setting option go_package = "pb" in our proto files.
File storage_message.proto:
syntax = "proto3";
package techschool.pcbook;
option go_package = "pb";
import "memory_message.proto";
message Storage {
  enum Driver {
    UNKNOWN = 0;
    HDD = 1;
    SSD = 2;
  }
  Driver driver = 1;
  Memory memory = 2;
}
File memory_message.proto:
syntax = "proto3";
package techschool.pcbook;
option go_package = "pb";
message Memory {
  enum Unit {
    UNKNOWN = 0;
    BIT = 1;
    BYTE = 2;
    KILOBYTE = 3;
    MEGABYTE = 4;
    GIGABYTE = 5;
    TERABYTE = 6;
  }
  uint64 value = 1;
  Unit unit = 2;
}
File processor_message.proto:
syntax = "proto3";
package techschool.pcbook;
option go_package = "pb";
import "memory_message.proto";
message CPU {
  string brand = 1;
  string name = 2;
  uint32 number_cores = 3;
  uint32 number_threads = 4;
  double min_ghz = 5;
  double max_ghz = 6;
}
message GPU {
  string brand = 1;
  string name = 2;
  double min_ghz = 3;
  double max_ghz = 4;
  Memory memory = 5;
}
Now if we run make gen to generate code, all the generated Go files will use the same pb package.
pcbook
├── proto
│ ├── processor_message.proto
│ ├── memory_message.proto
│ └── storage_message.proto
├── pb
│ ├── processor_message.pb.go
│ ├── memory_message.pb.go
│ └── storage_message.pb.go
├── main.go
└── Makefile
File storage_message.pb.go:
package pb
import (
fmt "fmt"
proto "github.com/golang/protobuf/proto"
math "math"
)
// Reference imports to suppress errors if they are not otherwise used.
var _ = proto.Marshal
var _ = fmt.Errorf
var _ = math.Inf
// This is a compile-time assertion to ensure that this generated file
// is compatible with the proto package it is being compiled against.
// A compilation error at this line likely means your copy of the
// proto package needs to be updated.
const _ = proto.ProtoPackageIsVersion3 // please upgrade the proto package
type Storage_Driver int32
const (
Storage_UNKNOWN Storage_Driver = 0
Storage_HDD Storage_Driver = 1
Storage_SSD Storage_Driver = 2
)
var Storage_Driver_name = map[int32]string{
0: "UNKNOWN",
1: "HDD",
2: "SSD",
}
var Storage_Driver_value = map[string]int32{
"UNKNOWN": 0,
"HDD": 1,
"SSD": 2,
}
func (x Storage_Driver) String() string {
return proto.EnumName(Storage_Driver_name, int32(x))
}
func (Storage_Driver) EnumDescriptor() ([]byte, []int) {
return fileDescriptor_170f09d838bd8a04, []int{0, 0}
}
type Storage struct {
Driver Storage_Driver `protobuf:"varint,1,opt,name=driver,proto3,enum=techschool.pcbook.Storage_Driver" json:"driver,omitempty"`
Memory *Memory `protobuf:"bytes,2,opt,name=memory,proto3" json:"memory,omitempty"`
XXX_NoUnkeyedLiteral struct{} `json:"-"`
XXX_unrecognized []byte `json:"-"`
XXX_sizecache int32 `json:"-"`
}
...
File memory_message.pb.go:
package pb
import (
fmt "fmt"
proto "github.com/golang/protobuf/proto"
math "math"
)
// Reference imports to suppress errors if they are not otherwise used.
var _ = proto.Marshal
var _ = fmt.Errorf
var _ = math.Inf
// This is a compile-time assertion to ensure that this generated file
// is compatible with the proto package it is being compiled against.
// A compilation error at this line likely means your copy of the
// proto package needs to be updated.
const _ = proto.ProtoPackageIsVersion3 // please upgrade the proto package
type Memory_Unit int32
const (
Memory_UNKNOWN Memory_Unit = 0
Memory_BIT Memory_Unit = 1
Memory_BYTE Memory_Unit = 2
Memory_KILOBYTE Memory_Unit = 3
Memory_MEGABYTE Memory_Unit = 4
Memory_GIGABYTE Memory_Unit = 5
Memory_TERABYTE Memory_Unit = 6
)
var Memory_Unit_name = map[int32]string{
0: "UNKNOWN",
1: "BIT",
2: "BYTE",
3: "KILOBYTE",
4: "MEGABYTE",
5: "GIGABYTE",
6: "TERABYTE",
}
var Memory_Unit_value = map[string]int32{
"UNKNOWN": 0,
"BIT": 1,
"BYTE": 2,
"KILOBYTE": 3,
"MEGABYTE": 4,
"GIGABYTE": 5,
"TERABYTE": 6,
}
func (x Memory_Unit) String() string {
return proto.EnumName(Memory_Unit_name, int32(x))
}
func (Memory_Unit) EnumDescriptor() ([]byte, []int) {
return fileDescriptor_c0c7f919ccc765da, []int{0, 0}
}
type Memory struct {
Value uint64 `protobuf:"varint,1,opt,name=value,proto3" json:"value,omitempty"`
Unit Memory_Unit `protobuf:"varint,2,opt,name=unit,proto3,enum=techschool.pcbook.Memory_Unit" json:"unit,omitempty"`
XXX_NoUnkeyedLiteral struct{} `json:"-"`
XXX_unrecognized []byte `json:"-"`
XXX_sizecache int32 `json:"-"`
}
...
File processor_message.pb.go:
package pb
import (
fmt "fmt"
proto "github.com/golang/protobuf/proto"
math "math"
)
// Reference imports to suppress errors if they are not otherwise used.
var _ = proto.Marshal
var _ = fmt.Errorf
var _ = math.Inf
// This is a compile-time assertion to ensure that this generated file
// is compatible with the proto package it is being compiled against.
// A compilation error at this line likely means your copy of the
// proto package needs to be updated.
const _ = proto.ProtoPackageIsVersion3 // please upgrade the proto package
type CPU struct {
Brand string `protobuf:"bytes,1,opt,name=brand,proto3" json:"brand,omitempty"`
Name string `protobuf:"bytes,2,opt,name=name,proto3" json:"name,omitempty"`
NumberCores uint32 `protobuf:"varint,3,opt,name=number_cores,json=numberCores,proto3" json:"number_cores,omitempty"`
NumberThreads uint32 `protobuf:"varint,4,opt,name=number_threads,json=numberThreads,proto3" json:"number_threads,omitempty"`
MinGhz float64 `protobuf:"fixed64,5,opt,name=min_ghz,json=minGhz,proto3" json:"min_ghz,omitempty"`
MaxGhz float64 `protobuf:"fixed64,6,opt,name=max_ghz,json=maxGhz,proto3" json:"max_ghz,omitempty"`
XXX_NoUnkeyedLiteral struct{} `json:"-"`
XXX_unrecognized []byte `json:"-"`
XXX_sizecache int32 `json:"-"`
}
...
Define Keyboard message
Next, we will define the keyboard message. It can have a QWERTY, QWERTZ, or AZERTY layout. For your information, QWERTZ is widely used in Germany, while AZERTY is more popular in France.
syntax = "proto3";
package techschool.pcbook;
option go_package = "pb";
message Keyboard {
  enum Layout {
    UNKNOWN = 0;
    QWERTY = 1;
    QWERTZ = 2;
    AZERTY = 3;
  }
  Layout layout = 1;
  bool backlit = 2;
}
The keyboard can be backlit or not, so we use a boolean field for it. Very simple, right?
Define Screen message
Now let's write a more complex message: the screen. It has a nested message type: Resolution. We use a nested type here because a resolution is closely tied to the screen and doesn't have any meaning when standing alone.
syntax = "proto3";
package techschool.pcbook;
option go_package = "pb";
message Screen {
  message Resolution {
    uint32 width = 1;
    uint32 height = 2;
  }
  enum Panel {
    UNKNOWN = 0;
    IPS = 1;
    OLED = 2;
  }
  float size_inch = 1;
  Resolution resolution = 2;
  Panel panel = 3;
  bool multitouch = 4;
}
Similarly, we have an enum for the screen panel, which can be IPS or OLED. Then comes the screen size in inches, and finally a bool field that tells whether it's a multitouch screen or not.
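On the Go side, a nested message becomes its own struct, named after its parent in the same way as Memory_Unit and Storage_Driver above, and a message-typed field is a pointer that stays nil until it is set. The sketch below imitates that with hand-written stand-ins (ScreenResolution, Screen, and the helper pixelCount are assumptions for illustration, not the generated code) and uses nil-safe getters in the style that the generated code provides.

```go
package main

import "fmt"

// ScreenResolution is a hand-written stand-in for the type generated from
// the nested Resolution message.
type ScreenResolution struct {
	Width  uint32
	Height uint32
}

// Screen is a hand-written stand-in for the Screen message.
type Screen struct {
	SizeInch   float32
	Resolution *ScreenResolution // nil when the field is not set
}

// The getters below are safe to call on a nil receiver, so a chain such
// as s.GetResolution().GetWidth() never panics.
func (s *Screen) GetResolution() *ScreenResolution {
	if s != nil {
		return s.Resolution
	}
	return nil
}

func (r *ScreenResolution) GetWidth() uint32 {
	if r != nil {
		return r.Width
	}
	return 0
}

func (r *ScreenResolution) GetHeight() uint32 {
	if r != nil {
		return r.Height
	}
	return 0
}

// pixelCount is a hypothetical helper that returns width times height,
// or 0 when the screen or its resolution is missing.
func pixelCount(s *Screen) uint64 {
	r := s.GetResolution()
	return uint64(r.GetWidth()) * uint64(r.GetHeight())
}

func main() {
	full := &Screen{SizeInch: 15.6, Resolution: &ScreenResolution{Width: 1920, Height: 1080}}
	fmt.Println(pixelCount(full))
	fmt.Println(pixelCount(&Screen{})) // resolution not set
	fmt.Println(pixelCount(nil))       // no screen at all
}
```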
Define Laptop message
Alright, I think basically we've defined all necessary components of a laptop. So let's define the laptop message now.
syntax = "proto3";
package techschool.pcbook;
option go_package = "pb";
import "processor_message.proto";
import "memory_message.proto";
message Laptop {
  string id = 1;
  string brand = 2;
  string name = 3;
  CPU cpu = 4;
  Memory ram = 5;
}
It has a unique identifier of type string. This ID will be automatically generated by the server. It has a brand and a name. Then a CPU and RAM. We need to import other proto files to use these types.
Repeated field
A laptop can have more than 1 GPU, so we use the repeated keyword to tell protoc that this is a list of GPUs.
Similarly, it's normal for a laptop to have multiple storage devices, so this field should be repeated as well.
syntax = "proto3";
package techschool.pcbook;
option go_package = "pb";
import "processor_message.proto";
import "memory_message.proto";
import "storage_message.proto";
import "screen_message.proto";
import "keyboard_message.proto";
message Laptop {
  string id = 1;
  string brand = 2;
  string name = 3;
  CPU cpu = 4;
  Memory ram = 5;
  repeated GPU gpus = 6;
  repeated Storage storages = 7;
  Screen screen = 8;
  Keyboard keyboard = 9;
}
Then come 2 normal fields: screen and keyboard. It's pretty straightforward.
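In Go, a repeated message field shows up as a slice of pointers, so working with it is plain slice handling. The sketch below uses hand-written stand-ins for the GPU and Laptop messages (trimmed to a few fields) plus a hypothetical helper gpuNames; the GPU names are made up.

```go
package main

import "fmt"

// GPU and Laptop are hand-written stand-ins for the generated structs,
// trimmed down to the fields needed here.
type GPU struct {
	Brand string
	Name  string
}

type Laptop struct {
	Name string
	Gpus []*GPU // models: repeated GPU gpus = 6;
}

// gpuNames is a hypothetical helper that collects the name of every GPU.
func gpuNames(l *Laptop) []string {
	names := make([]string, 0, len(l.Gpus))
	for _, g := range l.Gpus {
		names = append(names, g.Name)
	}
	return names
}

func main() {
	laptop := &Laptop{Name: "example laptop"}
	// A repeated field that was never touched is simply an empty list.
	fmt.Println(len(laptop.Gpus))

	laptop.Gpus = append(laptop.Gpus,
		&GPU{Brand: "brand-a", Name: "integrated"},
		&GPU{Brand: "brand-b", Name: "discrete"},
	)
	fmt.Println(gpuNames(laptop))
}
```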
Oneof field
How about the weight of the laptop? Let's say we allow it to be specified in either kilograms or pounds. In order to do that, we can use a new keyword: oneof.
syntax = "proto3";
package techschool.pcbook;
option go_package = "pb";
import "processor_message.proto";
import "memory_message.proto";
import "storage_message.proto";
import "screen_message.proto";
import "keyboard_message.proto";
message Laptop {
  string id = 1;
  string brand = 2;
  string name = 3;
  CPU cpu = 4;
  Memory ram = 5;
  repeated GPU gpus = 6;
  repeated Storage storages = 7;
  Screen screen = 8;
  Keyboard keyboard = 9;
  oneof weight {
    double weight_kg = 10;
    double weight_lb = 11;
  }
}
In this block, we define 2 fields, one for kilograms and the other for pounds. Remember that when you use a oneof group, only the field that gets assigned last will keep its value.
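The "last assignment wins" rule is easy to see in Go, where a oneof group is represented as a single interface-typed field with one small wrapper struct per case. The sketch below mirrors that shape with hand-written types; the names laptopWeight, WeightKg, and WeightLb are assumptions for illustration and differ from the names in the generated code.

```go
package main

import "fmt"

// laptopWeight is the interface that every case of the oneof implements.
type laptopWeight interface{ isLaptopWeight() }

// WeightKg and WeightLb wrap the two possible cases of the oneof.
type WeightKg struct{ WeightKg float64 }
type WeightLb struct{ WeightLb float64 }

func (*WeightKg) isLaptopWeight() {}
func (*WeightLb) isLaptopWeight() {}

// Laptop is a hand-written stand-in that keeps only the oneof group.
type Laptop struct {
	Weight laptopWeight // holds at most one case at a time
}

// GetWeightKg returns the weight in kilograms, or 0 if another case
// (or no case) is set.
func (l *Laptop) GetWeightKg() float64 {
	if w, ok := l.Weight.(*WeightKg); ok {
		return w.WeightKg
	}
	return 0
}

// GetWeightLb returns the weight in pounds, or 0 if another case
// (or no case) is set.
func (l *Laptop) GetWeightLb() float64 {
	if w, ok := l.Weight.(*WeightLb); ok {
		return w.WeightLb
	}
	return 0
}

func main() {
	laptop := &Laptop{}
	laptop.Weight = &WeightKg{WeightKg: 2.5}
	laptop.Weight = &WeightLb{WeightLb: 5.5} // replaces the kilogram value
	fmt.Println(laptop.GetWeightKg(), laptop.GetWeightLb())
}
```

Since there is only one slot, assigning the pounds case necessarily discards the kilograms case; there is no way for both to be set at once.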
Well-known types
Then we add 2 more fields: the price in USD and the release year of the laptop. And finally, we need a timestamp field to store the last update time of the record in our system.
Timestamp is one of the well-known types that have already been defined by Google, so we just need to import the package and use it.
syntax = "proto3";
package techschool.pcbook;
option go_package = "pb";
import "processor_message.proto";
import "memory_message.proto";
import "storage_message.proto";
import "screen_message.proto";
import "keyboard_message.proto";
import "google/protobuf/timestamp.proto";
message Laptop {
  string id = 1;
  string brand = 2;
  string name = 3;
  CPU cpu = 4;
  Memory ram = 5;
  repeated GPU gpus = 6;
  repeated Storage storages = 7;
  Screen screen = 8;
  Keyboard keyboard = 9;
  oneof weight {
    double weight_kg = 10;
    double weight_lb = 11;
  }
  double price_usd = 12;
  uint32 release_year = 13;
  google.protobuf.Timestamp updated_at = 14;
}
There are many other well-known types. Please check out this link to learn more about them.
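Under the hood, google.protobuf.Timestamp is just a message with 2 fields: seconds since the Unix epoch and nanoseconds within that second. The Go sketch below models it with a hand-written struct and 2 conversion helpers (Timestamp, NewTimestamp, and AsTime here are local stand-ins written for illustration). In a real project we would use the Timestamp type that ships with the protobuf Go library instead.

```go
package main

import (
	"fmt"
	"time"
)

// Timestamp is a hand-written stand-in with the same two fields as
// google.protobuf.Timestamp.
type Timestamp struct {
	Seconds int64 // seconds since 1970-01-01T00:00:00Z
	Nanos   int32 // nanoseconds within the second, 0 to 999,999,999
}

// NewTimestamp converts a Go time into the seconds/nanos representation.
func NewTimestamp(t time.Time) Timestamp {
	return Timestamp{Seconds: t.Unix(), Nanos: int32(t.Nanosecond())}
}

// AsTime converts the seconds/nanos representation back into a UTC time.
func (ts Timestamp) AsTime() time.Time {
	return time.Unix(ts.Seconds, int64(ts.Nanos)).UTC()
}

func main() {
	updatedAt := NewTimestamp(time.Now())
	fmt.Println(updatedAt.Seconds, updatedAt.Nanos)
	fmt.Println(updatedAt.AsTime().Format(time.RFC3339Nano))
}
```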
Now we can run make gen to generate Go code for all of the messages.
pcbook
├── proto
│ ├── processor_message.proto
│ ├── memory_message.proto
│ ├── storage_message.proto
│ ├── keyboard_message.proto
│ ├── screen_message.proto
│ └── laptop_message.proto
├── pb
│ ├── processor_message.pb.go
│ ├── memory_message.pb.go
│ ├── storage_message.pb.go
│ ├── keyboard_message.pb.go
│ ├── screen_message.pb.go
│ └── laptop_message.pb.go
├── main.go
└── Makefile
Hooray! We've learned a lot about protocol buffers and how to generate Go code from them. In the next hands-on lecture, I will show you how to set up a Gradle project to automatically generate Java code from our proto files.
Thanks a lot for reading, and see you later!