atanda nafiu
Posted on August 24, 2023
Streaming large data sets carries a heavy cost in memory and I/O performance. A modern way to serialize and deserialize data is protocol buffers, which Google designed. With the popularity of microservice architectures, services exchange large volumes of messages, and text-based formats make that communication expensive. Protocol buffers were designed to outperform formats like JSON and XML by using a compact, strictly typed binary encoding. gRPC is a communication model that takes the protocol buffer format further. To simplify the developer experience and improve efficiency, a gRPC API uses protocol buffers for its API definition.
What are Protocol Buffers
"Protocol buffers are a flexible, efficient, automated mechanism for serializing structured data – think XML, but smaller, faster, and simpler. You define how you want your data to be structured once. Then you can use specially generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages. You can even update your data structure without breaking deployed programs compiled against the "old" format.” — Google
Protocol buffers, commonly known as Protobuf, are a mechanism designed by Google to serialize structured data, like JSON and XML, but strictly typed. Protobuf helps structure data and can be used in languages like Go, Java, and Python. Protobuf is crucial in microservices, where a lot of data is transferred across services.
Why Protocol Buffers are Needed
Modern API styles such as gRPC, REST, and GraphQL all depend on fast serialization and deserialization of structured data. Protobuf suits communication protocols that generate thousands of messages, and it is available in many languages.
Protobuf serializes structured data into a compact format, which matters for large messages, and the same format is used for long-term data storage. A message definition can change, including adding new fields and deleting existing ones, without having to update the existing data. Protobuf is used primarily for interchanging structured information, inter-server communication, and archival data storage on disk. Other companies built similar tools: Facebook created Thrift (now Apache Thrift), Amazon created Ion, and Microsoft created Bond. gRPC builds on protobuf by offering a concrete RPC protocol stack for defining services.
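To make the compatibility claim concrete, here is a minimal sketch, using only the Go standard library, of how an older reader can skip fields it does not know about. It hand-decodes two wire types of the protobuf binary format (varint and length-delimited). The readID function and the sample bytes are illustrative assumptions, not what generated protobuf code looks like.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// readID is a hypothetical "old" reader that only knows field 1 (a varint id).
// Every other field is skipped based on its wire type.
func readID(buf []byte) (uint64, error) {
	var id uint64
	for len(buf) > 0 {
		// Each field starts with a key: (field number << 3) | wire type.
		key, n := binary.Uvarint(buf)
		if n <= 0 {
			return 0, errors.New("bad field key")
		}
		buf = buf[n:]
		fieldNum, wireType := key>>3, key&7
		switch wireType {
		case 0: // varint
			v, n := binary.Uvarint(buf)
			if n <= 0 {
				return 0, errors.New("bad varint")
			}
			buf = buf[n:]
			if fieldNum == 1 {
				id = v
			}
		case 2: // length-delimited (strings, bytes, nested messages)
			l, n := binary.Uvarint(buf)
			if n <= 0 || uint64(len(buf)-n) < l {
				return 0, errors.New("bad length")
			}
			buf = buf[n+int(l):] // skip the payload without interpreting it
		default:
			return 0, fmt.Errorf("wire type %d not handled in this sketch", wireType)
		}
	}
	return id, nil
}

func main() {
	// Bytes written by a "new" program: field 1 = 150 (varint), field 3 = "hi" (string).
	msg := []byte{0x08, 0x96, 0x01, 0x1a, 0x02, 'h', 'i'}
	id, err := readID(msg)
	fmt.Println(id, err) // prints: 150 <nil>
}
```

The old reader still gets its id even though the message carries a field that was added later. That is the property that lets deployed programs keep working when a definition grows.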
Prerequisites
- Basic knowledge of data structures.
- A working installation of Go
- Basic understanding of Go.
- Download the .proto tooling on your machine and install the code editor extension for easy formatting.
Protocol Buffers Language
Protobuf has its own minimalistic language syntax. When you compile a protobuf file, the compiler generates a file for a specific programming language; in your case, you will get a .go file with a struct that maps to the protocol file.
Types of Protocol Buffers
- Scalar value
- Enumeration and Repeated value
- Nested value
Scalar Values
Scalar types in protobuf are similar to Go's basic types. Examples include int32, int64, string, and bool. Below is a comparison of some Go types and their protobuf scalar types.
Go type | Protobuf type |
---|---|
float32 | float |
float64 | double |
uint32 | fixed32 |
uint64 | fixed64 |
[]byte | bytes |
Default values apply whenever the user doesn't assign a value to a scalar field: the field is set to the default for its type, such as zero for numeric types, false for bool, and an empty string for string.
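On the Go side, these defaults line up with Go's zero values. The sketch below uses a hand-written Account struct as a stand-in for a generated message; the struct and its field names are hypothetical, and real generated structs carry extra internal fields.

```go
package main

import "fmt"

// Account is a hand-written stand-in for a generated message struct
// (hypothetical; used only to show the default values).
type Account struct {
	Name    string   // string  -> ""
	ID      int32    // int32   -> 0
	Active  bool     // bool    -> false
	Balance float64  // double  -> 0
	Tags    []string // repeated string -> empty list
}

func main() {
	var a Account // no field assigned
	fmt.Printf("%q %d %t %v %d\n", a.Name, a.ID, a.Active, a.Balance, len(a.Tags))
	// prints: "" 0 false 0 0
}
```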
Enumeration and Repeated Values
An enumeration in protobuf assigns a number to each member, conventionally ordered from zero to n. In proto3, the first member must be zero, and it is the default value.
syntax = "proto3";
message Schedule {
  enum Month {
    January = 0;
    February = 1;
    March = 2;
    April = 3;
    May = 4;
    June = 5;
    July = 6;
    August = 7;
    September = 8;
    October = 9;
    November = 10;
    December = 11;
  }
}
Sometimes you need to assign the same value to multiple enumeration members. Protobuf allows this through aliases, where two different members share one value. An enum with aliases looks like this:
enum EnumAllowingAlias {
option allow_alias = true;
UNKNOWN = 0;
STARTED = 1;
RUNNING = 1;
}
STARTED and RUNNING have the same assigned value, so both names refer to the same number. If you remove the duplicated value, you must also remove allow_alias. Otherwise, the protobuf compiler will throw an error.
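To see what an alias means for Go code, here is a small hand-written sketch of the enum above. It is not the generated code; the Status type, the constant names, and the String method are assumptions made for illustration.

```go
package main

import "fmt"

// Status mirrors the EnumAllowingAlias example; hand-written for illustration.
type Status int32

const (
	Unknown Status = 0
	Started Status = 1
	Running Status = 1 // alias: same number as Started
)

// String returns one name per number; in this sketch the first declared name wins.
func (s Status) String() string {
	switch s {
	case Unknown:
		return "UNKNOWN"
	case Started: // also matches Running, since both are 1
		return "STARTED"
	}
	return fmt.Sprintf("Status(%d)", int32(s))
}

func main() {
	var s Status // the zero value is the enum's first member
	fmt.Println(s, Started == Running, Running)
	// prints: UNKNOWN true STARTED
}
```

Because both names carry the same number, they are indistinguishable once serialized. Only the number travels on the wire.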
Repeated value
A repeated field in a protobuf message represents a list stored under a given key. Repeated fields are similar to an array or list of elements. A repeated field looks like this:
message PhoneInfo {
  string serialNum = 1;
  int32 type = 2;
  repeated string name = 3;
}
The last field is a repeated field, an array of names. Its value could be something such as ["100.104.112.10", "100.104.112.12"].
Nested Field
Protocol buffers allow one message to be nested inside another, in the same way an inner JSON object can be a member of an outer JSON object. The code looks like this:
message PhoneInfo {
  string serialNum = 1;
  int32 type = 2;
  repeated Name name = 3;
}
message Name {
  string serialNum = 1;
  int32 type = 2;
}
The Name type is nested into PhoneInfo as a repeated field.
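In Go, a repeated message field becomes a slice of pointers to the nested type. The structs below are hand-written approximations of what the compiler would produce, assuming the repeated field holds Name messages; the sample values are made up.

```go
package main

import "fmt"

// Name and PhoneInfo approximate the generated structs (illustrative only;
// real generated code adds internal bookkeeping fields and getters).
type Name struct {
	SerialNum string
	Type      int32
}

type PhoneInfo struct {
	SerialNum string
	Type      int32
	Name      []*Name // repeated Name name = 3;
}

func main() {
	p := &PhoneInfo{
		SerialNum: "SN-001", // sample values, not from the article
		Type:      1,
		Name: []*Name{
			{SerialNum: "A1", Type: 1},
			{SerialNum: "B2", Type: 2},
		},
	}
	// A repeated field is read like any other Go slice.
	for i, n := range p.Name {
		fmt.Println(i, n.SerialNum, n.Type)
	}
}
```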
Let's look at an example message, based on the one from the official protocol buffer documentation, alongside its JSON equivalent:
message person {
  optional string name = 1;
  optional int32 id = 2;
  optional string email = 3;
}
The preceding example defines a message type called person. The equivalent JSON looks like this:
{
"person": {
"name": "atanda0x",
"id": 1,
"email": "atandanafiu@gmail.com"
}
}
Protocol buffers use a different syntax from JSON, but the structure is the same. The sequential numbers (1, 2, 3) are field tags, which protobuf uses to identify each field when serializing and deserializing data between two systems. Compiling the preceding file produces the message in the targeted language; in Go, that is a struct whose fields start with empty default values.
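The role of the field tags is easier to see on the wire. The sketch below hand-encodes the name and id fields of the person message with only the standard library: each field is written as a key, (tag << 3) | wire type, followed by the value. The helper functions are hypothetical and cover just two wire types; in a real project the generated code and the proto library do this for you.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// appendUvarint appends v in base-128 varint form, the encoding protobuf uses.
func appendUvarint(buf []byte, v uint64) []byte {
	var tmp [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(tmp[:], v)
	return append(buf, tmp[:n]...)
}

// appendVarintField writes key = (tag << 3) | 0, then the value as a varint.
func appendVarintField(buf []byte, tag int, v uint64) []byte {
	buf = appendUvarint(buf, uint64(tag)<<3|0)
	return appendUvarint(buf, v)
}

// appendStringField writes key = (tag << 3) | 2, then the length, then the bytes.
func appendStringField(buf []byte, tag int, s string) []byte {
	buf = appendUvarint(buf, uint64(tag)<<3|2)
	buf = appendUvarint(buf, uint64(len(s)))
	return append(buf, s...)
}

func main() {
	var buf []byte
	buf = appendStringField(buf, 1, "atanda0x") // string name = 1
	buf = appendVarintField(buf, 2, 1)          // int32 id = 2
	fmt.Printf("% x\n", buf)
	// prints: 0a 08 61 74 61 6e 64 61 30 78 10 01
}
```

Notice that the field names never appear in the output. Only the tags do, which is why the numbers must stay stable once a message is in use.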
Protocol Buffer Compiler (protoc)
The protocol buffer compiler, protoc, takes a .proto file and generates source code in a chosen language, which programs then use to transfer data between systems. In your case, the language is Go.
Environment and Folder Setup
In this section, you will set up the development environment for the protobuf project.
Initialize Project
The first step is to initialize a new Go project. Open a new terminal and run the following command to create a new folder. You can name it whatever you like; mine will be protocol-buffer:
mkdir protocol-buffer
Next, move into that folder with the following command:
cd protocol-buffer
Then initialize the new Go project with the go mod command:
go mod init github.com/atanda0x/protobuf-go
Create Folders and Server/Client Files
Run the following command to create all the folders that will hold the project files:
mkdir protofiles protoServer protoClient
The command above will create three folders:
- protofiles - will contain all the protobuf files.
- protoServer - will contain server files
- protoClient - will contain client files
Install Libraries
You will install the necessary libraries for your project.
Run the following command to install the protobuf compiler on your machine. This command is for WSL (Ubuntu); for other operating systems, check the official installation instructions:
apt install -y protobuf-compiler
Then you need to install the Go proto plugins using the go install command:
go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.28
go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@v1.2
Once that's done, create a user.proto file in the protofiles folder of your project and add the following code to it. It models a user's information and includes a few messages:
syntax = "proto3";
package protofiles;
import "google/protobuf/timestamp.proto";
option go_package = "./";
message User {
string name = 1;
int32 id = 2; // Unique ID number for this person.
string email = 3;
enum PhoneInfo {
MOBILE = 0;
HOME = 1;
WORK = 2;
}
message PhoneNumber {
string number = 1;
PhoneInfo type = 2;
}
repeated PhoneNumber phones = 4;
google.protobuf.Timestamp last_updated = 5;
}
message AddressBook {
repeated User people = 1;
}
The code above declares the file as proto3 syntax, sets the package to protofiles, imports google/protobuf/timestamp.proto, and uses go_package to define where the generated file should be stored. The User message has the fields name, id, and email. It also contains the PhoneInfo enum and the nested PhoneNumber message, which the repeated phones field uses. AddressBook is the top-level message and holds a list of User.
The package protofiles line tells the compiler which package the definitions belong to. You need to compile the file into Go code, which gives Go access to the definitions in user.proto.
Next, compile the code with this command so that protoc generates a file the Go files can interact with:
protoc --go_out=. *.proto
Once you run that command in the protofiles directory, it automatically generates a user.pb.go file. Open the file and you will see the generated Go code.
The generated code is used in your Go program to create a protobuf message. Generating user.pb.go gives you the following types:
- An AddressBook structure with a People field.
- A User structure with fields for Name, Id, Email, Phones, and LastUpdated.
- A User_PhoneNumber structure with fields for Number and Type.
- The type User_PhoneInfo, with a value defined for each value in the User.PhoneInfo enum.
Server and Client
Now that protoc has generated a file to interact with, let's create the server and client files, which use the generated user.pb.go file.
Create a server.go file inside your protoServer folder and paste the following code.
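package main

import (
	"fmt"
	"log"

	pb "github.com/atanda0x/protobuf-go/protofiles"
	"google.golang.org/protobuf/proto"
)

func main() {
	// Build a User message using the generated types.
	u := &pb.User{
		Id:    1234,
		Name:  "Atanda N",
		Email: "atandanafiu@gmail.com",
		Phones: []*pb.User_PhoneNumber{
			{Number: "+234", Type: pb.User_HOME},
		},
	}

	// Serialize the message to protobuf's binary wire format.
	body, err := proto.Marshal(u)
	if err != nil {
		log.Fatalf("marshal failed: %v", err)
	}

	// Deserialize the bytes back into a fresh User.
	u1 := &pb.User{}
	if err := proto.Unmarshal(body, u1); err != nil {
		log.Fatalf("unmarshal failed: %v", err)
	}

	fmt.Println("Original struct loaded from proto file:", u)
	fmt.Println("Marshaled proto data: ", body)
	fmt.Println("Unmarshaled struct: ", u1)
}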
The code above starts with package main, imports the generated protobuf package from protofiles under the alias pb, and imports the proto package. The User struct, which maps to the protobuf message defined in protofiles, is initialized with values. The proto.Marshal function serializes the struct, and proto.Unmarshal decodes the bytes into a second struct. The marshaled data is not human-readable because the proto library serializes data into binary bytes. The Go structs generated by compiling the proto files can also be used to create a JSON string on the fly.
Build and run the file from the terminal with the following commands:
go build
go run server.go
The program prints the original struct, the marshaled bytes, and the unmarshaled struct to the terminal.
Let's modify this example to produce JSON. Create another file called main_json.go in a different folder, because two files that each declare a main function in the same package directory will cause a build error.
package main

import (
	"encoding/json"
	"fmt"
	"log"

	pb "github.com/atanda0x/protobuf-go/protofiles"
)

func main() {
	u := &pb.User{
		Id:    1234,
		Name:  "Atanda0x",
		Email: "atandanafiu@gmail.com",
		Phones: []*pb.User_PhoneNumber{
			{Number: "+23490", Type: pb.User_HOME},
		},
	}

	// The generated struct carries json tags, so encoding/json can marshal it.
	body, err := json.Marshal(u)
	if err != nil {
		log.Fatalf("json marshal failed: %v", err)
	}
	fmt.Println(string(body))
}
Run the command below, and the program will print a JSON string that can be sent to any client that understands JSON.
go run main_json.go
Any programming language can consume JSON strings, which makes JSON easy to load almost anywhere.
The benefit of using protocol buffers over JSON is that they are intended for communication between two backend systems with less overhead. Binary encoding is more compact than text, so marshaled protobuf data is smaller than the equivalent JSON string.
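To get a rough feel for the size difference without any generated code, the sketch below encodes the same three fields twice: once with encoding/json and once by hand in the protobuf wire format. The user struct and the encode function are simplified assumptions. The hand encoder always writes every field, whereas real proto3 encoders omit fields that hold their default value.

```go
package main

import (
	"encoding/binary"
	"encoding/json"
	"fmt"
)

// user is a hand-written stand-in for the generated User message.
type user struct {
	Name  string `json:"name"`
	ID    int32  `json:"id"`
	Email string `json:"email"`
}

func appendUvarint(b []byte, v uint64) []byte {
	var tmp [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(tmp[:], v)
	return append(b, tmp[:n]...)
}

// encode writes name = 1, id = 2, email = 3 in protobuf wire format.
func encode(u user) []byte {
	var b []byte
	b = appendUvarint(b, 1<<3|2) // key: field 1, length-delimited
	b = appendUvarint(b, uint64(len(u.Name)))
	b = append(b, u.Name...)
	b = appendUvarint(b, 2<<3|0) // key: field 2, varint
	b = appendUvarint(b, uint64(u.ID))
	b = appendUvarint(b, 3<<3|2) // key: field 3, length-delimited
	b = appendUvarint(b, uint64(len(u.Email)))
	b = append(b, u.Email...)
	return b
}

// sizes returns the encoded length of u as JSON and as protobuf wire bytes.
func sizes(u user) (jsonLen, wireLen int, err error) {
	js, err := json.Marshal(u)
	if err != nil {
		return 0, 0, err
	}
	return len(js), len(encode(u)), nil
}

func main() {
	u := user{Name: "Atanda0x", ID: 1234, Email: "atandanafiu@gmail.com"}
	j, w, err := sizes(u)
	if err != nil {
		panic(err)
	}
	fmt.Printf("JSON: %d bytes, protobuf wire format: %d bytes\n", j, w)
}
```

Most of the saving comes from replacing the quoted field names and punctuation with one-byte keys. The gap widens as messages gain more fields or are repeated many times.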
Protobuf by itself is just a data format. Its value shows when it is used to pass messages between two systems in the form of RPC. Google Remote Procedure Call (gRPC) takes it further to scale microservices communication: a client and a server talk to each other in protocol buffers.
Introduction to Google Remote Procedure Call (gRPC)
Google Remote Procedure Call (gRPC) is an open-source communication mechanism that sends and receives messages between two systems. gRPC uses an architecture that supports authentication, load balancing, and health checking. I wrote a step-by-step guide on how to use RPC with Go. That article has a section where I use a JSON-RPC service, which is similar to gRPC.
gRPC was designed to transfer data in the form of protocol buffers. It goes one step further by letting you define services as APIs and run them smoothly, in a way that is easy to use and elegant. The combination of gRPC and protobuf is seamless, since many programming languages understand both gRPC and the data structures that protocol buffers provide.
The advantages of gRPC over a traditional REST API that sends JSON over HTTP/1.1 are as follows:
- gRPC uses HTTP/2 instead of traditional HTTP/1.1.
- gRPC uses protocol buffers instead of JSON/XML.
- Messages are smaller and are typically transmitted faster with gRPC.
- gRPC has built-in flow control.
- gRPC supports bidirectional streaming.
- gRPC uses a single TCP connection for sending and receiving multiple messages between the server and the client, unlike traditional HTTP/1.1.
Any backend system or mobile app can communicate directly with the gRPC server by sending a protobuf request.
Server streaming with gRPC and Protobuf in Go
Now that you know what gRPC and protobuf are and what they can do, let's create an API-based project. In this use case, a client sends a money transfer request to the server. The server performs a few tasks and sends the details of each step back to the client as a stream of responses.
The server needs to notify the client whenever some processing is performed. This is called a server push model: the client asks only once, and the server sends back a stream of results. This is different from polling, where the client sends a new request every time it wants an update. Server push is useful when a series of time-consuming steps needs to be done. The gRPC client can hand that job off to the gRPC server. The server then takes its time and relays messages to the client, which reads them and does something useful. This concept is similar to WebSockets, but it works between any type of platform.
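Before bringing in gRPC, the push model can be sketched in plain Go with a goroutine and a channel. This is only an analogy under stated assumptions: the step type and process function are hypothetical, and a channel stands in for the network stream.

```go
package main

import "fmt"

// step is a hypothetical progress update, standing in for a streamed response.
type step struct {
	N    int
	Desc string
}

// process simulates the server: one request in, several updates pushed out.
// Closing the channel marks the end of the stream.
func process(request string, steps int) <-chan step {
	out := make(chan step)
	go func() {
		defer close(out)
		for i := 0; i < steps; i++ {
			out <- step{N: i, Desc: fmt.Sprintf("%s: step %d done", request, i)}
		}
	}()
	return out
}

func main() {
	// The client asks once, then simply reads until the stream ends.
	// With polling, it would have to call the server again for every update.
	for s := range process("transfer", 3) {
		fmt.Println(s.N, s.Desc)
	}
}
```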
You already installed the Go proto plugins in the protobuf section. The project layout looks like this:
StreamwithGRPC
├── datafiles
│   ├── transaction.proto
│   ├── transaction.pb.go
│   └── transaction_grpc.pb.go
├── grpcServer
│   └── main.go
└── grpcClient
    └── main.go
Create the proto file for the project with the messages and service defined. Name your file transaction.proto and place it in the datafiles directory.
syntax = "proto3";
package datafiles;
option go_package = "./";
message TransactionRequest {
  string from = 1;
  string to = 2;
  float amount = 3;
}
message TransactionResponse {
  string status = 1;
  int32 step = 2;
  string description = 3;
}
service MoneyTransaction {
  rpc MakeTransaction(TransactionRequest) returns (stream TransactionResponse) {}
}
The code above has two messages and one service defined in the protocol buffer file. The exciting part is in the service; it returns a stream instead of a plain response:
rpc MakeTransaction(TransactionRequest) returns (stream TransactionResponse) {}
Compile the code with the following command. Make sure you run it from the project root, the directory that contains datafiles:
protoc --go_out=. --go_opt=paths=source_relative --go-grpc_out=. --go-grpc_opt=paths=source_relative datafiles/transaction.proto
This creates two new files in the datafiles directory: transaction.pb.go and transaction_grpc.pb.go. You will use the definitions in these files in the server and client programs, which we will create shortly. Now, write the gRPC server code. This code is a bit different from the previous example because of the introduction of streams.
Creation of server and client files
Let's create directories for the server and client files. The server implements the interface that is generated from the proto file.
mkdir -p grpcServer
mkdir -p grpcClient
You will cd into them one after the other. In grpcServer, create a main.go file.
package main

import (
	"flag"
	"fmt"
	"log"
	"net"
	"time"

	pb "github.com/atanda0x/protobuf-go/StreamwithGRPC/datafiles"
	"google.golang.org/grpc"
	"google.golang.org/grpc/reflection"
)

var (
	port      = flag.Int("port", 50051, "Server port")
	noOfSteps = 3
)

// server implements the generated MoneyTransactionServer interface.
type server struct {
	pb.UnimplementedMoneyTransactionServer
}

// MakeTransaction handles one request and streams back one response per step.
func (s *server) MakeTransaction(in *pb.TransactionRequest, stream pb.MoneyTransaction_MakeTransactionServer) error {
	log.Printf("Got request for money.....")
	log.Printf("Amount: $%f, From A/c: %s, To A/c: %s", in.Amount, in.From, in.To)
	for i := 0; i < noOfSteps; i++ {
		// Simulate slow I/O or computation.
		time.Sleep(time.Second * 2)
		resp := &pb.TransactionResponse{
			Status:      "good",
			Step:        int32(i),
			Description: fmt.Sprintf("Description of step %d", i),
		}
		// Return the error instead of exiting, so one failed stream
		// does not take down the whole server.
		if err := stream.Send(resp); err != nil {
			return err
		}
	}
	log.Printf("Successfully transferred amount $%v from %v to %v", in.Amount, in.From, in.To)
	return nil
}

func main() {
	flag.Parse()
	lis, err := net.Listen("tcp", fmt.Sprintf(":%d", *port))
	if err != nil {
		log.Fatalf("Failed to listen: %v", err)
	}
	s := grpc.NewServer()
	pb.RegisterMoneyTransactionServer(s, &server{})
	reflection.Register(s)
	if err := s.Serve(lis); err != nil {
		log.Fatalf("Failed to serve: %v", err)
	}
}
MakeTransaction in the code above is the function that interests us. It takes a request and a stream as its arguments. In the function, you loop through the number of steps (here, three) and perform the computation. The server simulates I/O or computation with the time.Sleep function, then calls:
stream.Send()
This function sends one response on the stream from the server to the client. Now, let us compose the client program.
In the generated files, you will see the RegisterMoneyTransactionServer function and the MakeTransaction function as part of the MoneyTransactionServer interface. The in argument of MakeTransaction carries the RPC request details. It is a struct that maps to the TransactionRequest message defined in the protobuf file:
rpc MakeTransaction(TransactionRequest) returns (stream TransactionResponse) {}
The Client file
To create the client file, cd into the grpcClient directory and create a main.go file in it.
package main

import (
	"context"
	"flag"
	"io"
	"log"

	pb "github.com/atanda0x/protobuf-go/StreamwithGRPC/datafiles"
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

var (
	address = flag.String("add", "localhost:50051", "The address to connect")
)

// ReceiveStream sends one request and logs every response the server streams back.
func ReceiveStream(client pb.MoneyTransactionClient, request *pb.TransactionRequest) {
	log.Println("Started listening to the server stream!!!!")
	stream, err := client.MakeTransaction(context.Background(), request)
	if err != nil {
		log.Fatalf("%v.MakeTransaction(_) = _, %v", client, err)
	}
	for {
		response, err := stream.Recv()
		if err == io.EOF {
			// The server has finished sending.
			break
		}
		if err != nil {
			log.Fatalf("%v.MakeTransaction(_) = _, %v", client, err)
		}
		log.Printf("Status: %v, Operation: %v", response.Status, response.Description)
	}
}

func main() {
	flag.Parse()
	// Plaintext connection for local development; grpc.WithInsecure is deprecated.
	conn, err := grpc.Dial(*address, grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatalf("Did not connect: %v", err)
	}
	defer conn.Close()
	client := pb.NewMoneyTransactionClient(conn)
	from := "1234"
	to := "5678"
	amount := float32(1250.75)
	ReceiveStream(client, &pb.TransactionRequest{From: from, To: to, Amount: amount})
}
ReceiveStream in the code above is a custom function that sends a request and receives a stream of messages. It takes two arguments: a MoneyTransactionClient and a TransactionRequest. It uses the first argument to create a stream and then listens to it. When the server has sent all its messages, the client's next call to Recv returns an io.EOF error, at which point the client stops listening and terminates. Each response received from the gRPC server is logged. The second argument, TransactionRequest, is the request sent to the server when the stream is opened.
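The receive loop can be exercised without a running server. The sketch below pulls the loop into a small function and feeds it from an in-memory fake; the receiver interface, fakeStream, and drain are hypothetical names, and the message type is simplified to a string. It also returns errors to the caller rather than exiting, which is one possible design choice.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// receiver is the one method the loop needs. Generated gRPC stream clients
// expose a Recv method of this general shape (message type simplified here).
type receiver interface {
	Recv() (string, error)
}

// fakeStream is an in-memory stream used only to exercise the loop.
type fakeStream struct{ msgs []string }

func (f *fakeStream) Recv() (string, error) {
	if len(f.msgs) == 0 {
		return "", io.EOF // nothing left: signal the end of the stream
	}
	m := f.msgs[0]
	f.msgs = f.msgs[1:]
	return m, nil
}

// drain reads until io.EOF and returns everything received.
func drain(r receiver) ([]string, error) {
	var got []string
	for {
		m, err := r.Recv()
		if errors.Is(err, io.EOF) {
			return got, nil // a clean end of stream is not a failure
		}
		if err != nil {
			return got, err
		}
		got = append(got, m)
	}
}

func main() {
	got, err := drain(&fakeStream{msgs: []string{"step 0", "step 1", "step 2"}})
	fmt.Println(got, err) // prints: [step 0 step 1 step 2] <nil>
}
```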
Open a terminal and start the grpcServer with the commands below; run go build before running the file.
go build
go run main.go
Open another terminal, move into grpcClient, and run its main.go file.
go build
go run main.go
The TCP server runs on port 50051. The client logs the status and description of each step as it arrives, and the server writes its own log messages to the console at the same time.
The client made a request to grpcServer and passed it the from A/C number, the to A/C number, and the amount. The server picked up the details, processed them, and responded that everything was OK.
The client stays in sync with the server: it stays alive until all the streaming messages are sent back. The server can handle any number of clients at a given time, and every client request is treated as an individual entity. This is an example of the server sending a stream of responses. Other cases can also be implemented with protocol buffers and gRPC:
- The client sends streamed requests to get one final response from the server.
- The client and server both send streamed requests and responses at the same time.
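As a rough picture of the first of those other cases, client streaming, here is a plain Go analogy in which many values flow in and a single summary comes out. The sumTransfers function and the amounts are made up for illustration, and a channel again stands in for the gRPC stream.

```go
package main

import "fmt"

// sumTransfers simulates client streaming: it consumes every streamed
// request and produces one final response (a count and a total).
func sumTransfers(amounts <-chan float64) (count int, total float64) {
	for a := range amounts {
		count++
		total += a
	}
	return count, total
}

func main() {
	in := make(chan float64)
	go func() {
		defer close(in) // closing the channel ends the client's stream
		for _, a := range []float64{100, 250.5, 49.5} {
			in <- a
		}
	}()
	n, total := sumTransfers(in)
	fmt.Println(n, total) // prints: 3 400
}
```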
Conclusion
And that's it! You have just built working server-side streaming with gRPC and protobuf. You learned about gRPC, the open-source RPC communication tool from Google, and protocol buffers, why you might use them in your project over other communication models, and how to generate a pb file and interact with it in the client and server files. gRPC supports multiple programming languages, has built-in features for load balancing and authentication, and supports bidirectional streaming, all of which you can build projects on. gRPC helps developers build efficient and robust distributed systems tailored to specific needs in the programming language of their choice. The choice might depend on project requirements and trade-offs among compatibility, ease of use, and the demands of building distributed systems.
Thank you so much for reading. Happy coding!