Command-Line Tools with Go: Piping Data
Tony Pujals
Posted on September 24, 2024
Unix is well-known for advocating the philosophy that commands should do one thing and do it well.
Sophisticated data processing can often be performed by chaining such commands with the shell pipe operator (|), so that the output of one becomes the input of the next and the data is transformed step by step into the desired result.
For example:
# Sort file names.
ls | sort
# Count files.
ls | wc -l
# Print out unique file extensions.
# 1. List all files that have extensions
# 2. Transform the data (discard everything but extensions)
# 3. Sort the list (data must be sorted to identify duplicates)
# 4. Filter out duplicates
# 5. Browse the results
ls *.* | sed 's/.*\.//' | sort | uniq | less
With Go, programmers can create efficient commands for processing data. The following snippets touch on how.
Add line numbers to output
The essence of a command that can be used in a pipe operation is that it reads from stdin and writes to stdout.
add-line-numbers.go
package main

import (
	"bufio"
	"fmt"
	"os"
)

func main() {

	// Buffered input that splits input on lines.
	input := bufio.NewScanner(os.Stdin)

	// Buffered output.
	output := bufio.NewWriter(os.Stdout)

	lineNo := 0

	// Scan until EOF (no more input).
	for input.Scan() {
		text := input.Text()
		lineNo++
		s := fmt.Sprintf("%03d %s\n", lineNo, text)

		// It would be simpler to just use fmt.Println,
		// but want to emphasize piping stdin to stdout
		// explicitly.
		// Intentionally ignoring return values.
		_, _ = output.WriteString(s)

	}

	// Always explicitly flush remaining buffered output.
	_ = output.Flush()

}
This example reads a line at a time from stdin and writes it back out to stdout with each line prefixed with the line number. Here, we use the program file itself as the input to generate numbered output.
$ cat add-line-numbers.go | go run add-line-numbers.go
001 package main
002
003 import (
004 	"bufio"
005 	"fmt"
006 	"os"
007 )
008
009 func main() {
010
011 	// Buffered input that splits input on lines.
012 	input := bufio.NewScanner(os.Stdin)
013
014 	// Buffered output.
015 	output := bufio.NewWriter(os.Stdout)
016
017 	lineNo := 0
018
019 	// Scan until EOF (no more input).
020 	for input.Scan() {
021 		text := input.Text()
022 		lineNo++
023 		s := fmt.Sprintf("%03d %s\n", lineNo, text)
024
025 		// It would be simpler to just use fmt.Println,
026 		// but want to emphasize piping stdin to stdout
027 		// explicitly.
028 		// Intentionally ignoring return values.
029 		_, _ = output.WriteString(s)
030
031 	}
032
033 	// Always explicitly flush remaining buffered output.
034 	_ = output.Flush()
035
036 }
Base64 encode input
This example reads a line at a time from stdin, base64 encodes it, and writes it back out to stdout.
base64-encode.go
package main

import (
	"bufio"
	"encoding/base64"
	"os"
)

func main() {

	// Buffered input that splits input on lines.
	input := bufio.NewScanner(os.Stdin)

	// Base64 Encoder/writer.
	encoder := base64.NewEncoder(
		base64.StdEncoding,
		os.Stdout)

	// Scan until EOF (no more input).
	for input.Scan() {
		bytes := input.Bytes()
		_, _ = encoder.Write(bytes)
		_, _ = encoder.Write([]byte{'\n'})
	}

	// Close the encoder and ensure it flushes remaining output
	_ = encoder.Close()
}
Since the scanner splits on newline characters (\n) without returning them, it's necessary to explicitly write a newline after writing each line.
$ cat base64-encode.go | go run base64-encode.go
cGFja2FnZSBtYWluCgppbXBvcnQgKAoJImJ1ZmlvIgoJImVuY29kaW5nL2Jhc2U2NCIKCSJvcyIKKQoKZnVuYyBtYWluKCkgewoKCS8vIEJ1ZmZlcmVkIGlucHV0IHRoYXQgc3BsaXRzIGlucHV0IG9uIGxpbmVzLgoJaW5wdXQgOj0gYnVmaW8uTmV3U2Nhbm5lcihvcy5TdGRpbikKCgkvLyBCYXNlNjQgRW5jb2Rlci93cml0ZXIuCgllbmNvZGVyIDo9IGJhc2U2NC5OZXdFbmNvZGVyKAoJCWJhc2U2NC5TdGRFbmNvZGluZywKCQlvcy5TdGRvdXQpCgoJLy8gU2NhbiB1bnRpbCBFT0YgKG5vIG1vcmUgaW5wdXQpLgoJZm9yIGlucHV0LlNjYW4oKSB7CgkJYnl0ZXMgOj0gaW5wdXQuQnl0ZXMoKQoJCV8sIF8gPSBlbmNvZGVyLldyaXRlKGJ5dGVzKQoJCV8sIF8gPSBlbmNvZGVyLldyaXRlKFtdYnl0ZXsnXG4nfSkKCX0KCgkvLyBDbG9zZSB0aGUgZW5jb2RlciBhbmQgZW5zdXJlIGl0IGZsdXNoZXMgcmVtYWluaW5nIG91dHB1dAoJXyA9IGVuY29kZXIuQ2xvc2UoKQp9Cg==
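You can confirm the text was correctly encoded by piping the encoded result to the system base64 command (Linux and macOS) to decode it. The -D flag shown here is the macOS spelling; the GNU coreutils version found on most Linux systems uses -d instead.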
$ cat base64-encode.go | go run base64-encode.go | base64 -D
package main

import (
	"bufio"
	"encoding/base64"
	"os"
)

func main() {

	// Buffered input that splits input on lines.
	input := bufio.NewScanner(os.Stdin)

	// Base64 Encoder/writer.
	encoder := base64.NewEncoder(
		base64.StdEncoding,
		os.Stdout)

	// Scan until EOF (no more input).
	for input.Scan() {
		bytes := input.Bytes()
		_, _ = encoder.Write(bytes)
		_, _ = encoder.Write([]byte{'\n'})
	}

	// Close the encoder and ensure it flushes remaining output
	_ = encoder.Close()
}
This post is an excerpt from a short, introductory guide I wrote on standard library features of Go that are useful for creating command-line tools: Go for CLI Apps and Tools.