Command-Line Tools with Go: Piping Data

subfuzion

Tony Pujals

Posted on September 24, 2024

Unix is well known for the philosophy that each command should do one thing and do it well.

The shell pipe operator (|) chains commands together so that the output of one becomes the input of the next. Chaining simple commands this way is often enough to perform sophisticated data processing and transformation.

For example:

# Sort file names.
ls | sort

# Count files.
ls | wc -l

# Print out unique file extensions.
#  1. List all files that have extensions
#  2. Transform the data (discard everything but extensions)
#  3. Sort the list (data must be sorted to identify duplicates)
#  4. Filter out duplicates
#  5. Browse the results
ls *.* | sed 's/.*\.//' | sort | uniq | less

With Go, programmers can write their own efficient commands for processing data in a pipeline. The following snippets show the basics.

Add line numbers to output

The essence of a command that can be used in a pipe operation is that it reads from stdin and writes to stdout.

add-line-numbers.go

package main

import (
    "bufio"
    "fmt"
    "os"
)

func main() {

    // Buffered input that splits input on lines.
    input := bufio.NewScanner(os.Stdin)

    // Buffered output.
    output := bufio.NewWriter(os.Stdout)

    lineNo := 0

    // Scan until EOF (no more input).
    for input.Scan() {
        text := input.Text()
        lineNo++
        s := fmt.Sprintf("%03d %s\n", lineNo, text)

        // It would be simpler to just use fmt.Println,
        // but want to emphasize piping stdin to stdout
        // explicitly.
        // Intentionally ignoring return values.
        _, _ = output.WriteString(s)

    }

    // Always explicitly flush remaining buffered output.
    _ = output.Flush()
}

This example reads a line at a time from stdin and writes it back out to stdout with each line prefixed with the line number. Here, we use the program file itself as the input to generate numbered output.

$ cat add-line-numbers.go | go run add-line-numbers.go
001 package main
002 
003 import (
004     "bufio"
005     "fmt"
006     "os"
007 )
008 
009 func main() {
010 
011     // Buffered input that splits input on lines.
012     input := bufio.NewScanner(os.Stdin)
013 
014     // Buffered output.
015     output := bufio.NewWriter(os.Stdout)
016 
017     lineNo := 0
018 
019     // Scan until EOF (no more input).
020     for input.Scan() {
021         text := input.Text()
022         lineNo++
023         s := fmt.Sprintf("%03d %s\n", lineNo, text)
024 
025         // It would be simpler to just use fmt.Println,
026         // but want to emphasize piping stdin to stdout
027         // explicitly.
028         // Intentionally ignoring return values.
029         _, _ = output.WriteString(s)
030 
031     }
032 
033     // Always explicitly flush remaining buffered output.
034     _ = output.Flush()
035 }

Base64 encode input

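This example reads stdin a line at a time and writes each line through a base64 encoder to stdout. The encoder treats everything written to it as one continuous stream, so the result is a single base64 string rather than one encoded string per line.

base64-encode.go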

package main

import (
    "bufio"
    "encoding/base64"
    "os"
)

func main() {

    // Buffered input that splits input on lines.
    input := bufio.NewScanner(os.Stdin)

    // Base64 Encoder/writer.
    encoder := base64.NewEncoder(
        base64.StdEncoding,
        os.Stdout)

    // Scan until EOF (no more input).
    for input.Scan() {
        bytes := input.Bytes()
        _, _ = encoder.Write(bytes)
        _, _ = encoder.Write([]byte{'\n'})
    }

    // Close the encoder and ensure it flushes remaining output
    _ = encoder.Close()
}

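Because the scanner splits on newline characters (\n) and strips them from the lines it returns, the program must write a newline after each line to preserve it in the encoded data. The scanner also strips a carriage return that precedes the newline, so input with \r\n line endings is encoded with \n endings.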

$ cat base64-encode.go | go run base64-encode.go
cGFja2FnZSBtYWluCgppbXBvcnQgKAoJImJ1ZmlvIgoJImVuY29kaW5nL2Jhc2U2NCIKCSJvcyIKKQoKZnVuYyBtYWluKCkgewoKCS8vIEJ1ZmZlcmVkIGlucHV0IHRoYXQgc3BsaXRzIGlucHV0IG9uIGxpbmVzLgoJaW5wdXQgOj0gYnVmaW8uTmV3U2Nhbm5lcihvcy5TdGRpbikKCgkvLyBCYXNlNjQgRW5jb2Rlci93cml0ZXIuCgllbmNvZGVyIDo9IGJhc2U2NC5OZXdFbmNvZGVyKAoJCWJhc2U2NC5TdGRFbmNvZGluZywKCQlvcy5TdGRvdXQpCgoJLy8gU2NhbiB1bnRpbCBFT0YgKG5vIG1vcmUgaW5wdXQpLgoJZm9yIGlucHV0LlNjYW4oKSB7CgkJYnl0ZXMgOj0gaW5wdXQuQnl0ZXMoKQoJCV8sIF8gPSBlbmNvZGVyLldyaXRlKGJ5dGVzKQoJCV8sIF8gPSBlbmNvZGVyLldyaXRlKFtdYnl0ZXsnXG4nfSkKCX0KCgkvLyBDbG9zZSB0aGUgZW5jb2RlciBhbmQgZW5zdXJlIGl0IGZsdXNoZXMgcmVtYWluaW5nIG91dHB1dAoJXyA9IGVuY29kZXIuQ2xvc2UoKQp9Cg==

You can confirm that the text was encoded correctly by piping the result to the system base64 command to decode it. The -D flag shown here is the macOS spelling; GNU base64 on Linux uses -d instead, and both accept --decode:

$ cat base64-encode.go | go run base64-encode.go | base64 -D
package main

import (
    "bufio"
    "encoding/base64"
    "os"
)

func main() {

    // Buffered input that splits input on lines.
    input := bufio.NewScanner(os.Stdin)

    // Base64 Encoder/writer.
    encoder := base64.NewEncoder(
        base64.StdEncoding,
        os.Stdout)

    // Scan until EOF (no more input).
    for input.Scan() {
        bytes := input.Bytes()
        _, _ = encoder.Write(bytes)
        _, _ = encoder.Write([]byte{'\n'})
    }

    // Close the encoder and ensure it flushes remaining output
    _ = encoder.Close()
}

This post is an excerpt from a short, introductory guide I wrote on standard library features of Go that are useful for creating command-line tools: Go for CLI Apps and Tools.
