Using FUSE to map network statistics to directories

r4dx

Efim Rovaev

Posted on March 2, 2023


Intro

FUSE is a great Linux tool that lets mere mortals implement file systems and run them in user space. Let's have some fun with it and implement a simple file system that shows network statistics! We'll get the data from procfs, a virtual file system that exposes system and process information.
We'll use Go to keep the code simple.
You can try it right now by cloning the repo, building from source, and mounting the file system on a directory like this:

$ git clone https://github.com/r4dx/netstatfs
$ cd netstatfs
$ go build
$ mkdir m
$ ./netstatfs --mount=m&
$ ls m

The m folder will contain directories, each representing a process; within each directory there will be files, each representing a socket connection.
Now that we know what we want to achieve, let's take a closer look at the tools we'll be using.

FUSE

FUSE, or Filesystem in Userspace, is a software interface that lets users create their own file systems without writing kernel drivers. When a program accesses a file or directory on a FUSE mount, it makes an ordinary system call that reaches the kernel's virtual file system layer. The kernel's FUSE module then passes the request through the FUSE device file (/dev/fuse) to the file system implementation running in user space (something that we'll implement!), usually with the help of a FUSE library. The implementation's reply travels back the same way.

This is, of course, a very brief description of FUSE. You can dive into detail as deep as you want (or need) by visiting these resources:

How does netstat do it?

Apart from FUSE, our project relies on the proc file system (procfs), a Linux virtual file system that provides information about the system and the processes running on it. We will use procfs to retrieve network statistics.
Interestingly, the Linux netstat command also uses procfs to retrieve network stats. This is how it does it:

  • It opens the /proc/ directory and reads all numerical IDs there. These IDs stand for the processes that are currently running on the system.
  • For each process, it reads all file descriptor IDs in /proc/{processId}/fd/
  • It resolves each file descriptor link. Links that point to sockets have the format socket:[socketInode] or [0000]:socketInode
  • Then it stores the mapping of socketInode to its corresponding process in the prg_hash hash map.
  • Finally, it reads the network protocol files in /proc/net/{tcp,tcp6,udp,udp6,igmp,igmp6,unix}, whose entries have an inode field. This field is then used to look up the process in the prg_hash hash map.
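
The first four steps above can be sketched in Go roughly like this. It is only an illustration of the idea, not netstat's or netstatfs's actual code: the names parseSocketInode and buildInodeToPid are made up for this example, and a plain Go map stands in for the hash map.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"regexp"
	"strconv"
)

// socketLinkRe matches both link formats mentioned above:
// "socket:[12345]" and "[0000]:12345".
var socketLinkRe = regexp.MustCompile(`^(?:socket:\[(\d+)\]|\[0000\]:(\d+))$`)

// parseSocketInode extracts the inode from a resolved fd link target.
// The second result is false when the target is not a socket.
func parseSocketInode(target string) (uint64, bool) {
	m := socketLinkRe.FindStringSubmatch(target)
	if m == nil {
		return 0, false
	}
	digits := m[1]
	if digits == "" {
		digits = m[2]
	}
	inode, err := strconv.ParseUint(digits, 10, 64)
	if err != nil {
		return 0, false
	}
	return inode, true
}

// buildInodeToPid walks procRoot (normally "/proc") and maps each socket
// inode to a PID that holds it. Unreadable entries are skipped, since a
// process may exit mid-scan or belong to another user. If several
// processes share a socket, the last one scanned wins in this sketch.
func buildInodeToPid(procRoot string) map[uint64]int {
	result := make(map[uint64]int)
	entries, err := os.ReadDir(procRoot)
	if err != nil {
		return result
	}
	for _, e := range entries {
		pid, err := strconv.Atoi(e.Name())
		if err != nil {
			continue // not a process directory
		}
		fdDir := filepath.Join(procRoot, e.Name(), "fd")
		fds, err := os.ReadDir(fdDir)
		if err != nil {
			continue
		}
		for _, fd := range fds {
			target, err := os.Readlink(filepath.Join(fdDir, fd.Name()))
			if err != nil {
				continue
			}
			if inode, ok := parseSocketInode(target); ok {
				result[inode] = pid
			}
		}
	}
	return result
}

func main() {
	m := buildInodeToPid("/proc")
	fmt.Printf("found %d socket inodes\n", len(m))
}
```

Run without root privileges, this only sees the sockets of processes you own, which is the same limitation netstat has.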

It's worth noting that the ss command, which serves a similar purpose, uses the netlink protocol instead. To learn more about this, you can refer to the following link: https://man7.org/linux/man-pages/man7/sock_diag.7.html

Diving into the Code

Now let us take a closer look at the code and discuss it bit by bit:

netstatfs.go implements a file system using the bazil.org/fuse library. The file system exposes information about the network connections of the processes running on the system.

The main function first reads the mount point from a command-line argument. Then it mounts the file system and opens a connection to the kernel's FUSE device by calling fuse.Mount. The NewNetstatfs function creates a new instance of the Netstatfs struct, which provides the root node of the file system. Finally, main serves the Netstatfs instance as a FUSE file system by calling fs.Serve.

func main() {
    mountpoint := flag.String("mount", "", "mountpoint for FS")
    flag.Parse()
    conn, err := fuse.Mount(*mountpoint, fuse.FSName("netstatfs"))
    if err != nil {
        panic(err)
    }
    defer conn.Close()
    netstatfs, err := NewNetstatfs()
    if err != nil {
        panic(err)
    }
    err = fs.Serve(conn, &netstatfs)
    if err != nil {
        panic(err)
    }
}

Let's take a look at the Netstatfs struct:

type Netstatfs struct {
    ProcessProvider ProcessProvider
    SocketProvider  SocketProvider
    FileIdProvider  FileIdProvider
    RootINode       uint64
}

The fields are mostly self-explanatory. Note that FileIdProvider provides a unique file ID (inode number) for each node in the file system, and RootINode is the inode number of the root node.
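
The article doesn't show how FileIdProvider is implemented, so here is one way it could work: a minimal sketch that hands out a stable inode number per process directory and per socket file. The type mapFileIdProvider and its fileKey are hypothetical; only the method names GetByProcessId and GetBySocketId, and their argument types, are taken from the calls that appear later in this post.

```go
package main

import (
	"fmt"
	"sync"
)

// fileKey identifies a node: a process directory (IsSocket == false)
// or a socket file inside it. Hypothetical type, not from the repo.
type fileKey struct {
	ProcessId uint
	SocketId  uint64
	IsSocket  bool
}

// mapFileIdProvider hands out a stable, unique inode number per key.
type mapFileIdProvider struct {
	mu   sync.Mutex
	next uint64
	ids  map[fileKey]uint64
}

// newMapFileIdProvider starts numbering right after the root inode,
// so no node can collide with the root directory.
func newMapFileIdProvider(rootINode uint64) *mapFileIdProvider {
	return &mapFileIdProvider{next: rootINode + 1, ids: make(map[fileKey]uint64)}
}

// get returns the existing ID for a key or allocates the next free one.
func (p *mapFileIdProvider) get(k fileKey) uint64 {
	p.mu.Lock()
	defer p.mu.Unlock()
	if id, ok := p.ids[k]; ok {
		return id
	}
	id := p.next
	p.next++
	p.ids[k] = id
	return id
}

func (p *mapFileIdProvider) GetByProcessId(processId uint) (uint64, error) {
	return p.get(fileKey{ProcessId: processId}), nil
}

func (p *mapFileIdProvider) GetBySocketId(processId uint, socketId uint64) (uint64, error) {
	return p.get(fileKey{ProcessId: processId, SocketId: socketId, IsSocket: true}), nil
}

func main() {
	p := newMapFileIdProvider(1)
	dir, _ := p.GetByProcessId(42)
	sock, _ := p.GetBySocketId(42, 3)
	again, _ := p.GetByProcessId(42)
	fmt.Println(dir, sock, again) // prints "2 3 2"
}
```

The mutex is there because a FUSE server may handle several requests concurrently. The map grows forever in this sketch; a longer-lived implementation would probably want to forget entries for processes that have exited.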

The Root method of Netstatfs returns the root directory of the file system. The root directory is implemented as an instance of the RootDir struct, which has methods to return its attributes and the contents of the directory. The contents of the root directory are names of the processes running on the system, with each name having the format <process ID>_<process name>.

func (me Netstatfs) Root() (fs.Node, error) {
    return RootDir{Root: &me}, nil
}

type RootDir struct {
    Root *Netstatfs
}

Look at the ReadDirAll method of the RootDir struct. It uses ProcessProvider to get the list of running processes and represents them as directories in our file system:

func (me RootDir) ReadDirAll(ctx context.Context) ([]fuse.Dirent, error) {
    processes, err := (*me.Root).ProcessProvider.GetProcesses()
    if err != nil {
        return nil, err
    }
    result := make([]fuse.Dirent, len(processes))
    for i, process := range processes {
        fn := processNameToFileName(
            process.Id, process.Name)
        inode, err := (*me.Root).FileIdProvider.GetByProcessId(process.Id)
        if err != nil {
            return nil, err
        }

        result[i] = fuse.Dirent{Inode: inode,
            Name: fn,
            Type: fuse.DT_Dir}
    }
    return result, nil
}

Each process directory is then implemented as an instance of the ProcessDir struct. The ReadDirAll method returns the network sockets associated with the process. The names of the files have the format <socket local address>:<socket local port>_<socket remote address>:<socket remote port>.

type ProcessDir struct {
    Root    *Netstatfs
    Process Process
    INode   uint64
}

// ReadDirAll lists the process's sockets as regular files.
func (me ProcessDir) ReadDirAll(ctx context.Context) ([]fuse.Dirent, error) {
    sockets, err := (*me.Root).SocketProvider.GetSockets(me.Process.Id)
    if err != nil {
        log.Printf("Couldn't get sockets for process=%d: %s\n", me.Process.Id, err)
        return nil, err
    }
    result := make([]fuse.Dirent, len(sockets))
    for i, socket := range sockets {
        inode, err := (*me.Root).FileIdProvider.GetBySocketId(me.Process.Id, socket.Id)
        if err != nil {
            log.Printf("Couldn't get fileid for process=%d, socket=%d: %s", me.Process.Id, socket.Id, err)
            return nil, err
        }
        result[i] = fuse.Dirent{Inode: inode,
            Name: socketToFileName(socket),
            Type: fuse.DT_File}
    }
    return result, nil
}
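
The socketToFileName helper used above isn't shown in this post. A minimal sketch of the naming scheme could look like the following; the socketInfo struct and its field names are assumptions made for this example, not the repo's actual SocketInfo type.

```go
package main

import "fmt"

// socketInfo is a hypothetical stand-in for the repo's SocketInfo type;
// the field names are assumptions.
type socketInfo struct {
	LocalAddr  string
	LocalPort  uint16
	RemoteAddr string
	RemotePort uint16
}

// socketFileName renders
// <local address>:<local port>_<remote address>:<remote port>.
func socketFileName(s socketInfo) string {
	return fmt.Sprintf("%s:%d_%s:%d", s.LocalAddr, s.LocalPort, s.RemoteAddr, s.RemotePort)
}

func main() {
	s := socketInfo{LocalAddr: "127.0.0.1", LocalPort: 8080, RemoteAddr: "10.0.0.5", RemotePort: 51234}
	fmt.Println(socketFileName(s)) // prints "127.0.0.1:8080_10.0.0.5:51234"
}
```

Colons are fine in Linux file names, but a slash is not, so anything that can contain a path (a Unix domain socket, for instance) would need escaping before it could be used this way.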

Notice how RootDir is connected to ProcessDir: its Lookup method resolves an entry name inside the root directory and returns the matching ProcessDir node.

func (me RootDir) Lookup(ctx context.Context, name string) (fs.Node, error) {
    id, err := fileNameToProcessId(name)
    if err != nil {
        return nil, err
    }
    process, err := me.Root.ProcessProvider.GetProcessById(id)
    if err != nil {
        return nil, err
    }
    inode, err := (*me.Root).FileIdProvider.GetByProcessId(id)
    if err != nil {
        return nil, err
    }
    return ProcessDir{Root: me.Root,
        Process: process,
        INode:   inode}, nil
}
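
Lookup depends on fileNameToProcessId, the inverse of the processNameToFileName helper used in ReadDirAll. Neither is shown in this post, so here is a sketch of how such a pair might look, assuming the <process ID>_<process name> format described earlier. The function names below are deliberately different from the repo's to make clear they are illustrative.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// processFileName builds "<process ID>_<process name>".
func processFileName(id uint, name string) string {
	return strconv.FormatUint(uint64(id), 10) + "_" + name
}

// processIdFromFileName recovers the ID from such a name. Only the part
// before the first underscore is parsed, so process names that contain
// "_" themselves are handled correctly.
func processIdFromFileName(fileName string) (uint, error) {
	idPart, _, found := strings.Cut(fileName, "_")
	if !found {
		return 0, fmt.Errorf("no '_' separator in %q", fileName)
	}
	id, err := strconv.ParseUint(idPart, 10, 32)
	if err != nil {
		return 0, fmt.Errorf("bad process ID in %q: %w", fileName, err)
	}
	return uint(id), nil
}

func main() {
	name := processFileName(1234, "my_server")
	id, err := processIdFromFileName(name)
	fmt.Println(name, id, err) // prints "1234_my_server 1234 <nil>"
}
```

Putting the ID first is what makes the round trip unambiguous: the ID never contains an underscore, while the process name might.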

Now that we've seen how the FUSE part works, let's take a quick look at how we get the list of open sockets for a process.
This is essentially what GetSockets does:

// getProcessSocketInternal resolves the <pid>/fd/<fd> link under procfs and
// returns a ProcessSocket if the descriptor points to a socket.
func (me procfsSocketProvider) getProcessSocketInternal(processId uint, socketId uint64) (ProcessSocket, error) {
    file := strconv.FormatUint(uint64(processId), 10) + "/fd/" + strconv.FormatUint(socketId, 10)

    resolved, err := me.procfs.Readlink(file)
    if err != nil {
        return ProcessSocket{}, err
    }
    inode, err := me.getINodeIfSocket(resolved)
    if err != nil {
        return ProcessSocket{}, err
    }
    return ProcessSocket{INode: inode, ProcessId: processId,
        Id: socketId, SocketInfo: SocketInfo{}}, nil
}

// GetSockets lists the descriptors in <pid>/fd and keeps only the sockets.
func (me procfsSocketProvider) GetSockets(processId uint) ([]ProcessSocket, error) {
    base := strconv.Itoa(int(processId)) + "/fd/"

    files, err := me.procfs.Readdirnames(base)
    if err != nil {
        return nil, err
    }
    result := make([]ProcessSocket, 0)
    for _, file := range files {
        fd, err := strconv.ParseUint(file, 10, 64)
        if err != nil {
            continue // not a numeric descriptor name
        }
        socket, err := me.getProcessSocketInternal(processId, fd)
        if err != nil {
            continue // not a socket, or the descriptor could not be read
        }
        result = append(result, socket)
    }
    err = me.fillSocketInfo(result)
    if err != nil {
        return nil, err
    }

    return result, nil
}

// getINodeIfSocket extracts the inode number from a resolved link target;
// targets that are not sockets produce an error.
func (me procfsSocketProvider) getINodeIfSocket(src string) (uint64, error) {
    matches := me.socketINodeRe.FindStringSubmatchIndex(src)
    outStr := string(me.socketINodeRe.ExpandString([]byte{}, "$inode", src, matches))
    res, err := strconv.ParseUint(outStr, 10, 64)
    if err != nil {
        return 0, err
    }
    return res, nil
}


The result is a list of all the sockets for the given process with detailed information about each socket.
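
The fillSocketInfo step isn't shown here. Following the netstat approach described earlier, the details for each inode would presumably come from files such as /proc/net/tcp. As an illustration only, here is how one row of that file could be parsed; tcpEntry, parseHexEndpoint and parseTCPLine are names invented for this sketch, and it assumes IPv4 on a little-endian machine, where the address bytes appear in reverse order.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"net"
	"strconv"
	"strings"
)

// tcpEntry holds the fields of one /proc/net/tcp row that matter here.
type tcpEntry struct {
	LocalAddr  string
	LocalPort  uint16
	RemoteAddr string
	RemotePort uint16
	INode      uint64
}

// parseHexEndpoint decodes "0100007F:1F90" into ("127.0.0.1", 8080).
// Assumption: IPv4 on a little-endian host (address bytes are reversed).
func parseHexEndpoint(s string) (string, uint16, error) {
	addrHex, portHex, found := strings.Cut(s, ":")
	if !found {
		return "", 0, fmt.Errorf("malformed endpoint %q", s)
	}
	raw, err := hex.DecodeString(addrHex)
	if err != nil || len(raw) != 4 {
		return "", 0, fmt.Errorf("malformed IPv4 address %q", addrHex)
	}
	ip := net.IPv4(raw[3], raw[2], raw[1], raw[0])
	port, err := strconv.ParseUint(portHex, 16, 16)
	if err != nil {
		return "", 0, err
	}
	return ip.String(), uint16(port), nil
}

// parseTCPLine parses one data row (not the header) of /proc/net/tcp.
// After splitting on whitespace, field 1 is the local endpoint, field 2
// the remote endpoint and field 9 the socket inode.
func parseTCPLine(line string) (tcpEntry, error) {
	f := strings.Fields(line)
	if len(f) < 10 {
		return tcpEntry{}, fmt.Errorf("expected at least 10 fields, got %d", len(f))
	}
	var e tcpEntry
	var err error
	if e.LocalAddr, e.LocalPort, err = parseHexEndpoint(f[1]); err != nil {
		return tcpEntry{}, err
	}
	if e.RemoteAddr, e.RemotePort, err = parseHexEndpoint(f[2]); err != nil {
		return tcpEntry{}, err
	}
	if e.INode, err = strconv.ParseUint(f[9], 10, 64); err != nil {
		return tcpEntry{}, err
	}
	return e, nil
}

func main() {
	line := "   0: 0100007F:1F90 00000000:0000 0A 00000000:00000000 00:00000000 00000000  1000        0 34567 1 0000000000000000 100 0 0 10 0"
	entry, err := parseTCPLine(line)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%s:%d -> %s:%d (inode %d)\n",
		entry.LocalAddr, entry.LocalPort, entry.RemoteAddr, entry.RemotePort, entry.INode)
}
```

Matching entry.INode against the inode read from the process's fd link is what ties a connection back to its process.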

Conclusion

I hope this gave you a quick glimpse of what you can do with the power of FUSE. If you read further into the source, you'll see that with little effort we could even implement reading the content of each "file" (a socket in our case) by using a PCAP library.
