Using FUSE to map network statistics to directories
Efim Rovaev
Posted on March 2, 2023
Intro
FUSE is a great Linux tool that lets mere mortals implement file systems and run them in user space. Let's have some fun with it and implement a simple file system that shows network statistics! We'll use procfs, a virtual file system that exposes system stats.
We will use Go to make things easier to write.
You can try it right now by cloning the repo, building from source and mounting the file system to a directory like this:
$ git clone https://github.com/r4dx/netstatfs
$ cd netstatfs
$ go build
$ mkdir m
$ ./netstatfs --mount=m&
$ ls m
The m folder will contain directories, each representing a process. Within each directory there will be files, each representing a socket connection.
Now that we know what we want to achieve, let's take a closer look at the tools we'll be using.
FUSE
FUSE, or Filesystem in Userspace, is a software interface that lets users create their own file systems without writing kernel drivers. When a program accesses a file or directory on a FUSE mount, it issues an ordinary system call such as open or read. The kernel's FUSE module turns that call into a request and passes it through the FUSE device file (/dev/fuse) to the file system implementation running in user space (something that we'll implement!), usually with the help of a FUSE library. The implementation's reply travels back to the calling program the same way.
This is, of course, a very brief description of FUSE. You can dive into detail as deep as you want (or need) by visiting these resources:
- Official FUSE website with detailed information about the technology, its design and the ways it works: https://github.com/libfuse/libfuse
- Wikipedia page on FUSE covering its history, design and implementation: https://en.wikipedia.org/wiki/Filesystem_in_Userspace
- Linux manual page on FUSE, providing an insight into using the tool in Linux environments: https://man7.org/linux/man-pages/man4/fuse.4.html
- There are also several online step-by-step tutorials on FUSE:
  - https://www.ibm.com/developerworks/library/l-fuse/index.html
  - https://www.cs.nmsu.edu/~pfeiffer/fuse-tutorial/html/index.html
How does netstat do it?
Apart from FUSE, our project employs the proc file system, a Linux virtual file system providing information about the system and processes running on it. We will be using procfs to retrieve network statistics.
One interesting note is that the Linux netstat command also uses procfs to retrieve network stats. This is how it does it:
- It opens the /proc/ directory and reads all numerical IDs there. These IDs stand for the processes that are currently running on the system.
- For each process, it reads all file descriptor IDs in /proc/{processId}/fd/
- It resolves the links behind those IDs and keeps the ones that point to sockets, which have the following format: socket:[socketInode] or [0000]:socketInode
- Then it stores the mapping of socketInode to its corresponding process in the prg_hash hash map.
- Finally, it reads the network protocol files in /proc/net/{tcp,tcp6,udp,udp6,igmp,igmp6,unix} that have an inode field. This field is then used to retrieve the process from the prg_hash hash map.
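The last step above can be sketched in Go. This is a minimal illustration, not netstat's or netstatfs's actual code: it parses one line in the /proc/net/tcp layout and pulls out the local address, remote address and inode. The names tcpEntry, parseHexAddr and parseTCPLine are made up for this example, and the decoding assumes IPv4 on a little-endian host, where the kernel prints the address bytes in reverse order.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"net"
	"strconv"
	"strings"
)

// tcpEntry is a hypothetical type holding the fields we care about.
type tcpEntry struct {
	Local  string
	Remote string
	Inode  uint64
}

// parseHexAddr decodes an address such as "0100007F:1F90".
// Assumption: IPv4 on a little-endian host, so the four address
// bytes are stored in reverse order. The port is plain hex.
func parseHexAddr(s string) (string, error) {
	host, port, ok := strings.Cut(s, ":")
	if !ok {
		return "", fmt.Errorf("malformed address %q", s)
	}
	raw, err := hex.DecodeString(host)
	if err != nil || len(raw) != 4 {
		return "", fmt.Errorf("malformed IPv4 address %q", host)
	}
	ip := net.IPv4(raw[3], raw[2], raw[1], raw[0])
	p, err := strconv.ParseUint(port, 16, 16)
	if err != nil {
		return "", fmt.Errorf("malformed port %q: %w", port, err)
	}
	return net.JoinHostPort(ip.String(), strconv.FormatUint(p, 10)), nil
}

// parseTCPLine parses one data line of /proc/net/tcp.
// After splitting on whitespace, field 1 is the local address,
// field 2 the remote address and field 9 the socket inode.
func parseTCPLine(line string) (tcpEntry, error) {
	f := strings.Fields(line)
	if len(f) < 10 {
		return tcpEntry{}, fmt.Errorf("expected at least 10 fields, got %d", len(f))
	}
	local, err := parseHexAddr(f[1])
	if err != nil {
		return tcpEntry{}, err
	}
	remote, err := parseHexAddr(f[2])
	if err != nil {
		return tcpEntry{}, err
	}
	inode, err := strconv.ParseUint(f[9], 10, 64)
	if err != nil {
		return tcpEntry{}, fmt.Errorf("malformed inode %q: %w", f[9], err)
	}
	return tcpEntry{Local: local, Remote: remote, Inode: inode}, nil
}

func main() {
	// In a real run this map would be filled from /proc/{processId}/fd/;
	// here it is hard-coded to keep the sketch self-contained.
	inodeToPid := map[uint64]int{12345: 4242}

	line := "   0: 0100007F:1F90 00000000:0000 0A 00000000:00000000 00:00000000 00000000  1000        0 12345 1 0000000000000000 100 0 0 10 0"
	e, err := parseTCPLine(line)
	if err != nil {
		panic(err)
	}
	fmt.Printf("pid=%d %s -> %s\n", inodeToPid[e.Inode], e.Local, e.Remote)
}
```

The header line of /proc/net/tcp has no colon in its address column, so parseTCPLine returns an error for it and a caller can simply skip lines that fail to parse.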
It's worth noting that the ss command, which serves a similar purpose, uses the netlink protocol instead. To learn more about this, you can refer to the following link: https://man7.org/linux/man-pages/man7/sock_diag.7.html
Diving into the Code
Now let us take a closer look at the code and discuss it bit by bit:
netstatfs.go implements a file system using the bazil.org/fuse library. The file system exposes information about the network connections of the processes running on the system.
The main function first sets up the mount point for the file system by parsing a command-line argument. Then it mounts the file system and opens a FUSE connection by calling fuse.Mount. The NewNetstatfs function creates a new instance of the Netstatfs struct, which acts as the root of the file system. Finally, main serves the Netstatfs instance as a FUSE file system by calling fs.Serve.
func main() {
	mountpoint := flag.String("mount", "", "mountpoint for FS")
	flag.Parse()

	conn, err := fuse.Mount(*mountpoint, fuse.FSName("netstatfs"))
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	netstatfs, err := NewNetstatfs()
	if err != nil {
		panic(err)
	}

	err = fs.Serve(conn, &netstatfs)
	if err != nil {
		panic(err)
	}
}
Let's take a look at the Netstatfs struct:
type Netstatfs struct {
	ProcessProvider ProcessProvider
	SocketProvider  SocketProvider
	FileIdProvider  FileIdProvider
	RootINode       uint64
}
The fields are mostly self-explanatory. Note that FileIdProvider provides unique file IDs for each node in the file system, and RootINode is the inode number of the root node.
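To make the role of FileIdProvider more concrete, here is one way such a provider could work. This is a hypothetical in-memory sketch, not the implementation from the repo: the type memFileIdProvider, the fileKey struct and the sequential allocation scheme are all assumptions. Only the two method names, GetByProcessId and GetBySocketId, come from the article's code.

```go
package main

import (
	"fmt"
	"sync"
)

// fileKey identifies a node: a process directory (isSocket == false)
// or a socket file inside that directory (isSocket == true).
type fileKey struct {
	pid      uint
	fd       uint64
	isSocket bool
}

// memFileIdProvider is a hypothetical allocator that hands out
// sequential inode numbers and remembers them per node.
type memFileIdProvider struct {
	mu   sync.Mutex
	next uint64
	ids  map[fileKey]uint64
}

// newMemFileIdProvider starts numbering right after the root inode,
// so the root never collides with a process or socket node.
func newMemFileIdProvider(rootINode uint64) *memFileIdProvider {
	return &memFileIdProvider{next: rootINode + 1, ids: make(map[fileKey]uint64)}
}

// get returns the inode already assigned to k, or assigns a new one.
func (p *memFileIdProvider) get(k fileKey) (uint64, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if id, ok := p.ids[k]; ok {
		return id, nil
	}
	id := p.next
	p.next++
	p.ids[k] = id
	return id, nil
}

func (p *memFileIdProvider) GetByProcessId(pid uint) (uint64, error) {
	return p.get(fileKey{pid: pid})
}

func (p *memFileIdProvider) GetBySocketId(pid uint, fd uint64) (uint64, error) {
	return p.get(fileKey{pid: pid, fd: fd, isSocket: true})
}

func main() {
	p := newMemFileIdProvider(1)
	dir, _ := p.GetByProcessId(100)
	sock, _ := p.GetBySocketId(100, 3)
	again, _ := p.GetByProcessId(100)
	fmt.Println(dir, sock, dir == again)
}
```

The important property is stability: asking twice for the same process or socket yields the same inode, so the numbers reported in directory listings and in lookups agree with each other.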
The Root method of Netstatfs returns the root directory of the file system. The root directory is implemented as an instance of the RootDir struct, which has methods to return its attributes and the contents of the directory. The contents of the root directory are names of the processes running on the system, with each name having the format <process ID>_<process name>.
func (me Netstatfs) Root() (fs.Node, error) {
	return RootDir{Root: &me}, nil
}

type RootDir struct {
	Root *Netstatfs
}
Look at the ReadDirAll method of the RootDir struct. It uses ProcessProvider to get the list of running processes and represents them as directories in our file system:
func (me RootDir) ReadDirAll(ctx context.Context) ([]fuse.Dirent, error) {
	processes, err := me.Root.ProcessProvider.GetProcesses()
	if err != nil {
		return nil, err
	}
	result := make([]fuse.Dirent, len(processes))
	for i, process := range processes {
		fn := processNameToFileName(process.Id, process.Name)
		inode, err := me.Root.FileIdProvider.GetByProcessId(process.Id)
		if err != nil {
			return nil, err
		}
		result[i] = fuse.Dirent{
			Inode: inode,
			Name:  fn,
			Type:  fuse.DT_Dir,
		}
	}
	return result, nil
}
Each process directory is then implemented as an instance of the ProcessDir struct. The ReadDirAll method returns the network sockets associated with the process. The names of the files have the format <socket local address>:<socket local port>_<socket remote address>:<socket remote port>.
type ProcessDir struct {
	Root    *Netstatfs
	Process Process
	INode   uint64
}

func (me ProcessDir) ReadDirAll(ctx context.Context) ([]fuse.Dirent, error) {
	sockets, err := me.Root.SocketProvider.GetSockets(me.Process.Id)
	if err != nil {
		log.Printf("Couldn't get sockets for process=%d: %s\n", me.Process.Id, err)
		return nil, err
	}
	result := make([]fuse.Dirent, len(sockets))
	for i, socket := range sockets {
		inode, err := me.Root.FileIdProvider.GetBySocketId(me.Process.Id, socket.Id)
		if err != nil {
			log.Printf("Couldn't get fileid for process=%d, socket=%d: %s", me.Process.Id, socket.Id, err)
			return nil, err
		}
		result[i] = fuse.Dirent{
			Inode: inode,
			Name:  socketToFileName(socket),
			Type:  fuse.DT_File,
		}
	}
	return result, nil
}
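The socketToFileName helper is not shown in this article. As a rough illustration of the naming format described above, a formatter could look like the sketch below. The socketInfo struct, its field names and the socketFileName function are assumptions made for this example and may differ from the types in the repo.

```go
package main

import "fmt"

// socketInfo is a hypothetical stand-in for the socket details
// that the file system knows about each connection.
type socketInfo struct {
	LocalAddr  string
	LocalPort  uint16
	RemoteAddr string
	RemotePort uint16
}

// socketFileName renders
// <local address>:<local port>_<remote address>:<remote port>.
func socketFileName(s socketInfo) string {
	return fmt.Sprintf("%s:%d_%s:%d", s.LocalAddr, s.LocalPort, s.RemoteAddr, s.RemotePort)
}

func main() {
	s := socketInfo{LocalAddr: "127.0.0.1", LocalPort: 8080, RemoteAddr: "10.0.0.5", RemotePort: 51234}
	fmt.Println(socketFileName(s))
}
```

Colons are legal in Linux file names (only "/" and the NUL byte are not), so addresses and ports can be embedded in the name without escaping.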
Notice that ProcessDir is connected to RootDir through the Lookup method, which RootDir implements:
func (me RootDir) Lookup(ctx context.Context, name string) (fs.Node, error) {
	id, err := fileNameToProcessId(name)
	if err != nil {
		return nil, err
	}
	process, err := me.Root.ProcessProvider.GetProcessById(id)
	if err != nil {
		return nil, err
	}
	inode, err := me.Root.FileIdProvider.GetByProcessId(id)
	if err != nil {
		return nil, err
	}
	return ProcessDir{
		Root:    me.Root,
		Process: process,
		INode:   inode,
	}, nil
}
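ReadDirAll and Lookup rely on a pair of helpers that convert between a process and its directory name, and those are not shown here. The sketch below illustrates one possible round trip for the <process ID>_<process name> format. The function names dirNameForProcess and processIdFromDirName are hypothetical, and replacing "/" in process names is an assumption of this example (some kernel thread names contain a slash, which is not allowed in a file name).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// dirNameForProcess builds "<process ID>_<process name>".
// Assumption: "/" in the process name is replaced with "-"
// because a file name cannot contain a slash.
func dirNameForProcess(id uint, name string) string {
	safe := strings.ReplaceAll(name, "/", "-")
	return strconv.FormatUint(uint64(id), 10) + "_" + safe
}

// processIdFromDirName extracts the process ID from a directory name.
// Only the text before the first underscore is parsed, so process
// names that themselves contain underscores are handled correctly.
func processIdFromDirName(dir string) (uint, error) {
	prefix, _, ok := strings.Cut(dir, "_")
	if !ok {
		return 0, fmt.Errorf("no underscore in %q", dir)
	}
	id, err := strconv.ParseUint(prefix, 10, 32)
	if err != nil {
		return 0, fmt.Errorf("bad process ID in %q: %w", dir, err)
	}
	return uint(id), nil
}

func main() {
	name := dirNameForProcess(42, "my_app")
	id, err := processIdFromDirName(name)
	if err != nil {
		panic(err)
	}
	fmt.Println(name, id)
}
```

Because only the numeric prefix is used on the way back, the process name in the directory entry is purely informational. A lookup handler would typically turn a parse failure into a "no such file" error so that tools like ls report a missing entry rather than an I/O error.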
Now that we've seen how the FUSE part works, let's take a quick look at how we get the list of open sockets for a process.
Basically, this is what GetSockets does:
func (me procfsSocketProvider) getProcessSocketInternal(processId uint, socketId uint64) (ProcessSocket, error) {
	file := strconv.FormatUint(uint64(processId), 10) + "/fd/" + strconv.FormatUint(socketId, 10)
	resolved, err := me.procfs.Readlink(file)
	if err != nil {
		return ProcessSocket{}, err
	}
	inode, err := me.getINodeIfSocket(resolved)
	if err != nil {
		return ProcessSocket{}, err
	}
	return ProcessSocket{
		INode:      inode,
		ProcessId:  processId,
		Id:         socketId,
		SocketInfo: SocketInfo{},
	}, nil
}

func (me procfsSocketProvider) GetSockets(processId uint) ([]ProcessSocket, error) {
	base := strconv.Itoa(int(processId)) + "/fd/"
	files, err := me.procfs.Readdirnames(base)
	if err != nil {
		return nil, err
	}
	result := make([]ProcessSocket, 0)
	for _, file := range files {
		fd, err := strconv.ParseUint(file, 10, 64)
		if err != nil {
			continue
		}
		socket, err := me.getProcessSocketInternal(processId, fd)
		if err != nil {
			continue
		}
		result = append(result, socket)
	}
	err = me.fillSocketInfo(result)
	if err != nil {
		return nil, err
	}
	return result, nil
}

func (me procfsSocketProvider) getINodeIfSocket(src string) (uint64, error) {
	matches := me.socketINodeRe.FindStringSubmatchIndex(src)
	outStr := string(me.socketINodeRe.ExpandString([]byte{}, "$inode", src, matches))
	res, err := strconv.ParseUint(outStr, 10, 64)
	if err != nil {
		return 0, err
	}
	return res, nil
}
The result is a list of all the sockets for the given process with detailed information about each socket.
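The regular expression behind socketINodeRe is not shown above. The sketch below illustrates the idea with an assumed pattern that has a named group called inode, matching the $inode template used in getINodeIfSocket. The names socketLinkRe and inodeFromLink are made up for this example, and the pattern covers only the socket:[socketInode] form. Unlike the original, which appears to rely on ParseUint rejecting an empty expansion when nothing matches, this version checks for a match explicitly.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// socketLinkRe is an assumed pattern for link targets such as
// "socket:[12345]"; the real pattern in the repo may differ.
var socketLinkRe = regexp.MustCompile(`^socket:\[(?P<inode>\d+)\]$`)

// inodeFromLink returns the socket inode encoded in a resolved
// /proc/{processId}/fd/{fd} link, or an error if the link does
// not point to a socket.
func inodeFromLink(target string) (uint64, error) {
	m := socketLinkRe.FindStringSubmatch(target)
	if m == nil {
		return 0, fmt.Errorf("%q is not a socket link", target)
	}
	return strconv.ParseUint(m[socketLinkRe.SubexpIndex("inode")], 10, 64)
}

func main() {
	for _, target := range []string{"socket:[12345]", "/dev/null", "pipe:[99]"} {
		inode, err := inodeFromLink(target)
		if err != nil {
			fmt.Println("skipping:", err)
			continue
		}
		fmt.Println("socket inode:", inode)
	}
}
```

Descriptors that point to regular files, pipes or devices simply fail the match, which mirrors how GetSockets skips every entry for which getProcessSocketInternal returns an error.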
Conclusion
I hope this gave you a quick glimpse of what you can do with the power of FUSE. If you read the source further, you'll see that with little effort we could even implement reading the content of each "file" (a socket in our case) by using a PCAP library.