Chandelier Axel
Posted on June 11, 2022
You have most likely heard that everything in UNIX is a file, but it never really clicked for you. Or you get it, but couldn't find the technical details behind it. Or you have no clue what this even means, like me before writing this.
In any case, you are in the right place. Got your coffee? ☕ Let's dive in.
Is everything really a file?
Yup.
This simple yet deep sentence sums up the entire architecture of the UNIX Operating System (OS).
Basically, every feature your OS offers relies on files: processes, devices (such as a printer or your mouse), networks, and even directories.
Wait, directories?
Yup. There are 7 different types of files; you can find a complete list here. For now, we'll focus on regular files and directories.
What is a file, if everything is a file?
From an OS point of view, a file is actually nothing more than a specific data structure called an inode. You can picture an inode as a table containing file-related information.
The inode itself contains all the metadata (data that describes other data) about the file it describes: the timestamps (last access, last modification, last inode change), the ownership, the permissions, the file size ... Note that a classic UNIX inode does not record a creation date, although some file systems add one.
⚠ The file name and the actual file data are not part of the inode. Keep that in mind, we'll cover this later.
Inode
Basics
You can display an inode's information directly in your terminal by typing the `ls -l` command.
Here's the kind of output you should receive:
-rw-r--r-- 1 user root 13 10 mai 22:10 index.txt
In order:
- File type and permissions. It's quite hard to read, so let's break it down together.
  - The first character specifies what kind of file this is (the 7 types we mentioned earlier).
    - `-` Regular file.
    - `d` Directory.
    - `l` Symbolic link.
    - `b` Block special file.
    - `c` Character special file.
    - `s` Socket.
    - `p` FIFO.
  - The next three characters are the owner's permissions for the file.
  - The next three characters are the group's permissions for the file.
  - The last three characters are the permissions of everyone else (others) for the file.

  Each group of three characters reads as follows:
  - Can this class of users read the file? `-` for no, `r` for yes.
  - Can this class of users write to the file? `-` for no, `w` for yes.
  - Can this class of users execute the file? `-` for no, `x` for yes.
- Number of hard links. You may find more information here.
- Owner name.
- Group name.
- Number of bytes in your file.
- Date of last modification.
- File name. Not a part of the inode, but still in the output. More on it later.
📌 You can find all this information, and more, directly from your terminal in a UNIX environment with the command `man ls`.
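To make the ten-character field a bit more concrete, here is a small Python sketch that rebuilds it for a temporary file. It relies on the standard library's `stat.filemode`; the helper name `describe` and the file name `index.txt` are only illustrative assumptions.

```python
import os
import stat
import tempfile


def describe(path):
    """Return (type character, permission string) the way ls -l shows them."""
    mode = os.lstat(path).st_mode       # lstat: do not follow symbolic links
    text = stat.filemode(mode)          # e.g. '-rw-r--r--'
    return text[0], text[1:]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        file_path = os.path.join(tmp, "index.txt")
        with open(file_path, "w") as handle:
            handle.write("hello, inode\n")
        os.chmod(file_path, 0o644)      # owner: rw-, group: r--, others: r--
        print(describe(file_path))      # ('-', 'rw-r--r--')
        print(describe(tmp)[0])         # d
```

The octal value `0o644` is just another spelling of the three permission triplets: 6 is read + write, 4 is read only.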
If you want a bit more information about a specific file, you can use the `stat` command directly.
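The same metadata is also reachable from code. As a rough sketch (the function name `inode_summary` and the sample file are assumptions made for this example), Python's `os.stat` returns the fields discussed above:

```python
import os
import tempfile
import time


def inode_summary(path):
    """Collect the inode fields discussed above, straight from os.stat."""
    info = os.stat(path)
    return {
        "inode": info.st_ino,
        "mode": oct(info.st_mode),
        "hard_links": info.st_nlink,
        "owner_uid": info.st_uid,      # numeric ID; ls translates it to a name
        "group_gid": info.st_gid,
        "size_bytes": info.st_size,
        "modified": time.ctime(info.st_mtime),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "index.txt")
        with open(path, "wb") as handle:
            handle.write(b"Hello, inode!")   # 13 bytes
        for field, value in inode_summary(path).items():
            print(f"{field:>11}: {value}")
```

Notice that nothing in the result mentions the file name: it is simply not part of what `os.stat` reads.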
Inode number
If you execute the `ls -li` command, you will find an extra column at the beginning of the output.
12345678 -rw-r--r-- 1 user root 13 10 mai 22:10 index.txt
It contains a very specific integer: the inode number.
💡 This number is the unique identifier of this specific file within its file system.
Every file has a different inode number (with the exception of hard links, which are several names for the same inode), and there is a maximum number of inodes your file system can handle. Once they are all in use, you will not be able to create any more files.
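The hard link exception can be checked with a short Python sketch. It assumes a file system that supports hard links; the function name `link_demo` and the file names are made up for the example.

```python
import os
import tempfile


def link_demo(directory):
    """Create a file plus a hard link to it and compare their inodes."""
    original = os.path.join(directory, "index.txt")
    alias = os.path.join(directory, "alias.txt")
    with open(original, "w") as handle:
        handle.write("same data\n")
    os.link(original, alias)            # second name for the same inode
    first, second = os.stat(original), os.stat(alias)
    return first.st_ino, second.st_ino, first.st_nlink


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ino_a, ino_b, links = link_demo(tmp)
        print(ino_a == ino_b)           # both names point at one inode
        print(links)                    # hard link count reported by the inode
```

Both names report the same inode number, and the hard link count (the second column of `ls -l`) goes from 1 to 2.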
File name and directories
If the file name is not stored in the inode, where is it stored?
You most likely guessed it from the title: the file name is stored within the parent directory.
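One way to convince yourself is to rename a file and watch its inode number. The sketch below assumes a rename inside a single directory on a POSIX file system; `rename_keeps_inode` and the file names are hypothetical.

```python
import os
import tempfile


def rename_keeps_inode(directory):
    """Rename a file and return its inode number before and after."""
    old = os.path.join(directory, "old_name.txt")
    new = os.path.join(directory, "new_name.txt")
    with open(old, "w") as handle:
        handle.write("data")
    before = os.stat(old).st_ino
    os.rename(old, new)                 # only the directory entry changes
    after = os.stat(new).st_ino
    return before, after


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        before, after = rename_keeps_inode(tmp)
        print(before == after)
```

The name changed, the inode did not, which is what you would expect if the name lives outside the inode.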
What exactly is a directory, then? It's actually a pretty simple thing. Run the `ls -l` command somewhere that contains a directory.
drw-r--r-- 1 user root 13 10 mai 23:17 directoryName
Notice the `d` at the beginning?
A directory file actually contains nothing but a mapping table between file names and their inode numbers.
Inode number | File name |
---|---|
900 | file.txt |
901 | data.json |
The list goes on for all the files and other directories it may contain.
A directory is nothing but a special kind of file: it also has an inode number, and its name is stored within its parent directory.
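The mapping table described above can be read from Python with `os.scandir`, which exposes each entry's name and inode number. This is only a sketch: `directory_table` is a hypothetical helper, and the inode numbers you get will differ from the ones in the table.

```python
import os
import tempfile


def directory_table(path):
    """Return the directory's entries as (inode number, name), sorted by name."""
    with os.scandir(path) as entries:
        rows = [(entry.inode(), entry.name) for entry in entries]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("file.txt", "data.json"):
            open(os.path.join(tmp, name), "w").close()
        for inode, name in directory_table(tmp):
            print(f"{inode} | {name}")
```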
📌 Run `ls -li` in the parent directory to see a directory's inode number.
899 drw-r--r-- 1 user root 13 10 mai 23:17 directoryName
Here's a complete representation of the content of a directory.
💡 The first two entries are references to the directory itself and to its parent directory.
📌 To see them, use the `-a` option of the `ls` command, which also shows hidden files.
Inode number | File name |
---|---|
899 | . |
898 | .. |
900 | file.txt |
901 | data.json |
902 | newDirectory |
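The `.` and `..` entries can be checked the same way. In this sketch (the helper `dot_entries` and the directory name are assumptions), the inode behind `.` should match the directory itself and the one behind `..` should match its parent:

```python
import os
import tempfile


def dot_entries(path):
    """Return the inode numbers that '.' and '..' resolve to inside path."""
    own = os.stat(os.path.join(path, ".")).st_ino
    parent = os.stat(os.path.join(path, "..")).st_ino
    return own, parent


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        child = os.path.join(tmp, "newDirectory")
        os.mkdir(child)
        own, parent = dot_entries(child)
        print(own == os.stat(child).st_ino)     # '.' is the directory itself
        print(parent == os.stat(tmp).st_ino)    # '..' is its parent
```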
The file data
So, where is the file data?
What makes the inode so special is that it keeps references (pointers) to the disk blocks that actually contain the data. That way, when we ask to read the file, the OS follows these pointers and retrieves the content block by block.
The pointers part is oversimplified for the sake of this article; indirect pointers are purposely omitted.
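The pointers themselves are not exposed to ordinary programs, but the inode does report how much block storage the file occupies. A minimal sketch, assuming a UNIX-like system where `os.stat` provides `st_blocks` (counted in 512-byte units) and `st_blksize`; the helper name `block_usage` is made up:

```python
import os
import tempfile


def block_usage(path):
    """Return (logical size, bytes allocated on disk, preferred I/O block size)."""
    info = os.stat(path)
    allocated = info.st_blocks * 512    # st_blocks counts 512-byte units
    return info.st_size, allocated, info.st_blksize


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "index.txt")
        with open(path, "wb") as handle:
            handle.write(b"x" * 10_000)
        size, allocated, block_size = block_usage(path)
        print(f"size: {size} bytes, allocated: {allocated} bytes, "
              f"block size: {block_size} bytes")
```

The allocated figure usually differs from the file size because storage is handed out in whole blocks, and the exact numbers depend on the file system.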
Let's get a visual recap for a file inode.
File Inode |
---|
File type and permissions |
Number of hard links |
Owner (user ID) |
Group (group ID) |
Number of bytes |
(A lot more metadata) |
Pointer toward block n° 100 |
Pointer toward block n° 101 |
I hope you enjoyed the read, and learned as much as I did. If you have any questions or comments, feel free to reach out to me on Twitter or in the comments below. Have a nice day!