6 Essential Tips for JuiceFS Users
DASWU
Posted on November 24, 2023
As big data and artificial intelligence (AI) technologies continue to evolve, more enterprises, teams, and individuals are adopting JuiceFS, an open-source, high-performance distributed file system designed for the cloud. This article compiles six practical tips to help you manage JuiceFS more efficiently:
- Viewing mounted file systems
- Streamlining management using bash scripts
- Checking how many clients are mounted concurrently
- Enabling/disabling the trash feature
- Completely destroying a file system
- Metadata backup and restoration
Viewing mounted file systems
Sometimes you have multiple JuiceFS file systems mounted on a single machine, or the same file system mounted with different options across multiple machines. Working out which machine mounts which file system, and with which tuning options, is a common need. Here are a few convenient methods, illustrated on a Linux system:
Method 1: Using the ps command
ps aux | grep juicefs
The output lists the JuiceFS mount processes running in the background (the last line is the grep command itself):
herald 36290 0.2 0.1 800108 78848 ? Sl 11:07 0:24 juicefs mount -d sqlite3:///home/herald/jfs/my.db /home/herald/jfs/mnt
herald 37190 1.3 0.1 3163100 106160 ? Sl 11:11 2:12 juicefs mount -d badger:///home/herald/jfs/mydb /home/herald/jfs/mnt2
herald 68886 0.0 0.0 221812 2400 pts/0 S+ 13:54 0:00 grep --color=auto --exclude-dir=.bzr --exclude-dir=CVS --exclude-dir=.git --exclude-dir=.hg --exclude-dir=.svn --exclude-dir=.idea --exclude-dir=.tox juicefs
Method 2: Using the pgrep and cat commands
On Linux, information about each process is available in the /proc file system, in a directory named after the process identifier (PID).
Use pgrep to find the PIDs of the juicefs mount processes:
pgrep juicefs
This will output the PIDs of the juicefs mount processes, for example:
36290
37190
Use cat /proc/PID/cmdline to print the command line of each process, for example:
cat /proc/36290/cmdline
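It will output something similar to the following. The kernel separates the arguments in cmdline with NUL bytes, so a terminal may show them run together; piping the output through tr '\0' ' ' restores the spaces.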
juicefs mount -d sqlite3:///home/herald/jfs/my.db /home/herald/jfs/mnt
Method 3: Using a bash script
I've integrated Method 2 into a bash script available on GitHub Gist:
# Download the bash script.
curl -LO https://gist.githubusercontent.com/yuhr123/4e7a09653e833a083dae87ba76b7d642/raw/d8de5350955aa33a3bfafc7cf3756c5f8f3fa04d/proc
# Grant script execution permissions.
chmod +x proc
# Run the script.
./proc juicefs
It will output something similar to the following:
PID: 36290, Command Line: juicefs mount -d sqlite3:///home/herald/jfs/my.db /home/herald/jfs/mnt
PID: 37190, Command Line: juicefs mount -d badger:///home/herald/jfs/mydb /home/herald/jfs/mnt2
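If you would rather not download a script, the same idea can be sketched inline. This is a minimal sketch under assumptions, not the content of the Gist: the function names print_cmdline and list_by_name are made up for this example, and it relies on /proc and pgrep being available, as on a typical Linux system.

```shell
#!/bin/bash
# Sketch only: this is not the Gist script; function names are illustrative.

# Print the PID and command line of one process.
# /proc/<pid>/cmdline separates arguments with NUL bytes, so tr turns them into spaces.
print_cmdline() {
  local pid="$1"
  local cmdline
  cmdline="$(tr '\0' ' ' 2>/dev/null < "/proc/${pid}/cmdline")" || return 1
  # Drop the trailing space left by the final NUL byte.
  printf 'PID: %s, Command Line: %s\n' "$pid" "${cmdline% }"
}

# Print the command line of every process whose name matches the pattern.
list_by_name() {
  local pid
  for pid in $(pgrep "$1"); do
    print_cmdline "$pid"
  done
}
```

After sourcing these functions, list_by_name juicefs should print one line per matching process in the same shape as the output above.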
Streamlining management using bash scripts
The JuiceFS client is operated from the command line. While it's not hard to use, typing commands directly can be cumbersome, especially for users who have just started or who repeatedly adjust mount options while tuning performance. Bash scripts can help manage these commands.
Creating a file system using a script
For example, create a script named format-myjfs.sh to hold the command that creates a file system:
#!/bin/bash
juicefs format --storage s3 \
--bucket xxx \
--access-key xxx \
--secret-key xxx \
redis://xxx.xxx.xxx/1 \
myjfs
Run the script:
bash format-myjfs.sh
This script makes it easy to check at any time which bucket and database the file system is built on. The drawback is that it may contain the access keys of the object storage or the database, so if you manage things this way, you must store the script securely. You can pass sensitive information through environment variables instead, or use gpg to symmetrically encrypt the script after use.
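One way to keep secrets out of the script file is to read them from the environment. The following is a minimal sketch: the variable names JFS_BUCKET, JFS_ACCESS_KEY, JFS_SECRET_KEY, and JFS_META_URL are hypothetical names chosen for this example, while the juicefs format flags are the ones used in the script above.

```shell
#!/bin/bash
# Sketch: JFS_BUCKET, JFS_ACCESS_KEY, JFS_SECRET_KEY and JFS_META_URL are
# hypothetical variable names; export them in your shell before calling the function.
format_myjfs() {
  # Refuse to run when any required value is missing.
  if [ -z "${JFS_BUCKET:-}" ] || [ -z "${JFS_ACCESS_KEY:-}" ] ||
     [ -z "${JFS_SECRET_KEY:-}" ] || [ -z "${JFS_META_URL:-}" ]; then
    echo "error: set JFS_BUCKET, JFS_ACCESS_KEY, JFS_SECRET_KEY and JFS_META_URL first" >&2
    return 1
  fi
  juicefs format --storage s3 \
    --bucket "$JFS_BUCKET" \
    --access-key "$JFS_ACCESS_KEY" \
    --secret-key "$JFS_SECRET_KEY" \
    "$JFS_META_URL" \
    myjfs
}
```

With this layout the script file itself holds no credentials, although the values are still visible in the process list while the command runs.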
Managing file system mounting with a script
Mounting a file system is a frequent, everyday management task. For example, create a script named mount-myjfs.sh:
#!/bin/bash
juicefs mount \
--cache-dir /mnt/juicefs-cache \
--buffer-size 2048 \
--writeback \
--free-space-ratio 0.5 \
redis://xxx.xxx.xxx/1 \
/mnt/myjfs
Run the script:
bash mount-myjfs.sh
This script provides a more intuitive way to adjust mounting options.
Checking how many clients are mounted concurrently
A key feature of a cloud file system is that it can be mounted simultaneously by multiple clients on different networks. For example, if the same file system is mounted in a data center in Chicago and in another data center in New York, servers in both places can read and write at the same time. JuiceFS' transaction mechanism ensures the consistency of the written data.
To view the currently mounted clients, use the status command:
juicefs status redis://192.168.1.80/1
The output, in JSON format, includes information about active sessions, such as software version, hostname, IP address, mount point, and process ID. For example:
{
"Setting": {
"Name": "myjfs",
"UUID": "520ae432-f355-43d2-a445-020787f325f4",
"Storage": "minio",
"Bucket": "http://192.168.1.80:9123/myjfs",
"AccessKey": "admin",
"SecretKey": "removed",
"BlockSize": 4096,
"Compression": "none",
"EncryptAlgo": "aes256gcm-rsa",
"KeyEncrypted": true,
"TrashDays": 1,
"MetaVersion": 1,
"MinClientVersion": "1.1.0-A",
"DirStats": true
},
"Sessions": [
{
"Sid": 2,
"Expire": "2023-10-27T09:08:09+08:00",
"Version": "1.1.0+2023-09-04.08c4ae6",
"HostName": "homelab",
"IPAddrs": [
"192.168.1.80"
],
"MountPoint": "/home/herald/jfs/mnt3",
"ProcessID": 173507
},
{
"Sid": 4,
"Expire": "2023-10-27T09:08:11+08:00",
"Version": "1.1.0+2023-09-04.08c4ae6",
"HostName": "HeralddeMacBook-Air.local",
"IPAddrs": [
"192.168.3.102"
],
"MountPoint": "webdav",
"ProcessID": 20746
}
],
"Statistic": {
"UsedSpace": 4347064320,
"AvailableSpace": 1125895559778304,
"UsedInodes": 11,
"AvailableInodes": 10485760
}
}
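To answer the "how many clients" question directly, the Sessions entries can be counted and summarized. The sketch below assumes the status output keeps one field per line, as in the example above; the function names count_sessions and list_mounts are made up for illustration, and a real JSON tool such as jq would be more robust if it is installed.

```shell
#!/bin/bash
# Sketch: both functions read `juicefs status` output from stdin and assume
# one JSON field per line, as in the sample output.

# Count active sessions (each session has exactly one "Sid" field).
count_sessions() {
  grep -c '"Sid":'
}

# Print "hostname<TAB>mount point" for each session.
list_mounts() {
  awk -F'"' '
    /"HostName":/   { host = $4 }            # remember the host of the current session
    /"MountPoint":/ { print host "\t" $4 }   # emit it together with the mount point
  '
}
```

For example, juicefs status redis://192.168.1.80/1 | count_sessions would print the number of mounted clients for the file system shown above.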
Enabling/disabling the trash feature
JuiceFS supports a trash feature as a safety mechanism against accidental deletions. By default, the trash is enabled: deleted files are kept in the .trash directory for one day before they are permanently removed. When you run optimization tests that frequently create and delete temporary files, disabling the trash lets storage space be released promptly.
Use the config command with the --trash-days option to control the trash. The value is the number of days the trash retains deleted files. If you set it to 0, the trash feature is disabled. For example:
# Set the trash to retain files for 7 days.
juicefs config META-URL --trash-days=7
# Disable the trash feature.
juicefs config META-URL --trash-days=0
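After changing the setting, you may want to confirm what is currently in effect. The status output shown earlier includes a TrashDays field, so a small helper can report it. This is a sketch that assumes that field layout; describe_trash is a hypothetical name.

```shell
#!/bin/bash
# Sketch: reads `juicefs status` output from stdin and reports the trash setting.
describe_trash() {
  local days
  # Extract the number that follows "TrashDays": on its line.
  days="$(sed -n 's/.*"TrashDays": *\([0-9][0-9]*\).*/\1/p' | head -n 1)"
  case "$days" in
    "") echo "TrashDays not found" >&2; return 1 ;;
    0)  echo "trash is disabled" ;;
    *)  echo "trash retains deleted files for $days day(s)" ;;
  esac
}
```

Usage would look like juicefs status META-URL | describe_trash.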
Completely destroying a file system
When you are new to a technology, knowing how to clean up and delete a file system is important. Destroying a JuiceFS file system, like creating one, involves a few confirmation steps:
- Use the status command to find the UUID of the file system to be deleted:
# juicefs status redis://192.168.1.80/1
{
"Setting": {
"Name": "myjfs",
"UUID": "520ae432-f355-43d2-a445-020787f325f4",
"Storage": "minio",
"Bucket": "http://192.168.1.80:9123/myjfs",
- Confirm that all clients have stopped using the file system, as active mounts prevent destruction.
- Execute the destroy command to destroy the file system:
juicefs destroy redis://192.168.1.80/1 520ae432-f355-43d2-a445-020787f325f4
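Copying the UUID by hand is error-prone, so the lookup can be scripted. The sketch below only prints the destroy command for review rather than running it, since destruction is irreversible. The function names extract_uuid and print_destroy_cmd are hypothetical, and the parsing assumes the status layout shown above.

```shell
#!/bin/bash
# Sketch: pull the UUID out of `juicefs status` output read from stdin.
extract_uuid() {
  sed -n 's/.*"UUID": *"\([0-9a-fA-F-]*\)".*/\1/p' | head -n 1
}

# Print (do not run) the destroy command for the file system behind a metadata URL.
print_destroy_cmd() {
  local meta_url="$1"
  local uuid
  uuid="$(juicefs status "$meta_url" | extract_uuid)"
  if [ -z "$uuid" ]; then
    echo "error: no UUID found for $meta_url" >&2
    return 1
  fi
  echo "juicefs destroy $meta_url $uuid"
}
```

Review the printed command, confirm that no clients are still mounted, and only then run it yourself.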
Metadata backup and restoration
JuiceFS stores data and metadata separately:
- Data is split into blocks and stored in object storage.
- Metadata, containing crucial information like file names, sizes, locations, and permissions, is stored in a separate database.
When you access files, you must first retrieve the metadata before you get the actual data. Metadata is crucial to any file system.
To ensure metadata safety, JuiceFS enables automatic hourly backups to the meta directory of the object storage bucket. If the metadata engine fails, you can download the latest backup and restore the metadata using the load command.
When you restore metadata, note that:
- You can only restore the metadata to a new database.
- You must reset the secret key of the object storage.
For example, assume that your file system was created using Redis database 1, that database is now damaged, and you need to rebuild the metadata on database 2. Download the latest backup from the meta directory of the object storage, then follow the steps below to restore it.
# Import metadata backup into a new database.
juicefs load redis://192.168.1.80/2 dump-2023-10-27-025129.json.gz
# Update object storage secret key.
juicefs config --secret-key xxx redis://192.168.1.80/2
Note:
There is inevitably a time lag between automatic backup and the occurrence of a failure. It's impossible to recover new data created between the last backup and the occurrence of a failure.
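If you have downloaded several backups, picking the newest one can be scripted. The sketch below assumes every backup file follows the dump-YYYY-MM-DD-HHMMSS.json.gz naming seen in the example above, so that alphabetical order matches chronological order; latest_dump is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: print the newest dump-*.json.gz file in a directory (default: current directory).
# Relies on the timestamped file names sorting chronologically.
latest_dump() {
  local dir="${1:-.}"
  local latest=""
  local f
  # Glob results are sorted, so the last existing match is the newest backup.
  for f in "$dir"/dump-*.json.gz; do
    [ -e "$f" ] && latest="$f"
  done
  [ -n "$latest" ] && printf '%s\n' "$latest"
}
```

The result can then be passed to the load command, for example juicefs load redis://192.168.1.80/2 "$(latest_dump ./backups)", where ./backups stands for wherever you saved the downloads.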
Such failures are rare, extreme situations. A more common need is migrating metadata between different databases, which is also simple:
- Stop the applications that read from and write to the file system.
- Use the dump command to export the metadata.
- Use the load command to import it into the target database.
# Export metadata to the meta-dump.json file.
juicefs dump redis://192.168.1.80/1 meta-dump.json
# Import metadata into a new SQLite database.
juicefs load sqlite3://myjfs.db meta-dump.json
# Update the secret key of the object storage.
juicefs config --secret-key xxx sqlite3://myjfs.db
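The three commands above can be chained so that a failed export never leads to an import. This is a minimal sketch using only the dump, load, and config commands shown in this section; the function name migrate_meta and the JFS_SECRET_KEY variable are hypothetical names chosen for this example.

```shell
#!/bin/bash
# Sketch: migrate metadata from one database to another, stopping at the first failure.
# Expects the object storage secret key in the (hypothetical) JFS_SECRET_KEY variable.
migrate_meta() {
  local src="${1:-}"
  local dst="${2:-}"
  local dump_file="${3:-meta-dump.json}"
  if [ -z "$src" ] || [ -z "$dst" ] || [ -z "${JFS_SECRET_KEY:-}" ]; then
    echo "usage: JFS_SECRET_KEY=... migrate_meta SRC-META-URL DST-META-URL [DUMP-FILE]" >&2
    return 1
  fi
  # Each step runs only if the previous one succeeded.
  juicefs dump "$src" "$dump_file" &&
    juicefs load "$dst" "$dump_file" &&
    juicefs config --secret-key "$JFS_SECRET_KEY" "$dst"
}
```

For the example above, the call would be migrate_meta redis://192.168.1.80/1 sqlite3://myjfs.db, run after the applications using the file system have been stopped.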
If you have any questions or would like to learn more details, feel free to join discussions about JuiceFS on GitHub and the JuiceFS community on Slack.