JuiceFS 1.2: Introducing Enterprise-Grade Permission Management and Smooth Upgrades

daswu

Posted on June 21, 2024

JuiceFS Community Edition 1.2 is released today! This marks the third major release since its open-source debut in 2021. This version is also a long-term support (LTS) release. We will continue to maintain versions 1.2 and 1.1, while version 1.0 will no longer receive updates.

JuiceFS is an open-source distributed file system designed for cloud environments, supporting 10+ metadata engines and 30+ data storage engines. This flexibility empowers users to adapt to diverse enterprise environments and data storage requirements. Moreover, JuiceFS is compatible with multiple access protocols, including POSIX, HDFS, S3, and WebDAV, and can serve as a Persistent Volume in Kubernetes. This ensures seamless data flow across different applications.
Licensed under Apache 2.0, JuiceFS Community Edition allows users to modify and enhance it according to their specific needs. This makes it suitable for various commercial environments.

This post provides a brief introduction to the new features and optimizations in JuiceFS 1.2. Feel free to download and try it out.

New features and optimizations

Over the past few years, JuiceFS has been widely adopted across various industries and use cases, particularly in the fields of AI and foundation models. To address complex permission management challenges in these massive data scenarios, JuiceFS 1.2 introduces several new features and optimizes existing features:

  • POSIX ACLs: Enables fine-grained user permission management with the standard Linux ACL tools (setfacl/getfacl).
  • Smooth upgrades: Remounting JuiceFS at the same mount point upgrades the client without disrupting applications. Mount parameters can also be adjusted online.
  • Advanced S3 Gateway features: Adds Identity and Access Management (IAM) and event notifications. These improve security and flexibility, and enable automated data management and monitoring in multi-user environments and complex application scenarios.
  • JuiceFS Sync optimization: Improves selective synchronization and optimizes performance for large directories and complex migration tasks, making data synchronization more efficient.
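
As a rough illustration of the POSIX ACL workflow, the Python sketch below builds a setfacl command line and parses getfacl-style text output. The path, the user name alice, and the helper names (AclEntry, parse_getfacl, setfacl_grant) are hypothetical, and the sketch assumes ACL support is already enabled on the file system. It is not taken from JuiceFS itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AclEntry:
    tag: str        # "user", "group", "mask", or "other"
    qualifier: str  # user or group name; empty for the owning user/group
    perms: str      # three characters such as "rw-"


def parse_getfacl(output: str) -> list[AclEntry]:
    """Parse the access entries from getfacl's text output."""
    entries = []
    for line in output.splitlines():
        # Drop "# file: ..." header lines and trailing "#effective:" comments.
        line = line.split("#", 1)[0].strip()
        if not line or line.startswith("default:"):
            continue  # skip blank lines and default (inherited) entries
        tag, qualifier, perms = line.split(":", 2)
        entries.append(AclEntry(tag, qualifier, perms))
    return entries


def setfacl_grant(path: str, user: str, perms: str) -> list[str]:
    """Build the argument list for `setfacl -m u:<user>:<perms> <path>`."""
    return ["setfacl", "-m", f"u:{user}:{perms}", path]


# Sample text in the format getfacl prints; the path and user are made up.
SAMPLE = """\
# file: mnt/jfs/dataset
# owner: root
# group: root
user::rwx
user:alice:r-x
group::r-x
mask::r-x
other::---
"""

entries = parse_getfacl(SAMPLE)
for entry in entries:
    print(entry)
```

The argument list returned by setfacl_grant can be passed to subprocess.run on a mounted file system. Parsing is kept separate so it can be checked without touching a real mount.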

More application scenarios

  • Support for NFS/Dragonfly/Bunny as object storage: Offers more flexibility in choosing backend storage for specific scenarios. JuiceFS now supports 40+ object storage services.

Increased stability

  • Automatic disk failure detection and isolation: JuiceFS uses the client's local disks for data caching, which speeds up data access in most scenarios. This release automatically detects and isolates faulty disks, keeping the system stable and minimizing disruption to applications when hardware fails.

  • Optimized dump command and metadata auto-backup to improve metadata export performance: The previous implementation loaded all key-value pairs into memory to speed up metadata export, which put significant memory pressure on large-scale file systems. In version 1.2, JuiceFS automatically selects a strategy based on the number of files: when there are more than one hundred thousand files in total, it exports file by file. This strategy also uses concurrent prefetching so that export speed does not suffer.
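
The strategy selection described for the dump command can be sketched roughly as follows. This is an illustrative Python model, not JuiceFS's actual implementation: the function names, the strategy labels, the prefetch window size, and the load callback are all assumptions. Only the one-hundred-thousand-file threshold comes from the text above.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")

# "more than one hundred thousand files", as stated in the text
FILE_BY_FILE_THRESHOLD = 100_000


def choose_export_strategy(total_files: int) -> str:
    """Pick an export strategy from the total number of files."""
    if total_files > FILE_BY_FILE_THRESHOLD:
        return "file-by-file"
    return "in-memory"


def export_with_prefetch(
    inodes: Iterable[int],
    load: Callable[[int], T],
    window: int = 8,  # hypothetical number of lookups kept in flight
) -> Iterator[T]:
    """Yield load(inode) in input order while fetching up to `window` ahead."""
    with ThreadPoolExecutor(max_workers=window) as pool:
        pending = deque()
        for inode in inodes:
            pending.append(pool.submit(load, inode))
            if len(pending) >= window:
                # Block on the oldest request so output order is preserved
                # and memory use stays bounded by the window size.
                yield pending.popleft().result()
        while pending:
            yield pending.popleft().result()


# Hypothetical loader standing in for a metadata-engine lookup.
records = list(export_with_prefetch(range(20), lambda i: {"inode": i}))
print(choose_export_strategy(2_000_000), len(records))
```

Bounding the number of in-flight lookups is the point of the sketch: memory use depends on the window size rather than on the size of the file system.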

Enhanced usability

  • The juicefs compact command: Users can now manually compact specified paths. In previous versions, the juicefs gc --compact command performed a global compact operation to reduce object storage usage. For large-scale file systems, however, this gc command often took a long time, leading to a poor user experience. The new compact command lets users compact only the paths they specify, making the operation more flexible.

  • New --cache-expire option in the juicefs mount command: The --cache-expire option sets an expiration time for the local data cache. Once the specified time has passed, the relevant cache data is deleted automatically. Previously, cache cleanup was triggered only when the cache disk reached its capacity threshold, so the new option gives users a more flexible way to manage the cache.

  • Optimized the juicefs warmup command: Enables users to manually clear cache blocks on specified paths and check the existing cache ratio on those paths. This facilitates more reliable and effective management of client data caching, thereby improving application cache hit rates.

  • Background operation support for gateway/webdav: Allows users to run gateway/webdav as a daemon in the background, enhancing service availability and stability. It enables easier integration and usage of JuiceFS in various network environments.

  • Multiple usability enhancements: Includes more human-friendly formats for command-line parameters, such as direct use of “128K” and “4M” to specify block sizes, more monitoring metrics, and a more reliable debug information collection command.
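
To make the human-friendly size format concrete, here is a minimal Python sketch of how values such as "128K" and "4M" could be converted to bytes. It assumes binary multiples (1K = 1024 bytes) and a hypothetical helper name parse_size. JuiceFS's own parser may accept more forms or behave differently.

```python
import re

# Assumed binary multiples: K = 1024, M = 1024**2, and so on.
_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}
_SIZE_RE = re.compile(r"^(\d+)([KMGT]?)$", re.IGNORECASE)


def parse_size(text: str) -> int:
    """Convert a size string such as '128K' or '4M' to a number of bytes."""
    match = _SIZE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit.upper()]


for value in ("128K", "4M", "512"):
    print(value, "->", parse_size(value), "bytes")
```

A string without a suffix is treated as a plain byte count, and anything else raises ValueError rather than being guessed at.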

Rapid community growth

Open-sourced in January 2021, JuiceFS has obtained 10k stars on GitHub. The latest version has seen 410 new issues, 464 merged pull requests, and 44 contributors.

Anonymous usage reports show continued rapid growth across user metrics. Our user base is steadily expanding, with 57% from Asia, 33% from the United States, and 10% from Europe.

New case studies:

Check out more user stories.

Upcoming features

We will gradually implement the following features in future versions. Contributions are welcome:

  • Distributed data caching
  • Support for Kerberos and Ranger
  • User and group quotas

Try it out!

You are welcome to download and try JuiceFS 1.2! If you have any questions, join the JuiceFS discussions on GitHub or our community on Slack.

As JuiceFS enters its fourth year of open-source development, it has grown from a new brand to a widely adopted product. Originally supporting Hadoop in the cloud, JuiceFS has expanded its applications into AI training, inference, and beyond. Now it is an essential tool in many engineers’ daily workflows.

We are truly grateful for the invaluable contributions of our community members. Your feedback, solutions, code contributions, and practical insights have been instrumental in our journey. Thank you for being part of the JuiceFS community!
