Hadoop FS Shell Expunge: Optimizing HDFS Storage with Ease
Labby
Posted on June 26, 2024
Introduction
Welcome to our exciting lab set in an interstellar base where you play the role of a skilled intergalactic communicator. In this scenario, you are tasked with managing the Hadoop HDFS using the FS Shell expunge command to maintain data integrity and optimize storage utilization. Your mission is to ensure the efficient cleanup of unnecessary files and directories to free up storage space and improve system performance.
Enabling and Configuring the HDFS Trash Feature
In this step, you enable the HDFS Trash feature in core-site.xml, restart HDFS, and confirm that a deleted file is moved to the trash instead of being removed permanently.
- Open the terminal and switch to the hadoop user:
su - hadoop
- Edit /home/hadoop/hadoop/etc/hadoop/core-site.xml to enable the Trash feature:
nano /home/hadoop/hadoop/etc/hadoop/core-site.xml
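Add the following properties between the <configuration> tags. Both values are in minutes, so 1440 keeps deleted files for 24 hours before they become eligible for removal: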
<property>
<name>fs.trash.interval</name>
<value>1440</value>
</property>
<property>
<name>fs.trash.checkpoint.interval</name>
<value>1440</value>
</property>
Save the file and exit the text editor.
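To double-check the setting, the following is a minimal sketch rather than part of the original lab. The helper name minutes_to_hours is hypothetical. The hdfs getconf call only runs if the hdfs client is on your PATH, and it reads the client-side configuration files, so it should reflect the edit even before HDFS is restarted.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a trash interval in minutes to whole hours.
minutes_to_hours() {
  echo $(( $1 / 60 ))
}

interval=1440   # the value written to core-site.xml in this step
echo "fs.trash.interval=${interval} min = $(minutes_to_hours "$interval") h"

# Query the effective configuration only when the hdfs client is available.
if command -v hdfs >/dev/null 2>&1; then
  hdfs getconf -confKey fs.trash.interval
  hdfs getconf -confKey fs.trash.checkpoint.interval
fi
```

If both getconf calls print 1440, the properties were picked up from core-site.xml.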
- Restart the HDFS service so the new settings take effect:
Stop the HDFS service:
/home/hadoop/hadoop/sbin/stop-dfs.sh
Start the HDFS service:
/home/hadoop/hadoop/sbin/start-dfs.sh
- Create a file in HDFS and then delete it:
Create an empty file in HDFS:
hdfs dfs -touchz /user/hadoop/test.txt
Delete the file:
hdfs dfs -rm /user/hadoop/test.txt
- Check that the Trash feature is working by listing the trash directory:
hdfs dfs -ls /user/hadoop/.Trash/Current/user/hadoop/
You should see the file you deleted in the Trash directory.
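The listing path above follows a simple pattern: the deleted file's original absolute path is appended to the owner's .Trash/Current directory. A minimal sketch of that mapping, assuming the default /user/<name> home directory layout used in this lab (the function name trash_path is hypothetical):

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the trash location of a deleted file.
# $1 = HDFS user name, $2 = original absolute path of the file.
trash_path() {
  local user="$1" original="$2"
  printf '/user/%s/.Trash/Current%s\n' "$user" "$original"
}

src="$(trash_path hadoop /user/hadoop/test.txt)"
echo "$src"

# A file can be restored by moving it back out of the trash.
# The command is only printed here so the lab's trash state stays unchanged.
echo "hdfs dfs -mv $src /user/hadoop/test.txt"
```

The first line printed matches the directory listed in the check above, with the file name appended.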
Expunge Unnecessary Files
Now, let's proceed to expunge unnecessary files and directories using the FS Shell expunge command.
- Delete all trash checkpoints for the current user immediately, ignoring the fs.trash.interval setting:
hdfs dfs -expunge -immediate
- Verify that the unnecessary files are successfully expunged:
hdfs dfs -ls /user/hadoop/.Trash
There should be no files or directories listed.
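If you want to script this check, the sketch below is one possible approach, not part of the original lab. The function name listing_is_empty is hypothetical. It treats a listing with no non-blank lines as empty. Because error output is discarded, a missing .Trash directory is also reported as empty.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when standard input has no non-blank lines.
listing_is_empty() {
  ! grep -q '[^[:space:]]'
}

# Run the check only when the hdfs client is available.
if command -v hdfs >/dev/null 2>&1; then
  if hdfs dfs -ls /user/hadoop/.Trash 2>/dev/null | listing_is_empty; then
    echo "Trash is empty"
  else
    echo "Trash still has entries"
  fi
fi
```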
Summary
In this lab, we used the Hadoop FS Shell expunge command to manage storage in the Hadoop Distributed File System. By enabling the Trash feature in core-site.xml, confirming that deleted files land in the trash directory, and expunging trash checkpoints, you have seen how HDFS protects against accidental deletion and how to reclaim the space that trashed files occupy. Practicing these steps will help you manage storage in your Hadoop environment efficiently.