Disruption and Restoration: A Python Solution for Fragmenting and Defragmenting Files.

shivangvora1206

Shivang Vora

Posted on March 15, 2024


Motivation, the why?

As a weekend project, I wanted to explore the idea of a file fragmenting and defragmenting system: one that divides the original file into smaller packets, salts each packet individually, and can later bring the packets back together to return the original data.

Flow Diagram
To add a ✨flair of drama✨ to the reading experience, while going through this,
if you could imagine the divided packets as legendary rings which are given to a few skilled and trusted warriors who will one day have to come together and use the rings as keys to unlock some forbidden knowledge
or
as USB drives sworn to be protected by secret agents who directly report to the president and when the time is right (or the leader is trigger happy) together will unlock the nuclear launch codes.

your dedication to the bit would be much appreciated.

Overview, the what?

To give an overview, the system mainly does a few things:

Fragment

  1. Divide the original file into smaller subparts.
  2. Salt the subparts.
  3. Store the subparts with encrypted names.
  4. Create a license file which is used to verify if all the subparts belong to the same file and how to arrange them to retrieve the original data.
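
Step 1 above can be sketched roughly as follows. This is only an illustration, not the Subscrypt source: the function name split_bytes and the rule for spreading leftover bytes across the first few parts are assumptions.

```python
def split_bytes(data: bytes, divs: int = 4) -> list[bytes]:
    """Split data into `divs` contiguous chunks of near-equal size.

    Hypothetical helper; the real splitting rule may differ.
    """
    if divs < 1:
        raise ValueError("divs must be at least 1")
    base, extra = divmod(len(data), divs)
    parts = []
    start = 0
    for i in range(divs):
        # The first `extra` chunks take one additional byte each,
        # so every byte of the input ends up in exactly one chunk.
        end = start + base + (1 if i < extra else 0)
        parts.append(data[start:end])
        start = end
    return parts
```

Joining the returned chunks in order with b"".join(parts) gives back the original bytes, which is the property the later steps rely on.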

When the N heroes entered the dungeon with their rings and found a statue with N+1 fingers, they were confused and disheartened. Was the prophecy false? Did they do something wrong? Did they miss some critical side quest? Just then the tall old wizard with a Clausesque beard (all words are made up) remembered the necklace his father gave him when he was a child. The necklace also had a ring in it. How convenient, the prophecy was true after all.
The license here plays the role of that last ring: it is just as significant as the parts themselves in the process of defragmenting them to extract the original data.

Defragment

  1. Use the license file to check if all the required parts are present.
  2. Unsalt all the parts.
  3. Sequence the parts and combine them to output the original file.
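
The post does not describe the license file format, so the following is a sketch under a loud assumption: the license is treated as an ordered list of SHA-256 digests, one per salted part. The names make_manifest and order_parts are hypothetical and not part of Subscrypt.

```python
import hashlib


def make_manifest(parts):
    """Record the digest of each part, in the order needed for reassembly."""
    return [hashlib.sha256(p).hexdigest() for p in parts]


def order_parts(found, manifest):
    """Check that every listed part is present, then return them in order.

    Assumed license format: an ordered list of SHA-256 hex digests.
    """
    by_digest = {hashlib.sha256(p).hexdigest(): p for p in found}
    missing = [d for d in manifest if d not in by_digest]
    if missing:
        raise ValueError(f"{len(missing)} part(s) missing or modified")
    return [by_digest[d] for d in manifest]
```

A digest list like this covers both jobs the license is described as doing: verifying that the parts belong together and fixing the order in which to combine them.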

Salting

Salting is the process of adding something to the subpart which corrupts it or makes it unusable so that if it is not removed, even all the subparts combined wouldn't give the original data back.
For this I wanted a pinch of salt which would be unique for every subpart but still identifiable so that it can be removed while defragmenting.
The user can choose their own salt_length as a parameter when creating an instance of the class. When a part is salted, a sequence of bytes of length salt_length is selected from within the part and copied to a different place in the part itself.
Since the unsalting process is based on the location of the salt rather than its contents, the salt can be modified and the unsalting will still work as long as the location and length are maintained.

To put it in a line,
data = data[:location] + salt + data[location:]

Unsalting

Unsalting is the process of removing the salt from the subpart to make it usable again.
Since the location and length of the salt are known, unsalting is just going to the location and skipping over the salt_length bytes that do not belong there.
data = data[:location] + data[location + self.salt_length:]
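
To see salting and unsalting as a round trip, here is a minimal sketch. The function names, the source parameter (where the salt bytes are copied from), and the length check are assumptions for illustration; how Subscrypt actually picks and stores location is not covered here.

```python
def salt_part(data: bytes, location: int, salt_length: int = 4, source: int = 0) -> bytes:
    """Copy `salt_length` bytes from `source` and insert them at `location`."""
    salt = data[source:source + salt_length]
    if len(salt) != salt_length:
        raise ValueError("part is too short to take a salt of this length")
    if not 0 <= location <= len(data):
        raise ValueError("location must fall inside the part")
    return data[:location] + salt + data[location:]


def unsalt_part(data: bytes, location: int, salt_length: int = 4) -> bytes:
    """Drop the `salt_length` bytes that start at `location`."""
    return data[:location] + data[location + salt_length:]


original = b"0123456789"
salted = salt_part(original, location=6)
restored = unsalt_part(salted, location=6)
assert restored == original
```

Overwriting the four salt bytes in salted with anything else still unsalts to the original, which matches the point above that only the location and length matter.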

Usage, the how?

The usage is quite simple.

1) Make sure the file which needs to be fragmented is in the same directory as the subscrypt.py file.

2) Create an instance of the Subscrypt class. divs is the number of subparts and salt_length is the length of the salt in bytes.

from subscrypt import Subscrypt
sp = Subscrypt() #default divs=4, salt_length=4

or

from subscrypt import Subscrypt
sp = Subscrypt(divs = 5, salt_length = 6)

3) Use the frag() method to fragment the file into subparts
frag(filename)

from subscrypt import Subscrypt

sp = Subscrypt()

sp.frag("sample.jpg")

frag() will create a directory named after the file name containing the subparts and license file.

4) Use the defrag() method to retrieve the original file from the subparts
defrag(filename, divs, licenseFile)

from subscrypt import Subscrypt

sp = Subscrypt()

sp.defrag("sample.jpg", 4, "license.enc")

defrag() uses the directory created by frag() containing all the subparts and will store the output file in the same directory.

Note:
When initialized, the Subscrypt instance looks for two files, a_key.key and b_key.key, which are used for encryption and decryption during fragmentation and defragmentation. If the key files are not found, they are created automatically. A file fragmented with a particular pair of a_key and b_key can only be defragmented with that same pair, so keep the key files safe.
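
The "load the key or create it" behavior can be pictured with a sketch like the one below. The key size, the use of the secrets module, and the helper name load_or_create_key are all assumptions; the real key format depends on the encryption scheme Subscrypt uses, which is not specified in this post.

```python
import secrets
from pathlib import Path


def load_or_create_key(path, size: int = 32) -> bytes:
    """Return the key stored at `path`, creating a random one if absent.

    Hypothetical helper: key size and format are illustrative only.
    """
    key_file = Path(path)
    if key_file.exists():
        return key_file.read_bytes()
    key = secrets.token_bytes(size)
    key_file.write_bytes(key)
    return key
```

Calling it twice with the same path returns the same bytes, which is why losing or regenerating a key file makes earlier fragments unrecoverable.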

More detailed documentation can be found here.

Use cases, the where?

There are a few cases where something like this could be useful:

  1. Services that let users download content for offline use but don't want it to be accessible outside of the application.
  2. Situations where privacy is crucial, image data, video data, etc., can be stored this way.
  3. Hiding data in a CTF game in a Hackathon.
  4. Just became the president of a country with nuclear weapons and looking for something more secure than upper case letters and special characters? Look no further.

Conclusion

This project was just a POC for what I had in mind, and it can definitely be improved and built further to maybe become actually useful.

Thanks for reading!
