A brief, beginner's guide to cryptography and AES!
Phantz
Posted on December 23, 2019
In this guide I'll give a brief introduction to cryptography, explain its basic concepts, and show some practical Python code that puts it all to use!
I've recently started diving deep into the cryptography world and I was inspired enough to share my insights about it, which is where this guide comes from.
I am, however, by no means an expert on this topic. But I'll try my best to convey what I've learned :)
This guide is pretty loooong, so if you are short on time, you can put it on your reading list to continue later! If you're going through it in one go, buckle up!
So what exactly is cryptography?
As Wikipedia explains it:
Cryptography is the practice and study of techniques for secure communication in the presence of third parties called adversaries
To put it simply, cryptography is the branch of CS which focuses on securing a piece of data through the use of specific algorithms.
Why specific algorithms? Because if we secure a piece of data by the same method every time, so that the output for the same data is always identical, anyone watching can tell when the same data is sent twice. Well, that's not very secure, is it?
This is why we need algorithms that involve randomness, so that encrypting the same data doesn't always produce the same output. But you might already know that computers are deterministic and can't produce true randomness on their own; they rely on the operating system's entropy sources and on cryptographically secure pseudorandom generators to get unpredictable values. So we rely on highly sophisticated encryption algorithms whose output is never truly random, but is for practical purposes indistinguishable from random. So let's discuss some of these algorithms!
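To make the "not truly random" point concrete, here is a small sketch using only Python's standard library. It is my own illustration, not part of any AES API: a seeded general-purpose generator replays the exact same bytes, while the secrets module asks the operating system for unpredictable ones.

```python
import random
import secrets

# A general-purpose PRNG is fully determined by its seed:
# the same seed replays the same "random" bytes.
random.seed(1234)
first = bytes(random.getrandbits(8) for _ in range(16))
random.seed(1234)
second = bytes(random.getrandbits(8) for _ in range(16))
print(first == second)  # True

# secrets draws from the operating system's cryptographically
# secure generator, which is what keys and nonces should come from.
token_a = secrets.token_bytes(16)
token_b = secrets.token_bytes(16)
print(len(token_a), token_a != token_b)
```

This is why the random module should not be used for keys, and why the rest of this guide sticks to a cryptographically secure source.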
Algorithms
The algorithms can be divided into 3 sub categories - Symmetric, Asymmetric and Transform.
However, there's another sub category - Hashing. You might be wondering about this, since hashing is one-way whereas encryption is two-way. That is, what has been encrypted can also be decrypted, but what has been hashed cannot be "dehashed". That's exactly right: a cryptographic hash function maps data of any size to a fixed-size digest, and it is designed so that the original data cannot practically be recovered from that digest. It still belongs to cryptography because it is used to check that data has not changed, rather than to hide it.
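Here is a quick sketch of the one-way behaviour using hashlib from the standard library. SHA-256 is just my pick for illustration; the guide itself doesn't depend on any particular hash function.

```python
import hashlib

message = b"Hello World"

# The digest has a fixed size (32 bytes for SHA-256), whatever the input length.
digest = hashlib.sha256(message).digest()
print(len(digest))  # 32

# Hashing is deterministic: the same input always gives the same digest...
again = hashlib.sha256(message).digest()
print(digest == again)  # True

# ...while even a one-character change gives a different digest.
other = hashlib.sha256(b"Hello World!").digest()
print(digest == other)  # False
```

Notice there is no "decrypt" counterpart here: hashlib offers no way to get the message back from the digest.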
We'll be focusing on Symmetric algorithms in this guide, as they are among the most common encryption techniques.
Symmetric Encryption
From Wikipedia
Symmetric encryption is a type of encryption where only one key (a secret key) is used to both encrypt and decrypt electronic information
Well-known symmetric encryption algorithms include 3DES, AES and Blowfish. AES, which stands for Advanced Encryption Standard, is one of the most popular encryption methods. The Python module pycryptodome allows us to easily implement this method of encryption. So let's dive into some examples!
First make sure you have pycryptodome installed with pip install pycryptodome
To use AES encryption we import these modules:
from Crypto.Cipher import AES
from Crypto import Random
Note: You only need the AES module for the main encryption, but to generate secure keys I recommend the Random module, as it generates cryptographically secure random output.
There are 3 values that you need to keep track of for AES:-
AES_KEY: This is the primary key that is needed to both encrypt and decrypt a piece of data. This is a private, protected key and should never be kept alongside the encrypted data. This key can be either 16 bytes, or 24 bytes, or 32 bytes. These are the values AES.key_size holds.
Initialization Vector (IV)/NONCE: Depending on what AES mode you are using (we'll be diving into the modes explanation real soon), you'll also need either an IV or a NONCE. This is not a secret, so it can be stored alongside the encrypted data, but a fresh one should be generated for every encryption done with the same key. It is what prevents a piece of data from always being encrypted into the same output. The IV is 16 bytes in size, reflected by AES.block_size, whereas the size of a NONCE for the OCB mode used later in this guide can range from 1 byte to 15 bytes inclusive. Hence the max limit is reflected by AES.block_size-1.
Message Authentication Code (MAC): This is a byte string returned by Authenticated Encryption methods to check if the encrypted data was modified before decryption. More on this later! It is not a secret either, so it can be stored alongside the encrypted data. By default it is 16 bytes in size.
Now, AES has many modes, and you should never just guess which mode to use, because most of these modes have a specific use case. Modes are denoted by AES.MODE_(mode name here). Let's focus on the most commonly used modes!
- AES.MODE_CBC: This is a block cipher method that requires data padding. That is, if the size of the data is NOT a multiple of the AES block size (16 bytes), you'll have to pad it before encrypting. PKCS#7 padding is the usual choice, since unlike a plain zero fill it can be removed unambiguously afterwards. This requires an IV. The output from this mode is not authenticated, so it is malleable: it can be altered without the decryption noticing.
- AES.MODE_OFB: This is a stream cipher method. This DOES NOT require data padding, and the keystream it generates is independent of the source data. Neither encryption nor decryption can be parallelized, because each part of the keystream is computed from the previous part. A corrupted bit in the output only corrupts the matching bit of the decrypted data. This requires an IV. The output from this mode is completely malleable.
- AES.MODE_CFB: This is a stream cipher method. This DOES NOT require data padding, and the keystream it generates depends on the previous output, and therefore on the source data. Encryption cannot be parallelized but decryption can be. That is, encryption has to process the parts in order, whereas decryption can process them independently. This requires an IV. The output from this mode is malleable.
- AES.MODE_OCB: This is an authenticated encryption method. This method overcomes the drawbacks of the previous methods and uses a MAC (Message Authentication Code) to make sure the encrypted_data has not been modified before decryption. This requires a NONCE. The output from this mode is not malleable.
I highly recommend reading this stack overflow post to know more about these modes.
So once you've decided on a mode to use, we can move on to the coding part! I'll be using AES.MODE_OCB for this example.
First we generate the key and the nonce:-
AES_key = Random.get_random_bytes(AES.key_size[0]) # A key of size 16 bytes
NONCE = Random.get_random_bytes(AES.block_size-1) # A nonce of 15 bytes
Now we have to convert our source_data to a byte string. Remember, AES modes take byte strings as input and put out byte strings as output. After that we can just encrypt it.
data = 'Hello World'
byte_dat = data.encode() # You can also just do 'Hello World'.encode()
cipher = AES.new(AES_key, AES.MODE_OCB, NONCE)
ciphertext, MAC = cipher.encrypt_and_digest(byte_dat)
Notice that the encryption returns 2 values, a ciphertext and a MAC. They are both byte strings and we need both.
Afterwards, when we want to decrypt this ciphertext, we'll need the AES_KEY, the NONCE, the MAC and of course the ciphertext itself.
cipher = AES.new(AES_key, AES.MODE_OCB, NONCE) # A fresh cipher object with the same key and nonce
decrypted_string = cipher.decrypt_and_verify(ciphertext, MAC) # Raises ValueError if the MAC check fails
og_string = decrypted_string.decode()
Now a few words on storing the keys and the encrypted data. If you remember from above, we can store the ciphertext, the MAC and the NONCE in the same place. The AES_KEY, however, should be stored as securely as possible; we DO NOT want unauthorized access to this key. There are many methods of storing byte strings. You can either store them exactly as they are, as byte strings, or you can convert them into a list of integers by using list(). For instance, using list() on a MAC will return a list of 16 integers, each between 0 and 255.
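A tiny sketch of the list() approach, using only the standard library. The 16 random bytes are a stand-in for a real MAC, since only the byte-to-integer conversion matters here.

```python
import os

mac = os.urandom(16)   # stand-in for a 16-byte MAC

as_ints = list(mac)    # one integer (0-255) per byte
print(len(as_ints))    # 16

# bytes() turns the list back into the original byte string.
restored = bytes(as_ints)
print(restored == mac)  # True
```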
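You can also decode these byte strings into strings for storage. The codec you want to use for the AES output byte strings is latin-1, because it maps every possible byte value to exactly one character, so nothing is lost in the conversion. However, you can also use b64encode() on the byte string and then use decode() on the result, which gives plain ASCII text. I prefer base64 encoding, salting etc. on top of the ciphertext, since the raw ciphertext is exactly the same length as the original string and we don't want that. Keep in mind that base64 only changes how the bytes are written down, so the encoded length still follows the length of the original data.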
For encryption storage-
from base64 import b64encode
encoded_string = b64encode(ciphertext).decode()
NONCE_string = NONCE.decode('latin-1')
MAC_string = MAC.decode('latin-1')
For decryption retrieval-
from base64 import b64decode
ciphertext = b64decode(encoded_string)
NONCE = NONCE_string.encode('latin-1')
MAC = MAC_string.encode('latin-1') # Encode the stored string, not the MAC variable itself
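Putting the storage and retrieval steps together, here is one possible sketch that keeps all three public values in a single JSON record. The JSON layout and field names are my own assumption, not something pycryptodome prescribes, and random bytes stand in for real AES output so the snippet runs with the standard library alone.

```python
import json
import os
from base64 import b64decode, b64encode

# Stand-ins for the real values produced by encrypt_and_digest().
ciphertext = os.urandom(11)
nonce = os.urandom(15)
mac = os.urandom(16)

# Storage: base64-encode each byte string so it survives as plain text.
record = json.dumps({
    "ciphertext": b64encode(ciphertext).decode(),
    "nonce": b64encode(nonce).decode(),
    "mac": b64encode(mac).decode(),
})

# Retrieval: parse the record and decode every field back to bytes.
loaded = json.loads(record)
restored = {name: b64decode(value) for name, value in loaded.items()}

print(restored["ciphertext"] == ciphertext)  # True
print(restored["nonce"] == nonce)            # True
print(restored["mac"] == mac)                # True
```

The AES key is deliberately absent from this record; it belongs somewhere separate and better protected.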
And that's it! That's how you encrypt data with AES using pycryptodome!
I hope this guide helped you learn a thing or two about cryptography. It's a really cool topic, and if your thirst for knowledge has not been quenched yet, check out this really cool GitHub repo which has a LOT of resources about cryptography! If you made it this far, thanks for reading and I sincerely hope my guide helps you in your future endeavors!