Build a Reddit Video downloader with Python

Amal Shaji

Posted on January 15, 2021

This article is about building a Reddit video downloader from scratch using Python. Our downloader will be able to download the video along with its audio.

Reddit makes it tough to download posted videos directly, because it stores the audio and video as separate files on its servers. This is not a big problem: we will download both files separately and join them using FFmpeg.

How to get the video URL?

Every Reddit post is also available as JSON: drop the trailing slash from the post URL and append .json. Thus,

https://www.reddit.com/r/PublicFreakout/comments/kwif5d/insurrectionist_gets_arrested_trying_to_break/

becomes

https://www.reddit.com/r/PublicFreakout/comments/kwif5d/insurrectionist_gets_arrested_trying_to_break.json
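The URL rewrite above can be sketched as a small helper. The function name to_json_url is hypothetical (it is not part of the article's code), and the sketch assumes that any query string or fragment on the post URL can be discarded.

```python
def to_json_url(post_url: str) -> str:
    """Return the .json endpoint for a Reddit post URL (hypothetical helper)."""
    # Assumption: query strings and fragments are not needed, so drop them.
    base = post_url.split("?", 1)[0].split("#", 1)[0]
    # Remove the trailing slash so ".json" attaches to the last path segment.
    base = base.rstrip("/")
    # Leave URLs that already point at the JSON endpoint unchanged.
    if base.endswith(".json"):
        return base
    return base + ".json"
```

Passing the post URL shown above to this helper yields the JSON URL shown above.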

Go to the JSON URL, and you should see something like,

[Screenshot: json.PNG, the raw JSON response for the post]

From there, it is easy. Press Ctrl + F and search for .mp4.

[Screenshot: url.PNG, the fallback URL entry found by searching for .mp4]

That's what we need: the fallback URL. Now we have to figure out how to reach it starting from the top of the JSON.

If you want to experiment with this on your own, either download the JSON once and work from the saved copy, or use Jupyter or Colab so the response stays in memory. Reddit's rate limiting won't let you fetch the JSON data repeatedly in quick succession.

First, get the JSON data using requests.

import requests
data = requests.get(post_url).json()

where post_url is the JSON URL.

Obtain the URL using,

post = data[0]["data"]["children"][0]["data"]
url = post["secure_media"]["reddit_video"]["fallback_url"]

This is what a typical video URL looks like,

https://v.redd.it/jvjih3f894b61/DASH_1080.mp4?source=fallback

This is not the most robust approach: we never check whether the keys exist, so the lookup raises an error for any post without a Reddit-hosted video. For now, we'll assume this is the standard format for all posts.
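For readers who want the key checks, here is a minimal sketch of a safer lookup. The function name extract_fallback_url is hypothetical, and the sketch assumes that posts without a Reddit-hosted video either carry a null secure_media or lack the reddit_video key; it returns None in those cases instead of raising.

```python
from typing import Any, Optional


def extract_fallback_url(data: Any) -> Optional[str]:
    """Return the video fallback URL, or None if it cannot be found."""
    try:
        # Same path as in the article: first listing, first child, post data.
        post = data[0]["data"]["children"][0]["data"]
    except (KeyError, IndexError, TypeError):
        return None
    if not isinstance(post, dict):
        return None
    # Assumption: secure_media may be null or may lack reddit_video.
    media = post.get("secure_media")
    if not isinstance(media, dict):
        return None
    video = media.get("reddit_video")
    if not isinstance(video, dict):
        return None
    return video.get("fallback_url")
```

The caller can then test the result for None and report that the post has no downloadable video, rather than crashing with a KeyError.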

What about the audio URL?

The audio URL is not available in the JSON data. But it can be obtained using a simple hack.

Recall the video URL from above. Open it in a browser and you will see that it is video-only.

https://v.redd.it/jvjih3f894b61/DASH_1080.mp4?source=fallback

Here the DASH_1080.mp4 is the filename, where 1080 is the resolution. Replace the resolution with audio, and you get the audio URL.

Thus the audio URL corresponding to our URL is
https://v.redd.it/jvjih3f894b61/DASH_audio.mp4?source=fallback
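The substitution described above can be sketched with a regular expression. The function name audio_url_from_video_url is hypothetical, and the sketch assumes the DASH_<resolution>.mp4 naming shown in this article; Reddit may use other file names, and a post without sound may have no audio file at all, so the derived URL is not guaranteed to resolve.

```python
import re


def audio_url_from_video_url(video_url: str) -> str:
    """Derive the audio URL by swapping the resolution for 'audio'."""
    # Match DASH_<digits> only when it is directly followed by ".mp4".
    new_url, count = re.subn(r"DASH_\d+(?=\.mp4)", "DASH_audio", video_url)
    if count != 1:
        # Assumption violated: the URL does not follow the expected pattern.
        raise ValueError(f"unexpected video URL format: {video_url}")
    return new_url
```

Raising an error on an unexpected URL is a deliberate choice here: silently returning the video URL unchanged would lead to downloading the video twice and merging it with itself.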

Now that you have both audio and video, download them in a pythonic way.

import urllib.request

urllib.request.urlretrieve(url, filename=output_path)

If you are a big fan of progress bars, add one using the reporthook of urlretrieve.

import urllib.request

from tqdm import tqdm


class DownloadProgressBar(tqdm):
    def update_to(self, b: int = 1, bsize: int = 1024, tsize: int = None):
        # urlretrieve calls the hook with: blocks so far, block size, total size.
        if tsize is not None:
            self.total = tsize
        # update() expects an increment, so subtract what is already counted.
        self.update(b * bsize - self.n)


with DownloadProgressBar(unit="B", unit_scale=True, miniters=1) as d:
    urllib.request.urlretrieve(
        url,
        filename=output_path,
        reporthook=d.update_to,
    )

Let's merge 'em

Once the files are downloaded to audio.mp4 and video.mp4, let's merge them using FFmpeg.

FYI, you can have any name for audio and video output files.

If you don't have FFmpeg installed, Windows builds are linked from the FFmpeg download page.

On Debian- or Ubuntu-based Linux,

sudo apt install ffmpeg

Now run the following to merge them. The video stream is copied as-is (-c:v copy), and the audio is encoded as AAC (-c:a aac):

ffmpeg -i video.mp4 -i audio.mp4 -c:v copy -c:a aac output.mp4

It doesn't take more than a few seconds to merge. Once complete, enjoy the video.
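The merge can also be driven from Python, so the whole download fits in one script. The following is a minimal sketch using the standard subprocess module; the function names build_merge_command and merge are hypothetical, and the sketch assumes ffmpeg is on the PATH. The arguments mirror the ffmpeg command used in this section.

```python
import subprocess
from typing import List


def build_merge_command(video_path: str, audio_path: str, output_path: str) -> List[str]:
    """Build the same ffmpeg invocation as the shell command in this section."""
    return [
        "ffmpeg",
        "-i", video_path,   # first input: the video-only file
        "-i", audio_path,   # second input: the audio-only file
        "-c:v", "copy",     # copy the video stream without re-encoding
        "-c:a", "aac",      # encode the audio stream as AAC
        output_path,
    ]


def merge(video_path: str, audio_path: str, output_path: str) -> None:
    """Run ffmpeg and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_merge_command(video_path, audio_path, output_path), check=True)
```

Calling merge("video.mp4", "audio.mp4", "output.mp4") runs the command shown above. Passing the arguments as a list avoids shell quoting problems when file names contain spaces.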

Conclusion

In this post, we saw how to download a Reddit video along with its audio using Python. There was nothing fancy in this approach; all we had to do was find the right URLs.
If there's a better approach, please do tell.

I went ahead and created a Python CLI named redl. Run,

redl https://www.reddit.com/r/PublicFreakout/comments/kwif5d/insurrectionist_gets_arrested_trying_to_break/

to download a video.

The whole code can be found at

amalshaji / redl (GitHub)

A Reddit video downloader (with audio) in Python

redl - A Reddit video downloader (with audio)

⚠️ Requires ffmpeg installed

Redl scrapes the Reddit post JSON and retrieves both the audio and video URLs. Once these files are downloaded, it uses ffmpeg to join them.

Installation

pip install redl --user

Usage

redl https://www.reddit.com/r/Damnthatsinteresting/comments/kwrbde/making_a_grapefruit_dessert/