megajakob
Posted on November 24, 2020
Content Delivery Networks (CDNs) are generally used by sites and applications to speed up the loading of static assets. This is accomplished by caching files on CDN servers located in various regions of the world. When a user requests data via the CDN, they receive it from the nearest server.
The underlying principles behind CDNs are approximately the same everywhere. Having received a request for a file, a CDN server fetches it once from the origin server, transfers it to the user, and caches a copy for a period of time. Further requests for the same data are served from the cache. All CDNs offer options for pre-loading files, clearing the cache, setting cache retention times and much more.
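As a quick illustration, you can see this caching behavior on any CDN-fronted URL by looking at the response headers. The URL below is just a placeholder, and the header names differ between providers (Age and X-Cache-style headers are common), so treat this as a rough check rather than a universal recipe:

curl -sI https://cdn.example.com/static/logo.png | grep -iE 'age|cache'
curl -sI https://cdn.example.com/static/logo.png | grep -iE 'age|cache'

The first request typically misses the cache, while the repeated one is served from it.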
Sometimes, for various reasons, you might need to build your own CDN, so here is a guide on how to do it.
When do you need your own CDN?
Let's take a look at situations where you might need to create your own CDN:
- when you are trying to save money and even affordable solutions like BunnyCDN end up costing you hundreds of dollars per month
- when you want a permanent cache or need guaranteed bandwidth and resources
- existing CDNs do not have PoPs in your target region
- you require special content delivery settings
- you want to speed up the delivery of dynamic content by serving it closer to users
- you are concerned third-party CDNs might illegally collect and use user data (hey there, servers that aren't GDPR-compliant) or engage in other illicit activities
In most other cases, it is more viable to use existing ready-made solutions.
Let's create our own CDN
To build even a simple content delivery network you need the following:
- domain name or a subdomain
- a minimum of two servers in different regions. The servers can be virtual or dedicated
- geoDNS tool. With it, a user sending a request to the domain will be directed to the nearest server
Registering a domain and ordering servers
Registering a domain name is easy: just register it in any domain zone you prefer. For a CDN, you could also use a subdomain, such as cdn.domainname.com, which is what we'll do in the following example.
When it comes to servers, you should rent them in the regions and countries where your target audience is located. If your project is intercontinental, it is convenient to choose hosting providers that offer servers across the world, such as PQ.hosting and DigitalOcean (for virtual and cloud servers), or OVH and Leaseweb (for dedicated servers).
For small and medium projects, virtual servers are usually quite enough. They are also much cheaper than dedicated servers. For example, PQ.hosting offers a 25 GB NVMe server with unlimited traffic for 4.77€/month in any of its 30+ locations.
Let's order three virtual servers on different continents. During installation, select the latest Debian. Here are our servers:
Frankfurt, ip: 199.247.18.199
Chicago, ip: 149.28.121.123
Singapore, ip: 157.230.240.216
Configuring geoDNS
To ensure the clients are directed to the proper (closest) servers upon sending requests to our domain or subdomain, we'll need a DNS server with geoDNS functionality.
Here's how geoDNS works:
- It gets the client's IP (if it was sent with the DNS request) or the IP of the recursive DNS server used to process the request. Generally speaking, such recursive servers are the DNS servers of the users' Internet providers.
- By this IP, it identifies the client's country or region. This requires a GeoIP database, of which there is no shortage; there are even decent free options.
- Depending on the client's location, geoDNS returns the IP address of the closest CDN server.
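To make the second step concrete: on Debian, for example, the geoip-bin and geoip-database packages provide a small geoiplookup utility backed by a free GeoIP country database, so you can resolve an IP to a country from the command line. This is only an illustration of the lookup itself; our CDN won't need it, since the DNS provider does this for us:

apt install geoip-bin geoip-database
geoiplookup 199.247.18.199

The output names the country the address is registered in, which is exactly the kind of answer a geoDNS server feeds into its routing decision.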
A DNS server with geoDNS functionality is something you could build yourself, but it is better to use a ready-made solution that has servers around the world and an out-of-the-box Anycast option:
- ClouDNS, from $9.95/month, GeoDNS package, one DNS Failover is provided by default
- Amazon Route 53, from $35/month for 50 million geo-requests. DNS Failover is priced separately
- DNS Made Easy, from $125/month, with 10 DNS Failovers
- Cloudflare, Geo Steering functionality is provided in Enterprise packages
When ordering geoDNS, pay attention to the number of requests included in the package and keep in mind that the real number of requests might greatly exceed your expectations. There are millions of web crawlers, scanners, spammers and other devilry at work at any given time.
Almost all DNS services include a useful feature for CDN building: DNS Failover. With it, you can configure health monitoring so that if a server goes down, the system automatically redirects clients to the servers that are still working.
For our CDN, let's use ClouDNS and its GeoDNS package.
In your profile, add a new DNS zone and specify your domain name. If you're using a subdomain and the main domain is already in use, don't forget to re-add the existing DNS records right after creating the zone. The next step is to create several A records for the CDN domain/subdomain, one for each target region. You can designate continents or countries as regions, and sub-region options are available for the US and Canada.
In our example, the CDN will operate on the cdn.sayt.in subdomain. Having added the sayt.in zone, create the first A record for the subdomain, directing all North American clients to the Chicago server.
Repeat this step for the other regions, and don't forget to create one record for the default region. The end result is summarized in the sketch below.
The last, default record means that requests from all unspecified regions (Europe, Africa, satellite Internet users, etc.) are directed to the Frankfurt server.
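Since the control panel screenshots aren't reproduced here, the resulting record set looks roughly like this. The region labels are only indicative of how ClouDNS groups locations; the exact wording and any extra regions (Oceania, South America, etc.) depend on what you configure in the panel:

Host | Type | Points to | Region |
---|---|---|---|
cdn | A | 149.28.121.123 | North America |
cdn | A | 157.230.240.216 | Asia |
cdn | A | 199.247.18.199 | Default (all other regions) |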
This concludes the basic DNS configuration. All that's left is to go to the registrar's website and replace the current nameservers with the ones provided by ClouDNS. While they're being updated, we'll set up the servers.
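Once the registrar change has propagated, you can verify both the delegation and the geo-routing from the command line. The dig +subnet check only works if the provider honors EDNS Client Subnet, and the /24 prefix below is a placeholder you should replace with a network located in the region you want to test:

dig NS sayt.in +short
dig cdn.sayt.in A +subnet=203.0.113.0/24 +short

The first command should list the ClouDNS nameservers; the second should return the A record chosen for the simulated client network.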
Installing SSL certificates
Our CDN will operate over HTTPS, so if you already have SSL certificates for the domain or subdomain, upload them to all the servers, for instance to the /etc/ssl/yourdomain/ directory.
If you don't have certificates, you can get them for free from Let's Encrypt. The ACME Shell script (acme.sh) is a great option: it has a user-friendly client and, more importantly, it can validate the domain/subdomain via DNS using the ClouDNS API.
We'll install acme.sh on only one server — the European one (199.247.18.199), and from it, the certificates will be copied to all the others. To install it, run the following command:
root@cdn:~# wget -O - https://get.acme.sh | bash; source ~/.bashrc
During the installation, a cron task for automatic renewal of the certificates will be created.
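You can confirm it with crontab -l; the entry usually looks something like the line below (the exact minute is randomized at install time):

root@cdn:~# crontab -l
37 13 * * * "/root/.acme.sh"/acme.sh --cron --home "/root/.acme.sh" > /dev/null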
Domain verification during certificate issuance will be performed via DNS using the API, so in the ClouDNS profile, under Reseller API, create a new API user and set a password for it. Enter the resulting auth-id along with the password into the file ~/.acme.sh/dnsapi/dns_cloudns.sh (not to be confused with dns_clouddns.sh). Here are the lines you need to uncomment and edit:
CLOUDNS_AUTH_ID=<auth-id>
CLOUDNS_AUTH_PASSWORD="<password>"
Now, let's request the issuance of an SSL certificate for cdn.sayt.in:
root@cdn:~# acme.sh --issue --dns dns_cloudns -d cdn.sayt.in --reloadcmd "service nginx reload"
The --reloadcmd parameter we passed will reload the Nginx configuration automatically after every certificate renewal; it will come in handy later.
The process of acquiring the certificate might take up to two minutes, so do not interrupt it. If a domain validation error occurs, try running the command again. At the end, the output shows where the certificates were saved.
Remember these paths: we'll need them when copying the certificates to the other servers and when writing the server configuration. Ignore the Nginx config reload error for now; it won't occur on a fully configured server during certificate renewal.
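For reference, with this setup the certificate files end up under /root/.acme.sh/cdn.sayt.in/, and a listing along these lines is what you should expect (file names can vary slightly between acme.sh versions):

root@cdn:~# ls /root/.acme.sh/cdn.sayt.in/
ca.cer  cdn.sayt.in.cer  cdn.sayt.in.conf  cdn.sayt.in.csr  cdn.sayt.in.key  fullchain.cer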
As for the SSL certificate, all that's left to do is copy it to the two other servers, preserving the certificate paths. Create identical directories on each of them and copy the certificate files:
root@cdn:~# mkdir -p /root/.acme.sh/cdn.sayt.in/
root@cdn:~# scp -r root@199.247.18.199:/root/.acme.sh/cdn.sayt.in/* /root/.acme.sh/cdn.sayt.in/
To automate the certificate renewal, create a daily cron task on both of those servers. Below is the command to add to the cron jobs:
scp -r root@199.247.18.199:/root/.acme.sh/cdn.sayt.in/* /root/.acme.sh/cdn.sayt.in/ && service nginx reload
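As an example of what that daily job could look like in the crontab of the Chicago and Singapore servers (the schedule here is arbitrary; just pick a time after acme.sh's own renewal run):

root@cdn:~# crontab -e
# m h dom mon dow  command
15 3 * * * scp -r root@199.247.18.199:/root/.acme.sh/cdn.sayt.in/* /root/.acme.sh/cdn.sayt.in/ && service nginx reload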
Keep in mind that connecting to the remote server (the Frankfurt one, in our case) requires passwordless, key-based SSH access. Don't forget to set it up, as shown below.
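Setting that up takes a couple of commands on each of the two receiving servers; a minimal sketch, assuming root SSH logins to the Frankfurt server are permitted:

root@cdn:~# ssh-keygen -t ed25519 -N "" -f ~/.ssh/id_ed25519
root@cdn:~# ssh-copy-id root@199.247.18.199
root@cdn:~# scp root@199.247.18.199:/root/.acme.sh/cdn.sayt.in/cdn.sayt.in.cer /tmp/

The last command is just a check that scp now works without a password prompt.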
Installing and configuring Nginx
For static content delivery we'll use Nginx, configured as a caching proxy server. Update the package lists and install it on all three servers:
root@cdn:~# apt update
root@cdn:~# apt install nginx
Instead of the default config, use the one below:
nginx.conf
user www-data;
worker_processes auto;
pid /run/nginx.pid;
events {
    worker_connections 4096;
    multi_accept on;
}

http {
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    types_hash_max_size 2048;

    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    access_log off;
    error_log /var/log/nginx/error.log;

    gzip on;
    gzip_disable "msie6";
    gzip_comp_level 6;
    gzip_proxied any;
    gzip_vary on;
    gzip_types text/plain application/javascript text/javascript text/css application/json application/xml text/xml application/rss+xml;
    gunzip on;

    proxy_temp_path /var/cache/tmp;
    proxy_cache_path /var/cache/cdn levels=1:2 keys_zone=cdn:64m max_size=20g inactive=7d;
    proxy_cache_bypass $http_x_update;

    server {
        listen 443 ssl;
        server_name cdn.sayt.in;

        ssl_certificate     /root/.acme.sh/cdn.sayt.in/cdn.sayt.in.cer;
        ssl_certificate_key /root/.acme.sh/cdn.sayt.in/cdn.sayt.in.key;

        location / {
            proxy_cache cdn;
            proxy_cache_key $uri$is_args$args;
            proxy_cache_valid 90d;
            proxy_pass https://sayt.in;
        }
    }
}
In the config, edit the following:
- max_size — cache size not exceeding the available disk space
- inactive — retention time for unrequested cached data
- ssl_certificate and ssl_certificate_key — paths to SSL certificate and key
- proxy_cache_valid — retention time for cached data
- proxy_pass — address of the origin server from which the CDN will request data for caching. For our example, it's sayt.in
As you can see, it's no rocket science. The only tricky part is configuring the retention times, given how similar the inactive and proxy_cache_valid parameters look. Let's take a closer look. Here's what happens with inactive=7d and proxy_cache_valid=90d:
- if a cached item is not requested again within 7 days, it is deleted from the cache
- if it is requested at least once every 7 days, it stays on disk; after 90 days, it is considered stale and the next request makes Nginx re-fetch it from the origin server
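One practical consequence of the proxy_cache_bypass $http_x_update line in the config: a request carrying an X-Update header is answered from the origin rather than from the cache, and the fresh response replaces the cached copy. That gives you a cheap way to refresh a single file on a PoP (the path below is a placeholder):

curl -s -o /dev/null -H "X-Update: 1" https://cdn.sayt.in/path/to/file.jpg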
With nginx.conf handled, reload the configuration:
root@cdn:~# service nginx reload
So, our CDN is ready to use! For $15/month we got PoPs on three continents and 3TB of traffic: 1TB for each region.
Checking our CDN
Let's take a look at the ping to our CDN from different locations. Any ping service will do here.
Ping server | Host | IP | Avg time, msec |
---|---|---|---|
Germany, Berlin | cdn.sayt.in | 199.247.18.199 | 9.6 |
Netherlands, Amsterdam | cdn.sayt.in | 199.247.18.199 | 10.1 |
France, Paris | cdn.sayt.in | 199.247.18.199 | 16.3 |
UK, London | cdn.sayt.in | 199.247.18.199 | 14.9 |
Canada, Toronto | cdn.sayt.in | 149.28.121.123 | 16.2 |
USA, San Francisco | cdn.sayt.in | 149.28.121.123 | 52.7 |
USA, Dallas | cdn.sayt.in | 149.28.121.123 | 23.1 |
USA, Chicago | cdn.sayt.in | 149.28.121.123 | 2.6 |
USA, New York | cdn.sayt.in | 149.28.121.123 | 19.8 |
Singapore | cdn.sayt.in | 157.230.240.216 | 1.7 |
Japan, Tokyo | cdn.sayt.in | 157.230.240.216 | 74.8 |
Australia, Sydney | cdn.sayt.in | 157.230.240.216 | 95.9 |
The results are good. Now let's place a test image titled test.jpg on the origin server and check how fast it loads via the CDN. For this purpose, the Ping Admin service can be used. After the first (cache-filling) request, the image should load quickly from every region.
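If you prefer the command line to a web service, curl's timing output gives a rough picture as well. Run it from machines in different regions, and run it twice, since the first request fills the cache on that PoP:

curl -s -o /dev/null -w "ip: %{remote_ip}  time: %{time_total}s\n" https://cdn.sayt.in/test.jpg

Besides the total time, %{remote_ip} shows which PoP the geoDNS picked for that location.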
Let's write a small script in case we need to clear the cache on a CDN point.
purge.sh
#!/bin/bash
# Purge the Nginx cache on this CDN server.
# With no arguments, the whole cache is dropped; with a URI argument,
# only the matching cache file is removed.
if [ -z "$1" ]
then
    echo "Purging all cache"
    rm -rf /var/cache/cdn/*
else
    echo "Purging $1"
    # Nginx names cache files after the MD5 of the cache key ($uri$is_args$args)
    # and, with levels=1:2, nests them by the last characters of that hash.
    FILE=$(echo -n "$1" | md5sum | awk '{print $1}')
    FULLPATH=/var/cache/cdn/${FILE:31:1}/${FILE:29:2}/${FILE}
    rm -f "${FULLPATH}"
fi
To clear the entire cache on a server, simply run the script without arguments. If you need a specific file purged, just specify its path:
root@cdn:~# ./purge.sh /test.jpg
To purge the cache everywhere, the script must be run on all CDN servers, as sketched below.
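A small wrapper run from any admin machine can take care of that over SSH; a sketch, assuming purge.sh sits in /root on every PoP and key-based access is set up:

purge-all.sh
#!/bin/bash
# Purge a given URI (or the whole cache, if no argument) on every CDN server.
SERVERS="199.247.18.199 149.28.121.123 157.230.240.216"
for HOST in $SERVERS; do
    echo "== $HOST =="
    ssh root@"$HOST" "/root/purge.sh $1"
done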
In lieu of a conclusion
Finally, I would like to share some useful tips so that you can avoid the pitfalls I've already run into:
- Consider the viability and maintenance costs of your future CDN. In most cases, it's more efficient and easier to buy a cheap CDN, which will most likely also be more stable and of better quality.
- To improve the CDN's fault tolerance, it is recommended to set up DNS Failover, which lets you quickly switch an A record in the event a server goes down. You can do so in the DNS provider's control panel.
- Websites with wide coverage require a large number of PoPs, but don't get overzealous. Most likely, users won't notice the difference between your CDN and a paid one if you have servers in 6-7 locations: Europe, North America East, North America West, Singapore, Australia, and Hong Kong or Japan.
- Sometimes hosting providers do not allow rented servers to be used for a CDN. So, if you're looking to create a CDN service, make sure to read the hosting provider's terms and conditions.
- Study the submarine cable map to understand how the continents are connected and use this knowledge when building your CDN.
- Try pinging your servers from different locations. This way you can see which regions are closest to your PoPs, which will help you configure the geoDNS.