Kenneth R Hancock
Posted on February 7, 2017
(X-Post from my Medium Blog -- Here)
Hello World! I'd like to take a sentence or two of your time to firstly say that this is my first blog post ever. I hope through this blog I can explore ideas and concepts I enjoy. Hopefully allowing me to grow as a Computer Scientist and programmer, and from my mistakes (and triumphs) will allow others to grow as well. My target audience is anyone that would find this useful. I'd like to briefly touch an introduction to caching before showing the code. But you can find my example project here: https://github.com/krhancoc/Blog-Example-1-Caching.
Caching is an optimization technique that stores previously retrieved results into a location you desire (Could be local memory, a file, an SQL database, etc). These results could be anything from an entire page, to outputs from a series of API calls. I like to see caching (and similarly Memoization) as something that reduces Time Complexity at the cost of Space Complexity. Caching is incredibly useful when you have an action that's CPU intensive and the results won't be changing often. I'm going to use an example of returning a fibonacci number recursively (although really I would just use the closed form cause constant time is the bomb). But doing it recursively is great for us because of the following:
The Fibonacci sequence without any use of Caching/Memoization takes an incredibly long time, and is very easy to implement recursively.
The Fibonacci sequence contains many of the same “sub-problems” when done recursively. Ex: F(5) = F(4) + F(3) = F(3) + F(2) + F(2) + F(1). You can see here that F(2) is called twice, and without much thought you can imagine asking for F(800) would result in thousands of the same recursive calls.
Note there is a problem when using the recursive method – the recursion limit of python. Because of how fast this function grows, you'll hit the recursion limit quite quickly, now you can increase this (although not recommended). A proper thing that can be done is “warming” the cache, by asking for increasingly larger fibonacci numbers on server startup. So that way when a patron of your fibonacci site comes along, their request will most likely be already in the cache. There are obviously tonnes of ways to deal with this issue but I wanted to touch a little bit on it.
I was inspired to do this little project when I looked here, and knew I could apply the same technique to create a caching decorator for Django for a function I was using that sent thousands of requests to an API (which in turn put large amounts of stress on the server I was running the application on).
Here is the decorator:
import functools
from django.core.cache import caches
from django.core.cache.backends.base import InvalidCacheBackendError
def cache(cache_alias='default'):
class cacheobject(object):
def __init__(self, func):
self.func = func
try:
self.cache = caches[cache_alias]
except InvalidCacheBackendError:
self.cache = caches['default']
def __call__(self, value):
response = self.cache.get(value)
if response is None:
response = self.func(value)
self.cache.set(value, response, None)
return response
else:
return response
def __repr__(self):
'''Return the function's docstring.'''
return self.func.__doc__
def __get__(self, obj, objtype):
'''Support instance methods.'''
return functools.partial(self.__call__, obj)
return cacheobject
You'll also need to setup some sort of caching system for your Django project which can be found in the documentation (which is really quite good). For this project it looked like this:
CACHES = {
'default': {
'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
'LOCATION': 'local',
},
'sql': {
'BACKEND': 'django.core.cache.backends.db.DatabaseCache',
'location': 'table',
},
'file': {
'BACKEND': 'django.core.cache.backends.filebased.FileBasedCache',
'LOCATION': 'file_cache'
}
}
CACHE_MIDDLEWARE_SECONDS = 300
CACHE_MIDDLEWARE_KEY_PREFIX = ''
CACHE_MIDDLEWARE_ALIAS = 'default'
Now from here you can add it to any function you'd like, and change your cache parameter to the alias of the cache you would like to store it, depending on how fast you need cache to be, and also the data you are storing. More permanent data should be stashed in an SQL cache (slower option also potential bandwidth usage) while less permanent data should be stored in system memory, which is very fast), with file caching being in the middle of the two.
For our fibonacci example it would be like:
@cache('default')
def fibb(num):
if num in (0,1):
return num
else:
return fibb(num -1) + fibb(num - 2)
From here I created a simple Django template to have a user ask for a fibonacci number.
NOW BEHOLD THE POWER OF MEMOIZATION AND CACHING! What was once a function that took minutes to even find the 30th fibonacci number, is now essentially instant (thanks to a nice reduction to linear time complexity). From here, as you warm the cache with larger and larger numbers, you'll be able to find even the thousandth fibonacci number, and rather than taking – 1.071509e+301 – separate function calls, it can be brought down to thousands, or even one function call (depending on what is cached).
F(1000) = 4346655768693745643568852767504062580256466051737178040
24817290895365554179490518904038798400792551692959225930803226347
75209689623239873322471161642996440906533187938298969649928516003
704476137795166849228875
I realize that this is still very general but what I hope for this to be is a template that others may use in their Django projects to reduce the load that certain functions may have on their servers CPU or bandwidth usage. If you get the chance, try pulling and testing the function with and without the caching decorator, and try using different caches for speed comparisons.
Cheers folks. Just in case here is the link again to the git repo:
https://github.com/krhancoc/Blog-Example-1-Caching.
If you feel there is more to add don't hesitate to comment or contact me, an open discussion benefits everyone in terms of learning.
Thank you,
Ryan.
Posted on February 7, 2017
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.
Related
November 29, 2024