Joblib

sharathhebbar

Sharath Hebbar

Posted on June 25, 2023

Joblib is a set of tools that provide lightweight pipelining in Python. In particular, it offers transparent disk-caching of functions with lazy re-evaluation (the memoize pattern) and easy, simple parallel computing.

Why is it used?

  • Better performance
  • Reproducibility
  • Avoid computing the same thing twice
  • Persist to disk transparently

Features

  • Transparent and fast disk-caching of output values
  • Embarrassingly parallel helper
  • Fast compressed persistence

Importing libraries

from joblib import Memory, Parallel, delayed, dump, load
import pandas as pd
import numpy as np
import math

Data Creation

my_dir = '/content/sample_data'
a = np.vander(np.arange(3))
print(a)
output:
[[0 0 1]
 [1 1 1]
 [4 2 1]]
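
For readers who have not met np.vander before, here is a small sketch of what the matrix above contains. With its default settings, np.vander builds one row per input value and fills the columns with decreasing powers of that value. The variable names x and manual are only for illustration.

```python
import numpy as np

x = np.arange(3)  # array([0, 1, 2])

# Build the same matrix by hand: columns are x**2, x**1, x**0
manual = np.column_stack([x**2, x**1, x**0])

print(np.array_equal(np.vander(x), manual))  # True
```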

Memory

mem = Memory(my_dir)        # store cached results on disk under my_dir
sqr = mem.cache(np.square)  # cached version of np.square
b = sqr(a)                  # computed once, then reused for the same input
print(b)
output:
[[ 0  0  1]
 [ 1  1  1]
 [16  4  1]]
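
To see the "avoid computing the same thing twice" idea in action, here is a minimal sketch that counts how often the function body really runs. The function slow_square, the list calls and the temporary cache directory are assumptions made for this illustration; they are not part of the original notebook.

```python
import tempfile
from joblib import Memory

cache_dir = tempfile.mkdtemp()          # hypothetical scratch location for the cache
memory = Memory(cache_dir, verbose=0)   # verbose=0 keeps joblib quiet

calls = []  # records every time the function body actually executes

@memory.cache
def slow_square(x):
    calls.append(x)
    return x * x

first = slow_square(4)    # computed and written to cache_dir
second = slow_square(4)   # same argument: loaded from the cache
third = slow_square(5)    # new argument: computed again

print(first, second, third)  # 16 16 25
print(calls)                 # [4, 5]
```

The second call with the argument 4 never reaches the function body, which is why 4 appears only once in calls.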

Parallel

%%time
Parallel(n_jobs=1)(delayed(np.square)(i) for i in range(10))
output: CPU times: user 2.85 ms, sys: 0 ns, total: 2.85 ms
Wall time: 3 ms
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

%%time
Parallel(n_jobs=2)(delayed(np.square)(i) for i in range(10))
output: CPU times: user 42.7 ms, sys: 762 µs, total: 43.5 ms
Wall time: 75.9 ms
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

%%time
Parallel(n_jobs=3)(delayed(np.square)(i) for i in range(10))
output: CPU times: user 92.9 ms, sys: 8.93 ms, total: 102 ms
Wall time: 151 ms
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
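
The wall times above grow with n_jobs, most likely because squaring a single integer is far cheaper than starting worker processes and passing data to them. Here is a minimal sketch, assuming each task takes a noticeable amount of time (simulated with time.sleep), of the kind of workload where extra workers can pay off. The helper run_batch and its parameters are hypothetical names for this illustration, and no timings are shown because they depend on the machine and on worker startup cost.

```python
import time
from joblib import Parallel, delayed

def run_batch(n_jobs, n_tasks=4, seconds=0.2):
    """Run n_tasks sleeps through joblib; return (elapsed_seconds, results)."""
    start = time.perf_counter()
    results = Parallel(n_jobs=n_jobs)(
        delayed(time.sleep)(seconds) for _ in range(n_tasks)
    )
    return time.perf_counter() - start, results

# One worker runs the four sleeps one after another
serial_time, serial_results = run_batch(n_jobs=1)

# Two workers can overlap the sleeps, at the cost of starting the workers
parallel_time, parallel_results = run_batch(n_jobs=2)

print(f"n_jobs=1: {serial_time:.2f} s")
print(f"n_jobs=2: {parallel_time:.2f} s")
```

As a rule of thumb, parallelism helps when the work per task clearly outweighs the overhead of dispatching it.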

Dump

dump(a, '/content/sample_data/a.job')
output: ['/content/sample_data/a.job']

Load

aa = load('/content/sample_data/a.job')
print(aa)
output:
[[0 0 1]
 [1 1 1]
 [4 2 1]]
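
The "fast compressed persistence" feature is not used in the calls above, so here is a minimal sketch of it using the compress argument of dump. The temporary directory and the file name a_compressed.job are assumptions for this illustration, and the level 3 is just one reasonable choice.

```python
import os
import tempfile

import numpy as np
from joblib import dump, load

data = np.vander(np.arange(3))

tmp_dir = tempfile.mkdtemp()                      # hypothetical scratch directory
path = os.path.join(tmp_dir, "a_compressed.job")  # hypothetical file name

# compress accepts a level from 0 (none) to 9 (smallest file, slowest)
written = dump(data, path, compress=3)
restored = load(path)  # load detects the compression on its own

print(written == [path])               # True
print(np.array_equal(data, restored))  # True
```

Higher levels trade write speed for smaller files, so it is worth trying a few values on your own data.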

References

Documentation: https://joblib.readthedocs.io
Download: https://pypi.python.org/pypi/joblib#downloads
Source code: https://github.com/joblib/joblib
Report issues: https://github.com/joblib/joblib/issues

Source:
https://medium.com/r/?url=https%3A%2F%2Fgithub.com%2FSharathHebbar%2FData-Science-and-ML%2Ftree%2Fmain%2Fcodes%2Fjoblib
