Analyzing Your On-Call Activity with Kubernetes Jobs
Joseph D. Marhee
Posted on December 11, 2018
In Kubernetes, you can create a Job, a one-off task that schedules a pod to run to completion, or make it recurring with a CronJob resource (much as you would with cron on a non-distributed system).
Let’s take the example of the following script:
import requests
import json
import os
import datetime

API_KEY = os.environ['PAGERDUTY_RO_KEY']
SINCE = os.environ['START_DATE']
UNTIL = os.environ['END_DATE']
STATUSES = []
TIME_ZONE = 'UTC'
LIMIT = 50
RUNDATE = datetime.datetime.today().strftime('%Y%m%d%H%M%S')

def list_incidents(offsetval):
    # Request one page of incidents from the PagerDuty v2 REST API
    url = 'https://api.pagerduty.com/incidents'
    headers = {
        'Accept': 'application/vnd.pagerduty+json;version=2',
        'Authorization': 'Token token={token}'.format(token=API_KEY)
    }
    payload = {
        'since': SINCE,
        'until': UNTIL,
        'statuses[]': STATUSES,
        'limit': LIMIT,
        'time_zone': TIME_ZONE,
        'offset': offsetval
    }
    r = requests.get(url, headers=headers, params=payload)
    return r.text

def write_csv(resp, page):
    # Dump each incident's title and creation time to a per-page CSV
    incidents = json.loads(resp)['incidents']
    with open('/mnt/pd-audit/%s-Incidents-%s.csv' % (RUNDATE, page), 'w') as incidents_data:
        for inc in incidents:
            incidents_data.write("%s,%s\n" % (inc['title'], inc['created_at']))

if __name__ == '__main__':
    page = 0
    while True:
        # PagerDuty's classic pagination offsets by record count, not page
        # number, so advance the offset by LIMIT records each iteration
        resp = list_incidents(page * LIMIT)
        print("Writing '/mnt/pd-audit/%s-Incidents-%s.csv'..." % (RUNDATE, page))
        write_csv(resp, page)
        if not json.loads(resp)['more']:
            print("No more pages after current run.")
            break
        page += 1
which dumps a CSV of PagerDuty incident activity between a start and end date to disk, for your team to review later.
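Before containerizing it, you can sanity-check the script locally; a sketch, where the key and date range are placeholder values, and /mnt/pd-audit needs to exist and be writable:

export PAGERDUTY_RO_KEY='your-read-only-api-key'
export START_DATE='2018-12-01'
export END_DATE='2018-12-08'
sudo mkdir -p /mnt/pd-audit && sudo chown $(whoami) /mnt/pd-audit
python pd-alert-csv.py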
So, let’s say you want to run this on Kubernetes as a Job you can kick off as desired. First, build the script into an image:
FROM python:3

# These are supplied at runtime by the Job manifest below
ENV PAGERDUTY_RO_KEY=""
ENV START_DATE=""
ENV END_DATE=""

COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
COPY pd-alert-csv.py main.py

CMD ["python", "main.py"]
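Build the image and push it to your registry; for example, using the same hypothetical registry and tag the Job manifest below references:

docker build -t coolregistry.us/pd-alert-csv:latest .
docker push coolregistry.us/pd-alert-csv:latest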
Then you can reference the image in a Job template like:
apiVersion: batch/v1
kind: Job
metadata:
  name: pd-alert-csv
  namespace: jobs-space
spec:
  template:
    spec:
      restartPolicy: Never
      volumes:
      - name: pd-data
        emptyDir: {}
      containers:
      - name: pd-alert-csv
        image: coolregistry.us/pd-alert-csv:latest
        volumeMounts:
        - mountPath: "/mnt/pd-audit"
          name: pd-data
        env:
        - name: PAGERDUTY_RO_KEY
          valueFrom:
            secretKeyRef:
              name: pd-api-token
              key: PAGERDUTY_RO_KEY
        - name: START_DATE
          value: "SOME_DATE"
        # Ideally, you might program this window calculation into the script if this were recurring
        - name: END_DATE
          value: "SOME_LATER_DATE"
You might back /mnt/pd-audit with an EBS-backed volume, or store the output however you’d like (or even write it to S3 directly from the script). The PagerDuty API key is loaded into the pod as an environment variable from a Kubernetes Secret, and the start and end dates for the job are passed in as plain environment variables, overriding the empty defaults in the image we just built.
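Assuming the pd-api-token Secret doesn’t exist yet, you might create it and then launch the Job with something like this (the manifest filename is just a hypothetical name for the file you saved the template above into):

kubectl create secret generic pd-api-token \
  --namespace jobs-space \
  --from-literal=PAGERDUTY_RO_KEY='your-read-only-api-key'
kubectl apply -f pd-alert-csv-job.yaml
kubectl get jobs --namespace jobs-space --watch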
To make this a recurring job, you wrap your Job spec in a CronJob declaration (much as the Deployment resource wraps Pods) to tell Kubernetes what to run and when:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: pd-alert-report-cron
spec:
  schedule: "0 */12 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          ...
and then the pod spec from the Job we defined above goes, indented, under that last spec key. The schedule above fires at minute 0 every 12 hours.
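Put together (a sketch, reusing the Job’s pod spec and the hypothetical image and Secret from above), the full manifest might look like:

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: pd-alert-report-cron
  namespace: jobs-space
spec:
  schedule: "0 */12 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          volumes:
          - name: pd-data
            emptyDir: {}
          containers:
          - name: pd-alert-csv
            image: coolregistry.us/pd-alert-csv:latest
            volumeMounts:
            - mountPath: "/mnt/pd-audit"
              name: pd-data
            env:
            - name: PAGERDUTY_RO_KEY
              valueFrom:
                secretKeyRef:
                  name: pd-api-token
                  key: PAGERDUTY_RO_KEY
            # For a recurring job, you'd ideally compute this window in the script
            - name: START_DATE
              value: "SOME_DATE"
            - name: END_DATE
              value: "SOME_LATER_DATE"

You can then check on it with kubectl get cronjobs --namespace jobs-space.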
To learn more about what you can do with jobs (work queues, etc.), definitely check out the Kubernetes documentation:
https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/
and the amazing Kubernetes By Example site for more great examples.