Rails Antivirus validator as a service on K8s

tareksamni

Tarek N. Elsamni

Posted on February 5, 2020

Originally posted on https://www.shebanglabs.io/rails-antivirus-clamby-clamav/

Carrierwave, ClamAV and Clamby

If you are building a web application, you will almost certainly want to support file uploads, an important feature in modern applications. CarrierWave is a popular Ruby gem that works well with Rack-based web applications such as Ruby on Rails, providing file uploads out of the box along with a long list of related features.

If your web application accepts file uploads and you do not scan the files for viruses, you put at risk not only your own software but also the users of the application and their files.

To avoid such scenarios, we often whitelist allowed file extensions and content types. This approach might not be enough if you decide to allow executable uploads, or if an attacker uploads a malicious image or any other file with an allowed extension or content type.

In this tutorial, I will show you how to use the Rails ActiveModel::Validator class to build a modular validator that scans each file upload in real time using ClamAV and the Clamby gem.

ClamAV® is an open source antivirus engine for detecting trojans, viruses, malware & other malicious threats.

The Clamby gem depends on the ClamAV command-line scanner (clamscan) being installed already. If you install clamscan and try to run Clamby, you will notice that each scan takes a few seconds (around 10, depending on available computing resources). This is because every scan starts a new clamscan process, which has to load the antivirus database, check the virus signatures and run other boot routines before it finally starts the actual scan.

To overcome this issue, Clamby's creator highly recommends setting the daemonize option to true. This lets the scanner remain in memory as a daemon (clamd), so the database does not have to be loaded for each virus scan. It saves several seconds per request.

The bad news is that a single ClamAV process consumes 600-800MB of memory on average.

Every Rails server/pod you run would spend that much expensive memory on nothing but preloading the virus database to deliver real-time antivirus scans!

Fortunately, ClamAV has a TCP/IP socket-based interface, which means we can run a single shared process and access it remotely over TCP/IP sockets. Even better, we can run a cluster of distributed processes and load-balance the virus scans across them. This sounds like a good plan 👌.
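
To make the socket interface concrete, here is a minimal sketch of clamd's PING command in plain Ruby. The helper names (clamd_ping, clamd_alive?) are made up for this illustration and are not part of Clamby; the host is whatever address your daemon listens on, and 3310 is assumed as the port because that is what the deployment later in this post exposes.

```ruby
require 'socket'

# Sends clamd's PING command over an already-open IO and returns the reply.
# The "z" prefix tells clamd that commands and replies are NUL-terminated.
def clamd_ping(io)
  io.write("zPING\0")
  reply = io.gets("\0") # read up to and including the NUL terminator
  reply.to_s.delete("\0")
end

# Hypothetical helper: true when a clamd at host:port answers PING with PONG.
def clamd_alive?(host, port = 3310, timeout: 2)
  Socket.tcp(host, port, connect_timeout: timeout) do |sock|
    clamd_ping(sock) == 'PONG'
  end
rescue SystemCallError, SocketError, IOError
  # Refused, unresolved or timed-out connections all count as "not alive".
  false
end
```

Something like clamd_alive?('antivirus-svc.shared.svc.cluster.local') could then serve as a quick reachability check from a Rails console, assuming the service name used later in this post.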

Assumptions And Prerequisites

The following part of this post will show you how to deploy ClamAV as a service on K8s, access it from other pods (Rails) over a TCP/IP socket and how to configure Rails to utilize this service in a modular and DRY implementation.

This post makes the following assumptions:

Step 1: Deploy ClamAV as a service on Kubernetes

To deploy ClamAV on Kubernetes, you need to configure a Kubernetes deployment and make it accessible through a Kubernetes service. The service exposes the deployment under an FQDN (a fully qualified DNS name) that spreads traffic across the deployment replicas without any unfamiliar service discovery mechanism, which makes the antivirus horizontally scalable.

  • The kubernetes deployment will look like:
# k8s/clamav-deployment.yaml
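apiVersion: apps/v1
kind: Deployment
metadata:
  name: clamav
  namespace: shared
spec:
  replicas: 1
  minReadySeconds: 30
  # apps/v1 requires an explicit selector that matches the pod template labels.
  selector:
    matchLabels:
      app: clamav
  template: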
    metadata:
      labels:
        app: clamav
    spec:
      containers:
      - name: clamav
        image: quay.io/ukhomeofficedigital/clamav:v1.7.1
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 3310
          name: api
          protocol: TCP
        livenessProbe:
          exec:
            command:
            - /readyness.sh
          initialDelaySeconds: 20
          timeoutSeconds: 2

  • The exposing service will look like:
# k8s/clamav-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: antivirus-svc
  namespace: shared
spec:
  selector:
    app: clamav
  clusterIP: None
  ports:
  - name: zombie-port # Actually, we do not use this port but it is still needed to allow the service to receive TCP traffic.
    port: 1234
    targetPort: 1234


Now, you can create the deployment and its exposing service using kubectl as follows:

kubectl apply -f k8s/clamav-deployment.yaml -f k8s/clamav-service.yaml
kubectl -n shared get svc
NAME             TYPE        CLUSTER-IP  EXTERNAL-IP  PORT(S)   AGE
antivirus-svc    ClusterIP   None        <none>       1234/TCP  20s

Step 2: Configure Clamby to use ClamAV service

As shown in the previous step, ClamAV is now up and running as a Kubernetes deployment with 1 replica (you can add more replicas to scale it horizontally), listening on port 3310 over TCP. Because the service is headless (clusterIP: None), the name antivirus-svc.shared.svc.cluster.local resolves directly to the replica pods, so clients connect to clamd on port 3310 and new connections are spread across the replicas through DNS rather than through a cluster IP.
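
To see what that DNS name resolves to from inside a pod, a small sketch like the following can help. The helper names service_fqdn and replica_addresses are hypothetical, and cluster.local is only the default cluster domain, so treat it as an assumption about your cluster.

```ruby
require 'resolv'

# Builds the in-cluster DNS name Kubernetes assigns to a Service.
# "cluster.local" is the default cluster domain; yours may differ.
def service_fqdn(service, namespace, cluster_domain: 'cluster.local')
  "#{service}.#{namespace}.svc.#{cluster_domain}"
end

# Hypothetical helper: lists the addresses the name currently resolves to.
# For a headless Service (clusterIP: None) this is one entry per ready pod.
def replica_addresses(fqdn, resolver: Resolv)
  resolver.getaddresses(fqdn)
end
```

Running replica_addresses(service_fqdn('antivirus-svc', 'shared')) in a Rails console on the cluster should list one address per ClamAV replica; an empty list suggests the service selector does not match any ready pod.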

To configure the Clamby gem to connect to the ClamAV daemon at antivirus-svc.shared.svc.cluster.local on port 3310 over TCP, we use the following Rails initializer:

# config/initializers/clamby.rb

clamby_configs = {
  daemonize: true
}

clamby_configs[:config_file] = '/etc/clamav/clamd.conf'

Clamby.configure(clamby_configs)

This initializer is instructing the Clamby gem to use a clamav config file located at /etc/clamav/clamd.conf. This file is not created yet but we will now create it as a part of building the RoR docker image used to run the application.

So, your RoR Dockerfile should look something like:

FROM bitnami/rails:latest

# Install OS dependencies
# COPY Gemfile $APP_PATH/Gemfile
# COPY Gemfile.lock $APP_PATH/Gemfile.lock

# Install bundler
# bundle install

# COPY . $APP_PATH

# Precompile assets

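# Point clamdscan at the remote clamd; mkdir -p in case the directory is missing.
RUN mkdir -p /etc/clamav \
 && echo "TCPSocket 3310" > /etc/clamav/clamd.conf \
 && echo "TCPAddr antivirus-svc.shared.svc.cluster.local" >> /etc/clamav/clamd.conf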

# Entrypoint and CMD

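Since the whole setup hinges on those two settings being present in /etc/clamav/clamd.conf, it can be worth checking them at boot. The following is a minimal sketch, not part of Clamby; parse_clamd_conf and missing_tcp_settings are hypothetical helpers, and the parser only understands simple "Key value" lines.

```ruby
# Parses "Key value" lines from a clamd.conf-style file into a Hash,
# ignoring blank lines and "#" comments. Repeated keys keep the last value.
def parse_clamd_conf(text)
  text.each_line.with_object({}) do |line, conf|
    line = line.strip
    next if line.empty? || line.start_with?('#')

    key, value = line.split(/\s+/, 2)
    conf[key] = value
  end
end

# Hypothetical guard: returns the TCP settings that are missing from the
# file contents (an empty array means both are present).
def missing_tcp_settings(text)
  %w[TCPSocket TCPAddr] - parse_clamd_conf(text).keys
end
```

In the initializer, you could read the file with File.read('/etc/clamav/clamd.conf') and raise if missing_tcp_settings returns anything, so a misbuilt image fails at startup instead of at the first upload.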

Now, if you run rails c from a container on the Kubernetes cluster built from this Dockerfile, you should be able to run the following command to perform ClamAV scans against the remote service over TCP:

# rails c
Loading development environment (Rails 5.2.3)
[1] pry(main)> Clamby.virus?('SOME_LOCAL_FILE_PATH')
ClamAV 0.101.1/25431/Fri Apr 26 08:57:33 2019
/app/SOME_LOCAL_FILE_PATH: OK
false # no virus 🎉
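Note that the scanned file lives in the Rails container, not in the ClamAV pod, so its bytes have to travel over the socket. clamd's INSTREAM command covers this case: the client sends the command, then the data as chunks that are each prefixed with their length as a 4-byte big-endian integer, then a zero-length chunk. As far as I understand, clamdscan takes care of this for you when the daemon is remote; the sketch below only illustrates the wire format, and the method names are hypothetical.

```ruby
require 'stringio'

# Yields the byte strings that make up one INSTREAM request for the given IO:
# the command, then each chunk prefixed with its size as a 4-byte big-endian
# integer, then a zero-length chunk that marks the end of the stream.
def each_instream_frame(io, chunk_size: 8192)
  yield "zINSTREAM\0"
  while (chunk = io.read(chunk_size))
    yield [chunk.bytesize].pack('N') + chunk
  end
  yield [0].pack('N')
end

# Interprets clamd's reply: "stream: OK" or "stream: <signature> FOUND".
# Anything else (for example a size-limit error) is reported as :error.
def instream_verdict(reply)
  text = reply.delete("\0").strip
  return :clean if text.end_with?('OK')
  return :infected if text.end_with?('FOUND')

  :error
end
```

Writing each yielded frame to an open TCP socket and passing the NUL-terminated reply to instream_verdict would complete a scan without any ClamAV binaries in the Rails image.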

Step 3: An ActiveModel validator to utilize Clamby

With all of the infrastructure in place for running ClamAV as a remote service over TCP, and the RoR app configured to connect to it, it is time to write a modular, DRY and reusable ActiveModel validator that can scan every file the user uploads in real time.

An ActiveModel validator could look like:

# app/validators/antivirus_validator.rb

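class AntivirusValidator < ActiveModel::Validator
  # Adds an error to the record when ClamAV flags the uploaded file.
  def validate(record)
    path = file(record).path
    # Nothing to scan when there is no upload or the file is not on local disk.
    return unless path && File.exist?(path) && Clamby.virus?(path)

    record.errors.add(options[:attribute_name].to_sym, I18n.t('infected_file'))
  end

  private

  # Returns the mounted uploader for the attribute named in the options.
  def file(record)
    record.public_send(options[:attribute_name].to_sym)
  end
end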

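The validator's decision can be exercised without Rails or a running clamd by isolating it in plain Ruby. This is a sketch under assumptions: upload_infected? is a hypothetical helper that mirrors the condition above, and scanner is any callable that stands in for Clamby.virus? in the real app.

```ruby
require 'tempfile'

# Mirrors the validator's condition in plain Ruby. `scanner` is any callable
# that takes a path and returns a truthy value for an infected file, for
# example Clamby.method(:virus?) in the real app (an assumption of this sketch).
def upload_infected?(path, scanner:)
  # Uploads without a local file are skipped, not scanned.
  return false if path.nil? || !File.exist?(path)

  scanner.call(path) ? true : false
end

# Usage with a stubbed scanner that flags every file.
sample = Tempfile.new('upload')
puts upload_infected?(sample.path, scanner: ->(_path) { true })
sample.close!
```

The sketch makes one design point visible: a record whose file has no local path passes validation unscanned, which matters if your uploader does not keep a local copy of the file.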

Then you can use the validator with a single line inside any ActiveRecord model:

# app/models/some_model.rb

class SomeModel < ActiveRecord::Base
  mount_uploader :image, PictureUploader
  validates_with AntivirusValidator, attribute_name: 'image'
end


Whenever you need to scan a file uploaded through a mounted uploader on a model, all you need to do is add the following validation to it:

validates_with AntivirusValidator, attribute_name: 'image'

Because the ClamAV process is already preloaded and running on the remote deployment, and because the deployment runs on the same Kubernetes cluster so that all traffic stays local, a file scan takes ~20ms for small files (< 1MB) and a little more for bigger files. Do not hesitate to scan every single file uploaded by your end users: the process is not expensive, and everything is now in place to add a scan with one extra line of code.
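
Scan latency depends on your cluster, so it is worth measuring it yourself. A minimal sketch using Ruby's Benchmark module follows; median_scan_ms is a hypothetical helper, and the block you pass stands in for a real scan call.

```ruby
require 'benchmark'

# Runs the given block `runs` times and returns the median wall-clock
# duration in milliseconds (the upper middle value for even run counts).
# The block stands in for a real scan call.
def median_scan_ms(runs: 5)
  timings = Array.new(runs) { Benchmark.realtime { yield } * 1000.0 }
  timings.sort[timings.size / 2]
end
```

From a Rails console on the cluster, median_scan_ms { Clamby.virus?('SOME_LOCAL_FILE_PATH') } gives a rough per-scan figure for your own setup.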

Happy virus 🦠 scanning 👋
