AdTech using SingleStoreDB and Looker

veryfatboy

Akmal Chaudhri

Posted on March 16, 2023

AdTech using SingleStoreDB and Looker

Abstract

In this article, we'll see how to run a SingleStore GitHub project for implementing an AdTech application in SingleStoreDB Cloud. We'll also use Looker to create some visualisations.

Introduction

In this article, we'll take an existing SingleStore GitHub repo and demonstrate the ease with which it can be deployed and run on SingleStoreDB Cloud. The complete updated code can be found in a forked GitHub repo for ease of use.

Create a SingleStoreDB Cloud account

A previous article showed the steps to create a free SingleStoreDB Cloud account. We'll use AdTech Demo Group as our Workspace Group Name and adtech-demo as our Workspace Name. We'll make a note of our password and host name.

Install Go

First, we'll download and install Go, as follows:

rm -rf /usr/local/go && tar -C /usr/local -xzf go1.20.1.linux-amd64.tar.gz
Enter fullscreen mode Exit fullscreen mode

We may need to use sudo, as follows:

sudo rm -rf /usr/local/go && sudo tar -C /usr/local -xzf go1.20.1.linux-amd64.tar.gz
Enter fullscreen mode Exit fullscreen mode

We'll add the following to the PATH variable:

export PATH=$PATH:/usr/local/go/bin
Enter fullscreen mode Exit fullscreen mode

and then check the version:

go version
Enter fullscreen mode Exit fullscreen mode

This will print the installed version of Go. For example:

go version go1.20.1 linux/amd64
Enter fullscreen mode Exit fullscreen mode

Clone GitHub repo

We'll now clone the following GitHub repo:

git clone https://github.com/VeryFatBoy/starter_kits
Enter fullscreen mode Exit fullscreen mode

Data Generator

Let's first start with the AdTech data generator and change to the directory:

cd starter_kits/adtech/generator
Enter fullscreen mode Exit fullscreen mode

We'll now build the code, as follows:

go mod init generator

go mod tidy

go build
Enter fullscreen mode Exit fullscreen mode

This will create an executable generator file.

Let's now create a small dataset for initial testing:

./generator small
Enter fullscreen mode Exit fullscreen mode

This will create four files with tab-separated values:

campaigns-small.tsv
events-small.tsv
networks-small.tsv
platforms-small.tsv
Enter fullscreen mode Exit fullscreen mode

We'll gzip these files:

gzip *.tsv
Enter fullscreen mode Exit fullscreen mode

We can also pass the parameter medium or large for larger datasets.

Load and run SQL

We'll move up one directory level:

cd ..
Enter fullscreen mode Exit fullscreen mode

In the file setup.sql, the following four code blocks each use a pipeline and have been commented out:

...
-- create or replace pipeline load_events as
-- load data s3 'reference-solutions.memsql.com/events-large.tsv.gz'
-- config '{"region": "us-east-1"}'
-- credentials '{"aws_access_key_id": "", "aws_secret_access_key": ""}'
-- into table events;
-- start pipeline load_events;
...
-- create or replace pipeline load_campaigns as
-- load data s3 'reference-solutions.memsql.com/campaigns-large.tsv.gz'
-- config '{"region": "us-east-1"}'
-- credentials '{"aws_access_key_id": "", "aws_secret_access_key": ""}'
-- skip duplicate key errors
-- into table campaigns;
-- start pipeline load_campaigns;
...
-- create or replace pipeline load_networks as
-- load data s3 'reference-solutions.memsql.com/networks-large.tsv.gz'
-- config '{"region": "us-east-1"}'
-- credentials '{"aws_access_key_id": "", "aws_secret_access_key": ""}'
-- skip duplicate key errors
-- into table networks;
-- start pipeline load_networks;
...
-- create or replace pipeline load_platforms as
-- load data s3 'reference-solutions.memsql.com/platforms-large.tsv.gz'
-- config '{"region": "us-east-1"}'
-- credentials '{"aws_access_key_id": "", "aws_secret_access_key": ""}'
-- skip duplicate key errors
-- into table platforms;
-- start pipeline load_platforms;
...
Enter fullscreen mode Exit fullscreen mode

If we decide to use pipelines later, we can uncomment and modify the code to suit our needs.

The following four new code blocks, respectively, have been added so that we can load the locally generated data instead:

...
load data local infile "generator/events-small.tsv.gz"
into table events
columns terminated by '\t';
...
load data local infile "generator/campaigns-small.tsv.gz"
into table campaigns
skip duplicate key errors
columns terminated by '\t';
...
load data local infile "generator/networks-small.tsv.gz"
into table networks
skip duplicate key errors
columns terminated by '\t';
...
load data local infile "generator/platforms-small.tsv.gz"
into table platforms
skip duplicate key errors
columns terminated by '\t';
...
Enter fullscreen mode Exit fullscreen mode

Next, we'll run the commands in the setup.sql file to create the adtech database and tables and load the data:

mysql -u admin -h <host> -P 3306 -p<password> --local-infile < setup.sql
Enter fullscreen mode Exit fullscreen mode

We'll replace the <host> and <password> with the values from our SingleStoreDB Cloud account.

From our SingleStoreDB Cloud account, we can use the SQL Editor to check that the tables were correctly created:

USE adtech;

SHOW TABLES;
Enter fullscreen mode Exit fullscreen mode

Benchmarking using dbbench (optional)

The dbbench tool can be used to run workloads against a SingleStoreDB database. A discussion of the tool is beyond the scope of this article, but the SingleStore GitHub repo is a great place to start and contains a tutorial.

To build and run the dbbench program, we'll first change to the directory starter_kits/adtech/benchmarks.

Next, we'll set up some environment variables:

export GOPATH=$HOME/go
export PATH=$PATH:$GOPATH/bin
Enter fullscreen mode Exit fullscreen mode

and then install dbbench, as follows:

go install github.com/memsql/dbbench@latest
Enter fullscreen mode Exit fullscreen mode

We can run the program from the starter_kits/adtech/benchmarks directory, as follows:

dbbench --host=<host> --username=admin --password=<password> --intermediate-stats=false adtech.ini
Enter fullscreen mode Exit fullscreen mode

We'll replace the <host> and <password> with the values from our SingleStoreDB Cloud account.

Here is an example of the results using the small dataset with an S-00 Workspace Size:

Connected
starting impressions per campaign
starting ctr_per_country
starting campaign facts
starting spend per advertiser
stopping spend per advertiser
stopping ctr_per_country
stopping campaign facts
stopping impressions per campaign

ctr_per_country: latency 59.327792ms±15.229575ms; 30 transactions (16.847 TPS); 7470 rows (4194.792 RPS)
 16.777216ms -  33.554432ms [    12]: ██████████████████████████████████████████████████
 33.554432ms -  67.108864ms [     7]: █████████████████████████████▏
 67.108864ms - 134.217728ms [    10]: █████████████████████████████████████████▋
134.217728ms - 268.435456ms [     1]: ████▏

spend per advertiser: latency 58.056977ms±15.261319ms; 30 transactions (17.215 TPS); 15870 rows (9106.741 RPS)
  8.388608ms -  16.777216ms [     1]: ███▎
 16.777216ms -  33.554432ms [     6]: ████████████████████
 33.554432ms -  67.108864ms [    15]: ██████████████████████████████████████████████████
 67.108864ms - 134.217728ms [     7]: ███████████████████████▎
134.217728ms - 268.435456ms [     1]: ███▎

impressions per campaign: latency 116.803395ms±35.36871ms; 30 transactions (8.559 TPS); 30 rows (8.559 RPS)
  8.388608ms -  16.777216ms [     2]: ██████▋
 16.777216ms -  33.554432ms [     2]: ██████▋
 33.554432ms -  67.108864ms [     2]: ██████▋
 67.108864ms - 134.217728ms [    15]: ██████████████████████████████████████████████████
134.217728ms - 268.435456ms [     7]: ███████████████████████▎
268.435456ms - 536.870912ms [     2]: ██████▋

campaign facts: latency 64.628849ms±30.600693ms; 30 transactions (15.464 TPS); 0 rows (0.000 RPS)
  8.388608ms -  16.777216ms [     5]: █████████████████████████
 16.777216ms -  33.554432ms [     5]: █████████████████████████
 33.554432ms -  67.108864ms [    10]: ██████████████████████████████████████████████████
 67.108864ms - 134.217728ms [     9]: █████████████████████████████████████████████
134.217728ms - 268.435456ms [     0]: 
268.435456ms - 536.870912ms [     1]: █████
Enter fullscreen mode Exit fullscreen mode

Using dbbench allows us to build reference implementations and then test the performance against these.

Visualisations using Looker

Finally, we'll create some visualisations.

To use Looker, we can use an existing account or request a free trial.

Once logged into the Looker account, from Account > Admin > Database > Connections we'll create a New Connection, as follows:

  • Name: reference_solutions
  • Dialect: SingleStore 7+
  • Remote Host: <host> Port: 3306
  • Database: adtech
  • Username: admin
  • Password: <password>

We'll replace the <host> and <password> with the values from our SingleStoreDB Cloud account.

We can test if the connection is successful.

Next, from Account > Develop, we'll ensure that Development Mode is toggled to ON, select Manage LookML Projects, and choose New LookML Project.

We'll now enter the following:

  • Project Name: adtech
  • Starting Point: Generate Model from Database Schema
  • Connection: reference_solutions
  • Build Views From: All Tables

Next, we'll select Create > Dashboard from the Looker home page. We'll use the name Events and then click Create Dashboard.

We'll click Edit Dashboard and then Add Visualization.

From the left nav, we'll select the Events table, which will appear expanded in a new window. Under DIMENSIONS, we'll select Country; under MEASURES, we'll choose Count. We'll select Google Maps on the right-hand side and then choose Run.

This will create a heatmap as shown in Figure 1.

Figure 1. Count per Country.

Figure 1. Count per Country.

Hovering over the different countries will provide the number of events for that country.

From the Looker home page, we'll select Create > Dashboard. We'll use the name Campaigns and then click Create Dashboard.

We'll click Edit Dashboard and then Add Visualization.

We'll select the Campaigns table from the left nav, which will appear expanded in a new window. Under DIMENSIONS, we'll select Campaign Name; under MEASURES, we'll choose Count. On the right-hand side, we'll set the Row Limit to 10, select Pie Chart and then choose Run.

This will create a chart as shown in Figure 2.

Figure 2. Count per Campaign.

Figure 2. Count per Campaign.

Additional dashboards could be created to get further insights.

Summary

We have generated some example AdTech data and loaded the small dataset into SingleStoreDB Cloud. The next step would be to create and load the medium or large dataset. We could also store the AdTech data in the cloud and use the pipelines feature to stream the data into SingleStoreDB.

We have built and tested dbbench. This tool could be handy for creating reference test results.

Finally, we have created two quick dashboards using Looker to gain insights into some of our AdTech data. Additional visualisations could be easily created using this powerful browser-based tool.

💖 💪 🙅 🚩
veryfatboy
Akmal Chaudhri

Posted on March 16, 2023

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related

AdTech using SingleStoreDB and Looker
singlestoredb AdTech using SingleStoreDB and Looker

March 16, 2023