Cameron Archer
Posted on March 22, 2023
This post is the last in a series on real-time analytics. It is an excerpt from Real-time analytics, a definitive guide which can be read in full here.
--
So how do you begin to build real-time analytics into your next development project? As this guide has demonstrated, there are 3 core steps to building real-time analytics:
- Ingesting data at streaming scale
- Querying the data to build analytics metrics
- Publishing the metrics to integrate into your apps
Tinybird is a real-time analytics platform that makes all of this possible. Below you’ll find practical steps on ingesting data from streaming platforms (and other sources), querying that data with SQL, and publishing low-latency, high-concurrency APIs for consumption within your applications.
If you’re new to Tinybird, you can try it out by signing up for a free-forever Build Plan, with no credit card required, no time restrictions, and generous free limits.
Ingesting real-time data into Tinybird
Tinybird supports ingestion from multiple sources, including streaming platforms, files, databases, and data warehouses. Here’s how to code ingestion from various sources using Tinybird.
Ingest real-time data from Kafka
Tinybird enables real-time data ingestion from Kafka using the native Kafka connector. You can use the Tinybird UI to set up your Kafka connection, choose your topics, and define your ingestion schema in a few clicks. Or, you can use the Tinybird CLI to develop Kafka ingestion pipelines from your terminal.
To learn more about building real-time analytics on top of Kafka data, check out these resources:
- Docs - Tinybird Kafka Connector
- Screencast - Create REST APIs from Kafka streams in minutes
- Blog - From Kafka streams to data products
- Live Coding Session - Build low-latency analytics APIs on top of Kafka data
Note that this applies to any Kafka-compatible platform such as Confluent, Redpanda, Upstash, Aiven, or Amazon MSK.
Ingest real-time data from your Data Warehouse
Tinybird works as a real-time publication layer for data stored in data warehouses. With Tinybird, you can synchronize tables in your data warehouses - such as BigQuery, Redshift, or Snowflake - develop metrics in SQL, and publish those metrics as low-latency, high-concurrency APIs.
Tinybird’s Connector Development Kit has made it possible to quickly ingest data from many data warehouses into Tinybird.
Check out these resources below to learn how to build real-time analytics on top of your data warehouse:
- Docs - Tinybird Snowflake Connector
- Docs - Tinybird BigQuery Connector
- Screencast - Sync BigQuery tables to Tinybird with the BigQuery Connector
- Blog - Transforming real-time applications with Tinybird and BigQuery
- Live Coding Session - Building real-time analytics with BigQuery
Ingest real-time data from CSV, NDJSON, and Parquet files
Tinybird enables data ingestion from CSV, NDJSON, and Parquet files, either locally on your machine or remotely in cloud storage such as GCP or S3 buckets. While data stored in files is often not generated in real time, it can be beneficial as dimensional data to join with data ingested through streaming platforms. Tinybird has wide coverage of SQL joins to make this possible.
You can ingest real-time data from files using the Tinybird UI, using the CLI, or using the Data Sources API.
Here are some resources to learn how to ingest data from local or remote files:
- Docs - Ingest data from CSV files into Tinybird
- Docs - How to ingest NDJSON data into Tinybird
- Docs - The Tinybird Datasources API
- Blog - Querying large CSVs online with SQL
- Screencast - Ingest data from a file into Tinybird
Ingest from your applications via HTTP
Perhaps the simplest way to capture real-time data into Tinybird is using the Events API, a simple HTTP endpoint that enables high-frequency ingestion of JSON records into Tinybird.
Because it’s just an HTTP endpoint, you can invoke the API from any application code. The Events API can handle ingestion at up to 1000 requests and 20+ MB per second, making it super scalable for most streaming use cases.
Check out the code snippets below for example usage in your favorite language.
cURL
curl \
-X POST 'https://api.tinybird.co/v0/events?name=events_example' \
-H "Authorization: Bearer p.eyJ1IjogImE5Yzk1YTA4LTkwZmQtNDRiMi05NDFkLWJlNWQwZTViODVkOCIsICJpZCI6ICJlNjZkZjA0Yi1hZmI1LTRlMDctOWE0ZC01NzNkMDc2NzZmZmQifQ.IZlFNiZSgJgNTGs7wJSIgly2PHNDvn6Hohg1oZUDOiI" \
-d $'{"timestamp":"2022-10-27T11:43:02.099Z","transaction_id":"8d1e1533-6071-4b10-9cda-b8429c1c7a67","name":"Bobby Drake","email":"bobby.drake@pressure.io","age":42,"passport_number":3847665,"flight_from":"Barcelona","flight_to":"London","extra_bags":1,"flight_class":"economy","priority_boarding":false,"meal_choice":"vegetarian","seat_number":"15D","airline":"Red Balloon"}
{"timestamp":"2022-05-11T20:58:45.112Z","transaction_id":"710d06d7-72c8-49bd-86ef-a510c331c3c4","name":"Kaylie Corkery","email":"kaylie.corkery@elevation.xyz","age":34,"passport_number":3510502,"flight_from":"Madrid","flight_to":"New York","extra_bags":2,"flight_class":"business","priority_boarding":true,"meal_choice":"gluten_free","seat_number":"4B","airline":"Green Bird"}'
Python
import requests
import json
from datetime import datetime
data = json.dumps({
'timestamp': '2022-10-27T11:43:02.099Z',
'transaction_id': '8d1e1533-6071-4b10-9cda-b8429c1c7a67',
'name': 'Bobby Drake',
'email': 'bobby.drake@pressure.io',
'age': 42,
'passport_number': 3847665,
'flight_from': 'Barcelona',
'flight_to': 'London',
'extra_bags': 1,
'flight_class': 'economy',
'priority_boarding': False,
'meal_choice': 'vegetarian',
'seat_number': '15D',
'airline': 'Red Balloon'
})
r = requests.post('https://api.tinybird.co/v0/events',
params = {
'name': 'events_example',
'token': 'p.eyJ1IjogImE5Yzk1YTA4LTkwZmQtNDRiMi05NDFkLWJlNWQwZTViODVkOCIsICJpZCI6ICJlNjZkZjA0Yi1hZmI1LTRlMDctOWE0ZC01NzNkMDc2NzZmZmQifQ.IZlFNiZSgJgNTGs7wJSIgly2PHNDvn6Hohg1oZUDOiI',
},
data=data)
print(r.status_code)
print(r.text)
Javascript
fetch(
'https://api.tinybird.co/v0/events?name=events_example',
{
method: 'POST',
body: JSON.stringify({
"timestamp": "2022-10-27T11:43:02.099Z",
"transaction_id": "8d1e1533-6071-4b10-9cda-b8429c1c7a67",
"name": "Bobby Drake",
"email": "bobby.drake@pressure.io",
"age": 42,
"passport_number": 3847665,
"flight_from": "Barcelona",
"flight_to": "London",
"extra_bags": 1,
"flight_class": "economy",
"priority_boarding": false,
"meal_choice": "vegetarian",
"seat_number": "15D",
"airline": "Red Balloon"
}),
headers: { Authorization: 'Bearer p.eyJ1IjogImE5Yzk1YTA4LTkwZmQtNDRiMi05NDFkLWJlNWQwZTViODVkOCIsICJpZCI6ICJlNjZkZjA0Yi1hZmI1LTRlMDctOWE0ZC01NzNkMDc2NzZmZmQifQ.IZlFNiZSgJgNTGs7wJSIgly2PHNDvn6Hohg1oZUDOiI' }
}
)
.then(res => res.json())
.then(data => console.log(data))
Go
package main
import (
"bytes"
"fmt"
"io/ioutil"
"net/http"
"time"
)
func main() {
url := "https://api.tinybird.co/v0/events?name=events_example"
fmt.Println("URL:>", url)
var jsonStr = []byte(`{"timestamp":"2022-10-27T11:43:02.099Z","transaction_id":"8d1e1533-6071-4b10-9cda-b8429c1c7a67","name":"Bobby Drake","email":"bobby.drake@pressure.io","age":42,"passport_number":3847665,"flight_from":"Barcelona","flight_to":"London","extra_bags":1,"flight_class":"economy","priority_boarding":false,"meal_choice":"vegetarian","seat_number":"15D","airline":"Red Balloon"}`)
req, err := http.NewRequest("POST", url, bytes.NewBuffer(jsonStr))
req.Header.Set("Authorization", "Bearer p.eyJ1IjogImE5Yzk1YTA4LTkwZmQtNDRiMi05NDFkLWJlNWQwZTViODVkOCIsICJpZCI6ICJlNjZkZjA0Yi1hZmI1LTRlMDctOWE0ZC01NzNkMDc2NzZmZmQifQ.IZlFNiZSgJgNTGs7wJSIgly2PHNDvn6Hohg1oZUDOiI")
req.Header.Set("Content-Type", "application/json")
client := &http.Client{Timeout: time.Second * 10}
resp, err := client.Do(req)
if err != nil {
panic(err)
}
defer resp.Body.Close()
body, _ := ioutil.ReadAll(resp.Body)
fmt.Println(string(body))
}
Rust
use reqwest::blocking::Client;
use serde_json::json;
fn main() {
let client = Client::new();
let res = client.post("https://api.tinybird.co/v0/events?name=events_example")
.header("Authorization", "Bearer p.eyJ1IjogImE5Yzk1YTA4LTkwZmQtNDRiMi05NDFkLWJlNWQwZTViODVkOCIsICJpZCI6ICJlNjZkZjA0Yi1hZmI1LTRlMDctOWE0ZC01NzNkMDc2NzZmZmQifQ.IZlFNiZSgJgNTGs7wJSIgly2PHNDvn6Hohg1oZUDOiI")
.json(&json!({
"timestamp": "2022-10-27T11:43:02.099Z",
"transaction_id": "8d1e1533-6071-4b10-9cda-b8429c1c7a67",
"name": "Bobby Drake",
"email": "bobby.drake@pressure.io",
"age": 42,
"passport_number": 3847665,
"flight_from": "Barcelona",
"flight_to": "London",
"extra_bags": 1,
"flight_class": "economy",
"priority_boarding": false,
"meal_choice": "vegetarian",
"seat_number": "15D",
"airline": "Red Balloon"
}))
.send()
.unwrap();
let body = res.text().unwrap();
println!("{}", body);
}
PHP
<?php
$curl = curl_init();
curl_setopt_array($curl, array(
CURLOPT_URL => "https://api.tinybird.co/v0/events?name=events_example",
CURLOPT_RETURNTRANSFER => true,
CURLOPT_ENCODING => "",
CURLOPT_MAXREDIRS => 10,
CURLOPT_TIMEOUT => 30,
CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
CURLOPT_CUSTOMREQUEST => "POST",
CURLOPT_POSTFIELDS => "{\"timestamp\":\"2022-10-27T11:43:02.099Z\",\"transaction_id\":\"8d1e1533-6071-4b10-9cda-b8429c1c7a67\",\"name\":\"Bobby Drake\",\"email\":\"bobby.drake@pressure.io\",\"age\":42,\"passport_number\":3847665,\"flight_from\":\"Barcelona\",\"flight_to\":\"London\",\"extra_bags\":1,\"flight_class\":\"economy\",\"priority_boarding\":false,\"meal_choice\":\"vegetarian\",\"seat_number\":\"15D\",\"airline\":\"Red Balloon\"}",
CURLOPT_HTTPHEADER => array(
"Authorization: Bearer p.eyJ1IjogImE5Yzk1YTA4LTkwZmQtNDRiMi05NDFkLWJlNWQwZTViODVkOCIsICJpZCI6ICJlNjZkZjA0Yi1hZmI1LTRlMDctOWE0ZC01NzNkMDc2NzZmZmQifQ.IZlFNiZSgJgNTGs7wJSIgly2PHNDvn6Hohg1oZUDOiI",
"Content-Type: application/json"
),
));
$response = curl_exec($curl);
$err = curl_error($curl);
curl_close($curl);
if ($err) {
echo "cURL Error #:" . $err;
} else {
echo $response;
}
?>
Ruby
require 'net/http'
require 'json'
uri = URI('https://api.tinybird.co/v0/events?name=events_example')
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
request = Net::HTTP::Post.new(uri.request_uri)
request['Authorization'] = 'Bearer p.eyJ1IjogImE5Yzk1YTA4LTkwZmQtNDRiMi05NDFkLWJlNWQwZTViODVkOCIsICJpZCI6ICJlNjZkZjA0Yi1hZmI1LTRlMDctOWE0ZC01NzNkMDc2NzZmZmQifQ.IZlFNiZSgJgNTGs7wJSIgly2PHNDvn6Hohg1oZUDOiI'
request.body = JSON.dump({
"timestamp": "2022-10-27T11:43:02.099Z",
"transaction_id": "8d1e1533-6071-4b10-9cda-b8429c1c7a67",
"name": "Bobby Drake",
"email": "bobby.drake@pressure.io",
"age": 42,
"passport_number": 3847665,
"flight_from": "Barcelona",
"flight_to": "London",
"extra_bags": 1,
"flight_class": "economy",
"priority_boarding": false,
"meal_choice": "vegetarian",
"seat_number": "15D",
"airline": "Red Balloon"
})
response = http.request(request)
puts response.body
Java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
public class Main {
public static void main(String[] args) throws IOException {
String url = "https://api.tinybird.co/v0/events?name=events_example";
String auth = "Bearer p.eyJ1IjogImE5Yzk1YTA4LTkwZmQtNDRiMi05NDFkLWJlNWQwZTViODVkOCIsICJpZCI6ICJlNjZkZjA0Yi1hZmI1LTRlMDctOWE0ZC01NzNkMDc2NzZmZmQifQ.IZlFNiZSgJgNTGs7wJSIgly2PHNDvn6Hohg1oZUDOiI";
String data = "{\"timestamp\":\"2022-10-27T11:43:02.099Z\",\"transaction_id\":\"8d1e1533-6071-4b10-9cda-b8429c1c7a67\",\"name\":\"Bobby Drake\",\"email\":\"bobby.drake@pressure.io\",\"age\":42,\"passport_number\":3847665,\"flight_from\":\"Barcelona\",\"flight_to\":\"London\",\"extra_bags\":1,\"flight_class\":\"economy\",\"priority_boarding\":false,\"meal_choice\":\"vegetarian\",\"seat_number\":\"15D\",\"airline\":\"Red Balloon\"}";
HttpURLConnection con = (HttpURLConnection) new URL(url).openConnection();
con.setRequestMethod("POST");
con.setRequestProperty("Authorization", auth);
con.setRequestProperty("Content-Type", "application/json");
con.setDoOutput(true);
con.getOutputStream().write(data.getBytes(StandardCharsets.UTF_8));
System.out.println(new String(con.getInputStream().readAllBytes(), StandardCharsets.UTF_8));
con.disconnect();
}
}
For more info on building real-time analytics on top of application data using the Events API, check out these resources:
- Docs - The Tinybird Events API
- Guide - Ingest data into Tinybird with an HTTP request
- Screencast - Stream data with the Tinybird Events API
Query and shape real-time data with SQL
Tinybird offers a delightful interface for building real-time analytics metrics using the SQL you know and love.
With Tinybird Pipes you can chop up more complex queries into chained, composable nodes of SQL. This simplifies development flow and makes it easy to identify queries that impede performance or increase latency.
Tinybird Pipes also include a robust templating language to extend your query logic beyond SQL and publish dynamic, parameterized endpoints from your queries.
Below are some example code snippets of SQL queries written in Tinybird Pipes for simple real-time analytics use cases.
Web Analytics
SELECT
session_id,
location,
device,
browser,
min(timestamp) AS session_start,
max(timestamp) AS session_end,
domain(argMin(referrer, timestamp)) AS session_referrer,
argMin(href, timestamp) AS entry_page,
argMax(href, timestamp) AS exit_page,
count(path(pathname)) AS total_pageviews,
uniq(path(pathname)) AS unique_pageviews,
groupArray(path(pathname)) AS urls,
groupArray(action) AS actions,
count(action) > 1 ? 0 : 1 AS bounce
FROM analytics_events
WHERE session_start BETWEEN {{Date(start_date)}} AND {{Date(end_date)}}
GROUP BY session_id, location, device, browser
ORDER BY session_start desc
Software Log Analytics
SELECT
function_name,
count(total) AS total
FROM errors_per_function_per_hour
WHERE 1
{% if defined(date_from) and defined(date_to) %}
AND hour BETWEEN {{Date(start_date)}} AND {{Date(end_date)}}
{% else %}
{{ error("Must supply date_from & date_to params") }}
{% end %}
GROUP BY function_name
Usage-based pricing
SELECT
sum(usage) AS total_usage,
sum(usage) * usage_price AS total_bill
FROM usage_events
JOIN pricing USING(event_type_id)
WHERE user = {{String(uuid)}}
AND toStartOfMonth(timestamp) = {{Date(billing_month)}}
Personalized eCommerce Site
WITH (
SELECT
product_id,
count(views) AS views
FROM sessions
WHERE visitor_id = {{String(visitor_uuid)}}
) AS product_views
SELECT
offer_expiration,
offer_code
FROM offers
INNER JOIN product_views USING(product_id)
ORDER by offer_expiration_date DESC
LIMIT 1
For more info on building real-time analytics metrics with Tinybird Pipes, check out these resources:
- Docs - What are Tinybird Pipes?
- Guide - Best practices for faster SQL queries
- Screencast - Create a Tinybird Pipe
- Blog - To the limits of SQL… and beyond
Publish real-time analytics APIs
Tinybird shines in its publication layer. Whereas other real-time analytics platforms or technologies may still demand that you build a custom backend to support user-facing applications, Tinybird massively simplifies application development with instant REST API publication from SQL queries.
Every API published from a Tinybird Pipe includes automatically generated, OpenAPI-compatible documentation, security through auth token management, and built-in observability dashboards and APIs to monitor endpoint performance and usage.
Furthermore, Tinybird APIs can be parameterized using a simple templating language. By utilizing the templating language in your SQL queries, you can build robust logic for dynamic API endpoints.
To learn more about how to build real-time analytics APIs with Tinybird, check out these resources:
- Docs - Create APIs in Tinybird
- Screencast - Publish an API from an SQL Pipe
- Guide - Add advanced features to your Tinybird APIs
- Blog - Publish SQL-based API endpoints on NGINX log analytics
Ready to experience the industry-leading real-time analytics platform? Try Tinybird today, for free. Get started with the Build Plan - which is more than enough for most simple projects and has no time limit - and upgrade as you scale.
Posted on March 22, 2023
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.