Kinyungu Denis
Posted on September 20, 2022
In Data Engineering we have large sets of data that will be queried to obtain meaningful results. SQL is heavily used and it will be a crucial skill for a one to write and execute complex queries.
We have various non-relational databases such as MySQL,SQL Server, PostgreSQL, Oracle Database and many others. The good thing is they all use SQL query language for their queries, so they do not differ too much. In this post I will use PostgreSQL to write queries. PostgreSQL is an advanced, enterprise-class, and open-source relational database system. PostgreSQL supports both SQL (relational) and JSON (non-relational) querying.
Fundamentals in PostgreSQL
- Select, Column Aliases, Order By, Select Distinct, Where, Limit, Fetch, In, Between, Like, Is Null, Table Aliases.
- Joins, Inner Join, Left Join, Self-Join, Full Outer Join, Cross Join, Natural Join
- Group By, Union, Intersect, Having, Grouping Sets, Cube, Rollup, Subquery, Any, All, Exists
- Insert, Insert Multiple Rows, Update, Update Join, Delete, Delete Join, Upsert
The SELECT statement has the following clauses:
- Select distinct rows using DISTINCT operator.
- Sort rows using ORDER BY clause.
- Filter rows using WHERE clause.
- Select a subset of rows from a table using LIMIT or FETCH clause.
- Group rows into groups using GROUP BY clause.
- Filter groups using HAVING clause.
- Join with other tables using joins such as INNER JOIN, LEFT JOIN, FULL OUTER JOIN, CROSS JOIN clauses.
- Perform set operations using UNION, INTERSECT, and EXCEPT.
SELECT
select_list
FROM
table_name;
SELECT first_name, last_name, goods_bought FROM customer;
To select data from all columns of the customer table:
SELECT * FROM consumer_reports;
However, it is not a good practice to use the asterisk (*) in the SELECT statement when you embed SQL statements in the application code. It is a good practice to explicitly specify the column names in the SELECT clause whenever possible to get only necessary data from the database.
A column alias allows you to assign a column or an expression in the select list of a SELECT statement a temporary name. The column alias exists temporarily during the execution of the query.
SELECT column_name AS alias_name
FROM table_name;
SELECT
first_name || ' ' || last_name AS full_name
FROM
consumer_reports;
The ORDER BY clause allows you to sort rows returned by a SELECT clause in ascending or descending order based on a sort expression.
SELECT
select_list
FROM
table_name
ORDER BY
sort_expression1 [ASC | DESC],
...
...
sort_expressionN [ASC | DESC];
SELECT
first_name,
last_name
FROM
consumer_reports
ORDER BY
first_name DESC;
The DISTINCT clause is used in the SELECT statement to remove duplicate rows from a result set. The DISTINCT clause keeps one row for each group of duplicates. The DISTINCT clause can be applied to one or more columns in the select list of the SELECT statement.
SELECT
DISTINCT column1, column2, column3, column4
FROM
table_name;
SELECT
DISTINCT shape,
color
FROM
records
ORDER BY
color;
The SELECT statement returns all rows from one or more columns in a table. To select rows that satisfy a specified condition, you use a WHERE clause. WHERE clause filters rows returned by a SELECT statement.
SELECT select_list
FROM table_name
WHERE condition
ORDER BY sort_expression
To form the condition in the WHERE clause, you use comparison and logical operators:
AND -- Logical operator AND
OR -- Logical operator OR
IN -- Return true if a value matches any value in a list
BETWEEN -- Return true if a value is between a range
of values
LIKE -- Return true if a value matches a pattern
IS NULL -- Return true if a value is NULL
NOT -- Negate the result of other operators
SELECT
last_name,
first_name
FROM
consumer_records
WHERE
first_name = 'Brian' AND
last_name = 'Kamau';
SELECT
first_name,
last_name
FROM
customer
WHERE
first_name IN ('Brian','Kelvin','Martin');
PostgreSQL LIMIT is an optional clause of the SELECT statement that constrains the number of rows returned by the query.
SELECT select_list
FROM table_name
ORDER BY sort_expression
LIMIT row_count
In case you want to skip a number of rows before returning the row_count rows, you use OFFSET clause placed after the LIMIT clause as the following statement:
SELECT select_list
FROM table_name
LIMIT row_count OFFSET row_to_skip;
This query will display the first 20 rows from our film table, which will be ordered in descending order.
SELECT
film_id,
title,
release_year
FROM
film
ORDER BY
film_id DESC
LIMIT 20;
This query will skip 15 rows then proceed to display the next 20 rows only.
SELECT
film_id,
title,
release_year
FROM
film
ORDER BY
film_id
LIMIT 20 OFFSET 15;
To constrain the number of rows returned by a query, you often use the LIMIT clause. However, the LIMIT clause is not a SQL-standard. To conform with the SQL standard, PostgreSQL supports the FETCH clause to retrieve a number of rows returned by a query.
Syntax of the PostgreSQL FETCH clause:
OFFSET start { ROW | ROWS }
FETCH { FIRST | NEXT } [ row_count ] { ROW | ROWS } ONLY
In this syntax:
ROW is the synonym for ROWS, FIRST is the synonym for NEXT, you can use them interchangeably.
The start is an integer that must be zero or positive.
The row_count is 1 or greater.
This query will skip the first 20 rows then proceed to display the next 20 rows.
SELECT
film_id,
title
FROM
film
ORDER BY
title
OFFSET 20 ROWS
FETCH FIRST 20 ROWS ONLY;
You use IN operator in the WHERE clause to check if a value matches any value in a list of values.
The syntax of the IN operator is as follows:
value IN (value1,value2,value3, value4, ..., valueN)
This query will return the first 15 rows that has customer id of 1 and 2.
SELECT customer_id,
rental_id,
return_date
FROM
rental
WHERE
customer_id IN (1, 2)
ORDER BY
return_date DESC
FETCH FIRST 15 ROWS ONLY;
You can combine the IN operator with the NOT operator to select rows whose values do not match the values in the list.
The following query finds all rentals with the customer id is not 1 or 2.
SELECT
customer_id,
rental_id,
return_date
FROM
rental
WHERE
customer_id NOT IN (1, 2)
FETCH NEXT 20 ROWS ONLY;
You use the BETWEEN operator to match a value against a range of values. The syntax of the BETWEEN operator:
value BETWEEN low AND high;
To check if a value is out of a range, you combine the NOT operator with the BETWEEN operator as follows:
value NOT BETWEEN low AND high;
Often used the BETWEEN operator in the WHERE clause.
This query will return the first 20 rows where the price is between 10 and 12.
SELECT
customer_id,
payment_id,
amount
FROM
payment
WHERE
amount BETWEEN 10 AND 12
FETCH FIRST 15 ROWS ONLY;
This query returns the first 15 rows that do not meet the condition in the where clause.
SELECT
customer_id,
payment_id,
amount
FROM
payment
WHERE
amount NOT BETWEEN 10 AND 12
FETCH FIRST 15 ROWS ONLY;
To construct a pattern by combining literal values with wildcard characters and use the LIKE or NOT LIKE operator to find the matches. PostgreSQL provides you with two wildcards:
- Percent sign ( %) matches any sequence of zero or more characters.
- Underscore sign ( _) matches any single character.
value LIKE pattern
PostgreSQL supports the ILIKE operator that works like the LIKE operator. In addition, the ILIKE operator matches value case-insensitively.
This query will return all the first name which have 'er' in between the name, then skips 5 rows then fetches the next 20 rows which are ordered according to first_name.
SELECT
first_name,
last_name
FROM
customer
WHERE
first_name LIKE '%er%'
ORDER BY
first_name
OFFSET 5 ROWS
FETCH FIRST 20 ROWS ONLY;
In database, NULL means missing information or not applicable. NULL is not a value, therefore, you cannot compare it with any other values like numbers or strings.
To check if a value is not NULL, you use the IS NOT NULL operator:
value IS NOT NULL
Table aliases temporarily assign tables new names during the execution of a query.
table_name AS alias_name;
Inner Join
In a relational database, data is typically distributed in more than one table. To select complete data, you often need to query data from multiple tables.Let us understand to combine data from multiple tables using the INNER JOIN clause.
Suppose that there are two tables car and manufacturer. The table car has a column model whose value matches with values in the make column of table manufacturer. To select data from both tables, you use the INNER JOIN clause in the SELECT statement as follows:
SELECT
model,
year,
make,
origin
FROM
Car
INNER JOIN manufacturer ON model = make;
To join table car with the table manufacturer, you follow these steps:
- First, specify columns from both tables that you want to select data in the SELECT clause.
- Second, specify the main table for example table car in the FROM clause.
- Third, specify the second table (table manufacturer) in the INNER JOIN clause and provide a join condition after the ON keyword.
How the INNER JOIN works.
For each row in the table car, inner join compares the value in the model column with the value in the make column of every row in the table manufacturer:
If these values are equal, the inner join creates a new row that contains all columns of both tables and adds it to the result set.
In case these values are not equal, the inner join just ignores them and moves to the next row.
This query returns the customer with id of 3 and the amount and date which it is paid.
SELECT
c.customer_id,
first_name,
last_name,
amount,
payment_date
FROM
customer c
INNER JOIN payment p
ON p.customer_id = c.customer_id
WHERE
c.customer_id = 3;
Left Join
There are two tables, car and manufacturer table. Each row in the table car may have zero or many corresponding rows in the table manufacturer while each row in the table manufacturer has one and only one corresponding row in the table car .
To select data from the table car that may or may not have corresponding rows in the table manufacturer, you use the LEFT JOIN clause.
SELECT
model,
year,
make,
origin
FROM
Car
Left JOIN manufacturer ON model = make;
To join the table car with the manufacturer table using a left join:
- First, specify the columns in both tables from which you want to select data in the SELECT clause.
- Second, specify the left table (table car) in the FROM clause.
- Third, specify the right table (table manufacturer) in the LEFT JOIN clause and the join condition after the ON keyword.
- The LEFT JOIN clause starts selecting data from the left table. For each row in the left table, it compares the value in the model column with the value of each row in the make column in the right table.
If these values are equal, the left join clause creates a new row that contains columns that appear in the SELECT clause and adds this row to the result set.
In case these values are not equal, the left join clause also creates a new row that contains columns that appear in the SELECT clause. In addition, it fills the columns that come from the right table with NULL.
this query will return the first 25 rows on the left join clause to join the film table with inventory table.
SELECT
f.film_id,
title,
inventory_id
FROM
film f
LEFT JOIN inventory i
ON i.film_id = f.film_id
WHERE i.film_id IS NULL
ORDER BY title
FETCH FIRST 25 ROWS ONLY;
Self-Join
A self-join is a regular join that joins a table to itself, a self-join query hierarchical data or to compare rows within the same table.
SELECT select_list
FROM table_name t1
INNER JOIN table_name t2 ON join_predicate;
The above table is joined to itself using the INNER JOIN clause.
SELECT select_list
FROM table_name t1
LEFT JOIN table_name t2 ON join_predicate;
The above table is joined to itself using the LEFT JOIN clause.
This query finds all pair of films which are not duplicates that have the same length. It will skip the first 15 rows then return the next first 25 rows that have films of the same length.
SELECT DISTINCT
f1.title,
f2.title,
f1.length
FROM
film f1
INNER JOIN film f2
ON f1.film_id <> f2.film_id AND
f1.length = f2.length
OFFSET 15 ROWS
FETCH FIRST 25 ROWS ONLY;
Full Outer Join
Syntax of the full outer join:
SELECT * FROM car
FULL [OUTER] JOIN manufacturer on car.id = manufacturer.id;
If the rows in the joined table do not match, the full outer join sets NULL values for every column of the table that does not have the matching row.
If a row from one table matches a row in another table, the result row will contain columns populated from columns of rows from both tables.
Cross Join
CROSS JOIN clause allows you to produce a Cartesian Product of rows in two or more tables.
Cross Join Syntax:
SELECT select_list
FROM table1
CROSS JOIN table2;
This statement is similar to:
SELECT select_list
FROM T1, T2;
Natural Join
It is a join that creates an implicit join based on the same column names in the joined tables.
A natural join can be an inner join, left join, or right join. If you do not specify a join explicitly, PostgreSQL will use the INNER JOIN by default.
Natural Join syntax:
SELECT select_list
FROM table1
NATURAL [INNER, LEFT, RIGHT] JOIN table2;
The convenience of the NATURAL JOIN is that it does not require you to specify the join clause because it uses an implicit join clause based on the common column.
However, avoid using the NATURAL JOIN whenever possible because sometimes it may cause an unexpected result.
Group By
GROUP BY clause divides the rows returned from the SELECT statement into groups. For each group, you can apply an aggregate function for example SUM() to calculate the sum of items or COUNT() to get the number of items in the groups.
Basic Syntax of Group By:
SELECT
column_1,
column_2,
...,
aggregate_function(column_n)
FROM
table_name
GROUP BY
column_1,
column_2,
...
column_n;
This query will return 25 rows from the payment table grouped by customer id, ordered in descending of the total amount
SELECT
customer_id,
SUM (amount)
FROM
payment
GROUP BY
customer_id
ORDER BY
SUM (amount) DESC
FETCH FIRST 25 ROWS ONLY;
We can use multiple columns with Group By. In this query Group By clause divides the rows in the payment table by the values in the customer_id and staff_id columns. SUM() calculates the total amount. Then ordered by the customer_id in ascending order. It fetches the first 30 rows from our table payment.
SELECT
customer_id,
staff_id,
SUM(amount)
FROM
payment
GROUP BY
staff_id,
customer_id
ORDER BY
customer_id
FETCH FIRST 30 ROWS ONLY;
Having
HAVING clause specifies a search condition for a group or an aggregate. The HAVING clause is often used with the GROUP BY clause to filter groups or aggregates based on a specified condition.
SELECT
column_1,
column_2,
...
aggregate_function (column_n)
FROM
table_name
GROUP BY
column_1
HAVING
condition;
PostgreSQL evaluates the HAVING clause after the FROM, WHERE, GROUP BY, and before the SELECT, DISTINCT, ORDER BY and LIMIT clauses.
Since the HAVING clause is evaluated before the SELECT clause, you cannot use column aliases in the HAVING clause. Because at the time of evaluating the HAVING clause, the column aliases specified in the SELECT clause are not available.
The WHERE clause allows you to filter rows based on a specified condition. However, the HAVING clause allows you to filter groups of rows according to a specified condition.
The WHERE clause is applied to rows while the HAVING clause is applied to groups of rows.
SELECT
customer_id,
SUM (amount)
FROM
payment
GROUP BY
customer_id
HAVING
SUM (amount) > 150;
Union Operator
UNION operator combines result sets of two or more SELECT statements into a single result set.
SELECT select_list_1
FROM table_1
UNION
SELECT select_list_2
FROM table_2
To combine the result sets of two queries using the UNION operator, ensure that:
The number and the order of the columns in the select list of both queries must be the same. The data types must be compatible.
The UNION operator removes all duplicate rows from the combined data set. Use the the UNION ALL to retain duplicate rows.
Intersect Operator
PostgreSQL INTERSECT operator combines result sets of two or more SELECT statements into a single result set.
The INTERSECT operator returns any rows available in both result sets.
SELECT select_list
FROM table_1
INTERSECT
SELECT select_list
FROM table_2;
The number of columns and their order in the SELECT clauses must be the same. The data types of the columns must be compatible. When using the Intersect operator.
Except
EXCEPT operator returns rows by comparing the result sets of two or more queries. It returns distinct rows from the first (left) query that are not in the output of the second (right) query.
SELECT select_list
FROM table_1
EXCEPT
SELECT select_list
FROM table_2;
RollUp
PostgreSQL ROLLUP is a subclause of the GROUP BY clause that offers a shorthand for defining multiple grouping sets. A grouping set is a set of columns by which you group.
ROLLUP assumes a hierarchy among the input columns and generates all grouping sets that make sense considering the hierarchy. ROLLUP is often used to generate the subtotals and the grand total for reports.
This query finds the number of rental per day, month, and year by using the ROLLUP. It will skip 15 rows, then fetch the first 25 rows that follows.
SELECT
EXTRACT (YEAR FROM rental_date) y,
EXTRACT (MONTH FROM rental_date) M,
EXTRACT (DAY FROM rental_date) d,
COUNT (rental_id)
FROM
rental
GROUP BY
ROLLUP (
EXTRACT (YEAR FROM rental_date),
EXTRACT (MONTH FROM rental_date),
EXTRACT (DAY FROM rental_date)
)
OFFSET 15
FETCH FIRST 25 ROWS ONLY;
Cube
PostgreSQL CUBE is a sub clause of the GROUP BY clause. The CUBE allows you to generate multiple grouping sets. A grouping set is a set of columns to which you want to group.
This query generates all possible grouping sets based on the dimension columns specified in CUBE.
SELECT
c1, c2, c3,
aggregate (c4)
FROM
table_name
GROUP BY
CUBE (c1, c2, c3);
Subquery
A subquery is a query nested inside another query such as SELECT, INSERT, DELETE and UPDATE.
The query inside the brackets is called a subquery, the query that contains the subquery is known as an outer query.
PostgreSQL executes the query that contains a subquery in the following sequence:
First, executes the subquery then gets the result and passes it to the outer query. Lastly executes the outer query.
SELECT
film_id,
title,
rental_rate
FROM
film
WHERE
rental_rate > (
SELECT
AVG (rental_rate)
FROM
film
)
FETCH FIRST 30 ROWS ONLY;
The following query gets films that have the returned date between 2005-05-29 and 2005-05-30. Then 30 rows are skipped and the first 30 rows that follows are returned.
SELECT
film_id,
title
FROM
film f
WHERE
film_id IN (
SELECT
i.film_id
FROM
rental r
INNER JOIN inventory i ON i.inventory_id =
r.inventory_id
WHERE
return_date BETWEEN '2005-05-29'
AND '2005-05-30'
)
OFFSET 30 ROWS FETCH FIRST 30 ROWS ONLY;
All Operator
ALL operator allows you to query data by comparing a value with a list of values returned by a subquery.
comparison_operator ALL (subquery)
The ALL operator must be preceded by a comparison operator such as equal (=), not equal (!=), greater than (>), greater than or equal to (>=), less than (<), and less than or equal to (<=). Followed by a subquery which also must be surrounded by the parentheses.
- column_name > ALL (subquery) the expression evaluates to true if a value is greater than the biggest value returned by the subquery.
- column_name >= ALL (subquery) the expression evaluates to true if a value is greater than or equal to the biggest value returned by the subquery.
- column_name < ALL (subquery) the expression evaluates to true if a value is less than the smallest value returned by the subquery.
- column_name <= ALL (subquery) the expression evaluates to true if a value is less than or equal to the smallest value returned by the subquery.
- column_name = ALL (subquery) the expression evaluates to true if a value is equal to any value returned by the subquery.
- column_name != ALL (subquery) the expression evaluates to true if a value is not equal to any value returned by the subquery.
SELECT
film_id,
title,
length
FROM
film
WHERE
length > ALL (
SELECT
ROUND(AVG (length),2)
FROM
film
GROUP BY
rating
)
ORDER BY
length
FETCH FIRST 25 ROWS ONLY;
Exists Operator
The EXISTS operator is a boolean operator that tests for existence of rows in a subquery.It accepts an argument which is a subquery.
EXISTS (subquery)
SELECT
column1, column2
FROM
table_1
WHERE
EXISTS( SELECT
1
FROM
table_2
WHERE
column_2 = table_1.column_1);
This query statement returns customers who have paid at least one rental with an amount greater than 15
SELECT first_name,
last_name
FROM customer c
WHERE EXISTS
(SELECT 1
FROM payment p
WHERE p.customer_id = c.customer_id
AND amount > 15 )
ORDER BY first_name,
last_name;
Insert
INSERT statement allows you to insert a new row into a table.
INSERT INTO table_name(column1, column2, value3, ...)
VALUES (value1, value2, value3, ...);
For example this will insert values to a table called links, for the column url, name, last_modified columns.
INSERT INTO links (url, name, last_modified)
VALUES('https://www.dev.to','DEV','2022-09-20');
Insert Multiple Rows
INSERT INTO table_name (column_list)
VALUES
(value_list_1),
(value_list_2),
(value_list_3),
...
(value_list_n);
INSERT INTO
links (url, name, date_modified)
VALUES
('https://www.tradingview.com', 'tradingview', '2022-09-
15'),
('https://www.codenewbie.com','codenewbie', '2022-09-
18'),
('https://www.forem.com','Forem', '2022-09-20'),
('https://www.bitbucket.com', 'Bitbucket', '2022-09-20');
Update
UPDATE statement allows you to modify data in a table
UPDATE table_name
SET column1 = value1,
column2 = value2,
column3 = value3
...
WHERE condition;
UPDATE subjects
SET published_date = '2022-08-15'
WHERE subject_id = 231;
Returns the following message after one row has been updated:
UPDATE 1
Delete
DELETE statement allows you to delete one or more rows from a table.
DELETE FROM table_name
WHERE condition;
This will delete the row where the id is 7.
DELETE FROM links
WHERE id = 7;
This query deletes all the rows in our table since we did not specify a where clause.
DELETE FROM links;
The subquery returns a list of phones from the blacklist table and the DELETE statement deletes the contacts whose phones match with the phones returned by the subquery.To delete all contacts whose phones are in the blacklist table.
DELETE FROM contacts
WHERE phone IN (SELECT phone FROM blacklist);
Upsert
Referred to as merge, when you insert a new row into the table, PostgreSQL will update the row if it already exists, otherwise, it will insert the new row.
INSERT INTO table_name(column_list)
VALUES(value_list)
ON CONFLICT target action;
INSERT INTO customers (name, email)
VALUES('tradingview','hotline@tradingview')
ON CONFLICT (name)
DO NOTHING;
INSERT INTO customers (name, email)
VALUES('tradingview','hotline@tradingview')
ON CONFLICT (name)
DO
UPDATE SET email = EXCLUDED.email || ';' || customers.email;
Common Table Expressions (CTE)
It is a temporary result set which you can reference within another SQL statement including SELECT, INSERT, UPDATE or DELETE. They only exist during the execution of the query and used to simplify complex joins and subqueries in PostgreSQL.
WITH cte_name (column_list) AS (
CTE_query_definition
)
statement;
Advantages of using CTEs:
- Improve the readability of complex queries.
- Ability to create recursive queries, queries that reference themselves.
- Use CTEs in conjunction with window functions to create an initial result set and use another select statement to further process this result set.
This query, the CTE returns a result set that includes staff id and the number of rentals. Then, join the staff table with the CTE using the staff_id column.
WITH cte_rental AS (
SELECT staff_id,
COUNT(rental_id) rental_count
FROM rental
GROUP BY staff_id
)
SELECT s.staff_id,
first_name,
last_name,
rental_count
FROM staff s
INNER JOIN cte_rental USING (staff_id);
Recursive Query
A recursive CTE has three elements:
- Non-recursive term: a CTE query definition that forms the base result set of the CTE structure.
- Recursive term: one or more CTE query definitions joined with the non-recursive term using the UNION or UNION ALL operator.
- Termination check: the recursion stops when no rows are returned from the previous iteration.
Sequence that PostgreSQL executes a recursive CTE:
- Execute the non-recursive term to create the base result set
- Execute recursive term with Ri as an input to return the result set Ri+1 as the output.
- Repeat step 2 until an empty set is returned (termination check)
- Return the final result set that is a UNION or UNION ALL of the result set
WITH RECURSIVE cte_name AS(
CTE_query_definition -- non-recursive term
UNION [ALL]
CTE_query definion -- recursive term
) SELECT * FROM cte_name;
I trust you have understood the content that we have covered so far. Lets gear on and continue learning.
Managing Tables
PostgreSQL Data Types, Create Table, Select Into, Create Table As, Serial, Sequences, Identity Column, Alter Table, Rename Table, Add Column, Drop Column, Change Column’s Data Type, Rename Column, Drop Table, Temporary Table, Truncate Table
Transaction
A database transaction is a single unit of work that consists of one or more operations. A PostgreSQL transaction is atomic, consistent, isolated, and durable. These properties are often referred to as ACID:
- Atomicity guarantees that the transaction completes in an all-or-nothing manner.
- Consistency ensures the change to data written to the database must be valid and follow predefined rules.
- Isolation determines how transaction integrity is visible to other transactions.
- Durability makes sure that transactions that have been committed will be stored in the database permanently.
-- start a transaction
BEGIN;
-- insert a new row into the accounts table
INSERT INTO accounts(name,balance)
VALUES('Alice',10000);
-- commit the change (or roll it back later)
COMMIT;
-- start a transaction
BEGIN;
-- deduct 1000 from account 1
UPDATE accounts
SET balance = balance - 1000
WHERE id = 1;
-- add 1000 to account 2
UPDATE accounts
SET balance = balance + 1000
WHERE id = 2;
-- select the data from accounts
SELECT id, name, balance
FROM accounts;
-- commit the transaction
COMMIT;
-- begin the transaction
BEGIN;
-- deduct the amount from the account 1
UPDATE accounts
SET balance = balance - 1500
WHERE id = 1;
-- add the amount from the account 3 (instead of 2)
UPDATE accounts
SET balance = balance + 1500
WHERE id = 3;
-- roll back the transaction
ROLLBACK;
The easiest way to export data of a table to a CSV file is to use COPY statement.
COPY persons TO '/home/exporter/persons_db.csv' DELIMITER ',' CSV HEADER;
A relational database consists of multiple related tables. A table consists of rows and columns. Tables allow you to store structured data.
CREATE TABLE [IF NOT EXISTS] table_name (
column1 datatype(length) column_contraint,
column2 datatype(length) column_contraint,
column3 datatype(length) column_contraint,
table_constraints
);
Column Constraints
- NOT NULL – ensures that values in a column cannot be NULL.
- UNIQUE – ensures the values in a column unique across the rows within the same table.
- PRIMARY KEY – a primary key column uniquely identify rows in a table. A table can have one and only one primary key.
- FOREIGN KEY – ensures values in a column or a group of columns from a table exists in a column or group of columns in another table. Unlike the primary key, a table can have many foreign keys.
- CHECK – a CHECK constraint ensures the data must satisfy a boolean expression.
CREATE TABLE accounts (
user_id serial PRIMARY KEY,
username VARCHAR ( 50 ) UNIQUE NOT NULL,
password VARCHAR ( 50 ) NOT NULL,
email VARCHAR ( 255 ) UNIQUE NOT NULL,
created_on TIMESTAMP NOT NULL,
last_login TIMESTAMP
);
CREATE TABLE roles(
role_id serial PRIMARY KEY,
role_name VARCHAR (255) UNIQUE NOT NULL
);
CREATE TABLE account_roles (
user_id INT NOT NULL,
role_id INT NOT NULL,
grant_date TIMESTAMP,
PRIMARY KEY (user_id, role_id),
FOREIGN KEY (role_id)
REFERENCES roles (role_id),
FOREIGN KEY (user_id)
REFERENCES accounts (user_id)
);
SELECT
film_id,
title,
length
INTO TEMP TABLE short_film
FROM
film
WHERE
length < 60
ORDER BY
title;
SELECT * FROM short_film;
CREATE TABLE action_film AS
SELECT
film_id,
title,
release_year,
length,
rating
FROM
film
INNER JOIN film_category USING (film_id)
WHERE
category_id = 1;
SELECT * FROM action_film
ORDER BY title;
Serial
CREATE TABLE table_name(
id SERIAL
);
CREATE TABLE fruits(
id SERIAL PRIMARY KEY,
name VARCHAR NOT NULL
);
INSERT INTO fruits(name)
VALUES('Orange');
Sequence
CREATE SEQUENCE [ IF NOT EXISTS ] sequence_name
[ AS { SMALLINT | INT | BIGINT } ]
[ INCREMENT [ BY ] increment ]
[ MINVALUE minvalue | NO MINVALUE ]
[ MAXVALUE maxvalue | NO MAXVALUE ]
[ START [ WITH ] start ]
[ CACHE cache ]
[ [ NO ] CYCLE ]
[ OWNED BY { table_name.column_name | NONE } ]
Create a table
CREATE TABLE order_details(
order_id SERIAL,
item_id INT NOT NULL,
item_text VARCHAR NOT NULL,
price DEC(10,2) NOT NULL,
PRIMARY KEY(order_id, item_id)
);
CREATE SEQUENCE order_item_id
START 10
INCREMENT 10
MINVALUE 10
OWNED BY order_details.item_id;
INSERT INTO
order_details(order_id, item_id, item_text, price)
VALUES
(100, nextval('order_item_id'),'DVD Player',100),
(100, nextval('order_item_id'),'Android TV',550),
(100, nextval('order_item_id'),'Speaker',250);
SELECT
order_id,
item_id,
item_text,
price
FROM
order_details;
List all sequences in the current database
SELECT
relname sequence_name
FROM
pg_class
WHERE
relkind = 'S';
First, specify the name of the sequence which you want to drop and use the CASCADE option if you want to recursively drops objects that depend on the sequence, and objects that depend on the dependent objects.
DROP SEQUENCE [ IF EXISTS ] sequence_name [, ...]
[ CASCADE | RESTRICT ];
DROP TABLE order_details;
PostgreSQL identity column
PostgreSQL version 10 introduced a new constraint GENERATED AS IDENTITY that allows you to automatically assign a unique number to a column. The GENERATED AS IDENTITY constraint is the SQL standard-conforming variant of the good old SERIAL column.
column_name type GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY[ ( sequence_option ) ]
create a table named color with the color_id as the identity column:
CREATE TABLE color (
color_id INT GENERATED ALWAYS AS IDENTITY,
color_name VARCHAR NOT NULL
);
Insert new rows into the color table:
INSERT INTO color(color_name)
VALUES ('Green')
VALUES ('Blue');
Alter Table
To change the structure of an existing table, you use PostgreSQL ALTER TABLE statement.
ALTER TABLE table_name action;
Add a column, Drop a column, Change the data type of a column, Rename a column, Set a default value for the column, Add a constraint to a column, Rename a table.
CREATE TABLE links (
link_id serial PRIMARY KEY,
title VARCHAR (512) NOT NULL,
url VARCHAR (1024) NOT NULL
);
To add a new column named active, you use the following statement
ALTER TABLE links
ADD COLUMN active boolean;
To remove the active column from the links table.
ALTER TABLE links
DROP COLUMN active;
To add a new column named target to the links table
ALTER TABLE links
ADD COLUMN target VARCHAR(10);
To change the name of the links table to short_urls:
ALTER TABLE links
RENAME TO short_urls;
ALTER TABLE table_name
DROP COLUMN column_name;
To drop column that other objects depend on.
ALTER TABLE table_name
DROP COLUMN column_name CASCADE;
Drop Table
To drop a table from the database.
DROP TABLE [IF EXISTS] table_name
[CASCADE | RESTRICT];
The CASCADE option allows you to remove the table and its dependent objects. The RESTRICT option rejects the removal if there is any object depends on the table. The RESTRICT option is the default if you don’t explicitly specify it in the DROP TABLE statement.
Truncate Table
The TRUNCATE TABLE statement deletes all data from a table without scanning it. It is faster than the DELETE statement, TRUNCATE TABLE statement reclaims the storage right away so you do not have to perform a subsequent VACUMM operation, which is useful in the case of large tables.
This query removes all data and resets the identity column value.
TRUNCATE TABLE table_name
RESTART IDENTITY;
To remove data from a table and other tables that have foreign key reference the table, you use CASCADE option in the TRUNCATE TABLE statement
TRUNCATE TABLE table_name
CASCADE;
The TRUNCATE TABLE is transaction-safe. It means that if you place it within a transaction, you can roll it back safely.
Database Constraints
Primary Key, Foreign Key, Check Constraint, Unique Constraint
Not Null Constraint
Primary Key
A primary key is a column or a group of columns used to identify a row uniquely in a table. A table can have one and only one primary key. It is a good practice to add a primary key to every table. PostgreSQL creates a unique B-tree index on the column or a group of columns used to define the primary key.
CREATE TABLE po_headers (
po_no SERIAL PRIMARY KEY,
vendor_no INTEGER,
description TEXT,
shipping_address TEXT
);
Removing Primary Key
ALTER TABLE table_name DROP CONSTRAINT primary_key_constraint;
To remove the primary key from the table
ALTER TABLE products
DROP CONSTRAINT products_pkey;
Foreign Key
A foreign key is a column or a group of columns in a table that reference the primary key of another table.
In PostgreSQL, you define a foreign key using the foreign key constraint. The foreign key constraint helps maintain the referential integrity of data between the child and parent tables.
A foreign key constraint indicates that values in a column or a group of columns in the child table equal the values in a column or a group of columns of the parent table.
syntax
[CONSTRAINT fk_name]
FOREIGN KEY(fk_columns)
REFERENCES parent_table(parent_key_columns)
[ON DELETE delete_action]
[ON UPDATE update_action]
actions: SET NULL, SET DEFAULT, RESTRICT, NO ACTION, CASCADE
CREATE TABLE customers(
customer_id INT GENERATED ALWAYS AS IDENTITY,
customer_name VARCHAR(255) NOT NULL,
PRIMARY KEY(customer_id)
);
CREATE TABLE contacts(
contact_id INT GENERATED ALWAYS AS IDENTITY,
customer_id INT,
contact_name VARCHAR(255) NOT NULL,
phone VARCHAR(15),
email VARCHAR(100),
PRIMARY KEY(contact_id),
CONSTRAINT fk_customer
FOREIGN KEY(customer_id)
REFERENCES customers(customer_id)
);
- ON DELETE CASCADE automatically deletes all the referencing rows in the child table when the referenced rows in the parent table are deleted.
- The SET NULL automatically sets NULL to the foreign key columns in the referencing rows of the child table when the referenced rows in the parent table are deleted.
- The RESTRICT action is similar to the NO ACTION. PostgreSQL issues a constraint violation because the referencing rows
- The ON DELETE SET DEFAULT sets the default value to the foreign key column of the referencing rows in the child table when the referenced rows from the parent table are deleted.
ALTER TABLE child_table
ADD CONSTRAINT constraint_fk
FOREIGN KEY (fk_columns)
REFERENCES parent_table(parent_key_columns)
ON DELETE CASCADE;
Check Constraint
A CHECK constraint is a kind of constraint that allows you to specify if values in a column must meet a specific requirement. It uses a Boolean expression to evaluate the values before they are inserted or updated to the column.
If the values pass the check, PostgreSQL will insert or update these values to the column. Otherwise, PostgreSQL will reject the changes and issue a constraint violation error.
CREATE TABLE employees (
id SERIAL PRIMARY KEY,
first_name VARCHAR (50),
last_name VARCHAR (50),
birth_date DATE CHECK (birth_date > '1900-01-01'),
joined_date DATE CHECK (joined_date > birth_date),
salary numeric CHECK(salary > 0)
);
To add CHECK constraints to existing tables, you use the ALTER TABLE statement.
CREATE TABLE prices_list (
id serial PRIMARY KEY,
product_id INT NOT NULL,
price NUMERIC NOT NULL,
discount NUMERIC NOT NULL,
valid_from DATE NOT NULL,
valid_to DATE NOT NULL
);
ALTER TABLE prices_list
ADD CONSTRAINT price_discount_check
CHECK (
price > 0
AND discount >= 0
AND price > discount
);
Unique Constraint
To ensure that values stored in a column or a group of columns are unique across the whole table such as email addresses or usernames.
PostgreSQL provides you with the UNIQUE constraint that maintains the uniqueness of the data correctly.
CREATE TABLE person (
id SERIAL PRIMARY KEY,
first_name VARCHAR (50),
last_name VARCHAR (50),
email VARCHAR (50) UNIQUE
);
Not-Null Constraint
In databases, NULL represents unknown or information missing. NULL is not the same as an empty string or the number zero.
CREATE TABLE invoices(
id SERIAL PRIMARY KEY,
product_id INT NOT NULL,
qty numeric NOT NULL CHECK(qty > 0),
net_price numeric CHECK(net_price > 0)
);
- Use the NOT NULL constraint for a column to enforce a column not accept NULL. By default, a column can hold NULL.
- To check if a value is NULL or not, you use the IS NULL operator. The IS NOT NULL negates the result of the IS NULL.
- Never use equal operator = to compare a value with NULL because it always returns NULL.
PostgreSQL Data Types
Boolean, Char, VarChar, and Text, Numeric, Integer, Serial, Date, Timestamp, Interval, Time, Uuid, Json, Hstore, Array, User-defined Data Types
Boolean
PostgreSQL supports a single Boolean data type: BOOLEAN that can have three values: true, false and NULL.
PostgreSQL uses one byte for storing a boolean value in the database. The BOOLEAN can be abbreviated as BOOL.
CREATE TABLE stock_availability (
product_id INT PRIMARY KEY,
available BOOLEAN NOT NULL
);
VarChar, Var, Text
PostgreSQL provides three primary character types: CHARACTER(n) or CHAR(n), CHARACTER VARYINGING(n) or VARCHAR(n), and TEXT, where n is a positive integer.
Advantage of specifying the length specifier for the VARCHAR data type is that PostgreSQL will issue an error if you attempt to insert a string that has more than n characters into the VARCHAR(n) column.
- PostgreSQL supports CHAR, VARCHAR, and TEXT data types. The CHAR is fixed-length character type while the VARCHAR and TEXT are varying length character types.
- Use VARCHAR(n) if you want to validate the length of the string (n) before inserting into or updating to a column.
- VARCHAR (without the length specifier) and TEXT are equivalent.
CREATE TABLE chemical_compounds (
id serial PRIMARY KEY,
first CHAR (7),
second VARCHAR (19),
third TEXT
);
Numeric Type
The NUMERIC type can store numbers with a lot of digits. Typically, you use the NUMERIC type for numbers that require exactness such as monetary amounts or quantities.
NUMERIC(precision, scale)
The precision is the total number of digits and the scale is the number of digits in the fraction part. For example, the number 8765.351 has the precision 7 and scale 3.
If precision is not required, you should not use the NUMERIC type because calculations on NUMERIC values are typically slower than integers, floats, and double precision.
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
price NUMERIC(5,2)
);
Integer Data Types
To store the whole numbers in PostgreSQL, you use one of the following integer types: SMALLINT, INTEGER, and BIGINT.
SMALLINT type for storing something like ages of people, the number of pages of a book
INTEGER is the most common choice between integer types because it offers the best balance between storage size, range, and performance.
Using BIGINT type is not only consuming a lot of storage but also decreasing the performance of the database, therefore, you should have a good reason to use it.
CREATE TABLE cities (
city_id serial PRIMARY KEY,
city_name VARCHAR (255) NOT NULL,
population INT NOT NULL CHECK (population >= 0)
);
Date
To store date values, use the PostgreSQL DATE data type that uses 4 bytes to store a date value.
CREATE TABLE employees (
employee_id serial PRIMARY KEY,
first_name VARCHAR (255),
last_name VARCHAR (355),
birth_date DATE NOT NULL,
hire_date DATE NOT NULL
);
INSERT INTO employees (first_name, last_name, birth_date, hire_date)
VALUES ('Derrick','Kimani','1990-05-01','2015-06-01'),
('Florence','Wanjiru','1991-03-05','2013-04-01'),
('Richard','Chege','1992-09-01','2011-10-01');
To get the current date and time, use the built-in Now() function
SELECT NOW()::date;
SELECT CURRENT_DATE;
To get the year, quarter, month, week, day from a date value, you use the EXTRACT() function.
The following statement extracts the year, month, and day from the birth dates of employees:
SELECT
employee_id,
first_name,
last_name,
EXTRACT (YEAR FROM birth_date) AS YEAR,
EXTRACT (MONTH FROM birth_date) AS MONTH,
EXTRACT (DAY FROM birth_date) AS DAY
FROM
employees;
TimeStamp
The timestamp datatype allows you to store both date and time. However, it does not have any time zone data. It means that when you change the timezone of your database server, the timestamp value stored in the database will not change automatically.
The timestamptz datatype is the timestamp with the time zone. The timestamptz datatype is a time zone-aware date and time data type.
SELECT CURRENT_TIMESTAMP;
SELECT TIMEOFDAY();
Time
A time value may have a precision up to 6 digits. The precision specifies the number of fractional digits placed in the second field.
The TIME data type requires 8 bytes and its allowed range is from 00:00:00 to 24:00:00.
column_name TIME(precision);
CREATE TABLE shifts (
id serial PRIMARY KEY,
shift_name VARCHAR NOT NULL,
start_at TIME NOT NULL,
end_at TIME NOT NULL
);
SELECT LOCAL TIME;
SELECT CURRENT TIME;
To extracting hours, minutes, seconds from a time value, you use the EXTRACT function
SELECT
LOCALTIME,
EXTRACT (HOUR FROM LOCALTIME) as hour,
EXTRACT (MINUTE FROM LOCALTIME) as minute,
EXTRACT (SECOND FROM LOCALTIME) as second,
EXTRACT (milliseconds FROM LOCALTIME) as milliseconds;
UUID Data Type
UUID stands for Universal Unique Identifier defined by RFC 4122 and other related standards. A UUID value is 128-bit quantity generated by an algorithm that make it unique in the known universe using the same algorithm.
a UUID is a sequence of 32 digits of hexadecimal digits represented in groups separated by hyphens.
Because of its uniqueness feature, often found UUID in the distributed systems because it guarantees a better uniqueness than the SERIAL data type which generates only unique values within a single database. To store UUID values in the PostgreSQL database, you use the UUID data type.
To install the uuid-ossp module, you use the CREATE EXTENSION statement
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
If you want to generate a UUID value solely based on random numbers, use the uuid_generate_v4()
SELECT uuid_generate_v4();
Create a table whose primary key is UUID data type, the values of the primary key column will be generated automatically using the uuid_generate_v4() function.
CREATE TABLE contacts (
contact_id uuid DEFAULT uuid_generate_v4 (),
first_name VARCHAR NOT NULL,
last_name VARCHAR NOT NULL,
email VARCHAR NOT NULL,
phone VARCHAR,
PRIMARY KEY (contact_id)
);
INSERT INTO contacts (
first_name,
last_name,
email
)
VALUES
(
'Kamau',
'Kelvin',
'kamau.kelvin@example.com'
),
(
'Nafula',
'Wepkulu',
'nafula.wepkulu@example.com',
),
(
'Kasunda',
'Mutorini',
'kasunda.mutorini@example.com'
);
To query our database so that we view the uuid in the customer_id column.
SELECT
*
FROM
contacts;
Hstore data type
The hstore module implements the hstore data type for storing key-value pairs in a single value.
The hstore data type is very useful in many cases, such as semi-structured data or rows with many attributes that are rarely queried. Notice that keys and values are just text strings only.
To enable the hstore extension which loads the contrib module to your PostgreSQL instance.
CREATE EXTENSION hstore;
CREATE TABLE books (
id serial primary key,
title VARCHAR (255),
attr hstore
);
The data that we insert into the hstore column is a list of comma-separated key =>value pairs. Both keys and values are quoted using double quotes (“”).
PostgreSQL provides the hstore_to_json() function to convert hstore data to JSON.
SELECT
title,
hstore_to_json (attr) json
FROM
books;
JSON data type
JSON stands for JavaScript Object Notation. JSON is an open standard format that consists of key-value pairs, JSON is human-readable text.
The main usage of JSON is to transport data between a server and a web application.
The orders table consists of two columns:
The id column is the primary key column that identifies the order. The info column stores the data in the form of JSON.
CREATE TABLE orders (
id serial NOT NULL PRIMARY KEY,
info json NOT NULL
);
PostgreSQL provides two native operators -> and ->> to help you query JSON data.
The operator -> returns JSON object field by key.
The operator ->> returns JSON object field by text.
We can apply aggregate functions such as MIN, MAX, AVERAGE, SUM, etc., to JSON data. For example, the following statement returns minimum quantity, maximum quantity, average quantity and the total quantity of products sold.
SELECT
MIN (CAST (info -> 'items' ->> 'qty' AS INTEGER)),
MAX (CAST (info -> 'items' ->> 'qty' AS INTEGER)),
SUM (CAST (info -> 'items' ->> 'qty' AS INTEGER)),
AVG (CAST (info -> 'items' ->> 'qty' AS INTEGER))
FROM orders;
The json_each() function allows us to expand the outermost JSON object into a set of key-value pairs.
To get a set of keys in the outermost JSON object, you use the json_object_keys() function.
User-defined Data Types
Domain is a data type with optional constraints e.g., NOT NULL and CHECK. A domain has a unique name within the schema scope.
Domains are useful for centralizing the management of fields with common constraints.
The CREATE TYPE statement allows you to create a composite type, used as the return type of a function.
CREATE TYPE film_summary AS (
film_id INT,
title VARCHAR,
release_year SMALLINT
);
Use the film_summary data type as the return type of a function
CREATE OR REPLACE FUNCTION get_film_summary (f_id INT)
RETURNS film_summary AS
$$
SELECT
film_id,
title,
release_year
FROM
film
WHERE
film_id = f_id ;
$$
LANGUAGE SQL;
A user-defined function that returns a random number between two numbers low and high.
CREATE OR REPLACE FUNCTION random_between(low INT ,high INT)
RETURNS INT AS
$$
BEGIN
RETURN floor(random()* (high-low + 1) + low);
END;
$$ language 'plpgsql' STRICT;
SELECT random_between(1,100);
To get multiple random numbers between two integers, execute:
SELECT random_between(1, 100)
FROM generate_series(1, 4);
To list all user-defined types in the current database use the \dT or \dT+ command.
Conditional Expressions & Operators
CASE, COALESCE, NULLIF, CAST
CASE expression
The PostgreSQL CASE expression is the same as IF/ELSE statement in other programming languages. It allows you to add if-else logic to the query to form a powerful query.
General CASE expression
CASE
WHEN condition_1 THEN result_1
WHEN condition_2 THEN result_2
[WHEN ...]
[ELSE else_result]
END
SELECT
SUM (CASE
WHEN rental_rate = 0.99 THEN 1
ELSE 0
END
) AS "Economy",
SUM (
CASE
WHEN rental_rate = 2.99 THEN 1
ELSE 0
END
) AS "Mass",
SUM (
CASE
WHEN rental_rate = 4.99 THEN 1
ELSE 0
END
) AS "Premium"
FROM
film;
Coalesce
COALESCE function that returns the first non-null argument. The COALESCE function accepts an unlimited number of arguments. It returns the first argument that is not null. If all arguments are null, the COALESCE function will return null.
The COALESCE function evaluates arguments from left to right until it finds the first non-null argument. All the remaining arguments from the first non-null argument are not evaluated.
COALESCE (argument_1, argument_2,argument_3, ...);
CREATE TABLE items (
ID serial PRIMARY KEY,
product VARCHAR (100) NOT NULL,
price NUMERIC NOT NULL,
discount NUMERIC
);
Insert records into the item table
INSERT INTO items (product, price, discount)
VALUES
('Cassava', 1000 ,10),
('Yams', 1500 ,20),
('Arrow roots', 800 ,5),
('Potatoes', 500, NULL);
SELECT
product,
(price - COALESCE(discount,0)) AS net_price
FROM
items;
COALESCE function makes the query shorter and easier to read, it substitutes null values in the query.
NULLIF
To use PostgreSQL NULLIF function to handle null values.
NULLIF(argument_1, argument_2, argument_3);
Apply the NULLIF function to substitute the null values for displaying data and preventing division by zero error.
CAST operator
To convert a value of one data type into another. PostgreSQL provides you with the CAST operator that allows you to do this.
CAST ( expression AS target_type );
Specify an expression that can be a constant, a table column, an expression that evaluates to a value. Then the target data type to which you want to convert the result of the expression.
PostgreSQL type cast :: operator
expression::type
Cast a string to a double
SELECT
CAST ('10.2' AS DOUBLE PRECISION);
Cast a string to a date
SELECT
CAST ('2015-01-01' AS DATE),
CAST ('01-OCT-2015' AS DATE);
Cast operator to convert a string to an interval
SELECT '15 minute'::interval,
'2 hour'::interval,
'1 day'::interval,
'2 week'::interval,
'3 month'::interval;
Explain
The EXPLAIN statement returns the execution plan which PostgreSQL planner generates for a given statement.
The EXPLAIN shows how tables involved in a statement will be scanned by index scan or sequential scan, etc., and if multiple tables are used, what kind of join algorithm will be used.
EXPLAIN [ ( option [, ...] ) ] sql_statement;
Option can be :
ANALYZE [ boolean ]
VERBOSE [ boolean ]
COSTS [ boolean ]
BUFFERS [ boolean ]
TIMING [ boolean ]
SUMMARY [ boolean ]
FORMAT { TEXT | XML | JSON | YAML }
Boolean specifies whether the selected option should be turned on or off. You can use TRUE, ON, or 1 to enable the option, and FALSE, OFF, or 0 to disable it.
The ANALYZE option causes the sql_statement to be executed first and then actual run-time statistics in the returned information including total elapsed time expended within each plan node and the number of rows it actually returned.
TIMING includes the actual startup time and time spent in each node in the output.It defaults to TRUE and it may only be used when ANALYZE is enabled.
COSTS option includes the estimated startup and total costs of each plan node, as well as the estimated number of rows and the estimated width of each row in the query plan. The COSTS defaults to TRUE.
BUFFERS only can be used when ANALYZE is enabled. By default, the BUFFERS parameter set to FALSE
VERBOSE parameter allows you to show additional information regarding the plan. This parameter sets to FALSE by default.
SUMMARY parameter adds summary information such as total timing after the query plan. Note that when ANALYZE option is used, the summary information is included by default.
The output format of the query plan such as TEXT, XML, JSON, and YAML. This parameter is set to TEXT by default.
Explain statement that calculates the query plan.
EXPLAIN ANALYZE
SELECT
f.film_id,
title,
name category_name
FROM
film f
INNER JOIN film_category fc
ON fc.film_id = f.film_id
INNER JOIN category c
ON c.category_id = fc.category_id
ORDER BY
title;
Indeed we have covered the basics of SQL using PostgreSQl in detail. You need to practice what you have learnt using a sample database, so as to understand the concepts well. Our goal should be to write complex readable queries with fast execution rate. The next SQL article will cover the advanced topics in SQL such as Indexes, Views, Stored Procedures.
Feel free to share your thoughts in the comments.
Posted on September 20, 2022
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.