Master SQL Querying: A Comprehensive Guide.
Fullstack Dev
Posted on July 25, 2024
At first glance, SQL might seem simple and not hard at all, that is true until you find yourself in front of complex database structures or having to write complex queries. It does get confusing or difficult sometimes (Yes even experienced people can find themselves scratching their heads).
In this guide we're gonna lay out all the concepts so you can pick up SQL in no time.
SELECT basics
A SELECT
statement retrieves data from a table found in your database. Each table has rows and columns. You can select specific columns from a table.
Here is the Syntax:
SELECT "column_name", "column_name"
FROM "table_name";
Keep in mind that you can select one or multiple columns.
Here is an example of our table named general_info
Country | Capital | Year_of_Independence |
---|---|---|
United States | Washington, D.C. | 1776 |
France | Paris | 1789 |
India | New Delhi | 1947 |
Brazil | Brasilia | 1822 |
China | Beijing | 1949 |
Australia | Canberra | 1901 |
Our SELECT
statement will need the column names and the table name from which we want to retrieve data can select one or multiple columns.
SELECT "Country", "Year_of_Independence" FROM "general_info";
This query will return all the names of the countries and their year of independence.
Select with WHERE clause
A WHERE
clause is a condition that must be met in order for a row to be present in the result set of the query. In other words it helps filter the result and only returns the rows that match the condition you specified.
Country | Capital | Year_of_Independence |
---|---|---|
United States | Washington, D.C. | 1776 |
France | Paris | 1789 |
India | New Delhi | 1947 |
Brazil | Brasilia | 1822 |
China | Beijing | 1949 |
Australia | Canberra | 1901 |
We're going to use the same Table general_info
to look at an example.
Imagine we want to return the capital of a country that got its independence in 1901.
SELECT "Capital"
FROM "general_info"
WHERE "year_of_independence" = 1901;
Functions
Functions are built-in blocks of code, each of these functions execute a specific task.
Function Name | Description |
---|---|
SUM | Calculates the sum of a set of values |
AVG | Calculates the average of a set of values |
MIN | Returns the smallest value in a set |
MAX | Returns the largest value in a set |
COUNT | Counts the number of rows or non-null values |
ROUND | Rounds a number to a specified decimal place |
TRUNCATE | Truncates a number to a specified decimal place |
CONCAT | Combines two or more strings into a single string |
LENGTH | Returns the length of a string |
UPPER | Converts a string to uppercase |
LOWER | Converts a string to lowercase |
SUBSTRING | Extracts a substring from a string |
TRIM | Removes leading and trailing spaces from a string |
LTRIM | Removes leading spaces from a string |
RTRIM | Removes trailing spaces from a string |
LIKE | Used for pattern matching in strings |
YEAR | Extracts the year from a date |
MONTH | Extracts the month from a date |
DAY | Extracts the day from a date |
HOUR | Extracts the hour from a time |
MINUTE | Extracts the minute from a time |
SECOND | Extracts the second from a time |
CASE WHEN | Performs conditional logic |
COALESCE | Returns the first non-null value from a list of expressions |
NULLIF | Returns NULL if two expressions are equal, otherwise returns the first expression |
Here are some examples:
SUM
Let's calculate the total of prices In the fruits
table.
SELECT SUM("Price") AS 'total'
FROM "Fruits";
The AS
allows us to rename the result column as total.
AVG
SELECT AVG("Price") AS 'Average_price'
FROM "Fruits";
Here we're getting the average price of all the fruits in the Fruits
Table.
UPPER
SELECT UPPER("Fruit") AS "Fruit_name_upper"
FROM "Fruits";
This query converts all the fruit names in the Fruits
Table to uppercase.
TURNCATE
TRUNCATE in the context of numbers means to remove digits after a specified decimal place without rounding.
SELECT TURNCATE("name_of_column",2) AS 'New_Value'
FROM "table_name";
We can also use TURNCATE
to empty a table without messing with or deleting the table structure.
TURNCATE TABLE "Fruits";
All the data will be removed from the table. It is an irreversible operation.
SELECT with GROUP BY
The GROUP BY
clause groups rows that have the same value, it categorizes them based on a specific column of your choosing.
Imagine a list of fruits and their prices.
Table name: Fruits
Fruit | Price |
---|---|
Apple | 0.5 |
Banana | 0.3 |
Orange | 0.4 |
Grape | 0.4 |
Strawberry | 0.5 |
Mango | 0.3 |
Using this example, we can group the fruits by price.
SELECT "Price"
FROM "FRUITS"
GROUP BY "Price";
The result would look like that:
Price |
---|
0.3 |
0.4 |
0.5 |
What essentially happened is that all the rows that had the same prices were put together into a single group.
SELECT with JOIN
Sometimes you need to view the data of more than just one table to get the required data.
Table 1: Customers
CustomerID | CustomerName | City |
---|---|---|
1 | John Doe | New York |
2 | Jane Smith | Los Angeles |
3 | Kate Johnson | Chicago |
4 | Ilene Brown | Dallas |
5 | David Michaels | Houston |
6 | Olivia Davids | Miami |
Table 2: Orders
OrderID | CustomerID | OrderDate |
---|---|---|
101 | 1 | 2023-11-08 |
102 | 2 | 2023-12-23 |
103 | 3 | 2023-11-25 |
104 | 1 | 2023-12-18 |
105 | 4 | 2023-11-22 |
106 | 2 | 2023-12-15 |
In this example, the Customers
table is related to a row of the Orders
table. If we want to find out Who placed the order and When the order was placed, we need to join the Customers
table to the Orders
table on the common field that is the CustomerID
in both tables.
SELECT Customers.CustomerName, Orders.OrderDate
FROM Customers JOIN Orders
ON Customers.CustomerID = Orders.CustomerID;
This would be the result of this query:
CustomerName | OrderDate |
---|---|
John Doe | 2023-11-08 |
John Doe | 2023-12-18 |
Jane Smith | 2023-12-23 |
Jane Smith | 2023-12-15 |
Kate Johnson | 2023-11-25 |
Ilene Brown | 2023-11-22 |
LEFT JOIN
A LEFT JOIN
returns all the rows from the left table even if there are no matches in the right one.
This means that the returned result will have all the data from the left table; if there's a match in the right table then the relevant columns are filled and if there is no match found in the right table the columns from the right table will have null values.
SELECT Customers.CustomerName, Orders.OrderID
FROM Customers
LEFT JOIN Orders
ON Customers.CustomerID = Orders.CustomerID;
This will help us know which customer did not place an order.
CustomerName | OrderID |
---|---|
John Doe | 101 |
John Doe | 104 |
Jane Smith | 102 |
Jane Smith | 106 |
Kate Johnson | 103 |
Ilene Brown | 105 |
David Michaels | |
Olivia Davids |
RIGHT JOIN
A RIGHT JOIN
is essentially the opposite of a LEFT JOIN
. It returns all rows from the right table even if there are no matches in the left table.
This means that the returned result will have all the data from the right table; if there's a match in the left table then the relevant columns are filled and if there is no match found in the left table the columns from the left table will have null values.
SELECT Customers.CustomerName, Orders.OrderID
FROM Customers
RIGHT JOIN Orders
ON Customers.CustomerID = Orders.CustomerID;
This query will return all orders, including those without corresponding customers. If an order doesn't have a corresponding customer, the CustomerName column will be NULL.
Subqueries
Subqueries queries are queries nested inside another query.
Subquerying with JOIN
Let's look at one using the JOIN
statement and our Customers
and Orders
tables
SELECT CustomerName, TotalOrders
FROM Customers
JOIN(SELECT Orders.CustomerID, COUNT(Orders.CustomerID) AS TotalOrders
FROM Orders
GROUP BY CustomerID) AS OrderCounts
ON Customers.CustomerID = Orders.CustomerID;
Let's break our query down by starting from the inner query to the outer one:
-
SELECT Orders.CustomerID,COUNT(Orders.CustomerID) AS TotalOrders FROM Orders
: We are calculating the total number of orders made by each customer in theOrders
by using a built-in function namedCOUNT
. -
As TotalOrders
, is used to rename the column we got as a result. -
GROUP BY CustomerID
, this query groups the result byCustomerID
this will allow us to know how many Orders each Person Placed. AS OrderCounts
Gives an Alias to the subquery allowing us to treat it like a normal table temporarily.SELECT CustomerName, TotalOrders FROM Customers
We are selecting theCustomerName
From theCustomers
table and theTotalOrder
From the temporary table we just created that has the columnTotalOrders
.Lastly
ON Customers.CustomerID = Orders.CustomerID
Joins both tables by using their common field which isCustomerID
.
Our result would be:
CustomerName | TotalOrders |
---|---|
John Doe | 2 |
Jane Smith | 2 |
Kate Johnson | 1 |
Ilene Brown | 1 |
David Michaels | 0 |
Olivia Davids | 0 |
Subquerying with IN
Let's find customers who placed orders after the date 2023-11-08
.
SELECT CustomerName
FROM Customers
WHERE CustomerID IN (
SELECT CustomerID
FROM Orders
WHERE OrderDate > '2023-11-08'
);
SELECT CustomerID FROM Orders WHERE OrderDate > '2023-11-08': This part retrieves the
CustomerID
from theOrders
table where theOrderDate
is after November 08, 2023.SELECT CustomerName FROM Customers WHERE CustomerID IN (...)
: This part selects theCustomerName
from theCustomers
table whereCustomerID
is found in the result of the subquery.
A Deeper Dive
After Presenting the basic concepts of how each provided statement above works let's see more statements at the works.
Here are The same tables with a little modification.
Customers Table
CustomerID | CustomerName | City |
---|---|---|
1 | John Doe | New York |
2 | Jane Smith | Los Angeles |
3 | Kate Johnson | Chicago |
4 | Ilene Brown | Dallas |
5 | David Michaels | Houston |
6 | Olivia Davids | Houston |
Orders Table
OrderID | CustomerID | OrderDate | OrderAmount |
---|---|---|---|
101 | 1 | 2023-11-08 | 100 |
102 | 2 | 2023-12-23 | 150 |
103 | 3 | 2023-11-25 | 80 |
104 | 1 | 2023-12-18 | 200 |
105 | 4 | 2023-11-22 | 120 |
106 | 2 | 2023-12-15 | 90 |
Basic SELECT Queries
- Get all Customers' data:
SELECT * FROM Customers
The asterisk *
means all data.
- Get specific Columns:
SELECT CustomerName, City
FROM Customers;
Filtering Data
- Get orders placed after a specific date:
SELECT * FROM Orders
WHERE OrderDate > '2023-11-08';
- Find Customers in a Specific City:
SELECT * FROM Customers
WHERE City = 'Los Angeles';
- Find Orders With an order amount greater than 100:
SELECT * FROM Orders WHERE OrderAmount >100;
Sorting Data (ORDER BY)
When sorting Data with ORDER BY
We have the DESC
and ASC
Keywords.
-
ASC
Sorts the data fromA
toZ
-
DESC
Sorts the data fromZ
toA
ASC
is the default sort order.
- Sorting Customers by name in descending order:
SELECT * FROM Customers
ORDER BY CustomerName DESC;
Aggregating Data (GROUP BY, HAVING,other functions)
Group By
We've already come across GROUP BY
. It will categorize data and group them based on a specific column such as an ID
for example.
- Find the total order amount for each customer:
SELECT CustomerID, SUM(OrderAmount) AS TotalOrderAmount
FROM Orders
GROUP BY CustomerID;
HAVING
HAVING
is similar to the WHERE
clause but it uses the results of GROUP BY
rather than individual rows. This means that HAVING
Filters the groups made by GROUP BY
and checks if the condition provided is applied.
- Find Customers with a total order amount greater than 200:
SELECT CustomerID, SUM(OrderAmount) AS TotalOrderAmount
FROM Orders
GROUP BY CustomerID
HAVING SUM(OrderAmount) > 200;
INNER JOIN
We've previously seen LEFT JOIN
s and RIGHT JOIN
s, and how they return null values if the right or left row is missing a value.
What if we want to return the rows that only have matching values in both tables and avoid the null
values?
This is Where INNER JOIN
comes into play.
SELECT Customers.CustomerName, Orders.OrderDate
FROM Customers
INNER JOIN Orders
ON Customers.CustomerID = Orders.CustomerID;
COALESCE FUNCTION
COALESCE function to replace null values with a specific one.
SELECT Customers.CustomerName, COALESCE(Orders.OrderDate, 'No Orders') AS OrderDate
FROM Customers
LEFT JOIN Orders
ON Customers.CustomerID = Orders.CustomerID;
Correlated Subqueries
A correlated subquery is when a subquery references a column from the outer query for example:
SELECT CustomerName
FROM Customers
WHERE EXISTS (
SELECT *
FROM Orders
WHERE Orders.CustomerID = Customers.CustomerID AND OrderDate > '2023-11-30'
);
WITH
The with
clause also known as a common table expression, makes it possible to temporarily define result sets that can be referenced within a SELECT
, INSERT
, UPDATE
, or DELETE
statement.
This means that it allows you to break down complex queries into smaller ones and make them more manageable.
Let's say we want to find customers who have placed orders after a specific date using common table expression.
WITH RecentOrders AS (
SELECT CustomerID
FROM Orders
WHERE OrderDate > '2023-11-30'
)
SELECT CustomerName
FROM Customers
WHERE CustomerID IN (SELECT CustomerID FROM RecentOrders);
The WITH
clause improves readability and performance and it only exists for the duration of the query.
Complex Grouping
Let's say we want to calculate the total sales for each month and year in the Orders
Table.
SELECT
YEAR(OrderDate) AS OrderYear,
MONTH(OrderDate) AS OrderMonth,
SUM(OrderAmount) AS TotalSales
FROM
Orders
GROUP BY
YEAR(OrderDate), MONTH(OrderDate);
- We will select three columns
year
of the order, themonth
of the order, and the total sales amount for that month and year. Then we will need to group by the year and month.
In conclusion, understanding the fundamentals and basic querying at the start of your journey to learn SQL is the building block of Mastering it. Understanding the concepts laid out in this article will give you a significant step forward and will help you become proficient in database management. The KEY is to PRACTICE and CONSISTENT APPLICATION.
Posted on July 25, 2024
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.