Understanding SQL Joins: A Beginner's Guide
Katherine Mykhailyk
Posted on April 30, 2023
There are three main types of joins in SQL: INNER, OUTER, and CROSS joins. Each join involves two tables, referred to as the LEFT and RIGHT tables, respectively.
By combining these concepts, you can create different types of joins such as inner, left outer, right outer, and full outer joins.
It's crucial to keep in mind that when you use the JOIN keyword without specifying the type of join, it will be treated as an INNER JOIN by default. To perform an outer join, you need to specify whether it's a RIGHT, LEFT, or FULL join. Moreover, it's worth noting that the INNER and OUTER keywords are optional with INNER being the default.
For presentational purposes I created two DB tables (I use SQL Server and Azure Data Studio visual tool):
PaymentSystems Table:
Customers Table:
Let's examine how the various types of joins can impact the outcome of a query using the example of these two tables.
Inner type of joins
1. INNER JOIN (or simply JOIN)
The diagram for INNER JOIN, depicts the overlapping area between two circles, which represents the rows in each table that have a common value. When this type of join is used, only the rows that have matching values in both tables will be selected and returned. In other words, INNER JOIN returns only the data that is common to both tables.
Let’s assume the following code:
SELECT * FROM dbo.PaymentSystems S
INNER JOIN dbo.Customers C
ON S.Id = C.PaymentSystemId
Running the above code will produce the following result:
As shown in the result, the INNER JOIN did not include certain rows from the Customers table in the final output, as those rows did not have an Id that matched the condition specified in the ON clause: S.Id = C.PaymentSystemId.
Outer type of joins
2. LEFT OUTER JOIN (or simply LEFT JOIN)
It retrieves all the rows from the left table and any matching rows from the right table. If there are no matching rows in the right table, NULL values are used instead.
Let's execute this query and observe the result:
SELECT * FROM dbo.Customers C
LEFT JOIN dbo.PaymentSystems S
ON C.PaymentSystemId = S.Id
As shown in the result, the LEFT OUTER JOIN retrieved all the rows from the Customers table, but only those rows from the PaymentSystems table that matched the condition specified in the ON clause: C.PaymentSystemId = S.Id. In cases where there was no match found, the missing values were replaced with NULL.
3. RIGHT OUTER JOIN (or simply RIGHT JOIN)
It returns all the rows from the right table and matching rows from the left table. If there are no matches in the left table, NULL values are returned. It's similar to LEFT OUTER JOIN, but the tables are reversed, with the right table being the primary table.
SELECT * FROM dbo.PaymentSystems S
RIGHT JOIN dbo.Customers C
ON S.Id = C.PaymentSystemId
The outcome of the query will be as follows:
The Customers table has been included completely, but PaymentSystems table is only included where it met the condition S.Id = C.PaymentSystemId.
4. FULL OUTER JOIN (or simply FULL JOIN)
It retrieves all rows from both tables, and if there are any unmatched rows, it fills in the columns with NULL values.
SELECT * FROM dbo.PaymentSystems S
FULL OUTER JOIN dbo.Customers C
ON S.Id = C.PaymentSystemId
As you can see, this code returns a result set that includes all the rows from both the PaymentSystems and Customers tables, regardless of whether there is a matching row in the other table. The result set will have NULL values in any columns where there is no match between the two tables.
Cartesian type of joins
5. CROSS JOIN
A CROSS JOIN produces a result set that combines each row from the first table with each row from the second table, resulting in a Cartesian product of the two tables. It should be used with caution as it can generate a large number of rows and cause performance issues. Cross joins are appropriate when no other join can be used, and when the WHERE clause can limit the result set to a manageable size. Although a cross join doesn't make sense for my particular tables, we can still use it to illustrate how it works:
SELECT TOP 14 * FROM dbo.Customers C
CROSS JOIN dbo.PaymentSystems S
The resulting output displays all the possible combinations of rows from the Customers and PaymentSystems tables, as generated by the cross join operation. As you can see, I have limited the output by using the TOP 14 clause in order to restrict the number of rows returned by the query.
No obvious category
6. SELF JOIN
There is no specific keyword in SQL for a self join. Self join doesn't belong to any of the traditional join categories such as inner, outer, left, right, or full outer join. It's a type of a instruction where a table is joined with itself. In other words, a self join is a regular join, but the table is used as both the left and right table in the join. This can be useful when you have a table that contains a hierarchy or a relationship between rows, and you need to join rows within the same table based on that hierarchy or relationship.
A self join is utilised when a table needs to reference data within itself, for example: employees, multi-level marketing, machine parts, comment tree where each comment references the next one, and so on.
The following example demonstrates a simple implementation of a self join with the help of LEFT JOIN:
SELECT *
FROM dbo.Employees E
LEFT JOIN dbo.Employees MA
ON MA.EmployeeId = E.ManagerId
Alternatively, you can also use the following query with a WHERE clause:
Select * FROM dbo.Employees E, dbo.Employees MA
WHERE MA.EmployeeId = E.ManagerId
Another way to obtain the same result is by using an INNER JOIN :
SELECT * FROM dbo.Employees E
INNER JOIN dbo.Employees MA
ON MA.EmployeeId = E.ManagerId
The three queries shown above are examples of self-joins where the Employees table is joined with itself.
In conclusion, SQL joins provide a powerful way to combine data from multiple tables. Understanding the differences between joins is essential for building complex queries and retrieving the data you need. By mastering SQL joins, you can unlock the full potential of your database and gain deeper insights into your data.
Posted on April 30, 2023
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.