Understanding SQL Joins: A Beginner's Guide

katerinamykhailyk

Katherine Mykhailyk

Posted on April 30, 2023

Understanding SQL Joins: A Beginner's Guide

Image description

There are three main types of joins in SQL: INNER, OUTER, and CROSS joins. Each join involves two tables, referred to as the LEFT and RIGHT tables, respectively.

By combining these concepts, you can create different types of joins such as inner, left outer, right outer, and full outer joins.

It's crucial to keep in mind that when you use the JOIN keyword without specifying the type of join, it will be treated as an INNER JOIN by default. To perform an outer join, you need to specify whether it's a RIGHT, LEFT, or FULL join. Moreover, it's worth noting that the INNER and OUTER keywords are optional with INNER being the default.

For presentational purposes I created two DB tables (I use SQL Server and Azure Data Studio visual tool):

PaymentSystems Table:

Image description

Customers Table:

Image description

Let's examine how the various types of joins can impact the outcome of a query using the example of these two tables.

Inner type of joins

1. INNER JOIN (or simply JOIN)

The diagram for INNER JOIN, depicts the overlapping area between two circles, which represents the rows in each table that have a common value. When this type of join is used, only the rows that have matching values in both tables will be selected and returned. In other words, INNER JOIN returns only the data that is common to both tables.
Image description

Let’s assume the following code:

SELECT * FROM dbo.PaymentSystems S
INNER JOIN dbo.Customers C
ON S.Id = C.PaymentSystemId
Enter fullscreen mode Exit fullscreen mode

Running the above code will produce the following result:

Image description

As shown in the result, the INNER JOIN did not include certain rows from the Customers table in the final output, as those rows did not have an Id that matched the condition specified in the ON clause: S.Id = C.PaymentSystemId.

Image description

Outer type of joins

2. LEFT OUTER JOIN (or simply LEFT JOIN)

It retrieves all the rows from the left table and any matching rows from the right table. If there are no matching rows in the right table, NULL values are used instead.

Image description

Let's execute this query and observe the result:

SELECT * FROM dbo.Customers C
LEFT JOIN dbo.PaymentSystems S
ON C.PaymentSystemId = S.Id
Enter fullscreen mode Exit fullscreen mode

Image description

As shown in the result, the LEFT OUTER JOIN retrieved all the rows from the Customers table, but only those rows from the PaymentSystems table that matched the condition specified in the ON clause: C.PaymentSystemId = S.Id. In cases where there was no match found, the missing values were replaced with NULL.

3. RIGHT OUTER JOIN (or simply RIGHT JOIN)

Image description

It returns all the rows from the right table and matching rows from the left table. If there are no matches in the left table, NULL values are returned. It's similar to LEFT OUTER JOIN, but the tables are reversed, with the right table being the primary table.

SELECT * FROM dbo.PaymentSystems S
RIGHT JOIN dbo.Customers C
ON S.Id = C.PaymentSystemId
Enter fullscreen mode Exit fullscreen mode

The outcome of the query will be as follows:

Image description

The Customers table has been included completely, but PaymentSystems table is only included where it met the condition S.Id = C.PaymentSystemId.

4. FULL OUTER JOIN (or simply FULL JOIN)

Image description

It retrieves all rows from both tables, and if there are any unmatched rows, it fills in the columns with NULL values.

SELECT * FROM dbo.PaymentSystems S
FULL OUTER JOIN dbo.Customers C
ON S.Id = C.PaymentSystemId
Enter fullscreen mode Exit fullscreen mode

Image description

As you can see, this code returns a result set that includes all the rows from both the PaymentSystems and Customers tables, regardless of whether there is a matching row in the other table. The result set will have NULL values in any columns where there is no match between the two tables.

Cartesian type of joins

5. CROSS JOIN

Image description

A CROSS JOIN produces a result set that combines each row from the first table with each row from the second table, resulting in a Cartesian product of the two tables. It should be used with caution as it can generate a large number of rows and cause performance issues. Cross joins are appropriate when no other join can be used, and when the WHERE clause can limit the result set to a manageable size. Although a cross join doesn't make sense for my particular tables, we can still use it to illustrate how it works:

SELECT TOP 14 * FROM  dbo.Customers C
CROSS JOIN dbo.PaymentSystems S
Enter fullscreen mode Exit fullscreen mode

Image description

The resulting output displays all the possible combinations of rows from the Customers and PaymentSystems tables, as generated by the cross join operation. As you can see, I have limited the output by using the TOP 14 clause in order to restrict the number of rows returned by the query.

No obvious category

6. SELF JOIN

There is no specific keyword in SQL for a self join. Self join doesn't belong to any of the traditional join categories such as inner, outer, left, right, or full outer join. It's a type of a instruction where a table is joined with itself. In other words, a self join is a regular join, but the table is used as both the left and right table in the join. This can be useful when you have a table that contains a hierarchy or a relationship between rows, and you need to join rows within the same table based on that hierarchy or relationship.

Image description

A self join is utilised when a table needs to reference data within itself, for example: employees, multi-level marketing, machine parts, comment tree where each comment references the next one, and so on.

The following example demonstrates a simple implementation of a self join with the help of LEFT JOIN:

SELECT *
FROM dbo.Employees E
LEFT JOIN dbo.Employees MA
ON MA.EmployeeId = E.ManagerId
Enter fullscreen mode Exit fullscreen mode

Alternatively, you can also use the following query with a WHERE clause:

Select * FROM dbo.Employees E, dbo.Employees MA
WHERE MA.EmployeeId = E.ManagerId
Enter fullscreen mode Exit fullscreen mode

Another way to obtain the same result is by using an INNER JOIN :

SELECT * FROM dbo.Employees E
INNER JOIN dbo.Employees MA
ON MA.EmployeeId = E.ManagerId
Enter fullscreen mode Exit fullscreen mode

The three queries shown above are examples of self-joins where the Employees table is joined with itself.

In conclusion, SQL joins provide a powerful way to combine data from multiple tables. Understanding the differences between joins is essential for building complex queries and retrieving the data you need. By mastering SQL joins, you can unlock the full potential of your database and gain deeper insights into your data.

💖 💪 🙅 🚩
katerinamykhailyk
Katherine Mykhailyk

Posted on April 30, 2023

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related