MySQL Basics: Understanding Join and Index with Examples

ronnychen

Yuhao Chen

Posted on September 28, 2024

MySQL Basics: Understanding Join and Index with Examples

Join

Types of Joins:

  1. INNER JOIN
  2. LEFT JOIN (or LEFT OUTER JOIN)
  3. RIGHT JOIN (or RIGHT OUTER JOIN)
  4. FULL JOIN (or FULL OUTER JOIN)
  5. CROSS JOIN

1. INNER JOIN

An INNER JOIN retrieves records that have matching values in both tables. If there is no match, the row will not be included in the result set.

Example:

Let’s say you have two tables: employees and departments.

Table: employees

emp_id first_name last_name department_id
1 John Doe 101
2 Jane Smith 102
3 Alice Johnson 103
4 Bob Williams NULL

Table: departments

department_id department_name
101 HR
102 Engineering
103 Marketing
104 Sales

Now, let's use an INNER JOIN to retrieve all employees with their corresponding department names:

SELECT employees.emp_id, employees.first_name, employees.last_name, departments.department_name
FROM employees
INNER JOIN departments
ON employees.department_id = departments.department_id;
Enter fullscreen mode Exit fullscreen mode

Result:

emp_id first_name last_name department_name
1 John Doe HR
2 Jane Smith Engineering
3 Alice Johnson Marketing
  • Explanation: Only the rows where there is a matching department_id in both tables (employees and departments) are included in the result set. Bob, who has a NULL value for department_id, is excluded.

2. LEFT JOIN (or LEFT OUTER JOIN)

A LEFT JOIN returns all records from the left table (in this case, employees), and the matched records from the right table (in this case, departments). If there is no match, the result is NULL for the right table's columns.

Example:

SELECT employees.emp_id, employees.first_name, employees.last_name, departments.department_name
FROM employees
LEFT JOIN departments
ON employees.department_id = departments.department_id;
Enter fullscreen mode Exit fullscreen mode

Result:

emp_id first_name last_name department_name
1 John Doe HR
2 Jane Smith Engineering
3 Alice Johnson Marketing
4 Bob Williams NULL
  • Explanation: This query includes all rows from the employees table. If a corresponding department_id is not found in the departments table, the department_name column will contain NULL (as in Bob’s case).

3. RIGHT JOIN (or RIGHT OUTER JOIN)

A RIGHT JOIN returns all records from the right table (in this case, departments), and the matched records from the left table (in this case, employees). If there is no match, the result is NULL for the left table's columns.

Example:

SELECT employees.emp_id, employees.first_name, employees.last_name, departments.department_name
FROM employees
RIGHT JOIN departments
ON employees.department_id = departments.department_id;
Enter fullscreen mode Exit fullscreen mode

Result:

emp_id first_name last_name department_name
1 John Doe HR
2 Jane Smith Engineering
3 Alice Johnson Marketing
NULL NULL NULL Sales
  • Explanation: This query returns all rows from the departments table. If there is no matching department_id in the employees table (as is the case with the "Sales" department), the employee fields will contain NULL.

4. FULL JOIN (or FULL OUTER JOIN)

A FULL JOIN combines the results of both LEFT JOIN and RIGHT JOIN. It returns all records when there is a match in either the left or right table. If there is no match, the result is NULL for the missing side.

Example (if MySQL supported FULL JOIN natively):

SELECT employees.emp_id, employees.first_name, employees.last_name, departments.department_name
FROM employees
FULL OUTER JOIN departments
ON employees.department_id = departments.department_id;
Enter fullscreen mode Exit fullscreen mode

Result (Hypothetical):

emp_id first_name last_name department_name
1 John Doe HR
2 Jane Smith Engineering
3 Alice Johnson Marketing
4 Bob Williams NULL
NULL NULL NULL Sales
  • Explanation: This query includes all rows from both tables. If there is no matching department_id in either table, the columns for the non-matching table will show NULL. In MySQL, you can simulate a FULL JOIN by combining a LEFT JOIN and a RIGHT JOIN using UNION.
SELECT employees.emp_id, employees.first_name, employees.last_name, departments.department_name
FROM employees
LEFT JOIN departments
ON employees.department_id = departments.department_id
UNION
SELECT employees.emp_id, employees.first_name, employees.last_name, departments.department_name
FROM employees
RIGHT JOIN departments
ON employees.department_id = departments.department_id;
Enter fullscreen mode Exit fullscreen mode

5. CROSS JOIN

A CROSS JOIN produces the Cartesian product of the two tables. It combines each row of the first table with every row of the second table, without considering any condition. This is useful when you want to combine all possible combinations of rows from two tables.

Example:

SELECT employees.first_name, departments.department_name
FROM employees
CROSS JOIN departments;
Enter fullscreen mode Exit fullscreen mode

Result:

first_name department_name
John HR
John Engineering
John Marketing
John Sales
Jane HR
Jane Engineering
Jane Marketing
Jane Sales
Alice HR
Alice Engineering
Alice Marketing
Alice Sales
Bob HR
Bob Engineering
Bob Marketing
Bob Sales
  • Explanation: The CROSS JOIN combines every employee with every department, resulting in a Cartesian product. This is often not useful unless you specifically want to get all combinations.

Real-World Use Cases

  • INNER JOIN: Used when you only want records that have matches in both tables (e.g., finding all orders where there is a matching customer).
  • LEFT JOIN: Useful when you want all records from the left table, even if there’s no matching record in the right table (e.g., listing all customers and their orders, including customers who have not placed any orders).
  • RIGHT JOIN: Similar to LEFT JOIN, but focuses on all records from the right table (e.g., listing all departments and the employees in those departments, including departments without employees).
  • FULL JOIN: Used when you need to see all records from both tables, whether or not they match.
  • CROSS JOIN: Used to create combinations (e.g., pairing products with all possible promotional offers).

Index

Types of Indexes

Clustered index: Determines the physical order of data in the table (there can be only one clustered index per table).

Non-clustered index: Separate from the table data and contains a pointer to the data.

(Only when they are not part of the primary key or explicitly defined as the clustered index)

  • Single-column index: Indexes a single column in a table.
  • Composite index: Indexes multiple columns in a table (useful for queries that filter by more than one column).
  • Unique index: Ensures that the indexed column(s) have unique values.

Clustered Index

Example

Let’s say we have an employees table:

CREATE TABLE employees (
    emp_id INT PRIMARY KEY,  -- Clustered Index is automatically created on this column
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    department VARCHAR(50)
);
Enter fullscreen mode Exit fullscreen mode

In this case, the emp_id is the primary key and will be the clustered index by default. The data in the employees table is physically ordered by emp_id.

Query benefit

A query like the one below will be fast because the rows are stored in the physical order of emp_id:

SELECT * FROM employees WHERE emp_id BETWEEN 100 AND 200;
Enter fullscreen mode Exit fullscreen mode

Single-Column Index

Example

Let’s create a single-column index on the first_name column in the employees table.

CREATE INDEX idx_firstname ON employees (first_name);
Enter fullscreen mode Exit fullscreen mode

This index is non-clustered (since there can only be one clustered index) and will improve query performance when searching by first_name.

Query benefit

The following query will benefit from the single-column index:

SELECT * FROM employees WHERE first_name = 'John';
Enter fullscreen mode Exit fullscreen mode

Composite Index

Example

You frequently query employees by both last_name and department, so you can create a composite index on these two columns.

CREATE INDEX idx_lastname_department ON employees (last_name, department);
Enter fullscreen mode Exit fullscreen mode

This is a non-clustered composite index that will speed up queries involving both last_name and department. The order of the columns in the composite index matters, as queries that filter by last_name first and then department will be faster.

Query benefit

The following query will benefit from the composite index:

SELECT * FROM employees WHERE last_name = 'Smith' AND department = 'Sales';
Enter fullscreen mode Exit fullscreen mode

However, if you search only by department, the index will not be as effective as if you had indexed department separately.

Unique Index

Example

You want to ensure that the email field is unique for all employees. You can create a unique index on the email column.

CREATE UNIQUE INDEX idx_email_unique ON employees (email);
Enter fullscreen mode Exit fullscreen mode

This unique index guarantees that no two rows in the employees table will have the same email.

Query benefit

The following query will benefit from the unique index:

SELECT * FROM employees WHERE email = 'john.doe@example.com';
Enter fullscreen mode Exit fullscreen mode

Additionally, if someone tries to insert a duplicate email, they will get an error due to the unique constraint:

INSERT INTO employees (emp_id, first_name, last_name, department, email)
VALUES (101, 'John', 'Doe', 'HR', 'john.doe@example.com');  -- This will fail if 'john.doe@example.com' already exists
Enter fullscreen mode Exit fullscreen mode
💖 💪 🙅 🚩
ronnychen
Yuhao Chen

Posted on September 28, 2024

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related