Using SQLite to Examine Data Relationships and Compare Tables

jimgrimes86

Jim Grimes

Posted on November 13, 2023

Using SQLite to Examine Data Relationships and Compare Tables

SQLite provides numerous ways to view and compare data across multiple tables. This blog will discuss SQLite queries that can be used to view data relationships:

  • INNER JOIN
  • LEFT JOIN
  • Using Multiple Join Operators

And SQLite queries that can be used to compare tables:

  • INTERSECT
  • UNION
  • UNION ALL
  • EXCEPT

Simple Select Queries

In order to illustrate how to use SELECT statements, I have created a database to keep track of garden plants and garden beds, which I refer to in the examples I use. The database has a plants table that contains information on the different garden plants, including the plant name, whether the plant requires full sun, and the time of year that the plant begins to bloom. The data in the plants table can be retrieved with a SELECT query:

SELECT * FROM plants;
Enter fullscreen mode Exit fullscreen mode
id  name              full_sun  bloom      
--  ----------------  --------  -----------
1   Bee Balm          0         late spring
2   Petunia           1         summer     
3   Coneflower        0         summer     
4   Zinnia            1         late spring
5   Black-Eyed Susan  0         summer    
Enter fullscreen mode Exit fullscreen mode

In the full_sun column, the integers 1 and 0 are used to represent the Boolean values of 'true' and 'false', respectively.

To limit the result's columns to only include the plant's id number and name, the columns can be specified in the SELECT statement instead of using *:

SELECT id, name FROM plants;
Enter fullscreen mode Exit fullscreen mode
id  name            
--  ----------------
1   Bee Balm        
2   Petunia         
3   Coneflower      
4   Zinnia          
5   Black-Eyed Susan
Enter fullscreen mode Exit fullscreen mode

To limit the query result to only those rows that match a certain condition, a WHERE clause can be added to the query. For example, the following query would only select plants that do not require full sunlight:

SELECT * FROM plants WHERE full_sun IS 0;
Enter fullscreen mode Exit fullscreen mode
id  name              full_sun  bloom      
--  ----------------  --------  -----------
1   Bee Balm          0         late spring
3   Coneflower        0         summer     
5   Black-Eyed Susan  0         summer    
Enter fullscreen mode Exit fullscreen mode

Select Queries With Join Operators

Retrieving data from a single table has its uses, but it does not allow me to easily see the relationships between data on multiple tables. You can use join statements to view data across multiple tables at the same time.

Our database also contains a table with information on garden beds:

beds

id  light        
--  -------------
1   full sun     
2   partial shade
Enter fullscreen mode Exit fullscreen mode

A type of plant can be planted in multiple garden beds, and a garden bed can have multiple kinds of plants. To keep track of the relationships between plants and garden beds, there is a join table, called plant_beds:

plant_beds

id  plant_id  bed_id
--  --------  ------
1   1         2     
2   4         1     
3   3         2     
4   5         1     
5   1         1 
Enter fullscreen mode Exit fullscreen mode

The relationship between the plants, beds, and plant_beds tables can be visualized like this:

Representation of table relationships

INNER JOIN

The INNER JOIN can be used to view the plants that correspond to each row in the plant_bed join table. SQLite treats the operators "INNER JOIN", "JOIN" and "," exactly the same, so they can be used interchangably.

SELECT plant_beds.id, plant_beds.plant_id, plants.name, plant_beds.bed_id
FROM plant_beds
INNER JOIN plants
ON plant_beds.plant_id = plants.id
Enter fullscreen mode Exit fullscreen mode
id  plant_id  name              bed_id
--  --------  ----------------  ------
1   1         Bee Balm          2     
2   4         Zinnia            1     
3   3         Coneflower        2     
4   5         Black-Eyed Susan  1     
5   1         Bee Balm          1    
Enter fullscreen mode Exit fullscreen mode

Don't forget to include the ON statement in the query. The ON operator tells SQLite how the tables relate to each other. Without that instruction, SQLite will return every row from the plant_beds table matched with every row from the plants database:

SELECT plant_beds.id, plant_beds.plant_id, plants.name, plant_beds.bed_id
FROM plant_beds
INNER JOIN plants
Enter fullscreen mode Exit fullscreen mode
id  plant_id  name              bed_id
--  --------  ----------------  ------
1   1         Bee Balm          2     
1   1         Petunia           2     
1   1         Coneflower        2     
1   1         Zinnia            2     
1   1         Black-Eyed Susan  2     
2   4         Bee Balm          1     
2   4         Petunia           1     
2   4         Coneflower        1     
2   4         Zinnia            1     
2   4         Black-Eyed Susan  1     
3   3         Bee Balm          2     
3   3         Petunia           2     
3   3         Coneflower        2     
3   3         Zinnia            2     
3   3         Black-Eyed Susan  2     
4   5         Bee Balm          1     
4   5         Petunia           1     
4   5         Coneflower        1     
4   5         Zinnia            1     
4   5         Black-Eyed Susan  1     
5   1         Bee Balm          1     
5   1         Petunia           1     
5   1         Coneflower        1     
5   1         Zinnia            1     
5   1         Black-Eyed Susan  1    
Enter fullscreen mode Exit fullscreen mode

This result includes Petunias in the response, even though they aren't currently associated with any beds in our plant_beds join table. So even though SQLite has returned information from both the plants table and the plant_beds table, the response does not represent the relationship between the two tables.

LEFT JOIN

The reponse to an INNER JOIN query only includes the rows from each table that have a match in the ON statement. SO, in the example above, Petunia is not include din the response because it is not included in the plant_beds table. In order to see all the rows from the left table or the right table, LEFT JOIN or LEFT OUTER JOIN should be used instead of INNER JOIN. The response from a LEFT JOIN query will include the same rows as an INNER JOIN query, as well as an extra row from each row in the left-hand table (or first table listed in the query) that does not have a mathing row in the right-hand table (the second table listed in the query). SQLite uses NULL as a default value for any column that does not have a value in the right-hand table

SELECT plant_beds.id, plant_beds.plant_id, plants.name, plant_beds.bed_id
FROM plant_beds
LEFT JOIN plants
ON plant_beds.plant_id = plants.id
Enter fullscreen mode Exit fullscreen mode
id  plant_id  name              bed_id
--  --------  ----------------  ------
5   1         Bee Balm          1     
1   1         Bee Balm          2     
              Petunia                 
3   3         Coneflower        2     
2   4         Zinnia            1     
4   5         Black-Eyed Susan  1 
Enter fullscreen mode Exit fullscreen mode

As can be seen in this example, the LEFT JOIN query result includes the Petunia row from the plants table and uses NULL values, which appear as blank spaces, for the columns where there is no match for Petunia in the plant_beds table.

Using a JOIN Statement With More Than Two Tables

INNER JOIN and LEFT JOIN statements can be repeated in a query to produce a result that includes information from more than one table. For example, to see information from the plant_beds table as well as details from both the plants and beds tables, two INNER JOIN statements can be used.

SELECT plant_beds.id, plant_beds.plant_id, plants.name, plant_beds.bed_id, beds.light
FROM plant_beds
INNER JOIN plants
ON plant_beds.plant_id = plants.id
INNER JOIN beds
ON plant_beds.bed_id = beds.id
Enter fullscreen mode Exit fullscreen mode
id  plant_id  name              bed_id  light        
--  --------  ----------------  ------  -------------
1   1         Bee Balm          2       partial shade
2   4         Zinnia            1       full sun     
3   3         Coneflower        2       partial shade
4   5         Black-Eyed Susan  1       full sun     
5   1         Bee Balm          1       full sun 
Enter fullscreen mode Exit fullscreen mode

Compound Select Statements

What if, instead of listing all plants in one table, the plants were sorted into multiple tables based on characteristics of each plant. I have made three more database tables to keep track of native plants, plants that are particularly attractive to bees, and plants that are likely to attract hummingbirds. Of course, there are native plants that are known as good choices for bees and hummingbirds, so the three tables will have some plants in common.

native_plants

id  name                full_sun  bloom      
--  ------------------  --------  -----------
1   Arrowwood Viburnum  0         late spring
2   Bee Balm            0         late spring
3   Black-Eyed Susan    0         summer     
4   Coneflower          0         summer     
5   Goldenrod           0         late summer
Enter fullscreen mode Exit fullscreen mode

bee_plants

id  name        full_sun  bloom      
--  ----------  --------  -----------
1   Bee Balm    0         late spring
2   Lavender    1         summer     
3   Coneflower  0         summer     
4   Zinnia      1         late spring
Enter fullscreen mode Exit fullscreen mode

hummingbird_plants

id  name             full_sun  bloom      
--  ---------------  --------  -----------
1   Petunia          1         summer     
2   Bee Balm         0         late spring
3   Cardinal Flower  1         mid summer 
4   Garden Phlox     0         summer 
Enter fullscreen mode Exit fullscreen mode

These tables can also be visualized like this:

Representation of multiple plant tables

Now that the plants are organized into multiple tables, it would be nice to be able to compare the tables to each other. This can be done with compound SELECT statements. Specifically, with the INTERSECT, UNION, UNION ALL, and EXCEPT operators.

INTERSECT

INTERSECT is used to obtain the columns and rows that tables have in common.

SELECT name, bloom FROM native_plants
INTERSECT
SELECT name, bloom FROM bee_plants
Enter fullscreen mode Exit fullscreen mode
name        bloom      
----------  -----------
Bee Balm    late spring
Coneflower  summer
Enter fullscreen mode Exit fullscreen mode

In order for the INTERSECT query to work, the columns in each SELECT column must match. The SELECT statements must have the same number of columns, and the columns must return the same type of information so that SQLite can compare each column value.

Note that in this example, the SELECT statement is not using id or * to retrieve the id column from the tables. While the native_plants, bee_plants, and hummingbird_plants tables have some plants in common, those plants do not have same id numbers in each table. In bee_plants, Bee Balm has an id of 1 and Coneflower has an id of 3, whereas in native_plants, Bee Balm has an id 2 and Coneflower has an id of 4. So if the query included the id column, the Bee Balm and Coneflower rows from each table would not match, and the INTERSECT operator would return no results.

UNION and UNION ALL

To get the information from multiple tables, regardless of whether the tables rows are the same in each table, use the UNION operator.

SELECT name, bloom FROM bee_plants
UNION
SELECT name, bloom FROM hummingbird_plants
Enter fullscreen mode Exit fullscreen mode
name             bloom      
---------------  -----------
Bee Balm         late spring
Cardinal Flower  mid summer 
Coneflower       summer     
Garden Phlox     summer     
Lavender         summer     
Petunia          summer     
Zinnia           late spring
Enter fullscreen mode Exit fullscreen mode

The UNION operator returned the rows from the bee_plants table and the rows from the hummingbird_plants table, and it automatically removed duplicate rows. In order to return every row from both tables, even if a row appears in each table, the UNION ALL operator can be used.

SELECT name, bloom FROM bee_plants
UNION ALL
SELECT name, bloom FROM hummingbird_plants
Enter fullscreen mode Exit fullscreen mode
name             bloom      
---------------  -----------
Bee Balm         late spring
Lavender         summer     
Coneflower       summer     
Zinnia           late spring
Petunia          summer     
Bee Balm         late spring
Cardinal Flower  mid summer 
Garden Phlox     summer 
Enter fullscreen mode Exit fullscreen mode

Bee Balm appears twice in the result from the UNION ALL query because it exists in both the hummingbird_plants table and the bee_plants table.

EXCEPT

The EXCEPT operator can be used to retrieve rows from one table that do not appear in another table.

SELECT name, bloom FROM native_plants
EXCEPT
SELECT name, bloom FROM bee_plants
Enter fullscreen mode Exit fullscreen mode
name                bloom      
------------------  -----------
Arrowwood Viburnum  late spring
Black-Eyed Susan    summer     
Goldenrod           late summer
Enter fullscreen mode Exit fullscreen mode

Conclusion

Detailed documentation of SQLite's SELECT statements, including the operators discussed above and many other options for constructing SELECT statement, can be found on SQLite's website : https://sqlite.org/lang_select.html

πŸ’– πŸ’ͺ πŸ™… 🚩
jimgrimes86
Jim Grimes

Posted on November 13, 2023

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related