What is SQL - Part 3

marcegarba

Marcelo Garbarino

Posted on April 15, 2021

What is SQL - Part 3

Viewing, updating and deleting data in tables

This is part 3 of a four-part article which explains SQL

Retrieving data using the SELECT clause

  • As mentioned before, relations / tables / sets are an unordered collection of rows, or tuples.
  • Let's retrieve all the content in table products:
SELECT *
FROM products;
Enter fullscreen mode Exit fullscreen mode
  • SELECT *
    • This means brings all the columns; it could also be written as SELECT id, name
  • FROM products
    • List of tables to operate on. In this case, we're talking about table products
    • Here's a possible result:
id name
1 Sport shoes A
3 Suit C
2 Sport watch B
  • Why possible? Because there is no specific ordering of rows; it all depends on how the DB system stores the rows, the order in which rows were INSERTed, and others

Ordering

  • In order to establish a specific order, we must use the ORDER clause:
SELECT id, name
FROM products
ORDER BY id;
Enter fullscreen mode Exit fullscreen mode
id name
1 Sport shoes A
2 Sport watch B
3 Suit C
  • In reverse order, by name:
SELECT id, name
FROM products
ORDER BY name DESC;
Enter fullscreen mode Exit fullscreen mode
id name
3 Suit C
2 Sport watch B
1 Sport shoes A
  • The ORDER clause may list more than one column, or expression; in this simple case, it doesn't make much sense

Filtering

  • Let's see how to limit the number of rows, based on filtering conditions:
SELECT *
FROM products
WHERE id = 2;
Enter fullscreen mode Exit fullscreen mode
id name
2 Sport watch B
SELECT *
FROM products
WHERE name LIKE 'Sport %'
ORDER BY id ASC
Enter fullscreen mode Exit fullscreen mode
id name
1 Sport shoes A
2 Sport watch B
  • Here, a combination of WHERE and ORDER BY clauses:
    • WHERE name LIKE 'Sport %' means all rows where name starts with 'Sport':
    • The % symbol maps from zero to any number of characters
    • If the ? symbol were used, that would mean exactly one character, which would bring zero rows, as
  • ORDER BY id ASC: ASC for ascending, which is the default in the clause

Functions

  • SQL offers a number of standard functions, and in fact, there are two types of functions:
    • Common functions, which apply to expressions in rows
    • Aggregate and Window functions, which apply to more than one row at a time
Row functions
SELECT id, id * 2 AS double_id, upper(name) AS upper_name
FROM products
ORDER BY id;
Enter fullscreen mode Exit fullscreen mode
id double_id upper_name
1 2 SPORT SHOES A
2 4 SPORT WATCH B
3 6 SUIT C
  • Several things to notice here:
    • Expressions like id * 2 and functions like upper(name)
    • the AS key word is used to give names to the expressions; if it were not used, the calculated columns would have default names, which vary depending on the DB system used
Aggregate and window functions
SELECT COUNT(*) AS count
FROM products;
Enter fullscreen mode Exit fullscreen mode
count
3
SELECT id_product,
       COUNT(*) AS number_of_sales,
       SUM(amount) AS sum_amount
FROM sales
GROUP BY product
ORDER BY product;
Enter fullscreen mode Exit fullscreen mode
id_product number_of_sales sum_amount
1 4 575.72
2 5 2697.32
3 4 1903.75
  • Here aggregate functions COUNT(), SUM() are used
  • Notice the resulting rows and content:
    • Aggregate functions
    • COUNT(*): counts the number of rows, _according to the GROUP BY criterion
    • SUM(amount): calculates the sum of column amount according to the GROUP BY criterion
    • AVG(x): calculates the average
    • MAX(x), MIN(x): obtains the maximum or minimum value
    • There are other aggregate functions in standard SQL
  • The list of columns in the SELECT when using grouping functions, should either:
    • Be the result of an aggregate expression
    • Be named in the GROUP BY clause
  • If this is not the case, the query should fail
    • If it doesn't fail, for some DB engines, that's an issue, since it's not easy to spot a failed SELECT
    • PostgreSQL is very strict, which is a great thing; not so with other DB products
Filtering grouped rules
  • Instead of using the WHERE clause, for filtering on aggregated expressions, HAVING has to be used:
SELECT id_product,
       COUNT(*) AS number_of_sales,
       SUM(amount) AS sum_amount
FROM sales
GROUP BY product
HAVING COUNT(*) < 5
ORDER BY product;
Enter fullscreen mode Exit fullscreen mode
id_product number_of_sales sum_amount
1 4 575.72
3 4 1903.75

Combining results from more than one table

  • By using the JOIN clause, more than one table can be invoked in the SELECT statement.
  • Here is an example:
SELECT s.id_product,
       p.name,
       COUNT(*) AS number_of_sales,
       SUM(s.amount) AS sum_amount
FROM sales s
  JOIN products p
    ON s.id_product = p.id
GROUP BY s.id_product, p.name
HAVING COUNT(*) < 5
ORDER BY s.id_product;
Enter fullscreen mode Exit fullscreen mode
  • A few changes to notice:
    • FROM ... JOIN ... ON ...
    • FROM used to name one table,
    • JOIN names the second table
    • ON establishes the way both tables are related
    • Table name aliases:
    • sales s and products p; this is not necessary, but instead of typing sales.id_product, product.name and sales.amount, aliases help us to shorten the sentences
    • If the column names are unique among the invoked tables, there is no strict need to preface the column with the table or alias names, although it's a good practice. In fact, if in the future one of the column names are repeated, what worked so far will stop to do so, as the DB engine wouldn't know what table the column referrer to
  • the GROUP BY clause needs to include p.name; otherwise, an error should be triggered (again, not all DB engines do so)
  • Types of JOIN:
    • INNER JOIN: Rows from both tables must be present (the default)
    • LEFT JOIN: Rows from the left-side table must exist, for columns in the right-side table missing, NULL values are used
    • RIGHT JOIN: If rows from the left-side table do not exist, NULL values are used instead
    • CROSS JOIN: Cartesian product from both tables are retrieved, as long as the 'ON' clause is fulfilled

As an example of a CROSS JOIN, which is not much used, look at this simple SELECT:

SELECT p.id AS id_product,
       p.name AS name_product,
       s.id AS id_seller,
       s.name AS name_seller
FROM products p,
     sellers s
ORDER BY id_product, id_seller;
Enter fullscreen mode Exit fullscreen mode

Simply naming the tables without any filtering condition, is equivalent to a CROSS JOIN:

id_product name_product id_seller name_seller
1 Sport shoes A 234 John S.
1 Sport shoes A 281 Luisa G.
1 Sport shoes A 341 Mary T.
2 Sport watch B 234 John S.
2 Sport watch B 281 Luisa G.
2 Sport watch B 341 Mary T.
3 Suit C 234 John S.
3 Suit C 281 Luisa G.
3 Suit C 341 Mary T.

This SELECT is just an example of a CROSS JOIN result.

Updating table contents with UPDATE

  • The UPDATE sentence is used for updating rows
  • An example follows:
UPDATE sellers
SET name = 'María T.'
WHERE id = 341;
Enter fullscreen mode Exit fullscreen mode
  • After this change, look at the table contents:
SELECT *
FROM sellers
ORDER BY id;
Enter fullscreen mode Exit fullscreen mode
id name
234 John S.
281 Luisa G.
341 María T.
  • Important points:
    • Limit the UPDATE with a WHERE; otherwise, the update will impact all rows in a table, all of them, in one transaction
    • Not all updates are guaranteed to success; for instance, in the schema with all constraints in place, this UPDATE will fail:
UPDATE sellers
SET id = 235
WHERE id = 234;
Enter fullscreen mode Exit fullscreen mode
  • If you try this, you'll see that it fails, as there are relational integrity constraints which prevent tables sellers and products to update their PK if there is at least one row in the sales table that point to those values, as is the case in the example here
    • See the error with PostgreSQL:
ERROR:  update or delete on table "sellers" violates foreign     
        key constraint "sales_seller_fkey" on table "sales"
DETAIL: Key (id)=(234) is still referenced from table "sales".
Enter fullscreen mode Exit fullscreen mode
  • See it with MariaDB:
ERROR 1451 (23000): Cannot delete or update a parent row:
      a foreign key constraint fails
      (`course`.`sales`,
       CONSTRAINT `sales_ibfk_2`
       FOREIGN KEY (`id_seller`) REFERENCES `sellers` (`id`)
      )
Enter fullscreen mode Exit fullscreen mode

Deleting table contents with DELETE

  • The DELETE DML sentence is used to remove rows from tables
  • Here's an example:
DELETE
FROM sales
WHERE quantity BETWEEN (4 AND 5);
Enter fullscreen mode Exit fullscreen mode
  • After deleting these rows, the contents of the table are:
SELECT id_product,
       id_seller,
       date,
       quantity,
       amount
FROM sales
ORDER BY id_product, id_seller;
Enter fullscreen mode Exit fullscreen mode
id_product id_seller date quantity amount
1 234 2020-06-10 2 148.34
1 234 2020-01-23 2 142.38
1 234 2020-03-01 1 75.00
1 341 2020-01-17 3 210.00
2 234 2020-12-25 1 220.00
2 341 2020-01-31 1 215.48
2 341 2020-12-01 2 448.50
3 281 2020-04-15 1 350.00
3 281 2020-05-13 2 605.25
3 341 2020-02-15 1 348.50
3 341 2020-11-18 2 600.00
  • As is the case with the failed UPDATE and constraints are in place, certain DELETE queries will fail
  • For instance:
DELETE
FROM sales
WHERE id = 234;
Enter fullscreen mode Exit fullscreen mode
  • If the FOREIGN KEY constraints established in part 2 for table sales had the ON UPDATE CASCADE ON DELETE CASCCADE, neither the UPDATE nor the DELETE would fail:
    • For the UPDATE, it would modify the values of the foreign keys in table sales
    • For the DELETE, it would also remove rows from the sales table
💖 💪 🙅 🚩
marcegarba
Marcelo Garbarino

Posted on April 15, 2021

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related