0140 MonetDB – SET operators

SET OPERATORS

Let's say we have a purple and a green data set. The data set is the result of a query. Using the SET operators, we can treat the rows in those two data sets as mathematical sets.

With joins, we concatenate tables horizontally. If we want to fuse tables vertically then we would use a union. For union to work we need to meet three conditions:
– Two data sets must have the same number of rows.
– Order of columns in one data set must be the same as order of the columns in another data set.
– Columns at the same position must have similar data type.
 
There are two kinds of unions, we have "UNION" and we have "UNION ALL". "UNION ALL" would just glue two data sets together. If the first data set has X rows, and the second one Y rows, final data set would have X + Y rows.
 
"UNION" will return the same thing, but without duplicates.
 
"INTERSECT" means that we are looking for the rows that belong both to the first and the second data set. "INTERSECT" will return only distinct rows.

"EXCEPT" returns only rows from the first dataset, that do not have equivalents in the second data set. Again, only distinct rows will be returned.

Sample tables

We will create two sample tables. Notice that columns in these tables are using compatible data types ( CHAR vs VARCHAR(1), TINYINT vs SMALLINT ).

1) CREATE TABLE aboveTable ( Letter CHAR, Number TINYINT );
2) INSERT INTO aboveTable ( Letter, Number )
  VALUES ( 'A', 1 ), ( 'A', 1 ),
         ( 'B', 2 ), ( 'B', 2 );
1) CREATE TABLE belowTable ( Letter VARCHAR(1), Number SMALLINT );
2) INSERT INTO belowTable ( Letter, Number )
  VALUES ( 'B', 2 ), ( 'B', 2 ),
         ( 'C', 3 ), ( 'C', 3 );

UNION ALL

Our tables are meeting conditions for a UNION ALL.

SELECT * FROM aboveTable UNION ALL SELECT * FROM  belowTable;

UNION

UNION will remove duplicate rows.

SELECT * FROM aboveTable UNION SELECT * FROM belowTable;

INTERSECT

Intersect will give us overlap between two tables.

SELECT * FROM aboveTable INTERSECT SELECT * FROM belowTable;

EXCEPT

Let's see rows that exist in the aboveTable, but not in the belowTable.
 
SELECT * FROM aboveTable EXCEPT SELECT * FROM belowTable;

INTERSECT and EXCEPT with duplicates

If we can use ALL with UNION, is it possible to use it with INTERSECT and EXCEPT? It is possible. This way, all of the duplicates will remain.

SELECT * FROM aboveTable INTERSECT ALL SELECT * FROM  belowTable;SELECT * FROM aboveTable EXCEPT ALL SELECT * FROM  belowTable;

ORDER BY and SET operators

When we use ORDER BY, it must be applied on the last query.
 
SELECT * FROM aboveTable
UNION
SELECT * FROM belowTable ORDER BY Number DESC;

Column names and SET operations

The above SELECT is determining names of columns. Names of columns in above and below SELECT don't have to be the same.

SELECT Letter as L, Number as N
FROM aboveTable
UNION
SELECT * FROM belowTable ORDER BY N DESC;

Connecting More Than 2 Tables with Set Operators

How EXCEPT works

SELECT * FROM aboveTable
UNION ALL
SELECT * FROM belowTable
EXCEPT
SELECT * FROM belowTable;
Chained operations are conducted
in sequence manner. First we make
an UNION, and then we apply EXCEPT.
 
Let's see below steps that are leading to our results.
1) In the first step, UNION ALL will unite our two tables. Next we have to do EXCEPT vs belowTable.2) In the second step, all duplicates from both tables will be removed. Then we would do EXCEPT operation.
 
3) The only row that exist in the first table, and not in the second table is a row "A-1". That is our result.

How EXCEPT ALL works

SELECT * FROM aboveTable
UNION ALL
SELECT * FROM belowTable
EXCEPT ALL
SELECT * FROM belowTable;
When we use EXCEPT ALL, we would
get 4 rows as a result. Let's see below
steps that have leaded to this outcome.
1) In the first step, UNION ALL will unite our two tables. Next, we have to do EXCEPT ALL vs belowTable.2) This time there is no DISTINCT.

Rows from the two tables will destroy each other like matter and antimatter. What remains in the left table will be our final result.

EXCEPT ALL with DISTINCT

Statement below will return table on the right image. Now we have 6 rows in the result data set.

SELECT * FROM aboveTable
UNION ALL
SELECT * FROM belowTable
EXCEPT ALL
SELECT DISTINCT * FROM belowTable;
This time only two rows from the UNION ALL tables will be excluded so we will be left with 6 remaining rows.

UNION AND INTERSECT

Here is one example with INTERSECT. Again, first two data sets will be connected with UNION ALL. Their result will be than intersected with the last data set.

SELECT * FROM aboveTable
UNION ALL
SELECT * FROM belowTable
INTERSECT ALL
SELECT DISTINCT * FROM belowTable;
1) We will start two intermediate data sets. First one is the result of UNION and it will have 8 rows. The other one is created with DISTINCT from the belowTable, and it has 2 rows.
2) Intersection between these two data sets is this:

Two Conclusions

1) SET operators don't have priority among them. They are applied from the top to the bottom. This is different than in some other databases where INTERSECTION has the top priority. This would be the result if we enforce priority of an INTERSECT operator.

2) SET operators without ALL specifiers will first remove duplicates from their operands. SET operators with ALL specifiers will leave their operands untouched. This is true not only for EXCEPT (like in previous examples), but also for UNION and INTERSECT. The message here is that removal of the duplicates is not conducted on the result, but on the operands, and after that SET operators will apply their logic.

Corresponding

THIS IS NOT WORKING IN MONETDB, although documentation claims that it works.

When working with SET operators we need to be careful to order our columns correctly. Their position has to match between tables. This makes things harder. We will face errors many times.
CORRESPONDING clause can fix this problem. CORRESPONDING clause will correctly pair columns so that UNION is successful. Unfortunately, statement on the right side will work in some other databases, but not in the MonetDB, so I will just stop explaining it.

You can read more about this clause on this blog:
https://blog.jooq.org/a-rarely-seen-but-useful-sql-feature-corresponding/
SELECT Number, Letter FROM aboveTable
UNION ALL CORRESPONDING
SELECT Letter, Number FROM belowTable;

Leave a Comment

Your email address will not be published. Required fields are marked *