SET OPERATORS
Let's say we have a purple and a green data set. The data set is the result of a query. Using the SET operators, we can treat the rows in those two data sets as mathematical sets.
With joins, we concatenate tables horizontally. If we want to fuse tables vertically then we would use a union. For union to work we need to meet three conditions: – Two data sets must have the same number of rows. – Order of columns in one data set must be the same as order of the columns in another data set. – Columns at the same position must have similar data type. There are two kinds of unions, we have "UNION" and we have "UNION ALL". "UNION ALL" would just glue two data sets together. If the first data set has X rows, and the second one Y rows, final data set would have X + Y rows. "UNION" will return the same thing, but without duplicates. "INTERSECT" means that we are looking for the rows that belong both to the first and the second data set. "INTERSECT" will return only distinct rows. "EXCEPT" returns only rows from the first dataset, that do not have equivalents in the second data set. Again, only distinct rows will be returned. |
Sample tables
We will create two sample tables. Notice that columns in these tables are using compatible data types ( CHAR vs VARCHAR(1), TINYINT vs SMALLINT ).
1) CREATE TABLE aboveTable ( Letter CHAR, Number TINYINT ); 2) INSERT INTO aboveTable ( Letter, Number ) VALUES ( 'A', 1 ), ( 'A', 1 ), ( 'B', 2 ), ( 'B', 2 ); |
1) CREATE TABLE belowTable ( Letter VARCHAR(1), Number SMALLINT ); 2) INSERT INTO belowTable ( Letter, Number ) VALUES ( 'B', 2 ), ( 'B', 2 ), ( 'C', 3 ), ( 'C', 3 ); |
UNION ALL
Our tables are meeting conditions for a UNION ALL.SELECT * FROM aboveTable UNION ALL SELECT * FROM belowTable; |
UNION
UNION will remove duplicate rows.SELECT * FROM aboveTable UNION SELECT * FROM belowTable; |
INTERSECT
Intersect will give us overlap between two tables.SELECT * FROM aboveTable INTERSECT SELECT * FROM belowTable; |
EXCEPT
Let's see rows that exist in the aboveTable, but not in the belowTable.SELECT * FROM aboveTable EXCEPT SELECT * FROM belowTable; |
INTERSECT and EXCEPT with duplicates
If we can use ALL with UNION, is it possible to use it with INTERSECT and EXCEPT? It is possible. This way, all of the duplicates will remain.
SELECT * FROM aboveTable INTERSECT ALL SELECT * FROM belowTable; | SELECT * FROM aboveTable EXCEPT ALL SELECT * FROM belowTable; |
ORDER BY and SET operators
When we use ORDER BY, it must be applied on the last query.SELECT * FROM aboveTable |
Column names and SET operations
The above SELECT is determining names of columns. Names of columns in above and below SELECT don't have to be the same.SELECT Letter as L, Number as N |
Connecting More Than 2 Tables with Set Operators
How EXCEPT works
SELECT * FROM aboveTable | Chained operations are conducted in sequence manner. First we make an UNION, and then we apply EXCEPT. Let's see below steps that are leading to our results. |
1) In the first step, UNION ALL will unite our two tables. Next we have to do EXCEPT vs belowTable. | 2) In the second step, all duplicates from both tables will be removed. Then we would do EXCEPT operation. 3) The only row that exist in the first table, and not in the second table is a row "A-1". That is our result. |
How EXCEPT ALL works
SELECT * FROM aboveTable | When we use EXCEPT ALL, we would get 4 rows as a result. Let's see below steps that have leaded to this outcome. |
1) In the first step, UNION ALL will unite our two tables. Next, we have to do EXCEPT ALL vs belowTable. | 2) This time there is no DISTINCT. Rows from the two tables will destroy each other like matter and antimatter. What remains in the left table will be our final result. |
EXCEPT ALL with DISTINCT
Statement below will return table on the right image. Now we have 6 rows in the result data set.SELECT * FROM aboveTable | This time only two rows from the UNION ALL tables will be excluded so we will be left with 6 remaining rows. |
UNION AND INTERSECT
Here is one example with INTERSECT. Again, first two data sets will be connected with UNION ALL. Their result will be than intersected with the last data set.SELECT * FROM aboveTable |
1) We will start two intermediate data sets. First one is the result of UNION and it will have 8 rows. The other one is created with DISTINCT from the belowTable, and it has 2 rows. 2) Intersection between these two data sets is this: |
Two Conclusions
1) SET operators don't have priority among them. They are applied from the top to the bottom. This is different than in some other databases where INTERSECTION has the top priority. This would be the result if we enforce priority of an INTERSECT operator.
2) SET operators without ALL specifiers will first remove duplicates from their operands. SET operators with ALL specifiers will leave their operands untouched. This is true not only for EXCEPT (like in previous examples), but also for UNION and INTERSECT. The message here is that removal of the duplicates is not conducted on the result, but on the operands, and after that SET operators will apply their logic.
Corresponding
THIS IS NOT WORKING IN MONETDB, although documentation claims that it works.
When working with SET operators we need to be careful to order our columns correctly. Their position has to match between tables. This makes things harder. We will face errors many times. |
CORRESPONDING clause can fix this problem. CORRESPONDING clause will correctly pair columns so that UNION is successful. Unfortunately, statement on the right side will work in some other databases, but not in the MonetDB, so I will just stop explaining it. You can read more about this clause on this blog: https://blog.jooq.org/a-rarely-seen-but-useful-sql-feature-corresponding/ | SELECT Number, Letter FROM aboveTable |