0490 Grouping Sets and Comments in MonetDB
Sample Table
CREATE TABLE tabSales( Continent VARCHAR(20), Subcontinent VARCHAR(20), Country VARCHAR(20), State VARCHAR(30), Sales INT ); |
INSERT INTO tabSales VALUES ('America', 'North', 'Canada', 'Ontario', 1), | ![]() |
The Problem
![]() | When we create a Pivot table, from the sample table, we will see all of the detail sales (1,2,4,8,16), but we will also see totals (3,12,16,31). The question is, what query would return all of these numbers, both detail values and totals, if we use MonetDB. |
This is one possible solution:SELECT Country, State, Sales FROM tabSales UNION ALL SELECT Country, null, SUM( Sales ) FROM tabSales GROUP BY Country UNION ALL SELECT null, null, SUM( Sales ) FROM tabSales ; | ![]() | On the image, the rows are sorted so that the table looks like the pivot table. |
UNION ALL solution is bad for several reasons:
1) We have three queries to execute and then to combine multiple result sets into one.
2) It is hard to read and modify long UNION ALL query.
3) We have to be careful to properly align columns.
This is the problem that can be solved by grouping sets.
Grouping Sets
"Grouping Sets" are much better and faster syntax to achieve the same goal. SELECT Country, State, SUM( Sales ) Empty parentheses are for the grand total. | ![]() | We'll get the same result, except the repetition of country names is reduced. Instead of them we have nulls. |
Look what we will get if we place parentheses around Country and State. SELECT Country, State, SUM( Sales ) Parentheses are there to define each group. | ![]() |
We can see the effect of parentheses better on this example. SELECT Continent, Subcontinent, SUM( Sales ) | ![]() Continent can be used by itself, but it can be also used in conjunction with Subcontinent to define a group. |
It is now clear that each element inside GROUPING SETS is a separate definition of a group. Each group can be defined by one column > Continent <, or by several columns placed inside of the parentheses ( Continent, Subcontinent ).
These two examples, that would return the same result, show the logic and brevity of the grouping sets. | SELECT Col1, Col2, SUM( Sales ) | SELECT Col1, Col2, Sales FROM Table SELECT Col1, null, SUM( Sales ) FROM Table GROUP BY Col1; |
Rollup
SELECT Continent, Subcontinent, Country, SUM( Sales ) GROUP BY ROLLUP( Continent, Subcontinent, Country ); | ROLLUP( a, b, c ) is the same as grouping sets "( a, b, c ), ( a, b ), ( a ), ()". This is a way to get hierarchy of the columns. Rollup will give us all of the numbers that we need to create a pivot table. |
![]() | For ROLLUP, the order of the columns is important.ROLLUP( a, b, c ) ROLLUP( b, a, c ) |
Similar to GROUPING SETS, ROLLUP can also create combinations of columns by using parentheses. SELECT Subcontinent, Country, State, SUM( Sales ) | ![]() |
CUBE
CUBE works similar to ROLLUP, but have a different logic. CUBE will give us all of the possible combinations. CUBE( a, b, c ) will give us 2^3 grouping sets "(a,b,c), (a,b), (a,c), (b,c), (a), (b), (c) and ()". ———————————————————————————- SELECT Subcontinent, Country, SUM( Sales ) FROM tabSales GROUP BY CUBE( Subcontinent, Country ); Because we have only 2 columns inside of CUBE in our example, number of combinations is 2^2 = 4 "(a,b), (b), (a), ()". | ![]() |
We can also define groups by using parentheses. SELECT Subcontinent, Country, SUM( Sales ) FROM tabSales We now have only one element. We have only two ( 2^1 ) groups "(a), ()". | ![]() |
Addition and Multiplication in Grouping Sets
This is addition: ( a, b ) | This is multiplication. Multiplication is like crossjoin.( a, b ) |
Syntax for addition is like this. Everything inside of the GROUPING SETS parentheses will be added to each other. In this example we will add ( Subcontinent ) + ( Country ) + ( ). SELECT Subcontinent, Country, SUM( Sales ) | ![]() |
So, if we create GROUPING SETS like this, this will be addition.GROUPING SETS ( Continent, ROLLUP( Continent, Subcontinent ), CUBE( Country, State ), () ) |
Addition can create duplicates:SELECT Continent FROM tabSales | ![]() | CUBE will create Grand Total, but we will also get grand total from the "( )" element. |
This is a syntax for multiplication. This time we will have commas between GROUPING SETS, ROLLUPS and CUBES, and individual elements. GROUPING SETS ( Continent ), ROLLUP( Continent, Subcontinent ), CUBE( Country, State ), (),Country |
This example will give us 2 x 2 = 4 rows. ROLLUP will give us America, and "( )". GROUPING SETS will give us "North" and "Central". Then we combine them 2 x 2. SELECT Continent, Subcontinent, SUM( Sales ) FROM tabSales | ![]() |
Multiplication can also create duplicates:SELECT Continent, SUM( Sales ) FROM tabSales | ![]() |
Indicator Function – GROUPING
GROUPING function will inform us what rows are subtotals / grand total. In such rows, some columns have nulls because they are consolidated. GROUPING function has an argument which is a column, and GROUPING function will return the result only for that column.SELECT Continent, GROUPING( Continent ) AS ContiGroup, Subcontinent, GROUPING( Subcontinent ) AS SubcontiGroup , SUM( Sales ) |
![]() This function is important because it help us to make distinction between subtotal nulls, and missing data nulls. |
Formatting with COALESCE and Sort
This is not good looking table. Let's fix it.SELECT Subcontinent, State, SUM( Sales ) | ![]() |
COALESCE will helps us to eliminate NULLS:
| ![]() |
With GROUPING function, we can create columns that will help us to sort the table.
SELECT COALESCE( Subcontinent, 'Grand' ) AS Subcontinent | ![]() |
These auxiliary columns ( SubcSort and StateSort ) can be easily eliminated by wrapping everything with "SELECT Subcontinent, State, Sales".
Comments
Sample Table and Function
Let's create two tables and function.CREATE TABLE tabComment( Number INTEGER ); | CREATE OR REPLACE FUNCTION funcComment( Arg1 INTEGER ) |
Comments on Database Objects
We can create comments that are tied for database objects. Comments convey information about that object. | COMMENT ON TABLE tabComment IS 'tabComment description'; COMMENT ON COLUMN tabComment.Number IS 'Number column description'; COMMENT ON FUNCTION funcComment IS 'funcComment description'; |
We will then find IDs of our database objects:SELECT * FROM sys.tables WHERE name = 'tabcomment'; 15876 15875 15881
2000 | All of these IDs can be found in the system table "sys.comments " together with their comments. SELECT * FROM sys.comments ![]() |
Deleting a Comment
If we delete an object, its comment will be deleted. DROP TABLE tabComment; SELECT * FROM sys.comments WHERE Id = 15876; | ![]() |
We can delete a comment by setting it to NULL or an empty string. COMMENT ON SCHEMA sys IS null; | ![]() |
If a function is overloaded then we have to provide the full signature. COMMENT ON FUNCTION funcComment( INTEGER ) IS ''; | ![]() |
Persistent Database Objects
There are other database objects that we can place a comment on. They are all persistent database objects. | COMMENT ON VIEW view_name IS 'Comment'; COMMENT ON AGGREGATE aggregate_name IS 'Comment'; |
We can not create a comment on a temporary object.COMMENT ON TABLE tabTemporary IS 'tabTemporary description'; | ![]() |