MonetDB

006. MonetDB – Data Types

Strings

All CHARACTER data types use UTF-8 encoding.

Name | Aliases | Description
CHARACTER | CHARACTER(1), CHAR, CHAR(1) | 0 or 1 character.
CHARACTER (length) | CHAR (length) | Fixed length. String is returned without padding spaces, but it is stored with padding spaces.
CHARACTER VARYING (length) | VARCHAR (length) | "Length" is a maximal number of characters for this string.
CHARACTER LARGE OBJECT | CLOB, TEXT, STRING | String of unbounded length.
CHARACTER LARGE OBJECT (length) | CLOB (length), TEXT (length), STRING (length) | String with maximal number of characters.

CLOB(N) is similar to VARCHAR(N), but it can hold much bigger strings, although it seems that in MonetDB there is no difference between them.

Binary objects

Name | Aliases | Description
BINARY LARGE OBJECT | BLOB | Binary objects with unbounded length.
BINARY LARGE OBJECT ( length ) | BLOB ( length ) | Binary objects with maximal length.

Numbers

The Boolean data type can be considered as 0 or 1, so all data types below are numeric. "Prec(ision)" is the total number of digits. "Scale" is the number of digits after the decimal point. For the number 385.26, "Prec" is 5 (3+2) and "Scale" is 2. For all numeric data types, precision can be at most 18 (or 38 on Linux).
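To make the two terms concrete, here is a minimal sketch in Python. It uses Python's standard decimal module, not MonetDB itself, and the helper name precision_and_scale is made up for this illustration. It counts the digits of the number 385.26 in the same way as the paragraph above.

```python
from decimal import Decimal

def precision_and_scale(text):
    """Return (precision, scale) of a decimal literal such as '385.26'."""
    sign, digits, exponent = Decimal(text).as_tuple()
    scale = max(-exponent, 0)            # digits after the decimal point
    precision = max(len(digits), scale)  # total number of digits
    return precision, scale

print(precision_and_scale("385.26"))  # (5, 2)
```

A column that should hold this value would therefore need at least DECIMAL(5,2).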

Name | Aliases | Description
BOOLEAN | BOOL | True or False.
TINYINT | | Integer between -127 and 127 (8 bit).
SMALLINT | | Integer between -32767 and 32767 (16 bit).
INTEGER | INT, MEDIUMINT | Integer between -2147483647 and 2147483647 (32 bit).
BIGINT | | 64 bit signed integer.
HUGEINT | | 128 bit signed integer.
DECIMAL | DEC, NUMERIC | Decimal number, where "Prec" is 18 and "Scale" is 3.
DECIMAL ( Prec ) | DEC ( Prec ), NUMERIC ( Prec ) | Zero decimals, but we decide on the total number of digits ("Prec").
DECIMAL ( Prec , Scale ) | DEC ( Prec , Scale ), NUMERIC ( Prec , Scale ) | We decide on "Prec(ision)" and "Scale".
REAL | FLOAT(24) | 32 bit approximate number.
DOUBLE PRECISION | DOUBLE, FLOAT, FLOAT(53) | 64 bit approximate number.
FLOAT ( Prec ) | | FLOAT(24) is the same as REAL, FLOAT(53) is the same as DOUBLE PRECISION. In this case precision can only be between 1 and 53, because this is a special kind of precision (binary (radix 2) precision).
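Binary precision can be hard to picture, so here is a small Python sketch that translates it into decimal digits. This is plain arithmetic done in Python, not a MonetDB feature, and the function name decimal_digits is only an illustration. It also sends 0.1 through a 32 bit float to show why these types are called approximate.

```python
import math
import struct

def decimal_digits(binary_precision):
    """Approximate count of decimal digits that fit into the given number of bits."""
    return math.floor(binary_precision * math.log10(2))

print(decimal_digits(24))  # 7  -> REAL, FLOAT(24)
print(decimal_digits(53))  # 15 -> DOUBLE PRECISION, FLOAT(53)

# Round-trip 0.1 through a 32 bit float to see the approximation.
as_real = struct.unpack("<f", struct.pack("<f", 0.1))[0]
print(as_real == 0.1)  # False
```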

Time

These are time data types. "Prec(ision)" now has a different meaning. "Prec" is the number of digits used for the fraction of a second. For 33.7521 seconds, "Prec" is 4. In all cases below, "Prec" has to be between 0 and 6.

Name | Aliases | Description
DATE | | Date YYYY-MM-DD.
TIME | TIME(0) | Time of day HH:MI:SS.
TIME ( Prec ) | | TIME with fraction of a second (HH:MI:SS.ssssss).
TIME WITH TIME ZONE | TIME(0) WITH TIME ZONE | TIME of day with a time zone (HH:MI:SS+HH:MI).
TIME ( Prec ) WITH TIME ZONE | | Same as above, but now with fraction of a second (HH:MI:SS.ssssss+HH:MI).
TIMESTAMP | TIMESTAMP(6) | Combination of a DATE and TIME(6) (YYYY-MM-DD HH:MI:SS.ssssss).
TIMESTAMP ( Prec ) | | Same as above, but we decide on "Prec(ision)".
TIMESTAMP WITH TIME ZONE | TIMESTAMP(6) WITH TIME ZONE | TIMESTAMP(6) with a time zone (YYYY-MM-DD HH:MI:SS.ssssss+HH:MI).
TIMESTAMP ( Prec ) WITH TIME ZONE | | Same as above, but we decide on "Prec(ision)".

INTERVAL

An interval is the difference between two dates or times. There are two units of measure for intervals. One is the number of months. The other is a duration expressed in seconds, with millisecond precision. These two kinds cannot be mixed because months have varying numbers of days.

There are three data types that are measured in months: YEAR, MONTH and YEAR TO MONTH.

SELECT INTERVAL '3' YEAR AS "ThreeYears"
     , INTERVAL '36' MONTH AS "ThirtySixMonths"
     , INTERVAL '0003-01-01' YEAR TO MONTH AS "ThreeYearsAndOneMonth";

If we use seconds as the unit of measure, then we have 10 data types:

INTERVAL DAY | INTERVAL DAY TO HOUR | INTERVAL DAY TO MINUTE | INTERVAL DAY TO SECOND | INTERVAL HOUR
INTERVAL HOUR TO MINUTE | INTERVAL HOUR TO SECOND | INTERVAL MINUTE | INTERVAL MINUTE TO SECOND | INTERVAL SECOND

SELECT INTERVAL '1' DAY AS "Day"                                  --1*24*60*60
     , INTERVAL '1 01' DAY TO HOUR AS "DayToHour"                 --DAY+60*60
     , INTERVAL '1 01:01' DAY TO MINUTE AS "DayToMinute"          --DAY+60*60+60
     , INTERVAL '1 01:01:01.333' DAY TO SECOND AS "DayToSecond"   --DAY+60*60+60+1.333
     , INTERVAL '1' HOUR AS "Hour"                                --60*60
     , INTERVAL '01:01' HOUR TO MINUTE AS "HourToMinute"          --HOUR+60
     , INTERVAL '01:01:01.333' HOUR TO SECOND AS "HourToSecond"   --HOUR+60+1.333
     , INTERVAL '1' MINUTE AS "Minute"                            --60
     , INTERVAL '01:01.333' MINUTE TO SECOND AS "MinuteToSecond"  --60+1.333
     , INTERVAL '15.333' SECOND AS "Second"                       --15.333
     ;

For the interval types measured in seconds, the maximal precision is milliseconds. The result is always expressed with three decimals.
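The arithmetic in the comments of the query above can be checked with a short Python sketch. It uses Python's own timedelta type as a stand-in, so it only illustrates the calculation and says nothing about how MonetDB stores intervals internally.

```python
from datetime import timedelta

# INTERVAL '1 01:01:01.333' DAY TO SECOND
day_to_second = timedelta(days=1, hours=1, minutes=1, seconds=1, milliseconds=333)
print(f"{day_to_second.total_seconds():.3f}")  # 90061.333

# INTERVAL '01:01.333' MINUTE TO SECOND
minute_to_second = timedelta(minutes=1, seconds=1, milliseconds=333)
print(f"{minute_to_second.total_seconds():.3f}")  # 61.333
```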

For "YEAR TO MONTH" we can also write "SELECT INTERVAL '2-5' YEAR TO MONTH".
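Because these intervals are measured in months, a value such as '2-5' boils down to one number. The Python function below is a hypothetical helper written only for this illustration. It is not part of MonetDB or pymonetdb.

```python
def year_to_month(text):
    """Convert a 'years-months' string such as '2-5' to a number of months."""
    years, months = text.split("-")
    return int(years) * 12 + int(months)

print(year_to_month("2-5"))  # 29
print(year_to_month("3-0"))  # 36, the same length as INTERVAL '36' MONTH
```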

TIME ZONES

A timestamp is a combination of a date and a time. The time inside a timestamp is kept without a daylight saving time (DST) regime. This time should represent Greenwich time.

To get the correct local time, we should provide a time zone with each database connection, so that Greenwich time is transformed to local time. The times '15:16:55+02:00' and '14:16:55+01:00' present the same moment to users in different time zones. Both of them represent the Greenwich time '13:16:55+00:00', because 15 – 2 = 13 and 14 – 1 = 13.
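The same calculation can be sketched in Python with the standard datetime module. The date used here (2024-01-01) is an arbitrary assumption, because Python needs a full date in order to compare moments in time. Only the time of day and the offset come from the text.

```python
from datetime import datetime, timedelta, timezone

# The date is arbitrary; only the time of day and the offset matter here.
plus_two = datetime(2024, 1, 1, 15, 16, 55, tzinfo=timezone(timedelta(hours=2)))
plus_one = datetime(2024, 1, 1, 14, 16, 55, tzinfo=timezone(timedelta(hours=1)))

print(plus_two == plus_one)                      # True: both are the same moment
print(plus_two.astimezone(timezone.utc).time())  # 13:16:55
```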

If we want, we can change the time zone setting of our connection by issuing the statement "SET TIME ZONE INTERVAL '01:00' HOUR TO MINUTE".
The statement "SELECT CURRENT_TIMEZONE" tells us what our current time zone is.

005. MonetDB – Identifiers and Constants

Comments

There are two types of comments. One-line comments start with "--" (two dashes) and they end at the end of the line. First, we will start the server we previously created with "monetdbd start /home/fffovde/DBfarm1". We will also start our mclient application with "mclient -u voc -d voc". Then we can try a comment like this:

sql>SELECT * 
more>FROM --some comment
more>total limit 5; 

We can also use multi-line comments. They start with "/*" and end with "*/".

sql>SELECT /*
more>some comment
more>here */ * FROM total LIMIT 5;

Identifiers and Keywords

Identifiers and Keywords are not case sensitive.  

If we have an identifier that is the same word as a keyword, then we should place double quotes around it. We can have a column named SELECT if this identifier is inside double quotes. Quoting also allows spaces and special characters inside our identifier.

SELECT 'value' as "SELECT.;' ";    

Identifiers cannot start with the % symbol.

SELECT 'value' as "%columnName";   -- not allowed, starts with %
SELECT 'value' as "columnName";    -- allowed

Constants

String constants are delimited with single quotes like 'string data'. If our text contains single quotes then such single quotes should be doubled, like 'O''Connor'.

SELECT 'O''Connor' AS columnName;  
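If we build such constants from a program, the doubling rule is easy to apply. The Python function below is only a sketch of that rule, and its name sql_string_literal is made up for this example. It does not handle backslashes, and for real queries it is safer to pass values as query parameters than to paste them into the SQL text.

```python
def sql_string_literal(text):
    """Wrap text in single quotes, doubling every single quote inside it."""
    return "'" + text.replace("'", "''") + "'"

print(sql_string_literal("O'Connor"))  # 'O''Connor'
```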

We can use Unicode code points to create constants.

sql>SELECT U&'\0441\043F\0430\0441\0438\0431\043E' as "thank you";  

With UESCAPE it is possible to change the default escape character:

sql>SELECT U&'*0441*043F*0430*0441*0438*0431*043E' UESCAPE '*' as "thank you";  
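To see which text these codes produce without running the query, we can decode them in Python. This is only a sketch of the idea under the assumption that every code has exactly four hexadecimal digits, as in the two statements above. The function decode_unicode_escapes is a hypothetical helper, not something that MonetDB provides.

```python
import re

def decode_unicode_escapes(body, escape="\\"):
    """Replace each escape character followed by four hex digits with its character."""
    pattern = re.escape(escape) + r"([0-9A-Fa-f]{4})"
    return re.sub(pattern, lambda m: chr(int(m.group(1), 16)), body)

print(decode_unicode_escapes(r"\0441\043F\0430\0441\0438\0431\043E"))
print(decode_unicode_escapes("*0441*043F*0430*0441*0438*0431*043E", escape="*"))
# both lines print: спасибо
```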

Time Constants

These constants can be typed as strings, but will still be recognized as time constants.

'2014-02-03' | CAST('2014-02-03' AS DATE)
'15:45:56' | CAST('15:45:56' AS TIME)
'2014-02-03 15:45:56' | CAST('2014-02-03 15:45:56' AS TIMESTAMP)

Special Characters

Inside of strings we can use these special characters.

\t – TAB.
\n – new line.
\r – carriage return.
\f – form feed.
\' – single quote.
\\ – backslash.

If we want to disable this behavior, we can use raw strings. We just type "R" before the string, as in R'a\tb', and escape sequences will be ignored.

Data

Data is expressed as a scalar or as a table. Scalars are constants, column references, operator results, function results, and subqueries that return one value. A column reference is written as "tableName.columnName". We can omit "tableName" and only write "columnName" if there is no ambiguity.

A table name is written as "schemaName.tableName". If there is no ambiguity, we can write only "tableName".

We can also reference tables and columns through their aliases.

004. Connect to MonetDB from Python

Installation of Pymonetdb Module

Python comes preinstalled on most Linux distributions. We can check the version of our Python with this command:

python3 --version           

Now that we know that Python is installed, we can install the Python module that we will use to connect to MonetDB. First, we will update the list of available software packages and check whether the pymonetdb module is available:

sudo apt update
apt search pymonetdb

We will notice that there are two versions of the pymonetdb module. The former is for Python 2 and the latter is for Python 3.

Because Ubuntu's repository has the appropriate pymonetdb module, we can install it. For the installation we need pip. Pip is a console program used for installing Python modules. So, first we need to install pip:

sudo apt install python3-pip                      

After installing pip, we will use it to install the pymonetdb module. Pip will know which version of the module to install (the one for Python 3):

pip install pymonetdb  

The pymonetdb module is now installed.

Installing Spyder IDE on Ubuntu

Now we can try to connect to MonetDB from Python. For that, I will type Python commands into the Spyder IDE. First we have to install the Spyder IDE on Ubuntu.

sudo apt install spyder                  

We can then start Spyder from the graphical interface (1). This is what Spyder looks like (2):

Spyder is a free and open source scientific environment for Python.

Python Script to Connect to MonetDB

Inside the Spyder IDE, I will add this script. The script first creates a connection object. Using that connection object, we create a cursor object. Then we can use the cursor object to execute our query.

import pymonetdb

# Connect to the "voc" database as the user "voc"
connection = pymonetdb.connect(username="voc", password="voc", hostname="localhost", database="voc")
cursor = connection.cursor()
cursor.execute('SELECT * FROM voc.total')
# fetchall() returns a list of tuples, one tuple per row
for row in cursor.fetchall():
    print(row)
connection.close()

The result of our query will be a list of tuples (like [(a,b,c),(1,2,3)]), where each tuple is one row of the table. We use a for loop to print those rows one by one. At the end, the Spyder console (1) will show us the result.
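The connection, cursor and fetchall() pattern is not specific to MonetDB. It comes from the Python DB-API 2.0, which pymonetdb is documented to follow. The sketch below uses sqlite3 from the standard library as a stand-in, so it can be run without a MonetDB server. The function fetch_rows and the small two-column "total" table are assumptions made only for this example and do not match the real voc.total table.

```python
import sqlite3  # stand-in for pymonetdb, so the sketch runs without a server

def fetch_rows(connection, query):
    """Run a query through a DB-API cursor and return the rows as a list of tuples."""
    cursor = connection.cursor()
    try:
        cursor.execute(query)
        return cursor.fetchall()
    finally:
        cursor.close()

# Hypothetical two-column table, only here to have something to select.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE total (number INTEGER, name TEXT)")
connection.execute("INSERT INTO total VALUES (1, 'a'), (2, 'b')")

rows = fetch_rows(connection, "SELECT * FROM total ORDER BY number")
for row in rows:
    print(row)  # (1, 'a') and then (2, 'b')
connection.close()
```

With a pymonetdb connection in place of the sqlite3 one, the same function should work unchanged, as long as the query uses a table that exists in our database.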

Pymonetdb Help

If you want to learn more about pymonetdb, you can go to the official documentation at this address:

https://pymonetdb.readthedocs.io/en/latest/index.html

003. Install Sample Database in MonetDB

Download of Schema

First, we will download sample database from this location:
https://dev.monetdb.org/Assets/VOC/voc_dump.zip

This will give us a ZIP file. Inside of it, there is an SQL script with the schema for our new database.

What is a Schema?

A database is made of tables, views, indices and so on. Inside a database, each of these objects belongs to some schema. The database is organizationally divided into schemas, and there is no overlap between schemas. There can be some special objects that are outside of any schema, like roles.

Creation, usage and modification of schema elements is strictly done by the user or the role that owns that schema. During the creation of a schema, we should decide who will be the owner, because later it will not be possible to change ownership. If we want several people to own one schema, then we should set a role as the owner of the schema.

Only the 'monetdb' user and the 'sysadmin' role are allowed to create new schemas.

Creation of a New User

For our schema we will create a new user. First, we will enter mclient with 'monetdb' privileges.

> mclient -u monetdb -d voc           

To create a user, we need a username (USER), a password (WITH PASSWORD), the user's full name (NAME), and a default schema for that user (SCHEMA). If a user wants to use a table from some schema, they can use the schema as a prefix, as in "myschema.mytable". If a user doesn't provide a schema as a prefix, then the default schema will be used.

sql> CREATE USER "voc" WITH PASSWORD 'voc' NAME 'VOC Explorer' SCHEMA "sys";

As the 'monetdb' user we can create a new schema. We will make the previously created "voc" user the owner of that schema.

sql> CREATE SCHEMA "voc" AUTHORIZATION "voc";   

We will set that new schema as the default schema for our user.

sql> ALTER USER "voc" SET SCHEMA "voc";  
sql> \q   -- we can exit mclient with "quit" or "\q"

Populating our Schema with Database Objects

Our schema is currently empty, but we have definitions of all the tables, views, indices and so on inside of our downloaded SQL script. We will use that script to populate our schema. We type:

> mclient -u voc -d voc ~/Desktop/voc_dump.sql   

We can use mclient command "\d" to list all the tables and views inside of our database. This is what we would get:

sql> \d
TABLE voc.craftsmen
TABLE voc.impotenten
TABLE voc.invoices
TABLE voc.passengers
TABLE voc.seafarers
TABLE voc.soldiers
TABLE voc.total
TABLE voc.voyages

DBeaver Database Manager Program

We will install the DBeaver database manager program in order to peruse our database.

> sudo snap install dbeaver-ce      

This is a GUI program, so we will open it in our desktop environment. In the program we will click on the arrow (1), and then we will select Other (2) to open other servers. There we will find MonetDB. We will click on it (3), and a new dialog will open. In this dialog, the host and port are already entered. We just need to enter Database/Schema (4), and Username and Password (both are "voc") (5).

DBeaver will not have the driver for MonetDB installed, so it will offer to download it. Just accept that offer.

At the end, the objects of our database will appear inside the pane on the left side of the program. There, we should double-click on the schema name (1). After that we can select the "ER Diagram" tab (2). There, after some rearrangement, we will see the ER diagram of our database (3). As we can see, the tables are organised in a star schema with the "voyages" table as the only fact table. All tables are connected with foreign key constraints, where the foreign key is also the primary key inside the dimension tables. The only exception is the "invoices" table, where the foreign key columns are not primary key columns, and that is why that relationship is shown with a dashed line (4).

DBeaver is an excellent program. If you want to learn more about it, you can try this LinkedIn tutorial.
https://www.linkedin.com/learning/dbeaver-essential-training?trk=learning-topics_trending-courses_related-content-card&upsellOrderOrigin=default_guest_learning

002. Creation of MonetDB Database

General architecture

A MonetDB database is served by the MonetDB daemon (service). This daemon can control several servers. Each server is defined inside one directory that we create. The common name for such a directory is "dbfarm". Besides the configuration files for that server, all of the databases of that server will also be placed inside subfolders of the dbfarm directory.

Server creation

When we want to create a new server, the first thing to do is to create a folder for that server.

> monetdbd create /home/fffovde/DBfarm1

Inside that folder a new file with the name ".merovingian_properties" will be created. This file holds the properties of our server.

The Merovingian dynasty was the ruling family of the Franks from the mid-5th century until 751. This dynasty ruled the Netherlands, the country from which MonetDB originates. MonetDB is using this term for some of its internal files and commands.

If we take a look inside this file, we will find only one property:

> cat .merovingian_properties
# DO NOT EDIT THIS FILE - use monetdb(1) and monetdbd(1) to set properties
# This file is used by monetdbd
control=false

All other properties use default values. We can read those default values with this command:

> monetdbd get all /home/fffovde/DBfarm1
property             value
hostname             FffOvdeKomp
dbfarm               /home/fffovde/DBfarm1
status               no monetdbd is serving this dbfarm
mserver              unknown (monetdbd not running)
logfile              /home/fffovde/DBfarm1/merovingian.log
pidfile              /home/fffovde/DBfarm1/merovingian.pid
loglevel             information
sockdir              /tmp
listenaddr           localhost
port                 50000
exittimeout          60
forward              proxy
discovery            true
discoveryttl         600
control              no
passphrase           <unset>
snapshotdir          <unset>
snapshotcompression  .tar.lz4
mapisock             /tmp/.s.monetdb.50000
controlsock          /tmp/.s.merovingian.50000

***********************************************

The MonetDB daemon uses port 50000 by default. It is possible to have several MonetDB daemons, and their port numbers could collide. If we have several MonetDB daemons, then we should immediately change the default port number to some unused port number.

We can set another port number by running this command:

> monetdbd set port=12345 /home/fffovde/DBfarm1

***********************************************

If our monetdbd service is already running, we should stop it in order to release port 50000.

> systemctl stop monetdbd
> systemctl disable monetdbd

Now we can start our server. Our new server will use default port 50000.

> monetdbd start /home/fffovde/DBfarm1

If we now look inside our DBfarm1 directory, we will see all of these files.

The file ".merovingian_lock" is empty. This file probably just signals that there is a server inside the dbfarm directory.

The file "merovingian.pid" contains the number 2436. This is the process ID of the monetdbd process. If we use the command "sudo netstat -tulnp" to show all listening ports, we will see the name monetdbd beside process 2436, and this process will be listening on port 50000.

We can also read the content of the log file. There we will see whether our action succeeded.

After this step we no longer have to use the monetdbd command. We can just use monetdb (without the d).

Database creation

We will create a new database in this way. The database is created in "maintenance mode", so no one can access it before we properly configure it. This command will create a new folder with the name "voc" inside our DBfarm1 directory.

monetdb create voc

Now we can start our database, so that only members of the monetdb group can access it.

monetdb start voc                  

We can check status of our database with this command (50000 is port number):

monetdb -p50000 status   

The last step is to make this database available to all users:

monetdb release voc   

Making queries

"mclient" is the application that users use to send queries to databases. We have to provide the name of a user with the "-u" switch, and the name of a database with the "-d" switch. Everyone who is in the "monetdb" group can use the "monetdb" username. We will be asked for a password, and the default password is "monetdb". At the bottom we can notice the "sql>" prompt. This is where we can type our queries.

mclient -u monetdb -d voc   

**************************************************

If we have used "set port" command to set some other port for our server, then we have to supply that alternative port number to mclient:

mclient -p50007 -u monetdb -d voc

**************************************************

Now we can type our first query. Don't forget the semicolon. We can exit "sql>" prompt with the command "quit".

SELECT 'columnValue' as columnName;    

How to Stop or Lock Our Server?

We can stop our server with stop command:

monetdbd stop ~/DBfarm1

After we do this, we can check our "merovingian.log" file. Inside of it, we will see that all of the databases of that server have been shut down.

cat merovingian.log     

If we just want to make our database unavailable, then we use the "lock" command. This puts our database into maintenance mode.

monetdb lock voc         

We already know that we can exit maintenance mode with "monetdb release voc".