Notice: In this blog post, we will create a VOC database. This database will be used as the main database for most subsequent blog posts.
Getting the Codename of our Ubuntu Version
First, we need to know the code name of our Ubuntu version. We can find that by reading from the file "os-release". From this file we can read only the line that has words "VERSION_CODENAME" inside of it.
cat /etc/os-release | grep VERSION_CODENAME
Our Ubuntu codename is "focal". It is also possible to use command:
lsb_release -cs
We can see from the command line above that our user account is "fffovde". "FffOvdeKomp" is the name of our computer.
Adding a Repository Where MonetDB is Stored
Next, in folder "/etc/apt/sources.list.d" we will create a file with the name "monetdb.list".
#jump to that folder cd /etc/apt/sources.list.d #create new file, you will be asked to provide password sudo touch monetdb.list
Inside of this file we must place this text. These are addresses to MonetdDB repository.
deb https://dev.monetdb.org/downloads/deb/ focal monetdb deb-src https://dev.monetdb.org/downloads/deb/ focal monetdb
We can add this text by running these two lines in our terminal:
sudo sh -c 'echo "deb https://dev.monetdb.org/downloads/deb/ focal monetdb" >> monetdb.list' sudo sh -c 'echo "deb-src https://dev.monetdb.org/downloads/deb/ focal monetdb" >> monetdb.list'
Now our file looks like this:
Installing GPG key
Then, we would execute this command. This command will read GPG key file from the internet, and it will place that file in location "/etc/apt/trusted.gpg.d/monetdb.gpg". GPG key is a file which will be used to verify MonetDB packages before installing them.
This monetd.gpg file is a binary file. We can read its content with the command:
sudo apt-key finger
This command will read values of all the GPG keys in our Ubuntu. One of those keys will be for MonetDB.
If result of this command is equal to " DBCE 5625 94D7 1959 7B54 CE85 3F1A D47F 5521 A603" for MonetDB, then that means that we have installed the correct key.
There is also a key "8289 A5F5 75C4 9F50 22F8 EE20 F654 63E2 DF0E 54F3", but that key is for versions of MonetDB older than 11.49.
MonetDB Installation
Now we can install MonetDB. First, we will update our list of available software with the command:
sudo apt update
Then we can install MonetDB server and client: sudo apt install monetdb5-sql monetdb-client
Creating a DBfarm
In MonetDB, databases are located inside a folder usually called "DBfarm". In addition to the databases, in this folder we will find configuration files with settings for the DBfarm.
Monetdbd, linux daemon, is used to initialize DBfarm by placing configuration files inside it. We use this daemon for changing DBfarm settings and for communication with databases. Monetdbd manages DBfarm and its databases.
monetdbd create /home/fffovde/DBfarm1
We will use "monetdbd" to create a DBfarm on our disk.
Inside of this folder, a new file ".merovingian_properties" will appear. ls -A /home/fffovde/DBfarm1
The Merovingian dynasty was the ruling family of the Franks from the mid-5th century until 751. This dynasty ruled the Netherlands, the country from which MonetDB originates. MonetDB is using this term for some of its internal files and commands.
If we look inside of this file, we will find only one property. All other properties are using default values. We can read those default values by command: monetdbd get all /home/fffovde/DBfarm1
property hostname dbfarm status mserver logfile pidfile loglevel sockdir listenaddr port exittimeout forward discovery discoveryttl control passphrase snapshotdir snapshotcompression mapisock controlsock
value FffOvdeKomp /home/fffovde/DBfarm1 no monetdbd is serving this dbfarm unknown (monetdbd not running) /home/fffovde/DBfarml/merovingian.log /home/fffovde/DBfarml/merovingian.pid information /tmp localhost /50000 60 proxy true 600 no <unset> <unset> .tar.lz4 /tmp/.s.monetdb.50000 /tmp/.s.merovingian.50000
Now, that DBfarm is created, I will start the daemon. We can use daemon to control DBfarm.
monetdbd start /home/fffovde/DBfarm1
Inside of the DBfarm1, we now have 4 files: ls -A /home/fffovde/DBfarm1
File ".merovingian_lock" is empty. This file probably just signalizes that this is DBfarm.
File "merovingian.pid" has the number 1863. This is the number of monetdbd process. If we use command "sudo ss -tlnp" to show us all listening ports, we will see the name monetdbd beside process 1863, and this process will listen the default port 50000.
We can also read content of the log file.
cat /home/fffovde/DBfarm1/merovingian.log
Creation of a Database
While monetdbd is used to manage DBfarm, "monetdb" console application is used to manage individual databases. In the background, "monetdb" will send our commands to monetdbd, and monetdbd will be the one exercising direct control over databases. So, we control databases from "monetdb", but through the power of monetdbd.
This is how we create a database with the name "voc": monetdb create voc
A folder with the name "voc" will appear inside of the DBfarm1 directory.
When database is created, it will only have one default user. That user is administrator "monetdb", and he has default password "monetdb". Database will be created in the maintenance mode. That means that only administrator will be able to start the database, and only on the local computer.
Administrator should start the database with this command. Then he should log into database, and he should change his default password to some other secret and complex password. We will not do that this time, we will continue using the default password "monetdb".
I will change the mode of the database, and I will take it out of the maintenance mode. We can do this to make database available to all the users ( although we currently don't have other users ). monetdb release voc
We can start our database. After the database start, it will be ready for users to log in. monetdb start voc We can check the status of the database with the command: monetdb status
The process with our database server is called "mserver5". This process will run when the database is opened: pgrep mserver5
This command will return process ID of our database server process.
Logging into Database
Now that our database is working and is listening on port 50000, we can try to use it. We will now use another application, with the name "mclient". Let's first recapitulate three console applications used by MonetDB:
"monetdbd" is managing DBfarm.
"monetdb" is managing individual database, with the help of "monetdbd".
Database will run as a "mserver5" process. We'll use "mclient" application to send queries to this process.
We will login to the "voc" database as a "monetdb" user. Password is the default password "monetdb". mclient -u monetdb -d voc
This will be our welcome screen. We will get "sql>" prompt. There we can type our queries.
I can run this query in my database: SELECT 'columnValues' as columnName;
We can exit the "mclient" program by typing the word "quit".
How to Stop our Server?
This is how we can stop "mserver5" process, the process of the Monetdb database. monetdb stop voc
Log file will tell us what happened. tail -1 /home/fffovde/DBfarm1/merovingian.log
We can put our database in maintenance mode at any time. It doesn't matter if database is opened or closed. We use "lock" command. monetdb lock voc
Next time, only administrator, on the local computer, can start the database with "monetdb start voc" command. He can start the database in exclusive mode, so that he can run some maintenance operations on the database ( he can do backup, or he can make changes in the schema ). We saw previously that database can be taken out of maintenance mode with "monetdb release voc" command. After that, any user can login to a database.
We can also stop "monetdbd" daemon.
monetdbd stop /home/fffovde/DBfarm1
Log file will show us that daemon has stopped.
Install MonetDB in Alternative Way
This time I will jump to the newer version of the Ubuntu. It is "noble". cat /etc/os-release | grep VERSION_CODENAME
Then, I will go the web page https://www.monetdb.org/downloads/deb/repo/. On this web page we have a list of the newest versions of the Ubuntu and Debian. One of the versions is "noble". Enter that folder, click on "monetdb-repo.deb". Firefox will download this file.
We will install this "monetdb-repo.deb" package. sudo apt install /home/fff/Downloads/monetdb-repo.deb
This package will add the file "monetdb.sources" inside of the "/etc/apt/sources.list.d". This file has the same content as the file "monetdb.list" that we have created by hand during the original installation.
This "monetdb-repo.deb" package will also provide GPG keys that can be found on this location "/usr/share/keyrings/monetdb-1.4.gpg". We now understand that we can prepare our computer for monetdb installation by using this package, instead of doing everything by hand.
I will uninstall MonetDB on the "noble" server. We will first create and start a DBfarm, because I want to show you the whole process.
For uninstallation process, if monetdbd is running, we must stop it.
monetdbd stop /home/fff/DBfarm1
We will now list all the packages that have "monetdb" in their name. dpkg -l | grep monetdb
We will remove all those packages. When we delete "monetdb-repo", files with repositories and gpg keys will also be deleted. sudo apt purge libmonetdb-client28 libmonetdb-mutils libmonetdb-stream28 libmonetdb30 monetdb-client monetdb-repo monetdb-server monetdb-sql monetdb5-sql
We will clean all the possible remains: sudo apt autoremove sudo apt autoclean
We still have "monetdb" group and user. cat /etc/passwd | grep monetdb getent group monetdb
We will delete the group and the user. sudo deluser monetdb sudo delgroup monetdb Actually, when we delete the user, the group will also be deleted, so we don't have to delete it separately.
This will not delete DBfarms, only MonetDB application.
Systemd Unit File is a file with settings that Systemd will use when starting and controlling some daemon. From version 11.53.13, MonetDB is using a new systemd file that allows us to change the default directory ( DBfarm ) for databases that are controlled by Systemd.
Old Systemd Unit File
systemctl status monetdbd This command will show us where is systemd unit file for Monetdbd daemon. /lib/systemd/system/monetdbd.service
cat /lib/systemd/system/monetdbd.service
The old system unit file was hard coded to always use the directory "/var/monetdbd5/dbfarm". Only databases located within this directory could be controlled by systemd. Only such databases could be started automatically after the computer boots.
[Unit] Description=MonetDB database server daemon Documentation=man:monetdbd https://www.monetdb.org/documentation/admin-guide/manpages/monetdbd/ After=network.target
From version 11.53.13, we have a new system file that allows us to change DBfarm folder, controlled by Systemd.
systemctl status monetdbd This command will show us where is systemd unit file for Monetdbd daemon. /usr/lib/systemd/system/monetdbd.service
cat /usr/lib/systemd/system/monetdbd.service
This is the new look of the Systemd unit file. We can notice that now we have environ variable DBFARM that is directed to directory "DBFARM=/var/monetdb5/dbfarm" by default. This file is now more complex because it allows us to take some other directory for the DBfarm.
[Unit] Description=MonetDB database server daemon Documentation=man:monetdbd https://www.monetdb.org/documentation/admin-guide/manpages/monetdbd/ After=network.target
We will not change the original Systemd unit file. Instead of that we will create a "drop-in". That means that we will create another file that will be "amendment" to the original file. "Drop-in" file augments or overrides parts of a unit file without touching the original unit file.
We create this "drop-in" file by running the command: sudo systemctl edit monetdbd
An instance of Nano text editor will open. Inside of it we will se the whole original Systemd unit file, but all the lines will be commented out (every line will start with #).
This location will override the original "/var/monetdb5/dbfarm" location.
I am using linux distribution "KDE Neon". This distribution doesn't use "SELinux". "SELInux" is security feature of the "Red Hat" distribution. The original Systemd unit file provided by the MonetDB developer team is using the command "chcon" that is only useful for the distributions that are using SELinux. I will exclude this code from my Systemd unit file.
Extraordinary, because I am using distro without SELinux, I will also add these corrected lines to my "drop-in" Systemd unit file.
# First, we will delete all of the "ExecStartPre=". ExecStartPre= # This line will add the first part of the "ExecStartPre=". The line "ExecStartPre" is split into three parts, # just to make it more readable. These three lines can be considered as one script. ExecStartPre=/bin/bash -c 'test -d ${DBFARM} || (mkdir -m 770 ${DBFARM})' # This line will add the second part of the "ExecStartPre=". ExecStartPre=/bin/bash -c 'test -f ${DBFARM}/.merovingian_properties || (umask 0007; /usr/bin/monetdbd-11.53.13 create ${DBFARM}; /usr/bin/monetdbd-11.53.13 set pidfile=/run/monetdb/merovingian.pid ${DBFARM}; touch ${DBFARM}/.merovingian_lock)' # Third line is unchanged. ExecStartPre=/usr/bin/grep -q pidfile=/run/monetdb/merovingian.pid ${DBFARM}/.merovingian_properties
This config snippet will be "amendment" to original Systemd unit file. This snippet must be added above the comment "Edits below this comment will be discarded".
We will save this change with "Ctrl+O" ENTER, and then we will exit with the "Ctrl+X". After that we must reload all the systemd unit files.
sudo systemctl daemon-reload
The Changes We Made
This command will show us the new value of the DBFARM environ: systemctl show monetdbd | grep ^Environment=
On the image we use these commands to check override.conf:
cd /etc/systemd/system/monetdbd.service.d ls -alh cat override.conf
systemctl cat monetdbd
This command is also useful because it will show us the original file, and it overrides, and their locations, all together.
Testing The Changes
Because we are going to use Systemd, we should add our user to "monetdb" group.
sudo usermod -aG monetdb "$USER"
After that, we should log out and then log in.
newgrp monetdb
Sometimes log out/in will not be enough. In that case, try to open a new session ( a tab ) in a terminal, or run this command in the existing session. This is happening because OS will try to recycle the old session, so changes are not applied.
We will now create one database and after that we will restart our computer to see if it will be started automatically.
-- Start monetdbd daemon controlled by systemd. -- Create a database. -- Make database available. --Make monetdbd to start automatically after computer reboot. -- Restart our computer.
After the reboot, I will open the terminal, and I will try to log in: mclient -u monetdb -d DBdesktop
It will work. We don't have to start the daemon manually.
The last step is to check if DBdesktop is really inside of the dbfarm2 directory. cd /var/monetdb5/dbfarm2 ls -alh
What is Schema
Database is made of tables, views, indices and so on. Inside of Database, each of these objects belongs to some schema. Database is organizationally divided into schemas. There is no overlap between schemas. Some special objects are outside of a schema, like roles.
Usage and modification of schema elements is strictly done by the user or the role that owns that schema. During creation of a schema, we should decide who will be the owner, because later it will not be possible to change ownership. If we want several people to maintain one schema then we should set a role as an owner of a schema.
Only 'monetdb' user and 'sysadmin' role are allowed to create new schemas.
Schema as a File
We can create a file that will contain all the instructions the server needs to create database objects inside of the schema. This file would tell the database which tables, relations, indexes, views to create. In this way, we can create everything that makes up one schema. That is why we call such a file a "schema file". Although we have not learned all the SQL commands needed to create such a file, we can use the "schema file" that the MonetDB development team has prepared for us.
First, we will download sample database from this location:
This will give us ZIP file. Inside of it, there is SQL script with the schema for our new database.
Creation of a New User
For our schema we will create a new user. First, we will enter mclient with 'monetdb' privileges. > monetdbd start /home/fffovde/DBfarm1 > mclient -u monetdb -d voc--password monetdb
For creation of a user, we need username (USER), password (WITH PASSWORD), user's full name (NAME), and default schema for that user (SCHEMA). The default schema is schema that MonetDB will use as a current schema when the user log in. For tables in the current schema, the user can type "SELECT * FROM Table1", but for tables in NON current schemas, the user must type "SELECT * FROM schema.Table1".
"sys" schema is a built-in schema in MonetDB. We will use it temporarily so we can create a new user. sql> CREATE USER "voc" WITH PASSWORD 'voc' NAME 'VOC Explorer' SCHEMA "sys";
As a 'monetdb' administrator we can create a new schema. We will say that previously created "voc" user is the owner of that schema.
sql> CREATE SCHEMA "voc" AUTHORIZATION "voc";
Don't get confused. The name of our database is "voc", but the name of the new schema is also "voc", and the name of the user is "voc".
We will set the new schema as the default schema for our user. sql> ALTER USER "voc" SET SCHEMA "voc"; sql> \q-- we can exit mclient with "quit" or "\q"
Since the "voc" schema is the default schema for the "voc" user, this schema will be active when this user logs in to MonetDB. Everything the user does will be reflected in this schema, unless the user explicitly mentions that they want to work in a different schema.
Populating our Schema with Database Objects
Our "voc" schema is currently empty, but we have definitions of all the tables, view, indices … inside of our downloaded SQL script. We will use that script to populate our schema. We type:
> mclient -u voc -d voc ~/Desktop/voc_dump.sql
This code will log "voc" user into "voc" server. It will also execute all the SQL commands from the "voc_dump.sql" file. When the user log in, and his default schema is "voc", that means that he will log in into "voc" schema. All the SQL commands will be executed inside of this schema.
We will log in again: mclient -u voc -d voc#password is voc
We can use mclient command "\d" to list all the tables and views inside of our database. sql> \d TABLE voc.craftsmen TABLE voc.impotenten TABLE voc.invoices TABLE voc.passengers TABLE voc.seafarers TABLE voc.soldiers TABLE voc.total TABLE voc.voyages
This is what we would get:
DBeaver Database Manager Program
We will install DBeaver database manager program to peruse our database. > sudo snap install dbeaver-ce
This program is GUI program, so we will open it in our desktop environment. In the program we will click on the triangle (1), and then we will select "Other" (2) to open other servers. There we will find MonetDB. We will click on it (3), and a new dialog will open. In this dialog, host and port are already entered. We just need to enter Database/Schema (4), Username and Password (5) (which is "voc").
DBeaver will not have driver for MonetDB database installed, so it will offer you to download it. Just accept that offer.
At the end, objects of our database will appear inside of pane on the left side of a program. There, we should double click on schema name (1). After that we can select tab "ER Diagram" (2). There, after some rearrangement, we will see ER diagram of our database (3). As we can see, tables are organized in star schema with "voyages" table as the only fact table. All tables are connected with foreign key constraints, where foreign key is also primary key inside of dimension tables. The only exception is Invoice table where foreign key columns are not primary key columns, and that is why that relationship is shown with dashed line (4).
Here is download for voc_dump.zip file used in this blog post:
People used to love playing games on old game consoles. People would continue to play those games, but those consoles are no longer for sale, so there are no devices available on which to play these old games.
To solve this problem, emulators are born. Emulators are programs that you install on your computer, and they imitate retro game consoles. In that way, people can still play their favorite video games, even without the game console.
Emulator is simulating the actual hardware. For video games it doesn't matter if they are running on an emulator or on the real device.
It is also possible to emulate Operational System with all the programs on that system. It is just like we have several computers inside of the one computer case.
Emulator for OS is called hypervisor. One such hypervisor is VirtualBox. VirtualBox is application that we install on our computer, and then we create "virtual machines". A virtual machine (VM) is a software-based emulation of a physical computer that runs its own operating system and applications. With VirtualBox, we can use virtual machines like any other application.
Container
Container is something in between portable application and virtual machine.
Portable application
Container
Virtual machine
– Application that doesn't need installation. – It should match OS version. – Lightest and fastest. – Not isolated from other applications.
– Application with most of its dependencies. – It should match class of OS (Linux, Windows…). – Middle weight. – Mostly isolated.
– Full OS with all the applications. – Should match hypervisor. – Heavy weight. – Fully isolated.
We could say that container is stripped down virtual machine. If we limit virtual machines to only one application, and we force virtual machines to share kernel with the host machine, then we would get containers.
VM – kernel – other apps = container
Containers doesn't use hypervisor. They use docker engine. Docker engine is just an application, just like VirtualBox.
Containers are similar to portable applications, they can be easily deployed, moved, and upgraded. In addition, containers are isolated from other applications and will not compete for resources between them. Containers are much lighter and faster than virtual machines. Containers are immutable, so it is not possible to update parts of the container. This gives us consistency that we can rely on.
Usage of a Container
Images are templates used to create containers. An image contains everything needed to build a single container. Images are created by application developers. A developer would create an image and then provide it to users, who would then be able to create containers based on the image.
Let's say we want to publish our application as an image. We will package all our source code along with a "Dockerfile" (explained later) into a single folder. We would use the "Build" command to create a "Docker image". The image is the template from which containers are built. We will upload that image to a Docker image repository (registry). "Docker hub" is a well-known cloud image registry. Users can download our image from the registry. Based on that image, the user will launch the container. In the container, the user would find the application we created.
Anatomy of a Dockerfile
If we go to this git hub web page "link", we will find an example of files that are needed to create an image for MonetDB.
Here we can see source files for MonetDB, and among them there is a file with a name "Dockerfile". This file is an instruction for docker image creation.
"Dockerfile" for MonetDB is a more complex one, so I will show you another one that is simple. =>
"Node.js" is a framework for making web applications. The Dockerfile below is instructions on how to make docker image based on the "Node.js" application.
"FROM" command will download stripped down linux distro from the "Docker Hub" cloud. That stripped linux distro will give system API for our application. After that, we will define working folder, we will copy our application into that folder, and we will install all dependencies for our app. The last line will start our application when a user starts the container.
FROM node:22-alpine# ALPINE is a tiny linux distro, with node.js installed. WORKDIR /app # we set the working directory inside of the ALPINE linux. COPY . .# we copy from the current directory on the host # machine to "/app" directory inside of the ALPINE linux. RUN npm install # we install dependencies that are listed inside of our project, into image CMD npm start# this command will run when our container starts.
All Dockerfiles are made like this. They usually have the same steps:
1) get linux image for API
2) define a folder
3) copy your application
4) download dependencies
5) build/compile
6) create user
7) expose network ports
8) set some environs
9) define startup command
Docker file is basically a list of steps we would take on a physical machine to make our application operational. We will not go any deeper into docker technology. In this article we will only learn how to install docker and download and use MonetDB container.
On the bottom of that page, we will find a script. We can download and run this script. That means that we can install docker with two lines of code: curl -fsSL https://get.docker.com -o get-docker.sh sudo sh get-docker.sh
We should add our user to docker group. sudo usermod -aG docker $USER
After that we should log out and log in.
We can now confirm that docker is installed: docker version
Monetdb Container Installation
Getting a Monetdb Image
MonetDB containers are hosted on "Docker Hub". MonetDB team will create a new container for each new version of MonetDB. We can find a list of all the MonetDB versions on this page: https://hub.docker.com/r/monetdb/monetdb/tags
The name of each version is made of three elements: nameOfDockerHubAccount/nameOfApplication :versionNumber
We can download the latest version to our computer using this command. docker pull monetdb/monetdb:latest
We type this command to observe our images. In non-compressed state, our image is 536 MB. docker images
"Tag" is a version of the image. If we don't provide a tag, we will always get the "latest" version.
Starting MonetDB Container
This is how we can start MonetDB containers. ALPINE linux will expose the server on the default port 50.000, but we must tie that to the host port number. We will use again 50.000 for the host port number. By default, our farm will be on location "/var/monetdb5/dbfarm" inside of the container. The name of default database will be "monetdb". We can set an environ with administrator password like "-e MDB_DB_ADMIN_PASS=monetdb2".
docker run -p50000:50000--restart unless-stopped -e MDB_DB_ADMIN_PASS=monetdb2 monetdb/monetdb:latest
The server will stay in the foreground in the terminal. We will open a new tab inside terminal and there we will type: docker ps #this will list opened containers
We can connect to bash in the new container. "28629" are start figures of the container ID. We will be connected as a root. docker exec -it 28629 bash
Administrator is "monetdb", and the default database is "monetdb". Password is "monetdb2" and it is set through environ MDB_DB_ADMIN_PASS.
mclient -u monetdb -d monetdb # monetdb2 is password
What Will Happen If We Reboot Computer
I will exit bash, and then I will reboot my computer. We used option below with the "docker run" command. This option will make our container to automatically start when the computer starts. It is only possible to stop it manually. --restart unless-stopped If we don't want container to start automatically, we use the default "--restart no".
After restart "docker ps" command will show containers active on our computer. We can see that our container STATUS is "Up 43 seconds".
If we type "docker ps -a" we would get all the containers on our computer, both stopped and active.
I will stop the container. docker stop 28629
There are now no active containers, just one stopped container.
We can now delete stopped container: docker rm 28629
This command would delete all the stopped containers: docker container prune
We will now start the container again, but with fully customized options.
Above we use the "-d" option to detach our container. This means that after we run this command, docker will not remain in the foreground, but we will get the command line again, so we can continue typing commands in the same shell.
At the bottom of the image, we can see ID of the new container.
1666729293f….
MDB_FARM_DIR and MDB_CREATE_DBS
We can define in which directory in the container our farm will be created. That is defined with MDB_FARMDIR. With MDB_CREATE-DBS, we can create one or several databases. Between the names of databases there must be no spaces, just the comma.
I will again enter container's bash to check whether we have this folder and databases.
docker exec -it 166 bash
Inside of the bash I will jump do "dbfarm2" folder. cd /var/monetdb5/dbfarm2
Inside this folder we will find databases db1 and db2. ls -alh
We don't have explicitly define MDB_FARM_DIR and MDB_CREATE_DBS. They can take the default values.
/var/monetdb5/dbfarm
monetdb
Volume
We used option that will create volume. Volume is folder on the host system that is mounted into file system of the container.
-v data-vol:/var/monetdb5/dbfarm2
Volumes can be found on the host system at the location: /var/lib/docker/volumes
We can open this folder on our host computer, and inside of it we will find our databases. The purpose of the volumes is to provide data persistence. If our container is deleted, everything inside of it will be deleted. But this will not happen if the data is on the host computer and is only mounted to container.
If we want the data to be safe and persistent, we should use volumes.
MDB_SNAPSHOT_DIR, MDB_SNAPSHOT_COMPRESSION and MDB_LOGFILE
We set MDB_LOGFILE to be "/logfile". That is where is MonetDB log: cat /logfile
By default, logfile will be placed inside of the DBfarm, and will have a name "merovingian.log".
We have set snapshot options: -e MDB_SNAPSHOT_DIR=/snapshot -e MDB_SNAPSHOT_COMPRESSION=.tar.bz2
I will make a snapshot with the default settings:
monetdb snapshot create db1
MonetDB will try but will fail because "/snapshot" directory doesn't exist. We must create it.
mkdir /snapshot monetdb snapshot create db1 cd /snapshot ls -alh
Snapshot will be created with the assigned compression "bz2".
Snapshots cannot be created if we don't provide snapshots environs. It would be wise to place snapshots inside of the volume to preserve them.
MDB_FARM_PROPERTIES and MDB_DB_PROPERTIES
With MDB_FARM_PROPERTIES environs we set two DBfarm properties. monetdbd get all /var/monetdb5/dbfarm2 | egrep 'exittimeout|keepalive'
Possible properties are explained inside of this blog post "link". We cannot set properties "listenaddr, control and passphrase". "Listenaddr" is always set to "all". "Passphrase" is set by the environ MDB_DAEMON_PASS. If we set MDB_DAEMON_PASS, then "control" will be set to true.
If I try to read database properties with "monetdb get all db1", I am getting an error "incomplete response from monetdb". I don't know why this is happening. We are also kicked out from the bash. This is a bug.
Alternatively, we can read from the system function "env()". I will login again into bash and then into mclient.
docker exec -it 166 bash mclient -u monetdb -d db1 #password monetdb2 SELECT * FROM env() WHERE value = 10;
In the "env()" table, we can see that the values of the "nthreads" and "ncopyinthreads" properties are set.
Possible properties for a database are explained in this blog post "link".
MDB_DB_ADMIN_PASS and MDB_DAEMON_PASS
MDB_DB_ADMIN_PASS is a mandatory environ. We must set it in order to run MonetDB container.
MDB_DAEMON_PASS is a password for remote Monetdbd control. Remote control of a daemon is explained in the blog post "link". In the context of the containers, this password will allow us to control the daemon in the container from the host operating system.
I will run this command from the host OS to control monetdbd daemon inside of the container. monetdb -h localhost -P daemonPass create db3
For this to work we need to have some conditions fulfilled on the host operational system: – We must have MonetDB installed on the host computer. – DBfarm must be started. Without that we cannot use "monetdb" command. – DBfarm on the host computer must use port number different than the port number of the MonetDB inside of the container.
monetdbd get port /home/clean/DBfarm1
Host computer has Monetdbd on the port 50.0001.
On the host computer we can visit "volume" folder. Inside of it we will see that the new database "db3" has been created.
How to Stop and Start Container?
We can stop the container with command: docker stop 16667 docker ps -a
We can start the container with the command: docker start 16667 docker ps
Clean Up
Deleting Images and Containers
I will now delete everything we have done. First, I will stop and delete container. docker stop 16667 docker rm 16667
We can also delete the image: docker images docker rmi 9f2e
Removing Docker from Your System
Deleting Docker includes these steps:
We will first stop docker services. sudo systemctl stop docker.service sudo systemctl stop docker.socket
We can now uninstall all parts of the docker: sudo apt purge -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
In the next step, we will remove docker repository and GPG key. sudo rm -f /etc/apt/sources.list.d/docker.list sudo rm -f /etc/apt/keyrings/docker.gpg
We can update the list of available packages. sudo apt update
We will remove orphaned dependencies. sudo apt autoremove -y sudo apt autoclean
We can also remove all the remaining docker files. sudo rm -rf /var/lib/docker sudo rm -rf /var/lib/containerd sudo rm -rf /etc/docker sudo rm -rf ~/.docker
We will delete docker user group. sudo groupdel docker
Now, we can conclude that docker is deleted: docker --version
In one of the previous blog posts "link", we saw the existence of a system table containing statistics about database tables:
SELECT * FROM sys.statistics WHERE minval = 'A';
We saw that this table can be updated partially with these commands. Now, we will see that for each of these commands we can use stored procedure, as an alternative.
Update schema statistics:
Update table statistics:
Update column statistics:
Command
ANALYZE sys;
ANALYZE sys.auths;
ANALYZE sys.auths ( name );
Procedure
CALL sys.analyze('sys');
CALL sys.analyze('sys', 'auths');
CALL sys.analyze( 'sys', 'auths', 'name' );
There is also a procedure that will update statistics for all the schemas:
CALL sys.analyze();
Table Data Dump
I will read from one system table. We'll see that this table is empty. SELECT * FROM sys.dump_statements;
I will now call this procedure: CALL sys.dump_table_data( 'sys', 'auths' );
We will read again from the "sys.dumpstatements". SELECT * FROM sys.dump_statements;
Now we have a row with a statement that can be used to fill "auths" table with the data.
We can empty this table normally: DELETE FROM sys.dump_statements;
If we call the same procedure, but without arguments, all the tables from the database will be dumped ( image is from "DBfarmDB" database ): CALL sys.dump_table_data(); SELECT * FROM sys.dump_statements;
Eval Procedure
I want to find the table in my database that has the biggest ID. I want to read the content of that table. It is easy to find the name of this table. The problem is how to read its content. SELECT name FROM sys.tables WHERE id = (SELECT MAX( id ) FROM sys.tables);
CREATE OR REPLACE PROCEDURE TableWithBiggestID() BEGIN DECLARE TableName VARCHAR(50); SET TableName = ( SELECT name FROM sys.tables WHERE id = (SELECT MAX( id ) FROM sys.tables)); CALL sys.eval( 'SELECT * FROM ' || TableName || ';' ); END;
For that task, I will use stored procedure "sys.eval". It will transform text into statement.
In our procedure, we will find the name of the table with the biggest ID ( table "tabula" ). We will create string "SELECT * FROM tabula;". To execute string, we will transform it into statement with "sys.eval". If we call our procedure "TableWithBiggestID", we will see the content of the table "tabula". CALL TableWithBiggestID();
This function is only available to administrator.
Vacuum
Vacuuming is a maintenance process where we clean databases of obsolete data and defragment columns. After deleting or updating many rows, old records will not be deleted but will only be marked as deleted. This can lead to excessive disk usage and prevent sequential reading of column values.
We can manually vacuum one table or one column.
CALL sys.vacuum('voc', 'voyages');
CALL sys.vacuum('voc', 'voyages', 'number');
It is possible to schedule column vacuuming at regular intervals. The interval is expressed in seconds.
CALL sys.vacuum('voc', 'voyages', 'number', 3600); This is how we stop the schedule.
CALL sys.stop_vacuum('voc', 'voyages','number');
Session Procedures
Session Procedures for Users
User can read his own session ID. SELECT sys.current_sessionid();
User can limit memory consumption for his queries. This is measured in MB.
CALL sys.setmemorylimit(500);
User can disable his memory limit if he call this procedure with the argument equal to zero.
CALL sys.setmemorylimit(0);
Users can set query timeout to 1 second.
CALL sys.setquerytimeout(1);
If our query is longer than 1 second, error message will appear.
Zero will reset query timeout. CALL sys.setquerytimeout(0);
We can limit session to 3 seconds. CALL sys.setsessiontimeout(3);
This will happen if we run a query after 3 seconds.
Zero will reset session timeout.
CALL sys.setsessiontimeout(0);
Users can choose their optimizer. CALL sys.setoptimizer('minimal_pipe');
We can find a list of optimizers in the system table "sys.optimizers".
Default optimizer is "default_pipe".
We can limit the number of threads that user queries can use. CALL sys.setworkerlimit(8);
Zero argument will remove this limit. CALL sys.setworkerlimit(0);
Client Info Procedure for Users
We can declare some properties for the database client we are using. If we read from this system table, we will find a list of the properties that we can declare.
SELECT * FROM sys.clientinfo_properties;
ClientHostname – the name of our computer. ApplicationName – Mclient, DBeaver, Tableau. ClientLibrary – libmapi 11.53.3 ClientPid – number of the client process. ClientRemark – A comment.
I will change one of these properties. CALL sys.setclientinfo('ClientHostname', 'OldServer' );
We can read current values of these properties in the "sys.sessions". SELECT * FROM sys.sessions;
The purpose is to write client data to the "sys.session" table, so we can better understand server usage.
Session Procedures for Admins
I will open another tab in the terminal, and there I will login as a user "voc".
mclient -u voc -d voc --password is voc
If the user reads from "sys.session", he will only get his own session.
SELECT * FROM sys.sessions;
When the admin reads from the "sys.sessions", he will see sessions with all the users. Admin can use IDs of those sessions to manage those sessions. Below we can see how admin can manage user voc session ( ID 1 ). SELECT * FROM sys.sessions;
CALL sys.setmemorylimit( 1, 100 );
Admin can limit memory usage to 100 MB.
We disable this limit by setting it to zero.
CALL sys.setoptimizer( 1, 'sequential_pipe');
Admin can set optimizer.
Default optimizer is "default_pipe".
CALL sys.setquerytimeout( 1, 8 );
User's queries will time out after 8 sec.
Zero will disable query time out.
CALL sys.setsessiontimeout( 1, 30 );
Session will time out after 30 seconds.
Zero will disable session time out.
CALL sys.setworkerlimit( 1, 8 );
User's queries are limited to 8 threads.
Zero will disable this limit.
CALL sys.stopsession( 1);
We can stop user's session.
Queries Que
The Size of Queries Que
I will quit the session, and I will stop the "voc" database. quit monetdb stop voc
After that I will start the database, and I will login to it. monetdb start voc mclient -u monetdb -d voc --password monetdb
I will read from the system view: select * from sys.queue; It is also possible to use function: select * from sys.queue();
The result is a table that has a history of the last run queries. This table was emptied when we restarted the database, so now it only has 2 queries.
I will create a list of 70 queries, and I will paste all of them in the mclient.
After all of them run, I will again read from the "sys.queue" table.
SELECT COUNT(*) FROM sys.queue;
The result will be 64. MonetDB is keeping the last 64 queries.
If I login again as a "voc" user in the other tab of the terminal, I can run a query as a "voc" user. mclient -u voc -d voc --password voc SELECT 'New';
I will read "sys.queue" table as a "voc" user. SELECT * FROM sys.queue; He has 3 queries.
I will go back being admin, and I will again run the statement.
SELECT COUNT( * ) FROM sys.queue();
The result will be 61. MonetDB is keeping only the last 64 queries ( 61 + 3 ), from the all of the sessions.
Admin has ability to read the whole que: select COUNT( * ) FROM sys.queue( 'ALL' );
With "ALL", admin can read from all of the sessions. SELECT DISTINCT sessionid FROM sys.queue( 'ALL' );
Changing The Que Size
The number of queries saved is the same as the maximal number of clients that can connect to database. I will quit my session. I will stop my database and then I will change "nclients" property of the database.
quit monetdb stop voc monetdb set nclients=10 voc
I will log in into database. mclient -u monetdb -d voc
I will again paste 70 queries into mclient.
If we now read from the "sys.queue", we will only find 10 queries. This is the proof that "nclients" property will limit number of queries kept inside of this table.
We can read the current value of "nclients" property with this query. select * from environment where name = 'max_clients';
Management of Queries
We can provide the name of a user, and that will return only the queries for that user. Currently, all the saved queries belong to admin. select COUNT( * ) from sys.queue('monetdb');
"queue" view will also include queries that are currently running. Such queries will be labeled with "running" in the status column.
There are five possible statuses.
Query is executing. running
Query is halted. paused
Query is stopping. stopping
Query is stopped. aborted
Query is executed. finished
I will open one more tab in the terminal, where I will log in as a admin.
mclient -u monetdb -d voc --password monetdb
I will use this query on the right. This is recursive CTE that uses LEFT OUTER JOIN. Such query will never end. It is eternal.
I will run this query in the first terminal tab, and I will check the result in the second terminal table.
WITH recursive numbers AS ( SELECT 1 UNION ALL SELECT 1 FROM numbers LEFT OUTER JOIN sys.auths ON 1 = 1 ) SELECT * FROM numbers;
The last query in my "sys.queue" is number 93 ( column "tag" ). I know that when I run my eternal query, it will get number 94. I will run eternal query.
I will run this statement in the terminal second tab. My query will have status "running".
SELECT status FROM sys.queue WHERE tag = 94;
I can pause my query from the terminal second tab. Status will change.
CALL sys.pause( 94 ); SELECT status FROM sys.queue WHERE tag = 94;
I can resume my query. Status will change again.
CALL sys.resume( 94 ); SELECT status FROM sys.queue WHERE tag = 94;
We can stop the query.
CALL sys.stop( 94 ); SELECT status FROM sys.queue WHERE tag = 94;
It seems that my query can be only stopped with "Ctrl+C". This is because I used eternal query.
Even when we use "Ctrl+C" to stop query, its status will become "aborted".
SELECT status FROM sys.queue WHERE tag = 94;
If we are admin we can manage queries of other users. For that we must provide the name of a user and his query number ( from column "tag" ).
Mclient is the MonetDB command line client that connects to the MonetDB server to execute SQL commands. We have used this client in all the previous videos, but now we will see how we can make better use of it.
General Information
With mclient we can find out the version of our server and we can get help for mclient commands.
This will show us version: mclient -v
This is how we can get help: mclient -?
Information About Our Database
In one of the previous blogposts "link", I created database farm "DBfarm". Inside of it we have a database "DBfarmDB". I will now login to that database as an admin and I will create some database objects. The objects will be created in the "sys" schema. monetdbd start /home/fffovde/Desktop/DBfarm monetdb set readonly=no DBfarmDB#if it is still readonly mclient -u monetdb -d DBfarmDB #password monetdb
CREATE TABLE tablo( Number INT ); CREATE VIEW viewo( Number ) AS SELECT Number from tablo; CREATE SEQUENCE seqo;
CREATE FUNCTION funco() RETURNS INTEGER RETURN SELECT 2;
CREATE SCHEMA schemo;
If we just type "\D" inside of the mclient, we will get sql dump in the stdout, for the whole database.
We can type the table name after the "\D" command. This will give us SQL to create that table. \D tablo
If we want to get a list of selected database objects, we just need to list them after the "\d" command. I will list schemas, sequences and views.
\dnsv
We can get lists of these objects: n – schemas f – functions s – sequences t – tables v – views Just "\d" will give as tables and views by default. It is the same as "\dvt".
We can combine "\d" command with the letter "S". That will add system objects to lists. \dSt
"\dS" is equal to "\dSvt".
We can get SQL dump only for objects with the specific name. \dst tablo Name should be written with the lower letters, or inside of double quotes, like "Tablo".
We can use wildcards "*" (several letters) and "_" (one letter). \dst *o Wildcards can not be used inside of double quotes. This will not work: \dst "*o"
Query Execution Control
Query Formatting
We can control how the result of a query will be presented ( default is "sql" ). Let's run "SELECT * FROM sys.auths;" with different modes.
If we don't want to see the result of a query, we can use "trash" and "rowcount" modes.
\f trash --trash will return nothing
\f rowcount -- only row count will return
\f sql --default
\f expanded -- records are vertical
\f csv -- comma isseparator
\f tab -- tab is separator
\f raw --raw database format
\f xml --like xml document
There are some special modes for the CSV.
\f csv=c --separator is letter "c", or any other sign
\f csv+c --header is included
<= Note that there are double quotes around the word "public".
For the rest =>, we will use query: SELECTschema_path FROM sys.users;
\f csv --quotes are doubled
\f csv-noquote --no doubling
\f sql --default
The Number of Rows
When we run query that will return a lot or rows, we will not be able to see all of them. They will just whiz by before our eyes. We want to control how the rows are displayed. We want to see them page by page. For that we will use "\r" command.
For this example, I will use system table sys.function_types. It has 7 rows.
SELECT * FROM sys.function_types;
I will set rows command to four: \r 4
We will get only one row for the first page. For the first page we always get N-3 rows. This is probably bug. If N is three or smaller, then the first page will be empty. For other pages we will get the complete 4 pages. We can jump to the next page with "Enter". That is the same as "next".
If we type the letter "c" and then we press ENTER, we will jump to the end of a table.
If we type the letter "q" and then we press ENTER, we will immediately exit the execution.
For the last example I will make terminal window small.
I will set command to zero: \r 0 I will query this table. SELECT * FROM sys.keywords;
This is the best mode, because now the page is the size of the window.
I will press ENTER several times to go through a few pages. Then I will enlarge the window.
Only the first page will have 2 rows that are visible in one window. Other pages will have five rows. Obviously we have some "offset by 3" bug.
We will exit paging mode by setting the command to minus one "\r -1".
Width of a Table
If we set "\w" to minus one, MonetDB will think that we have unlimited width terminal. I've narrowed the window, so presentation will be squeezed and crumbly. It becomes better if window is wider.
Default value of the "\w" command is zero. I made my terminal window narrow, so in this mode 11/16 columns will not be shown. Only the width of the terminal is used.
If we set "\w" to some positive number, MonetDB will think that the terminal is only a certain number of characters wide. I'll set it to 50 characters, which is enough for 4 columns.
Auto Commit Mode
By default, auto commit mode is turned ON in mclient. We can turn it OFF with the command "\a".
We turn it back to ON with "\A".
\a INSERT INTO tablo( Number ) VALUES ( 1 ); ROLLBACK; SELECT * FROM tablo; --nothing was auto commited
\A INSERT INTO tablo( Number ) VALUES ( 2 ); ROLLBACK; --not allowed in auto commit mode SELECT * FROM tablo; --value 2 is auto commited
Logging and Monitoring
I will leave the session and then I will log in, just to clear the history. quit mclient -u monetdb -d DBfarmDB --password monetdb
I will turn on logging and tracing on mclient. This is mostly useful for debugging. \L /home/fffovde/Desktop/FileLog --log to this file \X --trace everything that mclient is executing
I will now execute a query:
SELECT 2;
This is tracing. We'll get a lot of lines.
A new "LogFile" will appear with the similar content. This is logging.
We can disable logging and tracing. \L --we use \L without giving a full path \X false
I will run a few more queries: SELECT 3; SELECT 4;
Using the "\history" command, we can list all the SQL and commands we run in this session.
I will again quit the session, and log again, so I can change current directory.
quit cd /home/fffovde/Desktop mclient -u monetdb -d DBfarmDB --password monetdb
SQL Input and Output
We can redirect the result of our statement to some shell command. \| grep "alpha" SELECT * FROM (VALUES('alpha'),('beta'));
I will disable this command like this. Now our result is normal. \| --without shell command SELECT * FROM (VALUES('alpha'),('beta'));
We can redirect the result of our statement to a file. \>ResultFile SELECT 10, 20;
Every statement we run will now go to the file, and not to the terminal. The file is in the current directory. We can disable the effect of this command with: \>--without the name of a file
On the Desktop, I will create one textual file with one query inside of it.
I can call this query from the file. I'll type: \<SqlFile
This is usefull when we want to run some prepared SQL script. The problem is that we will get the result, but we will not know what statement caused that result. To solve this we will use echo "\e".
After we turn on echo, every statement will be shown before the result of the statement. Useless for SQL but great for files.
SELECT 107;
\<SqlFile
I will again quit, and then login back, to disable effect of the echo command. quit mclient -u monetdb -d DBfarmDB --password monetdb
Other Backslash Commands
"\?" will give a list of all of the backslash commands.
"\help" will give a list of SQL statements.
We already saw how to use timing in this blog post "link". It is also possible to turn on timing inside of the mclient. Options are "none, clock, performance".
\t clock SELECT 599;
\t performance SELECT 599;
--statements-- sql = parsing opt = optimizing run = running clk = total time
\t will return the current mode.
\tnone To disable timing.
We can always exit mclient with the "\q", or "quit".
Dash Commands
When we log in to mclient, we can provide an IP address, port number, username and database name. There are many more of these arguments
Many commands exist in both backslash and dash version. I will give you example of such commands here. During this example I will alternately login and quit, because dash commands can be run only during the login.
This will create sql dump of the whole database.
mclient -u monetdb -d DBfarmDB -D
This is the same as "\D" backslash command, that we saw earlier.
We can enable tracing and logging.
mclient -u monetdb -d DBfarmDB -X -L FileLog
The same as "\X" and "\L".
We will set that our queries return expanded format, we will turn on echo and the timing. mclient -u monetdb -d DBfarmDB -f expanded -t clock -e
Format and Time support all of the possible modes.
The same as "\f, \t, \e".
With "-a" switch we can turn off autocommit for the whole session: "mclient -u monetdb -d DBfarmDB -a".
When we run a query, it will be paged to 5 rows, and the visible width will be 50 characters. mclient -u monetdb -d DBfarmDB -r 5 -w 50
We already saw this with commands "\r, \w".
We can apply shell command on all of our queries. Notice that pipe must be escaped. mclient -u monetdb -d DBfarmDB -\| 'grep m'
The same as \| grep "m".
History, Null and Timezone
I will login again with some new options: mclient -u monetdb -d DBfarmDB -z -n "zero" -H
Option "-z" is for disabling the time zone. Statement: SELECT CURRENT_TIME; , will give me Greenwich time, not my local time.
Option "-n" will define how the null will be presented in the terminal. This is only valid when "-f" format is either "sql,csv, tab or expanded".
Default format is "sql", we have set nul to be "zero", so the null will be presented as a "zero". SELECT null; --result is "zero"
In this session we have run two statements: --SELECT CURRENT_TIME; --SELECT null;
Because our session was started with the switch history "-H" both of these statements are recorded in the file: /home/fffovde/.mapiclient_history_sql
I will now quit this session, and then I will login again: quit mclient -u monetdb -d DBfarmDB -H
Because I have used "-H" switch, history of mclient statements will be available for me. By pressing keyboard arrows, I will list that history.
If I run one more statement, that statement will be added to "/.mapiclient_history_sql" file. So, when "-H" switch is active, new statements will be added to history file, and that history will be available to us. SELECT 55; We can delete the history by deleting directly from the ".mapiclient_history_sql" file.
Getting Data from URLs
I will create a new table and then I will try to get data from the internet to insert into that table. Mclient will refuse to do that. CREATE TABLE IndustryList( Industry VARCHAR(50) ); COPY INTO IndustryList FROM 'https://cdn.wsform.com/wp-content/uploads/2020/06/industry.csv' ON CLIENT;
I will quit the session, and then I will login with a remote "-R" switch. quit mclient -u monetdb -d DBfarmDB -R--password monetdb COPY INTO IndustryList FROM 'https://cdn.wsform.com/wp-content/uploads/2020/06/industry.csv' ON CLIENT;
It will work now. I can read from "IndustryList" table. SELECT * FROM IndustryList;
SQL Dump with Insert Into
Normal dump will create a script that will fill a table with the data from the Stdin. It will use "COPY INTO FROM stdin" statement. mclient -u monetdb -d DBfarmDB -D
If we want to get "INSERT INTO" statements, then we have to add switch "-N". mclient -u monetdb -d DBfarmDB -D -N
Running SQL Statements from the Shell
With "-s" switch we can run a statement, instead of logging in. mclient -u monetdb -d DBfarmDB -s "SELECT null;"
We can also run a script directly from the shell. We previously created this file. mclient -u monetdb -d DBfarmDB SqlFile-i
We can use them both: mclient -u monetdb -d DBfarmDB -s "SELECT null;" SqlFile
Many of the previously shown options will work with the shell query, too. mclient -u monetdb -d DBfarmDB -s "SELECT null;" -n "zero"
I will add command "\f xml" in the SqlFile.
This will make our command to fail. mclient -u monetdb -d DBfarmDB SqlFile
We will add interactive switch "-i". It will work now. mclient -u monetdb -d DBfarmDB SqlFile -i
DOTMONETDBFILE
We already talked about DOTMONETDBFILE environment variable, in the blog post about backup "link". All of that is valid for the mclient, too.