0350 Exporting Data and Binary Files in MonetDB
Sample Table
CREATE TABLE tblCSV ( Letter VARCHAR(10), Number REAL, Calendar DATE ); INSERT INTO tblCSV (Letter, Number, Calendar) VALUES ('A', 3.82, null), ('B', 4.83, '2025-01-02'), ('C', 5.64, '2025-01-03'); | ![]() |
Exporting Data from MonetDB to CSV
We saw how to import data from CSV file. For exporting to CSV file, we will again use COPY INTO statement.
COPY SELECT * FROM tblCSV INTO '/home/fffovde/Desktop/tblCSV'; By default, we'll get textual file, in UTF-8 encoding (1). Line endings are going to be LF (2). Column separators are pipes (3). Default string wrappers will be double quotes (4). Nulls will be represented as a >null< (5). Notice that I have omitted "ON SERVER" subclause. Subclause "ON SERVER" is the default. | ![]() |
Default string wrappers are not the same when writing and reading strings. They are double quotes when writing, and empty strings when reading.
We can change default settings, like in this example. Column separator is TAB sign (1). String wrapper is a star (2). NULLs are represented with >Nil< (3). COPY SELECT * FROM tblCSV INTO '/home/fffovde/Desktop/tblCSV' ON SERVER USING DELIMITERS E'\t', E'\n', '*' NULL AS 'Nil'; | ![]() |
We can not overwrite the existing file with the same name. That would return the error. | ![]() |
Exporting Compressed CSV File
If our file name has extensions .xz .bz2 .gz .lz4, then the result file will be compressed. Compression level is the best when using .xz ( .xz > .bz2 > .gz > .lz4 ). For better compression, the more time is needed.
Let's use extreme zip compression (.xz). COPY SELECT * FROM tblCSV INTO '/home/fffovde/Desktop/tblCSV.xz'; Our file is reduced from 648 to 108 bytes. | ![]() |
Exporting CSV File on the Client
I will demonstrate exporting of a CSV file, on the client, by using python. We will use the code bellow. This is the same code used for uploading files, with two distinctions. Instead "set_uploader" we are using "set_downloader". COPY INTO statement is also different.
import pymonetdb connection = pymonetdb.connect(username="monetdb", password="monetdb", hostname="localhost", database="voc") handler = pymonetdb.SafeDirectoryHandler("/home/fffovde/Desktop") connection.set_downloader(handler) cursor = connection.cursor() cursor.execute("COPY SELECT * FROM tblCSV INTO '/home/fffovde/Desktop/tblCSV.bz2' ON CLIENT;") connection.commit() cursor.close() connection.close() | Our file will be saved in compressed state: ![]() |
Binary Files
MonetDB can import/export data even faster than from/to CSV files. For that we can use binary files.
Little Endian vs Big Endian
Let's say that we have hexadecimal number 0x12345678. Every two figures represent one byte:
12 => 00001100 | 34 => 00100010 | 56 => 00111000 | 78 => 01001110 |
![]() | When we write our number to memory, we can write it in two ways. – We can write it in order 12345678. This is called Big Endian. – We can write it in the reverse order 78563412. This is called Little Endian. This distinction is important when writing binary files. |
There is also a term Native Endian. That is the preferred byte order of the system MonetDB is running on. If your system is using AMD, ARM, or Intel CPU, then your system is using Little Endian.
We can check the Endianness of our system.lscpu | grep "Byte Order" | ![]() |
Exporting Binary Files
MonetDB can export data into MonetDB custom binary files. Each table column will become a separate file. | COPY SELECT * FROM tblCSV INTO LITTLE ENDIAN BINARY |
![]() | As a result, we'll get three binary files. One file for each column. |
Instead of "Little Endian" we can use "Big Endian" or "Native Endian". Instead of "ON SERVER", we can use "ON CLIENT". For exporting data on the client, we can reuse the same python script shown above.
Loading Binary Files to MonetDB
Exported binary files can be imported in any MonetDB database. Before import, we can empty database table "TRUNCATE tblCSV;
".
COPY LITTLE ENDIAN BINARY INTO tblCSV( Letter, Number, Calendar ) FROM '/home/fffovde/Desktop/Letter', '/home/fffovde/Desktop/Number', '/home/fffovde/Desktop/Calendar' ON SERVER; | ![]() |
TRUNCATE tblCSV;
It is possible to load less then three columns. We can see that "Calendar" column is empty.COPY LITTLE ENDIAN BINARY INTO tblCSV( Letter, Number ) FROM '/home/fffovde/Desktop/Letter', '/home/fffovde/Desktop/Number' ON SERVER; | ![]() |
TRUNCATE tblCSV;
COPY LITTLE ENDIAN BINARY INTO tblCSV FROM '/home/fffovde/Desktop/Letter', | We don't have to declare columns of the database table. In that case, we just have to make sure that order and number of files is the same as the order and number of the columns in the table. |
Loading Binary Files to MonetDB on Client
TRUNCATE tblCSV;
import pymonetdb connection = pymonetdb.connect(username="monetdb", password="monetdb", hostname="localhost", database="voc") handler = pymonetdb.SafeDirectoryHandler("/home/fffovde/Desktop") connection.set_uploader(handler) cursor = connection.cursor() cursor.execute("COPY LITTLE ENDIAN BINARY INTO tblCSV FROM '/home/fffovde/Desktop/Letter', '/home/fffovde/Desktop/Number', '/home/fffovde/Desktop/Calendar' ON CLIENT;") connection.commit() cursor.close() connection.close() | For uploading binary files to MonetDB, from the client, we are using the similar python script that we have used for CSV files. |