010. MonetDB  – URL and Network Data Types

URL Data Type is used for storing URL addresses. We will create a table with this type of data. We can also use something like URL(512) if we want to limit the number of characters.

CREATE TABLE URLtable( URLcolumn URL );

Inside of this column we can store both string and URL data types.

INSERT INTO URLtable ( URLcolumn ) VALUES ( 'https://www.monetdb.org/documentation/user-guide/' );
INSERT INTO URLtable ( URLcolumn ) VALUES ( URL'https://www.monetdb.org/documentation/user-guide/' );

We can create URL data type by using CONVERT and CAST functions, or by using URL prefix:

SELECT CONVERT('https://www.monetdb.org/documentation/user-guide/', url);
SELECT CAST('https://www.monetdb.org/documentation/user-guide/' AS url);
SELECT URL 'https://www.monetdb.org/documentation/user-guide/';

URL Data Type functions

Suppose we have this URL in our table:

'https://me@www.monetdb.org:458/Doc/Abc.html?lang=nl&sort=asc#example'

We can divide this URL into its component parts:

https://me@www.monetdb.org:458/Doc/Abc.html?lang=nl&sort=asc#example
protocol
or
schema
userhostportpath
or
context
queryanchor
or
fragment

We can then read all these parts using the MonetDB URL functions. If a part does not exist, these functions will return NULL. All functions return a CLOB.

To extract the "host" we have two functions, GETHOST and URL_EXTRACT_HOST. The URL_EXTRACT_HOST function accepts a second argument that can exclude the "www" part from the host, if the value of this argument is true.

SELECT SYS.GETPROTOCOL( URLcolumn ) FROM URLtable;https
SELECT SYS.GETUSER(URLcolumn) FROM URLtable;me
SELECT SYS.GETHOST( URLcolumn ) FROM URLtable;
SELECT SYS.URL_EXTRACT_HOST( URLcolumn, false )FROM URLtable
SELECT SYS.URL_EXTRACT_HOST( URLcolumn, true ) FROM URLtable;
www.monetdb.org
www.monetdb.org
          monetdb.org
SELECT SYS.GETPORT( URLcolumn ) FROM URLtable;458
SELECT SYS.GETCONTEXT( URLcolumn ) FROM URLtable;/Doc/Abc.html
SELECT SYS.GETQUERY( URLcolumn ) FROM URLtable;lang=nl&sort=asc
SELECT SYS.GETANCHOR( URLcolumn ) FROM URLtable;example

The path (context) can be further divided into:

SELECT SYS.GETFILE( URLcolumn ) FROM URLtable;Abc.html
SELECT SYS.GETBASENAME( URLcolumn ) FROM URLtable;Abc
SELECT SYS.GETEXTENSION( URLcolumn ) FROM URLtable;html

From the host we can read the domain separately:

SELECT SYS.GETDOMAIN( URLcolumn ) FROM URLtable;org

Using the SIS.ISAURL function, we can check if something is a valid URL.

SELECT SYS.ISAURL( URLcolumn ) FROM URLtable;true

Robots.txt is a text file with instructions for search engines (e.g. Google). It is always located in the root directory of the web server. This function will return the location of Robots.txt.

SELECT SYS.GETROBOTURL( URLcolumn ) FROM URLtable;https://me@www.monetdb.org:458/robots.txt

All the functions above will accept either string or URL data type as its argument. For creation of a new URL, we need arguments that are strings (or integer for ports). Function for creation of a new URL-s, can accept two combinations of arguments, as we can see bellow. Compound argument 'usr@www.a.com:123' is called "authority".

SELECT SYS.NEWURL( 'https', 'usr@www.a.com:123', 'docs/index.html');https://usr@www.a.com:123/docs/index.html
SELECT SYS.NEWURl( 'https', 'www.a.com', 5567, 'docs/index.html');https://www.a.com:5567/docs/index.html

Network Data Type

This data type is used to store IPv4 addresses, such as '192.168.1.5/24'. We can create column with this data type, and we can write strings and network data type in this column.

CREATE TABLE InetTable ( InetColumn INET );
INSERT INTO InetTable ( InetColumn ) VALUES ( '192.168.1.5/24' );
INSERT INTO InetTable ( InetColumn ) VALUES ( INET '192.168.1.5/24' );

For creation of INET data type we can use CONVERT, CAST functions or INET prefix operator.

SELECT CONVERT('192.168.1.5/24', INET);
SELECT CAST('192.168.1.5/24'  AS INET);
SELECT INET '192.168.1.5/24';

Network Data Type Operators

Network data type operators are used to compare two network addresses. Network addresses consist of four numbers between 1 and 255 ( 1.1.1.1 to 255.255.255.255 ). We can add zeros to those numbers, so that each number has three digits. Then we can remove dots.

17.18.203.1   =>017.018.203.001       =>017018203001
221.42.2.56   =>221.042.002.056       =>221042002056

Logic of comparison is simple, if 221042002056   >   017018203001 then 221.42.2.56   >   17.18.203.1 . We can test this with MonetDB operator.

SELECT INET '221.42.2.56'   >  INET '17.18.203.1';true

We can use all mathematical operators. If they can work on numbers, they can work on network addresses.

<<==>=><>

Network Data Type Operators for Belonging

Maximal network address can be 255.255.255.255. Number 255 can be presented as 28. This means that total number of combinations is 28 x 28 x 28 x 28=232. We can say that each number has 8 bits, but total IP address has 32 bits.

We have address for a network, and address for a computer on that network.

Network addressComputer address
221221.42.2.56On large networks we can have many computers, so we need three numbers to label them all.
221.42221.42.2.56On medium networks two numbers are enough to mark all of our computers.
221.42.2221.42.2.56On small networks we only need one number. Here we can have up to 255 computers.

In the second column, in the table above, we see that the computer address is composed of two parts. One part will indicate the network and the other will be for the computer on that network. I used red and black to make a difference. In the real world, this distinction is made using this syntax such as "221.42.2.56/24". This "/24" suffix means that the first 24 bits (the first three numbers) are for the network and the rest for the computer.

Now we can understand operators for belonging. These operators will tell us whether some computer address belong to some network.

SELECT INET '192.168.1.5' << INET '192.168.1/24';trueComputer 192.168.1.5 belongs to network 192.168.1.
SELECT INET '192.168.1/24' <<= INET '192.168.1/24'trueSame as above, but this will also return true if we compare
two same networks. In our example we have the same network
on the both side of the operator.

Of course, we can also use reverse operators >> and >>=, so we have to be careful what is on the left, and what is on the right side of our operator.

Network Data Type Functions

All of these functions will accept only INET data type.

SELECT SYS.ABBREV( INET '10.1.0.0/16')10.1/16This function will remove the trailing zeros,
to shorten network address. Returns CLOB.
SELECT SYS.BROADCAST( INET '192.168.1.5/24')192.168.1.255/24If we want to send message to all of the computers on
the network, we use broadcast address. This is network
part of address, and all other numbers are 255. In our
example we have 192.168.1 + 255 = 192.168.1.255.
Returns INET.
SELECT SYS.HOST( INET '192.168.1.5/24')192.168.1.5This will just extract host from network address
(by removing "/24"). Returns CLOB.
SELECT SYS.LEFT_SHIFT( inet '192.168.1.5'          
                                          , inet '192.168.1/24' )
trueSame as
"SELECT INET '192.168.1.5'  <<  inet '192.168.1.5/24';".
This computer address is from this network.
SELECT SYS.LEFT_SHIFT_ASSIGN(inet '192.168.1/24'
                                                         , inet '192.168.1/24')
trueSame as
" SELECT INET '192.168.1/24' <<= INET '192.168.1/24' ".
SELECT SYS.MASKLEN( INET '192.168.1.5/24' )24This will return "/24" as an integer.
SELECT SYS.NETMASK( INET '192.168.1.5/24' )255.255.255.0This is a different syntax for "/24". They mean the
same thing. Returns INET.
SELECT SYS.NETWORK( INET '192.168.1.5/24' )192.168.1.0/24It will extract network part of an address. Everything
else will be zero. Returns INET.
SELECT SYS.RIGHT_SHIFT( INET '192.168.1/24'
                                           , INET '192.168.1.5' )
trueSame as
"SELECT INET '192.168.1/24' >> INET '192.168.1.5' ; ".
This computer address is from this network.
SELECT SYS.RIGHT_SHIFT_ASSIGN( INET '192.168.1/24'
                                                            , INET '192.168.1/24')
trueSame as
"SELECT INET '192.168.1/24' >>= INET '192.168.1/24' ;".
SELECT SYS.SETMASKLEN( inet '192.168.1.5/24',  16 )192.168.1.5/16Returns INET with "/24" changed to "/16".
The second argument is an INTEGER.
SELECT SYS.TEXT( INET '192.168.1.5' )"192.168.1.5/32"Returns INET as a text. It attached "/32" part.

There is another function. I don't know what it's used for, but I figured out how it works. It seems that this function will show us the maximum number of computers that can be in this network.

SELECT SYS.HOSTMASK( INET '192.168.23.20/30' )0.0.0.3It is calculated like 232 / 230 = 22 = 4. Result is 4 -1 = 3.
SELECT SYS.HOSTMASK( INET '192.168.23.20/24' )0.0.0.255It is calculated like 232 / 224 = 28 = 256. Result is 256 – 1 = 255.
SELECT SYS.HOSTMASK( INET '192.168.23.20/17' )0.0.127.255It is calculated like 232 / 217 = 215 = 27 x 28 = 128 x 256.
Result is 128 -1 = 127 and 256-1=255.

Leave a Comment

Your email address will not be published. Required fields are marked *