URL Data Type is used for storing URL addresses. We will create a table with this type of data. We can also use something like URL(512) if we want to limit the number of characters.
CREATE TABLE URLtable( URLcolumn URL );
Inside of this column we can store both string and URL data types.
INSERT INTO URLtable ( URLcolumn ) VALUES ( 'https://www.monetdb.org/documentation/user-guide/' );
INSERT INTO URLtable ( URLcolumn ) VALUES ( URL'https://www.monetdb.org/documentation/user-guide/' );
We can create URL data type by using CONVERT and CAST functions, or by using URL prefix:
SELECT CONVERT('https://www.monetdb.org/documentation/user-guide/', url);
SELECT CAST('https://www.monetdb.org/documentation/user-guide/' AS url);
SELECT URL 'https://www.monetdb.org/documentation/user-guide/';
URL Data Type functions
Suppose we have this URL in our table:
'https://me@www.monetdb.org:458/Doc/Abc.html?lang=nl&sort=asc#example'
We can divide this URL into its component parts:
https | :// | me | @ | www.monetdb.org | : | 458 | /Doc/Abc.html | ? | lang=nl&sort=asc | # | example |
protocol or schema | user | host | port | path or context | query | anchor or fragment |
We can then read all these parts using the MonetDB URL functions. If a part does not exist, these functions will return NULL. All functions return a CLOB.
To extract the "host" we have two functions, GETHOST and URL_EXTRACT_HOST. The URL_EXTRACT_HOST function accepts a second argument that can exclude the "www" part from the host, if the value of this argument is true.
SELECT SYS.GETPROTOCOL( URLcolumn ) FROM URLtable; | https |
SELECT SYS.GETUSER(URLcolumn) FROM URLtable; | me |
SELECT SYS.GETHOST( URLcolumn ) FROM URLtable; SELECT SYS.URL_EXTRACT_HOST( URLcolumn, false )FROM URLtable SELECT SYS.URL_EXTRACT_HOST( URLcolumn, true ) FROM URLtable; | www.monetdb.org www.monetdb.org monetdb.org |
SELECT SYS.GETPORT( URLcolumn ) FROM URLtable; | 458 |
SELECT SYS.GETCONTEXT( URLcolumn ) FROM URLtable; | /Doc/Abc.html |
SELECT SYS.GETQUERY( URLcolumn ) FROM URLtable; | lang=nl&sort=asc |
SELECT SYS.GETANCHOR( URLcolumn ) FROM URLtable; | example |
The path (context) can be further divided into:
SELECT SYS.GETFILE( URLcolumn ) FROM URLtable; | Abc.html |
SELECT SYS.GETBASENAME( URLcolumn ) FROM URLtable; | Abc |
SELECT SYS.GETEXTENSION( URLcolumn ) FROM URLtable; | html |
From the host we can read the domain separately:
SELECT SYS.GETDOMAIN( URLcolumn ) FROM URLtable; | org |
Using the SIS.ISAURL function, we can check if something is a valid URL.
SELECT SYS.ISAURL( URLcolumn ) FROM URLtable; | true |
Robots.txt is a text file with instructions for search engines (e.g. Google). It is always located in the root directory of the web server. This function will return the location of Robots.txt.
SELECT SYS.GETROBOTURL( URLcolumn ) FROM URLtable; | https://me@www.monetdb.org:458/robots.txt |
All the functions above will accept either string or URL data type as its argument. For creation of a new URL, we need arguments that are strings (or integer for ports). Function for creation of a new URL-s, can accept two combinations of arguments, as we can see bellow. Compound argument 'usr@www.a.com:123' is called "authority".
SELECT SYS.NEWURL( 'https', 'usr @www.a.com:123', 'docs/index.html'); | https://usr @www.a.com:123/docs/index.html |
SELECT SYS.NEWURl( 'https', 'www.a.com', 5567, 'docs/index.html'); | https://www.a.com:5567/docs/index.html |
Network Data Type
This data type is used to store IPv4 addresses, such as '192.168.1.5/24'. We can create column with this data type, and we can write strings and network data type in this column.
CREATE TABLE InetTable ( InetColumn INET );
INSERT INTO InetTable ( InetColumn ) VALUES ( '192.168.1.5/24' );
INSERT INTO InetTable ( InetColumn ) VALUES ( INET '192.168.1.5/24' );
For creation of INET data type we can use CONVERT, CAST functions or INET prefix operator.
SELECT CONVERT('192.168.1.5/24', INET);
SELECT CAST('192.168.1.5/24' AS INET);
SELECT INET '192.168.1.5/24';
Network Data Type Operators
Network data type operators are used to compare two network addresses. Network addresses consist of four numbers between 1 and 255 ( 1.1.1.1 to 255.255.255.255 ). We can add zeros to those numbers, so that each number has three digits. Then we can remove dots.
17.18.203.1 => | 017.018.203.001 => | 017018203001 |
221.42.2.56 => | 221.042.002.056 => | 221042002056 |
Logic of comparison is simple, if 221042002056 > 017018203001 then 221.42.2.56 > 17.18.203.1 . We can test this with MonetDB operator.
SELECT INET '221.42.2.56' > INET '17.18.203.1'; | true |
We can use all mathematical operators. If they can work on numbers, they can work on network addresses.
< | <= | = | >= | > | <> |
Network Data Type Operators for Belonging
Maximal network address can be 255.255.255.255. Number 255 can be presented as 28. This means that total number of combinations is 28 x 28 x 28 x 28=232. We can say that each number has 8 bits, but total IP address has 32 bits.
We have address for a network, and address for a computer on that network.
Network address | Computer address | |
221 | 221.42.2.56 | On large networks we can have many computers, so we need three numbers to label them all. |
221.42 | 221.42.2.56 | On medium networks two numbers are enough to mark all of our computers. |
221.42.2 | 221.42.2.56 | On small networks we only need one number. Here we can have up to 255 computers. |
In the second column, in the table above, we see that the computer address is composed of two parts. One part will indicate the network and the other will be for the computer on that network. I used red and black to make a difference. In the real world, this distinction is made using this syntax such as "221.42.2.56/24". This "/24" suffix means that the first 24 bits (the first three numbers) are for the network and the rest for the computer.
Now we can understand operators for belonging. These operators will tell us whether some computer address belong to some network.
SELECT INET '192.168.1.5' << INET '192.168.1/24'; | true | Computer 192.168.1.5 belongs to network 192.168.1. |
SELECT INET '192.168.1/24' <<= INET '192.168.1/24' | true | Same as above, but this will also return true if we compare two same networks. In our example we have the same network on the both side of the operator. |
Of course, we can also use reverse operators >> and >>=, so we have to be careful what is on the left, and what is on the right side of our operator.
Network Data Type Functions
All of these functions will accept only INET data type.
SELECT SYS.ABBREV( INET '10.1.0.0/16') | 10.1/16 | This function will remove the trailing zeros, to shorten network address. Returns CLOB. |
SELECT SYS.BROADCAST( INET '192.168.1.5/24') | 192.168.1.255/24 | If we want to send message to all of the computers on the network, we use broadcast address. This is network part of address, and all other numbers are 255. In our example we have 192.168.1 + 255 = 192.168.1.255. Returns INET. |
SELECT SYS.HOST( INET '192.168.1.5/24') | 192.168.1.5 | This will just extract host from network address (by removing "/24"). Returns CLOB. |
SELECT SYS.LEFT_SHIFT( inet '192.168.1.5' , inet '192.168.1/24' ) | true | Same as "SELECT INET '192.168.1.5' << inet '192.168.1.5/24';". This computer address is from this network. |
SELECT SYS.LEFT_SHIFT_ASSIGN(inet '192.168.1/24' , inet '192.168.1/24') | true | Same as " SELECT INET '192.168.1/24' <<= INET '192.168.1/24' ". |
SELECT SYS.MASKLEN( INET '192.168.1.5/24' ) | 24 | This will return "/24" as an integer. |
SELECT SYS.NETMASK( INET '192.168.1.5/24' ) | 255.255.255.0 | This is a different syntax for "/24". They mean the same thing. Returns INET. |
SELECT SYS.NETWORK( INET '192.168.1.5/24' ) | 192.168.1.0/24 | It will extract network part of an address. Everything else will be zero. Returns INET. |
SELECT SYS.RIGHT_SHIFT( INET '192.168.1/24' , INET '192.168.1.5' ) | true | Same as "SELECT INET '192.168.1/24' >> INET '192.168.1.5' ; ". This computer address is from this network. |
SELECT SYS.RIGHT_SHIFT_ASSIGN( INET '192.168.1/24' , INET '192.168.1/24') | true | Same as "SELECT INET '192.168.1/24' >>= INET '192.168.1/24' ;". |
SELECT SYS.SETMASKLEN( inet '192.168.1.5/24', 16 ) | 192.168.1.5/16 | Returns INET with "/24" changed to "/16". The second argument is an INTEGER. |
SELECT SYS.TEXT( INET '192.168.1.5' ) | "192.168.1.5/32" | Returns INET as a text. It attached "/32" part. |
There is another function. I don't know what it's used for, but I figured out how it works. It seems that this function will show us the maximum number of computers that can be in this network.
SELECT SYS.HOSTMASK( INET '192.168.23.20/30' ) | 0.0.0.3 | It is calculated like 232 / 230 = 22 = 4. Result is 4 -1 = 3. |
SELECT SYS.HOSTMASK( INET '192.168.23.20/24' ) | 0.0.0.255 | It is calculated like 232 / 224 = 28 = 256. Result is 256 – 1 = 255. |
SELECT SYS.HOSTMASK( INET '192.168.23.20/17' ) | 0.0.127.255 | It is calculated like 232 / 217 = 215 = 27 x 28 = 128 x 256. Result is 128 -1 = 127 and 256-1=255. |