By prefixing a string literal with the letter e, we can write control characters (tab, newline, carriage return) using their escape specifiers. These specifiers also work without the e prefix.

Control Character	Name	Usage
\n	Newline	Moves the cursor to the next line.
\r	Carriage return	Moves the cursor to the beginning of the current line.
\t	Tab space	Inserts a tab space.
\b	Backspace	Deletes the character just before the control character.
\f	Form feed	Moves the cursor to the next page.
\u	Unicode character	Represents a Unicode character using a code number.

Without e, the result is the same ( `e'A\tB'` = `'A\tB'` ). If we want the specifiers to be taken literally, we have to use the prefix R.

We can write a blob as pairs of hexadecimal characters. Such a literal must be preceded by X or blob. `SELECT x'12AB', blob '12AB';`

Concatenation

Number of Characters

Searching for One String Within Another

Get the Substring by its Position

Instead of SUBSTRING, we can use SUBSTR.

Convert Case

Max and Min

Unicode and ASCII

Padding of The String

Trimming The String

String Repetition

Extracting the Start or End of the String

Guess the Start or the End of the String

Does One String Contains Another

Insert and Replace

Find a String in the List

Other Functions

Blob Functions

String Similarity

Before calculating the similarity between two strings, we should standardize those strings as much as possible. In MonetDB, we can achieve that with the QGRAMNORMALIZE function.

To transform the word "kitten" into the word "sitting", we need one insert and two updates, so the Levenshtein distance between them is 3.
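The counting above can be checked with a small Python sketch of the classic Levenshtein algorithm with unit costs. This is only an illustration of the idea, not MonetDB's implementation, and the function name `levenshtein` here is a plain Python helper chosen for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic Levenshtein distance: insert, delete and update each cost 1."""
    # prev[j] is the distance between the already processed prefix of a
    # and the first j characters of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning i characters of a into an empty string takes i deletes
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # update ca into cb (free if they are equal)
            ))
        prev = curr
    return prev[-1]


print(levenshtein("kitten", "sitting"))  # 3: k->s, e->i, insert g
```

The three steps are the two updates (k to s, e to i) and the one insert (g at the end).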

The DAMERAU-LEVENSHTEIN function is similar to the LEVENSHTEIN function. The difference is that DAMERAU-LEVENSHTEIN also takes transpositions of adjacent letters into account.
In the DAMERAU-LEVENSHTEIN function, "AC" -> "CA" is only one step. In the LEVENSHTEIN function, we would have to update two letters, so that transformation would take 2 steps.

It seems that in MonetDB these two functions are the same, and the behavior is controlled by the number of arguments.

There is also a version of the LEVENSHTEIN function with 4 arguments. In this version, the default cost of an insert is 1. The third argument is the cost of a DELETE, and the fourth is the cost of a REPLACE. Transposition is not possible in this case, so it will not act as a DAMERAULEVENSHTEIN function.

EDITDISTANCE is the same as DAMERAULEVENSHTEIN. Everything costs 1 point, but transposition costs 2 points. `SELECT EDITDISTANCE( 'AC', 'CA' );`
EDITDISTANCE2 is the same as DAMERAULEVENSHTEIN. Everything costs 1 point. `SELECT EDITDISTANCE2( 'AC', 'CA' );`
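To see how the transposition cost changes the result for 'AC' and 'CA', here is a hedged Python sketch of an edit distance with configurable costs. It uses the "optimal string alignment" variant of Damerau-Levenshtein, which is an assumption on my part; the text above does not say which variant MonetDB implements. The function and parameter names (`edit_distance`, `ins_del`, `substitute`, `transpose`) are made up for this example.

```python
def edit_distance(a: str, b: str, ins_del: int = 1,
                  substitute: int = 1, transpose: int = 1) -> int:
    """Edit distance with adjacent transpositions and configurable costs."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = i * ins_del  # delete every character of a
    for j in range(1, cols):
        d[0][j] = j * ins_del  # insert every character of b
    for i in range(1, rows):
        for j in range(1, cols):
            sub = 0 if a[i - 1] == b[j - 1] else substitute
            d[i][j] = min(
                d[i - 1][j] + ins_del,  # delete
                d[i][j - 1] + ins_del,  # insert
                d[i - 1][j - 1] + sub,  # update (or keep)
            )
            # Two adjacent letters swapped: "AC" -> "CA".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + transpose)
    return d[-1][-1]


print(edit_distance("AC", "CA", transpose=1))  # 1: a single transposition
print(edit_distance("AC", "CA", transpose=2))  # 2: transposition is no cheaper than two updates
```

With a transposition cost of 2, swapping two letters is no better than updating both of them, which matches the difference described above between the two cost schemes.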

The PATINDEX function is similar to the other search functions (LOCATE, CHARINDEX, POSITION). The difference is that PATINDEX can use wildcards: '%' for zero or more characters, or '_' for exactly one character. It returns the position of the last letter of the match, not the first. `SELECT PATINDEX( '%eS', 'succeSslessness' ), PATINDEX( '____eS', 'succeSslessness' );`
These examples show the logic behind PATINDEX:
`PATINDEX( 's%c', 'succeSslessness' )` returns 0.
`PATINDEX( '%s%c', 'succeSslessness' )` returns 4.
`PATINDEX( '%s%c%', 'succeSslessness' )` returns 4.
`PATINDEX( 'c_S', 'succeSslessness' )` returns 6.
`PATINDEX( 'succeSslessness%', 'succeSslessness' )` returns 15.
`PATINDEX( 'succeSslessness_', 'succeSslessness' )` returns 0.
All of these functions (LOCATE, CHARINDEX, POSITION, PATINDEX) are case sensitive. If they cannot find the substring, all of them return zero. `SELECT CHARINDEX( 'ES', 'successlessness' );`

If we want LEVENSHTEIN or DAMERAULEVENSHTEIN to behave like a true DAMERAU-LEVENSHTEIN function, we have to provide 5 arguments. The last three arguments are, in order, the costs for insert/delete, substitution, and transposition. `SELECT LEVENSHTEIN('AC','CA', 1, 1, 1 ), DAMERAULEVENSHTEIN('AC','CA', 1, 1, 1 );`
If we provide only two arguments, then both functions will act as a LEVENSHTEIN function. `SELECT LEVENSHTEIN('AC','CA'), DAMERAULEVENSHTEIN('AC','CA');`

0270 String Functions in MonetDB

Concatenation

Number of Characters

Searching for One String Within Another

Get the Substring by its Position

Convert Case

Max and Min

Unicode and ASCII

Padding of The String

Trimming The String

String Repetition

Extracting the Start or End of the String

Guess the Start or the End of the String

Does One String Contains Another

Insert and Replace

Find a String in the List

Other Functions

Blob Functions

String Similarity

Jaro-Winkler function

Soundex


The SUBSTRING function returns all the letters of the string starting from a given position. With the third argument, we can limit the number of characters returned. `SELECT SUBSTRING( '123456', 3 ), SUBSTRING( '123456', 3, 2 ) ;`
If the second argument is too large, we will get an empty string. If the third argument is too large, it has no effect, and the result simply runs to the end of the string. `SELECT SUBSTRING( '123', 5 ), SUBSTRING( '123', 2, 7 ) ;`
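The SUBSTRING rules above can be modeled in a few lines of Python. This is only a sketch of the described behavior for positions of 1 or more, not MonetDB's code, and the helper name `substring` is chosen for this example.

```python
from typing import Optional


def substring(s: str, start: int, length: Optional[int] = None) -> str:
    """Model of SUBSTRING for start >= 1 (SQL positions are 1-based)."""
    begin = start - 1  # convert the 1-based SQL position to a 0-based index
    if length is None:
        return s[begin:]  # everything from the start position onward
    return s[begin:begin + length]  # slicing past the end is harmless in Python


print(substring("123456", 3))     # 3456
print(substring("123456", 3, 2))  # 34
print(repr(substring("123", 5)))  # '' because the start is past the end
print(substring("123", 2, 7))     # 23, the oversized length has no effect
```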

We can pad a string with spaces on the left or on the right. The second argument sets the length of the resulting string. `SELECT '=>' || LPAD( 'zz', 5 ) AS LeftPadding, RPAD( 'zz', 5 ) || '<=' AS RightPadding;`
If given, the third argument is used for padding instead of the space character. `SELECT LPAD( 'zz', 7, 'OE' ) AS LeftPadding, RPAD( 'zz', 7, 'OE' ) AS RightPadding;`
If the string is already too long, it will be truncated from the right side. `SELECT LPAD( '12345', 3 ) AS LeftPadding, RPAD( '12345', 3 ) AS RightPadding;`
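The padding rules can be sketched in Python as follows. The helpers `lpad` and `rpad` are written for this example only. The sketch assumes that a multi-character fill string is repeated and cut to fit, which is how LPAD and RPAD commonly behave in SQL databases; the text above does not show the exact MonetDB output for that case.

```python
def _padding(missing: int, fill: str) -> str:
    """Repeat the fill string and cut it to exactly `missing` characters."""
    return (fill * missing)[:missing]


def lpad(s: str, length: int, fill: str = " ") -> str:
    if len(s) >= length:
        return s[:length]  # too long: truncate from the right side
    return _padding(length - len(s), fill) + s


def rpad(s: str, length: int, fill: str = " ") -> str:
    if len(s) >= length:
        return s[:length]  # too long: truncate from the right side
    return s + _padding(length - len(s), fill)


print("=>" + lpad("zz", 5))  # =>   zz
print(rpad("zz", 5) + "<=")  # zz   <=
print(lpad("zz", 7, "OE"))   # OEOEOzz (assumed repeat-and-cut behavior)
print(lpad("12345", 3))      # 123
```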

We can trim spaces from the start, from the end, or from both ends of a string. `SELECT 'leftTrim' || LTRIM( ' <=' ), RTRIM( '=> ' ) || 'righTrim', TRIM( ' A ' ) AS BothSides;`
The second argument can be used to specify the string to trim. `SELECT 'leftTrim' || LTRIM( 'UUU<=', 'UU' ), RTRIM( '=>UUU', 'UU' ) || 'righTrim', TRIM( 'zzAzz', 'z' ) AS BothSides;`

This is how we can check whether a string starts or ends with some other string. The comparison is case sensitive by default. `SELECT startswith( 'ABC', 'AB' ), endswith( 'ABC', 'bC' );`
If we want to make the comparison case insensitive, we have to provide TRUE as the third argument. `SELECT startswith( 'ABC', 'ab', true ), endswith( 'ABC', 'bC', true );`
