0290 Transactions in MonetDB
What is Transaction?
We insert 500 rows into the table. Due to hardware/software/power issues, our rows are only partially inserted. We do not know which rows are written and which are not. We are not sure whether rows written are correct or not. For this problem we use a transaction. A transaction is a guarantee that all rows will be inserted correctly or none of them. |
Let's assume another example. The employee was promoted. We need to change her job title in our database and increase her salary. For this we need two SQL statements. We want both statements to succeed or both to fail. Again, we can use a transaction. | START TRANSACTION; |
So, transaction is a set of statements that will be completed fully or will have no effect at all. This is how we preserve integrity and consistency of our database.
How Transactions Work in MonetDB?
MonetDB is using Optimistic concurrency control. That means that we do not use locks. Let's assume that we want to change the value 30 to value 60 in our database.
First, we will take a note that current value in the table is 30. → | Parallel to that, we will prepare our number 60 for the insertion. We want to prepare all in advanced, so that we can be really fast during insertion. |
After that two things can happen:
If the current value is still 30, then we will lock the table. This lock will be really short and will be fine grained, meaning we will only lock 1 record, in our example. It's almost as if we're not locking anything. After that, we will commit our change. Because we are really fast during commit (everything is already prepared), we reduce the chance that something bad will happen during that time. This guarantees us that the whole transaction will be completed successfully. |
If the value in the table is changed by some other transaction, while we were preparing our insert, then we abort our mission. If this is the case, server will send an error message. After that, the user or application, can decide whether they want to retry their transaction or to do something else. Error: Transaction failed: A conditional update failed The purpose of this fail is to avoid conflicts between transactions. |
Optimistic concurrency control is great for databases where we have high read, low write workloads. That means our database should not have a lot of
conflicts where two transactions are changing the data. It is great for analytic and web databases, because of speed and scalability.
Autocommit
START TRANSACTION; --no need for this | By default, MonetDB uses a transaction around each individual statement. No need to manually start and end a transaction. SELECT 2; — <= this is already an individual transaction |
Several Statements in One Transaction
To wrap several statements into one transaction, we have to use "START TRANSACTION" and "COMMIT" statements. Now, either both transactions will succeed or both will fail. | START TRANSACTION; |
Why my Transaction Doesn't Work?
We will create one sample table.CREATE TABLE LetterNumber( Letter CHAR, Number INTEGER ); |
If we start transaction, execute some statement and don't Commit, our change will still be visible. How is this possible without Commit?
START TRANSACTION; | If we start a new session, only then we will see that our change is not really committed. The change is only visible inside of our original session, so we can check the result of our statement without really committing it. |
READ ONLY and READ WRITE
MonetDB documentation claims that we can make our transaction READ ONLY. This is not true, MonetDB doesn't support read only mode. It does support READ WRITE mode, but that mode is the default, so this subclause is useless.
START TRANSACTION READ ONLY; | START TRANSACTION READ WRITE; is the default, so it is unnecessary. |
ISOLATION LEVEL
During the read of one column, someone else can change the data. If we need 5 seconds to read that column, during that time someone can: – Insert a new row. – Update a row. – Delete a row. The question is: Do we want to isolate our query from those changes? We can isolate our query completely or partially. Possible isolation levels are: 1) READ UNCOMMITTED 2) READ COMMITTED 3) REPEATABLE READ 4) SERIALIZABLE |
Default Isolation Level
Default isolation level in MonetDB is READ COMMITTED (not sure 100%). For the current session isolation level can be changed with SET TRANSACTION. This statement will change isolation level, but it will also automatically start a new transaction.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
| <= we change isolation level, but we also start a new transaction. <= this statement is inside the new transaction. <= we end this transaction. |
SELECT 3; | <= this is another transaction (because of autocommit). This transaction is in the same session so it is using READ UNCOMMITTED isolation level. So, this isolation level is now valid for all the transactions in the rest of the session. |
Because SET TRANSACTION automatically starts a new transaction, we can not use START TRANSACTION immediately after SET TRANSACTION. |
ISOLATION anomalies
ISOLATION levels are used to save us from different anomalies ( dirty reads, lost updates … ) that can occur in a concurrent environment. A concurrent environment is an environment in which different users execute their queries at the same time. Database will always fight to eliminate these anomalies. Even if we use the READ UNCOMMITTED isolation level, anomalies will probably not occur. |
READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, SERIALIZABLE isolation levels
READ UNCOMMITTED will not isolate us at all. We will see everything other people are doing.
If we start a READ UNCOMMITTED transaction (1), we will see that that in the third row we have C3 (2). In that moment, some other user will start a transaction (3), and he will change the third row to C5 (4). If we read our table again, inside our transaction, we will see that change (5). We are not isolated at all. Even uncommitted changes by other users will be visible. This anomaly is called "dirty reads". If we want to avoid this anomaly, we should use READ COMMITED isolation level. |
READ COMMITTED will save us from dirty reads, but it won't save us from "lost updates". Transaction A starts before Transaction B. Transaction A ends after Transaction B. Both transactions try to update the value 100 to something else. Since transaction A is committed last, the value written to the table will be 115. The value 75 is lost. The solution is the REPEATABAL READ level. This level, if used by Transaction A, would lock the value 100 after reading it. This would force transaction B to fail, after step X = X – 25. We don't know in advance which transaction would be the last, so both transactions should use REPEATABLE READ. |
There is a similar anomaly as "dirty reads". That anomaly is "Non-Repeatable Reads". The difference is that the other person will not only make the change, they will also commit the change. If we can see the change before a commit, it's a "dirty reads" anomaly, if we only see it after a commit, it's a " Non-Repeatable Reads " anomaly. REPEATABLE READ will save us from this anomaly, too.
The strictest isolation level is "SERIALIZABLE". This level will save us from "phantom reads". Phantom reads occur when one user start the transaction (1), and then he reads all the rows where gender is female (2). In the meantime, another user will start the transaction (3), he will delete one row where gender is female (4), and he will commit (5). If the first user tries to read the same rows again (inside of the same transaction), he will see the deletion made by another user (6). The same thing would happen if the rows are inserted, in that case we would see "phantom rows". |
How to assign isolation level in the individual transaction?
We saw how to assign default isolation level with SET TRANSACTION. It is similar for individual transaction:
START TRANSACTION REPEATABLE READ;
SELECT 2;
COMMIT;
Transaction control commands
We can control how our transaction will end. COMMIT keyword means that we want to execute our statement. ROLLBACK keyword means that we will fail our statement.
START TRANSACTION ISOLATION LEVEL READ COMMITTED; | START TRANSACTION ISOLATION LEVEL READ COMMITTED; UPDATE LetterNumber SET Number = 5 WHERE Letter = 'B'; COMMIT; --will execute |
We can automatically open a new transaction after COMMIT or ROLLBACK. For that we use AND CHAIN subclause:
START TRANSACTION; SELECT 3; | START TRANSACTION; SELECT 3; --rollback this second transaction |
What is a SAVEPOINT?
Savepoint is the same as the checkpoint in video games?
If your player dies before the checkpoint: | He will have to start again from the start: |
If your player dies after the checkpoint: | Then he can continue the game from the checkpoint: |
Idea of checkpoint is to save your progress, so you don't have to replay everything. In Transactions control, idea of Savepoint is to avoid repeating the whole transaction, if we can repeat only part of it.
How to use SAVEPOINT?
We'll make an example with two savepoints, then we will see what will happen if we apply ROLLBACK, ROLLBACK to savepoint1, or ROLLBACK to savepoint2.
START TRANSACTION; UPDATE LetterNumber SET Number = 2 WHERE Letter = 'A'; SAVEPOINT SP1; –the name of a savepoint UPDATE LetterNumber SET Number = 8 WHERE Letter = 'B'; SAVEPOINT SP2; –the name of a second savepoint UPDATE LetterNumber SET Number = 7 WHERE Letter = 'C'; | Then, we can use one of these three commands: ROLLBACK; – Would rollback the whole transaction. ROLLBACK TO SAVEPOINT SP2; – Would rollback to the second savepoint. ROLLBACK TO SAVEPOINT SP1; – Would rollback to the first savepoint. |
Starting status of our table: | Status before ROLLBACK: | If we use ROLLBACK: | If we go to SAVEPOINT SP2: | If we go to SAVEPOINT SP1: |
We can delete savepoints with RELEASE command. After we release a savepoint, we can not go back to that savepoint any more.
Before release: | RELEASE SAVEPOINT SP2; | RELEASE SAVEPOINT SP1; | RELEASE SAVEPOINT SP1; |
Savepoints are independent. If we release savepoint1, savepoint2 will still be alive and valid.
Synonyms
These commands are synonymous, we can use them interchangeably.
SET TRANSACTION; | START TRANSACTION; | COMMIT; | ROLLBACK; |
SET LOCAL TRANSACTION; | BEGIN TRANSACTION; | COMMIT WORK; | ROLLBACK WORK; |