Transactions in SQL



Now, as you can see in the example, A is a person who has 1000 rupees in his account and wants to transfer 500 rupees to B, whose account balance is 5000.

So now this information is stored in a table.

Steps to make this possible programmatically:

  1. Get the account balance of A. - > DB call.
  2. Check if A's balance >= amount. - > logical check, not a DB call.
  3. Debit Rs. 500 from A and update the table. - > DB call.
  4. Credit Rs. 500 to B and update the table. - > DB call.
Together, these 4 steps complete 1 transaction between two people.
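As a sketch, the four steps above can be written as plain SQL; the accounts table and its columns here are assumptions for illustration. Notice that nothing yet ties the steps together:

```sql
-- Hypothetical schema: accounts(id, balance)

-- Step 1: get A's balance (DB call)
select balance from accounts where id = 'A';

-- Step 2: application-side check, not a DB call:
-- proceed only if the returned balance >= 500

-- Step 3: debit Rs. 500 from A and update the table (DB call)
update accounts set balance = balance - 500 where id = 'A';

-- Step 4: credit Rs. 500 to B and update the table (DB call)
update accounts set balance = balance + 500 where id = 'B';
```

If the system fails between step 3 and step 4, the money is gone from A but never reaches B, which is exactly the intermediate-state problem analyzed below.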

Take a scenario:
Multiple people are trying to send money to the same account, just as it happens while shopping: multiple customers purchase goods from a shop, and each pays money from their own account into the shopkeeper's account.

Now suppose a developer has created a method to transfer an amount from one account to another as a transaction.

Scenario: Multiple people are making transactions and transferring money to the same person, say a shopkeeper, and these transactions are happening at exactly the same time.
Suppose the method is named bank_transfer, with parameters from, to and amount.

Now let us consider the process where these concurrent transactions happen, and analyze it.


Now suppose the transaction done by C is the last one to write, so B's final account balance becomes 15000.

As you can see, we are missing the 500 rupees which A transferred to B.

Let us analyze the issue now:
  • We lost 500 rupees during the transaction.
  • In case of a system failure, we might end up in an intermediate state:
    • Money got deducted from A but was not received by B.
This will lead to:
  • Illogical/incorrect/inconsistent data
  • Ending up in an intermediate state.
Now, to overcome these issues, we need to follow some rules, known as the:
ACID properties.
So these are the expected properties of a transaction (expected rather than mandatory, because not every system needs all of them):
  • Atomicity - 
    • A single whole unit.
      • To an outsider, it should feel that either the transaction has happened completely or hasn't happened at all.
      • Either the money is transferred or it is not transferred.
      • It takes care of the intermediate state.
      • In short: if a transaction starts, it must either complete fully or be undone entirely, never leaving partial changes behind.
      • Especially important in the case of money transfers.
    • This is mandatory for banking transactions, but not for every information update, e.g. on social media or other tools where per-second, millisecond or nanosecond sync with the latest information is not important.
  • Consistency
    • It means the data stays authentic and reliable:
      • Correct
      • Accurate
      • Exact
      • Logical
    • The numbers before and after a transaction should add up.
    • In a bank transaction it is mandatory. In our example it was not maintained, which caused the issue: the loss of 500 rupees.
    • But it is not mandatory when, say, JioCinema shows how many people are watching a live program, as some delay is acceptable there and nobody is bothered by it. Even when 100 more members join the live program, the count is still shown as 3.2 Cr in rounded form, not as an exact number, because showing the exact audience count is neither important nor mandatory.
    • In a bank transaction, however, it is mandatory, even up to the decimal points.
  • Isolation
    • It says that one transaction should not interfere with another transaction happening at the same time. There are four isolation levels:
      • Read Uncommitted
      • Read Committed
      • Repeatable Read (MySQL's default) (Snapshot)
      • Serializable
  • Durability
    • Once a transaction is done, its data should persist forever.
    • In reality that is not literally possible, as we save data on a hard disk; if it gets corrupted, the data is lost. Any service can only increase the duration of storage; it cannot store data forever.
    • So durability here means keeping the data for a long time, on the order of decades, or at least for as long as the person or system that depends on it needs it to be available.

Commit:

  • It saves the data once the transaction is completed.
  • It ensures the property of durability.
  • In MySQL, the autocommit feature is on by default.

RollBack:

  • Now suppose we have committed wrong data and want to reverse it to the initial state.
  • Rollback can undo the changes made since the last commit. So once a transaction is committed, it cannot be rolled back.
We can build a flow consisting of multiple database commands as one transaction: the start transaction command, followed by the steps, and finally the commit command to complete it.
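A minimal sketch of that flow, again using the hypothetical accounts table:

```sql
start transaction;

update accounts set balance = balance - 500 where id = 'A';
update accounts set balance = balance + 500 where id = 'B';

-- if both updates succeeded, make the changes permanent:
commit;

-- if anything had gone wrong before the commit, we could instead undo
-- everything since start transaction:
-- rollback;
```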

Isolation levels:

Read Uncommitted
  • We can read the latest data, including uncommitted data (stored in memory only, not on the hard disk).
  • It will lead us to Dirty Read.



As you can see, T1 had a system failure, so the changed data was never stored in the database, but T2 could read the uncommitted data; hence we read the wrong data and acted on it. This uncommitted data is lost, and that is why it is called dirty data: you cannot justify where the value came from, because after T1's failure you will once again see Anjali, as Rahul was lost in the failure.

Practical example of the read uncommitted isolation level:
You need to create two sessions to see this issue:
-- Left side session of MySQL Transactions 1:
show variables like 'autocommit'; -- By default it is on in MySQL
set autocommit = 0;

show variables like 'transaction_isolation'; -- By default it is repeatable read.
use school;
start transaction;
select * 
from student_scaler
where id = 1; -- Puneet Kumar Singh


-- updating the information in left session
update student_scaler
set name = 'Rahul Singh'
where id = 1; -- till now data is not committed.

select * 
from student_scaler
where id = 1; -- Rahul Singh

rollback; -- Due to system failure rollback command is now run

select *
from student_scaler
where id = 1; -- Puneet Kumar Singh

-- Right side session of MySQL Transaction 2:
show variables like 'autocommit'; -- By default it is on in MySQL
set autocommit = 0;

show variables like 'transaction_isolation'; -- By default it is repeatable read.
set session transaction isolation level read uncommitted;
-- Reads latest data
use school;
start transaction;
select * 
from student_scaler
where id = 1; -- here you can see the latest uncommitted data, as the isolation level allows reading it.
-- You will see Rahul Singh: in the left side session the record is updated but not committed, yet it is
-- still visible here because the isolation level is Read Uncommitted.

-- Now suppose you do not read the data for id = 1 again; in that case the data you are holding is
-- Rahul Singh, a value that, after the left session's rollback, exists nowhere in the database (a dirty read).

How do we resolve the dirty read that occurred under the read uncommitted isolation level?

Read Commit Isolation level is solution:

It ensures that we can read only committed data. (Latest committed data)

  • When we read the data before the manipulation, the result is Anjali.
  • After the data is modified but the new value is not yet committed, we will still read Anjali.
  • After the modification is committed, we will read Rahul.
The isolation level applies per session that has enabled it; isolation is a session-level concept.

Pros: No dirty read issue.
Cons: 
  • Might not be efficient
  • Non-Repeatable Read
-- Left side session of MySQL Transactions 1:
set session transaction isolation level read uncommitted;

-- manipulating the data
update student_scaler
set name = 'Rahul Singh'
where id = 1;

select *
from student_scaler
where id = 1; -- Rahul Singh
 -- now committing the change 
 commit;

-- Right side session of MySQL Transaction 2:
set session transaction isolation level read committed;

select * 
from student_scaler
where id = 1; -- Puneet Kumar Singh
-- as not able to read any uncommitted data.

-- Reading data after commit is done in left side transaction.
select * 
from student_scaler
where id = 1; -- Rahul Singh

Issue caused by the read committed isolation level:


Steps:
  • Get the data of users having psp < 80.
  • Send emails (this process may take time if the result contains students in the 1000s; assume 5 minutes for now).
  • Update the email_sent column (where psp < 80 is true).
Now suppose that while the emails were being sent, a scheduled job updated the students' psp values and committed them. Under read committed, any re-read inside our transaction will pick up this latest committed information.
Suppose C's psp also decreased, to 79. No email is sent to C, because the email list was built from the data fetched before the sending process started, when only A had psp below 80.

But the last step updates the email_sent status in the table, and because it re-reads the data under read committed, it marks both A and C as email_sent = True, while in reality the email was sent to student A alone.

So the reads at step 1 and step 3 are different because we are using the read committed isolation level, but we need the same read we had at the time of step 1.

As you can see, the read committed isolation level has solved the dirty read issue but has introduced a new issue: non-repeatable read.

Non-repeatable read - 
  • When a transaction reads the same row at different times within the same transaction and gets different values, this is known as a non-repeatable read.

Practical demonstration:
-- Non-Repeatable read issue
use school;
start transaction;

update student_scaler 
set psp = 88
where id =4;

commit;

-- Non-Repeatable read issue
use school;
start transaction;
select * 
from student_scaler
where psp < 92;
-- Rahul Singh
-- Sending email to those learners who have psp less than 92. -- > suppose it takes around 5 minutes.

-- reading the data: the left side transaction has manipulated the data but not yet committed it.
select * 
from student_scaler
where psp < 92;
-- Rahul Singh
-- not yet changed, as the manipulation is not committed.

-- reading the data after the transaction in the left side session is committed
select * 
from student_scaler
where psp < 92;
-- Now we get two names: Rahul and Sachin.


-- As the change is committed, this session cannot repeat its earlier read.
-- Due to the Read Committed isolation level, data is read with the latest commits.
-- This latest information raises the issue: we now update the email-sent status in the table, and as you
-- can clearly see we have not sent an email to Sachin, since his psp was greater than 92 before the
-- manipulation was made in the left side session.
-- Hence it leads us to update the wrong record: a false flag that Sachin has also been informed about
-- the low-psp warning and needs to increase his psp to 92 and above.
-- So here you can see that I am not able to repeatedly read the same information within the current transaction.

-- To solve this issue we need to use the repeatable read isolation level.

Repeatable Read (Snapshot) Isolation:

  • For first read in a transaction it reads latest committed value of current row.
  • After that in the same transaction it will read the same value.


The same scenario applies to the repeatable read isolation level as well.
It ensures that whatever data we read at the start of a transaction is the same data used throughout that transaction.

This gives us uniformity in the data and protects us from reading freshly updated information mid-transaction, which could otherwise lead us to write wrong information to the database.

Pros - We no longer have the non-repeatable read issue.
Cons - The phantom read issue is introduced.

Practical scenario example:
-- repeatable read scenario testing
use school;
start transaction;
update student_scaler
set psp = 95
where id = 4;
commit;
-- Now Sachin's psp is raised above the 92 mark.

-- repeatable read scenario testing.
show variables like 'transaction_isolation';
set session transaction isolation level repeatable read;

use school;
start transaction;

-- reading latest snapshot of transaction where psp is < 92
select *
from student_scaler
where psp < 92;
-- Rahul and Sachin are read - > the snapshot has been taken.

-- sending email to these students as they do not meet the minimum mandatory psp guideline.
-- meanwhile, while this email sending is in process,
-- the transaction in the left session has updated the student table with new psp values and committed.
-- Now reading the data again after the record has been manipulated and committed in the left side session.
select *
from student_scaler
where psp < 92;
-- Rahul and Sachin are read; the newly committed information is not fetched, because of the repeatable read isolation level.

Phantom Read issue:

When you use the repeatable read isolation level in an RDBMS, a snapshot is created for a read the first time it runs in a transaction.

Now, if this snapshot somehow gets updated, a wrong value cascades to the following steps of the transaction, which should have worked from the original snapshot but now see a new one; this leads to the phantom read issue.

This happens when we modify data inside the same transaction in which we took the snapshot, and by mistake touch the same table the snapshot was taken from; the modification refreshes the snapshot. It is usually a developer error: the transaction that needs to update this information should run in another session but is run by mistake in the same session, and the unintentionally refreshed snapshot leads to the phantom read issue. (Classically, a phantom read means that new rows matching the query's condition appear when the same range query is repeated within one transaction.)

Practical scenario testing:
-- Phantom read scenario testing
start transaction;
 
 insert into student_scaler (name, b_id, psp) values ('John Pina', 4, 70);
 commit;

-- Phantom Read issue scenario testing
start transaction;
select *
from student_scaler
where psp < 92;
-- We get Rahul and Sachin as entries for sending email.
-- Doing some work: meanwhile, a new row has been inserted and committed in the left session.

-- Now I remember that I have given John the wrong last name, Pina, which needs to be Cena,
-- and to make this change we by mistake use the same right session which already holds the snapshot in use.

-- update the last name of student id 6 from Pina to Cena

update student_scaler
set name = 'John Cena'
where id = 6;

commit;
-- As a result a new snapshot is created in our right session as well.
-- This refreshing of the snapshot leads us to the phantom read issue.


Serializable isolation level:


Serial means one after another: whichever resource is being used by one transaction cannot be acquired by another transaction for manipulation. The transaction acquires a lock over the resource so that no other transaction can take the same resource, for manipulation or for any other task.
Once the transaction is complete, another transaction can acquire a lock over the resource and perform its task.

So here, two or more transactions that use the same resource happen one after another. As a result it is very safe and keeps the data consistent at all times.
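A small sketch of this level in MySQL, reusing the student_scaler table from the earlier demos; under serializable, InnoDB converts plain SELECTs into shared-locking reads (when autocommit is off), so a conflicting write in another session has to wait:

```sql
-- Left side session:
set session transaction isolation level serializable;
start transaction;
select * from student_scaler where psp < 92; -- takes shared locks on the matching rows

-- Right side session (at the same time):
-- update student_scaler set psp = 95 where id = 4; -- blocks, waiting for the lock

-- Left side session:
commit; -- releases the locks; only now can the right session's update proceed
```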





Drawbacks of this isolation level:

  • Deadlock issue (requests stuck waiting).
  • Performance-wise it is very slow.

DeadLock:

In this situation the requests sit in a waiting state and the program is not able to make progress on any of them; as a result, none of the requests can start or complete.

A real-world scenario for this issue is a booking application like BookMyShow: two different people try to book the same seats, which can cause a deadlock, so a hold (cool-down) period is introduced on selected seats so that we do not get into the deadlock issue.

How to resolve the deadlock issue?
One way to solve it is to sort the resources and acquire locks on them in that sorted order.

Sorting the resources makes every transaction lock them in the same order; this removes the circular wait and as a result helps us avoid the deadlock.
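As a sketch of this idea with locking reads (the accounts table and ids are assumptions): suppose T1 transfers from account 1 to 2 while T2 transfers from 2 to 1. If each locked its own "from" account first, T1 could hold the lock on 1 and wait for 2 while T2 holds 2 and waits for 1: a deadlock. If both instead lock in sorted id order, the circular wait cannot form:

```sql
-- Both transactions, whatever the transfer direction, lock the smaller id first:
start transaction;
select * from accounts where id = 1 for update; -- lock the smaller id first
select * from accounts where id = 2 for update; -- then the larger id
update accounts set balance = balance - 500 where id = 1;
update accounts set balance = balance + 500 where id = 2;
commit;
```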




