Square Root Decomposition Technique

This technique is used to answer range queries. Among the several techniques used for the same purpose, I find this one to be the simplest of them all.
Before starting off with this technique, let's analyze why we need it and how it helps us optimize the solution to a problem.

If you don't know what a range query is, I'd highly recommend reading about it here: RANGE QUERIES

Consider the following problem statement:
Given an array of length N, perform the following two operations on it:
1. Insert a new value at index i, where 0<=i<N.
2. Calculate the sum of all numbers in a given range of indices L to R in the array, such that 0<=L<=R<N.

The first operation is very simple: we just need to change the value at the given index in the array. This can be accomplished in O(1) time.
The naive approach to the second operation is to loop from index L to index R while adding up all the numbers that occur between them. In the worst case, we have to loop through the entire array, and hence the time complexity of this operation is O(N). If we have to perform this operation Q times, the time complexity becomes O(Q*N), which is not good.
To optimize this, we make use of a technique called Square Root Decomposition, which performs this operation in O(√N).
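
For comparison, here is a minimal sketch of the naive range-sum loop described above (the method and variable names are illustrative, not taken from the linked code file):

    // Naive range-sum query over arr[l..r]: O(R - L + 1), i.e. O(N) in the worst case.
    static long rangeSumNaive(int[] arr, int l, int r) {
        long total = 0;
        for (int i = l; i <= r; i++) {
            total += arr[i];
        }
        return total;
    }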

What is Square Root Decomposition Technique?
As the name suggests, it has a lot to do with square roots. In this technique, we divide the array into √N blocks, where N is the size of the array. If N is not a perfect square, we consider the nearest perfect square greater than N. E.g., if N = 13, we consider N = 16 and hence divide the array into √16 = 4 equal blocks, each of size √16 = 4. One thing to notice here is that if the array is divided into √N equal blocks, each block then contains √N elements (except the last one, if N is not a perfect square).

Example: 



In the above example, there are 7 elements in the array, and hence we divide the array into ceil(√7) = 3 equal blocks, in which the last block is only partially filled (since 7 is not a perfect square).
Now, for each block, we calculate the sum of all the elements in it and store it in another array, say SUM[]. As there are 3 blocks, i.e. ceil(√7), the size of SUM[] will be 3 in this case. Each index in SUM[] contains the sum of all values in the block corresponding to that index. Also, we don't actually create √N separate arrays as shown above. Instead, we access the original array in that fashion and store the sum of each block in the SUM[] array, as shown below.
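
A minimal sketch of this preprocessing step in Java (the class and field names here are illustrative and may differ from the linked SquareRootDecomposition.java):

    import java.util.Arrays;

    class SqrtDecomposition {
        final int[] arr;        // copy of the input array
        final long[] sum;       // sum[b] = sum of all elements in block b
        final int blockSize;    // each block holds up to ceil(sqrt(N)) elements

        // Assumes a non-empty input array.
        SqrtDecomposition(int[] input) {
            arr = Arrays.copyOf(input, input.length);
            blockSize = (int) Math.ceil(Math.sqrt(arr.length));
            sum = new long[(arr.length + blockSize - 1) / blockSize];
            for (int i = 0; i < arr.length; i++) {
                sum[i / blockSize] += arr[i];   // element i belongs to block i / blockSize
            }
        }
    }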



Now, our data structure is ready. It's time to create an algorithm for performing the given operations.

Operation 1: Find the sum of all numbers in a given range (L, R):

1. Calculate the block numbers of L and R (LBlock = L/block_size, RBlock = R/block_size, where block_size = ceil(√N)).
2. Calculate the sum of all the blocks between L's block and R's block using the SUM[] array (excluding the blocks in which L and R reside). As the maximum size of SUM[] is √N, there can be at most √N iterations. Hence, this step is performed in O(√N).
3. Calculate the sum of all numbers from L to the end of L's block. This step also requires O(√N), as the maximum size of a block is √N.
4. Calculate the sum of all numbers from the start of R's block to R.
5. Add up the results obtained in steps 2, 3 and 4.

Overall time complexity of this operation: O(√N+√N+√N) = O(3√N) = O(√N). A code sketch of this query is given after the example below.

Consider the example below:


In the above example, L=1, R=7,
LBlock = 1/3 = 0, RBlock = 7/3 = 2.
Hence, excluding blocks 0 and 2, we calculate the sum of all blocks between them (i.e. SUM[1]=11, since there is only one block between block 0 and block 2).
Now, calculate the sum of all numbers from ARR[1] to ARR[2] (i.e. -2+6=4) and the sum of all numbers from ARR[6] to ARR[7] (i.e. 10+(-4)=6).
Note: The starting index of a block can be calculated as block_no * block_size.
Therefore, ans = 11+4+6 = 21.
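
Here is a minimal sketch of this query as a method of the class sketched above (the names are illustrative; the case where L and R fall in the same block, raised in the comments below, is handled by scanning that block directly):

    // Sum of arr[l..r], inclusive, with 0 <= l <= r < arr.length: O(sqrt(N)).
    long query(int l, int r) {
        int lBlock = l / blockSize, rBlock = r / blockSize;
        long total = 0;
        if (lBlock == rBlock) {
            // L and R lie in the same block: just scan that part of the array.
            for (int i = l; i <= r; i++) total += arr[i];
            return total;
        }
        // From L to the end of L's block.
        for (int i = l; i < (lBlock + 1) * blockSize; i++) total += arr[i];
        // Whole blocks strictly between L's block and R's block.
        for (int b = lBlock + 1; b < rBlock; b++) total += sum[b];
        // From the start of R's block to R.
        for (int i = rBlock * blockSize; i <= r; i++) total += arr[i];
        return total;
    }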

Operation 2: Insert a new value at index i, where 0<=i<N

1. Save the current value of ARR[i] in another variable, say old_value.
2. Insert the new value at ARR[i], i.e. ARR[i] = new_value.
3. Calculate the block number of i using block_no = i/block_size.
4. Update SUM[block_no], i.e. SUM[block_no] = SUM[block_no] - old_value + new_value.

This operation can be performed in O(1), as we directly assign the new value in the array and adjust a single entry of SUM[].
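
A minimal sketch of this point update, again as a method of the class above (illustrative names):

    // Set arr[i] = newValue and fix the affected block sum: O(1).
    void update(int i, int newValue) {
        int blockNo = i / blockSize;
        sum[blockNo] += (long) newValue - arr[i];   // subtract the old value, add the new one
        arr[i] = newValue;
    }

With these pieces, a hypothetical call sequence would be new SqrtDecomposition(arr) once, followed by any mix of update(i, v) and query(l, r) calls.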

Overall time complexity of the algorithm: O(√N) per range-sum query (each update takes O(1)).
Overall space complexity of the algorithm: O(√N), since we only store the SUM[] array of size √N.

Code: SquareRootDecomposition.java

Problems based on Square Root Decomposition:
1. HackerEarth: Problems on Sqrt Decomposition
2. Codeforces: Subset Sums
3. CodeChef: HILLJUMP

Comments

  1. Nice!! Could you also please handle the case when the start and end are the same, e.g. l=8 and r=8?
