Segment Tree

Another simple data structure for answering range queries is Segment Tree. Before moving on the actual data structure, let's first take a look at it's use case.

If you don't know what a range query is, I'll highly recommend reading about it here: RANGE QUERIES

Segment tree provides two operations:

1. Update: In this operation we can update an element in the Array and reflect the corresponding change in the Segment tree.
2. Range Query: In this operation we can query on a range or segment and return the answer to the problem in that particular range. (Range query can be anything such as sum of all numbers in the range, minimum number in a particular range, etc.)

You might think that same operations were performed with Square Root Decomposition technique then why Segment Tree ? It's because Segment Tree answer range queries much faster than square root decomposition technique (in O(logN) as compared with O(√N)). In most of the competitions online, square root decomposition fails to meet the time limit constraint and at such time segment tree comes in the picture.
Enough said, let's take a look at the data structure.

Consider the problem below:
Given an array, input[] of size N, there are Q queries to be answered where each query specifies integer L and R (range L to R where 0<=L<=R<=N). For each such query, output the sum of all numbers in the array input[] in the given range.

Suppose we have array input[] = {1, 2, 3, 4} and a query (L=1, R=3).
Looking at the array we can easily answer the query (i.e. input[1]+input[2]+input[3] = 2+3+4 = 9)
Let's take a look at how segment tree answers this query.

Segment Tree Structure:
Structure of Segment tree is recursive. Each node in the segment tree will hold the sum of numbers from array in some range. The root of the tree will hold the sum of all the numbers in array (i.e. it covers range 0 to N-1). Sum of left half of the input[] array will be stored in the left child of the root and sum of right half of the input[] array will be stored in the right child of the root and so on for further levels. The leaves of the tree will contain the input[] array numbers.

Here, we don't create actual tree by maintaining links to children. Instead we just maintain the entire tree in an array tree[] in which each index is node of the tree and where for ith node, it's left child is at tree[(2*i)+1] and right child at tree[(2*i)+2]. Parent of ith node is tree[(i-1)/2].

But what will be the size of array tree[] (i.e. for a input[] of size N, how many maximum nodes will be required in the tree) ? For that look at the below tree and it's array representation.


In the diagram above, the equivalent array representation of tree is shown. We observe that for input[] array of size 4, the tree[] array turned out to be of size 6 based on the diagrammatic view of tree.

The formula for calculating size of tree[] (i.e. nodes in tree) for input[] of size N:
1. If size of input[] is power of 2, then tree_size = N * 2 - 1
2. Else, tree_size = nextPowerOf2(N) * 2 - 1, where nextPowerOf2() returns the next power of 2 number after N.

For example,
If, N=4, tree_size = 4 * 2 - 1 = 7
If, N=5, tree_size = 8 * 2 - 1 = 15 (since 8 is the next power of 2 after 5)

In the worst case, size of tree[] can be atmost 4*N.

Let's take a look at pseudo code to construct this tree:

createTree(input[], segmentTree[], low, high, root) {
if (low == high) {
segmentTree[root] = input[low];
return;
}
int mid = (low + high) / 2;
createTree(input, segmentTree, low, mid, 2 * root+ 1);
createTree(input, segmentTree, mid + 1, high, 2 * root+ 2);
segmentTree[root] = segmentTree[2 * root+ 1] + segmentTree[2 * root+ 2];
}

In the above pseudo code, we recursively create the left half and right half of the tree. low an high indicates the range of the root node which is passed as last function parameter. For calling this function initially, we need to pass low as 0 and high as N-1 and the tree creation will start from the root, hence we'll pass root as 0.

i.e. createTree(input, segmentTree, 0, N-1, 0); will be the initial call to this function indicating that root node (i.e. node 0) covers all the elements in the array (0 to N-1).

Let's now take a look at how to answer range queries using the created tree.

Range Query:
For a given query range, we start from root of the tree where three conditions arise for each node:
1. Partial Overlap: Meaning that the query range falls partially in the range covered at current node.
2. Total Overlap: Meaning that the query range falls totally in the range covered at current node.
3. No Overlap: Meaning that the query range does not fall in the range covered at the current node.

In case of partial overlap, the query is sent recursively to the left child and right child of the node.
In case of total overlap, the value at node is returned.
In case of no overlap, value 0 is returned.

For example, if N=4 and query(l=1, r=3), we start from the root (i.e. tree[0]). Root covers the range 0 to 3 (i.e. N-1). Range 1 to 3 partially overlaps the range 0 to 3, hence we then send the same query to left and right child of the root. Left child = 2 * 0 + 1 = 1, Right child = 2 * 0 + 2 = 2.

This process continue recursively as shown in the figure below.



Pseudo-code:

calculateRangeSum(tree[], low, high, l, r, root) {
if (low >= l && high <= r) {                         // Total Overlap
return tree[root];
}
if (high < l || low > r) {                                 // No Overlap
return 0;
}
mid = (high + low) / 2;
return calculateRangeSum(tree, low, mid, l, r, 2 * root + 1) + calculateRangeSum(tree, mid + 1, high, l, r, 2 * root + 2);     // left subtree + right subtree
}

Analysis of Segment Tree:

Time Complexity:
    1. Tree Construction: O(4 * N) = O(N)
    2. Range Query: O(4 * logN) = O(logN)

Space complexity: O(4 * N) = O(N)

Code: SegmentTree.java


Comments

Post a Comment

Popular Posts