Contents

# More on Binary Search: Variants

In the last binary search article, we discussed how to write a correct binary search. But sometimes we are not asked to find any position of the target. If we need the first or the last occurrence, can we do it?

In Python, the basic binary search looks like this:

``````lo = 0
hi = N-1
while (lo < hi):
mid = (lo+hi)//2
if A[mid] == target:
return mid
elif A[mid] < target:
lo = mid + 1
else:
hi = mid
return lo if A[lo] == target else -1
``````

And the invariant of the algorithm is that `A[lo] <= target <= A[hi]`.

## Find first occurrence

Finding first occurrence is similar to the classic binary search.

1. We should use `if A[mid] >= target`, `hi = mid` because it discards other occurrences of target to the right, while keeping `A[mid]`. (Given target exists)
2. `hi = mid` works with `lo = mid + 1`.
3. Since `lo = mid + 1` and floor division is used, `hi - lo` is strictly decreasing in each pass. The algorithm will always terminate.
``````lo = 0
hi = N-1
while (lo < hi):
mid = (lo+hi)//2
if A[mid] >= target:
hi = mid
else:
lo = mid + 1
return lo if A[lo] == target else -1
``````

## Find last occurrence

1. `if A[mid] <= target`, `lo = mid` guarantees at least one target is kept while discarding everything to the left. (Given target exists)
2. `lo = mid` works with `hi = mid - 1`.
3. Here’s the tricky part: to ensure that `hi - lo` decreases in each pass, the floor truncation should have bias towards the higher index. i.e. `mid = (lo+hi+1)//2`. Without this tweak, there will be dead loop.
``````lo = 0
hi = N-1
while (lo < hi):
mid = (lo+hi+1)//2
if A[mid] <= target:
lo = mid
else:
hi = mid - 1
return lo if A[lo] == target else -1
``````

The algorithm will terminate with `lo == hi` and `A[lo]` is the first occurrence if it exists.

## Conclusion

The keys to quickly writing a correct algorithm for a binary search variant:

1. The condition containing `<` should set `lo` and `>` should set `hi`.
2. The condition containing `==` should guarantee that probable solutions are not discarded. Under this condition, the `lo` or `hi` will be set to `mid` and there will be no `-1` or `+1`.
3. To determine the condition of the test, imagine the input is a list of equal elements. If `A[mid] <= target` and set `lo = mid`, it discards some target occurrences to the left, therefore this is for finding last occurrence. The reasoning for finding first occurrence is similar.
4. In case of `lo = mid`, we need to avoid dead loop by adding 1 before the floor division, i.e. `mid = (lo+hi+1)//2`.

Here’s a leetcode problem which requires finding both first and last occurrences: 34. Find First and Last Position of Element in Sorted Array

Tags// , ,