More on Binary Search: Variants
1/Jun 2019In the last binary search article, we discussed how to write a correct binary search. But sometimes we are not asked to find any position of the target. If we need the first or the last occurrence, can we do it?
The classic binary search
In Python, the basic binary search looks like this:
lo = 0
hi = N-1
while (lo < hi):
mid = (lo+hi)//2
if A[mid] == target:
return mid
elif A[mid] < target:
lo = mid + 1
else:
hi = mid
return lo if A[lo] == target else -1And the invariant of the algorithm is that A[lo] <= target <= A[hi].
Find first occurrence
Finding first occurrence is similar to the classic binary search.
- We should use
if A[mid] >= target,hi = midbecause it discards other occurrences of target to the right, while keepingA[mid]. (Given target exists) hi = midworks withlo = mid + 1.- Since
lo = mid + 1and floor division is used,hi - lois strictly decreasing in each pass. The algorithm will always terminate.
lo = 0
hi = N-1
while (lo < hi):
mid = (lo+hi)//2
if A[mid] >= target:
hi = mid
else:
lo = mid + 1
return lo if A[lo] == target else -1Find last occurrence
if A[mid] <= target,lo = midguarantees at least one target is kept while discarding everything to the left. (Given target exists)lo = midworks withhi = mid - 1.- Here’s the tricky part: to ensure that
hi - lodecreases in each pass, the floor truncation should have bias towards the higher index. i.e.mid = (lo+hi+1)//2. Without this tweak, there will be dead loop.
lo = 0
hi = N-1
while (lo < hi):
mid = (lo+hi+1)//2
if A[mid] <= target:
lo = mid
else:
hi = mid - 1
return lo if A[lo] == target else -1The algorithm will terminate with lo == hi and A[lo] is the first occurrence if it exists.
Conclusion
The keys to quickly writing a correct algorithm for a binary search variant:
- The condition containing
<should setloand>should sethi. - The condition containing
==should guarantee that probable solutions are not discarded. Under this condition, theloorhiwill be set tomidand there will be no-1or+1. - To determine the condition of the test, imagine the input is a list of equal elements. If
A[mid] <= targetand setlo = mid, it discards some target occurrences to the left, therefore this is for finding last occurrence. The reasoning for finding first occurrence is similar. - In case of
lo = mid, we need to avoid dead loop by adding 1 before the floor division, i.e.mid = (lo+hi+1)//2.
Here’s a leetcode problem which requires finding both first and last occurrences: 34. Find First and Last Position of Element in Sorted Array
More Reading