8.3. Binary Search¶
If a list is sorted, we can describe a more efficient algorithm than linear search. For a sorted list, any value you check gives you some information about the relative location of the thing you are looking for. Imagine you pick up a phone book looking for “Davis, Sue”. You flip to the middle somewhere and see “Jones” - oops you are too far in, so you jump way earlier in the book. Now you see “Evans” - still too far back - so you jump earlier again. Then you see “Clark” - that means you need to jump back partway to where “Evans” was…
This is essentially the strategy of binary search. To do binary search for a particular value (key
) in a list, check the middle item in the list. If it is not the key and the key is smaller than the middle item, the target must be in the first half of the list. If the target is larger than the middle item, the target must be in the last half of the list. Thus, one unsuccessful comparison reduces the number of items left to check by half!
To continue, if the item you are looking for is less than the value you just checked, throw out everything at or above the middle. If the value you are looking for is bigger, throw out from the middle down. Then repeat the process but only using the values you have not thrown out. The splitting process continues until the target is located or there is nothing left in the list.
Try watching the algorithm run on various values. Note that items that have turned white have been “thrown out” and no longer need to be worried about. Gray items show the range of possible locations that still need to be checked:
Below are the main steps of the binary search algorithm, expressed in pseudocode. To do binary search you need to keep track of the lowIndex
, the lowest possible location the number could be at, and the highIndex
, the highest possible location. Each time at each step of the loop you calculate the middle between those two locations and check that middleIndex
. If you do not find what you are looking for, you change either the low or high boundary to “throw out” half of the remaining items. Try running the algorithm by hand and make sure you can predict the values the animation above shows.
Note: assume that list and key are already set 1 set
lowIndex
= 1 2 sethighIndex
= length oflist
3 repeat untillowIndex
is larger thanhighIndex
4 setmiddleIndex
= (lowIndex
+highIndex
)/2 round down 5 setmiddleValue
=list[middleIndex]
6 ifmiddleValue
==key
7 print "Found at " +middleIndex
8 end program 9 10 Note: We didn't find, it check to see if too high or too low: 11 ifmiddleValue
>key
then 12 Note: Too high, change the upper bound 13highIndex
=middleIndex
- 1 14 else 15 Note: Too low, change the lower bound 16lowIndex
=middleIndex
+ 1 17 18 print "Value not found"
Although binary search is more complex than linear search, and depends on the items being in order so that you can rule out half of them with each step, it can be substantially more efficient. The 20 items above never take more than 6 steps to check. If the list was 100 items it would never take more than 8 steps. 100,000 items could be searched in just 19 steps! We will take a closer look at just how efficient this algorithm is later.