DOUG LLOYD: So in CS50 we learned about a variety of sorting and searching algorithms. And sometimes it can be a little tricky to keep track of what algorithm does what. We've really only skimmed the surface too, there are many other searching and sorting algorithms. So in this video let's just take a few minutes to try and distill each algorithm down to its core elements so you can remember the most important information about them and be able to articulate the differences, if necessary.
The first is selection sort. The basic idea behind selection sort is find the smallest unsorted element in an array and swap it with the first unsorted element of that array. We said that the worst-case run time of that was n squared. And if you want to recall why, take a look at the selection sort video. The best-case run time is also n squared.
Bubble sort, the idea behind that one is to swap adjacent pairs. So that's the key that helps you remember the difference here. Selection sort is, so far, find the smallest-- bubble sort is swap adjacent pairs. We swap adjacent pairs of elements if they are out of order, which effectively bubbles larger elements to the right, and at the same time it also begins to move smaller elements to the left. The worst-case case run time of bubble sort is n squared. The best-case run time of bubble sort is n. Because in that situation we don't actually-- we might not need to make any swaps at all. We only have to make one pass across all n elements.
In insertion sort, the basic idea here is shifting. That's the keyword for insertion sort. We're going to step once through the array from left to right. And as we do, we're going to shift elements we've already seen to make room for smaller ones that need to fit somewhere back in that sorted portion. So we build the sorted array one element at a time, left to right, and we shift things to make room. The worst-case run time of insertion sort is n squared. The best-case run time is n.
Merge sort-- the keyword here is split and merge. We split the full array, whether it's six elements, eight elements, 10,000 elements-- we split it down by half, by half, by half, until we have under array of n one element arrays. A set of n one element arrays. So we started with one 1,000-element array, and we get to the point where we have 1,000 one-element arrays. Then we begin to merge those sub arrays back together in the correct order. So we take two one-element arrays and create a two-element array. We take two two-element arrays and create a four-element array and so on and so on until we've again rebuilt one n element array.
The worst-case run time of merge sort is n times log n. We have n elements, but this recombining process takes log n steps to get back to a full array. The best case run time is also n log n because this process doesn't really care whether the array was sorted or not to start with. The process is the same, there's no way to short circuit things. So n log n in the worst case, n log n in the best case.
We talked about two searching algorithms. So linear search is about iterating. We proceed across the array once, from left to right, trying to find the number that we're looking for. The worst-case run time is big O of n. It might take us iterating across every single element to find the element we're looking for, either in the last position, or not at all. But we can't confirm that until we've looked at everything. m the best-case, we find immediately. The best-case run time of linear search is omega of 1.
Lastly, there was binary search, which requires assorted array. Remember that's a very important consideration when working with binary search. It's a prerequisite to using it-- the array that you're looking through must be sorted. Otherwise, the keyword is divide and conquer. Split the array into half and eliminate half of the elements every time as you proceed through. Because of this divide and conquer and splitting things in half approach, the worst-case run time of binary search is log n, which is substantially better than linear search's n. The best-case run time is still one.
We might find it immediately the first time we make a division, but, again, remember that although binary search is substantially better than linear search by virtue of being log n versus n, you might have to go through the work of sorting your array first, which might make it less effective depending on the size of the iterating sorted.
I'm Doug Lloyd, this is CS50.