[MUSIC PLAYING] DAVID MALAN: All right. All right, welcome back. So this is Week 4, the beginning thereof, already. And you'll recall that last week, we put code aside for just a little bit and we started talking a little more high-level, about things like searching and sorting, which though somewhat simple ideas, are representative of a class of problems you will begin to solve particularly as you start thinking about final projects and interesting solutions you might have to real-world problems. Now bubble sort was one of the simplest such algorithms, and it worked by having these small numbers in a list or in an array kind of bubble their way up to the top, and the big numbers move their way down to the end of that list.
And recall that we could visualize bubble sort a little something like this. So let me go ahead and click Start. I've preselected bubble sort. And if you recall that the taller blue lines represent big numbers, small blue lines represent small numbers, as we go through this again and again and again, comparing two bars next to each other in red, we're going to swap the biggest and the smallest if they are out of order.
So this will go on and go on and go on, and you'll see that the larger elements are making their way to the right, and the smaller elements are making their way to the left. But we began to quantify the efficiency, the quality of this algorithm. And we said that in the worst case, this algorithm took roughly how many steps?
So n squared. And what was n?
AUDIENCE: Number of elements.
DAVID MALAN: So n was the number of elements. And so we'll do this often. Any time we want to talk about the size of a problem or the size of an input, or the amount of time it takes to produce output, we'll just generalize whatever the input is as n. So back in Week 0, the number of pages in the phone book was n. The number of students in the room was n. So here, too, we're following that pattern.
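For reference, the bubble sort just described might be sketched in C like this (an illustration of our own, not the course's distribution code):

```c
// Bubble sort sketch: on each pass, compare adjacent pairs and swap any
// that are out of order, so larger values "bubble" toward the right end.
void bubble_sort(int a[], int n)
{
    for (int i = 0; i < n - 1; i++)          // at most n - 1 passes
    {
        for (int j = 0; j < n - 1 - i; j++)  // compare each pair of neighbors
        {
            if (a[j] > a[j + 1])             // out of order? swap them
            {
                int tmp = a[j];
                a[j] = a[j + 1];
                a[j + 1] = tmp;
            }
        }
    }
}
```

The nested loops, each running on the order of n times, are exactly where the roughly n squared steps come from.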
Now n squared isn't particularly fast, so we tried to do better. And so we looked at a couple of other algorithms, among which were selection sort. So selection sort was a little different. It was almost simpler, I dare say, whereby I started at the start of the list of our volunteers and I just again and again and again went through the list, plucking out the smallest element at a time and putting him or her at the beginning of the list.
But this, too, once we started to think through the math and bigger picture, thought about how many times I was going back and forth and back and forth, we said in the worst case, selection sort, too, was what? n squared. Now in the real world, it might actually be marginally faster. Because again, I didn't have to keep backtracking once I had sorted the smallest elements. But if we think about very large n, and if you sort of do out the math as I did on the board, with the n squared minus something, everything else besides the n squared, once n gets really large, doesn't really matter as much. So as computer scientists, we sort of turn a blind eye to the smaller factors and focus only on the factor in an expression that's going to make the biggest difference.
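Selection sort, as just described, might look like this in C (again, only a sketch, with names of our own choosing):

```c
// Selection sort sketch: repeatedly find the smallest remaining element
// and swap it into place at the front of the unsorted portion.
void selection_sort(int a[], int n)
{
    for (int i = 0; i < n - 1; i++)
    {
        int min = i;
        for (int j = i + 1; j < n; j++)  // must scan all the way to the end
        {
            if (a[j] < a[min])
            {
                min = j;
            }
        }
        int tmp = a[i];                  // swap the smallest into position i
        a[i] = a[min];
        a[min] = tmp;
    }
}
```

Notice that the inner loop always runs to the very end of the array, which is why even a sorted input still costs on the order of n squared comparisons.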
Well, lastly, we looked at insertion sort. And this was similar in spirit, but rather than go through iteratively and select the smallest element one at a time, I instead took the hand that I was dealt, and I decided, all right, you belong here. Then I moved on to the next element and decided that he or she belonged here. And then I moved on and on. And I might have to, along the way, shift these guys in order to make room for them. So that was sort of the mental reversal of selection sort that we called insertion sort.
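That mental reversal might be sketched in C as follows (illustrative only):

```c
// Insertion sort sketch: take each element as it's dealt and shift the
// larger, already-placed elements rightward to make room for it.
void insertion_sort(int a[], int n)
{
    for (int i = 1; i < n; i++)
    {
        int value = a[i];               // the element we were just dealt
        int j = i - 1;
        while (j >= 0 && a[j] > value)  // shuffle bigger elements over
        {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = value;               // drop it into its place
    }
}
```

If the array is already sorted, the while loop never has to run, which is why insertion sort's best case turns out to be only a single linear pass.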
So these topics do occur in the real world. Just a few years ago, when a certain senator was running for president, Eric Schmidt, at the time the CEO of Google, actually had the opportunity to interview him. And we thought we'd share this YouTube clip for you here, if we could turn up the volume.
[VIDEO PLAYBACK]
-Now, Senator, you're here at Google, and I like to think of the presidency as a job interview.
[LAUGHTER]
-Now it's hard to get a job as president. And you're going through the rigors now. It's also hard to get a job at Google. We have questions and we ask our candidates questions. And this one is from Larry Schwimmer.
[LAUGHTER] -You guys think I'm kidding? It's right here. What is the most efficient way to sort a million 32-bit integers?
[LAUGHTER]
-Well, uh--
-I'm sorry. Maybe we should--
-No, no, no, no, no.
-That's not a-- OK.
-I think the bubble sort would be the wrong way to go.
[LAUGHTER]
[CHEERING AND APPLAUSE]
-Come on, who told him this? OK.
[END VIDEO PLAYBACK]
DAVID MALAN: So there you have it. So we began to quantify these running times, so to speak, with something called asymptotic notation, which is just referring to our sort of turning a blind eye to those smaller factors and only looking at the running time, the performance of these algorithms, as n gets really large over time. And so we introduced big O. And big O represented something that we thought of as an upper bound. And actually, Barry, can we lower the mic a little bit?
We thought of this as an upper bound. So big O of n squared means that in the worst case, something like selection sort would take n squared steps. Or something like insertion sort would take n squared steps. Now for something like insertion sort, what was the worst case? Given an array, what's the worst possible scenario that you might find yourself faced with?
It's completely backwards, right? Because if it's completely backwards, you have to do a whole lot of work. Because if you're completely backwards, you're going to find the biggest element here, even though it belongs down there. So you're going to say, all right, at this moment in time, you belong here, so you leave it alone.
Then you realize, oh, damn, I have to move this slightly smaller element to the left of you. Then I have to do that again and again and again. And if I walked back and forth, you would sort of feel the performance of that algorithm, because constantly am I shuffling everyone else down in the array to make room for it. So that's the worst case.
By contrast-- and this was a cliffhanger last time-- we said that insertion sort was an omega of what? What's the best-case running time of insertion sort? So it's actually n. That was the blank that we left on the board last time.
And it's omega of n because why? Well, in the very best case, what's insertion sort going to be handed? Well, a list that's completely sorted already, minimal work to do. But what's neat about insertion sort is that because it starts here and decides, oh, you are the number one, you belong here. Oh, what good fortune.
You're the number two. You also belong here. Number three, even better, you belong here. As soon as it gets to the end of the list, per insertion sort's pseudocode that we walked through verbally last time, it's done. But selection sort, by contrast, kept doing what?
Kept going through the list again and again and again. Because the key insight there was only once you've looked all the way to the end of the list can you be certain that the element you selected was indeed the currently smallest element. So these different mental models end up yielding some very real-world differences for us, as well as these theoretical asymptotic differences.
So just to recap, then, big O of n squared, we've seen a few such algorithms thus far. Big O of n? What's an algorithm that could be said to be big O of n? In the worst case, it takes a linear number of steps.
OK, linear search. And in the worst case, where is the element you're looking for when applying linear search?
OK, in the worst case, it's not even there. Or in the second worst case, it's all the way at the end, which is plus-or-minus a one-step difference. So at the end of the day, we can say it's linear. Big O of n would be linear search, because in the worst case, the element's not even there or it's all the way at the end.
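Linear search itself is only a few lines; a sketch:

```c
// Linear search sketch: check each element in turn. Worst case, the target
// is at the very end or absent, costing all n comparisons -- big O of n.
int linear_search(const int a[], int n, int target)
{
    for (int i = 0; i < n; i++)
    {
        if (a[i] == target)
        {
            return i;   // found it at index i
        }
    }
    return -1;          // looked at all n elements; it's not there
}
```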
Well, big O of log of n. We didn't talk in great detail about this, but we've seen this before. What runs in so-called logarithmic time, in the worst case?
Yeah, so binary search. And binary search in the worst case might have the element somewhere in the middle, or somewhere inside the array. But you only find it once you divide the list in half, in half, in half, in half. And then voila, it's there. Or again, worst case, it's not even there. But you don't know that it's not there until you sort of reach that last bottom-most element by halving and halving and halving.
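For comparison, a sketch of binary search on a sorted array:

```c
// Binary search sketch: each iteration halves the remaining range, so a
// sorted array of n elements needs at most about log2(n) comparisons.
int binary_search(const int a[], int n, int target)
{
    int lo = 0;
    int hi = n - 1;
    while (lo <= hi)
    {
        int mid = lo + (hi - lo) / 2;
        if (a[mid] == target)
        {
            return mid;      // found it (a lucky middle hit is the best case)
        }
        else if (a[mid] < target)
        {
            lo = mid + 1;    // discard the left half
        }
        else
        {
            hi = mid - 1;    // discard the right half
        }
    }
    return -1;               // halved down to nothing; it's not there
}
```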
Big O of 1. So we could say big O of 2, big O of 3. Anytime you want just a constant number, we sort of simplify that as big O of 1. Even if, realistically, it takes 2 or even 100 steps, if it's a constant number of steps, we just say big O of 1. What's an algorithm that's in big O of 1?
AUDIENCE: Finding the length of a variable.
DAVID MALAN: Finding the length of a variable?
AUDIENCE: No, the length if it's already sorted.
DAVID MALAN: Good. OK, so finding the length of something if the length of that something, like an array, is stored in some variable. Because you can just read the variable, or print the variable, or just generally access that variable. And voila, that takes constant time.
By contrast, think back to Scratch. Think back to the first week of C: calling just printf and printing something on the screen is arguably constant time, because it just takes some number of CPU cycles to show that text on the screen. Or wait-- does it? How else might we model the performance of printf? Would someone like to disagree, that maybe it's not really constant time? In what sense might printf's running time, actually printing a string on the screen, be something other than constant?
AUDIENCE: [INAUDIBLE].
DAVID MALAN: Yeah. So it depends on our perspective. If we actually think of the input to printf as being the string, and therefore we measure the size of that input by its length-- so let's call that length n as well-- arguably, printf is itself big O of n because it's going to take you n steps to print out each of those n characters, most likely. At least to the extent that we assume that maybe it's using a for loop underneath the hood.
But we would have to look at that code to understand it better. And indeed, once you guys start analyzing your own algorithms, you'll literally do just that. Sort of eyeball your code and think about-- all right, I have this loop here, or I have nested loops here that are going to do n things n times-- and you can sort of reason your way through the code, even if it's pseudocode and not actual code.
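To make that concrete, here's a hypothetical print_string (purely illustrative; this is not how the real printf is implemented) whose cost is visibly one step per character:

```c
#include <stdio.h>
#include <string.h>

// Emits a string one character at a time and returns how many steps that
// took: n characters, n steps -- arguably big O of n, not constant.
int print_string(const char *s)
{
    int steps = 0;
    size_t n = strlen(s);
    for (size_t i = 0; i < n; i++)
    {
        putchar(s[i]);   // one unit of work per character
        steps++;
    }
    return steps;
}
```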
So what about omega of n squared? What was an algorithm that in the best case, still took n squared steps? Yeah?
AUDIENCE: [INAUDIBLE].
DAVID MALAN: So selection sort. Because that problem really reduced to the fact that, again, I don't know that I've found the current smallest until I've checked all the darn elements. So omega of, say, n, we just came up with one. Insertion sort. If the list happens to be sorted already, in the best case we just have to make one pass through it, at which point we're sure. And then that could be said to be linear, for sure.
What about omega of 1? What, in the best case, might take a constant number of steps? So linear search, if you just get lucky and the element you're looking for is right at the beginning of the list, if that's where you're starting your linear traversal of that list.
And this is true of a number of things. For instance, even binary search is omega of 1. Because what if you get really darn lucky and smack-dab in the middle of your array is the number you're looking for? So you can get lucky there, as well.
This one, lastly, omega of n log n. So n log n, we didn't really talk about yet, but--
AUDIENCE: Merge sort?
DAVID MALAN: Merge sort. That was the cliffhanger of last time, where we proposed, and we showed visually, that there are algorithms-- and merge sort is just one such algorithm-- that are fundamentally faster than some of these other guys. In fact, merge sort is not only n log n in the best case, it's n log n in the worst case too. And when you have this coincidence of omega and big O being the same thing, we can actually describe that as what's called theta, though it's a little less common. But that just means the two bounds, in this case, are the same.
So merge sort, what does this really boil down to for us? Well, recall the motivation. Let me pull up another animation that we didn't look at last time. This one, same idea, but it's a little larger. And I'm going to go ahead and point out first-- we have insertion sort on the top left, then selection sort, bubble sort, a couple of other sorts-- shell and quick-- that we haven't talked about, and heap and merge sort.
So at least try to focus your eyes on the top three on the left and then merge sort when I click this green arrow. But I'll let all of them run, just to give you a sense of the diversity of algorithms that exist in the world. I'm going to let this run for just a few seconds. And if you focus your eyes-- pick an algorithm, focus on it for just a few seconds-- you'll begin to see the pattern that it's implementing.
Merge sort, notice, is done. Heap sort, quick sort, shell-- so it seems we introduced the three worst algorithms last week. But that's good that we're here today to look at merge sort, which is one of the easier ones to look at, even though it probably will bend your mind just a little bit. Here we can see just how much selection sort sucks.
But on the flip side, it's really easy to implement. And maybe for P Set 3, that's one of the algorithms you chose to implement for the standard edition. Perfectly fine, perfectly correct.
But again, as n gets large, if you choose to implement a faster algorithm like merge sort, odds are on larger and larger inputs, your code is just going to run faster. Your website's going to work better. Your users are going to be happier. And so there are these real effects of giving this some deeper thought.
So let's take a look at what merge sort is actually all about. The cool thing is that merge sort is just this. This is, again, what we've called pseudocode, pseudocode being English-like syntax. And the simplicity is sort of fascinating.
So on input of n elements-- so that just means, here's an array. It's got n things in it. That's all we're saying there.
If n is less than 2, return. So that's just the trivial case. If n is less than 2, then obviously it's 1 or 0, in which case the thing is already sorted or nonexistent, so just return. There's nothing to do. So that's a simple case to pluck off.
Else, we have three steps. Sort the left half of the elements, sort the right half of the elements, and then merge the sorted halves. What's interesting here is that I'm kind of punting, right? There's kind of a circular definition to this algorithm. In what sense is this algorithm's definition circular?
AUDIENCE: [INAUDIBLE].
DAVID MALAN: Yeah, my sorting algorithm, two of its steps are "sort something." And so that begs the question, well, what am I going to use to sort the left half and the right half? And the beauty here is that-- even though, again, this is the mind-bending part potentially-- you can use the same algorithm to sort the left half.
But wait a minute. When you're told to sort the left half, what are the two steps going to be next? We'll sort the left half of the left half and the right half of the left half. Damn, how do I sort those two halves, or quarters, now?
But that's OK. We have a sorting algorithm here. And even though you might worry at first this is kind of an infinite loop, a cycle that's never going to end-- it is going to end once what happens? Once n is less than 2. Which eventually is going to happen, because if you keep halving and halving and halving these halves, surely eventually you're going to end up with just 1 or 0 elements. At which point, this algorithm says you're done.
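That pseudocode translates almost line for line into C. Here's one possible sketch (our own layout, not the course's code; the extra array tmp plays the role of merge sort's required second array of space):

```c
#include <string.h>

// Merge sort sketch: sort the left half, sort the right half, merge.
// The caller provides tmp, a scratch array at least n ints long.
void merge_sort(int a[], int tmp[], int n)
{
    if (n < 2)                             // base case: 0 or 1 elements
    {
        return;                            // already sorted; nothing to do
    }
    int mid = n / 2;
    merge_sort(a, tmp, mid);               // sort the left half
    merge_sort(a + mid, tmp, n - mid);     // sort the right half

    // Merge the two sorted halves into tmp...
    int i = 0, j = mid, k = 0;
    while (i < mid && j < n)
    {
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    }
    while (i < mid)
    {
        tmp[k++] = a[i++];                 // right half exhausted; copy rest
    }
    while (j < n)
    {
        tmp[k++] = a[j++];                 // left half exhausted; copy rest
    }

    // ...and copy the merged result back into place.
    memcpy(a, tmp, n * sizeof(int));
}
```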
So the real magic in this algorithm seems to be in that final step, merging. That simple idea just merging two things, that's what's ultimately going to allow us to sort an array of, let's say, eight elements. So I have eight more stress balls up here, eight pieces of paper, and one Google Glass-- which I get to keep.
[LAUGHTER]
DAVID MALAN: If we could take eight volunteers, and let's see if we can play this out, so. Wow, OK. Computer science is getting fun. All right. So how about you three, biggest hand up there. Four in the back. And how about we'll do you three in this row? And four in the front. So you eight, come on up.
[LAUGHTER]
DAVID MALAN: I'm actually not sure what it is. Is it the stress balls? The desk lamps? The material? The internet?
OK. So come on up. Who would like-- keep coming up. Let's see. And this puts you in location-- you're in location one. Uh-oh, wait a minute. 1, 2, 3, 4, 5, 6, 7-- oh, good. All right, we're good. All right, so everyone have a seat, but not on the Google Glass. Let me queue these up. What's your name?
MICHELLE: Michelle.
DAVID MALAN: Michelle? All right, you get to look like the geek, if that's OK. Well, I do too, I suppose, for just a moment. All right, standby. We've been trying to come up with a use case for Google Glass, and we thought it would be fun to just do this when people are onstage. We will record the world from their perspective. All right. Not probably what Google intended. All right, if you don't mind wearing this for the next few awkward minutes, that would be wonderful.
All right, so we have here an array of elements, and that array, as per the pieces of paper in these folks' hands, is currently unsorted.
MICHELLE: Oh, that's so weird.
DAVID MALAN: It's pretty much random. And in just a moment, we're going to try to implement merge sort together and see where that key insight is. And the trick here with merge sort is something that we haven't assumed yet. We actually need some additional space. So what's going to be particularly interesting about this is that these guys are going to move around a little bit, because I'm going to assume that there's an extra array of space, say, right behind them.
So if they're behind their chair, that's the secondary array. If they're seated here, that's the primary array. But this is a resource that we have not leveraged thus far with bubble sort, with selection sort, with insertion sort. Recall last week, everyone just kind of shuffled in place. They didn't use any additional memory. We made room for people by moving people around.
So this is a key insight, too. There's this trade-off, in general in computer science, of resources. If you want to speed up something like time, you're going to have to pay a price. And one of those prices is very often space, the amount of memory or hard disk space that you're using. Or, frankly, the amount of programmer time. How much time it takes you, the human, to actually implement some more complicated algorithm. But for today, the trade-off is time and space.
So if you guys could just hold up your numbers so we can see that you're indeed matching 4, 2, 6, 1, 3, 7, 8. Excellent. So I'm going to try to orchestrate things, if you guys can just follow my lead here.
So I am going to implement, first, the first step of the pseudocode, which is on input of n elements, if n is less than 2, then return. Obviously, that doesn't apply, so we move on. So sort the left half of the elements. So that means I'm going to focus my attention for just a moment on these four guys here. All right, what do I next do?
AUDIENCE: Sort the left half.
DAVID MALAN: So now I have to sort the left half of these guys. Because again, assume to yourself the goal is to sort the left half. How do you do that? Just follow the instructions, even though we're doing it again. So sort the left half. Now I'm sorting these two guys. What comes next?
AUDIENCE: Sort the left half.
DAVID MALAN: Sort the left half. So now these, this seat here, is a list of size 1. And what's your name again?
PRINCESS DAISY: Princess Daisy.
DAVID MALAN: Princess Daisy is here. And so she is already sorted, because the list is of size 1. What do I next do? OK, return, because that list is of size 1, which is less than 2. Then what's the next step? And now you have to kind of backtrack in your mind.
Sort the right half, which is-- what's your name?
LINDA: Linda.
DAVID MALAN: Linda. And so what do we do now that we have a list of size 1?
AUDIENCE: Return.
DAVID MALAN: Careful. We return first, and now the third step-- and if I kind of depict it by embracing the two seats now, now I have to merge these two elements. So now unfortunately, the elements are out of order. But that's where the merging process starts to get compelling.
So if you guys could stand up for just a moment, I'm going to need you, in a moment, to step behind your chair. And if Linda, because 2 is smaller than 4, why don't you come around first? Stay there. So Linda, you come around first.
Now in reality, if it's just an array, we could just move her in real time from this chair to this spot. So imagine that took some constant number of steps, 1. And now-- but we need to put you in the first location here.
And now if you could come around, as well, we're going to be in location two. And even though this feels like it's taking a while, what's nice now is that the left half of the left half is now sorted. So what was the next step, if we now rewind further in the story?
AUDIENCE: Right half.
DAVID MALAN: Sort the right half. So you guys have to do this, as well. So if you could stand up for just a moment? And what's your name?
JESS: Jess.
DAVID MALAN: Jess. OK, so Jess is now the left half of the right half. And so she's a list of size 1. She's obviously sorted. And your name again?
MICHELLE: Michelle.
DAVID MALAN: Michelle is obviously a list of size 1. She's already sorted. So now the magic happens, the merging process. So who's going to come first? Obviously Michelle. So if you could come around back. The space we have available for her now is right behind this chair here. And now if you could come back as well, we now have, to be clear, two halves, each of size 2-- and just for depiction's sake, if you could make a little bit of a space-- one left half here, one right half here.
Rewind further in the story. What step is next?
AUDIENCE: Merge.
DAVID MALAN: So now we have to merge. So OK, so now, thankfully, we just freed up four chairs. So we've used twice as much memory, but we can keep flip-flopping between the two arrays. So which number is to come first? So Michelle, obviously. So come around and take your seat here. And then number 2 is obviously next, so you come here. Number 4, number 6. And again, even though there's a little bit of walking involved, really, these could happen instantly, by moving one-- OK, well played.
[LAUGHTER]
DAVID MALAN: And now we're in pretty good shape. The left half of the entire input has now been sorted. All right, so these guys had the advantage of my-- how did it end up all the girls on the left and all the boys on the right?
OK, so guys' turn now. So I won't walk you through these steps. We'll see if we can reapply the same pseudocode. If you want to go ahead and stand up, and you guys, let me give you the mic. See if you can't replicate what we just did here on the other end of the list. Who needs to speak first, based on the algorithm? So explain what you're doing before you make any foot movements.
SPEAKER 1: All right, so since I am the left half of the left half, I return. Right?
DAVID MALAN: Good.
SPEAKER 1: And then--
DAVID MALAN: Who does the mic go to next?
SPEAKER 1: Next number.
SPEAKER 2: So I'm the right half of the left half of the left half, and I return.
DAVID MALAN: Good. You return. So now what's the next up for you two?
SPEAKER 2: We want to see who's smaller.
DAVID MALAN: Exactly. We want to merge. The space we're going to use to merge you into, even though they're obviously sorted already, we're going to follow the same algorithm. So who goes in back first? So 3, and then 7. And now the mic goes to these guys, OK?
SPEAKER 3: So I'm the right half of the left half, and my n is 1, which is less than 2, so I'm just going to pass--
DAVID MALAN: Good.
SPEAKER 4: I'm the right half of the right half of the right half, and I'm also one person, so I'm going to return. So now we merge.
SPEAKER 3: So we go back.
DAVID MALAN: So you go into the back. So 5 goes first, then 8. And now audience, which is the step we have to now rewind back to in our minds?
AUDIENCE: Merge.
DAVID MALAN: Merging left half and right half of the original left half. So now-- and just to make this clear, make a little bit of space between you two guys. So now that's the two lists, left and right. So how do we now merge you guys into the front row of seats again?
3 goes first. Then 5, obviously. Then 7, and now 8. OK, and now we are?
AUDIENCE: Not done.
DAVID MALAN: Not done, because obviously, there's one step remaining. But again, the reason I'm using this jargon like "rewind in your mind," it's because that's really what's happening. We're going through all of these steps, but we're sort of pausing for a moment, diving deeper into the algorithm, pausing for a moment, diving deeper into the algorithm, and now we have to sort of rewind in our minds and undo all of these layers that we've sort of put on hold.
So now we have two lists of size 4. If you guys could stand up one last time and make a bit of space here to make clear that this is the left half of the original, the right half of the original. Who's the first number that we need to pull into the back? Michelle, of course.
So we put Michelle here. And who has number 2? Number 2 comes on back as well. Number 3? Excellent. Number 4, number 5, number 6, number 7, and number 8.
OK, so it felt like a lot of steps, for sure. But now let's see if we can't confirm sort of intuitively that this algorithm fundamentally, particularly as n gets really large, as we've seen with the animations, is fundamentally faster. So I claim this algorithm, in the worst case and even in the best case, is big O of n times log n. That is, there's some aspect of this algorithm that takes n steps, but there's another aspect somewhere in that iteration, that looping, that takes log n steps. Can we put our finger on what those two numbers are referring to? Well, where-- where'd the mic go?
SPEAKER 1: Would the log n be breaking us up into two-- dividing by two, essentially.
DAVID MALAN: Exactly. Any time we see in any algorithm thus far, there's been this pattern of dividing, dividing, dividing. And it's typically reduced to something that's logarithmic, log base 2. The base could really be anything, but here it's log base 2.
Now what about the n? I can see that we kind of divided you guys-- divided you, divided you, divided you, divided you. Where does the n come from?
So it's the merging. Because think about it. When you merge eight people together, whereby half of them are a set of four and the other half are another set of four, how do you go about doing the merging? Well, you guys did it fairly intuitively.
But if I instead did it a little more methodically, I might have pointed at the leftmost person first with my left hand, pointed at the leftmost person of that half with my right hand, and just subsequently walked through the list, pointing at the smallest element each time, moving my finger over and over as needed throughout the list. But what's key about this merging process is I'm comparing these pairs of elements. From the right half and from the left half, I'm never once backtracking.
So the merge itself is taking no more than n steps. And how many times did I have to do that merging? Well, no more than n, and we just saw that with the final merge. And so if you do something that takes log n steps n times, or vice versa, it's going to give us n times log n.
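The merge step just described, two fingers walking through two sorted halves and never backtracking, can be written on its own (a sketch of our own, not CS50's code):

```c
// Merge two already-sorted arrays into out. Each comparison advances one
// "finger" and writes exactly one element, so merging n total elements
// takes no more than n steps -- the linear factor in n log n.
void merge(const int left[], int nl, const int right[], int nr, int out[])
{
    int i = 0, j = 0, k = 0;
    while (i < nl && j < nr)
    {
        out[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
    }
    while (i < nl)
    {
        out[k++] = left[i++];   // right side exhausted; copy the rest
    }
    while (j < nr)
    {
        out[k++] = right[j++];  // left side exhausted; copy the rest
    }
}
```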
And why is this better? Well, if we already know that log n is better than n-- right? We saw in binary search, the phone book example, log n was definitely better than linear. So that means n times log n is definitely better than n times another n, AKA n squared. And that's what we ultimately feel.
So big round of applause, if we could, for these guys.
[APPLAUSE]
DAVID MALAN: And your parting gifts-- you may keep the numbers, if you would like. And your parting gift, as usual. Oh, and we will send you the footage, Michelle. Thank you. All right. Help yourself to a stress ball.
And let me pull up, in the meantime, our friend Rob Bowden to offer a somewhat different perspective on this, since you can think about these steps happening in a somewhat different way. In fact, the set-up for what Rob's about to show us assumes that we've already done the dividing up of the big list into eight small lists, each of size 1.
So we're changing the pseudocode a little bit just to sort of get at the core idea of how the merging works. But the running time of what he's about to do is still going to be the same. And again, the set-up here is that he's begun with eight lists of size 1. So you've missed the part where he's actually done the log n, log n, log n dividing of the input.
[VIDEO PLAYBACK]
-That's it for step one. For step two, repeatedly merge pairs of lists.
DAVID MALAN: Hm. Only audio is coming out of my computer. Let's try this again.
-Just arbitrarily pick which pairs to merge first. Now we have four lists.
DAVID MALAN: There we go.
-Merging 108 and 15, we end up with the list 15, 108. Merging 50 and 4, we end up with 4, 50. Merging 8 and 42, we end up with 8, 42. And merging 23 and 16, we end up with 16, 23.
Now all our lists are of size 2. Notice that each of the four lists is sorted. So we can start merging pairs of lists again. Merging 15 and 108 and 4 and 50, we first take the 4, then the 15, then the 50, then the 108. Merging 8, 42 and 16, 23, we first take the 8, then the 16, then the 23, then the 42.
So now we have just two lists of size 4, each of which is sorted. So now we merge these two lists. First, we take the 4, then we take the 8, then we take the 15, then 16, then 23, then 42, then 50, then 108.
[END VIDEO PLAYBACK]
DAVID MALAN: Again, notice, he never touched a given cup more than one time after advancing beyond it. So he's never repeating. So he's always moving to the side, and that's where we got our n.
Why not let me pull up one animation that we saw earlier, but this time focusing only on merge sort. Let me go ahead and zoom in on this here. First let me choose a random input, magnify this, and you can sort of see what we took for granted, earlier, merge sort is actually doing.
So notice that you get these halves or these quarters or these eighths of the problem that all of a sudden start to take good shape. And then finally, you see at the very end that bam, everything is merged together.
So these are just three different takes on the same idea. But the key insight, just like divide and conquer in the very first class, was that we decided to somehow divide the problem into something big, into something sort of identical in spirit, but smaller and smaller and smaller and smaller.
Now another fun way to sort of think about these, even though it's not going to give you the same intuitive understanding, is the following animation. So this is a video someone put together that associated different sounds with the various operations for insertion sort, for merge sort, and for a couple of others. So in a moment, I'm going to hit Play. It's about a minute long. And even though you can still see the patterns happening, this time you can also hear how these algorithms are performing differently and with somewhat different patterns.
This is insertion sort.
[TONES PLAYING]
DAVID MALAN: It again is trying to insert each element into where it belongs. This is bubble sort.
[TONES PLAYING]
DAVID MALAN: And you can sort of feel how relatively little work it's doing on each step. This is what tediousness sounds like.
[TONES PLAYING]
DAVID MALAN: This is selection sort, where we select the element we want by going through again and again and again and putting it at the beginning.
[TONES PLAYING]
DAVID MALAN: This is merge sort, which you can really start to feel.
[TONES PLAYING]
[LAUGHTER]
DAVID MALAN: Something called gnome sort, which we haven't looked at.
[TONES PLAYING]
DAVID MALAN: So let me see, now, distracted as you hopefully are by the music, if I can slip a little bit of math in here. So there's a fourth way that we can think about what it means for these functions to be faster than the ones that we've seen before. And if you're coming at the course from a mathematics background, you actually know perhaps already that you can slap a term on this technique-- namely recursion, a function that somehow calls itself.
And again, recall that merge sort pseudocode was recursive in the sense that one of merge sort's steps was to call sort-- that is, itself. But thankfully, because we kept calling sort, or merge sort, specifically, on a smaller and smaller and smaller list, we eventually bottomed out thanks to what we'll call a base case, the hard-coded case that said if the list is small, less than 2 in that case, just return immediately. If we didn't have that special case, the algorithm would never bottom out, and you would indeed get into an infinite loop truly forever.
But suppose that we wanted to now put some numbers on this, again, using n as the size of the input. And I wanted to ask you, what's the total time involved in running merge sort? Or more generally, what's the cost of it in time?
Well it's pretty easy to measure that. If n is less than 2, the time involved in sorting n elements is 0, because we just return. There's no work to be done. Now arguably, maybe it's one step or two steps to figure out the amount of work, but it's close enough to 0 that I'm just going to say no work is required if the list is so small as to be uninteresting.
But this case is interesting. The recursive case was the branch of the pseudocode that said else, sort the left half, sort the right half, merge the two halves. Now why does this expression represent that expense? Well, T of n just means the time to sort n elements. And then on the right-hand side of the equals sign there, the T of n divided by 2 is referring to the cost of what? Sorting the left half. The other T of n divided by 2 is presumably referring to the cost to sort the right half.
And then the plus n? Is the merging. Because if you have two lists, one of size n over 2 and another of size n over 2, you have to essentially touch each of those elements, just like Rob touched each of the cups, and just as we pointed at each of the volunteers on stage. So n is the expense of merging.
Now unfortunately, this formula is also itself recursive. So if asked the question, if n is, say, 16, if there's 16 people on stage or 16 cups in the video, how many total steps does it take to sort them with merge sort? It's actually not an obvious answer, because now you have to sort of recursively answer this formula.
But that's OK, because let me propose that we do the following. The time involved to sort 16 people or 16 cups is going to be represented generally as T of 16. But that equals, according to our previous formula, 2 times the amount of time it takes to sort 8 cups plus 16. And again, plus 16 is the time to merge, and the two times T of 8 is the time to sort left half and right half.
But again, this isn't enough. We have to dive in deeper. This means we have to answer the question, what is T of 8? Well T of 8 is just 2 times T of 4 plus 8. Well, what's T of 4? T of 4 is just 2 times T of 2 plus 4. Well, what's T of 2? T of 2 is just 2 times T of 1 plus 2.
And again, we're kind of getting stuck in this cycle. But it's about to hit that so-called base case. Because what's T of 1, did we claim? 0. So now finally, we can work backwards.
If T of 1 is 0, I can now go back up one line to this guy here, and I can plug in 0 for T of 1. So that means it equals 2 times zero, otherwise known as 0, plus 2. And so that whole expression is 2.
Now if I take T of 2, whose answer is 2, and plug it into the middle line, T of 4, that gives me 2 times 2 plus 4, so 8. If I then plug 8 into the previous line, that gives me 2 times 8, which is 16, plus 8, so 24. And if I then plug that 24 into the line above, 2 times 24 plus 16, we finally get a value of 64.
Now that in and of itself sort of speaks nothing to the big O or the omega notation that we've been talking about. But it turns out that 64 is indeed 16, the size of the input, times log base 2 of 16. And if this is a little unfamiliar, just think back, and it'll come back to you eventually. If this is log base 2, it's like 2 raised to the what gives you 16? Oh, that's 4, so it's 16 times 4.
And again, it's not a big deal if this is sort of a hazy memory now. But for now, take on faith that 16 log 16 is 64. And so indeed, with this simple sanity check, we've confirmed-- but not proved formally-- that the running time of merge sort is indeed n log n.
So not bad. It's definitely better than the algorithms we've seen thus far, and it's because we've leveraged, one, a technique called recursion. But more interesting than that, that notion of dividing and conquering. Again, truly week 0 stuff that even now is recurring in a more compelling way.
Now a fun little exercise, if you've never done this-- and you probably wouldn't have, because sort of normal people don't think to do this. But if I go to google.com and if I want to learn something about recursion, Enter.
[LAUGHTER]
[MORE LAUGHTER]
DAVID MALAN: Bad joke slowly spreading.
[LAUGHTER]
DAVID MALAN: Just in case, it's there. I didn't spell it wrong, and there's the joke. All right. Explain it to the people next to you if it hasn't quite clicked just yet. But recursion, more generally, refers to the process of a function calling itself, or more generally, dividing a problem into something that can be solved piecemeal by solving identical representative problems.
Well, let's change gears for just a moment. We like to end on certain cliffhangers, so let's start to set the stage, for several minutes, on a very simple idea-- that of swapping two elements, right? All of these algorithms we've been talking about the past couple of lectures involve some sort of swapping. Today it was visualized by them getting up out of their chairs and walking around, but in code, we would just take an element from one spot in an array and plop it into another.
So how do we go about doing this? Well, let me go ahead and write a quick program here. I'm going to go ahead and do this as the following. Let's call this-- what do we want to call this one?
Actually, no. Let me rewind. I don't want to do that cliffhanger yet. It will spoil the fun. Let's do this instead.
Suppose that I want to write a little program and that now embraces this idea of recursion. I kind of got ahead of myself there. I'm going to do the following.
First, a quick include of standard io.h, as well as an include of cs50.h. And then I'm going to go ahead and declare int main void in the usual way. I realized I've misnamed the file, so let me just add a .c extension here so that we can compile it properly. Start this function off.
And the function I want to write, quite simply, is one that asks the user for a number and then adds up all the numbers between that number and, say, 0. So first I'm going to go ahead and declare int n. Then I copy some code that we've used for a while. While something is true. I'll come back to that in a moment.
What do I want to do? I want to say printf positive integer please. And then I'm going to say n gets get int. So again, some boilerplate code that we've used before. And I'm going to do this while n is less than 1. So this will ensure that the user gives me a positive integer.
And now I'm going to do the following. I want to add up all of the numbers between 1 and n, or 0 and n, equivalently, to get the total sum. So the big sigma symbol that you might recall. So I'm going to do this by first calling a function called sigma, passing in n, and then I'm going to say printf, the answer is right there.
So in short, I get an int from the user. I ensure it's positive. I declare a variable called answer of type int and store in it the return value of sigma, passing in n as input. And then I print out that answer.
Unfortunately, even though sigma sounds like something whose declaration might be in the math.h file, it's actually not. So that's OK. I can implement this myself. I'm going to implement a function called sigma, and it's going to take a parameter-- let's just call it m, just so it's different. And then up here, I'm going to say, well, if m is less than 1-- this is a very uninteresting problem. So I'm going to go ahead and immediately return 0. It just doesn't make sense to add up all the numbers between 1 and m if m is itself 0 or negative.
And then I'm going to go ahead and do this very iteratively. I'm going to do this sort of old-school, and I'm going to go ahead and say that I'm going to declare a sum to be 0. Then I'm going to have a for loop of int-- and let me do it to match our distribution code, so you have a copy at home. int i gets 1 on up to i is less than or equal to m. i plus plus. And then inside of this for loop-- we're almost there-- sum gets sum plus i. And then I'm going to return the sum.
So I did this quickly, quite admittedly. But again, the main function's pretty straightforward, based on code we've written thus far. It uses the do-while loop to get a positive int from the user. I then pass that int, again called n, to a new function called sigma. And I store the return value, the answer from the black box currently known as sigma, in a variable called answer. Then I print it.
If we now continue the story, how is sigma implemented? I propose to implement it as follows. First, a little bit of error-checking to make sure that the user's not messing with me and passing in some negative or 0 value. Then I declare a variable called sum and set it to 0.
And now I begin to move from i equals 1 all the way up to and including m, because I want to include all the numbers from 1 through m, inclusive. And inside of this for loop, I just do sum gets whatever it is now, plus the value of i.
As an aside, if you've not seen this before, there's some syntactic sugar for this line. I can rewrite this as plus equals i, just to save myself a few keystrokes and to look a bit cooler. But that's all. It's functionally the same thing.
Unfortunately, this code's not going to compile yet. If I run make sigma0, how am I going to get yelled at? What's it going to not like?
AUDIENCE: [INAUDIBLE].
DAVID MALAN: Yeah, I didn't declare the function up top, right? C is kind of stupid, in that it only does what you tell it to do, and you have to do it in that order. And so if I hit Enter here, I'm going to get a warning about an implicit declaration of sigma.
Oh, not a problem. I can go up to the top, and I can say, all right, wait a minute. Sigma is a function that returns an int and it expects an int as input, semicolon. Or I could put the whole function above main, but in general, I'd recommend against that, because it's nice to always have main at the top so you can dive right in and know what a program's doing by reading main first.
So now let me clear the screen. Remake sigma0. All seems to check out. Let me run sigma0. Positive integer please. I'll give it the number 3 to keep it simple. So that should give me 3 plus 2 plus 1, so 6. Enter, and indeed I get 6.
I can do something bigger-- 50, which gives 1275. Just as a tangent, I'm going to do something ridiculous, like a really big number. Oh, that actually worked out-- eh, I don't think that's right. Let's see. Let's really mess with it.
That's a problem. What's going on? The code's not that bad. It's still linear. Whistling is a good effect, though. What's going on?
Not sure if I heard it. So it turns out-- and this is an aside; it's not core to the idea of recursion. It turns out, because I'm trying to represent such a big number, most likely it's being misinterpreted by C as not a positive number but a negative number.
We have not talked about this, but it turns out there are negative numbers in the world in addition to positive numbers. And the means by which you can represent a negative number essentially is, you use one special bit to indicate positive over negative. It's a little more complex than that, but that's the basic idea.
So unfortunately, if C is interpreting one of those bits as actually meaning, oh, this is a negative number, my loop here, for instance, is actually never going to terminate. So if I were actually printing something again and again, we would see a whole lot. But again, this is beside the point. This is really just a sort of intellectual curiosity that we'll come back to eventually. But for now, this is a correct implementation if we assume that the user will provide ints that fit within ints.
But I claim that this code, frankly, could be done so much more simply. If the goal at hand is to take a number like m and add up all of the numbers between it and 1, or conversely between 1 and it, I claim that I can borrow this idea that merge sort had, which was taking a problem of this size and dividing it into something smaller. Maybe not half, but smaller, but representatively the same. Same idea, but a smaller problem.
So I'm actually-- let me save this file with a different version number. We'll call this version 1 instead of 0. And I claim that I can actually reimplement this in this sort of mind-bending way.
I'm going to leave part of it alone. I'm going to say if m is less than or even equal to 0-- I'm just going to be a little more anal this time with my error checking-- I'm going to go ahead and return 0. This is arbitrary. I am just simply deciding if the user gives me a negative number, I'm returning 0, and they should have read the documentation more closely.
Else-- notice what I'm going to do. Else I am going to return m plus-- well, what is sigma of m? It's m, plus m minus 1, plus m minus 2, plus m minus 3, and so on. I don't want to write all of that out. Why don't I just punt? Recursively call myself with a slightly smaller problem, semicolon, and call it a day?
Right? Now here, too, you might feel or worry that this is an infinite loop that I'm inducing, whereby I am implementing sigma by calling sigma. But that's perfectly OK, because I thought ahead and added which lines?
AUDIENCE: [INAUDIBLE].
DAVID MALAN: 23 to 26, which is my if condition. And what's nice about the subtraction here is that I keep handing sigma smaller problems, smaller problems, smaller-- it's not half the size. It's only a baby step smaller, but that's OK. Because eventually, we'll work our way down to 1 or 0. And once we hit 0, sigma's not going to call itself anymore. It's going to immediately return 0.
So the effect, if you sort of wind this up in your mind, is to add m plus m minus 1, plus m minus 2, plus m minus 3, plus dot, dot, dot, m minus m, eventually giving you 0, and the effect is ultimately to add all of these things together. So we have not, with recursion, solved the problem that we couldn't solve before. Indeed, version 0 of this, and every problem to date, has been solvable with just using for loops or while loops or similar constructs.
But recursion, I daresay, gives us a different way of thinking about problems, whereby if we can take a problem and divide it from something somewhat large into something somewhat smaller, I claim that we can solve it perhaps a little more elegantly in terms of the design, with less code, and maybe even solve problems that would be harder, as we'll eventually see, to solve purely iteratively.
But the cliffhanger that I did want to leave us on was this. Let me go ahead and open up a file from-- actually, let me go and do this real fast. Let me go ahead and propose the following. Among today's code is this file here. This one here, noswap.
So this is a stupid little program that I whipped up that claims to do the following. In main, it first declares an int called x and assigns it the value of 1. Then it declares an int y and assigns it the value 2. Then it prints out what x and y are. Then it says, swapping, dot dot dot.
Then it claims to be calling a function called swap, passing in x and y, the idea of which is that hopefully x and y will come back different, the opposite. Then it claims, swapped! with an exclamation point. Then it prints out x and y.
But it turns out that this very simple demonstration down here is actually buggy. Even though I'm declaring a temporary variable and temporarily putting a in it, then I'm reassigning a the value of b-- which feels reasonable, because I've saved a copy of a in temp. Then I update b to equal whatever was in temp. This sort of shell game of moving a into b and b into a by using this middle-man called temp feels perfectly reasonable.
But I claim that when I run this code, as I'll do now-- let me go ahead and paste it in here. I'll call this noswap.c. And as the name suggests, this is not going to be a correct program. make noswap, ./noswap. x is 1, y is 2, swapping, swapped. x is 1, y is 2. This is fundamentally wrong, even though this seems perfectly reasonable to me. And there is a reason, but we're not going to reveal the reason just yet.
For now the second cliffhanger I wanted to leave you with is this, an announcement of sorts on coupon codes. Our innovation with late days this year has provoked a non-trivial number of questions, which was not our intention. The intention of these coupon codes, whereby if you do part of the problem set early, thereby getting an extra day, was really to help you guys help yourself start early, sort of by incentivizing you. Helps us distribute load across office hours better so that it's sort of win-win.
Unfortunately, I think my instructions have not been, to date, very clear, so I went back this weekend and updated the spec in bigger, bolder text to explain bullets like these. And just to say it more publicly, by default, problem sets are due Thursday at noon, per the syllabus. If you start early, finishing part of the problem set by Wednesday at 12:00 PM, the part that relates to a coupon code, the idea is that you can extend your deadline for the P set until Friday. That is, bite off a tiny part of the P set relative to what typically is the larger problem, and you buy yourself an extra day. Again, it gets you thinking about the problem set, gets you to office hours sooner. But the coupon code problem is still required, even if you don't submit it.
But more compellingly is this. (STAGE WHISPER) And those folks leaving early are gonna regret it. As are the folks on the balcony. I apologize in advance to the folks on the balcony for reasons that will be clear in just a moment.
So we are fortunate to have one of CS50's former head teaching fellows at a company called dropbox.com. They have very generously donated a coupon code here for this much space, which is up from the usual 2 gigabytes. So what I thought we would do on this final note is do a bit of a giveaway, whereby in just a moment, we will reveal the winner and who has a coupon code that you can then go to their website, type it in, and voila, get a whole lot more Dropbox space for your appliance and for your personal files.
And first, who would like to participate in this drawing? OK, now that makes it even more fun. The person who receives this 25-gigabyte coupon code-- which is far more compelling than the late days now, perhaps-- is the one who is seated on top of a seat cushion beneath which there is that coupon code. You may now look underneath your seat cushion.
[VIDEO PLAYBACK]
-One, two, three.
[SCREAMING]
-You get a car! You get a car!
DAVID MALAN: We will see you on Wednesday.
-You get a car! You get a car! You get a car! You get a car! You get a car!
DAVID MALAN: Balcony folks, come down here to the front, where we have extras.
-Everybody gets a car! Everybody gets a car!
[END VIDEO PLAYBACK]
NARRATOR: At the next CS50--
SPEAKER 5: Oh my gosh gosh gosh gosh gosh gosh gosh gosh gosh gosh--
[UKULELE PLAYS]