[ Silence ]
>> Alright. Welcome back. This is the end of week 3. So you'll recall at the beginning of week zero, we played around with this, with the whole bunch of our staff up on the stage. And the goal there was to consider just how efficiently, how effectively we could solve a fairly mundane problem of just looking up someone like Mike Smith in this phone book and recall that one of the first passes at it involved, well, start at the beginning, turn to the second page, turn to the third page, dot, dot, dot. A thousand some odd pages later, bam! We happen to pawn Mike Smith. But of course almost everyone except Jason as I recall up here, went at this phonebook by tearing it first in half and then in half and then in half and then in half again and that algorithm or that procedure, we decided was in theory much, much faster than the sort of old school start at the beginning, go to the end and stop as soon as you find Mike Smith. And just to refresh some jargon, if we said that the old school naive but very correct approach was linear in nature and that we were just kind of walking along a straight line from left to right in the phonebook, how did we describe the smarter approach that the CA's and TF's took? Right. It was logarithmic, right? So, sort of a--maybe a flashback to a high school math at this moment but logarithmic in short was better and we'll actually feel what we mean by better today and beyond because today we put aside, see entirely for just a bit, no new syntax, nothing new scary today but we finally start to deliver on this syllabus's promise that 50 is ultimately about thinking, teaching you to think more carefully and solve problems more efficiently and more effectively. So, with that said, recall this, no need to stand up this time, this particular algorithm which was actually very similar in spirit to that phonebook example. In the phonebook example, again the linear approach was left to right but the logarithmic approach, the divide and conquer approach was to start in the middle and then repeat, repeat, repeat tearing half of the problem away again and again. So, you went from a hundred percent to 50 percent to 25 percent to 12 and a half percent and so forth. It was really a whittling down of that phonebook. This algorithm too, recall that first day, we all stood up and we had each of you pair off with the person next to you and then one of you sat down but the other kept both your number plus their number. And so if you continue that logic of having everyone stand then half people sit, half people sit, half people sit, but you never throw away any of the numbers that are in your head because you just hand your number off to the person still standing. In theory at the end of that algorithm, we should have had just one person standing and his or her number should have been the collective count in the room, because everyone recall started at one. Now, that's a bit of a cheat, that approach, and that it was not so much as more efficient algorithm per se. It was really leveraging more than one CPU, right? If each of us is sort of the brains of that algorithm, well, we threw like 400, 500 CPU's at that problem and ran the algorithm in parallel. But we'll see when you only have just one CPU, one core, one brain inside of your own computer, you can still leverage this idea of divide and conquer and ultimately get much, much faster and smarter implementations. Now, rather than use everyone in the room here, thought we'd pick on those of you in the orchestra section, just those of you here on the ground floor, otherwise this isn't too easy. So, if just you folks would humor me for just a moment and execute step one here, let's see if we can't find within this orchestra section the tallest person among you. And this should be easy if you're all standing on flat ground, alright? Execute, oh come on, even you folks standing in the back there, come on.
[ Inaudible Discussions ]
>> Alright, so the group should be getting smaller and smaller. I still see two of you standing.
[ Inaudible Remark ]
[ Laughter ]
[ Inaudible Remark ]
>> Okay, need to do a side by side comparison. Alright, and what's your name again?
>> Renzo.
>> So, Renzo isn't he the tallest person in the orchestra section today? Congratulations. Alright. So, first of all, it didn't go unnoticed, it was a whole lot of opting out in the back section there, the orchestra. That's okay, so Renzo was the tallest and consider in theory how fast this algorithm should have been. If everyone stood up then half of you sat down, half of you, half of you, half of you. So, again if we start with say not just maybe 50, 60, 70 people here but instead start with a thousand people, we then got down to say 500 and then 250 and then 125. If you can kind of think about these numbers, they really start to shrink and shrink and shrink pretty quickly. In fact if we extrapolate like we did in week zero, if we had crazy thing, 4 billion people sitting in just the orchestra section today, how many iterations would there have been of this three step algorithm? So, only 32, right? It would be a kind of a mess but of course with just 32 steps and by 32 we get that from, you can divide 4 billion but in half again and again and again, a total of 32 times. I mean that's kind of a powerful thing. Even if you had 4 billion people here after just 32 rounds of this, everyone would have sat down except one person and that person would be standing like Renzo with the total amount. Now, what's the contrast here, right? That's kind of powerful. It's kind of nice that we're leveraging all these brains. What would be the alternative, right? If I did it sort of old fashion style, I would start down here and I would say, "Would you mind standing up for just a moment?" Alright, so I'm gonna compare these two fellows here and of course this is a very close tie. So, we're gonna break the tie randomly. So, I'm gonna say, Joe, you stay standing. Go ahead and sit down. Now, what do I do again? Well, I now walk down to the rest of the aisle, I find someone else, I compare those two, one of them sits down, I move on to the next, on to the next, on to the next. But, you know, all this while, I'm remembering who is the most tall--the tallest person I've seen thus far. So, I kind of remember that person or drag them along with me but at the end of the day that algorithm is still something linear. I have to make some N minus 1 comparison if there's N of you and I have to compare all of you--one person against all of you potentially to find the tallest. So, that would actually be much slower. If we did have some 60 or so people here, that would take 60 steps, whereas yours would have taken 60 to 30 to 15 to 7 and a half to 4 to 2 to 8, like 7 or 8 steps to do that same thing in this size group. So thank you, Joe. So, linear is obviously a bit slower. So, how can we now leverage this sort of commonsensical ideas that we're using in this human context to actually now leverage them in more powerful context. Well, first what's really the goal? So, let's just kind of plot upper [inaudible] like you might on a graphing calculator or Excel or on a piece of paper, what these things look like. So, if we very generically say that this chart represents on the X axis the size of some problem and size we measure typically numerically, we usually call it N as a generic variable and the size of the problem that we've been playing with here are size--the number of people in this room, so 60. In this case a few hundred, in the total case. So, N equals that value. So, the farther out you go over here, the more people there are to count in the algorithm, the more input there is. Now, on the Y axis here, the time to solve, that's just the generic way of saying how much, how many steps are there gonna be, how many seconds, how many minutes is it gonna take to actually execute this algorithm given these many people. So, if we plotted here in red a straight line, straight implying linear, so if we had N people in this room as the size of the problem grows from left to right, well, the amount of time it would take for me to count you back in the first week or for me to find the tallest person in here manually by comparing every person against--comparing pairs of people again and again 'til I find the tallest. Well, every time you add another person to the orchestra section, it's gonna take me one more step. If a hundred more people come in sit down in the orchestra, 100 more steps. So, it's a one to one, a linear relationship. Now, what did I do in week zero as an optimization of counting everyone in the room? It was kind of slow to say 1, 2, 3, 4, 5, so what was the obvious marginal improvement, alright? So, count in pair, so two people at once, 2, 4, 6, 8, 10, 12, 14 and so forth, and just verbally you can hear how the speed has obviously increased. It's twice that but graphically it's really just the same shape of a line, it just happens to be a little lower, alright? Because you can actually have more people but it's gonna take less time if you look up the Y-axis, obviously because you're counting them twice as--twice as many at a time. But when you all counted yourselves, or in this case found the tallest person in just the orchestra section, well, the shape of that graph is much more powerfully curved and almost flat like this. Whereby if we graph how much time it takes to count the number of people in the mezzanine or find the tallest person--sorry, the number of people in the theater or the tallest person in the orchestra, well, you get this shape instead. And why is that? Well, if we literally double the number of people in this orchestra section, suppose there are 60, 60 more people come in, they sit down, how many more steps does my algorithm take if I'm actually using something like divide and conquer, this parallelism? Just one, and that's kind of the powerful idea. You can double the size of the problem and I'd barely bat an eye at it because I can absorb that problem by just executing one more step. But by contrast the linear algorithms, if I was doing it manually, 1, 2, 3, that would take 60 more steps to count 60 people. So, the goal here ultimately is not gonna be to count simple things like these, but try.
>> And most every time we start implementing new problems and solving more interesting challenges in the P-sets is to try to figure out, well, can I get away with something linear? It's quick, it's easy, I know for loops and while loops, or can I actually make something that's smarter than that, that's more efficient than that, that's better designed. And if you consider the applications here, right, we talked about the 6 degrees of Kevin Bacon example and just asking sort of very reasonable questions like, "Who are my friends of friends on Facebook?" That is not an easy question to answer as the size of your social network grows because of the multiplying effects of having your thousand friends times their thousand friends and so forth, that graph is actually gonna look much more different. In fact, kind of take a mental image of these two straight lines and then this curved green line, green meaning good here. And if we actually zoom out and consider other possible algorithms, those three colored lines have now kind of flattened themselves out on the right hand side there. And what you see plotted here now are numbers like, well, suppose your algorithm was N squared, in other words if you've got N people, it takes N squared steps, 10 people, 100 steps. Well if its N cubed, it's even bigger. If we zoom out further though, you can see that there's at least one type of algorithm. We probably want to avoid throughout life that's anything that looks like this. This is what we generally call an exponential algorithm where you take some constant number like 2 and then raise that to some variable number like N here. And we'll put this into context before long. There's some very famous and some very common problems that potentially fall into that bucket but you can see just from the shape of this graph, if you just to have an input of size 60, 60 people, 60 something and my God it takes 200,000 units of time, and that's a lot compared to anything that's down there. Now, you've all been sectioning for this class and other classes and can you just think about the Harvard has in the fall semester like a few thousand classes of varying sizes and we have a few hundred rooms probably and there's a few hours in the day, 9 to 6-ish that classes might be held in and there's some finite number of faculty, so there's all of these various inputs. And if you just start to think about now I've got a few thousand times of this many rooms times this many students, all of whom have these many constraints that the problem of scheduling people or resources or rooms or classes is actually one of those problems that tends to fall in the 2 to the N bucket, at least if implemented poorly. FAS's sectioning tool which you use for this class and probably for some of your other classes, if you actually Google around, you can find the algorithm that they used and it's actually disclaimed as not being optimal. It is not the perfect solution. It will not maximize all of your happiness across campus because doing so would actually take a ridiculous amount of time. You would essentially have to compare every possible persons, possible maximum happiness given a current schedule then compared against everyone else's optimum and then figure out from that exactly how, who we should put in which class, which section, with which TF, it just takes too long. Certainly if you have many classes, all of whom we're trying to section at once, so they optimize. Maybe there's a bit of randomness. And so with randomness, with optimizations, kind of cutting corners, the out program like that can say, you probably got your first or second or third choice but if you kind of added up all of your happiness, if you could kind of quantize it somehow, odds are most of you are not perfectly happy right now with your section assignments. But that's sort of the nature of these kinds of problems. So, let's just slap some jargon on one thing just so we can express things a little more succinctly later. In programming, in computer science more generally, computer scientists have some shorthand notation for describing how long it takes to run certain algorithms or certain programs. And this notation here just denotes what we'll call big O and this is the--it stands for order, the order of magnitude of running time essentially. So, if we have an algorithm like your grade school counting, 1, 2, 3, 4, we do say that's linear but if you wanna be really fancy, you can say that that algorithm of counting is in big O of N. So, big O of N is again just sort of the mathy [phonetic] way of saying that algorithm in the worst case is going to take you N steps no matter what, because you have to count every person. By contrast, if we did the two Z's counting, counting everyone in pairs, 2, 4, 6, 8 and so forth, we could actually whittle that down to something like big O of N over 2, right, if N is the total number of people divided by 2. And though if we went even smarter and had you guys count yourselves, pairing off again and again so that the problem goes from this size to this size to this size to this size, that we could express as big O of log base 2 of N. And realize that this is pretty much as mathy as a class like this tends to get but it--well, you'll find it helpful in thinking through some of your own programs longer term in the future weeks just as to how well designed something might be and we'll see some alternatives to this big O notation. In fact what we're gonna see too before long is that an algorithm that's big O of N versus an algorithm that's big O of N over 2, most people just say, any constant number like that is actually pretty meaningless. Because you know what? If your algorithm is twice as fast as mine, whereby you count in twosies and I count one at a time, well you know what? All I have to do to catch up to you is wait literally 12 months, I get a brand new computer that's probably gonna be twice as fast, thanks to industry trends and voila, now our algorithms are the same. But that's not what we mean by comparing algorithms. So, we do in fact, when talking about how well designed something is, we just cross out things like divide by 2 or multiply by 2, any of these constant factors. Because again, it then ties us too much to current hardware capabilities and so we try to focus only on the--the formulaic part, the variables. So, let's try to make this a little more real. So, dramatic flourish time. So, in advance, we prepared these white pieces of paper here. Hopefully you didn't see what's behind them. On the--on this blackboard here, and the question at hand is, "How can we find numbers behind these doors efficiently as quickly as possible?" And what's nice about this point in the semester is that now even though you might still be fighting with some new season tax and functions and return variables and stuff that might be new to you, we at least have a pretty good toolkit for storing information,. You've got variables and floats and chars and strings but we also now have what data structure that we introduced a couple of days ago? Yeah, so we have arrays. And an array in English, anyone, is what? Wanna have one person, in just words you might use to explain it to your roommate?
[ Inaudible Remark ]
>> Okay, collection of data points, collection of integers or chars or whatnot and a little--be a little more precise, what's--what about that collection? Finite number of them, and can you be even more specific with regards to how they're laid out in memory?
[ Inaudible Remark ]
>> Perfect, they each take up the same amount of space, int, int, int, int or char, char, char, char and they're all contiguous in memory. And the advantage of the contiguousness of these values is that you can use that very succinct square bracket notation to go from the zeroth element to element 1, element 2, element N minus 1, but we did see some dangers. If you accidentally take an array that's of size N and go to the name of the array bracket N, what did you just do that's bad? Sorry? You fell off the end of the array. And now sometimes this might have no bad effect, right? I printed something out where I went a little too far and all we saw was a space character, not a big deal. But if you a little too far, you're actually gonna start touching RAM that does not belong to you. In fact it might belong to some other program potentially all together, and so the operating system will typically crash in some form and the--the bad news you get from your command line is core dump or segmentation fault, one or both of those realities. But let's assume for now that we at least have arrays and so we have this capacity to store say 8 integers back to back to back. So, this might be our first array here. This might be a second array here and behind each of the doors of paper here on the top, we have some random number in random order. And suppose if I were to ask you to say find me the number 7, what would you do in that top array of integers? This is not even a computer science question. I say, "Here's 8 pieces of paper, there's 8 numbers behind them." It's sort of game show like, find me the number 7. What would you do as a reasonable person? I'm sorry? Alright, so rip off the paper, right, and just start looking for it, right? And unfortunately all of these doors are closed, all of the papers hiding it, so you can't kind of do this human thing and just kind of skim it and be like, "Oh, bam, it's right there." Rather you have to be much more methodical and actually open the door or lift the paper then let it go and then check another one, check another one. And this kinda hammer home--hammers home the point perhaps that a computer too has to do that. We've had the luxury for some 18, 20, 30 years as humans growing up of just kind of being able to glance at something and say, "It's right there, that number is right there." Or looking out on this room and saying, "Feels like 60 people," and you get this aggregate sense. But what your brain, you know, maybe is actually doing and what a computer certainly would need to do is you can't just kind of glance out and say, 60. You have to do something much more methodical, 1, 2, 3, 4, 5. Same thing in Scratch, you might have felt your self frustrated at first because Scratch kind of confines you to using only specific puzzle pieces that exist and you kind of have to figure out how to express something like that would be very natural to say. Like, move the cat up to the left. Well, how do you actually move that cat, how do you move Scratch up to the left? Well, you had to do something like in a loop and you had to have a move and then turn and move and turn.
>> In other words you had to break down common sense and human assumptions into something much more precise. So, one of my favorite CS50 moments was a couple of years ago when we did a demonstration much like this with one of your predecessors named Shawn. Rather than have someone new come up this year, I thought I would engage us and we'll look back at how Shawn approached this problem and hopefully as you watch Shawn participates in this example here, you yourself will be thinking about what you would be doing or would not be doing if you had been Shawn standing in front of some 300 people trying and a little example like this. So, this is Shawn trying to find us the number 7. And again, ask yourself as we watch this, what you might have been doing or yelling at Shawn at the time. Okay, so your task here, Shawn, is the following. I had hidden behind these doors the number 7, but tucked away in some of these doors as well are other nonnegative numbers and your goal is to think of this top row of numbers is just an array. Were just a sequence of pieces of paper with numbers behind them, and your goal is only using the top array here, find me the number 7. And we are then going to critique how you go about doing it.
>> Alright.
>> Find us the number 7 please.
[ Laughter ]
>> No, 5, 19, 13.
[ Laughter ]
>> It's not a trick question. One. At this point your score is not very good, so you might as well keep going. Three, go on. Frankly I can't help but wonder what you're even thinking about.
[ Laughter ]
>> Only the top row, so you got three left, so find me 7.
[ Inaudible Remark ]
[ Laughter ]
>> 17.
[ Laughter ]
[ Inaudible Remark ]
>> 7.
[ Applause ]
>> So, we never again have to do that demonstration 'cause Shawn kind of captured the potential of it so perfectly. So, there really was no right answer to that question, right? The best he could have done was just start tearing the pieces of paper from left to right and find the number 7 and I'm sure he was thinking, "There's got to be some trick to this." And it probably didn't help that all these eyes were bearing down on him. But the problem is that if you're not given any sort of assumptions to go on. You're just told, "Here's a bunch of numbers, they happen to be nonnegative but they're in some random order, find me something." The best you can do is to implement an algorithm in what running time, big O of what, if there's N total doors to look behind? It's big O of N, right? So, that algorithm in the worst case was gonna take Shawn big O of N steps and it almost did. It was big O of, in his case, N minus 1. But as we do with divide by 2, even doing plus this, minus this, who cares? We just typically erase that part of the formula and we just say Shawn's implementation was big O of N. But there was--there's another way we can express running time because Shawn could have gotten lucky, right? If 2 years ago he had actually done something slightly different, he might have avoided that whole awkward moment for himself on stage. What could he have done differently just by chance that would have made that problem go away much faster? Alright, so start on the other end. Right now there's no way he would have known that and had he done that we wouldn't be showing the video 'cause it wouldn't be nearly as much fun to watch but he would have solved that problem in just one step, right? So, in one step you would say is the best case solution for that algorithm. The process is the same. Even if he started from the right and most likely he would have just played the same game but from right to left instead of left right. So, in the worst case, 7 could have been here and if Shawn just so happen to start over there, well, it's the same kind of bad luck. He would be a big O of N. But in the case of good luck where 7 is here and he starts here or 7 is here and he starts here, we would say that that algorithm is yes in big O of N 'cause big O is again worst case but we can also express it as Omega, Omega of some constant number. And how many steps in the best case might have taken Shawn to find the number 7? One, so we would say, in Omega of 1. So, big O parenthesis N for the worst case and then Omega parenthesis 1 for the best case. And maybe it could take him 2 steps, right? Maybe it actually takes 2 steps. Conceptually you have to lift the paper, look at number and that's two steps but the takeaway is that it's a constant number of steps. Irrespective of how many doors there are in the best case, the solution to that problem might only involve just one step. So, could he have done better with these doors? Would you have done anything differently given the same input that he was?
[ Inaudible Remark ]
>> Okay, so you could try the twosies approach where you could at least have kind of made that awkwardness go away twice as fast by checking these two and then these two and then these two and it would take 4 steps maximally instead of 8. So, in formulaic terms, it's still kind of big O of N 'cause it's big O of N over 2 but we said that, whatever, with the constant numbers it's still big O of N, but in reality that would absolutely have been faster, it would have been twice as fast. If he actually had some way of lifting two pages with one hand and then two in the other, he could actually do N over 4 and get that over right quick in just two huge steps or lift all the papers at once. But again these aren't fundamental improvements and he's really given no additional information he can leverage. If I had him taught his whole, they're all in random order. I mean sure, you can start in random order, start here, no, start here, no, start here, no. But at the end, you're gonna have to check them all. And in fact f you try this random approach, what might you also have to do if you're now the human or a computer implementing that kind of randomness? You have to remember where you've been. So, here we see the first example of a tradeoff. Well, you can maybe get lucky and find the number 7 faster but it's gonna cost you something. You're gonna need to keep around some variables or another array just to remember what doors you've been at. For instance, you could have another array of size 8 and just might--much like you would on a piece paper, you could check off which doors you've been--been to that you've opened but that's gonna take you another 8 boolean, some number of bits or bytes of memory. So, what would Shawn have benefited from besides my just flat out saying the number 7 is behind that door? What could I have done to at least help him make his algorithm better?
[ Inaudible Remark ]
>> Okay, yeah. So what if I have arranged it in ascending or descending order, right? What if I steal a very reasonable idea that, you know, Verizon has been doing for a thousand years and its predecessors. You're--our phone numbers are not in random order in this book and for good reason, they're alphabetical, right? But if they're alphabetical, that's not just convenient, that is an assumption we can now exploit to implement this technique known as divide and conquer. So, let's go ahead and [inaudible] for this, we have a volunteer. And I think I can promise not to show you on video 2 years hence but let's see if you graduate first. How about in the back, you wanna come on down? So, let's suppose we did just that and behind these sets of doors, this second array. So, this is indeed what Shawn saw long ago. These were the answers that he was revealing one at a time, okay? Come on up, what's your name?
>> Jeremy.
>> Jeremy, alright. Jeremy, David, nice to meet you. Alright, so now Jeremy's tasked with finding let's say the number 50 but this time--so these numbers are different, just to be clear, they're just different random numbers. But I did take some time before class to sort the numbers in such a way that there still might be gaps between them. The numbers are not gonna be, 48, 49, 50, 51, 52, there could be gaps but the array of numbers now is increasing from left to right. And I ask you, find the number 50. Before you reveal it, what's your first instinct, what do you wanna do?
[ Inaudible Remark ]
>> Okay, let's--just for the folks playing along at home.
>> Take one of two middle [inaudible] numbers.
>> Okay. Okay, and why the middle ones?
[ Inaudible Remark ]
>> Okay, good. So, divide the numbers in half so that much like the phonebook we can then go left to right. Alright, so take your pick, reveal one such door. Okay, so we're at 61, alright? So, this was the number 61 and now this is obviously bigger or smaller than what we're looking for? Alright, so it's bigger, so much like we dramatically try to tear phonebooks in weeks past, this problem now really goes away and we don't have to even know what these numbers are but we can at least ignore that half of the problem entirely. So, we've gone from size 8 to size 4, now what do you wanna do? Alright, so now let's go down the middle. And again just to be clear here, there's really not a middle, so if we were implementing this in a program, you got to have to figure out how to round, you're either gonna have to round down to an even number or round up to an odd number, something like that, but we can at least do that consistently. So, you're rounding up slightly, perfect. So, after--a big round of applause, that was--[applause] Okay? So, partly good luck, right, because that could have taken Jeremy not 2 steps, which it did, but rather 3, right? If you'd kind of rounded down just based on whatever arithmetic you implemented, and was there any reason you went with the right instead of the left?
>> 61 is pretty close. These numbers are [inaudible].
>> Okay.
[ Inaudible Remark ]
>> Okay, good. So, he was actually a little cheating then if taking into account the numbers that we pretended weren't there, but that's fine, right, because if you are given some additional input like the gaps are of this size or the numbers are between this range, that's absolutely something you could exploit and worse case, you just randomly choose. But in short, even though there's only 8 pieces of paper or doors here, we still went from 8 to 4 and then we got lucky when we went from 4 to 1. But even in the worse case, we would have gone from 8 to 4 to 2 to 1. So, that's 3 iterations total and that indeed would be a logarithmic running time. Because if we then said to Jeremy, alright, there's not 8 pieces of paper, I actually maybe went to the effort of putting 16 pieces of paper, how many steps would that have taken you? Well, [inaudible] not twice as many.
[ Inaudible Remark ]
>> Okay, good. So if we had 8 pieces of paper here and it took you 2 steps in that case, suppose we doubled it.
[ Inaudible Remark ]
>> Exactly, just one more division. And that's the powerful thing. We go from 8 to 16 to 32 to 64 and you can keep batting these problems away, chopping them in half ever so easily. So, very well done, thank you so much, alright. [Applause] Alright. So, unfortunately, when you are just given some data like you don't have someone presorting things for you. So at the end of the day, this kind of begs the question, just how efficiently can we actually sort all of these numbers? How do we go about doing it and with what techniques can we do and visualize this? Well, why don't we go ahead and pause here, take our 5-minute break and then we'll bring some more folks up on stage. Alright, so let's put these numbers aside and use numbers that are a little smaller and more manageable. Here are some numbers in essentially random order, and there's only 8 of them. And the fact that I keep using frankly 8 is kind of intentional 'cause it divides by two nicely, but in reality you might get sizes of 9 or 13 or any crazy numbers. So again just realize when it comes time to implement something where you're dividing by 2, you're not necessarily gonna get as perfect numbers as we are. So, you're gonna have to figure out whether to round up or down but you've seen things like the round function already. So, that will come up again before long. So, it's more fun than just kind of talking very boringly about numbers on the screen. We actually try to engage in other [inaudible] worth of folks. Can I ask for your indulgence one last time to have 8 people join us up here? Okay, you are number 1. Come on up. Number 2, 3, 4, 5, 6, 7 and really it's just me and the orchestra today. Okay. Well, it's--wait, wait, I said 7. How about let's go in the back there, 8. Alright, come on down. I think I got 3, 4, 5, 6, 7, perfect. Alright, you raised your hand first, so go ahead if you would and just line yourselves up on stage in a manner consistent with the sample on the screen there, 1, 2, 3, oh there we go. Just in time. Alright. And so--so that the lectern is not in the way, why don't we come over here if you don't mind. Alright, so this is the easy part. Make yourselves look like that. Alright, hopefully we are correct, okay? So, now we need one ninth volunteer and this person actually has to do some thinking on their feet, okay? Come on down, yeah, in the black shirt. Oh, I'm sorry; we'll get you next time. Black shirt, how about it? Come on down. Alright, so in just a moment we're gonna have to figure out how we can take this array of numbers which is effectively random, much like the first array that Shawn actually used, but we need to sort this. And even though this too might seem at first glance, and this is kind of easy, right? Like, there's number 1, let's put 1 there, 8 there, that's the sort of human approach, where you just kind of get an overarching sense of things and you just kind of know instinctively what to do. But again if you're a computer or even a robot, if you think of it that way, you really can only do one thing at a time and so you might have to first look here behind this door and there's a number, then you can move on to the next, then on to the next. But then if you wanna swap people, you actually have to spend some CPU time, some CPU cycles to move them around. So, again we come back to this notion of swap. But unfortunately according to last week's buggy program, swap doesn't really work in C. So, that's something else we'll have to fix before long. And what is your name?
>> Casey [phonetic].
>> Casey, okay. Casey, David. So, Casey is gonna be our sort of godlike figure here and actually moving these folks around. So, if you guys could just take just a step or so back just so Casey has some leg room in front. This may end up being a little awkward. But let's do this. So, Casey, your task first is to sort these people and before you actually instruct them to do something, at least give us, the audience, an idea verbally of what you're going to do. What would you like to do? Sort these people, what would you do first?
>> In a descending order.
>> Yes, so in from 1 on the left to 8 on the right. How would you approach this problem as a normal human being?
[ Inaudible Remark ]
>> Okay, so do that step, go.
[ Inaudible Remark ]
>> Very good so far, okay? What would you do next?
[ Inaudible Remark ]
>> Okay, and next.
[ Inaudible Remark ]
>> Okay, so there's a pattern.
[ Inaudible Remark ]
>> Good.
[ Inaudible Remark ]
>> Okay.
[ Inaudible Remark ]
>> Okay, done. So, this looks correct. Oh, very good.
[ Applause ]
>> Very well done. So, the question now for the audience is, "How long did that take?" Not so much in like seconds on the clock but how many steps? How many operations? How many units of time? Like, how can we begin to quantize what we just did? So, N minus 1. So it felt like if there're 8 people, right, she had to do 8 things minimally. So, there's at least 8 steps. But it's not quite as simple as that, right? Again, if you now think of these humans as an array, each of them in their own bucket of memory, all be it side by side but in a distinct location, how did Casey actually find the number 1? So, let's actually rewind. So, Casey you can come back here. If you guys could revert back to the original configuration, let's now dig a bit deeper because sorting N people is actually not as fast as big O of N steps or even N minus 1. That did not take N steps. We actually have to push a little harder on this question. So, again we rewind and you said, "Okay, you're gonna pluck out number 1." How did you find number 1?
[ Inaudible Remark ]
>> Okay, good. So, a reasonable human would do that. You kind of look at all the numbers, but therein lies a hidden cost, right? How do you look at all those numbers? Well, even if your brain did it super fast, what you probably kind of sort of did was at least glance at 4, glance at 2, glance at 6, glance at 8, glance at 1 and then maybe there's a 0. So you might have glanced at 3, glanced at 7, glanced at 5 and then you concluded that number 1 is indeed the smallest. So, how many steps did we actually just spend? That was actually 8 steps to find the smallest number in this array. And so then what did we do? So, now let's fast forward, so you identified number 1 N steps later, how did you move him over here?
>> I told him to move.
>> Okay. So, you told him to move, but now let's consider that. How does this work? So, what's your name, number 1?
[ Inaudible Remark ]
>> [Inaudible]. So, if you could go ahead and step forward and no one else move just yet, how did you go about inserting yourself into the right part of the list? Okay, so you walked in in the list, so let's keep playing so we could do that.
[ Inaudible Remark ]
>> Yeah, we'll do exactly what you did last time and then we'll pause you. Pause. Now what did you do?
[ Inaudible Remark ]
[ Laughter ]
>> Yeah, we stopped, okay, and then what? Now, if this were an array, what did we just do? We tried to pull up number 1 in location negative number--a negative 1, right? You can't just kind of go to the beginning of the array, you wanted to go at location bracket 0. So--and what's your name number 4?
>> Sam.
>> Sam. So Sam, what really should Sam do here and what should number 2 and 6 and 8 really do? You know, it'd be nice if they kind of made some room. But wait a minute, what did that just cost us? That's 4 more steps, right? So, that's expensive. Well, let's rewind, that feels too expensive, this is taking forever now with all these steps. What's slightly smarter than having all of his neighbors move down? Okay, so just swap who? So, how about just 4 and 1, right? 'Cause Sam here, yeah, he's number 4 but we don't really know anything about him yet. He might be the biggest number, he might be the smallest. So far as we're concerned, Sam is still completely random. So, who cares if we put him in another random spot? So, let's move Sam over there. That frees up bracket 0, so now [inaudible] can go into bracket 0. So, now you can, yes. So, now how many steps did that take? Well, to rewind, it--first took Casey some 8 steps to identify number 1 then it took what? One more step to move Sam and then one more step to move [inaudible] into that spot, so that's like 10 steps. So, it's like N plus 1, N plus 2, give or take, but it's at least N, give or take. So, Casey now, and this story has obviously taken a lot longer now. Now, what do we do next to find number 2, or rather the next smallest?
>> Look at them again.
>> Okay, so we look at them again. Do we keep--need to keep looking at all 8 people though?
[ Inaudible Remark ]
>> Okay good, so there's a slight optimization here, right? We don't need to look at spot number 1--spot number 0 ever again where the number 1 is because we just put him there as the smallest so we can at least whittle this down to a problem of size N minus 1, then next N minus 2, N minus 3, dot, dot, dot, hopefully down to 0. So, how do we find number 2? What is your name?
>> Emy [phonetic].
>> Emy. So Emy happens to be holding number 2, but do we know she's the next smallest? What do you think? Casey, how do you know for sure or not if Emy is the next smallest value?
[ Inaudible Remark ]
>> So, really even though it looks obviously 2 is pretty small, it's probably next to 1. We only know that if we look at a total of 7 numbers. And now this, this is getting a little awkward up here, right? So, let's just fast forward verbally. So, now to find 3, we do N minus 2 steps, then N minus 3 steps, then N minus 4 steps, until finally we find number 8 and we can just leave him in spot number 8. So, that then begs the question, how may total steps was this? It does feel like it was way more than just the initial sense of N steps or N minus 1 to sort all these people when you actually start counting up the comparisons and the swaps. So, really how many was this? Yeah, so if you actually add all this up, this is effectively N steps, plus N minus 1, plus N minus 2, plus N minus 3. And now if you kind of think back to like your high school physics book or math book with all the formulas and whatnot, this is a series of sorts, a summation and so if you actually add N--8 plus 7 plus 6 plus 5 plus 4 and so forth, you essentially get that formula N times N plus 1 over 2, so N squared something. So, in short, it is in fact something like N squared. And again, if we added all these up, that would be the formula we get, N squared something. And how do we feel this? Well, just to if you'll humor me with this part, how do we go and find number 3? Well, you go here, here, here, okay. So, smallest 6, not smaller, smaller 4, oh smaller still 3, so you can kind of feel the inefficiency of this, so find out--pluck out 3, swap him with 6. Now, I can at least optimize. I never again have to look over here but now I start with 8. Is 8 the smallest? At this moment it is. How about--well, 4 is now smaller, 6 is not, 7 is not, 5 is not, so I better remember that 4 is there, so I pull Sam out, swap with 8. So, in short, even though this went super fast the first time, if you really break it down into sort of scratch puzzle pieces or C-loops, well really what Casey would have been doing is walking this far the first time then going back starting here, walking here again. So, this is step 7 steps, then we go back over here. Now, we've got 6 steps. Now, these are real steps. Because arrays are contiguous back to back, we do have random access, right? I can go from bracket 7 to bracket 0 like that because I know exactly where 0 is. But if I wanna compare these people, I can go back here very easily but again, you're literally going back and forth and back and forth and back and forth and that adds up and in fact it gives us something like N squared. Let's do something a little different. So, if you guys could reset one last time to this original configuration. That felt nice but it turns out we can solve this problem actually in like at least half a dozen ways. Some of them increasingly powerful or even slower, so let me just do this. That was perfect implementation. In fact what Casey proposed is something called selection sort. There's a whole bunch of algorithms the world has invented. Selection sort is the one where literally you select the smallest number then you do it again and again and again and just by common sense if you keep selecting the smallest number, you're gonna get a shorted list if you do what Casey just did. There's this other one though called bubble sort and this one you can kind of see as the name implies that there's a bubbling effect. So, let's actually try this, let's compare each of these volunteers side by side and if they're out of order, let's just swap them pairwise. Let's see if we can take a whole different approach here. So, Casey we point first at bracket 0 and bracket 1, are they in order or out of order? Out of order, so we swap? So that took what, 1, 2, 3 steps maybe, right? One step to compare, one step to move Sam and one step to move Emy. So that's like 3 total steps but that's constant, so that's nice, one big 3, a 3-unit step. Alright, so now we compare 4 and 6, and do you wanna swap or not?
>> Don't swap them.
>> Okay, 6 and 8?
>> Don't swap.
>> 8 and 1?
>> Swap.
>> So that's 3 more steps we just spent. 8 and 3.
>> Swap them.
>> 8 and 7?
>> Swap them.
>> 8 and 5.
>> Swap them.
>> Done, right? Alright, still not quite. Right, this algorithm is a little different in that we did fix some problems along the way, right? We fixed a lot of juxtapositions where we fixed the ordering but what is the worst case scenario when trying to sort N humans like this? What's the worst possible input? Right there in reverse order. So, let's actually see what bubble sort is like in the--in the really,the worst case. So, let's ignore this now and now go ahead and just line up in reverse order, 8 to 1, and let's feel how much faster or slower this alternative algorithm is, bubble sort, to that thing called selection sort. So, here we go. Bubble sort begins with 8 and 7.
>> Swap them.
>> 8 and 6?
>> Swap them.
>> 8 and 5?
>> Swap.
>> 8 and 4?
>> Swap.
>> 8 and 3?
>> Swap.
>> 8 and 2?
>> Swap.
>> 8 and 1. Okay, so done but not quite. What's curious about bubble sort, and if you actually kind of think this through logically to the end, this will actually work because if you just keep on going back and forth, back and forth, fixing all of the problems, albeit slowly, they will eventually bubble up the small numbers here and bubble down the big numbers there and that's where bubble sort gets its name. But how many steps did this actually take? Well, in the worst case, number 8 was indeed over here and number 1 was over there so how many spots did number 8 need to move to fix just him? He had to move all the way from the left to all the way to the right. So, that's like N steps, it's actually N minus 1 but it's like N steps. What about the next person? Now, we have to move 7 all the way down. That's N minus 2. Then we have to move 6, N minus 3, then N minus 4. So, where is this going? What's the running time of bubble sort? It's also N squared. Why? Well, there's N total people up here and on every iteration, you might move someone into their perfect position but you still have 7 more people then 6 more people. So, you have to do N things roughly N times, and so even if we don't really do out the math but just think about it, you've got N people and you might have to walk this list N times to fix all of the juxtapositions, all of the misorderings, that's N times N or N squared. So bubble sort sucks, selection sort sucks. Like we saw the graph up there before where linear was good, N over 2 was good, logarithm was amazing but now, where are we at? We're at the N squared which is this curve that's goes off into the positive direction is actually pretty darn slow. So, can we do better? Well, first off all a round of applause if we could for our volunteers here [applause] and Casey in particular, nicely done. [Inaudible] exit there. You can keep the papers as souvenirs. Let's see if we can't see or feel now some of the speed of these things. I'm gonna go over to the courses websites here and I'm gonna go to today's lectures page where we have a couple of demo programs linked. If I go in here to sorting algorithms, so this happens to be a web page that a fellow at another university made that implements in a language called Java or what's called an applet. But in short, this is just a visualization, it's a web based tool with which we can now see this and some other algorithms. So, I'm gonna scroll down here to the bottom and choose my sorting algorithm. You can already see that the world has a bunch and this is barely scratching the surface. We looked at bubble sort, we looked at selection sort but there're things like heap sort and shell sort and quick sort and a whole bunch of other things. So, let's choose bubble sort. What kind of array do I want? Let's choose random initially. View is just an aesthetic thing. The delay will leave there for just a moment. Let me go ahead and click start and see what happens. So, in this particular visualization, each of these blue bars represents a number, the shorter the bar, the smaller the number, the taller the bar, the bigger the number. So, the idea is to move all the short bars to the left and all the big bars to the right. What you see is this thing runs is that it's comparing in red two humans at a time, two bars at a time, and if they are big-small it makes them small-big, in other words it swaps them. So, notice how every time something's out of order, it swaps them. So, what's really happening here? In theory, the biggest bar is making its way from left to right, but we might--that's still not gonna fix the problem. And you can really appreciate that even though we're fixing some problems here and slowly the small bars are trickling down to the left and the bigger bar is bubbling its way up to the right, we now have to do the whole thing again. Now, this speed is way too slow to be fun so let's stop this and let's actually do this much quicker with a much shorter delay. And actually even that's a little tedious, so let's do it even faster. Let's do 15, start. So, same thing, so that's one pass of Casey on the stage, that's a second pass, third pass, fourth, fifth. But again, even though I'm counting these slowly, each of these passes, how many steps is there during each of these passes? Well, it's N, right? There's gonna be a total of N passes. But to do a pass, Casey has to walk N steps across the stage. The reds have to go N steps across the screen. So, bubble sort is indeed something like N times N or big O of N squared which per the chart before is actually kind of bad and you can feel it, right? I'm just kind of stalling here verbally. I have nothing interesting to say at the moment. But we just kind of need to kill time until this algorithm finishes, much like it will in just a moment. Perfect. So, no, no. [Applause] So that's bubble sort, what about selection sort? What's neat about these visualizations is that what might seem like fairly abstract ideas or just common sense ideas, you can really start to appreciate what's going on if we paint a picture for them.
>> So, this is selection sort this time. Let me start it somewhat slow but then we'll speed it up. So, selection sort kind of doesn't seem like he's doing as much work at first because notice the red bar is just kind of looking around trying to find the then smallest element. But what happens with selection sort is that he plucks out that smallest person or bar and then forces it over to the left and then repeats and then repeats. And so again and again, what you're seeing in red is the smallest bar that the algorithm can find on that pass and then again he's booting Sam or whoever is in his way, moving that bar into place and then repeating. So, on each pass, how many steps does it take in selection sort to find the smallest element? Well, it's actually gonna take as many as N steps, right? You might find 2 or 3 pretty quickly but you don't know the computer that 2 and 3 are the smallest and so you compare them against every other element again and again and again. So, you've got N passes each time you're plucking off one element, so this is N times N, to move N elements in place and my god, this is really too slow. So, let's change this to super fast. Now, this does not mean selection sort is faster just because I run it faster. This again is consistent though with this idea, you know your algorithms not faster than mine if you just wait a year and run a new hardware. We really care about the fundamental differences here. So, let's go ahead and do this selection sort. Now it's really wearing along but again, fundamentally it's the same thing. Bubble sort is big O of N squared and selection sort is big O of N squared. But what about this other thing? So, big O is again worst case. What was the notation for best case? Yeah, so Omega, so if we go back to this notation with big O and Omega, we've got big O of N squared now. It's pretty bad. So, this is big O of N squared. That's all I've been saying. What about let's say selection sort? In the best case where you are handed a bunch of humans that are already in order, how much work would Casey do if she were implementing selection sort? Best case? So, who thinks N squared, which was the original answer for worst case? Who thinks N? Excellent, you're almost all wrong. So, why is that, right? So, selection sort is actually big O of N squared but it's also in Omega of N squared. Why? Well, Casey started over here and again suppose the humans are 1, 2, 3, 4, 5, 6, 7, 8, all in the order, what was Casey's first suggestion to us? Well, find the smallest element. How did she do that? Well, she looked to her left and said, "Oh, here's number 1." But did she know at that moment in time that 1 is the smallest? No, she only knew that once she checked here and then here and then here and then here, increasingly disappointed that she's not finding anything smaller but only once she got to number 8 could she definitively conclude that was in fact the smallest number. So, what does she do? She makes no swaps, she knows 1 is in the right place, next what should she have done or would she have done? Find the next smallest element. So she can now ignore number 1 but she now needs to start at number 2 and says, "Okay, 2 is now the current smallest, let's see if there's anyone smaller, 1.5, something like that." Well, let's see, nope, nope, nope, nope, nope, she'll only when she does another N minus 1 steps does Casey know definitively that 2 in fact was the smallest. What she gonna do again? Same mistake, same mistake, same mistake. So as elegant and as simple and as sort of commonsensical as selection sort is, it's always N squared because you don't have this sort of omniscient view of your data. You only check for the current smallest element and to do that you have to check everyone else. So, that's N squared. What about bubble sort though? If I instead revert to bubble sort which was the second algorithm where Casey just swapped pairs of people but then did it again and again and again, we said that in the worst case that's N squared, if people are reversed ordered. What about in the best case? What if all those volunteers came up and they were 1 through 8 in perfect order, how much work would Casey do with bubble sort? So, it would be N, right, and if you're just kind of playing the odds here, most likely the answer to this question is gonna be the opposite to the other one. So, indeed the answer here is going to be N but how? So, bubble sort does have a potential optimization. In the worst case Bubble Sort, N squared, no doubt about it because in the worst case everyone's out of order and we kind of walked through just how much work you have to do to re-bubble everyone through. But in the best case, what might Casey be able to do to optimize so that she stops wasting her time if everyone is already sorted? She first compares 1 and 2, not out of order, 2 and 3, 3 and 4, 4 and 5, 5 and 6, 7 and 8, done. So, what does--what could Casey do? She could now go back to the start and do it again, but what could she do instead? She just needs to leverage some piece of information she just created for herself. There're no swaps. So what if Casey in her implementation just has a variable initialized to 0, and that's just a counter. And anytime she does a swap, she does plus, plus, plus, plus, plus, plus. If she does any swaps, that means, you know what, we've seen this before. Just because I've done swaps doesn't mean the problem is solved, I better go do it again. That should be her reasoning. But if instead she makes a pass through these humans, makes no swaps whatsoever, what does she obviously do? Right, it would be down right stupid to do it again because if you didn't change the state of your world, if you didn't change your array and the answer you got was no swaps, what are you gonna get if--as an answer the next time? Well no swaps, no swaps, no swaps, so what's the opportunity here if counter equals equals 0 exits? And so bubble sort by contrast is absolutely big O of N squared in the worst case but in the best case we would say it's instead Omega of N. She still has to check everyone. It's not Omega of 1, it's not constant time. She can't just glance at number 1 and say, "Feels like it's in order." Still has to look at all N but that's why we get Omega of N. So, this isn't bad, this is feeling kind of good and especially when we speed up our CPU time, it's looking really good but remember where we started. This was counting people one at a time, this was counting people two at a time, this was having you guys count yourselves sitting down half at a time again and again and again but again N squared is not the best place to be in. This here, down at bottom right if we zoom in, that's what N looks like. It's still a straight line, we just kind of [inaudible] with the X and Y-axis to stretch things out but N is still straight as a line there. But if we scroll up, N log int, now that's interesting, we'll see that guy before long. But N squared who we've seen today, that feels bad, right? That means that if your size of your input is say, my god this is only lining up at like 4 or 6, whatever the unit of measure is here, that's bad. It's gonna keep growing faster and faster. So, with something like bubble sort, sorting these elements is going to feel, feel, feel slow unless you do something better. But thankfully there are smarter ways. And again for this promise of teaching us how to do things more efficiently and effectively, let's just get a sense of how much better we can do. So, the world has come up with again yet more sorting algorithms, stooge sort, Q sort, J sort. I'm gonna choose our familiar ones. I'm gonna choose selection sort in this box. I'm gonna choose bubble sort in this one, and then I'm gonna choose another guy, merge sort. We haven't looked at merge sort but someone put a little more thought into implementing that algorithm. It's still just as correct, all three of these are right and Casey in our demonstrations, no one did anything wrong today but we can do better. So, what I'm gonna do here, and these examples are very similar, it's just this, the person who implemented this rotated the bar so that they're now sideways instead of top to bottom, but it's the same idea. We wanna get short bars on one end of the graph and long bars on the other. I'm gonna click as fast as I can on all three of these graphs to have them run simultaneously in hopes of demonstrating which one is better. So, let's just see after a little bit of thought and a little bit of CS50, how much better we can do. Here we go.
[ Pause ]
>> Done. So that is merge sort, this is CS50 and we will see you next week.
[ Applause ]