SPEAKER 1: Hey everyone! Welcome back to section. So glad to see so many of you both here, and everyone who's watching online. So, as usual welcome back. I hope that you all had a lovely weekend, full of rest, relaxation. It was beautiful out yesterday. So, I hope you enjoyed the outdoors.
So first a couple of announcements. Grading. So, most of you should have gotten an email from me about your Scratch Pset, as well as grading for Pset 1. So, just a couple things. Be sure to use check50 in style50. These are meant to be resources for you guys, to make sure that you're getting as many points as you can without needlessly losing them. So, things like style are very important. We are going to take off for it. Some of you may have already noticed that from your Pset. And check50 is just a really easy way to make sure that we're actually returning what needs to be returned to the user, and that everything's working properly.
On the second note, make sure your uploading things to the correct folder. It makes my life just a little bit more difficult if you upload Pset 2 into Pset 1 because when I download things, they don't download correctly. And I know it's a little wonky in a system to get used to, but just be super careful, if only for me, so that when you're getting emails at like 2 A.M. and I'm grading. If not cause I have to look all around for your Pset. Cool.
I know it's early, but I totally got taken off guard by an essay that's due this Friday, that my professors were just like, oh yeah. Remember, you have an essay due on Friday. So, I know no one likes to think about midterms, but your first quiz is on October 15th, which October is starting this week. So, it might be sooner than you expected is all. So that you're not thrown off guard when I mention next week's section that oh, your quiz next week, I thought I'd give you a little bit more of a heads up now.
So, your problem set, number three. How people have read the spec out of curiosity? OK. We got a couple. Kind of down from last week but that's OK. I know it was beautiful out. So Break Out. Definitely after you get done today read your spec at least try like downloading distribution code and running like the first initial thing that they ask you to. Because we are using distribution code and a library that we've only been using-- --It's only the second time we've done this Pset, crazy things can happen with your appliance, and you want to find that out now versus later.
Because if it's Thursday night or it's Wednesday night and for some reason your appliance just doesn't want to run with the library or with the distribution code, that means you can't even start doing the coding. Because you can't check to see if it works. Your not gonna be able to see if it compiles. You want to take care of those early in the week, when you can still email me or one of the other TFs, and we can get those fixed. Because those are issues that are going to stop you from making any real progress. It's not like one bug, that you can just kind of skip over. If you're having issues with your appliance or distribution code, you really want to get that taken care of sooner rather than later. So even if you're not gonna actually start coding, download the distribution code, read the spec, make sure everything's working there. OK? If you can just do that, I promise your lives will be easier. And so you're probably going to do it right now right? OK. So, any questions there? Any logistic things? Everyone's good? OK.
Disclaimer for those of you in the room and online. I'm going to be trying to switch between PowerPoint in the appliance because we are going to be doing some coding today by popular demand of the anonymous suggestion poll I sent out last week. So, we will be doing some coding. So, if you guys also want to fire up your appliances, and you should have got an email from me, with a sample file. Please feel free to do that.
So, we're going to talk about GDB, which is a debugger. It's going to help you kind of figure out where things are going wrong in your code. It's really just a way for you to step through your code as it's happening, and be able to print out variables or see what's actually happening under the hood verses your program just running, it's like faulting, and you're like, no idea what just happened here. I don't know what line it failed at. I don't know where it went wrong. So, GDB is going to help you with that. Also, if you decide to continue yes, and take 61, it will really, really be your best friend, cause I can tell you because I'm going through that class.
We're going to look at binary search, which if you guys remember the great phone book example spectacle from class. We'll be implementing that, and walking through that a little bit more, and then we're going through four different sorts, which are Bubble, Selection, Insertion, and Merge. Cool. So, GDB as I mentioned, is a debugger. And these are kind of the big things, the big functions or commands that you use within GDB, and I will walk you through a demo of it in a second.
So, this is not just going to stay abstract. I'll try and make it as concrete as possible for you guys. So, break. It'll either be break like, some number, which represents a line in your program, or you can name a function. So, if you say break main, it will stop at main, and let you walk through that function.
Likewise, if you have some external function like Swap or Cube, that we looked at last week. If you say break one of those, whenever your program hits that, it'll wait for you to tell it what to do. Before it will just execute so you could actually step inside the function and see what's going on. So, Next, just skips over the next line, goes over functions. Step. These are all little abstract. So, I'm just going to run through them, but you'll see them in use in a second.
Step into a function. So as I was saying, like with Swap, it would allow you to actually as if you're like physically stepping inside, you can mess with those variables, print out what they are, see what's going on. List will literally just print out the surrounding code. So, if you kind of forget where you are in your program, or you're wondering what's going on around it, this will just print out a segment of like five or six lines around it. So, you can get oriented about where you are.
Print some variable. So, if you have the key like in Caesar, that we'll look at. You can say Print Key at any point. It'll tell you what the value is so that, maybe somewhere along the way, you overwrote your key. You can actually tell that because you can actually observe that value.
In the locals, just prints out your local variables. So, anytime you're within a loop, and you just want to see like, oh. What is my I? What is this key value that I initialize here? What is the message at this point? It will just print all of those, so that you don't have to individually say, Print I. Print Message. Print Key. And then Display. What that does is as you step through the program, it'll just make sure that it's displaying some certain variable at every point. So that you also-- --it's kind of a shortcut where you don't have to keep going like, oh. Print Key or Print I. It just will automatically do it for you.
So, with that, we're going to see how this goes. I'm going to try and switch over to my appliance. See if I can do this. All. We're just going to mirror it. There's nothing crazy on my laptop anyways. OK. This needs to be this one. It's so tiny. Let's see if we can do this.
OK. Alice is obviously struggling here just a little bit, but we'll get it in a momento. OK. We are just going to increase this. OK. Can everyone kind of see that? Maybe a little bit? I know it's a little small. You can't quite figure out how to make this larger. If anyone knows. Does anyone know how to make it larger? OK. We're going to roll with it . Doesn't matter anyways because it's just that's the code that you guys should have.
What's more important is the terminal here. And we have here Why is it so small? Settings. Oh. True Ike. How's this? Out of there. Is that better for everyone? OK,. Cool.
You know when you're in a CS class technical difficulties are kind of part of the-- So, let's clear this. OK. So, right here in section, which we had here. Caesar is an executable file. So I made it. So, one thing to realize with GDB is that it only works on executable files. So, you can't run it on a dotsy. You have to actually make sure that your code compiles, and that it can actually be run.
So, make sure that if it doesn't compile, get it to compile, so that you can kind of run through it. So, to start GDB, all you do, Gloria type GDB, and then just the file that you want. I always misspell Caesar. But you want to make sure since it's an executable, ti's dot flash so that means you're going to run CSI you're going to execute this files either with the debugger. OK. So, you do that, you get this kind of gibberish. It's just all things about debugger. You don't really have to worry about it right now. And as you see, we have this open parens, GDP, close parens, and just kind of looks like our command line, right?
So, what we want to do-- --So, the first thing is we want to choose a place to break it. So, there is one bug in this Caesar program that I introduce, that we're going to find out. It What it does is it takes the input Barfoo in all caps, and for some reason it doesn't change A. It just leaves it alone, Is everything else correct, but the second letter A remains unchanged. So, we're going to try and figure out why that is. So, the first thing you typically want to do whenever you start on GDB is figure out where to break it.
So Caesar is a pretty short program. We just have one function, right? What was our function in Caesar? There's only one function, Main right? Main is a function for all your programs. If you didn't have Main, I might be a little worried right now, but I hope you all had Main in there. So, what we can do is we can just break Main, just like that. So, it says, OK. We set our breakpoint one there.
So, now the thing to remember is Caesar takes one command line argument right and we haven't done that anywhere yet. So, what you do is when you actually go to run the program, any program that you're running in GDB that needs command line arguments, you're going to input when you first start running it. So, in this case, we do Run with a key of three. And it will actually start.
So, if you see here, we have If RC is not equal to 2. So if you guys all have that file that I sent out up you'll see that that's like the first line our main function, right? It's checking to see if we have the correct number of arguments. So, if you're wondering if RC is correct, you can do something just like Print RC. RC is two, which is what we expected, right?
So, we can go Next, and continue through. So, we have some key there. And we can print out our key to make sure that's correct. Interesting. Not quite what we expected. So, one thing to realize with GDB also, is that it's not until you actually hit Next, that the line that you just saw is actually executed. So, in this case Key hasn't been assigned yet. So, Key is some garbage value that you see on the bottom there. Negative $120-- --It's one billion and something odd things right? It's not the Key that we expected. But if we hit Next, and then we try and Print key, it's three.
Everyone see that? So, if you get something that you're like, wait. This is completely wrong, and I don't know how this would happen because all I want to do is assign a number, a variable, try hitting Next, try printing it again, and see if that works. Because it's only going to execute and actually assign something after you hit Next. Make sense to everyone? Uh huh?
SPEAKER 2: When you random numbers what does that mean?
SPEAKER 1: It's just random. It's just garbage. It's just something that your computer will randomly assign. Cool. So, now we can move through, and so now we have this plain text GetString. So, let me just introduce what will happen when we hit Next here. Our GDB kind of disappears, right? That's because GetString is now executing, right? So, when we saw plain text equals GetString, open parens and parens, and we hit Next, that has actually executed now. So, it's waiting for us to input something.
So, we're going to input our food which is what it's failing as I told you and that just says that it's finished executing, that the closed bracket means it's exiting out of that loop. So, we can hit Next, and now, as I'm sure you're all familiar from Caesar, this is, what's this line going to do. It's for Int I equals 0, N equals Strlen, plain text, and then I is less than n, I, plus, plus. What is this loop going to do? Open your message. Cool. So, let's start doing that.
So, should this condition match, for our first one? If it's a B, it's plain text I. We can get information about our locals. So, I is zero, and if six, which we expect, and our key is three. All that make sense, right? Those numbers are all exactly what they should be. So, hum? SPEAKER 3: I have random numbers for mine. SPEAKER 1: Well, we can check-- --we can chat about that in a second. But you should be getting this. So, if we have a capital B for our first one, this condition should catch it, right? So, if we hit Next, we see that this If actually executes. Because if you're following along in your code, this line here, where plain text I is replaced with this arithmetic, only executes if the If condition is correct right?
GDB is only going to show you things that are actually executing. So if this If condition wasn't met, it's just going to skip to the next line. OK? So, we have that. This bracket means it's closed out of that loop now. So, it's going to start again. Just like that. So, that we can get info about our locals here, and we see that our first letter has changed, right? It's now an E, as it should be. So, we can continue on.
And we have this check. And this check should work, right? It's A. It should be changed three letters forward. But if you notice, we get something different. So in this case up here, it caught it, and so this line executed, which modified our B. But, in this case here, we have that it just skipped it, and went to the [? L siff. ?] So something's going on there. What that's telling you is that, we know that it should catch here, but it's not. Can anyone see what our problem is in that line? It's a very minute thing. And you could also look at your code. It's also line-- forget what line it is in there-- but it's in the [INAUDIBLE]. Yes?
SPEAKER 4: It's on the greater than page if you read it in the book. SPEAKER 1: Exactly. So, the debugger couldn't tell you that, but the debugger could get you down to a line that you know is not functioning. And sometimes, when especially later in the semester, when you're dealing with a hundred, a hundred few lines of code, and you don't know where it's failing, this is a great way to do it. So, we found our bug. You can fix it in your file, and then you could run it again, and everything would work perfectly. And the biggest thing is this can seem like, OK. Yeah. Cool. You knew what you're looking for. So, you knew what to do.
GDB can be super helpful because you can print out all these things that you wouldn't. It's much more useful than PrintF. How many of you use like PrintF statements to figure out where a bug was, right? So, with this, you don't have to keep going back, and like commenting in PrintF, or commenting out, and figure out what you should be printing. This actually just allows you to step through, print out things as you're going through, so, you can observe how they change in real time, as your program is running.
And it does take a little bit of getting used to. I would highly recommend just kind of being a little frustrated with it for right now. If you spend an hour over the next week learning how to use GDB, you will save yourself so much time later on. And literally. we tell this to people every year, and I remember when I took the class, I was like, I will be fine. No. Pset 6 came on and I was like, I'm gonna learn how to use GDB because I don't know what's going on here.
So if you take the time so use it on smaller programs that you're going to be working on, like working through something like Visionare, like this. Or if you want extra practice, I'm sure I could come up with buggy programs, for you to debug if you'd like.
But if you just take some time to get used to it, just play around with it, it will really serve you well. And it's really one of those things that you just have to try, and get your hands dirty with, before you really understand it. I really only understood it once I had to debug things with it, and it's much nicer to have an idea of how to debug sooner rather than later. OK. Cool. I know that's kind of like a crash course in GDB, and I will definitely work on getting these to look bigger next time. Cool.
So, if we go back to our PowerPoint. Is this going to work? Awh. Yes. OK. So, if you ever need any of those again, there's the list. So Binary Search, which everyone remembers the great spectacle of David ripping phone books in half. I don't really get the phone books anymore, because like where do you get phone books these days? I really do not know. The Binary Search. Does anyone remember how Binary Search works? Anyone at all? Yeah? SPEAKER 5: You know when you look at which half it would be in, Based on that, and get rid of the other half.
SPEAKER 1 Exactly. So, Binary Search, it's kind of a-- --we like to call it divide and conquer. So, what you'll do is you'll look in the middle, and you'll see if it matches what you're looking for. And if it doesn't, then you try to figure out, is it going to be left half or the right half. So, this might be if you're looking at something that's alphabetized, you see, oh. Does Allison come before M? Yes. So, we're going to look at the first half.
Or it could be like with numbers. Anything that you can compare, it can be sorted. You can use binary search on. So, anyone remember this graph or what this is? It's Asymptotic Complexity. So, this graph just describes how long it takes you to solve a problem as you increase the number of things that you're using.
So, we have N, which is linear time. If N over two, which is slightly better, still grows super fast. And then we have Login, which is what we consider Binary Search. If we notice, as your problem gets much and much larger, the time it takes you to solve it doesn't really increase that much. It's like comparable here in the beginning. You're like, OK. Anything here doesn't really matter which one we use, but you get out to a million, a billion. You're trying to find some-- --you're trying to find a needle in a haystack.
I think you want this problem. You want this complexity, not linear because for all you know your gonna be searching through each individual needle, thing of hay, trying to look for your needle. And that's not too fun in my opinion. I like fast. I like efficient. And as hardworking students you guys are, you know work smarter, not harder type thing, how you can make up these algorithms.
So, we're going to walk through just a quick example. I think you guys should have a hand on Binary Search, but in case anyone is a little fuzzy, want to reinforce it, we're going to just go through an example here. So, we're looking for if the array contains seven.
So, first thing we do is look in the middle, right? And also you're going to be coding Binary Search in just a second. So, it's going to be fun. So we look in the middle little arrays 3. Does 3 equal 7? Doesn't. It's six. So, is it less than or greater than seven? Less than. Yes. Nice job guys. I feel I like I should have candy because I want to throw it out into the yards. It's what I am going to do next week. It will keep you guys sharp.
So, we throw away that first half, right? it was less than. we know that everything on that left hand side is going to be less than what we're actually looking for. So, there's no need to pay attention to it. Just forget about it. So, now we look at our right hand side, and we look at the middle over there, and now it's nine. So, 9 is-- --Everyone? Greater than what we're looking for, right? So, we're going to throw away everything to the right. Like that. Now, all we're left with is one. So we check, is this one what we're looking for? it is. We found what we wanted. So we're done. Bilinear Search.
And if you notice, we had seven inputs there. It only took us like three times, but if you're doing like a billion, you guys know how many steps it would take if we had four billion things? Any guesses? It's 32. 32 steps to find something in a four billion element array because of powers of two. So two is to 32, is to four billion.
So pretty crazy how you're still within like a fairly small number of steps to find something in four billion elements. So on that note, we're going to code this so you guys can actually kind of see how this works. All right, so you guys can code. I'm going to let you guys talk for a little bit. Get to know people around you, which is what someone wanted from last section.
So get to know the people around you. Talk for a little bit. And all I want from you guys right now is just try to create an outline of pseudocode. OK? Whoa. All I want from you guys is you're just going to fill in this while case. So I have set these upper and lower bounds which represent the beginning and end of our array. And you are going to actually loop through and figure out what we're doing within this while loop.
So if you can figure out-- I have a hint there-- what are the cases that we have here? So if you want to figure out the cases, we will pseudocode those and then we'll actually code them. And it's going to be, I think, hopefully it'll be a little easier than you expect. Because it's not that much code, actually, which is really cool.
Mm-hm?
STUDENT: [INAUDIBLE]? INSTRUCTOR: Yes. There was something to find in the middle.
STUDENT: So we can use that. OK.
INSTRUCTOR: Perfect. So that's the first thing we need to do. So find the middle. Great. So do you have an idea of how we might actually find the middle with code?
STUDENT: Yeah. n over 2? INSTRUCTOR: So n over 2. So one thing to remember is that your upper and lower bounds change. We keep constricting the part of the array we're looking to. So n over 2 will only work for the first thing we do. So taking upper and lower into account, how might we get that middle element? Because we want the middle between upper and lower, right? Mm-hm?
STUDENT: [INAUDIBLE].
INSTRUCTOR: So we have some middle. And it'll be upper plus lower over 2. Awesome. There we go. One line down. You guys are on your way. So now that we have our middle, what do we want to do? Just in general. You don't have to code it. Yes. STUDENT: [INAUDIBLE]? INSTRUCTOR: So it's plus because you're finding the average between the two of them. So if you think of them as kind of increasing in from the sides, think about it as you approach the middle, you want like that. So if you were on either side of the middle, and we have like 5 and 7. When you add them together you get 12, you divide by 2, is 6.
Sometimes it's hard to explain why that works, but if you work through an example sometimes, it'll help you figure out if it should be plus or minus. Yes.
STUDENT: [INAUDIBLE] exactly in the middle if they had a case where there's a lot of smaller numbers and like one large number?
INSTRUCTOR: So all you need is the middle of the array. So if you had a bunch of small numbers and then one really large number at the end, it doesn't matter. All that matters is that they're sorted, you just want to look at the middle of the array because you're still slicing your problem in half. Cool. So now that we have the middle, what do we do next?
STUDENT: Compare. INSTRUCTOR: The compare. So compare middle to value_wanted. Cool. So you see up here we have this value we want up here. Remember this is an array. So middle refers to the index. So we want to do values of middle. Don't forget if you want to compare, double equals. You do single equals you're just going to reassign it, and then, of course, it's going to be the value you want. So don't do that.
So we're going to see if the values at the middle is equal to the value we want. Don't forget your braces. Dropbox should go away. So what do we do in this case? If it's what do we want to return? We're trying to say.
STUDENT: Print off.
INSTRUCTOR: Well, we don't want to print off. So this is a bool here, so we want to return true or false. We're saying, is this number an [? RRA? ?] So if it is, we just return it true. If I can spell true.
STUDENT: Why wouldn't you return zero? INSTRUCTOR: So you could return zero if you wanted. But in this case because our function returns a bool, we need to return either true or false. STUDENT: When you're saying boolean expression, can you set it equal to false? Like if I want to say, if this condition is not met, like is upper equals false. Will it understand if you just put false on the other side? INSTRUCTOR: Yeah. So actually if you're ever doing something like is upper or is lower, that returns true or false and it's actually bad style to say equals equals true or equals equals false. You want to use that result as itself as your check. Not what I wanted. That's what I wanted. So in the case of you're asking about something like save this in c.
So if we have int main (void) and something like this. And you have if is upper of some input and you're asking if you can do something like this? Right? STUDENT: I was trying to do it [INAUDIBLE]. Because if it's-- INSTRUCTOR: Right. So you want this to be false, right? STUDENT: Yeah. INSTRUCTOR: So in this case you want it to execute if it's not true. So the cool thing you do there is this. So remember exclamation point negates things? It says [INAUDIBLE] means not. So if we look at just this part here, you'd say that evaluates to false as you want it to. Not false is true which means this would execute. Does that make sense? STUDENT: Yeah. INSTRUCTOR: Awesome. OK. So we could just return true in this case. So now we have two other cases in this case. What are our two other cases? Let's just do it this way. So let's start with else if values at the middle is less than the value we want. So our value in the middle is less than the value that we're looking for.
So which bound do you think we want to update? Upper or lower? Upper? So which side of the array are we going to be looking at?
STUDENT: The lower.
INSTRUCTOR: We are we going to be looking at the left. So else if little value is less. So your middle value here is less than what we want. So we want to take the right side of our array. So we're going to update our lower bound. So we'll reassign our lower. And what do you think lower should be? STUDENT: The middle value? INSTRUCTOR: So the middle value-- STUDENT: Plus 1. INSTRUCTOR: --plus 1. Can anyone tell me why we have that plus 1?
STUDENT: [? No value ?] is more equal to it.
INSTRUCTOR: Right. Because we already know that our middle value is not equal to it and we want to exclude it from all subsequent searches. If you forget that plus 1, this will like loop indefinitely. And you'll just be caught in an infinite loop and then you'll segfault and things go bad. So always make sure that you're not including the value that you just looked at. So we take care of that with a plus 1.
So now we have our last condition which I always for safety sake you can check here, else if value at the middle is greater than the value we want. That means that we want the left hand half. So which one are we going to update? Upper. And what is this one going to equal? Middle minus 1 because, of course, we want to make sure that we're not looking at that middle value again. And then we have it. That's it. That's all binary search is. It's not that bad, right? It's like 10 lines of code with white space. So very powerful, very useful, you will be using it in one of your later psets. Maybe not this one, but later. So learn it. Love it. It will treat you well. So does anyone have any questions on binary search? Yes.
STUDENT: Does it matter whether your n is even or odd?
INSTRUCTOR: No. Because we cast it to the middle as an int, it will just truncate it. So it will stay an integer and it will eventually sort through everything. So you don't have to worry about that. Everyone good? Awesome. Cool. So, you guys got this. Slideshow. So as we were talking about, I know David mentioned complexity runtimes.
So in the best case, it's just one, which we call constant time. Can anyone tell me why that might be? What type of scenario would that entail? Mm-hm.
STUDENT: [INAUDIBLE] first-- INSTRUCTOR: So the middle being the first element that we come to, right? So either an array of one or whatever we're looking for just happens to be smack dab in the middle. So that's our best case. You get into real problems, probably not going to reach [INAUDIBLE] that often. What about our worst case? Our worst case is log n. And that has to do with the whole powers of two thing that I talked about.
So in the worst case it would mean that we had to chop the array down until it was an element of one. So we had to chop it down in half as many times as we possibly could. That's why it's log n because you just keep dividing by two. So assumptions, things you need to know if you're ever going to use a binary search. Your elements must be sorted. They have to be sorted because that's the only way you can know if you are able to throw out half of it.
If you had this jumbled bag of numbers and you're saying, OK, I'm going to check the middle number and the number I'm looking for is less than that, I'm just going to arbitrarily throw out one half. You wouldn't know if your numbers in that other half. Your list has to be sorted. As well, this may be going ahead a little bit, but you need to have random access. You need to be able to just go to that middle element. If you have to traverse through something or it takes you extra steps to get to that middle element, it's not log n anymore because you're adding more work into it. And this will make a little more sense in two weeks, but I just kind of wanted to preface, give you guys an idea of what's to come. But those are the two important assumptions that you need for a binary list. Make sure it's sorted. That's the big one for you guys right now. And on that we can go into the rest of our sorts. So four sorts-- bubble, insertion, selection, and merge. They're all kind of cool. If you guys decide to take CS 124, you'll learn about all sorts of sorts. And if you're an XKCD fan, there is a really cool comic about like really ineffective sorts, which I highly recommend you going to look at. One of them is like panic sort, which is like, oh no, return random array. Shutdown system. Leave. So geeky humor is always good.
So does anyone remember kind of like just a general idea of how bubble sort works. You remember?
STUDENT: Yeah.
INSTRUCTOR: Go for it.
STUDENT: So you're going through and if it's bigger, then you swap the two.
INSTRUCTOR: Mm-hm. Exactly. So you just iterate through. You check two numbers. If the one before is bigger than the one afterwards, you just swap them so that in this way all of the higher numbers bubble up towards the end of the list and all the lower numbers bubble down.
Did he show you guys the cool sound effect sorting video? It's kind of cool. So as Robert just said, the algorithm that you just step through the list, swapping the adjacent values if they're not in order. And then just keep repeating until you don't make any swaps. So not bad, right? So we just have a quick example here. So this is going to sort them in ascending order. So when we go through the first time, we look through eight and six are obviously not in order, we swap them.
So look at the next one. Eight and four not in order. Swap them. And then eight and two, swap them. There we go. So after your first pass, you know that your largest number is going to be all the way at the top because it's just going to be constantly larger than everything else and it's just going to bubble up all the way to the end there. Does that makes sense to everyone? Cool.
So then we look at our second pass. Six and four, switch. Six and two, switch. And now we have a few things in order. So for every pass that we make through our entire list, we know that like that many numbers at the end will have been sorted. So we do a third pass, which is one swap. And then on our fourth pass, we have zero slots. And so we know that our array has been sorted.
And that is the big thing with bubble sort. We know that when we have zero swaps, that means that everything is in complete order. It's kind of how we check. So we are also going to code bubble sort which also is not that bad. None of these are that bad. I know they can seem a little scary. I know when I took the class, even when I was teaching the class for the first time last year, I was like, how do I do this? It makes sense in theory, but how do we actually do this? Which is why I also want to walk through code with you guys here. So I have a pseudocode for you guys this time. So just keep this in mind as we're about to transition over. So we have some counter that keeps track of our swaps, because we need to make sure that we're checking that. And we iterate the entire array as we just did with this example. If the element before is larger than the element after where we're at, we swap them and we increment our counter because as soon as we swap, we want to let our counter know that. Any questions there? Something seems funny over here. STUDENT: Do you set the counter to zero every time you go through the loop? Don't you keep going back to zero every time? INSTRUCTOR: Not necessarily. So what happens is we go through here. So do while, remember, this will execute once without fail. So it's going to set the counter equal to zero, then it's going to iterate through. As it iterates through, it will update counter. As it updates counter, when it's done, when it's reached the end of the array, if our list hasn't been sorted, counter will have been updated.
So then it checks the condition and it says, OK, is counter greater than zero. If it is, do it again. You want to reset so that when you go through, counter is equal to zero. If you go through a sorted array, nothing changes, this fails, and you return the sorted list. Does that makes sense? STUDENT: It might in a little bit. INSTRUCTOR: OK. If there's any other question that comes up. Yes.
STUDENT: What would the function be for swapping the elements?
INSTRUCTOR: So we can actually write that if we're going to right now. Cool. So on that note, Alison is going to switch back to the appliance. It's going to be fun. And we have our nice bubble sort thing here. So I already did cycling through the array. We have our swaps that are equal to zero. So we want to swap adjacent elements if they're out of order. So the first thing we need to do is iterate through our array.
So how do you think we might iterate through our array? We have for and i equals 0. We want i to be less than n minus 1 minus k. And I'll explain that in a second. So this is an optimization here where, remember how I said after every pass through the array we know that whatever's on--
So after one pass we know that this is sorted. After two passes we know that all this is sorted. After three passes we know that's sorted. So the way I'm iterating through the array here, is it's making sure to only go through what we know is unsorted. OK? That's just an optimization. You could write it naively just iterating through everything, it would just take longer. With this four loop it's just a nice optimization because we know that after every full iteration through the array here, like every full loop here, we know that one more of these elements will be sorted at the end.
So we don't have to worry about those. Does that makes sense to everyone? That cool little trick? So in that case, if we're iterating through, we know that we want to check if array n and n plus 1 are in order. OK. So here's the pseudocode. We want to check if array n and n plus 1 are in order. So what might we have there? It's going to be some conditional. It will be an if.
STUDENT: If array n is less than array n plus 1. INSTRUCTOR: Mm-hm. Well, less than or greater than. STUDENT: Greater than. Then we want to swap them. Exactly. So now we get into what's the mechanism for swapping them? So we went through this briefly, a type of swap function last week. Does anyone remember how it worked? So we can't just reassign them, right? Because one of them will get lost. If we said A is equal to B and then B is equal to A, all a sudden both of them are just equal to B.
So what we have to do is we have a temporary variable that's going to hold one of ours while we're in the process of swapping. So what we have is we'll have some int temp is equal to-- you can assign it to whichever one you want, just make sure you keep track of it-- so in this case, I'm going to assign it to array n plus 1. So that's going to hold whatever value is in that second block that we're looking at.
And then we can do is we can go ahead and reassign array n plus 1, because we know we have that value stored. This is also one of the big things-- I don't know if any of you had issues where if you switch two lines of code suddenly things worked. Order is very important in CS. So make sure you diagram things out if possible as to what's actually happening. So now we're going to reassign array n plus 1, because we know we have that value stored.
And we can assign that to array n or in this case array i. Too many variables. OK. So now we've reassigned array i plus 1 to equal what's in array i. And now we can go back and assign array i to what? Anyone?
STUDENT: 10.
INSTRUCTOR: 10. Exactly. And one last thing. If we have swapped it now, what do we need to do? What's the one thing that's going to tell us if we ever terminate this program? What tells us that we have a sorted list? If we don't perform any swaps, right? If swaps is equal to zero at the end of this. So whenever you perform a swap, as we just did here, we want to update swaps. And I know there was a question earlier about can you use zero or one instead of true or false. And that's what this does here. So this says if not swaps. So if swaps is zero, which is-- I always get my truths and my falses mixed up. We want us to evaluate to true and it's not. So if it's zero, then it's false. If you negate it with a [? bang ?] it becomes true. So then this line executes.
Truths and false and zeros and ones get crazy. Just if you slowly walk through it it will make sense. But that's what this little bit of code here does. So this checks to see have we done any swaps. So if it's anything besides zero, it's going to be false and the whole thing is going to execute again. Cool?
STUDENT: What does break do?
INSTRUCTOR: Break just breaks you out of the loop. So in this case it would just like end the program and you would just have your sorted list. STUDENT: Amazing. INSTRUCTOR: I'm sorry? STUDENT: Because previously we used written 1 over written zero to present that if that will work or not.
INSTRUCTOR: Yeah. So you can return zero or 1. In this case, because we're not actually doing anything with the function, we just want it to break. We don't really care about it. Brake is also good if it's used for breaking out of four loops or conditions that you don't want to keep executing. It just takes you out of them. It's a bit of a nuance thing. I feel like there's a lot of hand waving, like you'll learn about this soon.
But you'll learn about this soon. I promise. OK. So does everyone get bubble sort? Not too bad. Iterate through, swap things using a temp variable, and we're all set there? Cool. Awesome. OK. Back to the PowerPoint. Any questions in general about these so far? Cool. Mm-hm.
STUDENT: [INAUDIBLE] int main usually. Do you have to have that for this?
INSTRUCTOR: So we were just looking just at the actual sorting algorithm. If you had it within like a larger program, you would have an int main somewhere. Depending on where you use this algorithm, it would determine what's being returned by it. But for our case, we're strictly looking at how does this actually iterate through an array. So we don't worry about it.
So we were talking about best case and worst case scenarios for binary search. So it's also important to do that for each of our sorts. So what do you think is the worst case runtime of bubble sort? You guys remember?
STUDENT: N minus 1. INSTRUCTOR: N minus 1. So that means there are n minus 1 comparisons. So one thing to realize is that on the first iteration, we go through, we compare these two-- so that's 1. These two, three, four. So after one iteration we already have four comparisons. When I'm talking about runtime and n. N represents the number of comparisons as a function of how many elements we have. OK?
So we go through, we have four. The next time you know we don't have to take care of this. We compare these two, these two, these two, and if we didn't have that optimization with the four loop that I wrote, you would be comparing in here anyways. So you would have to run through the array and make n comparisons n times, because every time we run through it we sort one thing.
And every time we run through the array, we make n comparisons. So our runtime for this is actually n squared, which is much worse in our log end because that means if we had four billion elements, it's going to take us four billion squared instead of 32. So not the best runtime, but for some things, you know, if you're within a certain range of elements bubble sort may be fine to use.
OK. So now what is the best case runtime? STUDENT: Zero? Or 1?
INSTRUCTOR: So 1 would be one comparison. Right.
STUDENT: N minus 1?
INSTRUCTOR: So, yeah. So n minus 1. Whenever you have a concept like n minus 1, we tend to just drop it off and we just say n because you have to compare each of these-- each pair. So it would be n minus 1, which we we'd just say is approximately n. When you're dealing with runtime, everything is in approximates. As long as the exponent is correct, you're pretty good.
That's how we deal with it. So that the best case is n, which means that the list is already sorted, and all we do is run through and check that it's sorted. Cool. All right. So as you see here, we just have some more graphs. So n squared. Fun. Much worse than n as we see, and much, much worse than log 2n. And then you also get into log logs. And you take 124, you get into like log star, which is like crazy. So if you're interested, lookup log star. It's kind of fun. So we have this great chart. Just a heads up, this a wonderful chart to have for your mid-term because we long to ask you these thins. So just a heads up, have this on your mid-term on your nice cheat sheet there. So we just looked at bubble sort. Worst case, n squared, best case, n. And we're going to look at the others.
And as you can see, the only one that really does well is merge sort, which we'll get into why. So we're going to go to the next one here-- selection sort. Does anyone remember how selection sort worked? Go for it.
STUDENT: Basically go through an order and create a new list. And just as you're putting elements in, put them in the right place in the new list.
INSTRUCTOR: So that sounds more like insertion sort. But you're really close. They're very similar. Even I get them mixed up sometimes. Before this section I was like, wait. OK. So what you want to do is selection sort, the way you can think about it and the way I make sure I try not to get them mixed up, is it goes through and it selects the smallest number and it puts that at the beginning of your list. It swaps it with that first spot. They actually have an example for me. Awesome. So just a way to think of it-- selection sort, select the smallest value. And we're going to run through an example that I think will help because I think visuals always help. So we start out with something that is completely unsorted. Red will be unsorted, green will be sorted. It will all make sense in a second.
So we go through and we iterate from the beginning to the end. And we say, OK, 2 is our smallest number. So we're going to take 2 and we're going to move it to the front of our array because it's the smallest number we have. So that's what this is doing here. It's just going to swap those two. So now we have a sorted part and an unsorted part. And what's good to remember about selection sort is we're only selecting from the unsorted part.
The sorted part you just leave alone. Mm-hm?
STUDENT: How does it know what is the smallest without comparing it to every other value in the array. INSTRUCTOR: It does compare it. We like skipped it. This is just general overall. Yeah. When we write the code I'm sure you'll be more satisfied. But you store this first element as the smallest. You compare and you say, OK, is it smaller? Yes. Keep it. Here is it smaller? No?
This is your smallest, reassign it to your value. And you'll be much happier when we go through the code. So we go through, we swap it, so then we look at this unsorted portion. So we're going to select three out. We're going to put it on at the end of our sorted portion. And we're just going to keep doing that, doing that, and doing that. So this is our kind of pseudocode here. We'll code it up here in a second. But just something to walk through on a high level. You're going to go from i equals 0 to n minus 2. That's another optimization. Don't worry too much about it. So as you were saying. As Jacob was saying, how do we keep track of what our minimum is? How do we know? We have to compare everything in our list.
So minimum equals i. It's just saying in this case the index of our minimum value. So then it's going to iterate through and it goes from j equals i plus 1. So we already know that that's our first element. We don't need to compare it to itself. So we start comparing it to the next one which is why it's i plus 1 to n minus 1, which is the end of the array there. And we said if array at j is less than array min, then we reassign where our minimum indices is.
And if min is not equal to i, as in where we were back over here. So like when we first did this one. In this case, it would start at zero, it would end up being two. So min would not equal i in the end. That lets us know that we need to swap them. I feel like a concrete example will help much more than this. So I'll code this up with you guys right now and I think it'll be better.
Sorts tend to work that way in that it's often better just to see them. So what we want to do is we first want the smallest element in its position in the array. Exactly what Jacob was saying. You need to store that somehow. So we're going to start here iterating over the array. We're going to say it's our first one just to start with. So we are going to have int smallest is equal to array at i.
So one thing to notice, every time this loop executes, we are starting one step further along. When we start we look at this one. The next time we iterate through, we're starting at this one and assigning it our smallest value. So it's very similar to bubble sort where we know that after one pass, this last element is sorted. With selection sort, it's just the opposite.
At every pass, we know that the first one is sorted. After the second pass, the second one will be sorted. And as you saw with the slide examples, our sorted portion just keeps growing. So by setting our smallest one to arrays i, all it's doing is constricting what we're looking at so as to minimize the number of comparisons we make. Does that make sense to everyone? Do you need me to run through that again slower or in different words? I'm happy to. OK.
So we're storing the value at this point, but we also want to store the index. So we're going to store the position of the smallest one, which is just going to be i. So now Jacob is satisfied. We have things stored. And now we need to look through the unsorted part of the array. So in this case this would be our unsorted. This is i. OK.
So what we're going to do is going to be for a loop. Whenever you need to iterate through an array, your mind could go to for a loop. So for some int k equals-- what do we think k is going to equal to start with? This is what we set as our smallest value and we want to compare it. What do we want to compare it to? It's going to be this next one, right? So we want k to be initialized to i plus 1 to start.
And we want k in this case we already have size stored up here, so we can just use size. Size being the size of the array. And we just want to update k by one each time. Cool. So now we need to find the smallest element here. So if we iterate through, we want to say, if array at k is less than our smallest value-- this is where we're actually keeping track of what's the smallest here-- then we want to reassign what our smallest value is.
This means that, oh, we're iterating through here. Whatever value is here is not our smallest thing. We don't want it. We want to reassign it. So if we're reassigning it, what do you think might be in this code here? We want to reassign smallest and position. So what is smallest now? STUDENT: Array k. INSTRUCTOR: Array k. And what is position now? What's the indices of our smallest value? It's just k. So array k, k, they match up. So we wanted to reassign that. And then after we found our smallest, so at the end of this for loop here we have found what our smallest value is, so we just swap it. In this case, like say our smallest value is out here. This is our smallest value.
We just want to swap it here, which is what that swap function at the bottom did, which we just wrote up together a couple minutes ago. So it should look familiar. And then it will just iterate through until it reaches all the way to the end, which means that you have zero elements that are unsorted and everything else has been sorted. Make sense? A little more concretely? The code help?
STUDENT: For a size, you never really define it or change it, how does it know?
INSTRUCTOR: So one thing to notice up here is int size. So we're saying in this sort-- sort is a function in this case-- it's selection sort, it's passed in with the function. So if it wasn't passed in, you would do something like with the length of the array or you would iterate through to find the length. But because it's passed in, we can just use it. You just assume that the user gave you a valid size that actually represents a size of your array. Cool?
If you guys have any trouble with these or want more practice coding sorts on your own, you should go to study.cs50. It's a tool. They have a checker that you can actually write. They do pseudocode. They have more videos and slides including the ones I use here. So if you're still feeling a little fuzzy, try that out. As always, come talk to me, too. Question?
STUDENT: Are you saying the size is previously defined? INSTRUCTOR: Yes. Size is previously defined up here in the function declaration. So you assume that it's been passed in by the user, and for simplicity's sake, we're going to assume that the user gave us the correct size. Cool. So that's selection sort. Guys, I know we're learning a lot today. It's a dense data for section. So with that, we are going to go to insertion sort.
OK. So before that we have to do our runtime analysis here. So in the best case, granted since I showed you the table I already kind of gave it away. But best case runtime, what do we think? Everything sorted. N squared. Anyone have an explanation for why you think?
STUDENT: You're comparing through-- INSTRUCTOR: Right. You're comparing through. At every iteration, even though we're decrementing this by one, you're still searching through everything to find the smallest one. So even if your smallest value is here at the beginning, you're still comparing it against everything else to make sure it's the smallest thing. So you'll end up running through approximately n squared times. All right. And what's the worst case? Also n squared because you're going to be doing that same procedure. So in this case, selection sort has something that we also call expected runtime. So in the others, we just know the upper and lower bounds. Depending on how crazy our list is or how unsorted it is, they vary between n or n squared. We don't know.
But because selection sort has the same worst and best case, that tells us that no matter what type of input we have, whether it's completely sorted or completely reverse sorted, it's going to take the same amount of time. So in that case, if you remember from our table, it actually had a value that these two sorts don't have, which is expected runtime. So we know that whenever we run selection sort, it's guaranteed to run an n squared time. There is no variability there. It's just expected. And, again, if you want to learn more, take CS 124 in the Spring. All right. We've seen this one. Cool. So insertion sort. And I'm probably going to blaze through this. I won't have you guys code it. We'll just walk through it. So insertion sort is kind of similar to selection sort in that we have both an unsorted and sorted part of the array.
But what's different is that as we go through one by one, we just take whatever number is next in our unsorted, and correctly sort it into our sorted array. It'll make more sense with an example. So everything starts as unsorted, just like with selection sort. And we're going to sort this in ascending order as we have been. So on our first pass we take the first value and we say, OK, you are now in a list by yourself.
Because you are in a list by yourself, you are sorted. Congratulations for being the first element in this array. You're already sorted all on your own. So now we have a sorted and an unsorted array. So now we take the first. What happens between here and here is that we say, OK, we're going to look at the first value of our unsorted array and we're going to input it in its correct place in the sorted array.
So what we do is we take 5 and we say, OK, 5 is greater than 3, so we just insert it right to the right of that. We're good. So then we go on to our next one. And we take 2. We say, OK, 2 is less than 3, so we know that it needs to be at the front of our list now. So what we do is we push 3 and 5 down and we move 2 into that first slot. So we're just inserting it into the correct place it should be.
Then we look at our next one, and we say 6. OK, 6 is greater than everything in our sorted array, so we just tag it on to the end. And then we look at 4. 4 is less than 6, it's less than 5 but it's greater than 3. So we just insert it right into the middle between 3 and 5. So to make that a little bit more concrete, here is kind of the idea of what happened. So for each unsorted element, we determine where in the sorted portion it is.
So keeping in mind the sorted and unsorted, we have to traverse through and figure out where it fits in the sorted array. And we insert it by shifting the elements to the right of it down. And then we just keep iterating through until we have a completely sorted list where unsorted is now zero and sorted takes up the entirety of our list. So, again, to make things even more concrete, we have pseudocode.
So basically for i is equal to 0 to n minus 1, that's just the length of our array. We have some element that is equal to the first array or the first indices. We set j equal to that. So while j is greater than zero and the array, j minus 1 is greater than the element, so all that's doing is making sure that your j really represents the unsorted portion of the array.
So while there is still things to sort and j minus one is-- what is the element her? J was never defined here. It's kind of annoying. OK. Anyways. So j minus 1, you're checking the element before it. You're saying, OK, is the element before wherever I am-- let's actually draw this out. So let's say this is like on our second pass. So i is going to be equal to 1, which is here.
So i is going to be equal to 1. This would be 2, 4, 5, 6, 7. All right. So our element in this case is going to be equal to 4. And we have some j that's going to be equal to 1. Oh, j is decrementing. That's what it is. So j is equal to i, so what this is saying is that as we move forward, we're just making sure that we're not over indexing this way when we're trying to insert things into our sorted list.
So when j is equal to 1 in this case and array j minus one-- so array j minus 1 is 2 in this case-- if that's greater than the element, then all this is doing is shifting things down. So in this case, array j minus one would be array zero, which is 2. 2 is not greater than 4, so this doesn't execute. So the shift doesn't move down. What this does here is just moving your sorted array down. In this case, actually, we could do-- let's make this 3. So if we're to walk through with this example, we're now here. This is sorted. This is unsorted. Cool? So i is equal to 2, so our element is equal to 3. And our j is equal to 2. So we look through and we say, OK, is array j minus one greater than the element that we're looking at? And the answer is yes, right? 4 is greater than 3 and j is 2, so this code executes.
So now what we do an array at 2, so right here, we swap them. So we just say, OK, array at 2 is now going to be 3. And j is going to equal j minus 1, which is 1. That's horrible, but you guys get the idea. J is now equal to 1. And array j is just going to be equal to our element, which was 4. I erased something I shouldn't have or miswrote something, but you guys get the idea.
It move at n. And then if this were, it would loop again and it would say, OK, j is 1 now. And array j minus 1 is now 2. Is 2 less than our element? No? That means that we've inserted this element in the correct spot in our sorted array. Then we can take this and we say, OK, our sorted array is here. And it would take this number 6 and be like, OK, is 6 less than this number? No? Cool. We're fine.
Do it again. We say 7. Is 7 less than the end of our sorted array? No. So we're fine. So this would be sorted. Basically all this does is it's saying take the first element of your unsorted array, figure out where it goes in your sorted array. And this just takes care of swaps to do that. You're basically just swapping until it's in the right spot. The visual image is that you're moving everything down by doing that.
So it's like half bubble sort-esque. Check out study 50. I highly recommend trying to code this on your own. If you have any issues or you want to see sample code for an insertion sort, please let me know. I'm always around. So worst case runtime and best case runtime. As you guy saw from the table I already showed you, it's both n squared and n.
So kind of going off of what we talked about with our previous sorts, worst case runtime is that if it's completely unsorted, we have to compare all of these n times. We do a whole lot of comparisons because if it's in reverse order, we're going to say, OK, this is the same, this is good, and this one will have to be compared against the first one to be moved back. And as we get toward the tail end, we have to compare, compare, and compare against everything.
So it ends up being approximately n squared. If it's correct then you say, OK, 2, you're good. 3, you're compared to 2. You're good. 4, you just compare to the tail. You're good. 6, compare to the tail, you're fine. So for every spot if it's already sorted, you're making one comparison. So it's just n. And because we have a best case runtime of n and a worst case runtime of n squared, we have no expected runtime.
It just depends on the chaos of our list there. And again, another graph and another table. So differences between sorts. I'm just going to breeze through, I feel like we've talked extensively about how they all kind of vary and link together. So merge sort is the last one I shall bore you guys with. We do have a pretty colorful picture. So merge sort is a recursive algorithm. So do you guys know what a recursive function is?
Anyone want to say? You want to try? So a recursive function is just a function that calls itself. So if you guys are familiar with the Fibonacci sequence, that's deemed recursive because you take the previous two and add them together to get your next one. So recursive, I always think of recursion as like a spiral so you're like spiraling down into it. But it's just a function that calls itself.
And, actually, really quickly I can show you what that looks like. So recursive here, if we look, this is the recursive way to sum over an array. So all that we do is we have sum function sum that takes a size and an array. And if you notice, size decrements by one each time. And all it does is if x is equal to zero-- so if the size of the array is equal to zero-- it returns zero.
Otherwise it sums this last element of the array, and then takes a sum of the rest of the array. So it's just breaking it down into smaller and smaller problems. Long story short, recursion, function that calls itself. If that's all you got out of this, that's what a recursive function is. If you take 51, you will get very, very comfortable with recursion. It's really cool. It made sense at like 3 AM one night out. And I was like, why have I never use this?
So for merge sort, basically what it's going to do is it's going to break it down and break it down until it's just single elements. The single elements are easy to sort. We see that. If you have one element, it's already considered sorted. So on an input of n elements, if n is less than 2, just return because that means it's either 0 or 1 as we've seen. Those are considered sorted elements.
Otherwise break it in half. Sort the first half, sort the second half, and then merge them together. Why it's called merge sort. So we have here we'll sort these. So we keep having them until the array size is 1. So when it's 1, we just return because this is a sorted array, and this is a sorted array, and that's a sorted array, we're all sorted. So then what we do is we start merging them together.
So the way you can think about merging is you just remove the smaller number of each of the sub arrays and just append it to the emerged array. So if you look here, when we have these sets we have 4, 6, and 1. When we want to merge these, we look at these first two and we say, OK, 1 is smaller, it goes to the front. 4 and 6, there's nothing to compare it to, just tag it on to the end.
When we combine these two, we just take the smaller one of these two, so it's 1. And now we take the smaller of these two, so 2. Smaller of these two, 3. Smaller of these two, 4, 5, 6. So you're just pulling off these. And because they've been sorted previously, you just have one comparison each time there. So more code here, just representation. So you start at the middle and you sort left and the right and then you just merge those.
And we don't have code for merge right here. But, again, if you go on study 50, it'll be there. Otherwise come talk to me if you're still confused. So cool thing here is that best case, worst case, and expected runtime are all in log n, which is far better than we've seen for the rest of our sorts. We've seen n squared and what we actually get here is n log n, which is great.
Look at how much better that is. Such a nice curve. So much more efficient. If you ever can, use merge sort. It will save you time. Then again, as we said, if you're down in this lower region, it doesn't make that much of a difference. You get up to thousands and thousands of inputs, you definitely want a more efficient algorithm. And, again, our lovely table of all sorts that you guys learned today.
So I know it's been a dense day. This isn't necessarily going to help you with your pset. But I just want to make a disclaimer that section is not just about psets. All this material is fair game for your midterms. And also if you do continue on with CS, these are really important fundamentals that you would need to know. So some days will be a little more pset help, but some weeks will be much more actual content that may not seem super useful to you right now, but I promise if you continue on will be very, very useful.
So that's it for section. Down to the wire. I did it within one minute. But there you go. And I will have donuts or candy. Is anyone allergic to anything, by the way? Eggs and milk. So donuts are a no? OK. All right. Chocolate no? Starburst. Starbursts are good. OK. We're going to have Starburst next week then. That's what I'll get. You guys have a great week. Read your spec.
Let me know if you have any questions. Pset two grades should be out to you by Thursday. If you have any questions about how I graded something or why I graded something the way I did, please email me, come talk to me. I'm a little crazy this week, but I promise I will still reply within 24 hours. So have a great week, everyone. Good luck on your pset.