All right, so, computational complexity. Just a bit of a warning before we dive in too far-- this'll probably be among the most math-heavy things we talk about in CS50. Hopefully it won't be too overwhelming, and we'll try to guide you through the process, but fair warning-- there's a little bit of math involved here. Now, in order to make good use of our computational resources in the real world, it's really important to understand algorithms and how they process data. If we have a really efficient algorithm, we can minimize the amount of resources we need to process our data. If we have an algorithm that's going to take a lot of work to process a really large set of data, it's going to require more and more resources-- which is money, RAM, all that kind of stuff.
So, being able to analyze an algorithm using this tool set, basically, asks the question-- how does this algorithm scale as we throw more and more data at it? In CS50, the amount of data we're working with is pretty small. Generally, our programs are going to run in a second or less-- probably a lot less particularly early on.
But think about a company that deals with hundreds of millions of customers, and needs to process that customer data. As the number of customers gets bigger and bigger, it's going to require more and more resources. How many more? Well, that depends on how we analyze the algorithm, using the tools in this toolbox. When we talk about the complexity of an algorithm-- which you'll sometimes hear referred to as time complexity or space complexity, but we're just going to call complexity-- we're generally talking about the worst-case scenario. Given the absolute worst pile of data that we could be throwing at it, how is this algorithm going to process or deal with that data? We generally call the worst-case runtime of an algorithm big-O. So an algorithm might be said to run in O of n or O of n squared. And more about what those mean in a second.
Sometimes, though, we do care about the best-case scenario. If the data is everything we wanted it to be-- absolutely perfect-- and we were sending that perfect set of data through our algorithm, how would it handle that situation? We sometimes refer to that as big-Omega. So in contrast with big-O, we have big-Omega: big-Omega for the best-case scenario, big-O for the worst-case scenario. Generally, when we talk about the complexity of an algorithm, we're talking about the worst-case scenario. So keep that in mind.
And in this class, we're generally going to leave the rigorous analysis aside. There are entire fields devoted to this kind of thing. When we talk about reasoning through algorithms-- which we'll do piece by piece for many of the algorithms in this class-- we're really just talking about reasoning through them with common sense, not with formulas or proofs or anything like that. So don't worry-- this won't turn into a big math class.
So I said we care about complexity because it asks the question, how do our algorithms handle larger and larger data sets being thrown at them? Well, what is a data set? What did I mean when I said that? It means whatever makes the most sense in context, to be honest. If we have an algorithm that processes strings, we're probably talking about the size of the string-- that's the data set, the number of characters that make up the string. If we're talking about an algorithm that processes files, we might be talking about how many kilobytes comprise that file. And that's the data set. If we're talking about an algorithm that handles arrays more generally, such as sorting algorithms or searching algorithms, we're probably talking about the number of elements that comprise an array.
Now, we can measure an algorithm-- and when I say we can measure an algorithm, I mean we can measure how many resources it takes up, whether those resources are bytes or megabytes of RAM, or how much time it takes to run. And we can call this measure, arbitrarily, f of n, where n is the number of elements in the data set, and f of n is how many somethings-- how many units of resources-- it requires to process that data.
Now, we actually don't care about what f of n is exactly. In fact, we very rarely will-- and certainly never in this class-- dive into any really deep analysis of what f of n is. We're just going to talk about what f of n is approximately, or what it tends to. And the tendency of an algorithm is dictated by its highest-order term. We can see what I mean by that by taking a look at a more concrete example.
So let's say that we have three different algorithms. The first takes n cubed units of resources to process a data set of size n. The second takes n cubed plus n squared units of resources to process a data set of size n. And the third takes n cubed minus 8 n squared plus 20 n units of resources to process a data set of size n.
Now again, we really aren't going to get into this level of detail. I really just have these up here as an illustration of a point I'll be making in a second, which is that we only really care about the tendency of things as the data sets get bigger. So if the data set is small-- say, of size 1-- there's actually a pretty big difference between these algorithms. The third algorithm there takes 13 times the amount of resources to run relative to the first one.
If our data set is of size 10-- bigger, but not necessarily huge-- we can see that there's still a bit of a difference, but now the third algorithm is the more efficient one. It's actually 60% more efficient-- it takes 40% of the resources. It takes 400 units of resources to process a data set of size 10, whereas the first algorithm, by contrast, takes 1,000 units of resources to process a data set of size 10. But look what happens as our numbers get even bigger.
Now, the difference between these algorithms starts to become a little less apparent, and the fact that there are lower-order terms-- or rather, terms with lower exponents-- starts to become irrelevant. If the data set is of size 1,000, the first algorithm runs in a billion steps, the second algorithm runs in a billion and a million steps, and the third algorithm runs in just shy of a billion steps. It's pretty much a billion steps-- those lower-order terms become really irrelevant. And just to really hammer home the point-- if the data input is of size a million, all three of these take pretty much one quintillion-- if my math is correct-- steps to process it. That's a lot of steps. And the fact that one of them might take a couple hundred thousand, or even a couple hundred million fewer, when we're talking about a number that big-- it's kind of irrelevant. They all tend toward approximately n cubed, and so we would actually refer to all of these algorithms as being on the order of n cubed, or big-O of n cubed.
Here's a list of some of the more common computational complexity classes that we'll encounter in algorithms, generally. And also specifically in CS50. These are ordered from generally fastest at the top, to generally slowest at the bottom. So constant time algorithms tend to be the fastest, regardless of the size of the data input you pass in. They always take one operation or one unit of resources to deal with. It might be 2, it might be 3, it might be 4. But it's a constant number. It doesn't vary.
Logarithmic time algorithms are the next best. And a really good example of a logarithmic time algorithm you've surely seen by now is the tearing apart of the phone book to find Mike Smith. We cut the problem in half with each step. And so as n gets larger and larger and larger-- in fact, every time you double n, it only takes one more step. That's a lot better than, say, linear time, where if you double n, it takes double the number of steps, and if you triple n, it takes triple the number of steps-- one step per element.
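The phone book idea translates to code as binary search over a sorted array-- a sketch, not the exact code from lecture. Each comparison discards half of the remaining range, which is where the logarithm comes from:

```c
#include <stdbool.h>

// Binary search over a SORTED array: each comparison halves the
// remaining range, so the worst case is O(log n) comparisons.
bool binary_search(int value, int values[], int n)
{
    int low = 0;
    int high = n - 1;
    while (low <= high)
    {
        int mid = low + (high - low) / 2;
        if (values[mid] == value)
            return true;
        else if (values[mid] < value)
            low = mid + 1;   // discard the lower half
        else
            high = mid - 1;  // discard the upper half
    }
    return false;
}
```

Doubling the array from 1,000 to 2,000 elements adds only one more halving step, just as described above.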
Then things get a little less great from there. You have linearithmic time, sometimes called log-linear time or just n log n. And we'll see an example of an algorithm that runs in n log n, which is still better than quadratic time-- n squared. Or polynomial time more generally-- n to any power greater than two. Or exponential time, which is even worse-- C to the n, some constant number raised to the power of the size of the input. So if the data input is of size 1,000, it would take C to the 1,000th power steps. That's a lot worse than polynomial time.
Factorial time is even worse. And in fact, there really do exist infinite time algorithms, such as so-called stupid sort, whose job is to randomly shuffle an array and then check to see whether it's sorted. And if it's not, randomly shuffle the array again and check to see whether it's sorted. As you can probably imagine, there's a worst-case situation where it never actually sorts the array-- that algorithm would run forever. And so that would be an infinite time algorithm. Hopefully you won't be writing any factorial or infinite time algorithms in CS50.
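Here's a sketch of what stupid sort (often called bogosort) might look like-- the function names are my own. In practice it finishes quickly on tiny arrays, but nothing guarantees it ever terminates, which is exactly the point:

```c
#include <stdbool.h>
#include <stdlib.h>

// Check whether the array is in ascending order.
static bool is_sorted(int a[], int n)
{
    for (int i = 0; i < n - 1; i++)
        if (a[i] > a[i + 1])
            return false;
    return true;
}

// Randomly shuffle the array (Fisher-Yates).
static void shuffle(int a[], int n)
{
    for (int i = n - 1; i > 0; i--)
    {
        int j = rand() % (i + 1);
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}

// "Stupid sort": shuffle until sorted. In the worst case, this
// never terminates -- an infinite time algorithm.
void stupid_sort(int a[], int n)
{
    while (!is_sorted(a, n))
        shuffle(a, n);
}
```

On a 3-element array the expected number of shuffles is only around 3! = 6, but the worst case is unbounded.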
So, let's take a little more concrete look at some of the simpler computational complexity classes. We have two examples here of constant time algorithms, which always take a constant number of operations in the worst case. In the first example, we have a function called four_for_you, which takes an array of size 1,000 but then apparently doesn't actually look at it-- doesn't really care what's inside of that array-- and always just returns 4. So, despite the fact that that algorithm takes in 1,000 elements, it doesn't do anything with them. It just returns 4. It's always a single step.
In fact, add_two_nums-- which we've seen before as well-- just processes two integers. It's not a single step. It's actually a couple of steps. You get a, you get b, you add them together, and you output the result. So it's a few steps, but it's always constant, regardless of a or b. You have to get a, get b, add them together, output the result. So that's a constant time algorithm.
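Sketched in code, those two constant time functions might look like this-- the bodies here are my reconstruction based on the descriptions above:

```c
// Takes a 1,000-element array but never looks inside it.
// Always exactly one step, no matter what the array holds: O(1).
int four_for_you(int array[1000])
{
    return 4;
}

// A handful of steps (get a, get b, add, return), but the number
// of steps never depends on the values of a or b: still O(1).
int add_two_nums(int a, int b)
{
    return a + b;
}
```

The key point: "constant" doesn't mean one step-- it means the step count doesn't grow with the input.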
Here's an example of a linear time algorithm-- an algorithm that may take one additional step as your input grows by one. So, let's say we're looking for the number 5 inside of an array. You might have a situation where you find it fairly early. But you could also have a situation where it's the last element of the array. In an array of size 5, if we're looking for the number 5, that would take 5 steps. And in fact, imagine that 5 isn't anywhere in the array. We still have to look at every single element of the array in order to determine that 5 isn't there.
So in the worst case-- the element is last in the array, or doesn't exist at all-- we still have to look at all n elements. And so this algorithm runs in linear time. You can confirm that by extrapolating a little bit: if we had a 6-element array and we were looking for the number 5, it might take 6 steps. If we have a 7-element array and we're looking for the number 5, it might take 7 steps. As we add one more element to the array, it takes one more step. That's a linear algorithm in the worst case.
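That search can be sketched as a simple linear search-- a generic version, not the exact lecture code:

```c
#include <stdbool.h>

// Linear search: check each element in turn. In the worst case
// (the value is last, or absent entirely), we examine all n
// elements, so this runs in O(n).
bool linear_search(int value, int values[], int n)
{
    for (int i = 0; i < n; i++)
    {
        if (values[i] == value)
            return true;
    }
    return false;
}
```

Searching for 5 in {1, 2, 3, 4, 5} takes five comparisons; searching for 7 also takes five, because every element must be ruled out.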
Couple quick questions for you. What's the worst-case runtime of this particular snippet of code? So I have a for loop here that runs from j equals 0 all the way up to m, and what I'm saying here is that the body of the loop runs in constant time. So, using the terminology that we've already talked about, what would be the worst-case runtime of this algorithm? Take a second. The inner part of the loop runs in constant time, and the loop itself is going to run m times. So what's the worst-case runtime here? Did you guess big-O of m? You'd be right.
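A concrete version of that snippet-- the function name and the step counter are just for illustration-- makes the O(m) behavior visible:

```c
// A single for loop from 0 to m with a constant-time body.
// The body executes exactly m times, so the runtime is O(m).
int single_loop_steps(int m)
{
    int steps = 0;
    for (int j = 0; j < m; j++)
    {
        steps++;  // stand-in for the constant-time loop body
    }
    return steps;
}
```

Doubling m doubles the number of times the body runs-- the signature of linear growth.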
How about another one? This time we have a loop inside of a loop. We have an outer loop that runs from zero to p, and we have an inner loop that also runs from zero to p, and inside of that, I state that the body of the loop runs in constant time. So what's the worst-case runtime of this particular snippet of code? Well, again, we have an outer loop that runs p times, and on each iteration of that loop, we have an inner loop that also runs p times. And then inside of that, there's the constant-time snippet.
So if we have an outer loop that runs p times, inside of which is an inner loop that runs p times-- what is the worst-case runtime of this snippet of code? Did you guess big-O of p squared? You'd be right again.
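And here's a concrete sketch of that nested-loop snippet-- again, the name and counter are illustrative:

```c
// Nested loops: the outer loop runs p times, and on each of those
// iterations the inner loop runs p times, so the constant-time
// body executes p * p times in total -- O(p^2).
int nested_loop_steps(int p)
{
    int steps = 0;
    for (int i = 0; i < p; i++)
    {
        for (int j = 0; j < p; j++)
        {
            steps++;  // stand-in for the constant-time body
        }
    }
    return steps;
}
```

Note how doubling p quadruples the step count-- that's the quadratic growth that makes n squared algorithms painful on large data sets.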
I'm Doug Lloyd. This is CS50.