>> Welcome back to CS 50. This is Week 3. And on the evaluations at the end of last fall we got a comment from a student about how they apparently were incentivized to come to class early simply because we started most every lecture with a little something stupid. So I thought we'd begin two minutes early so that we can squeeze in two such minutes of motivation. This one is related, of course, to Problem Set 2, which, for those of you doing the hacker edition, challenges you not to implement ciphers but to do the reverse, cryptanalysis, and actually decrypt some known people's passwords. So with that, I give you this perhaps famous clip.
>> What's going on, what are you doing to my daughter?
>> Permit me to introduce the brilliant young plastic surgeon, Dr. Philip Slotkin, the greatest nose job man in the entire universe and Beverly Hills.
>> Your Highness.
>> Nose job, I don't understand, she's already had a nose job. It was her sweet 16 present.
>> No, it's not what you think. It's much, much worse. If you do not give me the combination to the [Inaudible] Dr. Slotkin will give your daughter back her old nose!
>> No! Where did you get that?
>> All right, I'll tell, I'll tell.
>> No, daddy, no, you mustn't.
>> You're right, my dear. I'll miss your new nose, but I will not tell him the combination no matter what.
>> Very well. Dr. Slotkin, do your worst.
>> My pleasure.
>> No! Wait, wait! I'll tell. I'll tell.
>> I knew it would work. All right, give it to me.
>> The combination is 1.
>> So the combination is one, two, three, four, five.
>> That's the stupidest combination I ever heard in my life, that's the kind of thing an idiot would have on his luggage.
>> Thank you, your Highness.
>> What did you do?
>> I turned off the [Inaudible] --
>> No you didn't, you turned off the whole movie.
>> I must have pressed the wrong button.
>> Put it back on, put the movie back on. Got to get that thing fixed. We're back, and we have the combination.
>> We're done with you. Go back to the golf course and work on your putts.
>> Let's go, Arnold, come Gretchen. Of course you know I'll still have to bill you for this.
>> Bet she gives great helmet.
>> Did it work? Where's the key?
>> It works sir, we have the combination.
>> Great. Now we can take every last breath of fresh air from planet [Inaudible]. What's the combination?
>> One, two, three, four, five.
>> One, two, three, four, five?
>> That's amazing. I've got the same combination on my luggage! Prepare Spaceball One for immediate departure.
>> Yes sir.
>> And change the combination on my luggage.
>> Okay, it really doesn't get stupider than that. So this is CS 50. We left off last week looking at cryptography, the motivation for which ultimately was to introduce a problem domain for Problem Set 2. But more than that, it was to make clear this reality, if you've already sat down to tackle P Set 2, or will soon this week, that what's ultimately a fairly interesting idea, taking plain text, converting it to cipher text, and generally doing something useful, encrypting information, really reduces to some things that are very, very simple. You're going to find that taking in plain text is a matter of calling something like GetString, which of course returns to you a string, which of course is just an array of characters. Well, encrypting text, then, is really going to reduce itself to just iterating over that array from I equals 0 on up to the length of the string, and then doing some kind of manipulation of each of the characters. Maybe in the case of Caesar, some addition. But there's a problem with the Caesar cipher. So the Caesar cipher is this algorithm via which you take some text and quote unquote rotate each of the characters by some number of places. And ROT13 was a very specific instance of that, where the secret rotational value, the so-called key, was in that case 13. But we can generalize. It can be anything, from 0 on up to infinity. But of course only a few of those, say 26, actually have some utility. How would you go about cracking, that is, figuring out someone's password, if it were encrypted with the Caesar cipher? If I hand you a string of text, it looks like nonsense because it's been encrypted with ROT13, not twice, but once. What would you do?
[ Inaudible audience comment ]
>> What's that? Okay, so frequency analysis. So a fancy way of saying, well, there are some letters in the alphabet that are more popular than others. If you've ever watched Wheel of Fortune, you'll realize that most contestants know this. So maybe you look for the most frequently recurring letter in the cipher text and realize, hmm, Q, Q appears an awful lot. But wait a minute, this has been encrypted. It's probably the case that Q is the result of encrypting a more common letter, maybe E, for instance, and getting Q as a result. So you might be able to reverse engineer what the secret key is by making some inferences. Or even if you're not that clever and you've never heard this term frequency analysis, but you know it's been encrypted at least with Caesar, I mean, what is another approach, a correct approach, if a naive one, even? Yeah?
[ Inaudible audience comment ]
>> Yeah, there are only 26 keys that might have been used if we're talking about Caesar. So yeah, it might be a little tedious, but it's not going to take forever just to try all possible keys. And in fact, you don't have to try all possible keys on the entire message, especially if it's an entire letter or paragraph or whatever that's been encrypted. You'd try the first few letters, and if you start seeing English pop out, or maybe French or Spanish or anything that can be expressed with A through Z, odds are if it looks like a word, if it looks like a sentence, bam, you've found the key. So Caesar is not all that strong, and maybe it was in fact used back in the day, but there have certainly been improvements on it since. Because even we, with a piece of paper and a pencil, could crack the Caesar cipher. Well, the Vigenère cipher, which came along some years later, is a little more sophisticated. It takes the approach of Caesar, of rotating letters in the alphabet by some number of places, but it rotates different letters by different numbers of places. So instead of using K, the numeric key, to encrypt every single letter in your plain text, it uses a different key for each successive letter. Now that doesn't go on forever, you usually pick a fixed number of keys, and in fact, if you just need a fixed number of keys, you just choose what we call a keyword. So something like foobar. Now, as an aside, if you've wondered why I constantly resort to foo and bar and baz and all of these strange names for variables, you'll find that this is just a silly computer science thing, a silly programming thing, a convention. Whenever you need to whip out a random variable for the sake of discussion, foo is typically one's first instinct. Bar follows that, and then baz, and then qux, and then there's a whole bunch of silly ones after that. So foobar is sort of a canonical example of a random word that we might use.
But we just have to realize that each letter in foobar in this case just represents a different key. So F represents one key, O represents another. O happens to represent the same. B is another key, so in other words you take your keyword, you line it up under your plain text, and you rotate each of the plain text characters by the number of places implied by the key. So now what is F? Well, let's see, if we start counting at zero as we typically do, A is zero, so then it's B, C, D, E, F. So F represents the key 5. So what does that mean? Well, if my plain text P is hello world, and I need to rotate H by F, that is, by 5 places, I go H, I, J, K, L, M. Okay, so the resulting cipher text letter is M, and then I repeat that same process using different keys. Now presumably the plain text, the message I want to encrypt, is hopefully going to be longer than the key. If you're remembering a key that's longer than your actual message, you're losing some efficiency there, perhaps. So foobar is only six letters. So what if our plain text is longer than six letters? Well, you just start repeating again and again and again. So let's ask the question for just a moment, then. If there are 26 possible keys for Caesar, how many possible keys are there for Vigenère, by contrast? Little louder, little more confident. Okay, infinitely many, certainly, because we can pick any possible keyword of any length. But let's generalize. Suppose we have N letters in our keyword for Vigenère. How many N-letter keywords are possible? Right, so 26 to the N. So if you have an N-letter keyword, six in this case, and each of those letters can be any of 26 possible letters, A through Z, you've got 26 options times 26 options times 26, so that's 26 to the N. And then if I relax the constraint that the key is six letters, and I say, ehh, the key is any number of letters, then you have even more.
You have 26 to the 2, plus 26 to the 3, plus 26 to the 4, so in short, the key space is a lot larger. Now ultimately you could still crack the Vigenère cipher in a few ways. You could do some intelligent manipulation, you could look for patterns, for instance, you could do frequency analysis, some fairly sophisticated heuristics, or you could try all possible keys. Start with all the one-letter keys, then once you've tried all those and gotten back nonsense, try all the two-letter keys, the three-letter keys. And that algorithm, though naive, would eventually work, at least in theory, if you have enough time to wait around. Unfortunately, these days, when you've got two gigahertz of power underneath your own desk or in your laptop, you can churn through a lot of keys very quickly, and in fact there's this algorithm called DES, D-E-S, and this is actually an algorithm that's used in the function that's typically used on a Linux or UNIX or Mac OS system, still today, to encrypt your own passwords. So you've got a user name and password, maybe on your own computer, certainly on nice.fas.harvard.edu now, and your password, fortunately, is stored in encrypted fashion, and this is why, for instance, the user assistants, beg them as you might, cannot tell you your password if you forget it, because the system stores it in encrypted fashion. So this algorithm, this function called DES was used for some time, many years after Vigenère, but in more recent, modern times, and with DES there were 72 quadrillion possible keys because of the number of bits used to encrypt something with this particular algorithm. Now we won't go into the specifics, I put this little snippet from a textbook up just to kind of hint at the relative complexity of this. Vigenère is very simple, you can do it with paper and pencil. DES starts to take a computer or calculator, as all of these arrows and this flowchart generally imply. But a computer can compute DES-encrypted text fairly quickly.
Unfortunately, too quickly. So the reality is these days if you encrypt something with DES, you know, if someone has enough time and enough CPU cycles at their disposal they can crack your password. And that's, in fact, exactly what Problem Set 2's hacker edition is all about. Now this is, you know, interesting, maybe fun, in the context of passwords on some random university's system. It's a lot more worrisome if random people, students in CS 50 with a laptop on their lap, can crack information that has been encrypted with something like DES. DES is not very secure any more. 56-bit keys, as it once used, are no longer very secure. So there are variants. In fact, someone came up with the bright idea of not just using DES, but let's run DES three times, call it triple DES, or 3DES. And that actually is a little more secure, but not terribly so. So the world has come up with more sophisticated algorithms. And if you take, or have time to take over your years here, Computer Science 120, and there's a cryptography class, and there are some graduate level ones, it's a really interesting space. But the funny thing is a lot of the security of today's cryptographic mechanisms ultimately boils down to assumptions. The safety of them boils down to assumptions. In other words, there is this algorithm called RSA, and you might have seen this mentioned on various websites, you might have heard of this algorithm before. It basically boils down to a really fancy way of allowing websites to communicate with browsers securely, but that's just a specific case. You can use it for far more interesting purposes: ATMs, bank transactions, a whole bunch of interesting stuff. But ultimately, the security of RSA, whether you're using keys that are 56 bits long, which is pretty short these days, or 128 bits long, or 1,024 bits long, ultimately, whether or not this algorithm actually works reduces to whether mankind can come up with a fast way of factoring large numbers.
So again, long story short, this algorithm called RSA boils down to assuming that mankind, and their computers, cannot take a really big number and factor it really quickly into two large prime numbers. So put another way, if you want to encrypt some information and you want to use an algorithm like RSA, you take two really big prime numbers, and it's not hard to generate prime numbers, even big ones, and it's not hard to multiply them together. But it is much harder to undo that process, when all you do is hand an adversary or hand a computer a really big number and then say, hey, this just so happens to be the product of two really big primes, tell me what they are. We, thus far, have not come up with a relatively fast way of figuring that out. It would take more time than we have in our lifetimes, for instance, to make a sweeping generalization. But the key is, no pun intended, that there are a lot of interesting opportunities to improve upon or even break some algorithms to which the world is now entrusting our money, our security, and a whole bunch of other things. To give you a sense of what a bigger key looks like, this is intentionally rather small and deliberate, this is a key, a really big number, that happens to be represented with alphabetical characters and numbers, just for efficiency's sake, that represents a key one might use with an algorithm like RSA. This is a lot bigger than a number from 1 to 26, this is a lot bigger than a word like foobar. So this hints at the types of security that are actually in use today. And so that's where we pick up today.
So we spent a lot of time in the past couple of weeks talking about syntax in C and programming, and now, finally, that we've gotten some of the minutiae out of the way, what a for loop is, what a while loop is, and granted, you might still need another week or two of practice to get comfortable with even those syntactical details, we finally have the ability to start talking at a slightly higher level about how you actually start to think more effectively, as the syllabus and course catalog claim, how you can solve problems more efficiently, as is the other promise made. And we started this course with this sort of visually interesting example of taking a really big phone book, and I claimed I could find Mike Smith much faster than, say, a completely naive person would by starting at the front and flipping through the As, and then the Bs, and then the Cs. Because I took this approach of dividing the problem in half and conquering only half of it. And then I realized, oh, if Mike, I know, is in the right-hand side of the phone book, let me do this again, and split the problem again and again. And our takeaway was that if that phone book had, and it pretty much did, 1,000 pages, or 1,024 pages, how many times would I actually have to split that problem in half to find someone like Mike Smith? So just ten, right? If you have 1,024 pages, and you just do the math in your head or on a piece of paper, you can divide that number in half ten times. Now there might be some interesting rounding issues at the end when you get to the last page, but ten times, instead of 1,024 pages. Suppose that phone book, as we said in Week 0, were not 1,000 pages but 4 billion pages. That's a hell of a phone book, with a lot of names in it. But how many pages would I actually have to look at? How many times would I have to chop a 4 billion page phone book in half? 32 times. Why? Well, you can take 4 billion and divide it in half roughly 32 times.
And that speaks to the power of actually designing algorithms intelligently, that speaks to the power of what you can do with this basic device and just a modicum of intelligence inserted into your algorithm. Right, if I told you to search a phone book right now, you could certainly reason through how you might implement divide and conquer, but by far the simplest way to implement search for a phone book using the syntax and the capabilities of C you have now is what, like, write a for loop, I gets zero, I less than or equal to 4 billion, and then just check from I on up to 4 billion whether or not someone is in the current variable or in the current array or however you actually code it up. You could implement that with just a few minutes' time. But that thing's going to take a while to run. By contrast, put a little more thought in up front and the algorithm itself will execute far more quickly. So we're not going to stand up and indulge in this again, but this is an instance of the same idea. The first day you all stood up, you paired off with someone, and then half of you sat down, then half of you sat down, then half of you sat down. Now, granted, that turned into a bit of a disaster, as is often the case. It took way longer than for me to just go 2, 4, 6, 8, 10, 12. But in theory, had there not been the social awkwardness, had there not been the problems with arithmetic, you would have been able to count the people in this room so much more quickly than I, because whereas I'm sort of going like this, 1, 2, 3, so with every CPU cycle I'm doing one thing, you, by contrast, with every tick of the clock, with every CPU cycle, half of you were sitting down. You were taking half -- 50% -- of the problem out of the room at once, where I was just taking one Nth of it out of the way. So this too has implications. So this is just a fancy way -- I whipped this up with Excel -- to depict the following idea.
So on the X axis here is this number N, which generally is going to refer to the size of the problem. How many pages are in the phone book? Let's call it N. How many students are in Sanders? Let's call it N. So I have from zero on up to 130. And then on the Y axis I have what I call T of N, the running time of an algorithm whose input is of size N. So it's just a general way of describing the running time. How much time does it take for an algorithm to run, and what do we mean by time? Like steps or seconds, minutes, whatever, so long as we all agree what the metric is. And these different lines represent different running times. So at the top left, we have a little legend there. It's kind of small to see, but we have three graphs here. One is this one up top. So on the left-hand side we have a very steep curve, but this is a line, this is a linear graph. And this represents an algorithm that given N inputs takes N steps to run. And that is representative of the running time of 1, 2, 3, this linear algorithm of counting students. But we realized in Week 0 we can do better, right? We can actually speed that up. So the second graph there, the second line that's a little bit to the right of the first one, that would be the result of my doing 2, 4, 6, 8. It's actually twice as fast. Now contrast this with your line, so log of N, as we described your algorithm that day, as we described this general principle of divide and conquer. What this means is your algorithm should take way less time. In other words, the more students are in the room, look at the contrast between how many steps or how many seconds, how many units of time your algorithm should take versus mine, which is literally off the charts. Now what does this mean? If one more student comes into this room, that takes me one more step. If one more student comes into this room, that's kind of rounding error for you, because you can essentially absorb that one additional student.
So let's actually take a bigger -- let's increase the problem more significantly. Suppose we have roughly two or three hundred students in this room. Suppose another 200 or 300 students come into this room within the next few seconds. Well, that's going to take me literally twice as long. If it took me N steps before, it's going to take me 2 N steps after doubling the size of the room. Now by contrast, if 200 or 300 more students enter this room in the next few seconds, how many more steps or units of time does it take you to count those additional students? Just one, right? Because within the first iteration of your algorithm, those same kids will sit down right away. And that's a huge bite out of the problem, whereas mine has grown significantly. And that's the fundamental contrast between something that's linear, like in my case, and something that's logarithmic, or at least mathematically more interesting. And not to overwhelm with numbers, but just to put concrete values on what the implications here are, this is just a big chart that computes some very basic functions. So in the left-hand -- the third column here, right above where I'm standing, this is N, so we arbitrarily have 1, 2, 4, 8, 16, a whole bunch of values of N. So my algorithm, 1, 2, 3, 4 -- if we have, let's say, 1 million students in this room, or 1,048,576, that's obviously going to take me 1,048,576 steps. But by contrast, if you scroll back to the left, and you can do this, you know, with a calculator or even in your head perhaps, that same student body of a million students in Sanders could be counted by you all in just 20 steps. And so the takeaway, visually, of a graph like this, and we'll revisit this or you can revisit this as we introduce yet more algorithms, is exactly how significant the implications are of well-designed algorithms. And we can zoom out even further. If you look at various graphs like these three here, the specifics are not so interesting today, but notice this last one here.
So it's small to see, but that big curve that's going to the north there is called 2 to the N, 2 to the N being an exponential function. So we'll talk about it at times, and in future classes, if you take other theory classes in CS, even just for fun or if you're minoring in CS, you'll find that exponential algorithms are very bad, because as N increases they take a ridiculous amount of time. But back to our motivation today of cryptography: when we say it's really hard to factor large numbers quickly, what we mean, essentially, is that factoring large numbers generally boils down to something that takes exponential time, not linear, not quadratic, not polynomial, if you recall the terms, but exponential. So graphs like that, very bad. Takes a good amount of time to run. So how can we start to describe the running times of algorithms? Right? It's easy enough to pick a few examples, talk about which one's better, why it's better, and how many steps each takes. But can we generalize? Can you start to look at your own code or your own programs and say this algorithm is good because it takes this amount of time, or this implementation I wrote last week sucks because it takes this much time in general? Can we slap some labels or some formalities on, just so we can start comparing algorithms against one another, even in different languages or written by different people? Well, there are three basic constructs, three little pieces of jargon we'll start using. One is called big O, and that's the guy at the top. The middle guy is called theta. And the bottom is omega. And these symbols are just generally used by mathematicians and computer scientists to express the running times of algorithms. Now what does this mean? So formally, it means this. And these slides are online, if you like the mathematical formalities, because there are actually some juicy ideas in there, but for today, they reduce to some relatively simple ideas.
If you have an algorithm that in the worst case takes N steps, you say that that algorithm is in big O of N. So this is the worst case running time of the linear algorithm of counting students. And as an aside, happy birthday to Jansen and Mike. If, though, your algorithm in the best case might actually take fewer steps, well, then you start talking about omega notation. Now in the case of counting students, this really doesn't help us, because in the best case, how many steps is it going to take to count N students? Well, N. Right, I kind of have to count everyone. In the best case? Well, if I want to count all of the students, it's still going to take N steps. But let's pick another algorithm arbitrarily. Suppose students file into the room, and I want to search for a student named Mike, just because. So in the worst case, how many steps will it take me to find a student named Mike as they file in the door? All right, so N steps, because in the worst case, the Mike or the several Mikes we have in the course this semester just so happen to be the last people coming into the room, just by chance. But in the best case, where might Mike be in the line? First, right? Common sense. So in the best case we say that algorithm of finding Mike, searching for Mike, is in omega of 1. In the best case, it just takes one step, bam, I ask the question, I get the answer I'm looking for right away. So when you have an algorithm that's both in big O of N and in omega of N, so not the searching but rather the counting, then you can say that that algorithm is described by theta of N. So in short, we have worst case running time up top, we have best case running time at bottom, and if those two just so happen to be the same, then you can say that the algorithm is in theta of whatever formula you've come up with. All right, so what does this mean exactly?
So in the case of the phone book, when I tore the phone book again and again and again, what was the running time of finding Mike Smith, or really finding anyone, for that matter, in a phone book that happens to have N pages? In the worst case, searching a phone book takes how many steps? Okay, so N steps, if my algorithm is to just start at the first page and go from left to right all the way to the end of the phone book. So that might be one implementation, and this we'll call, say, linear search of a phone book. But we can do better. In other words, if I want to compare one algorithm, the linear algorithm that we started class with in Week 0, versus the more intelligent one of divide and conquer, divide and conquer's worst case running time is instead what? Log N, right? Because we've said multiple times now that in the worst case we're going to have to keep chopping the phone book in half and half and half, until we get to one page where Mike Smith is finally listed. How many times can we do that chopping? Log N times, so this is something that we'll call logarithmic. All right, what about the best case running time of the phone book searching algorithm? Yeah, right? Because in the best case -- and again, when you talk about an algorithm, you talk about a general input. We don't talk about an algorithm for finding Mike Smith, we talk about an algorithm for finding someone. Well, the best case running time of finding someone in a phone book, whether you use the linear algorithm or the logarithmic one, ultimately boils down to just constant time, right? Because you might be looking for Abigail Adams, who just so happens to be on the first page of the phone book. Bam, you're done, you don't even have to divide the phone book in half multiple times. And we'll call the running time of that algorithm constant, whether it takes one step or maybe two steps, because there she is on the page, and now I have to find what row she's in on the page.
But those are just a constant number of steps, so we're going to generalize that as just constant time. So what's the point of all this? Well, we now have this means of comparing not apples and oranges, per se, but apples and apples. If we have two different algorithms, linear search and then divide and conquer for the phone book, we can now say something, qualitatively, about which one is better in terms of its running time. And the notation folks tend to use for exactly that is this notation here. So just to toss out some jargon in case you see it anywhere: constant time we discussed, logarithmic time we discussed, linear time we discussed. We'll see this running time often, N log N. Frankly, in 12, 15 years of computer science I've never heard one human say supralinear, but that is in fact the formal term. N squared we'll see today and/or Wednesday. That's called quadratic. More generally, if you have N raised to some power, that's a polynomial. And then there's N factorial, which is even worse than any of those, but you tend not to see that unless your algorithm is, frankly, downright stupid. So, a couple of announcements. There's one handout today, very little code, but nonetheless, it's out there. Lunch seems to have gone well, though we were kind of overwhelmed by responses. We ended up having a barbecue for 60 of your classmates. So we'll try to do another such thing some future Friday, more on that in a future class. If you've not yet returned sensor boards from the hacker edition of P set 0, that's fine, just hand them to one of the TFs in our little nook over here. P set 2 went out the door on Friday. The walkthrough was filmed yesterday, on Sunday, and that video will go online by tomorrow. Office hours have been in progress, do take advantage of those, and sections have already begun. So your assigned section, you should have been informed of by e-mail.
Either you went last night or your section is probably today or tomorrow, so do take advantage of those as well. And we also deployed, per Problem Set 2, this link on Friday. So there is now the bulletin board, which if clicked will log you into a little interface that looks like this. And the bulletin board is meant to serve a couple of purposes. One, it's meant to allow us to answer the same question once, so that many different people can benefit from the answer. It's meant to be an opportunity for you to potentially field each other's questions, so long as it doesn't cross that line of outright telling someone some answer. And what you'll find is that week to week we'll add another category to the bulletin board for P set 2, P set 3, P set 4 and so forth. The goal of the bulletin board is to ask questions that you're pretty sure anyone can hear the answer to, anyone can see the specifics of, but if you find that asking your question really requires that you show us your code, then continue to use help at CS 50 for that. But what you'll find, more importantly, especially for those less comfortable, is that when you log in for the first time, although it does authenticate you with user name and password, we have made your default name just "student." So you can have that pseudonym, or you can change it to something other than that, your own name or something fun. But realize the point of the bulletin board's anonymity is so that you can post what you fear is the proverbial dumb question, and no one else, other than the staff as necessary, is going to know that you even asked that question. So if that gives you more comfort, by all means, ask your questions in that way. That was a lot to absorb, so let's take our five minute break.
[ Background noise ]
>> All right, we are back. So we do the following with Keynote. We have eight doors here, behind which are little surprises. To reveal what's behind these doors, though, I need to solicit the ever-awkward volunteer to come up on stage, if you will. Yeah? Come on down. Again, you'll need to sign your rights away in perpetuity in a moment. Come on up. What is your name?
>> Edward, okay. David. Nice to meet you. Got a fan over there. All right, so Edward, much like a game show, we have the same doors -- oops -- [Inaudible] will spoil the surprise. So we have eight doors behind these pieces of paper here. So we have two arrays. So I'm going to start using the geek terminology. So we have two arrays, arrays of integers. Each is of size eight. And you are now just a human who's been presented with a very awkward situation, the goal of which in this top array is to find me the number 3. And what we the audience hope to learn from this is exactly how you the human go about solving this problem. So if you would, on the top row only, find us the number 3, please. We see the number 2. Oh, that was very good. We see the number 3. Okay, so if you would, can you explain how you found the number 3?
>> Just went up, [Inaudible] pieces of paper at random.
>> Okay, so that was just chance, then, it was not spoiled by the wind.
>> Right. Systematically go through and check all the paper.
>> Okay, good. So thank you, first, for spoiling my demo by finding it so quickly. So -- but let's ask the question nonetheless. What is the running time of the algorithm that Edward just tried? Can we slap a label on this? Best case running time: it's apparently two, but one in the truly best case, so this is in omega of 1, we'll say. In the best case, he just happens to go straight for the right number, just by chance or by some more sophisticated heuristic. And in the worst case, how long would it have taken you to find the number 3? Okay, 8 or more, generally big O of N. So again, we have an algorithm that's big O of N, or omega of 1. So the question now at hand is can we do better than that. So let's see. How could we fundamentally improve upon that algorithm, any suggestions for Edward? Could we do better than big O of N? Because big O of N is great when it's only eight, but if N is now 4 billion, as we've discussed, N is not so good anymore. Sorry?
[ Inaudible audience comment ]
>> Okay, so we could sort these things, right? There is an advantage that we had all this time since Week Zero of a phone book. Someone else, Verizon or whomever, has done the hard work of sorting that phone book alphabetically so that we can make certain assumptions. An assumption like when I go down the middle and end up at the Ms or the Ns I know the Smiths, the Ss, are going to be to the right. Now I didn't give Edward any such assumptions, and in fact, it's a little weird that 2 is over here, and yet 3 is over here, kind of suggests that maybe these things are not so sorted. And in fact, there's 8, there's 1. So the top row is not in fact sorted. So if we now give you the added assumption that the bottom array of 8 numbers is in fact sorted, how might you go about, without answering, just doing, finding the number 50 in the bottom array. Different numbers, but they are now sorted. I see 171. Dammit! Okay. Okay, very good. Okay. So what did you do this time?
>> Well, since you told me it was sorted, I assumed it meant it was either -- there was going to be some sort of pattern. So either the larger numbers were on the right or on the left. So I checked both sides.
>> Okay, good. So as an aside, this demo was far more effective last year when we spent like 10 minutes on this. It was somewhat awkward, actually, because your predecessor spent a lot of time thinking about which number to reveal, thinking there was perhaps some trick to this, when really, five minutes before class I had written random numbers on the board. So you're doing much better. He was a good sport about it. Okay, so the bottom array is now sorted. Let's slap some numbers on this. Best case is apparently two again, in reality, but formally or asymptotically -- any time we start talking about big O and omega, we're talking asymptotically, which means as inputs get large, as N gets large, what generally happens to running time. So in the worst case, how long would your algorithm have taken? Okay, N, because why, what was your algorithm here? You said I went high, I went low, but we didn't really get to see what you would have done next. In fact, let's continue this. Find me the number 52. I see 51. I see 61. Good. Back on top. Okay, so pause there, because we know you're going to struggle now. So what was your algorithm? Now we have to push a little harder.
>> So I was assuming they were sorted numerically. So that --
>> So when you said 52, since I knew 50 was already down there, I just went to the same row.
>> Okay, good. So in short, I can summarize. So at first you went high because 50 is a pretty big number, certainly relative to the number 3 we looked for before. So you went high, didn't find it, then sort of thought maybe I was messing with you, so went left to look for the number 50 there. Didn't find it, but you did know it was sorted. So the rest of your algorithm after this guesswork at the beginning, try high, if not high, try low, was presumably to iterate then from left to right. Okay, so we can still slap some formality on this approach too. In the worst case, how many steps would that take, as you said? N steps, right? Because maybe you went high here. Okay, that wasn't 50. Then you went low, not here still, then you end up iterating over the N minus 1 remaining elements, okay, big O of N. But again, best case running time is omega of -- in your case there, is just 1. So in constant time. So this sort of begs the question then, how long did it take me, the person setting this up, to actually sort these values. So let's put you out of your awkward [Inaudible] round of applause if we could.
[ Applause ]
>> Legal form awaits. Okay, so that begs the question, then, how do we actually go about sorting these values. Because I kind of did it for free. Right? I did it before class, kind of cheated by having the numbers pre-baked. But if you're just handed a bunch of numbers as Edward effectively was, and I left it to him to sort those values so that he could then impress us with the running time of his better algorithm, how much time does it take to sort these numbers? Well that, we'll see, is a much harder question to answer and a much more time-consuming problem to solve. But let's see if we can express now what Edward kind of did, or in an ideal world would have done, exactly. We can express this algorithm, or really any algorithm moving forward in the course, not just in C code, which gets fairly pedantic, writing out all the minutiae; we just want to convey the idea of an algorithm. Just like we did in Week Zero and One, we can go with pseudo code. There is no formal standard for what pseudo code is; different people take different approaches. The only important thing is to communicate your idea clearly without getting bogged down in stupid details like semicolons and parentheses, which add nothing to the logic and only distract from your idea. So I took the somewhat conventional but somewhat arbitrary approach of defining my pseudo code as follows. For linear search, as I called it, and this is what the world generally calls an algorithm where you start from here to here to here to here, and move from left to right: so on input N, for N elements, for each element I, if I equals equals N, return true. So on input N, for each element -- yup. So this is -- there are many different ways you can implement this pseudo code or write it, but it implies that there's an array of some sort; syntactically, it doesn't matter how we express this.
So I'm going to take the approach of saying for each element I in my array, if I happens to equal N, the thing I'm looking for, go ahead and return true. But then per the indentation, if I never return true, the last line in my program says return false. And the answer is no, as was the case when Edward failed to find 52 because it just wasn't in that second array. But we can do what I was hoping he could do -- but it is okay, I can do it myself -- we can implement what we might call binary search, which is actually a direct application of the whole divide and conquer idea. Now this algorithm looks at first glance certainly more complicated than linear search, but we've already seen how much time it saves on things like searching phone books and whatnot. So in theory, if I were searching for the number 50 here and I didn't have this heuristic of going high or going low, because maybe that would work, but even Edward had to make a guess, is 50 a big or is it a small value. Well in this case, it was actually a relatively small value, because we had 50 here, but then 171 here. And in the end, although it was a clever trick, and he actually solved that problem very quickly to his credit, the algorithm itself ultimately boiled down to no better than just linear search. Because in the end these clever heuristics, which in the real world clearly saved us some time, and garnered him some applause -- the rest of the algorithm, had he not been so lucky, is still the same linear algorithm. So what if I instead apply the Mike Smith approach. I go into the middle of my phone book, the middle of my array. And now I'm looking for the number 50. Well, we have some weird mathematical rounding errors that I'm sure we can work out with the rounding function or ceiling or floor or whatever, so I'm going to arbitrarily go here. Is this the number 50? No, it's the number 105.
But just like in Week Zero when I was able to tear away the problem of the phone book, I now know that my answer is definitely not going to be to the right. So I can focus only on the remaining numbers here. Where do I go next? Well, it's essentially the same problem. I'm still looking for the number 50, I've still got an array; the upside is it's half as big as it once was. I can use that to my advantage performance-wise, so let me just go smack-dab into the middle. I've got to pick left or right arbitrarily, because the math isn't working out beautifully, so let's pick this one. 51, not the case. But now I know it's this way. So now I know it's not here, 51 or 61. And I'm left with a problem that's a quarter of my original size. So already, fundamentally, I'm definitely moving faster than Edward would have, given many more elements in his array, which clearly is going to be the case when we have more room and more interesting problems. So I've got two elements left. I arbitrarily go right, as I've been doing. Aha, the number 50. Now, granted, and this is actually a useful takeaway, granted, my algorithm was slower than Edward's. Edward took two steps. I took three steps, and yet I'm going to make the claim, arrogantly, nonetheless, that my algorithm is better, because as N increases, asymptotically, the bigger and bigger N gets, the more and more you'll be impressed by the running time of my algorithm, and more and more Edward's algorithm, in all fairness, just devolves back into a very slow but correct implementation of the same idea of search. Now as an aside, for the astute, this just so happens to be all of the computer science courses you're qualified to take after or before CS 50. So keep that in mind come shopping period. But for now, that was very subliminal -- I know, maybe it's not subliminal if I tell you what they are -- so for now we just need a way of expressing this. So there's many different ways we can express this.
This is what I would call an iterative implementation of binary search. Thus far in the class we've only talked about phone books and splitting them in half, throwing half away, repeating. But now that you have this ability to program a machine, you kind of start -- you need to start expressing yourself more precisely. You can't just tell the computer find me the number 50, or 52, you need to tell it how to do that. You need to teach the machine, so to speak. So one way we might implement that in pseudo code is as follows. On the input of an array, where we call it array bracket zero on up to N minus 1 -- so the length of most any array we ever discuss now is N, but the largest element index is obviously N minus 1, since we start counting from zero -- and input K, where K is the value I'm looking for, I kind of need to implement this idea of binary search. So I'm going to define in italics here a variable called first, and initialize it to zero. I'm going to define another variable called last, and initialize it to N minus 1. And what the subsequent steps do, if you think it through -- and a future Problem Set may ask you to implement very precisely this approach -- is exactly that. Here's N elements, I'm going to use my left hand as my left variable, my right hand as my right variable, and what is it telling me to do on the first step? Well, while first is less than or equal to last. So good, my left hand is indeed less than my right hand, because it's to the left of it. What do I do? Let middle equal first, plus last, divided by 2. So what that math does for me is if I add this index and this index, divide by 2 and take care of the rounding, that gives me a number in the middle. Let's say 105 for now. So middle equals that element there. So if K is less than the middle element in the array, then let last equal middle minus 1. All right, so I'm looking for the number 50. So 50 is indeed less than the middle element of the array. So what do I do?
Well the then condition says then let last equal middle minus 1. Here's last, here's middle, here's middle minus 1. And already in one iteration we've seen how to translate a very simple idea, right? Split the problem in half, in half, into something much closer to what GCC could understand, much closer to what you could actually code up in nano. And if you follow the logic there, what you'll find is that iteratively [Inaudible] my right hand, keep moving closer to my left, sometimes my left goes closer to my right; if my hands ever cross, what's the implication, presumably? That it's not there. I'm looking for 52, and you know, I went too far to the right, too far to the left, my hands cross because it's just not there, or I end up smack dab on top of the number I'm looking for. Whether it's 50 or 52. But what we've sacrificed here is this beauty of the original algorithm. What's really cool about divide and conquer is it's so damn simple to explain. Take the phone book, split it in half, look at the elements. If it's less than, go left, if it's greater than, go right. Repeat. There's this elegance about this, where you just say divide the problem in half, check for something, and then repeat, repeat. This really clouds some of the simplicity of that algorithm. But we have this ability, it turns out, in C, and in Java, in C++, and all of these languages, particularly ones like Lisp and Scheme, which you could cover in CS 51, of this feature of recursion. So it turns out that whenever we've said something like repeat, you can repeat by essentially applying the same algorithm again. Now thus far in C, how do you implement an algorithm? Well, you write a function like cipher or Caesar that implements some algorithm. So if we ever have to express this idea of repeat, maybe we can have a function just invoke itself again on a smaller piece of the puzzle. Well, what do we mean by that. Well, let's take a look at this very simple program.
So this is sigma1.c, this is one of today's two printouts, two pieces of code that's printed out. So this is going to be a function that returns an int; it's called sigma, in homage to the little symbol from your math book that looks like this, and just implies summation. So I want to do this. I want to sum up all of the numbers from I equals 0 to M. So if I want to express this like the back of a high school math book, this is the sum, from I equals 0 on up to M, of I. So I just want to compute the sum of all the numbers from 0 up to M. Okay, so not an interesting problem, but it's interesting how we can solve it. So here's one little sanity check. When we said last week that you'd better get into the habit, for your own sake and your program's sake, of error checking, you've got to do sanity checks like this. Just because the function is going to be passed an int per the definition, per its signature, so to speak, per its declaration up here, that doesn't mean it's going to be an int you want. It might be negative. It might be zero. Neither of which we want if we're trying to sum up positive values only. So I'm going to avoid the risk of an infinite loop by just doing a sanity check. If M is less than 1, return zero. So here's an interesting feature of C and a lot of languages. If you have to somehow signify an error condition, some languages, as you may know from high school, like Java, have this thing called exceptions, where you can kind of raise a red flag and not bother returning a value at all. But languages like C require that a function return a value, generally. In this case an int. And sometimes you need to return what we called last week a special sentinel value, a special value that the person calling your function had better check for if they want to see if your function worked correctly. So it's not sufficient for a function like this to just print out warning, M is less than 1.
Because how is the code that called this function going to know that an error occurred, right? This is the point we were trying to make last week with side effects versus return values. A human can see side effects. A computer can only see return values. So if we're trying to write code that calls this function and I want my code to be able to check if an error happens, I can't rely on printf, I have to rely on return values. So what value could I return in the event that the input just doesn't make sense? Well, here I arbitrarily chose zero. What might have been another reasonable value to return to signify essentially error? Negative 1, why? Well just because. If you return a negative value, clearly the caller, the function that called this function, can infer by getting back a negative number when the whole point of the exercise was summation of positive numbers, that something went wrong. So think too, back to last week when we mentioned very briefly, because we'll come back to it, that get string sometimes can return a bogus value or not return a string per se. It can sometimes return a special sentinel called what? This was null, so we'll see this many more times, it hasn't really come up, but N-U-L-L in all caps is a constant that signifies, there is no string, but I'm a function with a return value, I have to return something. So when you start consulting man pages later in the course for new functions that you might want to employ, you'll need to check the section of the manual that says what return values does this function have, because sometimes you want to check with your own little if condition, did I get back a value I care about. Well, this code is easy. Now once I've done this sanity check and I want to sum up the numbers from 1 through M or equivalently 0 through M, I'm going to do this, I'm going to declare a variable called sum, it's going to be of type int, and I'm going to initialize it to zero.
I'm then going to iterate from I equals 1 all the way up to M -- equals, less than or equal to M -- and then I++, and then what does this code ultimately do? It just sums up all the numbers from 1 to M or 0 to M. All right, so to see this in action, let me go ahead to a terminal window here. I've got -- let's go ahead and open a new one, let me go ahead and SSH to nice.fas.harvard.edu, and if I go into my CS 50 directory, okay, Lecture 3, Week 3, okay, sigma one. So this is the same exact code. Notice -- oh, why do I have this at the top of this file? What's the significance of declaring sigma as a prototype up there? Yeah, so remember from last week, if you are calling a function that just so happens to be implemented later in the file but you're trying to call it from main, which apparently I am -- so this was not on the slide a moment ago, but apparently I am calling sigma from main -- it's a stupid detail, but you've got to proactively tell the compiler hey, there is a function in this file called sigma, it returns an int, it takes an int, just know to expect it eventually. And realize this too. It turns out that when you declare a prototype for a function, you don't have to provide the names of the variables; all the compiler needs to know at that point in time is what types of variables it has as arguments. So you can literally just write int, or if it takes two arguments, int, comma, whatever. So either approach is fine. I could have specified the name of the variable up here with int M, but it just isn't necessary. So I tend to be fairly minimalist when I write the code. All right, so here's the function down there. And notice one other detail, just to make this more compelling. Why did I not do this, because this was a question asked, I think on the bulletin board this weekend. Why did I not do -- say, this. Probably a couple problems here. What's one, yeah?
[ Inaudible audience comment ]
>> Yeah. So there's a logical error, declaring and reinitializing sum to zero every time. So that's not good. What if I did this, as I've seen before. I can sometimes initialize two variables to a value in my for loop. So that seems to be okay. But what kind of error am I going to run into next. Some of you may have -- yeah? Oh, this is -- don't stretch like this during class. Okay. Yeah. In the back.
[ Inaudible audience comment ]
>> Exactly. Remember last week's discussion of scope. So generally, when you declare a variable, say inside curly braces, or in this case, inside the for loop, that variable only lives inside of that for loop, which means this math is perfect. We initialize sum to zero, we keep adding I to sum, but the moment down here I try to return sum, my compiler is going to complain to me sum not declared, even though it is, because it's outside of the scope. So it was very deliberate a moment ago when I declared this variable outside of the loop so that it would remain in scope the entire time. All right, so let's go ahead and save this. Let's make sigma 1. I'm not going to bother typing GCC and all that, because this handles a lot of the magic for me now. I'm going to go ahead and dot slash sigma 1, to be ever so clear, for security reasons, that it's in this current directory. Enter, positive integer please, let's say 10. Hopefully, that's correct. 9. All right. Let's do one I can check quickly. 3. Yes, I'm pretty sure that 3 plus 2 plus 1 equals in fact 6. So feels like this code is correct. But it turns out we can implement this in a very different way. How does this function get called in the first place? Well, here is an example of where do while is useful. We said last week that do while is often useful when you want to ask the user for something, give them a chance to give it to you, and then reprimand them or ask them again if they fail to provide what you're looking for. So here's exactly that scenario. Declare an int called N. I'm going to do the following. Tell the user positive integer, please. I'm going to then use get int from the CS 50 library to get the int. And then right here, rather elegantly, am I going to check within my while construct, if N is at this point in time less than 1, do it again, and do it again.
So this is a general approach, do while, to getting user input, and the result, to be clear, is that if I try messing with the computer and say negative 123 or negative 6, it's just going to keep pestering me. But remember, my random word of last week, monkey, why does it say retry instead of positive integer here? So this is [Inaudible] to get int. So as the name get int implies, get int will do its best to get an int from the user, but it doesn't guarantee it's going to be positive or negative or anything like that. It just gets you an int. So it turns out we can implement this same function in a really interesting way that hints at the beauty of this thing called recursion. A function is recursive if it calls itself. Now on first glance, or first listen, that might be a little worrisome. If I'm a function and I got called with an input, and I try to solve that problem by calling myself, that should immediately suggest infinite loop, bad stuff happens. But hopefully there's some way of making sure that you don't keep calling yourself in perpetuity, and there is. It's called the base case. So notice this implementation of sigma is only four lines of code, and frankly, if I got rid of some of the indentation, all that, it's maybe like a two line program, maybe a one line program, that implements exactly the same function, exactly the same formula of adding numbers together. So what do I do? On input M I ask myself is M less than or equal to 0. If so, just return 0, because bad things will happen if I start trying to deal with negative numbers. Now apparently all I need to do to answer this question is call myself. And this is where you get this weird sort of circular magic. Notice that all I do otherwise, after my base case, after my sanity check, is return the result of M, the value I was passed, plus sigma of M minus 1. Now that alone actually works. If I go back to my terminal and I compile sigma 2, and I run sigma 2, and I give it the number 3, I get back 6.
If I give it 10, I get 55. In fact, it works, and yet how is it doing that? Well, let's see. So it turns out that this problem of adding all the numbers from 0 on up to M, or equivalently M all the way down to 0, is pretty easy, even conceptually. If you hand me the number M and ask me what's the sum from M down to 0, well, I could say, all right, well, it's M plus the sum of all the numbers from M minus 1 on down to 0. All right, so let me ask you again. What is the sum of M minus 1 on down to 0. Well, it's M minus 1 plus the sum of M minus 2 all the way down to 0. Right? You can be this really argumentative human pushing back on the person in wise-ass form, oh, well, it's just the answer of M plus the rest of it. Well, what's the rest of it. Well, it's a little bit less than the rest of it, plus the rest of that. Right, you just keep taking this bite, this bite, minus 1 out of each piece of the puzzle. And that would seem to go on forever, but not quite. Because eventually when I get over here, my last piece of the puzzle is going to be M equals what? Zero. And that's when I hit the so-called base case, when all of my answers now start to bubble up. In a sense, I was kind of passing the buck by playing this game with the person asking the question, well, what's M plus all the numbers below it? Well, it's M plus the sum of all the numbers below it. What's the sum of that, right. Eventually, we bottom out at zero, and then I finally start getting useful answers back, until the end result is precisely the same sum. But what's been the point of all of this? Well, and again, this is something that you perhaps begin to see eventually, and maybe you never see, but there's kind of a beauty about expressing something as silly as addition of numbers with just this recursive scenario. And this will be a very representative approach to solving much bigger, much more interesting problems. But there is a catch. Let me go ahead and run sigma 1, my first version, on 100.
That works pretty well. How about 1,000. Works pretty well. How about 10,000. Still working. 100,000. All right, then eventually I'm going to start to exhaust an int. Okay, so now I exhausted the int, but we would expect that. Let me try now to run sigma 2. So it's a more elegant solution, feels like it should be even faster because there's just frankly fewer lines of code, well, let's see what happens. Sigma 2 -- oops, 10. That works, how about 100. That works. How about 1,000. Still working. 10,000. All right. 100,000. Okay, come on, break. Oh, now some of you have seen this problem too. And we actually smiled. So in all fairness to the anonymized student asking the question on the bulletin board, is it okay if my program quits by saying segmentation fault, core dumped? No, so this is a bad thing. Bless your heart for asking, but this is not a solution to the problem. This is, as I said in the bulletin board post -- or actually, maybe this was a private e-mail -- as I said in my response, this is sort of the Linux equivalent of Microsoft Word just hanging, or your Mac program just freezing or unexpectedly quitting, which is ever so annoying. This is sort of the equivalent in the Linux world. But fortunately, now that we're the programmer and we're not just the user, the fact that it's dumping core, so to speak, is actually useful. So for today's purposes, segmentation fault means something bad happened with respect to memory. Because memory in your computer is generally modeled as a bunch of segments, so a segmentation fault means something went awry with regard to memory. Core dumped means that something useful was left in my current directory. If I type ls for list, notice that I have this file called core. Core is actually a file containing pretty much the contents of the computer's RAM at the moment in time when my computer screwed up.
Now it's not all of the memory from all users on the system, because that would be a security concern or privacy concern, but it's diagnostically useful, we'll see in a week or so once we start dissecting the insides of that file. But for now the question is quite simply why did this happen -- why did it not appear capable of handling the same number of zeros as the first version. Now granted, the first version, still broken, but it doesn't ultimately choke in the same horrific way. So what is happening in sigma 2 that's not apparently happening in sigma 1. And again, some of you have tripped over this, and frankly, by the end of the semester you will all have core dumped. Why? What might be going wrong here? So it has to do with memory. Where is memory getting used in this program. I don't even see any local variables other than the parameter called M for this function. So where's the memory getting used. Like, why do I have memory problems? Yeah, so when I call M -- when I call sigma, it does what?
[ Inaudible audience comment ]
>> So it makes -- so it allocates a bit more memory. In fact, let me remind you of this little picture from last week, and we said we'd come back to this. This recalls the rectangle that represents your computer's RAM. But recall the more important point from last week that any time you call a function, a little chunk of memory does get allocated on the so-called stack for that function. And that chunk of memory we call a frame. So if sigma is called by main, and then sigma calls who in version 2? Sigma. And then sigma calls sigma, and sigma calls sigma -- you know, even though on this picture I've not implied that there is an upper bound on the amount of memory you can allocate, in reality there is. On a typical system, a given program often cannot use any more than say 2 gigabytes of memory. Now that's a lot of memory, but not necessarily for programs like Photoshop, and so there actually are some interesting real world implications here. But the point is that rectangle eventually has a top to it. And if you keep trying to put more and more stack frames on your memory stack, eventually you're going to overlap with something else that needs it more. In fact, what did we say is at the very top of this rectangle, if briefly, last week? Sorry?
[ Inaudible audience comment ]
>> So there's a bunch of -- there's global variables, but there's also the text segment, as we called it, the program itself. Your zeros and ones of your program itself. Now this doesn't mean that you're overwriting your program on disk, it doesn't mean you're erasing your code. But it does mean that the operating system is just going to bail altogether if you try using memory that already belongs to something else. So let's go ahead and end there, and on Wednesday we'll figure out how to work around this.
[ Background noise ]
==== Transcribed by Automatic Sync Technologies ====