1
00:00:00,000 --> 00:00:05,760
2
00:00:05,760 --> 00:00:08,900
>> DOUG LLOYD: So in CS50 we learned about
a variety of sorting and searching
3
00:00:08,900 --> 00:00:09,442
algorithms.
4
00:00:09,442 --> 00:00:11,400
And sometimes it can be
a little tricky to keep
5
00:00:11,400 --> 00:00:14,161
track of what algorithm does what.
6
00:00:14,161 --> 00:00:15,910
We've really only
skimmed the surface too,
7
00:00:15,910 --> 00:00:18,740
there are many other searching
and sorting algorithms.
8
00:00:18,740 --> 00:00:21,780
So in this video let's
just take a few minutes
9
00:00:21,780 --> 00:00:24,980
to try and distill each algorithm
down to its core elements
10
00:00:24,980 --> 00:00:27,810
so you can remember the most
important information about them
11
00:00:27,810 --> 00:00:31,970
and be able to articulate the
differences, if necessary.
12
00:00:31,970 --> 00:00:34,220
>> The first is selection sort.
13
00:00:34,220 --> 00:00:38,210
The basic idea behind selection sort
is find the smallest unsorted element
14
00:00:38,210 --> 00:00:42,890
in an array and swap it with the
first unsorted element of that array.
15
00:00:42,890 --> 00:00:46,620
We said that the worst-case
run time of that was n squared.
16
00:00:46,620 --> 00:00:50,060
And if you want to recall why, take
a look at the selection sort video.
17
00:00:50,060 --> 00:00:54,560
The best-case run time
is also n squared.
18
00:00:54,560 --> 00:00:58,910
>> Bubble sort, the idea behind that
one is to swap adjacent pairs.
19
00:00:58,910 --> 00:01:01,730
So that's the key that helps you
remember the difference here.
20
00:01:01,730 --> 00:01:04,920
Selection sort is, so far,
find the smallest-- bubble
21
00:01:04,920 --> 00:01:06,790
sort is swap adjacent pairs.
22
00:01:06,790 --> 00:01:08,710
We swap adjacent pairs
of elements if they
23
00:01:08,710 --> 00:01:12,530
are out of order, which effectively
bubbles larger elements to the right,
24
00:01:12,530 --> 00:01:17,060
and at the same time it also begins
to move smaller elements to the left.
25
00:01:17,060 --> 00:01:20,180
The worst-case case run time
of bubble sort is n squared.
26
00:01:20,180 --> 00:01:23,476
The best-case run time
of bubble sort is n.
27
00:01:23,476 --> 00:01:25,350
Because in that situation
we don't actually--
28
00:01:25,350 --> 00:01:27,141
we might not need to
make any swaps at all.
29
00:01:27,141 --> 00:01:31,026
We only have to make one
pass across all n elements.
30
00:01:31,026 --> 00:01:34,600
>> In insertion sort, the
basic idea here is shifting.
31
00:01:34,600 --> 00:01:36,630
That's the keyword for insertion sort.
32
00:01:36,630 --> 00:01:39,630
We're going to step once through
the array from left to right.
33
00:01:39,630 --> 00:01:41,670
And as we do, we're
going to shift elements
34
00:01:41,670 --> 00:01:46,260
we've already seen to make room for
smaller ones that need to fit somewhere
35
00:01:46,260 --> 00:01:48,080
back in that sorted portion.
36
00:01:48,080 --> 00:01:51,600
So we build the sorted array one
element at a time, left to right,
37
00:01:51,600 --> 00:01:54,740
and we shift things to make room.
38
00:01:54,740 --> 00:01:58,650
The worst-case run time of
insertion sort is n squared.
39
00:01:58,650 --> 00:02:02,380
The best-case run time is n.
40
00:02:02,380 --> 00:02:05,380
>> Merge sort-- the keyword
here is split and merge.
41
00:02:05,380 --> 00:02:08,949
We split the full array, whether
it's six elements, eight elements,
42
00:02:08,949 --> 00:02:13,790
10,000 elements-- we split it
down by half, by half, by half,
43
00:02:13,790 --> 00:02:17,720
until we have under array
of n one element arrays.
44
00:02:17,720 --> 00:02:19,470
A set of n one element arrays.
45
00:02:19,470 --> 00:02:22,640
So we started with one
1,000-element array,
46
00:02:22,640 --> 00:02:26,550
and we get to the point where we
have 1,000 one-element arrays.
47
00:02:26,550 --> 00:02:30,990
Then we begin to merge those sub arrays
back together in the correct order.
48
00:02:30,990 --> 00:02:34,860
So we take two one-element arrays
and create a two-element array.
49
00:02:34,860 --> 00:02:38,180
We take two two-element arrays
and create a four-element array
50
00:02:38,180 --> 00:02:43,900
and so on and so on until we've
again rebuilt one n element array.
51
00:02:43,900 --> 00:02:48,410
>> The worst-case run time of
merge sort is n times log n.
52
00:02:48,410 --> 00:02:52,390
We have n elements, but
this recombining process
53
00:02:52,390 --> 00:02:56,960
takes log n steps to get
back to a full array.
54
00:02:56,960 --> 00:03:01,160
The best case run time is also n log
n because this process doesn't really
55
00:03:01,160 --> 00:03:04,090
care whether the array was
sorted or not to start with.
56
00:03:04,090 --> 00:03:07,590
The process is the same, there's
no way to short circuit things.
57
00:03:07,590 --> 00:03:11,610
So n log n in the worst case,
n log n in the best case.
58
00:03:11,610 --> 00:03:13,960
>> We talked about two
searching algorithms.
59
00:03:13,960 --> 00:03:16,230
So linear search is about iterating.
60
00:03:16,230 --> 00:03:19,500
We proceed across the array
once, from left to right,
61
00:03:19,500 --> 00:03:21,950
trying to find the number
that we're looking for.
62
00:03:21,950 --> 00:03:24,550
The worst-case run time is big O of n.
63
00:03:24,550 --> 00:03:27,610
It might take us iterating
across every single element
64
00:03:27,610 --> 00:03:30,660
to find the element we're looking
for, either in the last position,
65
00:03:30,660 --> 00:03:31,630
or not at all.
66
00:03:31,630 --> 00:03:34,720
But we can't confirm that until
we've looked at everything.
67
00:03:34,720 --> 00:03:36,730
m the best-case, we find immediately.
68
00:03:36,730 --> 00:03:41,060
The best-case run time of
linear search is omega of 1.
69
00:03:41,060 --> 00:03:43,689
>> Lastly, there was binary search,
which requires assorted array.
70
00:03:43,689 --> 00:03:45,605
Remember that's a very
important consideration
71
00:03:45,605 --> 00:03:47,155
when working with binary search.
72
00:03:47,155 --> 00:03:50,180
It's a prerequisite to using it--
the array that you're looking through
73
00:03:50,180 --> 00:03:52,160
must be sorted.
74
00:03:52,160 --> 00:03:54,500
Otherwise, the keyword
is divide and conquer.
75
00:03:54,500 --> 00:03:58,310
Split the array into half and
eliminate half of the elements
76
00:03:58,310 --> 00:04:00,780
every time as you proceed through.
77
00:04:00,780 --> 00:04:04,330
Because of this divide and conquer
and splitting things in half approach,
78
00:04:04,330 --> 00:04:07,450
the worst-case run time
of binary search is
79
00:04:07,450 --> 00:04:11,730
log n, which is substantially
better than linear search's n.
80
00:04:11,730 --> 00:04:13,570
The best-case run time is still one.
81
00:04:13,570 --> 00:04:17,010
>> We might find it immediately the
first time we make a division, but,
82
00:04:17,010 --> 00:04:19,339
again, remember that
although binary search is
83
00:04:19,339 --> 00:04:24,030
substantially better than linear search
by virtue of being log n versus n,
84
00:04:24,030 --> 00:04:27,110
you might have to go through the work
of sorting your array first, which
85
00:04:27,110 --> 00:04:32,010
might make it less effective depending
on the size of the iterating sorted.
86
00:04:32,010 --> 00:04:35,250
>> I'm Doug Lloyd, this is CS50.
87
00:04:35,250 --> 00:04:36,757