DataAnalysis--Segment01.mp4

Algebra Nspirations

Data Analysis and Statistics

A recent study by the Environmental Protection

Agency found that Americans waste an estimated 27% of the

food available for consumption each year.

And Teen Health and the Media posted on their website that

the average American teen spends about 20 hours a week

watching television.

That's equivalent to 43 24-hour days, or 12% of an

entire year.
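
The arithmetic behind that figure can be checked with a short Python sketch. The 52-week year and the 365-day year are assumptions made here, not numbers given in the lesson.

```python
# Check the TV-watching arithmetic quoted above.
HOURS_PER_WEEK = 20     # figure quoted in the lesson
WEEKS_PER_YEAR = 52     # assumption: a 52-week year
DAYS_PER_YEAR = 365     # assumption: a non-leap year

hours_per_year = HOURS_PER_WEEK * WEEKS_PER_YEAR   # 1040 hours
days_equivalent = hours_per_year / 24              # a little over 43 full days
share_of_year = days_equivalent / DAYS_PER_YEAR    # a little under 12%

print(f"{days_equivalent:.1f} days, {share_of_year:.1%} of a year")
# prints: 43.3 days, 11.9% of a year
```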

According to the National Lightning Safety Institute,

the odds of being struck by lightning are

about 1 in 300 million.

And according to the National Highway Safety Administration,

the probability of a teenage driver in the US having an

accident in any 6-month period is 30%.

The first two facts are statistics.

The latter two are probabilities.

We're confronted daily with quantitative information of

this kind, be it in the form of words, numbers,

percentages, graphs, charts, or tables.

And we make our decisions based on these data

about our health, our education, our vacation

activities, our political choices, our

environment, and much more.

In Introductory Algebra, data analysis is almost synonymous

with statistics.

In this lesson, we'll take a look at some of the basic

notions of probability and statistics, and investigate

some real-world situations.

Probability and statistics are related but different.

I'll begin with probability.

The field arose from the dice tables of France when the

Chevalier de Méré brought his question about a gambling game

to the attention of Blaise Pascal and Pierre de Fermat.

In their ensuing written correspondence, these famous

French mathematicians laid the foundation for the theory of

probability.

The word "probability" means likelihood, odds, or chance,

as in "the chance of a thunderstorm tonight is 70%."

As a field of study, however, probability theory is a branch

of mathematics concerned with analyzing random phenomena.

So uncertainty and randomness are central ideas of

probability.

The rise of statistics to the status of a science is less

precise and more recent.

Early applications revolved around the needs of state

governments to base decisions on demographic

and economic data.

This explains the common root "stat" found in state and

statistics.

Statistics in the plural form are numerical facts or data

such as averages.

In the singular form, statistics is the science that

uses data to find out things about our world.

So while both probability and statistics are concerned with

the frequency of events, they go about it in different ways.

In probability, we begin with a model based on

mathematical theory.

And from there, we make predictions about which events will

occur and assign them probabilities.

In statistics, we begin by representing, analyzing, and

interpreting data collected from the real world.

And from there, we draw conclusions about the rules of

the underlying model.

To illustrate this difference, a probabilist would theorize

that given the mathematical structure of a cube, rolling

any one of these numbers is equally likely.

She would therefore assign the probability 1/6 to

obtaining any one of the numbers 1 through

6 on any one roll.
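
The probabilist's starting point can be written down as a small Python sketch. The dictionary name `model` and the "even number" event are illustrative choices, not part of the lesson.

```python
from fractions import Fraction

# The probabilist's model: six faces, each assumed equally likely.
faces = [1, 2, 3, 4, 5, 6]
model = {face: Fraction(1, len(faces)) for face in faces}

# The probabilities of all outcomes must add up to 1.
assert sum(model.values()) == 1

# An event made of several outcomes gets the sum of their probabilities.
p_even = sum(p for face, p in model.items() if face % 2 == 0)

print(model[3], p_even)  # prints: 1/6 1/2
```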

A statistician, on the other hand,

takes nothing for granted.

He'd begin by setting up an experiment that consists of

rolling this number cube many, many times.

He'd then record those observations, analyze the

information,

and based on the data, conclude whether this is

a fair die or not.

Finally, he'd use probability theory to make predictions.
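
The statistician's experiment can be imitated on a computer. This is only a sketch: the simulated die is fair by construction, and the function name, the number of rolls, and the fixed seed are all assumptions made for the illustration.

```python
import random
from collections import Counter

def roll_experiment(n_rolls, seed=0):
    """Roll a simulated number cube n_rolls times and return the
    observed relative frequency of each face."""
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    return {face: counts[face] / n_rolls for face in range(1, 7)}

freqs = roll_experiment(60_000)
for face, freq in freqs.items():
    # For a fair die, each observed frequency should sit near 1/6 (about 0.167).
    print(face, round(freq, 3))
```

With real data, the statistician would look at how far these frequencies stray from 1/6 before deciding whether the die is fair.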

During World War II, while confined to a prison camp, the

English mathematician John Kerrich flipped a coin 10,000

times to prove the common sense notion that repeatedly

flipping a coin results in heads up about half

or 50% of the time.

This common sense notion has a mathematical name.

It's called the law of large numbers.

First, a word on randomness.

Gravity is an example of something that's

certainly not random.

If I drop this ball, it will fall at the same speed and in

the same direction every single time.

If I flip this coin, on the other hand, I can't predict

what I will get on each trial.

It's not like dropping a ball.

But while I can't predict the outcome of any individual

trial, I can predict a long-term pattern.

That's the signature of randomness.

Random doesn't mean completely haphazard or chaotic. It means

there exists a regularity that only appears over many, many

repetitions.

And it's this type of order or regularity that we describe

mathematically using probability.

Let's use the Nspire to simulate the theorem of

probability called the law of large numbers.

Turn on the TI-Nspire.

Press the Home key for a new document.

If a document is open, you'll be prompted to save it.

Decide, then select 1 to create a calculator page.

Press Menu.

And under Probability, select Random.

And then select Integer.

RandInt, short for random integer, appears.

Enter 0 comma 1 and the right parenthesis.

This command will randomly display numbers

ranging from 0 to 1.

But since these are integers, there are only two

possibilities.

Press Enter.

And indeed, you obtain 0 or 1.

I got a 1.

Press Enter several more times to see what happens.

You get a list of 0's and 1's randomly

generated by the computer.

If we let 1 equal heads and 0 equal tails, this then can

also be viewed as the list of coin-tossing outcomes after

several random trials of your experiment.
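
For readers without the handheld, a rough Python counterpart is sketched here. It assumes the standard library's `random.randint`, which, like randInt with 0 and 1, includes both endpoints; the helper name `flip_coin` is made up for the example.

```python
import random

def flip_coin():
    """Simulate one coin toss: 1 stands for heads, 0 for tails."""
    return random.randint(0, 1)  # both endpoints are included

# Repeat the toss a few times, like pressing Enter again and again.
results = [flip_coin() for _ in range(8)]
labels = ["heads" if r == 1 else "tails" for r in results]

print(results)
print(labels)
```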

Let's add another feature.

This time, we'll use another path to access randInt.

Press the Catalog key.

Press 1 for the first tab.

An alphabetical list of commands appears.

Press R to move to those beginning with the letter R.

Scroll down a bit with the Down Arrow and select randInt.

Press Enter.

Input 0 comma 1 comma 15.

Then press Enter as the right paren is

automatically inserted.

As you can see, inserting a third integer, n, tells the

handheld to execute n trials.

Enclosed in braces, you have the 15 outcomes of the

simulated coin toss.
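
The three-argument form can be imitated in Python as well. The function `rand_int` below is a hypothetical helper written for this sketch, not a built-in; it simply repeats `random.randint` n times.

```python
import random

def rand_int(low, high, n):
    """Return a list of n random integers from low to high, inclusive."""
    return [random.randint(low, high) for _ in range(n)]

# Fifteen simulated coin tosses: 1 for heads, 0 for tails.
outcomes = rand_int(0, 1, 15)
print(outcomes)
```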

The next step will be to simulate flipping a coin 500

times for each of 50 trials, for a total of 50 times 500,

or 25,000 flips.

And on each trial, we'll record the number of heads.

To generate the trial numbers 1 through 50 in column A,

we'll use the fact that each consecutive counting number

equals the previous one plus 1.

To generate the number of heads on each trial, we'll

flip the coin 500 times using randInt, and then add up the

500 numbers.

Since heads are 1 and tails 0, the sum will yield the sum of

the 1's, which is the sum of all heads.
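
Here is a small sketch of why adding up the flips counts the heads. The fixed seed is an assumption made only so that the run can be repeated.

```python
import random

rng = random.Random(1)  # fixed seed so the example is repeatable

# One trial: 500 simulated flips, 1 for heads and 0 for tails.
flips = [rng.randint(0, 1) for _ in range(500)]

# Tails contribute 0 to the sum, so the sum equals the count of 1's.
number_of_heads = sum(flips)
assert number_of_heads == flips.count(1)

print(number_of_heads)
```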

Here we go.

Press Home to insert a list and spreadsheet page.

Scroll to the top of column A and type t

for the trial number.

Press Enter.

In the formula line, the gray row,

we need a formula that will generate numbers 1 through 50

for the 50 simulated trials.

Here's one way of doing this.

Press Menu.

And under Data, select Generate Sequence.

You will then use the n-th term of the sequence.

In this case, the n-th trial.

Each trial number u of n is one more than

the previous one.

So type n for u of n.

As usual, we use Tab to scroll down.

Enter 1 for the starting value.

Enter 50 for the maximum number of trials.

Then, Tab Down and click OK.

You've generated the trial numbers.

Use the nav pad to move to the top of column B and type N-O-H

for number of heads.

In cell B1, not the formula cell,

you're going to type in sum of randInt of 0, 1, and 500,

which will give us the number of heads.

So here we go.

Type in the equal sign S-U-M for sum.

Then press Catalog and select randInt.

Enter 0 comma 1 comma 500 and the right paren twice.

Now we have the number of heads from the first trial.

To do the same for each of the 50 trials, press Menu.

And under Data, select Fill down.

Now press the Down Arrow until you reach cell B50.

Then press Enter.

Notice the time clock, indicating that the Nspire is

executing all 50 trials of 500 coin tosses, and recording the

corresponding number of heads for each trial.

They now appear in column B.

So far in column B, we have the number of heads in 500

flips that correspond to each of the 50 trials.

Our last step is to show that as we move from 500 flips to

1,000 to 1,500 and so on, all the way to 25,000 flips, the

ratio of the cumulative heads to the total number of flips

gets closer and closer to 1/2, or 0.5, or 50%.

This chart will help you understand the

formula we will be using.

We'll create a third column C of the ratios of cumulative

heads over cumulative flips.

The first ratio is simply H1 over 500.

In the second, it's the heads H1 plus H2 over 500 times 2,

or 1,000 flips.

After three trials, the ratio of heads is the sum of H1 plus

H2 plus H3 over 500 times 3, or 1,500 flips, and so on.
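
Written as one general formula, the pattern in the chart looks like this. The symbol r_n for the ratio after n trials is a label chosen here; H_n is the number of heads on trial n.

```latex
r_n = \frac{H_1 + H_2 + \cdots + H_n}{500\,n}, \qquad n = 1, 2, \ldots, 50
```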

One more comment before we resume.

In the formula we'll use on the Nspire, the variable A

stands for the numbers in column A. And the variable B

stands for the numbers in column B. I think we're ready.

Use the nav pad to move to the top of column C and type r for

ratio of total heads to total flips.

Then move down to the formula row.

We'll enter the ratio you see here.

So first press Control and the division symbol for a fraction

placeholder.

Type C-U-M, short for cumulative, followed by S-U-M,

left paren B and right paren.

In the denominator, enter 500 times A,

the column A variable.

Finally, press the Right Arrow and press Enter.

You now have the ratio of the total number of heads to the

total number of flips.

So 500 flips, 1,000, 1,500, and so on, all the way to

25,000 on the 50th trial.
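
The column C formula, the cumulative sum of column B over 500 times column A, can be sketched in Python as follows. The function name and the three head counts are made up for illustration; they are not data from the lesson.

```python
from itertools import accumulate

def cumulative_ratios(heads_per_trial, flips_per_trial=500):
    """Ratio of total heads so far to total flips so far, after each trial."""
    running_heads = accumulate(heads_per_trial)  # plays the role of the cumulative sum of column B
    return [total / (flips_per_trial * trial)    # denominator: 500 times the trial number
            for trial, total in enumerate(running_heads, start=1)]

example_heads = [260, 240, 250]  # made-up head counts for three trials
ratios = cumulative_ratios(example_heads)
print(ratios)  # prints: [0.52, 0.5, 0.5]
```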

Pretty powerful machine.

To complete this simulation, let's make a scatter

plot of this data.

Press the Home key and select the Data & Statistics page.

Again, it will take a few seconds.

Now you have what appears to be a mess.

Press the Down Arrow and you'll see

Click to Add Variable.

Click to select t for the trial number.

Then, with the nav pad, navigate over to the y-axis

and click to Add r for ratio when you see the same box.

As expected, as the number of flips increases from 500 on

trial 1 to 25,000 on trial 50, the ratio of the total number

of heads to the total number of flips approaches the

horizontal line of 0.5 or 50%.
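
The whole simulation can be pulled together in a short Python sketch. It prints a few ratios instead of drawing the scatter plot; the function name and the fixed seed are assumptions, and a different seed will give different numbers.

```python
import random
from itertools import accumulate

def simulate_ratios(trials=50, flips_per_trial=500, seed=0):
    """Return the running ratio of heads to flips after each trial."""
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    heads = [sum(rng.randint(0, 1) for _ in range(flips_per_trial))
             for _ in range(trials)]
    return [total / (flips_per_trial * trial)
            for trial, total in enumerate(accumulate(heads), start=1)]

ratios = simulate_ratios()
for trial in (1, 10, 50):
    # 500 flips on trial 1, 5,000 on trial 10, 25,000 on trial 50
    print(trial, round(ratios[trial - 1], 4))
```

Plotting the trial number against `ratios` would give the same kind of picture as the handheld's scatter plot.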

In his prison cell, John Kerrich carried out a similar

experiment, but he flipped his coin 10,000 times without the

help of a computer simulation.

He did it all by hand.

If you count about 6 seconds for flipping and recording

each outcome, without counting the rests in between, that

alone is over 16 hours.

After 10 flips, he got 4 heads.

After 30, his heads to flips ratio was 56.7%.

And after 10,000 flips, he got heads 50.67% of

the time, or 0.5067.

His ratio also approached 0.5.
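
Kerrich's three checkpoints can be laid side by side in a quick sketch. The lesson quotes percentages, so the head counts 17 and 5,067 below are inferred from 56.7% of 30 and 50.67% of 10,000; treat them as assumptions.

```python
# (flips so far, heads so far); the last two head counts are inferred
# from the quoted percentages rather than stated in the lesson.
kerrich = [(10, 4), (30, 17), (10_000, 5067)]

ratios = [heads / flips for flips, heads in kerrich]
distances = [abs(ratio - 0.5) for ratio in ratios]

for (flips, _), ratio, distance in zip(kerrich, ratios, distances):
    print(f"{flips:>6} flips: ratio {ratio:.4f}, distance from 0.5 = {distance:.4f}")
```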

Let's finish with a little vocabulary.

In this experiment, we have two possible

outcomes, heads and tails.

The sample space S is the set of all possible outcomes.

Here, the sample space has two elements.

We use braces to denote a set.

Any combination of outcomes is an event.

For example, obtaining heads when tossing one coin or

obtaining two heads when tossing two coins.

To every event E, the probability function P of E

assigns a number between 0 and 1.

For example, the probability of obtaining heads when

tossing one coin is 0.5, or 50%.

But the probability of obtaining two heads when

tossing two coins is 0.25, or 25%.

See if you can figure that one out.

Finally, the probability of an impossible event, such as

obtaining heads and tails when tossing one coin, is 0.

And the probability of a certain event is 1, or 100%.
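
One way to work out these probabilities is to list the sample space and count. The sketch below assumes every outcome in the sample space is equally likely, which holds for fair coins; the helper name `probability` is made up for the example.

```python
from fractions import Fraction
from itertools import product

def probability(event, sample_space):
    """P(E) when all outcomes are equally likely: favorable over total."""
    favorable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favorable), len(sample_space))

one_coin = list(product("HT", repeat=1))   # sample space {H, T}
two_coins = list(product("HT", repeat=2))  # sample space {HH, HT, TH, TT}

p_heads = probability(lambda o: o == ("H",), one_coin)
p_two_heads = probability(lambda o: o == ("H", "H"), two_coins)
# Impossible event: heads and tails on a single toss.
p_impossible = probability(lambda o: "H" in o and "T" in o, one_coin)
# Certain event: some outcome of the sample space occurs.
p_certain = probability(lambda o: True, one_coin)

print(p_heads, p_two_heads, p_impossible, p_certain)  # prints: 1/2 1/4 0 1
```

Only one of the four two-coin outcomes is heads-heads, which is where 0.25 comes from.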