Investigation 2: Data Analysis: Endangered Wolves
On to part two for a closer look at statistics.
In order to analyze, interpret, and draw inferences
from real-world observations, statisticians must first
collect, organize, and represent their data in
meaningful ways.
In the case of univariate data analysis, or data analysis of
a single variable, they can create dot charts, bar graphs,
box-and-whisker plots, pie charts, and other visually
effective representations.
In the case of bivariate data analysis, or data analysis
involving two variables, regression is the staple
technique.
I will show you three representations available to
you for univariate data sets.
Suppose you jotted down the breed of dog each student in
your class owns.
Let's use the Nspire to organize this data in ways you
may not have seen on other technology.
We'll just use the first letter of each breed.
And we'll use the letter n for none.
Press the Home key for a new document.
Save the previous one if you wish, then create a lists and
spreadsheet page.
Scroll to the top of column A and type B-R-E-E-D for our
x-variable.
Press Enter, or the Down Arrow, twice.
Pause the video to enter the dog data in cells A1 through
A20, then resume viewing.
You don't need to select the column data. With only one column
of data, no matter what command you enter, there's no ambiguity
about which variable you want to use.
So press Menu and under Data, select Quick Graph.
Instantly, you have a bar graph where each bar is made
up of dots, sometimes called a dot chart.
Now for the fun part.
Press Menu.
And under Plot Type, select Pie Chart.
And there you have it.
To magnify the pie chart, under Page Layout, select
Custom Split.
Arrow left until you're about 3/4 of the way over.
Then, press Enter.
Use the nav pad to move the pointer to a
particular pie slice.
Click and hold the segment to see the
summary of that category.
The summary displays the number of cases in that
category as well as the percentage that the category
represents among all the cases.
Lastly, press Menu.
And under Plot Type, select Bar Chart.
This is the classical-looking bar graph.
The count is indicated on the vertical axis.
And if you click and hold any one bar
again, you get the same type of summary we just saw.
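The summary shown for a pie slice or a bar (the count for a category and its percentage of all cases) can be sketched outside the handheld as well. Here is a minimal Python sketch; the 20 breed letters are made up for illustration and are not the data shown in the video.

```python
from collections import Counter

# Hypothetical class data: first letter of each breed, "n" for none.
# These 20 values are invented for illustration only.
breeds = list("bnlpnbglnpbnlgnbpnlb")

counts = Counter(breeds)   # number of cases per category
total = len(breeds)        # number of cases overall

# One line per category: the count and its share of all cases,
# the same two numbers the pie-slice or bar summary displays.
for category, count in sorted(counts.items()):
    print(f"{category}: {count} cases, {100 * count / total:.1f}%")
```

The percentages always add up to 100, because every case falls into exactly one category.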
I'm sure you've done a fair amount of work with univariate
data, meaning data involving one variable.
So we'll move on to bivariate data.
Bivariate data means pairs of values of two variables:
the independent variable, denoted by x, and the dependent
variable, denoted by y.
Regarding the relationship between the two variables, we
have three possible cases.
If y depends on x, we have a causal relationship.
If x and y are related but do not depend on each other,
there is no causal relationship.
Finally, there may be no relationship at all between
the two variables.
The example we're about to investigate
belongs to case two.
Gray wolves, also known as timber wolves, or simply
wolves, originated about 300,000 years ago.
They are known for their intelligence and adaptability.
In the past, wolves endured revenge killings for attacks
on livestock.
By the 1930s, they were completely eliminated from
the northern Rocky Mountain states.
Eventually, with the passage of the Endangered Species Act
in 1973, public attitudes changed and wolves received
legal protection.
We're going to use data from the 2007 Rocky Mountain Wolf
Recovery Report, plot them, analyze them, find the
mathematical model that best fits the growth pattern, and
discuss the prediction of future population numbers.
Turn on the TI-Nspire.
Press the Home key for a new document.
Save the previous document if you wish, then create a lists
and spreadsheet page.
Scroll to the top of column A and type Y-E-A-R for the
x-variable.
Press Enter or the Down Arrow.
We'll use the Fill down feature to insert consecutive
years through 2007.
Scroll down beyond the formula line.
And in cell A1, type 1979.
Move down and in cell A2, type 1980.
Press Enter.
Go back up to A2, press and hold the Shift key as you move
up to A1, and both cells will become darkened, which means
they've been selected.
Press Menu.
And under Data, select Fill down.
1979 through 2008 would be a 30-year span, so our data through
2007 span 29 years.
Press the Down Arrow until the dotted box
includes the 29th row.
Press Enter.
You can see the final years of the
year list you've generated.
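If you want to double-check the row count, the same list of consecutive years can be generated in a couple of lines. This is only a sketch in Python, not something the handheld runs.

```python
# Consecutive years from 1979 through 2007, as Fill down generates them.
years = list(range(1979, 2008))  # range() stops just before 2008

print(len(years))            # number of rows filled
print(years[0], years[-1])   # first and last entries in the list
```

Counting both endpoints, 1979 through 2007 gives 29 entries, which is why the dotted box has to reach the 29th row.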
Pressing the Right Arrow takes you back up to cell B1.
Scroll to the top of column B and type wolf pop for the
annual wolf population, our y-variable.
Arrow Down twice to cell B1.
Pause the DVD now to enter the data from the chart.
Stop at 1,513 in cell B29 before resuming.
To select both columns, press Menu and follow this path:
Actions, then Select,
and then Select Column.
Now, press and hold the Shift key as you Arrow Left to
select both columns.
To create a scatter plot of these data, press Menu.
And under Data, select Quick Graph.
Scatter plot and data lists are now side by side.
Observe the shape of this graph.
It's ascending, which was expected.
But it also reveals a rapid growth rate.
So we suspect an exponential function.
To perform an exponential regression, press Control Tab
to switch back to the lists, which are still highlighted.
Next, press Menu.
Under Statistics, select Stat Calculations.
And under that, select Exponential Regression.
A dialog box appears.
Press the Down Arrow to select year for the X List.
Tab down to Y. Do the same, but select wolf pop this time.
Notice the regression equation, which the Nspire
will compute, will be saved to the f1 function.
Tab down to the first result column and note that the
results will be stored starting in column C. Tab down
to OK and press Enter.
To see more of this spreadsheet, under Page Layout
select Custom Split.
Arrow Right to move the partition to the right, then
press Enter.
Column C contains the words and symbols.
Column D gives their values.
Notice the general form of the exponential function
in cell D2.
Scroll down to line 6.
r is called the correlation coefficient.
Its value tells us how good a fit we have.
An absolute value of 1 means a perfect fit.
This r-value of 0.988 implies an excellent fit.
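To get a feel for where numbers like these come from, here is a minimal Python sketch of one common way to compute an exponential regression: fit a straight line to the pairs (x, ln y) and convert back to the form y = a * b^x. It is an illustration under that assumption and may differ in detail from what the handheld does internally. The function name exp_regression and the data are made up; they are not the wolf counts.

```python
import math

def exp_regression(xs, ys):
    """Fit y = a * b**x by least squares on (x, ln y); return (a, b, r)."""
    logs = [math.log(y) for y in ys]          # linearize: ln y = ln a + x ln b
    n = len(xs)
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sll = sum((l - mean_l) ** 2 for l in logs)
    sxl = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    slope = sxl / sxx                         # estimate of ln b
    intercept = mean_l - slope * mean_x       # estimate of ln a
    r = sxl / math.sqrt(sxx * sll)            # correlation of x with ln y
    return math.exp(intercept), math.exp(slope), r

# Synthetic data that follows y = 3 * 2**x exactly (not the wolf data).
xs = [0, 1, 2, 3, 4]
ys = [3, 6, 12, 24, 48]

a, b, r = exp_regression(xs, ys)
print(round(a, 6), round(b, 6), round(r, 6))
```

Because the synthetic points lie exactly on an exponential curve, r comes out as 1, the perfect-fit case. Real data such as the wolf counts scatter around the curve, so the absolute value of r falls below 1.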
We're going to introduce a new graphs and geometry page where
we'll superimpose the continuous graph of the f1
regression equation over the scatter plot of data pairs.
Then, we'll see how well we can use this mathematical
model to predict future wolf population numbers.
Insert the Graphs and Geometry Page by pressing Control-I and
selecting 2.
Press Menu.
And under Graph Type, select Scatter Plot.
The x-variable is highlighted.
Press Enter and choose Year.
Press Enter again.
Tab over to y.
Press Enter and this time choose wolf pop.
To adjust the window, since our data begins
at 1979, press Menu.
And under Window, select Zoom Data.
There's our scatter plot in a full screen.
To graph the regression equation over this plot, press
Menu again.
And under Graph Type, choose Function.
f2 is displayed at the bottom.
To access f1, where our regression equation is stored,
press the Up Arrow once and f1 is displayed.
Press Enter to graph it.
As you can see, while f1 is a good fit, meaning the function
f1 is a good mathematical model for the wolf population
growth, it's not perfect.
For example, there's one period of time where the
model's growth is slower than the actual
wolf population growth.
And later, it's the opposite.
To use f1 to predict future wolf population
numbers, press Menu.
And under Trace, select Graph Trace.
A point's coordinates appear.
If the y-value is an integer, the point is on the scatter plot.
Press the Down Arrow to access the f1 graph.
Type in 2012.
The screen pans over to show the point on f1's graph.
7,000 wolves seems rather excessive.
And that's only because the model is theoretical.
It has modeled, with 98.9% accuracy, the population
growth over 29 years, the timespan during which wolves
were protected under the Endangered
Species Act, or ESA.
As the northern Rocky Mountain, or NRM, gray wolf population reaches the desired
levels, the ESA protection may be lifted and other variables
will then come into play, such as the numbers of wolves
killed by humans.
The growth rate will, therefore, slow down and a new
mathematical model will have to be developed.
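One way to see why an exponential model predicts such large numbers a few years out: it multiplies the population by the same factor every year, so the population keeps doubling over a fixed period no matter how large it already is. The Python sketch below uses hypothetical values for a and b, not the coefficients from the wolf regression, and the function name predict is made up for illustration.

```python
import math

def predict(a, b, x):
    """Value of the exponential model y = a * b**x at x."""
    return a * b ** x

# Hypothetical parameters for illustration only:
# 100 animals at x = 0, growing by a factor of 1.2 each year.
a, b = 100.0, 1.2

# Number of years the model needs to double, whatever the current size.
doubling_time = math.log(2) / math.log(b)
print(f"doubling time: {doubling_time:.2f} years")

# The model never levels off: each step multiplies, it does not add.
for x in (0, 10, 20, 30):
    print(x, round(predict(a, b, x)))
```

Nothing in the formula knows about habitat limits or hunting, which is why a prediction outside the range of the data has to be checked against common sense.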
This process of starting with data from the real world,
developing a theoretical model in mathematics, and then
returning to the real world better equipped to explain and
predict phenomena is called mathematical modeling.
But predicting the future behavior of variables in any
phenomenon is tricky business.
Scientists must combine information from theoretical
models with common sense knowledge to make the best
prediction.
Understanding probability is important because life is full
of uncertainty.
And we're better off making choices based on objective
probability than on our own subjective beliefs.
Understanding statistics is equally important because it
provides the tools for predicting and forecasting
future events based on observational data.
Now that you know how to carry out probability simulations
and the process of mathematical modeling with the
TI-Nspire and its built-in statistics, try your hand at
other interesting data sets.
Good luck.