A Basic Introduction to Statistics

So how can we define “statistics”? If you search around the internet you’ll find lots of definitions, but it’s a useful exercise to try to use your own words. How about this one: statistics is the “science” of obtaining information from numerical data.

This information is used to
• gain insights (Who will be the next president?)
• understand relationships (Does watching T.V. affect a student’s grades?)
• draw conclusions (Does smoking cause cancer?)
• make decisions (Should the space shuttle Challenger be launched?)

Challenger was launched in January 1986, even though there were concerns that the temperature was too cold.
The space shuttle broke apart 73 seconds after lift-off.

Three Main Components of Statistics

• Data collection
• Data analysis
• Drawing conclusions from the data

What is part of all three components? Data. Statistics is driven by data, therefore …
“In God we trust. All others must bring data.”
Robert Hayden, Plymouth State College

Statistics is more of an art than a science.

We can never determine if an event will definitely occur: we can only determine the probability that an
event will occur.

“Statistics means never having to say you are certain”

This seems to be a major shortcoming, but it is actually the greatest strength of statistics!

“An approximate answer to the right question is worth a good deal more than the exact
answer to the wrong question.”
John Tukey

In mathematics we frequently find exact answers to fictitious problems. Statistics gives approximate
answers to real problems. The difficulty is asking the right questions and collecting the correct data!

A mathematician and a statistician apply for the same job. At the interview, they are
asked the question, “What is 1 + 1?” The mathematician says “2” without hesitating.
The statistician pauses for a few minutes and asks “What do you want it to be?”

In statistics, different conclusions are possible depending on which questions are asked and which data are collected. For example, if you’re studying whether buying and reselling sneakers online is worthwhile, it’s important to gather the correct cost data as well as the sales figures. This is essential for working out profits and whether your efforts are worth it. Many entrepreneurs have to balance the costs of the business, e.g. marketing, against the potential sales they might achieve. Smaller businesses face other costs too, such as software and even specialised servers known as ‘sneaker proxies’.

Individuals are often accused of using statistics to distort the truth.
“There are three kinds of lies: lies, damn lies, and statistics.”
Benjamin Disraeli
A lie is bad, a damn lie is worse, but a lie based on statistics is the worst lie! Why is a lie based on statistics
the worst type of lie? People believe it’s true!
“A statistician is a person who comes to the rescue of figures that cannot lie for
themselves.”

Statistics can be used to spin data. For example, a textbook publisher claims that a new math textbook
will increase pass rates on Regents exams. Data are collected which show that 2 of 30 students passed
the Regents exam using the old book while 3 of 30 students passed using the new book. The claim? The
new book increased the pass rates by 50%!
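To see where the 50% comes from, here is a minimal Python sketch using only the figures from the example above. The variable names are mine. It puts the publisher's relative increase next to the much less impressive absolute increase in percentage points.

```python
# Pass counts taken from the textbook example above.
old_passed, new_passed, class_size = 2, 3, 30

old_rate = old_passed / class_size  # roughly 6.7% of the class
new_rate = new_passed / class_size  # 10% of the class

# Relative change: the publisher's headline figure.
relative_increase = (new_rate - old_rate) / old_rate

# Absolute change: the difference between the two pass rates.
absolute_increase = new_rate - old_rate

print(f"Old pass rate: {old_rate:.1%}")
print(f"New pass rate: {new_rate:.1%}")
print(f"Relative increase: {relative_increase:.0%}")
print(f"Absolute increase: {absolute_increase * 100:.1f} percentage points")
```

Both numbers are "true", but one extra student out of 30 is the whole story behind the headline.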
But do not despair!
“It is easy to lie with statistics, but it is easier to lie without them.”
Frederick Mosteller
However, always remember that 39% of data is made up. (Think about it!)

The Route Inspection Problem (Chinese Postman)

The problem is to start from a node, travel along each and every arc and return to the starting point. If it is possible to achieve this without going over the same arc twice, the minimum distance is just the sum of the arc lengths. If it is necessary to cover some of the arcs twice, then we must select these in the most economic way. This type of problem is called a Route Inspection Problem. Such problems are usually illustrated by gritter lorries or postmen, i.e. cases where all roads must be covered.

First we need to cover some network theory.

Node type – an n-node is a node where n arcs join

1-node, 3-node, … are called odd nodes

2-node, 4-node, … are called even nodes

In a network there must always be an even number of odd nodes.
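The reason is that every arc has two ends, so the node orders always add up to twice the number of arcs, which is an even total. That can only happen if the odd orders come in pairs. A small Python sketch of the check follows. The network is a made-up example, not one from this post, and the function name is my own.

```python
from collections import Counter

# Hypothetical network: each tuple is one arc joining two nodes.
arcs = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D")]

def node_orders(arcs):
    """Count how many arc ends meet at each node."""
    orders = Counter()
    for u, v in arcs:
        orders[u] += 1
        orders[v] += 1
    return orders

orders = node_orders(arcs)
odd_nodes = sorted(n for n, k in orders.items() if k % 2 == 1)

print(dict(orders))                            # order of each node
print(odd_nodes)                               # nodes where an odd number of arcs join
print(sum(orders.values()) == 2 * len(arcs))   # each arc is counted twice
```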

Traversability – a network is said to be traversable if you can draw it without removing your pen from the paper and without tracing any arc twice.

Looking at this network you will find you must start at one of the odd nodes and end at the other.

If you investigate other networks and note the number of odd nodes in each, you will find that for a network to be traversable it must have 0 or 2 odd nodes, and if we are to be able to start and finish at the same node it must have no odd nodes. The implications of this result about traversability for route inspection problems are as follows.
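The 0-or-2 rule is easy to try out. The sketch below assumes the network is connected (all in one piece), which the rule needs. The three example networks, a square with zero, one and two diagonals, are invented for illustration, as is the function name.

```python
from collections import Counter

def classify(arcs):
    """Classify a connected network by its number of odd nodes."""
    orders = Counter()
    for u, v in arcs:
        orders[u] += 1
        orders[v] += 1
    odd = [n for n, k in orders.items() if k % 2 == 1]
    if len(odd) == 0:
        return "traversable, start and finish at the same node"
    if len(odd) == 2:
        return "traversable, start at one odd node and finish at the other"
    return "not traversable"

# Hypothetical networks: a square, then the same square with diagonals added.
square = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
one_diagonal = square + [("A", "C")]
two_diagonals = one_diagonal + [("B", "D")]

for name, net in [("square", square),
                  ("one diagonal", one_diagonal),
                  ("two diagonals", two_diagonals)]:
    print(name, "->", classify(net))
```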

If there are no odd nodes in the network, the network is traversable and the minimum distance is the sum of the arc distances. Otherwise there will be an even number of odd nodes, and the route inspection algorithm requires that we identify them and link them together in pairs in the most economic way. The links selected will be repeated, and adding these extra arcs makes all the nodes even and the network traversable.
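Here is a rough Python sketch of that procedure, under a few assumptions of mine. The weighted network is hypothetical. "Most economic" is taken to mean the shortest route between each pair of odd nodes, which may run along several arcs. Every possible pairing is tried by brute force, which is only sensible when there are few odd nodes. The shortest distances come from the Floyd-Warshall method, chosen purely for brevity.

```python
from collections import defaultdict
from itertools import product

# Hypothetical weighted network: (node, node, arc length).
arcs = [("A", "B", 3), ("B", "C", 4), ("C", "D", 5),
        ("D", "A", 6), ("A", "C", 2), ("B", "D", 7)]

def shortest_distances(arcs):
    """All-pairs shortest distances (Floyd-Warshall)."""
    nodes = sorted({n for u, v, _ in arcs for n in (u, v)})
    dist = {(u, v): (0 if u == v else float("inf"))
            for u, v in product(nodes, repeat=2)}
    for u, v, w in arcs:
        dist[u, v] = min(dist[u, v], w)
        dist[v, u] = min(dist[v, u], w)
    # product varies its first item slowest, so k is the outer loop.
    for k, i, j in product(nodes, repeat=3):
        if dist[i, k] + dist[k, j] < dist[i, j]:
            dist[i, j] = dist[i, k] + dist[k, j]
    return dist

def odd_nodes(arcs):
    """Nodes where an odd number of arcs join."""
    order = defaultdict(int)
    for u, v, _ in arcs:
        order[u] += 1
        order[v] += 1
    return sorted(n for n, k in order.items() if k % 2 == 1)

def cheapest_pairing(nodes, dist):
    """Return (extra distance, pairs) for the cheapest way to pair the nodes."""
    if not nodes:
        return 0, []
    first, rest = nodes[0], nodes[1:]
    best = (float("inf"), [])
    for i, partner in enumerate(rest):
        cost, pairs = cheapest_pairing(rest[:i] + rest[i + 1:], dist)
        cost += dist[first, partner]
        if cost < best[0]:
            best = (cost, [(first, partner)] + pairs)
    return best

dist = shortest_distances(arcs)
extra, pairs = cheapest_pairing(odd_nodes(arcs), dist)
total = sum(w for _, _, w in arcs) + extra

print("Pairs to repeat:", pairs)
print("Extra distance:", extra)
print("Minimum route length:", total)
```

In this made-up network all four nodes are odd. The three possible pairings cost 8, 9 and 10, so the cheapest repeats A to B and C to D, and the route length is the arc total of 27 plus 8.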

For more information on these procedures and the calculations behind them please see the previous posts on the route inspection problems.