Archive for category Basic stats & math

Blended learning for math & stats

Check out this intriguing YouTube video by Khan Academy proving the Pythagorean Theorem:

Now imagine grade schoolers being lectured like this at home and then spending their time in class following up one-on-one or in small group sessions with the teacher. See this report from a 7th grade math teacher in California who takes advantage of this “blended learning” approach. As face-to-face time with educators becomes ever more expensive, expect more and more use of asynchronous web-based training like this. That’s what I foresee. Don’t you?


Armed and dangerous – switchblades and statistics

(Warning: Quirky material ahead =>)

Seeing this CBS News report about Maine legalizing switchblades for one-armed people reminded me of a riddle about limbs that some statisticians pose for educational purposes. Here it is: “The great majority of people in [fill in your country here] have more than the average number of [choose either arms or legs here].”

For an answer {UK, legs}, see this posting on averages by Kevin McConway, Professor of Applied Statistics in the Department of Mathematics and Statistics at The Open University.  I heard this riddle also from Hans Rosling in his BBC TV program on “The Joy of Statistics.”*  He spoke of his home country of Sweden, whose inhabitants on average have 1.999 legs.
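To see how the riddle works arithmetically, here is a minimal Python sketch. The population is made up for illustration (it is not census data for Sweden or anywhere else); the counts are chosen only so that the mean comes out at 1.999 legs.

```python
from statistics import mean, median

# Hypothetical population: 1,000 people, one of whom has lost a leg.
# These counts are illustrative assumptions, not real data.
legs = [2] * 999 + [1]

average = mean(legs)  # pulled just below 2 by the single one-legged person
above_average = sum(1 for n in legs if n > average)
share_above = above_average / len(legs)

print(f"mean = {average}, median = {median(legs)}")
print(f"{share_above:.1%} of people have more than the mean number of legs")
```

The mean is dragged down by a few low values while the median stays at 2, which is why nearly everyone ends up “above average.”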

I’m quitting while I’m ahead.  Oops, this makes me wonder if I have an average number of heads – a scary thought, my hunch being that I’m below average for this.  I never imagined that averages could be so creepy!

*See this StatsMadeEasy blog on Rosling

 


South Korea achieves top ranking for education

Based on testing of nearly half a million 15-year-old students worldwide, South Korea ranks as the number 1 country overall for education.  It’s laid out beautifully in the graphical illustration by Paul Scruton for this article by UK’s Guardian on which country does best at reading, science and maths.  This breaking news is timely for me because I will be in Seoul Thursday giving a presentation hosted by the Department of Statistics at Chung-Ang University.  The USA fell far down the list in math, so I suppose I am a bit out of line trying to explain new tools of design of experiments to anyone in this country. ; )

I spoke today to a resident of Busan with a daughter in high school.  She said the school runs from 8 in the morning to 8 at night Monday through Friday and a half day every other Saturday.  No wonder South Korean teenagers test so highly relative to the USA!


Minnesota’s ’08 Senate race dissed by British math master Charles Seife

Sunday’s New York Times provided this review of Proofiness – The Dark Arts of Mathematical Deception – due for publication later this week. The cover, seen here on Amazon, depicts a stats wizard conjuring numbers out of thin air.

What caught my eye in the critique by Steven Strogatz, an applied mathematics professor at Cornell, was the deception caused by “disestimation” (as Proofiness author Seife terms it) of the results from Minnesota’s 2008 Senate race, which Al Franken won by a razor-thin 0.0077 percent margin (225 votes out of roughly 2.9 million counted) over Norm Coleman. Disestimation is the act of taking a number too literally, understating or ignoring the uncertainties that surround it; in other words, giving too much weight to a measurement, relative to its inherent error.

“A nice anecdote I like to talk about is a guide at the American Museum of Natural History, who’s pointing at the Tyrannosaurus rex.  Someone asks, how old is it, and he says it’s 65 million and 38 years old.  Sixty-five million and 38 years old, how do you know that?   The guide says, well, when I started at this museum 38 years ago, a scientist told me it was 65 million years old. Therefore, now it’s 65 million and 38.  That’s an act of disestimation.  The 65 million was a very rough number, and he turned it into a precise number by thinking that the 38 has relevance when in fact the error involved in measuring the dinosaur was plus or minus 100,000 years.  The 38 years is nothing.”

– Charles Seife (Source: this transcript of an interview by NPR.)

We Minnesotans would have saved a great deal of money if our election officials had simply tossed a coin to determine the outcome of the Franken-Coleman contest. Unfortunately, disestimation is embedded in our election laws, which are bound and determined to make every single vote count, even though many thousands of ballots in a statewide race prove very difficult to decipher.


Harvard economist advises students of all ages to learn some statistics

In this Sunday New York Times “Economic View” column, Harvard professor N. Gregory Mankiw advises that those who wish to pursue this “dismal science” take one or more courses in statistics while in college. He sees a dearth of knowledge on this subject in his first-year students.

“High school mathematics curriculums spend too much time on traditional topics like Euclidean geometry and trigonometry.  For a typical person, these are useful intellectual exercises but have little applicability to daily life.  Students would be better served by learning more about probability and statistics.”

— N. Gregory Mankiw

I’m with him on learning more about stats but not at the expense of less geometry and trig, which come in very handy for anyone pursuing an engineering career. Also, budding economists could benefit from a little knowledge of periodic functions such as sine waves. It seems to me that what goes around comes around.


Quantifying statements of confidence: Is anything “iron clad”?

Today’s “daily” emailed by The Scientist features a heads-up on “John Snow’s Grand Experiment of 1855,” suggesting that his pioneering epidemiology on cholera may not be as “iron clad” as originally thought. A commentator questions what “iron clad” means in statistical terms.

It seems to me that someone ought to develop a numerical confidence scale along these lines.  For example:

  • 100% Certain.
  • 99.9% Iron clad.
  • 99% Beyond a shadow of a doubt.
  • 95% Unequivocal.
  • 90% Definitive.
  • 80% Clear and convincing evidence.
  • 50% On the balance of probabilities.

There are many other words used to convey a level of confidence, such as: clear-cut, definitive, unambiguous, conclusive.  How do these differ in degree?

Of course much depends on who is making such a statement; many people are not always right, but never in doubt. ; ) I’m skeptical of any assertion, thus I follow the advice of famed statistician W. Edwards Deming:

“In God we trust, all others bring data.”

Statistics can be very helpful for stating any conclusion because it means you never have to say you are certain. But are you sure enough to say it’s “iron clad” or what?


What value for p is right for testing t (or tasting tea)?

Seeking sponsors for his educational website, statistician Keith Bower sent me a sample of his work – this 5 minute podcast on p-values.  I enjoyed the story Keith tells of how Sir Ronald Fisher, who more-or-less invented design of experiments, settled on the p value of 5% as being a benchmark for statistical significance.

This sent me scurrying over to my office bookshelf for The Lady Tasting Tea – a delightful collection of stories* compiled by David Salsburg.**  Page 100 of this book reports Fisher saying that below p of 0.01 one can declare an effect (that is – significance), above 0.2 not (that is – insignificant), and in-between it might be smart to do another experiment.

So it seems that Fisher did some flip-flopping on the issue of what value of p is needed to declare statistical significance.

PS.  One thing that bothers me in any discussion of p-values is that it is mainly in the context of estimating the risk in a test of the null hypothesis and almost invariably overlooks the vital issue of power.  For example, see this YouTube video on Understanding the p-value.  It’s quite entertaining and helpful so far as it goes, but the decision to accept the null at p > 0.2 is based on a very small sample size.  Perhaps the potential problem (underweight candy bars), which one could scope out by calculating the appropriate statistical interval (confidence, prediction or tolerance), merits further experimentation to increase the power.  What do you think?
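To make the point about power concrete, here is a rough Python sketch of how the chance of detecting a real shortfall grows with sample size. It uses a one-sided z-test with a known standard deviation as a simplified stand-in for the t-test, and the numbers (a 1 gram shortfall, a 2 gram standard deviation) are hypothetical, not taken from the video.

```python
from statistics import NormalDist

def z_test_power(delta, sigma, n, alpha=0.05):
    """Approximate power of a one-sided, one-sample z-test.

    delta: true shift in the mean that we hope to detect
    sigma: standard deviation of individual measurements (assumed known)
    n:     sample size
    alpha: significance level (Type I error risk)
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    return 1 - std_normal.cdf(z_crit - delta * n ** 0.5 / sigma)

# Hypothetical case: candy bars underweight by 1 g, sigma of 2 g
for n in (5, 10, 20, 40):
    print(n, round(z_test_power(delta=1.0, sigma=2.0, n=n), 2))
```

With a small sample the test will often fail to flag a real problem, so a high p-value by itself says little unless the power is also considered.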

*In the title story, originally told by Sir Ronald Fisher, a Lady claims to have the ability to tell which went into her cup first—the tea or the milk. Fisher devised a test whereupon the Lady is presented eight cups in random order, four of which are made one way (tea first) and four the other (milk first). He calculated the chance of identifying every cup correctly by pure guessing as 1 right way out of 70 possible selections (about 1.4%), which falls below the standard 5% probability value generally accepted for statistical significance. Salsburg reveals on good authority (H. Fairfield Smith–a colleague of Fisher) that the Lady identified all eight cups correctly!
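The 1-in-70 figure in the tea-tasting story can be checked with a short Python calculation. It is only a sketch of the counting argument: the number of ways to choose which four of the eight cups had the tea poured first.

```python
from math import comb

cups, tea_first = 8, 4

# Number of distinct ways to pick which 4 of the 8 cups were "tea first"
selections = comb(cups, tea_first)

# Only one of those selections is entirely correct
p_all_correct = 1 / selections

print(selections, round(p_all_correct, 4))  # prints: 70 0.0143
```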

**Salsburg, who worked for some years as a statistician at a major pharmaceutical company, offers this amusing anecdote from personal experience:

“When I first began to work in the drug industry…one…referred to…uncertainty [as] ‘error.’ One of the senior executives refused to send such a report to the U.S. Food and Drug Administration [FDA]. ‘How can we admit to having error in our data?’ he asked [and]…insisted I find some other way to describe it…I contacted H.F. Smith [who] suggested that I call the line ‘residual’…I mentioned this to other statisticians…and they began to use it…It seems that no one [in the FDA, at least]…will admit to having error.”


Bonferroni of Bergamo

[Figures: least-significant-difference (LSD) plots of the simulated mill results, one Bonferroni corrected and one uncorrected (random results)]

I enjoyed a fine afternoon in the old Città Alta of Bergamo in northern Italy – a city in the sky that the Venetians, at the height of their power as the “most serene republic,” walled off as their western-most outpost in the 16th century.

In statistical circles this town is most notable for being the birthplace of Carlo Emilio Bonferroni.  You may have heard of the “Bonferroni Correction” – a method that addresses the problem of multiple comparisons.

For example, when I worked for General Mills the head of quality control in Minneapolis would mix up a barrel of flour and split it into 10 samples, carefully sealed in air-tight containers, for each of the mills to test in triplicate for moisture.  At this time I had just learned how to do the t-test for comparing two means.  Fortunately for the various QC supervisors, no one asked me to analyze the results, because I would have simply taken the highest moisture value and compared it to the lowest one.  Given that there are 45 possible pair-wise comparisons (10*9/2), this biased selection (high versus low) is likely to produce a result that tests significant at the 0.05 level (1 out of 20).
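The high-versus-low trap described above can be sketched with a small simulation in Python. This is a rough illustration under stated assumptions, not the Design-Expert analysis: every mill measures the same flour (true value 100), the measurement standard deviation of 1 is made up, and the t critical value for 20 degrees of freedom is hard-coded. The independence formula is only a loose guide, because the 45 comparisons share data.

```python
import random
from math import comb, sqrt
from statistics import mean, variance

MILLS, REPS, ALPHA = 10, 3, 0.05
T_CRIT = 2.086  # two-sided 5% critical value of Student's t with 20 df

pairs = comb(MILLS, 2)  # 45 pairwise comparisons
# Chance of at least one false positive if the 45 tests were independent
fwer_if_independent = 1 - (1 - ALPHA) ** pairs

def high_vs_low_is_significant(rng):
    """One round of testing with no real differences between the mills."""
    data = [[rng.gauss(100, 1) for _ in range(REPS)] for _ in range(MILLS)]
    means = [mean(mill) for mill in data]
    # Pooled within-mill variance (simple average, since group sizes are equal)
    pooled_var = mean(variance(mill) for mill in data)
    std_err = sqrt(2 * pooled_var / REPS)
    # t-test of the highest mean against the lowest mean
    return (max(means) - min(means)) / std_err > T_CRIT

rng = random.Random(1)
trials = 2000
false_alarm_rate = sum(high_vs_low_is_significant(rng) for _ in range(trials)) / trials

print(f"{pairs} comparisons; {fwer_if_independent:.2f} false-positive chance if independent")
print(f"simulated high-versus-low false alarm rate: {false_alarm_rate:.2f}")
```

Either way, the false alarm rate lands far above the nominal 1 in 20.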

This is a sadistical statistical scheme for a Machiavellian manager because of the intimidating false positives (Type I error).  In the simulation pictured, using the random number generator in Design-Expert® software (based on a nominal value of 100), you can see how, with the significance threshold set at 0.05 for the least-significant-difference (LSD) bars (derived from t-testing), the supervisors of Mills 4 and 7 appear to be definitely discrepant.  (Click on the graphic to expand the view.) Shame on them!  Chances are that the next month’s inter-laboratory collaborative testing would cause others to be blamed for random variation.

In the second graph I used a 0.005 significance level – 1/10th as much per the Bonferroni Correction.  That produces a more sensible picture — all the LSD bars overlap, so no one can be fingered for being out of line.
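The arithmetic behind the correction is simple enough to sketch in Python. Dividing by 10 mirrors the 1/10th adjustment described above; dividing by all 45 pairwise comparisons is shown only as a stricter alternative for comparison, and is an assumption about which count to use rather than anything stated in the post.

```python
from math import comb

def bonferroni_alpha(alpha, comparisons):
    """Per-comparison significance level under the Bonferroni correction."""
    return alpha / comparisons

per_mill = bonferroni_alpha(0.05, 10)           # the 1/10th adjustment
per_pair = bonferroni_alpha(0.05, comb(10, 2))  # stricter: all 45 pairs

print(round(per_mill, 4), round(per_pair, 4))
```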

By the way, the overall F-test on this data set produces a p value of 0.63 – not significant.

Since Bonferroni’s death a half-century ago in 1960, much more sophisticated procedures have been developed to correct for multiple comparisons.  Nevertheless, by any measure of comparative value, Bergamo can consider this native son as one of those who significantly stood above most others in terms of his contributions to the world.


Getting straight to the point via the word for today

Today I learned a new aspect of geometry – the “symmedians” of a triangle.  This esoteric term showed up in a review by Wall Street Journal writer Mark Laswell of a book on personal ads.*  Here’s the appeal for a companion that caught my eye:

“Apparently the Three Symmedians aren’t a novelty Bosnian folk troupe.  Rubbish mathematician (M 37).”

This diagram and detailing by Wolfram Mathworld tells you how to draw symmedians on a triangle and locate the symmedian point, which is the “isogonal conjugate” of the centroid.
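For readers who prefer numbers to diagrams, here is a minimal Python sketch that locates both points from a triangle's vertices. It relies on the standard barycentric weights: 1 : 1 : 1 for the centroid and a² : b² : c² (the squared side lengths) for the symmedian point. The 3-4-5 triangle and the function names are illustrative choices.

```python
from math import dist

def weighted_point(vertices, weights):
    """Point with the given barycentric weights on the triangle's vertices."""
    total = sum(weights)
    x = sum(w * vx for w, (vx, _) in zip(weights, vertices)) / total
    y = sum(w * vy for w, (_, vy) in zip(weights, vertices)) / total
    return x, y

def centroid(A, B, C):
    return weighted_point((A, B, C), (1, 1, 1))

def symmedian_point(A, B, C):
    # Side lengths opposite each vertex: a = |BC|, b = |CA|, c = |AB|
    a, b, c = dist(B, C), dist(C, A), dist(A, B)
    return weighted_point((A, B, C), (a * a, b * b, c * c))

triangle = ((0, 0), (4, 0), (0, 3))  # hypothetical 3-4-5 right triangle
print(centroid(*triangle))
print(symmedian_point(*triangle))
```

In an equilateral triangle the two points coincide; the more lopsided the triangle, the further apart they drift.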

It turns out that the centroid is a vital point for mixture design of experiments aimed at optimizing product formulations, as explained in this primer that I co-authored.

So that explains how the symmedian is an interesting ‘counter-point’ for me.  However, I wonder if the self-styled “rubbish mathematician” attracted an isogonal conjugate with his play on geometry.

*(“Lonely Hearts, Like Minds: The eccentric personal ads of ‘romantically awkward eggheads’”)


A journal title that caught my eye today

While reading over the table of contents of the Journal of Agricultural, Biological, and Environmental Statistics that came in the mail today, I came across this intriguing title: “A Graphical Method for Dating Chicks Using Bivariate Body Measurements.”  How you interpret “dating” makes all the difference!
