Reaching your boiling point

Our marketing director emailed me a motivational video called “212°: the extra degree.”  It says that at this temperature water boils, providing the steam needed to accomplish things.  The idea is that just one degree of heat makes all the difference.


I get it.  However, being a chemical engineer with an interest in being accurate about physical processes, I had to be troublesome by pointing out that here in the Twin Cities, at over 800 feet above sea level, the pressure drops enough that on average the boiling point falls to about 210.5 °F.  But setting that aside and focusing only on the 1 degree between water and steam, one must keep in mind the huge difference between simply heating up water and making it change state, that is, the heat (or, in technical terms, enthalpy) of vaporization.
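Both numbers can be sanity-checked with a quick sketch.  The elevation figure uses the common rule of thumb that water's boiling point drops roughly 1 °F per 550 feet of altitude (an approximation on my part, not a rigorous vapor-pressure calculation), and standard textbook values show why the heat of vaporization dwarfs one degree of heating:

```python
# Rule of thumb (an approximation, not a vapor-pressure model):
# boiling point drops about 1 degree F per 550 ft of elevation.
def boiling_point_f(elevation_ft: float) -> float:
    return 212.0 - elevation_ft / 550.0

print(round(boiling_point_f(800), 1))  # ~210.5 F in the Twin Cities

# Textbook values: warming water one degree vs. vaporizing it.
c_p = 4.18      # J/(g*K), specific heat of liquid water
h_vap = 2260.0  # J/g, heat of vaporization at 100 C
print(round(h_vap / c_p))  # ~541: boiling off water takes ~540x one degree of heating
```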

Thank goodness that our marketing director had become accustomed to working with a bunch of engineers, statisticians and programmers who, when one asks “Could I talk with you for a minute?”, immediately set the timers on their digital watches for precisely 60 seconds (to the nearest one-hundredth).

Coincidentally, while vacationing in Wisconsin’s Door County, I enjoyed a fine demonstration of how hard it can be to bring a quantity of water to a boil.  It’s a tradition there to throw a bunch of fish in one kettle and vegetables in another and cook them up with a wood fire.  However, as I learned and experienced from a somewhat dangerous vantage point, a pitcher of kerosene provides the final heat needed to accomplish the boil-over.  My eyebrows needed a bit of burn-back, so that’s OK.


No Comments

What value for p is right for testing t (or tasting tea)?

Seeking sponsors for his educational website, statistician Keith Bower sent me a sample of his work – this 5 minute podcast on p-values.  I enjoyed the story Keith tells of how Sir Ronald Fisher, who more-or-less invented design of experiments, settled on the p value of 5% as being a benchmark for statistical significance.

This sent me scurrying over to my office bookshelf for The Lady Tasting Tea – a delightful collection of stories* compiled by David Salsburg.**  Page 100 of this book reports Fisher saying that below p of 0.01 one can declare an effect (that is – significance), above 0.2 not (that is – insignificant), and in-between it might be smart to do another experiment.

So it seems that Fisher did some flip-flopping on the issue of what value of p is needed to declare statistical significance.

PS.  One thing that bothers me in any discussion of p-values is that it focuses mainly on estimating the risk in a test of the null hypothesis and almost invariably overlooks the vital issue of power.  For example, see this YouTube video on Understanding the p-value.  It’s quite entertaining and helpful so far as it goes, but the decision to accept the null at p > 0.2 is based on a very small sample size.  Perhaps the potential problem (underweight candy bars), which one could scope out by calculating the appropriate statistical interval (confidence, prediction or tolerance), merits further experimentation to increase the power.  What do you think?

*In the title story, originally told by Sir Ronald Fisher, a Lady claims to have the ability to tell which went into her cup first—the tea or the milk.  Fisher devised a test whereupon the Lady is presented eight cups in random order, four of which are made one way (tea first) and four the other (milk first).  He calculates the odds of correct identification as 1 right way out of 70 possible selections, which falls below the standard 5% probability value generally accepted for statistical significance.  Salsburg reveals on good authority (H. Fairfield Smith–a colleague of Fisher) that the Lady identified all eight cups correctly!
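Fisher's odds can be verified in a couple of lines (a sketch of the combinatorics, not his original notation): choosing which four of the eight cups were made tea-first gives C(8,4) = 70 equally likely selections, so a perfect score by pure guessing has probability 1/70.

```python
from math import comb

ways = comb(8, 4)       # 70 ways to pick the four "tea first" cups
p_guess = 1 / ways      # chance of identifying all eight correctly by luck
print(ways, round(p_guess, 4))  # 70 0.0143 -- well under the 5% benchmark
```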

**Salsburg, who worked for some years as a statistician at a major pharmaceutical company, offers this amusing anecdote from personal experience:

“When I first began to work in the drug industry…one…referred to…uncertainty [as] ‘error.’ One of the senior executives refused to send such a report to the U.S. Food and Drug Administration [FDA]. ‘How can we admit to having error in our data?’ he asked [and]…insisted I find some other way to describe it…I contacted H.F. Smith [who] suggested that I call the line ‘residual’…I mentioned this to other statisticians…and they began to use it…It seems that no one [in the FDA, at least]…will admit to having error.”

No Comments

Ink made to last and fonts that minimize its consumption

Over the past few weeks, I’ve come across a number of interesting inkles about ink.

  1. A team of U.S.-British researchers announced earlier this month that they had deciphered previously illegible scrawling by African explorer David Livingstone, which he made 140 years ago under desperate circumstances using the juice of local berries.  See the image enhancement in this article by New Scientist Tech.  Given the depressing content of Livingstone’s laments, it may be just as well he used ephemeral ink.
  2. The Dead Sea Scrolls, now on exhibit at the Science Museum of Minnesota (see this picture, for example), were written with extremely durable black ink (well over 2000 years old!) composed of lamp black (soot), gum arabic and flaxseed oil.  According to this Numerica entry on the chemistry of ink, a red version was made by substituting cinnabar (mercury sulfide, HgS).  That must have been used by the editor overseeing publication of the Scrolls. ; )
  3. Printer.com suggests that we all save ink by favoring certain fonts over others.  For example, Century Gothic* uses 30 percent less ink than Arial.  As a general rule, the serif fonts do better than the sans serif ones.  An article by Dinesh Ramde of the Associated Press on 4/7/10 reported that a school of 6,500 students, such as the University of Wisconsin-Green Bay, can save up to $10,000 per year by switching to an ink-stingy font.  To really make a statement about their support for Earth, UW-GB ought to go with the “holey” ecofont.  However, rather than resorting to something so ugly, perhaps the greenest course for all concerned would be to prohibit printing altogether and just hand-write whatever is absolutely essential to put on paper (or papyrus).

No Comments

A breadth of fresh error

This weekend’s Wall Street Journal features a review by Stats.org editor Trevor Butterworth of a new book titled Wrong: Why Experts Keep Failing Us – And How to Know When Not to Trust Them.  The book takes aim at scientists, as well as financial wizards, doctors and all others who feel they are almost always right and thus never in doubt.  In fact, it turns out that these experts may be wrong nearly as often as they are right in their assertions.  Butterworth prescribes as a remedy the tools of uncertainty that applied statisticians employ to good effect.

Unfortunately the people funding consultants and researchers do not want to hear any equivocation in stated results.  However, it’s vital that experts convey the possible variability in their findings if we are to gain a true picture of what may, indeed, transpire.

“Error is to be expected and not something to be scorned or obscured.”

— Trevor Butterworth


No Comments

Tasty tidbits gleaned by a news-starved junkie for stats trivia

The June 10th “Views” section of the International Herald Tribune (the global edition of New York Times) offered a few choice bits for me to savor after nearly two weeks traveling abroad without an American newspaper.

  • A report on a June 1-7 telephone survey by Stanford University of 1,000 American adults asking their opinion on global warming.  A pie chart illustrated that about 75% do believe in global warming, 20% do not, and 5% “don’t believe in pie charts”.  I suspect that the author of this editorial, Jon A. Krosnick – a professor of communications at Stanford – meant this last slice to represent those who are undecided, but the graphic designers (Fogleson-Lubliner) figured they’d have some fun.
  • Olivia Judson’s comments on “Galton’s legacy” note that this preeminent British statistician once published a comment in Nature (June 25, 1885, “Measure of Fidget”) that gauged boredom by how much the audience squirmed during particularly wearisome presentations.  I wish I had thought of this “amusing way of passing an otherwise dull” lecture before attending two statistical conferences over the last several weeks.  Based on this 2005 assessment of “Nodding and napping in medical lectures”, the more things change the more they stay the same, at least so far as presentations are concerned.  The only difference is cost.  For example, the authors figure that at a typical 1-hour talk to 100 high-powered professionals, say master statisticians, perhaps as much as $20,000 goes up in snores.

“Nodding was common, but whether in agreement with the speaker or in reverie remains undetermined.”

— Kenneth Rockwood (Dalhousie University), Christopher J. Patterson (McMaster University), David B. Hogan (University of Calgary)


No Comments

Bonferroni of Bergamo

Bonferroni corrected

Uncorrected (random results)

I enjoyed a fine afternoon in the old Città Alta of Bergamo in northern Italy – a city in the sky that the Venetians, at the height of their power as the “most serene republic,” walled off as their westernmost outpost in the 17th century.

In statistical circles this town is most notable for being the birthplace of Carlo Emilio Bonferroni.  You may have heard of the “Bonferroni Correction” – a method that addresses the problem of multiple comparisons.

For example, when I worked for General Mills the head of quality control in Minneapolis would mix up a barrel of flour and split it into 10 samples, carefully sealed in air-tight containers, for each of the mills to test in triplicate for moisture.  At this time I had just learned how to do the t-test for comparing two means.  Fortunately for the various QC supervisors, no one asked me to analyze the results, because I would have simply taken the highest moisture value and compared it to the lowest one.  Given that there are 45 possible pair-wise comparisons (10*9/2), this biased selection (high versus low) is likely to produce a result that tests significant at the 0.05 level (1 out of 20).

This is a sadistical statistical scheme for a Machiavellian manager because of the intimidating false positives (Type I error).  In the simulation pictured, using the random number generator in Design-Expert® software (based on a nominal value of 100), you can see how, with the significance threshold set at 0.05 for the least-significant-difference (LSD) bars (derived from t-testing), the supervisors of Mills 4 and 7 appear to be definitely discrepant.  (Click on the graphic to expand the view.) Shame on them!  Chances are that the next month’s inter-laboratory collaborative testing would cause others to be blamed for random variation.

In the second graph I used a 0.005 significance level – 1/10th as much per the Bonferroni Correction.  That produces a more sensible picture — all the LSD bars overlap, so no one can be fingered for being out of line.
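For readers who want the arithmetic, here is a sketch (my own illustration, not the Design-Expert analysis): with 45 pairwise comparisons each run at 0.05, the chance of at least one false alarm is enormous.  The bound below assumes independent tests, which pairwise comparisons are not, but it conveys the scale; a strict per-comparison Bonferroni adjustment divides by the 45 comparisons, an even lower threshold than dividing by the 10 means.

```python
from math import comb

labs = 10
m = comb(labs, 2)                  # 45 pairwise comparisons among 10 mills
alpha = 0.05

# Upper bound on familywise error if every comparison is tested at alpha
# (assumes independence, which pairwise tests violate, but shows the scale).
fwer_bound = 1 - (1 - alpha) ** m
print(m, round(fwer_bound, 2))     # 45 0.9 -- a false positive is almost certain

# Strict Bonferroni: test each comparison at alpha/m to keep FWER <= alpha.
print(round(alpha / m, 4))         # 0.0011 per-comparison cutoff
```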

By the way, the overall F-test on this data set produces a p value of 0.63 – not significant.

Since Bonferroni’s death a half-century ago in 1960, much more sophisticated procedures have been developed to correct for multiple comparisons.  Nevertheless, by any measure of comparative value, Bergamo can consider this native son as one of those who significantly stood above most others in terms of his contributions to the world.


1 Comment

Priming R&D managers to allow sufficient runs for a well-designed experiment

I am learning a lot this week at the Third European DOE User Meeting in Lucerne, which features many excellent applications of DOE to industrial problems.  Here’s an interesting observation from Pavel Nesladek, a technical expert at the Advanced Mask Technology Center in Dresden, Germany.  He encounters severe pressure to find answers in minimal time at the least possible cost.  Pavel found that whatever number of runs he asked for in a given design of experiment, his manager would press for fewer.  However, he learned that by asking for a prime number of runs, these questions would be preempted, presumably because the request seemed so precise that it must not be tampered with!  For example, Pavel really needed 20 runs for adequate power and resolution in a troubleshooting experiment, so he asked for 23 and got them.  Tricky!  Perhaps you stats-savvy readers who need a certain sample size to accomplish your objective might try this approach.  Prima!
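Just for fun, Pavel's trick can be sketched as code (the helper below is my own invention, not anything he presented): round a needed run count up to the next prime before asking the boss.

```python
def next_prime(n: int) -> int:
    """Smallest prime >= n (trial division is plenty fast at this scale)."""
    def is_prime(k: int) -> bool:
        if k < 2:
            return False
        return all(k % d for d in range(2, int(k ** 0.5) + 1))
    while not is_prime(n):
        n += 1
    return n

print(next_prime(20))  # 23 -- ask for 23 runs when the design needs 20
```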

No Comments

PB&J please, but hold the jelly (and margarine) and put it on toast – a mixture design combined with a categorical factor

My colleague Pat Whitcomb just completed the first teach of Advanced Formulations: Combining Mixture & Process Variables.  It inspired me to develop a virtual experiment for optimizing my perfect peanut butter and jelly (PB&J) sandwich.  This was a staple for me and my six siblings when we were growing up.  Unfortunately, so far as I was concerned, my mother generously slathered margarine on the bread (always white in those days – no whole grains) and then thick layers of peanut butter and jelly (always grape).  As you see* in the response surfaces for overall liking [ 🙁 1-9 🙂 ], I prefer that none of the mixture ingredients (A: Peanut butter, B: Margarine, C: Jelly) be mixed, and I like the bread toasted.  This analysis was produced using the Combined design tab from Design-Expert® software version 8, released by Stat-Ease earlier this year.  I’d be happy to provide the data set, especially for anyone who may be hosting me for a PB&J dinner party. 😉

*Click to enlarge the plots so you can see the legend, etc.


No Comments

Stat-Ease Corporation celebrates 25 years in business

My business partner Pat Whitcomb started up Stat-Ease as a business entity in 1982,* but he did not incorporate it until June of 1985.  So that brings us to 25 years as a corporation this coming month.  This is quite an achievement for a software publisher – not many remain from 1985, I’ll wager, especially ones as specialized as we are.  That’s our saving grace, I figure – sticking to a niche like a clam in a wave-beaten hollow.

According to this report on U.S. Small Business Administration Office of Advocacy statistics from September 2009, only half of all startups survive five years.  This agrees with a decay curve posted by Scott Shane, Professor of Entrepreneurial Studies at Case Western Reserve University, which shows that only about a quarter of companies remain alive after ten years.

I’d say we’ve done very well to make it this far.  Having weathered the recent economic downturn in good shape, I feel positive about continuing on for at least a few years more. 😉

PS. If you’re interested to learn more about us, check out this history of Stat-Ease.

*The year the word “internet” was used for the first time, according to this timeline.  Check out these photos from the 1980s by the Computer History Museum, especially the Osborne “portable” (24 pounds!) PC, with a screen that looks about the size of today’s internet-enabled smart phones.

No Comments

Two-level factorial experimentation might make music for my ears

I am a fan of classical music – it soothes my mind and lifts my spirits.  Maybe I’m deluded, but I swear there’s a Mozart effect* on my brain.  However, a big monkey wrench comes flying in on my blissful state when my stereo speaker (always only one of the two) suddenly goes into a hissy fit. I’ve tried a number of things on a hit-or-miss basis and failed to find the culprit.  At this point I think it’s most likely the receiver itself – a Yamaha RX496.  However, before spending the money to replace it, I’d like to rule out a number of other factors:

  1. Speaker set: A vs. B
  2. Speaker wire: Thin vs. Thick
  3. Source: CD vs. FM-Radio
  4. Speaker: Left vs. Right

It’s very possible that an interaction of two or more factors may be causing the problem, so to cover all bases I need to do all 16 possible combinations (2^4).  But, aside from the work this involves for all the switching around of parts and settings, I am stymied by the failure being so sporadic.
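Enumerating those 16 runs takes only a few lines (a sketch using my own labels for the four factors listed above):

```python
from itertools import product

# Four two-level factors from the troubleshooting list above
factors = {
    "Speaker set": ["A", "B"],
    "Speaker wire": ["Thin", "Thick"],
    "Source": ["CD", "FM-Radio"],
    "Speaker": ["Left", "Right"],
}

runs = list(product(*factors.values()))
print(len(runs))  # 16 combinations = 2^4 full factorial
```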

Anyway, I feel better now having vented this to my blog while listening to some soothing Sunday choir music by the Dale Warland Singers on the local classical radio station.  I’m taking no chances: It’s playing on my backup Panasonic SA-EN25 bookshelf system.

*Vastly over-rated according to this report by the Skeptic’s Dictionary.


No Comments