Archive for category Basic stats & math

Variation in eggs presents perplexing problems for preparation

Today is World Egg Day.

I’m a big fan of eggs—my favorite being ones perfectly poached in an Endurance Stainless Steel Pan. However, the eggs that come from my daughters’ hens vary in size far more per container than store-bought, graded ones. I work around this by adding or subtracting time based on my experience. I really should weigh the eggs and design an experiment to optimize the time.

Coincidentally, I just received the new issue of Chance, published by the American Statistical Association. An article titled “A Physicist and a Statistician Walk into a Bar” caught my eye because one of my Stat-Ease consulting colleagues is a physicist and another is a statistician. I was hoping for a good joke at both of their expense. However, the authors (John Durso and Howard Wainer) go in a completely different direction with an amusing, but educational, story about a hypothetical optimization of soft-boiled eggs.

The problem is that recipes suffer from the “flaw of averages”: smaller eggs get undercooked and bigger ones end up overcooked unless the time is adjusted (as I well know!).

While the physicist sits over a pint of beer and a pad of paper scratching out possible solutions based on partial differential equations related to spheroidal geometry, the statistician assesses data collected on weights versus cooking times. Things get a bit mathematical at this point* (this is an ASA publication, after all), but in the end the statistician determines that weight versus cooking time can be approximated by a quadratic model, which makes sense to the physicist based on the geometry and makeup of an egg.

I took some liberties with the data to simplify things by reducing the number of experimental runs from 41 to 8. Also, based on my experience cooking eggs of varying weights, I increased the variation to a more realistic level. See my hypothetical quadratic fit below in a confidence-banded graph produced by Stat-Ease software.
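For the curious, a quadratic fit like the one graphed above takes only a few lines of Python. The eight weights and times below are made-up numbers for illustration, not the actual experimental data:

```python
import numpy as np

# Hypothetical 8-run data set: egg weight (g) vs soft-boil time (min)
weight = np.array([45, 50, 53, 57, 60, 63, 68, 72], dtype=float)
time_min = np.array([4.0, 4.5, 4.8, 5.2, 5.5, 5.9, 6.5, 7.1])

# Least-squares quadratic model: time = b0 + b1*w + b2*w^2
coeffs = np.polyfit(weight, time_min, deg=2)
model = np.poly1d(coeffs)

# Predict the cooking time for a 55 g egg
print(round(float(model(55.0)), 2))
```

Predicting at an intermediate weight then interpolates the cooking time, which is exactly the work-around I do by eye when sizing up an odd egg.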

Perhaps someday I may build up enough steam to weigh every egg, time the poaching and measure the runniness of the resulting yolks. However, for now I just eat them as they are after being cooked by my assessment of the individual egg-size relative to others in the carton. With some pepper and salt and a piece of toast to soak up any leftover yolk, my poached eggs always hit the spot.

*For example, they apply Tukey’s ladder of variable transformations – a method that works well on single-factor fits and can be related to the shape of the curve being concave or convex, going up or down the powers, respectively. It relates closely to the more versatile Box-Cox plot provided by Stat-Ease software. Using the same data as Durso and Wainer presented, I found that the Box-Cox plot recommended the same transformation as Tukey’s ladder.
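To illustrate that footnote, SciPy’s boxcox function picks the power transformation by maximum likelihood, much as one reads the recommendation off a Box-Cox plot. The data below are simulated right-skewed values, not the Durso and Wainer egg data:

```python
import numpy as np
from scipy import stats

# Simulated positive, right-skewed response (stand-in data only)
rng = np.random.default_rng(42)
y = rng.lognormal(mean=1.0, sigma=0.4, size=50)

# boxcox returns the transformed data and the lambda that maximizes
# the log-likelihood; lambda near 0 points to a log transform and
# lambda near 1 to no transform, echoing Tukey's ladder of powers
y_trans, lam = stats.boxcox(y)
print(round(lam, 2))
```

For lognormal data like this, the estimated lambda lands near zero, agreeing with the log transform one would pick from the ladder.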


Industrial statisticians keeping calm and carrying on with p-values

Last week I attended a special webinar on “Statistical Significance and p-values” presented by the European Network of Business and Industrial Statistics (ENBIS). To my relief, none of the speakers called for abandoning the use of p-values. Though I feel that p-values should not be the sole statistic relied on for deeming results significant or not, when used properly they certainly reduce the risk of pressing ahead with spurious outcomes. It was great to get varying perspectives on this issue.

Here are a couple of fun quotes that I gleaned from this ENBIS event:

  • “Surely, God loves the .06 nearly as much as the .05. Can there be any doubt that God views the strength of evidence for or against the null as a fairly continuous function of the magnitude of p?” – Rosnow, R.L. & Rosenthal, R. “Statistical procedures and the justification of knowledge in psychological science”, American Psychologist, 44 (1989), 1276-1284.
  • “My definition of a statistician is ‘one who prefers true doubts to false certainty’.” – Stephen Senn (Statistical Consultant, Edinburgh, Scotland, UK)

If you have a strong stomach for stats, see this Royal Society review article: The reign of the p-value is over: what alternative analyses could we employ to fill the power vacuum? It includes discussion of an alternative to p values called the “Akaike information criterion” (AIC). This interested me, because, as a measure for goodness of model-fit, Stat-Ease software provides AICc—a version of this statistic that corrects (hence the appendage “c”) for the small sample sizes of industrial experiments (relative to large retrospective scientific studies).
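For reference, for a least-squares fit AICc simply adds a small-sample correction term to AIC. A minimal sketch (the residual sums of squares and run counts below are invented for illustration):

```python
import numpy as np

def aicc(rss: float, n: int, k: int) -> float:
    """AICc for a least-squares fit with n runs and k estimated
    parameters (including the intercept). The correction term
    2k(k+1)/(n-k-1) matters most when n is small relative to k."""
    aic = n * np.log(rss / n) + 2 * k
    return aic + 2 * k * (k + 1) / (n - k - 1)

# Hypothetical 8-run experiment: does a quadratic term (k=3) earn
# its keep over a straight line (k=2)? Lower AICc is better.
linear = aicc(rss=2.0, n=8, k=2)
quadratic = aicc(rss=1.9, n=8, k=3)
print(round(linear, 2), round(quadratic, 2))   # linear wins here
```

With only eight runs, the tiny reduction in residual error does not pay for the extra parameter, which is exactly the kind of overfitting the correction guards against.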


Engineer detects “soul crushing” patterns in “A Million Random Digits”

Randomization provides an essential hedge against time-related lurking variables, such as increasing temperature and humidity. It made all the difference for me succeeding with my first designed experiment on a high-pressure reactor placed outdoors for safety reasons.

Back then I made use of several methods for randomization:

  • Flipping open a telephone directory and reading off the last four digits of listings
  • Pulling numbered slips of paper out of my hard hat (easiest approach)
  • Using a table of random numbers

All of these methods seem quaint given the ubiquity of random-number generators.* However, this past spring, at the height of the pandemic quarantine, software engineer Gary Briggs of Rand combated boredom by bearing down on his company’s landmark 1955 compilation, “A Million Random Digits with 100,000 Normal Deviates”.**

“Rand legend has it that a submarine commander used the book to set unpredictable courses to dodge enemy ships.”

– Wall Street Journal

As reported here by the Wall Street Journal (9/24/20), Briggs discovered “soul crushing” flaws.

No worries, though: Rand promises to remedy the mistakes in the online edition of the book — worth a look if only for the enlightening foreword.
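As an aside on today’s random-number generators: Python’s standard random module is also built on the Mersenne Twister, so randomizing a run order takes only a few lines. A minimal sketch, with made-up factor names and levels for a hypothetical two-factor experiment:

```python
import random

# Planned runs of a hypothetical two-factor experiment:
# three temperatures crossed with two pressures (6 runs)
runs = [(temp, press) for temp in (150, 175, 200) for press in (1, 2)]

# Python's random module uses the Mersenne Twister under the hood;
# a fixed seed makes the randomized order reproducible for the record
rng = random.Random(2020)
order = runs[:]          # copy so the planned list stays intact
rng.shuffle(order)

print(len(order))
```

Omit the seed to get a fresh ordering each time; either way, every planned run appears exactly once in the shuffled sequence.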

* Design-Expert® software generates random run orders via code based on the Mersenne Twister. For a view of leading edge technology, see the report last week (9/21/20) by HPC Wire on IBM, CQC Enable Cloud-based Quantum Random Number Generation.

**For a few good laughs, see these Amazon customer reviews.


Business people taking notice of pushback on p-value

As the headline article for their November 17 Business section, my hometown newspaper, the St. Paul Pioneer Press, picked up an alarming report on p-values by Associated Press (AP). That week I gave a talk to the Minnesota Reliability Consortium*, after which one of the engineers told me that he also read this article and lost some of his faith in the value of statistics.

“One investment analyst reacted by reducing his forecast for peak sales of the drug — by $1 billion. What happened? The number that caused the gasps was 0.059. The audience was looking for something under 0.05.”

– Malcolm Ritter, AP, relaying the reaction to results from a “huge” heart drug study presented this fall by Dr. Scott Solomon of Harvard’s Brigham and Women’s Hospital.

As I noted in this May 1st blog, rather than abandoning p-values, it would pay to simply be far more conservative by reducing the critical value for significance from 0.05 to 0.005. Furthermore, as pointed out by Solomon (the scientist noted in the quote), failing to meet whatever p-value one sets a priori as the threshold may not refute a real benefit—perhaps more data would generate sufficient power to achieve statistical significance.

Rather than using p-values to arbitrarily make a binary pass/fail decision, analysts should use this statistic as a continuous measure of calculated risk for investment. Of course, the amount of risk that can be accepted depends on the rewards that will come if the experimental results turn out to be true.
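The continuous view is easy to demonstrate. The sketch below converts two hypothetical test statistics, chosen to straddle the 0.05 line, into two-sided p-values that differ only trivially in evidential weight:

```python
import math

def p_value_two_sided(z: float) -> float:
    """Two-sided p-value for a standard-normal test statistic,
    via the complementary error function from the math module."""
    return math.erfc(abs(z) / math.sqrt(2))

# Two hypothetical studies with nearly identical evidence:
p1 = p_value_two_sided(1.89)   # about 0.059, "fails" the 0.05 cliff
p2 = p_value_two_sided(1.97)   # about 0.049, "passes" it
print(round(p1, 3), round(p2, 3))
```

An analyst treating p as a sliding measure of risk would judge these two results nearly equivalent, rather than calling one a pass and the other a fail.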

It is a huge mistake to abandon statistics because of p being hacked to come out below 0.05, or p being used to kill projects due to it coming out barely above 0.05. Come on people, we can be smarter than that.

* “Know the SCOR for Multifactor Strategy of Experimentation”


ASA calls for abandoning the declaration of results being “statistically significant”

On March 21 the American Statistical Association (ASA) sent a shocking email to all members: the lead editorial in a special, open-access issue of The American Statistician calls for abandoning the use of “statistically significant”. With irony evidently intended by their italicization, they proclaimed it “a significant day in the history of the ASA and statistics.”

I think the probability of experimenters ignoring ASA’s advice and continuing to say “statistically significant” approaches 100 percent. Out of the myriad suggestions in the 43 articles of The American Statistician special issue, the ones I like best come from statisticians Daniel J. Benjamin and James O. Berger. They propose that, because “p-values are often misinterpreted in ways that lead to overstating the evidence against the null hypothesis”, the threshold for “statistical significance” of novel discoveries be lowered to 0.005. By their reckoning, a p-value between 0.05 and 0.005 should be downgraded to “suggestive,” rather than “significant.”*

It’s a shame that p-hackers, skewered in this xkcd cartoon, undermined the sound application of statistics for filtering out findings unsupported by the data.

*The American Statistician, 2019, Vol. 73, No. S1, 186–191: Statistical Inference in the 21st Century, “Three Recommendations for Improving the Use of p-Values”.


“Data are profoundly dumb”




This is the controversial view of Judea Pearl and Dana Mackenzie expressed in “Mind over Data”—the lead article in the August issue of Significance. In this excerpt from The Book of Why these co-authors explain “how the founders of modern statistics ‘squandered’ the chance to establish the science of causal inference”. They warn against “falsely believing the answers to all scientific questions reside in the data, to be unveiled through clever data-mining tricks.”

“Lucky is he who has been able to understand the cause of things.”

– Virgil (29 BC)

Pearl and Mackenzie are optimistic that the current “Causal Revolution” will lead to far greater understanding of underlying mechanisms. However, by my reckoning, randomized controlled trials remain the gold standard for establishing cause and effect relationships. Only then can the data speak loud and clear.


The hero of zero




Breaking news about nothing: Dating done with the Oxford Radiocarbon Accelerator Unit now puts the invention of the number zero 500 years earlier than previously believed.  As explained in this post by The Guardian, the hero of zero is Indian mathematician Brahmagupta who worked out this pivotal number in 628 AD.  Isn’t that something?

The development of zero in mathematics underpins an incredible range of further work, including the notion of infinity, the modern notion of the vacuum in quantum physics, and some of the deepest questions in cosmology of how the Universe arose – and how it might disappear from existence in some unimaginable future scenario.

– Hannah Devlin, The Guardian


Errors, blunders & lies




David S. Salsburg, author of “The Lady Tasting Tea”*, which I enjoyed greatly, hits the spot again with his new book, “Errors, Blunders & Lies: How to Tell the Difference”. It’s all about a fundamental statistical equation: observation = model + error. The errors, of course, are normal and must be expected. But blunders and lies cannot be tolerated.

The section on errors concludes with my favorite chapter: “Regression and Big Data”. There Salsburg endorses my favorite way to avoid over-fitting of happenstance results—hold back at random 10 percent of the data and see how well these outcomes are predicted by the 90 percent you regress.** Whenever I tried this on manufacturing data it became very clear that our high-powered statistical models worked very well for predicting what happened last month. 😉 They were worthless for seeing into the future.
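Salsburg’s hold-back check is easy to sketch. Everything below is simulated for illustration: fit a line on a random 90 percent of the data, then score predictions on the untouched 10 percent:

```python
import numpy as np

# Made-up "happenstance" data: a true line plus noise
rng = np.random.default_rng(1)
n = 200
x = rng.uniform(0, 10, n)
y = 3.0 + 0.5 * x + rng.normal(0, 1.0, n)

# Randomly hold back 10 percent, train on the remaining 90 percent
idx = rng.permutation(n)
hold, train = idx[: n // 10], idx[n // 10 :]

# Fit a straight line on the training portion only
b1, b0 = np.polyfit(x[train], y[train], deg=1)
pred = b0 + b1 * x[hold]

# Root-mean-square prediction error on the held-out 10 percent
rmse_hold = float(np.sqrt(np.mean((y[hold] - pred) ** 2)))
print(round(rmse_hold, 2))
```

If the hold-out error balloons relative to the training error, the model is fitting last month’s noise rather than any signal that will carry into the future.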

Another personal favorite is the bit on spurious correlations that Italian statistician Carlo Bonferroni*** guarded against, also known as “will-o’-the-wisps” per the founder of Yale’s statistics school, Francis Anscombe.

If you are looking for statistical insights that come without all the dreary mathematical details, this book on “Errors, Blunders & Lies” will be just the ticket. Salsburg concludes with a timely heads-up on the statistical lies caused by “curbstoning” (reported here by the New York Post), which may soon combine with gerrymandering (see my previous post) to create a perfect storm of data tampering in the upcoming census. We’d all do well to sharpen up our savvy on stats!

The old saying is that “figures will not lie,” but a new saying is “liars will figure.” It is our duty, as practical statisticians, to prevent the liar from figuring; in other words, to prevent him from perverting the truth, in the interest of some theory he wishes to establish.

– Carroll D. Wright, U.S. government statistician, speaking to 1889 Convention of Commissioners of Bureaus of Statistics of Labor.

*Based on the story told here.

**An idea attributed to the inventor of modern-day statistics, R. A. Fisher, and endorsed by famed mathematician John Tukey, who suggested the hold-back be 10 percent.

***See my blog on Bonferroni of Bergamo.


Models responsible for whacky weather




Watching Brazilian supermodel Gisele Bundchen sashay across the Olympic stadium in Rio reminded me that, while these fashion plates are really dishy to view, they can be very dippy when it comes to forecasting.  Every time one of our local weather gurus says that their models are disagreeing, I wonder why they would ask someone like Gisele.  What do she and her like know about meteorology?

There really is a connection between fashion and statistical models—the random walk.  However, this movement would be more like that of a drunken man than a fashionably calculated stroll down the catwalk.  For example, see this video by an MIT professor showing 7 willy-nilly paths from a single point.
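For anyone who wants to generate their own willy-nilly paths, here is a minimal sketch of that drunken walk: seven paths from a single starting point, as in the video, though this simple one-dimensional ±1 walk is my own simplification:

```python
import random

def random_walk(steps: int, rng: random.Random) -> list[int]:
    """One-dimensional random walk: each step moves +1 or -1
    with equal probability, starting from position 0."""
    pos, path = 0, [0]
    for _ in range(steps):
        pos += rng.choice((-1, 1))
        path.append(pos)
    return path

# Seven 100-step paths, all launched from the same point
rng = random.Random(0)
paths = [random_walk(100, rng) for _ in range(7)]
print(len(paths), len(paths[0]))
```

Run it a few times with different seeds and the paths fan out unpredictably, which is the whole point: no catwalk choreography, just chance.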

Anyways, I am wandering all over the place with this blog.  Mainly I wanted to draw your attention to the Monte Carlo method for forecasting.  I used this for my MBA thesis in 1980, burning up many minutes of very expensive main-frame computer time in the late ‘70s.  What got me going on this whole Monte Carlo meander is this article from yesterday’s Wall Street Journal.  Check out how the European models did better than the Americans on predicting the path of Hurricane Sandy.  Evidently the Euros are on to something as detailed in this Scientific American report at the end of last year’s hurricane season.

I have a random thought for improving the American models—ask Cindy Crawford.  She graduated as valedictorian of her high school in Illinois and earned a scholarship for chemical engineering at Northwestern University.  Cindy has all the talents to create a convergence of fashion and statistical models.  That would be really sweet.


Big data puts an end to the reign of statistics




Michael S. Malone of the Wall Street Journal proclaimed last month* that

One of the most extraordinary features of big data is that it signals the end of the reign of statistics.  For 400 years, we’ve been forced to sample complex systems and extrapolate.  Now, with big data, it is possible to measure everything…

Based on what I’ve gathered (admittedly only a small and probably unrepresentative sample), I think this is very unlikely.  Nonetheless, if I were a statistician, I would reposition myself as a “Big Data Scientist”.

*”The Big-Data Future Has Arrived”, 2/22/16.
