Posts Tagged statistics

Statistics to make distracted drivers more aware this month

April is now Mathematics and Statistics Awareness Month (formerly it was just math, no stats). It is also Distracted Driving Awareness Month.

Putting these two themes together brings us to data published this month by Zendrive, a San Francisco-based startup that uses smartphone sensors to measure drivers’ behavior. They claim that 90% of collisions are due to human error, of which 1 in 4 stem from phone use while driving.

These statistics are worrying to start with.  But, according to this blog, it gets far worse when you drill down on Zendrive’s 3-month analysis of 3 million anonymous drivers, who made 570 million trips and covered 5.6 billion miles:

  • Drivers used their phones on 88 percent of the trips
  • They spent 3.5 minutes per hour on calls (an enormous amount of time considering that even a few seconds of distraction can create dire consequences)
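
A quick back-of-the-envelope check puts those totals on a per-driver scale. This is only a sketch using the figures quoted above; the variable names are my own, and the results are simple averages, not numbers reported by Zendrive.

```python
# Figures quoted above from Zendrive's 3-month analysis.
drivers = 3_000_000
trips = 570_000_000
miles = 5_600_000_000
call_minutes_per_hour = 3.5

# Simple averages (assumption: every driver and trip counts equally).
trips_per_driver = trips / drivers                    # trips per driver over 3 months
miles_per_trip = miles / trips                        # average trip length in miles
share_of_time_on_calls = call_minutes_per_hour / 60   # fraction of driving time

print(f"Trips per driver: {trips_per_driver:.0f}")
print(f"Miles per trip: {miles_per_trip:.1f}")
print(f"Share of driving time on calls: {share_of_time_on_calls:.1%}")
```

That works out to roughly 190 trips per driver, just under 10 miles per trip, and nearly 6 percent of driving time spent on calls.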

About a third of US states prohibit use of hand-held phones while driving. Does this reduce distraction? The stats posted by Zendrive are not definitive.

It seems to me that hands-free must be far safer. However, this ranking of driving distractions* (benchmarked to plain driving, rated 1) does not provide much support for what is seemingly obvious:

  1. Listening to the radio — 1.21
  2. Listening to a book on tape — 1.75
  3. Talking on a hands-free cellphone — 2.27
  4. Talking with a passenger in the front seat — 2.33
  5. Talking on a hand-held cellphone — 2.45
  6. Interacting with a speech recognition e-mail or text system — 3.06

For all the fuss about talking on the phone, whether hands-free or not, it causes about as much distraction as chatting with a passenger (2.27 hands-free and 2.45 hand-held versus 2.33).

This list does not include texting, which Consumer Reports figures is 23 times more distracting than talking on your cell phone while driving.**

Please avoid any distractions when you drive, especially texting.

*Source: This 10/16/15 Boston Globe OpEd

**Posted here


“Bright line” rules are simple but not very bright

Just the other day a new term came to light for me—a “bright line” rule.  Evidently this is commonplace legal jargon that traces back to at least 1946 according to this language log.  It refers to “a clear, simple, and objective standard which can be applied to judge a situation” by this USLegal.com definition.

I came across the term in this statement* on p-values and statistical significance from the American Statistical Association (ASA):

“Practices that reduce data analysis or scientific inference to mechanical ‘bright-line’ rules (such as ‘p < 0.05’) for justifying scientific claims or conclusions can lead to erroneous beliefs and poor decision-making.”

The ASA goes on to say:

“Researchers should bring many contextual factors into play to derive scientific inferences, including the design of the study, the quality of the measurements, the external evidence for the phenomenon under study, and the validity of assumptions that underlie the data analysis.”

It is hard to argue against the maxim that if the p-value is high, the null will fly; that is, the results cannot be deemed statistically significant.  However, I’ve never bought into 0.05 being the bright-line rule.  It is good to see the ASA dulling down this overly simplistic statistical standard.

I can see the value of “bright line rules” in legal processes, a case in point being the requirement that the Miranda warning be given to advise US citizens of their rights when they are arrested.  However, it is ludicrous to apply such dogmatism to statistics.

*(The American Statistician, v70, #2, May 2016, p131)


Men who have children make more money and live longer–correlation or causation?

Hey guys, if you want to make more money and live longer, have kids.  Anyway, that seems to be the gist of two studies reported this month, at least from my perspective as a father of five.  Here is the scoop:

  • “Men in the top 1 percent distribution level live about 15 years longer than men in the bottom 1 percent on the income distribution in the United States.” – Raj Chetty, professor of economics at Stanford University, quoted in this report by NPR on an article in The Journal of the American Medical Association, “The Association Between Income and Life Expectancy in the United States, 2001-2014,” of which he was lead author.
  • Working fathers enjoy a 21% ‘wage bonus’ over childless colleagues, according to a study by the United Kingdom’s Trades Union Congress reported here.

Before you run off madly making babies, remember that correlation may not be causation.  For example, as reported in this exposé by Slate, statistics indicate that eating ice cream turns people into killers.  Could that really be the scoop?
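
To see how a lurking variable can manufacture a correlation, here is a minimal simulation sketch. Everything in it is invented for illustration: I assume a hypothetical confounder (daily temperature) that drives both ice cream sales and crime, with made-up coefficients and noise levels. Neither series influences the other, yet they come out clearly correlated.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical confounder: one year of daily temperatures (made-up range).
temperature = [rng.uniform(0, 35) for _ in range(365)]

# Both responses depend on temperature plus independent noise;
# ice cream never enters the crime equation, and vice versa.
ice_cream = [2.0 * t + rng.gauss(0, 10) for t in temperature]
crime = [0.5 * t + rng.gauss(0, 5) for t in temperature]

r = pearson(ice_cream, crime)
print(f"Correlation between ice cream and crime: {r:.2f}")
```

The correlation is real, but the only causal arrows in this toy model run from temperature to each of the two outcomes.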


American Statistical Association (ASA) defends itself against P-shooters

With the P value, a fundamental statistic, coming under severe attack, e.g., being banned in early 2015 by the journal Basic and Applied Social Psychology (BASP), the ASA has been compelled to issue an unprecedented press release with guidelines for avoiding misuse of hypothesis testing by scientists claiming significant experimental results.*  “The ASA statement is intended to steer research into a ‘post p<0.05 era,’” said Ron Wasserstein, the ASA’s executive director.

“To pounce on tiny P values and ignore the larger question is to fall prey to the ‘seductive certainty of significance.’”

– Geoff Cumming, emeritus psychologist, La Trobe University, Melbourne, Australia

The ASA statement on “Statistical Significance and P-Values” can be seen here.  It includes 6 guidelines on proper use of this essential tool for assessing research data, beginning with the assertion that “P-values can indicate how incompatible the data are with a specified statistical model.”
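
As a minimal sketch of what that first guideline means, consider a made-up example of my own (not from the ASA statement): 15 heads in 20 coin flips, tested against the model of a fair coin. The function name and the numbers are illustrative assumptions. The p-value is the probability, under that model, of a result at least as extreme as the one observed.

```python
from math import comb

def binomial_upper_tail(successes, trials, p_null=0.5):
    """P(X >= successes) when X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Hypothetical data: 15 heads in 20 flips, null model = fair coin.
p_value = binomial_upper_tail(15, 20)
print(f"One-sided p-value: {p_value:.4f}")
```

The result is about 0.02, which says the data are fairly incompatible with a fair coin. It does not say there is a 2 percent chance that the coin is fair, which is one of the misreadings the ASA statement warns against.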

*See, for example, this Nature article that claims P values, the ‘gold standard’ of statistical validity, are not as reliable as many scientists assume.


Big data puts an end to the reign of statistics

Michael S. Malone of the Wall Street Journal proclaimed last month* that

One of the most extraordinary features of big data is that it signals the end of the reign of statistics.  For 400 years, we’ve been forced to sample complex systems and extrapolate.  Now, with big data, it is possible to measure everything…

Based on what I’ve gathered (admittedly only a small and probably unrepresentative sample), I think this is very unlikely.  Nonetheless, if I were a statistician, I would reposition myself as a “Big Data Scientist”.

*“The Big-Data Future Has Arrived,” 2/22/16.

A Data Sherlock’s best friend: IBM’s Watson

According to this report last week by eWeek, more than 1 million users have registered for IBM’s Watson Analytics service since it launched a little over 1 year ago.  Evidently this artificially intelligent (AI) statistician-in-a-box will enable “citizen data scientists” to decipher patterns in the massive pile of information that now flows in from all quarters.  Current clients featured by eWeek range from a multinational law firm using it to identify new areas of practice to a UK care provider looking for factors that improve worker safety.  IBM itself now operates an enterprise called Watson Health that deciphers medical imagery, and they bought the digital assets of the Weather Company to help businesses defend themselves against Mother Nature.*

Unfortunately for one of the early adopters of Watson—the MD Anderson Cancer Center at University of Texas (UT)—AI’s current IQ still falls far short of initial hopes.

“On Jeopardy! [Where Watson made its name 5 years ago by defeating the human champions] there’s a right answer to the question [actually the right question for the answer], but, in the medical world, there are often just informed decisions.”

— Lynda Chin, chief innovation officer for health affairs, UT

So it seems that, for the moment at least, human statistical Sherlocks will not be replaced by AIs overseen by amateurs when it comes to sleuthing out the culprits for cancer or other highly prized information.  However, Watson might be as capable an assistant as ‘his’ literary namesake.

*1/6/16 Financial Times “Big Read” on “Artificial Intelligence”, p 5 sidebar.


Sine illusion makes peaks and valleys on graphs look overly variable

An article in the latest Journal of Computational and Graphical Statistics (JCGS, Vol 24, Num 4, Dec 2015, p1170) alerted me to a fascinating misperception called the “sine illusion” that causes misinterpretation of trends in variability.  See it nicely illustrated here by vision researcher Michael Bach.  The JCGS authors, Susan VanderPlas and Heike Hofmann, detail “Signs of Sine Illusion—Why We Need to Care” and provide methods to counteract its misleading effects.

If you see a scatter plot that goes up and down with seemingly large scatter at the bends, get out a ruler to get the true perspective.  That is my take-home message for those like me who like to be accurate in their assessments of data.

“The illusion is explained in terms of a perceptual compromise between the vertical extent and the greater overall dimensions of the section at the turn of the sine-wave figure.”

– RH Day and EJ Stecher, “Sine of an illusion,” Perception, 20; 1991, 49–55.


How you can make statistics persuasive for your political cause

For a very unsettling demonstration of statistics being easily biased to whatever result you like, go to this blog by science journalist Christie Aschwanden and chart maker Ritchie King.  Scroll down to the Hack Your Way To Scientific Glory control panel.  There you can play your hunches as to how Democrats versus Republicans affect the U.S. economy.  With a few changes in how you define the factors and measure the response, the results can be manipulated as you like.  Print out the final statistics and use them to beat up your political opponents.  What fun!


Fisher-Yates shuffle for music streaming is perfectly random—too much so for some

The headline “When random is too random” caught my eye when the April issue of Significance, published by The Royal Statistical Society, came across my desk the other day.  It really makes no statistical sense, but the music-streaming service Spotify abandoned the truly random Fisher-Yates shuffle.  The problem with randomization is that it naturally produces repeats in tracks two or even three days in a row and occasionally back-to-back.  Although this happened purely by chance, Spotify consumers complained.

Along similar lines, I have been aggravated by screen savers that randomly show family photos.  It really seems that some get repeated too often even though it’s only by chance.  For details on how Spotify’s software engineer Lukáš Poláček tweaked the Fisher-Yates shuffle to spread songs out more evenly, see this blog post.

“I think Fisher-Yates shuffle is one of the most beautiful random algorithms and it’s amazing that such a complicated problem can be solved in 3 lines of code in some programming languages.  And this is accomplished using the optimal number of operations and optimal amount of randomness.”

– Lukáš Poláček (who nevertheless, due to fickleness of music listeners, tweaked the algorithm to introduce a degree of unrandomization so it would reduce natural clustering)
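
For the curious, here is a minimal sketch of the textbook Fisher-Yates shuffle in Python. It is the standard algorithm only, not Spotify’s tweaked version; the function name and the sample playlist are my own.

```python
import random

def fisher_yates_shuffle(items, rng=random):
    """Return a new list with the items in uniformly random order."""
    shuffled = list(items)  # copy so the caller's list is left untouched
    # Walk from the last position down to the second; swap each position
    # with a randomly chosen position at or before it.
    for i in range(len(shuffled) - 1, 0, -1):
        j = rng.randint(0, i)  # randint includes both endpoints
        shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
    return shuffled

playlist = ["Track A", "Track B", "Track C", "Track D", "Track E"]
print(fisher_yates_shuffle(playlist))
```

The core really is about three lines: a loop, a random pick, and a swap. Every ordering is equally likely, which is exactly why clusters by the same artist turn up now and then.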


Believe it or not–sweet statistics prove that you can lose weight by eating chocolate

A very happy lady munching on a huge candy bar caught my eye in The Times of India on Friday, May 25.  Not the lady, but the chocolate.

After tasting a variety of delectable darks from a chocolatier in Belgium many years ago, I became hooked.  However, I never imagined this addiction would provide a side benefit of weight loss.  It turns out that a clinical trial set up by journalist John Bohannon and two colleagues came up with this finding and showed it to be statistically significant.  This made headlines worldwide.

Unfortunately, at least so far as I’m concerned, the whole study was a hoax based on deliberate application of junk science done to expose phony claims made by the diet industry.

It turns out to be very easy to generate false positive results that favor a dietary supplement.  Simply measure a large number of things on a small group of people.  Something will surely emerge that, taken out of this context, tests statistically significant.  What this will be, whether a reduction in blood pressure, or loss in weight, etc., is completely random.
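
The arithmetic behind this trick is simple. As a rough sketch, assume each measure is tested independently at the usual 0.05 threshold and that the supplement truly does nothing; both are simplifying assumptions of mine, and the function name is hypothetical. The chance that at least one measure comes up “significant” is then one minus the chance that none do.

```python
def chance_of_false_positive(num_measures, alpha=0.05):
    """Probability that at least one of several independent tests
    comes out 'significant' when there is no real effect at all."""
    return 1 - (1 - alpha) ** num_measures

for k in (1, 5, 10, 20):
    print(f"{k:2d} measures: {chance_of_false_positive(k):.0%} chance of a false positive")
```

With 20 measures the odds of a spurious “finding” are close to two in three, even though nothing is going on.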

Read the whole amazing story here.

My thinking is that while Bohannon’s study did not prove that eating chocolate leads to weight loss, the subjects did in fact shed pounds faster than the controls.  That is good enough for me.  Any other studies showing just the opposite results have become irrelevant now; I will pay no attention to them.

Now, having returned from my travel to India, I am going back to dip into my hoard of dark chocolate.
