Slackers rule by a nerd’s law

The of and to a “in” is that it, for you, was with “on”.  Profound?  No.  These are the 14 most commonly used words according to this Vsauce video by Michael Stevens.  He goes on to reveal a “bizarre” pattern where the second word (“of”) appears one-half as often as the first, the third (“and”) one-third as often, and so on; that is, each word’s frequency is proportional to one over its rank.  This phenomenon is known as Zipf’s law after George Kingsley Zipf, the author of Human Behavior and the Principle of Least Effort, published in 1949.
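For anyone who wants to try this at home, here is a minimal Python sketch of the one-over-rank pattern described above.  The function names and the tiny sample sentence are my own illustrative assumptions, not anything from the video; a real test needs a far larger body of text.

```python
from collections import Counter
import re

def rank_frequencies(text):
    """Return (word, count) pairs sorted from most to least frequent."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common()

def zipf_prediction(top_count, rank):
    """Count predicted by Zipf's law: the top word's count divided by rank."""
    return top_count / rank

# Hypothetical sample text, far too short to show the law convincingly.
sample = "the cat and the dog and the bird saw the fish"
ranked = rank_frequencies(sample)
top_count = ranked[0][1]

# Print rank, word, observed count, and the count Zipf's law would predict.
for rank, (word, count) in enumerate(ranked, start=1):
    print(rank, word, count, round(zipf_prediction(top_count, rank), 2))
```

Feeding in a whole book instead of one sentence should give a fairer picture of how closely the observed counts track the predictions.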

“The” leads the list as the most used word, at 6% by the reckoning of Stevens.  Another study, of 743 billion words found on Google Books by Google’s director of research, came up with “the” occurring 7.14 percent of the time.  See this Abacaba video for entertaining and informative bubble charts on word frequencies by use, length and gender.

By the way, I learned a new term from Stevens: “hapax legomenon”—a word that only appears once in a book, that is, at the extreme end of the frequency chart ruled by Zipf’s law.  I am now on the lookout for these rarities so I can stop a casual conversation in its tracks by announcing my discovery of a hapax legomenon. ; )
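Hunting for these rarities by hand is tedious, so here is a rough Python sketch that lists every word appearing exactly once in a text.  The function name and the sample sentence are hypothetical, and the simple pattern I use to split words is an assumption that ignores hyphens and accented letters.

```python
from collections import Counter
import re

def hapax_legomena(text):
    """Return, in alphabetical order, the words that occur exactly once."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return sorted(word for word, n in counts.items() if n == 1)

# Hypothetical sample; "call" and "me" repeat, so they are excluded.
print(hapax_legomena("call me Ishmael call me anything"))
```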

Zipf’s law does not just apply to words.  For example, this mysterious rule also governs the size of cities, as explained by this post on Gizmodo.

The driving force for this regularity in frequency distributions is the tendency for people to put in as little effort as they can, that is, slacking off for the most part.

That is it.

*For bringing this to my attention, I credit Nathaniel Chapman, an undergraduate researcher going for a Master’s degree in chemical engineering at the South Dakota School of Mines and Technology.

Read this as fast as you can but be prepared for a test to follow

Once upon a time I sped through Melville’s lengthy novel “Moby Dick.”  If I recall correctly, it has something to do with a fellow missing one arm who goes chasing after the devilish whale that bit it off.  Nowadays my eyes tire more quickly so I appreciate the advantages of electronic readers such as Kindle that serve me up columns of enlarged text with only a few words per line.  Then I needn’t work too hard looking back and forth.  What really works well is keeping one’s eyes fixed and moving the text past the point of focus.  This is called rapid serial visual presentation, or RSVP.

Recently I got the heads-up from Scientific American* about a smartwatch from Samsung that comes equipped with an RSVP app called Spritz.  They claim that their “Optimal Recognition Point” (ORP) technology increases reading speed on average by half again, from 220 to 330 words per minute.  My only question is how anyone can hold their wrist steady long enough to digest much.  I’d hate to run into anyone walking down the street while absorbed in a particularly fascinating book.  Texting is bad enough.
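To put those reading speeds in perspective, this little Python sketch converts words per minute into the time each word would stay on screen, and into total reading time.  The function names are mine, and the assumption that every word gets equal time is a simplification; I do not know how Spritz actually paces its display.

```python
def ms_per_word(words_per_minute):
    """Milliseconds each word is shown if all words get equal time."""
    return 60_000 / words_per_minute

def minutes_to_read(word_count, words_per_minute):
    """Minutes needed to get through word_count words at a steady pace."""
    return word_count / words_per_minute

# Compare the two speeds quoted above.
for wpm in (220, 330):
    print(wpm, "wpm:", round(ms_per_word(wpm)), "ms per word")
```

By this arithmetic a piece that takes 30 minutes at 220 words per minute would take 20 minutes at 330.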

Then again it’s one thing to see a lot of words and even process them through your head, but yet another thing to comprehend fully what’s been read.  That’s the point of Annie Murphy Paul of The Weekly Wonk in this blog that questions the claims of Spritz.  If I read her correctly (ha ha), she suggests that subject-matter expertise is the real key to effective reading—not just doing it faster, but also with greater comprehension.  Excepting pulp fiction that requires little intelligence (gotta love it!), that makes a lot of sense to me.

Nevertheless, I’m anxious to see RSVP come to Kindle so I can try reading more in the short periods of time that I can free up and/or last before becoming eye-weary.  Maybe then I will re-read “Moby Dick.”  I have this vague recollection of the whale being white, but that just doesn’t seem right.

*See Speed-Reading Reborn for Smartphones, Smartwatches

It’s the letter of the law: Stand down with Calibri

Twenty years ago or so I cajoled the advertising rep from R&D Magazine into lending me a binder filled with several inches of ‘white papers’ of the publisher’s research on readership.  Their data came primarily from A/B (split) testing, which is not very sophisticated but effective for simple comparisons.  One question I resolved was whether to use a serif or sans serif font.  The research showed significant advantages to headlines being sans serif, such as Arial font, and text in serif, for example, Times New Roman.  I’ve stuck with that ever since,* except for the fonts themselves changing over to Calibri and Cambria, the defaults in current versions of Microsoft Office software.

However, now I am set back by this news from the Wall Street Journal that Calibri comes up short (30 percent, to be precise) versus Arial and other common fonts, at least so far as the State of Michigan is concerned.  The inventor of Calibri, Lucas de Groot, justifies his type being smaller because of its high readability per square inch.  Although this seems plausible to me, I would like to see the research supporting this assertion.

For an interesting detailing of fonts (serif versus sans serif and neo-grotesque versus humanist), see this blog by Laurie Israel Think.

*For writings that will likely be read in printed form, that is.  Having seen research like this recent study from the Journal of Cognitive Psychology, I believe that words written in a sans serif font provide a significant advantage for messages read on computer screens, such as blogs and email.  Thus for these purposes I prefer using Calibri exclusively; ditto for presentations projected on screen, for example, using PowerPoint.

Statisticians apply stylometry to identify authors and they invent algorithms that assess essays

My colleague Tryg, who, like me, loves word play, drew my attention to this podcast* that explains how “By Their Words You Shall Know Them.”  I teed it up on my smart phone and listened on my way to work yesterday, a fun way to pass my half hour commute into Minneapolis from my home in Stillwater, Minnesota.  One thing that caught my ear was the early 1960s work by Harvard statistician Frederick Mosteller to pin down who wrote 12 of the 85 Federalist papers published under the pen name “Publius”.  He and colleague David Wallace (University of Chicago) applied Bayes’ theorem to attribute these writings to James Madison (as opposed to Alexander Hamilton).  Mosteller also led the way to today’s reliance on statistics in sports by doing the first known academic analysis of baseball in 1946, concluding that luck rules even in a seven game World Series.  Although the Cardinals beat his home town Red Sox, he did not agree that the best team had actually won.

This analytical dissection of written words has come to be known as “stylometry”.  As computing power increases and algorithms develop, writings are being put to the test.  For example, see this New York Times Digital Domain column from earlier this month that details developments in ‘essay-scoring engines’.  For now the students hold the upper hand on computer-based grading of papers: web-based essay mills can easily throw together fact-laden gibberish that fools the virtual professors.  Teachers easily spot such essays when they skim the results; check out some goofy passages passed along by Duke University professor Dan Ariely in this editorial for the Los Angeles Times.
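One simple ingredient of stylometry is counting how often a writer uses particular small words.  The Python sketch below computes such rates per thousand words.  The function name, the marker words and the sample sentence are all hypothetical choices of mine for illustration; this is not the Mosteller and Wallace method, which went on to weigh such evidence with Bayes’ theorem.

```python
import re

def rate_per_thousand(text, markers):
    """Occurrences of each marker word per 1,000 words of text."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return {marker: 0.0 for marker in markers}
    return {marker: 1000 * words.count(marker) / total for marker in markers}

# Hypothetical sample sentence and marker words, chosen only for illustration.
sample = "Upon reflection, the people rely upon the union while it endures."
rates = rate_per_thousand(sample, ["upon", "while", "whilst"])
print(rates)
```

Comparing these rates between writings of known and disputed authorship is the kind of evidence an attribution study can build on.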

The advent of spell-checking and grammar inspection in word processors has been a boon for writers.  However, passing these tests does not necessarily lead to clear prose.  When I started work as an engineer, the head of our process development group handed me a little booklet by Robert Gunning on “How to Take the Fog Out of Writing”.  He advocated short, active sentences, not the passive, long and pedantic style I’d grown accustomed to from academia.  See how your writing scores for fog using this online tool by Simon Bond.  The quote below scored 20.86.  This paragraph came back with a fog index of 9.152 (up to this clause to be precise!).  Gunning’s score estimates the years of formal education needed to understand text on a first reading.  Thus my writing supposedly can be understood by a 10th grader.  Draw your own conclusions on the readability of our founding fathers.

“As there is a degree of depravity in mankind which requires a certain degree of circumspection and distrust, so there are other qualities in human nature which justify a certain portion of esteem and confidence.”

– Madison, Federalist Papers #55, 346
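For the curious, Gunning’s fog index is commonly given as 0.4 times the sum of the average sentence length and the percentage of “complex” words (those with three or more syllables).  Here is a rough Python sketch of that formula.  The syllable counter is a crude vowel-group heuristic of my own, and I do not know how Simon Bond’s tool counts syllables, so do not expect this to reproduce the scores quoted above.

```python
import re

def count_syllables(word):
    """Crude estimate: count runs of vowels (an assumption, not Gunning's rule)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def fog_index(text):
    """Gunning fog index: 0.4 * (words per sentence + percent complex words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    # Treat any word estimated at three or more syllables as complex.
    complex_words = [w for w in words if count_syllables(w) >= 3]
    average_sentence_length = len(words) / len(sentences)
    percent_complex = 100 * len(complex_words) / len(words)
    return 0.4 * (average_sentence_length + percent_complex)

# Short words in short sentences give a low score.
print(fog_index("The cat sat. The dog ran."))
```

Gunning’s own rules make exceptions for proper nouns, compound words and common suffixes, which this sketch ignores.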

*By online Slate magazine’s Lexicon Valley host Mike Vuolo

Do narrower columns hold up better for body of written work?

Pat Whitcomb came across a very intriguing article in the February ’07 issue of Training & Development magazine that says keeping line lengths shorter makes text easier to read and remember.* (Hmmm – did this asterisk cause you to reflexively glance to the footnote and interrupt your train of thought? Sorry about that!) IBM researchers used an eye-gaze tracking device to compare paragraphs displayed at 40 percent of screen width versus 80 percent. They found that narrow columns were more comprehensible and required less re-reading. However, this came at the cost of “paragraph abandonment,” a shift by readers to skimming completely over segments of text.

What’s telling to me is that the IBM web page on this research (linked above) displays text in a single, wide column. I like this wider style for displaying text on my computer because I can then scroll line-by-line and not be forced to go back up again as required with two columns side-by-side. For example, see this latest edition of the Minnesota Section ASQ newsletter. Notice how it shifts format from one column to two. Observe how you read these. Which do you prefer?

My preference is to print pieces written in two-column format and then use my finger as a guide to maintain focus on the line of text. I picked this up from a business colleague years ago after he took a speed-reading course.

* “The Long and the Short of Learning” by Peter Orton, David Beymer and Daniel Russell.

