There’s this confidence game (aka con) that’s been going around. It’s called lying with data.
A friend of mine recently fell for the lying with data con as performed by Jakob Nielsen.
Jakob Nielsen should know better.
Any successful con involves three core components.
Belief: Identifying a mark (that’s con-speak for “target audience”) that really wants to believe something. The next step is finding some clever way to use the mark’s desire to believe something to get something that is actually valuable from them.
Offer: The con artist, in order to begin the con, makes an offer of some sort to the mark. The offer needs to be very tightly aligned with the mark’s desire.
Friction: In addition to the desire of the mark to believe something, there is often something that inhibits double-checking the value of what is being offered. It is somehow difficult or time-consuming or embarrassing for the mark to find out if what is being offered is what it purports to be.
The “art” in con artistry is in appropriately assessing and adjusting the balance between human belief and human friction. A con artist knows how the mark orients in the world and uses that to fold the mark’s decision-making in on itself.
That is how a con artist exchanges another person’s feelings or emotional state for something of value. This is, in effect, a sort of alchemy where something is produced from nothing. Except, of course, for the mark, whose emotions and human nature are used to take something from them–turning something into nothing.
I should note that performing a con is in most cases poor form and in some cases criminal, like fraud. I’m going to describe the “lying with data con” using some examples that aren’t criminal–the con artists themselves may not even know they are performing a con–but that doesn’t make the results any less dangerous.
The Jakob Nielsen con: Bounce rates and design agenda
In this article Jakob Nielsen uses data from a research paper to support his view of how websites should be assembled. He then passes that view on to readers as clear pieces of action that should be taken.
Except that the data doesn’t support his view.
It doesn’t discount his view either, which is important for a con. You’d have to be a really talented con artist to use contrary evidence to support your con.
Let’s dig in.
The mark: time-starved marketing decision-makers
I should confess that I don’t know precisely who the mark is for Nielsen’s Alertbox. I’m making an assumption that it’s marketing decision-makers because that is the group of people that are likely to buy NNg’s services.
I’m also assuming that it’s a “time-starved” group because decision-makers are always feeling time-starved.
If you want to accuse me of building a straw man argument, this would be a good place to do it.
The belief: changing something simple on the website will improve bounce rate
The marks need to believe that if they make a change to their website, the bounce rate will go down. They need to believe this because making the bounce rate go down is part of how they look good to their colleagues and/or related to how they make money. They want to be in control of these things.
This is a function of human social behavior around establishing personal status.
Please note that I’m not saying that this is wrong or bad or anything like that. It’s simply the state of mind for the mark.
The offer: Here’s what you should do to lower the bounce rate on your site…
Jakob Nielsen’s post leads off with this:
“Summary: Users often leave Web pages in 10-20 seconds, but pages with a clear value proposition can hold people’s attention for much longer because visit-durations follow a negative Weibull distribution.”
The article then continues on to elaborate on this theme. The summary is perfect, though: it accurately summarizes the entire con. So you can get the short con in the summary or the long con in the entire post. Either way you get the same con.
There’s a ton of craft in this.
It starts with an accepted truth: many people leave websites without staying very long.
Then Jakob Nielsen offers a clear instruction for what the mark should do: make pages with a clear value proposition.
The friction: time, money, self-esteem
In the last step, this con provides a boatload of friction to prevent verification on whether the recommended course of action is worthwhile or not. There’s friction in simply understanding what’s being said: “visit-durations follow a negative Weibull distribution.”
Let’s make special note of the various types of friction being laid onto the mark.
The mark may be embarrassed that they don’t know what a Weibull distribution is and just skip the fact-check. Alternately, the mark may be proud that they know what a Weibull distribution is and assume that the data does in fact support Nielsen’s recommendation.
Maybe they can’t afford the time to read an eight page research paper.
Maybe they can’t afford to pay $15 to download the article.
Maybe they don’t understand the specialized language, math, and statistics used in the cited research paper.
This con, like most, uses a combination of human self-esteem behavior (pride and embarrassment) and poverty (of time, money, and understanding) to discourage anyone from cross-referencing Jakob Nielsen’s interpretation of a research paper.
A less artful way of pulling this con would be to simply say “Hey don’t got time and money to read this complicated research? You don’t have to! I read it for you and it says that you should put your value proposition on your page!”
You should read the research for yourself because it is quite fascinating. When you read it you will notice that nowhere does it mention anything about a value proposition or even the kind of content that induces people to stay on a page for more than 20 seconds. The closest it comes is measuring the occurrence of words on a “most used 1000 words” list.
In fact, in section 4.2 of their work, the authors of the paper notice that there is a difference between how visitors treat pages based on what dmoz category a page is in. They then state: “This observation leads to a hypothesis that negative aging is more common on less-entertaining pages than on fun pages, which in turn suggests that people tend to screen less-entertaining pages more harshly.”
Note how easily and confidently they say that. It’s just a guess, a simple thought. They don’t say “the data supports the view that…” or “because of this data we think you should …” They’re clean and honest about it. They have a hypothesis.
Also notice how testing their hypothesis might involve making a page more entertaining and seeing if dwell time increases as a result. Adding a clear value proposition doesn’t sound like making a page more entertaining but that’s just my opinion.
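For anyone who wants to lower the Weibull friction a little, here is a minimal sketch of what “negative aging” means. A Weibull distribution has a shape parameter; when the shape is below 1, the hazard rate (loosely, the chance of leaving in the next instant, given that you are still on the page) falls as time passes. The scale and shape values below are hypothetical, picked only for illustration, and are not taken from the paper.

```python
def weibull_hazard(t, shape, scale):
    """Hazard rate of a Weibull distribution at time t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def aging(shape):
    """Label the aging behavior implied by the Weibull shape parameter."""
    if shape < 1:
        return "negative aging (hazard falls over time)"
    if shape > 1:
        return "positive aging (hazard rises over time)"
    return "no aging (hazard is constant)"


# Hypothetical parameters, chosen only for illustration.
SCALE_SECONDS = 30.0
CHECKPOINTS = (5, 10, 20, 40)  # seconds on the page

for shape in (0.5, 1.0, 1.5):
    rates = [weibull_hazard(t, shape, SCALE_SECONDS) for t in CHECKPOINTS]
    formatted = ", ".join(f"{r:.4f}" for r in rates)
    print(f"shape={shape}: {aging(shape)} -> {formatted}")
```

Notice what this does and does not tell you. It describes when people leave. It says nothing about why they stay, and nothing at all about value propositions.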
It may be hard to spot the payoff in this con. That’s because in some ways, there isn’t one. Jakob Nielsen doesn’t collect money from marks who decide, after participating in the “lying with data” con, to include a clear value proposition.
Sure he gets to say “include a clear value proposition” backed up with some fancy sciencey-sounding stuff. But he could have just as easily said: A lot of people bounce from websites, if you want them to have a chance to see your value proposition make sure it’s very clear. He wouldn’t need the research paper to back that up even (which is good, because the research paper doesn’t back that up). It would still carry weight: “Jakob Nielsen says include a clear value proposition on your web page” is enough for many people to simply add a clear value proposition.
Maybe the only thing he gains is people’s attention for a little while. Maybe he gains their gratitude in having “saved” them the time of reading the actual research. Perhaps he somehow trades that up for consulting work or increased reputation or something.
Another example of the lying with data con
It’s not just Jakob Nielsen. Variations of the “lying with data” con get played every day. It feels to me like it’s increasing but I don’t have any data to back that up.
It was played just the other day over on Fast Company when they published an article that claims: “The data doesn’t lie. The web is shifting from text to video.”
The author then presents four stats, all of which talk about video but none of which talks about text. How can you compare–“shifting from N to Q”–by just showing one side of the metric?
The first stat talks about the volume of video that transfers across the internet each minute. It makes no mention of text data.
The second stat is what a VP of YouTube said about what he thinks will happen in the future during a keynote at a consumer electronics trade show. This doesn’t mean it won’t come true. But it isn’t data yet–it’s still a hypothesis.
The third stat is more hypothesis, this time from Cisco. It talks about percentage of data transfer volume of video. How can you use data volume to compare text and video when 1GB of video might be about 5 minutes vs 1GB of text might be about 500,000 pages (which would likely take longer than 5 minutes to read)? If the author is actually talking about data transfer, sure. But he isn’t, he’s talking about people.
The fourth stat is a repeat of the third stat, but limited just to video delivered to the TV.
As you can see, none of this data supports the idea that the web is shifting from text to video. I’m not saying video isn’t on the rise or that video isn’t important. I’m simply saying that the data that is presented doesn’t say anything about how video is doing in relation to text.
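The data-volume problem with the third stat can be put in rough numbers. This is a back-of-the-envelope sketch, not a measurement: the 5 minutes and 500,000 pages per gigabyte are the ballpark figures I used above, and the one minute of reading per page is a purely hypothetical assumption.

```python
# Ballpark figures from the text above.
VIDEO_MINUTES_PER_GB = 5
TEXT_PAGES_PER_GB = 500_000

# Assumption: hypothetical reading speed, for illustration only.
MINUTES_PER_PAGE = 1.0


def attention_minutes(gigabytes, minutes_per_gb):
    """Minutes of human attention represented by a volume of data."""
    return gigabytes * minutes_per_gb


video_minutes = attention_minutes(1, VIDEO_MINUTES_PER_GB)
text_minutes = attention_minutes(1, TEXT_PAGES_PER_GB * MINUTES_PER_PAGE)
ratio = text_minutes / video_minutes

print(f"1 GB of video: about {video_minutes:,.0f} minutes of attention")
print(f"1 GB of text:  about {text_minutes:,.0f} minutes of attention")
print(f"Text carries roughly {ratio:,.0f}x more attention per gigabyte")
```

Change the reading speed however you like. The gap stays enormous, which is why bytes transferred are a poor stand-in for what people are actually doing.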
The author could simply have stated his feeling about video and skipped all the sciencey-ness and there’d be no problem.
But for some reason he ran the lying with data con.
The art of the con in the attention economy
In the old analog world it was easier to spot a con underway because something would physically change hands. Perhaps the “attention economy” provides an entirely new medium for con artists to exploit, deriving value in secondary or tangential markets.
But whether or not something is gained by the con artist via “lying with data” something is lost. We, as a society, have less meaningful conversations.
We have less meaningful conversations because the lying with data con exchanges actual observations about the world for someone’s opinion. The lying with data con discourages us from knowing more about the world and how people behave.
This loss is a dangerous one. Sure, when we’re talking about something as trivial as whether people stay on a website for 20 seconds or not it doesn’t seem like that big of a deal. But it’s a dangerous pattern.
Lying with data disconnects. It disconnects decision-makers from understanding what is really going on in the world. Lying with data short-circuits the ability of a decision-maker to make good decisions.
When poor decisions are made based on the lying with data con the response is often “well you’re not doing it right” or some variation. This further disconnects the decision-maker from reality by focusing his or her attention inward on what he or she is not doing right.
Disconnecting from reality and folding inward is a recipe for collapse.
Con artistry in a socially-networked system
If someone who knows how to read and understand data science, like Jakob Nielsen, is ok with running the lying with data con then what happens as we move down the scale of experience working with this stuff?
In a socially connected attention economy, what happens when lying with data spreads? I became aware of Nielsen’s post over a year after it was written, when I caught a friend of mine unwittingly running the lying with data con by citing Nielsen.
When I called my friend out on it, his response was that I should look at Jakob Nielsen’s LinkedIn profile. I’m not one to settle for an appeal to authority and I am very aware of Nielsen’s impressive experience.
I even agree with many of Nielsen’s opinions. I’m grateful for his work in standardizing approaches to UX (even though I get irritated by the blue underline thing now and then).
The problem is that when someone respected, who should know better, plays the lying with data con it can spread. It becomes acceptable. We start using data to support preconceived notions instead of learning more about the world. The world, however, eventually intervenes.
In addition to the bad example that Jakob Nielsen or Fast Company set when they run the lying with data con, there’s the danger of elision.
Social tools are increasingly truncating thoughts and experiences to the shortest and smallest units possible. What gets truncated along the way is context. It’s perfect for running the lying with data con–no time or space to provide the details.
This elision of time and space can be harmful whether it is exploited on purpose (for a book length exposition of this I highly recommend Ryan Holiday’s Trust Me I’m Lying) or naively. This is part of why people like Jakob Nielsen or the editors at Fast Company ought to know better.
When marks go off after having been played by the lying with data con they often don’t even realize it. They simply say something like “I read in Fast Company that the web is shifting to video and they had data to back it up.” Or maybe they say “You gotta put a clear value proposition on your page if you want people to bounce less because Jakob Nielsen said it and had data to back it up.”
They elide the details of the data itself. The next marks (the grand-marks and great-grand-marks of the original con) just accept it based on authority and trust. Over time, the actual observation of the real world is lost and all that is left is the con.
In the short-format social media system, any reference to the data itself may disappear. If the data was poorly aligned with the initial statements it leaves people who trust the source/authority of the speaker with wrong observations of the world.
For example, in early 2012 I saw a presentation in which the speaker stated to a large, paid audience of real estate professionals that the number one reason that real estate agents who make more than $100k per year will leave their brokerages is because their broker doesn’t offer adequate technology support.
He told this audience that his data came from 1500+ survey responses.
But there were also things about this data that he didn’t share with his audience. It wasn’t 1500+ responses; it was 1345 responses, as reported by his sponsoring organization. Moreover, the group which was making $100k or more totaled 358 survey responses. Still a healthy amount, but not 1500+.
In addition, the survey was only distributed online via a website which is focused on technology tips and also to a Facebook group which offers tech support for real estate agents. I’m sure I’m not the only one who suspects that there may be a bias in this sample.
After the presentation the speaker asked me what I thought of his presentation and I was less than enthusiastic, citing the concerns I’ve noted above. It would have been more accurate to say “the number one reason 358 agents that say they make more than $100k/year and also read our tech blog or use our Facebook tech support group would leave their broker is because of technology.” That would have been true and still helpful even.
The problem of course is that this sort of context is a bit of a mouthful. So the context gets elided away.
Once this sort of data hits a social network like Twitter–where words are at a premium–you end up with simply: “#1 reason agents will leave their broker is lack of technology.”
Through ongoing elision of this sort, the friction to overcome the lying with data con is raised considerably.
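Going back to the survey numbers, here is a small sketch of how far the stated sample sits from the sample that actually backs the claim. The three counts come from the story above (treating “1500+” as 1500). The variable names are mine, and nothing here corrects for the sampling bias, which is a separate problem.

```python
CLAIMED_RESPONSES = 1500      # "1500+" as stated to the audience
REPORTED_RESPONSES = 1345     # as reported by the sponsoring organization
HIGH_EARNER_RESPONSES = 358   # respondents making $100k or more

# How many responses the stage claim adds to the reported total.
overstatement = CLAIMED_RESPONSES - REPORTED_RESPONSES

# Share of each total that actually supports the $100k+ claim.
share_of_reported = HIGH_EARNER_RESPONSES / REPORTED_RESPONSES
share_of_claimed = HIGH_EARNER_RESPONSES / CLAIMED_RESPONSES

print(f"Claimed minus reported responses: {overstatement}")
print(f"High earners as a share of reported responses: {share_of_reported:.1%}")
print(f"High earners as a share of claimed responses: {share_of_claimed:.1%}")
```

Roughly a quarter of the reported responses speak to the claim about high earners. That is the context that disappears by the time the claim fits in a tweet.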
What could be done?
In the physical world there are a number of things marks can do to keep from getting conned. We can tie our wallets to our belt loops. We can avoid giving out our bank account numbers to Nigerian princes. We can slow things down a bit if we’re getting confused. These are good habits to avoid being conned in the real world.
In the attention economy we might want to develop similar habits. Perhaps make a habit of investigating alternate interpretations of the data when presented with what we suspect to be a lying with data con.
For example, when looking at Jakob Nielsen’s lying with data con we might ask “Is it possible that people bounce when they encounter a clear value proposition that doesn’t meet their needs?” If there isn’t sufficient information then we know we simply can’t accept the data. Perhaps the con artist’s statement is simply their opinion and we can evaluate it in that light. Importantly, we might think twice about spreading the particular con to the people who trust us.
In the case of the Fast Company lying with data con we might ask how the rise of HD video, which uses more data per minute than regular video, might be having some impact on the rise of video data transfer volume. As with the Nielsen example, this might lead us to be careful about spreading the potential con to people who trust us.
When we encounter data which has suffered a significant elision of context, as with the reasons an agent may choose to leave their broker, we can ask for that context. When the context arrives we can see if it aligns with the statement or not. If the context doesn’t arrive then we know to disregard the information as being, most likely, a lying with data con.
In all of the above situations the statements made by the con artists may be true or helpful even. But they aren’t backed by the data they’ve presented. The statements are still hypotheses. As such they can be accepted as hypotheses or tested. Then we’d know more about the world, not less.
Lying with data is a problem. It leads to distrust of individuals, brands, and science.