BrandSavant

Gaining Insight From Social Media Data

The Confounding Variable Of The Retweet

by Tom Webster on May 4, 2010

Last night I got a particularly hamfisted pitch on Twitter for an automated sentiment tracking service. Apparently, they picked up that I had mentioned a brand, and sent an unsolicited reply that gave me the number of positive and negative mentions of that brand online along with a link to click for more information.

(FYI – I call this sort of solicitation the “Eavesdrop Pitch.” It’s kinda like overhearing me ask someone for the time at a cocktail party, shoving your wrist under my nose, and yelling “WANNA BUY A WATCH?” But I digress.)

I clicked the link, because I am a curious sort, to see what this company’s angle was. First of all, the “brand” that I mentioned was actually “Chicago,” as I recently spent a few days there at SOBCon. I started with the negative mentions, and it became clear fairly quickly that the actual sentiment analysis was not to be trusted, since there was no delineation between tweets that were negative about Chicago, tweets describing something negative that happened while in Chicago, and tweets where “Chicago” was basically a non sequitur to an unrelated negative sentiment. Readers of this blog have heard this story before.

It was the positive mentions, however, that really sparked an epiphany. There were over five times as many positive mentions of Chicago as negative mentions, which on the surface seems to have some significance, right? Except, as I poked around through these positive mentions, here’s what I found:

RT @justinbieber: I need to work on my curveball. Hey Im Canadian we play hockey. Haha. I love Chicago – http://mlb.mlb.com/video/play.j …

If you know anything about The Bieber, you know I didn’t just find a few of these, either. There were thousands of retweets about Justin Bieber throwing out the first pitch at a recent Chicago White Sox game. If you just look at the original tweet, it is clearly positive about Chicago, and no one would quibble with any monitoring platform on that score. What about the retweet, however? Was Bieber’s original message retweeted by thousands because they also love Chicago? Or because they love hockey? Or Canada? Or…The Bieber?

Any and all of the above, surely, which neatly highlights the distinction Liz Strauss often makes between social media monitoring and social media listening. Setting the sentiment angle aside, however, the really interesting thing about this retweet is what it suggests about the brand mention as a base metric. If you use a monitoring platform to at least count brand mentions, you may be tempted to track the rise and fall of this number, associate it with some external events, and make some assumptions. They do this on CNBC every day, when they tell you with a straight face that a particular stock went up or down “on news of higher unemployment” or “on the basis of its recent earnings announcement.”

The truth is, the anchors on CNBC don’t really know why a stock went up or down in the short term, because in the short term the stock market is a random walk. The human mind hates chaos, and naturally wants to impose patterns and order on what it perceives – even where no pattern exists. If you were employed by Chicago Tourism, and you were tracking “brand mentions,” you might think you had a pretty good month. But Chicago’s insertion into Bieber’s tweet – and the concomitant repetition of that tweet – were random occurrences as far as Chicago is concerned. Yes, they certainly got a few more impressions of the brand, but when those impressions fall back down to pre-Bieber levels next month, what did that metric really tell you?

The retweet, then, is a confounding variable in social media monitoring for brand mentions. Just as you cannot gauge sentiment without knowing motive, you also can’t really process a retweeted brand mention (again, except as a gross “impression”) without fully grokking what part of the original message actually sparked the retweet. With retweets as confounding variables, the “metric” of mentions becomes a random walk, which means ascribing a rise or fall in said mentions to external events or internal efforts is potentially ruinous.
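For the spreadsheet-inclined, here is a rough sketch of what it looks like to at least separate retweets from original mentions before reading anything into the total. It is only an illustration: the function name `split_mentions`, the "starts with RT @user" rule, and the sample tweets are all my own assumptions, not how any particular monitoring platform works, and real retweet detection is messier than this.

```python
from collections import Counter


def split_mentions(tweets, brand):
    """Split brand mentions into original tweets and retweets.

    Hypothetical helper: a tweet counts as a retweet only if it starts
    with "RT @someone", which is a simplifying assumption.
    """
    counts = Counter()   # "original" vs. "retweet" mention counts
    sources = Counter()  # which accounts the retweeted mentions came from
    for text in tweets:
        if brand.lower() not in text.lower():
            continue  # no brand mention at all, so skip it
        parts = text.split()
        if len(parts) >= 2 and parts[0] == "RT" and parts[1].startswith("@"):
            counts["retweet"] += 1
            sources[parts[1].rstrip(":").lstrip("@")] += 1
        else:
            counts["original"] += 1
    return counts, sources


# Made-up sample data, for illustration only.
sample = [
    "RT @justinbieber: I love Chicago",
    "RT @justinbieber: I love Chicago",
    "Just landed in Chicago for a conference",
    "Stuck in traffic again in Chicago",
    "Lunch was great today",
]

counts, sources = split_mentions(sample, "Chicago")
print(counts["original"], counts["retweet"])
print(sources.most_common(1))
```

Splitting the count this way doesn't tell you why anyone retweeted. It just shows how much of a spike traces back to a single original message, which is usually the first thing worth knowing before you credit the spike to anything you did.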

I’m not arguing against social media monitoring here – not by a long shot! I use and recommend several platforms. Blaming social media monitoring platforms for recording this metric is akin to blaming Microsoft if you give a crappy PowerPoint presentation. ’Tis a poor potter who blames the clay. What I am suggesting, however, is that you distance yourself a bit from the numbers, and use these tools not as a means to an end – the tabulation of mentions – but as an entry point to a conversation. Listening, not monitoring. It may seem slightly ironic to have a “numbers guy” advising you to take a metric with a grain of salt, but I am not a “numbers guy”; my job is to tell stories with numbers. The random walk of brand mentions almost never tells a story in isolation.

And now, your moment of Zen.

  • http://www.hugoguzman Hugo from Zeta

    Great post! Modern monitoring tools are simply not sophisticated enough to deliver quality sentiment scoring. That’s why it’s crucial to use some good, old-fashioned elbow grease when analyzing social media data.

  • http://www.ddmcd.com Dennis McDonald

    Excellent post.

    I’m less concerned about the inconsistency of “sentiment analysis” than I am about the underlying population numbers and what they represent. As a former sample-survey-and-statistics guy I’m especially sensitive to what populations the measures of mentions and conversations are referring to and whether it’s possible to relate the conversation metrics to known populations.

    Sometimes this is not a problem if you are primarily looking at an entry point into conversations. But if you’re tracking, say, how well a government program or a new product introduction is performing with a particular population or market segment, the lack of a definable and measurable connection between conversation and population can be a big problem.

  • Tom Webster

    Dennis – I’m with you, especially because I am currently a sample-survey-and-statistics guy myself. Yet I am reluctant to dismiss social media monitoring metrics, especially when half the US population is using social networks. Yes, the non-response bias is troubling, but I think the answer is the continued application of sound sampling techniques (properly weighting sources and samples, for instance) within the framework of a structured monitoring program. There’s a there there, though yes, it needs some work :)

  • http://www.ddmcd.com Dennis McDonald

    I’m not dismissing social media monitoring statistics; they can be of great use. My concern has to do with how easy or hard it is to relate the different categories they use to other (e.g., known or documented) population statistics. Having recently evaluated several different monitoring tools I thought they could be doing a better job on things like making their population categorization rules and processes more transparent.

  • http://www.ddmcd.com Dennis McDonald

    Here are some additional concerns about social media metrics, or rather, not with the metrics themselves but with how they might be used in some cases: “OMB’s New Guidance on Social Media is an Improvement – But There’s a Catch” http://www.ddmcd.com/guidance.html

  • http://tomsideas.wordpress.com Tom Messett

    Hi Tom, nice post. I used to work for a monitoring service, so I know exactly what you mean about the RT issue and couldn’t agree more! Anyway, I am really interested to learn more about your thoughts on the “Eavesdrop Pitch.”

