Tom Webster, writing and speaking

The Five Biggest Challenges for Social Media Monitoring on Twitter

Added on by Tom Webster.

I've heard a lot of speakers at conferences (both social media consultants and brand managers) talk about how they are using Twitter to listen and respond to mentions of their brands. There's no downside to this--if someone tweets that they have a problem with your product, and you respond to that problem, you might just have made a new fan--maybe more than one, depending on who else was listening. There are obvious tactical benefits to monitoring brands on Twitter.

What is less clear, however, is the strategic benefit of mining Twitter data--in particular, using Twitter for market research, a use I have often heard cited by Twitter enthusiasts. I'd like to think that someday organic monitoring of Twitter might truly supplement--and even replace--more artificial and intrusive means of gauging consumer opinion. For that to happen, however, market researchers and social media monitoring services need to address these five concerns, in increasing order of difficulty:

1. Representative Sampling. Not much more needs to be said about this--though Pew reports that nearly one in five of us are "tweeting," the reality is that it's probably more like one in ten (owing to problems with the wording of the Pew question). The non-response bias on current Twitter data is enormous. If Twitter really becomes a mainstream communication channel, however, this will sort itself out. The key for Twitter enthusiasts is to realize that they are not necessarily "ahead of the curve." They might just be different. Again, I think this issue will diminish over time.

2. Language Issues. Nuances are often lost in 140 characters--there can be dozens of shades of dissatisfaction with products and services. The reductio ad absurdum of Twitter compresses most of them to "FTW!" and "FAIL." Over time, sentiment monitoring will continue to get better and better, but there is a big difference between a coffee shop FAIL of having to wait more than five minutes in line and a FAIL of finding a cigarette butt in your coffee. Tactically, a customer service rep can address both on Twitter, but there isn't a computer on earth that can yet tell me the difference between those two FAILs.

3. The Retweet Problem. Now we are getting into some thornier issues. If I read a series of blog posts about a brand, I can use backlinks, Google rank, Technorati and other tools as a proxy to guesstimate the authority of a given blog and weigh the results accordingly. With Twitter, it isn't so easy. Followers are a poor proxy (is Ashton Kutcher more authoritative than Arnold Schwarzenegger? Is Chris Brogan less authoritative than Britney Spears? On what topics?); lists are perhaps more promising but the math on those potential algorithms makes my head hurt. Today we often use the retweet, even if only subconsciously, as a proxy for credibility. When we see something retweeted multiple times, our brains process this as "man, I see that opinion everywhere--it must be true!" when it is entirely possible that the retweeter(s) didn't even go back and check the source material referenced in the original tweet. We also don't know why a user retweeted something, which brings me to the currently insurmountable number four...

4. Motive. Twitter monitoring is great for counting the mentions of a brand, and potentially--someday--even for measuring observable sentiment. But what motivates someone to tweet negatively--or positively--about a product or service? I'm not talking about sponsored tweets and disclosure issues; I'm talking about discerning the motives behind a complaint, or a word of praise.

5. Contextualizing the User. This is potentially the stickiest wicket, and the one that will give future researchers the biggest headaches when trying to parse Twitter longitudinally. If I post a cranky tweet about a car, a social media monitoring service will duly record this as "one mention" and might even go so far as to label it as negative. But what if I am @crankypants and all of my tweets are cranky? What if I complain a lot, and my particular complaint about a car actually isn't so bad, when compared to my long history of #complaints? Or what if my positive mention of a product is one of thousands of similarly Pollyanna-ish proclamations about products I am hoping to get free samples of? We can track the sentiment of a brand over time, but how do we contextualize the user to determine if my "FAIL" is better or worse than your "FAIL"? This one issue will dog Twitter research for years--maybe decades--and I don't think there is a computer "alive" that can solve these last two issues satisfactorily to come up with any kind of system for accurately categorizing sentiment over time.
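
To make the @crankypants question a little more concrete, here is a rough Python sketch of what "contextualizing the user" could look like in the simplest possible case. It assumes something that does not reliably exist today: a trustworthy sentiment score between -1 and 1 for every tweet. The function name, the sample users, and the scores are all invented for illustration, and this is only one naive way to frame the idea, not a solution to the problem described above.

```python
from statistics import mean, pstdev

def relative_sentiment(history, new_score):
    """Compare a new tweet's score with the user's own track record.

    `history` is a list of hypothetical per-tweet sentiment scores
    (-1 = very negative, 1 = very positive). The result is the distance
    of `new_score` from the user's average, in standard deviations.
    Returns None when there is too little history to form a baseline.
    """
    if len(history) < 2:
        return None
    spread = pstdev(history)
    if spread == 0:
        return None  # every past tweet scored the same; no scale to compare against
    return (new_score - mean(history)) / spread

# Two made-up users who post the "same" negative tweet about a car.
crankypants = [-0.8, -0.6, -0.7, -0.9, -0.5]   # always cranky
usually_happy = [0.6, 0.8, 0.7, 0.5, 0.9]      # rarely complains

car_tweet = -0.7
print(relative_sentiment(crankypants, car_tweet))    # close to zero: business as usual
print(relative_sentiment(usually_happy, car_tweet))  # strongly negative: far outside their norm
```

Both tweets would be logged as "one negative mention," yet measured against each user's own baseline they tell very different stories. Note that the sketch says nothing about motive, and it leans entirely on the per-tweet scores being right in the first place.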

So, what do you think? What am I missing?