
"The Usual Caveats" of Social Media Research


Not for the first time today, a piece of research was passed along to me by someone warning that it was social media research, "so the usual caveats apply." This phrase is starting to become meaningless to me. Does it have meaning to you? When I look back through various studies I've seen that were derived from social media clickstream data, I often see that they've been shared with "the usual caveats." To me, that phrase has become the new "interesting": the thing you say about a piece of data when you don't know what it means.

Here's the thing about the usual caveats: if you do the work, those caveats are quantifiable. When someone denigrates TV ratings research, for example, because those ratings are based upon a sample of a few hundred viewers who keep diaries/electronic records of what they watch, I'm quick to remind people that this sampling methodology is predictable, operates within known parameters, and is accredited by the Media Rating Council, an independent body established to ensure the validity, reliability and effectiveness of those estimates. And by the way, estimates aren't guesses.

But when people look at social media research and accept it "with the usual caveats," without knowing what those caveats actually are, they risk doing great violence to the truth, and certainly some damage to their brand, if not their psyche. To accept this is to accept that *no research* is generally better than *awful research*. After all, doing *no* research means you merely have a chance of making a poor decision (or no decision); relying on *crap* research guarantees that bad decision.

So, the next time you review, endorse or pass along a piece of social media research with "the usual caveats," consider what some of those caveats might actually be. For the most part, they all concern the same thing: how representative of (and therefore projectable to) some known population the data is. And note: social media data need not be representative, as long as it is predictably, quantifiably and/or repeatably non-representative. So, what are my caveats? Here are a few:

1. What specific sources make up the data, and in what proportions? (BTW, this is where most social media monitoring and research platforms fall flat, and I'm only on #1).

2. Is the data weighted to some known quantity, or did it come straight from the "spigot"?

3. Can we calibrate the data using some known secondary research?

4. Can we calibrate the data with primary research?

5. Can we track the conversational history of the data sources to gauge sample quality?

6. Can I pull a random sample of raw data to manually test against automated descriptive statistics? (A rough sketch of this kind of spot check follows the list.)

7. What do I know about the customers of the brand in question who do not tweet about this product?
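
To make caveat #6 a little more concrete, here is a minimal sketch of one way to run that spot check, assuming a hypothetical export in which each mention carries its text and the tool's automated sentiment label; the field names and toy data are illustrative, not any particular vendor's format. Draw a reproducible random sample of raw mentions, hand-code them, and see how often the human agrees with the machine.

```python
import random

def spot_check(mentions, sample_size=50, seed=42):
    """Draw a reproducible random sample of raw mentions for manual review."""
    random.seed(seed)
    return random.sample(mentions, min(sample_size, len(mentions)))

def agreement_rate(sampled, manual_codes):
    """Share of sampled mentions where the human's code matches the automated label."""
    matches = sum(1 for m in sampled if manual_codes[m["text"]] == m["auto_sentiment"])
    return matches / len(sampled)

# Toy example: three mentions exported from a (hypothetical) monitoring tool.
mentions = [
    {"text": "Love this product", "auto_sentiment": "positive"},
    {"text": "Meh.", "auto_sentiment": "positive"},
    {"text": "Worst purchase ever", "auto_sentiment": "negative"},
]

sampled = spot_check(mentions, sample_size=3)

# Codes a human reviewer assigned after reading each sampled mention.
manual_codes = {
    "Love this product": "positive",
    "Meh.": "neutral",
    "Worst purchase ever": "negative",
}

print(f"Agreement with automated coding: {agreement_rate(sampled, manual_codes):.0%}")
```

If the agreement on that hand-coded sample is poor, the automated topline numbers deserve far less weight than the dashboard implies.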

Now, do you need to know the answers to all of these things to judge the quality of a piece of social media research? No. There is no such thing as perfect information, whether that data comes from surveys or servers. That's why these are caveats. These are the things you need to know that you know or don't know before you know what you know. Or, as one of America's greatest poets once said,

The Unknown

As we know,

There are known knowns.

There are things we know we know.

We also know

There are known unknowns.

That is to say

We know there are some things

We do not know.

But there are also unknown unknowns,

The ones we don't know

We don't know.

I pass this "poem" along with the usual caveats.