January 4, 2012

BIG DATA – Most Popular Varieties of the 2011 Holidays

We process BIG DATA for the wine industry. Over 1 million online conversations per day actually. And for three years now we have collected these conversations, put them in our data store, and presented them to the clients who use our social listening platform. There is no bigger focus group, survey, or mass of data for the wine industry about wine expressions and sentiment than VinTank’s. So far we have analyzed over 250 million conversations (we average about a million a day) with over 25 million being high-quality (brand, variety, region) across 9 million social wine customers. Here are a few examples of recent data we have collected:

What you see is 1.1 million conversations about Chardonnay across 5 broad social categories (shown above as weekly totals). That’s a whole lot of information about people who drink and talk socially about Chardonnay.

To look at Pinot Noir you can see almost 500K conversations. Imagine what a winery or brand manager could learn from analyzing these conversations and customers . . .

We thought the same thing and wanted to do a little deeper analysis. To better understand which of the mainstream varieties were being discussed through the holidays we built a real-time tracking dashboard. We started by tracking the four major varieties (Cabernet Sauvignon, Chardonnay, Pinot Noir, and Merlot) and then added five more (Malbec, Riesling, Sauvignon Blanc, Shiraz, Zinfandel) a couple days later on Dec 26th. The system automatically deals with synonyms, misspellings, and abbreviations for each variety.
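
As a rough illustration of how synonyms, misspellings, and abbreviations can be folded into one canonical variety, here is a minimal Python sketch. It is not VinTank's actual implementation: the alias table, the `normalize_variety` name, and the 0.8 similarity cutoff are all assumptions made for the example. The only fact taken from this thread is that Syrah and Shiraz are counted together.

```python
import difflib
import re

# Hypothetical alias table (an assumption for illustration).
# Keys are lowercase spellings; values are canonical variety names.
ALIASES = {
    "cabernet sauvignon": "Cabernet Sauvignon",
    "cab sauv": "Cabernet Sauvignon",
    "chardonnay": "Chardonnay",
    "chard": "Chardonnay",
    "pinot noir": "Pinot Noir",
    "merlot": "Merlot",
    "shiraz": "Shiraz",
    "syrah": "Shiraz",  # Syrah is treated as an alias of Shiraz
}


def normalize_variety(term, cutoff=0.8):
    """Return the canonical variety for a term, or None if nothing matches."""
    # Lowercase and collapse whitespace so "  Cab   Sauv " equals "cab sauv".
    key = re.sub(r"\s+", " ", term.strip().lower())
    if key in ALIASES:
        return ALIASES[key]
    # Fall back to fuzzy matching to catch simple misspellings.
    close = difflib.get_close_matches(key, list(ALIASES), n=1, cutoff=cutoff)
    return ALIASES[close[0]] if close else None
```

With this table, `normalize_variety("chardonay")` and `normalize_variety("chard")` both resolve to Chardonnay, while an unrelated word such as "taxi" returns `None`.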

The winner, by a landslide, was Chardonnay. Also, as you can see, Merlot is by no means dead in the social ecosphere, placing a strong #2 and doing a fine job bucking the Sideways effect of Pinot Noir, which places #3. And mired mid-pack at #4 is the king of grapes, Cabernet Sauvignon.

This is just the first of many future reports where we will be surfacing interesting macro trends we see in our industry. Can’t wait to share what’s next. Do you have a report you want to see?  Let us know in the comments.

  • http://www.twistedoak.com/ El Jefe

    Did you also track “Syrah”? Also, it might be a lot more interesting to see up and coming varieties like Tempranillo rather than watching the same old workhorses flat-lining their way along…;)

  • Nick

    How do you handle “cab” in terms of knowing when it’s cabernet vs. a taxi?

  • Marcia

    Dang! No Grenache. Always the bridesmaid at the end of the line….

  • http://twitter.com/james_jory James Jory

    You know, you’re right Marcia. I just added Grenache/Garnacha to the variety dashboard. We’ll see how it fares.

  • http://twitter.com/james_jory James Jory

    This is one of the trickier aspects of analyzing wine conversations, especially in tweets, where people often come up with very creative wording. Without getting into too much detail, the system uses a scoring algorithm that applies weightings and probabilities based on surrounding text and user patterns.
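
To make the idea of context scoring concrete, here is a toy Python sketch of telling a wine "cab" from a taxi "cab" using weighted cue words in the surrounding text. Everything in it is hypothetical: the cue words, their weights, the threshold, and the function names are assumptions for illustration, and it ignores the user-pattern signals mentioned above. It should not be read as the scoring algorithm the platform actually uses.

```python
import re

# Hypothetical cue weights (assumptions, not real system values).
# Wine cues push the score up; taxi cues pull it down.
WINE_CUES = {"napa": 2.0, "glass": 1.5, "bottle": 1.5, "vintage": 2.0,
             "tannins": 2.5, "wine": 2.0, "steak": 1.0}
TAXI_CUES = {"driver": 2.5, "airport": 2.0, "fare": 2.0, "hail": 2.0,
             "uber": 2.0, "ride": 1.5}


def wine_score(text):
    """Score the wine sense of 'cab'; None if 'cab' is not mentioned."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if "cab" not in tokens:
        return None
    score = 0.0
    for token in tokens:
        score += WINE_CUES.get(token, 0.0)
        score -= TAXI_CUES.get(token, 0.0)
    return score


def is_wine_mention(text, threshold=1.0):
    """Treat 'cab' as Cabernet only when the context score clears the threshold."""
    score = wine_score(text)
    return score is not None and score >= threshold
```

Under these made-up weights, "Opened a Napa cab with steak tonight" counts as a wine mention, while "Took a cab to the airport, the driver was great" does not. A bare "cab" with no context scores zero and is left out, which favors precision over recall.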

  • http://twitter.com/james_jory James Jory

    Yes, Syrah is included with the Shiraz statistics (Syrah & Shiraz are aliases for the purposes of analysis in this case).

    I just knew you were going to ask about Tempranillo! Just added it to the dashboard.

  • http://arnoldwaldstein.com awaldstein

    Interesting and thanks for sharing.

    US produced wines are varietal by nomenclature mostly. Not so in Europe.

    How does your data handle mentions of Bordeaux or Rioja? Or is the data US market and US producer centric?
    I’m fascinated by this. Most interested if you map this against sales numbers. By default, they should be the same or am I missing some key point?

  • http://www.facebook.com/pmabray Paul Mabray

    Candidly, those wines fare less well than variety-driven wines. We are perfecting even more complex algorithms to “fold up” wines to deal with regional wines, even richer behavior with varieties, and more.

  • http://twitter.com/james_jory James Jory

    I would say that the analysis in this post is much more indicative of New World wines than Old World wines (rather than just U.S. vs. Europe). I call it declarative matching. I’m sure that’s what you meant, though.

    When it comes to analysis based on regions/appellations we can often infer varieties and/or wine type with varying degrees of certainty. For example, from Barolo we can derive Nebbiolo and red wine type with 100% certainty. For Rioja the wine could be red or white but in the absence of other markers the reference is probably for a red wine made from (mostly) Tempranillo. Similarly we can sometimes derive info from our knowledge of a specific wine or producer (i.e. they make a blend by a certain marketing designation made of x,y,z grapes). Lastly we can derive varieties and/or type from wine terminology or marketing designations (e.g. “meritage”, “GSM”, “white zinfandel”, and so on). 

    Our system has all of this info codified so when we make a definitive match, we can infer matches on meta concepts like wine type, varieties, brands, and appellations. At this point, though, we are clearly separating declarative matches from derived matches to ensure that our analysis is as accurate as possible. In the future I see the ability for the user to select the layer of matching they’re interested in (i.e. declarative only, derived only, or declarative + derived).

    Sorry for the long explanation. I dig this stuff.
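
The split between declarative and derived matches described above can be sketched in Python as follows. This is an illustrative assumption, not the platform's code: the `Match` type, the lookup tables, and the `find_matches` function are hypothetical. The confidence of 1.0 for Barolo follows the "100% certainty" stated in the comment, while the 0.7 for Rioja is a made-up placeholder standing in for "probably".

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    variety: str
    layer: str         # "declarative" or "derived"
    confidence: float


# Hypothetical tables (assumptions for illustration).
# Declarative: the text names the variety itself.
DECLARED = {"nebbiolo": "Nebbiolo", "tempranillo": "Tempranillo"}
# Derived: the variety is inferred from an appellation.
DERIVED = {
    "barolo": ("Nebbiolo", 1.0),     # certain, per the comment above
    "rioja": ("Tempranillo", 0.7),   # placeholder probability, not real data
}


def find_matches(text, layers=("declarative", "derived")):
    """Return matches in text order, restricted to the requested layers."""
    matches = []
    for token in re.findall(r"[a-z]+", text.lower()):
        if "declarative" in layers and token in DECLARED:
            matches.append(Match(DECLARED[token], "declarative", 1.0))
        if "derived" in layers and token in DERIVED:
            variety, confidence = DERIVED[token]
            matches.append(Match(variety, "derived", confidence))
    return matches
```

Passing `layers=("declarative",)` reproduces the declarative-only view, so "Lovely Barolo tonight" yields no variety match there but yields Nebbiolo once the derived layer is included.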

  • Cedwards

    Can you do AVA discussion analysis? We just did a blog post on the new Coombsville AVA. There is not much dialog now, I am sure, but it would be interesting to measure conversations around it and the wineries related to it, since it is a new launch. It would make an interesting long-term review: do AVAs matter in social media dialog? Thanks, Chris E

  • Tkoby11

    Merlot is #2, but how do you know if that is good Merlot talk or bad Merlot talk? Just curious if this filters good commentary vs. not good?

  • http://twitter.com/james_jory James Jory

    Yes, regions and appellations/AVAs can be monitored too. In fact, many of our clients already do this with custom campaigns within their own accounts. Hint: accounts are free to set up.

    We will be monitoring appellations at the system level too but with more precision than what can be done with custom campaigns.

  • http://twitter.com/james_jory James Jory

    We do provide some automated sentiment analysis for some of the categories we monitor but even the state of the art algorithms leave a lot to be desired. For this reason we haven’t put a lot of energy into analyzing macro trends based on sentiment. Nevertheless, this is an evolving field and we will continue to incorporate the latest techniques as they deliver usable results. 

  • http://twitter.com/TuomasMW Tuomas Meriluoto MW

    Understanding consumers has always been of high interest to me. Do you think it would be possible to analyze in what type of context people are mostly discussing wine, different grape varieties, or countries/regions? Are they looking for recommendations on what to drink with a certain kind of food, ideas for gifts, just general info, or what?

  • http://twitter.com/james_jory James Jory

    We are already measuring most of this contextual information today and are experimenting with measuring things like customer intent (with interesting pilots already being done by some of our clients).

    This is one reason why we are so bullish on the potential of these digital channels. Never before have wineries had this level of insight into how their customers interact with their specific products or product segments in market. The fact that wine is an experiential consumable luxury product provides us with a tremendous amount of content to work with.

  • http://vinebuzz.biz/ Rich Reader

    Where is the Cabernet Franc?
    There’s some evidence in Napa that it is as large as Shiraz.
    It’s a principal component in many Cab Sauv as well as other blends.

  • http://twitter.com/james_jory James Jory

    I guess it depends on what you’re measuring here. If you’re talking about how much Cabernet Franc is planted in Napa vs. Syrah, then you may be right. After all, Napa has more of a heritage growing red Bordeaux varieties than Rhone varieties. Regardless, for the most part Cab Franc is a blending grape. I’m sure most consumers drinking a Cabernet Sauvignon from Napa Valley don’t realize that there may be any Cab Franc in the wine, let alone mention it in social conversations.

    What we’re measuring here is what varieties all consumers are talking about online (declaratively speaking). Since Syrah is bottled much more commonly as a declared/primary variety than Cab Franc and given the popularity of Australian Shiraz, it’s really not even a close comparison.

  • tercero wines

    I would be curious as to which data sources you are collecting your information from . . . Are we talking CellarTracker here? Erobertparker? Wineberserkers? WineLoversPage? Blogs?

    There is so much info out there and it would be good to put some ‘context’ to the communication.
    That said, overall, I dig what you guys and gals are doing – trying to make ‘concrete sense’ of the ‘ongoing chatter’ and hopefully providing some guidance based on it.

  • SteveF

    Did you track Moscato mentions?

  • http://twitter.com/james_jory James Jory

    Although we monitor millions of sites, so far we have found wine conversations on just over 200K social sites. Of those, about 120K can be considered blogs and over 70K are forums. By far the most conversations come from Twitter but forums are still very popular conversational platforms.

    It’s important to note that we measure wine conversations across the entire consumer spectrum (9M so far) and not just oenophiles that hang out on CellarTracker or wine forums. Not surprisingly we have found that consumers identify and describe wine much differently depending on where they are in the spectrum. In addition, the number of casual/occasional wine consumers dwarfs the oenophiles so that should be taken into account when interpreting the results in this post.

  • tercero wines

    Thanks – that clarifies things quite a bit. It’s always interesting to see ‘leans’ to stories based on where the data is compiled, and therefore what the data may or may not imply.

    When looking at Twitter data, do you delete wineries’ own Twitter feeds, as these may bias the numbers? Also, do you look internationally (i.e. foreign language) or just at English-speaking sources?

  • http://twitter.com/james_jory James Jory

    Yes, we do. We have several hundred varieties in the system but only actively monitor the most common ones (and moscato is one of them). It does very well, as you’d expect, based on the hip-hop connection. Conversation volume for moscato more than doubled over the holidays.

  • http://www.facebook.com/karinmckercher Karin McKercher

    If I understand you correctly then, you’re saying people are talking about Merlot, but you don’t know if they’re saying “I’m not drinking no effing Merlot” or if they’re saying “That Sideways guy was ridonculous. Merlot rocks!” If that’s the case, then suggesting that Merlot is “doing a fine job bucking the Sideways effect” is misleading. How can the data be at all relevant if it doesn’t have context?

  • http://www.facebook.com/pmabray Paul Mabray

    We’ve done random samplings and most conversations seem positive.  We did not run it through a sentiment analysis (we didn’t have time).  

  • http://www.facebook.com/pmabray Paul Mabray

    Actually in random samplings most conversations seem very positive so I am confident that Merlot did fine in sentiment this holiday.

  • http://www.facebook.com/pmabray Paul Mabray

    We did but we just didn’t add it to this analysis.  It didn’t beat the top 4 varieties.

  • http://www.facebook.com/pmabray Paul Mabray

    We did look internationally (probably not as much as we will in the future) and we analyzed the winery conversations but did not exclude them.  They did not statistically impact the total.

  • http://twitter.com/james_jory James Jory

    I think you’re confusing popularity in mentions with sentiment. We made no conclusions about sentiment. In the context of this post, popularity simply means how often one variety was mentioned vs. others. The point of the Sideways comment was that Merlot is not withering away into the abyss as many were pontificating following the movie. Therefore, the data is absolutely relevant.

    As part of our continual tuning of our platform and algorithms, I read more wine related tweets, status updates, and posts and look at more wine photos than I care to admit. I can tell you (anecdotally of course) that wine makes most people happy and people typically just want to share with others what they’re drinking.

  • Mike

    Merlot and Chardonnay have been in the top three for many years (20+). People talk dry and drink sweet; they always have. For most people wine is just red and white. Cheers, Michael

  • http://www.facebook.com/karinmckercher Karin McKercher

    Actually, I’m not confusing popularity with sentiment. That’s exactly the point I’m making. That frequency of mentions doesn’t mean a whole lot if we don’t have context. Despite that comment, I’m not suggesting the data are irrelevant, and I’m certain your anecdotal evidence is spot on. Really, I’m merely following up on your question posed earlier in the post: “Imagine what a winery or brand manager could learn from analyzing these conversations and customers . . .” How does that winery or brand manager use the data?

    I think this is exactly the challenge many (especially small) wineries don’t know how to answer.

    My comment is certainly not a criticism, and it’s not an expression of doubt about the meaning in the data you’ve collected. It’s an invitation to expound upon how this data could be useful. Perhaps I could have better posed the question as: If I am a winery/brand manager, how do I make this data relevant?

  • http://twitter.com/james_jory James Jory

    Gotcha, Karin. Thank you for the clarification and sorry if I came off defensive.

    Your question is a great one and is exactly what we are hoping to answer with our SCRM solution. 

  • Christian Miller

    Fascinating stuff, thank you. How do you categorize “tasting notes” and micro- vs. regular blogs?

    Historical footnote: the Sideways impact on Merlot was mostly trade talk and post-hoc rationalizations. We cross-tabbed movie viewership and change in Merlot purchases among core wine consumers in 2005, and there was very little correlation. In contrast, there was a positive correlation between having seen Sideways and drinking more Pinot Noir.

  • Christian Miller

    Why would you recode Barolo to Nebbiolo or Bordeaux to Cabernet/Merlot? The wines are probably mostly purchased or discussed with reference to their place names rather than their varietal composition. Or perhaps you were just mentioning this as feasible, not routine?

  • http://twitter.com/james_jory James Jory

    Yes, I was just describing how we could map varieties based on regulations or probability given certain appellations. For the data described in this post, we did not do any mapping.

    Performing mappings can be useful and even necessary depending on what it is you’re trying to measure. For example, if you wanted to compare mentions of white wines vs. red wines, you would certainly want to map up references of wines, appellations, and varieties to typical finished wine types for each.
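
As a small illustration of "mapping up" to wine types, the Python sketch below counts conversations per typical finished wine type. The lookup table and the `count_by_type` name are hypothetical, and the table is deliberately limited to single-word terms. Ambiguous cases such as Rioja or "white zinfandel" would need the probabilistic handling discussed earlier in this thread and are left out.

```python
import re
from collections import Counter

# Hypothetical lookup from a matched term (variety or appellation)
# to its typical finished wine type. An assumption for illustration.
TYPICAL_TYPE = {
    "chardonnay": "white",
    "riesling": "white",
    "merlot": "red",
    "malbec": "red",
    "barolo": "red",
}


def count_by_type(conversations):
    """Count how many conversations mention at least one wine of each type."""
    counts = Counter()
    for text in conversations:
        tokens = set(re.findall(r"[a-z]+", text.lower()))
        # A conversation counts once per type, however many terms it names.
        types = {TYPICAL_TYPE[t] for t in tokens if t in TYPICAL_TYPE}
        counts.update(types)
    return counts
```

Counting each conversation once per type keeps a post that names both Merlot and Barolo from inflating the red total. Counting every mention instead would be an equally defensible choice, depending on what is being measured.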

  • http://www.facebook.com/karinmckercher Karin McKercher

    No worries, James. There’s a lot of work there, and you’re bound to feel somewhat protective of it!

  • Julie Brosterman

    Curious to know how Champagne fared – probably better than Chardonnay.
