Review: My ability to judge spirits

All results of the Tokyo Whisky & Spirits Competition 2021 (TWSC 2021) are now public. It's given me a chance to compare my own scoring during the blind tasting sessions to the scoring by other judges on the panel. The idea here is to see how much of an outlier I am when compared to other spirits judges: are my ratings too harsh, too generous, or somewhere in between?

One of the big selling points of TWSC is that because it's conducted in Japan, it offers entrants a chance to see what Japanese industry professionals think of their bottles. The very first sentence on the TWSC homepage, in fact, mentions the sensitive and delicate palates of Japanese people.

Well, I was born and raised in America, and I'm not even partially Japanese. Yet, as a member of the TWSC Executive Committee, for the fully remote TWSC 2021, I was a judge for both the first and second rounds. So in addition to the Bronze, Silver, and Gold medals, I also helped decide who won the very top awards like Best of the Best and Superior Gold. In a competition that's billed as being by Japanese people, I feel honored, burdened with responsibility, and a bit out of place all at once.

Japanese or not, taste and smell are quite subjective, and don't really have right or wrong answers. But when it comes to actually scoring spirits, we need to ask more objective questions: would someone who likes peaty whisky enjoy this whisky? Does this spirit retain its character when a few drops of water are added? Or does its structure break down? How well does this shochu hit its targeted profile?

These are the kinds of questions that were going through my head and the heads of other judges at TWSC 2021. With TWSC, unlike the reviews I do on my own here, I'm able to see how my scores compare to the averages. Was I able to pick out the winners? Or did I like the losers more? Let's see how I fared.

Please note that I am not allowed to disclose exactly which bottles I did or did not judge.

Western Spirits Division: Round 1

For the first rounds, while I have my own scores, I can't directly compare them to the scores from the other judges. But I can compare them to the medals that each bottle went on to win or not. So purely for comparison purposes in this post--these are NOT in any way official brackets for TWSC medals--let's just say 69 or below is no medal, 70-79 is Bronze, 80-89 is Silver, and 90-100 is Gold.
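For anyone who wants to apply the same comparison scale to their own scores, it can be sketched in a few lines of Python. The function name `medal_for_score` is made up for this illustration, and the brackets are the unofficial ones from this post, not anything TWSC publishes.

```python
def medal_for_score(score: int) -> str:
    """Map a 0-100 score to the unofficial brackets used in this post.

    These cutoffs are for comparison only; they are NOT official TWSC brackets.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score >= 90:
        return "Gold"
    if score >= 80:
        return "Silver"
    if score >= 70:
        return "Bronze"
    return "No medal"


if __name__ == "__main__":
    # Print the bracket for a few boundary scores.
    for s in (69, 70, 79, 80, 89, 90):
        print(s, medal_for_score(s))
```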

Following that scale, my scores were as below:

  • 10 out of 11 bottles that I awarded 85 points or higher went on to win Silver, Gold, or Superior Gold
  • 3 out of 4 bottles that I awarded the highest scores to within a given flight won Gold or Superior Gold
  • All second-highest scoring bottles in a flight received at least Silver
  • Medal-for-medal, I "correctly" guessed 0 out of 1 no-medal bottles, 4 out of 9 Bronze winners, 9 out of 12 Silver winners, and 3 out of 7 Gold winners
  • As you can see, there were a handful of cases where I was way off from the norm, rating bottles significantly lower or higher than the other judges

Overall, not too bad: I tended to favor the same bottles that other judges favored.
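The medal-for-medal tally can be reproduced with a small counting routine. This is only a sketch: `tally_matches` is a hypothetical helper, the sample pairs are invented placeholders rather than real TWSC results, and how to treat Superior Gold (for example, folding it into Gold) is a choice you would need to make before comparing.

```python
from collections import Counter


def tally_matches(pairs):
    """Count, per actual medal, how often my bracket matched it.

    pairs: iterable of (my_medal, actual_medal) strings.
    Returns {actual_medal: (matches, total)}.
    """
    totals = Counter()
    hits = Counter()
    for mine, actual in pairs:
        totals[actual] += 1
        if mine == actual:
            hits[actual] += 1
    return {medal: (hits[medal], totals[medal]) for medal in totals}


if __name__ == "__main__":
    # Invented placeholder data, not actual competition results.
    sample = [
        ("Silver", "Silver"),
        ("Silver", "Gold"),
        ("Gold", "Gold"),
        ("Silver", "Bronze"),
    ]
    for medal, (matched, total) in tally_matches(sample).items():
        print(f"{medal}: {matched} out of {total}")
```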

Shochu Division: Round 1

Using the same scale as we did for the Western Spirits Division, we have the below:

  • 8 out of 10 bottles that I awarded 85 points or higher went on to win Silver, Gold, or Superior Gold
  • 3 out of 4 bottles that I awarded the highest scores to within a given flight won Gold or Superior Gold
  • Only 1 out of 4 of my second-highest scoring bottles in a flight received at least Silver
  • Both of the 90+ scores I gave went to bottles that won Gold or Superior Gold
  • Medal-for-medal, I "correctly" guessed 1 out of 8 no-medal bottles, 1 out of 6 Bronze winners, 8 out of 9 Silver winners, and 2 out of 6 Gold winners
  • In some cases I was way off base, giving a 79 to a bottle that won Gold for example

My tendency to award a middling score of 80-89 was much more apparent in shochu, where I gave a whopping 22 entries such scores while other judges awarded Silver to fewer than half that number. As for the really good bottles (Gold or Superior Gold), I picked out two of them myself, came close on two (awarding 85+), and was way off on the other two.

Western Spirits Division: Round 2

As we said in the press release, the second round of judging was to decide the Best of the Best and Superior Gold winners in each given division. The judge pool here consisted only of the TWSC Executive Committee and the Whisky Galore taster panel.

For round 2, I have access to anonymized scores of all the other judges. This allows us to take a more arithmetical approach.

There are two different sets here. The first set is single malt whiskies, the second... isn't. Again, I can't name the exact bottles.

For single malt whiskies, I tended to rate entries slightly higher than my peers overall (by 2.7 points), though not excessively so. Removing my scores from the average did not materially impact the final rankings -- basically the overall #4 and #5 would have switched positions.

Ranking-wise, my #1 pick ended up placing 4th in the whole division, my #2 was #1, and one of my ties for #3 ended up placing 2nd. Likewise, my lower-rated entries were also rated lower by everyone else. That said, I was quite far off when it came to one of the bottles I picked as 3rd place, as it ended up in 10th place, barely making the cut for Superior Gold.

For bottles that weren't single malt whisky, it looks like I was far too generous. Overall, my average score was a full 4.5 points over the average from the pool without me. At the very least, I wasn't single-handedly responsible for changing any rankings.
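The two checks used in this section, how far one judge sits from the rest of the pool and whether the rankings change without that judge, can be sketched as follows. This is my own rough approach, not an official TWSC calculation. The names `offset_from_peers` and `ranking` and the nested-dictionary layout are assumptions for illustration, and the sketch assumes every bottle was scored by at least one other judge.

```python
from statistics import mean


def offset_from_peers(scores, judge):
    """Average of (judge's score - mean of the other judges' scores).

    scores: {bottle: {judge_name: score}}. Only bottles the judge scored
    are counted. A positive result means the judge is more generous.
    """
    diffs = []
    for by_judge in scores.values():
        if judge not in by_judge:
            continue
        others = [s for name, s in by_judge.items() if name != judge]
        if others:
            diffs.append(by_judge[judge] - mean(others))
    return mean(diffs)


def ranking(scores, exclude=None):
    """Bottles ordered by average score, best first, optionally
    leaving out one judge's scores."""
    averages = {}
    for bottle, by_judge in scores.items():
        kept = [s for name, s in by_judge.items() if name != exclude]
        averages[bottle] = mean(kept)
    return sorted(averages, key=averages.get, reverse=True)


if __name__ == "__main__":
    # Invented placeholder scores, not actual competition data.
    scores = {
        "Bottle A": {"me": 95, "judge1": 80, "judge2": 80},
        "Bottle B": {"me": 70, "judge1": 85, "judge2": 83},
    }
    print(offset_from_peers(scores, "me"))
    print(ranking(scores))
    print(ranking(scores, exclude="me"))
```

If the ranking with and without your scores is the same list, you did not change any placements on your own.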

Shochu Division: Round 2

I'm all over the map here. One of the bottles I awarded 97 points to ended up as #2 in the rankings, but another ended up as #8, and yet another as #24.

Overall I was way, way more generous than the other judges for round 2 of the Shochu Division. Luckily(?) my scores didn't end up impacting the final rankings.

Do Try This At Home

Wondering how your own scoring abilities compare? It's not very often that we're able to quantitatively compare our tasting abilities to those of others.

Although my analysis above is based on results from a spirits competition with judges from the industry, it wouldn't be too difficult to replicate within your local whisky club or circle of friends. The conditions would be:

  • Tasting is conducted blindly, so scores can't be influenced by brand or age statement
  • Judges should score bottles independently, i.e. no discussion prior to recording your final scores
  • A sufficient number of judges score the same bottles; you'll probably want at least 7 judges per bottle
  • Use a numerical scale so it's easier to compute averages and such

From there it's purely administrative: one person (who isn't a judge) knows which bottles are which and assigns the order they should be tasted in.
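If the organizer would rather not pick the tasting order by hand, a random blind order is easy to generate. This is just a sketch: `assign_blind_order` is a hypothetical helper, the bottle names are placeholders, and the optional `seed` exists only so the organizer can regenerate the same key later.

```python
import random


def assign_blind_order(bottles, seed=None):
    """Return {sample_number: bottle_name} in a random tasting order.

    Only the organizer should see this key; judges see sample numbers.
    """
    rng = random.Random(seed)
    shuffled = list(bottles)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    return {number: name for number, name in enumerate(shuffled, start=1)}


if __name__ == "__main__":
    # Placeholder bottle names.
    key = assign_blind_order(["Bottle A", "Bottle B", "Bottle C", "Bottle D"])
    for number, name in key.items():
        print(f"Sample {number}: {name}")
```

Judges then record scores against sample numbers, and the key is revealed only after all scores are in.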

In conclusion

The main takeaway for me from this whole exercise is that I probably need to be a bit more critical about both Western spirits and shochu. My scores in round 2 were well above average, and I am guessing that if I had access to the round 1 numbers, it would hold true there too.

The other takeaway from all this is that my tastes for shochu are evidently significantly different from those of the other judges. I suspect the root cause here is simply not having enough experience with the category. Since I already have the Shochu Kikisake-shi qualification, that additional experience needs to be hands-on rather than academic: visiting distilleries, drinking a huge variety of shochus, and learning more about the people and processes that make individual brands special.
