Coffee Cupping Calibration

Why coffee cupping

In addition, some question whether existing protocols are sufficient and wholly accurate. As one example, Brian Johnson, director of coffee at My Virtual Coffee House Roasters, suggests that water protocols should be published in exact gram weight of water in addition to traditional volume recommendations, since this echoes the current habits of baristas in measuring and evaluating espresso and filter drip extractions. The quality of water is even more important as water comprises approximately 98.75 percent of the brewed beverage in a traditional cupping. Despite the importance of water quality, however, labs around the world (and even within the United States) don’t use a standardised water base.

For Peter Burgess, a roaster who works with Coffee Flower, a major weakness of current cupping practice lies in the roast degree recommendation. “You have to look at more than one roast,” he says, because “there are problems with taking a qualitative measurement of both acidity and body, two things greatly influenced by roast development, unless you are looking at a couple of different roasts.”

O’Mally also feels that the sample-roasting specifications are too lax and instead requires that his samples be “roasted within 10:30–11:15 [minutes] rather than the eight- to 12-minute window of the SCAA.”

Certainly, there are many other conditions and variables in cupping that might be better controlled or differently managed. Without postulating conclusions here, I suggest to the industry that we review existing protocols and best practices to see where we can improve upon them in the interest of achieving greater objectivity and balance of results.

Of course, until everyone is rigorously employing protocols, creating more of them or adapting them doesn’t fix anything.

How To Cup Coffee


Even if we had a perfect tasting process available to us and we employed its protocols religiously, a more difficult challenge manifests in the concept of calibration. Calibration is the notion that, assuming equality of sample and process, cuppers around the world are consistently infusing vocabulary and scoring with like meaning. Are the words citrus, fruity, savoury and floral applied consistently among cuppers of the same coffee, for example, or does fruity mean cherry to some and over-fermented to others?

Is an 85-point coffee in one lab at least in the range of 84–86 among cuppers of the same coffee elsewhere? For cupping to serve its critical purpose as a buying and selling tool and price determiner, we have to have semantic consensus among our professional tasters. As Johnson states, “[Without calibration] pricing will break down, since in our small sector of bean to cup coffee machines, price should be in direct relation to quality/cupping score.” (Johnson also suggests that “availability, sustainability and traceability” are part of the pricing equation.) If we aren’t speaking the same language, how do we determine fairly if a coffee is worth a differential of +50 versus +250? And are we doing ourselves a disservice by selling coffees to consumers at prices they aren’t really worth?
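To make the 84–86 question concrete, here is a minimal sketch of how a group of labs might check whether their scores for the same coffee agree. Everything in it is an assumption for illustration: the lab names and scores are invented, the function name calibration_report is hypothetical, the reference point is simply the median of the submitted scores, and the tolerance of one point is borrowed from the range mentioned above rather than from any published protocol.

```python
from statistics import median


def calibration_report(scores, tolerance=1.0):
    """Compare each lab's score for one coffee against the group median.

    scores: dict mapping a lab name to its score for the same coffee.
    tolerance: allowed deviation in points (assumed here to be +/- 1).
    Returns the median and, per lab, (deviation, within_tolerance).
    """
    reference = median(scores.values())
    report = {}
    for lab, score in scores.items():
        deviation = round(score - reference, 2)
        report[lab] = (deviation, abs(deviation) <= tolerance)
    return reference, report


# Hypothetical scores for one coffee cupped in three labs.
scores = {"Lab A": 85.0, "Lab B": 84.5, "Lab C": 88.0}
reference, report = calibration_report(scores)

for lab, (deviation, ok) in report.items():
    status = "calibrated" if ok else "out of range"
    print(f"{lab}: {deviation:+.2f} points from median {reference} ({status})")
```

The median is used instead of the average so that a single outlying lab does not drag the reference point toward itself. A check like this only covers the numbers, of course; it says nothing about whether the descriptors behind those numbers mean the same thing to each cupper.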

Over the years, I have frequently questioned whether we are universally calibrated as tasters. For example, many times, I have been witness to or part of debates over whether a “fruity” sample (in the same cupping session) is over-fermented and defective or whether it is a 90-plus-rated coffee. Well-respected and experienced tasters can land on opposite sides of this debate.

Johnson, having also experienced this particular argument, states, “I believe many people need more training in identifying uncontrolled ferment as a defect since what we like about honeys and naturals is a result in controlled ferment.” Johnson is exactly right: as an industry, we need more training to calibrate. In this particular case, we need more training in how processing manifests in the cup, both controlled and uncontrolled, to find some objective end to these debates.

However, even if we can come to agreement and can taste accurately whether fruit is caused by happenstance or by deliberate effort in processing, I suspect we will still disagree on what is “positive” fruit versus fruit that is “defective.” Some of these issues come down to bias (which we will discuss later in the article), but some arise because we don’t have universal language agreement.

What Do The Tasters Say

Despite the frequency of these debates over the years, I was pleasantly surprised to find that, when I questioned some fellow tasters about calibration, they were largely positive about their success in matching their results with their outside partners.

Spencer says that, more often than not, Caribou finds that they “trend in the same direction as our industry contemporaries.” O’Mally asserted that his industry partners “consistently score/calibrate within our results,” but he also acknowledges that they have some trading partners whose “results are very inconsistent.”

O’Mally, though, attributes these inconsistencies to “counterparts neither scientifically running the lab, nor consistently cupping,” and not to problems of semantic variation. Spencer also says that “for calibration the greatest challenge is practices in the lab.”

Without doubt, poor protocols yield result variation, but part of the problem is semantic. We are not speaking the same language—both in words and numbers.
