Text Analytics to the Rescue

Frank Buckler, PhD.
Jul 5, 2022


Text analytics has had great success in recent years, and most larger enterprises use it in CX in one way or another. Still, most companies are far from getting much value out of it. It is more like a piece of software that someone plugged into the process. Here is why this is a problem and how companies can get around it.

Text analytics software is supposed to read and understand unstructured qualitative feedback. This understanding is defined by associating a verbatim with the correct theme. In short, the task is all about categorizing text feedback into a finite set of topics or categories.

The first and still most widely used text analytics methods are unsupervised. They analyze a set of feedback comments, cluster them, and build topics. The problem: for simple matters, this even makes a good first impression. But when you look more closely, it does not perform nearly as well as a human reader would.

The more specific and complex the feedback, the more apparent the lack of understanding becomes.

Sure, the algorithm has no deeper industry knowledge. So far there is no alternative to using an AI that can be taught by domain experts, that is, by humans.

Even this — done naively — can do more harm than good.

Even worse, at the end of a text analytics implementation the same question always comes up: Now what? What do we learn from this? Should we really fix this?

I would like to share three principles that can help you see the light.

The effort is worth it. Being able to leverage unstructured customer feedback is worth its weight in gold. It is truly customer-oriented not to torture respondents with lengthy closed-ended questionnaires, but simply to ask a question or two and let customers express themselves in their own words.

This enables you to conduct research on every customer and every touchpoint, and to get in-depth insights with a comparatively simple research approach.

How MICROSOFT drives value from unstructured feedback

The brand runs one of the world’s largest B2B customer trackers, collecting over 200,000 pieces of feedback every year. Early on it adopted text analytics, but the depth of insight that conventional text analytics can extract from highly technical feedback is sobering. The categorization into 20 rather generic categories like Quality, Price, or Service was not very helpful and was error-prone as well.

As early as 2019, Microsoft implemented a highly sophisticated text analytics approach. Trained by domain experts, it yields nearly 200 highly granular topics. Moreover, its accuracy proved to exceed even that of human categorization.

Not only that. The brand invested in an elaborate driver analysis, a causal machine learning approach that identifies the impact that an improvement in a topic would have on customer satisfaction.

Now, instead of looking at 200 topics and how they changed, the team focuses on those positive topics that will have the highest impact (hidden drivers) and on those topics that have a strong negative impact and are mentioned too often at the same time (leakages).
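To make the idea of hidden drivers and leakages more tangible, here is a minimal sketch. It is not Microsoft's actual method. The topic names and numbers are invented, the `impact` values are assumed to come from some driver analysis, the median cut-offs are arbitrary, and reading "hidden" as "high impact but rarely mentioned" is my assumption for this illustration.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Topic:
    name: str
    frequency: float  # share of comments mentioning the topic
    impact: float     # assumed output of a driver analysis (sign = direction)


def prioritize(topics):
    """Split topics into hidden drivers and leakages (assumed definitions)."""
    freq_cut = median(t.frequency for t in topics)
    impact_cut = median(abs(t.impact) for t in topics)
    # Hidden drivers: strong positive impact, but mentioned less than usual.
    hidden = [t.name for t in topics
              if t.impact >= impact_cut and t.frequency < freq_cut]
    # Leakages: strong negative impact and mentioned at least as often as usual.
    leaks = [t.name for t in topics
             if t.impact <= -impact_cut and t.frequency >= freq_cut]
    return hidden, leaks


# Invented toy numbers, for illustration only.
topics = [
    Topic("friendly support", 0.30, 0.05),
    Topic("proactive advice", 0.04, 0.40),
    Topic("slow ticket response", 0.25, -0.35),
    Topic("pricing", 0.20, -0.02),
    Topic("clear documentation", 0.06, 0.30),
    Topic("login issues", 0.03, -0.10),
]
hidden_drivers, leakages = prioritize(topics)
print("Hidden drivers:", hidden_drivers)
print("Leakages:", leakages)
```

In this toy data, "friendly support" is the most mentioned topic but ends up in neither list, because its assumed impact is small.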

The Three Principles of Good Text Analytics for CX

These are the fundamental principles to consider in order to drive value from your customers’ unstructured feedback.

1. Build your own codebook and use the right AI

It’s not enough to just buy text analytics software or a subscription. In most cases it does not fully satisfy the expectations your business partners have today.

What you need is a text AI that you can train.

This training starts by defining the set of themes (the codebook) against which you want your customers’ feedback to be quantified. You don’t want to outsource this to software.

Be sure the software you are using has the right measures to validate its accuracy. For example, you do not want to look at hit rates but at the F1 score.
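Why the F1 score rather than the hit rate? A small sketch with invented numbers may help. It assumes a single topic ("billing error") that only 5 of 100 comments mention, and compares a classifier that never assigns the topic with one that actually finds most mentions. The data and function names are illustrative only.

```python
def hit_rate(y_true, y_pred):
    """Share of comments where the predicted label equals the human label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data: 100 comments, only 5 mention the topic "billing error".
y_true = [1] * 5 + [0] * 95
# A lazy classifier that never assigns the topic.
y_never = [0] * 100
# A classifier that finds 4 of the 5 mentions and raises 2 false alarms.
y_model = [1, 1, 1, 1, 0] + [1, 1] + [0] * 93

print("never:", hit_rate(y_true, y_never), f1_score(y_true, y_never))
print("model:", hit_rate(y_true, y_model), round(f1_score(y_true, y_model), 2))
```

The classifier that never finds the topic still reaches a hit rate of 0.95, while its F1 score is 0. The useful classifier has a hit rate of 0.97, barely higher, but an F1 score of about 0.73. With many rare topics in a codebook, the hit rate hides exactly the failures you care about.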

2. Train it well

Garbage in, garbage out: training itself involves some tricks of the trade that you can either learn yourself or get from external partners who are experienced in this.

It’s not enough to use a domain expert. The codebook should be documented well and the person should stay the same over time — or the handover phase should be extensive.

Over time, training changes the way your system categorizes feedback. Either you stop training to maintain consistency (not recommended, as accuracy will decay over time), or you must re-baseline the past once in a while.
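What re-baselining means in practice can be sketched as follows. The two keyword functions are hypothetical stand-ins for a model before and after additional training (a real system would use a trained text AI, not keywords), and the comments are invented. The point is only that all past periods get re-scored with the current model.

```python
def topic_share(comments, classify, topic):
    """Share of comments that a classifier assigns to the given topic."""
    return sum(topic in classify(c) for c in comments) / len(comments)


# Hypothetical stand-ins for a model before and after additional training.
def classify_v1(text):
    return {"support"} if "support" in text.lower() else set()


def classify_v2(text):
    lowered = text.lower()
    return {"support"} if "support" in lowered or "helpdesk" in lowered else set()


# Invented toy history, for illustration only.
history = {
    "2022-01": ["Support was slow", "Helpdesk never answered", "Great product"],
    "2022-02": ["Helpdesk was rude", "Support is fine", "Too expensive"],
}

# Mixed series: January scored with v1, February with v2 (after retraining).
mixed = {
    "2022-01": topic_share(history["2022-01"], classify_v1, "support"),
    "2022-02": topic_share(history["2022-02"], classify_v2, "support"),
}
# Re-baselined series: every period re-scored with the current model.
rebaselined = {
    month: topic_share(comments, classify_v2, "support")
    for month, comments in history.items()
}
print(mixed)
print(rebaselined)
```

In the mixed series the share of "support" mentions seems to double from January to February. After re-baselining, it is flat. The apparent trend came from the retrained model, not from the customers.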

It’s important to communicate this expectation early on: No categorization will ever be perfect.

3. Do NOT interpret text analytics — it’s just data

The greatest misconception about text analytics is jumping from data to conclusions. Intuitively, businesses look at the most frequently mentioned topics because they believe these are the reasons for success or failure.

Even worse: this makes perfect sense, as it is the answer to the question “Why did you rate that way?” After all, the customer is telling us why.

However, it turns out that the frequency of mentions and their importance are largely uncorrelated.

In other words, whenever you give your business partners unguided access to topic frequencies, they will most likely arrive at poor decisions.
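A toy example shows how frequency and importance can point in different directions. The feedback below is invented. As a stand-in for importance it uses the plain difference in mean rating between comments that mention a topic and those that do not. That is a crude proxy I chose for illustration, not the causal machine learning used in a real driver analysis.

```python
from statistics import mean

# Invented toy feedback: (rating on a 0-10 scale, topics coded for the comment).
feedback = [
    (9, {"price"}),
    (8, {"price"}),
    (3, {"price"}),
    (2, {"price", "outage"}),
    (9, {"price"}),
    (8, set()),
    (1, {"outage"}),
    (9, set()),
]


def frequency(topic):
    """Share of comments that mention the topic."""
    return sum(topic in topics for _, topics in feedback) / len(feedback)


def rating_gap(topic):
    """Mean rating with the topic minus mean rating without it (crude proxy)."""
    with_topic = [r for r, topics in feedback if topic in topics]
    without = [r for r, topics in feedback if topic not in topics]
    return mean(with_topic) - mean(without)


for topic in ("price", "outage"):
    print(topic, frequency(topic), round(rating_gap(topic), 2))
```

Here "price" is mentioned in more than half of the comments, yet ratings hardly differ with or without it. "Outage" is mentioned far less often, but comes with a rating gap of roughly six points. A frequency chart alone would send the team after the wrong topic.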

The Microsoft case above shows how this issue can be solved.

State-of-the-Art Text Analytics

Text analytics is an amazing opportunity to discover unbiased insights in an easy and practical form.

The state of the art relies on deep learning AI systems that are not only pre-trained but can still be trained by domain experts. This training requires some care and rigor in the process to avoid garbage in, garbage out.

To finally derive business value from text analytics, it must be linked to some kind of driver analysis process. Here, too, (causal) machine learning approaches are the most appropriate.

The world’s largest CX Analytics Masters course provides a detailed education program on the state of the art. It has been open since spring 2021. More here: https://www.cx-ai.com/cx-analytics-masters

The safe and easy way, of course, is to use vendors like www.cx-ai.com, who have perfected all those measures and provide those processes as a managed service.

More background is also available in my latest book, “The CX Insights Manifesto”, available on Amazon.com, co.uk, .ca, .it, .fr and .de.





Frank Buckler, PhD.

Founder of CX-AI.com and CEO of Success Drivers // Pioneering Causal AI for Insights since 2001 // Author, Speaker, Daddy