What are binary classifiers and why do we love them so much?

By Wojciech Gryc on July 23, 2021

Summary: Binary classifiers are models that take a piece of text and determine whether the piece of text falls into a given category: it does (“yes”) or does not (“no”). In this article, we explain why our modelling platform is focused on binary classifiers.


Natural Language Processing (NLP) is a massive field with many types of modelling tasks and approaches to solving a problem. A classic one – and one that is often underestimated by many people – is the binary classifier.

Binary classifiers are models that take a piece of text and determine whether the piece of text falls into a given category: it does (“yes”) or does not (“no”).

Yup, that’s all it does! Given a piece of text, tell me if the text is applicable to a category – yes, or no. Hence the “binary” in “binary classifier”.

So why are these models so magical?

We like using binary classifiers because they are easy to understand and improve, and many business problems can be framed as yes/no questions. Here are some of the reasons why we like binary classifiers.

Reason 1: Understanding confidence

Many binary classification models can estimate their own confidence in the category. In this case, you get not only a “yes” or “no” classification but also the model’s estimate of how strong that classification is. In other words, the model tells you either “yes” or “no” and how certain it is about this “yes” or “no”. This certainty is expressed as a confidence score.

Typically, a confidence score is shown as a number between 0.0 and 1.0, where 0.0 means a confident “no” and 1.0 means a confident “yes”. For instance, a model might score a piece of text as 0.23, which is close to 0.0 but not quite there. The model is more certain that this text is a “no” than it is for another piece of text with, say, a score of 0.37.
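To make the scoring idea concrete, here is a minimal Python sketch. The function names and the 0.5 cut-off between “no” and “yes” are assumptions chosen for illustration; they are not taken from any particular platform or library.

```python
def label_from_score(score: float, threshold: float = 0.5) -> str:
    """Turn a confidence score in [0.0, 1.0] into a "yes" or "no" label.

    The 0.5 threshold is an illustrative assumption.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    return "yes" if score >= threshold else "no"


def certainty(score: float) -> float:
    """Distance from the 0.5 midpoint, rescaled so that
    0.0 means completely unsure and 1.0 means completely sure."""
    return abs(score - 0.5) * 2


# Both scores are a "no", but 0.23 sits further from the midpoint than 0.37,
# so the model is more certain about it.
for s in (0.23, 0.37):
    print(s, label_from_score(s), round(certainty(s), 2))
```

Under these assumptions, both 0.23 and 0.37 come out as “no”, and the certainty value is higher for 0.23.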

Once you score all your text, you can rank it… Now, if you have thousands of articles or thousands of facts, you can find the most relevant statements to a specific category based on the scores.

This prioritization allows us to quickly separate text that clearly falls within a category from text that clearly does not. For example, if we are labelling text from product reviews as “positive” (1.0) or “negative” (0.0), we can quickly determine if we have more positive or more negative reviews.
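The ranking and counting described above can be sketched in a few lines of Python. The reviews and their scores are made up for illustration, and the 0.5 cut-off between “positive” and “negative” is an assumption.

```python
from typing import Dict, List, Tuple

# Hypothetical (text, score) pairs; in practice the scores would come from a model.
scored_reviews: List[Tuple[str, float]] = [
    ("Arrived quickly and works great", 0.94),
    ("Does the job, no complaints", 0.62),
    ("Stopped working after a week", 0.08),
    ("Really happy with this purchase", 0.81),
    ("Not what I expected", 0.35),
]


def rank_by_score(scored: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Sort so the texts most confidently in the category come first."""
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def count_labels(scored: List[Tuple[str, float]], threshold: float = 0.5) -> Dict[str, int]:
    """Count how many texts fall on each side of the threshold."""
    positive = sum(1 for _, score in scored if score >= threshold)
    return {"positive": positive, "negative": len(scored) - positive}


for text, score in rank_by_score(scored_reviews):
    print(f"{score:.2f}  {text}")
print(count_labels(scored_reviews))
```

With this toy data, three reviews land on the positive side of the threshold and two on the negative side.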

Reason 2: Using confidence to improve model training

However, there are cases where a model isn’t sure how to label a piece of text. For instance, a score between 0.4 and 0.6 suggests that the model is uncertain about the label. Here, a human can review these cases manually and assign a label. The human effectively teaches the model to treat the text as a “yes” or “no”. As a result, this improves the model’s ability to classify similar text.

Using confidence scores to decide which pieces of text should be reviewed and labelled by humans makes the model training process significantly more efficient. It means we can spend human labelling effort where it matters most: on examples that teach the binary classifier something new, rather than on examples it would already classify correctly.
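As a rough sketch of this review step, the Python below picks out the texts whose scores fall in the uncertain band and puts the least certain ones first. The 0.4 to 0.6 band comes from the example above; the function name and the sample data are hypothetical.

```python
from typing import List, Tuple


def select_for_review(
    scored: List[Tuple[str, float]], low: float = 0.4, high: float = 0.6
) -> List[Tuple[str, float]]:
    """Return the texts the model is unsure about, most uncertain first.

    A text counts as uncertain when its score lies between low and high.
    """
    uncertain = [(text, score) for text, score in scored if low <= score <= high]
    # Scores closest to 0.5 are the ones the model is least sure about.
    return sorted(uncertain, key=lambda pair: abs(pair[1] - 0.5))


# Hypothetical scored texts; real scores would come from a trained classifier.
scored_texts = [
    ("Best purchase I have made this year", 0.97),
    ("It is fine, I guess", 0.55),
    ("Took a while to get used to", 0.42),
    ("Broke on the first day", 0.12),
    ("Pretty decent overall", 0.61),
]

for text, score in select_for_review(scored_texts):
    print(f"{score:.2f}  needs a human label: {text}")
```

Only the two texts scored 0.55 and 0.42 are sent for review here; the confident cases at either end are left alone, which is where the saving in human effort comes from.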

Reason 3: Combining classifiers to answer complex questions

What’s more powerful is when you combine binary classifiers. Suppose you have three separate models: one to determine if a statement is negative, another to determine if it’s discussing your company’s customer experience, and a third one to determine if it’s discussing pricing.

If you apply all three models to a group of customer service emails, and filter them to only include ones where all three models scored a “yes”, then you’ll find all the statements that are negative in sentiment, focusing on customer experience, and also discussing your product’s price… You’ll very quickly narrow text down to specific categories that would otherwise require custom models to build!

… and as a result, your yes/no questions have turned into complex queries on unstructured data!
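Here is a small Python sketch of that kind of combined query. The three scorers are toy keyword matchers standing in for real trained models, and every name, keyword list, and email in the example is hypothetical; the point is only to show the “all three say yes” filter.

```python
from typing import Callable, List

Scorer = Callable[[str], float]


def keyword_scorer(keywords: List[str]) -> Scorer:
    """Build a toy stand-in for a binary classifier.

    It returns 0.9 if any keyword appears in the text and 0.1 otherwise.
    A real model would return a learned confidence score instead.
    """
    def score(text: str) -> float:
        lowered = text.lower()
        return 0.9 if any(keyword in lowered for keyword in keywords) else 0.1
    return score


is_negative = keyword_scorer(["terrible", "disappointed", "unacceptable"])
about_customer_experience = keyword_scorer(["support", "service", "agent"])
about_pricing = keyword_scorer(["price", "cost", "expensive"])


def matches_all(text: str, scorers: List[Scorer], threshold: float = 0.5) -> bool:
    """True only when every classifier scores the text as a "yes"."""
    return all(scorer(text) >= threshold for scorer in scorers)


emails = [
    "Your support agent was terrible and the price is far too expensive.",
    "Great service, and the price was fair.",
    "I am disappointed with how long delivery took.",
]

flagged = [
    email
    for email in emails
    if matches_all(email, [is_negative, about_customer_experience, about_pricing])
]
print(flagged)
```

Only the first email passes all three filters. Because each classifier stays independent, swapping one out or adding a fourth changes the query without retraining the others.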

Conclusion

Binary classifiers are underrated – they’re seemingly not as advanced as more complex models, and don’t get as much love as broader types of AI models… Yet at the same time, they are easy to understand, easy to train and improve, and easy to combine to answer surprisingly complex questions.

This is why we recommend that those working with Natural Language Processing, especially when applying NLP to real-world processes, start by framing questions and problems in terms of combinations of binary classifiers. This also helps frame your problem or research questions in a way that non-technical and business leaders understand.

It’s also why we have so many of them in our NLP model library!

© 2021