What is AutoNLP?

By Wojciech Gryc on July 25, 2021

Summary: Automated Natural Language Processing (AutoNLP) is a technology that enables users to quickly build dozens or hundreds of AI models without manually training each model and evaluating it against their labeled data.


Automated Natural Language Processing (AutoNLP) is a technology that enables users to quickly build dozens or hundreds of AI models without manually training each model, evaluating it against their labeled data, and then repeating the process for every modeling approach to find the best one.

Building NLP models requires testing dozens of approaches

There are numerous modeling approaches in NLP, and each of them can provide value. Given a labeled data set, it’s not always clear in advance which approach will achieve the highest accuracy on that data set.

This problem is compounded if you need to build many different models. At Phase, we build hundreds of models to detect different topics, such as negative sentiment towards a company, discussions around diversity issues at work, problems with customers, and so on. Building a good model for each of these topics requires testing dozens of different approaches.

Dozens of models, each tested with dozens of approaches: finding the best approach for every model requires hundreds, potentially thousands, of experiments. This is why we use AutoNLP.

So, what is AutoNLP?

AutoNLP automates the work associated with building models so you can test the many different machine learning architectures and configurations available, and then simply choose the best one.

At Phase, we have pre-built dozens of modeling strategies and we test each of these strategies via our AutoNLP framework.

Our framework standardizes the way models are built so that every data set can be used within the modeling process. Similarly, we standardize each modeling approach so that we can simply “plug and play” new approaches into the workflow. This means that every modeling approach can be used to build models with every data set, so we can automatically test all the approaches and choose the best one without human intervention.

From a user’s perspective, given a labeled data set (e.g., a list of yes/no examples for a specific text category), our AutoNLP framework builds dozens of models to find the best-performing approach. We then use that best model to score the rest of your text.

Since this process is automated, we rerun it on a regular basis as new labeled examples are provided, to generate newer and better models.
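To make the “plug and play” idea concrete, here is a minimal sketch of such a selection loop. It is not Phase’s actual framework: the two toy approaches, the `select_best` function, the single holdout split, and the example texts are all assumptions made for illustration. The only point is that once every approach shares one interface (labeled examples in, trained model out), trying them all and keeping the best is a simple loop.

```python
import random
from typing import Callable, Dict, List, Tuple

# Hypothetical standard interfaces (assumptions, not Phase's real API).
Example = Tuple[str, bool]                   # (text, yes/no label for one topic)
Model = Callable[[str], bool]                # trained model: text -> prediction
Approach = Callable[[List[Example]], Model]  # approach: training data -> model


def majority_class(train: List[Example]) -> Model:
    """Baseline approach: always predict the most common training label."""
    positives = sum(1 for _, label in train if label)
    guess = positives * 2 >= len(train)
    return lambda text: guess


def keyword_overlap(train: List[Example]) -> Model:
    """Toy approach: vote using words seen in only one class during training."""
    pos_words, neg_words = set(), set()
    for text, label in train:
        (pos_words if label else neg_words).update(text.lower().split())
    only_pos, only_neg = pos_words - neg_words, neg_words - pos_words

    def predict(text: str) -> bool:
        words = set(text.lower().split())
        return len(words & only_pos) > len(words & only_neg)

    return predict


def accuracy(model: Model, examples: List[Example]) -> float:
    """Fraction of examples the model labels correctly."""
    return sum(model(text) == label for text, label in examples) / len(examples)


def select_best(
    approaches: Dict[str, Approach],
    data: List[Example],
    holdout_fraction: float = 0.25,
    seed: int = 0,
) -> Tuple[str, Dict[str, float]]:
    """Train every approach on the same split and rank them by holdout accuracy."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    split = max(1, int(len(shuffled) * holdout_fraction))
    holdout, train = shuffled[:split], shuffled[split:]
    scores = {name: accuracy(build(train), holdout)
              for name, build in approaches.items()}
    return max(scores, key=scores.get), scores


# Made-up labeled data: True means "negative sentiment towards the company".
labeled_data = [
    ("terrible service today", True),
    ("terrible support call", True),
    ("terrible product quality", True),
    ("terrible delivery delay", True),
    ("great store visit", False),
    ("great onboarding session", False),
    ("great pricing options", False),
    ("great website design", False),
]
approaches = {"majority_class": majority_class, "keyword_overlap": keyword_overlap}
best_name, scores = select_best(approaches, labeled_data)
print(best_name, scores)
```

Under these assumptions, adding a new modeling approach means adding one entry to `approaches`, and retraining on new labels means calling `select_best` again with a longer `labeled_data` list. A real system would likely use cross-validation and more robust metrics than a single holdout accuracy.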

The benefits of AutoNLP

There are numerous benefits of using this approach:

  • Models get more accurate over time. We can easily retrain models as you provide more data, so their performance improves as your labeled data grows. Regular retraining also lets you see whether the new labels you’ve provided are actually helping the models.
  • New modeling approaches are incorporated into our AutoNLP framework, and you benefit automatically. When new research is incorporated into our framework, it is automatically tested on your data to see if it improves the quality and performance of your models. This means you don’t have to build and rebuild models manually to benefit from new research.
  • You can build models on a whim. Our AutoNLP approach makes it easy to build and experiment with models. For instance, you can create smaller labeled data sets, test them on the modeling framework, and see how they work. This can then inform whether you’re labeling your data consistently enough to build a model, and whether you need to label more data to build an acceptable model.
  • Improve models iteratively. Our binary classifiers give us a view into how confident the models are in their labels. We improve models by applying new labels to the pieces of text that the models are not confident in. These new labels are quickly incorporated into the updated versions of the models by having our AutoNLP framework retrain models based on this new data without requiring humans to manage the process. In this way, our iterative approach helps you focus on labeling data points that are most beneficial to the model’s learning, and you benefit quickly from the new labels.
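
The iterative labeling idea in the last point is often called uncertainty sampling. Below is a minimal sketch of it, assuming a binary classifier that returns a probability between 0 and 1 for each text. The function name `least_confident` and the example probabilities are hypothetical, and this is not necessarily how Phase selects texts for labeling.

```python
from typing import Callable, List


def least_confident(
    texts: List[str],
    predict_proba: Callable[[str], float],
    k: int,
) -> List[str]:
    """Return the k texts whose predicted probability is closest to 0.5.

    A probability near 0.5 means the binary classifier is least sure,
    so a human label on that text is likely to be the most informative.
    """
    ranked = sorted(texts, key=lambda text: abs(predict_proba(text) - 0.5))
    return ranked[:k]


# Made-up model outputs standing in for a real classifier (assumption).
example_probabilities = {
    "Support never answered my ticket": 0.97,
    "The update is fine, I guess": 0.52,
    "Love the new dashboard": 0.08,
    "Not sure this was worth the price": 0.41,
}

to_label = least_confident(
    list(example_probabilities),
    example_probabilities.__getitem__,
    k=2,
)
print(to_label)
```

The texts returned here are the ones a person would label next. Once labeled, they are added to the training data and the models are retrained.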

These are just some of the reasons why our AutoNLP approach is so helpful in building accurate models quickly. It’s also why Phase NLP can provide so many models to address your own challenges and business needs.

© 2021