Taking a Cloud of Words and Giving Them Meaning

Problem

We worked with a locomotive company that wanted to extract insights from the unstructured text that technicians enter in the field during maintenance events.

Challenge

The data came from different sources, was written in different languages, was filled with acronyms and abbreviations, and contained typos and misspellings. The company wanted to take this cloud of words and boil each record down to a one- or two-word concept.
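
To make the cleanup problem concrete, here is a minimal sketch of one way such text could be normalized before modeling. It is not the pipeline used in the project: the abbreviation table, the vocabulary, the `normalize` function, and the 0.8 similarity cutoff are all hypothetical, and the sketch handles only abbreviations and misspellings, not multiple languages.

```python
import difflib
import re

# Hypothetical abbreviation table; a real one would come from domain experts.
ABBREVIATIONS = {"eng": "engine", "repl": "replaced", "brk": "brake"}

# Hypothetical vocabulary of known, correctly spelled terms.
VOCABULARY = ["engine", "brake", "replaced", "compressor", "leak", "valve"]


def normalize(record: str) -> list[str]:
    """Lowercase a record, expand abbreviations, and fix near-miss spellings."""
    tokens = re.findall(r"[a-z0-9]+", record.lower())
    cleaned = []
    for token in tokens:
        # Expand known abbreviations first.
        token = ABBREVIATIONS.get(token, token)
        if token not in VOCABULARY:
            # Replace a token with the closest vocabulary term, if one is
            # similar enough; otherwise keep the token unchanged.
            match = difflib.get_close_matches(token, VOCABULARY, n=1, cutoff=0.8)
            if match:
                token = match[0]
        cleaned.append(token)
    return cleaned
```

With these assumed tables, `normalize("Repl. BRK valve, compresor")` returns `["replaced", "brake", "valve", "compressor"]`. Tokens with no close match in the vocabulary pass through unchanged.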

Solution

The company already had a neural network model, but its accuracy was low and it did not scale well. We recommended changes to the methodology and then replaced the existing model with a support vector machine (SVM) model.
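
For readers unfamiliar with SVM text classifiers, the sketch below shows the general shape of one using scikit-learn. Everything in it is an assumption made for illustration: the toy records and labels are invented, and the choice of character n-gram TF-IDF features with a linear SVM is ours, since the details of the production model are not described here. Character n-grams are a common choice for noisy text because a misspelled word still shares most of its n-grams with the correct spelling.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Invented toy records and concept labels, for illustration only.
records = [
    "replaced brake pads on rear axle",
    "brake shoe worn, replaced",
    "brake cylinder leaking air",
    "engine oil leak at gasket",
    "engine overheating, coolant low",
    "engine will not start, cranks slowly",
]
concepts = ["brake", "brake", "brake", "engine", "engine", "engine"]

# Assumed feature choice: TF-IDF over character n-grams within word
# boundaries, followed by a linear support vector classifier.
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LinearSVC(random_state=0),
)
model.fit(records, concepts)


def predict_concept(record: str) -> str:
    """Map one free-text record to its predicted concept label."""
    return str(model.predict([record])[0])
```

Calling `predict_concept` on a new record returns one of the labels seen in training. A linear SVM over sparse features is typically cheap to train and to score.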

Outcome

The new model increased accuracy by 24 percentage points, required fewer computational resources, and was deployed for real-time use.