FuzzyWuzzy Python Library: Interesting Tool for NLP and Text Analytics
Introduction
There are many ways to compare text in Python, but often we just want an easy one. Comparing text is a common need in text analytics and Natural Language Processing (NLP), for example when matching names or spotting near-duplicate entries.
One of the easiest ways of comparing text in Python is the FuzzyWuzzy library (package name fuzzywuzzy). It returns a similarity score from 0 to 100, where 100 means the two strings are an exact match and lower scores mean the strings differ more. The library uses Levenshtein distance to calculate the difference between two strings.
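To give a feel for what a score out of 100 looks like, here is a minimal sketch that uses only Python's standard library. It does not call FuzzyWuzzy: the helper name similarity_score is hypothetical, and difflib.SequenceMatcher is used as a stand-in, so the numbers it produces may differ from the library's.

```python
from difflib import SequenceMatcher


def similarity_score(a: str, b: str) -> int:
    """Return a 0-100 similarity score for two strings.

    Hypothetical helper for illustration only; this is a standard-library
    stand-in and not FuzzyWuzzy's own implementation.
    """
    # SequenceMatcher.ratio() returns a float between 0.0 and 1.0.
    return round(100 * SequenceMatcher(None, a, b).ratio())


if __name__ == "__main__":
    for left, right in [("apple", "apple"), ("apple", "appel"), ("abc", "xyz")]:
        print(left, right, similarity_score(left, right))
```

Identical strings land at the top of the scale and strings with nothing in common land at the bottom; everything else falls in between.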
Levenshtein Distance
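The Levenshtein distance is a string metric that measures how different two strings are: it is the minimum number of single-character edits (insertions, deletions or substitutions) needed to turn one string into the other. The Soviet mathematician Vladimir Levenshtein formulated it, and it is named after him.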
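To make the idea concrete, here is a minimal sketch of the distance in plain Python, counting the insertions, deletions and substitutions needed to turn one string into another. It uses the common dynamic-programming approach that keeps two rows of a table; the function name levenshtein is chosen for illustration and this is not the code FuzzyWuzzy runs internally.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the Levenshtein (edit) distance between strings a and b."""
    # previous[j] is the distance between the prefix of a processed so far
    # (initially empty) and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        # Turning the first i characters of a into an empty string
        # takes i deletions.
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,  # delete char_a
                    current[j - 1] + 1,  # insert char_b
                    previous[j - 1] + substitution_cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))  # 3
```

Going from "kitten" to "sitting" takes three edits: substitute k with s, substitute e with i, and insert g at the end.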
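The Levenshtein distance between two strings a, b (of length |a| and |b| respectively) is given by lev(a,b) where

$$
\operatorname{lev}(a,b) =
\begin{cases}
|a| & \text{if } |b| = 0, \\
|b| & \text{if } |a| = 0, \\
\operatorname{lev}(\operatorname{tail}(a), \operatorname{tail}(b)) & \text{if } a[0] = b[0], \\
1 + \min
\begin{cases}
\operatorname{lev}(\operatorname{tail}(a), b) \\
\operatorname{lev}(a, \operatorname{tail}(b)) \\
\operatorname{lev}(\operatorname{tail}(a), \operatorname{tail}(b))
\end{cases}
& \text{otherwise,}
\end{cases}
$$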
where