The State of Multilingual AI
Models that allow interaction via natural language have become ubiquitous. Research models such as BERT and T5 have become much more accessible, while the latest generation of language and multi-modal models demonstrates increasingly powerful capabilities. At the same time, a wave of NLP startups has started to put this technology to practical use. While such language technology may be hugely impactful, recent models have mostly focused on English and a handful of other languages with large amounts of resources. […]
Predicting the outcome of the World Cup in 150 lines of Python
It seems like machine learning and deep learning are everywhere these days. AI can do anything! From writing essays to cheat on tests to running an infinite image-generating machine on your local machine, the possibilities are endless. So it should come as no surprise that you can use deep learning to predict the outcome of major sports tournaments.
K-Means Clustering: A Centroid-based Algorithm
K-means clustering is a centroid-based, unsupervised machine learning algorithm. Unsupervised learning uses a machine learning algorithm to analyze unlabelled data and find hidden patterns without human intervention. As the name suggests, K-means is a clustering algorithm. Clustering is a technique where we can group together a set […]
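To make the idea concrete, here is a minimal sketch (not taken from the post) that clusters synthetic 2-D points with scikit-learn's KMeans and prints the learned centroids:

```python
# Minimal K-means sketch: cluster synthetic 2-D points into k groups
# and inspect the centroids the algorithm discovers on its own.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Unlabelled data: 300 points drawn around 3 hidden centres.
X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

# Fit K-means with k=3; each point is assigned to its nearest centroid.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)

print("Centroids:\n", kmeans.cluster_centers_)
print("First 10 cluster assignments:", labels[:10])
```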
Sentence-BERT (S-BERT) Multilingual NLP model for the German Language (Python)
“Semantic search is a data-searching technique to determine the intent and contextual meaning of words, similar to the human mind”
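As a rough illustration of that idea, the sketch below embeds a few German sentences with a multilingual S-BERT checkpoint and ranks them against a query by cosine similarity. The model name and example sentences are assumptions for illustration, not necessarily what the post uses:

```python
# Hedged sketch of semantic search with a multilingual S-BERT model.
# German documents and a German query are mapped into the same vector
# space, then ranked by cosine similarity rather than keyword overlap.
from sentence_transformers import SentenceTransformer, util

# Assumed checkpoint; the post may use a different one.
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

documents = [
    "Der Zug nach Berlin ist verspätet.",
    "Das Restaurant hat heute geschlossen.",
    "Die Bahnverbindung nach München wurde gestrichen.",
]
query = "Gibt es Probleme mit dem Zugverkehr?"

doc_emb = model.encode(documents, convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)

# Rank documents by semantic similarity to the query.
scores = util.cos_sim(query_emb, doc_emb)[0]
for doc, score in sorted(zip(documents, scores.tolist()), key=lambda x: -x[1]):
    print(f"{score:.3f}  {doc}")
```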
Combining Embedding and Keyword Based Search for Improved Performance
TLDR — Ensembling keyword and embedding models for search is one of the quickest and easiest ways to improve search performance over standard embedding-based search. There is a large body of evidence in the machine learning literature showing that this helps with in-domain performance, out-of-domain generalization, and multilingual transfer. The reason seems to be that sparse and dense representations of text capture complementary linguistic qualities of their […]
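A minimal sketch of such an ensemble, assuming TF-IDF for the sparse side, an S-BERT encoder for the dense side, and a min-max-normalised weighted sum (the blend weight and model choice are illustrative assumptions, not the post's exact recipe):

```python
# Hedged sketch of a hybrid (keyword + embedding) ranker: normalise each
# score list, then blend them with a weight alpha.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sentence_transformers import SentenceTransformer

docs = [
    "How to fine-tune BERT for sentiment analysis",
    "A beginner's guide to gradient boosting",
    "Improving search relevance with dense retrieval",
]
query = "better ranking with neural retrieval"

# Sparse (keyword) scores via TF-IDF cosine similarity.
tfidf = TfidfVectorizer().fit(docs)
sparse_scores = cosine_similarity(tfidf.transform([query]), tfidf.transform(docs))[0]

# Dense (embedding) scores via an S-BERT encoder (assumed checkpoint).
model = SentenceTransformer("all-MiniLM-L6-v2")
dense_scores = cosine_similarity(model.encode([query]), model.encode(docs))[0]

def minmax(x):
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min() + 1e-9)

alpha = 0.5  # blend weight between sparse and dense evidence (assumed)
hybrid = alpha * minmax(sparse_scores) + (1 - alpha) * minmax(dense_scores)

for doc, score in sorted(zip(docs, hybrid), key=lambda x: -x[1]):
    print(f"{score:.3f}  {doc}")
```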