Flexible retrieval with NMSLIB and FlexNeuART

Our objective is to introduce to the NLP community an existing k-NN search library NMSLIB, a new retrieval toolkit FlexNeuART, as well as their integration capabilities. NMSLIB, while being one of the fastest k-NN search libraries, is quite generic and supports a variety of distance/similarity functions… Because the library relies on distance-based, structure-agnostic algorithms, it can be further extended by adding new distances. FlexNeuART is a modular, extensible and flexible toolkit for candidate generation in IR and QA applications, which […]

Attribution Preservation in Network Compression for Reliable Network Interpretation

Neural networks embedded in safety-sensitive applications such as self-driving cars and wearable health monitors rely on two important techniques: input attribution for hindsight analysis and network compression to reduce their size for edge computing. In this paper, we show that these seemingly unrelated techniques conflict with each other, as network compression deforms the produced attributions, which could lead to dire consequences for mission-critical applications… This phenomenon arises because conventional network compression methods only preserve the predictions of […]

Character Entropy in Modern and Historical Texts: Comparison Metrics for an Undeciphered Manuscript

This paper outlines the creation of three corpora for multilingual comparison and analysis of the Voynich manuscript: a corpus of Voynich texts partitioned by Currier language, scribal hand, and transcription system, a corpus of 294 language samples compiled from Wikipedia, and a corpus of eighteen transcribed historical texts in eight languages. These corpora will be utilized in subsequent work by the Voynich Working Group at Yale University… We demonstrate the utility of these corpora for studying characteristics of the Voynich […]

Hacks to perform faster Text Mining in R

Introduction: Data science demands versatility. Move away from your regular methods, challenge your ways of working, and explore new ways of doing things more efficiently. Reminiscing about my initial years in data science, I too was trapped by this devil of ‘complacency’. At one point, I was not challenging myself enough. I wasn’t experimenting with ways of doing my work. I accepted things as they were, until I realized ‘Complacency is a state of mind […]

How to leverage Social Media Analytics for your business?

Introduction: Conventional media, such as television, radio or newspapers, transmit information in only one direction. Users can consume the information the media offer, but they have little or no ability to share their own views on the subject. Nowadays, digital media have made possible a two-way form of communication that allows individuals to interact with the information being transmitted. This is known as social media, which encompasses a wide variety of online content, from social […]

Introductory guide to Information Retrieval using kNN and KDTree

Introduction: I love cricket as much as I love data science. A few years back (on 16 November 2013, to be precise), my favorite cricketer, Sachin Tendulkar, retired from international cricket. I spent that entire day reading articles and blogs about him on the web. By the end of the day, I had read close to 50 articles about him. Interestingly, while I was reading these articles, none of the websites suggested articles to me outside of Sachin or cricket. […]

Automatic Image Captioning using Deep Learning (CNN and LSTM) in PyTorch

Introduction: Deep Learning is a very active field right now, with new applications coming out day by day. And the best way to get deeper into Deep Learning is to get hands-on with it. Take up as many projects as you can, and try to do them on your own. This will help you grasp the topics in more depth and become a better Deep Learning practitioner. In this article, we will take a look […]

A Must-Read NLP Tutorial on Neural Machine Translation – The Technique Powering Google Translate

Introduction: “If you talk to a man in a language he understands, that goes to his head. If you talk to him in his own language, that goes to his heart.” (Nelson Mandela) The beauty of language transcends boundaries and cultures. Learning a language other than our mother tongue is a huge advantage. But the path to bilingualism, or multilingualism, can often be a long, never-ending one. There are so many little nuances that we get lost in the […]

8 Awesome Data Science Capstone Projects from Praxis Business School

Introduction: It is not the strongest or the most intelligent who will survive but those who can best manage change. Evolution is the only way anything can survive in this universe. And when it comes to industry-relevant education in a fast-evolving domain like Machine Learning and Artificial Intelligence, it is necessary to evolve or you will simply perish (over time). I have personally experienced this firsthand while building Analytics Vidhya. It still amazes me to see […]

OpenAI’s GPT-2: A Simple Guide to Build the World’s Most Advanced Text Generator in Python

Overview: Learn how to build your own text generator in Python using OpenAI’s GPT-2 framework. GPT-2 is a state-of-the-art NLP framework, a truly incredible breakthrough. We will learn how it works and then implement our own text generator using GPT-2. Introduction: “The world’s best economies are directly linked to a culture of encouragement and positive feedback.” Can you guess who said that? It wasn’t a President or Prime Minister. It certainly wasn’t a leading economist like Raghuram Rajan. […]
