Character Entropy in Modern and Historical Texts: Comparison Metrics for an Undeciphered Manuscript

This paper outlines the creation of three corpora for multilingual comparison and analysis of the Voynich manuscript: a corpus of Voynich texts partitioned by Currier language, scribal hand, and transcription system, a corpus of 294 language samples compiled from Wikipedia, and a corpus of eighteen transcribed historical texts in eight languages. These corpora will be utilized in subsequent work by the Voynich Working Group at Yale University… We demonstrate the utility of these corpora for studying characteristics of the Voynich […]


Hacks to perform faster Text Mining in R

Introduction Data science demands versatility. Move away from your regular methods, challenge your ways of working, and explore new ways of doing things more efficiently. Looking back on my initial years in data science, I too was trapped by the devil of ‘complacency’. At one point, I was not challenging myself enough. I wasn’t experimenting with ways of working. I accepted things as they were, until I realized ‘Complacency is a state of mind […]


How to leverage Social Media Analytics for your business?

Introduction Conventional media, such as television, radio, or newspapers, transmit information in only one direction. Users can consume the information the media offer, but they have little or no ability to share their own views on the subject. Nowadays, digital media have made possible a two-way form of communication that allows individuals to interact with the information being transmitted. This is known as social media, which encompasses a wide variety of online content, from social […]


Introductory guide to Information Retrieval using kNN and KDTree

Introduction I love cricket as much as I love data science. A few years back (on 16 November 2013, to be precise), my favorite cricketer, Sachin Tendulkar, retired from international cricket. I spent that entire day reading articles and blogs about him on the web. By the end of the day, I had read close to 50 articles about him. Interestingly, while I was reading these articles, none of the websites suggested any articles to me outside of Sachin or cricket. […]


Automatic Image Captioning using Deep Learning (CNN and LSTM) in PyTorch

Introduction Deep Learning is a very active field right now, with new applications coming out day by day. And the best way to get deeper into Deep Learning is to get hands-on with it. Take up as many projects as you can, and try to do them on your own. This will help you grasp the topics in more depth and become a better Deep Learning practitioner. In this article, we will take a look […]


A Must-Read NLP Tutorial on Neural Machine Translation – The Technique Powering Google Translate

Introduction “If you talk to a man in a language he understands, that goes to his head. If you talk to him in his own language, that goes to his heart.” – Nelson Mandela The beauty of language transcends boundaries and cultures. Learning a language other than our mother tongue is a huge advantage. But the path to bilingualism, or multilingualism, can often be a long, never-ending one. There are so many little nuances that we get lost in the […]


8 Awesome Data Science Capstone Projects from Praxis Business School

Introduction It is not the strongest or the most intelligent who will survive but those who can best manage change. Evolution is the only way anything can survive in this universe. And when it comes to industry-relevant education in a fast-evolving domain like Machine Learning and Artificial Intelligence, it is necessary to evolve or you will simply perish (over time). I have personally experienced this firsthand while building Analytics Vidhya. It still amazes me to see […]


OpenAI’s GPT-2: A Simple Guide to Build the World’s Most Advanced Text Generator in Python

Overview Learn how to build your own text generator in Python using OpenAI’s GPT-2 framework. GPT-2 is a state-of-the-art NLP framework and a major breakthrough. We will learn how it works and then implement our own text generator using GPT-2. Introduction “The world’s best economies are directly linked to a culture of encouragement and positive feedback.” Can you guess who said that? It wasn’t a President or Prime Minister. It certainly wasn’t a leading economist like Raghuram Rajan. […]


6 Exciting Open Source Data Science Projects you Should Start Working on Today

Overview Here are six open-source data science projects to enhance your skillset. These projects cover a diverse set of domains, from computer vision to natural language processing (NLP), among others. Pick your favorite open-source data science project(s) and get coding! Introduction I recently helped out in a round of interviews for an open data scientist position. As you can imagine, there were candidates from all kinds of backgrounds – software engineering, learning and development, finance, marketing, etc. What stood […]


spaCy Tutorial to Learn and Master Natural Language Processing (NLP)

Introduction spaCy is my go-to library for Natural Language Processing (NLP) tasks. I’d venture to say that’s the case for the majority of NLP experts out there! Among the plethora of NLP libraries these days, spaCy really does stand out on its own. If you’ve used spaCy for NLP, you’ll know exactly what I’m talking about. And if you’re new to the power of spaCy, you’re about to be enthralled by how multi-functional and flexible this library is. The factors […]
