The Ultimate Learning Path to Become a Data Scientist and Master Machine Learning in 2019

The Learning Path to Become a Data Scientist in 2020 is now live! Head over here to start your data science journey. Introduction: Learning paths are immensely popular among our readers, and with good reason! Learning paths take away the pain and confusion from the learning process. For those who don’t know what a learning path is – we take the pain of going through all the resources available on data science, machine learning and Artificial Intelligence, select the best […]

Introduction to PyTorch-Transformers: An Incredible Library for State-of-the-Art NLP (with Python code)

Overview: In this article, we look at PyTorch-Transformers, the latest state-of-the-art NLP library. We will also implement PyTorch-Transformers in Python using popular NLP models like Google’s BERT and OpenAI’s GPT-2! This has the potential to revolutionize the landscape of NLP as we know it. Introduction: “NLP’s ImageNet moment has arrived.” – Sebastian Ruder. Imagine having the power to build the Natural Language Processing (NLP) model that powers Google Translate. What if I told you this can be done […]

Demystifying BERT: A Comprehensive Guide to the Groundbreaking NLP Framework

Overview: Google’s BERT has transformed the Natural Language Processing (NLP) landscape. Learn what BERT is, how it works, and the seismic impact it has made, among other things. We’ll also implement BERT in Python to give you a hands-on learning experience. Introduction to the World of BERT: Picture this – you’re working on a really cool data science project and have applied the latest state-of-the-art library to get a pretty good result. And boom! A few days later, there’s a […]

3 Important NLP Libraries for Indian Languages You Should Try Out Today!

Overview: Ever wondered how to use NLP models in Indian languages? This article is all about breaking boundaries and exploring 3 amazing libraries for Indian languages. We will implement plenty of NLP tasks in Python using these 3 libraries and work with Indian languages. Introduction: Language is a wonderful tool of communication – it’s powered the human race for centuries and continues to be at the heart of our culture. The sheer number of languages in the world dwarf […]

Build Text Categorization Model with Spark NLP

Overview: Setting up John Snow Labs Spark-NLP on AWS EMR and using the library to perform a simple text categorization of BBC articles. Introduction: Natural Language Processing is one of the important processes for data science teams across the globe. With ever-growing data, most organizations have already moved to big data platforms like Apache Hadoop and cloud offerings like AWS, Azure, and GCP. These platforms are more than capable of handling […]

Text Mining Simplified – IPL 2020 Tweet Analysis with R

This article was published as a part of the Data Science Blogathon. Introduction: Text mining utilizes different AI technologies to automatically process data and generate valuable insights, enabling companies to make data-driven decisions. Text mining identifies facts, relationships, and assertions that would otherwise remain buried in the mass of textual big data. Once extracted, this information is converted into a structured form that can be further analyzed, or presented directly using clustered HTML tables, mind maps, charts, etc. Advantages of […]

Beyond Marginal Uncertainty: How Accurately can Bayesian Regression Models Estimate Posterior Predictive Correlations?

While uncertainty estimation is a well-studied topic in deep learning, most such work focuses on marginal uncertainty estimates, i.e. the predictive mean and variance at individual input locations. But it is often more useful to estimate predictive correlations between the function values at different input locations… In this paper, we consider the problem of benchmarking how accurately Bayesian models can estimate predictive correlations. We first consider a downstream task which depends on posterior predictive correlations: transductive active learning (TAL). We […]

Semi-Supervised Low-Resource Style Transfer of Indonesian Informal to Formal Language with Iterative Forward-Translation

In its daily use, the Indonesian language is riddled with informality, that is, deviations from the standard in terms of vocabulary, spelling, and word order. On the other hand, currently available Indonesian NLP models are typically developed with standard Indonesian in mind… In this work, we address style transfer from informal to formal Indonesian as a low-resource machine translation problem. We build a new dataset of parallel sentences of informal Indonesian and its formal counterpart. We benchmark several strategies […]

Deep coastal sea elements forecasting using U-Net based models

Due to the development of deep learning techniques applied to satellite imagery, weather forecasting that uses remote sensing data has also been the subject of major progress. The present paper investigates multiple-step-ahead frame prediction for coastal sea elements in the Netherlands using U-Net based architectures… Hourly data from the Copernicus observation programme spanning a period of 2 years has been used to train the models and make the forecasts, including seasonal predictions. We propose a variation of […]

From Dataset Recycling to Multi-Property Extraction and Beyond

This paper investigates various Transformer architectures on the WikiReading Information Extraction and Machine Reading Comprehension dataset. The proposed dual-source model outperforms the current state-of-the-art by a large margin… Next, we introduce WikiReading Recycled, a newly developed public dataset, and the task of multiple property extraction. It uses the same data as WikiReading but does not inherit its predecessor’s identified disadvantages. In addition, we provide a human-annotated test set with diagnostic subsets for a detailed analysis of model performance.
