Python: Safely Create Nested Directory

Introduction: File manipulation is one of the most important skills to master in any programming language, and doing it correctly is of utmost importance. A mistake could cause problems in your program, in other programs running on the same system, and even in the system itself. Errors can occur because the parent directory does not exist, or because other programs change files in the file system at the same time, creating what is called a race condition. A […]
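As a rough illustration of the two failure modes mentioned in this excerpt (a missing parent directory and a race with another process), here is a minimal sketch using the standard library's pathlib. The helper name ensure_dir and the example directory names are hypothetical and chosen only for illustration; the full article may take a different approach.

```python
import tempfile
from pathlib import Path


def ensure_dir(path):
    """Create `path` and any missing parents; do nothing if it already exists."""
    target = Path(path)
    # parents=True creates missing parent directories.
    # exist_ok=True avoids a FileExistsError when the directory already exists,
    # for example because another process created it a moment earlier.
    target.mkdir(parents=True, exist_ok=True)
    return target


# Work inside a temporary directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as base:
    nested = ensure_dir(Path(base) / "reports" / "2020" / "december")
    ensure_dir(nested)  # calling it again is harmless
    print(nested.is_dir())
```

Note that exist_ok=True only tolerates an existing directory; if the path already exists as a regular file, mkdir still raises FileExistsError.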


How to Merge DataFrames in Pandas – merge(), join(), append(), concat() and update()

Introduction: Pandas provides a huge range of methods and functions to manipulate data, including merging DataFrames. Merging DataFrames allows you to either create a new DataFrame without modifying the original data source or alter the original data source. If you are familiar with SQL or a similar type of tabular data, you are probably familiar with the term join, which means combining DataFrames to form a new DataFrame. If you are a beginner it can be hard to fully […]


Best Practices for Data-Efficient Modeling in NLG: How to Train Production-Ready Neural Models with Less Data

December 8, 2020. By: Ankit Arun, Soumya Batra, Vikas Bhardwaj, Ashwini Challa, Pinar Donmez, Peyman Heidari, Hakan Inan, Shashank Jain, Anuj Kumar, Shawn Mei, Karthik Mohan, Michael White. Abstract: Natural language generation (NLG) is a critical component in conversational systems, owing to its role of formulating a correct and natural text response. Traditionally, NLG components have been deployed using template-based solutions. Although neural network solutions recently developed in the research community have been shown to provide several benefits, deployment of […]


Semi-Supervised Learning With Label Spreading

Semi-supervised learning refers to algorithms that attempt to make use of both labeled and unlabeled training data. Semi-supervised learning algorithms are unlike supervised learning algorithms that are only able to learn from labeled training data. A popular approach to semi-supervised learning is to create a graph that connects examples in the training dataset and propagates known labels through the edges of the graph to label unlabeled examples. An example of this approach to semi-supervised learning is the label spreading algorithm […]


How to Use Global and Nonlocal Variables in Python

Introduction: In this article, we’ll take a look at global and nonlocal variables in Python and how to use them to avoid issues when writing code. We’ll start with a brief primer on variable scopes before we launch into the how and why of using global and nonlocal variables in your own functions. Scopes in Python: Before we can get started, we first have to touch on scopes. For those of you who are less familiar, […]
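To make the two keywords named in this excerpt concrete, here is a small sketch. The names counter, increment_global, make_counter and step are hypothetical and used only for illustration; they are not taken from the article.

```python
counter = 0  # module-level (global) variable


def increment_global():
    # Without this declaration, `counter += 1` would raise UnboundLocalError,
    # because assignment makes `counter` local to the function by default.
    global counter
    counter += 1
    return counter


def make_counter():
    count = 0  # lives in the enclosing function's scope

    def step():
        # `nonlocal` rebinds the variable in the nearest enclosing function
        # scope instead of creating a new local one.
        nonlocal count
        count += 1
        return count

    return step


increment_global()
step = make_counter()
step()
step()
```

Each call to make_counter() produces an independent count, whereas every caller of increment_global() shares the same module-level counter.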


Semi-Supervised Learning With Label Propagation

Semi-supervised learning refers to algorithms that attempt to make use of both labeled and unlabeled training data. Semi-supervised learning algorithms are unlike supervised learning algorithms that are only able to learn from labeled training data. A popular approach to semi-supervised learning is to create a graph that connects examples in the training dataset and propagates known labels through the edges of the graph to label unlabeled examples. An example of this approach to semi-supervised learning is the label propagation algorithm […]
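The graph idea described in this excerpt can be sketched in plain Python. This is a deliberately simplified illustration, not the algorithm as implemented in any particular library: it assumes an unweighted graph given as an edge list, keeps the known labels fixed, and repeatedly replaces each unlabeled node's class distribution with the average of its neighbors'. The function name propagate_labels and the toy chain graph are hypothetical.

```python
def propagate_labels(edges, known, n_nodes, n_classes, n_iter=50):
    """Spread known labels along graph edges by repeated neighbor averaging."""
    neighbors = {i: [] for i in range(n_nodes)}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    # Labeled nodes start one-hot; unlabeled nodes start uniform.
    dist = []
    for i in range(n_nodes):
        if i in known:
            dist.append([1.0 if c == known[i] else 0.0 for c in range(n_classes)])
        else:
            dist.append([1.0 / n_classes] * n_classes)

    for _ in range(n_iter):
        updated = []
        for i in range(n_nodes):
            if i in known or not neighbors[i]:
                # Known labels stay fixed; isolated nodes cannot receive anything.
                updated.append(dist[i])
            else:
                k = len(neighbors[i])
                updated.append(
                    [sum(dist[j][c] for j in neighbors[i]) / k for c in range(n_classes)]
                )
        dist = updated

    # Each node takes the class with the highest propagated score.
    return [max(range(n_classes), key=lambda c: d[c]) for d in dist]


# Toy chain 0-1-2-3-4-5 with only the two end nodes labeled.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
labels = propagate_labels(edges, known={0: 0, 5: 1}, n_nodes=6, n_classes=2)
print(labels)
```

On the toy chain, nodes closer to the node labeled 0 end up in class 0 and nodes closer to the node labeled 1 end up in class 1.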


Multinomial Logistic Regression With Python

Multinomial logistic regression is an extension of logistic regression that adds native support for multi-class classification problems. Logistic regression, by default, is limited to two-class classification problems. Some extensions like one-vs-rest can allow logistic regression to be used for multi-class classification problems, although they require that the classification problem first be transformed into multiple binary classification problems. Instead, the multinomial logistic regression algorithm is an extension to the logistic regression model that involves changing the loss function to cross-entropy loss […]
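As a small illustration of the cross-entropy loss mentioned in this excerpt, the sketch below converts per-class scores into probabilities with a softmax and scores them against the true class. The function names and the example scores are assumptions made for illustration; the article itself may rely on a library implementation instead.

```python
import math


def softmax(scores):
    """Turn raw per-class scores into probabilities that sum to 1."""
    m = max(scores)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, true_class):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[true_class])


scores = [2.0, 1.0, 0.1]  # hypothetical scores for three classes
probs = softmax(scores)
print(probs)
print(cross_entropy(probs, 0))  # small loss: class 0 has the highest score
print(cross_entropy(probs, 2))  # larger loss: class 2 has the lowest score
```

The loss is low when the model puts high probability on the true class and grows as that probability shrinks, which is what lets a single model handle more than two classes directly.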


Fake news classifier on US Election News📰 | LSTM 🈚

Introduction: News media has become a channel for passing on information about what’s happening in the world. People often perceive whatever is conveyed in the news to be true. There have been cases where even news channels acknowledged that what they reported was not true. But some news has a significant impact not only on the people or […]


Ultimate Guide to Heatmaps in Seaborn with Python

Introduction: A heatmap is a data visualization technique that uses color to show how a value of interest changes depending on the values of two other variables. For example, you could use a heatmap to understand how air pollution varies according to the time of day across a set of cities. Another, perhaps rarer, use of heatmaps is to observe human behavior – you can create visualizations of how people use social media, how their answers on surveys […]


Histogram-Based Gradient Boosting Ensembles in Python

Gradient boosting is an ensemble algorithm built from decision trees. It may be one of the most popular techniques for structured (tabular) classification and regression predictive modeling problems, given that it performs so well across a wide range of datasets in practice. A major problem of gradient boosting is that the model is slow to train. This is particularly a problem when using the model on large datasets with tens of thousands of examples (rows). Training the trees that are […]
