Efficient One-Pass End-to-End Entity Linking for Questions

November 16, 2020. By: Belinda Z. Li, Sewon Min, Srinivasan Iyer, Yashar Mehdad, Wen-tau Yih

Abstract: We present ELQ, a fast end-to-end entity linking model for questions, which uses a biencoder to jointly perform mention detection and linking in one pass. Evaluated on WebQSP and GraphQuestions with extended annotations that cover multiple entities per question, ELQ outperforms the previous state of the art by a large margin of +12.7% and +19.6% F1, respectively. With a very fast inference time (1.57 […]

Feature Selection with Stochastic Optimization Algorithms

Typically, a simpler and better-performing machine learning model can be developed by removing input features (columns) from the training dataset. This is called feature selection and there are many different types of algorithms that can be used. It is possible to frame the problem of feature selection as an optimization problem. In the case that there are few input features, all possible combinations of input features can be evaluated and the best subset found definitively. In the case of a […]
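The exhaustive case described above can be sketched in a few lines. This is a minimal illustration, not the article's own code: `best_subset`, `toy_score`, and the `USEFUL` set are hypothetical names, and the scoring function is a stand-in for whatever model evaluation (for example, cross-validated accuracy) one would actually use.

```python
from itertools import combinations


def best_subset(n_features, score):
    """Score every non-empty subset of feature indices and return the best one."""
    best, best_score = None, float("-inf")
    for k in range(1, n_features + 1):
        for subset in combinations(range(n_features), k):
            s = score(subset)
            if s > best_score:
                best, best_score = subset, s
    return best, best_score


# Hypothetical scoring function (an assumption for illustration only):
# features 0 and 2 carry signal, and every selected feature costs a small penalty.
USEFUL = {0, 2}


def toy_score(subset):
    hits = len(USEFUL & set(subset))
    return hits - 0.1 * len(subset)


subset, value = best_subset(4, toy_score)
print(subset, value)
```

With n features there are 2^n - 1 non-empty subsets to score, so this approach is only practical when n is small, which is why the problem is otherwise framed as a search.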

Reading and Writing HTML Tables with Pandas

Introduction Hypertext Markup Language (HTML) is the standard markup language for building web pages. We can render tabular data using HTML’s <table> element. The Pandas data analysis library provides functions like read_html() and to_html() so we can import and export data to DataFrames. In this article, we will learn how to read tabular data from an HTML file and load it into a Pandas DataFrame. We’ll also learn how to write data from a Pandas DataFrame to an HTML file. […]

Ensemble Learning Algorithm Complexity and Occam’s Razor

Occam’s razor suggests that in machine learning, we should prefer simpler models with fewer coefficients over complex models like ensembles. Taken at face value, the razor is a heuristic that suggests more complex hypotheses make more assumptions that, in turn, will make them too narrow and not generalize well. In machine learning, it suggests complex models like ensembles will overfit the training dataset and perform poorly on new data. In practice, ensembles are almost universally the type of model chosen […]

How to Choose an Optimization Algorithm

Optimization is the problem of finding a set of inputs to an objective function that results in a maximum or minimum function evaluation. It is the challenging problem that underlies many machine learning algorithms, from fitting logistic regression models to training artificial neural networks. There are perhaps hundreds of popular optimization algorithms, and perhaps tens of algorithms to choose from in popular scientific code libraries. This can make it challenging to know which algorithms to consider for a given optimization […]
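To make the definition above concrete, here is a small sketch of one of the simplest possible optimizers, a pure random search, applied to a one-dimensional objective. The function names, the objective, the bounds, and the sample count are all assumptions chosen for illustration; the article itself may discuss quite different algorithms.

```python
import random


def objective(x):
    """Toy objective (an assumption): a parabola with its minimum at x = 3."""
    return (x - 3.0) ** 2


def random_search(func, lower, upper, n_samples, seed=0):
    """Sample inputs uniformly in [lower, upper] and keep the one with the lowest value."""
    rng = random.Random(seed)
    best_x, best_f = None, float("inf")
    for _ in range(n_samples):
        x = rng.uniform(lower, upper)
        f = func(x)
        if f < best_f:
            best_x, best_f = x, f
    return best_x, best_f


best_x, best_f = random_search(objective, -10.0, 10.0, n_samples=2000)
print(best_x, best_f)
```

Random search uses no gradient information, so it works on any objective that can be evaluated, but it typically needs far more evaluations than methods that exploit the structure of the function.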

Matplotlib Line Plot – Tutorial and Examples

Introduction Matplotlib is one of the most widely used data visualization libraries in Python. From simple to complex visualizations, it’s the go-to library for most. In this tutorial, we’ll take a look at how to plot a line plot in Matplotlib – one of the most basic types of plots. Line Plots display numerical values on one axis, and categorical values on the other. They can typically be used in much the same way Bar Plots can be used, though, […]

Matplotlib Violin Plot – Tutorial and Examples

Introduction There are many data visualization libraries in Python, yet Matplotlib is the most popular library out of all of them. Matplotlib’s popularity is due to its reliability and utility – it’s able to create both simple and complex plots with little code. You can also customize the plots in a variety of ways. In this tutorial, we’ll cover how to plot Violin Plots in Matplotlib. Violin plots are used to visualize data distributions, displaying the range, median, and distribution […]

How to Upload Files with Python’s requests Library

Introduction Python is supported by many libraries which simplify data transfer over HTTP. The requests library is one of the most popular Python packages as it’s heavily used in web scraping. It’s also popular for interacting with servers! The library makes it easy to upload data in a popular format like JSON, and just as easy to upload files. In this tutorial, we will take a look at how to upload files using Python’s requests library. The […]

Seaborn Violin Plot – Tutorial and Examples

Introduction Seaborn is one of the most widely used data visualization libraries in Python, as an extension to Matplotlib. It offers a simple, intuitive, yet highly customizable API for data visualization. In this tutorial, we’ll take a look at how to plot a Violin Plot in Seaborn. Violin plots are used to visualize data distributions, displaying the range, median, and distribution of the data. Violin plots show the same summary statistics as box plots, but they also include Kernel Density […]
