The NLP Cypher | 09.05.21
Hey, welcome back! A flood of EMNLP 2021 papers came in this week, so today’s newsletter should be loads of fun! 😋
But first, a meme search engine:
An article on The Gradient had an interesting take on NLU. It describes how a neural network’s capacity for NLU inference is inherently bounded by its background knowledge (which is usually highly limited relative to a human’s). I would add a bit of nuance here: this is only a problem for a model that is not localized for its user, i.e. one that wasn’t fine-tuned/prompted (localized) for a specific user. For information that is general and has a ground truth (e.g. rain is wet, or rain falls to the ground), the missing text phenomenon (MTP) isn’t a big issue given a large enough dataset/model.
I think a bigger issue in text-only NLU is when the data doesn’t match the complexity of the real world, meaning there isn’t enough information in the text modality alone. Humans by default take a multi-modal approach (text, audio, visual, etc.) when interpreting the world around them, which helps with inference. Multi-modal learning can be a viable approach to the MTP examples discussed in the article; a rough sketch of the idea follows below.
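To make the multi-modal point concrete, here’s a minimal sketch (my own illustration, not from the article) of late fusion: text and image embeddings are concatenated before a shared classification head, so the model’s inference isn’t bound to the text modality alone. The dimensions and class count are illustrative assumptions.

```python
# Hedged sketch of late multi-modal fusion: concatenate per-example text and
# image embeddings, then classify. All sizes here are assumptions.
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    def __init__(self, text_dim=768, image_dim=512, n_classes=2):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(text_dim + image_dim, 256),
            nn.ReLU(),
            nn.Linear(256, n_classes),
        )

    def forward(self, text_emb, image_emb):
        # Fuse the two modalities by simple concatenation along the feature dim.
        return self.head(torch.cat([text_emb, image_emb], dim=-1))

clf = LateFusionClassifier()
logits = clf(torch.randn(8, 768), torch.randn(8, 512))  # batch of 8 examples
```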
For those into document (PDF) parsing 👇. Includes the second version of LayoutLM and its multilingual cousin, LayoutXLM.
…And there’s already a repo built on top of these models! 👌
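If you want to kick the tires, here’s a minimal sketch using the HuggingFace transformers implementation of LayoutLMv2. The input file and label count are assumptions (7 matches the FUNSD label set), and the processor needs pytesseract and detectron2 installed for OCR and visual features.

```python
# Minimal sketch: token classification on a scanned document with LayoutLMv2
# via HuggingFace transformers. "invoice.png" is a hypothetical input file.
from PIL import Image
from transformers import LayoutLMv2Processor, LayoutLMv2ForTokenClassification

processor = LayoutLMv2Processor.from_pretrained("microsoft/layoutlmv2-base-uncased")
model = LayoutLMv2ForTokenClassification.from_pretrained(
    "microsoft/layoutlmv2-base-uncased", num_labels=7  # assumption: FUNSD-style labels
)

image = Image.open("invoice.png").convert("RGB")
encoding = processor(image, return_tensors="pt")  # runs OCR + builds layout/visual inputs
outputs = model(**encoding)
predictions = outputs.logits.argmax(-1)  # one predicted label id per token
```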
I mentioned the highlights/shorter version in a previous newsletter; now you can get the full dataset:
A long and awesome introduction to graph neural networks.
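To ground the core idea before you dive in, here’s a toy sketch (mine, not from the linked intro) of one round of message passing: each node averages its neighbors’ features, then applies a shared transform.

```python
# Minimal sketch of one GNN message-passing layer in plain NumPy.
import numpy as np

def gnn_layer(A, H, W):
    """A: (n, n) adjacency matrix, H: (n, d) node features, W: (d, d_out) weights."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    deg = A_hat.sum(axis=1, keepdims=True)  # node degrees
    H_agg = (A_hat / deg) @ H               # mean-aggregate neighbor features
    return np.maximum(H_agg @ W, 0)         # shared linear transform + ReLU

A = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)  # toy 3-node graph
H = np.random.randn(3, 4)
W = np.random.randn(4, 2)
print(gnn_layer(A, H, W).shape)  # (3, 2)
```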
Holy Moly 🤯
The Compendium covers over 500 ML topics and has been in the making for over four years. It’s now offered in an interactive web-based format.
A collection of recently released repos that caught our 👁