The NLP Cypher | 07.04.21
Hey, welcome back! Want to wish everyone in the US a happy 4th of July! Also, want to quickly mention that the NLP Index has doubled in size since its inception and now houses over 6,000 repos, pretty cool!!! And as always, it gets updated weekly.
But first, this week we asked 100 NLP developers: name one thing Microsoft got for paying $7.5 billi for GitHub and $1 billi to OpenAI? SURVEY SAYS:
7.5B + 1B = GitHub Copilot
If you want to hear GitHub's take on their new code-generating assistant, read here:
Also… it turns out Copilot is a serial killer, but at least the code is readable.
Hey, did you know Microsoft has a stash of models tucked away in their repository, spanning NLU, document understanding, cross-lingual and more? If these models interest you, follow this page:
UniLM (v1@NeurIPS'19 | v2@ICML'20 | v3@ACL'21): unified pre-training for language understanding and generation
InfoXLM (v1@NAACL'21 | v2@ACL'21): multilingual/cross-lingual pre-trained models for language understanding and generation
DeltaLM (NEW): encoder-decoder pre-training for language generation and translation by augmenting pretrained multilingual encoders
MiniLM (v1@NeurIPS'20 | v2@ACL'21): small and fast pre-trained models for language understanding and generation
AdaLM (v1@ACL'21): domain, language, and task adaptation of pre-trained models
LayoutLM (