Scaling up BERT-like model Inference on modern CPU – Part 1
Back in October 2019, my colleague Lysandre Debut published a comprehensive (at the time) inference performance benchmarking blog (1). Since then, 🤗 transformers (2) welcomed a tremendous number of new architectures and thousands of new models were added to the 🤗 hub (3) which now counts more than 9,000 of them as of first quarter of 2021.
Read more