Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers
OpenAI recently released its gpt-oss series of models. The models feature novel techniques such as MXFP4 quantization, efficient kernels, a brand-new chat format, and more. To support the gpt-oss release in transformers, we have upgraded the library considerably. The updates make it efficient to load, run, and fine-tune the models. In this blog post, we cover the upgrades in depth and show how they become part of the transformers toolkit so other models (current and future) can […]