Hugging Face and Google partner for open AI collaboration
11/13/2025 Update: we announced a new and deeper partnership with Google Cloud to enable companies to build their own AI with open
Read moreDeep Learning, NLP, NMT, AI, ML
11/13/2025 Update: we announced a new and deeper partnership with Google Cloud to enable companies to build their own AI with open
Read moreGiven the widespread adoption of LLMs, it is critical to understand their safety and risks in different scenarios before extensive deployments in the real world. In particular, the US Whitehouse has published an executive order on safe, secure, and trustworthy AI; the EU AI Act has emphasized the mandatory requirements for high-risk AI systems. Together with regulations, it is important to provide technical solutions to assess the risks of AI systems, enhance their safety, and potentially provide safe and aligned […]
Read moreIn the rapidly evolving field of Natural Language Processing (NLP), Large Language Models (LLMs) have become central to AI’s ability to understand and generate human language. However, a significant challenge that persists is their tendency to hallucinate — i.e., producing content that may not align with real-world facts or the user’s input. With the constant release of new open-source models, identifying the most reliable ones, particularly in terms of their propensity to generate hallucinated content, becomes crucial. The Hallucinations Leaderboard […]
Read moreRecently, code generation models have become very popular, especially with the release of state-of-the-art open-source models such as BigCode’s StarCoder and Meta AI’s Code Llama. A growing number of works focuses on making Large Language Models (LLMs) more optimized and accessible. In this blog, we are happy to share the latest results of LLM optimization on Intel Xeon focusing on the popular code generation LLM, StarCoder. The StarCoder Model is a cutting-edge LLM specifically designed for assisting the user with […]
Read moreToday, the Patronus team is excited to announce the new Enterprise Scenarios Leaderboard, built using the Hugging Face Leaderboard Template in collaboration with their teams. The leaderboard aims to evaluate the performance of language models on real-world enterprise use cases. We currently support 6 diverse tasks – FinanceBench, Legal Confidentiality, Creative Writing, Customer Support Dialogue, Toxicity, and Enterprise PII. We measure the performance of models on metrics like accuracy, engagingness, toxicity, relevance, and Enterprise PII. Why do
Read moreIn this blog, we provide examples of how to get started with PatchTST. We first demonstrate the forecasting capability of PatchTST on the Electricity data. We will then demonstrate the transfer learning capability of PatchTST by using the previously trained model to do zero-shot forecasting on the electrical transformer (ETTh1) dataset. The zero-shot forecasting performance will denote the test performance of the model in the target domain, without any training on the target domain. Subsequently, we will do linear probing […]
Read moreWe are excited to announce the general availability of Hugging Face Text Generation Inference (TGI) on AWS Inferentia2 and Amazon SageMaker. Text Generation Inference (TGI), is a purpose-built solution for deploying and serving Large Language Models (LLMs)
Read moreSince the launch of ChatGPT in 2022, we have seen tremendous progress in LLMs, ranging from the release of powerful pretrained models like Llama 2 and Mixtral, to the development of new alignment techniques like Direct Preference Optimization. However, deploying LLMs in consumer applications poses several challenges, including the need to add guardrails that prevent the model from generating undesirable responses. For example, if you are building an AI tutor for children, then you don’t want it to generate toxic […]
Read moreWe’re happy to introduce the NPHardEval leaderboard, using NPHardEval, a cutting-edge benchmark developed by researchers from the University of Michigan and Rutgers University. NPHardEval introduces a dynamic, complexity-based framework for assessing Large Language Models’ (LLMs) reasoning abilities. It poses 900 algorithmic questions spanning the NP-Hard complexity class and lower, designed to rigorously test LLMs, and is updated on a monthly basis to prevent overfitting! A Unique Approach to LLM Evaluation NPHardEval stands apart
Read moreSegMoE is an exciting framework for creating Mixture-of-Experts Diffusion models from scratch! SegMoE is comprehensively integrated within the Hugging Face ecosystem and comes supported with diffusers 🔥! Among the features and integrations being released today: Table of Contents What is SegMoE? SegMoE models follow the same architecture as Stable Diffusion. Like Mixtral 8x7b, a SegMoE model
Read more