Graph Attention Networks: Self-Attention for GNNs

Graph Attention Networks (GATs) are one of the most popular types of Graph Neural Networks. Instead of calculating static weights based on node degrees like Graph Convolutional Networks (GCNs), they assign dynamic weights to node features through a process called self-attention. The main idea behind GATs is that some neighbors are more important than others, regardless of their node degrees: node 4 can be more important than node 3, which in turn can be more important than node 2. In this article, we will […]
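
To make the idea concrete, here is a minimal sketch of a single attention layer, assuming PyTorch Geometric's GATConv; the toy graph and feature sizes are invented for illustration:

```python
import torch
from torch_geometric.nn import GATConv

# Toy graph: 5 nodes with 8 features each; four directed edges all point to node 4
x = torch.randn(5, 8)
edge_index = torch.tensor([[0, 1, 2, 3],
                           [4, 4, 4, 4]])

# One GAT layer with 2 attention heads: every edge gets its own learned weight
conv = GATConv(in_channels=8, out_channels=16, heads=2)
out, (edges, alpha) = conv(x, edge_index, return_attention_weights=True)

print(out.shape)    # torch.Size([5, 32]) -> 16 channels per head, 2 heads concatenated
print(alpha.shape)  # one attention coefficient per head for each edge (plus added self-loops)
```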

Read more

GraphSAGE: Scaling up Graph Neural Networks

What do UberEats and Pinterest have in common? They both use GraphSAGE to power their recommender systems at a massive scale, with millions or even billions of nodes and edges. 🖼️ Pinterest developed its own version, called PinSAGE, to recommend the most relevant images (pins) to its users. Their graph has 18 billion connections and three billion nodes. 🍽️ UberEats also reported using a modified version of GraphSAGE to suggest dishes, restaurants, and cuisines. UberEats claims to support more than 600,000 restaurants […]
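
As a rough sketch of the scaling trick, the snippet below samples fixed-size neighborhoods with PyTorch Geometric's NeighborLoader and applies a SAGEConv layer; the graph sizes and sampling numbers are made up for illustration:

```python
import torch
from torch_geometric.data import Data
from torch_geometric.loader import NeighborLoader
from torch_geometric.nn import SAGEConv

# Toy stand-in for a web-scale graph: 1,000 nodes, 5,000 random edges, 16 features
num_nodes = 1_000
data = Data(
    x=torch.randn(num_nodes, 16),
    edge_index=torch.randint(0, num_nodes, (2, 5_000)),
)

# Sample at most 10 first-hop and 5 second-hop neighbors per node,
# so each mini-batch only touches a small subgraph
loader = NeighborLoader(data, num_neighbors=[10, 5], batch_size=128)

conv = SAGEConv(in_channels=16, out_channels=64)
for batch in loader:
    h = conv(batch.x, batch.edge_index)  # embeddings for the sampled subgraph
    break
```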

Read more

GIN: How to Design the Most Powerful Graph Neural Network

Graph Neural Networks are not limited to classifying nodes. One of the most popular applications is graph classification. This is a common task when dealing with molecules: they are represented as graphs, and features of each atom (node) can be used to predict the behavior of the entire molecule. However, GNNs only learn node embeddings. How do we combine them to produce an embedding of the entire graph? In this article, we will: See a new type of layer, called “global […]
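
Here is a minimal sketch of that pooling step, assuming PyTorch Geometric's GINConv and global_add_pool, with a made-up mini-batch of two tiny graphs:

```python
import torch
from torch.nn import Linear, ReLU, Sequential
from torch_geometric.nn import GINConv, global_add_pool

# One GIN layer: an MLP applied to the summed features of each node's neighborhood
mlp = Sequential(Linear(8, 32), ReLU(), Linear(32, 32))
conv = GINConv(mlp)

# Two tiny "molecules" in one mini-batch (batch[i] says which graph node i belongs to)
x = torch.randn(6, 8)
edge_index = torch.tensor([[0, 1, 1, 2, 3, 4, 4, 5],
                           [1, 0, 2, 1, 4, 3, 5, 4]])
batch = torch.tensor([0, 0, 0, 1, 1, 1])

h = conv(x, edge_index)          # node embeddings
hg = global_add_pool(h, batch)   # one embedding per graph ("global" pooling)
print(hg.shape)                  # torch.Size([2, 32])
```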

Read more

Introduction to Constraint Programming in Python

Constraint Programming is a technique to find every solution that respects a set of predefined constraints. It is an invaluable tool for data scientists to solve a huge variety of problems, such as scheduling, timetabling, and sequencing. In this article, we’ll see how to use CP in two different ways: Satisfiability: the goal is to find one or multiple feasible solutions (i.e., solutions that respect our constraints) by narrowing down a large set of potential solutions; Optimization: the goal is […]
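
Here is a minimal sketch of both modes, assuming Google's OR-Tools CP-SAT solver; the variables and constraints are toy examples:

```python
from ortools.sat.python import cp_model

model = cp_model.CpModel()

# Satisfiability: find values of x, y, z that respect every constraint
x = model.NewIntVar(0, 10, "x")
y = model.NewIntVar(0, 10, "y")
z = model.NewIntVar(0, 10, "z")
model.Add(x + y + z == 15)
model.Add(x < y)

# Optimization: among the feasible solutions, pick the one maximizing an objective
model.Maximize(x + 2 * y + 3 * z)

solver = cp_model.CpSolver()
status = solver.Solve(model)
if status == cp_model.OPTIMAL:
    print(solver.Value(x), solver.Value(y), solver.Value(z))
```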

Read more

Optimize Your Marketing Budget with Nonlinear Programming

In the age of digital marketing, businesses face the challenge of allocating their marketing budget across multiple channels to maximize sales. However, as they broaden their reach, these firms inevitably face the issue of diminishing returns – the phenomenon where additional investment in a marketing channel yields progressively smaller increases in conversions. This is where the concept of marketing budget allocation steps in, adding another layer of complexity to the whole process. In this article, we’re going to explore the […]
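
As a toy sketch of the idea (not the article's exact model), the snippet below maximizes a square-root response curve per channel under a total budget with SciPy's SLSQP solver; the coefficients, channels, and budget are invented:

```python
import numpy as np
from scipy.optimize import minimize

TOTAL_BUDGET = 100_000
alphas = np.array([3.0, 2.0, 1.5])  # hypothetical response coefficients per channel

# Diminishing returns: each channel's conversions grow like the square root of spend
def negative_sales(spend):
    return -np.sum(alphas * np.sqrt(spend))

constraints = [{"type": "ineq", "fun": lambda s: TOTAL_BUDGET - s.sum()}]
bounds = [(0, TOTAL_BUDGET)] * 3
x0 = np.full(3, TOTAL_BUDGET / 3)  # start from an even split

result = minimize(negative_sales, x0, bounds=bounds, constraints=constraints,
                  method="SLSQP")
print(result.x)  # spend per channel that maximizes the modeled sales
```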

Read more

Decoding Strategies in Large Language Models

In the fascinating world of large language models (LLMs), much attention is given to model architectures, data processing, and optimization. However, decoding strategies like beam search, which play a crucial role in text generation, are often overlooked. In this article, we will explore how LLMs generate text by delving into the mechanics of greedy search and beam search, as well as sampling techniques with top-k and nucleus sampling. By the conclusion of this article, you’ll not only understand these decoding […]
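
Here is a minimal sketch of these strategies through Hugging Face transformers' generate method, using a small GPT-2 model purely for illustration:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2"  # small model, chosen only to keep the example light
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
inputs = tokenizer("I have a dream", return_tensors="pt")

# Greedy search: always pick the single most probable next token
greedy = model.generate(**inputs, max_new_tokens=30)

# Beam search: keep the num_beams most probable sequences at every step
beams = model.generate(**inputs, max_new_tokens=30, num_beams=5)

# Top-k sampling: sample from the k most probable tokens
top_k = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_k=50)

# Nucleus (top-p) sampling: sample from the smallest set of tokens whose
# cumulative probability exceeds p
top_p = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_p=0.9)

print(tokenizer.decode(top_p[0], skip_special_tokens=True))
```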

Read more

Improve ChatGPT with Knowledge Graphs

ChatGPT has shown impressive capabilities in processing and generating human-like text. However, it is not without its imperfections. A primary concern is the model’s propensity to produce inaccurate or obsolete answers, often called “hallucinations.” The New York Times recently highlighted this issue in their article, “Here’s What Happens When Your Lawyer Uses ChatGPT.” It describes a lawsuit in which a lawyer leaned heavily on ChatGPT to assist in preparing a court filing for a client suing an airline. The model […]

Read more

4-bit LLM Quantization with GPTQ

Recent advancements in weight quantization allow us to run massive large language models on consumer hardware, like a LLaMA-30B model on an RTX 3090 GPU. This is possible thanks to novel 4-bit quantization techniques with minimal performance degradation, like GPTQ, GGML, and NF4. In the previous article, we introduced naïve 8-bit quantization techniques and the excellent LLM.int8(). In this article, we will explore the popular GPTQ algorithm to understand how it works and implement it using the AutoGPTQ library. You […]
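
As a rough sketch of the workflow with the AutoGPTQ library (using a small GPT-2 model and a single made-up calibration sample; the exact calibration format should be checked against the library's examples):

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_id = "gpt2"  # small model, for illustration only
quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config)

# GPTQ needs a few tokenized calibration samples to measure quantization error
examples = [tokenizer("GPTQ quantizes weights layer by layer down to 4 bits.",
                      return_tensors="pt")]
model.quantize(examples)
model.save_quantized("gpt2-4bit-gptq")
```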

Read more

A Beginner’s Guide to LLM Fine-Tuning

The growing interest in Large Language Models (LLMs) has led to a surge in tools and wrappers designed to streamline their training process. Popular options include FastChat from LMSYS (used to train Vicuna) and Hugging Face’s transformers/trl libraries (used in my previous article). In addition, each big LLM project, like WizardLM, tends to have its own training script, inspired by the original Alpaca implementation. In this article, we will use Axolotl, a tool created by the OpenAccess AI Collective. We […]

Read more