LoftQ: Reimagining LLM fine-tuning with smarter initialization

This research paper was presented at the 12th International Conference on Learning Representations (ICLR 2024), the premier conference dedicated to the advancement of deep learning. Large language models (LLMs) use extensive datasets and advanced algorithms to generate nuanced, context-sensitive content. However, their development requires substantial computational resources. To address this, we developed LoftQ, an innovative technique that streamlines the fine-tuning process, which is used to […]

Read more

Python News: What’s New From April 2024

In April 2024, Python’s core development team released versions 3.13.0a6 and 3.12.3 of the language! The former received several exciting features, improvements, and optimizations, while the latter got more than 300 commits for security improvements and bug fixes. The 3.13.0a6 release is the last alpha release. In the first half of May, the code will be frozen and won’t accept new features. Note that 3.13.0a6 is a pre-release, so you shouldn’t use it in production environments. However, it provides a […]

Read more

Abstracts: May 6, 2024

MICHEL GALLEY: Thank you for having me. HUIZINGA: So I like to start with a distillation or sort of an elevator pitch of your research. Tell us in just a couple sentences what problem or issue your paper addresses and why we should care about it. GALLEY: So this paper is about evaluating large foundation models. So it’s a very important part of researching large language models because it’s a good way to evaluate, kind of, the capabilities—what these models […]

Read more

Random Search and Grid Search for Function Optimization

Function optimization requires the selection of an algorithm to efficiently sample the search space and locate a good or best solution. There are many algorithms to choose from, although it is important to establish a baseline for what types of solutions are feasible or possible for a problem. This can be achieved using a naive optimization algorithm, such as a random search or a grid search. The results achieved by a naive optimization algorithm are computationally efficient to generate and […]
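As a rough sketch of the two naive baselines described above, the snippet below runs a random search and a grid search over the same bounds; the sphere objective and the bounds are illustrative placeholders rather than the tutorial's actual example.

# Random search and grid search on a toy objective.
import numpy as np

def objective(x):
    # Simple sphere function: minimum of 0.0 at the origin.
    return np.sum(x ** 2)

bounds = np.array([[-5.0, 5.0], [-5.0, 5.0]])  # per-dimension [min, max]

# Random search: draw candidate points uniformly at random within the bounds.
rng = np.random.default_rng(1)
samples = rng.uniform(bounds[:, 0], bounds[:, 1], size=(100, 2))
scores = np.array([objective(s) for s in samples])
best = samples[np.argmin(scores)]
print("random search best:", best, objective(best))

# Grid search: evaluate the objective on a regular grid over the same bounds.
grid_axes = [np.linspace(lo, hi, 11) for lo, hi in bounds]
grid = np.array(np.meshgrid(*grid_axes)).T.reshape(-1, 2)
scores = np.array([objective(g) for g in grid])
best = grid[np.argmin(scores)]
print("grid search best:", best, objective(best))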

Read more

Basin Hopping Optimization in Python

Basin hopping is a global optimization algorithm. It was developed to solve problems in chemical physics, although it is an effective algorithm for nonlinear objective functions with multiple optima. In this tutorial, you will discover the basin hopping global optimization algorithm. After completing this tutorial, you will know: Basin hopping optimization is a global optimization algorithm that uses random perturbations to jump between basins, and a local search algorithm to optimize within each basin. How to use the basin hopping optimization algorithm […]
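A minimal sketch of basin hopping using SciPy's implementation is shown below; the multimodal objective, starting point, and settings are illustrative assumptions rather than the tutorial's exact example.

# Basin hopping with SciPy on a simple multimodal objective.
import numpy as np
from scipy.optimize import basinhopping

def objective(x):
    # Many local optima thanks to the sinusoidal term.
    return np.sum(x ** 2 + 10.0 * np.sin(x))

x0 = np.array([3.0, -2.0])  # arbitrary starting point
# niter controls how many random "hops" (perturbation + local search) are run.
result = basinhopping(objective, x0, niter=100, stepsize=0.5, seed=1)
print("best x:", result.x)
print("best f(x):", result.fun)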

Read more

XGBoost for Regression

Extreme Gradient Boosting (XGBoost) is an open-source library that provides an efficient and effective implementation of the gradient boosting algorithm. Shortly after its development and initial release, XGBoost became the go-to method and often the key component in winning solutions for a range of problems in machine learning competitions. Regression predictive modeling problems involve predicting a numerical value such as a dollar amount or a height. XGBoost can be used directly for regression predictive modeling. In this tutorial, you will […]
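A minimal sketch of fitting XGBoost on a regression problem is shown below; it assumes the xgboost package is installed and uses a synthetic dataset as a stand-in for the tutorial's data.

# XGBoost for regression on a synthetic dataset.
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error
from xgboost import XGBRegressor

# Synthetic regression problem with 1,000 rows and 20 input features.
X, y = make_regression(n_samples=1000, n_features=20, noise=0.1, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

# Fit a gradient boosting ensemble of 100 trees, otherwise default settings.
model = XGBRegressor(n_estimators=100, random_state=1)
model.fit(X_train, y_train)

# Evaluate with mean absolute error on the held-out split.
yhat = model.predict(X_test)
print("MAE: %.3f" % mean_absolute_error(y_test, yhat))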

Read more

Develop a Neural Network for Banknote Authentication

It can be challenging to develop a neural network predictive model for a new dataset. One approach is to first inspect the dataset and develop ideas for what models might work, then explore the learning dynamics of simple models on the dataset, then finally develop and tune a model for the dataset with a robust test harness. This process can be used to develop effective neural network models for classification and regression predictive modeling problems. In this tutorial, you will […]
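The workflow above can be sketched with a small scikit-learn multilayer perceptron; the local file name, column layout, and network configuration below are illustrative assumptions, not the tutorial's exact setup.

# A simple neural network baseline for the banknote authentication dataset.
# Assumes the data is available locally as "banknote_authentication.csv"
# with four numeric input columns followed by a 0/1 class label.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

df = pd.read_csv("banknote_authentication.csv", header=None)
X, y = df.values[:, :-1], df.values[:, -1].astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)

# Standardize inputs and start with a single small hidden layer to study the
# learning dynamics before tuning further.
scaler = StandardScaler().fit(X_train)
model = MLPClassifier(hidden_layer_sizes=(10,), max_iter=500, random_state=1)
model.fit(scaler.transform(X_train), y_train)

yhat = model.predict(scaler.transform(X_test))
print("Accuracy: %.3f" % accuracy_score(y_test, yhat))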

Read more

Gradient Descent With Nesterov Momentum From Scratch

Gradient descent is an optimization algorithm that follows the negative gradient of an objective function in order to locate the minimum of the function. A limitation of gradient descent is that it can get stuck in flat areas or bounce around if the objective function returns noisy gradients. Momentum is an approach that accelerates the progress of the search to skim across flat areas and smooth out bouncy gradients. In some cases, the acceleration of momentum can cause the search […]
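A minimal from-scratch sketch of the idea is shown below, using a one-dimensional quadratic objective; the objective, step size, and momentum value are illustrative assumptions.

# Gradient descent with Nesterov momentum on f(x) = x^2.
def objective(x):
    return x ** 2.0

def derivative(x):
    return 2.0 * x

def nesterov(x0, n_iter=30, step_size=0.1, momentum=0.9):
    x, change = x0, 0.0
    for i in range(n_iter):
        # Evaluate the gradient at the projected (look-ahead) position rather
        # than the current one: this is what distinguishes Nesterov momentum.
        projected = x + momentum * change
        gradient = derivative(projected)
        change = momentum * change - step_size * gradient
        x = x + change
        print(f">{i} f({x:.5f}) = {objective(x):.5f}")
    return x

best = nesterov(x0=1.0)
print("solution:", best, objective(best))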

Read more

Gradient Descent Optimization With Nadam From Scratch

Gradient descent is an optimization algorithm that follows the negative gradient of an objective function in order to locate the minimum of the function. A limitation of gradient descent is that the progress of the search can slow down if the gradient becomes flat or exhibits large curvature. Momentum can be added to gradient descent to incorporate some inertia into the updates. This can be further improved by incorporating the gradient of the projected new position rather than the current position, called […]
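A minimal from-scratch sketch of one common formulation of Nadam (Adam with Nesterov momentum) on a one-dimensional quadratic is shown below; the objective and hyperparameters are illustrative assumptions.

# Nadam from scratch on f(x) = x^2.
def objective(x):
    return x ** 2.0

def derivative(x):
    return 2.0 * x

def nadam(x0, n_iter=50, alpha=0.02, beta1=0.9, beta2=0.999, eps=1e-8):
    x, m, v = x0, 0.0, 0.0
    for t in range(1, n_iter + 1):
        g = derivative(x)
        # Exponential moving averages of the gradient and its square.
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g ** 2
        # Bias-corrected estimates.
        mhat = m / (1.0 - beta1 ** t)
        vhat = v / (1.0 - beta2 ** t)
        # Nesterov-style look-ahead: mix the bias-corrected momentum with the
        # bias-corrected current gradient before taking the step.
        lookahead = beta1 * mhat + (1.0 - beta1) * g / (1.0 - beta1 ** t)
        x = x - alpha * lookahead / (vhat ** 0.5 + eps)
    return x

best = nadam(x0=1.0)
print("solution:", best, objective(best))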

Read more

A Gentle Introduction to XGBoost Loss Functions

XGBoost is a powerful and popular implementation of the gradient boosting ensemble algorithm. An important aspect of configuring XGBoost models is the choice of loss function that is minimized during training. The loss function must be matched to the predictive modeling problem type, just as we must choose appropriate loss functions for the problem type when training deep learning neural networks. In this tutorial, you will discover how to configure loss functions for XGBoost ensemble […]
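A minimal sketch of choosing the loss via XGBoost's objective parameter is shown below; it assumes the xgboost package is installed and uses synthetic datasets as stand-ins for real problems.

# Matching the XGBoost objective (loss) to the problem type.
from sklearn.datasets import make_regression, make_classification
from xgboost import XGBRegressor, XGBClassifier

# Regression: squared error is the usual loss for numeric targets.
X_reg, y_reg = make_regression(n_samples=500, n_features=10, random_state=1)
reg = XGBRegressor(objective="reg:squarederror", n_estimators=50)
reg.fit(X_reg, y_reg)
print("regression predictions:", reg.predict(X_reg[:3]))

# Binary classification: logistic loss yields probability-like outputs.
X_clf, y_clf = make_classification(n_samples=500, n_features=10, random_state=1)
clf = XGBClassifier(objective="binary:logistic", n_estimators=50)
clf.fit(X_clf, y_clf)
print("class probabilities:", clf.predict_proba(X_clf[:3])[:, 1])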

Read more