How to Iterate Over Rows in pandas, and Why You Shouldn’t

One of the most common questions you might have when entering the world of pandas is how to iterate over rows in a pandas DataFrame. If you’ve gotten comfortable using loops in core Python, then this is a perfectly natural question to ask. While iterating over rows is relatively straightforward with .itertuples() or .iterrows(), that doesn’t necessarily mean iteration is the best way to work with DataFrames. In fact, while iteration may be a quick way to make progress, relying […]
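As a rough illustration of the contrast this excerpt draws, the sketch below computes the same total twice: once by looping with .itertuples(), and once with a vectorized column expression. The DataFrame and its column names (`price`, `quantity`) are made up for the example and are not taken from the article.

```python
import pandas as pd

# Hypothetical data; the column names are assumptions for illustration.
df = pd.DataFrame({"price": [2.5, 4.0, 1.25], "quantity": [4, 1, 8]})

# Row-by-row iteration with .itertuples(): each row arrives as a namedtuple.
loop_total = 0.0
for row in df.itertuples(index=False):
    loop_total += row.price * row.quantity

# Vectorized alternative: multiply whole columns, then sum once.
vectorized_total = (df["price"] * df["quantity"]).sum()

print(loop_total, vectorized_total)
```

Both approaches give the same number here. The vectorized form hands the loop to pandas, which is usually the faster choice on large DataFrames.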


float64 to float32: Saving memory without losing precision

Libraries like NumPy and Pandas let you switch data types, which allows you to reduce memory usage. Switching from numpy.float64 (“double-precision” or 64-bit floats) to numpy.float32 (“single-precision” or 32-bit floats) cuts memory usage in half. But it does so at a cost: float32 can only store a much smaller range of numbers, with less precision. So if you want to save memory, how do you use float32 without distorting your results? Let’s find out! In particular, we will: Explore some […]
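The trade-off described here can be checked directly in NumPy. The following is a minimal sketch, not the article's own code: the array contents and variable names are arbitrary choices for illustration.

```python
import numpy as np

# An arbitrary array of 1,000 copies of 0.1, stored as float64.
values64 = np.full(1_000, 0.1, dtype=np.float64)
values32 = values64.astype(np.float32)

# Memory: 8 bytes per element versus 4 bytes per element.
bytes64 = values64.nbytes
bytes32 = values32.nbytes

# Precision: float32 keeps roughly 7 significant decimal digits,
# so converting back to a Python float no longer reproduces 0.1 exactly.
round_trip_error = abs(float(values32[0]) - 0.1)

# Range: the largest finite float32 is far smaller than the largest float64.
max32 = np.finfo(np.float32).max
max64 = np.finfo(np.float64).max

print(bytes64, bytes32, round_trip_error, max32, max64)
```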


The Python Standard REPL: Try Out Code and Ideas Quickly

The Python standard shell, or REPL (Read-Eval-Print Loop), allows you to run Python code interactively while working on a project or learning the language. This tool is available in every Python installation, so you can use it at any moment. As a Python developer, you’ll spend a considerable part of your coding time in a REPL session because this tool allows you to test new ideas, explore and experiment with new tools and libraries, refactor and debug your code, and […]


Python Basics: Object-Oriented Programming

OOP, or object-oriented programming, is a method of structuring a program by bundling related properties and behaviors into individual objects. Conceptually, objects are like the components of a system. Think of a program as a factory assembly line of sorts. At each step of the assembly line, a system component processes some material, ultimately transforming raw material into a finished product. An object contains both data, like the raw or preprocessed materials at each step on an assembly line, and […]


Linear Algebra in Python: Matrix Inverses and Least Squares

Linear algebra is an important topic across a variety of subjects. It allows you to solve problems related to vectors, matrices, and linear equations. In Python, most of the routines related to this subject are implemented in scipy.linalg, which offers very fast linear algebra capabilities. In particular, linear models play an important role in a variety of real-world problems, and scipy.linalg provides tools to compute them in an efficient way. In this tutorial, you’ll learn how to: Study linear systems […]
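To make the two topics in the title concrete, here is a small sketch using scipy.linalg.inv and scipy.linalg.lstsq. The matrices and data points are invented for illustration and may differ from what the tutorial itself uses.

```python
import numpy as np
from scipy import linalg

# Square system A x = b, solved here through the explicit inverse.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = linalg.inv(A) @ b

# Overdetermined system: fit y = c0 + c1 * t by least squares.
# These sample points lie exactly on y = 1 + 2 t.
t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])
M = np.column_stack([np.ones_like(t), t])
coeffs, residues, rank, singular_values = linalg.lstsq(M, y)

print(x, coeffs, rank)
```

In practice, linalg.solve(A, b) is generally preferred over forming the inverse explicitly. The inverse is used above only because it is the topic named in the title.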


Some reasons to avoid Cython

If you need to speed up Python, Cython is a very useful tool. It lets you seamlessly merge Python syntax with calls into C or C++ code, making it easy to write high-performance extensions with rich Python interfaces. That being said, Cython is not the best tool in all circumstances. So in this article I’ll go over some of the limitations and problems with Cython, and suggest some alternatives.

A quick overview of Cython

In case you’re not familiar with […]


Why don’t people use character-level MT? – One year later

In this post, I comment on our (i.e., myself, Helmut Schmid and Alex Fraser) year-old paper “Why don’t people use character-level machine translation,” published in Findings of ACL 2022. Here, I will (besides briefly summarizing the paper’s main message) mostly comment on what I learned while working on the one-year-later perspective, focusing more on what I would do differently now. If you are interested in the exact research content, read the paper or watch a 5-minute presentation.

Paper TL;DR

Doing […]


Python Basics Exercises: File System Operations

In Python Basics: File System Operations, you learned how to use Python to work with files and folders. As a programmer, you’ll use the pathlib and shutil modules to complete file system operations without relying on your graphical user interface (GUI). While you already got lots of hands-on practice with file system operations, programmers never stop training! The more you use your new skills, the more comfortable you’ll be when it’s time to put them to work in your own […]
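For a taste of the kind of operations the exercises cover, the sketch below creates, copies, and moves files with pathlib and shutil. The directory and file names are hypothetical, and everything happens inside a temporary directory so nothing is left on disk.

```python
import shutil
import tempfile
from pathlib import Path

# Work inside a throwaway directory that is deleted when the block exits.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Create a folder and write a small text file into it.
    notes_dir = root / "notes"
    notes_dir.mkdir()
    note = notes_dir / "todo.txt"
    note.write_text("practice pathlib\n", encoding="utf-8")

    # Copy the whole folder, then move the original file elsewhere.
    backup_dir = root / "backup"
    shutil.copytree(notes_dir, backup_dir)
    moved = Path(shutil.move(str(note), str(root / "todo_moved.txt")))

    # Record what ended up where before the directory is cleaned up.
    copied_names = sorted(p.name for p in backup_dir.iterdir())
    original_exists = note.exists()
    moved_text = moved.read_text(encoding="utf-8")

print(copied_names, original_exists, moved_text)
```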


Python’s Assignment Operator: Write Robust Assignments

Python’s assignment operators allow you to define assignment statements. This type of statement lets you create, initialize, and update variables throughout your code. Variables are a fundamental cornerstone in every piece of code, and assignment statements give you complete control over variable creation and mutation. Learning about the Python assignment operator and its use for writing assignment statements will arm you with powerful tools for writing better and more robust Python code.

Assignment Statements and the Assignment Operator

One of […]
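A few common forms of assignment can be sketched as follows. The variable names are arbitrary, and this is only a small sample of what the article is likely to cover.

```python
# Plain assignment creates a name or rebinds an existing one.
total = 10

# Augmented assignment is shorthand for "total = total + 5".
total += 5

# Parallel assignment unpacks a tuple, so a swap needs no temporary variable.
first, second = "a", "b"
first, second = second, first

# Chained assignment binds both names to the same list object,
# so a change made through one name is visible through the other.
left = right = []
left.append(1)

print(total, first, second, right)
```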


Why Polars uses less memory than Pandas

Processing large amounts of data with Pandas can be difficult; it’s quite easy to run out of memory and either slow down or crash. The Polars dataframe library is a potential solution. While Polars is mostly known for running faster than Pandas, if you use it right it can sometimes also significantly reduce memory usage compared to Pandas. In particular, certain techniques that you need to do manually in Pandas can be done automatically in Polars, allowing you to process […]
