Optimizing Python Code Performance with timeit

Have you ever wondered whether that fancy new algorithm you implemented actually runs faster than the simpler one? Or which part of your Python application is dragging down performance? As a developer, measuring the execution time of your code is crucial for benchmarking alternatives and identifying bottlenecks.

In this comprehensive tutorial, I'll show you how timing Python code works and demonstrate practical techniques to optimize performance.

Here is what I will cover:

  • Why code timing matters for real-world software
  • Using Python's built-in timeit module for basic benchmarking
  • Timing code snippets, math expressions, and function calls
  • Statistical analysis of measurements
  • Presenting and interpreting benchmark results
  • Advanced profiling and optimization tools and techniques

So let's get started!

Why Timing Code Matters for Building Better Software

We've all experienced the pain of slow, laggy applications. Users get frustrated by poor performance and bounce to alternatives. For businesses, site speed has a quantifiable impact as well. According to research from Google and Amazon:

  • 53% of users will abandon a mobile site if pages take over 3 seconds to load
  • A 100ms delay in page load has been linked to a 1% drop in Amazon sales

Delivering quality user experiences means paying attention to performance. And modern web and mobile apps have grown more complex:

  • Average web page size tripled from 2011 to 2021 (1MB to 3MB)
  • JavaScript footprint can exceed 9MB on popular sites

So what can developers do? The first step is profiling code to determine where time is being spent.

Python comes with some great built-in tools for timing and optimization. In this tutorial, I'll show you how to use them to speed up your apps.

Introduction to Python's timeit Module

The timeit module contains convenience functions for benchmarking small code snippets. It's easy to use – just pass in a code statement as a string to time its execution.

Let‘s try it interactively first. Open up a Python shell and run:

import timeit

timeit.timeit('"-".join([str(n) for n in range(100)])', number=10000)

On my laptop, this returns around 0.35 seconds, which is the total time for all 10,000 runs, or roughly 35 microseconds per run. Note that timeit() reports the total elapsed time, not a per-run average.

The key parameters are:

  • stmt: Code statement to time
  • setup: Initialization code (run once, before the timed loop)
  • number: Number of iterations
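
For example, the setup string below builds the test data once, while only the sorted() call itself is timed (a small sketch using throwaway random data):

import timeit

# setup runs once before the timed loop; only stmt is measured
timeit.timeit(
    stmt='sorted(data)',
    setup='import random; data = [random.random() for _ in range(1000)]',
    number=10_000,
)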

There is also a command line interface that can be handy for quick tests.
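
For instance, the join benchmark from earlier can be run straight from a shell, where -n sets the number of loops and -s supplies setup code:

python -m timeit -n 10000 "'-'.join(str(n) for n in range(100))"

The command-line version also prints a per-loop average for you, which saves a bit of arithmetic.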

So now you have seen the basics – let's explore some more examples!

Timing Common Application Tasks

Beyond microbenchmarks, it helps to test real-world use cases. Here are some common application tasks in Python and alternatives we can compare:

File I/O

import json

with open('data.json') as f:
    data = json.load(f)  # Python standard library

import pandas

data = pandas.read_json('data.json')  # Pandas

Which runs faster for loading JSON data?

String Processing

import re

# long_string holds the text being searched
long_string.count('some_text')         # str built-in method

re.findall('some_text', long_string)   # regular expressions

What string search works best?

Data Analysis

sum([num ** 2 for num in numbers])     # list comprehension over a plain list

numbers.apply(lambda x: x**2).sum()    # Pandas, with numbers as a Series

Can Pandas outperform base Python math?

These examples demonstrate realistic usage where optimization can make a difference.

Now let's learn how we can leverage timeit to analyze the performance of code like this.
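
As a quick sketch, here is how the string-search comparison above could be timed head-to-head (the sample text is invented purely for illustration):

import timeit

setup = '''
import re
long_string = "some_text and other words " * 10_000   # illustrative sample data
'''

t_str = timeit.timeit("long_string.count('some_text')", setup=setup, number=1_000)
t_re = timeit.timeit("re.findall('some_text', long_string)", setup=setup, number=1_000)

print(f"str.count:  {t_str:.4f} s for 1,000 runs")
print(f"re.findall: {t_re:.4f} s for 1,000 runs")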

Statistical Analysis for Robust Benchmarking

Running a statement multiple times yields raw timings, but those numbers are inherently noisy. Some basic statistical analysis helps ensure a robust methodology:

  • Confidence intervals: Account for inherent variability in measurements
  • Standard error and standard deviation: Assess sample error margins
  • Hypothesis testing: Quantify statistical significance between results

Fortunately, Python has great statistical analysis libraries like SciPy we can utilize:

import timeit

from scipy import stats

# baseline_version and optimized_version are the two implementations under test
baseline_times = timeit.repeat('baseline_version()', globals=globals(), number=100, repeat=5)
optimized_times = timeit.repeat('optimized_version()', globals=globals(), number=100, repeat=5)

stats.ttest_ind(baseline_times, optimized_times)  # statistical significance test

stats.describe(optimized_times)  # summary statistics

This workflow provides rigor when benchmarking Python code with timeit.

Next let's look at effective ways to present timing data.

Presenting Benchmark Results

The raw output of timeit is just runtimes for code execution. To make this data insightful, appropriate visualization is key.

Tables

Algorithm    Mean Time    Std Dev    Runs
Baseline     5.2 ms       1.3 ms     1000
Optimized    3.5 ms       0.9 ms     1000
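
How do you get from raw timings to numbers like these? One approach (a minimal sketch; the values below are placeholders rather than real measurements) is the standard library's statistics module:

import statistics

# placeholder per-batch results in seconds, e.g. from timeit.repeat(...)
baseline_runs = [0.0052, 0.0049, 0.0055, 0.0051, 0.0048]

mean_ms = statistics.mean(baseline_runs) * 1000
std_ms = statistics.stdev(baseline_runs) * 1000
print(f"Baseline  {mean_ms:.1f} ms  {std_ms:.1f} ms  {len(baseline_runs)} batches")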

Graphs and Charts

import matplotlib.pyplot as plt

x = [1, 2, 4, 8]              # input sizes
y = [t1, t2, t3, t4]          # timeit results measured for each input size

plt.plot(x, y, marker='o')
plt.xlabel('Input size')
plt.ylabel('Time (s)')
plt.show()

With good visual presentation, we can better understand the performance profile.

This covers the fundamentals of using timeit for benchmarking tasks. Next let's dig into more advanced optimization.

Advanced Python Profiling: cProfile, Tracing, and More

Python offers several tools to analyze code beyond timeit:

  • cProfile: Shows frequency of function calls
  • Tracing: Tracks statement execution
  • Snakeviz: Generates visual call graph for cProfile output
  • line_profiler: Times code line-by-line
  • Memory profiler: Checks memory usage

For example, a cProfile report helps identify the hot paths through an application:

     4066407 function calls in 14.598 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1   14.598   14.598   14.598   14.598 module.py:110(compute)
 36002/360    0.273    0.000    1.636    0.000 module.py:21(helper)

This shows the compute() function takes up most time, so we should focus optimization there.
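
To generate a report like this for your own code (module and compute below are placeholders for your own module and entry point), you can drive cProfile directly:

import cProfile

from module import compute   # placeholder: import your own entry point

# Profile one call and print the report sorted by internal time
cProfile.run('compute()', sort='tottime')

The equivalent one-liner from a shell is python -m cProfile -s tottime module.py.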

These tools provide low-level insight into Python program execution. I have guides on using them in more depth.

Optimization Techniques

Beyond diagnosing issues, let's discuss some ways to actually improve performance:

  • Algorithms: Faster sorts, searches, compression approaches
  • Data Structures: Sets vs lists, tries, heaps all have tradeoffs
  • Compilation: Ahead-of-time with Cython, just-in-time (JIT) with Numba
  • Caching: Save prior work instead of recomputing
  • Concurrency: Multi-threading, asyncio
  • Vectorization: NumPy array operations

Here is an example using Numba to JIT compile some numerical code:

from numba import jit
import numpy as np

def sum_squares(arr):                     # pure-Python loop for comparison
    s = 0
    for x in arr:
        s += x * x
    return s

sum_squares_jit = jit(nopython=True)(sum_squares)   # Numba-compiled version

arr = np.arange(1e6)

# In an IPython/Jupyter session (%timeit is an IPython magic):
%timeit sum_squares(arr)       # ~300 ms: interpreted Python loop
%timeit sum_squares_jit(arr)   # ~3 ms once compiled

That's an easy 100x speedup!
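
Vectorization, from the same list, can deliver a similar win without a compiler by letting NumPy run the loop in C. A sketch using the same array:

import numpy as np

arr = np.arange(1e6)

# Vectorized equivalent of sum_squares: the loop happens inside NumPy's C code
result = (arr * arr).sum()   # or np.dot(arr, arr)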

Hopefully this gives you some ideas of additional ways to make Python go faster.

Summary

In this tutorial, I covered:

  • The importance of tracking Python code performance
  • Using timeit for benchmarking functions and code blocks
  • Statistical rigor to analyze measurements
  • Visualization techniques for understandable results
  • Advanced profilers like cProfile and line_profiler
  • Optimization approaches like algorithms, compilation, concurrency

Mastering these practical techniques for timing, profiling, and speeding up Python code will help you as a developer create faster, higher quality applications.

For more information, check out my video course on Python performance tuning. Happy coding!