Monthly Archives: June 2018

Tiny speed-ups for Python code

Here’s a bunch of examples looking at micro-optimisations in [Python][python] code.

[python]: https://www.python.org/

Testing if a variable matches either of 2 values
------------------------------------------------

You have a variable and want to test if it is any of 2 values. Should you use a test for membership with a tuple? Or a test for membership with a set? Or just use 2 comparisons with a logical or?

Answer: use `in` with a tuple.

    if value in ('foo', 'bar'):
        pass

Caveats: for complex objects the answer might change, because comparing for equality (which the tuple test falls back to when the identity check fails) or calculating the hash of the object (for a set membership test) can be expensive.
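
If you want to check this on your own machine, here is a rough sketch using `timeit`. The function names and the `compare()` helper are mine, made up for illustration, and the numbers will vary with your hardware and Python version. Every variant pays the same function call overhead, so treat the output as a relative comparison only.

```python
import timeit


def with_tuple(value):
    # Membership test against a tuple literal.
    return value in ('foo', 'bar')


def with_set(value):
    # Membership test against a set literal (requires hashing the value).
    return value in {'foo', 'bar'}


def with_or(value):
    # Two equality comparisons joined with a logical or.
    return value == 'foo' or value == 'bar'


def compare(value, number=100_000):
    """Return the seconds each approach takes for `number` calls."""
    return {
        func.__name__: timeit.timeit(lambda: func(value), number=number)
        for func in (with_tuple, with_set, with_or)
    }


# Try a match on the first value, a match on the second value, and a miss.
for value in ('foo', 'bar', 'qux'):
    print(value, compare(value))
```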

Fastest way to copy a list
--------------------------

Given a list of objects (already in memory), what’s the fastest way to make a second list of the same objects? The built-in `list()` or a list comprehension?

Answer: use the built-in `list()`.

    source = ['foo', 'bar']
    dest = list(source)
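
A minimal sketch for measuring the two options yourself. The helper names and the list sizes are assumptions I picked for illustration. Both approaches make a shallow copy: a new list holding the same objects.

```python
import timeit


def with_builtin(source):
    # Shallow copy using the built-in list().
    return list(source)


def with_comprehension(source):
    # Shallow copy using a list comprehension.
    return [item for item in source]


def compare(source, number=10_000):
    """Return the seconds each approach takes for `number` copies."""
    return {
        func.__name__: timeit.timeit(lambda: func(source), number=number)
        for func in (with_builtin, with_comprehension)
    }


# Compare a tiny list and a larger one, since the gap may depend on size.
for size in (2, 1_000):
    source = ['foo', 'bar'] * (size // 2)
    print(size, compare(source))
```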

Checking if any item in a tuple is an empty string
--------------------------------------------------

Given a tuple of values, is any of them the empty string? This was prompted by some crazy code that used `any()` with a generator expression and a test.

Answer: use `in`, don’t use `any()` with a generator expression and a test.

    if '' in ('foo', 'bar', 'baz'):
        pass

Caveats: not sure if this would still hold true for a really long tuple, and maybe depends on the position of the empty string in the tuple (when there is one).
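
To poke at those caveats, something like the sketch below could be used. The tuple length of 10,000 and the case labels are arbitrary choices of mine, not anything measured for this post.

```python
import timeit


def with_in(values):
    # Plain membership test.
    return '' in values


def with_any(values):
    # any() with a generator expression and an equality test.
    return any(value == '' for value in values)


def compare(values, number=200):
    """Return the seconds each approach takes for `number` calls."""
    return {
        func.__name__: timeit.timeit(lambda: func(values), number=number)
        for func in (with_in, with_any)
    }


# Hypothetical inputs: a short tuple, then long tuples with the empty
# string at the start, at the end, and absent.
cases = {
    'short, no match': ('foo', 'bar', 'baz'),
    'long, match first': ('',) + ('foo',) * 10_000,
    'long, match last': ('foo',) * 10_000 + ('',),
    'long, no match': ('foo',) * 10_000,
}

for label, values in cases.items():
    print(label, compare(values))
```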

Reading test fixtures from disk versus memory
---------------------------------------------

Is it faster to read a test file from disk or construct it in memory?

Answer: memory.

    import io

    def test_file_contents():
        source = io.StringIO('foo')

        assert source.read() == 'foo'
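
Here is one way the comparison could be timed, as a sketch only. It writes a throwaway fixture with `tempfile`, and the `from_disk()`, `from_memory()` and `compare()` names are invented for this example. The operating system will probably serve the repeated reads from its cache, so the disk figure mostly reflects the cost of opening and closing the file.

```python
import io
import os
import tempfile
import timeit

CONTENT = 'foo'


def from_disk(path):
    # Open the fixture file and read its contents.
    with open(path) as source:
        return source.read()


def from_memory():
    # Build the fixture in memory and read it back.
    return io.StringIO(CONTENT).read()


def compare(number=2_000):
    """Return the seconds each approach takes for `number` reads."""
    # Write the fixture to a temporary file so the disk case has
    # something to read; the file is removed once timing is done.
    with tempfile.NamedTemporaryFile('w', delete=False) as handle:
        handle.write(CONTENT)
    try:
        return {
            'from_disk': timeit.timeit(
                lambda: from_disk(handle.name), number=number
            ),
            'from_memory': timeit.timeit(from_memory, number=number),
        }
    finally:
        os.remove(handle.name)


print(compare())
```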

Updating a dictionary with 1 key and value
------------------------------------------

What’s the fastest way to set 1 key / value in an existing dictionary?

Answer: use direct assignment; it is a lot faster. Don’t use `dict.update()`.

    value = {}
    value['foo'] = 'bar'

Caveats: I see code using update for a single key a lot, it makes me go spare. You will miss out on annoying me.
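
A quick sketch for timing this yourself. The function names are made up, and each one creates a fresh dictionary so the variants stay independent, which adds the same small cost to all of them. I've included both spellings of `update()` that I tend to see.

```python
import timeit


def with_assignment():
    value = {}
    value['foo'] = 'bar'
    return value


def with_update_dict():
    # Builds a temporary dictionary just to pass it to update().
    value = {}
    value.update({'foo': 'bar'})
    return value


def with_update_keyword():
    # Passes the key as a keyword argument instead.
    value = {}
    value.update(foo='bar')
    return value


# Absolute numbers depend on the machine and the Python version.
timings = {
    func.__name__: timeit.timeit(func, number=100_000)
    for func in (with_assignment, with_update_dict, with_update_keyword)
}

for name, seconds in timings.items():
    print(f'{name}: {seconds:.4f}s')
```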

I think it is useful to be aware of what does and does not perform well in the CPython implementation. Given 2 ways to do the same thing, one should prefer the more efficient approach, right? Having said that, sometimes the more efficient approach is harder to read, so may not necessarily be better.