This group of pages doesn't really describe a programming project, but it is a convenient place to record various Python tricks I've learnt. I hope they might also prove useful to other people.
I've called these code snippets tricks, but they are really just handy Python features used in simple but effective ways. Since learning them, I have used them frequently and feel they have improved the way I code.
There is a good list of so-called "hidden features" in Python on Stack Overflow: http://stackoverflow.com/questions/101268/hidden-features-of-python/1024693
In Python, not only is 0 False and 1 True, but True is 1 and False is 0. This means you can do weird things like:
>>> a = 1
>>> (a==1) + (a>0) + (a==2)
2
I'm not sure why you'd want to do this unless you wanted to count how many conditions had been met. More usefully, you can use a boolean test as an index for a list or tuple. For example, rather than write:
if a % 2 == 0:
    print "a is even"
else:
    print "a is odd"
You can write:
print ("a is odd", "a is even")[a % 2 == 0]
Admittedly, this is probably less readable.
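The indexing works because True and False behave as the integers 1 and 0, so the test simply picks position 0 or 1 from the tuple. Here is a minimal sketch of that idea; the helper name parity_label is made up for illustration, and the code avoids print so it does not depend on the Python version:

```python
# bool is a subclass of int, so True and False can be used as indices.
assert True == 1 and False == 0

def parity_label(a):
    """Describe a's parity by indexing a tuple with a boolean test."""
    # Index 0 (False) -> "a is odd", index 1 (True) -> "a is even".
    return ("a is odd", "a is even")[a % 2 == 0]

label_for_3 = parity_label(3)   # "a is odd"
label_for_4 = parity_label(4)   # "a is even"
```

One thing to bear in mind is that both items in the tuple are evaluated before the index is applied, whereas an if/else only runs one branch.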
An example of when I've found this trick useful is when I wanted to create a play/pause button. In response to a keystroke, I wanted to flip the value of a boolean variable called 'paused': if it was currently True then it should become False, and if it was False then it should become True. This can be achieved like so:
paused = (True, False)[paused]
Another situation in which using a boolean test as an index might be useful is when you don't have the luxury of writing multiple lines of code, e.g. within a lambda function or list comprehension. For example:
>>> my_list = [1, 7, 11, 8, 13, 2]
>>> [("odd","even")[i % 2 == 0] for i in my_list]
['odd', 'odd', 'odd', 'even', 'odd', 'even']
A more useful example would be to threshold a list of data:
thresholded_data = [(0,1)[i > threshold] for i in my_list]
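To make the thresholding line concrete, here is a small runnable sketch. The sample values and the cut-off of 1.0 are invented purely for illustration:

```python
# Hypothetical sample data and cut-off, chosen only for illustration.
my_list = [0.2, 1.5, 0.9, 3.1, 0.0]
threshold = 1.0

# (0, 1)[False] is 0 and (0, 1)[True] is 1, so each value maps to 0 or 1.
thresholded_data = [(0, 1)[i > threshold] for i in my_list]
# thresholded_data is now [0, 1, 0, 1, 0]
```

Since the booleans already equal 0 and 1, int(i > threshold) would give the same list; the tuple form is more useful when the two outcomes are something other than 0 and 1.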
Like enumerate, I'm not sure that this can rightly be called a trick since it's just a built-in method, but it's one that I wasn't aware of until long after I first needed it. A dictionary's get() method checks whether a key is in the dictionary and returns a default value if it isn't.
For example:
>>> numbers = {1: 'one', 2:'two', 3:'three'}
>>> print numbers.get(1, 'Number not defined')
one
>>> print numbers.get(4, 'Number not defined')
Number not defined
This is useful in all sorts of situations. I most often use it to count how many times each item occurs in a group of items, for example when building a histogram. To count the frequency of letters in a string:
my_string = "I want to get the counts for each letter in this sentence"
counts = {}
for letter in my_string:
    counts[letter] = counts.get(letter, 0) + 1
print counts
Here, for each letter in my_string, you look up the current count for that letter; if the letter hasn't yet been added to the counts dictionary, 0 is returned instead.
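To see what get() is saving you, the sketch below runs the counting loop alongside the same logic written with an explicit membership test. The short test string and the name explicit_counts are just for illustration:

```python
my_string = "hello world"

# Version using get(): a missing letter counts as 0 before adding 1.
counts = {}
for letter in my_string:
    counts[letter] = counts.get(letter, 0) + 1

# The same logic spelled out with an explicit membership test.
explicit_counts = {}
for letter in my_string:
    if letter in explicit_counts:
        explicit_counts[letter] += 1
    else:
        explicit_counts[letter] = 1

# Both approaches produce identical dictionaries.
assert counts == explicit_counts
# counts["l"] is 3 and counts["o"] is 2
```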
This is probably stretching the definition of 'trick', given that it's simply using a built-in Python function, but it's a very handy function that I didn't know about for some time, whilst wishing there was something exactly like it.
Python is great when it comes to traversing lists with for loops, but sometimes you want to know where in the list you are. I used to write code something like:
for n in range(len(my_list)):
    print n, my_list[n]
But it's much cleaner (and, I believe, a little more efficient) to write:
for n, item in enumerate(my_list):
    print n, item
This function is particularly useful if you want to compare every item in a list to every other item in the list. For example, in my particle simulation, I wanted to test whether any particle in a list of my_particles overlapped with any other. The following code calls the collide function (which checks whether two particles overlap), with each pair of particles in the list.
for i, particle1 in enumerate(my_particles):
    for particle2 in my_particles[i+1:]:
        collide(particle1, particle2)
A further 'trick' with enumerate is to pass a second parameter, which defines what number to start counting from. For example:
a = ['two', 'three', 'four']
for i, word in enumerate(a, 2):
    print i, word
Will print:
2 two
3 three
4 four
Therefore the code for particle collisions above can be made a bit cleaner:
for i, particle1 in enumerate(my_particles, 1):
    for particle2 in my_particles[i:]:
        collide(particle1, particle2)
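As a quick check that this loop visits each pair exactly once, here is a sketch in which the particles are replaced by strings and collide is a hypothetical stand-in that only records the pair it was given (the real function would test for overlap):

```python
# Stand-in data: strings instead of particle objects.
my_particles = ["A", "B", "C", "D"]
checked_pairs = []

def collide(particle1, particle2):
    """Hypothetical stand-in for the real overlap test: just log the pair."""
    checked_pairs.append((particle1, particle2))

# Counting from 1 means my_particles[i:] starts just after particle1.
for i, particle1 in enumerate(my_particles, 1):
    for particle2 in my_particles[i:]:
        collide(particle1, particle2)

# checked_pairs now holds the 6 unordered pairs:
# AB, AC, AD, BC, BD, CD
```

No particle is compared with itself, and no pair is checked twice in the opposite order.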
I had known about Python list comprehensions for a while, but had avoided them, thinking them too complicated (to write and to read) and that I could achieve whatever list comprehensions achieve without them (which is true). However, now that I have got the hang of using them, I find them one of the most useful techniques in Python. I sometimes find myself replacing quite long loops or whole functions with a single-line list comprehension. I have even attempted to reduce Conway's Game of Life to a single line of Python using list comprehensions.
You can find good guides to list comprehensions elsewhere on the internet, but briefly, they generate a list using one or more for loops with optional conditions. For example, if you want a list of the first five square numbers, [1, 4, 9, 16, 25], you could use a for loop:
squares = []
for n in range(1,6):
    squares.append(n**2)
But you can do it in a single line with a list comprehension:
[n**2 for n in range(1,6)]
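The optional conditions mentioned above go at the end of the comprehension. A short sketch, where the name even_squares and the choice of filter are just for illustration:

```python
# Squares of 1..5, matching the for loop version.
squares = [n**2 for n in range(1, 6)]                      # [1, 4, 9, 16, 25]

# The same comprehension with a condition: keep only even n.
even_squares = [n**2 for n in range(1, 6) if n % 2 == 0]   # [4, 16]
```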
A real-life example of when I used a list comprehension was to build a list of all the codons (triplets of nucleotide bases):
bases = ['U', 'C', 'A', 'G']
codons = [a+b+c for a in bases for b in bases for c in bases]
List comprehensions can be used to construct lists of arbitrary complexity, and it can be very tempting to do so once you get the hang of them, though I'm not sure it makes for very readable code. Below is some code I wrote to compress a list such that the first item in my new list was the average of the first ten items in the original list, the second was the average of the next ten items and so on. It's written such that I can change the 'bin size' from ten to whatever I want.
data = [10, 13, 15 ...] # Several thousand numbers
bin = 10
compressed_data = [float(sum(data[n*bin:(n+1)*bin]))/bin for n in range(len(data)/bin)]
Whether this is particularly readable is debatable.
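For anyone wanting to try the binning code, here is a sketch on a tiny made-up data set. The numbers are invented, the variable is renamed bin_size so that it doesn't shadow the built-in bin() function, and // is used so the division stays an integer in Python 3 as well:

```python
# Hypothetical short data set; the original had several thousand numbers.
data = [10, 13, 15, 12, 20, 22, 24, 26, 99]
bin_size = 4   # called 'bin' in the text; renamed to avoid shadowing bin()

# // is integer division, so a trailing partial bin (here the lone 99) is dropped.
compressed_data = [
    float(sum(data[n * bin_size:(n + 1) * bin_size])) / bin_size
    for n in range(len(data) // bin_size)
]
# compressed_data is [12.5, 23.0]
```

Spreading the comprehension over several lines like this is one way to claw back some readability.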
Sorting lists in Python is very simple (list.sort()), but I often need to sort a list of objects based on one of the objects' attributes. I tried various messy, hacky methods before finding a simple one: passing sort a key function that tells it what value to sort by.
Say you have a list of objects, each of which has an attribute called 'score'. You can sort the list by object score like so:
my_list.sort(key = lambda x: x.score)
This passes a lambda function to sort, which tells it to order the objects by their score attributes. Otherwise, the sort function works exactly as normal (so it will, for example, order strings alphabetically).
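Here is a self-contained sketch of sorting by an attribute. The Player class, the names and the scores are all hypothetical, invented only so there is something to sort:

```python
class Player(object):
    """Hypothetical class with a 'score' attribute, for illustration only."""
    def __init__(self, name, score):
        self.name = name
        self.score = score

my_list = [Player("Ann", 42), Player("Bob", 17), Player("Cy", 30)]

# Sort in place, lowest score first.
my_list.sort(key=lambda x: x.score)
names_by_score = [p.name for p in my_list]        # ['Bob', 'Cy', 'Ann']

# reverse=True gives highest score first.
my_list.sort(key=lambda x: x.score, reverse=True)
names_descending = [p.name for p in my_list]      # ['Ann', 'Cy', 'Bob']
```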
You can also use this technique to sort a dictionary by its values:
sorted_keys = sorted(my_dict.keys(), key=lambda x: my_dict[x])
for k in sorted_keys:
    print my_dict[k]
The code creates a list of the dictionary keys, which it sorts based on the value for each key (note that simply sorting my_dict.keys() would order the keys by themselves, not by the values they map to). Alternatively you can loop through the keys and values in one go:
for k, v in sorted(my_dict.items(), key=lambda (k,v): v):
    print k, v
I think a common mistake when coming to Python from some other programming languages is to loop through a list like this:
for i in range(len(my_list)):
    print my_list[i]
Rather than:
for x in my_list:
    print x
Whilst I knew not to do this, when trying to loop through two lists at the same time, I found myself resorting to this:
for i, x in enumerate(list1):
    print x, list2[i]
A much better solution is to zip the two lists together like this:
for (i, j) in zip(list1, list2):
    print i, j
A syntax that confused me for some time was zip(*my_list):
z = zip(list1, list2)
newlist1, newlist2 = zip(*z)
This works because the * syntax unpacks a list of values into separate arguments. The above code zips and unzips two lists, which is pointless, but the same syntax can be used to convert from a list of columns of data to a list of rows of data. For example, the following list comprehension reads in a file of tab-delimited data as a list of rows, where each row is a list of values:
rows = [line.rstrip().split('\t') for line in open(filename)]
If you want to flip the data through 90 degrees (i.e. convert from rows of data to columns of data), then you use:
columns = zip(*rows)
For example, if the data was originally (a, 1), (b, 2), (c, 3), it becomes (a, b, c), (1, 2, 3).
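The (a, 1), (b, 2), (c, 3) example can be run directly. This sketch wraps zip in list(), which is an addition needed only in Python 3, where zip returns an iterator rather than a list:

```python
rows = [("a", 1), ("b", 2), ("c", 3)]

# zip(*rows) is the same call as zip(rows[0], rows[1], rows[2]).
# list() is needed in Python 3, where zip returns an iterator.
columns = list(zip(*rows))
# columns is [('a', 'b', 'c'), (1, 2, 3)]

# Applying the same step again restores the original rows.
restored = list(zip(*columns))
assert restored == rows
```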
In Python, and and or work in a slightly unusual way: rather than returning True or False, they return one of their operands. This means they can be used to assign values.
Rather than write:
if n < 0:
    result = 'n is negative'
else:
    result = 'n is positive'
You can write:
result = n < 0 and 'n is negative' or 'n is positive'
The result is shorter, though I'm not sure it's more readable. However, the fact that it's a single line makes it more versatile. For example, you can include it in a list comprehension, as I did in this contrived example, or you can pass it as an argument to a function.
The general form is:
result = test and true_result or false_result
The logic relies on both results counting as true. If the test is true, then test and true_result evaluates to true_result, which is true, so the or stops there and returns it; if the test is false, then test and true_result is false, so the or returns false_result.
There is a more detailed explanation of why this trick works at Dive Into Python.
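The requirement that the results count as true matters in practice. The sketch below wraps the trick in two illustrative helpers (describe and and_or are names made up here) and shows what happens when the "true" result is 0, alongside the conditional expression available since Python 2.5:

```python
def describe(n):
    """The and/or trick applied to the sign of n."""
    return n < 0 and 'n is negative' or 'n is positive'

def and_or(test, true_result, false_result):
    """General form of the trick; breaks if true_result is falsy."""
    return test and true_result or false_result

negative = describe(-3)     # 'n is negative'
positive = describe(5)      # 'n is positive'

# Pitfall: the true result here is 0, which counts as false,
# so the or falls through to the false result.
surprise = and_or(True, 0, 'fallback')      # 'fallback', not 0

# A conditional expression (Python 2.5+) has no such problem.
safe = 0 if True else 'fallback'            # 0
```

So the trick is safe when the value after and is something like a non-empty string, but needs care when it could be 0, an empty string, an empty list or None.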