April 06, 2013

Beautiful, Idiomatic Python

PyCon US 2013 has made many great videos of the event available to those of us who didn't manage to go. "Transforming Code into Beautiful, Idiomatic Python" by Raymond Hettinger is a shining example of the difference between 'functioning' code and 'fast and beautiful' code. Here's a link to his slides: speakerdeck. Follow Raymond Hettinger's low-traffic Twitter account, @raymondh.

If you don't really get what idiomatic means, think of idioms (http://en.wikipedia.org/wiki/Idiom). Idiomatic can refer to expressing an idea in a way that many people can follow, like a figure of speech or a proverb. Proverbs don't generally translate well between spoken languages: a literal translation tends to sound weird, and it isn't as catchy, funny, or to the point as the same sentiment expressed by a native speaker using the correct proverb for that language.

The concept of the idiom also applies to programming languages. This is perhaps an unfair example, but it should make the point. Someone using C++ might iterate over an array like so:
for (int i = 0; i < size_of_array; i += 1) { cout << some_array[i]; }
A literal translation of the C++ example into Python would be:
some_list = [2,5,34,56,27,45,67]
size_of_list = len(some_list)
for i in range(size_of_list):
    print(some_list[i])

Python offers a tidier, 'native', idiomatic way to express iterating over a list:
for item in some_list:
    print(item)
C++ has moved on, and the C++11 standard offers a range-based for loop similar to the for-each loops provided by Python and Java. A for-each loop is such a fundamental expression that the mechanics of setting up the loop and accessing members via indices can be done for us by the language. Hence, instead of writing them explicitly, they can be written in an implied, 'idiomatic' way. Over time (or by design) the people most active in the development of a language arrive at a preferred way to express common operations.
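To make "done for us by the language" a little more concrete, here is a rough sketch of what a Python for loop does behind the scenes, assuming the standard iterator protocol (iter() and next()). The helper name manual_for_each is made up for illustration; nobody would write this in real code.

```python
def manual_for_each(iterable, action):
    """Roughly what `for item in iterable: action(item)` does.

    Hypothetical helper, for illustration only.
    """
    iterator = iter(iterable)          # ask the object for an iterator
    while True:
        try:
            item = next(iterator)      # fetch the next member
        except StopIteration:          # raised when the iterator is exhausted
            break
        action(item)


some_list = [2, 5, 34, 56, 27, 45, 67]
collected = []
manual_for_each(some_list, collected.append)
print(collected == some_list)
```

The idiomatic for loop hides all of that bookkeeping, which is why it works the same way on lists, dicts, files and generators.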

I recommend watching the video linked above, pausing to take notes as you go. Here are mine, as a note to self for quick reference (as I don't write much Python at present).
# idiomatic_python.py
some_list = ['fee', 'phi', 'fo', 'fhum']
# do something for every item in an iterable
for item in some_list:
    print(item)
# need to reverse iterate?
for item in reversed(some_list):
    print(item)
# need to keep track of the index while iterating?
for idx, item in enumerate(some_list):
    print(idx, item)
'''---------------------------------------------'''
i_list = ['wood','metal','stone','sand']
j_list = ['wood2','metal2','stone2','sand2']
# need to iterate over multiple small lists?
for i, j in zip(i_list, j_list):
    print(i, j)
# need to loop in sorted order?
for item in sorted(some_list):
    print(item)
# need to loop in reverse-sorted order?
for item in sorted(some_list, reverse=True):
    print(item)
# customized sort order?
# if len(), int() or any of the other builtins aren't sufficient
# to determine a sorting order, then you write your own function,
# or a lambda. sorted has great documentation, read it.
print(sorted(some_list, key=len))
print(sorted(some_list, key=lambda item: item[-1]))  # e.g. sort by last letter
# these two produce the same output, but the first is slightly slower
# because of the extra function call.
num_strings = ['10', '9', '100']  # example data, added for illustration
print(sorted(num_strings, key=lambda x: int(x)))
print(sorted(num_strings, key=int))
'''---------------------------------------------'''
# calling a function until a sentinel value
# investigate partials. <--- do this.
from functools import partial
blocks = []
with open('some_file.txt') as f:
    # iter(callable, sentinel) keeps calling f.read(32) until it returns ''
    for block in iter(partial(f.read, 32), ''):
        blocks.append(block)
# multiple exit points in a loop?
some_iterable = ['vertices', 'indices', 'edges', 'polygons']
def find(some_iterable, token):
    for idx, item in enumerate(some_iterable):
        if item == token:
            break
    else:
        # the else clause runs only when the loop finishes without break
        return -1
    return idx
idx = find(some_iterable, 'edges')
print(idx)
'''---------------------------------------------'''
# need to loop over the keys of a dictionary?
some_dict = {"author": "john dough", "job": "clown", "age": "184"}
for key in some_dict:
    print(key)
# need to loop over the keys and values of a dictionary?
# http://docs.python.org/3.1/whatsnew/3.0.html#views-and-iterators-instead-of-lists
for key, value in some_dict.items():
    print(key, value)
# to modify a dict while iterating over its keys, iterate over a
# list copy of the keys rather than over the dict itself.
some_dict = {"author": "john dough", "job": "clown", "age": "184"}
# this fails in Python 3 with "RuntimeError: dictionary changed size
# during iteration", because keys() is a view and not a list copy:
# for key in some_dict.keys():
#     if key.startswith('a'):
#         del some_dict[key]
# iterating over list(some_dict) instead makes the copy and works.
# alternatively, build a new dict without the unwanted keys:
diff_dict = {key: some_dict[key] for key in some_dict if not key.startswith('a')}
print(diff_dict)
'''---------------------------------------------'''
# Constructing a dictionary from pairs (two lists)
# the talk notes that the same tuple is reused over and over.
field_keys = ['author', 'job', 'age']
field_values = ['john dough', 'clown', 184]
new_dict = dict(zip(field_keys, field_values))
print(new_dict)
# default values from dict if dict key not present
# this example counts with the dict
test_list = ['noun', 'noun', 'verb', 'adjective', 'subject', 'proper_noun']
count_dict = {}
for POS in test_list:
    count_dict[POS] = count_dict.get(POS, 0) + 1
print(count_dict)
# a more modern way of ensuring error-free lookup
# warning: d will need to be converted back to a normal dict from defaultdict
from collections import defaultdict
d = defaultdict(int)
for POS in test_list:
    d[POS] += 1
print(dict(d))  # subtle, but requires conversion to use as dict
# how to group in dictionaries, use setdefault, it inserts missing keys
# slightly ugly
test_list = ['noun', 'noun', 'verb', 'adjective', 'subject', 'proper_noun']
d = {}
for POS in test_list:
    key = len(POS)
    d.setdefault(key, []).append(POS)
print(d)
# {4: ['noun', 'noun', 'verb'], 9: ['adjective'],
#  7: ['subject'], 11: ['proper_noun']}
# (key order may differ on older Python versions)
# cleaner python, using defaultdict, this is the new idiom. it is fast.
# you must know how to do it.
test_list = ['noun', 'noun', 'verb', 'adjective', 'subject', 'proper_noun']
from collections import defaultdict
d = defaultdict(list)
for POS in test_list:
    key = len(POS)
    d[key].append(POS)
print(dict(d))  # subtle, but requires conversion to use as dict
'''---------------------------------------------'''
# popping from a dictionary
some_dict = {"noun": 3, "verb": 23, "adjective": 7}
while some_dict:
    key, value = some_dict.popitem()
    print(key, value)
print(some_dict)  # ends empty
'''---------------------------------------------'''
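### linking dictionaries
# cl = command line
# looks in cl_args first then os.environ then defaults
# fast and beautiful.
import os
from collections import ChainMap
defaults = {'color': 'red'}  # placeholder values, for illustration only
cl_args = {'user': 'guest'}  # placeholder, e.g. parsed command line options
d = ChainMap(cl_args, os.environ, defaults)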
### keyword arguments to clarify variable function input
def some_function(name, nails, num, stainless):  # placeholder, for illustration
    pass
# not
some_function("var1", False, 30, False)
# better, but slower by a very small amount, so don't do
# it in the middle of a tight, massively repeating loop.
some_function("var1", nails=False, num=30, stainless=False)
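# TestResults = doctest.testmod()
# i must confess i don't get what this slide was about..
# (the point seems to be that a namedtuple labels the fields of a
# bare tuple such as (0, 4), so the result describes itself)
from collections import namedtuple
TestResults = namedtuple('TestResults', ['failed', 'attempted'])
results = TestResults(failed=0, attempted=4)
print(results)         # TestResults(failed=0, attempted=4)
print(results.failed)  # 0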
'''---------------------------------------------'''
# simultaneous variable updates (state updates)
# straight rip from Raymond Hettinger,
# this updates the state all at once
def fibonacci(n):
    x, y = 0, 1
    for i in range(n):
        print(x)
        x, y = y, x+y
'''---------------------------------------------'''
# high performance, anywhere you use
#     del some_iterable[0]
#     some_iterable.pop(0)
#     some_iterable.insert(0, "some new item")
# on a list, you should be using a deque (pronounced deck)
from collections import deque
some_iterable = deque(['item1','item2','item3','item4','item5'])
del some_iterable[0]
some_iterable.popleft()
some_iterable.appendleft('new item')
# Using decorators to avoid spending cycles on functions that will
# return the same value given the same input. here's an example
# assuming the content of the page won't change between calls,
# taken directly from Raymond Hettinger
from functools import wraps
from urllib.request import urlopen
def cache(func):
    saved = {}
    @wraps(func)
    def newfunc(*args):
        if args in saved:
            return saved[args]  # reuse the stored result
        result = func(*args)
        saved[args] = result
        return result
    return newfunc
@cache
def text_content_of_page(url):
    return urlopen(url).read()
'''---------------------------------------------'''
# Factoring out temporary context means you don't need to
# reset the old context yourself; this can be useful for blender
# context jumping too. at the end of the block, whatever the
# precision was before will be restored. This avoids set-up and
# tear-down logic.
from decimal import Context, Decimal, localcontext
with localcontext(Context(prec=50)):
    print(Decimal(400) / Decimal(455))
# another example of ditching the setup-teardown is file access
# takes care of the try and finally close() behind the scenes
with open('some_file.txt') as f:
    file_content = f.read()
# i've never used a lock so this will be homework
# make a lock
import threading
lock = threading.Lock()
with lock:
    print("stuff1")
    print("stuff2")
# at this point lock is released by control flow.
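# i've never used OSError either. more study!
# ignored() has to be defined before it is used
# (Python 3.4+ ships the same idea as contextlib.suppress)
import os
from contextlib import contextmanager
@contextmanager
def ignored(*exceptions):
    try:
        yield
    except exceptions:
        pass
with ignored(OSError):
    os.remove('some_file.tmp')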
'''---------------------------------------------'''
# generator expressions: drop the square brackets and sum() is fed
# by a generator instead of a list. this avoids building the whole
# list in memory first, and can be faster for large inputs.
sum([i**2 for i in range(10)])
sum(i**2 for i in range(10))
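
As a rough way to see the difference between the last two lines, this sketch compares the size of the list object with the size of the generator object using sys.getsizeof. The names squares_list and squares_gen and the range of 10000 are my own choices for illustration, and the exact sizes depend on the Python build, so only the comparison is printed.

```python
import sys

squares_list = [i**2 for i in range(10000)]  # builds all 10000 values up front
squares_gen = (i**2 for i in range(10000))   # produces values one at a time

# the list object is much larger than the generator object
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))

# both feed sum() the same values
list_total = sum(squares_list)
gen_total = sum(squares_gen)
print(list_total == gen_total)
```

Note that a generator can only be consumed once: after the sum() call above, squares_gen is exhausted.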