Monday, October 10, 2011

Cartesian Product for Sets

(Python 2.x)

This is a short one - there is a thread on Python-ideas mailing lista bout adding a cartesian product operation for SET types in Python, using the " * " operator for it.

Such an operation would yield a set where each element would be a tuple with one element from the first set and an elemento f the second set, for all possible combinations.

Most likely it won't fly as a language modification, since using itertools.product seems both more explicit and more eficient than having a __mul__ operation defined for set * set.

However, for the few cases on will want a "set" object with a defined cartesian product operation, it is a simple matter, as are simple all thing Python, of adding a __mul__ method to a "set" subclass that calls itertools.product.

One straightforward way of doing so, one would think, would be to write:


>>> import itertools
>>> class mset(set):
...   __mul__ = itertools.product


However, this won't work for an implementation detail in cpython 2: itertools.product is a "built-in" (a.k.a. native code) function. And as such, it won't work as an ordinary instancemethod if put in a class body like above. Python functions addressed in class bodies like this automatically work as methods of the class, available as instance methods - that is, an extra "self" argument is inserted by the inner workings of Python whenever the method is called. Being a native function, the "self" argument won't be inserted by Python - and thus our __mul__ method would require two explicit arguments and would not be fit for acting as a method to be called when the "*" operator is used between our object and another one.

Thus, steps in meta-python. All that is needed for the itertools.product method to work as an operator is to wrap it with a "pure-python" function.

One could write
def __mul__(self, other):
     return itertools.product(self, other)

but...where is the _fun_ of that? It is so boring   it could be written even in Java. 

Instead, we define an "instancemethod" decorator, since pythona lready has the "classmethod" and "staticmethod" counterparts for it. And all an "instancemethod" decorator needs to do is to wrap a function call in "pure-python" so that it is promoted to a method by the type metaclass (the default Python mechanism for that):

import itertools
instancemethod = lambda f: (lambda *a, **kw: f(*a, **kw))
class mset(set):
   __mul__ = __rmul__ = instancemethod(itertools.product)

And there it is - it works now.
One can type soemthing like:
>>> a = set(range(5,10))
>>> b = mset(range(5))
>>> a * b
<itertools.product object at 0xaace10>

>>> set(a*b)
set([(4, 7), (4, 8), (2, 8), (0, 7), (1, 6), (3, 7), (2, 5), (4, 9), (2, 9), (1, 5), (3, 6), (2, 6), (4, 5), (3, 9), (0, 5), (1, 9), (0, 8), (3, 5), (2, 7), (4, 6), (3, 8), (0, 6), (1, 8), (1, 7), (0, 9)])


Friday, January 28, 2011

Ordered Unit Tests

Due to the way classes are constructed in Python, and to how tests are grouped in classes when using the unittest module, ordinarily these tests are executed in arbitrary order (sorted by the method name hash in cPython 2.x)

It is not diffcult to imagine situations when we'd like our unittests to run in order (most likely from simpler to more complex tests). Python 3 metaclasses include a feature that unittest could use to ensure ordered execution of tests, if it where coded to do so (the "__prepare__" method in the metaclass can change the assemble-time class dictionary to an ordered dict, and anottate tehe execution ordeer from there).

However, I wanted an implementation that would work in Python 2. So I came up with this "ordered" decorator that can ensure execution order for tests using unittest. It is deliciously hackish, as you can see: the methods are still called in arbitrary order - but what runs is the decorator code, which then takes care of calling the next test in order, regardless of the one unittest thinks is running. :-)

Appart from having to use the functools.wraps decorator (to preserve method names so that unittest actually finds them -- I didn try without it, it might work), the code is rather straightforward once it is written down.

With you, tonight's metapython bit: