Monday, October 10, 2011

Cartesian Product for Sets

(Python 2.x)

This is a short one - there is a thread on Python-ideas mailing lista bout adding a cartesian product operation for SET types in Python, using the " * " operator for it.

Such an operation would yield a set where each element would be a tuple with one element from the first set and an elemento f the second set, for all possible combinations.

Most likely it won't fly as a language modification, since using itertools.product seems both more explicit and more eficient than having a __mul__ operation defined for set * set.

However, for the few cases on will want a "set" object with a defined cartesian product operation, it is a simple matter, as are simple all thing Python, of adding a __mul__ method to a "set" subclass that calls itertools.product.

One straightforward way of doing so, one would think, would be to write:

>>> import itertools
>>> class mset(set):
...   __mul__ = itertools.product

However, this won't work for an implementation detail in cpython 2: itertools.product is a "built-in" (a.k.a. native code) function. And as such, it won't work as an ordinary instancemethod if put in a class body like above. Python functions addressed in class bodies like this automatically work as methods of the class, available as instance methods - that is, an extra "self" argument is inserted by the inner workings of Python whenever the method is called. Being a native function, the "self" argument won't be inserted by Python - and thus our __mul__ method would require two explicit arguments and would not be fit for acting as a method to be called when the "*" operator is used between our object and another one.

Thus, steps in meta-python. All that is needed for the itertools.product method to work as an operator is to wrap it with a "pure-python" function.

One could write
def __mul__(self, other):
     return itertools.product(self, other)

but...where is the _fun_ of that? It is so boring   it could be written even in Java. 

Instead, we define an "instancemethod" decorator, since pythona lready has the "classmethod" and "staticmethod" counterparts for it. And all an "instancemethod" decorator needs to do is to wrap a function call in "pure-python" so that it is promoted to a method by the type metaclass (the default Python mechanism for that):

import itertools
instancemethod = lambda f: (lambda *a, **kw: f(*a, **kw))
class mset(set):
   __mul__ = __rmul__ = instancemethod(itertools.product)

And there it is - it works now.
One can type soemthing like:
>>> a = set(range(5,10))
>>> b = mset(range(5))
>>> a * b
<itertools.product object at 0xaace10>

>>> set(a*b)
set([(4, 7), (4, 8), (2, 8), (0, 7), (1, 6), (3, 7), (2, 5), (4, 9), (2, 9), (1, 5), (3, 6), (2, 6), (4, 5), (3, 9), (0, 5), (1, 9), (0, 8), (3, 5), (2, 7), (4, 6), (3, 8), (0, 6), (1, 8), (1, 7), (0, 9)])


  1. Thanks a lot very much for the high quality and results-oriented help. I won’t think twice to endorse your blog post to anybody who wants and needs support about this area.

    Best Hadoop training in bangalore