Saturday, October 30, 2010

Creating a Heap Class in one Python Line

Sometimes the stdlib is just strange.
Take the "heapq" module: it provides functions that treat a list as a heap, using some well-known algorithms, but it does not provide a "Heap" class itself. You have to create your heap as an empty list and call "heapq.heappush" and "heapq.heappop" (among other functions), always passing your "heap" (list) as the first argument.

Oh... "the object itself" as its first argument. I had seen that before. So we could possibly just create a class and, in its body, write "pop = heapq.heappop" and so on. Since the signature of these "heap*" functions is always "heapq.heap*(heap[, arg])", if I place them in the body of a "Heap" class, they should work as methods. Not!

import heapq

class Heap(list):
    pop = heapq.heappop  # does not work: the instance is not passed in as "heap"

The only members of a class body that are promoted to instance methods are functions written in Python, and the heapq functions are "built-in functions" (in CPython they are implemented in C). This follows from how Python new-style classes work: plain functions implement the descriptor protocol, so looking one up on an instance returns a bound method. Built-in functions do not, so they are not promoted to methods, and whenever they are called, the object instance is not prepended to the argument list.
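The difference can be checked directly. This is a minimal sketch, assuming CPython with its C-accelerated heapq; the helper "py_pop" and the two "has_get_*" names are made up for the illustration:

```python
import heapq

def py_pop(heap):
    # A plain Python wrapper around the built-in function.
    return heapq.heappop(heap)

class Heap(list):
    pop = py_pop  # a Python function, so it is bound to the instance on lookup

h = Heap([3, 1, 2])
heapq.heapify(h)
smallest = h.pop()  # "h" is passed as "heap" automatically

# Python functions implement the descriptor method __get__;
# on CPython the built-in heapq.heappop is not expected to.
has_get_python = hasattr(py_pop, "__get__")
has_get_builtin = hasattr(heapq.heappop, "__get__")
```

With "pop = heapq.heappop" instead, "h.pop()" would call heappop with no arguments at all and fail with a TypeError.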

So, the most obvious way to do that is to create wrapper functions that just call the heapq functions, and use those as methods in a class that inherits from "list".
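Written out by hand, the wrapper approach might look like the sketch below. It assumes CPython's C-accelerated heapq, and the method names are just "heap" stripped from the corresponding heapq function names:

```python
import heapq

class Heap(list):
    """A list subclass whose methods delegate to the heapq functions."""

    def push(self, item):
        heapq.heappush(self, item)

    def pop(self):
        # Shadows list.pop: returns the smallest item instead of the last one.
        return heapq.heappop(self)

    def pushpop(self, item):
        return heapq.heappushpop(self, item)

    def replace(self, item):
        return heapq.heapreplace(self, item)
```

This works, but it takes far more than one line.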

Since the goal is to do that in one line, this is the perfect occasion to use the new dictionary comprehensions from Python 2.7 to create the class body.

So we try:

Heap = type("Heap", (list,), {item: (lambda self, *args: getattr(heapq, "heap" + item)(self, *args))  for item in ("pop", "push", "pushpop", "replace")   } )

This does not work either!
We have to keep in mind that whenever we create functions inside a for loop in Python, each function is a closure: it looks up the loop variable when it is called, not when it is defined. So the variable used in the "for" will always evaluate to the last value it attained in the loop by the time the generated functions are called. Hence, all the members in the class above will wrap "heapq.heapreplace".
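The effect is easy to reproduce without heapq at all. A minimal sketch, with the names "funcs" and "results" chosen only for the illustration:

```python
# Each lambda closes over the variable "name", not over its value at creation time.
funcs = {name: (lambda: name) for name in ("pop", "push", "pushpop", "replace")}

# Call every function only after the comprehension has finished.
results = {key: func() for key, func in funcs.items()}

# Every function reports the last value the loop variable held.
all_same = set(results.values()) == {"replace"}
```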

The way to get it working is to return the generated functions from another level of wrapper functions, so that the value of the loop variable is frozen in each second-level wrapper.

import heapq
Heap = type("Heap", (list,), {item: (lambda item2: (lambda self, *args: getattr(heapq, "heap" + item2)(self, *args)))(item)  for item in ("pop", "push", "pushpop", "replace")})

And now we are in business:

>>> a = Heap()
>>> a.push(10)
>>> a.push(5)
>>> a.push(1)
>>> a.push(20)
>>> a.pop()
1
>>> a.pop()
5
>>> a.pushpop(30)
10
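If the one-line constraint is dropped, the same freezing trick reads more clearly with a named factory function in place of the outer lambda. This is only a sketch under the same CPython assumption; "make_method" is a hypothetical helper name, not something heapq provides:

```python
import heapq

def make_method(name):
    # "name" is a local variable of this call, so each method keeps its own value.
    func = getattr(heapq, "heap" + name)

    def method(self, *args):
        return func(self, *args)

    method.__name__ = name  # nicer repr and tracebacks
    return method

Heap = type(
    "Heap",
    (list,),
    {name: make_method(name) for name in ("pop", "push", "pushpop", "replace")},
)
```

Looking up the heapq function once inside the factory also avoids a getattr call every time a method runs.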