astomo.us

In my last post I made an off-hand remark about writing tail-recursive code in Python. I thought that I would elaborate on this. Python’s creator, Guido van Rossum, has famously said that he’s not interested in implementing tail recursion in the language- much to the dismay/disgust of functional programmers everywhere. Im certainly not a hard-core functional programmer and hadnt devoted much thought to this until a few weeks ago when my friend Judah gave a talk on functional programming in numerical analysis; he made a pretty convincing case that using a functional style has a lot of advantages. Of course there’s not much point in doing this if your language doesnt do tail-recursion optimization b/c your code will most likely explode the stack and/or run abysmally slowly. Im still not sure about the second problem, but one (admittedly hacky) way to avoid the first problem when running tail-recursive code in Python is to use decorators (not to be confused with the Decorator design pattern).

So what are decorators in Python? Well, one way to think of them is as a Pythonic analog to the classic C preprocessor macro: they modify existing code at (byte) compile time. However decorators have a somewhat more functional flavor, for example:


#decorators.py
import sys

class simple_decorator(object):
  def __init__(self, f):
    print "initializing simple_decorator"
    f() # Prove that function definition has completed

  def __call__(self):
    print "calling simple_decorator"

def foo1():
  print "calling foo1"

@simple_decorator
def foo2():
  print "calling foo2"

foo1()
foo2()

You can see that simple_decorator is a class whose constructor takes a function as its second argument (functions are first-class objects in Python). The call method typically changes the behavior of the base function in some way. Here it completely replaces it. So in my python interpreter:


In [5]: from decorators import *
initializing simple_decorator
calling foo2
calling foo1
calling simple_decorator

In [6]: foo1()
calling foo1

In [7]: foo2()
calling simple_decorator

foo1 and foo2 have the same source code (modulo the print string of course), but different behavior- thats because foo2 was modified using simple_decorator. We can use this construction to implement some really basic tail-recursion ‘optimization’ by collecting local vars each time the stack explodes and piping them back into the function:


class stack_explosion:
  def __init__(self, args, kwargs):
    self.args = args
    self.kwargs = kwargs

def tail_recursive(foo):
  def foo_tr(*args, **kwargs):
    f = sys._getframe()
    if f.f_back and f.f_back.f_back and f.f_back.f_back.f_code == f.f_code:
      raise stack_explosion(args, kwargs)
    else:
      while True:
        try: return foo(*args, **kwargs)
        except stack_explosion, boom:
          args = boom.args
          kwargs = boom.kwargs
  return foo_tr

@tail_recursive
def fact1(n, result=1):
  if n == 0:
    return result
  return fact1(n-1, n*result)
  
def fact2(n, result=1):
  if n == 0:
    return result
  return fact2(n-1, n*result)

print fact1(10000)
print fact2(10000)

Running fact1(10000) returns 10000! as one would hope, wheras fact2(10000) (the undecorated version) returns a runtime stack-overflow error. Cool, right? This approach was originally pointed out to me by my friend John. Actually speeding up tail-recursive code in Python is more complicated though, and depends a lot on which implementation youre using. One way to go, at least in CPython, is to use a trampoline. This (I believe) is how Scheme does it.

Quick edit: According to Craig you use trampolines to implement tail recursion in scheme if you’re building it on top of a system that doesn’t handle tail calls correctly. If you’re just compiling to assembly, though, you don’t need a trampoline: CPS the program and you can correctly generate the jumps when you’re emitting the assembly.

Ive just finished reading Godfried Toussaint’s excellent paper ‘The Euclidean Algorithm Generates Traditional Musical Rhythms’.   As the title suggests, Toussaint explores the applications of Euclid’s famous algorithm for computing the GCD of two integers to algorithmic rhythm generation.  It turns out that if you take a templated version of the algorithm and apply it to list types in place of integers with a certain mix operation in place of subtraction, and you feed it lists of the type [1,1,1.....,0,0....,0],  you get a procedure that tries to distribute the zeros amongst the ones with maximal uniformity.  For example, applying the procedure to the list [1,1,1,1,0,0,0,0] returns [1, 0, 1, 0, 1, 0, 1, 0] as you might expect.  Things get more interesting when you introduce some non-uniformity however.  Applying it to [1,1,1,0,0,0,0,0] returns [1, 0, 0, 1, 0, 1, 0, 0], a common clave known as tresillo.

The procedure I’ve just described was first applied to pulse timing in spallation neutron source (SNS) accelerators by Eric Bjorklund, but the basic premise of trying to spread stuff out as much as possible subject to certain constraints pops up everywhere in mathematics, from the distribution of zeros of the zeta function to the eigenvalues of a Gaussian ensemble (the two are actually related, a fact famously discovered in the cafeteria of the Institute for Advanced Study) to Ornstein-Uhlenbeck processes (O-U processes, incidentally, are the continuous-time version of what you get when you feed pure white noise into a certain kind of IIR filter).   Not to mention various elementary physics phenomena such as the diffusion equation and magnetism.  This stuff is still an active area of research in mathematics.  If youre interested I suggest checking out Terry Tao’s blog on the subject for starters.

Anyways, this is incredible and I could go on and on about it, but its much more fun to just implement Bjorklund and start messing around.  I find scripting languages in general and Python in particular to be a good tool for this sort of thing. The remainder of the post is a sortof walk-through of my python code, so fire up your Python interpreter and read on.

For reference, here is a subtractive version of the EA:


def gcd(a,b):
  if a==0: return b
  while b!=0:
    if a>b: a = a-b
    else: b = b-a
  return a

To morph EA land into Bjorklund we need a analogs of the integers a and b, >, and -. In Bjorklund a and b are lists and are at times glued together into one list. So we need a way of splitting a list into two components a and b. The split works by separating out the largest homogeneous chuck of the tail of the list. I did this in a tail recursive style:


def split(a,b=[]):
  b.insert(0,a.pop())
  if not a or b[0] != a[-1]: return a,b
  else: return split(a,b)

So for example in my python interpreter:


In [64]: a = [[1],[1],[1],[0],[0],[0],[0],[0]]

In [65]: split(a)
Out[65]: ([[1], [1], [1]], [[0], [0], [0], [0], [0]])

In [66]: a = [[1, 0], [1, 0], [1, 0], [1, 0]]

In [67]: split(a)
Out[67]: ([], [[1, 0], [1, 0], [1, 0], [1, 0]])

In the latter case the entire list is homogeneous so split returns an empty list and the original list. This will be our base case (ie the Bjorklund version of b==0).

The analog of subtraction is mix.


def mix(a,b):
  m = min(len(a),len(b))
  if len(a) > m:
    return [a[i]+b[i] for i in range(m)] + a[m:]
  else:
    return [a[i]+b[i] for i in range(m)] + b[m:]

Mix takes two lists and concatenates them element-wise, then concatenates the result with whatever is left over. For example:


In [72]: a = [[1],[1],[1],[0],[0],[0],[0],[0]]

In [73]: b = [[1],[1],[1],[1],[1],[0],[0]]

In [74]: mix(a,b)
Out[74]: [[1, 1], [1, 1], [1, 1], [0, 1], [0, 1], [0, 0], [0, 0], [0]]

Now the morphed EA looks like this:


def foo(a):
  x,y = split(a)
  while len(x) != 1 and len(x) != 0:
    x,y = split(mix(x,y))
  return [x,y]

One final primitive we need is a function to flatten out the result:


def flatten(x):
  result = []
  for el in x:
    if hasattr(el, "__iter__") and not isinstance(el, basestring):
      result.extend(flatten(el))
    else:
      result.append(el)
  return result

Finally, the Bjorklund algorithm can be written succintly as:


def bjorklund(a):
  return flatten(foo(a))

So for example to create a tresillo clave and its ‘mirror image’:


In [78]: a = [[1],[1],[1],[0],[0],[0],[0],[0]]

In [79]: bjorklund(a)
Out[79]: [1, 0, 0, 1, 0, 1, 0, 0]

In [80]: b = [[1],[1],[1],[1],[1],[0],[0],[0]]

In [81]: bjorklund(b)
Out[81]: [1, 0, 1, 1, 0, 1, 0, 1]

Of course this is just the tip of the iceberg. As a white rabbit, observe that the operator is contravariant (i.e. bjorklund(a+b) = bjorklund(b) + bjorklund(a) where + here denotes list concatenation):


In [82]: a = [[1],[1],[1],[0],[0],[0],[0],[0]]

In [83]: b = [[1],[1],[1],[1],[1],[0],[0],[0]]

In [84]: bjorklund(a+b)
Out[84]: [1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1]

Addendum: If Python isnt your thing, there are a few other implementations of Bjorklund out there in Ruby and Lisp.

sparse filter design

October 30th, 2009

If sparse filters are a good idea then perhaps there exists a design algorithm that works better than windowing or Parks-McClellan.  Heres a really basic thing you could do instead that may or may not be any good:

Solve the constrained opt. problem:
\underset{b \in l^2(Z)}{min} \|Fb - g\|_2 + \lambda \|b\|_1

where g is the desired freq. response, F is the DFT operator, and lambda is a penalty parameter.   if you make the problem finite dimensional by instead optimizing over b \in R^n for large n (say the size of your audio buffer) and using a FFT instead of a DFT then you have a well-known and easily solved penalized linear regression problem called the lasso.

As a random aside the lasso problem is very similar to the el-1 compressive sensing problem, which my dept. is nuts about and definitely has lots of unexplored applications in audio.

Anyways using \|Fb - g\|_2 here is not really good though because usually you get filter specs. that you want to meet which means using the sup norm. however since \| \cdot \|_{\infty} = lim_{p \to \infty} \| \cdot \|_{p} on finite measure spaces like the torus, one could in principle get a good worst case error by instead minimizing the p norm for large p.

so you could modify the above to:
\underset{b \in l^2(Z)}{min} \|Fb - g\|_p + \lambda \|b\|_1

I dont know how feasible that is in practice though.


musical sorts

October 30th, 2009

A couple of months ago i got my roommate Ryan hooked on Perry Cook’s excellent sound synthesis library STK.  Since then he’s been writing musical versions of classic sorting algorithms- check them out.

moore’s law is stalling out

October 30th, 2009

I spend a fair amount of time programming GPU’s. when I tell certain people this they ask ‘why bother?’.   Well here is a partial answer in the form of an amusing chart I came across while preparing slides for a talk at HRL:

time vs chip power density

time vs chip power density

The gist is that you cant wring many more cycles per second out of your CPU w/o a really big heat sink.  One way to get around this is to add more CPU’s, this is what general-purpose computer manufacturers are doing of course but they have to make certain concessions in order to achieve latency low enough for general-purpose use.  Graphics cards have no such latency restrictions and hence are optimized for throughput.  If chips were cars GPU’s would be something akin to drag racers- they go fast but don’t turn very well or stop as soon as you apply the brakes.  However they are definitely suitable for stream programming which is the major draw for me.

new setup

October 30th, 2009

Ive finally accepted the fact that web hosting is not a strong point of the ucla math dept (nor should it be) and so Ive set up here instead.  A stripped-down version of my old academic page still exists here.  Hopefully with a 10x bigger pot (its hard to cram a lot of content into 100 Mb’s) this new site will start to grow.