?

Log in

No account? Create an account

fanf

Tennent's correspondence principle, closures and continuations.

« previous entry | next entry »
2nd Feb 2012 | 19:30

Tennent's correspondence principle is a powerful programming language design guideline. It says that the meaning of an expression or statement should not change when an extra level of block structure is added around it. Following this principle strictly leads to profound consequences, especially for control structures and closures.

There are some implications for variable scoping, too. If the extra level of block structure is a binding construct (such as let or for) then the new variable may shadow an outer variable and change the meaning of inner uses of that name. This is usually treated as a style warning rather than an error as Tennent would suggest.

Instead of forbidding shadowing, another way to follow Tennent might be to specify that local variables have function scope not block scope, as Javascript does. Then all uses of a variable name in a function refer to the same object. However this causes trouble if the language also has nested functions, because this brings back nested scopes and the shadowing problem. The combination of closures and function-scoped variables that look like block-scoped variables leads to a well known Javascript gotcha, so it's probably best to stick with block structure.

Tennent's principle gets more interesting when you look at control structures. Firstly, it says that C-style break and continue are a bad idea. If the language instead has labelled loops, so you write break label then the break still refers to the same loop even if you add an intermediate loop.

Strict adherents of structured programming say that you should follow Tennent by abolishing constructs like break and goto. However this is going too far: to code within this restriction you often have to add extra state variables to produce the control flow that you want, which is less readable and less efficient.

But this pragmatism blows up in your face if your language has nested functions. Ideally you would like users of the language to be able to define their own control structures by writing appropriate higher-order functions. Then the bodies of these control structures are nested functions. And within these control structures, break, goto, and return should work as they do with built-in control structures. The problem with this is it allows nested functions to capture their continuations. In fact, you can define call-with-current-continuation as follows (using a Lua-ish syntax with a named return statement analogous to labelled break):

  function callcc(f)
    local function k(r)
      callcc returns r
    end
    callcc returns f(k)
  end

First-class continuations cause enormous difficulties for language implementations and code that uses them is often extremely hard to understand. How can we escape from this trap?

There is the option of hair-shirt structured programming which bans early returns. This prevents inner functions from making non-local jumps to outer functions. There is the option of not supporting nested functions and higher-order programming. But neither of these are very interesting.

Continuations cause problems when they are not used in a stack-like manner. It is possible to keep nested functions to a stack discipline if you allow them to be used in only two ways: they can be called, and they can be passed as a function argument. They cannot be copied into variables in outer scopes or data structures on the heap, and they cannot be returned up the stack. You can lift these restrictions if the closure does not capture its continuation, which is easy to check statically by looking at its jumpy statements.

Smalltalk-style blocks are enjoying a renaissance at the moment, having been re-popularized by Ruby. Instead of making static restrictions, Smalltalk relies on a dynamic check that prevents callcc from working. Mozilla's Rust programming language implements has second-class "stack closures" (though it isn't clear to me if you can jump out of them) as well as first-class closures that cannot capture their continuation.

It seems to me that this approach is a good way to support expressive higher-order programming and embedded domain-specific languages with a conventional stack-based language implementation.

| Leave a comment | Share

Comments {4}

msimoni

Aren't you using a stronger form of the principle than usual?

from: msimoni
date: 2nd Feb 2012 19:44 (UTC)

The page http://gafter.blogspot.com/2006/08/tennents-correspondence-principle-and.html uses a much weaker form of the principle:

"an expression or statement, when wrapped in a closure and then immediately invoked, ought to have the same meaning as it did before being wrapped in a closure."

E.g. Scheme follows this form of the principle, since X is the same as ((lambda () X)).

You extend this principle to the wrapping of an expression in an arbitrary other expression, e.g. a LET.

Scheme does not follow this stronger form of principle, of course, but this is not considered problematic by Schemers.

Reply | Thread

Tony Finch

Re: Aren't you using a stronger form of the principle than usual?

from: fanf
date: 3rd Feb 2012 11:31 (UTC)

That's true. However all the problems that I discussed arise in the closure case, so the extra strength in my statement of Tennent's principle doesn't introduce new difficulties. Also, I note that LET is defined in terms of LAMBDA in Scheme :-)

An interesting case in Scheme is the "named let" which gives you a kind of jump-to-label construct. So you can write:

  (define call-with-current-continuation
    (lambda (f)
      (let k (r (f k))
         r)))

The question I was trying to address (which I probably could have stated more clearly) is how you can follow Tennent's correspondence principle, and provide jumpy control flow primitives, whilst avoiding the first-class continuation tar pit. Scheme doesn't try to avoid it :-) so I think Scheme does follow my stronger statement (though it allows variable shadowing, which I decided to leave a matter of style rather than language).

Reply | Parent | Thread

msimoni

Re: Aren't you using a stronger form of the principle than usual?

from: msimoni
date: 3rd Feb 2012 13:51 (UTC)

Oh, and personally, if the language doesn't have multi-shot continuations, I think the best way is to allow any use of continuation ojects (store in variable, pass to function, ...) but simply perform a runtime check and raise an error if a continuation is invoked after its dynamic (stack) extent has been left.

Reply | Parent | Thread

Simon Tatham

from: simont
date: 3rd Feb 2012 18:54 (UTC)

Firstly, it says that C-style break and continue are a bad idea. If the language instead has labelled loops, so you write break label then the break still refers to the same loop even if you add an intermediate loop.

I encountered a particularly amusing special case of this sort of problem last year. Someone had taken a big switch and rewritten it as a series of chained if-elses (on the grounds that some of the conditions were about to become more complex than "case CONSTANT:" could express). So they'd removed the break at the end of each case handler as part of this transformation, but they'd missed a few breaks half way through case handlers which were exiting the switch early. As a result, those breaks silently transformed into breaks out of the loop surrounding the switch, with Hilarious Consequences™.

Reply | Thread