Making Choices Lazily

John Hughes    Andrew Moran
Department of Computing Science
Chalmers University of Technology and University of Göteborg
S-412 96 Göteborg, Sweden
e-mail: {rjmh,andrew}@cs.chalmers.se

Abstract

We present a natural semantics that models the untyped, normal order λ-calculus plus McCarthy's amb in the context of call-by-need parameter passing. This results in a singular semantics for amb. Previous work on singular choice has concentrated on erratic choice, a less interesting non-deterministic choice operator, and only in relation to call-by-value parameter passing, or call-by-name restricted to deterministic terms. The natural semantics contains rules for both convergent and divergent behaviour, allowing it to distinguish programs that differ only in their divergent behaviour. As a result, it is more discriminating than current domain-theoretic models. This, and the fact that it models singular amb, makes the natural semantics suitable for reasoning about lazy, functional languages containing McCarthy's amb.

1 Introduction

The need for non-determinism in functional programming is apparent. There are parallel algorithms that are inherently non-deterministic, deterministic parallel algorithms that require internal non-determinism, and algorithms that admit elegant solutions involving non-determinism. A kind of fair non-determinism is required to implement functional operating systems [22, 10, 21, 13], to provide a merging of incoming messages. A similar mechanism would also find a role in Fudgets [4], to deal with streams of incoming events from the environment. Since functional languages claim to be well-suited to the implementation of parallel algorithms, they should be able to express such algorithms. In addition, any deterministic, parallel construct can be defined in terms of an appropriate non-deterministic choice operator, so a framework for reasoning about a non-deterministic lambda calculus would form a foundation for a framework for reasoning about deterministic parallel functional languages.

It is easy to add non-deterministic choice to a functional language: evaluate each branch in parallel and accept the first to terminate as the result. Such an operator will terminate when either operand does, but can only loop when both operands do. In other words, it is McCarthy's amb [15]. If added to a lazy functional language, we will have singular

choice (since the choice node will be overwritten by the first operand to terminate, choices will be made only once).

We would like to reason about programs written in lazy functional languages containing amb. To do so, we require a semantics that both satisfactorily models amb, and models singular choice. Unfortunately, giving a semantics to McCarthy's amb is difficult. Current denotational semantics for amb either identify programs we would rather consider distinct, or are complex and difficult to relate to our operational intuitions. Also, the semantics of singular choice has been studied only in relation to call-by-value parameter passing and call-by-name restricted to deterministic terms, and not in relation to the important (and more challenging) call-by-need strategy.

Singular choice can cause problems if the semantics doesn't model sharing appropriately. In particular, β-conversion becomes unsafe. For example (using □ to denote singular McCarthy's amb),

    (λx. x + x) (1 □ 2)  ≠  (1 □ 2) + (1 □ 2),

since the left hand side can evaluate to 2 or 4, and the right hand side can evaluate to 2, 3, or 4. This invalidates fold/unfold transformations, a major tool in program transformation.

We first present a summary of an earlier paper that deals with plural McCarthy's amb (in which choice expressions may be copied) to introduce our programme. Then we present a small step reduction semantics that corresponds closely to our operational intuitions both about amb and singular choice. However, this semantics contains a lot of detail and is cumbersome to reason with. Therefore, we present a more abstract natural semantics that is more suited to reasoning. The two semantics are proven equivalent. The natural semantics includes rules for both convergent and divergent behaviour of amb, and can therefore distinguish between expressions that most denotational models identify. Furthermore, since this natural semantics is built on top of Launchbury's natural semantics for laziness, it models singular choice in such a way that the problem noted above with β-conversion disappears. We close with some small examples and a discussion of related work.

We believe the semantics presented here can provide a basis for reasoning about a non-deterministic, lazy, functional language (e.g. Haskell plus amb). Future work will involve investigating equivalences that arise from the semantics, and using those equivalences to establish an equational theory.
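The informal recipe above, evaluating each branch in parallel and accepting the first to terminate, can be sketched with threads. This is our illustration only, not the paper's construction; the function `amb` and its helpers are assumptions:

```python
import threading
import queue

def amb(branch1, branch2):
    """Bottom-avoiding choice: run both thunks in parallel and return
    whichever result arrives first (a sketch of McCarthy's amb)."""
    results = queue.Queue()
    for thunk in (branch1, branch2):
        t = threading.Thread(target=lambda f=thunk: results.put(f()),
                             daemon=True)  # daemon: a looping branch won't hang the program
        t.start()
    return results.get()  # blocks only if *both* branches fail to terminate

def loop():
    """A divergent computation."""
    while True:
        pass

# amb terminates even though one operand diverges:
print(amb(loop, lambda: 42))  # -> 42
```

Note that this operational reading already exhibits bottom-avoidance: the call blocks forever only when both branches do.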

2 McCarthy's amb Operator

We begin with an overview of our programme, using plural choice as an illustrative example. Plural choice results

if choice expressions are copied when substituted; it arises immediately in a call-by-name context. Considering again the example from the introduction: if □ were plural amb, the left hand side could evaluate to 2, 3, or 4. This section summarizes parts of [16].

The central idea is to produce two operational semantics: one intensional and low-level semantics that has a clear correspondence to our operational intuitions, and a more abstract natural semantics that is easier to use for equational reasoning. To justify using the latter in this way, the two semantics are proved to be equivalent.

2.1 Small Step Semantics

The terms of the small step semantics are of the form

    M ::= x | λx.M | M N | M m□n N.

Values are lambda expressions. We call the set of all small step terms Λ□s, and the set of small step values V□s. The m and n in a choice expression e1 m□n e2 represent evaluation resource allocations (see below). An initial term is one in which all □-expressions have nil evaluation resources. Although not enforced by the grammar, we stipulate that all values must be initial in this sense.

The rules are given in figure 1. They define the relation →, between terms of the language. It is a weak reduction scheme; the rules do not apply under λs. The small step semantics is defined as the reflexive, transitive closure of this relation, written →*. We write e →* v to mean that there exists a reduction sequence from term e to value v. We write e →ω to mean that e has an infinite reduction sequence, i.e. possible non-termination. Note that since non-determinism is present, the same expression may have many finite and infinite reduction sequences.

The rules Fun→ and Sub→ define function application and normal order β-reduction in the usual way. To prevent a non-terminating branch from causing a □-expression to diverge when the other branch terminates, we incorporate a scheduler into the semantics. Each branch is allocated a finite, non-zero amount of reductions in each scheduling phase.
In this way, if a branch can terminate, then there exists at least one scheduling that will lead to it being chosen as the final result of the □-expression. This is formalised in the following theorem.

Theorem 2.1 (Fairness) For all closed small step terms e1 and e2, and small step values v1 and v2,

    e1 →* v1 ∨ e2 →* v2  ⟺  ∃i. e1 0□0 e2 →* vi.

Furthermore, the scheduling guarantees that a □-expression can only diverge when both branches are divergent (as stated by the following theorem).

Theorem 2.2 (Bottom-Avoidance) For all closed small step terms e1 and e2,

    e1 →ω ∧ e2 →ω  ⟺  e1 m□n e2 →ω.

Together, these two theorems imply that □, as implemented by the rules in figure 1, is McCarthy's amb. The proofs of these theorems appear in [16], and are simpler versions of the proofs of the analogous theorems that appear in section 4.
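The resource-annotated scheduling can be made concrete with a toy stepper over a miniature term language. This is an illustration under our own simplifications (no λ-terms, just delayed integers and a self-looping term); the constructors and rule correspondences are ours:

```python
# Toy terms: ("val", n), ("delay", t), ("loop",), or ("amb", m, n, left, right),
# where m and n are the branches' remaining evaluation resources.
def step(t):
    """One small step, in the flavour of the Sched/Red/Amb rules; None = value."""
    tag = t[0]
    if tag == "val":
        return None
    if tag == "delay":
        return t[1]
    if tag == "loop":                       # a term that reduces to itself forever
        return t
    _, m, n, l, r = t
    if m == 0 and n == 0:                   # Sched: allot fresh resources
        return ("amb", 2, 2, l, r)
    if m > 0 and l[0] == "val":             # Amb1: left branch finished
        return l
    if n > 0 and r[0] == "val":             # Amb2: right branch finished
        return r
    if m > 0:
        l2 = step(l)
        if l2 is not None:                  # Red1: spend a left resource
            return ("amb", m - 1, n, l2, r)
    if n > 0:
        r2 = step(r)
        if r2 is not None:                  # Red2: spend a right resource
            return ("amb", m, n - 1, l, r2)
    return ("amb", 0, 0, l, r)              # both exhausted: reschedule

def evaluate(t):
    while (t2 := step(t)) is not None:
        t = t2
    return t[1]

# Bottom-avoidance: a looping branch cannot starve a terminating one.
print(evaluate(("amb", 0, 0, ("loop",), ("delay", ("val", 7)))))  # -> 7
```

Evaluation diverges only when both branches are ("loop",), mirroring theorem 2.2.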

2.2 Natural Semantics

To model the operational behaviour of the language more abstractly, we give natural semantics for convergence (defined inductively) and for divergence (defined co-inductively). We need to describe divergence since amb is distinguished from erratic choice (which may diverge if either operand can) by its divergent behaviour. Of course, we can't prove the divergence of a term in general. Therefore, we admit infinite proofs of divergence, via the divergence rules. Formally, these rules form a co-inductive definition (for a more detailed description of co-induction and its uses, see [6]). In the following sections, natural semantics rules will be labelled with a + to indicate that they form part of an inductive definition (the normal interpretation), or with a − to indicate that they form part of a co-inductive definition.

Terms of the natural semantics are essentially the same as the small step terms, except that the □-expressions are not annotated with evaluation resources. We call the set of all natural terms Λ□, and the set of natural values V□. Since we wish to compare the two semantics, we introduce two mappings, (·)♯ and (·)♭. Mapping from natural terms to small step terms, (·)♯ replaces all occurrences of the amb operator with 0□0, whereas (·)♭ removes the annotations (yielding a natural term from a small step term).

2.2.1 Convergence

The natural semantics rules for convergent behaviour are as follows:

    ──────────── Lam⇓ (+)
    λx.e ⇓ λx.e

    e1 ⇓ λx.e    e[e2/x] ⇓ v
    ───────────────────────── App⇓ (+)
         (e1 e2) ⇓ v

    ei ⇓ v    i ∈ {1, 2}
    ───────────────────── Amb⇓i (+)
      (e1 □ e2) ⇓ v

The notation e ⇓ v means that the closed expression e may converge to some value v. The Lam⇓ rule states that values converge to themselves. The App⇓ rule describes convergent, normal order β-reduction. That amb expressions may converge whenever either operand can is stated by the two Amb⇓i rules. The i refers to which of e1 and e2 is chosen.

This semantics agrees with the small step semantics with respect to convergent behaviour, modulo (·)♯ and (·)♭.

Theorem 2.3
(i) For all closed small step terms e, and for all small step values v, e →* v ⟹ e♭ ⇓ v♭;
(ii) For all closed natural terms e, and for all natural values v, e ⇓ v ⟹ e♯ →* v♯.

The proof is presented in full in [16], and is a simpler version of the proof of the analogous theorem presented in section 4.2.1.
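As a small worked example (ours, with Ω abbreviating the standard divergent term (λx. x x)(λx. x x)), the convergence rules let a choice converge even when one operand diverges:

```latex
\[
\dfrac{\dfrac{\dfrac{}{\lambda x.\,x \Downarrow \lambda x.\,x}\;\mathrm{Lam}\Downarrow
        \qquad \dfrac{}{\lambda y.\,y \Downarrow \lambda y.\,y}\;\mathrm{Lam}\Downarrow}
       {(\lambda x.\,x)\,(\lambda y.\,y) \Downarrow \lambda y.\,y}\;\mathrm{App}\Downarrow}
      {\big((\lambda x.\,x)\,(\lambda y.\,y)\big) \mathbin{\square} \Omega \Downarrow \lambda y.\,y}\;\mathrm{Amb}\Downarrow_1
\]
```

No derivation chooses the other branch, since Ω ⇓ v has no finite proof; divergence of the whole choice would additionally require an infinite proof that both operands diverge.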

    e1 → e1'
    ──────────────── Fun→
    e1 e2 → e1' e2

    (λx.e) e2 → e[e2/x]   Sub→

    v1 m+1□n e2 → v1   Amb→1

    e1 m□n+1 v2 → v2   Amb→2

    e1 → e1'
    ────────────────────────── Red→1
    e1 m+1□n e2 → e1' m□n e2

    e2 → e2'
    ────────────────────────── Red→2
    e1 m□n+1 e2 → e1 m□n e2'

    e1 0□0 e2 → e1 m□n e2,   m, n > 0   Sched→

Figure 1: The small step reduction semantics for plural amb

2.2.2 Divergence

The (co-inductively defined) natural semantics rules for divergent behaviour are as follows:

    e1 ⇑
    ───────── App⇑1 (−)
    (e1 e2) ⇑

    e1 ⇓ λx.e    e[e2/x] ⇑
    ──────────────────────── App⇑2 (−)
          (e1 e2) ⇑

    e1 ⇑    e2 ⇑
    ───────────── Amb⇑ (−)
    (e1 □ e2) ⇑

We write e ⇑ to mean that the closed expression e may diverge. An application may diverge if the function itself does, or if substitution leads to divergence. A choice expression may only diverge if both of its operands can diverge. The last rule is crucial, for it distinguishes the choice described in the natural semantics from erratic choice.

The divergence semantics agrees with the small step semantics with respect to infinite reduction sequences, modulo (·)♯ and (·)♭.

Theorem 2.4
(i) For all closed small step terms e, e →ω ⟹ e♭ ⇑;
(ii) For all closed natural terms e, e ⇑ ⟹ e♯ →ω.

The proof may be found in [16], and is a simpler version of the proof of the analogous theorem presented in section 4.2.2.

3 Singular Choice and Sharing

When non-determinism is combined with normal-order reduction, the parameter passing mechanism can influence the possible outcomes of evaluation. When call-by-name parameter passing is used, the result is what's known as plural choice, in which different occurrences of the same choice expression may make different choices. The classic example of this uses the double function:

    (λx. x + x) (5 □ 6).                                          (1)

With plural choice [5, 20], this may evaluate to 10, 11, or 12. It is called plural because a variable may be bound to a set of possible values. If call-by-need (i.e. call-by-name plus sharing) parameter passing is used, then the result is singular choice [5, 20], in which shared occurrences¹ of the same choice expression must make the same choice. With singular choice, (1) may evaluate to 10 or 12, but not to 11. It is called singular because a variable must be bound to a single value. Note that singular choice does not require different applications of functions to make choices consistently. For example, in

    (λf. f 1 + f 2) (λx. x + (5 □ 6))                             (2)

each application of f may make a different choice, leading to four possible values.

The operational semantics given in section 2 are valid for plural choice only. Since call-by-need is used in many implementations of functional languages, and it leads immediately to singular choice, we would like to have a framework for reasoning about singular choice. While it may be possible to model singular choice without also modelling sharing, our intuition is that this is not the case. We choose to use Launchbury's natural semantics for laziness [14] as the foundation for our semantics for singular choice. In this section, we describe this semantics fairly briefly, and present an equivalent small step semantics to facilitate the addition of amb to the language modelled.

3.1 Launchbury's Semantics for Laziness

In [14], laziness is captured in two stages.² First, terms of the object language are transformed into a normalised form in which arguments to functions are always variables and all bound variables are distinct. The normalised terms have the following syntax:

    M ::= x | λx.M | M x | let xi = Mi in M.

Since all applications are of an expression to a variable, no copying can occur during substitution. Scope is irrelevant since all names are distinct, allowing the recursive lets to model non-recursive bindings since no undesired name capture can occur.

We consider the convergence of heap-expression pairs to heap-value pairs. A heap in this context is a partial function from variable names to expressions. A value is as before. Judgements in the (convergence) natural semantics are of the form Γ : e ⇓ Δ : v, meaning that given the bindings in heap Γ, the expression e may converge to the value v (with resulting heap Δ).

¹ Call-by-value parameter passing also leads to singular choice, as noted in [11, 7].
² The reader is referred to [14] for the definitive description of this semantics. We shall give a short summary only. Also, our notation differs slightly (but not significantly).
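The contrast between the plural and singular readings of example (1) can be made concrete by enumerating the possible convergent outcomes. This models only the may-converge behaviour, and the function names are ours:

```python
from itertools import product

CHOICES = (5, 6)

def plural_double():
    """Plural choice: the copied choice expressions choose independently."""
    return sorted({x + y for x, y in product(CHOICES, CHOICES)})

def singular_double():
    """Singular choice: one shared choice; both occurrences see the same value."""
    return sorted({x + x for x in CHOICES})

print(plural_double())    # [10, 11, 12]
print(singular_double())  # [10, 12]
```

The outcome 11 appears only under the plural reading, exactly as the text describes.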

The natural semantics rules, given below, are mostly straightforward.

    ──────────────────── Lam⇓ (+)
    Γ : λx.e ⇓ Γ : λx.e

    Γ : e ⇓ Δ : λy.e'    Δ : e'[x/y] ⇓ Θ : v
    ────────────────────────────────────────── App⇓ (+)
              Γ : e x ⇓ Θ : v

    Γ : e ⇓ Δ : v
    ────────────────────────────────── Var⇓ (+)
    (Γ, x ↦ e) : x ⇓ (Δ, x ↦ v) : v̂

    (Γ, x1 ↦ e1, …, xn ↦ en) : e ⇓ Δ : v
    ─────────────────────────────────────── Let⇓ (+)
    Γ : let x1 = e1 … xn = en in e ⇓ Δ : v

The crucial rule is the Var⇓ rule, which concerns the convergence of a variable x bound to some e in a heap Γ. We evaluate e in Γ, minus the reference to x, resulting (if successful) in a new heap-value pair Δ : v. The final result heap is the new heap Δ plus a binding of x to v. The final value is v with all let-bound variables renamed (denoted by v̂). This last allows us to add bindings to the heap in the Let⇓ rule without worrying about capture. Note also that in the App⇓ rule, substitution has become merely a renaming of the bound variable, so there is no copying during substitution.

Another effect of the Var⇓ rule is black-hole detection, that is, detection of self-dependent expressions. This happens because when evaluating a variable, the expression to which it refers is evaluated in a heap with the binding for the variable removed. If the expression depends on the variable (i.e. it is self-dependent), a stage in the proof will be reached with no applicable rule. So a proof in this semantics can fail for two reasons: because of the existence of an infinite loop (resulting in an infinite "proof"), and because of the existence of a black hole (resulting in an incomplete "proof"). We say a heap-expression pair Γ : x is a black hole if x has no binding in Γ.

3.2 Small Steps for Laziness

The first step in adapting Launchbury's semantics for our purposes is to give it a small step semantics, since this is our starting point. We begin by defining an immediate reduction relation, →, between terms of our language, by the rules presented in figure 2. The small step semantics is defined as the reflexive, transitive closure of the relation, written →*. Judgements are of the form Γ : e → Δ : e', meaning that the term e in the context of the set of bindings Γ can reduce in one step to the term e' with the set of possibly modified and extended bindings Δ. We will write Γ : e →* Δ : v to mean there exists a finite reduction sequence from e (with an initial heap Γ) to the value v (with final heap Δ). As a result of the reflexive nature of →*, for all values v and heaps Γ, Γ : v →* Γ : v. We will also sometimes write →n, to mean n reduction steps.

Black holes manifest themselves in the small step semantics as finite reduction sequences whose last expression is something other than a value. Since black hole detection isn't our major concern here, we choose to identify them with non-termination (i.e. infinite reduction sequences). In figure 2, rule BH→ accomplishes this.
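Launchbury's convergence rules can be transcribed almost directly into a small evaluator over heap-expression pairs. This is a sketch under our simplifications: we omit the v̂ renaming of let-bound variables (relying on the normalised form's distinct names), and the term constructors are ours:

```python
# Terms: ("var", x), ("lam", x, body), ("app", f, x), ("let", bindings, body).
# In normalised form, application arguments are variable *names* (strings).
def eval_lazy(heap, term):
    """Launchbury-style call-by-need evaluation: returns (heap', value)."""
    tag = term[0]
    if tag == "lam":                        # Lam: values converge to themselves
        return heap, term
    if tag == "app":                        # App: evaluate function, rename bound var
        _, f, x = term
        heap, (_, y, body) = eval_lazy(heap, f)
        return eval_lazy(heap, subst(body, y, x))
    if tag == "var":                        # Var: evaluate without x's binding, update
        x = term[1]
        rest = dict(heap)
        e = rest.pop(x)                     # a black hole raises KeyError here
        heap2, v = eval_lazy(rest, e)
        return {**heap2, x: v}, v
    if tag == "let":                        # Let: extend the heap with the bindings
        _, bindings, body = term
        return eval_lazy({**heap, **bindings}, body)
    raise ValueError(tag)

def subst(term, y, x):
    """Rename free variable y to x (no capture handling: names are distinct)."""
    tag = term[0]
    if tag == "var":
        return ("var", x) if term[1] == y else term
    if tag == "lam":
        return term if term[1] == y else ("lam", term[1], subst(term[2], y, x))
    if tag == "app":
        return ("app", subst(term[1], y, x), x if term[2] == y else term[2])
    if tag == "let":
        _, bs, body = term
        return ("let", {k: subst(v, y, x) for k, v in bs.items()}, subst(body, y, x))
    raise ValueError(tag)
```

Note how the Var case realises both sharing (the heap is updated with the computed value) and black-hole detection (evaluating a self-dependent binding fails rather than looping).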

4 Singular McCarthy's amb

In this section, we give small step and natural semantics for singular McCarthy's amb. The language being considered is the same as that in section 3, with the additional syntactic form M □ N. Also, since we wish to investigate singular choice, we must ensure that each choice expression is named. To accomplish this, we add to the normalisation procedure in [14] a rule for choice expressions:

    (e1 □ e2) = let y = e1 □ e2 in y,   y fresh

We define (·)♯ and (·)♭ for the normalised language and heaps in the obvious way.

4.1 Small Step Semantics

We need to add the five rules concerning the reduction of □-expressions. The branches in the Red→i rules share the heap. For example, Red→1 is:

    Γ : e1 → Δ : e1'
    ────────────────────────────────── Red→1
    Γ : e1 m+1□n e2 → Δ : e1' m□n e2

When a branch is chosen, via one of the Amb→i rules, it inherits the heap:

    Γ : e1 m□n+1 v2 → Γ : v2   Amb→2

The Sched→ rule is unchanged apart from the addition of the heap.

We say that Δ : e consistently extends Γ : e if Γ is contained in Δ and no variable clashes are introduced by the extra bindings in Δ. This amounts to saying that Γ ⊆ Δ and that Δ : e is a valid heap-expression pair. An important property of the small step semantics is that reduction is preserved by consistent extension; that is, if Γ : e reduces to some Γ' : e', then for any consistent extension Δ : e, Δ : e reduces, by the same reduction rule, to some Δ' : e' (where Δ' is of course determined by Δ, e, and the reduction rule in question).

In the following subsections, we examine the convergent and divergent behaviour of the small step semantics for singular McCarthy's amb.

4.1.1 Convergent Behaviour

Complications arise when we examine the reduction of □-expressions in the presence of sharing. Consider e1 □ e2, and assume that e1 evaluates to some v which is chosen as the value of the □-expression. The unused branch may still make additions and updates to the heap during reduction of the amb expression (via the Red→2 rule in this case).
It can add new bindings that are independent of the other branch, or it may rewrite some binding (possibly to a non-value, since the Var→1 rule updates the heap on every reduction). So for a given reduction sequence Γ : e →* Δ : v, there could be an infinite number of other reduction sequences for Γ : e terminating with the same value v but with different final heaps. This makes the formulation of an analogue to the fairness theorem from section 2 for the singular case more unwieldy than it should be. (It would also complicate the bottom-avoidance theorem for singular choice and the soundness and completeness theorems.)

To simplify matters, we introduce the notion of an ideal reduction sequence. A reduction sequence is said to be ideal if only those branches of □-expressions which are finally chosen are reduced.

    Γ : e → Δ : e'
    ──────────────────── Fun→
    Γ : e x → Δ : e' x

    Γ : (λy.e) x → Γ : e[x/y]   Sub→

    Γ : e → Δ : e'
    ────────────────────────────────── Var→1
    (Γ, x ↦ e) : x → (Δ, x ↦ e') : x

    (Γ, x ↦ v) : x → (Γ, x ↦ v) : v̂   Var→2

    Γ : let x1 = e1 … xn = en in e → (Γ, x1 ↦ e1, …, xn ↦ en) : e   Let→

    Γ : x → Γ : x,   x ∉ dom Γ   BH→

Figure 2: The small step semantics for laziness

For any Γ : e →* Θ' : v, we can derive its ideal counterpart, written Γ : e →*I Θ : v. Here, Θ may be smaller than Θ'. This is because the original reduction sequence may have added superfluous bindings to the heap (during the reduction of an unused branch). The definition of the set of ideal reduction sequences, Ideal, is given in figure 3. Note that this does not constitute a new definition of reduction, but is merely the definition of the set of reduction sequences that are ideal.

To derive the ideal counterpart of a given convergent reduction sequence, we must remove all unnecessary reductions. Only reductions of unused branches in □-expressions can be unnecessary, so it may seem that simply discarding all such reductions is sufficient to derive the ideal reduction sequence. However, if a needed, shared variable is reduced in an unused branch, then reductions of that variable are also needed, and so may not be discarded. Luckily, it is always possible to discover when a reduction in an unused branch is needed, and to construct an analogous ideal reduction. A brief description of the idealisation algorithm is presented in appendix A. Using the algorithm, we can derive the ideal counterpart of any given convergent reduction sequence.

The convergent behaviour of □-expressions in terms of ideal reduction sequences can now be formalised. Since every reduction sequence has an ideal counterpart, a consequence of this theorem is that □ is fair with respect to general reduction sequences also.

Theorem 4.1 (Fairness) For all closed small step terms e1 and e2, for all small step values v1, v2, and v, and all heaps Γ, Θ1, and Θ2,

    Γ : e1 →*I Θ1 : v1 ∨ Γ : e2 →*I Θ2 : v2  ⟺  ∃i. Γ : e1 0□0 e2 →*I Θi : vi.

Proof. If case. Assume Γ : ei →*I Θi : vi. Then Γ : ei →m Θi : vi for some m. We can construct a reduction sequence for Γ : e1 0□0 e2 thus (letting i = 1, without loss of generality):

    Γ : e1 0□0 e2 → Γ : e1 m+1□n e2       Sched→ (some n > 0)
                  →m Θ1 : v1 1□n e2       m × Red→1, since Γ : e1 →m Θ1 : v1
                  → Θ1 : v1               by Amb→1

This is an ideal reduction sequence (since the branch finally chosen is the only one reduced), so we have the required result. The result follows similarly for i = 2, since all of the rules are symmetric.

Only if case. Assume Γ : e1 0□0 e2 →*I Θ1 : v1 (again, letting i = 1 without loss of generality). The last reduction in this sequence must be Amb→1; since this is an ideal reduction sequence, the whole sequence is of the form

    Γ : e1 0□0 e2 →k Θ1 : v1 m+1□n e2 → Θ1 : v1.

We can construct a reduction sequence for Γ : e1 by rewriting Γ' : e1' to Γ'' : e1'' whenever Γ' : e1' m+1□n e2 rewrites to Γ'' : e1'' m□n e2. Since the sequence for Γ : e1 0□0 e2 is finite, the constructed reduction sequence for Γ : e1 will also be finite. Furthermore, since each reduction in the sequence for the □-expression is an ideal reduction, all of the reductions in the constructed sequence will be ideal. Therefore Γ : e1 →*I Θ1 : v1, as required. The result follows similarly for i = 2. ∎

Note that in fact, for any m and n, Γ : e1 m□n e2 →*I Θ : v implies that one of the branches has an ideal reduction sequence terminating with Θ : v. The reverse is however not true; consider K 1□0 I, where K = λx.λy.x and I = λx.x. Even though I converges, the □-expression cannot choose I.

What if one branch of a □-expression reduces to a black hole? Without the BH→ rule, it could mean the □-expression would also reduce to a black hole. This could happen if the branch had some evaluation resources left, and its sibling had none. Since neither branch can reduce, the □-expression can't reduce: a black hole. However, since we have the BH→ rule, black holes are treated like infinitely reducing expressions, so this problem does not arise.

4.1.2 Divergence

The divergence of □-expressions is complicated by the fact that the branches may have shared variables. Consider the following example (where division by 0 is assumed to diverge):

    let x = 0 □ 1; a = (1/x) □ (1/(1−x)) in a                    (3)

Either branch of a can diverge: the left when x reduces to 0, the right when x reduces to 1. Furthermore, what makes one branch diverge leads the other to converge. However, if □ is singular amb, the expression cannot diverge, since whatever x is bound to leads to the convergence of one of the branches.

The important point is that during the reduction of a divergent expression, variables in the heap may be bound to values.
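To see why example (3) cannot diverge under singular amb, compare the final heaps the two divergent branches would each require (our spelling-out of the argument):

```latex
\[
\Theta_1 \supseteq \{x \mapsto 0\}
\quad\text{(left branch $1/x$ diverges)},
\qquad
\Theta_2 \supseteq \{x \mapsto 1\}
\quad\text{(right branch $1/(1-x)$ diverges)},
\]
\[
\text{so } \Theta_1 \cup \Theta_2 \text{ binds } x \text{ twice and is not a valid heap.}
\]
```

The side condition on the union of final heaps in the bottom-avoidance theorem rules out exactly such inconsistent pairs of divergences.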
Once bound to a value, a variable is never updated (since the Var!2 rule applies). We will nd it useful to know which variables are bound to values during a divergent reduction; in some sense, this is the \ nal" heap of that

? : e !m  : y:e 2 Ideal  : e[x=y] !n  : v 2 Ideal + ? : e x !m  : (y:e)x !  : e[x=y] !n  : v 2 Ideal

AppI

? : (y:e) x ! ? : e[x=y] 2 Ideal +

SubI

? : e !n  : v 2 Ideal n (?; x 7! e) : x ! (; x 7! v) : x ! (; x 7! v) : v^ 2 Ideal

+ VarI1

(?; x 7! v) : x ! (?; x 7! v) : v^ 2 Ideal +

VarI2

(?; x1 7! e1 ; : : : ; xn 7! en ) : e !n  : v 2 Ideal + ? : let x1 = e1  xn = en in e !n+1  : v 2 Ideal

LetI

? : e1 !m  : v 2 Ideal + AmbI1 ? : e1 080 e2 ! ? : e1 m+18n e2 !m  : v18n e2 !  : v 2 Ideal ? : e2 !n  : v 2 Ideal + AmbI2 ? : e1 080 e2 ! ? : e1 m8n+1 e2 !n  : e1 m81 v !  : v 2 Ideal Figure 3: The rules de ning the set Ideal reduction. In the above, a can't diverge because the nal heaps that result when the branches diverge independently are incompatible: they contain di erent bindings for x. To de ne the nal heap, we need the notion of the normal form part of heap, de ned by ? def = fx 7! v 2 ? j v a value g: Since variables bound to values are never updated, and heaps always grow, the normal form parts of the heaps in an in nite sequence ?0 : e0 ! ?1 : e1 ! ?2 : e2 !  ?i : ei !  form a chain in the subset ordering nf ?0  nf ?1  nf ?2   nf ?i   : This allows us to de ne the nal heap as the limit of that chain. We denote in nite reduction sequences by ? : e !! , where  is the! nal normal form heap. More formally, we de ne ? : e !  by nf

? : e !!  , 9f(?i ; ei )ji2Ng: (?0 ; e0 ) = (?; e) ^ 8i2NS:?i : ei ! ?i+1 : ei+1 ^  = i2N nf ?i Recall the complications that arose with nite reduction sequences and unused branches doing possibly unnecessary work. Similar complications arise when we examine in nite reduction sequences. Consider an in nite reduction sequence of the form ? : e1 x !n  : (y:e) x !  : e[x=y] !! : Since it depends on the convergent reduction sequence ? : e1 !  : (y:e), and there can be many convergent reduction sequences for ? : e1 with nal value y:e but di erent heaps, there can be many in nite reduction sequences for ? : e1 x, with di erent nal normal form heaps. Fortunately, the notion of ideal reduction sequences extends to in nite reduction sequences. An in nite reduction sequence is ideal if those convergent sequences upon which

it depends are ideal. We write ? : e !!I  to denote an ideal reduction sequence from ? : e with nal normal form heap . The de nition of the set of in nite ideal reduction sequences, !-Ideal, is given in gure 4. We can derive ideal counterparts for in nite reduction sequences in the following way. Taking arbitrarily large nite initial subsequences of a given in nite reduction sequence, we can \idealise" the convergent subsequences upon which the subsequences depends (i.e. the reduction of functions to lambda expressions) in the way described above. We can formalise the divergent behaviour of 8 expressions in terms of ideal in nite reduction sequences. Note that we require 1 [2 to be a nal heap in the theorem. This is to cope with cases like (3) above, and forces the branches to respect each others' decisions with respect to shared variables. Since we know that every in nite reduction sequence has an ideal, a consequence of this theorem is that 8 is bottom-avoiding with respect to general in nite reduction sequences also. Theorem 4.2 (Bottom-Avoidance) For all closed small step terms e1 and e2 , all heaps ?, and all nal heaps 1 ; and 2 , ? : e1 !!I 1 ^ ? : e2 !!I 2 ^ 1 [2 is a nal form heap

()

? : e1 m8n e2 !!I 1 [2 : Proof. If case. Assume ? : e1 !!I 1 , ? : e2 !!I 2 , and that 1 [ 2 is a nal heap. There are two sub-cases to deal with here, depending upon whether e1 and e2 have shared, needed variables. We consider the simpler case rst: e1 and e2 have no shared, needed variables. Since both e1 and e2 can be reduced in nitely, both of the Red!i rules may be applied in nitely. Therefore, we will be able to construct an in nite number of reduction subsequences of this form: ? : e1 080 e2 ! ? : e1 m0 80n e2 by the Sched! rule !m+n ?0 : e01 8 e02 m  Red!1 ; n  Red!2 and so we can construct an in nite reduction sequence for ? : e1 080 e2 . (Note that we are implicitly using the fact

? : e !!  2 !-Ideal ? : e x !!  2 !-Ideal

?

Fun!-I

? : e !n  : y:e 2 Ideal  : e[x=y] !!  2 !-Ideal ? : e x !n  : (y:e)x !  : e[x=y] !!  2 !-Ideal ? : e !!  2 !-Ideal (?; x 7! e) : x !!  2 !-Ideal

?

?

(?; x1 7! e1 ; : : : ; xn 7! en ) : e !!  2 !-Ideal ? : let x1 = e1   xn = en in e !!  2 !-Ideal ? : e1 !! 1 2 !-Ideal ? : e2 !! 2 2 !-Ideal ? : e1 m8n e2 !! (1 [2 ) 2 !-Ideal

Sub!-I Var!-I

? ?

Let!-I Amb!-I

Figure 4: The rules de ning the set !-Ideal that reduction is preserved by consistent extension in the above construction.) Since the in nite reduction sequences for ? : e1 and ? : e2 are both ideal, each reduction in the constructed sequence will be ideal (by the de nition of !-Ideal). Furthermore, since 1 and 2 are respectively the nal heaps of these in nite reduction sequences, and the only extra reductions in the constructed sequence (instances of the Sched! rule) don't change the heap, the nal heap of the constructed sequence will be 1 [2 . By assumption, this is a nal heap, so we have ? : e1 080 e2 !!I (1m[n2 ). An in nite reduction sequence for a general term e1 8 e2 may be constructed in an almost identical fashion (the only di erence being that the rst reduction is not an application of the Sched! rule). If e1 and e2 have shared, needed variables, then ? : e1 !!I 1 and ? : e2 !!I 2 will contain identical reductions3 that depend upon the reduction of the variables in question. We construct an in nite reduction sequence for ? : e1 m8n e2 in the manner described above, except that we only add reductions for a branch if a corresponding reduction has not been previously added. Consider the following situation: the current heap-expression pair is ? : e1 m8n e2 and we are considering adding a reduction for the right-hand branch that corresponds to ?2 : e2 !I ?02 : e02 . In addition, assume that the latter reduction depends upon the reduction of some shared variable x. If ?2 (x) = ?(x), then we may add the branch reduction. Otherwise, it has already been performed, so we must skip it. It is possible that one of the branches will have all of its reductions usurped by the other branch within the scope of the scheduling phase we are currently constructing. This will invalidate the construction, since we must assign nonzero evaluation resources to both branches in the scheduling phase. 
We can overcome this by looking further ahead in the reduction sequences for both branches, this time favouring the starved branch. This is always possible since both branches reduce in nitely. Only if case. Assume ? : e1 m8n e2 !!I . This in nite reduction sequence must have in nitely many subsequences of the form introduced in the if case above (for, otherwise, it is nite). We construct reduction sequences for both ? : e1 and ? : e2 as in the only if case in the proof of theorem 4.1. Since the reduction sequence for the 8 expression has in nitely many applications of the Red!i rules, both of the constructed sequences will be in nite, and ideal (by the def3

³ Modulo where the reductions appear in the sequences.

inition of →-Ideal, since every reduction in the former is). If e1 and e2 have shared, needed variables, then simply using applications of the Red→i rules may not be sufficient to construct valid sequences for e1 and e2; we need to add reductions to the sequence for the ei whose branch does not have the corresponding reductions. This can be done during construction by examining the current heap, as is done in the if case above. The required reduction(s) can be extracted from the segment of the other branch's reduction sequence constructed thus far. Furthermore, since the only reductions that change the heap in each of the constructed sequences also occur in the infinite reduction sequence for Γ : e1 ᵐ8ⁿ e2, σ1 ⊔ σ2 will be the same as σ, and therefore a final heap. Therefore, Γ : e1 →→_I σ1, Γ : e2 →→_I σ2, and σ1 ⊔ σ2 is a final heap, as required. □

A corollary of this theorem and theorem 4.1 is that 8, as implemented by the rules given above, is McCarthy's amb. Since the semantics models sharing, 8 is singular amb.

4.2 Natural Semantics

In the following subsections, we present the convergence and divergence natural semantics for singular amb.

4.2.1 Convergence

For the convergence rules

    Γ : e_i ⇓ Δ : v
    ────────────────────── Amb⇓i   (i ∈ {1, 2})
    Γ : (e1 8 e2) ⇓ Δ : v

we have only to add heaps Γ and Δ to the rules given in section 2. The natural semantics is sound and complete with respect to ideal (finite) reduction sequences. Similarly to the small step fairness and bottom-avoidance theorems, this extends to general finite reduction sequences.

Theorem 4.3 (i) For all small step heap-expression pairs Γ : e, and all small step heap-value pairs Δ : v,

    Γ : e →_I Δ : v ⟹ Γ♭ : e♭ ⇓ Δ♭ : v♭;

(ii) For all natural heap-expression pairs Γ : e, and all natural heap-value pairs Δ : v,

    Γ : e ⇓ Δ : v ⟹ Γ♯ : e♯ →_I Δ♯ : v♯.
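Before the proof, a collecting reading of these convergence rules can be sketched as a toy evaluator. The Python model below is ours (the tuple term encoding and the function `run` are inventions, covering only numerals, addition, variables, lets, and 8): it implements Launchbury-style call-by-need, with the two Amb⇓ rules read as "either branch may be taken", and shows how sharing makes a named choice happen at most once.

```python
def run(heap, expr):
    """All possible (final heap, value) results of a heap : expr pair."""
    kind = expr[0]
    if kind == "num":
        return [(dict(heap), expr[1])]
    if kind == "add":                    # left-to-right, threading the heap
        return [(h2, v1 + v2)
                for h1, v1 in run(heap, expr[1])
                for h2, v2 in run(h1, expr[2])]
    if kind == "var":                    # Launchbury-style: black-hole, then update
        x = expr[1]
        rest = {y: b for y, b in heap.items() if y != x}
        return [({**h, x: ("num", v)}, v) for h, v in run(rest, heap[x])]
    if kind == "let":                    # extend the heap, evaluate the body
        return run({**heap, **expr[1]}, expr[2])
    if kind == "amb":                    # Amb rules: either branch may be taken
        return run(heap, expr[1]) + run(heap, expr[2])
    raise ValueError(f"unknown expression: {expr!r}")

choice = ("amb", ("num", 1), ("num", 2))
shared = ("let", {"y": choice}, ("add", ("var", "y"), ("var", "y")))
copied = ("add", choice, choice)         # what a copying substitution would build

shared_values = sorted({v for _, v in run({}, shared)})   # [2, 4] -- never 3
copied_values = sorted({v for _, v in run({}, copied)})   # [2, 3, 4]
```

Because the choice bound to y is evaluated once and updated in the heap, the shared term can never yield 3, while the copied term can.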

Proof. (i) We proceed via induction on the length of valid ideal reduction sequences Γ : e →_I Δ : v. We show that we can always construct a valid proof of convergence for Γ♭ : e♭ from an ideal reduction sequence for Γ : e. We give only the application and amb cases in full. The other cases are similar.
Case: e1 x. The ideal reduction sequence from Γ : e1 x to Δ : v will consist of three parts (each of which is ideal): (1) Γ : e1 x →ⁿ_I Θ : (λy.e) x; (2) Θ : (λy.e) x →_I Θ : e[x/y]; and (3) Θ : e[x/y] →_I Δ : v. Since Γ : e1 x rewrites to Γ′ : e1′ x only when Γ : e1 rewrites to Γ′ : e1′ (via the Fun→ rule), from (1) we may conclude that Γ : e1 reduces to Θ : λy.e in n steps. Furthermore, this reduction sequence is ideal, since Γ : e1 x →ⁿ_I Θ : (λy.e) x is. From this and the inductive hypothesis, we obtain a valid proof that Γ♭ : e1♭ ⇓ Θ♭ : (λy.e)♭. Since (3) is ideal, from it and the inductive hypothesis, we have a valid proof that Θ♭ : (e[x/y])♭ ⇓ Δ♭ : v♭. Therefore, by the App⇓ rule, Γ♭ : e1♭ x converges to Δ♭ : v♭, as required.
Case: e1 ᵐ8ⁿ e2. By theorem 4.1, for some i ∈ {1, 2}, Γ : ei →_I Δi : vi. Consider i = 1. By the inductive hypothesis, we have a valid proof that Γ♭ : e1♭ ⇓ Δ1♭ : v1♭. Therefore, by the Amb⇓1 rule, Γ♭ : (e1♭ 8 e2♭) ⇓ Δ1♭ : v1♭, as required. The case for i = 2 follows similarly.
(ii) We proceed via rule induction, with cases on the structure of e, and show that we can always construct a valid ideal reduction sequence for Γ♯ : e♯ from a valid convergence proof for Γ : e. Again, we give only the cases for application and amb expressions.
Case: e1 x. Since Γ : (e1 x) ⇓ Δ : v, we know that there exists some Θ such that Γ : e1 ⇓ Θ : λy.e and Θ : e[x/y] ⇓ Δ : v, from App⇓. Applying the inductive hypothesis to the former yields an ideal reduction sequence Γ♯ : e1♯ →_I Θ♯ : (λy.e)♯, and to the latter yields an ideal reduction sequence Θ♯ : (e[x/y])♯ →_I Δ♯ : v♯. We can now construct a reduction sequence for Γ♯ : (e1 x)♯, as required, in the following way:
    Γ♯ : (e1 x)♯ →ⁿ_I Θ♯ : (λy.e)♯ x    (Fun→, from Γ♯ : e1♯ →ⁿ_I Θ♯ : (λy.e)♯)
    →_I Θ♯ : (e[x/y])♯                  (Sub→ rule)
    →_I Δ♯ : v♯                         (from Θ♯ : (e[x/y])♯ →_I Δ♯ : v♯)

This constructed sequence is ideal, since it satisfies the definition given in figure 3, so we have Γ♯ : e1♯ x →_I Δ♯ : v♯, as required.
Case: e1 8 e2. Since Γ : (e1 8 e2) ⇓ Δ : v, we know that either Γ : e1 ⇓ Δ : v or Γ : e2 ⇓ Δ : v. Consider the former. Applying the inductive hypothesis, we obtain Γ♯ : e1♯ →_I Δ♯ : v♯. Therefore, by theorem 4.1, Γ♯ : e1♯ ⁰8⁰ e2♯ →_I Δ♯ : v♯, as required. The other case follows similarly. □

4.2.2 Divergence

Before we can give the divergence rule for singular amb, we have to add rules for divergent behaviour to the lazy semantics. Judgements will be of the form Γ : e ⇑ σ, meaning that expression e, in context Γ, may diverge with final (normal form) heap σ. The final heap σ is defined

identically to the final heap in infinite reduction sequences Γ : e →→ σ, and may be infinite. That σ is a final heap is an implicit side-condition on all of the divergence rules. Without such a side-condition, the final heaps would not be constrained at all by the rules given in figure 5. The rules still form a valid co-inductive definition because the side-condition is syntactic in nature. Applications may diverge if evaluation of the function does, or if substitution leads to divergence. A variable may diverge if the expression it is bound to may, and a let expression may diverge when its body may diverge. A variable may also diverge if it has no binding in the heap. There is only one rule describing the divergence of a singular amb expression:

    Γ : e1 ⇑ σ1    Γ : e2 ⇑ σ2
    ──────────────────────────── Amb⇑
    Γ : (e1 8 e2) ⇑ (σ1 ⊔ σ2)

The effect of having σ1 ⊔ σ2 as the final heap in the conclusion is to constrain both branches to agree on the values that shared variables may be bound to during evaluation, since the union of the normal form heaps of the branches must itself be a normal form heap. If we didn't have this constraint, a branch could ignore its sibling's decision with respect to a shared choice, violating the terms of singular choice. This is essentially the same restriction that we made in the formulation of theorem 4.2. The natural semantics for divergence is sound and complete with respect to ideal infinite reduction sequences. Similarly to the convergence case, this extends to general infinite reduction sequences. We require the following lemma to prove (ii).

Lemma B.1 For all closed natural heap-expression pairs Γ : e and natural final heaps σ,

    Γ : e ⇑ σ ⟹ ∃Γ′, e′, n. Γ♯ : e♯ →ⁿ_I Γ′♯ : e′♯ ∧ Γ′ : e′ ⇑ σ   (n > 0).

The proof may be found in appendix B. The notation →ⁿ_I with n > 0 denotes an ideal reduction sequence of length at least one.

Theorem 4.4 (i) For all small step heap-expression pairs Γ : e, and all small step final heaps σ,

    Γ : e →→_I σ ⟹ Γ♭ : e♭ ⇑ σ♭;

(ii) For all natural heap-expression pairs Γ : e, and all natural final heaps σ,

    Γ : e ⇑ σ ⟹ Γ♯ : e♯ →→_I σ♯.

Proof. To prove this, we first give an equivalent co-inductive definition of →→_I; i.e., we define →→_I as the largest relation satisfying:

    Γ : e →_I Γ′ : e′    Γ′ : e′ →→_I σ
    ──────────────────────────────────── Inf→
    Γ : e →→_I σ

Then, since ⇑ is also defined co-inductively, it is a matter of showing that each relation satisfies the rules of the other, modulo ()♯ and ()♭.
(i) We have to show that →→_I (modulo ()♭) "satisfies" the rules defining ⇑. Since ⇑ is defined co-inductively, this means showing, after substituting →→_I for ⇑ in the rules, that the conclusion implies the premises for each rule. For example, in the Var⇑ case, we have to show

    (Γ, x ↦ e) : x →→_I σ ⟹ Γ : e →→_I σ

    Γ : e ⇑ σ
    ─────────── App⇑1
    Γ : e x ⇑ σ

    Γ : e ⇓ Δ : λy.e′    Δ : e′[x/y] ⇑ σ
    ───────────────────────────────────── App⇑2
    Γ : e x ⇑ σ

    Γ : e ⇑ σ
    ─────────────────── Var⇑
    (Γ, x ↦ e) : x ⇑ σ

    (Γ, x1 ↦ e1, …, xn ↦ en) : e ⇑ σ
    ───────────────────────────────────── Let⇑
    Γ : let x1 = e1 … xn = en in e ⇑ σ

    x ∉ Γ
    ────────────── BH⇑
    Γ : x ⇑ Γ_nf

Figure 5: Divergence natural semantics rules for laziness

Since ⇑ is by definition the largest relation satisfying the rules, the desired result will follow. We present only the application and amb cases. The other cases are similar.
Case: App⇑i. Assume Γ : e1 x →→_I σ. We have to show that either Γ : e1 →→_I σ, or Γ♭ : e1♭ ⇓ Δ♭ : (λy.e)♭ and Δ : e[x/y] →→_I σ. Since Γ : e1 x →→_I σ, one of the following must be the case.
(i) Γ : e1 x →_I Γ′ : e1′ x. We know Γ : e1 →_I Γ′ : e1′ (from the Fun→ rule) and Γ′ : e1′ x →→_I σ (by the above co-inductive definition of →→_I). Either Γ : e1 →→_I σ′ (for some σ′), or Γ : e1 reduces to some Δ : λy.e, or both. If it diverges, we have the required result (since σ will be equal to σ′, because every reduction in the infinite reduction sequence for Γ : e1 x that changes the heap has a counterpart in the infinite reduction sequence for Γ : e1). So assume Γ : e1 does not diverge and reduces to some Δ : λy.e. Therefore Γ : e1 x →ⁿ_I Δ : (λy.e) x, which then reduces to Δ : e[x/y] (where n is the length of the reduction sequence Γ : e1 →_I Δ : λy.e). Either Δ : e[x/y] diverges or it doesn't. If the latter, then as Γ : e1 doesn't diverge, Γ : e1 x doesn't either, contradicting the main assumption. Therefore Δ : e[x/y] →→_I σ′ (for some σ′ equal to σ, by similar reasoning to the above) and the result follows (since Γ : e1 ⇓ Δ : λy.e by theorem 4.3).
(ii) Γ : e1 x →_I Γ : e[x/y], where e1 = λy.e. We have that Γ♭ : e1♭ ⇓ Γ♭ : (λy.e)♭ (by Lam⇓) and that Γ : e[x/y] has an infinite ideal reduction sequence with final heap σ (by the co-inductive definition of →→_I), so the result follows immediately.
Case: Amb⇑. We have to show that Γ : e1 ᵐ8ⁿ e2 →→_I σ implies that Γ : e1 →→_I σ1, Γ : e2 →→_I σ2, and σ = σ1 ⊔ σ2, which is immediate from theorem 4.2.
(ii) Given a proof of divergence for Γ : e, we construct an infinite reduction sequence for Γ♯ : e♯. Since Γ : e ⇑ σ, by lemma B.1, we know there exists a heap-expression pair Γ′ : e′ and an n such that Γ♯ : e♯ →ⁿ_I Γ′♯ : e′♯ with n > 0, and Γ′ : e′ ⇑ σ.
We take this reduction segment as the initial segment of the (ideal) infinite reduction sequence for Γ♯ : e♯, and then apply the construction to Γ′ : e′, yielding an infinite reduction sequence for e♯. □

5 Examples

The following examples assume the existence of natural semantics rules for natural numbers and addition. Space forbids the presentation of these rules, but the convergence rules are the same as those presented in [14], and the divergence rules are the obvious ones (i.e., an addition diverges if either operand does).

5.1 β-conversion

Since the semantics models sharing, the problem we had with β-conversion disappears, since the mechanism of substitution we use does not copy. To see this, consider the example again. The relevant expressions in the normalised language are

    let y = 1 8 2 in (λx. x + x) y
    let y = 1 8 2 in y + y

Since the amb expression is named, it will be evaluated only once (when y is updated the first time, the amb expression will disappear). As a consequence, both expressions may evaluate to 2 or 4, but neither can evaluate to 3 (since that would imply that the amb expression was reduced twice).

5.2 Recursion

Consider the following definitions:

    let nats = 0 8 (nats + 1) in nats

    let nats′ = λx. let y = 0 8 ((nats′ x) + 1) in y
        z = 0
    in (nats′ z)

Neither may diverge. While the second may converge to any natural number, the first may only converge to 0. The first may either choose 0, or try to evaluate nats and add 1 to it. Trying the latter results in a black hole (since the binding for nats won't be present). The scheduling rule prevents the black hole from leading to divergence of the term as a whole, so the first expression may only evaluate to 0.

5.3 Divergence

The classical divergent term is Ω = (λx. x x)(λx. x x). After normalisation, this becomes

    let y = λx. x x in (λx. x x) y

Figure 6 is part of an infinite "proof" that this expression diverges (where Γ = {y ↦ λx. x x}). Note that the final

[Figure 6 contains a partial, infinite derivation tree: by Let⇑, {} : let y = λx. x x in (λx. x x) y ⇑ Γ follows from Γ : (λx. x x) y ⇑ Γ; by App⇑2, with premise Γ : y ⇓ Γ : λx. x x (itself derived by Var⇓ and Lam⇓), this follows from Γ : y y ⇑ Γ; and App⇑2 then repeats on Γ : y y ⇑ Γ without end.]
Figure 6: A partial infinite "proof" that Ω diverges

normal form heap is in this case finite, since there is only ever one normal form in the heap.

6 Related Work

Denotational models for non-determinism are based upon power domains, but all power domain constructions (with the exception of Broy's semantics for amb [3]) lead to the equivalence of expressions that in practice we consider distinct. For example, the denotation of (1 : ⊥) 8 (1 : 1 : 1 : ⊥) (where : is list cons) in the Plotkin powerdomain [17] is

    {| 1 : ⊥, 1 : 1 : ⊥, 1 : 1 : 1 : ⊥ |}

As a result, it is identified with (1 : ⊥) 8 (1 : 1 : ⊥) 8 (1 : 1 : 1 : ⊥). These two expressions have different behaviours; after the first has computed two elements, we know the third element will be well-defined. The other powerdomain constructions ([18, 19, 9, 8]) are less discriminating. This lack of discrimination can cause difficulties with equational reasoning. Consider again the two expressions above. The program context

    tail ((tail [·]) 8 (1 : 1 : ⊥))

always converges for the first expression, but may diverge for the second. A semantics that identifies these two expressions can't be used to prove termination in this context, and may lead to the transformation of a terminating program into a non-terminating one. The natural semantics presented here is able to distinguish between these two expressions, since they have different divergent behaviour when placed in the above program context. Broy's semantics for amb appears to make the distinctions we expect, but it is quite complex (involving three fixed point iterations over two different orderings in the non-flat case), and it is not obvious that it conforms to our operational intuitions about amb. Furthermore, we consider the operational semantics given by Broy to be too restrictive. While it avoids divergence, it cannot be considered fair, in the following sense: out of two convergent branches, it always chooses the one requiring fewest reduction steps.
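The identification made by the Plotkin powerdomain can be seen in a toy model. Below, partial lists on the chain 1 : ⊥ ⊑ 1 : 1 : ⊥ ⊑ 1 : 1 : 1 : ⊥ are encoded by their number of defined elements (an encoding of ours, chosen for brevity); on such a chain the convex (Egli–Milner) closure simply fills in every intermediate element, collapsing the two denotations, even though an observer who asks "can the term stall after producing exactly two elements?" tells the two terms apart.

```python
def convex_closure(lengths):
    """Convex (Egli-Milner) closure of a set of partial lists, each encoded
    by its number of defined elements; on a chain, the closure just fills
    in every intermediate length."""
    return set(range(min(lengths), max(lengths) + 1))

e1 = {1, 3}        # (1 : bot) 8 (1 : 1 : 1 : bot)
e2 = {1, 2, 3}     # (1 : bot) 8 (1 : 1 : bot) 8 (1 : 1 : 1 : bot)

# The powerdomain identifies the two terms...
assert convex_closure(e1) == convex_closure(e2) == {1, 2, 3}

# ...yet their behaviours differ: only e2, having produced two
# elements, may fail to produce a third.
def may_stall_after_two(branches):
    return any(n == 2 for n in branches)

assert not may_stall_after_two(e1)
assert may_stall_after_two(e2)
```

A divergence-aware semantics keeps the raw sets {1, 3} and {1, 2, 3} apart; a convex-closing one cannot.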
Boudol examines full abstraction for two versions of the untyped lambda calculus (differing in the kinds of abstractions allowed) plus a parallel combinator [2]. The combinator isn't actually a choice operator, and the setting is a deterministic one, but parallel combinations may be viewed as collecting all possible behaviours of some choice operator. However, since only may-converge properties are considered, that choice operator cannot be distinguished from erratic choice. In [11, 12], Hennessy and Ashcroft examine operational and denotational semantics for plural and singular erratic choice (which they call run-time and call-time, respectively). To achieve singular choice, substitution is restricted to deterministic terms, and as such is closely related to call-by-value parameter passing. We consider this approach too

restrictive, since it leads to expression (2) in section 3 having only two possible values, rather than four as it should. A similar approach is used in [7]. In addition, both deal only with erratic choice. The semantics presented here associates singular choice with call-by-need, the parameter passing strategy used in lazy functional languages, and deals with McCarthy's amb, a more useful non-deterministic operator than erratic choice. Clinger [5] discusses the denotational semantics of plural and singular amb. He also discusses the differences between global amb (which may choose any convergent branch, provided it does not lead to later divergence) and local amb (which does not consider its context when choosing convergent branches). This distinction is also noted in [20]. This paper deals with local amb, and we give a semantics for what Clinger calls singular, non-strict, lazy amb. Since [5] and [20] use powerdomains to describe amb (whether global or local), undesirable identification of expressions results. Recently, Ariola et al. presented a call-by-need λ-calculus [1] which models sharing without relying upon heaps. We feel that the notion of a final heap is important to the correct description of the divergent behaviour of our language, and it is not clear how this notion could be expressed in the framework of Ariola et al.'s calculus. This aside, it would be interesting to investigate the calculus' use as a basis for the semantics of singular amb, since the heaps have proved troublesome when it comes to proving properties of the semantics.

7 Conclusions

We have presented a small step equivalent of Launchbury's natural semantics for laziness, and used that as the basis for a small step semantics for singular McCarthy's amb. This semantics can be viewed as an implementation, and corresponds closely to our operational intuitions about the behaviour of McCarthy's amb in the context of a call-by-need parameter passing strategy.
To facilitate reasoning, a more abstract natural semantics for singular McCarthy's amb, using Launchbury's natural semantics for laziness as a basis, was also given. It is equivalent in a very strong sense to the small step semantics. The natural semantics is more discriminating than denotational models. Since the natural semantics models sharing, β-conversion is valid. These two factors combined make it suitable as a basis for reasoning about non-deterministic, lazy, functional languages.

Acknowledgements

We would like to thank John Launchbury for discussions about his natural semantics and its small step counterpart.

References

[1] Z. Ariola, M. Felleisen, J. Maraist, M. Odersky, and P. Wadler. A call-by-need lambda calculus. In Conference Record of the 22nd ACM Symposium on Principles of Programming Languages, pages 233–246, 1995.
[2] G. Boudol. Lambda-calculi for (strict) parallel functions. Information and Computation, 1992.
[3] M. Broy. Fixed point theory for communication and concurrency. In D. Bjørner, editor, Formal Description of Programming Concepts II, pages 125–147. IFIP, North Holland, 1983.
[4] M. Carlsson and T. Hallgren. Fudgets: a graphical user interface in a lazy functional language. In FPCA '93: Conference on Functional Programming Languages and Computer Architecture, pages 321–330. ACM Press, June 1993.
[5] W. Clinger. Nondeterministic call by need is neither lazy nor by name. In Lisp and Functional Programming, pages 226–234, August 1982.
[6] P. Cousot and R. Cousot. Inductive definitions, semantics and abstract interpretation. In Principles of Programming Languages, January 1991.
[7] U. de'Liguoro and A. Piperno. Must preorder in non-deterministic untyped λ-calculus. In CAAP '92, volume 581 of LNCS, pages 203–220, February 1992.
[8] C. A. Gunter. Relating total and partial correctness interpretations of non-deterministic programs. In Principles of Programming Languages, January 1990.
[9] R. Heckmann. Set domains. In ESOP '90, volume 432 of LNCS, pages 177–196, May 1990.
[10] P. Henderson. Purely functional operating systems. In J. Darlington, P. Henderson, and D. A. Turner, editors, Functional Programming and its Applications, pages 177–192. Cambridge University Press, 1982.
[11] M. C. B. Hennessy. The semantics of call-by-value and call-by-name in a nondeterministic environment. SIAM Journal on Computing, 9(1):67–84, February 1980.
[12] M. C. B. Hennessy and E. A. Ashcroft. A mathematical semantics for a nondeterministic typed λ-calculus. Theoretical Computer Science, 11:227–245, 1980.
[13] S. B. Jones.
A range of operating systems written in a purely functional style. PRG Technical Monograph PRG-42, Oxford University Computing Laboratory, 1984.
[14] J. Launchbury. A natural semantics for laziness. In Principles of Programming Languages, January 1993.
[15] J. McCarthy. A basis for a mathematical theory of computations. In P. Braffort and D. Hirschberg, editors, Computer Programming and Formal Systems, pages 33–70. North-Holland, 1963.
[16] A. Moran. Natural Semantics for Non-Determinism. Licentiate thesis, Chalmers University of Technology and University of Göteborg, Sweden, May 1994.
[17] G. D. Plotkin. A powerdomain construction. SIAM Journal on Computing, 5(3):452–487, 1976.

[18] M. B. Smyth. Power domains. Journal of Computer and System Sciences, 16(1):23–36, 1978.
[19] M. B. Smyth. Power domains and predicate transformers: a topological view. In ICALP '83, volume 154 of LNCS, pages 662–676, 1983.
[20] H. Søndergaard and P. Sestoft. Non-determinism in functional languages. The Computer Journal, 35(5):514–523, October 1992.
[21] W. Stoye. A new scheme for writing functional operating systems. Technical Report 56, University of Cambridge Computing Laboratory, 1984.
[22] D. A. Turner. An approach to functional operating systems. In D. A. Turner, editor, Research Topics in Functional Programming. Addison Wesley, 1990.

A The Idealisation Algorithm

Space limits us to all but a cursory presentation of the idealisation algorithm; only the barest essentials are described. The algorithm comprises three parts: rescheduling, tagging, and bubbling.

Rescheduling. Convergent reduction of a choice expression may have many applications of the Sched→ rule. To simplify later manipulations of the reduction sequence, we replace all instances of the Sched→ rule with one initial instance, which allocates to each branch the total it was allocated in the original sequence.

Tagging. Each reduction is tagged as either needed or unneeded. Reductions that reduce an unused branch of a choice expression are tagged as unneeded; all other reductions are tagged as needed. So the only reductions which are not tagged needed are those which may be superfluous.

Bubbling. The tagged reduction sequence is scanned, "bubbling" unneeded reductions toward the end of the sequence. The object is to partition the sequence such that all of the needed reductions are performed prior to any unneeded reductions. Given a tagged reduction sequence Γ : e → Δ : v, the bubbling process produces a partitioned, tagged reduction sequence Γ : e →_N Δ′ : v →_U Δ : v, where →_N is a sequence of needed reductions and →_U is a sequence of unneeded reductions.
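The tagging-and-bubbling pass can be sketched as follows. The step representation below, a (tag, writes, reads) triple, is our simplification of the dependency information the algorithm tracks: an unneeded step immediately preceding a needed step that depends on one of its heap updates is promoted (it was needed after all); otherwise the independent pair is swapped, moving needed work toward the front.

```python
def bubble(steps):
    """Partition a tagged reduction sequence: needed steps first.

    A step is (tag, writes, reads): `writes` is the set of heap bindings
    the step changes, `reads` the set it depends upon.  An unneeded step
    directly before a needed step that reads one of its writes is
    promoted to needed; otherwise the independent pair is swapped.
    """
    steps = list(steps)
    changed = True
    while changed:
        changed = False
        for i in range(len(steps) - 1):
            (t1, w1, r1), (t2, _, r2) = steps[i], steps[i + 1]
            if t1 == "unneeded" and t2 == "needed":
                if w1 & r2:                      # dependency: promote
                    steps[i] = ("needed", w1, r1)
                else:                            # independent: swap
                    steps[i], steps[i + 1] = steps[i + 1], steps[i]
                changed = True
    return steps

seq = [("needed",   {"x"}, set()),
       ("unneeded", {"y"}, set()),   # reduces an apparently unused branch
       ("needed",   set(), {"x"}),   # independent of y: bubbles past it
       ("needed",   set(), {"y"})]   # reads y: forces promotion

tags = [t for t, _, _ in bubble(seq)]   # all "needed" in this example
```

A genuinely superfluous step, one whose writes nobody reads, simply bubbles to the end instead of being promoted.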
The unneeded reductions may be thought of as acting upon Δ′ to bring it in line with Δ. The ideal counterpart to the (untagged) reduction sequence Γ : e → Δ : v is the initial needed subsequence of this, Γ : e →_I Δ′ : v. When an unneeded reduction directly precedes a needed one, there are two possible courses of action. If the needed reduction depends upon the reduction of a variable, and the unneeded reduction affects the binding of that variable, then the unneeded reduction must be promoted, since the needed reduction occurs in a context that "assumes" that the unneeded reduction has been performed. In other words, the reduction tagged as unneeded is actually needed. If so, the unneeded reduction is duly promoted to needed status. This is a matter of changing its tag, and possibly updating scheduling information. The latter may be necessary since we could be promoting a reduction that was originally

applied by an unused branch, but is now being applied by a used branch. Otherwise, we have to swap the needed and unneeded reductions, to move the needed reduction toward the front of the sequence. This entails changing the heaps in the reductions to reflect the swap. This is harmless, since the reductions are independent in the above sense; neither affects any part of the heap that concerns the other.

B Proof of the Auxiliary Lemma

Lemma B.1 For all closed natural heap-expression pairs Γ : e and natural final heaps σ,

    Γ : e ⇑ σ ⟹ ∃Γ′, e′, n. Γ♯ : e♯ →ⁿ_I Γ′♯ : e′♯ ∧ Γ′ : e′ ⇑ σ   (n > 0).

Proof. We prove the lemma by showing the following proposition:

    ∀k. Γ : e ⇑ σ ⟹ ∃Γ′, e′, n. Γ♯ : e♯ →ⁿ_I Γ′♯ : e′♯ ∧ Γ′ : e′ ⇑ σ   (n > k).

We proceed via structural induction, stratified by the size of the heap. Given a divergence proof for Γ : e, for each k we construct a heap-expression pair Γ′ : e′ that also diverges, and exhibit, for some n > k, n ideal reductions from Γ♯ : e♯ to Γ′♯ : e′♯. Assume Γ : e ⇑ σ. There are four cases, depending upon the structure of e, but we present only the application, variable, and amb cases. The letrec case is similar.
Case: e1 x. There are two sub-cases.
(i) Γ : e1 ⇑ σ. Applying the inductive hypothesis yields, for all k, a Γ′ : e1′ and an n > k such that Γ♯ : e1♯ →ⁿ_I Γ′♯ : e1′♯ and Γ′ : e1′ ⇑ σ. By the Fun→ rule, from this we can construct Γ♯ : (e1 x)♯ →ⁿ_I Γ′♯ : (e1′ x)♯. Furthermore, by App⇑1, Γ′ : e1′ x ⇑ σ, as required.
(ii) Γ : e1 ⇓ Δ : λy.e, and Δ : e[x/y] ⇑ σ. We can reduce Γ♯ : (e1 x)♯ to Δ♯ : (e[x/y])♯ thus (assuming Γ♯ : e1♯ →ᵐ_I Δ♯ : (λy.e)♯ with m > k):

    Γ♯ : (e1 x)♯ →ᵐ_I Δ♯ : (λy.e)♯ x    (m × Fun→)
    →_I Δ♯ : (e[x/y])♯                  (Sub→)

The last term is in initial form (i.e., all 8-expressions have nil evaluation resources), since we are substituting x for y only, and all lambda expressions (values) are assumed to be in initial form. Letting e′ = e[x/y], we have the desired result, since Γ♯ : (e1 x)♯ (ideally) reduces to Δ♯ : (e[x/y])♯ in m + 1 > k steps, and Δ : e[x/y] ⇑ σ.
Case: x. There are two sub-cases, depending upon whether a binding for x exists in Γ.
(i) x ∈ Γ. Let Γ = (Θ, x ↦ e). By Var⇑, we know that Θ : e ⇑ σ. Applying the inductive hypothesis (valid even though e is almost certainly structurally larger than x, since Θ is smaller than Γ) yields, for any k, a Θ′ : e′ such that Θ♯ : e♯ reduces (for some n) in n > k steps to Θ′♯ : e′♯, and Θ′ : e′ ⇑ σ. Then, by n applications of Var→, Γ♯ : x →ⁿ (Θ′♯, x ↦ e′♯) : x, and by Var⇑, (Θ′, x ↦ e′) : x ⇑ σ, as required.

(ii) x ∉ Γ. Since BH→ applies here, we can let Γ′ = Γ and e′ = x. Then, for all k, there exists an n > k such that Γ♯ : x reduces via n applications of BH→ to Γ′♯ : e′♯, and Γ′ : e′ ⇑ σ, as required.
Case: e1 8 e2. Here we know that Γ : e1 ⇑ σ1, Γ : e2 ⇑ σ2, and that σ1 ⊔ σ2 is a valid final heap. By the inductive hypothesis, we have, for all k, Γi : ei′ and ni > k such that Γ♯ : ei♯ →ⁿⁱ_I Γi♯ : ei′♯ and Γi : ei′ ⇑ σi. Let us initially assume that the sets of variables that are reduced in the course of reducing Γ♯ : e1♯ and Γ♯ : e2♯ are disjoint. Then we reduce Γ♯ : e1♯ ⁰8⁰ e2♯ to Γ′♯ : e1′♯ ⁰8⁰ e2′♯ (for some Γ′) in the following manner:

    Γ♯ : e1♯ ⁰8⁰ e2♯ → Γ♯ : e1♯ ⁿ¹8ⁿ² e2♯    (Sched→)
    →ⁿ¹_I Γ″♯ : e1′♯ ⁰8ⁿ² e2♯               (n1 × Red→1)
    →ⁿ²_I Γ′♯ : e1′♯ ⁰8⁰ e2′♯               (n2 × Red→2)

Here, Γ″ = Γ1 and Γ′ = Γ1 ⊔ Γ2. The Red→2 reductions are valid, even though we need Γ1♯ : e2♯ →ⁿ²_I Γ′♯ : e2′♯ (n2 > 0) to conclude them, because of the disjoint nature of the reductions of Γ♯ : e1♯ and Γ♯ : e2♯. Furthermore, since Γ1 : e1′ ⇑ σ1 and Γ2 : e2′ ⇑ σ2, we know Γ′ : e1′ ⇑ σ1 and Γ′ : e2′ ⇑ σ2 (since adding conservatively to the heap cannot change the divergent properties of a term). So by Amb⇑, Γ′ : e1′ 8 e2′ ⇑ σ, as required. A complication similar to that encountered in the proof of theorem 4.2 arises here if e1 and e2 have shared, needed variables that are reduced in the course of reducing Γ♯ : e1♯ and Γ♯ : e2♯. A similar augmentation to the construction works here also. That we can look further ahead in the reduction sequences when needed (in case one of the branches is starved when all of its reductions in the scheduling phase being constructed are usurped) is guaranteed by the inductive hypotheses (since we have Γ♯ : ei♯ →ⁿⁱ_I Γi♯ : ei′♯ for all k, with ni > k).

□