Abstract: Intersection types and bounded quantification are complementary extensions of a first-order, statically typed programming language with subtyping. We define a typed $\lambda$-calculus combining these extensions, illustrate its properties, and develop proof-theoretic results leading to algorithms for subtyping and typechecking.

Among the intriguing properties of intersection types [

Bounded quantification [

By viewing intersection types and bounded polymorphism as extensions of
a common base, a simply typed $\lambda$-calculus with subtyping, we can merge them,
yielding a compact, natural synthesis of their features in a new calculus. This
calculus, called $F_\wedge$ ("F-meet"), provides a formal basis for new programming
languages combining the benefits of existing languages based on intersection
types [

The following section establishes some notational conventions and reviews
the definitions of the pure systems of intersections and bounded quantification.
Section 3 presents $F_\wedge$ in full. Section 4 gives examples illustrating its expressive
power, and Sections 5 and 6 present algorithms for checking the subtype relation
and synthesizing minimal types for terms and sketch their proofs of soundness
and completeness. Section 7 discusses semantic issues, and Section 8 offers
directions for future research. A more detailed presentation of these results can
be found in [

2 Background

The metavariables $\alpha$ and $\beta$ range over type variables; $\sigma$, $\tau$, $\theta$, $\phi$, and $\psi$ range over types; $e$ and $f$ range over terms; and $x$ and $y$ range over term variables. A finite sequence with elements $x_1$ through $x_n$ is written $[x_1..x_n]$. Concatenation of finite sequences is written $X_1 * X_2$. Single elements are adjoined to the right or left of sequences with a comma: $[x_a, X_b]$ or $[X_a, x_b]$. Sequences are sometimes written using "comprehension" notation: $[x \mid \ldots]$; for example, if $T \equiv [\sigma{\to}\tau,\ \phi{\to}\psi,\ \sigma{\to}\theta]$, then the comprehension $[\zeta_2 \mid \zeta_1{\to}\zeta_2 \in T \text{ and } \zeta_1 \equiv \sigma]$ stands for the sequence $[\tau, \theta]$. A context $\Gamma$ is a finite sequence of typing assumptions $x{:}\tau$ and subtyping assumptions $\alpha{\le}\tau$, with no variable or type variable appearing twice on the left. It is convenient to view a context $\Gamma$ as a finite function and write $\mathit{dom}(\Gamma)$ for its domain.

A type $\tau$ is closed with respect to a context $\Gamma$ if its free (type) variables are all in $\mathit{dom}(\Gamma)$. A term $e$ is closed with respect to $\Gamma$ if its free (term and type) variables are all in $\mathit{dom}(\Gamma)$. A context $\Gamma$ is closed if $\Gamma \equiv \{\}$, or if $\Gamma \equiv \Gamma_1,\ \alpha{\le}\tau$ with $\Gamma_1$ closed and $\tau$ closed with respect to $\Gamma_1$, or if $\Gamma \equiv \Gamma_1,\ x{:}\tau$ with $\Gamma_1$ closed and $\tau$ closed with respect to $\Gamma_1$. A subtyping statement $\Gamma \vdash \sigma \le \tau$ is closed if $\Gamma$ is closed and $\sigma$ and $\tau$ are closed with respect to $\Gamma$; a typing statement $\Gamma \vdash e \in \tau$ is closed if $\Gamma$ is closed and $e$ and $\tau$ are closed with respect to $\Gamma$. In the following, we assume that all statements under discussion are closed; in particular, we allow only closed statements in instances of inference rules.
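The closedness condition on contexts can be checked in a single left-to-right pass. The following is a minimal sketch in Python and is not part of the original presentation: the tuple encoding of types, the entry tags "sub" and "typ", and the function names are all assumptions made for illustration, and bound variables are handled naively by name.

```python
# Hypothetical encoding (an assumption of this sketch):
#   ("var", a)   ("arrow", t1, t2)   ("all", a, bound, body)   ("inter", (t1, ..., tn))
# A context is a list of entries ("sub", a, t) for a <= t and ("typ", x, t) for x : t.

def free_tvars(t):
    """Return the set of free type variables of a type."""
    kind = t[0]
    if kind == "var":
        return {t[1]}
    if kind == "arrow":
        return free_tvars(t[1]) | free_tvars(t[2])
    if kind == "all":
        # The bound lies outside the scope of the quantified variable.
        return free_tvars(t[2]) | (free_tvars(t[3]) - {t[1]})
    # Intersection: union over all elements (empty for the nullary case).
    return set().union(*(free_tvars(ti) for ti in t[1]))


def closed_context(ctx):
    """True if each entry's type mentions only type variables declared to its
    left and no name is declared twice (term and type variable names are
    assumed to share one namespace here)."""
    declared_names, declared_tvars = set(), set()
    for kind, name, t in ctx:
        if name in declared_names or not free_tvars(t) <= declared_tvars:
            return False
        declared_names.add(name)
        if kind == "sub":
            declared_tvars.add(name)
    return True


TOP = ("inter", ())
good = [("sub", "Real", TOP), ("sub", "Int", ("var", "Real")), ("typ", "x", ("var", "Int"))]
bad = [("sub", "Int", ("var", "Real")), ("sub", "Real", TOP)]  # Real used before declaration
```

Under these assumptions, `closed_context(good)` holds and `closed_context(bad)` does not, because the bound of Int refers to a variable that has not yet been introduced.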

Types, terms, contexts, and statements that differ only in the names of bound
variables are considered identical. (That is, we think of variables not as names
but as pointers into the surrounding context, as suggested by de Bruijn [

Examples are set in a typewriter font; $\lambda$-calculus notation is transliterated as follows: $\top$ is written as T, $\lambda$ as \, $\Lambda$ as \\, $\forall$ as All, and $\le$ as <. Lines of input to the typechecker are prefixed with > and followed by the system's response. The type constructors $\to$ and $\forall$ bind more tightly than $\wedge$. Also, $\to$ associates to the right and $\forall$ obeys the usual "dot rule" where the body $\tau$ of a quantified type $\forall\alpha{\le}\sigma.\ \tau$ is taken to extend to the right as far as possible.

2.1 Simply Typed $\lambda$-Calculus with Subtyping

The first-order $\lambda$-calculus with intersection types (called $\lambda_\wedge$ here) and the
second-order $\lambda$-calculus with bounded quantification (called $F_\le$) can both be presented
as extensions of the simply typed $\lambda$-calculus enriched with a subtyping relation
($\lambda_\le$), a system proposed by Cardelli [

The types of $\lambda_\le$ consist of a set of primitive types (ranged over by the metavariable $\rho$) closed under the type constructor $\to$. The terms of $\lambda_\le$ consist of a countable set of variables together with all the phrases that can be built from these by functional abstraction and application:

$$\tau ::= \rho \mid \tau_1 \to \tau_2 \qquad\qquad e ::= x \mid \lambda x{:}\tau.\ e \mid e_1\ e_2$$

The typing relation of $\lambda_\le$ is formalized as a collection of inference rules for
deriving typing statements of the form $\Gamma \vdash e \in \tau$ ("under assumptions $\Gamma$, expression $e$
has type $\tau$"). The rules for variables, abstractions, and applications are exactly
the same as in the ordinary simply typed $\lambda$-calculus [

$$\frac{\Gamma \vdash e \in \tau_1 \qquad \Gamma \vdash \tau_1 \le \tau_2}{\Gamma \vdash e \in \tau_2} \quad \text{(SUB)}$$

Intuitively, a subtyping statement $\Gamma \vdash \sigma \le \tau$ corresponds to the assertion that $\sigma$ is a refinement of $\tau$, in the sense that every element of $\sigma$ contains enough information to meaningfully be regarded as an element of $\tau$. In some models this means simply that $\sigma$ is a subset of $\tau$; more generally, it implies the existence of a distinguished coercion function from $\sigma$ to $\tau$.

2.2 Intersection Types

The first-order calculus of intersection types, $\lambda_\wedge$, is formed from $\lambda_\le$ by adding intersections to the language of types:

$$\tau ::= \rho \mid \tau_1 \to \tau_2 \mid \bigwedge[\tau_1..\tau_n]$$

In the examples, we use the abbreviations $\sigma \wedge \tau = \bigwedge[\sigma, \tau]$ and $\top = \bigwedge[\,]$.

$$\frac{\text{for all } i,\ \ \Gamma \vdash \sigma \le \tau_i}{\Gamma \vdash \sigma \le \bigwedge[\tau_1..\tau_n]} \quad \text{(SUB-INTER-G)}$$

$$\Gamma \vdash \bigwedge[\tau_1..\tau_n] \le \tau_i \quad \text{(SUB-INTER-LB)}$$

One additional subtyping rule captures the relation between intersections and function spaces, allowing the two constructors to "distribute" when an intersection appears on the right-hand side of an arrow:

$$\Gamma \vdash \bigwedge[\sigma{\to}\tau_1\ ..\ \sigma{\to}\tau_n] \le \sigma \to \bigwedge[\tau_1..\tau_n] \quad \text{(SUB-DIST-IA)}$$

This inclusion is actually an equivalence, since the other direction may be proved from the rules for meets and arrows (listed in full in Section 3). It has a strong effect on both syntactic and semantic properties of the language; for example, it implies that $\top \le \sigma \to \top$ for any $\sigma$.
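As a worked instance of the last remark, taking $n = 0$ in SUB-DIST-IA makes both intersections empty, and the abbreviation $\top = \bigwedge[\,]$ then yields the stated inclusion directly:

$$\begin{aligned}
\Gamma \vdash \bigwedge[\sigma{\to}\tau_1\ ..\ \sigma{\to}\tau_n] &\le \sigma \to \bigwedge[\tau_1..\tau_n] && \text{(SUB-DIST-IA)} \\
\Gamma \vdash \bigwedge[\,] &\le \sigma \to \bigwedge[\,] && (n = 0) \\
\Gamma \vdash \top &\le \sigma \to \top && (\top = \bigwedge[\,])
\end{aligned}$$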

The intersection introduction rule allows an intersection type to be derived for a term whenever each of the elements of the intersection can be derived for it separately:

$$\frac{\text{for all } i,\ \ \Gamma \vdash e \in \tau_i}{\Gamma \vdash e \in \bigwedge[\tau_1..\tau_n]} \quad \text{(INTER-I)}$$

The corresponding elimination rule would allow us to infer, on the basis of a derivation of a statement like $\Gamma \vdash e \in \bigwedge[\tau_1..\tau_n]$, that $e$ possesses every $\tau_i$ individually. But this follows already from the rule SUB-INTER-LB and subsumption; we need not add the elimination rule explicitly to the calculus.

The nullary case of INTER-I is worth particular notice, since it allows the type $\top$ to be derived for every term of the calculus, including terms whose evaluation intuitively encounters a run-time error or fails to terminate.

The system as we have described it so far supports the use of intersection types in programming only to a limited degree. Suppose, for example, that the primitive subtype relation has $\mathit{Int} \le \mathit{Real}$ and the addition function is overloaded to operate on both integers and reals:

$$+ \;\in\; \mathit{Int}{\to}\mathit{Int}{\to}\mathit{Int} \;\wedge\; \mathit{Real}{\to}\mathit{Real}{\to}\mathit{Real}.$$

Using just the constructs introduced so far, there is no way of writing our own
functions that "inherit" the finitary polymorphism of $+$. For example, the
doubling function $\lambda x{:}?.\ x + x$ cannot be given the type $\mathit{Int}{\to}\mathit{Int} \wedge \mathit{Real}{\to}\mathit{Real}$, since
replacing the ? with either $\mathit{Int}$ or $\mathit{Real}$ (or $\mathit{Int} \wedge \mathit{Real}$, or indeed, assuming we
could write it, $\mathit{Int} \vee \mathit{Real}$) gives a typing that is too restrictive:
> double1 = \x:Int. plus x x;
double1 : Int->Int
> double2 = \x:Real. plus x x;
double2 : Real->Real
This led Reynolds [

$$e ::= \ldots \mid \lambda x{:}\sigma_1..\sigma_n.\ e$$

The typing rule for this form allows the typechecker to make a choice of any of the $\sigma$'s as the type of $x$ in the body:

$$\frac{\Gamma,\ x{:}\sigma_i \vdash e \in \tau_i}{\Gamma \vdash \lambda x{:}\sigma_1..\sigma_n.\ e \in \sigma_i \to \tau_i} \quad \text{(ARROW-I')}$$

This rule can be used together with INTER-I to generate a set of $n$ alternative typings for the body and then form their intersection as the type of the whole $\lambda$-abstraction:

> double = \x:Int,Real. plus x x;
double : Real->Real /\ Int->Int

2.3 Bounded Polymorphism

Like other second-order $\lambda$-calculi, the terms of $F_\le$ include the variables, abstractions, and applications of $\lambda_\le$ plus type abstractions and type applications. The latter are slightly refined to take account of the subtype relation: each type abstraction gives a bound for the type variable it introduces, and each type application must satisfy the constraint that the argument type is a subtype of the bound of the polymorphic function being applied. Also, like that of $\lambda_\wedge$, the $F_\le$ subtype relation includes a maximal element, generally called Top.

$$\tau ::= \mathit{Top} \mid \alpha \mid \tau_1 \to \tau_2 \mid \forall\alpha{\le}\tau_1.\ \tau_2$$

$$e ::= x \mid \lambda x{:}\tau.\ e \mid e_1\ e_2 \mid \Lambda\alpha{\le}\tau.\ e \mid e[\tau]$$

Type abstractions are checked by moving the bound for the type variable into the context and checking the body of the abstraction under the enriched set of assumptions (rule ALL-I in Section 3). The rule for type applications must check that the argument type is indeed a subtype of the bound of the corresponding quantifier (rule ALL-E). As with arrow types, subtyping of quantified types is contravariant in their bounds and covariant in their bodies (rule SUB-ALL).

3 The $F_\wedge$ Calculus

$F_\wedge$ is essentially a "least upper bound" of $\lambda_\wedge$ and $F_\le$. To achieve a compact and symmetric calculus, however, a few modifications and extensions are needed. Since $F_\le$ allows primitive types to be encoded as type variables, we drop the primitive types of $\lambda_\le$ and $\lambda_\wedge$. Since $\top$ and Top both function as maximal elements of their respective subtype orderings, we drop Top and let $\top$ take over its job. Since $\forall$ behaves like a kind of function space constructor, we add a new law SUB-DIST-IQ, analogous to SUB-DIST-IA, allowing intersections to be distributed over quantifiers on the right-hand side.

The notions of type variables and type substitution inherited from $F_\le$ can be used to define a further generalization of $\lambda_\wedge$'s $\lambda$-abstraction. We extend the syntax of terms with a new form

$$e ::= \ldots \mid \mathbf{for}\ \alpha\ \mathbf{in}\ \sigma_1..\sigma_n.\ e$$

whose typing rule allows a choice of any of the $\sigma$'s as a replacement for $\alpha$ in the body:

$$\frac{\Gamma \vdash \{\sigma_i/\alpha\}e \in \tau_i}{\Gamma \vdash \mathbf{for}\ \alpha\ \mathbf{in}\ \sigma_1..\sigma_n.\ e \in \tau_i} \quad \text{(FOR)}$$

This rule, like the generalized arrow introduction rule ARROW-I' of $\lambda_\wedge$, can be used together with INTER-I to generate a set of $n$ alternative typings for the body and then form their intersection as the type of the whole for expression:

> double = for A in Int,Real. \x:A. plus x x;
double : Real->Real /\ Int->Int

Indeed, $\lambda_\wedge$'s generalized $\lambda$-abstraction may be reintroduced as a simple syntactic abbreviation:

$$\lambda x{:}\sigma_1..\sigma_n.\ e \;\overset{\text{def}}{=}\; \mathbf{for}\ \alpha\ \mathbf{in}\ \sigma_1..\sigma_n.\ \lambda x{:}\alpha.\ e, \quad \text{where } \alpha \text{ is fresh.}$$

Besides separating the mechanisms of functional abstraction and alternation, the introduction of the for construct extends the expressive power of the language by providing a name for the "current choice" being made by the typechecker. For example, the explicit for construct may be used to improve the efficiency of typechecking even for first-order languages with intersections. The second version of poly requires that the body be checked only twice, as compared to sixteen times for the first version.

> poly = \w:Int,Real. \x:Int,Real. \y:Int,Real. \z:Int,Real.
>          plus (double x) (plus (plus w y) z);
poly : Real->Real->Real->Real->Real /\ Int->Int->Int->Int->Int

> poly = for A in Int,Real.
>          \w:A. \x:A. \y:A. \z:A.
>            plus (double x) (plus (plus w y) z);
poly : Real->Real->Real->Real->Real /\ Int->Int->Int->Int->Int
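The counts of sixteen and two can be reproduced by enumerating the annotation choices the typechecker has to consider. This is only an illustrative sketch in Python; the variable names are assumptions, and it counts combinations of annotations rather than running any typechecker.

```python
from itertools import product

choices = ["Int", "Real"]
binders = ["w", "x", "y", "z"]

# First version: each generalized lambda picks its annotation independently,
# so the body is checked once per combination of choices.
independent = list(product(choices, repeat=len(binders)))

# Second version: a single `for A in Int,Real` fixes one choice that is
# shared by all four binders.
shared = [(a,) * len(binders) for a in choices]

print(len(independent), len(shared))  # prints: 16 2
```

The two shared assignments are exactly the "diagonal" of the sixteen independent ones, which is why both versions of poly can be given the same two-element intersection type.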

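We now define the $F_\wedge$ calculus formally. The sets of types and terms are given by the following abstract grammar:

$$\tau ::= \alpha \mid \tau_1 \to \tau_2 \mid \forall\alpha{\le}\tau_1.\ \tau_2 \mid \bigwedge[\tau_1..\tau_n]$$

$$e ::= x \mid \lambda x{:}\tau.\ e \mid e_1\ e_2 \mid \Lambda\alpha{\le}\tau.\ e \mid e[\tau] \mid \mathbf{for}\ \alpha\ \mathbf{in}\ \tau_1..\tau_n.\ e$$

The three-place subtype relation $\Gamma \vdash \sigma \le \tau$ is the least relation closed under the following rules:

$$\Gamma \vdash \tau \le \tau \quad \text{(SUB-REFL)}$$

$$\frac{\Gamma \vdash \tau_1 \le \tau_2 \qquad \Gamma \vdash \tau_2 \le \tau_3}{\Gamma \vdash \tau_1 \le \tau_3} \quad \text{(SUB-TRANS)}$$

$$\Gamma \vdash \alpha \le \Gamma(\alpha) \quad \text{(SUB-TVAR)}$$

$$\frac{\Gamma \vdash \tau_1 \le \sigma_1 \qquad \Gamma \vdash \sigma_2 \le \tau_2}{\Gamma \vdash \sigma_1 \to \sigma_2 \le \tau_1 \to \tau_2} \quad \text{(SUB-ARROW)}$$

$$\frac{\Gamma \vdash \tau_1 \le \sigma_1 \qquad \Gamma,\ \alpha{\le}\tau_1 \vdash \sigma_2 \le \tau_2}{\Gamma \vdash \forall\alpha{\le}\sigma_1.\ \sigma_2 \le \forall\alpha{\le}\tau_1.\ \tau_2} \quad \text{(SUB-ALL)}$$

$$\frac{\text{for all } i,\ \ \Gamma \vdash \sigma \le \tau_i}{\Gamma \vdash \sigma \le \bigwedge[\tau_1..\tau_n]} \quad \text{(SUB-INTER-G)}$$

$$\Gamma \vdash \bigwedge[\tau_1..\tau_n] \le \tau_i \quad \text{(SUB-INTER-LB)}$$

$$\Gamma \vdash \bigwedge[\sigma{\to}\tau_1\ ..\ \sigma{\to}\tau_n] \le \sigma \to \bigwedge[\tau_1..\tau_n] \quad \text{(SUB-DIST-IA)}$$

$$\Gamma \vdash \bigwedge[\forall\alpha{\le}\sigma.\ \tau_1\ ..\ \forall\alpha{\le}\sigma.\ \tau_n] \le \forall\alpha{\le}\sigma.\ \bigwedge[\tau_1..\tau_n] \quad \text{(SUB-DIST-IQ)}$$

The three-place typing relation $\Gamma \vdash e \in \tau$ is the least relation closed under the following rules:

$$\Gamma \vdash x \in \Gamma(x) \quad \text{(VAR)}$$

$$\frac{\Gamma,\ x{:}\tau_1 \vdash e \in \tau_2}{\Gamma \vdash \lambda x{:}\tau_1.\ e \in \tau_1 \to \tau_2} \quad \text{(ARROW-I)}$$

$$\frac{\Gamma \vdash e_1 \in \tau_1 \to \tau_2 \qquad \Gamma \vdash e_2 \in \tau_1}{\Gamma \vdash e_1\ e_2 \in \tau_2} \quad \text{(ARROW-E)}$$

$$\frac{\Gamma,\ \alpha{\le}\tau_1 \vdash e \in \tau_2}{\Gamma \vdash \Lambda\alpha{\le}\tau_1.\ e \in \forall\alpha{\le}\tau_1.\ \tau_2} \quad \text{(ALL-I)}$$

$$\frac{\Gamma \vdash e \in \forall\alpha{\le}\tau_1.\ \tau_2 \qquad \Gamma \vdash \tau \le \tau_1}{\Gamma \vdash e[\tau] \in \{\tau/\alpha\}\tau_2} \quad \text{(ALL-E)}$$

$$\frac{\Gamma \vdash \{\sigma_i/\alpha\}e \in \tau_i}{\Gamma \vdash \mathbf{for}\ \alpha\ \mathbf{in}\ \sigma_1..\sigma_n.\ e \in \tau_i} \quad \text{(FOR)}$$

$$\frac{\text{for all } i,\ \ \Gamma \vdash e \in \tau_i}{\Gamma \vdash e \in \bigwedge[\tau_1..\tau_n]} \quad \text{(INTER-I)}$$

$$\frac{\Gamma \vdash e \in \tau_1 \qquad \Gamma \vdash \tau_1 \le \tau_2}{\Gamma \vdash e \in \tau_2} \quad \text{(SUB)}$$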

The one point where $\lambda_\wedge$ and $F_\le$ do not fit together perfectly is the maximal types $\top$ (the empty intersection) and Top. We might hope that these would coincide in $F_\wedge$, but this, unfortunately, is not the case. The difference arises from the INTER-I rule of $\lambda_\wedge$, which, in its nullary form, states that any term whatsoever has type $\top$. $F_\le$ has no such rule; the only way a term $e$ can be assigned type Top is by the rules SUB and SUB-TOP, which require that the term already have some type $\sigma$ with $\sigma \le \mathit{Top}$. In other words, Top is the type of all well-typed terms, whereas $\top$ is the type of all terms. Order-theoretically, of course, the two types are equivalent (each is a subtype of the other), since each is maximal.

4 Examples

Intersection types allow very refined types to be assigned to expressions, much more refined than is possible in conventional polymorphic languages. Instead of a single description, each expression may be assigned any finite collection of descriptions, each capturing some aspect of its behavior. Since we are working in an explicitly typed calculus, this requires effort from the programmer in the form of type assumptions or annotations; in general, as more effort is expended, better typings are obtained.

Many functional languages provide a primitive type Bool with two elements, true and false. Here we can introduce two subtypes of Bool, called True and False, and give more exact types for the constants true and false in terms of these refinements:

> Bool < T,
> True < Bool,
> False < Bool;
> true : True,
> false : False;

The polymorphic if primitive can also be given a more refined type: if we know whether the value of the test lies in the type True or the type False, we can tell in advance which of the branches will be chosen. An optimizing compiler might use this information to generate more efficient code in some cases.

> if : All A.
>        (True -> A -> T -> A)
>     /\ (False -> T -> A -> A)
>     /\ (Bool -> A -> A -> A);

(The third typing is needed here because $F_\wedge$'s types cannot express the idea that
every element of Bool is an element of either True or False. This shortcoming,
while not serious in practice, has motivated the investigation of a dual notion of
union types [

Because of the axioms SUB-DIST-IA and SUB-DIST-IQ, we cannot check whether $\Gamma \vdash \sigma \le \tau$ just by comparing the outermost constructors of $\sigma$ and $\tau$ and making recursive calls. For example, [...] is derivable (using SUB-DIST-IA and SUB-TRANS, with intermediate type $\top$), although [...] is not. So, given $\Gamma$, $\sigma$, and $\tau$, the algorithm must first perform a complete analysis of the structure of $\tau$. Whenever $\tau$ has the form $\tau_1 \to \tau_2$ or $\forall\alpha{\le}\tau_1.\ \tau_2$, it pushes the left-hand side ($\tau_1$ or $\alpha{\le}\tau_1$) onto a queue of pending left-hand sides and proceeds recursively with the analysis of $\tau_2$. When $\tau$ has the form of an intersection, it calls itself recursively on each of the elements. When $\tau$ is finally reduced to a type variable, the algorithm begins analyzing $\sigma$, matching left-hand sides of arrow and polymorphic types against the queue of pending left-hand sides from $\tau$. In the base case, when both $\sigma$ and $\tau$ have been reduced to variables, the algorithm first checks whether they are identical; if so, and if the queue of pending left-hand sides is empty, the algorithm immediately returns true. Otherwise, the variable $\sigma$ is replaced by its upper bound from $\Gamma$ and the analysis continues.

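Formally, let $X$ be a finite sequence of elements of the set $\{\tau \mid \tau \text{ a type}\} \cup \{\alpha{\le}\tau \mid \alpha \text{ a type variable and } \tau \text{ a type}\}$. Define the type $X \Rightarrow \tau$ as follows:

$$\begin{aligned}
[\,] \Rightarrow \tau &= \tau \\
[\sigma, X] \Rightarrow \tau &= \sigma \to (X \Rightarrow \tau) \\
[\alpha{\le}\sigma, X] \Rightarrow \tau &= \forall\alpha{\le}\sigma.\ (X \Rightarrow \tau).
\end{aligned}$$

Note that every type $\tau$ has either the form $X \Rightarrow \alpha$ or the form $X \Rightarrow \bigwedge[\tau_1..\tau_n]$, for a unique $X$.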
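The unique decomposition noted here can be computed by peeling arrows and quantifiers off the right-hand side of a type. The following Python sketch is an illustration only: the tuple encoding of types and the names `split` and `build` are assumptions of the sketch, not notation from the calculus.

```python
# Hypothetical encoding (an assumption of this sketch):
#   ("var", a)   ("arrow", t1, t2)   ("all", a, bound, body)   ("inter", (t1, ..., tn))
def var(a): return ("var", a)
def arrow(s, t): return ("arrow", s, t)
def forall(a, bound, body): return ("all", a, bound, body)
def inter(*ts): return ("inter", tuple(ts))


def build(X, t):
    """Compute X => t by folding the sequence X back onto t, right to left."""
    for item in reversed(X):
        if item[0] == "bound":          # ("bound", a, s) stands for a <= s
            t = forall(item[1], item[2], t)
        else:                           # ("type", s) stands for a plain type s
            t = arrow(item[1], t)
    return t


def split(t):
    """Return (X, core) with t == X => core, where core is a type variable
    or an intersection."""
    X = []
    while t[0] in ("arrow", "all"):
        if t[0] == "arrow":
            X.append(("type", t[1]))
            t = t[2]
        else:
            X.append(("bound", t[1], t[2]))
            t = t[3]
    return X, t


example = arrow(var("s"), forall("a", var("b"), inter(var("a"), var("s"))))
X, core = split(example)
```

For `example`, `X` holds the two pending left-hand sides (the type s and the bound a <= b), `core` is the intersection, and `build(X, core)` reconstructs the original type.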

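The 4-place relation $\Gamma \vdash \sigma \le X \Rightarrow \tau$ is the least relation closed under the following rules. (Note that these rules are syntax-directed: at most one of them can be used to establish a given conclusion.)

$$\frac{\text{for all } i,\ \ \Gamma \vdash \sigma \le X \Rightarrow \tau_i}{\Gamma \vdash \sigma \le X \Rightarrow \bigwedge[\tau_1..\tau_n]} \quad \text{(ASUBR-INTER)}$$

$$\frac{\text{for some } i,\ \ \Gamma \vdash \sigma_i \le X \Rightarrow \alpha}{\Gamma \vdash \bigwedge[\sigma_1..\sigma_n] \le X \Rightarrow \alpha} \quad \text{(ASUBL-INTER)}$$

$$\frac{\Gamma \vdash \tau_1 \le [\,] \Rightarrow \sigma_1 \qquad \Gamma \vdash \sigma_2 \le X_2 \Rightarrow \alpha}{\Gamma \vdash \sigma_1 \to \sigma_2 \le [\tau_1, X_2] \Rightarrow \alpha} \quad \text{(ASUBL-ARROW)}$$

$$\frac{\Gamma \vdash \tau_1 \le [\,] \Rightarrow \sigma_1 \qquad \Gamma,\ \beta{\le}\tau_1 \vdash \sigma_2 \le X_2 \Rightarrow \alpha}{\Gamma \vdash \forall\beta{\le}\sigma_1.\ \sigma_2 \le [\beta{\le}\tau_1, X_2] \Rightarrow \alpha} \quad \text{(ASUBL-ALL)}$$

$$\Gamma \vdash \alpha \le [\,] \Rightarrow \alpha \quad \text{(ASUBL-REFL)}$$

$$\frac{\Gamma \vdash \Gamma(\beta) \le X \Rightarrow \alpha}{\Gamma \vdash \beta \le X \Rightarrow \alpha} \quad \text{(ASUBL-TVAR)}$$

The more convenient three-place relation $\Gamma \vdash \sigma \le \tau$ may be defined as

$$\Gamma \vdash \sigma \le \tau \quad\text{iff}\quad \Gamma \vdash \sigma \le X \Rightarrow \phi, \quad \text{where } \tau \equiv X \Rightarrow \phi \text{ and either } \phi \equiv \bigwedge[\phi_1..\phi_n] \text{ or } \phi \equiv \alpha.$$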
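The two phases of the procedure (first analyzing the right-hand side while queueing left-hand sides, then matching the left-hand side against the queue) can be sketched as a pair of mutually recursive functions. This is a minimal Python sketch under stated assumptions rather than a definitive implementation: the tuple encoding, the function names `subtype` and `sub4`, and the representation of contexts as dictionaries from type variable names to bounds are all hypothetical, and bound variables of matched quantifiers are assumed to have been renamed to agree. Like the procedure it mirrors, it is not guaranteed to terminate on every input.

```python
def var(a): return ("var", a)
def arrow(s, t): return ("arrow", s, t)
def forall(a, bound, body): return ("all", a, bound, body)
def inter(*ts): return ("inter", tuple(ts))

TOP = inter()


def subtype(ctx, s, t):
    """Three-place relation: start with an empty queue of left-hand sides."""
    return sub4(ctx, s, [], t)


def sub4(ctx, s, X, t):
    """Four-place relation: does s <= (X => t) hold under ctx?"""
    # Phase 1: analyze the right-hand side, queueing its left-hand sides.
    if t[0] == "arrow":
        return sub4(ctx, s, X + [("type", t[1])], t[2])
    if t[0] == "all":
        return sub4(ctx, s, X + [("bound", t[1], t[2])], t[3])
    if t[0] == "inter":                                   # ASubR-Inter
        return all(sub4(ctx, s, X, ti) for ti in t[1])
    # Phase 2: t is now a type variable; analyze the left-hand side.
    if s[0] == "inter":                                   # ASubL-Inter
        return any(sub4(ctx, si, X, t) for si in s[1])
    if s[0] == "arrow":                                   # ASubL-Arrow
        return (bool(X) and X[0][0] == "type"
                and subtype(ctx, X[0][1], s[1])
                and sub4(ctx, s[2], X[1:], t))
    if s[0] == "all":                                     # ASubL-All
        if not X or X[0][0] != "bound" or X[0][1] != s[1]:
            return False  # assumes matching quantifiers share a variable name
        bound = X[0][2]
        return (subtype(ctx, bound, s[2])
                and sub4({**ctx, s[1]: bound}, s[3], X[1:], t))
    if s == t and not X:                                  # ASubL-Refl
        return True
    if s[1] in ctx:                                       # ASubL-TVar
        return sub4(ctx, ctx[s[1]], X, t)
    return False


Int, Real = var("Int"), var("Real")
ctx = {"Real": TOP, "Int": Real}   # Real <= T, Int <= Real
```

With this context, `subtype(ctx, arrow(Real, Int), arrow(Int, Real))` holds by contravariance, and the distributivity instance `subtype(ctx, inter(arrow(Int, Int), arrow(Int, Real)), arrow(Int, inter(Int, Real)))` holds even though the outermost constructors of the two sides differ.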

Theorem: A statement $\Gamma \vdash \sigma \le \tau$ is derivable from the original $F_\wedge$ subtyping rules iff it is derivable from the rules defining the algorithm.

Proof sketch: The soundness of the algorithm is established by a
straightforward inductive argument. To show completeness, we first reformulate pure $F_\wedge$
subtyping as an equivalent relation on types in a "conjunctive normal form"
where all intersections appear on the outside or on the left of arrows or
quantifiers. (The advantage of this presentation is that it does not need distributivity
rules.) The algorithm is then shown to be complete for this system by an
extension of the method used by Curien and Ghelli [

6 Type Synthesis Algorithm

Given a term $e$ and a context $\Gamma$, the type synthesis algorithm constructs a minimal type $\sigma$ for $e$ under $\Gamma$; that is, it finds a type $\sigma$ such that $\Gamma \vdash e \in \sigma$, and such that any other type that can be derived for $e$ is a supertype of $\sigma$. (Because it calls the subtyping algorithm, the type synthesis procedure may also fail to terminate in some pathological cases.)

The algorithm can be explained by separating the typing rules of $F_\wedge$ into two sets: the structural or syntax-directed rules VAR, ARROW-I, ARROW-E, ALL-I, ALL-E, and FOR, whose applicability depends on the form of $e$, and the non-structural rules INTER-I and SUB, which can be applied without regard to the form of $e$. The non-structural rules are removed from the system and their possible effects accounted for by modifying the structural rules VAR, ARROW-E, ALL-E, and FOR.

The main source of difficulty is the application rules ARROW-E and ALL-E. An application $(e_1\ e_2)$ in $F_\wedge$ has every type $\tau_2$ such that $e_1$ can be shown to have some type $\tau_1 \to \tau_2$ and $e_2$ can be shown to inhabit $\tau_1$, where the rule SUB may be used on both sides to promote the types of $e_1$ and $e_2$ to supertypes with appropriate shapes. For example, if

$$e_1 \in (\sigma_1{\to}\sigma_2) \wedge (\forall\alpha{\le}\sigma_3.\ \sigma_4) \wedge (\sigma_5{\to}\sigma_6) \wedge (\sigma_7{\to}\sigma_8)$$

$$e_2 \in \sigma_1 \wedge (\forall\alpha{\le}\sigma_3.\ \sigma_4) \wedge \sigma_5,$$

then $(e_1\ e_2)$ has both types $\sigma_2$ and $\sigma_6$, and hence (by INTER-I) also type $\sigma_2 \wedge \sigma_6$. To deal with this flexibility deterministically, we must show that the set of supertypes of $(\sigma_1{\to}\sigma_2) \wedge (\forall\alpha{\le}\sigma_3.\ \sigma_4) \wedge (\sigma_5{\to}\sigma_6) \wedge (\sigma_7{\to}\sigma_8)$ that have the appropriate shape to appear as the type of $e_1$ in an instance of ARROW-E can be characterized finitely, using an auxiliary function arrowbasis:

$$\mathit{arrowbasis}((\sigma_1{\to}\sigma_2) \wedge (\forall\alpha{\le}\sigma_3.\ \sigma_4) \wedge (\sigma_5{\to}\sigma_6) \wedge (\sigma_7{\to}\sigma_8)) = [\sigma_1{\to}\sigma_2,\ \sigma_5{\to}\sigma_6,\ \sigma_7{\to}\sigma_8].$$

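Formally, $\mathit{arrowbasis}_\Gamma$ and the analogous function $\mathit{allbasis}_\Gamma$ for dealing with type applications are defined as follows:

$$\begin{aligned}
\mathit{arrowbasis}_\Gamma(\alpha) &= \mathit{arrowbasis}_\Gamma(\Gamma(\alpha)) \\
\mathit{arrowbasis}_\Gamma(\tau_1 \to \tau_2) &= [\tau_1 \to \tau_2] \\
\mathit{arrowbasis}_\Gamma(\forall\alpha{\le}\tau_1.\ \tau_2) &= [\,] \\
\mathit{arrowbasis}_\Gamma(\bigwedge[\tau_1..\tau_n]) &= \mathit{arrowbasis}_\Gamma(\tau_1) * \cdots * \mathit{arrowbasis}_\Gamma(\tau_n) \\[1ex]
\mathit{allbasis}_\Gamma(\alpha) &= \mathit{allbasis}_\Gamma(\Gamma(\alpha)) \\
\mathit{allbasis}_\Gamma(\tau_1 \to \tau_2) &= [\,] \\
\mathit{allbasis}_\Gamma(\forall\alpha{\le}\tau_1.\ \tau_2) &= [\forall\alpha{\le}\tau_1.\ \tau_2] \\
\mathit{allbasis}_\Gamma(\bigwedge[\tau_1..\tau_n]) &= \mathit{allbasis}_\Gamma(\tau_1) * \cdots * \mathit{allbasis}_\Gamma(\tau_n).
\end{aligned}$$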
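Both basis functions are plain structural recursions and translate almost line for line into code. The Python sketch below is illustrative only; the tuple encoding of types and the use of a dictionary for the context are assumptions of the sketch, and types are assumed closed so that every variable lookup succeeds.

```python
def var(a): return ("var", a)
def arrow(s, t): return ("arrow", s, t)
def forall(a, bound, body): return ("all", a, bound, body)
def inter(*ts): return ("inter", tuple(ts))

TOP = inter()


def arrowbasis(ctx, t):
    """List the arrow types reachable from t through bounds and intersections."""
    if t[0] == "var":
        return arrowbasis(ctx, ctx[t[1]])   # promote a variable to its bound
    if t[0] == "arrow":
        return [t]
    if t[0] == "all":
        return []
    return [a for ti in t[1] for a in arrowbasis(ctx, ti)]   # concatenation


def allbasis(ctx, t):
    """List the quantified types reachable from t through bounds and intersections."""
    if t[0] == "var":
        return allbasis(ctx, ctx[t[1]])
    if t[0] == "arrow":
        return []
    if t[0] == "all":
        return [t]
    return [q for ti in t[1] for q in allbasis(ctx, ti)]


# The example from the text, with s1..s8 declared as variables bounded by T.
ctx = {f"s{i}": TOP for i in range(1, 9)}
s = {i: var(f"s{i}") for i in range(1, 9)}
t = inter(arrow(s[1], s[2]), forall("a", s[3], s[4]), arrow(s[5], s[6]), arrow(s[7], s[8]))
```

Here `arrowbasis(ctx, t)` yields the three arrow types in order and `allbasis(ctx, t)` yields the single quantified type, matching the example in the text.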

The fact that arrowbasis deserves its name (that is, that it computes a "finite basis" for the set of all arrow-shaped supertypes of a given type) is captured by the following lemma:

1. $\Gamma \vdash \sigma \le \bigwedge(\mathit{arrowbasis}_\Gamma(\sigma))$.

2. If $\Gamma \vdash \sigma \le \tau_1 \to \tau_2$, then $\Gamma \vdash \bigwedge(\mathit{arrowbasis}_\Gamma(\sigma)) \le \tau_1 \to \tau_2$.

It is then a simple matter to characterize the possible types for $(e_1\ e_2)$ by checking whether the minimal type of $e_2$ is a subtype of each of the left-hand sides in the arrow basis of the minimal type of $e_1$.
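This characterization can be sketched as a small function that filters the arrow basis by a subtype test on the argument and intersects the surviving result types. The Python below is a hedged illustration: `app_type` and `var_subtype` are hypothetical names, the tuple encoding is an assumption, and `var_subtype` is a deliberately simplified stand-in that only compares type variables by walking their declared bounds, not the full subtyping procedure.

```python
def var(a): return ("var", a)
def arrow(s, t): return ("arrow", s, t)
def inter(*ts): return ("inter", tuple(ts))

TOP = inter()


def arrowbasis(ctx, t):
    """Arrow types reachable from t through bounds and intersections."""
    if t[0] == "var":
        return arrowbasis(ctx, ctx[t[1]])
    if t[0] == "arrow":
        return [t]
    if t[0] == "all":
        return []
    return [a for ti in t[1] for a in arrowbasis(ctx, ti)]


def app_type(ctx, s1, s2, subtype):
    """Type of an application whose function has minimal type s1 and whose
    argument has minimal type s2: intersect the result of every basis arrow
    whose left-hand side accepts s2."""
    return inter(*[a[2] for a in arrowbasis(ctx, s1) if subtype(ctx, s2, a[1])])


def var_subtype(ctx, s, t):
    """Toy subtype test (an assumption of this sketch): handles only chains
    of type variables and their declared bounds."""
    while True:
        if s == t:
            return True
        if s[0] != "var" or s[1] not in ctx:
            return False
        s = ctx[s[1]]


Int, Real = var("Int"), var("Real")
ctx = {"Real": TOP, "Int": Real}
plus = inter(arrow(Int, arrow(Int, Int)), arrow(Real, arrow(Real, Real)))
partial = app_type(ctx, plus, Int, var_subtype)    # type of (plus x) for x : Int
```

Since Int is a subtype of both Int and Real, `partial` keeps both components of the overloaded plus, whereas applying plus to an argument of type Real keeps only the Real component. When no basis arrow accepts the argument, the result is the empty intersection.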

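The type synthesis relation $\Gamma \vdash e \in \tau$ is the least relation closed under the following (syntax-directed) rules:

$$\Gamma \vdash x \in \Gamma(x) \quad \text{(A-VAR)}$$

$$\frac{\Gamma,\ x{:}\tau_1 \vdash e \in \tau_2}{\Gamma \vdash \lambda x{:}\tau_1.\ e \in \tau_1 \to \tau_2} \quad \text{(A-ARROW-I)}$$

$$\frac{\Gamma,\ \alpha{\le}\tau_1 \vdash e \in \tau_2}{\Gamma \vdash \Lambda\alpha{\le}\tau_1.\ e \in \forall\alpha{\le}\tau_1.\ \tau_2} \quad \text{(A-ALL-I)}$$

$$\frac{\Gamma \vdash e_1 \in \sigma_1 \qquad \Gamma \vdash e_2 \in \sigma_2}{\Gamma \vdash e_1\ e_2 \in \bigwedge[\psi_i \mid (\phi_i{\to}\psi_i) \in \mathit{arrowbasis}_\Gamma(\sigma_1) \text{ and } \Gamma \vdash \sigma_2 \le \phi_i]} \quad \text{(A-ARROW-E)}$$

$$\frac{\Gamma \vdash e \in \sigma_1}{\Gamma \vdash e[\tau] \in \bigwedge[\{\tau/\alpha\}\psi_i \mid (\forall\alpha{\le}\phi_i.\ \psi_i) \in \mathit{allbasis}_\Gamma(\sigma_1) \text{ and } \Gamma \vdash \tau \le \phi_i]} \quad \text{(A-ALL-E)}$$

$$\frac{\text{for all } i,\ \ \Gamma \vdash \{\sigma_i/\alpha\}e \in \tau_i}{\Gamma \vdash \mathbf{for}\ \alpha\ \mathbf{in}\ \sigma_1..\sigma_n.\ e \in \bigwedge[\tau_1..\tau_n]} \quad \text{(A-FOR)}$$

Theorem: [Soundness] If $\Gamma \vdash e \in \tau$ is derived by this algorithm, then it can also be derived from the original $F_\wedge$ typing rules.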

Proof: Straightforward translation of derivations.

Theorem: [Minimal typing] If $\Gamma \vdash e \in \sigma$ is derived by the algorithm and $\Gamma \vdash e \in \tau$ can be derived from the original typing rules, then $\Gamma \vdash \sigma \le \tau$.

Proof: Straightforward induction, using the properties of allbasis and arrowbasis for the application cases.

7 Semantics

A straightforward untyped semantics can be given for $F_\wedge$ by extending Bruce and
Longo's partial equivalence relation model for $F_\le$ [

More refined models have been given for intersections (cf. [

An equational theory of provable equivalences between terms of pure $F_\wedge$ can be shown to be sound for both the untyped and the translation semantics.

8 Future Work

A primary practical concern for programming notations based on intersection types is the efficiency of typechecking for large programs. Naive implementations of the algorithms given here exhibit exponential behavior in practice, in both type synthesis (because of the for construct) and subtyping (because of rules ASUBR-INTER and ASUBL-INTER). Fortunately, this behavior normally occurs as a result of explicit programmer directives: requests, in effect, for an exponential amount of analysis of the program during typechecking. Still, a serious implementation must find ways to economize, for example by caching the partial results of previous analysis.

Another consideration for any language based on second-order polymorphism
is the problem of verbosity. Without some means of abbreviation or partial type
inference, even modest programs quickly become overburdened with type
annotations. Cardelli's partial type inference method for $F_\le$ [

Larger examples are needed to establish the practical need for intersection types and bounded quantification in their most general forms. It may be possible to obtain most of the practical power of $F_\wedge$ while remaining within a simpler, more tractable fragment.

Acknowledgements

John Reynolds and Bob Harper supervised the thesis in which these results first appeared; their contributions can be found on every page. I am also grateful for discussions with Luca Cardelli, Tim Freeman, Nico Habermann, QingMing Ma, Frank Pfenning, and Didier Rémy.

References
[1] Franco Barbanera and Mariangiola Dezani-Ciancaglini. Intersection and union
types. In Ito and Meyer [