( Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I JOHX MCCAaTItY, Massachusetts Institute of Technology, Cambridge, Mass. 1. Introduction 2. F u n c t i o n s a n d F u n c t i o n Definitions A programming system called LISP (for lASt Processor) has been developed for the I B M 704 computer by the Artificial Intelligence group at M.I.T. The system was designed to facilitate experiments with a proposed system called the Advice Taker, whereby a machine could be instructed to handle declarative as well as imperative sentences and could exhibit "common sense" in carrying out its instructions. The original proposal It] for the Advice Taker was made in November 1958. The main requirement was a programming system for manipulating expressions representing formalized declarative and irnperalive sentences so that the Advice Taker system could make deductions. In the course of its development the Lisp system went through several stages of simplification and eventually came to be based on a scheme for representing the partial recursive functions of a certain class of symbolic expressions. This representation is independent of the IBM 704 computer, or of any other electronic computer, and it now seems expedient to expound the system by starting with the class of expressions called S-expressions and the functions called S-functions. In this article, we first describe a formalism for defining functions reeursively. We believe this formalism has advantages both as a programming language and as vehicle for developing a theory of computation. Next, we describe S-expressions and S-functions, give some examples, and then describe the universM S-function apply which plays the theoretical role of a universal Turing machine and the practical role of an interpreter. Then we describe the representation of S-expressions in the memmT of the IBM 704 by list structures similar to those used by Newell, Shaw and Simon [2], and the representation of S-functions by program. Then we mention the main features of the Lisp programming system for the IBM 704. Next comes another way of describing computations with symbolic expressions, and finally we give a recursive function interpretation of flow charts. We hope to describe some of the sylnbolie computations for which LISP has been used in another paper, and also to give elsewhere some applications of our reeursive function formalism to mathematical logic and to the problem of mechanical theorem proving. 184 Communications of t h e ACM We shMl need a number of mathematical ideas ar:d notations concerning functions in general. Most of the ideas are well known, but the notion of conditional e,~pre~'sion is believed to be new, and ihe use of conditional expressions permits functions to be defined recursively in a new and convenient way. a. Partial Functions. A partial function is a funct on that is defined only on part of its domain. Partial funetio:~s necessarily arise when functions are defined by eomputa~ tions because for some values of the arguments t:he Pomp:> ration defining the value of the function may not terminate. However, some of our elementary functions wilt be defined as partial functions. b. Propositional Expres.s'ions and Predicates. A t)ropo~itionM expression is an expression whose possible values are T (for truth) and F (for falsity). We shall assume that the reader is fanfiliar with the propositionM eom~eelives A ("and"), V ( " o r " ) , and ~ ( " n o t " ) , Typieai propositional expressions are: x 2--+3) = 4 (2 < 1--~ 4, 2 > 1 - ~ 3 , zpre;~. : = 3 1 - , 6 0, =3 < T -~3) (2 < I --~ 3, 4 < 1 --~ 4) is undefined ,viii ~'e: Some of tile simplest a p p l i c a t i o n s of eonditionM expressia~s are in giving such definitions as x = (x < 0 - - , - x , = 0 -+ m, T --* g e d ( r e m ( n , m ) , m ) ) where rein(n, In) denotes the remainder left when n is divided b y m. The Newtonian algorithm for obtaining an approximate square root of a number a, starting with an initial approximation :v and requiring that an acceptable approximation y satisfy l y 2 -- a I < e, m a y be written as sqrt(a, x, ~) = (Ix ~ - (1 (2 < 1 - - , 3 , T - - , 0 ) is n l l d e [ i l l e d !i g e d ( m , n ) = (In > n --, g e d ( n , m ) , r e m ( n , m ) a I < e--,x,T-~sqrt iF--~x) The simultaneous reeursive definition of several functions is also possible, and we shall use such definitions if they are required. There is no guarantee t h a t tile computation determined by a reeursive definition will ever terminate and, for example, an a t t e m p t to compute n[ from our definition will only succeed if n is a non-negative integer. If the computation does not terminate, the function m u s t be regarded as undefined for the given arguments. The propositional connectives themselves can be defined by conditional expressions. We write p/~q= pVq san x O/ll/eb ~yp e~l ~ (x < 0 - - , - 1 , x == 0 - - ~ 0 , T - ~ I ) d. Rccursive Function DefiniHons. By using conditionM ~:,xpressions we can, without circularity, define functions ,v formulas in which the defined function occurs. For {'xample, we wrile l~: = (n = 0 - ~ 1, T - - * n . ( n - o~ the [ mde~acollnf¢~- tr~t h 1)1) When we use this formula to e v a l u a t e 0 [ we get the answer i: because of the way in which t h e value of a conditional expression was defined, the meaningless expression ~]- ~} - 1)! does not arise. T h e e v a l u a t i o n of 2! according r, lhi:~ definition proceeds as follows: 2! - (2 = 0 - * 1, T - - . 2 . ( 2 - 1):) - 2.1! = 2.(1 = 0~L T-~ 1.(1 -- 1)!) = 2-1-0! lly F,:,~ worg! ities..'~ e e's ~?~ ~ tbet~ el, = -.1.(0 ( a , ~1 (x + ~) x ' e)) 2 > 1--~2) = 3 (,2 < 1-->4, T - ~ 3 ) (2 the recursive function defirfition: = 0-~I,T-~0.(0- 1)!) =2.1.1 =2 We now give two other a p p l i c a t i o n s of recursive function definitions. T h e greatest c o m m o n divisor, gcd(m,n), of ~wc, positive integers m a n d n is computed b y means of the Euclidean algorithm. T h i s a l g o r i t h m is expressed by (p --~ q, T --~ F) = (p -~ T, T - * q) ~-~p = ( p - ~ F , T - - ~ T ) p D q = (p-~q, T - ~ T ) I t is readily seen that the right-hand sides of the equations have the correct t r u t h tables. If we consider situations in which p or q m a y be undefined, the connectives /~ and V are seen to be; noncommutative. F o r example if p is false and q is undefined, we see t h a t according to the definitions given above p A q is false, but q A p is undefined. For our applications this n o n c o m m u t a t i v i t y is desirable, since p /~ q is computed by first c o m p u t i n g p, and if p is false q is not computed. :If the c o m p u t a t i o n for p does not terminate, we never get around to c o m p u t i n g q. We shall use propositional connectives in this sense hereafter. e. Functions and Forms. I t is usual in m a t h e m a t i c s - outside of mathematical l o g i c - - t o use the word "function" imprecisely and to apply it to forms such as y~ -V x. Because we shall later compute with expressions for functions, we need a distinction between functions a n d forms and a notation for expressing this distinction. This distinction and a notation for describing it, from which we deviate trivially, is given by Church [3]. Let f be an expression t h a t stands for a function of two integer variables. I t should make sense to write f(3, 4) and the value of this expression should be determined. The expression y~ + x does not meet this requirement; Communications of the ACM 185 i '?i ¸¸¸¸ • !!!! ~!iiililii:~i ,!~i i? ,ii!~ iS, ~i !ii~~~ !i ~iii!i~~i!i ii ii?!il, ;i:+ ii y~ + x(3, 4) is not a conventional no~ati(m, and if we a t t e m p t e d to define it we would be uneert%ir~ whether its valtte would turn out, to be i13 or 1.9. Church calls a,n expression like ye + x a form. A form can be eot~verted into a ftmetion if we can determine the, eorrespondenee between the variables occurring in the form and the ordered list of a r g u m e n t s of the desired function. This is accomplished b y Church's X-flotation. If ~;is ~ form in variables xl , - . . , x~,, then X((xt , , . . , x,,), ~;) will t:>(:~taken to be the function of n variables whose value is d e t e r m n e d by substituting the arguments for lhe variables x~, • • - , x~, in t h a t order in g and evaluating the resulting expression. For example, X((x,y),ye+x) is a function of two variables, and X( (x, y ) , y~ + x ) (3, 4:) - 19. The variables occurring in the list of variables of a X-expression are d u m m y or bound, like, variables of integration in a definite integral. T h a t is, we m a y change the names of the bound w~riables in a function expression without changing the vMue of the expression, provided t h a t we m a k e tile same change for each occurrence of the wu'iable and do not m a k e two variM)les lhe same t h a t previously were different. Thus X((x,y),y"'+x),X((u,v), v ~ + u ) and X((y, x), x e + y ) denote the same function. We shall frequently use expressions in which some of the w m a b l e s are bound by X's and others are not. Such art expression m a y be regarded as defining a function with parameters. T h e unbound variables are (;ailed free variables. An a d e q u a t e notation t h a t distinguishes functions from forms allows an u n a m b i g u o u s t r e a t n m n t of functions of ftmetions. It, would involve too much of a digression to give examples here, but we shall use functions with functions as a r g u m e n t s later in this report. Difticulties arise in combining functions described by X-expressions, or by arty other notation involving variables, because different bound variables m a y be represented by the same symbol. This is called collision of bound variables. There is a notation involving operators that are called eombinators for combining functions without the use of variables, Unfortunately, the combinatory expressions for interesting combinations of functions term to be lengthy and unreadable. f. Expressions for Recursive Functions. T h e X-notation is inadequate for naming functions defined recursively. For example, using X's, we can convert the definition sqrt(a, x, e) J. a ( ix" -- a I < e - ~ x, T -~ sqrt(a, 2(x + x ), e)) ~:~ii! ~i~i~' :i i~I into sqrt = X ( ( a , x , ¢ ) , ( l x 2 -- a,l < e---+ x, T'---~ i a) +))) sqrt (a, 2(x + x ' ii? i!~T i~' but the right-hand side cannot serve as an expression for the fimetion because there would be nothing to indicate 186 C o m n u m i c a t i o n s of the ACI~'I that the refere~ee to sqrt within *he expression sloe :1for ih~. expression as a whole. In order to be able to write expressio~s for I'eCursive functions, we introduce another ~aotation label(~ ['~ &:.. notes the. expression g, provided t h a t oeeurre~ces <)f ~} within 8 are to be interpreted as referring to the expres,,:i<, as a whole. Thus we can write h~bel(sqrt, X((a, x, e), ( [ x'-' - a < e --, x, rl' -~ sqrt (a, 1 (x a 2 ' + x )' ~):~) a nanle for ollr s(tl:'t functiolt. T h e symbol a ill label(a,g) is also bound thai is, i~ i; m a y be altered systematically without changing the mealing of the expression. It behqvcs, differently from a \'ariaN~ bound by a X, howe\'er. ab; { 3. R e e u r s i v e F u n c t i o n s o f S) m b o l i c E x p r e s s i o n s We shall first define a class of symbolic express ~ s :i~ terms of ordered pairs and lists. Then we shall define five, elementary functions and predicates, and build front them by composition, conditional expressions, and "(ru,sv~:, definitions an extensive class of functions of which w~ sh'dl give a n u m b e r of examples. We shall lhen show how these functions themselves can be expressed as syr~bol c expressions, and we shah define a universal f t l l l C t i O t t ( P' ] g that Mlows us to c o m p u t e front the expression for a gi~t~ function its value for given arguments. Finally, we shal define some functions with functions as argume~ts :u~d giw; some useful examples. 0 g ~i' ; ¢ i ill' a. A Class qf S!]nzbolic Expressions. We shall now ns are th.el~ defined as follows: 1. M()mic symbols are S-expressions° i?. If e~ alld e.., are S-expressions, so is (el'e2). Examples of S-expressions a r e t f o r iL 2. eq. eq Ix; 3'] is defined if "rod o n l y if both x and y are atomic, eq Ix; y] = T if x a n d y a r e t h e same symbol, and eq Ix; Yl = F otherwise. T h u s eq [X; X] = T eq IX; A] = F eq [X; ( X . A ) ] is undefined. AB p>e:ssi>¢ (~v. B) (tAB.O)-D) ) An S-expression is then s i m p l y an ordered pair, the wrms of which m a y be a t o m i c s y m b o l s or simpler S-expressi,ms. We can represent a tist of a r b i t r a r y length in terms <>fS-expressions as follows. T h e list { : e m<~. cdr [ ( X . A ) ] = A edr [ ( ( X . A ) - Y ) ] =: 55 ( m t - i m p ' ( .... ( m , , - N [ L ) . . . ) ) ) ma ~he!~ "eea ~:sve : 'hi,e~i w, {e ,aow h*.:~'.~: s,:,,m}',~,ii~ ilere N I L is an atomic s y m b o l used to terminate lists. Si~me many of the s y m b o l i c expressions with which we de:~.l are conveniently e x p r e s s e d as lists, we shall introduce a lis~ imtation to a b b r e v i a t e c e r t a i n S-expressions. We have 1. i m ) siands for ( r e . N I L ) . 2. (m~ , . , . , m , ) s t a n d s for (ml. (. -. ( m , . N I L ) - . - ) ) . '.~. ( m , , -.- , m,,.x) s t a n d s :for (mr' ( - - - ( m , , . x ) . . . )). Subexpressions can be similarly examples of these a b b r e v i a t i o n s are w e :,t~atl ¢; 'e :forme~ abbreviated. Some {(AB, C), l ) ) for ( ( A B - ( C - N I L ) ) . ( D . N I L ) ) ( ~ A , B ) , C , I ) . E ) for ( ( A . ( B - N I L ) ) . ( C . ( D . E ) ) ) Nince we regard the expressions with c o m m a s as abbreviation,s for those not i n v o l v i n g commas, we shall refer to ~hem all as S-expressions. b. ["~mc~ions of S-expressions and the Expressions 77~at t~q,'~,~ent Them. We now define a class of functions of S-expressions. T h e expressions representing these ftmci~¸ ~iiii!~' ~io~s are written in a c o n v e n t i o n a l functional notation. be, is, tIowever, in order to clearly distinguish the expressions { in Ieiter~ rvpresenting fmmtions from S:expressions, we shall use re'pitS :{ ~quences of lower-case l e t t e r s for function names and { variabk~s ravaging over the s e t of S-expressions. We also ~ brackets and semicolons, instead of parentheses and commas, for denoting the a p p l i c a t i o n of functions to their arguments. T h u s we write car [x] car [cons [ ( A . B ) ; x]] i" reqti~ i)e f0t:ff# lish ~0r45 L a i r y SII~ '-~te r s is if.is ,ca1 i car [ ( X . A ) I = X ear [ ( ( X . A ) . Y ) ] = i X . A ) 4. edr. cdr Ix] is also defined w h e n x is not atomic. We have cdr [(<'e~)l = < . T h u s edr {X] is undefined. (m~, me, "'" ,m,,) is represented by the S-expression ms 3. ear. car Ix] is defined if "rod o n l y if x is not atomic. car [(e~.ee)] = el. T h u s car [XI is undefined. 5. cons. cons Ix; y] is defined f o r a n y x and y. We have cons [e~ ; e~] = (e1.e~). T h u s cons [X; A1 = i X . A) cons [ i X . A ) ; Y] = ( ( X - A ) . Y ) ear, cdr, and cons ,~i'e easily seen to s a t i s f y t h e relations ear [cons [x; y]] = x cdr [cons {x; y]] = y cons [car [x]; cdr [x/1 = x, p r o v i d e d t h a t x is not. atomic. The nantes " c a r " and " c o n s " will c o m e to have mnemonic significance only when we discuss t h e representation of the system in the computer. C o m p o s i t i o n s of car and cdr give the subexpressions of a given e x p r e s s i o n in a given position. Compositions of cons form e x p r e s s i o n s of a given structure out of paris. The class of f u n c t i o n s which can be formed in this way is quite limited a n d not v e r y interesting. d. Recur,s@e S-functions. We gel; a n m c h larger class of functions (in fact, all c o m p u t a b l e f u n c t i o n s ) when we allow ourselves to form new f u n c t i o n s of S-expressions by conditional expressions and recursive definition. We now give some examples of f u n c t i o n s that are definable in this way. 1. ff [x]. The value of ff Ix] is t h e first atomic symbol of the S-expression x with the p a r e n t h e s e s ignored. Thus ff [ ( ( A . B ) . C ) ] = a W e have ff [x] = [atom [x] - ~ x; T --~ff [ear [x]]l I~ these M-expressions (meta.-expressions) a n y S-expressions that oeetlr s t a n d for themselves. We now tract in detail the s t e p s in t h e evaluation of ff [ ( ( A . B ) , C ) ] : e, The tflemerttary S:functions and Predicates. We intro ff [ear [(A.B)]]] append Ix; y] = [mdI [x] -+ y; T ~ cons [ca," [x]; = ff [ear [ ( ( A . B ) . C ) ] ] = [F + ( A . B ) ; T - ~ f f [eat" [(m.*{)]]] = [T --+ff [car [(A.B)]]] append [cdr [x]; y]]] An example is = ff lear [(A.B)]] append [(A, B ) ; (C, l), E)] = (A, B, C, D, ]i;) : ff [A] 2. among [x; y]. This predicate is true if the S-expression x occurs among the elements of the lis~ y. We have = [atom [A] --~ A; T --+ ff [ear [A]]] = [T ~ A; T --, ff [car [A]]] among Ix; y] = ~--mull [y] /~ [equal [x; ear [y]] =A 2. subst Ix; y; z]. This function gives the result of substituting the S-expression x for all occurrences of the atomic symbol y in the S-expression z. It is defined by 3. pair Ix; y]. This function gives the list of pairs of corresponding elements of the lists x and y, We have subst Ix; y; z] = [atom [z] --+ [eq [z; y] -+ x; T --~ z]; pair Ix; y] = [null [ x ] / \ null [y] -+ N I L ; --,atom [xl V among [x; cdr D']]] /~ ~-~atom [y] -+ cons [lisg lear Ix]; ear [y]]; T + cons [subst Ix; y; ear [z]]; subst Ix; y; edr [z]l]] pair [edr Ix]; cdr [y]]]] As an example, we have subst [ ( X . A ) ; B; ((A-B)-C)] = ((A-(X.A))-C) 3. equal [x; y]. This is a predicate t h a t has the value T if x and y are the same S-expression, and has the value Iv otherwise. We have equal Ix; y] = [atom [x]/~ a t o m [y] /~ eq Ix; y]] V [~-oatom [x] /~ ~ a t o m [y] /~ equal lear [x]; car [y]] /~ equal [edr [x]; edr [yl]l It is convenient to see how the elementary functions look in the abbreviated list notation. T h e reader will easily verify t h a t (i) ear [ ( m l , m2, . . . , m , ) ] = ml (ii) edr [ ( m , , m2, " " , m ~ ) ] = ( m 2 , . . " , m ~ ) (iii) cdr [(m)] = N I L (iv) c o n s [ m , ; ( m 2 , ' " , m , , ) ] = (ml,m~,'",m,,) (v) cons [m; NIL] = (m) We define null [x] = a t o m [x] A eq [x; NIL] This predicate is useful in dealing with lists. Compositions of ear and edr arise so frequently t h a t m a n y expressions can be written more concisely if we abbreviate eadr [x] for eaddr [x] for pair [(A, B, C ) ; (X, (Y, Z), U)] = ((A, X ) , (B, (Y, Z)), (C, C)) 4. assoe [x;y]. If y is a list of the form ( ( u t , v,), • • • , (m~, v~) ) and x is one of the u's, then assoe [x; Yl is the corresponding v. We have assoc [x; y] = ecl[caar [y]; x] -+ eadar [y] ; T --~ assoc Ix; ode [y}]] An example is assoe IX; ( ( W , (A, B ) ) , (X, (C, D ) ) , (Y, (E, F ) ) ) ] : (C, ~)) 5. sublis [x; y]. Here x is assumed to have the form of a list of pairs ((u~, v , ) , • .. , (u,,, v , ) ) , where the u% are atomic, and y may be any S-expression. The value of sublis [x; y] is tile result of substituting each v for the corresponding u in y. In order to define sublis, we first defi~e an auxiliary function. We have sub2 [x; z] = [null Ix] -+ z; eq [eaar [x]; z] ~ eadar [x]; T -+ sub2 [cdr [x]; z]] and sublis Ix; Yl = [atom [y] --~ sub2 [x; y]; ear [edr [x]], T ~ cons [sublis [x; ear [y]]; sublis [x; edr [y]i] ear [edr [edr [x]]], etc. Another useful abbreviation isto write list [e~ ; e= ; • • • ; e~] for cons [el ; cons [e2 ; • • • ; cons [e. ; N I L ] - . • ]]. This function gives the list, ( e , , . . - , e.), as a hmetion of its elements. 188 An example is C o m m u n i c a t i o n s o f t h e ACM We have sublis [((X, (A, B ) ) , (Y, (B, C ) ) ) ; (A, X . Y ) ] = (a, (a, B), B, c) ~ssi0ns ~ :} x]; y]ll S-ex. %have ,dr b']]l )airs of 'e dr [y]]]] Ic, u)); e. R~¢prese,zlation of S-Functions by S-Expressions. S.func[ions have bem~ described by M-expressions. We now give a rule for t,ranslating M-expressions into Sexpressions, ii, order to be able to use S-functions for making certain computations with S-functions and for aaswering certain questions about S-functions. The translation is determined by the following rules in which we denote ~he translation of an M-expression 8 by 1. If g is an S-expression E* is (QUOTE, 8). 2. Variables and function names that were represented by strings of lower-case letters are translated to the corresponding strings of the corresponding upper-case letters. Thus car* is CAR, and subst* is SUBST. 3. A form fie, ; . • • ; en] is translated to (f*, e~*, • - • , en*). Thus {cons {ear [x]; edr [x]l}* is (CONS, (CAR, X), and, eval [e; a] = [ atom [e] --+ assoc [e; a]; atom [car [e]] --+ [ eq {ear [e]; QUOTE]-~ cadr [e]; eq [car [e]; ATOM] -÷ atom [eval [cadr [e]; a]]; eq [cal" [e]; EQ] --~ [eval [cadr [e]; a] = eval [cad& [e]; a]]; eq [car {el; CON])] --~ evcon [edr [e]; a]; eq [ear" [e]; CAR] --~ ear [eval [cadr [el; a]]; eq [car [e]; CDR] --~ cdr [eval [cadr [e]; a]]; eq [car {el; CONS] --~ cons [eval [eadr {el; a]; eval [eaddr {el ; a]]; T --+ eval [cons [assoe {ear {el; a]; CDR, X)). 4. {[p, -+ e, ; . . . ; p,, -+ e,]}* is (COND, (p**, e**), . . , (p,,*. e,,*)). 5. {X[[x, ; .." ; x,]; g]}* is (LAMBDA, (x~*, --- , x,,*), ~;*). 6. {label [a; a]}* is (LABEL, a*, g*). With these conventions the substitution function whose M-expression is label [subst; X[[x; y; z]; [atom [z] -+ [eq [y; z] -~, x; T --~ z]; T --~ cons [subst [x; y; ear [z]]; subst Ix; y; cdr [z]]]]l] has the S-expression ( L A B E L , SUEST, (I.MMI~DA, (X, Y, Z), (COND evlis [edr [e]; all; a]]; eq [caar [e]; LABEL] --~ eval [cons [eaddar {el; cdr [e]]; cons {list [cadar {el; car [e]; a]]; eq [eaar [e]; LAMBDA] -+ ewd [caddar [e]; append [pair [ca.dar {el; evils [cdr {el; a]; a]]] and evcon [c; a] = [eval [caar [e]; a] -~ eval [eadar [c]; al; T --~ evcon [cdr [c]; a]] ((ATOM, Z), (COND, (EQ, Y, Z), X), ((QUOTE, T), Z))), ((QUOTE, T), (CONS, (SUBST, X, Y, and evils [m; a] = [uull [m] --+ NIL; (CAI~ Z)), (SUBST, X, Y, (CDR, Z ) ) ) ) ) ) ) This notation is writable and somewhat readable. It can be made easier tO read and write at the cost of making its structure less regular. If more characters were available on the computer, it could be improved considerably. he c0ri define" f. The Universal S-Function apply. There is an S-function apply with the property that if f is an S-expression for art S-function f' and args is a list of arguments of the form (argl, . . . , argn), where argl, . . ' , argn are arbitrary S-expressions, then apply{f; args] and f'[argl; .." ; argn] arc defined for the same vahms of argl, . . . , argn, and are equal when defined. For example, X[[x; y]; cons [car [x]; yl] [(A, B); (C, D)] [x]; z]l il¸ = apply [(LAMBDA, (X, Y), (CONS, (CAR, X), y ) ) ; ((A, B), (C, D))] = (A, C, D) The S-function apply is defined by apply If; a r g s ] = eval [cons If; appq [args]]; NIL] where appq [m] = {null [m] -+ NIL; T -~ cons {list [QUOTE; car [m]]; appq [cdr [m]]]] T -~ cons [eval [car [m]; a]; evlis [cdr [m]; a]]] We now explain a number of points about these definitions. 1. apply itself forms an expression representing the value of the function applied to the arguments, and puts the work of evaluating this expression onto a function eval. It uses appq to put quotes around each of the arguments, so that eval will regard them as standing for themselves. 2. eval [e; a] has two arguments, an expression e to be evahmted, and a list of pairs a. The first item of each pair is an atomic symbol, and the second is the expression for which the symbol stands. 3. If the expression to be evaluated is atomic, eval evaluates whatever is paired with it first on the list a. 4. If e is not atomic but car [e] is atomic, then the expres~' e) or (ATOM, e) or sion has one of the forms (QUO 'I'E, (EQ, el, e2) or (COND, (pl, e , ) , . . . , (P,,, e,,)), or (CAR, e) or (CDR, e) or (CONS, e,, e2) or (f, e,, .." , e,,) where f is an atomic symbol. In the case (QUOTE; e) the expression e, itself, is taken. In the case of (ATOM, e) or (CAR, e) or (CDR, e) the expression e is evaluated and the appropriate function taken. In the case of (EQ, el, e2) or (CONS, el, e~) two expressions have to be evaluated. In the case of (COND, C o m m u n i c a t i o n s of t h e ACM 18~ (p~, el), - . . , (p,~, e , ) ) the p's havo to be evaluated in order until a true p is found, and theH the eorresponding o must be evaluated. This is accomplished by eveon. Finally, in the case of (f, o,, . . . , on) we evaluate t,he expression that results from replacing f in this expression by whatever it; is paired with in the list a. 5. T h e evaluation of ((LABEI~, f, g), e~, • - • , e,,) is 'tecomplished by evaluating (8, o~ , • .. , e~) with the pairing (f, ( L A B E L , f, 8)) put on the front of the previous list, a of pairs. 6. Finally, the evaluation of ( (LA MBDA, (x~, • - - , x,~), ~), e~, . . . , e,~) is accomplished by evaluating ~; with tho list of pairs ( ( x t , o~), . . . , ( ( x n , o,)) put on the front of the previous list a. The list a could be eliminated, and L A M B D A and L A B E l , exprossions evaluated by substituting the arguments for the variables in. the expressions ~;. Unfortunately, difficulties involving collisions of bound variables arise, but they are avoided by using the list a. Calculating the values of functions by using apply is an activity better suited to electronic computers than to people. As an illustration, however, we now give some of the steps for calculating ((ATOM, X), X), ((QUOTE, T), (FF, ( C a ~ , X ) ) ) ) ) ) ; ((A.B))] = The first argument is tho S-expression that represents the function ff defined in seetion 3d. We shall abbreviate it by using the lettor ¢. Wo have apply [~; ( ( A - B ) ) ] = e v a I [ ( ( L A B E L , FF, ¢), (QUOTE, ( A - B ) ) ) ; NIL] where ¢ is tho part of ¢ beginning ( L A M B D A ( X ) , ~o), ( Q U O T E , (A.B))); ( ( F F , ¢))] where ca is the part of ¢ beginning ( C O N D = e v a l [(CONI), (rr~, et), (~r~, e=)); ( ( X , ( Q U O T E , ( A . B ) ) ) , (FF, ¢ ) ) ] Denoting ( (X, ( Q U O T E , ( A- B ) ) ), ( F F , ¢) ) by a, wo obtain = ovoon [( (~-~, ~), (~r~, ~.) ) ; ~*1 This involves eval [~rt ; co] = eval [(ATOM, X ) ; a] = a t o m loyal iX; a]] = a t o m loyal [assoe iX; ( (X, (QUOTE, (A. B) ) ), (FF, ¢))]; all = a t o m [oval [(QUOTE, (A. B ) ) ; a]] = atom [(A.B)] =F 190 C o m m u n i c a t i o n s o f t h e ACM apply [¢5; ((A. B))] = ore,on [ ( ( ~ , ~ ) ) ' ~t, which involves oval [~ ; a] = eva[ [(Q[YOTE, T ) ; ~] = T. Our main calculation again eontim~es with :~ apply [¢; ((A. B) )] = oval [e2 ; a] t = eval [(FF, (CAR, X ) ) ; a] = e v a l [cons [4; evils [((CAR, X ) ) ; a]]; a] i' Evaluating evlis [((C, AR, X ) ) ; a] involves oval [(c~' k t{, x ) ,~l !i - ear [oval [X; all] = ear [(A'B)I, where we took steps from the earli~,r computation of a.tom loyal IX; all = A, and so evils i t ( C A R , X ) ) ; a] then becomes list [list [QUOTt!;; A]] = ( ( Q U ) r E , A)) '( ....., ~. and otlr main quantity becomes eva| [(4,, (q~Jo IL, x)); ~] apply [(LABEL, FF, ( L A M B D A , ( X ) , ( C O N D , = eval[((LAMBDA, ()ur main caleulation contirmes wiih The subsequent steps are made as in tho begimfing ,,f the calculation. T h e L A B E L and L A M B D A cause i~(:~*.' pairs to be added to ~, which gives a new list; of pairs < . The rr, term of the conditional oval [(ATOM, X); the value 32 t)oeause X is paired with ( Q [ O 1 E, A)tirst in a l , rather than with ( Q U. .O. .I. E c, (A . B ) ) as in c~. Therefore we end up with oval iX; a~] fl'om the .e~'co~, 'rod this is just A. g. Functions with Functions as Arguments. There are a number of useflfl functions some of whose arguments are functions. T h e y are especially useful in defining othe|" runetions. One such function is maplist ix; f] with an S-expression argument x and an argument f that, is a function from S-expressions to S-expressions. Wo define {{ '{{:: i :~, :i~ ~- :, lnaplist [x; fl = [null [xl --~ N I L ; T --+ cons [fix]; maplist [cdr ix]; i'1~] , :['he usefulness of maplist is illustrated by fornmlas for the partial derivative with rospeet to x of expressions i~> volving sums and products of x and other variables. The S-expressions that we shall differentiate are formed ~ follows. 1. An atomic symbol is an allowed expression. 2. If e~, e~, • • • , e, are allowed expressions, (PLUS, < , . . . , en) and ( T I M E S , e~, . . . , e.) arc also, a.nd represeld' the sum and product, respectively, of e~, • • - , e , . This is, essentially, the Polish notation for functio~* except, that the inclusion of paronthe~es and eoinmas tfl" lows functions of variablo numbers of arguments. An exa m~ ple of an allowed expression is ( T I M E S , X (t)L[~'S' X, A), Y), the conventional algebraic notati(m for whicD is X ( X + A ) Y . ii !i : ;~ ~ ~, ;:. (a) (b) (c) FIG. I Our different.fallen tormula, which gives the derivative of y with respecti (.() x, is diff [y; x] = [atom [y]---* [eq [y; x] -~, ON[i;; T --, ZEtIOI; eq [car [y]; PI,USI --~ cons [['LUS; maplist [cdr [y]; X[[z]; diff[car [z]; x]/I]; eq[car [y]; TIMES] --+ cons[PLUS; maplistledr[y]; X[[zl; cons [TIMES; maplist[cdr [y]; X[[w]; ~--~eq [z; w] -÷ car [w]; 1l' ~ diff' [car [[w]; xlll]]]] The derivative of the allowed expression, as computed by this formula, is ,arlier } (PLUS, (TIMES, ONE, (I'IA;rs, X, A), Y), (TIMES, X, (PLUS, ONE, ZERO), Y), ! !i ( T I M E S , X, (PLUS, X, A), Z E R O ) ) ) Besides maplist, another useful function with functional arguments is search, which is defined as search Ix; p; f; u] --- [null Ix] --+ u; p[x] --~ f[x]; T -+ search [cdr [x]; p; f; u] a. R<'4ffe,~entalio~ of £'-P~'xpre.~,~wns b:q List Structure. A list structttre is a collection of computer words arranged as in figure la or lb. Each word of the list structure is represented by one of the subdivided rectangles in the figure. The left box of a rectangle represents the address field of the word and the right box represents the decrement field. An arrow from a box ~;o another rectangle means that the field corresponding to the box contains the location of the word corresponding to the other rectangle. It is permitted for a substructure to occur in more than one place in t~ list structm'e, as in figure lb. but it is no~ permitted for a sturcture to have cycles, as in figure le. An atomic symbol is represented in the computer by a list structure of special form called the association list of the symbol. The address field of the first word contains a special constant which enables the program to tell that this word represents an atomic symbol. We shall describe association lists in section 4b. The function ,search is used to search a list for an element that has the property p, and if such an element is found, f of that element is taken. If there is no such element, the function u of no argument is computed. i;iiii!iiii!i 4. The LISP P r o g r a m m i n g S y s t e m xpres, fl'0m ! x;l fill f~ as foi::: ~ls lrt- tl ted as i which i The LISP programming system is a system for using the IBM 704 computer to compute with symbolic information in the form of S-expressions. It has been or will be used for the following purposes: 1. Writing a compiler to compile LISP programs into machine language. 2. Writing a program t.o check proofs in a (:lass of formal logical systems. 3. Writing programs for fornml differentiation and integration. 4. Writing programs to realize various algorithms for generating proofs in predicate calculus. 5. Making certain engineering calculations whose results are formulas rather than numbers. 6. Programming the Advice Taker system. The basis of the system is a way of writing computer programs to evaluate S-functions. This will be described in the following sections. In addition to the facilities for describing S-functions, there are facilities for using S-flmctions in programs written as sequences of statements along the lines of FORTRAN (4) or AL(;OI; (5). These features will not be described in tiffs article. (o) (b) l?m. 2 An S-expression x that is not atomic is represented by a word, the address and decrement parts of which contain tile locations of the subexpressions ear[x] and edr[x], respectively. If we use the symbols A, B, ere, to denote the locations of the association list of these symbols, then tile S-expression ( ( A . B ) . ( C . ( E . F ) ) ) is represented by the list structure a of figure 2. Turning to the list. form of S-expressions, we see that, tile S-expression (A, (B, C), D), which is an abbreviation for ( A . ( ( B . ( C - N [ L ) ) . ( D NIL))), is represented by tile list structure of figure 2b. When a list structure is regarded as representing a list, we see that each term of the list occupies tile address part of a word, the decrement part, of which points to the word containing the next term, while the last word has N I L in its decrement. An expression that has a given subexpression occurring more than once can be represented in more than one w'~y. Whether the list structure for the subexpression is or is not repeated depends upon the history of the program. Whether or not a subexpression is repeated will make no Comnumications of the XCM i J~il;! ~ /" Zil 1.91 ..... difference in the results of a program as they appear outside the machine, although it will affect the time and storage requirements. For example, tt~e S-('xpressio~ ((A. B). (A. B)) can be represented by either the list. structure of figure 3a or 3b. {b) ~o} F~G. 3 Tile prohibition against circular list, structures is essentially a prohibition against an expression being a subexpression of itself. Such an expression could not exis~ ot~ paper in a world with our topology. Circular l:]st structures would have some advantages in the machine, for example, for representing recursive hmctions, but difficulties in printing them, and in certain other operations, make it seem advisable not to use them for the present. The advantages of list structures for the storage of symbolic expressions are: 1. The size and even the number of expressions with which the program will have to deal cannot be predicted in advance. Therefore, it is difficult to arrange blocks of storage of fixed length to contain them. 2. Registers can be put back on the free-storage list when they are no longer needed. Even one register returned to the list is of value, but if expressions are stored linearly, it is difficult to make use of blocks of registers of odd sizes that m a y become available. 3. An expression t h a t occurs as a subexpression of several expressions need be represented in storage only once. b. Association Lists. In the LIsP programming system we put more in the association list of a symbol than is required by the mathematical system described in the previous sections. In fact, any information that we desire to associate with the symbol nmy be put on the association list. This information m a y include: the print name, that is, the string of letters and digits which represents the symbol outside the machine; a numerical value if the symbol represents a number; another S-expression if the symbol, in some way, serves as a name for it; or the location of a routine if the symbol represents a functior~ for which there is a machine-language subroutine. All this implies that in the machine system there are more prirnitive entities than have been described in the: sections on the mathematical system. For the present, we shall only describe how print names are represented on association lists so that in reading or printing the program can establish a correspondence between information on punched cards, magr:Jetic tape or printed page and the list structure inside the machine. The association list. of the symbol DIFFEI{t~3NTIATt?]has a segment (ff the form shown in figure 4, Here lmame is a symbol that indicates that the struett~re for the. print 192 ( : o m m u n i c a t i o n s of the ACM ~mme of the symbol who~e asso(qaEi(m list this is hangs from the uext~ word , functions that correspoad i:,., g(e; W< the basic functions of L:~se, so that, mathematically, linear I,Ise includes L~st,. This turns out to be the mos~ convenient way of progr'~mming, i:n linear L:~se, the mor~, complicated manipulations. However, if the function> are to be represented by computer routines, Ltse is ess<~. tially faster. I 6. F l o w c h a r t s a n d R e c u r s } o n exi Since both the usual form of computer program m~d recursive function definitions are universal computationally, i~ is interesting to display the relation between them. The translation of recurs}re symbolic functions inlo compuier programs was the subject of the rest of this report, h~ ~his section we show how to go the o t t e r way, at least i~ prineipk.~. The state of the machine at any time during a compu,alion is gNen by the values of a number of variables, l,e~ ' these variables be combined into a vector ~. Consider a program block with one entrance and one exit. It (teti~,es and is essentiMly defined by a certain function f tha~ ? takes one machine configuration into another, that is, f has the form ~' = f(~). Let us call f lhe associated functio,~ of the program block. Now let a number of such })locks i, combined into a program by decision elements u that (h,. eide after each t)lock is completed which block will }~a~ entered next. Nevertheless, let the whole program ~siil} have one entrance and one exit. C o m m u n i c a t i o n s of the AC~! ih a[ p~ t c t, :i (: F~c,. 5 We give as an example the flowchart of figure 5. l,ei *~> describe the function r[~] that gives the transformati(m 0g ~{ the vector ~ between entrance and exit of the whole block. { 194 fn J~ ! i'airl:i 2,1~{I u~.l tO : eally, We shall (:tefi,~(~ il i,~ col~junction with lhe functions ~[f/]and t[,~l, which give the /rausformations thai ~ undergoesbetween the points S a,~d T respeclively and the exit. We h~ve most r[~] = [~rH[~I-' ~'[f,[~ll; T - ~ s[fd~]]l (Signal Corps), the U.S. Air Force (Office of Scientific Research, Air Research and Development Command), and the U.S. N a v y (Office of Naval Research)). The author also wishes to acknowledge the personal financiM support of the Alfred P. Sloa.n Foun(l't,tion.. Ill0r(, ~ti0ns~ )8sell. # Given a flowch'trt with a single, entrance and a single exit, it is easy ~(, write down the recursive function that gives the tr'msformation of tim state vector from entrance to exit in terms of the corresponding functions for the computation blocks and the predicates of tile branch points. In general, we proceed as follows. In figure 6, let ~ be an n-way branch point, and let fi, ... , f, be the computations leading to branch points fl~, f12, ' " , fl . . . . l~et 4) be the function that transforms between fl and the exit of the chart, and let (b~ , • " • , 4,~ be the corresponding functions for f l ~ , . . . , fl,~. We then write t~dreL ,nelly, lputei,!, in this } apu~a~ ;ider ~i, definil ~[~] = [p~[~J --, ¢,[t',[}]]; . . . ; p,,[}] ~ 0,,[f',,[}]]] f that! Acknowledgments s, fii~ii2 The inadequacy of tile X-notation for naming recursive t ion ,eks g~ functions was noticed by N. Rochester, and he discovered hat di) ~n alternative to the solution involving label which has been used here. The form of subroutine for cons which will permits its composition with other functions was invented, in connection with another programming system, by C. i Gerberiek and H. L. Gelernter, of I B M Corporation. The Lisp programming system was developed by a group 4; including R. Brayton, D. Edwards, P. Fox, L. Hodes, D. Luckham, K. MMing, J. M c C a r t h y , D. Park, S. Russell. The group was supported by the M.I.T. Computation !4¸ Center, and by the M.I.T. Research i,aboratory of EleeU'onics (which is supported in part by the U.S. Army ¢ f2 %_j #, ... %_j #2 FIG. 6 REFERENCES 1. J. McCARTHY, Programs with common sense, Paper presented at the Symposium on the Mechanization of Thought Processes, National Physical Laboratory, Teddington, England, Nov. 24-27, 1958. (Published in Proceedings of the Symposium by H. M. Stationery Office). 2. A. NEWELL AND J. C. SHAW, Programming the logic theory machine, Proc. Western Joint Computer Conference, Feb. 1957. 3. A. CmmcH, The Calculi of Lambda-Conversion (Princeton University Press, Princeton, N. J., 1941). 4. FORTRAN Programmer's Reference Manual, IBM Corporation, New York, Oct. 15, 1956. 5. A. J. PERLIS AND K. SAME, LSON, International algebraic language, Preliminary Report, Comm. Assoc. Comp. Mach., Dec. 1958. Symbol Manipulation by Threaded Lists* A. J. PEltLIS AND CHARLES THORNTON, Carnegie Institute of Technology, Pittsburgh, Pa. P a r t 1: T h e T h r e a d e d List Language 1. I n t r o d u c t i o n In the field variously called artificial intelligence, i! heuristic programming, a u t o m a t a theory, etc., many of ilii!¸ * The work was SUl)ported in part by the Off:iceof Naval Research under contract munber Nonr.-760 (18), Nr 04(,)-141 and by the U. S. Army Signal (~orps under e(mtraet number l)a 36-0398eq5081, File No. 0195-PH-58-91 (4461). bl0< the most interesting problems do not lend themselves readily to solutions formulated in the automatic programming systems now in wide use. Several new approaches to more adequate and natural programming systems have been made in the past few years. Notable among these are the list structure languages of the I P L family developed by Newell-Simon-Shaw [1] and LISP by M c C a r t h y [2]. They provide great flexibility for the construction of highly composed programs, and are able to represent and process systems of arbitrarily great complexity, subject C o m m u n i c a t i o n s of the AC1M i: i~;~~!il 1195