We are a sharing community. So please help us by uploading **1** new document or like us to download:

OR LIKE TO DOWNLOAD IMMEDIATELY

Structure and Interpretation of Computer Programs Second Edition Unofficial Texinfo Format 2.neilvandyke4

Harold Abelson and Gerald Jay Sussman with Julie Sussman foreword by Alan J. Perlis

c 1996 by The Massachusetts Institute of Technology Copyright Structure and Interpretation of Computer Programs second edition Harold Abelson and Gerald Jay Sussman with Julie Sussman foreword by Alan J. Perlis The MIT Press Cambridge, Massachusetts London, England McGraw-Hill Book Company New York, St. Louis, San Francisco Montreal, Toronto This book is one of a series of texts written by faculty of the Electrical Engineering and Computer Science Department at the Massachusetts Institute of Technology. It was edited and produced by The MIT Press under a joint production-distribution arrangement with the McGraw-Hill Book Company. Unofficial Texinfo Format 2.neilvandyke4 (January 10, 2007)

i

Short Contents Unofficial Texinfo Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 Foreword . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 Preface to the Second Edition . . . . . . . . . . . . . . . . . . . . . . . . . . 9 Preface to the First Edition . . . . . . . . . . . . . . . . . . . . . . . . . . 11 Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 1 Building Abstractions with Procedures . . . . . . . . . . . . . . . . 15 2 Building Abstractions with Data . . . . . . . . . . . . . . . . . . . . 75 3 Modularity, Objects, and State . . . . . . . . . . . . . . . . . . . . 183 4 Metalinguistic Abstraction . . . . . . . . . . . . . . . . . . . . . . . 299 5 Computing with Register Machines . . . . . . . . . . . . . . . . . 405 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503 Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 509

ii

Structure and Interpretation of Computer Programs, 2e

iii

Table of Contents Unofficial Texinfo Format . . . . . . . . . . . . . . . . . . . . . . 1 Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 Foreword . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 Preface to the Second Edition . . . . . . . . . . . . . . . . . 9 Preface to the First Edition . . . . . . . . . . . . . . . . . . 11 Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 1

Building Abstractions with Procedures . . . . . 15 1.1

The Elements of Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1.1 Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1.2 Naming and the Environment. . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1.3 Evaluating Combinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1.4 Compound Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1.5 The Substitution Model for Procedure Application . . . . . . . 1.1.6 Conditional Expressions and Predicates . . . . . . . . . . . . . . . . . 1.1.7 Example: Square Roots by Newton’s Method . . . . . . . . . . . . 1.1.8 Procedures as Black-Box Abstractions . . . . . . . . . . . . . . . . . . . 1.2 Procedures and the Processes They Generate . . . . . . . . . . . . . . . . . 1.2.1 Linear Recursion and Iteration . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2.2 Tree Recursion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2.3 Orders of Growth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2.4 Exponentiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2.5 Greatest Common Divisors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2.6 Example: Testing for Primality . . . . . . . . . . . . . . . . . . . . . . . . . 1.3 Formulating Abstractions with Higher-Order Procedures . . . . . . 1.3.1 Procedures as Arguments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.3.2 Constructing Procedures Using Lambda . . . . . . . . . . . . . . . . . . 1.3.3 Procedures as General Methods . . . . . . . . . . . . . . . . . . . . . . . . . 1.3.4 Procedures as Returned Values . . . . . . . . . . . . . . . . . . . . . . . . .

17 18 19 20 22 24 26 30 33 37 37 41 45 47 50 51 56 57 61 64 69

iv

2

Structure and Interpretation of Computer Programs, 2e

Building Abstractions with Data. . . . . . . . . . . 75 2.1

Introduction to Data Abstraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 2.1.1 Example: Arithmetic Operations for Rational Numbers . . . 78 2.1.2 Abstraction Barriers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 2.1.3 What Is Meant by Data? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 2.1.4 Extended Exercise: Interval Arithmetic . . . . . . . . . . . . . . . . . . 86 2.2 Hierarchical Data and the Closure Property . . . . . . . . . . . . . . . . . . 89 2.2.1 Representing Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 2.2.2 Hierarchical Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 2.2.3 Sequences as Conventional Interfaces . . . . . . . . . . . . . . . . . . . 102 2.2.4 Example: A Picture Language . . . . . . . . . . . . . . . . . . . . . . . . . 113 2.3 Symbolic Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123 2.3.1 Quotation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123 2.3.2 Example: Symbolic Differentiation . . . . . . . . . . . . . . . . . . . . . 126 2.3.3 Example: Representing Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . 131 2.3.4 Example: Huffman Encoding Trees . . . . . . . . . . . . . . . . . . . . . 138 2.4 Multiple Representations for Abstract Data . . . . . . . . . . . . . . . . . 145 2.4.1 Representations for Complex Numbers . . . . . . . . . . . . . . . . . 146 2.4.2 Tagged data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149 2.4.3 Data-Directed Programming and Additivity . . . . . . . . . . . . 153 2.5 Systems with Generic Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . 160 2.5.1 Generic Arithmetic Operations . . . . . . . . . . . . . . . . . . . . . . . . 161 2.5.2 Combining Data of Different Types . . . . . . . . . . . . . . . . . . . . 165 2.5.3 Example: Symbolic Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . 171

3

Modularity, Objects, and State . . . . . . . . . . . 183 3.1

Assignment and Local State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.1.1 Local State Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.1.2 The Benefits of Introducing Assignment . . . . . . . . . . . . . . . . 3.1.3 The Costs of Introducing Assignment . . . . . . . . . . . . . . . . . . 3.2 The Environment Model of Evaluation . . . . . . . . . . . . . . . . . . . . . . 3.2.1 The Rules for Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.2.2 Applying Simple Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.2.3 Frames as the Repository of Local State . . . . . . . . . . . . . . . . 3.2.4 Internal Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.3 Modeling with Mutable Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.3.1 Mutable List Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.3.2 Representing Queues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.3.3 Representing Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.3.4 A Simulator for Digital Circuits . . . . . . . . . . . . . . . . . . . . . . . . 3.3.5 Propagation of Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.4 Concurrency: Time Is of the Essence . . . . . . . . . . . . . . . . . . . . . . . 3.4.1 The Nature of Time in Concurrent Systems . . . . . . . . . . . . . 3.4.2 Mechanisms for Controlling Concurrency . . . . . . . . . . . . . . . 3.5 Streams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.5.1 Streams Are Delayed Lists. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.5.2 Infinite Streams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.5.3 Exploiting the Stream Paradigm . . . . . . . . . . . . . . . . . . . . . . .

184 184 189 192 198 199 201 204 208 210 211 219 224 229 239 249 250 254 264 265 272 279

v 3.5.4 Streams and Delayed Evaluation . . . . . . . . . . . . . . . . . . . . . . . 288 3.5.5 Modularity of Functional Programs and Modularity of Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294

4

Metalinguistic Abstraction . . . . . . . . . . . . . . . 299 4.1

The Metacircular Evaluator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.1.1 The Core of the Evaluator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.1.2 Representing Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.1.3 Evaluator Data Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.1.4 Running the Evaluator as a Program . . . . . . . . . . . . . . . . . . . 4.1.5 Data as Programs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.1.6 Internal Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.1.7 Separating Syntactic Analysis from Execution. . . . . . . . . . . 4.2 Variations on a Scheme – Lazy Evaluation . . . . . . . . . . . . . . . . . . 4.2.1 Normal Order and Applicative Order . . . . . . . . . . . . . . . . . . . 4.2.2 An Interpreter with Lazy Evaluation . . . . . . . . . . . . . . . . . . . 4.2.3 Streams as Lazy Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.3 Variations on a Scheme – Nondeterministic Computing . . . . . . . 4.3.1 Amb and Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.3.2 Examples of Nondeterministic Programs . . . . . . . . . . . . . . . . 4.3.3 Implementing the Amb Evaluator . . . . . . . . . . . . . . . . . . . . . . . 4.4 Logic Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.4.1 Deductive Information Retrieval . . . . . . . . . . . . . . . . . . . . . . . 4.4.2 How the Query System Works . . . . . . . . . . . . . . . . . . . . . . . . . 4.4.3 Is Logic Programming Mathematical Logic? . . . . . . . . . . . . 4.4.4 Implementing the Query System . . . . . . . . . . . . . . . . . . . . . . . 4.4.4.1 The Driver Loop and Instantiation . . . . . . . . . . . . . . . . 4.4.4.2 The Evaluator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.4.4.3 Finding Assertions by Pattern Matching . . . . . . . . . . . 4.4.4.4 Rules and Unification. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.4.4.5 Maintaining the Data Base . . . . . . . . . . . . . . . . . . . . . . . 4.4.4.6 Stream Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.4.4.7 Query Syntax Procedures . . . . . . . . . . . . . . . . . . . . . . . . . 4.4.4.8 Frames and Bindings . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

5

301 302 306 313 316 319 322 327 331 332 333 340 342 344 347 353 363 366 375 381 386 386 387 390 392 395 397 398 401

Computing with Register Machines . . . . . . . 405 5.1

Designing Register Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.1.1 A Language for Describing Register Machines . . . . . . . . . . . 5.1.2 Abstraction in Machine Design . . . . . . . . . . . . . . . . . . . . . . . . 5.1.3 Subroutines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.1.4 Using a Stack to Implement Recursion . . . . . . . . . . . . . . . . . 5.1.5 Instruction Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.2 A Register-Machine Simulator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.2.1 The Machine Model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.2.2 The Assembler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.2.3 Generating Execution Procedures for Instructions . . . . . . . 5.2.4 Monitoring Machine Performance . . . . . . . . . . . . . . . . . . . . . . 5.3 Storage Allocation and Garbage Collection . . . . . . . . . . . . . . . . . .

406 408 412 414 418 423 423 425 428 431 437 440

vi

Structure and Interpretation of Computer Programs, 2e 5.3.1 Memory as Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.3.2 Maintaining the Illusion of Infinite Memory . . . . . . . . . . . . . 5.4 The Explicit-Control Evaluator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.4.1 The Core of the Explicit-Control Evaluator . . . . . . . . . . . . . 5.4.2 Sequence Evaluation and Tail Recursion . . . . . . . . . . . . . . . . 5.4.3 Conditionals, Assignments, and Definitions . . . . . . . . . . . . . 5.4.4 Running the Evaluator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.5 Compilation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.5.1 Structure of the Compiler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.5.2 Compiling Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.5.3 Compiling Combinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.5.4 Combining Instruction Sequences . . . . . . . . . . . . . . . . . . . . . . 5.5.5 An Example of Compiled Code . . . . . . . . . . . . . . . . . . . . . . . . 5.5.6 Lexical Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.5.7 Interfacing Compiled Code to the Evaluator . . . . . . . . . . . .

440 445 450 451 456 458 460 465 468 472 477 482 485 493 496

References. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503 Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 509

Unofficial Texinfo Format

1

Unofficial Texinfo Format This is the second edition SICP book, from Unofficial Texinfo Format. You are probably reading it in an Info hypertext browser, such as the Info mode of Emacs. You might alternatively be reading it TEX-formatted on your screen or printer, though that would be silly. And, if printed, expensive. The freely-distributed official HTML-and-GIF format was first converted personually to Unofficial Texinfo Format (UTF) version 1 by Lyssa Ayth during a long Emacs lovefest weekend in April, 2001. The UTF is easier to search than the HTML format. It is also much more accessible to people running on modest computers, such as donated ’386-based PCs. A 386 can, in theory, run Linux, Emacs, and a Scheme interpreter simultaneously, but most 386s probably can’t also run both Netscape and the necessary X Window System without prematurely introducing budding young underfunded hackers to the concept of thrashing. UTF can also fit uncompressed on a 1.44MB floppy diskette, which may come in handy for installing UTF on PCs that do not have Internet or LAN access. The Texinfo conversion has been a straight transliteration, to the extent possible. Like the TEX-to-HTML conversion, this was not without some introduction of breakage. In the case of Unofficial Texinfo Format, figures have suffered an amateurish resurrection of the lost art of ASCII art. Also, it’s quite possible that some errors of ambiguity were introduced during the conversion of some of the copious superscripts (‘^’) and subscripts (‘ ’). Divining which has been left as an exercise to the reader. But at least we don’t put our brave astronauts at risk by encoding the greater-than-or-equal symbol as >. If you modify ‘sicp.texi’ to correct errors or improve the ASCII art, then update the @set utfversion 2.neilvandyke4 line to reflect your delta. For example, if you started with Lytha’s version 1, and your name is Bob, then you could name your successive versions 1.bob1, 1.bob2, . . . 1.bobn . Also update utfversiondate. If you want to distribute your version on the Web, then embedding the string “sicp.texi” somewhere in the file or Web page will make it easier for people to find with Web search engines. It is believed that the Unofficial Texinfo Format is in keeping with the spirit of the graciously freely-distributed HTML version. But you never know when someone’s armada of lawyers might need something to do, and get their shorts all in a knot over some benign little thing, so think twice before you use your full name or distribute Info, DVI, PostScript, or PDF formats that might embed your account or machine name. Peath, Lytha Ayth Addendum: See also the SICP video lectures by Abelson and Sussman: http://www.swiss.ai.mit.edu/classes/6.001/abelson-sussman-lectures/

2

Structure and Interpretation of Computer Programs, 2e

Dedication

3

Dedication This book is dedicated, in respect and admiration, to the spirit that lives in the computer. “I think that it’s extraordinarily important that we in computer science keep fun in computing. When it started out, it was an awful lot of fun. Of course, the paying customers got shafted every now and then, and after a while we began to take their complaints seriously. We began to feel as if we really were responsible for the successful, error-free perfect use of these machines. I don’t think we are. I think we’re responsible for stretching them, setting them off in new directions, and keeping fun in the house. I hope the field of computer science never loses its sense of fun. Above all, I hope we don’t become missionaries. Don’t feel as if you’re Bible salesmen. The world has too many of those already. What you know about computing other people will learn. Don’t feel as if the key to successful computing is only in your hands. What’s in your hands, I think and hope, is intelligence: the ability to see the machine as more than when you were first led up to it, that you can make it more.” —Alan J. Perlis (April 1, 1922 February 7, 1990)

4

Structure and Interpretation of Computer Programs, 2e

Foreword

5

Foreword Educators, generals, dieticians, psychologists, and parents program. Armies, students, and some societies are programmed. An assault on large problems employs a succession of programs, most of which spring into existence en route. These programs are rife with issues that appear to be particular to the problem at hand. To appreciate programming as an intellectual activity in its own right you must turn to computer programming; you must read and write computer programs—many of them. It doesn’t matter much what the programs are about or what applications they serve. What does matter is how well they perform and how smoothly they fit with other programs in the creation of still greater programs. The programmer must seek both perfection of part and adequacy of collection. In this book the use of “program” is focused on the creation, execution, and study of programs written in a dialect of Lisp for execution on a digital computer. Using Lisp we restrict or limit not what we may program, but only the notation for our program descriptions. Our traffic with the subject matter of this book involves us with three foci of phenomena: the human mind, collections of computer programs, and the computer. Every computer program is a model, hatched in the mind, of a real or mental process. These processes, arising from human experience and thought, are huge in number, intricate in detail, and at any time only partially understood. They are modeled to our permanent satisfaction rarely by our computer programs. Thus even though our programs are carefully handcrafted discrete collections of symbols, mosaics of interlocking functions, they continually evolve: we change them as our perception of the model deepens, enlarges, generalizes until the model ultimately attains a metastable place within still another model with which we struggle. The source of the exhilaration associated with computer programming is the continual unfolding within the mind and on the computer of mechanisms expressed as programs and the explosion of perception they generate. If art interprets our dreams, the computer executes them in the guise of programs! For all its power, the computer is a harsh taskmaster. Its programs must be correct, and what we wish to say must be said accurately in every detail. As in every other symbolic activity, we become convinced of program truth through argument. Lisp itself can be assigned a semantics (another model, by the way), and if a program’s function can be specified, say, in the predicate calculus, the proof methods of logic can be used to make an acceptable correctness argument. Unfortunately, as programs get large and complicated, as they almost always do, the adequacy, consistency, and correctness of the specifications themselves become open to doubt, so that complete formal arguments of correctness seldom accompany large programs. Since large programs grow from small ones, it is crucial that we develop an arsenal of standard program structures of whose correctness we have become sure—we call them idioms—and learn to combine them into larger structures using organizational techniques of proven value. These techniques are treated at length in this book, and understanding them is essential to participation in the Promethean enterprise called programming. More than anything else, the uncovering and mastery of powerful organizational techniques accelerates our ability to create large, significant programs. Conversely, since writing large programs is very taxing, we are stimulated to invent new methods of reducing the mass of function and detail to be fitted into large programs. Unlike programs, computers must obey the laws of physics. If they wish to perform rapidly—a few nanoseconds per state change—they must transmit electrons only small

6

Structure and Interpretation of Computer Programs, 2e

distances (at most 11 over 2 feet). The heat generated by the huge number of devices so concentrated in space has to be removed. An exquisite engineering art has been developed balancing between multiplicity of function and density of devices. In any event, hardware always operates at a level more primitive than that at which we care to program. The processes that transform our Lisp programs to “machine” programs are themselves abstract models which we program. Their study and creation give a great deal of insight into the organizational programs associated with programming arbitrary models. Of course the computer itself can be so modeled. Think of it: the behavior of the smallest physical switching element is modeled by quantum mechanics described by differential equations whose detailed behavior is captured by numerical approximations represented in computer programs executing on computers composed of . . . ! It is not merely a matter of tactical convenience to separately identify the three foci. Even though, as they say, it’s all in the head, this logical separation induces an acceleration of symbolic traffic between these foci whose richness, vitality, and potential is exceeded in human experience only by the evolution of life itself. At best, relationships between the foci are metastable. The computers are never large enough or fast enough. Each breakthrough in hardware technology leads to more massive programming enterprises, new organizational principles, and an enrichment of abstract models. Every reader should ask himself periodically “Toward what end, toward what end?”—but do not ask it too often lest you pass up the fun of programming for the constipation of bittersweet philosophy. Among the programs we write, some (but never enough) perform a precise mathematical function such as sorting or finding the maximum of a sequence of numbers, determining primality, or finding the square root. We call such programs algorithms, and a great deal is known of their optimal behavior, particularly with respect to the two important parameters of execution time and data storage requirements. A programmer should acquire good algorithms and idioms. Even though some programs resist precise specifications, it is the responsibility of the programmer to estimate, and always to attempt to improve, their performance. Lisp is a survivor, having been in use for about a quarter of a century. Among the active programming languages only Fortran has had a longer life. Both languages have supported the programming needs of important areas of application, Fortran for scientific and engineering computation and Lisp for artificial intelligence. These two areas continue to be important, and their programmers are so devoted to these two languages that Lisp and Fortran may well continue in active use for at least another quarter-century. Lisp changes. The Scheme dialect used in this text has evolved from the original Lisp and differs from the latter in several important ways, including static scoping for variable binding and permitting functions to yield functions as values. In its semantic structure Scheme is as closely akin to Algol 60 as to early Lisps. Algol 60, never to be an active language again, lives on in the genes of Scheme and Pascal. It would be difficult to find two languages that are the communicating coin of two more different cultures than those gathered around these two languages. Pascal is for building pyramids—imposing, breathtaking, static structures built by armies pushing heavy blocks into place. Lisp is for building organisms—imposing, breathtaking, dynamic structures built by squads fitting fluctuating myriads of simpler organisms into place. The organizing principles used are the same in both cases, except for one extraordinarily important difference: The discretionary exportable functionality entrusted to the individual Lisp programmer is more than an order of magnitude greater than

Foreword

7

that to be found within Pascal enterprises. Lisp programs inflate libraries with functions whose utility transcends the application that produced them. The list, Lisp’s native data structure, is largely responsible for such growth of utility. The simple structure and natural applicability of lists are reflected in functions that are amazingly nonidiosyncratic. In Pascal the plethora of declarable data structures induces a specialization within functions that inhibits and penalizes casual cooperation. It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures. As a result the pyramid must stand unchanged for a millennium; the organism must evolve or perish. To illustrate this difference, compare the treatment of material and exercises within this book with that in any first-course text using Pascal. Do not labor under the illusion that this is a text digestible at MIT only, peculiar to the breed found there. It is precisely what a serious book on programming Lisp must be, no matter who the student is or where it is used. Note that this is a text about programming, unlike most Lisp books, which are used as a preparation for work in artificial intelligence. After all, the critical programming concerns of software engineering and artificial intelligence tend to coalesce as the systems under investigation become larger. This explains why there is such growing interest in Lisp outside of artificial intelligence. As one would expect from its goals, artificial intelligence research generates many significant programming problems. In other programming cultures this spate of problems spawns new languages. Indeed, in any very large programming task a useful organizing principle is to control and isolate traffic within the task modules via the invention of language. These languages tend to become less primitive as one approaches the boundaries of the system where we humans interact most often. As a result, such systems contain complex languageprocessing functions replicated many times. Lisp has such a simple syntax and semantics that parsing can be treated as an elementary task. Thus parsing technology plays almost no role in Lisp programs, and the construction of language processors is rarely an impediment to the rate of growth and change of large Lisp systems. Finally, it is this very simplicity of syntax and semantics that is responsible for the burden and freedom borne by all Lisp programmers. No Lisp program of any size beyond a few lines can be written without being saturated with discretionary functions. Invent and fit; have fits and reinvent! We toast the Lisp programmer who pens his thoughts within nests of parentheses. Alan J. Perlis New Haven, Connecticut

8

Structure and Interpretation of Computer Programs, 2e

Preface to the Second Edition

9

Preface to the Second Edition Is it possible that software is not like anything else, that it is meant to be discarded: that the whole point is to always see it as a soap bubble? —Alan J. Perlis The material in this book has been the basis of MIT’s entry-level computer science subject since 1980. We had been teaching this material for four years when the first edition was published, and twelve more years have elapsed until the appearance of this second edition. We are pleased that our work has been widely adopted and incorporated into other texts. We have seen our students take the ideas and programs in this book and build them in as the core of new computer systems and languages. In literal realization of an ancient Talmudic pun, our students have become our builders. We are lucky to have such capable students and such accomplished builders. In preparing this edition, we have incorporated hundreds of clarifications suggested by our own teaching experience and the comments of colleagues at MIT and elsewhere. We have redesigned most of the major programming systems in the book, including the genericarithmetic system, the interpreters, the register-machine simulator, and the compiler; and we have rewritten all the program examples to ensure that any Scheme implementation conforming to the IEEE Scheme standard (IEEE 1990) will be able to run the code. This edition emphasizes several new themes. The most important of these is the central role played by different approaches to dealing with time in computational models: objects with state, concurrent programming, functional programming, lazy evaluation, and nondeterministic programming. We have included new sections on concurrency and nondeterminism, and we have tried to integrate this theme throughout the book. The first edition of the book closely followed the syllabus of our MIT one-semester subject. With all the new material in the second edition, it will not be possible to cover everything in a single semester, so the instructor will have to pick and choose. In our own teaching, we sometimes skip the section on logic programming (section Section 4.4 [4-4], page 363), we have students use the register-machine simulator but we do not cover its implementation (section Section 5.2 [5-2], page 423), and we give only a cursory overview of the compiler (section Section 5.5 [5-5], page 465). Even so, this is still an intense course. Some instructors may wish to cover only the first three or four chapters, leaving the other material for subsequent courses. The World-Wide-Web site http://www-mitpress.mit.edu/sicp/ provides support for users of this book. This includes programs from the book, sample programming assignments, supplementary materials, and downloadable implementations of the Scheme dialect of Lisp.

10

Structure and Interpretation of Computer Programs, 2e

Preface to the First Edition

11

Preface to the First Edition A computer is like a violin. You can imagine a novice trying first a phonograph and then a violin. The latter, he says, sounds terrible. That is the argument we have heard from our humanists and most of our computer scientists. Computer programs are good, they say, for particular purposes, but they aren’t flexible. Neither is a violin, or a typewriter, until you learn how to use it. —Marvin Minsky, “Why Programming Is a Good Medium for Expressing Poorly-Understood and Sloppily-Formulated Ideas” “The Structure and Interpretation of Computer Programs” is the entry-level subject in computer science at the Massachusetts Institute of Technology. It is required of all students at MIT who major in electrical engineering or in computer science, as one-fourth of the “common core curriculum,” which also includes two subjects on circuits and linear systems and a subject on the design of digital systems. We have been involved in the development of this subject since 1978, and we have taught this material in its present form since the fall of 1980 to between 600 and 700 students each year. Most of these students have had little or no prior formal training in computation, although many have played with computers a bit and a few have had extensive programming or hardware-design experience. Our design of this introductory computer-science subject reflects two major concerns. First, we want to establish the idea that a computer language is not just a way of getting a computer to perform operations but rather that it is a novel formal medium for expressing ideas about methodology. Thus, programs must be written for people to read, and only incidentally for machines to execute. Second, we believe that the essential material to be addressed by a subject at this level is not the syntax of particular programming-language constructs, nor clever algorithms for computing particular functions efficiently, nor even the mathematical analysis of algorithms and the foundations of computing, but rather the techniques used to control the intellectual complexity of large software systems. Our goal is that students who complete this subject should have a good feel for the elements of style and the aesthetics of programming. They should have command of the major techniques for controlling complexity in a large system. They should be capable of reading a 50-page-long program, if it is written in an exemplary style. They should know what not to read, and what they need not understand at any moment. They should feel secure about modifying a program, retaining the spirit and style of the original author. These skills are by no means unique to computer programming. The techniques we teach and draw upon are common to all of engineering design. We control complexity by building abstractions that hide details when appropriate. We control complexity by establishing conventional interfaces that enable us to construct systems by combining standard, wellunderstood pieces in a “mix and match” way. We control complexity by establishing new languages for describing a design, each of which emphasizes particular aspects of the design and deemphasizes others. Underlying our approach to this subject is our conviction that “computer science” is not a science and that its significance has little to do with computers. The computer revolution is a revolution in the way we think and in the way we express what we think. The essence of this change is the emergence of what might best be called procedural epistemology— the study of the structure of knowledge from an imperative point of view, as opposed to

12

Structure and Interpretation of Computer Programs, 2e

the more declarative point of view taken by classical mathematical subjects. Mathematics provides a framework for dealing precisely with notions of “what is.” Computation provides a framework for dealing precisely with notions of “how to.” In teaching our material we use a dialect of the programming language Lisp. We never formally teach the language, because we don’t have to. We just use it, and students pick it up in a few days. This is one great advantage of Lisp-like languages: They have very few ways of forming compound expressions, and almost no syntactic structure. All of the formal properties can be covered in an hour, like the rules of chess. After a short time we forget about syntactic details of the language (because there are none) and get on with the real issues—figuring out what we want to compute, how we will decompose problems into manageable parts, and how we will work on the parts. Another advantage of Lisp is that it supports (but does not enforce) more of the large-scale strategies for modular decomposition of programs than any other language we know. We can make procedural and data abstractions, we can use higher-order functions to capture common patterns of usage, we can model local state using assignment and data mutation, we can link parts of a program with streams and delayed evaluation, and we can easily implement embedded languages. All of this is embedded in an interactive environment with excellent support for incremental program design, construction, testing, and debugging. We thank all the generations of Lisp wizards, starting with John McCarthy, who have fashioned a fine tool of unprecedented power and elegance. Scheme, the dialect of Lisp that we use, is an attempt to bring together the power and elegance of Lisp and Algol. From Lisp we take the metalinguistic power that derives from the simple syntax, the uniform representation of programs as data objects, and the garbagecollected heap-allocated data. From Algol we take lexical scoping and block structure, which are gifts from the pioneers of programming-language design who were on the Algol committee. We wish to cite John Reynolds and Peter Landin for their insights into the relationship of Church’s [lambda] calculus to the structure of programming languages. We also recognize our debt to the mathematicians who scouted out this territory decades before computers appeared on the scene. These pioneers include Alonzo Church, Barkley Rosser, Stephen Kleene, and Haskell Curry.

Acknowledgements

13

Acknowledgements We would like to thank the many people who have helped us develop this book and this curriculum. Our subject is a clear intellectual descendant of “6.231,” a wonderful subject on programming linguistics and the [lambda] calculus taught at MIT in the late 1960s by Jack Wozencraft and Arthur Evans, Jr. We owe a great debt to Robert Fano, who reorganized MIT’s introductory curriculum in electrical engineering and computer science to emphasize the principles of engineering design. He led us in starting out on this enterprise and wrote the first set of subject notes from which this book evolved. Much of the style and aesthetics of programming that we try to teach were developed in conjunction with Guy Lewis Steele Jr., who collaborated with Gerald Jay Sussman in the initial development of the Scheme language. In addition, David Turner, Peter Henderson, Dan Friedman, David Wise, and Will Clinger have taught us many of the techniques of the functional programming community that appear in this book. Joel Moses taught us about structuring large systems. His experience with the Macsyma system for symbolic computation provided the insight that one should avoid complexities of control and concentrate on organizing the data to reflect the real structure of the world being modeled. Marvin Minsky and Seymour Papert formed many of our attitudes about programming and its place in our intellectual lives. To them we owe the understanding that computation provides a means of expression for exploring ideas that would otherwise be too complex to deal with precisely. They emphasize that a student’s ability to write and modify programs provides a powerful medium in which exploring becomes a natural activity. We also strongly agree with Alan Perlis that programming is lots of fun and we had better be careful to support the joy of programming. Part of this joy derives from observing great masters at work. We are fortunate to have been apprentice programmers at the feet of Bill Gosper and Richard Greenblatt. It is difficult to identify all the people who have contributed to the development of our curriculum. We thank all the lecturers, recitation instructors, and tutors who have worked with us over the past fifteen years and put in many extra hours on our subject, especially Bill Siebert, Albert Meyer, Joe Stoy, Randy Davis, Louis Braida, Eric Grimson, Rod Brooks, Lynn Stein, and Peter Szolovits. We would like to specially acknowledge the outstanding teaching contributions of Franklyn Turbak, now at Wellesley; his work in undergraduate instruction set a standard that we can all aspire to. We are grateful to Jerry Saltzer and Jim Miller for helping us grapple with the mysteries of concurrency, and to Peter Szolovits and David McAllester for their contributions to the exposition of nondeterministic evaluation in Chapter 4 [Chapter 4], page 299. Many people have put in significant effort presenting this material at other universities. Some of the people we have worked closely with are Jacob Katzenelson at the Technion, Hardy Mayer at the University of California at Irvine, Joe Stoy at Oxford, Elisha Sacks at Purdue, and Jan Komorowski at the Norwegian University of Science and Technology. We are exceptionally proud of our colleagues who have received major teaching awards for

14

Structure and Interpretation of Computer Programs, 2e

their adaptations of this subject at other universities, including Kenneth Yip at Yale, Brian Harvey at the University of California at Berkeley, and Dan Huttenlocher at Cornell. Al Moy´e arranged for us to teach this material to engineers at Hewlett-Packard, and for the production of videotapes of these lectures. We would like to thank the talented instructors—in particular Jim Miller, Bill Siebert, and Mike Eisenberg—who have designed continuing education courses incorporating these tapes and taught them at universities and industry all over the world. Many educators in other countries have put in significant work translating the first edition. Michel Briand, Pierre Chamard, and Andr´e Pic produced a French edition; Susanne Daniels-Herold produced a German edition; and Fumio Motoyoshi produced a Japanese edition. We do not know who produced the Chinese edition, but we consider it an honor to have been selected as the subject of an “unauthorized” translation. It is hard to enumerate all the people who have made technical contributions to the development of the Scheme systems we use for instructional purposes. In addition to Guy Steele, principal wizards have included Chris Hanson, Joe Bowbeer, Jim Miller, Guillermo Rozas, and Stephen Adams. Others who have put in significant time are Richard Stallman, Alan Bawden, Kent Pitman, Jon Taft, Neil Mayle, John Lamping, Gwyn Osnos, Tracy Larrabee, George Carrette, Soma Chaudhuri, Bill Chiarchiaro, Steven Kirsch, Leigh Klotz, Wayne Noss, Todd Cass, Patrick O’Donnell, Kevin Theobald, Daniel Weise, Kenneth Sinclair, Anthony Courtemanche, Henry M. Wu, Andrew Berlin, and Ruth Shyu. Beyond the MIT implementation, we would like to thank the many people who worked on the IEEE Scheme standard, including William Clinger and Jonathan Rees, who edited the R^4RS, and Chris Haynes, David Bartley, Chris Hanson, and Jim Miller, who prepared the IEEE standard. Dan Friedman has been a long-time leader of the Scheme community. The community’s broader work goes beyond issues of language design to encompass significant educational innovations, such as the high-school curriculum based on EdScheme by Schemer’s Inc., and the wonderful books by Mike Eisenberg and by Brian Harvey and Matthew Wright. We appreciate the work of those who contributed to making this a real book, especially Terry Ehling, Larry Cohen, and Paul Bethge at the MIT Press. Ella Mazel found the wonderful cover image. For the second edition we are particularly grateful to Bernard and Ella Mazel for help with the book design, and to David Jones, TEX wizard extraordinaire. We also are indebted to those readers who made penetrating comments on the new draft: Jacob Katzenelson, Hardy Mayer, Jim Miller, and especially Brian Harvey, who did unto this book as Julie did unto his book Simply Scheme. Finally, we would like to acknowledge the support of the organizations that have encouraged this work over the years, including suppport from Hewlett-Packard, made possible by Ira Goldstein and Joel Birnbaum, and support from DARPA, made possible by Bob Kahn.

Chapter 1: Building Abstractions with Procedures

15

1 Building Abstractions with Procedures The acts of the mind, wherein it exerts its power over simple ideas, are chiefly these three: 1. Combining several simple ideas into one compound one, and thus all complex ideas are made. 2. The second is bringing two ideas, whether simple or complex, together, and setting them by one another so as to take a view of them at once, without uniting them into one, by which it gets all its ideas of relations. 3. The third is separating them from all other ideas that accompany them in their real existence: this is called abstraction, and thus all its general ideas are made. —John Locke, An Essay Concerning Human Understanding (1690) We are about to study the idea of a computational process. Computational processes are abstract beings that inhabit computers. As they evolve, processes manipulate other abstract things called data. The evolution of a process is directed by a pattern of rules called a program. People create programs to direct processes. In effect, we conjure the spirits of the computer with our spells. A computational process is indeed much like a sorcerer’s idea of a spirit. It cannot be seen or touched. It is not composed of matter at all. However, it is very real. It can perform intellectual work. It can answer questions. It can affect the world by disbursing money at a bank or by controlling a robot arm in a factory. The programs we use to conjure processes are like a sorcerer’s spells. They are carefully composed from symbolic expressions in arcane and esoteric programming languages that prescribe the tasks we want our processes to perform. A computational process, in a correctly working computer, executes programs precisely and accurately. Thus, like the sorcerer’s apprentice, novice programmers must learn to understand and to anticipate the consequences of their conjuring. Even small errors (usually called bugs or glitches) in programs can have complex and unanticipated consequences. Fortunately, learning to program is considerably less dangerous than learning sorcery, because the spirits we deal with are conveniently contained in a secure way. Real-world programming, however, requires care, expertise, and wisdom. A small bug in a computeraided design program, for example, can lead to the catastrophic collapse of an airplane or a dam or the self-destruction of an industrial robot. Master software engineers have the ability to organize programs so that they can be reasonably sure that the resulting processes will perform the tasks intended. They can visualize the behavior of their systems in advance. They know how to structure programs so that unanticipated problems do not lead to catastrophic consequences, and when problems do arise, they can debug their programs. Well-designed computational systems, like welldesigned automobiles or nuclear reactors, are designed in a modular manner, so that the parts can be constructed, replaced, and debugged separately.

Programming in Lisp We need an appropriate language for describing processes, and we will use for this purpose the programming language Lisp. Just as our everyday thoughts are usually expressed in our natural language (such as English, French, or Japanese), and descriptions of quantitative phenomena are expressed with mathematical notations, our procedural thoughts will

16

Structure and Interpretation of Computer Programs, 2e

be expressed in Lisp. Lisp was invented in the late 1950s as a formalism for reasoning about the use of certain kinds of logical expressions, called recursion equations, as a model for computation. The language was conceived by John McCarthy and is based on his paper “Recursive Functions of Symbolic Expressions and Their Computation by Machine” (McCarthy 1960). Despite its inception as a mathematical formalism, Lisp is a practical programming language. A Lisp interpreter is a machine that carries out processes described in the Lisp language. The first Lisp interpreter was implemented by McCarthy with the help of colleagues and students in the Artificial Intelligence Group of the MIT Research Laboratory of Electronics and in the MIT Computation Center.1 Lisp, whose name is an acronym for LISt Processing, was designed to provide symbol-manipulating capabilities for attacking programming problems such as the symbolic differentiation and integration of algebraic expressions. It included for this purpose new data objects known as atoms and lists, which most strikingly set it apart from all other languages of the period. Lisp was not the product of a concerted design effort. Instead, it evolved informally in an experimental manner in response to users’ needs and to pragmatic implementation considerations. Lisp’s informal evolution has continued through the years, and the community of Lisp users has traditionally resisted attempts to promulgate any “official” definition of the language. This evolution, together with the flexibility and elegance of the initial conception, has enabled Lisp, which is the second oldest language in widespread use today (only Fortran is older), to continually adapt to encompass the most modern ideas about program design. Thus, Lisp is by now a family of dialects, which, while sharing most of the original features, may differ from one another in significant ways. The dialect of Lisp used in this book is called Scheme.2 Because of its experimental character and its emphasis on symbol manipulation, Lisp was at first very inefficient for numerical computations, at least in comparison with Fortran. Over the years, however, Lisp compilers have been developed that translate programs into machine code that can perform numerical computations reasonably efficiently. And for special applications, Lisp has been used with great effectiveness.3 Although Lisp has not yet overcome its old reputation as hopelessly inefficient, Lisp is now used in many applications 1 2

3

The Lisp 1 Programmer’s Manual appeared in 1960, and the Lisp 1.5 Programmer’s Manual (McCarthy 1965) was published in 1962. The early history of Lisp is described in McCarthy 1978. The two dialects in which most major Lisp programs of the 1970s were written are MacLisp (Moon 1978; Pitman 1983), developed at the MIT Project MAC, and Interlisp (Teitelman 1974), developed at Bolt Beranek and Newman Inc. and the Xerox Palo Alto Research Center. Portable Standard Lisp (Hearn 1969; Griss 1981) was a Lisp dialect designed to be easily portable between different machines. MacLisp spawned a number of subdialects, such as Franz Lisp, which was developed at the University of California at Berkeley, and Zetalisp (Moon 1981), which was based on a special-purpose processor designed at the MIT Artificial Intelligence Laboratory to run Lisp very efficiently. The Lisp dialect used in this book, called Scheme (Steele 1975), was invented in 1975 by Guy Lewis Steele Jr. and Gerald Jay Sussman of the MIT Artificial Intelligence Laboratory and later reimplemented for instructional use at MIT. Scheme became an IEEE standard in 1990 (IEEE 1990). The Common Lisp dialect (Steele 1982, Steele 1990) was developed by the Lisp community to combine features from the earlier Lisp dialects to make an industrial standard for Lisp. Common Lisp became an ANSI standard in 1994 (ANSI 1994). One such special application was a breakthrough computation of scientific importance—an integration of the motion of the Solar System that extended previous results by nearly two orders of magnitude, and demonstrated that the dynamics of the Solar System is chaotic. This computation was made possible by new integration algorithms, a special-purpose compiler, and a special-purpose computer all implemented with the aid of software tools written in Lisp (Abelson et al. 1992; Sussman and Wisdom 1992).

Chapter 1: Building Abstractions with Procedures

17

where efficiency is not the central concern. For example, Lisp has become a language of choice for operating-system shell languages and for extension languages for editors and computer-aided design systems. If Lisp is not a mainstream language, why are we using it as the framework for our discussion of programming? Because the language possesses unique features that make it an excellent medium for studying important programming constructs and data structures and for relating them to the linguistic features that support them. The most significant of these features is the fact that Lisp descriptions of processes, called procedures, can themselves be represented and manipulated as Lisp data. The importance of this is that there are powerful program-design techniques that rely on the ability to blur the traditional distinction between “passive” data and “active” processes. As we shall discover, Lisp’s flexibility in handling procedures as data makes it one of the most convenient languages in existence for exploring these techniques. The ability to represent procedures as data also makes Lisp an excellent language for writing programs that must manipulate other programs as data, such as the interpreters and compilers that support computer languages. Above and beyond these considerations, programming in Lisp is great fun.

1.1 The Elements of Programming A powerful programming language is more than just a means for instructing a computer to perform tasks. The language also serves as a framework within which we organize our ideas about processes. Thus, when we describe a language, we should pay particular attention to the means that the language provides for combining simple ideas to form more complex ideas. Every powerful language has three mechanisms for accomplishing this: primitive expressions which represent the simplest entities the language is concerned with, means of combination by which compound elements are built from simpler ones, and means of abstraction by which compound elements can be named and manipulated as units. In programming, we deal with two kinds of elements: procedures and data. (Later we will discover that they are really not so distinct.) Informally, data is “stuff” that we want to manipulate, and procedures are descriptions of the rules for manipulating the data. Thus, any powerful programming language should be able to describe primitive data and primitive procedures and should have methods for combining and abstracting procedures and data. In this chapter we will deal only with simple numerical data so that we can focus on the rules for building procedures.4 In later chapters we will see that these same rules allow us to build procedures to manipulate compound data as well. 4

The characterization of numbers as “simple data” is a barefaced bluff. In fact, the treatment of numbers is one of the trickiest and most confusing aspects of any programming language. Some typical issues involved are these: Some computer systems distinguish integers, such as 2, from real numbers, such as 2.71. Is the real number 2.00 different from the integer 2? Are the arithmetic operations used for integers the same as the operations used for real numbers? Does 6 divided by 2 produce 3, or 3.0? How large a number can we represent? How many decimal places of accuracy can we represent? Is the range of integers the same as the range of real numbers? Above and beyond these questions, of course, lies a collection of issues concerning roundoff and truncation errors – the entire science of numerical analysis. Since our focus in this book is on large-scale program design rather than on numerical techniques, we are

18

Structure and Interpretation of Computer Programs, 2e

1.1.1 Expressions One easy way to get started at programming is to examine some typical interactions with an interpreter for the Scheme dialect of Lisp. Imagine that you are sitting at a computer terminal. You type an expression, and the interpreter responds by displaying the result of its evaluating that expression. One kind of primitive expression you might type is a number. (More precisely, the expression that you type consists of the numerals that represent the number in base 10.) If you present Lisp with a number 486 the interpreter will respond by printing5 486 Expressions representing numbers may be combined with an expression representing a primitive procedure (such as + or *) to form a compound expression that represents the application of the procedure to those numbers. For example: (+ 137 349) 486 (- 1000 334) 666 (* 5 99) 495 (/ 10 5) 2 (+ 2.7 10) 12.7 Expressions such as these, formed by delimiting a list of expressions within parentheses in order to denote procedure application, are called combinations. The leftmost element in the list is called the operator, and the other elements are called operands. The value of a combination is obtained by applying the procedure specified by the operator to the arguments that are the values of the operands. The convention of placing the operator to the left of the operands is known as prefix notation, and it may be somewhat confusing at first because it departs significantly from the customary mathematical convention. Prefix notation has several advantages, however. One of them is that it can accommodate procedures that may take an arbitrary number of arguments, as in the following examples: (+ 21 35 12 7)

5

going to ignore these problems. The numerical examples in this chapter will exhibit the usual roundoff behavior that one observes when using arithmetic operations that preserve a limited number of decimal places of accuracy in noninteger operations. Throughout this book, when we wish to emphasize the distinction between the input typed by the user and the response printed by the interpreter, we will show the latter in slanted characters.

Chapter 1: Building Abstractions with Procedures

19

75 (* 25 4 12) 1200 No ambiguity can arise, because the operator is always the leftmost element and the entire combination is delimited by the parentheses. A second advantage of prefix notation is that it extends in a straightforward way to allow combinations to be nested, that is, to have combinations whose elements are themselves combinations: (+ (* 3 5) (- 10 6)) 19 There is no limit (in principle) to the depth of such nesting and to the overall complexity of the expressions that the Lisp interpreter can evaluate. It is we humans who get confused by still relatively simple expressions such as (+ (* 3 (+ (* 2 4) (+ 3 5))) (+ (- 10 7) 6)) which the interpreter would readily evaluate to be 57. We can help ourselves by writing such an expression in the form (+ (* 3 (+ (* 2 4) (+ 3 5))) (+ (- 10 7) 6)) following a formatting convention known as pretty-printing, in which each long combination is written so that the operands are aligned vertically. The resulting indentations display clearly the structure of the expression.6 Even with complex expressions, the interpreter always operates in the same basic cycle: It reads an expression from the terminal, evaluates the expression, and prints the result. This mode of operation is often expressed by saying that the interpreter runs in a readeval-print loop. Observe in particular that it is not necessary to explicitly instruct the interpreter to print the value of the expression.7

1.1.2 Naming and the Environment A critical aspect of a programming language is the means it provides for using names to refer to computational objects. We say that the name identifies a variable whose value is the object. In the Scheme dialect of Lisp, we name things with define. Typing (define size 2) 6

7

Lisp systems typically provide features to aid the user in formatting expressions. Two especially useful features are one that automatically indents to the proper pretty-print position whenever a new line is started and one that highlights the matching left parenthesis whenever a right parenthesis is typed. Lisp obeys the convention that every expression has a value. This convention, together with the old reputation of Lisp as an inefficient language, is the source of the quip by Alan Perlis (paraphrasing Oscar Wilde) that “Lisp programmers know the value of everything but the cost of nothing.”

20

Structure and Interpretation of Computer Programs, 2e

causes the interpreter to associate the value 2 with the name size.8 Once the name size has been associated with the number 2, we can refer to the value 2 by name: size 2 (* 5 size) 10 Here are further examples of the use of define: (define pi 3.14159) (define radius 10) (* pi (* radius radius)) 314.159 (define circumference (* 2 pi radius)) circumference 62.8318 Define is our language’s simplest means of abstraction, for it allows us to use simple names to refer to the results of compound operations, such as the circumference computed above. In general, computational objects may have very complex structures, and it would be extremely inconvenient to have to remember and repeat their details each time we want to use them. Indeed, complex programs are constructed by building, step by step, computational objects of increasing complexity. The interpreter makes this step-by-step program construction particularly convenient because name-object associations can be created incrementally in successive interactions. This feature encourages the incremental development and testing of programs and is largely responsible for the fact that a Lisp program usually consists of a large number of relatively simple procedures. It should be clear that the possibility of associating values with symbols and later retrieving them means that the interpreter must maintain some sort of memory that keeps track of the name-object pairs. This memory is called the environment (more precisely the global environment, since we will see later that a computation may involve a number of different environments).9

1.1.3 Evaluating Combinations One of our goals in this chapter is to isolate issues about thinking procedurally. As a case in point, let us consider that, in evaluating combinations, the interpreter is itself following a procedure. To evaluate a combination, do the following: 1. Evaluate the subexpressions of the combination. 8 9

In this book, we do not show the interpreter’s response to evaluating definitions, since this is highly implementation-dependent. Chapter 3 [Chapter 3], page 183 will show that this notion of environment is crucial, both for understanding how the interpreter works and for implementing interpreters.

Chapter 1: Building Abstractions with Procedures

21

2. Apply the procedure that is the value of the leftmost subexpression (the operator) to the arguments that are the values of the other subexpressions (the operands). Even this simple rule illustrates some important points about processes in general. First, observe that the first step dictates that in order to accomplish the evaluation process for a combination we must first perform the evaluation process on each element of the combination. Thus, the evaluation rule is recursive in nature; that is, it includes, as one of its steps, the need to invoke the rule itself.10 Notice how succinctly the idea of recursion can be used to express what, in the case of a deeply nested combination, would otherwise be viewed as a rather complicated process. For example, evaluating (* (+ 2 (* 4 6)) (+ 3 5 7)) requires that the evaluation rule be applied to four different combinations. We can obtain a picture of this process by representing the combination in the form of a tree, as shown in [Figure 1-1], page 21. Each combination is represented by a node with branches corresponding to the operator and the operands of the combination stemming from it. The terminal nodes (that is, nodes with no branches stemming from them) represent either operators or numbers. Viewing evaluation in terms of the tree, we can imagine that the values of the operands percolate upward, starting from the terminal nodes and then combining at higher and higher levels. In general, we shall see that recursion is a very powerful technique for dealing with hierarchical, treelike objects. In fact, the “percolate values upward” form of the evaluation rule is an example of a general kind of process known as tree accumulation. Figure 1.1: Tree representation, showing the value of each subcombination. 390 /|\____________ / | \ * 26 15 /|\ | / | \ // \\ + 2 24 / | | \ /|\ + 3 5 7 / | \ * 4 6 Next, observe that the repeated application of the first step brings us to the point where we need to evaluate, not combinations, but primitive expressions such as numerals, built-in operators, or other names. We take care of the primitive cases by stipulating that • the values of numerals are the numbers that they name, • the values of built-in operators are the machine instruction sequences that carry out the corresponding operations, and 10

It may seem strange that the evaluation rule says, as part of the first step, that we should evaluate the leftmost element of a combination, since at this point that can only be an operator such as + or * representing a built-in primitive procedure such as addition or multiplication. We will see later that it is useful to be able to work with combinations whose operators are themselves compound expressions.

22

Structure and Interpretation of Computer Programs, 2e

• the values of other names are the objects associated with those names in the environment. We may regard the second rule as a special case of the third one by stipulating that symbols such as + and * are also included in the global environment, and are associated with the sequences of machine instructions that are their “values.” The key point to notice is the role of the environment in determining the meaning of the symbols in expressions. In an interactive language such as Lisp, it is meaningless to speak of the value of an expression such as (+ x 1) without specifying any information about the environment that would provide a meaning for the symbol x (or even for the symbol +). As we shall see in Chapter 3 [Chapter 3], page 183, the general notion of the environment as providing a context in which evaluation takes place will play an important role in our understanding of program execution. Notice that the evaluation rule given above does not handle definitions. For instance, evaluating (define x 3) does not apply define to two arguments, one of which is the value of the symbol x and the other of which is 3, since the purpose of the define is precisely to associate x with a value. (That is, (define x 3) is not a combination.) Such exceptions to the general evaluation rule are called special forms. Define is the only example of a special form that we have seen so far, but we will meet others shortly. Each special form has its own evaluation rule. The various kinds of expressions (each with its associated evaluation rule) constitute the syntax of the programming language. In comparison with most other programming languages, Lisp has a very simple syntax; that is, the evaluation rule for expressions can be described by a simple general rule together with specialized rules for a small number of special forms.11

1.1.4 Compound Procedures We have identified in Lisp some of the elements that must appear in any powerful programming language: • Numbers and arithmetic operations are primitive data and procedures. • Nesting of combinations provides a means of combining operations. • Definitions that associate names with values provide a limited means of abstraction. Now we will learn about procedure definitions, a much more powerful abstraction technique by which a compound operation can be given a name and then referred to as a unit. We begin by examining how to express the idea of “squaring.” We might say, “To square something, multiply it by itself.” This is expressed in our language as (define (square x) (* x x)) We can understand this in the following way: 11

Special syntactic forms that are simply convenient alternative surface structures for things that can be written in more uniform ways are sometimes called syntactic sugar, to use a phrase coined by Peter Landin. In comparison with users of other languages, Lisp programmers, as a rule, are less concerned with matters of syntax. (By contrast, examine any Pascal manual and notice how much of it is devoted to descriptions of syntax.) This disdain for syntax is due partly to the flexibility of Lisp, which makes it easy to change surface syntax, and partly to the observation that many “convenient” syntactic constructs, which make the language less uniform, end up causing more trouble than they are worth when programs become large and complex. In the words of Alan Perlis, “Syntactic sugar causes cancer of the semicolon.”

Chapter 1: Building Abstractions with Procedures

23

(define (square x) (* x x)) | | | | | | To square something, multiply it by itself. We have here a compound procedure, which has been given the name square. The procedure represents the operation of multiplying something by itself. The thing to be multiplied is given a local name, x, which plays the same role that a pronoun plays in natural language. Evaluating the definition creates this compound procedure and associates it with the name square.12 The general form of a procedure definition is (define ( ) ) The is a symbol to be associated with the procedure definition in the environment.13 The are the names used within the body of the procedure to refer to the corresponding arguments of the procedure. The is an expression that will yield the value of the procedure application when the formal parameters are replaced by the actual arguments to which the procedure is applied.14 The and the are grouped within parentheses, just as they would be in an actual call to the procedure being defined. Having defined square, we can now use it: (square 21) 441 (square (+ 2 5)) 49 (square (square 3)) 81 We can also use square as a building block in defining other procedures. For example, x ^2 + y^2 can be expressed as (+ (square x) (square y)) We can easily define a procedure sum-of-squares that, given any two numbers as arguments, produces the sum of their squares: (define (sum-of-squares x y) (+ (square x) (square y))) (sum-of-squares 3 4) 25 Now we can use sum-of-squares as a building block in constructing further procedures: 12

13

14

Observe that there are two different operations being combined here: we are creating the procedure, and we are giving it the name square. It is possible, indeed important, to be able to separate these two notions—to create procedures without naming them, and to give names to procedures that have already been created. We will see how to do this in section Section 1.3.2 [1-3-2], page 61. Throughout this book, we will describe the general syntax of expressions by using italic symbols delimited by angle brackets—e.g., —to denote the “slots” in the expression to be filled in when such an expression is actually used. More generally, the body of the procedure can be a sequence of expressions. In this case, the interpreter evaluates each expression in the sequence in turn and returns the value of the final expression as the value of the procedure application.

24

Structure and Interpretation of Computer Programs, 2e

(define (f a) (sum-of-squares (+ a 1) (* a 2))) (f 5) 136 Compound procedures are used in exactly the same way as primitive procedures. Indeed, one could not tell by looking at the definition of sum-of-squares given above whether square was built into the interpreter, like + and *, or defined as a compound procedure.

1.1.5 The Substitution Model for Procedure Application To evaluate a combination whose operator names a compound procedure, the interpreter follows much the same process as for combinations whose operators name primitive procedures, which we described in section Section 1.1.3 [1-1-3], page 20. That is, the interpreter evaluates the elements of the combination and applies the procedure (which is the value of the operator of the combination) to the arguments (which are the values of the operands of the combination). We can assume that the mechanism for applying primitive procedures to arguments is built into the interpreter. For compound procedures, the application process is as follows: To apply a compound procedure to arguments, evaluate the body of the procedure with each formal parameter replaced by the corresponding argument. To illustrate this process, let’s evaluate the combination (f 5) where f is the procedure defined in section Section 1.1.4 [1-1-4], page 22. We begin by retrieving the body of f: (sum-of-squares (+ a 1) (* a 2)) Then we replace the formal parameter a by the argument 5: (sum-of-squares (+ 5 1) (* 5 2)) Thus the problem reduces to the evaluation of a combination with two operands and an operator sum-of-squares. Evaluating this combination involves three subproblems. We must evaluate the operator to get the procedure to be applied, and we must evaluate the operands to get the arguments. Now (+ 5 1) produces 6 and (* 5 2) produces 10, so we must apply the sum-of-squares procedure to 6 and 10. These values are substituted for the formal parameters x and y in the body of sum-of-squares, reducing the expression to (+ (square 6) (square 10)) If we use the definition of square, this reduces to (+ (* 6 6) (* 10 10)) which reduces by multiplication to (+ 36 100) and finally to 136 The process we have just described is called the substitution model for procedure application. It can be taken as a model that determines the “meaning” of procedure application,

Chapter 1: Building Abstractions with Procedures

25

insofar as the procedures in this chapter are concerned. However, there are two points that should be stressed: • The purpose of the substitution is to help us think about procedure application, not to provide a description of how the interpreter really works. Typical interpreters do not evaluate procedure applications by manipulating the text of a procedure to substitute values for the formal parameters. In practice, the “substitution” is accomplished by using a local environment for the formal parameters. We will discuss this more fully in Chapter 3 [Chapter 3], page 183 and Chapter 4 [Chapter 4], page 299 when we examine the implementation of an interpreter in detail. • Over the course of this book, we will present a sequence of increasingly elaborate models of how interpreters work, culminating with a complete implementation of an interpreter and compiler in Chapter 5 [Chapter 5], page 405. The substitution model is only the first of these models—a way to get started thinking formally about the evaluation process. In general, when modeling phenomena in science and engineering, we begin with simplified, incomplete models. As we examine things in greater detail, these simple models become inadequate and must be replaced by more refined models. The substitution model is no exception. In particular, when we address in Chapter 3 [Chapter 3], page 183 the use of procedures with “mutable data,” we will see that the substitution model breaks down and must be replaced by a more complicated model of procedure application.15

Applicative order versus normal order According to the description of evaluation given in section Section 1.1.3 [1-1-3], page 20, the interpreter first evaluates the operator and operands and then applies the resulting procedure to the resulting arguments. This is not the only way to perform evaluation. An alternative evaluation model would not evaluate the operands until their values were needed. Instead it would first substitute operand expressions for parameters until it obtained an expression involving only primitive operators, and would then perform the evaluation. If we used this method, the evaluation of (f 5) would proceed according to the sequence of expansions (sum-of-squares (+ 5 1) (* 5 2)) (+

(square (+ 5 1))

(+ (* (+ 5 1) (+ 5 1)) followed by the reductions (+ (* 6 6) (+ 15

36

(square (* 5 2))

)

(* (* 5 2) (* 5 2))) (* 10 10)) 100)

Despite the simplicity of the substitution idea, it turns out to be surprisingly complicated to give a rigorous mathematical definition of the substitution process. The problem arises from the possibility of confusion between the names used for the formal parameters of a procedure and the (possibly identical) names used in the expressions to which the procedure may be applied. Indeed, there is a long history of erroneous definitions of substitution in the literature of logic and programming semantics. See Stoy 1977 for a careful discussion of substitution.

26

Structure and Interpretation of Computer Programs, 2e

136 This gives the same answer as our previous evaluation model, but the process is different. In particular, the evaluations of (+ 5 1) and (* 5 2) are each performed twice here, corresponding to the reduction of the expression (* x x) with x replaced respectively by (+ 5 1) and (* 5 2). This alternative “fully expand and then reduce” evaluation method is known as normalorder evaluation, in contrast to the “evaluate the arguments and then apply” method that the interpreter actually uses, which is called applicative-order evaluation. It can be shown that, for procedure applications that can be modeled using substitution (including all the procedures in the first two chapters of this book) and that yield legitimate values, normalorder and applicative-order evaluation produce the same value. (See [Exercise 1-5], page 29 for an instance of an “illegitimate” value where normal-order and applicative-order evaluation do not give the same result.) Lisp uses applicative-order evaluation, partly because of the additional efficiency obtained from avoiding multiple evaluations of expressions such as those illustrated with (+ 5 1) and (* 5 2) above and, more significantly, because normal-order evaluation becomes much more complicated to deal with when we leave the realm of procedures that can be modeled by substitution. On the other hand, normal-order evaluation can be an extremely valuable tool, and we will investigate some of its implications in Chapter 3 [Chapter 3], page 183 and Chapter 4 [Chapter 4], page 299.16

1.1.6 Conditional Expressions and Predicates The expressive power of the class of procedures that we can define at this point is very limited, because we have no way to make tests and to perform different operations depending on the result of a test. For instance, we cannot define a procedure that computes the absolute value of a number by testing whether the number is positive, negative, or zero and taking different actions in the different cases according to the rule / | |x| = < | \

x 0 -x

if x > 0 if x = 0 if x < 0

This construct is called a case analysis, and there is a special form in Lisp for notating such a case analysis. It is called cond (which stands for “conditional”), and it is used as follows: (define (abs x) (cond ((> x 0) x) ((= x 0) 0) ((< x 0) (- x)))) The general form of a conditional expression is 16

In Chapter 3 [Chapter 3], page 183 we will introduce stream processing, which is a way of handling apparently “infinite” data structures by incorporating a limited form of normal-order evaluation. In section Section 4.2 [4-2], page 331 we will modify the Scheme interpreter to produce a normal-order variant of Scheme.

Chapter 1: Building Abstractions with Procedures

27

(cond ( ) ( ) ... ( )) consisting of the symbol cond followed by parenthesized pairs of expressions ( ) called clauses. The first expression in each pair is a predicate—that is, an expression whose value is interpreted as either true or false.17 Conditional expressions are evaluated as follows. The predicate is evaluated first. If its value is false, then is evaluated. If ’s value is also false, then is evaluated. This process continues until a predicate is found whose value is true, in which case the interpreter returns the value of the corresponding consequent expression of the clause as the value of the conditional expression. If none of the ’s is found to be true, the value of the cond is undefined. The word predicate is used for procedures that return true or false, as well as for expressions that evaluate to true or false. The absolute-value procedure abs makes use of the primitive predicates >, x 5) (< x 10)) As another example, we can define a predicate to test whether one number is greater than or equal to another as (define (>= x y) (or (> x y) (= x y))) or alternatively as (define (>= x y) (not (< x y))) Exercise 1.1: Below is a sequence of expressions. What is the result printed by the interpreter in response to each expression? Assume that the sequence is to be evaluated in the order in which it is presented. 10 (+ 5 3 4) (- 9 1) 19

A minor difference between if and cond is that the part of each cond clause may be a sequence of expressions. If the corresponding is found to be true, the expressions are evaluated in sequence and the value of the final expression in the sequence is returned as the value of the cond. In an if expression, however, the and must be single expressions.

Chapter 1: Building Abstractions with Procedures

(/ 6 2) (+ (* 2 4) (- 4 6)) (define a 3) (define b (+ a 1)) (+ a b (* a b)) (= a b) (if (and (> b a) (< b (* a b))) b a) (cond ((= a 4) 6) ((= b 4) (+ 6 7 a)) (else 25)) (+ 2 (if (> b a) b a)) (* (cond ((> a b) a) ((< a b) b) (else -1)) (+ a 1)) Exercise 1.2: Translate the following expression into prefix form. 5 + 4 + (2 - (3 - (6 + 4/5))) ----------------------------3(6 - 2)(2 - 7) Exercise 1.3: Define a procedure that takes three numbers as arguments and returns the sum of the squares of the two larger numbers. Exercise 1.4: Observe that our model of evaluation allows for combinations whose operators are compound expressions. Use this observation to describe the behavior of the following procedure: (define (a-plus-abs-b a b) ((if (> b 0) + -) a b)) Exercise 1.5: Ben Bitdiddle has invented a test to determine whether the interpreter he is faced with is using applicative-order evaluation or normal-order evaluation. He defines the following two procedures: (define (p) (p)) (define (test x y) (if (= x 0) 0

29

30

Structure and Interpretation of Computer Programs, 2e

y)) Then he evaluates the expression (test 0 (p)) What behavior will Ben observe with an interpreter that uses applicative-order evaluation? What behavior will he observe with an interpreter that uses normalorder evaluation? Explain your answer. (Assume that the evaluation rule for the special form if is the same whether the interpreter is using normal or applicative order: The predicate expression is evaluated first, and the result determines whether to evaluate the consequent or the alternative expression.)

1.1.7 Example: Square Roots by Newton’s Method Procedures, as introduced above, are much like ordinary mathematical functions. They specify a value that is determined by one or more parameters. But there is an important difference between mathematical functions and computer procedures. Procedures must be effective. As a case in point, consider the problem of computing square roots. We can define the square-root function as sqrt(x) = the y such that y >= 0 and y^2 = x This describes a perfectly legitimate mathematical function. We could use it to recognize whether one number is the square root of another, or to derive facts about square roots in general. On the other hand, the definition does not describe a procedure. Indeed, it tells us almost nothing about how to actually find the square root of a given number. It will not help matters to rephrase this definition in pseudo-Lisp: (define (sqrt x) (the y (and (>= y 0) (= (square y) x)))) This only begs the question. The contrast between function and procedure is a reflection of the general distinction between describing properties of things and describing how to do things, or, as it is sometimes referred to, the distinction between declarative knowledge and imperative knowledge. In mathematics we are usually concerned with declarative (what is) descriptions, whereas in computer science we are usually concerned with imperative (how to) descriptions.20 How does one compute square roots? The most common way is to use Newton’s method of successive approximations, which says that whenever we have a guess y for the value of the square root of a number x, we can perform a simple manipulation to get a better 20

Declarative and imperative descriptions are intimately related, as indeed are mathematics and computer science. For instance, to say that the answer produced by a program is “correct” is to make a declarative statement about the program. There is a large amount of research aimed at establishing techniques for proving that programs are correct, and much of the technical difficulty of this subject has to do with negotiating the transition between imperative statements (from which programs are constructed) and declarative statements (which can be used to deduce things). In a related vein, an important current area in programming-language design is the exploration of so-called very high-level languages, in which one actually programs in terms of declarative statements. The idea is to make interpreters sophisticated enough so that, given “what is” knowledge specified by the programmer, they can generate “how to” knowledge automatically. This cannot be done in general, but there are important areas where progress has been made. We shall revisit this idea in Chapter 4 [Chapter 4], page 299.

Chapter 1: Building Abstractions with Procedures

31

guess (one closer to the actual square root) by averaging y with x /y.21 For example, we can compute the square root of 2 as follows. Suppose our initial guess is 1: Guess Quotient Average 1 (2/1) = 2 ((2 + 1)/2) = 1.5 1.5 (2/1.5) = 1.3333 ((1.3333 + 1.5)/2) = 1.4167 1.4167 (2/1.4167) = 1.4118 ((1.4167 + 1.4118)/2) = 1.4142 1.4142 ... ... Continuing this process, we obtain better and better approximations to the square root. Now let’s formalize the process in terms of procedures. We start with a value for the radicand (the number whose square root we are trying to compute) and a value for the guess. If the guess is good enough for our purposes, we are done; if not, we must repeat the process with an improved guess. We write this basic strategy as a procedure: (define (sqrt-iter guess x) (if (good-enough? guess x) guess (sqrt-iter (improve guess x) x))) A guess is improved by averaging it with the quotient of the radicand and the old guess: (define (improve guess x) (average guess (/ x guess))) where (define (average x y) (/ (+ x y) 2)) We also have to say what we mean by “good enough.” The following will do for illustration, but it is not really a very good test. (See exercise [Exercise 1-7], page 32.) The idea is to improve the answer until it is close enough so that its square differs from the radicand by less than a predetermined tolerance (here 0.001):22 (define (good-enough? guess x) (< (abs (- (square guess) x)) 0.001)) Finally, we need a way to get started. For instance, we can always guess that the square root of any number is 1:23 21

22

23

This square-root algorithm is actually a special case of Newton’s method, which is a general technique for finding roots of equations. The square-root algorithm itself was developed by Heron of Alexandria in the first century A.D. We will see how to express the general Newton’s method as a Lisp procedure in section Section 1.3.4 [1-3-4], page 69. We will usually give predicates names ending with question marks, to help us remember that they are predicates. This is just a stylistic convention. As far as the interpreter is concerned, the question mark is just an ordinary character. Observe that we express our initial guess as 1.0 rather than 1. This would not make any difference in many Lisp implementations. MIT Scheme, however, distinguishes between exact integers and decimal values, and dividing two integers produces a rational number rather than a decimal. For example, dividing 10 by 6 yields 5/3, while dividing 10.0 by 6.0 yields 1.6666666666666667. (We will learn how to implement arithmetic on rational numbers in section Section 2.1.1 [2-1-1], page 78.) If we start with an initial guess of 1 in our square-root program, and x is an exact integer, all subsequent values produced in the square-root computation will be rational numbers rather than decimals. Mixed operations on rational numbers and decimals always yield decimals, so starting with an initial guess of 1.0 forces all subsequent values to be decimals.

32

Structure and Interpretation of Computer Programs, 2e

(define (sqrt x) (sqrt-iter 1.0 x)) If we type these definitions to the interpreter, we can use sqrt just as we can use any procedure: (sqrt 9) 3.00009155413138 (sqrt (+ 100 37)) 11.704699917758145 (sqrt (+ (sqrt 2) (sqrt 3))) 1.7739279023207892 (square (sqrt 1000)) 1000.000369924366 The sqrt program also illustrates that the simple procedural language we have introduced so far is sufficient for writing any purely numerical program that one could write in, say, C or Pascal. This might seem surprising, since we have not included in our language any iterative (looping) constructs that direct the computer to do something over and over again. Sqrt-iter, on the other hand, demonstrates how iteration can be accomplished using no special construct other than the ordinary ability to call a procedure.24 Exercise 1.6: Alyssa P. Hacker doesn’t see why if needs to be provided as a special form. “Why can’t I just define it as an ordinary procedure in terms of cond?” she asks. Alyssa’s friend Eva Lu Ator claims this can indeed be done, and she defines a new version of if: (define (new-if predicate then-clause else-clause) (cond (predicate then-clause) (else else-clause))) Eva demonstrates the program for Alyssa: (new-if (= 2 3) 0 5) 5 (new-if (= 1 1) 0 5) 0 Delighted, Alyssa uses new-if to rewrite the square-root program: (define (sqrt-iter guess x) (new-if (good-enough? guess x) guess (sqrt-iter (improve guess x) x))) What happens when Alyssa attempts to use this to compute square roots? Explain. 24

Readers who are worried about the efficiency issues involved in using procedure calls to implement iteration should note the remarks on “tail recursion” in section Section 1.2.1 [1-2-1], page 37.

Chapter 1: Building Abstractions with Procedures

33

Exercise 1.7: The good-enough? test used in computing square roots will not be very effective for finding the square roots of very small numbers. Also, in real computers, arithmetic operations are almost always performed with limited precision. This makes our test inadequate for very large numbers. Explain these statements, with examples showing how the test fails for small and large numbers. An alternative strategy for implementing good-enough? is to watch how guess changes from one iteration to the next and to stop when the change is a very small fraction of the guess. Design a square-root procedure that uses this kind of end test. Does this work better for small and large numbers? Exercise 1.8: Newton’s method for cube roots is based on the fact that if y is an approximation to the cube root of x, then a better approximation is given by the value x/y^2 + 2y ---------3 Use this formula to implement a cube-root procedure analogous to the squareroot procedure. (In section Section 1.3.4 [1-3-4], page 69 we will see how to implement Newton’s method in general as an abstraction of these square-root and cube-root procedures.)

1.1.8 Procedures as Black-Box Abstractions Sqrt is our first example of a process defined by a set of mutually defined procedures. Notice that the definition of sqrt-iter is recursive; that is, the procedure is defined in terms of itself. The idea of being able to define a procedure in terms of itself may be disturbing; it may seem unclear how such a “circular” definition could make sense at all, much less specify a well-defined process to be carried out by a computer. This will be addressed more carefully in section Section 1.2 [1-2], page 37. But first let’s consider some other important points illustrated by the sqrt example. Observe that the problem of computing square roots breaks up naturally into a number of subproblems: how to tell whether a guess is good enough, how to improve a guess, and so on. Each of these tasks is accomplished by a separate procedure. The entire sqrt program can be viewed as a cluster of procedures (shown in [Figure 1-2], page 33) that mirrors the decomposition of the problem into subproblems. Figure 1.2: Procedural decomposition of the sqrt program. sqrt | sqrt-iter / \ good-enough improve / \ | square abs average The importance of this decomposition strategy is not simply that one is dividing the program into parts. After all, we could take any large program and divide it into parts— the first ten lines, the next ten lines, the next ten lines, and so on. Rather, it is crucial that each procedure accomplishes an identifiable task that can be used as a module in

34

Structure and Interpretation of Computer Programs, 2e

defining other procedures. For example, when we define the good-enough? procedure in terms of square, we are able to regard the square procedure as a “black box.” We are not at that moment concerned with how the procedure computes its result, only with the fact that it computes the square. The details of how the square is computed can be suppressed, to be considered at a later time. Indeed, as far as the good-enough? procedure is concerned, square is not quite a procedure but rather an abstraction of a procedure, a so-called procedural abstraction. At this level of abstraction, any procedure that computes the square is equally good. Thus, considering only the values they return, the following two procedures for squaring a number should be indistinguishable. Each takes a numerical argument and produces the square of that number as the value.25 (define (square x) (* x x)) (define (square x) (exp (double (log x)))) (define (double x) (+ x x)) So a procedure definition should be able to suppress detail. The users of the procedure may not have written the procedure themselves, but may have obtained it from another programmer as a black box. A user should not need to know how the procedure is implemented in order to use it.

Local names One detail of a procedure’s implementation that should not matter to the user of the procedure is the implementer’s choice of names for the procedure’s formal parameters. Thus, the following procedures should not be distinguishable: (define (square x) (* x x)) (define (square y) (* y y)) This principle—that the meaning of a procedure should be independent of the parameter names used by its author—seems on the surface to be self-evident, but its consequences are profound. The simplest consequence is that the parameter names of a procedure must be local to the body of the procedure. For example, we used square in the definition of good-enough? in our square-root procedure: (define (good-enough? guess x) (< (abs (- (square guess) x)) 0.001)) The intention of the author of good-enough? is to determine if the square of the first argument is within a given tolerance of the second argument. We see that the author of good-enough? used the name guess to refer to the first argument and x to refer to the second argument. The argument of square is guess. If the author of square used x (as above) to refer to that argument, we see that the x in good-enough? must be a different x 25

It is not even clear which of these procedures is a more efficient implementation. This depends upon the hardware available. There are machines for which the “obvious” implementation is the less efficient one. Consider a machine that has extensive tables of logarithms and antilogarithms stored in a very efficient manner.

Chapter 1: Building Abstractions with Procedures

35

than the one in square. Running the procedure square must not affect the value of x that is used by good-enough?, because that value of x may be needed by good-enough? after square is done computing. If the parameters were not local to the bodies of their respective procedures, then the parameter x in square could be confused with the parameter x in good-enough?, and the behavior of good-enough? would depend upon which version of square we used. Thus, square would not be the black box we desired. A formal parameter of a procedure has a very special role in the procedure definition, in that it doesn’t matter what name the formal parameter has. Such a name is called a bound variable, and we say that the procedure definition binds its formal parameters. The meaning of a procedure definition is unchanged if a bound variable is consistently renamed throughout the definition.26 If a variable is not bound, we say that it is free. The set of expressions for which a binding defines a name is called the scope of that name. In a procedure definition, the bound variables declared as the formal parameters of the procedure have the body of the procedure as their scope. In the definition of good-enough? above, guess and x are bound variables but . 1 . . 0 . . . . . .>. .. Consider the pattern of this computation. To compute (fib 5), we compute (fib 4) and (fib 3). To compute (fib 4), we compute (fib 3) and (fib 2). In general, the evolved process looks like a tree, as shown in [Figure 1-5], page 42. Notice that the branches split into two at each level (except at the bottom); this reflects the fact that the fib procedure calls itself twice each time it is invoked. This procedure is instructive as a prototypical tree recursion, but it is a terrible way to compute Fibonacci numbers because it does so much redundant computation. Notice in [Figure 1-5], page 42 that the entire computation of (fib 3)—almost half the work— is duplicated. In fact, it is not hard to show that the number of times the procedure will compute (fib 1) or (fib 0) (the number of leaves in the above tree, in general) is precisely Fib(n + 1). To get an idea of how bad this is, one can show that the value of Fib(n) grows exponentially with n. More precisely (see [Exercise 1-13], page 45), Fib(n) is the closest integer to [phi] ^n /[sqrt] (5), where [phi] = (1 + [sqrt]5)/2 ~= 1.6180 is the golden ratio, which satisfies the equation [phi]^2 = [phi] + 1 Thus, the process uses a number of steps that grows exponentially with the input. On the other hand, the space required grows only linearly with the input, because we need keep track only of which nodes are above us in the tree at any point in the computation. In general, the number of steps required by a tree-recursive process will be proportional to the number of nodes in the tree, while the space required will be proportional to the maximum depth of the tree. We can also formulate an iterative process for computing the Fibonacci numbers. The idea is to use a pair of integers a and b, initialized to Fib(1) = 1 and Fib(0) = 0, and to repeatedly apply the simultaneous transformations

Chapter 1: Building Abstractions with Procedures

43

a

View more...
Harold Abelson and Gerald Jay Sussman with Julie Sussman foreword by Alan J. Perlis

c 1996 by The Massachusetts Institute of Technology Copyright Structure and Interpretation of Computer Programs second edition Harold Abelson and Gerald Jay Sussman with Julie Sussman foreword by Alan J. Perlis The MIT Press Cambridge, Massachusetts London, England McGraw-Hill Book Company New York, St. Louis, San Francisco Montreal, Toronto This book is one of a series of texts written by faculty of the Electrical Engineering and Computer Science Department at the Massachusetts Institute of Technology. It was edited and produced by The MIT Press under a joint production-distribution arrangement with the McGraw-Hill Book Company. Unofficial Texinfo Format 2.neilvandyke4 (January 10, 2007)

i

Short Contents Unofficial Texinfo Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 Foreword . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 Preface to the Second Edition . . . . . . . . . . . . . . . . . . . . . . . . . . 9 Preface to the First Edition . . . . . . . . . . . . . . . . . . . . . . . . . . 11 Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 1 Building Abstractions with Procedures . . . . . . . . . . . . . . . . 15 2 Building Abstractions with Data . . . . . . . . . . . . . . . . . . . . 75 3 Modularity, Objects, and State . . . . . . . . . . . . . . . . . . . . 183 4 Metalinguistic Abstraction . . . . . . . . . . . . . . . . . . . . . . . 299 5 Computing with Register Machines . . . . . . . . . . . . . . . . . 405 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503 Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 509

ii

Structure and Interpretation of Computer Programs, 2e

iii

Table of Contents Unofficial Texinfo Format . . . . . . . . . . . . . . . . . . . . . . 1 Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 Foreword . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 Preface to the Second Edition . . . . . . . . . . . . . . . . . 9 Preface to the First Edition . . . . . . . . . . . . . . . . . . 11 Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 1

Building Abstractions with Procedures . . . . . 15 1.1

The Elements of Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1.1 Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1.2 Naming and the Environment. . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1.3 Evaluating Combinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1.4 Compound Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1.5 The Substitution Model for Procedure Application . . . . . . . 1.1.6 Conditional Expressions and Predicates . . . . . . . . . . . . . . . . . 1.1.7 Example: Square Roots by Newton’s Method . . . . . . . . . . . . 1.1.8 Procedures as Black-Box Abstractions . . . . . . . . . . . . . . . . . . . 1.2 Procedures and the Processes They Generate . . . . . . . . . . . . . . . . . 1.2.1 Linear Recursion and Iteration . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2.2 Tree Recursion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2.3 Orders of Growth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2.4 Exponentiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2.5 Greatest Common Divisors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2.6 Example: Testing for Primality . . . . . . . . . . . . . . . . . . . . . . . . . 1.3 Formulating Abstractions with Higher-Order Procedures . . . . . . 1.3.1 Procedures as Arguments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.3.2 Constructing Procedures Using Lambda . . . . . . . . . . . . . . . . . . 1.3.3 Procedures as General Methods . . . . . . . . . . . . . . . . . . . . . . . . . 1.3.4 Procedures as Returned Values . . . . . . . . . . . . . . . . . . . . . . . . .

17 18 19 20 22 24 26 30 33 37 37 41 45 47 50 51 56 57 61 64 69

iv

2

Structure and Interpretation of Computer Programs, 2e

Building Abstractions with Data. . . . . . . . . . . 75 2.1

Introduction to Data Abstraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 2.1.1 Example: Arithmetic Operations for Rational Numbers . . . 78 2.1.2 Abstraction Barriers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 2.1.3 What Is Meant by Data? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 2.1.4 Extended Exercise: Interval Arithmetic . . . . . . . . . . . . . . . . . . 86 2.2 Hierarchical Data and the Closure Property . . . . . . . . . . . . . . . . . . 89 2.2.1 Representing Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 2.2.2 Hierarchical Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 2.2.3 Sequences as Conventional Interfaces . . . . . . . . . . . . . . . . . . . 102 2.2.4 Example: A Picture Language . . . . . . . . . . . . . . . . . . . . . . . . . 113 2.3 Symbolic Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123 2.3.1 Quotation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123 2.3.2 Example: Symbolic Differentiation . . . . . . . . . . . . . . . . . . . . . 126 2.3.3 Example: Representing Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . 131 2.3.4 Example: Huffman Encoding Trees . . . . . . . . . . . . . . . . . . . . . 138 2.4 Multiple Representations for Abstract Data . . . . . . . . . . . . . . . . . 145 2.4.1 Representations for Complex Numbers . . . . . . . . . . . . . . . . . 146 2.4.2 Tagged data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149 2.4.3 Data-Directed Programming and Additivity . . . . . . . . . . . . 153 2.5 Systems with Generic Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . 160 2.5.1 Generic Arithmetic Operations . . . . . . . . . . . . . . . . . . . . . . . . 161 2.5.2 Combining Data of Different Types . . . . . . . . . . . . . . . . . . . . 165 2.5.3 Example: Symbolic Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . 171

3

Modularity, Objects, and State . . . . . . . . . . . 183 3.1

Assignment and Local State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.1.1 Local State Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.1.2 The Benefits of Introducing Assignment . . . . . . . . . . . . . . . . 3.1.3 The Costs of Introducing Assignment . . . . . . . . . . . . . . . . . . 3.2 The Environment Model of Evaluation . . . . . . . . . . . . . . . . . . . . . . 3.2.1 The Rules for Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.2.2 Applying Simple Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.2.3 Frames as the Repository of Local State . . . . . . . . . . . . . . . . 3.2.4 Internal Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.3 Modeling with Mutable Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.3.1 Mutable List Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.3.2 Representing Queues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.3.3 Representing Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.3.4 A Simulator for Digital Circuits . . . . . . . . . . . . . . . . . . . . . . . . 3.3.5 Propagation of Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.4 Concurrency: Time Is of the Essence . . . . . . . . . . . . . . . . . . . . . . . 3.4.1 The Nature of Time in Concurrent Systems . . . . . . . . . . . . . 3.4.2 Mechanisms for Controlling Concurrency . . . . . . . . . . . . . . . 3.5 Streams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.5.1 Streams Are Delayed Lists. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.5.2 Infinite Streams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.5.3 Exploiting the Stream Paradigm . . . . . . . . . . . . . . . . . . . . . . .

184 184 189 192 198 199 201 204 208 210 211 219 224 229 239 249 250 254 264 265 272 279

v 3.5.4 Streams and Delayed Evaluation . . . . . . . . . . . . . . . . . . . . . . . 288 3.5.5 Modularity of Functional Programs and Modularity of Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294

4

Metalinguistic Abstraction . . . . . . . . . . . . . . . 299 4.1

The Metacircular Evaluator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.1.1 The Core of the Evaluator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.1.2 Representing Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.1.3 Evaluator Data Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.1.4 Running the Evaluator as a Program . . . . . . . . . . . . . . . . . . . 4.1.5 Data as Programs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.1.6 Internal Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.1.7 Separating Syntactic Analysis from Execution. . . . . . . . . . . 4.2 Variations on a Scheme – Lazy Evaluation . . . . . . . . . . . . . . . . . . 4.2.1 Normal Order and Applicative Order . . . . . . . . . . . . . . . . . . . 4.2.2 An Interpreter with Lazy Evaluation . . . . . . . . . . . . . . . . . . . 4.2.3 Streams as Lazy Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.3 Variations on a Scheme – Nondeterministic Computing . . . . . . . 4.3.1 Amb and Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.3.2 Examples of Nondeterministic Programs . . . . . . . . . . . . . . . . 4.3.3 Implementing the Amb Evaluator . . . . . . . . . . . . . . . . . . . . . . . 4.4 Logic Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.4.1 Deductive Information Retrieval . . . . . . . . . . . . . . . . . . . . . . . 4.4.2 How the Query System Works . . . . . . . . . . . . . . . . . . . . . . . . . 4.4.3 Is Logic Programming Mathematical Logic? . . . . . . . . . . . . 4.4.4 Implementing the Query System . . . . . . . . . . . . . . . . . . . . . . . 4.4.4.1 The Driver Loop and Instantiation . . . . . . . . . . . . . . . . 4.4.4.2 The Evaluator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.4.4.3 Finding Assertions by Pattern Matching . . . . . . . . . . . 4.4.4.4 Rules and Unification. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.4.4.5 Maintaining the Data Base . . . . . . . . . . . . . . . . . . . . . . . 4.4.4.6 Stream Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.4.4.7 Query Syntax Procedures . . . . . . . . . . . . . . . . . . . . . . . . . 4.4.4.8 Frames and Bindings . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

5

301 302 306 313 316 319 322 327 331 332 333 340 342 344 347 353 363 366 375 381 386 386 387 390 392 395 397 398 401

Computing with Register Machines . . . . . . . 405 5.1

Designing Register Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.1.1 A Language for Describing Register Machines . . . . . . . . . . . 5.1.2 Abstraction in Machine Design . . . . . . . . . . . . . . . . . . . . . . . . 5.1.3 Subroutines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.1.4 Using a Stack to Implement Recursion . . . . . . . . . . . . . . . . . 5.1.5 Instruction Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.2 A Register-Machine Simulator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.2.1 The Machine Model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.2.2 The Assembler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.2.3 Generating Execution Procedures for Instructions . . . . . . . 5.2.4 Monitoring Machine Performance . . . . . . . . . . . . . . . . . . . . . . 5.3 Storage Allocation and Garbage Collection . . . . . . . . . . . . . . . . . .

406 408 412 414 418 423 423 425 428 431 437 440

vi

Structure and Interpretation of Computer Programs, 2e 5.3.1 Memory as Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.3.2 Maintaining the Illusion of Infinite Memory . . . . . . . . . . . . . 5.4 The Explicit-Control Evaluator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.4.1 The Core of the Explicit-Control Evaluator . . . . . . . . . . . . . 5.4.2 Sequence Evaluation and Tail Recursion . . . . . . . . . . . . . . . . 5.4.3 Conditionals, Assignments, and Definitions . . . . . . . . . . . . . 5.4.4 Running the Evaluator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.5 Compilation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.5.1 Structure of the Compiler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.5.2 Compiling Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.5.3 Compiling Combinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.5.4 Combining Instruction Sequences . . . . . . . . . . . . . . . . . . . . . . 5.5.5 An Example of Compiled Code . . . . . . . . . . . . . . . . . . . . . . . . 5.5.6 Lexical Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.5.7 Interfacing Compiled Code to the Evaluator . . . . . . . . . . . .

440 445 450 451 456 458 460 465 468 472 477 482 485 493 496

References. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503 Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 509

Unofficial Texinfo Format

1

Unofficial Texinfo Format This is the second edition SICP book, from Unofficial Texinfo Format. You are probably reading it in an Info hypertext browser, such as the Info mode of Emacs. You might alternatively be reading it TEX-formatted on your screen or printer, though that would be silly. And, if printed, expensive. The freely-distributed official HTML-and-GIF format was first converted personually to Unofficial Texinfo Format (UTF) version 1 by Lyssa Ayth during a long Emacs lovefest weekend in April, 2001. The UTF is easier to search than the HTML format. It is also much more accessible to people running on modest computers, such as donated ’386-based PCs. A 386 can, in theory, run Linux, Emacs, and a Scheme interpreter simultaneously, but most 386s probably can’t also run both Netscape and the necessary X Window System without prematurely introducing budding young underfunded hackers to the concept of thrashing. UTF can also fit uncompressed on a 1.44MB floppy diskette, which may come in handy for installing UTF on PCs that do not have Internet or LAN access. The Texinfo conversion has been a straight transliteration, to the extent possible. Like the TEX-to-HTML conversion, this was not without some introduction of breakage. In the case of Unofficial Texinfo Format, figures have suffered an amateurish resurrection of the lost art of ASCII art. Also, it’s quite possible that some errors of ambiguity were introduced during the conversion of some of the copious superscripts (‘^’) and subscripts (‘ ’). Divining which has been left as an exercise to the reader. But at least we don’t put our brave astronauts at risk by encoding the greater-than-or-equal symbol as >. If you modify ‘sicp.texi’ to correct errors or improve the ASCII art, then update the @set utfversion 2.neilvandyke4 line to reflect your delta. For example, if you started with Lytha’s version 1, and your name is Bob, then you could name your successive versions 1.bob1, 1.bob2, . . . 1.bobn . Also update utfversiondate. If you want to distribute your version on the Web, then embedding the string “sicp.texi” somewhere in the file or Web page will make it easier for people to find with Web search engines. It is believed that the Unofficial Texinfo Format is in keeping with the spirit of the graciously freely-distributed HTML version. But you never know when someone’s armada of lawyers might need something to do, and get their shorts all in a knot over some benign little thing, so think twice before you use your full name or distribute Info, DVI, PostScript, or PDF formats that might embed your account or machine name. Peath, Lytha Ayth Addendum: See also the SICP video lectures by Abelson and Sussman: http://www.swiss.ai.mit.edu/classes/6.001/abelson-sussman-lectures/

2

Structure and Interpretation of Computer Programs, 2e

Dedication

3

Dedication This book is dedicated, in respect and admiration, to the spirit that lives in the computer. “I think that it’s extraordinarily important that we in computer science keep fun in computing. When it started out, it was an awful lot of fun. Of course, the paying customers got shafted every now and then, and after a while we began to take their complaints seriously. We began to feel as if we really were responsible for the successful, error-free perfect use of these machines. I don’t think we are. I think we’re responsible for stretching them, setting them off in new directions, and keeping fun in the house. I hope the field of computer science never loses its sense of fun. Above all, I hope we don’t become missionaries. Don’t feel as if you’re Bible salesmen. The world has too many of those already. What you know about computing other people will learn. Don’t feel as if the key to successful computing is only in your hands. What’s in your hands, I think and hope, is intelligence: the ability to see the machine as more than when you were first led up to it, that you can make it more.” —Alan J. Perlis (April 1, 1922 February 7, 1990)

4

Structure and Interpretation of Computer Programs, 2e

Foreword

5

Foreword Educators, generals, dieticians, psychologists, and parents program. Armies, students, and some societies are programmed. An assault on large problems employs a succession of programs, most of which spring into existence en route. These programs are rife with issues that appear to be particular to the problem at hand. To appreciate programming as an intellectual activity in its own right you must turn to computer programming; you must read and write computer programs—many of them. It doesn’t matter much what the programs are about or what applications they serve. What does matter is how well they perform and how smoothly they fit with other programs in the creation of still greater programs. The programmer must seek both perfection of part and adequacy of collection. In this book the use of “program” is focused on the creation, execution, and study of programs written in a dialect of Lisp for execution on a digital computer. Using Lisp we restrict or limit not what we may program, but only the notation for our program descriptions. Our traffic with the subject matter of this book involves us with three foci of phenomena: the human mind, collections of computer programs, and the computer. Every computer program is a model, hatched in the mind, of a real or mental process. These processes, arising from human experience and thought, are huge in number, intricate in detail, and at any time only partially understood. They are modeled to our permanent satisfaction rarely by our computer programs. Thus even though our programs are carefully handcrafted discrete collections of symbols, mosaics of interlocking functions, they continually evolve: we change them as our perception of the model deepens, enlarges, generalizes until the model ultimately attains a metastable place within still another model with which we struggle. The source of the exhilaration associated with computer programming is the continual unfolding within the mind and on the computer of mechanisms expressed as programs and the explosion of perception they generate. If art interprets our dreams, the computer executes them in the guise of programs! For all its power, the computer is a harsh taskmaster. Its programs must be correct, and what we wish to say must be said accurately in every detail. As in every other symbolic activity, we become convinced of program truth through argument. Lisp itself can be assigned a semantics (another model, by the way), and if a program’s function can be specified, say, in the predicate calculus, the proof methods of logic can be used to make an acceptable correctness argument. Unfortunately, as programs get large and complicated, as they almost always do, the adequacy, consistency, and correctness of the specifications themselves become open to doubt, so that complete formal arguments of correctness seldom accompany large programs. Since large programs grow from small ones, it is crucial that we develop an arsenal of standard program structures of whose correctness we have become sure—we call them idioms—and learn to combine them into larger structures using organizational techniques of proven value. These techniques are treated at length in this book, and understanding them is essential to participation in the Promethean enterprise called programming. More than anything else, the uncovering and mastery of powerful organizational techniques accelerates our ability to create large, significant programs. Conversely, since writing large programs is very taxing, we are stimulated to invent new methods of reducing the mass of function and detail to be fitted into large programs. Unlike programs, computers must obey the laws of physics. If they wish to perform rapidly—a few nanoseconds per state change—they must transmit electrons only small

6

Structure and Interpretation of Computer Programs, 2e

distances (at most 11 over 2 feet). The heat generated by the huge number of devices so concentrated in space has to be removed. An exquisite engineering art has been developed balancing between multiplicity of function and density of devices. In any event, hardware always operates at a level more primitive than that at which we care to program. The processes that transform our Lisp programs to “machine” programs are themselves abstract models which we program. Their study and creation give a great deal of insight into the organizational programs associated with programming arbitrary models. Of course the computer itself can be so modeled. Think of it: the behavior of the smallest physical switching element is modeled by quantum mechanics described by differential equations whose detailed behavior is captured by numerical approximations represented in computer programs executing on computers composed of . . . ! It is not merely a matter of tactical convenience to separately identify the three foci. Even though, as they say, it’s all in the head, this logical separation induces an acceleration of symbolic traffic between these foci whose richness, vitality, and potential is exceeded in human experience only by the evolution of life itself. At best, relationships between the foci are metastable. The computers are never large enough or fast enough. Each breakthrough in hardware technology leads to more massive programming enterprises, new organizational principles, and an enrichment of abstract models. Every reader should ask himself periodically “Toward what end, toward what end?”—but do not ask it too often lest you pass up the fun of programming for the constipation of bittersweet philosophy. Among the programs we write, some (but never enough) perform a precise mathematical function such as sorting or finding the maximum of a sequence of numbers, determining primality, or finding the square root. We call such programs algorithms, and a great deal is known of their optimal behavior, particularly with respect to the two important parameters of execution time and data storage requirements. A programmer should acquire good algorithms and idioms. Even though some programs resist precise specifications, it is the responsibility of the programmer to estimate, and always to attempt to improve, their performance. Lisp is a survivor, having been in use for about a quarter of a century. Among the active programming languages only Fortran has had a longer life. Both languages have supported the programming needs of important areas of application, Fortran for scientific and engineering computation and Lisp for artificial intelligence. These two areas continue to be important, and their programmers are so devoted to these two languages that Lisp and Fortran may well continue in active use for at least another quarter-century. Lisp changes. The Scheme dialect used in this text has evolved from the original Lisp and differs from the latter in several important ways, including static scoping for variable binding and permitting functions to yield functions as values. In its semantic structure Scheme is as closely akin to Algol 60 as to early Lisps. Algol 60, never to be an active language again, lives on in the genes of Scheme and Pascal. It would be difficult to find two languages that are the communicating coin of two more different cultures than those gathered around these two languages. Pascal is for building pyramids—imposing, breathtaking, static structures built by armies pushing heavy blocks into place. Lisp is for building organisms—imposing, breathtaking, dynamic structures built by squads fitting fluctuating myriads of simpler organisms into place. The organizing principles used are the same in both cases, except for one extraordinarily important difference: The discretionary exportable functionality entrusted to the individual Lisp programmer is more than an order of magnitude greater than

Foreword

7

that to be found within Pascal enterprises. Lisp programs inflate libraries with functions whose utility transcends the application that produced them. The list, Lisp’s native data structure, is largely responsible for such growth of utility. The simple structure and natural applicability of lists are reflected in functions that are amazingly nonidiosyncratic. In Pascal the plethora of declarable data structures induces a specialization within functions that inhibits and penalizes casual cooperation. It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures. As a result the pyramid must stand unchanged for a millennium; the organism must evolve or perish. To illustrate this difference, compare the treatment of material and exercises within this book with that in any first-course text using Pascal. Do not labor under the illusion that this is a text digestible at MIT only, peculiar to the breed found there. It is precisely what a serious book on programming Lisp must be, no matter who the student is or where it is used. Note that this is a text about programming, unlike most Lisp books, which are used as a preparation for work in artificial intelligence. After all, the critical programming concerns of software engineering and artificial intelligence tend to coalesce as the systems under investigation become larger. This explains why there is such growing interest in Lisp outside of artificial intelligence. As one would expect from its goals, artificial intelligence research generates many significant programming problems. In other programming cultures this spate of problems spawns new languages. Indeed, in any very large programming task a useful organizing principle is to control and isolate traffic within the task modules via the invention of language. These languages tend to become less primitive as one approaches the boundaries of the system where we humans interact most often. As a result, such systems contain complex languageprocessing functions replicated many times. Lisp has such a simple syntax and semantics that parsing can be treated as an elementary task. Thus parsing technology plays almost no role in Lisp programs, and the construction of language processors is rarely an impediment to the rate of growth and change of large Lisp systems. Finally, it is this very simplicity of syntax and semantics that is responsible for the burden and freedom borne by all Lisp programmers. No Lisp program of any size beyond a few lines can be written without being saturated with discretionary functions. Invent and fit; have fits and reinvent! We toast the Lisp programmer who pens his thoughts within nests of parentheses. Alan J. Perlis New Haven, Connecticut

8

Structure and Interpretation of Computer Programs, 2e

Preface to the Second Edition

9

Preface to the Second Edition Is it possible that software is not like anything else, that it is meant to be discarded: that the whole point is to always see it as a soap bubble? —Alan J. Perlis The material in this book has been the basis of MIT’s entry-level computer science subject since 1980. We had been teaching this material for four years when the first edition was published, and twelve more years have elapsed until the appearance of this second edition. We are pleased that our work has been widely adopted and incorporated into other texts. We have seen our students take the ideas and programs in this book and build them in as the core of new computer systems and languages. In literal realization of an ancient Talmudic pun, our students have become our builders. We are lucky to have such capable students and such accomplished builders. In preparing this edition, we have incorporated hundreds of clarifications suggested by our own teaching experience and the comments of colleagues at MIT and elsewhere. We have redesigned most of the major programming systems in the book, including the genericarithmetic system, the interpreters, the register-machine simulator, and the compiler; and we have rewritten all the program examples to ensure that any Scheme implementation conforming to the IEEE Scheme standard (IEEE 1990) will be able to run the code. This edition emphasizes several new themes. The most important of these is the central role played by different approaches to dealing with time in computational models: objects with state, concurrent programming, functional programming, lazy evaluation, and nondeterministic programming. We have included new sections on concurrency and nondeterminism, and we have tried to integrate this theme throughout the book. The first edition of the book closely followed the syllabus of our MIT one-semester subject. With all the new material in the second edition, it will not be possible to cover everything in a single semester, so the instructor will have to pick and choose. In our own teaching, we sometimes skip the section on logic programming (section Section 4.4 [4-4], page 363), we have students use the register-machine simulator but we do not cover its implementation (section Section 5.2 [5-2], page 423), and we give only a cursory overview of the compiler (section Section 5.5 [5-5], page 465). Even so, this is still an intense course. Some instructors may wish to cover only the first three or four chapters, leaving the other material for subsequent courses. The World-Wide-Web site http://www-mitpress.mit.edu/sicp/ provides support for users of this book. This includes programs from the book, sample programming assignments, supplementary materials, and downloadable implementations of the Scheme dialect of Lisp.

10

Structure and Interpretation of Computer Programs, 2e

Preface to the First Edition

11

Preface to the First Edition A computer is like a violin. You can imagine a novice trying first a phonograph and then a violin. The latter, he says, sounds terrible. That is the argument we have heard from our humanists and most of our computer scientists. Computer programs are good, they say, for particular purposes, but they aren’t flexible. Neither is a violin, or a typewriter, until you learn how to use it. —Marvin Minsky, “Why Programming Is a Good Medium for Expressing Poorly-Understood and Sloppily-Formulated Ideas” “The Structure and Interpretation of Computer Programs” is the entry-level subject in computer science at the Massachusetts Institute of Technology. It is required of all students at MIT who major in electrical engineering or in computer science, as one-fourth of the “common core curriculum,” which also includes two subjects on circuits and linear systems and a subject on the design of digital systems. We have been involved in the development of this subject since 1978, and we have taught this material in its present form since the fall of 1980 to between 600 and 700 students each year. Most of these students have had little or no prior formal training in computation, although many have played with computers a bit and a few have had extensive programming or hardware-design experience. Our design of this introductory computer-science subject reflects two major concerns. First, we want to establish the idea that a computer language is not just a way of getting a computer to perform operations but rather that it is a novel formal medium for expressing ideas about methodology. Thus, programs must be written for people to read, and only incidentally for machines to execute. Second, we believe that the essential material to be addressed by a subject at this level is not the syntax of particular programming-language constructs, nor clever algorithms for computing particular functions efficiently, nor even the mathematical analysis of algorithms and the foundations of computing, but rather the techniques used to control the intellectual complexity of large software systems. Our goal is that students who complete this subject should have a good feel for the elements of style and the aesthetics of programming. They should have command of the major techniques for controlling complexity in a large system. They should be capable of reading a 50-page-long program, if it is written in an exemplary style. They should know what not to read, and what they need not understand at any moment. They should feel secure about modifying a program, retaining the spirit and style of the original author. These skills are by no means unique to computer programming. The techniques we teach and draw upon are common to all of engineering design. We control complexity by building abstractions that hide details when appropriate. We control complexity by establishing conventional interfaces that enable us to construct systems by combining standard, wellunderstood pieces in a “mix and match” way. We control complexity by establishing new languages for describing a design, each of which emphasizes particular aspects of the design and deemphasizes others. Underlying our approach to this subject is our conviction that “computer science” is not a science and that its significance has little to do with computers. The computer revolution is a revolution in the way we think and in the way we express what we think. The essence of this change is the emergence of what might best be called procedural epistemology— the study of the structure of knowledge from an imperative point of view, as opposed to

12

Structure and Interpretation of Computer Programs, 2e

the more declarative point of view taken by classical mathematical subjects. Mathematics provides a framework for dealing precisely with notions of “what is.” Computation provides a framework for dealing precisely with notions of “how to.” In teaching our material we use a dialect of the programming language Lisp. We never formally teach the language, because we don’t have to. We just use it, and students pick it up in a few days. This is one great advantage of Lisp-like languages: They have very few ways of forming compound expressions, and almost no syntactic structure. All of the formal properties can be covered in an hour, like the rules of chess. After a short time we forget about syntactic details of the language (because there are none) and get on with the real issues—figuring out what we want to compute, how we will decompose problems into manageable parts, and how we will work on the parts. Another advantage of Lisp is that it supports (but does not enforce) more of the large-scale strategies for modular decomposition of programs than any other language we know. We can make procedural and data abstractions, we can use higher-order functions to capture common patterns of usage, we can model local state using assignment and data mutation, we can link parts of a program with streams and delayed evaluation, and we can easily implement embedded languages. All of this is embedded in an interactive environment with excellent support for incremental program design, construction, testing, and debugging. We thank all the generations of Lisp wizards, starting with John McCarthy, who have fashioned a fine tool of unprecedented power and elegance. Scheme, the dialect of Lisp that we use, is an attempt to bring together the power and elegance of Lisp and Algol. From Lisp we take the metalinguistic power that derives from the simple syntax, the uniform representation of programs as data objects, and the garbagecollected heap-allocated data. From Algol we take lexical scoping and block structure, which are gifts from the pioneers of programming-language design who were on the Algol committee. We wish to cite John Reynolds and Peter Landin for their insights into the relationship of Church’s [lambda] calculus to the structure of programming languages. We also recognize our debt to the mathematicians who scouted out this territory decades before computers appeared on the scene. These pioneers include Alonzo Church, Barkley Rosser, Stephen Kleene, and Haskell Curry.

Acknowledgements

13

Acknowledgements We would like to thank the many people who have helped us develop this book and this curriculum. Our subject is a clear intellectual descendant of “6.231,” a wonderful subject on programming linguistics and the [lambda] calculus taught at MIT in the late 1960s by Jack Wozencraft and Arthur Evans, Jr. We owe a great debt to Robert Fano, who reorganized MIT’s introductory curriculum in electrical engineering and computer science to emphasize the principles of engineering design. He led us in starting out on this enterprise and wrote the first set of subject notes from which this book evolved. Much of the style and aesthetics of programming that we try to teach were developed in conjunction with Guy Lewis Steele Jr., who collaborated with Gerald Jay Sussman in the initial development of the Scheme language. In addition, David Turner, Peter Henderson, Dan Friedman, David Wise, and Will Clinger have taught us many of the techniques of the functional programming community that appear in this book. Joel Moses taught us about structuring large systems. His experience with the Macsyma system for symbolic computation provided the insight that one should avoid complexities of control and concentrate on organizing the data to reflect the real structure of the world being modeled. Marvin Minsky and Seymour Papert formed many of our attitudes about programming and its place in our intellectual lives. To them we owe the understanding that computation provides a means of expression for exploring ideas that would otherwise be too complex to deal with precisely. They emphasize that a student’s ability to write and modify programs provides a powerful medium in which exploring becomes a natural activity. We also strongly agree with Alan Perlis that programming is lots of fun and we had better be careful to support the joy of programming. Part of this joy derives from observing great masters at work. We are fortunate to have been apprentice programmers at the feet of Bill Gosper and Richard Greenblatt. It is difficult to identify all the people who have contributed to the development of our curriculum. We thank all the lecturers, recitation instructors, and tutors who have worked with us over the past fifteen years and put in many extra hours on our subject, especially Bill Siebert, Albert Meyer, Joe Stoy, Randy Davis, Louis Braida, Eric Grimson, Rod Brooks, Lynn Stein, and Peter Szolovits. We would like to specially acknowledge the outstanding teaching contributions of Franklyn Turbak, now at Wellesley; his work in undergraduate instruction set a standard that we can all aspire to. We are grateful to Jerry Saltzer and Jim Miller for helping us grapple with the mysteries of concurrency, and to Peter Szolovits and David McAllester for their contributions to the exposition of nondeterministic evaluation in Chapter 4 [Chapter 4], page 299. Many people have put in significant effort presenting this material at other universities. Some of the people we have worked closely with are Jacob Katzenelson at the Technion, Hardy Mayer at the University of California at Irvine, Joe Stoy at Oxford, Elisha Sacks at Purdue, and Jan Komorowski at the Norwegian University of Science and Technology. We are exceptionally proud of our colleagues who have received major teaching awards for

14

Structure and Interpretation of Computer Programs, 2e

their adaptations of this subject at other universities, including Kenneth Yip at Yale, Brian Harvey at the University of California at Berkeley, and Dan Huttenlocher at Cornell. Al Moy´e arranged for us to teach this material to engineers at Hewlett-Packard, and for the production of videotapes of these lectures. We would like to thank the talented instructors—in particular Jim Miller, Bill Siebert, and Mike Eisenberg—who have designed continuing education courses incorporating these tapes and taught them at universities and industry all over the world. Many educators in other countries have put in significant work translating the first edition. Michel Briand, Pierre Chamard, and Andr´e Pic produced a French edition; Susanne Daniels-Herold produced a German edition; and Fumio Motoyoshi produced a Japanese edition. We do not know who produced the Chinese edition, but we consider it an honor to have been selected as the subject of an “unauthorized” translation. It is hard to enumerate all the people who have made technical contributions to the development of the Scheme systems we use for instructional purposes. In addition to Guy Steele, principal wizards have included Chris Hanson, Joe Bowbeer, Jim Miller, Guillermo Rozas, and Stephen Adams. Others who have put in significant time are Richard Stallman, Alan Bawden, Kent Pitman, Jon Taft, Neil Mayle, John Lamping, Gwyn Osnos, Tracy Larrabee, George Carrette, Soma Chaudhuri, Bill Chiarchiaro, Steven Kirsch, Leigh Klotz, Wayne Noss, Todd Cass, Patrick O’Donnell, Kevin Theobald, Daniel Weise, Kenneth Sinclair, Anthony Courtemanche, Henry M. Wu, Andrew Berlin, and Ruth Shyu. Beyond the MIT implementation, we would like to thank the many people who worked on the IEEE Scheme standard, including William Clinger and Jonathan Rees, who edited the R^4RS, and Chris Haynes, David Bartley, Chris Hanson, and Jim Miller, who prepared the IEEE standard. Dan Friedman has been a long-time leader of the Scheme community. The community’s broader work goes beyond issues of language design to encompass significant educational innovations, such as the high-school curriculum based on EdScheme by Schemer’s Inc., and the wonderful books by Mike Eisenberg and by Brian Harvey and Matthew Wright. We appreciate the work of those who contributed to making this a real book, especially Terry Ehling, Larry Cohen, and Paul Bethge at the MIT Press. Ella Mazel found the wonderful cover image. For the second edition we are particularly grateful to Bernard and Ella Mazel for help with the book design, and to David Jones, TEX wizard extraordinaire. We also are indebted to those readers who made penetrating comments on the new draft: Jacob Katzenelson, Hardy Mayer, Jim Miller, and especially Brian Harvey, who did unto this book as Julie did unto his book Simply Scheme. Finally, we would like to acknowledge the support of the organizations that have encouraged this work over the years, including suppport from Hewlett-Packard, made possible by Ira Goldstein and Joel Birnbaum, and support from DARPA, made possible by Bob Kahn.

Chapter 1: Building Abstractions with Procedures

15

1 Building Abstractions with Procedures The acts of the mind, wherein it exerts its power over simple ideas, are chiefly these three: 1. Combining several simple ideas into one compound one, and thus all complex ideas are made. 2. The second is bringing two ideas, whether simple or complex, together, and setting them by one another so as to take a view of them at once, without uniting them into one, by which it gets all its ideas of relations. 3. The third is separating them from all other ideas that accompany them in their real existence: this is called abstraction, and thus all its general ideas are made. —John Locke, An Essay Concerning Human Understanding (1690) We are about to study the idea of a computational process. Computational processes are abstract beings that inhabit computers. As they evolve, processes manipulate other abstract things called data. The evolution of a process is directed by a pattern of rules called a program. People create programs to direct processes. In effect, we conjure the spirits of the computer with our spells. A computational process is indeed much like a sorcerer’s idea of a spirit. It cannot be seen or touched. It is not composed of matter at all. However, it is very real. It can perform intellectual work. It can answer questions. It can affect the world by disbursing money at a bank or by controlling a robot arm in a factory. The programs we use to conjure processes are like a sorcerer’s spells. They are carefully composed from symbolic expressions in arcane and esoteric programming languages that prescribe the tasks we want our processes to perform. A computational process, in a correctly working computer, executes programs precisely and accurately. Thus, like the sorcerer’s apprentice, novice programmers must learn to understand and to anticipate the consequences of their conjuring. Even small errors (usually called bugs or glitches) in programs can have complex and unanticipated consequences. Fortunately, learning to program is considerably less dangerous than learning sorcery, because the spirits we deal with are conveniently contained in a secure way. Real-world programming, however, requires care, expertise, and wisdom. A small bug in a computeraided design program, for example, can lead to the catastrophic collapse of an airplane or a dam or the self-destruction of an industrial robot. Master software engineers have the ability to organize programs so that they can be reasonably sure that the resulting processes will perform the tasks intended. They can visualize the behavior of their systems in advance. They know how to structure programs so that unanticipated problems do not lead to catastrophic consequences, and when problems do arise, they can debug their programs. Well-designed computational systems, like welldesigned automobiles or nuclear reactors, are designed in a modular manner, so that the parts can be constructed, replaced, and debugged separately.

Programming in Lisp We need an appropriate language for describing processes, and we will use for this purpose the programming language Lisp. Just as our everyday thoughts are usually expressed in our natural language (such as English, French, or Japanese), and descriptions of quantitative phenomena are expressed with mathematical notations, our procedural thoughts will

16

Structure and Interpretation of Computer Programs, 2e

be expressed in Lisp. Lisp was invented in the late 1950s as a formalism for reasoning about the use of certain kinds of logical expressions, called recursion equations, as a model for computation. The language was conceived by John McCarthy and is based on his paper “Recursive Functions of Symbolic Expressions and Their Computation by Machine” (McCarthy 1960). Despite its inception as a mathematical formalism, Lisp is a practical programming language. A Lisp interpreter is a machine that carries out processes described in the Lisp language. The first Lisp interpreter was implemented by McCarthy with the help of colleagues and students in the Artificial Intelligence Group of the MIT Research Laboratory of Electronics and in the MIT Computation Center.1 Lisp, whose name is an acronym for LISt Processing, was designed to provide symbol-manipulating capabilities for attacking programming problems such as the symbolic differentiation and integration of algebraic expressions. It included for this purpose new data objects known as atoms and lists, which most strikingly set it apart from all other languages of the period. Lisp was not the product of a concerted design effort. Instead, it evolved informally in an experimental manner in response to users’ needs and to pragmatic implementation considerations. Lisp’s informal evolution has continued through the years, and the community of Lisp users has traditionally resisted attempts to promulgate any “official” definition of the language. This evolution, together with the flexibility and elegance of the initial conception, has enabled Lisp, which is the second oldest language in widespread use today (only Fortran is older), to continually adapt to encompass the most modern ideas about program design. Thus, Lisp is by now a family of dialects, which, while sharing most of the original features, may differ from one another in significant ways. The dialect of Lisp used in this book is called Scheme.2 Because of its experimental character and its emphasis on symbol manipulation, Lisp was at first very inefficient for numerical computations, at least in comparison with Fortran. Over the years, however, Lisp compilers have been developed that translate programs into machine code that can perform numerical computations reasonably efficiently. And for special applications, Lisp has been used with great effectiveness.3 Although Lisp has not yet overcome its old reputation as hopelessly inefficient, Lisp is now used in many applications 1 2

3

The Lisp 1 Programmer’s Manual appeared in 1960, and the Lisp 1.5 Programmer’s Manual (McCarthy 1965) was published in 1962. The early history of Lisp is described in McCarthy 1978. The two dialects in which most major Lisp programs of the 1970s were written are MacLisp (Moon 1978; Pitman 1983), developed at the MIT Project MAC, and Interlisp (Teitelman 1974), developed at Bolt Beranek and Newman Inc. and the Xerox Palo Alto Research Center. Portable Standard Lisp (Hearn 1969; Griss 1981) was a Lisp dialect designed to be easily portable between different machines. MacLisp spawned a number of subdialects, such as Franz Lisp, which was developed at the University of California at Berkeley, and Zetalisp (Moon 1981), which was based on a special-purpose processor designed at the MIT Artificial Intelligence Laboratory to run Lisp very efficiently. The Lisp dialect used in this book, called Scheme (Steele 1975), was invented in 1975 by Guy Lewis Steele Jr. and Gerald Jay Sussman of the MIT Artificial Intelligence Laboratory and later reimplemented for instructional use at MIT. Scheme became an IEEE standard in 1990 (IEEE 1990). The Common Lisp dialect (Steele 1982, Steele 1990) was developed by the Lisp community to combine features from the earlier Lisp dialects to make an industrial standard for Lisp. Common Lisp became an ANSI standard in 1994 (ANSI 1994). One such special application was a breakthrough computation of scientific importance—an integration of the motion of the Solar System that extended previous results by nearly two orders of magnitude, and demonstrated that the dynamics of the Solar System is chaotic. This computation was made possible by new integration algorithms, a special-purpose compiler, and a special-purpose computer all implemented with the aid of software tools written in Lisp (Abelson et al. 1992; Sussman and Wisdom 1992).

Chapter 1: Building Abstractions with Procedures

17

where efficiency is not the central concern. For example, Lisp has become a language of choice for operating-system shell languages and for extension languages for editors and computer-aided design systems. If Lisp is not a mainstream language, why are we using it as the framework for our discussion of programming? Because the language possesses unique features that make it an excellent medium for studying important programming constructs and data structures and for relating them to the linguistic features that support them. The most significant of these features is the fact that Lisp descriptions of processes, called procedures, can themselves be represented and manipulated as Lisp data. The importance of this is that there are powerful program-design techniques that rely on the ability to blur the traditional distinction between “passive” data and “active” processes. As we shall discover, Lisp’s flexibility in handling procedures as data makes it one of the most convenient languages in existence for exploring these techniques. The ability to represent procedures as data also makes Lisp an excellent language for writing programs that must manipulate other programs as data, such as the interpreters and compilers that support computer languages. Above and beyond these considerations, programming in Lisp is great fun.

1.1 The Elements of Programming A powerful programming language is more than just a means for instructing a computer to perform tasks. The language also serves as a framework within which we organize our ideas about processes. Thus, when we describe a language, we should pay particular attention to the means that the language provides for combining simple ideas to form more complex ideas. Every powerful language has three mechanisms for accomplishing this: primitive expressions which represent the simplest entities the language is concerned with, means of combination by which compound elements are built from simpler ones, and means of abstraction by which compound elements can be named and manipulated as units. In programming, we deal with two kinds of elements: procedures and data. (Later we will discover that they are really not so distinct.) Informally, data is “stuff” that we want to manipulate, and procedures are descriptions of the rules for manipulating the data. Thus, any powerful programming language should be able to describe primitive data and primitive procedures and should have methods for combining and abstracting procedures and data. In this chapter we will deal only with simple numerical data so that we can focus on the rules for building procedures.4 In later chapters we will see that these same rules allow us to build procedures to manipulate compound data as well. 4

The characterization of numbers as “simple data” is a barefaced bluff. In fact, the treatment of numbers is one of the trickiest and most confusing aspects of any programming language. Some typical issues involved are these: Some computer systems distinguish integers, such as 2, from real numbers, such as 2.71. Is the real number 2.00 different from the integer 2? Are the arithmetic operations used for integers the same as the operations used for real numbers? Does 6 divided by 2 produce 3, or 3.0? How large a number can we represent? How many decimal places of accuracy can we represent? Is the range of integers the same as the range of real numbers? Above and beyond these questions, of course, lies a collection of issues concerning roundoff and truncation errors – the entire science of numerical analysis. Since our focus in this book is on large-scale program design rather than on numerical techniques, we are

18

Structure and Interpretation of Computer Programs, 2e

1.1.1 Expressions One easy way to get started at programming is to examine some typical interactions with an interpreter for the Scheme dialect of Lisp. Imagine that you are sitting at a computer terminal. You type an expression, and the interpreter responds by displaying the result of its evaluating that expression. One kind of primitive expression you might type is a number. (More precisely, the expression that you type consists of the numerals that represent the number in base 10.) If you present Lisp with a number 486 the interpreter will respond by printing5 486 Expressions representing numbers may be combined with an expression representing a primitive procedure (such as + or *) to form a compound expression that represents the application of the procedure to those numbers. For example: (+ 137 349) 486 (- 1000 334) 666 (* 5 99) 495 (/ 10 5) 2 (+ 2.7 10) 12.7 Expressions such as these, formed by delimiting a list of expressions within parentheses in order to denote procedure application, are called combinations. The leftmost element in the list is called the operator, and the other elements are called operands. The value of a combination is obtained by applying the procedure specified by the operator to the arguments that are the values of the operands. The convention of placing the operator to the left of the operands is known as prefix notation, and it may be somewhat confusing at first because it departs significantly from the customary mathematical convention. Prefix notation has several advantages, however. One of them is that it can accommodate procedures that may take an arbitrary number of arguments, as in the following examples: (+ 21 35 12 7)

5

going to ignore these problems. The numerical examples in this chapter will exhibit the usual roundoff behavior that one observes when using arithmetic operations that preserve a limited number of decimal places of accuracy in noninteger operations. Throughout this book, when we wish to emphasize the distinction between the input typed by the user and the response printed by the interpreter, we will show the latter in slanted characters.

Chapter 1: Building Abstractions with Procedures

19

75 (* 25 4 12) 1200 No ambiguity can arise, because the operator is always the leftmost element and the entire combination is delimited by the parentheses. A second advantage of prefix notation is that it extends in a straightforward way to allow combinations to be nested, that is, to have combinations whose elements are themselves combinations: (+ (* 3 5) (- 10 6)) 19 There is no limit (in principle) to the depth of such nesting and to the overall complexity of the expressions that the Lisp interpreter can evaluate. It is we humans who get confused by still relatively simple expressions such as (+ (* 3 (+ (* 2 4) (+ 3 5))) (+ (- 10 7) 6)) which the interpreter would readily evaluate to be 57. We can help ourselves by writing such an expression in the form (+ (* 3 (+ (* 2 4) (+ 3 5))) (+ (- 10 7) 6)) following a formatting convention known as pretty-printing, in which each long combination is written so that the operands are aligned vertically. The resulting indentations display clearly the structure of the expression.6 Even with complex expressions, the interpreter always operates in the same basic cycle: It reads an expression from the terminal, evaluates the expression, and prints the result. This mode of operation is often expressed by saying that the interpreter runs in a readeval-print loop. Observe in particular that it is not necessary to explicitly instruct the interpreter to print the value of the expression.7

1.1.2 Naming and the Environment A critical aspect of a programming language is the means it provides for using names to refer to computational objects. We say that the name identifies a variable whose value is the object. In the Scheme dialect of Lisp, we name things with define. Typing (define size 2) 6

7

Lisp systems typically provide features to aid the user in formatting expressions. Two especially useful features are one that automatically indents to the proper pretty-print position whenever a new line is started and one that highlights the matching left parenthesis whenever a right parenthesis is typed. Lisp obeys the convention that every expression has a value. This convention, together with the old reputation of Lisp as an inefficient language, is the source of the quip by Alan Perlis (paraphrasing Oscar Wilde) that “Lisp programmers know the value of everything but the cost of nothing.”

20

Structure and Interpretation of Computer Programs, 2e

causes the interpreter to associate the value 2 with the name size.8 Once the name size has been associated with the number 2, we can refer to the value 2 by name: size 2 (* 5 size) 10 Here are further examples of the use of define: (define pi 3.14159) (define radius 10) (* pi (* radius radius)) 314.159 (define circumference (* 2 pi radius)) circumference 62.8318 Define is our language’s simplest means of abstraction, for it allows us to use simple names to refer to the results of compound operations, such as the circumference computed above. In general, computational objects may have very complex structures, and it would be extremely inconvenient to have to remember and repeat their details each time we want to use them. Indeed, complex programs are constructed by building, step by step, computational objects of increasing complexity. The interpreter makes this step-by-step program construction particularly convenient because name-object associations can be created incrementally in successive interactions. This feature encourages the incremental development and testing of programs and is largely responsible for the fact that a Lisp program usually consists of a large number of relatively simple procedures. It should be clear that the possibility of associating values with symbols and later retrieving them means that the interpreter must maintain some sort of memory that keeps track of the name-object pairs. This memory is called the environment (more precisely the global environment, since we will see later that a computation may involve a number of different environments).9

1.1.3 Evaluating Combinations One of our goals in this chapter is to isolate issues about thinking procedurally. As a case in point, let us consider that, in evaluating combinations, the interpreter is itself following a procedure. To evaluate a combination, do the following: 1. Evaluate the subexpressions of the combination. 8 9

In this book, we do not show the interpreter’s response to evaluating definitions, since this is highly implementation-dependent. Chapter 3 [Chapter 3], page 183 will show that this notion of environment is crucial, both for understanding how the interpreter works and for implementing interpreters.

Chapter 1: Building Abstractions with Procedures

21

2. Apply the procedure that is the value of the leftmost subexpression (the operator) to the arguments that are the values of the other subexpressions (the operands). Even this simple rule illustrates some important points about processes in general. First, observe that the first step dictates that in order to accomplish the evaluation process for a combination we must first perform the evaluation process on each element of the combination. Thus, the evaluation rule is recursive in nature; that is, it includes, as one of its steps, the need to invoke the rule itself.10 Notice how succinctly the idea of recursion can be used to express what, in the case of a deeply nested combination, would otherwise be viewed as a rather complicated process. For example, evaluating (* (+ 2 (* 4 6)) (+ 3 5 7)) requires that the evaluation rule be applied to four different combinations. We can obtain a picture of this process by representing the combination in the form of a tree, as shown in [Figure 1-1], page 21. Each combination is represented by a node with branches corresponding to the operator and the operands of the combination stemming from it. The terminal nodes (that is, nodes with no branches stemming from them) represent either operators or numbers. Viewing evaluation in terms of the tree, we can imagine that the values of the operands percolate upward, starting from the terminal nodes and then combining at higher and higher levels. In general, we shall see that recursion is a very powerful technique for dealing with hierarchical, treelike objects. In fact, the “percolate values upward” form of the evaluation rule is an example of a general kind of process known as tree accumulation. Figure 1.1: Tree representation, showing the value of each subcombination. 390 /|\____________ / | \ * 26 15 /|\ | / | \ // \\ + 2 24 / | | \ /|\ + 3 5 7 / | \ * 4 6 Next, observe that the repeated application of the first step brings us to the point where we need to evaluate, not combinations, but primitive expressions such as numerals, built-in operators, or other names. We take care of the primitive cases by stipulating that • the values of numerals are the numbers that they name, • the values of built-in operators are the machine instruction sequences that carry out the corresponding operations, and 10

It may seem strange that the evaluation rule says, as part of the first step, that we should evaluate the leftmost element of a combination, since at this point that can only be an operator such as + or * representing a built-in primitive procedure such as addition or multiplication. We will see later that it is useful to be able to work with combinations whose operators are themselves compound expressions.

22

Structure and Interpretation of Computer Programs, 2e

• the values of other names are the objects associated with those names in the environment. We may regard the second rule as a special case of the third one by stipulating that symbols such as + and * are also included in the global environment, and are associated with the sequences of machine instructions that are their “values.” The key point to notice is the role of the environment in determining the meaning of the symbols in expressions. In an interactive language such as Lisp, it is meaningless to speak of the value of an expression such as (+ x 1) without specifying any information about the environment that would provide a meaning for the symbol x (or even for the symbol +). As we shall see in Chapter 3 [Chapter 3], page 183, the general notion of the environment as providing a context in which evaluation takes place will play an important role in our understanding of program execution. Notice that the evaluation rule given above does not handle definitions. For instance, evaluating (define x 3) does not apply define to two arguments, one of which is the value of the symbol x and the other of which is 3, since the purpose of the define is precisely to associate x with a value. (That is, (define x 3) is not a combination.) Such exceptions to the general evaluation rule are called special forms. Define is the only example of a special form that we have seen so far, but we will meet others shortly. Each special form has its own evaluation rule. The various kinds of expressions (each with its associated evaluation rule) constitute the syntax of the programming language. In comparison with most other programming languages, Lisp has a very simple syntax; that is, the evaluation rule for expressions can be described by a simple general rule together with specialized rules for a small number of special forms.11

1.1.4 Compound Procedures We have identified in Lisp some of the elements that must appear in any powerful programming language: • Numbers and arithmetic operations are primitive data and procedures. • Nesting of combinations provides a means of combining operations. • Definitions that associate names with values provide a limited means of abstraction. Now we will learn about procedure definitions, a much more powerful abstraction technique by which a compound operation can be given a name and then referred to as a unit. We begin by examining how to express the idea of “squaring.” We might say, “To square something, multiply it by itself.” This is expressed in our language as (define (square x) (* x x)) We can understand this in the following way: 11

Special syntactic forms that are simply convenient alternative surface structures for things that can be written in more uniform ways are sometimes called syntactic sugar, to use a phrase coined by Peter Landin. In comparison with users of other languages, Lisp programmers, as a rule, are less concerned with matters of syntax. (By contrast, examine any Pascal manual and notice how much of it is devoted to descriptions of syntax.) This disdain for syntax is due partly to the flexibility of Lisp, which makes it easy to change surface syntax, and partly to the observation that many “convenient” syntactic constructs, which make the language less uniform, end up causing more trouble than they are worth when programs become large and complex. In the words of Alan Perlis, “Syntactic sugar causes cancer of the semicolon.”

Chapter 1: Building Abstractions with Procedures

23

(define (square x) (* x x)) | | | | | | To square something, multiply it by itself. We have here a compound procedure, which has been given the name square. The procedure represents the operation of multiplying something by itself. The thing to be multiplied is given a local name, x, which plays the same role that a pronoun plays in natural language. Evaluating the definition creates this compound procedure and associates it with the name square.12 The general form of a procedure definition is (define ( ) ) The is a symbol to be associated with the procedure definition in the environment.13 The are the names used within the body of the procedure to refer to the corresponding arguments of the procedure. The is an expression that will yield the value of the procedure application when the formal parameters are replaced by the actual arguments to which the procedure is applied.14 The and the are grouped within parentheses, just as they would be in an actual call to the procedure being defined. Having defined square, we can now use it: (square 21) 441 (square (+ 2 5)) 49 (square (square 3)) 81 We can also use square as a building block in defining other procedures. For example, x ^2 + y^2 can be expressed as (+ (square x) (square y)) We can easily define a procedure sum-of-squares that, given any two numbers as arguments, produces the sum of their squares: (define (sum-of-squares x y) (+ (square x) (square y))) (sum-of-squares 3 4) 25 Now we can use sum-of-squares as a building block in constructing further procedures: 12

13

14

Observe that there are two different operations being combined here: we are creating the procedure, and we are giving it the name square. It is possible, indeed important, to be able to separate these two notions—to create procedures without naming them, and to give names to procedures that have already been created. We will see how to do this in section Section 1.3.2 [1-3-2], page 61. Throughout this book, we will describe the general syntax of expressions by using italic symbols delimited by angle brackets—e.g., —to denote the “slots” in the expression to be filled in when such an expression is actually used. More generally, the body of the procedure can be a sequence of expressions. In this case, the interpreter evaluates each expression in the sequence in turn and returns the value of the final expression as the value of the procedure application.

24

Structure and Interpretation of Computer Programs, 2e

(define (f a) (sum-of-squares (+ a 1) (* a 2))) (f 5) 136 Compound procedures are used in exactly the same way as primitive procedures. Indeed, one could not tell by looking at the definition of sum-of-squares given above whether square was built into the interpreter, like + and *, or defined as a compound procedure.

1.1.5 The Substitution Model for Procedure Application To evaluate a combination whose operator names a compound procedure, the interpreter follows much the same process as for combinations whose operators name primitive procedures, which we described in section Section 1.1.3 [1-1-3], page 20. That is, the interpreter evaluates the elements of the combination and applies the procedure (which is the value of the operator of the combination) to the arguments (which are the values of the operands of the combination). We can assume that the mechanism for applying primitive procedures to arguments is built into the interpreter. For compound procedures, the application process is as follows: To apply a compound procedure to arguments, evaluate the body of the procedure with each formal parameter replaced by the corresponding argument. To illustrate this process, let’s evaluate the combination (f 5) where f is the procedure defined in section Section 1.1.4 [1-1-4], page 22. We begin by retrieving the body of f: (sum-of-squares (+ a 1) (* a 2)) Then we replace the formal parameter a by the argument 5: (sum-of-squares (+ 5 1) (* 5 2)) Thus the problem reduces to the evaluation of a combination with two operands and an operator sum-of-squares. Evaluating this combination involves three subproblems. We must evaluate the operator to get the procedure to be applied, and we must evaluate the operands to get the arguments. Now (+ 5 1) produces 6 and (* 5 2) produces 10, so we must apply the sum-of-squares procedure to 6 and 10. These values are substituted for the formal parameters x and y in the body of sum-of-squares, reducing the expression to (+ (square 6) (square 10)) If we use the definition of square, this reduces to (+ (* 6 6) (* 10 10)) which reduces by multiplication to (+ 36 100) and finally to 136 The process we have just described is called the substitution model for procedure application. It can be taken as a model that determines the “meaning” of procedure application,

Chapter 1: Building Abstractions with Procedures

25

insofar as the procedures in this chapter are concerned. However, there are two points that should be stressed: • The purpose of the substitution is to help us think about procedure application, not to provide a description of how the interpreter really works. Typical interpreters do not evaluate procedure applications by manipulating the text of a procedure to substitute values for the formal parameters. In practice, the “substitution” is accomplished by using a local environment for the formal parameters. We will discuss this more fully in Chapter 3 [Chapter 3], page 183 and Chapter 4 [Chapter 4], page 299 when we examine the implementation of an interpreter in detail. • Over the course of this book, we will present a sequence of increasingly elaborate models of how interpreters work, culminating with a complete implementation of an interpreter and compiler in Chapter 5 [Chapter 5], page 405. The substitution model is only the first of these models—a way to get started thinking formally about the evaluation process. In general, when modeling phenomena in science and engineering, we begin with simplified, incomplete models. As we examine things in greater detail, these simple models become inadequate and must be replaced by more refined models. The substitution model is no exception. In particular, when we address in Chapter 3 [Chapter 3], page 183 the use of procedures with “mutable data,” we will see that the substitution model breaks down and must be replaced by a more complicated model of procedure application.15

Applicative order versus normal order According to the description of evaluation given in section Section 1.1.3 [1-1-3], page 20, the interpreter first evaluates the operator and operands and then applies the resulting procedure to the resulting arguments. This is not the only way to perform evaluation. An alternative evaluation model would not evaluate the operands until their values were needed. Instead it would first substitute operand expressions for parameters until it obtained an expression involving only primitive operators, and would then perform the evaluation. If we used this method, the evaluation of (f 5) would proceed according to the sequence of expansions (sum-of-squares (+ 5 1) (* 5 2)) (+

(square (+ 5 1))

(+ (* (+ 5 1) (+ 5 1)) followed by the reductions (+ (* 6 6) (+ 15

36

(square (* 5 2))

)

(* (* 5 2) (* 5 2))) (* 10 10)) 100)

Despite the simplicity of the substitution idea, it turns out to be surprisingly complicated to give a rigorous mathematical definition of the substitution process. The problem arises from the possibility of confusion between the names used for the formal parameters of a procedure and the (possibly identical) names used in the expressions to which the procedure may be applied. Indeed, there is a long history of erroneous definitions of substitution in the literature of logic and programming semantics. See Stoy 1977 for a careful discussion of substitution.

26

Structure and Interpretation of Computer Programs, 2e

136 This gives the same answer as our previous evaluation model, but the process is different. In particular, the evaluations of (+ 5 1) and (* 5 2) are each performed twice here, corresponding to the reduction of the expression (* x x) with x replaced respectively by (+ 5 1) and (* 5 2). This alternative “fully expand and then reduce” evaluation method is known as normalorder evaluation, in contrast to the “evaluate the arguments and then apply” method that the interpreter actually uses, which is called applicative-order evaluation. It can be shown that, for procedure applications that can be modeled using substitution (including all the procedures in the first two chapters of this book) and that yield legitimate values, normalorder and applicative-order evaluation produce the same value. (See [Exercise 1-5], page 29 for an instance of an “illegitimate” value where normal-order and applicative-order evaluation do not give the same result.) Lisp uses applicative-order evaluation, partly because of the additional efficiency obtained from avoiding multiple evaluations of expressions such as those illustrated with (+ 5 1) and (* 5 2) above and, more significantly, because normal-order evaluation becomes much more complicated to deal with when we leave the realm of procedures that can be modeled by substitution. On the other hand, normal-order evaluation can be an extremely valuable tool, and we will investigate some of its implications in Chapter 3 [Chapter 3], page 183 and Chapter 4 [Chapter 4], page 299.16

1.1.6 Conditional Expressions and Predicates The expressive power of the class of procedures that we can define at this point is very limited, because we have no way to make tests and to perform different operations depending on the result of a test. For instance, we cannot define a procedure that computes the absolute value of a number by testing whether the number is positive, negative, or zero and taking different actions in the different cases according to the rule / | |x| = < | \

x 0 -x

if x > 0 if x = 0 if x < 0

This construct is called a case analysis, and there is a special form in Lisp for notating such a case analysis. It is called cond (which stands for “conditional”), and it is used as follows: (define (abs x) (cond ((> x 0) x) ((= x 0) 0) ((< x 0) (- x)))) The general form of a conditional expression is 16

In Chapter 3 [Chapter 3], page 183 we will introduce stream processing, which is a way of handling apparently “infinite” data structures by incorporating a limited form of normal-order evaluation. In section Section 4.2 [4-2], page 331 we will modify the Scheme interpreter to produce a normal-order variant of Scheme.

Chapter 1: Building Abstractions with Procedures

27

(cond ( ) ( ) ... ( )) consisting of the symbol cond followed by parenthesized pairs of expressions ( ) called clauses. The first expression in each pair is a predicate—that is, an expression whose value is interpreted as either true or false.17 Conditional expressions are evaluated as follows. The predicate is evaluated first. If its value is false, then is evaluated. If ’s value is also false, then is evaluated. This process continues until a predicate is found whose value is true, in which case the interpreter returns the value of the corresponding consequent expression of the clause as the value of the conditional expression. If none of the ’s is found to be true, the value of the cond is undefined. The word predicate is used for procedures that return true or false, as well as for expressions that evaluate to true or false. The absolute-value procedure abs makes use of the primitive predicates >, x 5) (< x 10)) As another example, we can define a predicate to test whether one number is greater than or equal to another as (define (>= x y) (or (> x y) (= x y))) or alternatively as (define (>= x y) (not (< x y))) Exercise 1.1: Below is a sequence of expressions. What is the result printed by the interpreter in response to each expression? Assume that the sequence is to be evaluated in the order in which it is presented. 10 (+ 5 3 4) (- 9 1) 19

A minor difference between if and cond is that the part of each cond clause may be a sequence of expressions. If the corresponding is found to be true, the expressions are evaluated in sequence and the value of the final expression in the sequence is returned as the value of the cond. In an if expression, however, the and must be single expressions.

Chapter 1: Building Abstractions with Procedures

(/ 6 2) (+ (* 2 4) (- 4 6)) (define a 3) (define b (+ a 1)) (+ a b (* a b)) (= a b) (if (and (> b a) (< b (* a b))) b a) (cond ((= a 4) 6) ((= b 4) (+ 6 7 a)) (else 25)) (+ 2 (if (> b a) b a)) (* (cond ((> a b) a) ((< a b) b) (else -1)) (+ a 1)) Exercise 1.2: Translate the following expression into prefix form. 5 + 4 + (2 - (3 - (6 + 4/5))) ----------------------------3(6 - 2)(2 - 7) Exercise 1.3: Define a procedure that takes three numbers as arguments and returns the sum of the squares of the two larger numbers. Exercise 1.4: Observe that our model of evaluation allows for combinations whose operators are compound expressions. Use this observation to describe the behavior of the following procedure: (define (a-plus-abs-b a b) ((if (> b 0) + -) a b)) Exercise 1.5: Ben Bitdiddle has invented a test to determine whether the interpreter he is faced with is using applicative-order evaluation or normal-order evaluation. He defines the following two procedures: (define (p) (p)) (define (test x y) (if (= x 0) 0

29

30

Structure and Interpretation of Computer Programs, 2e

y)) Then he evaluates the expression (test 0 (p)) What behavior will Ben observe with an interpreter that uses applicative-order evaluation? What behavior will he observe with an interpreter that uses normalorder evaluation? Explain your answer. (Assume that the evaluation rule for the special form if is the same whether the interpreter is using normal or applicative order: The predicate expression is evaluated first, and the result determines whether to evaluate the consequent or the alternative expression.)

1.1.7 Example: Square Roots by Newton’s Method Procedures, as introduced above, are much like ordinary mathematical functions. They specify a value that is determined by one or more parameters. But there is an important difference between mathematical functions and computer procedures. Procedures must be effective. As a case in point, consider the problem of computing square roots. We can define the square-root function as sqrt(x) = the y such that y >= 0 and y^2 = x This describes a perfectly legitimate mathematical function. We could use it to recognize whether one number is the square root of another, or to derive facts about square roots in general. On the other hand, the definition does not describe a procedure. Indeed, it tells us almost nothing about how to actually find the square root of a given number. It will not help matters to rephrase this definition in pseudo-Lisp: (define (sqrt x) (the y (and (>= y 0) (= (square y) x)))) This only begs the question. The contrast between function and procedure is a reflection of the general distinction between describing properties of things and describing how to do things, or, as it is sometimes referred to, the distinction between declarative knowledge and imperative knowledge. In mathematics we are usually concerned with declarative (what is) descriptions, whereas in computer science we are usually concerned with imperative (how to) descriptions.20 How does one compute square roots? The most common way is to use Newton’s method of successive approximations, which says that whenever we have a guess y for the value of the square root of a number x, we can perform a simple manipulation to get a better 20

Declarative and imperative descriptions are intimately related, as indeed are mathematics and computer science. For instance, to say that the answer produced by a program is “correct” is to make a declarative statement about the program. There is a large amount of research aimed at establishing techniques for proving that programs are correct, and much of the technical difficulty of this subject has to do with negotiating the transition between imperative statements (from which programs are constructed) and declarative statements (which can be used to deduce things). In a related vein, an important current area in programming-language design is the exploration of so-called very high-level languages, in which one actually programs in terms of declarative statements. The idea is to make interpreters sophisticated enough so that, given “what is” knowledge specified by the programmer, they can generate “how to” knowledge automatically. This cannot be done in general, but there are important areas where progress has been made. We shall revisit this idea in Chapter 4 [Chapter 4], page 299.

Chapter 1: Building Abstractions with Procedures

31

guess (one closer to the actual square root) by averaging y with x /y.21 For example, we can compute the square root of 2 as follows. Suppose our initial guess is 1: Guess Quotient Average 1 (2/1) = 2 ((2 + 1)/2) = 1.5 1.5 (2/1.5) = 1.3333 ((1.3333 + 1.5)/2) = 1.4167 1.4167 (2/1.4167) = 1.4118 ((1.4167 + 1.4118)/2) = 1.4142 1.4142 ... ... Continuing this process, we obtain better and better approximations to the square root. Now let’s formalize the process in terms of procedures. We start with a value for the radicand (the number whose square root we are trying to compute) and a value for the guess. If the guess is good enough for our purposes, we are done; if not, we must repeat the process with an improved guess. We write this basic strategy as a procedure: (define (sqrt-iter guess x) (if (good-enough? guess x) guess (sqrt-iter (improve guess x) x))) A guess is improved by averaging it with the quotient of the radicand and the old guess: (define (improve guess x) (average guess (/ x guess))) where (define (average x y) (/ (+ x y) 2)) We also have to say what we mean by “good enough.” The following will do for illustration, but it is not really a very good test. (See exercise [Exercise 1-7], page 32.) The idea is to improve the answer until it is close enough so that its square differs from the radicand by less than a predetermined tolerance (here 0.001):22 (define (good-enough? guess x) (< (abs (- (square guess) x)) 0.001)) Finally, we need a way to get started. For instance, we can always guess that the square root of any number is 1:23 21

22

23

This square-root algorithm is actually a special case of Newton’s method, which is a general technique for finding roots of equations. The square-root algorithm itself was developed by Heron of Alexandria in the first century A.D. We will see how to express the general Newton’s method as a Lisp procedure in section Section 1.3.4 [1-3-4], page 69. We will usually give predicates names ending with question marks, to help us remember that they are predicates. This is just a stylistic convention. As far as the interpreter is concerned, the question mark is just an ordinary character. Observe that we express our initial guess as 1.0 rather than 1. This would not make any difference in many Lisp implementations. MIT Scheme, however, distinguishes between exact integers and decimal values, and dividing two integers produces a rational number rather than a decimal. For example, dividing 10 by 6 yields 5/3, while dividing 10.0 by 6.0 yields 1.6666666666666667. (We will learn how to implement arithmetic on rational numbers in section Section 2.1.1 [2-1-1], page 78.) If we start with an initial guess of 1 in our square-root program, and x is an exact integer, all subsequent values produced in the square-root computation will be rational numbers rather than decimals. Mixed operations on rational numbers and decimals always yield decimals, so starting with an initial guess of 1.0 forces all subsequent values to be decimals.

32

Structure and Interpretation of Computer Programs, 2e

(define (sqrt x) (sqrt-iter 1.0 x)) If we type these definitions to the interpreter, we can use sqrt just as we can use any procedure: (sqrt 9) 3.00009155413138 (sqrt (+ 100 37)) 11.704699917758145 (sqrt (+ (sqrt 2) (sqrt 3))) 1.7739279023207892 (square (sqrt 1000)) 1000.000369924366 The sqrt program also illustrates that the simple procedural language we have introduced so far is sufficient for writing any purely numerical program that one could write in, say, C or Pascal. This might seem surprising, since we have not included in our language any iterative (looping) constructs that direct the computer to do something over and over again. Sqrt-iter, on the other hand, demonstrates how iteration can be accomplished using no special construct other than the ordinary ability to call a procedure.24 Exercise 1.6: Alyssa P. Hacker doesn’t see why if needs to be provided as a special form. “Why can’t I just define it as an ordinary procedure in terms of cond?” she asks. Alyssa’s friend Eva Lu Ator claims this can indeed be done, and she defines a new version of if: (define (new-if predicate then-clause else-clause) (cond (predicate then-clause) (else else-clause))) Eva demonstrates the program for Alyssa: (new-if (= 2 3) 0 5) 5 (new-if (= 1 1) 0 5) 0 Delighted, Alyssa uses new-if to rewrite the square-root program: (define (sqrt-iter guess x) (new-if (good-enough? guess x) guess (sqrt-iter (improve guess x) x))) What happens when Alyssa attempts to use this to compute square roots? Explain. 24

Readers who are worried about the efficiency issues involved in using procedure calls to implement iteration should note the remarks on “tail recursion” in section Section 1.2.1 [1-2-1], page 37.

Chapter 1: Building Abstractions with Procedures

33

Exercise 1.7: The good-enough? test used in computing square roots will not be very effective for finding the square roots of very small numbers. Also, in real computers, arithmetic operations are almost always performed with limited precision. This makes our test inadequate for very large numbers. Explain these statements, with examples showing how the test fails for small and large numbers. An alternative strategy for implementing good-enough? is to watch how guess changes from one iteration to the next and to stop when the change is a very small fraction of the guess. Design a square-root procedure that uses this kind of end test. Does this work better for small and large numbers? Exercise 1.8: Newton’s method for cube roots is based on the fact that if y is an approximation to the cube root of x, then a better approximation is given by the value x/y^2 + 2y ---------3 Use this formula to implement a cube-root procedure analogous to the squareroot procedure. (In section Section 1.3.4 [1-3-4], page 69 we will see how to implement Newton’s method in general as an abstraction of these square-root and cube-root procedures.)

1.1.8 Procedures as Black-Box Abstractions Sqrt is our first example of a process defined by a set of mutually defined procedures. Notice that the definition of sqrt-iter is recursive; that is, the procedure is defined in terms of itself. The idea of being able to define a procedure in terms of itself may be disturbing; it may seem unclear how such a “circular” definition could make sense at all, much less specify a well-defined process to be carried out by a computer. This will be addressed more carefully in section Section 1.2 [1-2], page 37. But first let’s consider some other important points illustrated by the sqrt example. Observe that the problem of computing square roots breaks up naturally into a number of subproblems: how to tell whether a guess is good enough, how to improve a guess, and so on. Each of these tasks is accomplished by a separate procedure. The entire sqrt program can be viewed as a cluster of procedures (shown in [Figure 1-2], page 33) that mirrors the decomposition of the problem into subproblems. Figure 1.2: Procedural decomposition of the sqrt program. sqrt | sqrt-iter / \ good-enough improve / \ | square abs average The importance of this decomposition strategy is not simply that one is dividing the program into parts. After all, we could take any large program and divide it into parts— the first ten lines, the next ten lines, the next ten lines, and so on. Rather, it is crucial that each procedure accomplishes an identifiable task that can be used as a module in

34

Structure and Interpretation of Computer Programs, 2e

defining other procedures. For example, when we define the good-enough? procedure in terms of square, we are able to regard the square procedure as a “black box.” We are not at that moment concerned with how the procedure computes its result, only with the fact that it computes the square. The details of how the square is computed can be suppressed, to be considered at a later time. Indeed, as far as the good-enough? procedure is concerned, square is not quite a procedure but rather an abstraction of a procedure, a so-called procedural abstraction. At this level of abstraction, any procedure that computes the square is equally good. Thus, considering only the values they return, the following two procedures for squaring a number should be indistinguishable. Each takes a numerical argument and produces the square of that number as the value.25 (define (square x) (* x x)) (define (square x) (exp (double (log x)))) (define (double x) (+ x x)) So a procedure definition should be able to suppress detail. The users of the procedure may not have written the procedure themselves, but may have obtained it from another programmer as a black box. A user should not need to know how the procedure is implemented in order to use it.

Local names One detail of a procedure’s implementation that should not matter to the user of the procedure is the implementer’s choice of names for the procedure’s formal parameters. Thus, the following procedures should not be distinguishable: (define (square x) (* x x)) (define (square y) (* y y)) This principle—that the meaning of a procedure should be independent of the parameter names used by its author—seems on the surface to be self-evident, but its consequences are profound. The simplest consequence is that the parameter names of a procedure must be local to the body of the procedure. For example, we used square in the definition of good-enough? in our square-root procedure: (define (good-enough? guess x) (< (abs (- (square guess) x)) 0.001)) The intention of the author of good-enough? is to determine if the square of the first argument is within a given tolerance of the second argument. We see that the author of good-enough? used the name guess to refer to the first argument and x to refer to the second argument. The argument of square is guess. If the author of square used x (as above) to refer to that argument, we see that the x in good-enough? must be a different x 25

It is not even clear which of these procedures is a more efficient implementation. This depends upon the hardware available. There are machines for which the “obvious” implementation is the less efficient one. Consider a machine that has extensive tables of logarithms and antilogarithms stored in a very efficient manner.

Chapter 1: Building Abstractions with Procedures

35

than the one in square. Running the procedure square must not affect the value of x that is used by good-enough?, because that value of x may be needed by good-enough? after square is done computing. If the parameters were not local to the bodies of their respective procedures, then the parameter x in square could be confused with the parameter x in good-enough?, and the behavior of good-enough? would depend upon which version of square we used. Thus, square would not be the black box we desired. A formal parameter of a procedure has a very special role in the procedure definition, in that it doesn’t matter what name the formal parameter has. Such a name is called a bound variable, and we say that the procedure definition binds its formal parameters. The meaning of a procedure definition is unchanged if a bound variable is consistently renamed throughout the definition.26 If a variable is not bound, we say that it is free. The set of expressions for which a binding defines a name is called the scope of that name. In a procedure definition, the bound variables declared as the formal parameters of the procedure have the body of the procedure as their scope. In the definition of good-enough? above, guess and x are bound variables but . 1 . . 0 . . . . . .>. .. Consider the pattern of this computation. To compute (fib 5), we compute (fib 4) and (fib 3). To compute (fib 4), we compute (fib 3) and (fib 2). In general, the evolved process looks like a tree, as shown in [Figure 1-5], page 42. Notice that the branches split into two at each level (except at the bottom); this reflects the fact that the fib procedure calls itself twice each time it is invoked. This procedure is instructive as a prototypical tree recursion, but it is a terrible way to compute Fibonacci numbers because it does so much redundant computation. Notice in [Figure 1-5], page 42 that the entire computation of (fib 3)—almost half the work— is duplicated. In fact, it is not hard to show that the number of times the procedure will compute (fib 1) or (fib 0) (the number of leaves in the above tree, in general) is precisely Fib(n + 1). To get an idea of how bad this is, one can show that the value of Fib(n) grows exponentially with n. More precisely (see [Exercise 1-13], page 45), Fib(n) is the closest integer to [phi] ^n /[sqrt] (5), where [phi] = (1 + [sqrt]5)/2 ~= 1.6180 is the golden ratio, which satisfies the equation [phi]^2 = [phi] + 1 Thus, the process uses a number of steps that grows exponentially with the input. On the other hand, the space required grows only linearly with the input, because we need keep track only of which nodes are above us in the tree at any point in the computation. In general, the number of steps required by a tree-recursive process will be proportional to the number of nodes in the tree, while the space required will be proportional to the maximum depth of the tree. We can also formulate an iterative process for computing the Fibonacci numbers. The idea is to use a pair of integers a and b, initialized to Fib(1) = 1 and Fib(0) = 0, and to repeatedly apply the simultaneous transformations

Chapter 1: Building Abstractions with Procedures

43

a

We are a sharing community. So please help us by uploading **1** new document or like us to download:

OR LIKE TO DOWNLOAD IMMEDIATELY