👩‍💻 chrismanbrown.gitlab.io

Let’s Read: Structure and Interpretation of Computer Programs

Chapter 1: Part 1


This is the second entry in the Let’s Read SICP series.

« previous | index | next »


1 Building Abstractions With Procedures

Chapter 1 here we go!

1.1 The Elements of Programming

Here we lay the foundations of programming and define a lot of terminology.

One tricky concept is the difference between applicative-order evaluation and normal-order evaluation, which are two different ways that interpreters can evaluate code. It seems to me like a complicated concept to include alongside other content here such as “a variable is an abstraction of a hard-coded value.”

The way I understand it is this.

Applicative-order evaluation, which is what Scheme uses, evaluates a procedure application by first evaluating the operator and all of its operands, and then applying the operator to the operands.

In normal-order evaluation, evaluation of the operands is deferred until their values are actually needed, after the expression has been fully expanded into nothing but primitive operators and data. I think this is what is called “lazy evaluation.”

You can test out which method your interpreter uses with the following test.

(define (infinite) (infinite)) ; start an infinite loop

(define (test x y)
  (if (= x 0)
    0
    y))

(test 0 (infinite))

Applicative-order will evaluate all of the operands immediately and start an infinite loop.

Normal-order will defer evaluating the operands until they are needed, which in this case is never: the predicate (= x 0) is true, so the expression evaluates to 0, and y is never evaluated.
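Scheme itself is applicative-order, but you can get a feel for the normal-order behavior by wrapping the dangerous operand in a promise. This is just my own sketch using the standard delay and force forms, not how a normal-order interpreter actually works:

```scheme
(define (infinite) (infinite)) ; an infinite loop

(define (test x y-promise)
  (if (= x 0)
      0
      (force y-promise))) ; the operand is only evaluated if we get here

(test 0 (delay (infinite))) ; returns 0; the loop never runs
```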

1.2 Procedures and the Processes They Generate

So there is no syntax for looping (that I know of) in Scheme. Iteration is done through recursion.

One neat thing that I’d never thought of is that a recursive procedure can generate either a recursive or an iterative process.

;; Factorial 1: Recursive procedure, recursive process
(define (factorial n)
  (if (= n 1)
    1
    (* n (factorial (- n 1)))))

;; Factorial 2: Recursive procedure, iterative process
(define (factorial n)
  (define (fac-iter product count max-count)
    (if (> count max-count)
      product
      (fac-iter (* product count)
                (+ 1 count)
                max-count)))
  (fac-iter 1 1 n))

Factorial 1 generates a recursive process: each call defers a multiplication, and those deferred operations pile up on the stack.

Factorial 2 is an iterative process produced by a recursive procedure. It doesn’t push deferred calls onto the stack; it doesn’t grow the stack at all, and it keeps no hidden state in the environment. Which means that if you needed to, for some reason, you could stop the process and resume it later, because all of the state is held in product, count, and max-count. You can’t halt the first version and resume it mid-process, because its state isn’t held in the program’s variables. It’s held on the stack.
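Here is roughly what the two processes look like under the substitution model (my own sketch of the shapes, not the book’s figures):

```scheme
;; Factorial 1: the process grows, deferring multiplications
;; (factorial 4)
;; (* 4 (factorial 3))
;; (* 4 (* 3 (factorial 2)))
;; (* 4 (* 3 (* 2 (factorial 1))))
;; (* 4 (* 3 (* 2 1)))
;; 24

;; Factorial 2: the process stays flat; all state is in the arguments
;; (fac-iter 1 1 4)
;; (fac-iter 1 2 4)
;; (fac-iter 2 3 4)
;; (fac-iter 6 4 4)
;; (fac-iter 24 5 4)
;; 24
```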

This is also where we get into tail-recursion.

Given these two examples (assume inc and dec exist in the environment and increment and decrement a value by one):

;; version 1
(define (+ a b)
  (if (= a 0)
    b
    (inc (+ (dec a) b))))

;; version 2
(define (+ a b)
  (if (= a 0)
    b
    (+ (dec a) (inc b))))

You can visually confirm that version 2 generates an iterative process, one that benefits from tail call optimization, by checking that the last expression in the procedure is the call to itself: the call is in tail position, so nothing is left for the caller to do afterward.

Version 1 calls itself in the middle of the final expression (inside the call to inc), so each call must wait on the stack for the inner call to return. It is not in tail call form.
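Tracing (+ 3 5) under the substitution model makes the difference visible (again, my own sketch):

```scheme
;; version 1: deferred incs pile up (recursive process)
;; (+ 3 5)
;; (inc (+ 2 5))
;; (inc (inc (+ 1 5)))
;; (inc (inc (inc (+ 0 5))))
;; (inc (inc (inc 5)))
;; 8

;; version 2: nothing is deferred (iterative process)
;; (+ 3 5)
;; (+ 2 6)
;; (+ 1 7)
;; (+ 0 8)
;; 8
```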

1.2.2 Tree Recursion

Okay, I feel like the authors made a leap here: they started talking about the steps vs. the space required by tree recursion, and they never explained what space is. I get what steps are. But what is space, the amount of memory needed to hold the deferred operations?

This went from “draw a circle” to “draw the rest of the owl” really fast. I can kind of keep up with the text, but the exercises are veering wildly toward complex mathematical proofs that just don’t interest me. I want to learn how to organize programs, not prove that Fib(n) is the closest integer to phi^n / sqrt(5), where phi = (1 + sqrt(5)) / 2. (That is real; that is Exercise 1.13.)

1.2.3 Orders of Growth

This is where the authors start talking about big O notation without ever calling it big O notation.

They instead describe a function R whose input n is the size of the problem and which returns the amount of resources needed for a process of that size. R(n) has an order of growth theta of f(n) if R(n) is bounded above and below by constant multiples of f(n). And it is theta of n that corresponds to O(n), which I guess has become the common man’s simplified version of theta of n, because I’m not writing this in LaTeX and I don’t know how to write a theta, or any other kind of mathematical formula.

1.2.4 Exponentiation

Theta aka Big O is demonstrated here by comparing different methods for calculating exponents.

One recurring theme here is that when translating a function to a procedure, the most straightforward expression often includes a recursive process, and is also very expensive in terms of resources. (At least, that seems to be true of the examples they have chosen to show us so far.) And that rewriting them into a less expensive iterative process requires some mathematical intuition or know-how.

For example, exponentiation.

Given the definitions:

b^n = b * b^(n-1)
b^0 = 1

…consider the following.

;; recursive process
;; eloquent, expressive, costly:
;;   O(n) steps, O(n) space
(define (exp b n)
  (if (= n 0)
    1
    (* b (exp b (- n 1)))))

;; iterative process
;; slightly less eloquent, but also less costly:
;;   O(n) steps, O(1) space
(define (exp b n)
  ;; creating an internal "iterator" procedure is a
  ;; common pattern for turning a recursive process
  ;; into an iterative one, because it allows you to
  ;; write the parent procedure in tail call form:
  ;; the last line is a call to a procedure that does
  ;; not need to be expanded by the interpreter
  (define (exp-itr b count product)
    (if (= count 0)
      product
      (exp-itr b
               (- count 1)
               (* b product))))
  (exp-itr b n 1))

But then we get blindsided by mathematical cleverness.

Because we can also use this definition of exponents:

b^n = (b^(n/2))^2   if n is even
b^n = b * b^(n-1)   if n is odd

…which allows us to write an even better procedure.

;; recursive process, but not iterating over *every* step
;; O(log n) steps, O(log n) space
;; (assumes square is defined: (define (square x) (* x x)))
(define (fast-exp b n)
  (cond ((= n 0) 1)
        ((even? n) (square (fast-exp b (/ n 2))))
        (else (* b (fast-exp b (- n 1))))))
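You can see the logarithmic behavior by counting the calls (my own sketch: every even exponent gets halved instead of decremented):

```scheme
;; (fast-exp 2 16)
;; (square (fast-exp 2 8))
;; (square (square (fast-exp 2 4)))
;; (square (square (square (fast-exp 2 2))))
;; (square (square (square (square (fast-exp 2 1)))))
;; 2^1 is odd: (* 2 (fast-exp 2 0)) -> 2, then square back up
;; about 5 calls for 2^16, instead of 16
```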

So there. I guess the pattern to learn here is that recursive processes can be very inefficient, and rewriting them as iterative processes can be more efficient. But if you can figure out some way to skip over certain iterations altogether, then that is even better. (Even though doing so may be difficult if, as in these contrived examples, you don’t have a background in math.)

1.2.5 Greatest Common Divisors

Euclid’s Algorithm states that the greatest common divisor of two numbers a and b is equal to the greatest common divisor of b and the remainder of a divided by b.

(define (gcd a b)
  (if (= b 0)
    a
    (gcd b (remainder a b))))
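Tracing the book’s example pair shows how quickly the numbers shrink:

```scheme
;; (gcd 206 40)
;; (gcd 40 6)   ; 206 mod 40 = 6
;; (gcd 6 4)    ; 40 mod 6 = 4
;; (gcd 4 2)    ; 6 mod 4 = 2
;; (gcd 2 0)    ; 4 mod 2 = 0
;; 2
```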

Then the authors did something funny by saying “If it takes k steps to compute the GCD of some pair of numbers, then the smaller number of the pair must be greater than or equal to the kth number of the Fibonacci sequence,” and thus the process must have an order of growth of O(log n). Although the authors do actually back up their claim about the Fibonacci sequence in a footnote using a mathematical proof, I don’t think one should just say stuff like that as though it’s true, as though it’s common knowledge. This is becoming a running theme for me in this book so far. “Ah, well you see, as everybody knows, the impact of Euclid’s Algorithm on Lamé’s Theorem…”

Nobody knows what that is. Not anybody in this one-person book club of mine anyway.

1.2.6 Example: Testing for Primality

In which probabilistic algorithms are introduced.

This is kind of interesting.

Without getting into the details, Fermat’s Little Theorem says that if n is prime, then for any number a between 1 and n - 1 inclusive, a^n is congruent to a modulo n. So you test a bunch of random values of a between 1 and n - 1: if the congruence ever fails, then n is definitely not prime. But if it holds for every value you try, then n is probably prime.
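This is roughly the procedure the book builds for the test (a sketch: square is defined inline, and random is implementation-dependent; MIT Scheme provides it):

```scheme
(define (square x) (* x x))

;; base^exp mod m, using successive squaring so it stays fast
(define (expmod base exp m)
  (cond ((= exp 0) 1)
        ((even? exp)
         (remainder (square (expmod base (/ exp 2) m)) m))
        (else
         (remainder (* base (expmod base (- exp 1) m)) m))))

;; pick a random a in [1, n-1] and check a^n mod n = a
(define (fermat-test n)
  (define (try-it a)
    (= (expmod a n n) a))
  (try-it (+ 1 (random (- n 1)))))
```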

This notion of probabilistic algorithms was disturbing to me because it seems to undermine the correctness that we tend to expect and demand of our computations.

But then the authors were kind of like, deal with it. That’s the difference between mathematics and engineering. Implying, I guess, that “probably” is good enough for engineering. They went on to say that if tested thoroughly, the likelihood of a probabilistic algorithm returning an incorrect answer is smaller than the likelihood of cosmic radiation moving electrons around on your computer hardware and causing an incorrect calculation.

But then they went on to say that there is a whole set of rare numbers known as Carmichael numbers that will always pass the Fermat test even though they are not prime! False positives.

So, I don’t know about all of that.

The Fermat test is hella fast, but it can’t be assumed to be correct every time, 100% of the time.

Applicative- vs. normal-order evaluation continues to come up here, so that seems important. Also the idea of successive squaring, as introduced in the exponentiation section. I don’t know if that’s actually an important concept, or whether it’s just the chosen example with which to talk about logarithmic growth. But I think I need to go back and try to understand that part a bit better.


I think this post was kind of long. I might have bitten off more than I expected by trying to do one chapter per post. But now I know. I’ll probably split chapters into at least two posts from now on. So stay tuned for Chapter 1: Part 2.

This chapter so far is very challenging because it seems to be all about advanced math, which I think distracts a great deal from the actual teachings. Hope it gets better!