👩‍💻 chrismanbrown.gitlab.io

Lets Read: Structure and Interpretation of Computer Programs

Chapter 1: Part 2


This is the third entry in the Let’s Read SICP series.

« previous | index | next »



Okay here’s the last section of chapter 1!

It’s all about higher order procedures.

Incidentally, here’s the current process I’ve developed for reading this book so far.

  1. Skim the sections really quickly. Deprioritize deep understanding. Just understand broad context and general concepts. Don’t read exercises, don’t read footnotes unless you want to.

  2. Go back and re-read for understanding. Read all the footnotes. Read the exercises for intent. (What concept is the exercise trying to teach?) Maybe attempt the exercise, but probably don’t.

  3. Come here and summarize concepts, learnings, and interesting things as best you can.

1.3.1 Procedures as Arguments

Introduces the concept of higher-order procedures. Which are procedures that can accept procedures as an argument, and return procedures as a value.

Uses summing a series to point out the existence of an accumulator pattern and a filter pattern.

Okay, three different “sum of a series” procedures.

;; Procedure 1
;; add the numbers in a range of numbers from a to b
(define (sum a b)
  (if (> a b)
    (+ a (sum (+ a 1) b))))

;; Procedure 2
;; add the cubes of the numbers in a series
(define (sum-cubes a b)
  (if (> a b)
    (+ (cube a) (sum-cubes (+ a 1) b))))
;;     ^^^^^^^^
;; this is the difference b/t Proc 1 and Proc 2

;; Procedure 3
;; Add the series 1/(3*5) + 1/(7*9) ...
;; (This series converges on Pi/8)
(define (sum-pi a b)
  (if (> a b)
    (+ (/ (* a (+ a 2))) (pi-sum (+ a 4)) b)))
;;                               ^^^^^^^
;; this is the difference b/t Proc 2 and Proc 3

The pattern that emerges is:

(define (<name> a b)
  (if (> a b)
    (+ <term> (<name> (<next> a) b))))

where <term> and <next> both take a as an argument.

So our abstraction looks something like this:

(define (sum term a next b)
  (if (> a b)
    (+ (term a) (sum term (next a) next b))))

which we could then use to redefine sum-cube for example.

(define (sum-cube a b)
  (sum cube a inc b))

And regular old adding:

(define (identity x) x)
(define (sum-series a b) (sum identity a inc b))

Okay. So that’s prelude to the exercises.

Now, if we create a product procedure in the same way we created sum, then we become aware of a further abstraction.

We just need to change + to *, and also we can’t have the break condition of the recursion be zero, because multiplying by zero will make our answer zero. We need to change 0 to 1.

So we abstract out those values to arrive at a more generic accumulate.

(define (accumulator combiner break-case term a next b)
  (if (> a b)
    (combiner (term a) (accumulator combiner break-case term (next a) next b))))


(define (sum a b)
  (accumulator + 0 identity a inc b))
(define (product a b)
  (accumulator * 1 identity a inc b))

And the very last step is to abstract out the (> a b) conditional into a parameter so that we could, say, sum all of the primes of a series or something. This leads us to the final filtered-accumulator.

1.3.2 Constructing Procedures Using Lambda

Lambdas are anonymous functions, and let establishes local scope.

I’ve always kind of hated typing out “lambda” just to create an anonymous function. If I had the luxury, I’d probably alias define and lambda to def and fn or something.

let syntax I am familiar with from fennel.

Of particular interest here is that the local variables are always determined by the enclosing scope if possible. I think I might have known that intuitively, but definitely not explicitly.

(define x 2)

(let ((x 3)
     (y (+ x 2))) 
  (* x y))
;; -> 12

Also, let is just syntactic sugar for a lambda.

let’s syntax is (let (<var> <exp>) <body>) and the equivalent lambda syntax is ((lambda (<var>) <body>) <exp>).

For example, the following code and the previous code are the same.

(define x 2)

((lambda (x y)
  (* x y)) x (+ x 2))

1.3.3 Procedures as General Methods

To be honest, I’m not sure what the point of this section is supposed to be, unless by “general methods” the authors mean “procedures that are not functions.” Which they might. Because the examples they used in this section are finding zeroes and fixed points of functions. Which means, respectively, finding the x for f where f(x) = 0 and x.

The general procedure for determining both is the same:

  1. Define an error tolerance. e.g. 0.0001

  2. Establish the method for creating (and improving) a guess.

  3. Kick off the process with a seed guess. e.g. 1.0

  4. Repeatedly improve the guess over and over again until the difference between two subsequent guesses is less than the tolerance.

So while this class of procedure doesn’t meet the standards required to be considered a function, maybe it is what the authors think of as a general method. If that’s the case, this is yet another instance of them saying something, not saying what it is, and then leaving the reader to infer its meaning.

So here’s my inference.

General Method:
a procedure which calculates a value not through a function call, but through some other kind of process. For example, approximation of a value through repeatedly refined guesses until the difference between the values of the guesses falls within a predefined tolerance.

Edit: I went back and re-read the intro to this chapter, and I owe the authors an apology because the definition of general method is right there. So innocuous as to be easy to miss it, as I did, but it’s there:

General Method:
a procedure used to express general methods of computation independent of the functions involved.

Though humbled, I reserve the right to point out that my inference was more or less correct. For which I give the authors partial credit, because my acquired understanding is the result of our shared atemporal experience as reader and author. We’re on this journey together!

Also, “blink and you’ll miss it” definitions like this is what people mean when they say this book is “dense”.

1.3.4 Procedures as Returned Values

Okay so let’s try to actually walk through the example text from this section.

Let’s find the square root of x.

The square root y of x is any number such that y^2 = x.

We’re going to use fixed point and average damping to do this.

Fixed point, again, is a number x, given a function f, where f(x) = x. Which means that x is a value that makes f behave as the Identity function.

I don’t fully understand this next part, but apparently for “some functions” you can find the fixed-point by repeatedly calling f with the result of calling f:

f(x), f(f(x)), f(f(f(x))), ...

I don’t know why. But let’s assume it’s the case, and use fixed-point to implement our square root function.

So fixed-point is a procedure where we constantly call f with the next value of f(x) until two subsequent values are “close enough” to each other given a defined amount of error tolerance.

(define (fixed-point f initial-guess)
  (define tolerance 0.00001)
  (define (close-enough? a b)
    (< (abs (- a b)) tolerance))
  (define (try guess)
    (let ((next (f guess)))
      (if (close-enough? guess next)
        (try next))))
  (try initial-guess))

So now we could define a square root procedure:

(define (sqrt x)
  (fixed-point (lambda (y) (/ x y) 1.0))

But this won’t work as written because of some oscillation.

If x = 10, the initial guess of 1.0 expands to (/ 10 1.0) = 10.0.

So the next guess of 10.0 expands to (/ 10 10.0) = back to 1.0.

And it will bounce back and forth in a loop forever.

So we introduce average-damp which is a procedure that takes a value x and a function f, and returns the average of x and f(x).

;; no changes here
(define (fixed-point f initial-guess)
  (define tolerance 0.00001)
  (define (close-enough? a b)
    (< (abs (- a b)) tolerance))
  (define (try guess)
    (let ((next (f guess)))
      (if (close-enough? guess next)
        (try next))))
  (try initial-guess))

;; this is the new bit
(define (average-damp f)
  (define (average x y) (/ (+ x y) 2))
  (lambda (x) (average x (f x))))

;; now use the new bit
(define (sqrt x)
  (fixed-point (average-damp (lambda (y) (/ x y))) 1.0))

Now you can use sqrt.

All of this was conceptually introduced in the previous section and what this section actually does is introduce a few other transformations, and then points out that they all follow a pattern and can be further abstracted by using procedures as arguments and return values, much in the same way we went super abstracted in the previous section with the accumulator.

Finally this section introduces some rights and privileges of first-class procedures:

  1. They may be named as variables

  2. They may be passed as arguments to procedures

  3. They may be returned as values from procedures

  4. They may be included in data structures


That was Building Abstractions With Procedures!

Up next is Building Abstractions With Data, which I’m kind of excited about.

General impressions are that the text is very challenging and overly mathematical, but I’m still having fun with it so far.

I’ve never been so granular in my thought to notice the difference between process, procedure, and function. They really drove that point home by showing us half a dozen different square root procedures.

Sprinkling in some interpreter stuff with applicative vs normal mode this early in was pretty bold I thought. But I see why, there are some real gotchas if you don’t know what you’re doing. Same with introducing Big O notation. I wouldn’t consider that Chapter 1 material. But I guess it kind of has to be if you have to introduce recursion on day one.