Expressions, Environment and Evaluation

Expressions

LISP is all about expressions. An expression can be a literal, an identifier, or a list consisting of an operator and one or more arguments.

A literal is an object with an intrinsic value. In our system, that's either an integer or NIL (if you consider "nothing" to be a value).

An identifier is a name for an object. Symbols can be identifiers.

Everything else is a list of the form (operator argument...) where argument... means zero or more arguments.

Environment

To associate identifiers with objects we need an environment. This is a collection of bindings, each of which consists of an identifier and its corresponding value. For example:
Bindings
IdentifierValue
FOO42
BARNIL
BAZ(X Y Z)
Note that the identifiers are all symbols, but the values can be any object within our system of data — the value for BAZ is a list containing three symbols.

An environment can also have a parent environment. If there is no binding for a particular identifier in the environment, we can check the parent, the parent's parent and so on. In this way we can create a tree of environments which share bindings with their ancestors unless explicit replacements exist.

Implementation

There is a convenient way of representing environments using our LISP data types:

(parent (identifier . value)...)
So the environment above (assuming it has no parent) is:
(NIL (FOO . 42) (BAR . NIL) (BAZ . (X Y Z)))

Here is a function to create an empty environment with a specified parent (which could be NIL):

Atom env_create(Atom parent)
{
	return cons(parent, nil);
}

Next we have two functions to retrieve and create bindings in an environment.

int env_get(Atom env, Atom symbol, Atom *result)
{
	Atom parent = car(env);
	Atom bs = cdr(env);

	while (!nilp(bs)) {
		Atom b = car(bs);
		if (car(b).value.symbol == symbol.value.symbol) {
			*result = cdr(b);
			return Error_OK;
		}
		bs = cdr(bs);
	}

	if (nilp(parent))
		return Error_Unbound;

	return env_get(parent, symbol, result);
}
Disallowing duplicate symbols means that we don't have to call strcmp here, which should mean that this lookup function is not too slow.

int env_set(Atom env, Atom symbol, Atom value)
{
	Atom bs = cdr(env);
	Atom b = nil;

	while (!nilp(bs)) {
		b = car(bs);
		if (car(b).value.symbol == symbol.value.symbol) {
			cdr(b) = value;
			return Error_OK;
		}
		bs = cdr(bs);
	}

	b = cons(symbol, value);
	cdr(env) = cons(b, cdr(env));

	return Error_OK;
}

Only env_get recursively checks the parent environments. We don't want to modify the bindings of parents.

Evaluation

Now that we have expressions, we can start to evaluate them. Evalution is a process which takes an expression and an environment, and produces a value (the result). Let's specify the rules.

Implementation

We will need to check whether an expression is a proper list.

int listp(Atom expr)
{
	while (!nilp(expr)) {
		if (expr.type != AtomType_Pair)
			return 0;
		expr = cdr(expr);
	}
	return 1;
}

The Error enumeration needs a few more entires:
Error_Unbound Attempted to evaluate a symbol for which no binding exists
Error_Args A list expression was shorter or longer than expected
Error_Type An object in an expression was of a different type than expected

The function to perform evaluation is now a straightforward translation of the rules into C.

int eval_expr(Atom expr, Atom env, Atom *result)
{
	Atom op, args;
	Error err;

	if (expr.type == AtomType_Symbol) {
		return env_get(env, expr, result);
	} else if (expr.type != AtomType_Pair) {
		*result = expr;
		return Error_OK;
	}

	if (!listp(expr))
		return Error_Syntax;

	op = car(expr);
	args = cdr(expr);

	if (op.type == AtomType_Symbol) {
		if (strcmp(op.value.symbol, "QUOTE") == 0) {
			if (nilp(args) || !nilp(cdr(args)))
				return Error_Args;

			*result = car(args);
			return Error_OK;
		} else if (strcmp(op.value.symbol, "DEFINE") == 0) {
			Atom sym, val;

			if (nilp(args) || nilp(cdr(args)) || !nilp(cdr(cdr(args))))
				return Error_Args;

			sym = car(args);
			if (sym.type != AtomType_Symbol)
				return Error_Type;

			err = eval_expr(car(cdr(args)), env, &val);
			if (err)
				return err;

			*result = sym;
			return env_set(env, sym, val);
		}
	}

	return Error_Syntax;
}

Testing

Extending the read-print loop from the previous chapter, we now have a read-eval-print loop (REPL). This is the core of our LISP interpreter.

int main(int argc, char **argv)
{
	Atom env;
	char *input;

	env = env_create(nil);

	while ((input = readline("> ")) != NULL) {
		const char *p = input;
		Error err;
		Atom expr, result;

		err = read_expr(p, &p, &expr);		

		if (!err)
			err = eval_expr(expr, env, &result);

		switch (err) {
		case Error_OK:
			print_expr(result);
			putchar('\n');
			break;
		case Error_Syntax:
			puts("Syntax error");
			break;
		case Error_Unbound:
			puts("Symbol not bound");
			break;
		case Error_Args:
			puts("Wrong number of arguments");
			break;
		case Error_Type:
			puts("Wrong type");
			break;
		}

		free(input);
	}

	return 0;
}

Let's see what it can do.

> foo
Symbol not bound
> (quote foo)
FOO
> (define foo 42)
FOO
> foo
42
> (define foo (quote bar))
FOO
> foo
BAR

We can now interactively assign names to objects.