A proper specification of reduction

Currently the implementation of the evaluator is taken as trusted part of the specification. It would be nicer to have a separate specification of reduction (e.g. based on naive substitution) and then prove correctness of the AM-based evaluator with respect to it.