 In this post I'll derive the Y-combinator and explain all the steps taken. The key to understanding the Y-combinator is knowing precisely what it does. I am sure many people never understand Y-combinator because of this very reason, or because the introduction is too lengthy to comprehend. Therefore I'll give the shortest possible explanation of Y-combinator before I derive it so that you know what the end result should be.

The Y-combinator allows an anonymous function to call itself, that is, it allows anonymous function recursion. Since anonymous functions don't have names, they can't refer to themselves easily. The Y-combinator is a clever construct helps them to refer to themselves. That's it. That's all the Y-combinator does. Remember that.

I'll use the Scheme programming language to derive the Y-combinator. My derivation is based on the one in "The Little Schemer" book. I took the examples from the book, filled in the missing steps and explained the steps in more details.

Here it is.

Suppose we want to write the `length` function, which, given a list, returns the number of elements in it. It's really easy if we can give the function a name:

```(define length
(lambda (list)
(cond
((null? list) 0)
(else
```

But now suppose that we can't give names - we can only use anonymous functions:

```(lambda (list)
(cond
((null? list) 0)
(else
```

Suddenly there is no way this anonymous function can refer to itself. What do we put in `???`? We can't refer to the name `length` anymore because it's an anonymous function, which doesn't have any name. One thing we can try is to put the function itself in the place of `???`:

```(lambda (list)
(cond
((null? list) 0)
(else

(lambda (list)                   ; the
(cond                          ;
((null? list) 0)             ; function
(else                        ;
(add1 (??? (cdr list))))))  ; itself

(cdr list))))))
```

This is not much better, we are still left with `???`. But there is a bright side to it, this function can determine lengths of lists with one element and no elements. Let's try inserting the same anonymous function in place of `???` again:

```(lambda (list)
(cond
((null? list) 0)
(else
(lambda (list)
(cond
((null? list) 0)
(else
(lambda (list)
(cond
((null? list) 0)
(else
(cdr list))))))
(cdr list))))))
```

Now this function can determine lengths of lists with 0, 1 and 2 elements. If we continue this way, we can construct the `length` function for lists with a certain number of elements. But this is not what we want, we the real `length` function that works for lists with any number of elements.

As we all know, repeating code is not a good thing. Let's try to factor out the repetitions and rewrite the original anonymous function slightly. Instead of leaving `???` in the code, let's pass it to an anonymous function via an argument called `length`.

```((lambda (length)
(lambda (list)
(cond
((null? list) 0)
(else
???)
```

Notice that this code invokes the first `lambda (length)` function and passes it `???` as the `length` argument. This function in turn returns the original anonymous function:

```(lambda (list)
(cond
((null? list) 0)
(else
```

Now let's try constructing the previous two length functions that work for lists with a maximum of one and two elements. First the function for lists with a maximum of one element:

```((lambda (f)
(lambda (list)
(cond
((null? list) 0)
(else
((lambda (g)
(lambda (list)
(cond
((null? list) 0)
(else
???))
```

Here `???` first gets passed to `lambda (g)`, which returns the original anonymous function, which then gets passed to `lambda (f)`, which in turn returns the function that works for lists with one element:

```(lambda (list)
(cond
((null? list) 0)
(else
(lambda (list)
(cond
((null? list) 0)
(else
(cdr list))))))
```

Similarly, the following code returns a function that works for lists with a maximum of two elements:

```((lambda (f)
(lambda (list)
(cond
((null? list) 0)
(else
((lambda (g)
(lambda (list)
(cond
((null? list) 0)
(else
((lambda (h)
(lambda (list)
(cond
((null? list) 0)
(else
???)))
```

Since the argument names `f`, `g`,`h` in `(lambda (f) ...)`, `(lambda (g) ...)`, `(lambda (h) ...)` are independent, we can rename all of them to `length`, to make it look more similar to the `length` function:

```((lambda (length)
(lambda (list)
(cond
((null? list) 0)
(else
((lambda (length)
(lambda (list)
(cond
((null? list) 0)
(else
((lambda (length)
(lambda (list)
(cond
((null? list) 0)
(else
???)))
```

We are still repeating code. The obvious thing we can do about it is create an anonymous function called `mk-length` (make length) that creates this code for us:

```((lambda (mk-length)
(mk-length ???))
(lambda (length)
(lambda (list)
(cond
((null? list) 0)
(else
```

This is pretty tricky. Observe how `(lambda (mk-length) ...)` gets the `(lambda (length) ...)` function passed as the `mk-length` argument, which in turn accepts `???` as an argument and returns our original anonymous function:

```(lambda (list)
(cond
((null? list) 0)
(else
```

Now let's try constructing length functions for lists of length one and two. For the list of length one, we just need to apply `mk-length` on itself:

```((lambda (mk-length)
(<span style="color:green">mk-length</span></code>
(<span style="color:blue">mk-length</span> ???)))
(lambda (length)
(lambda (list)
(cond
((null? list) 0)
(else
```

Let's go through this code. First `(lambda (length) ...)` gets passed to the `lambda (mk-length)` function as the `mk-length` argument. Then it applies the result of `(mk-length ???)` (which is the original anonymous function) to the `mk-length`. This produces our well known function that works on lists with one or none elements:

```(lambda (list)
(cond
((null? list) 0)
(else
(lambda (list)
(cond
((null? list) 0)
(else
(cdr list))))))
```

Similarly, by adding another call to `mk-length`, we get the function that works for lists with two or less elements:

```((lambda (mk-length)
(mk-length
(mk-length
(mk-length ???)))
(lambda (length)
(lambda (list)
(cond
((null? list) 0)
(else
```

Now, since we don't know what `???` is, let's just call it `mk-length` and see what happens:

```((lambda (mk-length)
(mk-length mk-length))
(lambda (length)
(lambda (list)
(cond
((null? list) 0)
(else
```

Nothing bad actually happens, we still get the same original anonymous function, except in place of `???` we have the function:

```(lambda (length)
(lambda (list)
(cond
((null? list) 0)
(else
```

Notice also that argument names `mk-length` and `length` in `lambda (mk-length)` and `lambda (length)` are independent. Therefore we can rename `length` to `mk-length` to remind that the first argument to `mk-length` is `mk-length`:

```((lambda (mk-length)
(mk-length mk-length))
(lambda (mk-length)
(lambda (list)
(cond
((null? list) 0)
(else
```

And now comes the key trick. We replace `mk-length` with a self-application `(mk-length mk-length)`:

```((lambda (mk-length)
(mk-length mk-length))
(lambda (mk-length)
(lambda (list)
(cond
((null? list) 0)
(else
```

If you try this code out, you'll see that it returns the length of a list for any list. Here is an example of applying it to a list of 10 elements:

```(((lambda (mk-length)
(mk-length mk-length))
(lambda (mk-length)
(lambda (list)
(cond
((null? list) 0)
(else
'(a b c d e f g h i j))
```

The function works because it keeps adding recursive uses by passing `mk-length` to itself, just as it is about to expire. This is not yet the Y-combinator, but we have successfully managed to recursively call an anonymous function. Now we need to massage the code a bit to separate the anonymous function from the self-applicative code.

The first step is to move the self-applicative code out as much as possible:

```((lambda (mk-length)
(mk-length mk-length))
(lambda (mk-length)
<strong>((lambda (length)
(lambda (list)
(cond
((null? list) 0)
(else
(lambda (x)
((mk-length mk-length) x)))))
```

Now the code in bold looks very much like the `length` function we need! Let's move it outside as well:

```((lambda (le)
((lambda (mk-length)
(mk-length mk-length))
(lambda (mk-length)
(le (lambda (x)
((mk-length mk-length) x))))))
<strong>(lambda (length)
(lambda (list)
(cond
((null? list) 0)
(else
```

There we have it! The anonymous `length` function has been separated from the self-applicative call. The code in bold is the `length` function, and the code not-in-bold is the Y-combinator!

Now let's rename `mk-length` to `f` and join the lines to make it more compact:

```((lambda (le)
((lambda (f) (f f))
(lambda (f)
(le (lambda (x) ((f f) x))))))
(lambda (length)
(lambda (list)
(cond
((null? list) 0)
(else
```

And we are done. We can now write down the definition of Y-combinator:

```(define Y
(lambda (le)
((lambda (f) (f f))
(lambda (f)
(le (lambda (x) ((f f) x)))))))
```

And we can apply it to our anonymous `length` function:

```((Y (lambda (length)
(lambda (list)
(cond
((null? list) 0)
(else