Encoder architectures

Many resources already explain the inner workings of encoder models, so why offer another discussion? Because a book like this would be incomplete without one. To make the explanation concrete, we also provide a layer-by-layer manual reconstruction of an encoder’s internals using MiniLM.

We choose MiniLM as our sentence encoder because it is an Apache-2.0 open-source model: we can legally download all of its components, run the computations ourselves, and compare results to ensure accuracy. As our sentence to encode, we choose Hussain et al.’s (2024) “Open-source LLMs rock.” The written walkthrough follows soon; the Jupyter notebook showing the ground-up reconstruction of an operational MiniLM is available now.
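To preview the kind of computation the notebook reconstructs, here is a minimal sketch of the final step of a MiniLM-style sentence encoder: mean pooling the per-token vectors under the attention mask, then L2-normalizing the result. The token vectors below are made-up toy numbers (real MiniLM embeddings are 384-dimensional); only the pooling arithmetic is the point.

```python
import numpy as np

# Hypothetical 3-dimensional token embeddings standing in for the
# transformer's last-hidden-state output (real MiniLM uses 384 dims).
token_embeddings = np.array([
    [0.2, -0.1, 0.4],
    [0.0,  0.3, 0.1],
    [0.5,  0.2, -0.2],
    [0.1,  0.0, 0.3],
])
attention_mask = np.array([1, 1, 1, 1])  # no padding in this toy example

# Mean pooling: average the token vectors, ignoring padded positions.
mask = attention_mask[:, None]
sentence_embedding = (token_embeddings * mask).sum(axis=0) / mask.sum()

# L2-normalize so cosine similarity reduces to a plain dot product.
sentence_embedding /= np.linalg.norm(sentence_embedding)
print(sentence_embedding)
```

In the notebook, the same pooling is applied to the actual MiniLM hidden states, and the result is checked against the library’s output to confirm the reconstruction is exact.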

View the MiniLM notebook →

Bug reports and corrections are welcome!
