Many resources already explain the inner workings of encoder models, so why offer yet another discussion? Because a book like this requires one. To support the explanation, we also provide a layer-by-layer manual reconstruction of the internals of an encoder, using MiniLM.
We choose MiniLM, an Apache-2.0 open-source model, as our sentence encoder: we can legally download all of its components, run the computations ourselves, and compare results to ensure accuracy. As the sentence to encode, we choose Hussain et al.'s (2024) "Open-source LLMs rock." The full description follows shortly; the Jupyter notebook showing the ground-up reconstruction of an operational version of MiniLM is available now.
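To give a flavor of what the layer-by-layer reconstruction involves, here is a minimal sketch of the core computation inside one encoder layer, followed by mean pooling into a sentence vector. This is a toy illustration with random weights and a tiny hidden size, not MiniLM's actual parameters (MiniLM uses a hidden size of 384, multiple heads, residual connections, and layer normalization, all of which are omitted here):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy hidden size (MiniLM uses 384)
n = 4  # toy sequence length (number of tokens)

# Toy token embeddings and randomly initialized projection weights.
x = rng.normal(size=(n, d))
Wq, Wk, Wv, Wo = (rng.normal(size=(d, d)) for _ in range(4))

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Single-head self-attention, the heart of one encoder layer.
q, k, v = x @ Wq, x @ Wk, x @ Wv
scores = softmax(q @ k.T / np.sqrt(d))  # (n, n) attention weights
attn = scores @ v @ Wo                  # contextualized token states

# Mean pooling over tokens yields a single sentence vector.
sentence_vec = attn.mean(axis=0)
print(sentence_vec.shape)  # (8,)
```

The accompanying notebook does the real thing: it loads MiniLM's actual weights, repeats this computation for every head and layer, and checks the result against the library's output.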
Bug reports and corrections are welcome!