What are encoder-decoder architectures?

The encoder-decoder architecture was the original transformer architecture proposed by Vaswani et al. It is now largely a legacy architecture, retained for specific applications such as translation, having been supplanted by encoder-only models for understanding tasks (e.g., embeddings) and, increasingly, by decoder-only models for both understanding and generation.

It is also worth noting that encoder models are sometimes still preferred over decoders for embeddings because of their transparency and their bidirectional attention: each token can attend to the full context in both directions, which produces richer embeddings. However, the largest decoder models can match or exceed encoder performance even on understanding tasks, although they are less transparent.
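The difference between bidirectional (encoder) and causal (decoder) attention described above comes down to the attention mask. A minimal NumPy sketch (illustrative only, not code from any particular library) makes the contrast concrete:

```python
import numpy as np

def attention_masks(seq_len):
    """Build the two attention masks contrasted in the text.

    Encoder (bidirectional): every token may attend to every other
    token, in both directions.
    Decoder (causal): token i may attend only to tokens 0..i, so
    information never flows from later positions to earlier ones.
    """
    bidirectional = np.ones((seq_len, seq_len), dtype=bool)
    causal = np.tril(np.ones((seq_len, seq_len), dtype=bool))
    return bidirectional, causal

bi, causal = attention_masks(4)
print(bi[0, 3])      # first token can "see" the last token in an encoder
print(causal[0, 3])  # but not under the causal mask used by decoders
```

Because the encoder mask is fully populated, the embedding of each token is conditioned on the entire sequence, which is the sense in which encoder embeddings are "richer" for understanding tasks.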

Psychometrics.ai

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
