
What are encoder-decoder architectures?

The encoder-decoder architecture was the original transformer architecture proposed by Vaswani et al. (2017). Encoder-decoder models are now largely legacy architectures, except for specific applications such as translation, because they have been supplanted by encoder-only models for understanding (i.e., embeddings) and, increasingly, by decoder-only models for both understanding and generation.

It is also worth noting that encoder models are sometimes still preferred over decoder models for embeddings because of their transparency and because their bidirectional attention allows each token to attend to the full context in both directions, which produces richer embeddings. However, the largest decoder models can match or exceed encoder performance even on understanding tasks, although they are not as transparent.
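To make that distinction concrete, the following is a minimal NumPy sketch (not from the original text) contrasting the bidirectional attention pattern of an encoder with the causal pattern of a decoder. The attention_mask function and the sequence length of 4 are illustrative assumptions only, not part of any specific model's implementation.

```python
import numpy as np

def attention_mask(seq_len: int, causal: bool) -> np.ndarray:
    """Return a seq_len x seq_len mask where 1 = may attend, 0 = blocked."""
    if causal:
        # Decoder-style (causal) mask: token i attends only to positions <= i.
        return np.tril(np.ones((seq_len, seq_len), dtype=int))
    # Encoder-style (bidirectional) mask: every token attends to every position.
    return np.ones((seq_len, seq_len), dtype=int)

print("Encoder (bidirectional) mask:\n", attention_mask(4, causal=False))
print("Decoder (causal) mask:\n", attention_mask(4, causal=True))
```

In the bidirectional case the upper triangle of the mask is filled in, which is what allows an encoder to build each token's representation from both its left and right context rather than from the preceding tokens alone.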


