Encoder-Decoder Transformer Architectures

The encoder-decoder architecture was the original transformer architecture proposed by Vaswani et al. (2017); T5 is a well-known example. These models are now largely legacy architectures, apart from specific applications such as translation, because they have been supplanted by encoder-only models for understanding tasks (i.e., producing embeddings) and, increasingly, by decoder-only models for both understanding and generation.
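
To make this concrete, below is a minimal sketch of running a pretrained encoder-decoder model for translation with the Hugging Face transformers library. The model name "t5-small" and the prompt text are illustrative assumptions, not requirements of the architecture.

```python
# A minimal sketch: translation with a pretrained encoder-decoder model.
# Assumes the Hugging Face `transformers` library (and sentencepiece) is
# installed; "t5-small" is an illustrative model choice.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# T5 frames tasks as text-to-text: the encoder reads the full prompt,
# and the decoder generates the output token by token.
inputs = tokenizer(
    "translate English to German: The test is reliable.",
    return_tensors="pt",
)
output_ids = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```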

It is also worth noting that encoder models are sometimes still preferred over decoders for embeddings: they are more transparent, and their bidirectional attention allows each token to attend to the full context in both directions, which produces richer embeddings. That said, the largest decoder models can match or exceed encoder performance even on understanding tasks, although they are less transparent.
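
As an illustration of using an encoder's bidirectional representations as embeddings, here is a minimal sketch with the Hugging Face transformers library. The model name "bert-base-uncased", the example sentences, and the mean-pooling step are assumptions chosen for illustration; other models and pooling strategies work equally well.

```python
# A minimal sketch: sentence embeddings from an encoder-only model.
# Assumes `transformers` and `torch` are installed; "bert-base-uncased"
# and mean pooling are illustrative choices, not the only options.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

sentences = ["The scale shows high reliability.", "Scores were consistent."]
batch = tokenizer(sentences, padding=True, return_tensors="pt")

with torch.no_grad():
    # Bidirectional attention: every token attends to the whole sentence,
    # so each hidden state reflects both left and right context.
    hidden = model(**batch).last_hidden_state  # (batch, tokens, dim)

# Mean-pool over real tokens (masking out padding) to get one vector per text.
mask = batch["attention_mask"].unsqueeze(-1)
embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(embeddings.shape)  # torch.Size([2, 768])
```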

