Tags: 2017, NLP, Transformer

Attention Is All You Need

The foundational Transformer architecture that introduced self-attention as a replacement for recurrence, enabling parallel processing and improved long-range dependency modeling.
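The core operation the paper introduces is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. Below is a minimal single-head NumPy sketch of that formula for illustration; the toy shapes and the function name are assumptions, and the full model uses multi-head attention with learned projections on top of this.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K: (seq_len, d_k); V: (seq_len, d_v). Returns (seq_len, d_v)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # pairwise similarities, scaled by sqrt(d_k)
    scores -= scores.max(axis=-1, keepdims=True)  # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the key dimension
    return weights @ V                            # attention-weighted sum of values

# Self-attention on a toy sequence: Q = K = V (sizes chosen only for illustration)
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                       # 4 tokens, model width 8
out = scaled_dot_product_attention(x, x, x)
print(out.shape)                                  # (4, 8)
```

Because every token attends to every other token in one matrix multiply, the whole sequence is processed in parallel rather than step by step as in a recurrent network, which is what enables the parallelism and long-range dependency modeling described above.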
