In this video, I present a comprehensive study of “Attention Is All You Need,” the renowned paper by Ashish Vaswani and his coauthors.
The paper marked a major turning point in deep learning research: the Transformer architecture it introduced now powers a wide range of state-of-the-art models in natural language processing and beyond.
📑 Chapters:
0:00 Abstract
0:39 Introduction
2:44 Model Details
3:20 Encoder
3:30 Input Embedding
5:22 Positional Encoding
11:05 Self-Attention
15:38 Multi-Head Attention
17:31 Add and Layer Normalization
20:38 Feed Forward NN
23:40 Decoder
23:44 Decoder in Training and Testing Phase
27:31 Masked Multi-Head Attention
30:03 Encoder-Decoder Attention
33:19 Results
35:37 Conclusion
📝 Link to the paper: https://arxiv.org/abs/1706.03762
👥 Authors:
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin
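
If you want to follow along with the Self-Attention chapter in code, here is a minimal NumPy sketch of the paper's scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. The function name and shapes are illustrative, not taken from the video.

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (seq_len, d_k) arrays of queries, keys, and values.
    d_k = Q.shape[-1]
    # Similarity scores, scaled by sqrt(d_k) as in the paper.
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable softmax over the key axis gives attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Output is the attention-weighted sum of the values.
    return weights @ V

# Illustrative usage with random inputs.
rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)

Multi-head attention, covered at 15:38, simply runs several of these attention functions in parallel over learned projections of Q, K, and V, then concatenates the results.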