Expire-Span: Not All Memories are Created Equal: Learning to Forget by Expiring (Paper Explained)
#expirespan #nlp #facebookai
Facebook AI (FAIR) researchers present Expire-Span, a variant of Transformer XL that dynamically assigns expiration dates to previously encountered signals. Because of this, Expire-Span can handle sequences of many thousands of tokens while keeping memory and compute requirements at a manageable level. It matches or outperforms baseline systems while consuming far fewer resources. We discuss its architecture, advantages, and shortcomings.
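To make the idea of "expiration dates" concrete, here is a minimal pure-Python sketch of the soft expiration mask as I understand it from the paper: each memory gets a span e = L * sigmoid(score), and its attention weight is scaled by a mask that falls linearly from 1 to 0 over a ramp of length R once the memory is older than its span. The function names (expire_span, expire_mask, masked_attention) and the toy numbers are my own illustration, not the authors' code, and the score here stands in for the learned linear projection of the hidden state.

```python
import math


def expire_span(score, max_span):
    """Span e = L * sigmoid(score): how many steps a memory stays fully usable.

    `score` is a stand-in for the learned projection of the memory's hidden state.
    """
    return max_span / (1.0 + math.exp(-score))


def expire_mask(span, age, ramp):
    """Soft mask m = clamp(1 + (span - age) / ramp, 0, 1).

    m is 1 while age <= span, decays linearly over `ramp` steps, then stays 0.
    Only inside the ramp does the mask change with the span.
    """
    remaining = span - age
    return max(0.0, min(1.0, 1.0 + remaining / ramp))


def masked_attention(logits, spans, ages, ramp):
    """Softmax attention re-weighted by the expiration masks and renormalized."""
    masks = [expire_mask(e, a, ramp) for e, a in zip(spans, ages)]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [m * math.exp(x - peak) for m, x in zip(masks, logits)]
    total = sum(weights)
    if total == 0.0:
        # Every memory has expired: nothing left to attend to.
        return [0.0 for _ in weights]
    return [w / total for w in weights]


if __name__ == "__main__":
    # Three memories with the same span (10) but different ages, ramp R = 4.
    print(masked_attention([0.0, 0.0, 0.0], [10.0, 10.0, 10.0], [5, 12, 20], 4.0))
```

In this toy setup the youngest memory keeps its full weight, the one inside the ramp is down-weighted, and the oldest is dropped entirely, which is what lets expired memories be removed from storage.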
OUTLINE:
0:00 - Intro & Overview
2:30 - Remembering the past in sequence models
5:45 - Learning to expire past memories
8:30 - Difference to local attention
10:00 - Architecture overview
13:45 - Comparison to Transformer XL
18:50 - Predicting expiration masks
32:30 - Experimental Results
40:00 - Conclusion & Comments
Paper:
Code:
ADDENDUM: I mention several times that the gradient signal of the e quantity only occurs inside the R ramp. By th