A ConvNet for the 2020s – Paper Explained (with animations)

Can a ConvNet outperform a Vision Transformer? What kind of modifications do we have to apply to a ConvNet to make it as powerful as a Transformer? Spoiler: it’s not attention.

► SPONSOR: Weights & Biases
👉 The official ConvNeXt repo has a W&B integration! Also, W&B built the CIFAR-10 training colab linked there: 🥳

❓ Check out our daily #MachineLearning Quiz Questions:

Explained Paper 📜: Liu, Zhuang, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, and Saining Xie. “A ConvNet for the 2020s.” arXiv preprint arXiv: (2022).
🔗 Tweet by Lucas Beyer (ViT author):
🔗 Depthwise convolutions image and explanation:
