Date | Topic | Papers |
Aug 26 | Course Introduction | |
Aug 31 | Convolutional Neural Networks [no code (yet)] |
| Gradient-based learning applied to document recognition, LeCun, Bottou, Bengio, Haffner; 1998 |
| ImageNet Classification with Deep Convolutional Neural Networks, Krizhevsky, Sutskever, Hinton; 2012 |
| Network In Network, Lin, Chen, Yan; 2013 |
| Going Deeper with Convolutions, Szegedy, Liu, Jia, Sermanet, Reed, Anguelov, Erhan, Vanhoucke, Rabinovich; 2014 |
| Very Deep Convolutional Networks for Large-Scale Image Recognition, Simonyan, Zisserman; 2014 |
Sep 02 | Convolutional Neural Networks [no code (yet)] |
| Deep Residual Learning for Image Recognition, He, Zhang, Ren, Sun; 2015 |
| Densely Connected Convolutional Networks, Huang, Liu, Maaten, Weinberger; 2016 |
| MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, Howard, Zhu, Chen, Kalenichenko, Wang, Weyand, Andreetto, Adam; 2017 |
| MobileNetV2: Inverted Residuals and Linear Bottlenecks, Sandler, Howard, Zhu, Zhmoginov, Chen; 2018 |
| EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks, Tan, Le; 2019 |
Sep 07 | Non-linearities (and initialization) [C] |
| Understanding the difficulty of training deep feedforward neural networks, Glorot, Bengio; 2010 | S1: Serdjan Rolovic
S2: Elias Lampietti
C: Matthew Kelleher
R: Samantha Hay
| Deep Sparse Rectifier Neural Networks, Glorot, Bordes, Bengio; 2011 | S1: Ishank Arora
S2: Yeming Wen
C: Ayush Chauhan
R: Christopher Hahn
| Maxout Networks, Goodfellow, Warde-Farley, Mirza, Courville, Bengio; 2013 | S1: Zayne Sprague
S2: Zhou Fang
C: Reid Ling Tong Li
R: Jay Whang
| Exact solutions to the nonlinear dynamics of learning in deep linear neural networks, Saxe, McClelland, Ganguli; 2013 | S1: Liyan Chen
C: Marlan McInnes-Taylor
R: Nilesh Gupta
| Rectifier Nonlinearities Improve Neural Network Acoustic Models, Maas, Hannun, Ng; 2013 | S1: Kelsey Ball
S2: Srinath Tankasala
C: Hung-Ting Chen
R: Sai Kiran Maddela
Sep 09 | Non-linearities (and initialization) [C] |
| Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, He, Zhang, Ren, Sun; 2015 | S1: Tongrui Li
S2: Jordi Ramos Chen
| Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs), Clevert, Unterthiner, Hochreiter; 2015 | S1: Joshua Papermaster
C: Ian Trowbridge
R: Tarannum Khan
| Searching for Activation Functions, Ramachandran, Zoph, Le; 2017 | S1:
S2: Jay Liao
C: Ishan Shah
R: Kiran Raja
| Mish: A Self Regularized Non-Monotonic Activation Function, Misra; 2019 | S1: Shivi Agarwal
S2: Atreya Dey
C: Ojas Patel
R: Daniel Almeraz
| Gaussian Error Linear Units (GELUs), Hendrycks, Gimpel; 2016 | S1: Marco Bueso
S2: Jose Chavez
C: Cheng-Chun Hsu
R: Shivang Singh
Sep 14 | Optimizers [C] |
| Large-Scale Machine Learning with Stochastic Gradient Descent, Bottou; 2010 | S1: Elias Lampietti
S2: Zayne Sprague
C: Samantha Hay
R: Ojas Patel
| On the importance of initialization and momentum in deep learning, Sutskever, Martens, Dahl, Hinton; 2013 | S1: Zhou Fang
S2: Kelsey Ball
C: Kiran Raja
R: Reid Ling Tong Li
| Cyclical Learning Rates for Training Neural Networks, Smith; 2015 | S1: Jay Liao
S2: Liyan Chen
C: Shivang Singh
R: Ian Trowbridge
| SGDR: Stochastic Gradient Descent with Warm Restarts, Loshchilov, Hutter; 2016 | S1: Atreya Dey
S2: Tongrui Li
C: Tarannum Khan
| Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates, Smith, Topin; 2017 | S1: Jose Chavez
S2: Marco Bueso
R: Marlan McInnes-Taylor
Sep 16 | Optimizers [C] |
| Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, Duchi, Hazan, Singer; 2011 | S1: Yeming Wen
C: Daniel Almeraz
R: Ayush Chauhan
| ADADELTA: An Adaptive Learning Rate Method, Zeiler; 2012 | S1: Srinath Tankasala
S2: Shivi Agarwal
C: Nilesh Gupta
R: Matthew Kelleher
| Adam: A Method for Stochastic Optimization, Kingma, Ba; 2014 | S1: Jordi Ramos Chen
S2: Ishank Arora
C: Jay Whang
R: Cheng-Chun Hsu
| On the Convergence of Adam and Beyond, Reddi, Kale, Kumar; 2019 | S1: Abayomi Adekanmbi
S2: Joshua Papermaster
C: Sai Kiran Maddela
R: Hung-Ting Chen
| Decoupled Weight Decay Regularization, Loshchilov, Hutter; 2017 | S1:
S2: Serdjan Rolovic
C: Christopher Hahn
R: Ishan Shah
Sep 21 | Normalizations [C] |
| Dropout: A Simple Way to Prevent Neural Networks from Overfitting, Srivastava, Hinton, Krizhevsky, Sutskever, Salakhutdinov; 2014 | S1: Ojas Patel
S2: Tarannum Khan
C: Marco Bueso
| Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, Ioffe, Szegedy; 2015 | S1: Ishan Shah
C: Kelsey Ball
R: Elias Lampietti
| Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks, Salimans, Kingma; 2016 | S1: Marlan McInnes-Taylor
S2: Samantha Hay
C: Zayne Sprague
R: Jay Liao
| Layer Normalization, Ba, Kiros, Hinton; 2016 | S1: Ayush Chauhan
S2: Jay Whang
C: Ishank Arora
R: Atreya Dey
| Instance Normalization: The Missing Ingredient for Fast Stylization, Ulyanov, Vedaldi, Lempitsky; 2016 | S1: Reid Ling Tong Li
S2: Kiran Raja
C: Liyan Chen
Sep 23 | Normalizations [C] |
| Group Normalization, Wu, He; 2018 | S1: Ian Trowbridge
S2: Shivang Singh
C: Shivi Agarwal
R: Srinath Tankasala
| High-Performance Large-Scale Image Recognition Without Normalization, Brock, De, Smith, Simonyan; 2021 | S1: Matthew Kelleher
S2: Christopher Hahn
C: Serdjan Rolovic
R: Jose Chavez
| Micro-Batch Training with Batch-Channel Normalization and Weight Standardization, Qiao, Wang, Liu, Shen, Yuille; 2019 | S1:
S2: Nilesh Gupta
C: Joshua Papermaster
R: Jordi Ramos Chen
| Understanding Batch Normalization, Bjorck, Gomes, Selman, Weinberger; 2018 | S1: Hung-Ting Chen
S2: Daniel Almeraz
C: Tongrui Li
R: Zhou Fang
| Rethinking "Batch" in BatchNorm, Wu, Johnson; 2021 | S1: Cheng-Chun Hsu
S2: Sai Kiran Maddela
R: Yeming Wen
Sep 28 | Sequence models [C] |
| Sequence to Sequence Learning with Neural Networks, Sutskever, Vinyals, Le; 2014 | S1: Sai Kiran Maddela
S2: Ian Trowbridge
C: Srinath Tankasala
R: Liyan Chen
| Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling, Chung, Gulcehre, Cho, Bengio; 2014 | S1: Christopher Hahn
S2: Reid Ling Tong Li
C: Jay Liao
R: Tongrui Li
| Neural Machine Translation by Jointly Learning to Align and Translate, Bahdanau, Cho, Bengio; 2014 | S1: Shivang Singh
S2: Ayush Chauhan
C: Elias Lampietti
R: Serdjan Rolovic
| Attention Is All You Need, Vaswani, Shazeer, Parmar, Uszkoreit, Jones, Gomez, Kaiser, Polosukhin; 2017 | S1: Samantha Hay
S2: Ojas Patel
C: Zhou Fang
R: Ishank Arora
| End-To-End Memory Networks, Sukhbaatar, Szlam, Weston, Fergus; 2015 | S1: Tarannum Khan
S2: Marlan McInnes-Taylor
R: Zayne Sprague
Sep 30 | Sequence models [C] |
| Attention is Not All You Need: Pure Attention Loses Rank Doubly Exponentially with Depth, Dong, Cordonnier, Loukas; 2021 | S1: Nilesh Gupta
S2: Ishan Shah
C: Jordi Ramos Chen
R: Kelsey Ball
| BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Devlin, Chang, Lee, Toutanova; 2018 | S1: Daniel Almeraz
C: Jose Chavez
R: Joshua Papermaster
| A Primer in BERTology: What we know about how BERT works, Rogers, Kovaleva, Rumshisky; 2020 | S1:
S2: Hung-Ting Chen
C: Atreya Dey
R: Shivi Agarwal
| Improving Language Understanding by Generative Pre-Training, Radford, Narasimhan, Salimans, Sutskever; 2018 | S1: Kiran Raja
S2: Cheng-Chun Hsu
C: Yeming Wen
R: Marco Bueso
| Language Models are Few-Shot Learners, Brown, Mann, Ryder, Subbiah, Kaplan, Dhariwal, Neelakantan, Shyam, Sastry, Askell, Agarwal, Herbert-Voss, Krueger, Henighan, Child, Ramesh, Ziegler, Wu, Winter, Hesse, Chen, Sigler, Litwin, Gray, Chess, Clark, Berner, McCandlish, Radford, Sutskever, Amodei; 2020 | S1: Jay Whang
S2: Matthew Kelleher
Oct 05 | Efficient Transformers [C] |
| Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context, Dai, Yang, Yang, Carbonell, Le, Salakhutdinov; 2019 | S1: Serdjan Rolovic
S2: Elias Lampietti
C: Matthew Kelleher
R: Samantha Hay
| Generating Long Sequences with Sparse Transformers, Child, Gray, Radford, Sutskever; 2019 | S1: Ishank Arora
S2: Yeming Wen
C: Ayush Chauhan
R: Christopher Hahn
| Compressive Transformers for Long-Range Sequence Modelling, Rae, Potapenko, Jayakumar, Lillicrap; 2019 | S1: Zayne Sprague
S2: Zhou Fang
C: Reid Ling Tong Li
R: Jay Whang
| Reformer: The Efficient Transformer, Kitaev, Kaiser, Levskaya; 2020 | S1: Liyan Chen
C: Marlan McInnes-Taylor
R: Nilesh Gupta
| Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention, Katharopoulos, Vyas, Pappas, Fleuret; 2020 | S1: Kelsey Ball
S2: Srinath Tankasala
C: Hung-Ting Chen
R: Sai Kiran Maddela
Oct 07 | Efficient Transformers [C] |
| Linformer: Self-Attention with Linear Complexity, Wang, Li, Khabsa, Fang, Ma; 2020 | S1: Tongrui Li
S2: Jordi Ramos Chen
| Rethinking Attention with Performers, Choromanski, Likhosherstov, Dohan, Song, Gane, Sarlos, Hawkins, Davis, Mohiuddin, Kaiser, Belanger, Colwell, Weller; 2020 | S1: Joshua Papermaster
C: Ian Trowbridge
R: Tarannum Khan
| Longformer: The Long-Document Transformer, Beltagy, Peters, Cohan; 2020 | S1:
S2: Jay Liao
C: Ishan Shah
R: Kiran Raja
| Big Bird: Transformers for Longer Sequences, Zaheer, Guruganesh, Dubey, Ainslie, Alberti, Ontanon, Pham, Ravula, Wang, Yang, Ahmed; 2020 | S1: Shivi Agarwal
S2: Atreya Dey
C: Ojas Patel
R: Daniel Almeraz
| LambdaNetworks: Modeling Long-Range Interactions Without Attention, Bello; 2021 | S1: Marco Bueso
S2: Jose Chavez
C: Cheng-Chun Hsu
R: Shivang Singh
Oct 12 | Vision Transformers [C] |
| An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, Dosovitskiy, Beyer, Kolesnikov, Weissenborn, Zhai, Unterthiner, Dehghani, Minderer, Heigold, Gelly, Uszkoreit, Houlsby; 2020 | S1: Elias Lampietti
S2: Zayne Sprague
C: Samantha Hay
R: Ojas Patel
| Training data-efficient image transformers & distillation through attention, Touvron, Cord, Douze, Massa, Sablayrolles, Jégou; 2020 | S1: Zhou Fang
S2: Kelsey Ball
C: Kiran Raja
R: Reid Ling Tong Li
| BEiT: BERT Pre-Training of Image Transformers, Bao, Dong, Wei; 2021 | S1: Jay Liao
S2: Liyan Chen
C: Shivang Singh
R: Ian Trowbridge
| LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference, Graham, El-Nouby, Touvron, Stock, Joulin, Jégou, Douze; 2021 | S1: Atreya Dey
S2: Tongrui Li
C: Tarannum Khan
| Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions, Wang, Xie, Li, Fan, Song, Liang, Lu, Luo, Shao; 2021 | S1: Jose Chavez
S2: Marco Bueso
R: Marlan McInnes-Taylor
Oct 14 | Vision Transformers [C] |
| Swin Transformer: Hierarchical Vision Transformer using Shifted Windows, Liu, Lin, Cao, Hu, Wei, Zhang, Lin, Guo; 2021 | S1: Yeming Wen
C: Daniel Almeraz
R: Ayush Chauhan
| Transformer in Transformer, Han, Xiao, Wu, Guo, Xu, Wang; 2021 | S1: Srinath Tankasala
S2: Shivi Agarwal
C: Nilesh Gupta
R: Matthew Kelleher
| Perceiver: General Perception with Iterative Attention, Jaegle, Gimeno, Brock, Zisserman, Vinyals, Carreira; 2021 | S1: Jordi Ramos Chen
S2: Ishank Arora
C: Jay Whang
R: Cheng-Chun Hsu
| Perceiver IO: A General Architecture for Structured Inputs & Outputs, Jaegle, Borgeaud, Alayrac, Doersch, Ionescu, Ding, Koppula, Zoran, Brock, Shelhamer, Hénaff, Botvinick, Zisserman, Vinyals, Carreira; 2021 | S1: Abayomi Adekanmbi
S2: Joshua Papermaster
C: Sai Kiran Maddela
R: Hung-Ting Chen
| MLP-Mixer: An all-MLP Architecture for Vision, Tolstikhin, Houlsby, Kolesnikov, Beyer, Zhai, Unterthiner, Yung, Steiner, Keysers, Uszkoreit, Lucic, Dosovitskiy; 2021 | S1:
S2: Serdjan Rolovic
C: Christopher Hahn
R: Ishan Shah
Oct 19 | Implicit functions [C] |
| DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation, Park, Florence, Straub, Newcombe, Lovegrove; 2019 | S1: Ojas Patel
S2: Tarannum Khan
C: Marco Bueso
| Occupancy Networks: Learning 3D Reconstruction in Function Space, Mescheder, Oechsle, Niemeyer, Nowozin, Geiger; 2018 | S1: Ishan Shah
C: Kelsey Ball
R: Elias Lampietti
| Implicit Geometric Regularization for Learning Shapes, Gropp, Yariv, Haim, Atzmon, Lipman; 2020 | S1: Marlan McInnes-Taylor
S2: Samantha Hay
C: Zayne Sprague
R: Jay Liao
| Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains, Tancik, Srinivasan, Mildenhall, Fridovich-Keil, Raghavan, Singhal, Ramamoorthi, Barron, Ng; 2020 | S1: Ayush Chauhan
S2: Jay Whang
C: Ishank Arora
R: Atreya Dey
| Implicit Neural Representations with Periodic Activation Functions, Sitzmann, Martel, Bergman, Lindell, Wetzstein; 2020 | S1: Reid Ling Tong Li
S2: Kiran Raja
C: Liyan Chen
Oct 21 | Implicit functions [C] |
| Learning Continuous Image Representation with Local Implicit Image Function, Chen, Liu, Wang; 2020 | S1: Ian Trowbridge
S2: Shivang Singh
C: Shivi Agarwal
R: Srinath Tankasala
| NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis, Mildenhall, Srinivasan, Tancik, Barron, Ramamoorthi, Ng; 2020 | S1: Matthew Kelleher
S2: Christopher Hahn
C: Serdjan Rolovic
R: Jose Chavez
| NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections, Martin-Brualla, Radwan, Sajjadi, Barron, Dosovitskiy, Duckworth; 2020 | S1:
S2: Nilesh Gupta
C: Joshua Papermaster
R: Jordi Ramos Chen
| Baking Neural Radiance Fields for Real-Time View Synthesis, Hedman, Srinivasan, Mildenhall, Barron, Debevec; 2021 | S1: Hung-Ting Chen
S2: Daniel Almeraz
C: Tongrui Li
R: Zhou Fang
| GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields, Niemeyer, Geiger; 2020 | S1: Cheng-Chun Hsu
S2: Sai Kiran Maddela
R: Yeming Wen
Oct 26 | 2D recognition [C] |
| Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, Ren, He, Girshick, Sun; 2015 | S1: Sai Kiran Maddela
S2: Ian Trowbridge
C: Srinath Tankasala
R: Liyan Chen
| You Only Look Once: Unified, Real-Time Object Detection, Redmon, Divvala, Girshick, Farhadi; 2015 | S1: Christopher Hahn
S2: Reid Ling Tong Li
C: Jay Liao
R: Tongrui Li
| Focal Loss for Dense Object Detection, Lin, Goyal, Girshick, He, Dollár; 2017 | S1: Shivang Singh
S2: Ayush Chauhan
C: Elias Lampietti
R: Serdjan Rolovic
| Mask R-CNN, He, Gkioxari, Dollár, Girshick; 2017 | S1: Samantha Hay
S2: Ojas Patel
C: Zhou Fang
R: Ishank Arora
| Cascade R-CNN: Delving into High Quality Object Detection, Cai, Vasconcelos; 2017 | S1: Tarannum Khan
S2: Marlan McInnes-Taylor
R: Zayne Sprague
Oct 28 | 2D recognition [C] |
| Deformable Convolutional Networks, Dai, Qi, Xiong, Li, Zhang, Hu, Wei; 2017 | S1: Nilesh Gupta
S2: Ishan Shah
C: Jordi Ramos Chen
R: Kelsey Ball
| CornerNet: Detecting Objects as Paired Keypoints, Law, Deng; 2018 | S1: Daniel Almeraz
C: Jose Chavez
R: Joshua Papermaster
| Objects as Points, Zhou, Wang, Krähenbühl; 2019 | S1:
S2: Hung-Ting Chen
C: Atreya Dey
R: Shivi Agarwal
| End-to-End Object Detection with Transformers, Carion, Massa, Synnaeve, Usunier, Kirillov, Zagoruyko; 2020 | S1: Kiran Raja
S2: Cheng-Chun Hsu
C: Yeming Wen
R: Marco Bueso
| Deformable DETR: Deformable Transformers for End-to-End Object Detection, Zhu, Su, Lu, Li, Wang, Dai; 2020 | S1: Jay Whang
S2: Matthew Kelleher
Nov 02 | 3D recognition [C] |
| PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation, Qi, Su, Mo, Guibas; 2016 | S1: Serdjan Rolovic
S2: Elias Lampietti
C: Matthew Kelleher
R: Samantha Hay
| PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space, Qi, Yi, Su, Guibas; 2017 | S1: Ishank Arora
S2: Yeming Wen
C: Ayush Chauhan
R: Christopher Hahn
| Dynamic Graph CNN for Learning on Point Clouds, Wang, Sun, Liu, Sarma, Bronstein, Solomon; 2018 | S1: Zayne Sprague
S2: Zhou Fang
C: Reid Ling Tong Li
R: Jay Whang
| PointCNN: Convolution On $\mathcal{X}$-Transformed Points, Li, Bu, Sun, Wu, Di, Chen; 2018 | S1: Liyan Chen
C: Marlan McInnes-Taylor
R: Nilesh Gupta
| Point Transformer, Zhao, Jiang, Jia, Torr, Koltun; 2020 | S1: Kelsey Ball
S2: Srinath Tankasala
C: Hung-Ting Chen
R: Sai Kiran Maddela
Nov 04 | 3D recognition [C] |
| VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection, Zhou, Tuzel; 2017 | S1: Tongrui Li
S2: Jordi Ramos Chen
| PointPillars: Fast Encoders for Object Detection from Point Clouds, Lang, Vora, Caesar, Zhou, Yang, Beijbom; 2018 | S1: Joshua Papermaster
C: Ian Trowbridge
R: Tarannum Khan
| PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud, Shi, Wang, Li; 2018 | S1:
S2: Jay Liao
C: Ishan Shah
R: Kiran Raja
| Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving, Wang, Chao, Garg, Hariharan, Campbell, Weinberger; 2018 | S1: Shivi Agarwal
S2: Atreya Dey
C: Ojas Patel
R: Daniel Almeraz
| Center-based 3D Object Detection and Tracking, Yin, Zhou, Krähenbühl; 2020 | S1: Marco Bueso
S2: Jose Chavez
C: Cheng-Chun Hsu
R: Shivang Singh
Nov 09 | Open world perception [C] |
| Momentum Contrast for Unsupervised Visual Representation Learning, He, Fan, Wu, Xie, Girshick; 2019 | S1: Elias Lampietti
S2: Zayne Sprague
C: Samantha Hay
R: Ojas Patel
| A Simple Framework for Contrastive Learning of Visual Representations, Chen, Kornblith, Norouzi, Hinton; 2020 | S1: Zhou Fang
S2: Kiran Raja
C: Kelsey Ball
R: Reid Ling Tong Li
| VirTex: Learning Visual Representations from Textual Annotations, Desai, Johnson; 2020 | S1: Jay Liao
S2: Liyan Chen
C: Shivang Singh
R: Ian Trowbridge
| Contrastive Learning of Medical Visual Representations from Paired Images and Text, Zhang, Jiang, Miura, Manning, Langlotz; 2020 | S1: Atreya Dey
S2: Tongrui Li
C: Marco Bueso
| Learning Transferable Visual Models From Natural Language Supervision, Radford, Kim, Hallacy, Ramesh, Goh, Agarwal, Sastry, Askell, Mishkin, Clark, Krueger, Sutskever; 2021 | S1: Jose Chavez
S2: Tarannum Khan
R: Marlan McInnes-Taylor
Nov 11 | Open world perception [C] |
| Towards Open Set Deep Networks, Bendale, Boult; 2015 | S1: Yeming Wen
C: Daniel Almeraz
R: Ayush Chauhan
| Large-Scale Long-Tailed Recognition in an Open World, Liu, Miao, Zhan, Wang, Gong, Yu; 2019 | S1: Srinath Tankasala
S2: Shivi Agarwal
C: Nilesh Gupta
R: Matthew Kelleher
| Class-Balanced Loss Based on Effective Number of Samples, Cui, Jia, Lin, Song, Belongie; 2019 | S1: Jordi Ramos Chen
S2: Ishank Arora
C: Jay Whang
R: Cheng-Chun Hsu
| Decoupling Representation and Classifier for Long-Tailed Recognition, Kang, Xie, Rohrbach, Yan, Gordo, Feng, Kalantidis; 2019 | S1: Abayomi Adekanmbi
S2: Joshua Papermaster
C: Sai Kiran Maddela
R: Hung-Ting Chen
| Overcoming Classifier Imbalance for Long-tail Object Detection with Balanced Group Softmax, Li, Wang, Kang, Tang, Wang, Li, Feng; 2020 | S1:
S2: Serdjan Rolovic
C: Christopher Hahn
R: Ishan Shah
Nov 16 | Temporal reasoning and Video [no code (yet)] |
| Two-Stream Convolutional Networks for Action Recognition in Videos, Simonyan, Zisserman; 2014 |
| Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset, Carreira, Zisserman; 2017 |
| SlowFast Networks for Video Recognition, Feichtenhofer, Fan, Malik, He; 2018 |
| Is Space-Time Attention All You Need for Video Understanding?, Bertasius, Wang, Torresani; 2021 |
| Multiscale Vision Transformers, Fan, Xiong, Mangalam, Li, Yan, Malik, Feichtenhofer; 2021 |
Nov 18 | Temporal reasoning and Video [no code (yet)] |
| Online Model Distillation for Efficient Video Inference, Mullapudi, Chen, Zhang, Ramanan, Fatahalian; 2018 |
| Long-Term Feature Banks for Detailed Video Understanding, Wu, Feichtenhofer, Fan, He, Krähenbühl, Girshick; 2018 |
| Long Short-Term Transformer for Online Action Detection, Xu, Xiong, Chen, Li, Xia, Tu, Soatto; 2021 |
| Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling, Lei, Li, Zhou, Gan, Berg, Bansal, Liu; 2021 |
| CLEVRER: CoLlision Events for Video REpresentation and Reasoning, Yi, Gan, Li, Kohli, Wu, Torralba, Tenenbaum; 2019 |
Nov 23 | Final Project Q/A | |
Nov 25 | No class - Thanksgiving | |
Nov 30 | Final Project Presentations |
Dec 02 | Final Project Presentations |