Overview

We are witnessing groundbreaking results in text-to-image and image-to-3D models. However, the generation process with these models is iterative and computationally expensive, requiring multiple sampling steps through large models. There is a growing need to make these algorithms faster so they can serve millions of users without requiring large numbers of GPUs/TPUs. In this course, we will focus on techniques such as progressive parallel decoding, distillation methods, and Markov Random Fields to speed up generative models. The course will also highlight the limitations of popular image evaluation metrics such as FID and present efficient alternatives such as CMMD and DreamSim.
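To give a flavor of the CMMD metric mentioned above: it scores generated images with the Maximum Mean Discrepancy (MMD) between embeddings of real and generated images, rather than FID's Gaussian assumption. Below is a minimal sketch of a squared-MMD estimate with a Gaussian RBF kernel; the random vectors, dimensions, and `sigma` value are illustrative stand-ins (CMMD itself uses CLIP image embeddings and its own kernel settings).

```python
import numpy as np

def mmd2(x, y, sigma=10.0):
    """Biased squared-MMD estimate between sample sets x (n, d) and y (m, d),
    using a Gaussian RBF kernel. In CMMD the rows would be CLIP embeddings;
    here random vectors stand in for illustration."""
    def k(a, b):
        # Pairwise squared Euclidean distances, then the RBF kernel.
        d2 = (np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :]
              - 2.0 * a @ b.T)
        return np.exp(-d2 / (2.0 * sigma**2))
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()

rng = np.random.default_rng(0)
real = rng.normal(size=(64, 16))            # stand-in for real-image embeddings
fake = rng.normal(0.5, 1.0, size=(64, 16))  # stand-in for generated embeddings
print(mmd2(real, fake))  # positive: the two distributions differ
```

Unlike FID, this estimate needs no covariance-matrix computation and makes no normality assumption on the embedding distribution, which is the efficiency and robustness argument behind CMMD.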

Speakers

Shobhita Sundaram
Massachusetts Institute of Technology (MIT)
Sadeep Jayasumana
Google Research
Varun Jampani
Stability AI
Dilip Krishnan
Google DeepMind
Srikumar Ramalingam
Google Research


Schedule

  • Date: September 29, 2024
  • Time: 9:10 AM - 1:00 PM
  • Location: Amber 3
Time Instructor Title
9:10 AM Srikumar Ramalingam Cornerstones of the Text-To-Pixels Journey
9:50 AM Shobhita Sundaram Image Evaluation Methods
10:30 AM Break
11:00 AM Varun Jampani Thinking Slow and Fast: Recent Trends in 3D Generative Models
11:30 AM Dilip Krishnan Parallel Decoding and Image Generation
12:00 PM Sadeep Jayasumana Structured Prediction Algorithms for Fast Image Generation