Overview

We are witnessing groundbreaking results from text-to-image and text-to-video models. However, generation with these models is iterative and computationally expensive, requiring many sampling steps through large networks. There is a growing need to make these algorithms fast enough to serve millions of users without excessive GPU/TPU resources. In this course, we will focus on techniques such as progressive parallel decoding, distillation methods, and Markov random fields to speed up generative models.
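The cost argument above can be made concrete with a minimal sketch (purely illustrative, not any specific model from the course): an iterative sampler pays one forward pass of a large network per denoising step, so distillation-style methods that cut a sampler from dozens of steps to a handful reduce serving cost roughly in proportion.

```python
def sample(model_step, num_steps, x0=0.0):
    """Run an iterative sampler: one model call per step.

    Returns the final sample and the number of model calls,
    which dominates serving cost for large networks.
    """
    x, calls = x0, 0
    for _ in range(num_steps):
        x = model_step(x)
        calls += 1
    return x, calls

# Toy "model step" standing in for one forward pass of a large network
# (hypothetical; a real diffusion step would denoise a tensor).
step = lambda x: x + 1.0

_, teacher_calls = sample(step, num_steps=50)  # typical many-step sampler
_, student_calls = sample(step, num_steps=4)   # distilled few-step sampler

print(teacher_calls, student_calls)  # 50 4
```

With identical per-step cost, the few-step student here is 12.5x cheaper to serve; the techniques covered in the course aim to achieve such reductions while preserving sample quality.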

Speakers

Richard Hartley
Australian National University
Sadeep Jayasumana
OCTAVE | Ex-Google AI Research
Ameesh Makadia
Google Research
Srikumar Ramalingam
Google Research


Schedule

  • Date: June 12, 2024
  • Time: 9:00 AM - 12:30 PM
  • Location:
Time      Instructor           Title
9:00 AM   Richard Hartley      Mathematics of Diffusion Models
9:45 AM   Srikumar Ramalingam  Efficient Methods and Cornerstones of Text-to-Image Generation
10:30 AM                       Break
10:45 AM  Sadeep Jayasumana    Continuous MRF and FoE Models for Text-to-Image Generation
11:30 AM  Ameesh Makadia       Efficient Text-to-3D and Text-to-Video Generation