Meet the Founders of Decoherence: Q&A with Rishi Bhuta and Will Stith
A chat with Decoherence co-founders Rishi and Will on how it all started, and what they envision for the future of generative AI video.
1. Hi Rishi and Will! Tell us about your background.
Rishi: I originally started programming out of necessity. I was fresh out of high school and determined to work as an aerospace engineer at NASA. I joined a project team building an autonomous space rover, where I got “stuck with” programming the navigation stack. I quickly realized I was far more interested in software and switched my major to Computer Science, where I specialized in computer vision.
Will: I love to push the boundaries of technology and work on things that are just barely possible. Prior to Decoherence, I worked on a satellite constellation aiming to bring high-speed internet to every location on the planet. Before that I was involved with two different robotics projects: one that autonomously wrangles Amazon packages inside their warehouses, and one that transformed a lawnmower into a self-driving vehicle.
2. How did you two meet?
Rishi: Will and I met on the job at Amazon Robotics back in 2019. We were working on similar projects, so we collaborated often at work, but what really kicked off our friendship was the weekly trips we took exploring the Pacific Northwest. We’d go on hikes, backpacking trips, and ski trips to discuss our latest ideas, and we still do that today at Decoherence. Nothing quite like a strategy hike!
Will: I met Rishi on my first day at Amazon. We immediately hit it off because of our shared interest in robotics and skiing. I moved out to Seattle to work at Amazon, and I think for my first winter out here I went skiing with Rishi every single weekend. We then became roommates after Covid hit and our previous roommates moved back home.
3. When did you become interested in generative AI?
Rishi: I really started getting excited about generative AI when I had the chance to play with Disco Diffusion, DALL·E 2, and eventually Midjourney in early 2022. I was making thousands of images and sharing them non-stop with friends and family. It felt like I finally had the tools to visualize my ideas. It really hit me at that moment: creativity isn’t scarce; it’s limited by the tools we have to manifest it.
Will: The moment Stable Diffusion, an open-source text-to-image model, was released was the defining moment for me. Suddenly I could use my PC at home to generate an unlimited number of images. It reminded me of the feeling I had in my first college robotics course, watching my code control a robot in real life. But with Stable Diffusion, I was typing in English and watching images come to life in front of me. I felt like the world fundamentally changed on that day.
4. What inspired you to create Decoherence?
Rishi: I’ve always loved working on things that felt alive in one way or another. I previously worked in robotics because I loved that I could write code, look over, and see my robot moving. It felt tangible and cooperative: working together with a machine to achieve a goal. That same magic struck when I generated my first image and, soon after, my first video. The only problem was that I had to download a bunch of disparate tools, wrestle with difficult UIs, follow poorly documented guides, keep up with the latest research, and pray my GPU didn’t give out. If that’s the barrier standing between people and their creativity, I think we can fix it.
Will: I was blown away by what was suddenly technologically possible after Stable Diffusion was released, and began thinking about how the technology would continue to progress. The natural evolution after text-to-image seemed to be generating video, so I decided to go all-in on Decoherence and push the boundaries of what’s possible.
5. What challenges did you have to overcome in the process of developing Decoherence?
Rishi: Unlike text and images, generating truly realistic videos remains unsolved. Video is a particularly challenging domain because it has a temporal element: subjects in a video need to move coherently over time. Maintaining this temporal consistency is especially difficult for AI systems, but there is some incredibly promising work being done to get past this hurdle.
Will: Decoherence is trying to reach into the future and build the next generation of video creation before the underlying technology is truly ready for it. We know that generative video is still in its infancy, and so our biggest challenge is building the correct application for a technology that hasn’t been invented yet.
6. Are there unexpected ways in which people have utilized Decoherence that surprised you?
Rishi: In the early days of Decoherence, we had a user who was an elementary school teacher. His students were reading a book in class, but they would often lose focus or interest in the stories. He had the students describe the scenes as they remembered them after reading, and then used Decoherence to animate their recollections. He said they loved seeing their descriptions come to life visually, and it kept them engaged and eager to remember the details of the book!
Will: Yes! I’m constantly surprised by our users’ creativity. We expected users to make music videos with Decoherence. What I didn’t expect was for users to start telling stories using the videos they’ve generated. Some of my favorite examples are amateur authors using Decoherence to complement their audiobook narration, and school teachers using Decoherence to engage their students.
7. In terms of future developments, what new features or enhancements can users expect next from Decoherence?
Rishi: We have some really exciting things coming soon in two categories. First, we have major improvements to the video generation system coming very soon. This upgrade will let you create much more consistent videos with less flickering. In addition to the generation system, we have major updates coming to the overall Decoherence editor. We want to give you more control, which means better management of scenes and generations. We also know that making videos is typically not a one-and-done process; being able to craft, edit, and refine your videos is essential.
Will: We are always trying to push the boundaries of what’s possible with generative video. Next up you can expect a new method for generating short clips with consistent characters and realistic motion. We’re also making improvements to the creative process - users will be able to generate a video scene-by-scene rather than all at once. This will let you iterate on your generative video faster and make that perfect final shot.
8. What do you think the future of AI-powered video generation looks like? What do you hope to see?
Rishi: It’s going to be an absolute explosion of creativity across the world. Just 25 years ago, creating and distributing video was reserved for people with connections and big budgets. Over the last decade, the creator economy took hold and millions of people were able to use more accessible filming equipment, editing tools, and social media to share their stories with the world. This next decade will deepen that trend dramatically. I fully expect small teams with tiny budgets to start creating films, TV shows, and music videos that rival the big studios. I hope to see a future where even more of the world’s creativity is unlocked.
Will: I think the future of AI video generation will unlock an entirely new population of creatives. Just as the smartphone brought videography from the studio and into everybody’s pocket, AI video generation will enable even more people to be creative. I hope to see a future where creators aren’t limited by their access to actors, to sets, and to VFX, but instead are only limited by their own imagination.
9. Tell us something about yourself that might surprise people. Any secret talents?
Rishi: Back in middle school, I ran a school-wide candy-selling business with my best friend. We got so far into it that we kept track of customers, and we let people with good “credit” sign a contract to buy candy if they promised to pay us back within a week. Eventually, the teachers found out and made us stop (or so they thought), but my parents were blown away that a 13-year-old was making a few hundred bucks a month and didn’t punish me!
Will: I wanted to be a gymnast when I was younger, but my parents didn’t let me join a team. They claimed I was going to be “too tall”, but I think it was actually because I was a guy. So instead I taught myself some routines. I can still do a standing backflip today!
10. What advice would you give to your younger self?
Rishi: Make things and give them to people fast! Don’t wait until it’s “perfect”. If it’s truly something that people want, they will be forgiving of its shortcomings and eager to help you make it better.
Will: Starting a company seems to get harder, not easier, as you get older. So start a company when you’re young! When people get older they tend to get things like serious relationships, expensive hobbies, and mortgage payments. All of those things make it that much harder to quit your job and go all-in on some crazy idea.
11. And finally, do you have any tips for people who want to get better at creating AI art?
Rishi: Mastering AI generators is a bit of randomness, a bit of skill, and a lot of trial and error. We’re building this editor so you can iterate faster and easier because it’s pretty rare to get “the perfect shot” on your first attempt. Just keep on trying things and share your art with others so everyone can learn from each other.
Will: Keep practicing! Making AI art is a skill, and you can definitely get better at it with practice. The more you use AI art tools, the more natural it will start to feel. Even when you are practiced, making that perfect piece of art takes iteration. Most videos I make go through at least 5 iterations before I’m happy with the final result.