Large language models (LLMs) are revolutionizing the way we interact with computers and the world around us. However, to truly understand the world, LLM-powered agents need to be able to see. While vision-language models present a promising pathway to such multimodal understanding, it turns out that text-only LLMs can achieve remarkable success on vision tasks with prompting and tool use.
In this talk, Jacob Marks will give an overview of key LLM-centered projects that are transforming the field of computer vision, such as VisProg, ViperGPT, VoxelGPT, and HuggingGPT. He will also discuss his first-hand experience of building VoxelGPT, shedding light on the challenges and lessons learned, as well as a practitioner’s insights into domain-specific prompt engineering. He will conclude with his thoughts on the future of LLMs in computer vision.
This event is open to all and is especially relevant for researchers and practitioners interested in computer vision, generative AI, LLMs, and machine learning. RSVP now for an enlightening session!
We are looking for passionate people willing to cultivate and inspire the next generation of leaders in tech, business, and data science. If you are one of them, get in touch with us!