<?xml version="1.0"?>
<News hasArchived="false" page="1" pageCount="1" pageSize="10" timestamp="Mon, 20 Apr 2026 11:48:20 -0400" url="https://dev.my.umbc.edu/groups/csee/posts.xml?tag=vision">
  <NewsItem contentIssues="true" id="141414" important="false" status="posted" url="https://dev.my.umbc.edu/groups/csee/posts/141414">
  <Title>Talk: Rigorous measurement in text-to-image systems, 4/29</Title>
  <Tagline>4-5pm ET Monday, April 29 in ENGR 231 &amp; Webex</Tagline>
  <Body>
    <![CDATA[
    <div class="html-content"><div><h4><strong>Rigorous measurement in text-to-image systems (and AI more broadly?)</strong></h4><div><br></div><div><a href="https://saxon.me/" rel="nofollow external" class="bo"><strong>Michael Saxon</strong></a></div><div><strong>University of California, Santa Barbara</strong></div><div><strong><br></strong></div><div><strong>April 29, 2024 4:00 – 5:15 PM ET</strong></div><div><strong>ENGR 231 and <a href="https://umbc.webex.com/meet/gokhale" rel="nofollow external" class="bo">Webex</a></strong></div><div><br></div><div>As the large pretrained models underlying generative AI systems have grown larger, more inscrutable, and more widely deployed, interest in understanding their nature as emergent rather than engineered systems has grown. I believe that to move this "ersatz natural science" of AI forward, we need to focus on building rigorous observational tools for these systems that can characterize their capabilities unambiguously. At their best, benchmarks and metrics could meet this need, but at present they are often treated as mere leaderboards to chase, and they only very indirectly measure the capabilities of interest. This talk covers three works on this topic: first, a work laying out the high-level case for building a subfield of "model metrology" focused on constructing better benchmarks and metrics. It then covers two works on metrology in the generative image domain: one that assesses multilingual conceptual knowledge in <a href="https://en.wikipedia.org/wiki/Text-to-image_model" rel="nofollow external" class="bo"><strong>text-to-image</strong></a> (T2I) systems, and a meta-benchmark demonstrating that many T2I prompt-faithfulness benchmarks fail to capture the compositionality characteristics of T2I systems that they purport to measure.
    This line of inquiry is intended to help move benchmarking toward the ideal of rigorous tools of scientific observation.</div><div><br></div><div><strong><a href="https://saxon.me/" rel="nofollow external" class="bo">Michael Saxon</a></strong> is a PhD candidate and NSF Fellow in the NLP Group at the University of California, Santa Barbara. His research sits at the intersection of generative model benchmarking, multimodality, and AI ethics. He is particularly interested in making meaningful evaluations of hard-to-measure new capabilities in these artifacts. Michael earned his BS in Electrical Engineering and MS in Computer Engineering at Arizona State University, advised by Visar Berisha and Sethuraman Panchanathan, in 2018 and 2020 respectively.</div></div></div>
]]>
  </Body>
  <Summary>Rigorous measurement in text-to-image systems (and AI more broadly?)     Michael Saxon  University of California, Santa Barbara     April 29, 2024 4:00 – 5:15 PM ET  ENGR 231 and Webex     As...</Summary>
  <Website>https://www.tejasgokhale.com/seminar.html</Website>
  <TrackingUrl>https://dev.my.umbc.edu/api/v0/pixel/news/141414/guest@my.umbc.edu/865d0e818fb35debfc44e3d3e0813c60/api/pixel</TrackingUrl>
  <Tag>ai</Tag>
  <Tag>image</Tag>
  <Tag>llm</Tag>
  <Tag>text</Tag>
  <Tag>text-to-image</Tag>
  <Tag>vision</Tag>
  <Group token="csee">Computer Science and Electrical Engineering</Group>
  <GroupUrl>https://dev.my.umbc.edu/groups/csee</GroupUrl>
  <AvatarUrl>https://assets3-dev.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/xsmall.png?1314043393</AvatarUrl>
  <AvatarUrl size="original">https://assets1-dev.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/original.png?1314043393</AvatarUrl>
  <AvatarUrl size="xxlarge">https://assets1-dev.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/xxlarge.png?1314043393</AvatarUrl>
  <AvatarUrl size="xlarge">https://assets4-dev.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/xlarge.png?1314043393</AvatarUrl>
  <AvatarUrl size="large">https://assets3-dev.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/large.png?1314043393</AvatarUrl>
  <AvatarUrl size="medium">https://assets1-dev.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/medium.png?1314043393</AvatarUrl>
  <AvatarUrl size="small">https://assets2-dev.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/small.png?1314043393</AvatarUrl>
  <AvatarUrl size="xsmall">https://assets3-dev.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/xsmall.png?1314043393</AvatarUrl>
  <AvatarUrl size="xxsmall">https://assets3-dev.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/xxsmall.png?1314043393</AvatarUrl>
  <Sponsor>Computer Science and Electrical Engineering</Sponsor>
  <ThumbnailUrl size="xxlarge">https://assets4-dev.my.umbc.edu/system/shared/thumbnails/news/000/141/414/2e89f5ef5c07dcb98fe19a39c915f9ec/xxlarge.jpg?1714225177</ThumbnailUrl>
  <ThumbnailUrl size="xlarge">https://assets3-dev.my.umbc.edu/system/shared/thumbnails/news/000/141/414/2e89f5ef5c07dcb98fe19a39c915f9ec/xlarge.jpg?1714225177</ThumbnailUrl>
  <ThumbnailUrl size="large">https://assets1-dev.my.umbc.edu/system/shared/thumbnails/news/000/141/414/2e89f5ef5c07dcb98fe19a39c915f9ec/large.jpg?1714225177</ThumbnailUrl>
  <ThumbnailUrl size="medium">https://assets4-dev.my.umbc.edu/system/shared/thumbnails/news/000/141/414/2e89f5ef5c07dcb98fe19a39c915f9ec/medium.jpg?1714225177</ThumbnailUrl>
  <ThumbnailUrl size="small">https://assets1-dev.my.umbc.edu/system/shared/thumbnails/news/000/141/414/2e89f5ef5c07dcb98fe19a39c915f9ec/small.jpg?1714225177</ThumbnailUrl>
  <ThumbnailUrl size="xsmall">https://assets1-dev.my.umbc.edu/system/shared/thumbnails/news/000/141/414/2e89f5ef5c07dcb98fe19a39c915f9ec/xsmall.jpg?1714225177</ThumbnailUrl>
  <ThumbnailUrl size="xxsmall">https://assets1-dev.my.umbc.edu/system/shared/thumbnails/news/000/141/414/2e89f5ef5c07dcb98fe19a39c915f9ec/xxsmall.jpg?1714225177</ThumbnailUrl>
  <PawCount>0</PawCount>
  <CommentCount>0</CommentCount>
  <CommentsAllowed>true</CommentsAllowed>
  <PostedAt>Sat, 27 Apr 2024 09:43:46 -0400</PostedAt>
</NewsItem>
  <NewsItem contentIssues="true" id="140947" important="false" status="posted" url="https://dev.my.umbc.edu/groups/csee/posts/140947">
  <Title>Talk: Learning to Synthesize Images, 4-5:15pm ET, Wed. 4/17</Title>
  <Tagline>Advances in Perception, Prediction &amp; Reasoning seminar</Tagline>
  <Body>
    <![CDATA[
    <div class="html-content"><h4><span><strong>Learning to Synthesize Images </strong></span><span><strong>with Multimodal and Hierarchical </strong></span><span><strong>Inputs</strong></span></h4><h4><strong><a href="https://zharry29.github.io/" rel="nofollow external" class="bo">Yu Zeng</a>, JHU </strong></h4><p><strong>April 17, 2024 4:00 – 5:15 PM</strong></p><p><span><strong>ENGR 231, UMBC or <a href="https://umbc.webex.com/meet/gokhale" rel="nofollow external" class="bo">Webex</a></strong></span></p><div><br></div><br><p><span>In recent years, image synthesis and manipulation have experienced remarkable advances driven by deep learning algorithms and web-scale data, yet there persists a notable disconnect between the intricate nature of human ideas and the simplistic input structures employed by existing models. In this talk, I will present our research toward a more natural form of controllable image synthesis, inspired by the coarse-to-fine workflow of human artists and the inherently multimodal character of human thought. We consider inputs of the semantic and visual modalities at varying levels of hierarchy. For the semantic modality, we introduce a general framework for modeling semantic inputs at different levels, which includes image-level text prompts and pixel-level label maps as two extremes and adds a series of mid-level regional descriptions of varying precision. For the visual modality, we explore the use of low-level and high-level visual inputs, aligning with the natural hierarchy of visual processing. Additionally, as the misuse of generated images becomes a societal threat, in the second part of this talk I will present our findings on the trustworthiness of deep generative models and discuss potential future research directions.</span></p><br><p><span><strong><a href="https://zharry29.github.io/" rel="nofollow external" class="bo">Yu Zeng</a></strong> is a Ph.D. candidate at Johns Hopkins University advised by Vishal M. Patel. 
    Her research interests lie in computer vision and deep learning. She has focused on two main areas: (1) deep generative models for image synthesis and editing and (2) label-efficient deep learning. By combining these research areas, she aims to bridge human creativity and machine intelligence through user-friendly and socially responsible models while minimizing the need for intensive human supervision. Yu has collaborated with researchers at NVIDIA and Adobe through internships. Prior to her Ph.D., she worked as a researcher at Tencent Games. Yu’s research has been recognized by the KAUST Rising Stars in AI program, and her Ph.D. study has been supported by a JHU Kewei Yang and Grace Xin Fellowship.</span></p></div>
]]>
  </Body>
  <Summary>Learning to Synthesize Images with Multimodal and Hierarchical Inputs  Yu Zeng, JHU   April 17, 2024 4:00 – 5:15 PM  ENGR 231, UMBC or Webex      In recent years, image synthesis and manipulation...</Summary>
  <TrackingUrl>https://dev.my.umbc.edu/api/v0/pixel/news/140947/guest@my.umbc.edu/d2902ab5728446da5f501daf9d79bb35/api/pixel</TrackingUrl>
  <Tag>ai</Tag>
  <Tag>images</Tag>
  <Tag>multimodal</Tag>
  <Tag>ppr</Tag>
  <Tag>vision</Tag>
  <Group token="csee">Computer Science and Electrical Engineering</Group>
  <GroupUrl>https://dev.my.umbc.edu/groups/csee</GroupUrl>
  <AvatarUrl>https://assets3-dev.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/xsmall.png?1314043393</AvatarUrl>
  <AvatarUrl size="original">https://assets1-dev.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/original.png?1314043393</AvatarUrl>
  <AvatarUrl size="xxlarge">https://assets1-dev.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/xxlarge.png?1314043393</AvatarUrl>
  <AvatarUrl size="xlarge">https://assets4-dev.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/xlarge.png?1314043393</AvatarUrl>
  <AvatarUrl size="large">https://assets3-dev.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/large.png?1314043393</AvatarUrl>
  <AvatarUrl size="medium">https://assets1-dev.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/medium.png?1314043393</AvatarUrl>
  <AvatarUrl size="small">https://assets2-dev.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/small.png?1314043393</AvatarUrl>
  <AvatarUrl size="xsmall">https://assets3-dev.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/xsmall.png?1314043393</AvatarUrl>
  <AvatarUrl size="xxsmall">https://assets3-dev.my.umbc.edu/system/shared/avatars/groups/000/000/099/d117dca133c64bf78a4b7696dd007189/xxsmall.png?1314043393</AvatarUrl>
  <Sponsor>Computer Science and Electrical Engineering</Sponsor>
  <ThumbnailUrl size="xxlarge">https://assets4-dev.my.umbc.edu/system/shared/thumbnails/news/000/140/947/80114b749f795158bf424e6568f1d2c0/xxlarge.jpg?1713206571</ThumbnailUrl>
  <ThumbnailUrl size="xlarge">https://assets1-dev.my.umbc.edu/system/shared/thumbnails/news/000/140/947/80114b749f795158bf424e6568f1d2c0/xlarge.jpg?1713206571</ThumbnailUrl>
  <ThumbnailUrl size="large">https://assets1-dev.my.umbc.edu/system/shared/thumbnails/news/000/140/947/80114b749f795158bf424e6568f1d2c0/large.jpg?1713206571</ThumbnailUrl>
  <ThumbnailUrl size="medium">https://assets1-dev.my.umbc.edu/system/shared/thumbnails/news/000/140/947/80114b749f795158bf424e6568f1d2c0/medium.jpg?1713206571</ThumbnailUrl>
  <ThumbnailUrl size="small">https://assets2-dev.my.umbc.edu/system/shared/thumbnails/news/000/140/947/80114b749f795158bf424e6568f1d2c0/small.jpg?1713206571</ThumbnailUrl>
  <ThumbnailUrl size="xsmall">https://assets1-dev.my.umbc.edu/system/shared/thumbnails/news/000/140/947/80114b749f795158bf424e6568f1d2c0/xsmall.jpg?1713206571</ThumbnailUrl>
  <ThumbnailUrl size="xxsmall">https://assets4-dev.my.umbc.edu/system/shared/thumbnails/news/000/140/947/80114b749f795158bf424e6568f1d2c0/xxsmall.jpg?1713206571</ThumbnailUrl>
  <PawCount>0</PawCount>
  <CommentCount>0</CommentCount>
  <CommentsAllowed>true</CommentsAllowed>
  <PostedAt>Mon, 15 Apr 2024 14:46:13 -0400</PostedAt>
</NewsItem>
</News>
