About this Event
2317 Speedway, Austin, Texas 78712
https://ifml.institute/events/long-context-foundational-models #TexasAI
Srinadh Bhojanapalli is a Research Scientist at Google Research, New York. His research focuses on developing a principled understanding of Transformer models and scaling them efficiently. Prior to joining Google, Srinadh served as a Research Assistant Professor at TTI Chicago. He holds a Ph.D. in Electrical and Computer Engineering from The University of Texas at Austin, where he was mentored by Prof. Sujay Sanghavi.
Talk Abstract:
Foundational large language models, while successful at shorter contexts, struggle to scale to longer-context inputs. Preventing the performance decay of Transformers when input lengths exceed those used during training has been a significant challenge in extending their capabilities. Though the Transformer architecture itself has no inherent limit on input sequence length, current training methods constrain its performance on longer inputs. In this talk, we will present new results in scaling Transformer input length and discuss some open challenges in making attention approaches efficient for longer contexts.