Abstracts: NeurIPS 2024 with Weizhu Chen

WEIZHU CHEN: Thank you for having me, Amber. 

TINGLE: So let’s start with a brief overview of your paper. In a couple sentences, tell us about the problem your research addresses and, more importantly, why the research community and beyond should know about this work. 

CHEN: So my team, basically in Microsoft GenAI, we are working on model training. In pretraining, one of the things we realized is the importance of the data. And we found that when we look at this data token by token, some tokens are more important than others. That’s one. The other one is that some tokens are very, very hard to predict during pretraining. So, for example, just like
