Jeff Cheng

photo.jpg
contact.txt
Jeffrey Cheng
PhD @ Princeton PLI



Contact me via X or email:
jc93 at princeton dot edu

About Me

about.txt

I am a first-year PhD student at Princeton Language and Intelligence (PLI), advised by Danqi Chen. My research interests lie broadly at the intersection of natural language processing and machine learning. I am currently interested in language models and agents; in particular, I aim to study the downstream effects of pretraining data and to develop methods that improve the capabilities and efficiency of reasoning models.


Below are a few questions I am interested in:

pretraining.txt
Data:
  • How does pretraining data influence language models as sources of knowledge? [Dated Data]
  • Can we attribute content generated by models back to their pretraining corpus?
  • How do we best correct misalignments arising from knowledge conflicts in models' pretraining data?
reasoning.txt
Reasoning:
  • Can we make reasoning models more efficient by shifting away from a discrete token space and performing reasoning in a continuous latent space? [Compressed Chain of Thought] (A toy sketch follows this list.)
  • How much better would reasoning models be if trained with process rewards rather than just outcome rewards?
  • How can we construct environments with verifiable rewards and/or induce structure into reasoning chains to make models more capable and efficient?
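latent_sketch.py
For intuition on the latent-space question above, here is a minimal PyTorch toy (an illustrative sketch under assumed toy components, not the method from [Compressed Chain of Thought]): ordinary decoding quantizes every step into a discrete token before re-embedding it, while a latent-space variant feeds the hidden state straight back as the next input.

import torch
import torch.nn as nn

# Hypothetical toy, not the CCoT method: contrast discrete-token decoding
# with feeding hidden states back directly as continuous "thoughts".
class TinyLM(nn.Module):
    def __init__(self, vocab=100, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.cell = nn.GRUCell(dim, dim)
        self.head = nn.Linear(dim, vocab)

    def step(self, x, h):
        h = self.cell(x, h)                  # one reasoning step
        return self.head(h), h

model = TinyLM()

# Discrete reasoning: every step passes through an argmax bottleneck.
h = torch.zeros(1, 32)
x = model.embed(torch.tensor([0]))
for _ in range(5):
    logits, h = model.step(x, h)
    x = model.embed(logits.argmax(-1))       # collapse to one token, re-embed

# Latent reasoning: skip the token bottleneck entirely.
h = torch.zeros(1, 32)
x = model.embed(torch.tensor([0]))
for _ in range(5):
    _, h = model.step(x, h)
    x = h                                    # hidden state fed back as input

The only point of the toy: the argmax discards everything in the hidden state except a single vocabulary index, whereas the latent loop carries the full vector forward between steps.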
previous.txt

Previously, I received my Master's at Johns Hopkins University, advised by Benjamin Van Durme. Prior to NLP, my interests were in mathematics and fluid dynamics. I conducted research in these areas during my undergraduate studies at Duke University, advised by Tarek Elgindi.

misc.txt
Outside of research, I am an avid climber and chess player. I am also starting to run again after a long break.
News

Aug 2025
Started my PhD at Princeton, generously supported by the Francis Upton Fellowship.
Dec 2024
New preprint, [Compressed Chain of Thought], released!
Oct 2024
Attended CoLM 2024 and presented [Dated Data], which won an Outstanding Paper Award (top 0.4%)!