Papers
Preprints
-
Elements of World Knowledge (EWoK): A cognition-inspired framework for evaluating basic world knowledge in language models.
Anna A. Ivanova*, Aalok Sathe*, Benjamin Lipkin*, Unnathi Kumar, Setayesh Radkani, Thomas H. Clark, Carina Kauf, Jennifer Hu, R.T. Pramod, Gabriel Grand, Vivian Paulun, Maria Ryskina, Ekin Akyürek, Ethan Wilcox, Nafisa Rashid, Leshem Choshen, Roger Levy, Evelina Fedorenko, Joshua Tenenbaum and Jacob Andreas.
-
Language modeling with editable external knowledge.
Belinda Z. Li, Emmy Liu, Alexis Ross, Abbas Zeitoun, Graham Neubig and Jacob Andreas.
-
Policy learning with a language bottleneck.
Megha Srivastava, Cédric Colas, Dorsa Sadigh and Jacob Andreas.
-
Bayesian preference elicitation with language models.
Kunal Handa, Yarin Gal, Ellie Pavlick, Noah Goodman, Jacob Andreas, Alex Tamkin and Belinda Z. Li.
-
Eliciting human preferences with language models.
Belinda Z. Li, Alex Tamkin, Noah Goodman and Jacob Andreas.
Press: VentureBeat
2024
-
Inspecting and editing knowledge representations in language models.
Evan Hernandez, Belinda Z. Li and Jacob Andreas.
COLM 2024.
-
An incomplete loop: Deductive, inductive and abductive learning in large language models.
Emmy Liu, Graham Neubig and Jacob Andreas.
COLM 2024.
-
Unforgettable generalization in language models.
Eric Zhang, Leshem Choshen and Jacob Andreas.
COLM 2024.
-
Toward in-context teaching: Adapting examples to students’ misconceptions.
Alexis Ross and Jacob Andreas.
ACL 2024.
-
Lexicon-level contrastive visual grounding improves language modeling.
Chengxu Zhuang, Ev Fedorenko and Jacob Andreas.
ACL Findings 2024.
-
Deductive closure training of language models for coherence, accuracy and updatability.
Afra Feyza Akyürek, Ekin Akyürek, Leshem Choshen, Derry Wijaya, Jacob Andreas.
ACL Findings 2024.
-
In-context language learning: Architectures and algorithms.
Ekin Akyürek, Bailin Wang, Yoon Kim and Jacob Andreas.
ICML 2024.
-
A multimodal automated interpretability agent.
Tamar Rott Shaham, Sarah Schwettmann, Franklin Wang, Achyuta Rajaram, Evan Hernandez, Jacob Andreas, Antonio Torralba.
ICML 2024.
-
Decomposing uncertainty for large language models through input clarification ensembling.
Bairu Hou, Yujian Liu, Kaizhi Qian, Jacob Andreas, Shiyu Chang and Yang Zhang.
ICML 2024 Oral.
-
Learning phonotactics from linguistic informants.
Canaan Breiss*, Alexis Ross*, Amani Maina-Kilaas, Roger P. Levy, Jacob Andreas.
SCiL 2024.
-
Contextual and combinatorial structure in sperm whale vocalizations.
Pratyusha Sharma, Shane Gero, Roger Payne, David F. Gruber, Daniela Rus*, Antonio Torralba*, Jacob Andreas*.
Nature Communications 2024.
Press: New York Times (The Daily podcast), Washington Post
-
Regularized conventions: Equilibrium computation as a model of pragmatic reasoning.
Athul Paul Jacob, Gabriele Farina and Jacob Andreas.
NAACL 2024, SCiL 2024.
-
Visual grounding helps learn word meanings in low-data regimes.
Chengxu Zhuang, Evelina Fedorenko and Jacob Andreas.
NAACL 2024 Best Paper.
-
Reasoning or reciting? Exploring the capabilities and limitations of language models through counterfactual tasks.
Zhaofeng Wu, Linlu Qiu, Alexis Ross, Ekin Akyürek, Boyuan Chen, Bailin Wang, Najoung Kim, Jacob Andreas and Yoon Kim.
NAACL 2024.
-
LaMPP: Language models as probabilistic priors for perception and action.
Belinda Z. Li, William Chen, Pratyusha Sharma and Jacob Andreas.
ICLR 2024 Workshop on Generative AI for Decision-Making.
-
The consensus game: Language model generation via equilibrium search.
Athul Paul Jacob, Yikang Shen, Gabriele Farina and Jacob Andreas.
ICLR 2024 Spotlight, NeurIPS R0-FoMo Workshop Best Paper.
Press: Quanta
-
Linearity of relation decoding in transformer language models.
Evan Hernandez*, Arnab Sen Sharma*, Tal Haklay, Kevin Meng, Martin Wattenberg, Jacob Andreas, Yonatan Belinkov and David Bau.
ICLR 2024 Spotlight.
Press: Hacker News, MIT News
-
Learning with language-guided state abstractions.
Andi Peng, Ilia Sucholutsky, Belinda Z. Li, Theodore Sumers, Thomas L. Griffiths, Jacob Andreas and Julie Shah.
ICLR 2024.
Press: MIT News
-
Learning adaptive planning representations with natural language guidance.
Lionel Wong, Jiayuan Mao, Pratyusha Sharma, Zachary S. Siegel, Jiahai Feng, Noa Korneev, Joshua B. Tenenbaum and Jacob Andreas.
ICLR 2024.
Press: MIT News
-
Modeling boundedly rational agents with latent inference budgets.
Athul Paul Jacob, Abhishek Gupta and Jacob Andreas.
ICLR 2024.
Press: MIT News
-
LILO: Learning interpretable libraries by compressing and documenting code.
Gabriel Grand, Lionel Wong, Maddy Bowers, Theo X. Olausson, Muxin Liu, Joshua B. Tenenbaum and Jacob Andreas.
ICLR 2024.
Press: MIT News
2023
-
Alignment via mutual information.
Shinjini Ghosh, Yoon Kim, Ramon Fernandez Astudillo, Tahira Naseem and Jacob Andreas.
CoNLL 2023.
-
Cognitive dissonance: Why do language model outputs disagree with internal representations of truthfulness?
Kevin Liu, Stephen Casper, Dylan Hadfield-Menell and Jacob Andreas.
EMNLP 2023.
-
Pushdown layers: Encoding recursive structure in transformer language models.
Shikhar Murty, Pratyusha Sharma, Jacob Andreas and Christopher D. Manning.
EMNLP 2023.
-
AutoReply: Detecting nonsense in dialogue introspectively with discriminative replies.
Weiyan Shi, Emily Dinan, Adi Renduchintala, Daniel Fried, Athul Paul Jacob, Zhou Yu, Mike Lewis.
EMNLP Findings 2023.
-
Pseudointelligence: A unifying lens on language model evaluation.
Shikhar Murty*, Orr Paradise*, Pratyusha Sharma*.
EMNLP Findings 2023.
-
The clock and the pizza: Two stories in mechanistic explanation of neural networks.
Ziming Liu*, Ziqian Zhong*, Max Tegmark and Jacob Andreas.
NeurIPS 2023 Oral.
-
A function interpretation benchmark for evaluating interpretability methods.
Sarah Schwettmann, Tamar Rott Shaham, Joanna Materzynska, Neil Chowdhury, Shuang Li, Jacob Andreas, David Bau, Antonio Torralba.
NeurIPS Datasets & Benchmarks 2023.
Press: MIT News
-
Lexical-semantic content, not syntactic structure, is the main contributor to ANN-brain similarity of fMRI responses in the language network.
Carina Kauf, Greta Tuckute, Roger Levy, Jacob Andreas and Evelina Fedorenko.
Neurobiology of Language 2023.
Press: The Atlantic
-
Compositionality as lexical symmetry.
Ekin Akyürek and Jacob Andreas.
ACL 2023 Lexical Semantics Area Award.
-
Grokking of hierarchical structure in vanilla transformers.
Shikhar Murty, Pratyusha Sharma, Jacob Andreas and Chris Manning.
ACL 2023.
-
Language modeling with latent situations.
Belinda Z. Li, Max Nye and Jacob Andreas.
ACL Findings 2023.
-
Guiding pretraining in reinforcement learning with large language models.
Yuqing Du*, Olivia Watkins*, Zihan Wang, Cédric Colas, Trevor Darrell, Pieter Abbeel, Abhishek Gupta and Jacob Andreas.
ICML 2023.
-
PromptBoosting: Text classification with language models in ten forward passes.
Bairu Hou, Joe O’Connor, Jacob Andreas, Yang Zhang and Shiyu Chang.
ICML 2023.
-
What learning algorithm is in-context learning? Investigations with linear models.
Ekin Akyürek, Dale Schuurmans, Jacob Andreas*, Tengyu Ma*, Denny Zhou*.
ICLR 2023 Notable Top-5% Paper.
Press: Motherboard, MIT News
-
Characterizing intrinsic compositionality in transformers with tree projections.
Shikhar Murty, Pratyusha Sharma, Jacob Andreas and Christopher Manning.
ICLR 2023.
-
Compositional semantic parsing with large language models.
Andrew Drozdov, Nathanael Schärli, Ekin Akyürek, Nathan Scales, Xinying Song, Xinyun Chen, Olivier Bousquet, Denny Zhou.
ICLR 2023.
-
Mastering the game of no-press Diplomacy via human-regularized reinforcement learning and planning.
Anton Bakhtin*, David J Wu*, Adam Lerer*, Jonathan Gray*, Athul Paul Jacob*, Gabriele Farina*, Alexander H Miller, Noam Brown.
ICLR 2023 Notable Top-5% Paper.
Press: New Scientist
-
Top-down synthesis for library learning.
Matthew Bowers, Theo X. Olausson, Lionel Wong, Gabriel Grand, Joshua B. Tenenbaum, Kevin Ellis, Armando Solar-Lezama.
POPL 2023.
2022
-
Pre-trained language models for interactive decision-making.
Shuang Li, Xavier Puig, Chris Paxton, Yilun Du, Clinton Wang, Linxi Fan, Tao Chen, De-An Huang, Ekin Akyürek, Anima Anandkumar*, Jacob Andreas*, Igor Mordatch*, Antonio Torralba*, Yuke Zhu*.
NeurIPS 2022 Oral.
-
Human-level play in the game of Diplomacy by combining language models with strategic reasoning.
FAIR, Anton Bakhtin*, Noam Brown*, Emily Dinan*, Gabriele Farina, Colin Flaherty*, Daniel Fried, Andrew Goff, Jonathan Gray*, Hengyuan Hu*, Athul Paul Jacob*, Mojtaba Komeili, Karthik Konath, Minae Kwon, Adam Lerer*, Mike Lewis*, Alexander H. Miller*, Sasha Mitts, Adithya Renduchintala*, Stephen Roller, Dirk Rowe, Weiyan Shi*, Joe Spisak, Alexander Wei, David Wu*, Hugh Zhang*, Markus Zijlstra.
Science 2022.
Press: The Economist, Forbes, The Washington Post, The New York Times, The New Yorker, VentureBeat
-
Language models as agent models.
Jacob Andreas.
EMNLP Findings 2022.
-
Toward tracing factual knowledge in language models back to the training data.
Ekin Akyürek, Tolga Bolukbasi, Frederick Liu, Binbin Xiong, Ian Tenney, Jacob Andreas, Kelvin Guu.
EMNLP Findings 2022.
-
Hierarchical phrase-based sequence-to-sequence learning.
Bailin Wang, Ivan Titov, Jacob Andreas and Yoon Kim.
EMNLP 2022.
-
Modeling strong and human-like gameplay with KL-regularized search.
Athul Paul Jacob*, David J. Wu*, Gabriele Farina*, Adam Lerer, Hengyuan Hu, Anton Bakhtin, Jacob Andreas, Noam Brown.
ICML 2022 Spotlight.
-
Toward understanding the communication in sperm whales.
Jacob Andreas, Gašper Beguš, Michael M. Bronstein, Roee Diamant, Denley Delaney, Shane Gero, Shafi Goldwasser, David F. Gruber, Sarah de Haas, Peter Malkin, Nikolay Pavlov, Roger Payne, Giovanni Petri, Daniela Rus, Pratyusha Sharma, Dan Tchernov, Pernille Tønnesen, Antonio Torralba, Daniel Vogt, Robert J. Wood.
iScience 2022.
Press: The New Yorker
-
Identifying concept libraries from language about object structure.
Lionel Wong*, William P. McCarthy*, Gabriel Grand*, Yoni Friedman, Joshua B. Tenenbaum, Jacob Andreas, Robert D. Hawkins, Judith E. Fan.
CogSci 2022.
-
Correcting robot plans with natural language feedback.
Pratyusha Sharma, Balakumar Sundaralingam, Valts Blukis, Chris Paxton, Tucker Hermans, Antonio Torralba, Jacob Andreas, Dieter Fox.
RSS 2022.
-
Quantifying adaptability in pre-trained language models with 500 tasks.
Belinda Z. Li, Jane Yu, Madian Khabsa, Luke Zettlemoyer, Alon Halevy, Jacob Andreas.
NAACL 2022.
-
Skill induction and planning with latent language.
Pratyusha Sharma, Antonio Torralba and Jacob Andreas.
ACL 2022.
-
Natural language descriptions of deep visual features.
Evan Hernandez, Sarah Schwettmann, David Bau, Teona Bagashvili, Antonio Torralba and Jacob Andreas.
ICLR 2022 Oral.
Press: MIT News
-
Subspace regularizers for few-shot class incremental learning.
Afra Feyza Akyürek, Ekin Akyürek, Derry Wijaya and Jacob Andreas.
ICLR 2022.
2021
-
Teachable reinforcement learning via advice distillation.
Olivia Watkins, Trevor Darrell, Pieter Abbeel, Jacob Andreas and Abhishek Gupta.
NeurIPS 2021.
-
How do neural sequence models generalize? Local and global context cues for out-of-distribution prediction.
Anthony Bau and Jacob Andreas.
EMNLP 2021.
-
Toward a visual concept vocabulary for generative adversarial networks.
Sarah Schwettmann, Evan Hernandez, David Bau, Samuel Klein, Jacob Andreas, Antonio Torralba.
ICCV 2021.
-
The low-dimensional linear geometry of contextualized word representations.
Evan Hernandez and Jacob Andreas.
CoNLL 2021.
-
Leveraging language to learn program abstractions and search heuristics.
Lionel Wong, Kevin Ellis, Joshua B. Tenenbaum and Jacob Andreas.
ICML 2021.
-
Implicit representations of meaning in neural language models.
Belinda Z. Li, Maxwell Nye and Jacob Andreas.
ACL 2021.
Press: Scientific American
-
What context features can transformer language models use?
Joe O’Connor and Jacob Andreas.
ACL 2021.
-
Lexicon learning for few-shot sequence modeling.
Ekin Akyürek and Jacob Andreas.
ACL 2021.
-
Multitasking inhibits semantic drift.
Athul Paul Jacob, Mike Lewis and Jacob Andreas.
NAACL 2021.
-
Representing partial programs with blended abstract semantics.
Maxwell Nye, Yewen Pu, Matthew Bowers, Jacob Andreas, Joshua B. Tenenbaum, Armando Solar-Lezama.
ICLR 2021.
-
Learning to recombine and resample data for compositional generalization.
Ekin Akyürek, Afra Feyza Akyürek and Jacob Andreas.
ICLR 2021.
2020
-
Compositional explanations of neurons.
Jesse Mu and Jacob Andreas.
NeurIPS 2020 Oral.
-
A benchmark for systematic generalization in grounded language understanding.
Laura Ruis, Jacob Andreas, Marco Baroni, Diane Bouchacourt and Brenden Lake.
NeurIPS 2020.
-
Good-enough compositional data augmentation.
Jacob Andreas.
ACL 2020.
-
Unnatural language processing: bridging the gap between synthetic and natural language data.
Alana Marzoev, Sam Madden, Frans Kaashoek, Mike Cafarella and Jacob Andreas.
NeurIPS Workshop on Emergent Communication.
2019
-
A survey of reinforcement learning informed by natural language.
Jelena Luketina, Nantas Nardelli, Gregory Farquhar, Jakob Foerster, Jacob Andreas, Edward Grefenstette, Shimon Whiteson and Tim Rocktäschel.
IJCAI 2019.
-
Measuring compositionality in representation learning.
Jacob Andreas.
ICLR 2019.