Archon: An Architecture Search Framework for Inference-Time Techniques

Jon Saad-Falcon
Adrian Gamarra Lafuente
Shlok Natarajan
Nahum Maru
Hristo Todorov
Etash Guha
E. Kelly Buchanan
Mayee Chen
Neel Guha
Christopher Ré
Azalia Mirhoseini
Submitted on 10 Dec 2024 (preprint)

Abstract

We introduce Archon, a modular framework for optimizing large language model (LLM) systems through automated architecture search of inference-time techniques. While inference-time methods have shown great promise for enhancing LLM capabilities, developing effective systems that combine these techniques remains challenging due to limited understanding of their individual utility and interactions. Archon addresses this by providing an extensible design space for selecting, combining, and stacking inference-time techniques like generation ensembling, repeated sampling, ranking, fusion, critiquing, verification, and unit testing.
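To make the idea of "selecting, combining, and stacking" techniques concrete, here is a minimal Python sketch of a layered pipeline. It is an illustration only, not Archon's actual API: the names `Layer`, `generate`, `rank`, `fuse`, and `run_pipeline` are hypothetical, and the "models", scorer, and fuser are toy string functions standing in for LLM calls.

```python
from typing import Callable, List

# Hypothetical type: a layer maps (prompt, current candidates) to new candidates.
Layer = Callable[[str, List[str]], List[str]]

def generate(models: List[Callable[[str], str]]) -> Layer:
    """Generation ensembling: one candidate per model (ignores prior candidates)."""
    def layer(prompt: str, candidates: List[str]) -> List[str]:
        return [model(prompt) for model in models]
    return layer

def rank(score: Callable[[str, str], float], top_k: int) -> Layer:
    """Ranking: keep the top_k candidates with the highest score."""
    def layer(prompt: str, candidates: List[str]) -> List[str]:
        ordered = sorted(candidates, key=lambda c: score(prompt, c), reverse=True)
        return ordered[:top_k]
    return layer

def fuse(fuser: Callable[[str, List[str]], str]) -> Layer:
    """Fusion: merge the surviving candidates into a single response."""
    def layer(prompt: str, candidates: List[str]) -> List[str]:
        return [fuser(prompt, candidates)]
    return layer

def run_pipeline(layers: List[Layer], prompt: str) -> List[str]:
    """Apply the layers in order, passing each layer's output to the next."""
    candidates: List[str] = []
    for layer in layers:
        candidates = layer(prompt, candidates)
    return candidates

# Toy stand-ins for LLM calls (assumptions for illustration only).
toy_models = [
    lambda p: p.upper(),   # "model" 1
    lambda p: p + "!!",    # "model" 2
    lambda p: p + "?",     # "model" 3
]
toy_score = lambda prompt, cand: float(len(cand))   # longer candidates score higher
toy_fuser = lambda prompt, cands: " | ".join(cands)  # concatenate the survivors

layers = [generate(toy_models), rank(toy_score, top_k=2), fuse(toy_fuser)]
result = run_pipeline(layers, "hi")
print(result)
```

Because every layer shares the same signature, reordering or swapping layers changes the "architecture" without touching the runner, which is the kind of design space an automated search could explore under these assumptions.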
