Archon: An Architecture Search Framework for Inference-Time Techniques

Jon Saad-Falcon

Adrian Gamarra Lafuente

Shlok Natarajan

Nahum Maru

Hristo Todorov

Etash Guha

E. Kelly Buchanan

Mayee Chen

Neel Guha

Christopher Ré

Azalia Mirhoseini

Submitted on 10 Dec 2024 (preprint)

Abstract

We introduce Archon, a modular framework for optimizing large language model (LLM) systems through automated architecture search of inference-time techniques. While inference-time methods have shown great promise for enhancing LLM capabilities, developing effective systems that combine these techniques remains challenging due to limited understanding of their individual utility and interactions. Archon addresses this by providing an extensible design space for selecting, combining, and stacking inference-time techniques like generation ensembling, repeated sampling, ranking, fusion, critiquing, verification, and unit testing.
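To make the idea of selecting, combining, and stacking inference-time techniques concrete, the following is a minimal sketch and not Archon's actual API. Every name in it (`Layer`, `ensemble`, `rank_top_k`, `fuse`, `run_pipeline`) is hypothetical, and the "models" are stub functions standing in for LLM calls. It shows only how one fixed stack of layers could be composed; the automated search over such stacks is not illustrated.

```python
from typing import Callable, List

# Hypothetical type: a layer takes the prompt and the current candidate
# responses and returns a new list of candidates.
Layer = Callable[[str, List[str]], List[str]]


def ensemble(generators: List[Callable[[str], str]]) -> Layer:
    """Generation ensembling: one candidate per generator (stub for an LLM)."""
    def layer(prompt: str, candidates: List[str]) -> List[str]:
        return [generate(prompt) for generate in generators]
    return layer


def rank_top_k(score: Callable[[str, str], float], k: int) -> Layer:
    """Ranking: keep the k highest-scoring candidates."""
    def layer(prompt: str, candidates: List[str]) -> List[str]:
        ranked = sorted(candidates, key=lambda c: score(prompt, c), reverse=True)
        return ranked[:k]
    return layer


def fuse(fuser: Callable[[str, List[str]], str]) -> Layer:
    """Fusion: merge the surviving candidates into a single response."""
    def layer(prompt: str, candidates: List[str]) -> List[str]:
        return [fuser(prompt, candidates)]
    return layer


def run_pipeline(prompt: str, layers: List[Layer]) -> List[str]:
    """Apply the layers in order, feeding each layer's output to the next."""
    candidates: List[str] = []
    for layer in layers:
        candidates = layer(prompt, candidates)
    return candidates


# Stub components (assumptions for illustration only).
generators = [lambda p: "a", lambda p: "bbb", lambda p: "cc"]
score = lambda p, c: float(len(c))      # toy scorer: longer is "better"
fuser = lambda p, cs: " + ".join(cs)    # toy fuser: concatenate candidates

result = run_pipeline("question", [ensemble(generators), rank_top_k(score, 2), fuse(fuser)])
print(result)
```

Representing each technique as a function with the same signature is one simple way to make layers interchangeable, so that a search procedure could in principle swap, reorder, or drop them without changing the surrounding code.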

