AI Software Engineering Agents and Program Repair

Building autonomous agents and program repair systems that understand, debug, and modify real-world codebases.

Overview

Software engineering is increasingly complex, with modern codebases spanning millions of lines of code across interconnected systems. Our research develops AI agents that can autonomously navigate these codebases, understand program semantics, and perform meaningful engineering tasks such as bug repair, code editing, and crash resolution.

We focus on agents that work on real-world software rather than toy benchmarks, tackling challenges such as Linux kernel crash resolution and large-scale repository maintenance.

Key Directions

Crash Resolution: End-to-end agents that diagnose and patch real kernel crashes, combining LLM reasoning with execution feedback loops.
Code Editing & Repair: Systems that learn transformation rules and apply context-aware refinements to generate correct patches.
Agent Evaluation: Empirical studies of how SE agents work in practice, including traceability analysis and benchmarks that measure agent capabilities on realistic tasks.
Repository-Centric Learning: Training specialized models that deeply understand individual repositories rather than relying on general-purpose LLMs.

Impact

Our kGym platform and kAgent represent the first end-to-end LLM-based repair loop for real Linux kernel failures. SWE-Spot demonstrates that small, repository-specialized models can outperform much larger general-purpose LLMs on repository-specific tasks. Our empirical study on APR agent traceability provides the first systematic analysis of how repair agents navigate codebases, informing the design of more transparent and effective agents. EditLord learns reusable code transformation rules that generalize across editing tasks, while REFINE shows that context-aware patch refinement significantly improves repair quality. Together, these efforts have shaped how the community builds, evaluates, and understands AI agents for software engineering.

Contributors

Baishakhi Ray, Simin Chen, Alex Mathai, Chenxi Huang, Hailie Mitchell, Ira Ceka, Jinjun Peng, Magnus Saebo, Tianjun Zhong, Yangruibo Ding, Vikram Nitin, Kexin Pei, Saikat Chakraborty

Selected Publications

kAgent: An execution-guided crash resolution agent for the Linux kernel

Trustworthy AI Software Engineers

Outrunning LLM Cutoffs: A Live Kernel Crash Resolution Benchmark for All

SWE-Spot: Building Small Repo-Experts with Repository-Centric Learning

Understanding APR Agents Through the Lens of Traceability: An Empirical Study

AppForge: From Assistant to Independent Developer--Are GPTs Ready for Software Development?

REFINE: Enhancing Program Repair Agents through Context-Aware Patch Refinement

Red Teaming Program Repair Agents: When Correct Patches can Hide Vulnerabilities

FaultLine: Automated proof-of-vulnerability generation using LLM agents

EditLord: Learning Code Transformation Rules for Code Editing

kGym: A platform and dataset to benchmark large language models on Linux kernel crash resolution

Automated Code Editing with Search-Generate-Modify

TraceFixer: Execution trace-driven program repair