AI Software Engineering Agents and Program Repair

Building autonomous agents and program repair systems that understand, debug, and modify real-world codebases.

Overview

Software engineering is increasingly complex, with modern codebases spanning millions of lines of code across interconnected systems. Our research develops AI agents that can autonomously navigate these codebases, understand program semantics, and perform meaningful engineering tasks such as bug repair, code editing, and crash resolution.

We focus on building agents that work on real-world software, not toy benchmarks, tackling challenges like Linux kernel crash resolution and large-scale repository maintenance.

Key Directions

  • Crash Resolution: End-to-end agents that diagnose and patch real kernel crashes, combining LLM reasoning with execution feedback loops.
  • Code Editing & Repair: Systems that learn transformation rules and apply context-aware refinements to generate correct patches.
  • Agent Evaluation: Empirical studies of how SE agents work in practice, including traceability analysis and benchmarks that measure agent capabilities on realistic tasks.
  • Repository-Centric Learning: Training specialized models that deeply understand individual repositories rather than relying on general-purpose LLMs.

Impact

Our kGym platform and CrashFixer agent represent the first end-to-end LLM-based repair loop for real Linux kernel failures. SWE-Spot demonstrates that small, repository-specialized models can outperform much larger general-purpose LLMs on repository-specific tasks. Our empirical study on APR agent traceability provides the first systematic analysis of how repair agents navigate codebases, informing the design of more transparent and effective agents. EditLord learns reusable code transformation rules that generalize across editing tasks, while REFINE shows that context-aware patch refinement significantly improves repair quality. Across these efforts, our work has shaped how the community builds, evaluates, and understands AI agents for software engineering.

Contributors

Baishakhi Ray, Simin Chen, Alex Mathai, Chenxi Huang, Hailie Mitchell, Ira Ceka, Jinjun Peng, Magnus Saebo, Tianjun Zhong, Yangruibo Ding, Vikram Nitin, Kexin Pei, Saikat Chakraborty

Selected Publications

Trustworthy AI Software Engineers

A Aleti, B Ray, R Hoda, S Chen · Preprint, 2026

Outrunning LLM Cutoffs: A Live Kernel Crash Resolution Benchmark for All

C Huang, A Mathai, F Yu, A Nogikh, P Maniatis, F Ivančić, E Wu, K Kaffes, J Yang, B Ray · ICML 2026

SWE-Spot: Building Small Repo-Experts with Repository-Centric Learning

J Peng, M Saebo, T Zhong, YJ Cheng, J Yang, B Ray, S Chen, Y Ding · Preprint, 2026

Understanding APR Agents Through the Lens of Traceability: An Empirical Study

I Ceka*, H Mitchell*, S Pujar, L Buratti, S Ramji, J Yang, G Kaiser, B Ray · ISSTA 2026

AppForge: From Assistant to Independent Developer--Are GPTs Ready for Software Development?

D Ran, Y Cao, M Wu, S Chen, Y Guo, J Ren, Z Song, H Yu, J Wei, L Li, W Yang, B Ray, T Xie · ICLR 2025

REFINE: Enhancing Program Repair Agents through Context-Aware Patch Refinement

A Pabba, S Chen, A Mathai, A Chakraborty, B Ray · Preprint, 2025

Red Teaming Program Repair Agents: When Correct Patches can Hide Vulnerabilities

S Chen, Y He, S Jana, B Ray · Preprint, 2025

FaultLine: Automated proof-of-vulnerability generation using LLM agents

V Nitin, B Ray, RZ Moghaddam · Preprint, 2025

CrashFixer: A crash resolution agent for the Linux kernel

A Mathai, C Huang, S Ma, J Kim, H Mitchell, A Nogikh, P Maniatis · Preprint, 2025

EditLord: Learning Code Transformation Rules for Code Editing

W Li, A Jan, B Ray, J Yang, C Mao, K Pei · ICML 2025

kGym: A platform and dataset to benchmark large language models on Linux kernel crash resolution

A Mathai, C Huang, P Maniatis, A Nogikh, F Ivančić, J Yang, B Ray · Advances in Neural Information Processing Systems 37, 78053-78078, 2024

Automated Code Editing with Search-Generate-Modify

C Liu, P Cetin, Y Patodia, B Ray, S Chakraborty, Y Ding · IEEE Transactions on Software Engineering, 2024

TraceFixer: Execution trace-driven program repair

I Bouzenia, Y Ding, K Pei, B Ray, M Pradel · Preprint, 2023