Publications

2026

kAgent: An execution-guided crash resolution agent for the Linux kernel

A Mathai, C Huang, S Ma, J Kim, H Mitchell, A Nogikh, P Maniatis

Deep Learning for Code (DL4C) Workshop at ICML 2026 PDF

#AI Software Engineering Agents and Program Repair

Trustworthy AI Software Engineers

A Aleti, B Ray, R Hoda, S Chen

Preprint, 2026 PDF

#Trustworthy and Robust AI Systems #AI Software Engineering Agents and Program Repair

Outrunning LLM Cutoffs: A Live Kernel Crash Resolution Benchmark for All

C Huang, A Mathai, F Yu, A Nogikh, P Maniatis, F Ivančić, E Wu, K Kaffes, J Yang, B Ray

ICML 2026 PDF

#AI Software Engineering Agents and Program Repair

Code Quality Analysis of Translations from C to Rust

B Tadesse, V Nitin, M Salah, B Ray, M d'Amorim, W Assunção

Preprint, 2026 PDF

#Cross-Code and Program Translation #Software Modernization and Migration

SWE-Spot: Building Small Repo-Experts with Repository-Centric Learning

J Peng, M Saebo, T Zhong, YJ Cheng, J Yang, B Ray, S Chen, Y Ding

Preprint, 2026 PDF

#AI Software Engineering Agents and Program Repair

Your compiler is backdooring your model: Understanding and exploiting compilation inconsistency vulnerabilities in deep learning compilers

S Chen, J Peng, Y He, J Yang, B Ray

IEEE Symposium on Security and Privacy (S&P) 2026 PDF

★ Distinguished Paper Award

#Trustworthy and Robust AI Systems #Code Security and Vulnerability Detection

CodeSense: a Real-World Benchmark and Dataset for Code Semantic Reasoning

MK Roy, S Chen, B Steenhoek, J Peng, G Kaiser, B Ray, W Le

ICLR 2026 PDF

#Reasoning and Evaluation for Code and LLMs

Understanding APR Agents Through the Lens of Traceability: An Empirical Study

I Ceka*, H Mitchell*, S Pujar, L Buratti, S Ramji, J Yang, G Kaiser, B Ray

ISSTA 2026 PDF

#AI Software Engineering Agents and Program Repair

2025

C2SaferRust: Transforming C Projects into Safer Rust with NeuroSymbolic Techniques

V Nitin, R Krishna, L Lemos do Valle, B Ray

IEEE Transactions on Software Engineering, 2025 PDF

#Cross-Code and Program Translation #Software Modernization and Migration

Benchmarking large language models under data contamination: A survey from static to dynamic evaluation

S Chen, Y Chen, Z Li, Y Jiang, Z Wan, Y He, D Ran, T Gu, H Li, T Xie

EMNLP 2025 PDF

#Reasoning and Evaluation for Code and LLMs

Mechanics of Learned Reasoning 1: TempoBench, A Benchmark for Interpretable Deconstruction of Reasoning System Performance

N Holzer, W Fishell, B Ray, M Santolucito

Preprint, 2025 PDF

#Reasoning and Evaluation for Code and LLMs

AppForge: From Assistant to Independent Developer--Are GPTs Ready for Software Development?

D Ran, Y Cao, M Wu, S Chen, Y Guo, J Ren, Z Song, H Yu, J Wei, L Li, W Yang, B Ray, T Xie

ICLR 2025 PDF

#AI Software Engineering Agents and Program Repair

REFINE: Enhancing Program Repair Agents through Context-Aware Patch Refinement

A Pabba, S Chen, A Mathai, A Chakraborty, B Ray

Preprint, 2025 PDF

#AI Software Engineering Agents and Program Repair

Red Teaming Program Repair Agents: When Correct Patches can Hide Vulnerabilities

S Chen, Y He, S Jana, B Ray

Preprint, 2025 PDF

#Code Security and Vulnerability Detection #AI Software Engineering Agents and Program Repair

FaultLine: Automated proof-of-vulnerability generation using LLM agents

V Nitin, B Ray, RZ Moghaddam

Preprint, 2025 PDF

#Code Security and Vulnerability Detection #AI Software Engineering Agents and Program Repair

Code Reasoning for Software Engineering Tasks: A Survey and A Call to Action

S Pujar, I Ceka, I Manotas, G Kaiser, B Ray, S Ramji

Preprint, 2025 PDF

#Reasoning and Evaluation for Code and LLMs

CWEval: Outcome-driven Evaluation on Functionality and Security of LLM Code Generation

J Peng, L Cui, K Huang, J Yang, B Ray

2025 IEEE/ACM International Workshop on Large Language Models for Code PDF

#Code Security and Vulnerability Detection #Reasoning and Evaluation for Code and LLMs

EditLord: Learning Code Transformation Rules for Code Editing

W Li, A Jan, B Ray, J Yang, C Mao, K Pei

ICML 2025 PDF

#AI Software Engineering Agents and Program Repair

DyCodeEval: Dynamic Benchmarking of Reasoning Capabilities in Code Large Language Models Under Data Contamination

S Chen, P Pusarla, B Ray

ICML 2025 PDF

#Reasoning and Evaluation for Code and LLMs #Trustworthy and Robust AI Systems

Vulnerability detection with code language models: How far are we?

Y Ding, Y Fu, O Ibrahim, C Sitawarin, X Chen, B Alomair, D Wagner, B Ray, Y Chen

ICSE 2025 PDF

#Code Security and Vulnerability Detection #Reasoning and Evaluation for Code and LLMs

2024

SemCoder: Training code language models with comprehensive semantics reasoning

Y Ding, J Peng, MJ Min, G Kaiser, J Yang, B Ray

Advances in Neural Information Processing Systems 37, 60275-60308 PDF

#Reasoning and Evaluation for Code and LLMs

Can LLM prompting serve as a proxy for static analysis in vulnerability detection

I Ceka, F Qiao, A Dey, A Valecha, G Kaiser, B Ray

Preprint, 2024 PDF

#Code Security and Vulnerability Detection

kGym: A platform and dataset to benchmark large language models on Linux kernel crash resolution

A Mathai, C Huang, P Maniatis, A Nogikh, F Ivančić, J Yang, B Ray

Advances in Neural Information Processing Systems 37, 78053-78078 PDF

#AI Software Engineering Agents and Program Repair

Comment on Revisiting Neural Program Smoothing for Fuzzing

D She, K Pei, J Yang, B Ray, S Jana

Preprint, 2024 PDF

#Code Security and Vulnerability Detection

Yuga: Automatically Detecting Lifetime Annotation Bugs in the Rust Language

V Nitin, A Mulhern, S Arora, B Ray

IEEE Transactions on Software Engineering 50(10), 2602-2613 PDF

#Code Security and Vulnerability Detection #Cross-Code and Program Translation

Spectra: Enhancing the code translation ability of language models by generating multi-modal specifications

V Nitin, R Krishna, B Ray

Preprint, 2024 PDF

#Cross-Code and Program Translation #Software Modernization and Migration

Cycle: Learning to self-refine the code generation

Y Ding, MJ Min, G Kaiser, B Ray

Proceedings of the ACM on Programming Languages 8 (OOPSLA1), 392-418 PDF

#Reasoning and Evaluation for Code and LLMs

Automated Code Editing with Search-Generate-Modify

C Liu, P Cetin, Y Patodia, B Ray, S Chakraborty, Y Ding

IEEE Transactions on Software Engineering, 2024 PDF

#AI Software Engineering Agents and Program Repair

Towards causal deep learning for vulnerability detection

MM Rahman, I Ceka, C Mao, S Chakraborty, B Ray, W Le

ICSE 2024 PDF

#Code Security and Vulnerability Detection #Trustworthy and Robust AI Systems

TRACED: Execution-aware Pre-training for Source Code

Y Ding, B Steenhoek, K Pei, G Kaiser, W Le, B Ray

ICSE 2024 PDF

#Reasoning and Evaluation for Code and LLMs #Code Security and Vulnerability Detection

Beyond accuracy: Evaluating self-consistency of code LLMs with IdentityChain

MJ Min, Y Ding, L Buratti, S Pujar, G Kaiser, S Jana, B Ray

ICLR 2024 PDF

#Reasoning and Evaluation for Code and LLMs

2023

Language-guided traffic simulation via scene-level diffusion

Z Zhong, D Rempe, Y Chen, B Ivanovic, Y Cao, D Xu, M Pavone, B Ray

Conference on Robot Learning, 144-177 PDF

#Trustworthy and Robust AI Systems

Guided conditional diffusion for controllable traffic simulation

Z Zhong, D Rempe, D Xu, Y Chen, S Veer, T Che, B Ray, M Pavone

ICRA 2023, 3560-3566 PDF

#Trustworthy and Robust AI Systems

On ML-based program translation: perils and promises

A Malyala, K Zhou, B Ray, S Chakraborty

ICSE 2023 PDF

#Cross-Code and Program Translation

Summarize and generate to back-translate: Unsupervised translation of programming languages

W Ahmad, S Chakraborty, B Ray, KW Chang

EACL 2023 PDF

#Cross-Code and Program Translation

TraceFixer: Execution trace-driven program repair

I Bouzenia, Y Ding, K Pei, B Ray, M Pradel

Preprint, 2023 PDF

#AI Software Engineering Agents and Program Repair #Reasoning and Evaluation for Code and LLMs

Concord: Clone-aware contrastive learning for source code

Y Ding, S Chakraborty, L Buratti, S Pujar, A Morari, G Kaiser, B Ray

ISSTA 2023 PDF

★ Distinguished Paper Award

#Reasoning and Evaluation for Code and LLMs

A static evaluation of code completion by large language models

H Ding, V Kumar, Y Tian, Z Wang, R Kwiatkowski, X Li, MK Ramanathan

ACL 2023 PDF

#Reasoning and Evaluation for Code and LLMs

Cameo: A causal transfer learning approach for performance optimization of configurable computer systems

MS Iqbal, Z Zhong, I Ahmad, B Ray, P Jamshidi

ACM Symposium on Cloud Computing 2023, 555-571 PDF

#Trustworthy and Robust AI Systems

2022

Learning approximate execution semantics from traces for binary function similarity

K Pei, Z Xuan, J Yang, S Jana, B Ray

IEEE Transactions on Software Engineering 49(4), 2776-2790 PDF

#Code Security and Vulnerability Detection

NatGen: generative pre-training by 'naturalizing' source code

S Chakraborty, T Ahmed, Y Ding, PT Devanbu, B Ray

ESEC/FSE 2022 PDF

#Reasoning and Evaluation for Code and LLMs

NeuDep: neural binary memory dependence analysis

K Pei, D She, M Wang, S Geng, Z Xuan, Y David, J Yang, S Jana, B Ray

ESEC/FSE 2022 PDF

#Code Security and Vulnerability Detection

Multi-lingual evaluation of code generation models

B Athiwaratkun, SK Gouda, Z Wang, X Li, Y Tian, M Tan, WU Ahmad

Preprint, 2022 PDF

#Reasoning and Evaluation for Code and LLMs

Cargo: AI-guided dependency analysis for migrating monolithic applications to microservices architecture

V Nitin, S Asthana, B Ray, R Krishna

ASE 2022 PDF

★ Distinguished Paper Award

#Software Modernization and Migration

Neural network guided evolutionary fuzzing for finding traffic violations of autonomous vehicles

Z Zhong, G Kaiser, B Ray

IEEE Transactions on Software Engineering 49(4), 1860-1875 PDF

#Trustworthy and Robust AI Systems

Detecting multi-sensor fusion errors in advanced driver-assistance systems

Z Zhong, Z Hu, S Guo, X Zhang, Z Zhong, B Ray

ISSTA 2022 PDF

#Trustworthy and Robust AI Systems

Automatic map generation for autonomous driving system testing

Y Tang, Y Zhou, K Yang, Z Zhong, B Ray, Y Liu, P Zhang, J Chen

Preprint, 2022 PDF

#Trustworthy and Robust AI Systems

Unicorn: Reasoning about configurable system performance through the lens of causality

MS Iqbal, R Krishna, MA Javidian, B Ray, P Jamshidi

EuroSys 2022, 199-217 PDF

#Trustworthy and Robust AI Systems #Software Modernization and Migration

Repairing Group-Level Errors for DNNs Using Weighted Regularization

Z Zhong, Y Tian, CJ Sweeney, V Ordonez, B Ray

Preprint, 2022 PDF

#Trustworthy and Robust AI Systems

VELVET: a noVel Ensemble Learning approach to automatically locate VulnErable sTatements

Y Ding, S Suneja, Y Zheng, J Laredo, A Morari, G Kaiser, B Ray

SANER 2022 PDF

#Code Security and Vulnerability Detection

Deep learning based vulnerability detection: Are we there yet?

S Chakraborty, R Krishna, Y Ding, B Ray

IEEE Transactions on Software Engineering 48(9), 3280-3296 PDF

★ Best Paper Award Runner-up

#Code Security and Vulnerability Detection #Reasoning and Evaluation for Code and LLMs

Towards Learning (Dis)-Similarity of Source Code from Program Contrasts

Y Ding, L Buratti, S Pujar, A Morari, B Ray, S Chakraborty

ACL 2022 PDF

#Reasoning and Evaluation for Code and LLMs

2021

A survey on scenario-based testing for automated driving systems in high-fidelity simulation

Z Zhong, Y Tang, Y Zhou, VO Neves, Y Liu, B Ray

Preprint, 2021 PDF

#Trustworthy and Robust AI Systems

On multi-modal learning of editing source code

S Chakraborty, B Ray

ASE 2021 PDF

#AI Software Engineering Agents and Program Repair #Reasoning and Evaluation for Code and LLMs

Retrieval augmented code generation and summarization

MR Parvez, W Ahmad, S Chakraborty, B Ray, KW Chang

Findings of EMNLP 2021, 2719-2734 PDF

#Reasoning and Evaluation for Code and LLMs

StateFormer: Fine-grained type recovery from binaries using generative state modeling

K Pei, J Guan, M Broughton, Z Chen, S Yao, D Williams-King, V Ummadisetty, J Yang, B Ray, S Jana

ESEC/FSE 2021 PDF

#Code Security and Vulnerability Detection

DIRECT: A Transformer-based Model for Decompiled Identifier Renaming

V Nitin, A Saieva, B Ray, G Kaiser

NLP4Prog 2021 PDF

#Code Security and Vulnerability Detection #Software Modernization and Migration

Understanding local robustness of deep neural networks under natural variations

Z Zhong, Y Tian, B Ray

FASE 2021 PDF

#Trustworthy and Robust AI Systems

Unified pre-training for program understanding and generation

WU Ahmad, S Chakraborty, B Ray, KW Chang

NAACL 2021 PDF

#Reasoning and Evaluation for Code and LLMs

CADET: Debugging and fixing misconfigurations using counterfactual reasoning

MS Iqbal, R Krishna, MA Javidian, B Ray, P Jamshidi

Preprint, 2021 PDF

#Trustworthy and Robust AI Systems

2020

Patching as translation: the data and the metaphor

Y Ding, B Ray, P Devanbu, VJ Hellendoorn

ASE 2020 PDF

#AI Software Engineering Agents and Program Repair #Cross-Code and Program Translation

Repairing confusion and bias errors for DNN-based image classifiers

Y Tian

ESEC/FSE 2020 PDF

#Trustworthy and Robust AI Systems

MTFuzz: fuzzing with a multi-task neural network

D She, R Krishna, L Yan, S Jana, B Ray

ESEC/FSE 2020 PDF

#Code Security and Vulnerability Detection #Trustworthy and Robust AI Systems

Codit: Code editing with tree-based neural models

S Chakraborty, Y Ding, M Allamanis, B Ray

IEEE Transactions on Software Engineering 48(4), 1385-1399 PDF

#AI Software Engineering Agents and Program Repair

Multitask learning strengthens adversarial robustness

C Mao, A Gupta, V Nitin, B Ray, S Song, J Yang, C Vondrick

ECCV 2020, 158-174 PDF

#Trustworthy and Robust AI Systems

ConEx: Efficient exploration of big-data system configurations for better performance

R Krishna, C Tang, K Sullivan, B Ray

IEEE Transactions on Software Engineering 48(3), 893-909 PDF

#Trustworthy and Robust AI Systems

A transformer-based approach for source code summarization

W Ahmad, S Chakraborty, B Ray, KW Chang

ACL 2020 PDF

#Reasoning and Evaluation for Code and LLMs

Testing DNN image classifiers for confusion & bias errors

Y Tian, Z Zhong, V Ordonez, G Kaiser, B Ray

ICSE 2020 PDF

#Trustworthy and Robust AI Systems

Neutaint: Efficient dynamic taint analysis with neural networks

D She, Y Chen, A Shah, B Ray, S Jana

IEEE Symposium on Security and Privacy 2020, 1527-1543 PDF

#Code Security and Vulnerability Detection #Trustworthy and Robust AI Systems

2019

Metric learning for adversarial robustness

C Mao, Z Zhong, J Yang, C Vondrick, B Ray

NeurIPS 2019 PDF

#Trustworthy and Robust AI Systems

Bringing engineering rigor to Deep Learning

K Pei, S Wang, Y Tian, J Whitehouse, C Vondrick, Y Cao, B Ray, S Jana

ACM SIGOPS Operating Systems Review 53(1), 59-67 PDF

#Trustworthy and Robust AI Systems

Neuzz: Efficient fuzzing with neural program smoothing

D She, K Pei, D Epstein, J Yang, B Ray, S Jana

IEEE Symposium on Security and Privacy 2019, 803-817 PDF

#Code Security and Vulnerability Detection #Trustworthy and Robust AI Systems

Toward optimal selection of information retrieval models for software engineering tasks

MM Rahman, S Chakraborty, G Kaiser, B Ray

SCAM 2019 PDF

#Reasoning and Evaluation for Code and LLMs

2018

Building language models for text with named entities

MR Parvez, S Chakraborty, B Ray, KW Chang

ACL 2018 PDF

#Reasoning and Evaluation for Code and LLMs

DeepTest: Automated testing of deep-neural-network-driven autonomous cars

Y Tian, K Pei, S Jana, B Ray

ICSE 2018 PDF

#Trustworthy and Robust AI Systems

Searching for high-performing software configurations with metaheuristic algorithms

C Tang, K Sullivan, B Ray

ICSE 2018 PDF

#Trustworthy and Robust AI Systems

Which similarity metric to use for software documents? A study on information retrieval based software engineering tasks

MM Rahman, S Chakraborty, B Ray

ICSE 2018 PDF

#Reasoning and Evaluation for Code and LLMs

Entropy guided spectrum based bug localization using statistical language model

S Chakraborty, Y Li, M Irvine, R Saha, B Ray

Preprint, 2018 PDF

#Code Security and Vulnerability Detection

A case study on the impact of similarity measure on information retrieval based software engineering tasks

MM Rahman, S Chakraborty, G Kaiser, B Ray

Preprint, 2018 PDF

#Reasoning and Evaluation for Code and LLMs

2017

Interpreted formalisms for configurations

C Tang, K Sullivan, J Xiang, T Weiss, B Ray

Preprint, 2017 PDF

#Trustworthy and Robust AI Systems

Automatically diagnosing and repairing error handling bugs in C

Y Tian, B Ray

ESEC/FSE 2017 PDF

#AI Software Engineering Agents and Program Repair