Overview
Software vulnerabilities remain one of the most critical challenges in computing. Our research combines deep learning with program analysis to build systems that automatically detect vulnerabilities and generate proofs of exploitability, and to characterize the limitations of current detection approaches.
We take a rigorous, evaluation-driven approach, building benchmarks that test whether models truly understand security semantics rather than memorizing surface patterns.
Key Directions
- Vulnerability Detection: Evaluating and improving how LLMs and deep learning models detect vulnerabilities, including causal approaches that go beyond spurious correlations.
- Proof-of-Vulnerability Generation: Automated systems that not only find bugs but generate concrete exploits to validate them.
- Red Teaming: Studying how AI repair agents can inadvertently introduce new vulnerabilities while fixing existing bugs.
- Fuzzing: Neural-guided fuzzing techniques that combine program analysis with learned models to discover deep bugs.
- Binary Analysis: Learning execution semantics from traces for tasks like binary similarity detection and memory dependence analysis.
Impact
Our work on deep learning-based vulnerability detection revealed critical shortcomings in existing approaches, reshaping how the community evaluates detection tools. Neuzz pioneered neural program smoothing for fuzzing and has been widely adopted. FaultLine demonstrates that LLM agents can automatically generate proof-of-vulnerability exploits.
Contributors
Baishakhi Ray
Simin Chen
Vikram Nitin
Jinjun Peng
Ira Ceka
Yangruibo Ding
Kexin Pei
Dongdong She
Saikat Chakraborty
Selected Publications
Your compiler is backdooring your model: Understanding and exploiting compilation inconsistency vulnerabilities in deep learning compilers
S Chen, J Peng, Y He, J Yang, B Ray · IEEE Symposium on Security and Privacy (S&P) 2026
Red Teaming Program Repair Agents: When Correct Patches can Hide Vulnerabilities
S Chen, Y He, S Jana, B Ray · Preprint, 2025
FaultLine: Automated proof-of-vulnerability generation using LLM agents
V Nitin, B Ray, RZ Moghaddam · Preprint, 2025
CWEval: Outcome-driven Evaluation on Functionality and Security of LLM Code Generation
J Peng, L Cui, K Huang, J Yang, B Ray · 2025 IEEE/ACM International Workshop on Large Language Models for Code
Can LLM prompting serve as a proxy for static analysis in vulnerability detection
I Ceka, F Qiao, A Dey, A Valecha, G Kaiser, B Ray · Preprint, 2024
Comment on Revisiting Neural Program Smoothing for Fuzzing
D She, K Pei, J Yang, B Ray, S Jana · Preprint, 2024
Yuga: Automatically Detecting Lifetime Annotation Bugs in the Rust Language
V Nitin, A Mulhern, S Arora, B Ray · IEEE Transactions on Software Engineering 50(10), 2602-2613, 2024
Towards causal deep learning for vulnerability detection
MM Rahman, I Ceka, C Mao, S Chakraborty, B Ray, W Le · ICSE 2024
Vulnerability detection with code language models: How far are we?
Y Ding, Y Fu, O Ibrahim, C Sitawarin, X Chen, B Alomair, D Wagner, B Ray, Y Chen · ICSE 2025
TRACED: Execution-aware Pre-training for Source Code
Y Ding, B Steenhoek, K Pei, G Kaiser, W Le, B Ray · ICSE 2024
Learning approximate execution semantics from traces for binary function similarity
K Pei, Z Xuan, J Yang, S Jana, B Ray · IEEE Transactions on Software Engineering 49(4), 2776-2790, 2023
NeuDep: neural binary memory dependence analysis
K Pei, D She, M Wang, S Geng, Z Xuan, Y David, J Yang, S Jana, B Ray · ESEC/FSE 2022
VELVET: a noVel Ensemble Learning approach to automatically locate VulnErable sTatements
Y Ding, S Suneja, Y Zheng, J Laredo, A Morari, G Kaiser, B Ray · SANER 2022
Deep learning based vulnerability detection: Are we there yet?
S Chakraborty, R Krishna, Y Ding, B Ray · IEEE Transactions on Software Engineering 48(9), 3280-3296, 2022