Cross-Code and Program Translation

Translating code across programming languages with a focus on safety and correctness.

Overview

Migrating legacy codebases to modern, safer languages is a pressing need across industry and government. Our research develops techniques for automatically translating code between programming languages, with particular emphasis on the C-to-Rust translation pipeline where memory safety guarantees are at stake.

We study both the capabilities and pitfalls of ML-based translation, building neurosymbolic systems that combine the flexibility of neural models with the rigor of formal specifications.

Key Directions

  • C-to-Rust Translation: Neurosymbolic techniques that produce safer Rust code from C projects, preserving functionality while gaining memory safety.
  • Multi-modal Specifications: Generating specifications in multiple modalities (types, contracts, tests) to guide and validate translations.
  • Translation Quality: Empirical studies measuring the quality, correctness, and idiomaticity of machine-translated code.
  • Unsupervised Translation: Back-translation techniques that enable code translation without parallel training corpora.
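
To make the first two directions concrete, here is a minimal sketch of what "safer" can mean for a single function. It is a hand-written illustration, not output from C2SaferRust or any other tool. The names sum_raw, sum_safe, and translations_agree are hypothetical, and using wrapping addition on overflow is an assumption made for the example (signed overflow is undefined behavior in the C original).

```rust
// Hypothetical example, written by hand for illustration only.
// The C function being migrated, shown here for reference:
//
//   int sum(const int *xs, size_t n) {
//       int total = 0;
//       for (size_t i = 0; i < n; i++) total += xs[i];
//       return total;
//   }

/// A literal, pointer-based translation. It compiles, but it keeps C's
/// unchecked memory access: the caller must guarantee that `xs` points to
/// at least `n` readable `i32` values.
unsafe fn sum_raw(xs: *const i32, n: usize) -> i32 {
    let mut total: i32 = 0;
    let mut i = 0;
    while i < n {
        // Assumption: overflow wraps (it is undefined behavior in C).
        total = total.wrapping_add(unsafe { *xs.add(i) });
        i += 1;
    }
    total
}

/// A safer rewrite. The slice carries its own length, so the function
/// needs no `unsafe` and cannot read out of bounds.
fn sum_safe(xs: &[i32]) -> i32 {
    xs.iter().fold(0i32, |acc, &x| acc.wrapping_add(x))
}

/// A test used as a lightweight specification: both versions must
/// return the same result on the same input.
fn translations_agree(xs: &[i32]) -> bool {
    let raw = unsafe { sum_raw(xs.as_ptr(), xs.len()) };
    raw == sum_safe(xs)
}

fn main() {
    let inputs: [&[i32]; 3] = [&[], &[1, 2, 3], &[i32::MAX, 1]];
    for xs in inputs {
        assert!(translations_agree(xs));
    }
    println!("both versions agree on all sample inputs");
}
```

The type change from a raw pointer plus length to a slice is one kind of specification, and the agreement check is another. Passing such checks raises confidence in a translation but does not prove that the two versions are equivalent.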

Impact

C2SaferRust is among the first systems to translate real-world C projects into safer Rust using neurosymbolic techniques, addressing a key need in the software security community. Our empirical studies on ML-based translation quality have helped set realistic expectations for automated migration tools.

Contributors

Baishakhi Ray, Vikram Nitin, Saikat Chakraborty, Rahul Krishna
