Event

Doctoral Defence: TANG Xunzhu

The Doctoral School in Sciences and Engineering is happy to invite you to TANG Xunzhu’s defence entitled

Automating the Analysis and Generation of Code Changes with Foundation Models

Supervisor: Prof Jacques KLEIN

Code changes are fundamental to software evolution, encompassing feature implementation, bug fixing, and security patching. As software systems scale in size and complexity, the ability to accurately understand and analyze code changes becomes increasingly critical for ensuring long-term software reliability, security, and maintainability. However, most existing tools treat code changes as plain diffs or unstructured token sequences, a view that overlooks their underlying intent, semantics, and structural context. This lack of structured representation limits the effectiveness of automated approaches in key software engineering tasks such as vulnerability detection, code review, and automated program repair.

The thesis addresses this challenge by exploring how structured, generalizable representations of code and code changes can support a wide range of LLM-driven downstream tasks. We introduce Patcherizer, a learning framework designed to model both the sequential semantics and the structural transformations involved in code modifications. Patcherizer combines token-level change intent, abstract syntax tree (AST) structure analysis, and contextual information to form representations that are both expressive and reusable across tasks. Our experiments demonstrate that Patcherizer significantly improves performance in tasks such as commit message generation, defect prediction, and patch correctness evaluation.
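To make the fusion idea concrete, the sketch below mimics Patcherizer's three views of a patch — token-level change intent, AST-level structural edits, and surrounding context — by embedding each view separately and concatenating the results. This is a minimal illustration only: the toy hash-based `embed` function and the mean-pooling fusion are placeholders for the learned encoders in the thesis, and all names here are hypothetical.

```python
import hashlib
import numpy as np

DIM = 8  # toy embedding width; the real encoders are learned and far wider

def embed(text: str) -> np.ndarray:
    """Deterministic toy embedding: first DIM bytes of a SHA-256 digest, scaled to [0, 1)."""
    digest = hashlib.sha256(text.encode()).digest()[:DIM]
    return np.frombuffer(digest, dtype=np.uint8).astype(np.float64) / 256.0

def patch_representation(diff_tokens, ast_edits, context_lines) -> np.ndarray:
    """Fuse the three views of a code change by mean-pooling each, then concatenating."""
    views = [
        np.mean([embed(t) for t in diff_tokens], axis=0),    # sequential change intent
        np.mean([embed(e) for e in ast_edits], axis=0),      # structural transformation
        np.mean([embed(c) for c in context_lines], axis=0),  # surrounding code context
    ]
    return np.concatenate(views)  # one reusable, task-agnostic vector

rep = patch_representation(
    diff_tokens=["-", "if", "x", "<", "0", "+", "if", "x", "<=", "0"],
    ast_edits=["update: BinOp.Lt -> BinOp.LtE"],
    context_lines=["def clamp(x):", "return 0"],
)
print(rep.shape)  # (24,)
```

Because the fused vector does not depend on any one downstream task, the same representation can feed a commit-message generator, a defect predictor, or a patch-correctness classifier.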

To showcase the utility of such representations, we design and evaluate three downstream systems, each leveraging large language models (LLMs) for different types of reasoning and generation:

  • LLMDA focuses on security patch detection. It uses LLMs to generate natural language explanations of code changes and augments training data with these interpretive narratives. These augmented examples are combined with structured change features and discriminative learning techniques to distinguish undocumented vulnerability patches from ordinary commits. This hybrid approach improves detection accuracy, particularly in real-world settings where commit messages may be misleading or incomplete.
  • CodeAgent targets automated code review by modeling the process as a multi-agent system. Each LLM-powered agent specializes in a specific review domain, such as functional correctness, security compliance, or style adherence. A coordination mechanism oversees these agents to maintain review quality and prevent redundancy or drift. This system better simulates collaborative developer environments and enables scalable, consistent review automation.
  • SynFix addresses the challenge of repository-level program repair. Unlike methods that focus on isolated functions or files, SynFix models code dependencies across an entire codebase using call graph analysis. It applies LLMs to generate patches that remain valid across multiple interdependent components, ensuring consistency and reducing the risk of regressions. This approach improves repair coverage and accuracy on large-scale projects.
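The dependency step described for SynFix can be sketched as a walk over a repository's call graph: starting from the buggy function, collect every component that (transitively) calls it, so a generated patch can be checked against all of them. The graph, function names, and BFS traversal below are illustrative assumptions, not the thesis implementation.

```python
from collections import deque

# Toy caller -> callees map standing in for a call graph extracted from a repository.
CALL_GRAPH = {
    "parse_config": ["read_file", "validate"],
    "validate": ["report_error"],
    "main": ["parse_config", "run"],
}

def reverse_edges(graph):
    """Invert the call graph: callee -> list of callers."""
    rev = {}
    for caller, callees in graph.items():
        for callee in callees:
            rev.setdefault(callee, []).append(caller)
    return rev

def affected_components(buggy_fn, graph):
    """BFS over reverse call edges: every function that transitively calls buggy_fn."""
    rev = reverse_edges(graph)
    seen, queue = {buggy_fn}, deque([buggy_fn])
    while queue:
        fn = queue.popleft()
        for caller in rev.get(fn, []):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen

# Patching `validate` must keep `parse_config` and `main` consistent as well.
print(sorted(affected_components("validate", CALL_GRAPH)))
```

Restricting the LLM's patch validation to this affected set is what lets a repository-level repairer stay consistent without re-analyzing the entire codebase for every fix.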

Building upon the insights from these systems, we propose Repairity, a framework aimed at bridging the performance gap between open-source and closed-source LLMs for program repair. Repairity adopts a three-stage methodology: (1) extracting high-quality reasoning traces from commercial LLMs, (2) fine-tuning open-source models on this data, and (3) applying reinforcement learning with model-generated feedback to further improve repair quality. This pipeline significantly enhances the performance of open models like Qwen2.5-Coder, reducing the accuracy gap with commercial models such as Claude by over 90%, while maintaining full transparency and customizability.
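The three stages above compose into a linear pipeline, which the outline below sketches with stub functions. Every body here is a placeholder — the teacher queries, the fine-tuning run, and the reinforcement loop are stand-ins for the actual Repairity machinery, and the field names are hypothetical.

```python
def extract_reasoning_traces(bugs):
    """Stage 1: collect step-by-step repair rationales from a commercial LLM (stubbed)."""
    return [{"bug": b, "trace": f"reasoning for {b}"} for b in bugs]

def finetune(base_model, traces):
    """Stage 2: supervised fine-tuning of the open model on the traces (stubbed)."""
    return {"name": base_model, "sft_examples": len(traces)}

def reinforce(model, feedback_rounds=3):
    """Stage 3: refine with model-generated feedback, e.g. patch test outcomes (stubbed)."""
    model["rl_rounds"] = feedback_rounds
    return model

bugs = ["npe-in-parser", "overflow-in-sum"]
model = reinforce(finetune("Qwen2.5-Coder", extract_reasoning_traces(bugs)))
print(model)
```

The point of the staging is that each step narrows the gap left by the previous one: distilled traces transfer the closed model's reasoning, fine-tuning internalizes it, and the feedback loop corrects repairs the supervised data alone cannot.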

In summary, this dissertation presents a comprehensive study on structured code change representation and its integration with LLM-based systems for software analysis and synthesis. Through the development of Patcherizer and its application to tasks such as vulnerability detection, code review, and repository-scale repair, we demonstrate the practical benefits of representation-driven automation in modern software engineering workflows.