Name: Doctoral Defence: Yu PEI
Start: 2025-10-23 10:00:00
End: 2025-10-23 13:00:00
Location: JFK (Kirchberg) - Room JFK-02-201

The Doctoral School in Sciences and Engineering is happy to invite you to Yu PEI’s defence entitled

Understanding, Localizing, and Repairing Flakiness in Web Front-End Testing

Supervisor: Assoc. Prof. Michail PAPADAKIS

Flaky tests, which yield inconsistent results without any change to the code, threaten the reliability and maintainability of software systems. The problem is particularly acute in web applications because of the complexity of modern front-end frameworks, dynamic content, and asynchronous behavior. Although test flakiness is becoming more widely recognized in software testing, little is known about its precise forms and underlying causes in web front-end testing. This dissertation aims to address this gap by providing a thorough examination of the causes, patterns, and mitigation strategies of web front-end test flakiness.

First, this dissertation examines how DOM event interactions contribute to flaky behavior in real-world projects. The study conducts an empirical analysis of 123 flaky tests in 49 open-source web projects, focusing on the correlation between DOM event interactions and test flakiness. By analyzing interaction types and their potential relationships, the study identifies unpredictable DOM event sequences, asynchronous operations, and event handling mechanisms as key factors. These results demonstrate the important role that DOM event-driven interactions play in web testing flakiness.

Building on this analysis, the dissertation presents DoeFL, an approach that combines structured analysis of DOM event sequences with large language models (LLMs) to identify the locations of web flakiness. DoeFL models causal relationships between DOM events using Lamport logical clocks, and it then leverages LLMs to capture semantic test intents and patterns. We evaluate DoeFL on 122 real-world flaky test cases, achieving 47.5% Top-1 localization accuracy. With LLMs incorporated into the approach, it can identify subtle or implicit flakiness patterns, reaching 77.9% Top-1 accuracy. These results demonstrate the effectiveness of combining structural and semantic reasoning to localize flakiness in web environments.
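As background on the Lamport logical clocks mentioned above, the following is a minimal, generic sketch of the clock rules applied to a made-up sequence of DOM-like events. It is not DoeFL's implementation: the `EventSource` class, the two sources, and the event labels are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class EventSource:
    """Hypothetical source of events, e.g. a test script or a page handler."""
    name: str
    clock: int = 0
    log: list = field(default_factory=list)

    def local_event(self, label: str) -> int:
        # Rule 1: increment the local clock before recording an event.
        self.clock += 1
        self.log.append((self.clock, self.name, label))
        return self.clock

    def send(self, label: str) -> int:
        # Sending counts as a local event; its timestamp travels with the message.
        return self.local_event(label)

    def receive(self, label: str, sender_timestamp: int) -> int:
        # Rule 2: on receipt, move past both the local and the sender's clock.
        self.clock = max(self.clock, sender_timestamp) + 1
        self.log.append((self.clock, self.name, label))
        return self.clock


# Assumed scenario: a test dispatches a click, the page handles it,
# and the test asserts only after the page reports completion.
test = EventSource("test")
page = EventSource("page")

t_click = test.send("dispatch click")
page.local_event("render spinner")            # concurrent with the click
t_handler = page.receive("click handler", t_click)
t_done = page.send("fetch resolved")
t_assert = test.receive("assert text", t_done)

# Sorting by timestamp yields an order consistent with causality.
ordered = sorted(test.log + page.log)
```

In this sketch, any event that causally precedes another always carries a smaller timestamp, while events with equal timestamps (here the click and the spinner render) are concurrent and may occur in either order.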

Turning to mitigation, the third part of this dissertation proposes TRaf, a repair technique that addresses asynchronous wait flakiness. Our investigation reveals that developers adjusted wait times to address asynchronous wait flakiness in about 63% of cases (31 out of 49), even when the underlying causes lay elsewhere. By employing code similarity and history analysis, TRaf statically suggests more suitable wait times, reducing test execution time by 11.1% compared with developer-written values. When combined with dynamic refinement, TRaf achieves even greater gains (16.8%), offering developers a practical and automated alternative to manual tuning.
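To illustrate what asynchronous wait flakiness looks like in practice, the sketch below contrasts a fixed sleep with a bounded polling wait. It does not reproduce TRaf, which suggests wait times rather than replacing them; `wait_until` and `FakeWidget` are hypothetical names introduced only for this example.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float = 2.0,
               interval: float = 0.05) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class FakeWidget:
    """Hypothetical stand-in for an element that becomes ready after a delay."""

    def __init__(self, ready_after: float):
        self._ready_at = time.monotonic() + ready_after

    def is_ready(self) -> bool:
        return time.monotonic() >= self._ready_at


# A fixed sleep is flaky if it is too short and wasteful if it is too long.
# A bounded poll returns as soon as the condition holds.
fast = FakeWidget(ready_after=0.1)
fast_ok = wait_until(fast.is_ready, timeout=2.0)

# The timeout still bounds the wait when the condition never holds in time.
slow = FakeWidget(ready_after=5.0)
slow_ok = wait_until(slow.is_ready, timeout=0.1)
```

The timeout passed to such a helper is itself a wait time that someone has to choose, which is where tuning of the kind described above still matters.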

Finally, the dissertation investigates web-specific visual flakiness, an often-overlooked aspect of test flakiness. Like non-visual flakiness, these problems are common and call for dedicated investigation and practical solutions. Through a detailed empirical study, we categorize the causes of visual flakiness as structure-related (59.9%) or style-related (40.1%). The analysis of real-world fixes reveals recurring patterns in how developers address these failures, which can inform future tooling and practices. Overall, this dissertation advances the understanding of web front-end test flakiness and provides empirical insights and new tools for its localization and repair, laying the foundation for more reliable and sustainable web testing practices.
