PhD Defence: Instruction coverage for Android app testing and tuning

Please click on this link to both register and connect on the day of the event.

Meeting number (access code): 163 680 8550
Meeting password: gYXVMyd4V85

Members of the defence committee:

Chair: Prof. Dr Yves Le Traon, University of Luxembourg
Vice-chair: Dr Olga Gadyatskaya, Leiden University, The Netherlands
Supervisor: Prof. Dr Sjouke Mauw, University of Luxembourg
Member: Prof. Dr Pascal Bouvry, University of Luxembourg
Member: Prof. Dr Yang Liu, Nanyang Technological University, Singapore

For many people, mobile apps have already become an indispensable part of modern life. Apps entertain, educate, assist us in our daily routines and help us connect with others. However, the advanced capabilities of modern devices and the sensitive user data they hold also make mobile devices an attractive attack target. To gain access to sensitive data, adversaries tend to conceal malicious functionality in freely distributed, legitimate-looking apps.

The problem of low-quality and malicious apps, spreading at an enormous scale, is especially relevant for one of the biggest software repositories – Google Play. The Android apps distributed through this platform undergo a validation process by Google. However, this validation is insufficient to confirm that the apps are benign. To identify dangerous apps, novel frameworks for testing and app analysis are being developed by the Android community.

Code coverage is one of the most common metrics for evaluating the effectiveness of these frameworks, and it is used as an internal metric to guide code exploration in some of them. However, when analyzing apps without source code, the Android community relies mostly on method coverage since there are no reliable tools for measuring finer-grained code coverage in 3rd-party Android app testing.

Another stumbling block for testing frameworks is the inability to test an app exhaustively. While code coverage measurement can indicate an improvement in testing, it is possible neither to reach 100% coverage nor to identify the maximum reachable coverage value for the app. Even after testing, the app still contains large amounts of unexecuted code, which makes it impossible to confirm the absence of potentially malicious code in the untested part of the app. Existing static debloating approaches aim at app size minimization rather than security and simply remove unreachable code. However, there is currently no approach to debloat apps based on dynamic analysis information, i.e. to cut out code that was not executed.

In this dissertation, we address these two problems by proposing, first, an efficient approach and a tool to measure code coverage at the instruction level, and second, a dynamic binary shrinking methodology for deleting unexecuted code from the app. We support our solutions with the following contributions:

An instrumentation approach to measuring code coverage at the instruction level. Our technique instruments the smali representation of Android bytecode to allow code coverage measurement at the finest level.
An implementation of the instrumentation approach. ACVTool is a self-contained package containing 4K lines of Python code. It is publicly available and can be integrated into different testing frameworks.
An extensive empirical evaluation that shows the high reliability and versatility of our approach. ACVTool successfully executes on 96.9% of apps from our dataset, introduces negligible instrumentation-time and runtime overhead, and its results are consistent with those of the JaCoCo (source code coverage) and Ella (method coverage) tools.
A detailed study on the influence of code coverage metric granularity on automated testing. We demonstrate the usefulness of ACVTool for automated testing techniques that rely on code coverage data in their operation.
A dynamic debloating approach based on ACVTool instruction coverage. We propose the Dynamic Binary Shrinking System, a novel methodology for shrinking third-party Android apps towards the benign functionality observed in executed code.
An implementation of the dynamic debloating technique, incorporated into the ACVCut tool. The tool demonstrates the viability of the Dynamic Binary Shrinking System on two examples. It allows us to cut out unexecuted code and thus provide 100% instruction coverage on explored app behaviours.
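
To make the difference between the two coverage granularities and the idea of shrinking concrete, the following is a minimal illustrative sketch. It is not ACVTool or ACVCut code: the `Method` type, the function names, the toy instruction strings and the set of executed probes are all hypothetical, and a real shrinker must also keep the remaining bytecode valid (branch targets, exception handlers, and so on), which this sketch ignores.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

# Hypothetical toy model: a method is a name plus a sequence of instructions.
@dataclass(frozen=True)
class Method:
    name: str
    instructions: Tuple[str, ...]

# A probe is identified by (method name, instruction index).
Probe = Tuple[str, int]

def method_coverage(methods: List[Method], executed: Set[Probe]) -> float:
    """Fraction of methods in which at least one instruction was executed."""
    hit = sum(
        1 for m in methods
        if any((m.name, i) in executed for i in range(len(m.instructions)))
    )
    return hit / len(methods)

def instruction_coverage(methods: List[Method], executed: Set[Probe]) -> float:
    """Fraction of all instructions that were executed."""
    total = sum(len(m.instructions) for m in methods)
    hit = sum(
        1 for m in methods
        for i in range(len(m.instructions))
        if (m.name, i) in executed
    )
    return hit / total

def shrink(methods: List[Method], executed: Set[Probe]) -> List[Method]:
    """Keep only executed instructions and drop methods that were never entered."""
    kept = []
    for m in methods:
        body = tuple(
            ins for i, ins in enumerate(m.instructions) if (m.name, i) in executed
        )
        if body:
            kept.append(Method(m.name, body))
    return kept

# Toy app: one method partially executed, one never executed.
app = [
    Method("onCreate", ("const", "invoke", "if-eqz", "invoke", "return")),
    Method("sendSms", ("const", "invoke", "return")),
]
executed = {("onCreate", 0), ("onCreate", 1), ("onCreate", 4)}

print(method_coverage(app, executed))       # 1 of 2 methods entered
print(instruction_coverage(app, executed))  # 3 of 8 instructions executed
print(shrink(app, executed))                # only the executed part of onCreate remains
```

In this toy example, method coverage reports half of the app as covered although only three of eight instructions ran, which is the gap that instruction-level measurement exposes. After shrinking, every remaining instruction is one that was observed during testing.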
