Android validation dataset

We believe that Android apps can be related to one another in different ways. These relationships depend on how the apps were created and on how much of the code and the resources they share.

We built DroidKin, a system that finds these relationships between apps based on their metadata, resources and code. To validate our approach, we created the Android validation dataset. It consists of 72 original apps from different origins, plus 10 transformations applied to each app, for a total of 792 apps (72 originals and 720 transformed variants) with different relationships between them.

The transformations include the following operations: inserting junk code, inserting junk files, replacing icons, replacing files, changing the alignment, replacing strings, and more.
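
To give a feel for what one of these operations involves, the sketch below shows a minimal "insert junk files" transformation in Python. It is an illustration only, not the tooling used to build the dataset: it relies on the fact that an APK is a ZIP archive, and the function name insert_junk_file, its parameters and the entry name assets/junk.bin are all hypothetical.

```python
import os
import zipfile


def insert_junk_file(src_apk, dst_apk, junk_name="assets/junk.bin", size=1024):
    """Copy every entry of src_apk into dst_apk, then append one junk entry.

    Hypothetical helper for illustration; not part of DroidKin.
    """
    with zipfile.ZipFile(src_apk) as src, zipfile.ZipFile(dst_apk, "w") as dst:
        if junk_name in src.namelist():
            raise ValueError(f"{junk_name} already exists in {src_apk}")
        for item in src.infolist():
            # Passing the ZipInfo keeps each entry's name and compression type.
            dst.writestr(item, src.read(item.filename))
        # The junk entry holds random bytes that the app never references.
        dst.writestr(junk_name, os.urandom(size))
```

A call such as insert_junk_file("original.apk", "junk.apk") would produce a variant whose code and resources match the original while the archive contents differ. An APK modified this way would normally need to be re-signed before it can be installed.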

For more details on the relationships, the transformations and the results, see:

  • Gonzalez, Hugo, Natalia Stakhanova, and A. Ghorbani. "DroidKin: Lightweight detection of Android apps similarity." Proceedings of the 10th SECURECOMM (2014).

