
Argus Static Analysis Framework

Argus-SAF is a static analysis framework that we built in-house for security vetting of Android applications. It integrates two of our previously developed products, Argus-Jawa and Argus-Amandroid, and can perform comprehensive, efficient, and highly precise inter-component data flow analysis.

Argus-SAF origin

Argus-SAF is also known as Amandroid, first published at CCS’14 [pdf].

Argus-Jawa

Argus-Jawa is a general static analysis framework for our home-brewed intermediate representation (IR) language, Jawa. Any Java-like language (e.g., Java, Java bytecode, Dalvik bytecode) can be analyzed once it has been translated into Jawa.

It provides the ability to:

  1. Parse Jawa code.
  2. Load information from jar files and class files.
  3. Build ASTs for Jawa records (classes) and procedures (methods).
  4. Resolve the class hierarchy and the override relationships among class elements.
  5. Resolve virtual method invocations.

It can conduct/build:

  1. Call Graph
  2. Reaching Definition Analysis
  3. Points-to Analysis
  4. Monotonic Data Flow Analysis
  5. Reaching Facts Analysis
  6. Intra-/Inter- procedural Control Flow Graph
  7. Intra-/Inter- procedural Data Flow Graph
  8. Data Dependence Analysis
  9. Taint Analysis
  10. Side Effect Analysis

Argus-Amandroid

Amandroid meaning

Aman means "secure" in Indonesian, so Amandroid means "secure Android".

Overview

Amandroid is a static analysis framework for Android apps.

The Android platform is immensely popular. However, malicious or vulnerable applications have been reported to cause several security problems. Currently there is no effective method that a market operator can use to vet apps entering a market (e.g., Google Play).

Prior work that used static analysis to address Android app security mostly focused on specific problems and built specialized tools for them. We observe that a large portion of those security issues can be resolved by addressing one underlying core problem: capturing the semantic behaviors of the app, such as object points-to and control-/data-flow information. Thus, we designed a new approach to static analysis for vetting Android apps and built a generic framework, called Amandroid, which builds upon Argus-Jawa and performs flow- and context-sensitive data flow analysis in an inter-component way.

Our approach shows that a comprehensive static analysis of Android apps (one that tracks all objects) is feasible in terms of computational resources, and that the Amandroid framework is flexible and easy to extend for many types of specialized security analyses.

Since Amandroid directly handles inter-component control and data flows, it can be used to address security problems that result from interactions among multiple components, whether from the same app or from different apps. Amandroid's analysis is sound in that, under well-specified and reasonable assumptions about the Android runtime and its library, it can provide assurance of the absence of the specified security problems in an app.

On top of Amandroid we built several specific security analyses, for instance:

  1. Sensitive data flow tracking
  2. Data injection detection
  3. API misuse checking

We applied these analyses to hundreds of apps collected from Google Play’s popular apps and from a third-party security company. The results show that Amandroid is capable of finding real security issues and is efficient enough in terms of analysis time.

Amandroid Workflow

Figure: The pipeline of Amandroid framework.

Amandroid takes an Android APK x as input and then works as follows:

  1. Extract x, then pass the *.dex file to the Jawa Dedexer module and the other files (such as *.xml and resources.arsc) to the Preprocess module.
  2. Jawa Dedexer in the Dex2Jawa module decompiles the *.dex file into Jawa format. Parsers in the Preprocess module provide the app’s information to AppInfoCollector. Developers can specify what kind of information they are interested in, so that uninteresting apps can be ignored. Finally, the Preprocess module outputs the metadata of x.
  3. AndroidEnvironmentGenerator in EnvironmentBuilder takes all the source code and metadata from the previous step and builds an environment method for each component.
  4. DataFlowFramework provides data flow analysis techniques for examining data flow problems. AndroidReachingFactsAnalysis takes the environment methods as entry points and builds the IDFG. InterproceduralDataDependenceAnalysis takes the IDFG and builds the DDG. AndroidDataDependentTaintAnalysis takes the DDG and a SourceAndSinkManager (provided by the developer), performs taint analysis, and outputs the taint result.
  5. A developer-specified plugin receives all the results; the developer can then do further analysis or visualize them.

Please note that the source code and environments that appear above are all in Jawa format.

Getting Started

This section helps you get started with Argus-SAF.

Download

Requirement: Java 8

  1. Click: Download
  2. In arguslab bintray repo click Files > Version Folder
  3. Download argus-saf_***-version-assembly.jar

Run

To run Argus-SAF, in a terminal command prompt, type:

$ java -jar argus-saf_***-version-assembly.jar

The above command shows the usage of Argus-SAF:

Available Modes:
  a[picheck]    Detecting API misuse.
  d[ecompile]   Decompile Apk file(s).
  s[tage]       Stage middle results.
  t[aint]       Perform taint analysis on Apk(s).

There are several modes you can use. Taking taint analysis as an example, type:

$ java -jar argus-saf_***-version-assembly.jar t

It will show you the usage and available options:

usage: t[aint] [options] <file_apk/dir>
 -d,--debug            Output debug information.
 -f,--force            Force delete previous decompile result. [Default: false]
 -i,--ini <path>       Set .ini configuration file path.
 -mo,--module <name>   Taint analysis module to use. [Default: DATA_LEAKAGE, Choices: (COMMUNICATION_LEAKAGE,
                       OAUTH_TOKEN_TRACKING, PASSWORD_TRACKING, INTENT_INJECTION, DATA_LEAKAGE)]
 -o,--output <dir>     Set output directory. [Default: .]

Two notable options are -mo,--module and -i,--ini.

  1. -mo,--module sets the module to use in the analysis. It defaults to DATA_LEAKAGE detection; specify this option to switch to a different module.
  2. -i,--ini specifies a custom configuration file for the analysis. The details are discussed in Configuration File.

Test

To make sure Argus-SAF runs in your environment, you can execute it on our test apks, which you can download from ICC-Bench.

The command to run is:

$ java -jar argus-saf_***-version-assembly.jar t -o /outputPath /path/icc-bench

Install Amandroid Stash

If this is your first time using Argus-SAF, the above test command will automatically download and install Amandroid Stash under ~/.amandroid_stash. It contains the Android SDKs and configuration files that Argus-SAF’s analysis needs.

You can find more test apks in DroidBench.

Working with Argus-SAF

Argus-SAF has released two libraries: jawa-core and amandroid-core. Both are available in the Maven Central Repo.

As mentioned above, jawa-core contains all the static analysis APIs for analyzing Jawa, while amandroid-core contains all the Android-related analysis APIs and tools.

Obtain Argus-SAF as Library

You can obtain Argus-SAF as a library and use it in your own project to build new static analysis tools.

Here, we assume your project is an sbt project.

Depend on jawa-core by editing build.sbt:

libraryDependencies += "com.github.arguslab" %% "jawa-core" % VERSION

Depend on amandroid-core by editing build.sbt:

libraryDependencies += "com.github.arguslab" %% "amandroid-core" % VERSION

Note that:

1. Depending on amandroid-core automatically adds jawa-core as a dependency.

2. If your project uses Maven or Gradle as its build tool, translate the dependency into the corresponding format shown in the Maven Central Repo.

3. Replace VERSION with the current released version.

Build Project Based on Argus-SAF

Argus-SAF-playground is a project that has the basic setup for an Argus-SAF-based project, with demo code showing how to perform different kinds of analysis. Any project that wants to leverage Argus-SAF can fork this repo (or just learn the setup from it) and build on it.

Configuration File

By default, Argus-SAF uses ~/.amandroid_stash/amandroid/config.ini as its configuration file. However, users can provide their own configuration file as well.

The format is as follows:

; General configuration for amandroid
[general]
; Dependence directory for odex resolution.
;dependence_dir = /path
; Output debug information
debug = false
; Java Library jar files
;lib_files = /path/lib1.jar:/path/lib2.jar

; Configuration for data flow analysis
[analysis]
; Handle static initializer
static_init = false
parallel = false
; Context length for k-context sensitive analysis
k_context = 1
; Source and sink list file
;sas_file = /path/sas.txt
; timeout setting for analyzing one component (minutes)
timeout = 10

; Concurrent settings for Amandroid actors
[concurrent]
;actor_conf_file = /path/application.conf
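
To make the layout above concrete, here is a minimal sketch of reading such a file into a section -> key -> value map. It is not Argus-SAF's own loader: IniSketch and parse are hypothetical names, and the parser assumes only what the sample shows, namely [section] headers, key = value pairs, and comment lines starting with a semicolon.

```scala
object IniSketch {
  // Parse INI text into Map(section -> Map(key -> value)).
  // Lines starting with ';' are comments; blank lines are skipped.
  def parse(text: String): Map[String, Map[String, String]] = {
    val lines = text.split("\n").toList.map(_.trim)
    val start = ("", Map.empty[String, Map[String, String]])
    val (_, result) = lines.foldLeft(start) {
      case ((section, acc), line) =>
        if (line.isEmpty || line.startsWith(";")) {
          (section, acc)
        } else if (line.startsWith("[") && line.endsWith("]")) {
          // A new section header; register it even if it stays empty.
          val name = line.substring(1, line.length - 1).trim
          (name, acc.updated(name, acc.getOrElse(name, Map.empty[String, String])))
        } else {
          val idx = line.indexOf('=')
          if (idx < 0) (section, acc) // not a key = value line; ignore it
          else {
            val key = line.substring(0, idx).trim
            val value = line.substring(idx + 1).trim
            val entries = acc.getOrElse(section, Map.empty[String, String])
            (section, acc.updated(section, entries.updated(key, value)))
          }
        }
    }
    result
  }
}
```

Commented-out keys such as ;sas_file are simply absent from the result, so a caller can fall back to a default with getOrElse.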

Tutorial: Load APK

Your project can be written in either Java or Scala; this tutorial uses Scala for demonstration.

Step by Step

First of all, make sure your project has amandroid-core as a dependency.

Then, the following steps decompile an apk file and load all of its classes and resources.

A. Prepare DecompilerSettings. It defines the decompile layout, the message level, the odex dependence files, and whether to force-delete the output folder if it already exists.

// `5 minutes` below relies on the standard Scala duration DSL.
import scala.concurrent.duration._
import scala.language.postfixOps

val outputUri = FileUtil.toUri(outputPath)
val layout = DecompileLayout(outputUri)
val settings = DecompilerSettings(
  AndroidGlobalConfig.settings.dependence_dir.map(FileUtil.toUri),
  dexLog = false, debugMode = false, removeSupportGen = true,
  forceDelete = true, Some(new DecompileTimer(5 minutes)), layout)

B. Decompile the apk.

val apkUri = FileUtil.toUri(apkPath)
val (outUri, srcs, _) = ApkDecompiler.decompile(apkUri, settings)

C. Initialize ApkGlobal, load jawa code and collect info. ApkGlobal is the apk resource manager, class loader and class path manager for our analysis.

val reporter = new PrintReporter(MsgLevel.ERROR)
val apk = new ApkGlobal(ApkModel(apkUri, outUri, srcs), reporter)
srcs foreach {
  src =>
    val fileUri = FileUtil.toUri(FileUtil.toFilePath(outUri) + File.separator + src)
    if(FileUtil.toFile(fileUri).exists()) {
      //store the app's jawa code in AmandroidCodeSource which is organized class by class.
      apk.load(fileUri, Constants.JAWA_FILE_EXT, AndroidLibraryAPISummary)
    }
}
AppInfoCollector.collectInfo(apk, global, outUri)

Load Apk Using ApkYard

ApkYard is a class that allows loading multiple apks and enables inter-app analysis. To learn how to do inter-app analysis using ApkYard, see the Inter-app Analysis tutorial.

val apkUri = FileUtil.toUri(apkPath)
val outputUri = FileUtil.toUri(outputPath)
val reporter = new PrintReporter(MsgLevel.ERROR)
val yard = new ApkYard(reporter)
val layout = DecompileLayout(outputUri)
val settings = DecompilerSettings(
  AndroidGlobalConfig.settings.dependence_dir,
  dexLog = false, debugMode = false, removeSupportGen = true,
  forceDelete = forceDelete, None, layout)
val apk = yard.loadApk(apkUri, settings)

Retrieve Information from Apk

val appName = apk.getAppName
val certificate = apk.getCertificates
val uses_permissions = apk.getUsesPermissions
val component_infos = apk.getComponentInfos // ComponentInfo(compType: [class type], typ: [ACTIVITY, SERVICE, RECEIVER, PROVIDER], exported: Boolean, enabled: Boolean, permission: ISet[String])
val intent_filter = apk.getIntentFilterDB // IntentFilterDB contains intent filter information for each component.

Access Environment Methods

var entryPoints = global.getEntryPoints(AndroidConstants.MAINCOMP_ENV) // Exposed components

if(!public_only)
  entryPoints ++= global.getEntryPoints(AndroidConstants.COMP_ENV) // Private components

Full Example

The full example can be found at Argus-SAF-playground:LoadApk.

Tutorial: Load Class, Field, Method

Suppose our apk has the following class:

package org.argus.test;

public class Hello {
    int i;
    public void greeting() {
        System.out.println("Hello World!");
    }
}

To load the Hello class and access its attributes:

A. Load the APK by following the previous tutorial.

B. Do the following:

val clazz: JawaClass = global.getClassOrResolve(new JawaType("org.argus.test.Hello"))
val method_opt: Option[JawaMethod] = clazz.getDeclaredMethodByName("greeting")
val field_opt: Option[JawaField] = clazz.getDeclaredField("i")

From JawaClass, JawaMethod, and JawaField you can access their access flags, qualified names, override information, etc. For detailed usage, refer to the source code.

Tutorial: Generate Graphs

This tutorial shows how to generate a Control Flow Graph, a Reaching Definition Analysis, and a Call Graph, among other graphs.

Control Flow Graph

A Control Flow Graph can be easily acquired from JawaAlirInfoProvider with a JawaMethod.

val method: JawaMethod = clazz.getDeclaredMethodByName("greeting").get
val cfg = JawaAlirInfoProvider.getCfg(method)

Reaching Definition Analysis

Reaching Definition Analysis can be easily acquired from JawaAlirInfoProvider with JawaMethod and Control Flow Graph.

val method: JawaMethod = clazz.getDeclaredMethodByName("greeting").get
val cfg = JawaAlirInfoProvider.getCfg(method)
val rda = JawaAlirInfoProvider.getRda(method, cfg)
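
For readers who want to see what a reaching definition analysis computes, the following is a small self-contained sketch on a toy control flow graph. It does not use any Jawa types: Node, Def, and reachingIn are hypothetical names, each node is assumed to define at most one variable, and every successor id is assumed to refer to an existing node. The result maps each node to the set of (variable, defining node) pairs that may reach its entry.

```scala
object ReachingDefsSketch {
  // A node optionally defines one variable; succs lists successor node ids.
  final case class Node(id: Int, defines: Option[String], succs: List[Int])

  // A definition is identified by (variable, id of the defining node).
  type Def = (String, Int)

  // Returns, for every node id, the definitions that may reach its entry.
  def reachingIn(nodes: List[Node]): Map[Int, Set[Def]] = {
    val byId = nodes.map(n => n.id -> n).toMap
    val preds: Map[Int, List[Int]] =
      nodes.flatMap(n => n.succs.map(s => s -> n.id))
        .groupBy(_._1)
        .map { case (k, v) => k -> v.map(_._2) }

    // Transfer function: kill older definitions of the variable, add this one.
    def out(n: Node, in: Set[Def]): Set[Def] = n.defines match {
      case Some(v) => in.filterNot(_._1 == v) + ((v, n.id))
      case None    => in
    }

    var inSets: Map[Int, Set[Def]] = nodes.map(n => n.id -> Set.empty[Def]).toMap
    var worklist: List[Int] = nodes.map(_.id)
    while (worklist.nonEmpty) {
      val id = worklist.head
      worklist = worklist.tail
      // Entry set = union of the exit sets of all predecessors.
      val newIn: Set[Def] =
        preds.getOrElse(id, Nil).flatMap(p => out(byId(p), inSets(p))).toSet
      if (newIn != inSets(id)) {
        inSets = inSets.updated(id, newIn)
        // Successors must be revisited because their input changed.
        worklist = worklist ++ byId(id).succs.filterNot(worklist.contains)
      }
    }
    inSets
  }
}
```

The loop is the usual worklist iteration: the sets only grow, so it stops once no node's entry set changes.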

Inter-procedural Data Flow Graph

An Inter-procedural Data Flow Graph (IDFG) combines an Inter-procedural Control Flow Graph (ICFG) with a points-to information map that records, for each program point, the possible object types.

We have two points-to analysis algorithms for building an IDFG: InterproceduralSuperSpark and AndroidReachingFactsAnalysis.

Most of the time, InterproceduralSuperSpark is used just to build a Call Graph efficiently, because it is more lightweight than AndroidReachingFactsAnalysis yet still preserves enough precision (flow-, object-, and field-sensitive). We discuss this in the Call Graph section.

This tutorial shows how to build an IDFG using AndroidReachingFactsAnalysis:

A. Configuration. AndroidReachingFactsAnalysisConfig contains the following global variables:

  1. resolve_icc: controls whether to find ICC call targets and pass points-to facts to the target component.
  2. resolve_static_init: controls whether to handle static initializers during the analysis. (We recommend turning this off, as it is very time consuming.)
  3. parallel: controls whether to run the analysis in parallel mode. (We don’t suggest turning this on, as we have a more robust Akka Actor solution.)

Whether to Resolve ICC

We introduced a Component Based Analysis approach to handle ICC in a more scalable way. Thus, if you are using this approach, you should turn resolve_icc off.

Setting these variables is simple:

AndroidReachingFactsAnalysisConfig.resolve_icc = false
AndroidReachingFactsAnalysisConfig.resolve_static_init = false
AndroidReachingFactsAnalysisConfig.parallel = false

B. Load the APK by following the previous tutorial.

C. Perform analysis:

implicit val factory = new RFAFactFactory
// ep is the entry point method for the analysis. Most of the time it is the environment method we generated for each component.
val initialfacts = AndroidRFAConfig.getInitialFactsForMainEnvironment(ep)
val timeout = Some(new MyTimeout(AndroidGlobalConfig.settings.timeout minutes))
val idfg = AndroidReachingFactsAnalysis(global, apk, ep, initialfacts, new ClassLoadManager, timeout)

Inter-procedural Data Dependence Graph

val iddResult = InterproceduralDataDependenceAnalysis(global, idfg)

Call Graph

There are a few algorithms we can use to build a Call Graph: InterproceduralSuperSpark, SignatureBasedCallGraph, AndroidReachingFactsAnalysisBuilder, etc.

InterproceduralSuperSpark is the best option if you want to build a Call Graph efficiently as well as preserve enough precision.

// methods are the entry point methods from which to start building the call graph.
val idfg = InterproceduralSuperSpark(global, methods.map(_.getSignature))
val icfg = idfg.icfg
val call_graph = icfg.getCallGraph

Output Graphs in Different Format

Our generated graphs support three output formats: Dot, GraphML, and GML.

graph.toDot(writer)
graph.toGraphML(writer)
graph.toGML(writer)
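
As an illustration of what Dot output looks like, here is a minimal sketch that writes a toy edge list in Graphviz DOT syntax to a java.io.Writer. It does not use Argus-SAF's graph classes: DotSketch, toDot, and render are hypothetical names, and the exact layout that the framework's own toDot produces may differ.

```scala
import java.io.{StringWriter, Writer}

object DotSketch {
  // Quote an identifier so that DOT accepts arbitrary method names.
  private def quote(s: String): String =
    "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

  // Write a directed graph, given as (from, to) pairs, in DOT syntax.
  def toDot(name: String, edges: Seq[(String, String)], writer: Writer): Unit = {
    writer.write("digraph " + quote(name) + " {\n")
    edges.foreach { case (from, to) =>
      writer.write("  " + quote(from) + " -> " + quote(to) + ";\n")
    }
    writer.write("}\n")
    writer.flush()
  }

  // Convenience wrapper that renders the graph to a String.
  def render(name: String, edges: Seq[(String, String)]): String = {
    val sw = new StringWriter
    toDot(name, edges, sw)
    sw.toString
  }
}
```

Passing a java.io.FileWriter instead of a StringWriter would produce a file that Graphviz tools can render.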

Tutorial: Taint Analysis

Argus-SAF’s taint analysis leverages our Inter-procedural Reaching Facts Analysis and Inter-procedural Data Dependence Analysis algorithms, which were reported in the Amandroid [pdf] paper.

Step by Step

A. Perform Inter-procedural Data Flow Analysis and Inter-procedural Data Dependence Analysis to generate IDFG and IDDG.

B. Provide a Source and Sink Manager for the taint analysis. Argus-SAF currently has five built-in managers:

  1. IntentInjectionSourceAndSinkManager
  2. PasswordSourceAndSinkManager
  3. OAuthSourceAndSinkManager
  4. DataLeakageAndroidSourceAndSinkManager
  5. CommunicationSourceAndSinkManager

val ssm = module match {
  case INTENT_INJECTION =>
    new IntentInjectionSourceAndSinkManager(global, apk, apk.getLayoutControls, apk.getCallbackMethods, AndroidGlobalConfig.settings.sas_file)
  case PASSWORD_TRACKING =>
    new PasswordSourceAndSinkManager(global, apk, apk.getLayoutControls, apk.getCallbackMethods, AndroidGlobalConfig.settings.sas_file)
  case OAUTH_TOKEN_TRACKING =>
    new OAuthSourceAndSinkManager(global, apk, apk.getLayoutControls, apk.getCallbackMethods, AndroidGlobalConfig.settings.sas_file)
  case DATA_LEAKAGE =>
    new DataLeakageAndroidSourceAndSinkManager(global, apk, apk.getLayoutControls, apk.getCallbackMethods, AndroidGlobalConfig.settings.sas_file)
  case COMMUNICATION_LEAKAGE =>
    new CommunicationSourceAndSinkManager(global, apk, apk.getLayoutControls, apk.getCallbackMethods, AndroidGlobalConfig.settings.sas_file)
}

You can also provide your own Source and Sink Manager by following the Customize Source and Sink Manager tutorial.

C. Perform taint analysis:

val taint_analysis_result = AndroidDataDependentTaintAnalysis(global, iddResult, idfg.ptaresult, ssm)
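
Conceptually, this step looks for data dependence paths that start at a source and end at a sink. The sketch below shows that idea on a toy graph with string nodes. It is a simplification under stated assumptions: TaintSketch and taintPaths are hypothetical names, the edge map stands in for the IDDG, and the real analysis additionally consults points-to results and the Source and Sink Manager.

```scala
object TaintSketch {
  // edges(n) lists the nodes that depend on data produced at n.
  // Returns one shortest path per (source, reachable sink) pair.
  def taintPaths(
      edges: Map[String, List[String]],
      sources: Set[String],
      sinks: Set[String]): Set[List[String]] = {

    // Breadth-first search from one source, remembering each node's parent.
    def from(source: String): Set[List[String]] = {
      var parent = Map.empty[String, String]
      var visited = Set(source)
      var queue = List(source)
      while (queue.nonEmpty) {
        val n = queue.head
        queue = queue.tail
        for (m <- edges.getOrElse(n, Nil) if !visited(m)) {
          visited += m
          parent += (m -> n)
          queue = queue :+ m
        }
      }
      // Rebuild the path by walking parents back to the source.
      def pathTo(n: String): List[String] = {
        var path = List(n)
        while (path.head != source) path = parent(path.head) :: path
        path
      }
      (visited & sinks).map(pathTo)
    }

    sources.flatMap(from)
  }
}
```

For example, with edges getDeviceId -> concat -> Log.d, a source set containing getDeviceId, and a sink set containing Log.d, the result holds the single path getDeviceId, concat, Log.d.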

Customize Source and Sink Manager

Source and Sink Manager can specify four kinds of source points and two kinds of sink points.

Source points:

  1. API Source: The given API signature returns tainted data.
  2. Callback Source: The given callback method receives tainted data as a parameter.
  3. UI Source: The given UI component contains tainted data.
  4. ICC Source: The given component environment method receives tainted data.

Sink points:

  1. API Sink: The given API signature leaks its parameters.
  2. ICC Sink: The given ICC method leaks the Intent.

API Sources and API Sinks can both be specified in a Source and Sink File using the following format:

Landroid/telephony/TelephonyManager;.getDeviceId:()Ljava/lang/String; SENSITIVE_INFO -> _SOURCE_
Landroid/content/pm/PackageManager;.queryBroadcastReceivers:(Landroid/content/Intent;I)Ljava/util/List; SENSITIVE_INFO -> _SOURCE_
Landroid/os/Handler;.obtainMessage:(ILjava/lang/Object;)Landroid/os/Message; MESSAGE -> _SOURCE_
Landroid/util/Log;.d:(Ljava/lang/String;Ljava/lang/String;)I -> _SINK_
Ljava/io/Writer;.write:(Ljava/lang/String;II)V -> _SINK_ 1
Ljava/net/URLConnection;.setRequestProperty:(Ljava/lang/String;Ljava/lang/String;)V -> _SINK_ 1|2

Note that 1|2 in the above format means that the first and second parameters will leak the data; no number means all parameters.
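
To make the line format concrete, here is a minimal parsing sketch. It is not the parser Argus-SAF ships: SasLineSketch, Source, Sink, and parseLine are hypothetical names, and the grammar is inferred only from the sample lines above (a signature, optional tags, an arrow, then _SOURCE_ or _SINK_ with optional parameter positions separated by |).

```scala
object SasLineSketch {
  sealed trait Entry
  final case class Source(signature: String, tags: Set[String]) extends Entry
  // positions is empty when no numbers are listed, meaning all parameters.
  final case class Sink(signature: String, positions: Set[Int]) extends Entry

  // Parse one line of a Source and Sink File; None if the line does not fit.
  def parseLine(line: String): Option[Entry] =
    line.trim.split("\\s*->\\s*", 2) match {
      case Array(left, right) =>
        val leftParts = left.trim.split("\\s+").toList
        val rightParts = right.trim.split("\\s+").toList
        (leftParts, rightParts) match {
          case (sig :: tags, "_SOURCE_" :: _) if sig.nonEmpty =>
            Some(Source(sig, tags.toSet))
          case (sig :: _, "_SINK_" :: rest) if sig.nonEmpty =>
            val positions = rest
              .flatMap(_.split("\\|").toList)
              .filter(p => p.nonEmpty && p.forall(_.isDigit))
              .map(_.toInt)
              .toSet
            Some(Sink(sig, positions))
          case _ => None
        }
      case _ => None
    }
}
```

Returning an Option keeps malformed lines from aborting the whole file; a caller can collect the None cases and report them.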

Your Source and Sink Manager needs to extend SourceAndSinkManager, AndroidSourceAndSinkManager, or DefaultAndroidSourceAndSinkManager. Here we take IntentInjectionSourceAndSinkManager as an example:

package org.argus.amandroid.plugin.dataInjection

import org.argus.amandroid.alir.pta.reachingFactsAnalysis.model.InterComponentCommunicationModel
import org.argus.amandroid.alir.taintAnalysis.AndroidSourceAndSinkManager
import org.argus.amandroid.core.Apk
import org.argus.amandroid.core.parser.LayoutControl
import org.argus.jawa.alir.controlFlowGraph.{ICFGInvokeNode, ICFGNode}
import org.argus.jawa.alir.pta.PTAResult
import org.sireum.pilar.ast._
import org.sireum.util._
import org.argus.jawa.core._

/**
 * @author Fengguo Wei
 * @author Sankardas Roy
 */
class IntentInjectionSourceAndSinkManager(
    global: Global,
    apk: Apk,
    layoutControls: Map[Int, LayoutControl],
    callbackSigs: ISet[Signature],
    sasFilePath: String)
    extends AndroidSourceAndSinkManager(global, apk, layoutControls, callbackSigs, sasFilePath){

  // We are only interested in ICC sources, so for API sources we just return false.
  override def isSource(calleeSig: Signature, callerSig: Signature, callerLoc: JumpLocation): Boolean = {
    false
  }

  // We are only interested in ICC sources, so for callback sources we just return false.
  override def isCallbackSource(sig: Signature): Boolean = {
    false
  }

  // We are only interested in ICC sources, so for UI sources we just return false.
  override def isUISource(calleeSig: Signature, callerSig: Signature, callerLoc: JumpLocation): Boolean = {
    false
  }

  // If the given point is environment method's entry node, we consider it as icc source.
  override def isIccSource(entNode: ICFGNode, iddgEntNode: ICFGNode): Boolean = {
    entNode == iddgEntNode
  }

  // if the given point is an ICC call, we mark it as icc sink.
  override def isIccSink(invNode: ICFGInvokeNode, ptaresult: PTAResult): Boolean = {
    var sinkflag = false
    val calleeSet = invNode.getCalleeSet
    calleeSet.foreach{
      callee =>
        if(InterComponentCommunicationModel.isIccOperation(callee.callee)){
          sinkflag = true
        }
    }
    sinkflag
  }

  // The API sink check uses the default implementation in AndroidSourceAndSinkManager.
  // The basic idea is to check whether the given API signature matches one of the API sinks specified in the provided sasFile (Source and Sink File).
}

Tutorial: Inter-app Analysis

val fileUris = apkFiles.map(FileUtil.toUri)
val outputUri = FileUtil.toUri(outputPath)
val reporter = new PrintReporter(MsgLevel.ERROR)
val res = TaintAnalysisTask(TaintAnalysisModules.DATA_LEAKAGE, fileUris, outputUri, forceDelete = true, reporter).run

To customize the inter-app analysis you can check the code at Argus-SAF:TaintAnalysisTask.