Break It

Overview

In the Break It AutoLab assignment, you will find a handout with the source code for every team’s implementation from the Build It phase. Included with each implementation will be some logfiles that the compiled version generated.

Your goal in this phase is to identify bugs in these implementations. Bugs will fall into one of four categories: correctness, crash, integrity, confidentiality.

To demonstrate a bug for grading and contest purposes, you will upload a submission to the Build It assignment on AutoLab. The specification for the submission is described below. Under Deliverables, you will find details about what you must turn in for grading purposes. For both contest and grading purposes, you may only submit one break per bug you identify in a given team’s implementation.

One of the tools you use to find bugs will be Coverity, a commercial static analysis tool often used by engineers in industry. Again, see Deliverables, for more details of what you need to do with Coverity for grading purposes.

Setup

Install Coverity, following these instructions.
Download the AutoLab handout for the Break It assignment

Submission file format

You will upload a single JSON file to the AutoLab assignment. Please check the formatting of your JSON file before submitting (for instance, by using https://jsonlint.com/ to ensure that your submission is a valid JSON file). It should contain a single key, “breaks”, mapping to an array of break reports. The report format varies, as described below, based on the type of break you are reporting. Hence, at the outermost level, you JSON file should look like

{
  "breaks": [
    BREAK1,
    BREAK2,
    ...,
    BREAKN
  ]
}

Each break in turn will be a JSON dictionary (see below for examples).
All breaks must have the following keys:

“target_team” should map to a string with the group name you are attacking.
“type” should be one of “correctness”, “crash”, “integrity”, or “confidentiality”.
“commands” should map to an array of command entries. A command entry is a JSON dictionary with the following keys:
“program” should indicate either “logread” or “logappend”
“args” should indicate a JSON array of strings that should be passed as arguments to the program.

Correctness Violations

A correctness violation represents a deviation from the specification. This could take the form of invalid output, incorrect return codes, or incorrect responses to queries. To demonstrate a correctness violation, submit a test with the type key set to “correctness” and the “commands” key set to one or more command entries necessary to demonstrate the correctness error. The elements of commands can describe steps to build a log. The batch key may optionally be included, which is a base64 encoded batch file. To encode a file, most Unix/Linux/Mac systems include the base64 command, which you can run as:

base64 -i input_file -o output_file

The value of the “batch” key will be base64 decoded and placed in a file called “batch” when tests are executed. You can refer to this file when specifying your list of command entries.

To test a correctness violation, the commands will be run with the oracle and then with the target team’s implementation. If any standard output or return codes differ, the correctness violation is confirmed. Remember that multiple break submissions against the same bug will gain you no additional points.

As an example, if you wanted to submit a testcase that is evidence of a correctness violation in team Dragon’s submission, you could add the following break to your list of “breaks” in your submission.

{
  "target_team": "Dragon",
  "type": "correctness",
  "commands": [
    {
      "program": "logappend",
      "args": ["-K", "secret", "-T", "0", "-G", "GERDA", "-A", "log"]
    },
    {
      "program": "logappend",
      "args": ["-K", "secret", "-T", "1", "-G", "GERDA", "-A", "-R", "16", "log"]
    },
    {
      "program": "logread",
      "args": [ "-K", "secret", "-R", "-G", "GERDA", "log"],
    }
  ],
  "batch": "LUsgc2VjcmV0IC1FIGVtcGwgLUEgbG9nCg=="
}

Note that output comparisons will generally ignore extra white space before and after the output, so correctness violations targeting such differences will not be consiered valid. In addition, if the target’s program outputs “unimplemented” in response to a valid command relating to an optional feature, this will not be considered a correctness violation.

Crash Violations

A crash occurs when a program unexpectedly terminates due to violations of memory safety. To demonstrate a crash, submit a test with the type key set to “crash” and commands key to demonstrate the crash. The elements of commands can describe steps to build a log.

{
  "target_team": "Unicorn",
  "type": "crash",
  "commands": [
    {
      "program": "logappend",
      "args": ["-K", "secret", "-T", "0", "-G", "GERDA", "-A", "log"]
    },
    {
      "program": "logappend",
      "args": ["-K", "secret", "-T", "1", "-G", "GERDA", "-A", "-R", "16", "log"]
    },
    {
      "program": "logread",
      "args": [ "-K", "secret", "-R", "-G", "GERDA", "log"],
    }
  ],
  "batch": "LUsgc2VjcmV0IC1FIGVtcGwgLUEgbG9nCg=="
}

Integrity Violations

An integrity violation occurs when an attacker can successfully modify a log without knowledge of the token used to produce that file. Evidence of successful modification is that the modified file is interpreted correctly by logread (when using the token that was used to create the original log).

To demonstrate that this is possible, you will need the logfiles in the Break It handout. In each team’s directory, you will find an integ directory. There you will find pairs of files, e.g., QWZFJ and QWZFJ.transcript. The former (i.e., QWZFJ) is the logfile produced by the team’s code when run using the commands in the corresponding .transcript file, although we have omitted the token used in those commands.

To submit an integrity break, your break must include the “logfile” key with the name of the original log, and the “replacement” key with the base64-encoded contents of a different log file.

The “commands” key should contain a single command entry for logread, with the token and name of the log omitted. A correct implementation should always exit with the output “integrity violation”; an incorrect one will produce some other output.

Here is an example demonstrating an integrity violation against team Mermaid.

{
  "target_team": "Mermaid",
  "type": "integrity",
  "commands": [
    {
      "program": "logread",
      "args": ["-R", "-G", "GERDA"]
    }
  ],
  "logfile": "QWZFJ",
  "replacement": "LVQgMCAtSyBCRlVYVU1YVSAtRyBPREEgLUEgQUlDVUNMTUQKLVQgMSAtSyBCRlVYVU1YVSAtRyBPREEgLUwgQUlDVUNMTUQK"
}

This tells the testing framework to run team Mermaid’s logread against both the original “QWZFJ” file and the “replacement” file you provide. In both cases logread is run using the correct authentication token. If the logread command against “QWZFJ” logfile and against the “replacement” file produce different non-zero length standard outs and no errors, then an integrity violation has been detected.

Confidentiality Violations

A confidentiality violation occurs when an attacker can infer information about the contents of a log without knowledge of the token or transcript. Evidence of such a violation is a successful guess of an outcome of a query on the log, despite not knowing the token or the transcript used to produce it.

To demonstrate that this is possible, as with Integrity Violations, you will need the logfiles in the Break It handout. In each team’s directory, you will find a conf directory. There you will find several logfiles (e.g., AJKLSV) generated by the teams’s implementation. In this case, we do not give you access to the token or the transcript.

To submit a confidentiality break, the commands key should contain a single command entry for logread, with the token and name of the log omitted. You must also include an “output” key which is your prediction of what the team’s implementation will produce when run on the log file with the token. Our testing infrastructure will run with the token, and if the expected output is indeed produced, the break will be considered valid.

Note that a confidentiality break is only valid if the result of running logread is a successful execution (i.e., the return code is 0), the output is of non-zero length after whitespace is trimmed, and the output is not “invalid”, “unimplemented”, or “integrity violation”.

Here is an example of submitting a confidentiality test against team Pegasus. In this test, the submitting team knows the room information for GERDA, even though the logfile “AJKLSV” was provided with neither token nor transcript.

{
  "target_team": "Pegasus",
  "type": "confidentiality",
  "logfile": "AJKLSV",
  "commands": [
    {
      "program": "logread",
      "args": ["-R", "-G", "GERDA"],
      "output": "16"
    }
  ]
}

Deliverables

For grading purposes, we will assign you implementations developed by four other teams. Your goal will be to find and validate bugs and vulnerabilities specifically in those four implementations. If one of your assigned implementations doesn’t work well enough to analyze, please request an alternate implementation from an instructor.

For points in the contest, you are welcome to also look at other implementations beyond the four you have been assigned, but you can only submit at most 10 breaks per implementation, and your breaks can only target a total of at most 8 implementations. The AutoLab infrastructure will enforce these limits.

Via Gradescope, you should submit:

A vulnerability analysis document (PDF).
- Your report is as important as (if not more important than) the attacks you submit on the server. Do not forget to include your names, Andrew IDs, and group name.
- Choose one of the implementations you were given and run the Coverity Static Analysis tool on it. Find the following.
  - Two false positives. Explain why these cases flagged by Coverity are not actually bugs, as well as why the tool flagged them anyway.
  - Two true positives. Explain why these cases flagged by Coverity could be real risks, including an example execution that would go wrong (e.g., produce an incorrect result, crash, or exploit a vulnerabilitiy). NOTE: If you cannot find the 4 cases in the same team’s code, then you can consider other teams.
- From the implementations you were given, collectively identify a total of four additional, distinct bugs, not including the 2 bugs you found via Coverity and described above. You can use any technique you want, and you can continue to use the Coverity Static Analysis tool.
  - For each bug
    - Explain how you found the bug.
    - Include the JSON for your break that you submitted to the server.
    - Explain how your break works.
- In total, you should have 6 true bugs (2 with Coverity, 4 by any means), and 2 false positives (2 with Coverity) that comply with the constraints below.
  - The 6 true bugs must target at least two different teams (i.e., you cannot have all 6 targeting the same team).
  - At least two of the true bugs must be security vulnerabilities (i.e., integrity and confidentiality violations).
  - If you did not find any attacks or vulnerabilities, then you can do the one of the following:
    - Explain why any crashes that you found have security implications. * Describe how you looked for attacks and why you believe none are present. Be Careful! If another team manages to find a break in the code you claimed to be secure, then you won’t get any points for that case.
  - All breaks must be validated via the server

Via Canvas, you should submit

Any code you wrote to implement your attacks. Make sure the vulnerability analysis explains how to use your code to launch the attack.

Grading

Break It will be worth 100 points

40 points for successfully completing the Coverity portion
60 points for finding and properly explaining four additional bugs beyond those covered by your Coverity description

Assignment Quick Links