<html>

  <head>

    <meta http-equiv="content-type" content="text/html; charset=utf-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <p>And another on Spectre</p>

    <p>--dave<br>

    </p>

    <div class="moz-forward-container"><br>

      <br>

      -------- Forwarded Message --------

      <table class="moz-email-headers-table" border="0" cellspacing="0"

        cellpadding="0">

        <tbody>

          <tr>

            <th nowrap="nowrap" valign="BASELINE" align="RIGHT">Subject:

            </th>

            <td>Spectre attacks: exploiting speculative execution</td>

          </tr>

          <tr>

            <th nowrap="nowrap" valign="BASELINE" align="RIGHT">Date: </th>

            <td>Tue, 16 Jan 2018 06:00:00 +0000</td>

          </tr>

          <tr>

            <th nowrap="nowrap" valign="BASELINE" align="RIGHT">From: </th>

            <td>adriancolyer <><br>

              <br>

            </td>

          </tr>

        </tbody>

      </table>

<a class="moz-txt-link-freetext" href="https://blog.acolyer.org/2018/01/16/spectre-attacks-exploiting-speculative-execution/">https://blog.acolyer.org/2018/01/16/spectre-attacks-exploiting-speculative-execution/</a><br>

      <br>

      <title>Spectre attacks: exploiting speculative execution</title>

      <base

href="https://blog.acolyer.org/2018/01/16/spectre-attacks-exploiting-speculative-execution/">

      <p><a href="https://spectreattack.com/spectre.pdf"

          moz-do-not-send="true">Spectre attacks: exploiting speculative

          execution</a> Kocher et al., <em>2018</em></p>

      <p>Yesterday we looked at Meltdown and some of the background on

        how modern CPUs speculatively execute instructions. Today it’s

        the turn of Spectre of course, which shares some of the same

        foundations but is a different attack, not mitigated by KAISER.

        On a technical front, Spectre is as fascinating as it is

        terrifying, introducing a whole new twist on <a

href="https://blog.acolyer.org/2017/12/06/the-dynamics-of-innocent-flesh-on-the-bone-code-reuse-ten-years-later/"

          moz-do-not-send="true">ROP</a>.</p>

      <blockquote>

        <p> This paper describes practical attacks that combine

          methodology from side-channel attacks, fault attacks, and

          return-oriented programming that can read arbitrary memory

          from the victim’s process… These attacks represent a serious

          threat to actual systems, since vulnerable speculative

          execution capabilities are found in microprocessors from

          Intel, AMD, and ARM that are used in billions of devices.

        </p>

      </blockquote>

      <p>Spectre is a step up from the already very bad Meltdown:</p>

      <ul>

        <li>it applies to AMD and ARM-based processors (so works on

          mobile phones, tablets, and so on)</li>

        <li>it only assumes that speculatively executed instructions can

          read from memory the victim process could access normally

          (i.e., no page fault or exception is triggered)</li>

        <li>the attack can be mounted in just a few lines of JavaScript

          – and a few lines of JavaScript can easily be delivered to

          your system from, for example, an ad displayed on a site that

          you would otherwise trust. </li>

      </ul>

      <blockquote>

        <p> … in order to mount a Spectre attack, an attacker starts by

          locating a sequence of instructions within the process address

          space which when executed acts as a covert channel transmitter

          which leaks the victim’s memory or register contents. The

          attacker then tricks the CPU into speculatively and

          erroneously executing this instruction sequence…

        </p>

      </blockquote>

      <h3>Let’s play tennis</h3>

      <p>You’re watching an epic clay-court tennis match with long

        rallies from the baseline. Every time the ball is received deep

        in the forehand side of the court, our would-be-champion plays a

        strong cross-court forehand. Her opponent plays the ball deep

        into the forehand side of the court once more, and almost

        reflexively starts moving across the court to get into position

        for the expected return. But this time her speculative execution

        is in vain — the ball is played down the line at pace leaving

        her wrong-footed. By forcing a branch misprediction the now

        champion tennis player has won the point, game, set and match!</p>

      <p>Spectre exploits speculative execution of instructions

        following a branch. Inside the CPU, a <em>Branch Target Buffer</em>

        (BTB) keeps a mapping from addresses of recently executed branch

        instructions to destination addresses. The BTB is used to

        predict future code addresses (“it’s going to be a cross-court

        forehand”) even before decoding the branch instructions.

        Speculative execution of the predicted branch improves

        performance.</p>

      <h3>Attack foundations</h3>

      <p>Consider this simple code fragment:</p>

      <pre class="brush: cpp; title: ; notranslate">    if (x &lt; array1_size)
        y = array2[array1[x] * 256];
</pre>

      <p>On line 1 we test to see whether <code>x</code> is in bounds

        for <code>array1</code>, and if it is, we use the value of <code>x</code>

        as an index into <code>array1</code> on line 2. We can execute
        this code fragment lots of times, always providing a value of <code>x</code>
        that is in bounds (the cross-court forehand). The branch predictor learns to
        predict that the condition will evaluate to true, causing

        speculative execution of line 2. Then, wham, we play the

        forehand down-the-line, passing a value of x which is out of

        bounds. Speculative execution of line 2 goes ahead anyway,

        before we figure out that actually the condition was false this

        time.</p>
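      <p>The training step can be illustrated with a toy model. The
        sketch below is an assumption-laden simplification: it uses a
        single 2-bit saturating counter, whereas real predictors are far
        more elaborate, and the names <code>toy_predictor</code> and
        <code>mistrain_then_attack</code> are made up for this
        illustration. It only shows why many in-bounds calls followed by
        one out-of-bounds call produce a misprediction.</p>

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy 2-bit saturating counter: 0-1 predict "not taken", 2-3 predict "taken".
   This is an illustrative model, not how any specific CPU works. */
typedef struct { unsigned counter; } toy_predictor;

static bool toy_predict(const toy_predictor *p) { return p->counter >= 2; }

static void toy_update(toy_predictor *p, bool taken) {
    if (taken && p->counter < 3) p->counter++;
    if (!taken && p->counter > 0) p->counter--;
}

/* Feed the predictor n in-bounds values of x, then one attacker-chosen value.
   Returns true if the final bounds check is predicted "in bounds" although
   it is not, which is the window in which line 2 would run speculatively.
   array1_size must be nonzero. */
static bool mistrain_then_attack(size_t array1_size, size_t n, size_t malicious_x) {
    toy_predictor p = { 0 };
    for (size_t i = 0; i < n; i++) {
        size_t x = i % array1_size;            /* always in bounds */
        toy_update(&p, x < array1_size);
    }
    bool predicted_in_bounds = toy_predict(&p);
    bool actually_in_bounds = malicious_x < array1_size;
    return predicted_in_bounds && !actually_in_bounds;
}
```

      <p>With no training calls the toy predictor starts at "not
        taken", so the out-of-bounds call is not mispredicted. After a
        handful of in-bounds calls it is.</p>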

      <p>Suppose we want to know the value of sensitive memory at

        address $a$. We can set <code>x = a - (base address of array1)</code>

        and then the lookup of <code>array1[x]</code> on line 2 will

        resolve to the secret byte at $a$. If we ensure that <code>array1_size</code>

        and <code>array2</code> are not present in the processor’s
        cache (e.g., via <code>clflush</code>) before executing this code, then we

        can leak the sensitive value.</p>
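      <p>As a minimal sketch of that index calculation, assuming
        <code>array1</code> is an array of bytes: the helper name
        <code>malicious_index</code> is hypothetical, and the arithmetic
        is done on integer addresses because subtracting pointers into
        different objects is undefined behaviour in C.</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Offset x such that &array1[x] is the address of the byte at `target`.
   Unsigned arithmetic wraps, so this also works when the target lies
   below array1 in memory. */
static size_t malicious_index(const uint8_t *array1, const void *target) {
    return (size_t)((uintptr_t)target - (uintptr_t)array1);
}
```

      <p>Adding the returned offset back to the base address of
        <code>array1</code> gives the target address again, which is all
        the out-of-bounds read relies on.</p>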

      <p>When the code runs…</p>

      <ol>

        <li>The processor compares the malicious value of <code>x</code>

          against <code>array1_size</code>. This results in a cache

          miss.</li>

        <li>While waiting for <code>array1_size</code> to be fetched,

          speculative execution of line 2 occurs, reading the data, $k$,

          at the target address. </li>

        <li>We then compute <code>k * 256</code> and use this as an

          index into <code>array2</code>. That’s not in the cache

          either, so off we go to fetch it.</li>

        <li>Meanwhile, the value of <code>array1_size</code> arrives

          from DRAM, the processor realises that the speculative

          execution was erroneous in this case, and rewinds its register

          state. But <em>crucially, the access of array2 affects the

            cache as we saw yesterday with Meltdown</em>.</li>

        <li>The attacker now recovers the secret byte $k$, since

          accesses to <code>array2[n*256]</code> will be fast for the

          case when $n = k$, and slow otherwise. </li>

      </ol>
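      <p>The five steps can be walked through with a deterministic toy
        model. This is not an exploit and measures no real timings: the
        cache is modelled as an array of booleans, and
        <code>toy_cache</code>, <code>simulate_speculative_read</code>
        and <code>recover_byte</code> are names invented for this
        sketch. It only shows why exactly one probe slot ends up "fast".</p>

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STRIDE 256   /* same multiplier as in the code fragment */
#define VALUES 256   /* one probe slot per possible byte value */

/* Toy cache model: cached[n] is true when array2[n * STRIDE] would be
   served quickly. No real timing or speculation happens here. */
typedef struct { bool cached[VALUES]; } toy_cache;

/* Steps 1-4: flush, then let the "speculative" read touch array2[k * STRIDE].
   The architectural result is discarded, but the line stays in the cache. */
static void simulate_speculative_read(toy_cache *c, uint8_t secret_byte) {
    memset(c->cached, 0, sizeof c->cached);        /* models the flush */
    size_t array2_index = (size_t)secret_byte * STRIDE;
    c->cached[array2_index / STRIDE] = true;
}

/* Step 5: probe every array2[n * STRIDE]; the one fast slot reveals k.
   Returns -1 if no slot is warm. */
static int recover_byte(const toy_cache *c) {
    for (int n = 0; n < VALUES; n++) {
        if (c->cached[n]) return n;
    }
    return -1;
}
```

      <p>On real hardware the probe step replaces the boolean test with
        a timed read of each slot, and noise means the measurement
        usually has to be repeated.</p>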

      <p>You can even do this from JavaScript, using code like this:</p>

      <p><img

src="https://adriancolyer.files.wordpress.com/2018/01/spectre-js.jpeg?w=640"

          alt="" moz-do-not-send="true"></p>

      <p>Note that there’s no access to the <code>clflush</code>

        instruction from JavaScript, but reading a series of addresses

        at 4096-byte intervals out of a large array does the same job.

        There’s also a small challenge getting the timer accuracy needed

        to detect the fast access in <code>array2</code>. The Web

        Workers feature of HTML5 provides the answer – creating a

        separate thread that repeatedly decrements a value in a shared

        memory location yields a timer that provides sufficient

        resolution.</p>
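      <p>The eviction trick can be sketched as follows. This is written
        in C for consistency with the fragment earlier in the post
        rather than in JavaScript, and <code>evict_by_striding</code> is
        a hypothetical helper. Whether such reads actually displace the
        victim's cache lines depends on the cache size and associativity
        of the machine, so treat it as an illustration of the access
        pattern only.</p>

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_STRIDE 4096   /* interval quoted in the text */

/* Read one byte every PAGE_STRIDE bytes of a large buffer.
   Returns the number of reads issued; the running sum is written to *sink
   and the buffer is volatile so the loads are not optimised away. */
static size_t evict_by_striding(const volatile uint8_t *buf, size_t len,
                                unsigned *sink) {
    size_t reads = 0;
    unsigned sum = 0;
    for (size_t off = 0; off < len; off += PAGE_STRIDE) {
        sum += buf[off];
        reads++;
    }
    *sink = sum;
    return reads;
}
```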

      <p>For a C code example, see appendix A in the paper. Unoptimised,
        this code can read about 10KB/second on an i7 Surface Pro 3.</p>

      <h3>The art of misdirection</h3>

      <p>With <em>indirect</em> branches we can do something even more

        special. An indirect branch is one that jumps to an address

        contained in a register, memory location, or on the stack.</p>
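      <p>For readers who think in C rather than assembly, a call
        through a function pointer is the most familiar source of
        indirect branches. The handlers below are invented for the
        example, and a compiler may inline or devirtualise the call, so
        this is a sketch of the idea rather than a guarantee about
        generated code.</p>

```c
/* Two possible destinations for the same call site. */
static int handler_a(int v) { return v + 1; }
static int handler_b(int v) { return v * 2; }

typedef int (*handler_fn)(int);

/* The call through `fn` typically compiles to an indirect call: the target
   is read from a register or memory at run time, so the CPU has to predict
   it before the pointer value is available. */
static int dispatch(handler_fn fn, int v) {
    return fn(v);
}
```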

      <blockquote>

        <p> If the determination of the destination address is delayed

          due to a cache miss and the branch predictor has been

          mistrained with malicious destinations, speculative execution

          may continue at a location chosen by the adversary.

        </p>

      </blockquote>

      <p>This means we can make the processor speculatively execute

        instructions from any memory location of our choosing. So all we

        have to do, as in the gadgets of <a

          href="https://blog.acolyer.org/2015/12/01/rop/"

          moz-do-not-send="true">Return-Oriented Programming</a> (ROP),

        is find some usable gadgets in the victim binary.</p>

      <blockquote>

        <p> … exploitation is similar to return-oriented programming,

          except that correctly-written software is vulnerable, gadgets

          are limited in their duration, but need not terminate cleanly

          (since the CPU will eventually recognize the speculative

          error), and gadgets must exfiltrate data via side channels

          rather than explicitly.

        </p>

      </blockquote>

      <p>In other words, this is an even lower barrier than regular ROP

        gadget chaining. Code executing in one hyper-thread of x86

        processors can mistrain the branch predictor for code running on

        the same CPU in a different hyper-thread. Furthermore, the

        branch predictor appears to use only the low bits of the virtual

        address: “<em>as a result, an adversary does <strong>not</strong>

          need to be able to even execute code at any of the memory

          addresses containing the victim’s branch instruction.</em>”</p>
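      <p>To see why using only the low address bits matters, here is a
        small sketch of the resulting aliasing. The indexing function
        and the number of bits are assumptions made for illustration;
        the text above only says that the low bits appear to be used,
        and real predictors hash addresses in undocumented ways.</p>

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical predictor index: keep only the low `bits` bits of the
   branch address. Valid for bits in the range 1..63. */
static uint64_t predictor_index(uint64_t branch_addr, unsigned bits) {
    return branch_addr & ((UINT64_C(1) << bits) - 1);
}

/* Two branches share predictor state in this model when their indices match,
   even if their full addresses are different. */
static bool branches_alias(uint64_t attacker_addr, uint64_t victim_addr,
                           unsigned bits) {
    return predictor_index(attacker_addr, bits) ==
           predictor_index(victim_addr, bits);
}
```

      <p>In this model an attacker can train from any address of their
        own whose low bits match the victim's branch, without ever
        executing code at the victim's address.</p>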

      <h3>A Windows example</h3>

      <p>A proof-of-concept program was written that generates a random
        key, then goes into an infinite loop calling <code>Sleep(0)</code>,

        loading the first bytes of a file, calling Windows crypto

        functions to compute the SHA-1 hash of the key and file header,

        and then prints out the hash whenever the header changes. When

        compiled with optimisation flags, the call to <code>Sleep(0)</code>

        will be made with file data in registers <code>ebx</code> and <code>edi</code>

        (normal behaviour, nothing special was done to cause this).</p>

      <p>In <code>ntdll.dll</code> we find the byte sequence <code>13

          BC 13 BD 13 BE 13 12 17</code>, which when executed

        corresponds to:</p>

      <pre>    adc  edi, dword ptr [ebx+edx+13BE13BDh]

    adc  dl, byte ptr [edi]

</pre>

      <p>If we control <code>ebx</code> and <code>edi</code> (which we

        do, via the file header), then we can control which address will

        be read by this code fragment.</p>
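      <p>The address arithmetic of the first gadget instruction can be
        spelled out as below. This sketch assumes the attacker knows the
        value of <code>edx</code> at the time of the call, and the helper
        names are hypothetical. All arithmetic is 32-bit and wraps, as
        it does in the gadget.</p>

```c
#include <stdint.h>

#define GADGET_DISP UINT32_C(0x13BE13BD)   /* displacement encoded in the gadget */

/* Address read by: adc edi, dword ptr [ebx+edx+13BE13BDh] */
static uint32_t gadget_read_address(uint32_t ebx, uint32_t edx) {
    return ebx + edx + GADGET_DISP;
}

/* Value of ebx that steers the first load to `target`, given a known edx. */
static uint32_t ebx_for_target(uint32_t target, uint32_t edx) {
    return target - GADGET_DISP - edx;
}
```

      <p>The second instruction then reads the byte at the address held
        in <code>edi</code>, which is what makes the loaded value
        observable through the cache.</p>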

      <p>Now the first instruction of the sleep function is <code>jmp

          dword ptr ds:[76AE0078h]</code>, which we can target for

        branch mistraining. (The actual destination changes per reboot

        due to ASLR).</p>

      <ul>

        <li>Simple pointer operations were used to locate the indirect

          jump at the entry point for sleep, and the memory location

          holding the destination for the jump</li>

        <li>The memory page containing the destination for the jump was

          made writable using copy-on-write, and modified to change the

          jump destination to the gadget address. Using the same method, a

          <code>ret 4</code> instruction was written at the location of

          the gadget. (These changes are only visible to the attacker,

          not the victim).</li>

        <li>A set of threads was launched to mistrain the branch

          predictor. This is a bit fiddly (see the details in section

          5.2), but once an effective mimic jump sequence is found the

          attacker is able to read through the victim’s address space. </li>

      </ul>

      <h3>Variations and mitigations</h3>

      <p>Spectre is not really so much an individual attack, as a whole

        new class of attacks: speculative execution may affect the state

        of other microarchitectural components, and virtually any

        observable effect of speculative execution can lead to leaks of

        sensitive information. Timing effects from memory bus

        contention, DRAM row address selection status, availability of

        virtual registers, ALU activity, and the state of the branch

        predictor itself need to be considered.</p>

      <blockquote>

        <p> The conditional branch vulnerability can be mitigated if

          speculative execution can be halted on potentially-sensitive

          execution paths… Indirect branch poisoning is even more

          challenging to mitigate in software. It might be possible to

          disable hyperthreading and flush branch prediction state

          during context switches, although there does not appear to be

          any architecturally-defined method for doing this.

        </p>

      </blockquote>
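      <p>One software technique for the conditional branch case is
        branchless index masking, a different approach from the halting
        of speculation that the quote describes and one that is not
        taken from the paper. The sketch below shows the idea in
        portable C, with <code>index_mask</code> and
        <code>read_clamped</code> as hypothetical names. A compiler is
        free to optimise the mask away, so real implementations rely on
        compiler barriers or inline assembly; treat this purely as an
        illustration of the arithmetic.</p>

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* All-ones when index < size, zero otherwise, computed without a branch.
   Assumes size <= SIZE_MAX / 2. */
static size_t index_mask(size_t index, size_t size) {
    size_t v = ~(index | (size - 1 - index));
    size_t in_bounds = v >> (sizeof(size_t) * CHAR_BIT - 1);   /* 1 or 0 */
    return (size_t)0 - in_bounds;
}

/* Bounds-checked read in which the index is also clamped by data flow,
   so a mispredicted branch would read array1[0] instead of a secret. */
static uint8_t read_clamped(const uint8_t *array1, size_t array1_size, size_t x) {
    if (x < array1_size) {
        x &= index_mask(x, array1_size);
        return array1[x];
    }
    return 0;
}
```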

      <p>Anything we can do in software or microcode though, should at

        best be seen as stop-gap countermeasures pending further

        research.</p>

      <h3>The last word</h3>

      <blockquote>

        <p> The vulnerabilities in this paper, as well as many others,

          arise from a longstanding focus in the technology industry on

          maximizing performance. As a result, processors, compilers,

          device drivers, operating systems, and numerous other critical

          components have evolved compounding layers of complex

          optimizations that introduce security risks. As the costs of

          insecurity rise, these design choices need to be revisited,

          and in many cases alternate implementations optimized for

          security will be required.

        </p>

      </blockquote>

    </div>

  </body>

</html>