Introduction
I recently realized how much material has been published on row hammer and how much I was missing a short summary. So I wrote one, and you are now reading the result. The summary is moderately technical and is kept short on purpose. I may or may not update this post, but please let me know if you think I missed something important. There will be no new results here.
Short version of how DRAM works
Current DRAM comes in modules called DIMMs. If you buy a modern memory module for your PC, you’re buying a DIMM. If you look at a DIMM, most will have chips on both sides. Each side of the DIMM is a rank. Each rank in turn consists of a number of banks. The banks are in the individual physical chips you can see. Inside a bank you’d find a two-dimensional matrix of memory cells. There are 32k rows in the matrix and 16k or 512k cells per row. Each cell stores one bit and consists of a transistor for control and a capacitor, which holds charge to signify that the bit is 1 and no charge when the bit is 0 (on some chips the encoding is reversed). Thus a row stores 8 KB or 64 KB of data, depending on the exact kind of DRAM you have in front of you. When you read from or write to DRAM, an entire row is first read into a so-called row buffer. This is because reading automatically discharges the capacitors and because writes rarely rewrite the entire row. Reading a row into the row buffer is called activating the row. An active row is thus cached in the row buffer. If a row is already active, it is not reactivated on requests. Also, to prevent the capacitors from losing charge over time, they are refreshed regularly (typically every 64 ms) by activating the rows.
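The row buffer behaviour matters for everything that follows, so here is a minimal sketch of it. It is a toy model under my own simplifying assumptions (one bank, no refresh, no CPU caches), and the class and method names are made up for illustration.

```python
class Bank:
    """Toy model of one DRAM bank with a single row buffer."""

    def __init__(self):
        self.open_row = None     # row currently held in the row buffer
        self.activations = {}    # row number -> number of activations

    def access(self, row):
        """Read or write somewhere in `row`; activate it only if it is not open."""
        if row != self.open_row:
            self.open_row = row
            self.activations[row] = self.activations.get(row, 0) + 1


same_row = Bank()
for _ in range(1000):
    same_row.access(5)           # row 5 stays open after the first access

alternating = Bank()
for _ in range(1000):
    alternating.access(5)        # each access closes the other row ...
    alternating.access(7)        # ... so every access is an activation

print(same_row.activations)      # {5: 1}
print(alternating.activations)   # {5: 1000, 7: 1000}
```

Accessing one row over and over activates it once, while alternating between two rows of the same bank forces an activation on every access.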
Row hammer introduction
This section is based on [1] Kim et al. where not otherwise noted.
When a row is activated, it has a small effect on the neighboring rows due to so-called cross-talk effects. The effect can be caused by electromagnetic interference, or by so-called conductive bridges, where there is minor electric conductivity in DRAM modules where there shouldn’t be. Finally, so-called hot-carrier injection may play a role, where an electron reaches sufficient kinetic energy to leak from the system or even permanently damage parts of the circuitry. The net effect is a loss of charge in the DRAM cell which, if large enough, will cause a bit to flip.
Consequently, it’s possible to cause bits to flip in DRAM by reading or writing repeatedly and systematically from/to two rows in DRAM (that is, by activating the rows). Bit flips can be introduced in rows up to 9 rows away from these two “aggressor rows”. These rows are called victim rows. Most errors happen in the row immediately next to an aggressor row. Picking the aggressor rows so they bracket a victim row is called double-sided row hammering and is far more efficient than normal row hammering. Using two adjacent rows to hammer surrounding rows is called amplified single-sided hammering and can be useful in exploitation scenarios. If the victim rows are refreshed before enough cross talk has been generated, no bit flips are incurred. As a rule of thumb, the higher the frequency of row activation, the higher the probability of flipping bits.
It has been shown that bits can be flipped in less than 9 milliseconds and that this typically requires around 128k row activations. [3] Seaborn & Dullien have reported bit flips with as few as 98k row activations.
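To get a feeling for these numbers, the activations have to land between two refreshes of the victim row. The back-of-the-envelope sketch below assumes the typical 64 ms refresh interval and reads “k” as 1000; both the function name and that reading are my assumptions.

```python
REFRESH_INTERVAL_S = 64e-3   # typical refresh interval, assumed here


def max_ns_per_activation(activations, window_s=REFRESH_INTERVAL_S):
    """Longest average time per activation (in ns) that still fits
    `activations` row activations into one refresh window."""
    return window_s / activations * 1e9


for label, count in (("128k", 128_000), ("98k", 98_000)):
    print(f"{label}: {max_ns_per_activation(count):.0f} ns per activation")
```

So under these assumptions an attacker has roughly half a microsecond per row activation, which is why the speed of the memory access loop matters so much.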
The problem occurs primarily with RAM produced after 2010. In a sample of 129 RAM modules from 3 manufacturers, over 80% were vulnerable, and all modules produced after 2012 were vulnerable. [4] Lanteigne showed that DDR4 RAM is vulnerable too, with 8 out of 12 sampled DRAMs subject to bit flips. Further, this paper showed that certain patterns in the DRAM rows were more likely to cause bit flips.
Physical addressing and finding banks and rows
Obviously, to cause row hammering one needs two addresses belonging to rows in the same bank. [2] showed that by repeatedly choosing two random addresses in a large buffer, one would in a practical amount of time find a pair belonging to the same bank and thus useful for hammering in software.
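Why random picking works can be seen with a little probability. The sketch below assumes that addresses are spread uniformly over all banks and uses a made-up bank count of 16; it ignores that the two addresses must also be in different rows.

```python
def p_same_bank(total_banks):
    """Chance that two uniformly random addresses fall in the same bank."""
    return 1.0 / total_banks


def p_hit_within(tries, total_banks):
    """Chance that at least one of `tries` random pairs shares a bank."""
    return 1.0 - (1.0 - p_same_bank(total_banks)) ** tries


TOTAL_BANKS = 16   # hypothetical, e.g. 2 ranks x 8 banks

for tries in (1, 16, 64):
    print(f"{tries:3d} pairs: {p_hit_within(tries, TOTAL_BANKS):.3f}")
```

With only a handful of banks the chance of hitting a same-bank pair grows quickly with the number of attempts, which fits the observation that the random approach is practical.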
An optimal solution requires that the attacker has knowledge of physical addresses. Even with a physical address, an attacker would need to know how it maps to DIMMs, banks and rows to row hammer optimally. [5] Seaborn used row hammer itself to derive the complex function that maps physical addresses to DRAM locations for a Sandy Bridge computer. [6] Pessl et al. showed how to use “row buffer side channel attacks” to derive the complex addressing function generically and provided maps for many modern Intel CPUs.
To obtain physical addresses, /proc/$PID/pagemap could provide this information. However, /proc/$PID/pagemap is not available in all operating systems and no longer affords unprivileged access in most operating systems that do support it. This problem for an attacker remains to be definitively solved.
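For reference, here is a minimal sketch of how a pagemap lookup works on Linux, where each virtual page has one 64-bit entry with the page frame number in bits 0-54 and a “present” flag in bit 63. The function names are mine, and on recent kernels the frame number reads as zero without sufficient privileges, which the sketch treats as “unknown”.

```python
import mmap
import struct

PAGE_SIZE = mmap.PAGESIZE
PFN_MASK = (1 << 55) - 1     # bits 0-54: page frame number
PRESENT_BIT = 1 << 63        # bit 63: page is present in RAM


def decode_pagemap_entry(entry, offset_in_page=0, page_size=PAGE_SIZE):
    """Turn one 64-bit pagemap entry into a physical address, or None."""
    if not entry & PRESENT_BIT:
        return None
    pfn = entry & PFN_MASK
    if pfn == 0:
        return None          # frame number hidden from this caller
    return pfn * page_size + offset_in_page


def virt_to_phys(vaddr, pid="self"):
    """Look up `vaddr` in /proc/<pid>/pagemap (Linux only)."""
    with open(f"/proc/{pid}/pagemap", "rb") as f:
        f.seek((vaddr // PAGE_SIZE) * 8)   # one 8-byte entry per page
        data = f.read(8)
    if len(data) != 8:
        return None
    (entry,) = struct.unpack("=Q", data)
    return decode_pagemap_entry(entry, vaddr % PAGE_SIZE)
```

Only decode_pagemap_entry is pure arithmetic; virt_to_phys depends on the operating system and on the privileges of the caller.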
Row hammering with software
To cause row hammer from software you need to activate memory rows, that is, cause reads or writes to physical memory. However, modern processors are equipped with caches so that they don’t incur serious speed penalties when memory is read or written. Thus, to cause row hammering bit flips, it’s required to bypass the caches.
[1] did this using the clflush instruction, which removes a particular address from the cache, causing the next read of the address to go directly to memory. This approach has two downsides. First, since clflush is a rare instruction, validator sandboxes (such as NaCl in Google Chrome) can ban this instruction and thus defeat this attack. Second, just-in-time compilers and existing code on computers generally do not use this opcode, which rules out attack scenarios where JIT compilers are used (such as JavaScript) or, in the future, where existing code is reused in data-only attacks.
[7] Aweke posted on a forum that he was able to flip bits without clflush. He did not say how, but it was likely using the same method as [8], which systematically accesses memory in a pattern that causes the processor to evict the address of the aggressor row from the cache, causing the next read to go to physical memory. Unfortunately, how to evict caches is CPU dependent and undocumented, and despite [8] Gruss, Maurice & Mangard mapping out how to optimally evict on most modern CPUs, it’s not the most trivial process. Typically, this requires knowledge of the physical address discussed above as well as a complex mapping function for cache sets. It is however possible to approximate this either by using large pages or through timing side channels. Also, it is slower and thus less efficient than the clflush version above. Since this vector does not require special instructions, it’s applicable to native code (including sandboxed code), JavaScript and potentially other JIT-compiled languages.
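The idea behind eviction can be illustrated with a toy cache set. The sketch assumes a pure LRU replacement policy and 12 ways, both of which are my simplifications; real CPUs use undocumented policies, which is exactly why finding good eviction patterns is hard.

```python
from collections import OrderedDict


class LruCacheSet:
    """One cache set with `ways` lines and an assumed LRU policy."""

    def __init__(self, ways):
        self.ways = ways
        self.lines = OrderedDict()   # address -> None, oldest first

    def access(self, addr):
        """Return True on a hit, False on a miss (which goes to memory)."""
        if addr in self.lines:
            self.lines.move_to_end(addr)
            return True
        if len(self.lines) >= self.ways:
            self.lines.popitem(last=False)   # evict least recently used
        self.lines[addr] = None
        return False


def count_memory_reads(rounds, ways, eviction_set_size):
    """Count how often the aggressor address misses the cache."""
    cache_set = LruCacheSet(ways)
    eviction_set = [f"evict{i}" for i in range(eviction_set_size)]
    misses = 0
    for _ in range(rounds):
        if not cache_set.access("aggressor"):
            misses += 1
        for addr in eviction_set:
            cache_set.access(addr)
    return misses


print(count_memory_reads(100, ways=12, eviction_set_size=12))
print(count_memory_reads(100, ways=12, eviction_set_size=11))
```

In this model an eviction set as large as the number of ways sends every aggressor read to memory, while one address fewer lets the aggressor stay cached after the first read.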
[9] Qiao & Seaborn found that the movnti instruction is capable of bypassing the caches on its own. Further, this instruction is commonly used, including in memcpy/memset in common libraries, and is thus difficult to ban in validator sandboxes and lowers the burden for future row hammer as a code reuse attack. It remains to be seen if JIT-compiled languages can make use of it.
Finally, [10] Fogh showed that the policies that maintain the coherency of multiple caches on the CPU can be used to cause row activations and speculated it would be fast enough for row hammering. Since the coherency policies are subject to much less change than cache eviction policies and do not require any special instructions, this method may solve problems with the other methods should it be fast enough. This remains to be researched.
Exploiting row hammer
[2] showed that row hammer could be used to break out of the NaCl Chrome sandbox. The NaCl sandbox protects itself by verifying all code paths before execution and blocking the use of undesired activities. To prevent new unintended code paths from being executed, the sandbox enforces a 32-byte-aligned address for indirect jumps and adds a base address. In code it looks like this:
and rax, ~31
add rax, r15 // base address of sandbox
jmp rax
Bit flips in these instructions often cause other registers to be used and thus bypass the sandbox’s limits on indirect jumps. The attack sprays code like the above, then row hammers, checks for usable bit flips and finally uses one of these jumps to execute a non-validated code path and thus exit the sandbox. The non-validated code path can be entered through code reuse style gadgets.
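A small sketch may help to see both halves of this. The first function models the mask-and-rebase sequence on 64-bit values; the second shows how a single flipped bit in the two-byte encoding of jmp rax (FF E0) yields jmp rcx (FF E1), a register the mask never touched. The sandbox base value and the function names are made up for illustration.

```python
MASK_64 = (1 << 64) - 1


def sandboxed_jump_target(rax, r15):
    """Model `and rax, ~31` followed by `add rax, r15` on 64-bit registers."""
    rax &= ~31 & MASK_64           # clear the low five bits
    return (rax + r15) & MASK_64   # add the sandbox base address


def flip_bit(code, byte_index, bit):
    """Return a copy of `code` with one bit flipped."""
    flipped = bytearray(code)
    flipped[byte_index] ^= 1 << bit
    return bytes(flipped)


SANDBOX_BASE = 0x7F00_0000_0000    # hypothetical, 32-byte aligned
target = sandboxed_jump_target(0x1237, SANDBOX_BASE)
print(hex(target - SANDBOX_BASE))  # 0x1220

JMP_RAX = bytes([0xFF, 0xE0])      # encoding of `jmp rax`
print(flip_bit(JMP_RAX, 1, 0).hex())   # ffe1, the encoding of `jmp rcx`
```

Clearing the low five bits is what aligns the target to 32 bytes, and the example bit flip sidesteps that check by jumping through a register that was never masked.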
The second and more serious privilege escalation attack demonstrated by [2] was from ring 3 user privileges to ring 0. Since adjacent physical addresses have a tendency to be used at the same time, CPU vendors map adjacent physical addresses to different parts of RAM, as this offers the possibility of memory requests being handled by DRAM in parallel. This has the effect that banks are often shared between different software across trust boundaries. This allows an attacker to flip bits in page table entries (PTEs). PTEs are central to security on x86 and control access rights to memory as well as the mapping between virtual and physical addresses. By repeatedly memory mapping a file, many new PTEs are generated. Then the attacker uses row hammer to cause bit flips in the PTEs. The attacker hopes that a bit flips so that a PTE with write privileges maps to a new physical address where another PTE is located. By strategically modifying this second PTE, the attacker gains read & write access to the entire memory.
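To make the PTE bit flip concrete, here is a minimal sketch using the x86-64 layout for 4 KB pages, where bit 0 is “present”, bit 1 is “writable”, bit 2 is “user” and bits 12-51 hold the physical frame address. The example physical address is made up.

```python
PTE_PRESENT = 1 << 0
PTE_WRITABLE = 1 << 1
PTE_USER = 1 << 2
FRAME_MASK = ((1 << 52) - 1) & ~0xFFF   # bits 12-51: physical frame address


def pte_target(pte):
    """Physical address of the page this PTE points to."""
    return pte & FRAME_MASK


def flip_bit(value, bit):
    """Model a single row hammer bit flip."""
    return value ^ (1 << bit)


# Hypothetical PTE: a user-writable page at physical address 0x12345000.
pte = 0x12345000 | PTE_PRESENT | PTE_WRITABLE | PTE_USER
flipped = flip_bit(pte, 20)

print(hex(pte_target(pte)))       # 0x12345000
print(hex(pte_target(flipped)))   # 0x12245000
```

The permission bits survive the flip, so the attacker keeps write access but now to a different physical page. The attack works when that page happens to hold a page table.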
It is
likely that row hammer can be exploited in other ways.
Row hammer mitigation
Most suggested hardware mitigations focus on refreshing victim rows, thus leaving less time for row hammer to do its work. Unfortunately, during the refresh RAM is unable to respond to requests from the CPU, and thus it comes at a performance penalty.
The simplest suggestion is to increase the refresh rate for all RAM. Much hardware already supports this for high temperatures. Usually the refresh rate is doubled; however, to perfectly rule out row hammer one would need to increase the refresh rate more than 7-fold [1], which in turn is a steep performance penalty and a power consumption issue.
TTR [17] is a method that keeps track of used rows and causes targeted refresh of neighbors to minimize the penalty. The method needs to be supported in both the CPU and the RAM modules. MAC, also known as maximum activation count, keeps track of how many times a given row was activated. pTTR does this only statistically and is thus cheaper to build into hardware. PARA [1] is another suggested hardware mitigation that statistically refreshes victim rows. ARMOR [16] is a solution that keeps track of row activations in the memory interface.
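The statistical approach can be sketched in a few lines. My reading of PARA, which should be treated as an assumption, is that each time a row is closed one of its two neighbors is refreshed with a small probability p. The value p = 0.001 and the function names are chosen for illustration only.

```python
import random


def para_on_row_close(row, p, rng):
    """Return the neighbor row to refresh after closing `row`, or None."""
    if rng.random() < p:
        return row + rng.choice((-1, 1))   # pick either neighbor
    return None


def p_victim_never_refreshed(activations, p):
    """Chance that one specific neighbor is never picked during
    `activations` closes of the aggressor row."""
    return (1.0 - p / 2.0) ** activations


rng = random.Random(0)
refreshed = [para_on_row_close(100, 0.001, rng) for _ in range(128_000)]
print(sum(r is not None for r in refreshed), "neighbor refreshes")
print(p_victim_never_refreshed(128_000, 0.001))
```

No per-row counters are needed, and under these assumptions the chance that a victim goes unrefreshed for a whole hammering run is vanishingly small.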
It has been suggested that ECC RAM can be used as a mitigation. Unfortunately, ECC RAM will not detect or correct bit flips in all instances where there are multiple bit flips in a row. Thus this leaves room for an attack to be successful even with ECC RAM. Also, ECC RAM may cause the attacked system to reset, turning row hammer into a denial of service attack. [4] suggests this problem persists in real-life experiments.
Nishat Herath and I suggested using the LLC miss performance counter to detect row hammering in [11] Fogh & Herath. LLC misses are rare in real usage, but abundant in row hammering scenarios. [12] Gruss et al. and [13] Payer each refined the method by correcting for general activity in the memory subsystem. The methods are likely to produce false positives in some cases, and [11] and [13] therefore suggested only slowing down offenders to prevent bit flips. [14] Aweke et al. presented a method that first detects row hammering as above, then verifies it using PEBS performance monitoring, which has the advantage of delivering an address related to a cache miss and thus grants the ability to selectively read neighbor rows, thereby doing targeted row refresh in a software implementation. [15] Fogh speculated that this method could be effectively bypassed by an attacker.
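The counter-based detection idea boils down to a threshold test per sampling interval. The sketch below is not the implementation from any of the cited works; the sample data, the thresholds and the use of a miss ratio to correct for general memory activity are all assumptions for illustration.

```python
def suspicious_intervals(samples, min_misses, min_miss_ratio):
    """Return indexes of sampling intervals that look like row hammering.

    `samples` is a list of (llc_misses, llc_references) per interval.
    """
    flagged = []
    for i, (misses, refs) in enumerate(samples):
        if refs and misses >= min_misses and misses / refs >= min_miss_ratio:
            flagged.append(i)
    return flagged


# Synthetic counter readings, not measurements.
samples = [
    (1_200, 90_000),      # ordinary workload: few misses
    (150_000, 160_000),   # hammering-like: almost every access misses
    (800, 1_000),         # high ratio but hardly any memory traffic
]
print(suspicious_intervals(samples, min_misses=100_000, min_miss_ratio=0.5))
```

A real detector would read the counters from the CPU and, given the risk of false positives, would slow the flagged process down rather than kill it.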
Literature
[1] Yoongu Kim, R. Daly, J. Kim, C. Fallin, Ji Hye Lee, Donghyuk Lee, C. Wilkerson, K. Lai, and O. Mutlu. “Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors”. In Computer Architecture (ISCA), 2014 ACM/IEEE 41st International Symposium on, pages 361-372, June 2014.
[2] Mark Seaborn and Thomas Dullien. “Exploiting the DRAM rowhammer bug to gain kernel privileges”. March 2015.
[3] Mark Seaborn and Thomas Dullien. “Exploiting the DRAM rowhammer bug to gain kernel privileges”. https://www.blackhat.com/docs/us-15/materials/us-15-Seaborn-Exploiting-The-DRAM-Rowhammer-Bug-To-Gain-Kernel-Privileges.pdf
[4] Mark Lanteigne. “How Rowhammer Could Be Used to Exploit Weaknesses in Computer Hardware”. Third I/O. http://www.thirdio.com/rowhammer.pdf
[5] Mark Seaborn. “How physical addresses map to rows and banks in DRAM”.
[6] Peter Pessl, Daniel Gruss, Clémentine Maurice, Michael Schwarz, Stefan Mangard. “Reverse Engineering Intel DRAM Addressing and Exploitation”.
[7] Zelalem Birhanu Aweke. “Rowhammer without CLFLUSH”. https://groups.google.com/forum/#!topic/rowhammer-discuss/ojgTgLr4q M, May 2015, retrieved on July 16, 2015.
[8] Daniel Gruss, Clémentine Maurice, Stefan Mangard. “Rowhammer.js: A Remote Software-Induced Fault Attack in JavaScript”.
[9] Rui Qiao, Mark Seaborn. “A New Approach for Rowhammer Attacks”. http://seclab.cs.sunysb.edu/seclab/pubs/host16.pdf
[10] Anders Fogh. “Row hammer, java script and MESI”. http://dreamsofastone.blogspot.de/2016/02/row-hammer-java-script-and-mesi.html
[11] Anders Fogh, Nishat Herath. “These Are Not Your Grand Daddys CPU Performance Counters”. Black Hat 2015. http://dreamsofastone.blogspot.de/2015/08/speaking-at-black-hat.html
[12] Daniel Gruss, Clémentine Maurice, Klaus Wagner, Stefan Mangard. “Flush+Flush: A Fast and Stealthy Cache Attack”.
[13] Mathias Payer. “HexPADS: a platform to detect ‘stealth’ attacks”. https://nebelwelt.net/publications/files/16ESSoS.pdf
[14] Zelalem Birhanu Aweke, Salessawi Ferede Yitbarek, Rui Qiao, Reetuparna Das, Matthew Hicks, Yossi Oren, Todd Austin. “ANVIL: Software-Based Protection Against Next-Generation Rowhammer Attacks”.
[15] Anders Fogh. “Anvil & Next generation row hammer attacks”. http://dreamsofastone.blogspot.de/2016/03/anvil-next-generation-row-hammer-attacks.html
[16] http://apt.cs.manchester.ac.uk/projects/ARMOR/RowHammer/armor.html