Using Corellium Kernel Hooks to Disable Exploit Mitigations
In October 2020, Ian Beer of Google Project Zero disclosed a vulnerability in XNU, the kernel used by iOS and macOS, that had been exploited in the wild (CVE-2020-27932). It was used as the privilege escalation component of an exploit chain, along with a kernel memory disclosure (CVE-2020-27950) and a Safari RCE (CVE-2020-27930).
We won't go into the details of the vulnerability here, as the proof-of-concept and a root-cause analysis are available directly from Google Project Zero. Instead, we'll take the proof-of-concept and attempt to run it to see what happens.
First we build the proof-of-concept:
$ xcrun -sdk iphoneos cc -arch arm64 -o turnstiles host_not.c -Wall -O3 -framework CoreFoundation
$ codesign -s - turnstiles
This produces an ad-hoc signed command-line binary that we can run on a Corellium virtual device. We'll use an iPhone 7 running iOS 14.1 (18A395), the version right before the vulnerability was patched in iOS 14.2.
After creating the virtual device, we can upload the turnstiles binary (for example, into /tmp/) and run it:
zone_require
In iOS 13.0, Apple introduced the zone_require mitigation. It is intended to defeat a common iOS kernel exploitation technique, the zone transfer, which was often used to turn a use-after-free bug into a type confusion and from there into another primitive such as arbitrary read/write. XNU uses the zone allocator to slice up memory pages into elements of a specific type. For example, socket objects are allocated in the socket zone, and Mach ports are allocated in the ipc.ports zone. We can see a list of zones by running zprint:
                         elem    cur    max    cur    max    cur alloc alloc
zone name                size   size   size  #elts  #elts  inuse  size count
--------------------------------------------------------------------------------
vm.permanent                1    32K    32K  32768  32768  28590   16K 16384
vm.permanent.percpu         2    32K    32K  16384  16384   7592   32K 16384
ipc.ports                 168  2304K  2304K  14043  14043  13072   16K    97  C
ipc.port.sets              96    32K    32K    341    341    240   16K   170  C
ipc.vouchers               80    48K    64K    614    819     86   16K   204  C
tasks                    1576   320K   320K    207    207    194   16K    10  C
proc                     1072   208K   208K    198    198    192   16K    15  C
VM.map.copies              80    64K    64K    819    819    653   16K   204  C
pmap                      232    48K    48K    211    211    195   16K    70  C
vm.objects                256  3456K  3456K  13824  13824  13572   16K    64  C
maps                      280    64K    64K    234    234    207   16K    58
VM.map.entries             80  1872K  1872K  23961  23961  23593   16K   204  C
Reserved.VM.map.entries    80   160K   160K   2048   2048    260   16K   204
VM.map.holes               32   160K   160K   5120   5120   1381   16K   512  C
vm.pages                   64   160K   160K   2560   2560   2141   16K   256  XC
default.kalloc.16          16   736K   736K  47104  47104  45673   16K  1024  C
default.kalloc.32          32   400K   400K  12800  12800  12433   16K   512  C
default.kalloc.48          48   400K   400K   8533   8533   6641   16K   341  C
default.kalloc.64          64   416K   416K   6656   6656   6568   16K   256  C
default.kalloc.80          80   320K   368K   4096   4710   2617   16K   204  C
default.kalloc.96          96   240K   240K   2560   2560   2469   16K   170  C
default.kalloc.128        128   320K   320K   2560   2560   2425   16K   128  C
[snip]
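As a quick sanity check on how to read this output: assuming the last numeric column is the number of elements carved out of each allocation chunk, it should equal the chunk size divided by the element size, rounded down. The short Python sketch below checks that against a few rows copied from the listing. The helper name is ours and is not part of zprint or XNU.

```python
# Rows copied from the zprint listing:
# (zone name, element size in bytes, allocation chunk size in KiB, reported alloc count)
ROWS = [
    ("ipc.ports", 168, 16, 97),
    ("tasks", 1576, 16, 10),
    ("proc", 1072, 16, 15),
    ("default.kalloc.48", 48, 16, 341),
    ("vm.permanent.percpu", 2, 32, 16384),
]


def elements_per_chunk(elem_size, alloc_size_kib):
    """Whole elements that fit in one allocation chunk (hypothetical helper)."""
    return (alloc_size_kib * 1024) // elem_size


for name, elem_size, alloc_kib, reported in ROWS:
    computed = elements_per_chunk(elem_size, alloc_kib)
    print(f"{name}: computed={computed} reported={reported}")
```

Any leftover bytes at the end of a chunk (for example, 16384 - 97 * 168 for ipc.ports) are simply too small to hold another element.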
Note the existence of the kalloc zones. kalloc builds on top of the zone allocator for objects that do not have a dedicated zone. These objects are allocated by size, and placed into the smallest bin that fits the requested size. An exploit developer can use these to control the sizes of their allocations, allowing the heap to be "groomed" and filled with arbitrary attacker-controlled data. In order to control the data being type-confused, the exploit developer typically wants to transfer the page containing the target object from a type-specific zone to a kalloc zone. Generally, that requires controlling every allocation on the page that contains the target object.
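To make the "smallest bin" rule concrete, here is a minimal Python sketch of size-class selection. It uses only the kalloc sizes visible in the truncated listing, so larger requests are reported as unknown rather than guessed, and the function name is hypothetical. It illustrates the idea and is not XNU's implementation.

```python
import bisect

# Size classes visible in the truncated zprint listing; the real list continues.
KALLOC_SIZES = [16, 32, 48, 64, 80, 96, 128]


def kalloc_zone_for(size):
    """Return the name of the smallest listed kalloc zone that fits `size` bytes."""
    if size <= 0:
        raise ValueError("size must be positive")
    # bisect_left finds the first size class that is >= the requested size
    idx = bisect.bisect_left(KALLOC_SIZES, size)
    if idx == len(KALLOC_SIZES):
        return None  # larger classes are not shown in the listing
    return f"default.kalloc.{KALLOC_SIZES[idx]}"


for request in (1, 16, 17, 100):
    print(request, "->", kalloc_zone_for(request))
```

This is why allocation size matters to an attacker: picking the request size picks the zone, and therefore which pages the data can land on.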
For example, suppose there's a use-after-free of a socket object. The attacker will want to perform a zone transfer so that the dangling pointer's data is entirely attacker-controlled. To do this, a standard flow might be:
1. Allocate ("spray") a large number of socket objects. This ensures that any holes in the pages already in the socket zone are filled in, and then starts allocating one or more fresh pages containing objects whose creation was initiated by the attacker (and therefore the attacker can free them at any time).
2. Trigger the "free" part of the use-after-free bug. This stage will depend on the specifics of the bug in question, but the end result is that the target object is freed, but can still be accessed through a dangling pointer.
3. Free the sprayed objects, in the hope that the page containing the target object will no longer contain any allocations. At this point, the page is empty but still considered part of the socket zone.
4. Cause a garbage collection by creating memory pressure, such as allocating and then freeing a large amount of memory in userspace. This will mark the page that formerly contained the target object as free, allowing another zone to claim the page.
5. Attempt to reallocate the target object as a different type, such as entirely attacker-controlled data via kalloc.
6. Trigger the "use" part of the use-after-free bug. This will perform some action on the target object, which has had its data changed. For example, it may call a function pointer that is now attacker-controlled.
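The steps above can be sketched with a toy model in Python. Everything here is a simplification we made up for illustration: the ToyHeap class, four slots per page, and the string payloads stand in for real kernel objects, and the model has no relation to XNU's actual data structures. It only shows why a dangling pointer ends up seeing attacker data once its page changes zones.

```python
class ToyHeap:
    """Toy allocator: each page is owned by one zone and holds a few slots."""

    SLOTS_PER_PAGE = 4

    def __init__(self):
        self.page_owner = {}  # page number -> zone name, or None once reclaimed
        self.slots = {}       # (page, slot) -> stored value; absent means free
        self.next_page = 0

    def alloc(self, zone, value):
        # Prefer a free slot on a page the zone already owns.
        for page, owner in self.page_owner.items():
            if owner == zone:
                for slot in range(self.SLOTS_PER_PAGE):
                    if (page, slot) not in self.slots:
                        self.slots[(page, slot)] = value
                        return (page, slot)
        # Otherwise claim a reclaimed page, or a brand-new one.
        page = next((p for p, o in self.page_owner.items() if o is None), None)
        if page is None:
            page = self.next_page
            self.next_page += 1
        self.page_owner[page] = zone
        self.slots[(page, 0)] = value
        return (page, 0)

    def free(self, addr):
        del self.slots[addr]

    def gc(self):
        # Pages with no live allocations are released from their zone.
        for page, owner in self.page_owner.items():
            if owner is not None and not any(p == page for p, _ in self.slots):
                self.page_owner[page] = None

    def read(self, addr):
        return self.slots.get(addr)


def zone_transfer_demo():
    heap = ToyHeap()
    heap.alloc("socket", "pre-existing socket")        # page 0 is partly used
    # 1. Spray: fill the holes in page 0, then spill onto a fresh page.
    spray = [heap.alloc("socket", "sprayed socket") for _ in range(3)]
    target = heap.alloc("socket", "target socket")     # first slot of page 1
    spray += [heap.alloc("socket", "sprayed socket") for _ in range(3)]
    # 2. The bug frees the target but leaves a dangling pointer behind.
    dangling = target
    heap.free(target)
    # 3. Free the spray so the target's page holds no allocations.
    for addr in spray:
        heap.free(addr)
    # 4. Garbage collection releases the empty page from the socket zone.
    heap.gc()
    # 5. Reallocate the same memory as attacker-controlled kalloc data.
    replacement = heap.alloc("kalloc", "attacker data")
    # 6. The "use": the stale pointer now reads the attacker's bytes.
    return {
        "same_address": replacement == dangling,
        "seen_through_dangling": heap.read(dangling),
        "page_owner": heap.page_owner[dangling[0]],
    }


print(zone_transfer_demo())
```

Note that page 0 never leaves the socket zone in this model, because the pre-existing socket keeps it alive. That is the reason the attacker needs to control every allocation on the target's page.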
The purpose of zone_require is to prevent this entire technique from working. When the dangling socket pointer is referenced after having its contents replaced, a zone check will occur to validate that the page is still owned by the correct zone. Here's an example usage where an object kmsg is checked to ensure that its allocation is inside the correct zone:
In the case of CVE-2020-27932, we see the panic message: "zone_require failed: address in unexpected zone id 107 (host_notify) (addr: 0xffffffe19c7a54d0, expected: ipc ports)". Helpfully, Ian Beer's write-up mentions that "there are presumably some more tricks to get around that". The vulnerability exists as far back as iOS 12.0, so we could experiment with it on a version that predates zone_require. Corellium offers a better way, though: using Kernel Hooks to disable the mitigation altogether.
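To make the check behind that panic concrete, here is a toy Python model of a zone ownership check. It is not XNU source: the page table is a plain dictionary, addresses are (page, slot) pairs, and an exception stands in for the kernel panic. The zone names are taken from the panic message; everything else is an assumption for illustration.

```python
class ZoneRequireFailure(Exception):
    """Stands in for the kernel panic raised by a failed check."""


def zone_require(page_owner, expected_zone, addr):
    """Raise unless the page holding `addr` is owned by `expected_zone`."""
    page, _slot = addr
    actual = page_owner.get(page)
    if actual != expected_zone:
        raise ZoneRequireFailure(
            f"zone_require failed: address in unexpected zone ({actual}) "
            f"(addr: {addr}, expected: {expected_zone})"
        )


# Hypothetical state: page 1 holds a host_notify allocation,
# but the code under test believes the address is a Mach port.
page_owner = {0: "ipc.ports", 1: "host_notify"}

zone_require(page_owner, "ipc.ports", (0, 2))  # passes silently

try:
    zone_require(page_owner, "ipc.ports", (1, 0))
except ZoneRequireFailure as exc:
    print(exc)
```

The check looks only at which zone owns the page, not at the object's contents, which is why replacing the data behind a dangling pointer with something from a different zone is caught.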
Introduction to Kernel Hooks
Corellium Kernel Hooks allow us to introspect and modify the kernel at runtime, similar to using a Python script attached to a breakpoint in lldb. Kernel Hooks, however, have some significant advantages:
- Hooks can be set or modified from within the Corellium web interface without connecting the kernel debugger, and can be configured to execute on every boot.
- Hooks execute without locking, allowing race conditions to be investigated (which a traditional debugger might prevent from triggering by pausing all cores whenever a breakpoint triggers).
- Hooks are written in a C-like language.
At its most basic, a hook can print to the console when a certain instruction is reached. For example, we can place a hook at the first instruction of a function at some address (for a made-up example, 0xfffffff007738eb0):
print_int("Reached hooked function, x0=", cpu.x[0]);
This will log to the console in purple text, showing when the function is called, and printing the value of the X0 register.
To disable zone_require, we'll need to locate the function that enforces the check and causes a kernel panic. We can do this by locating the string used in the panic message, "zone_require failed" in the kernelcache opened in Binary Ninja, and then following the cross-references to the relevant function.
We can disable this mitigation entirely by simply returning from this function as soon as it is entered. In the hooks language, this is done by setting the PC register to the contents of the LR register (also known as X30):
print("zone_require called\n");
cpu.pc = cpu.x[30];
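To see why this two-line hook is enough, the following Python sketch models a function call on a toy CPU. It assumes the hook runs before the first instruction of the hooked function executes; ToyCPU, run_call, and the return address are invented for the illustration and are not part of any Corellium API.

```python
class ToyCPU:
    """Minimal register file: X0..X30 plus the program counter."""

    def __init__(self):
        self.x = [0] * 31
        self.pc = 0


def early_return_hook(cpu):
    """Python rendering of the hook body: jump straight back to the caller."""
    cpu.pc = cpu.x[30]


def run_call(cpu, func_addr, return_addr, hooks, log):
    """Simulate a call to func_addr; a hook at that address fires first."""
    cpu.x[30] = return_addr  # a BL instruction stores the return address in LR (X30)
    cpu.pc = func_addr
    hook = hooks.get(cpu.pc)
    if hook is not None:
        hook(cpu)
    if cpu.pc == func_addr:
        # Nothing redirected execution, so the function body runs.
        log.append("function body ran (the check would panic here)")
        cpu.pc = return_addr
    return cpu.pc


FUNC = 0xFFFFFFF007768FB8    # address of the hooked function, from the text
CALLER = 0xFFFFFFF007000000  # hypothetical return address

hooked_log, plain_log = [], []
run_call(ToyCPU(), FUNC, CALLER, {FUNC: early_return_hook}, hooked_log)
run_call(ToyCPU(), FUNC, CALLER, {}, plain_log)
print("with hook:", hooked_log)
print("without hook:", plain_log)
```

Because the function is left before any of its instructions run, it neither performs the check nor touches the stack, so returning through LR is safe at the very first instruction.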
On the Kernel Hooks tab, add a new hook, enter the address of the beginning of the function (fffffff007768fb8; note that the 0x prefix should not be entered) and the contents of our hook, then click Create hook. Leave the patch type as csmfcc to use the C-like hooks language.
If we run the proof-of-concept again with the hook in place, we'll see the "zone_require called" message in purple, and then a different panic message:
From here we can continue to explore this vulnerability as if the mitigation didn't exist and begin implementing an exploit for it, and then deal with the mitigation later.