A few notes about the exploitation of CVE-2017-3623 (aka EBBISLAND) on Solaris 10 (x86/amd64)

The advisory CVE-2017-3623 [CVE] is associated with the infamous EBBISLAND exploit that was leaked as part of the Equation Group collection [EQGROUP]. The corresponding heap overflow vulnerability can be exploited to provide a remote root shell to the attacker. What makes it especially interesting is that:

The vulnerability was confirmed in Solaris 6 to 9 (by @hackerfantastic [HFANT]). We confirm that Solaris 10 (up to u10) is vulnerable as well.
It is not a bug within a specific service but rather within a library (libnsl.so) that is linked to most of the RPC services (more on that later). The implication is that there is not a single entry point but many.

Long story short, an unpatched Solaris exposing at least one of its RPC services can be compromised. It is just a matter of having a supported target. This is one of the most interesting bugs affecting Solaris that we know of. The u11 version of Solaris 10, released in 2013, fixed the bug, which means that Oracle’s engineers were aware of it yet chose not to disclose it (or were told not to). Instead they advertised perhaps more relevant security features :).

Illustration 1: http://docs.oracle.com/cd/E26505_01/html/E27003/index.html

We could not find any mention of the patch within the release notes. Was the bug found by the Equation Group during a diff session? Who knows.

In this paper, we discuss the exploitation of the vulnerability on Solaris 10 (Intel only). The exploitation on previous versions is assumed to be easy and will not be discussed. We quickly reversed the EQGRP binary EBBISLAND [EBB] to trigger the vulnerability. Reading [F1] and [F2] seemed to confirm at the time that EBBISLAND was mostly target based, but we could be proven wrong as we did not reverse the ebbshave.vX binaries. For this reason, and because we focused our efforts on Solaris 10, which is more difficult to exploit than version 9 and earlier, we developed our own exploitation strategy, which is presented here. It is of course possible that smarter strategies exist.

1. The vulnerability analysis

This vulnerability is (officially) said to affect the kernel RPC subsystem as illustrated by both [CVE] and [ORACLE]. Is that really the case though?

Discrediting the kernel hypothesis

The Solaris RPC subsystem runs both within the kernel (rpcmod, rpcsec modules) and in userland (/lib/libnsl.so.1). Both implementations use very similar code; therefore a memory corruption bug could potentially affect both components. To find out, it is enough to use the original leaked binary and observe its effect on a vulnerable target.
No matter the options chosen on the CLI, the EBBISLAND [EBB] exploit generates a single frame with a very specific structure. In particular the credentials structure has its "flavor" (this is the name of this field) set to 0x55de (AUTH_LOOPBACK). This identifies the authentication subsystem and XDR encoded data follows as its parameters. It is clear from the observation of the packet that the bug is related to the authentication subsystem.

The coredump analysis

Replaying a modified packet targeting bootparams allows us to trigger a userland crash (in this case in rpc.metamhd rather than bootparams):

bash-3.00# /opt/csw/bin/gdb /usr/sbin/rpc.metamhd core
GNU gdb (GDB) 7.7
[...]
Reading symbols from /usr/sbin/rpc.metamhd...(no debugging symbols found)...done.
[New LWP 1]
[New LWP 2]
[Thread debugging using libthread_db enabled]
[New Thread 1 (LWP 1)]
[New Thread 2 (LWP 2)]
Core was generated by `/usr/sbin/rpc.metamhd'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0xd0efe2cc in xdr_replymsg () from /lib/libnsl.so.1
(gdb) bt
#0 0xd0efe2cc in xdr_replymsg () from /lib/libnsl.so.1
#1 0xd0f0c09e in svc_vc_reply () from /lib/libnsl.so.1
#2 0xd0f03c47 in svcerr_noprog () from /lib/libnsl.so.1
#3 0xd0f040fa in _svc_prog_dispatch () from /lib/libnsl.so.1
#4 0xd0f08bfc in _svc_run_mt () from /lib/libnsl.so.1
#5 0xd0f08473 in svc_run () from /lib/libnsl.so.1
#6 0x08057094 in main ()
(gdb) print /x $ecx
$1 = 0x450685ec <-- 0x450685ec was read from the network buffer
(gdb) x/6i $pc
=> 0xd0efe2cc <xdr_replymsg+75>: mov 0x18(%ecx),%ecx <-- control of %ecx is possible
0xd0efe2cf <xdr_replymsg+78>: add $0x18,%eax
0xd0efe2d2 <xdr_replymsg+81>: push %eax
0xd0efe2d3 <xdr_replymsg+82>: push %edi
0xd0efe2d4 <xdr_replymsg+83>: call *%ecx <-- If %ecx can be controlled, %EIP can be set.
0xd0efe2d6 <xdr_replymsg+85>: add $0x8,%esp
(gdb)

The payload reliably triggers the crash. Thanks to the core dump analysis it is possible to see that:

The crash occurs within “libnsl.so.1”.
The memory corruption is heap based.
This specific crash is reliable and almost always the same (a dereference of %ecx), which gives us instant control of EIP.

Note:

The exact same crash is obtained when targeting several (but not all) other processes. This seems to confirm that contrary to the official statement, the “ebbisland” bug is a userland one affecting libnsl.so.1. It is possible that there is (or was) a “similar” kernel bug but it is not what is currently exploited by the leaked binary.
In approximately 1 out of 20 attempts, the crash occurs in a different place. We can assume that this is due to small (unexpected) heap variations.
On Solaris 9 and below there is no NX bit (and no ASLR) therefore achieving code execution from this point is trivial.

The source code analysis

The vulnerability itself is a heap overflow in __svcauth_loopback(). When an RPC call is performed, a tag (the so-called "flavor" in the credentials struct) is used to select an authentication subsystem. If the value 0x55de is used then the “loopback” authentication is selected.

RPC services are constantly polling for new RPC messages. When one is received, the svc_getreq_common() function is called over a specific file descriptor. This function then retrieves the corresponding SVCXPRT object associated with that descriptor and two of its attributes: "msg" and "r", calling _svc_prog_dispatch() in the process.

_svc_prog_dispatch() may perform an authentication; to do so, it calls __gss_authenticate() with "r" and "msg".

Depending on the flavor extracted from the network buffer, that function calls the corresponding authentication subsystem. When the authentication flavor is AUTH_LOOPBACK (0x55de), it calls __svcauth_loopback().
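To make the role of the flavor concrete, here is a minimal sketch of how the credential header of an RPC call could be decoded by a monitoring or analysis script. It assumes the standard XDR layout of an opaque_auth (a 32-bit big-endian flavor, a 32-bit length, then the body padded to 4 bytes) and the 0x55de value observed in the packet for AUTH_LOOPBACK. The function name and the sample buffer are ours, for illustration only.

```python
import struct

# The first four flavor numbers come from the ONC RPC specification;
# AUTH_LOOPBACK is the value quoted in this post (0x55de).
FLAVORS = {
    0: "AUTH_NONE",
    1: "AUTH_SYS",
    2: "AUTH_SHORT",
    3: "AUTH_DES",
    0x55DE: "AUTH_LOOPBACK",
}

def parse_opaque_auth(buf, offset=0):
    """Decode an XDR opaque_auth: flavor, length, then the body."""
    flavor, length = struct.unpack_from(">LL", buf, offset)
    body_start = offset + 8
    body = buf[body_start:body_start + length]
    # XDR pads opaque data to a multiple of 4 bytes.
    next_offset = body_start + ((length + 3) & ~3)
    return FLAVORS.get(flavor, "unknown"), body, next_offset

# Hypothetical credential: loopback flavor with a 4-byte dummy body.
sample = struct.pack(">LL", 0x55DE, 4) + b"\x00" * 4
name, body, nxt = parse_opaque_auth(sample)
print(name, len(body), nxt)
```

A script like this is enough to flag frames that select the loopback authentication path, which is unusual for traffic arriving from the network.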

__svcauth_loopback() essentially starts by extracting data out of the network buffer using the XDR decoding API. An array of GIDs may be provided by the attacker within the frame. To avoid buffer overflows, the number of GIDs that a user may provide is limited to NGRPS_LOOPBACK (=64), as demonstrated by the following code:

[...]
area = (struct area *)rqst->rq_clntcred;
aup = &area->area_aup;
[...]
gid_len = IXDR_GET_U_INT32(buf);    // reads gid_len (unsigned) from the network
if (gid_len > NGRPS_LOOPBACK) {     // security check
    stat = AUTH_BADCRED;
    goto done;
}
aup->aup_len = gid_len;
for (i = 0; i < gid_len; i++) {
    aup->aup_gids[i] = (gid_t)IXDR_GET_INT32(buf);
}

At first sight, therefore, there does not seem to be any corruption of any kind. However, the check is misleading: one may think that the final buffer is able to store up to 64 GIDs when it is ... not!

While the loop itself is correct, it relies on the assumption that aup points to a buffer big enough to store the entire 'struct area', including one with up to 64 GIDs.

This is not the case, but it is not obvious at first sight because the allocation is performed several functions earlier:

a) Each SVCXPRT object is created within svc_xprt_alloc()

SVCXPRT *
svc_xprt_alloc(void)
{
    [..]
    if ((cred_area = malloc(2*MAX_AUTH_BYTES + RQCRED_SIZE)) == NULL)
        goto err_exit;
    xt->cred_area = cred_area;
    [...]
    return (xprt);
err_exit:
    svc_xprt_free(xprt);
    return (NULL);
}

cred_area is a big buffer of size S = 2*MAX_AUTH_BYTES + RQCRED_SIZE bytes

S = 2*400 + 400 = 1200 bytes

b) When data arrives on the socket (for a specific file descriptor), svc_getreq_common() is called

void
svc_getreq_common(const int fd)
{
    SVCXPRT *xprt;
    enum xprt_stat stat;
    struct rpc_msg *msg;
    struct svc_req *r;
    char *cred_area;
    [...]
    if ((fd >= nsvc_xports) || (xprt = svc_xports[fd]) == NULL) {   // [L1]
        (void) rw_unlock(&svc_fd_lock);
        return;
    }
    [...]
    r = SVCEXT(xprt)->req;   // [L2]
    [...]
    cred_area = SVCEXT(xprt)->cred_area;
    msg->rm_call.cb_cred.oa_base = cred_area;
    msg->rm_call.cb_verf.oa_base = &(cred_area[MAX_AUTH_BYTES]);
    r->rq_clntcred = &(cred_area[2 * MAX_AUTH_BYTES]);   // [L3]

L1: The SVCXPRT object (xprt) is retrieved based on a global array of sockets (svc_xports)

L2: The svc_req object is derived from xprt

L3: r->rq_clntcred is set to &(cred_area[2 * MAX_AUTH_BYTES]) which means that it points to the remaining 400 bytes.

In __svcauth_loopback(), a 'struct area *' pointer is used over rq_clntcred, and therefore over the aforementioned 400-byte buffer. The function copies directly from the network to this buffer. The size of the whole structure (which includes the 64 GIDs) is:

S = sizeof(authsys_parms) + (MAX_MACHINE_NAME+1) * sizeof(char) + NGRPS_LOOPBACK * sizeof(gid_t)
  = (6 * 4) + (255+1)*1 + 64*4 = 536 bytes

This is significantly greater than the 400 bytes allocated, hence the (heap) overflow.
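The arithmetic above can be checked with a short sketch. It assumes that 'struct area' is laid out in the order used by the formula (the authsys_parms header, then the machine name, then the GID array) with no padding, which is our assumption rather than something shown in the code excerpts. The helper name is ours.

```python
SIZEOF_AUTHSYS_PARMS = 6 * 4   # six 32-bit fields, as in the formula above
MAX_MACHINE_NAME = 255
NGRPS_LOOPBACK = 64
SIZEOF_GID = 4
RQCRED_SIZE = 400              # bytes actually available behind rq_clntcred

def area_bytes(gid_len):
    """Bytes of 'struct area' covered when gid_len GIDs are stored.

    Assumes header, machine name and GIDs are contiguous (no padding).
    """
    return SIZEOF_AUTHSYS_PARMS + (MAX_MACHINE_NAME + 1) + gid_len * SIZEOF_GID

full_size = area_bytes(NGRPS_LOOPBACK)
overflow = full_size - RQCRED_SIZE
# Largest gid_len that still fits in the 400 bytes under this layout.
max_safe_gids = (RQCRED_SIZE - area_bytes(0)) // SIZEOF_GID

print(full_size, overflow, max_safe_gids)
```

Under these assumptions the structure exceeds the buffer by 136 bytes, and any gid_len above 30 passes the NGRPS_LOOPBACK check while writing past the end of cred_area.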

How the vulnerability was fixed

Oracle chose to fix this vulnerability by extending the size of cred_area.

This is the disassembly of svc_xprt_alloc() on S10u8 (unpatched version):

.text:00044166 push 24h ; '$' ; size
.text:00044168 call _malloc
.text:0004416D add esp, 4
.text:00044170 test eax, eax
.text:00044172 jz short loc_441A7
.text:00044174 mov [edi+0Ch], eax
.text:00044177 push 4B0h ; size <-- 1200 bytes
.text:0004417C call _malloc
.text:00044181 add esp, 4
.text:00044184 test eax, eax

And we can compare it with the disassembly of svc_xprt_alloc() on S10u11 (patched version):

.text:0004464E push 24h ; '$' ; size
.text:00044650 call _malloc
.text:00044655 add esp, 4
.text:00044658 test eax, eax
.text:0004465A jz short loc_4468F
.text:0004465C mov [edi+0Ch], eax
.text:0004465F push 538h ; size <-- 1336 bytes
.text:00044664 call _malloc
.text:00044669 add esp, 4
.text:0004466C test eax, eax

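Once the two MAX_AUTH_BYTES regions (2*400 = 800 bytes) are subtracted from the new 1336-byte allocation, the buffer used to copy the struct area is now 536 bytes long, which is exactly what was necessary according to our calculation.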
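A quick way to compare the two malloc() sizes is to subtract the space reserved for the credential and verifier bodies. This is only a sanity check of the numbers read from the disassembly; the helper name is ours.

```python
MAX_AUTH_BYTES = 400

UNPATCHED_ALLOC = 0x4B0   # size pushed before malloc() on S10u8
PATCHED_ALLOC = 0x538     # size pushed before malloc() on S10u11

def clntcred_room(alloc_size):
    """Bytes left for rq_clntcred after the two MAX_AUTH_BYTES regions."""
    return alloc_size - 2 * MAX_AUTH_BYTES

for label, size in (("S10u8", UNPATCHED_ALLOC), ("S10u11", PATCHED_ALLOC)):
    print(label, size, clntcred_room(size))
```

The difference between the two allocations (136 bytes) matches the gap between the 536-byte structure and the original 400-byte region.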

2. A few thoughts about the exploitability on Solaris 10

Exploiting a bug on Solaris 10 is much easier than it is on Linux for several very specific reasons:

There is no “real” ASLR. Solaris 10 does support the NX bit preventing the execution of code in the data regions (stack, heap, data section, etc) but without any strong ASLR the mechanism is only partially effective as ret2libc payloads can be written for identified targets.
Kind of like service packs in Windows, Solaris has the concept of periodic updates (numbered from u1 to u11) [SOL]. Some changes can clearly be observed between binaries of different updates, yet it is not an absolute rule that a given binary will be recompiled. In particular, some of them (especially among the RPC services) have either not changed at all or not too often from u1 to u10. Two examples of such binaries would be “/usr/sbin/rpc.bootparamd” and “/usr/lib/netsvc/rusers/rpc.rusersd”. This is interesting because for these services a generic ret2libc payload can be written, modulo of course heap and library addresses. For other services, assuming an exploit writer who has enough resources (both in time and .iso files), it is still very possible to build a short list of ret2libc (ROP) payloads for each specific service.
The dynamic linker (/lib/ld.so.1) uses “rwx” pages (for unknown reasons). If one can copy their shellcode into these pages it is instant game over. This also means that the ret2libc primitive can be extremely small and thus extremely easy to generate.

(Almost) no ASLR and a small set of binaries make the exploitation not only possible but relatively simple from a theoretical point of view. Practically speaking though, there are several factors that need to be taken into account:

Depending on the service that you attack, the update, and how the service is managed (svcadm, inetadm, cli, etc.), you may not have an unlimited number of attempts to exploit it. In the worst case you may render a service unusable and trigger logs that are very easy to find (not to mention the corresponding core file):

bash-3.2# cat /var/adm/messages|grep maintenance
Jun 13 11:42:34 Host-001 svc.startd[8]: [ID 748625 daemon.error] network/rpc/bootparams:default failed repeatedly: transitioned to maintenance (see 'svcs -xv' for details)
Jun 14 08:10:28 Host-002 svc.startd[8]: [ID 748625 daemon.error] network/rpc/bootparams:default failed repeatedly: transitioned to maintenance (see 'svcs -xv' for details)
Jun 14 08:14:08 Host-002 svc.startd[8]: [ID 748625 daemon.error] network/rpc/bootparams:default failed repeatedly: transitioned to maintenance (see 'svcs -xv' for details)
Jun 14 08:34:33 Host-002 svc.startd[8]: [ID 748625 daemon.error] network/rpc/bootparams:default failed repeatedly: transitioned to maintenance (see 'svcs -xv' for details)
bash-3.2# svcs -xv
svc:/application/print/server:default (LP print server)
State: disabled since Wed Jun 14 08:15:58 2017
Reason: Disabled by an administrator.
See: http://sun.com/msg/SMF-8000-05
See: man -M /usr/share/man -s 1M lpsched
Impact: 2 dependent services are not running:
svc:/application/print/rfc1179:default
svc:/application/print/ipp-listener:default
svc:/network/rpc/bootparams:default (boot parameter server)
State: maintenance since Wed Jun 14 08:34:33 2017
Reason: Restarting too quickly.
See: http://sun.com/msg/SMF-8000-L5
See: man -M /usr/share/man -s 1M rpc.bootparamd
See: /var/svc/log/network-rpc-bootparams:default.log
Impact: This service is not running.
bash-3.2#

While there is no specific “security” ASLR per se, there is still a “natural” ASLR. The heap may (or should) be different from one update to another. This is essentially because, while the main binary of the service does not change (much), the libraries it is linked against do. Additionally, two identical OS installations may load libraries at different addresses. Empirical results suggest a natural entropy of at most 8 bits (probably a lot less in practice). This entropy seems to be a per-boot factor, which means that for a given boot, two processes (executed from the same binary) will have the same layout. This can easily be seen by comparing the core files produced during crashes:

bash-3.00# pmap core*|grep ld|grep 160K
D13C4000 160K r-x-- /lib/ld.so.1
D0FC4000 160K r-x-- /lib/ld.so.1
D13C4000 160K r-x-- /lib/ld.so.1
D0FC4000 160K r-x-- /lib/ld.so.1
D13C4000 160K r-x-- /lib/ld.so.1
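When many core files are available, the comparison can be automated. The sketch below simply groups the base addresses found in the pmap output shown above; the parsing assumes the four-column format of these lines, and the function name is ours.

```python
from collections import Counter

# Lines copied from the pmap output above.
PMAP_OUTPUT = """\
D13C4000 160K r-x-- /lib/ld.so.1
D0FC4000 160K r-x-- /lib/ld.so.1
D13C4000 160K r-x-- /lib/ld.so.1
D0FC4000 160K r-x-- /lib/ld.so.1
D13C4000 160K r-x-- /lib/ld.so.1
"""

def base_addresses(text, library="/lib/ld.so.1"):
    """Return the base address of every mapping of `library` in the text."""
    bases = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 4 and fields[3] == library:
            bases.append(int(fields[0], 16))
    return bases

counts = Counter(base_addresses(PMAP_OUTPUT))
for base, seen in sorted(counts.items()):
    print(f"{base:#010x} seen {seen} time(s)")

# Distance between the lowest and highest observed base address.
delta = max(counts) - min(counts)
print(f"delta = {delta:#x}")
```

On this sample only two distinct bases show up, 0x400000 bytes apart, which is consistent with a small amount of per-boot variation rather than a real ASLR.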

Finally, depending on “when” you attack the service, the heap might not be in a usable state. Specifically, one-shotting a service that has been used before is probably very difficult. Fortunately, if you can reliably crash it you will be fine, as your exploit should be the first to interact with the service once its memory has been reset. Of course, if you choose to target a heavily used RPC service, such as one related to file sharing, this may not hold.
The main binary, though possibly quite constant across updates, will be small in size. This means that a ret2libc payload created using code borrowed from the binary (the so-called gadgets) could be quite limited in functionality. This ELF binary will always be loaded at 0x08050000.
The architecture of the target is unknown. Interestingly enough, whether you run a 32-bit or a 64-bit environment, your service is still 32-bit. That being said, the address space will change depending on the kernel architecture. This is documented in the excellent [CORE].

To sum things up, for a given service (bootparams, metad, metamedd, etc.) writing a payload is easy modulo the knowledge of a few parameters (the binary version, the architecture, the update version, the effect of the current boot on the ASLR, etc.). Guessing is certainly possible, but the cost is necessarily one crash per attempt (mandatory heap reset), which means at least a dozen crashes.

There are of course many ways to reduce this number quite effectively, but since we face services that may stop functioning at some point, this is not acceptable. For these reasons we explored the idea of leaking information out of the process to create more reliable primitives. We will now detail the strategy deployed for the exploit module released within Canvas 7.14 targeting bootparams.

Note: Exploiting other services requires slightly different strategies. We applied some of them to increase the number of targets (especially when bootparams is not available) in the updated version of the module (Canvas 7.15). We won’t discuss this in this blogpost.

3. Information Leaking

The overflow primitive allows us to overwrite adjacent objects which, depending on the heap layout, may include pointers, file descriptors, etc. It is probably possible to generate an information leak this way, but that comes with several difficulties:

The overflow primitive only allows overwriting the next 4*n bytes, where n is fixed. Since pointers are 4-byte aligned, partial pointer overwrites are not directly doable, which implies some guessing, the corruption of another object type, etc.
Being able to overwrite a specific object requires a degree of heap control ranging from painful to undoable (depending on the allocator, your degree of control, etc.).

On a modern OS, the allocator is usually heavily protected against trivial manipulation of chunk metadata, which is why exploit writers often favor adjacent object overflows with this kind of primitive. Fortunately this is not the case with Solaris 10 (but may be with Solaris 11?), so it is doable to transform our adjacent overwrite into a write4 primitive, which is a much better weapon, especially with a poor ASLR, and saves us a lot of effort.

(a) A few notes about how to achieve the write4

Achieving the write4 is obviously completely dependent on the underlying allocator which, on Solaris 10, is based on self-adjusting binary trees. The main ideas behind it can be found in [SABT]. We assume that the reader is already very familiar with the excellent [P57OUAF]. Our write4 primitive is a direct application of the technique described in that paper.

However, for the sake of completeness, let us provide a few additional details:

As explained in [P57OUAF], the chunks have fields of type WORD, and a WORD is itself a union. The size of this object is controlled by the ALIGN macro, which is 8. This is important because most of the fields are 32-bit pointers: in memory there is a 4-byte gap (filled with 0) between two fields instead of immediately consecutive pointers. The reason behind such a choice is unclear, but it needs to be taken into account during the exploitation attempt, as pointed out in [P57OUAF]. We recommend reading the assembly code rather than the source code.
A call to free() does not trigger an actual chunk manipulation. Instead, the chunk C is only placed in a circular list, flist. Later, when (or if) enough allocations and deallocations are made, realfree() is called on C. If the header of C was corrupted, an unlink-like operation becomes user controlled and a write4 primitive is achieved. It is essential to understand this point, as the write4 is absolutely not synchronous by default! This can however be fixed:
The unlink() primitive exploited within realfree() will perform two write operations. By design overwriting at address X with DWORD Y is only possible if Y (modulo an offset) is a valid pointer to writable memory. Nothing new here. To avoid problems we use the (universal) address 0x08063000 (.got) when we can, heap & stack addresses otherwise.
We do overwrite the size field of the adjacent chunk as proposed in [P57OUAF]; however, to avoid too much heap corruption, we chose to create a fake chunk before the corrupted one. Since the (overwritten) size field is used as an offset without any sanity check, we can set it to a negative value (0xffffffc0 | 1). Below is the fake chunk located a little bit before the overflown chunk:

pattern += struct.pack('>L', 0x200) # t_s +0