From: Lorenz Bauer <lmb@cloudflare.com>
To: Jakub Sitnicki <jakub@cloudflare.com>
Cc: bpf <bpf@vger.kernel.org>, Networking <netdev@vger.kernel.org>,
kernel-team <kernel-team@cloudflare.com>
Subject: Re: [PATCH bpf-next] bpf, netns: Fix use-after-free in pernet pre_exit callback
Date: Wed, 1 Jul 2020 09:21:19 +0100 [thread overview]
Message-ID: <CACAyw98SeKeZemGcPbmMwokfsSd+eSLgPaeRnOYRo3b=Rvudjg@mail.gmail.com> (raw)
In-Reply-To: <20200630164541.1329993-1-jakub@cloudflare.com>
On Tue, 30 Jun 2020 at 17:45, Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> Iterating over BPF links attached to network namespace in pre_exit hook is
> not safe, even if there is just one. Once link gets auto-detached, that is
> its back-pointer to net object is set to NULL, the link can be released and
> freed without waiting on netns_bpf_mutex, effectively causing the list
> element we are operating on to be freed.
>
> This leads to use-after-free when trying to access the next element on the
> list, as reported by KASAN. Bug can be triggered by destroying a network
> namespace, while also releasing a link attached to this network namespace.
Acked-by: Lorenz Bauer <lmb@cloudflare.com>
>
> | ==================================================================
> | BUG: KASAN: use-after-free in netns_bpf_pernet_pre_exit+0xd9/0x130
> | Read of size 8 at addr ffff888119e0d778 by task kworker/u8:2/177
> |
> | CPU: 3 PID: 177 Comm: kworker/u8:2 Not tainted 5.8.0-rc1-00197-ga0c04c9d1008-dirty #776
> | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
> | Workqueue: netns cleanup_net
> | Call Trace:
> | dump_stack+0x9e/0xe0
> | print_address_description.constprop.0+0x3a/0x60
> | ? netns_bpf_pernet_pre_exit+0xd9/0x130
> | kasan_report.cold+0x1f/0x40
> | ? netns_bpf_pernet_pre_exit+0xd9/0x130
> | netns_bpf_pernet_pre_exit+0xd9/0x130
> | cleanup_net+0x30b/0x5b0
> | ? unregister_pernet_device+0x50/0x50
> | ? rcu_read_lock_bh_held+0xb0/0xb0
> | ? _raw_spin_unlock_irq+0x24/0x50
> | process_one_work+0x4d1/0xa10
> | ? lock_release+0x3e0/0x3e0
> | ? pwq_dec_nr_in_flight+0x110/0x110
> | ? rwlock_bug.part.0+0x60/0x60
> | worker_thread+0x7a/0x5c0
> | ? process_one_work+0xa10/0xa10
> | kthread+0x1e3/0x240
> | ? kthread_create_on_node+0xd0/0xd0
> | ret_from_fork+0x1f/0x30
> |
> | Allocated by task 280:
> | save_stack+0x1b/0x40
> | __kasan_kmalloc.constprop.0+0xc2/0xd0
> | netns_bpf_link_create+0xfe/0x650
> | __do_sys_bpf+0x153a/0x2a50
> | do_syscall_64+0x59/0x300
> | entry_SYSCALL_64_after_hwframe+0x44/0xa9
> |
> | Freed by task 198:
> | save_stack+0x1b/0x40
> | __kasan_slab_free+0x12f/0x180
> | kfree+0xed/0x350
> | process_one_work+0x4d1/0xa10
> | worker_thread+0x7a/0x5c0
> | kthread+0x1e3/0x240
> | ret_from_fork+0x1f/0x30
> |
> | The buggy address belongs to the object at ffff888119e0d700
> | which belongs to the cache kmalloc-192 of size 192
> | The buggy address is located 120 bytes inside of
> | 192-byte region [ffff888119e0d700, ffff888119e0d7c0)
> | The buggy address belongs to the page:
> | page:ffffea0004678340 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0
> | flags: 0x2fffe0000000200(slab)
> | raw: 02fffe0000000200 ffffea00045ba8c0 0000000600000006 ffff88811a80ea80
> | raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
> | page dumped because: kasan: bad access detected
> |
> | Memory state around the buggy address:
> | ffff888119e0d600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> | ffff888119e0d680: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
> | >ffff888119e0d700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> | ^
> | ffff888119e0d780: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
> | ffff888119e0d800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> | ==================================================================
>
> Remove the "fast-path" for releasing a link that got auto-detached by a
> dying network namespace to fix it. This way as long as link is on the list
> and netns_bpf mutex is held, we have a guarantee that link memory can be
> accessed.
>
> An alternative way to fix this issue would be to safely iterate over the
> list of links and ensure there is no access to link object after detaching
> it. But, at the moment, optimizing synchronization overhead on link release
> without a workload in mind seems like an overkill.
>
> Fixes: 7233adc8b69b ("bpf, netns: Keep a list of attached bpf_link's")
> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> ---
> kernel/bpf/net_namespace.c | 10 +++-------
> 1 file changed, 3 insertions(+), 7 deletions(-)
>
> diff --git a/kernel/bpf/net_namespace.c b/kernel/bpf/net_namespace.c
> index 7a34a8caf954..247543380fa6 100644
> --- a/kernel/bpf/net_namespace.c
> +++ b/kernel/bpf/net_namespace.c
> @@ -43,15 +43,11 @@ static void bpf_netns_link_release(struct bpf_link *link)
> enum netns_bpf_attach_type type = net_link->netns_type;
> struct net *net;
>
> - /* Link auto-detached by dying netns. */
> - if (!net_link->net)
> - return;
> -
> mutex_lock(&netns_bpf_mutex);
>
> - /* Recheck after potential sleep. We can race with cleanup_net
> - * here, but if we see a non-NULL struct net pointer pre_exit
> - * has not happened yet and will block on netns_bpf_mutex.
> + /* We can race with cleanup_net, but if we see a non-NULL
> + * struct net pointer, pre_exit has not run yet and wait for
> + * netns_bpf_mutex.
> */
> net = net_link->net;
> if (!net)
> --
> 2.25.4
>
--
Lorenz Bauer | Systems Engineer
6th Floor, County Hall/The Riverside Building, SE1 7PB, UK
www.cloudflare.com
prev parent reply other threads:[~2020-07-01 8:21 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-06-30 16:45 [PATCH bpf-next] bpf, netns: Fix use-after-free in pernet pre_exit callback Jakub Sitnicki
2020-06-30 17:43 ` Yonghong Song
2020-06-30 18:56 ` Andrii Nakryiko
2020-07-01 8:21 ` Lorenz Bauer [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CACAyw98SeKeZemGcPbmMwokfsSd+eSLgPaeRnOYRo3b=Rvudjg@mail.gmail.com' \
--to=lmb@cloudflare.com \
--cc=bpf@vger.kernel.org \
--cc=jakub@cloudflare.com \
--cc=kernel-team@cloudflare.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).