From: Rune Kleveland <rune.kleveland@infomedia.dk>
To: "Eric W. Biederman" <ebiederm@xmission.com>, Yu Zhao <yuzhao@google.com>
Cc: Alexey Gladkov <legion@kernel.org>,
Jordan Glover <Golden_Miller83@protonmail.ch>,
LKML <linux-kernel@vger.kernel.org>,
linux-mm@kvack.org, containers@lists.linux-foundation.org
Subject: Re: linux 5.14.3: free_user_ns causes NULL pointer dereference
Date: Sun, 10 Oct 2021 10:59:10 +0200
Message-ID: <ccbccf82-dc50-00b2-1cfd-3da5e2c81dbf@infomedia.dk>
In-Reply-To: <87v92cvhbf.fsf@disp2133>
Hi!
Just wanted to let you know that I still get these on the stock Fedora
kernel 5.14.10 on the IBM blades. But it took 10 hours before the first
server crashed. The other 4 have been running fine for 15 hours now. So for
me it seems more stable now, but that could just be a coincidence.
Best regards,
Rune
------------[ cut here ]------------
kernel BUG at mm/slub.c:321!
invalid opcode: 0000 [#1] SMP PTI
CPU: 22 PID: 1838853 Comm: python3 Not tainted 5.14.10-200.fc34.x86_64 #1
Hardware name: IBM BladeCenter HS22 -[7870TKN]-/68Y8161, BIOS
-[P9E164CUS-1.28]- 04/17/2018
RIP: 0010:__slab_free+0x245/0x4a0
Code: 0f b6 5c 24 1b 44 8b 44 24 1c 48 89 44 24 08 48 8b 54 24 20 4c 8b
4c 24 28 e9 8a fe ff ff 41 f7 45 08 00 0d 21 00 75 98 eb 8d <0f> 0b 49
3b 54 24 28 0f 85 53 ff ff ff 49 8b 44 24 08 4>
RSP: 0018:ffffb71dcfd6fda0 EFLAGS: 00010246
RAX: ffff9c5480d35860 RBX: ffff9c5480d35800 RCX: ffff9c5480d35800
RDX: 00000000802a0029 RSI: ffffeb41da034d00 RDI: ffff9c4f00042800
RBP: ffffb71dcfd6fe50 R08: 0000000000000001 R09: ffffffff9210b6a5
R10: ffff9c548102e000 R11: 0000000062667658 R12: ffffeb41da034d00
R13: ffff9c4f00042800 R14: ffff9c5480d35800 R15: ffff9c5480d35800
FS: 00007f7a89765740(0000) GS:ffff9c6c1fc80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7a7ad7f4b0 CR3: 0000000564af4002 CR4: 00000000000206e0
Call Trace:
? filename_lookup+0x135/0x1b0
? put_ucounts+0x65/0x70
kfree+0x369/0x3c0
put_ucounts+0x65/0x70
put_cred_rcu+0x70/0xd0
do_faccessat+0x113/0x240
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f7a899cc44b
Code: 77 05 c3 0f 1f 40 00 48 8b 15 29 1a 0d 00 f7 d8 64 89 02 48 c7 c0
ff ff ff ff c3 0f 1f 40 00 f3 0f 1e fa b8 15 00 00 00 0f 05 <48> 3d 00
f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 f9 19 0>
RSP: 002b:00007ffd01fa9ce8 EFLAGS: 00000202 ORIG_RAX: 0000000000000015
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7a899cc44b
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00007f79a96d6e10
RBP: 0000000000000001 R08: 0000000000000000 R09: 00007f7a7c1fb930
R10: 00007f79a96d6000 R11: 0000000000000202 R12: 00007ffd01fa9d00
R13: 0000000000000001 R14: 0000556841045c90 R15: 00000000ffffff9c
Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs
lockd grace sunrpc fscache netfs nft_fib_inet nft_fib_ipv4 nft_fib_ipv6
nft_fib nft_reject_inet nf_reject_ipv4 nf_rejec>
---[ end trace 0a81b150eacde1d5 ]---
RIP: 0010:__slab_free+0x245/0x4a0
Code: 0f b6 5c 24 1b 44 8b 44 24 1c 48 89 44 24 08 48 8b 54 24 20 4c 8b
4c 24 28 e9 8a fe ff ff 41 f7 45 08 00 0d 21 00 75 98 eb 8d <0f> 0b 49
3b 54 24 28 0f 85 53 ff ff ff 49 8b 44 24 08 4>
RSP: 0018:ffffb71dcfd6fda0 EFLAGS: 00010246
RAX: ffff9c5480d35860 RBX: ffff9c5480d35800 RCX: ffff9c5480d35800
RDX: 00000000802a0029 RSI: ffffeb41da034d00 RDI: ffff9c4f00042800
RBP: ffffb71dcfd6fe50 R08: 0000000000000001 R09: ffffffff9210b6a5
R10: ffff9c548102e000 R11: 0000000062667658 R12: ffffeb41da034d00
R13: ffff9c4f00042800 R14: ffff9c5480d35800 R15: ffff9c5480d35800
FS: 00007f7a89765740(0000) GS:ffff9c6c1fc80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7a7ad7f4b0 CR3: 0000000564af4002 CR4: 00000000000206e0
------------[ cut here ]------------
On 04/10/2021 19:19, Eric W. Biederman wrote:
> ebiederm@xmission.com (Eric W. Biederman) writes:
>
>> Adding Rune Kleveland to the discussion as he also seems to have
>> reproduced the issue.
>>
>> Alex and I have been staring at the code and the reports and this
>> bug is hiding well. Here is what we have figured out so far.
>>
>> Both the warning from free_user_ns calling dec_ucount that Jordan Glover
>> reported and the KASAN error that Yu Zhao has reported appear to have
>> the same cause: using a ucounts structure after it has been freed and
>> reallocated as something else.
>>
>> I have just skimmed through the recent report from Rune Kleveland
>> and it appears also to be a use after free. Especially since the
>> second failure in the log is slub complaining about trying to free
>> the ucounts data structure.
>>
>> We looked through the users of put_ucounts and we don't see any obvious
>> buggy users that would be freeing the data structure early.
>>
>> Alex has tried to reproduce this and so far is not having any luck.
>> Folks can you tell what compiler versions you are using and share your
>> kernel config with us? That might help.
>>
>> The little debug diff below is my guess of what is happening. If the
>> folks who can reproduce this issue can try the patch below and let me
>> know if the warnings fire that would be appreciated. It is still not
>> enough to track down the bug but at least it will confirm my current
>> hypothesis about how things look before there is a use of memory after
>> it is freed.
> Bah. Scratch that test patch. I just double checked myself and
> cred->ucounts and cred->user_ns->ucounts should never be equal,
> as the user namespace is counted in its parent user namespace.
>
> That observation now tells me I have a parent user namespace that went
> corrupt.
>
> Back to the drawing board.
>
>
>> Thank you,
>> Eric
>>
>> diff --git a/kernel/cred.c b/kernel/cred.c
>> index f784e08c2fbd..e7ffaa3cf5a6 100644
>> --- a/kernel/cred.c
>> +++ b/kernel/cred.c
>> @@ -120,6 +120,12 @@ static void put_cred_rcu(struct rcu_head *rcu)
>> if (cred->group_info)
>> put_group_info(cred->group_info);
>> free_uid(cred->user);
>> +#if 1
>> + if ((cred->ucounts == cred->user_ns->ucounts) &&
>> + (atomic_read(&cred->ucounts->count) == 1)) {
>> + WARN_ONCE(1, "put_cred_rcu: ucount count 1\n");
>> + }
>> +#endif
>> if (cred->ucounts)
>> put_ucounts(cred->ucounts);
>> put_user_ns(cred->user_ns);
>> diff --git a/kernel/exit.c b/kernel/exit.c
>> index 91a43e57a32e..60fd88b34c1a 100644
>> --- a/kernel/exit.c
>> +++ b/kernel/exit.c
>> @@ -743,6 +743,13 @@ void __noreturn do_exit(long code)
>> if (unlikely(!tsk->pid))
>> panic("Attempted to kill the idle task!");
>>
>> +#if 1
>> + if ((tsk->cred->ucounts == tsk->cred->user_ns->ucounts) &&
>> + (atomic_read(&tsk->cred->ucounts->count) == 1)) {
>> + WARN_ONCE(1, "do_exit: ucount count 1\n");
>> + }
>> +#endif
>> +
>> /*
>> * If do_exit is called because this processes oopsed, it's possible
>> * that get_fs() was left as KERNEL_DS, so reset it to USER_DS before
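
For readers following the thread, here is a minimal userspace sketch of the general failure class discussed above: a reference-counted object reaching a count of zero while another holder still has a pointer to it. It is not the actual bug, which had not been located at this point in the thread, and it is not kernel code. The names model_ucounts, model_get and model_put are hypothetical stand-ins, and the object is flagged as freed instead of being passed to free() so that the model stays safe to inspect.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a refcounted kernel object. */
struct model_ucounts {
	int count;	/* number of outstanding references */
	bool freed;	/* set instead of calling free(), so the model stays inspectable */
};

/* Takes one additional reference. */
static void model_get(struct model_ucounts *uc)
{
	uc->count++;
}

/* Drops one reference and marks the object freed when the last one goes away. */
static void model_put(struct model_ucounts *uc)
{
	assert(uc->count > 0);	/* a put without any reference left is already a bug */
	if (--uc->count == 0)
		uc->freed = true;
}

/*
 * Balanced case: holder B takes and drops its own reference.
 * Returns true if the object is still alive while holder A keeps its reference.
 */
static bool balanced_puts_keep_object_alive(void)
{
	struct model_ucounts uc = { .count = 1, .freed = false };	/* holder A */

	model_get(&uc);	/* holder B: count == 2 */
	model_put(&uc);	/* holder B done: count == 1 */

	return !uc.freed;
}

/*
 * Unbalanced case: one code path performs a put it never paired with a get.
 * Returns true if holder A is left pointing at an object marked freed.
 */
static bool unbalanced_put_leaves_dangling_holder(void)
{
	struct model_ucounts uc = { .count = 1, .freed = false };	/* holder A */

	model_get(&uc);	/* holder B: count == 2 */
	model_put(&uc);	/* holder B done: count == 1 */
	model_put(&uc);	/* extra, unpaired put: count == 0, object freed */

	/* Holder A never released its reference, yet the object is gone. */
	return uc.freed;
}
```

In the unbalanced case the remaining holder's eventual put would act on memory that an allocator may already have handed out for something else. That is the kind of state in which a free-time consistency check, such as the one that fired in __slab_free in the trace above, can trip.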