From: Rune Kleveland <rune.kleveland@infomedia.dk>
To: "Eric W. Biederman" <ebiederm@xmission.com>, Yu Zhao <yuzhao@google.com>
Cc: Alexey Gladkov <legion@kernel.org>,
	Jordan Glover <Golden_Miller83@protonmail.ch>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, containers@lists.linux-foundation.org
Subject: Re: linux 5.14.3: free_user_ns causes NULL pointer dereference
Date: Sun, 10 Oct 2021 10:59:10 +0200
Message-ID: <ccbccf82-dc50-00b2-1cfd-3da5e2c81dbf@infomedia.dk>
In-Reply-To: <87v92cvhbf.fsf@disp2133>

Hi!

Just wanted to let you know that I still get these on the stock Fedora
kernel 5.14.10 on the IBM blades. This time it took 10 hours before the
first server crashed, and the other 4 are still running fine after 15
hours. So it seems more stable for me now, but that could just be a
coincidence.

Best regards,

Rune

------------[ cut here ]------------
kernel BUG at mm/slub.c:321!
invalid opcode: 0000 [#1] SMP PTI
CPU: 22 PID: 1838853 Comm: python3 Not tainted 5.14.10-200.fc34.x86_64 #1
Hardware name: IBM BladeCenter HS22 -[7870TKN]-/68Y8161, BIOS -[P9E164CUS-1.28]- 04/17/2018
RIP: 0010:__slab_free+0x245/0x4a0
Code: 0f b6 5c 24 1b 44 8b 44 24 1c 48 89 44 24 08 48 8b 54 24 20 4c 8b 4c 24 28 e9 8a fe ff ff 41 f7 45 08 00 0d 21 00 75 98 eb 8d <0f> 0b 49 3b 54 24 28 0f 85 53 ff ff ff 49 8b 44 24 08 4>
RSP: 0018:ffffb71dcfd6fda0 EFLAGS: 00010246
RAX: ffff9c5480d35860 RBX: ffff9c5480d35800 RCX: ffff9c5480d35800
RDX: 00000000802a0029 RSI: ffffeb41da034d00 RDI: ffff9c4f00042800
RBP: ffffb71dcfd6fe50 R08: 0000000000000001 R09: ffffffff9210b6a5
R10: ffff9c548102e000 R11: 0000000062667658 R12: ffffeb41da034d00
R13: ffff9c4f00042800 R14: ffff9c5480d35800 R15: ffff9c5480d35800
FS:  00007f7a89765740(0000) GS:ffff9c6c1fc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7a7ad7f4b0 CR3: 0000000564af4002 CR4: 00000000000206e0
Call Trace:
  ? filename_lookup+0x135/0x1b0
  ? put_ucounts+0x65/0x70
  kfree+0x369/0x3c0
  put_ucounts+0x65/0x70
  put_cred_rcu+0x70/0xd0
  do_faccessat+0x113/0x240
  do_syscall_64+0x3b/0x90
  entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f7a899cc44b
Code: 77 05 c3 0f 1f 40 00 48 8b 15 29 1a 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff c3 0f 1f 40 00 f3 0f 1e fa b8 15 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 f9 19 0>
RSP: 002b:00007ffd01fa9ce8 EFLAGS: 00000202 ORIG_RAX: 0000000000000015
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7a899cc44b
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00007f79a96d6e10
RBP: 0000000000000001 R08: 0000000000000000 R09: 00007f7a7c1fb930
R10: 00007f79a96d6000 R11: 0000000000000202 R12: 00007ffd01fa9d00
R13: 0000000000000001 R14: 0000556841045c90 R15: 00000000ffffff9c
Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache netfs nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_rejec>
---[ end trace 0a81b150eacde1d5 ]---
RIP: 0010:__slab_free+0x245/0x4a0
Code: 0f b6 5c 24 1b 44 8b 44 24 1c 48 89 44 24 08 48 8b 54 24 20 4c 8b 4c 24 28 e9 8a fe ff ff 41 f7 45 08 00 0d 21 00 75 98 eb 8d <0f> 0b 49 3b 54 24 28 0f 85 53 ff ff ff 49 8b 44 24 08 4>
RSP: 0018:ffffb71dcfd6fda0 EFLAGS: 00010246
RAX: ffff9c5480d35860 RBX: ffff9c5480d35800 RCX: ffff9c5480d35800
RDX: 00000000802a0029 RSI: ffffeb41da034d00 RDI: ffff9c4f00042800
RBP: ffffb71dcfd6fe50 R08: 0000000000000001 R09: ffffffff9210b6a5
R10: ffff9c548102e000 R11: 0000000062667658 R12: ffffeb41da034d00
R13: ffff9c4f00042800 R14: ffff9c5480d35800 R15: ffff9c5480d35800
FS:  00007f7a89765740(0000) GS:ffff9c6c1fc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7a7ad7f4b0 CR3: 0000000564af4002 CR4: 00000000000206e0
------------[ cut here ]------------
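
For readers following the trace above (put_cred_rcu -> put_ucounts -> kfree -> BUG in __slab_free), here is a minimal userspace sketch of how one unbalanced put on a refcounted object can end in a free of memory that has already been freed. Everything in it is hypothetical: `toy_counts`, `toy_get` and `toy_put` are illustration-only names, not the kernel's `struct ucounts` API, and the sketch sets a `freed` flag instead of calling free() so the state stays inspectable. It does not claim to be the actual bug, only the general shape of a refcount imbalance.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object; NOT the kernel's struct ucounts. */
struct toy_counts {
	int count;	/* number of live references */
	bool freed;	/* set instead of calling free(), so the model stays inspectable */
	int bad_puts;	/* puts that arrived after the object was already freed */
};

/* The creator holds the first reference. */
static void toy_init(struct toy_counts *c)
{
	c->count = 1;
	c->freed = false;
	c->bad_puts = 0;
}

/* Take a reference. Returns false if the object was already freed. */
static bool toy_get(struct toy_counts *c)
{
	if (c->freed)
		return false;
	c->count++;
	return true;
}

/* Drop a reference; the last put "frees" the object. */
static void toy_put(struct toy_counts *c)
{
	if (c->freed) {
		/* With a real allocator this would be a second free of the same memory. */
		c->bad_puts++;
		return;
	}
	assert(c->count > 0);
	if (--c->count == 0)
		c->freed = true;
}

/*
 * Two legitimate holders, plus one code path that drops a reference it
 * never took. Returns how many puts hit the object after it was freed.
 */
static int toy_unbalanced_scenario(void)
{
	struct toy_counts c;

	toy_init(&c);	/* holder A: count = 1 */
	toy_get(&c);	/* holder B: count = 2 */
	toy_put(&c);	/* buggy path: extra put, count = 1 */
	toy_put(&c);	/* holder A is done: count = 0, object freed */
	toy_put(&c);	/* holder B is done: object is already gone */
	return c.bad_puts;
}
```

In this model the extra put is harmless at the moment it happens; the damage only shows up later, when the last legitimate holder drops its reference. That delay would be consistent with a crash that takes hours to appear, though the sketch says nothing about where the real imbalance is.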

On 04/10/2021 19:19, Eric W. Biederman wrote:
> ebiederm@xmission.com (Eric W. Biederman) writes:
>
>> Adding Rune Kleveland to the discussion as he also seems to have
>> reproduced the issue.
>>
>> Alex and I have been staring at the code and the reports and this
>> bug is hiding well.  Here is what we have figured out so far.
>>
>> Both the warning from free_user_ns calling dec_ucount that Jordan Glover
>> reported and the KASAN error that Yu Zhao has reported appear to have
>> the same cause: using a ucounts structure after it has been freed and
>> reallocated as something else.
>>
>> I have just skimmed through the recent report from Rune Kleveland
>> and it appears also to be a use after free.  Especially since the
>> second failure in the log is slub complaining about trying to free
>> the ucounts data structure.
>>
>> We looked through the users of put_ucounts and we don't see any obvious
>> buggy users that would be freeing the data structure early.
>>
>> Alex has tried to reproduce this but so far is not having any luck.
>> Folks can you tell what compiler versions you are using and share your
>> kernel config with us?  That might help.
>>
>> The little debug diff below is my guess of what is happening.  If the
>> folks who can reproduce this issue can try the patch below and let me
>> know if the warnings fire that would be appreciated.  It is still not
>> enough to track down the bug but at least it will confirm my current
>> hypothesis about how things look before there is a use of memory after
>> it is freed.
> Bah.  Scratch that test patch.  I just double checked myself and
> cred->ucounts and cred->user_ns->ucounts should never be equal,
> as the user namespace is counted in its parent user namespace.
>
> That observation now tells me I have a parent user namespace that went
> corrupt.
>
> Back to the drawing board.
>
>
>> Thank you,
>> Eric
>>
>> diff --git a/kernel/cred.c b/kernel/cred.c
>> index f784e08c2fbd..e7ffaa3cf5a6 100644
>> --- a/kernel/cred.c
>> +++ b/kernel/cred.c
>> @@ -120,6 +120,12 @@ static void put_cred_rcu(struct rcu_head *rcu)
>>   	if (cred->group_info)
>>   		put_group_info(cred->group_info);
>>   	free_uid(cred->user);
>> +#if 1
>> +	if ((cred->ucounts == cred->user_ns->ucounts) &&
>> +	    (atomic_read(&cred->ucounts->count) == 1)) {
>> +		WARN_ONCE(1, "put_cred_rcu: ucount count 1\n");
>> +	}
>> +#endif
>>   	if (cred->ucounts)
>>   		put_ucounts(cred->ucounts);
>>   	put_user_ns(cred->user_ns);
>> diff --git a/kernel/exit.c b/kernel/exit.c
>> index 91a43e57a32e..60fd88b34c1a 100644
>> --- a/kernel/exit.c
>> +++ b/kernel/exit.c
>> @@ -743,6 +743,13 @@ void __noreturn do_exit(long code)
>>   	if (unlikely(!tsk->pid))
>>   		panic("Attempted to kill the idle task!");
>>   
>> +#if 1
>> +	if ((tsk->cred->ucounts == tsk->cred->user_ns->ucounts) &&
>> +	    (atomic_read(&tsk->cred->ucounts->count) == 1)) {
>> +		WARN_ONCE(1, "do_exit: ucount count 1\n");
>> +	}
>> +#endif
>> +
>>   	/*
>>   	 * If do_exit is called because this processes oopsed, it's possible
>>   	 * that get_fs() was left as KERNEL_DS, so reset it to USER_DS before

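The observation in the quoted follow-up, that cred->ucounts and cred->user_ns->ucounts should never be equal because a user namespace is counted in its parent, can be sketched as a small userspace model. All of the `toy_*` types and functions below are hypothetical and only mirror the relationship described in the mail. The check that a cred's record belongs to the cred's own namespace is an assumption of this sketch, not something stated in the thread.

```c
#include <stddef.h>

struct toy_ns;

/* Hypothetical per-(namespace, uid) counter record. */
struct toy_ucounts {
	struct toy_ns *ns;	/* namespace this record belongs to */
	unsigned int uid;
};

/* Hypothetical user namespace. */
struct toy_ns {
	struct toy_ns *parent;		/* NULL for the root namespace */
	struct toy_ucounts *ucounts;	/* record charged for creating this
					 * namespace; it lives in the parent */
};

/* Hypothetical credentials. */
struct toy_cred {
	struct toy_ns *user_ns;
	struct toy_ucounts *ucounts;	/* assumed to live in user_ns itself */
};

/* Returns 1 if the cred matches the layout described above, 0 otherwise. */
static int toy_cred_consistent(const struct toy_cred *cred)
{
	const struct toy_ns *ns = cred->user_ns;

	/* Assumption: the cred's record belongs to the cred's own namespace. */
	if (!cred->ucounts || cred->ucounts->ns != ns)
		return 0;
	/* From the mail: a namespace is counted in its parent namespace. */
	if (ns->ucounts && ns->ucounts->ns != ns->parent)
		return 0;
	/* Hence the two pointers can never be the same record. */
	return cred->ucounts != ns->ucounts;
}

/* Builds a root and a child namespace; optionally corrupts the cred. */
static int toy_demo(int corrupt)
{
	struct toy_ns root = { NULL, NULL };
	struct toy_ucounts in_root = { &root, 1000 };
	struct toy_ns child = { &root, &in_root };
	struct toy_ucounts in_child = { &child, 0 };
	struct toy_cred cred = { &child, corrupt ? &in_root : &in_child };

	return toy_cred_consistent(&cred);
}
```

Here `toy_demo(0)` models the healthy layout and `toy_demo(1)` models a cred whose record pointer has ended up aliasing the namespace's own record. In this model the equality tested by the debug diff can only hold when the layout is already inconsistent.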
