From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
    aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
    by smtp.lore.kernel.org (Postfix) with ESMTP id BD68CC4332F
    for ; Mon, 4 Oct 2021 17:19:28 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
    by mail.kernel.org (Postfix) with ESMTP id A16F56139F
    for ; Mon, 4 Oct 2021 17:19:28 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S234036AbhJDRVP (ORCPT ); Mon, 4 Oct 2021 13:21:15 -0400
Received: from out02.mta.xmission.com ([166.70.13.232]:47148
    "EHLO out02.mta.xmission.com" rhost-flags-OK-OK-OK-OK)
    by vger.kernel.org with ESMTP id S233268AbhJDRVO (ORCPT );
    Mon, 4 Oct 2021 13:21:14 -0400
Received: from in02.mta.xmission.com ([166.70.13.52]:59930)
    by out02.mta.xmission.com with esmtps (TLS1.3)
    tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93)
    (envelope-from ) id 1mXRca-00AJVM-7s; Mon, 04 Oct 2021 11:19:24 -0600
Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95]:35422
    helo=email.xmission.com) by in02.mta.xmission.com with esmtpsa (TLS1.3)
    tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.93)
    (envelope-from ) id 1mXRcY-00FRhr-CL; Mon, 04 Oct 2021 11:19:23 -0600
From: ebiederm@xmission.com (Eric W. Biederman)
To: Yu Zhao
Cc: Alexey Gladkov, Jordan Glover, LKML, linux-mm@kvack.org,
    containers@lists.linux-foundation.org, Rune Kleveland
References: <1M9_d6wrcu6rdPe1ON0_k0lOxJMyyot3KAb1gdyuwzDPC777XVUWPHoTCEVmcK3fYfgu7sIo3PSaLe9KulUdm4TWVuqlbKyYGxRAjsf_Cpk=@protonmail.ch>
    <87ee9pa6xw.fsf@disp2133>
    <878rzw77i3.fsf@disp2133>
    <20210929173611.fo5traia77o63gpw@example.org>
    <20210930130640.wudkpmn3cmah2cjz@example.org>
    <878rz8wwb6.fsf@disp2133>
Date: Mon, 04 Oct 2021 12:19:16 -0500
In-Reply-To: <878rz8wwb6.fsf@disp2133> (Eric W.
    Biederman's message of "Mon, 04 Oct 2021 12:10:05 -0500")
Message-ID: <87v92cvhbf.fsf@disp2133>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-XM-SPF: eid=1mXRcY-00FRhr-CL;;;mid=<87v92cvhbf.fsf@disp2133>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral
X-XM-AID: U2FsdGVkX1+Te4UnXIOVy3kw5wEucLta233BUFlS1sc=
X-SA-Exim-Connect-IP: 68.227.160.95
X-SA-Exim-Mail-From: ebiederm@xmission.com
Subject: Re: linux 5.14.3: free_user_ns causes NULL pointer dereference
X-SA-Exim-Version: 4.2.1 (built Sat, 08 Feb 2020 21:53:50 +0000)
X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com)
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

ebiederm@xmission.com (Eric W. Biederman) writes:

> Adding Rune Kleveland to the discussion as he also seems to have
> reproduced the issue.
>
> Alex and I have been staring at the code and the reports and this
> bug is hiding well. Here is what we have figured out so far.
>
> Both the warning from free_user_ns calling dec_ucount that Jordan Glover
> reported and the KASAN error that Yu Zhao has reported appear to have
> the same cause: using a ucounts structure after it has been freed and
> reallocated as something else.
>
> I have just skimmed through the recent report from Rune Kleveland
> and it also appears to be a use after free, especially since the
> second failure in the log is slub complaining about trying to free
> the ucounts data structure.
>
> We looked through the users of put_ucounts and we don't see any obvious
> buggy users that would be freeing the data structure early.
>
> Alex has tried to reproduce this and so far is not having any luck.
> Folks, can you tell us what compiler versions you are using and share
> your kernel config with us? That might help.
>
> The little debug diff below is my guess of what is happening.
> If the folks who can reproduce this issue can try the patch below and
> let me know if the warnings fire, that would be appreciated. It is
> still not enough to track down the bug, but at least it will confirm my
> current hypothesis about how things look before there is a use of
> memory after it is freed.

Bah. Scratch that test patch. I just double checked myself and
cred->ucounts and cred->user_ns->ucounts should never be equal, as the
user namespace is counted in its parent user namespace.

That observation now tells me I have a parent user namespace that has
been corrupted. Back to the drawing board.

> Thank you,
> Eric
>
> diff --git a/kernel/cred.c b/kernel/cred.c
> index f784e08c2fbd..e7ffaa3cf5a6 100644
> --- a/kernel/cred.c
> +++ b/kernel/cred.c
> @@ -120,6 +120,12 @@ static void put_cred_rcu(struct rcu_head *rcu)
>  	if (cred->group_info)
>  		put_group_info(cred->group_info);
>  	free_uid(cred->user);
> +#if 1
> +	if ((cred->ucounts == cred->user_ns->ucounts) &&
> +	    (atomic_read(&cred->ucounts->count) == 1)) {
> +		WARN_ONCE(1, "put_cred_rcu: ucount count 1\n");
> +	}
> +#endif
>  	if (cred->ucounts)
>  		put_ucounts(cred->ucounts);
>  	put_user_ns(cred->user_ns);
> diff --git a/kernel/exit.c b/kernel/exit.c
> index 91a43e57a32e..60fd88b34c1a 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -743,6 +743,13 @@ void __noreturn do_exit(long code)
>  	if (unlikely(!tsk->pid))
>  		panic("Attempted to kill the idle task!");
>
> +#if 1
> +	if ((tsk->cred->ucounts == tsk->cred->user_ns->ucounts) &&
> +	    (atomic_read(&tsk->cred->ucounts->count) == 1)) {
> +		WARN_ONCE(1, "do_exit: ucount count 1\n");
> +	}
> +#endif
> +
>  	/*
>  	 * If do_exit is called because this processes oopsed, it's possible
>  	 * that get_fs() was left as KERNEL_DS, so reset it to USER_DS before