From: ebiederm@xmission.com (Eric W. Biederman)
To: Yu Zhao
Cc: Alexey Gladkov, Jordan Glover, LKML, linux-mm@kvack.org, containers@lists.linux-foundation.org, Rune Kleveland
Subject: Re: linux 5.14.3: free_user_ns causes NULL pointer dereference
Date: Mon, 04 Oct 2021 12:10:05 -0500
Message-ID: <878rz8wwb6.fsf@disp2133>
In-Reply-To: (Yu Zhao's message of "Thu, 30 Sep 2021 16:27:34 -0600")
References: <1M9_d6wrcu6rdPe1ON0_k0lOxJMyyot3KAb1gdyuwzDPC777XVUWPHoTCEVmcK3fYfgu7sIo3PSaLe9KulUdm4TWVuqlbKyYGxRAjsf_Cpk=@protonmail.ch> <87ee9pa6xw.fsf@disp2133> <878rzw77i3.fsf@disp2133> <20210929173611.fo5traia77o63gpw@example.org> <20210930130640.wudkpmn3cmah2cjz@example.org>

Adding Rune Kleveland to the discussion as he also seems to have
reproduced the issue.

Alex and I have been staring at the code and the reports, and this bug
is hiding well. Here is what we have figured out so far.

Both the warning from free_user_ns calling dec_ucount that Jordan
Glover reported and the KASAN error that Yu Zhao reported appear to
have the same cause: a ucounts structure being used after it has been
freed and reallocated as something else.

I have just skimmed through the recent report from Rune Kleveland and
it also appears to be a use after free, especially since the second
failure in the log is slub complaining about trying to free the ucounts
data structure.

We looked through the users of put_ucounts and we don't see any
obviously buggy users that would be freeing the data structure early.

Alex has tried to reproduce this but so far has not had any luck.
Folks, can you tell us which compiler versions you are using and share
your kernel config with us? That might help.

The little debug diff below is my guess of what is happening. If the
folks who can reproduce this issue can try the patch below and let me
know if the warnings fire, that would be appreciated. It is still not
enough to track down the bug, but at least it will confirm my current
hypothesis about how things look before there is a use of memory after
it is freed.

Thank you,
Eric

diff --git a/kernel/cred.c b/kernel/cred.c
index f784e08c2fbd..e7ffaa3cf5a6 100644
--- a/kernel/cred.c
+++ b/kernel/cred.c
@@ -120,6 +120,12 @@ static void put_cred_rcu(struct rcu_head *rcu)
 	if (cred->group_info)
 		put_group_info(cred->group_info);
 	free_uid(cred->user);
+#if 1
+	if ((cred->ucounts == cred->user_ns->ucounts) &&
+	    (atomic_read(&cred->ucounts->count) == 1)) {
+		WARN_ONCE(1, "put_cred_rcu: ucount count 1\n");
+	}
+#endif
 	if (cred->ucounts)
 		put_ucounts(cred->ucounts);
 	put_user_ns(cred->user_ns);
diff --git a/kernel/exit.c b/kernel/exit.c
index 91a43e57a32e..60fd88b34c1a 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -743,6 +743,13 @@ void __noreturn do_exit(long code)
 	if (unlikely(!tsk->pid))
 		panic("Attempted to kill the idle task!");
 
+#if 1
+	if ((tsk->cred->ucounts == tsk->cred->user_ns->ucounts) &&
+	    (atomic_read(&tsk->cred->ucounts->count) == 1)) {
+		WARN_ONCE(1, "do_exit: ucount count 1\n");
+	}
+#endif
+
 	/*
 	 * If do_exit is called because this processes oopsed, it's possible
 	 * that get_fs() was left as KERNEL_DS, so reset it to USER_DS before
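
For anyone who wants to reason about the condition the two debug checks test for without booting a kernel, here is a minimal user-space sketch of it. This is only a model, not kernel code: the *_model structures and last_ref_shared_with_ns() are hypothetical names made up for illustration, and the NULL guard is an addition that is not in the patch. The reading it encodes is an assumption: if a cred is about to drop the last reference to a ucounts that its user namespace still points at, the namespace pointer would be left dangling afterwards.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel structures. */
struct ucounts_model {
	atomic_int count;		/* reference count */
};

struct user_ns_model {
	struct ucounts_model *ucounts;	/* pointer held by the namespace */
};

struct cred_model {
	struct ucounts_model *ucounts;
	struct user_ns_model *user_ns;
};

/*
 * Same predicate as the debug checks in the patch, with a NULL guard
 * added (an assumption, not part of the patch): the cred and its user
 * namespace share one ucounts object and only one reference is left.
 */
bool last_ref_shared_with_ns(const struct cred_model *cred)
{
	return cred->ucounts != NULL &&
	       cred->ucounts == cred->user_ns->ucounts &&
	       atomic_load(&cred->ucounts->count) == 1;
}
```

With a count of 2 (one reference for the cred, one for the namespace) the predicate is false; with a count of 1 it is true, which is the state the WARN_ONCE calls are meant to flag.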