From: ebiederm@xmission.com (Eric W.
 Biederman)
To: Yu Zhao
Cc: Jordan Glover, LKML, linux-mm@kvack.org, legion@kernel.org, containers@lists.linux-foundation.org
References: <1M9_d6wrcu6rdPe1ON0_k0lOxJMyyot3KAb1gdyuwzDPC777XVUWPHoTCEVmcK3fYfgu7sIo3PSaLe9KulUdm4TWVuqlbKyYGxRAjsf_Cpk=@protonmail.ch> <87ee9pa6xw.fsf@disp2133>
Date: Fri, 17 Sep 2021 11:15:57 -0500
In-Reply-To: (Yu Zhao's message of "Wed, 15 Sep 2021 17:44:40 -0600")
Message-ID: <87zgsb5gaq.fsf@disp2133>
Subject: Re: linux 5.14.3: free_user_ns causes NULL pointer dereference
X-Mailing-List: linux-kernel@vger.kernel.org

Yu Zhao writes:

> On Wed, Sep 15, 2021 at 4:42 PM Jordan Glover
> wrote:
>>
>> On Wednesday, September 15th, 2021 at 9:02 PM, wrote:
>>
>> > Jordan Glover Golden_Miller83@protonmail.ch writes:
>> >
>> > > Hi, recently I hit system freeze after I was closing few containerized apps on my system. As for now it occurred only once on linux 5.14.3.
>> > > I think it may be related to the "Count rlimits in each user namespace" patchset merged during the 5.14 window
>> > >
>> > > https://lore.kernel.org/all/257aa5fb1a7d81cf0f4c34f39ada2320c4284771.1619094428.git.legion@kernel.org/T/#u
>> >
>> > So that warning comes from:
>> >
>> > void dec_ucount(struct ucounts *ucounts, enum ucount_type type)
>> > {
>> > 	struct ucounts *iter;
>> > 	for (iter = ucounts; iter; iter = iter->ns->ucounts) {
>> > 		long dec = atomic_long_dec_if_positive(&iter->ucount[type]);
>> > 		WARN_ON_ONCE(dec < 0);
>> > 	}
>> > 	put_ucounts(ucounts);
>> > }
>> >
>> > Which certainly looks like a reference count bug. It could also be a
>> > memory stomp somewhere close.
>> >
>> > Do you have any idea what else was going on? This location is the
>> > symptom but not the actual cause.
>> >
>> > Eric
>>
>> I had about 2 containerized (flatpak/bubblewrap) apps (browser + music player) running. I quickly closed them with intent to shut down the system but instead got the freeze and had to use magic sysrq to reboot. System logs end with what I posted and before there is nothing suspicious.
>>
>> Maybe it's some random fluke. I'll reply if I hit it again.
>
> I have been able to steadily reproduce this for a while. But I haven't
> had time to look into it. I'd appreciate any help.

It would be very helpful if you could look farther back in your logs
and see if you can also see:

WARNING: CPU: 1 PID: 351 at kernel/ucount.c:253 dec_ucount+0x43/0x5

Or anything else preceding the use-after-free.

I am inclined to think they are the same issue but without seeing the
WARN_ON_ONCE I can't safely conclude that.

Eric
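
For readers following the thread, here is a rough userspace sketch of why
an unbalanced decrement in a loop like the quoted dec_ucount() would trip
the warning. It is not the kernel code: struct counter_node, inc_chain(),
dec_chain() and dec_if_positive() are hypothetical names made up for this
illustration, and C11 atomics stand in for the kernel's atomic_long_t.
The helper is meant to mirror the documented behaviour of
atomic_long_dec_if_positive(), which returns the old value minus one even
when it declines to decrement.

```c
#include <assert.h>     /* for the checks in the usage note below */
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ucounts: one counter per level,
 * with "parent" playing the role of iter->ns->ucounts. */
struct counter_node {
	atomic_long count;
	struct counter_node *parent;
};

/* Userspace approximation of atomic_long_dec_if_positive():
 * decrement only if the result stays >= 0, and return old - 1
 * whether or not the decrement happened. */
static long dec_if_positive(atomic_long *v)
{
	long old = atomic_load(v);
	long dec;

	do {
		dec = old - 1;
		if (dec < 0)
			break;	/* already at zero: leave the counter alone */
	} while (!atomic_compare_exchange_weak(v, &old, dec));

	return dec;
}

/* Charge one unit at every level from node up to the root. */
static void inc_chain(struct counter_node *node)
{
	for (struct counter_node *iter = node; iter; iter = iter->parent)
		atomic_fetch_add(&iter->count, 1);
}

/* Uncharge one unit at every level. Returns how many levels were
 * already at zero, i.e. where the kernel loop would WARN_ON_ONCE. */
static int dec_chain(struct counter_node *node)
{
	int underflows = 0;

	for (struct counter_node *iter = node; iter; iter = iter->parent) {
		if (dec_if_positive(&iter->count) < 0)
			underflows++;
	}
	return underflows;
}
```

With a two-level chain, one inc_chain() followed by one dec_chain()
reports no underflow, while a second dec_chain() reports one at each
level. That extra, unmatched decrement is the kind of imbalance the
WARN_ON_ONCE is there to catch; the sketch says nothing about where the
real imbalance comes from.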