From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sun, 03 Oct 2021 19:37:38 +0000
To: Alexey Gladkov
From: Jordan Glover
Cc: ebiederm@xmission.com, LKML, "linux-mm@kvack.org", "containers@lists.linux-foundation.org", Yu Zhao
Reply-To: Jordan Glover
Subject: Re: linux 5.14.3: free_user_ns causes NULL pointer dereference
In-Reply-To: <20210929173611.fo5traia77o63gpw@example.org>
References: <1M9_d6wrcu6rdPe1ON0_k0lOxJMyyot3KAb1gdyuwzDPC777XVUWPHoTCEVmcK3fYfgu7sIo3PSaLe9KulUdm4TWVuqlbKyYGxRAjsf_Cpk=@protonmail.ch> <87ee9pa6xw.fsf@disp2133> <878rzw77i3.fsf@disp2133> <20210929173611.fo5traia77o63gpw@example.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Wednesday, 
September 29th, 2021 at 5:36 PM, Alexey Gladkov wrote:

> On Tue, Sep 28, 2021 at 01:40:48PM +0000, Jordan Glover wrote:
>
> > On Thursday, September 16th, 2021 at 5:30 PM, ebiederm@xmission.com wrote:
> >
> > > Jordan Glover Golden_Miller83@protonmail.ch writes:
> > >
> > > > On Wednesday, September 15th, 2021 at 10:42 PM, Jordan Glover Golden_Miller83@protonmail.ch wrote:
> > > >
> > > > > I had about 2 containerized (flatpak/bubblewrap) apps (browser + music player) running. I quickly closed them with intent to shut down the system but instead got a freeze and had to use magic sysrq to reboot. System logs end with what I posted and before that there is nothing suspicious.
> > > > >
> > > > > Maybe it's some random fluke. I'll reply if I hit it again.
> > > >
> > > > Heh, it just happened again. This time closing firefox alone had such
> > > > effect:
> > >
> > > Ok. It looks like we have a couple of folks seeing issues here.
> > >
> > > I thought we had all of the issues sorted out for the release of v5.14,
> > > but it looks like there is still some little bug left.
> > >
> > > If Alex doesn't beat me to it I will see if I can come up with a
> > > debugging patch to make it easy to help track down where the reference
> > > count is going wrong. It will be a little bit as my brain is mush at
> > > the moment.
> > >
> > > Eric
> >
> > As the issue persists in 5.14.7 I would be very interested in such a patch.
> >
> > For now the thing is mostly reproducible when I close several tabs in ff then
> > close the browser in a short period of time. When I close tabs then wait out
> > a bit then close the browser it doesn't happen so I guess some interrupted
> > cleanup triggers it.
>
> I'm still investigating, but I would like to rule out one option.
>
> Could you check out the patch?
>
> diff --git a/kernel/ucount.c b/kernel/ucount.c
> index bb51849e6375..f23f906f4f62 100644
> --- a/kernel/ucount.c
> +++ b/kernel/ucount.c
> @@ -201,11 +201,14 @@ void put_ucounts(struct ucounts *ucounts)
>  {
>  	unsigned long flags;
>
> -	if (atomic_dec_and_lock_irqsave(&ucounts->count, &ucounts_lock, flags)) {
> +	spin_lock_irqsave(&ucounts_lock, flags);
> +	if (atomic_dec_and_test(&ucounts->count)) {
>  		hlist_del_init(&ucounts->node);
>  		spin_unlock_irqrestore(&ucounts_lock, flags);
>  		kfree(ucounts);
> +		return;
>  	}
> +	spin_unlock_irqrestore(&ucounts_lock, flags);
>  }
>
>  static inline bool atomic_long_inc_below(atomic_long_t *v, int u)
>
> ---------------------------------------------------------------------
>
> Rgrds, legion

I'm still able to reproduce the issue with the above patch, although the
situation changed/improved a bit: I now have to close the tabs and the browser
really fast to hit it, which means it's less likely to happen during real
usage. On the other hand, the kernel logging cuts off much earlier, after just
a few lines:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 20387 at kernel/ucount.c:256 dec_ucount+0x43/0x50
Modules linked in: ...

Jordan
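
For readers following the thread: the patch quoted above replaces a "decrement, and take the lock only if this was the last reference" release with an explicit "take the lock, then decrement" release. Below is a minimal userspace sketch of those two patterns, assuming C11 atomics and a pthread mutex as stand-ins for the kernel's atomic_t and ucounts_lock. All names here (struct obj, table_lock, dec_and_lock, put_a, put_b, new_obj, frees) are hypothetical and exist only for illustration; this is not the kernel code and it does not reproduce the reported bug.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted object that is also linked
 * into a lookup table protected by table_lock. */
struct obj {
	atomic_int count;	/* reference count */
	bool hashed;		/* stands in for hash-list membership */
};

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int frees;	/* number of objects released so far */

static struct obj *new_obj(int refs)
{
	struct obj *o = malloc(sizeof(*o));

	assert(o != NULL);
	atomic_init(&o->count, refs);
	o->hashed = true;
	return o;
}

/* Dec-and-lock style helper: returns true with the lock held only when
 * the count dropped to zero; otherwise returns false without the lock. */
static bool dec_and_lock(atomic_int *v, pthread_mutex_t *lock)
{
	int old = atomic_load(v);

	/* Fast path: decrement locklessly as long as we are not the last ref. */
	while (old > 1) {
		if (atomic_compare_exchange_weak(v, &old, old - 1))
			return false;
	}

	/* Slow path: we might be the last reference, so decide under the lock. */
	pthread_mutex_lock(lock);
	if (atomic_fetch_sub(v, 1) == 1)
		return true;
	pthread_mutex_unlock(lock);
	return false;
}

/* Variant A: shape of the code before the patch (dec-and-lock). */
static void put_a(struct obj *o)
{
	if (dec_and_lock(&o->count, &table_lock)) {
		o->hashed = false;
		pthread_mutex_unlock(&table_lock);
		free(o);
		atomic_fetch_add(&frees, 1);
	}
}

/* Variant B: shape of the code after the patch (always lock, then decrement). */
static void put_b(struct obj *o)
{
	pthread_mutex_lock(&table_lock);
	if (atomic_fetch_sub(&o->count, 1) == 1) {
		o->hashed = false;
		pthread_mutex_unlock(&table_lock);
		free(o);
		atomic_fetch_add(&frees, 1);
		return;
	}
	pthread_mutex_unlock(&table_lock);
}
```

Both variants free the object exactly once, when the last reference goes away. They differ only in whether non-final puts touch the lock, so a test patch of this shape can help narrow down whether the lockless fast path plays a role.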