From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.5 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_PATCH,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 17997C433E0 for ; Wed, 3 Mar 2021 17:59:55 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 6E59464EF4 for ; Wed, 3 Mar 2021 17:59:54 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6E59464EF4 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id BCF5E8D018F; Wed, 3 Mar 2021 12:59:53 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id BA6408D0157; Wed, 3 Mar 2021 12:59:53 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A6EDC8D018F; Wed, 3 Mar 2021 12:59:53 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 8B1B88D0157 for ; Wed, 3 Mar 2021 12:59:53 -0500 (EST) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 47CF2988A for ; Wed, 3 Mar 2021 17:59:53 +0000 (UTC) X-FDA: 77879326266.22.C038B99 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf14.hostedemail.com (Postfix) with ESMTP id 3CDE0C0007F3 for ; Wed, 3 Mar 2021 17:59:48 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 1A79964EF0; Wed, 3 Mar 2021 17:59:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1614794386; bh=n2SVrZTA5iizXm4rhS7o1b4X+2vVSPfTGVJVkr5YBLM=; h=Date:From:To:Cc:Subject:Reply-To:References:In-Reply-To:From; b=sq7u1H0vWaVD0UzqqiFca4MgalpAnzOphcg7PTiSaBLSNTtwlFK2T2wrT6xZ5+tvq IYfsX9SttJA0S7f8A3HSD+SCXLFZ5bDZdQ3H640tMhjZPRbsujYFroagJS3+RvRxba y9j05XHlg56mNaPyGhYwvq7XrByFiszgOLv/LgSB1vcbvU+dAsRvd1C8UzoKqxbIGU tqeRb04LTULgeiUkq7tX/7s4jrkA6xYei1xT8ivdSIwlRsabNCyR2xKuo+hu2erbvs IyIwVCYCt56JBNncjUQ03YxP/ySUwklANS3MZ8N3epDgF7kVZzhia/NlyFdvx0XFJ4 DXGq67w9JEfyA== Received: by paulmck-ThinkPad-P72.home (Postfix, from userid 1000) id B4B5F35237A1; Wed, 3 Mar 2021 09:59:45 -0800 (PST) Date: Wed, 3 Mar 2021 09:59:45 -0800 From: "Paul E. McKenney" To: Michal Hocko Cc: Mike Kravetz , Shakeel Butt , syzbot , Andrew Morton , LKML , Linux MM , syzkaller-bugs , Eric Dumazet , Mina Almasry , tglx@linutronix.de, john.ogness@linutronix.de, urezki@gmail.com, ast@fb.com Subject: Re: possible deadlock in sk_clone_lock Message-ID: <20210303175945.GE2696@paulmck-ThinkPad-P72> Reply-To: paulmck@kernel.org References: <122e8c5d-60b8-52d9-c6a1-00cd61b2e1b6@oracle.com> <06edda9a-dce9-accd-11a3-97f6d5243ed1@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.9.4 (2018-02-28) X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 3CDE0C0007F3 X-Stat-Signature: in6d3sydk7kebee1mb3meu8gpst5nosa Received-SPF: none (kernel.org>: No applicable sender policy available) receiver=imf14; identity=mailfrom; envelope-from=""; helo=mail.kernel.org; client-ip=198.145.29.99 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1614794388-283836 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Wed, Mar 03, 2021 at 09:03:27AM +0100, Michal Hocko wrote: > [Add Paul] > > On Tue 02-03-21 13:19:34, Mike Kravetz wrote: > > On 3/2/21 6:29 AM, Michal Hocko wrote: > > > On Tue 02-03-21 06:11:51, Shakeel Butt wrote: > > >> On Tue, Mar 2, 2021 at 1:44 AM Michal Hocko wrote: > > >>> > > >>> On Mon 01-03-21 17:16:29, Mike Kravetz wrote: > > >>>> On 3/1/21 9:23 AM, Michal Hocko wrote: > > >>>>> On Mon 01-03-21 08:39:22, Shakeel Butt wrote: > > >>>>>> On Mon, Mar 1, 2021 at 7:57 AM Michal Hocko wrote: > > >>>>> [...] > > >>>>>>> Then how come this can ever be a problem? in_task() should exclude soft > > >>>>>>> irq context unless I am mistaken. > > >>>>>>> > > >>>>>> > > >>>>>> If I take the following example of syzbot's deadlock scenario then > > >>>>>> CPU1 is the one freeing the hugetlb pages. It is in the process > > >>>>>> context but has disabled softirqs (see __tcp_close()). > > >>>>>> > > >>>>>> CPU0 CPU1 > > >>>>>> ---- ---- > > >>>>>> lock(hugetlb_lock); > > >>>>>> local_irq_disable(); > > >>>>>> lock(slock-AF_INET); > > >>>>>> lock(hugetlb_lock); > > >>>>>> > > >>>>>> lock(slock-AF_INET); > > >>>>>> > > > [...] > > >>> Wouldn't something like this help? It is quite ugly but it would be > > >>> simple enough and backportable while we come up with a more rigorous > > >>> solution. What do you think? > > >>> > > >>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c > > >>> index 4bdb58ab14cb..c9a8b39f678d 100644 > > >>> --- a/mm/hugetlb.c > > >>> +++ b/mm/hugetlb.c > > >>> @@ -1495,9 +1495,11 @@ static DECLARE_WORK(free_hpage_work, free_hpage_workfn); > > >>> void free_huge_page(struct page *page) > > >>> { > > >>> /* > > >>> - * Defer freeing if in non-task context to avoid hugetlb_lock deadlock. > > >>> + * Defer freeing if in non-task context or when put_page is called > > >>> + * with IRQ disabled (e.g from via TCP slock dependency chain) to > > >>> + * avoid hugetlb_lock deadlock. > > >>> */ > > >>> - if (!in_task()) { > > >>> + if (!in_task() || irqs_disabled()) { > > >> > > >> Does irqs_disabled() also check softirqs? > > > > > > Nope it doesn't AFAICS. I was referring to the above lockdep splat which > > > claims irq disabled to be the trigger. But now that you are mentioning > > > that it would be better to replace in_task() along the way. We have > > > discussed that in another email thread and I was suggesting to use > > > in_atomic() which should catch also bh disabled situation. The big IF is > > > that this needs preempt count to be enabled unconditionally. There are > > > changes in the RCU tree heading that direction. > > > > I have not been following developments in preemption and the RCU tree. > > The comment for in_atomic() says: > > > > /* > > * Are we running in atomic context? WARNING: this macro cannot > > * always detect atomic context; in particular, it cannot know about > > * held spinlocks in non-preemptible kernels. Thus it should not be > > * used in the general case to determine whether sleeping is possible. > > * Do not use in_atomic() in driver code. > > */ > > > > That does seem to be the case. I verified in_atomic can detect softirq > > context even in non-preemptible kernels. But, as the comment says it > > will not detect a held spinlock in non-preemptible kernels. So, I think > > in_atomic would be better than the current check for !in_task. That > > would handle this syzbot issue, but we could still have issues if the > > hugetlb put_page path is called while someone is holding a spinlock with > > all interrupts enabled. Looks like there is no way to detect this > > today in non-preemptible kernels. in_atomic does detect spinlocks held > > in preemptible kernels. > > Paul what is the current plan with in_atomic to be usable for !PREEMPT > configurations? Ah, thank you for the reminder! I have rebased that series on top of v5.12-rc1 on -rcu branch tglx-pc.2021.03.03a. The current state is that Linus is not convinced that this change is worthwhile given that only RCU and printk() want it. (BPF would use it if it were available, but is willing to live without it, at least in the short term.) But maybe Linus will be convinced given your additional use case. Here is hoping! Thanx, Paul