Date: Wed, 3 Mar 2021 09:03:27 +0100
From: Michal Hocko
To: Mike Kravetz, "Paul E. McKenney"
McKenney" Cc: Shakeel Butt , syzbot , Andrew Morton , LKML , Linux MM , syzkaller-bugs , Eric Dumazet , Mina Almasry Subject: Re: possible deadlock in sk_clone_lock Message-ID: References: <122e8c5d-60b8-52d9-c6a1-00cd61b2e1b6@oracle.com> <06edda9a-dce9-accd-11a3-97f6d5243ed1@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <06edda9a-dce9-accd-11a3-97f6d5243ed1@oracle.com> X-Stat-Signature: oba1js58q4yj4gg7gtqr69kshzkswoy1 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 1FC2B407F8F3 Received-SPF: none (suse.com>: No applicable sender policy available) receiver=imf10; identity=mailfrom; envelope-from=""; helo=mx2.suse.de; client-ip=195.135.220.15 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1614758608-983641 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: [Add Paul] On Tue 02-03-21 13:19:34, Mike Kravetz wrote: > On 3/2/21 6:29 AM, Michal Hocko wrote: > > On Tue 02-03-21 06:11:51, Shakeel Butt wrote: > >> On Tue, Mar 2, 2021 at 1:44 AM Michal Hocko wrote: > >>> > >>> On Mon 01-03-21 17:16:29, Mike Kravetz wrote: > >>>> On 3/1/21 9:23 AM, Michal Hocko wrote: > >>>>> On Mon 01-03-21 08:39:22, Shakeel Butt wrote: > >>>>>> On Mon, Mar 1, 2021 at 7:57 AM Michal Hocko wrote: > >>>>> [...] > >>>>>>> Then how come this can ever be a problem? in_task() should exclude soft > >>>>>>> irq context unless I am mistaken. > >>>>>>> > >>>>>> > >>>>>> If I take the following example of syzbot's deadlock scenario then > >>>>>> CPU1 is the one freeing the hugetlb pages. It is in the process > >>>>>> context but has disabled softirqs (see __tcp_close()). > >>>>>> > >>>>>> CPU0 CPU1 > >>>>>> ---- ---- > >>>>>> lock(hugetlb_lock); > >>>>>> local_irq_disable(); > >>>>>> lock(slock-AF_INET); > >>>>>> lock(hugetlb_lock); > >>>>>> > >>>>>> lock(slock-AF_INET); > >>>>>> > > [...] > >>> Wouldn't something like this help? It is quite ugly but it would be > >>> simple enough and backportable while we come up with a more rigorous > >>> solution. What do you think? > >>> > >>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c > >>> index 4bdb58ab14cb..c9a8b39f678d 100644 > >>> --- a/mm/hugetlb.c > >>> +++ b/mm/hugetlb.c > >>> @@ -1495,9 +1495,11 @@ static DECLARE_WORK(free_hpage_work, free_hpage_workfn); > >>> void free_huge_page(struct page *page) > >>> { > >>> /* > >>> - * Defer freeing if in non-task context to avoid hugetlb_lock deadlock. > >>> + * Defer freeing if in non-task context or when put_page is called > >>> + * with IRQ disabled (e.g from via TCP slock dependency chain) to > >>> + * avoid hugetlb_lock deadlock. > >>> */ > >>> - if (!in_task()) { > >>> + if (!in_task() || irqs_disabled()) { > >> > >> Does irqs_disabled() also check softirqs? > > > > Nope it doesn't AFAICS. I was referring to the above lockdep splat which > > claims irq disabled to be the trigger. But now that you are mentioning > > that it would be better to replace in_task() along the way. We have > > discussed that in another email thread and I was suggesting to use > > in_atomic() which should catch also bh disabled situation. The big IF is > > that this needs preempt count to be enabled unconditionally. There are > > changes in the RCU tree heading that direction. > > I have not been following developments in preemption and the RCU tree. > The comment for in_atomic() says: > > /* > * Are we running in atomic context? 

> I have not been following developments in preemption and the RCU tree.
> The comment for in_atomic() says:
>
> /*
>  * Are we running in atomic context?  WARNING: this macro cannot
>  * always detect atomic context; in particular, it cannot know about
>  * held spinlocks in non-preemptible kernels.  Thus it should not be
>  * used in the general case to determine whether sleeping is possible.
>  * Do not use in_atomic() in driver code.
>  */
>
> That does seem to be the case.  I verified in_atomic can detect softirq
> context even in non-preemptible kernels.  But, as the comment says, it
> will not detect a held spinlock in non-preemptible kernels.  So I think
> in_atomic would be better than the current check for !in_task.  That
> would handle this syzbot issue, but we could still have issues if the
> hugetlb put_page path is called while someone is holding a spinlock with
> all interrupts enabled.  It looks like there is no way to detect this
> today in non-preemptible kernels.  in_atomic does detect spinlocks held
> in preemptible kernels.

Paul, what is the current plan for making in_atomic usable in !PREEMPT
configurations?

-- 
Michal Hocko
SUSE Labs
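
For concreteness, here is a sketch, not a patch from this thread, of what
the in_atomic()-based variant discussed above could look like. It assumes
the 5.11-era deferral code in mm/hugetlb.c (the hpage_freelist lockless
list, the free_hpage_work item visible in the hunk header above, and a
__free_huge_page() helper), and it would still miss spinlock-only sections
on kernels without CONFIG_PREEMPT_COUNT, which is exactly the open question
addressed to Paul:

/* Sketch only -- assumes the 5.11-era layout of mm/hugetlb.c. */
void free_huge_page(struct page *page)
{
	/*
	 * Defer freeing whenever the caller may be atomic: interrupt
	 * context, bh-disabled sections such as the TCP slock path, or
	 * (with CONFIG_PREEMPT_COUNT) under a spinlock.
	 */
	if (!in_task() || in_atomic()) {
		/*
		 * Only kick the worker if the lockless list was empty;
		 * otherwise a previously scheduled work item will pick
		 * this page up as well.
		 */
		if (llist_add((struct llist_node *)&page->mapping,
			      &hpage_freelist))
			schedule_work(&free_hpage_work);
		return;
	}

	__free_huge_page(page);
}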