All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@suse.com>
To: Mike Kravetz <mike.kravetz@oracle.com>,
	"Paul E. McKenney" <paulmck@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>,
	syzbot <syzbot+506c8a2a115201881d45@syzkaller.appspotmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>,
	syzkaller-bugs <syzkaller-bugs@googlegroups.com>,
	Eric Dumazet <edumazet@google.com>,
	Mina Almasry <almasrymina@google.com>
Subject: Re: possible deadlock in sk_clone_lock
Date: Wed, 3 Mar 2021 09:03:27 +0100	[thread overview]
Message-ID: <YD9CzzypRyUPT0Jh@dhcp22.suse.cz> (raw)
In-Reply-To: <06edda9a-dce9-accd-11a3-97f6d5243ed1@oracle.com>

[Add Paul]

On Tue 02-03-21 13:19:34, Mike Kravetz wrote:
> On 3/2/21 6:29 AM, Michal Hocko wrote:
> > On Tue 02-03-21 06:11:51, Shakeel Butt wrote:
> >> On Tue, Mar 2, 2021 at 1:44 AM Michal Hocko <mhocko@suse.com> wrote:
> >>>
> >>> On Mon 01-03-21 17:16:29, Mike Kravetz wrote:
> >>>> On 3/1/21 9:23 AM, Michal Hocko wrote:
> >>>>> On Mon 01-03-21 08:39:22, Shakeel Butt wrote:
> >>>>>> On Mon, Mar 1, 2021 at 7:57 AM Michal Hocko <mhocko@suse.com> wrote:
> >>>>> [...]
> >>>>>>> Then how come this can ever be a problem? in_task() should exclude soft
> >>>>>>> irq context unless I am mistaken.
> >>>>>>>
> >>>>>>
> >>>>>> If I take the following example of syzbot's deadlock scenario then
> >>>>>> CPU1 is the one freeing the hugetlb pages. It is in the process
> >>>>>> context but has disabled softirqs (see __tcp_close()).
> >>>>>>
> >>>>>>         CPU0                    CPU1
> >>>>>>         ----                    ----
> >>>>>>    lock(hugetlb_lock);
> >>>>>>                                 local_irq_disable();
> >>>>>>                                 lock(slock-AF_INET);
> >>>>>>                                 lock(hugetlb_lock);
> >>>>>>    <Interrupt>
> >>>>>>      lock(slock-AF_INET);
> >>>>>>
> > [...]
> >>> Wouldn't something like this help? It is quite ugly but it would be
> >>> simple enough and backportable while we come up with a more rigorous
> >>> solution. What do you think?
> >>>
> >>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> >>> index 4bdb58ab14cb..c9a8b39f678d 100644
> >>> --- a/mm/hugetlb.c
> >>> +++ b/mm/hugetlb.c
> >>> @@ -1495,9 +1495,11 @@ static DECLARE_WORK(free_hpage_work, free_hpage_workfn);
> >>>  void free_huge_page(struct page *page)
> >>>  {
> >>>         /*
> >>> -        * Defer freeing if in non-task context to avoid hugetlb_lock deadlock.
> >>> +        * Defer freeing if in non-task context or when put_page is called
> >>> +        * with IRQ disabled (e.g from via TCP slock dependency chain) to
> >>> +        * avoid hugetlb_lock deadlock.
> >>>          */
> >>> -       if (!in_task()) {
> >>> +       if (!in_task() || irqs_disabled()) {
> >>
> >> Does irqs_disabled() also check softirqs?
> > 
> > Nope it doesn't AFAICS. I was referring to the above lockdep splat which
> > claims irq disabled to be the trigger. But now that you are mentioning
> > that it would be better to replace in_task() along the way. We have
> > discussed that in another email thread and I was suggesting to use
> > in_atomic() which should catch also bh disabled situation. The big IF is
> > that this needs preempt count to be enabled unconditionally. There are
> > changes in the RCU tree heading that direction.
> 
> I have not been following developments in preemption and the RCU tree. 
> The comment for in_atomic() says:
> 
> /*
>  * Are we running in atomic context?  WARNING: this macro cannot
>  * always detect atomic context; in particular, it cannot know about
>  * held spinlocks in non-preemptible kernels.  Thus it should not be
>  * used in the general case to determine whether sleeping is possible.
>  * Do not use in_atomic() in driver code.
>  */
> 
> That does seem to be the case.  I verified in_atomic can detect softirq
> context even in non-preemptible kernels.  But, as the comment says it
> will not detect a held spinlock in non-preemptible kernels.  So, I think
> in_atomic would be better than the current check for !in_task.  That
> would handle this syzbot issue, but we could still have issues if the
> hugetlb put_page path is called while someone is holding a spinlock with
> all interrupts enabled.  Looks like there is no way to detect this
> today in non-preemptible kernels.  in_atomic does detect spinlocks held
> in preemptible kernels.

Paul what is the current plan with in_atomic to be usable for !PREEMPT
configurations?

-- 
Michal Hocko
SUSE Labs

  parent reply	other threads:[~2021-03-03 13:06 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-26 21:08 possible deadlock in sk_clone_lock syzbot
2021-02-26 21:08 ` syzbot
2021-02-26 22:44 ` Shakeel Butt
2021-02-26 22:44   ` Shakeel Butt
2021-02-26 23:14   ` Mike Kravetz
2021-02-27  0:00     ` Shakeel Butt
2021-02-27  0:00       ` Shakeel Butt
2021-03-01 12:11       ` Michal Hocko
2021-03-01 15:10         ` Shakeel Butt
2021-03-01 15:10           ` Shakeel Butt
2021-03-01 15:57           ` Michal Hocko
2021-03-01 16:39             ` Shakeel Butt
2021-03-01 16:39               ` Shakeel Butt
2021-03-01 17:23               ` Michal Hocko
2021-03-02  1:16                 ` Mike Kravetz
2021-03-02  9:44                   ` Michal Hocko
2021-03-02 14:11                     ` Shakeel Butt
2021-03-02 14:29                       ` Michal Hocko
2021-03-02 21:19                         ` Mike Kravetz
2021-03-03  3:59                           ` Shakeel Butt
2021-03-03  3:59                             ` Shakeel Butt
2021-03-05  9:09                             ` Michal Hocko
2021-03-03  8:03                           ` Michal Hocko [this message]
2021-03-03 17:59                             ` Paul E. McKenney
2021-03-04  9:58                               ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YD9CzzypRyUPT0Jh@dhcp22.suse.cz \
    --to=mhocko@suse.com \
    --cc=akpm@linux-foundation.org \
    --cc=almasrymina@google.com \
    --cc=edumazet@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mike.kravetz@oracle.com \
    --cc=paulmck@kernel.org \
    --cc=shakeelb@google.com \
    --cc=syzbot+506c8a2a115201881d45@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.