From: Yongji Xie <xieyongji@bytedance.com>
To: Josef Bacik <josef@toxicpanda.com>
Cc: Christoph Hellwig <hch@infradead.org>,
	Jens Axboe <axboe@kernel.dk>,
	Bart Van Assche <bvanassche@acm.org>,
	linux-block@vger.kernel.org, nbd@other.debian.org,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2] nbd: Don't use workqueue to handle recv work
Date: Wed, 5 Jan 2022 13:36:08 +0800
Message-ID: <CACycT3tPZOSkCXPz-oYCXRJ_EOBs3dC0+Juv=FYsa6qRS0GVCw@mail.gmail.com>
In-Reply-To: <YdSMqKXv0PUkAwfl@localhost.localdomain>

On Wed, Jan 5, 2022 at 2:06 AM Josef Bacik <josef@toxicpanda.com> wrote:
>
> On Tue, Jan 04, 2022 at 01:31:47PM +0800, Yongji Xie wrote:
> > On Tue, Jan 4, 2022 at 12:10 AM Josef Bacik <josef@toxicpanda.com> wrote:
> > >
> > > On Thu, Dec 30, 2021 at 12:01:23PM +0800, Yongji Xie wrote:
> > > > On Thu, Dec 30, 2021 at 1:35 AM Christoph Hellwig <hch@infradead.org> wrote:
> > > > >
> > > > > On Mon, Dec 27, 2021 at 05:12:41PM +0800, Xie Yongji wrote:
> > > > > > The rescuer thread might take over the works queued on
> > > > > > the workqueue when the worker thread creation timed out.
> > > > > > If this happens, we have no chance to create multiple
> > > > > > > recv threads, which causes I/O to hang on this nbd device.
> > > > >
> > > > > If a workqueue is used there aren't really 'receive threads'.
> > > > > What is the deadlock here?
> > > >
> > > > We might have multiple recv works, and those recv works won't quit
> > > > unless the socket is closed. If the rescuer thread takes over those
> > > > works, only the first recv work can run. The I/O that needs to be handled
> > > > by the other recv works would hang since no thread can handle it.
> > > >
> > >
> > > I'm not following this explanation.  What is the rescuer thread you're talking
> >
> > https://www.kernel.org/doc/html/latest/core-api/workqueue.html#c.rescuer_thread
> >
>
> Ahhh ok now I see, thanks, I didn't know this is how this worked.
>
> So what happens is we do the queue_work(), this needs to do a GFP_KERNEL
> allocation internally, we are unable to satisfy this, and thus the work gets
> pushed onto the rescuer thread.
>
> Then the rescuer thread can't be used in the future because it's doing this long
> running thing.
>

Yes.
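
To make the starvation concrete, here is a rough userspace toy model of
the scenario above. It is not kernel code and does not use the workqueue
API; RecvWork, run_with_workers() and run_on_rescuer() are hypothetical
names made up for this illustration. The only assumptions it encodes are
the ones stated in this thread: a recv work does not return until its
socket is closed, and a lone rescuer runs the works it took over one at
a time.

```python
from dataclasses import dataclass


@dataclass
class RecvWork:
    """Hypothetical stand-in for the recv work of one nbd connection."""
    sock_closed: bool = False
    replies_handled: int = 0

    def step(self) -> bool:
        """Handle one reply; return True only once the socket is closed."""
        if self.sock_closed:
            return True
        self.replies_handled += 1
        return False


def run_with_workers(works, ticks):
    """Normal case: every work has its own worker, so all progress each tick."""
    for _ in range(ticks):
        for work in works:
            work.step()
    return [work.replies_handled for work in works]


def run_on_rescuer(works, ticks):
    """Rescuer case: works run strictly one at a time, so the next work
    starts only after the current one has returned."""
    current = 0
    for _ in range(ticks):
        if current == len(works):
            break
        if works[current].step():
            current += 1
    return [work.replies_handled for work in works]


if __name__ == "__main__":
    print("workers:", run_with_workers([RecvWork() for _ in range(3)], 5))
    print("rescuer:", run_on_rescuer([RecvWork() for _ in range(3)], 5))
```

With three connections and five ticks, the per-worker case handles
replies on every connection, while in the rescuer case only the first
connection makes progress and the other two never get to run.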

> I think the correct thing to do here is simply drop the WQ_MEM_RECLAIM bit.  It
> makes sense for workqueues that are handling short-lived works that
> are in the memory reclaim path.  That's not what these workers are doing, yes
> they are in the reclaim path, but they run the entire time the device is up.
> The actual work happens as they process incoming requests.  AFAICT
> WQ_MEM_RECLAIM doesn't affect the actual allocations that the worker thread
> needs to do, which is what I think the intention was in using WQ_MEM_RECLAIM,
> which isn't really what it's used for.
>
> tl;dr, just remove the WQ_MEM_RECLAIM flag completely and I think that's good
> enough?  Thanks,
>

In the reconnect case, we still need to call queue_work() while the
device is running. So it looks like we can't simply remove the
WQ_MEM_RECLAIM flag.

Thanks,
Yongji
