From: Mikulas Patocka <mpatocka@redhat.com>
To: Ming Lei <tom.leiming@gmail.com>
Cc: Mike Snitzer <snitzer@redhat.com>, Jens Axboe <axboe@kernel.dk>,
Kent Overstreet <kent.overstreet@gmail.com>,
dm-devel@redhat.com,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
"Alasdair G. Kergon" <agk@redhat.com>,
Jeff Moyer <jmoyer@redhat.com>
Subject: Re: [PATCH v3 for-4.4] block: flush queued bios when process blocks to avoid deadlock
Date: Wed, 21 Oct 2015 17:49:20 -0400 (EDT)
Message-ID: <alpine.LRH.2.02.1510211718310.21723@file01.intranet.prod.int.rdu2.redhat.com>
In-Reply-To: <CACVXFVPfaJwKDzh+t0f9uxxTxrXE33bLZew5+FoKjGiYvsKW1Q@mail.gmail.com>
On Thu, 22 Oct 2015, Ming Lei wrote:
> > Some drivers (dm-snapshot, dm-thin) do acquire a mutex in .make_requests()
> > for every bio. It wouldn't be practical to convert them to not acquire the
> > mutex (and it would also degrade performance of these drivers, if they had
> > to offload every bio to a worker thread that can acquire the mutex).
>
> Lots of drivers handle I/O in that way, and this way makes AIO not possible
> basically for dm-snapshot.
It doesn't have anything to do with asynchronous I/O. Of course you can do
asynchronous I/O on dm-snapshot.
> >> Also sometimes it can hurt performance by converting I/O submission
> >> from one context into concurrent contexts of workqueue, especially
> >> in case of sequential I/O, since plug & plug merge can't be used any
> >> more.
> >
> > You can add blk_start_plug/blk_finish_plug to the function
> > bio_alloc_rescue. That would be reasonable to make sure that the requests
> > are merged even when they are offloaded to rescue thread.
>
> The IOs submitted from each wq context are not contiguous any
> more, so plug merge isn't doable, not to mention the extra context switch
> cost.
If the requests are mergeable, blk_start_plug/blk_finish_plug will merge
them; if they are not, it won't.
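As a rough illustration of that point, here is a toy userspace model of a
plug list. It is only a sketch, not kernel code: the names toy_req, toy_plug,
toy_plug_init and toy_plug_add are made up for this example, and the only
merge rule modeled is a back-merge with the most recently queued request.

```c
#include <stddef.h>

/* Toy userspace model of a plug list; this is NOT the kernel's struct blk_plug. */
#define PLUG_MAX 16

struct toy_req {
	unsigned long long sector;	/* first sector of the request */
	unsigned int nr_sectors;	/* length in sectors */
};

struct toy_plug {
	struct toy_req reqs[PLUG_MAX];
	size_t count;
};

static void toy_plug_init(struct toy_plug *plug)
{
	plug->count = 0;
}

/*
 * Queue a request on the plug. If it starts exactly where the most
 * recently queued request ends, extend that request (back-merge).
 * Returns 1 if merged, 0 if queued as a separate request, -1 if full.
 */
static int toy_plug_add(struct toy_plug *plug, unsigned long long sector,
			unsigned int nr_sectors)
{
	if (plug->count > 0) {
		struct toy_req *last = &plug->reqs[plug->count - 1];

		if (last->sector + last->nr_sectors == sector) {
			last->nr_sectors += nr_sectors;
			return 1;
		}
	}
	if (plug->count == PLUG_MAX)
		return -1;
	plug->reqs[plug->count].sector = sector;
	plug->reqs[plug->count].nr_sectors = nr_sectors;
	plug->count++;
	return 0;
}
```

In this model, adjacent requests are merged regardless of which context
queued them, and non-adjacent requests simply stay separate.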
> This kind of cost can be introduced for all bio devices just for handling
> the unusual case, fair?
Offloading bios to a worker thread when the make_request_fn function
blocks is required to avoid a deadlock (BTW, the deadlock became more
common in kernel 4.3 due to the unrestricted size of bios).
The bio list current->bio_list introduces a false locking dependency:
completion of a bio depends on completion of other bios on
current->bio_list directed to different devices, so it can create a
circular dependency resulting in a deadlock.
To avoid the circular dependency, each bio must be offloaded to a specific
workqueue, so that completion of bio for device A no longer depends on
completion of another bio for device B.
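The idea can be sketched with a toy userspace model. This is an illustration
under stated assumptions, not the patch itself: toy_bio, toy_bio_list,
toy_list_add, toy_list_pop and toy_flush_to_rescue are hypothetical names,
devices are plain array indices, and the worker threads that would drain each
rescue list are left out.

```c
#include <stddef.h>

#define TOY_NR_DEVICES 2

/* Toy stand-in for a bio: only the target device and a list link. */
struct toy_bio {
	int dev;		/* assumed to be in [0, TOY_NR_DEVICES) */
	struct toy_bio *next;
};

struct toy_bio_list {
	struct toy_bio *head;
	struct toy_bio *tail;
};

/* Append a bio to the tail of a list. */
static void toy_list_add(struct toy_bio_list *list, struct toy_bio *bio)
{
	bio->next = NULL;
	if (list->tail)
		list->tail->next = bio;
	else
		list->head = bio;
	list->tail = bio;
}

/* Remove and return the bio at the head of a list, or NULL if empty. */
static struct toy_bio *toy_list_pop(struct toy_bio_list *list)
{
	struct toy_bio *bio = list->head;

	if (bio) {
		list->head = bio->next;
		if (!list->head)
			list->tail = NULL;
		bio->next = NULL;
	}
	return bio;
}

/*
 * Model of what happens when the submitting process is about to block:
 * every bio still pending on the shared list (the analogue of
 * current->bio_list) is moved to the rescue list of its own device, so
 * bios for different devices no longer wait behind each other.
 * Returns the number of bios moved.
 */
static size_t toy_flush_to_rescue(struct toy_bio_list *pending,
				  struct toy_bio_list rescue[TOY_NR_DEVICES])
{
	size_t moved = 0;
	struct toy_bio *bio;

	while ((bio = toy_list_pop(pending)) != NULL) {
		toy_list_add(&rescue[bio->dev], bio);
		moved++;
	}
	return moved;
}
```

After the flush, the shared list is empty and each device has its own queue,
with submission order preserved per device.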
> >> > - queue_work(bs->rescue_workqueue, &bs->rescue_work);
> >> > + spin_lock(&bs->rescue_lock);
> >> > + bio_list_add(&bs->rescue_list, bio);
> >> > + queue_work(bs->rescue_workqueue, &bs->rescue_work);
> >> > + spin_unlock(&bs->rescue_lock);
> >> > + }
> >>
> >> Unlike the rescuing path, scheduling out can be quite frequent, and the
> >> above change will switch to submitting these I/Os from wq concurrently,
> >> which might hurt performance for sequential I/O.
> >>
> >> Also I am wondering why not submit these I/Os in 'current' context
> >> just like what flush plug does?
> >
> > Processing requests doesn't block (they only take the queue spinlock).
> >
> > Processing bios can block (they can take other mutexes or semaphores), so
> > processing them from the schedule hook is impossible - the bio's
> > make_request function could attempt to take some lock that is already
> > held. So - we must offload the bios to a separate workqueue.
>
> Yes, so better to just handle dm-snapshot in this way.
All dm targets and almost all other bio-processing drivers can block in
the make_request_fn function (for example, they may block when allocating
from a mempool).
Mikulas
> Thanks,
> Ming Lei
>