From: NeilBrown <neilb@suse.com>
To: Mike Snitzer <snitzer@redhat.com>, Jens Axboe <axboe@kernel.dk>
Cc: Jack Wang <jinpu.wang@profitbricks.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Lars Ellenberg <lars.ellenberg@linbit.com>,
	Kent Overstreet <kent.overstreet@gmail.com>,
	Pavel Machek <pavel@ucw.cz>,
	Mikulas Patocka <mpatocka@redhat.com>
Subject: Re: blk: improve order of bio handling in generic_make_request()
Date: Wed, 08 Mar 2017 07:29:55 +1100
Message-ID: <87tw74j0e4.fsf@notabene.neil.brown.name>
In-Reply-To: <20170307171436.GA2109@redhat.com>


On Tue, Mar 07 2017, Mike Snitzer wrote:

> On Tue, Mar 07 2017 at 12:05pm -0500,
> Jens Axboe <axboe@kernel.dk> wrote:
>
>> On 03/07/2017 09:52 AM, Mike Snitzer wrote:
>> > On Tue, Mar 07 2017 at  3:49am -0500,
>> > Jack Wang <jinpu.wang@profitbricks.com> wrote:
>> > 
>> >>
>> >>
>> >> On 06.03.2017 21:18, Jens Axboe wrote:
>> >>> On 03/05/2017 09:40 PM, NeilBrown wrote:
>> >>>> On Fri, Mar 03 2017, Jack Wang wrote:
>> >>>>>
>> >>>>> Thanks Neil for pushing the fix.
>> >>>>>
>> >>>>> We can optimize generic_make_request a little bit:
>> >>>>> - assign bio_list struct hold directly instead init and merge
>> >>>>> - remove duplicate code
>> >>>>>
>> >>>>> I think better to squash into your fix.
>> >>>>
>> >>>> Hi Jack,
>> >>>>  I don't object to your changes, but I'd like to see a response from
>> >>>>  Jens first.
>> >>>>  My preference would be to get the original patch in, then other changes
>> >>>>  that build on it, such as this one, can be added.  Until the core
>> >>>>  changes lands, any other work is pointless.
>> >>>>
>> >>>>  Of course if Jens wants a this merged before he'll apply it, I'll
>> >>>>  happily do that.
>> >>>
>> >>> I like the change, and thanks for tackling this. It's been a pending
>> >>> issue for way too long. I do think we should squash Jack's patch
>> >>> into the original, as it does clean up the code nicely.
>> >>>
>> >>> Do we have a proper test case for this, so we can verify that it
>> >>> does indeed also work in practice?
>> >>>
>> >> Hi Jens,
>> >>
>> >> I can trigger deadlock with in RAID1 with test below:
>> >>
>> >> I create one md with one local loop device and one remote scsi
>> >> exported by SRP. running fio with mix rw on top of md, force_close
>> >> session on storage side. mdx_raid1 is wait on free_array in D state,
>> >> and a lot of fio also in D state in wait_barrier.
>> >>
>> >> With the patch from Neil above, I can no longer trigger it anymore.
>> >>
>> >> The discussion was in link below:
>> >> http://www.spinics.net/lists/raid/msg54680.html
>> > 
>> > In addition to Jack's MD raid test there is a DM snapshot deadlock test,
>> > albeit unpolished/needy to get running, see:
>> > https://www.redhat.com/archives/dm-devel/2017-January/msg00064.html
>> 
>> Can you run this patch with that test, reverting your DM workaround?
>
> Yeap, will do.  Last time Mikulas tried a similar patch it still
> deadlocked.  But I'll give it a go (likely tomorrow).

I don't think this will fix the DM snapshot deadlock by itself.
Rather, it makes it possible for some internal changes to DM to fix it.
The DM change might be something vaguely like:

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 3086da5664f3..06ee0960e415 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1216,6 +1216,14 @@ static int __split_and_process_non_flush(struct clone_info *ci)

 	len = min_t(sector_t, max_io_len(ci->sector, ti), ci->sector_count);

+	if (len < ci->sector_count) {
+		struct bio *split = bio_split(bio, len, GFP_NOIO, fs_bio_set);
+		bio_chain(split, bio);
+		generic_make_request(bio);
+		bio = split;
+		ci->sector_count = len;
+	}
+
 	r = __clone_and_map_data_bio(ci, ti, ci->sector, &len);
 	if (r < 0)
 		return r;

Instead of looping inside DM, this change causes the remainder to be
passed to generic_make_request(), and DM only handles one region at a
time.  So there is only one loop, in the top generic_make_request().
That loop will now reliably handle bios in the "right" order.
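
To illustrate the single-loop idea outside the kernel, here is a minimal
userspace sketch in C.  It is not the DM or block-layer code: toy_bio,
toy_submit(), toy_make_request() and MAX_IO_LEN are hypothetical names
made up for this example, and it models only the basic queue-and-loop
behaviour, not the ordering changes in the patch under discussion.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a bio: a contiguous run of sectors (hypothetical). */
struct toy_bio {
	unsigned long sector;
	unsigned long count;
};

#define MAX_PENDING 64
#define MAX_IO_LEN  8	/* assumed per-target limit, in the spirit of max_io_len() */

/* Simple FIFO of submitted-but-not-yet-handled bios. */
static struct toy_bio pending[MAX_PENDING];
static size_t head, tail;
static int loop_active;	/* models "a submission loop is already running" */

/* Regions in the order the driver actually handled them. */
static struct toy_bio handled[MAX_PENDING];
static size_t nr_handled;

static void toy_submit(struct toy_bio bio);

/* Driver entry point: handle at most one region, resubmit the remainder. */
static void toy_make_request(struct toy_bio bio)
{
	if (bio.count > MAX_IO_LEN) {
		struct toy_bio rest = {
			bio.sector + MAX_IO_LEN,
			bio.count - MAX_IO_LEN,
		};

		toy_submit(rest);	/* queued by the outer loop, no recursion */
		bio.count = MAX_IO_LEN;
	}
	assert(nr_handled < MAX_PENDING);
	handled[nr_handled++] = bio;
}

/* Models the single top-level loop: nested submissions are only queued. */
static void toy_submit(struct toy_bio bio)
{
	assert(tail < MAX_PENDING);
	pending[tail++] = bio;
	if (loop_active)
		return;		/* the outermost caller will pick it up */

	loop_active = 1;
	while (head < tail)
		toy_make_request(pending[head++]);
	loop_active = 0;
	head = tail = 0;
}
```

Calling toy_submit((struct toy_bio){ 0, 20 }) in this sketch records
three regions in handled[]: sectors 0-7, 8-15 and 16-19, each handed to
the driver by the one outer loop rather than by a loop inside the
driver.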

Thanks,
NeilBrown


Thread overview: 37+ messages
2017-03-03  5:14 [PATCH] blk: improve order of bio handling in generic_make_request() NeilBrown
2017-03-03  9:28 ` Jack Wang
2017-03-06  4:40   ` NeilBrown
2017-03-06  9:43     ` Jack Wang
2017-03-07 15:46       ` Pavel Machek
2017-03-07 15:53         ` Jack Wang
2017-03-07 16:21         ` Jens Axboe
2017-03-06 20:18     ` Jens Axboe
2017-03-07  8:49       ` Jack Wang
2017-03-07 16:52         ` Mike Snitzer
2017-03-07 17:05           ` Jens Axboe
2017-03-07 17:14             ` Mike Snitzer
2017-03-07 20:29               ` NeilBrown [this message]
2017-03-07 23:01                 ` Mike Snitzer
2017-03-08 16:40                 ` Mikulas Patocka
2017-03-08 17:15                   ` Lars Ellenberg
2017-03-09  6:08                   ` NeilBrown
2017-03-08 11:46           ` Lars Ellenberg
2017-03-07 20:38       ` [PATCH v2] " NeilBrown
2017-03-07 20:38         ` NeilBrown
2017-03-10  4:32         ` NeilBrown
2017-03-10  4:33           ` [PATCH 1/5 v3] " NeilBrown
2017-03-10  4:34           ` [PATCH 2/5] blk: remove bio_set arg from blk_queue_split() NeilBrown
2017-03-10  4:35           ` [PATCH 3/5] blk: make the bioset rescue_workqueue optional NeilBrown
2017-03-10  4:36           ` [PATCH 4/5] blk: use non-rescuing bioset for q->bio_split NeilBrown
2017-03-10  4:37           ` [PATCH 5/5] block_dev: make blkdev_dio_pool a non-rescuing bioset NeilBrown
2017-03-10  4:38           ` [PATCH v2] blk: improve order of bio handling in generic_make_request() Jens Axboe
2017-03-10  4:40             ` Jens Axboe
2017-03-10  5:19             ` NeilBrown
2017-03-10 12:34               ` Lars Ellenberg
2017-03-10 14:38                 ` Mike Snitzer
2017-03-10 14:55                   ` Mikulas Patocka
2017-03-10 15:07                     ` Jack Wang
2017-03-10 15:35                       ` Mike Snitzer
2017-03-10 18:51                       ` Lars Ellenberg
2017-03-11  0:47                 ` NeilBrown
2017-03-11  0:47                   ` NeilBrown
