All of lore.kernel.org
 help / color / mirror / Atom feed
* blk-mq crash with dm-multipath in for-3.20/core
@ 2015-02-09 16:38 Dongsu Park
  2015-02-09 16:48 ` Mike Snitzer
  0 siblings, 1 reply; 7+ messages in thread
From: Dongsu Park @ 2015-02-09 16:38 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Keith Busch, Mike Snitzer, dm-devel, linux-kernel

Hi Jens,

during testing with the linux-block for-3.20/core branch, I hit a BUG
like below. It's reproducible by running xfstests/xfs/279.

Bisecting showed that the first bad commit is 6d6285c45f5a ("block:
require blk_rq_prep_clone() be given an initialized clone request").
With reverting this commit, the crash disappears.
The linux-dm's branch dm-for-3.20 works fine without crash too.

As pointed out already by Keith Busch in a thread, [1] that commit should
not be there in the first place. Commit 102e38b1030e ("dm: split
request structure out from dm_rq_target_io structure") from linux-dm tree
[2] is going to move the blk_rq_init() call again to __clone_rq().

So that commit 6d6285c45f5a should be either reverted, or moved to
linux-dm tree, doesn't it?

Cheers,
Dongsu

[1] https://www.redhat.com/archives/dm-devel/2015-January/msg00171.html
[2] https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-for-3.20&id=102e38b1030e883efc022dfdc7b7e7a3de70d1c5

------------[ cut here ]------------
kernel BUG at block/blk-core.c:2333!
RIP: 0010: [<ffffffff814c6858>] blk_dequeue_request+0x78/0x90
Call Trace:
 [<ffffffff814c6886>] blk_start_request+0x16/0x70
 [<ffffffff8169f9fa>] dm_start_request+0x1a/0x50
 [<ffffffff8169fce6>] dm_request_fn+0x2b6/0x3e0
 [<ffffffff814c0087>] __blk_run_queue+0x37/0x50
 [<ffffffff814c31ed>] queue_unplugged+0x5d/0x230
 [<ffffffff814c710c>] blk_flush_plug_list+0x1ac/0x230
 [<ffffffff814c7708>] blk_finish_plug+0x18/0x60
 [<ffffffff811baea1>] __do_page_cache_readahead+0x2b1/0x320
 [<ffffffff811bad55>] ? __do_page_cache_readahead+0x165/0x320
 [<ffffffff811baff2>] ondemand_readahead+0xe2/0x480
 [<ffffffff811ac3ff>] ? pagecache_get_page+0x2f/0x200
 [<ffffffff811bb4c1>] page_cache_sync_readahead+0x31/0x50
 [<ffffffff811ad5bc>] generic_file_read_iter+0x51c/0x630
 [<ffffffff811dd00e>] ? might_fault+0x5e/0xc0
 [<ffffffff81261e37>] blkdev_read_iter+0x37/0x40
 [<ffffffff8121fa4e>] new_sync_read+0x7e/0xb0
 [<ffffffff81220ce8>] __vfs_read+0x18/0x50
 [<ffffffff81220dad>] vfs_read+0x8d/0x150
 [<ffffffff81220eb9>] SyS_read+0x49/0xb0
 [<ffffffff817dce52>] system_call_fastpath+0x12/0x17
RIP  [<ffffffff814c6858>] blk_dequeue_request+0x78/0x90
 RSP <ffff88006e1eba68>
---[ end trace dcfc3d438518b1aa ]---


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: blk-mq crash with dm-multipath in for-3.20/core
  2015-02-09 16:38 blk-mq crash with dm-multipath in for-3.20/core Dongsu Park
@ 2015-02-09 16:48 ` Mike Snitzer
  2015-02-09 17:07   ` Keith Busch
  0 siblings, 1 reply; 7+ messages in thread
From: Mike Snitzer @ 2015-02-09 16:48 UTC (permalink / raw)
  To: Dongsu Park; +Cc: Jens Axboe, Keith Busch, dm-devel, linux-kernel

On Mon, Feb 09 2015 at 11:38am -0500,
Dongsu Park <dongsu.park@profitbricks.com> wrote:

> Hi Jens,
> 
> during testing with the linux-block for-3.20/core branch, I hit a BUG
> like below. It's reproducible by running xfstests/xfs/279.
> 
> Bisecting showed that the first bad commit is 6d6285c45f5a ("block:
> require blk_rq_prep_clone() be given an initialized clone request").
> With reverting this commit, the crash disappears.
> The linux-dm's branch dm-for-3.20 works fine without crash too.
> 
> As pointed out already by Keith Busch in a thread, [1] that commit should
> not be there in the first place. Commit 102e38b1030e ("dm: split
> request structure out from dm_rq_target_io structure") from linux-dm tree
> [2] is going to move the blk_rq_init() call again to __clone_rq().
> 
> So that commit 6d6285c45f5a should be either reverted, or moved to
> linux-dm tree, doesn't it?
> 
> Cheers,
> Dongsu
> 
> [1] https://www.redhat.com/archives/dm-devel/2015-January/msg00171.html
> [2] https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-for-3.20&id=102e38b1030e883efc022dfdc7b7e7a3de70d1c5

Right, we're aware of this typo in 6d6285c45f5a.  Sorry about that, but
as you noted, once both the linux-block and linux-dm branches for 3.20
are merged all is back to working.

So we're planning to just leave the block commit as broken and let the
dm commit you noted fix it up.  In the end 3.20-rc1 will have a working
dm-multipath.

Mike

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: blk-mq crash with dm-multipath in for-3.20/core
  2015-02-09 16:48 ` Mike Snitzer
@ 2015-02-09 17:07   ` Keith Busch
  2015-02-09 17:13     ` Mike Snitzer
  0 siblings, 1 reply; 7+ messages in thread
From: Keith Busch @ 2015-02-09 17:07 UTC (permalink / raw)
  To: Mike Snitzer; +Cc: Dongsu Park, Jens Axboe, Keith Busch, dm-devel, linux-kernel

On Mon, 9 Feb 2015, Mike Snitzer wrote:
> On Mon, Feb 09 2015 at 11:38am -0500,
> Dongsu Park <dongsu.park@profitbricks.com> wrote:
>> So that commit 6d6285c45f5a should be either reverted, or moved to
>> linux-dm tree, doesn't it?
>>
>> Cheers,
>> Dongsu
>>
>> [1] https://www.redhat.com/archives/dm-devel/2015-January/msg00171.html
>> [2] https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-for-3.20&id=102e38b1030e883efc022dfdc7b7e7a3de70d1c5
>
> Right, we're aware of this typo in 6d6285c45f5a.  Sorry about that, but
> as you noted, once both the linux-block and linux-dm branches for 3.20
> are merged all is back to working.
>
> So we're planning to just leave the block commit as broken and let the
> dm commit you noted fix it up.  In the end 3.20-rc1 will have a working
> dm-multipath.

Oh, we're not going rebase the series with the correction? I'm concerned
someone biscecting a completely unrelated problem might step on this
commit. Up to you guys. It's my fault, so I'll deal with the consequences.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: blk-mq crash with dm-multipath in for-3.20/core
  2015-02-09 17:07   ` Keith Busch
@ 2015-02-09 17:13     ` Mike Snitzer
  2015-02-09 17:35       ` [PATCH] dm: fix multipath regression due to initializing wrong request Mike Snitzer
  0 siblings, 1 reply; 7+ messages in thread
From: Mike Snitzer @ 2015-02-09 17:13 UTC (permalink / raw)
  To: Keith Busch; +Cc: Dongsu Park, Jens Axboe, dm-devel, linux-kernel

On Mon, Feb 09 2015 at 12:07pm -0500,
Keith Busch <keith.busch@intel.com> wrote:

> On Mon, 9 Feb 2015, Mike Snitzer wrote:
> >On Mon, Feb 09 2015 at 11:38am -0500,
> >Dongsu Park <dongsu.park@profitbricks.com> wrote:
> >>So that commit 6d6285c45f5a should be either reverted, or moved to
> >>linux-dm tree, doesn't it?
> >>
> >>Cheers,
> >>Dongsu
> >>
> >>[1] https://www.redhat.com/archives/dm-devel/2015-January/msg00171.html
> >>[2] https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-for-3.20&id=102e38b1030e883efc022dfdc7b7e7a3de70d1c5
> >
> >Right, we're aware of this typo in 6d6285c45f5a.  Sorry about that, but
> >as you noted, once both the linux-block and linux-dm branches for 3.20
> >are merged all is back to working.
> >
> >So we're planning to just leave the block commit as broken and let the
> >dm commit you noted fix it up.  In the end 3.20-rc1 will have a working
> >dm-multipath.
> 
> Oh, we're not going rebase the series with the correction? I'm concerned
> someone biscecting a completely unrelated problem might step on this
> commit. Up to you guys. It's my fault, so I'll deal with the consequences.

Rebasing this late (3.20 merge already opened) is generally not done.
Unfortunately we'll just have to suck it up and deal with the fallout
during bisect -- would wager very few are bisecting with multipath as
the focus of their test.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] dm: fix multipath regression due to initializing wrong request
  2015-02-09 17:13     ` Mike Snitzer
@ 2015-02-09 17:35       ` Mike Snitzer
  2015-02-09 17:47         ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: Mike Snitzer @ 2015-02-09 17:35 UTC (permalink / raw)
  To: Keith Busch, Jens Axboe; +Cc: Dongsu Park, dm-devel, linux-kernel

On Mon, Feb 09 2015 at 12:13P -0500,
Mike Snitzer <snitzer@redhat.com> wrote:

> On Mon, Feb 09 2015 at 12:07pm -0500,
> Keith Busch <keith.busch@intel.com> wrote:
> 
> > Oh, we're not going rebase the series with the correction? I'm concerned
> > someone biscecting a completely unrelated problem might step on this
> > commit. Up to you guys. It's my fault, so I'll deal with the consequences.
> 
> Rebasing this late (3.20 merge already opened) is generally not done.
> Unfortunately we'll just have to suck it up and deal with the fallout
> during bisect -- would wager very few are bisecting with multipath as
> the focus of their test.

Jens and I discussed this further and given that linux-block breaks
dm-multipath it is best to fix linux-block and let Linus resolve the
merge when I send him the linux-dm pull.

Here is the patch to fix the regression:

From: Mike Snitzer <snitzer@redhat.com>
Date: Mon, 9 Feb 2015 12:21:54 -0500
Subject: [PATCH] dm: fix multipath regression due to initializing wrong request

Commit febf715 ("block: require blk_rq_prep_clone() be given an
initialized clone request") introduced a regression by calling
blk_rq_init() on the original request rather than the clone
request that is passed to setup_clone().

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index f251633..71e6b73 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1719,7 +1719,7 @@ static int setup_clone(struct request *clone, struct request *rq,
 {
 	int r;
 
-	blk_rq_init(NULL, rq);
+	blk_rq_init(NULL, clone);
 	r = blk_rq_prep_clone(clone, rq, tio->md->bs, GFP_ATOMIC,
 			      dm_rq_bio_constructor, tio);
 	if (r)
-- 
1.9.3 (Apple Git-50)


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] dm: fix multipath regression due to initializing wrong request
  2015-02-09 17:35       ` [PATCH] dm: fix multipath regression due to initializing wrong request Mike Snitzer
@ 2015-02-09 17:47         ` Jens Axboe
  2015-02-10 10:42           ` Dongsu Park
  0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2015-02-09 17:47 UTC (permalink / raw)
  To: Mike Snitzer, Keith Busch; +Cc: Dongsu Park, dm-devel, linux-kernel

On 02/09/2015 10:35 AM, Mike Snitzer wrote:
> On Mon, Feb 09 2015 at 12:13P -0500,
> Mike Snitzer <snitzer@redhat.com> wrote:
>
>> On Mon, Feb 09 2015 at 12:07pm -0500,
>> Keith Busch <keith.busch@intel.com> wrote:
>>
>>> Oh, we're not going rebase the series with the correction? I'm concerned
>>> someone biscecting a completely unrelated problem might step on this
>>> commit. Up to you guys. It's my fault, so I'll deal with the consequences.
>>
>> Rebasing this late (3.20 merge already opened) is generally not done.
>> Unfortunately we'll just have to suck it up and deal with the fallout
>> during bisect -- would wager very few are bisecting with multipath as
>> the focus of their test.
>
> Jens and I discussed this further and given that linux-block breaks
> dm-multipath it is best to fix linux-block and let Linus resolve the
> merge when I send him the linux-dm pull.
>
> Here is the patch to fix the regression:

Added, thanks. I don't think this is worth rebasing for, so just added 
to the top of for-3.20/core (since that's where the buggy commit was added).

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] dm: fix multipath regression due to initializing wrong request
  2015-02-09 17:47         ` Jens Axboe
@ 2015-02-10 10:42           ` Dongsu Park
  0 siblings, 0 replies; 7+ messages in thread
From: Dongsu Park @ 2015-02-10 10:42 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Mike Snitzer, Keith Busch, dm-devel, linux-kernel

On 09.02.2015 10:47, Jens Axboe wrote:
> On 02/09/2015 10:35 AM, Mike Snitzer wrote:
> >On Mon, Feb 09 2015 at 12:13P -0500,
> >Mike Snitzer <snitzer@redhat.com> wrote:
> >
> >Jens and I discussed this further and given that linux-block breaks
> >dm-multipath it is best to fix linux-block and let Linus resolve the
> >merge when I send him the linux-dm pull.
> >
> >Here is the patch to fix the regression:
> 
> Added, thanks. I don't think this is worth rebasing for, so just added to
> the top of for-3.20/core (since that's where the buggy commit was added).

Thanks a lot. Now the branch for-3.20/core works without hitting the BUG.

Dongsu

> -- 
> Jens Axboe
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2015-02-10 10:42 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-09 16:38 blk-mq crash with dm-multipath in for-3.20/core Dongsu Park
2015-02-09 16:48 ` Mike Snitzer
2015-02-09 17:07   ` Keith Busch
2015-02-09 17:13     ` Mike Snitzer
2015-02-09 17:35       ` [PATCH] dm: fix multipath regression due to initializing wrong request Mike Snitzer
2015-02-09 17:47         ` Jens Axboe
2015-02-10 10:42           ` Dongsu Park

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.