linux-block.vger.kernel.org archive mirror
From: John Dorminy <jdorminy@redhat.com>
To: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Cc: axboe@kernel.dk, linux-block@vger.kernel.org,
	Alasdair G Kergon <agk@redhat.com>,
	dm-devel@redhat.com, Mike Snitzer <snitzer@redhat.com>
Subject: Re: [dm-devel] [v2] block: Fix a WRITE SAME BUG_ON
Date: Sat, 26 Jan 2019 12:17:57 +0100	[thread overview]
Message-ID: <CAMeeMh_=UVGwo-=9WtBXRmiHTXnLAswqis97bM-aiuwnzPU0QQ@mail.gmail.com> (raw)
In-Reply-To: <20190125021107.4595-1-zhangxiaoxu5@huawei.com>

Hi. I have read a bit of DM code and spent an hour reviewing this. I
haven't worked out what the right fix for the problem is, and my
understanding may be wrong, but I have two thoughts about the patch:

I don't think this is the right solution for two reasons:

In the first place, if it's an LVM-only issue, we should fix it only
for device-mapper devices. If this is the right way to fix it, one way
to do that would be to change DM's calls to
blk_queue_max_write_same_sectors() to set a nonzero max only when the
logical block sizes of all underlying devices match.

In the second place, I don't think this is the problem. The line of
code that's calling BUG_ON is, I think,
BUG_ON(bio_offset(bio) || bio_iovec(bio).bv_len != sdp->sector_size);

This is because write_same bios are supposed to have a single sector
of data at the beginning of a page in their bio_iovec(bio) [I think,
based on a commit message I've read] -- in other words, bio_iovec(bio)
is supposed to say something like { .page = something, .offset = 0,
.len = 1 native sector }. But clearly, since the BUG_ON is being
called, one of these is not true. Have you added a log statement right
before the BUG_ON() to print out bio_offset(bio) and
bio_iovec(bio).bv_len to see which value defies expectations?

I would be happy to help you trace through this or attempt to
reproduce it myself -- what stack of devices can you reproduce this
on? Is this a dm-linear device on top of a disk? Does it have a
filesystem on top, and if so, what filesystem?

Thank you!

John Dorminy


On Fri, Jan 25, 2019 at 9:53 AM Zhang Xiaoxu <zhangxiaoxu5@huawei.com> wrote:
>
> If an LVM device is stacked on top of disks with different
> logical_block_size values, a WRITE SAME to it triggers a BUG_ON:
>
> kernel BUG at drivers/scsi/sd.c:968!
> invalid opcode: 0000 [#1] SMP PTI
> CPU: 11 PID: 525 Comm: kworker/11:1H Tainted: G           O      5.0.0-rc3+ #2
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
> Workqueue: kblockd blk_mq_run_work_fn
> RIP: 0010:sd_init_command+0x7aa/0xdb0
> Code: 30 75 00 00 c7 85 44 01 00 00 18 00 00 00 0f 85 fa 04 00 00 48 83 c4 40 48
>       89 df 5b 5d 41 5c 41 5d 41 5e 41 5f e9 b6 ca fe ff <0f> 0b 41 bc 09
> RSP: 0018:ffffb55f80ddbca0 EFLAGS: 00010206
> RAX: 0000000000001000 RBX: ffff9ed23fb927a0 RCX: 000000000000f000
> RDX: ffff9ed23f0a8400 RSI: ffff9ed27bc79800 RDI: 0000000000000000
> RBP: ffff9ed23fb92680 R08: ffff9ed27c8c0000 R09: ffff9ed23fb927d8
> R10: 0000000000000000 R11: fefefefefefefeff R12: ffff9ed27bc79800
> R13: ffff9ed2787a0000 R14: ffff9ed27bdf3400 R15: ffff9ed23fb927a0
> FS:  0000000000000000(0000) GS:ffff9ed27c8c0000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f6b14cf9341 CR3: 0000000069058000 CR4: 00000000000006e0
> Call Trace:
>  ? vp_notify+0x12/0x20
>  scsi_queue_rq+0x525/0xa30
>  blk_mq_dispatch_rq_list+0x8d/0x580
>  ? syscall_return_via_sysret+0x10/0x7f
>  ? elv_rb_del+0x1f/0x30
>  ? deadline_remove_request+0x55/0xc0
>  blk_mq_do_dispatch_sched+0x76/0x110
>  blk_mq_sched_dispatch_requests+0xf9/0x170
>  __blk_mq_run_hw_queue+0x51/0xd0
>  process_one_work+0x195/0x380
>  worker_thread+0x30/0x390
>  ? process_one_work+0x380/0x380
>  kthread+0x113/0x130
>  ? kthread_park+0x90/0x90
>  ret_from_fork+0x35/0x40
> Modules linked in: alloc(O+)
> ---[ end trace dc92ddeb2e6d1fe5 ]---
>
> The logical_block_size of the LVM device is the maximum across its
> underlying disks, so it may differ from that of an individual disk.
> A WRITE SAME issued to the stacked device then hits the BUG_ON while
> setting up the WRITE SAME command.
>
> Disable the WRITE SAME feature on the LVM device if it is stacked
> from disks with different logical_block_size values.
>
> Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
> ---
>  block/blk-settings.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index 3e7038e475ee..e4664280edb4 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -497,8 +497,6 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
>         t->max_sectors = min_not_zero(t->max_sectors, b->max_sectors);
>         t->max_hw_sectors = min_not_zero(t->max_hw_sectors, b->max_hw_sectors);
>         t->max_dev_sectors = min_not_zero(t->max_dev_sectors, b->max_dev_sectors);
> -       t->max_write_same_sectors = min(t->max_write_same_sectors,
> -                                       b->max_write_same_sectors);
>         t->max_write_zeroes_sectors = min(t->max_write_zeroes_sectors,
>                                         b->max_write_zeroes_sectors);
>         t->bounce_pfn = min_not_zero(t->bounce_pfn, b->bounce_pfn);
> @@ -537,6 +535,14 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
>                 }
>         }
>
> +       /* If the logical block size is different, forbid write same */
> +       if (t->logical_block_size != b->logical_block_size &&
> +           t->max_write_same_sectors != UINT_MAX)
> +               t->max_write_same_sectors = 0;
> +       else
> +               t->max_write_same_sectors = min(t->max_write_same_sectors,
> +                                               b->max_write_same_sectors);
> +
>         t->logical_block_size = max(t->logical_block_size,
>                                     b->logical_block_size);
>
> --
> 2.14.4
>
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel
