linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Martin K. Petersen" <martin.petersen@oracle.com>
To: Mike Snitzer <snitzer@redhat.com>
Cc: John Dorminy <jdorminy@redhat.com>,
	Zhang Xiaoxu <zhangxiaoxu5@huawei.com>,
	axboe@kernel.dk, linux-block@vger.kernel.org,
	dm-devel@redhat.com, Alasdair G Kergon <agk@redhat.com>
Subject: Re: block: Fix a WRITE SAME BUG_ON
Date: Mon, 28 Jan 2019 23:54:08 -0500	[thread overview]
Message-ID: <yq1lg34f4i7.fsf@oracle.com> (raw)
In-Reply-To: <20190128221441.GA24102@redhat.com> (Mike Snitzer's message of "Mon, 28 Jan 2019 17:14:42 -0500")


Mike,

>> In the first place, if it's an LVM-only issue, we should fix it only
>> for device-mapper devices. If this is the right way to fix it,
>> possibly the way to do that would be to change DM calls to
>> blk_queue_max_write_same_sectors() to only set the max sectors to
>> more than 0 if and only if the logical block sizes match.
>
> There is no way this is specific to lvm (or DM).  It may _seem_ that way
> because lvm/dm are in the business of creating stacked devices --
> whereby exposing users to blk_stack_limits().
>
> I'll have a closer look at this issue, hopefully tomorrow, but Zhang
> Xiaoxu's proposed fix looks bogus to me.  Not disputing there is an
> issue, just feels like a different fix is needed.

It's caused by a remnant of the old bio payload hack in sd.c:

	BUG_ON(bio_offset(bio) || bio_iovec(bio).bv_len != sdp->sector_size);

We rounded up LBS when we created the DM device. And therefore the
bv_len coming down is 4K. But one of the component devices has a LBS of
512 and fails this check.

At first glance one could argue we should just nuke the BUG_ON since the
sd code no longer relies on bv_len. However, the semantics for WRITE
SAME are particularly challenging in this scenario. Say the filesystem
wants to WRITE SAME a 4K PAGE consisting of 512 bytes of zeroes,
followed by 512 bytes of ones, followed by 512 bytes of twos, etc. If a
component device only has a 512-byte LBS, we would end up writing zeroes
to the entire 4K block on that component device instead of the correct
pattern. Not good.

So disallowing WRITE SAME unless all component devices have the same LBS
is the correct fix.

That said, now that we have REQ_OP_WRITE_ZEROES (where the LBS is
irrelevant due to the payload being the ZERO_PAGE), it may be worthwhile
to remove REQ_OP_WRITE_SAME. I think drbd is the only user relying on a
non-zero payload. The target code ends up manually iterating, if I
remember correctly...

-- 
Martin K. Petersen	Oracle Linux Engineering

  reply	other threads:[~2019-01-29  4:54 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20190125021107.4595-1-zhangxiaoxu5@huawei.com>
2019-01-26 11:17 ` [dm-devel] [v2] block: Fix a WRITE SAME BUG_ON John Dorminy
2019-01-28  5:48   ` zhangxiaoxu (A)
2019-01-28 22:14   ` Mike Snitzer
2019-01-29  4:54     ` Martin K. Petersen [this message]
2019-01-29  8:52       ` Christoph Hellwig
2019-01-30  6:50         ` zhangxiaoxu (A)
2019-01-30 14:08       ` John Dorminy
2019-01-31  0:58         ` zhangxiaoxu (A)
2019-01-31  2:23         ` Martin K. Petersen
2019-01-31 10:39         ` Christoph Hellwig
2019-01-31 19:41           ` John Dorminy
2019-02-01  7:35             ` Christoph Hellwig
2019-02-01 14:09               ` John Dorminy
2019-02-01 16:03                 ` [dm-devel] " Heinz Mauelshagen
2019-02-01 16:18                   ` Christoph Hellwig
2019-02-12  3:11                     ` zhangxiaoxu (A)
2019-02-14  2:31                       ` Martin K. Petersen
2019-02-14  9:36                         ` zhangxiaoxu (A)
2019-02-18 14:10                           ` zhangxiaoxu (A)
2019-02-19 23:10                             ` Martin K. Petersen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=yq1lg34f4i7.fsf@oracle.com \
    --to=martin.petersen@oracle.com \
    --cc=agk@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=dm-devel@redhat.com \
    --cc=jdorminy@redhat.com \
    --cc=linux-block@vger.kernel.org \
    --cc=snitzer@redhat.com \
    --cc=zhangxiaoxu5@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).