Linux-Block Archive on lore.kernel.org
From: Ming Lei <ming.lei@redhat.com>
To: Christoph Hellwig <hch@lst.de>
Cc: axboe@fb.com, linux-block@vger.kernel.org
Subject: Re: [PATCH 1/4] block: don't decrement nr_phys_segments for physically contigous segments
Date: Sat, 18 May 2019 07:02:09 +0800
Message-ID: <20190517230208.GB22236@ming.t460p>
In-Reply-To: <20190516131703.GA26943@ming.t460p>

On Thu, May 16, 2019 at 09:17:04PM +0800, Ming Lei wrote:
> Hi Christoph,
> 
> On Thu, May 16, 2019 at 10:40:55AM +0200, Christoph Hellwig wrote:
> > Currently ll_merge_requests_fn, unlike all other merge functions,
> > reduces nr_phys_segments by one if the last segment of the previous
> > request and the first segment of the next request are contiguous.  While this
> > seems like a nice solution to avoid building smaller than possible
> > requests it causes a mismatch between the segments actually present
> > in the request and those iterated over by the bvec iterators, including
> > __rq_for_each_bio.  This could cause overwrites of too small kmalloc
> > allocations in any driver using ranged discard, or also mistrigger
> > the single segment optimization in the nvme-pci driver.
> > 
> > We could possibly work around this by making the bvec iterators take
> > the front and back segment size into account, but that would require
> > moving them from the bio to the bio_iter and spreading this mess
> > over all users of bvecs.  Or we could simply remove this optimization
> > under the assumption that most users already build good enough bvecs,
> > and that the bio merge path never cared about this optimization
> > either.  The latter is what this patch does.
> > 
> > Fixes: b35ba01ea697 ("nvme: support ranged discard requests")
> > Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support")
> 
> ll_merge_requests_fn() is only called from attempt_merge() when
> blk_try_req_merge() returns ELEVATOR_BACK_MERGE. However, for discard
> merges in both virtio_blk and nvme, blk_try_req_merge() always returns
> ELEVATOR_DISCARD_MERGE in attempt_merge(), so it looks like
> ll_merge_requests_fn() shouldn't be called for virtio_blk/nvme discard
> requests. Could you explain how the change to ll_merge_requests_fn()
> in this patch makes a difference for the above two commits?
> 
> > Fixes: 297910571f08 ("nvme-pci: optimize mapping single segment requests using SGLs")
> 
> I guess it should be dff824b2aadb ("nvme-pci: optimize mapping of small
> single segment requests").
> 
> Yes, this patch helps in this case, because blk_rq_nr_phys_segments() may
> be 1 even though there are two bios that share the same segment.

BTW, I just sent a single-line nvme-pci fix for this issue, which may be
more suitable as a v5.2 fix:

http://lists.infradead.org/pipermail/linux-nvme/2019-May/024283.html

Thanks,
Ming

Thread overview: 21+ messages
2019-05-16  8:40 fix nr_phys_segments vs iterators accounting v2 Christoph Hellwig
2019-05-16  8:40 ` [PATCH 1/4] block: don't decrement nr_phys_segments for physically contigous segments Christoph Hellwig
2019-05-16  8:48   ` Hannes Reinecke
2019-05-16 13:17   ` Ming Lei
2019-05-17 23:02     ` Ming Lei [this message]
2019-05-20 11:11     ` Christoph Hellwig
2019-05-21  1:04       ` Ming Lei
2019-05-16  8:40 ` [PATCH 2/4] block: force an unlimited segment size on queues with a virt boundary Christoph Hellwig
2019-05-16  8:49   ` Hannes Reinecke
2019-05-16  8:40 ` [PATCH 3/4] block: remove the segment size check in bio_will_gap Christoph Hellwig
2019-05-16  8:49   ` Hannes Reinecke
2019-05-16  8:40 ` [PATCH 4/4] block: remove the bi_seg_{front,back}_size fields in struct bio Christoph Hellwig
2019-05-16  8:50   ` Hannes Reinecke
2019-05-20 11:17 ` fix nr_phys_segments vs iterators accounting v2 Christoph Hellwig
2019-05-21  1:09   ` Jens Axboe
2019-05-21  1:17     ` Ming Lei
2019-05-21  1:20       ` Jens Axboe
2019-05-21  1:29         ` Ming Lei
2019-05-21  5:11           ` Christoph Hellwig
2019-05-21  7:01 fix nr_phys_segments vs iterators accounting v3 Christoph Hellwig
2019-05-21  7:01 ` [PATCH 1/4] block: don't decrement nr_phys_segments for physically contigous segments Christoph Hellwig
2019-05-21  8:05   ` Ming Lei
