[v2] block: bio_map_user_iov should not be limited to BIO_MAX_PAGES

Message ID 20190417115207.30202-1-pbonzini@redhat.com
State New
Series
  • [v2] block: bio_map_user_iov should not be limited to BIO_MAX_PAGES

Commit Message

Paolo Bonzini April 17, 2019, 11:52 a.m. UTC
Because bio_kmalloc uses inline iovecs, the limit on the number of entries
is not BIO_MAX_PAGES but rather UIO_MAXIOV, which indeed is already checked
in bio_kmalloc.  This could cause SG_IO requests to be truncated and the HBA
to report a DMA overrun.

Note that if the argument to iov_iter_npages were changed to UIO_MAXIOV,
we would still truncate SG_IO requests beyond UIO_MAXIOV pages.  Changing
it to UIO_MAXIOV + 1 instead ensures that bio_kmalloc notices that the
request is too big and blocks it.

Cc: stable@vger.kernel.org
Cc: Al Viro <viro@zeniv.linux.org.uk>
Fixes: b282cc766958 ("bio_map_user_iov(): get rid of the iov_for_each()", 2017-10-11)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

---
	v1->v2: now with "git commit"
---
 block/bio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Ming Lei April 18, 2019, 2:19 a.m. UTC | #1
Hi Paolo,

On Wed, Apr 17, 2019 at 01:52:07PM +0200, Paolo Bonzini wrote:
> Because bio_kmalloc uses inline iovecs, the limit on the number of entries
> is not BIO_MAX_PAGES but rather UIO_MAXIOV, which indeed is already checked
> in bio_kmalloc.  This could cause SG_IO requests to be truncated and the HBA
> to report a DMA overrun.

BIO_MAX_PAGES only limits a single bio's maximum vector count; if one
bio can't hold the whole user-space request, a new bio is allocated and
appended to the passthrough request, as long as the queue limits aren't
reached.

So my understanding is that an SG_IO request shouldn't be truncated
because of BIO_MAX_PAGES. Could you explain in a bit more detail or
provide a reproducer?

> 
> Note that if the argument to iov_iter_npages were changed to UIO_MAXIOV,
> we would still truncate SG_IO requests beyond UIO_MAXIOV pages.  Changing
> it to UIO_MAXIOV + 1 instead ensures that bio_kmalloc notices that the
> request is too big and blocks it.

We should pass UIO_MAXIOV instead of UIO_MAXIOV + 1; otherwise, bio_kmalloc()
will fail. Other than that, the patch looks fine, but it shouldn't be tagged
as a fix if my analysis above is correct.

Thanks,
Ming
Ming Lei April 18, 2019, 2:30 a.m. UTC | #2
On Thu, Apr 18, 2019 at 10:19:04AM +0800, Ming Lei wrote:
> Hi Paolo,
> 
> On Wed, Apr 17, 2019 at 01:52:07PM +0200, Paolo Bonzini wrote:
> > Because bio_kmalloc uses inline iovecs, the limit on the number of entries
> > is not BIO_MAX_PAGES but rather UIO_MAXIOV, which indeed is already checked
> > in bio_kmalloc.  This could cause SG_IO requests to be truncated and the HBA
> > to report a DMA overrun.
> 
> BIO_MAX_PAGES only limits a single bio's maximum vector count; if one
> bio can't hold the whole user-space request, a new bio is allocated and
> appended to the passthrough request, as long as the queue limits aren't
> reached.
> 
> So my understanding is that an SG_IO request shouldn't be truncated
> because of BIO_MAX_PAGES. Could you explain in a bit more detail or
> provide a reproducer?
> 
> > 
> > Note that if the argument to iov_iter_npages were changed to UIO_MAXIOV,
> > we would still truncate SG_IO requests beyond UIO_MAXIOV pages.  Changing
> > it to UIO_MAXIOV + 1 instead ensures that bio_kmalloc notices that the
> > request is too big and blocks it.
> 
> We should pass UIO_MAXIOV instead of UIO_MAXIOV + 1; otherwise, bio_kmalloc()
> will fail. Other than that, the patch looks fine, but it shouldn't be tagged
> as a fix if my analysis above is correct.

Also, multipage bvecs are now enabled for passthrough IO [1], so we
shouldn't need to allocate such large vector arrays any more; a reasonable
count is (PAGE_SIZE - sizeof(struct bio)) / sizeof(struct bio_vec), which
avoids stressing mm with high-order allocations.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=for-5.2/block&id=489fbbcb51d0249569d863f9220de69cb31f1922

Thanks,
Ming
Paolo Bonzini April 18, 2019, 8:42 a.m. UTC | #3
On 18/04/19 04:19, Ming Lei wrote:
> Hi Paolo,
> 
> On Wed, Apr 17, 2019 at 01:52:07PM +0200, Paolo Bonzini wrote:
>> Because bio_kmalloc uses inline iovecs, the limit on the number of entries
>> is not BIO_MAX_PAGES but rather UIO_MAXIOV, which indeed is already checked
>> in bio_kmalloc.  This could cause SG_IO requests to be truncated and the HBA
>> to report a DMA overrun.
> 
> BIO_MAX_PAGES only limits a single bio's maximum vector count; if one
> bio can't hold the whole user-space request, a new bio is allocated and
> appended to the passthrough request, as long as the queue limits aren't
> reached.

Stupid question: where?  I don't see any place starting at
blk_rq_map_user_iov (and then __blk_rq_map_user_iov->bio_map_user_iov)
that would allocate a second bio.  The only bio_kmalloc in that path is
the one I'm patching.

> So my understanding is that an SG_IO request shouldn't be truncated
> because of BIO_MAX_PAGES. Could you explain in a bit more detail or
> provide a reproducer?

Unfortunately I don't have a reproducer. In practice, any userspace SG_IO
above 4 MB triggers it, but you need the right HBA to ensure that the
request reaches bio_kmalloc.

Paolo
Ming Lei April 18, 2019, 9:29 a.m. UTC | #4
On Thu, Apr 18, 2019 at 10:42:21AM +0200, Paolo Bonzini wrote:
> On 18/04/19 04:19, Ming Lei wrote:
> > Hi Paolo,
> > 
> > On Wed, Apr 17, 2019 at 01:52:07PM +0200, Paolo Bonzini wrote:
> >> Because bio_kmalloc uses inline iovecs, the limit on the number of entries
> >> is not BIO_MAX_PAGES but rather UIO_MAXIOV, which indeed is already checked
> >> in bio_kmalloc.  This could cause SG_IO requests to be truncated and the HBA
> >> to report a DMA overrun.
> > 
> > BIO_MAX_PAGES only limits a single bio's maximum vector count; if one
> > bio can't hold the whole user-space request, a new bio is allocated and
> > appended to the passthrough request, as long as the queue limits aren't
> > reached.
> 
> Stupid question: where?  I don't see any place starting at
> blk_rq_map_user_iov (and then __blk_rq_map_user_iov->bio_map_user_iov)
> that would allocate a second bio.  The only bio_kmalloc in that path is
> the one I'm patching.

Each bio is created inside __blk_rq_map_user_iov(), which is run inside
a loop, and each created bio is added to the request via
blk_rq_append_bio(); see the following code:

blk_rq_map_user_iov
	__blk_rq_map_user_iov
		blk_rq_append_bio

blk_rq_map_user_iov():
	...
        do {
                ret = __blk_rq_map_user_iov(rq, map_data, &i, gfp_mask, copy);
                if (ret)
                        goto unmap_rq;
                if (!bio)
                        bio = rq->bio;
        } while (iov_iter_count(&i));
	...

Thanks,
Ming
Paolo Bonzini April 18, 2019, 9:36 a.m. UTC | #5
On 18/04/19 11:29, Ming Lei wrote:
> On Thu, Apr 18, 2019 at 10:42:21AM +0200, Paolo Bonzini wrote:
>> On 18/04/19 04:19, Ming Lei wrote:
>>> Hi Paolo,
>>>
>>> On Wed, Apr 17, 2019 at 01:52:07PM +0200, Paolo Bonzini wrote:
>>>> Because bio_kmalloc uses inline iovecs, the limit on the number of entries
>>>> is not BIO_MAX_PAGES but rather UIO_MAXIOV, which indeed is already checked
>>>> in bio_kmalloc.  This could cause SG_IO requests to be truncated and the HBA
>>>> to report a DMA overrun.
>>>
>>> BIO_MAX_PAGES only limits a single bio's maximum vector count; if one
>>> bio can't hold the whole user-space request, a new bio is allocated and
>>> appended to the passthrough request, as long as the queue limits aren't
>>> reached.
>>
>> Stupid question: where?  I don't see any place starting at
>> blk_rq_map_user_iov (and then __blk_rq_map_user_iov->bio_map_user_iov)
>> that would allocate a second bio.  The only bio_kmalloc in that path is
>> the one I'm patching.
> 
> Each bio is created inside __blk_rq_map_user_iov(), which is run inside
> a loop, and each created bio is added to the request via
> blk_rq_append_bio(); see the following code:

Uff, I can't read apparently. :(  This is the commit that introduced it:

    commit 4d6af73d9e43f78651a43ee4c5ad221107ac8365
    Author: Christoph Hellwig <hch@lst.de>
    Date:   Wed Mar 2 18:07:14 2016 +0100

    block: support large requests in blk_rq_map_user_iov
    
    This patch adds support for larger requests in blk_rq_map_user_iov by
    allowing it to build multiple bios for a request.  This functionality
    used to exist for the non-vectored blk_rq_map_user in the past, and
    this patch reuses the existing functionality for it on the unmap side,
    which stuck around.  Thanks to the iov_iter API supporting multiple
    bios is fairly trivial, as we can just iterate the iov until we've
    consumed the whole iov_iter.
    
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reported-by: Jeff Lien <Jeff.Lien@hgst.com>
    Tested-by: Jeff Lien <Jeff.Lien@hgst.com>
    Reviewed-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>

Paolo
Ming Lei April 18, 2019, 9:43 a.m. UTC | #6
On Thu, Apr 18, 2019 at 11:36:03AM +0200, Paolo Bonzini wrote:
> On 18/04/19 11:29, Ming Lei wrote:
> > On Thu, Apr 18, 2019 at 10:42:21AM +0200, Paolo Bonzini wrote:
> >> On 18/04/19 04:19, Ming Lei wrote:
> >>> Hi Paolo,
> >>>
> >>> On Wed, Apr 17, 2019 at 01:52:07PM +0200, Paolo Bonzini wrote:
> >>>> Because bio_kmalloc uses inline iovecs, the limit on the number of entries
> >>>> is not BIO_MAX_PAGES but rather UIO_MAXIOV, which indeed is already checked
> >>>> in bio_kmalloc.  This could cause SG_IO requests to be truncated and the HBA
> >>>> to report a DMA overrun.
> >>>
> >>> BIO_MAX_PAGES only limits a single bio's maximum vector count; if one
> >>> bio can't hold the whole user-space request, a new bio is allocated and
> >>> appended to the passthrough request, as long as the queue limits aren't
> >>> reached.
> >>
> >> Stupid question: where?  I don't see any place starting at
> >> blk_rq_map_user_iov (and then __blk_rq_map_user_iov->bio_map_user_iov)
> >> that would allocate a second bio.  The only bio_kmalloc in that path is
> >> the one I'm patching.
> > 
> > Each bio is created inside __blk_rq_map_user_iov() which is run inside
> > a loop, and the created bio is added to request via blk_rq_append_bio(),
> > see the following code:
> 
> Uff, I can't read apparently. :(  This is the commit that introduced it:
> 
>     commit 4d6af73d9e43f78651a43ee4c5ad221107ac8365
>     Author: Christoph Hellwig <hch@lst.de>
>     Date:   Wed Mar 2 18:07:14 2016 +0100
> 
>     block: support large requests in blk_rq_map_user_iov

Exactly, the above commit started building multiple bios per request.

So I guess your issue was triggered on a kernel without that commit.

Thanks,
Ming

Patch

diff --git a/block/bio.c b/block/bio.c
index 4db1008309ed..0914ae4adae9 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1299,7 +1299,7 @@  struct bio *bio_map_user_iov(struct request_queue *q,
 	if (!iov_iter_count(iter))
 		return ERR_PTR(-EINVAL);
 
-	bio = bio_kmalloc(gfp_mask, iov_iter_npages(iter, BIO_MAX_PAGES));
+	bio = bio_kmalloc(gfp_mask, iov_iter_npages(iter, UIO_MAXIOV + 1));
 	if (!bio)
 		return ERR_PTR(-ENOMEM);