All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kanchan Joshi <joshi.k@samsung.com>
To: Christoph Hellwig <hch@lst.de>
Cc: axboe@kernel.dk, kbusch@kernel.org, asml.silence@gmail.com,
	io-uring@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-block@vger.kernel.org, gost.dev@samsung.com,
	Anuj Gupta <anuj20.g@samsung.com>
Subject: Re: [PATCH for-next v7 4/5] block: add helper to map bvec iterator for passthrough
Date: Sun, 25 Sep 2022 23:16:29 +0530	[thread overview]
Message-ID: <20220925174629.GB6320@test-zns> (raw)
In-Reply-To: <20220923184349.GA3394@test-zns>

[-- Attachment #1: Type: text/plain, Size: 4863 bytes --]

On Sat, Sep 24, 2022 at 12:13:49AM +0530, Kanchan Joshi wrote:
>On Fri, Sep 23, 2022 at 05:29:41PM +0200, Christoph Hellwig wrote:
>>On Thu, Sep 22, 2022 at 08:53:31PM +0530, Kanchan Joshi wrote:
>>>>blk_rq_map_user_iov really should be able to detect that it is called
>>>>on a bvec iter and just do the right thing rather than needing different
>>>>helpers.
>>>
>>>I too explored that possibility, but found that it does not. It maps the
>>>user-pages into bio either directly or by doing that copy (in certain odd
>>>conditions) but does not know how to deal with existing bvec.
>>
>>What do you mean with existing bvec?  We allocate a brand new bio here
>>that we want to map the next chunk of the iov_iter to, and that
>>is exactly what blk_rq_map_user_iov does.  What blk_rq_map_user_iov
>>currently does not do is to implement this mapping efficiently
>>for ITER_BVEC iters
>
>It is clear that it was not written for ITER_BVEC iters.
>Otherwise that WARN_ON would not have hit.
>
>And efficency is the concern as we are moving to more heavyweight
>helper that 'handles' weird conditions rather than just 'bails out'.
>These alignment checks end up adding a loop that traverses
>the entire ITER_BVEC.
>Also blk_rq_map_user_iov uses bio_iter_advance which also seems
>cycle-consuming given below code-comment in io_import_fixed():
>
>if (offset) {
>       /*
>        * Don't use iov_iter_advance() here, as it's really slow for
>        * using the latter parts of a big fixed buffer - it iterates
>        * over each segment manually. We can cheat a bit here, because
>        * we know that:
>
>So if at all I could move the code inside blk_rq_map_user_iov, I will
>need to see that I skip doing iov_iter_advance.
>
>I still think it would be better to take this route only when there are
>other usecases/callers of this. And that is a future thing. For the current
>requirement, it seems better to prioritze efficency.
>
>>, but that is something that could and should
>>be fixed.
>>
>>>And it really felt cleaner to me write a new function rather than
>>>overloading the blk_rq_map_user_iov with multiple if/else canals.
>>
>>No.  The whole point of the iov_iter is to support this "overload".
>
>Even if I try taking that route, WARN_ON is a blocker that  prevents 
>me to put this code inside blk_rq_map_user_iov.
>
>>>But iov_iter_gap_alignment does not work on bvec iters. Line #1274 below
>>
>>So we'll need to fix it.
>
>Do you see good way to trigger this virt-alignment condition? I have
>not seen this hitting (the SG gap checks) when running with fixebufs.
>
>>>1264 unsigned long iov_iter_gap_alignment(const struct iov_iter *i)
>>>1265 {
>>>1266         unsigned long res = 0;
>>>1267         unsigned long v = 0;
>>>1268         size_t size = i->count;
>>>1269         unsigned k;
>>>1270
>>>1271         if (iter_is_ubuf(i))
>>>1272                 return 0;
>>>1273
>>>1274         if (WARN_ON(!iter_is_iovec(i)))
>>>1275                 return ~0U;
>>>
>>>Do you see a way to overcome this. Or maybe this can be revisted as we
>>>are not missing a lot?
>>
>>We just need to implement the equivalent functionality for bvecs.  It
>>isn't really hard, it just wasn't required so far.
>
>Can the virt-boundary alignment gap exist for ITER_BVEC iter in first
>place? Two reasons to ask this question:
>
>1. Commit description of this code (from Al viro) says -
>
>"iov_iter_gap_alignment(): get rid of iterate_all_kinds()
>
>For one thing, it's only used for iovec (and makes sense only for
>those)."
>
>2. I did not hit it so far as I mentioned above.

And we also have below condition (patch of Linus) that restricts
blk_rq_map_user_iov to only iovec iterator

commit a0ac402cfcdc904f9772e1762b3fda112dcc56a0
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Tue Dec 6 16:18:14 2016 -0800

    Don't feed anything but regular iovec's to blk_rq_map_user_iov

    In theory we could map other things, but there's a reason that function
    is called "user_iov".  Using anything else (like splice can do) just
    confuses it.

    Reported-and-tested-by: Johannes Thumshirn <jthumshirn@suse.de>
    Cc: Al Viro <viro@ZenIV.linux.org.uk>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

diff --git a/block/blk-map.c b/block/blk-map.c
index b8657fa8dc9a..27fd8d92892d 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -118,6 +118,9 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
        struct iov_iter i;
        int ret;

+       if (!iter_is_iovec(iter))
+               goto fail;
+
        if (map_data)
                copy = true;
        else if (iov_iter_alignment(iter) & align)
@@ -140,6 +143,7 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,

 unmap_rq:
        __blk_rq_unmap_user(bio);
+fail:
        rq->bio = NULL;
        return -EINVAL;
 }


[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



  reply	other threads:[~2022-09-25 17:56 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20220909103131epcas5p23d146916eccedf30d498e0ea23e54052@epcas5p2.samsung.com>
2022-09-09 10:21 ` [PATCH for-next v7 0/5] fixed-buffer for uring-cmd/passthru Kanchan Joshi
     [not found]   ` <CGME20220909103136epcas5p38ea3a933e90d9f9d7451848dc3a60829@epcas5p3.samsung.com>
2022-09-09 10:21     ` [PATCH for-next v7 1/5] io_uring: add io_uring_cmd_import_fixed Kanchan Joshi
     [not found]   ` <CGME20220909103140epcas5p36689726422eb68e6fdc1d39019a4a8ba@epcas5p3.samsung.com>
2022-09-09 10:21     ` [PATCH for-next v7 2/5] io_uring: introduce fixed buffer support for io_uring_cmd Kanchan Joshi
     [not found]   ` <CGME20220909103143epcas5p2eda60190cd23b79fb8f48596af3e1524@epcas5p2.samsung.com>
2022-09-09 10:21     ` [PATCH for-next v7 3/5] nvme: refactor nvme_alloc_user_request Kanchan Joshi
2022-09-20 12:02       ` Christoph Hellwig
2022-09-22 15:46         ` Kanchan Joshi
2022-09-23  9:25         ` Kanchan Joshi
     [not found]   ` <CGME20220909103147epcas5p2a83ec151333bcb1d2abb8c7536789bfd@epcas5p2.samsung.com>
2022-09-09 10:21     ` [PATCH for-next v7 4/5] block: add helper to map bvec iterator for passthrough Kanchan Joshi
2022-09-20 12:08       ` Christoph Hellwig
2022-09-22 15:23         ` Kanchan Joshi
2022-09-23 15:29           ` Christoph Hellwig
2022-09-23 18:43             ` Kanchan Joshi
2022-09-25 17:46               ` Kanchan Joshi [this message]
2022-09-26 14:50               ` Christoph Hellwig
2022-09-27 16:47                 ` Kanchan Joshi
     [not found]   ` <CGME20220909103151epcas5p1e25127c3053ba21e8f8418a701878973@epcas5p1.samsung.com>
2022-09-09 10:21     ` [PATCH for-next v7 5/5] nvme: wire up fixed buffer support for nvme passthrough Kanchan Joshi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220925174629.GB6320@test-zns \
    --to=joshi.k@samsung.com \
    --cc=anuj20.g@samsung.com \
    --cc=asml.silence@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=gost.dev@samsung.com \
    --cc=hch@lst.de \
    --cc=io-uring@vger.kernel.org \
    --cc=kbusch@kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.