linux-block.vger.kernel.org archive mirror
From: Kanchan Joshi <joshi.k@samsung.com>
To: Christoph Hellwig <hch@lst.de>
Cc: axboe@kernel.dk, kbusch@kernel.org, asml.silence@gmail.com,
	io-uring@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-block@vger.kernel.org, gost.dev@samsung.com,
	Anuj Gupta <anuj20.g@samsung.com>
Subject: Re: [PATCH for-next v7 4/5] block: add helper to map bvec iterator for passthrough
Date: Sun, 25 Sep 2022 23:16:29 +0530	[thread overview]
Message-ID: <20220925174629.GB6320@test-zns> (raw)
In-Reply-To: <20220923184349.GA3394@test-zns>

On Sat, Sep 24, 2022 at 12:13:49AM +0530, Kanchan Joshi wrote:
>On Fri, Sep 23, 2022 at 05:29:41PM +0200, Christoph Hellwig wrote:
>>On Thu, Sep 22, 2022 at 08:53:31PM +0530, Kanchan Joshi wrote:
>>>>blk_rq_map_user_iov really should be able to detect that it is called
>>>>on a bvec iter and just do the right thing rather than needing different
>>>>helpers.
>>>
>>>I too explored that possibility, but found that it does not. It maps the
>>>user-pages into bio either directly or by doing that copy (in certain odd
>>>conditions) but does not know how to deal with existing bvec.
>>
>>What do you mean with existing bvec?  We allocate a brand new bio here
>>that we want to map the next chunk of the iov_iter to, and that
>>is exactly what blk_rq_map_user_iov does.  What blk_rq_map_user_iov
>>currently does not do is to implement this mapping efficiently
>>for ITER_BVEC iters
>
>It is clear that it was not written for ITER_BVEC iters.
>Otherwise that WARN_ON would not have hit.
>
>And efficiency is the concern, as we are moving to a more heavyweight
>helper that 'handles' weird conditions rather than just 'bails out'.
>These alignment checks end up adding a loop that traverses
>the entire ITER_BVEC.
>Also, blk_rq_map_user_iov uses bio_iter_advance, which also seems
>cycle-consuming given the code comment below in io_import_fixed():
>
>if (offset) {
>       /*
>        * Don't use iov_iter_advance() here, as it's really slow for
>        * using the latter parts of a big fixed buffer - it iterates
>        * over each segment manually. We can cheat a bit here, because
>        * we know that:
>
>So even if I could move the code inside blk_rq_map_user_iov, I would
>need to make sure that I skip iov_iter_advance.
>
>I still think it would be better to take this route only when there are
>other use cases/callers for this. And that is a future thing. For the
>current requirement, it seems better to prioritize efficiency.
>
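To make the cost argument above concrete, here is a minimal userspace sketch. It is not kernel code: `walk_seg`, `walk_pos`, `advance_walk` and `advance_direct` are hypothetical names, and the shortcut assumes a 4 KiB page and that every segment after the first covers exactly one page (the last one may be shorter). Under that assumption the target segment can be computed from the offset instead of being found by walking.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  ((size_t)1 << MODEL_PAGE_SHIFT)

/* Hypothetical stand-in for one segment; only its length matters here. */
struct walk_seg {
	size_t len;
};

struct walk_pos {
	size_t seg;	/* index of the segment the offset falls in */
	size_t off;	/* byte offset inside that segment */
	size_t steps;	/* loop iterations spent getting there */
};

/* Generic advance: step over the segments one at a time. */
static struct walk_pos advance_walk(const struct walk_seg *segs, size_t nr,
				    size_t offset)
{
	struct walk_pos p = { 0, 0, 0 };

	assert(nr > 0);
	while (p.seg < nr && offset >= segs[p.seg].len) {
		offset -= segs[p.seg].len;
		p.seg++;
		p.steps++;
	}
	p.off = offset;
	return p;
}

/*
 * Shortcut: skip the first segment, then derive the index arithmetically.
 * Only valid under the page-sized-segment assumption stated above.
 */
static struct walk_pos advance_direct(const struct walk_seg *segs, size_t nr,
				      size_t offset)
{
	struct walk_pos p = { 0, 0, 0 };

	assert(nr > 0);
	if (offset < segs[0].len) {
		p.off = offset;
		return p;
	}
	offset -= segs[0].len;
	p.seg = 1 + (offset >> MODEL_PAGE_SHIFT);
	p.off = offset & (MODEL_PAGE_SIZE - 1);
	return p;
}
```

Both functions land on the same segment and in-segment offset; the walking variant needs a number of iterations proportional to how far into the buffer the offset points, while the direct variant needs none.
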
>>, but that is something that could and should
>>be fixed.
>>
>>>And it really felt cleaner to me to write a new function rather than
>>>overloading blk_rq_map_user_iov with multiple if/else branches.
>>
>>No.  The whole point of the iov_iter is to support this "overload".
>
>Even if I try taking that route, the WARN_ON is a blocker that prevents
>me from putting this code inside blk_rq_map_user_iov.
>
>>>But iov_iter_gap_alignment does not work on bvec iters. Line #1274 below
>>
>>So we'll need to fix it.
>
>Do you see a good way to trigger this virt-alignment condition? I have
>not seen it hit (the SG gap checks) when running with fixedbufs.
>
>>>1264 unsigned long iov_iter_gap_alignment(const struct iov_iter *i)
>>>1265 {
>>>1266         unsigned long res = 0;
>>>1267         unsigned long v = 0;
>>>1268         size_t size = i->count;
>>>1269         unsigned k;
>>>1270
>>>1271         if (iter_is_ubuf(i))
>>>1272                 return 0;
>>>1273
>>>1274         if (WARN_ON(!iter_is_iovec(i)))
>>>1275                 return ~0U;
>>>
>>>Do you see a way to overcome this? Or maybe this can be revisited later,
>>>as we are not missing a lot?
>>
>>We just need to implement the equivalent functionality for bvecs.  It
>>isn't really hard, it just wasn't required so far.
>
>Can the virt-boundary alignment gap exist for an ITER_BVEC iter in the
>first place? Two reasons to ask this question:
>
>1. The commit description of this code (from Al Viro) says -
>
>"iov_iter_gap_alignment(): get rid of iterate_all_kinds()
>
>For one thing, it's only used for iovec (and makes sense only for
>those)."
>
>2. I have not hit it so far, as I mentioned above.
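
For reference, the gap check being discussed can be modelled in userspace roughly as follows. This is a sketch, not the kernel implementation: `gap_seg`, `model_gap_alignment` and `model_has_gap` are hypothetical names, and segments are described by a plain address and length. The idea is that every segment except the first must start on the boundary and every segment except the last must end on it, so the start of each later segment is OR-ed with the end of the previous one and the result is tested against the boundary mask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one segment of an iterator. */
struct gap_seg {
	uintptr_t base;	/* start address of the segment */
	size_t len;	/* length in bytes; empty segments are skipped */
};

/*
 * OR together "start of this segment" and "end of the previous segment"
 * for every pair of adjacent non-empty segments.
 */
static uintptr_t model_gap_alignment(const struct gap_seg *segs, size_t nr)
{
	uintptr_t res = 0;
	uintptr_t prev_end = 0;
	int have_prev = 0;
	size_t k;

	assert(segs != NULL || nr == 0);
	for (k = 0; k < nr; k++) {
		if (!segs[k].len)
			continue;
		if (have_prev)
			res |= segs[k].base | prev_end;
		prev_end = segs[k].base + segs[k].len;
		have_prev = 1;
	}
	return res;
}

/* Non-zero if some interior segment edge is not aligned to the boundary. */
static int model_has_gap(const struct gap_seg *segs, size_t nr,
			 uintptr_t virt_boundary_mask)
{
	return (model_gap_alignment(segs, nr) & virt_boundary_mask) != 0;
}
```

With a 4 KiB boundary (mask 0xfff), two page-aligned, page-sized segments report no gap, while a 512-byte first segment followed by another segment does, because the first one ends in the middle of a page. Whether a bvec built from registered fixed buffers can ever produce such a pattern is exactly the open question here.
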

We also have the condition below (from a patch by Linus) that restricts
blk_rq_map_user_iov to iovec iterators only:

commit a0ac402cfcdc904f9772e1762b3fda112dcc56a0
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Tue Dec 6 16:18:14 2016 -0800

    Don't feed anything but regular iovec's to blk_rq_map_user_iov

    In theory we could map other things, but there's a reason that function
    is called "user_iov".  Using anything else (like splice can do) just
    confuses it.

    Reported-and-tested-by: Johannes Thumshirn <jthumshirn@suse.de>
    Cc: Al Viro <viro@ZenIV.linux.org.uk>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

diff --git a/block/blk-map.c b/block/blk-map.c
index b8657fa8dc9a..27fd8d92892d 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -118,6 +118,9 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
        struct iov_iter i;
        int ret;

+       if (!iter_is_iovec(iter))
+               goto fail;
+
        if (map_data)
                copy = true;
        else if (iov_iter_alignment(iter) & align)
@@ -140,6 +143,7 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,

 unmap_rq:
        __blk_rq_unmap_user(bio);
+fail:
        rq->bio = NULL;
        return -EINVAL;
 }
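
The effect of that guard on a non-iovec iterator can be shown with a small userspace model. Everything here is hypothetical and only mirrors the shape of the check in the diff above: `model_iter`, `model_request` and `model_map_user_iov` are illustrative names, not kernel types or functions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical iterator flavours, loosely named after the kernel ones. */
enum model_iter_type {
	MODEL_ITER_IOVEC,
	MODEL_ITER_BVEC,
};

struct model_iter {
	enum model_iter_type type;
	size_t count;	/* bytes described by the iterator */
};

struct model_request {
	const struct model_iter *mapped;	/* NULL when nothing is mapped */
};

/*
 * Mirrors the guard: anything that is not a plain iovec iterator is
 * rejected up front, the request is left unmapped and -EINVAL is returned.
 */
static int model_map_user_iov(struct model_request *rq,
			      const struct model_iter *iter)
{
	assert(rq != NULL && iter != NULL);

	if (iter->type != MODEL_ITER_IOVEC) {
		rq->mapped = NULL;
		return -EINVAL;
	}
	rq->mapped = iter;
	return 0;
}
```

In this model a `MODEL_ITER_BVEC` iterator never reaches the mapping step, which is why reusing the existing helper for fixed buffers would mean relaxing this check first.
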

