All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kanchan Joshi <joshi.k@samsung.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: Sagi Grimberg <sagi@grimberg.me>,
	hch@lst.de, kbusch@kernel.org, axboe@kernel.dk,
	io-uring@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-block@vger.kernel.org, asml.silence@gmail.com,
	joshiiitr@gmail.com, anuj20.g@samsung.com, gost.dev@samsung.com
Subject: Re: [PATCH for-next 4/4] nvme-multipath: add multipathing for uring-passthrough commands
Date: Fri, 15 Jul 2022 04:35:23 +0530	[thread overview]
Message-ID: <20220714230523.GA14373@test-zns> (raw)
In-Reply-To: <YtAy2PUDoWUUE9Bl@T590>

[-- Attachment #1: Type: text/plain, Size: 4231 bytes --]

On Thu, Jul 14, 2022 at 11:14:32PM +0800, Ming Lei wrote:
>On Wed, Jul 13, 2022 at 11:07:57AM +0530, Kanchan Joshi wrote:
>> > > > > > The way I would do this that in nvme_ioucmd_failover_req (or in the
>> > > > > > retry driven from command retriable failure) I would do the above,
>> > > > > > requeue it and kick the requeue work, to go over the requeue_list and
>> > > > > > just execute them again. Not sure why you even need an explicit retry
>> > > > > > code.
>> > > > > During retry we need passthrough command. But passthrough command is not
>> > > > > stable (i.e. valid only during first submission). We can make it stable
>> > > > > either by:
>> > > > > (a) allocating in nvme (b) return -EAGAIN to io_uring, and
>> > > > > it will do allocate + deferral
>> > > > > Both add a cost. And since any command can potentially fail, that
>> > > > > means taking that cost for every IO that we issue on mpath node. Even if
>> > > > > no failure (initial or subsquent after IO) occcured.
>> > > >
>> > > > As mentioned, I think that if a driver consumes a command as queued,
>> > > > it needs a stable copy for a later reformation of the request for
>> > > > failover purposes.
>> > >
>> > > So what do you propose to make that stable?
>> > > As I mentioned earlier, stable copy requires allocating/copying in fast
>> > > path. And for a condition (failover) that may not even occur.
>> > > I really think currrent solution is much better as it does not try to make
>> > > it stable. Rather it assembles pieces of passthrough command if retry
>> > > (which is rare) happens.
>> >
>> > Well, I can understand that io_uring_cmd is space constrained, otherwise
>> > we wouldn't be having this discussion.
>>
>> Indeed. If we had space for keeping passthrough command stable for
>> retry, that would really have simplified the plumbing. Retry logic would
>> be same as first submission.
>>
>> > However io_kiocb is less
>> > constrained, and could be used as a context to hold such a space.
>> >
>> > Even if it is undesired to have io_kiocb be passed to uring_cmd(), it
>> > can still hold a driver specific space paired with a helper to obtain it
>> > (i.e. something like io_uring_cmd_to_driver_ctx(ioucmd) ). Then if the
>> > space is pre-allocated it is only a small memory copy for a stable copy
>> > that would allow a saner failover design.
>>
>> I am thinking along the same lines, but it's not about few bytes of
>> space rather we need 80 (72 to be precise). Will think more, but
>> these 72 bytes really stand tall in front of my optimism.
>>
>> Do you see anything is possible in nvme-side?
>> Now also passthrough command (although in a modified form) gets copied
>> into this preallocated space i.e. nvme_req(req)->cmd. This part -
>
>I understand it can't be allocated in nvme request which is freed
>during retry,

Why not. Yes it gets freed, but we have control over when it gets freed
and we can do if anything needs to be done before freeing it. Please see
below as well.

>and looks the extra space has to be bound with
>io_uring_cmd.

if extra space is bound with io_uring_cmd, it helps to reduce the code
(and just that. I don't see that efficiency will improve - rather it
will be tad bit less because of one more 72 byte copy opeation in fast-path).

Alternate is to use this space that is bound with request i.e.
nvme_req(req)->cmd. This is also preallocated and passthru-cmd
already gets copied. But it is ~80% of the original command. 
Rest 20% includes few fields (data/meta buffer addres, rspective len)
which are not maintained (as bio/request can give that). 
During retry, we take out stuff from nvme_req(req)->cmd and then free
req. Please see nvme_uring_cmd_io_retry in the patch. And here is the
fragement for quick glance -

+       memcpy(&c, nvme_req(oreq)->cmd, sizeof(struct nvme_command));
+       d.metadata = (__u64)pdu->meta_buffer;
+       d.metadata_len = pdu->meta_len;
+       d.timeout_ms = oreq->timeout;
+       d.addr = (__u64)ioucmd->cmd;
+       if (obio) {
+               d.data_len = obio->bi_iter.bi_size;
+               blk_rq_unmap_user(obio);
+       } else {
+               d.data_len = 0;
+       }
+       blk_mq_free_request(oreq);

Do you see chinks in above?

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



  reply	other threads:[~2022-07-14 23:11 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20220711110753epcas5p4169b9e288d15ca35740dbb66a6f6983a@epcas5p4.samsung.com>
2022-07-11 11:01 ` [PATCH for-next 0/4] nvme-multipathing for uring-passthrough Kanchan Joshi
     [not found]   ` <CGME20220711110800epcas5p3d338dd486fd778c5ba5bfe93a91ec8bd@epcas5p3.samsung.com>
2022-07-11 11:01     ` [PATCH for-next 1/4] io_uring, nvme: rename a function Kanchan Joshi
2022-07-14 13:55       ` Ming Lei
     [not found]   ` <CGME20220711110812epcas5p33aa90b23aa62fb11722aa8195754becf@epcas5p3.samsung.com>
2022-07-11 11:01     ` [PATCH for-next 2/4] nvme: compact nvme_uring_cmd_pdu struct Kanchan Joshi
2022-07-12  6:32       ` Christoph Hellwig
     [not found]   ` <CGME20220711110824epcas5p22c8e945cb8c3c3ac46c8c2b5ab55db9b@epcas5p2.samsung.com>
2022-07-11 11:01     ` [PATCH for-next 3/4] io_uring: grow a field in struct io_uring_cmd Kanchan Joshi
2022-07-11 17:00       ` Sagi Grimberg
2022-07-11 17:19         ` Jens Axboe
2022-07-11 17:18       ` Jens Axboe
2022-07-11 17:55         ` Sagi Grimberg
2022-07-11 18:22           ` Sagi Grimberg
2022-07-11 18:24             ` Jens Axboe
2022-07-11 18:58               ` Sagi Grimberg
2022-07-12 11:40               ` Kanchan Joshi
2022-07-14  3:40             ` Ming Lei
2022-07-14  8:19               ` Kanchan Joshi
2022-07-14 15:30                 ` Daniel Wagner
2022-07-15 11:07                   ` Kanchan Joshi
2022-07-18  9:03                     ` Daniel Wagner
     [not found]   ` <CGME20220711110827epcas5p3fd81f142f55ca3048abc38a9ef0d0089@epcas5p3.samsung.com>
2022-07-11 11:01     ` [PATCH for-next 4/4] nvme-multipath: add multipathing for uring-passthrough commands Kanchan Joshi
2022-07-11 13:51       ` Sagi Grimberg
2022-07-11 15:12         ` Stefan Metzmacher
2022-07-11 16:58           ` Sagi Grimberg
2022-07-11 18:54           ` Kanchan Joshi
2022-07-11 18:37         ` Kanchan Joshi
2022-07-11 19:56           ` Sagi Grimberg
2022-07-12  4:23             ` Kanchan Joshi
2022-07-12 21:26               ` Sagi Grimberg
2022-07-13  5:37                 ` Kanchan Joshi
2022-07-13  9:03                   ` Sagi Grimberg
2022-07-13 11:28                     ` Kanchan Joshi
2022-07-13 12:17                       ` Sagi Grimberg
2022-07-14 15:14                   ` Ming Lei
2022-07-14 23:05                     ` Kanchan Joshi [this message]
2022-07-15  1:35                       ` Ming Lei
2022-07-15  1:46                         ` Ming Lei
2022-07-15  4:24                           ` Kanchan Joshi
2022-07-12  6:52       ` Christoph Hellwig
2022-07-12 11:33         ` Kanchan Joshi
2022-07-12 20:13         ` Sagi Grimberg
2022-07-13  5:36           ` Christoph Hellwig
2022-07-13  8:04             ` Sagi Grimberg
2022-07-13 10:12               ` Christoph Hellwig
2022-07-13 11:00                 ` Sagi Grimberg
2022-07-13 11:28                   ` Christoph Hellwig
2022-07-13 12:16                     ` Sagi Grimberg
2022-07-13 11:49                   ` Hannes Reinecke
2022-07-13 12:43                     ` Sagi Grimberg
2022-07-13 13:30                       ` Hannes Reinecke
2022-07-13 13:41                         ` Sagi Grimberg
2022-07-13 14:07                           ` Hannes Reinecke
2022-07-13 15:59                             ` Sagi Grimberg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220714230523.GA14373@test-zns \
    --to=joshi.k@samsung.com \
    --cc=anuj20.g@samsung.com \
    --cc=asml.silence@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=gost.dev@samsung.com \
    --cc=hch@lst.de \
    --cc=io-uring@vger.kernel.org \
    --cc=joshiiitr@gmail.com \
    --cc=kbusch@kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=ming.lei@redhat.com \
    --cc=sagi@grimberg.me \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.