linux-block.vger.kernel.org archive mirror
From: Jens Axboe <axboe@kernel.dk>
To: Ming Lei <ming.lei@redhat.com>,
	io-uring@vger.kernel.org, linux-block@vger.kernel.org
Cc: Miklos Szeredi <mszeredi@redhat.com>,
	ZiyangZhang <ZiyangZhang@linux.alibaba.com>,
	Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>,
	Bernd Schubert <bschubert@ddn.com>,
	Pavel Begunkov <asml.silence@gmail.com>
Subject: Re: [PATCH V3 00/16] io_uring/ublk: add IORING_OP_FUSED_CMD
Date: Sat, 18 Mar 2023 06:59:41 -0600
Message-ID: <3971d43f-601f-635f-5a30-df7e647f6659@kernel.dk>
In-Reply-To: <ZBQhSzIhvZL+83nM@ovpn-8-18.pek2.redhat.com>

On 3/17/23 2:14 AM, Ming Lei wrote:
> On Tue, Mar 14, 2023 at 08:57:11PM +0800, Ming Lei wrote:
>> Hello,
>>
>> Add IORING_OP_FUSED_CMD, a special URING_CMD that has to be submitted
>> as SQE128. The 1st SQE (master) is a 64-byte URING_CMD, and the 2nd
>> 64-byte SQE (slave) is another normal 64-byte OP. For any OP that needs
>> to support being a slave OP, io_issue_defs[op].fused_slave needs to be
>> set to 1, and its ->issue() can retrieve/import the buffer from the
>> master request's fused_cmd_kbuf. The slave OP is actually submitted from
>> the kernel. Part of this idea comes from Xiaoguang's ublk ebpf patchset,
>> but this patchset submits the slave OP just like a normal OP issued from
>> userspace; that is, SQE order is kept and batch handling is done too.
>>
>> Please see the detailed design in the commit log of the 2nd patch; one
>> big point is how to handle buffer ownership.
>>
>> This way, it is easy to support zero copy for ublk/fuse devices.
>>
>> Basically, userspace can specify any sub-buffer of the ublk block request
>> buffer from the fused command just by setting 'offset/len'
>> in the slave SQE for running the slave OP. This makes it flexible to
>> implement IO mappings: mirror, striped, ...
>>
>> The 3rd & 4th patches enable fused slave support for the following OPs:
>>
>> 	OP_READ/OP_WRITE
>> 	OP_SEND/OP_RECV/OP_SEND_ZC
>>
>> The other ublk patches clean up the ublk driver and implement the fused
>> command for supporting zero copy.
>>
>> The userspace code is here:
>>
>> https://github.com/ming1/ubdsrv/tree/fused-cmd-zc-v2
>>
>> All three ublk targets (loop, nbd and qcow2) support zero copy by passing:
>>
>> 	ublk add -t [loop|nbd|qcow2] -z .... 
>>
>> Basic fs mount, kernel building and builtin tests are done, and no
>> regression is observed with xfstests over ublk-loop with zero copy.
>>
>> Also add a liburing test case covering the fused command, based on the
>> miniublk of blktests:
>>
>> https://github.com/ming1/liburing/commits/fused_cmd_miniublk
>>
>> The performance improvement is obvious on memory-bandwidth-related
>> workloads, such as a 1~2X improvement on 64K/512K BS
>> IO tests on loop with a ramfs backing file.
>>
>> Any comments are welcome!
>>
>> V3:
>> 	- fix build warning reported by kernel test robot
>> 	- drop patch for checking fused flags on existing drivers with
>> 	  ->uring_command(), which isn't necessary, since we do not do that
>> 	  when adding new ioctl or uring command
>> 	- inline io_init_rq() for core code, so just export io_init_slave_req
>> 	- return result of failed slave request unconditionally since REQ_F_CQE_SKIP
>> 	  will be cleared
>> 	- pass xfstests over ublk-loop
> 
> Hello Jens and Guys,
> 
> I have been working on io_uring zero copy support for ublk/fuse for a while, and
> I would appreciate it if you could share any thoughts on this patchset or approach.

I'm a bit split on this one, as I really like (and want) the feature.
ublk has become popular pretty quickly, and it makes a LOT of sense to
support zero copy for it. At the same time, I'm not really a huge fan of
the fused commands... They seem too specialized to be useful for other
things, and it'd be a shame to do something like that only for it later
to be replaced by a generic solution. And then we're stuck with
supporting fused commands forever, not sure I like that prospect.

Both Pavel and Xiaoguang voiced similar concerns, and I think it may be
worth spending a bit more time on figuring out if splice can help us
here. David Howells currently has a lot going on in that area too.

So while I'd love to see this feature get queued up right now, I also
don't want to prematurely do so. Can we split out the fixes from this
series into a separate series that we can queue up now? That would also
help shrink the patchset, which is always a win for review.

-- 
Jens Axboe


Thread overview: 49+ messages
2023-03-14 12:57 [PATCH V3 00/16] io_uring/ublk: add IORING_OP_FUSED_CMD Ming Lei
2023-03-14 12:57 ` [PATCH V3 01/16] io_uring: increase io_kiocb->flags into 64bit Ming Lei
2023-03-14 12:57 ` [PATCH V3 02/16] io_uring: add IORING_OP_FUSED_CMD Ming Lei
2023-03-18 14:31   ` Jens Axboe
2023-03-18 15:24     ` Ming Lei
2023-03-18 16:00       ` Jens Axboe
2023-03-18 16:13       ` Ming Lei
2023-03-14 12:57 ` [PATCH V3 03/16] io_uring: support OP_READ/OP_WRITE for fused slave request Ming Lei
2023-03-14 12:57 ` [PATCH V3 04/16] io_uring: support OP_SEND_ZC/OP_RECV " Ming Lei
2023-03-14 12:57 ` [PATCH V3 05/16] block: ublk_drv: mark device as LIVE before adding disk Ming Lei
2023-03-14 12:57 ` [PATCH V3 06/16] block: ublk_drv: add common exit handling Ming Lei
2023-03-14 12:57 ` [PATCH V3 07/16] block: ublk_drv: don't consider flush request in map/unmap io Ming Lei
2023-03-14 12:57 ` [PATCH V3 08/16] block: ublk_drv: add two helpers to clean up map/unmap request Ming Lei
2023-03-14 12:57 ` [PATCH V3 09/16] block: ublk_drv: clean up several helpers Ming Lei
2023-03-14 12:57 ` [PATCH V3 10/16] block: ublk_drv: cleanup 'struct ublk_map_data' Ming Lei
2023-03-14 12:57 ` [PATCH V3 11/16] block: ublk_drv: cleanup ublk_copy_user_pages Ming Lei
2023-03-14 12:57 ` [PATCH V3 12/16] block: ublk_drv: grab request reference when the request is handled by userspace Ming Lei
2023-03-14 12:57 ` [PATCH V3 13/16] block: ublk_drv: support to copy any part of request pages Ming Lei
2023-03-14 12:57 ` [PATCH V3 14/16] block: ublk_drv: add read()/write() support for ublk char device Ming Lei
2023-03-14 12:57 ` [PATCH V3 15/16] block: ublk_drv: don't check buffer in case of zero copy Ming Lei
2023-03-14 12:57 ` [PATCH V3 16/16] block: ublk_drv: apply io_uring FUSED_CMD for supporting " Ming Lei
2023-03-16  3:13 ` [PATCH V3 00/16] io_uring/ublk: add IORING_OP_FUSED_CMD Xiaoguang Wang
2023-03-16  3:56   ` Ming Lei
2023-03-18 16:23   ` Pavel Begunkov
2023-03-18 16:39     ` Ming Lei
2023-03-21  9:17     ` Ziyang Zhang
2023-03-27 16:04       ` Pavel Begunkov
2023-03-28  1:01         ` Ming Lei
2023-03-28 11:01           ` Pavel Begunkov
2023-03-28  0:53       ` Ming Lei
2023-03-29  6:57         ` Ziyang Zhang
2023-03-29  8:52           ` Ming Lei
2023-03-25 14:15     ` Ming Lei
2023-03-17  8:14 ` Ming Lei
2023-03-18 12:59   ` Jens Axboe [this message]
2023-03-18 13:35     ` Ming Lei
2023-03-18 14:36       ` Jens Axboe
2023-03-18 15:06         ` Ming Lei
2023-03-18 16:51       ` Pavel Begunkov
2023-03-18 23:42         ` Ming Lei
2023-03-19  0:17           ` Ming Lei
2023-03-28 10:55           ` Pavel Begunkov
2023-03-28 13:01             ` Ming Lei
2023-03-29  6:59               ` Ziyang Zhang
2023-03-29 10:43               ` Pavel Begunkov
2023-03-29 11:55                 ` Ming Lei
2023-03-18 16:09 ` Jens Axboe
2023-03-18 17:01   ` Ming Lei
2023-03-21 15:56 ` Ming Lei
