From: Ming Lei <ming.lei@redhat.com>
To: Liu Xiaodong <xiaodong.liu@intel.com>
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	Harris James R <james.r.harris@intel.com>,
	io-uring@vger.kernel.org,
	Gabriel Krisman Bertazi <krisman@collabora.com>,
	ZiyangZhang <ZiyangZhang@linux.alibaba.com>,
	Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Jens Axboe <axboe@kernel.dk>,
	ming.lei@redhat.com
Subject: Re: [PATCH V2 0/1] ubd: add io_uring based userspace block driver
Date: Wed, 18 May 2022 21:18:45 +0800
Message-ID: <YoTyNVccpIYDpx9q@T590>
In-Reply-To: <20220518063808.GA168577@storage2.sh.intel.com>

Hello Liu,

On Wed, May 18, 2022 at 02:38:08AM -0400, Liu Xiaodong wrote:
> On Tue, May 17, 2022 at 01:53:57PM +0800, Ming Lei wrote:
> > Hello Guys,
> > 
> > ubd driver is one kernel driver for implementing generic userspace block
> > device/driver, which delivers io request from ubd block device(/dev/ubdbN) into
> > ubd server[1] which is the userspace part of ubd for communicating
> > with ubd driver and handling specific io logic by its target module.
> > 
> > Another thing ubd driver handles is to copy data between user space buffer
> > and request/bio's pages, or take zero copy if mm is ready for support it in
> > future. ubd driver doesn't handle any IO logic of the specific driver, so
> > it is small/simple, and all io logics are done by the target code in ubdserver.
> > 
> > The above two are main jobs done by ubd driver.
> 
> Hi, Lei
> 
> Your UBD implementation looks great. Its io_uring based design is interesting
> and brilliant.
> Towards the purpose of userspace block device, last year,
> VDUSE initialized by Yongji is going to do a similar work. But VDUSE is under
> vdpa. VDUSE will present a virtio-blk device to other userspace process
> like containers, while serving virtio-blk req also by an userspace target.
> https://lists.linuxfoundation.org/pipermail/iommu/2021-June/056956.html 
> 
> I've been working and thinking on serving RUNC container by SPDK efficiently.
> But this work requires a new proper userspace block device module in kernel.
> The highlevel design idea for userspace block device implementations
> should be that: Using ring for IO request, so client and target can exchange
> req/resp quickly in batch; Map bounce buffer between kernel and userspace
> target, so another extra IO data copy like NBD can be avoid. (Oh, yes, SPDK
> needs this kernel module has some more minor functions)
> 
> UBD and VDUSE are both implemented in this way, while of course each of
> them has specific features and advantages.
> 
> Not like UBD which is straightforward and starts from scratch, VDUSE is
> embedded in virtio framework. So its implementation is more complicated, but
> all virtio frontend utilities can be leveraged.
> When considering security/permission issues, feels UBD would be easier to
> solve them.

Stefan Hajnoczi and I are discussing related security/permission
issues. Can you share more details about your case?

> 
> So my questions are:
> 1. what do you think about the purpose overlap between UBD and VDUSE?

Sorry, I am not familiar with VDUSE. The motivation of ubd is just to make a
high-performance generic userspace block driver. The ubd driver (kernel part)
is only responsible for communication and for copying data between userspace
buffers and kernel io request pages, while the ubdsrv (userspace) target
handles the io logic.

> 2. Could UBD be implemented with SPDK friendly functionalities? (mainly about
> io data mapping, since HW devices in SPDK need to access the mapped data
> buffer. Then, in function ubdsrv.c/ubdsrv_init_io_bufs(),
> "addr = mmap(,,,,dev->cdev_fd,)",

No, that code is actually for supporting zero copy.

But each request's buffer is allocated by ubdsrv and is definitely available
to any target. Please see loop_handle_io_async(), which handles IO from
/dev/ubdbN, for how to use the buffer. For READ, the target code needs to
implement the READ logic and fill data into the buffer, then the buffer will
be copied to the kernel io request pages; for WRITE, the buffer has already
been filled with data from the kernel io request, and the target code uses it
to handle the WRITE.
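
To make the buffer direction concrete, here is a minimal sketch of what a
target handler could look like. It is not the ubdsrv code: every name below
(sketch_io, sketch_target, sketch_handle_io) is hypothetical, and a plain
in-memory array stands in for a real backend, so treat it only as an
illustration of who fills the buffer in each direction.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout: this is NOT the ubdsrv API. */
enum sketch_op { SKETCH_OP_READ, SKETCH_OP_WRITE };

struct sketch_io {
    enum sketch_op op;  /* direction of the request */
    uint64_t offset;    /* byte offset into the backing store */
    size_t len;         /* number of bytes to transfer */
    void *buf;          /* per-request buffer allocated by the server */
};

/* Toy in-memory backing store standing in for a real target (assumption). */
struct sketch_target {
    unsigned char *store;
    size_t size;
};

/* Returns the number of bytes handled, or -1 if the request is invalid. */
static long sketch_handle_io(struct sketch_target *t, const struct sketch_io *io)
{
    /* Reject requests that fall outside the backing store. */
    if (io->offset > t->size || io->len > t->size - io->offset)
        return -1;

    switch (io->op) {
    case SKETCH_OP_READ:
        /* READ: the target fills the buffer; the driver would copy it
         * to the kernel io request pages afterwards. */
        memcpy(io->buf, t->store + io->offset, io->len);
        break;
    case SKETCH_OP_WRITE:
        /* WRITE: the buffer already holds the data from the kernel io
         * request; the target only consumes it. */
        memcpy(t->store + io->offset, io->buf, io->len);
        break;
    default:
        return -1;
    }
    return (long)io->len;
}
```

A real target would replace the memcpy() calls with its own IO path and
complete the request asynchronously.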

> SPDK needs to know the PA of "addr".

What is PA, and why is it needed?

Userspace can only see the virtual memory (VM) address of each buffer.

> Also memory pointed by "addr" should be pinned all the time.)

The current implementation only pins pages while copying data between
userspace buffers and kernel io request pages. But I plan to support
three pin behaviors:

- never (current behavior, pages are pinned only while copying)
- lazy (pages stay pinned until the request has been idle for long enough)
- always (all pages in the userspace VM are pinned during the device lifetime)
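
As a rough sketch of how the three policies might be told apart in code:
the enum, the function name and the idle threshold parameter below are all
hypothetical, since none of this is implemented yet, so this only restates
the list as a decision function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical policy names mirroring the list: never, lazy, always. */
enum sketch_pin_policy {
    SKETCH_PIN_NEVER,
    SKETCH_PIN_LAZY,
    SKETCH_PIN_ALWAYS,
};

/*
 * Decide whether a request's buffer pages should stay pinned once a copy
 * has finished. idle_ns is how long the request has been idle, and
 * idle_limit_ns is an assumed tunable for the lazy policy.
 */
static bool sketch_keep_pinned(enum sketch_pin_policy policy,
                               uint64_t idle_ns, uint64_t idle_limit_ns)
{
    switch (policy) {
    case SKETCH_PIN_NEVER:
        return false;                    /* pinned only during the copy */
    case SKETCH_PIN_LAZY:
        return idle_ns < idle_limit_ns;  /* unpin after enough idle time */
    case SKETCH_PIN_ALWAYS:
        return true;                     /* pinned for the device lifetime */
    }
    return false;
}
```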


Thanks, 
Ming


