From: Maxim Levitsky <mlevitsk@redhat.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Fam Zheng <fam@euphon.net>, Keith Busch <keith.busch@intel.com>,
	Sagi Grimberg <sagi@grimberg.me>,
	kvm@vger.kernel.org, Wolfram Sang <wsa@the-dreams.de>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Liang Cunming <cunming.liang@intel.com>,
	Nicolas Ferre <nicolas.ferre@microchip.com>,
	linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	"David S . Miller" <davem@davemloft.net>,
	Jens Axboe <axboe@fb.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Kirti Wankhede <kwankhede@nvidia.com>,
	Mauro Carvalho Chehab <mchehab+samsung@kernel.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Liu Changpeng <changpeng.liu@intel.com>,
	"Paul E . McKenney" <paulmck@linux.ibm.com>,
	Amnon Ilan <ailan@redhat.com>, John Ferlan <jferlan@redhat.com>
Subject: Re: [PATCH v2 00/10] RFC: NVME MDEV
Date: Mon, 06 May 2019 12:04:06 +0300
Message-ID: <e8f6981863bdbba89adcba1c430083e68546ac1a.camel@redhat.com>
In-Reply-To: <20190503121838.GA21041@lst.de>

On Fri, 2019-05-03 at 14:18 +0200, Christoph Hellwig wrote:
> I simply don't get the point of this series.
> 
> MDEV is an interface for exposing parts of a device to a userspace
> program / VM.  But that this series appears to do is to expose a
> purely software defined nvme controller to userspace.  Which in
> principle is a good idea, but we have a much better framework for that,
> which is called vhost.

Let me explain the reasons for choosing the IO interfaces as I did:

1. Frontend interface (the interface that faces the guest/userspace/etc):

VFIO/mdev is just a way to expose a (partially) software-defined PCIe device to
a guest.

Vhost, on the other hand, is an interface that is hardcoded and optimized for
virtio. It could be extended to be PCI generic, but why do so when we already
have VFIO?

So the biggest advantage of using VFIO _currently_ is that I don't add any new
API/ABI to the kernel, and userspace (qemu) doesn't need to learn to use a new
API either.

It is also worth noting that VFIO supports nesting out of the box, so I don't
need to worry about it (vhost has to deal with that at the protocol level using
its IOTLB facility).

On top of that, newer hardware is expected to support PASID-based device
subdivision, which will allow us to _directly_ pass through the submission
queues of the device and will _force_ us to use the NVME protocol for the
frontend.

2. Backend interface (the connection to the real nvme device):

Currently the backend interface _doesn't have_ to allocate a dedicated queue and
bypass the block layer. It can use the block layer's submit_bio/blk_poll, as I
demonstrate in the last patch in the series. That path is 2x slower, though.

However, as with (1), once the driver supports devices with hardware-based
passthrough, it will have to dedicate a bunch of queues to the guest, configure
them with the appropriate PASID, and then let the guest use these queues
directly.


Best regards,
	Maxim Levitsky

