linux-kernel.vger.kernel.org archive mirror
From: Tiwei Bie <tiwei.bie@intel.com>
To: Jason Wang <jasowang@redhat.com>
Cc: mst@redhat.com, alex.williamson@redhat.com,
	maxime.coquelin@redhat.com, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
	netdev@vger.kernel.org, dan.daly@intel.com,
	cunming.liang@intel.com, zhihong.wang@intel.com,
	lingshan.zhu@intel.com
Subject: Re: [PATCH v2] vhost: introduce mdev based hardware backend
Date: Thu, 24 Oct 2019 17:18:39 +0800
Message-ID: <20191024091839.GA17463@___>
In-Reply-To: <d4cc4f4e-2635-4041-2f68-cd043a97f25a@redhat.com>

On Thu, Oct 24, 2019 at 04:32:42PM +0800, Jason Wang wrote:
> On 2019/10/24 4:03 PM, Jason Wang wrote:
> > On 2019/10/24 12:21 PM, Tiwei Bie wrote:
> > > On Wed, Oct 23, 2019 at 06:29:21PM +0800, Jason Wang wrote:
> > > > On 2019/10/23 6:11 PM, Tiwei Bie wrote:
> > > > > On Wed, Oct 23, 2019 at 03:25:00PM +0800, Jason Wang wrote:
> > > > > > On 2019/10/23 3:07 PM, Tiwei Bie wrote:
> > > > > > > On Wed, Oct 23, 2019 at 01:46:23PM +0800, Jason Wang wrote:
> > > > > > > > On 2019/10/23 11:02 AM, Tiwei Bie wrote:
> > > > > > > > > On Tue, Oct 22, 2019 at 09:30:16PM +0800, Jason Wang wrote:
> > > > > > > > > > On 2019/10/22 5:52 PM, Tiwei Bie wrote:
> > > > > > > > > > > This patch introduces a mdev based hardware vhost backend.
> > > > > > > > > > > This backend is built on top of the same abstraction used
> > > > > > > > > > > in virtio-mdev and provides a generic vhost interface for
> > > > > > > > > > > userspace to accelerate the virtio devices in guest.
> > > > > > > > > > > 
> > > > > > > > > > > This backend is implemented as a mdev device driver on top
> > > > > > > > > > > of the same mdev device ops used in virtio-mdev but using
> > > > > > > > > > > a different mdev class id, and it will register the device
> > > > > > > > > > > as a VFIO device for userspace to use. Userspace can setup
> > > > > > > > > > > the IOMMU with the existing VFIO container/group APIs and
> > > > > > > > > > > then get the device fd with the device name. After getting
> > > > > > > > > > > the device fd of this device, userspace can use vhost ioctls
> > > > > > > > > > > to setup the backend.
> > > > > > > > > > > 
> > > > > > > > > > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > > > > > > > > > ---
> > > > > > > > > > > This patch depends on below series:
> > > > > > > > > > > https://lkml.org/lkml/2019/10/17/286
> > > > > > > > > > > 
> > > > > > > > > > > v1 -> v2:
> > > > > > > > > > > - Replace _SET_STATE with _SET_STATUS (MST);
> > > > > > > > > > > - Check status bits at each step (MST);
> > > > > > > > > > > - Report the max ring size and max number of queues (MST);
> > > > > > > > > > > - Add missing MODULE_DEVICE_TABLE (Jason);
> > > > > > > > > > > - Only support the network backend w/o multiqueue for now;
> > > > > > > > > > > Any idea on how to extend it to support devices other
> > > > > > > > > > > than net? I think we want a generic API, or an API that
> > > > > > > > > > > could be made generic in the future.
> > > > > > > > > > > 
> > > > > > > > > > > Do we want e.g. a generic vhost mdev for all kinds of
> > > > > > > > > > > devices, or to introduce e.g. vhost-net-mdev and
> > > > > > > > > > > vhost-scsi-mdev?
> > > > > > > > > One possible way is to do what vhost-user does. I.e. Apart from
> > > > > > > > > the generic ring, features, ... related ioctls, we also introduce
> > > > > > > > > device specific ioctls when we need them. As vhost-mdev just needs
> > > > > > > > > to forward configs between parent and userspace and even won't
> > > > > > > > > cache any info when possible,
> > > > > > > > > So it looks to me like this is only possible if we expose
> > > > > > > > > e.g. set_config and get_config to userspace.
> > > > > > > > The set_config and get_config interface doesn't cover all of
> > > > > > > > the device specific settings. We also have ctrlq in virtio-net.
> > > > > > > Yes, but it could be processed by the existing API, couldn't
> > > > > > > it? Just set the ctrl vq address and let the parent deal with
> > > > > > > that.
> > > > > I mean how to expose ctrlq related settings to userspace?
> > > > 
> > > > I think it works like:
> > > > 
> > > > 1) userspace finds that ctrl_vq is supported
> > > > 
> > > > 2) then it can allocate memory for ctrl vq and set its address through
> > > > vhost-mdev
> > > > 
> > > > 3) userspace can populate ctrl vq itself
> > > I see. That is to say, userspace e.g. QEMU will program the
> > > ctrl vq with the existing VHOST_*_VRING_* ioctls, and parent
> > > drivers should know that the addresses used in ctrl vq are
> > > host virtual addresses in vhost-mdev's case.
> > 
> > 
> > That's a really good point. And that means the parent needs to
> > differentiate vhost from virtio. It should work.
> 
> 
> HVA may only work when we have something similar to VHOST_SET_OWNER which
> can reuse the MM of its owner.

We already have VHOST_SET_OWNER in vhost now. The parent can handle
the commands in its .kick_vq(), which is called by the vq's
.handle_kick callback. Virtio-user does something similar:

https://github.com/DPDK/dpdk/blob/0da7f445df445630c794897347ee360d6fe6348b/drivers/net/virtio/virtio_user_ethdev.c#L313-L322
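
To make the idea a bit more concrete, here is a rough sketch of how a
host-side handler could consume one ctrl vq command once it has the
header and payload in hand. The struct ctrl_state type and the
handle_ctrl_cmd() helper are hypothetical names used only for
illustration; the constants and the header layout come from
<linux/virtio_net.h>. It assumes little-endian (virtio 1.0) fields and
only covers the MQ class; it is not what any parent driver actually
does.

```c
#include <stddef.h>
#include <stdint.h>
#include <linux/virtio_net.h>

/* Hypothetical per-device state kept by the parent (illustration only). */
struct ctrl_state {
	uint16_t max_pairs;	/* limit advertised by the device */
	uint16_t cur_pairs;	/* currently enabled queue pairs */
};

/*
 * Handle one control command and return the ack byte that would be
 * written back into the device-writable part of the descriptor chain.
 * Walking the chain itself is left out of this sketch.
 */
static uint8_t handle_ctrl_cmd(struct ctrl_state *st,
			       const struct virtio_net_ctrl_hdr *hdr,
			       const void *data, size_t len)
{
	if (hdr->class == VIRTIO_NET_CTRL_MQ &&
	    hdr->cmd == VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET) {
		const uint8_t *p = data;
		uint16_t pairs;

		if (len < sizeof(struct virtio_net_ctrl_mq))
			return VIRTIO_NET_ERR;

		/* Assumption: the payload is little-endian (virtio 1.0). */
		pairs = (uint16_t)(p[0] | (p[1] << 8));
		if (pairs < VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN ||
		    pairs > st->max_pairs)
			return VIRTIO_NET_ERR;

		st->cur_pairs = pairs;
		return VIRTIO_NET_OK;
	}

	/* Every other class/command is rejected in this sketch. */
	return VIRTIO_NET_ERR;
}
```

A caller would invoke this from the kick path for the ctrl vq only and
then store the returned status in the last buffer of the chain.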

> 
> 
> > But is there any chance to use DMA addresses? I'm asking since the API
> > then tends to be device specific.
> 
> 
> I wonder whether we can introduce a MAP IOMMU notifier and get DMA mappings
> from that.

I think this will complicate things unnecessarily and may
cause pain, because in vhost-mdev the mdev's ctrl vq is
supposed to be managed by the host. And we should try to avoid
putting the ctrl vq and the Rx/Tx vqs in the same DMA space, to
prevent guests from bypassing the host (e.g. QEMU) and setting
up the backend accelerator directly.
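
For reference, a minimal sketch of the HVA-based flow discussed earlier
in the thread, where userspace (e.g. QEMU) allocates the ctrl vq in its
own memory and hands host virtual addresses to the backend with the
existing vring ioctls. ctrl_vq_fill_addr() and ctrl_vq_setup() are
hypothetical helper names, the 4096-byte alignment is an assumption, and
it presumes the device fd accepts these ioctls and that VHOST_SET_OWNER
has already been issued on it. Only vring_init() and the VHOST_* ioctl
definitions come from the uapi headers.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>
#include <linux/virtio_ring.h>

/* Fill a vhost_vring_addr with the host virtual addresses of a vring. */
static void ctrl_vq_fill_addr(struct vhost_vring_addr *addr,
			      unsigned int index, const struct vring *vr)
{
	memset(addr, 0, sizeof(*addr));
	addr->index = index;
	addr->desc_user_addr = (uint64_t)(uintptr_t)vr->desc;
	addr->avail_user_addr = (uint64_t)(uintptr_t)vr->avail;
	addr->used_user_addr = (uint64_t)(uintptr_t)vr->used;
}

/*
 * Program ring size, base index and addresses for the ctrl vq.
 * `mem` must be 4096-byte aligned and at least vring_size(num, 4096)
 * bytes long. Returns 0 on success, -1 if any ioctl fails.
 */
static int ctrl_vq_setup(int fd, unsigned int index, unsigned int num,
			 void *mem)
{
	struct vhost_vring_state state = { .index = index, .num = num };
	struct vhost_vring_addr addr;
	struct vring vr;

	vring_init(&vr, num, mem, 4096);
	ctrl_vq_fill_addr(&addr, index, &vr);

	if (ioctl(fd, VHOST_SET_VRING_NUM, &state) < 0)
		return -1;

	state.num = 0;	/* start consuming from avail index 0 */
	if (ioctl(fd, VHOST_SET_VRING_BASE, &state) < 0)
		return -1;

	if (ioctl(fd, VHOST_SET_VRING_ADDR, &addr) < 0)
		return -1;

	return 0;
}
```

Since the ring memory here comes from the host process and is never
mapped through the guest-visible IOMMU domain, the guest has no way to
reach it, which is the separation argued for above.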

> 
> Thanks
> 
