All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Stojaczyk, DariuszX" <dariuszx.stojaczyk@intel.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	Maxime Coquelin <maxime.coquelin@redhat.com>,
	"Bie, Tiwei" <tiwei.bie@intel.com>,
	"Tetsuya Mukawa" <mtetsuyah@gmail.com>,
	Thomas Monjalon <thomas@monjalon.net>,
	"yliu@fridaylinux.org" <yliu@fridaylinux.org>,
	"Liu, Changpeng" <changpeng.liu@intel.com>
Subject: Re: [RFC] vhost: new rte_vhost API proposal
Date: Fri, 18 May 2018 07:51:51 +0000	[thread overview]
Message-ID: <FBE7E039FA50BF47A673AD0BD3CD56A8449A2C91@HASMSX105.ger.corp.intel.com> (raw)
In-Reply-To: <20180511100531.GA19894@stefanha-x1.localdomain>



> -----Original Message-----
> From: Stefan Hajnoczi [mailto:stefanha@redhat.com]
> Sent: Friday, May 11, 2018 6:06 PM
> On Fri, May 11, 2018 at 05:55:45AM +0000, Stojaczyk, DariuszX wrote:
> > > -----Original Message-----
> > > From: Stefan Hajnoczi [mailto:stefanha@redhat.com]
> > > Sent: Friday, May 11, 2018 12:37 AM
> > > On Thu, May 10, 2018 at 03:22:53PM +0200, Dariusz Stojaczyk wrote:
> > > > rte_virtio would offer both vhost and virtio driver APIs. These two
> > > > have a lot of common code for vhost-user handling or PCI access for
> > > > initiator/virtio-vhost-user (and possibly vDPA) so there's little
> > > > sense to keep target and initiator code separated between different
> > > > libs. Of course, the APIs would be separate - only some parts of
> > > > the code would be shared.
> > >
> > > The API below seems to be for vhost backends (aka slaves).
> rte_virtio_*
> > > is a misnomer because vhost and virtio are two different things.  This
> > > is not for implementing virtio devices, it's specifically for vhost
> > > devices.
> >
> > I agree it's named a bit off if we're talking about vhost. My idea was to
> introduce a generic library for userspace Virtio processing and that's
> where the name came from. Even when you use just the vhost API that's
> introduced here, you are only required to implement vring processing,
> config access, and possibly interrupt handling, all of which are typical
> Virtio things. The vhost logic is hidden inside.
> 
> No, the vhost logic is not hidden: there is custom_msg() and the whole
> tgt_ops struct is an abstraction of the vhost protocol, not virtio.
> 
> It sounds like you're hoping to create a single API that can support
> both vhost and virtio access.  For example, one "net" device backend
> implementation using rte_virtio can be accessed via vhost or virtio.
> 
> This won't work because vhost and virtio are not equivalent.  vhost-net
> devices don't implement the virtio-net config space and they only have a
> subset of the virtqueues.  vhost-net devices support special vhost
> messages that don't exist in virtio-net.
> 
> Additionally, the virtio and vhost-user specifications are independent
> and make no promise of a 1:1 mapping.  They have the freedom to
> change
> in ways which will break any abstraction you come up with today.
> 
> I hope it will be possible to unify the two in the future, but that
> needs to happen at the spec level first, before trying to unify them in
> code.
> 
> This is why I'm belaboring the point that vhost should not be confused
> with virtio.  Each needs to be separate and clearly identified to avoid
> confusion.
> 	

Ok, I'm convinced now. Thanks for the explanation. I'll name the lib rte_vhost2 in v2.


> >
> > >
> > > Vhost does not offer the full virtio device model - otherwise it would
> > > be just another transport in the VIRTIO specification.  Instead vhost is
> > > a protocol for vhost devices, which are subsets of virtio devices.
> > >
> > > I suggest calling it rte_vhost2 since it's basically a new, incompatible
> > > rte_vhost API.
> >
> > Rte_vhost2 sounds better for what we have now, but would that name
> be still valid once we add a true Virtio driver functionality? (I believe it's
> called Virtio PMD in DPDK right now). That driver would reuse a lot of the
> vhost code for PCI and vhost-user, so it makes some sense to put these
> two together.
> >
> > I don't think rte_vhost2 is a permanent name anyway, so maybe we
> could call it like so for now, and rename it later once I introduce that
> additional Virtio functionality? Would that work?
> 
> The natural layering for is that vhost depends on virtio.  Virtio header
> files (feature bits, config space layout, vring layout) and the vring
> API can be reused by vhost.
> 
> Virtio doesn't need knowledge of virtio though and the two can be in
> separate packages without code duplication.
> 
> That said, it doesn't really matter whether there are rte_virtio +
> rte_vhost2 packages or a single rte_virtio package, as long as the
> function and struct names for vhost interfaces contain the name "vhost"
> so they cannot be confused with virtio.
> 
> > > > + * or before actions that require changing vhost device/vq state.
> > > > + */
> > > > + void (*queue_stop)(struct rte_virtio_dev *vdev, struct
> rte_virtio_vq
> > > *vq);
> > > > + /** Device disconnected. All queues are guaranteed to be stopped
> by
> > > now */
> > > > + void (*device_destroy)(struct rte_virtio_dev *vdev);
> > > > + /**
> > > > + * Custom message handler. `vdev` and `vq` can be NULL. This is
> called
> > > > + * for backend-specific actions. The `id` should be prefixed by the
> > >
> > > Since vdev can be NULL, does this mean custom_msg() may be invoked
> at
> > > any time during the lifecycle and even before/after
> > > device_create()/device_destroy()?
> >
> > Theoretically. I was thinking of some poorly-written backends notifying
> they're out of internal resources, but I agree it's just poor. I'll remove the
> `vdev can be NULL` part.
> 
> Okay, I wasn't suggesting it's bad, I just wanted the docs to state at
> which points in the lifecycle this callback can be invoked.
> 
> > > > + */
> > > > + void (*custom_msg)(struct rte_virtio_dev *vdev, struct
> rte_virtio_vq
> > > *vq,
> > > > +   char *id, void *ctx);
> > >
> > > What is the purpose of id and why is it char* instead of const char*?
> >
> > Ack, It should be const. (same thing goes to every other char* in this
> patch)
> >
> > For example vhost-crypto introduces two new vhost-user messages for
> initializing and destroying crypto session. The underlying vhost-crypto
> vhost-user backend after receiving such message could execute this
> callback as follows:
> >
> > struct my_crypto_data *data = calloc();
> > [...]
> > Ops->custom_msg(vdev, NULL, "crypto_sess_init", data);
> 
> So it's necessary to modify rte_virtio vhost code when implementing new
> device backends with custom messages?
> 
> It seems like rte_virtio needs to have knowledge of how to parse any
> custom messages :(.  It would be cleaner for rte_virtio to have no
> knowledge of device-specific messages.
> 
> And how does the device backend reply to custom messages?
>

The library would send proper response after receiving rte_virtio_tgt_cb_complete(). If it needs additional data from the user, there's the `ctx` field in custom_msg that he [the user] can write into.

However, I started to work on the implementation and came to conclusion that it's unnecessarily difficult to implement new Vhost device backends this way. I've changed the custom_msg callback to parse raw Vhost-user messages now. Still, new Vhost-user messages are usually a type of a protocol extension negotiated by a protocol feature flag, and protocol extensions should be implemented inside the lib in my opinion. If a protocol extension changes existing message rather than introduces new one, we'll *need* to implement it inside the lib. 

Both solutions have their good and bad points.
I'm sending v2 in a couple minutes, maybe it'll help us decide which one is better.

> > > > +/**
> > > > + * Unregisters a vhost target asynchronously.
> > >
> > > How are existing device instances affected?
> >
> > Ack. How about:
> > All active queues will be stopped and all devices destroyed.
> >
> > This is analogous to what rte_vhost has now.
> 
> Sounds good.
> 
> Stefan

Regards,
D.

  parent reply	other threads:[~2018-05-18  7:51 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-10 13:22 [RFC] vhost: new rte_vhost API proposal Dariusz Stojaczyk
     [not found] ` <20180510163643.GD9308@stefanha-x1.localdomain>
2018-05-11  5:55   ` Stojaczyk, DariuszX
     [not found]     ` <20180511100531.GA19894@stefanha-x1.localdomain>
2018-05-18  7:51       ` Stojaczyk, DariuszX [this message]
2018-05-18 13:01 ` [RFC v2] " Dariusz Stojaczyk
2018-05-18 13:50   ` Maxime Coquelin
2018-05-20  7:07     ` Yuanhan Liu
2018-05-22 10:19     ` Stojaczyk, DariuszX
     [not found]   ` <20180525100550.GD14757@stefanha-x1.localdomain>
2018-05-29 13:38     ` Stojaczyk, DariuszX
     [not found]       ` <20180530085700.GC14623@stefanha-x1.localdomain>
2018-05-30 12:24         ` Stojaczyk, DariuszX
     [not found]   ` <20180607151227.23660-1-darek.stojaczyk@gmail.com>
     [not found]     ` <20180608100852.GA31164@stefanha-x1.localdomain>
2018-06-13  9:41       ` [RFC v3 0/7] vhost2: new librte_vhost2 proposal Dariusz Stojaczyk
2018-06-25 11:01     ` Tiwei Bie
2018-06-25 12:17       ` Stojaczyk, DariuszX
2018-06-26  8:22         ` Tiwei Bie
2018-06-26  8:30           ` Thomas Monjalon
2018-06-26  8:47           ` Stojaczyk, DariuszX
2018-06-26  9:14             ` Tiwei Bie
2018-06-26  9:38               ` Maxime Coquelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=FBE7E039FA50BF47A673AD0BD3CD56A8449A2C91@HASMSX105.ger.corp.intel.com \
    --to=dariuszx.stojaczyk@intel.com \
    --cc=changpeng.liu@intel.com \
    --cc=dev@dpdk.org \
    --cc=maxime.coquelin@redhat.com \
    --cc=mtetsuyah@gmail.com \
    --cc=stefanha@redhat.com \
    --cc=thomas@monjalon.net \
    --cc=tiwei.bie@intel.com \
    --cc=yliu@fridaylinux.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.