All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Stojaczyk, DariuszX" <dariuszx.stojaczyk@intel.com>
To: Maxime Coquelin <maxime.coquelin@redhat.com>,
	"dev@dpdk.org" <dev@dpdk.org>, "Bie, Tiwei" <tiwei.bie@intel.com>,
	Tetsuya Mukawa <mtetsuyah@gmail.com>,
	Thomas Monjalon <thomas@monjalon.net>
Cc: "yliu@fridaylinux.org" <yliu@fridaylinux.org>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	"Harris, James R" <james.r.harris@intel.com>
Subject: Re: [RFC v2] vhost: new rte_vhost API proposal
Date: Tue, 22 May 2018 10:19:53 +0000	[thread overview]
Message-ID: <FBE7E039FA50BF47A673AD0BD3CD56A8449A61D1@HASMSX105.ger.corp.intel.com> (raw)
In-Reply-To: <27ce772e-9f01-dff9-1f82-b99924efa950@redhat.com>

Hi Maxime,

> -----Original Message-----
> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> Sent: Friday, May 18, 2018 9:51 PM
> On 05/18/2018 03:01 PM, Dariusz Stojaczyk wrote:
> > rte_vhost is not vhost-user spec compliant. Some Vhost drivers have
> > been already confirmed not to work with rte_vhost. virtio-user-scsi-pci
> > in QEMU 2.12 doesn't fully initialize its management queues at SeaBIOS
> > stage. This is perfectly fine from the Vhost-user spec perspective, but
> > doesn't meet rte_vhost expectations. rte_vhost waits for all queues
> > to be fully initialized before it allows the entire device to be
> > processed. qFixing rte_vhost directly would require quite a big amount
> > of changes, which would completely break backwards compatibility.
> >
> > This rte_vhost2 library is intended to smooth out the transition.
> > It exposes a low-level API for implementing new Vhost-user slaves.
> > The existing rte_vhost is about to be refactored to use rte_vhost2
> > library underneath, and demanding backends could now use rte_vhost2
> > directly.
> 
> I like the idea, and the proposed way to smooth the transition.
> 
> I will certainly have other comments later, but please find below
> the ones I have for the moment.
> 
> > <snip>
> > +
> > +/**
> > + * Registers a new vhost target accepting remote connections. Multiple
> > + * available transports are available. It is possible to create a Vhost-
> user
> > + * Unix domain socket polling local connections or connect to a
> physical
> > + * Virtio device and install an interrupt handler .
> > + *
> > + * This function is thread-safe.
> > + *
> > + * \param trtype type of the transport used, e.g. "vhost-user",
> > + * "PCI-vhost-user", "PCI-vDPA".
> > + * \param trid identifier of the device. For PCI this would be the BDF
> address,
> > + * for vhost-user the socket name.
> > + * \param trflags additional options for the specified transport
> > + * \param trctx additional data for the specified transport. Can be
> NULL.
> > + * \param tgt_ops callbacks to be called upon reaching specific
> initialization
> > + * states.
> > + * \param features supported Virtio features. To be negotiated with
> the
> > + * driver ones. rte_vhost2 will append a couple of generic feature bits
> > + * which are required by the Virtio spec. TODO list these features here
> > + * \return 0 on success, negative errno otherwise
> > + */
> > +int rte_vhost2_tgt_register(const char *trtype, const char *trid,
> > +			    uint64_t trflags, void *trctx,
> > +			    struct rte_vhost2_tgt_ops *tgt_ops,
> > +			    uint64_t features);
> 
> Couldn't the register API also pass the vdev?
> Doing this, the backend could have rte_vhost2_dev in its device
> struct.

Please notice the register API is for registering targets, not devices. A single Vhost-user server target can spawn multiple devices - one for each connection. I know the nomenclature is different from rte_vhost, but since each connection uses its own (virt)queues it makes sense to call things this way.

Initially I thought about adding some rte_vhost2_tgt struct declaration that register function would return, but later on came to a conclusion that it would only complicate things for the library user. A parent struct that would keep rte_vhost2_tgt* needs to contain `const char *trtype` and `const char *trid` anyway, so it's just easier to use these two strings for target identification. 

> > <snip>
> > +/**
> > + * Bypass VIRTIO_F_IOMMU_PLATFORM and translate gpa directly.
> > + *
> > + * This function is thread-safe.
> > + *
> > + * \param mem vhost device memory
> > + * \param gpa guest physical address
> > + * \param len length of the memory to translate (in bytes). If
> requested
> > + * memory chunk crosses memory region boundary, the *len will be
> set to
> > + * the remaining, maximum length of virtually contiguous memory. In
> such
> > + * case the user will be required to call another gpa_to_vva(gpa +
> *len).
> > + * \return vhost virtual address or NULL if requested `gpa` is not
> mapped.
> > + */
> > +static inline void *
> > +rte_vhost2_gpa_to_vva(struct rte_vhost2_memory *mem, uint64_t
> gpa, uint64_t *len)
> > +{
> > +	struct rte_vhost2_mem_region *r;
> > +	uint32_t i;
> > +
> > +	for (i = 0; i < mem->nregions; i++) {
> > +		r = &mem->regions[i];
> > +		if (gpa >= r->guest_phys_addr &&
> > +		    gpa <  r->guest_phys_addr + r->size) {
> > +
> > +			if (unlikely(*len > r->guest_phys_addr + r->size -
> gpa)) {
> > +				*len = r->guest_phys_addr + r->size - gpa;
> > +			}
> > +
> > +			return gpa - r->guest_phys_addr + r-
> >host_user_addr;
> > +		}
> > +	}
> > +	*len = 0;
> > +
> > +	return 0;
> > +}
> 
> Maybe we could take the opportunity to only have
> rte_vhost2_iova_to_vva.

Good idea; will remove it in v3.

Thanks,
D.

  parent reply	other threads:[~2018-05-22 10:19 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-10 13:22 [RFC] vhost: new rte_vhost API proposal Dariusz Stojaczyk
     [not found] ` <20180510163643.GD9308@stefanha-x1.localdomain>
2018-05-11  5:55   ` Stojaczyk, DariuszX
     [not found]     ` <20180511100531.GA19894@stefanha-x1.localdomain>
2018-05-18  7:51       ` Stojaczyk, DariuszX
2018-05-18 13:01 ` [RFC v2] " Dariusz Stojaczyk
2018-05-18 13:50   ` Maxime Coquelin
2018-05-20  7:07     ` Yuanhan Liu
2018-05-22 10:19     ` Stojaczyk, DariuszX [this message]
     [not found]   ` <20180525100550.GD14757@stefanha-x1.localdomain>
2018-05-29 13:38     ` Stojaczyk, DariuszX
     [not found]       ` <20180530085700.GC14623@stefanha-x1.localdomain>
2018-05-30 12:24         ` Stojaczyk, DariuszX
     [not found]   ` <20180607151227.23660-1-darek.stojaczyk@gmail.com>
     [not found]     ` <20180608100852.GA31164@stefanha-x1.localdomain>
2018-06-13  9:41       ` [RFC v3 0/7] vhost2: new librte_vhost2 proposal Dariusz Stojaczyk
2018-06-25 11:01     ` Tiwei Bie
2018-06-25 12:17       ` Stojaczyk, DariuszX
2018-06-26  8:22         ` Tiwei Bie
2018-06-26  8:30           ` Thomas Monjalon
2018-06-26  8:47           ` Stojaczyk, DariuszX
2018-06-26  9:14             ` Tiwei Bie
2018-06-26  9:38               ` Maxime Coquelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=FBE7E039FA50BF47A673AD0BD3CD56A8449A61D1@HASMSX105.ger.corp.intel.com \
    --to=dariuszx.stojaczyk@intel.com \
    --cc=dev@dpdk.org \
    --cc=james.r.harris@intel.com \
    --cc=maxime.coquelin@redhat.com \
    --cc=mtetsuyah@gmail.com \
    --cc=stefanha@redhat.com \
    --cc=thomas@monjalon.net \
    --cc=tiwei.bie@intel.com \
    --cc=yliu@fridaylinux.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.