From: Yuanhan Liu
Subject: Re: [PATCH v5 1/3] vhost: Add callback and private data for vhost PMD
Date: Tue, 22 Dec 2015 11:41:58 +0800
Message-ID: <20151222034158.GH18863@yliu-dev.sh.intel.com>
References: <1447392031-24970-3-git-send-email-mukawa@igel.co.jp> <1448355603-21275-1-git-send-email-mukawa@igel.co.jp> <1448355603-21275-2-git-send-email-mukawa@igel.co.jp> <20151217114223.GC29571@yliu-dev.sh.intel.com> <56737A5E.1030803@igel.co.jp> <20151218041536.GI29571@yliu-dev.sh.intel.com> <56738B5C.1030206@igel.co.jp>
To: Rich Lane
Cc: dev@dpdk.org, ann.zhuangyanying@huawei.com
List-Id: patches and discussions about DPDK

On Fri, Dec 18, 2015 at 10:01:25AM -0800, Rich Lane wrote:
> I'm using the vhost callbacks and struct virtio_net with the vhost PMD in a
> few ways:

Rich, thanks for the info!

> 1. new_device/destroy_device: Link state change (will be covered by the
>    link status interrupt).
> 2. new_device: Add first queue to datapath.

I'm wondering why vring_state_changed() is not used, as it will also be
triggered at the beginning, when the default queue (the first queue) is
enabled.

> 3. vring_state_changed: Add/remove queue to datapath.
> 4. destroy_device: Remove all queues (vring_state_changed is not called
>    when qemu is killed).

I had a plan to invoke vring_state_changed() to disable all vrings when
destroy_device() is called.

> 5. new_device and struct virtio_net: Determine NUMA node of the VM.

You can get the 'struct virtio_net' dev from all of the above callbacks.
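For readers following along: all of the events in items 1-4 arrive through
the single ops table an application (or the vhost PMD) registers with the
vhost library. A rough sketch of the dispatch pattern is below; the struct
and handler bodies here are illustrative stubs that only mirror the shape
of the era's rte_vhost_driver_callback_register() API, not the real
headers, and the queue counter is a made-up stand-in for a datapath.

```c
#include <assert.h>
#include <stdint.h>

/* Stub type standing in for the vhost library's device handle. */
struct virtio_net { int device_fh; };

/* Shape of the single callback table the vhost library accepts. */
struct virtio_net_device_ops {
	int  (*new_device)(struct virtio_net *dev);
	void (*destroy_device)(struct virtio_net *dev);
	int  (*vring_state_changed)(struct virtio_net *dev,
				    uint16_t queue_id, int enable);
};

/* Toy stand-in for "queues added to the datapath". */
static int queues_enabled;

static int new_device(struct virtio_net *dev)
{
	(void)dev;
	/* Device arrival == "link up"; the default (first) queue becomes
	 * usable here, which is Yuanhan's point about item 2. */
	queues_enabled = 1;
	return 0;
}

static void destroy_device(struct virtio_net *dev)
{
	(void)dev;
	/* "Link down": drop every queue, since vring_state_changed() is
	 * not guaranteed to fire when qemu is killed (item 4). */
	queues_enabled = 0;
}

static int vring_state_changed(struct virtio_net *dev,
			       uint16_t queue_id, int enable)
{
	(void)dev; (void)queue_id;
	/* Item 3: add or remove one queue from the datapath. */
	queues_enabled += enable ? 1 : -1;
	return 0;
}

static const struct virtio_net_device_ops ops = {
	.new_device          = new_device,
	.destroy_device      = destroy_device,
	.vring_state_changed = vring_state_changed,
};
```

Because the library takes exactly one such table, whoever registers it
(application or PMD) owns all of these events — which is the conflict the
patch under discussion tries to resolve.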
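On item 5, the reply below points at get_mempolicy(): the NUMA node
backing a guest's memory can be read off any address inside it, which is
what numa_realloc() in lib/librte_vhost/virtio-net.c does. A minimal
Linux-only sketch, using the raw syscall so it needs no libnuma (in real
code you would include <numaif.h> and link -lnuma instead); node_of_addr
is a hypothetical helper name:

```c
#include <unistd.h>
#include <sys/syscall.h>

/* Flag values from <numaif.h>, defined here to stay self-contained. */
#ifndef MPOL_F_NODE
#define MPOL_F_NODE (1 << 0)
#define MPOL_F_ADDR (1 << 1)
#endif

/* Return the NUMA node of the page containing 'addr', or -1 on failure
 * (e.g. a kernel built without NUMA support). */
static int node_of_addr(void *addr)
{
	int node = -1;

	if (syscall(SYS_get_mempolicy, &node, NULL, 0UL, addr,
		    (unsigned long)(MPOL_F_NODE | MPOL_F_ADDR)) != 0)
		return -1;
	return node;
}
```

Applied to a guest memory region obtained from 'struct virtio_net', this
gives the per-device NUMA info Rich asked for without a new API.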
> The vring_state_changed callback is necessary because the VM might not be
> using the maximum number of RX queues. If I boot Linux in the VM it will
> start out using one RX queue, which can be changed with ethtool. The DPDK
> app in the host needs to be notified that it can start sending traffic to
> the new queue.
>
> The vring_state_changed callback is also useful for guest TX queues, to
> avoid reading from an inactive queue.
>
> API I'd like to have:
>
> 1. Link status interrupt.

To the vhost PMD, new_device()/destroy_device() are equivalent to the link
status interrupt: new_device() is link up, and destroy_device() is link
down.

> 2. New queue_state_changed callback. Unlike vring_state_changed this
>    should cover the first queue at new_device and removal of all queues
>    at destroy_device.

As stated above, vring_state_changed() should be able to do that, except
for the destroy_device() case, which is not handled yet.

> 3. Per-queue or per-device NUMA node info.

You can query the NUMA node info implicitly with get_mempolicy(); check
numa_realloc() at lib/librte_vhost/virtio-net.c for reference.

	--yliu

> On Thu, Dec 17, 2015 at 8:28 PM, Tetsuya Mukawa wrote:
>
> On 2015/12/18 13:15, Yuanhan Liu wrote:
> > On Fri, Dec 18, 2015 at 12:15:42PM +0900, Tetsuya Mukawa wrote:
> >> On 2015/12/17 20:42, Yuanhan Liu wrote:
> >>> On Tue, Nov 24, 2015 at 06:00:01PM +0900, Tetsuya Mukawa wrote:
> >>>> The vhost PMD will be a wrapper of the vhost library, but some of
> >>>> the vhost library APIs cannot be mapped to ethdev library APIs.
> >>>> Because of this, in some cases, we still need to use vhost library
> >>>> APIs for a port created by the vhost PMD.
> >>>>
> >>>> Currently, when a virtio device is created and destroyed, the vhost
> >>>> library will call one of the callback handlers. The vhost PMD needs
> >>>> to use this pair of callback handlers to know which virtio devices
> >>>> are actually connected.
> >>>> Because we can register only one pair of callbacks with the vhost
> >>>> library, if the PMD uses it, DPDK applications have no way to know
> >>>> about the events.
> >>>>
> >>>> This may break legacy DPDK applications that use the vhost library.
> >>>> To prevent that, this patch adds one more pair of callbacks to the
> >>>> vhost library, especially for the vhost PMD.
> >>>> With the patch, legacy applications can use the vhost PMD even if
> >>>> they need additional specific handling for virtio device creation
> >>>> and destruction.
> >>>>
> >>>> For example, a legacy application can call
> >>>> rte_vhost_enable_guest_notification() in the callbacks to change a
> >>>> setting.
> >>> TBH, I never liked it since the beginning. Introducing two callbacks
> >>> for one event is a bit messy, and therefore error prone.
> >> I agree with you.
> >>
> >>> I have been thinking about this occasionally over the last few
> >>> weeks, and have come up with something: we may introduce another
> >>> layer of callbacks based on the vhost PMD itself, via a new API:
> >>>
> >>>     rte_eth_vhost_register_callback().
> >>>
> >>> We would then call those new callbacks inside the vhost PMD's
> >>> new_device() and destroy_device() implementations.
> >>>
> >>> And we could have the same callbacks the vhost library has, but I'm
> >>> thinking that new_device() and destroy_device() don't sound like
> >>> good names for a PMD driver. Maybe a name like "link_state_changed"
> >>> is better?
> >>>
> >>> What do you think of that?
> >> Yes, "link_state_changed" will be good.
> >>
> >> BTW, I thought it was OK for a DPDK app that uses the vhost PMD to
> >> call vhost library APIs directly. But you probably find that strange.
> >> Is this correct?
> > Unluckily, that's true :)
> >
> >> If so, how about implementing the legacy status interrupt mechanism
> >> in the vhost PMD? For example, a DPDK app could register a callback
> >> handler like "examples/link_status_interrupt".
> >>
> >> Also, if the app doesn't call vhost library APIs directly,
> >> rte_eth_vhost_portid2vdev() will be unneeded, because the app doesn't
> >> need to handle the virtio device structure anymore.
> >>
> >>> On the other hand, I'm still wondering: is it really necessary to
> >>> let the application call vhost functions like
> >>> rte_vhost_enable_guest_notification() with the vhost PMD driver?
> >> The basic concept of my patch is that the vhost PMD will provide the
> >> features that the vhost library provides.
> > I don't think that's necessary. Let's just treat it as a normal PMD
> > driver, having nothing to do with the vhost library.
> >
> >> How about removing rte_vhost_enable_guest_notification() from the
> >> vhost library? (I am also not sure what the use cases are.)
> >> If we can do this, the vhost PMD also doesn't need to take care of
> >> it. Or, if rte_vhost_enable_guest_notification() will be removed in
> >> the future, the vhost PMD is able to ignore it.
> > You could either call it in vhost-pmd (which you have already done)
> > or ignore it in vhost-pmd, but don't remove it from the vhost library.
> >
> >> Please let me confirm my updated thinking about your questions:
> >>  - Change the concept of the patch not to call vhost library APIs
> >>    directly; these should be wrapped by ethdev APIs.
> >>  - Remove rte_eth_vhost_portid2vdev(), because of the above concept
> >>    change.
> >>  - Implement the legacy status change interrupt in the vhost PMD
> >>    instead of using our own callback mechanism.
> >>  - Check whether we can remove rte_vhost_enable_guest_notification()
> >>    from the vhost library.
> > So, how about making it __fairly__ simple as the first step, to get
> > merged easily: we don't assume the applications will call any vhost
> > library functions anymore, so we don't need the callback, and we
> > don't need rte_eth_vhost_portid2vdev(), either.
> > Again, just let it be a fair, normal (nothing special) PMD driver.
> > (UNLESS there is a real must, which I don't see so far.)
> >
> > Tetsuya, what do you think of that then?
>
> I agree with you, but will wait a few days, because if someone wants to
> use it from the vhost PMD, they will probably provide use cases. And if
> there are no use cases, let's do as above.
>
> Thanks,
> Tetsuya
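For reference, the "legacy status interrupt mechanism" discussed above
(examples/link_status_interrupt) boils down to the application registering
one event handler, and the PMD firing a link-status-change (LSC) event from
inside its new_device()/destroy_device() paths. The sketch below models
that flow with stand-in types; the real API is
rte_eth_dev_callback_register() with RTE_ETH_EVENT_INTR_LSC, and every
name here (callback_register, report_link, app_lsc_handler) is a
hypothetical simplification, not DPDK code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the ethdev event type the app subscribes to. */
enum eth_event { ETH_EVENT_INTR_LSC };

typedef void (*eth_event_cb)(uint8_t port_id, enum eth_event ev, void *arg);

/* One registered handler, mimicking rte_eth_dev_callback_register(). */
static eth_event_cb lsc_cb;
static void *lsc_arg;

static void callback_register(eth_event_cb cb, void *arg)
{
	lsc_cb = cb;
	lsc_arg = arg;
}

static int link_up;

/* What the vhost PMD would do: new_device() calls report_link(1),
 * destroy_device() calls report_link(0), and the app learns of both
 * through the single generic LSC event rather than vhost callbacks. */
static void report_link(int up)
{
	link_up = up;
	if (lsc_cb)
		lsc_cb(0, ETH_EVENT_INTR_LSC, lsc_arg);
}

/* Application-side handler, as in examples/link_status_interrupt. */
static void app_lsc_handler(uint8_t port_id, enum eth_event ev, void *arg)
{
	(void)ev; (void)arg;
	printf("port %u link %s\n", port_id, link_up ? "up" : "down");
}
```

This is why the thread converges on the plain LSC path: the app never
touches the vhost library, so neither the extra callback pair nor
rte_eth_vhost_portid2vdev() is needed.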