From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Richardson, Bruce" Subject: Re: [RFC PATCH v2 3/5] librte_ether: add API's for VF management Date: Wed, 28 Sep 2016 13:01:03 +0000 Message-ID: <59AF69C657FD0841A61C55336867B5B035B1018F@IRSMSX103.ger.corp.intel.com> References: <1471528125-26357-1-git-send-email-bernard.iremonger@intel.com> <8452736.eEWzj5BUlI@xps13> <8CEF83825BEC744B83065625E567D7C21A08D383@IRSMSX108.ger.corp.intel.com> <2212551.lClP9YTigq@xps13> <8CEF83825BEC744B83065625E567D7C21A08D615@IRSMSX108.ger.corp.intel.com> <20160927130121.GA34240@bricha3-MOBL3> <8CEF83825BEC744B83065625E567D7C21A08D86D@IRSMSX108.ger.corp.intel.com> <2601191342CEEE43887BDE71AB9772583F0BC0A3@irsmsx105.ger.corp.intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Cc: Thomas Monjalon , "dev@dpdk.org" , Jerin Jacob , "Shah, Rahul R" , "Lu, Wenzhuo" , azelezniak To: "Ananyev, Konstantin" , "Iremonger, Bernard" Return-path: Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by dpdk.org (Postfix) with ESMTP id 04C572A1A for ; Wed, 28 Sep 2016 15:01:08 +0200 (CEST) In-Reply-To: <2601191342CEEE43887BDE71AB9772583F0BC0A3@irsmsx105.ger.corp.intel.com> Content-Language: en-US List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" > -----Original Message----- > From: Ananyev, Konstantin > Sent: Wednesday, September 28, 2016 12:24 PM > To: Iremonger, Bernard ; Richardson, Bruce > > Cc: Thomas Monjalon ; dev@dpdk.org; Jerin Jaco= b > ; Shah, Rahul R ; > Lu, Wenzhuo ; azelezniak > Subject: RE: [dpdk-dev] [RFC PATCH v2 3/5] librte_ether: add API's for VF > management >=20 > Hi lads, >=20 > > -----Original Message----- > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Iremonger, > > Bernard > > Sent: Tuesday, September 27, 2016 3:13 PM > > To: Richardson, Bruce > > Cc: Thomas Monjalon ; dev@dpdk.org; Jerin > > Jacob ; Shah, Rahul R > > ; Lu, Wenzhuo ; > > azelezniak > > Subject: Re: [dpdk-dev] [RFC PATCH v2 3/5] librte_ether: add API's for > > VF management > > > > Hi Bruce, > > > > > > > > > > > > > Subject: Re: [dpdk-dev] [RFC PATCH v2 3/5] librte_ether: add > > > > > > > API's for VF management > > > > > > > > > > > > > > 2016-09-23 17:02, Iremonger, Bernard: > > > > > > > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com] > > > > > > > > > 2016-09-23 09:53, Richardson, Bruce: > > > > > > > > > > From: Thomas Monjalon > > > > > > > > > > [mailto:thomas.monjalon@6wind.com] > > > > > > > > > > > 2016-09-23 10:20, Bruce Richardson: > > > > > > > > > > > > On Thu, Sep 22, 2016 at 07:04:37PM +0200, Thomas > > > > > > > > > > > > Monjalon > > > > > wrote: > > > > > > > > > > > > > 2016-09-15 16:46, Iremonger, Bernard: > > > > > > > > > > > > > > > > > Do we really need to expose VF specific > > > > > > > > > > > > > > > > > functions > > > > > here? > > > > > > > > > > > > > > > > > It can be generic(PF/VF) function > > > > > > > > > > > > > > > > > indexed only through > > > > > > > > > > > port_id. > > > > > > > > > > > > > > > > > (example: as > > > > > > > > > > > > > > > > > rte_eth_dev_set_vlan_anti_spoof(uint8_t > > > > > > > > > > > > > > > > > port_id, uint8_t on)) For instance, In > > > > > > > > > > > > > > > > > Thunderx PMD, We are not exposing a > > > > > > > > > > > > > > > > > separate port_id for PF. We only > > > > > > > > > > > > > > > > > enumerate 0..N VFs as 0..N ethdev > > > > > > > > > > > > > > > > > port_id > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Our intention with this patch is to > > > > > > > > > > > > > > > > control the VF from the > > > > > PF. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The following librte_ether functions > > > > > > > > > > > > > > > > already work in a similar > > > > > > > > > > > way: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_set_vf_rxmode(uint8_t port_id, > > > > > > > > > > > > > > > > uint16_t vf, uint16_t rx_mode, uint8_t on) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_set_vf_rx(uint8_t port_id, > > > > > > > > > > > > > > > > uint16_t vf, uint8_t > > > > > > > > > > > > > > > > on) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_set_vf_tx(uint8_t port_id, > > > > > > > > > > > > > > > > uint16_t vf, uint8_t > > > > > > > > > > > > > > > > on) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > int rte_eth_set_vf_rate_limit(uint8_t > > > > > > > > > > > > > > > > port_id, uint16_t vf, uint16_t tx_rate, > > > > > > > > > > > > > > > > uint64_t q_msk) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I have a bad feeling with these functions > > > > > > > > > > > > > > > dedicated to VF from > > > > > > > PF. > > > > > > > > > > > > > > > Are we sure there is no other way? > > > > > > > > > > > > > > > I mean we just need to know the VF with a por= t > ID. > > > > > > > > > > > > > > > > > > > > > > > > > > > > When the VF is used in a VM the port ID of the > > > > > > > > > > > > > > VF is not visible to > > > > > > > > > > > the PF. > > > > > > > > > > > > > > I don't think there is another way to do this. > > > > > > > > > > > > > > > > > > > > > > > > > > I don't understand why we could not assign a > > > > > > > > > > > > > port id to the VF from the host instead of > > > > > > > > > > > > > having the couple PF port id / > > > > > VF id. > > > > > > > > > > > > > Can we enumerate all the VFs associated to a PF? > > > > > > > > > > > > > Then can we allocate them a port id in the array > > > > > rte_eth_devices? > > > > > > > > > > > > > > > > > > > > > > > > Hi Thomas, > > > > > > > > > > > > > > > > > > > > > > > > The VF is not a port visible to DPDK, though, so > > > > > > > > > > > > it shouldn't have a port id IMHO. DPDK can't > > > > > > > > > > > > actually do anything > > > > > with it. > > > > > > > > > > > > > > > > > > > > > > You say the contrary below. > > > > > > > > > > > > > > > > > > > > Well, yes and no. The driver can manipulate things for > > > > > > > > > > the VF, but DPDK > > > > > > > > > doesn't actually have a device that corresponds to the VF= . > > > > > > > > > There are no PCI bar mappings for it, DPDK can't do RX > > > > > > > > > and TX with > > > it etc.? > > > > > > > > > > > > > > > > > > Very good point. > > > > > > > > > There are only few ethdev functions which are supported > > > > > > > > > by every drivers, like Rx/Tx and would not be available > > > > > > > > > for VF from PF > > > > > interface. > > > > > > > > > > > > > > > > > > > > > The PCI device for the VF is likely passed through > > > > > > > > > > > > to a different VM and being used there. > > > > > > > > > > > > Unfortunately, the VF still needs certain things > > > > > > > > > > > > done for it by the PF, so if the PF is under DPDK > > > > > > > > > > > > control, it needs to provide the functionality to > > > > > > > > > > > > assist > > > > > > > the VF. > > > > > > > > > > > > > > > > > > > > > > Why not have a VF_from_PF driver which does the > > > > > > > > > > > mailbox > > > > > things? > > > > > > > > > > > So you can manage the VF from the PF with a simple > port id. > > > > > > > > > > > It really seems to be the cleanest design to me. > > > > > > > > > > > > > > > > > > > > While I see your point, and it could work, I just want > > > > > > > > > > to be sure that we are > > > > > > > > > ok with the results of that. Suppose we do create > > > > > > > > > ethdevs for the VFs controlled by the PF. Does the new > > > > > > > > > VF get counted in the > > > > > > > > > rte_eth_dev_count() value (I assume yes)? How are apps > > > > > > > > > meant to use the port? Do they have to put in a special > > > > > > > > > case when iterating through all the port ids to check > > > > > > > > > that it's not a pseudo port that can't do anything. None > > > > > > > > > of the standard ethdev calls from an app will work on > > > > > > > > > it, you can't configure nb rx/tx queues on it, you can't > > > > > > > > > start or > > > > > > > stop it, you can't do rx or tx on it, etc, etc. > > > > > > > > > > > > > > > > > > Yes these devices would be special because their > > > > > > > > > supported API would be quite different. I was thinking > > > > > > > > > that in the future you could add most of the > > > > > > > > > configuration functions through the VF > > > > > mailbox. > > > > > > > > > But the Intel mailbox currently support only some > > > > > > > > > special configurations which are not supported by other > > > > > > > > > devices even its own VF device (except setting MAC > address). > > > > > > > > > And when I read "set drop enable bit in the VF split rx > > > > > > > > > control register", it becomes clear it is really > > > > > > > > > specific and has nothing to do in the generic ethdev API. > > > > > > > > > That's why it is a NACK. > > > > > > > > > > > > > > > > > > When we want to use these very specific features we are > > > > > > > > > aware of the underlying device and driver. So we can > > > > > > > > > directly include a header from the driver. I suggest to > > > > > > > > > retrieve a handler for the device which is not a port id > > > > > > > > > and will allow to call ixgbe functions > > > > > directly. > > > > > > > > > It could be achieved by adding an ethdev function like > > > > > > > > > discussed > > > here: > > > > > > > > > http://dpdk.org/ml/archives/dev/2016- > > > September/047392.html > > > > > > > > > > > > > > > > > > > > > > > > > I have been reading the net/vhost mail thread above. The > > > > > > > > following quote > > > > > > > is from this thread. > > > > > > > > > > > > > > > > "It means I would be in favor of introducing API in > > > > > > > > drivers for very specific > > > > > > > features." > > > > > > > > > > > > > > > > At present all the PMD functions are accessed through the > > > > > > > > eth_dev_ops > > > > > > > structure, there are no PMD API's. > > > > > > > > > > > > > > > > Is your proposal to add API(s) to the DPDK ixgbe PMD > > > > > > > > (similar to a driver > > > > > > > ioctl API) which can be accessed through a generic API in the > ethdev? > > > > > > > > > > > > > > Not exactly. I'm thinking about a PMD specific API. > > > > > > > The only ethdev API you need would be a function to retrieve > > > > > > > a handler (an opaque pointer on the device struct) from the > port id. > > > > > > > Then you can include rte_ixgbe.h and directly call the > > > > > > > specific ixgbe function, passing the device handler. > > > > > > > How does it sound? > > > > > > > > > > > > I have been prototyping this proposed solution, it appears to > work. > > > > > > > > > > > > I have added the following function: > > > > > > > > > > > > int rte_eth_dev_get_pmd_handle(uint8_t port_id, void** > > > > > > pmd_handle); > > > > > > > > > > > > The pmd_handle is a pointer to a dev_ops structure containing > > > > > > driver > > > > > specific functions. > > > > > > > > > > > > Using the pmd_handle the driver specific functions can be > > > > > > called (without having them in struct eth_dev_ops) > > > > > > > > > > > > Has this proposal been superseded by the discussion on the > > > > > > following > > > > > patch? > > > > > > > > > > > > [PATCH] net/vhost: Add function to retreive the 'vid' for a > > > > > > given port id > > > > > > > > > > Maybe, it can be superseded by this discussion, yes. > > > > > Bruce thinks we do not need rte_eth_dev_get_pmd_handle(). > > > > > What is your opinion about using port_id directly and retrieving > > > > > the structs from the driver via rte_eth_devices? > > > > > > > > Looking at the code in rte_eth_devices[] > > > > > > > > struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS]; > > > > > > > > struct rte_eth_dev { > > > > > > > > ... > > > > > > > > const struct eth_dev_ops *dev_ops; /**< Functions exported by PMD > > > > */ > > > > > > > > ... > > > > > > > > void *pmd_ops; /** < exported PMD specific functions */ > > > > > > > > } > > > > > > > > The PMD functions are only accessible at present if they are in > > > > struct > > > eth_dev_ops. > > > > > > > > Adding a pmd_ops field to struct rte_eth_dev {} makes the PMD > > > > functions > > > accessible and is a simpler solution than using > > > rte_eth_dev_get_pmd_handle() to get access to the PMD functions. > > > > > > > > Regards, > > > > > > > > > > Why would an ops structure be needed? If it's a private API for a > > > driver, there should be no need for function pointers, and instead > > > the driver can define regular functions in it's header file, no? > > > > > > /Bruce > > > > The driver functions were static, I have made them public and added > > them to the rte_pmd_ixgbe.h file, and it works. These functions will > also need to be added to the rte_pmd_ixgbe_version.map file, previously > there were no public functions. >=20 > Sorry for being late in the discussion, but I have a question: > If we this way (force user to include driver specific headers and call > driver specific functions), how you guys plan to make this functionality > available for multiple driver types. > From discussion with Bernard understand that customers would need simila= r > functionality for i40e. > Does it mean that they'll have to re-implement this part of their code > again? > Or would have to create (and maintain) their own shim layer that would > provide some s of abstraction? > Basically their own version of rte_ethdev? >=20 > Konstantin Yes, it's a problem. However, the concern right now is that this functional= ity is not generic across NICs, and will only be implemented for ixgbe and = i40e. If it turns out to be implemented for a significant subset of DPDK PM= Ds, e.g. 4 or 5 drivers from at least two vendors, then I would expect the = case to be made to "promote" this functionality to ethdev. Does this sound reasonable. Any other thoughts? /Bruce