From: Neil Horman
Subject: Re: [PATCH] Scheduler: add driver for scheduler crypto pmd
Date: Thu, 8 Dec 2016 09:57:28 -0500
Message-ID: <20161208145728.GA4657@hmsreliant.think-freely.org>
References: <1480688123-39494-1-git-send-email-roy.fan.zhang@intel.com>
 <8047937.9v81RFizFU@xps13>
 <20161202145730.GA322432@bricha3-MOBL3.ger.corp.intel.com>
 <63671b1d-52e0-e653-1323-5d9513c0b9dc@intel.com>
 <20161205151209.GA4232@hmsreliant.think-freely.org>
 <558a1817-9c81-5e5f-b1e2-b71934772631@intel.com>
 <20161207141656.GA31938@neilslaptop.think-freely.org>
 <59AF69C657FD0841A61C55336867B5B035B54481@IRSMSX103.ger.corp.intel.com>
To: Declan Doherty
Cc: "Richardson, Bruce", Thomas Monjalon, "Zhang, Roy Fan", dev@dpdk.org
List-Id: DPDK patches and discussions

On Wed, Dec 07, 2016 at 04:04:17PM +0000, Declan Doherty wrote:
> On 07/12/16 14:46, Richardson, Bruce wrote:
> >
> > > -----Original Message-----
> > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > Sent: Wednesday, December 7, 2016 2:17 PM
> > > To: Doherty, Declan
> > > Cc: Richardson, Bruce; Thomas Monjalon; Zhang, Roy Fan; dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH] Scheduler: add driver for scheduler
> > > crypto pmd
> > >
> > > On Wed, Dec 07, 2016 at 12:42:15PM +0000, Declan Doherty wrote:
> > > > On 05/12/16 15:12, Neil Horman wrote:
> > > > > On Fri, Dec 02, 2016 at 04:22:16PM +0000, Declan Doherty wrote:
> > > > > > On 02/12/16 14:57, Bruce Richardson wrote:
> > > > > > > On Fri, Dec 02, 2016 at 03:31:24PM +0100, Thomas Monjalon wrote:
> > > > > > > > 2016-12-02 14:15, Fan Zhang:
> > > > > > > > > This patch provides the initial implementation of the
> > > > > > > > > scheduler poll mode driver using the DPDK cryptodev
> > > > > > > > > framework.
> > > > > > > > >
> > > > > > > > > The scheduler PMD is used to schedule and enqueue the
> > > > > > > > > crypto ops to the hardware and/or software crypto devices
> > > > > > > > > attached to it (slaves). The dequeue operation from the
> > > > > > > > > slave(s), and the possible reordering of the dequeued
> > > > > > > > > crypto ops, are then carried out by the scheduler.
> > > > > > > > >
> > > > > > > > > The scheduler PMD can be used to fill the throughput gap
> > > > > > > > > between the physical core and the existing cryptodevs to
> > > > > > > > > increase the overall performance. For example, if a
> > > > > > > > > physical core has a higher crypto op processing rate than
> > > > > > > > > a cryptodev, the scheduler PMD can be introduced to attach
> > > > > > > > > more than one cryptodev.
> > > > > > > > >
> > > > > > > > > This initial implementation is limited to supporting the
> > > > > > > > > following scheduling modes:
> > > > > > > > >
> > > > > > > > > - CRYPTO_SCHED_SW_ROUND_ROBIN_MODE (round robin amongst the
> > > > > > > > >   attached software slave cryptodevs; to set this mode, the
> > > > > > > > >   scheduler should have 1 or more software cryptodevs
> > > > > > > > >   attached)
> > > > > > > > >
> > > > > > > > > - CRYPTO_SCHED_HW_ROUND_ROBIN_MODE (round robin amongst the
> > > > > > > > >   attached hardware slave cryptodevs (QAT); to set this
> > > > > > > > >   mode, the scheduler should have 1 or more QATs attached)
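As a quick aside, my reading of the round-robin modes above is that they
amount to something like the sketch below on the enqueue side; the context
struct and all names here are illustrative guesses on my part, not the
patch's actual internals:

#include <rte_cryptodev.h>

struct rr_sched_ctx {
    uint8_t slave_ids[8];    /* cryptodev ids of the attached slaves */
    unsigned int nb_slaves;  /* number of valid entries above */
    unsigned int next;       /* slave that receives the next burst */
};

static uint16_t
rr_sched_enqueue(struct rr_sched_ctx *ctx, uint16_t qp_id,
        struct rte_crypto_op **ops, uint16_t nb_ops)
{
    /* forward the whole burst to the current slave */
    uint16_t sent = rte_cryptodev_enqueue_burst(
            ctx->slave_ids[ctx->next], qp_id, ops, nb_ops);

    /* rotate to the next slave for the following burst */
    ctx->next = (ctx->next + 1) % ctx->nb_slaves;
    return sent;
}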
> > > > > > > >
> > > > > > > > Could it be implemented on top of the eventdev API?
> > > > > > > >
> > > > > > > Not really. The eventdev API is for different types of
> > > > > > > scheduling between multiple sources that are all polling for
> > > > > > > packets, compared to this, which is more analogous - as I
> > > > > > > understand it - to the bonding PMD for ethdev.
> > > > > > >
> > > > > > > To make something like this work with an eventdev API you
> > > > > > > would need to use one of the following models:
> > > > > > > * have worker cores for offloading packets to the different
> > > > > > >   crypto blocks pulling from the eventdev APIs. This would
> > > > > > >   make it difficult to do any "smart" scheduling of crypto
> > > > > > >   operations between the blocks, e.g. that one crypto instance
> > > > > > >   may be better at certain types of operations than another.
> > > > > > > * move the logic in this driver into an existing eventdev
> > > > > > >   instance, which uses the eventdev API rather than the crypto
> > > > > > >   APIs and so has an extra level of "structure abstraction"
> > > > > > >   that has to be worked through. It's just not really a good
> > > > > > >   fit.
> > > > > > >
> > > > > > > So for this workload, I believe the pseudo-cryptodev instance
> > > > > > > is the best way to go.
> > > > > > >
> > > > > > > /Bruce
> > > > > > >
> > > > > >
> > > > > > As Bruce says, this is much more analogous to the ethdev bonding
> > > > > > driver; the main idea is to allow different crypto op scheduling
> > > > > > mechanisms to be defined transparently to an application. This
> > > > > > could be load-balancing across multiple hw crypto devices, or
> > > > > > having a software crypto device act as a backup device for a hw
> > > > > > accelerator if it becomes oversubscribed. I think the main
> > > > > > advantage of a crypto-scheduler approach is that the data path
> > > > > > of the application doesn't need to have any knowledge that
> > > > > > scheduling is happening at all; it is just using a different
> > > > > > crypto device id, which then manages the distribution of the
> > > > > > crypto work.
> > > > > >
> > > > > This is a good deal like the bonding pmd, and so from a certain
> > > > > standpoint it makes sense to do this, but whereas the bonding pmd
> > > > > is meant to create a single path to a logical network over several
> > > > > physical networks, this pmd really only focuses on maximizing
> > > > > throughput, and for that we already have tools. As Thomas
> > > > > mentions, there is the eventdev library, but from my view the
> > > > > distributor library already fits this bill. It already is a basic
> > > > > framework to process mbufs in parallel according to whatever
> > > > > policy you want to implement, which sounds like exactly what the
> > > > > goal of this pmd is.
> > > > >
> > > > > Neil
> > > > >
> > > >
> > > > Hey Neil,
> > > >
> > > > this is actually intended to act and look a good deal like the
> > > > ethernet bonding device, but to handle the crypto scheduling use
> > > > cases.
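The hw-with-sw-backup arrangement Declan mentions above would, as I
understand it, reduce to something like this on the enqueue side (again a
sketch with assumed device ids and a shared qp_id, not anyone's actual
code):

#include <rte_cryptodev.h>

/*
 * Sketch of the oversubscription/backup idea: try the hw slave first
 * and overflow whatever it could not accept to the sw slave.
 */
static uint16_t
failover_enqueue(uint8_t hw_id, uint8_t sw_id, uint16_t qp_id,
        struct rte_crypto_op **ops, uint16_t nb_ops)
{
    uint16_t sent = rte_cryptodev_enqueue_burst(hw_id, qp_id,
            ops, nb_ops);

    if (sent < nb_ops) /* hw queue full: push the rest to the sw slave */
        sent += rte_cryptodev_enqueue_burst(sw_id, qp_id,
                &ops[sent], nb_ops - sent);
    return sent;
}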
> > > > For example, take the case where multiple hw accelerators may be
> > > > available. We want to provide user applications with a mechanism to
> > > > transparently balance work across all devices without having to
> > > > manage the load balancing details or guarantee the ordering of the
> > > > processed ops on the dequeue_burst side. In this case the
> > > > application would just use the crypto dev_id of the scheduler, and
> > > > it would look after balancing the workload across the available hw
> > > > accelerators.
> > > >
> > > >          +-------------------+
> > > >          |  Crypto Sch PMD   |
> > > >          |                   |
> > > >          | ORDERING / RR SCH |
> > > >          +-------------------+
> > > >            ^       ^       ^
> > > >            |       |       |
> > > >      +-----+       |       +-----+
> > > >      |             |             |
> > > >      V             V             V
> > > > +---------------+ +---------------+ +---------------+
> > > > | Crypto HW PMD | | Crypto HW PMD | | Crypto HW PMD |
> > > > +---------------+ +---------------+ +---------------+
> > > >
> > > > Another use case we hope to support is migration of processing from
> > > > one device to another, where a hw and a sw crypto pmd can be bound
> > > > to the same crypto scheduler and the crypto processing could be
> > > > transparently migrated from the hw to the sw pmd. This would allow
> > > > hw accelerators to be hot-plug attached/detached in a Guest VM.
> > > >
> > > >    +----------------+
> > > >    | Crypto Sch PMD |
> > > >    |                |
> > > >    | MIGRATION SCH  |
> > > >    +----------------+
> > > >        |        |
> > > >        |        +--------+
> > > >        |                 |
> > > >        V                 V
> > > > +---------------+ +---------------+
> > > > | Crypto HW PMD | | Crypto SW PMD |
> > > > |   (Active)    | |  (Inactive)   |
> > > > +---------------+ +---------------+
> > > >
> > > > The main point is that this isn't envisaged as just a mechanism for
> > > > scheduling crypto workloads across multiple cores, but a framework
> > > > for allowing different scheduling mechanisms to be introduced, to
> > > > handle different crypto scheduling problems, and to do so in a way
> > > > which is completely transparent to the data path of an application.
> > > > Like the eth bonding driver, we want to support creating the crypto
> > > > scheduler from EAL options, which allow specification of the
> > > > scheduling mode and the crypto pmds which are to be bound to that
> > > > crypto scheduler.
> > > >
> > > I get what it's for, that much is pretty clear. But whereas the
> > > bonding driver benefits from creating a single device interface for
> > > the purposes of properly routing traffic through the network stack
> > > without exposing that complexity to the using application, this pmd
> > > provides only aggregation according to various policies. This is
> > > exactly what the distributor library was built for, and it seems like
> > > a re-invention of the wheel to ignore that. At the very least, you
> > > should implement this pmd on top of the distributor library. If that
> > > is impractical, then I somewhat question why we have the distributor
> > > library at all.
> > >
> > > Neil
> > >
> >
> > Hi Neil,
> >
> > The distributor library and the eventdev framework are not the solution
> > here, as, firstly, the crypto devices are not cores, in the same way
> > that ethdevs are not cores, and the distributor library is for evenly
> > distributing work among cores. Sure, some crypto implementations may be
> > software only, but many aren't, and those that are software still
> > appear as a device to software and must be used as if they were a HW
> > device. In the same way that using the distributor to load balance
> > traffic between various TX ports is not a suitable solution - because
> > you need to use cores to do the work "bridging" between the
> > distributor/eventdev and the ethdev device - similarly here, if we
> > distribute traffic using the distributor, you need cores to pull those
> > packets from the distributor and offload them to the crypto devices. To
> > use the distributor library in place of this vpmd, we'd need crypto
> > devices which are aware of how to talk to the distributor, and use its
> > protocols for pushing/pulling packets, or else we are pulling in extra
> > core cycles to do the bridging work.
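The bridging cost Bruce describes is the key point here: with the
distributor in the middle, each crypto device would need a core running
roughly the loop below purely to shuttle work along. A sketch under assumed
names, with session attach and error handling elided:

#include <rte_crypto.h>
#include <rte_cryptodev.h>
#include <rte_distributor.h>

static void
bridge_worker(struct rte_distributor *d, unsigned worker_id,
        struct rte_mempool *op_pool, uint8_t cdev_id, uint16_t qp_id)
{
    struct rte_mbuf *pkt = NULL;
    struct rte_crypto_op *op;

    for (;;) {
        /* spin until the distributor hands this core a packet */
        pkt = rte_distributor_get_pkt(d, worker_id, pkt);

        /* wrap the packet in a crypto op (session attach and
         * failure handling omitted for brevity) */
        op = rte_crypto_op_alloc(op_pool, RTE_CRYPTO_OP_TYPE_SYMMETRIC);
        op->sym->m_src = pkt;

        /* only now does the work reach the real crypto device */
        rte_cryptodev_enqueue_burst(cdev_id, qp_id, &op, 1);
    }
}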
> > Secondly, the distributor and eventdev libraries are designed for
> > doing flow based (generally atomic) packet distribution. Load
> > balancing between crypto devices is not generally based on flows, but
> > rather on other factors like packet size, offload cost per device,
> > etc. To the distributor/eventdev, all workers are equal, but for
> > working with devices, for crypto offload or nic transmission, that is
> > plainly not the case. In short, the distribution problems that are
> > being solved by the distributor and eventdev libraries are
> > fundamentally different from those being solved by this vpmd. They
> > would be the wrong tool for the job.
> >
> > I would agree with the previous statements that this driver is far
> > closer in functionality to the bonded ethdev driver than anything
> > else. It makes multiple devices appear as a single one while hiding
> > the complexity of the multiple devices from the using application. In
> > the same way as the bonded ethdev driver has different modes for
> > active-backup, and for active-active for increased throughput, this
> > vpmd for crypto can have the exact same modes - multiple active bonded
> > devices for higher performance operation, or two devices in
> > active-backup to enable migration when using SR-IOV as described by
> > Declan above.
> >
> > Regards,
> > /Bruce
> >
>
> I think that having scheduler in the pmd name here may be somewhat of a
> loaded term and is muddying the waters of the problem we are trying to
> address, and I think if we were to rename this to crypto_bond_pmd it may
> make our intent for what we want this pmd to achieve clearer.
>
> Neil, in most of the initial scheduling use cases we want to address
> with this pmd, we are looking to schedule within the context of a single
> lcore on multiple hw accelerators, or a mix of hw accelerators and sw
> pmds, and therefore using the distributor or the eventdev wouldn't add a
> lot of value.
>
> Declan

Ok, these are fair points, and I'll concede to them. That said, it still
seems like a waste to me to ignore the 80% functionality overlap to be had
here. That is to say, the distributor library does a lot of work that both
this pmd and the bonding pmd could benefit from. Perhaps it's worth looking
at how to enhance the distributor library such that worker tasks can be
affined to a single cpu, and the worker assignment can be used as an
indexed device assignment (the idea being that a single worker task might
represent multiple worker ids in the distributor library). That way such a
crypto aggregator pmd or the bonding pmd's implementation is little more
than setting tags in mbufs according to the appropriate policy.

Neil
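P.S.: to make the tagging idea concrete - assuming the enhanced
distributor could pin each tag to a fixed worker (and thereby to a fixed
device), the pmd's enqueue path would reduce to stamping a per-mbuf tag
from its policy and handing the burst off. A sketch only; the tag values
and the size-based policy below are illustrative assumptions, not an
existing DPDK API:

#include <rte_distributor.h>
#include <rte_mbuf.h>

#define SW_DEV_TAG 0 /* hypothetical tag pinned to a sw pmd worker */
#define HW_DEV_TAG 1 /* hypothetical tag pinned to a hw accel worker */

static void
tag_and_distribute(struct rte_distributor *d, struct rte_mbuf **pkts,
        unsigned int nb_pkts)
{
    unsigned int i;

    for (i = 0; i < nb_pkts; i++)
        /* policy hook: e.g. small packets to the sw pmd, large
         * ones to the hw accelerator (packet size being one of
         * the factors Bruce mentions above) */
        pkts[i]->hash.usr = pkts[i]->pkt_len < 512 ?
                SW_DEV_TAG : HW_DEV_TAG;

    /* the distributor then routes each packet to the worker
     * (device) owning its tag */
    rte_distributor_process(d, pkts, nb_pkts);
}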