From: "Eads, Gage"
Subject: Re: [RFC PATCH] eventdev: add buffered enqueue and flush APIs
Date: Mon, 12 Dec 2016 17:56:32 +0000
To: Jerin Jacob
Cc: "dev@dpdk.org", "Richardson, Bruce", "Van Haaren, Harry", "hemant.agrawal@nxp.com"
Message-ID: <9184057F7FC11744A2107296B6B8EB1E01E38655@FMSMSX108.amr.corp.intel.com>
In-Reply-To: <20161208044139.GA24793@svelivela-lt.caveonetworks.com>

> -----Original Message-----
> From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> Sent: Wednesday, December 7, 2016 10:42 PM
> To: Eads, Gage
> Cc: dev@dpdk.org; Richardson, Bruce; Van Haaren, Harry; hemant.agrawal@nxp.com
> Subject: Re: [RFC PATCH] eventdev: add buffered enqueue and flush APIs
>
> On Mon, Dec 05, 2016 at 11:30:46PM +0000, Eads, Gage wrote:
> >
> > > On Dec 3, 2016, at 5:18 AM, Jerin Jacob wrote:
> > >
> > >> On Fri, Dec 02, 2016 at 01:45:56PM -0600, Gage Eads wrote:
> > >> This commit adds buffered enqueue functionality to the eventdev API.
> > >> It is conceptually similar to the ethdev API's tx buffering,
> > >> however with a smaller API surface and no dropping of events.
> > >
> > > Hello Gage,
> > > Different implementations may have different strategies to hold the
> > > buffers.
> >
> > A benefit of inlining the buffering logic in the header is that we
> > avoid the overhead of entering the PMD for what is a fairly simple
> > operation (common case: add event to an array, increment counter). If
> > we make this implementation-defined (i.e. use PMD callbacks), we lose
> > that benefit.
>
> In general, I agree from the system perspective. But there are a few
> general issues with the eventdev integration part:
>
> 1) What if the burst has ATOMIC flows and we are NOT enqueuing to the
> implementation? Then other event ports won't get the packets from the
> same ATOMIC tag. BAD. Right?

I'm not sure what scenario you're describing here. The buffered (as
implemented in my patch) and non-buffered enqueue operations are
functionally the same (as long as the buffer is flushed); the difference
lies in when the events are moved from the application level to the PMD.

> 2) At least in our HW implementation, the event buffer strategy is more
> like: you get events from dequeue ONLY if you enqueue to HW, provided
> op == RTE_EVENT_OP_FORWARD. So it will create deadlock, i.e. the
> application cannot hold events with RTE_EVENT_OP_FORWARD.

If I'm reading this correctly, you're concerned that buffered events can
result in deadlock if they're not flushed. Whether the buffering is done in
the app itself, inline in the API, or in the PMDs, not flushing the buffer
is an application bug. E.g. the app could be fixed by flushing its enqueue
buffer after processing every dequeued burst of events, or only when
dequeue returns 0 events.
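
To make that flush placement concrete, here is a rough, untested sketch of
the application-level variant, assuming the rte_event_enqueue_burst() /
rte_event_dequeue_burst() prototypes from the eventdev proposal. The
app_ev_buffer struct and app_* helpers are just illustrative names, not
part of any proposed API:

#include <rte_eventdev.h>

#define APP_EV_BUF_SIZE 64

struct app_ev_buffer {
	struct rte_event ev[APP_EV_BUF_SIZE];
	uint16_t count;
};

/* Push everything currently buffered to the PMD. A real app would also
 * handle persistent back-pressure instead of spinning here.
 */
static inline void
app_flush_events(uint8_t dev_id, uint8_t port_id, struct app_ev_buffer *buf)
{
	uint16_t n = 0;

	while (n < buf->count)
		n += rte_event_enqueue_burst(dev_id, port_id, &buf->ev[n],
					     buf->count - n);
	buf->count = 0;
}

/* Common case: add the event to an array and increment a counter; only
 * enter the PMD when the buffer fills up.
 */
static inline void
app_buffer_event(uint8_t dev_id, uint8_t port_id,
		 struct app_ev_buffer *buf, const struct rte_event *ev)
{
	buf->ev[buf->count++] = *ev;

	if (buf->count == APP_EV_BUF_SIZE)
		app_flush_events(dev_id, port_id, buf);
}

static int
worker(uint8_t dev_id, uint8_t port_id)
{
	struct app_ev_buffer buf = { .count = 0 };
	struct rte_event ev[APP_EV_BUF_SIZE];

	for (;;) {
		uint16_t i, nb_rx;

		nb_rx = rte_event_dequeue_burst(dev_id, port_id, ev,
						APP_EV_BUF_SIZE, 0);

		for (i = 0; i < nb_rx; i++) {
			/* ... process ev[i] ... */
			ev[i].op = RTE_EVENT_OP_FORWARD;
			app_buffer_event(dev_id, port_id, &buf, &ev[i]);
		}

		/* Flushing after every dequeued burst (including when
		 * dequeue returns 0 events) guarantees no FORWARD event is
		 * held indefinitely, avoiding the deadlock described above.
		 */
		app_flush_events(dev_id, port_id, &buf);
	}

	return 0;
}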

> 3) So, considering the above case, there is nothing like flush for us.
> 4) In a real high-throughput benchmark case, we will get packets at the
> rate of max burst, and then we always need to memcpy before we flush.
> Otherwise there will be an ordering issue, as a burst can get us packets
> from different flows (unlike polling mode).

I take it you're referring to the memcpy in the patch, and not an
additional memcpy? At any rate, I'm hoping that SIMD instructions can
optimize the 16B event copy.

> > > and some do not need to hold the buffers if it is DDR backed.
> >
> > Though DDR-backed hardware doesn't need to buffer in software, doing
> > so would reduce the software overhead of enqueueing. Compared to N
> > individual calls to enqueue, buffering N events then calling enqueue
> > burst once can benefit from amortized (or parallelized) PMD-specific
> > bookkeeping and error-checking across the set of events, and will
> > definitely benefit from the amortized function call overhead and
> > better I-cache behavior. (Essentially this is VPP from the fd.io
> > project.) This should result in higher overall event throughput
> > (agnostic of the underlying device).
>
> See above. I am not against burst processing in the "application".
> The flush does not make sense for us from the HW perspective, and it is
> costly for us if we try to generalize it.

Besides the data copy that buffering requires, are there additional costs
from your perspective?

> > I'm skeptical that other buffering strategies would emerge, but I can
> > only speculate on Cavium/NXP/etc. NPU software.
> >
> > > IMHO, this may not be a candidate for common code. I guess you can
> > > move this to the driver side and abstract it under the SW driver's
> > > enqueue_burst.
> >
> > I don't think that will work without adding a flush API, otherwise we
> > could have indefinitely buffered events. I see three ways forward:
>
> I agree. The more portable way is to move the "flush" to the
> implementation and "flush" whenever it makes sense to the PMD.
>
> > - The proposed approach
> > - Add the proposed functions but make them implementation-specific.
> > - Require the application to write its own buffering logic (i.e. no
> >   API change)
>
> I think, if the additional function call overhead cost is too much for a
> SW implementation, then we can think of an implementation-specific API or
> a custom application flow based on the SW driver.
>
> But I am not a fan of that (though tempted nowadays). If we take that
> route, we have a truckload of custom implementation-specific APIs, and
> then we try to hide all the black magic under enqueue/dequeue to make it
> portable at some expense.

Agreed, it's not worth special-casing the API with this relatively minor
addition.

Thanks,
Gage