From: Andrew Rybchenko
Subject: Re: [PATCH v2 2/2] mempool/nb_stack: add non-blocking stack mempool
Date: Wed, 16 Jan 2019 10:13:26 +0300
Message-ID: <12e1a42b-5973-748f-e6d0-32a453e7b744@solarflare.com>
References: <20190110205538.24435-1-gage.eads@intel.com>
 <20190115223232.31866-1-gage.eads@intel.com>
 <20190115223232.31866-3-gage.eads@intel.com>
In-Reply-To: <20190115223232.31866-3-gage.eads@intel.com>
To: Gage Eads
List-Id: DPDK patches and discussions

On 1/16/19 1:32 AM, Gage Eads wrote:
> This commit adds support for a non-blocking (linked list based) stack mempool
> handler. The stack uses a 128-bit compare-and-swap instruction, and thus is
> limited to x86_64. The 128-bit CAS atomically updates the stack top pointer
> and a modification counter, which protects against the ABA problem.
>
> In mempool_perf_autotest the lock-based stack outperforms the non-blocking
> handler*, however:
> - For applications with preemptible pthreads, a lock-based stack's
>   worst-case performance (i.e. one thread being preempted while
>   holding the spinlock) is much worse than the non-blocking stack's.
> - Using per-thread mempool caches will largely mitigate the performance
>   difference.
>
> *Test setup: x86_64 build with default config, dual-socket Xeon E5-2699 v4,
> running on isolcpus cores with a tickless scheduler. The lock-based stack's
> rate_persec was 1x-3.5x the non-blocking stack's.
>
> Signed-off-by: Gage Eads
> ---

A few minor nits below.
Other than that,
Acked-by: Andrew Rybchenko

Don't forget about release notes when the 19.05 release cycle starts.

[snip]

> diff --git a/drivers/mempool/meson.build b/drivers/mempool/meson.build
> index 4527d9806..01ee30fee 100644
> --- a/drivers/mempool/meson.build
> +++ b/drivers/mempool/meson.build
> @@ -2,6 +2,11 @@
>  # Copyright(c) 2017 Intel Corporation
>
>  drivers = ['bucket', 'dpaa', 'dpaa2', 'octeontx', 'ring', 'stack']
> +
> +if dpdk_conf.has('RTE_ARCH_X86_64')
> +	drivers += 'nb_stack'
> +endif
> +

I think it would be better to concentrate the logic inside
nb_stack/meson.build. There is a 'build' variable which may be set to
false to disable the build. You can find an example in
drivers/net/sfc/meson.build.

[snip]

> +static __rte_always_inline void
> +nb_lifo_push(struct nb_lifo *lifo,
> +	     struct nb_lifo_elem *first,
> +	     struct nb_lifo_elem *last,
> +	     unsigned int num)
> +{
> +	while (1) {
> +		struct nb_lifo_head old_head, new_head;
> +
> +		old_head = lifo->head;
> +
> +		/* Swing the top pointer to the first element in the list and
> +		 * make the last element point to the old top.
> +		 */
> +		new_head.top = first;
> +		new_head.cnt = old_head.cnt + 1;
> +
> +		last->next = old_head.top;
> +
> +		if (rte_atomic128_cmpset((volatile uint64_t *) &lifo->head,

Unnecessary space after the type cast above.

[snip]

> +		new_head.top = tmp;
> +		new_head.cnt = old_head.cnt + 1;
> +
> +		if (rte_atomic128_cmpset((volatile uint64_t *) &lifo->head,

Unnecessary space after the type cast above.

[snip]
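For reference, moving the arch check into the per-driver file could look roughly like the sketch below. This is only an illustration of the suggested pattern (mirroring how drivers/net/sfc/meson.build uses the 'build' variable); the source file name is assumed from the driver naming convention, not taken from the patch:

```
# drivers/mempool/nb_stack/meson.build (sketch, names assumed)
if not dpdk_conf.has('RTE_ARCH_X86_64')
	build = false
endif

sources = files('rte_mempool_nb_stack.c')
```

The top-level drivers/mempool/meson.build would then list 'nb_stack' unconditionally, keeping the arch-specific logic in one place.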