From: Andrew Rybchenko
Subject: Re: [PATCH v2 2/2] mempool/nb_stack: add non-blocking stack mempool
Date: Wed, 16 Jan 2019 10:13:26 +0300
Message-ID: <12e1a42b-5973-748f-e6d0-32a453e7b744@solarflare.com>
References: <20190110205538.24435-1-gage.eads@intel.com>
 <20190115223232.31866-1-gage.eads@intel.com>
 <20190115223232.31866-3-gage.eads@intel.com>
In-Reply-To: <20190115223232.31866-3-gage.eads@intel.com>
To: Gage Eads
List-Id: DPDK patches and discussions

On 1/16/19 1:32 AM, Gage Eads wrote:
> This commit adds support for a non-blocking (linked list based) stack mempool
> handler. The stack uses a 128-bit compare-and-swap instruction, and thus is
> limited to x86_64. The 128-bit CAS atomically updates the stack top pointer
> and a modification counter, which protects against the ABA problem.
>
> In mempool_perf_autotest the lock-based stack outperforms the non-blocking
> handler*, however:
> - For applications with preemptible pthreads, a lock-based stack's
>   worst-case performance (i.e. one thread being preempted while
>   holding the spinlock) is much worse than the non-blocking stack's.
> - Using per-thread mempool caches will largely mitigate the performance
>   difference.
>
> *Test setup: x86_64 build with default config, dual-socket Xeon E5-2699 v4,
> running on isolcpus cores with a tickless scheduler. The lock-based stack's
> rate_persec was 1x-3.5x the non-blocking stack's.
>
> Signed-off-by: Gage Eads
> ---

A few minor nits below.
Other than that,
Acked-by: Andrew Rybchenko

Don't forget about release notes when the 19.05 release cycle starts.

[snip]

> diff --git a/drivers/mempool/meson.build b/drivers/mempool/meson.build
> index 4527d9806..01ee30fee 100644
> --- a/drivers/mempool/meson.build
> +++ b/drivers/mempool/meson.build
> @@ -2,6 +2,11 @@
>  # Copyright(c) 2017 Intel Corporation
>
>  drivers = ['bucket', 'dpaa', 'dpaa2', 'octeontx', 'ring', 'stack']
> +
> +if dpdk_conf.has('RTE_ARCH_X86_64')
> +	drivers += 'nb_stack'
> +endif
> +

I think it would be better to concentrate the logic inside
nb_stack/meson.build. There is a 'build' variable which may be set to
false to disable the build. You can find an example in
drivers/net/sfc/meson.build.

[snip]

> +static __rte_always_inline void
> +nb_lifo_push(struct nb_lifo *lifo,
> +	     struct nb_lifo_elem *first,
> +	     struct nb_lifo_elem *last,
> +	     unsigned int num)
> +{
> +	while (1) {
> +		struct nb_lifo_head old_head, new_head;
> +
> +		old_head = lifo->head;
> +
> +		/* Swing the top pointer to the first element in the list and
> +		 * make the last element point to the old top.
> +		 */
> +		new_head.top = first;
> +		new_head.cnt = old_head.cnt + 1;
> +
> +		last->next = old_head.top;
> +
> +		if (rte_atomic128_cmpset((volatile uint64_t *) &lifo->head,

Unnecessary space after the type cast above.

[snip]

> +		new_head.top = tmp;
> +		new_head.cnt = old_head.cnt + 1;
> +
> +		if (rte_atomic128_cmpset((volatile uint64_t *) &lifo->head,

Unnecessary space after the type cast above.

[snip]
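For reference, moving the arch check into the per-driver file could look roughly like the sketch below. This is only an illustration of the suggested pattern (mirroring how drivers/net/sfc/meson.build uses the 'build' variable); the source file name is assumed from the driver naming convention, not taken from the patch:

```
# drivers/mempool/nb_stack/meson.build (sketch, names assumed)
if not dpdk_conf.has('RTE_ARCH_X86_64')
	build = false
endif

sources = files('rte_mempool_nb_stack.c')
```

The top-level drivers/mempool/meson.build would then list 'nb_stack' unconditionally, keeping the arch-specific logic in one place.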