All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Eads, Gage" <gage.eads@intel.com>
To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Cc: "olivier.matz@6wind.com" <olivier.matz@6wind.com>,
	"arybchenko@solarflare.com" <arybchenko@solarflare.com>,
	"Richardson, Bruce" <bruce.richardson@intel.com>,
	"Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	"Gavin Hu (Arm Technology China)" <Gavin.Hu@arm.com>,
	nd <nd@arm.com>, "thomas@monjalon.net" <thomas@monjalon.net>,
	nd <nd@arm.com>, Thomas Monjalon <thomas@monjalon.net>
Subject: Re: [PATCH v3 6/8] stack: add C11 atomic implementation
Date: Fri, 29 Mar 2019 19:24:45 +0000	[thread overview]
Message-ID: <9184057F7FC11744A2107296B6B8EB1E5420D940@FMSMSX108.amr.corp.intel.com> (raw)
In-Reply-To: <VE1PR08MB5149F2B5E1A3AC800BC71D8698590@VE1PR08MB5149.eurprd08.prod.outlook.com>

[snip]

> > +static __rte_always_inline void
> > +__rte_stack_lf_push(struct rte_stack_lf_list *list,
> > +		    struct rte_stack_lf_elem *first,
> > +		    struct rte_stack_lf_elem *last,
> > +		    unsigned int num)
> > +{
> > +#ifndef RTE_ARCH_X86_64
> > +	RTE_SET_USED(first);
> > +	RTE_SET_USED(last);
> > +	RTE_SET_USED(list);
> > +	RTE_SET_USED(num);
> > +#else
> > +	struct rte_stack_lf_head old_head;
> > +	int success;
> > +
> > +	old_head = list->head;
> This can be a torn read (same as you have mentioned in
> __rte_stack_lf_pop). I suggest we use acquire thread fence here as well
> (please see the comments in __rte_stack_lf_pop).

Agreed. I'll add the acquire fence.

> > +
> > +	do {
> > +		struct rte_stack_lf_head new_head;
> > +
> We can add __atomic_thread_fence(__ATOMIC_ACQUIRE) here (please see
> the comments in __rte_stack_lf_pop).

Will add the fence here.

> > +		/* Swing the top pointer to the first element in the list and
> > +		 * make the last element point to the old top.
> > +		 */
> > +		new_head.top = first;
> > +		new_head.cnt = old_head.cnt + 1;
> > +
> > +		last->next = old_head.top;
> > +
> > +		/* Use the release memmodel to ensure the writes to the LF
> > LIFO
> > +		 * elements are visible before the head pointer write.
> > +		 */
> > +		success = rte_atomic128_cmp_exchange(
> > +				(rte_int128_t *)&list->head,
> > +				(rte_int128_t *)&old_head,
> > +				(rte_int128_t *)&new_head,
> > +				1, __ATOMIC_RELEASE,
> > +				__ATOMIC_RELAXED);
> Success memory order can be RELAXED as the store to list->len.cnt is
> RELEASE.

The RELEASE success order here ensures that the store to 'last->next' is visible before the head update. The RELEASE in the list->len.cnt store only guarantees that the preceding stores are visible before list->len.cnt's store, but doesn't guarantee any ordering between the 'last->next' store and the head update, so we can't rely on that.

> > +	} while (success == 0);
> > +
> > +	/* Ensure the stack modifications are not reordered with respect
> > +	 * to the LIFO len update.
> > +	 */
> > +	__atomic_add_fetch(&list->len.cnt, num, __ATOMIC_RELEASE);
> > #endif }
> > +
> > +static __rte_always_inline struct rte_stack_lf_elem *
> > +__rte_stack_lf_pop(struct rte_stack_lf_list *list,
> > +		   unsigned int num,
> > +		   void **obj_table,
> > +		   struct rte_stack_lf_elem **last) { #ifndef
> RTE_ARCH_X86_64
> > +	RTE_SET_USED(obj_table);
> > +	RTE_SET_USED(last);
> > +	RTE_SET_USED(list);
> > +	RTE_SET_USED(num);
> > +
> > +	return NULL;
> > +#else
> > +	struct rte_stack_lf_head old_head;
> > +	int success;
> > +
> > +	/* Reserve num elements, if available */
> > +	while (1) {
> > +		uint64_t len = __atomic_load_n(&list->len.cnt,
> > +					       __ATOMIC_ACQUIRE);
> This can be outside the loop.

Good idea. I'll move this out of the loop and change the atomic_compare_exchange_n's failure memory order to ACQUIRE.

> > +
> > +		/* Does the list contain enough elements? */
> > +		if (unlikely(len < num))
> > +			return NULL;
> > +
> > +		if (__atomic_compare_exchange_n(&list->len.cnt,
> > +						&len, len - num,
> > +						0, __ATOMIC_RELAXED,
> > +						__ATOMIC_RELAXED))
> > +			break;
> > +	}
> > +
> > +#ifndef RTE_ARCH_X86_64
> > +	/* Use the acquire memmodel to ensure the reads to the LF LIFO
> > elements
> > +	 * are properly ordered with respect to the head pointer read.
> > +	 *
> > +	 * Note that for aarch64, GCC's implementation of __atomic_load_16
> > in
> > +	 * libatomic uses locks, and so this function should be replaced by
> > +	 * a new function (e.g. "rte_atomic128_load()").
> aarch64 does not have 'pure' atomic 128b load instructions. They have to be
> implemented using load/store exclusives.
>
> > +	 */
> > +	__atomic_load((volatile __int128 *)&list->head,
> > +		      &old_head,
> > +		      __ATOMIC_ACQUIRE);
> Since, we know of x86/aarch64 (power?) that cannot implement pure atomic
> 128b loads, should we just use relaxed reads and assume the possibility of
> torn reads for all architectures? Then, we can use acquire fence to prevent
> the reordering (see below)

That's a cleaner solution. I'll implement that, dropping the architecture distinction.

> > +#else
> > +	/* x86-64 does not require an atomic load here; if a torn read
> > +occurs,
> IMO, we should not make architecture specific distinctions as this algorithm is
> based on C11 memory model.
>
> > +	 * the CAS will fail and set old_head to the correct/latest value.
> > +	 */
> > +	old_head = list->head;
> > +#endif
> > +
> > +	/* Pop num elements */
> > +	do {
> > +		struct rte_stack_lf_head new_head;
> > +		struct rte_stack_lf_elem *tmp;
> > +		unsigned int i;
> > +
> We can add __atomic_thread_fence(__ATOMIC_ACQUIRE) here.

Will do.

> > +		rte_prefetch0(old_head.top);
> > +
> > +		tmp = old_head.top;
> > +
> > +		/* Traverse the list to find the new head. A next pointer will
> > +		 * either point to another element or NULL; if a thread
> > +		 * encounters a pointer that has already been popped, the
> > CAS
> > +		 * will fail.
> > +		 */
> > +		for (i = 0; i < num && tmp != NULL; i++) {
> > +			rte_prefetch0(tmp->next);
> > +			if (obj_table)
> > +				obj_table[i] = tmp->data;
> > +			if (last)
> > +				*last = tmp;
> > +			tmp = tmp->next;
> > +		}
> > +
> > +		/* If NULL was encountered, the list was modified while
> > +		 * traversing it. Retry.
> > +		 */
> > +		if (i != num)
> > +			continue;
> > +
> > +		new_head.top = tmp;
> > +		new_head.cnt = old_head.cnt + 1;
> > +
> > +		success = rte_atomic128_cmp_exchange(
> > +				(rte_int128_t *)&list->head,
> > +				(rte_int128_t *)&old_head,
> > +				(rte_int128_t *)&new_head,
> > +				1, __ATOMIC_ACQUIRE,
> > +				__ATOMIC_ACQUIRE);
> The success order should be __ATOMIC_RELEASE as the write to list->len.cnt
> is relaxed.
> The failure order can be __ATOMIC_RELAXED if the thread fence is added.

Agreed on both counts. Will address.

> > +	} while (success == 0);
> > +
> > +	return old_head.top;
> > +#endif
> > +}
> > +
> > +#endif /* _RTE_STACK_C11_MEM_H_ */
> > --
> > 2.13.6

  reply	other threads:[~2019-03-29 19:24 UTC|newest]

Thread overview: 133+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-22 16:06 [PATCH 0/7] Subject: [PATCH ...] Add stack library and new mempool handler Gage Eads
2019-02-22 16:06 ` [PATCH 1/7] stack: introduce rte stack library Gage Eads
2019-02-25 10:43   ` Olivier Matz
2019-02-28  5:10     ` Eads, Gage
2019-02-22 16:06 ` [PATCH 2/7] mempool/stack: convert mempool to use rte stack Gage Eads
2019-02-25 10:46   ` Olivier Matz
2019-02-22 16:06 ` [PATCH 3/7] test/stack: add stack test Gage Eads
2019-02-25 10:59   ` Olivier Matz
2019-02-28  5:11     ` Eads, Gage
2019-02-22 16:06 ` [PATCH 4/7] test/stack: add stack perf test Gage Eads
2019-02-25 11:04   ` Olivier Matz
2019-02-22 16:06 ` [PATCH 5/7] stack: add non-blocking stack implementation Gage Eads
2019-02-25 11:28   ` Olivier Matz
     [not found]     ` <2EC44CCD3517A842B44C82651A5557A14AF13386@fmsmsx118.amr.corp.intel.com>
2019-03-01 20:53       ` FW: " Eads, Gage
2019-03-01 21:12         ` Thomas Monjalon
2019-03-01 21:29           ` Eads, Gage
2019-02-22 16:06 ` [PATCH 6/7] test/stack: add non-blocking stack tests Gage Eads
2019-02-25 11:28   ` Olivier Matz
2019-02-22 16:06 ` [PATCH 7/7] mempool/stack: add non-blocking stack mempool handler Gage Eads
2019-02-25 11:29   ` Olivier Matz
2019-03-05 16:42 ` [PATCH v2 0/8] Add stack library and new " Gage Eads
2019-03-05 16:42   ` [PATCH v2 1/8] stack: introduce rte stack library Gage Eads
2019-03-05 16:42   ` [PATCH v2 2/8] mempool/stack: convert mempool to use rte stack Gage Eads
2019-03-05 16:42   ` [PATCH v2 3/8] test/stack: add stack test Gage Eads
2019-03-05 16:42   ` [PATCH v2 4/8] test/stack: add stack perf test Gage Eads
2019-03-05 16:42   ` [PATCH v2 5/8] stack: add lock-free stack implementation Gage Eads
2019-03-05 16:42   ` [PATCH v2 6/8] stack: add C11 atomic implementation Gage Eads
2019-03-05 16:42   ` [PATCH v2 7/8] test/stack: add lock-free stack tests Gage Eads
2019-03-05 16:42   ` [PATCH v2 8/8] mempool/stack: add lock-free stack mempool handler Gage Eads
2019-03-06 14:45   ` [PATCH v3 0/8] Add stack library and new " Gage Eads
2019-03-06 14:45     ` [PATCH v3 1/8] stack: introduce rte stack library Gage Eads
2019-03-14  8:00       ` Olivier Matz
2019-03-28 23:26       ` Honnappa Nagarahalli
2019-03-29 19:23         ` Eads, Gage
2019-03-29 21:07           ` Thomas Monjalon
2019-04-01 17:41           ` Honnappa Nagarahalli
2019-04-01 19:34             ` Eads, Gage
2019-03-06 14:45     ` [PATCH v3 2/8] mempool/stack: convert mempool to use rte stack Gage Eads
2019-03-06 14:45     ` [PATCH v3 3/8] test/stack: add stack test Gage Eads
2019-03-14  8:00       ` Olivier Matz
2019-03-06 14:45     ` [PATCH v3 4/8] test/stack: add stack perf test Gage Eads
2019-03-06 14:45     ` [PATCH v3 5/8] stack: add lock-free stack implementation Gage Eads
2019-03-14  8:01       ` Olivier Matz
2019-03-28 23:27       ` Honnappa Nagarahalli
2019-03-29 19:25         ` Eads, Gage
2019-03-06 14:45     ` [PATCH v3 6/8] stack: add C11 atomic implementation Gage Eads
2019-03-14  8:04       ` Olivier Matz
2019-03-28 23:27       ` Honnappa Nagarahalli
2019-03-29 19:24         ` Eads, Gage [this message]
2019-04-01  0:06           ` Eads, Gage
2019-04-01 19:06             ` Honnappa Nagarahalli
2019-04-01 20:21               ` Eads, Gage
2019-03-06 14:45     ` [PATCH v3 7/8] test/stack: add lock-free stack tests Gage Eads
2019-03-06 14:45     ` [PATCH v3 8/8] mempool/stack: add lock-free stack mempool handler Gage Eads
2019-03-28 18:00     ` [PATCH v4 0/8] Add stack library and new " Gage Eads
2019-03-28 18:00       ` [PATCH v4 1/8] stack: introduce rte stack library Gage Eads
2019-03-28 18:00       ` [PATCH v4 2/8] mempool/stack: convert mempool to use rte stack Gage Eads
2019-03-28 18:00       ` [PATCH v4 3/8] test/stack: add stack test Gage Eads
2019-03-28 18:00       ` [PATCH v4 4/8] test/stack: add stack perf test Gage Eads
2019-03-28 18:00       ` [PATCH v4 5/8] stack: add lock-free stack implementation Gage Eads
2019-03-28 18:00       ` [PATCH v4 6/8] stack: add C11 atomic implementation Gage Eads
2019-03-28 18:00       ` [PATCH v4 7/8] test/stack: add lock-free stack tests Gage Eads
2019-03-28 18:00       ` [PATCH v4 8/8] mempool/stack: add lock-free stack mempool handler Gage Eads
2019-04-01  0:12       ` [PATCH v5 0/8] Add stack library and new " Gage Eads
2019-04-01  0:12         ` [PATCH v5 1/8] stack: introduce rte stack library Gage Eads
2019-04-01  0:12         ` [PATCH v5 2/8] mempool/stack: convert mempool to use rte stack Gage Eads
2019-04-01  0:12         ` [PATCH v5 3/8] test/stack: add stack test Gage Eads
2019-04-01  0:12         ` [PATCH v5 4/8] test/stack: add stack perf test Gage Eads
2019-04-01  0:12         ` [PATCH v5 5/8] stack: add lock-free stack implementation Gage Eads
2019-04-01 18:08           ` Honnappa Nagarahalli
2019-04-01  0:12         ` [PATCH v5 6/8] stack: add C11 atomic implementation Gage Eads
2019-04-01  0:12         ` [PATCH v5 7/8] test/stack: add lock-free stack tests Gage Eads
2019-04-01  0:12         ` [PATCH v5 8/8] mempool/stack: add lock-free stack mempool handler Gage Eads
2019-04-01 21:14         ` [PATCH v6 0/8] Add stack library and new " Gage Eads
2019-04-01 21:14           ` [PATCH v6 1/8] stack: introduce rte stack library Gage Eads
2019-04-02 11:14             ` Honnappa Nagarahalli
2019-04-03 17:06               ` Thomas Monjalon
2019-04-03 17:13                 ` Eads, Gage
2019-04-03 17:23                   ` Thomas Monjalon
2019-04-01 21:14           ` [PATCH v6 2/8] mempool/stack: convert mempool to use rte stack Gage Eads
2019-04-01 21:14           ` [PATCH v6 3/8] test/stack: add stack test Gage Eads
2019-04-01 21:14           ` [PATCH v6 4/8] test/stack: add stack perf test Gage Eads
2019-04-01 21:14           ` [PATCH v6 5/8] stack: add lock-free stack implementation Gage Eads
2019-04-01 21:14           ` [PATCH v6 6/8] stack: add C11 atomic implementation Gage Eads
2019-04-02 11:11             ` Honnappa Nagarahalli
2019-04-01 21:14           ` [PATCH v6 7/8] test/stack: add lock-free stack tests Gage Eads
2019-04-01 21:14           ` [PATCH v6 8/8] mempool/stack: add lock-free stack mempool handler Gage Eads
2019-04-03 17:04           ` [PATCH v6 0/8] Add stack library and new " Thomas Monjalon
2019-04-03 17:10             ` Eads, Gage
2019-04-03 20:09           ` [PATCH v7 " Gage Eads
2019-04-03 20:09             ` [PATCH v7 1/8] stack: introduce rte stack library Gage Eads
2019-04-03 20:09             ` [PATCH v7 2/8] mempool/stack: convert mempool to use rte stack Gage Eads
2019-04-03 20:09             ` [PATCH v7 3/8] test/stack: add stack test Gage Eads
2019-04-03 20:09             ` [PATCH v7 4/8] test/stack: add stack perf test Gage Eads
2019-04-03 20:09             ` [PATCH v7 5/8] stack: add lock-free stack implementation Gage Eads
2019-04-03 20:09             ` [PATCH v7 6/8] stack: add C11 atomic implementation Gage Eads
2019-04-03 20:09             ` [PATCH v7 7/8] test/stack: add lock-free stack tests Gage Eads
2019-04-03 20:09             ` [PATCH v7 8/8] mempool/stack: add lock-free stack mempool handler Gage Eads
2019-04-03 20:39             ` [PATCH v7 0/8] Add stack library and new " Thomas Monjalon
2019-04-03 20:49               ` Eads, Gage
2019-04-03 20:50             ` [PATCH v8 " Gage Eads
2019-04-03 20:50               ` [PATCH v8 1/8] stack: introduce rte stack library Gage Eads
2019-04-03 20:50               ` [PATCH v8 2/8] mempool/stack: convert mempool to use rte stack Gage Eads
2019-04-03 20:50               ` [PATCH v8 3/8] test/stack: add stack test Gage Eads
2019-04-03 22:41                 ` Thomas Monjalon
2019-04-03 23:05                   ` Eads, Gage
2019-04-03 20:50               ` [PATCH v8 4/8] test/stack: add stack perf test Gage Eads
2019-04-03 20:50               ` [PATCH v8 5/8] stack: add lock-free stack implementation Gage Eads
2019-04-03 20:50               ` [PATCH v8 6/8] stack: add C11 atomic implementation Gage Eads
2019-04-03 20:50               ` [PATCH v8 7/8] test/stack: add lock-free stack tests Gage Eads
2019-04-03 20:50               ` [PATCH v8 8/8] mempool/stack: add lock-free stack mempool handler Gage Eads
2019-04-03 23:20               ` [PATCH v9 0/8] Add stack library and new " Gage Eads
2019-04-03 23:20                 ` [PATCH v9 1/8] stack: introduce rte stack library Gage Eads
2019-04-04 13:30                   ` Thomas Monjalon
2019-04-04 14:14                     ` Eads, Gage
2019-04-03 23:20                 ` [PATCH v9 2/8] mempool/stack: convert mempool to use rte stack Gage Eads
2019-04-03 23:20                 ` [PATCH v9 3/8] test/stack: add stack test Gage Eads
2019-04-04  7:34                   ` Thomas Monjalon
2019-04-03 23:20                 ` [PATCH v9 4/8] test/stack: add stack perf test Gage Eads
2019-04-03 23:20                 ` [PATCH v9 5/8] stack: add lock-free stack implementation Gage Eads
2019-04-03 23:20                 ` [PATCH v9 6/8] stack: add C11 atomic implementation Gage Eads
2019-04-03 23:20                 ` [PATCH v9 7/8] test/stack: add lock-free stack tests Gage Eads
2019-04-03 23:20                 ` [PATCH v9 8/8] mempool/stack: add lock-free stack mempool handler Gage Eads
2019-04-04 10:01                 ` [PATCH v10 0/8] Add stack library and new " Gage Eads
2019-04-04 10:01                   ` [PATCH v10 1/8] stack: introduce rte stack library Gage Eads
2019-04-04 10:01                   ` [PATCH v10 2/8] mempool/stack: convert mempool to use rte stack Gage Eads
2019-04-04 10:01                   ` [PATCH v10 3/8] test/stack: add stack test Gage Eads
2019-04-04 10:01                   ` [PATCH v10 4/8] test/stack: add stack perf test Gage Eads
2019-04-04 10:01                   ` [PATCH v10 5/8] stack: add lock-free stack implementation Gage Eads
2019-04-04 10:01                   ` [PATCH v10 6/8] stack: add C11 atomic implementation Gage Eads
2019-04-04 10:01                   ` [PATCH v10 7/8] test/stack: add lock-free stack tests Gage Eads
2019-04-04 10:01                   ` [PATCH v10 8/8] mempool/stack: add lock-free stack mempool handler Gage Eads
2019-04-04 15:42                   ` [PATCH v10 0/8] Add stack library and new " Thomas Monjalon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9184057F7FC11744A2107296B6B8EB1E5420D940@FMSMSX108.amr.corp.intel.com \
    --to=gage.eads@intel.com \
    --cc=Gavin.Hu@arm.com \
    --cc=Honnappa.Nagarahalli@arm.com \
    --cc=arybchenko@solarflare.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=konstantin.ananyev@intel.com \
    --cc=nd@arm.com \
    --cc=olivier.matz@6wind.com \
    --cc=thomas@monjalon.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.