All of lore.kernel.org
 help / color / mirror / Atom feed
From: Gage Eads <gage.eads@intel.com>
To: dev@dpdk.org
Cc: olivier.matz@6wind.com, arybchenko@solarflare.com,
	bruce.richardson@intel.com, konstantin.ananyev@intel.com
Subject: [PATCH 1/6] ring: change head and tail to pointer-width size
Date: Thu, 10 Jan 2019 15:01:17 -0600	[thread overview]
Message-ID: <20190110210122.24889-2-gage.eads@intel.com> (raw)
In-Reply-To: <20190110210122.24889-1-gage.eads@intel.com>

For 64-bit architectures, doubling the head and tail index widths greatly
increases the time it takes for them to wrap-around (with current CPU
speeds, it won't happen within the author's lifetime). This is important in
avoiding the ABA problem -- in which a thread mistakes reading the same
tail index in two accesses to mean that the ring was not modified in the
intervening time -- in the upcoming non-blocking ring implementation. Using
a 64-bit index makes the possibility of this occurring effectively zero.

I tested this commit's performance impact with an x86_64 build on a
dual-socket Xeon E5-2699 v4 using ring_perf_autotest, and the change made
no significant difference -- the few differences appear to be system noise.
(The test ran on isolcpus cores using a tickless scheduler, but some
variation was stll observed.) Each test was run three times and the results
were averaged:

                                  | 64b head/tail cycle cost minus
             Test                 |     32b head/tail cycle cost
------------------------------------------------------------------
SP/SC single enq/dequeue          | 0.33
MP/MC single enq/dequeue          | 0.00
SP/SC burst enq/dequeue (size 8)  | 0.00
MP/MC burst enq/dequeue (size 8)  | 1.00
SP/SC burst enq/dequeue (size 32) | 0.00
MP/MC burst enq/dequeue (size 32) | -1.00
SC empty dequeue                  | 0.01
MC empty dequeue                  | 0.00

Single lcore:
SP/SC bulk enq/dequeue (size 8)   | -0.36
MP/MC bulk enq/dequeue (size 8)   | 0.99
SP/SC bulk enq/dequeue (size 32)  | -0.40
MP/MC bulk enq/dequeue (size 32)  | -0.57

Two physical cores:
SP/SC bulk enq/dequeue (size 8)   | -0.49
MP/MC bulk enq/dequeue (size 8)   | 0.19
SP/SC bulk enq/dequeue (size 32)  | -0.28
MP/MC bulk enq/dequeue (size 32)  | -0.62

Two NUMA nodes:
SP/SC bulk enq/dequeue (size 8)   | 3.25
MP/MC bulk enq/dequeue (size 8)   | 1.87
SP/SC bulk enq/dequeue (size 32)  | -0.44
MP/MC bulk enq/dequeue (size 32)  | -1.10

An earlier version of this patch changed the head and tail indexes to
uint64_t, but that caused a performance drop on 32-bit builds. With
uintptr_t, no performance difference is observed on an i686 build.

Signed-off-by: Gage Eads <gage.eads@intel.com>
---
 lib/librte_eventdev/rte_event_ring.h |  6 +++---
 lib/librte_ring/rte_ring.c           | 10 +++++-----
 lib/librte_ring/rte_ring.h           | 20 ++++++++++----------
 lib/librte_ring/rte_ring_generic.h   | 16 +++++++++-------
 4 files changed, 27 insertions(+), 25 deletions(-)

diff --git a/lib/librte_eventdev/rte_event_ring.h b/lib/librte_eventdev/rte_event_ring.h
index 827a3209e..eae70f904 100644
--- a/lib/librte_eventdev/rte_event_ring.h
+++ b/lib/librte_eventdev/rte_event_ring.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2016-2017 Intel Corporation
+ * Copyright(c) 2016-2019 Intel Corporation
  */
 
 /**
@@ -88,7 +88,7 @@ rte_event_ring_enqueue_burst(struct rte_event_ring *r,
 		const struct rte_event *events,
 		unsigned int n, uint16_t *free_space)
 {
-	uint32_t prod_head, prod_next;
+	uintptr_t prod_head, prod_next;
 	uint32_t free_entries;
 
 	n = __rte_ring_move_prod_head(&r->r, r->r.prod.single, n,
@@ -129,7 +129,7 @@ rte_event_ring_dequeue_burst(struct rte_event_ring *r,
 		struct rte_event *events,
 		unsigned int n, uint16_t *available)
 {
-	uint32_t cons_head, cons_next;
+	uintptr_t cons_head, cons_next;
 	uint32_t entries;
 
 	n = __rte_ring_move_cons_head(&r->r, r->r.cons.single, n,
diff --git a/lib/librte_ring/rte_ring.c b/lib/librte_ring/rte_ring.c
index d215acecc..b15ee0eb3 100644
--- a/lib/librte_ring/rte_ring.c
+++ b/lib/librte_ring/rte_ring.c
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  *
- * Copyright (c) 2010-2015 Intel Corporation
+ * Copyright (c) 2010-2019 Intel Corporation
  * Copyright (c) 2007,2008 Kip Macy kmacy@freebsd.org
  * All rights reserved.
  * Derived from FreeBSD's bufring.h
@@ -227,10 +227,10 @@ rte_ring_dump(FILE *f, const struct rte_ring *r)
 	fprintf(f, "  flags=%x\n", r->flags);
 	fprintf(f, "  size=%"PRIu32"\n", r->size);
 	fprintf(f, "  capacity=%"PRIu32"\n", r->capacity);
-	fprintf(f, "  ct=%"PRIu32"\n", r->cons.tail);
-	fprintf(f, "  ch=%"PRIu32"\n", r->cons.head);
-	fprintf(f, "  pt=%"PRIu32"\n", r->prod.tail);
-	fprintf(f, "  ph=%"PRIu32"\n", r->prod.head);
+	fprintf(f, "  ct=%"PRIuPTR"\n", r->cons.tail);
+	fprintf(f, "  ch=%"PRIuPTR"\n", r->cons.head);
+	fprintf(f, "  pt=%"PRIuPTR"\n", r->prod.tail);
+	fprintf(f, "  ph=%"PRIuPTR"\n", r->prod.head);
 	fprintf(f, "  used=%u\n", rte_ring_count(r));
 	fprintf(f, "  avail=%u\n", rte_ring_free_count(r));
 }
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index af5444a9f..12af64e13 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  *
- * Copyright (c) 2010-2017 Intel Corporation
+ * Copyright (c) 2010-2019 Intel Corporation
  * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
  * All rights reserved.
  * Derived from FreeBSD's bufring.h
@@ -65,8 +65,8 @@ struct rte_memzone; /* forward declaration, so as not to require memzone.h */
 
 /* structure to hold a pair of head/tail values and other metadata */
 struct rte_ring_headtail {
-	volatile uint32_t head;  /**< Prod/consumer head. */
-	volatile uint32_t tail;  /**< Prod/consumer tail. */
+	volatile uintptr_t head;  /**< Prod/consumer head. */
+	volatile uintptr_t tail;  /**< Prod/consumer tail. */
 	uint32_t single;         /**< True if single prod/cons */
 };
 
@@ -242,7 +242,7 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r);
 #define ENQUEUE_PTRS(r, ring_start, prod_head, obj_table, n, obj_type) do { \
 	unsigned int i; \
 	const uint32_t size = (r)->size; \
-	uint32_t idx = prod_head & (r)->mask; \
+	uintptr_t idx = prod_head & (r)->mask; \
 	obj_type *ring = (obj_type *)ring_start; \
 	if (likely(idx + n < size)) { \
 		for (i = 0; i < (n & ((~(unsigned)0x3))); i+=4, idx+=4) { \
@@ -272,7 +272,7 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r);
  * single and multi consumer dequeue functions */
 #define DEQUEUE_PTRS(r, ring_start, cons_head, obj_table, n, obj_type) do { \
 	unsigned int i; \
-	uint32_t idx = cons_head & (r)->mask; \
+	uintptr_t idx = cons_head & (r)->mask; \
 	const uint32_t size = (r)->size; \
 	obj_type *ring = (obj_type *)ring_start; \
 	if (likely(idx + n < size)) { \
@@ -338,7 +338,7 @@ __rte_ring_do_enqueue(struct rte_ring *r, void * const *obj_table,
 		 unsigned int n, enum rte_ring_queue_behavior behavior,
 		 unsigned int is_sp, unsigned int *free_space)
 {
-	uint32_t prod_head, prod_next;
+	uintptr_t prod_head, prod_next;
 	uint32_t free_entries;
 
 	n = __rte_ring_move_prod_head(r, is_sp, n, behavior,
@@ -380,7 +380,7 @@ __rte_ring_do_dequeue(struct rte_ring *r, void **obj_table,
 		 unsigned int n, enum rte_ring_queue_behavior behavior,
 		 unsigned int is_sc, unsigned int *available)
 {
-	uint32_t cons_head, cons_next;
+	uintptr_t cons_head, cons_next;
 	uint32_t entries;
 
 	n = __rte_ring_move_cons_head(r, (int)is_sc, n, behavior,
@@ -681,9 +681,9 @@ rte_ring_dequeue(struct rte_ring *r, void **obj_p)
 static inline unsigned
 rte_ring_count(const struct rte_ring *r)
 {
-	uint32_t prod_tail = r->prod.tail;
-	uint32_t cons_tail = r->cons.tail;
-	uint32_t count = (prod_tail - cons_tail) & r->mask;
+	uintptr_t prod_tail = r->prod.tail;
+	uintptr_t cons_tail = r->cons.tail;
+	uintptr_t count = (prod_tail - cons_tail) & r->mask;
 	return (count > r->capacity) ? r->capacity : count;
 }
 
diff --git a/lib/librte_ring/rte_ring_generic.h b/lib/librte_ring/rte_ring_generic.h
index ea7dbe5b9..3fd1150f6 100644
--- a/lib/librte_ring/rte_ring_generic.h
+++ b/lib/librte_ring/rte_ring_generic.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  *
- * Copyright (c) 2010-2017 Intel Corporation
+ * Copyright (c) 2010-2019 Intel Corporation
  * Copyright (c) 2007-2009 Kip Macy kmacy@freebsd.org
  * All rights reserved.
  * Derived from FreeBSD's bufring.h
@@ -11,7 +11,7 @@
 #define _RTE_RING_GENERIC_H_
 
 static __rte_always_inline void
-update_tail(struct rte_ring_headtail *ht, uint32_t old_val, uint32_t new_val,
+update_tail(struct rte_ring_headtail *ht, uintptr_t old_val, uintptr_t new_val,
 		uint32_t single, uint32_t enqueue)
 {
 	if (enqueue)
@@ -55,7 +55,7 @@ update_tail(struct rte_ring_headtail *ht, uint32_t old_val, uint32_t new_val,
 static __rte_always_inline unsigned int
 __rte_ring_move_prod_head(struct rte_ring *r, unsigned int is_sp,
 		unsigned int n, enum rte_ring_queue_behavior behavior,
-		uint32_t *old_head, uint32_t *new_head,
+		uintptr_t *old_head, uintptr_t *new_head,
 		uint32_t *free_entries)
 {
 	const uint32_t capacity = r->capacity;
@@ -93,7 +93,8 @@ __rte_ring_move_prod_head(struct rte_ring *r, unsigned int is_sp,
 		if (is_sp)
 			r->prod.head = *new_head, success = 1;
 		else
-			success = rte_atomic32_cmpset(&r->prod.head,
+			/* Built-in used to handle variable-sized head index. */
+			success = __sync_bool_compare_and_swap(&r->prod.head,
 					*old_head, *new_head);
 	} while (unlikely(success == 0));
 	return n;
@@ -125,7 +126,7 @@ __rte_ring_move_prod_head(struct rte_ring *r, unsigned int is_sp,
 static __rte_always_inline unsigned int
 __rte_ring_move_cons_head(struct rte_ring *r, unsigned int is_sc,
 		unsigned int n, enum rte_ring_queue_behavior behavior,
-		uint32_t *old_head, uint32_t *new_head,
+		uintptr_t *old_head, uintptr_t *new_head,
 		uint32_t *entries)
 {
 	unsigned int max = n;
@@ -161,8 +162,9 @@ __rte_ring_move_cons_head(struct rte_ring *r, unsigned int is_sc,
 		if (is_sc)
 			r->cons.head = *new_head, success = 1;
 		else
-			success = rte_atomic32_cmpset(&r->cons.head, *old_head,
-					*new_head);
+			/* Built-in used to handle variable-sized head index. */
+			success = __sync_bool_compare_and_swap(&r->cons.head,
+					*old_head, *new_head);
 	} while (unlikely(success == 0));
 	return n;
 }
-- 
2.13.6

  reply	other threads:[~2019-01-10 21:02 UTC|newest]

Thread overview: 102+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-10 21:01 [PATCH 0/6] Add non-blocking ring Gage Eads
2019-01-10 21:01 ` Gage Eads [this message]
2019-01-11  4:38   ` [PATCH 1/6] ring: change head and tail to pointer-width size Stephen Hemminger
2019-01-11 19:07     ` Eads, Gage
2019-01-11 10:25   ` Burakov, Anatoly
2019-01-11 19:12     ` Eads, Gage
2019-01-11 19:55       ` Stephen Hemminger
2019-01-15 15:48         ` Eads, Gage
2019-01-11 10:40   ` Burakov, Anatoly
2019-01-11 10:58     ` Bruce Richardson
2019-01-11 11:30       ` Burakov, Anatoly
     [not found]         ` <20190111115851.GC3336@bricha3-MOBL.ger.corp.intel.com>
2019-01-11 19:27           ` Eads, Gage
2019-01-21 14:14             ` Burakov, Anatoly
2019-01-22 18:27               ` Eads, Gage
2019-01-10 21:01 ` [PATCH 2/6] ring: add a non-blocking implementation Gage Eads
2019-01-10 21:01 ` [PATCH 3/6] test_ring: add non-blocking ring autotest Gage Eads
2019-01-10 21:01 ` [PATCH 4/6] test_ring_perf: add non-blocking ring perf test Gage Eads
2019-01-10 21:01 ` [PATCH 5/6] mempool/ring: add non-blocking ring handlers Gage Eads
2019-01-13 13:43   ` Andrew Rybchenko
2019-01-10 21:01 ` [PATCH 6/6] doc: add NB ring comment to EAL "known issues" Gage Eads
2019-01-11  2:51   ` Varghese, Vipin
2019-01-11 19:30     ` Eads, Gage
2019-01-14  0:07       ` Varghese, Vipin
2019-01-15 23:52 ` [PATCH v2 0/5] Add non-blocking ring Gage Eads
2019-01-15 23:52   ` [PATCH v2 1/5] ring: change head and tail to pointer-width size Gage Eads
2019-01-15 23:52   ` [PATCH v2 2/5] ring: add a non-blocking implementation Gage Eads
2019-01-15 23:52   ` [PATCH v2 3/5] test_ring: add non-blocking ring autotest Gage Eads
2019-01-15 23:52   ` [PATCH v2 4/5] test_ring_perf: add non-blocking ring perf test Gage Eads
2019-01-15 23:52   ` [PATCH v2 5/5] mempool/ring: add non-blocking ring handlers Gage Eads
2019-01-16  0:26   ` [PATCH v2 0/5] Add non-blocking ring Stephen Hemminger
2019-01-18 15:23   ` [PATCH v3 " Gage Eads
2019-01-18 15:23     ` [PATCH v3 1/5] ring: add 64-bit headtail structure Gage Eads
2019-01-18 15:23     ` [PATCH v3 2/5] ring: add a non-blocking implementation Gage Eads
2019-01-22 10:12       ` Ola Liljedahl
2019-01-22 14:49       ` Ola Liljedahl
2019-01-22 21:31         ` Eads, Gage
2019-01-23 10:16           ` Ola Liljedahl
2019-01-25 17:21             ` Eads, Gage
2019-01-28 10:35               ` Ola Liljedahl
2019-01-28 18:54                 ` Eads, Gage
2019-01-28 22:31                   ` Ola Liljedahl
2019-01-28 13:34               ` Jerin Jacob Kollanukkaran
2019-01-28 13:43                 ` Ola Liljedahl
2019-01-28 14:04                   ` Jerin Jacob Kollanukkaran
2019-01-28 14:06                     ` Ola Liljedahl
2019-01-28 18:59                 ` Eads, Gage
2019-01-18 15:23     ` [PATCH v3 3/5] test_ring: add non-blocking ring autotest Gage Eads
2019-01-18 15:23     ` [PATCH v3 4/5] test_ring_perf: add non-blocking ring perf test Gage Eads
2019-01-18 15:23     ` [PATCH v3 5/5] mempool/ring: add non-blocking ring handlers Gage Eads
2019-01-22  9:27     ` [PATCH v3 0/5] Add non-blocking ring Ola Liljedahl
2019-01-22 10:15       ` Ola Liljedahl
2019-01-22 19:15       ` Eads, Gage
2019-01-23 16:02       ` Jerin Jacob Kollanukkaran
2019-01-23 16:29         ` Ola Liljedahl
2019-01-28 13:10           ` [EXT] " Jerin Jacob Kollanukkaran
2019-01-25  5:20     ` Honnappa Nagarahalli
2019-01-25 17:42       ` Eads, Gage
2019-01-25 17:56       ` Eads, Gage
2019-01-28 10:41         ` Ola Liljedahl
2019-01-28 18:14     ` [PATCH v4 " Gage Eads
2019-01-28 18:14       ` [PATCH v4 1/5] ring: add 64-bit headtail structure Gage Eads
2019-01-29 12:56         ` Ola Liljedahl
2019-01-30  4:26           ` Eads, Gage
2019-01-28 18:14       ` [PATCH v4 2/5] ring: add a non-blocking implementation Gage Eads
2019-01-28 18:14       ` [PATCH v4 3/5] test_ring: add non-blocking ring autotest Gage Eads
2019-01-28 18:14       ` [PATCH v4 4/5] test_ring_perf: add non-blocking ring perf test Gage Eads
2019-01-28 18:14       ` [PATCH v4 5/5] mempool/ring: add non-blocking ring handlers Gage Eads
2019-03-05 17:40       ` [PATCH v5 0/6] Add lock-free ring and mempool handler Gage Eads
2019-03-05 17:40         ` [PATCH v5 1/6] ring: add a pointer-width headtail structure Gage Eads
2019-03-05 17:40         ` [PATCH v5 2/6] ring: add a ring start marker Gage Eads
2019-03-05 17:40         ` [PATCH v5 3/6] ring: add a lock-free implementation Gage Eads
2019-03-05 17:40         ` [PATCH v5 4/6] test_ring: add lock-free ring autotest Gage Eads
2019-03-05 17:40         ` [PATCH v5 5/6] test_ring_perf: add lock-free ring perf test Gage Eads
2019-03-05 17:40         ` [PATCH v5 6/6] mempool/ring: add lock-free ring handlers Gage Eads
2019-03-06 15:03         ` [PATCH v6 0/6] Add lock-free ring and mempool handler Gage Eads
2019-03-06 15:03           ` [PATCH v6 1/6] ring: add a pointer-width headtail structure Gage Eads
2019-03-06 15:03           ` [PATCH v6 2/6] ring: add a ring start marker Gage Eads
2019-03-06 15:03           ` [PATCH v6 3/6] ring: add a lock-free implementation Gage Eads
2019-03-06 15:03           ` [PATCH v6 4/6] test_ring: add lock-free ring autotest Gage Eads
2019-03-06 15:03           ` [PATCH v6 5/6] test_ring_perf: add lock-free ring perf test Gage Eads
2019-03-06 15:03           ` [PATCH v6 6/6] mempool/ring: add lock-free ring handlers Gage Eads
2019-03-18 21:35           ` [PATCH v7 0/6] Add lock-free ring and mempool handler Gage Eads
2019-03-18 21:35             ` [PATCH v7 1/6] ring: add a pointer-width headtail structure Gage Eads
2019-03-18 21:35             ` [PATCH v7 2/6] ring: add a ring start marker Gage Eads
2019-03-18 21:35             ` [PATCH v7 3/6] ring: add a lock-free implementation Gage Eads
2019-03-19 15:50               ` Stephen Hemminger
2019-03-18 21:35             ` [PATCH v7 4/6] test_ring: add lock-free ring autotest Gage Eads
2019-03-18 21:35             ` [PATCH v7 5/6] test_ring_perf: add lock-free ring perf test Gage Eads
2019-03-18 21:35             ` [PATCH v7 6/6] mempool/ring: add lock-free ring handlers Gage Eads
2019-03-18 21:49             ` [PATCH v7 0/6] Add lock-free ring and mempool handler Eads, Gage
2019-03-19 15:51               ` Stephen Hemminger
2019-04-01 19:23                 ` Eads, Gage
2019-04-02 10:16                   ` Ola Liljedahl
2019-04-04 22:28                     ` [dpdk-dev] " Eads, Gage
2019-03-19  1:20             ` [PATCH v8 " Gage Eads
2019-03-19  1:20               ` [PATCH v8 1/6] ring: add a pointer-width headtail structure Gage Eads
2019-03-19  1:20               ` [PATCH v8 2/6] ring: add a ring start marker Gage Eads
2019-03-19  1:20               ` [PATCH v8 3/6] ring: add a lock-free implementation Gage Eads
2019-03-19  1:20               ` [PATCH v8 4/6] test_ring: add lock-free ring autotest Gage Eads
2019-03-19  1:20               ` [PATCH v8 5/6] test_ring_perf: add lock-free ring perf test Gage Eads
2019-03-19  1:20               ` [PATCH v8 6/6] mempool/ring: add lock-free ring handlers Gage Eads
2019-04-03 16:46               ` [PATCH v8 0/6] Add lock-free ring and mempool handler Thomas Monjalon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190110210122.24889-2-gage.eads@intel.com \
    --to=gage.eads@intel.com \
    --cc=arybchenko@solarflare.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=konstantin.ananyev@intel.com \
    --cc=olivier.matz@6wind.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.