* [PATCH rdma-core 0/3] Barriers improvements
@ 2017-03-13 14:53 Yishai Hadas
       [not found] ` <1489416829-15467-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Yishai Hadas @ 2017-03-13 14:53 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/,
	majd-VPRAkNaXOzVWk0Htik3J/w

This series optimizes both mlx4 and mlx5 as a follow-up to the
previous barrier series that was already merged.

The first patch adds an optimized path for x86, where the spinlock can
also serve as a fence; it is used by the downstream mlx4 and mlx5
patches.

The next two patches adapt the current code to prevent a performance
degradation in mlx4 and to gain some improvement in mlx5.
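
To illustrate the pattern, here is a minimal sketch of a BlueFlame
doorbell path before and after (hypothetical names, not code taken from
the actual providers; the real call sites are in patches 2 and 3):

#include <pthread.h>
#include <string.h>

#include "util/udma_barrier.h"

/* Hypothetical doorbell context, for illustration only. */
struct db_ctx {
	pthread_spinlock_t bf_lock;
	void *bf_page;
};

/* Before: explicit WC fence, then take the lock. */
static void ring_doorbell_before(struct db_ctx *ctx, const void *ctrl, size_t len)
{
	mmio_wc_start();			/* order WC vs. earlier writes */
	pthread_spin_lock(&ctx->bf_lock);
	memcpy(ctx->bf_page, ctrl, len);	/* stands in for the provider's BF copy */
	mmio_flush_writes();
	pthread_spin_unlock(&ctx->bf_lock);
}

/* After: on x86 the lock's serialization already orders WC writes,
 * so the combined helper elides the extra SFENCE. */
static void ring_doorbell_after(struct db_ctx *ctx, const void *ctrl, size_t len)
{
	mmio_wc_spinlock(&ctx->bf_lock);
	memcpy(ctx->bf_page, ctrl, len);
	mmio_flush_writes();
	pthread_spin_unlock(&ctx->bf_lock);
}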

Pull request was sent:
https://github.com/linux-rdma/rdma-core/pull/95

Jason Gunthorpe (1):
  verbs: Add mmio_wc_spinlock barrier

Yishai Hadas (2):
  mlx4: Optimize post_send barriers
  mlx5: Optimize post_send barriers

 providers/mlx4/qp.c   | 19 +++++++------------
 providers/mlx5/mlx5.c |  2 +-
 providers/mlx5/qp.c   |  7 ++++---
 util/udma_barrier.h   | 15 +++++++++++++++
 4 files changed, 27 insertions(+), 16 deletions(-)

-- 
1.8.3.1


* [PATCH rdma-core 1/3] verbs: Add mmio_wc_spinlock barrier
       [not found] ` <1489416829-15467-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2017-03-13 14:53   ` Yishai Hadas
       [not found]     ` <1489416829-15467-2-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2017-03-13 14:53   ` [PATCH rdma-core 2/3] mlx4: Optimize post_send barriers Yishai Hadas
  2017-03-13 14:53   ` [PATCH rdma-core 3/3] mlx5: " Yishai Hadas
  2 siblings, 1 reply; 7+ messages in thread
From: Yishai Hadas @ 2017-03-13 14:53 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/,
	majd-VPRAkNaXOzVWk0Htik3J/w

From: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>

For x86 the serialization within the spin lock is enough to
strongly order WC and other memory types.

Add a new barrier named 'mmio_wc_spinlock' to optimize
that.

Signed-off-by: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
Reviewed-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 util/udma_barrier.h | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/util/udma_barrier.h b/util/udma_barrier.h
index 9e73148..ec14dd3 100644
--- a/util/udma_barrier.h
+++ b/util/udma_barrier.h
@@ -33,6 +33,8 @@
 #ifndef __UTIL_UDMA_BARRIER_H
 #define __UTIL_UDMA_BARRIER_H
 
+#include <pthread.h>
+
 /* Barriers for DMA.
 
    These barriers are expliclty only for use with user DMA operations. If you
@@ -222,4 +224,17 @@
 */
 #define mmio_ordered_writes_hack() mmio_flush_writes()
 
+/* Higher Level primitives */
+
+/* Do mmio_wc_start and grab a spinlock */
+static inline void mmio_wc_spinlock(pthread_spinlock_t *lock)
+{
+	pthread_spin_lock(lock);
+#if !defined(__i386__) && !defined(__x86_64__)
+	/* For x86 the serialization within the spin lock is enough to
+	 * strongly order WC and other memory types. */
+	mmio_wc_start();
+#endif
+}
+
 #endif
-- 
1.8.3.1


* [PATCH rdma-core 2/3] mlx4: Optimize post_send barriers
       [not found] ` <1489416829-15467-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2017-03-13 14:53   ` [PATCH rdma-core 1/3] verbs: Add mmio_wc_spinlock barrier Yishai Hadas
@ 2017-03-13 14:53   ` Yishai Hadas
  2017-03-13 14:53   ` [PATCH rdma-core 3/3] mlx5: " Yishai Hadas
  2 siblings, 0 replies; 7+ messages in thread
From: Yishai Hadas @ 2017-03-13 14:53 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/,
	majd-VPRAkNaXOzVWk0Htik3J/w

This patch optimizes the current implementation as follows:
- Drop the leading barrier, which affects non-x86 architectures, until
  further optimization is done, to prevent a performance degradation.
- On architectures where the spinlock also acts as an SFENCE
  (e.g. x86), avoid issuing another explicit SFENCE.

Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 providers/mlx4/qp.c | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/providers/mlx4/qp.c b/providers/mlx4/qp.c
index e9a59f9..a8eb8e2 100644
--- a/providers/mlx4/qp.c
+++ b/providers/mlx4/qp.c
@@ -203,7 +203,7 @@ static void set_data_seg(struct mlx4_wqe_data_seg *dseg, struct ibv_sge *sg)
 	 * chunk and get a valid (!= * 0xffffffff) byte count but
 	 * stale data, and end up sending the wrong data.
 	 */
-	udma_ordering_write_barrier();
+	udma_to_device_barrier();
 
 	if (likely(sg->length))
 		dseg->byte_count = htobe32(sg->length);
@@ -227,9 +227,6 @@ int mlx4_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr,
 
 	pthread_spin_lock(&qp->sq.lock);
 
-	/* Get all user DMA buffers ready to go */
-	udma_to_device_barrier();
-
 	/* XXX check that state is OK to post send */
 
 	ind = qp->sq.head;
@@ -402,7 +399,7 @@ int mlx4_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr,
 					wqe += to_copy;
 					addr += to_copy;
 					seg_len += to_copy;
-					udma_ordering_write_barrier(); /* see comment below */
+					udma_to_device_barrier(); /* see comment below */
 					seg->byte_count = htobe32(MLX4_INLINE_SEG | seg_len);
 					seg_len = 0;
 					seg = wqe;
@@ -430,7 +427,7 @@ int mlx4_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr,
 				 * data, and end up sending the wrong
 				 * data.
 				 */
-				udma_ordering_write_barrier();
+				udma_to_device_barrier();
 				seg->byte_count = htobe32(MLX4_INLINE_SEG | seg_len);
 			}
 
@@ -452,7 +449,7 @@ int mlx4_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr,
 		 * setting ownership bit (because HW can start
 		 * executing as soon as we do).
 		 */
-		udma_ordering_write_barrier();
+		udma_to_device_barrier();
 
 		ctrl->owner_opcode = htobe32(mlx4_ib_opcode[wr->opcode]) |
 			(ind & qp->sq.wqe_cnt ? htobe32(1 << 31) : 0);
@@ -476,18 +473,16 @@ out:
 		ctrl->owner_opcode |= htobe32((qp->sq.head & 0xffff) << 8);
 
 		ctrl->bf_qpn |= qp->doorbell_qpn;
+		++qp->sq.head;
 		/*
 		 * Make sure that descriptor is written to memory
 		 * before writing to BlueFlame page.
 		 */
-		mmio_wc_start();
-
-		++qp->sq.head;
-
-		pthread_spin_lock(&ctx->bf_lock);
+		mmio_wc_spinlock(&ctx->bf_lock);
 
 		mlx4_bf_copy(ctx->bf_page + ctx->bf_offset, (unsigned long *) ctrl,
 			     align(size * 16, 64));
+		/* Flush before toggling bf_offset to be latency oriented */
 		mmio_flush_writes();
 
 		ctx->bf_offset ^= ctx->bf_buf_size;
-- 
1.8.3.1


* [PATCH rdma-core 3/3] mlx5: Optimize post_send barriers
       [not found] ` <1489416829-15467-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2017-03-13 14:53   ` [PATCH rdma-core 1/3] verbs: Add mmio_wc_spinlock barrier Yishai Hadas
  2017-03-13 14:53   ` [PATCH rdma-core 2/3] mlx4: Optimize post_send barriers Yishai Hadas
@ 2017-03-13 14:53   ` Yishai Hadas
  2 siblings, 0 replies; 7+ messages in thread
From: Yishai Hadas @ 2017-03-13 14:53 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	yishaih-VPRAkNaXOzVWk0Htik3J/w,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/,
	majd-VPRAkNaXOzVWk0Htik3J/w

On architectures where the spinlock also acts as an SFENCE
(e.g. x86), avoid issuing another explicit SFENCE.

To avoid an extra 'if' that checks both whether the lock is really
taken and whether the application is single threaded, fold the
mlx5_single_threaded flag into bf->need_lock; a sketch of the idea
follows, and the actual hunks are below.
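
A minimal sketch of the idea (simplified and hypothetical names; the
real changes are in need_uuar_lock() and the post_send hunks below):

#include <pthread.h>
#include <stdbool.h>

#include "util/udma_barrier.h"

static bool single_threaded;		/* stands in for mlx5_single_threaded */

struct bf {
	pthread_spinlock_t lock;
	bool need_lock;
};

/* Fold the single-threaded case into need_lock once, at setup time,
 * so the doorbell hot path tests a single condition. */
static void bf_init(struct bf *bf, bool uar_is_shared)
{
	pthread_spin_init(&bf->lock, PTHREAD_PROCESS_PRIVATE);
	bf->need_lock = uar_is_shared && !single_threaded;
}

static void ring_doorbell(struct bf *bf)
{
	if (bf->need_lock)
		mmio_wc_spinlock(&bf->lock);	/* lock doubles as the WC fence on x86 */
	else
		mmio_wc_start();
	/* ... copy the WQE to the BlueFlame WC page ... */
	mmio_flush_writes();
	if (bf->need_lock)
		pthread_spin_unlock(&bf->lock);
}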

Signed-off-by: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 providers/mlx5/mlx5.c | 2 +-
 providers/mlx5/qp.c   | 7 ++++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/providers/mlx5/mlx5.c b/providers/mlx5/mlx5.c
index 87c85df..eeaf5ac 100644
--- a/providers/mlx5/mlx5.c
+++ b/providers/mlx5/mlx5.c
@@ -524,7 +524,7 @@ static int get_num_low_lat_uuars(int tot_uuars)
  */
 static int need_uuar_lock(struct mlx5_context *ctx, int uuarn)
 {
-	if (uuarn == 0)
+	if (uuarn == 0 || mlx5_single_threaded)
 		return 0;
 
 	if (uuarn >= (ctx->tot_uuars - ctx->low_lat_uuars) * 2)
diff --git a/providers/mlx5/qp.c b/providers/mlx5/qp.c
index de68b1c..1d5a2f9 100644
--- a/providers/mlx5/qp.c
+++ b/providers/mlx5/qp.c
@@ -930,11 +930,11 @@ out:
 
 		/* Make sure that the doorbell write happens before the memcpy
 		 * to WC memory below */
-		mmio_wc_start();
-
 		ctx = to_mctx(ibqp->context);
 		if (bf->need_lock)
-			mlx5_spin_lock(&bf->lock);
+			mmio_wc_spinlock(&bf->lock.lock);
+		else
+			mmio_wc_start();
 
 		if (!ctx->shut_up_bf && nreq == 1 && bf->uuarn &&
 		    (inl || ctx->prefer_bf) && size > 1 &&
@@ -953,6 +953,7 @@ out:
 		 * writes doorbell 2, and it's write is flushed earlier. Since
 		 * the mmio_flush_writes is CPU local, this will result in the HCA seeing
 		 * doorbell 2, followed by doorbell 1.
+		 * Flush before toggling bf_offset to be latency oriented.
 		 */
 		mmio_flush_writes();
 		bf->offset ^= bf->buf_size;
-- 
1.8.3.1


* Re: [PATCH rdma-core 1/3] verbs: Add mmio_wc_spinlock barrier
       [not found]     ` <1489416829-15467-2-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2017-03-13 17:00       ` Jason Gunthorpe
       [not found]         ` <20170313170003.GC25664-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Jason Gunthorpe @ 2017-03-13 17:00 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, majd-VPRAkNaXOzVWk0Htik3J/w

On Mon, Mar 13, 2017 at 04:53:47PM +0200, Yishai Hadas wrote:
> From: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
> 
> For x86 the serialization within the spin lock is enough to
> strongly order WC and other memory types.
> 
> Add a new barrier named 'mmio_wc_spinlock' to optimize
> that.

Please use this patch with the commentary instead:

diff --git a/util/udma_barrier.h b/util/udma_barrier.h
index 9e73148af8d5b6..cfe0459d7f6fff 100644
--- a/util/udma_barrier.h
+++ b/util/udma_barrier.h
@@ -33,6 +33,8 @@
 #ifndef __UTIL_UDMA_BARRIER_H
 #define __UTIL_UDMA_BARRIER_H
 
+#include <pthread.h>
+
 /* Barriers for DMA.
 
    These barriers are expliclty only for use with user DMA operations. If you
@@ -222,4 +224,37 @@
 */
 #define mmio_ordered_writes_hack() mmio_flush_writes()
 
+/* Write Combining Spinlock primitive
+
+   Any access to a multi-value WC region must ensure that multiple cpus do not
+   write to the same values concurrently, these macros make that
+   straightforward and efficient if the choosen exclusion is a spinlock.
+
+   The spinlock guarantees that the WC writes issued within the critical
+   section are made visible as TLP to the device. The TLP must be seen by the
+   device strictly in the order that the spinlocks are acquired, and combining
+   WC writes between different sections is not permitted.
+
+   Use of these macros allow the fencing inside the spinlock to be combined
+   with the fencing required for DMA.
+ */
+static inline void mmio_wc_spinlock(pthread_spinlock_t *lock)
+{
+	pthread_spin_lock(lock);
+#if !defined(__i386__) && !defined(__x86_64__)
+	/* For x86 the serialization within the spin lock is enough to
+	 * strongly order WC and other memory types. */
+	mmio_wc_start();
+#endif
+}
+
+static inline void mmio_wc_spinunlock(pthread_spinlock_t *lock)
+{
+	/* It is possible that on x86 the atomic in the lock is strong enough
+	 * to force-flush the WC buffers quickly, and this SFENCE can be
+	 * omitted too. */
+	mmio_flush_writes();
+	pthread_spin_unlock(lock);
+}
+
 #endif
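
For context, a minimal usage sketch of the lock/unlock pair
(illustrative only, not part of the patch above; the mlx4/mlx5 patches
in this series keep an explicit mmio_flush_writes() before toggling the
BF offset and unlock separately):

#include <stdint.h>
#include <pthread.h>

#include "util/udma_barrier.h"

/* Hypothetical doorbell write, for illustration only. */
static void write_wc_doorbell(pthread_spinlock_t *lock,
			      volatile uint64_t *wc_db, uint64_t val)
{
	mmio_wc_spinlock(lock);		/* take the lock; it also fences WC on x86 */
	*wc_db = val;			/* WC write inside the critical section */
	mmio_wc_spinunlock(lock);	/* flush the WC buffer, then release */
}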

* Re: [PATCH rdma-core 1/3] verbs: Add mmio_wc_spinlock barrier
       [not found]         ` <20170313170003.GC25664-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2017-03-14 12:06           ` Yishai Hadas
       [not found]             ` <f236046b-4d2d-f873-d251-a91c90adae98-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Yishai Hadas @ 2017-03-14 12:06 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Yishai Hadas, dledford-H+wXaHxf7aLQT0dZR+AlfA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, majd-VPRAkNaXOzVWk0Htik3J/w

On 3/13/2017 7:00 PM, Jason Gunthorpe wrote:
> On Mon, Mar 13, 2017 at 04:53:47PM +0200, Yishai Hadas wrote:
>> From: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
>>
>> For x86 the serialization within the spin lock is enough to
>> strongly order WC and other memory types.
>>
>> Add a new barrier named 'mmio_wc_spinlock' to optimize
>> that.
>
> Please use this patch with the commentary instead:

OK, the pull request was updated with the patch below:
https://github.com/linux-rdma/rdma-core/pull/95

> diff --git a/util/udma_barrier.h b/util/udma_barrier.h
> index 9e73148af8d5b6..cfe0459d7f6fff 100644
> --- a/util/udma_barrier.h
> +++ b/util/udma_barrier.h
> @@ -33,6 +33,8 @@
>  #ifndef __UTIL_UDMA_BARRIER_H
>  #define __UTIL_UDMA_BARRIER_H
>
> +#include <pthread.h>
> +
>  /* Barriers for DMA.
>
>     These barriers are expliclty only for use with user DMA operations. If you
> @@ -222,4 +224,37 @@
>  */
>  #define mmio_ordered_writes_hack() mmio_flush_writes()
>
> +/* Write Combining Spinlock primitive
> +
> +   Any access to a multi-value WC region must ensure that multiple cpus do not
> +   write to the same values concurrently, these macros make that
> +   straightforward and efficient if the choosen exclusion is a spinlock.
> +
> +   The spinlock guarantees that the WC writes issued within the critical
> +   section are made visible as TLP to the device. The TLP must be seen by the
> +   device strictly in the order that the spinlocks are acquired, and combining
> +   WC writes between different sections is not permitted.
> +
> +   Use of these macros allow the fencing inside the spinlock to be combined
> +   with the fencing required for DMA.
> + */
> +static inline void mmio_wc_spinlock(pthread_spinlock_t *lock)
> +{
> +	pthread_spin_lock(lock);
> +#if !defined(__i386__) && !defined(__x86_64__)
> +	/* For x86 the serialization within the spin lock is enough to
> +	 * strongly order WC and other memory types. */
> +	mmio_wc_start();
> +#endif
> +}
> +
> +static inline void mmio_wc_spinunlock(pthread_spinlock_t *lock)
> +{
> +	/* It is possible that on x86 the atomic in the lock is strong enough
> +	 * to force-flush the WC buffers quickly, and this SFENCE can be
> +	 * omitted too. */
> +	mmio_flush_writes();
> +	pthread_spin_unlock(lock);
> +}
> +
>  #endif


* Re: [PATCH rdma-core 1/3] verbs: Add mmio_wc_spinlock barrier
       [not found]             ` <f236046b-4d2d-f873-d251-a91c90adae98-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2017-03-14 15:12               ` Doug Ledford
  0 siblings, 0 replies; 7+ messages in thread
From: Doug Ledford @ 2017-03-14 15:12 UTC (permalink / raw)
  To: Yishai Hadas, Jason Gunthorpe
  Cc: Yishai Hadas, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	majd-VPRAkNaXOzVWk0Htik3J/w

On Tue, 2017-03-14 at 14:06 +0200, Yishai Hadas wrote:
> On 3/13/2017 7:00 PM, Jason Gunthorpe wrote:
> > 
> > On Mon, Mar 13, 2017 at 04:53:47PM +0200, Yishai Hadas wrote:
> > > 
> > > From: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
> > > 
> > > For x86 the serialization within the spin lock is enough to
> > > strongly order WC and other memory types.
> > > 
> > > Add a new barrier named 'mmio_wc_spinlock' to optimize
> > > that.
> > 
> > Please use this patch with the commentary instead:
> 
> OK, pull request was updated with below.
> https://github.com/linux-rdma/rdma-core/pull/95

Thanks, I've merged this pull request.

> 
> > 
> > diff --git a/util/udma_barrier.h b/util/udma_barrier.h
> > index 9e73148af8d5b6..cfe0459d7f6fff 100644
> > --- a/util/udma_barrier.h
> > +++ b/util/udma_barrier.h
> > @@ -33,6 +33,8 @@
> >  #ifndef __UTIL_UDMA_BARRIER_H
> >  #define __UTIL_UDMA_BARRIER_H
> > 
> > +#include <pthread.h>
> > +
> >  /* Barriers for DMA.
> > 
> >     These barriers are expliclty only for use with user DMA operations. If you
> > @@ -222,4 +224,37 @@
> >  */
> >  #define mmio_ordered_writes_hack() mmio_flush_writes()
> > 
> > +/* Write Combining Spinlock primitive
> > +
> > +   Any access to a multi-value WC region must ensure that multiple cpus do not
> > +   write to the same values concurrently, these macros make that
> > +   straightforward and efficient if the choosen exclusion is a spinlock.
> > +
> > +   The spinlock guarantees that the WC writes issued within the critical
> > +   section are made visible as TLP to the device. The TLP must be seen by the
> > +   device strictly in the order that the spinlocks are acquired, and combining
> > +   WC writes between different sections is not permitted.
> > +
> > +   Use of these macros allow the fencing inside the spinlock to be combined
> > +   with the fencing required for DMA.
> > + */
> > +static inline void mmio_wc_spinlock(pthread_spinlock_t *lock)
> > +{
> > +	pthread_spin_lock(lock);
> > +#if !defined(__i386__) && !defined(__x86_64__)
> > +	/* For x86 the serialization within the spin lock is enough to
> > +	 * strongly order WC and other memory types. */
> > +	mmio_wc_start();
> > +#endif
> > +}
> > +
> > +static inline void mmio_wc_spinunlock(pthread_spinlock_t *lock)
> > +{
> > +	/* It is possible that on x86 the atomic in the lock is strong enough
> > +	 * to force-flush the WC buffers quickly, and this SFENCE can be
> > +	 * omitted too. */
> > +	mmio_flush_writes();
> > +	pthread_spin_unlock(lock);
> > +}
> > +
> >  #endif
> 
-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
    GPG KeyID: B826A3330E572FDD
   
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD


Thread overview: 7+ messages
2017-03-13 14:53 [PATCH rdma-core 0/3] Barriers improvements Yishai Hadas
     [not found] ` <1489416829-15467-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2017-03-13 14:53   ` [PATCH rdma-core 1/3] verbs: Add mmio_wc_spinlock barrier Yishai Hadas
     [not found]     ` <1489416829-15467-2-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2017-03-13 17:00       ` Jason Gunthorpe
     [not found]         ` <20170313170003.GC25664-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2017-03-14 12:06           ` Yishai Hadas
     [not found]             ` <f236046b-4d2d-f873-d251-a91c90adae98-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2017-03-14 15:12               ` Doug Ledford
2017-03-13 14:53   ` [PATCH rdma-core 2/3] mlx4: Optimize post_send barriers Yishai Hadas
2017-03-13 14:53   ` [PATCH rdma-core 3/3] mlx5: " Yishai Hadas
