linux-rdma.vger.kernel.org archive mirror
* [PATCH rdma-core] irdma: Restore full memory barrier for doorbell optimization
From: Tatyana Nikolova @ 2021-08-05 16:11 UTC
  To: jgg, dledford, leon; +Cc: linux-rdma, Tatyana Nikolova

During the irdma library upstream submission we agreed to
replace atomic_thread_fence(memory_order_seq_cst) in the irdma
doorbell optimization algorithm with udma_to_device_barrier().
However, further regression testing uncovered cases where, in
the absence of a full memory barrier, the algorithm incorrectly
skips ringing the doorbell.

There has been a discussion about the necessity of a full
memory barrier for the doorbell optimization in the past:
https://lore.kernel.org/linux-rdma/20170301172920.GA11340@ssaleem-MOBL4.amr.corp.intel.com/

The algorithm decides whether to ring the doorbell based on input
from the shadow memory (hw_tail). If the hw_tail is behind the sq_head,
then the algorithm doesn't ring the doorbell, because it assumes that
the HW is still actively processing WQEs.

The shadow area indicates the last WQE processed by the HW, and it is
okay if the shadow area value isn't the most current. However, there
can't be a window of time between reading the shadow area and setting
the valid bit for the last WQE posted, because the HW may go idle in
that window and the algorithm won't detect this.
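
The ordering requirement above can be sketched in C11 roughly as
follows. This is a minimal model, not the irdma code: every sketch_*
name, the ring size, and the simplified decision rule (ring only when
the shadow value equals the head as it was before this post) are
assumptions made for illustration. Only the use of
atomic_thread_fence(memory_order_seq_cst) is taken from this patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SQ_SIZE 8u /* hypothetical ring size */

struct sketch_wqe {
	uint64_t payload;
	_Atomic uint32_t valid;          /* stand-in for the WQE valid bit */
};

struct sketch_sq {
	struct sketch_wqe ring[SKETCH_SQ_SIZE];
	_Atomic uint32_t shadow_hw_tail; /* count of WQEs the "HW" has processed */
	uint32_t sq_head;                /* count of WQEs software has posted */
	uint32_t doorbells;              /* number of simulated doorbell writes */
};

/*
 * Post one WQE and decide whether to ring the doorbell.
 * Returns true if the (simulated) doorbell was rung.
 */
static bool sketch_post(struct sketch_sq *sq, uint64_t payload)
{
	assert(sq != NULL);

	uint32_t prev_head = sq->sq_head;
	struct sketch_wqe *wqe = &sq->ring[prev_head % SKETCH_SQ_SIZE];

	/* write the WQE, then publish it by setting the valid bit */
	wqe->payload = payload;
	atomic_store_explicit(&wqe->valid, 1u, memory_order_release);
	sq->sq_head = prev_head + 1;

	/*
	 * Full barrier: the valid-bit store above must not be reordered
	 * after the shadow load below (store -> load ordering).
	 */
	atomic_thread_fence(memory_order_seq_cst);

	uint32_t hw_tail = atomic_load_explicit(&sq->shadow_hw_tail,
						memory_order_relaxed);

	/*
	 * Assumed, simplified rule: if the HW had already processed
	 * everything posted before this WQE it may be idle, so ring.
	 * If hw_tail is behind, the HW is assumed active, so skip.
	 */
	if (hw_tail == prev_head) {
		sq->doorbells++;
		return true;
	}
	return false;
}
```

With a zero-initialized struct sketch_sq, the first sketch_post() call
rings the doorbell (the modelled HW has nothing outstanding), and a
second call made before shadow_hw_tail advances does not.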

The following two cases illustrate this issue and are identical,
except for the ordering of operations. The first case is an example
of how the wrong order results in not ringing the doorbell when the
HW has gone idle.

Case 1. Failing case without a full memory barrier

Write a WQE#3

Read shadow (hw tail)

hw tail = WQE#1 (i.e. WQE#1 has been processed),
so the algorithm doesn't ring the doorbell. However, in the window
of time between reading the shadow area and setting the valid bit,
the HW goes idle after processing WQE#2
(the valid bit for WQE#3 was still clear when the HW checked it).

Set valid bit for WQE#3

----------------------------------------------------------------------

Case 2. Passing case with a full memory barrier

Write a WQE#3

Set valid bit for WQE#3

Read shadow (hw tail)

hw tail = WQE#1 (i.e. WQE#1 has been processed),
so the algorithm doesn't ring the doorbell. The HW is active
and is expected to see and process WQE#3 before going idle.

----------------------------------------------------------------------
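
The two cases can be replayed deterministically with a small
single-threaded model. This is only an illustration of the
interleavings described above, not real hardware behavior: the model_*
names, the idle rule (the HW stops when it finds a clear valid bit and
resumes only on a doorbell), and the decision rule are assumptions.
Array index 0 stands for WQE#1, index 2 for WQE#3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_SQ_SIZE 8u /* hypothetical ring size; the model never wraps */

struct model_qp {
	bool valid[MODEL_SQ_SIZE]; /* valid bit per WQE slot */
	uint32_t hw_tail;          /* WQEs processed by the modelled HW */
	bool hw_idle;              /* set once the HW finds no valid WQE */
};

/* One HW step: process the next WQE if valid, otherwise go idle. */
static void model_hw_step(struct model_qp *qp)
{
	assert(qp->hw_tail < MODEL_SQ_SIZE);

	if (qp->hw_idle)
		return;
	if (qp->valid[qp->hw_tail])
		qp->hw_tail++;
	else
		qp->hw_idle = true;
}

/* Assumed rule: ring only if the HW had caught up before this post. */
static bool model_should_ring(uint32_t shadow, uint32_t prev_head)
{
	return shadow == prev_head;
}

/*
 * Post WQE#3 while WQE#1 is processed and WQE#2 is pending.
 * Returns true if WQE#3 ends up processed by the modelled HW.
 */
static bool model_post_third_wqe(bool valid_before_shadow_read)
{
	struct model_qp qp = { .valid = { true, true }, .hw_tail = 1 };
	const uint32_t prev_head = 2; /* WQEs posted before WQE#3 */
	uint32_t shadow;

	if (valid_before_shadow_read)
		qp.valid[2] = true;   /* Case 2: valid bit set first */
	shadow = qp.hw_tail;          /* read shadow: hw tail = WQE#1 */

	/* HW keeps running: processes WQE#2, then checks WQE#3 */
	model_hw_step(&qp);
	model_hw_step(&qp);

	if (!valid_before_shadow_read)
		qp.valid[2] = true;   /* Case 1: valid bit lands too late */

	if (model_should_ring(shadow, prev_head))
		qp.hw_idle = false;   /* a doorbell would wake the HW */

	model_hw_step(&qp);
	return qp.hw_tail == 3;
}
```

In this model, model_post_third_wqe(false) replays Case 1 and leaves
WQE#3 unprocessed, while model_post_third_wqe(true) replays Case 2 and
WQE#3 is processed without a doorbell.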

This patch restores the full memory barrier required for the doorbell
optimization.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
---
 providers/irdma/uk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/providers/irdma/uk.c b/providers/irdma/uk.c
index c7053c52..d63996db 100644
--- a/providers/irdma/uk.c
+++ b/providers/irdma/uk.c
@@ -118,7 +118,7 @@ void irdma_uk_qp_post_wr(struct irdma_qp_uk *qp)
 	__u32 sw_sq_head;
 
 	/* valid bit is written and loads completed before reading shadow */
-	udma_to_device_barrier();
+	atomic_thread_fence(memory_order_seq_cst);
 
 	/* read the doorbell shadow area */
 	get_64bit_val(qp->shadow_area, 0, &temp);
-- 
2.27.0


Thread overview: 10+ messages
2021-08-05 16:11 [PATCH rdma-core] irdma: Restore full memory barrier for doorbell optimization Tatyana Nikolova
2021-08-06  1:28 ` Jason Gunthorpe
2021-08-09 20:07   ` Nikolova, Tatyana E
2021-08-10 11:59     ` Jason Gunthorpe
2021-08-13 22:25       ` Tatyana Nikolova
2021-08-18 16:49         ` Jason Gunthorpe
2021-08-19 22:01           ` Nikolova, Tatyana E
2021-08-19 22:44             ` Jason Gunthorpe
2021-09-02 16:27               ` Nikolova, Tatyana E
2021-09-02 17:09                 ` Jason Gunthorpe
