linux-rdma.vger.kernel.org archive mirror
* [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes.
@ 2021-09-09 20:44 Bob Pearson
  2021-09-09 20:44 ` [PATCH for-rc v3 1/6] RDMA/rxe: Add memory barriers to kernel queues Bob Pearson
                   ` (6 more replies)
  0 siblings, 7 replies; 22+ messages in thread
From: Bob Pearson @ 2021-09-09 20:44 UTC (permalink / raw)
  To: jgg, zyjzyj2000, linux-rdma, mie, bvanassche; +Cc: Bob Pearson

This series of patches implements several bug fixes and minor
cleanups of the rxe driver. Specifically, they fix a bug exposed
by blktests.

They apply cleanly to both
commit 2169b908894df2ce83e7eb4a399d3224b2635126 (origin/for-rc, for-rc)
commit 6a217437f9f5482a3f6f2dc5fcd27cf0f62409ac (HEAD -> for-next,
	origin/wip/jgg-for-next, origin/for-next, origin/HEAD)

These are being resubmitted to for-rc instead of for-next.

The v2 version had a typo which broke clean application to for-next.
In v3 the patches were also reordered to make the series a little
cleaner.

The first patch is a repeat of an earlier patch after rebasing.
It adds memory barriers to kernel-to-kernel queues. The logic is the
same as in an earlier patch that only covered user-to-kernel queues.
Without this patch, kernel-to-kernel queues are expected to fail
intermittently at low frequency, as was seen for the other queues.
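The pairing the patch relies on can be illustrated with a minimal userspace
sketch. This uses C11 atomics as stand-ins for the kernel's
smp_store_release()/smp_load_acquire(): the producer publishes its index with a
release store so the slot write cannot pass it, and the consumer reads the
peer's index with an acquire load. The ring below is illustrative only, not the
rxe driver's own code.

```c
#include <stdatomic.h>

/* Single-producer/single-consumer ring. Each side owns one index and
 * only reads the other side's index; release/acquire on the indices
 * orders the data accesses, as in the patched rxe queues.
 */
#define RING_SIZE 8			/* must be a power of two */
#define RING_MASK (RING_SIZE - 1)

struct ring {
	_Atomic unsigned int prod;
	_Atomic unsigned int cons;
	int data[RING_SIZE];
};

static int ring_push(struct ring *q, int val)
{
	unsigned int prod = atomic_load_explicit(&q->prod, memory_order_relaxed);
	unsigned int cons = atomic_load_explicit(&q->cons, memory_order_acquire);

	if (((prod + 1 - cons) & RING_MASK) == 0)
		return -1;		/* full */
	q->data[prod & RING_MASK] = val;
	/* publish the slot: the data write must not pass this store */
	atomic_store_explicit(&q->prod, (prod + 1) & RING_MASK,
			      memory_order_release);
	return 0;
}

static int ring_pop(struct ring *q, int *val)
{
	unsigned int cons = atomic_load_explicit(&q->cons, memory_order_relaxed);
	unsigned int prod = atomic_load_explicit(&q->prod, memory_order_acquire);

	if (((prod - cons) & RING_MASK) == 0)
		return -1;		/* empty */
	*val = q->data[cons & RING_MASK];
	atomic_store_explicit(&q->cons, (cons + 1) & RING_MASK,
			      memory_order_release);
	return 0;
}
```

Without the acquire/release pairing, a consumer could observe the advanced
producer index before the slot contents, which matches the intermittent
failures described above.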

The second patch is also a repeat after rebasing. It fixes a multicast
bug.

The third patch cleans up the state and type enums used by MRs.

The fourth patch separates the keys in rxe_mr and ib_mr. This allows
the following sequence seen in the srp driver to work correctly.

	do {
		ib_post_send( IB_WR_LOCAL_INV )
		ib_update_fast_reg_key()
		ib_map_mr_sg()
		ib_post_send( IB_WR_REG_MR )
	} while ( !done )
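The separation the fourth patch introduces can be modeled in a few lines.
In this toy model the caller-visible ib keys may run ahead (as
ib_update_fast_reg_key() changes them immediately), while the driver's private
keys, which the responder actually checks, only catch up when the REG_MR WQE
executes. All names here are hypothetical, not the driver's.

```c
#include <stdint.h>

/* Toy model of the key separation: ib_* keys are owned by the caller,
 * hw_* keys model the driver's private (simulated hardware) keys.
 */
struct toy_mr {
	uint32_t ib_lkey, ib_rkey;	/* caller-visible keys */
	uint32_t hw_lkey, hw_rkey;	/* keys the responder checks */
};

/* analogue of ib_update_fast_reg_key(): only the low byte of the
 * caller-visible keys changes; the hw keys are untouched
 */
static void toy_update_key(struct toy_mr *mr, uint8_t newkey)
{
	mr->ib_lkey = (mr->ib_lkey & ~0xffu) | newkey;
	mr->ib_rkey = (mr->ib_rkey & ~0xffu) | newkey;
}

/* analogue of executing a REG_MR WQE: the hw keys catch up */
static void toy_reg_mr(struct toy_mr *mr)
{
	mr->hw_lkey = mr->ib_lkey;
	mr->hw_rkey = mr->ib_rkey;
}
```

In the srp loop above, in-flight operations keep using the old hw keys until
the REG_MR work request completes, which is what makes the sequence safe.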

The fifth patch creates duplicate mapping tables for fast MRs. This
prevents rkeys referencing fast MRs from accessing data from an updated
map after the call to ib_map_mr_sg(), by keeping the new and old
mappings separate and atomically swapping them when a REG_MR WR is
executed.
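The double-buffering scheme can be sketched as follows: the map step fills a
staging table while lookups keep using the current one, and the commit step
(the REG_MR WR) swaps the two pointers atomically. This is a userspace
illustration with made-up names, not the patch's actual data structures.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Two page tables: 'cur' is what rkey lookups dereference, 'next' is
 * the staging table being filled by the map step.
 */
struct map_set {
	_Atomic(uint64_t *) cur;	/* table in use by lookups */
	uint64_t *next;			/* table being filled */
};

/* analogue of ib_map_mr_sg(): stage the new mapping; readers of 'cur'
 * are unaffected
 */
static void map_set_fill(struct map_set *s, const uint64_t *pages, int n)
{
	memcpy(s->next, pages, n * sizeof(*pages));
}

/* analogue of executing the REG_MR WR: publish the new table in one
 * atomic step and recycle the old table as the next staging buffer
 */
static void map_set_commit(struct map_set *s)
{
	s->next = atomic_exchange(&s->cur, s->next);
}
```

The point of the swap is that a lookup racing with map_set_fill() only ever
sees a complete table, either entirely old or entirely new.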

The sixth patch checks the type of MRs which receive local or remote
invalidate operations to prevent invalidating user MRs.

Bob Pearson (6):
  RDMA/rxe: Add memory barriers to kernel queues
  RDMA/rxe: Fix memory allocation while locked
  RDMA/rxe: Cleanup MR status and type enums
  RDMA/rxe: Separate HW and SW l/rkeys
  RDMA/rxe: Create duplicate mapping tables for FMRs
  RDMA/rxe: Only allow invalidate for appropriate MRs

 drivers/infiniband/sw/rxe/rxe_comp.c  |  10 +-
 drivers/infiniband/sw/rxe/rxe_cq.c    |  25 +--
 drivers/infiniband/sw/rxe/rxe_loc.h   |   2 +
 drivers/infiniband/sw/rxe/rxe_mcast.c |   2 +-
 drivers/infiniband/sw/rxe/rxe_mr.c    | 267 +++++++++++++++++++-------
 drivers/infiniband/sw/rxe/rxe_mw.c    |  36 ++--
 drivers/infiniband/sw/rxe/rxe_qp.c    |  10 +-
 drivers/infiniband/sw/rxe/rxe_queue.h |  73 ++-----
 drivers/infiniband/sw/rxe/rxe_req.c   |  35 +---
 drivers/infiniband/sw/rxe/rxe_resp.c  |  38 +---
 drivers/infiniband/sw/rxe/rxe_srq.c   |   2 +-
 drivers/infiniband/sw/rxe/rxe_verbs.c |  92 +++------
 drivers/infiniband/sw/rxe/rxe_verbs.h |  48 ++---
 13 files changed, 305 insertions(+), 335 deletions(-)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH for-rc v3 1/6] RDMA/rxe: Add memory barriers to kernel queues
  2021-09-09 20:44 [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes Bob Pearson
@ 2021-09-09 20:44 ` Bob Pearson
  2021-09-10  1:19   ` Zhu Yanjun
  2021-09-14  6:04   ` Re: " yangx.jy
  2021-09-09 20:44 ` [PATCH for-rc v3 2/6] RDMA/rxe: Fix memory allocation while locked Bob Pearson
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 22+ messages in thread
From: Bob Pearson @ 2021-09-09 20:44 UTC (permalink / raw)
  To: jgg, zyjzyj2000, linux-rdma, mie, bvanassche; +Cc: Bob Pearson

Earlier patches added memory barriers to protect user space to kernel
space communications. The user space queues were previously shown to
have occasional memory synchronization errors, which were removed by
adding smp_load_acquire() and smp_store_release() barriers.

This patch extends that to the case where queues are used between kernel
space threads.

Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_comp.c  | 10 +---
 drivers/infiniband/sw/rxe/rxe_cq.c    | 25 ++-------
 drivers/infiniband/sw/rxe/rxe_qp.c    | 10 ++--
 drivers/infiniband/sw/rxe/rxe_queue.h | 73 ++++++++-------------------
 drivers/infiniband/sw/rxe/rxe_req.c   | 21 ++------
 drivers/infiniband/sw/rxe/rxe_resp.c  | 38 ++++----------
 drivers/infiniband/sw/rxe/rxe_srq.c   |  2 +-
 drivers/infiniband/sw/rxe/rxe_verbs.c | 53 ++++---------------
 8 files changed, 55 insertions(+), 177 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
index d2d802c776fd..ed4e3f29bd65 100644
--- a/drivers/infiniband/sw/rxe/rxe_comp.c
+++ b/drivers/infiniband/sw/rxe/rxe_comp.c
@@ -142,10 +142,7 @@ static inline enum comp_state get_wqe(struct rxe_qp *qp,
 	/* we come here whether or not we found a response packet to see if
 	 * there are any posted WQEs
 	 */
-	if (qp->is_user)
-		wqe = queue_head(qp->sq.queue, QUEUE_TYPE_FROM_USER);
-	else
-		wqe = queue_head(qp->sq.queue, QUEUE_TYPE_KERNEL);
+	wqe = queue_head(qp->sq.queue, QUEUE_TYPE_FROM_CLIENT);
 	*wqe_p = wqe;
 
 	/* no WQE or requester has not started it yet */
@@ -432,10 +429,7 @@ static void do_complete(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
 	if (post)
 		make_send_cqe(qp, wqe, &cqe);
 
-	if (qp->is_user)
-		advance_consumer(qp->sq.queue, QUEUE_TYPE_FROM_USER);
-	else
-		advance_consumer(qp->sq.queue, QUEUE_TYPE_KERNEL);
+	advance_consumer(qp->sq.queue, QUEUE_TYPE_FROM_CLIENT);
 
 	if (post)
 		rxe_cq_post(qp->scq, &cqe, 0);
diff --git a/drivers/infiniband/sw/rxe/rxe_cq.c b/drivers/infiniband/sw/rxe/rxe_cq.c
index aef288f164fd..4e26c2ea4a59 100644
--- a/drivers/infiniband/sw/rxe/rxe_cq.c
+++ b/drivers/infiniband/sw/rxe/rxe_cq.c
@@ -25,11 +25,7 @@ int rxe_cq_chk_attr(struct rxe_dev *rxe, struct rxe_cq *cq,
 	}
 
 	if (cq) {
-		if (cq->is_user)
-			count = queue_count(cq->queue, QUEUE_TYPE_TO_USER);
-		else
-			count = queue_count(cq->queue, QUEUE_TYPE_KERNEL);
-
+		count = queue_count(cq->queue, QUEUE_TYPE_TO_CLIENT);
 		if (cqe < count) {
 			pr_warn("cqe(%d) < current # elements in queue (%d)",
 				cqe, count);
@@ -65,7 +61,7 @@ int rxe_cq_from_init(struct rxe_dev *rxe, struct rxe_cq *cq, int cqe,
 	int err;
 	enum queue_type type;
 
-	type = uresp ? QUEUE_TYPE_TO_USER : QUEUE_TYPE_KERNEL;
+	type = QUEUE_TYPE_TO_CLIENT;
 	cq->queue = rxe_queue_init(rxe, &cqe,
 			sizeof(struct rxe_cqe), type);
 	if (!cq->queue) {
@@ -117,11 +113,7 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
 
 	spin_lock_irqsave(&cq->cq_lock, flags);
 
-	if (cq->is_user)
-		full = queue_full(cq->queue, QUEUE_TYPE_TO_USER);
-	else
-		full = queue_full(cq->queue, QUEUE_TYPE_KERNEL);
-
+	full = queue_full(cq->queue, QUEUE_TYPE_TO_CLIENT);
 	if (unlikely(full)) {
 		spin_unlock_irqrestore(&cq->cq_lock, flags);
 		if (cq->ibcq.event_handler) {
@@ -134,17 +126,10 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
 		return -EBUSY;
 	}
 
-	if (cq->is_user)
-		addr = producer_addr(cq->queue, QUEUE_TYPE_TO_USER);
-	else
-		addr = producer_addr(cq->queue, QUEUE_TYPE_KERNEL);
-
+	addr = producer_addr(cq->queue, QUEUE_TYPE_TO_CLIENT);
 	memcpy(addr, cqe, sizeof(*cqe));
 
-	if (cq->is_user)
-		advance_producer(cq->queue, QUEUE_TYPE_TO_USER);
-	else
-		advance_producer(cq->queue, QUEUE_TYPE_KERNEL);
+	advance_producer(cq->queue, QUEUE_TYPE_TO_CLIENT);
 
 	spin_unlock_irqrestore(&cq->cq_lock, flags);
 
diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
index 1ab6af7ddb25..2e923af642f8 100644
--- a/drivers/infiniband/sw/rxe/rxe_qp.c
+++ b/drivers/infiniband/sw/rxe/rxe_qp.c
@@ -231,7 +231,7 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
 	qp->sq.max_inline = init->cap.max_inline_data = wqe_size;
 	wqe_size += sizeof(struct rxe_send_wqe);
 
-	type = uresp ? QUEUE_TYPE_FROM_USER : QUEUE_TYPE_KERNEL;
+	type = QUEUE_TYPE_FROM_CLIENT;
 	qp->sq.queue = rxe_queue_init(rxe, &qp->sq.max_wr,
 				wqe_size, type);
 	if (!qp->sq.queue)
@@ -248,12 +248,8 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
 		return err;
 	}
 
-	if (qp->is_user)
 		qp->req.wqe_index = producer_index(qp->sq.queue,
-						QUEUE_TYPE_FROM_USER);
-	else
-		qp->req.wqe_index = producer_index(qp->sq.queue,
-						QUEUE_TYPE_KERNEL);
+					QUEUE_TYPE_FROM_CLIENT);
 
 	qp->req.state		= QP_STATE_RESET;
 	qp->req.opcode		= -1;
@@ -293,7 +289,7 @@ static int rxe_qp_init_resp(struct rxe_dev *rxe, struct rxe_qp *qp,
 		pr_debug("qp#%d max_wr = %d, max_sge = %d, wqe_size = %d\n",
 			 qp_num(qp), qp->rq.max_wr, qp->rq.max_sge, wqe_size);
 
-		type = uresp ? QUEUE_TYPE_FROM_USER : QUEUE_TYPE_KERNEL;
+		type = QUEUE_TYPE_FROM_CLIENT;
 		qp->rq.queue = rxe_queue_init(rxe, &qp->rq.max_wr,
 					wqe_size, type);
 		if (!qp->rq.queue)
diff --git a/drivers/infiniband/sw/rxe/rxe_queue.h b/drivers/infiniband/sw/rxe/rxe_queue.h
index 2702b0e55fc3..d465aa9342e1 100644
--- a/drivers/infiniband/sw/rxe/rxe_queue.h
+++ b/drivers/infiniband/sw/rxe/rxe_queue.h
@@ -35,9 +35,8 @@
 
 /* type of queue */
 enum queue_type {
-	QUEUE_TYPE_KERNEL,
-	QUEUE_TYPE_TO_USER,
-	QUEUE_TYPE_FROM_USER,
+	QUEUE_TYPE_TO_CLIENT,
+	QUEUE_TYPE_FROM_CLIENT,
 };
 
 struct rxe_queue {
@@ -87,20 +86,16 @@ static inline int queue_empty(struct rxe_queue *q, enum queue_type type)
 	u32 cons;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		/* protect user space index */
 		prod = smp_load_acquire(&q->buf->producer_index);
 		cons = q->index;
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		prod = q->index;
 		/* protect user space index */
 		cons = smp_load_acquire(&q->buf->consumer_index);
 		break;
-	case QUEUE_TYPE_KERNEL:
-		prod = q->buf->producer_index;
-		cons = q->buf->consumer_index;
-		break;
 	}
 
 	return ((prod - cons) & q->index_mask) == 0;
@@ -112,20 +107,16 @@ static inline int queue_full(struct rxe_queue *q, enum queue_type type)
 	u32 cons;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		/* protect user space index */
 		prod = smp_load_acquire(&q->buf->producer_index);
 		cons = q->index;
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		prod = q->index;
 		/* protect user space index */
 		cons = smp_load_acquire(&q->buf->consumer_index);
 		break;
-	case QUEUE_TYPE_KERNEL:
-		prod = q->buf->producer_index;
-		cons = q->buf->consumer_index;
-		break;
 	}
 
 	return ((prod + 1 - cons) & q->index_mask) == 0;
@@ -138,20 +129,16 @@ static inline unsigned int queue_count(const struct rxe_queue *q,
 	u32 cons;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		/* protect user space index */
 		prod = smp_load_acquire(&q->buf->producer_index);
 		cons = q->index;
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		prod = q->index;
 		/* protect user space index */
 		cons = smp_load_acquire(&q->buf->consumer_index);
 		break;
-	case QUEUE_TYPE_KERNEL:
-		prod = q->buf->producer_index;
-		cons = q->buf->consumer_index;
-		break;
 	}
 
 	return (prod - cons) & q->index_mask;
@@ -162,7 +149,7 @@ static inline void advance_producer(struct rxe_queue *q, enum queue_type type)
 	u32 prod;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		pr_warn_once("Normally kernel should not write user space index\n");
 		/* protect user space index */
 		prod = smp_load_acquire(&q->buf->producer_index);
@@ -170,15 +157,11 @@ static inline void advance_producer(struct rxe_queue *q, enum queue_type type)
 		/* same */
 		smp_store_release(&q->buf->producer_index, prod);
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		prod = q->index;
 		q->index = (prod + 1) & q->index_mask;
 		q->buf->producer_index = q->index;
 		break;
-	case QUEUE_TYPE_KERNEL:
-		prod = q->buf->producer_index;
-		q->buf->producer_index = (prod + 1) & q->index_mask;
-		break;
 	}
 }
 
@@ -187,12 +170,12 @@ static inline void advance_consumer(struct rxe_queue *q, enum queue_type type)
 	u32 cons;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		cons = q->index;
 		q->index = (cons + 1) & q->index_mask;
 		q->buf->consumer_index = q->index;
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		pr_warn_once("Normally kernel should not write user space index\n");
 		/* protect user space index */
 		cons = smp_load_acquire(&q->buf->consumer_index);
@@ -200,10 +183,6 @@ static inline void advance_consumer(struct rxe_queue *q, enum queue_type type)
 		/* same */
 		smp_store_release(&q->buf->consumer_index, cons);
 		break;
-	case QUEUE_TYPE_KERNEL:
-		cons = q->buf->consumer_index;
-		q->buf->consumer_index = (cons + 1) & q->index_mask;
-		break;
 	}
 }
 
@@ -212,17 +191,14 @@ static inline void *producer_addr(struct rxe_queue *q, enum queue_type type)
 	u32 prod;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		/* protect user space index */
 		prod = smp_load_acquire(&q->buf->producer_index);
 		prod &= q->index_mask;
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		prod = q->index;
 		break;
-	case QUEUE_TYPE_KERNEL:
-		prod = q->buf->producer_index;
-		break;
 	}
 
 	return q->buf->data + (prod << q->log2_elem_size);
@@ -233,17 +209,14 @@ static inline void *consumer_addr(struct rxe_queue *q, enum queue_type type)
 	u32 cons;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		cons = q->index;
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		/* protect user space index */
 		cons = smp_load_acquire(&q->buf->consumer_index);
 		cons &= q->index_mask;
 		break;
-	case QUEUE_TYPE_KERNEL:
-		cons = q->buf->consumer_index;
-		break;
 	}
 
 	return q->buf->data + (cons << q->log2_elem_size);
@@ -255,17 +228,14 @@ static inline unsigned int producer_index(struct rxe_queue *q,
 	u32 prod;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		/* protect user space index */
 		prod = smp_load_acquire(&q->buf->producer_index);
 		prod &= q->index_mask;
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		prod = q->index;
 		break;
-	case QUEUE_TYPE_KERNEL:
-		prod = q->buf->producer_index;
-		break;
 	}
 
 	return prod;
@@ -277,17 +247,14 @@ static inline unsigned int consumer_index(struct rxe_queue *q,
 	u32 cons;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		cons = q->index;
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		/* protect user space index */
 		cons = smp_load_acquire(&q->buf->consumer_index);
 		cons &= q->index_mask;
 		break;
-	case QUEUE_TYPE_KERNEL:
-		cons = q->buf->consumer_index;
-		break;
 	}
 
 	return cons;
diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
index 3894197a82f6..22c3edb28945 100644
--- a/drivers/infiniband/sw/rxe/rxe_req.c
+++ b/drivers/infiniband/sw/rxe/rxe_req.c
@@ -49,13 +49,8 @@ static void req_retry(struct rxe_qp *qp)
 	unsigned int cons;
 	unsigned int prod;
 
-	if (qp->is_user) {
-		cons = consumer_index(q, QUEUE_TYPE_FROM_USER);
-		prod = producer_index(q, QUEUE_TYPE_FROM_USER);
-	} else {
-		cons = consumer_index(q, QUEUE_TYPE_KERNEL);
-		prod = producer_index(q, QUEUE_TYPE_KERNEL);
-	}
+	cons = consumer_index(q, QUEUE_TYPE_FROM_CLIENT);
+	prod = producer_index(q, QUEUE_TYPE_FROM_CLIENT);
 
 	qp->req.wqe_index	= cons;
 	qp->req.psn		= qp->comp.psn;
@@ -121,15 +116,9 @@ static struct rxe_send_wqe *req_next_wqe(struct rxe_qp *qp)
 	unsigned int cons;
 	unsigned int prod;
 
-	if (qp->is_user) {
-		wqe = queue_head(q, QUEUE_TYPE_FROM_USER);
-		cons = consumer_index(q, QUEUE_TYPE_FROM_USER);
-		prod = producer_index(q, QUEUE_TYPE_FROM_USER);
-	} else {
-		wqe = queue_head(q, QUEUE_TYPE_KERNEL);
-		cons = consumer_index(q, QUEUE_TYPE_KERNEL);
-		prod = producer_index(q, QUEUE_TYPE_KERNEL);
-	}
+	wqe = queue_head(q, QUEUE_TYPE_FROM_CLIENT);
+	cons = consumer_index(q, QUEUE_TYPE_FROM_CLIENT);
+	prod = producer_index(q, QUEUE_TYPE_FROM_CLIENT);
 
 	if (unlikely(qp->req.state == QP_STATE_DRAIN)) {
 		/* check to see if we are drained;
diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index 5501227ddc65..596be002d33d 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -303,10 +303,7 @@ static enum resp_states get_srq_wqe(struct rxe_qp *qp)
 
 	spin_lock_bh(&srq->rq.consumer_lock);
 
-	if (qp->is_user)
-		wqe = queue_head(q, QUEUE_TYPE_FROM_USER);
-	else
-		wqe = queue_head(q, QUEUE_TYPE_KERNEL);
+	wqe = queue_head(q, QUEUE_TYPE_FROM_CLIENT);
 	if (!wqe) {
 		spin_unlock_bh(&srq->rq.consumer_lock);
 		return RESPST_ERR_RNR;
@@ -322,13 +319,8 @@ static enum resp_states get_srq_wqe(struct rxe_qp *qp)
 	memcpy(&qp->resp.srq_wqe, wqe, size);
 
 	qp->resp.wqe = &qp->resp.srq_wqe.wqe;
-	if (qp->is_user) {
-		advance_consumer(q, QUEUE_TYPE_FROM_USER);
-		count = queue_count(q, QUEUE_TYPE_FROM_USER);
-	} else {
-		advance_consumer(q, QUEUE_TYPE_KERNEL);
-		count = queue_count(q, QUEUE_TYPE_KERNEL);
-	}
+	advance_consumer(q, QUEUE_TYPE_FROM_CLIENT);
+	count = queue_count(q, QUEUE_TYPE_FROM_CLIENT);
 
 	if (srq->limit && srq->ibsrq.event_handler && (count < srq->limit)) {
 		srq->limit = 0;
@@ -357,12 +349,8 @@ static enum resp_states check_resource(struct rxe_qp *qp,
 			qp->resp.status = IB_WC_WR_FLUSH_ERR;
 			return RESPST_COMPLETE;
 		} else if (!srq) {
-			if (qp->is_user)
-				qp->resp.wqe = queue_head(qp->rq.queue,
-						QUEUE_TYPE_FROM_USER);
-			else
-				qp->resp.wqe = queue_head(qp->rq.queue,
-						QUEUE_TYPE_KERNEL);
+			qp->resp.wqe = queue_head(qp->rq.queue,
+					QUEUE_TYPE_FROM_CLIENT);
 			if (qp->resp.wqe) {
 				qp->resp.status = IB_WC_WR_FLUSH_ERR;
 				return RESPST_COMPLETE;
@@ -389,12 +377,8 @@ static enum resp_states check_resource(struct rxe_qp *qp,
 		if (srq)
 			return get_srq_wqe(qp);
 
-		if (qp->is_user)
-			qp->resp.wqe = queue_head(qp->rq.queue,
-					QUEUE_TYPE_FROM_USER);
-		else
-			qp->resp.wqe = queue_head(qp->rq.queue,
-					QUEUE_TYPE_KERNEL);
+		qp->resp.wqe = queue_head(qp->rq.queue,
+				QUEUE_TYPE_FROM_CLIENT);
 		return (qp->resp.wqe) ? RESPST_CHK_LENGTH : RESPST_ERR_RNR;
 	}
 
@@ -936,12 +920,8 @@ static enum resp_states do_complete(struct rxe_qp *qp,
 	}
 
 	/* have copy for srq and reference for !srq */
-	if (!qp->srq) {
-		if (qp->is_user)
-			advance_consumer(qp->rq.queue, QUEUE_TYPE_FROM_USER);
-		else
-			advance_consumer(qp->rq.queue, QUEUE_TYPE_KERNEL);
-	}
+	if (!qp->srq)
+		advance_consumer(qp->rq.queue, QUEUE_TYPE_FROM_CLIENT);
 
 	qp->resp.wqe = NULL;
 
diff --git a/drivers/infiniband/sw/rxe/rxe_srq.c b/drivers/infiniband/sw/rxe/rxe_srq.c
index 610c98d24b5c..a9e7817e2732 100644
--- a/drivers/infiniband/sw/rxe/rxe_srq.c
+++ b/drivers/infiniband/sw/rxe/rxe_srq.c
@@ -93,7 +93,7 @@ int rxe_srq_from_init(struct rxe_dev *rxe, struct rxe_srq *srq,
 	spin_lock_init(&srq->rq.producer_lock);
 	spin_lock_init(&srq->rq.consumer_lock);
 
-	type = uresp ? QUEUE_TYPE_FROM_USER : QUEUE_TYPE_KERNEL;
+	type = QUEUE_TYPE_FROM_CLIENT;
 	q = rxe_queue_init(rxe, &srq->rq.max_wr,
 			srq_wqe_size, type);
 	if (!q) {
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 267b5a9c345d..dc70e3edeba6 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -218,11 +218,7 @@ static int post_one_recv(struct rxe_rq *rq, const struct ib_recv_wr *ibwr)
 	int num_sge = ibwr->num_sge;
 	int full;
 
-	if (rq->is_user)
-		full = queue_full(rq->queue, QUEUE_TYPE_FROM_USER);
-	else
-		full = queue_full(rq->queue, QUEUE_TYPE_KERNEL);
-
+	full = queue_full(rq->queue, QUEUE_TYPE_FROM_CLIENT);
 	if (unlikely(full)) {
 		err = -ENOMEM;
 		goto err1;
@@ -237,11 +233,7 @@ static int post_one_recv(struct rxe_rq *rq, const struct ib_recv_wr *ibwr)
 	for (i = 0; i < num_sge; i++)
 		length += ibwr->sg_list[i].length;
 
-	if (rq->is_user)
-		recv_wqe = producer_addr(rq->queue, QUEUE_TYPE_FROM_USER);
-	else
-		recv_wqe = producer_addr(rq->queue, QUEUE_TYPE_KERNEL);
-
+	recv_wqe = producer_addr(rq->queue, QUEUE_TYPE_FROM_CLIENT);
 	recv_wqe->wr_id = ibwr->wr_id;
 	recv_wqe->num_sge = num_sge;
 
@@ -254,10 +246,7 @@ static int post_one_recv(struct rxe_rq *rq, const struct ib_recv_wr *ibwr)
 	recv_wqe->dma.cur_sge		= 0;
 	recv_wqe->dma.sge_offset	= 0;
 
-	if (rq->is_user)
-		advance_producer(rq->queue, QUEUE_TYPE_FROM_USER);
-	else
-		advance_producer(rq->queue, QUEUE_TYPE_KERNEL);
+	advance_producer(rq->queue, QUEUE_TYPE_FROM_CLIENT);
 
 	return 0;
 
@@ -633,27 +622,17 @@ static int post_one_send(struct rxe_qp *qp, const struct ib_send_wr *ibwr,
 
 	spin_lock_irqsave(&qp->sq.sq_lock, flags);
 
-	if (qp->is_user)
-		full = queue_full(sq->queue, QUEUE_TYPE_FROM_USER);
-	else
-		full = queue_full(sq->queue, QUEUE_TYPE_KERNEL);
+	full = queue_full(sq->queue, QUEUE_TYPE_FROM_CLIENT);
 
 	if (unlikely(full)) {
 		spin_unlock_irqrestore(&qp->sq.sq_lock, flags);
 		return -ENOMEM;
 	}
 
-	if (qp->is_user)
-		send_wqe = producer_addr(sq->queue, QUEUE_TYPE_FROM_USER);
-	else
-		send_wqe = producer_addr(sq->queue, QUEUE_TYPE_KERNEL);
-
+	send_wqe = producer_addr(sq->queue, QUEUE_TYPE_FROM_CLIENT);
 	init_send_wqe(qp, ibwr, mask, length, send_wqe);
 
-	if (qp->is_user)
-		advance_producer(sq->queue, QUEUE_TYPE_FROM_USER);
-	else
-		advance_producer(sq->queue, QUEUE_TYPE_KERNEL);
+	advance_producer(sq->queue, QUEUE_TYPE_FROM_CLIENT);
 
 	spin_unlock_irqrestore(&qp->sq.sq_lock, flags);
 
@@ -845,18 +824,12 @@ static int rxe_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc)
 
 	spin_lock_irqsave(&cq->cq_lock, flags);
 	for (i = 0; i < num_entries; i++) {
-		if (cq->is_user)
-			cqe = queue_head(cq->queue, QUEUE_TYPE_TO_USER);
-		else
-			cqe = queue_head(cq->queue, QUEUE_TYPE_KERNEL);
+		cqe = queue_head(cq->queue, QUEUE_TYPE_TO_CLIENT);
 		if (!cqe)
 			break;
 
 		memcpy(wc++, &cqe->ibwc, sizeof(*wc));
-		if (cq->is_user)
-			advance_consumer(cq->queue, QUEUE_TYPE_TO_USER);
-		else
-			advance_consumer(cq->queue, QUEUE_TYPE_KERNEL);
+		advance_consumer(cq->queue, QUEUE_TYPE_TO_CLIENT);
 	}
 	spin_unlock_irqrestore(&cq->cq_lock, flags);
 
@@ -868,10 +841,7 @@ static int rxe_peek_cq(struct ib_cq *ibcq, int wc_cnt)
 	struct rxe_cq *cq = to_rcq(ibcq);
 	int count;
 
-	if (cq->is_user)
-		count = queue_count(cq->queue, QUEUE_TYPE_TO_USER);
-	else
-		count = queue_count(cq->queue, QUEUE_TYPE_KERNEL);
+	count = queue_count(cq->queue, QUEUE_TYPE_TO_CLIENT);
 
 	return (count > wc_cnt) ? wc_cnt : count;
 }
@@ -887,10 +857,7 @@ static int rxe_req_notify_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags)
 	if (cq->notify != IB_CQ_NEXT_COMP)
 		cq->notify = flags & IB_CQ_SOLICITED_MASK;
 
-	if (cq->is_user)
-		empty = queue_empty(cq->queue, QUEUE_TYPE_TO_USER);
-	else
-		empty = queue_empty(cq->queue, QUEUE_TYPE_KERNEL);
+	empty = queue_empty(cq->queue, QUEUE_TYPE_TO_CLIENT);
 
 	if ((flags & IB_CQ_REPORT_MISSED_EVENTS) && !empty)
 		ret = 1;
-- 
2.30.2



* [PATCH for-rc v3 2/6] RDMA/rxe: Fix memory allocation while locked
  2021-09-09 20:44 [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes Bob Pearson
  2021-09-09 20:44 ` [PATCH for-rc v3 1/6] RDMA/rxe: Add memory barriers to kernel queues Bob Pearson
@ 2021-09-09 20:44 ` Bob Pearson
  2021-09-09 20:44 ` [PATCH for-rc v3 3/6] RDMA/rxe: Cleanup MR status and type enums Bob Pearson
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 22+ messages in thread
From: Bob Pearson @ 2021-09-09 20:44 UTC (permalink / raw)
  To: jgg, zyjzyj2000, linux-rdma, mie, bvanassche; +Cc: Bob Pearson, Dan Carpenter

rxe_mcast_add_grp_elem() in rxe_mcast.c calls rxe_alloc() while holding
spinlocks, but rxe_alloc() in turn calls kzalloc(size, GFP_KERNEL),
which may sleep and is therefore incorrect in atomic context. This
patch replaces rxe_alloc() with rxe_alloc_locked(), which uses
GFP_ATOMIC. The bug was introduced by the commit mentioned below, which
failed to handle the need for an atomic allocation.

Fixes: 4276fd0dddc9 ("Remove RXE_POOL_ATOMIC")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_mcast.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_mcast.c b/drivers/infiniband/sw/rxe/rxe_mcast.c
index 0ea9a5aa4ec0..1c1d1b53312d 100644
--- a/drivers/infiniband/sw/rxe/rxe_mcast.c
+++ b/drivers/infiniband/sw/rxe/rxe_mcast.c
@@ -85,7 +85,7 @@ int rxe_mcast_add_grp_elem(struct rxe_dev *rxe, struct rxe_qp *qp,
 		goto out;
 	}
 
-	elem = rxe_alloc(&rxe->mc_elem_pool);
+	elem = rxe_alloc_locked(&rxe->mc_elem_pool);
 	if (!elem) {
 		err = -ENOMEM;
 		goto out;
-- 
2.30.2



* [PATCH for-rc v3 3/6] RDMA/rxe: Cleanup MR status and type enums
  2021-09-09 20:44 [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes Bob Pearson
  2021-09-09 20:44 ` [PATCH for-rc v3 1/6] RDMA/rxe: Add memory barriers to kernel queues Bob Pearson
  2021-09-09 20:44 ` [PATCH for-rc v3 2/6] RDMA/rxe: Fix memory allocation while locked Bob Pearson
@ 2021-09-09 20:44 ` Bob Pearson
  2021-09-09 20:44 ` [PATCH for-rc v3 4/6] RDMA/rxe: Separate HW and SW l/rkeys Bob Pearson
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 22+ messages in thread
From: Bob Pearson @ 2021-09-09 20:44 UTC (permalink / raw)
  To: jgg, zyjzyj2000, linux-rdma, mie, bvanassche; +Cc: Bob Pearson

Eliminate RXE_MR_STATE_ZOMBIE, which is not compatible with the IBA;
RXE_MR_STATE_INVALID is better.

Replace RXE_MR_TYPE_XXX with IB_MR_TYPE_XXX, which covers all the
needed types.

Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_mr.c    | 20 ++++++++++++--------
 drivers/infiniband/sw/rxe/rxe_verbs.h |  9 +--------
 2 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index 5890a8246216..0cc24154762c 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -24,17 +24,22 @@ u8 rxe_get_next_key(u32 last_key)
 
 int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length)
 {
+
+
 	switch (mr->type) {
-	case RXE_MR_TYPE_DMA:
+	case IB_MR_TYPE_DMA:
 		return 0;
 
-	case RXE_MR_TYPE_MR:
+	case IB_MR_TYPE_USER:
+	case IB_MR_TYPE_MEM_REG:
 		if (iova < mr->iova || length > mr->length ||
 		    iova > mr->iova + mr->length - length)
 			return -EFAULT;
 		return 0;
 
 	default:
+		pr_warn("%s: mr type (%d) not supported\n",
+			__func__, mr->type);
 		return -EFAULT;
 	}
 }
@@ -51,7 +56,6 @@ static void rxe_mr_init(int access, struct rxe_mr *mr)
 	mr->ibmr.lkey = lkey;
 	mr->ibmr.rkey = rkey;
 	mr->state = RXE_MR_STATE_INVALID;
-	mr->type = RXE_MR_TYPE_NONE;
 	mr->map_shift = ilog2(RXE_BUF_PER_MAP);
 }
 
@@ -100,7 +104,7 @@ void rxe_mr_init_dma(struct rxe_pd *pd, int access, struct rxe_mr *mr)
 	mr->ibmr.pd = &pd->ibpd;
 	mr->access = access;
 	mr->state = RXE_MR_STATE_VALID;
-	mr->type = RXE_MR_TYPE_DMA;
+	mr->type = IB_MR_TYPE_DMA;
 }
 
 int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova,
@@ -173,7 +177,7 @@ int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova,
 	mr->va = start;
 	mr->offset = ib_umem_offset(umem);
 	mr->state = RXE_MR_STATE_VALID;
-	mr->type = RXE_MR_TYPE_MR;
+	mr->type = IB_MR_TYPE_USER;
 
 	return 0;
 
@@ -203,7 +207,7 @@ int rxe_mr_init_fast(struct rxe_pd *pd, int max_pages, struct rxe_mr *mr)
 	mr->ibmr.pd = &pd->ibpd;
 	mr->max_buf = max_pages;
 	mr->state = RXE_MR_STATE_FREE;
-	mr->type = RXE_MR_TYPE_MR;
+	mr->type = IB_MR_TYPE_MEM_REG;
 
 	return 0;
 
@@ -302,7 +306,7 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length,
 	if (length == 0)
 		return 0;
 
-	if (mr->type == RXE_MR_TYPE_DMA) {
+	if (mr->type == IB_MR_TYPE_DMA) {
 		u8 *src, *dest;
 
 		src = (dir == RXE_TO_MR_OBJ) ? addr : ((void *)(uintptr_t)iova);
@@ -564,7 +568,7 @@ int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
 		return -EINVAL;
 	}
 
-	mr->state = RXE_MR_STATE_ZOMBIE;
+	mr->state = RXE_MR_STATE_INVALID;
 	rxe_drop_ref(mr_pd(mr));
 	rxe_drop_index(mr);
 	rxe_drop_ref(mr);
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
index ac2a2148027f..c6aca2293294 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -267,18 +267,11 @@ struct rxe_qp {
 };
 
 enum rxe_mr_state {
-	RXE_MR_STATE_ZOMBIE,
 	RXE_MR_STATE_INVALID,
 	RXE_MR_STATE_FREE,
 	RXE_MR_STATE_VALID,
 };
 
-enum rxe_mr_type {
-	RXE_MR_TYPE_NONE,
-	RXE_MR_TYPE_DMA,
-	RXE_MR_TYPE_MR,
-};
-
 enum rxe_mr_copy_dir {
 	RXE_TO_MR_OBJ,
 	RXE_FROM_MR_OBJ,
@@ -314,7 +307,7 @@ struct rxe_mr {
 	struct ib_umem		*umem;
 
 	enum rxe_mr_state	state;
-	enum rxe_mr_type	type;
+	enum ib_mr_type		type;
 	u64			va;
 	u64			iova;
 	size_t			length;
-- 
2.30.2



* [PATCH for-rc v3 4/6] RDMA/rxe: Separate HW and SW l/rkeys
  2021-09-09 20:44 [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes Bob Pearson
                   ` (2 preceding siblings ...)
  2021-09-09 20:44 ` [PATCH for-rc v3 3/6] RDMA/rxe: Cleanup MR status and type enums Bob Pearson
@ 2021-09-09 20:44 ` Bob Pearson
  2021-09-09 20:44 ` [PATCH for-rc v3 5/6] RDMA/rxe: Create duplicate mapping tables for FMRs Bob Pearson
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 22+ messages in thread
From: Bob Pearson @ 2021-09-09 20:44 UTC (permalink / raw)
  To: jgg, zyjzyj2000, linux-rdma, mie, bvanassche; +Cc: Bob Pearson

Separate the software and simulated hardware lkeys and rkeys for MRs
and MWs. This isolates struct ib_mr and struct ib_mw from the hardware
changes triggered by executing work requests.

This change fixes a bug seen in blktests.

Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_loc.h   |  1 +
 drivers/infiniband/sw/rxe/rxe_mr.c    | 69 ++++++++++++++++++++++-----
 drivers/infiniband/sw/rxe/rxe_mw.c    | 30 ++++++------
 drivers/infiniband/sw/rxe/rxe_req.c   | 14 ++----
 drivers/infiniband/sw/rxe/rxe_verbs.h | 18 ++-----
 5 files changed, 81 insertions(+), 51 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h
index f0c954575bde..4fd73b51fabf 100644
--- a/drivers/infiniband/sw/rxe/rxe_loc.h
+++ b/drivers/infiniband/sw/rxe/rxe_loc.h
@@ -86,6 +86,7 @@ struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 key,
 int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length);
 int advance_dma_data(struct rxe_dma_info *dma, unsigned int length);
 int rxe_invalidate_mr(struct rxe_qp *qp, u32 rkey);
+int rxe_reg_fast_mr(struct rxe_qp *qp, struct rxe_send_wqe *wqe);
 int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata);
 void rxe_mr_cleanup(struct rxe_pool_entry *arg);
 
diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index 0cc24154762c..370212801abc 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -53,8 +53,14 @@ static void rxe_mr_init(int access, struct rxe_mr *mr)
 	u32 lkey = mr->pelem.index << 8 | rxe_get_next_key(-1);
 	u32 rkey = (access & IB_ACCESS_REMOTE) ? lkey : 0;
 
-	mr->ibmr.lkey = lkey;
-	mr->ibmr.rkey = rkey;
+	/* Set ibmr->lkey/rkey and also copy into the private lkey/rkey.
+	 * For user MRs these will always be the same. In cases where
+	 * the caller 'owns' the key portion they may differ until the
+	 * REG_MR WQE is executed.
+	 */
+	mr->lkey = mr->ibmr.lkey = lkey;
+	mr->rkey = mr->ibmr.rkey = rkey;
+
 	mr->state = RXE_MR_STATE_INVALID;
 	mr->map_shift = ilog2(RXE_BUF_PER_MAP);
 }
@@ -195,10 +201,8 @@ int rxe_mr_init_fast(struct rxe_pd *pd, int max_pages, struct rxe_mr *mr)
 {
 	int err;
 
-	rxe_mr_init(0, mr);
-
-	/* In fastreg, we also set the rkey */
-	mr->ibmr.rkey = mr->ibmr.lkey;
+	/* always allow remote access for FMRs */
+	rxe_mr_init(IB_ACCESS_REMOTE, mr);
 
 	err = rxe_mr_alloc(mr, max_pages);
 	if (err)
@@ -511,8 +515,8 @@ struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 key,
 	if (!mr)
 		return NULL;
 
-	if (unlikely((type == RXE_LOOKUP_LOCAL && mr_lkey(mr) != key) ||
-		     (type == RXE_LOOKUP_REMOTE && mr_rkey(mr) != key) ||
+	if (unlikely((type == RXE_LOOKUP_LOCAL && mr->lkey != key) ||
+		     (type == RXE_LOOKUP_REMOTE && mr->rkey != key) ||
 		     mr_pd(mr) != pd || (access && !(access & mr->access)) ||
 		     mr->state != RXE_MR_STATE_VALID)) {
 		rxe_drop_ref(mr);
@@ -535,9 +539,9 @@ int rxe_invalidate_mr(struct rxe_qp *qp, u32 rkey)
 		goto err;
 	}
 
-	if (rkey != mr->ibmr.rkey) {
-		pr_err("%s: rkey (%#x) doesn't match mr->ibmr.rkey (%#x)\n",
-			__func__, rkey, mr->ibmr.rkey);
+	if (rkey != mr->rkey) {
+		pr_err("%s: rkey (%#x) doesn't match mr->rkey (%#x)\n",
+			__func__, rkey, mr->rkey);
 		ret = -EINVAL;
 		goto err_drop_ref;
 	}
@@ -558,6 +562,49 @@ int rxe_invalidate_mr(struct rxe_qp *qp, u32 rkey)
 	return ret;
 }
 
+/* The user can (re)register a fast MR by executing a REG_MR WQE.
+ * The user is expected to hold a reference on the ib_mr until the
+ * WQE completes.
+ * Once a fast MR is created this is the only way to change the
+ * private keys. It is the responsibility of the user to keep
+ * the ib_mr keys in sync with the rxe mr keys.
+ */
+int rxe_reg_fast_mr(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
+{
+	struct rxe_mr *mr = to_rmr(wqe->wr.wr.reg.mr);
+	u32 key = wqe->wr.wr.reg.key;
+	u32 access = wqe->wr.wr.reg.access;
+
+	/* user can only register MR in free state */
+	if (unlikely(mr->state != RXE_MR_STATE_FREE)) {
+		pr_warn("%s: mr->lkey = 0x%x not free\n",
+			__func__, mr->lkey);
+		return -EINVAL;
+	}
+
+	/* user can only register mr with qp in same protection domain */
+	if (unlikely(qp->ibqp.pd != mr->ibmr.pd)) {
+		pr_warn("%s: qp->pd and mr->pd don't match\n",
+			__func__);
+		return -EINVAL;
+	}
+
+	/* user is only allowed to change key portion of l/rkey */
+	if (unlikely((mr->lkey & ~0xff) != (key & ~0xff))) {
+		pr_warn("%s: key = 0x%x has wrong index mr->lkey = 0x%x\n",
+			__func__, key, mr->lkey);
+		return -EINVAL;
+	}
+
+	mr->access = access;
+	mr->lkey = key;
+	mr->rkey = (access & IB_ACCESS_REMOTE) ? key : 0;
+	mr->iova = wqe->wr.wr.reg.mr->iova;
+	mr->state = RXE_MR_STATE_VALID;
+
+	return 0;
+}
+
 int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
 {
 	struct rxe_mr *mr = to_rmr(ibmr);
diff --git a/drivers/infiniband/sw/rxe/rxe_mw.c b/drivers/infiniband/sw/rxe/rxe_mw.c
index 5ba77df7598e..a5e2ea7d80f0 100644
--- a/drivers/infiniband/sw/rxe/rxe_mw.c
+++ b/drivers/infiniband/sw/rxe/rxe_mw.c
@@ -21,7 +21,7 @@ int rxe_alloc_mw(struct ib_mw *ibmw, struct ib_udata *udata)
 	}
 
 	rxe_add_index(mw);
-	ibmw->rkey = (mw->pelem.index << 8) | rxe_get_next_key(-1);
+	mw->rkey = ibmw->rkey = (mw->pelem.index << 8) | rxe_get_next_key(-1);
 	mw->state = (mw->ibmw.type == IB_MW_TYPE_2) ?
 			RXE_MW_STATE_FREE : RXE_MW_STATE_VALID;
 	spin_lock_init(&mw->lock);
@@ -71,6 +71,8 @@ int rxe_dealloc_mw(struct ib_mw *ibmw)
 static int rxe_check_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
 			 struct rxe_mw *mw, struct rxe_mr *mr)
 {
+	u32 key = wqe->wr.wr.mw.rkey & 0xff;
+
 	if (mw->ibmw.type == IB_MW_TYPE_1) {
 		if (unlikely(mw->state != RXE_MW_STATE_VALID)) {
 			pr_err_once(
@@ -108,7 +110,7 @@ static int rxe_check_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
 		}
 	}
 
-	if (unlikely((wqe->wr.wr.mw.rkey & 0xff) == (mw->ibmw.rkey & 0xff))) {
+	if (unlikely(key == (mw->rkey & 0xff))) {
 		pr_err_once("attempt to bind MW with same key\n");
 		return -EINVAL;
 	}
@@ -161,13 +163,9 @@ static int rxe_check_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
 static void rxe_do_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
 		      struct rxe_mw *mw, struct rxe_mr *mr)
 {
-	u32 rkey;
-	u32 new_rkey;
-
-	rkey = mw->ibmw.rkey;
-	new_rkey = (rkey & 0xffffff00) | (wqe->wr.wr.mw.rkey & 0x000000ff);
+	u32 key = wqe->wr.wr.mw.rkey & 0xff;
 
-	mw->ibmw.rkey = new_rkey;
+	mw->rkey = (mw->rkey & ~0xff) | key;
 	mw->access = wqe->wr.wr.mw.access;
 	mw->state = RXE_MW_STATE_VALID;
 	mw->addr = wqe->wr.wr.mw.addr;
@@ -197,29 +195,29 @@ int rxe_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
 	struct rxe_mw *mw;
 	struct rxe_mr *mr;
 	struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+	u32 mw_rkey = wqe->wr.wr.mw.mw_rkey;
+	u32 mr_lkey = wqe->wr.wr.mw.mr_lkey;
 	unsigned long flags;
 
-	mw = rxe_pool_get_index(&rxe->mw_pool,
-				wqe->wr.wr.mw.mw_rkey >> 8);
+	mw = rxe_pool_get_index(&rxe->mw_pool, mw_rkey >> 8);
 	if (unlikely(!mw)) {
 		ret = -EINVAL;
 		goto err;
 	}
 
-	if (unlikely(mw->ibmw.rkey != wqe->wr.wr.mw.mw_rkey)) {
+	if (unlikely(mw->rkey != mw_rkey)) {
 		ret = -EINVAL;
 		goto err_drop_mw;
 	}
 
 	if (likely(wqe->wr.wr.mw.length)) {
-		mr = rxe_pool_get_index(&rxe->mr_pool,
-					wqe->wr.wr.mw.mr_lkey >> 8);
+		mr = rxe_pool_get_index(&rxe->mr_pool, mr_lkey >> 8);
 		if (unlikely(!mr)) {
 			ret = -EINVAL;
 			goto err_drop_mw;
 		}
 
-		if (unlikely(mr->ibmr.lkey != wqe->wr.wr.mw.mr_lkey)) {
+		if (unlikely(mr->lkey != mr_lkey)) {
 			ret = -EINVAL;
 			goto err_drop_mr;
 		}
@@ -292,7 +290,7 @@ int rxe_invalidate_mw(struct rxe_qp *qp, u32 rkey)
 		goto err;
 	}
 
-	if (rkey != mw->ibmw.rkey) {
+	if (rkey != mw->rkey) {
 		ret = -EINVAL;
 		goto err_drop_ref;
 	}
@@ -323,7 +321,7 @@ struct rxe_mw *rxe_lookup_mw(struct rxe_qp *qp, int access, u32 rkey)
 	if (!mw)
 		return NULL;
 
-	if (unlikely((rxe_mw_rkey(mw) != rkey) || rxe_mw_pd(mw) != pd ||
+	if (unlikely((mw->rkey != rkey) || rxe_mw_pd(mw) != pd ||
 		     (mw->ibmw.type == IB_MW_TYPE_2 && mw->qp != qp) ||
 		     (mw->length == 0) ||
 		     (access && !(access & mw->access)) ||
diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
index 22c3edb28945..ac18dcd6905b 100644
--- a/drivers/infiniband/sw/rxe/rxe_req.c
+++ b/drivers/infiniband/sw/rxe/rxe_req.c
@@ -561,7 +561,6 @@ static void update_state(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
 static int rxe_do_local_ops(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
 {
 	u8 opcode = wqe->wr.opcode;
-	struct rxe_mr *mr;
 	u32 rkey;
 	int ret;
 
@@ -579,14 +578,11 @@ static int rxe_do_local_ops(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
 		}
 		break;
 	case IB_WR_REG_MR:
-		mr = to_rmr(wqe->wr.wr.reg.mr);
-		rxe_add_ref(mr);
-		mr->state = RXE_MR_STATE_VALID;
-		mr->access = wqe->wr.wr.reg.access;
-		mr->ibmr.lkey = wqe->wr.wr.reg.key;
-		mr->ibmr.rkey = wqe->wr.wr.reg.key;
-		mr->iova = wqe->wr.wr.reg.mr->iova;
-		rxe_drop_ref(mr);
+		ret = rxe_reg_fast_mr(qp, wqe);
+		if (unlikely(ret)) {
+			wqe->status = IB_WC_LOC_QP_OP_ERR;
+			return ret;
+		}
 		break;
 	case IB_WR_BIND_MW:
 		ret = rxe_bind_mw(qp, wqe);
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
index c6aca2293294..31c38b2f7d0a 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -306,6 +306,8 @@ struct rxe_mr {
 
 	struct ib_umem		*umem;
 
+	u32			lkey;
+	u32			rkey;
 	enum rxe_mr_state	state;
 	enum ib_mr_type		type;
 	u64			va;
@@ -343,6 +345,7 @@ struct rxe_mw {
 	enum rxe_mw_state	state;
 	struct rxe_qp		*qp; /* Type 2 only */
 	struct rxe_mr		*mr;
+	u32			rkey;
 	int			access;
 	u64			addr;
 	u64			length;
@@ -467,26 +470,11 @@ static inline struct rxe_pd *mr_pd(struct rxe_mr *mr)
 	return to_rpd(mr->ibmr.pd);
 }
 
-static inline u32 mr_lkey(struct rxe_mr *mr)
-{
-	return mr->ibmr.lkey;
-}
-
-static inline u32 mr_rkey(struct rxe_mr *mr)
-{
-	return mr->ibmr.rkey;
-}
-
 static inline struct rxe_pd *rxe_mw_pd(struct rxe_mw *mw)
 {
 	return to_rpd(mw->ibmw.pd);
 }
 
-static inline u32 rxe_mw_rkey(struct rxe_mw *mw)
-{
-	return mw->ibmw.rkey;
-}
-
 int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name);
 
 void rxe_mc_cleanup(struct rxe_pool_entry *arg);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH for-rc v3 5/6] RDMA/rxe: Create duplicate mapping tables for FMRs
  2021-09-09 20:44 [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes Bob Pearson
                   ` (3 preceding siblings ...)
  2021-09-09 20:44 ` [PATCH for-rc v3 4/6] RDMA/rxe: Separate HW and SW l/rkeys Bob Pearson
@ 2021-09-09 20:44 ` Bob Pearson
  2021-09-09 20:44 ` [PATCH for-rc v3 6/6] RDMA/rxe: Only allow invalidate for appropriate MRs Bob Pearson
  2021-09-09 21:52 ` [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes Bart Van Assche
  6 siblings, 0 replies; 22+ messages in thread
From: Bob Pearson @ 2021-09-09 20:44 UTC (permalink / raw)
  To: jgg, zyjzyj2000, linux-rdma, mie, bvanassche; +Cc: Bob Pearson

For fast memory regions create duplicate mapping tables so that
ib_map_mr_sg() can build a new mapping table, which is then swapped
into place synchronously with the execution of an IB_WR_REG_MR
work request.

Currently the rxe driver uses the same table for receiving RDMA operations
and for building new tables in preparation for reusing the MR. This
exposes users to potentially incorrect results.

Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_loc.h   |   1 +
 drivers/infiniband/sw/rxe/rxe_mr.c    | 196 +++++++++++++++++---------
 drivers/infiniband/sw/rxe/rxe_mw.c    |   6 +-
 drivers/infiniband/sw/rxe/rxe_verbs.c |  39 ++---
 drivers/infiniband/sw/rxe/rxe_verbs.h |  21 +--
 5 files changed, 161 insertions(+), 102 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h
index 4fd73b51fabf..1ca43b859d80 100644
--- a/drivers/infiniband/sw/rxe/rxe_loc.h
+++ b/drivers/infiniband/sw/rxe/rxe_loc.h
@@ -87,6 +87,7 @@ int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length);
 int advance_dma_data(struct rxe_dma_info *dma, unsigned int length);
 int rxe_invalidate_mr(struct rxe_qp *qp, u32 rkey);
 int rxe_reg_fast_mr(struct rxe_qp *qp, struct rxe_send_wqe *wqe);
+int rxe_mr_set_page(struct ib_mr *ibmr, u64 addr);
 int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata);
 void rxe_mr_cleanup(struct rxe_pool_entry *arg);
 
diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index 370212801abc..8d658d42abed 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -24,7 +24,7 @@ u8 rxe_get_next_key(u32 last_key)
 
 int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length)
 {
-
+	struct rxe_map_set *set = mr->cur_map_set;
 
 	switch (mr->type) {
 	case IB_MR_TYPE_DMA:
@@ -32,8 +32,8 @@ int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length)
 
 	case IB_MR_TYPE_USER:
 	case IB_MR_TYPE_MEM_REG:
-		if (iova < mr->iova || length > mr->length ||
-		    iova > mr->iova + mr->length - length)
+		if (iova < set->iova || length > set->length ||
+		    iova > set->iova + set->length - length)
 			return -EFAULT;
 		return 0;
 
@@ -65,41 +65,89 @@ static void rxe_mr_init(int access, struct rxe_mr *mr)
 	mr->map_shift = ilog2(RXE_BUF_PER_MAP);
 }
 
-static int rxe_mr_alloc(struct rxe_mr *mr, int num_buf)
+static void rxe_mr_free_map_set(int num_map, struct rxe_map_set *set)
 {
 	int i;
-	int num_map;
-	struct rxe_map **map = mr->map;
 
-	num_map = (num_buf + RXE_BUF_PER_MAP - 1) / RXE_BUF_PER_MAP;
+	for (i = 0; i < num_map; i++)
+		kfree(set->map[i]);
 
-	mr->map = kmalloc_array(num_map, sizeof(*map), GFP_KERNEL);
-	if (!mr->map)
-		goto err1;
+	kfree(set->map);
+	kfree(set);
+}
+
+static int rxe_mr_alloc_map_set(int num_map, struct rxe_map_set **setp)
+{
+	int i;
+	struct rxe_map_set *set;
+
+	set = kmalloc(sizeof(*set), GFP_KERNEL);
+	if (!set)
+		goto err_out;
+
+	set->map = kmalloc_array(num_map, sizeof(struct rxe_map *), GFP_KERNEL);
+	if (!set->map)
+		goto err_free_set;
 
 	for (i = 0; i < num_map; i++) {
-		mr->map[i] = kmalloc(sizeof(**map), GFP_KERNEL);
-		if (!mr->map[i])
-			goto err2;
+		set->map[i] = kmalloc(sizeof(struct rxe_map), GFP_KERNEL);
+		if (!set->map[i])
+			goto err_free_map;
 	}
 
+	*setp = set;
+
+	return 0;
+
+err_free_map:
+	for (i--; i >= 0; i--)
+		kfree(set->map[i]);
+
+	kfree(set->map);
+err_free_set:
+	kfree(set);
+err_out:
+	return -ENOMEM;
+}
+
+/**
+ * rxe_mr_alloc() - Allocate memory map array(s) for MR
+ * @mr: Memory region
+ * @num_buf: Number of buffer descriptors to support
+ * @both: If non-zero allocate both mr->cur_map_set and mr->next_map_set,
+ *	  else just allocate mr->cur_map_set. Used for fast MRs.
+ *
+ * Return: 0 on success else an error
+ */
+static int rxe_mr_alloc(struct rxe_mr *mr, int num_buf, int both)
+{
+	int ret;
+	int num_map;
+
 	BUILD_BUG_ON(!is_power_of_2(RXE_BUF_PER_MAP));
+	num_map = (num_buf + RXE_BUF_PER_MAP - 1) / RXE_BUF_PER_MAP;
 
 	mr->map_shift = ilog2(RXE_BUF_PER_MAP);
 	mr->map_mask = RXE_BUF_PER_MAP - 1;
-
 	mr->num_buf = num_buf;
-	mr->num_map = num_map;
 	mr->max_buf = num_map * RXE_BUF_PER_MAP;
+	mr->num_map = num_map;
 
-	return 0;
+	ret = rxe_mr_alloc_map_set(num_map, &mr->cur_map_set);
+	if (ret)
+		goto err_out;
 
-err2:
-	for (i--; i >= 0; i--)
-		kfree(mr->map[i]);
+	if (both) {
+		ret = rxe_mr_alloc_map_set(num_map, &mr->next_map_set);
+		if (ret) {
+			rxe_mr_free_map_set(mr->num_map, mr->cur_map_set);
+			goto err_out;
+		}
+	}
 
-	kfree(mr->map);
-err1:
+	return 0;
+
+err_out:
 	return -ENOMEM;
 }
 
@@ -116,6 +164,7 @@ void rxe_mr_init_dma(struct rxe_pd *pd, int access, struct rxe_mr *mr)
 int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova,
 		     int access, struct rxe_mr *mr)
 {
+	struct rxe_map_set	*set;
 	struct rxe_map		**map;
 	struct rxe_phys_buf	*buf = NULL;
 	struct ib_umem		*umem;
@@ -123,7 +172,6 @@ int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova,
 	int			num_buf;
 	void			*vaddr;
 	int err;
-	int i;
 
 	umem = ib_umem_get(pd->ibpd.device, start, length, access);
 	if (IS_ERR(umem)) {
@@ -137,18 +185,20 @@ int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova,
 
 	rxe_mr_init(access, mr);
 
-	err = rxe_mr_alloc(mr, num_buf);
+	err = rxe_mr_alloc(mr, num_buf, 0);
 	if (err) {
 		pr_warn("%s: Unable to allocate memory for map\n",
 				__func__);
 		goto err_release_umem;
 	}
 
-	mr->page_shift = PAGE_SHIFT;
-	mr->page_mask = PAGE_SIZE - 1;
+	set = mr->cur_map_set;
+	set->page_shift = PAGE_SHIFT;
+	set->page_mask = PAGE_SIZE - 1;
+
+	num_buf = 0;
+	map = set->map;
 
-	num_buf			= 0;
-	map = mr->map;
 	if (length > 0) {
 		buf = map[0]->buf;
 
@@ -171,26 +221,24 @@ int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova,
 			buf->size = PAGE_SIZE;
 			num_buf++;
 			buf++;
-
 		}
 	}
 
 	mr->ibmr.pd = &pd->ibpd;
 	mr->umem = umem;
 	mr->access = access;
-	mr->length = length;
-	mr->iova = iova;
-	mr->va = start;
-	mr->offset = ib_umem_offset(umem);
 	mr->state = RXE_MR_STATE_VALID;
 	mr->type = IB_MR_TYPE_USER;
 
+	set->length = length;
+	set->iova = iova;
+	set->va = start;
+	set->offset = ib_umem_offset(umem);
+
 	return 0;
 
 err_cleanup_map:
-	for (i = 0; i < mr->num_map; i++)
-		kfree(mr->map[i]);
-	kfree(mr->map);
+	rxe_mr_free_map_set(mr->num_map, mr->cur_map_set);
 err_release_umem:
 	ib_umem_release(umem);
 err_out:
@@ -204,7 +252,7 @@ int rxe_mr_init_fast(struct rxe_pd *pd, int max_pages, struct rxe_mr *mr)
 	/* always allow remote access for FMRs */
 	rxe_mr_init(IB_ACCESS_REMOTE, mr);
 
-	err = rxe_mr_alloc(mr, max_pages);
+	err = rxe_mr_alloc(mr, max_pages, 1);
 	if (err)
 		goto err1;
 
@@ -222,21 +270,24 @@ int rxe_mr_init_fast(struct rxe_pd *pd, int max_pages, struct rxe_mr *mr)
 static void lookup_iova(struct rxe_mr *mr, u64 iova, int *m_out, int *n_out,
 			size_t *offset_out)
 {
-	size_t offset = iova - mr->iova + mr->offset;
+	struct rxe_map_set *set = mr->cur_map_set;
+	size_t offset = iova - set->iova + set->offset;
 	int			map_index;
 	int			buf_index;
 	u64			length;
+	struct rxe_map *map;
 
-	if (likely(mr->page_shift)) {
-		*offset_out = offset & mr->page_mask;
-		offset >>= mr->page_shift;
+	if (likely(set->page_shift)) {
+		*offset_out = offset & set->page_mask;
+		offset >>= set->page_shift;
 		*n_out = offset & mr->map_mask;
 		*m_out = offset >> mr->map_shift;
 	} else {
 		map_index = 0;
 		buf_index = 0;
 
-		length = mr->map[map_index]->buf[buf_index].size;
+		map = set->map[map_index];
+		length = map->buf[buf_index].size;
 
 		while (offset >= length) {
 			offset -= length;
@@ -246,7 +297,8 @@ static void lookup_iova(struct rxe_mr *mr, u64 iova, int *m_out, int *n_out,
 				map_index++;
 				buf_index = 0;
 			}
-			length = mr->map[map_index]->buf[buf_index].size;
+			map = set->map[map_index];
+			length = map->buf[buf_index].size;
 		}
 
 		*m_out = map_index;
@@ -267,7 +319,7 @@ void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length)
 		goto out;
 	}
 
-	if (!mr->map) {
+	if (!mr->cur_map_set) {
 		addr = (void *)(uintptr_t)iova;
 		goto out;
 	}
@@ -280,13 +332,13 @@ void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length)
 
 	lookup_iova(mr, iova, &m, &n, &offset);
 
-	if (offset + length > mr->map[m]->buf[n].size) {
+	if (offset + length > mr->cur_map_set->map[m]->buf[n].size) {
 		pr_warn("crosses page boundary\n");
 		addr = NULL;
 		goto out;
 	}
 
-	addr = (void *)(uintptr_t)mr->map[m]->buf[n].addr + offset;
+	addr = (void *)(uintptr_t)mr->cur_map_set->map[m]->buf[n].addr + offset;
 
 out:
 	return addr;
@@ -322,7 +374,7 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length,
 		return 0;
 	}
 
-	WARN_ON_ONCE(!mr->map);
+	WARN_ON_ONCE(!mr->cur_map_set);
 
 	err = mr_check_range(mr, iova, length);
 	if (err) {
@@ -332,7 +384,7 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length,
 
 	lookup_iova(mr, iova, &m, &i, &offset);
 
-	map = mr->map + m;
+	map = mr->cur_map_set->map + m;
 	buf	= map[0]->buf + i;
 
 	while (length > 0) {
@@ -572,8 +624,9 @@ int rxe_invalidate_mr(struct rxe_qp *qp, u32 rkey)
 int rxe_reg_fast_mr(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
 {
 	struct rxe_mr *mr = to_rmr(wqe->wr.wr.reg.mr);
-	u32 key = wqe->wr.wr.reg.key;
+	u32 key = wqe->wr.wr.reg.key & 0xff;
 	u32 access = wqe->wr.wr.reg.access;
+	struct rxe_map_set *set;
 
 	/* user can only register MR in free state */
 	if (unlikely(mr->state != RXE_MR_STATE_FREE)) {
@@ -589,19 +642,36 @@ int rxe_reg_fast_mr(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
 		return -EINVAL;
 	}
 
-	/* user is only allowed to change key portion of l/rkey */
-	if (unlikely((mr->lkey & ~0xff) != (key & ~0xff))) {
-		pr_warn("%s: key = 0x%x has wrong index mr->lkey = 0x%x\n",
-			__func__, key, mr->lkey);
-		return -EINVAL;
-	}
-
 	mr->access = access;
-	mr->lkey = key;
-	mr->rkey = (access & IB_ACCESS_REMOTE) ? key : 0;
-	mr->iova = wqe->wr.wr.reg.mr->iova;
+	mr->lkey = (mr->lkey & ~0xff) | key;
+	mr->rkey = (access & IB_ACCESS_REMOTE) ? mr->lkey : 0;
 	mr->state = RXE_MR_STATE_VALID;
 
+	set = mr->cur_map_set;
+	mr->cur_map_set = mr->next_map_set;
+	mr->cur_map_set->iova = wqe->wr.wr.reg.mr->iova;
+	mr->next_map_set = set;
+
+	return 0;
+}
+
+int rxe_mr_set_page(struct ib_mr *ibmr, u64 addr)
+{
+	struct rxe_mr *mr = to_rmr(ibmr);
+	struct rxe_map_set *set = mr->next_map_set;
+	struct rxe_map *map;
+	struct rxe_phys_buf *buf;
+
+	if (unlikely(set->nbuf == mr->num_buf))
+		return -ENOMEM;
+
+	map = set->map[set->nbuf / RXE_BUF_PER_MAP];
+	buf = &map->buf[set->nbuf % RXE_BUF_PER_MAP];
+
+	buf->addr = addr;
+	buf->size = ibmr->page_size;
+	set->nbuf++;
+
 	return 0;
 }
 
@@ -626,14 +696,12 @@ int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
 void rxe_mr_cleanup(struct rxe_pool_entry *arg)
 {
 	struct rxe_mr *mr = container_of(arg, typeof(*mr), pelem);
-	int i;
 
 	ib_umem_release(mr->umem);
 
-	if (mr->map) {
-		for (i = 0; i < mr->num_map; i++)
-			kfree(mr->map[i]);
+	if (mr->cur_map_set)
+		rxe_mr_free_map_set(mr->num_map, mr->cur_map_set);
 
-		kfree(mr->map);
-	}
+	if (mr->next_map_set)
+		rxe_mr_free_map_set(mr->num_map, mr->next_map_set);
 }
diff --git a/drivers/infiniband/sw/rxe/rxe_mw.c b/drivers/infiniband/sw/rxe/rxe_mw.c
index a5e2ea7d80f0..9534a7fe1a98 100644
--- a/drivers/infiniband/sw/rxe/rxe_mw.c
+++ b/drivers/infiniband/sw/rxe/rxe_mw.c
@@ -142,15 +142,15 @@ static int rxe_check_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
 
 	/* C10-75 */
 	if (mw->access & IB_ZERO_BASED) {
-		if (unlikely(wqe->wr.wr.mw.length > mr->length)) {
+		if (unlikely(wqe->wr.wr.mw.length > mr->cur_map_set->length)) {
 			pr_err_once(
 				"attempt to bind a ZB MW outside of the MR\n");
 			return -EINVAL;
 		}
 	} else {
-		if (unlikely((wqe->wr.wr.mw.addr < mr->iova) ||
+		if (unlikely((wqe->wr.wr.mw.addr < mr->cur_map_set->iova) ||
 			     ((wqe->wr.wr.mw.addr + wqe->wr.wr.mw.length) >
-			      (mr->iova + mr->length)))) {
+			      (mr->cur_map_set->iova + mr->cur_map_set->length)))) {
 			pr_err_once(
 				"attempt to bind a VA MW outside of the MR\n");
 			return -EINVAL;
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index dc70e3edeba6..e7f482184359 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -954,41 +954,26 @@ static struct ib_mr *rxe_alloc_mr(struct ib_pd *ibpd, enum ib_mr_type mr_type,
 	return ERR_PTR(err);
 }
 
-static int rxe_set_page(struct ib_mr *ibmr, u64 addr)
-{
-	struct rxe_mr *mr = to_rmr(ibmr);
-	struct rxe_map *map;
-	struct rxe_phys_buf *buf;
-
-	if (unlikely(mr->nbuf == mr->num_buf))
-		return -ENOMEM;
-
-	map = mr->map[mr->nbuf / RXE_BUF_PER_MAP];
-	buf = &map->buf[mr->nbuf % RXE_BUF_PER_MAP];
-
-	buf->addr = addr;
-	buf->size = ibmr->page_size;
-	mr->nbuf++;
-
-	return 0;
-}
-
+/* Build mr->next_map_set from the scatterlist.
+ * The IB_WR_REG_MR WR will swap the map sets.
+ */
 static int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg,
 			 int sg_nents, unsigned int *sg_offset)
 {
 	struct rxe_mr *mr = to_rmr(ibmr);
+	struct rxe_map_set *set = mr->next_map_set;
 	int n;
 
-	mr->nbuf = 0;
+	set->nbuf = 0;
 
-	n = ib_sg_to_pages(ibmr, sg, sg_nents, sg_offset, rxe_set_page);
+	n = ib_sg_to_pages(ibmr, sg, sg_nents, sg_offset, rxe_mr_set_page);
 
-	mr->va = ibmr->iova;
-	mr->iova = ibmr->iova;
-	mr->length = ibmr->length;
-	mr->page_shift = ilog2(ibmr->page_size);
-	mr->page_mask = ibmr->page_size - 1;
-	mr->offset = mr->iova & mr->page_mask;
+	set->va = ibmr->iova;
+	set->iova = ibmr->iova;
+	set->length = ibmr->length;
+	set->page_shift = ilog2(ibmr->page_size);
+	set->page_mask = ibmr->page_size - 1;
+	set->offset = set->iova & set->page_mask;
 
 	return n;
 }
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
index 31c38b2f7d0a..9eabc8f30359 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -293,6 +293,17 @@ struct rxe_map {
 	struct rxe_phys_buf	buf[RXE_BUF_PER_MAP];
 };
 
+struct rxe_map_set {
+	struct rxe_map		**map;
+	u64			va;
+	u64			iova;
+	size_t			length;
+	u32			offset;
+	u32			nbuf;
+	int			page_shift;
+	int			page_mask;
+};
+
 static inline int rkey_is_mw(u32 rkey)
 {
 	u32 index = rkey >> 8;
@@ -310,26 +321,20 @@ struct rxe_mr {
 	u32			rkey;
 	enum rxe_mr_state	state;
 	enum ib_mr_type		type;
-	u64			va;
-	u64			iova;
-	size_t			length;
-	u32			offset;
 	int			access;
 
-	int			page_shift;
-	int			page_mask;
 	int			map_shift;
 	int			map_mask;
 
 	u32			num_buf;
-	u32			nbuf;
 
 	u32			max_buf;
 	u32			num_map;
 
 	atomic_t		num_mw;
 
-	struct rxe_map		**map;
+	struct rxe_map_set	*cur_map_set;
+	struct rxe_map_set	*next_map_set;
 };
 
 enum rxe_mw_state {
-- 
2.30.2



* [PATCH for-rc v3 6/6] RDMA/rxe: Only allow invalidate for appropriate MRs
  2021-09-09 20:44 [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes Bob Pearson
                   ` (4 preceding siblings ...)
  2021-09-09 20:44 ` [PATCH for-rc v3 5/6] RDMA/rxe: Create duplicate mapping tables for FMRs Bob Pearson
@ 2021-09-09 20:44 ` Bob Pearson
  2021-09-09 21:52 ` [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes Bart Van Assche
  6 siblings, 0 replies; 22+ messages in thread
From: Bob Pearson @ 2021-09-09 20:44 UTC (permalink / raw)
  To: jgg, zyjzyj2000, linux-rdma, mie, bvanassche; +Cc: Bob Pearson

Local and remote invalidate operations are not allowed by the IBA for
MRs created by the (re)register memory verbs. This patch checks the
MR type in rxe_invalidate_mr() and rejects the operation for such MRs.

Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_mr.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index 8d658d42abed..53271df10e47 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -605,6 +605,12 @@ int rxe_invalidate_mr(struct rxe_qp *qp, u32 rkey)
 		goto err_drop_ref;
 	}
 
+	if (unlikely(mr->type != IB_MR_TYPE_MEM_REG)) {
+		pr_warn("%s: mr->type (%d) is wrong type\n", __func__, mr->type);
+		ret = -EINVAL;
+		goto err_drop_ref;
+	}
+
 	mr->state = RXE_MR_STATE_FREE;
 	ret = 0;
 
-- 
2.30.2



* Re: [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes.
  2021-09-09 20:44 [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes Bob Pearson
                   ` (5 preceding siblings ...)
  2021-09-09 20:44 ` [PATCH for-rc v3 6/6] RDMA/rxe: Only allow invalidate for appropriate MRs Bob Pearson
@ 2021-09-09 21:52 ` Bart Van Assche
  2021-09-10 19:38   ` Pearson, Robert B
  6 siblings, 1 reply; 22+ messages in thread
From: Bart Van Assche @ 2021-09-09 21:52 UTC (permalink / raw)
  To: Bob Pearson, jgg, zyjzyj2000, linux-rdma, mie

[-- Attachment #1: Type: text/plain, Size: 5781 bytes --]

On 9/9/21 1:44 PM, Bob Pearson wrote:
> This series of patches implements several bug fixes and minor
> cleanups of the rxe driver. Specifically these fix a bug exposed
> by blktest.
> 
> They apply cleanly to both
> commit 2169b908894df2ce83e7eb4a399d3224b2635126 (origin/for-rc, for-rc)
> commit 6a217437f9f5482a3f6f2dc5fcd27cf0f62409ac (HEAD -> for-next,
> 	origin/wip/jgg-for-next, origin/for-next, origin/HEAD)
> 
> These are being resubmitted to for-rc instead of for-next.

Hi Bob,

Thanks for having rebased and reposted this patch series. I have applied
this series on top of commit 2169b908894d ("IB/hfi1: make hist static").
A kernel bug was triggered while running test srp/001. I have attached
the kernel configuration used in my test to this email.

Thanks,

Bart.



ib_srpt Received SRP_LOGIN_REQ with i_port_id fe80:0000:0000:0000:5054:00ff:fe86:7464, t_port_id 5054:00ff:fe86:7464:5054:00ff:fe86:7464 and it_iu_len 8260 on port 1 (guid=fe80:0000:0000:0000:5054:00ff:fe86:7464); pkey 0xffff
BUG: unable to handle page fault for address: ffffc900e357d614
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 100000067 P4D 100000067 PUD 0
Oops: 0000 [#1] PREEMPT SMP KASAN
CPU: 26 PID: 148 Comm: ksoftirqd/26 Tainted: G            E     5.14.0-rc6-dbg+ #2
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
RIP: 0010:rxe_completer+0x96d/0x1050 [rdma_rxe]
Code: e0 49 8b 44 24 08 44 89 e9 41 d3 e6 4e 8d a4 30 80 01 00 00 4d 85 e4 0f 84 f9 00 00 00 49 8d bc 24 94 00 00 00 e8 73 a8 b1 e0 <41> 8b 84 24 94 00 00 00 85 c0 0f 84 df 00 00 00 83 f8 03 0f 84 bf
RSP: 0018:ffff8881014075f8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88813c67c000 RCX: dffffc0000000000
RDX: 0000000000000007 RSI: ffffffff826920c0 RDI: ffffc900e357d614
RBP: ffff8881014076e8 R08: ffffffffa09b228d R09: ffff88813c67c57b
R10: ffffed10278cf8af R11: 0000000000000000 R12: ffffc900e357d580
R13: 000000000000000a R14: 00000000d9c99400 R15: ffff8881515ddd08
FS:  0000000000000000(0000) GS:ffff88842d100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffc900e357d614 CR3: 0000000002e29005 CR4: 0000000000770ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
  rxe_do_task+0xdd/0x160 [rdma_rxe]
  rxe_run_task+0x67/0x80 [rdma_rxe]
  rxe_comp_queue_pkt+0x75/0x80 [rdma_rxe]
  rxe_rcv+0x345/0x480 [rdma_rxe]
  rxe_xmit_packet+0x1af/0x300 [rdma_rxe]
  send_ack.isra.0+0x88/0xd0 [rdma_rxe]
  rxe_responder+0xf4c/0x15e0 [rdma_rxe]
  rxe_do_task+0xdd/0x160 [rdma_rxe]
  rxe_run_task+0x67/0x80 [rdma_rxe]
  rxe_resp_queue_pkt+0x5a/0x60 [rdma_rxe]
  rxe_rcv+0x370/0x480 [rdma_rxe]
  rxe_xmit_packet+0x1af/0x300 [rdma_rxe]
  rxe_requester+0x4f4/0xe80 [rdma_rxe]
  rxe_do_task+0xdd/0x160 [rdma_rxe]
  tasklet_action_common.constprop.0+0x168/0x1b0
  tasklet_action+0x44/0x60
  __do_softirq+0x1db/0x6ed
  run_ksoftirqd+0x37/0x60
  smpboot_thread_fn+0x302/0x410
  kthread+0x1f6/0x220
  ret_from_fork+0x1f/0x30
Modules linked in: ib_srp(E) scsi_transport_srp(E) target_core_user(E) uio(E) target_core_pscsi(E) target_core_file(E) ib_srpt(E) target_core_iblock(E) target_core_mod(E) ib_umad(E) rdma_ucm(E) ib_iser(E) libiscsi(E) scsi_transport_iscsi(E) rdma_cm(E) iw_cm(E) 
scsi_debug(E) ib_cm(E) rdma_rxe(E) ip6_udp_tunnel(E) udp_tunnel(E) ib_uverbs(E) null_blk(E) ib_core(E) brd(E) af_packet(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) 
nft_chain_nat(E) nf_tables(E) ebtable_nat(E) iTCO_wdt(E) watchdog(E) ebtable_broute(E) intel_rapl_msr(E) intel_pmc_bxt(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) libcrc32c(E) 
iptable_mangle(E) iptable_raw(E) ip_set(E) nfnetlink(E) ebtable_filter(E) ebtables(E) ip6table_filter(E) ip6_tables(E) rfkill(E) iptable_filter(E) ip_tables(E) x_tables(E) bpfilter(E) intel_rapl_common(E)
  iosf_mbi(E) isst_if_common(E) i2c_i801(E) pcspkr(E) i2c_smbus(E) virtio_net(E) lpc_ich(E) virtio_balloon(E) net_failover(E) failover(E) tiny_power_button(E) button(E) fuse(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) ghash_clmulni_intel(E) aesni_intel(E) 
crypto_simd(E) cryptd(E) sr_mod(E) serio_raw(E) cdrom(E) virtio_gpu(E) virtio_dma_buf(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) cec(E) drm(E) qemu_fw_cfg(E) sg(E) nbd(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) 
scsi_dh_alua(E) virtio_rng(E)
CR2: ffffc900e357d614
---[ end trace 0667a278da47193a ]---
RIP: 0010:rxe_completer+0x96d/0x1050 [rdma_rxe]
Code: e0 49 8b 44 24 08 44 89 e9 41 d3 e6 4e 8d a4 30 80 01 00 00 4d 85 e4 0f 84 f9 00 00 00 49 8d bc 24 94 00 00 00 e8 73 a8 b1 e0 <41> 8b 84 24 94 00 00 00 85 c0 0f 84 df 00 00 00 83 f8 03 0f 84 bf
RSP: 0018:ffff8881014075f8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88813c67c000 RCX: dffffc0000000000
RDX: 0000000000000007 RSI: ffffffff826920c0 RDI: ffffc900e357d614
RBP: ffff8881014076e8 R08: ffffffffa09b228d R09: ffff88813c67c57b
R10: ffffed10278cf8af R11: 0000000000000000 R12: ffffc900e357d580
R13: 000000000000000a R14: 00000000d9c99400 R15: ffff8881515ddd08
FS:  0000000000000000(0000) GS:ffff88842d100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffc900e357d614 CR3: 0000000002e29005 CR4: 0000000000770ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: disabled
Rebooting in 90 seconds..

[-- Attachment #2: kernel-config.xz --]
[-- Type: application/x-xz, Size: 26148 bytes --]


* Re: [PATCH for-rc v3 1/6] RDMA/rxe: Add memory barriers to kernel queues
  2021-09-09 20:44 ` [PATCH for-rc v3 1/6] RDMA/rxe: Add memory barriers to kernel queues Bob Pearson
@ 2021-09-10  1:19   ` Zhu Yanjun
  2021-09-10  4:01     ` Bob Pearson
  2021-09-14  6:04   ` Re: " yangx.jy
  1 sibling, 1 reply; 22+ messages in thread
From: Zhu Yanjun @ 2021-09-10  1:19 UTC (permalink / raw)
  To: Bob Pearson; +Cc: Jason Gunthorpe, RDMA mailing list, mie, Bart Van Assche

On Fri, Sep 10, 2021 at 4:46 AM Bob Pearson <rpearsonhpe@gmail.com> wrote:
>
> Earlier patches added memory barriers to protect user space to kernel
> space communications. The user space queues were previously shown to
> have occasional memory synchronization errors which were removed by
> adding smp_load_acquire()/smp_store_release() barriers.
>
> This patch extends that to the case where queues are used between kernel
> space threads.
>
> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
> ---
>  drivers/infiniband/sw/rxe/rxe_comp.c  | 10 +---
>  drivers/infiniband/sw/rxe/rxe_cq.c    | 25 ++-------
>  drivers/infiniband/sw/rxe/rxe_qp.c    | 10 ++--
>  drivers/infiniband/sw/rxe/rxe_queue.h | 73 ++++++++-------------------
>  drivers/infiniband/sw/rxe/rxe_req.c   | 21 ++------
>  drivers/infiniband/sw/rxe/rxe_resp.c  | 38 ++++----------
>  drivers/infiniband/sw/rxe/rxe_srq.c   |  2 +-
>  drivers/infiniband/sw/rxe/rxe_verbs.c | 53 ++++---------------
>  8 files changed, 55 insertions(+), 177 deletions(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
> index d2d802c776fd..ed4e3f29bd65 100644
> --- a/drivers/infiniband/sw/rxe/rxe_comp.c
> +++ b/drivers/infiniband/sw/rxe/rxe_comp.c
> @@ -142,10 +142,7 @@ static inline enum comp_state get_wqe(struct rxe_qp *qp,
>         /* we come here whether or not we found a response packet to see if
>          * there are any posted WQEs
>          */
> -       if (qp->is_user)
> -               wqe = queue_head(qp->sq.queue, QUEUE_TYPE_FROM_USER);
> -       else
> -               wqe = queue_head(qp->sq.queue, QUEUE_TYPE_KERNEL);

This commit is very similar to the commit in
https://lore.kernel.org/linux-rdma/20210902084640.679744-5-yangx.jy@fujitsu.com/T/

Zhu Yanjun

> +       wqe = queue_head(qp->sq.queue, QUEUE_TYPE_FROM_CLIENT);
>         *wqe_p = wqe;
>
>         /* no WQE or requester has not started it yet */
> @@ -432,10 +429,7 @@ static void do_complete(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
>         if (post)
>                 make_send_cqe(qp, wqe, &cqe);
>
> -       if (qp->is_user)
> -               advance_consumer(qp->sq.queue, QUEUE_TYPE_FROM_USER);
> -       else
> -               advance_consumer(qp->sq.queue, QUEUE_TYPE_KERNEL);
> +       advance_consumer(qp->sq.queue, QUEUE_TYPE_FROM_CLIENT);
>
>         if (post)
>                 rxe_cq_post(qp->scq, &cqe, 0);
> diff --git a/drivers/infiniband/sw/rxe/rxe_cq.c b/drivers/infiniband/sw/rxe/rxe_cq.c
> index aef288f164fd..4e26c2ea4a59 100644
> --- a/drivers/infiniband/sw/rxe/rxe_cq.c
> +++ b/drivers/infiniband/sw/rxe/rxe_cq.c
> @@ -25,11 +25,7 @@ int rxe_cq_chk_attr(struct rxe_dev *rxe, struct rxe_cq *cq,
>         }
>
>         if (cq) {
> -               if (cq->is_user)
> -                       count = queue_count(cq->queue, QUEUE_TYPE_TO_USER);
> -               else
> -                       count = queue_count(cq->queue, QUEUE_TYPE_KERNEL);
> -
> +               count = queue_count(cq->queue, QUEUE_TYPE_TO_CLIENT);
>                 if (cqe < count) {
>                         pr_warn("cqe(%d) < current # elements in queue (%d)",
>                                 cqe, count);
> @@ -65,7 +61,7 @@ int rxe_cq_from_init(struct rxe_dev *rxe, struct rxe_cq *cq, int cqe,
>         int err;
>         enum queue_type type;
>
> -       type = uresp ? QUEUE_TYPE_TO_USER : QUEUE_TYPE_KERNEL;
> +       type = QUEUE_TYPE_TO_CLIENT;
>         cq->queue = rxe_queue_init(rxe, &cqe,
>                         sizeof(struct rxe_cqe), type);
>         if (!cq->queue) {
> @@ -117,11 +113,7 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
>
>         spin_lock_irqsave(&cq->cq_lock, flags);
>
> -       if (cq->is_user)
> -               full = queue_full(cq->queue, QUEUE_TYPE_TO_USER);
> -       else
> -               full = queue_full(cq->queue, QUEUE_TYPE_KERNEL);
> -
> +       full = queue_full(cq->queue, QUEUE_TYPE_TO_CLIENT);
>         if (unlikely(full)) {
>                 spin_unlock_irqrestore(&cq->cq_lock, flags);
>                 if (cq->ibcq.event_handler) {
> @@ -134,17 +126,10 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
>                 return -EBUSY;
>         }
>
> -       if (cq->is_user)
> -               addr = producer_addr(cq->queue, QUEUE_TYPE_TO_USER);
> -       else
> -               addr = producer_addr(cq->queue, QUEUE_TYPE_KERNEL);
> -
> +       addr = producer_addr(cq->queue, QUEUE_TYPE_TO_CLIENT);
>         memcpy(addr, cqe, sizeof(*cqe));
>
> -       if (cq->is_user)
> -               advance_producer(cq->queue, QUEUE_TYPE_TO_USER);
> -       else
> -               advance_producer(cq->queue, QUEUE_TYPE_KERNEL);
> +       advance_producer(cq->queue, QUEUE_TYPE_TO_CLIENT);
>
>         spin_unlock_irqrestore(&cq->cq_lock, flags);
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
> index 1ab6af7ddb25..2e923af642f8 100644
> --- a/drivers/infiniband/sw/rxe/rxe_qp.c
> +++ b/drivers/infiniband/sw/rxe/rxe_qp.c
> @@ -231,7 +231,7 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
>         qp->sq.max_inline = init->cap.max_inline_data = wqe_size;
>         wqe_size += sizeof(struct rxe_send_wqe);
>
> -       type = uresp ? QUEUE_TYPE_FROM_USER : QUEUE_TYPE_KERNEL;
> +       type = QUEUE_TYPE_FROM_CLIENT;
>         qp->sq.queue = rxe_queue_init(rxe, &qp->sq.max_wr,
>                                 wqe_size, type);
>         if (!qp->sq.queue)
> @@ -248,12 +248,8 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
>                 return err;
>         }
>
> -       if (qp->is_user)
>                 qp->req.wqe_index = producer_index(qp->sq.queue,
> -                                               QUEUE_TYPE_FROM_USER);
> -       else
> -               qp->req.wqe_index = producer_index(qp->sq.queue,
> -                                               QUEUE_TYPE_KERNEL);
> +                                       QUEUE_TYPE_FROM_CLIENT);
>
>         qp->req.state           = QP_STATE_RESET;
>         qp->req.opcode          = -1;
> @@ -293,7 +289,7 @@ static int rxe_qp_init_resp(struct rxe_dev *rxe, struct rxe_qp *qp,
>                 pr_debug("qp#%d max_wr = %d, max_sge = %d, wqe_size = %d\n",
>                          qp_num(qp), qp->rq.max_wr, qp->rq.max_sge, wqe_size);
>
> -               type = uresp ? QUEUE_TYPE_FROM_USER : QUEUE_TYPE_KERNEL;
> +               type = QUEUE_TYPE_FROM_CLIENT;
>                 qp->rq.queue = rxe_queue_init(rxe, &qp->rq.max_wr,
>                                         wqe_size, type);
>                 if (!qp->rq.queue)
> diff --git a/drivers/infiniband/sw/rxe/rxe_queue.h b/drivers/infiniband/sw/rxe/rxe_queue.h
> index 2702b0e55fc3..d465aa9342e1 100644
> --- a/drivers/infiniband/sw/rxe/rxe_queue.h
> +++ b/drivers/infiniband/sw/rxe/rxe_queue.h
> @@ -35,9 +35,8 @@
>
>  /* type of queue */
>  enum queue_type {
> -       QUEUE_TYPE_KERNEL,
> -       QUEUE_TYPE_TO_USER,
> -       QUEUE_TYPE_FROM_USER,
> +       QUEUE_TYPE_TO_CLIENT,
> +       QUEUE_TYPE_FROM_CLIENT,
>  };
>
>  struct rxe_queue {
> @@ -87,20 +86,16 @@ static inline int queue_empty(struct rxe_queue *q, enum queue_type type)
>         u32 cons;
>
>         switch (type) {
> -       case QUEUE_TYPE_FROM_USER:
> +       case QUEUE_TYPE_FROM_CLIENT:
>                 /* protect user space index */
>                 prod = smp_load_acquire(&q->buf->producer_index);
>                 cons = q->index;
>                 break;
> -       case QUEUE_TYPE_TO_USER:
> +       case QUEUE_TYPE_TO_CLIENT:
>                 prod = q->index;
>                 /* protect user space index */
>                 cons = smp_load_acquire(&q->buf->consumer_index);
>                 break;
> -       case QUEUE_TYPE_KERNEL:
> -               prod = q->buf->producer_index;
> -               cons = q->buf->consumer_index;
> -               break;
>         }
>
>         return ((prod - cons) & q->index_mask) == 0;
> @@ -112,20 +107,16 @@ static inline int queue_full(struct rxe_queue *q, enum queue_type type)
>         u32 cons;
>
>         switch (type) {
> -       case QUEUE_TYPE_FROM_USER:
> +       case QUEUE_TYPE_FROM_CLIENT:
>                 /* protect user space index */
>                 prod = smp_load_acquire(&q->buf->producer_index);
>                 cons = q->index;
>                 break;
> -       case QUEUE_TYPE_TO_USER:
> +       case QUEUE_TYPE_TO_CLIENT:
>                 prod = q->index;
>                 /* protect user space index */
>                 cons = smp_load_acquire(&q->buf->consumer_index);
>                 break;
> -       case QUEUE_TYPE_KERNEL:
> -               prod = q->buf->producer_index;
> -               cons = q->buf->consumer_index;
> -               break;
>         }
>
>         return ((prod + 1 - cons) & q->index_mask) == 0;
> @@ -138,20 +129,16 @@ static inline unsigned int queue_count(const struct rxe_queue *q,
>         u32 cons;
>
>         switch (type) {
> -       case QUEUE_TYPE_FROM_USER:
> +       case QUEUE_TYPE_FROM_CLIENT:
>                 /* protect user space index */
>                 prod = smp_load_acquire(&q->buf->producer_index);
>                 cons = q->index;
>                 break;
> -       case QUEUE_TYPE_TO_USER:
> +       case QUEUE_TYPE_TO_CLIENT:
>                 prod = q->index;
>                 /* protect user space index */
>                 cons = smp_load_acquire(&q->buf->consumer_index);
>                 break;
> -       case QUEUE_TYPE_KERNEL:
> -               prod = q->buf->producer_index;
> -               cons = q->buf->consumer_index;
> -               break;
>         }
>
>         return (prod - cons) & q->index_mask;
> @@ -162,7 +149,7 @@ static inline void advance_producer(struct rxe_queue *q, enum queue_type type)
>         u32 prod;
>
>         switch (type) {
> -       case QUEUE_TYPE_FROM_USER:
> +       case QUEUE_TYPE_FROM_CLIENT:
>                 pr_warn_once("Normally kernel should not write user space index\n");
>                 /* protect user space index */
>                 prod = smp_load_acquire(&q->buf->producer_index);
> @@ -170,15 +157,11 @@ static inline void advance_producer(struct rxe_queue *q, enum queue_type type)
>                 /* same */
>                 smp_store_release(&q->buf->producer_index, prod);
>                 break;
> -       case QUEUE_TYPE_TO_USER:
> +       case QUEUE_TYPE_TO_CLIENT:
>                 prod = q->index;
>                 q->index = (prod + 1) & q->index_mask;
>                 q->buf->producer_index = q->index;
>                 break;
> -       case QUEUE_TYPE_KERNEL:
> -               prod = q->buf->producer_index;
> -               q->buf->producer_index = (prod + 1) & q->index_mask;
> -               break;
>         }
>  }
>
> @@ -187,12 +170,12 @@ static inline void advance_consumer(struct rxe_queue *q, enum queue_type type)
>         u32 cons;
>
>         switch (type) {
> -       case QUEUE_TYPE_FROM_USER:
> +       case QUEUE_TYPE_FROM_CLIENT:
>                 cons = q->index;
>                 q->index = (cons + 1) & q->index_mask;
>                 q->buf->consumer_index = q->index;
>                 break;
> -       case QUEUE_TYPE_TO_USER:
> +       case QUEUE_TYPE_TO_CLIENT:
>                 pr_warn_once("Normally kernel should not write user space index\n");
>                 /* protect user space index */
>                 cons = smp_load_acquire(&q->buf->consumer_index);
> @@ -200,10 +183,6 @@ static inline void advance_consumer(struct rxe_queue *q, enum queue_type type)
>                 /* same */
>                 smp_store_release(&q->buf->consumer_index, cons);
>                 break;
> -       case QUEUE_TYPE_KERNEL:
> -               cons = q->buf->consumer_index;
> -               q->buf->consumer_index = (cons + 1) & q->index_mask;
> -               break;
>         }
>  }
>
> @@ -212,17 +191,14 @@ static inline void *producer_addr(struct rxe_queue *q, enum queue_type type)
>         u32 prod;
>
>         switch (type) {
> -       case QUEUE_TYPE_FROM_USER:
> +       case QUEUE_TYPE_FROM_CLIENT:
>                 /* protect user space index */
>                 prod = smp_load_acquire(&q->buf->producer_index);
>                 prod &= q->index_mask;
>                 break;
> -       case QUEUE_TYPE_TO_USER:
> +       case QUEUE_TYPE_TO_CLIENT:
>                 prod = q->index;
>                 break;
> -       case QUEUE_TYPE_KERNEL:
> -               prod = q->buf->producer_index;
> -               break;
>         }
>
>         return q->buf->data + (prod << q->log2_elem_size);
> @@ -233,17 +209,14 @@ static inline void *consumer_addr(struct rxe_queue *q, enum queue_type type)
>         u32 cons;
>
>         switch (type) {
> -       case QUEUE_TYPE_FROM_USER:
> +       case QUEUE_TYPE_FROM_CLIENT:
>                 cons = q->index;
>                 break;
> -       case QUEUE_TYPE_TO_USER:
> +       case QUEUE_TYPE_TO_CLIENT:
>                 /* protect user space index */
>                 cons = smp_load_acquire(&q->buf->consumer_index);
>                 cons &= q->index_mask;
>                 break;
> -       case QUEUE_TYPE_KERNEL:
> -               cons = q->buf->consumer_index;
> -               break;
>         }
>
>         return q->buf->data + (cons << q->log2_elem_size);
> @@ -255,17 +228,14 @@ static inline unsigned int producer_index(struct rxe_queue *q,
>         u32 prod;
>
>         switch (type) {
> -       case QUEUE_TYPE_FROM_USER:
> +       case QUEUE_TYPE_FROM_CLIENT:
>                 /* protect user space index */
>                 prod = smp_load_acquire(&q->buf->producer_index);
>                 prod &= q->index_mask;
>                 break;
> -       case QUEUE_TYPE_TO_USER:
> +       case QUEUE_TYPE_TO_CLIENT:
>                 prod = q->index;
>                 break;
> -       case QUEUE_TYPE_KERNEL:
> -               prod = q->buf->producer_index;
> -               break;
>         }
>
>         return prod;
> @@ -277,17 +247,14 @@ static inline unsigned int consumer_index(struct rxe_queue *q,
>         u32 cons;
>
>         switch (type) {
> -       case QUEUE_TYPE_FROM_USER:
> +       case QUEUE_TYPE_FROM_CLIENT:
>                 cons = q->index;
>                 break;
> -       case QUEUE_TYPE_TO_USER:
> +       case QUEUE_TYPE_TO_CLIENT:
>                 /* protect user space index */
>                 cons = smp_load_acquire(&q->buf->consumer_index);
>                 cons &= q->index_mask;
>                 break;
> -       case QUEUE_TYPE_KERNEL:
> -               cons = q->buf->consumer_index;
> -               break;
>         }
>
>         return cons;
> diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
> index 3894197a82f6..22c3edb28945 100644
> --- a/drivers/infiniband/sw/rxe/rxe_req.c
> +++ b/drivers/infiniband/sw/rxe/rxe_req.c
> @@ -49,13 +49,8 @@ static void req_retry(struct rxe_qp *qp)
>         unsigned int cons;
>         unsigned int prod;
>
> -       if (qp->is_user) {
> -               cons = consumer_index(q, QUEUE_TYPE_FROM_USER);
> -               prod = producer_index(q, QUEUE_TYPE_FROM_USER);
> -       } else {
> -               cons = consumer_index(q, QUEUE_TYPE_KERNEL);
> -               prod = producer_index(q, QUEUE_TYPE_KERNEL);
> -       }
> +       cons = consumer_index(q, QUEUE_TYPE_FROM_CLIENT);
> +       prod = producer_index(q, QUEUE_TYPE_FROM_CLIENT);
>
>         qp->req.wqe_index       = cons;
>         qp->req.psn             = qp->comp.psn;
> @@ -121,15 +116,9 @@ static struct rxe_send_wqe *req_next_wqe(struct rxe_qp *qp)
>         unsigned int cons;
>         unsigned int prod;
>
> -       if (qp->is_user) {
> -               wqe = queue_head(q, QUEUE_TYPE_FROM_USER);
> -               cons = consumer_index(q, QUEUE_TYPE_FROM_USER);
> -               prod = producer_index(q, QUEUE_TYPE_FROM_USER);
> -       } else {
> -               wqe = queue_head(q, QUEUE_TYPE_KERNEL);
> -               cons = consumer_index(q, QUEUE_TYPE_KERNEL);
> -               prod = producer_index(q, QUEUE_TYPE_KERNEL);
> -       }
> +       wqe = queue_head(q, QUEUE_TYPE_FROM_CLIENT);
> +       cons = consumer_index(q, QUEUE_TYPE_FROM_CLIENT);
> +       prod = producer_index(q, QUEUE_TYPE_FROM_CLIENT);
>
>         if (unlikely(qp->req.state == QP_STATE_DRAIN)) {
>                 /* check to see if we are drained;
> diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
> index 5501227ddc65..596be002d33d 100644
> --- a/drivers/infiniband/sw/rxe/rxe_resp.c
> +++ b/drivers/infiniband/sw/rxe/rxe_resp.c
> @@ -303,10 +303,7 @@ static enum resp_states get_srq_wqe(struct rxe_qp *qp)
>
>         spin_lock_bh(&srq->rq.consumer_lock);
>
> -       if (qp->is_user)
> -               wqe = queue_head(q, QUEUE_TYPE_FROM_USER);
> -       else
> -               wqe = queue_head(q, QUEUE_TYPE_KERNEL);
> +       wqe = queue_head(q, QUEUE_TYPE_FROM_CLIENT);
>         if (!wqe) {
>                 spin_unlock_bh(&srq->rq.consumer_lock);
>                 return RESPST_ERR_RNR;
> @@ -322,13 +319,8 @@ static enum resp_states get_srq_wqe(struct rxe_qp *qp)
>         memcpy(&qp->resp.srq_wqe, wqe, size);
>
>         qp->resp.wqe = &qp->resp.srq_wqe.wqe;
> -       if (qp->is_user) {
> -               advance_consumer(q, QUEUE_TYPE_FROM_USER);
> -               count = queue_count(q, QUEUE_TYPE_FROM_USER);
> -       } else {
> -               advance_consumer(q, QUEUE_TYPE_KERNEL);
> -               count = queue_count(q, QUEUE_TYPE_KERNEL);
> -       }
> +       advance_consumer(q, QUEUE_TYPE_FROM_CLIENT);
> +       count = queue_count(q, QUEUE_TYPE_FROM_CLIENT);
>
>         if (srq->limit && srq->ibsrq.event_handler && (count < srq->limit)) {
>                 srq->limit = 0;
> @@ -357,12 +349,8 @@ static enum resp_states check_resource(struct rxe_qp *qp,
>                         qp->resp.status = IB_WC_WR_FLUSH_ERR;
>                         return RESPST_COMPLETE;
>                 } else if (!srq) {
> -                       if (qp->is_user)
> -                               qp->resp.wqe = queue_head(qp->rq.queue,
> -                                               QUEUE_TYPE_FROM_USER);
> -                       else
> -                               qp->resp.wqe = queue_head(qp->rq.queue,
> -                                               QUEUE_TYPE_KERNEL);
> +                       qp->resp.wqe = queue_head(qp->rq.queue,
> +                                       QUEUE_TYPE_FROM_CLIENT);
>                         if (qp->resp.wqe) {
>                                 qp->resp.status = IB_WC_WR_FLUSH_ERR;
>                                 return RESPST_COMPLETE;
> @@ -389,12 +377,8 @@ static enum resp_states check_resource(struct rxe_qp *qp,
>                 if (srq)
>                         return get_srq_wqe(qp);
>
> -               if (qp->is_user)
> -                       qp->resp.wqe = queue_head(qp->rq.queue,
> -                                       QUEUE_TYPE_FROM_USER);
> -               else
> -                       qp->resp.wqe = queue_head(qp->rq.queue,
> -                                       QUEUE_TYPE_KERNEL);
> +               qp->resp.wqe = queue_head(qp->rq.queue,
> +                               QUEUE_TYPE_FROM_CLIENT);
>                 return (qp->resp.wqe) ? RESPST_CHK_LENGTH : RESPST_ERR_RNR;
>         }
>
> @@ -936,12 +920,8 @@ static enum resp_states do_complete(struct rxe_qp *qp,
>         }
>
>         /* have copy for srq and reference for !srq */
> -       if (!qp->srq) {
> -               if (qp->is_user)
> -                       advance_consumer(qp->rq.queue, QUEUE_TYPE_FROM_USER);
> -               else
> -                       advance_consumer(qp->rq.queue, QUEUE_TYPE_KERNEL);
> -       }
> +       if (!qp->srq)
> +               advance_consumer(qp->rq.queue, QUEUE_TYPE_FROM_CLIENT);
>
>         qp->resp.wqe = NULL;
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_srq.c b/drivers/infiniband/sw/rxe/rxe_srq.c
> index 610c98d24b5c..a9e7817e2732 100644
> --- a/drivers/infiniband/sw/rxe/rxe_srq.c
> +++ b/drivers/infiniband/sw/rxe/rxe_srq.c
> @@ -93,7 +93,7 @@ int rxe_srq_from_init(struct rxe_dev *rxe, struct rxe_srq *srq,
>         spin_lock_init(&srq->rq.producer_lock);
>         spin_lock_init(&srq->rq.consumer_lock);
>
> -       type = uresp ? QUEUE_TYPE_FROM_USER : QUEUE_TYPE_KERNEL;
> +       type = QUEUE_TYPE_FROM_CLIENT;
>         q = rxe_queue_init(rxe, &srq->rq.max_wr,
>                         srq_wqe_size, type);
>         if (!q) {
> diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
> index 267b5a9c345d..dc70e3edeba6 100644
> --- a/drivers/infiniband/sw/rxe/rxe_verbs.c
> +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
> @@ -218,11 +218,7 @@ static int post_one_recv(struct rxe_rq *rq, const struct ib_recv_wr *ibwr)
>         int num_sge = ibwr->num_sge;
>         int full;
>
> -       if (rq->is_user)
> -               full = queue_full(rq->queue, QUEUE_TYPE_FROM_USER);
> -       else
> -               full = queue_full(rq->queue, QUEUE_TYPE_KERNEL);
> -
> +       full = queue_full(rq->queue, QUEUE_TYPE_FROM_CLIENT);
>         if (unlikely(full)) {
>                 err = -ENOMEM;
>                 goto err1;
> @@ -237,11 +233,7 @@ static int post_one_recv(struct rxe_rq *rq, const struct ib_recv_wr *ibwr)
>         for (i = 0; i < num_sge; i++)
>                 length += ibwr->sg_list[i].length;
>
> -       if (rq->is_user)
> -               recv_wqe = producer_addr(rq->queue, QUEUE_TYPE_FROM_USER);
> -       else
> -               recv_wqe = producer_addr(rq->queue, QUEUE_TYPE_KERNEL);
> -
> +       recv_wqe = producer_addr(rq->queue, QUEUE_TYPE_FROM_CLIENT);
>         recv_wqe->wr_id = ibwr->wr_id;
>         recv_wqe->num_sge = num_sge;
>
> @@ -254,10 +246,7 @@ static int post_one_recv(struct rxe_rq *rq, const struct ib_recv_wr *ibwr)
>         recv_wqe->dma.cur_sge           = 0;
>         recv_wqe->dma.sge_offset        = 0;
>
> -       if (rq->is_user)
> -               advance_producer(rq->queue, QUEUE_TYPE_FROM_USER);
> -       else
> -               advance_producer(rq->queue, QUEUE_TYPE_KERNEL);
> +       advance_producer(rq->queue, QUEUE_TYPE_FROM_CLIENT);
>
>         return 0;
>
> @@ -633,27 +622,17 @@ static int post_one_send(struct rxe_qp *qp, const struct ib_send_wr *ibwr,
>
>         spin_lock_irqsave(&qp->sq.sq_lock, flags);
>
> -       if (qp->is_user)
> -               full = queue_full(sq->queue, QUEUE_TYPE_FROM_USER);
> -       else
> -               full = queue_full(sq->queue, QUEUE_TYPE_KERNEL);
> +       full = queue_full(sq->queue, QUEUE_TYPE_FROM_CLIENT);
>
>         if (unlikely(full)) {
>                 spin_unlock_irqrestore(&qp->sq.sq_lock, flags);
>                 return -ENOMEM;
>         }
>
> -       if (qp->is_user)
> -               send_wqe = producer_addr(sq->queue, QUEUE_TYPE_FROM_USER);
> -       else
> -               send_wqe = producer_addr(sq->queue, QUEUE_TYPE_KERNEL);
> -
> +       send_wqe = producer_addr(sq->queue, QUEUE_TYPE_FROM_CLIENT);
>         init_send_wqe(qp, ibwr, mask, length, send_wqe);
>
> -       if (qp->is_user)
> -               advance_producer(sq->queue, QUEUE_TYPE_FROM_USER);
> -       else
> -               advance_producer(sq->queue, QUEUE_TYPE_KERNEL);
> +       advance_producer(sq->queue, QUEUE_TYPE_FROM_CLIENT);
>
>         spin_unlock_irqrestore(&qp->sq.sq_lock, flags);
>
> @@ -845,18 +824,12 @@ static int rxe_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc)
>
>         spin_lock_irqsave(&cq->cq_lock, flags);
>         for (i = 0; i < num_entries; i++) {
> -               if (cq->is_user)
> -                       cqe = queue_head(cq->queue, QUEUE_TYPE_TO_USER);
> -               else
> -                       cqe = queue_head(cq->queue, QUEUE_TYPE_KERNEL);
> +               cqe = queue_head(cq->queue, QUEUE_TYPE_TO_CLIENT);
>                 if (!cqe)
>                         break;
>
>                 memcpy(wc++, &cqe->ibwc, sizeof(*wc));
> -               if (cq->is_user)
> -                       advance_consumer(cq->queue, QUEUE_TYPE_TO_USER);
> -               else
> -                       advance_consumer(cq->queue, QUEUE_TYPE_KERNEL);
> +               advance_consumer(cq->queue, QUEUE_TYPE_TO_CLIENT);
>         }
>         spin_unlock_irqrestore(&cq->cq_lock, flags);
>
> @@ -868,10 +841,7 @@ static int rxe_peek_cq(struct ib_cq *ibcq, int wc_cnt)
>         struct rxe_cq *cq = to_rcq(ibcq);
>         int count;
>
> -       if (cq->is_user)
> -               count = queue_count(cq->queue, QUEUE_TYPE_TO_USER);
> -       else
> -               count = queue_count(cq->queue, QUEUE_TYPE_KERNEL);
> +       count = queue_count(cq->queue, QUEUE_TYPE_TO_CLIENT);
>
>         return (count > wc_cnt) ? wc_cnt : count;
>  }
> @@ -887,10 +857,7 @@ static int rxe_req_notify_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags)
>         if (cq->notify != IB_CQ_NEXT_COMP)
>                 cq->notify = flags & IB_CQ_SOLICITED_MASK;
>
> -       if (cq->is_user)
> -               empty = queue_empty(cq->queue, QUEUE_TYPE_TO_USER);
> -       else
> -               empty = queue_empty(cq->queue, QUEUE_TYPE_KERNEL);
> +       empty = queue_empty(cq->queue, QUEUE_TYPE_TO_CLIENT);
>
>         if ((flags & IB_CQ_REPORT_MISSED_EVENTS) && !empty)
>                 ret = 1;
> --
> 2.30.2
>

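The barrier pattern the patch applies can be sketched as a small user-space analog. This is a hypothetical `ring` type, not code from the driver: C11's `memory_order_acquire`/`memory_order_release` stand in for the kernel's smp_load_acquire()/smp_store_release(), and the index arithmetic mirrors `queue_empty()`, `queue_full()`, and the advance helpers in rxe_queue.h.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical single-producer/single-consumer ring, analogous to the
 * shared indices in struct rxe_queue_buf.  index_mask = num_elem - 1,
 * where num_elem is a power of two. */
struct ring {
	_Atomic uint32_t producer_index;	/* written by producer only */
	_Atomic uint32_t consumer_index;	/* written by consumer only */
	uint32_t index_mask;
};

/* Consumer side: acquire-load the producer index so that element data
 * written before the producer's release-store is guaranteed visible. */
static int ring_empty(struct ring *q)
{
	uint32_t prod = atomic_load_explicit(&q->producer_index,
					     memory_order_acquire);
	uint32_t cons = atomic_load_explicit(&q->consumer_index,
					     memory_order_relaxed);
	return ((prod - cons) & q->index_mask) == 0;
}

/* Producer side: acquire-load the consumer index before testing for
 * full, mirroring queue_full(). */
static int ring_full(struct ring *q)
{
	uint32_t prod = atomic_load_explicit(&q->producer_index,
					     memory_order_relaxed);
	uint32_t cons = atomic_load_explicit(&q->consumer_index,
					     memory_order_acquire);
	return ((prod + 1 - cons) & q->index_mask) == 0;
}

/* Release-store publishes the new index only after the element body
 * has been written, like smp_store_release() in advance_producer(). */
static void ring_advance_producer(struct ring *q)
{
	uint32_t prod = atomic_load_explicit(&q->producer_index,
					     memory_order_relaxed);
	atomic_store_explicit(&q->producer_index,
			      (prod + 1) & q->index_mask,
			      memory_order_release);
}

static void ring_advance_consumer(struct ring *q)
{
	uint32_t cons = atomic_load_explicit(&q->consumer_index,
					     memory_order_relaxed);
	atomic_store_explicit(&q->consumer_index,
			      (cons + 1) & q->index_mask,
			      memory_order_release);
}
```

As in the driver, one slot is sacrificed so that full and empty are distinguishable: an 8-slot ring (mask 7) holds at most 7 elements. Without the acquire/release pairing, a kernel-to-kernel consumer could observe the advanced index before the element contents, which is the intermittent failure mode the patch addresses.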

* Re: [PATCH for-rc v3 1/6] RDMA/rxe: Add memory barriers to kernel queues
  2021-09-10  1:19   ` Zhu Yanjun
@ 2021-09-10  4:01     ` Bob Pearson
  0 siblings, 0 replies; 22+ messages in thread
From: Bob Pearson @ 2021-09-10  4:01 UTC (permalink / raw)
  To: Zhu Yanjun; +Cc: Jason Gunthorpe, RDMA mailing list, mie, Bart Van Assche

On 9/9/21 8:19 PM, Zhu Yanjun wrote:
> On Fri, Sep 10, 2021 at 4:46 AM Bob Pearson <rpearsonhpe@gmail.com> wrote:
>>
>> Earlier patches added memory barriers to protect user space to kernel
>> space communications. The user space queues were previously shown to
>> have occasional memory synchronization errors which were removed by
>> adding smp_load_acquire()/smp_store_release() barriers.
>>
>> This patch extends that to the case where queues are used between kernel
>> space threads.
>>
>> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
>> ---
>>  drivers/infiniband/sw/rxe/rxe_comp.c  | 10 +---
>>  drivers/infiniband/sw/rxe/rxe_cq.c    | 25 ++-------
>>  drivers/infiniband/sw/rxe/rxe_qp.c    | 10 ++--
>>  drivers/infiniband/sw/rxe/rxe_queue.h | 73 ++++++++-------------------
>>  drivers/infiniband/sw/rxe/rxe_req.c   | 21 ++------
>>  drivers/infiniband/sw/rxe/rxe_resp.c  | 38 ++++----------
>>  drivers/infiniband/sw/rxe/rxe_srq.c   |  2 +-
>>  drivers/infiniband/sw/rxe/rxe_verbs.c | 53 ++++---------------
>>  8 files changed, 55 insertions(+), 177 deletions(-)
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
>> index d2d802c776fd..ed4e3f29bd65 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_comp.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_comp.c
>> @@ -142,10 +142,7 @@ static inline enum comp_state get_wqe(struct rxe_qp *qp,
>>         /* we come here whether or not we found a response packet to see if
>>          * there are any posted WQEs
>>          */
>> -       if (qp->is_user)
>> -               wqe = queue_head(qp->sq.queue, QUEUE_TYPE_FROM_USER);
>> -       else
>> -               wqe = queue_head(qp->sq.queue, QUEUE_TYPE_KERNEL);
> 
> This commit is very similar to the commit in
> https://lore.kernel.org/linux-rdma/20210902084640.679744-5-yangx.jy@fujitsu.com/T/
> 
> Zhu Yanjun
> 
>> +       wqe = queue_head(qp->sq.queue, QUEUE_TYPE_FROM_CLIENT);
>>         *wqe_p = wqe;
>>
>>         /* no WQE or requester has not started it yet */
>> @@ -432,10 +429,7 @@ static void do_complete(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
>>         if (post)
>>                 make_send_cqe(qp, wqe, &cqe);
>>
>> -       if (qp->is_user)
>> -               advance_consumer(qp->sq.queue, QUEUE_TYPE_FROM_USER);
>> -       else
>> -               advance_consumer(qp->sq.queue, QUEUE_TYPE_KERNEL);
>> +       advance_consumer(qp->sq.queue, QUEUE_TYPE_FROM_CLIENT);
>>
>>         if (post)
>>                 rxe_cq_post(qp->scq, &cqe, 0);
>> diff --git a/drivers/infiniband/sw/rxe/rxe_cq.c b/drivers/infiniband/sw/rxe/rxe_cq.c
>> index aef288f164fd..4e26c2ea4a59 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_cq.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_cq.c
>> @@ -25,11 +25,7 @@ int rxe_cq_chk_attr(struct rxe_dev *rxe, struct rxe_cq *cq,
>>         }
>>
>>         if (cq) {
>> -               if (cq->is_user)
>> -                       count = queue_count(cq->queue, QUEUE_TYPE_TO_USER);
>> -               else
>> -                       count = queue_count(cq->queue, QUEUE_TYPE_KERNEL);
>> -
>> +               count = queue_count(cq->queue, QUEUE_TYPE_TO_CLIENT);
>>                 if (cqe < count) {
>>                         pr_warn("cqe(%d) < current # elements in queue (%d)",
>>                                 cqe, count);
>> @@ -65,7 +61,7 @@ int rxe_cq_from_init(struct rxe_dev *rxe, struct rxe_cq *cq, int cqe,
>>         int err;
>>         enum queue_type type;
>>
>> -       type = uresp ? QUEUE_TYPE_TO_USER : QUEUE_TYPE_KERNEL;
>> +       type = QUEUE_TYPE_TO_CLIENT;
>>         cq->queue = rxe_queue_init(rxe, &cqe,
>>                         sizeof(struct rxe_cqe), type);
>>         if (!cq->queue) {
>> @@ -117,11 +113,7 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
>>
>>         spin_lock_irqsave(&cq->cq_lock, flags);
>>
>> -       if (cq->is_user)
>> -               full = queue_full(cq->queue, QUEUE_TYPE_TO_USER);
>> -       else
>> -               full = queue_full(cq->queue, QUEUE_TYPE_KERNEL);
>> -
>> +       full = queue_full(cq->queue, QUEUE_TYPE_TO_CLIENT);
>>         if (unlikely(full)) {
>>                 spin_unlock_irqrestore(&cq->cq_lock, flags);
>>                 if (cq->ibcq.event_handler) {
>> @@ -134,17 +126,10 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
>>                 return -EBUSY;
>>         }
>>
>> -       if (cq->is_user)
>> -               addr = producer_addr(cq->queue, QUEUE_TYPE_TO_USER);
>> -       else
>> -               addr = producer_addr(cq->queue, QUEUE_TYPE_KERNEL);
>> -
>> +       addr = producer_addr(cq->queue, QUEUE_TYPE_TO_CLIENT);
>>         memcpy(addr, cqe, sizeof(*cqe));
>>
>> -       if (cq->is_user)
>> -               advance_producer(cq->queue, QUEUE_TYPE_TO_USER);
>> -       else
>> -               advance_producer(cq->queue, QUEUE_TYPE_KERNEL);
>> +       advance_producer(cq->queue, QUEUE_TYPE_TO_CLIENT);
>>
>>         spin_unlock_irqrestore(&cq->cq_lock, flags);
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
>> index 1ab6af7ddb25..2e923af642f8 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_qp.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_qp.c
>> @@ -231,7 +231,7 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
>>         qp->sq.max_inline = init->cap.max_inline_data = wqe_size;
>>         wqe_size += sizeof(struct rxe_send_wqe);
>>
>> -       type = uresp ? QUEUE_TYPE_FROM_USER : QUEUE_TYPE_KERNEL;
>> +       type = QUEUE_TYPE_FROM_CLIENT;
>>         qp->sq.queue = rxe_queue_init(rxe, &qp->sq.max_wr,
>>                                 wqe_size, type);
>>         if (!qp->sq.queue)
>> @@ -248,12 +248,8 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
>>                 return err;
>>         }
>>
>> -       if (qp->is_user)
>>                 qp->req.wqe_index = producer_index(qp->sq.queue,
>> -                                               QUEUE_TYPE_FROM_USER);
>> -       else
>> -               qp->req.wqe_index = producer_index(qp->sq.queue,
>> -                                               QUEUE_TYPE_KERNEL);
>> +                                       QUEUE_TYPE_FROM_CLIENT);
>>
>>         qp->req.state           = QP_STATE_RESET;
>>         qp->req.opcode          = -1;
>> @@ -293,7 +289,7 @@ static int rxe_qp_init_resp(struct rxe_dev *rxe, struct rxe_qp *qp,
>>                 pr_debug("qp#%d max_wr = %d, max_sge = %d, wqe_size = %d\n",
>>                          qp_num(qp), qp->rq.max_wr, qp->rq.max_sge, wqe_size);
>>
>> -               type = uresp ? QUEUE_TYPE_FROM_USER : QUEUE_TYPE_KERNEL;
>> +               type = QUEUE_TYPE_FROM_CLIENT;
>>                 qp->rq.queue = rxe_queue_init(rxe, &qp->rq.max_wr,
>>                                         wqe_size, type);
>>                 if (!qp->rq.queue)
>> diff --git a/drivers/infiniband/sw/rxe/rxe_queue.h b/drivers/infiniband/sw/rxe/rxe_queue.h
>> index 2702b0e55fc3..d465aa9342e1 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_queue.h
>> +++ b/drivers/infiniband/sw/rxe/rxe_queue.h
>> @@ -35,9 +35,8 @@
>>
>>  /* type of queue */
>>  enum queue_type {
>> -       QUEUE_TYPE_KERNEL,
>> -       QUEUE_TYPE_TO_USER,
>> -       QUEUE_TYPE_FROM_USER,
>> +       QUEUE_TYPE_TO_CLIENT,
>> +       QUEUE_TYPE_FROM_CLIENT,
>>  };
>>
>>  struct rxe_queue {
>> @@ -87,20 +86,16 @@ static inline int queue_empty(struct rxe_queue *q, enum queue_type type)
>>         u32 cons;
>>
>>         switch (type) {
>> -       case QUEUE_TYPE_FROM_USER:
>> +       case QUEUE_TYPE_FROM_CLIENT:
>>                 /* protect user space index */
>>                 prod = smp_load_acquire(&q->buf->producer_index);
>>                 cons = q->index;
>>                 break;
>> -       case QUEUE_TYPE_TO_USER:
>> +       case QUEUE_TYPE_TO_CLIENT:
>>                 prod = q->index;
>>                 /* protect user space index */
>>                 cons = smp_load_acquire(&q->buf->consumer_index);
>>                 break;
>> -       case QUEUE_TYPE_KERNEL:
>> -               prod = q->buf->producer_index;
>> -               cons = q->buf->consumer_index;
>> -               break;
>>         }
>>
>>         return ((prod - cons) & q->index_mask) == 0;
>> @@ -112,20 +107,16 @@ static inline int queue_full(struct rxe_queue *q, enum queue_type type)
>>         u32 cons;
>>
>>         switch (type) {
>> -       case QUEUE_TYPE_FROM_USER:
>> +       case QUEUE_TYPE_FROM_CLIENT:
>>                 /* protect user space index */
>>                 prod = smp_load_acquire(&q->buf->producer_index);
>>                 cons = q->index;
>>                 break;
>> -       case QUEUE_TYPE_TO_USER:
>> +       case QUEUE_TYPE_TO_CLIENT:
>>                 prod = q->index;
>>                 /* protect user space index */
>>                 cons = smp_load_acquire(&q->buf->consumer_index);
>>                 break;
>> -       case QUEUE_TYPE_KERNEL:
>> -               prod = q->buf->producer_index;
>> -               cons = q->buf->consumer_index;
>> -               break;
>>         }
>>
>>         return ((prod + 1 - cons) & q->index_mask) == 0;
>> @@ -138,20 +129,16 @@ static inline unsigned int queue_count(const struct rxe_queue *q,
>>         u32 cons;
>>
>>         switch (type) {
>> -       case QUEUE_TYPE_FROM_USER:
>> +       case QUEUE_TYPE_FROM_CLIENT:
>>                 /* protect user space index */
>>                 prod = smp_load_acquire(&q->buf->producer_index);
>>                 cons = q->index;
>>                 break;
>> -       case QUEUE_TYPE_TO_USER:
>> +       case QUEUE_TYPE_TO_CLIENT:
>>                 prod = q->index;
>>                 /* protect user space index */
>>                 cons = smp_load_acquire(&q->buf->consumer_index);
>>                 break;
>> -       case QUEUE_TYPE_KERNEL:
>> -               prod = q->buf->producer_index;
>> -               cons = q->buf->consumer_index;
>> -               break;
>>         }
>>
>>         return (prod - cons) & q->index_mask;
>> @@ -162,7 +149,7 @@ static inline void advance_producer(struct rxe_queue *q, enum queue_type type)
>>         u32 prod;
>>
>>         switch (type) {
>> -       case QUEUE_TYPE_FROM_USER:
>> +       case QUEUE_TYPE_FROM_CLIENT:
>>                 pr_warn_once("Normally kernel should not write user space index\n");
>>                 /* protect user space index */
>>                 prod = smp_load_acquire(&q->buf->producer_index);
>> @@ -170,15 +157,11 @@ static inline void advance_producer(struct rxe_queue *q, enum queue_type type)
>>                 /* same */
>>                 smp_store_release(&q->buf->producer_index, prod);
>>                 break;
>> -       case QUEUE_TYPE_TO_USER:
>> +       case QUEUE_TYPE_TO_CLIENT:
>>                 prod = q->index;
>>                 q->index = (prod + 1) & q->index_mask;
>>                 q->buf->producer_index = q->index;
>>                 break;
>> -       case QUEUE_TYPE_KERNEL:
>> -               prod = q->buf->producer_index;
>> -               q->buf->producer_index = (prod + 1) & q->index_mask;
>> -               break;
>>         }
>>  }
>>
>> @@ -187,12 +170,12 @@ static inline void advance_consumer(struct rxe_queue *q, enum queue_type type)
>>         u32 cons;
>>
>>         switch (type) {
>> -       case QUEUE_TYPE_FROM_USER:
>> +       case QUEUE_TYPE_FROM_CLIENT:
>>                 cons = q->index;
>>                 q->index = (cons + 1) & q->index_mask;
>>                 q->buf->consumer_index = q->index;
>>                 break;
>> -       case QUEUE_TYPE_TO_USER:
>> +       case QUEUE_TYPE_TO_CLIENT:
>>                 pr_warn_once("Normally kernel should not write user space index\n");
>>                 /* protect user space index */
>>                 cons = smp_load_acquire(&q->buf->consumer_index);
>> @@ -200,10 +183,6 @@ static inline void advance_consumer(struct rxe_queue *q, enum queue_type type)
>>                 /* same */
>>                 smp_store_release(&q->buf->consumer_index, cons);
>>                 break;
>> -       case QUEUE_TYPE_KERNEL:
>> -               cons = q->buf->consumer_index;
>> -               q->buf->consumer_index = (cons + 1) & q->index_mask;
>> -               break;
>>         }
>>  }
>>
>> @@ -212,17 +191,14 @@ static inline void *producer_addr(struct rxe_queue *q, enum queue_type type)
>>         u32 prod;
>>
>>         switch (type) {
>> -       case QUEUE_TYPE_FROM_USER:
>> +       case QUEUE_TYPE_FROM_CLIENT:
>>                 /* protect user space index */
>>                 prod = smp_load_acquire(&q->buf->producer_index);
>>                 prod &= q->index_mask;
>>                 break;
>> -       case QUEUE_TYPE_TO_USER:
>> +       case QUEUE_TYPE_TO_CLIENT:
>>                 prod = q->index;
>>                 break;
>> -       case QUEUE_TYPE_KERNEL:
>> -               prod = q->buf->producer_index;
>> -               break;
>>         }
>>
>>         return q->buf->data + (prod << q->log2_elem_size);
>> @@ -233,17 +209,14 @@ static inline void *consumer_addr(struct rxe_queue *q, enum queue_type type)
>>         u32 cons;
>>
>>         switch (type) {
>> -       case QUEUE_TYPE_FROM_USER:
>> +       case QUEUE_TYPE_FROM_CLIENT:
>>                 cons = q->index;
>>                 break;
>> -       case QUEUE_TYPE_TO_USER:
>> +       case QUEUE_TYPE_TO_CLIENT:
>>                 /* protect user space index */
>>                 cons = smp_load_acquire(&q->buf->consumer_index);
>>                 cons &= q->index_mask;
>>                 break;
>> -       case QUEUE_TYPE_KERNEL:
>> -               cons = q->buf->consumer_index;
>> -               break;
>>         }
>>
>>         return q->buf->data + (cons << q->log2_elem_size);
>> @@ -255,17 +228,14 @@ static inline unsigned int producer_index(struct rxe_queue *q,
>>         u32 prod;
>>
>>         switch (type) {
>> -       case QUEUE_TYPE_FROM_USER:
>> +       case QUEUE_TYPE_FROM_CLIENT:
>>                 /* protect user space index */
>>                 prod = smp_load_acquire(&q->buf->producer_index);
>>                 prod &= q->index_mask;
>>                 break;
>> -       case QUEUE_TYPE_TO_USER:
>> +       case QUEUE_TYPE_TO_CLIENT:
>>                 prod = q->index;
>>                 break;
>> -       case QUEUE_TYPE_KERNEL:
>> -               prod = q->buf->producer_index;
>> -               break;
>>         }
>>
>>         return prod;
>> @@ -277,17 +247,14 @@ static inline unsigned int consumer_index(struct rxe_queue *q,
>>         u32 cons;
>>
>>         switch (type) {
>> -       case QUEUE_TYPE_FROM_USER:
>> +       case QUEUE_TYPE_FROM_CLIENT:
>>                 cons = q->index;
>>                 break;
>> -       case QUEUE_TYPE_TO_USER:
>> +       case QUEUE_TYPE_TO_CLIENT:
>>                 /* protect user space index */
>>                 cons = smp_load_acquire(&q->buf->consumer_index);
>>                 cons &= q->index_mask;
>>                 break;
>> -       case QUEUE_TYPE_KERNEL:
>> -               cons = q->buf->consumer_index;
>> -               break;
>>         }
>>
>>         return cons;
>> diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
>> index 3894197a82f6..22c3edb28945 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_req.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_req.c
>> @@ -49,13 +49,8 @@ static void req_retry(struct rxe_qp *qp)
>>         unsigned int cons;
>>         unsigned int prod;
>>
>> -       if (qp->is_user) {
>> -               cons = consumer_index(q, QUEUE_TYPE_FROM_USER);
>> -               prod = producer_index(q, QUEUE_TYPE_FROM_USER);
>> -       } else {
>> -               cons = consumer_index(q, QUEUE_TYPE_KERNEL);
>> -               prod = producer_index(q, QUEUE_TYPE_KERNEL);
>> -       }
>> +       cons = consumer_index(q, QUEUE_TYPE_FROM_CLIENT);
>> +       prod = producer_index(q, QUEUE_TYPE_FROM_CLIENT);
>>
>>         qp->req.wqe_index       = cons;
>>         qp->req.psn             = qp->comp.psn;
>> @@ -121,15 +116,9 @@ static struct rxe_send_wqe *req_next_wqe(struct rxe_qp *qp)
>>         unsigned int cons;
>>         unsigned int prod;
>>
>> -       if (qp->is_user) {
>> -               wqe = queue_head(q, QUEUE_TYPE_FROM_USER);
>> -               cons = consumer_index(q, QUEUE_TYPE_FROM_USER);
>> -               prod = producer_index(q, QUEUE_TYPE_FROM_USER);
>> -       } else {
>> -               wqe = queue_head(q, QUEUE_TYPE_KERNEL);
>> -               cons = consumer_index(q, QUEUE_TYPE_KERNEL);
>> -               prod = producer_index(q, QUEUE_TYPE_KERNEL);
>> -       }
>> +       wqe = queue_head(q, QUEUE_TYPE_FROM_CLIENT);
>> +       cons = consumer_index(q, QUEUE_TYPE_FROM_CLIENT);
>> +       prod = producer_index(q, QUEUE_TYPE_FROM_CLIENT);
>>
>>         if (unlikely(qp->req.state == QP_STATE_DRAIN)) {
>>                 /* check to see if we are drained;
>> diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
>> index 5501227ddc65..596be002d33d 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_resp.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_resp.c
>> @@ -303,10 +303,7 @@ static enum resp_states get_srq_wqe(struct rxe_qp *qp)
>>
>>         spin_lock_bh(&srq->rq.consumer_lock);
>>
>> -       if (qp->is_user)
>> -               wqe = queue_head(q, QUEUE_TYPE_FROM_USER);
>> -       else
>> -               wqe = queue_head(q, QUEUE_TYPE_KERNEL);
>> +       wqe = queue_head(q, QUEUE_TYPE_FROM_CLIENT);
>>         if (!wqe) {
>>                 spin_unlock_bh(&srq->rq.consumer_lock);
>>                 return RESPST_ERR_RNR;
>> @@ -322,13 +319,8 @@ static enum resp_states get_srq_wqe(struct rxe_qp *qp)
>>         memcpy(&qp->resp.srq_wqe, wqe, size);
>>
>>         qp->resp.wqe = &qp->resp.srq_wqe.wqe;
>> -       if (qp->is_user) {
>> -               advance_consumer(q, QUEUE_TYPE_FROM_USER);
>> -               count = queue_count(q, QUEUE_TYPE_FROM_USER);
>> -       } else {
>> -               advance_consumer(q, QUEUE_TYPE_KERNEL);
>> -               count = queue_count(q, QUEUE_TYPE_KERNEL);
>> -       }
>> +       advance_consumer(q, QUEUE_TYPE_FROM_CLIENT);
>> +       count = queue_count(q, QUEUE_TYPE_FROM_CLIENT);
>>
>>         if (srq->limit && srq->ibsrq.event_handler && (count < srq->limit)) {
>>                 srq->limit = 0;
>> @@ -357,12 +349,8 @@ static enum resp_states check_resource(struct rxe_qp *qp,
>>                         qp->resp.status = IB_WC_WR_FLUSH_ERR;
>>                         return RESPST_COMPLETE;
>>                 } else if (!srq) {
>> -                       if (qp->is_user)
>> -                               qp->resp.wqe = queue_head(qp->rq.queue,
>> -                                               QUEUE_TYPE_FROM_USER);
>> -                       else
>> -                               qp->resp.wqe = queue_head(qp->rq.queue,
>> -                                               QUEUE_TYPE_KERNEL);
>> +                       qp->resp.wqe = queue_head(qp->rq.queue,
>> +                                       QUEUE_TYPE_FROM_CLIENT);
>>                         if (qp->resp.wqe) {
>>                                 qp->resp.status = IB_WC_WR_FLUSH_ERR;
>>                                 return RESPST_COMPLETE;
>> @@ -389,12 +377,8 @@ static enum resp_states check_resource(struct rxe_qp *qp,
>>                 if (srq)
>>                         return get_srq_wqe(qp);
>>
>> -               if (qp->is_user)
>> -                       qp->resp.wqe = queue_head(qp->rq.queue,
>> -                                       QUEUE_TYPE_FROM_USER);
>> -               else
>> -                       qp->resp.wqe = queue_head(qp->rq.queue,
>> -                                       QUEUE_TYPE_KERNEL);
>> +               qp->resp.wqe = queue_head(qp->rq.queue,
>> +                               QUEUE_TYPE_FROM_CLIENT);
>>                 return (qp->resp.wqe) ? RESPST_CHK_LENGTH : RESPST_ERR_RNR;
>>         }
>>
>> @@ -936,12 +920,8 @@ static enum resp_states do_complete(struct rxe_qp *qp,
>>         }
>>
>>         /* have copy for srq and reference for !srq */
>> -       if (!qp->srq) {
>> -               if (qp->is_user)
>> -                       advance_consumer(qp->rq.queue, QUEUE_TYPE_FROM_USER);
>> -               else
>> -                       advance_consumer(qp->rq.queue, QUEUE_TYPE_KERNEL);
>> -       }
>> +       if (!qp->srq)
>> +               advance_consumer(qp->rq.queue, QUEUE_TYPE_FROM_CLIENT);
>>
>>         qp->resp.wqe = NULL;
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_srq.c b/drivers/infiniband/sw/rxe/rxe_srq.c
>> index 610c98d24b5c..a9e7817e2732 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_srq.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_srq.c
>> @@ -93,7 +93,7 @@ int rxe_srq_from_init(struct rxe_dev *rxe, struct rxe_srq *srq,
>>         spin_lock_init(&srq->rq.producer_lock);
>>         spin_lock_init(&srq->rq.consumer_lock);
>>
>> -       type = uresp ? QUEUE_TYPE_FROM_USER : QUEUE_TYPE_KERNEL;
>> +       type = QUEUE_TYPE_FROM_CLIENT;
>>         q = rxe_queue_init(rxe, &srq->rq.max_wr,
>>                         srq_wqe_size, type);
>>         if (!q) {
>> diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
>> index 267b5a9c345d..dc70e3edeba6 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_verbs.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
>> @@ -218,11 +218,7 @@ static int post_one_recv(struct rxe_rq *rq, const struct ib_recv_wr *ibwr)
>>         int num_sge = ibwr->num_sge;
>>         int full;
>>
>> -       if (rq->is_user)
>> -               full = queue_full(rq->queue, QUEUE_TYPE_FROM_USER);
>> -       else
>> -               full = queue_full(rq->queue, QUEUE_TYPE_KERNEL);
>> -
>> +       full = queue_full(rq->queue, QUEUE_TYPE_FROM_CLIENT);
>>         if (unlikely(full)) {
>>                 err = -ENOMEM;
>>                 goto err1;
>> @@ -237,11 +233,7 @@ static int post_one_recv(struct rxe_rq *rq, const struct ib_recv_wr *ibwr)
>>         for (i = 0; i < num_sge; i++)
>>                 length += ibwr->sg_list[i].length;
>>
>> -       if (rq->is_user)
>> -               recv_wqe = producer_addr(rq->queue, QUEUE_TYPE_FROM_USER);
>> -       else
>> -               recv_wqe = producer_addr(rq->queue, QUEUE_TYPE_KERNEL);
>> -
>> +       recv_wqe = producer_addr(rq->queue, QUEUE_TYPE_FROM_CLIENT);
>>         recv_wqe->wr_id = ibwr->wr_id;
>>         recv_wqe->num_sge = num_sge;
>>
>> @@ -254,10 +246,7 @@ static int post_one_recv(struct rxe_rq *rq, const struct ib_recv_wr *ibwr)
>>         recv_wqe->dma.cur_sge           = 0;
>>         recv_wqe->dma.sge_offset        = 0;
>>
>> -       if (rq->is_user)
>> -               advance_producer(rq->queue, QUEUE_TYPE_FROM_USER);
>> -       else
>> -               advance_producer(rq->queue, QUEUE_TYPE_KERNEL);
>> +       advance_producer(rq->queue, QUEUE_TYPE_FROM_CLIENT);
>>
>>         return 0;
>>
>> @@ -633,27 +622,17 @@ static int post_one_send(struct rxe_qp *qp, const struct ib_send_wr *ibwr,
>>
>>         spin_lock_irqsave(&qp->sq.sq_lock, flags);
>>
>> -       if (qp->is_user)
>> -               full = queue_full(sq->queue, QUEUE_TYPE_FROM_USER);
>> -       else
>> -               full = queue_full(sq->queue, QUEUE_TYPE_KERNEL);
>> +       full = queue_full(sq->queue, QUEUE_TYPE_FROM_CLIENT);
>>
>>         if (unlikely(full)) {
>>                 spin_unlock_irqrestore(&qp->sq.sq_lock, flags);
>>                 return -ENOMEM;
>>         }
>>
>> -       if (qp->is_user)
>> -               send_wqe = producer_addr(sq->queue, QUEUE_TYPE_FROM_USER);
>> -       else
>> -               send_wqe = producer_addr(sq->queue, QUEUE_TYPE_KERNEL);
>> -
>> +       send_wqe = producer_addr(sq->queue, QUEUE_TYPE_FROM_CLIENT);
>>         init_send_wqe(qp, ibwr, mask, length, send_wqe);
>>
>> -       if (qp->is_user)
>> -               advance_producer(sq->queue, QUEUE_TYPE_FROM_USER);
>> -       else
>> -               advance_producer(sq->queue, QUEUE_TYPE_KERNEL);
>> +       advance_producer(sq->queue, QUEUE_TYPE_FROM_CLIENT);
>>
>>         spin_unlock_irqrestore(&qp->sq.sq_lock, flags);
>>
>> @@ -845,18 +824,12 @@ static int rxe_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc)
>>
>>         spin_lock_irqsave(&cq->cq_lock, flags);
>>         for (i = 0; i < num_entries; i++) {
>> -               if (cq->is_user)
>> -                       cqe = queue_head(cq->queue, QUEUE_TYPE_TO_USER);
>> -               else
>> -                       cqe = queue_head(cq->queue, QUEUE_TYPE_KERNEL);
>> +               cqe = queue_head(cq->queue, QUEUE_TYPE_TO_CLIENT);
>>                 if (!cqe)
>>                         break;
>>
>>                 memcpy(wc++, &cqe->ibwc, sizeof(*wc));
>> -               if (cq->is_user)
>> -                       advance_consumer(cq->queue, QUEUE_TYPE_TO_USER);
>> -               else
>> -                       advance_consumer(cq->queue, QUEUE_TYPE_KERNEL);
>> +               advance_consumer(cq->queue, QUEUE_TYPE_TO_CLIENT);
>>         }
>>         spin_unlock_irqrestore(&cq->cq_lock, flags);
>>
>> @@ -868,10 +841,7 @@ static int rxe_peek_cq(struct ib_cq *ibcq, int wc_cnt)
>>         struct rxe_cq *cq = to_rcq(ibcq);
>>         int count;
>>
>> -       if (cq->is_user)
>> -               count = queue_count(cq->queue, QUEUE_TYPE_TO_USER);
>> -       else
>> -               count = queue_count(cq->queue, QUEUE_TYPE_KERNEL);
>> +       count = queue_count(cq->queue, QUEUE_TYPE_TO_CLIENT);
>>
>>         return (count > wc_cnt) ? wc_cnt : count;
>>  }
>> @@ -887,10 +857,7 @@ static int rxe_req_notify_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags)
>>         if (cq->notify != IB_CQ_NEXT_COMP)
>>                 cq->notify = flags & IB_CQ_SOLICITED_MASK;
>>
>> -       if (cq->is_user)
>> -               empty = queue_empty(cq->queue, QUEUE_TYPE_TO_USER);
>> -       else
>> -               empty = queue_empty(cq->queue, QUEUE_TYPE_KERNEL);
>> +       empty = queue_empty(cq->queue, QUEUE_TYPE_TO_CLIENT);
>>
>>         if ((flags & IB_CQ_REPORT_MISSED_EVENTS) && !empty)
>>                 ret = 1;
>> --
>> 2.30.2
>>


It's the same one. That is called out in the cover letter. It still needs to get done.
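For reference, the index handling that patch standardizes on is the classic single-producer/single-consumer ring: each side publishes its own index with a release store and reads the peer's index with an acquire load. A userspace sketch under those assumptions (names and layout are illustrative, not the rxe code):

```c
/* SPSC ring sketch: producer publishes producer_index with a release
 * store; consumer reads it with an acquire load, and vice versa. */
#include <stdatomic.h>

#define INDEX_MASK 7                     /* 8 slots, usable capacity 7 */

struct ring {
	_Atomic unsigned producer_index;     /* written only by producer */
	_Atomic unsigned consumer_index;     /* written only by consumer */
	int data[INDEX_MASK + 1];
};

static int ring_post(struct ring *q, int v)     /* producer side */
{
	unsigned prod = atomic_load_explicit(&q->producer_index,
					     memory_order_relaxed);
	unsigned cons = atomic_load_explicit(&q->consumer_index,
					     memory_order_acquire);
	if (((prod + 1 - cons) & INDEX_MASK) == 0)
		return -1;                      /* full */
	q->data[prod & INDEX_MASK] = v;
	/* release: the data write above must be visible before the index */
	atomic_store_explicit(&q->producer_index, (prod + 1) & INDEX_MASK,
			      memory_order_release);
	return 0;
}

static int ring_poll(struct ring *q, int *v)    /* consumer side */
{
	unsigned cons = atomic_load_explicit(&q->consumer_index,
					     memory_order_relaxed);
	unsigned prod = atomic_load_explicit(&q->producer_index,
					     memory_order_acquire);
	if (((prod - cons) & INDEX_MASK) == 0)
		return -1;                      /* empty */
	*v = q->data[cons & INDEX_MASK];
	atomic_store_explicit(&q->consumer_index, (cons + 1) & INDEX_MASK,
			      memory_order_release);
	return 0;
}
```

In the kernel the acquire/release pair is smp_load_acquire()/smp_store_release(); the old QUEUE_TYPE_KERNEL path used plain loads and stores with no ordering, which is the intermittent failure the first patch addresses.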



Bob




^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes.
  2021-09-09 21:52 ` [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes Bart Van Assche
@ 2021-09-10 19:38   ` Pearson, Robert B
  2021-09-10 20:23     ` Bart Van Assche
  0 siblings, 1 reply; 22+ messages in thread
From: Pearson, Robert B @ 2021-09-10 19:38 UTC (permalink / raw)
  To: Bart Van Assche, Bob Pearson, jgg, zyjzyj2000, linux-rdma, mie

Bart,

I was able to run this test case, but it does not fail for me. On my system it passes in ~1 sec.
I have several questions about your system setup.

1. Which rdma-core are you running? Out of the box, or the GitHub tree?
2. Can you run ib_send_bw? The Python test suite in rdma-core?
3. Where did you get the kernel bits? Which git tree? Which branch?

Thanks,

Bob Pearson

-----Original Message-----
From: Bart Van Assche <bvanassche@acm.org> 
Sent: Thursday, September 9, 2021 4:52 PM
To: Bob Pearson <rpearsonhpe@gmail.com>; jgg@nvidia.com; zyjzyj2000@gmail.com; linux-rdma@vger.kernel.org; mie@igel.co.jp
Subject: Re: [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes.

On 9/9/21 1:44 PM, Bob Pearson wrote:
> This series of patches implements several bug fixes and minor cleanups 
> of the rxe driver. Specifically these fix a bug exposed by blktest.
> 
> They apply cleanly to both
> commit 2169b908894df2ce83e7eb4a399d3224b2635126 (origin/for-rc, 
> for-rc) commit 6a217437f9f5482a3f6f2dc5fcd27cf0f62409ac (HEAD -> for-next,
> 	origin/wip/jgg-for-next, origin/for-next, origin/HEAD)
> 
> These are being resubmitted to for-rc instead of for-next.

Hi Bob,

Thanks for having rebased and reposted this patch series. I have applied this series on top of commit 2169b908894d ("IB/hfi1: make hist static").
A kernel bug was triggered while running test srp/001. I have attached the kernel configuration used in my test to this email.

Thanks,

Bart.



ib_srpt Received SRP_LOGIN_REQ with i_port_id fe80:0000:0000:0000:5054:00ff:fe86:7464, t_port_id 5054:00ff:fe86:7464:5054:00ff:fe86:7464 and it_iu_len 8260 on port 1 (guid=fe80:0000:0000:0000:5054:00ff:fe86:7464); pkey 0xffff
BUG: unable to handle page fault for address: ffffc900e357d614
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page PGD 100000067 P4D 100000067 PUD 0
Oops: 0000 [#1] PREEMPT SMP KASAN
CPU: 26 PID: 148 Comm: ksoftirqd/26 Tainted: G            E     5.14.0-rc6-dbg+ #2
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
RIP: 0010:rxe_completer+0x96d/0x1050 [rdma_rxe]
Code: e0 49 8b 44 24 08 44 89 e9 41 d3 e6 4e 8d a4 30 80 01 00 00 4d 85 e4 0f 84 f9 00 00 00 49 8d bc 24 94 00 00 00 e8 73 a8 b1 e0 <41> 8b 84 24 94 00 00 00 85 c0 0f 84 df 00 00 00 83 f8 03 0f 84 bf
RSP: 0018:ffff8881014075f8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88813c67c000 RCX: dffffc0000000000
RDX: 0000000000000007 RSI: ffffffff826920c0 RDI: ffffc900e357d614
RBP: ffff8881014076e8 R08: ffffffffa09b228d R09: ffff88813c67c57b
R10: ffffed10278cf8af R11: 0000000000000000 R12: ffffc900e357d580
R13: 000000000000000a R14: 00000000d9c99400 R15: ffff8881515ddd08
FS:  0000000000000000(0000) GS:ffff88842d100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffc900e357d614 CR3: 0000000002e29005 CR4: 0000000000770ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
  rxe_do_task+0xdd/0x160 [rdma_rxe]
  rxe_run_task+0x67/0x80 [rdma_rxe]
  rxe_comp_queue_pkt+0x75/0x80 [rdma_rxe]
  rxe_rcv+0x345/0x480 [rdma_rxe]
  rxe_xmit_packet+0x1af/0x300 [rdma_rxe]
  send_ack.isra.0+0x88/0xd0 [rdma_rxe]
  rxe_responder+0xf4c/0x15e0 [rdma_rxe]
  rxe_do_task+0xdd/0x160 [rdma_rxe]
  rxe_run_task+0x67/0x80 [rdma_rxe]
  rxe_resp_queue_pkt+0x5a/0x60 [rdma_rxe]
  rxe_rcv+0x370/0x480 [rdma_rxe]
  rxe_xmit_packet+0x1af/0x300 [rdma_rxe]
  rxe_requester+0x4f4/0xe80 [rdma_rxe]
  rxe_do_task+0xdd/0x160 [rdma_rxe]
  tasklet_action_common.constprop.0+0x168/0x1b0
  tasklet_action+0x44/0x60
  __do_softirq+0x1db/0x6ed
  run_ksoftirqd+0x37/0x60
  smpboot_thread_fn+0x302/0x410
  kthread+0x1f6/0x220
  ret_from_fork+0x1f/0x30
Modules linked in: ib_srp(E) scsi_transport_srp(E) target_core_user(E) uio(E) target_core_pscsi(E) target_core_file(E) ib_srpt(E) target_core_iblock(E) target_core_mod(E) ib_umad(E) rdma_ucm(E) ib_iser(E) libiscsi(E) scsi_transport_iscsi(E) rdma_cm(E) iw_cm(E)
scsi_debug(E) ib_cm(E) rdma_rxe(E) ip6_udp_tunnel(E) udp_tunnel(E) ib_uverbs(E) null_blk(E) ib_core(E) brd(E) af_packet(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E)
nft_chain_nat(E) nf_tables(E) ebtable_nat(E) iTCO_wdt(E) watchdog(E) ebtable_broute(E) intel_rapl_msr(E) intel_pmc_bxt(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) libcrc32c(E)
iptable_mangle(E) iptable_raw(E) ip_set(E) nfnetlink(E) ebtable_filter(E) ebtables(E) ip6table_filter(E) ip6_tables(E) rfkill(E) iptable_filter(E) ip_tables(E) x_tables(E) bpfilter(E) intel_rapl_common(E)
  iosf_mbi(E) isst_if_common(E) i2c_i801(E) pcspkr(E) i2c_smbus(E) virtio_net(E) lpc_ich(E) virtio_balloon(E) net_failover(E) failover(E) tiny_power_button(E) button(E) fuse(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) ghash_clmulni_intel(E) aesni_intel(E)
crypto_simd(E) cryptd(E) sr_mod(E) serio_raw(E) cdrom(E) virtio_gpu(E) virtio_dma_buf(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) cec(E) drm(E) qemu_fw_cfg(E) sg(E) nbd(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E)
scsi_dh_alua(E) virtio_rng(E)
CR2: ffffc900e357d614
---[ end trace 0667a278da47193a ]---
RIP: 0010:rxe_completer+0x96d/0x1050 [rdma_rxe]
Code: e0 49 8b 44 24 08 44 89 e9 41 d3 e6 4e 8d a4 30 80 01 00 00 4d 85 e4 0f 84 f9 00 00 00 49 8d bc 24 94 00 00 00 e8 73 a8 b1 e0 <41> 8b 84 24 94 00 00 00 85 c0 0f 84 df 00 00 00 83 f8 03 0f 84 bf
RSP: 0018:ffff8881014075f8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88813c67c000 RCX: dffffc0000000000
RDX: 0000000000000007 RSI: ffffffff826920c0 RDI: ffffc900e357d614
RBP: ffff8881014076e8 R08: ffffffffa09b228d R09: ffff88813c67c57b
R10: ffffed10278cf8af R11: 0000000000000000 R12: ffffc900e357d580
R13: 000000000000000a R14: 00000000d9c99400 R15: ffff8881515ddd08
FS:  0000000000000000(0000) GS:ffff88842d100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffc900e357d614 CR3: 0000000002e29005 CR4: 0000000000770ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Kernel panic - not syncing: Fatal exception in interrupt Kernel Offset: disabled Rebooting in 90 seconds..


* Re: [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes.
  2021-09-10 19:38   ` Pearson, Robert B
@ 2021-09-10 20:23     ` Bart Van Assche
  2021-09-10 21:16       ` Bob Pearson
  2021-09-10 21:47       ` Bob Pearson
  0 siblings, 2 replies; 22+ messages in thread
From: Bart Van Assche @ 2021-09-10 20:23 UTC (permalink / raw)
  To: Pearson, Robert B, Bob Pearson, jgg, zyjzyj2000, linux-rdma, mie

On 9/10/21 12:38 PM, Pearson, Robert B wrote:
> 1. Which rdma-core are you running? Out of box or the github tree?

I'm using the rdma-core package included in openSUSE Tumbleweed. blktests
pass with that rdma-core package against older kernel versions so I think
the rdma-core package is fine. The version number of the rdma-core package
I'm using is as follows:
$ rpm -q rdma-core
rdma-core-36.0-1.1.x86_64

The rdma tool comes from the iproute2 package:
$ rpm -qf /sbin/rdma
iproute2-5.13-1.1.x86_64

> 3. Where did you get the kernel bits? Which git tree? Which branch?

Hmm ... wasn't that mentioned in my previous email? I mentioned a commit
SHA and these SHA numbers are unique and unambiguous. Anyway: commit
2169b908894d comes from the for-rc branch of the following git repository:
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git.

Bart.




* Re: [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes.
  2021-09-10 20:23     ` Bart Van Assche
@ 2021-09-10 21:16       ` Bob Pearson
  2021-09-10 21:47       ` Bob Pearson
  1 sibling, 0 replies; 22+ messages in thread
From: Bob Pearson @ 2021-09-10 21:16 UTC (permalink / raw)
  To: Bart Van Assche, Pearson, Robert B, jgg, zyjzyj2000, linux-rdma, mie

On 9/10/21 3:23 PM, Bart Van Assche wrote:
> On 9/10/21 12:38 PM, Pearson, Robert B wrote:
>> 1. Which rdma-core are you running? Out of box or the github tree?
> 
> I'm using the rdma-core package included in openSUSE Tumbleweed. blktests
> pass with that rdma-core package against older kernel versions so I think
> the rdma-core package is fine. The version number of the rdma-core package
> I'm using is as follows:
> $ rpm -q rdma-core
> rdma-core-36.0-1.1.x86_64
> 
> The rdma tool comes from the iproute2 package:
> $ rpm -qf /sbin/rdma
> iproute2-5.13-1.1.x86_64
> 
>> 3. Where did you get the kernel bits? Which git tree? Which branch?
> 
> Hmm ... wasn't that mentioned in my previous email? I mentioned a commit
> SHA and these SHA numbers are unique and unambiguous. Anyway: commit
> 2169b908894d comes from the for-rc branch of the following git repository:
> git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git.
> 
> Bart.
> 
> 

You'd be surprised how much I don't know. I do know the numbers are unique but I
haven't the faintest idea how to decode them into useful strings.

In theory you are correct: rdma-core and kernels are supposed to be forwards and
backwards compatible. But that is a goal, and regressions do sometimes occur. I can
try running with that version just to make sure.

There is a problem I have seen where some newer distros do not derive the default IPv6
link-local address from the MAC address. They randomize it instead (Ubuntu does this),
and rxe is broken as a result. I end up having to add a line like

sudo ip addr add dev enp6s0 fe80::b62e:99ff:fef9:fa2e/64
  (where the MAC address is b4:2e:99:f9:fa:2e) just before the line
sudo rdma link add rxe_1 type rxe netdev enp6s0
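The address above is just the standard EUI-64 construction: flip the universal/local
bit of the first MAC octet and insert ff:fe in the middle. A minimal sketch of that
mapping (not part of any tool, just to show where fe80::b62e:99ff:fef9:fa2e comes
from; it skips IPv6 zero-group compression beyond the fe80:: prefix):

```c
#include <stdio.h>

/* Derive the EUI-64 IPv6 link-local address from a 6-byte MAC.
 * Simplified sketch: assumes the four low groups are nonzero. */
static void mac_to_link_local(const unsigned char mac[6], char out[40])
{
	snprintf(out, 40, "fe80::%x:%x:%x:%x",
		 ((mac[0] ^ 0x02) << 8) | mac[1],  /* flip universal/local bit */
		 (mac[2] << 8) | 0xff,             /* insert ff...            */
		 0xfe00 | mac[3],                  /* ...fe in the middle     */
		 (mac[4] << 8) | mac[5]);
}
```

With the MAC b4:2e:99:f9:fa:2e this produces fe80::b62e:99ff:fef9:fa2e, matching the
`ip addr add` line above.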

But when this is an issue, rxe is really broken and almost nothing works, so that may
not be what you are seeing.

I will try to recreate your setup and retest.

Thanks,

Bob


* Re: [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes.
  2021-09-10 20:23     ` Bart Van Assche
  2021-09-10 21:16       ` Bob Pearson
@ 2021-09-10 21:47       ` Bob Pearson
  2021-09-10 21:50         ` Bob Pearson
  2021-09-10 22:07         ` Bart Van Assche
  1 sibling, 2 replies; 22+ messages in thread
From: Bob Pearson @ 2021-09-10 21:47 UTC (permalink / raw)
  To: Bart Van Assche, Pearson, Robert B, jgg, zyjzyj2000, linux-rdma, mie

On 9/10/21 3:23 PM, Bart Van Assche wrote:
> On 9/10/21 12:38 PM, Pearson, Robert B wrote:
>> 1. Which rdma-core are you running? Out of box or the github tree?
> 
> I'm using the rdma-core package included in openSUSE Tumbleweed. blktests
> pass with that rdma-core package against older kernel versions so I think
> the rdma-core package is fine. The version number of the rdma-core package
> I'm using is as follows:
> $ rpm -q rdma-core
> rdma-core-36.0-1.1.x86_64
> 
> The rdma tool comes from the iproute2 package:
> $ rpm -qf /sbin/rdma
> iproute2-5.13-1.1.x86_64
> 
>> 3. Where did you get the kernel bits? Which git tree? Which branch?
> 
> Hmm ... wasn't that mentioned in my previous email? I mentioned a commit
> SHA and these SHA numbers are unique and unambiguous. Anyway: commit
> 2169b908894d comes from the for-rc branch of the following git repository:
> git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git.
> 
> Bart.
> 
> 

OK I checked out the kernel with the SHA number above and applied the patch series
and rebuilt and reinstalled the kernel. I checked out v36.0 of rdma-core and rebuilt
that. rdma is version 5.9.0 but I doubt that will have any effect. My startup script
is

    export LD_LIBRARY_PATH=/home/bob/src/rdma-core/build/lib/:/usr/local/lib:/usr/lib

    sudo ip link set dev enp0s3 mtu 8500
    sudo ip addr add dev enp0s3 fe80::0a00:27ff:fe94:8a69/64
    sudo rdma link add rxe0 type rxe netdev enp0s3


I am running on a Virtualbox VM instance of Ubuntu 21.04 with 20 cores and 8GB of RAM.

The test looks like

    sudo ./check -q srp/001

    srp/001 (Create and remove LUNs)                             [passed]

        runtime  1.174s  ...  1.236s

There were no issues. 

Any guesses what else to look at?

Thanks,

Bob


* Re: [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes.
  2021-09-10 21:47       ` Bob Pearson
@ 2021-09-10 21:50         ` Bob Pearson
  2021-09-10 22:07         ` Bart Van Assche
  1 sibling, 0 replies; 22+ messages in thread
From: Bob Pearson @ 2021-09-10 21:50 UTC (permalink / raw)
  To: Bart Van Assche, Pearson, Robert B, jgg, zyjzyj2000, linux-rdma, mie

On 9/10/21 4:47 PM, Bob Pearson wrote:
> On 9/10/21 3:23 PM, Bart Van Assche wrote:
>> On 9/10/21 12:38 PM, Pearson, Robert B wrote:
>>> 1. Which rdma-core are you running? Out of box or the github tree?
>>
>> I'm using the rdma-core package included in openSUSE Tumbleweed. blktests
>> pass with that rdma-core package against older kernel versions so I think
>> the rdma-core package is fine. The version number of the rdma-core package
>> I'm using is as follows:
>> $ rpm -q rdma-core
>> rdma-core-36.0-1.1.x86_64
>>
>> The rdma tool comes from the iproute2 package:
>> $ rpm -qf /sbin/rdma
>> iproute2-5.13-1.1.x86_64
>>
>>> 3. Where did you get the kernel bits? Which git tree? Which branch?
>>
>> Hmm ... wasn't that mentioned in my previous email? I mentioned a commit
>> SHA and these SHA numbers are unique and unambiguous. Anyway: commit
>> 2169b908894d comes from the for-rc branch of the following git repository:
>> git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git.
>>
>> Bart.
>>
>>
> 
> OK I checked out the kernel with the SHA number above and applied the patch series
> and rebuilt and reinstalled the kernel. I checked out v36.0 of rdma-core and rebuilt
> that. rdma is version 5.9.0 but I doubt that will have any effect. My startup script
> is
> 
>     export LD_LIBRARY_PATH=/home/bob/src/rdma-core/build/lib/:/usr/local/lib:/usr/lib
> 
> 
> 
>     sudo ip link set dev enp0s3 mtu 8500
> 
>     sudo ip addr add dev enp0s3 fe80::0a00:27ff:fe94:8a69/64
> 
>     sudo rdma link add rxe0 type rxe netdev enp0s3
> 
> 
> I am running on a Virtualbox VM instance of Ubuntu 21.04 with 20 cores and 8GB of RAM.
> 
> The test looks like
> 
>     sudo ./check -q srp/001
> 
>     srp/001 (Create and remove LUNs)                             [passed]
> 
>         runtime  1.174s  ...  1.236s
> 
> There were no issues. 
> 
> Any guesses what else to look at?
> 
> Thanks,
> 
> Bob
> 

The 8500 MTU is not required; it runs just as well with a 4K MTU.


* Re: [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes.
  2021-09-10 21:47       ` Bob Pearson
  2021-09-10 21:50         ` Bob Pearson
@ 2021-09-10 22:07         ` Bart Van Assche
  2021-09-12 14:41           ` Bob Pearson
  2021-09-12 14:42           ` Bob Pearson
  1 sibling, 2 replies; 22+ messages in thread
From: Bart Van Assche @ 2021-09-10 22:07 UTC (permalink / raw)
  To: Bob Pearson, Pearson, Robert B, jgg, zyjzyj2000, linux-rdma, mie

On 9/10/21 2:47 PM, Bob Pearson wrote:
> OK I checked out the kernel with the SHA number above and applied the patch series
> and rebuilt and reinstalled the kernel. I checked out v36.0 of rdma-core and rebuilt
> that. rdma is version 5.9.0 but I doubt that will have any effect. My startup script
> is
> 
>      export LD_LIBRARY_PATH=/home/bob/src/rdma-core/build/lib/:/usr/local/lib:/usr/lib
> 
> 
> 
>      sudo ip link set dev enp0s3 mtu 8500
> 
>      sudo ip addr add dev enp0s3 fe80::0a00:27ff:fe94:8a69/64
> 
>      sudo rdma link add rxe0 type rxe netdev enp0s3
> 
> 
> I am running on a Virtualbox VM instance of Ubuntu 21.04 with 20 cores and 8GB of RAM.
> 
> The test looks like
> 
>      sudo ./check -q srp/001
> 
>      srp/001 (Create and remove LUNs)                             [passed]
> 
>          runtime  1.174s  ...  1.236s
> 
> There were no issues.
> 
> Any guesses what else to look at?

The test I ran is different. I did not run any of the ip link / ip addr /
rdma link commands since the blktests scripts already run the rdma link
command. The bug I reported in my previous email is reproducible and
triggers a VM halt.

Are we using the same kernel config? I attached my kernel config to my
previous email. The source code location of the crash address is as
follows:

(gdb) list *(rxe_completer+0x96d)
0x228d is in rxe_completer (drivers/infiniband/sw/rxe/rxe_comp.c:149).
144              */
145             wqe = queue_head(qp->sq.queue, QUEUE_TYPE_FROM_CLIENT);
146             *wqe_p = wqe;
147
148             /* no WQE or requester has not started it yet */
149             if (!wqe || wqe->state == wqe_state_posted)
150                     return pkt ? COMPST_DONE : COMPST_EXIT;
151
152             /* WQE does not require an ack */
153             if (wqe->state == wqe_state_done)

The disassembly output is as follows:

drivers/infiniband/sw/rxe/rxe_comp.c:
149             if (!wqe || wqe->state == wqe_state_posted)
    0x0000000000002277 <+2391>:  test   %r12,%r12
    0x000000000000227a <+2394>:  je     0x2379 <rxe_completer+2649>
    0x0000000000002280 <+2400>:  lea    0x94(%r12),%rdi
    0x0000000000002288 <+2408>:  call   0x228d <rxe_completer+2413>
    0x000000000000228d <+2413>:  mov    0x94(%r12),%eax
    0x0000000000002295 <+2421>:  test   %eax,%eax
    0x0000000000002297 <+2423>:  je     0x237c <rxe_completer+2652>

So the instruction that triggers the crash is "mov 0x94(%r12),%eax".
Does consumer_addr() perhaps return an invalid address under certain
circumstances?

Thanks,

Bart.


* Re: [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes.
  2021-09-10 22:07         ` Bart Van Assche
@ 2021-09-12 14:41           ` Bob Pearson
  2021-09-14  3:26             ` Bart Van Assche
  2021-09-12 14:42           ` Bob Pearson
  1 sibling, 1 reply; 22+ messages in thread
From: Bob Pearson @ 2021-09-12 14:41 UTC (permalink / raw)
  To: Bart Van Assche, Pearson, Robert B, jgg, zyjzyj2000, linux-rdma,
	mie, Xiao Yang

On 9/10/21 5:07 PM, Bart Van Assche wrote:
> On 9/10/21 2:47 PM, Bob Pearson wrote:
>> OK I checked out the kernel with the SHA number above and applied the patch series
>> and rebuilt and reinstalled the kernel. I checked out v36.0 of rdma-core and rebuilt
>> that. rdma is version 5.9.0 but I doubt that will have any effect. My startup script
>> is
>>
>>      export LD_LIBRARY_PATH=/home/bob/src/rdma-core/build/lib/:/usr/local/lib:/usr/lib
>>
>>
>>
>>      sudo ip link set dev enp0s3 mtu 8500
>>
>>      sudo ip addr add dev enp0s3 fe80::0a00:27ff:fe94:8a69/64
>>
>>      sudo rdma link add rxe0 type rxe netdev enp0s3
>>
>>
>> I am running on a Virtualbox VM instance of Ubuntu 21.04 with 20 cores and 8GB of RAM.
>>
>> The test looks like
>>
>>      sudo ./check -q srp/001
>>
>>      srp/001 (Create and remove LUNs)                             [passed]
>>
>>          runtime  1.174s  ...  1.236s
>>
>> There were no issues.
>>
>> Any guesses what else to look at?
> 
> The test I ran is different. I did not run any of the ip link / ip addr /
> rdma link commands since the blktests scripts already run the rdma link
> command. The bug I reported in my previous email is reproducible and
> triggers a VM halt.
> 
> Are we using the same kernel config? I attached my kernel config to my
> previous email. The source code location of the crash address is as
> follows:
> 
> (gdb) list *(rxe_completer+0x96d)
> 0x228d is in rxe_completer (drivers/infiniband/sw/rxe/rxe_comp.c:149).
> 144              */
> 145             wqe = queue_head(qp->sq.queue, QUEUE_TYPE_FROM_CLIENT);
> 146             *wqe_p = wqe;
> 147
> 148             /* no WQE or requester has not started it yet */
> 149             if (!wqe || wqe->state == wqe_state_posted)
> 150                     return pkt ? COMPST_DONE : COMPST_EXIT;
> 151
> 152             /* WQE does not require an ack */
> 153             if (wqe->state == wqe_state_done)
> 
> The disassembly output is as follows:
> 
> drivers/infiniband/sw/rxe/rxe_comp.c:
> 149             if (!wqe || wqe->state == wqe_state_posted)
>    0x0000000000002277 <+2391>:  test   %r12,%r12
>    0x000000000000227a <+2394>:  je     0x2379 <rxe_completer+2649>
>    0x0000000000002280 <+2400>:  lea    0x94(%r12),%rdi
>    0x0000000000002288 <+2408>:  call   0x228d <rxe_completer+2413>
>    0x000000000000228d <+2413>:  mov    0x94(%r12),%eax
>    0x0000000000002295 <+2421>:  test   %eax,%eax
>    0x0000000000002297 <+2423>:  je     0x237c <rxe_completer+2652>
> 
> So the instruction that triggers the crash is "mov 0x94(%r12),%eax".
> Does consumer_addr() perhaps return an invalid address under certain
> circumstances?
> 
> Thanks,
> 
> Bart.

The most likely cause of this was fixed by a patch Xiao Yang submitted on 8/20/2021. It is copied here:

From: Xiao Yang <yangx.jy@fujitsu.com>
To: <linux-rdma@vger.kernel.org>
Cc: <aglo@umich.edu>, <rpearsonhpe@gmail.com>, <zyjzyj2000@gmail.com>,
	<jgg@nvidia.com>, <leon@kernel.org>,
	Xiao Yang <yangx.jy@fujitsu.com>
Subject: [PATCH] RDMA/rxe: Zero out index member of struct rxe_queue
Date: Fri, 20 Aug 2021 19:15:09 +0800	[thread overview]
Message-ID: <20210820111509.172500-1-yangx.jy@fujitsu.com> (raw)

1) New index member of struct rxe_queue is introduced but not zeroed
   so the initial value of index may be random.
2) Current index is not masked off to index_mask.
In such case, producer_addr() and consumer_addr() will get an invalid
address by the random index and then accessing the invalid address
triggers the following panic:
"BUG: unable to handle page fault for address: ffff9ae2c07a1414"

Fix the issue by using kzalloc() to zero out index member.

Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
---
 drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_queue.c b/drivers/infiniband/sw/rxe/rxe_queue.c
index 85b812586ed4..72d95398e604 100644
--- a/drivers/infiniband/sw/rxe/rxe_queue.c
+++ b/drivers/infiniband/sw/rxe/rxe_queue.c
@@ -63,7 +63,7 @@ struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe, int *num_elem,
 	if (*num_elem < 0)
 		goto err1;
 
-	q = kmalloc(sizeof(*q), GFP_KERNEL);
+	q = kzalloc(sizeof(*q), GFP_KERNEL);
 	if (!q)
 		goto err1;
 
-- 
2.25.1

If kmalloc() returns a dirty block of memory you can get a random value in the queue
index, which can easily cause a page fault. Once the rxe driver writes a new value it
is masked before being stored, so from then on the index always stays inside the
allocated buffer. I am not seeing this error, perhaps because I am running in a VM;
I just don't know. This patch should be added to the other fixes.
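A toy model of the failure mode (illustrative only, not the kernel code; all names here are mine):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The private queue index is used unmasked when computing an element
 * address, so a never-initialized index -- garbage left in kmalloc()ed
 * memory -- yields an address far outside the ring.  kzalloc() makes
 * the initial index 0, and every later store masks with index_mask. */
struct toy_queue {
	uint32_t index;		/* private driver index */
	uint32_t index_mask;	/* num_elem - 1, a power of two */
	char data[8][16];	/* 8 elements of 16 bytes each */
};

/* byte offset the addr helpers would dereference */
static uint64_t toy_offset(const struct toy_queue *q)
{
	return (uint64_t)q->index * sizeof(q->data[0]);
}

/* the driver masks before storing a new index */
static void toy_advance(struct toy_queue *q)
{
	q->index = (q->index + 1) & q->index_mask;
}
```

With a dirty initial index the computed offset lands far past the 128-byte ring (the page fault); starting from zeroed memory, masking keeps it in bounds forever.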

Bob


* Re: [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes.
  2021-09-10 22:07         ` Bart Van Assche
  2021-09-12 14:41           ` Bob Pearson
@ 2021-09-12 14:42           ` Bob Pearson
  1 sibling, 0 replies; 22+ messages in thread
From: Bob Pearson @ 2021-09-12 14:42 UTC (permalink / raw)
  To: Bart Van Assche, Pearson, Robert B, jgg, zyjzyj2000, linux-rdma, mie

On 9/10/21 5:07 PM, Bart Van Assche wrote:
> On 9/10/21 2:47 PM, Bob Pearson wrote:
>> OK I checked out the kernel with the SHA number above and applied the patch series
>> and rebuilt and reinstalled the kernel. I checked out v36.0 of rdma-core and rebuilt
>> that. rdma is version 5.9.0 but I doubt that will have any effect. My startup script
>> is
>>
>>      export LD_LIBRARY_PATH=/home/bob/src/rdma-core/build/lib/:/usr/local/lib:/usr/lib
>>
>>
>>
>>      sudo ip link set dev enp0s3 mtu 8500
>>
>>      sudo ip addr add dev enp0s3 fe80::0a00:27ff:fe94:8a69/64
>>
>>      sudo rdma link add rxe0 type rxe netdev enp0s3
>>
>>
>> I am running on a Virtualbox VM instance of Ubuntu 21.04 with 20 cores and 8GB of RAM.
>>
>> The test looks like
>>
>>      sudo ./check -q srp/001
>>
>>      srp/001 (Create and remove LUNs)                             [passed]
>>
>>          runtime  1.174s  ...  1.236s
>>
>> There were no issues.
>>
>> Any guesses what else to look at?
> 
> The test I ran is different. I did not run any of the ip link / ip addr /
> rdma link commands since the blktests scripts already run the rdma link
> command. The bug I reported in my previous email is reproducible and
> triggers a VM halt.
> 
> Are we using the same kernel config? I attached my kernel config to my
> previous email. The source code location of the crash address is as
> follows:
> 
> (gdb) list *(rxe_completer+0x96d)
> 0x228d is in rxe_completer (drivers/infiniband/sw/rxe/rxe_comp.c:149).
> 144              */
> 145             wqe = queue_head(qp->sq.queue, QUEUE_TYPE_FROM_CLIENT);
> 146             *wqe_p = wqe;
> 147
> 148             /* no WQE or requester has not started it yet */
> 149             if (!wqe || wqe->state == wqe_state_posted)
> 150                     return pkt ? COMPST_DONE : COMPST_EXIT;
> 151
> 152             /* WQE does not require an ack */
> 153             if (wqe->state == wqe_state_done)
> 
> The disassembly output is as follows:
> 
> drivers/infiniband/sw/rxe/rxe_comp.c:
> 149             if (!wqe || wqe->state == wqe_state_posted)
>    0x0000000000002277 <+2391>:  test   %r12,%r12
>    0x000000000000227a <+2394>:  je     0x2379 <rxe_completer+2649>
>    0x0000000000002280 <+2400>:  lea    0x94(%r12),%rdi
>    0x0000000000002288 <+2408>:  call   0x228d <rxe_completer+2413>
>    0x000000000000228d <+2413>:  mov    0x94(%r12),%eax
>    0x0000000000002295 <+2421>:  test   %eax,%eax
>    0x0000000000002297 <+2423>:  je     0x237c <rxe_completer+2652>
> 
> So the instruction that triggers the crash is "mov 0x94(%r12),%eax".
> Does consumer_addr() perhaps return an invalid address under certain
> circumstances?
> 
> Thanks,
> 
> Bart.

By the way I did rebuild the kernel with your config file. No change. - Bob


* Re: [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes.
  2021-09-12 14:41           ` Bob Pearson
@ 2021-09-14  3:26             ` Bart Van Assche
  2021-09-14  4:18               ` Bob Pearson
  0 siblings, 1 reply; 22+ messages in thread
From: Bart Van Assche @ 2021-09-14  3:26 UTC (permalink / raw)
  To: Bob Pearson, Pearson, Robert B, jgg, zyjzyj2000, linux-rdma, mie,
	Xiao Yang

On 9/12/21 07:41, Bob Pearson wrote:
> Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
> Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
> ---
>   drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/sw/rxe/rxe_queue.c b/drivers/infiniband/sw/rxe/rxe_queue.c
> index 85b812586ed4..72d95398e604 100644
> --- a/drivers/infiniband/sw/rxe/rxe_queue.c
> +++ b/drivers/infiniband/sw/rxe/rxe_queue.c
> @@ -63,7 +63,7 @@ struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe, int *num_elem,
>   	if (*num_elem < 0)
>   		goto err1;
>   
> -	q = kmalloc(sizeof(*q), GFP_KERNEL);
> +	q = kzalloc(sizeof(*q), GFP_KERNEL);
>   	if (!q)
>   		goto err1;

Hi Bob,

If I rebase this patch series on top of kernel v5.15-rc1 then the srp 
tests from the blktests suite pass. Kernel v5.15-rc1 includes the above 
patch. Feel free to add the following to this patch series:

Tested-by: Bart Van Assche <bvanassche@acm.org>

Thanks,

Bart.


* Re: [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes.
  2021-09-14  3:26             ` Bart Van Assche
@ 2021-09-14  4:18               ` Bob Pearson
  0 siblings, 0 replies; 22+ messages in thread
From: Bob Pearson @ 2021-09-14  4:18 UTC (permalink / raw)
  To: Bart Van Assche, Pearson, Robert B, jgg, zyjzyj2000, linux-rdma,
	mie, Xiao Yang, Rao Shoaib

On 9/13/21 10:26 PM, Bart Van Assche wrote:
> On 9/12/21 07:41, Bob Pearson wrote:
>> Fixes: 5bcf5a59c41e ("RDMA/rxe: Protext kernel index from user space")
>> Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
>> ---
>>   drivers/infiniband/sw/rxe/rxe_queue.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_queue.c b/drivers/infiniband/sw/rxe/rxe_queue.c
>> index 85b812586ed4..72d95398e604 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_queue.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_queue.c
>> @@ -63,7 +63,7 @@ struct rxe_queue *rxe_queue_init(struct rxe_dev *rxe, int *num_elem,
>>       if (*num_elem < 0)
>>           goto err1;
>>   -    q = kmalloc(sizeof(*q), GFP_KERNEL);
>> +    q = kzalloc(sizeof(*q), GFP_KERNEL);
>>       if (!q)
>>           goto err1;
> 
> Hi Bob,
> 
> If I rebase this patch series on top of kernel v5.15-rc1 then the srp tests from the blktests suite pass. Kernel v5.15-rc1 includes the above patch. Feel free to add the following to this patch series:
> 
> Tested-by: Bart Van Assche <bvanassche@acm.org>
> 
> Thanks,
> 
> Bart.

Sadly, I have been trying to resolve the note from Shoaib Rao, who was trying to make
rping work. His solution was not correct, but it led to a can of worms. The kernel verbs
consumers were all using the same APIs from rxe_queue.h to manipulate the client ends of
the queues, but that was totally incorrect: those APIs are written from the POV of the
driver and use the private index, which is not supposed to be visible to users of the
queues. A whole day later I think I have that one about fixed, so I will be resubmitting
the series in the morning. It's all just memory barriers, so it may not affect you.
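The acquire/release pairing the series relies on can be sketched in userspace with C11
atomics standing in for the kernel's smp_load_acquire()/smp_store_release(); this is an
illustrative model, not the rxe code, and the names are mine:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* The producer fills the element before releasing producer_index; the
 * consumer's acquire load of producer_index pairs with that release,
 * so the element contents are visible before the index says so. */
struct ring {
	_Atomic uint32_t producer_index;
	_Atomic uint32_t consumer_index;
	uint32_t index_mask;	/* num_elem - 1, a power of two */
	int data[8];
};

static void ring_post(struct ring *r, int val)
{
	uint32_t prod = atomic_load_explicit(&r->producer_index,
					     memory_order_relaxed);

	r->data[prod & r->index_mask] = val;	/* fill element first */
	/* ...then publish the new index with release semantics */
	atomic_store_explicit(&r->producer_index,
			      (prod + 1) & r->index_mask,
			      memory_order_release);
}

static int ring_poll(struct ring *r, int *val)
{
	uint32_t cons = atomic_load_explicit(&r->consumer_index,
					     memory_order_relaxed);
	/* acquire pairs with the producer's release store */
	uint32_t prod = atomic_load_explicit(&r->producer_index,
					     memory_order_acquire);

	if (((prod - cons) & r->index_mask) == 0)
		return 0;			/* queue empty */
	*val = r->data[cons & r->index_mask];
	atomic_store_explicit(&r->consumer_index,
			      (cons + 1) & r->index_mask,
			      memory_order_release);
	return 1;
}
```

The same pattern applies whether the other end of the queue is user space or another kernel thread, which is what this patch extends.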

Bob


* Re: [PATCH for-rc v3 1/6] RDMA/rxe: Add memory barriers to kernel queues
  2021-09-09 20:44 ` [PATCH for-rc v3 1/6] RDMA/rxe: Add memory barriers to kernel queues Bob Pearson
  2021-09-10  1:19   ` Zhu Yanjun
@ 2021-09-14  6:04   ` yangx.jy
  2021-09-14 15:47     ` Bob Pearson
  1 sibling, 1 reply; 22+ messages in thread
From: yangx.jy @ 2021-09-14  6:04 UTC (permalink / raw)
  To: Bob Pearson, jgg, zyjzyj2000, linux-rdma, mie, bvanassche

Hi Bob,

Why do you want to use the FROM_CLIENT and TO_CLIENT suffixes?
It seems more readable to use the FROM_USER and TO_USER suffixes (i.e. between user space and kernel space).

Best Regards,
Xiao Yang

-----Original Message-----
From: Bob Pearson <rpearsonhpe@gmail.com>
Sent: September 10, 2021 4:45
To: jgg@nvidia.com; zyjzyj2000@gmail.com; linux-rdma@vger.kernel.org; mie@igel.co.jp; bvanassche@acm.org
Cc: Bob Pearson <rpearsonhpe@gmail.com>
Subject: [PATCH for-rc v3 1/6] RDMA/rxe: Add memory barriers to kernel queues

Earlier patches added memory barriers to protect user space to kernel space communications. The user space queues were previously shown to have occasional memory synchronization errors, which were removed by adding smp_load_acquire()/smp_store_release() barriers.

This patch extends that to the case where queues are used between kernel space threads.

Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_comp.c  | 10 +---
 drivers/infiniband/sw/rxe/rxe_cq.c    | 25 ++-------
 drivers/infiniband/sw/rxe/rxe_qp.c    | 10 ++--
 drivers/infiniband/sw/rxe/rxe_queue.h | 73 ++++++++-------------------
 drivers/infiniband/sw/rxe/rxe_req.c   | 21 ++------
 drivers/infiniband/sw/rxe/rxe_resp.c  | 38 ++++----------
 drivers/infiniband/sw/rxe/rxe_srq.c   |  2 +-
 drivers/infiniband/sw/rxe/rxe_verbs.c | 53 ++++---------------
 8 files changed, 55 insertions(+), 177 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
index d2d802c776fd..ed4e3f29bd65 100644
--- a/drivers/infiniband/sw/rxe/rxe_comp.c
+++ b/drivers/infiniband/sw/rxe/rxe_comp.c
@@ -142,10 +142,7 @@ static inline enum comp_state get_wqe(struct rxe_qp *qp,
 	/* we come here whether or not we found a response packet to see if
 	 * there are any posted WQEs
 	 */
-	if (qp->is_user)
-		wqe = queue_head(qp->sq.queue, QUEUE_TYPE_FROM_USER);
-	else
-		wqe = queue_head(qp->sq.queue, QUEUE_TYPE_KERNEL);
+	wqe = queue_head(qp->sq.queue, QUEUE_TYPE_FROM_CLIENT);
 	*wqe_p = wqe;
 
 	/* no WQE or requester has not started it yet */ @@ -432,10 +429,7 @@ static void do_complete(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
 	if (post)
 	if (post)
 		make_send_cqe(qp, wqe, &cqe);
 
-	if (qp->is_user)
-		advance_consumer(qp->sq.queue, QUEUE_TYPE_FROM_USER);
-	else
-		advance_consumer(qp->sq.queue, QUEUE_TYPE_KERNEL);
+	advance_consumer(qp->sq.queue, QUEUE_TYPE_FROM_CLIENT);
 
 	if (post)
 		rxe_cq_post(qp->scq, &cqe, 0);
diff --git a/drivers/infiniband/sw/rxe/rxe_cq.c b/drivers/infiniband/sw/rxe/rxe_cq.c
index aef288f164fd..4e26c2ea4a59 100644
--- a/drivers/infiniband/sw/rxe/rxe_cq.c
+++ b/drivers/infiniband/sw/rxe/rxe_cq.c
@@ -25,11 +25,7 @@ int rxe_cq_chk_attr(struct rxe_dev *rxe, struct rxe_cq *cq,
 	}
 
 	if (cq) {
-		if (cq->is_user)
-			count = queue_count(cq->queue, QUEUE_TYPE_TO_USER);
-		else
-			count = queue_count(cq->queue, QUEUE_TYPE_KERNEL);
-
+		count = queue_count(cq->queue, QUEUE_TYPE_TO_CLIENT);
 		if (cqe < count) {
 			pr_warn("cqe(%d) < current # elements in queue (%d)",
 				cqe, count);
@@ -65,7 +61,7 @@ int rxe_cq_from_init(struct rxe_dev *rxe, struct rxe_cq *cq, int cqe,
 	int err;
 	enum queue_type type;
 
-	type = uresp ? QUEUE_TYPE_TO_USER : QUEUE_TYPE_KERNEL;
+	type = QUEUE_TYPE_TO_CLIENT;
 	cq->queue = rxe_queue_init(rxe, &cqe,
 			sizeof(struct rxe_cqe), type);
 	if (!cq->queue) {
@@ -117,11 +113,7 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
 
 	spin_lock_irqsave(&cq->cq_lock, flags);
 
-	if (cq->is_user)
-		full = queue_full(cq->queue, QUEUE_TYPE_TO_USER);
-	else
-		full = queue_full(cq->queue, QUEUE_TYPE_KERNEL);
-
+	full = queue_full(cq->queue, QUEUE_TYPE_TO_CLIENT);
 	if (unlikely(full)) {
 		spin_unlock_irqrestore(&cq->cq_lock, flags);
 		if (cq->ibcq.event_handler) {
@@ -134,17 +126,10 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
 		return -EBUSY;
 	}
 
-	if (cq->is_user)
-		addr = producer_addr(cq->queue, QUEUE_TYPE_TO_USER);
-	else
-		addr = producer_addr(cq->queue, QUEUE_TYPE_KERNEL);
-
+	addr = producer_addr(cq->queue, QUEUE_TYPE_TO_CLIENT);
 	memcpy(addr, cqe, sizeof(*cqe));
 
-	if (cq->is_user)
-		advance_producer(cq->queue, QUEUE_TYPE_TO_USER);
-	else
-		advance_producer(cq->queue, QUEUE_TYPE_KERNEL);
+	advance_producer(cq->queue, QUEUE_TYPE_TO_CLIENT);
 
 	spin_unlock_irqrestore(&cq->cq_lock, flags);
 
diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
index 1ab6af7ddb25..2e923af642f8 100644
--- a/drivers/infiniband/sw/rxe/rxe_qp.c
+++ b/drivers/infiniband/sw/rxe/rxe_qp.c
@@ -231,7 +231,7 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
 	qp->sq.max_inline = init->cap.max_inline_data = wqe_size;
 	wqe_size += sizeof(struct rxe_send_wqe);
 
-	type = uresp ? QUEUE_TYPE_FROM_USER : QUEUE_TYPE_KERNEL;
+	type = QUEUE_TYPE_FROM_CLIENT;
 	qp->sq.queue = rxe_queue_init(rxe, &qp->sq.max_wr,
 				wqe_size, type);
 	if (!qp->sq.queue)
@@ -248,12 +248,8 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
 		return err;
 	}
 
-	if (qp->is_user)
-		qp->req.wqe_index = producer_index(qp->sq.queue,
-						QUEUE_TYPE_FROM_USER);
-	else
-		qp->req.wqe_index = producer_index(qp->sq.queue,
-						QUEUE_TYPE_KERNEL);
+	qp->req.wqe_index = producer_index(qp->sq.queue,
+					QUEUE_TYPE_FROM_CLIENT);
 
 	qp->req.state		= QP_STATE_RESET;
 	qp->req.opcode		= -1;
@@ -293,7 +289,7 @@ static int rxe_qp_init_resp(struct rxe_dev *rxe, struct rxe_qp *qp,
 		pr_debug("qp#%d max_wr = %d, max_sge = %d, wqe_size = %d\n",
 			 qp_num(qp), qp->rq.max_wr, qp->rq.max_sge, wqe_size);
 
-		type = uresp ? QUEUE_TYPE_FROM_USER : QUEUE_TYPE_KERNEL;
+		type = QUEUE_TYPE_FROM_CLIENT;
 		qp->rq.queue = rxe_queue_init(rxe, &qp->rq.max_wr,
 					wqe_size, type);
 		if (!qp->rq.queue)
diff --git a/drivers/infiniband/sw/rxe/rxe_queue.h b/drivers/infiniband/sw/rxe/rxe_queue.h
index 2702b0e55fc3..d465aa9342e1 100644
--- a/drivers/infiniband/sw/rxe/rxe_queue.h
+++ b/drivers/infiniband/sw/rxe/rxe_queue.h
@@ -35,9 +35,8 @@
 
 /* type of queue */
 enum queue_type {
-	QUEUE_TYPE_KERNEL,
-	QUEUE_TYPE_TO_USER,
-	QUEUE_TYPE_FROM_USER,
+	QUEUE_TYPE_TO_CLIENT,
+	QUEUE_TYPE_FROM_CLIENT,
 };
 
 struct rxe_queue {
@@ -87,20 +86,16 @@ static inline int queue_empty(struct rxe_queue *q, enum queue_type type)
 	u32 cons;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		/* protect user space index */
 		prod = smp_load_acquire(&q->buf->producer_index);
 		cons = q->index;
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		prod = q->index;
 		/* protect user space index */
 		cons = smp_load_acquire(&q->buf->consumer_index);
 		break;
-	case QUEUE_TYPE_KERNEL:
-		prod = q->buf->producer_index;
-		cons = q->buf->consumer_index;
-		break;
 	}
 
 	return ((prod - cons) & q->index_mask) == 0;
@@ -112,20 +107,16 @@ static inline int queue_full(struct rxe_queue *q, enum queue_type type)
 	u32 cons;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		/* protect user space index */
 		prod = smp_load_acquire(&q->buf->producer_index);
 		cons = q->index;
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		prod = q->index;
 		/* protect user space index */
 		cons = smp_load_acquire(&q->buf->consumer_index);
 		break;
-	case QUEUE_TYPE_KERNEL:
-		prod = q->buf->producer_index;
-		cons = q->buf->consumer_index;
-		break;
 	}
 
 	return ((prod + 1 - cons) & q->index_mask) == 0;
@@ -138,20 +129,16 @@ static inline unsigned int queue_count(const struct rxe_queue *q,
 	u32 cons;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		/* protect user space index */
 		prod = smp_load_acquire(&q->buf->producer_index);
 		cons = q->index;
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		prod = q->index;
 		/* protect user space index */
 		cons = smp_load_acquire(&q->buf->consumer_index);
 		break;
-	case QUEUE_TYPE_KERNEL:
-		prod = q->buf->producer_index;
-		cons = q->buf->consumer_index;
-		break;
 	}
 
 	return (prod - cons) & q->index_mask;
@@ -162,7 +149,7 @@ static inline void advance_producer(struct rxe_queue *q, enum queue_type type)
 	u32 prod;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		pr_warn_once("Normally kernel should not write user space index\n");
 		/* protect user space index */
 		prod = smp_load_acquire(&q->buf->producer_index);
@@ -170,15 +157,11 @@ static inline void advance_producer(struct rxe_queue *q, enum queue_type type)
 		/* same */
 		smp_store_release(&q->buf->producer_index, prod);
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		prod = q->index;
 		q->index = (prod + 1) & q->index_mask;
 		q->buf->producer_index = q->index;
 		break;
-	case QUEUE_TYPE_KERNEL:
-		prod = q->buf->producer_index;
-		q->buf->producer_index = (prod + 1) & q->index_mask;
-		break;
 	}
 }
 
@@ -187,12 +170,12 @@ static inline void advance_consumer(struct rxe_queue *q, enum queue_type type)
 	u32 cons;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		cons = q->index;
 		q->index = (cons + 1) & q->index_mask;
 		q->buf->consumer_index = q->index;
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		pr_warn_once("Normally kernel should not write user space index\n");
 		/* protect user space index */
 		cons = smp_load_acquire(&q->buf->consumer_index);
@@ -200,10 +183,6 @@ static inline void advance_consumer(struct rxe_queue *q, enum queue_type type)
 		/* same */
 		smp_store_release(&q->buf->consumer_index, cons);
 		break;
-	case QUEUE_TYPE_KERNEL:
-		cons = q->buf->consumer_index;
-		q->buf->consumer_index = (cons + 1) & q->index_mask;
-		break;
 	}
 }
 
@@ -212,17 +191,14 @@ static inline void *producer_addr(struct rxe_queue *q, enum queue_type type)
 	u32 prod;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		/* protect user space index */
 		prod = smp_load_acquire(&q->buf->producer_index);
 		prod &= q->index_mask;
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		prod = q->index;
 		break;
-	case QUEUE_TYPE_KERNEL:
-		prod = q->buf->producer_index;
-		break;
 	}
 
 	return q->buf->data + (prod << q->log2_elem_size);
@@ -233,17 +209,14 @@ static inline void *consumer_addr(struct rxe_queue *q, enum queue_type type)
 	u32 cons;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		cons = q->index;
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		/* protect user space index */
 		cons = smp_load_acquire(&q->buf->consumer_index);
 		cons &= q->index_mask;
 		break;
-	case QUEUE_TYPE_KERNEL:
-		cons = q->buf->consumer_index;
-		break;
 	}
 
 	return q->buf->data + (cons << q->log2_elem_size);
@@ -255,17 +228,14 @@ static inline unsigned int producer_index(struct rxe_queue *q,
 	u32 prod;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		/* protect user space index */
 		prod = smp_load_acquire(&q->buf->producer_index);
 		prod &= q->index_mask;
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		prod = q->index;
 		break;
-	case QUEUE_TYPE_KERNEL:
-		prod = q->buf->producer_index;
-		break;
 	}
 
 	return prod;
@@ -277,17 +247,14 @@ static inline unsigned int consumer_index(struct rxe_queue *q,
 	u32 cons;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		cons = q->index;
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		/* protect user space index */
 		cons = smp_load_acquire(&q->buf->consumer_index);
 		cons &= q->index_mask;
 		break;
-	case QUEUE_TYPE_KERNEL:
-		cons = q->buf->consumer_index;
-		break;
 	}
 
 	return cons;
diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
index 3894197a82f6..22c3edb28945 100644
--- a/drivers/infiniband/sw/rxe/rxe_req.c
+++ b/drivers/infiniband/sw/rxe/rxe_req.c
@@ -49,13 +49,8 @@ static void req_retry(struct rxe_qp *qp)
 	unsigned int cons;
 	unsigned int prod;
 
-	if (qp->is_user) {
-		cons = consumer_index(q, QUEUE_TYPE_FROM_USER);
-		prod = producer_index(q, QUEUE_TYPE_FROM_USER);
-	} else {
-		cons = consumer_index(q, QUEUE_TYPE_KERNEL);
-		prod = producer_index(q, QUEUE_TYPE_KERNEL);
-	}
+	cons = consumer_index(q, QUEUE_TYPE_FROM_CLIENT);
+	prod = producer_index(q, QUEUE_TYPE_FROM_CLIENT);
 
 	qp->req.wqe_index	= cons;
 	qp->req.psn		= qp->comp.psn;
@@ -121,15 +116,9 @@ static struct rxe_send_wqe *req_next_wqe(struct rxe_qp *qp)
 	unsigned int cons;
 	unsigned int prod;
 
-	if (qp->is_user) {
-		wqe = queue_head(q, QUEUE_TYPE_FROM_USER);
-		cons = consumer_index(q, QUEUE_TYPE_FROM_USER);
-		prod = producer_index(q, QUEUE_TYPE_FROM_USER);
-	} else {
-		wqe = queue_head(q, QUEUE_TYPE_KERNEL);
-		cons = consumer_index(q, QUEUE_TYPE_KERNEL);
-		prod = producer_index(q, QUEUE_TYPE_KERNEL);
-	}
+	wqe = queue_head(q, QUEUE_TYPE_FROM_CLIENT);
+	cons = consumer_index(q, QUEUE_TYPE_FROM_CLIENT);
+	prod = producer_index(q, QUEUE_TYPE_FROM_CLIENT);
 
 	if (unlikely(qp->req.state == QP_STATE_DRAIN)) {
 		/* check to see if we are drained;
diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index 5501227ddc65..596be002d33d 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -303,10 +303,7 @@ static enum resp_states get_srq_wqe(struct rxe_qp *qp)
 
 	spin_lock_bh(&srq->rq.consumer_lock);
 
-	if (qp->is_user)
-		wqe = queue_head(q, QUEUE_TYPE_FROM_USER);
-	else
-		wqe = queue_head(q, QUEUE_TYPE_KERNEL);
+	wqe = queue_head(q, QUEUE_TYPE_FROM_CLIENT);
 	if (!wqe) {
 		spin_unlock_bh(&srq->rq.consumer_lock);
 		return RESPST_ERR_RNR;
@@ -322,13 +319,8 @@ static enum resp_states get_srq_wqe(struct rxe_qp *qp)
 	memcpy(&qp->resp.srq_wqe, wqe, size);
 
 	qp->resp.wqe = &qp->resp.srq_wqe.wqe;
-	if (qp->is_user) {
-		advance_consumer(q, QUEUE_TYPE_FROM_USER);
-		count = queue_count(q, QUEUE_TYPE_FROM_USER);
-	} else {
-		advance_consumer(q, QUEUE_TYPE_KERNEL);
-		count = queue_count(q, QUEUE_TYPE_KERNEL);
-	}
+	advance_consumer(q, QUEUE_TYPE_FROM_CLIENT);
+	count = queue_count(q, QUEUE_TYPE_FROM_CLIENT);
 
 	if (srq->limit && srq->ibsrq.event_handler && (count < srq->limit)) {
 		srq->limit = 0;
@@ -357,12 +349,8 @@ static enum resp_states check_resource(struct rxe_qp *qp,
 			qp->resp.status = IB_WC_WR_FLUSH_ERR;
 			return RESPST_COMPLETE;
 		} else if (!srq) {
-			if (qp->is_user)
-				qp->resp.wqe = queue_head(qp->rq.queue,
-						QUEUE_TYPE_FROM_USER);
-			else
-				qp->resp.wqe = queue_head(qp->rq.queue,
-						QUEUE_TYPE_KERNEL);
+			qp->resp.wqe = queue_head(qp->rq.queue,
+					QUEUE_TYPE_FROM_CLIENT);
 			if (qp->resp.wqe) {
 				qp->resp.status = IB_WC_WR_FLUSH_ERR;
 				return RESPST_COMPLETE;
@@ -389,12 +377,8 @@ static enum resp_states check_resource(struct rxe_qp *qp,
 		if (srq)
 			return get_srq_wqe(qp);
 
-		if (qp->is_user)
-			qp->resp.wqe = queue_head(qp->rq.queue,
-					QUEUE_TYPE_FROM_USER);
-		else
-			qp->resp.wqe = queue_head(qp->rq.queue,
-					QUEUE_TYPE_KERNEL);
+		qp->resp.wqe = queue_head(qp->rq.queue,
+				QUEUE_TYPE_FROM_CLIENT);
 		return (qp->resp.wqe) ? RESPST_CHK_LENGTH : RESPST_ERR_RNR;
 	}
 
@@ -936,12 +920,8 @@ static enum resp_states do_complete(struct rxe_qp *qp,
 	}
 
 	/* have copy for srq and reference for !srq */
-	if (!qp->srq) {
-		if (qp->is_user)
-			advance_consumer(qp->rq.queue, QUEUE_TYPE_FROM_USER);
-		else
-			advance_consumer(qp->rq.queue, QUEUE_TYPE_KERNEL);
-	}
+	if (!qp->srq)
+		advance_consumer(qp->rq.queue, QUEUE_TYPE_FROM_CLIENT);
 
 	qp->resp.wqe = NULL;
 
diff --git a/drivers/infiniband/sw/rxe/rxe_srq.c b/drivers/infiniband/sw/rxe/rxe_srq.c
index 610c98d24b5c..a9e7817e2732 100644
--- a/drivers/infiniband/sw/rxe/rxe_srq.c
+++ b/drivers/infiniband/sw/rxe/rxe_srq.c
@@ -93,7 +93,7 @@ int rxe_srq_from_init(struct rxe_dev *rxe, struct rxe_srq *srq,
 	spin_lock_init(&srq->rq.producer_lock);
 	spin_lock_init(&srq->rq.consumer_lock);
 
-	type = uresp ? QUEUE_TYPE_FROM_USER : QUEUE_TYPE_KERNEL;
+	type = QUEUE_TYPE_FROM_CLIENT;
 	q = rxe_queue_init(rxe, &srq->rq.max_wr,
 			srq_wqe_size, type);
 	if (!q) {
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 267b5a9c345d..dc70e3edeba6 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -218,11 +218,7 @@ static int post_one_recv(struct rxe_rq *rq, const struct ib_recv_wr *ibwr)
 	int num_sge = ibwr->num_sge;
 	int full;
 
-	if (rq->is_user)
-		full = queue_full(rq->queue, QUEUE_TYPE_FROM_USER);
-	else
-		full = queue_full(rq->queue, QUEUE_TYPE_KERNEL);
-
+	full = queue_full(rq->queue, QUEUE_TYPE_FROM_CLIENT);
 	if (unlikely(full)) {
 		err = -ENOMEM;
 		goto err1;
@@ -237,11 +233,7 @@ static int post_one_recv(struct rxe_rq *rq, const struct ib_recv_wr *ibwr)
 	for (i = 0; i < num_sge; i++)
 		length += ibwr->sg_list[i].length;
 
-	if (rq->is_user)
-		recv_wqe = producer_addr(rq->queue, QUEUE_TYPE_FROM_USER);
-	else
-		recv_wqe = producer_addr(rq->queue, QUEUE_TYPE_KERNEL);
-
+	recv_wqe = producer_addr(rq->queue, QUEUE_TYPE_FROM_CLIENT);
 	recv_wqe->wr_id = ibwr->wr_id;
 	recv_wqe->num_sge = num_sge;
 
@@ -254,10 +246,7 @@ static int post_one_recv(struct rxe_rq *rq, const struct ib_recv_wr *ibwr)
 	recv_wqe->dma.cur_sge		= 0;
 	recv_wqe->dma.sge_offset	= 0;
 
-	if (rq->is_user)
-		advance_producer(rq->queue, QUEUE_TYPE_FROM_USER);
-	else
-		advance_producer(rq->queue, QUEUE_TYPE_KERNEL);
+	advance_producer(rq->queue, QUEUE_TYPE_FROM_CLIENT);
 
 	return 0;
 
@@ -633,27 +622,17 @@ static int post_one_send(struct rxe_qp *qp, const struct ib_send_wr *ibwr,
 
 	spin_lock_irqsave(&qp->sq.sq_lock, flags);
 
-	if (qp->is_user)
-		full = queue_full(sq->queue, QUEUE_TYPE_FROM_USER);
-	else
-		full = queue_full(sq->queue, QUEUE_TYPE_KERNEL);
+	full = queue_full(sq->queue, QUEUE_TYPE_FROM_CLIENT);
 
 	if (unlikely(full)) {
 		spin_unlock_irqrestore(&qp->sq.sq_lock, flags);
 		return -ENOMEM;
 	}
 
-	if (qp->is_user)
-		send_wqe = producer_addr(sq->queue, QUEUE_TYPE_FROM_USER);
-	else
-		send_wqe = producer_addr(sq->queue, QUEUE_TYPE_KERNEL);
-
+	send_wqe = producer_addr(sq->queue, QUEUE_TYPE_FROM_CLIENT);
 	init_send_wqe(qp, ibwr, mask, length, send_wqe);
 
-	if (qp->is_user)
-		advance_producer(sq->queue, QUEUE_TYPE_FROM_USER);
-	else
-		advance_producer(sq->queue, QUEUE_TYPE_KERNEL);
+	advance_producer(sq->queue, QUEUE_TYPE_FROM_CLIENT);
 
 	spin_unlock_irqrestore(&qp->sq.sq_lock, flags);
 
@@ -845,18 +824,12 @@ static int rxe_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc)
 
 	spin_lock_irqsave(&cq->cq_lock, flags);
 	for (i = 0; i < num_entries; i++) {
-		if (cq->is_user)
-			cqe = queue_head(cq->queue, QUEUE_TYPE_TO_USER);
-		else
-			cqe = queue_head(cq->queue, QUEUE_TYPE_KERNEL);
+		cqe = queue_head(cq->queue, QUEUE_TYPE_TO_CLIENT);
 		if (!cqe)
 			break;
 
 		memcpy(wc++, &cqe->ibwc, sizeof(*wc));
-		if (cq->is_user)
-			advance_consumer(cq->queue, QUEUE_TYPE_TO_USER);
-		else
-			advance_consumer(cq->queue, QUEUE_TYPE_KERNEL);
+		advance_consumer(cq->queue, QUEUE_TYPE_TO_CLIENT);
 	}
 	spin_unlock_irqrestore(&cq->cq_lock, flags);
 
@@ -868,10 +841,7 @@ static int rxe_peek_cq(struct ib_cq *ibcq, int wc_cnt)
 	struct rxe_cq *cq = to_rcq(ibcq);
 	int count;
 
-	if (cq->is_user)
-		count = queue_count(cq->queue, QUEUE_TYPE_TO_USER);
-	else
-		count = queue_count(cq->queue, QUEUE_TYPE_KERNEL);
+	count = queue_count(cq->queue, QUEUE_TYPE_TO_CLIENT);
 
 	return (count > wc_cnt) ? wc_cnt : count;
 }
@@ -887,10 +857,7 @@ static int rxe_req_notify_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags)
 	if (cq->notify != IB_CQ_NEXT_COMP)
 		cq->notify = flags & IB_CQ_SOLICITED_MASK;
 
-	if (cq->is_user)
-		empty = queue_empty(cq->queue, QUEUE_TYPE_TO_USER);
-	else
-		empty = queue_empty(cq->queue, QUEUE_TYPE_KERNEL);
+	empty = queue_empty(cq->queue, QUEUE_TYPE_TO_CLIENT);
 
 	if ((flags & IB_CQ_REPORT_MISSED_EVENTS) && !empty)
 		ret = 1;
--
2.30.2

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: 回复: [PATCH for-rc v3 1/6] RDMA/rxe: Add memory barriers to kernel queues
  2021-09-14  6:04   ` 回复: " yangx.jy
@ 2021-09-14 15:47     ` Bob Pearson
  0 siblings, 0 replies; 22+ messages in thread
From: Bob Pearson @ 2021-09-14 15:47 UTC (permalink / raw)
  To: yangx.jy, jgg, zyjzyj2000, linux-rdma, mie, bvanassche

On 9/14/21 1:04 AM, yangx.jy@fujitsu.com wrote:
> Hi Bob,
> 
> Why do you want to use FROM_CLIENT and TO_CLIENT suffix?
> It seems more readable to use the FROM_USER and TO_USER suffixes (i.e. between user space and kernel space).
> 
> Best Regards,
> Xiao Yang
> 

The whole purpose of this patch is to extend the memory barriers to support
user <-> kernel *and* kernel <-> kernel. Changing the name made it clearer
that it's not just user space. It also made it easier to make sure I had
gotten all of them changed/looked at. Now that is done, if everyone wants it
back to USER I could be talked into it.

Bob

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2021-09-14 15:48 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-09 20:44 [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes Bob Pearson
2021-09-09 20:44 ` [PATCH for-rc v3 1/6] RDMA/rxe: Add memory barriers to kernel queues Bob Pearson
2021-09-10  1:19   ` Zhu Yanjun
2021-09-10  4:01     ` Bob Pearson
2021-09-14  6:04   ` 回复: " yangx.jy
2021-09-14 15:47     ` Bob Pearson
2021-09-09 20:44 ` [PATCH for-rc v3 2/6] RDMA/rxe: Fix memory allocation while locked Bob Pearson
2021-09-09 20:44 ` [PATCH for-rc v3 3/6] RDMA/rxe: Cleanup MR status and type enums Bob Pearson
2021-09-09 20:44 ` [PATCH for-rc v3 4/6] RDMA/rxe: Separate HW and SW l/rkeys Bob Pearson
2021-09-09 20:44 ` [PATCH for-rc v3 5/6] RDMA/rxe: Create duplicate mapping tables for FMRs Bob Pearson
2021-09-09 20:44 ` [PATCH for-rc v3 6/6] RDMA/rxe: Only allow invalidate for appropriate MRs Bob Pearson
2021-09-09 21:52 ` [PATCH for-rc v3 0/6] RDMA/rxe: Various bug fixes Bart Van Assche
2021-09-10 19:38   ` Pearson, Robert B
2021-09-10 20:23     ` Bart Van Assche
2021-09-10 21:16       ` Bob Pearson
2021-09-10 21:47       ` Bob Pearson
2021-09-10 21:50         ` Bob Pearson
2021-09-10 22:07         ` Bart Van Assche
2021-09-12 14:41           ` Bob Pearson
2021-09-14  3:26             ` Bart Van Assche
2021-09-14  4:18               ` Bob Pearson
2021-09-12 14:42           ` Bob Pearson
