* [PATCH for-next v4 0/8] On-Demand Paging on SoftRoCE
@ 2023-04-19  5:51 Daisuke Matsuda
  2023-04-19  5:51 ` [PATCH for-next v4 1/8] RDMA/rxe: Tentative workqueue implementation Daisuke Matsuda
                   ` (8 more replies)
  0 siblings, 9 replies; 13+ messages in thread
From: Daisuke Matsuda @ 2023-04-19  5:51 UTC (permalink / raw)
  To: linux-rdma, leonro, jgg, zyjzyj2000
  Cc: linux-kernel, rpearsonhpe, yangx.jy, lizhijian, Daisuke Matsuda

This patch series implements the On-Demand Paging feature on the SoftRoCE
(rxe) driver, which has so far been available only in the mlx5 driver[1].

The first patch of this series is provided for testing purposes and should
be dropped in the end. It converts the three tasklets to use a workqueue
so that they can sleep during page faults. Bob Pearson says he will post a
patch to do this, and I think we can adopt that one. The other patches in
this series are, I believe, completed work.

For simplicity, I omitted some content, such as the motivation behind this
series.
Please see the cover letter of v3 for more details[2].

[Overview]
When an application registers a memory region (MR), RDMA drivers normally
pin the pages in the MR so that their physical addresses never change
during RDMA communication. This requires the MR to fit in physical memory
and inevitably leads to memory pressure. On-Demand Paging (ODP), by
contrast, allows applications to register MRs without pinning their pages:
pages are paged in when the driver requires them and paged out when the OS
reclaims them. As a result, it is possible to register a large MR that
does not fit in physical memory without consuming an excessive amount of
it.
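
As a reference point, here is a minimal userspace sketch of what
registering such an MR looks like with libibverbs; it only assumes an
already-allocated protection domain and a mapped buffer, and error
handling is omitted.

#include <stddef.h>
#include <infiniband/verbs.h>

/* Sketch: register 'buf' as an ODP-enabled MR. With IBV_ACCESS_ON_DEMAND
 * the pages are not pinned at registration time; they are faulted in on
 * first access instead.
 */
static struct ibv_mr *reg_odp_mr(struct ibv_pd *pd, void *buf, size_t len)
{
	int access = IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_READ |
		     IBV_ACCESS_REMOTE_WRITE | IBV_ACCESS_ON_DEMAND;

	return ibv_reg_mr(pd, buf, len, access);
}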

[How does ODP work?]
"struct ib_umem_odp" is used to manage pages. It is created for each
ODP-enabled MR on its registration. This struct holds a pair of arrays
(dma_list/pfn_list) that serve as a driver page table. DMA addresses and
PFNs are stored in the driver page table. They are updated on page-in and
page-out, both of which use the common interfaces in the ib_uverbs layer.
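
For illustration, the per-page presence/permission check against the
driver page table boils down to something like the following sketch; the
helper name is made up, and the real checks live in patch 7.

#include <rdma/ib_umem_odp.h>

/* Illustrative helper (not part of this series): each dma_list entry
 * carries ODP_READ/WRITE_ALLOWED_BIT flags alongside the DMA address,
 * so presence and permission can be tested per page.
 */
static bool odp_page_writable(struct ib_umem_odp *umem_odp, u64 iova)
{
	unsigned long idx = (iova - ib_umem_start(umem_odp)) >>
			    umem_odp->page_shift;

	return umem_odp->dma_list[idx] & ODP_WRITE_ALLOWED_BIT;
}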

Page-in can occur when the requester, responder, or completer accesses an
MR to process an RDMA operation. If they find that the pages being
accessed are not present in physical memory, or that the requisite
permissions are not set on them, they trigger a page fault to make the
pages present with the proper permissions and, at the same time, update
the driver page table. After confirming the presence of the pages, they
perform the memory access, i.e. read, write, or atomic operations.
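
In kernel terms, the page-in path follows the pattern below. This is only
a simplified sketch with an illustrative function name; the driver-side
details are in patches 6 and 7.

#include <rdma/ib_umem_odp.h>

/* Simplified sketch of a page-in: fault the range in and return with
 * umem_mutex held on success, so the caller can access the pages
 * before the invalidation handler can run again.
 */
static int odp_fault_and_access(struct ib_umem_odp *umem_odp, u64 va,
				u64 len, bool writable)
{
	u64 access = ODP_READ_ALLOWED_BIT;
	int npages;

	if (writable)
		access |= ODP_WRITE_ALLOWED_BIT;

	npages = ib_umem_odp_map_dma_and_lock(umem_odp, va, len, access, true);
	if (npages < 0)
		return npages;

	/* ... perform the read/write/atomic access here ... */

	mutex_unlock(&umem_odp->umem_mutex);
	return 0;
}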

Page-out is triggered by page reclaim or by filesystem events (e.g. a
metadata update of a file that is being used as an MR). When an
ODP-enabled MR is created, the driver registers an MMU notifier callback.
When the kernel issues a page invalidation notification, the callback is
invoked to unmap the DMA addresses and update the driver page table. After
that, the kernel releases the pages.
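
The registration side of that hook is just a call to ib_umem_odp_get()
with the driver's notifier ops. A hedged sketch, mirroring what patch 6
does but with placeholder names, looks like this:

#include <rdma/ib_umem_odp.h>

/* Sketch of tying an MR to the MMU notifier machinery. 'ops' must
 * provide an .invalidate callback (see patch 5); 'mr' is whatever
 * per-MR object the callback needs to find later.
 */
static struct ib_umem_odp *odp_umem_get(struct ib_device *ibdev, u64 start,
					u64 length, int access,
					const struct mmu_interval_notifier_ops *ops,
					void *mr)
{
	struct ib_umem_odp *umem_odp;

	umem_odp = ib_umem_odp_get(ibdev, start, length, access, ops);
	if (IS_ERR(umem_odp))
		return umem_odp;

	umem_odp->private = mr;	/* let the invalidate callback reach the MR */
	return umem_odp;
}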

[Supported operations]
All traditional operations are supported on RC connections. The new Atomic
Write[3] and RDMA Flush[4] operations are not included in this patchset; I
will post them after this patchset is merged. On UD connections, Send,
Recv, and SRQ-Recv are supported.

[How to test ODP?]
There are only a few resources available for testing. The pyverbs
testcases in rdma-core and perftest[5] are the recommended ones. Other
than those, the ibv_rc_pingpong command can also be used for testing. Note
that you may have to build perftest from upstream because older versions
do not handle ODP capabilities correctly.
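
Before running these, it may be worth confirming that the rxe device
actually reports the ODP capabilities. A small check with libibverbs
(assuming 'ctx' is an opened context for the rxe device) could look like
the following; 'ibv_devinfo -v' prints the same bits.

#include <stdio.h>
#include <infiniband/verbs.h>

/* Print a couple of the ODP capability bits for the device under test. */
static void print_odp_caps(struct ibv_context *ctx)
{
	struct ibv_device_attr_ex attr = {};

	if (ibv_query_device_ex(ctx, NULL, &attr))
		return;

	printf("general ODP support: %s\n",
	       (attr.odp_caps.general_caps & IBV_ODP_SUPPORT) ? "yes" : "no");
	printf("RC RDMA Write with ODP: %s\n",
	       (attr.odp_caps.per_transport_caps.rc_odp_caps &
		IBV_ODP_SUPPORT_WRITE) ? "yes" : "no");
}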

The tree is available from github:
https://github.com/daimatsuda/linux/tree/odp_v4
While this series is based on commit f605f26ea196, the tree includes an
additional bugfix, which is yet to be merged as of today (Apr 19th, 2023).
https://lore.kernel.org/linux-rdma/20230418090642.1849358-1-matsuda-daisuke@fujitsu.com/

[Future work]
My next work is to enable the new Atomic Write[3] and RDMA Flush[4]
operations with ODP. After that, I am going to implement the prefetch
feature, which allows applications to trigger page faults using
ibv_advise_mr(3) to optimize performance; some existing software, such as
librpma[6], uses this feature. Additionally, I think we can also add the
implicit ODP feature in the future.
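
For reference, the prefetch would be driven from userspace roughly as
follows, using the existing ibv_advise_mr(3) interface; the 1 MiB length
is an arbitrary placeholder.

#include <stdint.h>
#include <infiniband/verbs.h>

/* Sketch: ask the provider to pre-fault the first 1 MiB of an
 * ODP-enabled MR for writing, so later RDMA access does not take
 * page faults on the data path.
 */
static int prefetch_odp_range(struct ibv_pd *pd, struct ibv_mr *mr)
{
	struct ibv_sge sge = {
		.addr   = (uintptr_t)mr->addr,
		.length = 1024 * 1024,
		.lkey   = mr->lkey,
	};

	return ibv_advise_mr(pd, IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE,
			     IBV_ADVISE_MR_FLAG_FLUSH, &sge, 1);
}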

[1] [RFC 00/20] On demand paging
https://www.spinics.net/lists/linux-rdma/msg18906.html

[2] [PATCH for-next v3 0/7] On-Demand Paging on SoftRoCE
https://lore.kernel.org/lkml/cover.1671772917.git.matsuda-daisuke@fujitsu.com/

[3] [PATCH v7 0/8] RDMA/rxe: Add atomic write operation
https://lore.kernel.org/linux-rdma/1669905432-14-1-git-send-email-yangx.jy@fujitsu.com/

[4] [for-next PATCH 00/10] RDMA/rxe: Add RDMA FLUSH operation
https://lore.kernel.org/lkml/20221206130201.30986-1-lizhijian@fujitsu.com/

[5] linux-rdma/perftest: Infiniband Verbs Performance Tests
https://github.com/linux-rdma/perftest

[6] librpma: Remote Persistent Memory Access Library
https://github.com/pmem/rpma

v3->v4:
 1) Re-designed functions that access MRs to use the MR xarray.
 2) Rebased onto the latest jgg-for-next tree.

v2->v3:
 1) Removed a patch that changes the common ib_uverbs layer.
 2) Re-implemented patches for conversion to workqueue.
 3) Fixed compile errors (happened when CONFIG_INFINIBAND_ON_DEMAND_PAGING=n).
 4) Fixed some functions that returned incorrect errors.
 5) Temporarily disabled ODP for RDMA Flush and Atomic Write.

v1->v2:
 1) Fixed a crash issue reported by Haris Iqbal.
 2) Tried to make the lock patterns clearer, as pointed out by Leon Romanovsky.
 3) Minor clean ups and fixes.

Daisuke Matsuda (8):
  RDMA/rxe: Tentative workqueue implementation
  RDMA/rxe: Always schedule works before accessing user MRs
  RDMA/rxe: Make MR functions accessible from other rxe source code
  RDMA/rxe: Move resp_states definition to rxe_verbs.h
  RDMA/rxe: Add page invalidation support
  RDMA/rxe: Allow registering MRs for On-Demand Paging
  RDMA/rxe: Add support for Send/Recv/Write/Read with ODP
  RDMA/rxe: Add support for the traditional Atomic operations with ODP

 drivers/infiniband/sw/rxe/Makefile    |   2 +
 drivers/infiniband/sw/rxe/rxe.c       |  27 ++-
 drivers/infiniband/sw/rxe/rxe.h       |  37 ---
 drivers/infiniband/sw/rxe/rxe_comp.c  |  12 +-
 drivers/infiniband/sw/rxe/rxe_loc.h   |  49 +++-
 drivers/infiniband/sw/rxe/rxe_mr.c    |  27 +--
 drivers/infiniband/sw/rxe/rxe_odp.c   | 311 ++++++++++++++++++++++++++
 drivers/infiniband/sw/rxe/rxe_recv.c  |   4 +-
 drivers/infiniband/sw/rxe/rxe_resp.c  |  32 ++-
 drivers/infiniband/sw/rxe/rxe_task.c  |  84 ++++---
 drivers/infiniband/sw/rxe/rxe_task.h  |   6 +-
 drivers/infiniband/sw/rxe/rxe_verbs.c |   5 +-
 drivers/infiniband/sw/rxe/rxe_verbs.h |  39 ++++
 13 files changed, 535 insertions(+), 100 deletions(-)
 create mode 100644 drivers/infiniband/sw/rxe/rxe_odp.c

base-commit: f605f26ea196a3b49bea249330cbd18dba61a33e

-- 
2.39.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH for-next v4 1/8] RDMA/rxe: Tentative workqueue implementation
  2023-04-19  5:51 [PATCH for-next v4 0/8] On-Demand Paging on SoftRoCE Daisuke Matsuda
@ 2023-04-19  5:51 ` Daisuke Matsuda
  2023-04-19  5:51 ` [PATCH for-next v4 2/8] RDMA/rxe: Always schedule works before accessing user MRs Daisuke Matsuda
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Daisuke Matsuda @ 2023-04-19  5:51 UTC (permalink / raw)
  To: linux-rdma, leonro, jgg, zyjzyj2000
  Cc: linux-kernel, rpearsonhpe, yangx.jy, lizhijian, Daisuke Matsuda

This is a mock patch and is not intended to be merged. In order to
implement ODP on rxe, it is necessary to convert the three tasklets to use
a workqueue. I expect Bob Pearson will submit his patch to do this very
soon.

Link: https://lore.kernel.org/linux-rdma/a74126b4-b527-af72-f23e-c9d6711e5285@gmail.com/
Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
---
 drivers/infiniband/sw/rxe/rxe.c      |  9 ++-
 drivers/infiniband/sw/rxe/rxe_task.c | 84 ++++++++++++++++++----------
 drivers/infiniband/sw/rxe/rxe_task.h |  6 +-
 3 files changed, 67 insertions(+), 32 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe.c b/drivers/infiniband/sw/rxe/rxe.c
index 7a7e713de52d..54c723a6edda 100644
--- a/drivers/infiniband/sw/rxe/rxe.c
+++ b/drivers/infiniband/sw/rxe/rxe.c
@@ -212,10 +212,16 @@ static int __init rxe_module_init(void)
 {
 	int err;
 
-	err = rxe_net_init();
+	err = rxe_alloc_wq();
 	if (err)
 		return err;
 
+	err = rxe_net_init();
+	if (err) {
+		rxe_destroy_wq();
+		return err;
+	}
+
 	rdma_link_register(&rxe_link_ops);
 	pr_info("loaded\n");
 	return 0;
@@ -226,6 +232,7 @@ static void __exit rxe_module_exit(void)
 	rdma_link_unregister(&rxe_link_ops);
 	ib_unregister_driver(RDMA_DRIVER_RXE);
 	rxe_net_exit();
+	rxe_destroy_wq();
 
 	pr_info("unloaded\n");
 }
diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
index fb9a6bc8e620..c8aa1763d1f9 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.c
+++ b/drivers/infiniband/sw/rxe/rxe_task.c
@@ -6,8 +6,25 @@
 
 #include "rxe.h"
 
+static struct workqueue_struct *rxe_wq;
+
+int rxe_alloc_wq(void)
+{
+	rxe_wq = alloc_workqueue("rxe_wq", WQ_CPU_INTENSIVE | WQ_UNBOUND,
+				 WQ_MAX_ACTIVE);
+	if (!rxe_wq)
+		return -ENOMEM;
+
+	return 0;
+}
+
+void rxe_destroy_wq(void)
+{
+	destroy_workqueue(rxe_wq);
+}
+
 /* Check if task is idle i.e. not running, not scheduled in
- * tasklet queue and not draining. If so move to busy to
+ * work queue and not draining. If so move to busy to
  * reserve a slot in do_task() by setting to busy and taking
  * a qp reference to cover the gap from now until the task finishes.
  * state will move out of busy if task returns a non zero value
@@ -21,7 +38,7 @@ static bool __reserve_if_idle(struct rxe_task *task)
 {
 	WARN_ON(rxe_read(task->qp) <= 0);
 
-	if (task->tasklet.state & BIT(TASKLET_STATE_SCHED))
+	if (task->state & BIT(TASKLET_STATE_SCHED))
 		return false;
 
 	if (task->state == TASK_STATE_IDLE) {
@@ -38,7 +55,7 @@ static bool __reserve_if_idle(struct rxe_task *task)
 }
 
 /* check if task is idle or drained and not currently
- * scheduled in the tasklet queue. This routine is
+ * scheduled in the work queue. This routine is
  * called by rxe_cleanup_task or rxe_disable_task to
  * see if the queue is empty.
  * Context: caller should hold task->lock.
@@ -46,7 +63,7 @@ static bool __reserve_if_idle(struct rxe_task *task)
  */
 static bool __is_done(struct rxe_task *task)
 {
-	if (task->tasklet.state & BIT(TASKLET_STATE_SCHED))
+	if (task->state & BIT(TASKLET_STATE_SCHED))
 		return false;
 
 	if (task->state == TASK_STATE_IDLE ||
@@ -77,23 +94,23 @@ static bool is_done(struct rxe_task *task)
  * schedules the task. They must call __reserve_if_idle to
  * move the task to busy before calling or scheduling.
  * The task can also be moved to drained or invalid
- * by calls to rxe-cleanup_task or rxe_disable_task.
+ * by calls to rxe_cleanup_task or rxe_disable_task.
  * In that case tasks which get here are not executed but
  * just flushed. The tasks are designed to look to see if
- * there is work to do and do part of it before returning
+ * there is work to do and then do part of it before returning
  * here with a return value of zero until all the work
- * has been consumed then it retuens a non-zero value.
+ * has been consumed then it returns a non-zero value.
  * The number of times the task can be run is limited by
  * max iterations so one task cannot hold the cpu forever.
+ * If the limit is hit and work remains the task is rescheduled.
  */
-static void do_task(struct tasklet_struct *t)
+static void do_task(struct rxe_task *task)
 {
-	int cont;
-	int ret;
-	struct rxe_task *task = from_tasklet(task, t, tasklet);
 	unsigned int iterations;
 	unsigned long flags;
 	int resched = 0;
+	int cont;
+	int ret;
 
 	WARN_ON(rxe_read(task->qp) <= 0);
 
@@ -122,8 +139,8 @@ static void do_task(struct tasklet_struct *t)
 			} else {
 				/* This can happen if the client
 				 * can add work faster than the
-				 * tasklet can finish it.
-				 * Reschedule the tasklet and exit
+				 * work queue can finish it.
+				 * Reschedule the task and exit
 				 * the loop to give up the cpu
 				 */
 				task->state = TASK_STATE_IDLE;
@@ -131,9 +148,9 @@ static void do_task(struct tasklet_struct *t)
 			}
 			break;
 
-		/* someone tried to run the task since the last time we called
-		 * func, so we will call one more time regardless of the
-		 * return value
+		/* someone tried to run the task since the last time we
+		 * called func, so we will call one more time regardless
+		 * of the return value
 		 */
 		case TASK_STATE_ARMED:
 			task->state = TASK_STATE_BUSY;
@@ -149,13 +166,16 @@ static void do_task(struct tasklet_struct *t)
 
 		default:
 			WARN_ON(1);
-			rxe_info_qp(task->qp, "unexpected task state = %d", task->state);
+			rxe_dbg_qp(task->qp, "unexpected task state = %d",
+				   task->state);
 		}
 
 		if (!cont) {
 			task->num_done++;
 			if (WARN_ON(task->num_done != task->num_sched))
-				rxe_err_qp(task->qp, "%ld tasks scheduled, %ld tasks done",
+				rxe_dbg_qp(task->qp,
+					   "%ld tasks scheduled, "
+					   "%ld tasks done",
 					   task->num_sched, task->num_done);
 		}
 		spin_unlock_irqrestore(&task->lock, flags);
@@ -169,6 +189,12 @@ static void do_task(struct tasklet_struct *t)
 	rxe_put(task->qp);
 }
 
+/* wrapper around do_task to fix argument */
+static void __do_task(struct work_struct *work)
+{
+	do_task(container_of(work, struct rxe_task, work));
+}
+
 int rxe_init_task(struct rxe_task *task, struct rxe_qp *qp,
 		  int (*func)(struct rxe_qp *))
 {
@@ -176,11 +202,9 @@ int rxe_init_task(struct rxe_task *task, struct rxe_qp *qp,
 
 	task->qp = qp;
 	task->func = func;
-
-	tasklet_setup(&task->tasklet, do_task);
-
 	task->state = TASK_STATE_IDLE;
 	spin_lock_init(&task->lock);
+	INIT_WORK(&task->work, __do_task);
 
 	return 0;
 }
@@ -213,8 +237,6 @@ void rxe_cleanup_task(struct rxe_task *task)
 	while (!is_done(task))
 		cond_resched();
 
-	tasklet_kill(&task->tasklet);
-
 	spin_lock_irqsave(&task->lock, flags);
 	task->state = TASK_STATE_INVALID;
 	spin_unlock_irqrestore(&task->lock, flags);
@@ -226,7 +248,7 @@ void rxe_cleanup_task(struct rxe_task *task)
 void rxe_run_task(struct rxe_task *task)
 {
 	unsigned long flags;
-	int run;
+	bool run;
 
 	WARN_ON(rxe_read(task->qp) <= 0);
 
@@ -235,11 +257,11 @@ void rxe_run_task(struct rxe_task *task)
 	spin_unlock_irqrestore(&task->lock, flags);
 
 	if (run)
-		do_task(&task->tasklet);
+		do_task(task);
 }
 
-/* schedule the task to run later as a tasklet.
- * the tasklet)schedule call can be called holding
+/* schedule the task to run later as a work queue entry.
+ * the queue_work call can be called holding
  * the lock.
  */
 void rxe_sched_task(struct rxe_task *task)
@@ -250,7 +272,7 @@ void rxe_sched_task(struct rxe_task *task)
 
 	spin_lock_irqsave(&task->lock, flags);
 	if (__reserve_if_idle(task))
-		tasklet_schedule(&task->tasklet);
+		queue_work(rxe_wq, &task->work);
 	spin_unlock_irqrestore(&task->lock, flags);
 }
 
@@ -277,7 +299,9 @@ void rxe_disable_task(struct rxe_task *task)
 	while (!is_done(task))
 		cond_resched();
 
-	tasklet_disable(&task->tasklet);
+	spin_lock_irqsave(&task->lock, flags);
+	task->state = TASK_STATE_DRAINED;
+	spin_unlock_irqrestore(&task->lock, flags);
 }
 
 void rxe_enable_task(struct rxe_task *task)
@@ -291,7 +315,7 @@ void rxe_enable_task(struct rxe_task *task)
 		spin_unlock_irqrestore(&task->lock, flags);
 		return;
 	}
+
 	task->state = TASK_STATE_IDLE;
-	tasklet_enable(&task->tasklet);
 	spin_unlock_irqrestore(&task->lock, flags);
 }
diff --git a/drivers/infiniband/sw/rxe/rxe_task.h b/drivers/infiniband/sw/rxe/rxe_task.h
index facb7c8e3729..a63e258b3d66 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.h
+++ b/drivers/infiniband/sw/rxe/rxe_task.h
@@ -22,7 +22,7 @@ enum {
  * called again.
  */
 struct rxe_task {
-	struct tasklet_struct	tasklet;
+	struct work_struct	work;
 	int			state;
 	spinlock_t		lock;
 	struct rxe_qp		*qp;
@@ -32,6 +32,10 @@ struct rxe_task {
 	long			num_done;
 };
 
+int rxe_alloc_wq(void);
+
+void rxe_destroy_wq(void);
+
 /*
  * init rxe_task structure
  *	qp  => parameter to pass to func
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH for-next v4 2/8] RDMA/rxe: Always schedule works before accessing user MRs
  2023-04-19  5:51 [PATCH for-next v4 0/8] On-Demand Paging on SoftRoCE Daisuke Matsuda
  2023-04-19  5:51 ` [PATCH for-next v4 1/8] RDMA/rxe: Tentative workqueue implementation Daisuke Matsuda
@ 2023-04-19  5:51 ` Daisuke Matsuda
  2023-04-19 19:37   ` kernel test robot
  2023-04-19  5:51 ` [PATCH for-next v4 3/8] RDMA/rxe: Make MR functions accessible from other rxe source code Daisuke Matsuda
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 13+ messages in thread
From: Daisuke Matsuda @ 2023-04-19  5:51 UTC (permalink / raw)
  To: linux-rdma, leonro, jgg, zyjzyj2000
  Cc: linux-kernel, rpearsonhpe, yangx.jy, lizhijian, Daisuke Matsuda

Both the responder and the completer can sleep to handle page faults when
used with ODP. This happens when they are about to access user MRs, so
work items must be scheduled in such cases.

Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
---
 drivers/infiniband/sw/rxe/rxe_comp.c | 12 ++++++++++--
 drivers/infiniband/sw/rxe/rxe_loc.h  |  4 ++--
 drivers/infiniband/sw/rxe/rxe_recv.c |  4 ++--
 drivers/infiniband/sw/rxe/rxe_resp.c | 14 +++++++++-----
 4 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
index db18ace74d2b..b71bd9cc00d0 100644
--- a/drivers/infiniband/sw/rxe/rxe_comp.c
+++ b/drivers/infiniband/sw/rxe/rxe_comp.c
@@ -126,13 +126,21 @@ void retransmit_timer(struct timer_list *t)
 	spin_unlock_bh(&qp->state_lock);
 }
 
-void rxe_comp_queue_pkt(struct rxe_qp *qp, struct sk_buff *skb)
+void rxe_comp_queue_pkt(struct rxe_pkt_info *pkt, struct sk_buff *skb)
 {
+	struct rxe_qp *qp = pkt->qp;
 	int must_sched;
 
 	skb_queue_tail(&qp->resp_pkts, skb);
 
-	must_sched = skb_queue_len(&qp->resp_pkts) > 1;
+	/* Schedule the task if processing Read responses or Atomic acks.
+	 * In these cases, completer may sleep to access ODP-enabled MRs.
+	 */
+	if (pkt->mask & (RXE_PAYLOAD_MASK | RXE_ATMACK_MASK))
+		must_sched = 1;
+	else
+		must_sched = skb_queue_len(&qp->resp_pkts) > 1;
+
 	if (must_sched != 0)
 		rxe_counter_inc(SKB_TO_PKT(skb)->rxe, RXE_CNT_COMPLETER_SCHED);
 
diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h
index 804b15e929dd..bf28ac13c3f5 100644
--- a/drivers/infiniband/sw/rxe/rxe_loc.h
+++ b/drivers/infiniband/sw/rxe/rxe_loc.h
@@ -179,9 +179,9 @@ int rxe_icrc_init(struct rxe_dev *rxe);
 int rxe_icrc_check(struct sk_buff *skb, struct rxe_pkt_info *pkt);
 void rxe_icrc_generate(struct sk_buff *skb, struct rxe_pkt_info *pkt);
 
-void rxe_resp_queue_pkt(struct rxe_qp *qp, struct sk_buff *skb);
+void rxe_resp_queue_pkt(struct rxe_pkt_info *pkt, struct sk_buff *skb);
 
-void rxe_comp_queue_pkt(struct rxe_qp *qp, struct sk_buff *skb);
+void rxe_comp_queue_pkt(struct rxe_pkt_info *pkt, struct sk_buff *skb);
 
 static inline unsigned int wr_opcode_mask(int opcode, struct rxe_qp *qp)
 {
diff --git a/drivers/infiniband/sw/rxe/rxe_recv.c b/drivers/infiniband/sw/rxe/rxe_recv.c
index 2f953cc74256..0d869615508a 100644
--- a/drivers/infiniband/sw/rxe/rxe_recv.c
+++ b/drivers/infiniband/sw/rxe/rxe_recv.c
@@ -181,9 +181,9 @@ static int hdr_check(struct rxe_pkt_info *pkt)
 static inline void rxe_rcv_pkt(struct rxe_pkt_info *pkt, struct sk_buff *skb)
 {
 	if (pkt->mask & RXE_REQ_MASK)
-		rxe_resp_queue_pkt(pkt->qp, skb);
+		rxe_resp_queue_pkt(pkt, skb);
 	else
-		rxe_comp_queue_pkt(pkt->qp, skb);
+		rxe_comp_queue_pkt(pkt, skb);
 }
 
 static void rxe_rcv_mcast_pkt(struct rxe_dev *rxe, struct sk_buff *skb)
diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index 68f6cd188d8e..f915128ed32a 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -47,15 +47,19 @@ static char *resp_state_name[] = {
 };
 
 /* rxe_recv calls here to add a request packet to the input queue */
-void rxe_resp_queue_pkt(struct rxe_qp *qp, struct sk_buff *skb)
+void rxe_resp_queue_pkt(struct rxe_pkt_info *pkt, struct sk_buff *skb)
 {
-	int must_sched;
-	struct rxe_pkt_info *pkt = SKB_TO_PKT(skb);
+	int must_sched = 1;
+	struct rxe_qp *qp = pkt->qp;
 
 	skb_queue_tail(&qp->req_pkts, skb);
 
-	must_sched = (pkt->opcode == IB_OPCODE_RC_RDMA_READ_REQUEST) ||
-			(skb_queue_len(&qp->req_pkts) > 1);
+	/* responder can sleep to access an ODP-enabled MR. Always schedule
+	 * tasks for non-zero-byte operations, RDMA Read, and Atomic.
+	 */
+	if ((skb_queue_len(&qp->req_pkts) == 1) && (payload_size(pkt) == 0)
+	    && !(pkt->mask & RXE_READ_OR_ATOMIC_MASK))
+		must_sched = 0;
 
 	if (must_sched)
 		rxe_sched_task(&qp->resp.task);
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH for-next v4 3/8] RDMA/rxe: Make MR functions accessible from other rxe source code
  2023-04-19  5:51 [PATCH for-next v4 0/8] On-Demand Paging on SoftRoCE Daisuke Matsuda
  2023-04-19  5:51 ` [PATCH for-next v4 1/8] RDMA/rxe: Tentative workqueue implementation Daisuke Matsuda
  2023-04-19  5:51 ` [PATCH for-next v4 2/8] RDMA/rxe: Always schedule works before accessing user MRs Daisuke Matsuda
@ 2023-04-19  5:51 ` Daisuke Matsuda
  2023-04-19  5:51 ` [PATCH for-next v4 4/8] RDMA/rxe: Move resp_states definition to rxe_verbs.h Daisuke Matsuda
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Daisuke Matsuda @ 2023-04-19  5:51 UTC (permalink / raw)
  To: linux-rdma, leonro, jgg, zyjzyj2000
  Cc: linux-kernel, rpearsonhpe, yangx.jy, lizhijian, Daisuke Matsuda

Some functions in rxe_mr.c are going to be used in rxe_odp.c, which will
be created in a subsequent patch. Add the declarations of these functions
to rxe_loc.h.

Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
---
 drivers/infiniband/sw/rxe/rxe_loc.h | 14 ++++++++++++++
 drivers/infiniband/sw/rxe/rxe_mr.c  | 18 ++++--------------
 2 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h
index bf28ac13c3f5..3476726425d9 100644
--- a/drivers/infiniband/sw/rxe/rxe_loc.h
+++ b/drivers/infiniband/sw/rxe/rxe_loc.h
@@ -60,7 +60,9 @@ int rxe_mmap(struct ib_ucontext *context, struct vm_area_struct *vma);
 
 /* rxe_mr.c */
 u8 rxe_get_next_key(u32 last_key);
+void rxe_mr_init(int access, struct rxe_mr *mr);
 void rxe_mr_init_dma(int access, struct rxe_mr *mr);
+int rxe_mr_fill_pages_from_sgt(struct rxe_mr *mr, struct sg_table *sgt);
 int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
 		     int access, struct rxe_mr *mr);
 int rxe_mr_init_fast(int max_pages, struct rxe_mr *mr);
@@ -71,6 +73,8 @@ int copy_data(struct rxe_pd *pd, int access, struct rxe_dma_info *dma,
 	      void *addr, int length, enum rxe_mr_copy_dir dir);
 int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg,
 		  int sg_nents, unsigned int *sg_offset);
+int rxe_mr_copy_xarray(struct rxe_mr *mr, u64 iova, void *addr,
+		       unsigned int length, enum rxe_mr_copy_dir dir);
 int rxe_mr_do_atomic_op(struct rxe_mr *mr, u64 iova, int opcode,
 			u64 compare, u64 swap_add, u64 *orig_val);
 int rxe_mr_do_atomic_write(struct rxe_mr *mr, u64 iova, u64 value);
@@ -82,6 +86,16 @@ int rxe_invalidate_mr(struct rxe_qp *qp, u32 key);
 int rxe_reg_fast_mr(struct rxe_qp *qp, struct rxe_send_wqe *wqe);
 void rxe_mr_cleanup(struct rxe_pool_elem *elem);
 
+static inline unsigned long rxe_mr_iova_to_index(struct rxe_mr *mr, u64 iova)
+{
+	return (iova >> mr->page_shift) - (mr->ibmr.iova >> mr->page_shift);
+}
+
+static inline unsigned long rxe_mr_iova_to_page_offset(struct rxe_mr *mr, u64 iova)
+{
+	return iova & (mr_page_size(mr) - 1);
+}
+
 /* rxe_mw.c */
 int rxe_alloc_mw(struct ib_mw *ibmw, struct ib_udata *udata);
 int rxe_dealloc_mw(struct ib_mw *ibmw);
diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index 0e538fafcc20..ffbac6f5e828 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -49,7 +49,7 @@ int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length)
 				| IB_ACCESS_REMOTE_WRITE	\
 				| IB_ACCESS_REMOTE_ATOMIC)
 
-static void rxe_mr_init(int access, struct rxe_mr *mr)
+void rxe_mr_init(int access, struct rxe_mr *mr)
 {
 	u32 lkey = mr->elem.index << 8 | rxe_get_next_key(-1);
 	u32 rkey = (access & IB_ACCESS_REMOTE) ? lkey : 0;
@@ -77,16 +77,6 @@ void rxe_mr_init_dma(int access, struct rxe_mr *mr)
 	mr->ibmr.type = IB_MR_TYPE_DMA;
 }
 
-static unsigned long rxe_mr_iova_to_index(struct rxe_mr *mr, u64 iova)
-{
-	return (iova >> mr->page_shift) - (mr->ibmr.iova >> mr->page_shift);
-}
-
-static unsigned long rxe_mr_iova_to_page_offset(struct rxe_mr *mr, u64 iova)
-{
-	return iova & (mr_page_size(mr) - 1);
-}
-
 static bool is_pmem_page(struct page *pg)
 {
 	unsigned long paddr = page_to_phys(pg);
@@ -96,7 +86,7 @@ static bool is_pmem_page(struct page *pg)
 				 IORES_DESC_PERSISTENT_MEMORY);
 }
 
-static int rxe_mr_fill_pages_from_sgt(struct rxe_mr *mr, struct sg_table *sgt)
+int rxe_mr_fill_pages_from_sgt(struct rxe_mr *mr, struct sg_table *sgt)
 {
 	XA_STATE(xas, &mr->page_list, 0);
 	struct sg_page_iter sg_iter;
@@ -247,8 +237,8 @@ int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sgl,
 	return ib_sg_to_pages(ibmr, sgl, sg_nents, sg_offset, rxe_set_page);
 }
 
-static int rxe_mr_copy_xarray(struct rxe_mr *mr, u64 iova, void *addr,
-			      unsigned int length, enum rxe_mr_copy_dir dir)
+int rxe_mr_copy_xarray(struct rxe_mr *mr, u64 iova, void *addr,
+		       unsigned int length, enum rxe_mr_copy_dir dir)
 {
 	unsigned int page_offset = rxe_mr_iova_to_page_offset(mr, iova);
 	unsigned long index = rxe_mr_iova_to_index(mr, iova);
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH for-next v4 4/8] RDMA/rxe: Move resp_states definition to rxe_verbs.h
  2023-04-19  5:51 [PATCH for-next v4 0/8] On-Demand Paging on SoftRoCE Daisuke Matsuda
                   ` (2 preceding siblings ...)
  2023-04-19  5:51 ` [PATCH for-next v4 3/8] RDMA/rxe: Make MR functions accessible from other rxe source code Daisuke Matsuda
@ 2023-04-19  5:51 ` Daisuke Matsuda
  2023-04-19  5:51 ` [PATCH for-next v4 5/8] RDMA/rxe: Add page invalidation support Daisuke Matsuda
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Daisuke Matsuda @ 2023-04-19  5:51 UTC (permalink / raw)
  To: linux-rdma, leonro, jgg, zyjzyj2000
  Cc: linux-kernel, rpearsonhpe, yangx.jy, lizhijian, Daisuke Matsuda

To use the resp_states values in rxe_loc.h, it is necessary to move the
definition to rxe_verbs.h, where other internal states of this driver are
defined.

Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
---
 drivers/infiniband/sw/rxe/rxe.h       | 37 ---------------------------
 drivers/infiniband/sw/rxe/rxe_verbs.h | 37 +++++++++++++++++++++++++++
 2 files changed, 37 insertions(+), 37 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe.h b/drivers/infiniband/sw/rxe/rxe.h
index d33dd6cf83d3..9b4d044a1264 100644
--- a/drivers/infiniband/sw/rxe/rxe.h
+++ b/drivers/infiniband/sw/rxe/rxe.h
@@ -100,43 +100,6 @@
 #define rxe_info_mw(mw, fmt, ...) ibdev_info_ratelimited((mw)->ibmw.device, \
 		"mw#%d %s:  " fmt, (mw)->elem.index, __func__, ##__VA_ARGS__)
 
-/* responder states */
-enum resp_states {
-	RESPST_NONE,
-	RESPST_GET_REQ,
-	RESPST_CHK_PSN,
-	RESPST_CHK_OP_SEQ,
-	RESPST_CHK_OP_VALID,
-	RESPST_CHK_RESOURCE,
-	RESPST_CHK_LENGTH,
-	RESPST_CHK_RKEY,
-	RESPST_EXECUTE,
-	RESPST_READ_REPLY,
-	RESPST_ATOMIC_REPLY,
-	RESPST_ATOMIC_WRITE_REPLY,
-	RESPST_PROCESS_FLUSH,
-	RESPST_COMPLETE,
-	RESPST_ACKNOWLEDGE,
-	RESPST_CLEANUP,
-	RESPST_DUPLICATE_REQUEST,
-	RESPST_ERR_MALFORMED_WQE,
-	RESPST_ERR_UNSUPPORTED_OPCODE,
-	RESPST_ERR_MISALIGNED_ATOMIC,
-	RESPST_ERR_PSN_OUT_OF_SEQ,
-	RESPST_ERR_MISSING_OPCODE_FIRST,
-	RESPST_ERR_MISSING_OPCODE_LAST_C,
-	RESPST_ERR_MISSING_OPCODE_LAST_D1E,
-	RESPST_ERR_TOO_MANY_RDMA_ATM_REQ,
-	RESPST_ERR_RNR,
-	RESPST_ERR_RKEY_VIOLATION,
-	RESPST_ERR_INVALIDATE_RKEY,
-	RESPST_ERR_LENGTH,
-	RESPST_ERR_CQ_OVERFLOW,
-	RESPST_ERROR,
-	RESPST_DONE,
-	RESPST_EXIT,
-};
-
 void rxe_set_mtu(struct rxe_dev *rxe, unsigned int dev_mtu);
 
 int rxe_add(struct rxe_dev *rxe, unsigned int mtu, const char *ibdev_name);
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
index 26a20f088692..b6fbd9b3d086 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -127,6 +127,43 @@ struct rxe_comp_info {
 	struct rxe_task		task;
 };
 
+/* responder states */
+enum resp_states {
+	RESPST_NONE,
+	RESPST_GET_REQ,
+	RESPST_CHK_PSN,
+	RESPST_CHK_OP_SEQ,
+	RESPST_CHK_OP_VALID,
+	RESPST_CHK_RESOURCE,
+	RESPST_CHK_LENGTH,
+	RESPST_CHK_RKEY,
+	RESPST_EXECUTE,
+	RESPST_READ_REPLY,
+	RESPST_ATOMIC_REPLY,
+	RESPST_ATOMIC_WRITE_REPLY,
+	RESPST_PROCESS_FLUSH,
+	RESPST_COMPLETE,
+	RESPST_ACKNOWLEDGE,
+	RESPST_CLEANUP,
+	RESPST_DUPLICATE_REQUEST,
+	RESPST_ERR_MALFORMED_WQE,
+	RESPST_ERR_UNSUPPORTED_OPCODE,
+	RESPST_ERR_MISALIGNED_ATOMIC,
+	RESPST_ERR_PSN_OUT_OF_SEQ,
+	RESPST_ERR_MISSING_OPCODE_FIRST,
+	RESPST_ERR_MISSING_OPCODE_LAST_C,
+	RESPST_ERR_MISSING_OPCODE_LAST_D1E,
+	RESPST_ERR_TOO_MANY_RDMA_ATM_REQ,
+	RESPST_ERR_RNR,
+	RESPST_ERR_RKEY_VIOLATION,
+	RESPST_ERR_INVALIDATE_RKEY,
+	RESPST_ERR_LENGTH,
+	RESPST_ERR_CQ_OVERFLOW,
+	RESPST_ERROR,
+	RESPST_DONE,
+	RESPST_EXIT,
+};
+
 enum rdatm_res_state {
 	rdatm_res_state_next,
 	rdatm_res_state_new,
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH for-next v4 5/8] RDMA/rxe: Add page invalidation support
  2023-04-19  5:51 [PATCH for-next v4 0/8] On-Demand Paging on SoftRoCE Daisuke Matsuda
                   ` (3 preceding siblings ...)
  2023-04-19  5:51 ` [PATCH for-next v4 4/8] RDMA/rxe: Move resp_states definition to rxe_verbs.h Daisuke Matsuda
@ 2023-04-19  5:51 ` Daisuke Matsuda
  2023-04-19  5:51 ` [PATCH for-next v4 6/8] RDMA/rxe: Allow registering MRs for On-Demand Paging Daisuke Matsuda
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Daisuke Matsuda @ 2023-04-19  5:51 UTC (permalink / raw)
  To: linux-rdma, leonro, jgg, zyjzyj2000
  Cc: linux-kernel, rpearsonhpe, yangx.jy, lizhijian, Daisuke Matsuda

On page invalidation, an MMU notifier callback is invoked to unmap the DMA
addresses and update the driver page table (umem_odp->dma_list). It also
sets the corresponding entries in the MR xarray to NULL to prevent any
further access. The callback is registered when an ODP-enabled MR is
created.

Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
---
 drivers/infiniband/sw/rxe/Makefile  |  2 ++
 drivers/infiniband/sw/rxe/rxe_odp.c | 56 +++++++++++++++++++++++++++++
 2 files changed, 58 insertions(+)
 create mode 100644 drivers/infiniband/sw/rxe/rxe_odp.c

diff --git a/drivers/infiniband/sw/rxe/Makefile b/drivers/infiniband/sw/rxe/Makefile
index 5395a581f4bb..93134f1d1d0c 100644
--- a/drivers/infiniband/sw/rxe/Makefile
+++ b/drivers/infiniband/sw/rxe/Makefile
@@ -23,3 +23,5 @@ rdma_rxe-y := \
 	rxe_task.o \
 	rxe_net.o \
 	rxe_hw_counters.o
+
+rdma_rxe-$(CONFIG_INFINIBAND_ON_DEMAND_PAGING) += rxe_odp.o
diff --git a/drivers/infiniband/sw/rxe/rxe_odp.c b/drivers/infiniband/sw/rxe/rxe_odp.c
new file mode 100644
index 000000000000..b69b25e0fef6
--- /dev/null
+++ b/drivers/infiniband/sw/rxe/rxe_odp.c
@@ -0,0 +1,56 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/*
+ * Copyright (c) 2022-2023 Fujitsu Ltd. All rights reserved.
+ */
+
+#include <linux/hmm.h>
+
+#include <rdma/ib_umem_odp.h>
+
+#include "rxe.h"
+
+static void rxe_mr_unset_xarray(struct rxe_mr *mr, unsigned long start,
+				unsigned long end)
+{
+	unsigned long lower, upper, idx;
+
+	lower = rxe_mr_iova_to_index(mr, start);
+	upper = rxe_mr_iova_to_index(mr, end);
+
+	/* make elements in xarray NULL */
+	spin_lock(&mr->page_list.xa_lock);
+	for (idx = lower; idx <= upper; idx++)
+		__xa_erase(&mr->page_list, idx);
+	spin_unlock(&mr->page_list.xa_lock);
+}
+
+static bool rxe_ib_invalidate_range(struct mmu_interval_notifier *mni,
+				    const struct mmu_notifier_range *range,
+				    unsigned long cur_seq)
+{
+	struct ib_umem_odp *umem_odp =
+		container_of(mni, struct ib_umem_odp, notifier);
+	struct rxe_mr *mr = umem_odp->private;
+	unsigned long start, end;
+
+	if (!mmu_notifier_range_blockable(range))
+		return false;
+
+	mutex_lock(&umem_odp->umem_mutex);
+	mmu_interval_set_seq(mni, cur_seq);
+
+	start = max_t(u64, ib_umem_start(umem_odp), range->start);
+	end = min_t(u64, ib_umem_end(umem_odp), range->end);
+
+	rxe_mr_unset_xarray(mr, start, end);
+
+	/* update umem_odp->dma_list */
+	ib_umem_odp_unmap_dma_pages(umem_odp, start, end);
+
+	mutex_unlock(&umem_odp->umem_mutex);
+	return true;
+}
+
+const struct mmu_interval_notifier_ops rxe_mn_ops = {
+	.invalidate = rxe_ib_invalidate_range,
+};
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH for-next v4 6/8] RDMA/rxe: Allow registering MRs for On-Demand Paging
  2023-04-19  5:51 [PATCH for-next v4 0/8] On-Demand Paging on SoftRoCE Daisuke Matsuda
                   ` (4 preceding siblings ...)
  2023-04-19  5:51 ` [PATCH for-next v4 5/8] RDMA/rxe: Add page invalidation support Daisuke Matsuda
@ 2023-04-19  5:51 ` Daisuke Matsuda
  2023-04-19  5:51 ` [PATCH for-next v4 7/8] RDMA/rxe: Add support for Send/Recv/Write/Read with ODP Daisuke Matsuda
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Daisuke Matsuda @ 2023-04-19  5:51 UTC (permalink / raw)
  To: linux-rdma, leonro, jgg, zyjzyj2000
  Cc: linux-kernel, rpearsonhpe, yangx.jy, lizhijian, Daisuke Matsuda

Allow applications to register an ODP-enabled MR, in which case the flag
IB_ACCESS_ON_DEMAND is passed to rxe_reg_user_mr(). However, no RDMA
operation is supported on such MRs yet; they will be enabled in the
subsequent two patches.

rxe_odp_do_pagefault() is called to initialize an ODP-enabled MR. When
called with the RXE_PAGEFAULT_SNAPSHOT flag, it syncs the process address
space from the CPU page table to the driver page table (dma_list/pfn_list
in umem_odp). Additionally, it can be used to trigger a page fault when
the pages being accessed are not present or do not have the proper
read/write permissions, and possibly to prefetch pages in the future.

Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
---
 drivers/infiniband/sw/rxe/rxe.c       |   7 ++
 drivers/infiniband/sw/rxe/rxe_loc.h   |  14 +++
 drivers/infiniband/sw/rxe/rxe_mr.c    |   9 +-
 drivers/infiniband/sw/rxe/rxe_odp.c   | 120 ++++++++++++++++++++++++++
 drivers/infiniband/sw/rxe/rxe_resp.c  |  15 +++-
 drivers/infiniband/sw/rxe/rxe_verbs.c |   5 +-
 drivers/infiniband/sw/rxe/rxe_verbs.h |   2 +
 7 files changed, 166 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe.c b/drivers/infiniband/sw/rxe/rxe.c
index 54c723a6edda..f2284d27229b 100644
--- a/drivers/infiniband/sw/rxe/rxe.c
+++ b/drivers/infiniband/sw/rxe/rxe.c
@@ -73,6 +73,13 @@ static void rxe_init_device_param(struct rxe_dev *rxe)
 			rxe->ndev->dev_addr);
 
 	rxe->max_ucontext			= RXE_MAX_UCONTEXT;
+
+	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) {
+		rxe->attr.kernel_cap_flags |= IBK_ON_DEMAND_PAGING;
+
+		/* IB_ODP_SUPPORT_IMPLICIT is not supported right now. */
+		rxe->attr.odp_caps.general_caps |= IB_ODP_SUPPORT;
+	}
 }
 
 /* initialize port attributes */
diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h
index 3476726425d9..0f91e23c151e 100644
--- a/drivers/infiniband/sw/rxe/rxe_loc.h
+++ b/drivers/infiniband/sw/rxe/rxe_loc.h
@@ -202,4 +202,18 @@ static inline unsigned int wr_opcode_mask(int opcode, struct rxe_qp *qp)
 	return rxe_wr_opcode_info[opcode].mask[qp->ibqp.qp_type];
 }
 
+/* rxe_odp.c */
+#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
+int rxe_odp_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length,
+			 u64 iova, int access_flags, struct rxe_mr *mr);
+#else /* CONFIG_INFINIBAND_ON_DEMAND_PAGING */
+static inline int
+rxe_odp_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
+		     int access_flags, struct rxe_mr *mr)
+{
+	return -EOPNOTSUPP;
+}
+
+#endif /* CONFIG_INFINIBAND_ON_DEMAND_PAGING */
+
 #endif /* RXE_LOC_H */
diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index ffbac6f5e828..cd368cd096c8 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -318,7 +318,10 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr,
 		return err;
 	}
 
-	return rxe_mr_copy_xarray(mr, iova, addr, length, dir);
+	if (mr->odp_enabled)
+		return -EOPNOTSUPP;
+	else
+		return rxe_mr_copy_xarray(mr, iova, addr, length, dir);
 }
 
 /* copy data in or out of a wqe, i.e. sg list
@@ -527,6 +530,10 @@ int rxe_mr_do_atomic_write(struct rxe_mr *mr, u64 iova, u64 value)
 	struct page *page;
 	u64 *va;
 
+	/* ODP is not supported right now. WIP. */
+	if (mr->odp_enabled)
+		return RESPST_ERR_UNSUPPORTED_OPCODE;
+
 	/* See IBA oA19-28 */
 	if (unlikely(mr->state != RXE_MR_STATE_VALID)) {
 		rxe_dbg_mr(mr, "mr not in valid state");
diff --git a/drivers/infiniband/sw/rxe/rxe_odp.c b/drivers/infiniband/sw/rxe/rxe_odp.c
index b69b25e0fef6..e5497d09c399 100644
--- a/drivers/infiniband/sw/rxe/rxe_odp.c
+++ b/drivers/infiniband/sw/rxe/rxe_odp.c
@@ -24,6 +24,24 @@ static void rxe_mr_unset_xarray(struct rxe_mr *mr, unsigned long start,
 	spin_unlock(&mr->page_list.xa_lock);
 }
 
+static void rxe_mr_set_xarray(struct rxe_mr *mr, unsigned long start,
+			      unsigned long end, unsigned long *pfn_list)
+{
+	unsigned long lower, upper, idx;
+	struct page *page;
+
+	lower = rxe_mr_iova_to_index(mr, start);
+	upper = rxe_mr_iova_to_index(mr, end);
+
+	/* make pages visible in xarray. no sleep while taking the lock */
+	spin_lock(&mr->page_list.xa_lock);
+	for (idx = lower; idx <= upper; idx++) {
+		page = hmm_pfn_to_page(pfn_list[idx]);
+		__xa_store(&mr->page_list, idx, page, GFP_ATOMIC);
+	}
+	spin_unlock(&mr->page_list.xa_lock);
+}
+
 static bool rxe_ib_invalidate_range(struct mmu_interval_notifier *mni,
 				    const struct mmu_notifier_range *range,
 				    unsigned long cur_seq)
@@ -54,3 +72,105 @@ static bool rxe_ib_invalidate_range(struct mmu_interval_notifier *mni,
 const struct mmu_interval_notifier_ops rxe_mn_ops = {
 	.invalidate = rxe_ib_invalidate_range,
 };
+
+#define RXE_PAGEFAULT_RDONLY BIT(1)
+#define RXE_PAGEFAULT_SNAPSHOT BIT(2)
+static int rxe_odp_do_pagefault(struct rxe_mr *mr, u64 user_va, int bcnt, u32 flags)
+{
+	int np;
+	u64 access_mask;
+	bool fault = !(flags & RXE_PAGEFAULT_SNAPSHOT);
+	struct ib_umem_odp *umem_odp = to_ib_umem_odp(mr->umem);
+
+	access_mask = ODP_READ_ALLOWED_BIT;
+	if (umem_odp->umem.writable && !(flags & RXE_PAGEFAULT_RDONLY))
+		access_mask |= ODP_WRITE_ALLOWED_BIT;
+
+	/*
+	 * ib_umem_odp_map_dma_and_lock() locks umem_mutex on success.
+	 * Callers must release the lock later to let invalidation handler
+	 * do its work again.
+	 */
+	np = ib_umem_odp_map_dma_and_lock(umem_odp, user_va, bcnt,
+					  access_mask, fault);
+	if (np < 0)
+		return np;
+
+	/* umem_mutex is still locked here, so we can use hmm_pfn_to_page()
+	 * safely to fetch pages in the range.
+	 */
+	rxe_mr_set_xarray(mr, user_va, user_va + bcnt, umem_odp->pfn_list);
+
+	return np;
+}
+
+static int rxe_odp_init_pages(struct rxe_mr *mr)
+{
+	int ret;
+	struct ib_umem_odp *umem_odp = to_ib_umem_odp(mr->umem);
+
+	ret = rxe_odp_do_pagefault(mr, mr->umem->address, mr->umem->length,
+				   RXE_PAGEFAULT_SNAPSHOT);
+
+	if (ret >= 0)
+		mutex_unlock(&umem_odp->umem_mutex);
+
+	return ret >= 0 ? 0 : ret;
+}
+
+int rxe_odp_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length,
+			 u64 iova, int access_flags, struct rxe_mr *mr)
+{
+	int err;
+	struct ib_umem_odp *umem_odp;
+
+	if (!IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING))
+		return -EOPNOTSUPP;
+
+	rxe_mr_init(access_flags, mr);
+
+	xa_init(&mr->page_list);
+
+	if (!start && length == U64_MAX) {
+		if (iova != 0)
+			return -EINVAL;
+		if (!(rxe->attr.odp_caps.general_caps & IB_ODP_SUPPORT_IMPLICIT))
+			return -EINVAL;
+
+		/* Never reach here, for implicit ODP is not implemented. */
+	}
+
+	umem_odp = ib_umem_odp_get(&rxe->ib_dev, start, length, access_flags,
+				   &rxe_mn_ops);
+	if (IS_ERR(umem_odp)) {
+		rxe_dbg_mr(mr, "Unable to create umem_odp err = %d\n",
+			   (int)PTR_ERR(umem_odp));
+		return PTR_ERR(umem_odp);
+	}
+
+	umem_odp->private = mr;
+
+	mr->odp_enabled = true;
+	mr->umem = &umem_odp->umem;
+	mr->access = access_flags;
+	mr->ibmr.length = length;
+	mr->ibmr.iova = iova;
+	mr->page_offset = ib_umem_offset(&umem_odp->umem);
+
+	err = rxe_odp_init_pages(mr);
+	if (err) {
+		ib_umem_odp_release(umem_odp);
+		return err;
+	}
+
+	err = rxe_mr_fill_pages_from_sgt(mr, &umem_odp->umem.sgt_append.sgt);
+	if (err) {
+		ib_umem_odp_release(umem_odp);
+		return err;
+	}
+
+	mr->state = RXE_MR_STATE_VALID;
+	mr->ibmr.type = IB_MR_TYPE_USER;
+
+	return err;
+}
diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index f915128ed32a..b40c47477be3 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -641,6 +641,10 @@ static enum resp_states process_flush(struct rxe_qp *qp,
 	struct rxe_mr *mr = qp->resp.mr;
 	struct resp_res *res = qp->resp.res;
 
+	/* ODP is not supported right now. WIP. */
+	if (mr->odp_enabled)
+		return RESPST_ERR_UNSUPPORTED_OPCODE;
+
 	/* oA19-14, oA19-15 */
 	if (res && res->replay)
 		return RESPST_ACKNOWLEDGE;
@@ -694,10 +698,13 @@ static enum resp_states atomic_reply(struct rxe_qp *qp,
 	if (!res->replay) {
 		u64 iova = qp->resp.va + qp->resp.offset;
 
-		err = rxe_mr_do_atomic_op(mr, iova, pkt->opcode,
-					  atmeth_comp(pkt),
-					  atmeth_swap_add(pkt),
-					  &res->atomic.orig_val);
+		if (mr->odp_enabled)
+			err = RESPST_ERR_UNSUPPORTED_OPCODE;
+		else
+			err = rxe_mr_do_atomic_op(mr, iova, pkt->opcode,
+						  atmeth_comp(pkt),
+						  atmeth_swap_add(pkt),
+						  &res->atomic.orig_val);
 		if (err)
 			return err;
 
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index dea605b7f683..9c23defdc7b5 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -1274,7 +1274,10 @@ static struct ib_mr *rxe_reg_user_mr(struct ib_pd *ibpd, u64 start,
 	mr->ibmr.pd = ibpd;
 	mr->ibmr.device = ibpd->device;
 
-	err = rxe_mr_init_user(rxe, start, length, iova, access, mr);
+	if (access & IB_ACCESS_ON_DEMAND)
+		err = rxe_odp_mr_init_user(rxe, start, length, iova, access, mr);
+	else
+		err = rxe_mr_init_user(rxe, start, length, iova, access, mr);
 	if (err) {
 		rxe_dbg_mr(mr, "reg_user_mr failed, err = %d", err);
 		goto err_cleanup;
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
index b6fbd9b3d086..de5a982c7c7e 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -333,6 +333,8 @@ struct rxe_mr {
 	u32			nbuf;
 
 	struct xarray		page_list;
+
+	bool			odp_enabled;
 };
 
 static inline unsigned int mr_page_size(struct rxe_mr *mr)
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH for-next v4 7/8] RDMA/rxe: Add support for Send/Recv/Write/Read with ODP
  2023-04-19  5:51 [PATCH for-next v4 0/8] On-Demand Paging on SoftRoCE Daisuke Matsuda
                   ` (5 preceding siblings ...)
  2023-04-19  5:51 ` [PATCH for-next v4 6/8] RDMA/rxe: Allow registering MRs for On-Demand Paging Daisuke Matsuda
@ 2023-04-19  5:51 ` Daisuke Matsuda
  2023-04-19  5:52 ` [PATCH for-next v4 8/8] RDMA/rxe: Add support for the traditional Atomic operations " Daisuke Matsuda
  2023-04-19 16:07 ` [PATCH for-next v4 0/8] On-Demand Paging on SoftRoCE Pearson, Robert B
  8 siblings, 0 replies; 13+ messages in thread
From: Daisuke Matsuda @ 2023-04-19  5:51 UTC (permalink / raw)
  To: linux-rdma, leonro, jgg, zyjzyj2000
  Cc: linux-kernel, rpearsonhpe, yangx.jy, lizhijian, Daisuke Matsuda

rxe_mr_copy() is widely used to copy data to/from a user MR. The requester
uses it to load the payloads of request packets; the responder uses it to
process Send, Write, and Read operations; the completer uses it to copy
data from the response packets of Read and Atomic operations to a user MR.

Allow these operations to be used with ODP by adding a subordinate function
rxe_odp_mr_copy(). It consists of the following steps:
 1. Check the driver page table (umem_odp->dma_list) to see whether the
    pages being accessed are present with the appropriate permissions.
 2. If necessary, trigger a page fault to map the pages.
 3. Update the MR xarray using the PFNs in umem_odp->pfn_list.
 4. Execute the data copy to/from the pages.

umem_mutex is used to ensure that dma_list (an array of addresses of an MR)
is not changed while it is being checked and that the mapped pages are not
invalidated before the data copy completes.

Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
---
 drivers/infiniband/sw/rxe/rxe.c     |  10 +++
 drivers/infiniband/sw/rxe/rxe_loc.h |   8 ++
 drivers/infiniband/sw/rxe/rxe_mr.c  |   2 +-
 drivers/infiniband/sw/rxe/rxe_odp.c | 109 ++++++++++++++++++++++++++++
 4 files changed, 128 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/sw/rxe/rxe.c b/drivers/infiniband/sw/rxe/rxe.c
index f2284d27229b..207a022156f0 100644
--- a/drivers/infiniband/sw/rxe/rxe.c
+++ b/drivers/infiniband/sw/rxe/rxe.c
@@ -79,6 +79,16 @@ static void rxe_init_device_param(struct rxe_dev *rxe)
 
 		/* IB_ODP_SUPPORT_IMPLICIT is not supported right now. */
 		rxe->attr.odp_caps.general_caps |= IB_ODP_SUPPORT;
+
+		rxe->attr.odp_caps.per_transport_caps.ud_odp_caps |= IB_ODP_SUPPORT_SEND;
+		rxe->attr.odp_caps.per_transport_caps.ud_odp_caps |= IB_ODP_SUPPORT_RECV;
+		rxe->attr.odp_caps.per_transport_caps.ud_odp_caps |= IB_ODP_SUPPORT_SRQ_RECV;
+
+		rxe->attr.odp_caps.per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_SEND;
+		rxe->attr.odp_caps.per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_RECV;
+		rxe->attr.odp_caps.per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_WRITE;
+		rxe->attr.odp_caps.per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_READ;
+		rxe->attr.odp_caps.per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_SRQ_RECV;
 	}
 }
 
diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h
index 0f91e23c151e..35c2ccb2fdd9 100644
--- a/drivers/infiniband/sw/rxe/rxe_loc.h
+++ b/drivers/infiniband/sw/rxe/rxe_loc.h
@@ -206,6 +206,8 @@ static inline unsigned int wr_opcode_mask(int opcode, struct rxe_qp *qp)
 #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 int rxe_odp_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length,
 			 u64 iova, int access_flags, struct rxe_mr *mr);
+int rxe_odp_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length,
+		    enum rxe_mr_copy_dir dir);
 #else /* CONFIG_INFINIBAND_ON_DEMAND_PAGING */
 static inline int
 rxe_odp_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
@@ -213,6 +215,12 @@ rxe_odp_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
 {
 	return -EOPNOTSUPP;
 }
+static inline int
+rxe_odp_mr_copy(struct rxe_mr *mr, u64 iova, void *addr,
+		int length, enum rxe_mr_copy_dir dir)
+{
+	return -EOPNOTSUPP;
+}
 
 #endif /* CONFIG_INFINIBAND_ON_DEMAND_PAGING */
 
diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index cd368cd096c8..0e3cda59d702 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -319,7 +319,7 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr,
 	}
 
 	if (mr->odp_enabled)
-		return -EOPNOTSUPP;
+		return rxe_odp_mr_copy(mr, iova, addr, length, dir);
 	else
 		return rxe_mr_copy_xarray(mr, iova, addr, length, dir);
 }
diff --git a/drivers/infiniband/sw/rxe/rxe_odp.c b/drivers/infiniband/sw/rxe/rxe_odp.c
index e5497d09c399..cbe5d0c3fcc4 100644
--- a/drivers/infiniband/sw/rxe/rxe_odp.c
+++ b/drivers/infiniband/sw/rxe/rxe_odp.c
@@ -174,3 +174,112 @@ int rxe_odp_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length,
 
 	return err;
 }
+
+static inline bool rxe_is_pagefault_neccesary(struct ib_umem_odp *umem_odp,
+					      u64 iova, int length, u32 perm)
+{
+	int idx;
+	u64 addr;
+	bool need_fault = false;
+
+	addr = iova & (~(BIT(umem_odp->page_shift) - 1));
+
+	/* Skim through all pages that are to be accessed. */
+	while (addr < iova + length) {
+		idx = (addr - ib_umem_start(umem_odp)) >> umem_odp->page_shift;
+
+		if (!(umem_odp->dma_list[idx] & perm)) {
+			need_fault = true;
+			break;
+		}
+
+		addr += BIT(umem_odp->page_shift);
+	}
+	return need_fault;
+}
+
+/* umem mutex must be locked before entering this function. */
+static int rxe_odp_map_range(struct rxe_mr *mr, u64 iova, int length, u32 flags)
+{
+	struct ib_umem_odp *umem_odp = to_ib_umem_odp(mr->umem);
+	const int max_tries = 3;
+	int cnt = 0;
+
+	int err;
+	u64 perm;
+	bool need_fault;
+
+	if (unlikely(length < 1)) {
+		mutex_unlock(&umem_odp->umem_mutex);
+		return -EINVAL;
+	}
+
+	perm = ODP_READ_ALLOWED_BIT;
+	if (!(flags & RXE_PAGEFAULT_RDONLY))
+		perm |= ODP_WRITE_ALLOWED_BIT;
+
+	/*
+	 * A successful return from rxe_odp_do_pagefault() does not guarantee
+	 * that all pages in the range became present. Recheck the DMA address
+	 * array, allowing max 3 tries for pagefault.
+	 */
+	while ((need_fault = rxe_is_pagefault_neccesary(umem_odp,
+							iova, length, perm))) {
+		if (cnt >= max_tries)
+			break;
+
+		mutex_unlock(&umem_odp->umem_mutex);
+
+		/* umem_mutex is locked on success. */
+		err = rxe_odp_do_pagefault(mr, iova, length, flags);
+		if (err < 0)
+			return err;
+
+		cnt++;
+	}
+
+	if (need_fault)
+		return -EFAULT;
+
+	return 0;
+}
+
+int rxe_odp_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length,
+		    enum rxe_mr_copy_dir dir)
+{
+	struct ib_umem_odp *umem_odp = to_ib_umem_odp(mr->umem);
+	u32 flags = 0;
+	int err;
+
+	if (unlikely(!mr->odp_enabled))
+		return -EOPNOTSUPP;
+
+	switch (dir) {
+	case RXE_TO_MR_OBJ:
+		break;
+
+	case RXE_FROM_MR_OBJ:
+		flags = RXE_PAGEFAULT_RDONLY;
+		break;
+
+	default:
+		return -EINVAL;
+	}
+
+	/* If pagefault is not required, umem mutex will be held until data
+	 * copy to the MR completes. Otherwise, it is released and locked
+	 * again in rxe_odp_map_range() to let invalidation handler do its
+	 * work meanwhile.
+	 */
+	mutex_lock(&umem_odp->umem_mutex);
+
+	err = rxe_odp_map_range(mr, iova, length, flags);
+	if (err)
+		return err;
+
+	err =  rxe_mr_copy_xarray(mr, iova, addr, length, dir);
+
+	mutex_unlock(&umem_odp->umem_mutex);
+
+	return err;
+}
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH for-next v4 8/8] RDMA/rxe: Add support for the traditional Atomic operations with ODP
  2023-04-19  5:51 [PATCH for-next v4 0/8] On-Demand Paging on SoftRoCE Daisuke Matsuda
                   ` (6 preceding siblings ...)
  2023-04-19  5:51 ` [PATCH for-next v4 7/8] RDMA/rxe: Add support for Send/Recv/Write/Read with ODP Daisuke Matsuda
@ 2023-04-19  5:52 ` Daisuke Matsuda
  2023-04-19 16:07 ` [PATCH for-next v4 0/8] On-Demand Paging on SoftRoCE Pearson, Robert B
  8 siblings, 0 replies; 13+ messages in thread
From: Daisuke Matsuda @ 2023-04-19  5:52 UTC (permalink / raw)
  To: linux-rdma, leonro, jgg, zyjzyj2000
  Cc: linux-kernel, rpearsonhpe, yangx.jy, lizhijian, Daisuke Matsuda

Enable 'fetch and add' and 'compare and swap' operations to manipulate
data in an ODP-enabled MR. This consists of the following steps:
 1. Check the driver page table (umem_odp->dma_list) to see whether the
    target page is both readable and writable.
 2. If not, trigger a page fault to map the page.
 3. Update the entry in the MR xarray.
 4. Execute the operation.

umem_mutex is used to ensure that dma_list (an array of addresses of an MR)
is not changed while it is being checked and that the target page is not
invalidated before the data access completes.

Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
---
 drivers/infiniband/sw/rxe/rxe.c      |  1 +
 drivers/infiniband/sw/rxe/rxe_loc.h  |  9 +++++++++
 drivers/infiniband/sw/rxe/rxe_odp.c  | 26 ++++++++++++++++++++++++++
 drivers/infiniband/sw/rxe/rxe_resp.c |  5 ++++-
 4 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/sw/rxe/rxe.c b/drivers/infiniband/sw/rxe/rxe.c
index 207a022156f0..abd3267c2873 100644
--- a/drivers/infiniband/sw/rxe/rxe.c
+++ b/drivers/infiniband/sw/rxe/rxe.c
@@ -88,6 +88,7 @@ static void rxe_init_device_param(struct rxe_dev *rxe)
 		rxe->attr.odp_caps.per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_RECV;
 		rxe->attr.odp_caps.per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_WRITE;
 		rxe->attr.odp_caps.per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_READ;
+		rxe->attr.odp_caps.per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_ATOMIC;
 		rxe->attr.odp_caps.per_transport_caps.rc_odp_caps |= IB_ODP_SUPPORT_SRQ_RECV;
 	}
 }
diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h
index 35c2ccb2fdd9..6ff10b15fa32 100644
--- a/drivers/infiniband/sw/rxe/rxe_loc.h
+++ b/drivers/infiniband/sw/rxe/rxe_loc.h
@@ -208,6 +208,9 @@ int rxe_odp_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length,
 			 u64 iova, int access_flags, struct rxe_mr *mr);
 int rxe_odp_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length,
 		    enum rxe_mr_copy_dir dir);
+int rxe_odp_mr_atomic_op(struct rxe_mr *mr, u64 iova, int opcode,
+			 u64 compare, u64 swap_add, u64 *orig_val);
+
 #else /* CONFIG_INFINIBAND_ON_DEMAND_PAGING */
 static inline int
 rxe_odp_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
@@ -221,6 +224,12 @@ rxe_odp_mr_copy(struct rxe_mr *mr, u64 iova, void *addr,
 {
 	return -EOPNOTSUPP;
 }
+static inline int
+rxe_odp_mr_atomic_op(struct rxe_mr *mr, u64 iova, int opcode,
+		     u64 compare, u64 swap_add, u64 *orig_val)
+{
+	return RESPST_ERR_UNSUPPORTED_OPCODE;
+}
 
 #endif /* CONFIG_INFINIBAND_ON_DEMAND_PAGING */
 
diff --git a/drivers/infiniband/sw/rxe/rxe_odp.c b/drivers/infiniband/sw/rxe/rxe_odp.c
index cbe5d0c3fcc4..194b1fab98b7 100644
--- a/drivers/infiniband/sw/rxe/rxe_odp.c
+++ b/drivers/infiniband/sw/rxe/rxe_odp.c
@@ -283,3 +283,29 @@ int rxe_odp_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length,
 
 	return err;
 }
+
+int rxe_odp_mr_atomic_op(struct rxe_mr *mr, u64 iova, int opcode,
+			 u64 compare, u64 swap_add, u64 *orig_val)
+{
+	int err;
+	struct ib_umem_odp *umem_odp = to_ib_umem_odp(mr->umem);
+
+	/* If pagefault is not required, umem mutex will be held until the
+	 * atomic operation completes. Otherwise, it is released and locked
+	 * again in rxe_odp_map_range() to let invalidation handler do its
+	 * work meanwhile.
+	 */
+	mutex_lock(&umem_odp->umem_mutex);
+
+	/* Atomic operations manipulate a single char. */
+	err = rxe_odp_map_range(mr, iova, sizeof(char), 0);
+	if (err)
+		return err;
+
+	err = rxe_mr_do_atomic_op(mr, iova, opcode, compare,
+				  swap_add, orig_val);
+
+	mutex_unlock(&umem_odp->umem_mutex);
+
+	return err;
+}
diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index b40c47477be3..99ad1dec10c7 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -699,7 +699,10 @@ static enum resp_states atomic_reply(struct rxe_qp *qp,
 		u64 iova = qp->resp.va + qp->resp.offset;
 
 		if (mr->odp_enabled)
-			err = RESPST_ERR_UNSUPPORTED_OPCODE;
+			err = rxe_odp_mr_atomic_op(mr, iova, pkt->opcode,
+						   atmeth_comp(pkt),
+						   atmeth_swap_add(pkt),
+						   &res->atomic.orig_val);
 		else
 			err = rxe_mr_do_atomic_op(mr, iova, pkt->opcode,
 						  atmeth_comp(pkt),
-- 
2.39.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* RE: [PATCH for-next v4 0/8] On-Demand Paging on SoftRoCE
  2023-04-19  5:51 [PATCH for-next v4 0/8] On-Demand Paging on SoftRoCE Daisuke Matsuda
                   ` (7 preceding siblings ...)
  2023-04-19  5:52 ` [PATCH for-next v4 8/8] RDMA/rxe: Add support for the traditional Atomic operations " Daisuke Matsuda
@ 2023-04-19 16:07 ` Pearson, Robert B
  2023-04-20  0:28   ` Daisuke Matsuda (Fujitsu)
  8 siblings, 1 reply; 13+ messages in thread
From: Pearson, Robert B @ 2023-04-19 16:07 UTC (permalink / raw)
  To: Daisuke Matsuda, linux-rdma, leonro, jgg, zyjzyj2000
  Cc: linux-kernel, rpearsonhpe, yangx.jy, lizhijian

The work queue patch has been submitted and is waiting for some action. -- Bob

-----Original Message-----
From: Daisuke Matsuda <matsuda-daisuke@fujitsu.com> 
Sent: Wednesday, April 19, 2023 12:52 AM
To: linux-rdma@vger.kernel.org; leonro@nvidia.com; jgg@nvidia.com; zyjzyj2000@gmail.com
Cc: linux-kernel@vger.kernel.org; rpearsonhpe@gmail.com; yangx.jy@fujitsu.com; lizhijian@fujitsu.com; Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
Subject: [PATCH for-next v4 0/8] On-Demand Paging on SoftRoCE

This patch series implements the On-Demand Paging feature on SoftRoCE(rxe) driver, which has been available only in mlx5 driver[1] so far.

The first patch of this series is provided for testing purpose, and it should be dropped in the end. It converts triple tasklets to use workqueue in order to let them sleep during page-fault. Bob Pearson says he will post the patch to do this, and I think we can adopt that. The other patches in this series are, I believe, completed works.

I omitted some contents like the motive behind this series for simplicity.
Please see the cover letter of v3 for more details[2].

[Overview]
When applications register a memory region(MR), RDMA drivers normally pin pages in the MR so that physical addresses are never changed during RDMA communication. This requires the MR to fit in physical memory and inevitably leads to memory pressure. On the other hand, On-Demand Paging
(ODP) allows applications to register MRs without pinning pages. They are paged-in when the driver requires and paged-out when the OS reclaims. As a result, it is possible to register a large MR that does not fit in physical memory without taking up so much physical memory.

[How does ODP work?]
"struct ib_umem_odp" is used to manage pages. It is created for each ODP-enabled MR on its registration. This struct holds a pair of arrays
(dma_list/pfn_list) that serve as a driver page table. DMA addresses and PFNs are stored in the driver page table. They are updated on page-in and page-out, both of which use the common interfaces in the ib_uverbs layer.

Page-in can occur when requester, responder or completer access an MR in order to process RDMA operations. If they find that the pages being accessed are not present on physical memory or requisite permissions are not set on the pages, they provoke page fault to make the pages present with proper permissions and at the same time update the driver page table.
After confirming the presence of the pages, they execute memory access such as read, write or atomic operations.

Page-out is triggered by page reclaim or filesystem events (e.g. metadata update of a file that is being used as an MR). When creating an ODP-enabled MR, the driver registers an MMU notifier callback. When the kernel issues a page invalidation notification, the callback is provoked to unmap DMA addresses and update the driver page table. After that, the kernel releases the pages.

[Supported operations]
All traditional operations are supported on RC connection. The new Atomic write[3] and RDMA Flush[4] operations are not included in this patchset. I will post them later after this patchset is merged. On UD connection, Send, Recv, and SRQ-Recv are supported.

[How to test ODP?]
There are only a few resources available for testing. pyverbs testcases in rdma-core and perftest[5] are recommendable ones. Other than them, the ibv_rc_pingpong command can also used for testing. Note that you may have to build perftest from upstream because older versions do not handle ODP capabilities correctly.

The tree is available from github:
https://github.com/daimatsuda/linux/tree/odp_v4
While this series is based on commit f605f26ea196, the tree includes an additional bugfix, which is yet to be merged as of today (Apr 19th, 2023).
https://lore.kernel.org/linux-rdma/20230418090642.1849358-1-matsuda-daisuke@fujitsu.com/

[Future work]
My next work is to enable the new Atomic write[3] and RDMA Flush[4] operations with ODP. After that, I am going to implement the prefetch feature. It allows applications to trigger page faults using
ibv_advise_mr(3) to optimize performance; see the sketch after this message. Some existing software, such as librpma[6], uses this feature. Additionally, I think we can also add the implicit ODP feature in the future.

[1] [RFC 00/20] On demand paging
https://www.spinics.net/lists/linux-rdma/msg18906.html

[2] [PATCH for-next v3 0/7] On-Demand Paging on SoftRoCE
https://lore.kernel.org/lkml/cover.1671772917.git.matsuda-daisuke@fujitsu.com/

[3] [PATCH v7 0/8] RDMA/rxe: Add atomic write operation
https://lore.kernel.org/linux-rdma/1669905432-14-1-git-send-email-yangx.jy@fujitsu.com/

[4] [for-next PATCH 00/10] RDMA/rxe: Add RDMA FLUSH operation
https://lore.kernel.org/lkml/20221206130201.30986-1-lizhijian@fujitsu.com/

[5] linux-rdma/perftest: Infiniband Verbs Performance Tests
https://github.com/linux-rdma/perftest

[6] librpma: Remote Persistent Memory Access Library
https://github.com/pmem/rpma

v3->v4:
 1) Re-designed functions that access MRs to use the MR xarray.
 2) Rebased onto the latest jgg-for-next tree.

v2->v3:
 1) Removed a patch that changes the common ib_uverbs layer.
 2) Re-implemented patches for conversion to workqueue.
 3) Fixed compile errors (happened when CONFIG_INFINIBAND_ON_DEMAND_PAGING=n).
 4) Fixed some functions that returned incorrect errors.
 5) Temporarily disabled ODP for RDMA Flush and Atomic Write.

v1->v2:
 1) Fixed a crash issue reported by Haris Iqbal.
 2) Tried to make lock patterns clearer, as pointed out by Romanovsky.
 3) Minor clean ups and fixes.

Daisuke Matsuda (8):
  RDMA/rxe: Tentative workqueue implementation
  RDMA/rxe: Always schedule works before accessing user MRs
  RDMA/rxe: Make MR functions accessible from other rxe source code
  RDMA/rxe: Move resp_states definition to rxe_verbs.h
  RDMA/rxe: Add page invalidation support
  RDMA/rxe: Allow registering MRs for On-Demand Paging
  RDMA/rxe: Add support for Send/Recv/Write/Read with ODP
  RDMA/rxe: Add support for the traditional Atomic operations with ODP

 drivers/infiniband/sw/rxe/Makefile    |   2 +
 drivers/infiniband/sw/rxe/rxe.c       |  27 ++-
 drivers/infiniband/sw/rxe/rxe.h       |  37 ---
 drivers/infiniband/sw/rxe/rxe_comp.c  |  12 +-
 drivers/infiniband/sw/rxe/rxe_loc.h   |  49 +++-
 drivers/infiniband/sw/rxe/rxe_mr.c    |  27 +--
 drivers/infiniband/sw/rxe/rxe_odp.c   | 311 ++++++++++++++++++++++++++
 drivers/infiniband/sw/rxe/rxe_recv.c  |   4 +-
 drivers/infiniband/sw/rxe/rxe_resp.c  |  32 ++-
 drivers/infiniband/sw/rxe/rxe_task.c  |  84 ++++---
 drivers/infiniband/sw/rxe/rxe_task.h  |   6 +-
 drivers/infiniband/sw/rxe/rxe_verbs.c |   5 +-
 drivers/infiniband/sw/rxe/rxe_verbs.h |  39 ++++
 13 files changed, 535 insertions(+), 100 deletions(-)
 create mode 100644 drivers/infiniband/sw/rxe/rxe_odp.c

base-commit: f605f26ea196a3b49bea249330cbd18dba61a33e

--
2.39.1
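
The prefetch interface mentioned under [Future work] already exists in rdma-core as ibv_advise_mr(3). A rough sketch of how an application would ask to fault in part of an ODP MR ahead of time is shown below; the helper name is illustrative, and it assumes the MR was registered with IBV_ACCESS_ON_DEMAND on a driver that reports the prefetch capability.

#include <stdint.h>
#include <infiniband/verbs.h>

static int prefetch_odp_range(struct ibv_pd *pd, struct ibv_mr *mr,
			      uint64_t offset, uint32_t length)
{
	struct ibv_sge sge = {
		.addr   = (uint64_t)(uintptr_t)mr->addr + offset,
		.length = length,
		.lkey   = mr->lkey,
	};

	/* Ask the driver to make the pages present and writable before
	 * RDMA traffic actually touches them. Returns 0 or an errno value.
	 */
	return ibv_advise_mr(pd, IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE,
			     IBV_ADVISE_MR_FLAG_FLUSH, &sge, 1);
}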


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH for-next v4 2/8] RDMA/rxe: Always schedule works before accessing user MRs
  2023-04-19  5:51 ` [PATCH for-next v4 2/8] RDMA/rxe: Always schedule works before accessing user MRs Daisuke Matsuda
@ 2023-04-19 19:37   ` kernel test robot
  0 siblings, 0 replies; 13+ messages in thread
From: kernel test robot @ 2023-04-19 19:37 UTC (permalink / raw)
  To: Daisuke Matsuda, linux-rdma, leonro, jgg, zyjzyj2000
  Cc: llvm, oe-kbuild-all, linux-kernel, rpearsonhpe, yangx.jy,
	lizhijian, Daisuke Matsuda

Hi Daisuke,

kernel test robot noticed the following build warnings:

[auto build test WARNING on f605f26ea196a3b49bea249330cbd18dba61a33e]

url:    https://github.com/intel-lab-lkp/linux/commits/Daisuke-Matsuda/RDMA-rxe-Tentative-workqueue-implementation/20230419-135731
base:   f605f26ea196a3b49bea249330cbd18dba61a33e
patch link:    https://lore.kernel.org/r/7441c59fcea601c03c70ec03b5d17a69032e51f8.1681882651.git.matsuda-daisuke%40fujitsu.com
patch subject: [PATCH for-next v4 2/8] RDMA/rxe: Always schedule works before accessing user MRs
config: x86_64-randconfig-a011-20230417 (https://download.01.org/0day-ci/archive/20230420/202304200354.oGlN33Lg-lkp@intel.com/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/493fb0777100e2e1b6358176e84b4b29372105ae
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Daisuke-Matsuda/RDMA-rxe-Tentative-workqueue-implementation/20230419-135731
        git checkout 493fb0777100e2e1b6358176e84b4b29372105ae
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/infiniband/sw/rxe/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202304200354.oGlN33Lg-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/infiniband/sw/rxe/rxe_comp.c:139:36: warning: converting the enum constant to a boolean [-Wint-in-bool-context]
           if (pkt->mask | (RXE_PAYLOAD_MASK || RXE_ATMACK_MASK))
                                             ^
   1 warning generated.


vim +139 drivers/infiniband/sw/rxe/rxe_comp.c

   128	
   129	void rxe_comp_queue_pkt(struct rxe_pkt_info *pkt, struct sk_buff *skb)
   130	{
   131		struct rxe_qp *qp = pkt->qp;
   132		int must_sched;
   133	
   134		skb_queue_tail(&qp->resp_pkts, skb);
   135	
   136		/* Schedule the task if processing Read responses or Atomic acks.
   137		 * In these cases, completer may sleep to access ODP-enabled MRs.
   138		 */
 > 139		if (pkt->mask | (RXE_PAYLOAD_MASK || RXE_ATMACK_MASK))
   140			must_sched = 1;
   141		else
   142			must_sched = skb_queue_len(&qp->resp_pkts) > 1;
   143	
   144		if (must_sched != 0)
   145			rxe_counter_inc(SKB_TO_PKT(skb)->rxe, RXE_CNT_COMPLETER_SCHED);
   146	
   147		if (must_sched)
   148			rxe_sched_task(&qp->comp.task);
   149		else
   150			rxe_run_task(&qp->comp.task);
   151	}
   152	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests
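
For reference, the -Wint-in-bool-context warning above points at the operator choice on line 139: given the comment about Read responses and Atomic acks, the check is presumably meant to be a bitwise mask test, along the lines of:

	if (pkt->mask & (RXE_PAYLOAD_MASK | RXE_ATMACK_MASK))
		must_sched = 1;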

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH for-next v4 0/8] On-Demand Paging on SoftRoCE
  2023-04-19 16:07 ` [PATCH for-next v4 0/8] On-Demand Paging on SoftRoCE Pearson, Robert B
@ 2023-04-20  0:28   ` Daisuke Matsuda (Fujitsu)
  2023-04-27 15:47     ` Bob Pearson
  0 siblings, 1 reply; 13+ messages in thread
From: Daisuke Matsuda (Fujitsu) @ 2023-04-20  0:28 UTC (permalink / raw)
  To: 'Pearson, Robert B', linux-rdma, leonro, jgg, zyjzyj2000
  Cc: linux-kernel, rpearsonhpe, Xiao Yang (Fujitsu), Zhijian Li (Fujitsu)

On Thu, April 20, 2023 1:07 AM Pearson, Robert B wrote:
> 
> The work queue patch has been submitted and is waiting for some action. -- Bob

Hi,
Could you tell me which one it is? I am willing to review it.

This seems to be your latest work queue patch:
https://lore.kernel.org/all/TYCPR01MB8455A2D0B3303FD90B3BB6F1E58B9@TYCPR01MB8455.jpnprd01.prod.outlook.com/
I cannot find any newer one on the mailing list or on Patchwork.

Daisuke


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH for-next v4 0/8] On-Demand Paging on SoftRoCE
  2023-04-20  0:28   ` Daisuke Matsuda (Fujitsu)
@ 2023-04-27 15:47     ` Bob Pearson
  0 siblings, 0 replies; 13+ messages in thread
From: Bob Pearson @ 2023-04-27 15:47 UTC (permalink / raw)
  To: Daisuke Matsuda (Fujitsu), 'Pearson, Robert B',
	linux-rdma, leonro, jgg, zyjzyj2000
  Cc: linux-kernel, Xiao Yang (Fujitsu), Zhijian Li (Fujitsu)

On 4/19/23 19:28, Daisuke Matsuda (Fujitsu) wrote:
> On Thu, April 20, 2023 1:07 AM Pearson, Robert B wrote:
>>
>> The work queue patch has been submitted and is waiting for some action. -- Bob
> 
> Hi,
> Could you tell me which is it? I am willing to review it.
> 
> This seems to be your latest work queue patch:
> https://lore.kernel.org/all/TYCPR01MB8455A2D0B3303FD90B3BB6F1E58B9@TYCPR01MB8455.jpnprd01.prod.outlook.com/
> I cannot find any one newer on the mailing list nor on the Patchwork.
> 
> Daisuke

Daisuke,

Sorry for the delay. I've been on another project for a few days.
I can't find it either. After the fix to qp counting in task.c, the work queue patch is almost trivial.
I'll send it again.

Bob

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2023-04-27 15:47 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-19  5:51 [PATCH for-next v4 0/8] On-Demand Paging on SoftRoCE Daisuke Matsuda
2023-04-19  5:51 ` [PATCH for-next v4 1/8] RDMA/rxe: Tentative workqueue implementation Daisuke Matsuda
2023-04-19  5:51 ` [PATCH for-next v4 2/8] RDMA/rxe: Always schedule works before accessing user MRs Daisuke Matsuda
2023-04-19 19:37   ` kernel test robot
2023-04-19  5:51 ` [PATCH for-next v4 3/8] RDMA/rxe: Make MR functions accessible from other rxe source code Daisuke Matsuda
2023-04-19  5:51 ` [PATCH for-next v4 4/8] RDMA/rxe: Move resp_states definition to rxe_verbs.h Daisuke Matsuda
2023-04-19  5:51 ` [PATCH for-next v4 5/8] RDMA/rxe: Add page invalidation support Daisuke Matsuda
2023-04-19  5:51 ` [PATCH for-next v4 6/8] RDMA/rxe: Allow registering MRs for On-Demand Paging Daisuke Matsuda
2023-04-19  5:51 ` [PATCH for-next v4 7/8] RDMA/rxe: Add support for Send/Recv/Write/Read with ODP Daisuke Matsuda
2023-04-19  5:52 ` [PATCH for-next v4 8/8] RDMA/rxe: Add support for the traditional Atomic operations " Daisuke Matsuda
2023-04-19 16:07 ` [PATCH for-next v4 0/8] On-Demand Paging on SoftRoCE Pearson, Robert B
2023-04-20  0:28   ` Daisuke Matsuda (Fujitsu)
2023-04-27 15:47     ` Bob Pearson
