* [PATCH for-next v3 00/13] Implement work queues for rdma_rxe
@ 2022-10-29  3:09 Bob Pearson
  2022-10-29  3:09 ` [PATCH for-next v3 01/13] RDMA/rxe: Make task interface pluggable Bob Pearson
                   ` (14 more replies)
  0 siblings, 15 replies; 23+ messages in thread
From: Bob Pearson @ 2022-10-29  3:09 UTC (permalink / raw)
  To: jgg, leon, zyjzyj2000, jhack, linux-rdma; +Cc: Bob Pearson

This patch series implements work queues as an alternative for
the main tasklets in the rdma_rxe driver. The patch series starts
with a patch that makes the internal API for task execution pluggable
and implements an inline and a tasklet-based set of functions.
The remaining patches clean up the qp reset and error code in the
three tasklets and modify the locking logic to prevent multiple
calls to the tasklet scheduling routine. After this preparation the
work queue equivalent set of functions is added and the tasklet
version is dropped.
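
For reference, the pluggable interface introduced in patch 01 boils
down to a per-task ops vector that the public entry points dispatch
through, with the backend chosen when the task is created. Roughly
(simplified from the patches below):

	struct rxe_task_ops {
		void (*sched)(struct rxe_task *task);
		void (*run)(struct rxe_task *task);
		void (*disable)(struct rxe_task *task);
		void (*enable)(struct rxe_task *task);
		void (*cleanup)(struct rxe_task *task);
	};

	void rxe_sched_task(struct rxe_task *task)
	{
		task->ops->sched(task);	/* inline, tasklet or (later) workqueue */
	}

	/* backend selected at init time, e.g. for the responder */
	rxe_init_task(&qp->resp.task, qp, rxe_responder,
		      RXE_TASK_TYPE_WORKQUEUE);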

The main advantage of the work queue version of deferred task execution
is that it has much better scalability and overall performance than the
tasklet variant. The perftest microbenchmarks in local loopback mode
(not a very realistic test case) can reach approximately 100Gb/sec with
work queues compared to about 16Gb/sec for tasklets.

This version of the patch series drops the tasklet version as an option
but keeps the option of switching between the workqueue and inline
versions.

This patch series is derived from an earlier patch set developed by
Ian Ziemba at HPE, which is used in some Lustre storage clients attached
to Lustre servers with hardware RoCE v2 NICs.

It is based on the current version of wip/jgg-for-next.

v3:
Link: https://lore.kernel.org/linux-rdma/202210220559.f7taTL8S-lkp@intel.com/
The v3 version drops the first few patches, which have already been
accepted into for-next, as well as the last patch of the v2 version,
which introduced module parameters to select between the task interfaces.
It also drops the tasklet version entirely and fixes a minor error
(a missing static declaration) caught by the kernel test robot
<lkp@intel.com>.

v2:
The v2 version of the patch set has some minor changes that address
comments from Leon Romanovsky regarding locking of the valid parameter
and the setup parameters for alloc_workqueue. It also has one
additional cleanup patch.

Bob Pearson (13):
  RDMA/rxe: Make task interface pluggable
  RDMA/rxe: Split rxe_drain_resp_pkts()
  RDMA/rxe: Simplify reset state handling in rxe_resp.c
  RDMA/rxe: Handle qp error in rxe_resp.c
  RDMA/rxe: Cleanup comp tasks in rxe_qp.c
  RDMA/rxe: Remove __rxe_do_task()
  RDMA/rxe: Make tasks schedule each other
  RDMA/rxe: Implement disable/enable_task()
  RDMA/rxe: Replace TASK_STATE_START by TASK_STATE_IDLE
  RDMA/rxe: Replace task->destroyed by task state INVALID.
  RDMA/rxe: Add workqueue support for tasks
  RDMA/rxe: Make WORKQUEUE default for RC tasks
  RDMA/rxe: Remove tasklets from rxe_task.c

 drivers/infiniband/sw/rxe/rxe.c      |   9 +-
 drivers/infiniband/sw/rxe/rxe_comp.c |  24 ++-
 drivers/infiniband/sw/rxe/rxe_qp.c   |  80 ++++-----
 drivers/infiniband/sw/rxe/rxe_req.c  |   4 +-
 drivers/infiniband/sw/rxe/rxe_resp.c |  70 +++++---
 drivers/infiniband/sw/rxe/rxe_task.c | 258 +++++++++++++++++++--------
 drivers/infiniband/sw/rxe/rxe_task.h |  56 +++---
 7 files changed, 329 insertions(+), 172 deletions(-)


base-commit: 692373d186205dfb1b56f35f22702412d94d9420
-- 
2.34.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH for-next v3 01/13] RDMA/rxe: Make task interface pluggable
  2022-10-29  3:09 [PATCH for-next v3 00/13] Implement work queues for rdma_rxe Bob Pearson
@ 2022-10-29  3:09 ` Bob Pearson
  2022-11-11  2:28   ` Yanjun Zhu
  2022-10-29  3:09 ` [PATCH for-next v3 02/13] RDMA/rxe: Split rxe_drain_resp_pkts() Bob Pearson
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 23+ messages in thread
From: Bob Pearson @ 2022-10-29  3:09 UTC (permalink / raw)
  To: jgg, leon, zyjzyj2000, jhack, linux-rdma; +Cc: Bob Pearson, Ian Ziemba

Make the internal interface to the task operations pluggable and
add a new 'inline' type.

Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_qp.c   |   8 +-
 drivers/infiniband/sw/rxe/rxe_task.c | 160 ++++++++++++++++++++++-----
 drivers/infiniband/sw/rxe/rxe_task.h |  44 +++++---
 3 files changed, 165 insertions(+), 47 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
index 3f6d62a80bea..b5e108794aa1 100644
--- a/drivers/infiniband/sw/rxe/rxe_qp.c
+++ b/drivers/infiniband/sw/rxe/rxe_qp.c
@@ -238,8 +238,10 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
 
 	skb_queue_head_init(&qp->req_pkts);
 
-	rxe_init_task(&qp->req.task, qp, rxe_requester);
-	rxe_init_task(&qp->comp.task, qp, rxe_completer);
+	rxe_init_task(&qp->req.task, qp, rxe_requester, RXE_TASK_TYPE_TASKLET);
+	rxe_init_task(&qp->comp.task, qp, rxe_completer,
+			(qp_type(qp) == IB_QPT_RC) ? RXE_TASK_TYPE_TASKLET :
+						     RXE_TASK_TYPE_INLINE);
 
 	qp->qp_timeout_jiffies = 0; /* Can't be set for UD/UC in modify_qp */
 	if (init->qp_type == IB_QPT_RC) {
@@ -286,7 +288,7 @@ static int rxe_qp_init_resp(struct rxe_dev *rxe, struct rxe_qp *qp,
 
 	skb_queue_head_init(&qp->resp_pkts);
 
-	rxe_init_task(&qp->resp.task, qp, rxe_responder);
+	rxe_init_task(&qp->resp.task, qp, rxe_responder, RXE_TASK_TYPE_TASKLET);
 
 	qp->resp.opcode		= OPCODE_NONE;
 	qp->resp.msn		= 0;
diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
index 0208d833a41b..8dfbfa164eff 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.c
+++ b/drivers/infiniband/sw/rxe/rxe_task.c
@@ -24,12 +24,11 @@ int __rxe_do_task(struct rxe_task *task)
  * a second caller finds the task already running
  * but looks just after the last call to func
  */
-static void do_task(struct tasklet_struct *t)
+static void do_task(struct rxe_task *task)
 {
+	unsigned int iterations = RXE_MAX_ITERATIONS;
 	int cont;
 	int ret;
-	struct rxe_task *task = from_tasklet(task, t, tasklet);
-	unsigned int iterations = RXE_MAX_ITERATIONS;
 
 	spin_lock_bh(&task->lock);
 	switch (task->state) {
@@ -90,28 +89,21 @@ static void do_task(struct tasklet_struct *t)
 	task->ret = ret;
 }
 
-int rxe_init_task(struct rxe_task *task, void *arg, int (*func)(void *))
+static void disable_task(struct rxe_task *task)
 {
-	task->arg	= arg;
-	task->func	= func;
-	task->destroyed	= false;
-
-	tasklet_setup(&task->tasklet, do_task);
-
-	task->state = TASK_STATE_START;
-	spin_lock_init(&task->lock);
+	/* todo */
+}
 
-	return 0;
+static void enable_task(struct rxe_task *task)
+{
+	/* todo */
 }
 
-void rxe_cleanup_task(struct rxe_task *task)
+/* busy wait until any previous tasks are done */
+static void cleanup_task(struct rxe_task *task)
 {
 	bool idle;
 
-	/*
-	 * Mark the task, then wait for it to finish. It might be
-	 * running in a non-tasklet (direct call) context.
-	 */
 	task->destroyed = true;
 
 	do {
@@ -119,32 +111,144 @@ void rxe_cleanup_task(struct rxe_task *task)
 		idle = (task->state == TASK_STATE_START);
 		spin_unlock_bh(&task->lock);
 	} while (!idle);
+}
 
-	tasklet_kill(&task->tasklet);
+/* silently treat schedule as inline for inline tasks */
+static void inline_sched(struct rxe_task *task)
+{
+	do_task(task);
 }
 
-void rxe_run_task(struct rxe_task *task)
+static void inline_run(struct rxe_task *task)
 {
-	if (task->destroyed)
-		return;
+	do_task(task);
+}
 
-	do_task(&task->tasklet);
+static void inline_disable(struct rxe_task *task)
+{
+	disable_task(task);
 }
 
-void rxe_sched_task(struct rxe_task *task)
+static void inline_enable(struct rxe_task *task)
 {
-	if (task->destroyed)
-		return;
+	enable_task(task);
+}
+
+static void inline_cleanup(struct rxe_task *task)
+{
+	cleanup_task(task);
+}
+
+static const struct rxe_task_ops inline_ops = {
+	.sched = inline_sched,
+	.run = inline_run,
+	.enable = inline_enable,
+	.disable = inline_disable,
+	.cleanup = inline_cleanup,
+};
 
+static void inline_init(struct rxe_task *task)
+{
+	task->ops = &inline_ops;
+}
+
+/* use tsklet_xxx to avoid name collisions with tasklet_xxx */
+static void tsklet_sched(struct rxe_task *task)
+{
 	tasklet_schedule(&task->tasklet);
 }
 
-void rxe_disable_task(struct rxe_task *task)
+static void tsklet_do_task(struct tasklet_struct *tasklet)
 {
+	struct rxe_task *task = container_of(tasklet, typeof(*task), tasklet);
+
+	do_task(task);
+}
+
+static void tsklet_run(struct rxe_task *task)
+{
+	do_task(task);
+}
+
+static void tsklet_disable(struct rxe_task *task)
+{
+	disable_task(task);
 	tasklet_disable(&task->tasklet);
 }
 
-void rxe_enable_task(struct rxe_task *task)
+static void tsklet_enable(struct rxe_task *task)
 {
 	tasklet_enable(&task->tasklet);
+	enable_task(task);
+}
+
+static void tsklet_cleanup(struct rxe_task *task)
+{
+	cleanup_task(task);
+	tasklet_kill(&task->tasklet);
+}
+
+static const struct rxe_task_ops tsklet_ops = {
+	.sched = tsklet_sched,
+	.run = tsklet_run,
+	.enable = tsklet_enable,
+	.disable = tsklet_disable,
+	.cleanup = tsklet_cleanup,
+};
+
+static void tsklet_init(struct rxe_task *task)
+{
+	tasklet_setup(&task->tasklet, tsklet_do_task);
+	task->ops = &tsklet_ops;
+}
+
+int rxe_init_task(struct rxe_task *task, void *arg, int (*func)(void *),
+		  enum rxe_task_type type)
+{
+	task->arg	= arg;
+	task->func	= func;
+	task->destroyed	= false;
+	task->type	= type;
+	task->state	= TASK_STATE_START;
+
+	spin_lock_init(&task->lock);
+
+	switch (type) {
+	case RXE_TASK_TYPE_INLINE:
+		inline_init(task);
+		break;
+	case RXE_TASK_TYPE_TASKLET:
+		tsklet_init(task);
+		break;
+	default:
+		pr_debug("%s: invalid task type = %d\n", __func__, type);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+void rxe_sched_task(struct rxe_task *task)
+{
+	task->ops->sched(task);
+}
+
+void rxe_run_task(struct rxe_task *task)
+{
+	task->ops->run(task);
+}
+
+void rxe_enable_task(struct rxe_task *task)
+{
+	task->ops->enable(task);
+}
+
+void rxe_disable_task(struct rxe_task *task)
+{
+	task->ops->disable(task);
+}
+
+void rxe_cleanup_task(struct rxe_task *task)
+{
+	task->ops->cleanup(task);
 }
diff --git a/drivers/infiniband/sw/rxe/rxe_task.h b/drivers/infiniband/sw/rxe/rxe_task.h
index 7b88129702ac..31963129ff7a 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.h
+++ b/drivers/infiniband/sw/rxe/rxe_task.h
@@ -7,6 +7,21 @@
 #ifndef RXE_TASK_H
 #define RXE_TASK_H
 
+struct rxe_task;
+
+struct rxe_task_ops {
+	void (*sched)(struct rxe_task *task);
+	void (*run)(struct rxe_task *task);
+	void (*disable)(struct rxe_task *task);
+	void (*enable)(struct rxe_task *task);
+	void (*cleanup)(struct rxe_task *task);
+};
+
+enum rxe_task_type {
+	RXE_TASK_TYPE_INLINE	= 0,
+	RXE_TASK_TYPE_TASKLET	= 1,
+};
+
 enum {
 	TASK_STATE_START	= 0,
 	TASK_STATE_BUSY		= 1,
@@ -19,24 +34,19 @@ enum {
  * called again.
  */
 struct rxe_task {
-	struct tasklet_struct	tasklet;
-	int			state;
-	spinlock_t		lock;
-	void			*arg;
-	int			(*func)(void *arg);
-	int			ret;
-	bool			destroyed;
+	struct tasklet_struct		tasklet;
+	int				state;
+	spinlock_t			lock;
+	void				*arg;
+	int				(*func)(void *arg);
+	int				ret;
+	bool				destroyed;
+	const struct rxe_task_ops	*ops;
+	enum rxe_task_type		type;
 };
 
-/*
- * init rxe_task structure
- *	arg  => parameter to pass to fcn
- *	func => function to call until it returns != 0
- */
-int rxe_init_task(struct rxe_task *task, void *arg, int (*func)(void *));
-
-/* cleanup task */
-void rxe_cleanup_task(struct rxe_task *task);
+int rxe_init_task(struct rxe_task *task, void *arg, int (*func)(void *),
+		  enum rxe_task_type type);
 
 /*
  * raw call to func in loop without any checking
@@ -54,4 +64,6 @@ void rxe_disable_task(struct rxe_task *task);
 /* allow task to run */
 void rxe_enable_task(struct rxe_task *task);
 
+void rxe_cleanup_task(struct rxe_task *task);
+
 #endif /* RXE_TASK_H */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH for-next v3 02/13] RDMA/rxe: Split rxe_drain_resp_pkts()
  2022-10-29  3:09 [PATCH for-next v3 00/13] Implement work queues for rdma_rxe Bob Pearson
  2022-10-29  3:09 ` [PATCH for-next v3 01/13] RDMA/rxe: Make task interface pluggable Bob Pearson
@ 2022-10-29  3:09 ` Bob Pearson
  2022-10-29  3:10 ` [PATCH for-next v3 03/13] RDMA/rxe: Simplify reset state handling in rxe_resp.c Bob Pearson
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Bob Pearson @ 2022-10-29  3:09 UTC (permalink / raw)
  To: jgg, leon, zyjzyj2000, jhack, linux-rdma; +Cc: Bob Pearson, Ian Ziemba

Split rxe_drain_resp_pkts() into two subroutines which perform separate
functions.

Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_comp.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
index 66f392810c86..76dc0a4702fd 100644
--- a/drivers/infiniband/sw/rxe/rxe_comp.c
+++ b/drivers/infiniband/sw/rxe/rxe_comp.c
@@ -532,17 +532,21 @@ static inline enum comp_state complete_wqe(struct rxe_qp *qp,
 	return COMPST_GET_WQE;
 }
 
-static void rxe_drain_resp_pkts(struct rxe_qp *qp, bool notify)
+static void rxe_drain_resp_pkts(struct rxe_qp *qp)
 {
 	struct sk_buff *skb;
-	struct rxe_send_wqe *wqe;
-	struct rxe_queue *q = qp->sq.queue;
 
 	while ((skb = skb_dequeue(&qp->resp_pkts))) {
 		rxe_put(qp);
 		kfree_skb(skb);
 		ib_device_put(qp->ibqp.device);
 	}
+}
+
+static void rxe_drain_send_queue(struct rxe_qp *qp, bool notify)
+{
+	struct rxe_send_wqe *wqe;
+	struct rxe_queue *q = qp->sq.queue;
 
 	while ((wqe = queue_head(q, q->type))) {
 		if (notify) {
@@ -573,6 +577,7 @@ int rxe_completer(void *arg)
 	struct sk_buff *skb = NULL;
 	struct rxe_pkt_info *pkt = NULL;
 	enum comp_state state;
+	bool notify;
 	int ret;
 
 	if (!rxe_get(qp))
@@ -580,8 +585,9 @@ int rxe_completer(void *arg)
 
 	if (!qp->valid || qp->comp.state == QP_STATE_ERROR ||
 	    qp->comp.state == QP_STATE_RESET) {
-		rxe_drain_resp_pkts(qp, qp->valid &&
-				    qp->comp.state == QP_STATE_ERROR);
+		notify = qp->valid && (qp->comp.state == QP_STATE_ERROR);
+		rxe_drain_resp_pkts(qp);
+		rxe_drain_send_queue(qp, notify);
 		goto exit;
 	}
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH for-next v3 03/13] RDMA/rxe: Simplify reset state handling in rxe_resp.c
  2022-10-29  3:09 [PATCH for-next v3 00/13] Implement work queues for rdma_rxe Bob Pearson
  2022-10-29  3:09 ` [PATCH for-next v3 01/13] RDMA/rxe: Make task interface pluggable Bob Pearson
  2022-10-29  3:09 ` [PATCH for-next v3 02/13] RDMA/rxe: Split rxe_drain_resp_pkts() Bob Pearson
@ 2022-10-29  3:10 ` Bob Pearson
  2022-11-11  3:04   ` Yanjun Zhu
  2022-10-29  3:10 ` [PATCH for-next v3 04/13] RDMA/rxe: Handle qp error " Bob Pearson
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 23+ messages in thread
From: Bob Pearson @ 2022-10-29  3:10 UTC (permalink / raw)
  To: jgg, leon, zyjzyj2000, jhack, linux-rdma; +Cc: Bob Pearson, Ian Ziemba

Make rxe_responder() more like rxe_completer() and take qp reset
handling out of the state machine.

Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_resp.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index c32bc12cc82f..c4f365449aa5 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -40,7 +40,6 @@ enum resp_states {
 	RESPST_ERR_LENGTH,
 	RESPST_ERR_CQ_OVERFLOW,
 	RESPST_ERROR,
-	RESPST_RESET,
 	RESPST_DONE,
 	RESPST_EXIT,
 };
@@ -75,7 +74,6 @@ static char *resp_state_name[] = {
 	[RESPST_ERR_LENGTH]			= "ERR_LENGTH",
 	[RESPST_ERR_CQ_OVERFLOW]		= "ERR_CQ_OVERFLOW",
 	[RESPST_ERROR]				= "ERROR",
-	[RESPST_RESET]				= "RESET",
 	[RESPST_DONE]				= "DONE",
 	[RESPST_EXIT]				= "EXIT",
 };
@@ -1281,8 +1279,9 @@ int rxe_responder(void *arg)
 
 	switch (qp->resp.state) {
 	case QP_STATE_RESET:
-		state = RESPST_RESET;
-		break;
+		rxe_drain_req_pkts(qp, false);
+		qp->resp.wqe = NULL;
+		goto exit;
 
 	default:
 		state = RESPST_GET_REQ;
@@ -1441,11 +1440,6 @@ int rxe_responder(void *arg)
 
 			goto exit;
 
-		case RESPST_RESET:
-			rxe_drain_req_pkts(qp, false);
-			qp->resp.wqe = NULL;
-			goto exit;
-
 		case RESPST_ERROR:
 			qp->resp.goto_error = 0;
 			pr_debug("qp#%d moved to error state\n", qp_num(qp));
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH for-next v3 04/13] RDMA/rxe: Handle qp error in rxe_resp.c
  2022-10-29  3:09 [PATCH for-next v3 00/13] Implement work queues for rdma_rxe Bob Pearson
                   ` (2 preceding siblings ...)
  2022-10-29  3:10 ` [PATCH for-next v3 03/13] RDMA/rxe: Simplify reset state handling in rxe_resp.c Bob Pearson
@ 2022-10-29  3:10 ` Bob Pearson
  2022-10-29  3:10 ` [PATCH v3 05/13] RDMA/rxe: Cleanup comp tasks in rxe_qp.c Bob Pearson
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Bob Pearson @ 2022-10-29  3:10 UTC (permalink / raw)
  To: jgg, leon, zyjzyj2000, jhack, linux-rdma; +Cc: Bob Pearson, Ian Ziemba

Split rxe_drain_req_pkts() into two subroutines which perform
separate tasks. Handle the qp error and reset states and !qp->valid
in the same way as rxe_comp.c. Flush recv wqes for a qp in the error
state.

Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_resp.c | 64 ++++++++++++++++++++--------
 1 file changed, 47 insertions(+), 17 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index c4f365449aa5..16298d88a9d7 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -1028,7 +1028,6 @@ static enum resp_states do_complete(struct rxe_qp *qp,
 		return RESPST_CLEANUP;
 }
 
-
 static int send_common_ack(struct rxe_qp *qp, u8 syndrome, u32 psn,
 				  int opcode, const char *msg)
 {
@@ -1243,22 +1242,56 @@ static enum resp_states do_class_d1e_error(struct rxe_qp *qp)
 	}
 }
 
-static void rxe_drain_req_pkts(struct rxe_qp *qp, bool notify)
+static void rxe_drain_req_pkts(struct rxe_qp *qp)
 {
 	struct sk_buff *skb;
-	struct rxe_queue *q = qp->rq.queue;
 
 	while ((skb = skb_dequeue(&qp->req_pkts))) {
 		rxe_put(qp);
 		kfree_skb(skb);
 		ib_device_put(qp->ibqp.device);
 	}
+}
+
+static int complete_flush(struct rxe_qp *qp, struct rxe_recv_wqe *wqe)
+{
+	struct rxe_cqe cqe;
+	struct ib_wc *wc = &cqe.ibwc;
+	struct ib_uverbs_wc *uwc = &cqe.uibwc;
+
+	memset(&cqe, 0, sizeof(cqe));
 
-	if (notify)
-		return;
+	if (qp->rcq->is_user) {
+		uwc->status		= IB_WC_WR_FLUSH_ERR;
+		uwc->qp_num		= qp->ibqp.qp_num;
+		uwc->wr_id		= wqe->wr_id;
+	} else {
+		wc->status		= IB_WC_WR_FLUSH_ERR;
+		wc->qp			= &qp->ibqp;
+		wc->wr_id		= wqe->wr_id;
+	}
 
-	while (!qp->srq && q && queue_head(q, q->type))
+	if (rxe_cq_post(qp->rcq, &cqe, 0))
+		return -ENOMEM;
+
+	return 0;
+}
+
+/* drain the receive queue. Complete each wqe with a flush error
+ * if notify is true or until a cq overflow occurs.
+ */
+static void rxe_drain_recv_queue(struct rxe_qp *qp, bool notify)
+{
+	struct rxe_recv_wqe *wqe;
+	struct rxe_queue *q = qp->rq.queue;
+
+	while ((wqe = queue_head(q, q->type))) {
+		if (notify && complete_flush(qp, wqe))
+			notify = 0;
 		queue_advance_consumer(q, q->type);
+	}
+
+	qp->resp.wqe = NULL;
 }
 
 int rxe_responder(void *arg)
@@ -1267,6 +1300,7 @@ int rxe_responder(void *arg)
 	struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
 	enum resp_states state;
 	struct rxe_pkt_info *pkt = NULL;
+	bool notify;
 	int ret;
 
 	if (!rxe_get(qp))
@@ -1274,20 +1308,16 @@ int rxe_responder(void *arg)
 
 	qp->resp.aeth_syndrome = AETH_ACK_UNLIMITED;
 
-	if (!qp->valid)
-		goto exit;
-
-	switch (qp->resp.state) {
-	case QP_STATE_RESET:
-		rxe_drain_req_pkts(qp, false);
-		qp->resp.wqe = NULL;
+	if (!qp->valid || qp->resp.state == QP_STATE_ERROR ||
+	    qp->resp.state == QP_STATE_RESET) {
+		notify = qp->valid && (qp->resp.state == QP_STATE_ERROR);
+		rxe_drain_req_pkts(qp);
+		rxe_drain_recv_queue(qp, notify);
 		goto exit;
-
-	default:
-		state = RESPST_GET_REQ;
-		break;
 	}
 
+	state = RESPST_GET_REQ;
+
 	while (1) {
 		pr_debug("qp#%d state = %s\n", qp_num(qp),
 			 resp_state_name[state]);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v3 05/13] RDMA/rxe: Cleanup comp tasks in rxe_qp.c
  2022-10-29  3:09 [PATCH for-next v3 00/13] Implement work queues for rdma_rxe Bob Pearson
                   ` (3 preceding siblings ...)
  2022-10-29  3:10 ` [PATCH for-next v3 04/13] RDMA/rxe: Handle qp error " Bob Pearson
@ 2022-10-29  3:10 ` Bob Pearson
  2022-10-29  3:10 ` [PATCH for-next v3 06/13] RDMA/rxe: Remove __rxe_do_task() Bob Pearson
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Bob Pearson @ 2022-10-29  3:10 UTC (permalink / raw)
  To: jgg, leon, zyjzyj2000, jhack, linux-rdma; +Cc: Bob Pearson, Ian Ziemba

Take advantage of inline task behavior to clean up code in rxe_qp.c
for completer tasks.

Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_qp.c | 18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
index b5e108794aa1..3691eb97c576 100644
--- a/drivers/infiniband/sw/rxe/rxe_qp.c
+++ b/drivers/infiniband/sw/rxe/rxe_qp.c
@@ -480,8 +480,7 @@ static void rxe_qp_reset(struct rxe_qp *qp)
 
 	/* stop request/comp */
 	if (qp->sq.queue) {
-		if (qp_type(qp) == IB_QPT_RC)
-			rxe_disable_task(&qp->comp.task);
+		rxe_disable_task(&qp->comp.task);
 		rxe_disable_task(&qp->req.task);
 	}
 
@@ -524,9 +523,7 @@ static void rxe_qp_reset(struct rxe_qp *qp)
 	rxe_enable_task(&qp->resp.task);
 
 	if (qp->sq.queue) {
-		if (qp_type(qp) == IB_QPT_RC)
-			rxe_enable_task(&qp->comp.task);
-
+		rxe_enable_task(&qp->comp.task);
 		rxe_enable_task(&qp->req.task);
 	}
 }
@@ -537,10 +534,7 @@ static void rxe_qp_drain(struct rxe_qp *qp)
 	if (qp->sq.queue) {
 		if (qp->req.state != QP_STATE_DRAINED) {
 			qp->req.state = QP_STATE_DRAIN;
-			if (qp_type(qp) == IB_QPT_RC)
-				rxe_sched_task(&qp->comp.task);
-			else
-				__rxe_do_task(&qp->comp.task);
+			rxe_sched_task(&qp->comp.task);
 			rxe_sched_task(&qp->req.task);
 		}
 	}
@@ -556,11 +550,7 @@ void rxe_qp_error(struct rxe_qp *qp)
 
 	/* drain work and packet queues */
 	rxe_sched_task(&qp->resp.task);
-
-	if (qp_type(qp) == IB_QPT_RC)
-		rxe_sched_task(&qp->comp.task);
-	else
-		__rxe_do_task(&qp->comp.task);
+	rxe_sched_task(&qp->comp.task);
 	rxe_sched_task(&qp->req.task);
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH for-next v3 06/13] RDMA/rxe: Remove __rxe_do_task()
  2022-10-29  3:09 [PATCH for-next v3 00/13] Implement work queues for rdma_rxe Bob Pearson
                   ` (4 preceding siblings ...)
  2022-10-29  3:10 ` [PATCH v3 05/13] RDMA/rxe: Cleanup comp tasks in rxe_qp.c Bob Pearson
@ 2022-10-29  3:10 ` Bob Pearson
  2022-10-29  3:10 ` [PATCH for-next v3 07/13] RDMA/rxe: Make tasks schedule each other Bob Pearson
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Bob Pearson @ 2022-10-29  3:10 UTC (permalink / raw)
  To: jgg, leon, zyjzyj2000, jhack, linux-rdma; +Cc: Bob Pearson, Ian Ziemba

The subroutine __rxe_do_task() is not thread-safe. It is only
used in the rxe_qp_reset() and rxe_qp_do_cleanup() routines.
After the changes to error handling in the tasklet functions, the
queues can be drained by calling the task functions once directly,
outside of the tasklet code. This allows __rxe_do_task() to be removed.

Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_qp.c   | 60 ++++++++++++++--------------
 drivers/infiniband/sw/rxe/rxe_task.c | 13 ------
 drivers/infiniband/sw/rxe/rxe_task.h |  6 ---
 3 files changed, 29 insertions(+), 50 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
index 3691eb97c576..50f6b8b8ad9d 100644
--- a/drivers/infiniband/sw/rxe/rxe_qp.c
+++ b/drivers/infiniband/sw/rxe/rxe_qp.c
@@ -477,28 +477,23 @@ static void rxe_qp_reset(struct rxe_qp *qp)
 {
 	/* stop tasks from running */
 	rxe_disable_task(&qp->resp.task);
-
-	/* stop request/comp */
-	if (qp->sq.queue) {
-		rxe_disable_task(&qp->comp.task);
-		rxe_disable_task(&qp->req.task);
-	}
+	rxe_disable_task(&qp->comp.task);
+	rxe_disable_task(&qp->req.task);
 
 	/* move qp to the reset state */
 	qp->req.state = QP_STATE_RESET;
 	qp->comp.state = QP_STATE_RESET;
 	qp->resp.state = QP_STATE_RESET;
 
-	/* let state machines reset themselves drain work and packet queues
-	 * etc.
-	 */
-	__rxe_do_task(&qp->resp.task);
+	/* drain work and packet queues */
+	rxe_responder(qp);
+	rxe_completer(qp);
+	rxe_requester(qp);
 
-	if (qp->sq.queue) {
-		__rxe_do_task(&qp->comp.task);
-		__rxe_do_task(&qp->req.task);
+	if (qp->rq.queue)
+		rxe_queue_reset(qp->rq.queue);
+	if (qp->sq.queue)
 		rxe_queue_reset(qp->sq.queue);
-	}
 
 	/* cleanup attributes */
 	atomic_set(&qp->ssn, 0);
@@ -521,11 +516,8 @@ static void rxe_qp_reset(struct rxe_qp *qp)
 
 	/* reenable tasks */
 	rxe_enable_task(&qp->resp.task);
-
-	if (qp->sq.queue) {
-		rxe_enable_task(&qp->comp.task);
-		rxe_enable_task(&qp->req.task);
-	}
+	rxe_enable_task(&qp->comp.task);
+	rxe_enable_task(&qp->req.task);
 }
 
 /* drain the send queue */
@@ -543,15 +535,25 @@ static void rxe_qp_drain(struct rxe_qp *qp)
 /* move the qp to the error state */
 void rxe_qp_error(struct rxe_qp *qp)
 {
+	/* stop tasks from running */
+	rxe_disable_task(&qp->resp.task);
+	rxe_disable_task(&qp->comp.task);
+	rxe_disable_task(&qp->req.task);
+
 	qp->req.state = QP_STATE_ERROR;
 	qp->resp.state = QP_STATE_ERROR;
 	qp->comp.state = QP_STATE_ERROR;
 	qp->attr.qp_state = IB_QPS_ERR;
 
 	/* drain work and packet queues */
-	rxe_sched_task(&qp->resp.task);
-	rxe_sched_task(&qp->comp.task);
-	rxe_sched_task(&qp->req.task);
+	rxe_responder(qp);
+	rxe_completer(qp);
+	rxe_requester(qp);
+
+	/* reenable tasks */
+	rxe_enable_task(&qp->resp.task);
+	rxe_enable_task(&qp->comp.task);
+	rxe_enable_task(&qp->req.task);
 }
 
 /* called by the modify qp verb */
@@ -770,24 +772,20 @@ static void rxe_qp_do_cleanup(struct work_struct *work)
 
 	qp->valid = 0;
 	qp->qp_timeout_jiffies = 0;
-	rxe_cleanup_task(&qp->resp.task);
 
 	if (qp_type(qp) == IB_QPT_RC) {
 		del_timer_sync(&qp->retrans_timer);
 		del_timer_sync(&qp->rnr_nak_timer);
 	}
 
+	rxe_cleanup_task(&qp->resp.task);
 	rxe_cleanup_task(&qp->req.task);
 	rxe_cleanup_task(&qp->comp.task);
 
-	/* flush out any receive wr's or pending requests */
-	if (qp->req.task.func)
-		__rxe_do_task(&qp->req.task);
-
-	if (qp->sq.queue) {
-		__rxe_do_task(&qp->comp.task);
-		__rxe_do_task(&qp->req.task);
-	}
+	/* drain any receive wr's or pending requests */
+	rxe_responder(qp);
+	rxe_completer(qp);
+	rxe_requester(qp);
 
 	if (qp->sq.queue)
 		rxe_queue_cleanup(qp->sq.queue);
diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
index 8dfbfa164eff..120693c9a795 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.c
+++ b/drivers/infiniband/sw/rxe/rxe_task.c
@@ -6,19 +6,6 @@
 
 #include "rxe.h"
 
-int __rxe_do_task(struct rxe_task *task)
-
-{
-	int ret;
-
-	while ((ret = task->func(task->arg)) == 0)
-		;
-
-	task->ret = ret;
-
-	return ret;
-}
-
 /*
  * this locking is due to a potential race where
  * a second caller finds the task already running
diff --git a/drivers/infiniband/sw/rxe/rxe_task.h b/drivers/infiniband/sw/rxe/rxe_task.h
index 31963129ff7a..d594468fcf56 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.h
+++ b/drivers/infiniband/sw/rxe/rxe_task.h
@@ -48,12 +48,6 @@ struct rxe_task {
 int rxe_init_task(struct rxe_task *task, void *arg, int (*func)(void *),
 		  enum rxe_task_type type);
 
-/*
- * raw call to func in loop without any checking
- * can call when tasklets are disabled
- */
-int __rxe_do_task(struct rxe_task *task);
-
 void rxe_run_task(struct rxe_task *task);
 
 void rxe_sched_task(struct rxe_task *task);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH for-next v3 07/13] RDMA/rxe: Make tasks schedule each other
  2022-10-29  3:09 [PATCH for-next v3 00/13] Implement work queues for rdma_rxe Bob Pearson
                   ` (5 preceding siblings ...)
  2022-10-29  3:10 ` [PATCH for-next v3 06/13] RDMA/rxe: Remove __rxe_do_task() Bob Pearson
@ 2022-10-29  3:10 ` Bob Pearson
  2022-10-29  3:10 ` [PATCH for-next v3 08/13] RDMA/rxe: Implement disable/enable_task() Bob Pearson
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Bob Pearson @ 2022-10-29  3:10 UTC (permalink / raw)
  To: jgg, leon, zyjzyj2000, jhack, linux-rdma; +Cc: Bob Pearson, Ian Ziemba

Replace rxe_run_task() by rxe_sched_task() when tasks call each other.
These calls are not performance critical and mainly involve error paths,
but they run the risk of causing deadlocks.

Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_comp.c | 8 ++++----
 drivers/infiniband/sw/rxe/rxe_req.c  | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
index 76dc0a4702fd..f2256f67edbf 100644
--- a/drivers/infiniband/sw/rxe/rxe_comp.c
+++ b/drivers/infiniband/sw/rxe/rxe_comp.c
@@ -316,7 +316,7 @@ static inline enum comp_state check_ack(struct rxe_qp *qp,
 					qp->comp.psn = pkt->psn;
 					if (qp->req.wait_psn) {
 						qp->req.wait_psn = 0;
-						rxe_run_task(&qp->req.task);
+						rxe_sched_task(&qp->req.task);
 					}
 				}
 				return COMPST_ERROR_RETRY;
@@ -463,7 +463,7 @@ static void do_complete(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
 	 */
 	if (qp->req.wait_fence) {
 		qp->req.wait_fence = 0;
-		rxe_run_task(&qp->req.task);
+		rxe_sched_task(&qp->req.task);
 	}
 }
 
@@ -477,7 +477,7 @@ static inline enum comp_state complete_ack(struct rxe_qp *qp,
 		if (qp->req.need_rd_atomic) {
 			qp->comp.timeout_retry = 0;
 			qp->req.need_rd_atomic = 0;
-			rxe_run_task(&qp->req.task);
+			rxe_sched_task(&qp->req.task);
 		}
 	}
 
@@ -731,7 +731,7 @@ int rxe_completer(void *arg)
 							RXE_CNT_COMP_RETRY);
 					qp->req.need_retry = 1;
 					qp->comp.started_retry = 1;
-					rxe_run_task(&qp->req.task);
+					rxe_sched_task(&qp->req.task);
 				}
 				goto done;
 
diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
index 41f1d84f0acb..fba7572e1d0c 100644
--- a/drivers/infiniband/sw/rxe/rxe_req.c
+++ b/drivers/infiniband/sw/rxe/rxe_req.c
@@ -733,7 +733,7 @@ int rxe_requester(void *arg)
 						       qp->req.wqe_index);
 			wqe->state = wqe_state_done;
 			wqe->status = IB_WC_SUCCESS;
-			rxe_run_task(&qp->comp.task);
+			rxe_sched_task(&qp->comp.task);
 			goto done;
 		}
 		payload = mtu;
@@ -817,7 +817,7 @@ int rxe_requester(void *arg)
 	qp->req.wqe_index = queue_next_index(qp->sq.queue, qp->req.wqe_index);
 	wqe->state = wqe_state_error;
 	qp->req.state = QP_STATE_ERROR;
-	rxe_run_task(&qp->comp.task);
+	rxe_sched_task(&qp->comp.task);
 exit:
 	ret = -EAGAIN;
 out:
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH for-next v3 08/13] RDMA/rxe: Implement disable/enable_task()
  2022-10-29  3:09 [PATCH for-next v3 00/13] Implement work queues for rdma_rxe Bob Pearson
                   ` (6 preceding siblings ...)
  2022-10-29  3:10 ` [PATCH for-next v3 07/13] RDMA/rxe: Make tasks schedule each other Bob Pearson
@ 2022-10-29  3:10 ` Bob Pearson
  2022-10-29  3:10 ` [PATCH for-next v3 09/13] RDMA/rxe: Replace TASK_STATE_START by TASK_STATE_IDLE Bob Pearson
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Bob Pearson @ 2022-10-29  3:10 UTC (permalink / raw)
  To: jgg, leon, zyjzyj2000, jhack, linux-rdma; +Cc: Bob Pearson, Ian Ziemba

Implement common disable_task() and enable_task() routines by
adding a new PAUSED state to the do_task() state machine.
These replace tasklet_disable() and tasklet_enable() with code that
can be shared by all the task types. Move the rxe_sched_task() call
that reschedules the task outside of the locks to avoid a deadlock.
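
As a rough sketch (derived from the code below), the transitions become:

	START  --run/sched-->  BUSY    --func() != 0-->  START
	BUSY   --run/sched-->  ARMED   (func runs once more, then idles)
	any    --disable-->    PAUSED  --enable-->       START (task re-run)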

Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_task.c | 107 ++++++++++++++++-----------
 drivers/infiniband/sw/rxe/rxe_task.h |   1 +
 2 files changed, 66 insertions(+), 42 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
index 120693c9a795..d824de82f2ae 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.c
+++ b/drivers/infiniband/sw/rxe/rxe_task.c
@@ -6,36 +6,46 @@
 
 #include "rxe.h"
 
-/*
- * this locking is due to a potential race where
- * a second caller finds the task already running
- * but looks just after the last call to func
- */
-static void do_task(struct rxe_task *task)
+static bool task_is_idle(struct rxe_task *task)
 {
-	unsigned int iterations = RXE_MAX_ITERATIONS;
-	int cont;
-	int ret;
+	if (task->destroyed)
+		return false;
 
 	spin_lock_bh(&task->lock);
 	switch (task->state) {
 	case TASK_STATE_START:
 		task->state = TASK_STATE_BUSY;
 		spin_unlock_bh(&task->lock);
-		break;
-
+		return true;
 	case TASK_STATE_BUSY:
 		task->state = TASK_STATE_ARMED;
 		fallthrough;
 	case TASK_STATE_ARMED:
-		spin_unlock_bh(&task->lock);
-		return;
-
+	case TASK_STATE_PAUSED:
+		break;
 	default:
+		WARN_ON(1);
+		break;
+	}
+	spin_unlock_bh(&task->lock);
+
+	return false;
+}
+
+static void do_task(struct rxe_task *task)
+{
+	unsigned int iterations = RXE_MAX_ITERATIONS;
+	bool resched = false;
+	int cont;
+	int ret;
+
+	/* flush out pending tasks */
+	spin_lock_bh(&task->lock);
+	if (task->state == TASK_STATE_PAUSED) {
 		spin_unlock_bh(&task->lock);
-		pr_warn("%s failed with bad state %d\n", __func__, task->state);
 		return;
 	}
+	spin_unlock_bh(&task->lock);
 
 	do {
 		cont = 0;
@@ -43,47 +53,52 @@ static void do_task(struct rxe_task *task)
 
 		spin_lock_bh(&task->lock);
 		switch (task->state) {
+		case TASK_STATE_START:
 		case TASK_STATE_BUSY:
 			if (ret) {
 				task->state = TASK_STATE_START;
-			} else if (iterations--) {
+			} else if (task->type == RXE_TASK_TYPE_INLINE ||
+					iterations--) {
 				cont = 1;
 			} else {
-				/* reschedule the tasklet and exit
-				 * the loop to give up the cpu
-				 */
-				tasklet_schedule(&task->tasklet);
 				task->state = TASK_STATE_START;
+				resched = true;
 			}
 			break;
-
-		/* someone tried to run the task since the last time we called
-		 * func, so we will call one more time regardless of the
-		 * return value
-		 */
 		case TASK_STATE_ARMED:
 			task->state = TASK_STATE_BUSY;
 			cont = 1;
 			break;
-
+		case TASK_STATE_PAUSED:
+			break;
 		default:
-			pr_warn("%s failed with bad state %d\n", __func__,
-				task->state);
+			WARN_ON(1);
+			break;
 		}
 		spin_unlock_bh(&task->lock);
 	} while (cont);
 
+	if (resched)
+		rxe_sched_task(task);
+
 	task->ret = ret;
 }
 
 static void disable_task(struct rxe_task *task)
 {
-	/* todo */
+	spin_lock_bh(&task->lock);
+	task->state = TASK_STATE_PAUSED;
+	spin_unlock_bh(&task->lock);
 }
 
 static void enable_task(struct rxe_task *task)
 {
-	/* todo */
+	spin_lock_bh(&task->lock);
+	task->state = TASK_STATE_START;
+	spin_unlock_bh(&task->lock);
+
+	/* restart task in case */
+	rxe_run_task(task);
 }
 
 /* busy wait until any previous tasks are done */
@@ -95,7 +110,8 @@ static void cleanup_task(struct rxe_task *task)
 
 	do {
 		spin_lock_bh(&task->lock);
-		idle = (task->state == TASK_STATE_START);
+		idle = (task->state == TASK_STATE_START ||
+			task->state == TASK_STATE_PAUSED);
 		spin_unlock_bh(&task->lock);
 	} while (!idle);
 }
@@ -103,22 +119,26 @@ static void cleanup_task(struct rxe_task *task)
 /* silently treat schedule as inline for inline tasks */
 static void inline_sched(struct rxe_task *task)
 {
-	do_task(task);
+	if (task_is_idle(task))
+		do_task(task);
 }
 
 static void inline_run(struct rxe_task *task)
 {
-	do_task(task);
+	if (task_is_idle(task))
+		do_task(task);
 }
 
 static void inline_disable(struct rxe_task *task)
 {
-	disable_task(task);
+	if (!task->destroyed)
+		disable_task(task);
 }
 
 static void inline_enable(struct rxe_task *task)
 {
-	enable_task(task);
+	if (!task->destroyed)
+		enable_task(task);
 }
 
 static void inline_cleanup(struct rxe_task *task)
@@ -142,31 +162,34 @@ static void inline_init(struct rxe_task *task)
 /* use tsklet_xxx to avoid name collisions with tasklet_xxx */
 static void tsklet_sched(struct rxe_task *task)
 {
-	tasklet_schedule(&task->tasklet);
+	if (task_is_idle(task))
+		tasklet_schedule(&task->tasklet);
 }
 
 static void tsklet_do_task(struct tasklet_struct *tasklet)
 {
 	struct rxe_task *task = container_of(tasklet, typeof(*task), tasklet);
 
-	do_task(task);
+	if (!task->destroyed)
+		do_task(task);
 }
 
 static void tsklet_run(struct rxe_task *task)
 {
-	do_task(task);
+	if (task_is_idle(task))
+		do_task(task);
 }
 
 static void tsklet_disable(struct rxe_task *task)
 {
-	disable_task(task);
-	tasklet_disable(&task->tasklet);
+	if (!task->destroyed)
+		disable_task(task);
 }
 
 static void tsklet_enable(struct rxe_task *task)
 {
-	tasklet_enable(&task->tasklet);
-	enable_task(task);
+	if (!task->destroyed)
+		enable_task(task);
 }
 
 static void tsklet_cleanup(struct rxe_task *task)
diff --git a/drivers/infiniband/sw/rxe/rxe_task.h b/drivers/infiniband/sw/rxe/rxe_task.h
index d594468fcf56..792832786456 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.h
+++ b/drivers/infiniband/sw/rxe/rxe_task.h
@@ -26,6 +26,7 @@ enum {
 	TASK_STATE_START	= 0,
 	TASK_STATE_BUSY		= 1,
 	TASK_STATE_ARMED	= 2,
+	TASK_STATE_PAUSED	= 3,
 };
 
 /*
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH for-next v3 09/13] RDMA/rxe: Replace TASK_STATE_START by TASK_STATE_IDLE
  2022-10-29  3:09 [PATCH for-next v3 00/13] Implement work queues for rdma_rxe Bob Pearson
                   ` (7 preceding siblings ...)
  2022-10-29  3:10 ` [PATCH for-next v3 08/13] RDMA/rxe: Implement disable/enable_task() Bob Pearson
@ 2022-10-29  3:10 ` Bob Pearson
  2022-10-29  3:10 ` [PATCH for-next v3 10/13] RDMA/rxe: Replace task->destroyed by task state INVALID Bob Pearson
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Bob Pearson @ 2022-10-29  3:10 UTC (permalink / raw)
  To: jgg, leon, zyjzyj2000, jhack, linux-rdma; +Cc: Bob Pearson, Ian Ziemba

Replace the enum TASK_STATE_START by TASK_STATE_IDLE.

Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_task.c | 14 +++++++-------
 drivers/infiniband/sw/rxe/rxe_task.h |  2 +-
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
index d824de82f2ae..0fd0d97e8272 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.c
+++ b/drivers/infiniband/sw/rxe/rxe_task.c
@@ -13,7 +13,7 @@ static bool task_is_idle(struct rxe_task *task)
 
 	spin_lock_bh(&task->lock);
 	switch (task->state) {
-	case TASK_STATE_START:
+	case TASK_STATE_IDLE:
 		task->state = TASK_STATE_BUSY;
 		spin_unlock_bh(&task->lock);
 		return true;
@@ -53,15 +53,15 @@ static void do_task(struct rxe_task *task)
 
 		spin_lock_bh(&task->lock);
 		switch (task->state) {
-		case TASK_STATE_START:
+		case TASK_STATE_IDLE:
 		case TASK_STATE_BUSY:
 			if (ret) {
-				task->state = TASK_STATE_START;
+				task->state = TASK_STATE_IDLE;
 			} else if (task->type == RXE_TASK_TYPE_INLINE ||
 					iterations--) {
 				cont = 1;
 			} else {
-				task->state = TASK_STATE_START;
+				task->state = TASK_STATE_IDLE;
 				resched = true;
 			}
 			break;
@@ -94,7 +94,7 @@ static void disable_task(struct rxe_task *task)
 static void enable_task(struct rxe_task *task)
 {
 	spin_lock_bh(&task->lock);
-	task->state = TASK_STATE_START;
+	task->state = TASK_STATE_IDLE;
 	spin_unlock_bh(&task->lock);
 
 	/* restart task in case */
@@ -110,7 +110,7 @@ static void cleanup_task(struct rxe_task *task)
 
 	do {
 		spin_lock_bh(&task->lock);
-		idle = (task->state == TASK_STATE_START ||
+		idle = (task->state == TASK_STATE_IDLE ||
 			task->state == TASK_STATE_PAUSED);
 		spin_unlock_bh(&task->lock);
 	} while (!idle);
@@ -219,7 +219,7 @@ int rxe_init_task(struct rxe_task *task, void *arg, int (*func)(void *),
 	task->func	= func;
 	task->destroyed	= false;
 	task->type	= type;
-	task->state	= TASK_STATE_START;
+	task->state	= TASK_STATE_IDLE;
 
 	spin_lock_init(&task->lock);
 
diff --git a/drivers/infiniband/sw/rxe/rxe_task.h b/drivers/infiniband/sw/rxe/rxe_task.h
index 792832786456..0146307fc517 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.h
+++ b/drivers/infiniband/sw/rxe/rxe_task.h
@@ -23,7 +23,7 @@ enum rxe_task_type {
 };
 
 enum {
-	TASK_STATE_START	= 0,
+	TASK_STATE_IDLE		= 0,
 	TASK_STATE_BUSY		= 1,
 	TASK_STATE_ARMED	= 2,
 	TASK_STATE_PAUSED	= 3,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH for-next v3 10/13] RDMA/rxe: Replace task->destroyed by task state INVALID.
  2022-10-29  3:09 [PATCH for-next v3 00/13] Implement work queues for rdma_rxe Bob Pearson
                   ` (8 preceding siblings ...)
  2022-10-29  3:10 ` [PATCH for-next v3 09/13] RDMA/rxe: Replace TASK_STATE_START by TASK_STATE_IDLE Bob Pearson
@ 2022-10-29  3:10 ` Bob Pearson
  2022-10-29  3:10 ` [PATCH for-next v3 11/13] RDMA/rxe: Add workqueue support for tasks Bob Pearson
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Bob Pearson @ 2022-10-29  3:10 UTC (permalink / raw)
  To: jgg, leon, zyjzyj2000, jhack, linux-rdma; +Cc: Bob Pearson, Ian Ziemba

Add a new state TASK_STATE_INVALID to replace the flag task->destroyed.
Make all changes to task->state, now including TASK_STATE_INVALID,
protected by task->lock.

Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_task.c | 49 ++++++++++------------------
 drivers/infiniband/sw/rxe/rxe_task.h |  3 +-
 2 files changed, 19 insertions(+), 33 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
index 0fd0d97e8272..da175f2a0dbf 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.c
+++ b/drivers/infiniband/sw/rxe/rxe_task.c
@@ -8,9 +8,6 @@
 
 static bool task_is_idle(struct rxe_task *task)
 {
-	if (task->destroyed)
-		return false;
-
 	spin_lock_bh(&task->lock);
 	switch (task->state) {
 	case TASK_STATE_IDLE:
@@ -19,12 +16,8 @@ static bool task_is_idle(struct rxe_task *task)
 		return true;
 	case TASK_STATE_BUSY:
 		task->state = TASK_STATE_ARMED;
-		fallthrough;
-	case TASK_STATE_ARMED:
-	case TASK_STATE_PAUSED:
 		break;
 	default:
-		WARN_ON(1);
 		break;
 	}
 	spin_unlock_bh(&task->lock);
@@ -41,7 +34,8 @@ static void do_task(struct rxe_task *task)
 
 	/* flush out pending tasks */
 	spin_lock_bh(&task->lock);
-	if (task->state == TASK_STATE_PAUSED) {
+	if (task->state == TASK_STATE_PAUSED ||
+	    task->state == TASK_STATE_INVALID) {
 		spin_unlock_bh(&task->lock);
 		return;
 	}
@@ -69,10 +63,7 @@ static void do_task(struct rxe_task *task)
 			task->state = TASK_STATE_BUSY;
 			cont = 1;
 			break;
-		case TASK_STATE_PAUSED:
-			break;
 		default:
-			WARN_ON(1);
 			break;
 		}
 		spin_unlock_bh(&task->lock);
@@ -87,14 +78,16 @@ static void do_task(struct rxe_task *task)
 static void disable_task(struct rxe_task *task)
 {
 	spin_lock_bh(&task->lock);
-	task->state = TASK_STATE_PAUSED;
+	if (task->state != TASK_STATE_INVALID)
+		task->state = TASK_STATE_PAUSED;
 	spin_unlock_bh(&task->lock);
 }
 
 static void enable_task(struct rxe_task *task)
 {
 	spin_lock_bh(&task->lock);
-	task->state = TASK_STATE_IDLE;
+	if (task->state != TASK_STATE_INVALID)
+		task->state = TASK_STATE_IDLE;
 	spin_unlock_bh(&task->lock);
 
 	/* restart task in case */
@@ -104,16 +97,16 @@ static void enable_task(struct rxe_task *task)
 /* busy wait until any previous tasks are done */
 static void cleanup_task(struct rxe_task *task)
 {
-	bool idle;
-
-	task->destroyed = true;
+	bool busy;
 
 	do {
 		spin_lock_bh(&task->lock);
-		idle = (task->state == TASK_STATE_IDLE ||
-			task->state == TASK_STATE_PAUSED);
+		busy = (task->state == TASK_STATE_BUSY ||
+			task->state == TASK_STATE_ARMED);
+		if (!busy)
+			task->state = TASK_STATE_INVALID;
 		spin_unlock_bh(&task->lock);
-	} while (!idle);
+	} while (busy);
 }
 
 /* silently treat schedule as inline for inline tasks */
@@ -131,14 +124,12 @@ static void inline_run(struct rxe_task *task)
 
 static void inline_disable(struct rxe_task *task)
 {
-	if (!task->destroyed)
-		disable_task(task);
+	disable_task(task);
 }
 
 static void inline_enable(struct rxe_task *task)
 {
-	if (!task->destroyed)
-		enable_task(task);
+	enable_task(task);
 }
 
 static void inline_cleanup(struct rxe_task *task)
@@ -168,10 +159,7 @@ static void tsklet_sched(struct rxe_task *task)
 
 static void tsklet_do_task(struct tasklet_struct *tasklet)
 {
-	struct rxe_task *task = container_of(tasklet, typeof(*task), tasklet);
-
-	if (!task->destroyed)
-		do_task(task);
+	do_task(container_of(tasklet, struct rxe_task, tasklet));
 }
 
 static void tsklet_run(struct rxe_task *task)
@@ -182,14 +170,12 @@ static void tsklet_run(struct rxe_task *task)
 
 static void tsklet_disable(struct rxe_task *task)
 {
-	if (!task->destroyed)
-		disable_task(task);
+	disable_task(task);
 }
 
 static void tsklet_enable(struct rxe_task *task)
 {
-	if (!task->destroyed)
-		enable_task(task);
+	enable_task(task);
 }
 
 static void tsklet_cleanup(struct rxe_task *task)
@@ -217,7 +203,6 @@ int rxe_init_task(struct rxe_task *task, void *arg, int (*func)(void *),
 {
 	task->arg	= arg;
 	task->func	= func;
-	task->destroyed	= false;
 	task->type	= type;
 	task->state	= TASK_STATE_IDLE;
 
diff --git a/drivers/infiniband/sw/rxe/rxe_task.h b/drivers/infiniband/sw/rxe/rxe_task.h
index 0146307fc517..2c4ef4d339f1 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.h
+++ b/drivers/infiniband/sw/rxe/rxe_task.h
@@ -27,6 +27,7 @@ enum {
 	TASK_STATE_BUSY		= 1,
 	TASK_STATE_ARMED	= 2,
 	TASK_STATE_PAUSED	= 3,
+	TASK_STATE_INVALID	= 4,
 };
 
 /*
@@ -41,7 +42,7 @@ struct rxe_task {
 	void				*arg;
 	int				(*func)(void *arg);
 	int				ret;
-	bool				destroyed;
+	bool				invalid;
 	const struct rxe_task_ops	*ops;
 	enum rxe_task_type		type;
 };
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH for-next v3 11/13] RDMA/rxe: Add workqueue support for tasks
  2022-10-29  3:09 [PATCH for-next v3 00/13] Implement work queues for rdma_rxe Bob Pearson
                   ` (9 preceding siblings ...)
  2022-10-29  3:10 ` [PATCH for-next v3 10/13] RDMA/rxe: Replace task->destroyed by task state INVALID Bob Pearson
@ 2022-10-29  3:10 ` Bob Pearson
  2022-10-29  3:10 ` [PATCH for-next v3 12/13] RDMA/rxe: Make WORKQUEUE default for RC tasks Bob Pearson
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Bob Pearson @ 2022-10-29  3:10 UTC (permalink / raw)
  To: jgg, leon, zyjzyj2000, jhack, linux-rdma; +Cc: Bob Pearson, Ian Ziemba

Add a third task type RXE_TASK_TYPE_WORKQUEUE to rxe_task.c.

Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe.c      |  9 +++-
 drivers/infiniband/sw/rxe/rxe_task.c | 66 ++++++++++++++++++++++++++++
 drivers/infiniband/sw/rxe/rxe_task.h | 10 ++++-
 3 files changed, 83 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe.c b/drivers/infiniband/sw/rxe/rxe.c
index 51daac5c4feb..6d80218334ca 100644
--- a/drivers/infiniband/sw/rxe/rxe.c
+++ b/drivers/infiniband/sw/rxe/rxe.c
@@ -210,10 +210,16 @@ static int __init rxe_module_init(void)
 {
 	int err;
 
-	err = rxe_net_init();
+	err = rxe_alloc_wq();
 	if (err)
 		return err;
 
+	err = rxe_net_init();
+	if (err) {
+		rxe_destroy_wq();
+		return err;
+	}
+
 	rdma_link_register(&rxe_link_ops);
 	pr_info("loaded\n");
 	return 0;
@@ -224,6 +230,7 @@ static void __exit rxe_module_exit(void)
 	rdma_link_unregister(&rxe_link_ops);
 	ib_unregister_driver(RDMA_DRIVER_RXE);
 	rxe_net_exit();
+	rxe_destroy_wq();
 
 	pr_info("unloaded\n");
 }
diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
index da175f2a0dbf..c1177752088d 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.c
+++ b/drivers/infiniband/sw/rxe/rxe_task.c
@@ -6,6 +6,22 @@
 
 #include "rxe.h"
 
+static struct workqueue_struct *rxe_wq;
+
+int rxe_alloc_wq(void)
+{
+	rxe_wq = alloc_workqueue("rxe_wq", WQ_CPU_INTENSIVE, WQ_MAX_ACTIVE);
+	if (!rxe_wq)
+		return -ENOMEM;
+
+	return 0;
+}
+
+void rxe_destroy_wq(void)
+{
+	destroy_workqueue(rxe_wq);
+}
+
 static bool task_is_idle(struct rxe_task *task)
 {
 	spin_lock_bh(&task->lock);
@@ -198,6 +214,53 @@ static void tsklet_init(struct rxe_task *task)
 	task->ops = &tsklet_ops;
 }
 
+static void work_sched(struct rxe_task *task)
+{
+	if (task_is_idle(task))
+		queue_work(rxe_wq, &task->work);
+}
+
+static void work_do_task(struct work_struct *work)
+{
+	do_task(container_of(work, struct rxe_task, work));
+}
+
+static void work_run(struct rxe_task *task)
+{
+	if (task_is_idle(task))
+		do_task(task);
+}
+
+static void work_enable(struct rxe_task *task)
+{
+	enable_task(task);
+}
+
+static void work_disable(struct rxe_task *task)
+{
+	disable_task(task);
+	flush_workqueue(rxe_wq);
+}
+
+static void work_cleanup(struct rxe_task *task)
+{
+	cleanup_task(task);
+}
+
+static const struct rxe_task_ops work_ops = {
+	.sched = work_sched,
+	.run = work_run,
+	.enable = work_enable,
+	.disable = work_disable,
+	.cleanup = work_cleanup,
+};
+
+static void work_init(struct rxe_task *task)
+{
+	INIT_WORK(&task->work, work_do_task);
+	task->ops = &work_ops;
+}
+
 int rxe_init_task(struct rxe_task *task, void *arg, int (*func)(void *),
 		  enum rxe_task_type type)
 {
@@ -215,6 +278,9 @@ int rxe_init_task(struct rxe_task *task, void *arg, int (*func)(void *),
 	case RXE_TASK_TYPE_TASKLET:
 		tsklet_init(task);
 		break;
+	case RXE_TASK_TYPE_WORKQUEUE:
+		work_init(task);
+		break;
 	default:
 		pr_debug("%s: invalid task type = %d\n", __func__, type);
 		return -EINVAL;
diff --git a/drivers/infiniband/sw/rxe/rxe_task.h b/drivers/infiniband/sw/rxe/rxe_task.h
index 2c4ef4d339f1..d1156b935635 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.h
+++ b/drivers/infiniband/sw/rxe/rxe_task.h
@@ -20,6 +20,7 @@ struct rxe_task_ops {
 enum rxe_task_type {
 	RXE_TASK_TYPE_INLINE	= 0,
 	RXE_TASK_TYPE_TASKLET	= 1,
+	RXE_TASK_TYPE_WORKQUEUE	= 2,
 };
 
 enum {
@@ -36,7 +37,10 @@ enum {
  * called again.
  */
 struct rxe_task {
-	struct tasklet_struct		tasklet;
+	union {
+		struct tasklet_struct		tasklet;
+		struct work_struct		work;
+	};
 	int				state;
 	spinlock_t			lock;
 	void				*arg;
@@ -47,6 +51,10 @@ struct rxe_task {
 	enum rxe_task_type		type;
 };
 
+int rxe_alloc_wq(void);
+
+void rxe_destroy_wq(void);
+
 int rxe_init_task(struct rxe_task *task, void *arg, int (*func)(void *),
 		  enum rxe_task_type type);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH for-next v3 12/13] RDMA/rxe: Make WORKQUEUE default for RC tasks
  2022-10-29  3:09 [PATCH for-next v3 00/13] Implement work queues for rdma_rxe Bob Pearson
                   ` (10 preceding siblings ...)
  2022-10-29  3:10 ` [PATCH for-next v3 11/13] RDMA/rxe: Add workqueue support for tasks Bob Pearson
@ 2022-10-29  3:10 ` Bob Pearson
  2022-10-29  3:10 ` [PATCH for-next v3 13/13] RDMA/rxe: Remove tasklets from rxe_task.c Bob Pearson
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Bob Pearson @ 2022-10-29  3:10 UTC (permalink / raw)
  To: jgg, leon, zyjzyj2000, jhack, linux-rdma; +Cc: Bob Pearson

Change RXE_TASK_TYPE_TASKLET to RXE_TASK_TYPE_WORKQUEUE in rxe_qp.c.
This makes work queues the default for tasks except for UD completion
tasks.

Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_qp.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
index 50f6b8b8ad9d..ca467d8991a9 100644
--- a/drivers/infiniband/sw/rxe/rxe_qp.c
+++ b/drivers/infiniband/sw/rxe/rxe_qp.c
@@ -238,9 +238,10 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
 
 	skb_queue_head_init(&qp->req_pkts);
 
-	rxe_init_task(&qp->req.task, qp, rxe_requester, RXE_TASK_TYPE_TASKLET);
+	rxe_init_task(&qp->req.task, qp, rxe_requester,
+			RXE_TASK_TYPE_WORKQUEUE);
 	rxe_init_task(&qp->comp.task, qp, rxe_completer,
-			(qp_type(qp) == IB_QPT_RC) ? RXE_TASK_TYPE_TASKLET :
+			(qp_type(qp) == IB_QPT_RC) ? RXE_TASK_TYPE_WORKQUEUE :
 						     RXE_TASK_TYPE_INLINE);
 
 	qp->qp_timeout_jiffies = 0; /* Can't be set for UD/UC in modify_qp */
@@ -288,7 +289,8 @@ static int rxe_qp_init_resp(struct rxe_dev *rxe, struct rxe_qp *qp,
 
 	skb_queue_head_init(&qp->resp_pkts);
 
-	rxe_init_task(&qp->resp.task, qp, rxe_responder, RXE_TASK_TYPE_TASKLET);
+	rxe_init_task(&qp->resp.task, qp, rxe_responder,
+			RXE_TASK_TYPE_WORKQUEUE);
 
 	qp->resp.opcode		= OPCODE_NONE;
 	qp->resp.msn		= 0;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH for-next v3 13/13] RDMA/rxe: Remove tasklets from rxe_task.c
  2022-10-29  3:09 [PATCH for-next v3 00/13] Implement work queues for rdma_rxe Bob Pearson
                   ` (11 preceding siblings ...)
  2022-10-29  3:10 ` [PATCH for-next v3 12/13] RDMA/rxe: Make WORKQUEUE default for RC tasks Bob Pearson
@ 2022-10-29  3:10 ` Bob Pearson
  2022-11-02 10:17 ` [PATCH for-next v3 00/13] Implement work queues for rdma_rxe Daisuke Matsuda (Fujitsu)
  2022-11-18  5:02 ` Daisuke Matsuda (Fujitsu)
  14 siblings, 0 replies; 23+ messages in thread
From: Bob Pearson @ 2022-10-29  3:10 UTC (permalink / raw)
  To: jgg, leon, zyjzyj2000, jhack, linux-rdma; +Cc: Bob Pearson

Remove the option to select tasklets. Keep the pluggable
interface to allow for future expansion.

Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_task.c | 51 ----------------------------
 drivers/infiniband/sw/rxe/rxe_task.h |  8 ++---
 2 files changed, 2 insertions(+), 57 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
index c1177752088d..d9c4ab2e58c8 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.c
+++ b/drivers/infiniband/sw/rxe/rxe_task.c
@@ -166,54 +166,6 @@ static void inline_init(struct rxe_task *task)
 	task->ops = &inline_ops;
 }
 
-/* use tsklet_xxx to avoid name collisions with tasklet_xxx */
-static void tsklet_sched(struct rxe_task *task)
-{
-	if (task_is_idle(task))
-		tasklet_schedule(&task->tasklet);
-}
-
-static void tsklet_do_task(struct tasklet_struct *tasklet)
-{
-	do_task(container_of(tasklet, struct rxe_task, tasklet));
-}
-
-static void tsklet_run(struct rxe_task *task)
-{
-	if (task_is_idle(task))
-		do_task(task);
-}
-
-static void tsklet_disable(struct rxe_task *task)
-{
-	disable_task(task);
-}
-
-static void tsklet_enable(struct rxe_task *task)
-{
-	enable_task(task);
-}
-
-static void tsklet_cleanup(struct rxe_task *task)
-{
-	cleanup_task(task);
-	tasklet_kill(&task->tasklet);
-}
-
-static const struct rxe_task_ops tsklet_ops = {
-	.sched = tsklet_sched,
-	.run = tsklet_run,
-	.enable = tsklet_enable,
-	.disable = tsklet_disable,
-	.cleanup = tsklet_cleanup,
-};
-
-static void tsklet_init(struct rxe_task *task)
-{
-	tasklet_setup(&task->tasklet, tsklet_do_task);
-	task->ops = &tsklet_ops;
-}
-
 static void work_sched(struct rxe_task *task)
 {
 	if (task_is_idle(task))
@@ -275,9 +227,6 @@ int rxe_init_task(struct rxe_task *task, void *arg, int (*func)(void *),
 	case RXE_TASK_TYPE_INLINE:
 		inline_init(task);
 		break;
-	case RXE_TASK_TYPE_TASKLET:
-		tsklet_init(task);
-		break;
 	case RXE_TASK_TYPE_WORKQUEUE:
 		work_init(task);
 		break;
diff --git a/drivers/infiniband/sw/rxe/rxe_task.h b/drivers/infiniband/sw/rxe/rxe_task.h
index d1156b935635..fbc7e2bf4e5a 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.h
+++ b/drivers/infiniband/sw/rxe/rxe_task.h
@@ -19,8 +19,7 @@ struct rxe_task_ops {
 
 enum rxe_task_type {
 	RXE_TASK_TYPE_INLINE	= 0,
-	RXE_TASK_TYPE_TASKLET	= 1,
-	RXE_TASK_TYPE_WORKQUEUE	= 2,
+	RXE_TASK_TYPE_WORKQUEUE	= 1,
 };
 
 enum {
@@ -37,10 +36,7 @@ enum {
  * called again.
  */
 struct rxe_task {
-	union {
-		struct tasklet_struct		tasklet;
-		struct work_struct		work;
-	};
+	struct work_struct		work;
 	int				state;
 	spinlock_t			lock;
 	void				*arg;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* RE: [PATCH for-next v3 00/13] Implement work queues for rdma_rxe
  2022-10-29  3:09 [PATCH for-next v3 00/13] Implement work queues for rdma_rxe Bob Pearson
                   ` (12 preceding siblings ...)
  2022-10-29  3:10 ` [PATCH for-next v3 13/13] RDMA/rxe: Remove tasklets from rxe_task.c Bob Pearson
@ 2022-11-02 10:17 ` Daisuke Matsuda (Fujitsu)
  2022-11-02 11:20   ` Bob Pearson
  2022-11-18  5:02 ` Daisuke Matsuda (Fujitsu)
  14 siblings, 1 reply; 23+ messages in thread
From: Daisuke Matsuda (Fujitsu) @ 2022-11-02 10:17 UTC (permalink / raw)
  To: 'Bob Pearson', jgg, leon, zyjzyj2000, jhack, linux-rdma

On Sat, Oct 29, 2022 12:10 PM Bob Pearson wrote:
> This patch series implements work queues as an alternative for
> the main tasklets in the rdma_rxe driver. The patch series starts
> with a patch that makes the internal API for task execution pluggable
> and implements an inline and a tasklet based set of functions.
> The remaining patches cleanup the qp reset and error code in the
> three tasklets and modify the locking logic to prevent making
> multiple calls to the tasklet scheduling routine. After
> this preparation the work queue equivalent set of functions is
> added and the tasklet version is dropped.

Thank you for posting the 3rd series.
It looks fine at a glance, but now I am concerned about problems
that can be potentially caused by concurrency.

> 
> The advantages of the work queue version of deferred task execution
> is mainly that the work queue variant has much better scalability
> and overall performance than the tasklet variant.  The perftest
> microbenchmarks in local loopback mode (not a very realistic test
> case) can reach approximately 100Gb/sec with work queues compared to
> about 16Gb/sec for tasklets.

As you wrote, the advantage of the work queue version is that the number of works
that can run in parallel scales with the number of logical CPUs. However, the
dispatched works (rxe_requester, rxe_responder, and rxe_completer) were
designed for serial execution on tasklets, so we must not rely on them functioning
properly under parallel execution.

There could be 3 problems, which stem from the fact that works are not necessarily
executed in the same order the packets are received. Works are enqueued to worker
pools on each CPU, and each CPU respectively schedules the works, so the ordering
of works among CPUs is not guaranteed.

[1]
On UC/UD connections, responder does not check the psn of inbound packets,
so the payloads can be copied to MRs without checking the order. If there are
works that write to overlapping memory locations, they can potentially cause
data corruption depending on the order.

[2]
On RC connections, responder checks the psn, and drops the packet if it is not
the expected one. Requester can retransmit the request in this case, so the order
seems to be guaranteed for RC.

However, responder updates the next expected psn (qp->resp.psn) BEFORE
replying an ACK packet. If the work is preempted soon after storing the next psn,
another work on another CPU can potentially reply another ACK packet earlier.
This behaviour is against the spec.
Cf. IB Specification Vol 1-Release-1.5 " 9.5 TRANSACTION ORDERING"

[3]
Again on RC connections, the next expected psn (qp->resp.psn) can be
loaded and stored at the same time from different threads. It seems we
have to use a synchronization method, perhaps like READ_ONCE() and
WRITE_ONCE() macros, to prevent loading an old value. This one is just an
example; there can be other variables that need similar consideration.


All the problems above can be solved by making the work queue single-
threaded. We can do it by using flags=WQ_UNBOUND and max_active=1
for alloc_workqueue(), but this should be the last resort since this spoils
the performance benefit of the work queue.

I am not sure what we can do with [1] right now.
For [2] and [3], we could just move the update of the psn to after the ack reply,
and use *_ONCE() macros for shared variables.
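
A minimal sketch of what that could look like (reply_ack() is only a
placeholder, not the driver's actual ack path, and the masking is an
assumption for illustration):

=====
/* sketch only; reply_ack() stands in for the real ack path */
static void store_expected_psn(struct rxe_qp *qp, u32 pkt_psn)
{
	reply_ack(qp, pkt_psn);
	/* publish the next expected psn only after the ack has been posted */
	WRITE_ONCE(qp->resp.psn, (pkt_psn + 1) & BTH_PSN_MASK);
}

static bool psn_is_expected(struct rxe_qp *qp, u32 pkt_psn)
{
	/* pair the WRITE_ONCE() above with READ_ONCE() on the reader side */
	return pkt_psn == READ_ONCE(qp->resp.psn);
}
=====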

Thanks,
Daisuke

> 
> This version of the patch series drops the tasklet version as an option
> but keeps the option of switching between the workqueue and inline
> versions.
> 
> This patch series is derived from an earlier patch set developed by
> Ian Ziemba at HPE which is used in some Lustre storage clients attached
> to Lustre servers with hard RoCE v2 NICs.
> 
> It is based on the current version of wip/jgg-for-next.
> 
> v3:
> Link: https://lore.kernel.org/linux-rdma/202210220559.f7taTL8S-lkp@intel.com/
> The v3 version drops the first few patches which have already been accepted
> in for-next. It also drops the last patch of the v2 version which
> introduced module parameters to select between the task interfaces. It also
> drops the tasklet version entirely. It fixes a minor error caught by
> the kernel test robot <lkp@intel.com> with a missing static declaration.
> 
> v2:
> The v2 version of the patch set has some minor changes that address
> comments from Leon Romanovsky regarding locking of the valid parameter
> and the setup parameters for alloc_workqueue. It also has one
> additional cleanup patch.
> 
> Bob Pearson (13):
>   RDMA/rxe: Make task interface pluggable
>   RDMA/rxe: Split rxe_drain_resp_pkts()
>   RDMA/rxe: Simplify reset state handling in rxe_resp.c
>   RDMA/rxe: Handle qp error in rxe_resp.c
>   RDMA/rxe: Cleanup comp tasks in rxe_qp.c
>   RDMA/rxe: Remove __rxe_do_task()
>   RDMA/rxe: Make tasks schedule each other
>   RDMA/rxe: Implement disable/enable_task()
>   RDMA/rxe: Replace TASK_STATE_START by TASK_STATE_IDLE
>   RDMA/rxe: Replace task->destroyed by task state INVALID.
>   RDMA/rxe: Add workqueue support for tasks
>   RDMA/rxe: Make WORKQUEUE default for RC tasks
>   RDMA/rxe: Remove tasklets from rxe_task.c
> 
>  drivers/infiniband/sw/rxe/rxe.c      |   9 +-
>  drivers/infiniband/sw/rxe/rxe_comp.c |  24 ++-
>  drivers/infiniband/sw/rxe/rxe_qp.c   |  80 ++++-----
>  drivers/infiniband/sw/rxe/rxe_req.c  |   4 +-
>  drivers/infiniband/sw/rxe/rxe_resp.c |  70 +++++---
>  drivers/infiniband/sw/rxe/rxe_task.c | 258 +++++++++++++++++++--------
>  drivers/infiniband/sw/rxe/rxe_task.h |  56 +++---
>  7 files changed, 329 insertions(+), 172 deletions(-)
> 
> 
> base-commit: 692373d186205dfb1b56f35f22702412d94d9420
> --
> 2.34.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for-next v3 00/13] Implement work queues for rdma_rxe
  2022-11-02 10:17 ` [PATCH for-next v3 00/13] Implement work queues for rdma_rxe Daisuke Matsuda (Fujitsu)
@ 2022-11-02 11:20   ` Bob Pearson
  2022-11-04  4:59     ` Daisuke Matsuda (Fujitsu)
  0 siblings, 1 reply; 23+ messages in thread
From: Bob Pearson @ 2022-11-02 11:20 UTC (permalink / raw)
  To: Daisuke Matsuda (Fujitsu), jgg, leon, zyjzyj2000, jhack, linux-rdma

On 11/2/22 05:17, Daisuke Matsuda (Fujitsu) wrote:
> On Sat, Oct 29, 2022 12:10 PM Bob Pearson wrote:
>> This patch series implements work queues as an alternative for
>> the main tasklets in the rdma_rxe driver. The patch series starts
>> with a patch that makes the internal API for task execution pluggable
>> and implements an inline and a tasklet based set of functions.
>> The remaining patches cleanup the qp reset and error code in the
>> three tasklets and modify the locking logic to prevent making
>> multiple calls to the tasklet scheduling routine. After
>> this preparation the work queue equivalent set of functions is
>> added and the tasklet version is dropped.
> 
> Thank you for posting the 3rd series.
> It looks fine at a glance, but now I am concerned about problems
> that can be potentially caused by concurrency.
> 
>>
>> The advantages of the work queue version of deferred task execution
>> is mainly that the work queue variant has much better scalability
>> and overall performance than the tasklet variant.  The perftest
>> microbenchmarks in local loopback mode (not a very realistic test
>> case) can reach approximately 100Gb/sec with work queues compared to
>> about 16Gb/sec for tasklets.
> 
> As you wrote, the advantage of work queue version is that the number works
> that can run parallelly scales with the number of logical CPUs. However, the
> dispatched works (rxe_requester, rxe_responder, and rxe_completer) are
> designed for serial execution on tasklet, so we must not rely on them functioning
> properly on parallel execution.

Work queues are serial for each separate work task just like tasklets. There isn't
a problem here. The tasklets for different tasks can run in parallel but tend to
do so less than work queue tasks. The reason is that tasklets are scheduled by
default on the same cpu as the thread that scheduled it while work queues are scheduled
by the kernel scheduler and get spread around.
> 
> There could be 3 problems, which stem from the fact that works are not necessarily
> executed in the same order the packets are received. Works are enqueued to worker
> pools on each CPU, and each CPU respectively schedules the works, so the ordering
> of works among CPUs is not guaranteed.
> 
> [1]
> On UC/UD connections, responder does not check the psn of inbound packets,
> so the payloads can be copied to MRs without checking the order. If there are
> works that write to overlapping memory locations, they can potentially cause
> data corruption depending the order.
> 
> [2]
> On RC connections, responder checks the psn, and drops the packet if it is not
> the expected one. Requester can retransmit the request in this case, so the order
> seems to be guaranteed for RC.
> 
> However, responder updates the next expected psn (qp->resp.psn) BEFORE
> replying an ACK packet. If the work is preempted soon after storing the next psn,
> another work on another CPU can potentially reply another ACK packet earlier.
> This behaviour is against the spec.
> Cf. IB Specification Vol 1-Release-1.5 " 9.5 TRANSACTION ORDERING"
> 
> [3]
> Again on RC connections, the next expected psn (qp->resp.psn) can be
> loaded and stored at the same time from different threads. It seems we
> have to use a synchronization method, perhaps like READ_ONCE() and
> WRITE_ONCE() macros, to prevent loading an old value. This one is just an
> example; there can be other variables that need similar consideration.
> 
> 
> All the problems above can be solved by making the work queue single-
> threaded. We can do it by using flags=WQ_UNBOUND and max_active=1
> for alloc_workqueue(), but this should be the last resort since this spoils
> the performance benefit of work queue.
> 
> I am not sure what we can do with [1] right now.
> For [2] and [3], we could just move the update of psn later than the ack reply,
> and use *_ONCE() macros for shared variables.
> 
> Thanks,
> Daisuke
> 
>>
>> This version of the patch series drops the tasklet version as an option
>> but keeps the option of switching between the workqueue and inline
>> versions.
>>
>> This patch series is derived from an earlier patch set developed by
>> Ian Ziemba at HPE which is used in some Lustre storage clients attached
>> to Lustre servers with hard RoCE v2 NICs.
>>
>> It is based on the current version of wip/jgg-for-next.
>>
>> v3:
>> Link: https://lore.kernel.org/linux-rdma/202210220559.f7taTL8S-lkp@intel.com/
>> The v3 version drops the first few patches which have already been accepted
>> in for-next. It also drops the last patch of the v2 version which
>> introduced module parameters to select between the task interfaces. It also
>> drops the tasklet version entirely. It fixes a minor error caught by
>> the kernel test robot <lkp@intel.com> with a missing static declaration.
>>
>> v2:
>> The v2 version of the patch set has some minor changes that address
>> comments from Leon Romanovsky regarding locking of the valid parameter
>> and the setup parameters for alloc_workqueue. It also has one
>> additional cleanup patch.
>>
>> Bob Pearson (13):
>>   RDMA/rxe: Make task interface pluggable
>>   RDMA/rxe: Split rxe_drain_resp_pkts()
>>   RDMA/rxe: Simplify reset state handling in rxe_resp.c
>>   RDMA/rxe: Handle qp error in rxe_resp.c
>>   RDMA/rxe: Cleanup comp tasks in rxe_qp.c
>>   RDMA/rxe: Remove __rxe_do_task()
>>   RDMA/rxe: Make tasks schedule each other
>>   RDMA/rxe: Implement disable/enable_task()
>>   RDMA/rxe: Replace TASK_STATE_START by TASK_STATE_IDLE
>>   RDMA/rxe: Replace task->destroyed by task state INVALID.
>>   RDMA/rxe: Add workqueue support for tasks
>>   RDMA/rxe: Make WORKQUEUE default for RC tasks
>>   RDMA/rxe: Remove tasklets from rxe_task.c
>>
>>  drivers/infiniband/sw/rxe/rxe.c      |   9 +-
>>  drivers/infiniband/sw/rxe/rxe_comp.c |  24 ++-
>>  drivers/infiniband/sw/rxe/rxe_qp.c   |  80 ++++-----
>>  drivers/infiniband/sw/rxe/rxe_req.c  |   4 +-
>>  drivers/infiniband/sw/rxe/rxe_resp.c |  70 +++++---
>>  drivers/infiniband/sw/rxe/rxe_task.c | 258 +++++++++++++++++++--------
>>  drivers/infiniband/sw/rxe/rxe_task.h |  56 +++---
>>  7 files changed, 329 insertions(+), 172 deletions(-)
>>
>>
>> base-commit: 692373d186205dfb1b56f35f22702412d94d9420
>> --
>> 2.34.1
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for-next v3 00/13] Implement work queues for rdma_rxe
  2022-11-02 11:20   ` Bob Pearson
@ 2022-11-04  4:59     ` Daisuke Matsuda (Fujitsu)
  2022-11-05 21:15       ` Bob Pearson
  0 siblings, 1 reply; 23+ messages in thread
From: Daisuke Matsuda (Fujitsu) @ 2022-11-04  4:59 UTC (permalink / raw)
  To: 'Bob Pearson', jgg, leon, zyjzyj2000, jhack, linux-rdma

On Wed, Nov 2, 2022 8:21 PM Bob Pearson wrote:
> On 11/2/22 05:17, Daisuke Matsuda (Fujitsu) wrote:
> > On Sat, Oct 29, 2022 12:10 PM Bob Pearson wrote:
> >> This patch series implements work queues as an alternative for
> >> the main tasklets in the rdma_rxe driver. The patch series starts
> >> with a patch that makes the internal API for task execution pluggable
> >> and implements an inline and a tasklet based set of functions.
> >> The remaining patches cleanup the qp reset and error code in the
> >> three tasklets and modify the locking logic to prevent making
> >> multiple calls to the tasklet scheduling routine. After
> >> this preparation the work queue equivalent set of functions is
> >> added and the tasklet version is dropped.
> >
> > Thank you for posting the 3rd series.
> > It looks fine at a glance, but now I am concerned about problems
> > that can be potentially caused by concurrency.
> >
> >>
> >> The advantages of the work queue version of deferred task execution
> >> is mainly that the work queue variant has much better scalability
> >> and overall performance than the tasklet variant.  The perftest
> >> microbenchmarks in local loopback mode (not a very realistic test
> >> case) can reach approximately 100Gb/sec with work queues compared to
> >> about 16Gb/sec for tasklets.
> >
> > As you wrote, the advantage of work queue version is that the number works
> > that can run parallelly scales with the number of logical CPUs. However, the
> > dispatched works (rxe_requester, rxe_responder, and rxe_completer) are
> > designed for serial execution on tasklet, so we must not rely on them functioning
> > properly on parallel execution.
> 
> Work queues are serial for each separate work task just like tasklets. There isn't
> a problem here. The tasklets for different tasks can run in parallel but tend to
> do so less than work queue tasks. The reason is that tasklets are scheduled by
> default on the same cpu as the thread that scheduled it while work queues are scheduled
> by the kernel scheduler and get spread around.

=====
rxe_wq = alloc_workqueue("rxe_wq", WQ_CPU_INTENSIVE, WQ_MAX_ACTIVE);
=====
You are using the WQ_CPU_INTENSIVE flag. This allows works to be scheduled by
the system scheduler, but each work is still enqueued to worker pools of each CPU
and thus bound to the CPU the issuer is running on. It seems the behaviour you
expect can be achieved by the WQ_UNBOUND flag. Unbound work items will run
on any CPU at the cost of cache utilization.

Two of the same tasklets never run concurrently on two different processors by nature,
but that is not the case with work queues. If two softirqs running on different CPUs
enqueue responder works at almost the same time, it is possible that they are dispatched
and run on the different CPUs at the same time. I mean the problems may arise in such
a situation.

Please let me know if I missed anything. I referred to the following document.
The easiest solution is to use @flags= WQ_UNBOUND and @max_active=1 to let works
run serially.
cf. https://www.kernel.org/doc/html/latest/core-api/workqueue.html
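
A minimal sketch of that fallback, reusing the workqueue name from the
snippet above:

=====
/* serialize all rxe work items: unbound and at most one in flight */
rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND, 1);
if (!rxe_wq)
	return -ENOMEM;
=====

If I am reading the API right, alloc_ordered_workqueue("rxe_wq", 0) would be
an equivalent spelling of the same thing.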

Thanks,
Daisuke

> >
> > There could be 3 problems, which stem from the fact that works are not necessarily
> > executed in the same order the packets are received. Works are enqueued to worker
> > pools on each CPU, and each CPU respectively schedules the works, so the ordering
> > of works among CPUs is not guaranteed.
> >
> > [1]
> > On UC/UD connections, responder does not check the psn of inbound packets,
> > so the payloads can be copied to MRs without checking the order. If there are
> > works that write to overlapping memory locations, they can potentially cause
> > data corruption depending the order.
> >
> > [2]
> > On RC connections, responder checks the psn, and drops the packet if it is not
> > the expected one. Requester can retransmit the request in this case, so the order
> > seems to be guaranteed for RC.
> >
> > However, responder updates the next expected psn (qp->resp.psn) BEFORE
> > replying an ACK packet. If the work is preempted soon after storing the next psn,
> > another work on another CPU can potentially reply another ACK packet earlier.
> > This behaviour is against the spec.
> > Cf. IB Specification Vol 1-Release-1.5 " 9.5 TRANSACTION ORDERING"
> >
> > [3]
> > Again on RC connections, the next expected psn (qp->resp.psn) can be
> > loaded and stored at the same time from different threads. It seems we
> > have to use a synchronization method, perhaps like READ_ONCE() and
> > WRITE_ONCE() macros, to prevent loading an old value. This one is just an
> > example; there can be other variables that need similar consideration.
> >
> >
> > All the problems above can be solved by making the work queue single-
> > threaded. We can do it by using flags=WQ_UNBOUND and max_active=1
> > for alloc_workqueue(), but this should be the last resort since this spoils
> > the performance benefit of work queue.
> >
> > I am not sure what we can do with [1] right now.
> > For [2] and [3], we could just move the update of psn later than the ack reply,
> > and use *_ONCE() macros for shared variables.
> >
> > Thanks,
> > Daisuke
> >
> >>
> >> This version of the patch series drops the tasklet version as an option
> >> but keeps the option of switching between the workqueue and inline
> >> versions.
> >>
> >> This patch series is derived from an earlier patch set developed by
> >> Ian Ziemba at HPE which is used in some Lustre storage clients attached
> >> to Lustre servers with hard RoCE v2 NICs.
> >>
> >> It is based on the current version of wip/jgg-for-next.
> >>
> >> v3:
> >> Link: https://lore.kernel.org/linux-rdma/202210220559.f7taTL8S-lkp@intel.com/
> >> The v3 version drops the first few patches which have already been accepted
> >> in for-next. It also drops the last patch of the v2 version which
> >> introduced module parameters to select between the task interfaces. It also
> >> drops the tasklet version entirely. It fixes a minor error caught by
> >> the kernel test robot <lkp@intel.com> with a missing static declaration.
> >>
> >> v2:
> >> The v2 version of the patch set has some minor changes that address
> >> comments from Leon Romanovsky regarding locking of the valid parameter
> >> and the setup parameters for alloc_workqueue. It also has one
> >> additional cleanup patch.
> >>
> >> Bob Pearson (13):
> >>   RDMA/rxe: Make task interface pluggable
> >>   RDMA/rxe: Split rxe_drain_resp_pkts()
> >>   RDMA/rxe: Simplify reset state handling in rxe_resp.c
> >>   RDMA/rxe: Handle qp error in rxe_resp.c
> >>   RDMA/rxe: Cleanup comp tasks in rxe_qp.c
> >>   RDMA/rxe: Remove __rxe_do_task()
> >>   RDMA/rxe: Make tasks schedule each other
> >>   RDMA/rxe: Implement disable/enable_task()
> >>   RDMA/rxe: Replace TASK_STATE_START by TASK_STATE_IDLE
> >>   RDMA/rxe: Replace task->destroyed by task state INVALID.
> >>   RDMA/rxe: Add workqueue support for tasks
> >>   RDMA/rxe: Make WORKQUEUE default for RC tasks
> >>   RDMA/rxe: Remove tasklets from rxe_task.c
> >>
> >>  drivers/infiniband/sw/rxe/rxe.c      |   9 +-
> >>  drivers/infiniband/sw/rxe/rxe_comp.c |  24 ++-
> >>  drivers/infiniband/sw/rxe/rxe_qp.c   |  80 ++++-----
> >>  drivers/infiniband/sw/rxe/rxe_req.c  |   4 +-
> >>  drivers/infiniband/sw/rxe/rxe_resp.c |  70 +++++---
> >>  drivers/infiniband/sw/rxe/rxe_task.c | 258 +++++++++++++++++++--------
> >>  drivers/infiniband/sw/rxe/rxe_task.h |  56 +++---
> >>  7 files changed, 329 insertions(+), 172 deletions(-)
> >>
> >>
> >> base-commit: 692373d186205dfb1b56f35f22702412d94d9420
> >> --
> >> 2.34.1
> >


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for-next v3 00/13] Implement work queues for rdma_rxe
  2022-11-04  4:59     ` Daisuke Matsuda (Fujitsu)
@ 2022-11-05 21:15       ` Bob Pearson
  2022-11-07  8:21         ` Daisuke Matsuda (Fujitsu)
  0 siblings, 1 reply; 23+ messages in thread
From: Bob Pearson @ 2022-11-05 21:15 UTC (permalink / raw)
  To: Daisuke Matsuda (Fujitsu), jgg, leon, zyjzyj2000, jhack, linux-rdma

On 11/3/22 23:59, Daisuke Matsuda (Fujitsu) wrote:
> On Wed, Nov 2, 2022 8:21 PM Bob Pearson wrote:
>> On 11/2/22 05:17, Daisuke Matsuda (Fujitsu) wrote:
>>> On Sat, Oct 29, 2022 12:10 PM Bob Pearson wrote:
>>>> This patch series implements work queues as an alternative for
>>>> the main tasklets in the rdma_rxe driver. The patch series starts
>>>> with a patch that makes the internal API for task execution pluggable
>>>> and implements an inline and a tasklet based set of functions.
>>>> The remaining patches cleanup the qp reset and error code in the
>>>> three tasklets and modify the locking logic to prevent making
>>>> multiple calls to the tasklet scheduling routine. After
>>>> this preparation the work queue equivalent set of functions is
>>>> added and the tasklet version is dropped.
>>>
>>> Thank you for posting the 3rd series.
>>> It looks fine at a glance, but now I am concerned about problems
>>> that can be potentially caused by concurrency.
>>>
>>>>
>>>> The advantages of the work queue version of deferred task execution
>>>> is mainly that the work queue variant has much better scalability
>>>> and overall performance than the tasklet variant.  The perftest
>>>> microbenchmarks in local loopback mode (not a very realistic test
>>>> case) can reach approximately 100Gb/sec with work queues compared to
>>>> about 16Gb/sec for tasklets.
>>>
>>> As you wrote, the advantage of work queue version is that the number works
>>> that can run parallelly scales with the number of logical CPUs. However, the
>>> dispatched works (rxe_requester, rxe_responder, and rxe_completer) are
>>> designed for serial execution on tasklet, so we must not rely on them functioning
>>> properly on parallel execution.
>>
>> Work queues are serial for each separate work task just like tasklets. There isn't
>> a problem here. The tasklets for different tasks can run in parallel but tend to
>> do so less than work queue tasks. The reason is that tasklets are scheduled by
>> default on the same cpu as the thread that scheduled it while work queues are scheduled
>> by the kernel scheduler and get spread around.
> 
> =====
> rxe_wq = alloc_workqueue("rxe_wq", WQ_CPU_INTENSIVE, WQ_MAX_ACTIVE);
> =====
> You are using the WQ_CPU_INTENSIVE flag. This allows works to be scheduled by
> the system scheduler, but each work is still enqueued to worker pools of each CPU
> and thus bound to the CPU the issuer is running on. It seems the behaviour you
> expect can be achieved by the WQ_UNBOUND flag. Unbound work items will run
> on any CPU at the cost of cache utilization.
> 
> Two of the same tasklets never run concurrently on two different processors by nature,
> but that is not the case with work queues. If two softirqs running on different CPUs
> enqueue responder works at almost the same time, it is possible that they are dispatched
> and run on the different CPUs at the same time. I mean the problems may arise in such
> a situation.
> 
> Please let me know if I missed anything. I referred to the following document.
> The easiest solution is to use @flags= WQ_UNBOUND and @max_active=1 to let works
> run serially.
> cf. https://www.kernel.org/doc/html/latest/core-api/workqueue.html
> 
> Thanks,
> Daisuke
> 
According to this:

    Workqueue guarantees that a work item cannot be re-entrant if the following conditions hold
    after a work item gets queued:

        The work function hasn’t been changed.

        No one queues the work item to another workqueue.

        The work item hasn’t been reinitiated.

    In other words, if the above conditions hold, the work item is guaranteed to be executed by at
    most one worker system-wide at any given time.

    Note that requeuing the work item (to the same queue) in the self function doesn’t break these
    conditions, so it’s safe to do. Otherwise, caution is required when breaking the conditions
    inside a work function.

It should be OK. Each work item checks the state under lock before scheduling the item
and, if it is free, moves it to busy and then schedules it. Only one instance of a work
item should be running at a time.
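
Roughly like this (a condensed sketch of the rxe_task.c logic in this series;
the body of task_is_idle() here is my paraphrase, not a verbatim copy, and the
exact state names may differ):

=====
static bool task_is_idle(struct rxe_task *task)
{
	bool idle;

	/* check and change the state under the task lock so that only
	 * one caller can move the task from idle to busy
	 */
	spin_lock_bh(&task->lock);
	idle = (task->state == TASK_STATE_IDLE);
	if (idle)
		task->state = TASK_STATE_BUSY;
	spin_unlock_bh(&task->lock);

	return idle;
}

static void work_sched(struct rxe_task *task)
{
	if (task_is_idle(task))
		queue_work(rxe_wq, &task->work);
}
=====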

I only know what I see from running top. It seems that the work items do get spread out over
time on the cpus.

The CPU_INTENSIVE flag is certainly correct for our application, which will run all the
cpus at 100% for extended periods of time. We are benchmarking storage with IOR.

Bob

>>>
>>> There could be 3 problems, which stem from the fact that works are not necessarily
>>> executed in the same order the packets are received. Works are enqueued to worker
>>> pools on each CPU, and each CPU respectively schedules the works, so the ordering
>>> of works among CPUs is not guaranteed.
>>>
>>> [1]
>>> On UC/UD connections, responder does not check the psn of inbound packets,
>>> so the payloads can be copied to MRs without checking the order. If there are
>>> works that write to overlapping memory locations, they can potentially cause
>>> data corruption depending the order.
>>>
>>> [2]
>>> On RC connections, responder checks the psn, and drops the packet if it is not
>>> the expected one. Requester can retransmit the request in this case, so the order
>>> seems to be guaranteed for RC.
>>>
>>> However, responder updates the next expected psn (qp->resp.psn) BEFORE
>>> replying an ACK packet. If the work is preempted soon after storing the next psn,
>>> another work on another CPU can potentially reply another ACK packet earlier.
>>> This behaviour is against the spec.
>>> Cf. IB Specification Vol 1-Release-1.5 " 9.5 TRANSACTION ORDERING"
>>>
>>> [3]
>>> Again on RC connections, the next expected psn (qp->resp.psn) can be
>>> loaded and stored at the same time from different threads. It seems we
>>> have to use a synchronization method, perhaps like READ_ONCE() and
>>> WRITE_ONCE() macros, to prevent loading an old value. This one is just an
>>> example; there can be other variables that need similar consideration.
>>>
>>>
>>> All the problems above can be solved by making the work queue single-
>>> threaded. We can do it by using flags=WQ_UNBOUND and max_active=1
>>> for alloc_workqueue(), but this should be the last resort since this spoils
>>> the performance benefit of work queue.
>>>
>>> I am not sure what we can do with [1] right now.
>>> For [2] and [3], we could just move the update of psn later than the ack reply,
>>> and use *_ONCE() macros for shared variables.
>>>
>>> Thanks,
>>> Daisuke
>>>
>>>>
>>>> This version of the patch series drops the tasklet version as an option
>>>> but keeps the option of switching between the workqueue and inline
>>>> versions.
>>>>
>>>> This patch series is derived from an earlier patch set developed by
>>>> Ian Ziemba at HPE which is used in some Lustre storage clients attached
>>>> to Lustre servers with hard RoCE v2 NICs.
>>>>
>>>> It is based on the current version of wip/jgg-for-next.
>>>>
>>>> v3:
>>>> Link: https://lore.kernel.org/linux-rdma/202210220559.f7taTL8S-lkp@intel.com/
>>>> The v3 version drops the first few patches which have already been accepted
>>>> in for-next. It also drops the last patch of the v2 version which
>>>> introduced module parameters to select between the task interfaces. It also
>>>> drops the tasklet version entirely. It fixes a minor error caught by
>>>> the kernel test robot <lkp@intel.com> with a missing static declaration.
>>>>
>>>> v2:
>>>> The v2 version of the patch set has some minor changes that address
>>>> comments from Leon Romanovsky regarding locking of the valid parameter
>>>> and the setup parameters for alloc_workqueue. It also has one
>>>> additional cleanup patch.
>>>>
>>>> Bob Pearson (13):
>>>>   RDMA/rxe: Make task interface pluggable
>>>>   RDMA/rxe: Split rxe_drain_resp_pkts()
>>>>   RDMA/rxe: Simplify reset state handling in rxe_resp.c
>>>>   RDMA/rxe: Handle qp error in rxe_resp.c
>>>>   RDMA/rxe: Cleanup comp tasks in rxe_qp.c
>>>>   RDMA/rxe: Remove __rxe_do_task()
>>>>   RDMA/rxe: Make tasks schedule each other
>>>>   RDMA/rxe: Implement disable/enable_task()
>>>>   RDMA/rxe: Replace TASK_STATE_START by TASK_STATE_IDLE
>>>>   RDMA/rxe: Replace task->destroyed by task state INVALID.
>>>>   RDMA/rxe: Add workqueue support for tasks
>>>>   RDMA/rxe: Make WORKQUEUE default for RC tasks
>>>>   RDMA/rxe: Remove tasklets from rxe_task.c
>>>>
>>>>  drivers/infiniband/sw/rxe/rxe.c      |   9 +-
>>>>  drivers/infiniband/sw/rxe/rxe_comp.c |  24 ++-
>>>>  drivers/infiniband/sw/rxe/rxe_qp.c   |  80 ++++-----
>>>>  drivers/infiniband/sw/rxe/rxe_req.c  |   4 +-
>>>>  drivers/infiniband/sw/rxe/rxe_resp.c |  70 +++++---
>>>>  drivers/infiniband/sw/rxe/rxe_task.c | 258 +++++++++++++++++++--------
>>>>  drivers/infiniband/sw/rxe/rxe_task.h |  56 +++---
>>>>  7 files changed, 329 insertions(+), 172 deletions(-)
>>>>
>>>>
>>>> base-commit: 692373d186205dfb1b56f35f22702412d94d9420
>>>> --
>>>> 2.34.1
>>>
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [PATCH for-next v3 00/13] Implement work queues for rdma_rxe
  2022-11-05 21:15       ` Bob Pearson
@ 2022-11-07  8:21         ` Daisuke Matsuda (Fujitsu)
  2022-11-07 16:06           ` Bob Pearson
  0 siblings, 1 reply; 23+ messages in thread
From: Daisuke Matsuda (Fujitsu) @ 2022-11-07  8:21 UTC (permalink / raw)
  To: 'Bob Pearson', jgg, leon, zyjzyj2000, jhack, linux-rdma

On Sun, Nov 6, 2022 6:15 AM Bob Pearson
> On 11/3/22 23:59, Daisuke Matsuda (Fujitsu) wrote:
> > On Wed, Nov 2, 2022 8:21 PM Bob Pearson wrote:
> >> On 11/2/22 05:17, Daisuke Matsuda (Fujitsu) wrote:
> >>> On Sat, Oct 29, 2022 12:10 PM Bob Pearson wrote:
> >>>> This patch series implements work queues as an alternative for
> >>>> the main tasklets in the rdma_rxe driver. The patch series starts
> >>>> with a patch that makes the internal API for task execution pluggable
> >>>> and implements an inline and a tasklet based set of functions.
> >>>> The remaining patches cleanup the qp reset and error code in the
> >>>> three tasklets and modify the locking logic to prevent making
> >>>> multiple calls to the tasklet scheduling routine. After
> >>>> this preparation the work queue equivalent set of functions is
> >>>> added and the tasklet version is dropped.
> >>>
> >>> Thank you for posting the 3rd series.
> >>> It looks fine at a glance, but now I am concerned about problems
> >>> that can be potentially caused by concurrency.
> >>>
> >>>>
> >>>> The advantages of the work queue version of deferred task execution
> >>>> is mainly that the work queue variant has much better scalability
> >>>> and overall performance than the tasklet variant.  The perftest
> >>>> microbenchmarks in local loopback mode (not a very realistic test
> >>>> case) can reach approximately 100Gb/sec with work queues compared to
> >>>> about 16Gb/sec for tasklets.
> >>>
> >>> As you wrote, the advantage of work queue version is that the number works
> >>> that can run parallelly scales with the number of logical CPUs. However, the
> >>> dispatched works (rxe_requester, rxe_responder, and rxe_completer) are
> >>> designed for serial execution on tasklet, so we must not rely on them functioning
> >>> properly on parallel execution.
> >>
> >> Work queues are serial for each separate work task just like tasklets. There isn't
> >> a problem here. The tasklets for different tasks can run in parallel but tend to
> >> do so less than work queue tasks. The reason is that tasklets are scheduled by
> >> default on the same cpu as the thread that scheduled it while work queues are scheduled
> >> by the kernel scheduler and get spread around.
> >
> > =====
> > rxe_wq = alloc_workqueue("rxe_wq", WQ_CPU_INTENSIVE, WQ_MAX_ACTIVE);
> > =====
> > You are using the WQ_CPU_INTENSIVE flag. This allows works to be scheduled by
> > the system scheduler, but each work is still enqueued to worker pools of each CPU
> > and thus bound to the CPU the issuer is running on. It seems the behaviour you
> > expect can be achieved by the WQ_UNBOUND flag. Unbound work items will run
> > on any CPU at the cost of cache utilization.
> >
> > Two of the same tasklets never run concurrently on two different processors by nature,
> > but that is not the case with work queues. If two softirqs running on different CPUs
> > enqueue responder works at almost the same time, it is possible that they are dispatched
> > and run on the different CPUs at the same time. I mean the problems may arise in such
> > a situation.
> >
> > Please let me know if I missed anything. I referred to the following document.
> > The easiest solution is to use @flags= WQ_UNBOUND and @max_active=1 to let works
> > run serially.
> > cf. https://www.kernel.org/doc/html/latest/core-api/workqueue.html
> >
> > Thanks,
> > Daisuke
> >
> According to this:
> 
>     Workqueue guarantees that a work item cannot be re-entrant if the following conditions hold
>     after a work item gets queued:
> 
>         The work function hasn’t been changed.
> 
>         No one queues the work item to another workqueue.
> 
>         The work item hasn’t been reinitiated.
> 
>     In other words, if the above conditions hold, the work item is guaranteed to be executed by at
>     most one worker system-wide at any given time.
> 
>     Note that requeuing the work item (to the same queue) in the self function doesn’t break these
>     conditions, so it’s safe to do. Otherwise, caution is required when breaking the conditions
>     inside a work function.
> 
> I should be OK. Each work item checks the state under lock before scheduling the item and
> if it is free moves it to busy and then schedules it. Only one instance of a work item
> at a time should be running.

Thank you for the explanation.
Per-qp work items should meet the three conditions. That is what I was missing.
Now I see. You are correct.

> 
> I only know what I see from running top. It seems that the work items do get spread out over
> time on the cpus.

It seems process_one_work() schedules items for both UNBOUND and CPU_INTENSIVE
workers in the same way. This is not stated explicitly in the document.

> 
> The CPU_INTENSIVE is certainly correct for our application which will run all the cpus at
> 100% for extended periods of time. We are benchmarking storage with IOR.

It is OK with me. I have not come up with any situations where the CPU_INTENSIVE
flag bothers other rxe users.

Thanks,
Daisuke

> 
> Bob
> 
> >>>
> >>> There could be 3 problems, which stem from the fact that works are not necessarily
> >>> executed in the same order the packets are received. Works are enqueued to worker
> >>> pools on each CPU, and each CPU respectively schedules the works, so the ordering
> >>> of works among CPUs is not guaranteed.
> >>>
> >>> [1]
> >>> On UC/UD connections, responder does not check the psn of inbound packets,
> >>> so the payloads can be copied to MRs without checking the order. If there are
> >>> works that write to overlapping memory locations, they can potentially cause
> >>> data corruption depending the order.
> >>>
> >>> [2]
> >>> On RC connections, responder checks the psn, and drops the packet if it is not
> >>> the expected one. Requester can retransmit the request in this case, so the order
> >>> seems to be guaranteed for RC.
> >>>
> >>> However, responder updates the next expected psn (qp->resp.psn) BEFORE
> >>> replying an ACK packet. If the work is preempted soon after storing the next psn,
> >>> another work on another CPU can potentially reply another ACK packet earlier.
> >>> This behaviour is against the spec.
> >>> Cf. IB Specification Vol 1-Release-1.5 " 9.5 TRANSACTION ORDERING"
> >>>
> >>> [3]
> >>> Again on RC connections, the next expected psn (qp->resp.psn) can be
> >>> loaded and stored at the same time from different threads. It seems we
> >>> have to use a synchronization method, perhaps like READ_ONCE() and
> >>> WRITE_ONCE() macros, to prevent loading an old value. This one is just an
> >>> example; there can be other variables that need similar consideration.
> >>>
> >>>
> >>> All the problems above can be solved by making the work queue single-
> >>> threaded. We can do it by using flags=WQ_UNBOUND and max_active=1
> >>> for alloc_workqueue(), but this should be the last resort since this spoils
> >>> the performance benefit of work queue.
> >>>
> >>> I am not sure what we can do with [1] right now.
> >>> For [2] and [3], we could just move the update of psn later than the ack reply,
> >>> and use *_ONCE() macros for shared variables.
> >>>
> >>> Thanks,
> >>> Daisuke
> >>>
> >>>>
> >>>> This version of the patch series drops the tasklet version as an option
> >>>> but keeps the option of switching between the workqueue and inline
> >>>> versions.
> >>>>
> >>>> This patch series is derived from an earlier patch set developed by
> >>>> Ian Ziemba at HPE which is used in some Lustre storage clients attached
> >>>> to Lustre servers with hard RoCE v2 NICs.
> >>>>
> >>>> It is based on the current version of wip/jgg-for-next.
> >>>>
> >>>> v3:
> >>>> Link: https://lore.kernel.org/linux-rdma/202210220559.f7taTL8S-lkp@intel.com/
> >>>> The v3 version drops the first few patches which have already been accepted
> >>>> in for-next. It also drops the last patch of the v2 version which
> >>>> introduced module parameters to select between the task interfaces. It also
> >>>> drops the tasklet version entirely. It fixes a minor error caught by
> >>>> the kernel test robot <lkp@intel.com> with a missing static declaration.
> >>>>
> >>>> v2:
> >>>> The v2 version of the patch set has some minor changes that address
> >>>> comments from Leon Romanovsky regarding locking of the valid parameter
> >>>> and the setup parameters for alloc_workqueue. It also has one
> >>>> additional cleanup patch.
> >>>>
> >>>> Bob Pearson (13):
> >>>>   RDMA/rxe: Make task interface pluggable
> >>>>   RDMA/rxe: Split rxe_drain_resp_pkts()
> >>>>   RDMA/rxe: Simplify reset state handling in rxe_resp.c
> >>>>   RDMA/rxe: Handle qp error in rxe_resp.c
> >>>>   RDMA/rxe: Cleanup comp tasks in rxe_qp.c
> >>>>   RDMA/rxe: Remove __rxe_do_task()
> >>>>   RDMA/rxe: Make tasks schedule each other
> >>>>   RDMA/rxe: Implement disable/enable_task()
> >>>>   RDMA/rxe: Replace TASK_STATE_START by TASK_STATE_IDLE
> >>>>   RDMA/rxe: Replace task->destroyed by task state INVALID.
> >>>>   RDMA/rxe: Add workqueue support for tasks
> >>>>   RDMA/rxe: Make WORKQUEUE default for RC tasks
> >>>>   RDMA/rxe: Remove tasklets from rxe_task.c
> >>>>
> >>>>  drivers/infiniband/sw/rxe/rxe.c      |   9 +-
> >>>>  drivers/infiniband/sw/rxe/rxe_comp.c |  24 ++-
> >>>>  drivers/infiniband/sw/rxe/rxe_qp.c   |  80 ++++-----
> >>>>  drivers/infiniband/sw/rxe/rxe_req.c  |   4 +-
> >>>>  drivers/infiniband/sw/rxe/rxe_resp.c |  70 +++++---
> >>>>  drivers/infiniband/sw/rxe/rxe_task.c | 258 +++++++++++++++++++--------
> >>>>  drivers/infiniband/sw/rxe/rxe_task.h |  56 +++---
> >>>>  7 files changed, 329 insertions(+), 172 deletions(-)
> >>>>
> >>>>
> >>>> base-commit: 692373d186205dfb1b56f35f22702412d94d9420
> >>>> --
> >>>> 2.34.1
> >>>
> >


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for-next v3 00/13] Implement work queues for rdma_rxe
  2022-11-07  8:21         ` Daisuke Matsuda (Fujitsu)
@ 2022-11-07 16:06           ` Bob Pearson
  0 siblings, 0 replies; 23+ messages in thread
From: Bob Pearson @ 2022-11-07 16:06 UTC (permalink / raw)
  To: Daisuke Matsuda (Fujitsu), jgg, leon, zyjzyj2000, jhack, linux-rdma

On 11/7/22 02:21, Daisuke Matsuda (Fujitsu) wrote:
> On Sun, Nov 6, 2022 6:15 AM Bob Pearson
>> On 11/3/22 23:59, Daisuke Matsuda (Fujitsu) wrote:
>>> On Wed, Nov 2, 2022 8:21 PM Bob Pearson wrote:
>>>> On 11/2/22 05:17, Daisuke Matsuda (Fujitsu) wrote:
>>>>> On Sat, Oct 29, 2022 12:10 PM Bob Pearson wrote:
>>>>>> This patch series implements work queues as an alternative for
>>>>>> the main tasklets in the rdma_rxe driver. The patch series starts
>>>>>> with a patch that makes the internal API for task execution pluggable
>>>>>> and implements an inline and a tasklet based set of functions.
>>>>>> The remaining patches cleanup the qp reset and error code in the
>>>>>> three tasklets and modify the locking logic to prevent making
>>>>>> multiple calls to the tasklet scheduling routine. After
>>>>>> this preparation the work queue equivalent set of functions is
>>>>>> added and the tasklet version is dropped.
>>>>>
>>>>> Thank you for posting the 3rd series.
>>>>> It looks fine at a glance, but now I am concerned about problems
>>>>> that can be potentially caused by concurrency.
>>>>>
>>>>>>
>>>>>> The advantages of the work queue version of deferred task execution
>>>>>> is mainly that the work queue variant has much better scalability
>>>>>> and overall performance than the tasklet variant.  The perftest
>>>>>> microbenchmarks in local loopback mode (not a very realistic test
>>>>>> case) can reach approximately 100Gb/sec with work queues compared to
>>>>>> about 16Gb/sec for tasklets.
>>>>>
>>>>> As you wrote, the advantage of work queue version is that the number works
>>>>> that can run parallelly scales with the number of logical CPUs. However, the
>>>>> dispatched works (rxe_requester, rxe_responder, and rxe_completer) are
>>>>> designed for serial execution on tasklet, so we must not rely on them functioning
>>>>> properly on parallel execution.
>>>>
>>>> Work queues are serial for each separate work task just like tasklets. There isn't
>>>> a problem here. The tasklets for different tasks can run in parallel but tend to
>>>> do so less than work queue tasks. The reason is that tasklets are scheduled by
>>>> default on the same cpu as the thread that scheduled it while work queues are scheduled
>>>> by the kernel scheduler and get spread around.
>>>
>>> =====
>>> rxe_wq = alloc_workqueue("rxe_wq", WQ_CPU_INTENSIVE, WQ_MAX_ACTIVE);
>>> =====
>>> You are using the WQ_CPU_INTENSIVE flag. This allows works to be scheduled by
>>> the system scheduler, but each work is still enqueued to worker pools of each CPU
>>> and thus bound to the CPU the issuer is running on. It seems the behaviour you
>>> expect can be achieved by the WQ_UNBOUND flag. Unbound work items will run
>>> on any CPU at the cost of cache utilization.
>>>
>>> Two of the same tasklets never run concurrently on two different processors by nature,
>>> but that is not the case with work queues. If two softirqs running on different CPUs
>>> enqueue responder works at almost the same time, it is possible that they are dispatched
>>> and run on the different CPUs at the same time. I mean the problems may arise in such
>>> a situation.
>>>
>>> Please let me know if I missed anything. I referred to the following document.
>>> The easiest solution is to use @flags= WQ_UNBOUND and @max_active=1 to let works
>>> run serially.
>>> cf. https://www.kernel.org/doc/html/latest/core-api/workqueue.html
>>>
>>> Thanks,
>>> Daisuke
>>>
>> According to this:
>>
>>     Workqueue guarantees that a work item cannot be re-entrant if the following conditions hold
>>     after a work item gets queued:
>>
>>         The work function hasn’t been changed.
>>
>>         No one queues the work item to another workqueue.
>>
>>         The work item hasn’t been reinitiated.
>>
>>     In other words, if the above conditions hold, the work item is guaranteed to be executed by at
>>     most one worker system-wide at any given time.
>>
>>     Note that requeuing the work item (to the same queue) in the self function doesn’t break these
>>     conditions, so it’s safe to do. Otherwise, caution is required when breaking the conditions
>>     inside a work function.
>>
>> I should be OK. Each work item checks the state under lock before scheduling the item and
>> if it is free moves it to busy and then schedules it. Only one instance of a work item
>> at a time should be running.
> 
> Thank you for the explanation.
> Per-qp work items should meet the three conditions. That is what I have missing.
> Now I see. You are correct.
> 
>>
>> I only know what I see from running top. It seems that the work items do get spread out over
>> time on the cpus.
> 
> It seems process_one_work() schedules items for both UNBOUND and CPU_INTENSIVE
> workers in the same way. This is not stated explicitly in the document.
> 
>>
>> The CPU_INTENSIVE is certainly correct for our application which will run all the cpus at
>> 100% for extended periods of time. We are benchmarking storage with IOR.
> 
> It is OK with me. I have not come up with any situations where the CPU_INTENSIVE
> flag bothers other rxe users.
> 
> Thanks,
> Daisuke
> 
>>
>> Bob
>>
>>>>>
>>>>> There could be 3 problems, which stem from the fact that works are not necessarily
>>>>> executed in the same order the packets are received. Works are enqueued to worker
>>>>> pools on each CPU, and each CPU respectively schedules the works, so the ordering
>>>>> of works among CPUs is not guaranteed.
>>>>>
>>>>> [1]
>>>>> On UC/UD connections, responder does not check the psn of inbound packets,
>>>>> so the payloads can be copied to MRs without checking the order. If there are
>>>>> works that write to overlapping memory locations, they can potentially cause
>>>>> data corruption depending the order.
>>>>>
>>>>> [2]
>>>>> On RC connections, responder checks the psn, and drops the packet if it is not
>>>>> the expected one. Requester can retransmit the request in this case, so the order
>>>>> seems to be guaranteed for RC.
>>>>>
>>>>> However, responder updates the next expected psn (qp->resp.psn) BEFORE
>>>>> replying an ACK packet. If the work is preempted soon after storing the next psn,
>>>>> another work on another CPU can potentially reply another ACK packet earlier.
>>>>> This behaviour is against the spec.
>>>>> Cf. IB Specification Vol 1-Release-1.5 " 9.5 TRANSACTION ORDERING"
>>>>>
>>>>> [3]
>>>>> Again on RC connections, the next expected psn (qp->resp.psn) can be
>>>>> loaded and stored at the same time from different threads. It seems we
>>>>> have to use a synchronization method, perhaps like READ_ONCE() and
>>>>> WRITE_ONCE() macros, to prevent loading an old value. This one is just an
>>>>> example; there can be other variables that need similar consideration.
>>>>>
>>>>>
>>>>> All the problems above can be solved by making the work queue single-
>>>>> threaded. We can do it by using flags=WQ_UNBOUND and max_active=1
>>>>> for alloc_workqueue(), but this should be the last resort since this spoils
>>>>> the performance benefit of work queue.
>>>>>
>>>>> I am not sure what we can do with [1] right now.
>>>>> For [2] and [3], we could just move the update of psn later than the ack reply,
>>>>> and use *_ONCE() macros for shared variables.
>>>>>
>>>>> Thanks,
>>>>> Daisuke

Thank you for taking the time to review this.

Bob
>>>>>
>>>>>>
>>>>>> This version of the patch series drops the tasklet version as an option
>>>>>> but keeps the option of switching between the workqueue and inline
>>>>>> versions.
>>>>>>
>>>>>> This patch series is derived from an earlier patch set developed by
>>>>>> Ian Ziemba at HPE which is used in some Lustre storage clients attached
>>>>>> to Lustre servers with hard RoCE v2 NICs.
>>>>>>
>>>>>> It is based on the current version of wip/jgg-for-next.
>>>>>>
>>>>>> v3:
>>>>>> Link: https://lore.kernel.org/linux-rdma/202210220559.f7taTL8S-lkp@intel.com/
>>>>>> The v3 version drops the first few patches which have already been accepted
>>>>>> in for-next. It also drops the last patch of the v2 version which
>>>>>> introduced module parameters to select between the task interfaces. It also
>>>>>> drops the tasklet version entirely. It fixes a minor error caught by
>>>>>> the kernel test robot <lkp@intel.com> with a missing static declaration.
>>>>>>
>>>>>> v2:
>>>>>> The v2 version of the patch set has some minor changes that address
>>>>>> comments from Leon Romanovsky regarding locking of the valid parameter
>>>>>> and the setup parameters for alloc_workqueue. It also has one
>>>>>> additional cleanup patch.
>>>>>>
>>>>>> Bob Pearson (13):
>>>>>>   RDMA/rxe: Make task interface pluggable
>>>>>>   RDMA/rxe: Split rxe_drain_resp_pkts()
>>>>>>   RDMA/rxe: Simplify reset state handling in rxe_resp.c
>>>>>>   RDMA/rxe: Handle qp error in rxe_resp.c
>>>>>>   RDMA/rxe: Cleanup comp tasks in rxe_qp.c
>>>>>>   RDMA/rxe: Remove __rxe_do_task()
>>>>>>   RDMA/rxe: Make tasks schedule each other
>>>>>>   RDMA/rxe: Implement disable/enable_task()
>>>>>>   RDMA/rxe: Replace TASK_STATE_START by TASK_STATE_IDLE
>>>>>>   RDMA/rxe: Replace task->destroyed by task state INVALID.
>>>>>>   RDMA/rxe: Add workqueue support for tasks
>>>>>>   RDMA/rxe: Make WORKQUEUE default for RC tasks
>>>>>>   RDMA/rxe: Remove tasklets from rxe_task.c
>>>>>>
>>>>>>  drivers/infiniband/sw/rxe/rxe.c      |   9 +-
>>>>>>  drivers/infiniband/sw/rxe/rxe_comp.c |  24 ++-
>>>>>>  drivers/infiniband/sw/rxe/rxe_qp.c   |  80 ++++-----
>>>>>>  drivers/infiniband/sw/rxe/rxe_req.c  |   4 +-
>>>>>>  drivers/infiniband/sw/rxe/rxe_resp.c |  70 +++++---
>>>>>>  drivers/infiniband/sw/rxe/rxe_task.c | 258 +++++++++++++++++++--------
>>>>>>  drivers/infiniband/sw/rxe/rxe_task.h |  56 +++---
>>>>>>  7 files changed, 329 insertions(+), 172 deletions(-)
>>>>>>
>>>>>>
>>>>>> base-commit: 692373d186205dfb1b56f35f22702412d94d9420
>>>>>> --
>>>>>> 2.34.1
>>>>>
>>>
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH for-next v3 01/13] RDMA/rxe: Make task interface pluggable
  2022-10-29  3:09 ` [PATCH for-next v3 01/13] RDMA/rxe: Make task interface pluggable Bob Pearson
@ 2022-11-11  2:28   ` Yanjun Zhu
  0 siblings, 0 replies; 23+ messages in thread
From: Yanjun Zhu @ 2022-11-11  2:28 UTC (permalink / raw)
  To: Bob Pearson, jgg, leon, zyjzyj2000, jhack, linux-rdma; +Cc: Ian Ziemba

On 2022/10/29 11:09, Bob Pearson wrote:
> Make the internal interface to the task operations pluggable and
> add a new 'inline' type.
> 
> Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
> ---
>   drivers/infiniband/sw/rxe/rxe_qp.c   |   8 +-
>   drivers/infiniband/sw/rxe/rxe_task.c | 160 ++++++++++++++++++++++-----
>   drivers/infiniband/sw/rxe/rxe_task.h |  44 +++++---
>   3 files changed, 165 insertions(+), 47 deletions(-)
> 
> diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
> index 3f6d62a80bea..b5e108794aa1 100644
> --- a/drivers/infiniband/sw/rxe/rxe_qp.c
> +++ b/drivers/infiniband/sw/rxe/rxe_qp.c
> @@ -238,8 +238,10 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
>   
>   	skb_queue_head_init(&qp->req_pkts);
>   
> -	rxe_init_task(&qp->req.task, qp, rxe_requester);
> -	rxe_init_task(&qp->comp.task, qp, rxe_completer);
> +	rxe_init_task(&qp->req.task, qp, rxe_requester, RXE_TASK_TYPE_TASKLET);
> +	rxe_init_task(&qp->comp.task, qp, rxe_completer,
> +			(qp_type(qp) == IB_QPT_RC) ? RXE_TASK_TYPE_TASKLET :
> +						     RXE_TASK_TYPE_INLINE);
>   
>   	qp->qp_timeout_jiffies = 0; /* Can't be set for UD/UC in modify_qp */
>   	if (init->qp_type == IB_QPT_RC) {
> @@ -286,7 +288,7 @@ static int rxe_qp_init_resp(struct rxe_dev *rxe, struct rxe_qp *qp,
>   
>   	skb_queue_head_init(&qp->resp_pkts);
>   
> -	rxe_init_task(&qp->resp.task, qp, rxe_responder);
> +	rxe_init_task(&qp->resp.task, qp, rxe_responder, RXE_TASK_TYPE_TASKLET);
>   
>   	qp->resp.opcode		= OPCODE_NONE;
>   	qp->resp.msn		= 0;
> diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
> index 0208d833a41b..8dfbfa164eff 100644
> --- a/drivers/infiniband/sw/rxe/rxe_task.c
> +++ b/drivers/infiniband/sw/rxe/rxe_task.c
> @@ -24,12 +24,11 @@ int __rxe_do_task(struct rxe_task *task)
>    * a second caller finds the task already running
>    * but looks just after the last call to func
>    */
> -static void do_task(struct tasklet_struct *t)
> +static void do_task(struct rxe_task *task)
>   {
> +	unsigned int iterations = RXE_MAX_ITERATIONS;
>   	int cont;
>   	int ret;
> -	struct rxe_task *task = from_tasklet(task, t, tasklet);
> -	unsigned int iterations = RXE_MAX_ITERATIONS;
>   
>   	spin_lock_bh(&task->lock);
>   	switch (task->state) {
> @@ -90,28 +89,21 @@ static void do_task(struct tasklet_struct *t)
>   	task->ret = ret;
>   }
>   
> -int rxe_init_task(struct rxe_task *task, void *arg, int (*func)(void *))
> +static void disable_task(struct rxe_task *task)
>   {
> -	task->arg	= arg;
> -	task->func	= func;
> -	task->destroyed	= false;
> -
> -	tasklet_setup(&task->tasklet, do_task);
> -
> -	task->state = TASK_STATE_START;
> -	spin_lock_init(&task->lock);
> +	/* todo */
> +}
>   
> -	return 0;
> +static void enable_task(struct rxe_task *task)
> +{
> +	/* todo */
>   }
>   
> -void rxe_cleanup_task(struct rxe_task *task)
> +/* busy wait until any previous tasks are done */
> +static void cleanup_task(struct rxe_task *task)
>   {
>   	bool idle;
>   
> -	/*
> -	 * Mark the task, then wait for it to finish. It might be
> -	 * running in a non-tasklet (direct call) context.
> -	 */
>   	task->destroyed = true;
>   
>   	do {
> @@ -119,32 +111,144 @@ void rxe_cleanup_task(struct rxe_task *task)
>   		idle = (task->state == TASK_STATE_START);
>   		spin_unlock_bh(&task->lock);
>   	} while (!idle);
> +}
>   
> -	tasklet_kill(&task->tasklet);
> +/* silently treat schedule as inline for inline tasks */
> +static void inline_sched(struct rxe_task *task)
> +{
> +	do_task(task);
>   }
>   
> -void rxe_run_task(struct rxe_task *task)
> +static void inline_run(struct rxe_task *task)
>   {
> -	if (task->destroyed)
> -		return;
> +	do_task(task);
> +}
>   
> -	do_task(&task->tasklet);
> +static void inline_disable(struct rxe_task *task)
> +{
> +	disable_task(task);
>   }
>   
> -void rxe_sched_task(struct rxe_task *task)
> +static void inline_enable(struct rxe_task *task)
>   {
> -	if (task->destroyed)
> -		return;
> +	enable_task(task);
> +}
> +
> +static void inline_cleanup(struct rxe_task *task)
> +{
> +	cleanup_task(task);
> +}
> +
> +static const struct rxe_task_ops inline_ops = {
> +	.sched = inline_sched,
> +	.run = inline_run,
> +	.enable = inline_enable,
> +	.disable = inline_disable,
> +	.cleanup = inline_cleanup,
> +};
>   
> +static void inline_init(struct rxe_task *task)
> +{
> +	task->ops = &inline_ops;
> +}
> +
> +/* use tsklet_xxx to avoid name collisions with tasklet_xxx */
> +static void tsklet_sched(struct rxe_task *task)
> +{
>   	tasklet_schedule(&task->tasklet);
>   }
>   
> -void rxe_disable_task(struct rxe_task *task)
> +static void tsklet_do_task(struct tasklet_struct *tasklet)
>   {
> +	struct rxe_task *task = container_of(tasklet, typeof(*task), tasklet);
> +
> +	do_task(task);
> +}
> +
> +static void tsklet_run(struct rxe_task *task)
> +{
> +	do_task(task);
> +}
> +
> +static void tsklet_disable(struct rxe_task *task)
> +{
> +	disable_task(task);
>   	tasklet_disable(&task->tasklet);
>   }
>   
> -void rxe_enable_task(struct rxe_task *task)
> +static void tsklet_enable(struct rxe_task *task)
>   {
>   	tasklet_enable(&task->tasklet);
> +	enable_task(task);
> +}
> +
> +static void tsklet_cleanup(struct rxe_task *task)
> +{
> +	cleanup_task(task);
> +	tasklet_kill(&task->tasklet);
> +}
> +
> +static const struct rxe_task_ops tsklet_ops = {
> +	.sched = tsklet_sched,
> +	.run = tsklet_run,
> +	.enable = tsklet_enable,
> +	.disable = tsklet_disable,
> +	.cleanup = tsklet_cleanup,
> +};
> +
> +static void tsklet_init(struct rxe_task *task)
> +{
> +	tasklet_setup(&task->tasklet, tsklet_do_task);
> +	task->ops = &tsklet_ops;
> +}
> +
> +int rxe_init_task(struct rxe_task *task, void *arg, int (*func)(void *),
> +		  enum rxe_task_type type)
> +{
> +	task->arg	= arg;
> +	task->func	= func;
> +	task->destroyed	= false;
> +	task->type	= type;
> +	task->state	= TASK_STATE_START;
> +
> +	spin_lock_init(&task->lock);

About this spin_lock_init, I remembered this commit:
news://nntp.lore.kernel.org:119/20220710043709.707649-1-yanjun.zhu@linux.dev

Can this spin_lock_init be moved into the function rxe_qp_init_misc()?
That would avoid the "initialize spin locks before use" error.
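
If it helps, here is a minimal sketch of what that could look like (purely
illustrative, not from the patch; the helper name is hypothetical, and the
task/lock field names are taken from the quoted diff):

/* hypothetical helper, called early in rxe_qp_init_misc(), so the task
 * locks are valid before anything can schedule a task
 */
static void rxe_qp_init_task_locks(struct rxe_qp *qp)
{
	spin_lock_init(&qp->req.task.lock);
	spin_lock_init(&qp->comp.task.lock);
	spin_lock_init(&qp->resp.task.lock);
}

rxe_init_task() would then no longer need to call spin_lock_init() itself.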

Zhu Yanjun

> +
> +	switch (type) {
> +	case RXE_TASK_TYPE_INLINE:
> +		inline_init(task);
> +		break;
> +	case RXE_TASK_TYPE_TASKLET:
> +		tsklet_init(task);
> +		break;
> +	default:
> +		pr_debug("%s: invalid task type = %d\n", __func__, type);
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +void rxe_sched_task(struct rxe_task *task)
> +{
> +	task->ops->sched(task);
> +}
> +
> +void rxe_run_task(struct rxe_task *task)
> +{
> +	task->ops->run(task);
> +}
> +
> +void rxe_enable_task(struct rxe_task *task)
> +{
> +	task->ops->enable(task);
> +}
> +
> +void rxe_disable_task(struct rxe_task *task)
> +{
> +	task->ops->disable(task);
> +}
> +
> +void rxe_cleanup_task(struct rxe_task *task)
> +{
> +	task->ops->cleanup(task);
>   }
> diff --git a/drivers/infiniband/sw/rxe/rxe_task.h b/drivers/infiniband/sw/rxe/rxe_task.h
> index 7b88129702ac..31963129ff7a 100644
> --- a/drivers/infiniband/sw/rxe/rxe_task.h
> +++ b/drivers/infiniband/sw/rxe/rxe_task.h
> @@ -7,6 +7,21 @@
>   #ifndef RXE_TASK_H
>   #define RXE_TASK_H
>   
> +struct rxe_task;
> +
> +struct rxe_task_ops {
> +	void (*sched)(struct rxe_task *task);
> +	void (*run)(struct rxe_task *task);
> +	void (*disable)(struct rxe_task *task);
> +	void (*enable)(struct rxe_task *task);
> +	void (*cleanup)(struct rxe_task *task);
> +};
> +
> +enum rxe_task_type {
> +	RXE_TASK_TYPE_INLINE	= 0,
> +	RXE_TASK_TYPE_TASKLET	= 1,
> +};
> +
>   enum {
>   	TASK_STATE_START	= 0,
>   	TASK_STATE_BUSY		= 1,
> @@ -19,24 +34,19 @@ enum {
>    * called again.
>    */
>   struct rxe_task {
> -	struct tasklet_struct	tasklet;
> -	int			state;
> -	spinlock_t		lock;
> -	void			*arg;
> -	int			(*func)(void *arg);
> -	int			ret;
> -	bool			destroyed;
> +	struct tasklet_struct		tasklet;
> +	int				state;
> +	spinlock_t			lock;
> +	void				*arg;
> +	int				(*func)(void *arg);
> +	int				ret;
> +	bool				destroyed;
> +	const struct rxe_task_ops	*ops;
> +	enum rxe_task_type		type;
>   };
>   
> -/*
> - * init rxe_task structure
> - *	arg  => parameter to pass to fcn
> - *	func => function to call until it returns != 0
> - */
> -int rxe_init_task(struct rxe_task *task, void *arg, int (*func)(void *));
> -
> -/* cleanup task */
> -void rxe_cleanup_task(struct rxe_task *task);
> +int rxe_init_task(struct rxe_task *task, void *arg, int (*func)(void *),
> +		  enum rxe_task_type type);
>   
>   /*
>    * raw call to func in loop without any checking
> @@ -54,4 +64,6 @@ void rxe_disable_task(struct rxe_task *task);
>   /* allow task to run */
>   void rxe_enable_task(struct rxe_task *task);
>   
> +void rxe_cleanup_task(struct rxe_task *task);
> +
>   #endif /* RXE_TASK_H */



* Re: [PATCH for-next v3 03/13] RDMA/rxe: Simplify reset state handling in rxe_resp.c
  2022-10-29  3:10 ` [PATCH for-next v3 03/13] RDMA/rxe: Simplify reset state handling in rxe_resp.c Bob Pearson
@ 2022-11-11  3:04   ` Yanjun Zhu
  0 siblings, 0 replies; 23+ messages in thread
From: Yanjun Zhu @ 2022-11-11  3:04 UTC (permalink / raw)
  To: Bob Pearson, jgg, leon, zyjzyj2000, jhack, linux-rdma; +Cc: Ian Ziemba

On 2022/10/29 11:10, Bob Pearson wrote:
> Make rxe_responder() more like rxe_completer() and take qp reset
> handling out of the state machine.

From the RDMA spec, qp reset is one of the qp states. If rxe moves qp reset
handling out of the state machine while other devices still handle qp reset
in their state machines, will this make any difference to connections
between rxe and other ib devices, such as irdma and mlx devices?

rxe should still be able to make basic connections with other ib devices.

Zhu Yanjun

> 
> Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
> ---
>   drivers/infiniband/sw/rxe/rxe_resp.c | 12 +++---------
>   1 file changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
> index c32bc12cc82f..c4f365449aa5 100644
> --- a/drivers/infiniband/sw/rxe/rxe_resp.c
> +++ b/drivers/infiniband/sw/rxe/rxe_resp.c
> @@ -40,7 +40,6 @@ enum resp_states {
>   	RESPST_ERR_LENGTH,
>   	RESPST_ERR_CQ_OVERFLOW,
>   	RESPST_ERROR,
> -	RESPST_RESET,
>   	RESPST_DONE,
>   	RESPST_EXIT,
>   };
> @@ -75,7 +74,6 @@ static char *resp_state_name[] = {
>   	[RESPST_ERR_LENGTH]			= "ERR_LENGTH",
>   	[RESPST_ERR_CQ_OVERFLOW]		= "ERR_CQ_OVERFLOW",
>   	[RESPST_ERROR]				= "ERROR",
> -	[RESPST_RESET]				= "RESET",
>   	[RESPST_DONE]				= "DONE",
>   	[RESPST_EXIT]				= "EXIT",
>   };
> @@ -1281,8 +1279,9 @@ int rxe_responder(void *arg)
>   
>   	switch (qp->resp.state) {
>   	case QP_STATE_RESET:
> -		state = RESPST_RESET;
> -		break;
> +		rxe_drain_req_pkts(qp, false);
> +		qp->resp.wqe = NULL;
> +		goto exit;
>   
>   	default:
>   		state = RESPST_GET_REQ;
> @@ -1441,11 +1440,6 @@ int rxe_responder(void *arg)
>   
>   			goto exit;
>   
> -		case RESPST_RESET:
> -			rxe_drain_req_pkts(qp, false);
> -			qp->resp.wqe = NULL;
> -			goto exit;
> -
>   		case RESPST_ERROR:
>   			qp->resp.goto_error = 0;
>   			pr_debug("qp#%d moved to error state\n", qp_num(qp));



* RE: [PATCH for-next v3 00/13] Implement work queues for rdma_rxe
  2022-10-29  3:09 [PATCH for-next v3 00/13] Implement work queues for rdma_rxe Bob Pearson
                   ` (13 preceding siblings ...)
  2022-11-02 10:17 ` [PATCH for-next v3 00/13] Implement work queues for rdma_rxe Daisuke Matsuda (Fujitsu)
@ 2022-11-18  5:02 ` Daisuke Matsuda (Fujitsu)
  14 siblings, 0 replies; 23+ messages in thread
From: Daisuke Matsuda (Fujitsu) @ 2022-11-18  5:02 UTC (permalink / raw)
  To: 'Bob Pearson', jgg, leon, zyjzyj2000, jhack, linux-rdma

On Sat, Oct 29, 2022 12:10 PM Bob Pearson wrote:

<...>

> Bob Pearson (13):
>   RDMA/rxe: Make task interface pluggable
>   RDMA/rxe: Split rxe_drain_resp_pkts()
>   RDMA/rxe: Simplify reset state handling in rxe_resp.c
>   RDMA/rxe: Handle qp error in rxe_resp.c
>   RDMA/rxe: Cleanup comp tasks in rxe_qp.c
>   RDMA/rxe: Remove __rxe_do_task()
>   RDMA/rxe: Make tasks schedule each other
>   RDMA/rxe: Implement disable/enable_task()
>   RDMA/rxe: Replace TASK_STATE_START by TASK_STATE_IDLE
>   RDMA/rxe: Replace task->destroyed by task state INVALID.
>   RDMA/rxe: Add workqueue support for tasks
>   RDMA/rxe: Make WORKQUEUE default for RC tasks
>   RDMA/rxe: Remove tasklets from rxe_task.c

Hello Bob,

I have found a soft lockup issue reproducible with the rdma-core testcases.
It does not happen on the latest for-next tree, but it does with this patch
series applied, so we need to fix the issue before the series is merged.

I did the test on a VM with 8 CPUs. I fetched the latest rdma-core, built it,
and executed the following command inside the 'rdma-core' directory:
# while true; do ./build/bin/run_tests.py -v --dev rxe_ens6 --gid 1; sleep 2; done
(Please specify your 'dev' and 'gid' appropriately.)

Within 10 minutes, my console froze and became unresponsive, showing the
test progress below:
=====
test_create_ah (tests.test_addr.AHTest)
Test ibv_create_ah. ... ok
test_create_ah_roce (tests.test_addr.AHTest)
Verify that AH can't be created without GRH in RoCE ... ok
test_destroy_ah (tests.test_addr.AHTest)
Test ibv_destroy_ah. ... ok
test_atomic_cmp_and_swap (tests.test_atomic.AtomicTest) ... ok
test_atomic_fetch_and_add (tests.test_atomic.AtomicTest) ... ok
test_atomic_invalid_lkey (tests.test_atomic.AtomicTest) ...
=====
Note that this does not always happen; you may have to wait several minutes.

Here is the backtrace:
=====
[ 1212.135650] watchdog: BUG: soft lockup - CPU#3 stuck for 1017s! [python3:3428]
[ 1212.138144] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rpcrdma rdma_ucm ib_srpt ib_isert iscsi_target_mod target_core_mod ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_cm rdma_rxe ib_uverbs ip6_udp_tunnel udp_tunnel ib_core rfkill sunrpc intel_rapl_msr intel_rapl_common kvm_intel kvm irqbypass joydev nd_pmem virtio_balloon dax_pmem nd_btt i2c_piix4 pcspkr drm xfs libcrc32c sd_mod t10_pi sr_mod crc64_rocksoft_generic cdrom crc64_rocksoft crc64 sg nd_e820 ata_generic libnvdimm crct10dif_pclmul crc32_pclmul crc32c_intel ata_piix virtio_net libata ghash_clmulni_intel e1000 net_failover sha512_ssse3 failover virtio_console serio_raw dm_mirror dm_region_hash dm_log dm_mod fuse
[ 1212.147152] CPU: 3 PID: 3428 Comm: python3 Kdump: loaded Tainted: G             L     6.1.0-rc1+ #29
[ 1212.148425] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[ 1212.149754] RIP: 0010:__local_bh_enable_ip+0x26/0x70
[ 1212.150464] Code: 00 00 66 90 0f 1f 44 00 00 65 8b 05 c4 7e d2 4e a9 00 00 0f 00 75 31 83 ee 01 f7 de 65 01 35 b1 7e d2 4e 65 8b 05 aa 7e d2 4e <a9> 00 ff ff 00 74 1b 65 ff 0d 9c 7e d2 4e 65 8b 05 95 7e d2 4e 85
[ 1212.153081] RSP: 0018:ffffaf0b8054baf8 EFLAGS: 00000203
[ 1212.153822] RAX: 0000000000000001 RBX: ffff8d2189171450 RCX: 00000000ffffffff
[ 1212.154823] RDX: 0000000000000001 RSI: 00000000fffffe00 RDI: ffffffffc0b51baa
[ 1212.155826] RBP: ffff8d2189171474 R08: 0000011a38ad4004 R09: 0000000000000101
[ 1212.156862] R10: ffffffffb2c06100 R11: 0000000000000000 R12: 0000000000000000
[ 1212.157860] R13: ffff8d218df5eda0 R14: ffff8d2181058328 R15: 0000000000000000
[ 1212.158858] FS:  00007f8cec55f740(0000) GS:ffff8d23b5cc0000(0000) knlGS:0000000000000000
[ 1212.159989] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1212.160837] CR2: 00007f8ceaa15024 CR3: 0000000108c2c002 CR4: 0000000000060ee0
[ 1212.161841] Call Trace:
[ 1212.162205]  <TASK>
[ 1212.162522]  work_cleanup+0x3a/0x40 [rdma_rxe]
[ 1212.163172]  rxe_qp_do_cleanup+0x54/0x1e0 [rdma_rxe]
[ 1212.163888]  execute_in_process_context+0x23/0x70
[ 1212.164575]  __rxe_cleanup+0xc6/0x170 [rdma_rxe]
[ 1212.165245]  rxe_destroy_qp+0x28/0x40 [rdma_rxe]
[ 1212.165909]  ib_destroy_qp_user+0x90/0x1b0 [ib_core]
[ 1212.166646]  uverbs_free_qp+0x35/0x90 [ib_uverbs]
[ 1212.167333]  destroy_hw_idr_uobject+0x1e/0x50 [ib_uverbs]
[ 1212.168103]  uverbs_destroy_uobject+0x37/0x1c0 [ib_uverbs]
[ 1212.168899]  uobj_destroy+0x3c/0x80 [ib_uverbs]
[ 1212.169532]  ib_uverbs_run_method+0x203/0x320 [ib_uverbs]
[ 1212.170412]  ? uverbs_free_qp+0x90/0x90 [ib_uverbs]
[ 1212.171151]  ib_uverbs_cmd_verbs+0x172/0x220 [ib_uverbs]
[ 1212.171912]  ? free_unref_page_commit+0x7e/0x170
[ 1212.172583]  ? xa_destroy+0x82/0x110
[ 1212.173104]  ? kvfree_call_rcu+0x27d/0x310
[ 1212.173692]  ? ioctl_has_perm.constprop.0.isra.0+0xbd/0x120
[ 1212.174481]  ib_uverbs_ioctl+0xa4/0x110 [ib_uverbs]
[ 1212.175182]  __x64_sys_ioctl+0x8a/0xc0
[ 1212.175725]  do_syscall_64+0x3b/0x90
[ 1212.176243]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 1212.176973] RIP: 0033:0x7f8cebe3ec6b
[ 1212.177489] Code: 73 01 c3 48 8b 0d b5 b1 1b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 85 b1 1b 00 f7 d8 64 89 01 48
[ 1212.180072] RSP: 002b:00007ffee64927a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 1212.181150] RAX: ffffffffffffffda RBX: 00007ffee64928d8 RCX: 00007f8cebe3ec6b
[ 1212.182131] RDX: 00007ffee64928c0 RSI: 00000000c0181b01 RDI: 0000000000000003
[ 1212.183099] RBP: 00007ffee64928a0 R08: 000056424f8c0c70 R09: 000000000002ae30
[ 1212.184064] R10: 00007f8cead75e10 R11: 0000000000000246 R12: 00007ffee649287c
[ 1212.185061] R13: 0000000000000022 R14: 000056424f8c0560 R15: 00007f8cebb903d0
[ 1212.186031]  </TASK>

Message from syslogd@c9ibremote at Nov 17 17:03:12 ...
 kernel:watchdog: BUG: soft lockup - CPU#3 stuck for 1017s! [python3:3428]
=====
It seems that ibv_destroy_qp(3) was issued from userspace, right?
I am not very familiar with uverbs ;(

The process got stuck at the do-while loop below:
=====
rxe_task.c
---
/* busy wait until any previous tasks are done */
static void cleanup_task(struct rxe_task *task)
{
        bool busy;

        do {
                spin_lock_bh(&task->lock);
                busy = (task->state == TASK_STATE_BUSY ||
                        task->state == TASK_STATE_ARMED);
                if (!busy)
                        task->state = TASK_STATE_INVALID;
                spin_unlock_bh(&task->lock);
        } while (busy);
}
=====
Typically task->state is 0 (i.e. "TASK_STATE_IDLE") when we reach here,
but in the infinite loop, task->state was constantly 1 (i.e. "TASK_STATE_BUSY").

IMO, the bottom halves completed their work but left task->state at
"TASK_STATE_BUSY", so the ibv_destroy_qp(3) issued from userspace afterwards
got stuck, but I am not sure how this counter-intuitive state transition
occurs.
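
In case it helps narrow this down, here is a purely hypothetical debug
variant of cleanup_task() (not part of the series; all names are taken from
the code quoted above, and the 10-second warning interval is arbitrary):
=====
/* same logic as cleanup_task(), but warn periodically if the busy wait
 * is not making progress, and report the state it is stuck in
 */
static void cleanup_task(struct rxe_task *task)
{
        unsigned long warn_at = jiffies + 10 * HZ;
        int state;
        bool busy;

        do {
                spin_lock_bh(&task->lock);
                state = task->state;
                busy = (state == TASK_STATE_BUSY ||
                        state == TASK_STATE_ARMED);
                if (!busy)
                        task->state = TASK_STATE_INVALID;
                spin_unlock_bh(&task->lock);

                if (busy && time_after(jiffies, warn_at)) {
                        pr_warn("%s: still busy, task->state = %d\n",
                                __func__, state);
                        warn_at = jiffies + 10 * HZ;
                }
        } while (busy);
}
=====
At least this would confirm whether the task stays in TASK_STATE_BUSY forever
or keeps bouncing between states without ever going idle.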

Do you have any idea why this happens and how to fix it?
I cannot spend enough time inspecting this issue further right now,
but I will update you if I find anything helpful.

Thanks,
Daisuke

> 
>  drivers/infiniband/sw/rxe/rxe.c      |   9 +-
>  drivers/infiniband/sw/rxe/rxe_comp.c |  24 ++-
>  drivers/infiniband/sw/rxe/rxe_qp.c   |  80 ++++-----
>  drivers/infiniband/sw/rxe/rxe_req.c  |   4 +-
>  drivers/infiniband/sw/rxe/rxe_resp.c |  70 +++++---
>  drivers/infiniband/sw/rxe/rxe_task.c | 258 +++++++++++++++++++--------
>  drivers/infiniband/sw/rxe/rxe_task.h |  56 +++---
>  7 files changed, 329 insertions(+), 172 deletions(-)
> 
> 
> base-commit: 692373d186205dfb1b56f35f22702412d94d9420
> --
> 2.34.1



end of thread

Thread overview: 23+ messages
2022-10-29  3:09 [PATCH for-next v3 00/13] Implement work queues for rdma_rxe Bob Pearson
2022-10-29  3:09 ` [PATCH for-next v3 01/13] RDMA/rxe: Make task interface pluggable Bob Pearson
2022-11-11  2:28   ` Yanjun Zhu
2022-10-29  3:09 ` [PATCH for-next v3 02/13] RDMA/rxe: Split rxe_drain_resp_pkts() Bob Pearson
2022-10-29  3:10 ` [PATCH for-next v3 03/13] RDMA/rxe: Simplify reset state handling in rxe_resp.c Bob Pearson
2022-11-11  3:04   ` Yanjun Zhu
2022-10-29  3:10 ` [PATCH for-next v3 04/13] RDMA/rxe: Handle qp error " Bob Pearson
2022-10-29  3:10 ` [PATCH v3 05/13] RDMA/rxe: Cleanup comp tasks in rxe_qp.c Bob Pearson
2022-10-29  3:10 ` [PATCH for-next v3 06/13] RDMA/rxe: Remove __rxe_do_task() Bob Pearson
2022-10-29  3:10 ` [PATCH for-next v3 07/13] RDMA/rxe: Make tasks schedule each other Bob Pearson
2022-10-29  3:10 ` [PATCH for-next v3 08/13] RDMA/rxe: Implement disable/enable_task() Bob Pearson
2022-10-29  3:10 ` [PATCH for-next v3 09/13] RDMA/rxe: Replace TASK_STATE_START by TASK_STATE_IDLE Bob Pearson
2022-10-29  3:10 ` [PATCH for-next v3 10/13] RDMA/rxe: Replace task->destroyed by task state INVALID Bob Pearson
2022-10-29  3:10 ` [PATCH for-next v3 11/13] RDMA/rxe: Add workqueue support for tasks Bob Pearson
2022-10-29  3:10 ` [PATCH for-next v3 12/13] RDMA/rxe: Make WORKQUEUE default for RC tasks Bob Pearson
2022-10-29  3:10 ` [PATCH for-next v3 13/13] RDMA/rxe: Remove tasklets from rxe_task.c Bob Pearson
2022-11-02 10:17 ` [PATCH for-next v3 00/13] Implement work queues for rdma_rxe Daisuke Matsuda (Fujitsu)
2022-11-02 11:20   ` Bob Pearson
2022-11-04  4:59     ` Daisuke Matsuda (Fujitsu)
2022-11-05 21:15       ` Bob Pearson
2022-11-07  8:21         ` Daisuke Matsuda (Fujitsu)
2022-11-07 16:06           ` Bob Pearson
2022-11-18  5:02 ` Daisuke Matsuda (Fujitsu)
