* [PATCH v1 00/17] RPC service thread improvements
From: Chuck Lever @ 2023-09-11 14:38 UTC (permalink / raw)
  To: linux-nfs
  Cc: Anna Schumaker, Greg Kroah-Hartman, Trond Myklebust, Kees Cook,
	NeilBrown, Liam R. Howlett, Chuck Lever, Andrew Morton,
	David Gow, linux-kernel, kernel test robot

Here's the remaining set of RPC service thread improvements. This
series has been tested on v6.5.0 plus nfsd-next. The goal of this
work is to replace the current RPC service thread scheduler, which
walks a linked list to find an idle thread to wake, with a
constant-time thread scheduler.
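
To make the contrast concrete, here is a simplified sketch of the two
approaches. This is not the actual code from this series; the helper
names are invented for the sketch, and the fields stand in for the real
svc_pool/svc_rqst definitions before and after the conversion:

    #include <linux/list.h>
    #include <linux/llist.h>
    #include <linux/sunrpc/svc.h>

    /* Before: walk the list of all pool threads looking for an idle
     * one, O(n).  Caller holds rcu_read_lock().
     */
    static struct svc_rqst *wake_idle_by_list_walk(struct svc_pool *pool)
    {
            struct svc_rqst *rqstp;

            list_for_each_entry_rcu(rqstp, &pool->sp_all_threads, rq_all) {
                    if (test_and_set_bit(RQ_BUSY, &rqstp->rq_flags))
                            continue;       /* busy: keep walking */
                    return rqstp;           /* idle thread found */
            }
            return NULL;                    /* none idle */
    }

    /* After: pop the first entry off a lock-less list of idle
     * threads, O(1).
     */
    static struct svc_rqst *wake_idle_by_llist_pop(struct svc_pool *pool)
    {
            struct llist_node *ln = llist_del_first(&pool->sp_idle_threads);

            return ln ? llist_entry(ln, struct svc_rqst, rq_idle) : NULL;
    }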

I've done some latency studies to measure the improvement. The
workload is fio on an NFSv3-over-TCP mount on 100GbE. The server
is running a v6.5.0 kernel with the v6.6 nfsd-next patches
applied. Server hardware is 4-core with 32GB of RAM and a tmpfs
export.

Latency measurements were generated with ktime_get() and recorded
via bespoke tracepoints added to svc_xprt_enqueue().
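
(Those tracepoints are not part of this series. Purely to hint at the
shape of the instrumentation -- the function name below is hypothetical
and the exact probe points may differ -- the measurement amounts to
something like the following.)

    /* Time the thread-selection/wake step performed from
     * svc_xprt_enqueue() and emit the elapsed nanoseconds for
     * post-processing.
     */
    static void timed_wake_idle(struct svc_pool *pool)
    {
            ktime_t start = ktime_get();

            svc_pool_wake_idle_thread(pool);
            trace_printk("wake_idle latency: %lld ns\n",
                         ktime_to_ns(ktime_sub(ktime_get(), start)));
    }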

No patches applied (Linux 6.5.0-00057-g4c4f6d1271f1):
* 8 nfsd threads
    * 6682115 total RPCs
    * 32675206 svc_xprt_enqueue calls
    * 6683512 wake_idle calls (from svc_xprt_enqueue)
    * min/mean/max ns: 189/1601.83/6128677
* 32 nfsd threads
    * 6565439 total RPCs
    * 32136015 svc_xprt_enqueue calls
    * 6566486 wake_idle calls
    * min/mean/max ns: 373/1963.43/14027191
* 128 nfsd threads
    * 6434503 total RPCs
    * 31545411 svc_xprt_enqueue calls
    * 6435211 wake_idle calls
    * min/mean/max ns: 364/2289.3/24668201
* 512 nfsd threads
    * 6500600 total RPCs
    * 31798278 svc_xprt_enqueue calls
    * 6501659 wake_idle calls
    * min/mean/max ns: 371/2505.7/24983624

Full series applied, up to change-the-back-channel-to-use-lwq (Linux 6.5.0-00074-g5b9d1e90911d):
* 8 nfsd threads
    * 6643835 total RPCs
    * 32508906 svc_xprt_enqueue calls
    * 6644845 wake_idle calls (from svc_xprt_enqueue)
    * min/mean/max ns: 80/914.305/9785192
* 32 nfsd threads
    * 6679458 total RPCs
    * 32661542 svc_xprt_enqueue calls
    * 6680747 wake_idle calls
    * min/mean/max ns: 95/1194.38/10877985
* 128 nfsd threads
    * 6681268 total RPCs
    * 32674437 svc_xprt_enqueue calls
    * 6682497 wake_idle calls
    * min/mean/max ns: 95/1247.38/17284050
* 512 nfsd threads
    * 6700810 total RPCs
    * 32766457 svc_xprt_enqueue calls
    * 6702022 wake_idle calls
    * min/mean/max ns: 94/1265.88/14418874

And for dessert, a couple of latency histograms with Neil's patches
applied:

8 nfsd threads:
bin(centre) = freq
bin(150)    = 917305   14.3191%
bin(450)    = 643715   10.0483%
bin(750)    = 3285903  51.2927%
bin(1050)   = 537586    8.39168%
bin(1350)   = 359511    5.61194%
bin(1650)   = 330793    5.16366%
bin(1950)   = 125331    1.95641%
bin(2250)   = 55994     0.874062%
bin(2550)   = 33710     0.526211%
bin(2850)   = 24544     0.38313%

512 nfsd threads:
bin(centre) = freq
bin(150)    = 935030   14.5736%
bin(450)    = 636380    9.91876%
bin(750)    = 3268418  50.9423%
bin(1050)   = 542533    8.45604%
bin(1350)   = 367382    5.7261%
bin(1650)   = 334638    5.21574%
bin(1950)   = 125546    1.95679%
bin(2250)   = 55832     0.87021%
bin(2550)   = 33992     0.529807%
bin(2850)   = 25091     0.391074%

---

Chuck Lever (1):
      SUNRPC: Clean up bc_svc_process()

NeilBrown (16):
      SUNRPC: move all of xprt handling into svc_handle_xprt()
      SUNRPC: rename and refactor svc_get_next_xprt()
      SUNRPC: integrate back-channel processing with svc_recv()
      SUNRPC: change how svc threads are asked to exit.
      SUNRPC: add list of idle threads
      SUNRPC: discard SP_CONGESTED
      llist: add interface to check if a node is on a list.
      SUNRPC: change service idle list to be an llist
      llist: add llist_del_first_this()
      lib: add light-weight queuing mechanism.
      SUNRPC: rename some functions from rqst_ to svc_thread_
      SUNRPC: only have one thread waking up at a time
      SUNRPC: use lwq for sp_sockets - renamed to sp_xprts
      SUNRPC: change sp_nrthreads to atomic_t
      SUNRPC: discard sp_lock
      SUNRPC: change the back-channel queue to lwq


 fs/lockd/svc.c                    |   5 +-
 fs/lockd/svclock.c                |   5 +-
 fs/nfs/callback.c                 |  46 +-----
 fs/nfsd/nfs4proc.c                |   8 +-
 fs/nfsd/nfssvc.c                  |  13 +-
 include/linux/llist.h             |  46 ++++++
 include/linux/lockd/lockd.h       |   2 +-
 include/linux/lwq.h               | 120 +++++++++++++++
 include/linux/sunrpc/svc.h        |  44 ++++--
 include/linux/sunrpc/svc_xprt.h   |   2 +-
 include/linux/sunrpc/xprt.h       |   3 +-
 include/trace/events/sunrpc.h     |   1 -
 lib/Kconfig                       |   5 +
 lib/Makefile                      |   2 +-
 lib/llist.c                       |  28 ++++
 lib/lwq.c                         | 151 +++++++++++++++++++
 net/sunrpc/backchannel_rqst.c     |  13 +-
 net/sunrpc/svc.c                  | 146 +++++++++---------
 net/sunrpc/svc_xprt.c             | 236 ++++++++++++++----------------
 net/sunrpc/xprtrdma/backchannel.c |   6 +-
 20 files changed, 590 insertions(+), 292 deletions(-)
 create mode 100644 include/linux/lwq.h
 create mode 100644 lib/lwq.c

--
Chuck Lever



* [PATCH v1 01/17] SUNRPC: move all of xprt handling into svc_handle_xprt()
From: Chuck Lever @ 2023-09-11 14:38 UTC (permalink / raw)
  To: linux-nfs; +Cc: NeilBrown, Chuck Lever

From: NeilBrown <neilb@suse.de>

svc_handle_xprt() does lots of things itself, but leaves some to its
caller, svc_recv().  This isn't elegant.

Move that code out of svc_recv() into svc_handle_xprt().

Move the calls to svc_xprt_release() from svc_send() and svc_drop()
(the two possible final steps in svc_process()) and from svc_recv() (in
the case where svc_process() wasn't called) into svc_handle_xprt().

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/svc_xprt.c |   53 ++++++++++++++++++-------------------------------
 1 file changed, 20 insertions(+), 33 deletions(-)

diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 4cfe9640df48..60759647fee4 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -785,7 +785,7 @@ static void svc_add_new_temp_xprt(struct svc_serv *serv, struct svc_xprt *newxpt
 	svc_xprt_received(newxpt);
 }
 
-static int svc_handle_xprt(struct svc_rqst *rqstp, struct svc_xprt *xprt)
+static void svc_handle_xprt(struct svc_rqst *rqstp, struct svc_xprt *xprt)
 {
 	struct svc_serv *serv = rqstp->rq_server;
 	int len = 0;
@@ -826,11 +826,26 @@ static int svc_handle_xprt(struct svc_rqst *rqstp, struct svc_xprt *xprt)
 			len = xprt->xpt_ops->xpo_recvfrom(rqstp);
 		rqstp->rq_reserved = serv->sv_max_mesg;
 		atomic_add(rqstp->rq_reserved, &xprt->xpt_reserved);
+		if (len <= 0)
+			goto out;
+
+		trace_svc_xdr_recvfrom(&rqstp->rq_arg);
+
+		clear_bit(XPT_OLD, &xprt->xpt_flags);
+
+		rqstp->rq_chandle.defer = svc_defer;
+
+		if (serv->sv_stats)
+			serv->sv_stats->netcnt++;
+		percpu_counter_inc(&rqstp->rq_pool->sp_messages_arrived);
+		rqstp->rq_stime = ktime_get();
+		svc_process(rqstp);
 	} else
 		svc_xprt_received(xprt);
 
 out:
-	return len;
+	rqstp->rq_res.len = 0;
+	svc_xprt_release(rqstp);
 }
 
 /**
@@ -844,11 +859,9 @@ static int svc_handle_xprt(struct svc_rqst *rqstp, struct svc_xprt *xprt)
 void svc_recv(struct svc_rqst *rqstp)
 {
 	struct svc_xprt		*xprt = NULL;
-	struct svc_serv		*serv = rqstp->rq_server;
-	int			len;
 
 	if (!svc_alloc_arg(rqstp))
-		goto out;
+		return;
 
 	try_to_freeze();
 	cond_resched();
@@ -856,31 +869,9 @@ void svc_recv(struct svc_rqst *rqstp)
 		goto out;
 
 	xprt = svc_get_next_xprt(rqstp);
-	if (!xprt)
-		goto out;
-
-	len = svc_handle_xprt(rqstp, xprt);
-
-	/* No data, incomplete (TCP) read, or accept() */
-	if (len <= 0)
-		goto out_release;
-
-	trace_svc_xdr_recvfrom(&rqstp->rq_arg);
-
-	clear_bit(XPT_OLD, &xprt->xpt_flags);
-
-	rqstp->rq_chandle.defer = svc_defer;
-
-	if (serv->sv_stats)
-		serv->sv_stats->netcnt++;
-	percpu_counter_inc(&rqstp->rq_pool->sp_messages_arrived);
-	rqstp->rq_stime = ktime_get();
-	svc_process(rqstp);
+	if (xprt)
+		svc_handle_xprt(rqstp, xprt);
 out:
-	return;
-out_release:
-	rqstp->rq_res.len = 0;
-	svc_xprt_release(rqstp);
 }
 EXPORT_SYMBOL_GPL(svc_recv);
 
@@ -890,7 +881,6 @@ EXPORT_SYMBOL_GPL(svc_recv);
 void svc_drop(struct svc_rqst *rqstp)
 {
 	trace_svc_drop(rqstp);
-	svc_xprt_release(rqstp);
 }
 EXPORT_SYMBOL_GPL(svc_drop);
 
@@ -906,8 +896,6 @@ void svc_send(struct svc_rqst *rqstp)
 	int status;
 
 	xprt = rqstp->rq_xprt;
-	if (!xprt)
-		return;
 
 	/* calculate over-all length */
 	xb = &rqstp->rq_res;
@@ -920,7 +908,6 @@ void svc_send(struct svc_rqst *rqstp)
 	status = xprt->xpt_ops->xpo_sendto(rqstp);
 
 	trace_svc_send(rqstp, status);
-	svc_xprt_release(rqstp);
 }
 
 /*




* [PATCH v1 02/17] SUNRPC: rename and refactor svc_get_next_xprt()
From: Chuck Lever @ 2023-09-11 14:38 UTC (permalink / raw)
  To: linux-nfs; +Cc: NeilBrown, Chuck Lever

From: NeilBrown <neilb@suse.de>

svc_get_next_xprt() does a lot more than just get an xprt.  It also
decides if it needs to sleep, depending not only on the availability of
xprts but also on the need to exit or handle external work.

So rename it to svc_rqst_wait_for_work() and have it do only the
testing and waiting.  Move all the waiting-related code out of
svc_recv() into the new svc_rqst_wait_for_work().

Move the dequeueing code out of svc_get_next_xprt() into svc_recv().

Previously svc_xprt_dequeue() would be called twice, once before waiting
and possibly once after.  Now rqst_should_sleep() is called twice
instead: once to decide if waiting is needed, and once again after
setting the task state, to see if we might have missed a wakeup.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/svc_xprt.c |   92 +++++++++++++++++++++++--------------------------
 1 file changed, 44 insertions(+), 48 deletions(-)

diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 60759647fee4..835160da3ad4 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -722,51 +722,34 @@ rqst_should_sleep(struct svc_rqst *rqstp)
 	return true;
 }
 
-static struct svc_xprt *svc_get_next_xprt(struct svc_rqst *rqstp)
+static void svc_rqst_wait_for_work(struct svc_rqst *rqstp)
 {
-	struct svc_pool		*pool = rqstp->rq_pool;
-
-	/* rq_xprt should be clear on entry */
-	WARN_ON_ONCE(rqstp->rq_xprt);
+	struct svc_pool *pool = rqstp->rq_pool;
 
-	rqstp->rq_xprt = svc_xprt_dequeue(pool);
-	if (rqstp->rq_xprt)
-		goto out_found;
-
-	set_current_state(TASK_IDLE);
-	smp_mb__before_atomic();
-	clear_bit(SP_CONGESTED, &pool->sp_flags);
-	clear_bit(RQ_BUSY, &rqstp->rq_flags);
-	smp_mb__after_atomic();
-
-	if (likely(rqst_should_sleep(rqstp)))
-		schedule();
-	else
-		__set_current_state(TASK_RUNNING);
+	if (rqst_should_sleep(rqstp)) {
+		set_current_state(TASK_IDLE);
+		smp_mb__before_atomic();
+		clear_bit(SP_CONGESTED, &pool->sp_flags);
+		clear_bit(RQ_BUSY, &rqstp->rq_flags);
+		smp_mb__after_atomic();
+
+		/* Need to check should_sleep() again after
+		 * setting task state in case a wakeup happened
+		 * between testing and setting.
+		 */
+		if (rqst_should_sleep(rqstp)) {
+			schedule();
+		} else {
+			__set_current_state(TASK_RUNNING);
+			cond_resched();
+		}
 
+		set_bit(RQ_BUSY, &rqstp->rq_flags);
+		smp_mb__after_atomic();
+	} else {
+		cond_resched();
+	}
 	try_to_freeze();
-
-	set_bit(RQ_BUSY, &rqstp->rq_flags);
-	smp_mb__after_atomic();
-	clear_bit(SP_TASK_PENDING, &pool->sp_flags);
-	rqstp->rq_xprt = svc_xprt_dequeue(pool);
-	if (rqstp->rq_xprt)
-		goto out_found;
-
-	if (kthread_should_stop())
-		return NULL;
-	return NULL;
-out_found:
-	clear_bit(SP_TASK_PENDING, &pool->sp_flags);
-	/* Normally we will wait up to 5 seconds for any required
-	 * cache information to be provided.
-	 */
-	if (!test_bit(SP_CONGESTED, &pool->sp_flags))
-		rqstp->rq_chandle.thread_wait = 5*HZ;
-	else
-		rqstp->rq_chandle.thread_wait = 1*HZ;
-	trace_svc_xprt_dequeue(rqstp);
-	return rqstp->rq_xprt;
 }
 
 static void svc_add_new_temp_xprt(struct svc_serv *serv, struct svc_xprt *newxpt)
@@ -858,20 +841,33 @@ static void svc_handle_xprt(struct svc_rqst *rqstp, struct svc_xprt *xprt)
  */
 void svc_recv(struct svc_rqst *rqstp)
 {
-	struct svc_xprt		*xprt = NULL;
+	struct svc_pool *pool = rqstp->rq_pool;
 
 	if (!svc_alloc_arg(rqstp))
 		return;
 
-	try_to_freeze();
-	cond_resched();
+	svc_rqst_wait_for_work(rqstp);
+
+	clear_bit(SP_TASK_PENDING, &pool->sp_flags);
+
 	if (kthread_should_stop())
-		goto out;
+		return;
+
+	rqstp->rq_xprt = svc_xprt_dequeue(pool);
+	if (rqstp->rq_xprt) {
+		struct svc_xprt *xprt = rqstp->rq_xprt;
+
+		/* Normally we will wait up to 5 seconds for any required
+		 * cache information to be provided.
+		 */
+		if (!test_bit(SP_CONGESTED, &pool->sp_flags))
+			rqstp->rq_chandle.thread_wait = 5 * HZ;
+		else
+			rqstp->rq_chandle.thread_wait = 1 * HZ;
 
-	xprt = svc_get_next_xprt(rqstp);
-	if (xprt)
+		trace_svc_xprt_dequeue(rqstp);
 		svc_handle_xprt(rqstp, xprt);
-out:
+	}
 }
 EXPORT_SYMBOL_GPL(svc_recv);
 




* [PATCH v1 03/17] SUNRPC: Clean up bc_svc_process()
From: Chuck Lever @ 2023-09-11 14:38 UTC (permalink / raw)
  To: linux-nfs; +Cc: kernel test robot, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

The test robot complained that, in some build configurations, the
@error variable in bc_svc_process's only caller is set but never
used. This happens because dprintk() is the only consumer of that
value.

 - Remove the dprintk() call sites in favor of the svc_process
   tracepoint
 - The @error variable and the return value of bc_svc_process() are
   now unused, so get rid of them.
 - The @serv parameter is set to rqstp->rq_server by the only caller,
   and bc_svc_process() then uses it only to set rqstp->rq_server. It
   can be removed.
 - Rename bc_svc_process() according to the convention that
   globally-visible RPC server functions have names that begin with
   "svc_"; and because it is globally-visible, give it a proper
   kdoc comment.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202308121314.HA8Rq2XG-lkp@intel.com/
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfs/callback.c          |    6 +-----
 include/linux/sunrpc/svc.h |    3 +--
 net/sunrpc/svc.c           |   38 ++++++++++++--------------------------
 3 files changed, 14 insertions(+), 33 deletions(-)

diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
index 466ebf1d41b2..272e6d2bb478 100644
--- a/fs/nfs/callback.c
+++ b/fs/nfs/callback.c
@@ -95,7 +95,6 @@ nfs41_callback_svc(void *vrqstp)
 	struct svc_rqst *rqstp = vrqstp;
 	struct svc_serv *serv = rqstp->rq_server;
 	struct rpc_rqst *req;
-	int error;
 	DEFINE_WAIT(wq);
 
 	set_freezable();
@@ -109,10 +108,7 @@ nfs41_callback_svc(void *vrqstp)
 			list_del(&req->rq_bc_list);
 			spin_unlock_bh(&serv->sv_cb_lock);
 			finish_wait(&serv->sv_cb_waitq, &wq);
-			dprintk("Invoking bc_svc_process()\n");
-			error = bc_svc_process(serv, req, rqstp);
-			dprintk("bc_svc_process() returned w/ error code= %d\n",
-				error);
+			svc_process_bc(req, rqstp);
 		} else {
 			spin_unlock_bh(&serv->sv_cb_lock);
 			if (!kthread_should_stop())
diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index dbf5b21feafe..0cdca5960171 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -413,8 +413,7 @@ struct svc_serv *  svc_create_pooled(struct svc_program *, unsigned int,
 int		   svc_set_num_threads(struct svc_serv *, struct svc_pool *, int);
 int		   svc_pool_stats_open(struct svc_serv *serv, struct file *file);
 void		   svc_process(struct svc_rqst *rqstp);
-int		   bc_svc_process(struct svc_serv *, struct rpc_rqst *,
-			struct svc_rqst *);
+void		   svc_process_bc(struct rpc_rqst *req, struct svc_rqst *rqstp);
 int		   svc_register(const struct svc_serv *, struct net *, const int,
 				const unsigned short, const unsigned short);
 
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 812fda9d45dd..a3d031deb1ec 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -1544,24 +1544,20 @@ void svc_process(struct svc_rqst *rqstp)
 }
 
 #if defined(CONFIG_SUNRPC_BACKCHANNEL)
-/*
- * Process a backchannel RPC request that arrived over an existing
- * outbound connection
+/**
+ * svc_process_bc - process a reverse-direction RPC request
+ * @req: RPC request to be used for client-side processing
+ * @rqstp: server-side execution context
+ *
  */
-int
-bc_svc_process(struct svc_serv *serv, struct rpc_rqst *req,
-	       struct svc_rqst *rqstp)
+void svc_process_bc(struct rpc_rqst *req, struct svc_rqst *rqstp)
 {
 	struct rpc_task *task;
 	int proc_error;
-	int error;
-
-	dprintk("svc: %s(%p)\n", __func__, req);
 
 	/* Build the svc_rqst used by the common processing routine */
 	rqstp->rq_xid = req->rq_xid;
 	rqstp->rq_prot = req->rq_xprt->prot;
-	rqstp->rq_server = serv;
 	rqstp->rq_bc_net = req->rq_xprt->xprt_net;
 
 	rqstp->rq_addrlen = sizeof(req->rq_xprt->addr);
@@ -1590,10 +1586,8 @@ bc_svc_process(struct svc_serv *serv, struct rpc_rqst *req,
 	 * been processed by the caller.
 	 */
 	svcxdr_init_decode(rqstp);
-	if (!xdr_inline_decode(&rqstp->rq_arg_stream, XDR_UNIT * 2)) {
-		error = -EINVAL;
-		goto out;
-	}
+	if (!xdr_inline_decode(&rqstp->rq_arg_stream, XDR_UNIT * 2))
+		return;
 
 	/* Parse and execute the bc call */
 	proc_error = svc_process_common(rqstp);
@@ -1602,26 +1596,18 @@ bc_svc_process(struct svc_serv *serv, struct rpc_rqst *req,
 	if (!proc_error) {
 		/* Processing error: drop the request */
 		xprt_free_bc_request(req);
-		error = -EINVAL;
-		goto out;
+		return;
 	}
 	/* Finally, send the reply synchronously */
 	memcpy(&req->rq_snd_buf, &rqstp->rq_res, sizeof(req->rq_snd_buf));
 	task = rpc_run_bc_task(req);
-	if (IS_ERR(task)) {
-		error = PTR_ERR(task);
-		goto out;
-	}
+	if (IS_ERR(task))
+		return;
 
 	WARN_ON_ONCE(atomic_read(&task->tk_count) != 1);
-	error = task->tk_status;
 	rpc_put_task(task);
-
-out:
-	dprintk("svc: %s(), error=%d\n", __func__, error);
-	return error;
 }
-EXPORT_SYMBOL_GPL(bc_svc_process);
+EXPORT_SYMBOL_GPL(svc_process_bc);
 #endif /* CONFIG_SUNRPC_BACKCHANNEL */
 
 /**




* [PATCH v1 04/17] SUNRPC: integrate back-channel processing with svc_recv()
From: Chuck Lever @ 2023-09-11 14:38 UTC (permalink / raw)
  To: linux-nfs; +Cc: NeilBrown, Trond Myklebust, Anna Schumaker, Chuck Lever

From: NeilBrown <neilb@suse.de>

Using svc_recv() for (NFSv4.1) back-channel handling means we have just
one mechanism for waking threads.

Also change kthread_freezable_should_stop() in nfs4_callback_svc() to
kthread_should_stop() as used elsewhere.
kthread_freezable_should_stop() effectively adds a try_to_freeze() call,
and svc_recv() already contains that at an appropriate place.

Signed-off-by: NeilBrown <neilb@suse.de>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfs/callback.c                 |   42 ++-----------------------------------
 include/linux/sunrpc/svc.h        |    2 --
 net/sunrpc/backchannel_rqst.c     |    8 +++----
 net/sunrpc/svc.c                  |    2 +-
 net/sunrpc/svc_xprt.c             |   27 ++++++++++++++++++++++++
 net/sunrpc/xprtrdma/backchannel.c |    2 +-
 6 files changed, 34 insertions(+), 49 deletions(-)

diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
index 272e6d2bb478..42a0c2f1e785 100644
--- a/fs/nfs/callback.c
+++ b/fs/nfs/callback.c
@@ -78,7 +78,7 @@ nfs4_callback_svc(void *vrqstp)
 
 	set_freezable();
 
-	while (!kthread_freezable_should_stop(NULL))
+	while (!kthread_should_stop())
 		svc_recv(rqstp);
 
 	svc_exit_thread(rqstp);
@@ -86,41 +86,6 @@ nfs4_callback_svc(void *vrqstp)
 }
 
 #if defined(CONFIG_NFS_V4_1)
-/*
- * The callback service for NFSv4.1 callbacks
- */
-static int
-nfs41_callback_svc(void *vrqstp)
-{
-	struct svc_rqst *rqstp = vrqstp;
-	struct svc_serv *serv = rqstp->rq_server;
-	struct rpc_rqst *req;
-	DEFINE_WAIT(wq);
-
-	set_freezable();
-
-	while (!kthread_freezable_should_stop(NULL)) {
-		prepare_to_wait(&serv->sv_cb_waitq, &wq, TASK_IDLE);
-		spin_lock_bh(&serv->sv_cb_lock);
-		if (!list_empty(&serv->sv_cb_list)) {
-			req = list_first_entry(&serv->sv_cb_list,
-					struct rpc_rqst, rq_bc_list);
-			list_del(&req->rq_bc_list);
-			spin_unlock_bh(&serv->sv_cb_lock);
-			finish_wait(&serv->sv_cb_waitq, &wq);
-			svc_process_bc(req, rqstp);
-		} else {
-			spin_unlock_bh(&serv->sv_cb_lock);
-			if (!kthread_should_stop())
-				schedule();
-			finish_wait(&serv->sv_cb_waitq, &wq);
-		}
-	}
-
-	svc_exit_thread(rqstp);
-	return 0;
-}
-
 static inline void nfs_callback_bc_serv(u32 minorversion, struct rpc_xprt *xprt,
 		struct svc_serv *serv)
 {
@@ -233,10 +198,7 @@ static struct svc_serv *nfs_callback_create_svc(int minorversion)
 			cb_info->users);
 
 	threadfn = nfs4_callback_svc;
-#if defined(CONFIG_NFS_V4_1)
-	if (minorversion)
-		threadfn = nfs41_callback_svc;
-#else
+#if !defined(CONFIG_NFS_V4_1)
 	if (minorversion)
 		return ERR_PTR(-ENOTSUPP);
 #endif
diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index 0cdca5960171..acbe1314febd 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -92,8 +92,6 @@ struct svc_serv {
 						 * that arrive over the same
 						 * connection */
 	spinlock_t		sv_cb_lock;	/* protects the svc_cb_list */
-	wait_queue_head_t	sv_cb_waitq;	/* sleep here if there are no
-						 * entries in the svc_cb_list */
 	bool			sv_bc_enabled;	/* service uses backchannel */
 #endif /* CONFIG_SUNRPC_BACKCHANNEL */
 };
diff --git a/net/sunrpc/backchannel_rqst.c b/net/sunrpc/backchannel_rqst.c
index 65a6c6429a53..44b7c89a635f 100644
--- a/net/sunrpc/backchannel_rqst.c
+++ b/net/sunrpc/backchannel_rqst.c
@@ -349,10 +349,8 @@ struct rpc_rqst *xprt_lookup_bc_request(struct rpc_xprt *xprt, __be32 xid)
 }
 
 /*
- * Add callback request to callback list.  The callback
- * service sleeps on the sv_cb_waitq waiting for new
- * requests.  Wake it up after adding enqueing the
- * request.
+ * Add callback request to callback list.  Wake a thread
+ * on the first pool (usually the only pool) to handle it.
  */
 void xprt_complete_bc_request(struct rpc_rqst *req, uint32_t copied)
 {
@@ -371,6 +369,6 @@ void xprt_complete_bc_request(struct rpc_rqst *req, uint32_t copied)
 	xprt_get(xprt);
 	spin_lock(&bc_serv->sv_cb_lock);
 	list_add(&req->rq_bc_list, &bc_serv->sv_cb_list);
-	wake_up(&bc_serv->sv_cb_waitq);
 	spin_unlock(&bc_serv->sv_cb_lock);
+	svc_pool_wake_idle_thread(&bc_serv->sv_pools[0]);
 }
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index a3d031deb1ec..b98a159eb17f 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -440,7 +440,6 @@ __svc_init_bc(struct svc_serv *serv)
 {
 	INIT_LIST_HEAD(&serv->sv_cb_list);
 	spin_lock_init(&serv->sv_cb_lock);
-	init_waitqueue_head(&serv->sv_cb_waitq);
 }
 #else
 static void
@@ -718,6 +717,7 @@ void svc_pool_wake_idle_thread(struct svc_pool *pool)
 
 	set_bit(SP_CONGESTED, &pool->sp_flags);
 }
+EXPORT_SYMBOL_GPL(svc_pool_wake_idle_thread);
 
 static struct svc_pool *
 svc_pool_next(struct svc_serv *serv, struct svc_pool *pool, unsigned int *state)
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 835160da3ad4..b057f1cbe7a1 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -17,6 +17,7 @@
 #include <linux/sunrpc/svc_xprt.h>
 #include <linux/sunrpc/svcsock.h>
 #include <linux/sunrpc/xprt.h>
+#include <linux/sunrpc/bc_xprt.h>
 #include <linux/module.h>
 #include <linux/netdevice.h>
 #include <trace/events/sunrpc.h>
@@ -719,6 +720,13 @@ rqst_should_sleep(struct svc_rqst *rqstp)
 	if (freezing(current))
 		return false;
 
+#if defined(CONFIG_SUNRPC_BACKCHANNEL)
+	if (svc_is_backchannel(rqstp)) {
+		if (!list_empty(&rqstp->rq_server->sv_cb_list))
+			return false;
+	}
+#endif
+
 	return true;
 }
 
@@ -868,6 +876,25 @@ void svc_recv(struct svc_rqst *rqstp)
 		trace_svc_xprt_dequeue(rqstp);
 		svc_handle_xprt(rqstp, xprt);
 	}
+
+#if defined(CONFIG_SUNRPC_BACKCHANNEL)
+	if (svc_is_backchannel(rqstp)) {
+		struct svc_serv *serv = rqstp->rq_server;
+		struct rpc_rqst *req;
+
+		spin_lock_bh(&serv->sv_cb_lock);
+		req = list_first_entry_or_null(&serv->sv_cb_list,
+					       struct rpc_rqst, rq_bc_list);
+		if (req) {
+			list_del(&req->rq_bc_list);
+			spin_unlock_bh(&serv->sv_cb_lock);
+
+			svc_process_bc(req, rqstp);
+			return;
+		}
+		spin_unlock_bh(&serv->sv_cb_lock);
+	}
+#endif
 }
 EXPORT_SYMBOL_GPL(svc_recv);
 
diff --git a/net/sunrpc/xprtrdma/backchannel.c b/net/sunrpc/xprtrdma/backchannel.c
index e4d84a13c566..bfc434ec52a7 100644
--- a/net/sunrpc/xprtrdma/backchannel.c
+++ b/net/sunrpc/xprtrdma/backchannel.c
@@ -267,7 +267,7 @@ void rpcrdma_bc_receive_call(struct rpcrdma_xprt *r_xprt,
 	list_add(&rqst->rq_bc_list, &bc_serv->sv_cb_list);
 	spin_unlock(&bc_serv->sv_cb_lock);
 
-	wake_up(&bc_serv->sv_cb_waitq);
+	svc_pool_wake_idle_thread(&bc_serv->sv_pools[0]);
 
 	r_xprt->rx_stats.bcall_count++;
 	return;




* [PATCH v1 05/17] SUNRPC: change how svc threads are asked to exit.
From: Chuck Lever @ 2023-09-11 14:39 UTC (permalink / raw)
  To: linux-nfs; +Cc: NeilBrown, Chuck Lever

From: NeilBrown <neilb@suse.de>

svc threads are currently stopped using kthread_stop().  This requires
identifying a specific thread.  However we don't care which thread
stops, just as long as one does.

So instead, set a flag in the svc_pool to say that a thread needs to
die, and have each thread check this flag instead of calling
kthread_should_stop().  The first thread to find and clear this flag
then moves towards exiting.

This removes an explicit dependency on sp_all_threads, which will make
a future patch simpler.
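
Condensed, and with the two function names invented just for this
sketch, the resulting handshake (implemented for real in the diff
below, where the flags are set under pool->sp_lock) looks roughly like
this:

    /* Manager side (cf. svc_stop_kthreads): ask one thread to exit. */
    static void ask_one_thread_to_exit(struct svc_pool *pool)
    {
            set_bit(SP_VICTIM_REMAINS, &pool->sp_flags);
            set_bit(SP_NEED_VICTIM, &pool->sp_flags);
            svc_pool_wake_idle_thread(pool);
            wait_on_bit(&pool->sp_flags, SP_VICTIM_REMAINS, TASK_IDLE);
    }

    /* Thread side (cf. nfsd()/lockd()): the first thread to see and
     * clear SP_NEED_VICTIM becomes the victim and exits;
     * svc_exit_thread() ends by clearing SP_VICTIM_REMAINS and waking
     * the waiter above.
     */
    static int example_svc_thread(void *vrqstp)
    {
            struct svc_rqst *rqstp = vrqstp;

            while (!svc_thread_should_stop(rqstp))
                    svc_recv(rqstp);

            svc_exit_thread(rqstp);
            return 0;
    }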

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/lockd/svc.c              |    5 ++---
 fs/lockd/svclock.c          |    5 ++---
 fs/nfs/callback.c           |    2 +-
 fs/nfsd/nfs4proc.c          |    8 +++++---
 fs/nfsd/nfssvc.c            |    2 +-
 include/linux/lockd/lockd.h |    2 +-
 include/linux/sunrpc/svc.h  |   26 +++++++++++++++++++++++++-
 net/sunrpc/svc.c            |   43 +++++++++++++++++++++----------------------
 net/sunrpc/svc_xprt.c       |    7 +++----
 9 files changed, 61 insertions(+), 39 deletions(-)

diff --git a/fs/lockd/svc.c b/fs/lockd/svc.c
index 6579948070a4..b441c706c2b8 100644
--- a/fs/lockd/svc.c
+++ b/fs/lockd/svc.c
@@ -24,7 +24,6 @@
 #include <linux/uio.h>
 #include <linux/smp.h>
 #include <linux/mutex.h>
-#include <linux/kthread.h>
 #include <linux/freezer.h>
 #include <linux/inetdevice.h>
 
@@ -135,11 +134,11 @@ lockd(void *vrqstp)
 	 * The main request loop. We don't terminate until the last
 	 * NFS mount or NFS daemon has gone away.
 	 */
-	while (!kthread_should_stop()) {
+	while (!svc_thread_should_stop(rqstp)) {
 		/* update sv_maxconn if it has changed */
 		rqstp->rq_server->sv_maxconn = nlm_max_connections;
 
-		nlmsvc_retry_blocked();
+		nlmsvc_retry_blocked(rqstp);
 		svc_recv(rqstp);
 	}
 	if (nlmsvc_ops)
diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index 43aeba9de55c..5fea06555f42 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -30,7 +30,6 @@
 #include <linux/sunrpc/svc_xprt.h>
 #include <linux/lockd/nlm.h>
 #include <linux/lockd/lockd.h>
-#include <linux/kthread.h>
 #include <linux/exportfs.h>
 
 #define NLMDBG_FACILITY		NLMDBG_SVCLOCK
@@ -1020,13 +1019,13 @@ retry_deferred_block(struct nlm_block *block)
  * be retransmitted.
  */
 void
-nlmsvc_retry_blocked(void)
+nlmsvc_retry_blocked(struct svc_rqst *rqstp)
 {
 	unsigned long	timeout = MAX_SCHEDULE_TIMEOUT;
 	struct nlm_block *block;
 
 	spin_lock(&nlm_blocked_lock);
-	while (!list_empty(&nlm_blocked) && !kthread_should_stop()) {
+	while (!list_empty(&nlm_blocked) && !svc_thread_should_stop(rqstp)) {
 		block = list_entry(nlm_blocked.next, struct nlm_block, b_list);
 
 		if (block->b_when == NLM_NEVER)
diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
index 42a0c2f1e785..4ffa1f469e90 100644
--- a/fs/nfs/callback.c
+++ b/fs/nfs/callback.c
@@ -78,7 +78,7 @@ nfs4_callback_svc(void *vrqstp)
 
 	set_freezable();
 
-	while (!kthread_should_stop())
+	while (!svc_thread_should_stop(rqstp))
 		svc_recv(rqstp);
 
 	svc_exit_thread(rqstp);
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 5ca748309c26..ef0330305cbf 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1329,7 +1329,8 @@ extern void nfs_sb_deactive(struct super_block *sb);
  * setup a work entry in the ssc delayed unmount list.
  */
 static __be32 nfsd4_ssc_setup_dul(struct nfsd_net *nn, char *ipaddr,
-				  struct nfsd4_ssc_umount_item **nsui)
+				  struct nfsd4_ssc_umount_item **nsui,
+				  struct svc_rqst *rqstp)
 {
 	struct nfsd4_ssc_umount_item *ni = NULL;
 	struct nfsd4_ssc_umount_item *work = NULL;
@@ -1351,7 +1352,7 @@ static __be32 nfsd4_ssc_setup_dul(struct nfsd_net *nn, char *ipaddr,
 			spin_unlock(&nn->nfsd_ssc_lock);
 
 			/* allow 20secs for mount/unmount for now - revisit */
-			if (kthread_should_stop() ||
+			if (svc_thread_should_stop(rqstp) ||
 					(schedule_timeout(20*HZ) == 0)) {
 				finish_wait(&nn->nfsd_ssc_waitq, &wait);
 				kfree(work);
@@ -1467,7 +1468,7 @@ nfsd4_interssc_connect(struct nl4_server *nss, struct svc_rqst *rqstp,
 		goto out_free_rawdata;
 	snprintf(dev_name, len + 5, "%s%s%s:/", startsep, ipaddr, endsep);
 
-	status = nfsd4_ssc_setup_dul(nn, ipaddr, nsui);
+	status = nfsd4_ssc_setup_dul(nn, ipaddr, nsui, rqstp);
 	if (status)
 		goto out_free_devname;
 	if ((*nsui)->nsui_vfsmount)
@@ -1642,6 +1643,7 @@ static ssize_t _nfsd_copy_file_range(struct nfsd4_copy *copy,
 	if (bytes_total == 0)
 		bytes_total = ULLONG_MAX;
 	do {
+		/* Only async copies can be stopped here */
 		if (kthread_should_stop())
 			break;
 		bytes_copied = nfsd_copy_file_range(src, src_pos, dst, dst_pos,
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index 1582af33e204..062f51fe4dfb 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -957,7 +957,7 @@ nfsd(void *vrqstp)
 	/*
 	 * The main request loop
 	 */
-	while (!kthread_should_stop()) {
+	while (!svc_thread_should_stop(rqstp)) {
 		/* Update sv_maxconn if it has changed */
 		rqstp->rq_server->sv_maxconn = nn->max_connections;
 
diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
index 0f016d69c996..9f565416d186 100644
--- a/include/linux/lockd/lockd.h
+++ b/include/linux/lockd/lockd.h
@@ -282,7 +282,7 @@ __be32		  nlmsvc_testlock(struct svc_rqst *, struct nlm_file *,
 			struct nlm_host *, struct nlm_lock *,
 			struct nlm_lock *, struct nlm_cookie *);
 __be32		  nlmsvc_cancel_blocked(struct net *net, struct nlm_file *, struct nlm_lock *);
-void		  nlmsvc_retry_blocked(void);
+void		  nlmsvc_retry_blocked(struct svc_rqst *rqstp);
 void		  nlmsvc_traverse_blocks(struct nlm_host *, struct nlm_file *,
 					nlm_host_match_fn_t match);
 void		  nlmsvc_grant_reply(struct nlm_cookie *, __be32);
diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index acbe1314febd..0ec691070e27 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -50,6 +50,8 @@ struct svc_pool {
 enum {
 	SP_TASK_PENDING,	/* still work to do even if no xprt is queued */
 	SP_CONGESTED,		/* all threads are busy, none idle */
+	SP_NEED_VICTIM,		/* One thread needs to agree to exit */
+	SP_VICTIM_REMAINS,	/* One thread needs to actually exit */
 };
 
 
@@ -259,7 +261,7 @@ enum {
 	RQ_DROPME,		/* drop current reply */
 	RQ_SPLICE_OK,		/* turned off in gss privacy to prevent
 				 * encrypting page cache pages */
-	RQ_VICTIM,		/* about to be shut down */
+	RQ_VICTIM,		/* Have agreed to shut down */
 	RQ_BUSY,		/* request is busy */
 	RQ_DATA,		/* request has data */
 };
@@ -299,6 +301,28 @@ static inline struct sockaddr *svc_daddr(const struct svc_rqst *rqst)
 	return (struct sockaddr *) &rqst->rq_daddr;
 }
 
+/**
+ * svc_thread_should_stop - check if this thread should stop
+ * @rqstp: the thread that might need to stop
+ *
+ * To stop an svc thread, the pool flags SP_NEED_VICTIM and SP_VICTIM_REMAINS
+ * are set.  The first thread which sees SP_NEED_VICTIM clears it, becoming
+ * the victim using this function.  It should then promptly call
+ * svc_exit_thread() to complete the process, clearing SP_VICTIM_REMAINS
+ * so the task waiting for a thread to exit can wake and continue.
+ *
+ * Return values:
+ *   %true: caller should invoke svc_exit_thread()
+ *   %false: caller should do nothing
+ */
+static inline bool svc_thread_should_stop(struct svc_rqst *rqstp)
+{
+	if (test_and_clear_bit(SP_NEED_VICTIM, &rqstp->rq_pool->sp_flags))
+		set_bit(RQ_VICTIM, &rqstp->rq_flags);
+
+	return test_bit(RQ_VICTIM, &rqstp->rq_flags);
+}
+
 struct svc_deferred_req {
 	u32			prot;	/* protocol (UDP or TCP) */
 	struct svc_xprt		*xprt;
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index b98a159eb17f..db579bbc0a0a 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -725,19 +725,22 @@ svc_pool_next(struct svc_serv *serv, struct svc_pool *pool, unsigned int *state)
 	return pool ? pool : &serv->sv_pools[(*state)++ % serv->sv_nrpools];
 }
 
-static struct task_struct *
+static struct svc_pool *
 svc_pool_victim(struct svc_serv *serv, struct svc_pool *pool, unsigned int *state)
 {
 	unsigned int i;
-	struct task_struct *task = NULL;
 
 	if (pool != NULL) {
 		spin_lock_bh(&pool->sp_lock);
+		if (pool->sp_nrthreads)
+			goto found_pool;
+		spin_unlock_bh(&pool->sp_lock);
+		return NULL;
 	} else {
 		for (i = 0; i < serv->sv_nrpools; i++) {
 			pool = &serv->sv_pools[--(*state) % serv->sv_nrpools];
 			spin_lock_bh(&pool->sp_lock);
-			if (!list_empty(&pool->sp_all_threads))
+			if (pool->sp_nrthreads)
 				goto found_pool;
 			spin_unlock_bh(&pool->sp_lock);
 		}
@@ -745,16 +748,10 @@ svc_pool_victim(struct svc_serv *serv, struct svc_pool *pool, unsigned int *stat
 	}
 
 found_pool:
-	if (!list_empty(&pool->sp_all_threads)) {
-		struct svc_rqst *rqstp;
-
-		rqstp = list_entry(pool->sp_all_threads.next, struct svc_rqst, rq_all);
-		set_bit(RQ_VICTIM, &rqstp->rq_flags);
-		list_del_rcu(&rqstp->rq_all);
-		task = rqstp->rq_task;
-	}
+	set_bit(SP_VICTIM_REMAINS, &pool->sp_flags);
+	set_bit(SP_NEED_VICTIM, &pool->sp_flags);
 	spin_unlock_bh(&pool->sp_lock);
-	return task;
+	return pool;
 }
 
 static int
@@ -795,18 +792,16 @@ svc_start_kthreads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
 static int
 svc_stop_kthreads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
 {
-	struct svc_rqst	*rqstp;
-	struct task_struct *task;
 	unsigned int state = serv->sv_nrthreads-1;
+	struct svc_pool *victim;
 
 	do {
-		task = svc_pool_victim(serv, pool, &state);
-		if (task == NULL)
+		victim = svc_pool_victim(serv, pool, &state);
+		if (!victim)
 			break;
-		rqstp = kthread_data(task);
-		/* Did we lose a race to svo_function threadfn? */
-		if (kthread_stop(task) == -EINTR)
-			svc_exit_thread(rqstp);
+		svc_pool_wake_idle_thread(victim);
+		wait_on_bit(&victim->sp_flags, SP_VICTIM_REMAINS,
+			    TASK_IDLE);
 		nrservs++;
 	} while (nrservs < 0);
 	return 0;
@@ -926,8 +921,7 @@ svc_exit_thread(struct svc_rqst *rqstp)
 
 	spin_lock_bh(&pool->sp_lock);
 	pool->sp_nrthreads--;
-	if (!test_and_set_bit(RQ_VICTIM, &rqstp->rq_flags))
-		list_del_rcu(&rqstp->rq_all);
+	list_del_rcu(&rqstp->rq_all);
 	spin_unlock_bh(&pool->sp_lock);
 
 	spin_lock_bh(&serv->sv_lock);
@@ -938,6 +932,11 @@ svc_exit_thread(struct svc_rqst *rqstp)
 	svc_rqst_free(rqstp);
 
 	svc_put(serv);
+	/* That svc_put() cannot be the last, because the thread
+	 * waiting for SP_VICTIM_REMAINS to clear must hold
+	 * a reference. So it is still safe to access pool.
+	 */
+	clear_and_wake_up_bit(SP_VICTIM_REMAINS, &pool->sp_flags);
 }
 EXPORT_SYMBOL_GPL(svc_exit_thread);
 
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index b057f1cbe7a1..b8539545fefd 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -9,7 +9,6 @@
 #include <linux/sched/mm.h>
 #include <linux/errno.h>
 #include <linux/freezer.h>
-#include <linux/kthread.h>
 #include <linux/slab.h>
 #include <net/sock.h>
 #include <linux/sunrpc/addr.h>
@@ -675,7 +674,7 @@ static bool svc_alloc_arg(struct svc_rqst *rqstp)
 			continue;
 
 		set_current_state(TASK_IDLE);
-		if (kthread_should_stop()) {
+		if (svc_thread_should_stop(rqstp)) {
 			set_current_state(TASK_RUNNING);
 			return false;
 		}
@@ -713,7 +712,7 @@ rqst_should_sleep(struct svc_rqst *rqstp)
 		return false;
 
 	/* are we shutting down? */
-	if (kthread_should_stop())
+	if (svc_thread_should_stop(rqstp))
 		return false;
 
 	/* are we freezing? */
@@ -858,7 +857,7 @@ void svc_recv(struct svc_rqst *rqstp)
 
 	clear_bit(SP_TASK_PENDING, &pool->sp_flags);
 
-	if (kthread_should_stop())
+	if (svc_thread_should_stop(rqstp))
 		return;
 
 	rqstp->rq_xprt = svc_xprt_dequeue(pool);




* [PATCH v1 06/17] SUNRPC: add list of idle threads
From: Chuck Lever @ 2023-09-11 14:39 UTC (permalink / raw)
  To: linux-nfs; +Cc: NeilBrown, Chuck Lever

From: NeilBrown <neilb@suse.de>

Rather than searching a list of threads to find an idle one, having a
list of idle threads allows an idle thread to be found immediately.

This adds some spin_lock calls, which is not ideal, but as the hold
time is tiny it is still faster than searching a list.  A future patch will
remove them using llist.h.  This involves some subtlety and so is left
to a separate patch.

This removes the need for the RQ_BUSY flag.  The rqst is "busy"
precisely when it is not on the "idle" list.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/sunrpc/svc.h    |   25 ++++++++++++++++++++++++-
 include/trace/events/sunrpc.h |    1 -
 net/sunrpc/svc.c              |   14 +++++++++-----
 net/sunrpc/svc_xprt.c         |   15 +++++++++++----
 4 files changed, 44 insertions(+), 11 deletions(-)

diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index 0ec691070e27..e9c34e99bc88 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -37,6 +37,7 @@ struct svc_pool {
 	struct list_head	sp_sockets;	/* pending sockets */
 	unsigned int		sp_nrthreads;	/* # of threads in pool */
 	struct list_head	sp_all_threads;	/* all server threads */
+	struct list_head	sp_idle_threads; /* idle server threads */
 
 	/* statistics on pool operation */
 	struct percpu_counter	sp_messages_arrived;
@@ -186,6 +187,7 @@ extern u32 svc_max_payload(const struct svc_rqst *rqstp);
  */
 struct svc_rqst {
 	struct list_head	rq_all;		/* all threads list */
+	struct list_head	rq_idle;	/* On the idle list */
 	struct rcu_head		rq_rcu_head;	/* for RCU deferred kfree */
 	struct svc_xprt *	rq_xprt;	/* transport ptr */
 
@@ -262,10 +264,31 @@ enum {
 	RQ_SPLICE_OK,		/* turned off in gss privacy to prevent
 				 * encrypting page cache pages */
 	RQ_VICTIM,		/* Have agreed to shut down */
-	RQ_BUSY,		/* request is busy */
 	RQ_DATA,		/* request has data */
 };
 
+/**
+ * svc_thread_set_busy - mark a thread as busy
+ * @rqstp: the thread which is now busy
+ *
+ * If rq_idle is "empty", the thread must be busy.
+ */
+static inline void svc_thread_set_busy(struct svc_rqst *rqstp)
+{
+	INIT_LIST_HEAD(&rqstp->rq_idle);
+}
+
+/**
+ * svc_thread_busy - check if a thread is busy
+ * @rqstp: the thread which might be busy
+ *
+ * If rq_idle is "empty", the thread must be busy.
+ */
+static inline bool svc_thread_busy(struct svc_rqst *rqstp)
+{
+	return list_empty(&rqstp->rq_idle);
+}
+
 #define SVC_NET(rqst) (rqst->rq_xprt ? rqst->rq_xprt->xpt_net : rqst->rq_bc_net)
 
 /*
diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h
index 6beb38c1dcb5..337c90787fb1 100644
--- a/include/trace/events/sunrpc.h
+++ b/include/trace/events/sunrpc.h
@@ -1677,7 +1677,6 @@ DEFINE_SVCXDRBUF_EVENT(sendto);
 	svc_rqst_flag(DROPME)						\
 	svc_rqst_flag(SPLICE_OK)					\
 	svc_rqst_flag(VICTIM)						\
-	svc_rqst_flag(BUSY)						\
 	svc_rqst_flag_end(DATA)
 
 #undef svc_rqst_flag
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index db579bbc0a0a..9d080fe2dcdf 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -510,6 +510,7 @@ __svc_create(struct svc_program *prog, unsigned int bufsize, int npools,
 		pool->sp_id = i;
 		INIT_LIST_HEAD(&pool->sp_sockets);
 		INIT_LIST_HEAD(&pool->sp_all_threads);
+		INIT_LIST_HEAD(&pool->sp_idle_threads);
 		spin_lock_init(&pool->sp_lock);
 
 		percpu_counter_init(&pool->sp_messages_arrived, 0, GFP_KERNEL);
@@ -641,7 +642,7 @@ svc_rqst_alloc(struct svc_serv *serv, struct svc_pool *pool, int node)
 
 	folio_batch_init(&rqstp->rq_fbatch);
 
-	__set_bit(RQ_BUSY, &rqstp->rq_flags);
+	svc_thread_set_busy(rqstp);
 	rqstp->rq_server = serv;
 	rqstp->rq_pool = pool;
 
@@ -702,10 +703,13 @@ void svc_pool_wake_idle_thread(struct svc_pool *pool)
 	struct svc_rqst	*rqstp;
 
 	rcu_read_lock();
-	list_for_each_entry_rcu(rqstp, &pool->sp_all_threads, rq_all) {
-		if (test_and_set_bit(RQ_BUSY, &rqstp->rq_flags))
-			continue;
-
+	spin_lock_bh(&pool->sp_lock);
+	rqstp = list_first_entry_or_null(&pool->sp_idle_threads,
+					 struct svc_rqst, rq_idle);
+	if (rqstp)
+		list_del_init(&rqstp->rq_idle);
+	spin_unlock_bh(&pool->sp_lock);
+	if (rqstp) {
 		WRITE_ONCE(rqstp->rq_qtime, ktime_get());
 		wake_up_process(rqstp->rq_task);
 		rcu_read_unlock();
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index b8539545fefd..ebfeeb504a79 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -737,8 +737,9 @@ static void svc_rqst_wait_for_work(struct svc_rqst *rqstp)
 		set_current_state(TASK_IDLE);
 		smp_mb__before_atomic();
 		clear_bit(SP_CONGESTED, &pool->sp_flags);
-		clear_bit(RQ_BUSY, &rqstp->rq_flags);
-		smp_mb__after_atomic();
+		spin_lock_bh(&pool->sp_lock);
+		list_add(&rqstp->rq_idle, &pool->sp_idle_threads);
+		spin_unlock_bh(&pool->sp_lock);
 
 		/* Need to check should_sleep() again after
 		 * setting task state in case a wakeup happened
@@ -751,8 +752,14 @@ static void svc_rqst_wait_for_work(struct svc_rqst *rqstp)
 			cond_resched();
 		}
 
-		set_bit(RQ_BUSY, &rqstp->rq_flags);
-		smp_mb__after_atomic();
+		/* We *must* be removed from the list before we can continue.
+		 * If we were woken, this is already done
+		 */
+		if (!svc_thread_busy(rqstp)) {
+			spin_lock_bh(&pool->sp_lock);
+			list_del_init(&rqstp->rq_idle);
+			spin_unlock_bh(&pool->sp_lock);
+		}
 	} else {
 		cond_resched();
 	}




* [PATCH v1 07/17] SUNRPC: discard SP_CONGESTED
From: Chuck Lever @ 2023-09-11 14:39 UTC (permalink / raw)
  To: linux-nfs; +Cc: NeilBrown, Chuck Lever

From: NeilBrown <neilb@suse.de>

We can tell if a pool is congested by checking if the idle list is
empty.  We don't need a separate flag.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/sunrpc/svc.h |    1 -
 net/sunrpc/svc.c           |    1 -
 net/sunrpc/svc_xprt.c      |    4 +---
 3 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index e9c34e99bc88..22b3018ebf62 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -50,7 +50,6 @@ struct svc_pool {
 /* bits for sp_flags */
 enum {
 	SP_TASK_PENDING,	/* still work to do even if no xprt is queued */
-	SP_CONGESTED,		/* all threads are busy, none idle */
 	SP_NEED_VICTIM,		/* One thread needs to agree to exit */
 	SP_VICTIM_REMAINS,	/* One thread needs to actually exit */
 };
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 9d080fe2dcdf..db4674211f36 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -719,7 +719,6 @@ void svc_pool_wake_idle_thread(struct svc_pool *pool)
 	}
 	rcu_read_unlock();
 
-	set_bit(SP_CONGESTED, &pool->sp_flags);
 }
 EXPORT_SYMBOL_GPL(svc_pool_wake_idle_thread);
 
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index ebfeeb504a79..fa0d854a5596 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -735,8 +735,6 @@ static void svc_rqst_wait_for_work(struct svc_rqst *rqstp)
 
 	if (rqst_should_sleep(rqstp)) {
 		set_current_state(TASK_IDLE);
-		smp_mb__before_atomic();
-		clear_bit(SP_CONGESTED, &pool->sp_flags);
 		spin_lock_bh(&pool->sp_lock);
 		list_add(&rqstp->rq_idle, &pool->sp_idle_threads);
 		spin_unlock_bh(&pool->sp_lock);
@@ -874,7 +872,7 @@ void svc_recv(struct svc_rqst *rqstp)
 		/* Normally we will wait up to 5 seconds for any required
 		 * cache information to be provided.
 		 */
-		if (!test_bit(SP_CONGESTED, &pool->sp_flags))
+		if (!list_empty(&pool->sp_idle_threads))
 			rqstp->rq_chandle.thread_wait = 5 * HZ;
 		else
 			rqstp->rq_chandle.thread_wait = 1 * HZ;




* [PATCH v1 08/17] llist: add interface to check if a node is on a list.
From: Chuck Lever @ 2023-09-11 14:39 UTC (permalink / raw)
  To: linux-nfs; +Cc: NeilBrown, Chuck Lever

From: NeilBrown <neilb@suse.de>

With list.h lists, it is easy to test if a node is on a list, provided
it was initialised and is removed with list_del_init().

This patch provides similar functionality for llist.h lists.

 init_llist_node()
marks a node as being not-on-any-list by setting the ->next pointer to
the node itself.
 llist_on_list()
tests if the node is on any list.
 llist_del_first_init()
removes the first element from an llist and marks it as being off-list.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/llist.h |   42 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/include/linux/llist.h b/include/linux/llist.h
index 85bda2d02d65..dcb91e3bac1c 100644
--- a/include/linux/llist.h
+++ b/include/linux/llist.h
@@ -73,6 +73,33 @@ static inline void init_llist_head(struct llist_head *list)
 	list->first = NULL;
 }
 
+/**
+ * init_llist_node - initialize lock-less list node
+ * @node:	the node to be initialised
+ *
+ * In cases where there is a need to test if a node is on
+ * a list or not, this initialises the node to clearly
+ * not be on any list.
+ */
+static inline void init_llist_node(struct llist_node *node)
+{
+	node->next = node;
+}
+
+/**
+ * llist_on_list - test if a lock-less list node is on a list
+ * @node:	the node to test
+ *
+ * When a node is on a list the ->next pointer will be NULL or
+ * some other node.  It can never point to itself.  We use that
+ * in init_llist_node() to record that a node is not on any list,
+ * and here to test whether it is on any list.
+ */
+static inline bool llist_on_list(const struct llist_node *node)
+{
+	return node->next != node;
+}
+
 /**
  * llist_entry - get the struct of this entry
  * @ptr:	the &struct llist_node pointer.
@@ -249,6 +276,21 @@ static inline struct llist_node *__llist_del_all(struct llist_head *head)
 
 extern struct llist_node *llist_del_first(struct llist_head *head);
 
+/**
+ * llist_del_first_init - delete first entry from lock-less list and mark it as being off-list
+ * @head:	the head of lock-less list to delete from.
+ *
+ * This behaves the same as llist_del_first() except that init_llist_node() is called
+ * on the returned node so that llist_on_list() will report false for the node.
+ */
+static inline struct llist_node *llist_del_first_init(struct llist_head *head)
+{
+	struct llist_node *n = llist_del_first(head);
+
+	if (n)
+		init_llist_node(n);
+	return n;
+}
 struct llist_node *llist_reverse_order(struct llist_node *head);
 
 #endif /* LLIST_H */




* [PATCH v1 09/17] SUNRPC: change service idle list to be an llist
From: Chuck Lever @ 2023-09-11 14:39 UTC (permalink / raw)
  To: linux-nfs; +Cc: NeilBrown, Chuck Lever

From: NeilBrown <neilb@suse.de>

With an llist we don't need to take a lock to add a thread to the list,
though we still need a lock to remove it.  That will go in the next
patch.

Unlike with doubly linked lists, a thread cannot reliably remove itself from
the list.  Only the first thread can be removed, and that can change
asynchronously.  So some care is needed.

We already check if there is pending work to do, so we are unlikely to
add ourselves to the idle list and then want to remove ourselves again.

If we DO find something needs to be done after adding ourselves to the
list, we simply wake up the first thread on the list.  If that was us,
we successfully removed ourselves and can continue.  If it was some
other thread, they will do the work that needs to be done.  We can
safely sleep until woken.

We also remove the test on freezing() from rqst_should_sleep().  Instead
we set TASK_FREEZABLE before scheduling.  This makes it safe to
schedule() when a freeze is pending.  As we now loop waiting to be
removed from the idle queue, this is a cleaner way to handle freezing.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/sunrpc/svc.h |   21 +++++----------------
 net/sunrpc/svc.c           |   14 +++++++-------
 net/sunrpc/svc_xprt.c      |   45 +++++++++++++++++++-------------------------
 3 files changed, 31 insertions(+), 49 deletions(-)

diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index 22b3018ebf62..ad4572630335 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -37,7 +37,7 @@ struct svc_pool {
 	struct list_head	sp_sockets;	/* pending sockets */
 	unsigned int		sp_nrthreads;	/* # of threads in pool */
 	struct list_head	sp_all_threads;	/* all server threads */
-	struct list_head	sp_idle_threads; /* idle server threads */
+	struct llist_head	sp_idle_threads; /* idle server threads */
 
 	/* statistics on pool operation */
 	struct percpu_counter	sp_messages_arrived;
@@ -186,7 +186,7 @@ extern u32 svc_max_payload(const struct svc_rqst *rqstp);
  */
 struct svc_rqst {
 	struct list_head	rq_all;		/* all threads list */
-	struct list_head	rq_idle;	/* On the idle list */
+	struct llist_node	rq_idle;	/* On the idle list */
 	struct rcu_head		rq_rcu_head;	/* for RCU deferred kfree */
 	struct svc_xprt *	rq_xprt;	/* transport ptr */
 
@@ -266,26 +266,15 @@ enum {
 	RQ_DATA,		/* request has data */
 };
 
-/**
- * svc_thread_set_busy - mark a thread as busy
- * @rqstp: the thread which is now busy
- *
- * If rq_idle is "empty", the thread must be busy.
- */
-static inline void svc_thread_set_busy(struct svc_rqst *rqstp)
-{
-	INIT_LIST_HEAD(&rqstp->rq_idle);
-}
-
 /**
  * svc_thread_busy - check if a thread is busy
  * @rqstp: the thread which might be busy
  *
- * If rq_idle is "empty", the thread must be busy.
+ * A thread is only busy when it is not on the idle list.
  */
-static inline bool svc_thread_busy(struct svc_rqst *rqstp)
+static inline bool svc_thread_busy(const struct svc_rqst *rqstp)
 {
-	return list_empty(&rqstp->rq_idle);
+	return !llist_on_list(&rqstp->rq_idle);
 }
 
 #define SVC_NET(rqst) (rqst->rq_xprt ? rqst->rq_xprt->xpt_net : rqst->rq_bc_net)
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index db4674211f36..54ae6a569f6a 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -510,7 +510,7 @@ __svc_create(struct svc_program *prog, unsigned int bufsize, int npools,
 		pool->sp_id = i;
 		INIT_LIST_HEAD(&pool->sp_sockets);
 		INIT_LIST_HEAD(&pool->sp_all_threads);
-		INIT_LIST_HEAD(&pool->sp_idle_threads);
+		init_llist_head(&pool->sp_idle_threads);
 		spin_lock_init(&pool->sp_lock);
 
 		percpu_counter_init(&pool->sp_messages_arrived, 0, GFP_KERNEL);
@@ -642,7 +642,7 @@ svc_rqst_alloc(struct svc_serv *serv, struct svc_pool *pool, int node)
 
 	folio_batch_init(&rqstp->rq_fbatch);
 
-	svc_thread_set_busy(rqstp);
+	init_llist_node(&rqstp->rq_idle);
 	rqstp->rq_server = serv;
 	rqstp->rq_pool = pool;
 
@@ -701,15 +701,15 @@ svc_prepare_thread(struct svc_serv *serv, struct svc_pool *pool, int node)
 void svc_pool_wake_idle_thread(struct svc_pool *pool)
 {
 	struct svc_rqst	*rqstp;
+	struct llist_node *ln;
 
 	rcu_read_lock();
 	spin_lock_bh(&pool->sp_lock);
-	rqstp = list_first_entry_or_null(&pool->sp_idle_threads,
-					 struct svc_rqst, rq_idle);
-	if (rqstp)
-		list_del_init(&rqstp->rq_idle);
+	ln = llist_del_first_init(&pool->sp_idle_threads);
 	spin_unlock_bh(&pool->sp_lock);
-	if (rqstp) {
+	if (ln) {
+		rqstp = llist_entry(ln, struct svc_rqst, rq_idle);
+
 		WRITE_ONCE(rqstp->rq_qtime, ktime_get());
 		wake_up_process(rqstp->rq_task);
 		rcu_read_unlock();
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index fa0d854a5596..17c43bde35c9 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -715,10 +715,6 @@ rqst_should_sleep(struct svc_rqst *rqstp)
 	if (svc_thread_should_stop(rqstp))
 		return false;
 
-	/* are we freezing? */
-	if (freezing(current))
-		return false;
-
 #if defined(CONFIG_SUNRPC_BACKCHANNEL)
 	if (svc_is_backchannel(rqstp)) {
 		if (!list_empty(&rqstp->rq_server->sv_cb_list))
@@ -734,30 +730,26 @@ static void svc_rqst_wait_for_work(struct svc_rqst *rqstp)
 	struct svc_pool *pool = rqstp->rq_pool;
 
 	if (rqst_should_sleep(rqstp)) {
-		set_current_state(TASK_IDLE);
-		spin_lock_bh(&pool->sp_lock);
-		list_add(&rqstp->rq_idle, &pool->sp_idle_threads);
-		spin_unlock_bh(&pool->sp_lock);
+		set_current_state(TASK_IDLE | TASK_FREEZABLE);
+		llist_add(&rqstp->rq_idle, &pool->sp_idle_threads);
+
+		if (unlikely(!rqst_should_sleep(rqstp)))
+			/* Work just became available.  This thread cannot simply
+			 * choose not to sleep as it *must* wait until removed.
+			 * So wake the first waiter - whether it is this
+			 * thread or some other, it will get the work done.
+			 */
+			svc_pool_wake_idle_thread(pool);
 
-		/* Need to check should_sleep() again after
-		 * setting task state in case a wakeup happened
-		 * between testing and setting.
+		/* Since a thread cannot remove itself from an llist,
+		 * schedule until someone else removes @rqstp from
+		 * the idle list.
 		 */
-		if (rqst_should_sleep(rqstp)) {
+		while (!svc_thread_busy(rqstp)) {
 			schedule();
-		} else {
-			__set_current_state(TASK_RUNNING);
-			cond_resched();
-		}
-
-		/* We *must* be removed from the list before we can continue.
-		 * If we were woken, this is already done
-		 */
-		if (!svc_thread_busy(rqstp)) {
-			spin_lock_bh(&pool->sp_lock);
-			list_del_init(&rqstp->rq_idle);
-			spin_unlock_bh(&pool->sp_lock);
+			set_current_state(TASK_IDLE | TASK_FREEZABLE);
 		}
+		__set_current_state(TASK_RUNNING);
 	} else {
 		cond_resched();
 	}
@@ -870,9 +862,10 @@ void svc_recv(struct svc_rqst *rqstp)
 		struct svc_xprt *xprt = rqstp->rq_xprt;
 
 		/* Normally we will wait up to 5 seconds for any required
-		 * cache information to be provided.
+		 * cache information to be provided.  When there are no
+		 * idle threads, we reduce the wait time.
 		 */
-		if (!list_empty(&pool->sp_idle_threads))
+		if (pool->sp_idle_threads.first)
 			rqstp->rq_chandle.thread_wait = 5 * HZ;
 		else
 			rqstp->rq_chandle.thread_wait = 1 * HZ;



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v1 10/17] llist: add llist_del_first_this()
  2023-09-11 14:38 [PATCH v1 00/17] RPC service thread improvements Chuck Lever
                   ` (8 preceding siblings ...)
  2023-09-11 14:39 ` [PATCH v1 09/17] SUNRPC: change service idle list to be an llist Chuck Lever
@ 2023-09-11 14:39 ` Chuck Lever
  2023-09-11 14:39 ` [PATCH v1 11/17] lib: add light-weight queuing mechanism Chuck Lever
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Chuck Lever @ 2023-09-11 14:39 UTC (permalink / raw)
  To: linux-nfs; +Cc: NeilBrown, Chuck Lever

From: NeilBrown <neilb@suse.de>

llist_del_first_this() deletes a specific entry from an llist, providing
it is at the head of the list.  Multiple threads can call this
concurrently providing they each offer a different entry.

This can be used for a set of worker threads which are on the llist when
they are idle.  The head can always be woken, and when it is woken it
can remove itself, and possibly wake the next if there is an excess of
work to do.
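
For illustration only (this sketch is not part of the patch, and the
"my_worker" names are invented), an idle worker using the new interface
might look something like:

struct my_worker {
	struct llist_node	idle;	/* on the idle llist while idle */
	struct task_struct	*task;
};

static void my_worker_wait(struct llist_head *idle_list,
			   struct my_worker *w)
{
	set_current_state(TASK_IDLE);
	llist_add(&w->idle, idle_list);
	/* A real caller would re-check for pending work here before
	 * committing to sleep.
	 */
	schedule();

	/* Only the head of the list may remove itself.  If another
	 * worker pushed itself in front of us while we slept, it is
	 * responsible for any pending work, so just sleep again.
	 */
	while (!llist_del_first_this(idle_list, &w->idle)) {
		set_current_state(TASK_IDLE);
		schedule();
	}
	__set_current_state(TASK_RUNNING);
}

A waker then only needs to wake the task at the head of the llist; the
woken thread removes itself before doing any work.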

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/llist.h |    4 ++++
 lib/llist.c           |   28 ++++++++++++++++++++++++++++
 2 files changed, 32 insertions(+)

diff --git a/include/linux/llist.h b/include/linux/llist.h
index dcb91e3bac1c..2c982ff7475a 100644
--- a/include/linux/llist.h
+++ b/include/linux/llist.h
@@ -291,6 +291,10 @@ static inline struct llist_node *llist_del_first_init(struct llist_head *head)
 		init_llist_node(n);
 	return n;
 }
+
+extern bool llist_del_first_this(struct llist_head *head,
+				 struct llist_node *this);
+
 struct llist_node *llist_reverse_order(struct llist_node *head);
 
 #endif /* LLIST_H */
diff --git a/lib/llist.c b/lib/llist.c
index 6e668fa5a2c6..f21d0cfbbaaa 100644
--- a/lib/llist.c
+++ b/lib/llist.c
@@ -65,6 +65,34 @@ struct llist_node *llist_del_first(struct llist_head *head)
 }
 EXPORT_SYMBOL_GPL(llist_del_first);
 
+/**
+ * llist_del_first_this - delete given entry of lock-less list if it is first
+ * @head:	the head for your lock-less list
+ * @this:	a list entry.
+ *
+ * If the head of the list is the given entry, delete it and return
+ * %true, otherwise return %false.
+ *
+ * Multiple callers can safely call this concurrently with multiple
+ * llist_add() callers, providing all the callers offer a different @this.
+ */
+bool llist_del_first_this(struct llist_head *head,
+			  struct llist_node *this)
+{
+	struct llist_node *entry, *next;
+
+	/* acquire ensures ordering wrt try_cmpxchg() in llist_del_first() */
+	entry = smp_load_acquire(&head->first);
+	do {
+		if (entry != this)
+			return false;
+		next = READ_ONCE(entry->next);
+	} while (!try_cmpxchg(&head->first, &entry, next));
+
+	return true;
+}
+EXPORT_SYMBOL_GPL(llist_del_first_this);
+
 /**
  * llist_reverse_order - reverse order of a llist chain
  * @head:	first item of the list to be reversed



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v1 11/17] lib: add light-weight queuing mechanism.
  2023-09-11 14:38 [PATCH v1 00/17] RPC service thread improvements Chuck Lever
                   ` (9 preceding siblings ...)
  2023-09-11 14:39 ` [PATCH v1 10/17] llist: add llist_del_first_this() Chuck Lever
@ 2023-09-11 14:39 ` Chuck Lever
  2023-09-11 18:13   ` Andrew Morton
  2023-09-12  2:37   ` [PATCH v1 11/17] " Kees Cook
  2023-09-11 14:39 ` [PATCH v1 12/17] SUNRPC: rename some functions from rqst_ to svc_thread_ Chuck Lever
                   ` (5 subsequent siblings)
  16 siblings, 2 replies; 29+ messages in thread
From: Chuck Lever @ 2023-09-11 14:39 UTC (permalink / raw)
  To: linux-nfs
  Cc: NeilBrown, Andrew Morton, Liam R. Howlett, Kees Cook,
	Greg Kroah-Hartman, David Gow, linux-kernel, Chuck Lever

From: NeilBrown <neilb@suse.de>

lwq is a FIFO single-linked queue that only requires a spinlock
for dequeueing, which happens in process context.  Enqueueing is atomic
with no spinlock and can happen in any context.

Include a unit test for basic functionality - runs at boot time.  Does
not use kunit framework.
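
As an illustration only (not part of this patch: "struct my_job" and the
surrounding names are invented; the lwq calls are the ones added below),
a minimal producer/consumer pairing might look like:

#include <linux/lwq.h>
#include <linux/slab.h>
#include <linux/wait_bit.h>

struct my_job {
	struct lwq_node	node;
	int		id;
};

static struct lwq my_queue;	/* initialised once with lwq_init(&my_queue) */

/* Producer: may run in any context and takes no lock. */
static void my_submit(struct my_job *job)
{
	if (lwq_enqueue(&job->node, &my_queue))
		/* the queue may have been empty - wake a consumer */
		wake_up_var(&my_queue);
}

/* Consumer: process context only; dequeueing takes the internal spinlock. */
static void my_consume_one(void)
{
	struct my_job *job;

	wait_var_event(&my_queue,
		       (job = lwq_dequeue(&my_queue, struct my_job, node)) != NULL);
	/* ... process job->id ... */
	kfree(job);
}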

Signed-off-by: NeilBrown <neilb@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: David Gow <davidgow@google.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/lwq.h |  120 +++++++++++++++++++++++++++++++++++++++++
 lib/Kconfig         |    5 ++
 lib/Makefile        |    2 -
 lib/lwq.c           |  151 +++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 277 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/lwq.h
 create mode 100644 lib/lwq.c

diff --git a/include/linux/lwq.h b/include/linux/lwq.h
new file mode 100644
index 000000000000..52b9c81b493a
--- /dev/null
+++ b/include/linux/lwq.h
@@ -0,0 +1,120 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef LWQ_H
+#define LWQ_H
+/*
+ * light-weight single-linked queue built from llist
+ *
+ * Entries can be enqueued from any context with no locking.
+ * Entries can be dequeued from process context with integrated locking.
+ */
+#include <linux/container_of.h>
+#include <linux/spinlock.h>
+#include <linux/llist.h>
+
+struct lwq_node {
+	struct llist_node node;
+};
+
+struct lwq {
+	spinlock_t		lock;
+	struct llist_node	*ready;		/* entries to be dequeued */
+	struct llist_head	new;		/* entries being enqueued */
+};
+
+/**
+ * lwq_init - initialise a lwq
+ * @q:	the lwq object
+ */
+static inline void lwq_init(struct lwq *q)
+{
+	spin_lock_init(&q->lock);
+	q->ready = NULL;
+	init_llist_head(&q->new);
+}
+
+/**
+ * lwq_empty - test if lwq contains any entry
+ * @q:	the lwq object
+ *
+ * This empty test contains an acquire barrier so that if a wakeup
+ * is sent when lwq_dequeue returns true, it is safe to go to sleep after
+ * a test on lwq_empty().
+ */
+static inline bool lwq_empty(struct lwq *q)
+{
+	/* acquire ensures ordering wrt lwq_enqueue() */
+	return smp_load_acquire(&q->ready) == NULL && llist_empty(&q->new);
+}
+
+struct llist_node *__lwq_dequeue(struct lwq *q);
+/**
+ * lwq_dequeue - dequeue first (oldest) entry from lwq
+ * @q:		the queue to dequeue from
+ * @type:	the type of object to return
+ * @member:	the member in the returned object which is an lwq_node.
+ *
+ * Remove a single object from the lwq and return it.  This will take
+ * a spinlock and so must always be called in the same context, typically
+ * process context.
+ */
+#define lwq_dequeue(q, type, member)					\
+	({ struct llist_node *_n = __lwq_dequeue(q);			\
+	  _n ? container_of(_n, type, member.node) : NULL; })
+
+struct llist_node *lwq_dequeue_all(struct lwq *q);
+
+/**
+ * lwq_for_each_safe - iterate over detached queue allowing deletion
+ * @_n:		iterator variable
+ * @_t1:	temporary struct llist_node **
+ * @_t2:	temporary struct llist_node *
+ * @_l:		address of llist_node pointer from lwq_dequeue_all()
+ * @_member:	member in _n where lwq_node is found.
+ *
+ * Iterate over members in a dequeued list.  If the iterator variable
+ * is set to NULL, the iterator removes that entry from the queue.
+ */
+#define lwq_for_each_safe(_n, _t1, _t2, _l, _member)			\
+	for (_t1 = (_l);						\
+	     *(_t1) ? (_n = container_of(*(_t1), typeof(*(_n)), _member.node),\
+		       _t2 = ((*_t1)->next),				\
+		       true)						\
+	     : false;							\
+	     (_n) ? (_t1 = &(_n)->_member.node.next, 0)			\
+	     : ((*(_t1) = (_t2)),  0))
+
+/**
+ * lwq_enqueue - add a new item to the end of the queue
+ * @n	- the lwq_node embedded in the item to be added
+ * @q	- the lwq to append to.
+ *
+ * No locking is needed to append to the queue so this can
+ * be called from any context.
+ * Return %true if the list may have previously been empty.
+ */
+static inline bool lwq_enqueue(struct lwq_node *n, struct lwq *q)
+{
+	/* acquire ensures ordering wrt lwq_dequeue */
+	return llist_add(&n->node, &q->new) &&
+		smp_load_acquire(&q->ready) == NULL;
+}
+
+/**
+ * lwq_enqueue_batch - add a list of new items to the end of the queue
+ * @n	- the lwq_node embedded in the first item to be added
+ * @q	- the lwq to append to.
+ *
+ * No locking is needed to append to the queue so this can
+ * be called from any context.
+ * Return %true if the list may have previously been empty.
+ */
+static inline bool lwq_enqueue_batch(struct llist_node *n, struct lwq *q)
+{
+	struct llist_node *e = n;
+
+	/* acquire ensures ordering wrt lwq_dequeue */
+	return llist_add_batch(llist_reverse_order(n), e, &q->new) &&
+		smp_load_acquire(&q->ready) == NULL;
+}
+#endif /* LWQ_H */
diff --git a/lib/Kconfig b/lib/Kconfig
index c686f4adc124..76fe64f933fc 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -729,6 +729,11 @@ config PARMAN
 config OBJAGG
 	tristate "objagg" if COMPILE_TEST
 
+config LWQ_TEST
+	bool "Boot-time test for lwq queuing"
+	help
+          Run boot-time test of light-weight queuing.
+
 endmenu
 
 config GENERIC_IOREMAP
diff --git a/lib/Makefile b/lib/Makefile
index 740109b6e2c8..d0c116b706e6 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -45,7 +45,7 @@ obj-y	+= lockref.o
 obj-y += bcd.o sort.o parser.o debug_locks.o random32.o \
 	 bust_spinlocks.o kasprintf.o bitmap.o scatterlist.o \
 	 list_sort.o uuid.o iov_iter.o clz_ctz.o \
-	 bsearch.o find_bit.o llist.o memweight.o kfifo.o \
+	 bsearch.o find_bit.o llist.o lwq.o memweight.o kfifo.o \
 	 percpu-refcount.o rhashtable.o base64.o \
 	 once.o refcount.o rcuref.o usercopy.o errseq.o bucket_locks.o \
 	 generic-radix-tree.o
diff --git a/lib/lwq.c b/lib/lwq.c
new file mode 100644
index 000000000000..7fe6c7125357
--- /dev/null
+++ b/lib/lwq.c
@@ -0,0 +1,151 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Light weight single-linked queue.
+ *
+ * Entries are enqueued to the head of an llist, with no blocking.
+ * This can happen in any context.
+ *
+ * Entries are dequeued using a spinlock to protect against
+ * multiple access.  They are staged on a "ready" list in reversed
+ * (FIFO) order; the ready list is refilled from the llist whenever
+ * it is exhausted.
+ */
+#include <linux/rcupdate.h>
+#include <linux/lwq.h>
+
+struct llist_node *__lwq_dequeue(struct lwq *q)
+{
+	struct llist_node *this;
+
+	if (lwq_empty(q))
+		return NULL;
+	spin_lock(&q->lock);
+	this = q->ready;
+	if (!this && !llist_empty(&q->new)) {
+		/* ensure queue doesn't appear transiently lwq_empty */
+		smp_store_release(&q->ready, (void *)1);
+		this = llist_reverse_order(llist_del_all(&q->new));
+		if (!this)
+			q->ready = NULL;
+	}
+	if (this)
+		q->ready = llist_next(this);
+	spin_unlock(&q->lock);
+	return this;
+}
+EXPORT_SYMBOL_GPL(__lwq_dequeue);
+
+/**
+ * lwq_dequeue_all - dequeue all currently enqueued objects
+ * @q:	the queue to dequeue from
+ *
+ * Remove and return a linked list of llist_nodes of all the objects that were
+ * in the queue. The first on the list will be the object that was least
+ * recently enqueued.
+ */
+struct llist_node *lwq_dequeue_all(struct lwq *q)
+{
+	struct llist_node *r, *t, **ep;
+
+	if (lwq_empty(q))
+		return NULL;
+
+	spin_lock(&q->lock);
+	r = q->ready;
+	q->ready = NULL;
+	t = llist_del_all(&q->new);
+	spin_unlock(&q->lock);
+	ep = &r;
+	while (*ep)
+		ep = &(*ep)->next;
+	*ep = llist_reverse_order(t);
+	return r;
+}
+EXPORT_SYMBOL_GPL(lwq_dequeue_all);
+
+#if IS_ENABLED(CONFIG_LWQ_TEST)
+
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/wait_bit.h>
+#include <linux/kthread.h>
+#include <linux/delay.h>
+struct tnode {
+	struct lwq_node n;
+	int i;
+	int c;
+};
+
+static int lwq_exercise(void *qv)
+{
+	struct lwq *q = qv;
+	int cnt;
+	struct tnode *t;
+
+	for (cnt = 0; cnt < 10000; cnt++) {
+		wait_var_event(q, (t = lwq_dequeue(q, struct tnode, n)) != NULL);
+		t->c++;
+		if (lwq_enqueue(&t->n, q))
+			wake_up_var(q);
+	}
+	while (!kthread_should_stop())
+		schedule_timeout_idle(1);
+	return 0;
+}
+
+static int lwq_test(void)
+{
+	int i;
+	struct lwq q;
+	struct llist_node *l, **t1, *t2;
+	struct tnode *t;
+	struct task_struct *threads[8];
+
+	printk(KERN_INFO "testing lwq....\n");
+	lwq_init(&q);
+	printk(KERN_INFO " lwq: run some threads\n");
+	for (i = 0; i < ARRAY_SIZE(threads); i++)
+		threads[i] = kthread_run(lwq_exercise, &q, "lwq-test-%d", i);
+	for (i = 0; i < 100; i++) {
+		t = kmalloc(sizeof(*t), GFP_KERNEL);
+		t->i = i;
+		t->c = 0;
+		if (lwq_enqueue(&t->n, &q))
+			wake_up_var(&q);
+	};
+	/* wait for threads to exit */
+	for (i = 0; i < ARRAY_SIZE(threads); i++)
+		if (!IS_ERR_OR_NULL(threads[i]))
+			kthread_stop(threads[i]);
+	printk(KERN_INFO " lwq: dequeue first 50:");
+	for (i = 0; i < 50 ; i++) {
+		if (i && (i % 10) == 0) {
+			printk(KERN_CONT "\n");
+			printk(KERN_INFO " lwq: ... ");
+		}
+		t = lwq_dequeue(&q, struct tnode, n);
+		printk(KERN_CONT " %d(%d)", t->i, t->c);
+		kfree(t);
+	}
+	printk(KERN_CONT "\n");
+	l = lwq_dequeue_all(&q);
+	printk(KERN_INFO " lwq: delete the multiples of 3 (test lwq_for_each_safe())\n");
+	lwq_for_each_safe(t, t1, t2, &l, n) {
+		if ((t->i % 3) == 0) {
+			t->i = -1;
+			kfree(t);
+			t = NULL;
+		}
+	}
+	if (l)
+		lwq_enqueue_batch(l, &q);
+	printk(KERN_INFO " lwq: dequeue remaining:");
+	while ((t = lwq_dequeue(&q, struct tnode, n)) != NULL) {
+		printk(KERN_CONT " %d", t->i);
+		kfree(t);
+	}
+	printk(KERN_CONT "\n");
+	return 0;
+}
+
+module_init(lwq_test);
+#endif /* CONFIG_LWQ_TEST*/



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v1 12/17] SUNRPC: rename some functions from rqst_ to svc_thread_
  2023-09-11 14:38 [PATCH v1 00/17] RPC service thread improvements Chuck Lever
                   ` (10 preceding siblings ...)
  2023-09-11 14:39 ` [PATCH v1 11/17] lib: add light-weight queuing mechanism Chuck Lever
@ 2023-09-11 14:39 ` Chuck Lever
  2023-09-11 14:39 ` [PATCH v1 13/17] SUNRPC: only have one thread waking up at a time Chuck Lever
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Chuck Lever @ 2023-09-11 14:39 UTC (permalink / raw)
  To: linux-nfs; +Cc: NeilBrown, Chuck Lever

From: NeilBrown <neilb@suse.de>

Functions which directly manipulate a 'struct svc_rqst', such as
svc_rqst_alloc() or svc_rqst_release_pages(), can reasonably
have "rqst" in their name.
However, functions that act on the running thread, such as
XX_should_sleep() or XX_wait_for_work(), read more naturally
with a "svc_thread_" prefix.

So make those changes.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/svc_xprt.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 17c43bde35c9..1b300a7889eb 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -699,7 +699,7 @@ static bool svc_alloc_arg(struct svc_rqst *rqstp)
 }
 
 static bool
-rqst_should_sleep(struct svc_rqst *rqstp)
+svc_thread_should_sleep(struct svc_rqst *rqstp)
 {
 	struct svc_pool		*pool = rqstp->rq_pool;
 
@@ -725,15 +725,15 @@ rqst_should_sleep(struct svc_rqst *rqstp)
 	return true;
 }
 
-static void svc_rqst_wait_for_work(struct svc_rqst *rqstp)
+static void svc_thread_wait_for_work(struct svc_rqst *rqstp)
 {
 	struct svc_pool *pool = rqstp->rq_pool;
 
-	if (rqst_should_sleep(rqstp)) {
+	if (svc_thread_should_sleep(rqstp)) {
 		set_current_state(TASK_IDLE | TASK_FREEZABLE);
 		llist_add(&rqstp->rq_idle, &pool->sp_idle_threads);
 
-		if (unlikely(!rqst_should_sleep(rqstp)))
+		if (unlikely(!svc_thread_should_sleep(rqstp)))
 			/* Work just became available.  This thread cannot simply
 			 * choose not to sleep as it *must* wait until removed.
 			 * So wake the first waiter - whether it is this
@@ -850,7 +850,7 @@ void svc_recv(struct svc_rqst *rqstp)
 	if (!svc_alloc_arg(rqstp))
 		return;
 
-	svc_rqst_wait_for_work(rqstp);
+	svc_thread_wait_for_work(rqstp);
 
 	clear_bit(SP_TASK_PENDING, &pool->sp_flags);
 



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v1 13/17] SUNRPC: only have one thread waking up at a time
  2023-09-11 14:38 [PATCH v1 00/17] RPC service thread improvements Chuck Lever
                   ` (11 preceding siblings ...)
  2023-09-11 14:39 ` [PATCH v1 12/17] SUNRPC: rename some functions from rqst_ to svc_thread_ Chuck Lever
@ 2023-09-11 14:39 ` Chuck Lever
  2023-09-11 14:40 ` [PATCH v1 14/17] SUNRPC: use lwq for sp_sockets - renamed to sp_xprts Chuck Lever
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Chuck Lever @ 2023-09-11 14:39 UTC (permalink / raw)
  To: linux-nfs; +Cc: NeilBrown, Chuck Lever

From: NeilBrown <neilb@suse.de>

Currently, if several items of work become available in quick succession,
that number of threads (if available) will be woken.  By the time some
of them wake up, another thread that was already cache-warm might have
come along and completed the work.  Anecdotal evidence suggests as many
as 15% of wakes find nothing to do once they get to the point of
looking.

This patch changes svc_pool_wake_idle_thread() to wake the first thread
on the queue but NOT remove it.  Subsequent calls will wake the same
thread.  Once that thread starts it will dequeue itself and after
dequeueing some work to do, it will wake the next thread if there is more
work ready.  This results in a more orderly increase in the number of
busy threads.

As a bonus, this allows us to reduce locking around the idle queue.
svc_pool_wake_idle_thread() no longer needs to take a lock (beyond
rcu_read_lock()) as it doesn't manipulate the queue, it just looks at
the first item.
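
In outline (a sketch only; the real code is the svc_pool_wake_idle_thread()
change below), the waker now does no more than:

	struct llist_node *ln;
	struct svc_rqst *rqstp;

	rcu_read_lock();
	ln = READ_ONCE(pool->sp_idle_threads.first);	/* peek, do not dequeue */
	if (ln) {
		rqstp = llist_entry(ln, struct svc_rqst, rq_idle);
		if (!task_is_running(rqstp->rq_task))
			wake_up_process(rqstp->rq_task);
	}
	rcu_read_unlock();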

The thread itself can avoid locking by using the new
llist_del_first_this() interface.  This will safely remove the thread
itself if it is the head.  If it isn't the head, it will do nothing.
If multiple threads call this concurrently only one will succeed.  The
others will do nothing, so no corruption can result.

If a thread wakes up and finds that it cannot dequeue itself, that means
either:
- the wakeup was not because it was at the head of the queue.  Maybe the
  freezer woke it.  In that case it can go back to sleep (after trying
  to freeze, of course).
- some other thread recently found there was nothing to do and placed
  itself at the head of the queue, in front of this thread.  That thread
  must check for work again after placing itself there, so it can be
  deemed responsible for any pending work, and this thread can go back
  to sleep until woken.

No code ever tests for busy threads any more.  Only each thread itself
cares if it is busy.  So svc_thread_busy() is no longer needed.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/sunrpc/svc.h |   11 -----------
 net/sunrpc/svc.c           |   14 ++++++--------
 net/sunrpc/svc_xprt.c      |   38 +++++++++++++++++++++++++-------------
 3 files changed, 31 insertions(+), 32 deletions(-)

diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index ad4572630335..dafa362b4fdd 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -266,17 +266,6 @@ enum {
 	RQ_DATA,		/* request has data */
 };
 
-/**
- * svc_thread_busy - check if a thread is busy
- * @rqstp: the thread which might be busy
- *
- * A thread is only busy when it is not on the idle list.
- */
-static inline bool svc_thread_busy(const struct svc_rqst *rqstp)
-{
-	return !llist_on_list(&rqstp->rq_idle);
-}
-
 #define SVC_NET(rqst) (rqst->rq_xprt ? rqst->rq_xprt->xpt_net : rqst->rq_bc_net)
 
 /*
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 54ae6a569f6a..326592162af1 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -642,7 +642,6 @@ svc_rqst_alloc(struct svc_serv *serv, struct svc_pool *pool, int node)
 
 	folio_batch_init(&rqstp->rq_fbatch);
 
-	init_llist_node(&rqstp->rq_idle);
 	rqstp->rq_server = serv;
 	rqstp->rq_pool = pool;
 
@@ -704,17 +703,16 @@ void svc_pool_wake_idle_thread(struct svc_pool *pool)
 	struct llist_node *ln;
 
 	rcu_read_lock();
-	spin_lock_bh(&pool->sp_lock);
-	ln = llist_del_first_init(&pool->sp_idle_threads);
-	spin_unlock_bh(&pool->sp_lock);
+	ln = READ_ONCE(pool->sp_idle_threads.first);
 	if (ln) {
 		rqstp = llist_entry(ln, struct svc_rqst, rq_idle);
-
 		WRITE_ONCE(rqstp->rq_qtime, ktime_get());
-		wake_up_process(rqstp->rq_task);
+		if (!task_is_running(rqstp->rq_task)) {
+			wake_up_process(rqstp->rq_task);
+			trace_svc_wake_up(rqstp->rq_task->pid);
+			percpu_counter_inc(&pool->sp_threads_woken);
+		}
 		rcu_read_unlock();
-		percpu_counter_inc(&pool->sp_threads_woken);
-		trace_svc_wake_up(rqstp->rq_task->pid);
 		return;
 	}
 	rcu_read_unlock();
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 1b300a7889eb..75f66714e3a7 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -732,20 +732,19 @@ static void svc_thread_wait_for_work(struct svc_rqst *rqstp)
 	if (svc_thread_should_sleep(rqstp)) {
 		set_current_state(TASK_IDLE | TASK_FREEZABLE);
 		llist_add(&rqstp->rq_idle, &pool->sp_idle_threads);
+		if (likely(svc_thread_should_sleep(rqstp)))
+			schedule();
 
-		if (unlikely(!svc_thread_should_sleep(rqstp)))
-			/* Work just became available.  This thread cannot simply
-			 * choose not to sleep as it *must* wait until removed.
-			 * So wake the first waiter - whether it is this
-			 * thread or some other, it will get the work done.
+		while (!llist_del_first_this(&pool->sp_idle_threads,
+					     &rqstp->rq_idle)) {
+			/* Work just became available.  This thread can only
+			 * handle it after removing rqstp from the idle
+			 * list. If that attempt failed, some other thread
+			 * must have queued itself after finding no
+			 * work to do, so that thread has taken responsibility
+			 * for this new work.  This thread can safely sleep
+			 * until woken again.
 			 */
-			svc_pool_wake_idle_thread(pool);
-
-		/* Since a thread cannot remove itself from an llist,
-		 * schedule until someone else removes @rqstp from
-		 * the idle list.
-		 */
-		while (!svc_thread_busy(rqstp)) {
 			schedule();
 			set_current_state(TASK_IDLE | TASK_FREEZABLE);
 		}
@@ -835,6 +834,15 @@ static void svc_handle_xprt(struct svc_rqst *rqstp, struct svc_xprt *xprt)
 	svc_xprt_release(rqstp);
 }
 
+static void svc_thread_wake_next(struct svc_rqst *rqstp)
+{
+	if (!svc_thread_should_sleep(rqstp))
+		/* More work pending after I dequeued some,
+		 * wake another worker
+		 */
+		svc_pool_wake_idle_thread(rqstp->rq_pool);
+}
+
 /**
  * svc_recv - Receive and process the next request on any transport
  * @rqstp: an idle RPC service thread
@@ -854,13 +862,16 @@ void svc_recv(struct svc_rqst *rqstp)
 
 	clear_bit(SP_TASK_PENDING, &pool->sp_flags);
 
-	if (svc_thread_should_stop(rqstp))
+	if (svc_thread_should_stop(rqstp)) {
+		svc_thread_wake_next(rqstp);
 		return;
+	}
 
 	rqstp->rq_xprt = svc_xprt_dequeue(pool);
 	if (rqstp->rq_xprt) {
 		struct svc_xprt *xprt = rqstp->rq_xprt;
 
+		svc_thread_wake_next(rqstp);
 		/* Normally we will wait up to 5 seconds for any required
 		 * cache information to be provided.  When there are no
 		 * idle threads, we reduce the wait time.
@@ -885,6 +896,7 @@ void svc_recv(struct svc_rqst *rqstp)
 		if (req) {
 			list_del(&req->rq_bc_list);
 			spin_unlock_bh(&serv->sv_cb_lock);
+			svc_thread_wake_next(rqstp);
 
 			svc_process_bc(req, rqstp);
 			return;



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v1 14/17] SUNRPC: use lwq for sp_sockets - renamed to sp_xprts
  2023-09-11 14:38 [PATCH v1 00/17] RPC service thread improvements Chuck Lever
                   ` (12 preceding siblings ...)
  2023-09-11 14:39 ` [PATCH v1 13/17] SUNRPC: only have one thread waking up at a time Chuck Lever
@ 2023-09-11 14:40 ` Chuck Lever
  2023-09-11 14:40 ` [PATCH v1 15/17] SUNRPC: change sp_nrthreads to atomic_t Chuck Lever
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Chuck Lever @ 2023-09-11 14:40 UTC (permalink / raw)
  To: linux-nfs; +Cc: NeilBrown, Chuck Lever

From: NeilBrown <neilb@suse.de>

lwq avoids using back pointers in lists, and uses less locking.
This introduces a new spinlock, but the other one will be removed in a
future patch.

For svc_clean_up_xprts(), we now dequeue the entire queue, walk it to
remove and process the xprts that need cleaning up, then re-enqueue the
remaining queue.
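
As a generic sketch of that pattern (illustrative only; it reuses the
hypothetical "struct my_job" sketched under patch 11/17, with an embedded
lwq_node called "node" and an integer "id"):

static void my_drop_matching(struct lwq *q, int victim_id)
{
	struct llist_node *list, **t1, *t2;
	struct my_job *job;

	list = lwq_dequeue_all(q);
	lwq_for_each_safe(job, t1, t2, &list, node) {
		if (job->id == victim_id) {
			kfree(job);
			job = NULL;	/* NULLing the iterator unlinks the entry */
		}
	}
	if (list)
		lwq_enqueue_batch(list, q);
}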

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/sunrpc/svc.h      |    3 +-
 include/linux/sunrpc/svc_xprt.h |    2 +
 net/sunrpc/svc.c                |    2 +
 net/sunrpc/svc_xprt.c           |   57 ++++++++++++---------------------------
 4 files changed, 21 insertions(+), 43 deletions(-)

diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index dafa362b4fdd..7ff9fe785e49 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -17,6 +17,7 @@
 #include <linux/sunrpc/xdr.h>
 #include <linux/sunrpc/auth.h>
 #include <linux/sunrpc/svcauth.h>
+#include <linux/lwq.h>
 #include <linux/wait.h>
 #include <linux/mm.h>
 #include <linux/pagevec.h>
@@ -34,7 +35,7 @@
 struct svc_pool {
 	unsigned int		sp_id;	    	/* pool id; also node id on NUMA */
 	spinlock_t		sp_lock;	/* protects all fields */
-	struct list_head	sp_sockets;	/* pending sockets */
+	struct lwq		sp_xprts;	/* pending transports */
 	unsigned int		sp_nrthreads;	/* # of threads in pool */
 	struct list_head	sp_all_threads;	/* all server threads */
 	struct llist_head	sp_idle_threads; /* idle server threads */
diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
index fa55d12dc765..8e20cd60e2e7 100644
--- a/include/linux/sunrpc/svc_xprt.h
+++ b/include/linux/sunrpc/svc_xprt.h
@@ -54,7 +54,7 @@ struct svc_xprt {
 	const struct svc_xprt_ops *xpt_ops;
 	struct kref		xpt_ref;
 	struct list_head	xpt_list;
-	struct list_head	xpt_ready;
+	struct lwq_node		xpt_ready;
 	unsigned long		xpt_flags;
 
 	struct svc_serv		*xpt_server;	/* service for transport */
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 326592162af1..244b5b9eba4d 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -508,7 +508,7 @@ __svc_create(struct svc_program *prog, unsigned int bufsize, int npools,
 				i, serv->sv_name);
 
 		pool->sp_id = i;
-		INIT_LIST_HEAD(&pool->sp_sockets);
+		lwq_init(&pool->sp_xprts);
 		INIT_LIST_HEAD(&pool->sp_all_threads);
 		init_llist_head(&pool->sp_idle_threads);
 		spin_lock_init(&pool->sp_lock);
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 75f66714e3a7..28ca7db55da1 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -201,7 +201,6 @@ void svc_xprt_init(struct net *net, struct svc_xprt_class *xcl,
 	kref_init(&xprt->xpt_ref);
 	xprt->xpt_server = serv;
 	INIT_LIST_HEAD(&xprt->xpt_list);
-	INIT_LIST_HEAD(&xprt->xpt_ready);
 	INIT_LIST_HEAD(&xprt->xpt_deferred);
 	INIT_LIST_HEAD(&xprt->xpt_users);
 	mutex_init(&xprt->xpt_mutex);
@@ -472,9 +471,7 @@ void svc_xprt_enqueue(struct svc_xprt *xprt)
 	pool = svc_pool_for_cpu(xprt->xpt_server);
 
 	percpu_counter_inc(&pool->sp_sockets_queued);
-	spin_lock_bh(&pool->sp_lock);
-	list_add_tail(&xprt->xpt_ready, &pool->sp_sockets);
-	spin_unlock_bh(&pool->sp_lock);
+	lwq_enqueue(&xprt->xpt_ready, &pool->sp_xprts);
 
 	svc_pool_wake_idle_thread(pool);
 }
@@ -487,18 +484,9 @@ static struct svc_xprt *svc_xprt_dequeue(struct svc_pool *pool)
 {
 	struct svc_xprt	*xprt = NULL;
 
-	if (list_empty(&pool->sp_sockets))
-		goto out;
-
-	spin_lock_bh(&pool->sp_lock);
-	if (likely(!list_empty(&pool->sp_sockets))) {
-		xprt = list_first_entry(&pool->sp_sockets,
-					struct svc_xprt, xpt_ready);
-		list_del_init(&xprt->xpt_ready);
+	xprt = lwq_dequeue(&pool->sp_xprts, struct svc_xprt, xpt_ready);
+	if (xprt)
 		svc_xprt_get(xprt);
-	}
-	spin_unlock_bh(&pool->sp_lock);
-out:
 	return xprt;
 }
 
@@ -708,7 +696,7 @@ svc_thread_should_sleep(struct svc_rqst *rqstp)
 		return false;
 
 	/* was a socket queued? */
-	if (!list_empty(&pool->sp_sockets))
+	if (!lwq_empty(&pool->sp_xprts))
 		return false;
 
 	/* are we shutting down? */
@@ -1050,7 +1038,6 @@ static void svc_delete_xprt(struct svc_xprt *xprt)
 
 	spin_lock_bh(&serv->sv_lock);
 	list_del_init(&xprt->xpt_list);
-	WARN_ON_ONCE(!list_empty(&xprt->xpt_ready));
 	if (test_bit(XPT_TEMP, &xprt->xpt_flags))
 		serv->sv_tmpcnt--;
 	spin_unlock_bh(&serv->sv_lock);
@@ -1101,36 +1088,26 @@ static int svc_close_list(struct svc_serv *serv, struct list_head *xprt_list, st
 	return ret;
 }
 
-static struct svc_xprt *svc_dequeue_net(struct svc_serv *serv, struct net *net)
+static void svc_clean_up_xprts(struct svc_serv *serv, struct net *net)
 {
-	struct svc_pool *pool;
 	struct svc_xprt *xprt;
-	struct svc_xprt *tmp;
 	int i;
 
 	for (i = 0; i < serv->sv_nrpools; i++) {
-		pool = &serv->sv_pools[i];
-
-		spin_lock_bh(&pool->sp_lock);
-		list_for_each_entry_safe(xprt, tmp, &pool->sp_sockets, xpt_ready) {
-			if (xprt->xpt_net != net)
-				continue;
-			list_del_init(&xprt->xpt_ready);
-			spin_unlock_bh(&pool->sp_lock);
-			return xprt;
+		struct svc_pool *pool = &serv->sv_pools[i];
+		struct llist_node *q, **t1, *t2;
+
+		q = lwq_dequeue_all(&pool->sp_xprts);
+		lwq_for_each_safe(xprt, t1, t2, &q, xpt_ready) {
+			if (xprt->xpt_net == net) {
+				set_bit(XPT_CLOSE, &xprt->xpt_flags);
+				svc_delete_xprt(xprt);
+				xprt = NULL;
+			}
 		}
-		spin_unlock_bh(&pool->sp_lock);
-	}
-	return NULL;
-}
 
-static void svc_clean_up_xprts(struct svc_serv *serv, struct net *net)
-{
-	struct svc_xprt *xprt;
-
-	while ((xprt = svc_dequeue_net(serv, net))) {
-		set_bit(XPT_CLOSE, &xprt->xpt_flags);
-		svc_delete_xprt(xprt);
+		if (q)
+			lwq_enqueue_batch(q, &pool->sp_xprts);
 	}
 }
 



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v1 15/17] SUNRPC: change sp_nrthreads to atomic_t
  2023-09-11 14:38 [PATCH v1 00/17] RPC service thread improvements Chuck Lever
                   ` (13 preceding siblings ...)
  2023-09-11 14:40 ` [PATCH v1 14/17] SUNRPC: use lwq for sp_sockets - renamed to sp_xprts Chuck Lever
@ 2023-09-11 14:40 ` Chuck Lever
  2023-09-11 14:40 ` [PATCH v1 16/17] SUNRPC: discard sp_lock Chuck Lever
  2023-09-11 14:40 ` [PATCH v1 17/17] SUNRPC: change the back-channel queue to lwq Chuck Lever
  16 siblings, 0 replies; 29+ messages in thread
From: Chuck Lever @ 2023-09-11 14:40 UTC (permalink / raw)
  To: linux-nfs; +Cc: NeilBrown, Chuck Lever

From: NeilBrown <neilb@suse.de>

Using an atomic_t avoids the need to take a spinlock (which can soon be
removed).

Choosing a thread to kill needs to be careful as we cannot set the "die
now" bit atomically with the test on the count.  Instead we temporarily
increase the count.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/nfssvc.c           |   11 +++++------
 include/linux/sunrpc/svc.h |    2 +-
 net/sunrpc/svc.c           |   37 ++++++++++++++++++++-----------------
 3 files changed, 26 insertions(+), 24 deletions(-)

diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index 062f51fe4dfb..fafd2960dfaf 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -713,14 +713,13 @@ int nfsd_nrpools(struct net *net)
 
 int nfsd_get_nrthreads(int n, int *nthreads, struct net *net)
 {
-	int i = 0;
 	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+	struct svc_serv *serv = nn->nfsd_serv;
+	int i;
 
-	if (nn->nfsd_serv != NULL) {
-		for (i = 0; i < nn->nfsd_serv->sv_nrpools && i < n; i++)
-			nthreads[i] = nn->nfsd_serv->sv_pools[i].sp_nrthreads;
-	}
-
+	if (serv)
+		for (i = 0; i < serv->sv_nrpools && i < n; i++)
+			nthreads[i] = atomic_read(&serv->sv_pools[i].sp_nrthreads);
 	return 0;
 }
 
diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index 7ff9fe785e49..9d0fcd6148ae 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -36,7 +36,7 @@ struct svc_pool {
 	unsigned int		sp_id;	    	/* pool id; also node id on NUMA */
 	spinlock_t		sp_lock;	/* protects all fields */
 	struct lwq		sp_xprts;	/* pending transports */
-	unsigned int		sp_nrthreads;	/* # of threads in pool */
+	atomic_t		sp_nrthreads;	/* # of threads in pool */
 	struct list_head	sp_all_threads;	/* all server threads */
 	struct llist_head	sp_idle_threads; /* idle server threads */
 
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 244b5b9eba4d..0928d3f918b0 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -681,8 +681,8 @@ svc_prepare_thread(struct svc_serv *serv, struct svc_pool *pool, int node)
 	serv->sv_nrthreads += 1;
 	spin_unlock_bh(&serv->sv_lock);
 
+	atomic_inc(&pool->sp_nrthreads);
 	spin_lock_bh(&pool->sp_lock);
-	pool->sp_nrthreads++;
 	list_add_rcu(&rqstp->rq_all, &pool->sp_all_threads);
 	spin_unlock_bh(&pool->sp_lock);
 	return rqstp;
@@ -727,23 +727,24 @@ svc_pool_next(struct svc_serv *serv, struct svc_pool *pool, unsigned int *state)
 }
 
 static struct svc_pool *
-svc_pool_victim(struct svc_serv *serv, struct svc_pool *pool, unsigned int *state)
+svc_pool_victim(struct svc_serv *serv, struct svc_pool *target_pool,
+		unsigned int *state)
 {
+	struct svc_pool *pool;
 	unsigned int i;
 
+retry:
+	pool = target_pool;
+
 	if (pool != NULL) {
-		spin_lock_bh(&pool->sp_lock);
-		if (pool->sp_nrthreads)
+		if (atomic_inc_not_zero(&pool->sp_nrthreads))
 			goto found_pool;
-		spin_unlock_bh(&pool->sp_lock);
 		return NULL;
 	} else {
 		for (i = 0; i < serv->sv_nrpools; i++) {
 			pool = &serv->sv_pools[--(*state) % serv->sv_nrpools];
-			spin_lock_bh(&pool->sp_lock);
-			if (pool->sp_nrthreads)
+			if (atomic_inc_not_zero(&pool->sp_nrthreads))
 				goto found_pool;
-			spin_unlock_bh(&pool->sp_lock);
 		}
 		return NULL;
 	}
@@ -751,8 +752,12 @@ svc_pool_victim(struct svc_serv *serv, struct svc_pool *pool, unsigned int *stat
 found_pool:
 	set_bit(SP_VICTIM_REMAINS, &pool->sp_flags);
 	set_bit(SP_NEED_VICTIM, &pool->sp_flags);
-	spin_unlock_bh(&pool->sp_lock);
-	return pool;
+	if (!atomic_dec_and_test(&pool->sp_nrthreads))
+		return pool;
+	/* Nothing left in this pool any more */
+	clear_bit(SP_NEED_VICTIM, &pool->sp_flags);
+	clear_bit(SP_VICTIM_REMAINS, &pool->sp_flags);
+	goto retry;
 }
 
 static int
@@ -828,13 +833,10 @@ svc_stop_kthreads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
 int
 svc_set_num_threads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
 {
-	if (pool == NULL) {
+	if (!pool)
 		nrservs -= serv->sv_nrthreads;
-	} else {
-		spin_lock_bh(&pool->sp_lock);
-		nrservs -= pool->sp_nrthreads;
-		spin_unlock_bh(&pool->sp_lock);
-	}
+	else
+		nrservs -= atomic_read(&pool->sp_nrthreads);
 
 	if (nrservs > 0)
 		return svc_start_kthreads(serv, pool, nrservs);
@@ -921,10 +923,11 @@ svc_exit_thread(struct svc_rqst *rqstp)
 	struct svc_pool	*pool = rqstp->rq_pool;
 
 	spin_lock_bh(&pool->sp_lock);
-	pool->sp_nrthreads--;
 	list_del_rcu(&rqstp->rq_all);
 	spin_unlock_bh(&pool->sp_lock);
 
+	atomic_dec(&pool->sp_nrthreads);
+
 	spin_lock_bh(&serv->sv_lock);
 	serv->sv_nrthreads -= 1;
 	spin_unlock_bh(&serv->sv_lock);



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v1 16/17] SUNRPC: discard sp_lock
  2023-09-11 14:38 [PATCH v1 00/17] RPC service thread improvements Chuck Lever
                   ` (14 preceding siblings ...)
  2023-09-11 14:40 ` [PATCH v1 15/17] SUNRPC: change sp_nrthreads to atomic_t Chuck Lever
@ 2023-09-11 14:40 ` Chuck Lever
  2023-09-11 14:40 ` [PATCH v1 17/17] SUNRPC: change the back-channel queue to lwq Chuck Lever
  16 siblings, 0 replies; 29+ messages in thread
From: Chuck Lever @ 2023-09-11 14:40 UTC (permalink / raw)
  To: linux-nfs; +Cc: NeilBrown, Chuck Lever

From: NeilBrown <neilb@suse.de>

sp_lock is now only used to protect sp_all_threads.  This isn't needed
as sp_all_threads is only manipulated through svc_set_num_threads(),
which is already serialized.  Read access only requires rcu_read_lock().
So no more locking is needed.
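
For example (a sketch only, not part of this patch), a read-side walk of
the pool's thread list now needs nothing more than:

	struct svc_rqst *rqstp;

	rcu_read_lock();
	list_for_each_entry_rcu(rqstp, &pool->sp_all_threads, rq_all) {
		/* read-only inspection of each thread's svc_rqst */
	}
	rcu_read_unlock();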

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/sunrpc/svc.h |    1 -
 net/sunrpc/svc.c           |   10 +++++-----
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index 9d0fcd6148ae..8ce1392c1a35 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -34,7 +34,6 @@
  */
 struct svc_pool {
 	unsigned int		sp_id;	    	/* pool id; also node id on NUMA */
-	spinlock_t		sp_lock;	/* protects all fields */
 	struct lwq		sp_xprts;	/* pending transports */
 	atomic_t		sp_nrthreads;	/* # of threads in pool */
 	struct list_head	sp_all_threads;	/* all server threads */
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 0928d3f918b0..efe7a58ccbdc 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -511,7 +511,6 @@ __svc_create(struct svc_program *prog, unsigned int bufsize, int npools,
 		lwq_init(&pool->sp_xprts);
 		INIT_LIST_HEAD(&pool->sp_all_threads);
 		init_llist_head(&pool->sp_idle_threads);
-		spin_lock_init(&pool->sp_lock);
 
 		percpu_counter_init(&pool->sp_messages_arrived, 0, GFP_KERNEL);
 		percpu_counter_init(&pool->sp_sockets_queued, 0, GFP_KERNEL);
@@ -682,9 +681,12 @@ svc_prepare_thread(struct svc_serv *serv, struct svc_pool *pool, int node)
 	spin_unlock_bh(&serv->sv_lock);
 
 	atomic_inc(&pool->sp_nrthreads);
-	spin_lock_bh(&pool->sp_lock);
+
+	/* Protected by whatever lock the service uses when calling
+	 * svc_set_num_threads()
+	 */
 	list_add_rcu(&rqstp->rq_all, &pool->sp_all_threads);
-	spin_unlock_bh(&pool->sp_lock);
+
 	return rqstp;
 }
 
@@ -922,9 +924,7 @@ svc_exit_thread(struct svc_rqst *rqstp)
 	struct svc_serv	*serv = rqstp->rq_server;
 	struct svc_pool	*pool = rqstp->rq_pool;
 
-	spin_lock_bh(&pool->sp_lock);
 	list_del_rcu(&rqstp->rq_all);
-	spin_unlock_bh(&pool->sp_lock);
 
 	atomic_dec(&pool->sp_nrthreads);
 



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v1 17/17] SUNRPC: change the back-channel queue to lwq
  2023-09-11 14:38 [PATCH v1 00/17] RPC service thread improvements Chuck Lever
                   ` (15 preceding siblings ...)
  2023-09-11 14:40 ` [PATCH v1 16/17] SUNRPC: discard sp_lock Chuck Lever
@ 2023-09-11 14:40 ` Chuck Lever
  16 siblings, 0 replies; 29+ messages in thread
From: Chuck Lever @ 2023-09-11 14:40 UTC (permalink / raw)
  To: linux-nfs; +Cc: NeilBrown, Trond Myklebust, Anna Schumaker, Chuck Lever

From: NeilBrown <neilb@suse.de>

This removes the need to store and update back-links in the list.
It also removes the need for the _bh version of spin_lock().

Signed-off-by: NeilBrown <neilb@suse.de>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/sunrpc/svc.h        |    3 +--
 include/linux/sunrpc/xprt.h       |    3 ++-
 net/sunrpc/backchannel_rqst.c     |    5 +----
 net/sunrpc/svc.c                  |    3 +--
 net/sunrpc/svc_xprt.c             |   12 +++---------
 net/sunrpc/xprtrdma/backchannel.c |    4 +---
 6 files changed, 9 insertions(+), 21 deletions(-)

diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index 8ce1392c1a35..c1feaf0d1542 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -90,10 +90,9 @@ struct svc_serv {
 	int			(*sv_threadfn)(void *data);
 
 #if defined(CONFIG_SUNRPC_BACKCHANNEL)
-	struct list_head	sv_cb_list;	/* queue for callback requests
+	struct lwq		sv_cb_list;	/* queue for callback requests
 						 * that arrive over the same
 						 * connection */
-	spinlock_t		sv_cb_lock;	/* protects the svc_cb_list */
 	bool			sv_bc_enabled;	/* service uses backchannel */
 #endif /* CONFIG_SUNRPC_BACKCHANNEL */
 };
diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index 4ecc89301eb7..f85d3a0daca2 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -57,6 +57,7 @@ struct xprt_class;
 struct seq_file;
 struct svc_serv;
 struct net;
+#include <linux/lwq.h>
 
 /*
  * This describes a complete RPC request
@@ -121,7 +122,7 @@ struct rpc_rqst {
 	int			rq_ntrans;
 
 #if defined(CONFIG_SUNRPC_BACKCHANNEL)
-	struct list_head	rq_bc_list;	/* Callback service list */
+	struct lwq_node		rq_bc_list;	/* Callback service list */
 	unsigned long		rq_bc_pa_state;	/* Backchannel prealloc state */
 	struct list_head	rq_bc_pa_list;	/* Backchannel prealloc list */
 #endif /* CONFIG_SUNRPC_BACKCHANEL */
diff --git a/net/sunrpc/backchannel_rqst.c b/net/sunrpc/backchannel_rqst.c
index 44b7c89a635f..caa94cf57123 100644
--- a/net/sunrpc/backchannel_rqst.c
+++ b/net/sunrpc/backchannel_rqst.c
@@ -83,7 +83,6 @@ static struct rpc_rqst *xprt_alloc_bc_req(struct rpc_xprt *xprt)
 		return NULL;
 
 	req->rq_xprt = xprt;
-	INIT_LIST_HEAD(&req->rq_bc_list);
 
 	/* Preallocate one XDR receive buffer */
 	if (xprt_alloc_xdr_buf(&req->rq_rcv_buf, gfp_flags) < 0) {
@@ -367,8 +366,6 @@ void xprt_complete_bc_request(struct rpc_rqst *req, uint32_t copied)
 
 	dprintk("RPC:       add callback request to list\n");
 	xprt_get(xprt);
-	spin_lock(&bc_serv->sv_cb_lock);
-	list_add(&req->rq_bc_list, &bc_serv->sv_cb_list);
-	spin_unlock(&bc_serv->sv_cb_lock);
+	lwq_enqueue(&req->rq_bc_list, &bc_serv->sv_cb_list);
 	svc_pool_wake_idle_thread(&bc_serv->sv_pools[0]);
 }
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index efe7a58ccbdc..7d4c10e609dc 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -438,8 +438,7 @@ EXPORT_SYMBOL_GPL(svc_bind);
 static void
 __svc_init_bc(struct svc_serv *serv)
 {
-	INIT_LIST_HEAD(&serv->sv_cb_list);
-	spin_lock_init(&serv->sv_cb_lock);
+	lwq_init(&serv->sv_cb_list);
 }
 #else
 static void
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 28ca7db55da1..fee83d1024bc 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -705,7 +705,7 @@ svc_thread_should_sleep(struct svc_rqst *rqstp)
 
 #if defined(CONFIG_SUNRPC_BACKCHANNEL)
 	if (svc_is_backchannel(rqstp)) {
-		if (!list_empty(&rqstp->rq_server->sv_cb_list))
+		if (!lwq_empty(&rqstp->rq_server->sv_cb_list))
 			return false;
 	}
 #endif
@@ -878,18 +878,12 @@ void svc_recv(struct svc_rqst *rqstp)
 		struct svc_serv *serv = rqstp->rq_server;
 		struct rpc_rqst *req;
 
-		spin_lock_bh(&serv->sv_cb_lock);
-		req = list_first_entry_or_null(&serv->sv_cb_list,
-					       struct rpc_rqst, rq_bc_list);
+		req = lwq_dequeue(&serv->sv_cb_list,
+				  struct rpc_rqst, rq_bc_list);
 		if (req) {
-			list_del(&req->rq_bc_list);
-			spin_unlock_bh(&serv->sv_cb_lock);
 			svc_thread_wake_next(rqstp);
-
 			svc_process_bc(req, rqstp);
-			return;
 		}
-		spin_unlock_bh(&serv->sv_cb_lock);
 	}
 #endif
 }
diff --git a/net/sunrpc/xprtrdma/backchannel.c b/net/sunrpc/xprtrdma/backchannel.c
index bfc434ec52a7..8c817e755262 100644
--- a/net/sunrpc/xprtrdma/backchannel.c
+++ b/net/sunrpc/xprtrdma/backchannel.c
@@ -263,9 +263,7 @@ void rpcrdma_bc_receive_call(struct rpcrdma_xprt *r_xprt,
 	/* Queue rqst for ULP's callback service */
 	bc_serv = xprt->bc_serv;
 	xprt_get(xprt);
-	spin_lock(&bc_serv->sv_cb_lock);
-	list_add(&rqst->rq_bc_list, &bc_serv->sv_cb_list);
-	spin_unlock(&bc_serv->sv_cb_lock);
+	lwq_enqueue(&rqst->rq_bc_list, &bc_serv->sv_cb_list);
 
 	svc_pool_wake_idle_thread(&bc_serv->sv_pools[0]);
 



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v1 11/17] lib: add light-weight queuing mechanism.
  2023-09-11 14:39 ` [PATCH v1 11/17] lib: add light-weight queuing mechanism Chuck Lever
@ 2023-09-11 18:13   ` Andrew Morton
  2023-09-11 20:30     ` Chuck Lever III
  2023-09-12  2:37   ` [PATCH v1 11/17] " Kees Cook
  1 sibling, 1 reply; 29+ messages in thread
From: Andrew Morton @ 2023-09-11 18:13 UTC (permalink / raw)
  To: Chuck Lever
  Cc: linux-nfs, NeilBrown, Liam R. Howlett, Kees Cook,
	Greg Kroah-Hartman, David Gow, linux-kernel, Chuck Lever

On Mon, 11 Sep 2023 10:39:43 -0400 Chuck Lever <cel@kernel.org> wrote:

> lwq is a FIFO single-linked queue that only requires a spinlock
> for dequeueing, which happens in process context.  Enqueueing is atomic
> with no spinlock and can happen in any context.

What is the advantage of this over using one of the library
facilities which we already have?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v1 11/17] lib: add light-weight queuing mechanism.
  2023-09-11 18:13   ` Andrew Morton
@ 2023-09-11 20:30     ` Chuck Lever III
  2023-09-11 21:04       ` Andrew Morton
  0 siblings, 1 reply; 29+ messages in thread
From: Chuck Lever III @ 2023-09-11 20:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Chuck Lever, Linux NFS Mailing List, Neil Brown, Liam Howlett,
	Kees Cook, Greg Kroah-Hartman, David Gow, linux-kernel



> On Sep 11, 2023, at 2:13 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> On Mon, 11 Sep 2023 10:39:43 -0400 Chuck Lever <cel@kernel.org> wrote:
> 
>> lwq is a FIFO single-linked queue that only requires a spinlock
>> for dequeueing, which happens in process context.  Enqueueing is atomic
>> with no spinlock and can happen in any context.
> 
> What is the advantage of this over using one of the library
> facilities which we already have?

I'll let the patch author respond to that question, but let me pose
one of my own: What pre-existing facilities are you thinking of, so
that I may have a look?


--
Chuck Lever



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v1 11/17] lib: add light-weight queuing mechanism.
  2023-09-11 20:30     ` Chuck Lever III
@ 2023-09-11 21:04       ` Andrew Morton
  2023-09-11 23:19         ` NeilBrown
  0 siblings, 1 reply; 29+ messages in thread
From: Andrew Morton @ 2023-09-11 21:04 UTC (permalink / raw)
  To: Chuck Lever III
  Cc: Chuck Lever, Linux NFS Mailing List, Neil Brown, Liam Howlett,
	Kees Cook, Greg Kroah-Hartman, David Gow, linux-kernel

On Mon, 11 Sep 2023 20:30:40 +0000 Chuck Lever III <chuck.lever@oracle.com> wrote:

> 
> 
> > On Sep 11, 2023, at 2:13 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> > 
> > On Mon, 11 Sep 2023 10:39:43 -0400 Chuck Lever <cel@kernel.org> wrote:
> > 
> >> lwq is a FIFO single-linked queue that only requires a spinlock
> >> for dequeueing, which happens in process context.  Enqueueing is atomic
> >> with no spinlock and can happen in any context.
> > 
> > What is the advantage of this over using one of the library
> > facilities which we already have?
> 
> I'll let the patch author respond to that question, but let me pose
> one of my own: What pre-existing facilities are you thinking of, so
> that I may have a look?

Well, I assume that plain old list_heads could be recruited for this
requirement.  And I hope that a FIFO could be implemented using kfifo ;)

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v1 11/17] lib: add light-weight queuing mechanism.
  2023-09-11 21:04       ` Andrew Morton
@ 2023-09-11 23:19         ` NeilBrown
  2023-09-12  1:30           ` Andrew Morton
  0 siblings, 1 reply; 29+ messages in thread
From: NeilBrown @ 2023-09-11 23:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Chuck Lever III, Chuck Lever, Linux NFS Mailing List,
	Liam Howlett, Kees Cook, Greg Kroah-Hartman, David Gow,
	linux-kernel

On Tue, 12 Sep 2023, Andrew Morton wrote:
> On Mon, 11 Sep 2023 20:30:40 +0000 Chuck Lever III <chuck.lever@oracle.com> wrote:
> 
> > 
> > 
> > > On Sep 11, 2023, at 2:13 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> > > 
> > > On Mon, 11 Sep 2023 10:39:43 -0400 Chuck Lever <cel@kernel.org> wrote:
> > > 
> > >> lwq is a FIFO single-linked queue that only requires a spinlock
> > >> for dequeueing, which happens in process context.  Enqueueing is atomic
> > >> with no spinlock and can happen in any context.
> > > 
> > > What is the advantage of this over using one of the library
> > > facilities which we already have?
> > 
> > I'll let the patch author respond to that question, but let me pose
> > one of my own: What pre-existing facilities are you thinking of, so
> > that I may have a look?
> 
> Well, I assume that plain old list_heads could be recruited for this
> requirement.  And I hope that a FIFO could be implemented using kfifo ;)
> 

Plain old list_heads (which the code currently uses) require a spinlock
to be taken to insert something into the queue.  As this is usually in
bh context, it needs to be a spin_lock_bh().  My understanding is that
the real-time developers don't much like us disabling bh.  It isn't an
enormous win switching from a list_head list to a llist_node list, but
there are small gains such as object size reduction and less locking.  I
particularly wanted an easy-to-use library facility that could be
plugged into two different use cases in the sunrpc code and there
didn't seem to be one.  I could have written one using list_head, but
llist seemed a better fix.  I think the code in sunrpc that uses this
lwq looks a lot neater after the conversion.

I wasn't aware of kfifo.  Having now looked at it I don't think it would
be suitable.  It is designed to provide a fixed-size buffer for data
with cycling "head" and "tail" pointers - typically for data coming from
or to a device.  It doesn't provide any locking support so we would need
locking both to enqueue and to dequeue.  Thus it would be no better than
the list_head approach, and the different interface style would make it
harder to use (an "impedance mismatch"?).  A simple summary might be
that kfifo provides a buffer, not a queue.

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v1 11/17] lib: add light-weight queuing mechanism.
  2023-09-11 23:19         ` NeilBrown
@ 2023-09-12  1:30           ` Andrew Morton
  2023-09-14 14:05             ` Chuck Lever III
  2023-09-15  2:22             ` [PATCH 11/17 SQUASH and replace commit message] " NeilBrown
  0 siblings, 2 replies; 29+ messages in thread
From: Andrew Morton @ 2023-09-12  1:30 UTC (permalink / raw)
  To: NeilBrown
  Cc: Chuck Lever III, Chuck Lever, Linux NFS Mailing List,
	Liam Howlett, Kees Cook, Greg Kroah-Hartman, David Gow,
	linux-kernel

On Tue, 12 Sep 2023 09:19:59 +1000 "NeilBrown" <neilb@suse.de> wrote:

> Plain old list_heads (which the code currently uses) require a spinlock
> to be taken to insert something into the queue.  As this is usually in
> bh context, it needs to be a spin_lock_bh().  My understanding is that
> the real-time developers don't much like us disabling bh.  It isn't an
> enormous win switching from a list_head list to a llist_node list, but
> there are small gains such as object size reduction and less locking.  I
> particularly wanted an easy-to-use library facility that could be
> plugged in to two different uses cases in the sunrpc code and there
> didn't seem to be one.  I could have written one using list_head, but
> llist seemed a better fix.  I think the code in sunrpc that uses this
> lwq looks a lot neater after the conversion.

Thanks.  Could we please get words such as these into the changelog,
describing why it was felt necessary to add more library code?

And also into the .c file, to help people who are looking at it and
wondering "can I use this".  And to help reviewers who are wondering
"could they have used Neil's thing".

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v1 11/17] lib: add light-weight queuing mechanism.
  2023-09-11 14:39 ` [PATCH v1 11/17] lib: add light-weight queuing mechanism Chuck Lever
  2023-09-11 18:13   ` Andrew Morton
@ 2023-09-12  2:37   ` Kees Cook
  2023-09-12  4:08     ` NeilBrown
  1 sibling, 1 reply; 29+ messages in thread
From: Kees Cook @ 2023-09-12  2:37 UTC (permalink / raw)
  To: Chuck Lever, linux-nfs
  Cc: NeilBrown, Andrew Morton, Liam R. Howlett, Kees Cook,
	Greg Kroah-Hartman, David Gow, linux-kernel, Chuck Lever

On September 11, 2023 7:39:43 AM PDT, Chuck Lever <cel@kernel.org> wrote:
>From: NeilBrown <neilb@suse.de>
> [...]
> Does not use kunit framework.

Any reason why not? It seems like it'd be well suited to it...


-- 
Kees Cook

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v1 11/17] lib: add light-weight queuing mechanism.
  2023-09-12  2:37   ` [PATCH v1 11/17] " Kees Cook
@ 2023-09-12  4:08     ` NeilBrown
  0 siblings, 0 replies; 29+ messages in thread
From: NeilBrown @ 2023-09-12  4:08 UTC (permalink / raw)
  To: Kees Cook
  Cc: Chuck Lever, linux-nfs, Andrew Morton, Liam R. Howlett,
	Kees Cook, Greg Kroah-Hartman, David Gow, linux-kernel,
	Chuck Lever

On Tue, 12 Sep 2023, Kees Cook wrote:
> On September 11, 2023 7:39:43 AM PDT, Chuck Lever <cel@kernel.org> wrote:
> >From: NeilBrown <neilb@suse.de>
> > [...]
> > Does not use kunit framework.
> 
> Any reason why not? It seems like it'd be well suited to it...

A quick look didn't suggest that using Kunit would make my task any
easier.  It seemed to require a lot of boilerplate for little gain.
Maybe that was unfair - I didn't spend long looking.
Partly, I put that comment in the commit message so that if anyone wanted
to try to change my mind, there was a prompt for them to do it.

Also I had a quick look at other test code in lib/ and it all seemed to
use the "run some code at boot time" approach.

So if anyone can make a clear argument why using Kunit will help me, or
will help someone else, I'll consider it.  But I cannot see the
motivation without help.

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v1 11/17] lib: add light-weight queuing mechanism.
  2023-09-12  1:30           ` Andrew Morton
@ 2023-09-14 14:05             ` Chuck Lever III
  2023-09-15  1:51               ` NeilBrown
  2023-09-15  2:22             ` [PATCH 11/17 SQUASH and replace commit message] " NeilBrown
  1 sibling, 1 reply; 29+ messages in thread
From: Chuck Lever III @ 2023-09-14 14:05 UTC (permalink / raw)
  To: Neil Brown
  Cc: Chuck Lever, Linux NFS Mailing List, Liam Howlett, Kees Cook,
	Greg Kroah-Hartman, David Gow, linux-kernel, Andrew Morton



> On Sep 11, 2023, at 9:30 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> On Tue, 12 Sep 2023 09:19:59 +1000 "NeilBrown" <neilb@suse.de> wrote:
> 
>> Plain old list_heads (which the code currently uses) require a spinlock
>> to be taken to insert something into the queue.  As this is usually in
>> bh context, it needs to be a spin_lock_bh().  My understanding is that
>> the real-time developers don't much like us disabling bh.  It isn't an
>> enormous win switching from a list_head list to a llist_node list, but
>> there are small gains such as object size reduction and less locking.  I
>> particularly wanted an easy-to-use library facility that could be
>> plugged into two different use cases in the sunrpc code and there
>> didn't seem to be one.  I could have written one using list_head, but
>> llist seemed a better fix.  I think the code in sunrpc that uses this
>> lwq looks a lot neater after the conversion.
> 
> Thanks.  Could we please get words such as these into the changelog,
> describing why it was felt necessary to add more library code?
> 
> And also into the .c file, to help people who are looking at it and
> wondering "can I use this".  And to help reviewers who are wondering
> "could they have used Neil's thing".

Neil, are you planning to send along a replacement for 11/17,
or would you like me to fold the above into the patch description
I have now?


--
Chuck Lever



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v1 11/17] lib: add light-weight queuing mechanism.
  2023-09-14 14:05             ` Chuck Lever III
@ 2023-09-15  1:51               ` NeilBrown
  0 siblings, 0 replies; 29+ messages in thread
From: NeilBrown @ 2023-09-15  1:51 UTC (permalink / raw)
  To: Chuck Lever III
  Cc: Chuck Lever, Linux NFS Mailing List, Liam Howlett, Kees Cook,
	Greg Kroah-Hartman, David Gow, linux-kernel, Andrew Morton

On Fri, 15 Sep 2023, Chuck Lever III wrote:
> 
> > On Sep 11, 2023, at 9:30 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> > 
> > On Tue, 12 Sep 2023 09:19:59 +1000 "NeilBrown" <neilb@suse.de> wrote:
> > 
> >> Plain old list_heads (which the code currently uses) require a spinlock
> >> to be taken to insert something into the queue.  As this is usually in
> >> bh context, it needs to be a spin_lock_bh().  My understanding is that
> >> the real-time developers don't much like us disabling bh.  It isn't an
> >> enormous win switching from a list_head list to a llist_node list, but
> >> there are small gains such as object size reduction and less locking.  I
> >> particularly wanted an easy-to-use library facility that could be
> >> plugged into two different use cases in the sunrpc code and there
> >> didn't seem to be one.  I could have written one using list_head, but
> >> llist seemed a better fix.  I think the code in sunrpc that uses this
> >> lwq looks a lot neater after the conversion.
> > 
> > Thanks.  Could we please get words such as these into the changelog,
> > describing why it was felt necessary to add more library code?
> > 
> > And also into the .c file, to help people who are looking at it and
> > wondering "can I use this".  And to help reviewers who are wondering
> > "could they have used Neil's thing".
> 
> Neil, are you planning to send along a replacement for 11/17,
> or would you like me to fold the above into the patch description
> I have now?

Sorry I didn't reply sooner - been busy.
I'll send a patch that can be squashed into 11/17 and adds some more
explanatory text.  Hopefully soonish.

NeilBrown

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 11/17 SQUASH and replace commit message] lib: add light-weight queuing mechanism.
  2023-09-12  1:30           ` Andrew Morton
  2023-09-14 14:05             ` Chuck Lever III
@ 2023-09-15  2:22             ` NeilBrown
  2023-09-15 15:54               ` Chuck Lever III
  1 sibling, 1 reply; 29+ messages in thread
From: NeilBrown @ 2023-09-15  2:22 UTC (permalink / raw)
  To: Andrew Morton, Chuck Lever
  Cc: Linux NFS Mailing List, Liam Howlett, Kees Cook,
	Greg Kroah-Hartman, David Gow, linux-kernel


lwq is a FIFO singly-linked queue that only requires a spinlock
for dequeueing, which happens in process context.  Enqueueing is atomic
with no spinlock and can happen in any context.

This is particularly useful when work items are queued from BH or IRQ
context, and when they are handled one at a time by dedicated threads.

Avoiding any locking when enqueueing means there is no need to disable
BH or interrupts, which is generally best avoided (particularly when
there are any RT tasks on the machine).

This solution is superior to using "list_head" links because we need
half as many pointers in the data structures, and because list_head
lists would need locking to add items to the queue.

This solution is superior to a bespoke solution as all locking and
container_of casting is integrated, so the interface is simple.

Despite the similar name, this solution meets a distinctly different
need than kfifo.  kfifo provides a fixed-size circular buffer to which
data can be added at one end and removed at the other, and does not
provide any locking.  lwq does not have any size limit and works with
data structures (objects) rather than data (bytes).

A unit test for basic functionality, which runs at boot time, is included.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 include/linux/lwq.h | 4 ++++
 lib/lwq.c           | 4 ++++
 2 files changed, 8 insertions(+)

diff --git a/include/linux/lwq.h b/include/linux/lwq.h
index 52b9c81b493a..c4148fe1cf72 100644
--- a/include/linux/lwq.h
+++ b/include/linux/lwq.h
@@ -7,6 +7,10 @@
  *
  * Entries can be enqueued from any context with no locking.
  * Entries can be dequeued from process context with integrated locking.
+ *
+ * This is particularly suitable when work items are queued in
+ * BH or IRQ context, and where work items are handled one at a time
+ * by dedicated threads.
  */
 #include <linux/container_of.h>
 #include <linux/spinlock.h>
diff --git a/lib/lwq.c b/lib/lwq.c
index 7fe6c7125357..eb8324225309 100644
--- a/lib/lwq.c
+++ b/lib/lwq.c
@@ -8,6 +8,10 @@
  * Entries are dequeued using a spinlock to protect against
  * multiple access.  The llist is staged in reverse order, and refreshed
  * from the llist when it exhausts.
+ * 
+ * This is particularly suitable when work items are queued in
+ * BH or IRQ context, and where work items are handled one at a time
+ * by dedicated threads.
  */
 #include <linux/rcupdate.h>
 #include <linux/lwq.h>
-- 
2.42.0
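
To make the intended usage concrete, here is a hedged sketch of a
producer/consumer pair built on the queue described above.  The
lwq_init()/lwq_enqueue()/lwq_dequeue() names are the ones this series
adds, but the exact signatures, the return-value convention of
lwq_enqueue(), and all of the surrounding names (work_item,
worker_task, work_queue) are assumptions made for illustration, not
text quoted from the patch.

#include <linux/lwq.h>
#include <linux/sched.h>
#include <linux/slab.h>

struct work_item {
	struct lwq_node	q_node;		/* link owned by the lwq */
	int		payload;	/* illustrative data */
};

static struct lwq work_queue;		/* lwq_init(&work_queue) at setup time */
static struct task_struct *worker_task;	/* the dedicated consumer thread */

/* Producer side: callable from BH or IRQ context - no lock, no BH disabling. */
static void produce(struct work_item *item)
{
	/* assumed: lwq_enqueue() reports whether the queue was empty,
	 * so the caller knows a sleeping consumer needs a wake-up */
	if (lwq_enqueue(&item->q_node, &work_queue))
		wake_up_process(worker_task);
}

/* Consumer side: one dedicated thread in process context; only the
 * dequeue path takes the queue's single spinlock. */
static void consume(void)
{
	struct work_item *item;

	while ((item = lwq_dequeue(&work_queue, struct work_item, q_node)))
		kfree(item);	/* process, then release the item */
}

Per the comment added to lib/lwq.c above, the dequeue side refills a
staging list from the llist under the spinlock and reverses it, which
is how FIFO order is preserved even though llist insertion itself is
LIFO.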


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 11/17 SQUASH and replace commit message] lib: add light-weight queuing mechanism.
  2023-09-15  2:22             ` [PATCH 11/17 SQUASH and replace commit message] " NeilBrown
@ 2023-09-15 15:54               ` Chuck Lever III
  0 siblings, 0 replies; 29+ messages in thread
From: Chuck Lever III @ 2023-09-15 15:54 UTC (permalink / raw)
  To: Neil Brown
  Cc: Andrew Morton, Linux NFS Mailing List, Liam Howlett, Kees Cook,
	Greg Kroah-Hartman, David Gow, linux-kernel



> On Sep 14, 2023, at 10:22 PM, NeilBrown <neilb@suse.de> wrote:
> 
> 
> lwq is a FIFO singly-linked queue that only requires a spinlock
> for dequeueing, which happens in process context.  Enqueueing is atomic
> with no spinlock and can happen in any context.
> 
> This is particularly useful when work items are queued from BH or IRQ
> context, and when they are handled one at a time by dedicated threads.
> 
> Avoiding any locking when enqueueing means there is no need to disable
> BH or interrupts, which is generally best avoided (particularly when
> there are any RT tasks on the machine).
> 
> This solution is superior to using "list_head" links because we need
> half as many pointers in the data structures, and because list_head
> lists would need locking to add items to the queue.
> 
> This solution is superior to a bespoke solution as all locking and
> container_of casting is integrated, so the interface is simple.
> 
> Despite the similar name, this solution meets a distinctly different
> need than kfifo.  kfifo provides a fixed-size circular buffer to which
> data can be added at one end and removed at the other, and does not
> provide any locking.  lwq does not have any size limit and works with
> data structures (objects) rather than data (bytes).
> 
> A unit test for basic functionality, which runs at boot time, is included.
> 
> Signed-off-by: NeilBrown <neilb@suse.de>

Thank you for all of this work, Neil. The metrics in the
cover letter speak for themselves. I've applied the series
to nfsd-next and squashed this patch in.


> ---
> include/linux/lwq.h | 4 ++++
> lib/lwq.c           | 4 ++++
> 2 files changed, 8 insertions(+)
> 
> diff --git a/include/linux/lwq.h b/include/linux/lwq.h
> index 52b9c81b493a..c4148fe1cf72 100644
> --- a/include/linux/lwq.h
> +++ b/include/linux/lwq.h
> @@ -7,6 +7,10 @@
>  *
>  * Entries can be enqueued from any context with no locking.
>  * Entries can be dequeued from process context with integrated locking.
> + *
> + * This is particularly suitable when work items are queued in
> + * BH or IRQ context, and where work items are handled one at a time
> + * by dedicated threads.
>  */
> #include <linux/container_of.h>
> #include <linux/spinlock.h>
> diff --git a/lib/lwq.c b/lib/lwq.c
> index 7fe6c7125357..eb8324225309 100644
> --- a/lib/lwq.c
> +++ b/lib/lwq.c
> @@ -8,6 +8,10 @@
>  * Entries are dequeued using a spinlock to protect against
>  * multiple access.  The llist is staged in reverse order, and refreshed
>  * from the llist when it exhausts.
> + * 
> + * This is particularly suitable when work items are queued in
> + * BH or IRQ context, and where work items are handled one at a time
> + * by dedicated threads.
>  */
> #include <linux/rcupdate.h>
> #include <linux/lwq.h>
> -- 
> 2.42.0
> 

--
Chuck Lever



^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2023-09-15 15:55 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-09-11 14:38 [PATCH v1 00/17] RPC service thread improvements Chuck Lever
2023-09-11 14:38 ` [PATCH v1 01/17] SUNRPC: move all of xprt handling into svc_xprt_handle() Chuck Lever
2023-09-11 14:38 ` [PATCH v1 02/17] SUNRPC: rename and refactor svc_get_next_xprt() Chuck Lever
2023-09-11 14:38 ` [PATCH v1 03/17] SUNRPC: Clean up bc_svc_process() Chuck Lever
2023-09-11 14:38 ` [PATCH v1 04/17] SUNRPC: integrate back-channel processing with svc_recv() Chuck Lever
2023-09-11 14:39 ` [PATCH v1 05/17] SUNRPC: change how svc threads are asked to exit Chuck Lever
2023-09-11 14:39 ` [PATCH v1 06/17] SUNRPC: add list of idle threads Chuck Lever
2023-09-11 14:39 ` [PATCH v1 07/17] SUNRPC: discard SP_CONGESTED Chuck Lever
2023-09-11 14:39 ` [PATCH v1 08/17] llist: add interface to check if a node is on a list Chuck Lever
2023-09-11 14:39 ` [PATCH v1 09/17] SUNRPC: change service idle list to be an llist Chuck Lever
2023-09-11 14:39 ` [PATCH v1 10/17] llist: add llist_del_first_this() Chuck Lever
2023-09-11 14:39 ` [PATCH v1 11/17] lib: add light-weight queuing mechanism Chuck Lever
2023-09-11 18:13   ` Andrew Morton
2023-09-11 20:30     ` Chuck Lever III
2023-09-11 21:04       ` Andrew Morton
2023-09-11 23:19         ` NeilBrown
2023-09-12  1:30           ` Andrew Morton
2023-09-14 14:05             ` Chuck Lever III
2023-09-15  1:51               ` NeilBrown
2023-09-15  2:22             ` [PATCH 11/17 SQUASH and replace commit message] " NeilBrown
2023-09-15 15:54               ` Chuck Lever III
2023-09-12  2:37   ` [PATCH v1 11/17] " Kees Cook
2023-09-12  4:08     ` NeilBrown
2023-09-11 14:39 ` [PATCH v1 12/17] SUNRPC: rename some functions from rqst_ to svc_thread_ Chuck Lever
2023-09-11 14:39 ` [PATCH v1 13/17] SUNRPC: only have one thread waking up at a time Chuck Lever
2023-09-11 14:40 ` [PATCH v1 14/17] SUNRPC: use lwq for sp_sockets - renamed to sp_xprts Chuck Lever
2023-09-11 14:40 ` [PATCH v1 15/17] SUNRPC: change sp_nrthreads to atomic_t Chuck Lever
2023-09-11 14:40 ` [PATCH v1 16/17] SUNRPC: discard sp_lock Chuck Lever
2023-09-11 14:40 ` [PATCH v1 17/17] SUNRPC: change the back-channel queue to lwq Chuck Lever
