[PATCH RFC] Frequent connection loss using krb5[ip] NFS mounts

* [PATCH RFC] Frequent connection loss using krb5[ip] NFS mounts
@ 2020-05-01 19:04 Chuck Lever
  2020-05-01 19:04 ` [PATCH RFC] SUNRPC: crypto calls should never schedule Chuck Lever
  2020-05-04 19:07 ` [PATCH RFC] Frequent connection loss using krb5[ip] NFS mounts J. Bruce Fields
  0 siblings, 2 replies; 4+ messages in thread
From: Chuck Lever @ 2020-05-01 19:04 UTC (permalink / raw)
  To: linux-nfs, linux-crypto

Hi-

Over the past year or so I've observed that using sec=krb5i or
sec=krb5p with NFS/RDMA while testing my Linux NFS client and server
results in frequent connection loss. I've been able to pin this down
a bit.

The problem is that SUNRPC's calls into the kernel crypto functions
can sometimes reschedule. If that happens between the time the
server receives an RPC Call and the time it has unwrapped the Call's
GSS sequence number, the number is very likely to have fallen
outside the GSS context's sequence number window. When that happens,
the server is required to drop the Call, and therefore also the
connection.

I've tested this observation by applying the following naive patch to
both my client and server, and got the following results.

I. Is this an effective fix?

With sec=krb5p,proto=rdma, I ran a 12-thread git regression test
(cd git/; make -j12 test).

Without this patch on the server, over 3,000 connection loss events
are observed. With it, the test runs on a single connection. Clearly
the server needs to have some kind of mitigation in this area.

II. Does the fix cause a performance regression?

Because this patch affects both the client and server paths, I
tested the performance differences between applying the patch in
various combinations and with both sec=krb5 (which checksums just
the RPC message header) and krb5i (which checksums the whole RPC
message.

fio 8KiB 70% read, 30% write for 30 seconds, NFSv3 on RDMA.

-- krb5 --

unpatched client and server:
Connect count: 3
read: IOPS=84.3k, 50.00th=[ 1467], 99.99th=[10028] 
write: IOPS=36.1k, 50.00th=[ 1565], 99.99th=[20579]

patched client, unpatched server:
Connect count: 2
read: IOPS=75.4k, 50.00th=[ 1647], 99.99th=[ 7111]
write: IOPS=32.3k, 50.00th=[ 1745], 99.99th=[ 7439]

unpatched client, patched server:
Connect count: 1
read: IOPS=84.1k, 50.00th=[ 1467], 99.99th=[ 8717]
write: IOPS=36.1k, 50.00th=[ 1582], 99.99th=[ 9241]

patched client and server:
Connect count: 1
read: IOPS=74.9k, 50.00th=[ 1663], 99.99th=[ 7046]
write: IOPS=31.0k, 50.00th=[ 1762], 99.99th=[ 7242]

-- krb5i --

unpatched client and server:
Connect count: 6
read: IOPS=35.8k, 50.00th=[ 3228], 99.99th=[49546]
write: IOPS=15.4k, 50.00th=[ 3294], 99.99th=[50594]

patched client, unpatched server:
Connect count: 5
read: IOPS=36.3k, 50.00th=[ 3228], 99.99th=[14877]
write: IOPS=15.5k, 50.00th=[ 3294], 99.99th=[15139]

unpatched client, patched server:
Connect count: 3
read: IOPS=35.7k, 50.00th=[ 3228], 99.99th=[15926]
write: IOPS=15.2k, 50.00th=[ 3294], 99.99th=[15926]

patched client and server:
Connect count: 3
read: IOPS=36.3k, 50.00th=[ 3195], 99.99th=[15139]
write: IOPS=15.5k, 50.00th=[ 3261], 99.99th=[15270]

The good news:
Both results show that I/O tail latency improves significantly when
either the client or the server has this patch applied.

The bad news:
The krb5 performance result shows an IOPS regression when the client
has this patch applied.

So now I'm trying to understand how to come up with a solution that
prevents the rescheduling/connection loss issue without also
causing a performance regression.

Any thoughts/comments/advice appreciated.

---

Chuck Lever (1):
      SUNRPC: crypto calls should never schedule

--
Chuck Lever

^ permalink raw reply	[flat|nested] 4+ messages in thread