kernel-tls-handshake.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
* kernel oops
@ 2023-03-31 14:23 Olga Kornievskaia
  2023-03-31 14:25 ` Chuck Lever III
  0 siblings, 1 reply; 12+ messages in thread
From: Olga Kornievskaia @ 2023-03-31 14:23 UTC (permalink / raw)
  To: Chuck Lever; +Cc: kernel-tls-handshake

Hi Chuck,

Have you seen this oops? This is not my testing so all I know was that
it was an attempt to mount with xprtsec=tls (against ONTAP).

[  316.939747] BUG: kernel NULL pointer dereference, address: 0000000000000018
[  316.939848] #PF: supervisor read access in kernel mode
[  316.939914] #PF: error_code(0x0000) - not-present page
[  316.939984] PGD 0 P4D 0
[  316.940041] Oops: 0000 [#1] PREEMPT SMP PTI
[  316.940109] CPU: 0 PID: 511 Comm: kworker/u2:30 Kdump: loaded Not
tainted 6.2.1+ #2
[  316.940181] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
[  316.940259] Workqueue: xprtiod xs_tls_connect [sunrpc]
[  316.940463] RIP: 0010:xs_tls_connect+0x3e1/0x590 [sunrpc]
[  316.940556] Code: ff ff ff 48 2b 83 f8 fe ff ff 48 89 83 00 ff ff
ff e8 03 d3 ff ff e9 a8 fd ff ff 49 8b 97 f8 05 00 00 66 83 bb e0 f9
ff ff 0a <48> 8b 6a 18 0f 84 fd 00 00 00 48 8d 72 18 4c 89 e7 48 89 14
24 e8
[  316.940719] RSP: 0018:ffffa69d40f5fdc8 EFLAGS: 00010293
[  316.940786] RAX: 0000000000000000 RBX: ffff9b0c46333e40 RCX: 0000000000000000
[  316.940872] RDX: 0000000000000000 RSI: ffffa69d40f5fd20 RDI: 0000000000000000
[  316.940940] RBP: ffff9b0c46350200 R08: ffff9b0cf7e31968 R09: 0000000000000159
[  316.941008] R10: 0000000000000001 R11: 00000049c8887a40 R12: ffff9b0c46333800
[  316.941075] R13: 0000000004208160 R14: ffff9b0c44e18a00 R15: ffff9b0c42dab800
[  316.941159] FS:  0000000000000000(0000) GS:ffff9b0cf7e00000(0000)
knlGS:0000000000000000
[  316.941243] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  316.941309] CR2: 0000000000000018 CR3: 0000000102e48004 CR4: 00000000007706f0
[  316.941391] PKRU: 55555554
[  316.941443] Call Trace:
[  316.941513]  <TASK>
[  316.941578]  process_one_work+0x1b0/0x3c0
[  316.941689]  worker_thread+0x30/0x360
[  316.941755]  ? __pfx_worker_thread+0x10/0x10
[  316.941822]  kthread+0xd7/0x100
[  316.941900]  ? __pfx_kthread+0x10/0x10
[  316.941968]  ret_from_fork+0x29/0x50
[  316.942049]  </TASK>
[  316.942099] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
dns_resolver ib_core nfsv3 nfs_acl nfs lockd grace fscache netfs ext4
rfkill mbcache jbd2 loop vsock_loopback
vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock
sunrpc tls vfat fat dm_multipath intel_rapl_msr intel_rapl_common
isst_if_mbox_msr isst_if_common nfit libnvdimm crct10dif_pclmul
crc32_pclmul ghash_clmulni_intel rapl vmw_balloon joydev pcspkr
i2c_piix4 vmw_vmci xfs libcrc32c vmwgfx drm_ttm_helper ttm sr_mod
cdrom drm_kms_helper syscopyarea sd_mod sysfillrect ata_generic
sysimgblt sg ahci libahci ata_piix crc32c_intel serio_raw drm vmxnet3
vmw_pvscsi libata dm_mirror dm_region_hash dm_log dm_mod nvme_tcp
nvme_fabrics nvme_core t10_pi crc64_rocksoft crc64 ipmi_devintf
ipmi_msghandler fuse
[  316.942550] CR2: 0000000000000018

Please note this was still on "commit
caa52379f0046e974fd16585dd7ede624699d8e2 (HEAD -> 03152023-v7,
origin/upcall-v7)". If you think somethings done to address it I'll
have them update to the latest.

Thank you.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: kernel oops
  2023-03-31 14:23 kernel oops Olga Kornievskaia
@ 2023-03-31 14:25 ` Chuck Lever III
  2023-04-04 14:24   ` Olga Kornievskaia
  0 siblings, 1 reply; 12+ messages in thread
From: Chuck Lever III @ 2023-03-31 14:25 UTC (permalink / raw)
  To: Olga Kornievskaia; +Cc: kernel-tls-handshake



> On Mar 31, 2023, at 10:23 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
> 
> Hi Chuck,
> 
> Have you seen this oops? This is not my testing so all I know was that
> it was an attempt to mount with xprtsec=tls (against ONTAP).
> 
> [  316.939747] BUG: kernel NULL pointer dereference, address: 0000000000000018
> [  316.939848] #PF: supervisor read access in kernel mode
> [  316.939914] #PF: error_code(0x0000) - not-present page
> [  316.939984] PGD 0 P4D 0
> [  316.940041] Oops: 0000 [#1] PREEMPT SMP PTI
> [  316.940109] CPU: 0 PID: 511 Comm: kworker/u2:30 Kdump: loaded Not
> tainted 6.2.1+ #2
> [  316.940181] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
> [  316.940259] Workqueue: xprtiod xs_tls_connect [sunrpc]
> [  316.940463] RIP: 0010:xs_tls_connect+0x3e1/0x590 [sunrpc]
> [  316.940556] Code: ff ff ff 48 2b 83 f8 fe ff ff 48 89 83 00 ff ff
> ff e8 03 d3 ff ff e9 a8 fd ff ff 49 8b 97 f8 05 00 00 66 83 bb e0 f9
> ff ff 0a <48> 8b 6a 18 0f 84 fd 00 00 00 48 8d 72 18 4c 89 e7 48 89 14
> 24 e8
> [  316.940719] RSP: 0018:ffffa69d40f5fdc8 EFLAGS: 00010293
> [  316.940786] RAX: 0000000000000000 RBX: ffff9b0c46333e40 RCX: 0000000000000000
> [  316.940872] RDX: 0000000000000000 RSI: ffffa69d40f5fd20 RDI: 0000000000000000
> [  316.940940] RBP: ffff9b0c46350200 R08: ffff9b0cf7e31968 R09: 0000000000000159
> [  316.941008] R10: 0000000000000001 R11: 00000049c8887a40 R12: ffff9b0c46333800
> [  316.941075] R13: 0000000004208160 R14: ffff9b0c44e18a00 R15: ffff9b0c42dab800
> [  316.941159] FS:  0000000000000000(0000) GS:ffff9b0cf7e00000(0000)
> knlGS:0000000000000000
> [  316.941243] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  316.941309] CR2: 0000000000000018 CR3: 0000000102e48004 CR4: 00000000007706f0
> [  316.941391] PKRU: 55555554
> [  316.941443] Call Trace:
> [  316.941513]  <TASK>
> [  316.941578]  process_one_work+0x1b0/0x3c0
> [  316.941689]  worker_thread+0x30/0x360
> [  316.941755]  ? __pfx_worker_thread+0x10/0x10
> [  316.941822]  kthread+0xd7/0x100
> [  316.941900]  ? __pfx_kthread+0x10/0x10
> [  316.941968]  ret_from_fork+0x29/0x50
> [  316.942049]  </TASK>
> [  316.942099] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
> dns_resolver ib_core nfsv3 nfs_acl nfs lockd grace fscache netfs ext4
> rfkill mbcache jbd2 loop vsock_loopback
> vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock
> sunrpc tls vfat fat dm_multipath intel_rapl_msr intel_rapl_common
> isst_if_mbox_msr isst_if_common nfit libnvdimm crct10dif_pclmul
> crc32_pclmul ghash_clmulni_intel rapl vmw_balloon joydev pcspkr
> i2c_piix4 vmw_vmci xfs libcrc32c vmwgfx drm_ttm_helper ttm sr_mod
> cdrom drm_kms_helper syscopyarea sd_mod sysfillrect ata_generic
> sysimgblt sg ahci libahci ata_piix crc32c_intel serio_raw drm vmxnet3
> vmw_pvscsi libata dm_mirror dm_region_hash dm_log dm_mod nvme_tcp
> nvme_fabrics nvme_core t10_pi crc64_rocksoft crc64 ipmi_devintf
> ipmi_msghandler fuse
> [  316.942550] CR2: 0000000000000018
> 
> Please note this was still on "commit
> caa52379f0046e974fd16585dd7ede624699d8e2 (HEAD -> 03152023-v7,
> origin/upcall-v7)". If you think somethings done to address it I'll
> have them update to the latest.

Update and retry. It has a familiar ring, but I can't say I've
seen this recently.


--
Chuck Lever



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: kernel oops
  2023-03-31 14:25 ` Chuck Lever III
@ 2023-04-04 14:24   ` Olga Kornievskaia
  2023-04-04 14:31     ` Chuck Lever III
  0 siblings, 1 reply; 12+ messages in thread
From: Olga Kornievskaia @ 2023-04-04 14:24 UTC (permalink / raw)
  To: Chuck Lever III; +Cc: kernel-tls-handshake

On Fri, Mar 31, 2023 at 10:25 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
>
>
>
> > On Mar 31, 2023, at 10:23 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
> >
> > Hi Chuck,
> >
> > Have you seen this oops? This is not my testing so all I know was that
> > it was an attempt to mount with xprtsec=tls (against ONTAP).
> >
> > [  316.939747] BUG: kernel NULL pointer dereference, address: 0000000000000018
> > [  316.939848] #PF: supervisor read access in kernel mode
> > [  316.939914] #PF: error_code(0x0000) - not-present page
> > [  316.939984] PGD 0 P4D 0
> > [  316.940041] Oops: 0000 [#1] PREEMPT SMP PTI
> > [  316.940109] CPU: 0 PID: 511 Comm: kworker/u2:30 Kdump: loaded Not
> > tainted 6.2.1+ #2
> > [  316.940181] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
> > Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
> > [  316.940259] Workqueue: xprtiod xs_tls_connect [sunrpc]
> > [  316.940463] RIP: 0010:xs_tls_connect+0x3e1/0x590 [sunrpc]
> > [  316.940556] Code: ff ff ff 48 2b 83 f8 fe ff ff 48 89 83 00 ff ff
> > ff e8 03 d3 ff ff e9 a8 fd ff ff 49 8b 97 f8 05 00 00 66 83 bb e0 f9
> > ff ff 0a <48> 8b 6a 18 0f 84 fd 00 00 00 48 8d 72 18 4c 89 e7 48 89 14
> > 24 e8
> > [  316.940719] RSP: 0018:ffffa69d40f5fdc8 EFLAGS: 00010293
> > [  316.940786] RAX: 0000000000000000 RBX: ffff9b0c46333e40 RCX: 0000000000000000
> > [  316.940872] RDX: 0000000000000000 RSI: ffffa69d40f5fd20 RDI: 0000000000000000
> > [  316.940940] RBP: ffff9b0c46350200 R08: ffff9b0cf7e31968 R09: 0000000000000159
> > [  316.941008] R10: 0000000000000001 R11: 00000049c8887a40 R12: ffff9b0c46333800
> > [  316.941075] R13: 0000000004208160 R14: ffff9b0c44e18a00 R15: ffff9b0c42dab800
> > [  316.941159] FS:  0000000000000000(0000) GS:ffff9b0cf7e00000(0000)
> > knlGS:0000000000000000
> > [  316.941243] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  316.941309] CR2: 0000000000000018 CR3: 0000000102e48004 CR4: 00000000007706f0
> > [  316.941391] PKRU: 55555554
> > [  316.941443] Call Trace:
> > [  316.941513]  <TASK>
> > [  316.941578]  process_one_work+0x1b0/0x3c0
> > [  316.941689]  worker_thread+0x30/0x360
> > [  316.941755]  ? __pfx_worker_thread+0x10/0x10
> > [  316.941822]  kthread+0xd7/0x100
> > [  316.941900]  ? __pfx_kthread+0x10/0x10
> > [  316.941968]  ret_from_fork+0x29/0x50
> > [  316.942049]  </TASK>
> > [  316.942099] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
> > dns_resolver ib_core nfsv3 nfs_acl nfs lockd grace fscache netfs ext4
> > rfkill mbcache jbd2 loop vsock_loopback
> > vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock
> > sunrpc tls vfat fat dm_multipath intel_rapl_msr intel_rapl_common
> > isst_if_mbox_msr isst_if_common nfit libnvdimm crct10dif_pclmul
> > crc32_pclmul ghash_clmulni_intel rapl vmw_balloon joydev pcspkr
> > i2c_piix4 vmw_vmci xfs libcrc32c vmwgfx drm_ttm_helper ttm sr_mod
> > cdrom drm_kms_helper syscopyarea sd_mod sysfillrect ata_generic
> > sysimgblt sg ahci libahci ata_piix crc32c_intel serio_raw drm vmxnet3
> > vmw_pvscsi libata dm_mirror dm_region_hash dm_log dm_mod nvme_tcp
> > nvme_fabrics nvme_core t10_pi crc64_rocksoft crc64 ipmi_devintf
> > ipmi_msghandler fuse
> > [  316.942550] CR2: 0000000000000018
> >
> > Please note this was still on "commit
> > caa52379f0046e974fd16585dd7ede624699d8e2 (HEAD -> 03152023-v7,
> > origin/upcall-v7)". If you think somethings done to address it I'll
> > have them update to the latest.
>
> Update and retry. It has a familiar ring, but I can't say I've
> seen this recently.

Hi Chuck,

I've updated the kernel code to the latest  commit
46588cade103b446f74a7ee160a83cd2a80399e7 (HEAD ->
chuck-rpc-with-tls-04032023, chuck/topic-rpc-with-tls-upcall) but the
client is still crashing but looks like in a different place. This
happens when the server terminates the connection (immediately) as the
tls handshake happens.

Here's the stack trace:
[  677.274299] BUG: kernel NULL pointer dereference, address: 0000000000000030
[  677.274424] #PF: supervisor read access in kernel mode
[  677.274517] #PF: error_code(0x0000) - not-present page
[  677.274609] PGD 0 P4D 0
[  677.274689] Oops: 0000 [#1] PREEMPT SMP PTI
[  677.274785] CPU: 0 PID: 518 Comm: kworker/u2:29 Kdump: loaded Not
tainted 6.3.0-rc4+ #13
[  677.274881] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
[  677.274981] Workqueue: xprtiod xs_tls_connect [sunrpc]
[  677.275137] RIP: 0010:handshake_req_cancel+0x2b/0x3b0
[  677.275269] Code: 0f 1e fa 0f 1f 44 00 00 41 57 41 56 41 55 41 54
55 53 48 83 ec 28 48 8b 6f 18 65 48 8b 04 25 28 00 00 00 48 89 44 24
20 31 c0 <48> 8b 5d 30 48 89 6c 24 18 e8 97 15 5f ff 4c 8b 25 50 2e 70
01 41
[  677.275474] RSP: 0018:ffffaed400f6fcf8 EFLAGS: 00010246
[  677.275570] RAX: 0000000000000000 RBX: ffff93f085b33800 RCX: 0000000000000000
[  677.275663] RDX: ffff93f085b33f08 RSI: ffffaed400f6fd18 RDI: ffff93f083aa6700
[  677.275756] RBP: 0000000000000000 R08: ffff93f137e32968 R09: 0000009d9663fa40
[  677.275850] R10: 0000000000000000 R11: 0000009d9663fa40 R12: ffff93f083aa6700
[  677.275943] R13: 0000000000000000 R14: ffff93f082999e00 R15: ffff93f085b33800
[  677.276049] FS:  0000000000000000(0000) GS:ffff93f137e00000(0000)
knlGS:0000000000000000
[  677.276154] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  677.276247] CR2: 0000000000000030 CR3: 000000010797c002 CR4: 00000000007706f0
[  677.276355] PKRU: 55555554
[  677.276427] Call Trace:
[  677.276521]  <TASK>
[  677.276605]  ? wait_for_completion_interruptible_timeout+0xfa/0x170
[  677.276725]  xs_tls_handshake_sync+0x14f/0x170 [sunrpc]
[  677.276850]  ? __pfx_xs_tls_handshake_done+0x10/0x10 [sunrpc]
[  677.276967]  xs_tls_connect+0x14c/0x590 [sunrpc]
[  677.277086]  process_one_work+0x1b0/0x3c0
[  677.277209]  worker_thread+0x34/0x360
[  677.277303]  ? __pfx_worker_thread+0x10/0x10
[  677.277396]  kthread+0xdb/0x110
[  677.277527]  ? __pfx_kthread+0x10/0x10
[  677.277619]  ret_from_fork+0x29/0x50
[  677.277731]  </TASK>
[  677.277803] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
dns_resolver nfsv3 nfs_acl nfs lockd grace fscache netfs ext4 mbcache
jbd2 loop rfkill vsock_loopback vmw_vsock_virtio_transport_common
vmw_vsock_vmci_transport vsock sunrpc vfat fat dm_multipath
intel_rapl_msr intel_rapl_common isst_if_mbox_msr isst_if_common nfit
libnvdimm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl
vmw_balloon joydev pcspkr vmw_vmci i2c_piix4 xfs libcrc32c vmwgfx
drm_ttm_helper ttm sr_mod cdrom drm_kms_helper syscopyarea sysfillrect
sd_mod ata_generic sysimgblt sg ata_piix ahci libahci crc32c_intel
serio_raw drm vmxnet3 vmw_pvscsi libata dm_mirror dm_region_hash
dm_log dm_mod nvme_tcp nvme_fabrics nvme_core t10_pi crc64_rocksoft
crc64 ipmi_devintf ipmi_msghandler fuse
[  677.278383] CR2: 0000000000000030

Network trace from the server:

[kolga@scs000085536 ~/pcaps]$ tshark -t ad -r e0d_20230404_084311.trc0
    1 2023-04-04 08:43:28.027546 10.235.232.77 → 10.63.20.131 TCP 74
727 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
TSval=4059503415 TSecr=0 WS=128
    2 2023-04-04 08:43:28.027645 10.63.20.131 → 10.235.232.77 TCP 74
2049 → 727 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
SACK_PERM=1 TSval=2256845073 TSecr=4059503415
    3 2023-04-04 08:43:28.027813 10.235.232.77 → 10.63.20.131 TCP 66
727 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=4059503415
TSecr=2256845073
    4 2023-04-04 08:43:28.027952 10.235.232.77 → 10.63.20.131 NFS 110
V4 NULL Call
    5 2023-04-04 08:43:28.028301 10.63.20.131 → 10.235.232.77 NFS 102
V4 NULL Reply (Call In 4)
    6 2023-04-04 08:43:28.028563 10.235.232.77 → 10.63.20.131 TCP 66
727 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=4059503416
TSecr=2256845073
    7 2023-04-04 08:43:28.331181 10.63.20.131 → 10.235.232.77 TCP 66
2049 → 727 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=2256845383
TSecr=4059503416


>
>
> --
> Chuck Lever
>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: kernel oops
  2023-04-04 14:24   ` Olga Kornievskaia
@ 2023-04-04 14:31     ` Chuck Lever III
  2023-04-04 14:44       ` Olga Kornievskaia
  0 siblings, 1 reply; 12+ messages in thread
From: Chuck Lever III @ 2023-04-04 14:31 UTC (permalink / raw)
  To: Olga Kornievskaia; +Cc: kernel-tls-handshake



> On Apr 4, 2023, at 10:24 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
> 
> On Fri, Mar 31, 2023 at 10:25 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
>> 
>> 
>> 
>>> On Mar 31, 2023, at 10:23 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>> 
>>> Hi Chuck,
>>> 
>>> Have you seen this oops? This is not my testing so all I know was that
>>> it was an attempt to mount with xprtsec=tls (against ONTAP).
>>> 
>>> [  316.939747] BUG: kernel NULL pointer dereference, address: 0000000000000018
>>> [  316.939848] #PF: supervisor read access in kernel mode
>>> [  316.939914] #PF: error_code(0x0000) - not-present page
>>> [  316.939984] PGD 0 P4D 0
>>> [  316.940041] Oops: 0000 [#1] PREEMPT SMP PTI
>>> [  316.940109] CPU: 0 PID: 511 Comm: kworker/u2:30 Kdump: loaded Not
>>> tainted 6.2.1+ #2
>>> [  316.940181] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
>>> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
>>> [  316.940259] Workqueue: xprtiod xs_tls_connect [sunrpc]
>>> [  316.940463] RIP: 0010:xs_tls_connect+0x3e1/0x590 [sunrpc]
>>> [  316.940556] Code: ff ff ff 48 2b 83 f8 fe ff ff 48 89 83 00 ff ff
>>> ff e8 03 d3 ff ff e9 a8 fd ff ff 49 8b 97 f8 05 00 00 66 83 bb e0 f9
>>> ff ff 0a <48> 8b 6a 18 0f 84 fd 00 00 00 48 8d 72 18 4c 89 e7 48 89 14
>>> 24 e8
>>> [  316.940719] RSP: 0018:ffffa69d40f5fdc8 EFLAGS: 00010293
>>> [  316.940786] RAX: 0000000000000000 RBX: ffff9b0c46333e40 RCX: 0000000000000000
>>> [  316.940872] RDX: 0000000000000000 RSI: ffffa69d40f5fd20 RDI: 0000000000000000
>>> [  316.940940] RBP: ffff9b0c46350200 R08: ffff9b0cf7e31968 R09: 0000000000000159
>>> [  316.941008] R10: 0000000000000001 R11: 00000049c8887a40 R12: ffff9b0c46333800
>>> [  316.941075] R13: 0000000004208160 R14: ffff9b0c44e18a00 R15: ffff9b0c42dab800
>>> [  316.941159] FS:  0000000000000000(0000) GS:ffff9b0cf7e00000(0000)
>>> knlGS:0000000000000000
>>> [  316.941243] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [  316.941309] CR2: 0000000000000018 CR3: 0000000102e48004 CR4: 00000000007706f0
>>> [  316.941391] PKRU: 55555554
>>> [  316.941443] Call Trace:
>>> [  316.941513]  <TASK>
>>> [  316.941578]  process_one_work+0x1b0/0x3c0
>>> [  316.941689]  worker_thread+0x30/0x360
>>> [  316.941755]  ? __pfx_worker_thread+0x10/0x10
>>> [  316.941822]  kthread+0xd7/0x100
>>> [  316.941900]  ? __pfx_kthread+0x10/0x10
>>> [  316.941968]  ret_from_fork+0x29/0x50
>>> [  316.942049]  </TASK>
>>> [  316.942099] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
>>> dns_resolver ib_core nfsv3 nfs_acl nfs lockd grace fscache netfs ext4
>>> rfkill mbcache jbd2 loop vsock_loopback
>>> vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock
>>> sunrpc tls vfat fat dm_multipath intel_rapl_msr intel_rapl_common
>>> isst_if_mbox_msr isst_if_common nfit libnvdimm crct10dif_pclmul
>>> crc32_pclmul ghash_clmulni_intel rapl vmw_balloon joydev pcspkr
>>> i2c_piix4 vmw_vmci xfs libcrc32c vmwgfx drm_ttm_helper ttm sr_mod
>>> cdrom drm_kms_helper syscopyarea sd_mod sysfillrect ata_generic
>>> sysimgblt sg ahci libahci ata_piix crc32c_intel serio_raw drm vmxnet3
>>> vmw_pvscsi libata dm_mirror dm_region_hash dm_log dm_mod nvme_tcp
>>> nvme_fabrics nvme_core t10_pi crc64_rocksoft crc64 ipmi_devintf
>>> ipmi_msghandler fuse
>>> [  316.942550] CR2: 0000000000000018
>>> 
>>> Please note this was still on "commit
>>> caa52379f0046e974fd16585dd7ede624699d8e2 (HEAD -> 03152023-v7,
>>> origin/upcall-v7)". If you think somethings done to address it I'll
>>> have them update to the latest.
>> 
>> Update and retry. It has a familiar ring, but I can't say I've
>> seen this recently.
> 
> Hi Chuck,
> 
> I've updated the kernel code to the latest  commit
> 46588cade103b446f74a7ee160a83cd2a80399e7 (HEAD ->
> chuck-rpc-with-tls-04032023, chuck/topic-rpc-with-tls-upcall) but the
> client is still crashing but looks like in a different place. This
> happens when the server terminates the connection (immediately) as the
> tls handshake happens.

Why is the server doing that?


> Here's the stack trace:
> [  677.274299] BUG: kernel NULL pointer dereference, address: 0000000000000030
> [  677.274424] #PF: supervisor read access in kernel mode
> [  677.274517] #PF: error_code(0x0000) - not-present page
> [  677.274609] PGD 0 P4D 0
> [  677.274689] Oops: 0000 [#1] PREEMPT SMP PTI
> [  677.274785] CPU: 0 PID: 518 Comm: kworker/u2:29 Kdump: loaded Not
> tainted 6.3.0-rc4+ #13
> [  677.274881] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
> [  677.274981] Workqueue: xprtiod xs_tls_connect [sunrpc]
> [  677.275137] RIP: 0010:handshake_req_cancel+0x2b/0x3b0

scripts/faddr2line on this RIP

Which pointer is NULL?


> [  677.275269] Code: 0f 1e fa 0f 1f 44 00 00 41 57 41 56 41 55 41 54
> 55 53 48 83 ec 28 48 8b 6f 18 65 48 8b 04 25 28 00 00 00 48 89 44 24
> 20 31 c0 <48> 8b 5d 30 48 89 6c 24 18 e8 97 15 5f ff 4c 8b 25 50 2e 70
> 01 41
> [  677.275474] RSP: 0018:ffffaed400f6fcf8 EFLAGS: 00010246
> [  677.275570] RAX: 0000000000000000 RBX: ffff93f085b33800 RCX: 0000000000000000
> [  677.275663] RDX: ffff93f085b33f08 RSI: ffffaed400f6fd18 RDI: ffff93f083aa6700
> [  677.275756] RBP: 0000000000000000 R08: ffff93f137e32968 R09: 0000009d9663fa40
> [  677.275850] R10: 0000000000000000 R11: 0000009d9663fa40 R12: ffff93f083aa6700
> [  677.275943] R13: 0000000000000000 R14: ffff93f082999e00 R15: ffff93f085b33800
> [  677.276049] FS:  0000000000000000(0000) GS:ffff93f137e00000(0000)
> knlGS:0000000000000000
> [  677.276154] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  677.276247] CR2: 0000000000000030 CR3: 000000010797c002 CR4: 00000000007706f0
> [  677.276355] PKRU: 55555554
> [  677.276427] Call Trace:
> [  677.276521]  <TASK>
> [  677.276605]  ? wait_for_completion_interruptible_timeout+0xfa/0x170
> [  677.276725]  xs_tls_handshake_sync+0x14f/0x170 [sunrpc]
> [  677.276850]  ? __pfx_xs_tls_handshake_done+0x10/0x10 [sunrpc]
> [  677.276967]  xs_tls_connect+0x14c/0x590 [sunrpc]
> [  677.277086]  process_one_work+0x1b0/0x3c0
> [  677.277209]  worker_thread+0x34/0x360
> [  677.277303]  ? __pfx_worker_thread+0x10/0x10
> [  677.277396]  kthread+0xdb/0x110
> [  677.277527]  ? __pfx_kthread+0x10/0x10
> [  677.277619]  ret_from_fork+0x29/0x50
> [  677.277731]  </TASK>
> [  677.277803] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
> dns_resolver nfsv3 nfs_acl nfs lockd grace fscache netfs ext4 mbcache
> jbd2 loop rfkill vsock_loopback vmw_vsock_virtio_transport_common
> vmw_vsock_vmci_transport vsock sunrpc vfat fat dm_multipath
> intel_rapl_msr intel_rapl_common isst_if_mbox_msr isst_if_common nfit
> libnvdimm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl
> vmw_balloon joydev pcspkr vmw_vmci i2c_piix4 xfs libcrc32c vmwgfx
> drm_ttm_helper ttm sr_mod cdrom drm_kms_helper syscopyarea sysfillrect
> sd_mod ata_generic sysimgblt sg ata_piix ahci libahci crc32c_intel
> serio_raw drm vmxnet3 vmw_pvscsi libata dm_mirror dm_region_hash
> dm_log dm_mod nvme_tcp nvme_fabrics nvme_core t10_pi crc64_rocksoft
> crc64 ipmi_devintf ipmi_msghandler fuse
> [  677.278383] CR2: 0000000000000030
> 
> Network trace from the server:
> 
> [kolga@scs000085536 ~/pcaps]$ tshark -t ad -r e0d_20230404_084311.trc0
>    1 2023-04-04 08:43:28.027546 10.235.232.77 → 10.63.20.131 TCP 74
> 727 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
> TSval=4059503415 TSecr=0 WS=128
>    2 2023-04-04 08:43:28.027645 10.63.20.131 → 10.235.232.77 TCP 74
> 2049 → 727 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
> SACK_PERM=1 TSval=2256845073 TSecr=4059503415
>    3 2023-04-04 08:43:28.027813 10.235.232.77 → 10.63.20.131 TCP 66
> 727 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=4059503415
> TSecr=2256845073
>    4 2023-04-04 08:43:28.027952 10.235.232.77 → 10.63.20.131 NFS 110
> V4 NULL Call
>    5 2023-04-04 08:43:28.028301 10.63.20.131 → 10.235.232.77 NFS 102
> V4 NULL Reply (Call In 4)
>    6 2023-04-04 08:43:28.028563 10.235.232.77 → 10.63.20.131 TCP 66
> 727 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=4059503416
> TSecr=2256845073
>    7 2023-04-04 08:43:28.331181 10.63.20.131 → 10.235.232.77 TCP 66
> 2049 → 727 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=2256845383
> TSecr=4059503416
> 
> 
>> 
>> 
>> --
>> Chuck Lever


--
Chuck Lever



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: kernel oops
  2023-04-04 14:31     ` Chuck Lever III
@ 2023-04-04 14:44       ` Olga Kornievskaia
  2023-04-04 14:54         ` Chuck Lever III
  0 siblings, 1 reply; 12+ messages in thread
From: Olga Kornievskaia @ 2023-04-04 14:44 UTC (permalink / raw)
  To: Chuck Lever III; +Cc: kernel-tls-handshake

On Tue, Apr 4, 2023 at 10:32 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
>
>
>
> > On Apr 4, 2023, at 10:24 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
> >
> > On Fri, Mar 31, 2023 at 10:25 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
> >>
> >>
> >>
> >>> On Mar 31, 2023, at 10:23 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
> >>>
> >>> Hi Chuck,
> >>>
> >>> Have you seen this oops? This is not my testing so all I know was that
> >>> it was an attempt to mount with xprtsec=tls (against ONTAP).
> >>>
> >>> [  316.939747] BUG: kernel NULL pointer dereference, address: 0000000000000018
> >>> [  316.939848] #PF: supervisor read access in kernel mode
> >>> [  316.939914] #PF: error_code(0x0000) - not-present page
> >>> [  316.939984] PGD 0 P4D 0
> >>> [  316.940041] Oops: 0000 [#1] PREEMPT SMP PTI
> >>> [  316.940109] CPU: 0 PID: 511 Comm: kworker/u2:30 Kdump: loaded Not
> >>> tainted 6.2.1+ #2
> >>> [  316.940181] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
> >>> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
> >>> [  316.940259] Workqueue: xprtiod xs_tls_connect [sunrpc]
> >>> [  316.940463] RIP: 0010:xs_tls_connect+0x3e1/0x590 [sunrpc]
> >>> [  316.940556] Code: ff ff ff 48 2b 83 f8 fe ff ff 48 89 83 00 ff ff
> >>> ff e8 03 d3 ff ff e9 a8 fd ff ff 49 8b 97 f8 05 00 00 66 83 bb e0 f9
> >>> ff ff 0a <48> 8b 6a 18 0f 84 fd 00 00 00 48 8d 72 18 4c 89 e7 48 89 14
> >>> 24 e8
> >>> [  316.940719] RSP: 0018:ffffa69d40f5fdc8 EFLAGS: 00010293
> >>> [  316.940786] RAX: 0000000000000000 RBX: ffff9b0c46333e40 RCX: 0000000000000000
> >>> [  316.940872] RDX: 0000000000000000 RSI: ffffa69d40f5fd20 RDI: 0000000000000000
> >>> [  316.940940] RBP: ffff9b0c46350200 R08: ffff9b0cf7e31968 R09: 0000000000000159
> >>> [  316.941008] R10: 0000000000000001 R11: 00000049c8887a40 R12: ffff9b0c46333800
> >>> [  316.941075] R13: 0000000004208160 R14: ffff9b0c44e18a00 R15: ffff9b0c42dab800
> >>> [  316.941159] FS:  0000000000000000(0000) GS:ffff9b0cf7e00000(0000)
> >>> knlGS:0000000000000000
> >>> [  316.941243] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>> [  316.941309] CR2: 0000000000000018 CR3: 0000000102e48004 CR4: 00000000007706f0
> >>> [  316.941391] PKRU: 55555554
> >>> [  316.941443] Call Trace:
> >>> [  316.941513]  <TASK>
> >>> [  316.941578]  process_one_work+0x1b0/0x3c0
> >>> [  316.941689]  worker_thread+0x30/0x360
> >>> [  316.941755]  ? __pfx_worker_thread+0x10/0x10
> >>> [  316.941822]  kthread+0xd7/0x100
> >>> [  316.941900]  ? __pfx_kthread+0x10/0x10
> >>> [  316.941968]  ret_from_fork+0x29/0x50
> >>> [  316.942049]  </TASK>
> >>> [  316.942099] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
> >>> dns_resolver ib_core nfsv3 nfs_acl nfs lockd grace fscache netfs ext4
> >>> rfkill mbcache jbd2 loop vsock_loopback
> >>> vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock
> >>> sunrpc tls vfat fat dm_multipath intel_rapl_msr intel_rapl_common
> >>> isst_if_mbox_msr isst_if_common nfit libnvdimm crct10dif_pclmul
> >>> crc32_pclmul ghash_clmulni_intel rapl vmw_balloon joydev pcspkr
> >>> i2c_piix4 vmw_vmci xfs libcrc32c vmwgfx drm_ttm_helper ttm sr_mod
> >>> cdrom drm_kms_helper syscopyarea sd_mod sysfillrect ata_generic
> >>> sysimgblt sg ahci libahci ata_piix crc32c_intel serio_raw drm vmxnet3
> >>> vmw_pvscsi libata dm_mirror dm_region_hash dm_log dm_mod nvme_tcp
> >>> nvme_fabrics nvme_core t10_pi crc64_rocksoft crc64 ipmi_devintf
> >>> ipmi_msghandler fuse
> >>> [  316.942550] CR2: 0000000000000018
> >>>
> >>> Please note this was still on "commit
> >>> caa52379f0046e974fd16585dd7ede624699d8e2 (HEAD -> 03152023-v7,
> >>> origin/upcall-v7)". If you think somethings done to address it I'll
> >>> have them update to the latest.
> >>
> >> Update and retry. It has a familiar ring, but I can't say I've
> >> seen this recently.
> >
> > Hi Chuck,
> >
> > I've updated the kernel code to the latest  commit
> > 46588cade103b446f74a7ee160a83cd2a80399e7 (HEAD ->
> > chuck-rpc-with-tls-04032023, chuck/topic-rpc-with-tls-upcall) but the
> > client is still crashing but looks like in a different place. This
> > happens when the server terminates the connection (immediately) as the
> > tls handshake happens.
>
> Why is the server doing that?

I'm not sure what kind of a reason you are looking for. I think the
server can send an RST at any point in a TCP connection and the client
shouldn't crash. But the real reason is probably because the server's
code is getting developed and they ran into some error and resetting
the connection.

> > Here's the stack trace:
> > [  677.274299] BUG: kernel NULL pointer dereference, address: 0000000000000030
> > [  677.274424] #PF: supervisor read access in kernel mode
> > [  677.274517] #PF: error_code(0x0000) - not-present page
> > [  677.274609] PGD 0 P4D 0
> > [  677.274689] Oops: 0000 [#1] PREEMPT SMP PTI
> > [  677.274785] CPU: 0 PID: 518 Comm: kworker/u2:29 Kdump: loaded Not
> > tainted 6.3.0-rc4+ #13
> > [  677.274881] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
> > Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
> > [  677.274981] Workqueue: xprtiod xs_tls_connect [sunrpc]
> > [  677.275137] RIP: 0010:handshake_req_cancel+0x2b/0x3b0
>
> scripts/faddr2line on this RIP
>
> Which pointer is NULL?

I apologize I don't have that info and I'm working on trying to figure
out how to get this information to you. Lab kernels are built with
CONFIG_DEBUG_INFO disabled and thus I can't do faddr2line. I will
attempt to build on with debugging enabled (has been problematic
before for the lab machines so not sure if that'll be successful I
might have to hack the linux kernel to simulate the reset).


> > [  677.275269] Code: 0f 1e fa 0f 1f 44 00 00 41 57 41 56 41 55 41 54
> > 55 53 48 83 ec 28 48 8b 6f 18 65 48 8b 04 25 28 00 00 00 48 89 44 24
> > 20 31 c0 <48> 8b 5d 30 48 89 6c 24 18 e8 97 15 5f ff 4c 8b 25 50 2e 70
> > 01 41
> > [  677.275474] RSP: 0018:ffffaed400f6fcf8 EFLAGS: 00010246
> > [  677.275570] RAX: 0000000000000000 RBX: ffff93f085b33800 RCX: 0000000000000000
> > [  677.275663] RDX: ffff93f085b33f08 RSI: ffffaed400f6fd18 RDI: ffff93f083aa6700
> > [  677.275756] RBP: 0000000000000000 R08: ffff93f137e32968 R09: 0000009d9663fa40
> > [  677.275850] R10: 0000000000000000 R11: 0000009d9663fa40 R12: ffff93f083aa6700
> > [  677.275943] R13: 0000000000000000 R14: ffff93f082999e00 R15: ffff93f085b33800
> > [  677.276049] FS:  0000000000000000(0000) GS:ffff93f137e00000(0000)
> > knlGS:0000000000000000
> > [  677.276154] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  677.276247] CR2: 0000000000000030 CR3: 000000010797c002 CR4: 00000000007706f0
> > [  677.276355] PKRU: 55555554
> > [  677.276427] Call Trace:
> > [  677.276521]  <TASK>
> > [  677.276605]  ? wait_for_completion_interruptible_timeout+0xfa/0x170
> > [  677.276725]  xs_tls_handshake_sync+0x14f/0x170 [sunrpc]
> > [  677.276850]  ? __pfx_xs_tls_handshake_done+0x10/0x10 [sunrpc]
> > [  677.276967]  xs_tls_connect+0x14c/0x590 [sunrpc]
> > [  677.277086]  process_one_work+0x1b0/0x3c0
> > [  677.277209]  worker_thread+0x34/0x360
> > [  677.277303]  ? __pfx_worker_thread+0x10/0x10
> > [  677.277396]  kthread+0xdb/0x110
> > [  677.277527]  ? __pfx_kthread+0x10/0x10
> > [  677.277619]  ret_from_fork+0x29/0x50
> > [  677.277731]  </TASK>
> > [  677.277803] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
> > dns_resolver nfsv3 nfs_acl nfs lockd grace fscache netfs ext4 mbcache
> > jbd2 loop rfkill vsock_loopback vmw_vsock_virtio_transport_common
> > vmw_vsock_vmci_transport vsock sunrpc vfat fat dm_multipath
> > intel_rapl_msr intel_rapl_common isst_if_mbox_msr isst_if_common nfit
> > libnvdimm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl
> > vmw_balloon joydev pcspkr vmw_vmci i2c_piix4 xfs libcrc32c vmwgfx
> > drm_ttm_helper ttm sr_mod cdrom drm_kms_helper syscopyarea sysfillrect
> > sd_mod ata_generic sysimgblt sg ata_piix ahci libahci crc32c_intel
> > serio_raw drm vmxnet3 vmw_pvscsi libata dm_mirror dm_region_hash
> > dm_log dm_mod nvme_tcp nvme_fabrics nvme_core t10_pi crc64_rocksoft
> > crc64 ipmi_devintf ipmi_msghandler fuse
> > [  677.278383] CR2: 0000000000000030
> >
> > Network trace from the server:
> >
> > [kolga@scs000085536 ~/pcaps]$ tshark -t ad -r e0d_20230404_084311.trc0
> >    1 2023-04-04 08:43:28.027546 10.235.232.77 → 10.63.20.131 TCP 74
> > 727 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
> > TSval=4059503415 TSecr=0 WS=128
> >    2 2023-04-04 08:43:28.027645 10.63.20.131 → 10.235.232.77 TCP 74
> > 2049 → 727 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
> > SACK_PERM=1 TSval=2256845073 TSecr=4059503415
> >    3 2023-04-04 08:43:28.027813 10.235.232.77 → 10.63.20.131 TCP 66
> > 727 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=4059503415
> > TSecr=2256845073
> >    4 2023-04-04 08:43:28.027952 10.235.232.77 → 10.63.20.131 NFS 110
> > V4 NULL Call
> >    5 2023-04-04 08:43:28.028301 10.63.20.131 → 10.235.232.77 NFS 102
> > V4 NULL Reply (Call In 4)
> >    6 2023-04-04 08:43:28.028563 10.235.232.77 → 10.63.20.131 TCP 66
> > 727 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=4059503416
> > TSecr=2256845073
> >    7 2023-04-04 08:43:28.331181 10.63.20.131 → 10.235.232.77 TCP 66
> > 2049 → 727 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=2256845383
> > TSecr=4059503416
> >
> >
> >>
> >>
> >> --
> >> Chuck Lever
>
>
> --
> Chuck Lever
>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: kernel oops
  2023-04-04 14:44       ` Olga Kornievskaia
@ 2023-04-04 14:54         ` Chuck Lever III
  2023-04-04 16:15           ` Olga Kornievskaia
  0 siblings, 1 reply; 12+ messages in thread
From: Chuck Lever III @ 2023-04-04 14:54 UTC (permalink / raw)
  To: Olga Kornievskaia; +Cc: kernel-tls-handshake



> On Apr 4, 2023, at 10:44 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
> 
> On Tue, Apr 4, 2023 at 10:32 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
>> 
>>> On Apr 4, 2023, at 10:24 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>> 
>>> On Fri, Mar 31, 2023 at 10:25 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
>>>> 
>>>> 
>>>> 
>>>>> On Mar 31, 2023, at 10:23 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>>> 
>>>>> Hi Chuck,
>>>>> 
>>>>> Have you seen this oops? This is not my testing so all I know was that
>>>>> it was an attempt to mount with xprtsec=tls (against ONTAP).
>>>>> 
>>>>> [  316.939747] BUG: kernel NULL pointer dereference, address: 0000000000000018
>>>>> [  316.939848] #PF: supervisor read access in kernel mode
>>>>> [  316.939914] #PF: error_code(0x0000) - not-present page
>>>>> [  316.939984] PGD 0 P4D 0
>>>>> [  316.940041] Oops: 0000 [#1] PREEMPT SMP PTI
>>>>> [  316.940109] CPU: 0 PID: 511 Comm: kworker/u2:30 Kdump: loaded Not
>>>>> tainted 6.2.1+ #2
>>>>> [  316.940181] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
>>>>> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
>>>>> [  316.940259] Workqueue: xprtiod xs_tls_connect [sunrpc]
>>>>> [  316.940463] RIP: 0010:xs_tls_connect+0x3e1/0x590 [sunrpc]
>>>>> [  316.940556] Code: ff ff ff 48 2b 83 f8 fe ff ff 48 89 83 00 ff ff
>>>>> ff e8 03 d3 ff ff e9 a8 fd ff ff 49 8b 97 f8 05 00 00 66 83 bb e0 f9
>>>>> ff ff 0a <48> 8b 6a 18 0f 84 fd 00 00 00 48 8d 72 18 4c 89 e7 48 89 14
>>>>> 24 e8
>>>>> [  316.940719] RSP: 0018:ffffa69d40f5fdc8 EFLAGS: 00010293
>>>>> [  316.940786] RAX: 0000000000000000 RBX: ffff9b0c46333e40 RCX: 0000000000000000
>>>>> [  316.940872] RDX: 0000000000000000 RSI: ffffa69d40f5fd20 RDI: 0000000000000000
>>>>> [  316.940940] RBP: ffff9b0c46350200 R08: ffff9b0cf7e31968 R09: 0000000000000159
>>>>> [  316.941008] R10: 0000000000000001 R11: 00000049c8887a40 R12: ffff9b0c46333800
>>>>> [  316.941075] R13: 0000000004208160 R14: ffff9b0c44e18a00 R15: ffff9b0c42dab800
>>>>> [  316.941159] FS:  0000000000000000(0000) GS:ffff9b0cf7e00000(0000)
>>>>> knlGS:0000000000000000
>>>>> [  316.941243] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [  316.941309] CR2: 0000000000000018 CR3: 0000000102e48004 CR4: 00000000007706f0
>>>>> [  316.941391] PKRU: 55555554
>>>>> [  316.941443] Call Trace:
>>>>> [  316.941513]  <TASK>
>>>>> [  316.941578]  process_one_work+0x1b0/0x3c0
>>>>> [  316.941689]  worker_thread+0x30/0x360
>>>>> [  316.941755]  ? __pfx_worker_thread+0x10/0x10
>>>>> [  316.941822]  kthread+0xd7/0x100
>>>>> [  316.941900]  ? __pfx_kthread+0x10/0x10
>>>>> [  316.941968]  ret_from_fork+0x29/0x50
>>>>> [  316.942049]  </TASK>
>>>>> [  316.942099] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
>>>>> dns_resolver ib_core nfsv3 nfs_acl nfs lockd grace fscache netfs ext4
>>>>> rfkill mbcache jbd2 loop vsock_loopback
>>>>> vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock
>>>>> sunrpc tls vfat fat dm_multipath intel_rapl_msr intel_rapl_common
>>>>> isst_if_mbox_msr isst_if_common nfit libnvdimm crct10dif_pclmul
>>>>> crc32_pclmul ghash_clmulni_intel rapl vmw_balloon joydev pcspkr
>>>>> i2c_piix4 vmw_vmci xfs libcrc32c vmwgfx drm_ttm_helper ttm sr_mod
>>>>> cdrom drm_kms_helper syscopyarea sd_mod sysfillrect ata_generic
>>>>> sysimgblt sg ahci libahci ata_piix crc32c_intel serio_raw drm vmxnet3
>>>>> vmw_pvscsi libata dm_mirror dm_region_hash dm_log dm_mod nvme_tcp
>>>>> nvme_fabrics nvme_core t10_pi crc64_rocksoft crc64 ipmi_devintf
>>>>> ipmi_msghandler fuse
>>>>> [  316.942550] CR2: 0000000000000018
>>>>> 
>>>>> Please note this was still on "commit
>>>>> caa52379f0046e974fd16585dd7ede624699d8e2 (HEAD -> 03152023-v7,
>>>>> origin/upcall-v7)". If you think somethings done to address it I'll
>>>>> have them update to the latest.
>>>> 
>>>> Update and retry. It has a familiar ring, but I can't say I've
>>>> seen this recently.
>>> 
>>> Hi Chuck,
>>> 
>>> I've updated the kernel code to the latest  commit
>>> 46588cade103b446f74a7ee160a83cd2a80399e7 (HEAD ->
>>> chuck-rpc-with-tls-04032023, chuck/topic-rpc-with-tls-upcall) but the
>>> client is still crashing but looks like in a different place. This
>>> happens when the server terminates the connection (immediately) as the
>>> tls handshake happens.
>> 
>> Why is the server doing that?
> 
> I'm not sure what kind of a reason you are looking for. I think the
> server can send an RST at any point in a TCP connection and the client
> shouldn't crash.

No doubt.


> But the real reason is probably because the server's
> code is getting developed and they ran into some error and resetting
> the connection.

That's what I was looking for. I wanted to know if it was
Linux NFSD doing this.


>>> Here's the stack trace:
>>> [  677.274299] BUG: kernel NULL pointer dereference, address: 0000000000000030
>>> [  677.274424] #PF: supervisor read access in kernel mode
>>> [  677.274517] #PF: error_code(0x0000) - not-present page
>>> [  677.274609] PGD 0 P4D 0
>>> [  677.274689] Oops: 0000 [#1] PREEMPT SMP PTI
>>> [  677.274785] CPU: 0 PID: 518 Comm: kworker/u2:29 Kdump: loaded Not
>>> tainted 6.3.0-rc4+ #13
>>> [  677.274881] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
>>> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
>>> [  677.274981] Workqueue: xprtiod xs_tls_connect [sunrpc]
>>> [  677.275137] RIP: 0010:handshake_req_cancel+0x2b/0x3b0
>> 
>> scripts/faddr2line on this RIP
>> 
>> Which pointer is NULL?
> 
> I apologize I don't have that info and I'm working on trying to figure
> out how to get this information to you. Lab kernels are built with
> CONFIG_DEBUG_INFO disabled and thus I can't do faddr2line. I will
> attempt to build on with debugging enabled (has been problematic
> before for the lab machines so not sure if that'll be successful I
> might have to hack the linux kernel to simulate the reset).

Jakub asked that NULL checking be removed from handshake_req_cancel,
or at least that's what I read into his comment. That choice is
definitely flushing bugs.

So this crasher is familiar. req_cancel is getting called in some
scenario where the socket or request is partially filled in. We
have to figure out how that's happening.

If debug info is a problem, then just add a sarcastic quantity of
NULL checks in handshake_req_cancel, rebuild, and it should tell
you exactly which pointer is bad.

BPF might also be a solution, but I'm not sure if it relies on
debug info. Might be only BTF is needed.


>>> [  677.275269] Code: 0f 1e fa 0f 1f 44 00 00 41 57 41 56 41 55 41 54
>>> 55 53 48 83 ec 28 48 8b 6f 18 65 48 8b 04 25 28 00 00 00 48 89 44 24
>>> 20 31 c0 <48> 8b 5d 30 48 89 6c 24 18 e8 97 15 5f ff 4c 8b 25 50 2e 70
>>> 01 41
>>> [  677.275474] RSP: 0018:ffffaed400f6fcf8 EFLAGS: 00010246
>>> [  677.275570] RAX: 0000000000000000 RBX: ffff93f085b33800 RCX: 0000000000000000
>>> [  677.275663] RDX: ffff93f085b33f08 RSI: ffffaed400f6fd18 RDI: ffff93f083aa6700
>>> [  677.275756] RBP: 0000000000000000 R08: ffff93f137e32968 R09: 0000009d9663fa40
>>> [  677.275850] R10: 0000000000000000 R11: 0000009d9663fa40 R12: ffff93f083aa6700
>>> [  677.275943] R13: 0000000000000000 R14: ffff93f082999e00 R15: ffff93f085b33800
>>> [  677.276049] FS:  0000000000000000(0000) GS:ffff93f137e00000(0000)
>>> knlGS:0000000000000000
>>> [  677.276154] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [  677.276247] CR2: 0000000000000030 CR3: 000000010797c002 CR4: 00000000007706f0
>>> [  677.276355] PKRU: 55555554
>>> [  677.276427] Call Trace:
>>> [  677.276521]  <TASK>
>>> [  677.276605]  ? wait_for_completion_interruptible_timeout+0xfa/0x170
>>> [  677.276725]  xs_tls_handshake_sync+0x14f/0x170 [sunrpc]
>>> [  677.276850]  ? __pfx_xs_tls_handshake_done+0x10/0x10 [sunrpc]
>>> [  677.276967]  xs_tls_connect+0x14c/0x590 [sunrpc]
>>> [  677.277086]  process_one_work+0x1b0/0x3c0
>>> [  677.277209]  worker_thread+0x34/0x360
>>> [  677.277303]  ? __pfx_worker_thread+0x10/0x10
>>> [  677.277396]  kthread+0xdb/0x110
>>> [  677.277527]  ? __pfx_kthread+0x10/0x10
>>> [  677.277619]  ret_from_fork+0x29/0x50
>>> [  677.277731]  </TASK>
>>> [  677.277803] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
>>> dns_resolver nfsv3 nfs_acl nfs lockd grace fscache netfs ext4 mbcache
>>> jbd2 loop rfkill vsock_loopback vmw_vsock_virtio_transport_common
>>> vmw_vsock_vmci_transport vsock sunrpc vfat fat dm_multipath
>>> intel_rapl_msr intel_rapl_common isst_if_mbox_msr isst_if_common nfit
>>> libnvdimm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl
>>> vmw_balloon joydev pcspkr vmw_vmci i2c_piix4 xfs libcrc32c vmwgfx
>>> drm_ttm_helper ttm sr_mod cdrom drm_kms_helper syscopyarea sysfillrect
>>> sd_mod ata_generic sysimgblt sg ata_piix ahci libahci crc32c_intel
>>> serio_raw drm vmxnet3 vmw_pvscsi libata dm_mirror dm_region_hash
>>> dm_log dm_mod nvme_tcp nvme_fabrics nvme_core t10_pi crc64_rocksoft
>>> crc64 ipmi_devintf ipmi_msghandler fuse
>>> [  677.278383] CR2: 0000000000000030
>>> 
>>> Network trace from the server:
>>> 
>>> [kolga@scs000085536 ~/pcaps]$ tshark -t ad -r e0d_20230404_084311.trc0
>>>   1 2023-04-04 08:43:28.027546 10.235.232.77 → 10.63.20.131 TCP 74
>>> 727 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
>>> TSval=4059503415 TSecr=0 WS=128
>>>   2 2023-04-04 08:43:28.027645 10.63.20.131 → 10.235.232.77 TCP 74
>>> 2049 → 727 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
>>> SACK_PERM=1 TSval=2256845073 TSecr=4059503415
>>>   3 2023-04-04 08:43:28.027813 10.235.232.77 → 10.63.20.131 TCP 66
>>> 727 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=4059503415
>>> TSecr=2256845073
>>>   4 2023-04-04 08:43:28.027952 10.235.232.77 → 10.63.20.131 NFS 110
>>> V4 NULL Call
>>>   5 2023-04-04 08:43:28.028301 10.63.20.131 → 10.235.232.77 NFS 102
>>> V4 NULL Reply (Call In 4)
>>>   6 2023-04-04 08:43:28.028563 10.235.232.77 → 10.63.20.131 TCP 66
>>> 727 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=4059503416
>>> TSecr=2256845073
>>>   7 2023-04-04 08:43:28.331181 10.63.20.131 → 10.235.232.77 TCP 66
>>> 2049 → 727 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=2256845383
>>> TSecr=4059503416
>>> 
>>> 
>>>> 
>>>> 
>>>> --
>>>> Chuck Lever
>> 
>> 
>> --
>> Chuck Lever


--
Chuck Lever



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: kernel oops
  2023-04-04 14:54         ` Chuck Lever III
@ 2023-04-04 16:15           ` Olga Kornievskaia
  2023-04-04 17:28             ` Chuck Lever III
  0 siblings, 1 reply; 12+ messages in thread
From: Olga Kornievskaia @ 2023-04-04 16:15 UTC (permalink / raw)
  To: Chuck Lever III; +Cc: kernel-tls-handshake

On Tue, Apr 4, 2023 at 10:54 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
>
>
>
> > On Apr 4, 2023, at 10:44 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
> >
> > On Tue, Apr 4, 2023 at 10:32 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
> >>
> >>> On Apr 4, 2023, at 10:24 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
> >>>
> >>> On Fri, Mar 31, 2023 at 10:25 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>>> On Mar 31, 2023, at 10:23 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
> >>>>>
> >>>>> Hi Chuck,
> >>>>>
> >>>>> Have you seen this oops? This is not my testing so all I know was that
> >>>>> it was an attempt to mount with xprtsec=tls (against ONTAP).
> >>>>>
> >>>>> [  316.939747] BUG: kernel NULL pointer dereference, address: 0000000000000018
> >>>>> [  316.939848] #PF: supervisor read access in kernel mode
> >>>>> [  316.939914] #PF: error_code(0x0000) - not-present page
> >>>>> [  316.939984] PGD 0 P4D 0
> >>>>> [  316.940041] Oops: 0000 [#1] PREEMPT SMP PTI
> >>>>> [  316.940109] CPU: 0 PID: 511 Comm: kworker/u2:30 Kdump: loaded Not
> >>>>> tainted 6.2.1+ #2
> >>>>> [  316.940181] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
> >>>>> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
> >>>>> [  316.940259] Workqueue: xprtiod xs_tls_connect [sunrpc]
> >>>>> [  316.940463] RIP: 0010:xs_tls_connect+0x3e1/0x590 [sunrpc]
> >>>>> [  316.940556] Code: ff ff ff 48 2b 83 f8 fe ff ff 48 89 83 00 ff ff
> >>>>> ff e8 03 d3 ff ff e9 a8 fd ff ff 49 8b 97 f8 05 00 00 66 83 bb e0 f9
> >>>>> ff ff 0a <48> 8b 6a 18 0f 84 fd 00 00 00 48 8d 72 18 4c 89 e7 48 89 14
> >>>>> 24 e8
> >>>>> [  316.940719] RSP: 0018:ffffa69d40f5fdc8 EFLAGS: 00010293
> >>>>> [  316.940786] RAX: 0000000000000000 RBX: ffff9b0c46333e40 RCX: 0000000000000000
> >>>>> [  316.940872] RDX: 0000000000000000 RSI: ffffa69d40f5fd20 RDI: 0000000000000000
> >>>>> [  316.940940] RBP: ffff9b0c46350200 R08: ffff9b0cf7e31968 R09: 0000000000000159
> >>>>> [  316.941008] R10: 0000000000000001 R11: 00000049c8887a40 R12: ffff9b0c46333800
> >>>>> [  316.941075] R13: 0000000004208160 R14: ffff9b0c44e18a00 R15: ffff9b0c42dab800
> >>>>> [  316.941159] FS:  0000000000000000(0000) GS:ffff9b0cf7e00000(0000)
> >>>>> knlGS:0000000000000000
> >>>>> [  316.941243] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>>>> [  316.941309] CR2: 0000000000000018 CR3: 0000000102e48004 CR4: 00000000007706f0
> >>>>> [  316.941391] PKRU: 55555554
> >>>>> [  316.941443] Call Trace:
> >>>>> [  316.941513]  <TASK>
> >>>>> [  316.941578]  process_one_work+0x1b0/0x3c0
> >>>>> [  316.941689]  worker_thread+0x30/0x360
> >>>>> [  316.941755]  ? __pfx_worker_thread+0x10/0x10
> >>>>> [  316.941822]  kthread+0xd7/0x100
> >>>>> [  316.941900]  ? __pfx_kthread+0x10/0x10
> >>>>> [  316.941968]  ret_from_fork+0x29/0x50
> >>>>> [  316.942049]  </TASK>
> >>>>> [  316.942099] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
> >>>>> dns_resolver ib_core nfsv3 nfs_acl nfs lockd grace fscache netfs ext4
> >>>>> rfkill mbcache jbd2 loop vsock_loopback
> >>>>> vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock
> >>>>> sunrpc tls vfat fat dm_multipath intel_rapl_msr intel_rapl_common
> >>>>> isst_if_mbox_msr isst_if_common nfit libnvdimm crct10dif_pclmul
> >>>>> crc32_pclmul ghash_clmulni_intel rapl vmw_balloon joydev pcspkr
> >>>>> i2c_piix4 vmw_vmci xfs libcrc32c vmwgfx drm_ttm_helper ttm sr_mod
> >>>>> cdrom drm_kms_helper syscopyarea sd_mod sysfillrect ata_generic
> >>>>> sysimgblt sg ahci libahci ata_piix crc32c_intel serio_raw drm vmxnet3
> >>>>> vmw_pvscsi libata dm_mirror dm_region_hash dm_log dm_mod nvme_tcp
> >>>>> nvme_fabrics nvme_core t10_pi crc64_rocksoft crc64 ipmi_devintf
> >>>>> ipmi_msghandler fuse
> >>>>> [  316.942550] CR2: 0000000000000018
> >>>>>
> >>>>> Please note this was still on "commit
> >>>>> caa52379f0046e974fd16585dd7ede624699d8e2 (HEAD -> 03152023-v7,
> >>>>> origin/upcall-v7)". If you think somethings done to address it I'll
> >>>>> have them update to the latest.
> >>>>
> >>>> Update and retry. It has a familiar ring, but I can't say I've
> >>>> seen this recently.
> >>>
> >>> Hi Chuck,
> >>>
> >>> I've updated the kernel code to the latest  commit
> >>> 46588cade103b446f74a7ee160a83cd2a80399e7 (HEAD ->
> >>> chuck-rpc-with-tls-04032023, chuck/topic-rpc-with-tls-upcall) but the
> >>> client is still crashing but looks like in a different place. This
> >>> happens when the server terminates the connection (immediately) as the
> >>> tls handshake happens.
> >>
> >> Why is the server doing that?
> >
> > I'm not sure what kind of a reason you are looking for. I think the
> > server can send an RST at any point in a TCP connection and the client
> > shouldn't crash.
>
> No doubt.
>
>
> > But the real reason is probably because the server's
> > code is getting developed and they ran into some error and resetting
> > the connection.
>
> That's what I was looking for. I wanted to know if it was
> Linux NFSD doing this.
>
>
> >>> Here's the stack trace:
> >>> [  677.274299] BUG: kernel NULL pointer dereference, address: 0000000000000030
> >>> [  677.274424] #PF: supervisor read access in kernel mode
> >>> [  677.274517] #PF: error_code(0x0000) - not-present page
> >>> [  677.274609] PGD 0 P4D 0
> >>> [  677.274689] Oops: 0000 [#1] PREEMPT SMP PTI
> >>> [  677.274785] CPU: 0 PID: 518 Comm: kworker/u2:29 Kdump: loaded Not
> >>> tainted 6.3.0-rc4+ #13
> >>> [  677.274881] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
> >>> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
> >>> [  677.274981] Workqueue: xprtiod xs_tls_connect [sunrpc]
> >>> [  677.275137] RIP: 0010:handshake_req_cancel+0x2b/0x3b0
> >>
> >> scripts/faddr2line on this RIP
> >>
> >> Which pointer is NULL?
> >
> > I apologize I don't have that info and I'm working on trying to figure
> > out how to get this information to you. Lab kernels are built with
> > CONFIG_DEBUG_INFO disabled and thus I can't do faddr2line. I will
> > attempt to build on with debugging enabled (has been problematic
> > before for the lab machines so not sure if that'll be successful I
> > might have to hack the linux kernel to simulate the reset).
>
> Jakub asked that NULL checking be removed from handshake_req_cancel,
> or at least that's what I read into his comment. That choice is
> definitely flushing bugs.
>
> So this crasher is familiar. req_cancel is getting called in some
> scenario where the socket or request is partially filled in. We
> have to figure out how that's happening.
>
> If debug info is a problem, then just add a sarcastic quantity of
> NULL checks in handshake_req_cancel, rebuild, and it should tell
> you exactly which pointer is bad.

I added a check for sock-sk null and if so return true. I thought that
would "prevent" the crash but it then proceeded to crash elsewhere.
Sorry I'm still on non-debug-symbol kernel (as I thought I'd add the
checks that that'd be enough :)).

[  284.368211] AGLO: handshake_req_cancel sock->sk is NULL
[  284.368249] BUG: kernel NULL pointer dereference, address: 0000000000000018
[  284.368320] #PF: supervisor read access in kernel mode
[  284.368363] #PF: error_code(0x0000) - not-present page
[  284.368411] PGD 0 P4D 0
[  284.368438] Oops: 0000 [#1] PREEMPT SMP PTI
[  284.368473] CPU: 0 PID: 521 Comm: kworker/u2:30 Kdump: loaded Not
tainted 6.3.0-rc4+ #14
[  284.368523] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
[  284.368580] Workqueue: xprtiod xs_tls_connect [sunrpc]
[  284.368720] RIP: 0010:xs_tls_connect+0x3e5/0x590 [sunrpc]
[  284.368844] Code: ff ff ff 48 2b 83 f8 fe ff ff 48 89 83 00 ff ff
ff e8 bf d2 ff ff e9 a8 fd ff ff 49 8b 97 f8 05 00 00 66 83 bb e0 f9
ff ff 0a <48> 8b 6a 18 0f 84 fd 00 00 00 48 8d 72 18 4c 89 e7 48 89 14
24 e8
[  284.368973] RSP: 0018:ffffba17c0f67dc8 EFLAGS: 00010293
[  284.369010] RAX: 0000000000000000 RBX: ffff963143c1fe40 RCX: 0000000000000000
[  284.369051] RDX: 0000000000000000 RSI: ffff9631f7e20840 RDI: ffff9631f7e20840
[  284.369091] RBP: ffff9631463d8a00 R08: 0000000000000000 R09: c0000000ffff7fff
[  284.369130] R10: 0000000000000001 R11: ffffba17c0f67bb8 R12: ffff963143c1f800
[  284.369175] R13: 0000000004208160 R14: ffff963143f6f600 R15: ffff963143c1e000
[  284.369259] FS:  0000000000000000(0000) GS:ffff9631f7e00000(0000)
knlGS:0000000000000000
[  284.369319] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  284.369367] CR2: 0000000000000018 CR3: 0000000102d12004 CR4: 00000000007706f0
[  284.369458] PKRU: 55555554
[  284.369483] Call Trace:
[  284.369523]  <TASK>
[  284.369583]  process_one_work+0x1b0/0x3c0
[  284.369633]  worker_thread+0x34/0x360
[  284.369668]  ? __pfx_worker_thread+0x10/0x10
[  284.369704]  kthread+0xdb/0x110
[  284.369736]  ? __pfx_kthread+0x10/0x10
[  284.369767]  ret_from_fork+0x29/0x50
[  284.369808]  </TASK>
[  284.369843] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
dns_resolver nfsv3 nfs_acl nfs lockd grace fscache netfs ext4 mbcache
jbd2 loop rfkill vsock_loopback vmw_vsock_virtio_transport_common
vmw_vsock_vmci_transport vsock sunrpc vfat fat dm_multipath
intel_rapl_msr intel_rapl_common isst_if_mbox_msr isst_if_common nfit
libnvdimm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl
vmw_balloon joydev pcspkr vmw_vmci i2c_piix4 xfs libcrc32c vmwgfx
drm_ttm_helper ttm sr_mod cdrom drm_kms_helper sd_mod syscopyarea
sysfillrect ata_generic sysimgblt sg ata_piix ahci libahci
crc32c_intel serio_raw vmxnet3 libata drm vmw_pvscsi dm_mirror
dm_region_hash dm_log dm_mod nvme_tcp nvme_fabrics nvme_core t10_pi
crc64_rocksoft crc64 ipmi_devintf ipmi_msghandler fuse
[  284.370300] CR2: 0000000000000018


>
> BPF might also be a solution, but I'm not sure if it relies on
> debug info. Might be only BTF is needed.
>
>
> >>> [  677.275269] Code: 0f 1e fa 0f 1f 44 00 00 41 57 41 56 41 55 41 54
> >>> 55 53 48 83 ec 28 48 8b 6f 18 65 48 8b 04 25 28 00 00 00 48 89 44 24
> >>> 20 31 c0 <48> 8b 5d 30 48 89 6c 24 18 e8 97 15 5f ff 4c 8b 25 50 2e 70
> >>> 01 41
> >>> [  677.275474] RSP: 0018:ffffaed400f6fcf8 EFLAGS: 00010246
> >>> [  677.275570] RAX: 0000000000000000 RBX: ffff93f085b33800 RCX: 0000000000000000
> >>> [  677.275663] RDX: ffff93f085b33f08 RSI: ffffaed400f6fd18 RDI: ffff93f083aa6700
> >>> [  677.275756] RBP: 0000000000000000 R08: ffff93f137e32968 R09: 0000009d9663fa40
> >>> [  677.275850] R10: 0000000000000000 R11: 0000009d9663fa40 R12: ffff93f083aa6700
> >>> [  677.275943] R13: 0000000000000000 R14: ffff93f082999e00 R15: ffff93f085b33800
> >>> [  677.276049] FS:  0000000000000000(0000) GS:ffff93f137e00000(0000)
> >>> knlGS:0000000000000000
> >>> [  677.276154] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>> [  677.276247] CR2: 0000000000000030 CR3: 000000010797c002 CR4: 00000000007706f0
> >>> [  677.276355] PKRU: 55555554
> >>> [  677.276427] Call Trace:
> >>> [  677.276521]  <TASK>
> >>> [  677.276605]  ? wait_for_completion_interruptible_timeout+0xfa/0x170
> >>> [  677.276725]  xs_tls_handshake_sync+0x14f/0x170 [sunrpc]
> >>> [  677.276850]  ? __pfx_xs_tls_handshake_done+0x10/0x10 [sunrpc]
> >>> [  677.276967]  xs_tls_connect+0x14c/0x590 [sunrpc]
> >>> [  677.277086]  process_one_work+0x1b0/0x3c0
> >>> [  677.277209]  worker_thread+0x34/0x360
> >>> [  677.277303]  ? __pfx_worker_thread+0x10/0x10
> >>> [  677.277396]  kthread+0xdb/0x110
> >>> [  677.277527]  ? __pfx_kthread+0x10/0x10
> >>> [  677.277619]  ret_from_fork+0x29/0x50
> >>> [  677.277731]  </TASK>
> >>> [  677.277803] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
> >>> dns_resolver nfsv3 nfs_acl nfs lockd grace fscache netfs ext4 mbcache
> >>> jbd2 loop rfkill vsock_loopback vmw_vsock_virtio_transport_common
> >>> vmw_vsock_vmci_transport vsock sunrpc vfat fat dm_multipath
> >>> intel_rapl_msr intel_rapl_common isst_if_mbox_msr isst_if_common nfit
> >>> libnvdimm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl
> >>> vmw_balloon joydev pcspkr vmw_vmci i2c_piix4 xfs libcrc32c vmwgfx
> >>> drm_ttm_helper ttm sr_mod cdrom drm_kms_helper syscopyarea sysfillrect
> >>> sd_mod ata_generic sysimgblt sg ata_piix ahci libahci crc32c_intel
> >>> serio_raw drm vmxnet3 vmw_pvscsi libata dm_mirror dm_region_hash
> >>> dm_log dm_mod nvme_tcp nvme_fabrics nvme_core t10_pi crc64_rocksoft
> >>> crc64 ipmi_devintf ipmi_msghandler fuse
> >>> [  677.278383] CR2: 0000000000000030
> >>>
> >>> Network trace from the server:
> >>>
> >>> [kolga@scs000085536 ~/pcaps]$ tshark -t ad -r e0d_20230404_084311.trc0
> >>>   1 2023-04-04 08:43:28.027546 10.235.232.77 → 10.63.20.131 TCP 74
> >>> 727 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
> >>> TSval=4059503415 TSecr=0 WS=128
> >>>   2 2023-04-04 08:43:28.027645 10.63.20.131 → 10.235.232.77 TCP 74
> >>> 2049 → 727 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
> >>> SACK_PERM=1 TSval=2256845073 TSecr=4059503415
> >>>   3 2023-04-04 08:43:28.027813 10.235.232.77 → 10.63.20.131 TCP 66
> >>> 727 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=4059503415
> >>> TSecr=2256845073
> >>>   4 2023-04-04 08:43:28.027952 10.235.232.77 → 10.63.20.131 NFS 110
> >>> V4 NULL Call
> >>>   5 2023-04-04 08:43:28.028301 10.63.20.131 → 10.235.232.77 NFS 102
> >>> V4 NULL Reply (Call In 4)
> >>>   6 2023-04-04 08:43:28.028563 10.235.232.77 → 10.63.20.131 TCP 66
> >>> 727 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=4059503416
> >>> TSecr=2256845073
> >>>   7 2023-04-04 08:43:28.331181 10.63.20.131 → 10.235.232.77 TCP 66
> >>> 2049 → 727 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=2256845383
> >>> TSecr=4059503416
> >>>
> >>>
> >>>>
> >>>>
> >>>> --
> >>>> Chuck Lever
> >>
> >>
> >> --
> >> Chuck Lever
>
>
> --
> Chuck Lever
>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: kernel oops
  2023-04-04 16:15           ` Olga Kornievskaia
@ 2023-04-04 17:28             ` Chuck Lever III
  2023-04-04 19:11               ` Olga Kornievskaia
  0 siblings, 1 reply; 12+ messages in thread
From: Chuck Lever III @ 2023-04-04 17:28 UTC (permalink / raw)
  To: Olga Kornievskaia; +Cc: kernel-tls-handshake



> On Apr 4, 2023, at 12:15 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> 
> On Tue, Apr 4, 2023 at 10:54 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
>> 
>> 
>> 
>>> On Apr 4, 2023, at 10:44 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>> 
>>> On Tue, Apr 4, 2023 at 10:32 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
>>>> 
>>>>> On Apr 4, 2023, at 10:24 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>>> 
>>>>> On Fri, Mar 31, 2023 at 10:25 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On Mar 31, 2023, at 10:23 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>>>>> 
>>>>>>> Hi Chuck,
>>>>>>> 
>>>>>>> Have you seen this oops? This is not my testing so all I know was that
>>>>>>> it was an attempt to mount with xprtsec=tls (against ONTAP).
>>>>>>> 
>>>>>>> [  316.939747] BUG: kernel NULL pointer dereference, address: 0000000000000018
>>>>>>> [  316.939848] #PF: supervisor read access in kernel mode
>>>>>>> [  316.939914] #PF: error_code(0x0000) - not-present page
>>>>>>> [  316.939984] PGD 0 P4D 0
>>>>>>> [  316.940041] Oops: 0000 [#1] PREEMPT SMP PTI
>>>>>>> [  316.940109] CPU: 0 PID: 511 Comm: kworker/u2:30 Kdump: loaded Not
>>>>>>> tainted 6.2.1+ #2
>>>>>>> [  316.940181] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
>>>>>>> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
>>>>>>> [  316.940259] Workqueue: xprtiod xs_tls_connect [sunrpc]
>>>>>>> [  316.940463] RIP: 0010:xs_tls_connect+0x3e1/0x590 [sunrpc]
>>>>>>> [  316.940556] Code: ff ff ff 48 2b 83 f8 fe ff ff 48 89 83 00 ff ff
>>>>>>> ff e8 03 d3 ff ff e9 a8 fd ff ff 49 8b 97 f8 05 00 00 66 83 bb e0 f9
>>>>>>> ff ff 0a <48> 8b 6a 18 0f 84 fd 00 00 00 48 8d 72 18 4c 89 e7 48 89 14
>>>>>>> 24 e8
>>>>>>> [  316.940719] RSP: 0018:ffffa69d40f5fdc8 EFLAGS: 00010293
>>>>>>> [  316.940786] RAX: 0000000000000000 RBX: ffff9b0c46333e40 RCX: 0000000000000000
>>>>>>> [  316.940872] RDX: 0000000000000000 RSI: ffffa69d40f5fd20 RDI: 0000000000000000
>>>>>>> [  316.940940] RBP: ffff9b0c46350200 R08: ffff9b0cf7e31968 R09: 0000000000000159
>>>>>>> [  316.941008] R10: 0000000000000001 R11: 00000049c8887a40 R12: ffff9b0c46333800
>>>>>>> [  316.941075] R13: 0000000004208160 R14: ffff9b0c44e18a00 R15: ffff9b0c42dab800
>>>>>>> [  316.941159] FS:  0000000000000000(0000) GS:ffff9b0cf7e00000(0000)
>>>>>>> knlGS:0000000000000000
>>>>>>> [  316.941243] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>> [  316.941309] CR2: 0000000000000018 CR3: 0000000102e48004 CR4: 00000000007706f0
>>>>>>> [  316.941391] PKRU: 55555554
>>>>>>> [  316.941443] Call Trace:
>>>>>>> [  316.941513]  <TASK>
>>>>>>> [  316.941578]  process_one_work+0x1b0/0x3c0
>>>>>>> [  316.941689]  worker_thread+0x30/0x360
>>>>>>> [  316.941755]  ? __pfx_worker_thread+0x10/0x10
>>>>>>> [  316.941822]  kthread+0xd7/0x100
>>>>>>> [  316.941900]  ? __pfx_kthread+0x10/0x10
>>>>>>> [  316.941968]  ret_from_fork+0x29/0x50
>>>>>>> [  316.942049]  </TASK>
>>>>>>> [  316.942099] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
>>>>>>> dns_resolver ib_core nfsv3 nfs_acl nfs lockd grace fscache netfs ext4
>>>>>>> rfkill mbcache jbd2 loop vsock_loopback
>>>>>>> vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock
>>>>>>> sunrpc tls vfat fat dm_multipath intel_rapl_msr intel_rapl_common
>>>>>>> isst_if_mbox_msr isst_if_common nfit libnvdimm crct10dif_pclmul
>>>>>>> crc32_pclmul ghash_clmulni_intel rapl vmw_balloon joydev pcspkr
>>>>>>> i2c_piix4 vmw_vmci xfs libcrc32c vmwgfx drm_ttm_helper ttm sr_mod
>>>>>>> cdrom drm_kms_helper syscopyarea sd_mod sysfillrect ata_generic
>>>>>>> sysimgblt sg ahci libahci ata_piix crc32c_intel serio_raw drm vmxnet3
>>>>>>> vmw_pvscsi libata dm_mirror dm_region_hash dm_log dm_mod nvme_tcp
>>>>>>> nvme_fabrics nvme_core t10_pi crc64_rocksoft crc64 ipmi_devintf
>>>>>>> ipmi_msghandler fuse
>>>>>>> [  316.942550] CR2: 0000000000000018
>>>>>>> 
>>>>>>> Please note this was still on "commit
>>>>>>> caa52379f0046e974fd16585dd7ede624699d8e2 (HEAD -> 03152023-v7,
>>>>>>> origin/upcall-v7)". If you think somethings done to address it I'll
>>>>>>> have them update to the latest.
>>>>>> 
>>>>>> Update and retry. It has a familiar ring, but I can't say I've
>>>>>> seen this recently.
>>>>> 
>>>>> Hi Chuck,
>>>>> 
>>>>> I've updated the kernel code to the latest  commit
>>>>> 46588cade103b446f74a7ee160a83cd2a80399e7 (HEAD ->
>>>>> chuck-rpc-with-tls-04032023, chuck/topic-rpc-with-tls-upcall) but the
>>>>> client is still crashing but looks like in a different place. This
>>>>> happens when the server terminates the connection (immediately) as the
>>>>> tls handshake happens.
>>>> 
>>>> Why is the server doing that?
>>> 
>>> I'm not sure what kind of a reason you are looking for. I think the
>>> server can send an RST at any point in a TCP connection and the client
>>> shouldn't crash.
>> 
>> No doubt.
>> 
>> 
>>> But the real reason is probably because the server's
>>> code is getting developed and they ran into some error and resetting
>>> the connection.
>> 
>> That's what I was looking for. I wanted to know if it was
>> Linux NFSD doing this.
>> 
>> 
>>>>> Here's the stack trace:
>>>>> [  677.274299] BUG: kernel NULL pointer dereference, address: 0000000000000030
>>>>> [  677.274424] #PF: supervisor read access in kernel mode
>>>>> [  677.274517] #PF: error_code(0x0000) - not-present page
>>>>> [  677.274609] PGD 0 P4D 0
>>>>> [  677.274689] Oops: 0000 [#1] PREEMPT SMP PTI
>>>>> [  677.274785] CPU: 0 PID: 518 Comm: kworker/u2:29 Kdump: loaded Not
>>>>> tainted 6.3.0-rc4+ #13
>>>>> [  677.274881] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
>>>>> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
>>>>> [  677.274981] Workqueue: xprtiod xs_tls_connect [sunrpc]
>>>>> [  677.275137] RIP: 0010:handshake_req_cancel+0x2b/0x3b0
>>>> 
>>>> scripts/faddr2line on this RIP
>>>> 
>>>> Which pointer is NULL?
>>> 
>>> I apologize I don't have that info and I'm working on trying to figure
>>> out how to get this information to you. Lab kernels are built with
>>> CONFIG_DEBUG_INFO disabled and thus I can't do faddr2line. I will
>>> attempt to build on with debugging enabled (has been problematic
>>> before for the lab machines so not sure if that'll be successful I
>>> might have to hack the linux kernel to simulate the reset).
>> 
>> Jakub asked that NULL checking be removed from handshake_req_cancel,
>> or at least that's what I read into his comment. That choice is
>> definitely flushing bugs.
>> 
>> So this crasher is familiar. req_cancel is getting called in some
>> scenario where the socket or request is partially filled in. We
>> have to figure out how that's happening.
>> 
>> If debug info is a problem, then just add a sarcastic quantity of
>> NULL checks in handshake_req_cancel, rebuild, and it should tell
>> you exactly which pointer is bad.
> 
> I added a check for sock-sk null and if so return true.

If it returns true, xs_tls_handshake_sync() probably completes
successfully.

Perhaps handshake_req_cancel needs to return false in cases
where it can't find a pending request.


> I thought that
> would "prevent" the crash but it then proceeded to crash elsewhere.
> Sorry I'm still on non-debug-symbol kernel (as I thought I'd add the
> checks that that'd be enough :)).
> 
> [  284.368211] AGLO: handshake_req_cancel sock->sk is NULL
> [  284.368249] BUG: kernel NULL pointer dereference, address: 0000000000000018
> [  284.368320] #PF: supervisor read access in kernel mode
> [  284.368363] #PF: error_code(0x0000) - not-present page
> [  284.368411] PGD 0 P4D 0
> [  284.368438] Oops: 0000 [#1] PREEMPT SMP PTI
> [  284.368473] CPU: 0 PID: 521 Comm: kworker/u2:30 Kdump: loaded Not
> tainted 6.3.0-rc4+ #14
> [  284.368523] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
> [  284.368580] Workqueue: xprtiod xs_tls_connect [sunrpc]
> [  284.368720] RIP: 0010:xs_tls_connect+0x3e5/0x590 [sunrpc]
> [  284.368844] Code: ff ff ff 48 2b 83 f8 fe ff ff 48 89 83 00 ff ff
> ff e8 bf d2 ff ff e9 a8 fd ff ff 49 8b 97 f8 05 00 00 66 83 bb e0 f9
> ff ff 0a <48> 8b 6a 18 0f 84 fd 00 00 00 48 8d 72 18 4c 89 e7 48 89 14
> 24 e8
> [  284.368973] RSP: 0018:ffffba17c0f67dc8 EFLAGS: 00010293
> [  284.369010] RAX: 0000000000000000 RBX: ffff963143c1fe40 RCX: 0000000000000000
> [  284.369051] RDX: 0000000000000000 RSI: ffff9631f7e20840 RDI: ffff9631f7e20840
> [  284.369091] RBP: ffff9631463d8a00 R08: 0000000000000000 R09: c0000000ffff7fff
> [  284.369130] R10: 0000000000000001 R11: ffffba17c0f67bb8 R12: ffff963143c1f800
> [  284.369175] R13: 0000000004208160 R14: ffff963143f6f600 R15: ffff963143c1e000
> [  284.369259] FS:  0000000000000000(0000) GS:ffff9631f7e00000(0000)
> knlGS:0000000000000000
> [  284.369319] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  284.369367] CR2: 0000000000000018 CR3: 0000000102d12004 CR4: 00000000007706f0
> [  284.369458] PKRU: 55555554
> [  284.369483] Call Trace:
> [  284.369523]  <TASK>
> [  284.369583]  process_one_work+0x1b0/0x3c0
> [  284.369633]  worker_thread+0x34/0x360
> [  284.369668]  ? __pfx_worker_thread+0x10/0x10
> [  284.369704]  kthread+0xdb/0x110
> [  284.369736]  ? __pfx_kthread+0x10/0x10
> [  284.369767]  ret_from_fork+0x29/0x50
> [  284.369808]  </TASK>
> [  284.369843] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
> dns_resolver nfsv3 nfs_acl nfs lockd grace fscache netfs ext4 mbcache
> jbd2 loop rfkill vsock_loopback vmw_vsock_virtio_transport_common
> vmw_vsock_vmci_transport vsock sunrpc vfat fat dm_multipath
> intel_rapl_msr intel_rapl_common isst_if_mbox_msr isst_if_common nfit
> libnvdimm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl
> vmw_balloon joydev pcspkr vmw_vmci i2c_piix4 xfs libcrc32c vmwgfx
> drm_ttm_helper ttm sr_mod cdrom drm_kms_helper sd_mod syscopyarea
> sysfillrect ata_generic sysimgblt sg ata_piix ahci libahci
> crc32c_intel serio_raw vmxnet3 libata drm vmw_pvscsi dm_mirror
> dm_region_hash dm_log dm_mod nvme_tcp nvme_fabrics nvme_core t10_pi
> crc64_rocksoft crc64 ipmi_devintf ipmi_msghandler fuse
> [  284.370300] CR2: 0000000000000018
> 
> 
>> 
>> BPF might also be a solution, but I'm not sure if it relies on
>> debug info. Might be only BTF is needed.
>> 
>> 
>>>>> [  677.275269] Code: 0f 1e fa 0f 1f 44 00 00 41 57 41 56 41 55 41 54
>>>>> 55 53 48 83 ec 28 48 8b 6f 18 65 48 8b 04 25 28 00 00 00 48 89 44 24
>>>>> 20 31 c0 <48> 8b 5d 30 48 89 6c 24 18 e8 97 15 5f ff 4c 8b 25 50 2e 70
>>>>> 01 41
>>>>> [  677.275474] RSP: 0018:ffffaed400f6fcf8 EFLAGS: 00010246
>>>>> [  677.275570] RAX: 0000000000000000 RBX: ffff93f085b33800 RCX: 0000000000000000
>>>>> [  677.275663] RDX: ffff93f085b33f08 RSI: ffffaed400f6fd18 RDI: ffff93f083aa6700
>>>>> [  677.275756] RBP: 0000000000000000 R08: ffff93f137e32968 R09: 0000009d9663fa40
>>>>> [  677.275850] R10: 0000000000000000 R11: 0000009d9663fa40 R12: ffff93f083aa6700
>>>>> [  677.275943] R13: 0000000000000000 R14: ffff93f082999e00 R15: ffff93f085b33800
>>>>> [  677.276049] FS:  0000000000000000(0000) GS:ffff93f137e00000(0000)
>>>>> knlGS:0000000000000000
>>>>> [  677.276154] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [  677.276247] CR2: 0000000000000030 CR3: 000000010797c002 CR4: 00000000007706f0
>>>>> [  677.276355] PKRU: 55555554
>>>>> [  677.276427] Call Trace:
>>>>> [  677.276521]  <TASK>
>>>>> [  677.276605]  ? wait_for_completion_interruptible_timeout+0xfa/0x170
>>>>> [  677.276725]  xs_tls_handshake_sync+0x14f/0x170 [sunrpc]
>>>>> [  677.276850]  ? __pfx_xs_tls_handshake_done+0x10/0x10 [sunrpc]
>>>>> [  677.276967]  xs_tls_connect+0x14c/0x590 [sunrpc]
>>>>> [  677.277086]  process_one_work+0x1b0/0x3c0
>>>>> [  677.277209]  worker_thread+0x34/0x360
>>>>> [  677.277303]  ? __pfx_worker_thread+0x10/0x10
>>>>> [  677.277396]  kthread+0xdb/0x110
>>>>> [  677.277527]  ? __pfx_kthread+0x10/0x10
>>>>> [  677.277619]  ret_from_fork+0x29/0x50
>>>>> [  677.277731]  </TASK>
>>>>> [  677.277803] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
>>>>> dns_resolver nfsv3 nfs_acl nfs lockd grace fscache netfs ext4 mbcache
>>>>> jbd2 loop rfkill vsock_loopback vmw_vsock_virtio_transport_common
>>>>> vmw_vsock_vmci_transport vsock sunrpc vfat fat dm_multipath
>>>>> intel_rapl_msr intel_rapl_common isst_if_mbox_msr isst_if_common nfit
>>>>> libnvdimm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl
>>>>> vmw_balloon joydev pcspkr vmw_vmci i2c_piix4 xfs libcrc32c vmwgfx
>>>>> drm_ttm_helper ttm sr_mod cdrom drm_kms_helper syscopyarea sysfillrect
>>>>> sd_mod ata_generic sysimgblt sg ata_piix ahci libahci crc32c_intel
>>>>> serio_raw drm vmxnet3 vmw_pvscsi libata dm_mirror dm_region_hash
>>>>> dm_log dm_mod nvme_tcp nvme_fabrics nvme_core t10_pi crc64_rocksoft
>>>>> crc64 ipmi_devintf ipmi_msghandler fuse
>>>>> [  677.278383] CR2: 0000000000000030
>>>>> 
>>>>> Network trace from the server:
>>>>> 
>>>>> [kolga@scs000085536 ~/pcaps]$ tshark -t ad -r e0d_20230404_084311.trc0
>>>>>  1 2023-04-04 08:43:28.027546 10.235.232.77 → 10.63.20.131 TCP 74
>>>>> 727 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
>>>>> TSval=4059503415 TSecr=0 WS=128
>>>>>  2 2023-04-04 08:43:28.027645 10.63.20.131 → 10.235.232.77 TCP 74
>>>>> 2049 → 727 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
>>>>> SACK_PERM=1 TSval=2256845073 TSecr=4059503415
>>>>>  3 2023-04-04 08:43:28.027813 10.235.232.77 → 10.63.20.131 TCP 66
>>>>> 727 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=4059503415
>>>>> TSecr=2256845073
>>>>>  4 2023-04-04 08:43:28.027952 10.235.232.77 → 10.63.20.131 NFS 110
>>>>> V4 NULL Call
>>>>>  5 2023-04-04 08:43:28.028301 10.63.20.131 → 10.235.232.77 NFS 102
>>>>> V4 NULL Reply (Call In 4)
>>>>>  6 2023-04-04 08:43:28.028563 10.235.232.77 → 10.63.20.131 TCP 66
>>>>> 727 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=4059503416
>>>>> TSecr=2256845073
>>>>>  7 2023-04-04 08:43:28.331181 10.63.20.131 → 10.235.232.77 TCP 66
>>>>> 2049 → 727 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=2256845383
>>>>> TSecr=4059503416
>>>>> 
>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Chuck Lever
>>>> 
>>>> 
>>>> --
>>>> Chuck Lever
>> 
>> 
>> --
>> Chuck Lever


--
Chuck Lever



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: kernel oops
  2023-04-04 17:28             ` Chuck Lever III
@ 2023-04-04 19:11               ` Olga Kornievskaia
  2023-04-04 19:14                 ` Chuck Lever III
  0 siblings, 1 reply; 12+ messages in thread
From: Olga Kornievskaia @ 2023-04-04 19:11 UTC (permalink / raw)
  To: Chuck Lever III; +Cc: kernel-tls-handshake

On Tue, Apr 4, 2023 at 1:28 PM Chuck Lever III <chuck.lever@oracle.com> wrote:
>
>
>
> > On Apr 4, 2023, at 12:15 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> >
> > On Tue, Apr 4, 2023 at 10:54 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
> >>
> >>
> >>
> >>> On Apr 4, 2023, at 10:44 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
> >>>
> >>> On Tue, Apr 4, 2023 at 10:32 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
> >>>>
> >>>>> On Apr 4, 2023, at 10:24 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
> >>>>>
> >>>>> On Fri, Mar 31, 2023 at 10:25 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>> On Mar 31, 2023, at 10:23 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
> >>>>>>>
> >>>>>>> Hi Chuck,
> >>>>>>>
> >>>>>>> Have you seen this oops? This is not my testing so all I know was that
> >>>>>>> it was an attempt to mount with xprtsec=tls (against ONTAP).
> >>>>>>>
> >>>>>>> [  316.939747] BUG: kernel NULL pointer dereference, address: 0000000000000018
> >>>>>>> [  316.939848] #PF: supervisor read access in kernel mode
> >>>>>>> [  316.939914] #PF: error_code(0x0000) - not-present page
> >>>>>>> [  316.939984] PGD 0 P4D 0
> >>>>>>> [  316.940041] Oops: 0000 [#1] PREEMPT SMP PTI
> >>>>>>> [  316.940109] CPU: 0 PID: 511 Comm: kworker/u2:30 Kdump: loaded Not
> >>>>>>> tainted 6.2.1+ #2
> >>>>>>> [  316.940181] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
> >>>>>>> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
> >>>>>>> [  316.940259] Workqueue: xprtiod xs_tls_connect [sunrpc]
> >>>>>>> [  316.940463] RIP: 0010:xs_tls_connect+0x3e1/0x590 [sunrpc]
> >>>>>>> [  316.940556] Code: ff ff ff 48 2b 83 f8 fe ff ff 48 89 83 00 ff ff
> >>>>>>> ff e8 03 d3 ff ff e9 a8 fd ff ff 49 8b 97 f8 05 00 00 66 83 bb e0 f9
> >>>>>>> ff ff 0a <48> 8b 6a 18 0f 84 fd 00 00 00 48 8d 72 18 4c 89 e7 48 89 14
> >>>>>>> 24 e8
> >>>>>>> [  316.940719] RSP: 0018:ffffa69d40f5fdc8 EFLAGS: 00010293
> >>>>>>> [  316.940786] RAX: 0000000000000000 RBX: ffff9b0c46333e40 RCX: 0000000000000000
> >>>>>>> [  316.940872] RDX: 0000000000000000 RSI: ffffa69d40f5fd20 RDI: 0000000000000000
> >>>>>>> [  316.940940] RBP: ffff9b0c46350200 R08: ffff9b0cf7e31968 R09: 0000000000000159
> >>>>>>> [  316.941008] R10: 0000000000000001 R11: 00000049c8887a40 R12: ffff9b0c46333800
> >>>>>>> [  316.941075] R13: 0000000004208160 R14: ffff9b0c44e18a00 R15: ffff9b0c42dab800
> >>>>>>> [  316.941159] FS:  0000000000000000(0000) GS:ffff9b0cf7e00000(0000)
> >>>>>>> knlGS:0000000000000000
> >>>>>>> [  316.941243] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>>>>>> [  316.941309] CR2: 0000000000000018 CR3: 0000000102e48004 CR4: 00000000007706f0
> >>>>>>> [  316.941391] PKRU: 55555554
> >>>>>>> [  316.941443] Call Trace:
> >>>>>>> [  316.941513]  <TASK>
> >>>>>>> [  316.941578]  process_one_work+0x1b0/0x3c0
> >>>>>>> [  316.941689]  worker_thread+0x30/0x360
> >>>>>>> [  316.941755]  ? __pfx_worker_thread+0x10/0x10
> >>>>>>> [  316.941822]  kthread+0xd7/0x100
> >>>>>>> [  316.941900]  ? __pfx_kthread+0x10/0x10
> >>>>>>> [  316.941968]  ret_from_fork+0x29/0x50
> >>>>>>> [  316.942049]  </TASK>
> >>>>>>> [  316.942099] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
> >>>>>>> dns_resolver ib_core nfsv3 nfs_acl nfs lockd grace fscache netfs ext4
> >>>>>>> rfkill mbcache jbd2 loop vsock_loopback
> >>>>>>> vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock
> >>>>>>> sunrpc tls vfat fat dm_multipath intel_rapl_msr intel_rapl_common
> >>>>>>> isst_if_mbox_msr isst_if_common nfit libnvdimm crct10dif_pclmul
> >>>>>>> crc32_pclmul ghash_clmulni_intel rapl vmw_balloon joydev pcspkr
> >>>>>>> i2c_piix4 vmw_vmci xfs libcrc32c vmwgfx drm_ttm_helper ttm sr_mod
> >>>>>>> cdrom drm_kms_helper syscopyarea sd_mod sysfillrect ata_generic
> >>>>>>> sysimgblt sg ahci libahci ata_piix crc32c_intel serio_raw drm vmxnet3
> >>>>>>> vmw_pvscsi libata dm_mirror dm_region_hash dm_log dm_mod nvme_tcp
> >>>>>>> nvme_fabrics nvme_core t10_pi crc64_rocksoft crc64 ipmi_devintf
> >>>>>>> ipmi_msghandler fuse
> >>>>>>> [  316.942550] CR2: 0000000000000018
> >>>>>>>
> >>>>>>> Please note this was still on "commit
> >>>>>>> caa52379f0046e974fd16585dd7ede624699d8e2 (HEAD -> 03152023-v7,
> >>>>>>> origin/upcall-v7)". If you think somethings done to address it I'll
> >>>>>>> have them update to the latest.
> >>>>>>
> >>>>>> Update and retry. It has a familiar ring, but I can't say I've
> >>>>>> seen this recently.
> >>>>>
> >>>>> Hi Chuck,
> >>>>>
> >>>>> I've updated the kernel code to the latest  commit
> >>>>> 46588cade103b446f74a7ee160a83cd2a80399e7 (HEAD ->
> >>>>> chuck-rpc-with-tls-04032023, chuck/topic-rpc-with-tls-upcall) but the
> >>>>> client is still crashing but looks like in a different place. This
> >>>>> happens when the server terminates the connection (immediately) as the
> >>>>> tls handshake happens.
> >>>>
> >>>> Why is the server doing that?
> >>>
> >>> I'm not sure what kind of a reason you are looking for. I think the
> >>> server can send an RST at any point in a TCP connection and the client
> >>> shouldn't crash.
> >>
> >> No doubt.
> >>
> >>
> >>> But the real reason is probably because the server's
> >>> code is getting developed and they ran into some error and resetting
> >>> the connection.
> >>
> >> That's what I was looking for. I wanted to know if it was
> >> Linux NFSD doing this.
> >>
> >>
> >>>>> Here's the stack trace:
> >>>>> [  677.274299] BUG: kernel NULL pointer dereference, address: 0000000000000030
> >>>>> [  677.274424] #PF: supervisor read access in kernel mode
> >>>>> [  677.274517] #PF: error_code(0x0000) - not-present page
> >>>>> [  677.274609] PGD 0 P4D 0
> >>>>> [  677.274689] Oops: 0000 [#1] PREEMPT SMP PTI
> >>>>> [  677.274785] CPU: 0 PID: 518 Comm: kworker/u2:29 Kdump: loaded Not
> >>>>> tainted 6.3.0-rc4+ #13
> >>>>> [  677.274881] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
> >>>>> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
> >>>>> [  677.274981] Workqueue: xprtiod xs_tls_connect [sunrpc]
> >>>>> [  677.275137] RIP: 0010:handshake_req_cancel+0x2b/0x3b0
> >>>>
> >>>> scripts/faddr2line on this RIP
> >>>>
> >>>> Which pointer is NULL?
> >>>
> >>> I apologize I don't have that info and I'm working on trying to figure
> >>> out how to get this information to you. Lab kernels are built with
> >>> CONFIG_DEBUG_INFO disabled and thus I can't do faddr2line. I will
> >>> attempt to build on with debugging enabled (has been problematic
> >>> before for the lab machines so not sure if that'll be successful I
> >>> might have to hack the linux kernel to simulate the reset).
> >>
> >> Jakub asked that NULL checking be removed from handshake_req_cancel,
> >> or at least that's what I read into his comment. That choice is
> >> definitely flushing bugs.
> >>
> >> So this crasher is familiar. req_cancel is getting called in some
> >> scenario where the socket or request is partially filled in. We
> >> have to figure out how that's happening.
> >>
> >> If debug info is a problem, then just add a sarcastic quantity of
> >> NULL checks in handshake_req_cancel, rebuild, and it should tell
> >> you exactly which pointer is bad.
> >
> > I added a check for sock-sk null and if so return true.
>
> If it returns true, xs_tls_handshake_sync() probably completes
> successfully.
>
> Perhaps handshake_req_cancel needs to return false in cases
> where it can't find a pending request.

I'm not sure if you were suggesting me to do this but I tried:
if (!sock->sk) return false;
and that led to a mount not crashing but timing out.

>
>
> > I thought that
> > would "prevent" the crash but it then proceeded to crash elsewhere.
> > Sorry I'm still on non-debug-symbol kernel (as I thought I'd add the
> > checks that that'd be enough :)).
> >
> > [  284.368211] AGLO: handshake_req_cancel sock->sk is NULL
> > [  284.368249] BUG: kernel NULL pointer dereference, address: 0000000000000018
> > [  284.368320] #PF: supervisor read access in kernel mode
> > [  284.368363] #PF: error_code(0x0000) - not-present page
> > [  284.368411] PGD 0 P4D 0
> > [  284.368438] Oops: 0000 [#1] PREEMPT SMP PTI
> > [  284.368473] CPU: 0 PID: 521 Comm: kworker/u2:30 Kdump: loaded Not
> > tainted 6.3.0-rc4+ #14
> > [  284.368523] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
> > Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
> > [  284.368580] Workqueue: xprtiod xs_tls_connect [sunrpc]
> > [  284.368720] RIP: 0010:xs_tls_connect+0x3e5/0x590 [sunrpc]
> > [  284.368844] Code: ff ff ff 48 2b 83 f8 fe ff ff 48 89 83 00 ff ff
> > ff e8 bf d2 ff ff e9 a8 fd ff ff 49 8b 97 f8 05 00 00 66 83 bb e0 f9
> > ff ff 0a <48> 8b 6a 18 0f 84 fd 00 00 00 48 8d 72 18 4c 89 e7 48 89 14
> > 24 e8
> > [  284.368973] RSP: 0018:ffffba17c0f67dc8 EFLAGS: 00010293
> > [  284.369010] RAX: 0000000000000000 RBX: ffff963143c1fe40 RCX: 0000000000000000
> > [  284.369051] RDX: 0000000000000000 RSI: ffff9631f7e20840 RDI: ffff9631f7e20840
> > [  284.369091] RBP: ffff9631463d8a00 R08: 0000000000000000 R09: c0000000ffff7fff
> > [  284.369130] R10: 0000000000000001 R11: ffffba17c0f67bb8 R12: ffff963143c1f800
> > [  284.369175] R13: 0000000004208160 R14: ffff963143f6f600 R15: ffff963143c1e000
> > [  284.369259] FS:  0000000000000000(0000) GS:ffff9631f7e00000(0000)
> > knlGS:0000000000000000
> > [  284.369319] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  284.369367] CR2: 0000000000000018 CR3: 0000000102d12004 CR4: 00000000007706f0
> > [  284.369458] PKRU: 55555554
> > [  284.369483] Call Trace:
> > [  284.369523]  <TASK>
> > [  284.369583]  process_one_work+0x1b0/0x3c0
> > [  284.369633]  worker_thread+0x34/0x360
> > [  284.369668]  ? __pfx_worker_thread+0x10/0x10
> > [  284.369704]  kthread+0xdb/0x110
> > [  284.369736]  ? __pfx_kthread+0x10/0x10
> > [  284.369767]  ret_from_fork+0x29/0x50
> > [  284.369808]  </TASK>
> > [  284.369843] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
> > dns_resolver nfsv3 nfs_acl nfs lockd grace fscache netfs ext4 mbcache
> > jbd2 loop rfkill vsock_loopback vmw_vsock_virtio_transport_common
> > vmw_vsock_vmci_transport vsock sunrpc vfat fat dm_multipath
> > intel_rapl_msr intel_rapl_common isst_if_mbox_msr isst_if_common nfit
> > libnvdimm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl
> > vmw_balloon joydev pcspkr vmw_vmci i2c_piix4 xfs libcrc32c vmwgfx
> > drm_ttm_helper ttm sr_mod cdrom drm_kms_helper sd_mod syscopyarea
> > sysfillrect ata_generic sysimgblt sg ata_piix ahci libahci
> > crc32c_intel serio_raw vmxnet3 libata drm vmw_pvscsi dm_mirror
> > dm_region_hash dm_log dm_mod nvme_tcp nvme_fabrics nvme_core t10_pi
> > crc64_rocksoft crc64 ipmi_devintf ipmi_msghandler fuse
> > [  284.370300] CR2: 0000000000000018
> >
> >
> >>
> >> BPF might also be a solution, but I'm not sure if it relies on
> >> debug info. Might be only BTF is needed.
> >>
> >>
> >>>>> [  677.275269] Code: 0f 1e fa 0f 1f 44 00 00 41 57 41 56 41 55 41 54
> >>>>> 55 53 48 83 ec 28 48 8b 6f 18 65 48 8b 04 25 28 00 00 00 48 89 44 24
> >>>>> 20 31 c0 <48> 8b 5d 30 48 89 6c 24 18 e8 97 15 5f ff 4c 8b 25 50 2e 70
> >>>>> 01 41
> >>>>> [  677.275474] RSP: 0018:ffffaed400f6fcf8 EFLAGS: 00010246
> >>>>> [  677.275570] RAX: 0000000000000000 RBX: ffff93f085b33800 RCX: 0000000000000000
> >>>>> [  677.275663] RDX: ffff93f085b33f08 RSI: ffffaed400f6fd18 RDI: ffff93f083aa6700
> >>>>> [  677.275756] RBP: 0000000000000000 R08: ffff93f137e32968 R09: 0000009d9663fa40
> >>>>> [  677.275850] R10: 0000000000000000 R11: 0000009d9663fa40 R12: ffff93f083aa6700
> >>>>> [  677.275943] R13: 0000000000000000 R14: ffff93f082999e00 R15: ffff93f085b33800
> >>>>> [  677.276049] FS:  0000000000000000(0000) GS:ffff93f137e00000(0000)
> >>>>> knlGS:0000000000000000
> >>>>> [  677.276154] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>>>> [  677.276247] CR2: 0000000000000030 CR3: 000000010797c002 CR4: 00000000007706f0
> >>>>> [  677.276355] PKRU: 55555554
> >>>>> [  677.276427] Call Trace:
> >>>>> [  677.276521]  <TASK>
> >>>>> [  677.276605]  ? wait_for_completion_interruptible_timeout+0xfa/0x170
> >>>>> [  677.276725]  xs_tls_handshake_sync+0x14f/0x170 [sunrpc]
> >>>>> [  677.276850]  ? __pfx_xs_tls_handshake_done+0x10/0x10 [sunrpc]
> >>>>> [  677.276967]  xs_tls_connect+0x14c/0x590 [sunrpc]
> >>>>> [  677.277086]  process_one_work+0x1b0/0x3c0
> >>>>> [  677.277209]  worker_thread+0x34/0x360
> >>>>> [  677.277303]  ? __pfx_worker_thread+0x10/0x10
> >>>>> [  677.277396]  kthread+0xdb/0x110
> >>>>> [  677.277527]  ? __pfx_kthread+0x10/0x10
> >>>>> [  677.277619]  ret_from_fork+0x29/0x50
> >>>>> [  677.277731]  </TASK>
> >>>>> [  677.277803] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
> >>>>> dns_resolver nfsv3 nfs_acl nfs lockd grace fscache netfs ext4 mbcache
> >>>>> jbd2 loop rfkill vsock_loopback vmw_vsock_virtio_transport_common
> >>>>> vmw_vsock_vmci_transport vsock sunrpc vfat fat dm_multipath
> >>>>> intel_rapl_msr intel_rapl_common isst_if_mbox_msr isst_if_common nfit
> >>>>> libnvdimm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl
> >>>>> vmw_balloon joydev pcspkr vmw_vmci i2c_piix4 xfs libcrc32c vmwgfx
> >>>>> drm_ttm_helper ttm sr_mod cdrom drm_kms_helper syscopyarea sysfillrect
> >>>>> sd_mod ata_generic sysimgblt sg ata_piix ahci libahci crc32c_intel
> >>>>> serio_raw drm vmxnet3 vmw_pvscsi libata dm_mirror dm_region_hash
> >>>>> dm_log dm_mod nvme_tcp nvme_fabrics nvme_core t10_pi crc64_rocksoft
> >>>>> crc64 ipmi_devintf ipmi_msghandler fuse
> >>>>> [  677.278383] CR2: 0000000000000030
> >>>>>
> >>>>> Network trace from the server:
> >>>>>
> >>>>> [kolga@scs000085536 ~/pcaps]$ tshark -t ad -r e0d_20230404_084311.trc0
> >>>>>  1 2023-04-04 08:43:28.027546 10.235.232.77 → 10.63.20.131 TCP 74
> >>>>> 727 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
> >>>>> TSval=4059503415 TSecr=0 WS=128
> >>>>>  2 2023-04-04 08:43:28.027645 10.63.20.131 → 10.235.232.77 TCP 74
> >>>>> 2049 → 727 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
> >>>>> SACK_PERM=1 TSval=2256845073 TSecr=4059503415
> >>>>>  3 2023-04-04 08:43:28.027813 10.235.232.77 → 10.63.20.131 TCP 66
> >>>>> 727 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=4059503415
> >>>>> TSecr=2256845073
> >>>>>  4 2023-04-04 08:43:28.027952 10.235.232.77 → 10.63.20.131 NFS 110
> >>>>> V4 NULL Call
> >>>>>  5 2023-04-04 08:43:28.028301 10.63.20.131 → 10.235.232.77 NFS 102
> >>>>> V4 NULL Reply (Call In 4)
> >>>>>  6 2023-04-04 08:43:28.028563 10.235.232.77 → 10.63.20.131 TCP 66
> >>>>> 727 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=4059503416
> >>>>> TSecr=2256845073
> >>>>>  7 2023-04-04 08:43:28.331181 10.63.20.131 → 10.235.232.77 TCP 66
> >>>>> 2049 → 727 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=2256845383
> >>>>> TSecr=4059503416
> >>>>>
> >>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Chuck Lever
> >>>>
> >>>>
> >>>> --
> >>>> Chuck Lever
> >>
> >>
> >> --
> >> Chuck Lever
>
>
> --
> Chuck Lever
>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: kernel oops
  2023-04-04 19:11               ` Olga Kornievskaia
@ 2023-04-04 19:14                 ` Chuck Lever III
  2023-04-04 19:26                   ` Olga Kornievskaia
  0 siblings, 1 reply; 12+ messages in thread
From: Chuck Lever III @ 2023-04-04 19:14 UTC (permalink / raw)
  To: Olga Kornievskaia; +Cc: kernel-tls-handshake



> On Apr 4, 2023, at 3:11 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> 
> On Tue, Apr 4, 2023 at 1:28 PM Chuck Lever III <chuck.lever@oracle.com> wrote:
>> 
>> 
>> 
>>> On Apr 4, 2023, at 12:15 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>> 
>>> On Tue, Apr 4, 2023 at 10:54 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
>>>> 
>>>> 
>>>> 
>>>>> On Apr 4, 2023, at 10:44 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>>> 
>>>>> On Tue, Apr 4, 2023 at 10:32 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
>>>>>> 
>>>>>>> On Apr 4, 2023, at 10:24 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>>>>> 
>>>>>>> On Fri, Mar 31, 2023 at 10:25 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On Mar 31, 2023, at 10:23 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>>>>>>> 
>>>>>>>>> Hi Chuck,
>>>>>>>>> 
>>>>>>>>> Have you seen this oops? This is not my testing so all I know was that
>>>>>>>>> it was an attempt to mount with xprtsec=tls (against ONTAP).
>>>>>>>>> 
>>>>>>>>> [  316.939747] BUG: kernel NULL pointer dereference, address: 0000000000000018
>>>>>>>>> [  316.939848] #PF: supervisor read access in kernel mode
>>>>>>>>> [  316.939914] #PF: error_code(0x0000) - not-present page
>>>>>>>>> [  316.939984] PGD 0 P4D 0
>>>>>>>>> [  316.940041] Oops: 0000 [#1] PREEMPT SMP PTI
>>>>>>>>> [  316.940109] CPU: 0 PID: 511 Comm: kworker/u2:30 Kdump: loaded Not
>>>>>>>>> tainted 6.2.1+ #2
>>>>>>>>> [  316.940181] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
>>>>>>>>> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
>>>>>>>>> [  316.940259] Workqueue: xprtiod xs_tls_connect [sunrpc]
>>>>>>>>> [  316.940463] RIP: 0010:xs_tls_connect+0x3e1/0x590 [sunrpc]
>>>>>>>>> [  316.940556] Code: ff ff ff 48 2b 83 f8 fe ff ff 48 89 83 00 ff ff
>>>>>>>>> ff e8 03 d3 ff ff e9 a8 fd ff ff 49 8b 97 f8 05 00 00 66 83 bb e0 f9
>>>>>>>>> ff ff 0a <48> 8b 6a 18 0f 84 fd 00 00 00 48 8d 72 18 4c 89 e7 48 89 14
>>>>>>>>> 24 e8
>>>>>>>>> [  316.940719] RSP: 0018:ffffa69d40f5fdc8 EFLAGS: 00010293
>>>>>>>>> [  316.940786] RAX: 0000000000000000 RBX: ffff9b0c46333e40 RCX: 0000000000000000
>>>>>>>>> [  316.940872] RDX: 0000000000000000 RSI: ffffa69d40f5fd20 RDI: 0000000000000000
>>>>>>>>> [  316.940940] RBP: ffff9b0c46350200 R08: ffff9b0cf7e31968 R09: 0000000000000159
>>>>>>>>> [  316.941008] R10: 0000000000000001 R11: 00000049c8887a40 R12: ffff9b0c46333800
>>>>>>>>> [  316.941075] R13: 0000000004208160 R14: ffff9b0c44e18a00 R15: ffff9b0c42dab800
>>>>>>>>> [  316.941159] FS:  0000000000000000(0000) GS:ffff9b0cf7e00000(0000)
>>>>>>>>> knlGS:0000000000000000
>>>>>>>>> [  316.941243] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>>>> [  316.941309] CR2: 0000000000000018 CR3: 0000000102e48004 CR4: 00000000007706f0
>>>>>>>>> [  316.941391] PKRU: 55555554
>>>>>>>>> [  316.941443] Call Trace:
>>>>>>>>> [  316.941513]  <TASK>
>>>>>>>>> [  316.941578]  process_one_work+0x1b0/0x3c0
>>>>>>>>> [  316.941689]  worker_thread+0x30/0x360
>>>>>>>>> [  316.941755]  ? __pfx_worker_thread+0x10/0x10
>>>>>>>>> [  316.941822]  kthread+0xd7/0x100
>>>>>>>>> [  316.941900]  ? __pfx_kthread+0x10/0x10
>>>>>>>>> [  316.941968]  ret_from_fork+0x29/0x50
>>>>>>>>> [  316.942049]  </TASK>
>>>>>>>>> [  316.942099] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
>>>>>>>>> dns_resolver ib_core nfsv3 nfs_acl nfs lockd grace fscache netfs ext4
>>>>>>>>> rfkill mbcache jbd2 loop vsock_loopback
>>>>>>>>> vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock
>>>>>>>>> sunrpc tls vfat fat dm_multipath intel_rapl_msr intel_rapl_common
>>>>>>>>> isst_if_mbox_msr isst_if_common nfit libnvdimm crct10dif_pclmul
>>>>>>>>> crc32_pclmul ghash_clmulni_intel rapl vmw_balloon joydev pcspkr
>>>>>>>>> i2c_piix4 vmw_vmci xfs libcrc32c vmwgfx drm_ttm_helper ttm sr_mod
>>>>>>>>> cdrom drm_kms_helper syscopyarea sd_mod sysfillrect ata_generic
>>>>>>>>> sysimgblt sg ahci libahci ata_piix crc32c_intel serio_raw drm vmxnet3
>>>>>>>>> vmw_pvscsi libata dm_mirror dm_region_hash dm_log dm_mod nvme_tcp
>>>>>>>>> nvme_fabrics nvme_core t10_pi crc64_rocksoft crc64 ipmi_devintf
>>>>>>>>> ipmi_msghandler fuse
>>>>>>>>> [  316.942550] CR2: 0000000000000018
>>>>>>>>> 
>>>>>>>>> Please note this was still on "commit
>>>>>>>>> caa52379f0046e974fd16585dd7ede624699d8e2 (HEAD -> 03152023-v7,
>>>>>>>>> origin/upcall-v7)". If you think somethings done to address it I'll
>>>>>>>>> have them update to the latest.
>>>>>>>> 
>>>>>>>> Update and retry. It has a familiar ring, but I can't say I've
>>>>>>>> seen this recently.
>>>>>>> 
>>>>>>> Hi Chuck,
>>>>>>> 
>>>>>>> I've updated the kernel code to the latest  commit
>>>>>>> 46588cade103b446f74a7ee160a83cd2a80399e7 (HEAD ->
>>>>>>> chuck-rpc-with-tls-04032023, chuck/topic-rpc-with-tls-upcall) but the
>>>>>>> client is still crashing but looks like in a different place. This
>>>>>>> happens when the server terminates the connection (immediately) as the
>>>>>>> tls handshake happens.
>>>>>> 
>>>>>> Why is the server doing that?
>>>>> 
>>>>> I'm not sure what kind of a reason you are looking for. I think the
>>>>> server can send an RST at any point in a TCP connection and the client
>>>>> shouldn't crash.
>>>> 
>>>> No doubt.
>>>> 
>>>> 
>>>>> But the real reason is probably because the server's
>>>>> code is getting developed and they ran into some error and resetting
>>>>> the connection.
>>>> 
>>>> That's what I was looking for. I wanted to know if it was
>>>> Linux NFSD doing this.
>>>> 
>>>> 
>>>>>>> Here's the stack trace:
>>>>>>> [  677.274299] BUG: kernel NULL pointer dereference, address: 0000000000000030
>>>>>>> [  677.274424] #PF: supervisor read access in kernel mode
>>>>>>> [  677.274517] #PF: error_code(0x0000) - not-present page
>>>>>>> [  677.274609] PGD 0 P4D 0
>>>>>>> [  677.274689] Oops: 0000 [#1] PREEMPT SMP PTI
>>>>>>> [  677.274785] CPU: 0 PID: 518 Comm: kworker/u2:29 Kdump: loaded Not
>>>>>>> tainted 6.3.0-rc4+ #13
>>>>>>> [  677.274881] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
>>>>>>> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
>>>>>>> [  677.274981] Workqueue: xprtiod xs_tls_connect [sunrpc]
>>>>>>> [  677.275137] RIP: 0010:handshake_req_cancel+0x2b/0x3b0
>>>>>> 
>>>>>> scripts/faddr2line on this RIP
>>>>>> 
>>>>>> Which pointer is NULL?
>>>>> 
>>>>> I apologize I don't have that info and I'm working on trying to figure
>>>>> out how to get this information to you. Lab kernels are built with
>>>>> CONFIG_DEBUG_INFO disabled and thus I can't do faddr2line. I will
>>>>> attempt to build on with debugging enabled (has been problematic
>>>>> before for the lab machines so not sure if that'll be successful I
>>>>> might have to hack the linux kernel to simulate the reset).
>>>> 
>>>> Jakub asked that NULL checking be removed from handshake_req_cancel,
>>>> or at least that's what I read into his comment. That choice is
>>>> definitely flushing bugs.
>>>> 
>>>> So this crasher is familiar. req_cancel is getting called in some
>>>> scenario where the socket or request is partially filled in. We
>>>> have to figure out how that's happening.
>>>> 
>>>> If debug info is a problem, then just add a sarcastic quantity of
>>>> NULL checks in handshake_req_cancel, rebuild, and it should tell
>>>> you exactly which pointer is bad.
>>> 
>>> I added a check for sock-sk null and if so return true.
>> 
>> If it returns true, xs_tls_handshake_sync() probably completes
>> successfully.
>> 
>> Perhaps handshake_req_cancel needs to return false in cases
>> where it can't find a pending request.
> 
> I'm not sure if you were suggesting me to do this but I tried:
> if (!sock->sk) return false;
> and that led to a mount not crashing but timing out.

That's the behavior I would expect. Was the failure immediate,
or did it take a few seconds for the mount command to exit?


>>> I thought that
>>> would "prevent" the crash but it then proceeded to crash elsewhere.
>>> Sorry I'm still on non-debug-symbol kernel (as I thought I'd add the
>>> checks that that'd be enough :)).
>>> 
>>> [  284.368211] AGLO: handshake_req_cancel sock->sk is NULL
>>> [  284.368249] BUG: kernel NULL pointer dereference, address: 0000000000000018
>>> [  284.368320] #PF: supervisor read access in kernel mode
>>> [  284.368363] #PF: error_code(0x0000) - not-present page
>>> [  284.368411] PGD 0 P4D 0
>>> [  284.368438] Oops: 0000 [#1] PREEMPT SMP PTI
>>> [  284.368473] CPU: 0 PID: 521 Comm: kworker/u2:30 Kdump: loaded Not
>>> tainted 6.3.0-rc4+ #14
>>> [  284.368523] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
>>> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
>>> [  284.368580] Workqueue: xprtiod xs_tls_connect [sunrpc]
>>> [  284.368720] RIP: 0010:xs_tls_connect+0x3e5/0x590 [sunrpc]
>>> [  284.368844] Code: ff ff ff 48 2b 83 f8 fe ff ff 48 89 83 00 ff ff
>>> ff e8 bf d2 ff ff e9 a8 fd ff ff 49 8b 97 f8 05 00 00 66 83 bb e0 f9
>>> ff ff 0a <48> 8b 6a 18 0f 84 fd 00 00 00 48 8d 72 18 4c 89 e7 48 89 14
>>> 24 e8
>>> [  284.368973] RSP: 0018:ffffba17c0f67dc8 EFLAGS: 00010293
>>> [  284.369010] RAX: 0000000000000000 RBX: ffff963143c1fe40 RCX: 0000000000000000
>>> [  284.369051] RDX: 0000000000000000 RSI: ffff9631f7e20840 RDI: ffff9631f7e20840
>>> [  284.369091] RBP: ffff9631463d8a00 R08: 0000000000000000 R09: c0000000ffff7fff
>>> [  284.369130] R10: 0000000000000001 R11: ffffba17c0f67bb8 R12: ffff963143c1f800
>>> [  284.369175] R13: 0000000004208160 R14: ffff963143f6f600 R15: ffff963143c1e000
>>> [  284.369259] FS:  0000000000000000(0000) GS:ffff9631f7e00000(0000)
>>> knlGS:0000000000000000
>>> [  284.369319] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [  284.369367] CR2: 0000000000000018 CR3: 0000000102d12004 CR4: 00000000007706f0
>>> [  284.369458] PKRU: 55555554
>>> [  284.369483] Call Trace:
>>> [  284.369523]  <TASK>
>>> [  284.369583]  process_one_work+0x1b0/0x3c0
>>> [  284.369633]  worker_thread+0x34/0x360
>>> [  284.369668]  ? __pfx_worker_thread+0x10/0x10
>>> [  284.369704]  kthread+0xdb/0x110
>>> [  284.369736]  ? __pfx_kthread+0x10/0x10
>>> [  284.369767]  ret_from_fork+0x29/0x50
>>> [  284.369808]  </TASK>
>>> [  284.369843] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
>>> dns_resolver nfsv3 nfs_acl nfs lockd grace fscache netfs ext4 mbcache
>>> jbd2 loop rfkill vsock_loopback vmw_vsock_virtio_transport_common
>>> vmw_vsock_vmci_transport vsock sunrpc vfat fat dm_multipath
>>> intel_rapl_msr intel_rapl_common isst_if_mbox_msr isst_if_common nfit
>>> libnvdimm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl
>>> vmw_balloon joydev pcspkr vmw_vmci i2c_piix4 xfs libcrc32c vmwgfx
>>> drm_ttm_helper ttm sr_mod cdrom drm_kms_helper sd_mod syscopyarea
>>> sysfillrect ata_generic sysimgblt sg ata_piix ahci libahci
>>> crc32c_intel serio_raw vmxnet3 libata drm vmw_pvscsi dm_mirror
>>> dm_region_hash dm_log dm_mod nvme_tcp nvme_fabrics nvme_core t10_pi
>>> crc64_rocksoft crc64 ipmi_devintf ipmi_msghandler fuse
>>> [  284.370300] CR2: 0000000000000018
>>> 
>>> 
>>>> 
>>>> BPF might also be a solution, but I'm not sure if it relies on
>>>> debug info. Might be only BTF is needed.
>>>> 
>>>> 
>>>>>>> [  677.275269] Code: 0f 1e fa 0f 1f 44 00 00 41 57 41 56 41 55 41 54
>>>>>>> 55 53 48 83 ec 28 48 8b 6f 18 65 48 8b 04 25 28 00 00 00 48 89 44 24
>>>>>>> 20 31 c0 <48> 8b 5d 30 48 89 6c 24 18 e8 97 15 5f ff 4c 8b 25 50 2e 70
>>>>>>> 01 41
>>>>>>> [  677.275474] RSP: 0018:ffffaed400f6fcf8 EFLAGS: 00010246
>>>>>>> [  677.275570] RAX: 0000000000000000 RBX: ffff93f085b33800 RCX: 0000000000000000
>>>>>>> [  677.275663] RDX: ffff93f085b33f08 RSI: ffffaed400f6fd18 RDI: ffff93f083aa6700
>>>>>>> [  677.275756] RBP: 0000000000000000 R08: ffff93f137e32968 R09: 0000009d9663fa40
>>>>>>> [  677.275850] R10: 0000000000000000 R11: 0000009d9663fa40 R12: ffff93f083aa6700
>>>>>>> [  677.275943] R13: 0000000000000000 R14: ffff93f082999e00 R15: ffff93f085b33800
>>>>>>> [  677.276049] FS:  0000000000000000(0000) GS:ffff93f137e00000(0000)
>>>>>>> knlGS:0000000000000000
>>>>>>> [  677.276154] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>> [  677.276247] CR2: 0000000000000030 CR3: 000000010797c002 CR4: 00000000007706f0
>>>>>>> [  677.276355] PKRU: 55555554
>>>>>>> [  677.276427] Call Trace:
>>>>>>> [  677.276521]  <TASK>
>>>>>>> [  677.276605]  ? wait_for_completion_interruptible_timeout+0xfa/0x170
>>>>>>> [  677.276725]  xs_tls_handshake_sync+0x14f/0x170 [sunrpc]
>>>>>>> [  677.276850]  ? __pfx_xs_tls_handshake_done+0x10/0x10 [sunrpc]
>>>>>>> [  677.276967]  xs_tls_connect+0x14c/0x590 [sunrpc]
>>>>>>> [  677.277086]  process_one_work+0x1b0/0x3c0
>>>>>>> [  677.277209]  worker_thread+0x34/0x360
>>>>>>> [  677.277303]  ? __pfx_worker_thread+0x10/0x10
>>>>>>> [  677.277396]  kthread+0xdb/0x110
>>>>>>> [  677.277527]  ? __pfx_kthread+0x10/0x10
>>>>>>> [  677.277619]  ret_from_fork+0x29/0x50
>>>>>>> [  677.277731]  </TASK>
>>>>>>> [  677.277803] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
>>>>>>> dns_resolver nfsv3 nfs_acl nfs lockd grace fscache netfs ext4 mbcache
>>>>>>> jbd2 loop rfkill vsock_loopback vmw_vsock_virtio_transport_common
>>>>>>> vmw_vsock_vmci_transport vsock sunrpc vfat fat dm_multipath
>>>>>>> intel_rapl_msr intel_rapl_common isst_if_mbox_msr isst_if_common nfit
>>>>>>> libnvdimm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl
>>>>>>> vmw_balloon joydev pcspkr vmw_vmci i2c_piix4 xfs libcrc32c vmwgfx
>>>>>>> drm_ttm_helper ttm sr_mod cdrom drm_kms_helper syscopyarea sysfillrect
>>>>>>> sd_mod ata_generic sysimgblt sg ata_piix ahci libahci crc32c_intel
>>>>>>> serio_raw drm vmxnet3 vmw_pvscsi libata dm_mirror dm_region_hash
>>>>>>> dm_log dm_mod nvme_tcp nvme_fabrics nvme_core t10_pi crc64_rocksoft
>>>>>>> crc64 ipmi_devintf ipmi_msghandler fuse
>>>>>>> [  677.278383] CR2: 0000000000000030
>>>>>>> 
>>>>>>> Network trace from the server:
>>>>>>> 
>>>>>>> [kolga@scs000085536 ~/pcaps]$ tshark -t ad -r e0d_20230404_084311.trc0
>>>>>>> 1 2023-04-04 08:43:28.027546 10.235.232.77 → 10.63.20.131 TCP 74
>>>>>>> 727 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
>>>>>>> TSval=4059503415 TSecr=0 WS=128
>>>>>>> 2 2023-04-04 08:43:28.027645 10.63.20.131 → 10.235.232.77 TCP 74
>>>>>>> 2049 → 727 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
>>>>>>> SACK_PERM=1 TSval=2256845073 TSecr=4059503415
>>>>>>> 3 2023-04-04 08:43:28.027813 10.235.232.77 → 10.63.20.131 TCP 66
>>>>>>> 727 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=4059503415
>>>>>>> TSecr=2256845073
>>>>>>> 4 2023-04-04 08:43:28.027952 10.235.232.77 → 10.63.20.131 NFS 110
>>>>>>> V4 NULL Call
>>>>>>> 5 2023-04-04 08:43:28.028301 10.63.20.131 → 10.235.232.77 NFS 102
>>>>>>> V4 NULL Reply (Call In 4)
>>>>>>> 6 2023-04-04 08:43:28.028563 10.235.232.77 → 10.63.20.131 TCP 66
>>>>>>> 727 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=4059503416
>>>>>>> TSecr=2256845073
>>>>>>> 7 2023-04-04 08:43:28.331181 10.63.20.131 → 10.235.232.77 TCP 66
>>>>>>> 2049 → 727 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=2256845383
>>>>>>> TSecr=4059503416
>>>>>>> 
>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Chuck Lever
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Chuck Lever
>>>> 
>>>> 
>>>> --
>>>> Chuck Lever
>> 
>> 
>> --
>> Chuck Lever


--
Chuck Lever



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: kernel oops
  2023-04-04 19:14                 ` Chuck Lever III
@ 2023-04-04 19:26                   ` Olga Kornievskaia
  2023-04-04 19:30                     ` Chuck Lever III
  0 siblings, 1 reply; 12+ messages in thread
From: Olga Kornievskaia @ 2023-04-04 19:26 UTC (permalink / raw)
  To: Chuck Lever III; +Cc: kernel-tls-handshake

On Tue, Apr 4, 2023 at 3:14 PM Chuck Lever III <chuck.lever@oracle.com> wrote:
>
>
>
> > On Apr 4, 2023, at 3:11 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> >
> > On Tue, Apr 4, 2023 at 1:28 PM Chuck Lever III <chuck.lever@oracle.com> wrote:
> >>
> >>
> >>
> >>> On Apr 4, 2023, at 12:15 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> >>>
> >>> On Tue, Apr 4, 2023 at 10:54 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>>> On Apr 4, 2023, at 10:44 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
> >>>>>
> >>>>> On Tue, Apr 4, 2023 at 10:32 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
> >>>>>>
> >>>>>>> On Apr 4, 2023, at 10:24 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
> >>>>>>>
> >>>>>>> On Fri, Mar 31, 2023 at 10:25 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> On Mar 31, 2023, at 10:23 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
> >>>>>>>>>
> >>>>>>>>> Hi Chuck,
> >>>>>>>>>
> >>>>>>>>> Have you seen this oops? This is not my testing so all I know was that
> >>>>>>>>> it was an attempt to mount with xprtsec=tls (against ONTAP).
> >>>>>>>>>
> >>>>>>>>> [  316.939747] BUG: kernel NULL pointer dereference, address: 0000000000000018
> >>>>>>>>> [  316.939848] #PF: supervisor read access in kernel mode
> >>>>>>>>> [  316.939914] #PF: error_code(0x0000) - not-present page
> >>>>>>>>> [  316.939984] PGD 0 P4D 0
> >>>>>>>>> [  316.940041] Oops: 0000 [#1] PREEMPT SMP PTI
> >>>>>>>>> [  316.940109] CPU: 0 PID: 511 Comm: kworker/u2:30 Kdump: loaded Not
> >>>>>>>>> tainted 6.2.1+ #2
> >>>>>>>>> [  316.940181] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
> >>>>>>>>> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
> >>>>>>>>> [  316.940259] Workqueue: xprtiod xs_tls_connect [sunrpc]
> >>>>>>>>> [  316.940463] RIP: 0010:xs_tls_connect+0x3e1/0x590 [sunrpc]
> >>>>>>>>> [  316.940556] Code: ff ff ff 48 2b 83 f8 fe ff ff 48 89 83 00 ff ff
> >>>>>>>>> ff e8 03 d3 ff ff e9 a8 fd ff ff 49 8b 97 f8 05 00 00 66 83 bb e0 f9
> >>>>>>>>> ff ff 0a <48> 8b 6a 18 0f 84 fd 00 00 00 48 8d 72 18 4c 89 e7 48 89 14
> >>>>>>>>> 24 e8
> >>>>>>>>> [  316.940719] RSP: 0018:ffffa69d40f5fdc8 EFLAGS: 00010293
> >>>>>>>>> [  316.940786] RAX: 0000000000000000 RBX: ffff9b0c46333e40 RCX: 0000000000000000
> >>>>>>>>> [  316.940872] RDX: 0000000000000000 RSI: ffffa69d40f5fd20 RDI: 0000000000000000
> >>>>>>>>> [  316.940940] RBP: ffff9b0c46350200 R08: ffff9b0cf7e31968 R09: 0000000000000159
> >>>>>>>>> [  316.941008] R10: 0000000000000001 R11: 00000049c8887a40 R12: ffff9b0c46333800
> >>>>>>>>> [  316.941075] R13: 0000000004208160 R14: ffff9b0c44e18a00 R15: ffff9b0c42dab800
> >>>>>>>>> [  316.941159] FS:  0000000000000000(0000) GS:ffff9b0cf7e00000(0000)
> >>>>>>>>> knlGS:0000000000000000
> >>>>>>>>> [  316.941243] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>>>>>>>> [  316.941309] CR2: 0000000000000018 CR3: 0000000102e48004 CR4: 00000000007706f0
> >>>>>>>>> [  316.941391] PKRU: 55555554
> >>>>>>>>> [  316.941443] Call Trace:
> >>>>>>>>> [  316.941513]  <TASK>
> >>>>>>>>> [  316.941578]  process_one_work+0x1b0/0x3c0
> >>>>>>>>> [  316.941689]  worker_thread+0x30/0x360
> >>>>>>>>> [  316.941755]  ? __pfx_worker_thread+0x10/0x10
> >>>>>>>>> [  316.941822]  kthread+0xd7/0x100
> >>>>>>>>> [  316.941900]  ? __pfx_kthread+0x10/0x10
> >>>>>>>>> [  316.941968]  ret_from_fork+0x29/0x50
> >>>>>>>>> [  316.942049]  </TASK>
> >>>>>>>>> [  316.942099] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
> >>>>>>>>> dns_resolver ib_core nfsv3 nfs_acl nfs lockd grace fscache netfs ext4
> >>>>>>>>> rfkill mbcache jbd2 loop vsock_loopback
> >>>>>>>>> vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock
> >>>>>>>>> sunrpc tls vfat fat dm_multipath intel_rapl_msr intel_rapl_common
> >>>>>>>>> isst_if_mbox_msr isst_if_common nfit libnvdimm crct10dif_pclmul
> >>>>>>>>> crc32_pclmul ghash_clmulni_intel rapl vmw_balloon joydev pcspkr
> >>>>>>>>> i2c_piix4 vmw_vmci xfs libcrc32c vmwgfx drm_ttm_helper ttm sr_mod
> >>>>>>>>> cdrom drm_kms_helper syscopyarea sd_mod sysfillrect ata_generic
> >>>>>>>>> sysimgblt sg ahci libahci ata_piix crc32c_intel serio_raw drm vmxnet3
> >>>>>>>>> vmw_pvscsi libata dm_mirror dm_region_hash dm_log dm_mod nvme_tcp
> >>>>>>>>> nvme_fabrics nvme_core t10_pi crc64_rocksoft crc64 ipmi_devintf
> >>>>>>>>> ipmi_msghandler fuse
> >>>>>>>>> [  316.942550] CR2: 0000000000000018
> >>>>>>>>>
> >>>>>>>>> Please note this was still on "commit
> >>>>>>>>> caa52379f0046e974fd16585dd7ede624699d8e2 (HEAD -> 03152023-v7,
> >>>>>>>>> origin/upcall-v7)". If you think somethings done to address it I'll
> >>>>>>>>> have them update to the latest.
> >>>>>>>>
> >>>>>>>> Update and retry. It has a familiar ring, but I can't say I've
> >>>>>>>> seen this recently.
> >>>>>>>
> >>>>>>> Hi Chuck,
> >>>>>>>
> >>>>>>> I've updated the kernel code to the latest  commit
> >>>>>>> 46588cade103b446f74a7ee160a83cd2a80399e7 (HEAD ->
> >>>>>>> chuck-rpc-with-tls-04032023, chuck/topic-rpc-with-tls-upcall) but the
> >>>>>>> client is still crashing but looks like in a different place. This
> >>>>>>> happens when the server terminates the connection (immediately) as the
> >>>>>>> tls handshake happens.
> >>>>>>
> >>>>>> Why is the server doing that?
> >>>>>
> >>>>> I'm not sure what kind of a reason you are looking for. I think the
> >>>>> server can send an RST at any point in a TCP connection and the client
> >>>>> shouldn't crash.
> >>>>
> >>>> No doubt.
> >>>>
> >>>>
> >>>>> But the real reason is probably because the server's
> >>>>> code is getting developed and they ran into some error and resetting
> >>>>> the connection.
> >>>>
> >>>> That's what I was looking for. I wanted to know if it was
> >>>> Linux NFSD doing this.
> >>>>
> >>>>
> >>>>>>> Here's the stack trace:
> >>>>>>> [  677.274299] BUG: kernel NULL pointer dereference, address: 0000000000000030
> >>>>>>> [  677.274424] #PF: supervisor read access in kernel mode
> >>>>>>> [  677.274517] #PF: error_code(0x0000) - not-present page
> >>>>>>> [  677.274609] PGD 0 P4D 0
> >>>>>>> [  677.274689] Oops: 0000 [#1] PREEMPT SMP PTI
> >>>>>>> [  677.274785] CPU: 0 PID: 518 Comm: kworker/u2:29 Kdump: loaded Not
> >>>>>>> tainted 6.3.0-rc4+ #13
> >>>>>>> [  677.274881] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
> >>>>>>> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
> >>>>>>> [  677.274981] Workqueue: xprtiod xs_tls_connect [sunrpc]
> >>>>>>> [  677.275137] RIP: 0010:handshake_req_cancel+0x2b/0x3b0
> >>>>>>
> >>>>>> scripts/faddr2line on this RIP
> >>>>>>
> >>>>>> Which pointer is NULL?
> >>>>>
> >>>>> I apologize I don't have that info and I'm working on trying to figure
> >>>>> out how to get this information to you. Lab kernels are built with
> >>>>> CONFIG_DEBUG_INFO disabled and thus I can't do faddr2line. I will
> >>>>> attempt to build on with debugging enabled (has been problematic
> >>>>> before for the lab machines so not sure if that'll be successful I
> >>>>> might have to hack the linux kernel to simulate the reset).
> >>>>
> >>>> Jakub asked that NULL checking be removed from handshake_req_cancel,
> >>>> or at least that's what I read into his comment. That choice is
> >>>> definitely flushing bugs.
> >>>>
> >>>> So this crasher is familiar. req_cancel is getting called in some
> >>>> scenario where the socket or request is partially filled in. We
> >>>> have to figure out how that's happening.
> >>>>
> >>>> If debug info is a problem, then just add a sarcastic quantity of
> >>>> NULL checks in handshake_req_cancel, rebuild, and it should tell
> >>>> you exactly which pointer is bad.
> >>>
> >>> I added a check for sock-sk null and if so return true.
> >>
> >> If it returns true, xs_tls_handshake_sync() probably completes
> >> successfully.
> >>
> >> Perhaps handshake_req_cancel needs to return false in cases
> >> where it can't find a pending request.
> >
> > I'm not sure if you were suggesting me to do this but I tried:
> > if (!sock->sk) return false;
> > and that led to a mount not crashing but timing out.
>
> That's the behavior I would expect. Was the failure immediate,
> or did it take a few seconds for the mount command to exit?

It did take quite a long time to exit:

[root@scs000128074 kolga]# time mount -o vers=4.2,sec=sys,xprtsec=tls
10.63.20.131:/ /mnt
mount.nfs: Connection timed out

real    3m0.544s
user    0m0.004s
sys     0m0.013s

Network trace shows numerous (16) attempts to do NULL call

   66 2023-04-04 15:18:17.786579 10.235.232.7 → 10.63.20.131 TCP 74
953 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
TSval=3995198270 TSecr=0 WS=128
   67 2023-04-04 15:18:17.786818 10.63.20.131 → 10.235.232.7 TCP 74
2049 → 953 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
SACK_PERM=1 TSval=2815692929 TSecr=3995198270
   68 2023-04-04 15:18:17.786871 10.235.232.7 → 10.63.20.131 TCP 66
953 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995198270
TSecr=2815692929
   69 2023-04-04 15:18:17.786951 10.235.232.7 → 10.63.20.131 NFS 110
V4 NULL Call
   70 2023-04-04 15:18:17.787488 10.63.20.131 → 10.235.232.7 NFS 102
V4 NULL Reply (Call In 69)
   71 2023-04-04 15:18:17.787520 10.235.232.7 → 10.63.20.131 TCP 66
953 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995198271
TSecr=2815692929
   72 2023-04-04 15:18:17.818060 10.235.232.7 → 10.56.8.183  TCP 66
826 → 2049 [ACK] Seq=293 Ack=237 Win=501 Len=0 TSval=1599771772
TSecr=3825493593
   73 2023-04-04 15:18:17.993341 10.63.20.131 → 10.235.232.7 TCP 66
2049 → 953 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=2815693129
TSecr=3995198271
   74 2023-04-04 15:18:19.490669 fe80::21c:73ff:fe00:99 → ff02::1
ICMPv6 118 Router Advertisement from 00:1c:73:00:00:99
   75 2023-04-04 15:18:19.494076 fe80::250:56ff:febf:2f55 → ff02::16
  ICMPv6 90 Multicast Listener Report Message v2
   76 2023-04-04 15:18:19.841191 fe80::250:56ff:febf:2f55 → ff02::16
  ICMPv6 90 Multicast Listener Report Message v2
   77 2023-04-04 15:18:28.201642 10.235.232.7 → 10.63.20.131 TCP 74
692 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
TSval=3995208685 TSecr=0 WS=128
   78 2023-04-04 15:18:28.202171 10.63.20.131 → 10.235.232.7 TCP 74
2049 → 692 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
SACK_PERM=1 TSval=3904561029 TSecr=3995208685
   79 2023-04-04 15:18:28.202236 10.235.232.7 → 10.63.20.131 TCP 66
692 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995208686
TSecr=3904561029
   80 2023-04-04 15:18:28.202506 10.235.232.7 → 10.63.20.131 NFS 110
V4 NULL Call
   81 2023-04-04 15:18:28.202882 10.63.20.131 → 10.235.232.7 NFS 102
V4 NULL Reply (Call In 80)
   82 2023-04-04 15:18:28.202911 10.235.232.7 → 10.63.20.131 TCP 66
692 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995208686
TSecr=3904561029
   83 2023-04-04 15:18:28.329632 10.63.20.131 → 10.235.232.7 TCP 66
2049 → 692 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=3904561159
TSecr=3995208686
   84 2023-04-04 15:18:34.857077 10.235.232.7 → 10.60.248.40 TCP 66
845 → 2049 [ACK] Seq=1 Ack=1 Win=501 Len=0 TSval=3160247691
TSecr=2389454709
   85 2023-04-04 15:18:34.857146 10.235.232.7 → 10.60.249.150 TCP 66
970 → 2049 [ACK] Seq=1 Ack=1 Win=502 Len=0 TSval=4116465771
TSecr=1272760298
   86 2023-04-04 15:18:34.857421 10.60.249.150 → 10.235.232.7 TCP 66
[TCP ACKed unseen segment] 2049 → 970 [ACK] Seq=1 Ack=2 Win=257 Len=0
TSval=1272820701 TSecr=4116405409
   87 2023-04-04 15:18:34.857422 10.60.248.40 → 10.235.232.7 TCP 66
[TCP ACKed unseen segment] 2049 → 845 [ACK] Seq=1 Ack=2 Win=257 Len=0
TSval=2389515123 TSecr=3160187318
   88 2023-04-04 15:18:38.441851 10.235.232.7 → 10.63.20.131 TCP 74
971 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
TSval=3995218925 TSecr=0 WS=128
   89 2023-04-04 15:18:38.442269 10.63.20.131 → 10.235.232.7 TCP 74
2049 → 971 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
SACK_PERM=1 TSval=1698639803 TSecr=3995218925
   90 2023-04-04 15:18:38.442320 10.235.232.7 → 10.63.20.131 TCP 66
971 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995218926
TSecr=1698639803
   91 2023-04-04 15:18:38.442577 10.235.232.7 → 10.63.20.131 NFS 110
V4 NULL Call
   92 2023-04-04 15:18:38.442894 10.63.20.131 → 10.235.232.7 NFS 102
V4 NULL Reply (Call In 91)
   93 2023-04-04 15:18:38.442921 10.235.232.7 → 10.63.20.131 TCP 66
971 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995218926
TSecr=1698639803
   94 2023-04-04 15:18:38.478175 10.63.20.131 → 10.235.232.7 TCP 66
[TCP Dup ACK 92#1] 2049 → 971 [ACK] Seq=37 Ack=45 Win=65792 Len=0
TSval=1698639843 TSecr=3995218926
   95 2023-04-04 15:18:38.614265 10.63.20.131 → 10.235.232.7 TCP 66
2049 → 971 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=1698639973
TSecr=3995218926
   96 2023-04-04 15:18:48.681763 10.235.232.7 → 10.63.20.131 TCP 74
799 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
TSval=3995229165 TSecr=0 WS=128
   97 2023-04-04 15:18:48.682176 10.63.20.131 → 10.235.232.7 TCP 74
2049 → 799 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
SACK_PERM=1 TSval=1773399631 TSecr=3995229165
   98 2023-04-04 15:18:48.682203 10.235.232.7 → 10.63.20.131 TCP 66
799 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995229166
TSecr=1773399631
   99 2023-04-04 15:18:48.682260 10.235.232.7 → 10.63.20.131 NFS 110
V4 NULL Call
  100 2023-04-04 15:18:48.682751 10.63.20.131 → 10.235.232.7 NFS 102
V4 NULL Reply (Call In 99)
  101 2023-04-04 15:18:48.682761 10.235.232.7 → 10.63.20.131 TCP 66
799 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995229166
TSecr=1773399631
  102 2023-04-04 15:18:48.826843 10.63.20.131 → 10.235.232.7 TCP 66
2049 → 799 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=1773399771
TSecr=3995229166
  103 2023-04-04 15:18:49.514227 10.235.232.170 → 10.235.235.255
BROWSER 243 Host Announcement WINDOWS-G95UIQM, Workstation, Server, NT
Workstation, NT Server
  104 2023-04-04 15:18:58.921556 10.235.232.7 → 10.63.20.131 TCP 74
729 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
TSval=3995239405 TSecr=0 WS=128
  105 2023-04-04 15:18:58.921926 10.63.20.131 → 10.235.232.7 TCP 74
2049 → 729 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
SACK_PERM=1 TSval=2487545914 TSecr=3995239405
  106 2023-04-04 15:18:58.921953 10.235.232.7 → 10.63.20.131 TCP 66
729 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995239405
TSecr=2487545914
  107 2023-04-04 15:18:58.922027 10.235.232.7 → 10.63.20.131 NFS 110
V4 NULL Call
  108 2023-04-04 15:18:58.922802 10.63.20.131 → 10.235.232.7 NFS 102
V4 NULL Reply (Call In 107)
  109 2023-04-04 15:18:58.922822 10.235.232.7 → 10.63.20.131 TCP 66
729 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995239406
TSecr=2487545914
  110 2023-04-04 15:18:59.050600 10.63.20.131 → 10.235.232.7 TCP 66
2049 → 729 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=2487546044
TSecr=3995239406
  111 2023-04-04 15:19:06.533098 10.235.232.128 → 10.235.235.255
BROWSER 243 Host Announcement SCS000126759-1, Workstation, Server, NT
Workstation, NT Server
  112 2023-04-04 15:19:09.161611 10.235.232.7 → 10.63.20.131 TCP 74
954 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
TSval=3995249645 TSecr=0 WS=128
  113 2023-04-04 15:19:09.161949 10.63.20.131 → 10.235.232.7 TCP 74
2049 → 954 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
SACK_PERM=1 TSval=3804453828 TSecr=3995249645
  114 2023-04-04 15:19:09.162009 10.235.232.7 → 10.63.20.131 TCP 66
954 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995249645
TSecr=3804453828
  115 2023-04-04 15:19:09.162055 10.235.232.7 → 10.63.20.131 NFS 110
V4 NULL Call
  116 2023-04-04 15:19:09.162528 10.63.20.131 → 10.235.232.7 NFS 102
V4 NULL Reply (Call In 115)
  117 2023-04-04 15:19:09.162554 10.235.232.7 → 10.63.20.131 TCP 66
954 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995249646
TSecr=3804453828
  118 2023-04-04 15:19:09.278841 10.63.20.131 → 10.235.232.7 TCP 66
2049 → 954 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=3804453948
TSecr=3995249646
  119 2023-04-04 15:19:12.003989 Vmware_bf:32:23 → Broadcast    ARP 60
Who has 10.235.232.1? Tell 10.235.232.42
  120 2023-04-04 15:19:19.401546 10.235.232.7 → 10.63.20.131 TCP 74
728 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
TSval=3995259885 TSecr=0 WS=128
  121 2023-04-04 15:19:19.401902 10.63.20.131 → 10.235.232.7 TCP 74
2049 → 728 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
SACK_PERM=1 TSval=2468567773 TSecr=3995259885
  122 2023-04-04 15:19:19.401965 10.235.232.7 → 10.63.20.131 TCP 66
728 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995259885
TSecr=2468567773
  123 2023-04-04 15:19:19.402050 10.235.232.7 → 10.63.20.131 NFS 110
V4 NULL Call
  124 2023-04-04 15:19:19.402542 10.63.20.131 → 10.235.232.7 NFS 102
V4 NULL Reply (Call In 123)
  125 2023-04-04 15:19:19.402554 10.235.232.7 → 10.63.20.131 TCP 66
728 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995259886
TSecr=2468567773
  126 2023-04-04 15:19:19.595961 10.63.20.131 → 10.235.232.7 TCP 66
2049 → 728 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=2468567963
TSecr=3995259886
  127 2023-04-04 15:19:21.449077 10.235.232.7 → 10.60.249.125 TCP 66
[TCP Keep-Alive] 732 → 2049 [ACK] Seq=2176 Ack=1857 Win=502 Len=0
TSval=1628408100 TSecr=708634268
  128 2023-04-04 15:19:21.449206 10.235.232.7 → 10.56.8.183  TCP 66
[TCP Keep-Alive] 826 → 2049 [ACK] Seq=292 Ack=237 Win=501 Len=0
TSval=1599835403 TSecr=3825493593
  129 2023-04-04 15:19:21.449284 10.60.249.125 → 10.235.232.7 TCP 66
[TCP Keep-Alive ACK] 2049 → 732 [ACK] Seq=1857 Ack=2177 Win=257 Len=0
TSval=708698066 TSecr=1628344342
  130 2023-04-04 15:19:21.511308  10.56.8.183 → 10.235.232.7 TCP 66
[TCP Keep-Alive ACK] 2049 → 826 [ACK] Seq=237 Ack=293 Win=257 Len=0
TSval=3825557328 TSecr=1599771772
  131 2023-04-04 15:19:29.641659 10.235.232.7 → 10.63.20.131 TCP 74
924 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
TSval=3995270125 TSecr=0 WS=128
  132 2023-04-04 15:19:29.641987 10.63.20.131 → 10.235.232.7 TCP 74
2049 → 924 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
SACK_PERM=1 TSval=4239335290 TSecr=3995270125
  133 2023-04-04 15:19:29.642062 10.235.232.7 → 10.63.20.131 TCP 66
924 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995270125
TSecr=4239335290
  134 2023-04-04 15:19:29.642188 10.235.232.7 → 10.63.20.131 NFS 110
V4 NULL Call
  135 2023-04-04 15:19:29.642585 10.63.20.131 → 10.235.232.7 NFS 102
V4 NULL Reply (Call In 134)
  136 2023-04-04 15:19:29.642626 10.235.232.7 → 10.63.20.131 TCP 66
924 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995270126
TSecr=4239335290
  137 2023-04-04 15:19:29.752894 10.63.20.131 → 10.235.232.7 TCP 66
2049 → 924 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=4239335400
TSecr=3995270126
  138 2023-04-04 15:19:37.833079 10.235.232.7 → 10.60.248.40 TCP 66
[TCP Dup ACK 84#1] 845 → 2049 [ACK] Seq=1 Ack=1 Win=501 Len=0
TSval=3160310667 TSecr=2389515123
  139 2023-04-04 15:19:37.833122 10.235.232.7 → 10.60.249.150 TCP 66
[TCP Dup ACK 85#1] 970 → 2049 [ACK] Seq=1 Ack=1 Win=502 Len=0
TSval=4116528747 TSecr=1272820701
  140 2023-04-04 15:19:37.833308 10.60.248.40 → 10.235.232.7 TCP 66
[TCP Dup ACK 87#1] [TCP ACKed unseen segment] 2049 → 845 [ACK] Seq=1
Ack=2 Win=257 Len=0 TSval=2389578099 TSecr=3160187318
  141 2023-04-04 15:19:37.833346 10.60.249.150 → 10.235.232.7 TCP 66
[TCP Dup ACK 86#1] [TCP ACKed unseen segment] 2049 → 970 [ACK] Seq=1
Ack=2 Win=257 Len=0 TSval=1272883677 TSecr=4116405409
  142 2023-04-04 15:19:39.881507 10.235.232.7 → 10.63.20.131 TCP 74
788 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
TSval=3995280365 TSecr=0 WS=128
  143 2023-04-04 15:19:39.881784 10.63.20.131 → 10.235.232.7 TCP 74
2049 → 788 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
SACK_PERM=1 TSval=2800122551 TSecr=3995280365
  144 2023-04-04 15:19:39.881847 10.235.232.7 → 10.63.20.131 TCP 66
788 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995280365
TSecr=2800122551
  145 2023-04-04 15:19:39.881898 10.235.232.7 → 10.63.20.131 NFS 110
V4 NULL Call
  146 2023-04-04 15:19:39.882359 10.63.20.131 → 10.235.232.7 NFS 102
V4 NULL Reply (Call In 145)
  147 2023-04-04 15:19:39.882395 10.235.232.7 → 10.63.20.131 TCP 66
788 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995280366
TSecr=2800122551
  148 2023-04-04 15:19:40.017113 10.63.20.131 → 10.235.232.7 TCP 66
2049 → 788 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=2800122681
TSecr=3995280366
  149 2023-04-04 15:19:50.122254 10.235.232.7 → 10.63.20.131 TCP 74
868 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
TSval=3995290606 TSecr=0 WS=128
  150 2023-04-04 15:19:50.122643 10.63.20.131 → 10.235.232.7 TCP 74
2049 → 868 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
SACK_PERM=1 TSval=1996033015 TSecr=3995290606
  151 2023-04-04 15:19:50.122674 10.235.232.7 → 10.63.20.131 TCP 66
868 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995290606
TSecr=1996033015
  152 2023-04-04 15:19:50.122802 10.235.232.7 → 10.63.20.131 NFS 110
V4 NULL Call
  153 2023-04-04 15:19:50.123263 10.63.20.131 → 10.235.232.7 NFS 102
V4 NULL Reply (Call In 152)
  154 2023-04-04 15:19:50.123279 10.235.232.7 → 10.63.20.131 TCP 66
868 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995290607
TSecr=1996033015
  155 2023-04-04 15:19:50.268177 10.63.20.131 → 10.235.232.7 TCP 66
2049 → 868 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=1996033165
TSecr=3995290607
  156 2023-04-04 15:20:00.361440 10.235.232.7 → 10.63.20.131 TCP 74
988 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
TSval=3995300845 TSecr=0 WS=128
  157 2023-04-04 15:20:00.361776 10.63.20.131 → 10.235.232.7 TCP 74
2049 → 988 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
SACK_PERM=1 TSval=4120839794 TSecr=3995300845
  158 2023-04-04 15:20:00.361791 10.235.232.7 → 10.63.20.131 TCP 66
988 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995300845
TSecr=4120839794
  159 2023-04-04 15:20:00.361841 10.235.232.7 → 10.63.20.131 NFS 110
V4 NULL Call
  160 2023-04-04 15:20:00.362321 10.63.20.131 → 10.235.232.7 NFS 102
V4 NULL Reply (Call In 159)
  161 2023-04-04 15:20:00.362327 10.235.232.7 → 10.63.20.131 TCP 66
988 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995300846
TSecr=4120839794
  162 2023-04-04 15:20:00.479087 10.63.20.131 → 10.235.232.7 TCP 66
2049 → 988 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=4120839904
TSecr=3995300846
  163 2023-04-04 15:20:01.559017 10.235.232.173 → 10.235.235.255
BROWSER 243 Host Announcement WINDOWS-PTEFTLL, Workstation, Server, NT
Workstation, NT Server
  164 2023-04-04 15:20:10.426553 10.235.232.7 → 10.60.248.183 NTP 90
NTP Version 4, client
  165 2023-04-04 15:20:10.426824 10.60.248.183 → 10.235.232.7 NTP 90
NTP Version 4, server
  166 2023-04-04 15:20:10.601625 10.235.232.7 → 10.63.20.131 TCP 74
888 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
TSval=3995311085 TSecr=0 WS=128
  167 2023-04-04 15:20:10.601919 10.63.20.131 → 10.235.232.7 TCP 74
2049 → 888 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
SACK_PERM=1 TSval=531138781 TSecr=3995311085
  168 2023-04-04 15:20:10.601978 10.235.232.7 → 10.63.20.131 TCP 66
888 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995311085
TSecr=531138781
  169 2023-04-04 15:20:10.602058 10.235.232.7 → 10.63.20.131 NFS 110
V4 NULL Call
  170 2023-04-04 15:20:10.602516 10.63.20.131 → 10.235.232.7 NFS 102
V4 NULL Reply (Call In 169)
  171 2023-04-04 15:20:10.602543 10.235.232.7 → 10.63.20.131 TCP 66
888 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995311086
TSecr=531138781
  172 2023-04-04 15:20:10.731005 10.63.20.131 → 10.235.232.7 TCP 66
2049 → 888 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=531138911
TSecr=3995311086
  173 2023-04-04 15:20:11.998469 10.235.232.7 → 10.235.48.111 NTP 90
NTP Version 4, client
  174 2023-04-04 15:20:11.998803 10.235.48.111 → 10.235.232.7 NTP 90
NTP Version 4, server
  175 2023-04-04 15:20:12.648789 10.235.232.7 → 10.227.204.142 NTP 90
NTP Version 4, client
  176 2023-04-04 15:20:12.649163 10.227.204.142 → 10.235.232.7 NTP 90
NTP Version 4, server
  177 2023-04-04 15:20:20.841429 10.235.232.7 → 10.63.20.131 TCP 74
893 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
TSval=3995321325 TSecr=0 WS=128
  178 2023-04-04 15:20:20.841673 10.63.20.131 → 10.235.232.7 TCP 74
2049 → 893 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
SACK_PERM=1 TSval=1953607493 TSecr=3995321325
  179 2023-04-04 15:20:20.841688 10.235.232.7 → 10.63.20.131 TCP 66
893 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995321325
TSecr=1953607493
  180 2023-04-04 15:20:20.841767 10.235.232.7 → 10.63.20.131 NFS 110
V4 NULL Call
  181 2023-04-04 15:20:20.842198 10.63.20.131 → 10.235.232.7 NFS 102
V4 NULL Reply (Call In 180)
  182 2023-04-04 15:20:20.842208 10.235.232.7 → 10.63.20.131 TCP 66
893 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995321326
TSecr=1953607493
  183 2023-04-04 15:20:21.051706 10.63.20.131 → 10.235.232.7 TCP 66
2049 → 893 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=1953607703
TSecr=3995321326
  184 2023-04-04 15:20:22.889123 10.235.232.7 → 10.56.8.183  TCP 66
[TCP Keep-Alive] 826 → 2049 [ACK] Seq=292 Ack=237 Win=501 Len=0
TSval=1599896843 TSecr=3825557328
  185 2023-04-04 15:20:22.889130 10.235.232.7 → 10.60.249.125 TCP 66
[TCP Keep-Alive] 732 → 2049 [ACK] Seq=2176 Ack=1857 Win=502 Len=0
TSval=1628469540 TSecr=708698066
  186 2023-04-04 15:20:22.889335 10.60.249.125 → 10.235.232.7 TCP 66
[TCP Keep-Alive ACK] 2049 → 732 [ACK] Seq=1857 Ack=2177 Win=257 Len=0
TSval=708759507 TSecr=1628344342
  187 2023-04-04 15:20:22.951234  10.56.8.183 → 10.235.232.7 TCP 66
[TCP Keep-Alive ACK] 2049 → 826 [ACK] Seq=237 Ack=293 Win=257 Len=0
TSval=3825618767 TSecr=1599771772
  188 2023-04-04 15:20:27.226390 10.235.232.131 → 10.235.235.255
BROWSER 243 Host Announcement SCS000126759-3, Workstation, Server, NT
Workstation, NT Server
  189 2023-04-04 15:20:31.081384 10.235.232.7 → 10.63.20.131 TCP 74
1013 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
TSval=3995331565 TSecr=0 WS=128
  190 2023-04-04 15:20:31.081679 10.63.20.131 → 10.235.232.7 TCP 74
2049 → 1013 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
SACK_PERM=1 TSval=1446866364 TSecr=3995331565
  191 2023-04-04 15:20:31.081698 10.235.232.7 → 10.63.20.131 TCP 66
1013 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995331565
TSecr=1446866364
  192 2023-04-04 15:20:31.081745 10.235.232.7 → 10.63.20.131 NFS 110
V4 NULL Call
  193 2023-04-04 15:20:31.082347 10.63.20.131 → 10.235.232.7 NFS 102
V4 NULL Reply (Call In 192)
  194 2023-04-04 15:20:31.082353 10.235.232.7 → 10.63.20.131 TCP 66
1013 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995331566
TSecr=1446866364
  195 2023-04-04 15:20:31.216474 10.63.20.131 → 10.235.232.7 TCP 66
2049 → 1013 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=1446866494
TSecr=3995331566
  196 2023-04-04 15:20:39.273073 10.235.232.7 → 10.60.249.150 TCP 66
[TCP Dup ACK 85#2] 970 → 2049 [ACK] Seq=1 Ack=1 Win=502 Len=0
TSval=4116590187 TSecr=1272883677
  197 2023-04-04 15:20:39.273146 10.235.232.7 → 10.60.248.40 TCP 66
[TCP Dup ACK 84#2] 845 → 2049 [ACK] Seq=1 Ack=1 Win=501 Len=0
TSval=3160372107 TSecr=2389578099
  198 2023-04-04 15:20:39.273342 10.60.249.150 → 10.235.232.7 TCP 66
[TCP Dup ACK 86#2] [TCP ACKed unseen segment] 2049 → 970 [ACK] Seq=1
Ack=2 Win=257 Len=0 TSval=1272945117 TSecr=4116405409
  199 2023-04-04 15:20:39.274592 10.60.248.40 → 10.235.232.7 TCP 66
[TCP Dup ACK 87#2] [TCP ACKed unseen segment] 2049 → 845 [ACK] Seq=1
Ack=2 Win=257 Len=0 TSval=2389639540 TSecr=3160187318
  200 2023-04-04 15:20:41.321912 10.235.232.7 → 10.63.20.131 TCP 74
676 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
TSval=3995341805 TSecr=0 WS=128
  201 2023-04-04 15:20:41.322226 10.63.20.131 → 10.235.232.7 TCP 74
2049 → 676 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
SACK_PERM=1 TSval=710203011 TSecr=3995341805
  202 2023-04-04 15:20:41.322258 10.235.232.7 → 10.63.20.131 TCP 66
676 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995341806
TSecr=710203011
  203 2023-04-04 15:20:41.322382 10.235.232.7 → 10.63.20.131 NFS 110
V4 NULL Call
  204 2023-04-04 15:20:41.322902 10.63.20.131 → 10.235.232.7 NFS 102
V4 NULL Reply (Call In 203)
  205 2023-04-04 15:20:41.322923 10.235.232.7 → 10.63.20.131 TCP 66
676 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995341806
TSecr=710203011
  206 2023-04-04 15:20:41.502713 10.63.20.131 → 10.235.232.7 TCP 66
2049 → 676 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=710203191
TSecr=3995341806
  207 2023-04-04 15:20:41.671253 fe80::21c:73ff:fe00:99 → ff02::1
ICMPv6 118 Router Advertisement from 00:1c:73:00:00:99
  208 2023-04-04 15:20:41.675089 fe80::250:56ff:febf:2f55 → ff02::16
  ICMPv6 90 Multicast Listener Report Message v2
  209 2023-04-04 15:20:41.684071 fe80::250:56ff:febf:2f55 → ff02::16
  ICMPv6 90 Multicast Listener Report Message v2
  210 2023-04-04 15:20:43.709117 10.235.232.130 → 10.235.235.255
BROWSER 243 Host Announcement SCS000126759-2, Workstation, Server, NT
Workstation, NT Server
  211 2023-04-04 15:20:51.561341 10.235.232.7 → 10.63.20.131 TCP 74
864 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
TSval=3995352045 TSecr=0 WS=128
  212 2023-04-04 15:20:51.561677 10.63.20.131 → 10.235.232.7 TCP 74
2049 → 864 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
SACK_PERM=1 TSval=1105202286 TSecr=3995352045
  213 2023-04-04 15:20:51.561736 10.235.232.7 → 10.63.20.131 TCP 66
864 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995352045
TSecr=1105202286
  214 2023-04-04 15:20:51.561764 10.235.232.7 → 10.63.20.131 NFS 110
V4 NULL Call
  215 2023-04-04 15:20:51.562296 10.63.20.131 → 10.235.232.7 NFS 102
V4 NULL Reply (Call In 214)
  216 2023-04-04 15:20:51.562324 10.235.232.7 → 10.63.20.131 TCP 66
864 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995352046
TSecr=1105202286
  217 2023-04-04 15:20:51.764952 10.63.20.131 → 10.235.232.7 TCP 66
2049 → 864 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=1105202486
TSecr=3995352046
  218 2023-04-04 15:21:01.801569 10.235.232.7 → 10.63.20.131 TCP 74
779 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
TSval=3995362285 TSecr=0 WS=128
  219 2023-04-04 15:21:01.801903 10.63.20.131 → 10.235.232.7 TCP 74
2049 → 779 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
SACK_PERM=1 TSval=3669197173 TSecr=3995362285
  220 2023-04-04 15:21:01.801938 10.235.232.7 → 10.63.20.131 TCP 66
779 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995362285
TSecr=3669197173
  221 2023-04-04 15:21:01.801991 10.235.232.7 → 10.63.20.131 NFS 110
V4 NULL Call
  222 2023-04-04 15:21:01.802750 10.63.20.131 → 10.235.232.7 NFS 102
V4 NULL Reply (Call In 221)
  223 2023-04-04 15:21:01.802761 10.235.232.7 → 10.63.20.131 TCP 66
779 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995362286
TSecr=3669197173
  224 2023-04-04 15:21:01.954179 10.63.20.131 → 10.235.232.7 TCP 66
2049 → 779 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=3669197323
TSecr=3995362286
  225 2023-04-04 15:21:10.142734 fe80::21c:73ff:fe00:99 → ff02::1
ICMPv6 118 Router Advertisement from 00:1c:73:00:00:99
  226 2023-04-04 15:21:10.146082 fe80::250:56ff:febf:2f55 → ff02::16
  ICMPv6 90 Multicast Listener Report Message v2
  227 2023-04-04 15:21:10.889124 fe80::250:56ff:febf:2f55 → ff02::16
  ICMPv6 90 Multicast Listener Report Message v2
  228 2023-04-04 15:21:12.041741 10.235.232.7 → 10.63.20.131 TCP 74
946 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
TSval=3995372525 TSecr=0 WS=128
  229 2023-04-04 15:21:12.042058 10.63.20.131 → 10.235.232.7 TCP 74
2049 → 946 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
SACK_PERM=1 TSval=1985286575 TSecr=3995372525
  230 2023-04-04 15:21:12.042121 10.235.232.7 → 10.63.20.131 TCP 66
946 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995372526
TSecr=1985286575
  231 2023-04-04 15:21:12.042404 10.235.232.7 → 10.63.20.131 NFS 110
V4 NULL Call
  232 2023-04-04 15:21:12.042673 10.63.20.131 → 10.235.232.7 NFS 102
V4 NULL Reply (Call In 231)
  233 2023-04-04 15:21:12.042710 10.235.232.7 → 10.63.20.131 TCP 66
946 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995372526
TSecr=1985286575
  234 2023-04-04 15:21:12.079429 10.63.20.131 → 10.235.232.7 TCP 66
[TCP Dup ACK 232#1] 2049 → 946 [ACK] Seq=37 Ack=45 Win=65792 Len=0
TSval=1985286615 TSecr=3995372526
  235 2023-04-04 15:21:12.184895 10.63.20.131 → 10.235.232.7 TCP 66
2049 → 946 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=1985286705
TSecr=3995372526
  236 2023-04-04 15:21:18.186337 10.235.232.7 → 10.60.250.78 SSH 150
Server: Encrypted packet (len=84)
  237 2023-04-04 15:21:18.186573 10.60.250.78 → 10.235.232.7 TCP 66
45496 → 22 [ACK] Seq=417 Ack=629 Win=501 Len=0 TSval=2780826399
TSecr=3765675462
  238 2023-04-04 15:21:18.187900 10.235.232.7 → 10.60.250.78 SSH 150
Server: Encrypted packet (len=84)
  239 2023-04-04 15:21:18.188065 10.60.250.78 → 10.235.232.7 TCP 66
45496 → 22 [ACK] Seq=417 Ack=713 Win=501 Len=0 TSval=2780826401
TSecr=3765675463
  240 2023-04-04 15:21:18.188154 10.235.232.7 → 10.60.250.78 SSH 150
Server: Encrypted packet (len=84)
  241 2023-04-04 15:21:18.188283 10.60.250.78 → 10.235.232.7 TCP 66
45496 → 22 [ACK] Seq=417 Ack=797 Win=501 Len=0 TSval=2780826401
TSecr=3765675464
  242 2023-04-04 15:21:18.188455 10.235.232.7 → 10.60.250.78 SSH 134
Server: Encrypted packet (len=68)
  243 2023-04-04 15:21:18.188599 10.60.250.78 → 10.235.232.7 TCP 66
45496 → 22 [ACK] Seq=417 Ack=865 Win=501 Len=0 TSval=2780826401
TSecr=3765675464
  244 2023-04-04 15:21:22.539976 fe80::21c:73ff:fe00:99 →
fe80::250:56ff:febf:5abd ICMPv6 118 Router Advertisement from
00:1c:73:00:00:99
  245 2023-04-04 15:21:22.764914 fe80::21c:73ff:fe00:99 →
fe80::250:56ff:febf:5abd ICMPv6 118 Router Advertisement from
00:1c:73:00:00:99
  246 2023-04-04 15:21:24.329146 10.235.232.7 → 10.56.8.183  TCP 66
[TCP Keep-Alive] 826 → 2049 [ACK] Seq=292 Ack=237 Win=501 Len=0
TSval=1599958283 TSecr=3825618767
  247 2023-04-04 15:21:24.329156 10.235.232.7 → 10.60.249.125 TCP 66
[TCP Keep-Alive] 732 → 2049 [ACK] Seq=2176 Ack=1857 Win=502 Len=0
TSval=1628530980 TSecr=708759507
  248 2023-04-04 15:21:24.329506 10.60.249.125 → 10.235.232.7 TCP 66
[TCP Keep-Alive ACK] 2049 → 732 [ACK] Seq=1857 Ack=2177 Win=257 Len=0
TSval=708820947 TSecr=1628344342
  249 2023-04-04 15:21:24.391262  10.56.8.183 → 10.235.232.7 TCP 66
[TCP Keep-Alive ACK] 2049 → 826 [ACK] Seq=237 Ack=293 Win=257 Len=0
TSval=3825680208 TSecr=1599771772

>
>
> >>> I thought that
> >>> would "prevent" the crash but it then proceeded to crash elsewhere.
> >>> Sorry I'm still on non-debug-symbol kernel (as I thought I'd add the
> >>> checks that that'd be enough :)).
> >>>
> >>> [  284.368211] AGLO: handshake_req_cancel sock->sk is NULL
> >>> [  284.368249] BUG: kernel NULL pointer dereference, address: 0000000000000018
> >>> [  284.368320] #PF: supervisor read access in kernel mode
> >>> [  284.368363] #PF: error_code(0x0000) - not-present page
> >>> [  284.368411] PGD 0 P4D 0
> >>> [  284.368438] Oops: 0000 [#1] PREEMPT SMP PTI
> >>> [  284.368473] CPU: 0 PID: 521 Comm: kworker/u2:30 Kdump: loaded Not
> >>> tainted 6.3.0-rc4+ #14
> >>> [  284.368523] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
> >>> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
> >>> [  284.368580] Workqueue: xprtiod xs_tls_connect [sunrpc]
> >>> [  284.368720] RIP: 0010:xs_tls_connect+0x3e5/0x590 [sunrpc]
> >>> [  284.368844] Code: ff ff ff 48 2b 83 f8 fe ff ff 48 89 83 00 ff ff
> >>> ff e8 bf d2 ff ff e9 a8 fd ff ff 49 8b 97 f8 05 00 00 66 83 bb e0 f9
> >>> ff ff 0a <48> 8b 6a 18 0f 84 fd 00 00 00 48 8d 72 18 4c 89 e7 48 89 14
> >>> 24 e8
> >>> [  284.368973] RSP: 0018:ffffba17c0f67dc8 EFLAGS: 00010293
> >>> [  284.369010] RAX: 0000000000000000 RBX: ffff963143c1fe40 RCX: 0000000000000000
> >>> [  284.369051] RDX: 0000000000000000 RSI: ffff9631f7e20840 RDI: ffff9631f7e20840
> >>> [  284.369091] RBP: ffff9631463d8a00 R08: 0000000000000000 R09: c0000000ffff7fff
> >>> [  284.369130] R10: 0000000000000001 R11: ffffba17c0f67bb8 R12: ffff963143c1f800
> >>> [  284.369175] R13: 0000000004208160 R14: ffff963143f6f600 R15: ffff963143c1e000
> >>> [  284.369259] FS:  0000000000000000(0000) GS:ffff9631f7e00000(0000)
> >>> knlGS:0000000000000000
> >>> [  284.369319] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>> [  284.369367] CR2: 0000000000000018 CR3: 0000000102d12004 CR4: 00000000007706f0
> >>> [  284.369458] PKRU: 55555554
> >>> [  284.369483] Call Trace:
> >>> [  284.369523]  <TASK>
> >>> [  284.369583]  process_one_work+0x1b0/0x3c0
> >>> [  284.369633]  worker_thread+0x34/0x360
> >>> [  284.369668]  ? __pfx_worker_thread+0x10/0x10
> >>> [  284.369704]  kthread+0xdb/0x110
> >>> [  284.369736]  ? __pfx_kthread+0x10/0x10
> >>> [  284.369767]  ret_from_fork+0x29/0x50
> >>> [  284.369808]  </TASK>
> >>> [  284.369843] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
> >>> dns_resolver nfsv3 nfs_acl nfs lockd grace fscache netfs ext4 mbcache
> >>> jbd2 loop rfkill vsock_loopback vmw_vsock_virtio_transport_common
> >>> vmw_vsock_vmci_transport vsock sunrpc vfat fat dm_multipath
> >>> intel_rapl_msr intel_rapl_common isst_if_mbox_msr isst_if_common nfit
> >>> libnvdimm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl
> >>> vmw_balloon joydev pcspkr vmw_vmci i2c_piix4 xfs libcrc32c vmwgfx
> >>> drm_ttm_helper ttm sr_mod cdrom drm_kms_helper sd_mod syscopyarea
> >>> sysfillrect ata_generic sysimgblt sg ata_piix ahci libahci
> >>> crc32c_intel serio_raw vmxnet3 libata drm vmw_pvscsi dm_mirror
> >>> dm_region_hash dm_log dm_mod nvme_tcp nvme_fabrics nvme_core t10_pi
> >>> crc64_rocksoft crc64 ipmi_devintf ipmi_msghandler fuse
> >>> [  284.370300] CR2: 0000000000000018
> >>>
> >>>
> >>>>
> >>>> BPF might also be a solution, but I'm not sure if it relies on
> >>>> debug info. Might be only BTF is needed.
> >>>>
> >>>>
> >>>>>>> [  677.275269] Code: 0f 1e fa 0f 1f 44 00 00 41 57 41 56 41 55 41 54
> >>>>>>> 55 53 48 83 ec 28 48 8b 6f 18 65 48 8b 04 25 28 00 00 00 48 89 44 24
> >>>>>>> 20 31 c0 <48> 8b 5d 30 48 89 6c 24 18 e8 97 15 5f ff 4c 8b 25 50 2e 70
> >>>>>>> 01 41
> >>>>>>> [  677.275474] RSP: 0018:ffffaed400f6fcf8 EFLAGS: 00010246
> >>>>>>> [  677.275570] RAX: 0000000000000000 RBX: ffff93f085b33800 RCX: 0000000000000000
> >>>>>>> [  677.275663] RDX: ffff93f085b33f08 RSI: ffffaed400f6fd18 RDI: ffff93f083aa6700
> >>>>>>> [  677.275756] RBP: 0000000000000000 R08: ffff93f137e32968 R09: 0000009d9663fa40
> >>>>>>> [  677.275850] R10: 0000000000000000 R11: 0000009d9663fa40 R12: ffff93f083aa6700
> >>>>>>> [  677.275943] R13: 0000000000000000 R14: ffff93f082999e00 R15: ffff93f085b33800
> >>>>>>> [  677.276049] FS:  0000000000000000(0000) GS:ffff93f137e00000(0000)
> >>>>>>> knlGS:0000000000000000
> >>>>>>> [  677.276154] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>>>>>> [  677.276247] CR2: 0000000000000030 CR3: 000000010797c002 CR4: 00000000007706f0
> >>>>>>> [  677.276355] PKRU: 55555554
> >>>>>>> [  677.276427] Call Trace:
> >>>>>>> [  677.276521]  <TASK>
> >>>>>>> [  677.276605]  ? wait_for_completion_interruptible_timeout+0xfa/0x170
> >>>>>>> [  677.276725]  xs_tls_handshake_sync+0x14f/0x170 [sunrpc]
> >>>>>>> [  677.276850]  ? __pfx_xs_tls_handshake_done+0x10/0x10 [sunrpc]
> >>>>>>> [  677.276967]  xs_tls_connect+0x14c/0x590 [sunrpc]
> >>>>>>> [  677.277086]  process_one_work+0x1b0/0x3c0
> >>>>>>> [  677.277209]  worker_thread+0x34/0x360
> >>>>>>> [  677.277303]  ? __pfx_worker_thread+0x10/0x10
> >>>>>>> [  677.277396]  kthread+0xdb/0x110
> >>>>>>> [  677.277527]  ? __pfx_kthread+0x10/0x10
> >>>>>>> [  677.277619]  ret_from_fork+0x29/0x50
> >>>>>>> [  677.277731]  </TASK>
> >>>>>>> [  677.277803] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
> >>>>>>> dns_resolver nfsv3 nfs_acl nfs lockd grace fscache netfs ext4 mbcache
> >>>>>>> jbd2 loop rfkill vsock_loopback vmw_vsock_virtio_transport_common
> >>>>>>> vmw_vsock_vmci_transport vsock sunrpc vfat fat dm_multipath
> >>>>>>> intel_rapl_msr intel_rapl_common isst_if_mbox_msr isst_if_common nfit
> >>>>>>> libnvdimm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl
> >>>>>>> vmw_balloon joydev pcspkr vmw_vmci i2c_piix4 xfs libcrc32c vmwgfx
> >>>>>>> drm_ttm_helper ttm sr_mod cdrom drm_kms_helper syscopyarea sysfillrect
> >>>>>>> sd_mod ata_generic sysimgblt sg ata_piix ahci libahci crc32c_intel
> >>>>>>> serio_raw drm vmxnet3 vmw_pvscsi libata dm_mirror dm_region_hash
> >>>>>>> dm_log dm_mod nvme_tcp nvme_fabrics nvme_core t10_pi crc64_rocksoft
> >>>>>>> crc64 ipmi_devintf ipmi_msghandler fuse
> >>>>>>> [  677.278383] CR2: 0000000000000030
> >>>>>>>
> >>>>>>> Network trace from the server:
> >>>>>>>
> >>>>>>> [kolga@scs000085536 ~/pcaps]$ tshark -t ad -r e0d_20230404_084311.trc0
> >>>>>>> 1 2023-04-04 08:43:28.027546 10.235.232.77 → 10.63.20.131 TCP 74
> >>>>>>> 727 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
> >>>>>>> TSval=4059503415 TSecr=0 WS=128
> >>>>>>> 2 2023-04-04 08:43:28.027645 10.63.20.131 → 10.235.232.77 TCP 74
> >>>>>>> 2049 → 727 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
> >>>>>>> SACK_PERM=1 TSval=2256845073 TSecr=4059503415
> >>>>>>> 3 2023-04-04 08:43:28.027813 10.235.232.77 → 10.63.20.131 TCP 66
> >>>>>>> 727 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=4059503415
> >>>>>>> TSecr=2256845073
> >>>>>>> 4 2023-04-04 08:43:28.027952 10.235.232.77 → 10.63.20.131 NFS 110
> >>>>>>> V4 NULL Call
> >>>>>>> 5 2023-04-04 08:43:28.028301 10.63.20.131 → 10.235.232.77 NFS 102
> >>>>>>> V4 NULL Reply (Call In 4)
> >>>>>>> 6 2023-04-04 08:43:28.028563 10.235.232.77 → 10.63.20.131 TCP 66
> >>>>>>> 727 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=4059503416
> >>>>>>> TSecr=2256845073
> >>>>>>> 7 2023-04-04 08:43:28.331181 10.63.20.131 → 10.235.232.77 TCP 66
> >>>>>>> 2049 → 727 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=2256845383
> >>>>>>> TSecr=4059503416
> >>>>>>>
> >>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Chuck Lever
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Chuck Lever
> >>>>
> >>>>
> >>>> --
> >>>> Chuck Lever
> >>
> >>
> >> --
> >> Chuck Lever
>
>
> --
> Chuck Lever
>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: kernel oops
  2023-04-04 19:26                   ` Olga Kornievskaia
@ 2023-04-04 19:30                     ` Chuck Lever III
  0 siblings, 0 replies; 12+ messages in thread
From: Chuck Lever III @ 2023-04-04 19:30 UTC (permalink / raw)
  To: Olga Kornievskaia; +Cc: kernel-tls-handshake



> On Apr 4, 2023, at 3:26 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> 
> On Tue, Apr 4, 2023 at 3:14 PM Chuck Lever III <chuck.lever@oracle.com> wrote:
>> 
>> 
>> 
>>> On Apr 4, 2023, at 3:11 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>> 
>>> On Tue, Apr 4, 2023 at 1:28 PM Chuck Lever III <chuck.lever@oracle.com> wrote:
>>>> 
>>>> 
>>>> 
>>>>> On Apr 4, 2023, at 12:15 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>>> 
>>>>> On Tue, Apr 4, 2023 at 10:54 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On Apr 4, 2023, at 10:44 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>>>>> 
>>>>>>> On Tue, Apr 4, 2023 at 10:32 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
>>>>>>>> 
>>>>>>>>> On Apr 4, 2023, at 10:24 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>>>>>>> 
>>>>>>>>> On Fri, Mar 31, 2023 at 10:25 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> On Mar 31, 2023, at 10:23 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Hi Chuck,
>>>>>>>>>>> 
>>>>>>>>>>> Have you seen this oops? This is not my testing so all I know was that
>>>>>>>>>>> it was an attempt to mount with xprtsec=tls (against ONTAP).
>>>>>>>>>>> 
>>>>>>>>>>> [  316.939747] BUG: kernel NULL pointer dereference, address: 0000000000000018
>>>>>>>>>>> [  316.939848] #PF: supervisor read access in kernel mode
>>>>>>>>>>> [  316.939914] #PF: error_code(0x0000) - not-present page
>>>>>>>>>>> [  316.939984] PGD 0 P4D 0
>>>>>>>>>>> [  316.940041] Oops: 0000 [#1] PREEMPT SMP PTI
>>>>>>>>>>> [  316.940109] CPU: 0 PID: 511 Comm: kworker/u2:30 Kdump: loaded Not
>>>>>>>>>>> tainted 6.2.1+ #2
>>>>>>>>>>> [  316.940181] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
>>>>>>>>>>> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
>>>>>>>>>>> [  316.940259] Workqueue: xprtiod xs_tls_connect [sunrpc]
>>>>>>>>>>> [  316.940463] RIP: 0010:xs_tls_connect+0x3e1/0x590 [sunrpc]
>>>>>>>>>>> [  316.940556] Code: ff ff ff 48 2b 83 f8 fe ff ff 48 89 83 00 ff ff
>>>>>>>>>>> ff e8 03 d3 ff ff e9 a8 fd ff ff 49 8b 97 f8 05 00 00 66 83 bb e0 f9
>>>>>>>>>>> ff ff 0a <48> 8b 6a 18 0f 84 fd 00 00 00 48 8d 72 18 4c 89 e7 48 89 14
>>>>>>>>>>> 24 e8
>>>>>>>>>>> [  316.940719] RSP: 0018:ffffa69d40f5fdc8 EFLAGS: 00010293
>>>>>>>>>>> [  316.940786] RAX: 0000000000000000 RBX: ffff9b0c46333e40 RCX: 0000000000000000
>>>>>>>>>>> [  316.940872] RDX: 0000000000000000 RSI: ffffa69d40f5fd20 RDI: 0000000000000000
>>>>>>>>>>> [  316.940940] RBP: ffff9b0c46350200 R08: ffff9b0cf7e31968 R09: 0000000000000159
>>>>>>>>>>> [  316.941008] R10: 0000000000000001 R11: 00000049c8887a40 R12: ffff9b0c46333800
>>>>>>>>>>> [  316.941075] R13: 0000000004208160 R14: ffff9b0c44e18a00 R15: ffff9b0c42dab800
>>>>>>>>>>> [  316.941159] FS:  0000000000000000(0000) GS:ffff9b0cf7e00000(0000)
>>>>>>>>>>> knlGS:0000000000000000
>>>>>>>>>>> [  316.941243] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>>>>>> [  316.941309] CR2: 0000000000000018 CR3: 0000000102e48004 CR4: 00000000007706f0
>>>>>>>>>>> [  316.941391] PKRU: 55555554
>>>>>>>>>>> [  316.941443] Call Trace:
>>>>>>>>>>> [  316.941513]  <TASK>
>>>>>>>>>>> [  316.941578]  process_one_work+0x1b0/0x3c0
>>>>>>>>>>> [  316.941689]  worker_thread+0x30/0x360
>>>>>>>>>>> [  316.941755]  ? __pfx_worker_thread+0x10/0x10
>>>>>>>>>>> [  316.941822]  kthread+0xd7/0x100
>>>>>>>>>>> [  316.941900]  ? __pfx_kthread+0x10/0x10
>>>>>>>>>>> [  316.941968]  ret_from_fork+0x29/0x50
>>>>>>>>>>> [  316.942049]  </TASK>
>>>>>>>>>>> [  316.942099] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
>>>>>>>>>>> dns_resolver ib_core nfsv3 nfs_acl nfs lockd grace fscache netfs ext4
>>>>>>>>>>> rfkill mbcache jbd2 loop vsock_loopback
>>>>>>>>>>> vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock
>>>>>>>>>>> sunrpc tls vfat fat dm_multipath intel_rapl_msr intel_rapl_common
>>>>>>>>>>> isst_if_mbox_msr isst_if_common nfit libnvdimm crct10dif_pclmul
>>>>>>>>>>> crc32_pclmul ghash_clmulni_intel rapl vmw_balloon joydev pcspkr
>>>>>>>>>>> i2c_piix4 vmw_vmci xfs libcrc32c vmwgfx drm_ttm_helper ttm sr_mod
>>>>>>>>>>> cdrom drm_kms_helper syscopyarea sd_mod sysfillrect ata_generic
>>>>>>>>>>> sysimgblt sg ahci libahci ata_piix crc32c_intel serio_raw drm vmxnet3
>>>>>>>>>>> vmw_pvscsi libata dm_mirror dm_region_hash dm_log dm_mod nvme_tcp
>>>>>>>>>>> nvme_fabrics nvme_core t10_pi crc64_rocksoft crc64 ipmi_devintf
>>>>>>>>>>> ipmi_msghandler fuse
>>>>>>>>>>> [  316.942550] CR2: 0000000000000018
>>>>>>>>>>> 
>>>>>>>>>>> Please note this was still on "commit
>>>>>>>>>>> caa52379f0046e974fd16585dd7ede624699d8e2 (HEAD -> 03152023-v7,
>>>>>>>>>>> origin/upcall-v7)". If you think somethings done to address it I'll
>>>>>>>>>>> have them update to the latest.
>>>>>>>>>> 
>>>>>>>>>> Update and retry. It has a familiar ring, but I can't say I've
>>>>>>>>>> seen this recently.
>>>>>>>>> 
>>>>>>>>> Hi Chuck,
>>>>>>>>> 
>>>>>>>>> I've updated the kernel code to the latest  commit
>>>>>>>>> 46588cade103b446f74a7ee160a83cd2a80399e7 (HEAD ->
>>>>>>>>> chuck-rpc-with-tls-04032023, chuck/topic-rpc-with-tls-upcall) but the
>>>>>>>>> client is still crashing but looks like in a different place. This
>>>>>>>>> happens when the server terminates the connection (immediately) as the
>>>>>>>>> tls handshake happens.
>>>>>>>> 
>>>>>>>> Why is the server doing that?
>>>>>>> 
>>>>>>> I'm not sure what kind of a reason you are looking for. I think the
>>>>>>> server can send an RST at any point in a TCP connection and the client
>>>>>>> shouldn't crash.
>>>>>> 
>>>>>> No doubt.
>>>>>> 
>>>>>> 
>>>>>>> But the real reason is probably because the server's
>>>>>>> code is getting developed and they ran into some error and resetting
>>>>>>> the connection.
>>>>>> 
>>>>>> That's what I was looking for. I wanted to know if it was
>>>>>> Linux NFSD doing this.
>>>>>> 
>>>>>> 
>>>>>>>>> Here's the stack trace:
>>>>>>>>> [  677.274299] BUG: kernel NULL pointer dereference, address: 0000000000000030
>>>>>>>>> [  677.274424] #PF: supervisor read access in kernel mode
>>>>>>>>> [  677.274517] #PF: error_code(0x0000) - not-present page
>>>>>>>>> [  677.274609] PGD 0 P4D 0
>>>>>>>>> [  677.274689] Oops: 0000 [#1] PREEMPT SMP PTI
>>>>>>>>> [  677.274785] CPU: 0 PID: 518 Comm: kworker/u2:29 Kdump: loaded Not
>>>>>>>>> tainted 6.3.0-rc4+ #13
>>>>>>>>> [  677.274881] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
>>>>>>>>> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
>>>>>>>>> [  677.274981] Workqueue: xprtiod xs_tls_connect [sunrpc]
>>>>>>>>> [  677.275137] RIP: 0010:handshake_req_cancel+0x2b/0x3b0
>>>>>>>> 
>>>>>>>> scripts/faddr2line on this RIP
>>>>>>>> 
>>>>>>>> Which pointer is NULL?
>>>>>>> 
>>>>>>> I apologize I don't have that info and I'm working on trying to figure
>>>>>>> out how to get this information to you. Lab kernels are built with
>>>>>>> CONFIG_DEBUG_INFO disabled and thus I can't do faddr2line. I will
>>>>>>> attempt to build on with debugging enabled (has been problematic
>>>>>>> before for the lab machines so not sure if that'll be successful I
>>>>>>> might have to hack the linux kernel to simulate the reset).
>>>>>> 
>>>>>> Jakub asked that NULL checking be removed from handshake_req_cancel,
>>>>>> or at least that's what I read into his comment. That choice is
>>>>>> definitely flushing bugs.
>>>>>> 
>>>>>> So this crasher is familiar. req_cancel is getting called in some
>>>>>> scenario where the socket or request is partially filled in. We
>>>>>> have to figure out how that's happening.
>>>>>> 
>>>>>> If debug info is a problem, then just add a sarcastic quantity of
>>>>>> NULL checks in handshake_req_cancel, rebuild, and it should tell
>>>>>> you exactly which pointer is bad.
>>>>> 
>>>>> I added a check for sock-sk null and if so return true.
>>>> 
>>>> If it returns true, xs_tls_handshake_sync() probably completes
>>>> successfully.
>>>> 
>>>> Perhaps handshake_req_cancel needs to return false in cases
>>>> where it can't find a pending request.
>>> 
>>> I'm not sure if you were suggesting me to do this but I tried:
>>> if (!sock->sk) return false;
>>> and that led to a mount not crashing but timing out.
>> 
>> That's the behavior I would expect. Was the failure immediate,
>> or did it take a few seconds for the mount command to exit?
> 
> It did take quite a long time to exit:
> 
> [root@scs000128074 kolga]# time mount -o vers=4.2,sec=sys,xprtsec=tls
> 10.63.20.131:/ /mnt
> mount.nfs: Connection timed out
> 
> real    3m0.544s
> user    0m0.004s
> sys     0m0.013s
> 
> Network trace shows numerous (16) attempts to do NULL call

The long delay is due to retries, probably driven by mount.nfs.

But looks like there's a 10 second delay between each attempt.
Maybe that's OK.


>   66 2023-04-04 15:18:17.786579 10.235.232.7 → 10.63.20.131 TCP 74
> 953 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
> TSval=3995198270 TSecr=0 WS=128
>   67 2023-04-04 15:18:17.786818 10.63.20.131 → 10.235.232.7 TCP 74
> 2049 → 953 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
> SACK_PERM=1 TSval=2815692929 TSecr=3995198270
>   68 2023-04-04 15:18:17.786871 10.235.232.7 → 10.63.20.131 TCP 66
> 953 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995198270
> TSecr=2815692929
>   69 2023-04-04 15:18:17.786951 10.235.232.7 → 10.63.20.131 NFS 110
> V4 NULL Call
>   70 2023-04-04 15:18:17.787488 10.63.20.131 → 10.235.232.7 NFS 102
> V4 NULL Reply (Call In 69)
>   71 2023-04-04 15:18:17.787520 10.235.232.7 → 10.63.20.131 TCP 66
> 953 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995198271
> TSecr=2815692929
>   72 2023-04-04 15:18:17.818060 10.235.232.7 → 10.56.8.183  TCP 66
> 826 → 2049 [ACK] Seq=293 Ack=237 Win=501 Len=0 TSval=1599771772
> TSecr=3825493593
>   73 2023-04-04 15:18:17.993341 10.63.20.131 → 10.235.232.7 TCP 66
> 2049 → 953 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=2815693129
> TSecr=3995198271
>   74 2023-04-04 15:18:19.490669 fe80::21c:73ff:fe00:99 → ff02::1
> ICMPv6 118 Router Advertisement from 00:1c:73:00:00:99
>   75 2023-04-04 15:18:19.494076 fe80::250:56ff:febf:2f55 → ff02::16
>  ICMPv6 90 Multicast Listener Report Message v2
>   76 2023-04-04 15:18:19.841191 fe80::250:56ff:febf:2f55 → ff02::16
>  ICMPv6 90 Multicast Listener Report Message v2
>   77 2023-04-04 15:18:28.201642 10.235.232.7 → 10.63.20.131 TCP 74
> 692 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
> TSval=3995208685 TSecr=0 WS=128
>   78 2023-04-04 15:18:28.202171 10.63.20.131 → 10.235.232.7 TCP 74
> 2049 → 692 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
> SACK_PERM=1 TSval=3904561029 TSecr=3995208685
>   79 2023-04-04 15:18:28.202236 10.235.232.7 → 10.63.20.131 TCP 66
> 692 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995208686
> TSecr=3904561029
>   80 2023-04-04 15:18:28.202506 10.235.232.7 → 10.63.20.131 NFS 110
> V4 NULL Call
>   81 2023-04-04 15:18:28.202882 10.63.20.131 → 10.235.232.7 NFS 102
> V4 NULL Reply (Call In 80)
>   82 2023-04-04 15:18:28.202911 10.235.232.7 → 10.63.20.131 TCP 66
> 692 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995208686
> TSecr=3904561029
>   83 2023-04-04 15:18:28.329632 10.63.20.131 → 10.235.232.7 TCP 66
> 2049 → 692 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=3904561159
> TSecr=3995208686
>   84 2023-04-04 15:18:34.857077 10.235.232.7 → 10.60.248.40 TCP 66
> 845 → 2049 [ACK] Seq=1 Ack=1 Win=501 Len=0 TSval=3160247691
> TSecr=2389454709
>   85 2023-04-04 15:18:34.857146 10.235.232.7 → 10.60.249.150 TCP 66
> 970 → 2049 [ACK] Seq=1 Ack=1 Win=502 Len=0 TSval=4116465771
> TSecr=1272760298
>   86 2023-04-04 15:18:34.857421 10.60.249.150 → 10.235.232.7 TCP 66
> [TCP ACKed unseen segment] 2049 → 970 [ACK] Seq=1 Ack=2 Win=257 Len=0
> TSval=1272820701 TSecr=4116405409
>   87 2023-04-04 15:18:34.857422 10.60.248.40 → 10.235.232.7 TCP 66
> [TCP ACKed unseen segment] 2049 → 845 [ACK] Seq=1 Ack=2 Win=257 Len=0
> TSval=2389515123 TSecr=3160187318
>   88 2023-04-04 15:18:38.441851 10.235.232.7 → 10.63.20.131 TCP 74
> 971 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
> TSval=3995218925 TSecr=0 WS=128
>   89 2023-04-04 15:18:38.442269 10.63.20.131 → 10.235.232.7 TCP 74
> 2049 → 971 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
> SACK_PERM=1 TSval=1698639803 TSecr=3995218925
>   90 2023-04-04 15:18:38.442320 10.235.232.7 → 10.63.20.131 TCP 66
> 971 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995218926
> TSecr=1698639803
>   91 2023-04-04 15:18:38.442577 10.235.232.7 → 10.63.20.131 NFS 110
> V4 NULL Call
>   92 2023-04-04 15:18:38.442894 10.63.20.131 → 10.235.232.7 NFS 102
> V4 NULL Reply (Call In 91)
>   93 2023-04-04 15:18:38.442921 10.235.232.7 → 10.63.20.131 TCP 66
> 971 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995218926
> TSecr=1698639803
>   94 2023-04-04 15:18:38.478175 10.63.20.131 → 10.235.232.7 TCP 66
> [TCP Dup ACK 92#1] 2049 → 971 [ACK] Seq=37 Ack=45 Win=65792 Len=0
> TSval=1698639843 TSecr=3995218926
>   95 2023-04-04 15:18:38.614265 10.63.20.131 → 10.235.232.7 TCP 66
> 2049 → 971 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=1698639973
> TSecr=3995218926
>   96 2023-04-04 15:18:48.681763 10.235.232.7 → 10.63.20.131 TCP 74
> 799 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
> TSval=3995229165 TSecr=0 WS=128
>   97 2023-04-04 15:18:48.682176 10.63.20.131 → 10.235.232.7 TCP 74
> 2049 → 799 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
> SACK_PERM=1 TSval=1773399631 TSecr=3995229165
>   98 2023-04-04 15:18:48.682203 10.235.232.7 → 10.63.20.131 TCP 66
> 799 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995229166
> TSecr=1773399631
>   99 2023-04-04 15:18:48.682260 10.235.232.7 → 10.63.20.131 NFS 110
> V4 NULL Call
>  100 2023-04-04 15:18:48.682751 10.63.20.131 → 10.235.232.7 NFS 102
> V4 NULL Reply (Call In 99)
>  101 2023-04-04 15:18:48.682761 10.235.232.7 → 10.63.20.131 TCP 66
> 799 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995229166
> TSecr=1773399631
>  102 2023-04-04 15:18:48.826843 10.63.20.131 → 10.235.232.7 TCP 66
> 2049 → 799 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=1773399771
> TSecr=3995229166
>  103 2023-04-04 15:18:49.514227 10.235.232.170 → 10.235.235.255
> BROWSER 243 Host Announcement WINDOWS-G95UIQM, Workstation, Server, NT
> Workstation, NT Server
>  104 2023-04-04 15:18:58.921556 10.235.232.7 → 10.63.20.131 TCP 74
> 729 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
> TSval=3995239405 TSecr=0 WS=128
>  105 2023-04-04 15:18:58.921926 10.63.20.131 → 10.235.232.7 TCP 74
> 2049 → 729 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
> SACK_PERM=1 TSval=2487545914 TSecr=3995239405
>  106 2023-04-04 15:18:58.921953 10.235.232.7 → 10.63.20.131 TCP 66
> 729 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995239405
> TSecr=2487545914
>  107 2023-04-04 15:18:58.922027 10.235.232.7 → 10.63.20.131 NFS 110
> V4 NULL Call
>  108 2023-04-04 15:18:58.922802 10.63.20.131 → 10.235.232.7 NFS 102
> V4 NULL Reply (Call In 107)
>  109 2023-04-04 15:18:58.922822 10.235.232.7 → 10.63.20.131 TCP 66
> 729 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995239406
> TSecr=2487545914
>  110 2023-04-04 15:18:59.050600 10.63.20.131 → 10.235.232.7 TCP 66
> 2049 → 729 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=2487546044
> TSecr=3995239406
>  111 2023-04-04 15:19:06.533098 10.235.232.128 → 10.235.235.255
> BROWSER 243 Host Announcement SCS000126759-1, Workstation, Server, NT
> Workstation, NT Server
>  112 2023-04-04 15:19:09.161611 10.235.232.7 → 10.63.20.131 TCP 74
> 954 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
> TSval=3995249645 TSecr=0 WS=128
>  113 2023-04-04 15:19:09.161949 10.63.20.131 → 10.235.232.7 TCP 74
> 2049 → 954 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
> SACK_PERM=1 TSval=3804453828 TSecr=3995249645
>  114 2023-04-04 15:19:09.162009 10.235.232.7 → 10.63.20.131 TCP 66
> 954 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995249645
> TSecr=3804453828
>  115 2023-04-04 15:19:09.162055 10.235.232.7 → 10.63.20.131 NFS 110
> V4 NULL Call
>  116 2023-04-04 15:19:09.162528 10.63.20.131 → 10.235.232.7 NFS 102
> V4 NULL Reply (Call In 115)
>  117 2023-04-04 15:19:09.162554 10.235.232.7 → 10.63.20.131 TCP 66
> 954 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995249646
> TSecr=3804453828
>  118 2023-04-04 15:19:09.278841 10.63.20.131 → 10.235.232.7 TCP 66
> 2049 → 954 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=3804453948
> TSecr=3995249646
>  119 2023-04-04 15:19:12.003989 Vmware_bf:32:23 → Broadcast    ARP 60
> Who has 10.235.232.1? Tell 10.235.232.42
>  120 2023-04-04 15:19:19.401546 10.235.232.7 → 10.63.20.131 TCP 74
> 728 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
> TSval=3995259885 TSecr=0 WS=128
>  121 2023-04-04 15:19:19.401902 10.63.20.131 → 10.235.232.7 TCP 74
> 2049 → 728 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
> SACK_PERM=1 TSval=2468567773 TSecr=3995259885
>  122 2023-04-04 15:19:19.401965 10.235.232.7 → 10.63.20.131 TCP 66
> 728 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995259885
> TSecr=2468567773
>  123 2023-04-04 15:19:19.402050 10.235.232.7 → 10.63.20.131 NFS 110
> V4 NULL Call
>  124 2023-04-04 15:19:19.402542 10.63.20.131 → 10.235.232.7 NFS 102
> V4 NULL Reply (Call In 123)
>  125 2023-04-04 15:19:19.402554 10.235.232.7 → 10.63.20.131 TCP 66
> 728 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995259886
> TSecr=2468567773
>  126 2023-04-04 15:19:19.595961 10.63.20.131 → 10.235.232.7 TCP 66
> 2049 → 728 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=2468567963
> TSecr=3995259886
>  127 2023-04-04 15:19:21.449077 10.235.232.7 → 10.60.249.125 TCP 66
> [TCP Keep-Alive] 732 → 2049 [ACK] Seq=2176 Ack=1857 Win=502 Len=0
> TSval=1628408100 TSecr=708634268
>  128 2023-04-04 15:19:21.449206 10.235.232.7 → 10.56.8.183  TCP 66
> [TCP Keep-Alive] 826 → 2049 [ACK] Seq=292 Ack=237 Win=501 Len=0
> TSval=1599835403 TSecr=3825493593
>  129 2023-04-04 15:19:21.449284 10.60.249.125 → 10.235.232.7 TCP 66
> [TCP Keep-Alive ACK] 2049 → 732 [ACK] Seq=1857 Ack=2177 Win=257 Len=0
> TSval=708698066 TSecr=1628344342
>  130 2023-04-04 15:19:21.511308  10.56.8.183 → 10.235.232.7 TCP 66
> [TCP Keep-Alive ACK] 2049 → 826 [ACK] Seq=237 Ack=293 Win=257 Len=0
> TSval=3825557328 TSecr=1599771772
>  131 2023-04-04 15:19:29.641659 10.235.232.7 → 10.63.20.131 TCP 74
> 924 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
> TSval=3995270125 TSecr=0 WS=128
>  132 2023-04-04 15:19:29.641987 10.63.20.131 → 10.235.232.7 TCP 74
> 2049 → 924 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
> SACK_PERM=1 TSval=4239335290 TSecr=3995270125
>  133 2023-04-04 15:19:29.642062 10.235.232.7 → 10.63.20.131 TCP 66
> 924 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995270125
> TSecr=4239335290
>  134 2023-04-04 15:19:29.642188 10.235.232.7 → 10.63.20.131 NFS 110
> V4 NULL Call
>  135 2023-04-04 15:19:29.642585 10.63.20.131 → 10.235.232.7 NFS 102
> V4 NULL Reply (Call In 134)
>  136 2023-04-04 15:19:29.642626 10.235.232.7 → 10.63.20.131 TCP 66
> 924 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995270126
> TSecr=4239335290
>  137 2023-04-04 15:19:29.752894 10.63.20.131 → 10.235.232.7 TCP 66
> 2049 → 924 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=4239335400
> TSecr=3995270126
>  138 2023-04-04 15:19:37.833079 10.235.232.7 → 10.60.248.40 TCP 66
> [TCP Dup ACK 84#1] 845 → 2049 [ACK] Seq=1 Ack=1 Win=501 Len=0
> TSval=3160310667 TSecr=2389515123
>  139 2023-04-04 15:19:37.833122 10.235.232.7 → 10.60.249.150 TCP 66
> [TCP Dup ACK 85#1] 970 → 2049 [ACK] Seq=1 Ack=1 Win=502 Len=0
> TSval=4116528747 TSecr=1272820701
>  140 2023-04-04 15:19:37.833308 10.60.248.40 → 10.235.232.7 TCP 66
> [TCP Dup ACK 87#1] [TCP ACKed unseen segment] 2049 → 845 [ACK] Seq=1
> Ack=2 Win=257 Len=0 TSval=2389578099 TSecr=3160187318
>  141 2023-04-04 15:19:37.833346 10.60.249.150 → 10.235.232.7 TCP 66
> [TCP Dup ACK 86#1] [TCP ACKed unseen segment] 2049 → 970 [ACK] Seq=1
> Ack=2 Win=257 Len=0 TSval=1272883677 TSecr=4116405409
>  142 2023-04-04 15:19:39.881507 10.235.232.7 → 10.63.20.131 TCP 74
> 788 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
> TSval=3995280365 TSecr=0 WS=128
>  143 2023-04-04 15:19:39.881784 10.63.20.131 → 10.235.232.7 TCP 74
> 2049 → 788 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
> SACK_PERM=1 TSval=2800122551 TSecr=3995280365
>  144 2023-04-04 15:19:39.881847 10.235.232.7 → 10.63.20.131 TCP 66
> 788 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995280365
> TSecr=2800122551
>  145 2023-04-04 15:19:39.881898 10.235.232.7 → 10.63.20.131 NFS 110
> V4 NULL Call
>  146 2023-04-04 15:19:39.882359 10.63.20.131 → 10.235.232.7 NFS 102
> V4 NULL Reply (Call In 145)
>  147 2023-04-04 15:19:39.882395 10.235.232.7 → 10.63.20.131 TCP 66
> 788 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995280366
> TSecr=2800122551
>  148 2023-04-04 15:19:40.017113 10.63.20.131 → 10.235.232.7 TCP 66
> 2049 → 788 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=2800122681
> TSecr=3995280366
>  149 2023-04-04 15:19:50.122254 10.235.232.7 → 10.63.20.131 TCP 74
> 868 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
> TSval=3995290606 TSecr=0 WS=128
>  150 2023-04-04 15:19:50.122643 10.63.20.131 → 10.235.232.7 TCP 74
> 2049 → 868 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
> SACK_PERM=1 TSval=1996033015 TSecr=3995290606
>  151 2023-04-04 15:19:50.122674 10.235.232.7 → 10.63.20.131 TCP 66
> 868 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995290606
> TSecr=1996033015
>  152 2023-04-04 15:19:50.122802 10.235.232.7 → 10.63.20.131 NFS 110
> V4 NULL Call
>  153 2023-04-04 15:19:50.123263 10.63.20.131 → 10.235.232.7 NFS 102
> V4 NULL Reply (Call In 152)
>  154 2023-04-04 15:19:50.123279 10.235.232.7 → 10.63.20.131 TCP 66
> 868 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995290607
> TSecr=1996033015
>  155 2023-04-04 15:19:50.268177 10.63.20.131 → 10.235.232.7 TCP 66
> 2049 → 868 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=1996033165
> TSecr=3995290607
>  156 2023-04-04 15:20:00.361440 10.235.232.7 → 10.63.20.131 TCP 74
> 988 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
> TSval=3995300845 TSecr=0 WS=128
>  157 2023-04-04 15:20:00.361776 10.63.20.131 → 10.235.232.7 TCP 74
> 2049 → 988 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
> SACK_PERM=1 TSval=4120839794 TSecr=3995300845
>  158 2023-04-04 15:20:00.361791 10.235.232.7 → 10.63.20.131 TCP 66
> 988 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995300845
> TSecr=4120839794
>  159 2023-04-04 15:20:00.361841 10.235.232.7 → 10.63.20.131 NFS 110
> V4 NULL Call
>  160 2023-04-04 15:20:00.362321 10.63.20.131 → 10.235.232.7 NFS 102
> V4 NULL Reply (Call In 159)
>  161 2023-04-04 15:20:00.362327 10.235.232.7 → 10.63.20.131 TCP 66
> 988 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995300846
> TSecr=4120839794
>  162 2023-04-04 15:20:00.479087 10.63.20.131 → 10.235.232.7 TCP 66
> 2049 → 988 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=4120839904
> TSecr=3995300846
>  163 2023-04-04 15:20:01.559017 10.235.232.173 → 10.235.235.255
> BROWSER 243 Host Announcement WINDOWS-PTEFTLL, Workstation, Server, NT
> Workstation, NT Server
>  164 2023-04-04 15:20:10.426553 10.235.232.7 → 10.60.248.183 NTP 90
> NTP Version 4, client
>  165 2023-04-04 15:20:10.426824 10.60.248.183 → 10.235.232.7 NTP 90
> NTP Version 4, server
>  166 2023-04-04 15:20:10.601625 10.235.232.7 → 10.63.20.131 TCP 74
> 888 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
> TSval=3995311085 TSecr=0 WS=128
>  167 2023-04-04 15:20:10.601919 10.63.20.131 → 10.235.232.7 TCP 74
> 2049 → 888 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
> SACK_PERM=1 TSval=531138781 TSecr=3995311085
>  168 2023-04-04 15:20:10.601978 10.235.232.7 → 10.63.20.131 TCP 66
> 888 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995311085
> TSecr=531138781
>  169 2023-04-04 15:20:10.602058 10.235.232.7 → 10.63.20.131 NFS 110
> V4 NULL Call
>  170 2023-04-04 15:20:10.602516 10.63.20.131 → 10.235.232.7 NFS 102
> V4 NULL Reply (Call In 169)
>  171 2023-04-04 15:20:10.602543 10.235.232.7 → 10.63.20.131 TCP 66
> 888 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995311086
> TSecr=531138781
>  172 2023-04-04 15:20:10.731005 10.63.20.131 → 10.235.232.7 TCP 66
> 2049 → 888 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=531138911
> TSecr=3995311086
>  173 2023-04-04 15:20:11.998469 10.235.232.7 → 10.235.48.111 NTP 90
> NTP Version 4, client
>  174 2023-04-04 15:20:11.998803 10.235.48.111 → 10.235.232.7 NTP 90
> NTP Version 4, server
>  175 2023-04-04 15:20:12.648789 10.235.232.7 → 10.227.204.142 NTP 90
> NTP Version 4, client
>  176 2023-04-04 15:20:12.649163 10.227.204.142 → 10.235.232.7 NTP 90
> NTP Version 4, server
>  177 2023-04-04 15:20:20.841429 10.235.232.7 → 10.63.20.131 TCP 74
> 893 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
> TSval=3995321325 TSecr=0 WS=128
>  178 2023-04-04 15:20:20.841673 10.63.20.131 → 10.235.232.7 TCP 74
> 2049 → 893 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
> SACK_PERM=1 TSval=1953607493 TSecr=3995321325
>  179 2023-04-04 15:20:20.841688 10.235.232.7 → 10.63.20.131 TCP 66
> 893 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995321325
> TSecr=1953607493
>  180 2023-04-04 15:20:20.841767 10.235.232.7 → 10.63.20.131 NFS 110
> V4 NULL Call
>  181 2023-04-04 15:20:20.842198 10.63.20.131 → 10.235.232.7 NFS 102
> V4 NULL Reply (Call In 180)
>  182 2023-04-04 15:20:20.842208 10.235.232.7 → 10.63.20.131 TCP 66
> 893 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995321326
> TSecr=1953607493
>  183 2023-04-04 15:20:21.051706 10.63.20.131 → 10.235.232.7 TCP 66
> 2049 → 893 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=1953607703
> TSecr=3995321326
>  184 2023-04-04 15:20:22.889123 10.235.232.7 → 10.56.8.183  TCP 66
> [TCP Keep-Alive] 826 → 2049 [ACK] Seq=292 Ack=237 Win=501 Len=0
> TSval=1599896843 TSecr=3825557328
>  185 2023-04-04 15:20:22.889130 10.235.232.7 → 10.60.249.125 TCP 66
> [TCP Keep-Alive] 732 → 2049 [ACK] Seq=2176 Ack=1857 Win=502 Len=0
> TSval=1628469540 TSecr=708698066
>  186 2023-04-04 15:20:22.889335 10.60.249.125 → 10.235.232.7 TCP 66
> [TCP Keep-Alive ACK] 2049 → 732 [ACK] Seq=1857 Ack=2177 Win=257 Len=0
> TSval=708759507 TSecr=1628344342
>  187 2023-04-04 15:20:22.951234  10.56.8.183 → 10.235.232.7 TCP 66
> [TCP Keep-Alive ACK] 2049 → 826 [ACK] Seq=237 Ack=293 Win=257 Len=0
> TSval=3825618767 TSecr=1599771772
>  188 2023-04-04 15:20:27.226390 10.235.232.131 → 10.235.235.255
> BROWSER 243 Host Announcement SCS000126759-3, Workstation, Server, NT
> Workstation, NT Server
>  189 2023-04-04 15:20:31.081384 10.235.232.7 → 10.63.20.131 TCP 74
> 1013 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
> TSval=3995331565 TSecr=0 WS=128
>  190 2023-04-04 15:20:31.081679 10.63.20.131 → 10.235.232.7 TCP 74
> 2049 → 1013 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
> SACK_PERM=1 TSval=1446866364 TSecr=3995331565
>  191 2023-04-04 15:20:31.081698 10.235.232.7 → 10.63.20.131 TCP 66
> 1013 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995331565
> TSecr=1446866364
>  192 2023-04-04 15:20:31.081745 10.235.232.7 → 10.63.20.131 NFS 110
> V4 NULL Call
>  193 2023-04-04 15:20:31.082347 10.63.20.131 → 10.235.232.7 NFS 102
> V4 NULL Reply (Call In 192)
>  194 2023-04-04 15:20:31.082353 10.235.232.7 → 10.63.20.131 TCP 66
> 1013 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995331566
> TSecr=1446866364
>  195 2023-04-04 15:20:31.216474 10.63.20.131 → 10.235.232.7 TCP 66
> 2049 → 1013 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=1446866494
> TSecr=3995331566
>  196 2023-04-04 15:20:39.273073 10.235.232.7 → 10.60.249.150 TCP 66
> [TCP Dup ACK 85#2] 970 → 2049 [ACK] Seq=1 Ack=1 Win=502 Len=0
> TSval=4116590187 TSecr=1272883677
>  197 2023-04-04 15:20:39.273146 10.235.232.7 → 10.60.248.40 TCP 66
> [TCP Dup ACK 84#2] 845 → 2049 [ACK] Seq=1 Ack=1 Win=501 Len=0
> TSval=3160372107 TSecr=2389578099
>  198 2023-04-04 15:20:39.273342 10.60.249.150 → 10.235.232.7 TCP 66
> [TCP Dup ACK 86#2] [TCP ACKed unseen segment] 2049 → 970 [ACK] Seq=1
> Ack=2 Win=257 Len=0 TSval=1272945117 TSecr=4116405409
>  199 2023-04-04 15:20:39.274592 10.60.248.40 → 10.235.232.7 TCP 66
> [TCP Dup ACK 87#2] [TCP ACKed unseen segment] 2049 → 845 [ACK] Seq=1
> Ack=2 Win=257 Len=0 TSval=2389639540 TSecr=3160187318
>  200 2023-04-04 15:20:41.321912 10.235.232.7 → 10.63.20.131 TCP 74
> 676 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
> TSval=3995341805 TSecr=0 WS=128
>  201 2023-04-04 15:20:41.322226 10.63.20.131 → 10.235.232.7 TCP 74
> 2049 → 676 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
> SACK_PERM=1 TSval=710203011 TSecr=3995341805
>  202 2023-04-04 15:20:41.322258 10.235.232.7 → 10.63.20.131 TCP 66
> 676 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995341806
> TSecr=710203011
>  203 2023-04-04 15:20:41.322382 10.235.232.7 → 10.63.20.131 NFS 110
> V4 NULL Call
>  204 2023-04-04 15:20:41.322902 10.63.20.131 → 10.235.232.7 NFS 102
> V4 NULL Reply (Call In 203)
>  205 2023-04-04 15:20:41.322923 10.235.232.7 → 10.63.20.131 TCP 66
> 676 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995341806
> TSecr=710203011
>  206 2023-04-04 15:20:41.502713 10.63.20.131 → 10.235.232.7 TCP 66
> 2049 → 676 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=710203191
> TSecr=3995341806
>  207 2023-04-04 15:20:41.671253 fe80::21c:73ff:fe00:99 → ff02::1
> ICMPv6 118 Router Advertisement from 00:1c:73:00:00:99
>  208 2023-04-04 15:20:41.675089 fe80::250:56ff:febf:2f55 → ff02::16
>  ICMPv6 90 Multicast Listener Report Message v2
>  209 2023-04-04 15:20:41.684071 fe80::250:56ff:febf:2f55 → ff02::16
>  ICMPv6 90 Multicast Listener Report Message v2
>  210 2023-04-04 15:20:43.709117 10.235.232.130 → 10.235.235.255
> BROWSER 243 Host Announcement SCS000126759-2, Workstation, Server, NT
> Workstation, NT Server
>  211 2023-04-04 15:20:51.561341 10.235.232.7 → 10.63.20.131 TCP 74
> 864 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
> TSval=3995352045 TSecr=0 WS=128
>  212 2023-04-04 15:20:51.561677 10.63.20.131 → 10.235.232.7 TCP 74
> 2049 → 864 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
> SACK_PERM=1 TSval=1105202286 TSecr=3995352045
>  213 2023-04-04 15:20:51.561736 10.235.232.7 → 10.63.20.131 TCP 66
> 864 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995352045
> TSecr=1105202286
>  214 2023-04-04 15:20:51.561764 10.235.232.7 → 10.63.20.131 NFS 110
> V4 NULL Call
>  215 2023-04-04 15:20:51.562296 10.63.20.131 → 10.235.232.7 NFS 102
> V4 NULL Reply (Call In 214)
>  216 2023-04-04 15:20:51.562324 10.235.232.7 → 10.63.20.131 TCP 66
> 864 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995352046
> TSecr=1105202286
>  217 2023-04-04 15:20:51.764952 10.63.20.131 → 10.235.232.7 TCP 66
> 2049 → 864 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=1105202486
> TSecr=3995352046
>  218 2023-04-04 15:21:01.801569 10.235.232.7 → 10.63.20.131 TCP 74
> 779 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
> TSval=3995362285 TSecr=0 WS=128
>  219 2023-04-04 15:21:01.801903 10.63.20.131 → 10.235.232.7 TCP 74
> 2049 → 779 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
> SACK_PERM=1 TSval=3669197173 TSecr=3995362285
>  220 2023-04-04 15:21:01.801938 10.235.232.7 → 10.63.20.131 TCP 66
> 779 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995362285
> TSecr=3669197173
>  221 2023-04-04 15:21:01.801991 10.235.232.7 → 10.63.20.131 NFS 110
> V4 NULL Call
>  222 2023-04-04 15:21:01.802750 10.63.20.131 → 10.235.232.7 NFS 102
> V4 NULL Reply (Call In 221)
>  223 2023-04-04 15:21:01.802761 10.235.232.7 → 10.63.20.131 TCP 66
> 779 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995362286
> TSecr=3669197173
>  224 2023-04-04 15:21:01.954179 10.63.20.131 → 10.235.232.7 TCP 66
> 2049 → 779 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=3669197323
> TSecr=3995362286
>  225 2023-04-04 15:21:10.142734 fe80::21c:73ff:fe00:99 → ff02::1
> ICMPv6 118 Router Advertisement from 00:1c:73:00:00:99
>  226 2023-04-04 15:21:10.146082 fe80::250:56ff:febf:2f55 → ff02::16
>  ICMPv6 90 Multicast Listener Report Message v2
>  227 2023-04-04 15:21:10.889124 fe80::250:56ff:febf:2f55 → ff02::16
>  ICMPv6 90 Multicast Listener Report Message v2
>  228 2023-04-04 15:21:12.041741 10.235.232.7 → 10.63.20.131 TCP 74
> 946 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
> TSval=3995372525 TSecr=0 WS=128
>  229 2023-04-04 15:21:12.042058 10.63.20.131 → 10.235.232.7 TCP 74
> 2049 → 946 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
> SACK_PERM=1 TSval=1985286575 TSecr=3995372525
>  230 2023-04-04 15:21:12.042121 10.235.232.7 → 10.63.20.131 TCP 66
> 946 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=3995372526
> TSecr=1985286575
>  231 2023-04-04 15:21:12.042404 10.235.232.7 → 10.63.20.131 NFS 110
> V4 NULL Call
>  232 2023-04-04 15:21:12.042673 10.63.20.131 → 10.235.232.7 NFS 102
> V4 NULL Reply (Call In 231)
>  233 2023-04-04 15:21:12.042710 10.235.232.7 → 10.63.20.131 TCP 66
> 946 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=3995372526
> TSecr=1985286575
>  234 2023-04-04 15:21:12.079429 10.63.20.131 → 10.235.232.7 TCP 66
> [TCP Dup ACK 232#1] 2049 → 946 [ACK] Seq=37 Ack=45 Win=65792 Len=0
> TSval=1985286615 TSecr=3995372526
>  235 2023-04-04 15:21:12.184895 10.63.20.131 → 10.235.232.7 TCP 66
> 2049 → 946 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=1985286705
> TSecr=3995372526
>  236 2023-04-04 15:21:18.186337 10.235.232.7 → 10.60.250.78 SSH 150
> Server: Encrypted packet (len=84)
>  237 2023-04-04 15:21:18.186573 10.60.250.78 → 10.235.232.7 TCP 66
> 45496 → 22 [ACK] Seq=417 Ack=629 Win=501 Len=0 TSval=2780826399
> TSecr=3765675462
>  238 2023-04-04 15:21:18.187900 10.235.232.7 → 10.60.250.78 SSH 150
> Server: Encrypted packet (len=84)
>  239 2023-04-04 15:21:18.188065 10.60.250.78 → 10.235.232.7 TCP 66
> 45496 → 22 [ACK] Seq=417 Ack=713 Win=501 Len=0 TSval=2780826401
> TSecr=3765675463
>  240 2023-04-04 15:21:18.188154 10.235.232.7 → 10.60.250.78 SSH 150
> Server: Encrypted packet (len=84)
>  241 2023-04-04 15:21:18.188283 10.60.250.78 → 10.235.232.7 TCP 66
> 45496 → 22 [ACK] Seq=417 Ack=797 Win=501 Len=0 TSval=2780826401
> TSecr=3765675464
>  242 2023-04-04 15:21:18.188455 10.235.232.7 → 10.60.250.78 SSH 134
> Server: Encrypted packet (len=68)
>  243 2023-04-04 15:21:18.188599 10.60.250.78 → 10.235.232.7 TCP 66
> 45496 → 22 [ACK] Seq=417 Ack=865 Win=501 Len=0 TSval=2780826401
> TSecr=3765675464
>  244 2023-04-04 15:21:22.539976 fe80::21c:73ff:fe00:99 →
> fe80::250:56ff:febf:5abd ICMPv6 118 Router Advertisement from
> 00:1c:73:00:00:99
>  245 2023-04-04 15:21:22.764914 fe80::21c:73ff:fe00:99 →
> fe80::250:56ff:febf:5abd ICMPv6 118 Router Advertisement from
> 00:1c:73:00:00:99
>  246 2023-04-04 15:21:24.329146 10.235.232.7 → 10.56.8.183  TCP 66
> [TCP Keep-Alive] 826 → 2049 [ACK] Seq=292 Ack=237 Win=501 Len=0
> TSval=1599958283 TSecr=3825618767
>  247 2023-04-04 15:21:24.329156 10.235.232.7 → 10.60.249.125 TCP 66
> [TCP Keep-Alive] 732 → 2049 [ACK] Seq=2176 Ack=1857 Win=502 Len=0
> TSval=1628530980 TSecr=708759507
>  248 2023-04-04 15:21:24.329506 10.60.249.125 → 10.235.232.7 TCP 66
> [TCP Keep-Alive ACK] 2049 → 732 [ACK] Seq=1857 Ack=2177 Win=257 Len=0
> TSval=708820947 TSecr=1628344342
>  249 2023-04-04 15:21:24.391262  10.56.8.183 → 10.235.232.7 TCP 66
> [TCP Keep-Alive ACK] 2049 → 826 [ACK] Seq=237 Ack=293 Win=257 Len=0
> TSval=3825680208 TSecr=1599771772
> 
>> 
>> 
>>>>> I thought that
>>>>> would "prevent" the crash but it then proceeded to crash elsewhere.
>>>>> Sorry I'm still on non-debug-symbol kernel (as I thought I'd add the
>>>>> checks that that'd be enough :)).
>>>>> 
>>>>> [  284.368211] AGLO: handshake_req_cancel sock->sk is NULL
>>>>> [  284.368249] BUG: kernel NULL pointer dereference, address: 0000000000000018
>>>>> [  284.368320] #PF: supervisor read access in kernel mode
>>>>> [  284.368363] #PF: error_code(0x0000) - not-present page
>>>>> [  284.368411] PGD 0 P4D 0
>>>>> [  284.368438] Oops: 0000 [#1] PREEMPT SMP PTI
>>>>> [  284.368473] CPU: 0 PID: 521 Comm: kworker/u2:30 Kdump: loaded Not
>>>>> tainted 6.3.0-rc4+ #14
>>>>> [  284.368523] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
>>>>> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
>>>>> [  284.368580] Workqueue: xprtiod xs_tls_connect [sunrpc]
>>>>> [  284.368720] RIP: 0010:xs_tls_connect+0x3e5/0x590 [sunrpc]
>>>>> [  284.368844] Code: ff ff ff 48 2b 83 f8 fe ff ff 48 89 83 00 ff ff
>>>>> ff e8 bf d2 ff ff e9 a8 fd ff ff 49 8b 97 f8 05 00 00 66 83 bb e0 f9
>>>>> ff ff 0a <48> 8b 6a 18 0f 84 fd 00 00 00 48 8d 72 18 4c 89 e7 48 89 14
>>>>> 24 e8
>>>>> [  284.368973] RSP: 0018:ffffba17c0f67dc8 EFLAGS: 00010293
>>>>> [  284.369010] RAX: 0000000000000000 RBX: ffff963143c1fe40 RCX: 0000000000000000
>>>>> [  284.369051] RDX: 0000000000000000 RSI: ffff9631f7e20840 RDI: ffff9631f7e20840
>>>>> [  284.369091] RBP: ffff9631463d8a00 R08: 0000000000000000 R09: c0000000ffff7fff
>>>>> [  284.369130] R10: 0000000000000001 R11: ffffba17c0f67bb8 R12: ffff963143c1f800
>>>>> [  284.369175] R13: 0000000004208160 R14: ffff963143f6f600 R15: ffff963143c1e000
>>>>> [  284.369259] FS:  0000000000000000(0000) GS:ffff9631f7e00000(0000)
>>>>> knlGS:0000000000000000
>>>>> [  284.369319] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [  284.369367] CR2: 0000000000000018 CR3: 0000000102d12004 CR4: 00000000007706f0
>>>>> [  284.369458] PKRU: 55555554
>>>>> [  284.369483] Call Trace:
>>>>> [  284.369523]  <TASK>
>>>>> [  284.369583]  process_one_work+0x1b0/0x3c0
>>>>> [  284.369633]  worker_thread+0x34/0x360
>>>>> [  284.369668]  ? __pfx_worker_thread+0x10/0x10
>>>>> [  284.369704]  kthread+0xdb/0x110
>>>>> [  284.369736]  ? __pfx_kthread+0x10/0x10
>>>>> [  284.369767]  ret_from_fork+0x29/0x50
>>>>> [  284.369808]  </TASK>
>>>>> [  284.369843] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
>>>>> dns_resolver nfsv3 nfs_acl nfs lockd grace fscache netfs ext4 mbcache
>>>>> jbd2 loop rfkill vsock_loopback vmw_vsock_virtio_transport_common
>>>>> vmw_vsock_vmci_transport vsock sunrpc vfat fat dm_multipath
>>>>> intel_rapl_msr intel_rapl_common isst_if_mbox_msr isst_if_common nfit
>>>>> libnvdimm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl
>>>>> vmw_balloon joydev pcspkr vmw_vmci i2c_piix4 xfs libcrc32c vmwgfx
>>>>> drm_ttm_helper ttm sr_mod cdrom drm_kms_helper sd_mod syscopyarea
>>>>> sysfillrect ata_generic sysimgblt sg ata_piix ahci libahci
>>>>> crc32c_intel serio_raw vmxnet3 libata drm vmw_pvscsi dm_mirror
>>>>> dm_region_hash dm_log dm_mod nvme_tcp nvme_fabrics nvme_core t10_pi
>>>>> crc64_rocksoft crc64 ipmi_devintf ipmi_msghandler fuse
>>>>> [  284.370300] CR2: 0000000000000018
>>>>> 
>>>>> 
>>>>>> 
>>>>>> BPF might also be a solution, but I'm not sure if it relies on
>>>>>> debug info. Might be only BTF is needed.
>>>>>> 
>>>>>> 
>>>>>>>>> [  677.275269] Code: 0f 1e fa 0f 1f 44 00 00 41 57 41 56 41 55 41 54
>>>>>>>>> 55 53 48 83 ec 28 48 8b 6f 18 65 48 8b 04 25 28 00 00 00 48 89 44 24
>>>>>>>>> 20 31 c0 <48> 8b 5d 30 48 89 6c 24 18 e8 97 15 5f ff 4c 8b 25 50 2e 70
>>>>>>>>> 01 41
>>>>>>>>> [  677.275474] RSP: 0018:ffffaed400f6fcf8 EFLAGS: 00010246
>>>>>>>>> [  677.275570] RAX: 0000000000000000 RBX: ffff93f085b33800 RCX: 0000000000000000
>>>>>>>>> [  677.275663] RDX: ffff93f085b33f08 RSI: ffffaed400f6fd18 RDI: ffff93f083aa6700
>>>>>>>>> [  677.275756] RBP: 0000000000000000 R08: ffff93f137e32968 R09: 0000009d9663fa40
>>>>>>>>> [  677.275850] R10: 0000000000000000 R11: 0000009d9663fa40 R12: ffff93f083aa6700
>>>>>>>>> [  677.275943] R13: 0000000000000000 R14: ffff93f082999e00 R15: ffff93f085b33800
>>>>>>>>> [  677.276049] FS:  0000000000000000(0000) GS:ffff93f137e00000(0000)
>>>>>>>>> knlGS:0000000000000000
>>>>>>>>> [  677.276154] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>>>> [  677.276247] CR2: 0000000000000030 CR3: 000000010797c002 CR4: 00000000007706f0
>>>>>>>>> [  677.276355] PKRU: 55555554
>>>>>>>>> [  677.276427] Call Trace:
>>>>>>>>> [  677.276521]  <TASK>
>>>>>>>>> [  677.276605]  ? wait_for_completion_interruptible_timeout+0xfa/0x170
>>>>>>>>> [  677.276725]  xs_tls_handshake_sync+0x14f/0x170 [sunrpc]
>>>>>>>>> [  677.276850]  ? __pfx_xs_tls_handshake_done+0x10/0x10 [sunrpc]
>>>>>>>>> [  677.276967]  xs_tls_connect+0x14c/0x590 [sunrpc]
>>>>>>>>> [  677.277086]  process_one_work+0x1b0/0x3c0
>>>>>>>>> [  677.277209]  worker_thread+0x34/0x360
>>>>>>>>> [  677.277303]  ? __pfx_worker_thread+0x10/0x10
>>>>>>>>> [  677.277396]  kthread+0xdb/0x110
>>>>>>>>> [  677.277527]  ? __pfx_kthread+0x10/0x10
>>>>>>>>> [  677.277619]  ret_from_fork+0x29/0x50
>>>>>>>>> [  677.277731]  </TASK>
>>>>>>>>> [  677.277803] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
>>>>>>>>> dns_resolver nfsv3 nfs_acl nfs lockd grace fscache netfs ext4 mbcache
>>>>>>>>> jbd2 loop rfkill vsock_loopback vmw_vsock_virtio_transport_common
>>>>>>>>> vmw_vsock_vmci_transport vsock sunrpc vfat fat dm_multipath
>>>>>>>>> intel_rapl_msr intel_rapl_common isst_if_mbox_msr isst_if_common nfit
>>>>>>>>> libnvdimm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl
>>>>>>>>> vmw_balloon joydev pcspkr vmw_vmci i2c_piix4 xfs libcrc32c vmwgfx
>>>>>>>>> drm_ttm_helper ttm sr_mod cdrom drm_kms_helper syscopyarea sysfillrect
>>>>>>>>> sd_mod ata_generic sysimgblt sg ata_piix ahci libahci crc32c_intel
>>>>>>>>> serio_raw drm vmxnet3 vmw_pvscsi libata dm_mirror dm_region_hash
>>>>>>>>> dm_log dm_mod nvme_tcp nvme_fabrics nvme_core t10_pi crc64_rocksoft
>>>>>>>>> crc64 ipmi_devintf ipmi_msghandler fuse
>>>>>>>>> [  677.278383] CR2: 0000000000000030
>>>>>>>>> 
>>>>>>>>> Network trace from the server:
>>>>>>>>> 
>>>>>>>>> [kolga@scs000085536 ~/pcaps]$ tshark -t ad -r e0d_20230404_084311.trc0
>>>>>>>>> 1 2023-04-04 08:43:28.027546 10.235.232.77 → 10.63.20.131 TCP 74
>>>>>>>>> 727 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
>>>>>>>>> TSval=4059503415 TSecr=0 WS=128
>>>>>>>>> 2 2023-04-04 08:43:28.027645 10.63.20.131 → 10.235.232.77 TCP 74
>>>>>>>>> 2049 → 727 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
>>>>>>>>> SACK_PERM=1 TSval=2256845073 TSecr=4059503415
>>>>>>>>> 3 2023-04-04 08:43:28.027813 10.235.232.77 → 10.63.20.131 TCP 66
>>>>>>>>> 727 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=4059503415
>>>>>>>>> TSecr=2256845073
>>>>>>>>> 4 2023-04-04 08:43:28.027952 10.235.232.77 → 10.63.20.131 NFS 110
>>>>>>>>> V4 NULL Call
>>>>>>>>> 5 2023-04-04 08:43:28.028301 10.63.20.131 → 10.235.232.77 NFS 102
>>>>>>>>> V4 NULL Reply (Call In 4)
>>>>>>>>> 6 2023-04-04 08:43:28.028563 10.235.232.77 → 10.63.20.131 TCP 66
>>>>>>>>> 727 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=4059503416
>>>>>>>>> TSecr=2256845073
>>>>>>>>> 7 2023-04-04 08:43:28.331181 10.63.20.131 → 10.235.232.77 TCP 66
>>>>>>>>> 2049 → 727 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=2256845383
>>>>>>>>> TSecr=4059503416
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> --
>>>>>>>>>> Chuck Lever
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Chuck Lever
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Chuck Lever
>>>> 
>>>> 
>>>> --
>>>> Chuck Lever
>> 
>> 
>> --
>> Chuck Lever


--
Chuck Lever



^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2023-04-04 19:30 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-31 14:23 kernel oops Olga Kornievskaia
2023-03-31 14:25 ` Chuck Lever III
2023-04-04 14:24   ` Olga Kornievskaia
2023-04-04 14:31     ` Chuck Lever III
2023-04-04 14:44       ` Olga Kornievskaia
2023-04-04 14:54         ` Chuck Lever III
2023-04-04 16:15           ` Olga Kornievskaia
2023-04-04 17:28             ` Chuck Lever III
2023-04-04 19:11               ` Olga Kornievskaia
2023-04-04 19:14                 ` Chuck Lever III
2023-04-04 19:26                   ` Olga Kornievskaia
2023-04-04 19:30                     ` Chuck Lever III

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).