kernel-tls-handshake.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Chuck Lever III <chuck.lever@oracle.com>
To: Olga Kornievskaia <aglo@umich.edu>
Cc: "kernel-tls-handshake@lists.linux.dev"
	<kernel-tls-handshake@lists.linux.dev>
Subject: Re: kernel oops
Date: Tue, 4 Apr 2023 14:54:20 +0000	[thread overview]
Message-ID: <2F8C7CC4-157D-479B-9DBB-51A4D39EEA96@oracle.com> (raw)
In-Reply-To: <CAN-5tyEzFjZx41bExgdLKuty5y7+59CJnf2vJm9=9dyAFV=H7w@mail.gmail.com>



> On Apr 4, 2023, at 10:44 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
> 
> On Tue, Apr 4, 2023 at 10:32 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
>> 
>>> On Apr 4, 2023, at 10:24 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>> 
>>> On Fri, Mar 31, 2023 at 10:25 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
>>>> 
>>>> 
>>>> 
>>>>> On Mar 31, 2023, at 10:23 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>>> 
>>>>> Hi Chuck,
>>>>> 
>>>>> Have you seen this oops? This is not my testing so all I know was that
>>>>> it was an attempt to mount with xprtsec=tls (against ONTAP).
>>>>> 
>>>>> [  316.939747] BUG: kernel NULL pointer dereference, address: 0000000000000018
>>>>> [  316.939848] #PF: supervisor read access in kernel mode
>>>>> [  316.939914] #PF: error_code(0x0000) - not-present page
>>>>> [  316.939984] PGD 0 P4D 0
>>>>> [  316.940041] Oops: 0000 [#1] PREEMPT SMP PTI
>>>>> [  316.940109] CPU: 0 PID: 511 Comm: kworker/u2:30 Kdump: loaded Not
>>>>> tainted 6.2.1+ #2
>>>>> [  316.940181] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
>>>>> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
>>>>> [  316.940259] Workqueue: xprtiod xs_tls_connect [sunrpc]
>>>>> [  316.940463] RIP: 0010:xs_tls_connect+0x3e1/0x590 [sunrpc]
>>>>> [  316.940556] Code: ff ff ff 48 2b 83 f8 fe ff ff 48 89 83 00 ff ff
>>>>> ff e8 03 d3 ff ff e9 a8 fd ff ff 49 8b 97 f8 05 00 00 66 83 bb e0 f9
>>>>> ff ff 0a <48> 8b 6a 18 0f 84 fd 00 00 00 48 8d 72 18 4c 89 e7 48 89 14
>>>>> 24 e8
>>>>> [  316.940719] RSP: 0018:ffffa69d40f5fdc8 EFLAGS: 00010293
>>>>> [  316.940786] RAX: 0000000000000000 RBX: ffff9b0c46333e40 RCX: 0000000000000000
>>>>> [  316.940872] RDX: 0000000000000000 RSI: ffffa69d40f5fd20 RDI: 0000000000000000
>>>>> [  316.940940] RBP: ffff9b0c46350200 R08: ffff9b0cf7e31968 R09: 0000000000000159
>>>>> [  316.941008] R10: 0000000000000001 R11: 00000049c8887a40 R12: ffff9b0c46333800
>>>>> [  316.941075] R13: 0000000004208160 R14: ffff9b0c44e18a00 R15: ffff9b0c42dab800
>>>>> [  316.941159] FS:  0000000000000000(0000) GS:ffff9b0cf7e00000(0000)
>>>>> knlGS:0000000000000000
>>>>> [  316.941243] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [  316.941309] CR2: 0000000000000018 CR3: 0000000102e48004 CR4: 00000000007706f0
>>>>> [  316.941391] PKRU: 55555554
>>>>> [  316.941443] Call Trace:
>>>>> [  316.941513]  <TASK>
>>>>> [  316.941578]  process_one_work+0x1b0/0x3c0
>>>>> [  316.941689]  worker_thread+0x30/0x360
>>>>> [  316.941755]  ? __pfx_worker_thread+0x10/0x10
>>>>> [  316.941822]  kthread+0xd7/0x100
>>>>> [  316.941900]  ? __pfx_kthread+0x10/0x10
>>>>> [  316.941968]  ret_from_fork+0x29/0x50
>>>>> [  316.942049]  </TASK>
>>>>> [  316.942099] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
>>>>> dns_resolver ib_core nfsv3 nfs_acl nfs lockd grace fscache netfs ext4
>>>>> rfkill mbcache jbd2 loop vsock_loopback
>>>>> vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock
>>>>> sunrpc tls vfat fat dm_multipath intel_rapl_msr intel_rapl_common
>>>>> isst_if_mbox_msr isst_if_common nfit libnvdimm crct10dif_pclmul
>>>>> crc32_pclmul ghash_clmulni_intel rapl vmw_balloon joydev pcspkr
>>>>> i2c_piix4 vmw_vmci xfs libcrc32c vmwgfx drm_ttm_helper ttm sr_mod
>>>>> cdrom drm_kms_helper syscopyarea sd_mod sysfillrect ata_generic
>>>>> sysimgblt sg ahci libahci ata_piix crc32c_intel serio_raw drm vmxnet3
>>>>> vmw_pvscsi libata dm_mirror dm_region_hash dm_log dm_mod nvme_tcp
>>>>> nvme_fabrics nvme_core t10_pi crc64_rocksoft crc64 ipmi_devintf
>>>>> ipmi_msghandler fuse
>>>>> [  316.942550] CR2: 0000000000000018
>>>>> 
>>>>> Please note this was still on "commit
>>>>> caa52379f0046e974fd16585dd7ede624699d8e2 (HEAD -> 03152023-v7,
>>>>> origin/upcall-v7)". If you think somethings done to address it I'll
>>>>> have them update to the latest.
>>>> 
>>>> Update and retry. It has a familiar ring, but I can't say I've
>>>> seen this recently.
>>> 
>>> Hi Chuck,
>>> 
>>> I've updated the kernel code to the latest  commit
>>> 46588cade103b446f74a7ee160a83cd2a80399e7 (HEAD ->
>>> chuck-rpc-with-tls-04032023, chuck/topic-rpc-with-tls-upcall) but the
>>> client is still crashing but looks like in a different place. This
>>> happens when the server terminates the connection (immediately) as the
>>> tls handshake happens.
>> 
>> Why is the server doing that?
> 
> I'm not sure what kind of a reason you are looking for. I think the
> server can send an RST at any point in a TCP connection and the client
> shouldn't crash.

No doubt.


> But the real reason is probably because the server's
> code is getting developed and they ran into some error and resetting
> the connection.

That's what I was looking for. I wanted to know if it was
Linux NFSD doing this.


>>> Here's the stack trace:
>>> [  677.274299] BUG: kernel NULL pointer dereference, address: 0000000000000030
>>> [  677.274424] #PF: supervisor read access in kernel mode
>>> [  677.274517] #PF: error_code(0x0000) - not-present page
>>> [  677.274609] PGD 0 P4D 0
>>> [  677.274689] Oops: 0000 [#1] PREEMPT SMP PTI
>>> [  677.274785] CPU: 0 PID: 518 Comm: kworker/u2:29 Kdump: loaded Not
>>> tainted 6.3.0-rc4+ #13
>>> [  677.274881] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
>>> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
>>> [  677.274981] Workqueue: xprtiod xs_tls_connect [sunrpc]
>>> [  677.275137] RIP: 0010:handshake_req_cancel+0x2b/0x3b0
>> 
>> scripts/faddr2line on this RIP
>> 
>> Which pointer is NULL?
> 
> I apologize I don't have that info and I'm working on trying to figure
> out how to get this information to you. Lab kernels are built with
> CONFIG_DEBUG_INFO disabled and thus I can't do faddr2line. I will
> attempt to build on with debugging enabled (has been problematic
> before for the lab machines so not sure if that'll be successful I
> might have to hack the linux kernel to simulate the reset).

Jakub asked that NULL checking be removed from handshake_req_cancel,
or at least that's what I read into his comment. That choice is
definitely flushing bugs.

So this crasher is familiar. req_cancel is getting called in some
scenario where the socket or request is partially filled in. We
have to figure out how that's happening.

If debug info is a problem, then just add a sarcastic quantity of
NULL checks in handshake_req_cancel, rebuild, and it should tell
you exactly which pointer is bad.

BPF might also be a solution, but I'm not sure if it relies on
debug info. Might be only BTF is needed.


>>> [  677.275269] Code: 0f 1e fa 0f 1f 44 00 00 41 57 41 56 41 55 41 54
>>> 55 53 48 83 ec 28 48 8b 6f 18 65 48 8b 04 25 28 00 00 00 48 89 44 24
>>> 20 31 c0 <48> 8b 5d 30 48 89 6c 24 18 e8 97 15 5f ff 4c 8b 25 50 2e 70
>>> 01 41
>>> [  677.275474] RSP: 0018:ffffaed400f6fcf8 EFLAGS: 00010246
>>> [  677.275570] RAX: 0000000000000000 RBX: ffff93f085b33800 RCX: 0000000000000000
>>> [  677.275663] RDX: ffff93f085b33f08 RSI: ffffaed400f6fd18 RDI: ffff93f083aa6700
>>> [  677.275756] RBP: 0000000000000000 R08: ffff93f137e32968 R09: 0000009d9663fa40
>>> [  677.275850] R10: 0000000000000000 R11: 0000009d9663fa40 R12: ffff93f083aa6700
>>> [  677.275943] R13: 0000000000000000 R14: ffff93f082999e00 R15: ffff93f085b33800
>>> [  677.276049] FS:  0000000000000000(0000) GS:ffff93f137e00000(0000)
>>> knlGS:0000000000000000
>>> [  677.276154] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [  677.276247] CR2: 0000000000000030 CR3: 000000010797c002 CR4: 00000000007706f0
>>> [  677.276355] PKRU: 55555554
>>> [  677.276427] Call Trace:
>>> [  677.276521]  <TASK>
>>> [  677.276605]  ? wait_for_completion_interruptible_timeout+0xfa/0x170
>>> [  677.276725]  xs_tls_handshake_sync+0x14f/0x170 [sunrpc]
>>> [  677.276850]  ? __pfx_xs_tls_handshake_done+0x10/0x10 [sunrpc]
>>> [  677.276967]  xs_tls_connect+0x14c/0x590 [sunrpc]
>>> [  677.277086]  process_one_work+0x1b0/0x3c0
>>> [  677.277209]  worker_thread+0x34/0x360
>>> [  677.277303]  ? __pfx_worker_thread+0x10/0x10
>>> [  677.277396]  kthread+0xdb/0x110
>>> [  677.277527]  ? __pfx_kthread+0x10/0x10
>>> [  677.277619]  ret_from_fork+0x29/0x50
>>> [  677.277731]  </TASK>
>>> [  677.277803] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
>>> dns_resolver nfsv3 nfs_acl nfs lockd grace fscache netfs ext4 mbcache
>>> jbd2 loop rfkill vsock_loopback vmw_vsock_virtio_transport_common
>>> vmw_vsock_vmci_transport vsock sunrpc vfat fat dm_multipath
>>> intel_rapl_msr intel_rapl_common isst_if_mbox_msr isst_if_common nfit
>>> libnvdimm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl
>>> vmw_balloon joydev pcspkr vmw_vmci i2c_piix4 xfs libcrc32c vmwgfx
>>> drm_ttm_helper ttm sr_mod cdrom drm_kms_helper syscopyarea sysfillrect
>>> sd_mod ata_generic sysimgblt sg ata_piix ahci libahci crc32c_intel
>>> serio_raw drm vmxnet3 vmw_pvscsi libata dm_mirror dm_region_hash
>>> dm_log dm_mod nvme_tcp nvme_fabrics nvme_core t10_pi crc64_rocksoft
>>> crc64 ipmi_devintf ipmi_msghandler fuse
>>> [  677.278383] CR2: 0000000000000030
>>> 
>>> Network trace from the server:
>>> 
>>> [kolga@scs000085536 ~/pcaps]$ tshark -t ad -r e0d_20230404_084311.trc0
>>>   1 2023-04-04 08:43:28.027546 10.235.232.77 → 10.63.20.131 TCP 74
>>> 727 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
>>> TSval=4059503415 TSecr=0 WS=128
>>>   2 2023-04-04 08:43:28.027645 10.63.20.131 → 10.235.232.77 TCP 74
>>> 2049 → 727 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
>>> SACK_PERM=1 TSval=2256845073 TSecr=4059503415
>>>   3 2023-04-04 08:43:28.027813 10.235.232.77 → 10.63.20.131 TCP 66
>>> 727 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=4059503415
>>> TSecr=2256845073
>>>   4 2023-04-04 08:43:28.027952 10.235.232.77 → 10.63.20.131 NFS 110
>>> V4 NULL Call
>>>   5 2023-04-04 08:43:28.028301 10.63.20.131 → 10.235.232.77 NFS 102
>>> V4 NULL Reply (Call In 4)
>>>   6 2023-04-04 08:43:28.028563 10.235.232.77 → 10.63.20.131 TCP 66
>>> 727 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=4059503416
>>> TSecr=2256845073
>>>   7 2023-04-04 08:43:28.331181 10.63.20.131 → 10.235.232.77 TCP 66
>>> 2049 → 727 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=2256845383
>>> TSecr=4059503416
>>> 
>>> 
>>>> 
>>>> 
>>>> --
>>>> Chuck Lever
>> 
>> 
>> --
>> Chuck Lever


--
Chuck Lever



  reply	other threads:[~2023-04-04 14:54 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-31 14:23 kernel oops Olga Kornievskaia
2023-03-31 14:25 ` Chuck Lever III
2023-04-04 14:24   ` Olga Kornievskaia
2023-04-04 14:31     ` Chuck Lever III
2023-04-04 14:44       ` Olga Kornievskaia
2023-04-04 14:54         ` Chuck Lever III [this message]
2023-04-04 16:15           ` Olga Kornievskaia
2023-04-04 17:28             ` Chuck Lever III
2023-04-04 19:11               ` Olga Kornievskaia
2023-04-04 19:14                 ` Chuck Lever III
2023-04-04 19:26                   ` Olga Kornievskaia
2023-04-04 19:30                     ` Chuck Lever III

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2F8C7CC4-157D-479B-9DBB-51A4D39EEA96@oracle.com \
    --to=chuck.lever@oracle.com \
    --cc=aglo@umich.edu \
    --cc=kernel-tls-handshake@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).