From: Olga Kornievskaia <aglo@umich.edu>
To: Chuck Lever III <chuck.lever@oracle.com>
Cc: "kernel-tls-handshake@lists.linux.dev"
	<kernel-tls-handshake@lists.linux.dev>
Subject: Re: kernel oops
Date: Tue, 4 Apr 2023 10:44:26 -0400
Message-ID: <CAN-5tyEzFjZx41bExgdLKuty5y7+59CJnf2vJm9=9dyAFV=H7w@mail.gmail.com>
In-Reply-To: <6B4F534A-3D98-4AAC-8537-2D78D4EFFEA6@oracle.com>

On Tue, Apr 4, 2023 at 10:32 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
>
>
>
> > On Apr 4, 2023, at 10:24 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
> >
> > On Fri, Mar 31, 2023 at 10:25 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
> >>
> >>
> >>
> >>> On Mar 31, 2023, at 10:23 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
> >>>
> >>> Hi Chuck,
> >>>
> >>> Have you seen this oops? This is not my testing so all I know was that
> >>> it was an attempt to mount with xprtsec=tls (against ONTAP).
> >>>
> >>> [  316.939747] BUG: kernel NULL pointer dereference, address: 0000000000000018
> >>> [  316.939848] #PF: supervisor read access in kernel mode
> >>> [  316.939914] #PF: error_code(0x0000) - not-present page
> >>> [  316.939984] PGD 0 P4D 0
> >>> [  316.940041] Oops: 0000 [#1] PREEMPT SMP PTI
> >>> [  316.940109] CPU: 0 PID: 511 Comm: kworker/u2:30 Kdump: loaded Not
> >>> tainted 6.2.1+ #2
> >>> [  316.940181] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
> >>> Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
> >>> [  316.940259] Workqueue: xprtiod xs_tls_connect [sunrpc]
> >>> [  316.940463] RIP: 0010:xs_tls_connect+0x3e1/0x590 [sunrpc]
> >>> [  316.940556] Code: ff ff ff 48 2b 83 f8 fe ff ff 48 89 83 00 ff ff
> >>> ff e8 03 d3 ff ff e9 a8 fd ff ff 49 8b 97 f8 05 00 00 66 83 bb e0 f9
> >>> ff ff 0a <48> 8b 6a 18 0f 84 fd 00 00 00 48 8d 72 18 4c 89 e7 48 89 14
> >>> 24 e8
> >>> [  316.940719] RSP: 0018:ffffa69d40f5fdc8 EFLAGS: 00010293
> >>> [  316.940786] RAX: 0000000000000000 RBX: ffff9b0c46333e40 RCX: 0000000000000000
> >>> [  316.940872] RDX: 0000000000000000 RSI: ffffa69d40f5fd20 RDI: 0000000000000000
> >>> [  316.940940] RBP: ffff9b0c46350200 R08: ffff9b0cf7e31968 R09: 0000000000000159
> >>> [  316.941008] R10: 0000000000000001 R11: 00000049c8887a40 R12: ffff9b0c46333800
> >>> [  316.941075] R13: 0000000004208160 R14: ffff9b0c44e18a00 R15: ffff9b0c42dab800
> >>> [  316.941159] FS:  0000000000000000(0000) GS:ffff9b0cf7e00000(0000)
> >>> knlGS:0000000000000000
> >>> [  316.941243] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>> [  316.941309] CR2: 0000000000000018 CR3: 0000000102e48004 CR4: 00000000007706f0
> >>> [  316.941391] PKRU: 55555554
> >>> [  316.941443] Call Trace:
> >>> [  316.941513]  <TASK>
> >>> [  316.941578]  process_one_work+0x1b0/0x3c0
> >>> [  316.941689]  worker_thread+0x30/0x360
> >>> [  316.941755]  ? __pfx_worker_thread+0x10/0x10
> >>> [  316.941822]  kthread+0xd7/0x100
> >>> [  316.941900]  ? __pfx_kthread+0x10/0x10
> >>> [  316.941968]  ret_from_fork+0x29/0x50
> >>> [  316.942049]  </TASK>
> >>> [  316.942099] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
> >>> dns_resolver ib_core nfsv3 nfs_acl nfs lockd grace fscache netfs ext4
> >>> rfkill mbcache jbd2 loop vsock_loopback
> >>> vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock
> >>> sunrpc tls vfat fat dm_multipath intel_rapl_msr intel_rapl_common
> >>> isst_if_mbox_msr isst_if_common nfit libnvdimm crct10dif_pclmul
> >>> crc32_pclmul ghash_clmulni_intel rapl vmw_balloon joydev pcspkr
> >>> i2c_piix4 vmw_vmci xfs libcrc32c vmwgfx drm_ttm_helper ttm sr_mod
> >>> cdrom drm_kms_helper syscopyarea sd_mod sysfillrect ata_generic
> >>> sysimgblt sg ahci libahci ata_piix crc32c_intel serio_raw drm vmxnet3
> >>> vmw_pvscsi libata dm_mirror dm_region_hash dm_log dm_mod nvme_tcp
> >>> nvme_fabrics nvme_core t10_pi crc64_rocksoft crc64 ipmi_devintf
> >>> ipmi_msghandler fuse
> >>> [  316.942550] CR2: 0000000000000018
> >>>
> >>> Please note this was still on "commit
> >>> caa52379f0046e974fd16585dd7ede624699d8e2 (HEAD -> 03152023-v7,
> >>> origin/upcall-v7)". If you think somethings done to address it I'll
> >>> have them update to the latest.
> >>
> >> Update and retry. It has a familiar ring, but I can't say I've
> >> seen this recently.
> >
> > Hi Chuck,
> >
> > I've updated the kernel code to the latest  commit
> > 46588cade103b446f74a7ee160a83cd2a80399e7 (HEAD ->
> > chuck-rpc-with-tls-04032023, chuck/topic-rpc-with-tls-upcall) but the
> > client is still crashing but looks like in a different place. This
> > happens when the server terminates the connection (immediately) as the
> > tls handshake happens.
>
> Why is the server doing that?

I'm not sure what kind of reason you are looking for. I think the
server can send an RST at any point in a TCP connection, and the client
shouldn't crash. But the real reason is probably that the server's
code is still being developed: they ran into some error and are
resetting the connection.

> > Here's the stack trace:
> > [  677.274299] BUG: kernel NULL pointer dereference, address: 0000000000000030
> > [  677.274424] #PF: supervisor read access in kernel mode
> > [  677.274517] #PF: error_code(0x0000) - not-present page
> > [  677.274609] PGD 0 P4D 0
> > [  677.274689] Oops: 0000 [#1] PREEMPT SMP PTI
> > [  677.274785] CPU: 0 PID: 518 Comm: kworker/u2:29 Kdump: loaded Not
> > tainted 6.3.0-rc4+ #13
> > [  677.274881] Hardware name: VMware, Inc. VMware7,1/440BX Desktop
> > Reference Platform, BIOS VMW71.00V.16707776.B64.2008070230 08/07/2020
> > [  677.274981] Workqueue: xprtiod xs_tls_connect [sunrpc]
> > [  677.275137] RIP: 0010:handshake_req_cancel+0x2b/0x3b0
>
> scripts/faddr2line on this RIP
>
> Which pointer is NULL?

I apologize, I don't have that info, and I'm trying to figure out how
to get it to you. Lab kernels are built with CONFIG_DEBUG_INFO
disabled, so I can't run faddr2line. I will attempt to build one with
debugging enabled. That has been problematic before on the lab
machines, so I'm not sure it will be successful; I might have to hack
the Linux kernel to simulate the reset.
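
Even without debug info, the Code: line in the oops narrows things
down a little. The following is a minimal sketch, not a replacement
for faddr2line: it pulls out the byte the kernel marks with <..> (the
first byte of the faulting instruction) and what follows it. The
function name faulting_bytes and the OOPS_CODE constant are made up
for illustration; the hex dump is copied from the second trace in
this thread.

```python
import re

# "Code:" line from the handshake_req_cancel oops quoted in this thread.
OOPS_CODE = (
    "[  677.275269] Code: 0f 1e fa 0f 1f 44 00 00 41 57 41 56 41 55 41 54 "
    "55 53 48 83 ec 28 48 8b 6f 18 65 48 8b 04 25 28 00 00 00 48 89 44 24 "
    "20 31 c0 <48> 8b 5d 30 48 89 6c 24 18 e8 97 15 5f ff 4c 8b 25 50 2e 70 "
    "01 41"
)


def faulting_bytes(code_text):
    """Return (index of the marked byte, bytes from that byte onward)."""
    # Everything after "Code:" is the hex dump; the timestamp is dropped.
    dump = code_text.split("Code:", 1)[1]
    opcodes = []
    fault_index = None
    for token in dump.split():
        match = re.fullmatch(r"(<)?([0-9a-fA-F]{2})(>)?", token)
        if match is None:
            continue  # skip anything that is not a hex byte, e.g. ">" quote markers
        if match.group(1) and match.group(3):
            fault_index = len(opcodes)  # <xx> marks the faulting instruction
        opcodes.append(int(match.group(2), 16))
    if fault_index is None:
        raise ValueError("no <xx> marker found in Code: line")
    return fault_index, bytes(opcodes[fault_index:])


offset, tail = faulting_bytes(OOPS_CODE)
print(offset, tail[:4].hex(" "))
```

As far as I can tell, the marked bytes 48 8b 5d 30 decode to a load
from 0x30(%rbp), which would match RBP being zero and CR2 being 0x30
in the register dump, and the earlier 48 8b 6f 18 looks like the load
of RBP from 0x18(%rdi). That is a reading of the raw bytes only;
scripts/decodecode in the kernel tree should give a proper
disassembly of the same line.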


> > [  677.275269] Code: 0f 1e fa 0f 1f 44 00 00 41 57 41 56 41 55 41 54
> > 55 53 48 83 ec 28 48 8b 6f 18 65 48 8b 04 25 28 00 00 00 48 89 44 24
> > 20 31 c0 <48> 8b 5d 30 48 89 6c 24 18 e8 97 15 5f ff 4c 8b 25 50 2e 70
> > 01 41
> > [  677.275474] RSP: 0018:ffffaed400f6fcf8 EFLAGS: 00010246
> > [  677.275570] RAX: 0000000000000000 RBX: ffff93f085b33800 RCX: 0000000000000000
> > [  677.275663] RDX: ffff93f085b33f08 RSI: ffffaed400f6fd18 RDI: ffff93f083aa6700
> > [  677.275756] RBP: 0000000000000000 R08: ffff93f137e32968 R09: 0000009d9663fa40
> > [  677.275850] R10: 0000000000000000 R11: 0000009d9663fa40 R12: ffff93f083aa6700
> > [  677.275943] R13: 0000000000000000 R14: ffff93f082999e00 R15: ffff93f085b33800
> > [  677.276049] FS:  0000000000000000(0000) GS:ffff93f137e00000(0000)
> > knlGS:0000000000000000
> > [  677.276154] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  677.276247] CR2: 0000000000000030 CR3: 000000010797c002 CR4: 00000000007706f0
> > [  677.276355] PKRU: 55555554
> > [  677.276427] Call Trace:
> > [  677.276521]  <TASK>
> > [  677.276605]  ? wait_for_completion_interruptible_timeout+0xfa/0x170
> > [  677.276725]  xs_tls_handshake_sync+0x14f/0x170 [sunrpc]
> > [  677.276850]  ? __pfx_xs_tls_handshake_done+0x10/0x10 [sunrpc]
> > [  677.276967]  xs_tls_connect+0x14c/0x590 [sunrpc]
> > [  677.277086]  process_one_work+0x1b0/0x3c0
> > [  677.277209]  worker_thread+0x34/0x360
> > [  677.277303]  ? __pfx_worker_thread+0x10/0x10
> > [  677.277396]  kthread+0xdb/0x110
> > [  677.277527]  ? __pfx_kthread+0x10/0x10
> > [  677.277619]  ret_from_fork+0x29/0x50
> > [  677.277731]  </TASK>
> > [  677.277803] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4
> > dns_resolver nfsv3 nfs_acl nfs lockd grace fscache netfs ext4 mbcache
> > jbd2 loop rfkill vsock_loopback vmw_vsock_virtio_transport_common
> > vmw_vsock_vmci_transport vsock sunrpc vfat fat dm_multipath
> > intel_rapl_msr intel_rapl_common isst_if_mbox_msr isst_if_common nfit
> > libnvdimm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl
> > vmw_balloon joydev pcspkr vmw_vmci i2c_piix4 xfs libcrc32c vmwgfx
> > drm_ttm_helper ttm sr_mod cdrom drm_kms_helper syscopyarea sysfillrect
> > sd_mod ata_generic sysimgblt sg ata_piix ahci libahci crc32c_intel
> > serio_raw drm vmxnet3 vmw_pvscsi libata dm_mirror dm_region_hash
> > dm_log dm_mod nvme_tcp nvme_fabrics nvme_core t10_pi crc64_rocksoft
> > crc64 ipmi_devintf ipmi_msghandler fuse
> > [  677.278383] CR2: 0000000000000030
> >
> > Network trace from the server:
> >
> > [kolga@scs000085536 ~/pcaps]$ tshark -t ad -r e0d_20230404_084311.trc0
> >    1 2023-04-04 08:43:28.027546 10.235.232.77 → 10.63.20.131 TCP 74
> > 727 → 2049 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 SACK_PERM=1
> > TSval=4059503415 TSecr=0 WS=128
> >    2 2023-04-04 08:43:28.027645 10.63.20.131 → 10.235.232.77 TCP 74
> > 2049 → 727 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=256
> > SACK_PERM=1 TSval=2256845073 TSecr=4059503415
> >    3 2023-04-04 08:43:28.027813 10.235.232.77 → 10.63.20.131 TCP 66
> > 727 → 2049 [ACK] Seq=1 Ack=1 Win=64256 Len=0 TSval=4059503415
> > TSecr=2256845073
> >    4 2023-04-04 08:43:28.027952 10.235.232.77 → 10.63.20.131 NFS 110
> > V4 NULL Call
> >    5 2023-04-04 08:43:28.028301 10.63.20.131 → 10.235.232.77 NFS 102
> > V4 NULL Reply (Call In 4)
> >    6 2023-04-04 08:43:28.028563 10.235.232.77 → 10.63.20.131 TCP 66
> > 727 → 2049 [ACK] Seq=45 Ack=37 Win=64256 Len=0 TSval=4059503416
> > TSecr=2256845073
> >    7 2023-04-04 08:43:28.331181 10.63.20.131 → 10.235.232.77 TCP 66
> > 2049 → 727 [RST, ACK] Seq=37 Ack=45 Win=0 Len=0 TSval=2256845383
> > TSecr=4059503416
> >
> >
> >>
> >>
> >> --
> >> Chuck Lever
>
>
> --
> Chuck Lever
>
>
