From: Jeff Layton <jlayton@kernel.org>
To: Chuck Lever III <chuck.lever@oracle.com>
Cc: "kernel-tls-handshake@lists.linux.dev"
<kernel-tls-handshake@lists.linux.dev>
Subject: Re: problems getting rpc over tls to work
Date: Tue, 28 Mar 2023 09:51:13 -0400
Message-ID: <0b9c3476d90bef101da0d0f50f16e14c9ae88050.camel@kernel.org>
In-Reply-To: <D0679730-E679-4C96-BA7D-E17D307A4B8E@oracle.com>
On Tue, 2023-03-28 at 13:29 +0000, Chuck Lever III wrote:
>
> > On Mar 28, 2023, at 8:27 AM, Jeff Layton <jlayton@kernel.org> wrote:
> >
> > Hi Chuck!
> >
> > I have started the packaging work for Fedora for ktls-utils:
> >
> > https://bugzilla.redhat.com/show_bug.cgi?id=2182151
> >
> > I also built packages for this in copr:
> >
> > https://copr.fedorainfracloud.org/coprs/jlayton/ktls-utils/
> >
> > ...and built some interim nfs-utils packages with the requisite exportfs
> > patches:
> >
> > https://copr.fedorainfracloud.org/coprs/jlayton/nfs-utils/
>
> Note that the nfs-utils changes aren't necessary to support
> the kernel server in "opportunistic" mode -- the server will
> use RPC-with-TLS if a client requests it, but otherwise does
> not restrict access.
>
Opportunistic mode doesn't seem to work for me. I created a self-signed
cert and tried to use it, but the client rejects it with this:

    Mar 28 09:01:20 nfsclnt tlshd[1092]: Certificate signer not found.

Is there a way to make it not try to validate the cert chain? Otherwise,
I guess I'll need to set up a CA and such.
> Client side also has no nfs-utils requirements at this time,
> since the new mount options are handled by the kernel.
>
>
> > I built a kernel from your topic-rpc-with-tls-upcall branch and
> > installed the kernel on a client and server, along with ktls-utils and
> > the updated nfs-utils on the server. I set up tlshd to run at boot on
> > both hosts. The server exports with a bog-standard set of options:
> >
> > /export *(rw,insecure,no_root_squash)
> >
> > I then tried to mount it with tls:
> >
> > $ sudo mount knfsd:/export /mnt/knfsd -o xprtsec=tls
> >
> > I see the initial NULL requests go out, and then I see the client send
> > an encrypted frame to the server, and the server just shuts down the
> > socket at that point (FIN, ACK).
> >
> > I assume that I must have something configured wrong. What am I missing?
>
> The starting move is to crank up the debug settings in /etc/tlshd.conf...
>
>
Thanks. I'll try that next time.
> > Eventually, after a couple of failed mount attempts, I also hit this on
> > the client:
> >
> > [ 375.561304] BUG: kernel NULL pointer dereference, address:
> > 0000000000000030
> > [ 375.564637] #PF: supervisor read access in kernel mode
> > [ 375.566439] #PF: error_code(0x0000) - not-present page
> > [ 375.567930] PGD 0 P4D 0
> > [ 375.568733] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > [ 375.569993] CPU: 2 PID: 9 Comm: kworker/u16:0 Tainted: G E
> > 6.3.0-rc2+ #151
> > [ 375.572214] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> > 1.16.1-2.fc37 04/01/2014
> > [ 375.574538] Workqueue: xprtiod xs_tls_connect [sunrpc]
> > [ 375.576087] RIP: 0010:handshake_req_cancel+0x12/0x1c0
> > [ 375.578255] Code: d0 5b ff eb 92 0f 1f 00 90 90 90 90 90 90 90 90 90
> > 90 90 90 90 90 90 90 0f 1f 44 00 00 41 55 41 54 55 53 48 8b 6f 18 48 89
> > ef <4c> 8b 6d 30 e8 35 fe ff ff 48 85 c0 0f 84 3e 01 00 00 4c 89 ef 48
> > [ 375.583180] RSP: 0018:ffffb87540053d28 EFLAGS: 00010246
> > [ 375.585416] RAX: 0000000000000000 RBX: ffff970289cb4800 RCX:
> > 0000000000000000
> > [ 375.588226] RDX: 0000000000000001 RSI: ffffb87540053d00 RDI:
> > 0000000000000000
> > [ 375.590389] RBP: 0000000000000000 R08: ffff970280e544a8 R09:
> > 0000000000000001
> > [ 375.592601] R10: 0000000000000002 R11: 0000000000000001 R12:
> > ffff970288985480
> > [ 375.594885] R13: 0000000000000000 R14: 0000000004208160 R15:
> > ffff970289cb4800
> > [ 375.597051] FS: 0000000000000000(0000) GS:ffff9703f7c80000(0000)
> > knlGS:0000000000000000
> > [ 375.599810] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 375.601625] CR2: 0000000000000030 CR3: 000000010aa04000 CR4:
> > 00000000003506e0
> > [ 375.603785] Call Trace:
> > [ 375.604754] <TASK>
> > [ 375.605651] xs_tls_handshake_sync+0x14f/0x170 [sunrpc]
> > [ 375.608998] ? __pfx_xs_tls_handshake_done+0x10/0x10 [sunrpc]
> > [ 375.610900] xs_tls_connect+0x14a/0x5f0 [sunrpc]
> > [ 375.612530] process_one_work+0x1c8/0x3c0
> > [ 375.613907] worker_thread+0x4d/0x380
> > [ 375.615189] ? __pfx_worker_thread+0x10/0x10
> > [ 375.616631] kthread+0xe9/0x110
> > [ 375.617788] ? __pfx_kthread+0x10/0x10
> > [ 375.619091] ret_from_fork+0x2c/0x50
> > [ 375.620347] </TASK>
> > [ 375.621242] Modules linked in: rpcsec_gss_krb5(E) auth_rpcgss(E)
> > nfsv4(E) dns_resolver(E) nfs(E) lockd(E) grace(E) sunrpc(E) ext4(E)
> > crc16(E) mbcache(E) jbd2(E) snd_hda_codec_generic(E) snd_hda_intel(E)
> > snd_intel_dspcfg(E) snd_hda_codec(E) snd_hwdep(E) snd_hda_core(E)
> > snd_pcm(E) kvm_amd(E) snd_timer(E) kvm(E) psmouse(E) snd(E) evdev(E)
> > irqbypass(E) virtio_balloon(E) soundcore(E) pcspkr(E) button(E) loop(E)
> > drm(E) configfs(E) zram(E) zsmalloc(E) xfs(E) libcrc32c(E)
> > crc32c_generic(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E)
> > ghash_clmulni_intel(E) sha512_ssse3(E) sha512_generic(E) virtio_net(E)
> > virtio_blk(E) net_failover(E) failover(E) virtio_console(E)
> > aesni_intel(E) serio_raw(E) crypto_simd(E) cryptd(E) virtio_pci(E)
> > virtio(E) virtio_pci_legacy_dev(E) virtio_pci_modern_dev(E)
> > virtio_ring(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E)
> > dm_multipath(E) dm_mod(E) scsi_mod(E) scsi_common(E) autofs4(E)
> > [ 375.646698] CR2: 0000000000000030
> > [ 375.647894] ---[ end trace 0000000000000000 ]---
> > [ 375.649403] RIP: 0010:handshake_req_cancel+0x12/0x1c0
> > [ 375.651062] Code: d0 5b ff eb 92 0f 1f 00 90 90 90 90 90 90 90 90 90
> > 90 90 90 90 90 90 90 0f 1f 44 00 00 41 55 41 54 55 53 48 8b 6f 18 48 89
> > ef <4c> 8b 6d 30 e8 35 fe ff ff 48 85 c0 0f 84 3e 01 00 00 4c 89 ef 48
> > [ 375.654664] RSP: 0018:ffffb87540053d28 EFLAGS: 00010246
> > [ 375.655447] RAX: 0000000000000000 RBX: ffff970289cb4800 RCX:
> > 0000000000000000
> > [ 375.656436] RDX: 0000000000000001 RSI: ffffb87540053d00 RDI:
> > 0000000000000000
> > [ 375.657425] RBP: 0000000000000000 R08: ffff970280e544a8 R09:
> > 0000000000000001
> > [ 375.658392] R10: 0000000000000002 R11: 0000000000000001 R12:
> > ffff970288985480
> > [ 375.659360] R13: 0000000000000000 R14: 0000000004208160 R15:
> > ffff970289cb4800
> > [ 375.660324] FS: 0000000000000000(0000) GS:ffff9703f7c80000(0000)
> > knlGS:0000000000000000
> > [ 375.661479] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 375.662285] CR2: 0000000000000030 CR3: 000000010aa04000 CR4:
> > 00000000003506e0
> > [ 375.663278] note: kworker/u16:0[9] exited with irqs disabled
> >
> >
> > ...faddr2line says:
> >
> > [jlayton@tleilax linux]$ ./scripts/faddr2line --list vmlinux
> > handshake_req_cancel+0x12/0x1c0
> > handshake_req_cancel+0x12/0x1c0:
> >
> > read_pnet at include/net/net_namespace.h:383
> > 378 }
> > 379
> > 380 static inline struct net *read_pnet(const possible_net_t *pnet)
> > 381 {
> > 382 #ifdef CONFIG_NET_NS
> > > 383< return pnet->net;
> > 384 #else
> > 385 return &init_net;
> > 386 #endif
> > 387 }
> > 388
> >
> > (inlined by) sock_net at include/net/sock.h:649
> > 644 __rcu_assign_sk_user_data_with_flags(sk, ptr, 0)
> > 645
> > 646 static inline
> > 647 struct net *sock_net(const struct sock *sk)
> > 648 {
> > > 649< return read_pnet(&sk->sk_net);
> > 650 }
> > 651
> > 652 static inline
> > 653 void sock_net_set(struct sock *sk, struct net *net)
> > 654 {
> >
> > (inlined by) handshake_req_cancel at net/handshake/request.c:281
> > 276 struct handshake_net *hn;
> > 277 struct sock *sk;
> > 278 struct net *net;
> > 279
> > 280 sk = sock->sk;
> > > 281< net = sock_net(sk);
> > 282 req = handshake_req_hash_lookup(sk);
> > 283 if (!req) {
> > 284 trace_handshake_cancel_none(net, req, sk);
> > 285 return true;
> > 286 }
> >
> >
> > I'm guessing sk was NULL in handshake_req_cancel?
>
> Jakub asked me to remove the NULL check there. But I think
> req_cancel needs to handle this case, which might happen
> due to a race.
>
If there's a race there, is a NULL check sufficient? Could sk go NULL
after the check but before the call to sock_net()? Maybe I need to
better understand the race.
--
Jeff Layton <jlayton@kernel.org>