kernel-tls-handshake.lists.linux.dev archive mirror
From: Jeff Layton <jlayton@kernel.org>
To: Chuck Lever III <chuck.lever@oracle.com>
Cc: "kernel-tls-handshake@lists.linux.dev"
	<kernel-tls-handshake@lists.linux.dev>
Subject: Re: problems getting rpc over tls to work
Date: Tue, 28 Mar 2023 09:51:13 -0400
Message-ID: <0b9c3476d90bef101da0d0f50f16e14c9ae88050.camel@kernel.org>
In-Reply-To: <D0679730-E679-4C96-BA7D-E17D307A4B8E@oracle.com>

On Tue, 2023-03-28 at 13:29 +0000, Chuck Lever III wrote:
> 
> > On Mar 28, 2023, at 8:27 AM, Jeff Layton <jlayton@kernel.org> wrote:
> > 
> > Hi Chuck!
> > 
> > I have started the packaging work for Fedora for ktls-utils:
> > 
> >    https://bugzilla.redhat.com/show_bug.cgi?id=2182151
> > 
> > I also built packages for this in copr:
> > 
> >    https://copr.fedorainfracloud.org/coprs/jlayton/ktls-utils/
> > 
> > ...and built some interim nfs-utils packages with the requisite exportfs
> > patches:
> > 
> >    https://copr.fedorainfracloud.org/coprs/jlayton/nfs-utils/
> 
> Note that the nfs-utils changes aren't necessary to support
> the kernel server in "opportunistic" mode -- the server will
> use RPC-with-TLS if a client requests it, but otherwise does
> not restrict access.
> 

Opportunistic mode doesn't seem to work for me. I created a self-signed
cert and tried to use it, but the client rejects it with this:

    Mar 28 09:01:20 nfsclnt tlshd[1092]: Certificate signer not found.

Is there a way to make it not try to validate the cert chain? Otherwise,
I guess I'll need to set up a CA and such.

> Client side also has no nfs-utils requirements at this time,
> since the new mount options are handled by the kernel.
> 
> 
> > I built a kernel from your topic-rpc-with-tls-upcall branch and
> > installed the kernel on a client and server, along with ktls-utils and
> > the updated nfs-utils on the server. I set up tlshd to run at boot on
> > both hosts. The server exports with a bog-standard set of options:
> > 
> >    /export		*(rw,insecure,no_root_squash)
> > 
> > I then tried to mount it with tls:
> > 
> >    $ sudo mount knfsd:/export /mnt/knfsd -o xprtsec=tls
> > 
> > I see the initial NULL requests go out, and then I see the client send
> > an encrypted frame to the server, and the server just shuts down the
> > socket at that point (FIN, ACK).
> > 
> > I assume that I must have something configured wrong. What am I missing?
> 
> The starting move is to crank up the debug settings in /etc/tlshd.conf...
> 
> 

Thanks. I'll try that next time.

> > Eventually, after a couple of failed mount attempts, I also hit this on
> > the client:
> > 
> > [  375.561304] BUG: kernel NULL pointer dereference, address:
> > 0000000000000030
> > [  375.564637] #PF: supervisor read access in kernel mode
> > [  375.566439] #PF: error_code(0x0000) - not-present page
> > [  375.567930] PGD 0 P4D 0 
> > [  375.568733] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > [  375.569993] CPU: 2 PID: 9 Comm: kworker/u16:0 Tainted: G            E
> > 6.3.0-rc2+ #151
> > [  375.572214] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> > 1.16.1-2.fc37 04/01/2014
> > [  375.574538] Workqueue: xprtiod xs_tls_connect [sunrpc]
> > [  375.576087] RIP: 0010:handshake_req_cancel+0x12/0x1c0
> > [  375.578255] Code: d0 5b ff eb 92 0f 1f 00 90 90 90 90 90 90 90 90 90
> > 90 90 90 90 90 90 90 0f 1f 44 00 00 41 55 41 54 55 53 48 8b 6f 18 48 89
> > ef <4c> 8b 6d 30 e8 35 fe ff ff 48 85 c0 0f 84 3e 01 00 00 4c 89 ef 48
> > [  375.583180] RSP: 0018:ffffb87540053d28 EFLAGS: 00010246
> > [  375.585416] RAX: 0000000000000000 RBX: ffff970289cb4800 RCX:
> > 0000000000000000
> > [  375.588226] RDX: 0000000000000001 RSI: ffffb87540053d00 RDI:
> > 0000000000000000
> > [  375.590389] RBP: 0000000000000000 R08: ffff970280e544a8 R09:
> > 0000000000000001
> > [  375.592601] R10: 0000000000000002 R11: 0000000000000001 R12:
> > ffff970288985480
> > [  375.594885] R13: 0000000000000000 R14: 0000000004208160 R15:
> > ffff970289cb4800
> > [  375.597051] FS:  0000000000000000(0000) GS:ffff9703f7c80000(0000)
> > knlGS:0000000000000000
> > [  375.599810] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  375.601625] CR2: 0000000000000030 CR3: 000000010aa04000 CR4:
> > 00000000003506e0
> > [  375.603785] Call Trace:
> > [  375.604754]  <TASK>
> > [  375.605651]  xs_tls_handshake_sync+0x14f/0x170 [sunrpc]
> > [  375.608998]  ? __pfx_xs_tls_handshake_done+0x10/0x10 [sunrpc]
> > [  375.610900]  xs_tls_connect+0x14a/0x5f0 [sunrpc]
> > [  375.612530]  process_one_work+0x1c8/0x3c0
> > [  375.613907]  worker_thread+0x4d/0x380
> > [  375.615189]  ? __pfx_worker_thread+0x10/0x10
> > [  375.616631]  kthread+0xe9/0x110
> > [  375.617788]  ? __pfx_kthread+0x10/0x10
> > [  375.619091]  ret_from_fork+0x2c/0x50
> > [  375.620347]  </TASK>
> > [  375.621242] Modules linked in: rpcsec_gss_krb5(E) auth_rpcgss(E)
> > nfsv4(E) dns_resolver(E) nfs(E) lockd(E) grace(E) sunrpc(E) ext4(E)
> > crc16(E) mbcache(E) jbd2(E) snd_hda_codec_generic(E) snd_hda_intel(E)
> > snd_intel_dspcfg(E) snd_hda_codec(E) snd_hwdep(E) snd_hda_core(E)
> > snd_pcm(E) kvm_amd(E) snd_timer(E) kvm(E) psmouse(E) snd(E) evdev(E)
> > irqbypass(E) virtio_balloon(E) soundcore(E) pcspkr(E) button(E) loop(E)
> > drm(E) configfs(E) zram(E) zsmalloc(E) xfs(E) libcrc32c(E)
> > crc32c_generic(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E)
> > ghash_clmulni_intel(E) sha512_ssse3(E) sha512_generic(E) virtio_net(E)
> > virtio_blk(E) net_failover(E) failover(E) virtio_console(E)
> > aesni_intel(E) serio_raw(E) crypto_simd(E) cryptd(E) virtio_pci(E)
> > virtio(E) virtio_pci_legacy_dev(E) virtio_pci_modern_dev(E)
> > virtio_ring(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E)
> > dm_multipath(E) dm_mod(E) scsi_mod(E) scsi_common(E) autofs4(E)
> > [  375.646698] CR2: 0000000000000030
> > [  375.647894] ---[ end trace 0000000000000000 ]---
> > [  375.649403] RIP: 0010:handshake_req_cancel+0x12/0x1c0
> > [  375.651062] Code: d0 5b ff eb 92 0f 1f 00 90 90 90 90 90 90 90 90 90
> > 90 90 90 90 90 90 90 0f 1f 44 00 00 41 55 41 54 55 53 48 8b 6f 18 48 89
> > ef <4c> 8b 6d 30 e8 35 fe ff ff 48 85 c0 0f 84 3e 01 00 00 4c 89 ef 48
> > [  375.654664] RSP: 0018:ffffb87540053d28 EFLAGS: 00010246
> > [  375.655447] RAX: 0000000000000000 RBX: ffff970289cb4800 RCX:
> > 0000000000000000
> > [  375.656436] RDX: 0000000000000001 RSI: ffffb87540053d00 RDI:
> > 0000000000000000
> > [  375.657425] RBP: 0000000000000000 R08: ffff970280e544a8 R09:
> > 0000000000000001
> > [  375.658392] R10: 0000000000000002 R11: 0000000000000001 R12:
> > ffff970288985480
> > [  375.659360] R13: 0000000000000000 R14: 0000000004208160 R15:
> > ffff970289cb4800
> > [  375.660324] FS:  0000000000000000(0000) GS:ffff9703f7c80000(0000)
> > knlGS:0000000000000000
> > [  375.661479] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  375.662285] CR2: 0000000000000030 CR3: 000000010aa04000 CR4:
> > 00000000003506e0
> > [  375.663278] note: kworker/u16:0[9] exited with irqs disabled
> > 
> > 
> > ...faddr2line says:
> > 
> > [jlayton@tleilax linux]$ ./scripts/faddr2line --list vmlinux
> > handshake_req_cancel+0x12/0x1c0
> > handshake_req_cancel+0x12/0x1c0:
> > 
> > read_pnet at include/net/net_namespace.h:383
> > 378 	}
> > 379 	
> > 380 	static inline struct net *read_pnet(const possible_net_t *pnet)
> > 381 	{
> > 382 	#ifdef CONFIG_NET_NS
> > > 383<		return pnet->net;
> > 384 	#else
> > 385 		return &init_net;
> > 386 	#endif
> > 387 	}
> > 388 	
> > 
> > (inlined by) sock_net at include/net/sock.h:649
> > 644 		__rcu_assign_sk_user_data_with_flags(sk, ptr, 0)
> > 645 	
> > 646 	static inline
> > 647 	struct net *sock_net(const struct sock *sk)
> > 648 	{
> > > 649<		return read_pnet(&sk->sk_net);
> > 650 	}
> > 651 	
> > 652 	static inline
> > 653 	void sock_net_set(struct sock *sk, struct net *net)
> > 654 	{
> > 
> > (inlined by) handshake_req_cancel at net/handshake/request.c:281
> > 276 		struct handshake_net *hn;
> > 277 		struct sock *sk;
> > 278 		struct net *net;
> > 279 	
> > 280 		sk = sock->sk;
> > > 281<		net = sock_net(sk);
> > 282 		req = handshake_req_hash_lookup(sk);
> > 283 		if (!req) {
> > 284 			trace_handshake_cancel_none(net, req, sk);
> > 285 			return true;
> > 286 		}
> > 
> > 
> > I'm guessing sk was NULL in handshake_req_cancel?
> 
> Jakub asked me to remove the NULL check there. But I think
> req_cancel needs to handle this case, which might happen
> due to a race.
> 

If there's a race there, is that sufficient? Could it go NULL after you
check but before the call to sock_net? Maybe I need to better understand
the race.
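
To make the question concrete, here is a minimal userspace sketch of the
"load once, then check" pattern. Everything in it is an assumption for
illustration: demo_sock, demo_socket and demo_net_id are hypothetical
stand-ins, not the kernel's struct sock, struct socket or sock_net(). It
only shows that a check made on a local snapshot of the pointer cannot be
invalidated by a later store of NULL to the shared field.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sock; only carries a namespace id. */
struct demo_sock {
	int net_id;
};

/* Hypothetical stand-in for struct socket; sk may be cleared concurrently. */
struct demo_socket {
	_Atomic(struct demo_sock *) sk;
};

/*
 * Return the namespace id, or -1 if the socket is already detached.
 * The shared pointer is loaded exactly once. Both the NULL check and the
 * dereference use that local snapshot, so a concurrent store of NULL to
 * sock->sk after the check cannot turn this into a NULL dereference.
 */
int demo_net_id(struct demo_socket *sock)
{
	struct demo_sock *sk = atomic_load(&sock->sk);

	if (!sk)
		return -1;
	return sk->net_id;
}
```

With sk pointing at an object whose net_id is 7 this returns 7, and after
the field is set to NULL it returns -1. What the snapshot does not do is
keep the pointed-to object alive: if the other side can also free it, a
NULL check alone would presumably not be sufficient, and something would
have to hold a reference or a lock across the lookup.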

-- 
Jeff Layton <jlayton@kernel.org>

