* problems getting rpc over tls to work
@ 2023-03-28 12:27 Jeff Layton
  2023-03-28 12:55 ` Jeff Layton
  2023-03-28 13:29 ` Chuck Lever III
  0 siblings, 2 replies; 26+ messages in thread
From: Jeff Layton @ 2023-03-28 12:27 UTC
  To: Chuck Lever; +Cc: kernel-tls-handshake

Hi Chuck!

I have started the packaging work for Fedora for ktls-utils:

    https://bugzilla.redhat.com/show_bug.cgi?id=2182151

I also built packages for this in copr:

    https://copr.fedorainfracloud.org/coprs/jlayton/ktls-utils/

...and built some interim nfs-utils packages with the requisite exportfs
patches:

    https://copr.fedorainfracloud.org/coprs/jlayton/nfs-utils/

I built a kernel from your topic-rpc-with-tls-upcall branch and
installed the kernel on a client and server, along with ktls-utils and
the updated nfs-utils on the server. I set up tlshd to run at boot on
both hosts. The server exports with a bog-standard set of options:

    /export *(rw,insecure,no_root_squash)

I then tried to mount it with tls:

    $ sudo mount knfsd:/export /mnt/knfsd -o xprtsec=tls

I see the initial NULL requests go out, and then I see the client send
an encrypted frame to the server, and the server just shuts down the
socket at that point (FIN, ACK).

I assume that I must have something configured wrong. What am I missing?

Eventually, after a couple of failed mount attempts, I also hit this on
the client:

[ 375.561304] BUG: kernel NULL pointer dereference, address: 0000000000000030
[ 375.564637] #PF: supervisor read access in kernel mode
[ 375.566439] #PF: error_code(0x0000) - not-present page
[ 375.567930] PGD 0 P4D 0
[ 375.568733] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 375.569993] CPU: 2 PID: 9 Comm: kworker/u16:0 Tainted: G E 6.3.0-rc2+ #151
[ 375.572214] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.1-2.fc37 04/01/2014
[ 375.574538] Workqueue: xprtiod xs_tls_connect [sunrpc]
[ 375.576087] RIP: 0010:handshake_req_cancel+0x12/0x1c0
[ 375.578255] Code: d0 5b ff eb 92 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 41 55 41 54 55 53 48 8b 6f 18 48 89 ef <4c> 8b 6d 30 e8 35 fe ff ff 48 85 c0 0f 84 3e 01 00 00 4c 89 ef 48
[ 375.583180] RSP: 0018:ffffb87540053d28 EFLAGS: 00010246
[ 375.585416] RAX: 0000000000000000 RBX: ffff970289cb4800 RCX: 0000000000000000
[ 375.588226] RDX: 0000000000000001 RSI: ffffb87540053d00 RDI: 0000000000000000
[ 375.590389] RBP: 0000000000000000 R08: ffff970280e544a8 R09: 0000000000000001
[ 375.592601] R10: 0000000000000002 R11: 0000000000000001 R12: ffff970288985480
[ 375.594885] R13: 0000000000000000 R14: 0000000004208160 R15: ffff970289cb4800
[ 375.597051] FS: 0000000000000000(0000) GS:ffff9703f7c80000(0000) knlGS:0000000000000000
[ 375.599810] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 375.601625] CR2: 0000000000000030 CR3: 000000010aa04000 CR4: 00000000003506e0
[ 375.603785] Call Trace:
[ 375.604754] <TASK>
[ 375.605651] xs_tls_handshake_sync+0x14f/0x170 [sunrpc]
[ 375.608998] ? __pfx_xs_tls_handshake_done+0x10/0x10 [sunrpc]
[ 375.610900] xs_tls_connect+0x14a/0x5f0 [sunrpc]
[ 375.612530] process_one_work+0x1c8/0x3c0
[ 375.613907] worker_thread+0x4d/0x380
[ 375.615189] ? __pfx_worker_thread+0x10/0x10
[ 375.616631] kthread+0xe9/0x110
[ 375.617788] ? __pfx_kthread+0x10/0x10
[ 375.619091] ret_from_fork+0x2c/0x50
[ 375.620347] </TASK>
[ 375.621242] Modules linked in: rpcsec_gss_krb5(E) auth_rpcgss(E) nfsv4(E) dns_resolver(E) nfs(E) lockd(E) grace(E) sunrpc(E) ext4(E) crc16(E) mbcache(E) jbd2(E) snd_hda_codec_generic(E) snd_hda_intel(E) snd_intel_dspcfg(E) snd_hda_codec(E) snd_hwdep(E) snd_hda_core(E) snd_pcm(E) kvm_amd(E) snd_timer(E) kvm(E) psmouse(E) snd(E) evdev(E) irqbypass(E) virtio_balloon(E) soundcore(E) pcspkr(E) button(E) loop(E) drm(E) configfs(E) zram(E) zsmalloc(E) xfs(E) libcrc32c(E) crc32c_generic(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) ghash_clmulni_intel(E) sha512_ssse3(E) sha512_generic(E) virtio_net(E) virtio_blk(E) net_failover(E) failover(E) virtio_console(E) aesni_intel(E) serio_raw(E) crypto_simd(E) cryptd(E) virtio_pci(E) virtio(E) virtio_pci_legacy_dev(E) virtio_pci_modern_dev(E) virtio_ring(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) dm_multipath(E) dm_mod(E) scsi_mod(E) scsi_common(E) autofs4(E)
[ 375.646698] CR2: 0000000000000030
[ 375.647894] ---[ end trace 0000000000000000 ]---
[ 375.649403] RIP: 0010:handshake_req_cancel+0x12/0x1c0
[ 375.651062] Code: d0 5b ff eb 92 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 41 55 41 54 55 53 48 8b 6f 18 48 89 ef <4c> 8b 6d 30 e8 35 fe ff ff 48 85 c0 0f 84 3e 01 00 00 4c 89 ef 48
[ 375.654664] RSP: 0018:ffffb87540053d28 EFLAGS: 00010246
[ 375.655447] RAX: 0000000000000000 RBX: ffff970289cb4800 RCX: 0000000000000000
[ 375.656436] RDX: 0000000000000001 RSI: ffffb87540053d00 RDI: 0000000000000000
[ 375.657425] RBP: 0000000000000000 R08: ffff970280e544a8 R09: 0000000000000001
[ 375.658392] R10: 0000000000000002 R11: 0000000000000001 R12: ffff970288985480
[ 375.659360] R13: 0000000000000000 R14: 0000000004208160 R15: ffff970289cb4800
[ 375.660324] FS: 0000000000000000(0000) GS:ffff9703f7c80000(0000) knlGS:0000000000000000
[ 375.661479] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 375.662285] CR2: 0000000000000030 CR3: 000000010aa04000 CR4: 00000000003506e0
[ 375.663278] note: kworker/u16:0[9] exited with irqs disabled

...faddr2line says:

[jlayton@tleilax linux]$ ./scripts/faddr2line --list vmlinux handshake_req_cancel+0x12/0x1c0
handshake_req_cancel+0x12/0x1c0:

read_pnet at include/net/net_namespace.h:383
 378    }
 379
 380    static inline struct net *read_pnet(const possible_net_t *pnet)
 381    {
 382    #ifdef CONFIG_NET_NS
>383<           return pnet->net;
 384    #else
 385            return &init_net;
 386    #endif
 387    }
 388

(inlined by) sock_net at include/net/sock.h:649
 644            __rcu_assign_sk_user_data_with_flags(sk, ptr, 0)
 645
 646    static inline
 647    struct net *sock_net(const struct sock *sk)
 648    {
>649<           return read_pnet(&sk->sk_net);
 650    }
 651
 652    static inline
 653    void sock_net_set(struct sock *sk, struct net *net)
 654    {

(inlined by) handshake_req_cancel at net/handshake/request.c:281
 276            struct handshake_net *hn;
 277            struct sock *sk;
 278            struct net *net;
 279
 280            sk = sock->sk;
>281<           net = sock_net(sk);
 282            req = handshake_req_hash_lookup(sk);
 283            if (!req) {
 284                    trace_handshake_cancel_none(net, req, sk);
 285                    return true;
 286            }

I'm guessing sk was NULL in handshake_req_cancel?
--
Jeff Layton <jlayton@kernel.org>
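The register dump is consistent with that guess. In the Code: bytes, `48 8b 6f 18` loads the pointer at offset 0x18 of the first argument into RBP, and the trapping instruction `<4c> 8b 6d 30` then reads offset 0x30 from RBP. RBP is zero in the dump, which matches the fault address 0x30. What follows is a minimal userspace sketch of the kind of guard that would avoid such a dereference. The types `mock_sock` and `mock_socket` and the function `mock_req_cancel` are stand-ins invented for illustration; this is not the kernel code, and not necessarily how the problem was addressed upstream.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct sock / struct socket; not kernel types. */
struct mock_sock {
	int net_id;		/* stands in for the sk->sk_net lookup */
};

struct mock_socket {
	struct mock_sock *sk;	/* may be NULL, e.g. after the socket is released */
};

/*
 * Sketch of a cancel path that tolerates a missing sk.
 * Returns true when there is nothing to cancel, mirroring the
 * "no request found" early return in the quoted source listing.
 */
static bool mock_req_cancel(const struct mock_socket *sock, int *net_id_out)
{
	const struct mock_sock *sk;

	/* Guard: never read through sk when the socket has no sock attached. */
	if (!sock || !sock->sk)
		return true;

	sk = sock->sk;
	if (net_id_out)
		*net_id_out = sk->net_id;	/* safe: sk is non-NULL here */

	/* A real implementation would look up and cancel the request here. */
	return false;
}
```

Whether the right place for such a check is the callee or the caller (here `xs_tls_handshake_sync`) is a design question the sketch does not settle.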
* Re: problems getting rpc over tls to work
From: Jeff Layton @ 2023-03-28 12:55 UTC
  To: Chuck Lever; +Cc: kernel-tls-handshake

On Tue, 2023-03-28 at 08:27 -0400, Jeff Layton wrote:
> Hi Chuck!
>
> I have started the packaging work for Fedora for ktls-utils:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=2182151
>
> I also built packages for this in copr:
>
> https://copr.fedorainfracloud.org/coprs/jlayton/ktls-utils/
>
> ...and built some interim nfs-utils packages with the requisite exportfs
> patches:
>
> https://copr.fedorainfracloud.org/coprs/jlayton/nfs-utils/
>
> I built a kernel from your topic-rpc-with-tls-upcall branch and
> installed the kernel on a client and server, along with ktls-utils and
> the updated nfs-utils on the server. I set up tlshd to run at boot on
> both hosts. The server exports with a bog-standard set of options:
>
> /export *(rw,insecure,no_root_squash)
>
> I then tried to mount it with tls:
>
> $ sudo mount knfsd:/export /mnt/knfsd -o xprtsec=tls
>
> I see the initial NULL requests go out, and then I see the client send
> an encrypted frame to the server, and the server just shuts down the
> socket at that point (FIN, ACK).
>
> I assume that I must have something configured wrong. What am I missing?
>
> Eventually, after a couple of failed mount attempts, I also hit this on
> the client:
>
> [...]
>
> I'm guessing sk was NULL in handshake_req_cancel?

Ahh, I think this must be the issue:

    Mar 28 08:37:46 knfsd tlshd[1324]: Default certificate not found: Key file does not have key “x509.certificate” in group “authenticate.server”
    Mar 28 08:37:46 knfsd tlshd[1324]: Handshake with 'nfsclnt.poochiereds.net' (192.168.1.136) failed

I'll go over the docs more carefully and try again.

I wonder...should we have the ktls-utils package install a self-signed cert by default?
--
Jeff Layton <jlayton@kernel.org>
* Re: problems getting rpc over tls to work
From: Chuck Lever III @ 2023-03-28 14:04 UTC
  To: Jeff Layton; +Cc: kernel-tls-handshake

> On Mar 28, 2023, at 8:55 AM, Jeff Layton <jlayton@kernel.org> wrote:
>
> I wonder...should we have the ktls-utils package install a self-signed cert by default?

So this idea is intriguing, I had some similar thoughts.

I'm not sure what the security implications of all this are.
We'd first need to look at other certificate-based packages
in Fedora to see if they offer a similar quick-setup. The
cert would have to be created at install time.

> I created a self-signed
> cert and tried to use it, but the client rejects it with this:
>
> Mar 28 09:01:20 nfsclnt tlshd[1092]: Certificate signer not found.
>
> Is there a way to make it not try to validate the cert chain?

Olga also found that self-signed server certs are not
working as we'd like. tlshd had a mechanism to force the
clients not to check the signer, but it was removed
because it was deemed insecure.

I'd like to find a way to make self-signed work seamlessly.

--
Chuck Lever
* Re: problems getting rpc over tls to work
From: Benjamin Coddington @ 2023-03-28 14:23 UTC
  To: Chuck Lever III; +Cc: Jeff Layton, kernel-tls-handshake

On 28 Mar 2023, at 10:04, Chuck Lever III wrote:

> I'd like to find a way to make self-signed work seamlessly.

You can add them to the trust store, or bring back the option to skip
verification.

Ben
* Re: problems getting rpc over tls to work
From: Jeff Layton @ 2023-03-28 14:29 UTC
  To: Chuck Lever III; +Cc: kernel-tls-handshake

On Tue, 2023-03-28 at 14:04 +0000, Chuck Lever III wrote:
>
> > On Mar 28, 2023, at 8:55 AM, Jeff Layton <jlayton@kernel.org> wrote:
> >
> > I wonder...should we have the ktls-utils package install a self-signed cert by default?
>
> So this idea is intriguing, I had some similar thoughts.
>
> I'm not sure what the security implications of all this are.
> We'd first need to look at other certificate-based packages
> in Fedora to see if they offer a similar quick-setup. The
> cert would have to be created at install time.
>

I think when apache is installed, a self-signed cert is created. You
don't have to use it, but it's what gets initially installed.

> > I created a self-signed
> > cert and tried to use it, but the client rejects it with this:
> >
> > Mar 28 09:01:20 nfsclnt tlshd[1092]: Certificate signer not found.
> >
> > Is there a way to make it not try to validate the cert chain?
>
> Olga also found that self-signed server certs are not
> working as we'd like. tlshd had a mechanism to force the
> clients not to check the signer, but it was removed
> because it was deemed insecure.
>
> I'd like to find a way to make self-signed work seamlessly.
>

Ditto. A lot of people are going to want to use TLS opportunistically
without deploying their own CA and issuing "real" certificates.

It's true that it is less secure than having full chain-of-trust, but
this seems like a case of "perfect being the enemy of good". If we don't
allow for self-signed certificates, then we've created a rather large
hurdle for anyone who wants to deploy this.

One thing we could do is reinstate the tlshd option, but still allow it
to check the signature. Then it could log something if that check fails
but still allow the connection.

We should of course document why using that option is not ideal, but
ripping it out entirely seems rather draconian. That's just going to
drive people to not use TLS at all because of the hassle factor.
--
Jeff Layton <jlayton@kernel.org>
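The "check, log, but still allow" behavior proposed above can be sketched as a small decision function. This is only an illustration under stated assumptions: `allow_handshake` and its `permissive` parameter are hypothetical names, not part of tlshd, and the real daemon would obtain the verification result from its TLS library rather than take it as a boolean.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical sketch: the certificate check always runs (its result is
 * passed in as cert_verified). With the permissive option set, a failed
 * check is logged as a warning and the handshake continues; without it,
 * a failed check rejects the handshake.
 */
static bool allow_handshake(bool cert_verified, bool permissive, FILE *log)
{
	if (cert_verified)
		return true;

	if (permissive) {
		fprintf(log, "warning: peer certificate failed verification; continuing\n");
		return true;
	}

	fprintf(log, "error: peer certificate failed verification; rejecting\n");
	return false;
}
```

The point of the design is that the failure stays visible in the log even when the connection is allowed, so an administrator can tell an unverified session from a verified one.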
* Re: problems getting rpc over tls to work
From: Olga Kornievskaia @ 2023-03-28 14:39 UTC
  To: Jeff Layton; +Cc: Chuck Lever III, kernel-tls-handshake

On Tue, Mar 28, 2023 at 10:29 AM Jeff Layton <jlayton@kernel.org> wrote:
>
> On Tue, 2023-03-28 at 14:04 +0000, Chuck Lever III wrote:
> >
> > > On Mar 28, 2023, at 8:55 AM, Jeff Layton <jlayton@kernel.org> wrote:
> > >
> > > I wonder...should we have the ktls-utils package install a self-signed cert by default?
> >
> > So this idea is intriguing, I had some similar thoughts.
> >
> > I'm not sure what the security implications of all this are.
> > We'd first need to look at other certificate-based packages
> > in Fedora to see if they offer a similar quick-setup. The
> > cert would have to be created at install time.
> >
>
> I think when apache is installed, a self-signed cert is created. You
> don't have to use it, but it's what gets initially installed.

The problem I see with the plan is that the client (which will be
installing ktlsd) needs the server's certificate (not its own). So
installing a self-signed certificate helps with having one but is far
from having a no hassle install. I think having clear steps about how
to get server's cert installed into the client's trusted CA chain in
the man page would go a long way.

> > > I created a self-signed
> > > cert and tried to use it, but the client rejects it with this:
> > >
> > > Mar 28 09:01:20 nfsclnt tlshd[1092]: Certificate signer not found.
> > >
> > > Is there a way to make it not try to validate the cert chain?
> >
> > Olga also found that self-signed server certs are not
> > working as we'd like. tlshd had a mechanism to force the
> > clients not to check the signer, but it was removed
> > because it was deemed insecure.
> >
> > I'd like to find a way to make self-signed work seamlessly.
> >
>
> Ditto. A lot of people are going to want to use TLS opportunistically
> without deploying their own CA and issuing "real" certificates.
>
> It's true that it is less secure than having full chain-of-trust, but
> this seems like a case of "perfect being the enemy of good". If we don't
> allow for self-signed certificates, then we've created a rather large
> hurdle for anyone who wants to deploy this.
>
> One thing we could do is reinstate the tlshd option, but still allow it
> to check the signature. Then it could log something if that check fails
> but still allow the connection.
>
> We should of course document why using that option is not ideal, but
> ripping it out entirely seems rather draconian. That's just going to
> drive people to not use TLS at all because of the hassle factor.

I would argue that "no verification" option should only be allowed in
some extreme cases. Like say having an option that explicitly says
it's running in a debug mode and say on the foreground only (-d -f
--noverify). Having such options might clearly state the intent is to
debug only and not run for any user usage.

I also don't see a real reason for "noverify" option except to remove
frustrations during the setup.

> --
> Jeff Layton <jlayton@kernel.org>
>
* Re: problems getting rpc over tls to work
From: Chuck Lever III @ 2023-03-28 14:45 UTC
  To: Olga Kornievskaia; +Cc: Jeff Layton, kernel-tls-handshake

> On Mar 28, 2023, at 10:39 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
>
> On Tue, Mar 28, 2023 at 10:29 AM Jeff Layton <jlayton@kernel.org> wrote:
>>
>> It's true that it is less secure than having full chain-of-trust, but
>> this seems like a case of "perfect being the enemy of good". If we don't
>> allow for self-signed certificates, then we've created a rather large
>> hurdle for anyone who wants to deploy this.
>>
>> One thing we could do is reinstate the tlshd option, but still allow it
>> to check the signature. Then it could log something if that check fails
>> but still allow the connection.
>>
>> We should of course document why using that option is not ideal, but
>> ripping it out entirely seems rather draconian. That's just going to
>> drive people to not use TLS at all because of the hassle factor.
>
> I would argue that "no verification" option should only be allowed in
> some extreme cases. Like say having an option that explicitly says
> it's running in a debug mode and say on the foreground only (-d -f
> --noverify). Having such options might clearly state the intent is to
> debug only and not run for any user usage.
>
> I also don't see a real reason for "noverify" option except to remove
> frustrations during the setup.

I might put it this way: we don't want to have customers installing
something on their clients whose out-of-the-shrinkwrap configuration
is less than secure. "no verification" is less than secure.

My preference would be to have some kind of way to get self-signed
certs working with no client-side configuration needed. If the
client mounts with "xprtsec=tls" it should work. Do we need to
plumb that into our handshake upcall and make "anonymous"
handshakes explicitly allow unrecognized signers?

--
Chuck Lever
* Re: problems getting rpc over tls to work
From: Olga Kornievskaia @ 2023-03-28 14:50 UTC
  To: Chuck Lever III; +Cc: Jeff Layton, kernel-tls-handshake

On Tue, Mar 28, 2023 at 10:45 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
>
> > On Mar 28, 2023, at 10:39 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
> >
> > On Tue, Mar 28, 2023 at 10:29 AM Jeff Layton <jlayton@kernel.org> wrote:
> >>
> >> It's true that it is less secure than having full chain-of-trust, but
> >> this seems like a case of "perfect being the enemy of good". If we don't
> >> allow for self-signed certificates, then we've created a rather large
> >> hurdle for anyone who wants to deploy this.
> >>
> >> One thing we could do is reinstate the tlshd option, but still allow it
> >> to check the signature. Then it could log something if that check fails
> >> but still allow the connection.
> >>
> >> We should of course document why using that option is not ideal, but
> >> ripping it out entirely seems rather draconian. That's just going to
> >> drive people to not use TLS at all because of the hassle factor.
> >
> > I would argue that "no verification" option should only be allowed in
> > some extreme cases. Like say having an option that explicitly says
> > it's running in a debug mode and say on the foreground only (-d -f
> > --noverify). Having such options might clearly state the intent is to
> > debug only and not run for any user usage.
> >
> > I also don't see a real reason for "noverify" option except to remove
> > frustrations during the setup.
>
> I might put it this way: we don't want to have customers installing
> something on their clients whose out-of-the-shrinkwrap configuration
> is less than secure. "no verification" is less than secure.
>
> My preference would be to have some kind of way to get self-signed
> certs working with no client-side configuration needed. If the
> client mounts with "xprtsec=tls" it should work. Do we need to
> plumb that into our handshake upcall and make "anonymous"
> handshakes explicitly allow unrecognized signers?

My vote is not allow for insecure installs (ever).

Perhaps ktlsd install on the client can prompt the user asking for
location of either server's self-signed cert or server's CA and this
way it would have everything that's needed before using it?

>
> --
> Chuck Lever
>
* Re: problems getting rpc over tls to work
From: Jeff Layton @ 2023-03-28 15:06 UTC
  To: Olga Kornievskaia, Chuck Lever III; +Cc: kernel-tls-handshake

On Tue, 2023-03-28 at 10:50 -0400, Olga Kornievskaia wrote:
> On Tue, Mar 28, 2023 at 10:45 AM Chuck Lever III <chuck.lever@oracle.com> wrote:
> >
> > > On Mar 28, 2023, at 10:39 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
> > >
> > > On Tue, Mar 28, 2023 at 10:29 AM Jeff Layton <jlayton@kernel.org> wrote:
> > > >
> > > > It's true that it is less secure than having full chain-of-trust, but
> > > > this seems like a case of "perfect being the enemy of good". If we don't
> > > > allow for self-signed certificates, then we've created a rather large
> > > > hurdle for anyone who wants to deploy this.
> > > >
> > > > One thing we could do is reinstate the tlshd option, but still allow it
> > > > to check the signature. Then it could log something if that check fails
> > > > but still allow the connection.
> > > >
> > > > We should of course document why using that option is not ideal, but
> > > > ripping it out entirely seems rather draconian. That's just going to
> > > > drive people to not use TLS at all because of the hassle factor.
> > >
> > > I would argue that "no verification" option should only be allowed in
> > > some extreme cases. Like say having an option that explicitly says
> > > it's running in a debug mode and say on the foreground only (-d -f
> > > --noverify). Having such options might clearly state the intent is to
> > > debug only and not run for any user usage.
> > >
> > > I also don't see a real reason for "noverify" option except to remove
> > > frustrations during the setup.
> >
> > I might put it this way: we don't want to have customers installing
> > something on their clients whose out-of-the-shrinkwrap configuration
> > is less than secure. "no verification" is less than secure.
> >
> > My preference would be to have some kind of way to get self-signed
> > certs working with no client-side configuration needed. If the
> > client mounts with "xprtsec=tls" it should work. Do we need to
> > plumb that into our handshake upcall and make "anonymous"
> > handshakes explicitly allow unrecognized signers?
>
> My vote is not allow for insecure installs (ever).
>

Is it really better to force people into plaintext connections? I very
much disagree here. Raise your hand if you've never used cURL with
"--insecure" or told Mozilla to accept a bogus cert.

> Perhaps ktlsd install on the client can prompt the user asking for
> location of either server's self-signed cert or server's CA and this
> way it would have everything that's needed before using it?
>

NAK. Interactive package installs are no bueno.
--
Jeff Layton <jlayton@kernel.org>
* Re: problems getting rpc over tls to work
From: Jeff Layton @ 2023-03-28 15:03 UTC
  To: Chuck Lever III, Olga Kornievskaia; +Cc: kernel-tls-handshake

On Tue, 2023-03-28 at 14:45 +0000, Chuck Lever III wrote:
>
> > On Mar 28, 2023, at 10:39 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
> >
> > On Tue, Mar 28, 2023 at 10:29 AM Jeff Layton <jlayton@kernel.org> wrote:
> > >
> > > It's true that it is less secure than having full chain-of-trust, but
> > > this seems like a case of "perfect being the enemy of good". If we don't
> > > allow for self-signed certificates, then we've created a rather large
> > > hurdle for anyone who wants to deploy this.
> > >
> > > One thing we could do is reinstate the tlshd option, but still allow it
> > > to check the signature. Then it could log something if that check fails
> > > but still allow the connection.
> > >
> > > We should of course document why using that option is not ideal, but
> > > ripping it out entirely seems rather draconian. That's just going to
> > > drive people to not use TLS at all because of the hassle factor.
> >
> > I would argue that "no verification" option should only be allowed in
> > some extreme cases. Like say having an option that explicitly says
> > it's running in a debug mode and say on the foreground only (-d -f
> > --noverify). Having such options might clearly state the intent is to
> > debug only and not run for any user usage.
> >
> > I also don't see a real reason for "noverify" option except to remove
> > frustrations during the setup.
>
> I might put it this way: we don't want to have customers installing
> something on their clients whose out-of-the-shrinkwrap configuration
> is less than secure. "no verification" is less than secure.
>
> My preference would be to have some kind of way to get self-signed
> certs working with no client-side configuration needed. If the
> client mounts with "xprtsec=tls" it should work. Do we need to
> plumb that into our handshake upcall and make "anonymous"
> handshakes explicitly allow unrecognized signers?
>

Since the client is the side that's rejecting things, having a mount
option that allows you to relax that check seems like the right
approach.

How about a new xprtsec= option? Maybe "xprtsec=nvtls" (no verify TLS)?
That would allow things to work out of the box, but still leave
xprtsec=tls as the more secure method.
--
Jeff Layton <jlayton@kernel.org>
* Re: problems getting rpc over tls to work
From: Chuck Lever III @ 2023-03-28 15:05 UTC
  To: Jeff Layton; +Cc: Olga Kornievskaia, kernel-tls-handshake

> On Mar 28, 2023, at 11:03 AM, Jeff Layton <jlayton@kernel.org> wrote:
>
> On Tue, 2023-03-28 at 14:45 +0000, Chuck Lever III wrote:
>>
>>> On Mar 28, 2023, at 10:39 AM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>
>>> On Tue, Mar 28, 2023 at 10:29 AM Jeff Layton <jlayton@kernel.org> wrote:
>>>>
>>>> It's true that it is less secure than having full chain-of-trust, but
>>>> this seems like a case of "perfect being the enemy of good". If we don't
>>>> allow for self-signed certificates, then we've created a rather large
>>>> hurdle for anyone who wants to deploy this.
>>>>
>>>> One thing we could do is reinstate the tlshd option, but still allow it
>>>> to check the signature. Then it could log something if that check fails
>>>> but still allow the connection.
>>>>
>>>> We should of course document why using that option is not ideal, but
>>>> ripping it out entirely seems rather draconian. That's just going to
>>>> drive people to not use TLS at all because of the hassle factor.
>>>
>>> I would argue that "no verification" option should only be allowed in
>>> some extreme cases. Like say having an option that explicitly says
>>> it's running in a debug mode and say on the foreground only (-d -f
>>> --noverify). Having such options might clearly state the intent is to
>>> debug only and not run for any user usage.
>>>
>>> I also don't see a real reason for "noverify" option except to remove
>>> frustrations during the setup.
>>
>> I might put it this way: we don't want to have customers installing
>> something on their clients whose out-of-the-shrinkwrap configuration
>> is less than secure. "no verification" is less than secure.
>>
>> My preference would be to have some kind of way to get self-signed
>> certs working with no client-side configuration needed. If the
>> client mounts with "xprtsec=tls" it should work. Do we need to
>> plumb that into our handshake upcall and make "anonymous"
>> handshakes explicitly allow unrecognized signers?
>>
>
> Since the client is the side that's rejecting things, having a mount
> option that allows you to relax that check seems like the right
> approach.
>
> How about a new xprtsec= option? Maybe "xprtsec=nvtls" (no verify TLS)?
> That would allow things to work out of the box, but still leave
> xprtsec=tls as the more secure method.

Nah. xprtsec=tls is supposed to be less secure: no authentication,
just encryption. The secure method is xprtsec=mtls.

IMO xprtsec=tls needs to skip the signer check. I think I can make
tlshd do that.

--
Chuck Lever
* Re: problems getting rpc over tls to work 2023-03-28 15:05 ` Chuck Lever III @ 2023-03-28 15:15 ` Jeff Layton 2023-03-28 15:19 ` Olga Kornievskaia 1 sibling, 0 replies; 26+ messages in thread From: Jeff Layton @ 2023-03-28 15:15 UTC (permalink / raw) To: Chuck Lever III; +Cc: Olga Kornievskaia, kernel-tls-handshake On Tue, 2023-03-28 at 15:05 +0000, Chuck Lever III wrote: > > > On Mar 28, 2023, at 11:03 AM, Jeff Layton <jlayton@kernel.org> wrote: > > > > On Tue, 2023-03-28 at 14:45 +0000, Chuck Lever III wrote: > > > > > > > On Mar 28, 2023, at 10:39 AM, Olga Kornievskaia <aglo@umich.edu> wrote: > > > > > > > > On Tue, Mar 28, 2023 at 10:29 AM Jeff Layton <jlayton@kernel.org> wrote: > > > > > > > > > > It's true that it is less secure than having full chain-of-trust, but > > > > > this seems like a case of "perfect being the enemy of good". If we don't > > > > > allow for self-signed certificates, then we've created a rather large > > > > > hurdle for anyone who wants to deploy this. > > > > > > > > > > One thing we could do is reinstate the tlshd option, but still allow it > > > > > to check the signature. Then it could log something if that check fails > > > > > but still allow the connection. > > > > > > > > > > We should of course document why using that option is not ideal, but > > > > > ripping it out entirely seems rather draconian. That's just going to > > > > > drive people to not use TLS at all because of the hassle factor. > > > > > > > > I would argue that "no verification" option should only be allowed in > > > > some extreme cases. Like say having an option that explicitly says > > > > it's running in a debug mode and say on the foreground only (-d -f > > > > --noverify). Having such options might clearly state the intent is to > > > > debug only and not run for any user usage. > > > > > > > > I also don't see a real reason for "noverify" option except to remove > > > > frustrations during the setup. 
> > > > > > I might put it this way: we don't want to have customers installing > > > something on their clients whose out-of-the-shrinkwrap configuration > > > is less than secure. "no verification" is less than secure. > > > > > > My preference would be to have some kind of way to get self-signed > > > certs working with no client-side configuration needed. If the > > > client mounts with "xprtsec=tls" it should work. Do we need to > > > plumb that into our handshake upcall and make "anonymous" > > > handshakes explicitly allow unrecognized signers? > > > > > > > Since the client is the side that's rejecting things, having a mount > > option that allows you to relax that check seems like the right > > approach. > > > > How about a new xprtsec= option? Maybe "xprtsec=nvtls" (no verify TLS)? > > That would allow things to work out of the box, but still leave > > xprtsec=tls as the more secure method. > > Nah. xprtsec=tls is supposed to be less secure: no authentication, > just encryption. The secure method is xprtsec=mtls. > > IMO xprtsec=tls needs to skip the signer check. I think I can make > tlshd do that. > > It's your call, but allowing the client to check the certificate without requiring the server to do so seems like it'd be a good thing to allow. Maybe there should be a new option for that instead then? Either way, I'm not sure skipping the signer check altogether is the best thing. It'd probably be good to check it, and just not fail the connection if it fails. Have it log a message on each handshake instead so that the admin is aware that the endpoint is not verified. -- Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: problems getting rpc over tls to work 2023-03-28 15:05 ` Chuck Lever III 2023-03-28 15:15 ` Jeff Layton @ 2023-03-28 15:19 ` Olga Kornievskaia 2023-03-28 15:30 ` Olga Kornievskaia 1 sibling, 1 reply; 26+ messages in thread From: Olga Kornievskaia @ 2023-03-28 15:19 UTC (permalink / raw) To: Chuck Lever III; +Cc: Jeff Layton, kernel-tls-handshake On Tue, Mar 28, 2023 at 11:06 AM Chuck Lever III <chuck.lever@oracle.com> wrote: > > > > > On Mar 28, 2023, at 11:03 AM, Jeff Layton <jlayton@kernel.org> wrote: > > > > On Tue, 2023-03-28 at 14:45 +0000, Chuck Lever III wrote: > >> > >>> On Mar 28, 2023, at 10:39 AM, Olga Kornievskaia <aglo@umich.edu> wrote: > >>> > >>> On Tue, Mar 28, 2023 at 10:29 AM Jeff Layton <jlayton@kernel.org> wrote: > >>>> > >>>> It's true that it is less secure than having full chain-of-trust, but > >>>> this seems like a case of "perfect being the enemy of good". If we don't > >>>> allow for self-signed certificates, then we've created a rather large > >>>> hurdle for anyone who wants to deploy this. > >>>> > >>>> One thing we could do is reinstate the tlshd option, but still allow it > >>>> to check the signature. Then it could log something if that check fails > >>>> but still allow the connection. > >>>> > >>>> We should of course document why using that option is not ideal, but > >>>> ripping it out entirely seems rather draconian. That's just going to > >>>> drive people to not use TLS at all because of the hassle factor. > >>> > >>> I would argue that "no verification" option should only be allowed in > >>> some extreme cases. Like say having an option that explicitly says > >>> it's running in a debug mode and say on the foreground only (-d -f > >>> --noverify). Having such options might clearly state the intent is to > >>> debug only and not run for any user usage. > >>> > >>> I also don't see a real reason for "noverify" option except to remove > >>> frustrations during the setup. 
> >> > >> I might put it this way: we don't want to have customers installing > >> something on their clients whose out-of-the-shrinkwrap configuration > >> is less than secure. "no verification" is less than secure. > >> > >> My preference would be to have some kind of way to get self-signed > >> certs working with no client-side configuration needed. If the > >> client mounts with "xprtsec=tls" it should work. Do we need to > >> plumb that into our handshake upcall and make "anonymous" > >> handshakes explicitly allow unrecognized signers? > >> > > > > Since the client is the side that's rejecting things, having a mount > > option that allows you to relax that check seems like the right > > approach. > > > > How about a new xprtsec= option? Maybe "xprtsec=nvtls" (no verify TLS)? > > That would allow things to work out of the box, but still leave > > xprtsec=tls as the more secure method. > > Nah. xprtsec=tls is supposed to be less secure: no authentication, > just encryption. The secure method is xprtsec=mtls. What's the point of "no authentication"? I thought the server is always authenticated. > IMO xprtsec=tls needs to skip the signer check. I think I can make > tlshd do that. I guess in that case, I (grudgingly) agree with something like "xprtsec=anonymous/nvtls". > > > -- > Chuck Lever > > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: problems getting rpc over tls to work 2023-03-28 15:19 ` Olga Kornievskaia @ 2023-03-28 15:30 ` Olga Kornievskaia 2023-03-28 15:48 ` Chuck Lever III 0 siblings, 1 reply; 26+ messages in thread From: Olga Kornievskaia @ 2023-03-28 15:30 UTC (permalink / raw) To: Chuck Lever III; +Cc: Jeff Layton, kernel-tls-handshake On Tue, Mar 28, 2023 at 11:19 AM Olga Kornievskaia <aglo@umich.edu> wrote: > > On Tue, Mar 28, 2023 at 11:06 AM Chuck Lever III <chuck.lever@oracle.com> wrote: > > > > > > > > > On Mar 28, 2023, at 11:03 AM, Jeff Layton <jlayton@kernel.org> wrote: > > > > > > On Tue, 2023-03-28 at 14:45 +0000, Chuck Lever III wrote: > > >> > > >>> On Mar 28, 2023, at 10:39 AM, Olga Kornievskaia <aglo@umich.edu> wrote: > > >>> > > >>> On Tue, Mar 28, 2023 at 10:29 AM Jeff Layton <jlayton@kernel.org> wrote: > > >>>> > > >>>> It's true that it is less secure than having full chain-of-trust, but > > >>>> this seems like a case of "perfect being the enemy of good". If we don't > > >>>> allow for self-signed certificates, then we've created a rather large > > >>>> hurdle for anyone who wants to deploy this. > > >>>> > > >>>> One thing we could do is reinstate the tlshd option, but still allow it > > >>>> to check the signature. Then it could log something if that check fails > > >>>> but still allow the connection. > > >>>> > > >>>> We should of course document why using that option is not ideal, but > > >>>> ripping it out entirely seems rather draconian. That's just going to > > >>>> drive people to not use TLS at all because of the hassle factor. > > >>> > > >>> I would argue that "no verification" option should only be allowed in > > >>> some extreme cases. Like say having an option that explicitly says > > >>> it's running in a debug mode and say on the foreground only (-d -f > > >>> --noverify). Having such options might clearly state the intent is to > > >>> debug only and not run for any user usage. 
> > >>> > > >>> I also don't see a real reason for "noverify" option except to remove > > >>> frustrations during the setup. > > >> > > >> I might put it this way: we don't want to have customers installing > > >> something on their clients whose out-of-the-shrinkwrap configuration > > >> is less than secure. "no verification" is less than secure. > > >> > > >> My preference would be to have some kind of way to get self-signed > > >> certs working with no client-side configuration needed. If the > > >> client mounts with "xprtsec=tls" it should work. Do we need to > > >> plumb that into our handshake upcall and make "anonymous" > > >> handshakes explicitly allow unrecognized signers? > > >> > > > > > > Since the client is the side that's rejecting things, having a mount > > > option that allows you to relax that check seems like the right > > > approach. > > > > > > How about a new xprtsec= option? Maybe "xprtsec=nvtls" (no verify TLS)? > > > That would allow things to work out of the box, but still leave > > > xprtsec=tls as the more secure method. > > > > Nah. xprtsec=tls is supposed to be less secure: no authentication, > > just encryption. The secure method is xprtsec=mtls. > > What's the point of "no authentication". I thought the server is > always authenticated. Sorry, OK, we are discussing no authentication. But my point was that "TLS" on its own doesn't mean less secure, and it always does server-side authentication. In the early days of TLS, you could choose to do pure Diffie-Hellman, and that was "no authentication", but that's no longer an option. HTTPS explicitly prompts the user to do manual verification (i.e., when it couldn't verify using existing CAs). It never allows for "no verification", which is what we are discussing here. > > IMO xprtsec=tls needs to skip the signer check. I think I can make > > tlshd do that. > > I guess in that case, I (grudgingly) agree with something like > xprtsec=anonymous/nvtls". 
> > > > > > > -- > > Chuck Lever > > > > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: problems getting rpc over tls to work 2023-03-28 15:30 ` Olga Kornievskaia @ 2023-03-28 15:48 ` Chuck Lever III 0 siblings, 0 replies; 26+ messages in thread From: Chuck Lever III @ 2023-03-28 15:48 UTC (permalink / raw) To: Olga Kornievskaia; +Cc: Jeff Layton, kernel-tls-handshake > On Mar 28, 2023, at 11:30 AM, Olga Kornievskaia <aglo@umich.edu> wrote: > > On Tue, Mar 28, 2023 at 11:19 AM Olga Kornievskaia <aglo@umich.edu> wrote: >> >> On Tue, Mar 28, 2023 at 11:06 AM Chuck Lever III <chuck.lever@oracle.com> wrote: >>> >>> >>> >>>> On Mar 28, 2023, at 11:03 AM, Jeff Layton <jlayton@kernel.org> wrote: >>>> >>>> On Tue, 2023-03-28 at 14:45 +0000, Chuck Lever III wrote: >>>>> >>>>>> On Mar 28, 2023, at 10:39 AM, Olga Kornievskaia <aglo@umich.edu> wrote: >>>>>> >>>>>> On Tue, Mar 28, 2023 at 10:29 AM Jeff Layton <jlayton@kernel.org> wrote: >>>>>>> >>>>>>> It's true that it is less secure than having full chain-of-trust, but >>>>>>> this seems like a case of "perfect being the enemy of good". If we don't >>>>>>> allow for self-signed certificates, then we've created a rather large >>>>>>> hurdle for anyone who wants to deploy this. >>>>>>> >>>>>>> One thing we could do is reinstate the tlshd option, but still allow it >>>>>>> to check the signature. Then it could log something if that check fails >>>>>>> but still allow the connection. >>>>>>> >>>>>>> We should of course document why using that option is not ideal, but >>>>>>> ripping it out entirely seems rather draconian. That's just going to >>>>>>> drive people to not use TLS at all because of the hassle factor. >>>>>> >>>>>> I would argue that "no verification" option should only be allowed in >>>>>> some extreme cases. Like say having an option that explicitly says >>>>>> it's running in a debug mode and say on the foreground only (-d -f >>>>>> --noverify). Having such options might clearly state the intent is to >>>>>> debug only and not run for any user usage. 
>>>>>> >>>>>> I also don't see a real reason for "noverify" option except to remove >>>>>> frustrations during the setup. >>>>> >>>>> I might put it this way: we don't want to have customers installing >>>>> something on their clients whose out-of-the-shrinkwrap configuration >>>>> is less than secure. "no verification" is less than secure. >>>>> >>>>> My preference would be to have some kind of way to get self-signed >>>>> certs working with no client-side configuration needed. If the >>>>> client mounts with "xprtsec=tls" it should work. Do we need to >>>>> plumb that into our handshake upcall and make "anonymous" >>>>> handshakes explicitly allow unrecognized signers? >>>>> >>>> >>>> Since the client is the side that's rejecting things, having a mount >>>> option that allows you to relax that check seems like the right >>>> approach. >>>> >>>> How about a new xprtsec= option? Maybe "xprtsec=nvtls" (no verify TLS)? >>>> That would allow things to work out of the box, but still leave >>>> xprtsec=tls as the more secure method. >>> >>> Nah. xprtsec=tls is supposed to be less secure: no authentication, >>> just encryption. The secure method is xprtsec=mtls. >> >> What's the point of "no authentication". I thought the server is >> always authenticated. > > Sorry Ok we are discussing no authentication. But my point was "TLS" > in its know doesn't mean less secure and always does server side > authentication. In the early days of TLS, you could choose to do pure > Diffie hellman and that was "no authentication" but that's no longer > an option. > > HTTPS explicitly prompts that user to do manual verification (ie when > it couldn't verify using existing CAs). It never allows for "no > verification" which we are discussing here. Today, our client always authenticates the server. That means that for self-signed environments, the server's certificate has to be distributed to all clients. 
That also means that automatically adding a self-signed server cert when ktls-utils is installed is not going to be as helpful as we might want.

I really wanted to have a way to enable encryption while avoiding the "client key distribution" problem, and to permit self-signed certs to be used in this mode.

Some possible choices:

- State that the way to avoid client key distribution is for the server administrator to acquire a certificate that is signed by a CA that is already known to clients. This is easy for us, and I suspect the security community would be agreeable only to this alternative.

- Weaken the client's server authentication so that it does not fail the handshake if the server's certificate is self-signed (only for xprtsec=tls). Yes, tlshd would log the verification failure.

- Add a third xprtsec= mode where no server verification is done.

- Add a configuration option to /etc/tlshd.conf that weakens the anonymous policy so it does not verify the server.

>>> IMO xprtsec=tls needs to skip the signer check. I think I can make
>>> tlshd do that.
>>
>> I guess in that case, I (grudgingly) agree with something like
>> xprtsec=anonymous/nvtls".

No snap decisions today. We don't quite have a consensus on this yet. And, I think there is a reasonable workaround for the moment: if the server cert is self-signed, just distribute it to clients.

--
Chuck Lever

^ permalink raw reply [flat|nested] 26+ messages in thread
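
[Editorial sketch, not part of the original message.] The choices listed above differ mainly in what the client does when the server certificate's signer is unknown. The C fragment below is one rough way to picture that decision. It is not tlshd code: the enum, its values, and handshake_allowed() are hypothetical names invented for illustration, and the signer_known flag stands in for whatever chain check the TLS library actually performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical policy names; none of these identifiers exist in tlshd. */
enum verify_policy {
	VERIFY_STRICT,   /* unknown signer fails the handshake */
	VERIFY_LOG_ONLY, /* check the signer, log a failure, continue */
	VERIFY_NONE,     /* skip the signer check entirely */
};

/*
 * Decide whether a handshake may proceed under a given policy.
 * signer_known is the (assumed) result of the certificate chain check.
 */
static bool handshake_allowed(enum verify_policy policy, bool signer_known)
{
	switch (policy) {
	case VERIFY_STRICT:
		return signer_known;
	case VERIFY_LOG_ONLY:
		if (!signer_known)
			fprintf(stderr, "warning: certificate signer not found; "
					"peer is unverified\n");
		return true;
	case VERIFY_NONE:
		return true;
	}
	return false; /* unrecognized policy: fail closed */
}
```

Under these assumptions, the CA-signed-certificate choice corresponds to VERIFY_STRICT, the "weaken but log" choice to VERIFY_LOG_ONLY, and both the third xprtsec= mode and the tlshd.conf option to VERIFY_NONE. They would differ only in where the policy is selected.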
* Re: problems getting rpc over tls to work 2023-03-28 14:29 ` Jeff Layton 2023-03-28 14:39 ` Olga Kornievskaia @ 2023-03-28 14:41 ` Chuck Lever III 1 sibling, 0 replies; 26+ messages in thread From: Chuck Lever III @ 2023-03-28 14:41 UTC (permalink / raw) To: Jeff Layton; +Cc: kernel-tls-handshake > On Mar 28, 2023, at 10:29 AM, Jeff Layton <jlayton@kernel.org> wrote: > > On Tue, 2023-03-28 at 14:04 +0000, Chuck Lever III wrote: >> >>> On Mar 28, 2023, at 8:55 AM, Jeff Layton <jlayton@kernel.org> wrote: >>> >>> I wonder...should we have the ktls-utils package install a self-signed cert by default? >> >> So this idea is intriguing, I had some similar thoughts. >> >> I'm not sure what the security implications of all this are. >> We'd first need to look at other certificate-based packages >> in Fedora to see if they offer a similar quick-setup. The >> cert would have to be created at install time. > > I think when apache is installed, a self-signed cert is created. You > don't have to use it, but it's what gets initially installed. If apache does it, then it sounds OK to do. >>> I created a self-signed >>> cert and tried to use it, but the client rejects it with this: >>> >>> Mar 28 09:01:20 nfsclnt tlshd[1092]: Certificate signer not found. >>> >>> Is there a way to make it not try to validate the cert chain? >> >> Olga also found that self-signed server certs are not >> working as we'd like. tlshd had a mechanism to force the >> clients not to check the signer, but it was removed >> because it was deemed insecure. >> >> I'd like to find a way to make self-signed work seamlessly. > > Ditto. A lot of people are going to want to use TLS opportunistically > without deploying their own CA and issuing "real" certificates. Yer preachin' to the choir, son. > It's true that it is less secure than having full chain-of-trust, but > this seems like a case of "perfect being the enemy of good". 
If we don't > allow for self-signed certificates, then we've created a rather large > hurdle for anyone who wants to deploy this. > > One thing we could do is reinstate the tlshd option, but still allow it > to check the signature. Then it could log something if that check fails > but still allow the connection. > > We should of course document why using that option is not ideal, but > ripping it out entirely seems rather draconian. That's just going to > drive people to not use TLS at all because of the hassle factor. I'd prefer that no client-side administration is necessary to make this work. Adding the server's self-signed cert on all clients is not what I had in mind, as that is the kind of "key distribution hassle" that RPC-with-TLS was intended to eliminate. (But I'm glad that gets you closer to working). -- Chuck Lever ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: problems getting rpc over tls to work 2023-03-28 12:27 problems getting rpc over tls to work Jeff Layton 2023-03-28 12:55 ` Jeff Layton @ 2023-03-28 13:29 ` Chuck Lever III 2023-03-28 13:51 ` Jeff Layton 2023-03-28 13:55 ` Chuck Lever III 1 sibling, 2 replies; 26+ messages in thread From: Chuck Lever III @ 2023-03-28 13:29 UTC (permalink / raw) To: Jeff Layton; +Cc: kernel-tls-handshake > On Mar 28, 2023, at 8:27 AM, Jeff Layton <jlayton@kernel.org> wrote: > > Hi Chuck! > > I have started the packaging work for Fedora for ktls-utils: > > https://bugzilla.redhat.com/show_bug.cgi?id=2182151 > > I also built packages for this in copr: > > https://copr.fedorainfracloud.org/coprs/jlayton/ktls-utils/ > > ...and built some interim nfs-utils packages with the requisite exportfs > patches: > > https://copr.fedorainfracloud.org/coprs/jlayton/nfs-utils/ Note that the nfs-utils changes aren't necessary to support the kernel server in "opportunistic" mode -- the server will use RPC-with-TLS if a client requests it, but otherwise does not restrict access. Client side also has no nfs-utils requirements at this time, since the new mount options are handled by the kernel. > I built a kernel from your topic-rpc-with-tls-upcall branch and > installed the kernel on a client and server, along with ktls-utils and > the updated nfs-utils on the server. I set up tlshd to run at boot on > both hosts. The server exports with a bog-standard set of options: > > /export *(rw,insecure,no_root_squash) > > I then tried to mount it with tls: > > $ sudo mount knfsd:/export /mnt/knfsd -o xprtsec=tls > > I see the initial NULL requests go out, and then I see the client send > an encrypted frame to the server, and the server just shuts down the > socket at that point (FIN, ACK). > > I assume that I must have something configured wrong. What am I missing? The starting move is to crank up the debug settings in /etc/tlshd.conf... 
> Eventually, after a couple of failed mount attempts, I also hit this on > the client: > > [ 375.561304] BUG: kernel NULL pointer dereference, address: > 0000000000000030 > [ 375.564637] #PF: supervisor read access in kernel mode > [ 375.566439] #PF: error_code(0x0000) - not-present page > [ 375.567930] PGD 0 P4D 0 > [ 375.568733] Oops: 0000 [#1] PREEMPT SMP NOPTI > [ 375.569993] CPU: 2 PID: 9 Comm: kworker/u16:0 Tainted: G E > 6.3.0-rc2+ #151 > [ 375.572214] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS > 1.16.1-2.fc37 04/01/2014 > [ 375.574538] Workqueue: xprtiod xs_tls_connect [sunrpc] > [ 375.576087] RIP: 0010:handshake_req_cancel+0x12/0x1c0 > [ 375.578255] Code: d0 5b ff eb 92 0f 1f 00 90 90 90 90 90 90 90 90 90 > 90 90 90 90 90 90 90 0f 1f 44 00 00 41 55 41 54 55 53 48 8b 6f 18 48 89 > ef <4c> 8b 6d 30 e8 35 fe ff ff 48 85 c0 0f 84 3e 01 00 00 4c 89 ef 48 > [ 375.583180] RSP: 0018:ffffb87540053d28 EFLAGS: 00010246 > [ 375.585416] RAX: 0000000000000000 RBX: ffff970289cb4800 RCX: > 0000000000000000 > [ 375.588226] RDX: 0000000000000001 RSI: ffffb87540053d00 RDI: > 0000000000000000 > [ 375.590389] RBP: 0000000000000000 R08: ffff970280e544a8 R09: > 0000000000000001 > [ 375.592601] R10: 0000000000000002 R11: 0000000000000001 R12: > ffff970288985480 > [ 375.594885] R13: 0000000000000000 R14: 0000000004208160 R15: > ffff970289cb4800 > [ 375.597051] FS: 0000000000000000(0000) GS:ffff9703f7c80000(0000) > knlGS:0000000000000000 > [ 375.599810] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 375.601625] CR2: 0000000000000030 CR3: 000000010aa04000 CR4: > 00000000003506e0 > [ 375.603785] Call Trace: > [ 375.604754] <TASK> > [ 375.605651] xs_tls_handshake_sync+0x14f/0x170 [sunrpc] > [ 375.608998] ? __pfx_xs_tls_handshake_done+0x10/0x10 [sunrpc] > [ 375.610900] xs_tls_connect+0x14a/0x5f0 [sunrpc] > [ 375.612530] process_one_work+0x1c8/0x3c0 > [ 375.613907] worker_thread+0x4d/0x380 > [ 375.615189] ? 
__pfx_worker_thread+0x10/0x10 > [ 375.616631] kthread+0xe9/0x110 > [ 375.617788] ? __pfx_kthread+0x10/0x10 > [ 375.619091] ret_from_fork+0x2c/0x50 > [ 375.620347] </TASK> > [ 375.621242] Modules linked in: rpcsec_gss_krb5(E) auth_rpcgss(E) > nfsv4(E) dns_resolver(E) nfs(E) lockd(E) grace(E) sunrpc(E) ext4(E) > crc16(E) mbcache(E) jbd2(E) snd_hda_codec_generic(E) snd_hda_intel(E) > snd_intel_dspcfg(E) snd_hda_codec(E) snd_hwdep(E) snd_hda_core(E) > snd_pcm(E) kvm_amd(E) snd_timer(E) kvm(E) psmouse(E) snd(E) evdev(E) > irqbypass(E) virtio_balloon(E) soundcore(E) pcspkr(E) button(E) loop(E) > drm(E) configfs(E) zram(E) zsmalloc(E) xfs(E) libcrc32c(E) > crc32c_generic(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) > ghash_clmulni_intel(E) sha512_ssse3(E) sha512_generic(E) virtio_net(E) > virtio_blk(E) net_failover(E) failover(E) virtio_console(E) > aesni_intel(E) serio_raw(E) crypto_simd(E) cryptd(E) virtio_pci(E) > virtio(E) virtio_pci_legacy_dev(E) virtio_pci_modern_dev(E) > virtio_ring(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) > dm_multipath(E) dm_mod(E) scsi_mod(E) scsi_common(E) autofs4(E) > [ 375.646698] CR2: 0000000000000030 > [ 375.647894] ---[ end trace 0000000000000000 ]--- > [ 375.649403] RIP: 0010:handshake_req_cancel+0x12/0x1c0 > [ 375.651062] Code: d0 5b ff eb 92 0f 1f 00 90 90 90 90 90 90 90 90 90 > 90 90 90 90 90 90 90 0f 1f 44 00 00 41 55 41 54 55 53 48 8b 6f 18 48 89 > ef <4c> 8b 6d 30 e8 35 fe ff ff 48 85 c0 0f 84 3e 01 00 00 4c 89 ef 48 > [ 375.654664] RSP: 0018:ffffb87540053d28 EFLAGS: 00010246 > [ 375.655447] RAX: 0000000000000000 RBX: ffff970289cb4800 RCX: > 0000000000000000 > [ 375.656436] RDX: 0000000000000001 RSI: ffffb87540053d00 RDI: > 0000000000000000 > [ 375.657425] RBP: 0000000000000000 R08: ffff970280e544a8 R09: > 0000000000000001 > [ 375.658392] R10: 0000000000000002 R11: 0000000000000001 R12: > ffff970288985480 > [ 375.659360] R13: 0000000000000000 R14: 0000000004208160 R15: > ffff970289cb4800 > [ 375.660324] FS: 
0000000000000000(0000) GS:ffff9703f7c80000(0000) > knlGS:0000000000000000 > [ 375.661479] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 375.662285] CR2: 0000000000000030 CR3: 000000010aa04000 CR4: > 00000000003506e0 > [ 375.663278] note: kworker/u16:0[9] exited with irqs disabled > > > ...faddr2line says: > > [jlayton@tleilax linux]$ ./scripts/faddr2line --list vmlinux > handshake_req_cancel+0x12/0x1c0 > handshake_req_cancel+0x12/0x1c0: > > read_pnet at include/net/net_namespace.h:383 > 378 } > 379 > 380 static inline struct net *read_pnet(const possible_net_t *pnet) > 381 { > 382 #ifdef CONFIG_NET_NS >> 383< return pnet->net; > 384 #else > 385 return &init_net; > 386 #endif > 387 } > 388 > > (inlined by) sock_net at include/net/sock.h:649 > 644 __rcu_assign_sk_user_data_with_flags(sk, ptr, 0) > 645 > 646 static inline > 647 struct net *sock_net(const struct sock *sk) > 648 { >> 649< return read_pnet(&sk->sk_net); > 650 } > 651 > 652 static inline > 653 void sock_net_set(struct sock *sk, struct net *net) > 654 { > > (inlined by) handshake_req_cancel at net/handshake/request.c:281 > 276 struct handshake_net *hn; > 277 struct sock *sk; > 278 struct net *net; > 279 > 280 sk = sock->sk; >> 281< net = sock_net(sk); > 282 req = handshake_req_hash_lookup(sk); > 283 if (!req) { > 284 trace_handshake_cancel_none(net, req, sk); > 285 return true; > 286 } > > > I'm guessing sk was NULL in handshake_req_cancel? Jakub asked me to remove the NULL check there. But I think req_cancel needs to handle this case, which might happen due to a race. -- Chuck Lever ^ permalink raw reply [flat|nested] 26+ messages in thread
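
[Editorial sketch, not part of the original message.] The faddr2line output points at sock_net(sk) being evaluated with sk == NULL, and the reply above suggests the cancel path should tolerate that case. The fragment below is a userspace mock of what such a guard might look like. It is not the actual kernel patch: the struct definitions are minimal stand-ins, and req_pending and mock_handshake_req_cancel() are hypothetical names. The only behavior taken from the quoted source is that the function returns true when there is no request to cancel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in types; the real kernel structs are far larger. */
struct net { int id; };
struct sock {
	struct net *sk_net;
	bool req_pending; /* hypothetical: models a queued handshake request */
};
struct socket { struct sock *sk; };

/*
 * Mock cancel routine that tolerates a socket whose sk has already
 * been torn down. Returns true when no request remains pending.
 */
static bool mock_handshake_req_cancel(struct socket *sock)
{
	struct sock *sk = sock->sk;
	struct net *net;

	if (!sk)
		return true; /* nothing attached, so nothing to cancel */

	net = sk->sk_net; /* safe: sk is known to be non-NULL here */
	(void)net;        /* a real implementation would use it for lookup */

	sk->req_pending = false;
	return true;
}
```

Whether a check like this belongs in the cancel function or in its caller is the open question in the thread; the mock only shows that the dereference has to come after the test.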
* Re: problems getting rpc over tls to work 2023-03-28 13:29 ` Chuck Lever III @ 2023-03-28 13:51 ` Jeff Layton 2023-03-28 13:55 ` Chuck Lever III 1 sibling, 0 replies; 26+ messages in thread From: Jeff Layton @ 2023-03-28 13:51 UTC (permalink / raw) To: Chuck Lever III; +Cc: kernel-tls-handshake On Tue, 2023-03-28 at 13:29 +0000, Chuck Lever III wrote: > > > On Mar 28, 2023, at 8:27 AM, Jeff Layton <jlayton@kernel.org> wrote: > > > > Hi Chuck! > > > > I have started the packaging work for Fedora for ktls-utils: > > > > https://bugzilla.redhat.com/show_bug.cgi?id=2182151 > > > > I also built packages for this in copr: > > > > https://copr.fedorainfracloud.org/coprs/jlayton/ktls-utils/ > > > > ...and built some interim nfs-utils packages with the requisite exportfs > > patches: > > > > https://copr.fedorainfracloud.org/coprs/jlayton/nfs-utils/ > > Note that the nfs-utils changes aren't necessary to support > the kernel server in "opportunistic" mode -- the server will > use RPC-with-TLS if a client requests it, but otherwise does > not restrict access. > Opportunistic mode doesn't seem to work for me. I created a self-signed cert and tried to use it, but the client rejects it with this: Mar 28 09:01:20 nfsclnt tlshd[1092]: Certificate signer not found. Is there a way to make it not try to validate the cert chain? Otherwise, I guess I'll need to set up a CA and such. > Client side also has no nfs-utils requirements at this time, > since the new mount options are handled by the kernel. > > > > I built a kernel from your topic-rpc-with-tls-upcall branch and > > installed the kernel on a client and server, along with ktls-utils and > > the updated nfs-utils on the server. I set up tlshd to run at boot on > > both hosts. 
The server exports with a bog-standard set of options: > > > > /export *(rw,insecure,no_root_squash) > > > > I then tried to mount it with tls: > > > > $ sudo mount knfsd:/export /mnt/knfsd -o xprtsec=tls > > > > I see the initial NULL requests go out, and then I see the client send > > an encrypted frame to the server, and the server just shuts down the > > socket at that point (FIN, ACK). > > > > I assume that I must have something configured wrong. What am I missing? > > The starting move is to crank up the debug settings in /etc/tlshd.conf... > > Thanks. I'll try that next time. > > Eventually, after a couple of failed mount attempts, I also hit this on > > the client: > > > > [ 375.561304] BUG: kernel NULL pointer dereference, address: > > 0000000000000030 > > [ 375.564637] #PF: supervisor read access in kernel mode > > [ 375.566439] #PF: error_code(0x0000) - not-present page > > [ 375.567930] PGD 0 P4D 0 > > [ 375.568733] Oops: 0000 [#1] PREEMPT SMP NOPTI > > [ 375.569993] CPU: 2 PID: 9 Comm: kworker/u16:0 Tainted: G E > > 6.3.0-rc2+ #151 > > [ 375.572214] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS > > 1.16.1-2.fc37 04/01/2014 > > [ 375.574538] Workqueue: xprtiod xs_tls_connect [sunrpc] > > [ 375.576087] RIP: 0010:handshake_req_cancel+0x12/0x1c0 > > [ 375.578255] Code: d0 5b ff eb 92 0f 1f 00 90 90 90 90 90 90 90 90 90 > > 90 90 90 90 90 90 90 0f 1f 44 00 00 41 55 41 54 55 53 48 8b 6f 18 48 89 > > ef <4c> 8b 6d 30 e8 35 fe ff ff 48 85 c0 0f 84 3e 01 00 00 4c 89 ef 48 > > [ 375.583180] RSP: 0018:ffffb87540053d28 EFLAGS: 00010246 > > [ 375.585416] RAX: 0000000000000000 RBX: ffff970289cb4800 RCX: > > 0000000000000000 > > [ 375.588226] RDX: 0000000000000001 RSI: ffffb87540053d00 RDI: > > 0000000000000000 > > [ 375.590389] RBP: 0000000000000000 R08: ffff970280e544a8 R09: > > 0000000000000001 > > [ 375.592601] R10: 0000000000000002 R11: 0000000000000001 R12: > > ffff970288985480 > > [ 375.594885] R13: 0000000000000000 R14: 0000000004208160 R15: > > 
ffff970289cb4800 > > [ 375.597051] FS: 0000000000000000(0000) GS:ffff9703f7c80000(0000) > > knlGS:0000000000000000 > > [ 375.599810] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > [ 375.601625] CR2: 0000000000000030 CR3: 000000010aa04000 CR4: > > 00000000003506e0 > > [ 375.603785] Call Trace: > > [ 375.604754] <TASK> > > [ 375.605651] xs_tls_handshake_sync+0x14f/0x170 [sunrpc] > > [ 375.608998] ? __pfx_xs_tls_handshake_done+0x10/0x10 [sunrpc] > > [ 375.610900] xs_tls_connect+0x14a/0x5f0 [sunrpc] > > [ 375.612530] process_one_work+0x1c8/0x3c0 > > [ 375.613907] worker_thread+0x4d/0x380 > > [ 375.615189] ? __pfx_worker_thread+0x10/0x10 > > [ 375.616631] kthread+0xe9/0x110 > > [ 375.617788] ? __pfx_kthread+0x10/0x10 > > [ 375.619091] ret_from_fork+0x2c/0x50 > > [ 375.620347] </TASK> > > [ 375.621242] Modules linked in: rpcsec_gss_krb5(E) auth_rpcgss(E) > > nfsv4(E) dns_resolver(E) nfs(E) lockd(E) grace(E) sunrpc(E) ext4(E) > > crc16(E) mbcache(E) jbd2(E) snd_hda_codec_generic(E) snd_hda_intel(E) > > snd_intel_dspcfg(E) snd_hda_codec(E) snd_hwdep(E) snd_hda_core(E) > > snd_pcm(E) kvm_amd(E) snd_timer(E) kvm(E) psmouse(E) snd(E) evdev(E) > > irqbypass(E) virtio_balloon(E) soundcore(E) pcspkr(E) button(E) loop(E) > > drm(E) configfs(E) zram(E) zsmalloc(E) xfs(E) libcrc32c(E) > > crc32c_generic(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) > > ghash_clmulni_intel(E) sha512_ssse3(E) sha512_generic(E) virtio_net(E) > > virtio_blk(E) net_failover(E) failover(E) virtio_console(E) > > aesni_intel(E) serio_raw(E) crypto_simd(E) cryptd(E) virtio_pci(E) > > virtio(E) virtio_pci_legacy_dev(E) virtio_pci_modern_dev(E) > > virtio_ring(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) > > dm_multipath(E) dm_mod(E) scsi_mod(E) scsi_common(E) autofs4(E) > > [ 375.646698] CR2: 0000000000000030 > > [ 375.647894] ---[ end trace 0000000000000000 ]--- > > [ 375.649403] RIP: 0010:handshake_req_cancel+0x12/0x1c0 > > [ 375.651062] Code: d0 5b ff eb 92 0f 1f 00 90 90 90 90 90 90 
90 90 90 > > 90 90 90 90 90 90 90 0f 1f 44 00 00 41 55 41 54 55 53 48 8b 6f 18 48 89 > > ef <4c> 8b 6d 30 e8 35 fe ff ff 48 85 c0 0f 84 3e 01 00 00 4c 89 ef 48 > > [ 375.654664] RSP: 0018:ffffb87540053d28 EFLAGS: 00010246 > > [ 375.655447] RAX: 0000000000000000 RBX: ffff970289cb4800 RCX: > > 0000000000000000 > > [ 375.656436] RDX: 0000000000000001 RSI: ffffb87540053d00 RDI: > > 0000000000000000 > > [ 375.657425] RBP: 0000000000000000 R08: ffff970280e544a8 R09: > > 0000000000000001 > > [ 375.658392] R10: 0000000000000002 R11: 0000000000000001 R12: > > ffff970288985480 > > [ 375.659360] R13: 0000000000000000 R14: 0000000004208160 R15: > > ffff970289cb4800 > > [ 375.660324] FS: 0000000000000000(0000) GS:ffff9703f7c80000(0000) > > knlGS:0000000000000000 > > [ 375.661479] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > [ 375.662285] CR2: 0000000000000030 CR3: 000000010aa04000 CR4: > > 00000000003506e0 > > [ 375.663278] note: kworker/u16:0[9] exited with irqs disabled > > > > > > ...faddr2line says: > > > > [jlayton@tleilax linux]$ ./scripts/faddr2line --list vmlinux > > handshake_req_cancel+0x12/0x1c0 > > handshake_req_cancel+0x12/0x1c0: > > > > read_pnet at include/net/net_namespace.h:383 > > 378 } > > 379 > > 380 static inline struct net *read_pnet(const possible_net_t *pnet) > > 381 { > > 382 #ifdef CONFIG_NET_NS > > > 383< return pnet->net; > > 384 #else > > 385 return &init_net; > > 386 #endif > > 387 } > > 388 > > > > (inlined by) sock_net at include/net/sock.h:649 > > 644 __rcu_assign_sk_user_data_with_flags(sk, ptr, 0) > > 645 > > 646 static inline > > 647 struct net *sock_net(const struct sock *sk) > > 648 { > > > 649< return read_pnet(&sk->sk_net); > > 650 } > > 651 > > 652 static inline > > 653 void sock_net_set(struct sock *sk, struct net *net) > > 654 { > > > > (inlined by) handshake_req_cancel at net/handshake/request.c:281 > > 276 struct handshake_net *hn; > > 277 struct sock *sk; > > 278 struct net *net; > > 279 > > 280 sk = sock->sk; > > > 281< net 
= sock_net(sk); > > 282 req = handshake_req_hash_lookup(sk); > > 283 if (!req) { > > 284 trace_handshake_cancel_none(net, req, sk); > > 285 return true; > > 286 } > > > > > > I'm guessing sk was NULL in handshake_req_cancel? > > Jakub asked me to remove the NULL check there. But I think > req_cancel needs to handle this case, which might happen > due to a race. > If there's a race there, is that sufficient? Could it go NULL after you check but before the call to sock_net? Maybe I need to better understand the race. -- Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: problems getting rpc over tls to work 2023-03-28 13:29 ` Chuck Lever III 2023-03-28 13:51 ` Jeff Layton @ 2023-03-28 13:55 ` Chuck Lever III 2023-03-28 14:13 ` Jeff Layton 1 sibling, 1 reply; 26+ messages in thread From: Chuck Lever III @ 2023-03-28 13:55 UTC (permalink / raw) To: Jeff Layton; +Cc: kernel-tls-handshake > On Mar 28, 2023, at 9:29 AM, Chuck Lever III <chuck.lever@oracle.com> wrote: > > > >> On Mar 28, 2023, at 8:27 AM, Jeff Layton <jlayton@kernel.org> wrote: >> >> Hi Chuck! >> >> I have started the packaging work for Fedora for ktls-utils: >> >> https://bugzilla.redhat.com/show_bug.cgi?id=2182151 >> >> I also built packages for this in copr: >> >> https://copr.fedorainfracloud.org/coprs/jlayton/ktls-utils/ >> >> ...and built some interim nfs-utils packages with the requisite exportfs >> patches: >> >> https://copr.fedorainfracloud.org/coprs/jlayton/nfs-utils/ > > Note that the nfs-utils changes aren't necessary to support > the kernel server in "opportunistic" mode -- the server will > use RPC-with-TLS if a client requests it, but otherwise does > not restrict access. > > Client side also has no nfs-utils requirements at this time, > since the new mount options are handled by the kernel. In case I wasn't clear: This was meant as a suggestion. If you want to simplify your test set-up a bit, the nfs-utils piece isn't needed at this point. But feel free to include it if you like! -- Chuck Lever ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: problems getting rpc over tls to work 2023-03-28 13:55 ` Chuck Lever III @ 2023-03-28 14:13 ` Jeff Layton 2023-03-28 14:25 ` Olga Kornievskaia 0 siblings, 1 reply; 26+ messages in thread From: Jeff Layton @ 2023-03-28 14:13 UTC (permalink / raw) To: Chuck Lever III; +Cc: kernel-tls-handshake On Tue, 2023-03-28 at 13:55 +0000, Chuck Lever III wrote: > > > On Mar 28, 2023, at 9:29 AM, Chuck Lever III <chuck.lever@oracle.com> wrote: > > > > > > > > > On Mar 28, 2023, at 8:27 AM, Jeff Layton <jlayton@kernel.org> wrote: > > > > > > Hi Chuck! > > > > > > I have started the packaging work for Fedora for ktls-utils: > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=2182151 > > > > > > I also built packages for this in copr: > > > > > > https://copr.fedorainfracloud.org/coprs/jlayton/ktls-utils/ > > > > > > ...and built some interim nfs-utils packages with the requisite exportfs > > > patches: > > > > > > https://copr.fedorainfracloud.org/coprs/jlayton/nfs-utils/ > > > > Note that the nfs-utils changes aren't necessary to support > > the kernel server in "opportunistic" mode -- the server will > > use RPC-with-TLS if a client requests it, but otherwise does > > not restrict access. > > > > Client side also has no nfs-utils requirements at this time, > > since the new mount options are handled by the kernel. > > In case I wasn't clear: > > This was meant as a suggestion. If you want to simplify your > test set-up a bit, the nfs-utils piece isn't needed at this > point. But feel free to include it if you like! > Understood. I needed to build it for the server side anyway, so I figured I might as well. Eventually I'd like to set up a Fedora COPR repo that has all of the packages we need to test this, but I need to sort through the certificate handling here first. Are there docs on how to administer gnutls? For instance, I guess I'll want to set up my own CA and issue client and server certs. How do I make gnutls trust a new CA? 
-- Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: problems getting rpc over tls to work 2023-03-28 14:13 ` Jeff Layton @ 2023-03-28 14:25 ` Olga Kornievskaia 2023-03-28 14:38 ` Jeff Layton 0 siblings, 1 reply; 26+ messages in thread From: Olga Kornievskaia @ 2023-03-28 14:25 UTC (permalink / raw) To: Jeff Layton; +Cc: Chuck Lever III, kernel-tls-handshake On Tue, Mar 28, 2023 at 10:14 AM Jeff Layton <jlayton@kernel.org> wrote: > > On Tue, 2023-03-28 at 13:55 +0000, Chuck Lever III wrote: > > > > > On Mar 28, 2023, at 9:29 AM, Chuck Lever III <chuck.lever@oracle.com> wrote: > > > > > > > > > > > > > On Mar 28, 2023, at 8:27 AM, Jeff Layton <jlayton@kernel.org> wrote: > > > > > > > > Hi Chuck! > > > > > > > > I have started the packaging work for Fedora for ktls-utils: > > > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=2182151 > > > > > > > > I also built packages for this in copr: > > > > > > > > https://copr.fedorainfracloud.org/coprs/jlayton/ktls-utils/ > > > > > > > > ...and built some interim nfs-utils packages with the requisite exportfs > > > > patches: > > > > > > > > https://copr.fedorainfracloud.org/coprs/jlayton/nfs-utils/ > > > > > > Note that the nfs-utils changes aren't necessary to support > > > the kernel server in "opportunistic" mode -- the server will > > > use RPC-with-TLS if a client requests it, but otherwise does > > > not restrict access. > > > > > > Client side also has no nfs-utils requirements at this time, > > > since the new mount options are handled by the kernel. > > > > In case I wasn't clear: > > > > This was meant as a suggestion. If you want to simplify your > > test set-up a bit, the nfs-utils piece isn't needed at this > > point. But feel free to include it if you like! > > > > Understood. I needed to build it for the server side anyway, so I > figured I might as well. Eventually I'd like to set up a Fedora COPR > repo that has all of the packages we need to test this, but I need to > sort through the certificate handling here first. 
> > Are there docs on how to administer gnutls? For instance, I guess I'll > want to set up my own CA and issue client and server certs. How do I > make gnutls trust a new CA? Hi Jeff, To get self-signed certificates to work you need to (on the client's machine) copy your server's cert.pem file into /etc/pki/ca-trust/source/anchors and then run the “update-ca-trust extract”. > -- > Jeff Layton <jlayton@kernel.org> > ^ permalink raw reply [flat|nested] 26+ messages in thread
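For reference, the trust-anchor steps Olga describes could be scripted roughly as follows. This is only a sketch, assuming a Fedora-style trust store: install_trust_anchor is a made-up helper name, the path to the server's cert.pem is a placeholder, and the anchors directory is a parameter so the copy step can be tried against a scratch directory first.

```shell
# Sketch only: copy a PEM certificate into a CA trust anchors directory.
# install_trust_anchor is a hypothetical helper, not part of any package.
install_trust_anchor() {
    cert=$1
    anchors=${2:-/etc/pki/ca-trust/source/anchors}

    # Refuse anything that does not look like a PEM certificate.
    grep -q -e '-----BEGIN CERTIFICATE-----' "$cert" || {
        echo "not a PEM certificate: $cert" >&2
        return 1
    }
    cp "$cert" "$anchors/" || return 1

    # Print where the copy landed.
    echo "$anchors/$(basename "$cert")"
}

# On the client, as root (the certificate path is an assumption):
#   install_trust_anchor /path/to/server/cert.pem && update-ca-trust extract
```

The same helper would apply on the server for a client certificate if mutual authentication is in use.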
* Re: problems getting rpc over tls to work 2023-03-28 14:25 ` Olga Kornievskaia @ 2023-03-28 14:38 ` Jeff Layton 2023-03-28 14:44 ` Olga Kornievskaia 2023-03-28 15:48 ` Jeff Layton 0 siblings, 2 replies; 26+ messages in thread From: Jeff Layton @ 2023-03-28 14:38 UTC (permalink / raw) To: Olga Kornievskaia; +Cc: Chuck Lever III, kernel-tls-handshake On Tue, 2023-03-28 at 10:25 -0400, Olga Kornievskaia wrote: > On Tue, Mar 28, 2023 at 10:14 AM Jeff Layton <jlayton@kernel.org> wrote: > > > > On Tue, 2023-03-28 at 13:55 +0000, Chuck Lever III wrote: > > > > > > > On Mar 28, 2023, at 9:29 AM, Chuck Lever III <chuck.lever@oracle.com> wrote: > > > > > > > > > > > > > > > > > On Mar 28, 2023, at 8:27 AM, Jeff Layton <jlayton@kernel.org> wrote: > > > > > > > > > > Hi Chuck! > > > > > > > > > > I have started the packaging work for Fedora for ktls-utils: > > > > > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=2182151 > > > > > > > > > > I also built packages for this in copr: > > > > > > > > > > https://copr.fedorainfracloud.org/coprs/jlayton/ktls-utils/ > > > > > > > > > > ...and built some interim nfs-utils packages with the requisite exportfs > > > > > patches: > > > > > > > > > > https://copr.fedorainfracloud.org/coprs/jlayton/nfs-utils/ > > > > > > > > Note that the nfs-utils changes aren't necessary to support > > > > the kernel server in "opportunistic" mode -- the server will > > > > use RPC-with-TLS if a client requests it, but otherwise does > > > > not restrict access. > > > > > > > > Client side also has no nfs-utils requirements at this time, > > > > since the new mount options are handled by the kernel. > > > > > > In case I wasn't clear: > > > > > > This was meant as a suggestion. If you want to simplify your > > > test set-up a bit, the nfs-utils piece isn't needed at this > > > point. But feel free to include it if you like! > > > > > > > Understood. I needed to build it for the server side anyway, so I > > figured I might as well. 
Eventually I'd like to set up a Fedora COPR > > repo that has all of the packages we need to test this, but I need to > > sort through the certificate handling here first. > > > > Are there docs on how to administer gnutls? For instance, I guess I'll > > want to set up my own CA and issue client and server certs. How do I > > make gnutls trust a new CA? > > Hi Jeff, > > To get self-signed certificates to work you need to (on the client's > machine) copy your server's cert.pem file into > /etc/pki/ca-trust/source/anchors and then run the “update-ca-trust > extract”. > > Many thanks, Olga! That got me further: Mar 28 10:35:05 nfsclnt tlshd[1498]: Handshake with nfsd.poochiereds.net (192.168.1.140) was successful The mount still isn't working yet, but I think I'm getting closer. I'll keep poking at it. Thanks! -- Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: problems getting rpc over tls to work 2023-03-28 14:38 ` Jeff Layton @ 2023-03-28 14:44 ` Olga Kornievskaia 2023-03-28 14:47 ` Chuck Lever III 2023-03-28 15:48 ` Jeff Layton 1 sibling, 1 reply; 26+ messages in thread From: Olga Kornievskaia @ 2023-03-28 14:44 UTC (permalink / raw) To: Jeff Layton; +Cc: Chuck Lever III, kernel-tls-handshake On Tue, Mar 28, 2023 at 10:38 AM Jeff Layton <jlayton@kernel.org> wrote: > > On Tue, 2023-03-28 at 10:25 -0400, Olga Kornievskaia wrote: > > On Tue, Mar 28, 2023 at 10:14 AM Jeff Layton <jlayton@kernel.org> wrote: > > > > > > On Tue, 2023-03-28 at 13:55 +0000, Chuck Lever III wrote: > > > > > > > > > On Mar 28, 2023, at 9:29 AM, Chuck Lever III <chuck.lever@oracle.com> wrote: > > > > > > > > > > > > > > > > > > > > > On Mar 28, 2023, at 8:27 AM, Jeff Layton <jlayton@kernel.org> wrote: > > > > > > > > > > > > Hi Chuck! > > > > > > > > > > > > I have started the packaging work for Fedora for ktls-utils: > > > > > > > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=2182151 > > > > > > > > > > > > I also built packages for this in copr: > > > > > > > > > > > > https://copr.fedorainfracloud.org/coprs/jlayton/ktls-utils/ > > > > > > > > > > > > ...and built some interim nfs-utils packages with the requisite exportfs > > > > > > patches: > > > > > > > > > > > > https://copr.fedorainfracloud.org/coprs/jlayton/nfs-utils/ > > > > > > > > > > Note that the nfs-utils changes aren't necessary to support > > > > > the kernel server in "opportunistic" mode -- the server will > > > > > use RPC-with-TLS if a client requests it, but otherwise does > > > > > not restrict access. > > > > > > > > > > Client side also has no nfs-utils requirements at this time, > > > > > since the new mount options are handled by the kernel. > > > > > > > > In case I wasn't clear: > > > > > > > > This was meant as a suggestion. If you want to simplify your > > > > test set-up a bit, the nfs-utils piece isn't needed at this > > > > point. 
But feel free to include it if you like! > > > > > > > > > > Understood. I needed to build it for the server side anyway, so I > > > figured I might as well. Eventually I'd like to set up a Fedora COPR > > > repo that has all of the packages we need to test this, but I need to > > > sort through the certificate handling here first. > > > > > > Are there docs on how to administer gnutls? For instance, I guess I'll > > > want to set up my own CA and issue client and server certs. How do I > > > make gnutls trust a new CA? > > > > Hi Jeff, > > > > To get self-signed certificates to work you need to (on the client's > > machine) copy your server's cert.pem file into > > /etc/pki/ca-trust/source/anchors and then run the “update-ca-trust > > extract”. > > > > > > Many thanks, Olga! That got me further: > > Mar 28 10:35:05 nfsclnt tlshd[1498]: Handshake with nfsd.poochiereds.net (192.168.1.140) was successful > > The mount still isn't working yet, but I think I'm getting closer. I'll > keep poking at it. I went through several iterations before I got that working. If you are doing mutual authentication, then the client's self-signed certificate needs to be added to the server's trusted CA anchors in the same manner. My next stumbling block, which Chuck helped me with, was that the negotiated cipher was ChaCha20-Poly1305, which I didn't have enabled in my kernel. So check that you have CONFIG_CRYPTO_CHACHA20POLY1305 compiled into the kernel. > > Thanks! > -- > Jeff Layton <jlayton@kernel.org> ^ permalink raw reply [flat|nested] 26+ messages in thread
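A quick way to check the kernel option Olga mentions might look like the sketch below. It assumes the running kernel's config is available as a plain-text file, which on Fedora is typically /boot/config-$(uname -r); kconfig_state is a hypothetical helper name, and the config path can be passed explicitly if it lives elsewhere.

```shell
# Sketch only: report how a Kconfig symbol is set in a kernel config file.
# kconfig_state is a hypothetical helper; it prints "y", "m", or "unset".
kconfig_state() {
    symbol=$1
    config=${2:-/boot/config-$(uname -r)}

    # Enabled symbols appear as SYMBOL=y or SYMBOL=m; disabled ones show up
    # only as a "# SYMBOL is not set" comment, or not at all.
    value=$(sed -n "s/^${symbol}=\([ym]\)\$/\1/p" "$config" | head -n 1)
    echo "${value:-unset}"
}

# Example (the default config location is an assumption about the distro):
#   kconfig_state CONFIG_CRYPTO_CHACHA20POLY1305
```

A result of "unset" would suggest the same cipher problem Olga ran into; "y" or "m" means the option was at least selected at build time.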
* Re: problems getting rpc over tls to work 2023-03-28 14:44 ` Olga Kornievskaia @ 2023-03-28 14:47 ` Chuck Lever III 0 siblings, 0 replies; 26+ messages in thread From: Chuck Lever III @ 2023-03-28 14:47 UTC (permalink / raw) To: Olga Kornievskaia; +Cc: Jeff Layton, kernel-tls-handshake > On Mar 28, 2023, at 10:44 AM, Olga Kornievskaia <aglo@umich.edu> wrote: > > My next stumbling block, which Chuck helped me with, was that the > negotiated cipher was ChaCha20-Poly1305, which I didn't have enabled in > my kernel. So check that you have CONFIG_CRYPTO_CHACHA20POLY1305 > compiled into the kernel. Yeah, I think tlshd's ability to detect what is supported by the local kernel is still not perfect. To that end I was thinking of adding a configuration option to /etc/tlshd.conf to enable and disable these algorithms. Does anyone know whether ChaCha20 and Poly1305 are going to be enabled in present or future Fedora/openSUSE kernels? -- Chuck Lever ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: problems getting rpc over tls to work 2023-03-28 14:38 ` Jeff Layton 2023-03-28 14:44 ` Olga Kornievskaia @ 2023-03-28 15:48 ` Jeff Layton 2023-03-28 16:06 ` Chuck Lever III 1 sibling, 1 reply; 26+ messages in thread From: Jeff Layton @ 2023-03-28 15:48 UTC (permalink / raw) To: Olga Kornievskaia; +Cc: Chuck Lever III, kernel-tls-handshake [-- Attachment #1: Type: text/plain, Size: 7799 bytes --] On Tue, 2023-03-28 at 10:38 -0400, Jeff Layton wrote: > On Tue, 2023-03-28 at 10:25 -0400, Olga Kornievskaia wrote: > > On Tue, Mar 28, 2023 at 10:14 AM Jeff Layton <jlayton@kernel.org> wrote: > > > > > > On Tue, 2023-03-28 at 13:55 +0000, Chuck Lever III wrote: > > > > > > > > > On Mar 28, 2023, at 9:29 AM, Chuck Lever III <chuck.lever@oracle.com> wrote: > > > > > > > > > > > > > > > > > > > > > On Mar 28, 2023, at 8:27 AM, Jeff Layton <jlayton@kernel.org> wrote: > > > > > > > > > > > > Hi Chuck! > > > > > > > > > > > > I have started the packaging work for Fedora for ktls-utils: > > > > > > > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=2182151 > > > > > > > > > > > > I also built packages for this in copr: > > > > > > > > > > > > https://copr.fedorainfracloud.org/coprs/jlayton/ktls-utils/ > > > > > > > > > > > > ...and built some interim nfs-utils packages with the requisite exportfs > > > > > > patches: > > > > > > > > > > > > https://copr.fedorainfracloud.org/coprs/jlayton/nfs-utils/ > > > > > > > > > > Note that the nfs-utils changes aren't necessary to support > > > > > the kernel server in "opportunistic" mode -- the server will > > > > > use RPC-with-TLS if a client requests it, but otherwise does > > > > > not restrict access. > > > > > > > > > > Client side also has no nfs-utils requirements at this time, > > > > > since the new mount options are handled by the kernel. > > > > > > > > In case I wasn't clear: > > > > > > > > This was meant as a suggestion. 
If you want to simplify your > > > > test set-up a bit, the nfs-utils piece isn't needed at this > > > > point. But feel free to include it if you like! > > > > > > > > > > Understood. I needed to build it for the server side anyway, so I > > > figured I might as well. Eventually I'd like to set up a Fedora COPR > > > repo that has all of the packages we need to test this, but I need to > > > sort through the certificate handling here first. > > > > > > Are there docs on how to administer gnutls? For instance, I guess I'll > > > want to set up my own CA and issue client and server certs. How do I > > > make gnutls trust a new CA? > > > > Hi Jeff, > > > > To get self-signed certificates to work you need to (on the client's > > machine) copy your server's cert.pem file into > > /etc/pki/ca-trust/source/anchors and then run the “update-ca-trust > > extract”. > > > > > > Many thanks, Olga! That got me further: > > Mar 28 10:35:05 nfsclnt tlshd[1498]: Handshake with nfsd.poochiereds.net (192.168.1.140) was successful > > The mount still isn't working yet, but I think I'm getting closer. I'll > keep poking at it. > OK! I cranked up the debugging. Here's the kernel tracepoints during this time: <idle>-0 [007] ..s2. 3657.494946: svc_xprt_enqueue: server=0.0.0.0:2049 client=(einval) flags=BUSY|CONN|CHNGBUF|LISTENER|CACHE_AUTH|CONG_CTRL pid=1051 nfsd-1051 [005] ..... 3657.494980: svc_xprt_dequeue: server=0.0.0.0:2049 client=(einval) flags=BUSY|CONN|CHNGBUF|LISTENER|CACHE_AUTH|CONG_CTRL wakeup-us=45 nfsd-1051 [005] ..... 3657.495071: svcsock_new_socket: type=STREAM family=AF_INET nfsd-1051 [005] ..... 3657.495085: svc_xprt_enqueue: server=192.168.1.140:2049 client=192.168.1.136:818 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL pid=1050 nfsd-1051 [005] ..... 3657.495086: svc_xprt_accept: server=192.168.1.140:2049 client=192.168.1.136:818 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL protocol=tcp service=nfsd nfsd-1051 [005] ..... 
3657.495092: svc_xprt_enqueue: server=0.0.0.0:2049 client=(einval) flags=BUSY|CONN|CHNGBUF|LISTENER|CACHE_AUTH|CONG_CTRL pid=1049 nfsd-1051 [005] ..... 3657.495095: svc_xprt_dequeue: server=192.168.1.140:2049 client=192.168.1.136:818 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL wakeup-us=158 nfsd-1051 [005] ..... 3657.495101: svcsock_marker: addr=192.168.1.136:818 length=40 (last) nfsd-1051 [005] ..... 3657.495104: svcsock_tcp_recv: addr=192.168.1.136:818 result=40 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL nfsd-1051 [005] ..... 3657.495111: svc_xprt_enqueue: server=192.168.1.140:2049 client=192.168.1.136:818 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL pid=1048 nfsd-1051 [005] ..... 3657.495112: svc_xdr_recvfrom: xid=0xd1e2303e head=[000000005f040892,40] page=0 tail=[0000000000000000,0] len=40 nfsd-1050 [007] ..... 3657.495121: svc_xprt_dequeue: server=0.0.0.0:2049 client=(einval) flags=BUSY|CONN|CHNGBUF|LISTENER|CACHE_AUTH|CONG_CTRL wakeup-us=44 nfsd-1051 [005] ..... 3657.495125: svc_tls_start: server=192.168.1.140:2049 client=192.168.1.136:818 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL nfsd-1051 [005] ..... 3657.495128: svc_process: addr=192.168.1.136:818 xid=0xd1e2303e service=nfsd vers=4 proc=NULL nfsd-1051 [005] ..... 3657.495132: svc_xdr_sendto: xid=0xd1e2303e head=[000000005ccd151e,32] page=0(0) tail=[0000000000000000,0] len=32 nfsd-1051 [005] ..... 3657.495133: svc_stats_latency: xid=0xd1e2303e server=192.168.1.140:2049 client=192.168.1.136:818 proc=NULL execute-us=21 nfsd-1050 [007] ..... 3657.495146: svcsock_accept_err: addr=listener service=nfsd status=-11 nfsd-1048 [004] ..... 3657.495147: svc_xprt_dequeue: server=192.168.1.140:2049 client=192.168.1.136:818 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL|HANDSHAKE wakeup-us=42 nfsd-1048 [004] ..... 3657.495151: svc_tls_upcall: server=192.168.1.140:2049 client=192.168.1.136:818 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL|HANDSHAKE nfsd-1051 [005] ..... 
3657.495163: svcsock_tcp_send: addr=192.168.1.136:818 result=36 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL|HANDSHAKE nfsd-1051 [005] ..... 3657.495198: svc_send: xid=0xd1e2303e server=192.168.1.140:2049 client=192.168.1.136:818 status=36 flags=SECURE|USEDEFERRAL|SPLICE_OK|BUSY|DATA <idle>-0 [007] ..s2. 3657.651316: svcsock_data_ready: addr=192.168.1.136:818 result=0 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL|HANDSHAKE <idle>-0 [007] ..s2. 3657.655648: svcsock_data_ready: addr=192.168.1.136:818 result=0 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL|HANDSHAKE <idle>-0 [007] ..s2. 3657.669552: svcsock_data_ready: addr=192.168.1.136:818 result=0 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL|HANDSHAKE nfsd-1048 [004] ..... 3662.666590: svc_tls_timed_out: server=192.168.1.140:2049 client=192.168.1.136:818 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL|HANDSHAKE <<<<<<<<<<< TIMEOUT HERE nfsd-1048 [004] ..... 3662.666602: svc_xprt_enqueue: server=192.168.1.140:2049 client=192.168.1.136:818 flags=BUSY|CLOSE|DATA|TEMP|CACHE_AUTH|CONG_CTRL pid=1051 nfsd-1048 [004] ..... 3662.666630: svc_xprt_dequeue: server=192.168.1.140:2049 client=192.168.1.136:818 flags=BUSY|CLOSE|DATA|TEMP|CACHE_AUTH|CONG_CTRL wakeup-us=5171655 nfsd-1048 [004] ..... 3662.666631: svc_xprt_detach: server=192.168.1.140:2049 client=192.168.1.136:818 flags=BUSY|CLOSE|DATA|TEMP|DEAD|CACHE_AUTH|CONG_CTRL nfsd-1048 [004] ..... 3662.666689: svc_xprt_free: server=192.168.1.140:2049 client=192.168.1.136:818 flags=BUSY|CLOSE|DATA|TEMP|DEAD|CACHE_AUTH|CONG_CTRL It looks like it timed out waiting for the downcall. I cranked up the debug logging in tlshd at the same time and attached it to this. It looks like it all worked, so I'm not sure why the kernel didn't see the downcall. Thoughts? -- Jeff Layton <jlayton@kernel.org> [-- Attachment #2: tlshd.log.gz --] [-- Type: application/gzip, Size: 4622 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
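The gap Jeff points at can be read straight off the timestamps: svc_tls_upcall fires at 3657.495151 and svc_tls_timed_out at 3662.666590, a little over five seconds later. A small awk sketch for pulling that number out of a saved trace follows. It assumes the trace text has one event per line in the usual ftrace layout (task, CPU, flags, timestamp, event name); handshake_gap is a hypothetical helper name.

```shell
# Sketch only: print the seconds between the TLS upcall and the timeout
# event in ftrace-style text. handshake_gap is a hypothetical helper.
handshake_gap() {
    awk '
        # Field 4 is the timestamp ("3657.495151:"), field 5 the event name.
        # Adding 0 makes awk use the numeric prefix and ignore the colon.
        $5 == "svc_tls_upcall:"    { start = $4 + 0 }
        $5 == "svc_tls_timed_out:" { printf "%.6f\n", ($4 + 0) - start }
    ' "$@"
}

# Example (trace.txt is a placeholder for a saved copy of the trace):
#   handshake_gap trace.txt
```

With no file argument the function reads the trace from standard input.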
* Re: problems getting rpc over tls to work 2023-03-28 15:48 ` Jeff Layton @ 2023-03-28 16:06 ` Chuck Lever III 0 siblings, 0 replies; 26+ messages in thread From: Chuck Lever III @ 2023-03-28 16:06 UTC (permalink / raw) To: Jeff Layton; +Cc: Olga Kornievskaia, kernel-tls-handshake > On Mar 28, 2023, at 11:48 AM, Jeff Layton <jlayton@kernel.org> wrote: > > On Tue, 2023-03-28 at 10:38 -0400, Jeff Layton wrote: >> On Tue, 2023-03-28 at 10:25 -0400, Olga Kornievskaia wrote: >>> On Tue, Mar 28, 2023 at 10:14 AM Jeff Layton <jlayton@kernel.org> wrote: >>>> >>>> On Tue, 2023-03-28 at 13:55 +0000, Chuck Lever III wrote: >>>>> >>>>>> On Mar 28, 2023, at 9:29 AM, Chuck Lever III <chuck.lever@oracle.com> wrote: >>>>>> >>>>>> >>>>>> >>>>>>> On Mar 28, 2023, at 8:27 AM, Jeff Layton <jlayton@kernel.org> wrote: >>>>>>> >>>>>>> Hi Chuck! >>>>>>> >>>>>>> I have started the packaging work for Fedora for ktls-utils: >>>>>>> >>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=2182151 >>>>>>> >>>>>>> I also built packages for this in copr: >>>>>>> >>>>>>> https://copr.fedorainfracloud.org/coprs/jlayton/ktls-utils/ >>>>>>> >>>>>>> ...and built some interim nfs-utils packages with the requisite exportfs >>>>>>> patches: >>>>>>> >>>>>>> https://copr.fedorainfracloud.org/coprs/jlayton/nfs-utils/ >>>>>> >>>>>> Note that the nfs-utils changes aren't necessary to support >>>>>> the kernel server in "opportunistic" mode -- the server will >>>>>> use RPC-with-TLS if a client requests it, but otherwise does >>>>>> not restrict access. >>>>>> >>>>>> Client side also has no nfs-utils requirements at this time, >>>>>> since the new mount options are handled by the kernel. >>>>> >>>>> In case I wasn't clear: >>>>> >>>>> This was meant as a suggestion. If you want to simplify your >>>>> test set-up a bit, the nfs-utils piece isn't needed at this >>>>> point. But feel free to include it if you like! >>>>> >>>> >>>> Understood. 
I needed to build it for the server side anyway, so I >>>> figured I might as well. Eventually I'd like to set up a Fedora COPR >>>> repo that has all of the packages we need to test this, but I need to >>>> sort through the certificate handling here first. >>>> >>>> Are there docs on how to administer gnutls? For instance, I guess I'll >>>> want to set up my own CA and issue client and server certs. How do I >>>> make gnutls trust a new CA? >>> >>> Hi Jeff, >>> >>> To get self-signed certificates to work you need to (on the client's >>> machine) copy your server's cert.pem file into >>> /etc/pki/ca-trust/source/anchors and then run the “update-ca-trust >>> extract”. >>> >>> >> >> Many thanks, Olga! That got me further: >> >> Mar 28 10:35:05 nfsclnt tlshd[1498]: Handshake with nfsd.poochiereds.net (192.168.1.140) was successful >> >> The mount still isn't working yet, but I think I'm getting closer. I'll >> keep poking at it. >> > > OK! I cranked up the debugging. Here's the kernel tracepoints during > this time: > > <idle>-0 [007] ..s2. 3657.494946: svc_xprt_enqueue: server=0.0.0.0:2049 client=(einval) flags=BUSY|CONN|CHNGBUF|LISTENER|CACHE_AUTH|CONG_CTRL pid=1051 > nfsd-1051 [005] ..... 3657.494980: svc_xprt_dequeue: server=0.0.0.0:2049 client=(einval) flags=BUSY|CONN|CHNGBUF|LISTENER|CACHE_AUTH|CONG_CTRL wakeup-us=45 > nfsd-1051 [005] ..... 3657.495071: svcsock_new_socket: type=STREAM family=AF_INET > nfsd-1051 [005] ..... 3657.495085: svc_xprt_enqueue: server=192.168.1.140:2049 client=192.168.1.136:818 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL pid=1050 > nfsd-1051 [005] ..... 3657.495086: svc_xprt_accept: server=192.168.1.140:2049 client=192.168.1.136:818 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL protocol=tcp service=nfsd > nfsd-1051 [005] ..... 3657.495092: svc_xprt_enqueue: server=0.0.0.0:2049 client=(einval) flags=BUSY|CONN|CHNGBUF|LISTENER|CACHE_AUTH|CONG_CTRL pid=1049 > nfsd-1051 [005] ..... 
3657.495095: svc_xprt_dequeue: server=192.168.1.140:2049 client=192.168.1.136:818 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL wakeup-us=158 > nfsd-1051 [005] ..... 3657.495101: svcsock_marker: addr=192.168.1.136:818 length=40 (last) > nfsd-1051 [005] ..... 3657.495104: svcsock_tcp_recv: addr=192.168.1.136:818 result=40 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL > nfsd-1051 [005] ..... 3657.495111: svc_xprt_enqueue: server=192.168.1.140:2049 client=192.168.1.136:818 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL pid=1048 > nfsd-1051 [005] ..... 3657.495112: svc_xdr_recvfrom: xid=0xd1e2303e head=[000000005f040892,40] page=0 tail=[0000000000000000,0] len=40 > nfsd-1050 [007] ..... 3657.495121: svc_xprt_dequeue: server=0.0.0.0:2049 client=(einval) flags=BUSY|CONN|CHNGBUF|LISTENER|CACHE_AUTH|CONG_CTRL wakeup-us=44 > nfsd-1051 [005] ..... 3657.495125: svc_tls_start: server=192.168.1.140:2049 client=192.168.1.136:818 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL > nfsd-1051 [005] ..... 3657.495128: svc_process: addr=192.168.1.136:818 xid=0xd1e2303e service=nfsd vers=4 proc=NULL > nfsd-1051 [005] ..... 3657.495132: svc_xdr_sendto: xid=0xd1e2303e head=[000000005ccd151e,32] page=0(0) tail=[0000000000000000,0] len=32 > nfsd-1051 [005] ..... 3657.495133: svc_stats_latency: xid=0xd1e2303e server=192.168.1.140:2049 client=192.168.1.136:818 proc=NULL execute-us=21 > nfsd-1050 [007] ..... 3657.495146: svcsock_accept_err: addr=listener service=nfsd status=-11 > nfsd-1048 [004] ..... 3657.495147: svc_xprt_dequeue: server=192.168.1.140:2049 client=192.168.1.136:818 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL|HANDSHAKE wakeup-us=42 > nfsd-1048 [004] ..... 3657.495151: svc_tls_upcall: server=192.168.1.140:2049 client=192.168.1.136:818 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL|HANDSHAKE > nfsd-1051 [005] ..... 3657.495163: svcsock_tcp_send: addr=192.168.1.136:818 result=36 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL|HANDSHAKE > nfsd-1051 [005] ..... 
3657.495198: svc_send: xid=0xd1e2303e server=192.168.1.140:2049 client=192.168.1.136:818 status=36 flags=SECURE|USEDEFERRAL|SPLICE_OK|BUSY|DATA > <idle>-0 [007] ..s2. 3657.651316: svcsock_data_ready: addr=192.168.1.136:818 result=0 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL|HANDSHAKE > <idle>-0 [007] ..s2. 3657.655648: svcsock_data_ready: addr=192.168.1.136:818 result=0 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL|HANDSHAKE > <idle>-0 [007] ..s2. 3657.669552: svcsock_data_ready: addr=192.168.1.136:818 result=0 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL|HANDSHAKE > nfsd-1048 [004] ..... 3662.666590: svc_tls_timed_out: server=192.168.1.140:2049 client=192.168.1.136:818 flags=BUSY|DATA|TEMP|CACHE_AUTH|CONG_CTRL|HANDSHAKE <<<<<<<<<<< TIMEOUT HERE > nfsd-1048 [004] ..... 3662.666602: svc_xprt_enqueue: server=192.168.1.140:2049 client=192.168.1.136:818 flags=BUSY|CLOSE|DATA|TEMP|CACHE_AUTH|CONG_CTRL pid=1051 > nfsd-1048 [004] ..... 3662.666630: svc_xprt_dequeue: server=192.168.1.140:2049 client=192.168.1.136:818 flags=BUSY|CLOSE|DATA|TEMP|CACHE_AUTH|CONG_CTRL wakeup-us=5171655 > nfsd-1048 [004] ..... 3662.666631: svc_xprt_detach: server=192.168.1.140:2049 client=192.168.1.136:818 flags=BUSY|CLOSE|DATA|TEMP|DEAD|CACHE_AUTH|CONG_CTRL > nfsd-1048 [004] ..... 3662.666689: svc_xprt_free: server=192.168.1.140:2049 client=192.168.1.136:818 flags=BUSY|CLOSE|DATA|TEMP|DEAD|CACHE_AUTH|CONG_CTRL > > It looks like it timed out waiting for the downcall. I cranked up the > debug logging in tlshd at the same time and attached it to this. It > looks like it all worked, so I'm not sure why the kernel didn't see the > downcall. Check that src/tlshd/netlink.h looks exactly like include/uapi/linux/handshake.h Otherwise, enable function tracing to confirm that the downcall is either not getting done or is failing. -- Chuck Lever ^ permalink raw reply [flat|nested] 26+ messages in thread
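Chuck's first suggestion, checking that tlshd's copy of the netlink header matches the kernel's uapi header, could be done with something like the sketch below. It assumes the ktls-utils and kernel source trees are both checked out locally; the directory names in the example are placeholders, and headers_match is a hypothetical helper name.

```shell
# Sketch only: check whether two copies of a header are byte-identical.
# headers_match is a hypothetical helper; the caller supplies both paths.
headers_match() {
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "differ"
        # Show what changed on stderr so a stale copy is easy to spot.
        diff -u "$1" "$2" >&2 || true
        return 1
    fi
}

# Example, assuming both trees sit side by side (paths are placeholders):
#   headers_match ktls-utils/src/tlshd/netlink.h \
#                 linux/include/uapi/linux/handshake.h
```

If the two files differ, rebuilding tlshd against the header from the kernel actually being tested would be the obvious next step before turning on function tracing.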