linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* refcount underflow in nfsd41_destroy_cb
@ 2020-03-21  2:51 Jason A. Donenfeld
  2020-03-21 14:41 ` Jan Psota
  2020-03-21 19:50 ` Chuck Lever
  0 siblings, 2 replies; 11+ messages in thread
From: Jason A. Donenfeld @ 2020-03-21  2:51 UTC (permalink / raw)
  To: linux-nfs; +Cc: Jan Psota

Hello,

A user erroneously sent me a refcount UaF in NFS with 5.6-rc6. I
thought I should forward it onward here in case this is not already
known. The original reporter is CC'd in case you have any questions.

Regards,
Jason

-------8<------------------------

Mar 20 21:43:34 agro kernel: ------------[ cut here ]------------
Mar 20 21:43:34 agro kernel: refcount_t: underflow; use-after-free.
Mar 20 21:43:34 agro kernel: WARNING: CPU: 1 PID: 9334 at
lib/refcount.c:28 refcount_warn_saturate+0xd9/0xe0
Mar 20 21:43:34 agro kernel: Modules linked in: md5 wireguard
curve25519_x86_64 libcurve25519_generic libchacha20poly1305
chacha_x86_64 libchacha poly1305_x86_64 ip6_udp_tunnel udp_tunnel
libblake2s blake2s_x86_64 libblake2s_generic xt_mac xt_nat
xt_MASQUERADE xt_REDIRECT xt_owner xt_comment ipt_REJECT
nf_reject_ipv4 xt_mark xt_hashlimit xt_multiport nfnetlink_log
xt_NFLOG nf_log_ipv4 nf_log_common xt_LOG nf_nat_tftp nf_nat_sip
nf_nat_h323 nf_nat_ftp nf_conntrack_tftp nf_conntrack_sip
nf_conntrack_netlink nfnetlink nf_conntrack_h323 nf_conntrack_ftp nfsd
auth_rpcgss nfs_acl lockd grace sunrpc cpufreq_ondemand msr bridge stp
llc xt_tcpudp xt_conntrack iptable_mangle iptable_nat nf_nat
nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_raw iptable_filter
ip_tables x_tables btrfs blake2b_generic xor raid6_pq libcrc32c
zstd_compress zstd_decompress zlib_deflate zlib_inflate uas
usb_storage snd_hda_codec_realtek snd_hda_codec_generic matroxfb_base
matroxfb_g450 matroxfb_Ti3026 snd_hda_int
 el matroxfb_accel
Mar 20 21:43:34 agro kernel:  snd_intel_dspcfg cfbfillrect cfbimgblt
atlantic snd_hda_codec r8169 cfbcopyarea e1000e snd_hda_core snd_hwdep
matroxfb_DAC1064 g450_pll matroxfb_misc ptp realtek fb snd_pcm libphy
snd_timer mxm_wmi xhci_pci pcspkr i2c_piix4 input_leds pps_core fbdev
k10temp ohci_pci snd xhci_hcd ohci_hcd ehci_pci soundcore i2c_core
ehci_hcd wmi acpi_cpufreq vhci_hcd usbip_core usbcore usb_common
vhost_net vhost kvm_amd kvm irqbypass loop tun fuse it87 hwmon_vid
hwmon
Mar 20 21:43:34 agro kernel: CPU: 1 PID: 9334 Comm: kworker/u16:3 Not
tainted 5.6.0-rc6 #1
Mar 20 21:43:34 agro kernel: Hardware name: Gigabyte Technology Co.,
Ltd. GA-990FXA-UD5/GA-990FXA-UD5, BIOS F12 10/03/2013
Mar 20 21:43:34 agro kernel: Workqueue: rpciod rpc_async_schedule [sunrpc]
Mar 20 21:43:34 agro kernel: RIP: 0010:refcount_warn_saturate+0xd9/0xe0
Mar 20 21:43:34 agro kernel: Code: ff 48 c7 c7 e8 ff d5 81 c6 05 cd 49
b5 00 01 e8 4d 04 cf ff 0f 0b c3 48 c7 c7 90 ff d5 81 c6 05 b9 49 b5
00 01 e8 37 04 cf ff <0f> 0b c3 0f 1f 40 00 8b 07 3d 00 00 00 c0 74 12
83 f8 01 74 46 8d
Mar 20 21:43:34 agro kernel: RSP: 0018:ffffc900010dfde0 EFLAGS: 00010282
Mar 20 21:43:34 agro kernel: RAX: 0000000000000026 RBX:
0000000000000e81 RCX: 0000000000000007
Mar 20 21:43:34 agro kernel: RDX: 0000000000000007 RSI:
0000000000000092 RDI: ffff88881fc587c0
Mar 20 21:43:34 agro kernel: RBP: ffff8887eeb40470 R08:
00000000000004d9 R09: 0000000000000001
Mar 20 21:43:34 agro kernel: R10: 0000000000000000 R11:
0000000000000001 R12: ffff8884f1e28510
Mar 20 21:43:34 agro kernel: R13: ffff88837dc49f30 R14:
0000000000000001 R15: ffff888780b8a840
Mar 20 21:43:34 agro kernel: FS:  0000000000000000(0000)
GS:ffff88881fc40000(0000) knlGS:0000000000000000
Mar 20 21:43:34 agro kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 20 21:43:34 agro kernel: CR2: 000000000421efb0 CR3:
0000000510564000 CR4: 00000000000006e0
Mar 20 21:43:34 agro kernel: Call Trace:
Mar 20 21:43:34 agro kernel:  nfsd41_destroy_cb+0x36/0x50 [nfsd]
Mar 20 21:43:34 agro kernel:  rpc_free_task+0x3c/0x70 [sunrpc]
Mar 20 21:43:34 agro kernel:  __rpc_execute+0x3a5/0x3c0 [sunrpc]
Mar 20 21:43:34 agro kernel:  ? finish_task_switch+0x7f/0x250
Mar 20 21:43:34 agro kernel:  rpc_async_schedule+0x2f/0x50 [sunrpc]
Mar 20 21:43:34 agro kernel:  process_one_work+0x1ca/0x3c0
Mar 20 21:43:34 agro kernel:  worker_thread+0x45/0x3d0
Mar 20 21:43:34 agro kernel:  kthread+0xf3/0x130
Mar 20 21:43:34 agro kernel:  ? process_one_work+0x3c0/0x3c0
Mar 20 21:43:34 agro kernel:  ? kthread_park+0x80/0x80
Mar 20 21:43:34 agro kernel:  ret_from_fork+0x1f/0x30
Mar 20 21:43:34 agro kernel: ---[ end trace 99765c8e28c46274 ]---

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: refcount underflow in nfsd41_destroy_cb
  2020-03-21  2:51 refcount underflow in nfsd41_destroy_cb Jason A. Donenfeld
@ 2020-03-21 14:41 ` Jan Psota
  2020-03-21 17:25   ` Jan Psota
  2020-05-10 23:23   ` Jan Psota
  2020-03-21 19:50 ` Chuck Lever
  1 sibling, 2 replies; 11+ messages in thread
From: Jan Psota @ 2020-03-21 14:41 UTC (permalink / raw)
  To: linux-nfs; +Cc: Jason A. Donenfeld

[-- Attachment #1: Type: text/plain, Size: 70 bytes --]

Hi! Previous and first trace for problem that Jason forwarded to you.

[-- Attachment #2: nfsd-trace.txt --]
[-- Type: text/plain, Size: 3972 bytes --]

Mar 19 11:08:07 agro kernel: ------------[ cut here ]------------
Mar 19 11:08:07 agro kernel: refcount_t: underflow; use-after-free.
Mar 19 11:08:07 agro kernel: WARNING: CPU: 0 PID: 15539 at lib/refcount.c:28 refcount_warn_saturate+
0xd9/0xe0
Mar 19 11:08:07 agro kernel: Modules linked in: udp_diag inet_diag wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 libchacha poly1305_x86_64 ip6_udp_tunnel udp_tunnel libblake2s blake2s_x86_64 libblake2s_generic md5 xt_mac xt_nat xt_MASQUERADE xt_REDIRECT xt_owner xt_comment ipt_REJECT nf_reject_ipv4 xt_mark xt_hashlimit iptable_raw xt_multiport nfnetlink_log xt_NFLOG nf_log_ipv4 nf_log_common xt_LOG nf_nat_tftp nf_nat_sip nf_nat_h323 nf_nat_ftp nf_conntrack_tftp nf_conntrack_sip nf_conntrack_netlink nfnetlink nf_conntrack_h323 nf_conntrack_ftp nfsd auth_rpcgss nfs_acl lockd grace sunrpc cpufreq_ondemand msr bridge stp llc xt_tcpudp xt_conntrack iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter ip_tables x_tables btrfs blake2b_generic xor raid6_pq libcrc32c zstd_compress zstd_decompress zlib_deflate zlib_inflate hid_generic usbhid hid uas usb_storage snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel
  matroxfb_base
Mar 19 11:08:07 agro kernel:  snd_intel_dspcfg r8169 matroxfb_g450 matroxfb_Ti3026 matroxfb_accel e1000e cfbfillrect realtek snd_hda_codec libphy cfbimgblt cfbcopyarea matroxfb_DAC1064 g450_pll matroxfb_misc fb snd_hda_core atlantic mxm_wmi input_leds snd_hwdep xhci_pci pcspkr k10temp ptp xhci_hcd fbdev snd_pcm pps_core snd_timer snd wmi ohci_pci ohci_hcd i2c_piix4 ehci_pci ehci_hcd soundcore i2c_core acpi_cpufreq vhci_hcd usbip_core usbcore usb_common vhost_net vhost kvm_amd kvm irqbypass loop tun fuse it87 hwmon_vid hwmon
Mar 19 11:08:07 agro kernel: CPU: 0 PID: 15539 Comm: kworker/u16:1 Not tainted 5.6.0-rc6 #1
Mar 19 11:08:07 agro kernel: Hardware name: Gigabyte Technology Co., Ltd. GA-990FXA-UD5/GA-990FXA-UD5, BIOS F12 10/03/2013
Mar 19 11:08:07 agro kernel: Workqueue: rpciod rpc_async_schedule [sunrpc]
Mar 19 11:08:07 agro kernel: RIP: 0010:refcount_warn_saturate+0xd9/0xe0
Mar 19 11:08:07 agro kernel: Code: ff 48 c7 c7 e8 ff d5 81 c6 05 cd 49 b5 00 01 e8 4d 04 cf ff 0f 0b c3 48 c7 c7 90 ff d5 81 c6 05 b9 49 b5 00 01 e8 37 04 cf ff <0f> 0b c3 0f 1f 40 00 8b 07 3d 00 00 00 c0 74 12 83 f8 01 74 46 8d
Mar 19 11:08:07 agro kernel: RSP: 0018:ffffc9000056fde0 EFLAGS: 00010282
Mar 19 11:08:07 agro kernel: RAX: 0000000000000026 RBX: 0000000000000e81 RCX: 0000000000000007
Mar 19 11:08:07 agro kernel: RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff88881fc187c0
Mar 19 11:08:07 agro kernel: RBP: ffff8887f0394c70 R08: 00000000000015b7 R09: 0000000000000001
Mar 19 11:08:07 agro kernel: R10: 0000000000000000 R11: 0000000000000001 R12: ffff8885fec40510
Mar 19 11:08:07 agro kernel: R13: ffff888231700430 R14: 0000000000000001 R15: ffff888016454a80
Mar 19 11:08:07 agro kernel: FS:  0000000000000000(0000) GS:ffff88881fc00000(0000) knlGS:0000000000000000
Mar 19 11:08:07 agro kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 19 11:08:07 agro kernel: CR2: 000000000319e728 CR3: 0000000611f86000 CR4: 00000000000006f0
Mar 19 11:08:07 agro kernel: Call Trace:
Mar 19 11:08:07 agro kernel:  nfsd41_destroy_cb+0x36/0x50 [nfsd]
Mar 19 11:08:07 agro kernel:  rpc_free_task+0x3c/0x70 [sunrpc]
Mar 19 11:08:07 agro kernel:  __rpc_execute+0x3a5/0x3c0 [sunrpc]
Mar 19 11:08:07 agro kernel:  ? finish_task_switch+0x7f/0x250
Mar 19 11:08:07 agro kernel:  rpc_async_schedule+0x2f/0x50 [sunrpc]
Mar 19 11:08:07 agro kernel:  process_one_work+0x1ca/0x3c0
Mar 19 11:08:07 agro kernel:  worker_thread+0x45/0x3d0
Mar 19 11:08:07 agro kernel:  kthread+0xf3/0x130
Mar 19 11:08:07 agro kernel:  ? process_one_work+0x3c0/0x3c0
Mar 19 11:08:07 agro kernel:  ? kthread_park+0x80/0x80
Mar 19 11:08:07 agro kernel:  ret_from_fork+0x1f/0x30
Mar 19 11:08:07 agro kernel: ---[ end trace 175a0f52bbff4924 ]---

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: refcount underflow in nfsd41_destroy_cb
  2020-03-21 14:41 ` Jan Psota
@ 2020-03-21 17:25   ` Jan Psota
  2020-05-10 23:23   ` Jan Psota
  1 sibling, 0 replies; 11+ messages in thread
From: Jan Psota @ 2020-03-21 17:25 UTC (permalink / raw)
  To: linux-nfs

[-- Attachment #1: Type: text/plain, Size: 84 bytes --]

Hi! 3-rd trace. If it is enough, so you want me not to send next, just
let me know.

[-- Attachment #2: nfsd-trace3.txt --]
[-- Type: text/plain, Size: 3481 bytes --]

[53545.848242] ------------[ cut here ]------------
[53545.848246] refcount_t: underflow; use-after-free.
[53545.848283] WARNING: CPU: 0 PID: 22784 at lib/refcount.c:28 refcount_warn_saturate+0xd9/0xe0
[53545.848285] Modules linked in: md5 wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 libchacha poly1305_x86_64 ip6_udp_tunnel udp_tunnel libblake2s blake2s_x86_64 libblake2s_generic xt_mac xt_nat xt_MASQUERADE xt_REDIRECT xt_owner xt_comment ipt_REJECT nf_reject_ipv4 xt_mark xt_hashlimit xt_multiport nfnetlink_log xt_NFLOG nf_log_ipv4 nf_log_common xt_LOG nf_nat_tftp nf_nat_sip nf_nat_h323 nf_nat_ftp nf_conntrack_tftp nf_conntrack_sip nf_conntrack_netlink nfnetlink nf_conntrack_h323 nf_conntrack_ftp nfsd auth_rpcgss nfs_acl lockd grace sunrpc cpufreq_ondemand msr bridge stp llc xt_tcpudp xt_conntrack iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_raw iptable_filter ip_tables x_tables btrfs blake2b_generic xor raid6_pq libcrc32c zstd_compress zstd_decompress zlib_deflate zlib_inflate uas usb_storage snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel matroxfb_base snd_intel_dspcfg snd_hda_codec matroxfb_g
 450
[53545.848354]  snd_hda_core matroxfb_Ti3026 matroxfb_accel r8169 cfbfillrect cfbimgblt snd_hwdep e1000e realtek libphy cfbcopyarea snd_pcm mxm_wmi pcspkr matroxfb_DAC1064 g450_pll snd_timer matroxfb_misc input_leds atlantic fb snd wmi fbdev xhci_pci k10temp xhci_hcd i2c_piix4 ptp soundcore pps_core ohci_pci i2c_core ohci_hcd ehci_pci ehci_hcd acpi_cpufreq vhci_hcd usbip_core usbcore usb_common vhost_net vhost kvm_amd kvm irqbypass loop tun fuse it87 hwmon_vid hwmon
[53545.848402] CPU: 0 PID: 22784 Comm: kworker/u16:0 Not tainted 5.6.0-rc6 #1
[53545.848405] Hardware name: Gigabyte Technology Co., Ltd. GA-990FXA-UD5/GA-990FXA-UD5, BIOS F12 10/03/2013
[53545.848432] Workqueue: rpciod rpc_async_schedule [sunrpc]
[53545.848440] RIP: 0010:refcount_warn_saturate+0xd9/0xe0
[53545.848446] Code: ff 48 c7 c7 e8 ff d5 81 c6 05 cd 49 b5 00 01 e8 4d 04 cf ff 0f 0b c3 48 c7 c7 90 ff d5 81 c6 05 b9 49 b5 00 01 e8 37 04 cf ff <0f> 0b c3 0f 1f 40 00 8b 07 3d 00 00 00 c0 74 12 83 f8 01 74 46 8d
[53545.848449] RSP: 0018:ffffc900045efde0 EFLAGS: 00010282
[53545.848452] RAX: 0000000000000026 RBX: 0000000000000e81 RCX: 0000000000000007
[53545.848455] RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff88881fc187c0
[53545.848457] RBP: ffff8887ee7b2c70 R08: 00000000000004ef R09: 0000000000000001
[53545.848459] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8884f1a10510
[53545.848462] R13: ffff88815792f530 R14: 0000000000000001 R15: ffff8882343c1300
[53545.848465] FS:  0000000000000000(0000) GS:ffff88881fc00000(0000) knlGS:0000000000000000
[53545.848467] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[53545.848470] CR2: 0000000002600000 CR3: 0000000511560000 CR4: 00000000000006f0
[53545.848471] Call Trace:
[53545.848501]  nfsd41_destroy_cb+0x36/0x50 [nfsd]
[53545.848524]  rpc_free_task+0x3c/0x70 [sunrpc]
[53545.848544]  __rpc_execute+0x3a5/0x3c0 [sunrpc]
[53545.848552]  ? finish_task_switch+0x7f/0x250
[53545.848571]  rpc_async_schedule+0x2f/0x50 [sunrpc]
[53545.848578]  process_one_work+0x1ca/0x3c0
[53545.848584]  worker_thread+0x45/0x3d0
[53545.848590]  kthread+0xf3/0x130
[53545.848595]  ? process_one_work+0x3c0/0x3c0
[53545.848599]  ? kthread_park+0x80/0x80
[53545.848607]  ret_from_fork+0x1f/0x30
[53545.848613] ---[ end trace dfea44b545c7626a ]---

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: refcount underflow in nfsd41_destroy_cb
  2020-03-21  2:51 refcount underflow in nfsd41_destroy_cb Jason A. Donenfeld
  2020-03-21 14:41 ` Jan Psota
@ 2020-03-21 19:50 ` Chuck Lever
  2020-03-22  3:43   ` Jan Psota
  1 sibling, 1 reply; 11+ messages in thread
From: Chuck Lever @ 2020-03-21 19:50 UTC (permalink / raw)
  To: Jan Psota; +Cc: Linux NFS Mailing List, Jason A. Donenfeld



> On Mar 20, 2020, at 10:51 PM, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> 
> Hello,
> 
> A user erroneously sent me a refcount UaF in NFS with 5.6-rc6. I
> thought I should forward it onward here in case this is not already
> known. The original reporter is CC'd in case you have any questions.

Jan, how are you reproducing this?


> Regards,
> Jason
> 
> -------8<------------------------
> 
> Mar 20 21:43:34 agro kernel: ------------[ cut here ]------------
> Mar 20 21:43:34 agro kernel: refcount_t: underflow; use-after-free.
> Mar 20 21:43:34 agro kernel: WARNING: CPU: 1 PID: 9334 at
> lib/refcount.c:28 refcount_warn_saturate+0xd9/0xe0
> Mar 20 21:43:34 agro kernel: Modules linked in: md5 wireguard
> curve25519_x86_64 libcurve25519_generic libchacha20poly1305
> chacha_x86_64 libchacha poly1305_x86_64 ip6_udp_tunnel udp_tunnel
> libblake2s blake2s_x86_64 libblake2s_generic xt_mac xt_nat
> xt_MASQUERADE xt_REDIRECT xt_owner xt_comment ipt_REJECT
> nf_reject_ipv4 xt_mark xt_hashlimit xt_multiport nfnetlink_log
> xt_NFLOG nf_log_ipv4 nf_log_common xt_LOG nf_nat_tftp nf_nat_sip
> nf_nat_h323 nf_nat_ftp nf_conntrack_tftp nf_conntrack_sip
> nf_conntrack_netlink nfnetlink nf_conntrack_h323 nf_conntrack_ftp nfsd
> auth_rpcgss nfs_acl lockd grace sunrpc cpufreq_ondemand msr bridge stp
> llc xt_tcpudp xt_conntrack iptable_mangle iptable_nat nf_nat
> nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_raw iptable_filter
> ip_tables x_tables btrfs blake2b_generic xor raid6_pq libcrc32c
> zstd_compress zstd_decompress zlib_deflate zlib_inflate uas
> usb_storage snd_hda_codec_realtek snd_hda_codec_generic matroxfb_base
> matroxfb_g450 matroxfb_Ti3026 snd_hda_int
> el matroxfb_accel
> Mar 20 21:43:34 agro kernel:  snd_intel_dspcfg cfbfillrect cfbimgblt
> atlantic snd_hda_codec r8169 cfbcopyarea e1000e snd_hda_core snd_hwdep
> matroxfb_DAC1064 g450_pll matroxfb_misc ptp realtek fb snd_pcm libphy
> snd_timer mxm_wmi xhci_pci pcspkr i2c_piix4 input_leds pps_core fbdev
> k10temp ohci_pci snd xhci_hcd ohci_hcd ehci_pci soundcore i2c_core
> ehci_hcd wmi acpi_cpufreq vhci_hcd usbip_core usbcore usb_common
> vhost_net vhost kvm_amd kvm irqbypass loop tun fuse it87 hwmon_vid
> hwmon
> Mar 20 21:43:34 agro kernel: CPU: 1 PID: 9334 Comm: kworker/u16:3 Not
> tainted 5.6.0-rc6 #1
> Mar 20 21:43:34 agro kernel: Hardware name: Gigabyte Technology Co.,
> Ltd. GA-990FXA-UD5/GA-990FXA-UD5, BIOS F12 10/03/2013
> Mar 20 21:43:34 agro kernel: Workqueue: rpciod rpc_async_schedule [sunrpc]
> Mar 20 21:43:34 agro kernel: RIP: 0010:refcount_warn_saturate+0xd9/0xe0
> Mar 20 21:43:34 agro kernel: Code: ff 48 c7 c7 e8 ff d5 81 c6 05 cd 49
> b5 00 01 e8 4d 04 cf ff 0f 0b c3 48 c7 c7 90 ff d5 81 c6 05 b9 49 b5
> 00 01 e8 37 04 cf ff <0f> 0b c3 0f 1f 40 00 8b 07 3d 00 00 00 c0 74 12
> 83 f8 01 74 46 8d
> Mar 20 21:43:34 agro kernel: RSP: 0018:ffffc900010dfde0 EFLAGS: 00010282
> Mar 20 21:43:34 agro kernel: RAX: 0000000000000026 RBX:
> 0000000000000e81 RCX: 0000000000000007
> Mar 20 21:43:34 agro kernel: RDX: 0000000000000007 RSI:
> 0000000000000092 RDI: ffff88881fc587c0
> Mar 20 21:43:34 agro kernel: RBP: ffff8887eeb40470 R08:
> 00000000000004d9 R09: 0000000000000001
> Mar 20 21:43:34 agro kernel: R10: 0000000000000000 R11:
> 0000000000000001 R12: ffff8884f1e28510
> Mar 20 21:43:34 agro kernel: R13: ffff88837dc49f30 R14:
> 0000000000000001 R15: ffff888780b8a840
> Mar 20 21:43:34 agro kernel: FS:  0000000000000000(0000)
> GS:ffff88881fc40000(0000) knlGS:0000000000000000
> Mar 20 21:43:34 agro kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> Mar 20 21:43:34 agro kernel: CR2: 000000000421efb0 CR3:
> 0000000510564000 CR4: 00000000000006e0
> Mar 20 21:43:34 agro kernel: Call Trace:
> Mar 20 21:43:34 agro kernel:  nfsd41_destroy_cb+0x36/0x50 [nfsd]
> Mar 20 21:43:34 agro kernel:  rpc_free_task+0x3c/0x70 [sunrpc]
> Mar 20 21:43:34 agro kernel:  __rpc_execute+0x3a5/0x3c0 [sunrpc]
> Mar 20 21:43:34 agro kernel:  ? finish_task_switch+0x7f/0x250
> Mar 20 21:43:34 agro kernel:  rpc_async_schedule+0x2f/0x50 [sunrpc]
> Mar 20 21:43:34 agro kernel:  process_one_work+0x1ca/0x3c0
> Mar 20 21:43:34 agro kernel:  worker_thread+0x45/0x3d0
> Mar 20 21:43:34 agro kernel:  kthread+0xf3/0x130
> Mar 20 21:43:34 agro kernel:  ? process_one_work+0x3c0/0x3c0
> Mar 20 21:43:34 agro kernel:  ? kthread_park+0x80/0x80
> Mar 20 21:43:34 agro kernel:  ret_from_fork+0x1f/0x30
> Mar 20 21:43:34 agro kernel: ---[ end trace 99765c8e28c46274 ]---

--
Chuck Lever




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: refcount underflow in nfsd41_destroy_cb
  2020-03-21 19:50 ` Chuck Lever
@ 2020-03-22  3:43   ` Jan Psota
  2020-03-23 14:13     ` Chuck Lever
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Psota @ 2020-03-22  3:43 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Linux NFS Mailing List, Jason A. Donenfeld

Chuck Lever <chuck.lever@oracle.com> napisał(a):
> Jan, how are you reproducing this?

It looks like it's taking place on server on high NFS load and about
a day after boot! (as I noticed looking into last -x results, below)
Then system runs all right for a month (to be rebooted on new kernel
[not always] or something like this).

We have some NFS-rooted machines:
/systemd on / type nfs4 (rw,relatime,vers=4.2,rsize=4096,wsize=4096,namlen=255,hard,proto=tcp,
	timeo=10,retrans=2,sec=sys,clientaddr=192.168.1.18,local_lock=none,addr=192.168.1.1)

Server has 10Gb Aquantia AQC107 card connected to Mikrotik CSS326
switch. Clients running distcc (aside from acting as workstations)
are connected on 1Gb ethernet. Server runs Gentoo Linux on OpenRC
(stations have Systemd) with recent gcc-9.3, binutils-2.34 and
glibc-2.30, has 32 GB RAM and AMD Phenom II X6 1090T CPU.

/var/tmp/portage, where compilation takes place, normally is on client
tmpfs, but when there is not enough space to compile huge program, I
switch it to server exported NFS
(/etc/exports opts: -rw,async,no_root_squash,no_subtree_check)

# "grep nfs.*destroy /var/log/messages" mixed with "last -x"

reboot   system boot  5.5.1-gentoo     Mon Feb  3 00:20 - 15:22 (25+15:01)
Feb  4 17:44:39 agro kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
	rust compilation, kernel 5.5.1-gentoo

reboot   system boot  5.5.6-gentoo     Fri Feb 28 15:23 - 16:25 (14+01:02)
Feb 29 13:51:49 agro kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
	rust compilation, kernel 5.5.6-gentoo

reboot   system boot  5.5.9-gentoo     Fri Mar 13 16:27 - 00:04 (4+07:36)
Mar 14 18:03:49 agro kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
	libpciaccess compilation, kernel 

reboot   system boot  5.6.0-rc6        Wed Mar 18 00:06 - 20:39 (2+20:32)
Mar 19 11:08:07 agro kernel:  nfsd41_destroy_cb+0x36/0x50 [nfsd]
	linux-firmware merge
*
reboot   system boot  5.6.0-rc6        Fri Mar 20 20:40 - 02:40  (05:59)
Mar 20 21:43:34 agro kernel:  nfsd41_destroy_cb+0x36/0x50 [nfsd]
	zstd compilation
*
reboot   system boot  5.6.0-rc6        Sat Mar 21 02:42   still running
Mar 21 17:34:43 agro kernel:  nfsd41_destroy_cb+0x36/0x50 [nfsd]
	nodejs compilation

* - I noticed kernel fault looking for a reason, why WireGuard refused
to connect with _some_ remote peers so I rebooted the server and it helped.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: refcount underflow in nfsd41_destroy_cb
  2020-03-22  3:43   ` Jan Psota
@ 2020-03-23 14:13     ` Chuck Lever
  2020-03-23 15:09       ` Jan Psota
  0 siblings, 1 reply; 11+ messages in thread
From: Chuck Lever @ 2020-03-23 14:13 UTC (permalink / raw)
  To: Jan Psota; +Cc: Linux NFS Mailing List, Jason A. Donenfeld



> On Mar 21, 2020, at 11:43 PM, Jan Psota <jasiu@belsznica.pl> wrote:
> 
> Chuck Lever <chuck.lever@oracle.com> napisał(a):
>> Jan, how are you reproducing this?
> 
> It looks like it's taking place on server on high NFS load and about
> a day after boot! (as I noticed looking into last -x results, below)
> Then system runs all right for a month (to be rebooted on new kernel
> [not always] or something like this).
> 
> We have some NFS-rooted machines:
> /systemd on / type nfs4 (rw,relatime,vers=4.2,rsize=4096,wsize=4096,namlen=255,hard,proto=tcp,
> 	timeo=10,retrans=2,sec=sys,clientaddr=192.168.1.18,local_lock=none,addr=192.168.1.1)
> 
> Server has 10Gb Aquantia AQC107 card connected to Mikrotik CSS326
> switch. Clients running distcc (aside from acting as workstations)
> are connected on 1Gb ethernet. Server runs Gentoo Linux on OpenRC
> (stations have Systemd) with recent gcc-9.3, binutils-2.34 and
> glibc-2.30, has 32 GB RAM and AMD Phenom II X6 1090T CPU.
> 
> /var/tmp/portage, where compilation takes place, normally is on client
> tmpfs, but when there is not enough space to compile huge program, I
> switch it to server exported NFS
> (/etc/exports opts: -rw,async,no_root_squash,no_subtree_check)
> 
> # "grep nfs.*destroy /var/log/messages" mixed with "last -x"

I thought I read in the initial report that you were seeing this
problem only on v5.6-rc6. What is the earliest kernel release
where you saw refcount UaF warnings from nfsd4_destroy_cb?


> reboot   system boot  5.5.1-gentoo     Mon Feb  3 00:20 - 15:22 (25+15:01)
> Feb  4 17:44:39 agro kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
> 	rust compilation, kernel 5.5.1-gentoo
> 
> reboot   system boot  5.5.6-gentoo     Fri Feb 28 15:23 - 16:25 (14+01:02)
> Feb 29 13:51:49 agro kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
> 	rust compilation, kernel 5.5.6-gentoo
> 
> reboot   system boot  5.5.9-gentoo     Fri Mar 13 16:27 - 00:04 (4+07:36)
> Mar 14 18:03:49 agro kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
> 	libpciaccess compilation, kernel 
> 
> reboot   system boot  5.6.0-rc6        Wed Mar 18 00:06 - 20:39 (2+20:32)
> Mar 19 11:08:07 agro kernel:  nfsd41_destroy_cb+0x36/0x50 [nfsd]
> 	linux-firmware merge
> *
> reboot   system boot  5.6.0-rc6        Fri Mar 20 20:40 - 02:40  (05:59)
> Mar 20 21:43:34 agro kernel:  nfsd41_destroy_cb+0x36/0x50 [nfsd]
> 	zstd compilation
> *
> reboot   system boot  5.6.0-rc6        Sat Mar 21 02:42   still running
> Mar 21 17:34:43 agro kernel:  nfsd41_destroy_cb+0x36/0x50 [nfsd]
> 	nodejs compilation
> 
> * - I noticed kernel fault looking for a reason, why WireGuard refused
> to connect with _some_ remote peers so I rebooted the server and it helped.

--
Chuck Lever




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: refcount underflow in nfsd41_destroy_cb
  2020-03-23 14:13     ` Chuck Lever
@ 2020-03-23 15:09       ` Jan Psota
  2020-03-24 13:50         ` Chuck Lever
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Psota @ 2020-03-23 15:09 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Linux NFS Mailing List, Jason A. Donenfeld

[-- Attachment #1: Type: text/plain, Size: 1344 bytes --]

> I thought I read in the initial report that you were seeing this
> problem only on v5.6-rc6. What is the earliest kernel release
> where you saw refcount UaF warnings from nfsd4_destroy_cb?
> 
I didn't noticed that earlier too, because until connection breakage on
WireGuard I did not have any problems related. But when you are asking,
I found it in my Pentium G2020 system too since 5.5.4 kernel and 5.4.2
looks not affected (I have logs since 01 Jan and fault begin to appear
on Feb 21, when I switched from 5.4.2 to 5.5.4 kernel a day before)

$ journalctl | grep nfsd41_destroy_cb
lut 21 01:07:58 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
lut 27 01:01:12 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
mar 03 00:59:01 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
mar 03 23:03:02 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
mar 11 11:52:42 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
mar 13 01:12:02 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
mar 14 14:31:39 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
mar 15 20:56:56 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
mar 17 15:58:32 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
mar 22 15:24:03 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]

I attach NFS part of my .config and a screen dump of menuconfig.

[-- Attachment #2: nfs.config --]
[-- Type: application/octet-stream, Size: 1037 bytes --]

CONFIG_NETWORK_FILESYSTEMS=y
CONFIG_NFS_FS=m
# CONFIG_NFS_V2 is not set
CONFIG_NFS_V3=m
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=m
# CONFIG_NFS_SWAP is not set
CONFIG_NFS_V4_1=y
CONFIG_NFS_V4_2=y
CONFIG_PNFS_FILE_LAYOUT=m
CONFIG_PNFS_BLOCK=m
CONFIG_PNFS_FLEXFILE_LAYOUT=m
CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN="kernel.org"
# CONFIG_NFS_V4_1_MIGRATION is not set
CONFIG_NFS_V4_SECURITY_LABEL=y
# CONFIG_NFS_FSCACHE is not set
# CONFIG_NFS_USE_LEGACY_DNS is not set
CONFIG_NFS_USE_KERNEL_DNS=y
CONFIG_NFS_DISABLE_UDP_SUPPORT=y
CONFIG_NFSD=m
CONFIG_NFSD_V2_ACL=y
CONFIG_NFSD_V3=y
CONFIG_NFSD_V3_ACL=y
CONFIG_NFSD_V4=y
CONFIG_NFSD_PNFS=y
CONFIG_NFSD_BLOCKLAYOUT=y
CONFIG_NFSD_SCSILAYOUT=y
# CONFIG_NFSD_FLEXFILELAYOUT is not set
# CONFIG_NFSD_V4_2_INTER_SSC is not set
# CONFIG_NFSD_V4_SECURITY_LABEL is not set
CONFIG_GRACE_PERIOD=m
CONFIG_LOCKD=m
CONFIG_LOCKD_V4=y
CONFIG_NFS_ACL_SUPPORT=m
CONFIG_NFS_COMMON=y
CONFIG_SUNRPC=m
CONFIG_SUNRPC_GSS=m
CONFIG_SUNRPC_BACKCHANNEL=y
# CONFIG_RPCSEC_GSS_KRB5 is not set
# CONFIG_SUNRPC_DEBUG is not set

[-- Attachment #3: nfs-config.txt --]
[-- Type: text/plain, Size: 2675 bytes --]

│ │         --- Network File Systems                                                        │ │  
│ │         <M>   NFS client support                                                        │ │  
│ │         < >     NFS client support for NFS version 2                                    │ │  
│ │         <M>     NFS client support for NFS version 3                                    │ │  
│ │         [*]       NFS client support for the NFSv3 ACL protocol extension               │ │  
│ │         <M>     NFS client support for NFS version 4                                    │ │  
│ │         [ ]     Provide swap over NFS support                                           │ │  
│ │         [*]   NFS client support for NFSv4.1                                            │ │  
│ │         [*]     NFS client support for NFSv4.2                                          │ │  
│ │         (kernel.org) NFSv4.1 Implementation ID Domain                                   │ │  
│ │         [ ]     NFSv4.1 client support for migration                                    │ │  
│ │         [ ]   Provide NFS client caching support                                        │ │  
│ │         [ ]   Use the legacy NFS DNS resolver                                           │ │  
│ │         [*]   NFS: Disable NFS UDP protocol support                                     │ │  
│ │         <M>   NFS server support                                                        │ │  
│ │         -*-     NFS server support for NFS version 3                                    │ │  
│ │         [*]       NFS server support for the NFSv3 ACL protocol extension               │ │  
│ │         [*]     NFS server support for NFS version 4                                    │ │  
│ │         [*]   NFSv4.1 server support for pNFS block layouts                             │ │  
│ │         [*]   NFSv4.1 server support for pNFS SCSI layouts                              │ │  
│ │         [ ]   NFSv4.1 server support for pNFS Flex File layouts                         │ │  
│ │         [ ]   NFSv4.2 inter server to server COPY                                       │ │  
│ │         [ ]   Provide Security Label support for NFSv4 server                           │ │  
│ │         < >   Secure RPC: Kerberos V mechanism                                          │ │  
│ │         [ ]   RPC: Enable dprintk debugging                                             │ │  

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: refcount underflow in nfsd41_destroy_cb
  2020-03-23 15:09       ` Jan Psota
@ 2020-03-24 13:50         ` Chuck Lever
  2020-03-31 17:59           ` Chuck Lever
  0 siblings, 1 reply; 11+ messages in thread
From: Chuck Lever @ 2020-03-24 13:50 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Linux NFS Mailing List, Jason A. Donenfeld, Jan Psota

Trond,

> On Mar 23, 2020, at 11:09 AM, Jan Psota <jasiu@belsznica.pl> wrote:
> 
>> I thought I read in the initial report that you were seeing this
>> problem only on v5.6-rc6. What is the earliest kernel release
>> where you saw refcount UaF warnings from nfsd4_destroy_cb?
>> 
> I didn't noticed that earlier too, because until connection breakage on
> WireGuard I did not have any problems related. But when you are asking,
> I found it in my Pentium G2020 system too since 5.5.4 kernel and 5.4.2
> looks not affected (I have logs since 01 Jan and fault begin to appear
> on Feb 21, when I switched from 5.4.2 to 5.5.4 kernel a day before)
> 
> $ journalctl | grep nfsd41_destroy_cb
> lut 21 01:07:58 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
> lut 27 01:01:12 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
> mar 03 00:59:01 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
> mar 03 23:03:02 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
> mar 11 11:52:42 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
> mar 13 01:12:02 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
> mar 14 14:31:39 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
> mar 15 20:56:56 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
> mar 17 15:58:32 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
> mar 22 15:24:03 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
> 
> I attach NFS part of my .config and a screen dump of menuconfig.
> <nfs.config><nfs-config.txt>

I'm wondering if

2bbfed98a4d8 ("nfsd: Fix races between nfsd4_cb_release() and nfsd4_shutdown_callback()")

or

12357f1b2c8e ("nfsd: minor 4.1 callback cleanup")

might be related to this issue (see down-thread for details and backtraces).

--
Chuck Lever




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: refcount underflow in nfsd41_destroy_cb
  2020-03-24 13:50         ` Chuck Lever
@ 2020-03-31 17:59           ` Chuck Lever
  0 siblings, 0 replies; 11+ messages in thread
From: Chuck Lever @ 2020-03-31 17:59 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Linux NFS Mailing List, Jason A. Donenfeld, Jan Psota

Trond, any thoughts?

> On Mar 24, 2020, at 9:50 AM, Chuck Lever <chuck.lever@oracle.com> wrote:
> 
> Trond,
> 
>> On Mar 23, 2020, at 11:09 AM, Jan Psota <jasiu@belsznica.pl> wrote:
>> 
>>> I thought I read in the initial report that you were seeing this
>>> problem only on v5.6-rc6. What is the earliest kernel release
>>> where you saw refcount UaF warnings from nfsd4_destroy_cb?
>>> 
>> I didn't noticed that earlier too, because until connection breakage on
>> WireGuard I did not have any problems related. But when you are asking,
>> I found it in my Pentium G2020 system too since 5.5.4 kernel and 5.4.2
>> looks not affected (I have logs since 01 Jan and fault begin to appear
>> on Feb 21, when I switched from 5.4.2 to 5.5.4 kernel a day before)
>> 
>> $ journalctl | grep nfsd41_destroy_cb
>> lut 21 01:07:58 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
>> lut 27 01:01:12 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
>> mar 03 00:59:01 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
>> mar 03 23:03:02 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
>> mar 11 11:52:42 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
>> mar 13 01:12:02 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
>> mar 14 14:31:39 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
>> mar 15 20:56:56 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
>> mar 17 15:58:32 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
>> mar 22 15:24:03 mordimer kernel:  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
>> 
>> I attach NFS part of my .config and a screen dump of menuconfig.
>> <nfs.config><nfs-config.txt>
> 
> I'm wondering if
> 
> 2bbfed98a4d8 ("nfsd: Fix races between nfsd4_cb_release() and nfsd4_shutdown_callback()")
> 
> or
> 
> 12357f1b2c8e ("nfsd: minor 4.1 callback cleanup")
> 
> might be related to this issue (see down-thread for details and backtraces).
> 
> --
> Chuck Lever

--
Chuck Lever




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: refcount underflow in nfsd41_destroy_cb
  2020-03-21 14:41 ` Jan Psota
  2020-03-21 17:25   ` Jan Psota
@ 2020-05-10 23:23   ` Jan Psota
  2020-05-15 22:01     ` J. Bruce Fields
  1 sibling, 1 reply; 11+ messages in thread
From: Jan Psota @ 2020-05-10 23:23 UTC (permalink / raw)
  To: linux-nfs; +Cc: Jason A. Donenfeld

[-- Attachment #1: Type: text/plain, Size: 837 bytes --]

Hi!
Today's nfsd trace in attachment. My NFS-root-ed client worked on
qemu-5.0.0 and tap interface (qemu has nothing to do with that effect
- on another server, physical machine client is connected through gigabit
ethernet). Error appeared on intensive NFS operations (Gentoo Linux on
client was installing Libre Office from binary package, moving files
from /var/tmp/portage to root filesystem).

Server /proc/version:
	Linux version 5.7.0-rc2 (root@mordimer)
	(gcc version 9.3.0 (Gentoo 9.3.0 p2), GNU ld (Gentoo 2.34 p1) 2.34.0)
	#1 SMP PREEMPT Tue Apr 21 21:52:13 CEST 2020

server /proc/cmdline:
	BOOT_IMAGE=/vmlinuz root=/dev/md0p4 panic=30 rw
	md=0,/dev/sda,/dev/sdb rootdelay=1 ipv6.disable=1

Client side errors were only some:
	nfs: server 192.168.2.1 not responding, still trying
	nfs: server 192.168.2.1 OK
but it is normal.

[-- Attachment #2: nfsd-trace-20200511.txt --]
[-- Type: text/plain, Size: 3560 bytes --]

[17957.413775] ------------[ cut here ]------------
[17957.413804] refcount_t: underflow; use-after-free.
[17957.413835] WARNING: CPU: 1 PID: 26127 at lib/refcount.c:28 refcount_warn_saturate+0xd8/0xe0
[17957.413872] Modules linked in: md5 nfsd auth_rpcgss nfs_acl lockd grace sunrpc fuse vhost_net vhost vhost_iotlb xt_mac xt_socket nf_socket_ipv4 xt_MASQUERADE xt_REDIRECT xt_TPROXY nf_tproxy_ipv4 xt_comment ipt_REJECT nf_reject_ipv4 xt_mark xt_multiport nfnetlink_log xt_NFLOG nf_log_ipv4 nf_log_common xt_LOG nf_nat_tftp nf_nat_sip nf_nat_h323 nf_nat_ftp nf_conntrack_tftp nf_conntrack_sip nf_conntrack_netlink nfnetlink nf_conntrack_h323 nf_conntrack_ftp tun bridge stp llc xt_tcpudp xt_conntrack iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_raw iptable_filter ip_tables x_tables rc_it913x_v1 it913x_fe hid_generic usbhid hid dvb_usb_it913x dvb_usb_v2 dvb_core videobuf2_vmalloc videobuf2_memops videobuf2_common rc_core btrfs blake2b_generic xor raid6_pq libcrc32c zlib_deflate zlib_inflate snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal kvm_intel kvm ppdev irqbypass i915 crct10dif_pclmul crc32_pclmul sr_mod intel_cstate iosf_mbi
[17957.413896]  intel_rapl_perf input_leds drm_kms_helper snd_hda_intel syscopyarea snd_intel_dspcfg sysfillrect pcspkr cdrom sysimgblt i2c_i801 fb_sys_fops intel_gtt r8169 snd_hda_codec i2c_algo_bit ehci_pci cfbfillrect cfbimgblt ehci_hcd realtek lpc_ich mei_me cfbcopyarea snd_hda_core usbcore libphy sky2 fb snd_hwdep mei mfd_core usb_common fbdev thermal parport_pc parport video sch_fq_codel snd_pcm_oss snd_mixer_oss snd_pcm snd_timer drm snd drm_panel_orientation_quirks soundcore agpgart backlight i2c_core coretemp hwmon
[17957.414516] CPU: 1 PID: 26127 Comm: kworker/u4:2 Not tainted 5.7.0-rc2 #1
[17957.414545] Hardware name: MSI MS-7788/H61M-P20 (G3) (MS-7788), BIOS V1.9 01/10/2013
[17957.414587] Workqueue: rpciod rpc_async_schedule [sunrpc]
[17957.414612] RIP: 0010:refcount_warn_saturate+0xd8/0xe0
[17957.414633] Code: ff 48 c7 c7 f0 7a d7 81 c6 05 d1 35 d7 00 01 e8 5b 9b d0 ff 0f 0b c3 48 c7 c7 98 7a d7 81 c6 05 bd 35 d7 00 01 e8 45 9b d0 ff <0f> 0b c3 0f 1f 44 00 00 8b 07 3d 00 00 00 c0 74 12 83 f8 01 74 46
[17957.414716] RSP: 0018:ffffc90000bafde0 EFLAGS: 00010282
[17957.414738] RAX: 0000000000000026 RBX: 0000000000000e81 RCX: 0000000000000007
[17957.414767] RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff888216718800
[17957.414798] RBP: ffff8881c5270c70 R08: 0000000000000338 R09: ffffc90010054024
[17957.414828] R10: 0000000000aaaaaa R11: 0000000000000000 R12: ffff88805b0a0510
[17957.414857] R13: ffff888011707b30 R14: 0000000000000001 R15: ffff888130c9d000
[17957.414889] FS:  0000000000000000(0000) GS:ffff888216700000(0000) knlGS:0000000000000000
[17957.414924] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[17957.414948] CR2: 00005600ba4586e8 CR3: 000000015a210004 CR4: 00000000001626e0
[17957.414978] Call Trace:
[17957.414996]  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
[17957.415020]  rpc_free_task+0x31/0x50 [sunrpc]
[17957.415042]  __rpc_execute+0x38f/0x3a0 [sunrpc]
[17957.415063]  ? pick_next_task_fair+0x295/0x2b0
[17957.415082]  ? finish_task_switch+0x70/0x230
[17957.415103]  rpc_async_schedule+0x24/0x40 [sunrpc]
[17957.416404]  process_one_work+0x1cd/0x3c0
[17957.417680]  worker_thread+0x45/0x3c0
[17957.418947]  kthread+0x10b/0x150
[17957.420195]  ? process_one_work+0x3c0/0x3c0
[17957.421420]  ? kthread_park+0x80/0x80
[17957.422620]  ret_from_fork+0x1f/0x30
[17957.423790] ---[ end trace d52e90aa624996f1 ]---

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: refcount underflow in nfsd41_destroy_cb
  2020-05-10 23:23   ` Jan Psota
@ 2020-05-15 22:01     ` J. Bruce Fields
  0 siblings, 0 replies; 11+ messages in thread
From: J. Bruce Fields @ 2020-05-15 22:01 UTC (permalink / raw)
  To: Jan Psota; +Cc: linux-nfs, Jason A. Donenfeld

On Mon, May 11, 2020 at 01:23:48AM +0200, Jan Psota wrote:
> Hi!
> Today's nfsd trace in attachment. My NFS-root-ed client worked on
> qemu-5.0.0 and tap interface (qemu has nothing to do with that effect
> - on another server, physical machine client is connected through gigabit
> ethernet). Error appeared on intensive NFS operations (Gentoo Linux on
> client was installing Libre Office from binary package, moving files
> from /var/tmp/portage to root filesystem).
> 
> Server /proc/version:
> 	Linux version 5.7.0-rc2 (root@mordimer)
> 	(gcc version 9.3.0 (Gentoo 9.3.0 p2), GNU ld (Gentoo 2.34 p1) 2.34.0)
> 	#1 SMP PREEMPT Tue Apr 21 21:52:13 CEST 2020
> 
> server /proc/cmdline:
> 	BOOT_IMAGE=/vmlinuz root=/dev/md0p4 panic=30 rw
> 	md=0,/dev/sda,/dev/sdb rootdelay=1 ipv6.disable=1
> 
> Client side errors were only some:
> 	nfs: server 192.168.2.1 not responding, still trying
> 	nfs: server 192.168.2.1 OK
> but it is normal.

> [17957.413775] ------------[ cut here ]------------
> [17957.413804] refcount_t: underflow; use-after-free.

Looks like this is in the !cb->cb_need restart case of nfsd4_cb_release,
which is ->release method of nfsd4_cb_ops.

So, an rpc task for a callback is being freed and decrementing
clp->cl_cb_inflight, and discovering that cl_cb_inflight is already less
than or equal to 0.  Either we've got unbalanced increments and
decrements of cl_cb_inflight, or clp has already been freed out from
under us.

I'm looking but can't quite see yet how that can happen.

--b.


> [17957.413835] WARNING: CPU: 1 PID: 26127 at lib/refcount.c:28 refcount_warn_saturate+0xd8/0xe0
> [17957.413872] Modules linked in: md5 nfsd auth_rpcgss nfs_acl lockd grace sunrpc fuse vhost_net vhost vhost_iotlb xt_mac xt_socket nf_socket_ipv4 xt_MASQUERADE xt_REDIRECT xt_TPROXY nf_tproxy_ipv4 xt_comment ipt_REJECT nf_reject_ipv4 xt_mark xt_multiport nfnetlink_log xt_NFLOG nf_log_ipv4 nf_log_common xt_LOG nf_nat_tftp nf_nat_sip nf_nat_h323 nf_nat_ftp nf_conntrack_tftp nf_conntrack_sip nf_conntrack_netlink nfnetlink nf_conntrack_h323 nf_conntrack_ftp tun bridge stp llc xt_tcpudp xt_conntrack iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_raw iptable_filter ip_tables x_tables rc_it913x_v1 it913x_fe hid_generic usbhid hid dvb_usb_it913x dvb_usb_v2 dvb_core videobuf2_vmalloc videobuf2_memops videobuf2_common rc_core btrfs blake2b_generic xor raid6_pq libcrc32c zlib_deflate zlib_inflate snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal kvm_intel kvm ppdev irqbypass i915 crct10dif_pclmul crc32_pclmul sr_mod intel_cstate iosf_mbi
> [17957.413896]  intel_rapl_perf input_leds drm_kms_helper snd_hda_intel syscopyarea snd_intel_dspcfg sysfillrect pcspkr cdrom sysimgblt i2c_i801 fb_sys_fops intel_gtt r8169 snd_hda_codec i2c_algo_bit ehci_pci cfbfillrect cfbimgblt ehci_hcd realtek lpc_ich mei_me cfbcopyarea snd_hda_core usbcore libphy sky2 fb snd_hwdep mei mfd_core usb_common fbdev thermal parport_pc parport video sch_fq_codel snd_pcm_oss snd_mixer_oss snd_pcm snd_timer drm snd drm_panel_orientation_quirks soundcore agpgart backlight i2c_core coretemp hwmon
> [17957.414516] CPU: 1 PID: 26127 Comm: kworker/u4:2 Not tainted 5.7.0-rc2 #1
> [17957.414545] Hardware name: MSI MS-7788/H61M-P20 (G3) (MS-7788), BIOS V1.9 01/10/2013
> [17957.414587] Workqueue: rpciod rpc_async_schedule [sunrpc]
> [17957.414612] RIP: 0010:refcount_warn_saturate+0xd8/0xe0
> [17957.414633] Code: ff 48 c7 c7 f0 7a d7 81 c6 05 d1 35 d7 00 01 e8 5b 9b d0 ff 0f 0b c3 48 c7 c7 98 7a d7 81 c6 05 bd 35 d7 00 01 e8 45 9b d0 ff <0f> 0b c3 0f 1f 44 00 00 8b 07 3d 00 00 00 c0 74 12 83 f8 01 74 46
> [17957.414716] RSP: 0018:ffffc90000bafde0 EFLAGS: 00010282
> [17957.414738] RAX: 0000000000000026 RBX: 0000000000000e81 RCX: 0000000000000007
> [17957.414767] RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff888216718800
> [17957.414798] RBP: ffff8881c5270c70 R08: 0000000000000338 R09: ffffc90010054024
> [17957.414828] R10: 0000000000aaaaaa R11: 0000000000000000 R12: ffff88805b0a0510
> [17957.414857] R13: ffff888011707b30 R14: 0000000000000001 R15: ffff888130c9d000
> [17957.414889] FS:  0000000000000000(0000) GS:ffff888216700000(0000) knlGS:0000000000000000
> [17957.414924] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [17957.414948] CR2: 00005600ba4586e8 CR3: 000000015a210004 CR4: 00000000001626e0
> [17957.414978] Call Trace:
> [17957.414996]  nfsd41_destroy_cb+0x2c/0x40 [nfsd]
> [17957.415020]  rpc_free_task+0x31/0x50 [sunrpc]
> [17957.415042]  __rpc_execute+0x38f/0x3a0 [sunrpc]
> [17957.415063]  ? pick_next_task_fair+0x295/0x2b0
> [17957.415082]  ? finish_task_switch+0x70/0x230
> [17957.415103]  rpc_async_schedule+0x24/0x40 [sunrpc]
> [17957.416404]  process_one_work+0x1cd/0x3c0
> [17957.417680]  worker_thread+0x45/0x3c0
> [17957.418947]  kthread+0x10b/0x150
> [17957.420195]  ? process_one_work+0x3c0/0x3c0
> [17957.421420]  ? kthread_park+0x80/0x80
> [17957.422620]  ret_from_fork+0x1f/0x30
> [17957.423790] ---[ end trace d52e90aa624996f1 ]---


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2020-05-15 22:01 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-21  2:51 refcount underflow in nfsd41_destroy_cb Jason A. Donenfeld
2020-03-21 14:41 ` Jan Psota
2020-03-21 17:25   ` Jan Psota
2020-05-10 23:23   ` Jan Psota
2020-05-15 22:01     ` J. Bruce Fields
2020-03-21 19:50 ` Chuck Lever
2020-03-22  3:43   ` Jan Psota
2020-03-23 14:13     ` Chuck Lever
2020-03-23 15:09       ` Jan Psota
2020-03-24 13:50         ` Chuck Lever
2020-03-31 17:59           ` Chuck Lever

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).