linux-nfs.vger.kernel.org archive mirror
* nfs kernel panic - unable to handle kernel paging request,   cgroup_sk_free
@ 2019-07-01 20:35 Jason Alavaliant
  0 siblings, 0 replies; only message in thread
From: Jason Alavaliant @ 2019-07-01 20:35 UTC (permalink / raw)
  To: linux-nfs

Hi,

I'm hoping somebody will be kind enough to confirm whether the kernel panic below is a known issue or not.
I've been searching the list history and I see several reports mentioning bpf, cgroups and memory corruption, but nothing with a panic that exactly matches the one I'm seeing.

What I have been able to confirm is:
* occurs on both kernel 4.14 and 4.19 (haven't tested beyond that - I'd suspect more kernel versions are affected)
* only occurs if I'm sharing a volume using nfs-kernel-server and a client is accessing it (doesn't occur with nfs-kernel-server version 1.2.8 on Ubuntu 16.04, only with 1.3.4, which I use on Ubuntu 18.04)
* does not occur if I switch machines to the ganesha NFS server (replacing nfs-kernel-server).

Beyond that, despite having seen several hundred of these kernel panics on machines at the company I'm at, I've not been able to isolate a reproducible way to trigger the panic on demand.

[38258.193887] BUG: unable to handle kernel paging request at ffff888c8be54980
[38258.193906] IP: cgroup_sk_free+0x3a/0x80
[38258.193909] PGD 25c8067 P4D 25c8067 PUD 0
[38258.193916] Oops: 0002 [#1] PREEMPT SMP PTI
[38258.193920] Modules linked in: tcp_diag inet_diag iptable_filter nvidia_uvm(POE) nfnetlink_queue nfnetlink_log nfnetlink bluetooth ecdh_generic nfsv3 vtsspp(OE) sep4_1(OE) socperf2_0(OE) pax(OE) talpa_vfshook(OE) talpa_pedconnector(OE) talp
a_pedevice(OE) talpa_vcdevice(OE) talpa_core(OE) talpa_linux(OE) talpa_syscallhook(OE) xwbios(OE) binfmt_misc snd_hda_codec_hdmi nvidia_drm(POE) nvidia_modeset(POE) coretemp kvm_intel kvm nvidia(POE) gpio_ich iTCO_wdt irqbypass hp_wmi iTCO_ven
dor_support wmi_bmof sparse_keymap snd_hda_codec_realtek snd_hda_codec_generic ghash_clmulni_intel pcbc snd_hda_intel snd_hda_codec aesni_intel snd_hda_core aes_x86_64 snd_hwdep crypto_simd snd_pcm glue_helper cryptd drm_kms_helper snd_seq_mid
i snd_seq_midi_event serio_raw drm snd_rawmidi snd_seq lpc_ich ipmi_devintf snd_seq_device
[38258.193985]  ipmi_msghandler snd_timer fb_sys_fops syscopyarea sysfillrect sysimgblt snd soundcore shpchp wmi mac_hid sch_fq_codel taniwha(OE) nfs nfsd fscache nfs_acl lockd parport_pc grace ppdev auth_rpcgss lp sunrpc parport ip_tables x_t
ables autofs4 hid_generic mptsas psmouse mptscsih mptbase firewire_ohci ahci tg3 firewire_core scsi_transport_sas libahci crc_itu_t ptp pps_core floppy
[38258.194016] CPU: 4 PID: 0 Comm: swapper/4 Tainted: P          IOE   4.14.103-weta-20190225 #1
[38258.194018] Hardware name: Hewlett-Packard HP Z800 Workstation/0AECh, BIOS 786G5 v03.61 03/05/2018
[38258.194020] task: ffff888c141d8000 task.stack: ffffc900062d4000
[38258.194023] RIP: 0010:cgroup_sk_free+0x3a/0x80
[38258.194025] RSP: 0018:ffff888c2f603ab0 EFLAGS: 00010246
[38258.194027] RAX: 000000005c854980 RBX: ffff8886128ccc00 RCX: 0000000000000000
[38258.194029] RDX: 0000000000000000 RSI: 0000000000010080 RDI: 0000000000000001
[38258.194031] RBP: ffff888c2f603ab8 R08: 0000000000000001 R09: ffffffff816d457f
[38258.194033] R10: ffff888c2f603b10 R11: 0000000000000000 R12: ffff888bb258f800
[38258.194035] R13: 0000000000000000 R14: ffff888bb2662c24 R15: ffff888bb258f800
[38258.194038] FS:  0000000000000000(0000) GS:ffff888c2f600000(0000) knlGS:0000000000000000
[38258.194040] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[38258.194042] CR2: ffff888c8be54980 CR3: 000000000220a003 CR4: 00000000000206e0
[38258.194043] Call Trace:
[38258.194046]  <IRQ>
[38258.194054]  __sk_destruct+0xf5/0x160
[38258.194057]  sk_destruct+0x20/0x30
[38258.194059]  __sk_free+0x1b/0xa0
[38258.194061]  sk_free+0x1f/0x30
[38258.194066]  sock_put+0x1a/0x20
[38258.194069]  tcp_v4_rcv+0x9a5/0xa80
[38258.194074]  ? ___bpf_prog_run+0x410/0x11f0
[38258.194080]  ip_local_deliver_finish+0x6e/0x250
[38258.194082]  ip_local_deliver+0xe5/0xf0
[38258.194085]  ? ip_rcv_finish+0x430/0x430
[38258.194088]  ip_rcv_finish+0xe7/0x430
[38258.194091]  ip_rcv+0x28f/0x3d0
[38258.194096]  ? packet_rcv+0x44/0x430
[38258.194100]  __netif_receive_skb_core+0x402/0xb90
[38258.194105]  ? tcp4_gro_receive+0x137/0x1a0
[38258.194108]  __netif_receive_skb+0x18/0x60
[38258.194110]  ? __netif_receive_skb+0x18/0x60
[38258.194113]  netif_receive_skb_internal+0x31/0x110
[38258.194115]  napi_gro_receive+0xe5/0x110
[38258.194122]  tg3_poll_work+0x817/0xec0 [tg3]
[38258.194127]  tg3_poll+0x6b/0x390 [tg3]
[38258.194130]  net_rx_action+0x139/0x3a0
[38258.194136]  __do_softirq+0xe9/0x2d7
[38258.194142]  irq_exit+0x99/0xa0
[38258.194144]  do_IRQ+0xa6/0x100
[38258.194148]  common_interrupt+0x81/0x81
[38258.194149]  </IRQ>
[38258.194153] RIP: 0010:cpuidle_enter_state+0xa5/0x310
[38258.194155] RSP: 0018:ffffc900062d7e88 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff9c
[38258.194158] RAX: ffff888c2f600000 RBX: 000000000001df10 RCX: 000000000000001f
[38258.194160] RDX: 000022cbae0dded5 RSI: ffffffff82055221 RDI: ffffffff82082b25
[38258.194162] RBP: ffffc900062d7ec8 R08: 0000000000000004 R09: 0000000000020c80
[38258.194163] R10: ffffc900062d7e60 R11: 0000539304a379fa R12: ffff888c2f62a000
[38258.194165] R13: 0000000000000004 R14: ffffffff822c3f18 R15: 0000000000000000
[38258.194169]  ? cpuidle_enter_state+0x97/0x310
[38258.194172]  cpuidle_enter+0x17/0x20
[38258.194176]  call_cpuidle+0x23/0x40
[38258.194178]  do_idle+0x18f/0x1e0
[38258.194181]  cpu_startup_entry+0x1d/0x20
[38258.194185]  start_secondary+0x143/0x160
[38258.194188]  secondary_startup_64+0xa5/0xb0
[38258.194191] Code: c3 70 4c 53 82 a8 01 75 07 48 85 c0 48 0f 45 d8 f6 43 6c 01 74 03 5b 5d c3 bf 01 00 00 00 e8 ee 4f f8 ff 48 8b 43 18 a8 03 75 20 <65> 48 ff 08 bf 01 00 00 00 e8 48 4b f8 ff 65 8b 05 81 53 f0 7e
[38258.194225] RIP: cgroup_sk_free+0x3a/0x80 RSP: ffff888c2f603ab0
[38258.194226] CR2: ffff888c8be54980
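For what it's worth, when triaging these I first pull the faulting symbol and offset out of the saved console log before resolving it against the matching vmlinux with the kernel tree's scripts/decode_stacktrace.sh. A minimal sketch, assuming the oops has been saved as plain text (the file name oops.txt is hypothetical; the line format matches the "RIP: 0010:cgroup_sk_free+0x3a/0x80" line in the trace above):

```shell
# Save the relevant RIP line from the console log (in practice the whole
# oops would be captured, e.g. via netconsole or journalctl -k).
cat > oops.txt <<'EOF'
[38258.194023] RIP: 0010:cgroup_sk_free+0x3a/0x80
EOF

# Extract "symbol+offset/length" from the RIP line; this is the value
# you would feed to faddr2line or look up in the disassembly.
sed -n 's/.*RIP: [0-9a-f]*:\([A-Za-z0-9_.]*+0x[0-9a-f]*\/0x[0-9a-f]*\).*/\1/p' oops.txt
```

This prints cgroup_sk_free+0x3a/0x80, which scripts/faddr2line (also in the kernel source tree) can map to a source line given the vmlinux for 4.14.103.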


Thanks
Jason

