netdev.vger.kernel.org archive mirror
* kernel BUG at net/core/skbuff.c:109!
@ 2021-02-03 11:55 Tj (Elloe Linux)
  2021-02-06  3:07 ` Jakub Kicinski
  0 siblings, 1 reply; 2+ messages in thread
From: Tj (Elloe Linux) @ 2021-02-03 11:55 UTC
  To: netdev; +Cc: Callum O'Connor

On a recent build (5.10.0) we've seen several hard-to-pinpoint complete
lock-ups requiring power-off restarts.

Today we found a small clue in the kernel log. Unfortunately the
complete backtrace wasn't captured (presumably the system froze before
the log could be flushed), but I thought I should share it for
investigation.

kernel BUG at net/core/skbuff.c:109!

kernel: skbuff: skb_under_panic: text:ffffffffc103c622 len:1228 put:48
head:ffffa00202858000 data:ffffa00202857ff2 tail:0x4be end:0x6c0 dev:wlp4s0
kernel: ------------[ cut here ]------------
kernel: kernel BUG at net/core/skbuff.c:109!

Obviously this ought not to happen and we'd like to discover the cause.

Whilst writing this report it happened again. Checking the logs we see
three instances of the BUG, none of which captured a stack trace:

Jan 27
Feb 03 #1
Feb 03 #2

The only slight clue may be a k3s service that we were unaware was
constantly restarting and had reached 26,636 iterations just before the
Feb 03 #1 BUG. However, we removed k3s immediately after and there were
no similar clues 20 minutes later for the Feb 03 #2 BUG.

Feb 03 11:11:13 elloe001 k3s[1209978]:
time="2021-02-03T11:11:13.452745479Z" level=fatal msg="starting
kubernetes: preparing server: start cluster and https:
listen tcp 10.1.2.1:6443: bind: cannot assign requested address"
Feb 03 11:11:13 elloe001 systemd[1]: k3s-main.service: Main process
exited, code=exited, status=1/FAILURE
Feb 03 11:11:13 elloe001 systemd[1]: k3s-main.service: Failed with
result 'exit-code'.
Feb 03 11:11:13 elloe001 systemd[1]: Failed to start Lightweight Kubernetes.
Feb 03 11:11:18 elloe001 systemd[1]: k3s-dev.service: Scheduled restart
job, restart counter is at 26636.
Feb 03 11:11:18 elloe001 systemd[1]: k3s-main.service: Scheduled restart
job, restart counter is at 26636.
Feb 03 11:11:18 elloe001 systemd[1]: Stopped Lightweight Kubernetes.
Feb 03 11:11:18 elloe001 systemd[1]: Starting Lightweight Kubernetes...
Feb 03 11:11:18 elloe001 systemd[1]: Stopped Lightweight Kubernetes.
Feb 03 11:11:18 elloe001 systemd[1]: Starting Lightweight Kubernetes...

We don't think this is hardware related as we have several identical
Lenovo E495 laptops and they have never suffered this.

We don't know of any way to reproduce it at will.

* Re: kernel BUG at net/core/skbuff.c:109!
From: Jakub Kicinski @ 2021-02-06  3:07 UTC
  To: Tj (Elloe Linux); +Cc: netdev, Callum O'Connor

On Wed, 3 Feb 2021 11:55:25 +0000 Tj (Elloe Linux) wrote:
> On a recent build (5.10.0) we've seen several hard-to-pinpoint complete
> lock-ups requiring power-off restarts.
> 
> Today we found a small clue in the kernel log. Unfortunately the
> complete backtrace wasn't captured (presumably the system froze before
> the log could be flushed), but I thought I should share it for
> investigation.
> 
> kernel BUG at net/core/skbuff.c:109!
> 
> kernel: skbuff: skb_under_panic: text:ffffffffc103c622 len:1228 put:48

text:ffffffffc103c622

That's a return address, IOW address of the caller, if I'm reading the
code right. Any chance you could decode that?
./scripts/decode_stacktrace.sh is your friend.

> head:ffffa00202858000 data:ffffa00202857ff2 tail:0x4be end:0x6c0 dev:wlp4s0

dev:wlp4s0

Can you tell us what driver drives this device?
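
A minimal sketch of one way to answer that from userspace, assuming the
interface is backed by a physical device: on Linux the bound driver is
normally exposed as the symlink /sys/class/net/<dev>/device/driver. The
function name driver_for and the sysfs_root parameter are illustrative
and not from this thread.

```python
import os


def driver_for(dev, sysfs_root="/sys"):
    """Return the name of the driver bound to network device `dev`.

    Assumption: the driver is exposed as the symlink
    <sysfs_root>/class/net/<dev>/device/driver. Virtual devices
    (e.g. lo) have no such link, in which case None is returned.
    """
    link = os.path.join(sysfs_root, "class", "net", dev, "device", "driver")
    if not os.path.islink(link):
        return None
    # The link target is the driver's sysfs directory; its last path
    # component is the driver name.
    return os.path.basename(os.path.realpath(link))
```

Calling driver_for("wlp4s0") on the affected machine should return the
driver name, or None if the link is absent.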

> kernel: ------------[ cut here ]------------
> kernel: kernel BUG at net/core/skbuff.c:109!
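
For reference, the numbers in the skb_under_panic line can be checked by
hand. The sketch below rests on an assumption not confirmed in this
thread, based on a reading of skb_push() in net/core/skbuff.c: put: is
the length passed to skb_push() and data: is the pointer after the push.
Under that assumption, head - data is how far the push ran past the
start of the buffer. The regex and the name headroom_report are
illustrative.

```python
import re

# Matches the fields of an skb_under_panic / skb_over_panic log line.
# \s+ before "head:" tolerates the line wrap seen in the report.
PANIC_RE = re.compile(
    r"skb_(?:under|over)_panic: text:(?P<text>[0-9a-f]+) "
    r"len:(?P<len>\d+) put:(?P<put>\d+)\s+"
    r"head:(?P<head>[0-9a-f]+) data:(?P<data>[0-9a-f]+) "
    r"tail:(?P<tail>0x[0-9a-f]+) end:(?P<end>0x[0-9a-f]+) dev:(?P<dev>\S+)"
)


def headroom_report(line):
    """Compute how far `data` ended up below `head` in a panic line."""
    m = PANIC_RE.search(line)
    if m is None:
        raise ValueError("not an skb panic line")
    head = int(m["head"], 16)
    data = int(m["data"], 16)
    put = int(m["put"])
    shortfall = head - data  # bytes by which data is below head
    return {
        "dev": m["dev"],
        "put": put,
        "shortfall": shortfall,
        # Assumption: data is the post-push pointer, so the headroom
        # available before the push was put - shortfall.
        "headroom_before": put - shortfall,
    }
```

On the line reported above this gives a shortfall of 14 bytes, which
under the stated assumption means a 48-byte push into 34 bytes of
headroom.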
