linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Guenter Roeck <linux@roeck-us.net>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Stephen Boyd <sboyd@kernel.org>
Subject: Re: [RFC][PATCH v2 00/31] timers: Use del_timer_shutdown() before freeing timers
Date: Thu, 27 Oct 2022 15:24:04 -0700	[thread overview]
Message-ID: <20221027222404.GA3123386@roeck-us.net> (raw)
In-Reply-To: <20221027150525.753064657@goodmis.org>

On Thu, Oct 27, 2022 at 11:05:25AM -0400, Steven Rostedt wrote:
> 
> Back in April, I posted an RFC patch set to help mitigate a common issue
> where a timer gets armed just before it is freed, and when the timer
> goes off, it crashes in the timer code without any evidence of who the
> culprit was. I got side tracked and never finished up on that patch set.
> Since this type of crash is still our #1 crash we are seeing in the field,
> it has become a priority again to finish it.
> 
> This is v2 of that patch set. Thomas Gleixner posted an untested version
> that makes timer->function NULL as the flag that it is shutdown. I took that
> code, tested it (fixed it up), added more comments, and changed the
> name to del_timer_shutdown() as Linus had asked. I also converted it to use
> WARN_ON_ONCE() instead of just WARN_ON() as Linus asked for that too.
> 

Here are various warnings and crashes. Complete logs are at

https://kerneltests.org/builders

in the "testing" column of the qemu test results.

This is with the published patch set plus the fixups in 
timer_fixup_init() and timer_fixup_free().

Guenter

---
Block code:

WARNING: CPU: 0 PID: 8 at lib/debugobjects.c:502 debug_print_object+0x66/0x7a
ODEBUG: free active (active state 0) object type: timer_list hint: blk_rq_timed_out_timer+0x0/0xc

---
In tcp:

WARNING: CPU: 0 PID: 255 at lib/debugobjects.c:502 debug_print_object+0xa4/0xd8
ODEBUG: free active (active state 0) object type: timer_list hint: tcp_keepalive_timer+0x0/0x3a8

WARNING: CPU: 0 PID: 255 at lib/debugobjects.c:502 debug_print_object+0xa4/0xd8
ODEBUG: free active (active state 0) object type: timer_list hint: tcp_write_timer+0x0/0x1a8

---
Other networking:

WARNING: CPU: 0 PID: 8 at lib/debugobjects.c:502 debug_print_object+0xa4/0xd8
ODEBUG: free active (active state 0) object type: timer_list hint: neigh_timer_handler+0x0/0x468

---

WARNING: CPU: 0 PID: 280 at lib/debugobjects.c:502 debug_print_object+0xa4/0xd8
ODEBUG: init active (active state 0) object type: timer_list hint: tulip_timer+0x0/0x38

That one is often followed by:

[   29.833216] ODEBUG: init active (active state 0) object type: timer_list hint: 0x0
[   29.833723] WARNING: CPU: 0 PID: 365 at lib/debugobjects.c:502 debug_print_object+0xb8/0x100

in the same code line, suggesting that the timer may be shut down
more than once (?). Then there is another crash with

[   29.863890] Insufficient stack space to handle exception!
[   29.863939] ESR: 0x0000000096000047 -- DABT (current EL)
[   29.863952] FAR: 0xffff800008b17f80
[   29.863959] Task stack:     [0xffff800008b18000..0xffff800008b1c000]
[   29.863967] IRQ stack:      [0xffff800008000000..0xffff800008004000]
[   29.863975] Overflow stack: [0xffff30a35fe7a0f0..0xffff30a35fe7b0f0]
[   29.863987] CPU: 0 PID: 365 Comm: ip Tainted: G        W        N 6.1.0-rc2-00138-gced58c742836 #1
[   29.863999] Hardware name: linux,dummy-virt (DT)

followed by a sequence of

[   29.864572]  __try_to_del_timer_sync+0x40/0xdc
[   29.864582]  __del_timer_sync+0xa4/0x100
[   29.864590]  timer_fixup_init+0x2c/0x5c
[   29.864599]  __debug_object_init+0x248/0x53c
[   29.864606]  debug_object_init+0x24/0x30
[   29.864614]  timer_fixup_init+0x40/0x5c
[   29.864622]  __debug_object_init+0x248/0x53c
[   29.864630]  debug_object_init+0x24/0x30
[   29.864637]  timer_fixup_init+0x40/0x5c
[   29.864645]  __debug_object_init+0x248/0x53c
[   29.864658]  debug_object_init+0x24/0x30
[   29.864666]  timer_fixup_init+0x40/0x5c
...
[   29.866492]  debug_object_init+0x24/0x30
[   29.866500]  init_timer_key+0x7c/0x80
[   29.866508]  tulip_down+0x1d0/0x24c
[   29.866518]  tulip_close+0x3c/0xc0

This crash is seen with variants on several platforms/architectures.

---

WARNING: CPU: 0 PID: 8 at lib/debugobjects.c:502 debug_print_object+0xa4/0xd8
ODEBUG: free active (active state 0) object type: timer_list hint: addrconf_dad_work+0x0/0x628

---
mips64:

WARNING: CPU: 0 PID: 280 at lib/debugobjects.c:502 debug_print_object+0xa4/0xd8
ODEBUG: init active (active state 0) object type: timer_list hint: 0x0
Modules linked in:
CPU: 0 PID: 280 Comm: ip Tainted: G        W        N 6.1.0-rc2-00138-gced58c742836 #1
Stack : 0000000000000056 ffffffffffffffff 0000000000000008 00b65275abba8126
        00b65275abba8126 0000000000000000 9000000004193728 ffffffff80f29f38
        ffffffff8105c9d8 ffffffffffff8880 9000000004193668 0000000000000000
        0000000000000005 0000000000000010 ffffffff80c731a0 ffffffff81030000
        2020205720202020 ffffffff81030000 0000000000000000 ffffffff80f29f38
        0000000000000009 00000000000001f6 ffffffff8063d45c ffffffff81cb94c0
        ffffffff81d76240 0000000000000000 ffffffff807575a8 0000000000000000
        ffffffff81230000 9000000004190000 9000000004193720 9000000018002030
        ffffffff80c8f528 0000000000000000 0000000000000000 00b65275abba8126
        ffffffff8122e050 000000001000a4e1 ffffffff8010a884 00b65275abba8126
        ...
Call Trace:
[<ffffffff8010a884>] show_stack+0x3c/0x120
[<ffffffff80c8f528>] dump_stack_lvl+0x4c/0x90
[<ffffffff80130e54>] __warn+0xdc/0x1c8
[<ffffffff80c75a08>] warn_slowpath_fmt+0x98/0xc4
[<ffffffff8063d45c>] debug_print_object+0xa4/0xd8
[<ffffffff8063e590>] __debug_object_init+0x2d0/0x670
[<ffffffff801b44b8>] timer_fixup_init+0x40/0x58

irq event stamp: 1537
hardirqs last  enabled at (1549): [<ffffffff8018e67c>] __up_console_sem+0x9c/0xc8
hardirqs last disabled at (1560): [<ffffffff8018e658>] __up_console_sem+0x78/0xc8
softirqs last  enabled at (1324): [<ffffffff80a572bc>] dev_deactivate_many+0x32c/0x458
softirqs last disabled at (1322): [<ffffffff80a57288>] dev_deactivate_many+0x2f8/0x458
---[ end trace 0000000000000000 ]---

=====================================
WARNING: bad unlock balance detected!
6.1.0-rc2-00138-gced58c742836 #1 Tainted: G        W        N
-------------------------------------
�������␈␂/-1 is trying to release lock (&obj_hash[i].lock) at:
[<ffffffff801b41e0>] __try_to_del_timer_sync+0x48/0x140
but there are no more locks to release!

other info that might help us debug this:
qemu-system-mips64: terminating on signal 15 from pid 2525787 (/bin/bash)

---
Openrisc fails completely. Lots of the following, then boot stalls.

WARNING: CPU: 0 PID: 88 at lib/debugobjects.c:502 debug_print_object+0xc0/0xe8
ODEBUG: init active (active state 0) object type: timer_list hint: 0x0
Modules linked in:
CPU: 0 PID: 88 Comm: udhcpc Not tainted 6.1.0-rc2-00138-gced58c742836 #1
Call trace:
[<8eaa3f51>] dump_stack_lvl+0x44/0x80
[<4da6c5ef>] dump_stack+0x1c/0x2c
[<1b9f58b7>] __warn+0xdc/0x118
[<86d4d066>] ? debug_print_object+0xc0/0xe8
[<689beae8>] warn_slowpath_fmt+0x78/0x90
[<86d4d066>] debug_print_object+0xc0/0xe8
[<872f4074>] __debug_object_init+0x2bc/0x7f4
[<c30d6214>] ? _raw_spin_unlock_irqrestore+0x50/0x84
[<c3f3be91>] ? debug_check_no_locks_freed+0xb8/0x194
[<558675c1>] ? slob_alloc+0xe8/0x350
[<26225550>] ? lockdep_init_map_type+0x68/0x38c
[<c3f3be91>] ? debug_check_no_locks_freed+0xb8/0x194
[<2c9cf7b5>] ? inet_create+0x2e8/0x404
[<bab055df>] debug_object_init+0x30/0x40
[<6d2e468f>] init_timer_key+0xb4/0x110
[<60c0d3e7>] ? sk_init_common+0x1a0/0x1c0
[<b52686c2>] sock_init_data+0x60/0x2a4
[<1ff904c3>] ? sk_alloc+0xe8/0x138
[<ca3e6ef9>] inet_create+0x1e0/0x404
[<623d726a>] ? inet_create+0x80/0x404
[<a3256728>] ? lock_release+0x1c0/0x30c
[<960a454e>] __sock_create+0x140/0x288
[<0290cd64>] ? __sock_create+0x98/0x288
[<b62bc649>] __sys_socket+0x7c/0x128
[<fa003224>] ? do_work_pending+0x4c/0x118
[<705147e9>] sys_socket+0x14/0x24
[<9c4e015d>] ? _syscall_return+0x0/0x4
---[ end trace 0000000000000000 ]---

---
parisc crashes.

[    3.015186] ------------[ cut here ]------------
[    3.015580] ODEBUG: init active (active state 0) object type: timer_list hint: timeout_waiting_on_port+0x0/0x2c
[    3.016945] WARNING: CPU: 0 PID: 1 at lib/debugobjects.c:502 debug_print_object+0x98/0xc8
...
[    3.026117] ODEBUG: init active (active state 0) object type: timer_list hint: 0x0
[    3.026355] WARNING: CPU: 0 PID: 1 at lib/debugobjects.c:502 debug_print_object+0x98/0xc8
...
[ many more of the same ]

[    3.054542] stackcheck: swapper/0 will most likely overflow irq stack (sp:11e51800, stk bottom-top:11460004-11468004)
[    3.054592] Kernel panic - not syncing: low stack detected by irq handler - check messages

and boom.

---
usb:

[   23.993737] WARNING: CPU: 0 PID: 343 at lib/debugobjects.c:502 debug_print_object+0xac/0xc8
[   23.993953] ODEBUG: free active (active state 0) object type: timer_list hint: hub_init_func2+0x0/0xc

---
ppc:

[    6.607478][    T1] ODEBUG: init active (active state 0) object type: timer_list hint: .ibmvscsi_timeout+0x0/0x58
[    6.608536][    T1] WARNING: CPU: 0 PID: 1 at lib/debugobjects.c:502 .debug_print_object+0xbc/0xf0

followed by several of

[    6.623661][    T1] ODEBUG: init active (active state 0) object type: timer_list hint: 0x0
[    6.624155][    T1] WARNING: CPU: 0 PID: 1 at lib/debugobjects.c:502 .debug_print_object+0xbc/0xf0

and finally:

[    6.694785][    C0] BUG: Kernel NULL pointer dereference on read at 0x000000b8
[    6.694796][    C0] Faulting instruction address: 0xc0000000001ece44
[    6.694807][    C0] Thread overran stack, or stack corrupted
[    6.694967][    C0] Oops: Kernel access of bad area, sig: 7 [#1]


  parent reply	other threads:[~2022-10-27 22:24 UTC|newest]

Thread overview: 109+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-27 15:05 [RFC][PATCH v2 00/31] timers: Use del_timer_shutdown() before freeing timers Steven Rostedt
2022-10-27 15:05 ` [RFC][PATCH v2 01/31] timers: Add del_timer_shutdown() to be called " Steven Rostedt
2022-10-27 15:05 ` [RFC][PATCH v2 02/31] timers: s390/cmm: Use del_timer_shutdown() before freeing timer Steven Rostedt
2022-10-27 15:05 ` [RFC][PATCH v2 03/31] timers: sh: " Steven Rostedt
2022-10-27 15:05 ` [RFC][PATCH v2 05/31] timers: ACPI: " Steven Rostedt
2022-10-28 16:56   ` Rafael J. Wysocki
2022-11-01  1:11   ` Jarkko Sakkinen
2022-10-27 15:05 ` [RFC][PATCH v2 06/31] timers: atm: " Steven Rostedt
2022-10-27 15:05 ` [RFC][PATCH v2 07/31] timers: PM: Use del_timer_shutdown() Steven Rostedt
2022-10-28 17:45   ` Rafael J. Wysocki
2022-10-27 15:05 ` [RFC][PATCH v2 08/31] timers: Bluetooth: Use del_timer_shutdown() before freeing timer Steven Rostedt
2022-10-29  0:12   ` Luiz Augusto von Dentz
2022-10-29  0:33     ` Steven Rostedt
2022-10-27 15:05 ` [RFC][PATCH v2 09/31] timers: hangcheck: " Steven Rostedt
2022-10-27 15:05 ` [RFC][PATCH v2 10/31] timers: ipmi: " Steven Rostedt
2022-10-27 15:20   ` Corey Minyard
2022-10-27 15:22     ` Corey Minyard
2022-10-27 15:31       ` Steven Rostedt
2022-10-27 15:05 ` [RFC][PATCH v2 11/31] timers: random: " Steven Rostedt
2022-10-27 15:55   ` Jason A. Donenfeld
2022-10-27 15:05 ` [RFC][PATCH v2 14/31] timers: HID: " Steven Rostedt
2022-10-27 15:05 ` [RFC][PATCH v2 16/31] timers: mISDN: " Steven Rostedt
2022-10-27 15:05 ` [RFC][PATCH v2 17/31] timers: leds: " Steven Rostedt
2022-10-27 15:05 ` [RFC][PATCH v2 19/31] timers: net: " Steven Rostedt
2022-10-27 19:55   ` Steven Rostedt
2022-10-27 20:15     ` Linus Torvalds
2022-10-27 20:34       ` Steven Rostedt
2022-10-27 20:48         ` Linus Torvalds
2022-10-27 21:07           ` Steven Rostedt
2022-10-27 21:15             ` Steven Rostedt
2022-10-27 22:35             ` Steven Rostedt
2022-10-28 22:31               ` Steven Rostedt
2022-10-28 22:46                 ` Jakub Kicinski
2022-10-30 17:22                   ` Paolo Abeni
2022-11-03 21:51                     ` Steven Rostedt
2022-11-04  0:00                       ` Eric Dumazet
2022-11-04  5:51                         ` Steven Rostedt
2022-11-04 16:14                           ` Guenter Roeck
2022-10-27 21:07         ` Steven Rostedt
2022-10-28 15:16           ` Guenter Roeck
2022-10-27 15:05 ` [RFC][PATCH v2 20/31] timers: usb: " Steven Rostedt
2022-10-27 20:38   ` Alan Stern
2022-10-27 20:42     ` Steven Rostedt
2022-10-27 21:22       ` Steven Rostedt
2022-10-28  5:23   ` Guenter Roeck
2022-10-28 10:14     ` Steven Rostedt
2022-10-28 14:00       ` Steven Rostedt
2022-10-28 18:01     ` Steven Rostedt
2022-10-28 18:10       ` Steven Rostedt
2022-10-28 19:59         ` Guenter Roeck
2022-10-28 20:40           ` Steven Rostedt
2022-10-28 23:25             ` Guenter Roeck
2022-10-28 23:29               ` Steven Rostedt
2022-10-29 14:52           ` Guenter Roeck
2022-10-29 19:19             ` Steven Rostedt
2022-10-29 22:56               ` Guenter Roeck
2022-10-30 15:48                 ` Steven Rostedt
2022-10-31 15:50                   ` Guenter Roeck
2022-10-31 20:14                     ` Guenter Roeck
2022-10-27 15:05 ` [RFC][PATCH v2 21/31] timers: cgroup: " Steven Rostedt
2022-10-27 15:05 ` [RFC][PATCH v2 22/31] timers: workqueue: " Steven Rostedt
2022-10-27 15:05 ` [RFC][PATCH v2 23/31] timers: nfc: pn533: " Steven Rostedt
2022-10-27 15:05 ` [RFC][PATCH v2 24/31] timers: pcmcia: " Steven Rostedt
2022-10-27 15:05 ` [RFC][PATCH v2 25/31] timers: scsi: " Steven Rostedt
2022-10-27 15:05 ` [RFC][PATCH v2 26/31] timers: tty: " Steven Rostedt
2022-10-31  8:34   ` Jiri Slaby
2022-10-27 15:05 ` [RFC][PATCH v2 27/31] timers: ext4: " Steven Rostedt
2022-10-27 15:05 ` [RFC][PATCH v2 28/31] timers: fs/nilfs2: " Steven Rostedt
2022-10-28  5:12   ` Ryusuke Konishi
2022-10-27 15:05 ` [RFC][PATCH v2 29/31] timers: ALSA: " Steven Rostedt
2022-10-28  9:17   ` Takashi Iwai
2022-10-27 15:05 ` [RFC][PATCH v2 30/31] timers: x86/mce: Use __init_timer() for resetting timers Steven Rostedt
2022-10-27 15:05 ` [RFC][PATCH v2 31/31] timers: Expand DEBUG_OBJECTS_TIMER to check if it ever was used Steven Rostedt
     [not found] ` <20221027150925.819019339@goodmis.org>
2022-10-27 15:19   ` [RFC][PATCH v2 04/31] timers: block: Use del_timer_shutdown() before freeing timer Steven Rostedt
2022-10-28  8:26     ` Christoph Hellwig
2022-10-28 10:24       ` Steven Rostedt
2022-10-28 13:56         ` Jens Axboe
2022-10-28 14:06           ` Steven Rostedt
2022-10-28 14:11             ` Jens Axboe
2022-10-28 14:30               ` Steven Rostedt
2022-10-28 15:11   ` Guenter Roeck
     [not found] ` <20221027150927.371916000@goodmis.org>
2022-10-27 15:20   ` [RFC][PATCH v2 12/31] timers: dma-buf: " Steven Rostedt
     [not found] ` <20221027150927.611233945@goodmis.org>
2022-10-27 15:20   ` [RFC][PATCH v2 13/31] timers: drm: " Steven Rostedt
     [not found] ` <20221027150927.992061541@goodmis.org>
2022-10-27 15:21   ` [RFC][PATCH v2 15/31] timers: Input: " Steven Rostedt
2022-10-27 16:38     ` Dmitry Torokhov
2022-10-27 15:52 ` [RFC][PATCH v2 00/31] timers: Use del_timer_shutdown() before freeing timers Jason A. Donenfeld
2022-10-27 16:01   ` Sebastian Andrzej Siewior
2022-10-27 17:23     ` Steven Rostedt
2022-10-27 18:58 ` Guenter Roeck
2022-10-27 19:02   ` Steven Rostedt
2022-10-27 19:11     ` Guenter Roeck
2022-10-27 19:11     ` Linus Torvalds
2022-10-27 19:16       ` Steven Rostedt
2022-10-27 19:44         ` Guenter Roeck
2022-10-27 19:20   ` Steven Rostedt
2022-10-27 19:27     ` Steven Rostedt
2022-10-27 19:38       ` Guenter Roeck
2022-10-27 22:24 ` Guenter Roeck [this message]
2022-10-27 22:58   ` Steven Rostedt
2022-10-27 23:24     ` Guenter Roeck
2022-10-27 23:55       ` Steven Rostedt
2022-10-28  0:54         ` Guenter Roeck
2022-10-28 15:30     ` Guenter Roeck
2022-10-28 16:10     ` Guenter Roeck
     [not found] ` <20221028021815.3130-1-hdanton@sina.com>
2022-10-28  3:17   ` [RFC][PATCH v2 20/31] timers: usb: Use del_timer_shutdown() before freeing timer Steven Rostedt
2022-10-28 18:50 ` [RFC][PATCH v2 00/31] timers: Use del_timer_shutdown() before freeing timers Steven Rostedt
2022-10-28 20:12   ` Trond Myklebust
2022-10-28 20:49     ` Steven Rostedt
2022-10-28 21:57       ` Trond Myklebust

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221027222404.GA3123386@roeck-us.net \
    --to=linux@roeck-us.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=sboyd@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).