* spinlock recursion when running q800 emulation in qemu
@ 2024-03-04 17:58 Guenter Roeck
  2024-03-05  0:33 ` Finn Thain
  2024-03-06  7:14 ` Michael Schmitz
  0 siblings, 2 replies; 36+ messages in thread
From: Guenter Roeck @ 2024-03-04 17:58 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: linux-m68k

Hi,

When running the q800 qemu emulation, I am seeing random "spinlock recursion"
messages every few test runs. Some examples are below. This is not a new
problem, but it keeps popping up.

Is this a potential problem or just noise?

Thanks,
Guenter

---
BUG: spinlock recursion on CPU#0, swapper/1
 lock: 0x5875c0, .magic: dead4ead, .owner: swapper/1, .owner_cpu: 0
CPU: 0 PID: 1 Comm: swapper Not tainted 6.8.0-rc6-mac-00037-g805d849d7c3c #1
Stack from 00c49e80:
        00c49e80 0051c8da 0051c8da 005875c0 00000000 003ec2de 0051c8da 00038e26
        005875c0 004f094b 00c4c010 0003025a 005875c0 00c49ec4 00030272 005875c0
        00000000 00c49ed4 00031592 00000000 00c15100 0003df46 0004b6b6 00000000
        00006b1e 00000000 0003dfc0 0000000e 00000000 00000000 00042608 0000000f
        00000000 005ba9b0 005baa0c 00000000 00000000 0003e050 005ba9b0 00000000
        005ba9b0 0003e0a6 005ba9b0 005baa0c 00000000 005ba9b0 005baa0c 00040524
Call Trace: [<003ec2de>] dump_stack+0xc/0x10
 [<00038e26>] do_raw_spin_lock+0x34/0x58
 [<0003025a>] raw_spin_rq_lock_nested+0xe/0x14
 [<00030272>] raw_spin_rq_lock.constprop.0+0x12/0x18
 [<00031592>] scheduler_tick+0x18/0x62
 [<0003df46>] __irq_wake_thread+0x0/0x40
 [<0004b6b6>] update_process_times+0x5e/0x68
 [<00006b1e>] via_timer_handler+0x1a/0x20
 [<0003dfc0>] __handle_irq_event_percpu+0x3a/0xba
 [<00042608>] check_all_holdout_tasks_trace+0x114/0x25a
 [<0003e050>] handle_irq_event_percpu+0x10/0x3a
 [<0003e0a6>] handle_irq_event+0x2c/0x4e
 [<00040524>] handle_simple_irq+0x64/0x68
 [<003ec3ba>] kernel_init+0x0/0xec
 [<0003db76>] handle_irq_desc+0x1e/0x28
 [<000069be>] via1_irq+0x42/0x80
 [<00046914>] resolve_symbol+0x192/0x27e
 [<003ec3ba>] kernel_init+0x0/0xec
 [<0003db76>] handle_irq_desc+0x1e/0x28
 [<00002798>] do_IRQ+0x20/0x32
 [<00002698>] auto_irqhandler_fixup+0x4/0xc
 [<00046914>] resolve_symbol+0x192/0x27e
 [<00002530>] ret_from_kernel_thread+0x0/0x14
RCU Tasks Trace: Setting shift to 0 and lim to 1 rcu_task_cb_adjust=1.

---
BUG: spinlock recursion on CPU#0, swapper/2
 lock: 0x5875c0, .magic: dead4ead, .owner: swapper/2, .owner_cpu: 0
CPU: 0 PID: 2 Comm: swapper Not tainted 6.8.0-rc7-mac #1
Stack from 00c81e80:
        00c81e80 0051c8ce 0051c8ce 005875c0 00000000 003ec30a 0051c8ce 00038e26
        005875c0 004f093f 00c4c650 0003025a 005875c0 00c81ec4 00030272 005875c0
        00000000 00c81ed4 00031592 00000000 00c15100 0003df46 0004b6b6 00000000
        00006b1e 00000000 0003dfc0 0000000e 00000000 00000000 00c42608 00000102
        00037462 005ba9b0 005baa0c 00000000 00000000 0003e050 005ba9b0 00000000
        005ba9b0 0003e0a6 005ba9b0 005baa0c 00000000 005ba9b0 005baa0c 00040524
Call Trace: [<003ec30a>] dump_stack+0xc/0x10
 [<00038e26>] do_raw_spin_lock+0x34/0x58
 [<0003025a>] raw_spin_rq_lock_nested+0xe/0x14
 [<00030272>] raw_spin_rq_lock.constprop.0+0x12/0x18
 [<00031592>] scheduler_tick+0x18/0x62
 [<0003df46>] __irq_wake_thread+0x0/0x40
 [<0004b6b6>] update_process_times+0x5e/0x68
 [<00006b1e>] via_timer_handler+0x1a/0x20
 [<0003dfc0>] __handle_irq_event_percpu+0x3a/0xba
 [<00037462>] __prepare_to_swait+0x0/0x26
 [<0003e050>] handle_irq_event_percpu+0x10/0x3a
 [<0003e0a6>] handle_irq_event+0x2c/0x4e
 [<00040524>] handle_simple_irq+0x64/0x68
 [<0002c5c0>] kthreadd+0x0/0xf4
 [<0003db76>] handle_irq_desc+0x1e/0x28
 [<000069be>] via1_irq+0x42/0x80
 [<0002c5c0>] kthreadd+0x0/0xf4
 [<0003db76>] handle_irq_desc+0x1e/0x28
 [<00002798>] do_IRQ+0x20/0x32
 [<00002698>] auto_irqhandler_fixup+0x4/0xc
 [<00037462>] __prepare_to_swait+0x0/0x26
 [<00002530>] ret_from_kernel_thread+0x0/0x14
RCU Tasks Trace: Setting shift to 0 and lim to 1 rcu_task_cb_adjust=1.

---
BUG: spinlock recursion on CPU#0, swapper/2
 lock: 0x573580, .magic: dead4ead, .owner: swapper/2, .owner_cpu: 0
CPU: 0 PID: 2 Comm: swapper Not tainted 6.6.20-mac #1
Stack from 0085be80:
        0085be80 0050972e 0050972e 00573580 00000000 003dc00e 0050972e 000385ce
        00573580 004dddb2 0084c630 0002fca0 00573580 0085bec4 0002fcb8 00573580
        00000000 0085bed4 00030f36 00000000 00815100 0003d35e 0004a990 00000000
        00006b0a 00000000 0003d3d8 0000000e 00000000 00000000 00842608 00000102
        00036c08 005a69b4 005a6a10 00000000 00000000 0003d468 005a69b4 00000000
        005a69b4 0003d4be 005a69b4 005a6a10 00000000 005a69b4 005a6a10 0003f9a4
Call Trace: [<003dc00e>] dump_stack+0xc/0x10
 [<000385ce>] do_raw_spin_lock+0x34/0x58
 [<0002fca0>] raw_spin_rq_lock_nested+0xe/0x14
 [<0002fcb8>] raw_spin_rq_lock.constprop.0+0x12/0x18
 [<00030f36>] scheduler_tick+0x18/0x62
 [<0003d35e>] __irq_wake_thread+0x0/0x40
 [<0004a990>] update_process_times+0x5e/0x66
 [<00006b0a>] via_timer_handler+0x1a/0x20
 [<0003d3d8>] __handle_irq_event_percpu+0x3a/0xba
 [<00036c08>] __prepare_to_swait+0x0/0x26
 [<0003d468>] handle_irq_event_percpu+0x10/0x3a
 [<0003d4be>] handle_irq_event+0x2c/0x4e
 [<0003f9a4>] handle_simple_irq+0x64/0x68
 [<0002c1c8>] kthreadd+0x0/0xf4
 [<0003cf8e>] handle_irq_desc+0x1e/0x28
 [<000069aa>] via1_irq+0x42/0x80
 [<0002c1c8>] kthreadd+0x0/0xf4
 [<0003cf8e>] handle_irq_desc+0x1e/0x28
 [<0000276c>] do_IRQ+0x20/0x32
 [<0000266c>] auto_irqhandler_fixup+0x4/0xc
 [<00036c08>] __prepare_to_swait+0x0/0x26
 [<00002504>] ret_from_kernel_thread+0x0/0x14
RCU Tasks Trace: Setting shift to 0 and lim to 1 rcu_task_cb_adjust=1.

---
BUG: spinlock recursion on CPU#0, swapper/2
 lock: 0x52b8d0, .magic: dead4ead, .owner: swapper/2, .owner_cpu: 0
CPU: 0 PID: 2 Comm: swapper Not tainted 6.1.76-mac #1
Stack from 0082fe5c:
        0082fe5c 004ca084 004ca084 00000000 00000000 003ac8d6 004ca084 00038ec6
        0052b8d0 004a1af1 00810040 0002f7ca 0052b8d0 0082fea0 0002f7e2 0052b8d0
        00000000 0082fed4 00030abe 0082fecc 00000000 00000000 0000000e 003a44ea
        00000000 00000000 00810040 0055ecdc 0005b98e 00563214 0003d82e 0004a7f6
        00000000 00006e9e 00000000 0003d8a8 0000000e 00000000 00000000 00522608
        0000000f 005272c8 0055ecdc 0055ed38 00000000 00000000 0003d938 0055ecdc
Call Trace: [<003ac8d6>] dump_stack+0xc/0x10
 [<00038ec6>] do_raw_spin_lock+0x34/0x58
 [<0002f7ca>] raw_spin_rq_lock_nested+0xe/0x14
 [<0002f7e2>] rq_lock.constprop.0+0x12/0x28
 [<00030abe>] scheduler_tick+0x1c/0x142
 [<003a44ea>] _printk+0x0/0x18
 [<0005b98e>] irq_work_tick+0x20/0x26
 [<0003d82e>] __irq_wake_thread+0x0/0x40
 [<0004a7f6>] update_process_times+0x5e/0x68
 [<00006e9e>] via_timer_handler+0x1a/0x20
 [<0003d8a8>] __handle_irq_event_percpu+0x3a/0xba
 [<0003d938>] handle_irq_event_percpu+0x10/0x3a
 [<0003d98e>] handle_irq_event+0x2c/0x4e
 [<0003fe64>] handle_simple_irq+0x64/0x68
 [<0002bb10>] kthreadd+0x0/0xec
 [<0003d3f2>] handle_irq_desc+0x1e/0x28
 [<00006d3e>] via1_irq+0x42/0x80
 [<0002bb10>] kthreadd+0x0/0xec
 [<0003d3f2>] handle_irq_desc+0x1e/0x28
 [<00002b58>] do_IRQ+0x20/0x32
 [<00002a58>] auto_irqhandler_fixup+0x4/0xc
 [<000028f8>] ret_from_kernel_thread+0x0/0x14
cblist_init_generic: Setting adjustable number of callback queues.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-04 17:58 spinlock recursion when running q800 emulation in qemu Guenter Roeck
@ 2024-03-05  0:33 ` Finn Thain
  2024-03-05  0:48   ` Michael Schmitz
       [not found]   ` <fcb506f2-523d-4efc-ae3d-fe3c79c6f09e@gmail.com>
  2024-03-06  7:14 ` Michael Schmitz
  1 sibling, 2 replies; 36+ messages in thread
From: Finn Thain @ 2024-03-05  0:33 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Geert Uytterhoeven, linux-m68k


On Mon, 4 Mar 2024, Guenter Roeck wrote:

> when running the q800 qemu emulation, am seeing random "spinlock 
> recursion" messages every few test runs. Some examples are below. This 
> is not a new problem, but it keeps popping up.
> 

You have to enable CONFIG_DEBUG_SPINLOCK to see it. I have logs from 10 
years ago that show it. Those logs have the VIA timer driver in the 
backtrace, like yours. That was v3.17 running on a PowerBook 180.
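
For anyone trying to reproduce this, the relevant .config fragment is just 
the following (the DEBUG_KERNEL dependency is from memory):

CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_SPINLOCK=y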

> Is this a potential problem or just noise ?

Maybe the locking validation is not compatible with nested interrupts (?)

Does "spinlock recursion" show up in Aranym as well? I would check for 
that myself but I don't have a reproducer.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-05  0:33 ` Finn Thain
@ 2024-03-05  0:48   ` Michael Schmitz
       [not found]   ` <fcb506f2-523d-4efc-ae3d-fe3c79c6f09e@gmail.com>
  1 sibling, 0 replies; 36+ messages in thread
From: Michael Schmitz @ 2024-03-05  0:48 UTC (permalink / raw)
  To: Finn Thain, Guenter Roeck; +Cc: Geert Uytterhoeven, linux-m68k

Hi Guenter,

On 5/03/24 13:33, Finn Thain wrote:
> On Mon, 4 Mar 2024, Guenter Roeck wrote:
>
>> when running the q800 qemu emulation, am seeing random "spinlock
>> recursion" messages every few test runs. Some examples are below. This
>> is not a new problem, but it keeps popping up.
>>
> You have to enable CONFIG_DEBUG_SPINLOCK to see it. I have logs from 10
> years ago that show it. Those logs have the VIA timer driver in the
> backtrace, like yours. That was v3.17 running on a PowerBook 180.
>
>> Is this a potential problem or just noise ?
> Maybe the locking validation is not compatible with nested interrupts (?)
Seeing as m68k is the only architecture that allows nesting of 
interrupts, I'd rate that as quite likely. At least that's how I 
understood the situation a few years back, when that weird bug with 
accesses past the end of RAM during interrupt accounting was discussed.
> Does "spinlock recursion" show up in Aranym as well? I would check for
> that myself but I don't have a reproducer.

If you can give me a hint about the kind of workload this happens under, 
I'd be happy to run something similar (if not the same) on my 030 hardware.

I've never set CONFIG_DEBUG_SPINLOCK, so would have missed these 
messages, too.

Cheers,

Michael


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
       [not found]   ` <fcb506f2-523d-4efc-ae3d-fe3c79c6f09e@gmail.com>
@ 2024-03-05  0:58     ` Guenter Roeck
  2024-03-05  1:06       ` Michael Schmitz
  0 siblings, 1 reply; 36+ messages in thread
From: Guenter Roeck @ 2024-03-05  0:58 UTC (permalink / raw)
  To: Michael Schmitz, Finn Thain; +Cc: Geert Uytterhoeven, linux-m68k

On 3/4/24 16:47, Michael Schmitz wrote:
> Hi Guenter,
> 
> On 5/03/24 13:33, Finn Thain wrote:
>> On Mon, 4 Mar 2024, Guenter Roeck wrote:
>>
>>> when running the q800 qemu emulation, am seeing random "spinlock
>>> recursion" messages every few test runs. Some examples are below. This
>>> is not a new problem, but it keeps popping up.
>>>
>> You have to enable CONFIG_DEBUG_SPINLOCK to see it. I have logs from 10
>> years ago that show it. Those logs have the VIA timer driver in the
>> backtrace, like yours. That was v3.17 running on a PowerBook 180.
>>
>>> Is this a potential problem or just noise ?
>> Maybe the locking validation is not compatible with nested interrupts (?)
> Seeing as m68k is the only architecture that allows nesting of interrupts, I'd rate that as quite likely. At least that's how I understood the situation a few years back when that weird access past end of RAM during interrupt accounting bug was discussed.
>> Does "spinlock recursion" show up in Aranym as well? I would check for
>> that myself but I don't have a reproducer.
> 
> If you can give me a hint on what kind of workload that happens at, I'd be happy to run something similar (if not the same) on my 030 hardware.
> 
> I've never set CONFIG_DEBUG_SPINLOCK, so would have missed these messages, too.
> 

No workload in my case, just boot into the qemu q800 emulation.

Looking into statistics, for example at
https://kerneltests.org/builders/qemu-m68k-master,
it is seen every ~5-10 boot attempts.

Guenter


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-05  0:58     ` Guenter Roeck
@ 2024-03-05  1:06       ` Michael Schmitz
  0 siblings, 0 replies; 36+ messages in thread
From: Michael Schmitz @ 2024-03-05  1:06 UTC (permalink / raw)
  To: Guenter Roeck, Finn Thain; +Cc: Geert Uytterhoeven, linux-m68k

Hi Guenter,

On 5/03/24 13:58, Guenter Roeck wrote:
> On 3/4/24 16:47, Michael Schmitz wrote:
>> Hi Guenter,
>>
>> On 5/03/24 13:33, Finn Thain wrote:
>>> On Mon, 4 Mar 2024, Guenter Roeck wrote:
>>>
>>>> when running the q800 qemu emulation, am seeing random "spinlock
>>>> recursion" messages every few test runs. Some examples are below. This
>>>> is not a new problem, but it keeps popping up.
>>>>
>>> You have to enable CONFIG_DEBUG_SPINLOCK to see it. I have logs from 10
>>> years ago that show it. Those logs have the VIA timer driver in the
>>> backtrace, like yours. That was v3.17 running on a PowerBook 180.
>>>
>>>> Is this a potential problem or just noise ?
>>> Maybe the locking validation is not compatible with nested 
>>> interrupts (?)
>> Seeing as m68k is the only architecture that allows nesting of 
>> interrupts, I'd rate that as quite likely. At least that's how I 
>> understood the situation a few years back when that weird access past 
>> end of RAM during interrupt accounting bug was discussed.
>>> Does "spinlock recursion" show up in Aranym as well? I would check for
>>> that myself but I don't have a reproducer.
>>
>> If you can give me a hint on what kind of workload that happens at, 
>> I'd be happy to run something similar (if not the same) on my 030 
>> hardware.
>>
>> I've never set CONFIG_DEBUG_SPINLOCK, so would have missed these 
>> messages, too.
>>
>
> No workload in my case, just boot into the qemu q800 emulation.
>
> Looking into statistics, for example at
> https://kerneltests.org/builders/qemu-m68k-master,
> it is seen every ~5-10 boot attempts.

I can certainly do that ...

Cheers,

     Michael


>
> Guenter
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-04 17:58 spinlock recursion when running q800 emulation in qemu Guenter Roeck
  2024-03-05  0:33 ` Finn Thain
@ 2024-03-06  7:14 ` Michael Schmitz
  2024-03-06  8:30   ` Brad Boyer
  2024-03-06 23:52   ` Finn Thain
  1 sibling, 2 replies; 36+ messages in thread
From: Michael Schmitz @ 2024-03-06  7:14 UTC (permalink / raw)
  To: Guenter Roeck, Geert Uytterhoeven; +Cc: linux-m68k

Hi Guenter,

On 05.03.2024 at 06:58, Guenter Roeck wrote:
> Hi,
>
> when running the q800 qemu emulation, am seeing random "spinlock recursion"
> messages every few test runs. Some examples are below. This is not a new
> problem, but it keeps popping up.
>
> Is this a potential problem or just noise ?

Looking at the backtraces some more, I think what happens here is that 
the swapper kernel task is stalled for long enough that on the next 
scheduler tick it is still running. With no other tasks scheduled to 
run in the meantime, it may still be the current task.

The stalled swapper task still holds the run queue lock, hence the 
recursion message.

I do not think this is a bug, just a reflection of a slow system with 
nothing much to do.
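
For reference, the message itself comes from the owner check in the 
spinlock debugging code. Roughly (paraphrased from 
kernel/locking/spinlock_debug.c, so details may differ between kernel 
versions):

static inline void debug_spin_lock_before(raw_spinlock_t *lock)
{
	SPIN_BUG_ON(READ_ONCE(lock->magic) != SPINLOCK_MAGIC, lock, "bad magic");
	SPIN_BUG_ON(READ_ONCE(lock->owner) == current, lock, "recursion");
	SPIN_BUG_ON(READ_ONCE(lock->owner_cpu) == raw_smp_processor_id(),
							lock, "cpu recursion");
}

So if the tick handler ends up in scheduler_tick() while the interrupted 
task is still recorded as the owner of the run queue lock, the second 
acquisition sees lock->owner == current and reports recursion.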

There does not appear to be any interrupt nesting here. I cannot recall 
the interrupt level assignment on the Q800, in particular as regards 
disk interrupts and whether both VIAs are using the same interrupt 
level. I'd suppose they would, as these chips date back a long time and 
probably cannot place a vector number on the bus as part of the 
interrupt stack frame?

Whether I can hope to see interrupt nesting on my Atari is unclear - 
pretty much anything on that machine is generating interrupts via the 
ST-MFP so it's all at the same level.

Cheers,

	Michael


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-06  7:14 ` Michael Schmitz
@ 2024-03-06  8:30   ` Brad Boyer
  2024-03-06 23:13     ` Finn Thain
  2024-03-06 23:42     ` Michael Schmitz
  2024-03-06 23:52   ` Finn Thain
  1 sibling, 2 replies; 36+ messages in thread
From: Brad Boyer @ 2024-03-06  8:30 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: Guenter Roeck, Geert Uytterhoeven, linux-m68k

On Wed, Mar 06, 2024 at 08:14:31PM +1300, Michael Schmitz wrote:
> Looking at the backtraces some more, I think what happens here is that the
> swapper kernel task is stalled for long enough here that on the next
> scheduler tick, it is still running. With no other tasks scheduled to run in
> the meantime, it may still be the current task.
> 
> The stalled swapper task still holds the run queue lock, hence the recursion
> message.
> 
> I do not think this is a bug, just a reflection of a slow system with
> nothing much to do.
> 
> There does not appear to be any interrupt nesting here. I cannot recall the
> interrupt level assignment on the Q800, in particular as regards disk
> interrupts and whether both VIAs are using the same interrupt level. I'd
> suppose they would, as these chips date back a long time and probably cannot
> place a vector number on the bus as part of the interrupt stack frame?

No, a Mac with 2 VIA chips should always have them on separate interrupt
levels. The exact mapping varies by model, but every model supported
by Linux uses at least three different hardware IRQ levels.

On most Mac models, VIA1 is IRQ1, VIA2 is IRQ2, and the SCC is IRQ4.
VIA1 would handle the timer and ADB while VIA2 handles NuBus and SCSI.
It looks like the Q800 supports switching VIA1 to IRQ6 through the
alternate IRQ mapping. Most models also use various other chips 
emulating a real VIA for VIA2, with varying levels of compatibility.

Apple always used autovectoring and the three IPL pins to determine
the interrupt vector. Older models just have three external IRQ
sources, one matching each pin (thus the 1, 2, 4 pattern). I presume
they had some external circuit to make sure only one line is being
pulled low at a time since we never get other IRQ levels by accident.

The AV macs and the IIfx are the only models with more complex
interrupt controllers as separate chips (PSC or OSS).

	Brad Boyer
	flar@allandria.com


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-06  8:30   ` Brad Boyer
@ 2024-03-06 23:13     ` Finn Thain
  2024-03-06 23:46       ` Guenter Roeck
  2024-03-06 23:42     ` Michael Schmitz
  1 sibling, 1 reply; 36+ messages in thread
From: Finn Thain @ 2024-03-06 23:13 UTC (permalink / raw)
  To: Brad Boyer; +Cc: Michael Schmitz, Guenter Roeck, Geert Uytterhoeven, linux-m68k


On Wed, 6 Mar 2024, Brad Boyer wrote:

> It looks like the Q800 supports switching VIA1 to IRQ6 through the
> alternate IRQ mapping. 

Yes, and that's what happens in QEMU. That was easily confirmed with the 
patch below. As a result, you get less clocksource drift during I/O but 
you also get a lot of interrupt nesting.


# cat /proc/interrupts
           CPU0       
  3:          2      auto      sonic
  4:         45      auto      ttyS0
  7:          0      auto      NMI
 10:       2089       mac      ADB
 14:        988       mac      timer
 19:        792       mac      ESP
 56:          0       mac      sonic
ERR:          0


diff --git a/arch/m68k/mac/via.c b/arch/m68k/mac/via.c
index 01e6b0e37f8d..aa3bdbaf2ff4 100644
--- a/arch/m68k/mac/via.c
+++ b/arch/m68k/mac/via.c
@@ -386,6 +386,9 @@ void via1_irq(struct irq_desc *desc)
 	int irq_num;
 	unsigned char irq_bit, events;
 
+	if (irq_desc_get_irq(desc) == 6)
+		printk_once(KERN_INFO "%s: using remapped irq\n", __func__);
+
 	events = via1[vIFR] & via1[vIER] & 0x7F;
 	if (!events)
 		return;

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-06  8:30   ` Brad Boyer
  2024-03-06 23:13     ` Finn Thain
@ 2024-03-06 23:42     ` Michael Schmitz
  1 sibling, 0 replies; 36+ messages in thread
From: Michael Schmitz @ 2024-03-06 23:42 UTC (permalink / raw)
  To: Brad Boyer; +Cc: Guenter Roeck, Geert Uytterhoeven, linux-m68k

Thanks Brad,

On 6/03/24 21:30, Brad Boyer wrote:
> On Wed, Mar 06, 2024 at 08:14:31PM +1300, Michael Schmitz wrote:
>> Looking at the backtraces some more, I think what happens here is that the
>> swapper kernel task is stalled for long enough here that on the next
>> scheduler tick, it is still running. With no other tasks scheduled to run in
>> the meantime, it may still be the current task.
>>
>> The stalled swapper task still holds the run queue lock, hence the recursion
>> message.
>>
>> I do not think this is a bug, just a reflection of a slow system with
>> nothing much to do.
>>
>> There does not appear to be any interrupt nesting here. I cannot recall the
>> interrupt level assignment on the Q800, in particular as regards disk
>> interrupts and whether both VIAs are using the same interrupt level. I'd
>> suppose they would, as these chips date back a long time and probably cannot
... that should have been 'would not', of course. Autovectoring can't 
work otherwise.
>> place a vector number on the bus as part of the interrupt stack frame?
> No, a Mac with 2 VIA chips should always have them on separate interrupt
> levels. The exact mapping varies by model, but every model supported
> by Linux uses at least three different hardware IRQ levels.
>
> On most Mac models, VIA1 is IRQ1, VIA2 is IRQ2, and the SCC is IRQ4.
> VIA1 would handle the timer and ADB while VIA2 handles NuBus and SCSI.
> It looks like the Q800 supports switching VIA1 to IRQ6 through the
> alternate IRQ mapping. Most models also are using various other chips
> emulating a real VIA for VIA2 with varying levels of compatibility.
>
> Apple always used autovectoring and the three IPL pins to determine
> the interrupt vector. Older models just have three external IRQ
> sources, one matching each pin (thus the 1, 2, 4 pattern). I presume
> they had some external circuit to make sure only one line is being
> pulled low at a time since we never get other IRQ levels by accident.

Makes sense - thanks for refreshing my memory!

Cheers,

     Michael


>
> The AV macs and the IIfx are the only models with more complex
> interrupt controllers as separate chips (PSC or OSS).
>
> 	Brad Boyer
> 	flar@allandria.com
>
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-06 23:13     ` Finn Thain
@ 2024-03-06 23:46       ` Guenter Roeck
  2024-03-07 23:35         ` Finn Thain
  0 siblings, 1 reply; 36+ messages in thread
From: Guenter Roeck @ 2024-03-06 23:46 UTC (permalink / raw)
  To: Finn Thain, Brad Boyer; +Cc: Michael Schmitz, Geert Uytterhoeven, linux-m68k

On 3/6/24 15:13, Finn Thain wrote:
> 
> On Wed, 6 Mar 2024, Brad Boyer wrote:
> 
>> It looks like the Q800 supports switching VIA1 to IRQ6 through the
>> alternate IRQ mapping.
> 
> Yes, and that's what happens in QEMU. That was easily confirmed with the
> patch below. As a result, you get less clocksource drift during I/O but
> you also get a lot of interrupt nesting.
> 

Does that mean we just have to live with the warning, or is there a possible
solution ?

Thanks,
Guenter


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-06  7:14 ` Michael Schmitz
  2024-03-06  8:30   ` Brad Boyer
@ 2024-03-06 23:52   ` Finn Thain
  2024-03-08  0:20     ` Michael Schmitz
  1 sibling, 1 reply; 36+ messages in thread
From: Finn Thain @ 2024-03-06 23:52 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: Guenter Roeck, Geert Uytterhoeven, linux-m68k


On Wed, 6 Mar 2024, Michael Schmitz wrote:

> 
> There does not appear to be any interrupt nesting here. 
> 

It's not in the backtrace but that doesn't mean much. It would be 
impossible for a lock checker to place all of its state on the stack. 
If interrupt nesting messed up that state, you won't know it by looking at 
the stack after the fact.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-06 23:46       ` Guenter Roeck
@ 2024-03-07 23:35         ` Finn Thain
  0 siblings, 0 replies; 36+ messages in thread
From: Finn Thain @ 2024-03-07 23:35 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Brad Boyer, Michael Schmitz, Geert Uytterhoeven, linux-m68k


On Wed, 6 Mar 2024, Guenter Roeck wrote:

> 
> Does that mean we just have to live with the warning, or is there a 
> possible solution ?
> 

I wasn't able to reproduce the problem and I don't know what the root 
cause is. That cause may be unrelated to the timer interrupt.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-06 23:52   ` Finn Thain
@ 2024-03-08  0:20     ` Michael Schmitz
  2024-03-08  0:56       ` Finn Thain
  0 siblings, 1 reply; 36+ messages in thread
From: Michael Schmitz @ 2024-03-08  0:20 UTC (permalink / raw)
  To: Finn Thain; +Cc: Guenter Roeck, Geert Uytterhoeven, linux-m68k

Hi Finn,

On 7/03/24 12:52, Finn Thain wrote:
> On Wed, 6 Mar 2024, Michael Schmitz wrote:
>
>> There does not appear to be any interrupt nesting here.
>>
> It's not in the backtrace but that doesn't mean much. It would be
> impossible for a lock checker to place all of it's state on the stack.
> If interrupt nesting messed up its state, you won't know it by looking at
> the stack after the fact.

True, we may see the effect only on the next tick.

But we may not need to invoke lock checker corruption in order to 
explain the recursion message.

Reentering the VIA1 timer handler would re-run scheduler_tick(). If the 
prior run has not yet released the run queue lock, we'd get the 
recursion message.

I realized (belatedly) that scheduler_tick() does not run the task 
itself but just causes a reschedule if appropriate, so the probability 
for this condition is quite low. The question is - does the VIA1 timer 
interrupt ever get reentered?

Can you add a printk_once() warning when you see 
arch/m68k/mac/via.c:via_timer_handler() reentered, Guenter?
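
Something like this untested sketch is what I have in mind (the handler 
signature is as I remember it from arch/m68k/mac/via.c; the existing 
body is only hinted at by the comment):

static irqreturn_t via_timer_handler(int irq, void *dev_id)
{
	static bool in_handler;		/* UP only, so a plain flag will do */

	if (in_handler)
		printk_once(KERN_WARNING "%s: reentered\n", __func__);
	in_handler = true;

	/* ... existing tick handling, ending in legacy_timer_tick() ... */

	in_handler = false;
	return IRQ_HANDLED;
}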

Cheers,

     Michael




^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-08  0:20     ` Michael Schmitz
@ 2024-03-08  0:56       ` Finn Thain
  2024-03-08  8:06         ` Michael Schmitz
  0 siblings, 1 reply; 36+ messages in thread
From: Finn Thain @ 2024-03-08  0:56 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: Guenter Roeck, Geert Uytterhoeven, linux-m68k

On Fri, 8 Mar 2024, Michael Schmitz wrote:

> > If interrupt nesting messed up its state, you won't know it by looking 
> > at the stack after the fact.
> 
> True, we may see the effect only on the next tick.
> 
> But we may not need to invoke lock checker corruption in order to 
> explain the recursion message.
> 

Right. We can explain it in other ways.

> Reentering the VIA1 timer handler would re-run scheduler_tick(). If the 
> prior run has not yet released the run queue lock, we'd get the 
> recursion message.
> 
> I realized (belatedly) that scheduler_tick() does not run the task 
> itself but just causes a reschedule if appropriate, so the probability 
> for this condition is quite low. The question is - does the VIA1 timer 
> interrupt ever get reentered?
> 
> Can you add a printk_once() warning when you see 
> arch/m68k/mac/via.c:via_timer_handler() reentered, Guenter?
> 

It's not possible for an interrupt handler to be interrupted unless by a 
hard interrupt at a higher priority level. Maybe QEMU could be so broken 
(?!) but recall that I saw this BUG on a PowerBook 180 ten years ago.

IMHO, Guenter would do better to instrument the spinlock tests on the 
assumption that the locking in arch/m68k really is correct.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-08  0:56       ` Finn Thain
@ 2024-03-08  8:06         ` Michael Schmitz
  2024-03-08  9:15           ` Finn Thain
  0 siblings, 1 reply; 36+ messages in thread
From: Michael Schmitz @ 2024-03-08  8:06 UTC (permalink / raw)
  To: Finn Thain; +Cc: Guenter Roeck, Geert Uytterhoeven, linux-m68k

Hi Finn,

On 08.03.2024 at 13:56, Finn Thain wrote:
>> I realized (belatedly) that scheduler_tick() does not run the task
>> itself but just causes a reschedule if appropriate, so the probability
>> for this condition is quite low. The question is - does the VIA1 timer
>> interrupt ever get reentered?
>>
>> Can you add a printk_once() warning when you see
>> arch/m68k/mac/via.c:via_timer_handler() reentered, Guenter?
>>
>
> It's not possible for an interrupt handler to be interrupted unless by an
> hard interrupt at a higher priority level. Maybe QEMU could be so broken

Right - not sure why I remembered interrupt handlers lowering the IPL 
there.

That leaves other users of the run queue lock. Queuing new tasks would 
be one obvious such user for the init task, but that is run with IRQs 
disabled (otherwise we'd see these warnings a lot more often). And 
comparing with how early I see the 'cblist_init_generic: Setting 
adjustable number of callback queues' message in my boot logs, this may 
all happen too early for init to start queuing tasks.

> (?!) but recall that I saw this BUG on a Powerbook 180 ten years ago.

I'll take a look at the scheduler code from that era (though probably 
not much has changed since).

> IMHO, Gueunter would do better to instrument the spinlock tests on the
> assumption that the locking in arch/m68k really is correct.

I've come to agree - maybe logging any run queue locks taken by the init 
task with IRQs enabled might help?

Cheers,

	Michael



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-08  8:06         ` Michael Schmitz
@ 2024-03-08  9:15           ` Finn Thain
  2024-03-08  9:33             ` Finn Thain
  0 siblings, 1 reply; 36+ messages in thread
From: Finn Thain @ 2024-03-08  9:15 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: Guenter Roeck, Geert Uytterhoeven, linux-m68k

On Fri, 8 Mar 2024, Michael Schmitz wrote:

> 
> > IMHO, Gueunter would do better to instrument the spinlock tests on the 
> > assumption that the locking in arch/m68k really is correct.
> 
> I've come to agree - maybe logging any run queue locks taken by the init 
> task with IRQs enabled might help?
> 

If I understood how the spinlock test algorithm worked, I'd be adding 
sanity checks to it to test the assumptions underlying that algorithm.

I would not be surprised if it uses logic that sees irqs_disabled() and 
then assumes that all hard irqs are disabled, or something similar.

I don't think I'd be adding printk() because the BUG is intermittent and 
printk() messes with timing.
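
If someone wants a starting point, a hypothetical instrumentation that 
stays out of the fast path would be to report the interrupt context only 
once a check has already tripped, e.g. in spin_bug() in 
kernel/locking/spinlock_debug.c (a sketch only, names from memory):

static void spin_bug(raw_spinlock_t *lock, const char *msg)
{
	if (!debug_locks_off())
		return;

	/* extra context: are we in a (nested) hardirq, and does the
	 * arch report interrupts as disabled? */
	pr_err("spinlock debug: hardirq_count=%#x irqs_disabled=%d\n",
	       hardirq_count(), irqs_disabled());
	spin_dump(lock, msg);
}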

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-08  9:15           ` Finn Thain
@ 2024-03-08  9:33             ` Finn Thain
  2024-03-08 20:14               ` Michael Schmitz
  0 siblings, 1 reply; 36+ messages in thread
From: Finn Thain @ 2024-03-08  9:33 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: Guenter Roeck, Geert Uytterhoeven, linux-m68k

On Fri, 8 Mar 2024, I wrote:

> On Fri, 8 Mar 2024, Michael Schmitz wrote:
> 
> > 
> > > IMHO, Gueunter would do better to instrument the spinlock tests on the 
> > > assumption that the locking in arch/m68k really is correct.
> > 
> > I've come to agree - maybe logging any run queue locks taken by the init 
> > task with IRQs enabled might help?
> > 
> 
> If I understood how the spinlock test algorithm worked, I'd be adding 
> sanity checks to it to test the assumptions underlying that algorithm.
> 
> I would not be surprised if it uses logic that sees irqs_disabled() and 
> then assumes that all hard irqs are disabled, or something similar.
> 
> I don't think I'd be adding printk() because the BUG is intermittent and 
> printk() messes with timing.
> 

Another way to tackle this is to first find a reliable reproducer, then 
raise the IPL before dispatching hard interrupts.

The patch below was a forgotten experiment that I tried 3 years ago to 
deal with a similar failure. At the time I was using stress-ng to 
investigate a different issue, but maybe a workload like that could be a 
reproducer.

---

After enabling CONFIG_DEBUG_SPINLOCK, I see the following failure on m68k.

[ 4089.070000] BUG: spinlock trylock failure on UP on CPU#0, stress-ng/1341
[ 4089.070000]  lock: 0x53d4a0, .magic: dead4ead, .owner: stress-ng/1341, .owner_cpu: 0
[ 4089.070000] CPU: 0 PID: 1341 Comm: stress-ng Not tainted 5.14.0-multi-debug #2
[ 4089.070000] Stack from 00a7dcc8:
[ 4089.070000]         00a7dcc8 004da40f 004da40f 00000000 00a7dce4 0043e88e 004da40f 00a7dd10
[ 4089.070000]         0006316c 004d240e 0053d4a0 dead4ead 00956bba 0000053d 00000000 0053d4a0
[ 4089.070000]         00a7ddd8 0055e6ac 00a7dd28 000632b2 0053d4a0 004d2488 0005c81b 0053d458
[ 4089.070000]         00a7dd34 0044780e 0053d4a0 00a7dd58 002b8694 0053d4a0 00000001 0000007f
[ 4089.070000]         00860000 00000011 00528138 00528198 00a7dd7c 00069f6a 0000000d 00014200
[ 4089.070000]         00528138 00a7dd78 00a7def0 00528138 00014200 00a7dd98 00069ff0 00528138
[ 4089.070000] Call Trace: [<0043e88e>] dump_stack+0x10/0x16
[ 4089.070000]  [<0006316c>] spin_dump+0x6c/0xc0
[ 4089.070000]  [<000632b2>] do_raw_spin_trylock+0x32/0x80
[ 4089.070000]  [<0005c81b>] init_dl_bandwidth+0x1b/0x80
[ 4089.070000]  [<0044780e>] _raw_spin_trylock+0xe/0x40
[ 4089.070000]  [<002b8694>] add_interrupt_randomness+0x154/0x1c0
[ 4089.070000]  [<00069f6a>] handle_irq_event_percpu+0x2a/0x80
[ 4089.070000]  [<00014200>] INV_L2+0x8/0x10
[ 4089.070000]  [<00014200>] INV_L2+0x8/0x10
[ 4089.070000]  [<00069ff0>] handle_irq_event+0x30/0x80
[ 4089.070000]  [<0006d9e4>] handle_simple_irq+0x64/0x80
[ 4089.070000]  [<000695fe>] generic_handle_irq+0x3e/0x80
[ 4089.070000]  [<00002ea4>] do_IRQ+0x24/0x40
[ 4089.070000]  [<00002da8>] user_irqvec_fixup+0xc/0x14
[ 4089.070000]  [<0005c81a>] init_dl_bandwidth+0x1a/0x80
[ 4089.070000]  [<000d51c0>] find_vma+0x0/0x80
[ 4089.070000]  [<00444500>] down_read+0x0/0x200
[ 4089.070000]  [<002b86b0>] add_interrupt_randomness+0x170/0x1c0
[ 4089.070000]  [<00069f6a>] handle_irq_event_percpu+0x2a/0x80
[ 4089.070000]  [<00069ff0>] handle_irq_event+0x30/0x80
[ 4089.070000]  [<0006d9e4>] handle_simple_irq+0x64/0x80
[ 4089.070000]  [<000695fe>] generic_handle_irq+0x3e/0x80
[ 4089.070000]  [<00002ea4>] do_IRQ+0x24/0x40
[ 4089.070000]  [<00002d74>] auto_irqhandler_fixup+0x4/0xc
[ 4089.070000]  [<0000653c>] buserr_c+0x17c/0x600
[ 4089.070000]  [<00002ba8>] buserr+0x20/0x28
[ 4089.070000]  [<0008800d>] get_next_modinfo+0xcd/0x100

The problem seems to be that add_interrupt_randomness() was re-entered,
leading to a contended input_pool.lock and "spinlock trylock failure".

diff --git a/arch/m68k/kernel/irq.c b/arch/m68k/kernel/irq.c
index 9ab4f550342e..b46d8a57f4da 100644
--- a/arch/m68k/kernel/irq.c
+++ b/arch/m68k/kernel/irq.c
@@ -20,10 +20,13 @@
 asmlinkage void do_IRQ(int irq, struct pt_regs *regs)
 {
 	struct pt_regs *oldregs = set_irq_regs(regs);
+	unsigned long flags;
 
+	local_irq_save(flags);
 	irq_enter();
 	generic_handle_irq(irq);
 	irq_exit();
+	local_irq_restore(flags);
 
 	set_irq_regs(oldregs);
 }

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-08  9:33             ` Finn Thain
@ 2024-03-08 20:14               ` Michael Schmitz
  2024-03-09  5:02                 ` Finn Thain
  0 siblings, 1 reply; 36+ messages in thread
From: Michael Schmitz @ 2024-03-08 20:14 UTC (permalink / raw)
  To: Finn Thain; +Cc: Guenter Roeck, Geert Uytterhoeven, linux-m68k

Hi Finn,

On 08.03.2024 at 22:33, Finn Thain wrote:
> On Fri, 8 Mar 2024, I wrote:
>
>> On Fri, 8 Mar 2024, Michael Schmitz wrote:
>>
>>>
>>>> IMHO, Gueunter would do better to instrument the spinlock tests on the
>>>> assumption that the locking in arch/m68k really is correct.
>>>
>>> I've come to agree - maybe logging any run queue locks taken by the init
>>> task with IRQs enabled might help?
>>>
>>
>> If I understood how the spinlock test algorithm worked, I'd be adding
>> sanity checks to it to test the assumptions underlying that algorithm.
>>
>> I would not be surprised if it uses logic that sees irqs_disabled() and
>> then assumes that all hard irqs are disabled, or something similar.

That is quite likely, given our past experience with core code 
assumptions about interrupts.

>> I don't think I'd be adding printk() because the BUG is intermittent and
>> printk() messes with timing.
>>
>
> Another way to tackle this is first find a reliable reproducer, then raise
> the IPL before before dispatching hard interrupts.
>
> The patch below was a forgotten experiment that I tried 3 years ago to
> deal with a similar failure. At the time I was using stress-ng to
> investigate a different issue, but maybe a workload like that could be a
> reproducer.
>
> ---
>
> After enabling CONFIG_DEBUG_SPINLOCK, I see the following failure on m68k.
>
> [ 4089.070000] BUG: spinlock trylock failure on UP on CPU#0, stress-ng/1341
> [ 4089.070000]  lock: 0x53d4a0, .magic: dead4ead, .owner: stress-ng/1341, .owner_cpu: 0
> [ 4089.070000] CPU: 0 PID: 1341 Comm: stress-ng Not tainted 5.14.0-multi-debug #2
> [ 4089.070000] Stack from 00a7dcc8:
> [ 4089.070000]         00a7dcc8 004da40f 004da40f 00000000 00a7dce4 0043e88e 004da40f 00a7dd10
> [ 4089.070000]         0006316c 004d240e 0053d4a0 dead4ead 00956bba 0000053d 00000000 0053d4a0
> [ 4089.070000]         00a7ddd8 0055e6ac 00a7dd28 000632b2 0053d4a0 004d2488 0005c81b 0053d458
> [ 4089.070000]         00a7dd34 0044780e 0053d4a0 00a7dd58 002b8694 0053d4a0 00000001 0000007f
> [ 4089.070000]         00860000 00000011 00528138 00528198 00a7dd7c 00069f6a 0000000d 00014200
> [ 4089.070000]         00528138 00a7dd78 00a7def0 00528138 00014200 00a7dd98 00069ff0 00528138
> [ 4089.070000] Call Trace: [<0043e88e>] dump_stack+0x10/0x16
> [ 4089.070000]  [<0006316c>] spin_dump+0x6c/0xc0
> [ 4089.070000]  [<000632b2>] do_raw_spin_trylock+0x32/0x80
> [ 4089.070000]  [<0005c81b>] init_dl_bandwidth+0x1b/0x80
> [ 4089.070000]  [<0044780e>] _raw_spin_trylock+0xe/0x40
> [ 4089.070000]  [<002b8694>] add_interrupt_randomness+0x154/0x1c0
> [ 4089.070000]  [<00069f6a>] handle_irq_event_percpu+0x2a/0x80
> [ 4089.070000]  [<00014200>] INV_L2+0x8/0x10
> [ 4089.070000]  [<00014200>] INV_L2+0x8/0x10
> [ 4089.070000]  [<00069ff0>] handle_irq_event+0x30/0x80
> [ 4089.070000]  [<0006d9e4>] handle_simple_irq+0x64/0x80
> [ 4089.070000]  [<000695fe>] generic_handle_irq+0x3e/0x80
> [ 4089.070000]  [<00002ea4>] do_IRQ+0x24/0x40
> [ 4089.070000]  [<00002da8>] user_irqvec_fixup+0xc/0x14
> [ 4089.070000]  [<0005c81a>] init_dl_bandwidth+0x1a/0x80
> [ 4089.070000]  [<000d51c0>] find_vma+0x0/0x80
> [ 4089.070000]  [<00444500>] down_read+0x0/0x200
> [ 4089.070000]  [<002b86b0>] add_interrupt_randomness+0x170/0x1c0
> [ 4089.070000]  [<00069f6a>] handle_irq_event_percpu+0x2a/0x80
> [ 4089.070000]  [<00069ff0>] handle_irq_event+0x30/0x80
> [ 4089.070000]  [<0006d9e4>] handle_simple_irq+0x64/0x80
> [ 4089.070000]  [<000695fe>] generic_handle_irq+0x3e/0x80
> [ 4089.070000]  [<00002ea4>] do_IRQ+0x24/0x40
> [ 4089.070000]  [<00002d74>] auto_irqhandler_fixup+0x4/0xc
> [ 4089.070000]  [<0000653c>] buserr_c+0x17c/0x600
> [ 4089.070000]  [<00002ba8>] buserr+0x20/0x28
> [ 4089.070000]  [<0008800d>] get_next_modinfo+0xcd/0x100
>
> The problem seems to be that add_interrupt_randomness() was re-entered,
> leading to a contended input_pool.lock and "spinlock trylock failure".

We already knew that add_interrupt_randomness() can be reentered, though 
there are now checks in place that avoid taking input_pool.lock in 
hardirq context. But those checks were added less than three years ago.

>
> diff --git a/arch/m68k/kernel/irq.c b/arch/m68k/kernel/irq.c
> index 9ab4f550342e..b46d8a57f4da 100644
> --- a/arch/m68k/kernel/irq.c
> +++ b/arch/m68k/kernel/irq.c
> @@ -20,10 +20,13 @@
>  asmlinkage void do_IRQ(int irq, struct pt_regs *regs)
>  {
>  	struct pt_regs *oldregs = set_irq_regs(regs);
> +	unsigned long flags;
>
> +	local_irq_save(flags);
>  	irq_enter();
>  	generic_handle_irq(irq);
>  	irq_exit();
> +	local_irq_restore(flags);
>
>  	set_irq_regs(oldregs);
>  }
>

I notice legacy_timer_tick() states it must be called with interrupts 
disabled. We do not explicitly do that in via_timer_handler(), but rely 
on the IPL remaining high. Maybe adding local_irq_save(flags) / 
local_irq_restore(flags) around the legacy_timer_tick() call would help 
debug this?
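
I.e. something like this untested sketch (the rest of the handler is 
only indicated by the comment, and the argument to legacy_timer_tick() 
is from memory):

static irqreturn_t via_timer_handler(int irq, void *dev_id)
{
	unsigned long flags;

	/* ... existing clocksource/tick accounting ... */

	local_irq_save(flags);
	legacy_timer_tick(1);
	local_irq_restore(flags);

	return IRQ_HANDLED;
}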

Cheers,

	Michael

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-08 20:14               ` Michael Schmitz
@ 2024-03-09  5:02                 ` Finn Thain
  2024-03-09 20:56                   ` Michael Schmitz
  0 siblings, 1 reply; 36+ messages in thread
From: Finn Thain @ 2024-03-09  5:02 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: Guenter Roeck, Geert Uytterhoeven, linux-m68k


On Sat, 9 Mar 2024, Michael Schmitz wrote:

> 
> I notice legacy_timer_tick() states it must be called with interrupts 
> disabled. We do not explicitly do that in via_timer_handler(), but rely 
> on the IPL remaining high. Maybe adding local_irq_save(flags) / 
> local_irq_restore(flags) around the legacy_timer_tick() call would help 
> debug this?
> 

If legacy_timer_tick() is sensitive to execution context, why not test the 
IPL with an assertion? That way you can catch that problem on Aranym as 
well as Qemu.
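
For example (illustrative only; whether to key the check off 
irqs_disabled() or off the raw IPL in the status register is a judgement 
call), just before the legacy_timer_tick() call:

	{
		unsigned short sr;

		asm volatile ("movew %%sr,%0" : "=d" (sr));
		WARN_ONCE(!irqs_disabled(),
			  "timer tick with interrupts enabled, sr=%#x\n", sr);
	}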

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-09  5:02                 ` Finn Thain
@ 2024-03-09 20:56                   ` Michael Schmitz
  2024-03-09 22:18                     ` Finn Thain
  0 siblings, 1 reply; 36+ messages in thread
From: Michael Schmitz @ 2024-03-09 20:56 UTC (permalink / raw)
  To: Finn Thain; +Cc: Guenter Roeck, Geert Uytterhoeven, linux-m68k

Hi Finn,

On 09.03.2024 at 18:02, Finn Thain wrote:
>
> On Sat, 9 Mar 2024, Michael Schmitz wrote:
>
>>
>> I notice legacy_timer_tick() states it must be called with interrupts
>> disabled. We do not explicitly do that in via_timer_handler(), but rely
>> on the IPL remaining high. Maybe adding local_irq_save(flags) /
>> local_irq_restore(flags) around the legacy_timer_tick() call would help
>> debug this?
>>
>
> If legacy_timer_tick() is sensitive to execution context, why not test the
> IPL with an assertion? That way you can catch that problem on Aranym as
> well as Qemu.

Done that, and so far I haven't seen the expected BUG message.

But I've now got this in ARAnyM:

BUG: spinlock recursion on CPU#0, pool_workqueue_/3
  lock: 0x435670, .magic: dead4ead, .owner: pool_workqueue_/3, .owner_cpu: 0
CPU: 0 PID: 3 Comm: pool_workqueue_ Not tainted 6.8.0-rc7-atari-fpuemu-spinlocktest+ #1693
Stack from 00835eb4:
        00835eb4 003d5726 003d5726 00000000 0000000d 003273b0 003d5726 0003a38a
        00435670 003cbec0 00840be0 000330f6 00435670 00814080 0003f5f2 0004ca60
        00002604 0000714c 00000000 00000000 0003f666 0000000d 00000000 00000000
        0032b7a6 0032b864 000186e6 00468958 004689b4 00000000 00000000 0003f716
        00468958 00000000 00468958 0003f76e 00468958 004689b4 00468958 004689b4
        00468958 00041c8e 00468958 00468958 0002dc58 00000000 0003f21c 00468958
Call Trace: [<003273b0>] dump_stack+0xc/0x10
  [<0003a38a>] do_raw_spin_lock+0x34/0x58
  [<000330f6>] scheduler_tick+0x18/0x6e
  [<0003f5f2>] __irq_wake_thread+0x0/0x40
  [<0004ca60>] update_process_times+0x5e/0x66
  [<00002604>] trap+0x10/0x28
  [<0000714c>] mfp_timer_c_handler+0x2c/0x3c
  [<0003f666>] __handle_irq_event_percpu+0x34/0xd4
  [<0032b7a6>] _raw_spin_lock+0x0/0xe
  [<0032b864>] _raw_spin_unlock+0x0/0xe
  [<000186e6>] kernel_thread+0x0/0x6c
  [<0003f716>] handle_irq_event_percpu+0x10/0x3a
  [<0003f76e>] handle_irq_event+0x2e/0x54
  [<00041c8e>] handle_simple_irq+0x56/0x66
  [<0002dc58>] kthread+0x0/0xa6
  [<0003f21c>] handle_irq_desc+0x1e/0x28
  [<00002894>] do_IRQ+0x20/0x32
  [<000027c8>] user_irqvec_fixup+0xc/0x14
  [<0032b7a6>] _raw_spin_lock+0x0/0xe
  [<0032b864>] _raw_spin_unlock+0x0/0xe
  [<000186e6>] kernel_thread+0x0/0x6c
  [<0000262c>] ret_from_kernel_thread+0x0/0x14

RCU Tasks Trace: Setting shift to 0 and lim to 1 rcu_task_cb_adjust=1.

Happens real early during the boot process.

mfp_timer_c_handler() has a local_irq_save() / local_irq_restore() pair 
around the legacy_timer_tick() invocation, so this spinlock recursion 
does appear to occur even without reentering the scheduling timer routine ...

Cheers,

	Michael


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-09 20:56                   ` Michael Schmitz
@ 2024-03-09 22:18                     ` Finn Thain
  2024-03-11  7:06                       ` Michael Schmitz
  0 siblings, 1 reply; 36+ messages in thread
From: Finn Thain @ 2024-03-09 22:18 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: Guenter Roeck, Geert Uytterhoeven, linux-m68k

On Sun, 10 Mar 2024, Michael Schmitz wrote:

> But I've now got this in ARAnyM:
> 
> BUG: spinlock recursion on CPU#0, pool_workqueue_/3
> ...

OK. I am unable to reproduce the BUG, unfortunately.

> 
> mfp_timer_c_hander() has a local_irq_save() / local_irq_restore() pair 
> around the legacy_timer_tick() invocation so this spinlock recursion 
> does appear to work even without reentering the scheduling timer routine 
> 

IIUC it is not spinlock usage that's at issue; the problem is either 
the implementation of the locking primitives or the tests that verify 
their properties.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-09 22:18                     ` Finn Thain
@ 2024-03-11  7:06                       ` Michael Schmitz
  2024-03-11  8:35                         ` Finn Thain
  0 siblings, 1 reply; 36+ messages in thread
From: Michael Schmitz @ 2024-03-11  7:06 UTC (permalink / raw)
  To: Finn Thain; +Cc: Guenter Roeck, Geert Uytterhoeven, linux-m68k

Hi Finn,

On 10.03.2024 at 11:18, Finn Thain wrote:
> On Sun, 10 Mar 2024, Michael Schmitz wrote:
>
>> But I've now got this in ARAnyM:
>>
>> BUG: spinlock recursion on CPU#0, pool_workqueue_/3
>> ...
>
> OK. I am unable to reproduce the BUG, unfortunately.

Looping over 178 boots (using init=/sbin/reboot), I see eight of the 
spinlock recursion messages in ARAnyM on my old PowerBook G4:

BUG: spinlock recursion on CPU#0, swapper/1
BUG: spinlock recursion on CPU#0, swapper/1
BUG: spinlock recursion on CPU#0, pool_workqueue_/3
BUG: spinlock recursion on CPU#0, swapper/2
BUG: spinlock recursion on CPU#0, pool_workqueue_/3
BUG: spinlock recursion on CPU#0, pool_workqueue_/3
BUG: spinlock recursion on CPU#0, swapper/2
BUG: spinlock recursion on CPU#0, pool_workqueue_/3

Trying the same on a much faster Intel system, no messages are seen. 
I'll try locking the PowerBook on half CPU clock rate next.

>> mfp_timer_c_hander() has a local_irq_save() / local_irq_restore() pair
>> around the legacy_timer_tick() invocation so this spinlock recursion
>> does appear to work even without reentering the scheduling timer routine
>>
>
> IIUC it is not spinlock usage that's at issue. IIUC the problem is either
> the implementation of the locking primitives or the tests to verify their
> properties.

The tests on unlocking certainly aren't atomic, but those are not the 
ones we see in the messages. The tests on locking use READ_ONCE() so 
they ought to be safe.

The locking primitives are not atomic at all, by design ('No atomicity 
anywhere, we are on UP'). When not debugging, spinlocks are NOPs on UP.

I wonder whether CONFIG_DEBUG_SPINLOCK was ever meant to work at all on UP?
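
For reference, this is roughly what the UP debug variants look like in 
include/linux/spinlock_up.h (paraphrased from memory, so details may 
differ): a byte store plus a compiler barrier, nothing more.

#ifdef CONFIG_DEBUG_SPINLOCK
#define arch_spin_is_locked(x)	((x)->slock == 0)

static inline void arch_spin_lock(arch_spinlock_t *lock)
{
	lock->slock = 0;
	barrier();
}

static inline int arch_spin_trylock(arch_spinlock_t *lock)
{
	char oldval = lock->slock;

	lock->slock = 0;
	barrier();

	return oldval > 0;
}

static inline void arch_spin_unlock(arch_spinlock_t *lock)
{
	barrier();
	lock->slock = 1;
}
#endif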

Cheers,

	Michael







^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-11  7:06                       ` Michael Schmitz
@ 2024-03-11  8:35                         ` Finn Thain
  2024-03-12  0:51                           ` Michael Schmitz
  0 siblings, 1 reply; 36+ messages in thread
From: Finn Thain @ 2024-03-11  8:35 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: Guenter Roeck, Geert Uytterhoeven, linux-m68k


On Mon, 11 Mar 2024, Michael Schmitz wrote:
 
> 
> Looping over 178 boots (using init=/sbin/reboot), I see eight of the 
> spinlock recursion messages in ARAnyM on my old PowerBook G4:
> 
> BUG: spinlock recursion on CPU#0, swapper/1
> BUG: spinlock recursion on CPU#0, swapper/1
> BUG: spinlock recursion on CPU#0, pool_workqueue_/3
> BUG: spinlock recursion on CPU#0, swapper/2
> BUG: spinlock recursion on CPU#0, pool_workqueue_/3
> BUG: spinlock recursion on CPU#0, pool_workqueue_/3
> BUG: spinlock recursion on CPU#0, swapper/2
> BUG: spinlock recursion on CPU#0, pool_workqueue_/3
> 

Not the reliable reproducer I was hoping for but it is progress. We now 
know the problem shows up in both Aranym and Qemu.

> Trying the same on a much faster Intel system, no messages are seen. 
> I'll try locking the PowerBook on half CPU clock rate next.
> 
> ...
> 
> The tests on unlocking certainly aren't atomic, but those are not the 
> ones we see in the messages. The test on locking use READ_ONCE() so 
> ought to be safe.
> 
> The locking primitives are not atomic at all, by design ('No atomicity 
> anywhere, we are on UP'. While not debugging, spinlocks are NOPs on UP.)
> 

I think spin_lock() reduces to preempt_disable() on UP.
In include/linux/spinlock_api_up.h it says,

/*
 * In the UP-nondebug case there's no real locking going on, so the
 * only thing we have to do is to keep the preempt counts and irq
 * flags straight, to suppress compiler warnings of unused lock
 * variables, and to add the proper checker annotations:
 */
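
For illustration, the non-debug UP macros in that header boil down to 
something like this (paraphrased, not the exact source):

/* UP, !CONFIG_DEBUG_SPINLOCK: no lock word is tested or set at all */
#define ___LOCK(lock) \
	do { __acquire(lock); (void)(lock); } while (0)

#define __LOCK(lock) \
	do { preempt_disable(); ___LOCK(lock); } while (0)

#define _raw_spin_lock(lock)	__LOCK(lock)

i.e. the "lock" is just preempt accounting plus sparse annotations.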

> I wonder whether CONFIG_DEBUG_SPINLOCK was ever meant to work at all on UP?
> 

I've no idea, sorry. The people who would be able to help would be found 
in MAINTAINERS in the "LOCKING PRIMITIVES" section.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-11  8:35                         ` Finn Thain
@ 2024-03-12  0:51                           ` Michael Schmitz
  2024-03-12  7:59                             ` Geert Uytterhoeven
  0 siblings, 1 reply; 36+ messages in thread
From: Michael Schmitz @ 2024-03-12  0:51 UTC (permalink / raw)
  To: Finn Thain; +Cc: Guenter Roeck, Geert Uytterhoeven, linux-m68k

Hi Finn,

On 11/03/24 21:35, Finn Thain wrote:
> On Mon, 11 Mar 2024, Michael Schmitz wrote:
>   
>> Looping over 178 boots (using init=/sbin/reboot), I see eight of the
>> spinlock recursion messages in ARAnyM on my old PowerBook G4:
>>
>> BUG: spinlock recursion on CPU#0, swapper/1
>> BUG: spinlock recursion on CPU#0, swapper/1
>> BUG: spinlock recursion on CPU#0, pool_workqueue_/3
>> BUG: spinlock recursion on CPU#0, swapper/2
>> BUG: spinlock recursion on CPU#0, pool_workqueue_/3
>> BUG: spinlock recursion on CPU#0, pool_workqueue_/3
>> BUG: spinlock recursion on CPU#0, swapper/2
>> BUG: spinlock recursion on CPU#0, pool_workqueue_/3
>>
> Not the reliable reproducer I was hoping for but it is progress. We now
> know the problem shows up in both Aranym and Qemu.

Yep - dropping the CPU clock (and BogoMIPS) by half has no large impact 
though. One boot in sixteen shows the recursion message now.

Still running the test with local_irq_save() / local_irq_restore() 
protecting the spinlock debug tests.
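
(Roughly along these lines - sketched here for illustration only, not 
the actual change, and with the mmiowb handling left out:)

void do_raw_spin_lock(raw_spinlock_t *lock)
{
	unsigned long flags;

	/* keep the timer interrupt out of the owner/magic checks */
	local_irq_save(flags);
	debug_spin_lock_before(lock);
	local_irq_restore(flags);

	arch_spin_lock(&lock->raw_lock);

	/* record .owner/.owner_cpu with interrupts off as well */
	local_irq_save(flags);
	debug_spin_lock_after(lock);
	local_irq_restore(flags);
}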

>
>> Trying the same on a much faster Intel system, no messages are seen.
>> I'll try locking the PowerBook on half CPU clock rate next.
>>
>> ...
>>
>> The tests on unlocking certainly aren't atomic, but those are not the
>> ones we see in the messages. The test on locking use READ_ONCE() so
>> ought to be safe.
>>
>> The locking primitives are not atomic at all, by design ('No atomicity
>> anywhere, we are on UP'. While not debugging, spinlocks are NOPs on UP.)
>>
> I think spin_lock() reduces to preempt_disable() on UP.
> In include/linux/spinlock_api_up.h it says,
>
> /*
>   * In the UP-nondebug case there's no real locking going on, so the
>   * only thing we have to do is to keep the preempt counts and irq
>   * flags straight, to suppress compiler warnings of unused lock
>   * variables, and to add the proper checker annotations:
>   */

That's only true in the debug case - there, preempt_disable() is used 
inside the spin loop. But m68k is one of the last CONFIG_PREEMPT_NONE 
archs AFAIR, and preempt_disable() reduces to barrier() on those.
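
(For completeness: with CONFIG_PREEMPT_COUNT unset - the usual 
PREEMPT_NONE configuration - the definitions in include/linux/preempt.h 
collapse to plain compiler barriers, roughly:

#define preempt_disable()		barrier()
#define preempt_enable_no_resched()	barrier()
#define preempt_enable()		barrier()

so there is no counting and no reschedule point there.)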

>> I wonder whether CONFIG_DEBUG_SPINLOCK was ever meant to work at all on UP?
>>
> I've no idea, sorry. The people who would be able to help would be found
> in MAINTAINERS in the "LOCKING PRIMITIVES" section.

I'll ask around once I've done a few more tests.

Cheers,

     Michael



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-12  0:51                           ` Michael Schmitz
@ 2024-03-12  7:59                             ` Geert Uytterhoeven
  2024-03-12 20:14                               ` Michael Schmitz
  2024-03-13  0:16                               ` Michael Schmitz
  0 siblings, 2 replies; 36+ messages in thread
From: Geert Uytterhoeven @ 2024-03-12  7:59 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: Finn Thain, Guenter Roeck, linux-m68k

On Tue, Mar 12, 2024 at 1:51 AM Michael Schmitz <schmitzmic@gmail.com> wrote:
> On 11/03/24 21:35, Finn Thain wrote:
> > I think spin_lock() reduces to preempt_disable() on UP.
> > In include/linux/spinlock_api_up.h it says,
> >
> > /*
> >   * In the UP-nondebug case there's no real locking going on, so the
> >   * only thing we have to do is to keep the preempt counts and irq
> >   * flags straight, to suppress compiler warnings of unused lock
> >   * variables, and to add the proper checker annotations:
> >   */
>
> That's only true in the debug case - there, preempt_disable() is used
> inside the spin loop. But m68k is one of the last CONFIG_PREEMPT_NONE
> archs AFAIR, and preempt_disable() reduces to barrier() on those.

M68k does have experimental preempt support. I have been running
that for the last 5 months. Works fine most of the time, except for
the one BUG[1] that happens every 10 boots or so.

[1] https://lore.kernel.org/all/CAMuHMdUQ72KOPw5vxNfhjoTR-SsaELeKneBmyQPYEWa_o5OZZA@mail.gmail.com

Gr{oetje,eeting}s,

                        Geert


--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-12  7:59                             ` Geert Uytterhoeven
@ 2024-03-12 20:14                               ` Michael Schmitz
  2024-03-13  0:16                               ` Michael Schmitz
  1 sibling, 0 replies; 36+ messages in thread
From: Michael Schmitz @ 2024-03-12 20:14 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: Finn Thain, Guenter Roeck, linux-m68k

Hi Geert,

On 12/03/24 20:59, Geert Uytterhoeven wrote:
> On Tue, Mar 12, 2024 at 1:51 AM Michael Schmitz <schmitzmic@gmail.com> wrote:
>> On 11/03/24 21:35, Finn Thain wrote:
>>> I think spin_lock() reduces to preempt_disable() on UP.
>>> In include/linux/spinlock_api_up.h it says,
>>>
>>> /*
>>>    * In the UP-nondebug case there's no real locking going on, so the
>>>    * only thing we have to do is to keep the preempt counts and irq
>>>    * flags straight, to suppress compiler warnings of unused lock
>>>    * variables, and to add the proper checker annotations:
>>>    */
>> That's only true in the debug case - there, preempt_disable() is used
>> inside the spin loop. But m68k is one of the last CONFIG_PREEMPT_NONE
>> archs AFAIR, and preempt_disable() reduces to barrier() on those.
> M68k does have experimental preempt support. I have been running
> that for the last 5 months. Works fine most of the time, except for
> the one BUG[1] that happens every 10 boots or so.
>
> [1] https://lore.kernel.org/all/CAMuHMdUQ72KOPw5vxNfhjoTR-SsaELeKneBmyQPYEWa_o5OZZA@mail.gmail.com

Thanks for reminding me - I had forgotten that I had seen that bug report.

I'll give that a spin on ARAnyM - doubt it will work on my Falcon due to 
the extra kernel size.

Cheers,

     Michael


>
> Gr{oetje,eeting}s,
>
>                          Geert
>
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                  -- Linus Torvalds

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-12  7:59                             ` Geert Uytterhoeven
  2024-03-12 20:14                               ` Michael Schmitz
@ 2024-03-13  0:16                               ` Michael Schmitz
  2024-03-13  4:39                                 ` Preemption (was: Re: spinlock recursion when running q800 emulation in qemu) Michael Schmitz
  2024-03-13  4:40                                 ` spinlock recursion when running q800 emulation in qemu Finn Thain
  1 sibling, 2 replies; 36+ messages in thread
From: Michael Schmitz @ 2024-03-13  0:16 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: Finn Thain, Guenter Roeck, linux-m68k

Hi Geert,

On 12/03/24 20:59, Geert Uytterhoeven wrote:
> On Tue, Mar 12, 2024 at 1:51 AM Michael Schmitz <schmitzmic@gmail.com> wrote:
>> On 11/03/24 21:35, Finn Thain wrote:
>>> I think spin_lock() reduces to preempt_disable() on UP.
>>> In include/linux/spinlock_api_up.h it says,
>>>
>>> /*
>>>    * In the UP-nondebug case there's no real locking going on, so the
>>>    * only thing we have to do is to keep the preempt counts and irq
>>>    * flags straight, to suppress compiler warnings of unused lock
>>>    * variables, and to add the proper checker annotations:
>>>    */
>> That's only true in the debug case - there, preempt_disable() is used
>> inside the spin loop. But m68k is one of the last CONFIG_PREEMPT_NONE
>> archs AFAIR, and preempt_disable() reduces to barrier() on those.
> M68k does have experimental preempt support. I have been running
> that for the last 5 months. Works fine most of the time, except for
> the one BUG[1] that happens every 10 boots or so.
>
> [1] https://lore.kernel.org/all/CAMuHMdUQ72KOPw5vxNfhjoTR-SsaELeKneBmyQPYEWa_o5OZZA@mail.gmail.com

NB: the Kconfig hunk no longer applies, it now should look like this:

diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
index 4b3e93cac723..0c4b5180df1d 100644
--- a/arch/m68k/Kconfig
+++ b/arch/m68k/Kconfig
@@ -10,7 +10,6 @@ config M68K
         select ARCH_HAS_SYNC_DMA_FOR_DEVICE if M68K_NONCOHERENT_DMA
         select ARCH_HAVE_NMI_SAFE_CMPXCHG if RMW_INSNS
         select ARCH_MIGHT_HAVE_PC_PARPORT if ISA
-       select ARCH_NO_PREEMPT if !COLDFIRE
         select ARCH_USE_MEMTEST if MMU_MOTOROLA
         select ARCH_WANT_IPC_PARSE_VERSION
         select BINFMT_FLAT_ARGVP_ENVP_ON_STACK

(HAS_DMA -> M68K_NONCOHERENT_DMA)

Runs fine on ARAnyM with a bit of VM stress testing using 
PREEMPT_VOLUNTARY. After a bit of looping (init=/sbin/reboot) with full 
preempt, it fails spectacularly, unable to find any root filesystem to 
mount.

Trying to boot a full system, I see the 'table already freed' panic on 
the first go.

Running a stress-ng stack-fill stressor triggered the panic, too. I'll 
see how reliable that is.

Cheers,

     Michael


>
> Gr{oetje,eeting}s,
>
>                          Geert
>
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                  -- Linus Torvalds

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Preemption (was: Re: spinlock recursion when running q800 emulation in qemu)
  2024-03-13  0:16                               ` Michael Schmitz
@ 2024-03-13  4:39                                 ` Michael Schmitz
  2024-03-13  4:40                                 ` spinlock recursion when running q800 emulation in qemu Finn Thain
  1 sibling, 0 replies; 36+ messages in thread
From: Michael Schmitz @ 2024-03-13  4:39 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: Finn Thain, Guenter Roeck, linux-m68k

Hi Geert,

Am 13.03.2024 um 13:16 schrieb Michael Schmitz:
> Hi Geert,
>
> On 12/03/24 20:59, Geert Uytterhoeven wrote:
>> On Tue, Mar 12, 2024 at 1:51 AM Michael Schmitz <schmitzmic@gmail.com>
>> wrote:
>>> On 11/03/24 21:35, Finn Thain wrote:
>>>> I think spin_lock() reduces to preempt_disable() on UP.
>>>> In include/linux/spinlock_api_up.h it says,
>>>>
>>>> /*
>>>>    * In the UP-nondebug case there's no real locking going on, so the
>>>>    * only thing we have to do is to keep the preempt counts and irq
>>>>    * flags straight, to suppress compiler warnings of unused lock
>>>>    * variables, and to add the proper checker annotations:
>>>>    */
>>> That's only true in the debug case - there, preempt_disable() is used
>>> inside the spin loop. But m68k is one of the last CONFIG_PREEMPT_NONE
>>> archs AFAIR, and preempt_disable() reduces to barrier() on those.
>> M68k does have experimental preempt support. I have been running
>> that for the last 5 months. Works fine most of the time, except for
>> the one BUG[1] that happens every 10 boots or so.
>>
>> [1]
>> https://lore.kernel.org/all/CAMuHMdUQ72KOPw5vxNfhjoTR-SsaELeKneBmyQPYEWa_o5OZZA@mail.gmail.com
>>
>
> NB: the Kconfig hunk no longer applies, it now should look like this:
>
> diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
> index 4b3e93cac723..0c4b5180df1d 100644
> --- a/arch/m68k/Kconfig
> +++ b/arch/m68k/Kconfig
> @@ -10,7 +10,6 @@ config M68K
>         select ARCH_HAS_SYNC_DMA_FOR_DEVICE if M68K_NONCOHERENT_DMA
>         select ARCH_HAVE_NMI_SAFE_CMPXCHG if RMW_INSNS
>         select ARCH_MIGHT_HAVE_PC_PARPORT if ISA
> -       select ARCH_NO_PREEMPT if !COLDFIRE
>         select ARCH_USE_MEMTEST if MMU_MOTOROLA
>         select ARCH_WANT_IPC_PARSE_VERSION
>         select BINFMT_FLAT_ARGVP_ENVP_ON_STACK
>
> (HAS_DMA -> M68K_NONCOHERENT_DMA)
>
> Runs fine on ARAnyM with a bit of VM stress testing using
> PREEMPT_VOLUNTARY. After a bit of looping (init=/sbin/reboot), fails
> spectacularly with not finding any root filesystem to mount when using
> full preempt.
>
> Trying to boot a full system, I see the 'table already freed' panic on
> the first go.
>
> Running a stack-ng stack-fill stressor triggered the panic, too. I'll
> see how reliable that is.

That's actually pretty reliable - running stress-ng --stack 2 
--stack-fill two or three times gets me a panic (on a fast Intel system).

Running full preemption on the PowerBook instance, the older one of my 
disk images boots OK (sysvinit based), the newer one (systemd IIRC) 
cheerfully throws endless Oops of this kind:

> Unable to handle kernel NULL pointer dereference at virtual address b90995fb
> Oops: 00000000
> Modules linked in:
> PC: [<000de1d6>] del_page_from_free_list+0x16/0x46
> SR: 2700  SP: 3e9c0737  a2: 00f34610
> d0: 00000000    d1: 0054491c    d2: 0000000b    d3: 00000000
> d4: 0000000f    d5: 00000000    a0: 0054491c    a1: 00495dd0
> Process vol_id (pid: 1198, task=54e40596)
> Frame format=7 eff addr=00000104 ssw=0405 faddr=00000104
> wb 1 stat/addr/data: 0000 00000000 00000000
> wb 2 stat/addr/data: 0000 00000000 00000000
> wb 3 stat/addr/data: 0085 00000104 00495dd0
> push data: 00000000 00000000 00000000 00000000
> Stack from 0482fbc4:
>         0076343c 000df4ce 0054491c 00495d78 00000000 00000000 00000801 00000000
>         00000000 00002000 0076343c 00495d78 000df8de 0482e000 0482fd06 0000001e
>         0054491c 00544920 00000000 00495f84 00002700 00000058 00000058 00495d78
>         000dfb70 00495d78 00000000 00000000 00000801 0076343c 00763464 00152c40
>         00000000 00112cc0 00000000 00000000 000b493c 0482fd06 000df974 01023764
>         0032d0e8 00dd7400 00f1fe58 00000049 64560000 00010000 0000f000 00000a00
> Call Trace: [<000df4ce>] __rmqueue_pcplist+0x198/0x316
>  [<00002000>] _start+0x0/0x8
>  [<000df8de>] zone_watermark_fast.isra.112+0x0/0x96
>  [<00002700>] resume_userspace+0xa/0x16
>  [<000dfb70>] get_page_from_freelist+0x1fc/0x632
>  [<00152c40>] ext4_ext_convert_to_initialized+0x50c/0x534
>  [<00112cc0>] simple_transaction_get+0x52/0x98
>  [<000b493c>] filemap_add_folio+0x0/0x9e
>  [<000df974>] get_page_from_freelist+0x0/0x632
>  [<0032d0e8>] xa_load+0x0/0x76
>  [<00010000>] TWOMAIN+0x7a/0x80
>  [<0000f000>] NEVEN+0xd8/0xec
>  [<000df7d2>] prepare_alloc_pages.isra.110+0x70/0x84
>  [<000e0f4e>] __alloc_pages+0xa8/0x6b2
>  [<00152c40>] ext4_ext_convert_to_initialized+0x50c/0x534
>  [<00112cc0>] simple_transaction_get+0x52/0x98
>  [<000b493c>] filemap_add_folio+0x0/0x9e
>  [<0032d0e8>] xa_load+0x0/0x76
>  [<00004ec2>] buserr_c+0x10c/0x3e6
>  [<00001000>] kernel_pg_dir+0x0/0x1000
>  [<00010000>] TWOMAIN+0x7a/0x80
>  [<0032cf2a>] xas_load+0x1e/0x60
>  [<0032cf0c>] xas_load+0x0/0x60
>  [<0032d150>] xa_load+0x68/0x76
>  [<000e1a46>] __folio_alloc+0x1c/0x2a
>  [<00152cc0>] ext4_ext_index_trans_blocks+0x0/0x3a
>  [<000bace0>] page_cache_ra_unbounded+0xb0/0x140
>  [<00112cc0>] simple_transaction_get+0x52/0x98
>  [<00010fff>] X_OPERR+0x3/0x40
>  [<000b4c24>] filemap_get_pages+0x24a/0x4e0
>  [<00008000>] nvram_proc_read+0x4c/0x31a
>  [<001000f0>] fiemap_prep+0x6c/0xc2
>  [<000f0000>] vfs_writev+0x2c/0x16e
>  [<001c0000>] bio_may_exceed_limits.isra.35+0x8/0x54
>  [<000b4f66>] filemap_read+0xac/0x30c
>  [<00010000>] TWOMAIN+0x7a/0x80
>  [<00010000>] TWOMAIN+0x7a/0x80
>  [<00010000>] TWOMAIN+0x7a/0x80
>  [<00001000>] kernel_pg_dir+0x0/0x1000
>  [<00008000>] nvram_proc_read+0x4c/0x31a
>  [<00008000>] nvram_proc_read+0x4c/0x31a
>  [<00001000>] kernel_pg_dir+0x0/0x1000
>  [<00007000>] atari_tt_hwclk+0x16c/0x2fc
>  [<00003000>] arch_ptrace+0xa6/0x24a
>  [<00001000>] kernel_pg_dir+0x0/0x1000
>  [<001b5716>] blkdev_read_iter+0x118/0x138
>  [<00010000>] TWOMAIN+0x7a/0x80
>  [<00008000>] nvram_proc_read+0x4c/0x31a
>  [<00010000>] TWOMAIN+0x7a/0x80
>  [<000f0a68>] vfs_read+0xcc/0x128
>  [<00008000>] nvram_proc_read+0x4c/0x31a
>  [<00010000>] TWOMAIN+0x7a/0x80
>  [<000f0dfc>] ksys_read+0x4a/0x86
>  [<00010000>] TWOMAIN+0x7a/0x80
>  [<00010000>] TWOMAIN+0x7a/0x80
>  [<0000fff6>] TWOMAIN+0x70/0x80
>  [<000f0e4a>] sys_read+0x12/0x18
>  [<00010000>] TWOMAIN+0x7a/0x80
>  [<0000269a>] syscall+0x8/0xc
>  [<00010000>] TWOMAIN+0x7a/0x80
>  [<00010000>] TWOMAIN+0x7a/0x80
>  [<0000c00c>] ATANTBL+0x114/0x800
>
> Code: 202f 0010 2668 0004 2268 0008 2749 0004 <228b> 217c 0000 0100 0004 217c 0000 0122 0008 0068 0080 001a 42a8 0014 2200 e589
> Disabling lock debugging due to kernel taint
> note: vol_id[1198] exited with irqs disabled
> note: vol_id[1198] exited with preempt_count 3

(killed ARAnyM after 26 of those).

Booted OK on the next boot, produced a 'table already free' panic after 
three or four runs of the stack-fill stress test.

Cheers,

	Michael


>
> Cheers,
>
>     Michael
>
>
>>
>> Gr{oetje,eeting}s,
>>
>>                          Geert
>>
>>
>> --
>> Geert Uytterhoeven -- There's lots of Linux beyond ia32 --
>> geert@linux-m68k.org
>>
>> In personal conversations with technical people, I call myself a
>> hacker. But
>> when I'm talking to journalists I just say "programmer" or something
>> like that.
>>                                  -- Linus Torvalds

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-13  0:16                               ` Michael Schmitz
  2024-03-13  4:39                                 ` Preemption (was: Re: spinlock recursion when running q800 emulation in qemu) Michael Schmitz
@ 2024-03-13  4:40                                 ` Finn Thain
  2024-03-13  5:34                                   ` Michael Schmitz
  2024-03-14  0:59                                   ` Michael Schmitz
  1 sibling, 2 replies; 36+ messages in thread
From: Finn Thain @ 2024-03-13  4:40 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: Geert Uytterhoeven, Guenter Roeck, linux-m68k

On Wed, 13 Mar 2024, Michael Schmitz wrote:

> 
> Running a stack-ng stack-fill stressor triggered the panic, too. I'll 
> see how reliable that is.
> 

I see an immediate failure with full preemption in QEMU when I run
stress-ng -t 60 --zombie -1 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-13  4:40                                 ` spinlock recursion when running q800 emulation in qemu Finn Thain
@ 2024-03-13  5:34                                   ` Michael Schmitz
  2024-03-14  0:59                                   ` Michael Schmitz
  1 sibling, 0 replies; 36+ messages in thread
From: Michael Schmitz @ 2024-03-13  5:34 UTC (permalink / raw)
  To: Finn Thain; +Cc: Geert Uytterhoeven, Guenter Roeck, linux-m68k

Hi Finn,

Am 13.03.2024 um 17:40 schrieb Finn Thain:
> On Wed, 13 Mar 2024, Michael Schmitz wrote:
>
>>
>> Running a stack-ng stack-fill stressor triggered the panic, too. I'll
>> see how reliable that is.
>>
>
> I see an immediate failure with full preemption in QEMU when I run
> stress-ng -t 60 --zombie -1

Thanks - unfortunately, I don't have that available. Will see about 
adding that in ...

Cheers,

	Michael



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-13  4:40                                 ` spinlock recursion when running q800 emulation in qemu Finn Thain
  2024-03-13  5:34                                   ` Michael Schmitz
@ 2024-03-14  0:59                                   ` Michael Schmitz
  2024-03-15  4:32                                     ` Michael Schmitz
  1 sibling, 1 reply; 36+ messages in thread
From: Michael Schmitz @ 2024-03-14  0:59 UTC (permalink / raw)
  To: Finn Thain; +Cc: Geert Uytterhoeven, Guenter Roeck, linux-m68k

Hi Finn,

On 13/03/24 17:40, Finn Thain wrote:
> On Wed, 13 Mar 2024, Michael Schmitz wrote:
>
>> Running a stack-ng stack-fill stressor triggered the panic, too. I'll
>> see how reliable that is.
>>
> I see an immediate failure with full preemption in QEMU when I run
> stress-ng -t 60 --zombie -1

Seems to work quite well.

Though what I got on the first attempt was a deadlock with full 
preemption in ARAnyM. No panic.

Time to look for a solution now we've got a nice test case.

Cheers,

     Michael




^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-14  0:59                                   ` Michael Schmitz
@ 2024-03-15  4:32                                     ` Michael Schmitz
  2024-03-15  7:24                                       ` Finn Thain
  0 siblings, 1 reply; 36+ messages in thread
From: Michael Schmitz @ 2024-03-15  4:32 UTC (permalink / raw)
  To: Finn Thain; +Cc: Geert Uytterhoeven, Guenter Roeck, linux-m68k

Hi Finn, Geert,

Am 14.03.2024 um 13:59 schrieb Michael Schmitz:
> Hi Finn,
>
> On 13/03/24 17:40, Finn Thain wrote:
>> On Wed, 13 Mar 2024, Michael Schmitz wrote:
>>
>>> Running a stack-ng stack-fill stressor triggered the panic, too. I'll
>>> see how reliable that is.
>>>
>> I see an immediate failure with full preemption in QEMU when I run
>> stress-ng -t 60 --zombie -1
>
> Seems to work quite well.
>
> Though what I got on the first attempt was a deadlock with full
> preemption in ARAnyM. No panic.
>
> Time to look for a solution now we've got a nice test case.

No luck with whatever I tried around signals, cache maintenance and mm.

The 'BUG: Bad rss-counter state' message suggests we're freeing the same 
page ranges twice, sometimes in many cases. I cannot quite see how 
preempting the kernel on interrupt return would cause this. A signal 
forcing process exit, but the process exiting before the signal is 
received, due to preemption? But skipping preemption when a signal is 
pending did not change anything in my tests...

Running out of ideas here, sorry.

Cheers,

	Michael


>
> Cheers,
>
>     Michael
>
>
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-15  4:32                                     ` Michael Schmitz
@ 2024-03-15  7:24                                       ` Finn Thain
  2024-03-18  6:24                                         ` Michael Schmitz
  0 siblings, 1 reply; 36+ messages in thread
From: Finn Thain @ 2024-03-15  7:24 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: Geert Uytterhoeven, Guenter Roeck, linux-m68k


On Fri, 15 Mar 2024, Michael Schmitz wrote:

> 
> No luck with whatever I tried around signals, cache maintenance and mm.
> 
> The 'BUG: Bad rss-counter state' message suggests we're freeing the same 
> page ranges twice, sometimes in many cases. I cannot quite see how 
> preempting the kernel on interupt return would cause this. Signal 
> forcing process exit but process exiting before signal is received due 
> to preemption? But skipping preemption when a signal is pending did not 
> change anything in my tests...
> 
> Running out of ideas here, sorry.
> 

FWIW, I found that the failure mode (with CONFIG_PREEMPT) changed 
significantly after I disabled hard irqs in do_IRQ() using the patch I 
sent on the 8th. In three stress-ng test runs, I got a soft lockup, a WARN 
from set_fc() and some CONFIG_DEBUG_LIST failures...

# /root/stress-ng -t 60 --zombie -1 
stress-ng: info:  [46] setting to a 60 second run per stressor
stress-ng: info:  [46] dispatching hogs: 1 zombie
[   29.670000] ------------[ cut here ]------------
[   29.670000] WARNING: CPU: 0 PID: 0 at arch/m68k/include/asm/processor.h:92 buserr_c+0x486/0x5ba
[   29.670000] Modules linked in:
[   29.670000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-rc7-mac-00234-g138ba0024ff0 #9
[   29.670000] Stack from 00583c20:
[   29.670000]         00583c20 0054af2c 0054af2c 00000000 006e0100 00583c40 004980b6 0054af2c
[   29.670000]         00583c54 00492ff6 00000505 0059ff94 0b46005f 00583c84 0001c73e 0053a3bc
[   29.670000]         0000005c 000061b6 00000009 00000000 00000000 00000505 00583cc8 00696ef0
[   29.670000]         008dc490 00583cbc 000061b6 0053a3bc 0000005c 00000009 00000000 00000000
[   29.670000]         006eee78 00002ff5 00000008 00000001 005b0b4c 006eee74 02d61260 00583d34
[   29.670000]         000026b4 00583cc8 006eee78 00000000 006eee78 00002ff5 00000008 005b0b4c
[   29.670000] Call Trace: [<004980b6>] dump_stack+0x10/0x16
[   29.670000]  [<00492ff6>] __warn+0x90/0xb6
[   29.670000]  [<0001c73e>] warn_slowpath_fmt+0x10e/0x1a4
[   29.670000]  [<000061b6>] buserr_c+0x486/0x5ba
[   29.670000]  [<000061b6>] buserr_c+0x486/0x5ba
[   29.670000]  [<00002ff5>] __get_wchan+0x4d/0x58
[   29.670000]  [<000026b4>] buserr+0x20/0x28
[   29.670000]  [<00002ff5>] __get_wchan+0x4d/0x58
[   29.670000]  [<000ea558>] __free_one_page+0x2ec/0x532
[   29.670000]  [<000ea8bc>] free_pcppages_bulk+0x11e/0x18a
[   29.670000]  [<00002ff5>] __get_wchan+0x4d/0x58
[   29.670000]  [<00002700>] ret_from_kernel_thread+0xc/0x14
[   29.670000]  [<000eaa86>] free_unref_page_commit+0x15e/0x314
[   29.670000]  [<00002ce0>] show_regs+0x48/0xb8
[   29.670000]  [<00002004>] _start+0x4/0x8
[   29.670000]  [<000ed452>] free_unref_page+0x12a/0x190
[   29.670000]  [<00065be4>] rcu_cblist_dequeue+0x0/0x28
[   29.670000]  [<000ed648>] __free_pages+0xb4/0x10c
[   29.670000]  [<00018340>] thread_stack_free_rcu+0x3a/0x44
[   29.670000]  [<0006436c>] rcu_core+0x2c2/0x79e
[   29.670000]  [<00054fda>] handle_irq_event+0x7a/0x90
[   29.670000]  [<00064928>] rcu_core_si+0x8/0xc
[   29.670000]  [<0049952c>] __do_softirq+0xa4/0x1f8
[   29.670000]  [<00492600>] memmove+0x8c/0xe0
[   29.670000]  [<00038196>] parse_args+0x0/0x3a6
[   29.670000]  [<000485f8>] arch_cpu_idle_exit+0x0/0x8
[   29.670000]  [<000485f0>] arch_cpu_idle_enter+0x0/0x8
[   29.670000]  [<0001fee6>] irq_exit+0x44/0x4e
[   29.670000]  [<00002986>] do_IRQ+0x32/0x6c
[   29.670000]  [<00498cf8>] cpu_idle_poll.isra.0+0x0/0x6e
[   29.670000]  [<00498d66>] default_idle_call+0x0/0x46
[   29.670000]  [<00002874>] auto_irqhandler_fixup+0x4/0xc
[   29.670000]  [<00498cf8>] cpu_idle_poll.isra.0+0x0/0x6e
[   29.670000]  [<00498d66>] default_idle_call+0x0/0x46
[   29.670000]  [<00498d88>] default_idle_call+0x22/0x46
[   29.670000]  [<00048680>] do_idle+0x6a/0xf0
[   29.670000]  [<00048616>] do_idle+0x0/0xf0
[   29.670000]  [<00036742>] find_task_by_pid_ns+0x0/0x2a
[   29.670000]  [<0005cfd4>] __rcu_read_lock+0x0/0x12
[   29.670000]  [<000489cc>] cpu_startup_entry+0x1a/0x1c
[   29.670000]  [<00063976>] __rcu_read_unlock+0x0/0x26
[   29.670000]  [<00498e4a>] kernel_init+0x0/0xfa
[   29.670000]  [<004897c0>] strcpy+0x0/0x1e
[   29.670000]  [<0049379c>] _printk+0x0/0x16
[   29.670000]  [<00489942>] strlen+0x0/0x22
[   29.670000]  [<0062d2c6>] memblock_alloc_try_nid+0x0/0x82
[   29.670000]  [<0062139a>] arch_post_acpi_subsys_init+0x0/0x8
[   29.670000]  [<00621918>] console_on_rootfs+0x0/0x60
[   29.670000]  [<00620410>] _sinittext+0x410/0xadc
[   29.670000] 
[   29.670000] ---[ end trace 0000000000000000 ]---
[   29.670000] Unable to handle kernel access at virtual address 0b46005f
[   29.670000] Oops: 00000000
[   29.670000] Modules linked in:
[   29.670000] PC: [<002951de>] __list_add_valid_or_report+0x1a/0x102
[   29.670000] SR: 2700  SP: 00583cc8  a2: 00587590
[   29.670000] d0: 000007e0    d1: 006eee78    d2: 00000000    d3: 006eee78
[   29.670000] d4: 00002ff5    d5: 00000008    a0: 005b0b4c    a1: 0b46005b
[   29.670000] Process swapper (pid: 0, task=00587590)
[   29.670000] Frame format=7 eff addr=0b46005f ssw=0505 faddr=0b46005f
[   29.670000] wb 1 stat/addr/data: 0000 00000000 00000000
[   29.670000] wb 2 stat/addr/data: 0000 00000000 00000000
[   29.670000] wb 3 stat/addr/data: 0000 0b46005f 00000000
[   29.670000] push data: 00000000 00000000 00000000 00000000
[   29.670000] Stack from 00583d30:
[   29.670000]         00000000 00583d6c 000ea558 006eee78 005b0b4c 0b46005b 007a5380 00000000
[   29.670000]         0000000f 0000000c 00000005 00000001 006eee78 007a5358 005b2090 00583dc8
[   29.670000]         000ea8bc 006eee74 00002ff5 005b0314 00000000 000000fc 00000000 00000f07
[   29.670000]         00000003 000001ca 000000e0 00000001 006e307c 007a5358 005b0314 00000011
[   29.670000]         006eee74 00000039 00002700 005b04fc 00000000 000000aa 00583e08 000eaa86
[   29.670000]         005b0314 0000004e 007a5358 00000003 00000001 00002ce0 00000000 00000000
[   29.670000] Call Trace: [<000ea558>] __free_one_page+0x2ec/0x532
[   29.670000]  [<000ea8bc>] free_pcppages_bulk+0x11e/0x18a
[   29.670000]  [<00002ff5>] __get_wchan+0x4d/0x58
[   29.670000]  [<00002700>] ret_from_kernel_thread+0xc/0x14
[   29.670000]  [<000eaa86>] free_unref_page_commit+0x15e/0x314
[   29.670000]  [<00002ce0>] show_regs+0x48/0xb8
[   29.670000]  [<00002004>] _start+0x4/0x8
[   29.670000]  [<000ed452>] free_unref_page+0x12a/0x190
[   29.670000]  [<00065be4>] rcu_cblist_dequeue+0x0/0x28
[   29.670000]  [<000ed648>] __free_pages+0xb4/0x10c
[   29.670000]  [<00018340>] thread_stack_free_rcu+0x3a/0x44
[   29.670000]  [<0006436c>] rcu_core+0x2c2/0x79e
[   29.670000]  [<00054fda>] handle_irq_event+0x7a/0x90
[   29.670000]  [<00064928>] rcu_core_si+0x8/0xc
[   29.670000]  [<0049952c>] __do_softirq+0xa4/0x1f8
[   29.670000]  [<00492600>] memmove+0x8c/0xe0
[   29.670000]  [<00038196>] parse_args+0x0/0x3a6
[   29.670000]  [<000485f8>] arch_cpu_idle_exit+0x0/0x8
[   29.670000]  [<000485f0>] arch_cpu_idle_enter+0x0/0x8
[   29.670000]  [<0001fee6>] irq_exit+0x44/0x4e
[   29.670000]  [<00002986>] do_IRQ+0x32/0x6c
[   29.670000]  [<00498cf8>] cpu_idle_poll.isra.0+0x0/0x6e
[   29.670000]  [<00498d66>] default_idle_call+0x0/0x46
[   29.670000]  [<00002874>] auto_irqhandler_fixup+0x4/0xc
[   29.670000]  [<00498cf8>] cpu_idle_poll.isra.0+0x0/0x6e
[   29.670000]  [<00498d66>] default_idle_call+0x0/0x46
[   29.670000]  [<00498d88>] default_idle_call+0x22/0x46
[   29.670000]  [<00048680>] do_idle+0x6a/0xf0
[   29.670000]  [<00048616>] do_idle+0x0/0xf0
[   29.670000]  [<00036742>] find_task_by_pid_ns+0x0/0x2a
[   29.670000]  [<0005cfd4>] __rcu_read_lock+0x0/0x12
[   29.670000]  [<000489cc>] cpu_startup_entry+0x1a/0x1c
[   29.670000]  [<00063976>] __rcu_read_unlock+0x0/0x26
[   29.670000]  [<00498e4a>] kernel_init+0x0/0xfa
[   29.670000]  [<004897c0>] strcpy+0x0/0x1e
[   29.670000]  [<0049379c>] _printk+0x0/0x16
[   29.670000]  [<00489942>] strlen+0x0/0x22
[   29.670000]  [<0062d2c6>] memblock_alloc_try_nid+0x0/0x82
[   29.670000]  [<0062139a>] arch_post_acpi_subsys_init+0x0/0x8
[   29.670000]  [<00621918>] console_on_rootfs+0x0/0x60
[   29.670000]  [<00620410>] _sinittext+0x410/0xadc
[   29.670000] 
[   29.670000] Code: 206e 000c 226e 0010 4a88 6756 4a89 6774 <2029> 0004 b1c0 6600 008e 2410 b3c2 6600 00ae b288 670e 7001 b289 6708 242e fffc
[   29.670000] Disabling lock debugging due to kernel taint
[   29.670000] Kernel panic - not syncing: Aiee, killing interrupt handler!
[   29.670000] ---[ end Kernel panic - not syncing: Aiee, killing interrupt handler! ]---

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-15  7:24                                       ` Finn Thain
@ 2024-03-18  6:24                                         ` Michael Schmitz
  2024-03-18  9:31                                           ` Finn Thain
  0 siblings, 1 reply; 36+ messages in thread
From: Michael Schmitz @ 2024-03-18  6:24 UTC (permalink / raw)
  To: Finn Thain; +Cc: Geert Uytterhoeven, Guenter Roeck, linux-m68k

Hi Finn,

Am 15.03.2024 um 20:24 schrieb Finn Thain:
>
> On Fri, 15 Mar 2024, Michael Schmitz wrote:
>
>>
>> No luck with whatever I tried around signals, cache maintenance and mm.
>>
>> The 'BUG: Bad rss-counter state' message suggests we're freeing the same
>> page ranges twice, sometimes in many cases. I cannot quite see how
>> preempting the kernel on interupt return would cause this. Signal
>> forcing process exit but process exiting before signal is received due
>> to preemption? But skipping preemption when a signal is pending did not
>> change anything in my tests...
>>
>> Running out of ideas here, sorry.
>>
>
> FWIW, I found that the failure mode (with CONFIG_PREEMPT) changed
> significantly after I disabled hard irqs in do_IRQ() using the patch I
> sent on the 8th. In three stress-ng test runs, I got a soft lockup, a WARN
> from set_fc() and some CONFIG_DEBUG_LIST failures...

Yes, I do see that with your patch, too. I still see the old 'table 
already free' bug, though.

As far as I can see, the set_fc warning is from access_error040 and is 
part of the access error exception that is taken in interrupt context.

The question is basically - why is __free_one_page() called from 
interrupt context? Did that also happen before Geert's preemption patch?

preempt_schedule_irq() is currently called in both the interrupt and 
exception return code paths. Maybe it should only be called on exception 
(i.e. syscall) return?
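
(For context, the condition tested at those return points amounts to 
something like the sketch below - written as C for readability; the real 
code is m68k assembly in arch/m68k/kernel/entry.S, and 
returning_to_kernel_mode() is a made-up name for the saved-SR 
supervisor-bit check:)

	/* on return from interrupt/exception, with interrupts disabled */
	if (returning_to_kernel_mode(regs) &&	/* hypothetical helper */
	    preempt_count() == 0 &&
	    test_thread_flag(TIF_NEED_RESCHED))
		preempt_schedule_irq();	/* re-enables IRQs around __schedule() */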

Cheers,

	Michael

> # /root/stress-ng -t 60 --zombie -1
> stress-ng: info:  [46] setting to a 60 second run per stressor
> stress-ng: info:  [46] dispatching hogs: 1 zombie
> [   29.670000] ------------[ cut here ]------------
> [   29.670000] WARNING: CPU: 0 PID: 0 at arch/m68k/include/asm/processor.h:92 buserr_c+0x486/0x5ba
> [   29.670000] Modules linked in:
> [   29.670000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-rc7-mac-00234-g138ba0024ff0 #9
> [   29.670000] Stack from 00583c20:
> [   29.670000]         00583c20 0054af2c 0054af2c 00000000 006e0100 00583c40 004980b6 0054af2c
> [   29.670000]         00583c54 00492ff6 00000505 0059ff94 0b46005f 00583c84 0001c73e 0053a3bc
> [   29.670000]         0000005c 000061b6 00000009 00000000 00000000 00000505 00583cc8 00696ef0
> [   29.670000]         008dc490 00583cbc 000061b6 0053a3bc 0000005c 00000009 00000000 00000000
> [   29.670000]         006eee78 00002ff5 00000008 00000001 005b0b4c 006eee74 02d61260 00583d34
> [   29.670000]         000026b4 00583cc8 006eee78 00000000 006eee78 00002ff5 00000008 005b0b4c
> [   29.670000] Call Trace: [<004980b6>] dump_stack+0x10/0x16
> [   29.670000]  [<00492ff6>] __warn+0x90/0xb6
> [   29.670000]  [<0001c73e>] warn_slowpath_fmt+0x10e/0x1a4
> [   29.670000]  [<000061b6>] buserr_c+0x486/0x5ba
> [   29.670000]  [<000061b6>] buserr_c+0x486/0x5ba
> [   29.670000]  [<00002ff5>] __get_wchan+0x4d/0x58
> [   29.670000]  [<000026b4>] buserr+0x20/0x28
> [   29.670000]  [<00002ff5>] __get_wchan+0x4d/0x58
> [   29.670000]  [<000ea558>] __free_one_page+0x2ec/0x532
> [   29.670000]  [<000ea8bc>] free_pcppages_bulk+0x11e/0x18a
> [   29.670000]  [<00002ff5>] __get_wchan+0x4d/0x58
> [   29.670000]  [<00002700>] ret_from_kernel_thread+0xc/0x14
> [   29.670000]  [<000eaa86>] free_unref_page_commit+0x15e/0x314
> [   29.670000]  [<00002ce0>] show_regs+0x48/0xb8
> [   29.670000]  [<00002004>] _start+0x4/0x8
> [   29.670000]  [<000ed452>] free_unref_page+0x12a/0x190
> [   29.670000]  [<00065be4>] rcu_cblist_dequeue+0x0/0x28
> [   29.670000]  [<000ed648>] __free_pages+0xb4/0x10c
> [   29.670000]  [<00018340>] thread_stack_free_rcu+0x3a/0x44
> [   29.670000]  [<0006436c>] rcu_core+0x2c2/0x79e
> [   29.670000]  [<00054fda>] handle_irq_event+0x7a/0x90
> [   29.670000]  [<00064928>] rcu_core_si+0x8/0xc
> [   29.670000]  [<0049952c>] __do_softirq+0xa4/0x1f8
> [   29.670000]  [<00492600>] memmove+0x8c/0xe0
> [   29.670000]  [<00038196>] parse_args+0x0/0x3a6
> [   29.670000]  [<000485f8>] arch_cpu_idle_exit+0x0/0x8
> [   29.670000]  [<000485f0>] arch_cpu_idle_enter+0x0/0x8
> [   29.670000]  [<0001fee6>] irq_exit+0x44/0x4e
> [   29.670000]  [<00002986>] do_IRQ+0x32/0x6c
> [   29.670000]  [<00498cf8>] cpu_idle_poll.isra.0+0x0/0x6e
> [   29.670000]  [<00498d66>] default_idle_call+0x0/0x46
> [   29.670000]  [<00002874>] auto_irqhandler_fixup+0x4/0xc
> [   29.670000]  [<00498cf8>] cpu_idle_poll.isra.0+0x0/0x6e
> [   29.670000]  [<00498d66>] default_idle_call+0x0/0x46
> [   29.670000]  [<00498d88>] default_idle_call+0x22/0x46
> [   29.670000]  [<00048680>] do_idle+0x6a/0xf0
> [   29.670000]  [<00048616>] do_idle+0x0/0xf0
> [   29.670000]  [<00036742>] find_task_by_pid_ns+0x0/0x2a
> [   29.670000]  [<0005cfd4>] __rcu_read_lock+0x0/0x12
> [   29.670000]  [<000489cc>] cpu_startup_entry+0x1a/0x1c
> [   29.670000]  [<00063976>] __rcu_read_unlock+0x0/0x26
> [   29.670000]  [<00498e4a>] kernel_init+0x0/0xfa
> [   29.670000]  [<004897c0>] strcpy+0x0/0x1e
> [   29.670000]  [<0049379c>] _printk+0x0/0x16
> [   29.670000]  [<00489942>] strlen+0x0/0x22
> [   29.670000]  [<0062d2c6>] memblock_alloc_try_nid+0x0/0x82
> [   29.670000]  [<0062139a>] arch_post_acpi_subsys_init+0x0/0x8
> [   29.670000]  [<00621918>] console_on_rootfs+0x0/0x60
> [   29.670000]  [<00620410>] _sinittext+0x410/0xadc
> [   29.670000]
> [   29.670000] ---[ end trace 0000000000000000 ]---
> [   29.670000] Unable to handle kernel access at virtual address 0b46005f
> [   29.670000] Oops: 00000000
> [   29.670000] Modules linked in:
> [   29.670000] PC: [<002951de>] __list_add_valid_or_report+0x1a/0x102
> [   29.670000] SR: 2700  SP: 00583cc8  a2: 00587590
> [   29.670000] d0: 000007e0    d1: 006eee78    d2: 00000000    d3: 006eee78
> [   29.670000] d4: 00002ff5    d5: 00000008    a0: 005b0b4c    a1: 0b46005b
> [   29.670000] Process swapper (pid: 0, task=00587590)
> [   29.670000] Frame format=7 eff addr=0b46005f ssw=0505 faddr=0b46005f
> [   29.670000] wb 1 stat/addr/data: 0000 00000000 00000000
> [   29.670000] wb 2 stat/addr/data: 0000 00000000 00000000
> [   29.670000] wb 3 stat/addr/data: 0000 0b46005f 00000000
> [   29.670000] push data: 00000000 00000000 00000000 00000000
> [   29.670000] Stack from 00583d30:
> [   29.670000]         00000000 00583d6c 000ea558 006eee78 005b0b4c 0b46005b 007a5380 00000000
> [   29.670000]         0000000f 0000000c 00000005 00000001 006eee78 007a5358 005b2090 00583dc8
> [   29.670000]         000ea8bc 006eee74 00002ff5 005b0314 00000000 000000fc 00000000 00000f07
> [   29.670000]         00000003 000001ca 000000e0 00000001 006e307c 007a5358 005b0314 00000011
> [   29.670000]         006eee74 00000039 00002700 005b04fc 00000000 000000aa 00583e08 000eaa86
> [   29.670000]         005b0314 0000004e 007a5358 00000003 00000001 00002ce0 00000000 00000000
> [   29.670000] Call Trace: [<000ea558>] __free_one_page+0x2ec/0x532
> [   29.670000]  [<000ea8bc>] free_pcppages_bulk+0x11e/0x18a
> [   29.670000]  [<00002ff5>] __get_wchan+0x4d/0x58
> [   29.670000]  [<00002700>] ret_from_kernel_thread+0xc/0x14
> [   29.670000]  [<000eaa86>] free_unref_page_commit+0x15e/0x314
> [   29.670000]  [<00002ce0>] show_regs+0x48/0xb8
> [   29.670000]  [<00002004>] _start+0x4/0x8
> [   29.670000]  [<000ed452>] free_unref_page+0x12a/0x190
> [   29.670000]  [<00065be4>] rcu_cblist_dequeue+0x0/0x28
> [   29.670000]  [<000ed648>] __free_pages+0xb4/0x10c
> [   29.670000]  [<00018340>] thread_stack_free_rcu+0x3a/0x44
> [   29.670000]  [<0006436c>] rcu_core+0x2c2/0x79e
> [   29.670000]  [<00054fda>] handle_irq_event+0x7a/0x90
> [   29.670000]  [<00064928>] rcu_core_si+0x8/0xc
> [   29.670000]  [<0049952c>] __do_softirq+0xa4/0x1f8
> [   29.670000]  [<00492600>] memmove+0x8c/0xe0
> [   29.670000]  [<00038196>] parse_args+0x0/0x3a6
> [   29.670000]  [<000485f8>] arch_cpu_idle_exit+0x0/0x8
> [   29.670000]  [<000485f0>] arch_cpu_idle_enter+0x0/0x8
> [   29.670000]  [<0001fee6>] irq_exit+0x44/0x4e
> [   29.670000]  [<00002986>] do_IRQ+0x32/0x6c
> [   29.670000]  [<00498cf8>] cpu_idle_poll.isra.0+0x0/0x6e
> [   29.670000]  [<00498d66>] default_idle_call+0x0/0x46
> [   29.670000]  [<00002874>] auto_irqhandler_fixup+0x4/0xc
> [   29.670000]  [<00498cf8>] cpu_idle_poll.isra.0+0x0/0x6e
> [   29.670000]  [<00498d66>] default_idle_call+0x0/0x46
> [   29.670000]  [<00498d88>] default_idle_call+0x22/0x46
> [   29.670000]  [<00048680>] do_idle+0x6a/0xf0
> [   29.670000]  [<00048616>] do_idle+0x0/0xf0
> [   29.670000]  [<00036742>] find_task_by_pid_ns+0x0/0x2a
> [   29.670000]  [<0005cfd4>] __rcu_read_lock+0x0/0x12
> [   29.670000]  [<000489cc>] cpu_startup_entry+0x1a/0x1c
> [   29.670000]  [<00063976>] __rcu_read_unlock+0x0/0x26
> [   29.670000]  [<00498e4a>] kernel_init+0x0/0xfa
> [   29.670000]  [<004897c0>] strcpy+0x0/0x1e
> [   29.670000]  [<0049379c>] _printk+0x0/0x16
> [   29.670000]  [<00489942>] strlen+0x0/0x22
> [   29.670000]  [<0062d2c6>] memblock_alloc_try_nid+0x0/0x82
> [   29.670000]  [<0062139a>] arch_post_acpi_subsys_init+0x0/0x8
> [   29.670000]  [<00621918>] console_on_rootfs+0x0/0x60
> [   29.670000]  [<00620410>] _sinittext+0x410/0xadc
> [   29.670000]
> [   29.670000] Code: 206e 000c 226e 0010 4a88 6756 4a89 6774 <2029> 0004 b1c0 6600 008e 2410 b3c2 6600 00ae b288 670e 7001 b289 6708 242e fffc
> [   29.670000] Disabling lock debugging due to kernel taint
> [   29.670000] Kernel panic - not syncing: Aiee, killing interrupt handler!
> [   29.670000] ---[ end Kernel panic - not syncing: Aiee, killing interrupt handler! ]---
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-18  6:24                                         ` Michael Schmitz
@ 2024-03-18  9:31                                           ` Finn Thain
  2024-03-20  1:00                                             ` Michael Schmitz
  0 siblings, 1 reply; 36+ messages in thread
From: Finn Thain @ 2024-03-18  9:31 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: Geert Uytterhoeven, Guenter Roeck, linux-m68k

On Mon, 18 Mar 2024, Michael Schmitz wrote:

> Am 15.03.2024 um 20:24 schrieb Finn Thain:
> >
> > On Fri, 15 Mar 2024, Michael Schmitz wrote:
> >
> >>
> >> No luck with whatever I tried around signals, cache maintenance and 
> >> mm.
> >>
> >> The 'BUG: Bad rss-counter state' message suggests we're freeing the 
> >> same page ranges twice, sometimes in many cases. I cannot quite see 
> >> how preempting the kernel on interupt return would cause this. Signal 
> >> forcing process exit but process exiting before signal is received 
> >> due to preemption? But skipping preemption when a signal is pending 
> >> did not change anything in my tests...
> >>
> >> Running out of ideas here, sorry.
> >>
> >
> > FWIW, I found that the failure mode (with CONFIG_PREEMPT) changed 
> > significantly after I disabled hard irqs in do_IRQ() using the patch I 
> > sent on the 8th. In three stress-ng test runs, I got a soft lockup, a 
> > WARN from set_fc() and some CONFIG_DEBUG_LIST failures...
> 
> Yes, I do see that with your patch, too. I still see the old 'table 
> already free' bug, though.
> 
> As far as I can see, the set_fc warning is from access_error040 and is 
> part of the access error exception that is taken in interrupt context.
> 
> The question is basically - why is __free_one_page() called from 
> interrupt context? Did that also happen before Geert's preemption patch?
> 

I did see that set_fc() warning during the mmap stress testing I did a few 
years ago. The example below comes from 5.18.0-rc7-mac-00006-g210e04ff7681 
but a lot has changed since then and it may not be relevant. I stopped 
doing those tests when Al Viro fixed the bug I was chasing. When I get 
time I shall fire up a Quadra and try again with v6.8.

stress-ng: info:  [116] dispatching hogs: 1 mmap
[ 1673.480000] ------------[ cut here ]------------
[ 1673.480000] WARNING: CPU: 0 PID: 159 at ./arch/m68k/include/asm/processor.h:91 buserr_c+0x59a/0x99a
[ 1673.480000] Modules linked in:
[ 1673.480000] CPU: 0 PID: 159 Comm:  Not tainted 5.18.0-rc7-mac-00006-g210e04ff7681 #2
[ 1673.480000] Stack from 00a13dec:
[ 1673.480000]         00a13dec 0046b224 0046b224 00000000 00a13e08 003d7e16 0046b224 00a13e1c
[ 1673.480000]         0001c1b4 00000000 00a13e94 b6db6eaa 00a13e48 0001c240 00461323 0000005b
[ 1673.480000]         0000678c 00000009 00000000 00000000 00000505 b6db6db6 db6db6db 00a13e88
[ 1673.480000]         0000678c 00461323 0000005b 00000009 00000000 00000000 00989680 00000004
[ 1673.480000]         003d6a82 0000000c 003dbb98 00a1f780 004b0c0c 000496dc 00077359 00a13f0c
[ 1673.480000]         00002bcc 00a13e94 00010000 00000000 00989680 00000004 003d6a82 b6db6db6
[ 1673.480000] Call Trace: [<003d7e16>] dump_stack+0x10/0x16
[ 1673.480000]  [<0001c1b4>] __warn+0xc6/0xe8
[ 1673.480000]  [<0001c240>] warn_slowpath_fmt+0x6a/0x76
[ 1673.480000]  [<0000678c>] buserr_c+0x59a/0x99a
[ 1673.480000]  [<0000678c>] buserr_c+0x59a/0x99a
[ 1673.480000]  [<003d6a82>] _printk+0x0/0x16
[ 1673.480000]  [<003dbb98>] down_read+0x0/0xdc
[ 1673.480000]  [<000496dc>] __irq_wake_thread+0x0/0x44
[ 1673.480000]  [<00077359>] ___bpf_prog_run+0x18b/0x20e4
[ 1673.480000]  [<00002bcc>] buserr+0x20/0x28
[ 1673.480000]  [<00010000>] LP1CONT1+0x4a/0x7c
[ 1673.480000]  [<003d6a82>] _printk+0x0/0x16
[ 1673.480000]  [<00050005>] dma_coherent_ok+0x1d/0xb8
[ 1673.480000]  [<00012704>] tblpre+0x594/0x700
[ 1673.480000]  [<0001c1d6>] warn_slowpath_fmt+0x0/0x76
[ 1673.480000]  [<00040e08>] account_system_time+0x74/0xca
[ 1673.480000]  [<0004113e>] account_process_tick+0x30/0xb0
[ 1673.480000]  [<00010000>] LP1CONT1+0x4a/0x7c
[ 1673.480000]  [<00053a6e>] update_process_times+0x36/0xae
[ 1673.480000]  [<00060bdc>] legacy_timer_tick+0x64/0x6c
[ 1673.480000]  [<00008fa4>] via_timer_handler+0x1e/0x24
[ 1673.480000]  [<00049756>] __handle_irq_event_percpu+0x36/0xd8
[ 1673.480000]  [<00002600>] name_to_dev_t+0x1a4/0x3f8
[ 1673.480000]  [<003d9d40>] yield_to+0x88/0x8c
[ 1673.480000]  [<0004980c>] handle_irq_event_percpu+0x14/0x52
[ 1673.480000]  [<0004986c>] handle_irq_event+0x22/0x36
[ 1673.480000]  [<0004cf1a>] handle_simple_irq+0x4e/0x7c
[ 1673.480000]  [<00048f3e>] generic_handle_irq+0x3c/0x4a
[ 1673.480000]  [<00008e3c>] via1_irq+0x7e/0x96
[ 1673.480000]
[ 1673.480000] ---[ end trace 0000000000000000 ]---

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: spinlock recursion when running q800 emulation in qemu
  2024-03-18  9:31                                           ` Finn Thain
@ 2024-03-20  1:00                                             ` Michael Schmitz
  0 siblings, 0 replies; 36+ messages in thread
From: Michael Schmitz @ 2024-03-20  1:00 UTC (permalink / raw)
  To: Finn Thain; +Cc: Geert Uytterhoeven, Guenter Roeck, linux-m68k

Hi Finn,

On 18/03/24 22:31, Finn Thain wrote:
> On Mon, 18 Mar 2024, Michael Schmitz wrote:
>
>> Am 15.03.2024 um 20:24 schrieb Finn Thain:
>>> On Fri, 15 Mar 2024, Michael Schmitz wrote:
>>>
>>>> No luck with whatever I tried around signals, cache maintenance and
>>>> mm.
>>>>
>>>> The 'BUG: Bad rss-counter state' message suggests we're freeing the
>>>> same page ranges twice, sometimes in many cases. I cannot quite see
>>>> how preempting the kernel on interupt return would cause this. Signal
>>>> forcing process exit but process exiting before signal is received
>>>> due to preemption? But skipping preemption when a signal is pending
>>>> did not change anything in my tests...
>>>>
>>>> Running out of ideas here, sorry.
>>>>
>>> FWIW, I found that the failure mode (with CONFIG_PREEMPT) changed
>>> significantly after I disabled hard irqs in do_IRQ() using the patch I
>>> sent on the 8th. In three stress-ng test runs, I got a soft lockup, a
>>> WARN from set_fc() and some CONFIG_DEBUG_LIST failures...
>> Yes, I do see that with your patch, too. I still see the old 'table
>> already free' bug, though.
>>
>> As far as I can see, the set_fc warning is from access_error040 and is
>> part of the access error exception that is taken in interrupt context.
>>
>> The question is basically - why is __free_one_page() called from
>> interrupt context? Did that also happen before Geert's preemption patch?
It's actually not called in hardirq context here, so that might be OK.
>>
> I did see that set_fc() warning during the mmap stress testing I did a few
> years ago. The example below comes from 5.18.0-rc7-mac-00006-g210e04ff7681
> but a lot has changed since then and it may not be relevant. I stopped
> doing those tests when Al Viro fixed the bug I was chasing. When I get
> time I shall fire up a Quadra and try again with v6.8.

That may not be necessary - this warning is not followed by a kernel bus 
error oops, so I suspect this was due to a legitimate page fault taken 
in softirq context, caused by memory pressure.

Unless kernel rules state we must not take page faults during softirq 
handling?

Cheers,

     Michael

> stress-ng: info:  [116] dispatching hogs: 1 mmap
> [ 1673.480000] ------------[ cut here ]------------
> [ 1673.480000] WARNING: CPU: 0 PID: 159 at ./arch/m68k/include/asm/processor.h:91 buserr_c+0x59a/0x99a
> [ 1673.480000] Modules linked in:
> [ 1673.480000] CPU: 0 PID: 159 Comm:  Not tainted 5.18.0-rc7-mac-00006-g210e04ff7681 #2
> [ 1673.480000] Stack from 00a13dec:
> [ 1673.480000]         00a13dec 0046b224 0046b224 00000000 00a13e08 003d7e16 0046b224 00a13e1c
> [ 1673.480000]         0001c1b4 00000000 00a13e94 b6db6eaa 00a13e48 0001c240 00461323 0000005b
> [ 1673.480000]         0000678c 00000009 00000000 00000000 00000505 b6db6db6 db6db6db 00a13e88
> [ 1673.480000]         0000678c 00461323 0000005b 00000009 00000000 00000000 00989680 00000004
> [ 1673.480000]         003d6a82 0000000c 003dbb98 00a1f780 004b0c0c 000496dc 00077359 00a13f0c
> [ 1673.480000]         00002bcc 00a13e94 00010000 00000000 00989680 00000004 003d6a82 b6db6db6
> [ 1673.480000] Call Trace: [<003d7e16>] dump_stack+0x10/0x16
> [ 1673.480000]  [<0001c1b4>] __warn+0xc6/0xe8
> [ 1673.480000]  [<0001c240>] warn_slowpath_fmt+0x6a/0x76
> [ 1673.480000]  [<0000678c>] buserr_c+0x59a/0x99a
> [ 1673.480000]  [<0000678c>] buserr_c+0x59a/0x99a
> [ 1673.480000]  [<003d6a82>] _printk+0x0/0x16
> [ 1673.480000]  [<003dbb98>] down_read+0x0/0xdc
> [ 1673.480000]  [<000496dc>] __irq_wake_thread+0x0/0x44
> [ 1673.480000]  [<00077359>] ___bpf_prog_run+0x18b/0x20e4
> [ 1673.480000]  [<00002bcc>] buserr+0x20/0x28
> [ 1673.480000]  [<00010000>] LP1CONT1+0x4a/0x7c
> [ 1673.480000]  [<003d6a82>] _printk+0x0/0x16
> [ 1673.480000]  [<00050005>] dma_coherent_ok+0x1d/0xb8
> [ 1673.480000]  [<00012704>] tblpre+0x594/0x700
> [ 1673.480000]  [<0001c1d6>] warn_slowpath_fmt+0x0/0x76
> [ 1673.480000]  [<00040e08>] account_system_time+0x74/0xca
> [ 1673.480000]  [<0004113e>] account_process_tick+0x30/0xb0
> [ 1673.480000]  [<00010000>] LP1CONT1+0x4a/0x7c
> [ 1673.480000]  [<00053a6e>] update_process_times+0x36/0xae
> [ 1673.480000]  [<00060bdc>] legacy_timer_tick+0x64/0x6c
> [ 1673.480000]  [<00008fa4>] via_timer_handler+0x1e/0x24
> [ 1673.480000]  [<00049756>] __handle_irq_event_percpu+0x36/0xd8
> [ 1673.480000]  [<00002600>] name_to_dev_t+0x1a4/0x3f8
> [ 1673.480000]  [<003d9d40>] yield_to+0x88/0x8c
> [ 1673.480000]  [<0004980c>] handle_irq_event_percpu+0x14/0x52
> [ 1673.480000]  [<0004986c>] handle_irq_event+0x22/0x36
> [ 1673.480000]  [<0004cf1a>] handle_simple_irq+0x4e/0x7c
> [ 1673.480000]  [<00048f3e>] generic_handle_irq+0x3c/0x4a
> [ 1673.480000]  [<00008e3c>] via1_irq+0x7e/0x96
> [ 1673.480000]
> [ 1673.480000] ---[ end trace 0000000000000000 ]---

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2024-03-20  1:00 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-04 17:58 spinlock recursion when running q800 emulation in qemu Guenter Roeck
2024-03-05  0:33 ` Finn Thain
2024-03-05  0:48   ` Michael Schmitz
     [not found]   ` <fcb506f2-523d-4efc-ae3d-fe3c79c6f09e@gmail.com>
2024-03-05  0:58     ` Guenter Roeck
2024-03-05  1:06       ` Michael Schmitz
2024-03-06  7:14 ` Michael Schmitz
2024-03-06  8:30   ` Brad Boyer
2024-03-06 23:13     ` Finn Thain
2024-03-06 23:46       ` Guenter Roeck
2024-03-07 23:35         ` Finn Thain
2024-03-06 23:42     ` Michael Schmitz
2024-03-06 23:52   ` Finn Thain
2024-03-08  0:20     ` Michael Schmitz
2024-03-08  0:56       ` Finn Thain
2024-03-08  8:06         ` Michael Schmitz
2024-03-08  9:15           ` Finn Thain
2024-03-08  9:33             ` Finn Thain
2024-03-08 20:14               ` Michael Schmitz
2024-03-09  5:02                 ` Finn Thain
2024-03-09 20:56                   ` Michael Schmitz
2024-03-09 22:18                     ` Finn Thain
2024-03-11  7:06                       ` Michael Schmitz
2024-03-11  8:35                         ` Finn Thain
2024-03-12  0:51                           ` Michael Schmitz
2024-03-12  7:59                             ` Geert Uytterhoeven
2024-03-12 20:14                               ` Michael Schmitz
2024-03-13  0:16                               ` Michael Schmitz
2024-03-13  4:39                                 ` Preemption (was: Re: spinlock recursion when running q800 emulation in qemu) Michael Schmitz
2024-03-13  4:40                                 ` spinlock recursion when running q800 emulation in qemu Finn Thain
2024-03-13  5:34                                   ` Michael Schmitz
2024-03-14  0:59                                   ` Michael Schmitz
2024-03-15  4:32                                     ` Michael Schmitz
2024-03-15  7:24                                       ` Finn Thain
2024-03-18  6:24                                         ` Michael Schmitz
2024-03-18  9:31                                           ` Finn Thain
2024-03-20  1:00                                             ` Michael Schmitz
