From: Finn Thain <fthain@linux-m68k.org>
To: Michael Schmitz <schmitzmic@gmail.com>
Cc: linux-m68k@vger.kernel.org
Subject: Re: Mainline kernel crashes, was Re: RFC: remove set_fs for m68k
Date: Mon, 13 Sep 2021 11:27:55 +1000 (AEST)
Message-ID: <d59a44c-ddea-a774-d217-2484ad582dc0@linux-m68k.org>
In-Reply-To: <2c624213-6a4-799c-45e-a1be578dd5f@linux-m68k.org>

On Sun, 12 Sep 2021, Finn Thain wrote:

> ... I've now done as you did, that is,
> 
> diff --git a/arch/m68k/kernel/irq.c b/arch/m68k/kernel/irq.c
> index 9ab4f550342e..b46d8a57f4da 100644
> --- a/arch/m68k/kernel/irq.c
> +++ b/arch/m68k/kernel/irq.c
> @@ -20,10 +20,13 @@
>  asmlinkage void do_IRQ(int irq, struct pt_regs *regs)
>  {
>  	struct pt_regs *oldregs = set_irq_regs(regs);
> +	unsigned long flags;
>  
> +	local_irq_save(flags);
>  	irq_enter();
>  	generic_handle_irq(irq);
>  	irq_exit();
> +	local_irq_restore(flags);
>  
>  	set_irq_regs(oldregs);
>  }
> 
> There may be a better way to achieve that. If the final IPL can be found 
> in regs then it doesn't need to be saved again.
> 
> I haven't looked for a possible entropy pool improvement from correct 
> locking in random.c -- it would not surprise me if there was one.
> 
> But none of this explains the panics I saw so I went looking for potential 
> race conditions in the irq_enter_rcu() and irq_exit_rcu() code. I haven't 
> found the bug yet.
> 
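
On the question of whether the final IPL can be found in regs: on m68k the
interrupt priority mask occupies bits 8-10 of the status register, and the
supervisor flag is bit 13. The following is only a minimal sketch, assuming
the saved SR word from the exception frame is available as a 16-bit value.
The helper names sr_ipl() and sr_supervisor() are hypothetical and do not
exist in the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: extract the interrupt priority mask (bits 8-10). */
static unsigned int sr_ipl(uint16_t sr)
{
	return (sr >> 8) & 0x7;
}

/* Hypothetical helper: true if the supervisor bit (bit 13) is set. */
static bool sr_supervisor(uint16_t sr)
{
	return (sr & 0x2000) != 0;
}
```

With these, an SR value of 0x2000 decodes as supervisor mode with IPL 0,
and 0x2700 as supervisor mode with all maskable interrupts blocked.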

Turns out that the panic bug was not affected by that patch...

running --mmap -1 --mmap-odirect --mmap-bytes 100% -t 60 --timestamp --no-rand-seed --times
stress-ng: 17:06:09.62 info:  [1241] setting to a 60 second run per stressor
stress-ng: 17:06:09.62 info:  [1241] dispatching hogs: 1 mmap
[  807.270000] Kernel panic - not syncing: Aiee, killing interrupt handler!
[  807.270000] CPU: 0 PID: 1243 Comm: stress-ng Not tainted 5.14.0-multi-00002-g69f953866c7e #2
[  807.270000] Stack from 00bcbde4:
[  807.270000]         00bcbde4 00488d85 00488d85 000c0000 00bcbe00 003f3708 00488d85 00bcbe20
[  807.270000]         003f270e 000c0000 418004fc 00bca000 009f8a80 00bca000 00a06fc0 00bcbe5c
[  807.270000]         000317f6 0048098b 00000009 418004fc 00bca000 00000000 07408000 00000009
[  807.270000]         00000008 00bcbf38 00a06fc0 00000006 00000000 00000001 00bcbe6c 000319ac
[  807.270000]         00000009 01438a20 00bcbeb8 0003acf0 00000009 0000000f 0000000e c043c000
[  807.270000]         00000000 07408000 00000003 00bcbf98 efb2c944 efb2b8a8 00039afa 00bca000
[  807.270000] Call Trace: [<000c0000>] insert_vmap_area.constprop.91+0xbc/0x15a
[  807.270000]  [<003f3708>] dump_stack+0x10/0x16
[  807.270000]  [<003f270e>] panic+0xba/0x2bc
[  807.270000]  [<000c0000>] insert_vmap_area.constprop.91+0xbc/0x15a
[  807.270000]  [<000317f6>] do_exit+0x87e/0x9d6
[  807.270000]  [<000319ac>] do_group_exit+0x28/0xb6
[  807.270000]  [<0003acf0>] get_signal+0x126/0x720
[  807.270000]  [<00039afa>] send_signal+0xde/0x16e
[  807.270000]  [<00004f70>] do_notify_resume+0x38/0x61c
[  807.270000]  [<0003abaa>] force_sig_fault_to_task+0x36/0x3a
[  807.270000]  [<0003abc6>] force_sig_fault+0x18/0x1c
[  807.270000]  [<000074f4>] send_fault_sig+0x44/0xc6
[  807.270000]  [<00006a62>] buserr_c+0x2c8/0x6a2
[  807.270000]  [<00002cfc>] do_signal_return+0x10/0x1a
[  807.270000]  [<0018800e>] ext4_htree_fill_tree+0x7c/0x32a
[  807.270000]  [<0010800a>] d_absolute_path+0x18/0x6a
[  807.270000] 
[  807.270000] ---[ end Kernel panic - not syncing: Aiee, killing interrupt handler! ]---

On the Quadra 630, the panic almost completely disappeared when I enabled 
the relevant CONFIG_DEBUG_* options. After about 7 hours of stress testing 
I got this:

[23982.680000] list_add corruption. next->prev should be prev (00b51e98), but was 00bb22d8. (next=00b75cd0).
[23982.690000] kernel BUG at lib/list_debug.c:25!
[23982.700000] *** TRAP #7 ***   FORMAT=0
[23982.710000] Current process id is 15489
[23982.720000] BAD KERNEL TRAP: 00000000
[23982.740000] Modules linked in:
[23982.750000] PC: [<00261e62>] __list_add_valid+0x62/0xc0
[23982.760000] SR: 2000  SP: e2fb938b  a2: 00bcba80
[23982.770000] d0: 00000022    d1: 00000002    d2: 008c4e40    d3: 00b7a9c0
[23982.780000] d4: 00b51e98    d5: 000da3c0    a0: 00067f00    a1: 00b51d2c
[23982.790000] Process stress-ng (pid: 15489, task=35ee07ca)
[23982.800000] Frame format=0 
[23982.810000] Stack from 00b51e80:
[23982.810000]         004cbab9 004ea3a1 00000019 004ea34f 00b51e98 00bb22d8 00b75cd0 008c4e38
[23982.810000]         00b51ecc 000da3f2 008c4e40 00b51e98 00b75cd0 00b51e5c 000f5d40 00b75cd0
[23982.810000]         00b7a9c0 00bb22d0 00b7a9c0 00b51f04 000dc346 00b51e5c 008c4e38 00b7a9c0
[23982.810000]         c4c97000 00000000 c4c96000 00102073 00b14960 c4c97000 00b51e5c 00b75c94
[23982.810000]         00000001 00b51f24 000d5628 00b51e5c 00b75c94 00102070 00000000 00b75c94
[23982.810000]         00b75c94 00b51f3c 000d5728 00b14960 00b75c94 c4c97000 00000000 00b51f78
[23982.830000] Call Trace: [<000da3f2>] anon_vma_chain_link+0x32/0x80
[23982.840000]  [<000f5d40>] kmem_cache_alloc+0x0/0x200
[23982.850000]  [<000dc346>] anon_vma_clone+0xc6/0x180
[23982.860000]  [<00102073>] cdev_get+0x33/0x80
[23982.870000]  [<000d5628>] __split_vma+0x68/0x140
[23982.880000]  [<00102070>] cdev_get+0x30/0x80
[23982.890000]  [<000d5728>] split_vma+0x28/0x40
[23982.900000]  [<000d83ba>] mprotect_fixup+0x13a/0x200
[23982.910000]  [<00102070>] cdev_get+0x30/0x80
[23982.920000]  [<000d8280>] mprotect_fixup+0x0/0x200
[23982.930000]  [<000d85b2>] sys_mprotect+0x132/0x1c0
[23982.940000]  [<00102070>] cdev_get+0x30/0x80
[23982.950000]  [<00001000>] kernel_pg_dir+0x0/0x1000
[23982.960000]  [<000071df>] flush_icache_range+0x1f/0x40
[23982.970000]  [<00002ca4>] syscall+0x8/0xc
[23982.980000]  [<00001000>] kernel_pg_dir+0x0/0x1000
[23982.990000]  [<00001000>] kernel_pg_dir+0x0/0x1000
[23983.000000]  [<00002000>] _start+0x0/0x40
[23983.010000]  [<0018800e>] ext4_ext_remove_space+0x20e/0x1540
[23983.030000] 
[23983.040000] Code: 4879 004e a3a1 4879 004c bab9 4e93 4e47 <b089> 6704 b088 661c 2f08 2f2e 000c 2f00 4879 004e a404 47f9 0043 d16c 4e93 4878
[23983.060000] Disabling lock debugging due to kernel taint

I am still unable to reproduce this in Aranym or QEMU. (Though I did find 
a QEMU bug in the attempt.)

I suppose list pointer corruption could have resulted in the above panic 
had it gone undetected. So it's tempting to blame the panic on bad DRAM -- 
especially if this anon_vma_chain struct always gets placed at the same 
physical address (?)
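
For reference, the kind of check that produced the "list_add corruption"
report above can be modelled in userspace roughly as follows. This is a
simplified sketch, not the code in lib/list_debug.c. The names list_node,
list_add_valid() and list_add_checked() are illustrative only, and the
message format merely imitates the one in the log.

```c
#include <stdbool.h>
#include <stdio.h>

/* Minimal doubly linked list node, modelled on struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

/*
 * Return true if new_node may be inserted between prev and next.
 * The neighbours must point at each other, and new_node must not
 * already be one of them.
 */
static bool list_add_valid(const struct list_node *new_node,
			   const struct list_node *prev,
			   const struct list_node *next)
{
	if (next->prev != prev) {
		fprintf(stderr,
			"list_add corruption. next->prev should be prev (%p), but was %p. (next=%p).\n",
			(void *)prev, (void *)next->prev, (void *)next);
		return false;
	}
	if (prev->next != next) {
		fprintf(stderr,
			"list_add corruption. prev->next should be next (%p), but was %p. (prev=%p).\n",
			(void *)next, (void *)prev->next, (void *)prev);
		return false;
	}
	if (new_node == prev || new_node == next) {
		fprintf(stderr, "list_add double add: new=%p\n",
			(void *)new_node);
		return false;
	}
	return true;
}

/* Insert new_node between prev and next only if the list looks sane. */
static bool list_add_checked(struct list_node *new_node,
			     struct list_node *prev,
			     struct list_node *next)
{
	if (!list_add_valid(new_node, prev, next))
		return false;

	next->prev = new_node;
	new_node->next = next;
	new_node->prev = prev;
	prev->next = new_node;
	return true;
}
```

A check like this only fires when a list operation touches the damaged
node, so without it a single overwritten pointer could go unnoticed until
something else dereferences it.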

