linux-kernel.vger.kernel.org archive mirror
From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>
To: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: Erhard Furtner <erhard_f@mailbox.org>,
	Linux PowerPC <linuxppc-dev@lists.ozlabs.org>,
	Linux Maple Tree <maple-tree@lists.infradead.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux Regressions <regressions@lists.linux.dev>,
	Andrew Morton <akpm@linux-foundation.org>,
	Michael Ellerman <mpe@ellerman.id.au>
Subject: Re: [Bisected] PowerMac G4 getting "BUG: Unable to handle kernel data access on write at 0x00001ff0" at boot with CONFIG_VMAP_STACK=y on kernels 6.5.x (regression over 6.4.x)
Date: Tue, 26 Sep 2023 11:16:44 -0400
Message-ID: <20230926151644.ogvk4n2hnfgffszv@revolver>
In-Reply-To: <ZRIfyp5YKH36_9df@debian.me>

* Bagas Sanjaya <bagasdotme@gmail.com> [230925 20:03]:
> On Tue, Sep 26, 2023 at 01:01:59AM +0200, Erhard Furtner wrote:
> > Greetings!
> > 
> > Had a chat on #gentoo-powerpc with another user whose G4 Mini fails booting kernel 6.5.0 when CONFIG_VMAP_STACK=y is enabled. I was able to replicate the issue on my PowerMac G4. Also I was able to bisect the issue.
> > 
> > Kernels 6.4.x boot ok with CONFIG_VMAP_STACK=y but on 6.5.5 I get:
> > 
> > [...]
> > Kernel attempted to write user page (1ff0) - exploit attempt? (uid: 0)
> > BUG: Unable to handle kernel data access on write at 0x00001ff0
> > Faulting instruction address: 0xc0008750
> > Oops: Kernel access of bad area, sig: 11 [#1]
> > BE PAGE_SIZE=4K MMU=Hash PowerMac
> > Modules linked in:
> > CPU: 0 PID: 0 Comm: swapper Not tainted 6.5.5-PMacG4 #5
> > Hardware name: PowerMac3,6 7455 0x80010303 PowerMac
> > NIP:  c0008750 LR: c0041848 CTR: c0070988
> > REGS: c0d3dcd0 TRAP: 0300   Not tainted (6.5.5-PMacG4)
> > MSR:  00001032 <ME,IR,DR,RI>  CR: 22d3ddc0 XER: 20000000
> > DAR: 00001ff0 DSISR: 42000000
> > GPR00: c0041848 c0d3dd90 c0d06360 c0d3ddd0 c0d06360 c0d3dea8 c0d3adc0 00000000
> > GPR08: 00000000 c0d40000 00000000 c0d3ddc0 00000000 00000000 00000000 00000004
> > GPR16: 00000002 00000000 00000002 00402dc2 00402dc2 00002000 f1004000 00000000
> > GPR24: c0d45220 c0d06644 c0843c34 00000002 c0d06360 c0d0ce00 c0d06360 00000000
> > NIP [c0008750] do_softirq_own_stack+0x18/0x3c
> > LR [c0041848] irq_exit+0x98/0xc4
> > Call Trace:
> > [c0d3dd90] [c0d69564] 0xc0d69564 (unreliable)
> > [c0d3ddb0] [c0041848] irq_exit+0x98/0xc4
> > [c0d3ddc0] [c0004a98] Decrementer_virt+0x108/0x10c
> > --- interrupt: 900 at __schedule+0x43c/0x4e0
> > NIP:  c0843940 LR: c084390c CTR: c0070988
> > REGS: c0d3ddd0 TRAP: 0900   Not tainted  (6.5.5-PMacG4)
> > MSR:  00009032 <EE,ME,IR,DR,RI>  CR: 22024484  XER: 00000000
> > 
> > GPR00: c0843574 c0d3de90 c0d06360 c0d06360 c0d06360 c0d3dea8 00000001 00000000
> > GPR08: 00000000 00009032 c099ce2c 0007ffbf 22024484 00000000 00000000 00000004
> > GPR16: 00000002 00000000 00000002 00402dc2 00402dc2 00002000 f1004000 00000000
> > GPR24: c0d45220 c0d06644 c0843c34 00000002 c0d06360 c0d0ce00 c0d06360 c0d063ac
> > NIP [c0843940] __schedule+0x43c/0x4e0
> > LR [c084390c] __schedule+0x408/0x4e0
> > --- interrupt: 900
> > [c0d3de90] [c0843574] __schedule+0x70/0x4e0 (unreliable)
> > [c0d3ded0] [c0843c34] __cond_resched+0x34/0x54
> > [c0d3dee0] [c0141068] __vmalloc_node_range+0x27c/0x64c
> > [c0d3de60] [c0141794] __vmalloc_node+0x44/0x54
> > [c8d3df80] [c0c06510] init_IRQ+0x34/0xd4
> > [c8d3dfa0] [c0c03440] start_kernel+0x424/0x558
> > [c8d3dff0] [00003540] 0x3540
> > Code: 39490999 7d4901a4 39290aaa 7d2a01a4 4c00012c 4bffff20 9421ffe0 7c0802a6 3d20c0d4 93e1001c 90010024 83e95278 <943f1ff0> 7fe1fb78 48840c6d 80210000
> > ---[ end trace 0000000000000000 ]---
> > 
> > Kernel panic - not syncing: Attempted to kill the idle task!
> > Rebooting in 48 seconds..
> > 
> > 
> > The bisect revealed this commit:
> >  # git bisect good
> > cfeb6ae8bcb96ccf674724f223661bbcef7b0d0b is the first bad commit
> > commit cfeb6ae8bcb96ccf674724f223661bbcef7b0d0b
> > Author: Liam R. Howlett <Liam.Howlett@oracle.com>
> > Date:   Fri Aug 18 20:43:55 2023 -0400
> > 
> >     maple_tree: disable mas_wr_append() when other readers are possible
> >     
> >     The current implementation of append may cause duplicate data and/or
> >     incorrect ranges to be returned to a reader during an update.  Although
> >     this has not been reported or seen, disable the append write operation
> >     while the tree is in rcu mode out of an abundance of caution.
> >     
> >     During the analysis of the mas_next_slot() the following was
> >     artificially created by separating the writer and reader code:
> >     
> >     Writer:                                 reader:
> >     mas_wr_append
> >         set end pivot
> >         updates end metata
> >         Detects write to last slot
> >         last slot write is to start of slot
> >         store current contents in slot
> >         overwrite old end pivot
> >                                             mas_next_slot():
> >                                                     read end metadata
> >                                                     read old end pivot
> >                                                     return with incorrect range
> >         store new value
> >     
> >     Alternatively:
> >     
> >     Writer:                                 reader:
> >     mas_wr_append
> >         set end pivot
> >         updates end metata
> >         Detects write to last slot
> >         last lost write to end of slot
> >         store value
> >                                             mas_next_slot():
> >                                                     read end metadata
> >                                                     read old end pivot
> >                                                     read new end pivot
> >                                                     return with incorrect range
> >         set old end pivot
> >     
> >     There may be other accesses that are not safe since we are now updating
> >     both metadata and pointers, so disabling append if there could be rcu
> >     readers is the safest action.
> >     
> >     Link: https://lkml.kernel.org/r/20230819004356.1454718-2-Liam.Howlett@oracle.com
> >     Fixes: 54a611b60590 ("Maple Tree: add new data structure")
> >     Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
> >     Cc: <stable@vger.kernel.org>
> >     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> > 
> >  lib/maple_tree.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > 
> > And indeed when I revert commit cfeb6ae8bcb96ccf674724f223661bbcef7b0d0b kernel 6.5.5 succeeds booting with CONFIG_VMAP_STACK=y enabled. dmesg of the successful boot with the reverted commit attached, also kernel .config and the bisect.log.
> > 
> > Regards,
> > Erhard F.
> 
> 
> 
> > git bisect start
> > # status: waiting for both good and bad commits
> > # bad: [2309983b0ac063045af3b01b0251dfd118d45449] Linux 6.5.5
> > git bisect bad 2309983b0ac063045af3b01b0251dfd118d45449
> > # good: [6995e2de6891c724bfeb2db33d7b87775f913ad1] Linux 6.4
> > git bisect good 6995e2de6891c724bfeb2db33d7b87775f913ad1
> > # good: [6c1561fb900524c5bceb924071b3e9b8a67ff3da] Merge tag 'soc-dt-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
> > git bisect good 6c1561fb900524c5bceb924071b3e9b8a67ff3da
> > # good: [9f57c13f7ed70a94ecc135645bc764efdd378acd] Merge tag 'soc-fixes-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
> > git bisect good 9f57c13f7ed70a94ecc135645bc764efdd378acd
> > # good: [b9f052dc68f69dac89fe1e24693354c033daa091] netfilter: nf_tables: fix false-positive lockdep splat
> > git bisect good b9f052dc68f69dac89fe1e24693354c033daa091
> > # bad: [7e2229d14234bbea8fbb5e426d5f3533b0f1b262] arm64: dts: qcom: sc8180x-pmics: add missing qcom,spmi-gpio fallbacks
> > git bisect bad 7e2229d14234bbea8fbb5e426d5f3533b0f1b262
> > # good: [93f5de5f648d2b1ce3540a4ac71756d4a852dc23] Merge tag 'acpi-6.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
> > git bisect good 93f5de5f648d2b1ce3540a4ac71756d4a852dc23
> > # bad: [25130b27e0352acb83e91c467853eb9afad3b644] OPP: Fix potential null ptr dereference in dev_pm_opp_get_required_pstate()
> > git bisect bad 25130b27e0352acb83e91c467853eb9afad3b644
> > # good: [4942fed84b98cfb71d3cdff1a3df0072a57bbdfa] Merge tag 'riscv-for-linus-6.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
> > git bisect good 4942fed84b98cfb71d3cdff1a3df0072a57bbdfa
> > # bad: [ecd7e1c562cb08e41957fcd4b0e404de5ab38e20] ksmbd: fix slub overflow in ksmbd_decode_ntlmssp_auth_blob()
> > git bisect bad ecd7e1c562cb08e41957fcd4b0e404de5ab38e20
> > # bad: [7d2f353b2682dcfe5f9bc71e5b61d5b61770d98e] Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
> > git bisect bad 7d2f353b2682dcfe5f9bc71e5b61d5b61770d98e
> > # good: [2f406263e3e954aa24c1248edcfa9be0c1bb30fa] madvise:madvise_cold_or_pageout_pte_range(): don't use mapcount() against large folio for sharing check
> > git bisect good 2f406263e3e954aa24c1248edcfa9be0c1bb30fa
> > # bad: [e5548f85b4527c4c803b7eae7887c10bf8f90c97] shmem: fix smaps BUG sleeping while atomic
> > git bisect bad e5548f85b4527c4c803b7eae7887c10bf8f90c97
> > # bad: [cfeb6ae8bcb96ccf674724f223661bbcef7b0d0b] maple_tree: disable mas_wr_append() when other readers are possible
> > git bisect bad cfeb6ae8bcb96ccf674724f223661bbcef7b0d0b
> > # good: [0e0e9bd5f7b9d40fd03b70092367247d52da1db0] madvise:madvise_free_pte_range(): don't use mapcount() against large folio for sharing check
> > git bisect good 0e0e9bd5f7b9d40fd03b70092367247d52da1db0
> > # first bad commit: [cfeb6ae8bcb96ccf674724f223661bbcef7b0d0b] maple_tree: disable mas_wr_append() when other readers are possible
> 
> Thanks for the regression report. I'm adding it to regzbot:
> 
> #regzbot ^introduced: cfeb6ae8bcb96c
> #regzbot title: CONFIG_VMAP_STACK kernel data write access bug due to disabling mas_wr_append()

Let's see if I get this right..

#regzbot title: PF_IDLE issue in rcu_tiny
#regzbot fix: cff9b2332ab762b7e0586c793c431a8f2ea4db04
#regzbot dup-of: https://lore.kernel.org/linux-mm/3f86d58e-7f36-c6b4-c43a-2a7bcffd3bd@linux-m68k.org/


> 
> -- 
> An old man doll... just what I always wanted! - Clara


