All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ard Biesheuvel <ardb@kernel.org>
To: Corentin Labbe <clabbe.montjoie@gmail.com>,
	Linus Walleij <linus.walleij@linaro.org>,
	Arnd Bergmann <arnd@arndb.de>
Cc: "Russell King (Oracle)" <linux@armlinux.org.uk>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: boot flooded with unwind: Index not found
Date: Wed, 2 Mar 2022 10:45:46 +0100	[thread overview]
Message-ID: <CAMj1kXHri2_tnYhu2gE9xTUOxLY9v1=zODCo1BGfjFTKukiedA@mail.gmail.com> (raw)
In-Reply-To: <Yh8w7ldudhmbYv4N@Red>

On Wed, 2 Mar 2022 at 09:55, Corentin Labbe <clabbe.montjoie@gmail.com> wrote:
>
> Le Wed, Mar 02, 2022 at 09:44:52AM +0100, Ard Biesheuvel a écrit :
> > On Wed, 2 Mar 2022 at 09:40, Corentin Labbe <clabbe.montjoie@gmail.com> wrote:
> > >
> > > Le Tue, Mar 01, 2022 at 05:52:30PM +0100, Ard Biesheuvel a écrit :
> > > > On Tue, 1 Mar 2022 at 17:37, Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > >
> > > > > On Tue, 1 Mar 2022 at 16:52, Russell King (Oracle)
> > > > > <linux@armlinux.org.uk> wrote:
> > > > > >
> > > > > > On Tue, Mar 01, 2022 at 04:48:25PM +0100, Corentin Labbe wrote:
> > > > > > > Hello
> > > > > > >
> > > > > > > I booted today linux-next (20220301) and my boot is flooded with:
> > > > > > > [    0.000000] unwind: Index not found c0f0c440
> > > > > > > [    0.000000] unwind: Index not found 00000000
> > > > > > > [    0.000000] unwind: Index not found c0f0c440
> > > > > > > [    0.000000] unwind: Index not found 00000000
> > > > > > >
> > > > > > > This happen on a sun8i-a83t-bananapi-m3
> > > > > >
> > > > > > Have you enabled vmapped stacks?
> > > > > >
> > > > >
> > > > > This is probably related to
> > > > >
> > > > > 538b9265c063 ARM: unwind: track location of LR value in stack frame
> > > > >
> > > > > which removes a kernel_text_address() check on frame->pc as it is
> > > > > essentially redundant, given that we won't find unwind data otherwise.
> > > > > Unfortunately, I failed to realise that the other check carries a
> > > > > pr_warn(), which may apparently fire spuriously in some cases.
> > > > >
> > > > > The 0x0 value can easily be filtered out, but i would be interesting
> > > > > where the other value originates from. We might be able to solve this
> > > > > with a simple .nounwind directive in a asm routine somewhere.
> > > > >
> > > > > I'll prepare a patch that disregards the 0x0 value - could you check
> > > > > in the mean time what the address 0xcf0c440 coincides with in your
> > > > > build?
> > > >
> > > > Something like the below should restore the previous behavior, while
> > > > taking the kernel_text_address() check out of the hot path.
> > > >
> > > > --- a/arch/arm/kernel/unwind.c
> > > > +++ b/arch/arm/kernel/unwind.c
> > > > @@ -400,7 +400,8 @@ int unwind_frame(struct stackframe *frame)
> > > >
> > > >         idx = unwind_find_idx(frame->pc);
> > > >         if (!idx) {
> > > > -               pr_warn("unwind: Index not found %08lx\n", frame->pc);
> > > > +               if (frame->pc && kernel_text_address(frame->pc))
> > > > +                       pr_warn("unwind: Index not found %08lx\n", frame->pc);
> > > >                 return -URC_FAILURE;
> > > >         }
> > >
> > > Hello
> > >
> > > This is a more detailed trace from my follow up after your patch:
> >
> > So the log below is from a kernel that has the above patch applied?
> > Could you please share the .config?
> >
>
> Yes this is a kernel with above patch applied (this board do not boot without it).

It's not entirely clear to me how (or whether) the recent changes to
unwind.c cause this issue, but one thing that stands out in the
current code is the unguarded dereference of a value pulled of the
stack as a memory address.

It is worth noting that the only unwind entries in vmlinux that load
SP from the stack directly (as opposed to unwinding it by moving from
the frame pointer or by addition/subtraction) are the
__irq_svc/__pabt_svc/__dabt_svc entry routines, and given that the
bogus address 60000013 looks suspiciously like a PSR value (which is
stored in the vicinity of SP on the exception stack), my suspicion is
that some unwinder annotations are out of sync with the actual code.

So while the below does not fix the root cause, i.e., that the
unwinder unwinds SP incorrectly causing us to dereference a bogus
pointer, it should avoid the subsequent crash. Please give it a go.

--- a/arch/arm/kernel/unwind.c
+++ b/arch/arm/kernel/unwind.c
@@ -27,6 +27,7 @@
 #include <linux/sched.h>
 #include <linux/slab.h>
 #include <linux/spinlock.h>
+#include <linux/uaccess.h>
 #include <linux/list.h>

 #include <asm/sections.h>
@@ -236,10 +237,11 @@ static int unwind_pop_register(struct
unwind_ctrl_block *ctrl,
                if (*vsp >= (unsigned long *)ctrl->sp_high)
                        return -URC_FAILURE;

-       /* Use READ_ONCE_NOCHECK here to avoid this memory access
-        * from being tracked by KASAN.
+       /* Use get_kernel_nofault() here to avoid this memory access
+        * from causing a fatal fault, and from being tracked by KASAN.
         */
-       ctrl->vrs[reg] = READ_ONCE_NOCHECK(*(*vsp));
+       if (get_kernel_nofault(ctrl->vrs[reg], *vsp))
+               return -URC_FAILURE;
        if (reg == 14)
                ctrl->lr_addr = *vsp;
        (*vsp)++;

WARNING: multiple messages have this Message-ID (diff)
From: Ard Biesheuvel <ardb@kernel.org>
To: Corentin Labbe <clabbe.montjoie@gmail.com>,
	Linus Walleij <linus.walleij@linaro.org>,
	Arnd Bergmann <arnd@arndb.de>
Cc: "Russell King (Oracle)" <linux@armlinux.org.uk>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	 Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: boot flooded with unwind: Index not found
Date: Wed, 2 Mar 2022 10:45:46 +0100	[thread overview]
Message-ID: <CAMj1kXHri2_tnYhu2gE9xTUOxLY9v1=zODCo1BGfjFTKukiedA@mail.gmail.com> (raw)
In-Reply-To: <Yh8w7ldudhmbYv4N@Red>

On Wed, 2 Mar 2022 at 09:55, Corentin Labbe <clabbe.montjoie@gmail.com> wrote:
>
> Le Wed, Mar 02, 2022 at 09:44:52AM +0100, Ard Biesheuvel a écrit :
> > On Wed, 2 Mar 2022 at 09:40, Corentin Labbe <clabbe.montjoie@gmail.com> wrote:
> > >
> > > Le Tue, Mar 01, 2022 at 05:52:30PM +0100, Ard Biesheuvel a écrit :
> > > > On Tue, 1 Mar 2022 at 17:37, Ard Biesheuvel <ardb@kernel.org> wrote:
> > > > >
> > > > > On Tue, 1 Mar 2022 at 16:52, Russell King (Oracle)
> > > > > <linux@armlinux.org.uk> wrote:
> > > > > >
> > > > > > On Tue, Mar 01, 2022 at 04:48:25PM +0100, Corentin Labbe wrote:
> > > > > > > Hello
> > > > > > >
> > > > > > > I booted today linux-next (20220301) and my boot is flooded with:
> > > > > > > [    0.000000] unwind: Index not found c0f0c440
> > > > > > > [    0.000000] unwind: Index not found 00000000
> > > > > > > [    0.000000] unwind: Index not found c0f0c440
> > > > > > > [    0.000000] unwind: Index not found 00000000
> > > > > > >
> > > > > > > This happen on a sun8i-a83t-bananapi-m3
> > > > > >
> > > > > > Have you enabled vmapped stacks?
> > > > > >
> > > > >
> > > > > This is probably related to
> > > > >
> > > > > 538b9265c063 ARM: unwind: track location of LR value in stack frame
> > > > >
> > > > > which removes a kernel_text_address() check on frame->pc as it is
> > > > > essentially redundant, given that we won't find unwind data otherwise.
> > > > > Unfortunately, I failed to realise that the other check carries a
> > > > > pr_warn(), which may apparently fire spuriously in some cases.
> > > > >
> > > > > The 0x0 value can easily be filtered out, but i would be interesting
> > > > > where the other value originates from. We might be able to solve this
> > > > > with a simple .nounwind directive in a asm routine somewhere.
> > > > >
> > > > > I'll prepare a patch that disregards the 0x0 value - could you check
> > > > > in the mean time what the address 0xcf0c440 coincides with in your
> > > > > build?
> > > >
> > > > Something like the below should restore the previous behavior, while
> > > > taking the kernel_text_address() check out of the hot path.
> > > >
> > > > --- a/arch/arm/kernel/unwind.c
> > > > +++ b/arch/arm/kernel/unwind.c
> > > > @@ -400,7 +400,8 @@ int unwind_frame(struct stackframe *frame)
> > > >
> > > >         idx = unwind_find_idx(frame->pc);
> > > >         if (!idx) {
> > > > -               pr_warn("unwind: Index not found %08lx\n", frame->pc);
> > > > +               if (frame->pc && kernel_text_address(frame->pc))
> > > > +                       pr_warn("unwind: Index not found %08lx\n", frame->pc);
> > > >                 return -URC_FAILURE;
> > > >         }
> > >
> > > Hello
> > >
> > > This is a more detailed trace from my follow up after your patch:
> >
> > So the log below is from a kernel that has the above patch applied?
> > Could you please share the .config?
> >
>
> Yes this is a kernel with above patch applied (this board do not boot without it).

It's not entirely clear to me how (or whether) the recent changes to
unwind.c cause this issue, but one thing that stands out in the
current code is the unguarded dereference of a value pulled of the
stack as a memory address.

It is worth noting that the only unwind entries in vmlinux that load
SP from the stack directly (as opposed to unwinding it by moving from
the frame pointer or by addition/subtraction) are the
__irq_svc/__pabt_svc/__dabt_svc entry routines, and given that the
bogus address 60000013 looks suspiciously like a PSR value (which is
stored in the vicinity of SP on the exception stack), my suspicion is
that some unwinder annotations are out of sync with the actual code.

So while the below does not fix the root cause, i.e., that the
unwinder unwinds SP incorrectly causing us to dereference a bogus
pointer, it should avoid the subsequent crash. Please give it a go.

--- a/arch/arm/kernel/unwind.c
+++ b/arch/arm/kernel/unwind.c
@@ -27,6 +27,7 @@
 #include <linux/sched.h>
 #include <linux/slab.h>
 #include <linux/spinlock.h>
+#include <linux/uaccess.h>
 #include <linux/list.h>

 #include <asm/sections.h>
@@ -236,10 +237,11 @@ static int unwind_pop_register(struct
unwind_ctrl_block *ctrl,
                if (*vsp >= (unsigned long *)ctrl->sp_high)
                        return -URC_FAILURE;

-       /* Use READ_ONCE_NOCHECK here to avoid this memory access
-        * from being tracked by KASAN.
+       /* Use get_kernel_nofault() here to avoid this memory access
+        * from causing a fatal fault, and from being tracked by KASAN.
         */
-       ctrl->vrs[reg] = READ_ONCE_NOCHECK(*(*vsp));
+       if (get_kernel_nofault(ctrl->vrs[reg], *vsp))
+               return -URC_FAILURE;
        if (reg == 14)
                ctrl->lr_addr = *vsp;
        (*vsp)++;

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2022-03-02  9:46 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-01 15:48 boot flooded with unwind: Index not found Corentin Labbe
2022-03-01 15:48 ` Corentin Labbe
2022-03-01 15:49 ` Russell King (Oracle)
2022-03-01 15:49   ` Russell King (Oracle)
2022-03-01 16:37   ` Ard Biesheuvel
2022-03-01 16:37     ` Ard Biesheuvel
2022-03-01 16:52     ` Ard Biesheuvel
2022-03-01 16:52       ` Ard Biesheuvel
2022-03-01 18:19       ` Corentin Labbe
2022-03-01 18:19         ` Corentin Labbe
2022-03-02  8:39       ` Corentin Labbe
2022-03-02  8:39         ` Corentin Labbe
2022-03-02  8:44         ` Ard Biesheuvel
2022-03-02  8:44           ` Ard Biesheuvel
2022-03-02  8:55           ` Corentin Labbe
2022-03-02  9:45             ` Ard Biesheuvel [this message]
2022-03-02  9:45               ` Ard Biesheuvel
2022-03-02 10:09               ` Corentin Labbe
2022-03-02 10:09                 ` Corentin Labbe
2022-03-02 11:12                 ` Russell King (Oracle)
2022-03-02 11:12                   ` Russell King (Oracle)
2022-03-02 11:19                   ` Ard Biesheuvel
2022-03-02 11:19                     ` Ard Biesheuvel
2022-03-02 11:22                     ` Russell King (Oracle)
2022-03-02 11:22                       ` Russell King (Oracle)
2022-03-09  0:01                       ` Russell King (Oracle)
2022-03-09  0:01                         ` Russell King (Oracle)
2022-03-09  1:08                         ` Russell King (Oracle)
2022-03-09  1:08                           ` Russell King (Oracle)
2022-03-09  7:20                           ` Ard Biesheuvel
2022-03-09  7:20                             ` Ard Biesheuvel
2022-03-01 18:16   ` Corentin Labbe
2022-03-01 18:16     ` Corentin Labbe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAMj1kXHri2_tnYhu2gE9xTUOxLY9v1=zODCo1BGfjFTKukiedA@mail.gmail.com' \
    --to=ardb@kernel.org \
    --cc=arnd@arndb.de \
    --cc=clabbe.montjoie@gmail.com \
    --cc=linus.walleij@linaro.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.