From: Doug Anderson <dianders@chromium.org>
To: Dave Martin <Dave.Martin@arm.com>
Cc: Caroline Tice <cmtice@chromium.org>,
kgdb-bugreport@lists.sourceforge.net,
Will Deacon <will.deacon@arm.com>,
Linux ARM <linux-arm-kernel@lists.infradead.org>,
Stephen Boyd <swboyd@chromium.org>
Subject: Re: Possible to annotate ARM64 IRQ handling to help gdb?
Date: Wed, 13 Feb 2019 13:19:17 -0800
Message-ID: <CAD=FV=V6mPHUBGn6u5v4=D_DBxrtj_MAyfHk+TOnbCRNNokYJQ@mail.gmail.com>
In-Reply-To: <20190211195750.GI3567@e103592.cambridge.arm.com>

Hi,

On Mon, Feb 11, 2019 at 11:58 AM Dave Martin <Dave.Martin@arm.com> wrote:
>
> On Mon, Feb 11, 2019 at 09:27:11AM -0800, Doug Anderson wrote:
> > Hi,
> >
> > On Mon, Feb 4, 2019 at 4:31 AM Dave Martin <Dave.Martin@arm.com> wrote:
> > >
> > > On Fri, Feb 01, 2019 at 01:38:05PM -0800, Doug Anderson wrote:
> > > > Hi,
> > > >
> > > > I was wondering if anyone out there has given any thought to
> > > > annotating the ARM64 IRQ handling in such a way that we could stack
> > > > crawl past el1_irq() when in gdb.
> > > >
> > > > I spent a bit of time on this a few months ago and documented all my
> > > > findings in:
> > > >
> > > > https://bugs.chromium.org/p/chromium/issues/detail?id=908721
> > > >
> > > > I can copy and paste all the discussion from that bug here, but since
> > > > it's public hopefully folks can read the discussion / investigation
> > > > there. To put it briefly, though: I can stack crawl past "el1_irq"
> > > > with the normal linux stack crawl (which is what kdb uses) but I can't
> > > > crawl past "el1_irq" in gdb(). After talking to some of our tools
> > > > guys here I'm fairly certain that we could solve this with the right
> > > > CFI directives, but when I poked at it I wasn't able to figure out the
> > > > magic.
> > > >
> > > >
> > > > Anyway, I figured I'd check to see if anyone here happens to know the
> > > > right magic.
> > >
> > > The kernel (appears to) generate a valid frame record for el1_irq:
> > >
> > > 0xffffff8008082b94 <+84>: mrs x22, elr_el1
> > >
> > > [...]
> > >
> > > 0xffffff8008082ba0 <+96>: stp x29, x22, [sp, #304]
> > > 0xffffff8008082ba4 <+100>: add x29, sp, #0x130
> > >
> > > (I note that 0x130 == 304. Yay binutils.)
> >
> > Right, this is how the kernel is able to do the crawl. It's also why
> > I was able to manually do the crawl in the bug by chaining together
> > frame pointers.
> >
> >
> > > From the bug report, I don't see any real investigation into what
> > > precisely causes gdb to choke on this frame.
> >
> > Right. I just don't know gdb well enough. :( I've had it on my list
> > to dig into it, but I need to find time. ;-)
> >
> >
> > > Do you have evidence that CFI annotations help in this case? And can
> > > you explain _why_ they help (i.e., precisely how is gdb relying on the
> > > annotations)?
> >
> > I spent a tiny bit of time playing around with CFI annotations.
> > Mostly it was stumbling around in the dark since I had a hard time
> > finding good arm/arm64 examples and the documentation was a little
> > hard for me to parse.
>
> You could try compiling a few simple C functions with gcc -S
> -fexceptions and see what the compiler spits out.

Thanks, this definitely helped!

> > ...but from my experience with gdb, my guess is that gdb wants more
> > than just the simple frame pointers. It wants to know where _all_ the
> > registers are stored on the stack and the only way it's going to get
> > that from assembly code (especially assembly code that barfed the
> > registers onto the stack somewhere that's not between FUNC and
> > ENDFUNC) is with some type of annotation. My guess is that it doesn't
> > fall back to just looking at frame pointer chains. Specifically as
> > you move up the stack frame in gdb and you type "info reg", the set of
> > registers changes to be those registers that are correct for the stack
> > frame you're on. Here's a quick example showing how gdb behaves with
> > a random register that was barfed, $x22:
> >
> > (gdb) frame 3
> > #3 0xffffff800846a088 in __handle_sysrq (key=103,
> > check_mask=<optimized out>) at .../drivers/tty/sysrq.c:620
> > 620 op_p->handler(key);
> >
> > (gdb) disass
> > Dump of assembler code for function __handle_sysrq:
> > 0xffffff8008469f64 <+0>: str x23, [sp, #-64]!
> > 0xffffff8008469f68 <+4>: stp x22, x21, [sp, #16]
> > 0xffffff8008469f6c <+8>: stp x20, x19, [sp, #32]
> > 0xffffff8008469f70 <+12>: stp x29, x30, [sp, #48]
> > 0xffffff8008469f74 <+16>: add x29, sp, #0x30
> >
> > (gdb) print /x $x22
> > $13 = 0xffffff8009035000
> >
> > (gdb) print /x *(void**)($x29 - 0x30 + 16)
> > $14 = 0x8000100
> >
> > (gdb) up
> > #4 0xffffff800846a0dc in handle_sysrq (key=103) at .../drivers/tty/sysrq.c:649
> > 649 __handle_sysrq(key, true);
> >
> > (gdb) print /x $x22
> > $15 = 0x8000100
>
>
> Indeed, but this requires full DWARF or .eh_frame info, which is not
> generally available in the kernel.

Yup, but I have it for gdb and right now the problem I'm trying to
solve is being able to crawl in gdb since the kernel seems to be OK.
I guess I was thinking that perhaps the DWARF info could be confusing
gdb?

> Except for code built with -fomit-frame-pointer, you should at least
> be able to see a list of frames though: this doesn't require all the
> registers of ancestor frames to be recovered, just x29 and lr (which is
> what the frame records on the stack contain -- so no other magic info
> is required in order to recover these).
>
> gdb tries various methods to unwind a frame, and ought to fall back to
> this approach if all else fails. Frame chains that appear to loop
> are a problem though, with no straightforward solution.
>
> My hunch is that gdb sees the frame chain attempt to loop backwards
> after el1_irq and bails out. Is your task stack at a lower address than
> the IRQ stack?

Here's what I've got (not lower):

#16 0xffffff8008082bf0 in el1_irq () at
/mnt/host/source/src/third_party/kernel/v4.19/arch/arm64/kernel/entry.S:622
622 irq_handler
(gdb) print /x $sp
$11 = 0xffffff8008004000
(gdb) print /x $x29
$12 = 0xffffff8009003e90
(gdb) print /x ((void**)$x29)[0]
$13 = 0xffffff8009003ed0
(gdb) print /x (*(void***)$x29)[0]
$14 = 0xffffff8009003ee0

...but then I poked a bit more and found that one really big problem is
that "irq_stack_entry" swaps the stack before calling gic_handle_irq(),
and this seemed to be confusing gdb. Specifically, the value of "sp"
when I point gdb at the "el1_irq" frame is actually "irq_stack_ptr",
AKA 0xffffff8008004000.
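
The hand-chaining of x29 values above can be written out as a small
walker. What follows is a minimal, self-contained sketch in C, not gdb's
or the kernel's unwinder: "struct mem", read_record() and walk_frames()
are hypothetical names, and target memory is modelled as a plain array
of 64-bit words. The strict_monotonic flag models one possible debugger
sanity check (refuse to follow a frame pointer that does not move to a
higher address). That check is an assumption about the kind of heuristic
a stack switch could trip, not a description of what gdb actually does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* AArch64 frame record, as written by "stp x29, <lr>, [sp, #off]". */
struct frame_record {
    uint64_t fp;   /* saved x29 of the caller */
    uint64_t lr;   /* return address */
};

/* Hypothetical model of target memory: nwords 64-bit words at base. */
struct mem {
    uint64_t base;
    const uint64_t *words;
    size_t nwords;
};

/* Read the two-word frame record at addr; returns 0 on success. */
static int read_record(const struct mem *m, uint64_t addr,
                       struct frame_record *out)
{
    uint64_t idx;

    if (addr < m->base || (addr - m->base) % 8 != 0)
        return -1;
    idx = (addr - m->base) / 8;
    if (idx + 1 >= (uint64_t)m->nwords)
        return -1;
    out->fp = m->words[idx];
    out->lr = m->words[idx + 1];
    return 0;
}

/*
 * Follow the frame-record chain starting at fp, storing each return
 * address in pcs[]. Stops at a NULL frame pointer, an unreadable
 * record, or after max frames. If strict_monotonic is nonzero, also
 * stops when the next frame pointer is not above the current one.
 * Returns the number of return addresses stored.
 */
size_t walk_frames(const struct mem *m, uint64_t fp, int strict_monotonic,
                   uint64_t *pcs, size_t max)
{
    size_t n = 0;

    assert(m != NULL && pcs != NULL);
    while (fp != 0 && n < max) {
        struct frame_record rec;

        if (read_record(m, fp, &rec) != 0)
            break;
        pcs[n++] = rec.lr;
        if (strict_monotonic && rec.fp != 0 && rec.fp <= fp)
            break;   /* chain appears to go backwards: give up */
        fp = rec.fp;
    }
    return n;
}
```

With strict_monotonic set to 0 the walk follows the records wherever
they point, which is roughly what chaining frame pointers by hand does.
With it set to 1 the walk stops at the first record whose saved x29
points at or below the current one.
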
I've been fighting a bit with trying to figure out how to make .cfi
directives do what I want, and I managed a stupid/ugly hack that at
least seems to get my stack pointer to be correct in el1_irq now:

---
 static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
 {
        u32 irqnr;
+       asm volatile (".cfi_register 31, 19");
---

...when I do that, my stack pointer is sane when I point gdb at el1_irq
(it matches x19), but I still can't get a trace. I also haven't yet
been able to figure out how to accomplish that without hacking it into
gic_handle_irq().
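
To spell out what ".cfi_register 31, 19" asks of an unwinder: DWARF
register 31 is sp on AArch64 and 19 is x19, so the rule says "the
caller's sp is whatever this frame holds in x19". The sketch below is a
toy model of applying such a rule, assuming only two rule kinds and a
same-value default for every register. All names in it (reg_rule,
init_rules(), set_register_rule(), unwind_regs()) are hypothetical, and
real CFI processing in gdb handles more rule kinds (CFA-relative
offsets, expressions) and different defaults.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NREGS     32   /* DWARF regs 0..30 are x0..x30, 31 is sp */
#define DWARF_X19 19
#define DWARF_SP  31

enum rule_kind {
    RULE_SAME_VALUE,   /* caller's value is the callee's value of this reg */
    RULE_REGISTER      /* caller's value lives in another callee register */
};

struct reg_rule {
    enum rule_kind kind;
    unsigned src;      /* source register, used only by RULE_REGISTER */
};

/* Toy default: every register is "same value" until told otherwise. */
void init_rules(struct reg_rule rules[NREGS])
{
    for (size_t i = 0; i < NREGS; i++) {
        rules[i].kind = RULE_SAME_VALUE;
        rules[i].src = 0;
    }
}

/* Model of ".cfi_register reg, src". */
void set_register_rule(struct reg_rule rules[NREGS], unsigned reg,
                       unsigned src)
{
    assert(reg < NREGS && src < NREGS);
    rules[reg].kind = RULE_REGISTER;
    rules[reg].src = src;
}

/*
 * Compute the caller's registers from the callee's registers and the
 * rules. All sources are read from callee[], so the arrays must be
 * distinct.
 */
void unwind_regs(const uint64_t callee[NREGS],
                 const struct reg_rule rules[NREGS],
                 uint64_t caller[NREGS])
{
    assert(callee != caller);
    for (size_t i = 0; i < NREGS; i++) {
        if (rules[i].kind == RULE_REGISTER)
            caller[i] = callee[rules[i].src];
        else
            caller[i] = callee[i];
    }
}
```

Calling set_register_rule(rules, DWARF_SP, DWARF_X19) before
unwind_regs() makes the caller's sp equal to the callee's x19, which is
consistent with the "it matches x19" observation above. It says nothing
about why the rest of the trace still fails.
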

While it would be nice to get all this solved, it's probably not high
priority right now, so I might have to punt unless there's some other
obvious / low hanging fruit to try.

-Doug