All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sumit Garg <sumit.garg@linaro.org>
To: Doug Anderson <dianders@chromium.org>
Cc: Daniel Thompson <daniel.thompson@linaro.org>,
	Will Deacon <will@kernel.org>, Wei Li <liwei391@huawei.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Jason Wessel <jason.wessel@windriver.com>,
	Marc Zyngier <maz@kernel.org>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v3 0/2] arm64: Fix pending single-step debugging issues
Date: Mon, 11 Jul 2022 18:13:59 +0530	[thread overview]
Message-ID: <CAFA6WYPmFqCXk24J71tNO=5bJHgJRAOPFr7cGq8YYLR+MWBoPw@mail.gmail.com> (raw)
In-Reply-To: <CAD=FV=WXoDvWuH=yjzCcqOZ5CeUtYun7C8zrtrBP4FC409GkqA@mail.gmail.com>

Hi Doug,

On Sat, 2 Jul 2022 at 03:44, Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Tue, May 10, 2022 at 11:05 PM Sumit Garg <sumit.garg@linaro.org> wrote:
> >
> > This patch-set reworks pending fixes from Wei's series [1] to make
> > single-step debugging via kgdb/kdb on arm64 work as expected. There was
> > a prior discussion on ML [2] regarding if we should keep the interrupts
> > enabled during single-stepping. So patch #1 follows suggestion from Will
> > [3] to not disable interrupts during single stepping but rather skip
> > single stepping within interrupt handler.
> >
> > [1] https://lore.kernel.org/all/20200509214159.19680-1-liwei391@huawei.com/
> > [2] https://lore.kernel.org/all/CAD=FV=Voyfq3Qz0T3RY+aYWYJ0utdH=P_AweB=13rcV8GDBeyQ@mail.gmail.com/
> > [3] https://lore.kernel.org/all/20200626095551.GA9312@willie-the-truck/
> >
> > Changes in v3:
> > - Reword commit descriptions as per Daniel's suggestions.
> >
> > Changes in v2:
> > - Replace patch #1 to rather follow Will's suggestion.
> >
> > Sumit Garg (2):
> >   arm64: entry: Skip single stepping into interrupt handlers
> >   arm64: kgdb: Set PSTATE.SS to 1 to re-enable single-step
> >
> >  arch/arm64/include/asm/debug-monitors.h |  1 +
> >  arch/arm64/kernel/debug-monitors.c      |  5 +++++
> >  arch/arm64/kernel/entry-common.c        | 18 +++++++++++++++++-
> >  arch/arm64/kernel/kgdb.c                |  2 ++
> >  4 files changed, 25 insertions(+), 1 deletion(-)
>
> Sorry it took so long for me to respond. I kept dreaming that I'd find
> the time to really dig deep into this to understand it fully and I'm
> finally giving up on it.

No worries and apologies on my part as well as I had to find some time
to reproduce the issue that you have reported below.

> I'm going to hope that Will and/or Catalin
> knows this area of the code well and can give it a good review. If not
> then I'll strive harder to make the time...
>
> In any case, I poked around with this a bunch and it definitely
> improved the stepping behavior a whole lot. I still got one case where
> gdb hit an assertion while I was stepping, but I could believe that
> was a problem with gdb? I couldn't reproduce it. Thus I can at least
> give:
>
> Tested-by: Douglas Anderson <dianders@chromium.org>
>

Thanks for the testing.

> I'll also note that I _think_ I remember that with Wei's series that
> the gdb function "call" started working. I tried that here and it
> didn't seem so happy. To keep things simple, I created a dummy
> function in my kernel that looked like:
>
> void doug_test(void)
> {
>   pr_info("testing, 1 2 3\n");
> }
>
> I broke into the debugger by echoing "g" to /proc/sysrq-trigger and
> then tried "call doug_test()". I guess my printout actually printed
> but it wasn't so happy after that. Seems like it somehow ended up
> returning to a bogus address after the call which then caused a crash.
>

I am able to reproduce this issue on my setup as well. But it doesn't
seem to be a regression caused by this patch-set over Wei's series. As
I could reproduce this issue with v1 [1] patch-set as well which was
just a forward port of pending patches from Wei's series to the latest
upstream.

Maybe it's a different regression caused by other changes? BTW, do you
remember the kernel version you tested with Wei's series applied?

[1] https://lore.kernel.org/linux-arm-kernel/20220411093819.1012583-1-sumit.garg@linaro.org/T/

-Sumit

>   testing, 1 2 3
>   BUG: sleeping function called from invalid context at
> arch/arm64/mm/fault.c:593
>   in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 3393, name: bash
>   preempt_count: 0, expected: 0
>   RCU nest depth: 1, expected: 0
>   CPU: 6 PID: 3393 Comm: bash Not tainted 5.19.0-rc4+ #3
> dbec0bdb8582e447bccdcf2e70d7fe04477b1aac
>   Hardware name: Google Herobrine (rev1+) (DT)
>   Call trace:
>    dump_backtrace+0xf0/0x110
>    show_stack+0x24/0x70
>    dump_stack_lvl+0x64/0x7c
>    dump_stack+0x18/0x38
>    __might_resched+0x144/0x154
>    __might_sleep+0x54/0x84
>    do_page_fault+0x1d4/0x42c
>    do_mem_abort+0x4c/0xb0
>    el1_abort+0x3c/0x5c
>    el1h_64_sync_handler+0x4c/0xc4
>    el1h_64_sync+0x64/0x68
>    0xffffffc008000000
>    __handle_sysrq+0x15c/0x184
>    write_sysrq_trigger+0x94/0x128
>    proc_reg_write+0xbc/0xec
>    vfs_write+0xf0/0x2c8
>    ksys_write+0x84/0xf0
>    __arm64_sys_write+0x28/0x34
>    invoke_syscall+0x4c/0x120
>    el0_svc_common+0x94/0xfc
>    do_el0_svc+0x38/0xc0
>    el0_svc+0x2c/0x7c
>    el0t_64_sync_handler+0x48/0x114
>    el0t_64_sync+0x18c/0x190
>   Unable to handle kernel execute from non-executable memory at
> virtual address ffffffc008000000
>   Mem abort info:
>     ESR = 0x000000008600000f
>     EC = 0x21: IABT (current EL), IL = 32 bits
>     SET = 0, FnV = 0
>     EA = 0, S1PTW = 0
>     FSC = 0x0f: level 3 permission fault
>   swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000082863000
>   [ffffffc008000000] pgd=100000027ffff003, p4d=100000027ffff003,
> pud=100000027ffff003, pmd=100000027fffe003, pte=00680001001c3703
>   Internal error: Oops: 8600000f [#1] PREEMPT SMP
>
> I'm not sure if that's a sign that something is missing with your patch or not.
>
> -Doug

WARNING: multiple messages have this Message-ID (diff)
From: Sumit Garg <sumit.garg@linaro.org>
To: Doug Anderson <dianders@chromium.org>
Cc: Daniel Thompson <daniel.thompson@linaro.org>,
	Will Deacon <will@kernel.org>,  Wei Li <liwei391@huawei.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	 Mark Rutland <mark.rutland@arm.com>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	 Jason Wessel <jason.wessel@windriver.com>,
	Marc Zyngier <maz@kernel.org>,
	 Linux ARM <linux-arm-kernel@lists.infradead.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v3 0/2] arm64: Fix pending single-step debugging issues
Date: Mon, 11 Jul 2022 18:13:59 +0530	[thread overview]
Message-ID: <CAFA6WYPmFqCXk24J71tNO=5bJHgJRAOPFr7cGq8YYLR+MWBoPw@mail.gmail.com> (raw)
In-Reply-To: <CAD=FV=WXoDvWuH=yjzCcqOZ5CeUtYun7C8zrtrBP4FC409GkqA@mail.gmail.com>

Hi Doug,

On Sat, 2 Jul 2022 at 03:44, Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Tue, May 10, 2022 at 11:05 PM Sumit Garg <sumit.garg@linaro.org> wrote:
> >
> > This patch-set reworks pending fixes from Wei's series [1] to make
> > single-step debugging via kgdb/kdb on arm64 work as expected. There was
> > a prior discussion on ML [2] regarding if we should keep the interrupts
> > enabled during single-stepping. So patch #1 follows suggestion from Will
> > [3] to not disable interrupts during single stepping but rather skip
> > single stepping within interrupt handler.
> >
> > [1] https://lore.kernel.org/all/20200509214159.19680-1-liwei391@huawei.com/
> > [2] https://lore.kernel.org/all/CAD=FV=Voyfq3Qz0T3RY+aYWYJ0utdH=P_AweB=13rcV8GDBeyQ@mail.gmail.com/
> > [3] https://lore.kernel.org/all/20200626095551.GA9312@willie-the-truck/
> >
> > Changes in v3:
> > - Reword commit descriptions as per Daniel's suggestions.
> >
> > Changes in v2:
> > - Replace patch #1 to rather follow Will's suggestion.
> >
> > Sumit Garg (2):
> >   arm64: entry: Skip single stepping into interrupt handlers
> >   arm64: kgdb: Set PSTATE.SS to 1 to re-enable single-step
> >
> >  arch/arm64/include/asm/debug-monitors.h |  1 +
> >  arch/arm64/kernel/debug-monitors.c      |  5 +++++
> >  arch/arm64/kernel/entry-common.c        | 18 +++++++++++++++++-
> >  arch/arm64/kernel/kgdb.c                |  2 ++
> >  4 files changed, 25 insertions(+), 1 deletion(-)
>
> Sorry it took so long for me to respond. I kept dreaming that I'd find
> the time to really dig deep into this to understand it fully and I'm
> finally giving up on it.

No worries and apologies on my part as well as I had to find some time
to reproduce the issue that you have reported below.

> I'm going to hope that Will and/or Catalin
> knows this area of the code well and can give it a good review. If not
> then I'll strive harder to make the time...
>
> In any case, I poked around with this a bunch and it definitely
> improved the stepping behavior a whole lot. I still got one case where
> gdb hit an assertion while I was stepping, but I could believe that
> was a problem with gdb? I couldn't reproduce it. Thus I can at least
> give:
>
> Tested-by: Douglas Anderson <dianders@chromium.org>
>

Thanks for the testing.

> I'll also note that I _think_ I remember that with Wei's series that
> the gdb function "call" started working. I tried that here and it
> didn't seem so happy. To keep things simple, I created a dummy
> function in my kernel that looked like:
>
> void doug_test(void)
> {
>   pr_info("testing, 1 2 3\n");
> }
>
> I broke into the debugger by echoing "g" to /proc/sysrq-trigger and
> then tried "call doug_test()". I guess my printout actually printed
> but it wasn't so happy after that. Seems like it somehow ended up
> returning to a bogus address after the call which then caused a crash.
>

I am able to reproduce this issue on my setup as well. But it doesn't
seem to be a regression caused by this patch-set over Wei's series. As
I could reproduce this issue with v1 [1] patch-set as well which was
just a forward port of pending patches from Wei's series to the latest
upstream.

Maybe it's a different regression caused by other changes? BTW, do you
remember the kernel version you tested with Wei's series applied?

[1] https://lore.kernel.org/linux-arm-kernel/20220411093819.1012583-1-sumit.garg@linaro.org/T/

-Sumit

>   testing, 1 2 3
>   BUG: sleeping function called from invalid context at
> arch/arm64/mm/fault.c:593
>   in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 3393, name: bash
>   preempt_count: 0, expected: 0
>   RCU nest depth: 1, expected: 0
>   CPU: 6 PID: 3393 Comm: bash Not tainted 5.19.0-rc4+ #3
> dbec0bdb8582e447bccdcf2e70d7fe04477b1aac
>   Hardware name: Google Herobrine (rev1+) (DT)
>   Call trace:
>    dump_backtrace+0xf0/0x110
>    show_stack+0x24/0x70
>    dump_stack_lvl+0x64/0x7c
>    dump_stack+0x18/0x38
>    __might_resched+0x144/0x154
>    __might_sleep+0x54/0x84
>    do_page_fault+0x1d4/0x42c
>    do_mem_abort+0x4c/0xb0
>    el1_abort+0x3c/0x5c
>    el1h_64_sync_handler+0x4c/0xc4
>    el1h_64_sync+0x64/0x68
>    0xffffffc008000000
>    __handle_sysrq+0x15c/0x184
>    write_sysrq_trigger+0x94/0x128
>    proc_reg_write+0xbc/0xec
>    vfs_write+0xf0/0x2c8
>    ksys_write+0x84/0xf0
>    __arm64_sys_write+0x28/0x34
>    invoke_syscall+0x4c/0x120
>    el0_svc_common+0x94/0xfc
>    do_el0_svc+0x38/0xc0
>    el0_svc+0x2c/0x7c
>    el0t_64_sync_handler+0x48/0x114
>    el0t_64_sync+0x18c/0x190
>   Unable to handle kernel execute from non-executable memory at
> virtual address ffffffc008000000
>   Mem abort info:
>     ESR = 0x000000008600000f
>     EC = 0x21: IABT (current EL), IL = 32 bits
>     SET = 0, FnV = 0
>     EA = 0, S1PTW = 0
>     FSC = 0x0f: level 3 permission fault
>   swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000082863000
>   [ffffffc008000000] pgd=100000027ffff003, p4d=100000027ffff003,
> pud=100000027ffff003, pmd=100000027fffe003, pte=00680001001c3703
>   Internal error: Oops: 8600000f [#1] PREEMPT SMP
>
> I'm not sure if that's a sign that something is missing with your patch or not.
>
> -Doug

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  parent reply	other threads:[~2022-07-11 12:44 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-11  6:05 [PATCH v3 0/2] arm64: Fix pending single-step debugging issues Sumit Garg
2022-05-11  6:05 ` Sumit Garg
2022-05-11  6:05 ` [PATCH v3 1/2] arm64: entry: Skip single stepping into interrupt handlers Sumit Garg
2022-05-11  6:05   ` Sumit Garg
2022-05-11  6:05 ` [PATCH v3 2/2] arm64: kgdb: Set PSTATE.SS to 1 to re-enable single-step Sumit Garg
2022-05-11  6:05   ` Sumit Garg
2022-07-01 22:14 ` [PATCH v3 0/2] arm64: Fix pending single-step debugging issues Doug Anderson
2022-07-01 22:14   ` Doug Anderson
2022-07-08 16:31   ` Will Deacon
2022-07-08 16:31     ` Will Deacon
2022-07-08 16:44     ` Doug Anderson
2022-07-08 16:44       ` Doug Anderson
2022-07-11 12:48     ` Sumit Garg
2022-07-11 12:48       ` Sumit Garg
2022-07-11 12:43   ` Sumit Garg [this message]
2022-07-11 12:43     ` Sumit Garg
2022-07-11 13:47     ` Doug Anderson
2022-07-11 13:47       ` Doug Anderson
2022-07-11 13:51       ` Sumit Garg
2022-07-11 13:51         ` Sumit Garg
2022-08-04  9:18         ` Sumit Garg
2022-08-04  9:18           ` Sumit Garg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAFA6WYPmFqCXk24J71tNO=5bJHgJRAOPFr7cGq8YYLR+MWBoPw@mail.gmail.com' \
    --to=sumit.garg@linaro.org \
    --cc=catalin.marinas@arm.com \
    --cc=daniel.thompson@linaro.org \
    --cc=dianders@chromium.org \
    --cc=jason.wessel@windriver.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=liwei391@huawei.com \
    --cc=mark.rutland@arm.com \
    --cc=maz@kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.