All of lore.kernel.org
 help / color / mirror / Atom feed
From: Luis Machado <luis.machado@linaro.org>
To: Will Deacon <will@kernel.org>
Cc: mark.rutland@arm.com, linux-arm-kernel@lists.infradead.org
Subject: Re: [arm64, debug] PTRACE_SINGLESTEP does not single-step a valid instruction
Date: Mon, 18 Nov 2019 11:54:58 -0300	[thread overview]
Message-ID: <b3a9ae7e-8a45-7c14-7bc6-1d3b62728a0c@linaro.org> (raw)
In-Reply-To: <20191118131525.GA4180@willie-the-truck>

Hi Will,

Thanks for the thorough explanation.

On 11/18/19 10:15 AM, Will Deacon wrote:
> Hi Luis,
> 
> [+Mark for the valid_user_regs() part]
> 
> On Tue, Nov 12, 2019 at 08:22:10PM -0300, Luis Machado wrote:
>> I've noticed, under very specific conditions, that a PTRACE_SINGLESTEP
>> request by GDB won't execute the underlying instruction. As a consequence,
>> the PC doesn't move, but we return a SIGTRAP just like we would for a
>> regular successful PTRACE_SINGLESTEP request.
>>
>> Since there are no software breakpoints inserted at PC (we are actually
>> stepping over a breakpoint, so GDB removes the breakpoint at PC before
>> issuing a PTRACE_SINGLESTEP request), this is an odd behavior.
>>
>> Though not too harmful, i see this manifesting in the GDB testsuite
>> (gdb.reverse/insn-reverse.exp), which throws the test off by making GDB
>> think it is further in the instruction stream than it really is. In fact, we
>> get lucky here and no FAIL's show up, only many more spurious PASSes.
> 
> I managed to reproduce this locally and I think I've figured out what's
> going on, although I'm not sure that the kernel is the best place to fix
> it.
> 
> Looking at the specific reproducer:
> 
>> Execute gdb like so:
>>
>> gdb -ex "set displaced-stepping off" -ex "b load" -ex "run" -ex "record" -ex
>> "si" -ex "rsi" -ex "record stop" insn-reverse
> 
> So we've got a couple of instructions as follows (it doesn't actually matter
> what they are, so I've changed the LD1 in your binary for a NOP in order to
> avoid confusion with the "load" label not actually pointing at a load):
> 
> 	0x7b8:		mov	// "load"
> 	0x7bc:		nop
> 
> "b load" places a breakpoint at 0x7b8:
> 
> 	ptrace(PTRACE_POKEDATA, 662, 0xaaaaaaaaa7b8, 0xd503201fd4200000) = 0
> 
> We run to a software breakpoint on "load" (the mov instruction). We take
> the trap and try to execute the "si", which means we need to remove the
> breakpoint while we step over it:
> 
> 	ptrace(PTRACE_POKEDATA, 662, 0xaaaaaaaaa7b8, 0xd503201f910003e0) = 0
> 	[...]
> 	ptrace(PTRACE_SINGLESTEP, 662, 0x1, 0)  = 0
> 
> This causes the kernel to arm the single-step state machine so that
> MDSCR_EL1.SS == SPSR_EL1.SS == 1 (known as "active-not-pending"). Running
> an instruction in userspace will transition to MDSCR_EL1.SS ==1 and
> SPSR_EL1.SS == 0 (known as "active-pending"), which will cause the trap to
> trigger, at which point gdb puts the breakpoint instruction back since the
> step is complete:

So, just to confirm my understanding, we have a couple bits controlling 
single-stepping in the kernel, one in MDSCR_EL1 and another in SPSR_EL1. 
GDB doesn't have direct access to any of those, correct?

Instead, GDB has access to a SS bit in the reserved 21~22 range of CPSR.

The transition from active-not-pending to active-pending takes place via 
a single PTRACE_SINGLESTEP request? Is that correct?

> 
> 	ptrace(PTRACE_POKEDATA, 662, 0xaaaaaaaaa7b8, 0xd503201fd4200000) = 0
> 
> This is where things start to go wrong. The "rsi" command attempts to
> perform a reverse step, which means restoring the old state when we were
> previously executing at 0x7b8. It starts by removing the breakpoint again,
> since we've already hit that:
> 
> 	ptrace(PTRACE_POKEDATA, 662, 0xaaaaaaaaa7b8, 0xd503201f910003e0) = 0
> 
> and then resets the CPU registers to their old values:
> 
> 	(I don't know why it does this three times)
> 	ptrace(PTRACE_SETREGSET, 662, NT_PRSTATUS, [{iov_base=0xffffff64b3c8, iov_len=272}]) = 0
> 	ptrace(PTRACE_SETREGSET, 662, NT_PRSTATUS, [{iov_base=0xffffff64b3c8, iov_len=272}]) = 0
> 	ptrace(PTRACE_SETREGSET, 662, NT_PRSTATUS, [{iov_base=0xffffff64b3c8, iov_len=272}]) = 0
> 
> The problem with this is that we have moved the PC back to 0x7b8 but we have
> also cleared SPSR_EL1.SS to 0. Internally, the kernel hasn't seen stepping
> get disabled (this usually happens by PTRACE_CONT calling
> user_disable_single_step()) which means that MDSCR_EL1.SS remains set to 1
> and we're in the active-pending state! Consequently, we immediately take a
> step exception if a step operation is attempted >

While trying to reproduce this, i was paying attention to the SS bit 
coming and going. But in the particular sequence of si/rsi, within the 
record boundaries, i see GDB just restored the original CPSR value to 
what it was before we processed the si command.

 From GDB's POV all state was restore to the way it was before and we're 
good to go.

Is this not enough to restore state kernel-wise?

> Now, we *could* consider hacking the TIF_SINGLESTEP check in
> valid_user_regs() so that SPSR_EL1.SS is preserved when stepping is active
> but this is a user-visible change and may break things like stepping out of
> signal handlers. I would prefer that GDB manages the SS bit explicitly in
> this scenario, by setting it to 1 when restoring the old state in the
> reverse step, a bit like when it disables the old breakpoint. You can
> emulate this by doing:

I think we could let GDB control this when required, but I'm trying to 
understand the ramifications of letting GDB do so.

For example, what if the user decides to alter the PC here and there, 
for debugging purposes. That is a use case that happens often, in order 
to go back or skip some parts of the code.

Would we need to pay attention to the SS bit in those cases as well?

> 
> 	(gdb) set $cpsr |= (1<<21)

In particular, what does the switching of this bit accomplishes in the 
kernel? Would we be better off forcing the SS bit every time we do a 
single-step operation, for example?

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2019-11-18 14:55 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-12 23:22 [arm64, debug] PTRACE_SINGLESTEP does not single-step a valid instruction Luis Machado
2019-11-18 13:15 ` Will Deacon
2019-11-18 14:54   ` Luis Machado [this message]
2019-11-26 16:35     ` Luis Machado
2019-12-10 20:00       ` Luis Machado
2020-02-13 12:01         ` Will Deacon
2020-02-13 17:07           ` Luis Machado
2020-02-14 15:45             ` Luis Machado
2020-02-18  8:44               ` Will Deacon
2020-02-18 10:33                 ` Luis Machado
2020-02-26 13:01                   ` Luis Machado
2020-02-20 13:02           ` Mark Rutland
2020-02-20 13:29             ` Will Deacon
2020-02-21 11:16               ` Mark Rutland
2020-05-27 14:39                 ` Luis Machado
2020-05-31  9:52                 ` Will Deacon
2020-01-13 18:13       ` Luis Machado

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b3a9ae7e-8a45-7c14-7bc6-1d3b62728a0c@linaro.org \
    --to=luis.machado@linaro.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=mark.rutland@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.