All of lore.kernel.org
 help / color / mirror / Atom feed
From: Keno Fischer <keno@juliacomputing.com>
To: Andrei Vagin <avagin@gmail.com>
Cc: Will Deacon <will@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Oleg Nesterov <oleg@redhat.com>,
	linux-arm-kernel@lists.infradead.org,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-api@vger.kernel.org,
	Anthony Steinhauser <asteinhauser@google.com>,
	Dave Martin <Dave.Martin@arm.com>, Kyle Huey <khuey@pernos.co>,
	"Robert O'Callahan" <roc@pernos.co>
Subject: Re: [PATCH 0/3 v2] arm64/ptrace: allow to get all registers on syscall traps
Date: Mon, 8 Feb 2021 14:18:28 -0500	[thread overview]
Message-ID: <CABV8kRzv54fxjXAvu1e8JYBZ7GQc_G-h7OSDWLCQ3NLA-5Ugug@mail.gmail.com> (raw)
In-Reply-To: <20210208183752.GB559391@gmail.com>

>
> Could you describe the problem in more details? I wonder whether we have
> the same thing in CRIU...
>

Sure, basically the issue is that orig_x0 gets recorded at the start of the
syscall entry, but is not otherwise part of the ptrace state. It used
to be primarily used for resetting the argument back to its original
value during syscall restarts, but I see it's been expanded slightly
with the user dispatch mechanism (though as far as I can tell
not in a way that interacts with ptrace). Basically the problem
is that if you change the value of x0 during either the ptrace
entry stop or a seccomp stop and then incur a syscall restart,
the syscall will restart with the original x0 rather than with
the modified x0, which may be unexpected. Of course,
relatedly, if you're doing CRIU-like things you can end up
in situations where the future behavior will depend on the
orig_x0 value, which isn't restore-able at the moment. It's
possible to work around all of this by keeping a local copy
of orig_x0 and being very careful with the ptrace traps around
restarts, but getting the logic right is extremely tricky. My
suggestion for what I thought would be reasonable
behavior was:

1. Expose orig_x0 to ptrace
2. Set orig_x0 to x0 and set x0 to -ENOSYS at the start of the syscall
dispatcher
3. Use orig_x0 for syscall arguments/seccomp/restarts

That's basically how rax works on x86_64 and it doesn't
seem to cause major problems (though of course I may
be biased by having x86_64 already work when I started the
aarch64 port). Just the first item would be sufficient of course
for getting rid of most of the bookkeeping. I should also say
that, for us, the ptrace getregs call can be the throughput
limiting operation, so it would be nice if getting the entire
basic register set would only require one syscall. I won't
insist on it, since we do have a solution in place that kinda
works (and only requires the one syscall),
but I thought I'd mention it.

While we're on this topic, and in case it's helpful to anybody,
I should also point out that the order of the ptrace-signal-stop,
vs setup for the syscall restart differs between x86 and
aarch64 (aarch64 sets up the restart first then delivers the
ptrace trap/signal - x86 the other way around). I actually
think the aarch64 behavior is saner here, but I figured I'd
leave this breadcrumb for anybody who's writing a ptracer
and stumbles across this.

Keno

WARNING: multiple messages have this Message-ID (diff)
From: Keno Fischer <keno@juliacomputing.com>
To: Andrei Vagin <avagin@gmail.com>
Cc: Kyle Huey <khuey@pernos.co>,
	Anthony Steinhauser <asteinhauser@google.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Oleg Nesterov <oleg@redhat.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Robert O'Callahan <roc@pernos.co>,
	linux-api@vger.kernel.org, Will Deacon <will@kernel.org>,
	Dave Martin <Dave.Martin@arm.com>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH 0/3 v2] arm64/ptrace: allow to get all registers on syscall traps
Date: Mon, 8 Feb 2021 14:18:28 -0500	[thread overview]
Message-ID: <CABV8kRzv54fxjXAvu1e8JYBZ7GQc_G-h7OSDWLCQ3NLA-5Ugug@mail.gmail.com> (raw)
In-Reply-To: <20210208183752.GB559391@gmail.com>

>
> Could you describe the problem in more details? I wonder whether we have
> the same thing in CRIU...
>

Sure, basically the issue is that orig_x0 gets recorded at the start of the
syscall entry, but is not otherwise part of the ptrace state. It used
to be primarily used for resetting the argument back to its original
value during syscall restarts, but I see it's been expanded slightly
with the user dispatch mechanism (though as far as I can tell
not in a way that interacts with ptrace). Basically the problem
is that if you change the value of x0 during either the ptrace
entry stop or a seccomp stop and then incur a syscall restart,
the syscall will restart with the original x0 rather than with
the modified x0, which may be unexpected. Of course,
relatedly, if you're doing CRIU-like things you can end up
in situations where the future behavior will depend on the
orig_x0 value, which isn't restore-able at the moment. It's
possible to work around all of this by keeping a local copy
of orig_x0 and being very careful with the ptrace traps around
restarts, but getting the logic right is extremely tricky. My
suggestion for what I thought would be reasonable
behavior was:

1. Expose orig_x0 to ptrace
2. Set orig_x0 to x0 and set x0 to -ENOSYS at the start of the syscall
dispatcher
3. Use orig_x0 for syscall arguments/seccomp/restarts

That's basically how rax works on x86_64 and it doesn't
seem to cause major problems (though of course I may
be biased by having x86_64 already work when I started the
aarch64 port). Just the first item would be sufficient of course
for getting rid of most of the bookkeeping. I should also say
that, for us, the ptrace getregs call can be the throughput
limiting operation, so it would be nice if getting the entire
basic register set would only require one syscall. I won't
insist on it, since we do have a solution in place that kinda
works (and only requires the one syscall),
but I thought I'd mention it.

While we're on this topic, and in case it's helpful to anybody,
I should also point out that the order of the ptrace-signal-stop,
vs setup for the syscall restart differs between x86 and
aarch64 (aarch64 sets up the restart first then delivers the
ptrace trap/signal - x86 the other way around). I actually
think the aarch64 behavior is saner here, but I figured I'd
leave this breadcrumb for anybody who's writing a ptracer
and stumbles across this.

Keno

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2021-02-08 20:38 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-01 19:40 [PATCH 0/3 v2] arm64/ptrace: allow to get all registers on syscall traps Andrei Vagin
2021-02-01 19:40 ` Andrei Vagin
2021-02-01 19:40 ` [PATCH 1/3] arm64/ptrace: don't clobber task registers on syscall entry/exit traps Andrei Vagin
2021-02-01 19:40   ` Andrei Vagin
2021-02-04 15:23   ` Will Deacon
2021-02-04 15:23     ` Will Deacon
2021-02-04 16:41     ` Dave Martin
2021-02-04 16:41       ` Dave Martin
2021-02-25 16:00     ` Andrei Vagin
2021-02-25 16:00       ` Andrei Vagin
2021-02-01 19:40 ` [PATCH 2/3] arm64/ptrace: introduce PTRACE_O_ARM64_RAW_REGS Andrei Vagin
2021-02-01 19:40   ` Andrei Vagin
2021-02-04 15:36   ` Will Deacon
2021-02-04 15:36     ` Will Deacon
2021-02-08 18:31     ` Andrei Vagin
2021-02-08 18:31       ` Andrei Vagin
2021-02-01 19:40 ` [PATCH 3/3] selftest/arm64/ptrace: add tests for PTRACE_O_ARM64_RAW_REGS Andrei Vagin
2021-02-01 19:40   ` Andrei Vagin
2021-02-04 15:40   ` Will Deacon
2021-02-04 15:40     ` Will Deacon
2021-02-10 20:54     ` Kees Cook
2021-02-10 20:54       ` Kees Cook
2021-02-02  0:11 ` [PATCH 0/3 v2] arm64/ptrace: allow to get all registers on syscall traps Keno Fischer
2021-02-02  0:11   ` Keno Fischer
2021-02-08 18:37   ` Andrei Vagin
2021-02-08 18:37     ` Andrei Vagin
2021-02-08 19:18     ` Keno Fischer [this message]
2021-02-08 19:18       ` Keno Fischer
2021-02-04 14:53 ` Will Deacon
2021-02-04 14:53   ` Will Deacon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CABV8kRzv54fxjXAvu1e8JYBZ7GQc_G-h7OSDWLCQ3NLA-5Ugug@mail.gmail.com \
    --to=keno@juliacomputing.com \
    --cc=Dave.Martin@arm.com \
    --cc=asteinhauser@google.com \
    --cc=avagin@gmail.com \
    --cc=catalin.marinas@arm.com \
    --cc=khuey@pernos.co \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=roc@pernos.co \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.