From: Marc Zyngier <maz@kernel.org>
To: Kyle Huey <me@kylehuey.com>
Cc: open list <linux-kernel@vger.kernel.org>,
	"moderated list:ARM PORT" <linux-arm-kernel@lists.infradead.org>,
	"maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)"
	<x86@kernel.org>,
	yyc1992@gmail.com, Keno Fischer <keno@juliacomputing.com>,
	"Robert O'Callahan" <robert@ocallahan.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Borislav Petkov <bp@alien8.de>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Will Deacon <will.deacon@arm.com>
Subject: Re: arm64 equivalents of PR_SET_TSC/ARCH_SET_CPUID
Date: Sun, 22 May 2022 16:35:40 +0100
Message-ID: <87ilpxmvg3.wl-maz@kernel.org>
In-Reply-To: <CAP045ApiMSvP--f2E0=VdMbjE8oibvy921m8JASf4kaCCuU2RA@mail.gmail.com>

On Sat, 21 May 2022 21:07:14 +0100,
Kyle Huey <me@kylehuey.com> wrote:
> 
> There is ongoing work by Yichao Yu to make rr, a userspace record and
> replay debugger[0], production quality on arm64[1]. One of the bigger
> remaining issues is the kernel's emulation of accesses to certain
> system registers[2] that reflect timing and CPU capabilities and are
> either non-deterministic or can vary from processor to processor.

Just to make things clear: the kernel usually doesn't provide any
emulation for registers such as CNTVCT_EL0. On sane HW, userspace is
free to access it directly without any mediation (we only use the trap
for the sake of dealing with HW bugs).

> We
> would like to add the ability to tell the kernel to decline to emulate
> these instructions for a given task and pass that responsibility onto
> the supervising rr ptracer. There are analogous processor features and
> disabling mechanisms on x86. The RDTSC instruction is controlled by
> prctl(PR_SET_TSC) and the CPUID instruction is controlled (when the
> hardware allows) by arch_prctl(ARCH_SET_CPUID).
> 
> The questions I'd like to raise are:
> 
> 1. Is it appropriate to reuse PR_SET_TSC for roughly equivalent
> functionality on AArch64? (even if the AArch64 feature is not actually
> named Time Stamp Counter).

My gut feeling is that you really don't want to hijack an existing
API, because this is fundamentally different. The Linux arm64 ABI
mandates that the counter (and the frequency register associated with
it) are accessible, and you can't make them disappear.

From what I understand, you are relying on the TSC being disabled in
the tracee and intercepting the signal that gets delivered when it
accesses the counter. Is that correct?

Assuming I'm right, I think it'd make a lot more sense if there was a
first-class ptrace option, if only because this would require the
kernel to start trapping things that are not trapped today.

It also raises the question of the fate of CNTFRQ_EL0, since you want
to be able to replay traces from one system to another (and the
counter is meaningless without the frequency).
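
To make the dependency on the frequency concrete, here is a small
sketch of the arithmetic. The helper name ticks_to_ns and the example
frequencies in the usage note are assumptions chosen for illustration;
nothing here reads the real registers.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Convert a counter delta to nanoseconds for a counter running at
 * freq_hz. Splitting into whole seconds and a remainder keeps the
 * intermediate product within 64 bits as long as freq_hz <= 1 GHz.
 * (Very large tick counts can still overflow the final result.)
 */
uint64_t ticks_to_ns(uint64_t ticks, uint64_t freq_hz)
{
    assert(freq_hz != 0 && freq_hz <= NSEC_PER_SEC);

    uint64_t secs = ticks / freq_hz;
    uint64_t rem  = ticks % freq_hz;   /* rem < freq_hz <= 1e9 */

    return secs * NSEC_PER_SEC + rem * NSEC_PER_SEC / freq_hz;
}
```

The same delta of 24000000 ticks is one second at 24 MHz but only
24 ms at 1 GHz, so a replayed counter value has to be paired with the
frequency that was in effect when it was recorded.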

Finally, what of the VDSO, which is by far the most common user of the
counter? I can totally imagine the VDSO getting stuck if emulation is
used and the sequence counter moves synchronously with the traps
(which is why we disable the VDSO when trapping CNTVCT_EL0).
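
The failure mode can be modelled with a toy sequence-counter reader.
This is a single-threaded model under stated assumptions, not the
kernel's VDSO code: struct vdso_model, read_counter and seq_read are
hypothetical names, and the bump_on_read flag stands in for "an update
lands every time the counter read traps".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct vdso_model {
    uint32_t seq;        /* even: stable, odd: update in progress */
    uint64_t counter;    /* stand-in for the hardware counter */
    bool bump_on_read;   /* model: each counter read coincides with an update */
};

static uint64_t read_counter(struct vdso_model *m)
{
    if (m->bump_on_read)
        m->seq += 2;     /* a complete update happened during the "trap" */
    return m->counter++;
}

/*
 * Seqcount-style read: retry until the sequence is unchanged across
 * the counter read. Returns the number of attempts used, or -1 if no
 * consistent snapshot was obtained within max_tries.
 */
int seq_read(struct vdso_model *m, int max_tries, uint64_t *out)
{
    assert(max_tries > 0);

    for (int attempt = 1; attempt <= max_tries; attempt++) {
        uint32_t start = m->seq;

        if (start & 1)
            continue;            /* writer active, try again */

        uint64_t value = read_counter(m);

        if (m->seq == start) {   /* nothing changed underneath us */
            *out = value;
            return attempt;
        }
    }
    return -1;
}
```

With bump_on_read clear the reader succeeds on the first attempt; with
it set, every attempt is invalidated and the loop never converges,
which is the "getting stuck" case.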

> 2. Likewise for ARCH_SET_CPUID

We don't just emulate a single register, but a whole class of them. If
you are to present a different view for any of those, you'll need to
handle the lot (I really can't see why one would be more important
than the others).

So SET_CPUID really is the wrong tool. I'd rather there was (again) an
API that described exactly that.
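
To illustrate the "whole class" point, the sketch below classifies an
MRS instruction word by its system register encoding. The helper names
are made up, and the predicate is an approximation: it treats
Op0=3, Op1=0, CRn=0, CRm=0..7 as the ID register space, whereas the set
the kernel actually emulates has varied between versions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Pack (op0, op1, CRn, CRm, op2) as they sit in bits [20:5] of MRS. */
#define SYSREG(op0, op1, crn, crm, op2) \
    ((uint32_t)(((op0) << 14) | ((op1) << 11) | ((crn) << 7) | \
                ((crm) << 3) | (op2)))

static_assert(SYSREG(3, 3, 14, 0, 2) == 0xDF02, "CNTVCT_EL0 encoding");
static_assert(SYSREG(3, 0, 0, 0, 0) == 0xC000, "MIDR_EL1 encoding");

/* True if the 32-bit instruction word is MRS Xt, <sysreg>. */
static bool is_mrs(uint32_t insn)
{
    return (insn & 0xFFF00000u) == 0xD5300000u;
}

/* True if insn reads a register in the (approximate) ID register space. */
bool is_id_space_read(uint32_t insn)
{
    if (!is_mrs(insn))
        return false;

    uint32_t enc = (insn >> 5) & 0xFFFFu;
    uint32_t op0 = enc >> 14;
    uint32_t op1 = (enc >> 11) & 0x7u;
    uint32_t crn = (enc >> 7) & 0xFu;
    uint32_t crm = (enc >> 3) & 0xFu;

    return op0 == 3 && op1 == 0 && crn == 0 && crm <= 7;
}
```

A single predicate like this covers MIDR_EL1 and the ID_AA64* registers
alike, while CNTVCT_EL0 falls outside it, which is why a one-register
switch in the style of ARCH_SET_CPUID does not map well.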

> 3. Since arch_prctl is x86-only, does it make more sense to add
> arch_prctl to arm64 or to duplicate ARCH_SET_CPUID into the prctl
> world? (e.g. a PR_SET_CPUID that works on both x86/arm64)

I don't think either applies here. Different architectures have
different ABI requirements, and you can't really merge the two: the
next thing you know, you'll ask for the same thing for PMU registers,
and try to map them onto something else.

Overall, this would be better served by a framework for userspace
delegation of sysreg access by a ptrace'd process. Let's try to look
at it in those terms rather than casting arm64 into a seemingly
unrelated API.
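
As a rough idea of what the tracer side of such delegation might look
like, here is a sketch of emulating one trapped MRS from a table of
recorded values. Everything in it is hypothetical: no such ptrace
interface exists in this thread, and struct tracee_regs, struct
sysreg_value and emulate_mrs are names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One recorded register value; enc is bits [20:5] of the MRS word. */
struct sysreg_value {
    uint32_t enc;
    uint64_t value;
};

/* Hypothetical view of the tracee's general-purpose registers. */
struct tracee_regs {
    uint64_t x[31];
    uint64_t pc;
};

/*
 * If insn is an MRS of a register present in the table, write the
 * recorded value to Xt and step past the instruction. Returns false
 * (leaving regs untouched) when the access is not one we handle.
 */
bool emulate_mrs(uint32_t insn, struct tracee_regs *regs,
                 const struct sysreg_value *table, size_t n)
{
    assert(regs != NULL && (table != NULL || n == 0));

    if ((insn & 0xFFF00000u) != 0xD5300000u)
        return false;                      /* not MRS */

    uint32_t enc = (insn >> 5) & 0xFFFFu;
    unsigned int rt = insn & 0x1Fu;

    for (size_t i = 0; i < n; i++) {
        if (table[i].enc != enc)
            continue;
        if (rt != 31)                      /* Rt == 31 is XZR: discard */
            regs->x[rt] = table[i].value;
        regs->pc += 4;                     /* A64 instructions are 4 bytes */
        return true;
    }
    return false;
}
```

The table is keyed by encoding, not by a per-register flag, so the
same path would serve the counter, the frequency register and the ID
registers.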

Thanks,

	M.

> 
> - Kyle
> 
> [0] https://rr-project.org/
> [1] https://github.com/rr-debugger/rr/issues/3234
> [2] e.g. CNTVCT_EL0 and MIDR_EL1, among others
> 

-- 
Without deviation from the norm, progress is not possible.
