Linux-man Archive on lore.kernel.org
 help / color / Atom feed
From: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
To: Keno Fischer <keno@juliacomputing.com>
Cc: linux-man <linux-man@vger.kernel.org>,
	Andy Lutomirski <luto@kernel.org>,
	Dave Hansen <dave.hansen@intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Andi Kleen <andi@firstfloor.org>,
	Denys Vlasenko <vda.linux@gmail.com>,
	Michael Kerrisk <mtk.manpages@gmail.com>
Subject: Re: [PATCH] ptrace.2: Describe PTRACE_SET/GETREGSET on NT_X86_XSTATE
Date: Tue, 19 May 2020 22:44:01 +0200
Message-ID: <CAKgNAkiLv-uhKdXRcY=5ihfRrTVnpoti46brBb2EXcQ4n8CFbA@mail.gmail.com> (raw)
In-Reply-To: <20200518030053.GA72528@juliacomputing.com>

[CC += Denys, since he's had a lot of input to ptrace(2) in the past,
and perhaps might also have a comment to this patch]

On Mon, 18 May 2020 at 05:00, Keno Fischer <keno@juliacomputing.com> wrote:
>
> Correctly using the result of this operation is quite hard,
> because the layout is not fixed and depends on the kernel
> configuration. Furthermore, because of the initial state
> optimization, parts of the layout may be missing. If ptrace
> users are not careful, it is easy to get unexpected results.
> This documents everything I know about how to use NT_X86_XSTATE
> "correctly". This should probably have been documented earlier,
> since every single ptrace application I looked at gets this wrong
> in one way or another, but hopefully having documentation will at
> least help future users cover the relevant corner cases.
>
> Signed-off-by: Keno Fischer <keno@juliacomputing.com>
>
> ---
> I'm hoping this will help. I recently had occasion to read up on
> how this actually works (finding I, too, had used it incorrectly),
> for patch in https://lkml.org/lkml/2020/4/6/1161. I'm cc'ing the
> folks who took part in that review here, since I think they would
> be interested in making sure the status quo is adequately documented.
> Please let me know if I got anything wrong, or if anything is confusing.
>
>  man2/ptrace.2 | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 57 insertions(+)
>
> diff --git a/man2/ptrace.2 b/man2/ptrace.2
> index 575062971..57958119b 100644
> --- a/man2/ptrace.2
> +++ b/man2/ptrace.2
> @@ -2322,6 +2322,63 @@ result, to the real parent (to the real parent only when the
>  whole multithreaded process exits).
>  If the tracer and the real parent are the same process,
>  the report is sent only once.
> +.SS The layout and operation of the NT_X86_XSTATE regset
> +On x86(_64), the values of extended registers can be obtained as an xstate buffer,
> +accessed through the NT_X86_XSTATE option to
> +.B PTRACE_GETREGSET.
> +The layout of this area is relatively complex (and described below). It is
> +in general not safe to assume that a buffer obtained using
> +.B PTRACE_GETREGSET
> +may be set back to any task using
> +.B PTRACE_SETREGSET
> +while resulting in a task that has equivalent register state (see below for
> +details). It is also not safe to assume that the buffer is a valid xsave area
> +that may be restored using the
> +.I xrstor
> +instruction, nor is it safe to assume that any extended state component is
> +located at a particular fixed offset. Instead the following algorithm should be used to
> +compute the offset of any given state component within the xsave buffer:
> +.IP 1. 3
> +Obtain the kernel xsave component bitmask from the software-reserved area of the
> +xstate buffer. The software-reserved area beings at offset 464 into the xsave
> +buffer and the first 64 bits of this area contain the kernel xsave component bitmask
> +.IP 2.
> +Compute the offset of each state component by adding the sizes of all prior state
> +components that are enabled in the kernel xsave component bitmask, aligning to 64 byte boundaries along the way. This
> +format matches that of a compacted xsave area with XCOMP_BV set to the
> +kernel component bitmask. Further details on the layout of the compacted xsave
> +area may be found in the Intel architecture manual, but note that the xsave
> +buffer returned from ptrace will have its XCOMP_BV set to 0.
> +.IP 3.
> +For the given state component of interest, check the corresponding bit
> +in the xsave header's XSTATE_BV bitfield. If this bit is zero, the corresponding
> +component is in initial state and was not written to the buffer (i.e. the kernel
> +does not touch the memory corresponding to this state component at all,
> +the start offset next active state component will not be affected unless
> +the bit is also missing from the kernel component bitmask obtained in step 1).
> +The initial state for any state component is defined in the Intel architecture manual (for
> +most state components it is the zero state).
> +.PP
> +
> +In particular, the third of these considerations results in a buffer that does
> +not round-trip through
> +.B PTRACE_SETREGSET.
> +If a given state component is missing from the XSTATE_BV bitfield, it will
> +be ignored by
> +.B PTRACE_SETREGSET
> +even if the corresponding register in the target task is currently not in
> +initial state.
> +
> +Thus, to obtain an xsave area that may be set back to the tracee, all unused
> +state components must first be re-set to the correct initial state for the
> +corresponding state component, and the XSTATE_BV bitfield must subsequently
> +be adjusted to match the kernel xstate component bitmask (obtained as
> +described above).
> +
> +The value of the kernel's state component bitmask is determined on boot and
> +need not be equivalent to the maximal set of state components supported by the
> +CPU (as enumerated through CPUID).
> +
>  .SH RETURN VALUE
>  On success, the
>  .B PTRACE_PEEK*
> --
> 2.25.1
>


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

  reply index

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-18  3:00 Keno Fischer
2020-05-19 20:44 ` Michael Kerrisk (man-pages) [this message]
2020-05-19 21:29   ` Denys Vlasenko
2020-05-19 22:46     ` Keno Fischer
2020-05-20 10:03       ` Denys Vlasenko
2020-05-20  1:19 ` Andi Kleen
2020-05-20  3:30   ` Keno Fischer
2020-05-20  5:08     ` Michael Kerrisk (man-pages)
2020-05-20 13:56     ` Andi Kleen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAKgNAkiLv-uhKdXRcY=5ihfRrTVnpoti46brBb2EXcQ4n8CFbA@mail.gmail.com' \
    --to=mtk.manpages@gmail.com \
    --cc=andi@firstfloor.org \
    --cc=bp@alien8.de \
    --cc=dave.hansen@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=keno@juliacomputing.com \
    --cc=linux-man@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=vda.linux@gmail.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-man Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-man/0 linux-man/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-man linux-man/ https://lore.kernel.org/linux-man \
		linux-man@vger.kernel.org
	public-inbox-index linux-man

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-man


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git