linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Metzger, Markus T" <markus.t.metzger@intel.com>
To: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	"Bae, Chang Seok" <chang.seok.bae@intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"bp@alien8.de" <bp@alien8.de>, "hpa@zytor.com" <hpa@zytor.com>,
	"Hansen, Dave" <dave.hansen@intel.com>,
	"Luck, Tony" <tony.luck@intel.com>,
	"Shankar, Ravi V" <ravi.v.shankar@intel.com>,
	Pedro Alves <palves@redhat.com>, Simon Marchi <simark@simark.ca>,
	Andi Kleen <ak@linux.intel.com>
Subject: RE: [PATCH v9 00/17] Enable FSGSBASE instructions
Date: Tue, 10 Dec 2019 08:27:31 +0000	[thread overview]
Message-ID: <A78C989F6D9628469189715575E55B236B519A77@IRSMSX104.ger.corp.intel.com> (raw)
In-Reply-To: <CALCETrWb9jvwOPuupet4n5=JytbS-x37bnn=THniv_d8cNvf_Q@mail.gmail.com>

> > > The general kernel rule is that we don't break working applications.
> > > Other than that, we're allowed to change the ABI if existing working
> > > applications don't break.  I can't tell whether you wrote a test that
> > > detects a behavior change or whether you wrote a test that tests
> > > behavior that gdb or other programs actually rely on.
> >
> > Well, that's a tough question.  The test covers GDB's behavior on today's
> > systems.  GDB itself does not actually rely on that behavior.  That is, GDB
> > itself wouldn't break.  You couldn't do all that you could do with it before,
> > though.
> 
> GDB does rely on at least some behavior.  If I tell gdb to call a
> function on my behalf, doesn't it save the old state, call the
> function, and then restore the state?  Surely it expects the restore
> operation to actually restore the state.

It does.  If we managed to break that, inferior calls in GDB would be
broken.  Users who don't use inferior calls wouldn't know or care,
though.  That's the point I was trying to make previously.


> It also helps that very, very few 64-bit applications use nonzero
> segments at all.  They used to because of a kernel optimization to
> automatically load a segment if an FS or GSBASE less than 4GB was
> requested, but that's been gone for a while.  Calling
> set_thread_area() at all in a 64-bit program requires considerable
> gymnastics, and distributions can and do disable modify_ldt() outright
> without significant ill effects.
> 
> So we're mostly talking about compatibility with 32-bit programs and
> exotic users like Wine and DOSEMU.

I agree that this should mostly affect 32-bit programs.


> > > Certainly, with a 32-bit *gdb*, writing a nonzero value to FS or GS
> > > using ptrace should change the base accordingly.  I think the current
> > > patches get this wrong.
> > >
> > > With a 64-bit gdb and a 32-bit inferior, in an ideal world, everything
> > > would work just like full 64-bit, since that's how the hardware works.
> >
> > Not sure what you mean.  The h/w runs in compatibility mode and the
> > inferior cannot set the base directly, can it?
> 
> I think there's a general impedance mismatch between gdb and the
> kernel/hw here.  On Linux on a 64-bit machine, there's isn't really a
> strong concept of a "32-bit process" versus a "64-bit process".  All
> tasks have 64-bit values in RAX, all tasks have R8-R15, all tasks have
> a GDT and an LDT, etc.  "32-bit tasks" are merely tasks that happen to
> be running with a compatibility selector loaded into CS at the time.
> Tasks can and do switch freely between compatibility and long mode
> using LJMP or LRET.  As far as I can tell, however, gdb doesn't really
> understand this and thinks that 32-bit tasks are their own special
> thing.
> 
> This causes me real problems: gdb explodes horribly if I connect gdb
> to QEMU's gdbserver (qemu -s) and try to debug during boot when the
> inferior switches between 32-bit and long mode.
> 
> As far as FSGSBASE goes, a "32-bit task" absolutely can set
> independent values in FS and FSBASE, although it's awkward to do so:
> the task would have to do a far transfer to long mode, then WRFSBASE,
> then far transfer back to compat mode.  But this entire sequence of
> events could occur without entering the kernel at all, and the ptrace
> API should be able to represent the result.  I think that, ideally, a
> 64-bit debugger would understand the essential 64-bitness of even
> compat tasks and work sensibly.  I don't really expect gdb to be able
> to do this any time soon, though.

I guess the primary use-case would be an application that was originally
written for 32-bit and is being maintained since then.  GDB is probably
64-bit in that case.


> > We had discussed this some time ago and proposed the following behavior: "
> > https://lore.kernel.org/lkml/1521481767-22113-15-git-send-email-
> chang.seok.bae@intel.com/
> >
> >         In a summary, ptracer's update on FS/GS selector and base
> >         yields such results on tracee's base:
> >         - When FS/GS selector only changed (to nonzero), fetch base
> >         from GDT/LDT (legacy behavior)
> >         - When FS/GS base (regardless of selector) changed, tracee
> >         will have the base
> > "
> 
> Indeed.  But I never understood how this behavior could be implemented
> with the current ABI.  As I understand it, gdb only ever sets the
> inferior register state by using a single ptrace() call to load the
> entire state, which means that the kernel does not know whether just
> FS is being written or whether FS and FSBASE are being written.

GDB writes the register state as soon as the user changes one of them.


> What actual ptrace() call does gdb use when a 64-bit gdb debugs a
> 64-bit inferior?  How about a 32-bit inferior?

GDB uses GETREGS both for 64-bit and 32-bit inferiors.  If GETREGS is
not available, it errors out on 64-bit and falls back to PEEKUSER on 32-bit.


> > The ptracer would need to read registers back after changing the selector
> > to get the updated base.
> 
> What would the actual API be?

GETREGS and PEEKUSER.


> I think it could make sense to add a whole new ptrace() command to
> tell the tracee to, in effect, MOV a specified value to a segment
> register.  This call would have the actual correct semantics in which
> it would return an error code if the specified value is invalid and
> would return 0 on success.  And then a second ptrace() call could be
> issued to read out FSBASE or GSBASE if needed.  Would this be useful?
> What gdb commands would invoke it?

Could SETREGS handle it based on the above proposal?


> > The only time when both change at the same time, then, is when registers
> > are restored after returning from an inferior call.  And then, it's the base
> > we want to take priority since we previously ensured that the base is always
> > up-to-date.
> 
> Right.  But how does the kernel tell the difference?

The other times only one changes.  Could the kernel compare the old and new
values for selector and base and detect if one or both change at the same time?

Regards,
Markus.
Intel Deutschland GmbH
Registered Address: Am Campeon 10-12, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de
Managing Directors: Christin Eisenschmid, Gary Kershaw
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928

  reply	other threads:[~2019-12-10  8:27 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-04 18:15 [PATCH v9 00/17] Enable FSGSBASE instructions Chang S. Bae
2019-10-04 18:15 ` [PATCH v9 01/17] x86/ptrace: Prevent ptrace from clearing the FS/GS selector Chang S. Bae
2019-10-04 18:15 ` [PATCH v9 02/17] selftests/x86/fsgsbase: Test GS selector on ptracer-induced GS base write Chang S. Bae
2019-10-04 18:15 ` [PATCH v9 03/17] x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE Chang S. Bae
2019-10-04 18:15 ` [PATCH v9 04/17] x86/entry/64: Clean up paranoid exit Chang S. Bae
2019-10-04 18:15 ` [PATCH v9 05/17] x86/entry/64: Switch CR3 before SWAPGS in paranoid entry Chang S. Bae
2019-10-04 18:15 ` [PATCH v9 06/17] x86/entry/64: Introduce the FIND_PERCPU_BASE macro Chang S. Bae
2019-10-04 18:15 ` [PATCH v9 07/17] x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit Chang S. Bae
2019-10-04 18:16 ` [PATCH v9 08/17] x86/entry/64: Document GSBASE handling in the paranoid path Chang S. Bae
2019-10-04 18:16 ` [PATCH v9 09/17] x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions Chang S. Bae
2019-10-04 18:16 ` [PATCH v9 10/17] x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions Chang S. Bae
2019-10-04 18:16 ` [PATCH v9 11/17] x86/fsgsbase/64: Use FSGSBASE in switch_to() if available Chang S. Bae
2019-10-04 18:16 ` [PATCH v9 12/17] x86/fsgsbase/64: Use FSGSBASE instructions on thread copy and ptrace Chang S. Bae
2019-10-04 18:16 ` [PATCH v9 13/17] x86/speculation/swapgs: Check FSGSBASE in enabling SWAPGS mitigation Chang S. Bae
2019-10-04 18:16 ` [PATCH v9 14/17] selftests/x86/fsgsbase: Test ptracer-induced GS base write with FSGSBASE Chang S. Bae
2019-10-04 18:16 ` [PATCH v9 15/17] x86/fsgsbase/64: Enable FSGSBASE on 64bit by default and add a chicken bit Chang S. Bae
2019-10-04 18:16 ` [PATCH v9 16/17] x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2 Chang S. Bae
2019-10-04 18:16 ` [PATCH v9 17/17] Documentation/x86/64: Add documentation for GS/FS addressing mode Chang S. Bae
2019-10-04 22:54   ` Randy Dunlap
2019-11-15 18:29 ` [PATCH v9 00/17] Enable FSGSBASE instructions Thomas Gleixner
2019-11-15 19:12   ` Andi Kleen
2019-11-29 14:56     ` Metzger, Markus T
2019-11-29 16:51       ` Andy Lutomirski
2019-12-02  8:23         ` Metzger, Markus T
2019-12-04 20:20           ` Andy Lutomirski
2019-12-10  8:27             ` Metzger, Markus T [this message]
2020-02-24 18:02             ` Bae, Chang Seok
2020-04-13 20:03               ` Sasha Levin
2020-04-14  0:32                 ` Andi Kleen
2020-04-17 13:30                   ` Sasha Levin
2020-04-17 15:52                     ` Andy Lutomirski
2020-04-20 14:13                       ` Andi Kleen
2020-04-20 17:14                         ` Thomas Gleixner
2020-04-21 16:06                           ` Sasha Levin
2020-04-21 16:49                             ` Andy Lutomirski
2020-04-21 20:02                               ` Andi Kleen
2020-04-21 17:15                             ` Bae, Chang Seok
2020-04-21 19:56                             ` Andi Kleen
2020-04-21 20:21                               ` Andy Lutomirski
2020-04-21 20:51                                 ` Sasha Levin
2020-04-22 23:00                                   ` Andy Lutomirski
2020-04-23  4:08                                     ` Sasha Levin
2020-04-25 22:39                                       ` Thomas Gleixner
2020-04-26  2:52                                         ` Sasha Levin
2020-04-26 10:04                                           ` Thomas Gleixner
2020-04-14 15:47                 ` Bae, Chang Seok

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=A78C989F6D9628469189715575E55B236B519A77@IRSMSX104.ger.corp.intel.com \
    --to=markus.t.metzger@intel.com \
    --cc=ak@linux.intel.com \
    --cc=bp@alien8.de \
    --cc=chang.seok.bae@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=palves@redhat.com \
    --cc=ravi.v.shankar@intel.com \
    --cc=simark@simark.ca \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).