From: David Laight <David.Laight@ACULAB.COM>
To: "'Eric W. Biederman'" <ebiederm@xmission.com>,
	Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	X86 ML <x86@kernel.org>
Subject: RE: in_compat_syscall() on x86
Date: Mon, 4 Jan 2021 22:34:48 +0000
Message-ID: <fe2629460b4e4b44a120a8b56efe0ac1@AcuMS.aculab.com>
In-Reply-To: <87sg7gfnaa.fsf@x220.int.ebiederm.org>

From: Eric W. Biederman
> Sent: 04 January 2021 20:41
> 
> Al Viro <viro@zeniv.linux.org.uk> writes:
> 
> > On Mon, Jan 04, 2021 at 12:16:56PM +0000, David Laight wrote:
> >> On x86 in_compat_syscall() is defined as:
> >>     in_ia32_syscall() || in_x32_syscall()
> >>
> >> Now in_ia32_syscall() is a simple check of the TS_COMPAT flag.
> >> However in_x32_syscall() is a horrid beast that has to indirect
> >> through to the original %eax value (ie the syscall number) and
> >> check for a bit there.
> >>
> >> So on a kernel with x32 support (probably most distro kernels)
> >> the in_compat_syscall() check is rather more expensive than
> >> one might expect.
> 
> I suggest you check the distro kernels.  I suspect they don't compile in
> support for x32.  As far as I can tell x32 is an undead beast of a
> subarchitecture that just enough people use that it can't be removed,
> but few enough people use that it likely has a few lurking scary bugs.

x32 is enabled in the Ubuntu kernel configs I've got lurking:
both 3.8.0-19_generic (Ubuntu 13.04) and 5.4.0-56_generic (probably 20.04).
That is probably why it is in my test builds (I've just cut out
a lot of modules).

> >> It would be much better if both checks could be done together.
> >> I think this would require the syscall entry code to set a
> >> value in both the 64bit and x32 entry paths.
> >> (Can a process make both 64bit and x32 system calls?)
> >
> > Yes, it bloody well can.
> >
> > And I see no benefit in pushing that logic into syscall entry,
> > since anything that calls in_compat_syscall() more than once
> > per syscall execution is doing the wrong thing.  Moreover,
> > in quite a few cases we don't call the sucker at all, and for
> > all of those pushing that crap into syscall entry logic is
> > pure loss.
> 
> The x32 system calls have their own system call table and it would be
> trivial to set a flag like TS_COMPAT when looking up a system call from
> that table.  I expect such a change would be purely in the noise.

Certainly a write of 0/1/2 into an already-dirty cache line of 'current'
could easily cost absolutely nothing, especially if 'current' has
already been read.

I also wondered about resetting it to zero when an x32 system call
exits (rather than entry to a 64bit one).

For ia32 the flag is set (with |=) on every syscall entry,
even though I'm pretty sure it can only change during exec.
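
To make the comparison concrete, here is a rough userspace model (not
kernel code) of the check as it stands and of the single-field idea.
The structs, the 'syscall_mode' field, the MODE_* values and all the
function names are made up for illustration; only the TS_COMPAT flag
and the x32 bit in the saved syscall number mirror what the kernel
tests today. The model also clears TS_COMPAT on non-ia32 entry purely
to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TS_COMPAT        0x0002u      /* ia32 syscall active */
#define X32_SYSCALL_BIT  0x40000000u  /* models __X32_SYSCALL_BIT */

/* Hypothetical values for the proposed 0/1/2 field. */
enum syscall_mode { MODE_64BIT = 0, MODE_IA32 = 1, MODE_X32 = 2 };

/* Stand-ins for pt_regs and thread_info; not the real layouts. */
struct model_regs {
	uint64_t orig_ax;         /* syscall number as passed in %eax */
};

struct model_task {
	uint32_t status;          /* holds TS_COMPAT */
	struct model_regs *regs;  /* reached through a pointer, as today */
	uint8_t syscall_mode;     /* hypothetical new field */
};

/* Today's shape: one flag test plus an indirection to the saved %eax. */
static bool in_compat_today(const struct model_task *t)
{
	if (t->status & TS_COMPAT)
		return true;
	return (t->regs->orig_ax & X32_SYSCALL_BIT) != 0;
}

/* Assumed entry hook: each entry path stores its own mode value. */
static void model_syscall_entry(struct model_task *t,
				enum syscall_mode mode, uint64_t nr)
{
	assert(mode == MODE_64BIT || mode == MODE_IA32 || mode == MODE_X32);
	t->regs->orig_ax = nr;
	if (mode == MODE_IA32)
		t->status |= TS_COMPAT;
	else
		t->status &= ~TS_COMPAT;  /* simplification for the model */
	t->syscall_mode = (uint8_t)mode;
}

/* Proposed shape: a single load and compare, no indirection. */
static bool in_compat_proposed(const struct model_task *t)
{
	return t->syscall_mode != MODE_64BIT;
}
```

Both predicates agree for all three entry modes in this model; the
only difference is that the second never has to go through 'regs'.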

> > What's the point, really?
> 
> Before we came up with the current games with __copy_siginfo_to_user
> and x32_copy_siginfo_to_user I was wondering if we should make such
> a change.  The delivery of compat signal frames and core dumps which
> do not go through the system call entry path could almost benefit from
> a flag that could be set/tested when on those paths.

For signal delivery it should (probably) depend on the system call
that set up the signal handler.
Although I'm sure I remember one kernel where some of it was done
in libc (with a single entrypoint for all handlers).

> The fact that only SIGCHLD (which can not trigger a coredump) is
> different saves the coredump code from needing such a test.
> 
> The fact that the signal frame code is simple enough it can directly
> call x32_copy_siginfo_to_user or __copy_siginfo_to_user saves us there.
> 
> So I don't think we have any cases where we actually need a flag that
> is independent of the system call but we have come very close.

If a program can do both 64bit and x32 system calls you probably
need to generate a 64bit core dump if it has ever made a 64bit
system call?

> For people who want to optimize I suggest tracking down the handful of
> users of x32 and see if x32 can be made to just go away.

Unlikely, since Ubuntu seem to have had it enabled for years.

	David
