linux-kernel.vger.kernel.org archive mirror
From: "H. Peter Anvin" <hpa@zytor.com>
To: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Ingo Molnar <mingo@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Borislav Petkov <bp@alien8.de>,
	LKML <linux-kernel@vger.kernel.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Kees Cook <keescook@chromium.org>, Will Drewry <wad@chromium.org>
Subject: Re: pt_regs->ax == -ENOSYS
Date: Tue, 27 Apr 2021 17:20:55 -0700
Message-ID: <3a502aae-4124-5cb2-1dac-bc18b8158fbe@zytor.com>
In-Reply-To: <CALCETrWzL=jgnWd+6YuBo02GG8vTvsG22sXGaUQCc37vwQ6HdA@mail.gmail.com>



On 4/27/21 5:11 PM, Andy Lutomirski wrote:
> On Tue, Apr 27, 2021 at 5:05 PM H. Peter Anvin <hpa@zytor.com> wrote:
>>
>> On 4/27/21 4:23 PM, Andy Lutomirski wrote:
>>>
>>> I much prefer the model of saying that the bits that make sense for
>>> the syscall type (all 64 for 64-bit SYSCALL and the low 32 for
>>> everything else) are all valid.  This way there are no weird reserved
>>> bits, no weird ptrace() interactions, etc.  I'm a tiny bit concerned
>>> that this would result in a backwards compatibility issue, but not
>>> very.  This would involve changing syscall_get_nr(), but that doesn't
>>> seem so bad.  The biggest problem is that seccomp hardcoded syscall
>>> nrs to 32 bit.
>>>
>>> An alternative would be to declare that we always truncate to 32 bits,
>>> except that 64-bit SYSCALL with high bits set is an error and results
>>> in ENOSYS. The ptrace interaction there is potentially nasty.
>>>
>>> Basically, all choices here kind of suck, and I haven't done a real
>>> analysis of all the issues...
>>>
>>
>> OK, I really don't understand this. The *current* way of doing it causes
>> a bunch of ugly corner conditions, including in ptrace, which this would
>> get rid of. It isn't any different than passing any other argument which
>> is an int -- in fact we have this whole machinery to deal with that subcase.
>>
> 
> Let's suppose we decide to truncate the syscall nr.  What would the
> actual semantics be?  Would ptrace see the truncated value in orig_ax?
> How about syscall user dispatch?  What happens if ptrace writes a
> value with high bits set to orig_ax?  Do we truncate it again?  Or do
> we say that ptrace *can't* write too large a value?
> 
> For better or worse, RAX is 64 bits, orig_ax is a 64-bit field, and
> it currently has nonsensical semantics.  Redefining orig_ax as a
> 32-bit field is surely possible, but doing so cleanly is not
> necessarily any easier than any other approach.  If it weren't for
> seccomp, I would say that the obviously correct answer is to just
> treat it everywhere as a 64-bit number.
> 

We *used* to truncate the system call number, treating it as unsigned.
That causes a massive headache for ptrace if a 32-bit ptracer wants to
write -1, which is a bit hacky.

I would personally like orig_ax to hold the register value as passed in,
and for the truncation to happen in syscall_get_nr().
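
A minimal sketch of that idea in userspace C, not kernel code. The name
sketch_syscall_get_nr() is hypothetical, and the sketch assumes that a
32-bit tracer's -1 ends up stored zero-extended as 0xffffffff. The point
is only that the stored 64-bit value stays untouched and the narrowing
to a signed int happens at the point where the number is read:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not the kernel's syscall_get_nr(): orig_ax is kept
 * as the full 64-bit register value and only narrowed when it is read. */
static int sketch_syscall_get_nr(uint64_t orig_ax)
{
	/* Keep the low 32 bits, then reinterpret them as a signed int.
	 * Out-of-range conversion to int is implementation-defined in C;
	 * gcc and clang wrap modulo 2^32, which is assumed here. */
	return (int)(uint32_t)orig_ax;
}

/* Small self-check of the intended behaviour. */
static void sketch_syscall_get_nr_check(void)
{
	/* High bits set by a 64-bit caller are ignored. */
	assert(sketch_syscall_get_nr(0x0000000100000001ULL) == 1);
	/* Assumed case: a 32-bit tracer's -1 stored as 0xffffffff reads as -1. */
	assert(sketch_syscall_get_nr(0x00000000ffffffffULL) == -1);
	/* A full 64-bit -1 also reads back as -1. */
	assert(sketch_syscall_get_nr(UINT64_MAX) == -1);
}
```

With this shape, a consumer that expects a signed int sees the same
number regardless of what is in the upper half of the register.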

I also note that kernel/seccomp.c and the tracing infrastructure both
expect a signed int as the system call number. Yes, orig_ax is a 64-bit
field, but so are the other register fields, which don't necessarily
directly reflect the value of an argument. Take %rdi in the case of
sys_write: it is an int argument, so it gets sign-extended, and this is
*not* reflected in ptrace.
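
To illustrate the sign-extension point, here is a rough userspace
sketch. The helper name sketch_int_arg() is hypothetical and this is not
the kernel's argument-handling code; it only shows how the value used
for an int argument can differ from the raw 64-bit register contents
that a tracer would see:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the value an `int` argument takes when it is
 * derived from a 64-bit register. The raw register is left unchanged. */
static int64_t sketch_int_arg(uint64_t reg)
{
	/* Narrow to the low 32 bits and reinterpret as signed. The
	 * conversion is implementation-defined in C; gcc and clang wrap
	 * modulo 2^32, which is assumed here. */
	int as_int = (int)(uint32_t)reg;

	/* Widening a signed int back to 64 bits sign-extends it. */
	return (int64_t)as_int;
}

/* Small self-check of the intended behaviour. */
static void sketch_int_arg_check(void)
{
	/* Raw register 0x00000000ffffffff is used as -1. */
	assert(sketch_int_arg(0x00000000ffffffffULL) == -1);
	/* Garbage in the upper half does not affect the argument value. */
	assert(sketch_int_arg(0xdeadbeef00000003ULL) == 3);
}
```

In both checks the raw register value and the argument value differ,
which is the mismatch described above.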

	-hpa
