From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932293AbbIBVjx (ORCPT <rfc822;w@1wt.eu>);
	Wed, 2 Sep 2015 17:39:53 -0400
Received: from mail-ob0-f176.google.com ([209.85.214.176]:33884 "EHLO
	mail-ob0-f176.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1755345AbbIBVjv convert rfc822-to-8bit (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Wed, 2 Sep 2015 17:39:51 -0400
MIME-Version: 1.0
In-Reply-To: <55E7638F.1060002@list.ru>
References: <55CA90B4.2010205@list.ru> <CA+55aFw5DzEn+jYmY0KwbkytBzowo45o3Czh1cQZ_aaR1+WZYA@mail.gmail.com>
 <CALCETrV0SXVtxNLpOi6VK_U5Q8xD3CCec6aDNojwiYJ182dS0w@mail.gmail.com>
 <CAMzpN2jbBq-33HVFZ0fHUtzMX-m9V5TUPCSp9VVf2zi=dd=EMw@mail.gmail.com>
 <55D2D0DE.3080707@list.ru> <CALCETrUwpHHORR_SB_MOPdG+0Z-+SeK9ZvPb++4s+aUcChy0AQ@mail.gmail.com>
 <CALCETrXBLna_8z7gUPv-2niWr7sNpCeN=wDWSwFEsH4ZHcBM-Q@mail.gmail.com>
 <55D44DF2.30802@list.ru> <CALCETrU1QudtJFja18bdPaNW23vjTHSbPOq9+XFk5od+53Yfww@mail.gmail.com>
 <55D4AF1D.2070100@list.ru> <CALCETrVmqP3EE8Opu6VTedHsaCWisPEB6hEZPbGc=4D0-g1+4w@mail.gmail.com>
 <55E6BEAC.8080302@list.ru> <CALCETrVBYySbWteyRas-+UsNqsUijORaoyYS64RzRqXp3g4X=A@mail.gmail.com>
 <55E735FE.1030901@list.ru> <CALCETrWdC9PHQ0OfZrk4zbTM3G8hNnk+wWjFxTdTFgRcBbD4Tw@mail.gmail.com>
 <55E73EB5.5040204@list.ru> <CALCETrX5ZtdWax8gWqmDEx1+gQFQZ+QKHB04tH=4dLkLsj3ptQ@mail.gmail.com>
 <55E7638F.1060002@list.ru>
From: Andy Lutomirski <luto@amacapital.net>
Date: Wed, 2 Sep 2015 14:39:31 -0700
Message-ID: <CALCETrVphMtmb+tW395Tj2LAa5DReh5b5j5oMjnMcp+GsMtH7g@mail.gmail.com>
Subject: Re: [regression] x86/signal/64: Fix SS handling for signals delivered
 to 64-bit programs breaks dosemu
To: Stas Sergeev <stsp@list.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
        Raymond Jennings <shentino@gmail.com>,
        Cyrill Gorcunov <gorcunov@gmail.com>,
        Pavel Emelyanov <xemul@parallels.com>,
        Linux kernel <linux-kernel@vger.kernel.org>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8BIT
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Sep 2, 2015 at 2:01 PM, Stas Sergeev <stsp@list.ru> wrote:
> 02.09.2015 22:06, Andy Lutomirski wrote:
>
>> On Wed, Sep 2, 2015 at 11:23 AM, Stas Sergeev <stsp@list.ru> wrote:
>>>
>>> 02.09.2015 21:17, Andy Lutomirski wrote:
>>>>
>>>> On Wed, Sep 2, 2015 at 10:46 AM, Stas Sergeev <stsp@list.ru> wrote:
>>>>>
>>>>> 02.09.2015 17:21, Andy Lutomirski wrote:
>>>>>>>>
>>>>>>>> This should work for old DOSEMU.  It's a bit gross, but it has the
>>>>>>>> nice benefit that everyone (even things that aren't DOSEMU) gains the
>>>>>>>> ability to catch signals thrown from bogus SS contexts, which
>>>>>>>> probably improves debuggability.  It's also nice to not have the
>>>>>>>> SA flag.
>>>>>>>
>>>>>>> Pros:
>>>>>>> - No new SA flag
>>>>>>> - May improve debuggability in some unknown scenario where people
>>>>>>> do not want to just use the new flag to get their things improved
>>>>>>>
>>>>>>> Cons:
>>>>>>> - Does not allow clean use of siglongjmp(), as then there is a risk
>>>>>>> of jumping to 64bit code with a bad SS
>>>>>>
>>>>>> What's the issue here?  I don't understand.
>>>>>>
>>>>>> On musl, (sig)longjmp just restores rsp, rbx, rbp, and r12-r15, so it
>>>>>> won't be affected.  AFAIK all implementations of siglongjmp are likely
>>>>>> to call sigprocmask or similar, and that will clobber SS.  I'm not
>>>>>> aware of an implementation of siglongjmp that uses sigreturn.
>>>>>
>>>>> I am not saying siglongjmp() will be affected.
>>>>> Quite the opposite: it won't, which is bad. :)
>>>>> If you always have a correct SS, you can use siglongjmp(). If you
>>>>> have a broken SS at times, siglongjmp() will be asking for trouble,
>>>>> as it specifically does not restore SS.
>>>>> dosemu could make good use of siglongjmp() to get back to 64bit code
>>>>> from its sighandler.
>>>>
>>>> This seems like it would be relying unpleasantly heavily on libc
>>>> internals.
>>>
>>> Could you please clarify?
>>> If kernel always passes the right SS to the sighandler, then what's
>>> the problem?
>>
>> What's the exact siglongjmp usage you have in mind?  Signal context
>> isn't normally involved AFAIK.
>
> dosemu needs 2 return paths:
> 1. to DOS code
> 2. to 64bit code (dosemu is not all in a sighandler, right?)
>
> How it is currently achieved:
> dosemu1:
> 1. sigreturn() + iret (to DOS)
> 2. modify sigcontext -> sigreturn() (to 64bit asm helper)
>
> dosemu2:
> 1. sigreturn() + iret (to DOS)
> 2. modify sigcontext -> sigreturn() -> longjmp() (to 64bit C-coded)

So you're modifying sigcontext such that it returns to a C function
that calls longjmp?

>
> How dosemu2 is supposed to do this:
> 1. sigreturn() (to DOS)
> 2. siglongjmp() (to 64bit C-coded)

This should work fine on any kernel, right?  The main problem is that
you presumably need to remember the old context so you can go back to
DOS, so the old SS needs to be saved somewhere.

>
>>>>>>> - Async signals can silently "validate" SS behind your back
>>>>>>
>>>>>> True, and that's unfortunate.  But async signals without SA_SAVE_SS
>>>>>> set with the other approach have exactly the same problem.
>>>>>
>>>>> Yes, and as such, they should be blocked.
>>>>> You could improve on that and on siglongjmp().
>>>>> And on TLS in the future.
>>>>
>>>> *I* can't do anything to siglongjmp because that's almost entirely
>>>> outside the kernel. :-/
>>>
>>> Except for passing the SS=__USER_DS to the sighandler, for which we
>>> discussed the new SA_hyz?
>>
>> I'm still not understanding what you're looking for.  If you
>> siglongjmp out of a signal handler, the hardware SS value is
>> irrelevant, at least on 64-bit binaries, because siglongjmp is just
>> going to replace it.
>
> Hmm? IIRC you've just said this:
> ---
> On musl, (sig)longjmp just restores rsp, rbx, rbp, and r12-r15, so it won't
> be affected.
> ---
> So why would siglongjmp() replace it?

Because siglongjmp calls sigprocmask, which uses SYSCALL, which clobbers SS.
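
A minimal sketch of that pattern, for illustration only: the handler
leaves via siglongjmp() instead of returning, so sigreturn never runs
and nothing from the saved context (SS included) is restored.  The
names recover_point, on_sigusr1 and escape_handler_with_siglongjmp are
made up for this example.  Whether a given libc actually enters the
kernel through SYSCALL while restoring the mask is an assumption about
the implementation, not something the code checks.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* sigsetjmp, sigaction */
#endif
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf recover_point;  /* hypothetical name */

static void on_sigusr1(int sig)
{
    (void)sig;
    /* Leave the handler without returning, so no sigreturn happens. */
    siglongjmp(recover_point, 1);
}

/*
 * Returns 1 if the handler was left via siglongjmp() and SIGUSR1 is
 * unblocked again afterwards, 0 if the handler never ran (for example
 * because the caller had SIGUSR1 blocked), -1 on any other failure.
 */
int escape_handler_with_siglongjmp(void)
{
    struct sigaction sa, old;
    sigset_t cur;
    int result, rc;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, &old) != 0)
        return -1;

    /* Nonzero second argument: save the signal mask for siglongjmp. */
    if (sigsetjmp(recover_point, 1) == 0) {
        raise(SIGUSR1);
        result = 0;               /* only reached if the handler did not run */
    } else if (sigprocmask(SIG_BLOCK, NULL, &cur) != 0) {
        result = -1;
    } else {
        /* SIGUSR1 was blocked inside the handler; it should not be now. */
        result = sigismember(&cur, SIGUSR1) ? -1 : 1;
    }

    rc = sigaction(SIGUSR1, &old, NULL);
    assert(rc == 0);              /* restoring the old disposition */
    (void)rc;
    return result;
}
```

On a typical Linux libc with SIGUSR1 unblocked in the caller, I would
expect escape_handler_with_siglongjmp() to return 1.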

>
>>>>>>> Is the new SA flag such a big deal here to even bother?
>>>>>>
>>>>>> Not really, but given that the new behavior seems clearly better
>>>>>> behaved than the old, it would be nice to be able to have the good
>>>>>> behavior, or at least most of it, be the default.
>>>>>
>>>>> Surely, but how about then having the heuristics you suggest,
>>>>> only if the new SA_hyz is not set? And when it is set, have a
>>>>> properly defined and predictable behaviour. Then it seems like
>>>>> we'll get all the possible wishes covered.
>>>>
>>>> That could work.  The result is quite similar to explicitly setting
>>>> UC_STRICT_RESTORE_SS.
>>>
>>> I am much more bothered with delivering the right SS than with
>>> restoring it on sigreturn().
>>
>> For 64-bit delivery, ignoring backwards compatibility, delivering
>> signals with ss = __USER_DS would be the right solution, I think: it's
>> trivial and it works.  Because of backwards compatibility, we need to
>
> ... add the SA_hyz flag.
> I don't understand why you constantly ignore that part as
> if it was never spelled out. Let's discuss the proposal as a whole,
> rather than with random bits thrown away. The flag is exactly for
> backward compatibility, so why do you present it as a problem
> without the context of the new flag?

For backwards compat, we either need the default behavior to be
unchanged, or we need the default behavior to be something that works
with existing dosemu.  For existing dosemu, the only interesting cases
(I think) are signal delivery from *valid* 16-bit context and
sigreturn to 64-bit mode for the IRET trampoline.  For delivery, we
need to preserve SS so that the signal handler can read it out of the
hardware register (mov %ss, ...).  For sigreturn, IIUC old dosemu will
replace the saved CS with a 64-bit code segment selector and won't
touch the saved SS because it doesn't know about the saved SS.  Those
dosemu versions don't care what SS actually contains after sigreturn,
because they're immediately going to change it again using IRET.  So
we just need to make sure we return without faulting.

New dosemu2 would like to sigreturn directly back to 16-bit mode, so
it needs the kernel to honor the saved SS value, possibly modified by
dosemu, and restore it.

We obviously can't require old dosemu to set an SA flag to keep
working.  But, if we can get away with it, I think it's somewhat
preferable not to require new dosemu to set an SA flag either.

This has at least one major benefit: if new dosemu loads some random
library that installs some async signal handler (SIGALRM for example),
everything will work with regard to CS and SS.  If SIGALRM hits 16-bit
code, CS and SS get saved, the signal handler gets invoked in 64-bit
mode, and sigreturn restores the old state.

Of course, FS and GS still screw this up.
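
To make the saved CS/SS part concrete, here is a rough sketch of where
a 64-bit handler can find those selectors in the signal frame.  It
assumes x86-64 Linux with glibc or musl, where REG_CSGSFS packs CS in
bits 0-15.  Treating bits 48-63 as SS assumes a kernel that saves SS
in the slot that used to be padding.  The function and variable names
are hypothetical, and raise() stands in for a real async SIGALRM.

```c
#define _GNU_SOURCE               /* exposes REG_CSGSFS on glibc and musl */
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <ucontext.h>

static volatile sig_atomic_t saved_cs = -1;   /* -1 means "not captured" */
static volatile sig_atomic_t saved_ss = -1;

static void on_sigalrm(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)info;
#if defined(__x86_64__) && defined(__linux__) && defined(REG_CSGSFS)
    ucontext_t *uc = ctx;
    unsigned long long packed =
        (unsigned long long)uc->uc_mcontext.gregs[REG_CSGSFS];

    saved_cs = (int)(packed & 0xffff);
    /* Assumption: the top 16 bits hold SS; older kernels leave padding. */
    saved_ss = (int)((packed >> 48) & 0xffff);
#else
    (void)ctx;                    /* layout unknown here: leave -1 in place */
#endif
}

/* Returns the CS selector saved at signal delivery, or -1 if unavailable. */
int capture_saved_cs(void)
{
    struct sigaction sa, old;
    int rc;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigalrm;
    sa.sa_flags = SA_SIGINFO;     /* needed to receive the ucontext pointer */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, &old) != 0)
        return -1;

    raise(SIGALRM);               /* handler runs before raise() returns */

    rc = sigaction(SIGALRM, &old, NULL);
    assert(rc == 0);              /* restoring the old disposition */
    (void)rc;
    return saved_cs;
}

/* Returns the value captured from the assumed SS slot, or -1. */
int capture_saved_ss(void)
{
    return saved_ss;
}
```

When the signal interrupts ordinary 64-bit user code, the captured CS
should be a user-mode selector (low two bits set).  A handler that
wanted sigreturn to land somewhere else would edit these saved values
rather than the live registers.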

--Andy