All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andy Lutomirski <luto@amacapital.net>
To: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Dmitry Safonov <0x7f454c46@gmail.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Ingo Molnar <mingo@redhat.com>,
	Shuah Khan <shuahkh@osg.samsung.com>,
	Borislav Petkov <bp@alien8.de>, X86 ML <x86@kernel.org>,
	khorenko@virtuozzo.com, Andrew Morton <akpm@linux-foundation.org>,
	xemul@virtuozzo.com, linux-kselftest@vger.kernel.org,
	Cyrill Gorcunov <gorcunov@openvz.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH 1/2] x86/arch_prctl: add ARCH_SET_{COMPAT,NATIVE} to change compatible mode
Date: Thu, 7 Apr 2016 07:39:26 -0700	[thread overview]
Message-ID: <CALCETrVNTC9DmKbrYvEDLiAa3s+ZQNyCChk7Y1F6cw6Qv3OuJg@mail.gmail.com> (raw)
In-Reply-To: <57064E6C.2030202@virtuozzo.com>

On Apr 7, 2016 5:12 AM, "Dmitry Safonov" <dsafonov@virtuozzo.com> wrote:
>
> On 04/06/2016 09:04 PM, Andy Lutomirski wrote:
>>
>> [cc Dave Hansen for MPX]
>>
>> On Apr 6, 2016 9:30 AM, "Dmitry Safonov" <dsafonov@virtuozzo.com> wrote:
>>>
>>> Now each process that runs natively on x86_64 may execute 32-bit code
>>> by proper setting it's CS selector: either from LDT or reuse Linux's
>>> USER32_CS. The vice-versa is also valid: running 64-bit code in
>>> compatible task is also possible by choosing USER_CS.
>>> So we may switch between 32 and 64 bit code execution in any process.
>>> Linux will choose the right syscall numbers in entries for those
>>> processes. But it still will consider them native/compat by the
>>> personality, that elf loader set on launch. This affects i.e., ptrace
>>> syscall on those tasks: PTRACE_GETREGSET will return 64/32-bit regset
>>> according to process's mode (that's how strace detect task's
>>> personality from 4.8 version).
>>>
>>> This patch adds arch_prctl calls for x86 that make possible to tell
>>> Linux kernel in which mode the application is running currently.
>>> Mainly, this is needed for CRIU: restoring compatible & native
>>> applications both from 64-bit restorer. By that reason I wrapped all
>>> the code in CONFIG_CHECKPOINT_RESTORE.
>>> This patch solves also a problem for running 64-bit code in 32-bit elf
>>> (and reverse), that you have only 32-bit elf vdso for fast syscalls.
>>> When switching between native <-> compat mode by arch_prctl, it will
>>> remap needed vdso binary blob for target mode.
>>
>> General comments first:
>
> Thanks for your comments.
>>
>> You forgot about x32.
>
> Will add x32 support for v2.
>
>> I think that you should separate vdso remapping from "personality".
>> vdso remapping should be available even on native 32-bit builds, which
>> means that either you can't use arch_prctl for it or you'll have to
>> wire up arch_prctl as a 32-bit syscall.
>
> I cant say, I got your point. Do you mean by vdso remapping
> mremap for vdso/vvar pages? I think, it should work now.

For 32-bit, the vdso *must* exist in memory at the address that the
kernel thinks it's at.  Even if you had a pure 32-bit restore stub,
you would still need vdso remap, because there's a chance the vdso
could land at an unusable address, say one page off from where you
want it.  You couldn't map a wrapper because there wouldn't be any
space for it without moving the real vdso out of the way.

Remember, you *cannot* mremap() the 32-bit vdso because you will
crash.  It works by luck for 64-bit, but it's plausible that we'd want
to change that some day.  (I have awful patches that speed a bunch of
things up at the cost of a vdso trampoline for 64-bit code and a bunch
of other hacks.  Those patches will never go in for real, but
something else might want the ability to use 64-bit vdso trampolines.)

> I did remapping for vdso as blob for native x86_64 task differs
> to compatible task. So it's just changing blobs, address value
> is there for convenience - I may omit it and just remap
> different vdso blob at the same place where was previous vdso.
> I'm not sure, why do we need possibility to map 64-bit vdso blob
> on native 32-bit builds?

That would fail, but I think the API should exist.  But a native
32-bit program should be able to remap the 32-bit vdso.

IOW, I think you should be able to do, roughly:

map_new_vdso(VDSO_32BIT, addr);

on any kernel.

Am I making sense?

>
>> For "personality", someone needs to enumerate all of the various thigs
>> that try to track bitness and see how many of them even make sense.
>> On brief inspection:
>>
>>   - TIF_IA32: affects signal format and does something to ptrace.  I
>> suspect that whatever it does to ptrace is nonsensical, and I don't
>> know whether we're stuck with it.
>>
>>   - TIF_ADDR32 affects TASK_SIZE and mmap behavior (and the latter
>> isn't even done in a sensible way).
>>
>>   - is_64bit_mm affects MPX and uprobes.
>>
>> On even more brief inspection:
>>
>>   - uprobes using is_64bit_mm is buggy.
>>
>>   - I doubt that having TASK_SIZE vary serves any purpose.  Does anyone
>> know why TASK_SIZE is different for different tasks?  It would save
>> code size and speed things up if TASK_SIZE were always TASK_SIZE_MAX.
>>   - Using TIF_IA32 for signal processing is IMO suboptimal.  Instead,
>> we should record which syscall installed the signal handler and use
>> the corresponding frame format.
>
> Oh, I like it, will do.
>
>>   - Using TIF_IA32 of the *target* for ptrace is nonsense.  Having
>> strace figure out syscall type using that is actively buggy, and I ran
>> into that bug a few days ago and cursed at it.  strace should inspect
>> TS_COMPAT (I don't know how, but that's what should happen).  We may
>> be stuck with this for ABI reasons.
>
> ptrace may check seg_32bit for code selector, what do you think?

Not sure.  I have never fully wrapped my had around ptrace.

  parent reply	other threads:[~2016-04-07 14:40 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-06 16:29 [PATCH 0/2] x86: add arch_prctl to switch between native/compat modes Dmitry Safonov
2016-04-06 16:29 ` [PATCH 1/2] x86/arch_prctl: add ARCH_SET_{COMPAT,NATIVE} to change compatible mode Dmitry Safonov
2016-04-06 18:04   ` Andy Lutomirski
2016-04-06 18:49     ` Andy Lutomirski
2016-04-07 12:11     ` Dmitry Safonov
2016-04-07 12:21       ` Cyrill Gorcunov
2016-04-07 12:35         ` Dmitry Safonov
2016-04-07 14:39       ` Andy Lutomirski [this message]
2016-04-07 15:18         ` Dmitry Safonov
2016-04-08 13:50         ` Dmitry Safonov
2016-04-08 15:56           ` Andy Lutomirski
2016-04-08 16:18             ` Dmitry Safonov
2016-04-08 20:44               ` Andy Lutomirski
2016-04-09  8:06                 ` Dmitry Safonov
2016-04-13 16:55                 ` Dmitry Safonov
2016-04-14 18:27                   ` Andy Lutomirski
2016-04-20 11:04                     ` Peter Zijlstra
2016-04-20 15:40                       ` Andy Lutomirski
2016-04-20 19:05                         ` Peter Zijlstra
2016-04-21 19:39                           ` Andy Lutomirski
2016-04-21 20:12                             ` Peter Zijlstra
2016-04-21 23:27                               ` Andy Lutomirski
2016-04-21 23:46                                 ` Andy Lutomirski
2016-04-25 15:16                                 ` Peter Zijlstra
2016-04-25 16:50                                   ` Andy Lutomirski
2016-04-06 16:29 ` [PATCH 2/2] x86/tools/testing: add test for ARCH_SET_COMPAT Dmitry Safonov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CALCETrVNTC9DmKbrYvEDLiAa3s+ZQNyCChk7Y1F6cw6Qv3OuJg@mail.gmail.com \
    --to=luto@amacapital.net \
    --cc=0x7f454c46@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=dsafonov@virtuozzo.com \
    --cc=gorcunov@openvz.org \
    --cc=hpa@zytor.com \
    --cc=khorenko@virtuozzo.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=shuahkh@osg.samsung.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    --cc=xemul@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.