From: Boris Ostrovsky <boris.ostrovsky@oracle.com>
To: Andrew Cooper <andrew.cooper3@citrix.com>,
Andy Lutomirski <luto@amacapital.net>
Cc: "security@kernel.org" <security@kernel.org>,
Peter Zijlstra <peterz@infradead.org>, X86 ML <x86@kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
xen-devel <xen-devel@lists.xen.org>,
David Vrabel <dvrabel@cantab.net>, Borislav Petkov <bp@alien8.de>,
David Vrabel <david.vrabel@citrix.com>,
Jan Beulich <jbeulich@suse.com>,
Sasha Levin <sasha.levin@oracle.com>
Subject: Re: [PATCH v4 0/3] x86: modify_ldt improvement, test, and config option
Date: Thu, 30 Jul 2015 16:01:18 -0400
Message-ID: <55BA828E.8070304__21584.7127991312$1438286653$gmane$org@oracle.com>
In-Reply-To: <55BA72E1.4050809@citrix.com>
On 07/30/2015 02:54 PM, Andrew Cooper wrote:
> On 30/07/15 19:30, Andy Lutomirski wrote:
>> On Wed, Jul 29, 2015 at 5:29 PM, Andrew Cooper
>> <andrew.cooper3@citrix.com> wrote:
>>> On 30/07/2015 00:13, Andy Lutomirski wrote:
>>>> On Wed, Jul 29, 2015 at 4:02 PM, Andrew Cooper
>>>> <andrew.cooper3@citrix.com> wrote:
>>>>> On 29/07/2015 23:49, Boris Ostrovsky wrote:
>>>>>> On 07/29/2015 06:46 PM, David Vrabel wrote:
>>>>>>> On 29/07/2015 23:11, Andrew Cooper wrote:
>>>>>>>> On 29/07/2015 23:05, Andy Lutomirski wrote:
>>>>>>>>> On Wed, Jul 29, 2015 at 2:37 PM, Andrew Cooper
>>>>>>>>> <andrew.cooper3@citrix.com> wrote:
>>>>>>>>>> On 29/07/2015 22:26, Andy Lutomirski wrote:
>>>>>>>>>>> On Wed, Jul 29, 2015 at 2:23 PM, Boris Ostrovsky
>>>>>>>>>>> <boris.ostrovsky@oracle.com> wrote:
>>>>>>>>>>>> On 07/29/2015 03:03 PM, Andrew Cooper wrote:
>>>>>>>>>>>>> On 29/07/15 15:43, Boris Ostrovsky wrote:
>>>>>>>>>>>>>> FYI, I have got a repro now and am investigating.
>>>>>>>>>>>>> Good and bad news. This bug has nothing to do with LDTs
>>>>>>>>>>>>> themselves.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have worked out what is going on, but this:
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
>>>>>>>>>>>>> index 5abeaac..7e1a82e 100644
>>>>>>>>>>>>> --- a/arch/x86/xen/enlighten.c
>>>>>>>>>>>>> +++ b/arch/x86/xen/enlighten.c
>>>>>>>>>>>>> @@ -493,6 +493,7 @@ static void set_aliased_prot(void *v, pgprot_t prot)
>>>>>>>>>>>>>          pte = pfn_pte(pfn, prot);
>>>>>>>>>>>>> +        (void)*(volatile int*)v;
>>>>>>>>>>>>>          if (HYPERVISOR_update_va_mapping((unsigned long)v, pte, 0)) {
>>>>>>>>>>>>>                  pr_err("set_aliased_prot va update failed w/ lazy mode %u\n",
>>>>>>>>>>>>>                         paravirt_get_lazy_mode());
>>>>>>>>>>>>>                  BUG();
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is perhaps not the fix we are looking for, and every use of
>>>>>>>>>>>>> HYPERVISOR_update_va_mapping() is susceptible to the same problem.
>>>>>>>>>>>> I think in most cases we know that page is mapped so hopefully
>>>>>>>>>>>> this is the only site that we need to be careful about.
>>>>>>>>>>> Is there any chance we can get some kind of quick-and-dirty fix that
>>>>>>>>>>> can go to x86/urgent in the next few days even if a clean fix isn't
>>>>>>>>>>> available yet?
>>>>>>>>>> Quick and dirty?
>>>>>>>>>>
>>>>>>>>>> Reading from v is the most obvious and quick way, for areas where
>>>>>>>>>> we are certain v exists, is kernel memory and is expected to have
>>>>>>>>>> a backing page. I don't know offhand how many of the current
>>>>>>>>>> HYPERVISOR_update_va_mapping() callsites this applies to.
>>>>>>>>> __get_user((char *)v, tmp), perhaps, unless there's something better
>>>>>>>>> in the wings. Keep in mind that we need this for -stable, and it's
>>>>>>>>> likely to get backported quite quickly due to CVE-2015-5157.
>>>>>>>> Hmm - something like that tucked inside HYPERVISOR_update_va_mapping()
>>>>>>>> would probably work, and certainly be minimal hassle for -stable.
>>>>>>>>
>>>>>>>> Altering the hypercall used is certainly not something to backport, nor
>>>>>>>> are we sure it is a viable fix at this time.
>>>>>>> Changing this one use of update_va_mapping to use mmu_update_normal_pt
>>>>>>> is the correct fix to unblock this LDT series. I see no reason why this
>>>>>>> cannot be backported.
>>>>>> To properly fix it should include batching and that is not something
>>>>>> that I think we should target for stable.
>>>>> Batching is absolutely not necessary to alter update_va_mapping to
>>>>> mmu_update_normal_pt. After all, update_va_mapping isn't batched.
>>>>>
>>>>> However this isn't the first issue we have had with lazy mmu faulting,
>>>>> and I doubt it is the last. There are not many callsites of
>>>>> update_va_mapping - I will audit them tomorrow and see if any similar
>>>>> issues are lurking elsewhere.
>>>> One thing I should add: nothing flushes old aliases in xen_alloc_ldt,
>>>> yet I haven't been able to get xen_alloc_ldt to fail or subsequent LDT
>>>> access to fault. Is this something we should be worried about?
>>> Yes. update_va_mapping() will function perfectly well taking one RW
>>> mapping to RO even if there is a second RW mapping. In such a case, the
>>> next LDT access will fault.
>> Which is a problem because that alias might still exist, and also
>> because Linux really doesn't expect that fault.
>>
>>> On closer inspection, Xen is rather unhelpful with the fault. Xen's
>>> lazy #PF will be bounced back to the guest with cr2 adjusted to appear
>>> in the range passed to set_ldt(). The error code however will be
>>> unmodified (and limited only by not-user and not-reserved), so will
>>> appear as a non-present read or write supervisor access to an address
>>> which the kernel has a valid read mapping of.
>> More yuck.
>>
>> I think I'm just going to stick an unconditional vm_flush_aliases in alloc_ldt.
>>
>>> Therefore, set_ldt() needs to be confident that there are no writeable
>>> mappings to the frames used to make up the LDT. It could proactively
>>> fault them in by accessing one descriptor in each page inside the limit,
>>> but by the time a fault is received it is probably too late to work out
>>> where the other mapping is which prevented the typechange (or indeed,
>>> whether Xen objected to one of the descriptors instead).
>> This seems like overkill.
>>
>> I'm still a bit confused, though: the failure is in xen_free_ldt. How
>> do we make it all the way to xen_free_ldt without the vmapped page
>> existing in the guest's page tables? After all, we had to survive
>> xen_alloc_ldt first, and ISTM that should fail in exactly the same
>> way.
> (Summarising part of a discussion which has just occurred on IRC)
>
> I presume that xen_free_ldt() is called while in the context of an mm
> which doesn't have the particular area of the vmalloc() space faulted in.
This is exactly what's happening --- the bug is only triggered during
exit and xen_free_ldt() is called from someone else's context, e.g.:
[ 53.986677] Call Trace:
[ 53.986677] [<c105312d>] xen_free_ldt+0x2d/0x40
[ 53.986677] [<c1062310>] free_ldt_struct.part.1+0x10/0x40
[ 53.986677] [<c1062735>] destroy_context+0x25/0x40
[ 53.986677] [<c10a764e>] __mmdrop+0x1e/0xc0
[ 53.986677] [<c10c9858>] finish_task_switch+0xd8/0x1a0
[ 53.986677] [<c1863736>] __schedule+0x316/0x950
[ 53.986677] [<c1863d96>] schedule+0x26/0x70
[ 53.986677] [<c10ac613>] do_wait+0x1b3/0x200
[ 53.986677] [<c10ac9d7>] SyS_waitpid+0x67/0xd0
[ 53.986677] [<c10aa820>] ? task_stopped_code+0x50/0x50
[ 53.986677] [<c186717a>] syscall_call+0x7/0x7
But that would imply that this other context has mm->context.ldt of
ldt_gdt_32. How is that possible?
-boris
>
> This is (I presume) why reading 'v' (which occasionally causes a
> pagefault to occur) fixes the issue.
>
> ~Andrew
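
The failure mode discussed in this message can be illustrated with a small userspace toy model. This is only a sketch under stated assumptions, not kernel or Xen code: every name in it (toy_mm, toy_vmap, toy_touch, toy_update_va_mapping, toy_set_prot) is hypothetical. It assumes that each mm has its own view of the vmalloc area, that a missing entry is filled in lazily on first access, and that the update hypercall fails when the current mm has no entry for the address, which is the behaviour the thread describes for xen_free_ldt() running in someone else's context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all names below are hypothetical, not kernel or Xen APIs. */
#define TOY_NR_SLOTS 8

struct toy_mm {
	bool present[TOY_NR_SLOTS];	/* this mm's view of the vmalloc area */
};

/* Reference table: records every vmalloc-style mapping ever created. */
static struct toy_mm toy_reference;

/* Create a mapping in the reference table only; other mms sync lazily. */
static void toy_vmap(size_t slot)
{
	assert(slot < TOY_NR_SLOTS);
	toy_reference.present[slot] = true;
}

/*
 * Model of reading through the mapping. If this mm lacks the entry, it is
 * copied from the reference table, as a lazy fault handler would do.
 * Returns false when the slot is out of range or was never mapped.
 */
static bool toy_touch(struct toy_mm *mm, size_t slot)
{
	if (slot >= TOY_NR_SLOTS)
		return false;
	if (!mm->present[slot]) {
		if (!toy_reference.present[slot])
			return false;
		mm->present[slot] = true;
	}
	return true;
}

/*
 * Model of the update hypercall: it only looks at the current mm's entry
 * and never triggers the lazy sync, so it fails if the entry is absent.
 */
static int toy_update_va_mapping(struct toy_mm *mm, size_t slot)
{
	if (slot >= TOY_NR_SLOTS)
		return -1;
	return mm->present[slot] ? 0 : -1;
}

/* The quick fix in model form: touch the address first, then update. */
static int toy_set_prot(struct toy_mm *mm, size_t slot)
{
	if (!toy_touch(mm, slot))
		return -1;
	return toy_update_va_mapping(mm, slot);
}
```

In this model, a freshly zeroed toy_mm calling toy_update_va_mapping() on a slot that was set up with toy_vmap() fails, while toy_set_prot() on the same slot succeeds because the touch syncs the entry first. That mirrors why the read of 'v' in the quoted diff makes the problem go away, without saying anything about whether it is the right long-term fix.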