xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: "security@kernel.org" <security@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>, X86 ML <x86@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	xen-devel <xen-devel@lists.xen.org>,
	Borislav Petkov <bp@alien8.de>, Jan Beulich <jbeulich@suse.com>,
	Sasha Levin <sasha.levin@oracle.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>
Subject: Re: [PATCH v4 0/3] x86: modify_ldt improvement, test, and config option
Date: Tue, 28 Jul 2015 17:30:49 +0100	[thread overview]
Message-ID: <55B7AE39.7000101__12547.9006710977$1438101203$gmane$org@citrix.com> (raw)
In-Reply-To: <CALCETrXt2OP=+JAj7gzUOJT+5=00Qg3Te11twSeK8F_9zn_nwg@mail.gmail.com>

On 28/07/15 16:43, Andy Lutomirski wrote:
>
>>>> After forward-porting my virtio patches, I got this thing to run on
>>>> Xen.  After several tries, I got:
>>>>
>>>> [   53.985707] ------------[ cut here ]------------
>>>> [   53.986314] kernel BUG at arch/x86/xen/enlighten.c:496!
>>>> [   53.986677] invalid opcode: 0000 [#1] SMP
>>>> [   53.986677] Modules linked in:
>>>> [   53.986677] CPU: 0 PID: 1400 Comm: bash Not tainted 4.2.0-rc4+ #4
>>>> [   53.986677] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>>>> BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org
>>>> 04/01/2014
>>>> [   53.986677] task: c2376180 ti: c0874000 task.ti: c0874000
>>>> [   53.986677] EIP: 0061:[<c10530f2>] EFLAGS: 00010282 CPU: 0
>>>> [   53.986677] EIP is at set_aliased_prot+0xb2/0xc0
>>>> [   53.986677] EAX: ffffffea EBX: cc3d1000 ECX: 0672e063 EDX: 80000000
>>>> [   53.986677] ESI: 00000000 EDI: 80000000 EBP: c0875e94 ESP: c0875e74
>>>> [   53.986677]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
>>>> [   53.986677] CR0: 80050033 CR2: b77404d4 CR3: 020b6000 CR4: 00042660
>>>> [   53.986677] Stack:
>>>> [   53.986677]  80000000 0672e063 000021c0 cc3d1000 00000001 cc3d2000
>>>> 00000b4a 00000200
>>>> [   53.986677]  c0875ea8 c105312d c2317940 c2373a80 00000000 c0875eb4
>>>> c1062310 c01861c0
>>>> [   53.986677]  c0875ec0 c1062735 c01861c0 c0875ed4 c10a764e c7007a00
>>>> c2373a80 00000000
>>>> [   53.986677] Call Trace:
>>>> [   53.986677]  [<c105312d>] xen_free_ldt+0x2d/0x40
>>>> [   53.986677]  [<c1062310>] free_ldt_struct.part.1+0x10/0x40
>>>> [   53.986677]  [<c1062735>] destroy_context+0x25/0x40
>>>> [   53.986677]  [<c10a764e>] __mmdrop+0x1e/0xc0
>>>> [   53.986677]  [<c10c9858>] finish_task_switch+0xd8/0x1a0
>>>> [   53.986677]  [<c1863736>] __schedule+0x316/0x950
>>>> [   53.986677]  [<c1863d96>] schedule+0x26/0x70
>>>> [   53.986677]  [<c10ac613>] do_wait+0x1b3/0x200
>>>> [   53.986677]  [<c10ac9d7>] SyS_waitpid+0x67/0xd0
>>>> [   53.986677]  [<c10aa820>] ? task_stopped_code+0x50/0x50
>>>> [   53.986677]  [<c186717a>] syscall_call+0x7/0x7
>>>> [   53.986677] Code: e8 c1 e3 0c 81 eb 00 00 00 40 39 5d ec 74 11 8b
>>>> 4d e4 8b 55 e0 31 f6 e8 dd e0 fa ff 85 c0 75 0d 83 c4 14 5b 5e 5f 5d
>>>> c3 90 0f 0b <0f> 0b 0f 0b 8d 76 00 8d bc 27 00 00 00 00 85 d2 74 31 55
>>>> 89 e5
>>>> [   53.986677] EIP: [<c10530f2>] set_aliased_prot+0xb2/0xc0 SS:ESP 0069:c0875e74
>>>> [   54.010069] ---[ end trace 89ac35b29c1c59bb ]---
>>>>
>>>> Is that the error you're seeing?
>>>>
>>>> If I change xen_free_ldt to:
>>>>
>>>> static void xen_free_ldt(struct desc_struct *ldt, unsigned entries)
>>>> {
>>>>     const unsigned entries_per_page = PAGE_SIZE / LDT_ENTRY_SIZE;
>>>>     int i;
>>>>
>>>>     vm_unmap_aliases();
>>>>     xen_mc_flush();
>>>>
>>>>     for(i = 0; i < entries; i += entries_per_page)
>>>>         set_aliased_prot(ldt + i, PAGE_KERNEL);
>>>> }
>>>>
>>>> then it works.  I don't know why this makes a difference.
>>>> (xen_mc_flush makes a little bit of sense to me.  vm_unmap_aliases
>>>> doesn't.)
>>>>
>>> That fix makes sense if there's some way that the vmalloc area we're
>>> freeing has an extra alias somewhere, which is very much possible.  On
>>> the other hand, I don't see how this happens without first doing an
>>> MMUEXT_SET_LDT with an unexpectedly aliased address, and I would have
>>> expected that to blow up and/or result in test case failures.
>>>
>>> But I'm still confused, because it seems like Xen will never populate
>>> the actual (hidden) LDT mapping unless the pages backing it are
>>> unaliased and well-formed, which make me wonder why this stuff ever
>>> worked.  Wouldn't LDT access with pre-existing vmalloc aliases result
>>> in segfaults?
>>>
>>> The semantics seem to be very odd.  xen_free_ldt with an aliased
>>> address might fail (and OOPS), but actual access to the LDT with an
>>> aliased address page faults.
>>>
>>> Also, using kzalloc for everything fixes the problem, which suggests
>>> that there really is something to my theory that the problem involves
>>> unexpected aliases.
>> Xen does lazily populate the LDT frames.  The first time a page is ever
>> referenced via the LDT, Xen will perform a typechange.
>>
>> Under Xen, guest mappings are reference counted with both a plain
>> reference, and a type count.  Types of writeable, segdec and pagetables
>> are mutually exclusive.  This prevents the guest from having writeable
>> mappings of interesting datastructures, but readable mappings are fine.
>> Typechanges may only occur when the type reference count is 0.
> Makes sense.
>
>> At the point of the typechange, no writeable mappings of the frame may
>> exist (and it must not be referenced by a L2 or greater page directory),
>> or the typechange will fail.  Additionally the descriptors are audited
>> at this point, so if Xen objects to any of the descriptors in the same
>> page, the typechange will also fail.
>>
>> If the typechange fails, the pagefault gets propagated back to the guest.
> The part I don't understand is that I didn't observe any page faults.
>
>> The corollary to this is that, for xen_free_ldt() to create writeable
>> mappings again, a typechange back to writeable is needed.  This will
>> fail if the LDT frames are still referenced in any vcpus LDT.
> And the mystery here is that I don't see how the typechange to LDT
> would have succeeded in the first place if we had a writable alias.

I wouldn't have.  (Unless we have a serious bug in Xen).

> In fact, I just fudged xen_set_ldt to probe the entire LDT using lsl,
> and it didn't page fault, so I'm still quite confused as to what's
> going on.  I've confirmed that my lsl hack really does work -- if I
> disable xen_alloc_ldt, then xen_set_ldt blows up immediately with my
> patch.

In which case we can be fairly sure that the LDT was properly installed.

>
> https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/tmp&id=ce664ed73aac804ef2a16ddef45589dbeba55570
>
> Oddly, the OOPS seems much more likely when I kill ldt_gdt_32 with
> Ctrl-C than when I let it run to completion.  The bug is here:
>
>         if (HYPERVISOR_update_va_mapping((unsigned long)v, pte, 0))
>                 BUG();
>
> so the only sensible explanation I have is either the guest really
> didn't drop all refs to the LDT or that Xen hasn't noticed yet.  I
> still don't see how aliases could be involved, because Xen really did
> accept the LDT.

Agreed.

> Are multicalls ordered?

Once the multicall hypercall is made, the multicalls are processed in
order.  The hypercalls might be preempted in Xen to deliver interrupts
to the guest.

However, the xen_mc_* infrastructure in the kernel obscures a lot of this.

>
> Hrm.  Does Xen actually do something sensible with set_ldt(NULL, 0)?

(When the hypercall gets to Xen), any set_ldt() call flushes the current
LDT, including synchronously decremented the typecount for every LDT
page faulted in thusfar.

>
> Also, xen_mc_issue seems buggy.  Is lazy_mode an enum or a bit field?
> What happens if two separate lazy modes are entered?  I suspect that
> one of them clobbers the other one.

This is the first time I have really peered into the xen_mc_* stuff.  It
is hardly the most clear of code to follow.  I am afraid that I will
have to pass that question to the Xen maintainers in Linux.


I suspect that the set_ldt(NULL, 0) call hasn't reached Xen before
xen_free_ldt() is attempting to nab back the pages which Xen still has
mapped as an LDT.

~Andrew

  parent reply	other threads:[~2015-07-28 16:30 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <cover.1437802102.git.luto@kernel.org>
2015-07-25  5:36 ` [PATCH v4 1/3] x86/ldt: Make modify_ldt synchronous Andy Lutomirski
2015-07-25  5:36 ` [PATCH v4 2/3] x86/ldt: Make modify_ldt optional Andy Lutomirski
2015-07-25  5:36 ` [PATCH v4 3/3] selftests/x86, x86/ldt: Add a selftest for modify_ldt Andy Lutomirski
2015-07-25  6:27 ` [PATCH v4 0/3] x86: modify_ldt improvement, test, and config option Willy Tarreau
     [not found] ` <12ddcec938d76238975dff9de7d66cfc6e574aa7.1437802102.git.luto@kernel.org>
2015-07-25  9:03   ` [PATCH v4 1/3] x86/ldt: Make modify_ldt synchronous Borislav Petkov
     [not found] ` <7286d77aa81abc38dc40362e2439861427064f6f.1437802102.git.luto@kernel.org>
2015-07-25  6:23   ` [PATCH v4 2/3] x86/ldt: Make modify_ldt optional Willy Tarreau
     [not found]   ` <20150725062343.GA3902@1wt.eu>
2015-07-25  6:44     ` Andy Lutomirski
     [not found]     ` <CALCETrX0ExTFXVdNthwBRheg4vsffPThVuyn7uAcj_TGwpXgiA@mail.gmail.com>
2015-07-25  7:50       ` Willy Tarreau
     [not found]       ` <20150725075052.GA3918@1wt.eu>
2015-07-25 13:03         ` [PATCH 4/3] x86/ldt: allow to disable modify_ldt at runtime Willy Tarreau
     [not found]         ` <20150725130340.GA17257@1wt.eu>
2015-07-25 16:08           ` Andy Lutomirski
     [not found]           ` <CALCETrV+OB0qxtw5CHaZc5RftuCUax04RxTyi_bt4ZKDJ2GB0g@mail.gmail.com>
2015-07-25 16:33             ` Willy Tarreau
     [not found]             ` <20150725163356.GD17659@1wt.eu>
2015-07-25 17:42               ` Andy Lutomirski
     [not found]               ` <CALCETrXeWdugPpAkKhUD=f7ftuYSM5fxaPxnF2=PwygupP2_4w@mail.gmail.com>
2015-07-25 18:45                 ` Willy Tarreau
2015-07-27 19:04           ` Kees Cook
     [not found]           ` <CAGXu5jJDfnkRG2F=L37CnrgnCN4Yxh0p9QWbYFqQ_Jw5qk3HsQ@mail.gmail.com>
2015-07-27 21:37             ` Willy Tarreau
2015-07-25  9:15   ` [PATCH v4 2/3] x86/ldt: Make modify_ldt optional Borislav Petkov
     [not found]   ` <20150725091531.GE3427@nazgul.tnic>
2015-07-25 16:03     ` Andy Lutomirski
     [not found]     ` <CALCETrV_oeS_kA3oNirWTwc00ze2v=QLmx6tZKU7sxt_+gMcAg@mail.gmail.com>
2015-07-25 16:35       ` Willy Tarreau
2015-07-27 15:36 ` [PATCH v4 0/3] x86: modify_ldt improvement, test, and config option Boris Ostrovsky
     [not found] ` <55B64FEA.70204@oracle.com>
2015-07-27 15:53   ` Andy Lutomirski
     [not found]   ` <CALCETrUEYTCwYzA0bvG=EJOi+pdXX=FZXoaQc4tYGkJATM7x3g@mail.gmail.com>
2015-07-27 16:18     ` Boris Ostrovsky
     [not found]     ` <55B659EC.5030009@oracle.com>
2015-07-28  2:20       ` Andy Lutomirski
     [not found]       ` <CALCETrV7zVbt0ZV4KYcSTUHjAOxzGmu3SXWoT7iECB=zWSN7Ew@mail.gmail.com>
2015-07-28  3:16         ` Andy Lutomirski
     [not found]         ` <CALCETrV275oYQY80yg6TJ-h9n2Db-uF-po90bF+JmKjnV5ZqYw@mail.gmail.com>
2015-07-28  3:23           ` Andy Lutomirski
2015-07-28  3:43           ` Boris Ostrovsky
2015-07-28 10:29           ` Andrew Cooper
     [not found]           ` <55B75993.90909@citrix.com>
2015-07-28 14:05             ` Boris Ostrovsky
     [not found]             ` <55B78C35.1050702@oracle.com>
2015-07-28 14:35               ` Andrew Cooper
     [not found]               ` <55B79314.8060009@citrix.com>
2015-07-28 14:50                 ` Boris Ostrovsky
     [not found]                 ` <55B796BF.1080005@oracle.com>
2015-07-28 15:15                   ` Konrad Rzeszutek Wilk
2015-07-28 15:23                   ` Andrew Cooper
     [not found]                   ` <20150728151527.GI26623@x230.dumpdata.com>
2015-07-28 15:39                     ` Boris Ostrovsky
     [not found]                   ` <55B79E75.4010000@citrix.com>
2015-07-28 15:59                     ` Boris Ostrovsky
2015-07-28 15:43             ` Andy Lutomirski
     [not found]             ` <CALCETrXt2OP=+JAj7gzUOJT+5=00Qg3Te11twSeK8F_9zn_nwg@mail.gmail.com>
2015-07-28 16:30               ` Andrew Cooper [this message]
     [not found]               ` <55B7AE39.7000101@citrix.com>
2015-07-28 17:07                 ` Andy Lutomirski
     [not found]                 ` <CALCETrVd56uwkZw0YtaSHKHp5dh7NugQouigibJkr=e3Q_mYyA@mail.gmail.com>
2015-07-28 17:10                   ` Boris Ostrovsky
     [not found]                   ` <55B7B791.2050208@oracle.com>
2015-07-29  0:21                     ` Andy Lutomirski
     [not found]                     ` <CALCETrXH5_PMqfH1en_5c+5gUpq8SjCnQ3Xaz-K6ej6FgBgLDQ@mail.gmail.com>
2015-07-29  0:47                       ` Andrew Cooper
     [not found]                       ` <55B822B8.3090608@citrix.com>
2015-07-29  3:01                         ` Boris Ostrovsky
     [not found]                         ` <55B841FF.2000102@oracle.com>
2015-07-29  4:26                           ` Andy Lutomirski
2015-07-29  5:28                           ` Andy Lutomirski
     [not found]                           ` <CALCETrWkMRb+Y3FsJ7+kNYmPxtupM3ZPOeOPwagXytgBqM6tJQ@mail.gmail.com>
2015-07-29 14:21                             ` Andrew Cooper
     [not found]                             ` <55B8E16C.2050406@citrix.com>
2015-07-29 14:43                               ` Boris Ostrovsky
     [not found]                               ` <55B8E68B.2030305@oracle.com>
2015-07-29 19:03                                 ` Andrew Cooper
     [not found]                                 ` <55B9236B.9090507@citrix.com>
2015-07-29 21:23                                   ` Boris Ostrovsky
     [not found]                                   ` <55B94451.8040600@oracle.com>
2015-07-29 21:26                                     ` Andy Lutomirski
     [not found]                                     ` <CALCETrWA=hAyqqp=yzZ2r_S=9U9hLkd6dZEuNefew8hyLVA_eQ@mail.gmail.com>
2015-07-29 21:33                                       ` Boris Ostrovsky
2015-07-29 21:37                                       ` Andrew Cooper
     [not found]                                       ` <55B947AF.7020404@citrix.com>
2015-07-29 22:05                                         ` Andy Lutomirski
     [not found]                                         ` <CALCETrXp_DV-_Uvekwv7xLHO-5P8Oxkgn6OeXG-6tVOD4RkKMw@mail.gmail.com>
2015-07-29 22:11                                           ` Andrew Cooper
     [not found]                                           ` <55B94F9D.3000405@citrix.com>
2015-07-29 22:40                                             ` Boris Ostrovsky
2015-07-29 22:46                                             ` David Vrabel
2015-07-29 22:49                                               ` Boris Ostrovsky
     [not found]                                               ` <55B95863.2000102@oracle.com>
2015-07-29 22:55                                                 ` David Vrabel
2015-07-29 23:02                                                 ` Andrew Cooper
     [not found]                                                 ` <55B95B70.8010902@citrix.com>
2015-07-29 23:13                                                   ` Andy Lutomirski
     [not found]                                                   ` <CALCETrWy93qobHmMWzTfqFN+0Y7DGyM7viwpPMGOeSiXEP0Z6w@mail.gmail.com>
2015-07-30  0:29                                                     ` Andrew Cooper
     [not found]                                                     ` <55B96FE0.6010600@citrix.com>
2015-07-30 18:30                                                       ` Andy Lutomirski
     [not found]                                                       ` <CALCETrUi2GBdGP2OX+3PwSf0UYjKuf2+DugENe3Y6mUoy-Rfkw@mail.gmail.com>
2015-07-30 18:54                                                         ` Andrew Cooper
     [not found]                                                         ` <55BA72E1.4050809@citrix.com>
2015-07-30 20:01                                                           ` Boris Ostrovsky
     [not found]                                                           ` <55BA828E.8070304@oracle.com>
2015-07-30 20:05                                                             ` Andy Lutomirski
     [not found]                                                             ` <CALCETrUsFn23tKf418VSbGCgXoXXRq8dk41ZfM3F55=_xWPQhw@mail.gmail.com>
2015-07-30 20:18                                                               ` Boris Ostrovsky
2015-07-25  5:36 Andy Lutomirski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='55B7AE39.7000101__12547.9006710977$1438101203$gmane$org@citrix.com' \
    --to=andrew.cooper3@citrix.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=bp@alien8.de \
    --cc=jbeulich@suse.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=sasha.levin@oracle.com \
    --cc=security@kernel.org \
    --cc=x86@kernel.org \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).