All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Aravindh Puthiyaparambil (aravindp)" <aravindp@cisco.com>
To: Tim Deegan <tim@xen.org>
Cc: "xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	Keir Fraser <keir@xen.org>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	Ian Campbell <ian.campbell@citrix.com>,
	Jan Beulich <jbeulich@suse.com>
Subject: Re: [PATCH RFC 1/4] x86/mm: Shadow and p2m changes for PV mem_access
Date: Thu, 1 May 2014 19:26:17 +0000	[thread overview]
Message-ID: <97A500D504438F4ABC02EBA81613CC63317E9B13@xmb-aln-x02.cisco.com> (raw)
In-Reply-To: <20140501143920.GD86038@deinos.phlegethon.org>

>Thanks for your patch.  I have to say, this looks a lot less terrifying than I
>thought it would. :)

I am glad to hear that. I too feared the worst :-) On that note many thanks to you and Jan for taking the time for giving valuable feedback.

>> Shadow mem_access mode
>> ----------------------
>> Add a new shadow mode for mem_access. This should only be enabled by a
>> mem_access listener when it calls
>> XEN_DOMCTL_MEM_EVENT_OP_ACCESS_ENABLE
>> for a PV domain.
>
>Do we need a specific mode for this?  If the default behaviour is to allow all
>accesses until told otherwise, then it can be enabled at all times.

You might be right. I will look in to getting rid of it.

>> P2M changes
>> -----------
>> Add a new p2m implementation for mem_access. Central to this is the
>> access lookup table. This is an array of mfns. Each mfn is a page
>> allocated from the domain's shadow memory. The pages hold the
>> p2m_access_t values for each guest gmfn. p2m_mem_access_set_entry()
>> sets the access value of the mfn given as input and blows the shadow
>> entries for the mfn. p2m_mem_access_get_entry() returns the access
>> value of the mfn given as input.
>
>I think this array needs to be a trie, at least; it may make sense to reuse the
>p2m_pt implementation and just restrict/ignore the types and MFNs.
>That way you won't have to reinvent the wheel, esp. around avoiding long-
>running operations.

I like the idea you mentioned below of stashing the access type in the shadow_flags field of struct page_info. If that does not work out I will look in to reusing the p2m_pt implementation.

>> +/* Convert access restrictions to page table flags */ void
>> +p2m_access_to_flags(u32 *flags, u32 gflags, p2m_access_t access) {
>> +
>> +    /* Restrict with access permissions */
>> +    switch (access)
>> +    {
>> +        case p2m_access_r:
>> +            *flags &= ~(_PAGE_RW);
>> +            *flags |= (_PAGE_NX_BIT|_PAGE_PRESENT);
>
>IIUC this function is called to add further restrictions to an existing set of flags.
>In that case, it should never set the _PAGE_PRESENT bit.  Rather, it should
>only ever _reduce_ permissions (i.e. set _NX or clear other bits).  Then...
>
>> +            break;
>> +        case p2m_access_rx:
>> +        case p2m_access_rx2rw:
>> +            *flags &= ~(_PAGE_NX_BIT|_PAGE_RW);
>> +            *flags |= _PAGE_PRESENT;
>> +            break;
>> +        case p2m_access_rw:
>> +            *flags |= (_PAGE_NX_BIT|_PAGE_RW|_PAGE_PRESENT);
>> +            break;
>> +        case p2m_access_rwx:
>> +        default:
>> +            *flags &= ~(_PAGE_NX_BIT);
>> +            *flags |= (_PAGE_RW|_PAGE_PRESENT);
>> +            break;
>> +    }
>> +
>> +    // Allow more restrictive guest flags to be propagated instead of access
>> +    // permissions
>> +    if ( !(gflags & _PAGE_RW) )
>> +        *flags &= ~(_PAGE_RW);
>> +    if ( gflags & _PAGE_NX_BIT )
>> +        *flags |= _PAGE_NX_BIT;
>
>..you won't need these either.

That makes sense. I will follow your suggestion.

>> +}
>> +
>> +/*
>> + * Set the page permission of the mfn. This in effect removes all
>> +shadow
>> + * mappings of that mfn. The access type of that mfn is stored in the
>> +access
>> + * lookup table.
>> + */
>> +static int
>> +p2m_mem_access_set_entry(struct p2m_domain *p2m, unsigned long
>gfn, mfn_t mfn,
>> +                         unsigned int page_order, p2m_type_t p2mt,
>> +                         p2m_access_t p2ma) {
>> +    struct domain *d = p2m->domain;
>> +    mfn_t *access_lookup_table = p2m->access_lookup_table;
>> +    uint table_idx;
>> +    uint page_idx;
>> +    uint8_t *access_table_page;
>> +
>> +    ASSERT(shadow_mode_mem_access(d) && access_lookup_table !=
>NULL);
>> +
>> +    /* For PV domains we only support rw, rx, rx2rw, rwx access permissions
>*/
>> +    if ( unlikely(p2ma != p2m_access_r &&
>> +                  p2ma != p2m_access_rw &&
>> +                  p2ma != p2m_access_rx &&
>> +                  p2ma != p2m_access_rwx &&
>> +                  p2ma != p2m_access_rx2rw) )
>> +        return -EINVAL;
>> +
>> +    if ( page_get_owner(mfn_to_page(mfn)) != d )
>> +        return -ENOENT;
>
>If we're only ever setting access permissions for a guest's own pages (i.e. not
>setting them for grant mappings or foreign mappings) then the access
>permissions lookup table could potentially be shared among all PV VMs.  It
>might even be possible to fold the access-type into the shadow_flags field in
>struct page_info and avoid having a lookup table at all.  If that's possible, I
>think it will make some thing a _lot_ easier.

This idea looks very promising. I remember mentioning in this is some of my earlier emails but got side tracked thinking PV memory is contiguous and I could use a simple indexed lookup table. So I will resubmit this patch series after I implement a new version of p2m-ma using the shadow_flags or reusing the p2m-pt implementation.

>> -    /* Sanity check the call */
>> -    if ( d == current->domain || (d->arch.paging.mode & mode) == mode )
>> +    /*
>> +     * Sanity check the call.
>> +     * Do not allow shadow mem_access mode to be enabled when
>> +     * log-dirty or translate mode is enabled.
>
>Why not log-dirty?  That would mean you can't use mem-access and live
>migration at the same time.

I was afraid they would conflict and prevent log-dirty from working correctly. But I forgot about it being used during live-migration. Let me see if I can get both of them to live together.

>> +    /* Propagate access permissions */
>> +    if ( unlikely((level == 1) &&
>> +                  mem_event_check_ring(&d->mem_event->access) &&
>> +                  !sh_mfn_is_a_page_table(target_mfn)) )
>> +    {
>> +        struct p2m_domain *p2m = p2m_get_hostp2m(d);
>> +        p2m_access_t a;
>> +        p2m_type_t t;
>> +        mfn_t mfn;
>> +        mfn = p2m->get_entry(p2m, mfn_x(target_mfn), &t, &a, 0,
>> + NULL);
>
>Ah, no.  This access lookup ought to happen as part of the original p2m lookup
>that produced target_mfn.  Having a second lookup here is unpleasant and
>confusing -- this code keeps the distinction between gfn and mfn pretty
>strictly, so passing an mfn into a p2m lookup is wrong, even though you
>happen to know they're the same namespace in this PV guest.
>
>Of course, if you can hide the access type in the struct page_info, then it's
>correct to recover it here.

OK, I am hoping to do that here. 

>> +            /*
>> +             * Do not police writes to guest memory emanating from the Xen
>> +             * kernel. Trying to do so will cause the same pagefault to occur
>> +             * over and over again with an event being sent to the access
>> +             * listener for each fault. If the access listener's vcpu is not
>> +             * scheduled during this time, the violation is never resolved and
>> +             * will eventually end with the host crashing.
>> +             */
>> +            if ( (violation && access_w) &&
>> +                 (regs->eip >= XEN_VIRT_START && regs->eip <= XEN_VIRT_END) )
>> +            {
>> +                violation = 0;
>> +                rc = p2m->set_entry(p2m, gfn_x(gfn), gmfn, PAGE_ORDER_4K,
>> +                                    p2m_ram_rw, p2m_access_rw);
>> +            }
>
>That's exploitable, surely: the guest just needs to make a hypercall with a
>chosen buffer address to give itself _rw access to a page.
>(Which I guess is OK for your intended use of rw<->rx but not in
>general.)

I agree but could not come up with a way to get it to work. I have sent a separate email regarding this issue. 

Cheers,
Aravindh

  reply	other threads:[~2014-05-01 19:26 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-29  4:45 [PATCH RFC 0/4] Add mem_access support for PV domains Aravindh Puthiyaparambil
2014-04-29  4:45 ` [PATCH RFC 1/4] x86/mm: Shadow and p2m changes for PV mem_access Aravindh Puthiyaparambil
2014-04-29  8:50   ` Jan Beulich
2014-04-29 23:10     ` Aravindh Puthiyaparambil (aravindp)
2014-04-30  6:32       ` Jan Beulich
2014-04-30 22:20         ` Aravindh Puthiyaparambil (aravindp)
2014-05-01  1:11           ` Andres Lagar-Cavilla
2014-05-01  3:18             ` Aravindh Puthiyaparambil (aravindp)
2014-05-01 14:18           ` Tim Deegan
2014-05-01 19:14             ` Aravindh Puthiyaparambil (aravindp)
2014-05-08 12:44               ` Tim Deegan
2014-05-02  8:26           ` Jan Beulich
2014-05-02 16:28             ` Aravindh Puthiyaparambil (aravindp)
2014-04-29 12:05   ` Tamas Lengyel
2014-04-29 23:36     ` Aravindh Puthiyaparambil (aravindp)
2014-05-01 14:39   ` Tim Deegan
2014-05-01 19:26     ` Aravindh Puthiyaparambil (aravindp) [this message]
2014-04-29  4:45 ` [PATCH RFC 2/4] x86/mem_access: mem_access and mem_event changes to support PV domains Aravindh Puthiyaparambil
2014-04-29  4:45 ` [PATCH RFC 3/4] tools/libxc: Add APIs to create and get the PV ring page Aravindh Puthiyaparambil
2014-05-02 12:41   ` Ian Campbell
2014-04-29  4:45 ` [PATCH RFC 4/4] tool/xen-access: Add support for PV domains Aravindh Puthiyaparambil
2014-05-02 12:43   ` Ian Campbell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=97A500D504438F4ABC02EBA81613CC63317E9B13@xmb-aln-x02.cisco.com \
    --to=aravindp@cisco.com \
    --cc=ian.campbell@citrix.com \
    --cc=ian.jackson@eu.citrix.com \
    --cc=jbeulich@suse.com \
    --cc=keir@xen.org \
    --cc=tim@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.