From: "Jan Beulich" <JBeulich@suse.com>
To: "Aravindh Puthiyaparambil (aravindp)" <aravindp@cisco.com>
Cc: Keir Fraser <keir@xen.org>,
	Ian Campbell <ian.campbell@citrix.com>,
	"Andres Lagar-Cavilla (andreslc@gridcentric.ca)"
	<andreslc@gridcentric.ca>, Tim Deegan <tim@xen.org>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	Ian Jackson <ian.jackson@eu.citrix.com>
Subject: Re: [PATCH RFC 1/4] x86/mm: Shadow and p2m changes for PV mem_access
Date: Fri, 02 May 2014 09:26:44 +0100	[thread overview]
Message-ID: <536372E4020000780000E5A3@mail.emea.novell.com> (raw)
In-Reply-To: <97A500D504438F4ABC02EBA81613CC63317E9527@xmb-aln-x02.cisco.com>

>>> On 01.05.14 at 00:20, <aravindp@cisco.com> wrote:
>>I think implying _n to mean "default" is at least a latent bug anyway (since that
>>precludes setting that type). And _if_ "get" really does any modification, that
>>minimally would require a big comment.
> 
> For PV domains access_n is precluded.
> 
>>Furthermore, with it being possible for the default type to change, the net
>>effect might easily be different between setting the default type at
>>initialization time or when the first lookup happens.
> 
> I am not sure that I follow your concern here. Setting default access does 
> not mean that the access table gets cleared. So if a particular MFN entry's 
> access value changed and the default value was changed, then the MFN should 
> retain the access value it was changed to and not the default value. The new 
> default value will only be returned for MFNs for which set_entry was not 
> called on and I think that is correct.

There might still be a difference in when the designated access takes
effect: I would assume that a request for the default type should obtain
the default set at the time of the request, not the one in effect at the
time the entry first gets used.

>>>>> +int p2m_mem_access_init(struct p2m_domain *p2m)
>>>>> +{
>>>>> +    struct domain *d = p2m->domain;
>>>>> +    mfn_t *access_lookup_table;
>>>>> +    uint32_t nr_access_table_pages;
>>>>> +    uint32_t ctr;
>>>>> +
>>>>> +    nr_access_table_pages = get_domain_nr_access_table_pages(d);
>>>>> +    access_lookup_table = xzalloc_array(mfn_t, nr_access_table_pages);
>>>>
>>>>This surely is going to be an order > 0 allocation, and you even risk
>>>>it being an order > MAX_ORDER one. The former is disallowed at runtime
>>>>by convention/agreement, the latter is an outright bug.
>>>
>>> Sorry, I was not aware of the convention. What should I be doing here
>>> instead?
>>
>>Especially with the above note about GFN space being effectively unbounded
>>for PV guests, you need to find a different data structure to represent the
>>information you need. Perhaps something similar to what log-dirty mode
>>uses, except that you would need more levels in the extreme case, and hence
>>it might be worthwhile making the actually used number of levels dynamic?
> 
> I am a little confused here. In my initial discussions about this I was told 
> that PV guests have bounded and contiguous memory. This is why I took the 
> approach of an indexable array. 
> 
> " OTOH, all you need is a byte per pfn, and the great thing is that in PV 
> domains, the physmap is bounded and continuous. Unlike HVM and its PCI holes, 
> etc, which demand the sparse tree structure. So you can allocate an easily 
> indexable array, notwithstanding super page concerns (I think/hope)."
> 
> http://lists.xenproject.org/archives/html/xen-devel/2013-11/msg03860.html 
> 
> Did I misunderstand something here? 

What he stated is the situation at domain start. Due to ballooning and
other kinds of manipulations (as said, pv-ops re-arranges things to
avoid having memory [PFNs] overlap MMIO [MFNs]) this may change
immediately after the domain has started.

And even if it was a plain contiguous linear address space, you still
wouldn't be permitted to allocate a flat one-byte-per-page array, as
for large domains this might still overflow the MAX_ORDER allocation
constraint.

>>>>> @@ -2443,7 +2443,8 @@ int sh_remove_all_mappings(struct vcpu *v,
>>>>mfn_t gmfn)
>>>>>          if ( !(shadow_mode_external(v->domain)
>>>>>                 && (page->count_info & PGC_count_mask) <= 3
>>>>>                 && ((page->u.inuse.type_info & PGT_count_mask)
>>>>> -                   == !!is_xen_heap_page(page))) )
>>>>> +                       == !!is_xen_heap_page(page)))
>>>>> +                    && !(shadow_mode_mem_access(v->domain)) )
>>>>
>>>>You're breaking indentation, there are pointless parentheses again,
>>>>but most importantly - why?
>>>
>>> Sorry, I meant to ask a question about this in my patch message and
>>> mark this as a workaround. I am seeing the message on all
>>> sh_remove_all_mappings() and I was not able to figure out why this was
>>> happening. I just added this as a work around. I was hoping you or Tim
>>would shed more light on this.
>>
>>I'm afraid that without detail on which pages this triggers on, and you
>>at least having spent some time finding out where the stray/extra
>>references may be coming from, it's going to be hard to help.
> 
> Here are some of the prints I saw. I typically saw it for every set_entry() 
> call.

There are some rather large usage counts among the examples you
gave - these surely need to be understood rather than blindly ignored.

>>>>> +/* Number of access table pages for a PV domain */
>>>>> +#define get_domain_nr_access_table_pages(d) \
>>>>> +        DIV_ROUND_UP(P2M_ACCESS_SIZE * (d->tot_pages - 1), PAGE_SIZE)
>>>>
>>>>And once again. I wonder whether this code was tested on a suitably
>>>>big pv-ops PV guest.
>>>
>>> The largest PV domain I tried this with was 28GB.
>>
>>Which ought to have had GFNs larger than ->tot_pages, so I wonder how this
>>worked.
> 
> It would have had MFNs larger than tot_pages. I don't understand how it 
> would have had GFNs larger than tot_pages.

Once again, because it re-arranges its PFN space to avoid MMIO holes
(plus - applicable to non-pv-ops too - it may be started ballooned or
balloon out memory subsequently).

Jan
