From: "Durrant, Paul" <pdurrant@amazon.co.uk>
To: Jan Beulich <jbeulich@suse.com>
Cc: "Stefano Stabellini" <sstabellini@kernel.org>,
	"Julien Grall" <julien@xen.org>, "Wei Liu" <wl@xen.org>,
	"Konrad Rzeszutek Wilk" <konrad.wilk@oracle.com>,
	"George Dunlap" <George.Dunlap@eu.citrix.com>,
	"Andrew Cooper" <andrew.cooper3@citrix.com>,
	"Ian Jackson" <ian.jackson@eu.citrix.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	"Volodymyr Babchuk" <Volodymyr_Babchuk@epam.com>,
	"Roger Pau Monné" <roger.pau@citrix.com>
Subject: Re: [Xen-devel] [PATCH v4 5/7] mm: make MEMF_no_refcount pages safe to assign
Date: Tue, 28 Jan 2020 17:01:03 +0000
Message-ID: <29425ac0b17d4772a162a097448cfee4@EX13D32EUC003.ant.amazon.com>
In-Reply-To: <9376dca1-1bdd-ac08-d84a-e8ac101436d2@suse.com>

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 28 January 2020 15:23
> To: Durrant, Paul <pdurrant@amazon.co.uk>
> Cc: xen-devel@lists.xenproject.org; Andrew Cooper
> <andrew.cooper3@citrix.com>; George Dunlap <George.Dunlap@eu.citrix.com>;
> Ian Jackson <ian.jackson@eu.citrix.com>; Julien Grall <julien@xen.org>;
> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Stefano Stabellini
> <sstabellini@kernel.org>; Wei Liu <wl@xen.org>; Volodymyr Babchuk
> <Volodymyr_Babchuk@epam.com>; Roger Pau Monné <roger.pau@citrix.com>
> Subject: Re: [PATCH v4 5/7] mm: make MEMF_no_refcount pages safe to assign
> 
> On 24.01.2020 16:31, Paul Durrant wrote:
> > Currently it is unsafe to assign a domheap page allocated with
> > MEMF_no_refcount to a domain because the domain's 'tot_pages' will not
> > be incremented, but will be decremented when the page is freed (since
> > free_domheap_pages() has no way of telling that the increment was
> > skipped).
> >
> > This patch allocates a new 'count_info' bit for a PGC_no_refcount flag
> > which is then used to mark domheap pages allocated with
> > MEMF_no_refcount. This then allows free_domheap_pages() to skip
> > decrementing tot_pages when appropriate and hence makes the pages safe
> > to assign.
> >
> > NOTE: The patch sets PGC_no_refcount directly in alloc_domheap_pages()
> >       rather than in assign_pages() because the latter is called with
> >       MEMF_no_refcount by memory_exchange() as an optimization, to avoid
> >       too many calls to domain_adjust_tot_pages() (which acquires and
> >       releases the global 'heap_lock').
> 
> I don't think there were any optimization thoughts with this. The
> MEMF_no_refcount use is because otherwise for a domain with
> tot_pages == max_pages the assignment would fail.
> 

That would not be the case if the calls to steal_page() further up didn't pass MEMF_no_refcount (which would be the correct thing to do if not passing it to assign_pages()). I had originally considered doing that because I think it allows the somewhat complex error path after assign_pages() to be dropped. But avoiding thrashing the global lock seemed a good reason to leave memory_exchange() the way it is.

> > --- a/xen/common/page_alloc.c
> > +++ b/xen/common/page_alloc.c
> > @@ -460,6 +460,9 @@ unsigned long domain_adjust_tot_pages(struct domain *d, long pages)
> >  {
> >      long dom_before, dom_after, dom_claimed, sys_before, sys_after;
> >
> > +    if ( !pages )
> > +        goto out;
> 
> Unrelated change? Are there, in fact, any callers passing in 0?
> Oh, further down you add one which may do so, but then perhaps
> better to make the caller not call here (as is done e.g. in
> memory_exchange())?

I think it's preferable for domain_adjust_tot_pages() to handle zero gracefully.
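
To illustrate the early-exit approach, here is a minimal sketch using a toy model rather than the real Xen structures. The names toy_domain, toy_adjust_tot_pages and slow_path_calls are hypothetical and exist only for this example; the real function does more accounting than is shown here.

```c
#include <assert.h>

/* Hypothetical stand-in for struct domain: only what the sketch needs. */
struct toy_domain {
    unsigned long tot_pages;
    unsigned int slow_path_calls;   /* counts non-trivial adjustments */
};

/* Toy counterpart of domain_adjust_tot_pages(): a zero delta is a no-op. */
static unsigned long toy_adjust_tot_pages(struct toy_domain *d, long pages)
{
    assert(d);

    if ( !pages )
        goto out;

    d->slow_path_calls++;                  /* stands in for the real accounting */
    d->tot_pages += (unsigned long)pages;  /* negative deltas wrap back correctly */

 out:
    return d->tot_pages;
}
```

With the check inside the function, callers can pass whatever delta they computed without testing it for zero first.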

> 
> > @@ -2331,11 +2331,20 @@ struct page_info *alloc_domheap_pages(
> >                                    memflags, d)) == NULL)) )
> >           return NULL;
> >
> > -    if ( d && !(memflags & MEMF_no_owner) &&
> > -         assign_pages(d, pg, order, memflags) )
> > +    if ( d && !(memflags & MEMF_no_owner) )
> >      {
> > -        free_heap_pages(pg, order, memflags & MEMF_no_scrub);
> > -        return NULL;
> > +        if ( assign_pages(d, pg, order, memflags) )
> > +        {
> > +            free_heap_pages(pg, order, memflags & MEMF_no_scrub);
> > +            return NULL;
> > +        }
> > +        if ( memflags & MEMF_no_refcount )
> > +        {
> > +            unsigned long i;
> > +
> > +            for ( i = 0; i < (1 << order); i++ )
> > +                pg[i].count_info |= PGC_no_refcount;
> > +        }
> 
> It would seem to me that this needs doing the other way around:
> First set PGC_no_refcount, then assign_pages(). After all, the
> moment assign_pages() drops its lock, the domain could also
> decide to get rid of (some of) the pages again.

True. Yes, this needs to be swapped.
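
The swapped ordering can be sketched with a toy model (not the Xen code). All TOY_* names, struct toy_page and both functions are hypothetical; the failure path of the assignment is deliberately left out, and flag_seen_at_assign exists only so the ordering can be observed.

```c
#include <assert.h>

#define TOY_MEMF_NO_REFCOUNT (1U << 0)   /* hypothetical allocation flag */
#define TOY_PGC_NO_REFCOUNT  (1UL << 0)  /* hypothetical count_info bit */

struct toy_page {
    unsigned long count_info;
    int owned;               /* set once the page is visible to the domain */
    int flag_seen_at_assign; /* what an observer saw at publication time */
};

/* Toy assign_pages(): publishes the pages and records what was visible. */
static int toy_assign_pages(struct toy_page *pg, unsigned int order)
{
    unsigned long i;

    for ( i = 0; i < (1UL << order); i++ )
    {
        pg[i].flag_seen_at_assign =
            !!(pg[i].count_info & TOY_PGC_NO_REFCOUNT);
        pg[i].owned = 1;
    }

    return 0;
}

/* Mark first, then assign, so no owner ever sees an unmarked page. */
static int toy_alloc_tail(struct toy_page *pg, unsigned int order,
                          unsigned int memflags)
{
    unsigned long i;

    assert(pg);

    if ( memflags & TOY_MEMF_NO_REFCOUNT )
        for ( i = 0; i < (1UL << order); i++ )
            pg[i].count_info |= TOY_PGC_NO_REFCOUNT;

    return toy_assign_pages(pg, order);
}
```

Because the flag is in place before the pages become visible, a free that races in right after the assignment already sees the marking.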

> For this (and
> also to slightly simplify things in free_domheap_pages())
> perhaps it would be better not to add that ASSERT() to
> free_heap_pages(). The function shouldn't really be concerned
> of any refcounting, and hence could as well be ignorant to
> PGC_no_refcount being set on a page.
> 

Not sure I understand here. What would you like to see free_heap_pages() assert?

> > @@ -2368,24 +2377,32 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
> >
> >          if ( likely(d) && likely(d != dom_cow) )
> >          {
> > +            long pages = 0;
> > +
> >              /* NB. May recursively lock from relinquish_memory(). */
> >              spin_lock_recursive(&d->page_alloc_lock);
> >
> >              for ( i = 0; i < (1 << order); i++ )
> >              {
> > +                unsigned long count_info = pg[i].count_info;
> > +
> >                  if ( pg[i].u.inuse.type_info & PGT_count_mask )
> >                  {
> >                      printk(XENLOG_ERR
> >                             "pg[%u] MFN %"PRI_mfn" c=%#lx o=%u v=%#lx t=%#x\n",
> >                             i, mfn_x(page_to_mfn(pg + i)),
> > -                           pg[i].count_info, pg[i].v.free.order,
> > +                           count_info, pg[i].v.free.order,
> >                             pg[i].u.free.val, pg[i].tlbflush_timestamp);
> >                      BUG();
> >                  }
> >                  arch_free_heap_page(d, &pg[i]);
> > +                if ( count_info & PGC_no_refcount )
> > +                    pg[i].count_info &= ~PGC_no_refcount;
> > +                else
> > +                    pages--;
> 
> Not only to reduce code churn, may I recommend to avoid introducing
> the local variable? There's no strict rule preventing
> arch_free_heap_page() from possibly playing with the field you
> latch up front.

Ok.
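
A rough sketch of the free loop without the latched local, again as a toy model with hypothetical names (toy_page, toy_arch_hook_t, toy_free_pages, toy_arch_noop). The field is read only after the per-page hook has run, so whatever the hook does to it is taken into account.

```c
#include <assert.h>

#define TOY_PGC_NO_REFCOUNT (1UL << 0)   /* hypothetical count_info bit */

struct toy_page {
    unsigned long count_info;
};

/* Stand-in for arch_free_heap_page(); it is allowed to touch count_info. */
typedef void (*toy_arch_hook_t)(struct toy_page *pg);

/* Returns the (non-positive) delta to apply to the owner's page total. */
static long toy_free_pages(struct toy_page *pg, unsigned int order,
                           toy_arch_hook_t hook)
{
    long pages = 0;
    unsigned long i;

    assert(pg && hook);

    for ( i = 0; i < (1UL << order); i++ )
    {
        hook(&pg[i]);

        /* Read the field here, after the hook, not from a cached copy. */
        if ( pg[i].count_info & TOY_PGC_NO_REFCOUNT )
            pg[i].count_info &= ~TOY_PGC_NO_REFCOUNT;
        else
            pages--;
    }

    return pages;
}

/* A hook that leaves the page untouched, for exercising the loop. */
static void toy_arch_noop(struct toy_page *pg)
{
    (void)pg;
}
```

The returned delta is what would then be handed to the tot_pages adjustment; pages carrying the flag contribute nothing to it.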

  Paul

> 
> Jan