From: "Xia, Hongyan" <hongyxia@amazon.com>
To: "wl@xen.org" <wl@xen.org>
Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>,
"Grall, Julien" <jgrall@amazon.com>,
"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
"jbeulich@suse.com" <jbeulich@suse.com>,
"roger.pau@citrix.com" <roger.pau@citrix.com>
Subject: Re: [Xen-devel] [PATCH v2] x86/domain_page: implement pure per-vCPU mapping infrastructure
Date: Fri, 21 Feb 2020 12:52:07 +0000
Message-ID: <2326943257e6f04bc9a858ef5667295651395c85.camel@amazon.com>
In-Reply-To: <20200221115017.tuo2i5db5mhd5yyt@debian>

On Fri, 2020-02-21 at 11:50 +0000, Wei Liu wrote:
> On Thu, Feb 06, 2020 at 06:58:23PM +0000, Hongyan Xia wrote:
> > ...
> >
> > Benchmarks
> > (baseline uses direct map instead of the mapcache in map_domain_page,
> > old is the existing mapcache and new is after this patch):
> >
> > Geekbench on a 32-vCPU guest,
> >
> > performance drop    old      new
> > single core         0.04%    0.18%
>
> Do you know why the new mapcache has more overhead than the old one in
> the single core case?

To be honest, I think this is within noise territory. The benchmarks
were run 5 times; I can average over even more runs to confirm.

> [...]
>
> > void unmap_domain_page(const void *ptr)
> > {
> > - unsigned int idx;
> > + unsigned int idx, glb_idx;
> > struct vcpu *v;
> > - struct mapcache_domain *dcache;
> > - unsigned long va = (unsigned long)ptr, mfn, flags;
> > - struct vcpu_maphash_entry *hashent;
> > + struct mapcache_vcpu *vcache;
> > + unsigned long va = (unsigned long)ptr, mfn, hashmfn, flags;
> >
> > if ( va >= DIRECTMAP_VIRT_START )
> > return;
> > @@ -189,114 +144,78 @@ void unmap_domain_page(const void *ptr)
> > v = mapcache_current_vcpu();
> > ASSERT(v && is_pv_vcpu(v));
> >
> > - dcache = &v->domain->arch.pv.mapcache;
> > - ASSERT(dcache->inuse);
> > + vcache = v->arch.pv.mapcache;
> > + ASSERT(vcache);
> >
> > - idx = PFN_DOWN(va - MAPCACHE_VIRT_START);
> > - mfn = l1e_get_pfn(MAPCACHE_L1ENT(idx));
> > - hashent = &v->arch.pv.mapcache.hash[MAPHASH_HASHFN(mfn)];
>
> Please also assert the va in question is within the range allocated to
> this vcpu.

Sure.
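
Something along these lines perhaps (a minimal sketch; the exact bound
calculation is my assumption, derived from how idx is computed below):

    /* Sketch: va must fall within this vCPU's slice of the mapcache. */
    ASSERT(va >= MAPCACHE_VIRT_START +
                 v->vcpu_id * MAPCACHE_VCPU_ENTRIES * PAGE_SIZE);
    ASSERT(va <  MAPCACHE_VIRT_START +
                 (v->vcpu_id + 1) * MAPCACHE_VCPU_ENTRIES * PAGE_SIZE);
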
> > + glb_idx = PFN_DOWN(va - MAPCACHE_VIRT_START);
> > + idx = glb_idx - v->vcpu_id * MAPCACHE_VCPU_ENTRIES;
> >
> > local_irq_save(flags);
> >
> > - if ( hashent->idx == idx )
> > - {
> > - ASSERT(hashent->mfn == mfn);
> > - ASSERT(hashent->refcnt);
> > - hashent->refcnt--;
> > - }
> > - else if ( !hashent->refcnt )
> > - {
> > - if ( hashent->idx != MAPHASHENT_NOTINUSE )
> > - {
> > - /* /First/, zap the PTE. */
> > - ASSERT(l1e_get_pfn(MAPCACHE_L1ENT(hashent->idx)) ==
> > -        hashent->mfn);
> > - l1e_write(&MAPCACHE_L1ENT(hashent->idx), l1e_empty());
> > - /* /Second/, mark as garbage. */
> > - set_bit(hashent->idx, dcache->garbage);
> > - }
> > -
> > - /* Add newly-freed mapping to the maphash. */
> > - hashent->mfn = mfn;
> > - hashent->idx = idx;
> > - }
> > - else
> > - {
> > - /* /First/, zap the PTE. */
> > - l1e_write(&MAPCACHE_L1ENT(idx), l1e_empty());
> > - /* /Second/, mark as garbage. */
> > - set_bit(idx, dcache->garbage);
> > - }
> > + mfn = l1e_get_pfn(MAPCACHE_L1ENT(glb_idx));
> > + hashmfn = vcache->hash_mfn[MAPHASH_HASHFN(mfn)];
> > +
> > + vcache->refcnt[idx]--;
> > + /* If refcnt drops to 0, we only clear inuse when it's not in the maphash. */
>
> It would be clearer to say "when it has been evicted from the maphash"
> to match the terminology in the map routine.

Will reword in next revision.
> > + if ( hashmfn != mfn && !vcache->refcnt[idx] )
> > + __clear_bit(idx, vcache->inuse);
>
> Also, please flush the linear address here and at the other __clear_bit
> location.

I flush when a new entry takes a slot. Yeah, it's probably better to
flush earlier, whenever a slot is no longer in use.
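
Roughly like this (sketch only; that flush_tlb_one_local() is the right
primitive for a single linear address here is my assumption):

    if ( hashmfn != mfn && !vcache->refcnt[idx] )
    {
        __clear_bit(idx, vcache->inuse);
        /* Flush the stale translation as soon as the slot is freed. */
        flush_tlb_one_local(va);
    }
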
> >
> > local_irq_restore(flags);
> > }
> >
> > -int mapcache_domain_init(struct domain *d)
> > +int mapcache_vcpu_init(struct vcpu *v)
> > {
> > + struct domain *d = v->domain;
> > struct mapcache_domain *dcache = &d->arch.pv.mapcache;
> > - unsigned int bitmap_pages;
> > -
> > - ASSERT(is_pv_domain(d));
> > + unsigned long i;
>
> Since you're changing this anyway -- we normally use unsigned int as
> the index type.
>
> > + unsigned int ents = d->max_vcpus * MAPCACHE_VCPU_ENTRIES;
> >
> > #ifdef NDEBUG
> > if ( !mem_hotplug && max_page <= PFN_DOWN(__pa(HYPERVISOR_VIRT_END - 1)) )
> > return 0;
> > #endif
> >
> > - BUILD_BUG_ON(MAPCACHE_VIRT_END + PAGE_SIZE * (3 +
> > -              2 * PFN_UP(BITS_TO_LONGS(MAPCACHE_ENTRIES) * sizeof(long))) >
> > -              MAPCACHE_VIRT_START + (PERDOMAIN_SLOT_MBYTES << 20));
> > - bitmap_pages = PFN_UP(BITS_TO_LONGS(MAPCACHE_ENTRIES) * sizeof(long));
> > - dcache->inuse = (void *)MAPCACHE_VIRT_END + PAGE_SIZE;
> > - dcache->garbage = dcache->inuse +
> > -                   (bitmap_pages + 1) * PAGE_SIZE / sizeof(long);
> > -
> > - spin_lock_init(&dcache->lock);
> > -
> > - return create_perdomain_mapping(d, (unsigned long)dcache->inuse,
> > -                                 2 * bitmap_pages + 1,
> > -                                 NIL(l1_pgentry_t *), NULL);
> > -}
> > -
> > -int mapcache_vcpu_init(struct vcpu *v)
> > -{
> > - struct domain *d = v->domain;
> > - struct mapcache_domain *dcache = &d->arch.pv.mapcache;
> > - unsigned long i;
> > - unsigned int ents = d->max_vcpus * MAPCACHE_VCPU_ENTRIES;
> > - unsigned int nr = PFN_UP(BITS_TO_LONGS(ents) * sizeof(long));
> > -
> > - if ( !is_pv_vcpu(v) || !dcache->inuse )
> > + if ( is_idle_vcpu(v) || is_hvm_vcpu(v) )
> > return 0;
> >
> > + BUILD_BUG_ON(MAPCACHE_VIRT_END > ARG_XLAT_VIRT_START);
> > +
> > if ( ents > dcache->entries )
> > {
> > + int rc;
> > +
> > + ASSERT(ents * PAGE_SIZE <= (PERDOMAIN_SLOT_MBYTES << 20));
> > +
> > /* Populate page tables. */
> > - int rc = create_perdomain_mapping(d, MAPCACHE_VIRT_START, ents,
> > + rc = create_perdomain_mapping(d, MAPCACHE_VIRT_START, ents,
> >                                 NIL(l1_pgentry_t *), NULL);
> >
> > - /* Populate bit maps. */
> > - if ( !rc )
> > -     rc = create_perdomain_mapping(d, (unsigned long)dcache->inuse,
> > -                                   nr, NULL, NIL(struct page_info *));
> > - if ( !rc )
> > -     rc = create_perdomain_mapping(d, (unsigned long)dcache->garbage,
> > -                                   nr, NULL, NIL(struct page_info *));
> > -
> > if ( rc )
> > return rc;
> >
> > dcache->entries = ents;
>
> Given that:
>
> 1. mapcache_domain is now a structure with only one member.
> 2. ents is a constant throughout domain's lifecycle.
>
> You can replace mapcache_domain with a boolean --
> mapcache_mapping_populated (?) in arch.pv.
>
> If I'm not mistaken, the size of the mapping is derived from the vcpu
> being initialised, so a further improvement is to lift the mapping
> creation out of mapcache_vcpu_init.
But you can just issue XEN_DOMCTL_max_vcpus on a running domain to
increase its max_vcpus count, so ents is not actually constant?
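
To illustrate (a sketch of the behaviour I rely on, mirroring the patch
rather than new code): ents is recomputed from d->max_vcpus on every
vCPU init, so the existing check must be able to fire again later and
extend the mapping:

    unsigned int ents = d->max_vcpus * MAPCACHE_VCPU_ENTRIES;

    if ( ents > dcache->entries )  /* can re-trigger after max_vcpus grows */
    {
        int rc = create_perdomain_mapping(d, MAPCACHE_VIRT_START, ents,
                                          NIL(l1_pgentry_t *), NULL);

        if ( rc )
            return rc;
        dcache->entries = ents;
    }
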
> > }
> >
> > - /* Mark all maphash entries as not in use. */
> > BUILD_BUG_ON(MAPHASHENT_NOTINUSE < MAPCACHE_ENTRIES);
> > + /* MAPHASH_ENTRIES has to be a power of two to make hashing work. */
> > + BUILD_BUG_ON(MAPHASH_ENTRIES & (MAPHASH_ENTRIES - 1));
> > + /*
> > +  * Since entries in the maphash also occupy inuse slots, we have to
> > +  * make sure MAPCACHE_VCPU_ENTRIES is large enough to accommodate both
> > +  * the maphash and a nested page table walk.
> > +  */
> > + BUILD_BUG_ON(MAPCACHE_VCPU_ENTRIES - MAPHASH_ENTRIES <
> > +              CONFIG_PAGING_LEVELS * CONFIG_PAGING_LEVELS);
> > +
> > + v->arch.pv.mapcache = xzalloc(struct mapcache_vcpu);
> > + if ( !v->arch.pv.mapcache )
> > + return -ENOMEM;
> > +
> > + /* Mark all maphash entries as not in use. */
> > for ( i = 0; i < MAPHASH_ENTRIES; i++ )
> > {
> > - struct vcpu_maphash_entry *hashent = &v->arch.pv.mapcache.hash[i];
> > -
> > - hashent->mfn = ~0UL; /* never valid to map */
> > - hashent->idx = MAPHASHENT_NOTINUSE;
> > + v->arch.pv.mapcache->hash_mfn[i] = ~0UL;
>
> mfn_x(INVALID_MFN) here. Or you can change the type of the array to
> mfn_t.
mfn_t is a better idea. Thanks.
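
As a rough sketch of what I have in mind (assuming only the array type
changes; MAPHASH_HASHFN stays a simple mask, which is what the
power-of-two BUILD_BUG_ON above guards):

    /* In struct mapcache_vcpu: */
    mfn_t hash_mfn[MAPHASH_ENTRIES];

    /* At init time: */
    for ( i = 0; i < MAPHASH_ENTRIES; i++ )
        v->arch.pv.mapcache->hash_mfn[i] = INVALID_MFN;

    /* Lookups then compare with mfn_eq() instead of raw integers. */
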
Hongyan