From: Juergen Gross <jgross@suse.com>
To: Stefano Stabellini <sstabellini@kernel.org>
Cc: xen-devel <xen-devel@lists.xenproject.org>,
Ian Jackson <ian.jackson@eu.citrix.com>,
Wei Liu <wei.liu2@citrix.com>
Subject: Re: [for-4.9] Re: HVM guest performance regression
Date: Wed, 7 Jun 2017 08:55:55 +0200
Message-ID: <294d36b9-0ebb-647d-ecfa-7a4e2c0ada47@suse.com>
In-Reply-To: <alpine.DEB.2.10.1706061202290.15791@sstabellini-ThinkPad-X260>
On 06/06/17 21:08, Stefano Stabellini wrote:
> On Tue, 6 Jun 2017, Juergen Gross wrote:
>> On 06/06/17 18:39, Stefano Stabellini wrote:
>>> On Tue, 6 Jun 2017, Juergen Gross wrote:
>>>> On 26/05/17 21:01, Stefano Stabellini wrote:
>>>>> On Fri, 26 May 2017, Juergen Gross wrote:
>>>>>> On 26/05/17 18:19, Ian Jackson wrote:
>>>>>>> Juergen Gross writes ("HVM guest performance regression"):
>>>>>>>> While looking for the cause of a performance regression of HVM guests
>>>>>>>> under Xen 4.7 compared to 4.5, I found the culprit to be commit
>>>>>>>> c26f92b8fce3c9df17f7ef035b54d97cbe931c7a ("libxl: remove freemem_slack")
>>>>>>>> in Xen 4.6.
>>>>>>>>
>>>>>>>> The problem occurred when dom0 had to be ballooned down when starting
>>>>>>>> the guest. The performance of some micro benchmarks dropped by about
>>>>>>>> a factor of 2 with the above commit.
>>>>>>>>
>>>>>>>> An interesting point is that the performance of the guest depends on
>>>>>>>> the amount of free memory available at guest creation time. When there
>>>>>>>> is barely enough memory available to start the guest, the performance
>>>>>>>> remains low even if memory is freed later.
>>>>>>>>
>>>>>>>> I'd like to suggest we either revert the commit or add some other
>>>>>>>> mechanism to keep some free memory in reserve when starting a
>>>>>>>> domain.
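(As an aside: the ballooning of dom0 at domain creation time can be
avoided completely by giving dom0 a fixed memory size. A sketch of such
a setup, with illustrative values only:

    dom0_mem=4096M,max:4096M     # on the Xen boot command line
    autoballoon="off"            # in /etc/xen/xl.conf

That's a workaround, though, not a fix for the regression itself.)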
>>>>>>>
>>>>>>> Oh, dear. The memory accounting swamp again. Clearly we are not
>>>>>>> going to drain that swamp now, but I don't like regressions.
>>>>>>>
>>>>>>> I am not opposed to reverting that commit. I was a bit iffy about it
>>>>>>> at the time; and according to the removal commit message, it was
>>>>>>> basically removed because it was a piece of cargo cult for which we
>>>>>>> had no justification in any of our records.
>>>>>>>
>>>>>>> Indeed I think fixing this is a candidate for 4.9.
>>>>>>>
>>>>>>> Do you know the mechanism by which the freemem slack helps? I think
>>>>>>> that would be a prerequisite for reverting this. That way we can have
>>>>>>> an understanding of why we are doing things, rather than just
>>>>>>> flailing at random...
>>>>>>
>>>>>> I wish I understood it.
>>>>>>
>>>>>> One candidate would be 2M/1G pages being possible with enough free
>>>>>> memory, but I haven't proven this yet. I can give it a try by disabling
>>>>>> big pages in the hypervisor.
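(In case someone wants to reproduce this: HAP superpages can be disabled
via the Xen boot command line, e.g.:

    hap_1gb=0 hap_2mb=0

assuming the guest runs with HAP; shown here for illustration only.)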
>>>>>
>>>>> Right, if I had to bet, I would put my money on superpage shattering
>>>>> being the cause of the problem.
>>>>
>>>> Seems you would have lost your money...
>>>>
>>>> Meanwhile I've found a way to get the "good" performance in the micro
>>>> benchmark. Unfortunately this requires switching off the PV interfaces
>>>> in the HVM guest via the "xen_nopv" kernel boot parameter.
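(Concretely, that means appending "xen_nopv" to the guest kernel's boot
command line; one way, assuming the guest boots via grub, is editing
/etc/default/grub along these lines and regenerating the grub config:

    GRUB_CMDLINE_LINUX="... xen_nopv"

Illustrative only; the rest of the command line is setup-specific.)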
>>>>
>>>> I have verified that PV spinlocks are not to blame (via the
>>>> "xen_nopvspin" kernel boot parameter). Switching to the TSC clocksource
>>>> in the running system doesn't help either.
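(For reference, the runtime clocksource switch is done via sysfs, e.g.:

    cat /sys/devices/system/clocksource/clocksource0/available_clocksource
    echo tsc > /sys/devices/system/clocksource/clocksource0/current_clocksource

the second command needing root, of course.)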
>>>
>>> What about xen_hvm_exit_mmap (an optimization for shadow pagetables) and
>>> xen_hvm_smp_init (PV IPI)?
>>
>> xen_hvm_exit_mmap isn't active (a kernel message telling me so was
>> issued).
>>
>>>> Unfortunately the kernel no longer seems to be functional when I try to
>>>> tweak it not to use the PVHVM enhancements.
>>>
>>> I guess you are not talking about regular PV drivers like netfront and
>>> blkfront, right?
>>
>> The plan was to be able to use PV drivers without having to use PV
>> callbacks and PV timers. This isn't possible right now.
>
> I think the code to handle that scenario was gradually removed over time
> to simplify the code base.
Hmm, too bad.
>>>> I'm wondering now whether
>>>> there have ever been any benchmarks proving PVHVM really is faster
>>>> than non-PVHVM. My findings suggest there might be a huge
>>>> performance gap with PVHVM. OTOH this might depend on hardware and other
>>>> factors.
>>>>
>>>> Stefano, didn't you do the PVHVM stuff back in 2010? Do you have any
>>>> data from then regarding performance figures?
>>>
>>> Yes, I still have these slides:
>>>
>>> https://www.slideshare.net/xen_com_mgr/linux-pv-on-hvm
>>
>> Thanks. So you measured the overall package, not the individual items
>> like callbacks, timers and time source? I'm asking because I'm starting
>> to believe some of those are slower than their non-PV variants.
>
> There isn't much left in terms of individual optimizations: you already
> tried switching the clocksource and removing PV spinlocks.
> xen_hvm_exit_mmap is not used. Only the following are left (you might
> want to double-check I haven't missed anything):
>
> 1) PV IPI
It's a 1-vcpu guest.
> 2) PV suspend/resume
> 3) vector callback
> 4) interrupt remapping
>
> 2) is not on the hot path.
> I did individual measurements of 3) at some point and it was a clear win.
That might depend on the hardware. Could it be that newer processors are
faster here?
> Slide 14 shows the individual measurements of 4)
I don't think this is affecting my benchmark. It is just munmap after
all.
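
(For context, the benchmark boils down to a tight map/unmap loop on an
anonymous mapping, conceptually like the sketch below; a simplified
illustration, not the actual benchmark source:

    #include <stdio.h>
    #include <sys/mman.h>
    #include <time.h>

    #define ITERATIONS 100000
    #define MAP_SIZE   (16 * 4096)

    int main(void)
    {
        struct timespec start, end;

        clock_gettime(CLOCK_MONOTONIC, &start);
        for (int i = 0; i < ITERATIONS; i++) {
            /* Map and immediately unmap an anonymous region; the
             * measured cost is dominated by munmap(). */
            void *p = mmap(NULL, MAP_SIZE, PROT_READ | PROT_WRITE,
                           MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
            if (p == MAP_FAILED)
                return 1;
            munmap(p, MAP_SIZE);
        }
        clock_gettime(CLOCK_MONOTONIC, &end);

        printf("%.3f s\n", (end.tv_sec - start.tv_sec) +
               (end.tv_nsec - start.tv_nsec) / 1e9);
        return 0;
    }
)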
>
> Only 1) is left to check as far as I can tell.
No IPIs should be involved.
Juergen