From: Juergen Gross <jgross@suse.com>
To: Stefano Stabellini <sstabellini@kernel.org>
Cc: xen-devel <xen-devel@lists.xenproject.org>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	Wei Liu <wei.liu2@citrix.com>
Subject: Re: [for-4.9] Re: HVM guest performance regression
Date: Thu, 8 Jun 2017 11:37:55 +0200	[thread overview]
Message-ID: <16819156-5a02-1f21-83c5-70507eed7a4b@suse.com> (raw)
In-Reply-To: <alpine.DEB.2.10.1706071109510.26108@sstabellini-ThinkPad-X260>

On 07/06/17 20:19, Stefano Stabellini wrote:
> On Wed, 7 Jun 2017, Juergen Gross wrote:
>> On 06/06/17 21:08, Stefano Stabellini wrote:
>>> On Tue, 6 Jun 2017, Juergen Gross wrote:
>>>> On 06/06/17 18:39, Stefano Stabellini wrote:
>>>>> On Tue, 6 Jun 2017, Juergen Gross wrote:
>>>>>> On 26/05/17 21:01, Stefano Stabellini wrote:
>>>>>>> On Fri, 26 May 2017, Juergen Gross wrote:
>>>>>>>> On 26/05/17 18:19, Ian Jackson wrote:
>>>>>>>>> Juergen Gross writes ("HVM guest performance regression"):
>>>>>>>>>> While investigating a performance regression of HVM guests under
>>>>>>>>>> Xen 4.7 compared to 4.5, I found the cause to be commit
>>>>>>>>>> c26f92b8fce3c9df17f7ef035b54d97cbe931c7a ("libxl: remove freemem_slack")
>>>>>>>>>> in Xen 4.6.
>>>>>>>>>>
>>>>>>>>>> The problem occurred when dom0 had to be ballooned down while
>>>>>>>>>> starting the guest. The performance of some micro benchmarks dropped
>>>>>>>>>> by about a factor of 2 with the above commit.
>>>>>>>>>>
>>>>>>>>>> The interesting point is that the guest's performance depends on the
>>>>>>>>>> amount of free memory available at guest creation time. When there is
>>>>>>>>>> barely enough memory available to start the guest, performance remains
>>>>>>>>>> low even if memory is freed later.
>>>>>>>>>>
>>>>>>>>>> I'd like to suggest we either revert the commit or add some other
>>>>>>>>>> mechanism to keep some free memory in reserve when starting a
>>>>>>>>>> domain.
>>>>>>>>>
>>>>>>>>> Oh, dear.  The memory accounting swamp again.  Clearly we are not
>>>>>>>>> going to drain that swamp now, but I don't like regressions.
>>>>>>>>>
>>>>>>>>> I am not opposed to reverting that commit.  I was a bit iffy about it
>>>>>>>>> at the time; and according to the removal commit message, it was
>>>>>>>>> basically removed because it was a piece of cargo cult for which we
>>>>>>>>> had no justification in any of our records.
>>>>>>>>>
>>>>>>>>> Indeed I think fixing this is a candidate for 4.9.
>>>>>>>>>
>>>>>>>>> Do you know the mechanism by which the freemem slack helps?  I think
>>>>>>>>> that would be a prerequisite for reverting this.  That way we can have
>>>>>>>>> an understanding of why we are doing things, rather than just
>>>>>>>>> flailing at random...
>>>>>>>>
>>>>>>>> I wish I understood it.
>>>>>>>>
>>>>>>>> One candidate would be 2M/1G pages being possible with enough free
>>>>>>>> memory, but I haven't proven this yet. I can give it a try by
>>>>>>>> disabling big pages in the hypervisor.
>>>>>>>
>>>>>>> Right, if I had to bet, I would put my money on superpage shattering
>>>>>>> being the cause of the problem.
>>>>>>
>>>>>> Seems you would have lost your money...
>>>>>>
>>>>>> Meanwhile I've found a way to get the "good" performance in the micro
>>>>>> benchmark. Unfortunately this requires switching off the PV interfaces
>>>>>> in the HVM guest via the "xen_nopv" kernel boot parameter.
>>>>>>
>>>>>> I have verified that PV spinlocks are not to blame (via the
>>>>>> "xen_nopvspin" kernel boot parameter). Switching to the TSC
>>>>>> clocksource in the running system doesn't help either.
>>>>>
>>>>> What about xen_hvm_exit_mmap (an optimization for shadow pagetables) and
>>>>> xen_hvm_smp_init (PV IPI)?
>>>>
>>>> xen_hvm_exit_mmap isn't active (a kernel message telling me so was
>>>> issued).
>>>>
>>>>>> Unfortunately the kernel no longer seems to be functional when I try
>>>>>> to tweak it not to use the PVHVM enhancements.
>>>>>
>>>>> I guess you are not talking about regular PV drivers like netfront and
>>>>> blkfront, right?
>>>>
>>>> The plan was to be able to use PV drivers without having to use PV
>>>> callbacks and PV timers. This isn't possible right now.
>>>
>>> I think the code to handle that scenario was gradually removed over time
>>> to simplify the code base.
>>
>> Hmm, too bad.
>>
>>>>>> I'm wondering now whether
>>>>>> there have ever been any benchmarks to prove that PVHVM really is
>>>>>> faster than non-PVHVM. My findings seem to suggest there might be a
>>>>>> huge performance gap with PVHVM. OTOH this might depend on hardware
>>>>>> and other factors.
>>>>>>
>>>>>> Stefano, didn't you do the PVHVM stuff back in 2010? Do you have any
>>>>>> data from then regarding performance figures?
>>>>>
>>>>> Yes, I still have these slides:
>>>>>
>>>>> https://www.slideshare.net/xen_com_mgr/linux-pv-on-hvm
>>>>
>>>> Thanks. So you measured the overall package, not the individual items
>>>> like callbacks, timers, and time source? I'm asking because I'm
>>>> starting to believe some of those are slower than their non-PV
>>>> variants.
>>>
>>> There isn't much left in terms of individual optimizations: you already
>>> tried switching the clocksource and removing PV spinlocks.
>>> xen_hvm_exit_mmap is not used. Only the following are left (you might
>>> want to double-check that I haven't missed anything):
>>>
>>> 1) PV IPI
>>
>> It's a 1-vcpu guest.
>>
>>> 2) PV suspend/resume
>>> 3) vector callback
>>> 4) interrupt remapping
>>>
>>> 2) is not on the hot path.
>>> I did individual measurements of 3) at some point and it was a clear win.
>>
>> That might depend on the hardware. Could it be newer processors are
>> faster here?
> 
> I don't think so: the alternative is an emulated interrupt, which is
> slower from every point of view.

What about the APIC virtualization of modern processors? Are you sure
that e.g. timer interrupts aren't handled completely by the processor?
I guess this might be faster than letting the hypervisor handle them
and then using the callback into the guest.
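
As a rough check on my side I could grep the Xen boot log for the
hardware acceleration features the hypervisor detected (a sketch only;
the exact message wording differs between Xen versions and hardware):

  # look for vAPIC/posted-interrupt support in the Xen boot messages;
  # the strings below are approximate and may differ per version
  xl dmesg | grep -i -e "apic virtualization" -e "posted interrupt"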

> I would try to run the test with xen_emul_unplug="never" which means
> that you are going to end up using the emulated network card and
> emulated IDE controller, but some of the other optimizations (like the
> vector callback) will still be active.

Now this is something I wouldn't like to do. My test isn't using any
I/O at all and is showing bad performance with PV interfaces enabled.
The only remedy right now seems to be to switch off PV interfaces,
leading to bad I/O performance but good non-I/O performance.

You are suggesting a mode with bad I/O performance _and_ bad non-I/O
performance.
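
Just to make the variants explicit, the guest kernel command lines
being compared look roughly like this (an illustrative sketch only;
the root device and the rest of the command line are made up):

  # default: all PVHVM enhancements active
  root=/dev/xvda1 ro

  # what I tested: PV interfaces switched off entirely
  root=/dev/xvda1 ro xen_nopv

  # your suggestion: keep vector callback etc., use emulated devices
  root=/dev/xvda1 ro xen_emul_unplug=never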

> If the cause of the problem is ballooning for example, using emulated
> interfaces for IO will reduce the amount of ballooned out pages
> significantly.

No I/O involved in my benchmark.


Juergen


Thread overview: 27+ messages
2017-05-26 16:14 HVM guest performance regression Juergen Gross
2017-05-26 16:19 ` [for-4.9] " Ian Jackson
2017-05-26 17:00   ` Juergen Gross
2017-05-26 19:01     ` Stefano Stabellini
2017-05-29 19:05       ` Juergen Gross
2017-05-30  7:24         ` Jan Beulich
     [not found]         ` <592D3A3A020000780015D787@suse.com>
2017-05-30 10:33           ` Juergen Gross
2017-05-30 10:43             ` Jan Beulich
     [not found]             ` <592D68DC020000780015D919@suse.com>
2017-05-30 14:57               ` Juergen Gross
2017-05-30 15:10                 ` Jan Beulich
2017-06-06 13:44       ` Juergen Gross
2017-06-06 16:39         ` Stefano Stabellini
2017-06-06 19:00           ` Juergen Gross
2017-06-06 19:08             ` Stefano Stabellini
2017-06-07  6:55               ` Juergen Gross
2017-06-07 18:19                 ` Stefano Stabellini
2017-06-08  9:37                   ` Juergen Gross [this message]
2017-06-08 18:09                     ` Stefano Stabellini
2017-06-08 18:28                       ` Juergen Gross
2017-06-08 21:00                     ` Dario Faggioli
2017-06-11  2:27                       ` Konrad Rzeszutek Wilk
2017-06-12  5:48                       ` Solved: " Juergen Gross
2017-06-12  7:35                         ` Andrew Cooper
2017-06-12  7:47                           ` Juergen Gross
2017-06-12  8:30                             ` Andrew Cooper
2017-05-26 17:04 ` Dario Faggioli
2017-05-26 17:25   ` Juergen Gross
