xen-devel.lists.xenproject.org archive mirror
From: Stefano Stabellini <sstabellini@kernel.org>
To: Juergen Gross <jgross@suse.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>,
	Stefano Stabellini <sstabellini@kernel.org>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	Wei Liu <wei.liu2@citrix.com>
Subject: Re: [for-4.9] Re: HVM guest performance regression
Date: Thu, 8 Jun 2017 11:09:12 -0700 (PDT)
Message-ID: <alpine.DEB.2.10.1706081103510.26108@sstabellini-ThinkPad-X260>
In-Reply-To: <16819156-5a02-1f21-83c5-70507eed7a4b@suse.com>

On Thu, 8 Jun 2017, Juergen Gross wrote:
> On 07/06/17 20:19, Stefano Stabellini wrote:
> > On Wed, 7 Jun 2017, Juergen Gross wrote:
> >> On 06/06/17 21:08, Stefano Stabellini wrote:
> >>> On Tue, 6 Jun 2017, Juergen Gross wrote:
> >>>> On 06/06/17 18:39, Stefano Stabellini wrote:
> >>>>> On Tue, 6 Jun 2017, Juergen Gross wrote:
> >>>>>> On 26/05/17 21:01, Stefano Stabellini wrote:
> >>>>>>> On Fri, 26 May 2017, Juergen Gross wrote:
> >>>>>>>> On 26/05/17 18:19, Ian Jackson wrote:
> >>>>>>>>> Juergen Gross writes ("HVM guest performance regression"):
> >>>>>>>>>> Looking for the reason of a performance regression of HVM guests under
> >>>>>>>>>> Xen 4.7 against 4.5 I found the reason to be commit
> >>>>>>>>>> c26f92b8fce3c9df17f7ef035b54d97cbe931c7a ("libxl: remove freemem_slack")
> >>>>>>>>>> in Xen 4.6.
> >>>>>>>>>>
> >>>>>>>>>> The problem occurred when dom0 had to be ballooned down when starting
> >>>>>>>>>> the guest. The performance of some micro benchmarks dropped by about
> >>>>>>>>>> a factor of 2 with above commit.
> >>>>>>>>>>
> >>>>>>>>>> Interesting point is that the performance of the guest will depend on
> >>>>>>>>>> the amount of free memory being available at guest creation time.
> >>>>>>>>>> When there was barely enough memory available for starting the guest
> >>>>>>>>>> the performance will remain low even if memory is being freed later.
> >>>>>>>>>>
> >>>>>>>>>> I'd like to suggest we either revert the commit or have some other
> >>>>>>>>>> mechanism to try to have some reserve free memory when starting a
> >>>>>>>>>> domain.
> >>>>>>>>>
> >>>>>>>>> Oh, dear.  The memory accounting swamp again.  Clearly we are not
> >>>>>>>>> going to drain that swamp now, but I don't like regressions.
> >>>>>>>>>
> >>>>>>>>> I am not opposed to reverting that commit.  I was a bit iffy about it
> >>>>>>>>> at the time; and according to the removal commit message, it was
> >>>>>>>>> basically removed because it was a piece of cargo cult for which we
> >>>>>>>>> had no justification in any of our records.
> >>>>>>>>>
> >>>>>>>>> Indeed I think fixing this is a candidate for 4.9.
> >>>>>>>>>
> >>>>>>>>> Do you know the mechanism by which the freemem slack helps ?  I think
> >>>>>>>>> that would be a prerequisite for reverting this.  That way we can have
> >>>>>>>>> an understanding of why we are doing things, rather than just
> >>>>>>>>> flailing at random...
> >>>>>>>>
> >>>>>>>> I wish I would understand it.
> >>>>>>>>
> >>>>>>>> One candidate would be 2M/1G pages being possible with enough free
> >>>>>>>> memory, but I haven't proven this yet. I can try disabling
> >>>>>>>> big pages in the hypervisor.
> >>>>>>>
> >>>>>>> Right, if I had to bet, I would put my money on superpages shattering
> >>>>>>> being the cause of the problem.
> >>>>>>
> >>>>>> Seems you would have lost your money...
> >>>>>>
> >>>>>> Meanwhile I've found a way to get the "good" performance in the micro
> >>>>>> benchmark. Unfortunately this requires to switch off the pv interfaces
> >>>>>> in the HVM guest via "xen_nopv" kernel boot parameter.
> >>>>>>
> >>>>>> I have verified that pv spinlocks are not to blame (via "xen_nopvspin"
> >>>>>> kernel boot parameter). Switching to clocksource TSC in the running
> >>>>>> system doesn't help either.
> >>>>>
> >>>>> What about xen_hvm_exit_mmap (an optimization for shadow pagetables) and
> >>>>> xen_hvm_smp_init (PV IPI)?
> >>>>
> >>>> xen_hvm_exit_mmap isn't active (kernel message telling me so was
> >>>> issued).
> >>>>
> >>>>>> Unfortunately the kernel seems no longer to be functional when I try to
> >>>>>> tweak it not to use the PVHVM enhancements.
> >>>>>
> >>>>> I guess you are not talking about regular PV drivers like netfront and
> >>>>> blkfront, right?
> >>>>
> >>>> The plan was to be able to use PV drivers without having to use PV
> >>>> callbacks and PV timers. This isn't possible right now.
> >>>
> >>> I think the code to handle that scenario was gradually removed over time
> >>> to simplify the code base.
> >>
> >> Hmm, too bad.
> >>
> >>>>>> I'm wondering now whether
> >>>>>> there have ever been any benchmarks to proof PVHVM really being faster
> >>>>>> than non-PVHVM? My findings seem to suggest there might be a huge
> >>>>>> performance gap with PVHVM. OTOH this might depend on hardware and other
> >>>>>> factors.
> >>>>>>
> >>>>>> Stefano, didn't you do the PVHVM stuff back in 2010? Do you have any
> >>>>>> data from then regarding performance figures?
> >>>>>
> >>>>> Yes, I still have these slides:
> >>>>>
> >>>>> https://www.slideshare.net/xen_com_mgr/linux-pv-on-hvm
> >>>>
> >>>> Thanks. So you measured the overall package, not the single items like
> >>>> callbacks, timers, time source? I'm asking because I start to believe
> >>>> there are some of those slower than their non-PV variants.
> >>>
> >>> There isn't much left in terms of individual optimizations: you already
> >>> tried switching clocksource and removing pv spinlocks. xen_hvm_exit_mmap
> >>> is not used. Only the following are left (you might want to double check
> >>> I haven't missed anything):
> >>>
> >>> 1) PV IPI
> >>
> >> Its a 1 vcpu guest.
> >>
> >>> 2) PV suspend/resume
> >>> 3) vector callback
> >>> 4) interrupt remapping
> >>>
> >>> 2) is not on the hot path.
> >>> I did individual measurements of 3) at some points and it was a clear win.
> >>
> >> That might depend on the hardware. Could it be newer processors are
> >> faster here?
> > 
> > I don't think so: the alternative is an emulated interrupt. It's
> > slower from every point of view.
> 
> What about APIC virtualization of modern processors? Are you sure e.g.
> timer interrupts aren't handled completely by the processor? I guess
> this might be faster than letting it be handled by the hypervisor and
> then use the callback into the guest.
> 
> > I would try to run the test with xen_emul_unplug="never" which means
> > that you are going to end up using the emulated network card and
> > emulated IDE controller, but some of the other optimizations (like the
> > vector callback) will still be active.
> 
> Now this is something I wouldn't like to do. My test isn't using any
> I/O at all and is showing bad performance with pv interfaces being used.
> The only remedy right now seems to be to switch off pv interfaces
> leading to a bad I/O performance, but a good non-I/O performance.
> 
> You are suggesting a mode with bad I/O performance _and_ bad non-I/O
> performance.

I was only suggesting this for debugging, to better understand the
problem, not as a solution.


> > If the cause of the problem is ballooning for example, using emulated
> > interfaces for IO will reduce the amount of ballooned out pages
> > significantly.
> 
> No I/O involved in my benchmark.

I admit that if your test doesn't do any I/O, it is not likely that
xen_emul_unplug="never" will help us understand the problem.

Nonetheless, I believe that a simple blkfront/blkback or
netfront/netback connection, even without any I/O being done, leads to a
couple of calls into the ballooning code (xenbus_map_ring_valloc_hvm ->
alloc_xenballooned_pages).
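
One rough way to check this from inside the guest would be to snapshot the
balloon driver's counters before and after the frontends connect. The sketch
below is only an illustration: it assumes the guest kernel exposes the balloon
driver under /sys/devices/system/xen_memory/xen_memory0 with info/current_kb
and target_kb attributes, and that pages taken through
alloc_xenballooned_pages show up as a drop in current_kb. Both points should
be verified on the kernel in question. The helper names are hypothetical.

```python
import os

# Assumed sysfs location of the Xen balloon driver inside a Linux guest.
DEFAULT_ROOT = "/sys/devices/system/xen_memory/xen_memory0"


def read_kb(root, relpath):
    """Return the integer kB value stored in root/relpath, or None if unreadable."""
    try:
        with open(os.path.join(root, relpath)) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def balloon_snapshot(root=DEFAULT_ROOT):
    """Read the current and target memory sizes (in kB) of the balloon driver."""
    return {
        "current_kb": read_kb(root, "info/current_kb"),
        "target_kb": read_kb(root, "target_kb"),
    }


def ballooned_out_kb(before, after):
    """kB by which current_kb dropped between two snapshots, None if unknown."""
    if before["current_kb"] is None or after["current_kb"] is None:
        return None
    return before["current_kb"] - after["current_kb"]


if __name__ == "__main__":
    # Take one snapshot early in boot and another after the frontends
    # have connected, then compare the two with ballooned_out_kb().
    print(balloon_snapshot())
```

A non-zero difference between an early snapshot and one taken after
blkfront/netfront have connected would hint that ballooning is involved even
though the benchmark itself does no I/O.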
