From: Chegu Vinod <chegu_vinod@hp.com>
To: kvm@vger.kernel.org
Subject: Performance of 40-way guest running 2.6.32-220 (RHEL6.2) vs. 3.3.1 OS
Date: Wed, 11 Apr 2012 17:21:59 +0000 (UTC) [thread overview]
Message-ID: <loom.20120411T183827-108@post.gmane.org> (raw)
Hello,
While running AIM7 (workfile.high_systime) in a single 40-way (or 60-way) KVM
guest, I noticed pretty bad performance when the guest was booted with a 3.3.1
kernel compared to the same guest booted with the 2.6.32-220 (RHEL6.2) kernel.
I am still trying to dig more into the details here. I am wondering if some
changes in the upstream kernel (i.e. since 2.6.32-220) might be causing this to
show up in a guest environment (especially for this system-intensive workload).
Has anyone else observed this kind of behavior? Is it a known issue with a fix
in the pipeline? If not, are there any special knobs/tunables that need to be
explicitly set/cleared when using newer kernels like 3.3.1 in a guest?
I have included some info below. Any pointers on what else I could capture
would also be helpful.
Thanks!
Vinod
---
Platform used:
DL980 G7 (80 cores + 128G RAM). Hyper-threading is turned off.
Workload used:
AIM7 (workfile.high_systime) using RAM disks. This is primarily a
CPU-intensive workload; not much I/O.
Software used :
qemu-system-x86_64 : 1.0.50 (i.e. latest as of about a week or so ago).
Native/Host OS : 3.3.1 (SLUB allocator explicitly enabled)
Guest-RunA OS : 2.6.32-220 (i.e. RHEL6.2 kernel)
Guest-RunB OS : 3.3.1
Guest was pinned on :
numa node: 4,5,6,7 -> 40VCPUs + 64G (i.e. 40-way guest)
numa node: 2,3,4,5,7 -> 60VCPUs + 96G (i.e. 60-way guest)
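The exact pinning commands weren't included in the report; for reference, one common way to pin a QEMU guest's CPUs and memory to a set of NUMA nodes is numactl. This is only a sketch of the 40-way case (the node list matches the one above, but the remaining qemu options are placeholders, not the actual command used):

```shell
# Bind the guest's vCPU threads and memory allocations to nodes 4-7
# (40 vCPUs + 64G, as in the 40-way configuration above).
numactl --cpunodebind=4,5,6,7 --membind=4,5,6,7 \
    qemu-system-x86_64 -smp 40 -m 65536 ... # remaining options omitted
```

Per-vCPU pinning (e.g. via taskset on the individual vCPU threads, or libvirt's vcpupin) gives finer control, but node-level binding like this is enough to keep the guest on the intended sockets.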
For the 40-way guest, Guest-RunA (2.6.32-220 kernel) performed nearly 9x better
than Guest-RunB (3.3.1 kernel). In the case of the 60-way guest, the older
guest kernel was nearly 12x better!
For the Guest-RunB (3.3.1) case I ran "mpstat -P ALL 1" on the host and observed
that a very high % of time was being spent by the CPUs outside guest mode,
mostly in the host kernel (i.e. sys). Looking at the "perf" traces, it seemed
like there were long pauses in the guest, perhaps waiting for the
zone->lru_lock in release_pages(), and this caused the VT PLE (Pause Loop
Exit) related code to kick in on the host.
I turned on function tracing and found that more time appears to be spent
around the lock code in the 3.3.1 guest than in the 2.6.32-220 guest. Here is a
small sampling of these traces. Notice the timestamp jump around
"_raw_spin_lock_irqsave <-release_pages" in the case of Guest-RunB.
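For anyone wanting to reproduce this, the traces below came from the ftrace function tracer. The exact invocation wasn't included in the report; a minimal sketch using the standard debugfs tracing interface (run as root inside the guest, assuming debugfs is mounted at /sys/kernel/debug) would look like:

```shell
cd /sys/kernel/debug/tracing

# Stop any tracing in progress and select the function tracer.
echo 0 > tracing_on
echo function > current_tracer

# Optionally restrict tracing to the functions of interest to cut volume.
echo release_pages > set_ftrace_filter
echo free_pages_and_swap_cache >> set_ftrace_filter

# Capture a few seconds of the workload, then snapshot the ring buffer.
echo 1 > tracing_on
sleep 5
echo 0 > tracing_on
cat trace > /tmp/ftrace.out
```

Without the filter, the full function trace (as sampled below) shows every traced call with its timestamp.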
1) 40-way Guest-RunA (2.6.32-220 kernel):
-----------------------------------------
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
<...>-32147 [020] 145783.127452: native_flush_tlb <-flush_tlb_mm
<...>-32147 [020] 145783.127452: free_pages_and_swap_cache <-unmap_region
<...>-32147 [020] 145783.127452: lru_add_drain <-free_pages_and_swap_cache
<...>-32147 [020] 145783.127452: release_pages <-free_pages_and_swap_cache
<...>-32147 [020] 145783.127452: _spin_lock_irqsave <-release_pages
<...>-32147 [020] 145783.127452: __mod_zone_page_state <-release_pages
<...>-32147 [020] 145783.127452: mem_cgroup_del_lru_list <-release_pages
...
<...>-32147 [022] 145783.133536: release_pages <-free_pages_and_swap_cache
<...>-32147 [022] 145783.133536: _spin_lock_irqsave <-release_pages
<...>-32147 [022] 145783.133536: __mod_zone_page_state <-release_pages
<...>-32147 [022] 145783.133536: mem_cgroup_del_lru_list <-release_pages
<...>-32147 [022] 145783.133537: lookup_page_cgroup <-mem_cgroup_del_lru_list
2) 40-way Guest-RunB (3.3.1):
-----------------------------
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
<...>-16459 [009] .... 101757.383125: free_pages_and_swap_cache <-tlb_flush_mmu
<...>-16459 [009] .... 101757.383125: lru_add_drain <-free_pages_and_swap_cache
<...>-16459 [009] .... 101757.383125: release_pages <-free_pages_and_swap_cache
<...>-16459 [009] .... 101757.383125: _raw_spin_lock_irqsave <-release_pages
<...>-16459 [009] d... 101757.384861: mem_cgroup_lru_del_list <-release_pages
<...>-16459 [009] d... 101757.384861: lookup_page_cgroup <-mem_cgroup_lru_del_list
....
<...>-16459 [009] .N.. 101757.390385: release_pages <-free_pages_and_swap_cache
<...>-16459 [009] .N.. 101757.390385: _raw_spin_lock_irqsave <-release_pages
<...>-16459 [009] dN.. 101757.392983: mem_cgroup_lru_del_list <-release_pages
<...>-16459 [009] dN.. 101757.392983: lookup_page_cgroup <-mem_cgroup_lru_del_list
<...>-16459 [009] dN.. 101757.392983: __mod_zone_page_state <-release_pages
Thread overview: 9+ messages
2012-04-11 17:21 Chegu Vinod [this message]
2012-04-12 18:21 ` Performance of 40-way guest running 2.6.32-220 (RHEL6.2) vs. 3.3.1 OS Rik van Riel
2012-04-16 3:04 ` Chegu Vinod
2012-04-16 12:18 ` Gleb Natapov
2012-04-16 14:44 ` Chegu Vinod
2012-04-17 9:49 ` Gleb Natapov
2012-04-17 13:25 ` Chegu Vinod
2012-04-19 4:44 ` Chegu Vinod
2012-04-19 6:01 ` Gleb Natapov