From: "Daniel P. Berrange" <berrange@redhat.com>
To: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
"Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>,
mingo@kernel.org, pjt@google.com, paul@paulmenage.org,
akpm@linux-foundation.org, rjw@sisk.pl, nacc@us.ibm.com,
paulmck@linux.vnet.ibm.com, tglx@linutronix.de,
seto.hidetoshi@jp.fujitsu.com, rob@landley.net, tj@kernel.org,
mschmidt@redhat.com, nikunj@linux.vnet.ibm.com,
vatsa@linux.vnet.ibm.com, linux-kernel@vger.kernel.org,
linux-doc@vger.kernel.org, linux-pm@vger.kernel.org
Subject: Re: [PATCH v2 0/7] CPU hotplug, cpusets: Fix issues with cpusets handling upon CPU hotplug
Date: Tue, 8 May 2012 14:07:40 +0100
Message-ID: <20120508130740.GG18762@redhat.com>
In-Reply-To: <20120504213010.GD3054@linux.vnet.ibm.com>

On Fri, May 04, 2012 at 02:30:11PM -0700, Nishanth Aravamudan wrote:
> On 04.05.2012 [22:56:21 +0200], Peter Zijlstra wrote:
> > On Fri, 2012-05-04 at 13:46 -0700, Nishanth Aravamudan wrote:
> > > What about other users of cpusets (what are they?)?
> >
> > cpusets came from SGI, its traditionally used to partition _large_
> > machines. Things like the batch/job-schedulers that go with that type of
> > setup use it.
>
> Yeah, I recall that usage (or some description similar). Do we have any
> other known users of cpusets (beyond libvirt)?

IIRC, the lxc.sf.net project also uses cpusets (no connection to the libvirt
LXC driver mentioned below, which is an alternative impl of the same concept).

> > I've no clue why libvirt uses it (or why one would use libvirt for that
> > matter).
>
> Well, it is the case that libvirt does use it, and libvirt is used
> pretty widely (or so it seems to me). I don't use it (cpusets or libvirt
> :) either, but it seems like we should either tell libvirt directly that
> cpusets are inappropriate for their use-case (once we figure out what
> exactly that is, and why they chose cpusets) or work with them to
> support their use-case?

Libvirt uses the cpuset cgroups functionality in two of its
virtualization drivers:

 - LXC. Container-based virt. The cpuset controller is used to
   constrain all processes running inside the container to a
   specific collection of CPUs. While we could use the traditional
   sched_setaffinity() syscall at initial startup of the container,
   this is not so practical when we want to dynamically change the
   affinity of an existing container. It would require that we
   iterate over all tasks changing their affinity, and to avoid
   fork() race conditions we'd need to suspend the container while
   doing this. Thus we've long used the cpuset cgroups controller
   for LXC.

 - KVM. Full machine virt. By default we use sched_setaffinity()
   to apply constraints on what host CPUs a VM executes on. Fairly
   recently we added the ability to optionally use the cpuset
   controller instead (only if the sysadmin has already mounted
   it). The advantage of this is that if we update the cpuset
   of an existing VM, then IIUC, the kernel will migrate its
   allocated memory to be local to the new CPU set mask.
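
To illustrate the LXC trade-off described above, here is a rough sketch
(not libvirt's actual code) contrasting the two approaches. The function
names and the cgroup directory argument are hypothetical, and the
cpuset.cpus file name assumes the controller is mounted with its usual
file prefix. os.sched_setaffinity() is Linux-only.

```python
import os


def format_cpu_list(cpus):
    """Format CPU numbers as a kernel-style list, e.g. {0, 1, 2, 5} -> "0-2,5"."""
    ordered = sorted(cpus)
    ranges = []
    i = 0
    while i < len(ordered):
        start = end = ordered[i]
        # Extend the current range while the next CPU number is contiguous.
        while i + 1 < len(ordered) and ordered[i + 1] == end + 1:
            i += 1
            end = ordered[i]
        ranges.append(str(start) if start == end else f"{start}-{end}")
        i += 1
    return ",".join(ranges)


def set_affinity_per_task(pid, cpus):
    """Per-task approach: walk every thread of one process and pin it.

    This is racy: a thread created while the walk is in progress is
    missed, which is why the container would have to be suspended first.
    A real container would also need this repeated for every process.
    """
    for tid in os.listdir(f"/proc/{pid}/task"):
        try:
            os.sched_setaffinity(int(tid), cpus)
        except ProcessLookupError:
            pass  # the thread exited during the walk


def set_affinity_via_cpuset(cgroup_dir, cpus):
    """cpuset approach: a single write to the group's cpuset.cpus file.

    The kernel then applies the mask to every task in the group,
    including tasks forked later, so no userspace iteration is needed.
    cgroup_dir is a hypothetical path to the container's cpuset group.
    """
    with open(os.path.join(cgroup_dir, "cpuset.cpus"), "w") as f:
        f.write(format_cpu_list(cpus))
```

The second function is a single atomic request from userspace's point of
view, which is the property that makes the controller attractive for
changing the affinity of a running container.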

The pain point we're hitting is that upon suspend/restore the cgroups
cpuset masks are not preserved. This is not a problem for server virt
usage scenarios, but it is for desktop users with virt on laptops.

I don't see a viable alternative to the cpuset controller for our LXC
container driver. For KVM we could do without the cpuset controller
if there is an alternative way to tell the kernel to migrate the KVM
process memory to be local to the new CPU affinity set using the
sched_setaffinity() call.

We are open to suggestions of alternative approaches, particularly since
we have had no end of trouble with pretty much all of the kernel's
cgroups controllers :-(

Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|