From: "Daniel P. Berrange" <berrange@redhat.com>
To: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>,
	"Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>,
	mingo@kernel.org, pjt@google.com, paul@paulmenage.org,
	akpm@linux-foundation.org, rjw@sisk.pl, nacc@us.ibm.com,
	paulmck@linux.vnet.ibm.com, tglx@linutronix.de,
	seto.hidetoshi@jp.fujitsu.com, rob@landley.net, tj@kernel.org,
	mschmidt@redhat.com, nikunj@linux.vnet.ibm.com,
	vatsa@linux.vnet.ibm.com, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-pm@vger.kernel.org
Subject: Re: [PATCH v2 0/7] CPU hotplug, cpusets: Fix issues with cpusets handling upon CPU hotplug
Date: Tue, 8 May 2012 14:07:40 +0100
Message-ID: <20120508130740.GG18762@redhat.com>
In-Reply-To: <20120504213010.GD3054@linux.vnet.ibm.com>

On Fri, May 04, 2012 at 02:30:11PM -0700, Nishanth Aravamudan wrote:
> On 04.05.2012 [22:56:21 +0200], Peter Zijlstra wrote:
> > On Fri, 2012-05-04 at 13:46 -0700, Nishanth Aravamudan wrote:
> > > What about other users of cpusets (what are they?)? 
> > 
> > cpusets came from SGI, it's traditionally used to partition _large_
> > machines. Things like the batch/job-schedulers that go with that type of
> > setup use it.
> 
> Yeah, I recall that usage (or some description similar). Do we have any
> other known users of cpusets (beyond libvirt)?

IIRC, the lxc.sf.net project also uses cpusets (no connection to the libvirt
LXC driver mentioned below which is an alternative impl of the same concept).

> > I've no clue why libvirt uses it (or why one would use libvirt for that
> > matter).
> 
> Well, it is the case that libvirt does use it, and libvirt is used
> pretty widely (or so it seems to me). I don't use it (cpusets or libvirt
> :) either, but it seems like we should either tell libvirt directly that
> cpusets are inappropriate for their use-case (once we figure out what
> exactly that is, and why they chose cpusets) or work with them to
> support their use-case?

Libvirt uses the cpuset cgroups functionality in two of its
virtualization drivers:

 - LXC.  Container based virt. The cpuset controller is used to
   constrain all processes running inside the container to a
   specific collection of CPUs. While we could use the traditional
   sched_setaffinity() syscall at initial startup of the container,
   this is not so practical when we want to dynamically change the
   affinity of an existing container. It would require that we
   iterate over all tasks changing their affinity, and to avoid
   fork() race conditions we'd need to suspend the container while
   doing this. Thus we've long used the cpuset cgroups controller
   for LXC.

 - KVM.  Full machine virt. By default we use sched_setaffinity()
   to apply constraints on which host CPUs a VM executes on. Fairly
   recently we added the ability to optionally use the cpuset
   controller instead (only if the sysadmin has already mounted
   it). The advantage of this is that if we update the cpuset
   of an existing VM, then IIUC, the kernel will migrate its
   allocated memory to be local to the new CPU set mask.
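
To make the LXC case concrete, here is a minimal sketch (Python, not
libvirt code) of the per-task loop that sched_setaffinity() would
require. The helper names list_threads() and set_affinity_for_tasks()
are made up for illustration; only os.sched_setaffinity() and the
/proc/<pid>/task layout are real interfaces.

```python
import os

def list_threads(pid):
    """Return the thread IDs of a process, read from /proc/<pid>/task."""
    return sorted(int(name) for name in os.listdir(f"/proc/{pid}/task"))

def set_affinity_for_tasks(pids, cpus):
    """Apply one CPU mask to every thread of every listed process.

    Returns the thread IDs that were updated. Tasks forked or cloned
    while this loop runs are never visited by it.
    """
    updated = []
    for pid in pids:
        for tid in list_threads(pid):
            try:
                # On Linux the underlying syscall acts on a single thread.
                os.sched_setaffinity(tid, cpus)
                updated.append(tid)
            except ProcessLookupError:
                pass  # thread exited between listing and update
    return updated
```

Anything the container forks while the loop is running inherits the
mask its parent had at that moment, so without suspending the container
first there is no guarantee that the loop catches every task.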

The pain point we're hitting is that upon suspend/restore the cgroups
cpuset masks are not preserved. This is not a problem for server virt
usage scenarios, but it is for desktop users with virt on laptops.

I don't see a viable alternative to the cpuset controller for our LXC
container driver. For KVM we could do without the cpuset controller
if there is an alternative way to tell the kernel to migrate the KVM
process memory to be local to the new CPU affinity set using the
sched_setaffinity() call.
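
For comparison, a sketch of the cpuset route, assuming the cgroup v1
file names cpuset.cpus, cpuset.mems and cpuset.memory_migrate, and a
group directory that the caller passes in (no real path is implied).
format_cpu_list() and update_cpuset() are hypothetical helpers, not
libvirt functions. According to the kernel's cpuset documentation, page
migration happens when cpuset.mems changes while cpuset.memory_migrate
is set, so the sketch writes the memory nodes explicitly rather than
relying on the CPU mask alone.

```python
from pathlib import Path

def format_cpu_list(cpus):
    """Render CPU (or node) numbers in the kernel's list format, e.g. "0-2,5"."""
    ranges = []
    for cpu in sorted(set(cpus)):
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu          # extend the current run
        else:
            ranges.append([cpu, cpu])    # start a new run
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)

def update_cpuset(group_dir, cpus, mems=None):
    """Write a new CPU list (and optionally memory nodes) into a cpuset group.

    group_dir is assumed to be an existing cpuset cgroup directory.
    """
    group = Path(group_dir)
    if mems is not None:
        # Ask the kernel to move already-allocated pages to the new nodes.
        (group / "cpuset.memory_migrate").write_text("1\n")
        (group / "cpuset.mems").write_text(format_cpu_list(mems) + "\n")
    (group / "cpuset.cpus").write_text(format_cpu_list(cpus) + "\n")
```

A single write covers every task in the group, including ones forked
later, which is what the per-task sched_setaffinity() loop cannot offer.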

We are open to suggestions of alternative approaches, particularly since
we have had no end of trouble with pretty much all of the kernel's
cgroups controllers :-(

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
