From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S268278AbUJJM22 (ORCPT ); Sun, 10 Oct 2004 08:28:28 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S268279AbUJJM22 (ORCPT ); Sun, 10 Oct 2004 08:28:28 -0400
Received: from mail01.hpce.nec.com ([193.141.139.228]:13499 "EHLO mail01.hpce.nec.com")
	by vger.kernel.org with ESMTP id S268278AbUJJM1N (ORCPT );
	Sun, 10 Oct 2004 08:27:13 -0400
From: Erich Focht
To: Nick Piggin
Subject: Re: [Lse-tech] [RFC PATCH] scheduler: Dynamic sched_domains
Date: Sun, 10 Oct 2004 14:25:00 +0200
User-Agent: KMail/1.6.2
Cc: colpatch@us.ibm.com, LSE Tech , Paul Jackson , "Martin J. Bligh" ,
	Andrew Morton , ckrm-tech@lists.sourceforge.net, LKML ,
	simon.derr@bull.net, frankeh@watson.ibm.com
References: <1097110266.4907.187.camel@arrakis> <200410090113.40589.efocht@hpce.nec.com> <416727C6.5000000@yahoo.com.au>
In-Reply-To: <416727C6.5000000@yahoo.com.au>
MIME-Version: 1.0
Content-Disposition: inline
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <200410101425.00486.efocht@hpce.nec.com>
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Saturday 09 October 2004 01:50, Nick Piggin wrote:
> Erich Focht wrote:
>
> >>I personally like the hierarchical idea. Machine topologies tend to
> >>look tree-like, and every useful sched_domain layout I've ever seen has
> >>been tree-like. I think our interface should match that.
> >
> >
> > I like the hierarchical idea, too. The natural way to build it would
> > be by starting from the cpus and going up. This tree stands on its
> > leafs... and I'm not sure how to express that in a filesystem.
> >
>
> Why would you ever want to play around with the internals of the
> thing though? Provided you have a way to create exclusive sets of
> CPUs, when would you care about doing more?

Three reasons come immediately to my mind:

- Move the sched domains setup out of the kernel into user space. With
  my proposal of a filesystem with directory operations only (just
  moving cpuX virtual files around), the boot setup should just be:

      global/
          cpu1
          cpu2
          ...

  The rest could be done in a very machine- and load-specific way in
  user space. This way the kernel scheduler wouldn't need to struggle
  to keep up with the characteristics of new machines as they appear
  on the radar.

- I sometimes want to create/destroy isolated partitions at a high
  rate (through a batch scheduler), and a reasonable API enables me to
  keep the domains consistent at any time.

- Flexibility of isolated partitions is a bare necessity. If you
  simply divide your system into an interactive and a batch partition,
  you'd certainly want to decrease the size of the interactive
  partition during the night without rebooting the machine...

Regards,
Erich