From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S264796AbTE1Qqw (ORCPT ); Wed, 28 May 2003 12:46:52 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S264799AbTE1Qqw (ORCPT ); Wed, 28 May 2003 12:46:52 -0400 Received: from ophelia.ess.nec.de ([193.141.139.8]:28321 "EHLO ophelia.hpce.nec.com") by vger.kernel.org with ESMTP id S264796AbTE1Qqv (ORCPT ); Wed, 28 May 2003 12:46:51 -0400 From: Erich Focht To: "Martin J. Bligh" Subject: Re: [Lse-tech] Node affine NUMA scheduler extension Date: Wed, 28 May 2003 19:02:35 +0200 User-Agent: KMail/1.5.1 Cc: Andi Kleen , LSE , linux-kernel References: <200305271031.55554.efocht@hpce.nec.com> <200305272328.27269.efocht@hpce.nec.com> <2640000.1054072312@[10.10.2.4]> In-Reply-To: <2640000.1054072312@[10.10.2.4]> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200305281902.35511.efocht@hpce.nec.com> Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org On Tuesday 27 May 2003 23:51, Martin J. Bligh wrote: > > Do you think of something like: > > > ># define CAN_MIGRATE_TASK(p,rq,this_cpu) \ > > (HOMENODE_UNSET(p) && \ //<-- > > (jiffies - (p)->last_run > cache_decay_ticks) && \ > > !task_running(rq, p) && \ > > ((p)->cpus_allowed & (1UL << (this_cpu)))) > > ... > > My instinct would tell me the first expression should be ||, not && > but I'm not 100% sure. You're right. > And is this restricted to just clones? Doesn't > seem to be, unless that's implicit in homenode_unset? Hmmm... That's actually the difficult issue with fork vs. clone and migrating or not. When you clone a small task, you'd usually like to keep it on the same node. If it's a big task and fault-in a lot of memory, it could make sense to let it migrate to another node. For forked tasks it's the same, you mighthave some more COW memory movement but after that you'll usually be happy if the child went away from the parent's node if it is a long-running child. Not so for a short running child. The current compromise: - start children (no matter whether forked or cloned) on the same CPU - allow other CPUs to steal freshly forked children by reducing the cache affinity until the first steal/migration - ok with chidren who exec soon, they might exec before anyone can steal them, drop all the pages and find anew life on the optimal node This way exec'd tasks get properly baklanced, forked/cloned ones get a better start than in the current scheduler. I'll post a patch on top of the node affine scheduler doing this in 1-2 days. Regards, Erich