Message-ID: <555356E8.5000307@redhat.com>
Date: Wed, 13 May 2015 09:51:36 -0400
From: Rik van Riel
To: Peter Zijlstra
CC: dedekind1@gmail.com, linux-kernel@vger.kernel.org, mgorman@suse.de, jhladky@redhat.com
Subject: Re: [PATCH] numa,sched: only consider less busy nodes as numa balancing destination
In-Reply-To: <20150513062906.GJ3007@worktop.Skamania.guest>

On 05/13/2015 02:29 AM, Peter Zijlstra wrote:
> On Tue, May 12, 2015 at 11:45:09AM -0400, Rik van Riel wrote:
>> I have a few poorly formed ideas on what could be done about that:
>>
>> 1) have fbq_classify_rq take the current task on the rq into account,
>>    and adjust the fbq classification if all the runnable-but-queued
>>    tasks are on the right node
>
> So while looking at this I came up with the below; it treats anything
> inside ->active_nodes as a preferred node for balancing purposes.
>
> Would that make sense?

Not necessarily.

If there are two workloads on a multi-threaded system, and they
have not yet converged on one node each, both nodes will be part
of ->active_nodes.

Treating them as preferred nodes means the load balancing code
would do nothing at all to help the workloads converge.

> I'll see what I can do about current in the runqueue type
> classification.

This can probably be racy, so just checking a value in the
current task struct for the runqueue should be ok. I am not
aware of any architecture where the task struct address can
become invalid. Worst thing that could happen is that the
bits examined change value.

>> 2) ensure that rq->nr_numa_running and rq->nr_preferred_running also
>>    get incremented for kernel threads that are bound to a particular
>>    CPU - currently CPU-bound kernel threads will cause the NUMA
>>    statistics to look like a CPU has tasks that do not belong on that
>>    NUMA node
>
> I'm thinking accounting those to nr_pinned, lemme see how that works
> out.

Cool.

-- 
All rights reversed
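
For illustration only, idea 1 from the quoted text can be modelled in userspace roughly as follows. This is a sketch of one possible reading, not the patch under discussion: `struct rq_model`, its `curr_*` flags, and `classify_queued()` are hypothetical names invented here. Only the counter names `nr_running`, `nr_numa_running` and `nr_preferred_running` and the function name `fbq_classify_rq` come from the thread. The counter-only `classify()` is my assumption of how the classification compares those counters.

```c
#include <assert.h>
#include <stdbool.h>

/* Three classes: queue holds non-NUMA tasks, misplaced NUMA tasks, or neither. */
enum fbq_type { regular, remote, all };

/* Hypothetical, simplified stand-in for a runqueue; not the kernel's struct rq. */
struct rq_model {
	unsigned int nr_running;           /* runnable tasks, including current */
	unsigned int nr_numa_running;      /* tasks that have a preferred node */
	unsigned int nr_preferred_running; /* tasks running on their preferred node */
	bool curr_is_numa;                 /* assumption: current has a preferred node */
	bool curr_on_preferred;            /* assumption: current is on that node */
};

/* Classification from the counters alone (assumed comparison order). */
static enum fbq_type classify(const struct rq_model *rq)
{
	if (rq->nr_running > rq->nr_numa_running)
		return regular;
	if (rq->nr_running > rq->nr_preferred_running)
		return remote;
	return all;
}

/*
 * Hypothetical variant: subtract the current task's contribution so that
 * only the runnable-but-queued tasks decide the class. The flags are a
 * snapshot; as noted in the mail, a racy read can at worst yield a stale
 * value, which the balancer has to tolerate anyway.
 */
static enum fbq_type classify_queued(const struct rq_model *rq)
{
	struct rq_model q = *rq;

	/* A task cannot be on its preferred node without having one. */
	assert(!q.curr_on_preferred || q.curr_is_numa);

	if (q.nr_running == 0)
		return classify(&q);

	q.nr_running--;
	if (q.curr_is_numa && q.nr_numa_running > 0)
		q.nr_numa_running--;
	if (q.curr_on_preferred && q.nr_preferred_running > 0)
		q.nr_preferred_running--;

	return classify(&q);
}
```

With a CPU-bound kernel thread as current and two queued tasks that both sit on their preferred node, the counters alone say "regular", while the queued-only view says "all". That is the situation idea 2 describes from the accounting side.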