From: Lauro Venancio
Reply-To: lvenanci@redhat.com
To: Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, lwang@redhat.com, riel@redhat.com,
    Mike Galbraith, Thomas Gleixner, Ingo Molnar
Subject: Re: [RFC 2/3] sched/topology: fix sched groups on NUMA machines with mesh topology
Date: Mon, 17 Apr 2017 11:40:59 -0300
Organization: Red Hat
In-Reply-To: <20170414165857.7n75lxk4usfsbjaq@hirez.programming.kicks-ass.net>
References: <1492091769-19879-1-git-send-email-lvenanci@redhat.com>
    <1492091769-19879-3-git-send-email-lvenanci@redhat.com>
    <20170414113813.vktcpsrsuu2st2fm@hirez.programming.kicks-ass.net>
    <20170414165857.7n75lxk4usfsbjaq@hirez.programming.kicks-ass.net>

On 04/14/2017 01:58 PM, Peter Zijlstra wrote:
> On Fri, Apr 14, 2017 at 01:38:13PM +0200, Peter Zijlstra wrote:
>> On Thu, Apr 13, 2017 at 10:56:08AM -0300, Lauro Ramos Venancio wrote:
>>> This patch constructs the sched groups from each CPU's perspective. So, on
>>> a 4-node machine with ring topology, while nodes 0 and 2 keep the same
>>> groups as before [(3, 0, 1)(1, 2, 3)], nodes 1 and 3 get new groups
>>> [(0, 1, 2)(2, 3, 0)]. This allows moving tasks between any nodes that are
>>> 2 hops apart.
>>
>> Ah,.. so after drawing pictures I see what went wrong; duh :-(
>>
>> An equivalent patch would be (if for_each_cpu_wrap() were exposed):
>>
>> @@ -521,11 +588,11 @@ build_overlap_sched_groups(struct sched_domain *sd, int cpu)
>>  	struct cpumask *covered = sched_domains_tmpmask;
>>  	struct sd_data *sdd = sd->private;
>>  	struct sched_domain *sibling;
>> -	int i;
>> +	int i, wrap;
>>
>>  	cpumask_clear(covered);
>>
>> -	for_each_cpu(i, span) {
>> +	for_each_cpu_wrap(i, span, cpu, wrap) {
>>  		struct cpumask *sg_span;
>>
>>  		if (cpumask_test_cpu(i, covered))
>>
>> We need to start iterating at @cpu, not start at 0 every time.
>>
> OK, please have a look here:
>
>   https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/log/?h=sched/core

Looks good, but please hold these patches until patch 3 is applied.
Without it, the sched_group_capacity (sg->sgc) instance is not selected
correctly and we get a significant performance regression on all NUMA
machines.

I will continue this discussion in the other thread.
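
For illustration, here is a minimal stand-alone sketch (my own model, not
the kernel code) of the covering loop in build_overlap_sched_groups(),
assuming one CPU per node on a 4-node ring; sibling_span() and
build_groups() are hypothetical names introduced only for this example:

/*
 * Models how walking the span from @cpu (as for_each_cpu_wrap() does)
 * instead of always from 0 changes the groups on a 4-node ring.
 * A node's "sibling span" is its 1-hop neighbourhood; a group is added
 * whenever the walk reaches a node that is not yet covered.
 */
#include <stdio.h>

#define NODES 4

/* 1-hop neighbourhood of @node on the ring: {node - 1, node, node + 1} */
static unsigned int sibling_span(int node)
{
	unsigned int mask = 0;

	mask |= 1u << ((node + NODES - 1) % NODES);
	mask |= 1u << node;
	mask |= 1u << ((node + 1) % NODES);
	return mask;
}

static void build_groups(int cpu)
{
	unsigned int covered = 0;
	int k;

	printf("node %d groups:", cpu);
	/* walk the span starting at @cpu, wrapping around */
	for (k = 0; k < NODES; k++) {
		int i = (cpu + k) % NODES;

		if (covered & (1u << i))
			continue;
		covered |= sibling_span(i);
		printf(" (%d, %d, %d)", (i + NODES - 1) % NODES, i,
		       (i + 1) % NODES);
	}
	printf("\n");
}

int main(void)
{
	int cpu;

	for (cpu = 0; cpu < NODES; cpu++)
		build_groups(cpu);
	return 0;
}

Running it prints [(3, 0, 1)(1, 2, 3)] for nodes 0 and 2 and
[(0, 1, 2)(2, 3, 0)] for nodes 1 and 3, matching the description quoted
above; starting every walk at node 0 would give all four nodes the first
pair of groups.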