Message-ID: <1431342675.1418.148.camel@sauron.fi.intel.com>
Subject: Re: [PATCH] numa,sched: only consider less busy nodes as numa balancing destination
From: Artem Bityutskiy
Reply-To: dedekind1@gmail.com
To: Rik van Riel
Cc: linux-kernel@vger.kernel.org, mgorman@suse.de, peterz@infradead.org, jhladky@redhat.com
Date: Mon, 11 May 2015 14:11:15 +0300
In-Reply-To: <554D1681.7040902@redhat.com>
References: <1430908530.7444.145.camel@sauron.fi.intel.com> <20150506114128.0c846a37@cuia.bos.redhat.com> <1431090801.1418.87.camel@sauron.fi.intel.com> <554D1681.7040902@redhat.com>

On Fri, 2015-05-08 at 16:03 -0400, Rik van Riel wrote:
> This works well when dealing with tasks that are constantly
> running, but fails catastrophically when dealing with tasks
> that go to sleep, wake back up, go back to sleep, wake back
> up, and generally mess up, in a random way, the load statistics
> that the NUMA balancing code uses.

Sleeping, I believe, is what happens a lot in this workload: the
processes do a lot of network I/O, file I/O, and IPC.

Would you please expand on this a bit more: why would this scenario
"mess up the load statistics"?

> If the normal scheduler load balancer is moving tasks the
> opposite way from the NUMA balancer, things will not converge,
> and tasks will have worse memory locality than not doing NUMA
> balancing at all.

Are the regular and NUMA balancers independent? Are there mechanisms
for detecting ping-pong situations? I would like to verify your
theory, and such mechanisms would be helpful.

> Currently the load balancer has a preference for moving
> tasks to their preferred nodes (NUMA_FAVOUR_HIGHER, true),
> but there is no resistance to moving tasks away from their
> preferred nodes (NUMA_RESIST_LOWER, false). That setting
> was arrived at after a fair amount of experimenting, and
> is probably correct.

I guess I can try setting NUMA_RESIST_LOWER to true and see what
happens. But first I probably need to confirm that your theory (the
balancers playing ping-pong) is correct; any hints on how I would do
that?

Thanks!

Artem.
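
P.S. For reference, a rough sketch of how I understand these knobs: the
two feature bits live in kernel/sched/features.h (exact comments and
surrounding context elided here), and I am assuming they can be flipped
at run time through the standard sched_features debugfs file rather
than by rebuilding the kernel:

    /* kernel/sched/features.h (sketch of the NUMA balancing feature bits) */
    #ifdef CONFIG_NUMA_BALANCING
    /* Prefer moving tasks towards their preferred NUMA node. */
    SCHED_FEAT(NUMA_FAVOUR_HIGHER, true)
    /* Resist moving tasks away from their preferred node (off by default). */
    SCHED_FEAT(NUMA_RESIST_LOWER, false)
    #endif

    /*
     * Assumed runtime toggle (with debugfs mounted at /sys/kernel/debug):
     *
     *   echo NUMA_RESIST_LOWER    > /sys/kernel/debug/sched_features   # enable
     *   echo NO_NUMA_RESIST_LOWER > /sys/kernel/debug/sched_features   # disable
     */

If that is the right way to experiment with it, I can run the workload
with NUMA_RESIST_LOWER enabled and compare.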