Subject: Re: [PATCH 1/2] sched/wait: Break up long wake list walk
From: Linus Torvalds
Date: Wed, 23 Aug 2017 11:17:30 -0700
To: Tim Chen
Cc: "Liang, Kan", Mel Gorman, "Kirill A. Shutemov", Peter Zijlstra, Ingo Molnar, Andi Kleen, Andrew Morton, Johannes Weiner, Jan Kara, linux-mm, Linux Kernel Mailing List
In-Reply-To: <6e8b81de-e985-9222-29c5-594c6849c351@linux.intel.com>

On Wed, Aug 23, 2017 at 8:58 AM, Tim Chen wrote:
>
> Will you still consider the original patch as a fail safe mechanism?

I don't think we have much choice, although I would *really* want to
get this root-caused rather than just papering over the symptoms.

Maybe still worth testing that "sched/numa: Scale scan period with
tasks in group and shared/private" patch that Mel mentioned.

In fact, looking at that patch description, it does seem to match this
particular load a lot. Quoting from the commit message:

 "Running 80 tasks in the same group, or as threads of the same
  process, results in the memory getting scanned 80x as fast as it
  would be if a single task was using the memory.

  This really hurts some workloads"

So if 80 threads cause 80x as much scanning, a few thousand threads
might indeed be really really bad.

So once more unto the breach, dear friends, once more. Please.

The patch got applied to -tip as commit b5dd77c8bdad, and can be
downloaded here:

  https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=b5dd77c8bdada7b6262d0cba02a6ed525bf4e6e1

(Hmm. It says it's cc'd to me, but I never noticed that patch simply
because it was in a big group of other -tip commits.. Oh well).

             Linus
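
P.S. To make the arithmetic in that commit message concrete, here is a
quick throwaway userspace sketch. It is illustrative only and nothing
like the actual kernel code: the function names, the scaling knob and
the 1000 ms base period are all made up. It just shows that with a
fixed per-task scan period the aggregate scan rate grows linearly with
the number of threads in the group, while stretching the per-task
period by the group size keeps the aggregate roughly flat.

/*
 * Illustrative only -- not the kernel implementation.  A toy model of
 * why 80 threads scanning on the same base period generate 80x the
 * aggregate scan rate, and how stretching the per-task period by the
 * group size (the idea behind b5dd77c8bdad) keeps the aggregate flat.
 * All names and the 1000 ms base period are invented for this sketch.
 */
#include <stdio.h>

static double aggregate_scan_rate(int threads, double base_period_ms,
				  int scale_with_group)
{
	/* Hypothetical per-task period: either fixed, or stretched by
	 * the number of tasks sharing the group's memory. */
	double per_task_period_ms = scale_with_group
		? base_period_ms * threads
		: base_period_ms;

	/* Scans per second summed over all threads in the group. */
	return threads * (1000.0 / per_task_period_ms);
}

int main(void)
{
	const double base_period_ms = 1000.0;	/* assumed base period */
	const int counts[] = { 1, 80, 4096 };

	for (unsigned int i = 0; i < sizeof(counts) / sizeof(counts[0]); i++)
		printf("%5d threads: unscaled %8.1f scans/s, scaled %6.1f scans/s\n",
		       counts[i],
		       aggregate_scan_rate(counts[i], base_period_ms, 0),
		       aggregate_scan_rate(counts[i], base_period_ms, 1));

	return 0;
}

With a few thousand threads the unscaled column blows up by the same
factor, which is presumably why this load gets hit so much harder than
the 80-thread case quoted above.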