From: Michal Hocko
To: Johannes Weiner
Cc: Andrew Morton, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Date: Wed, 14 Aug 2019 10:11:55 +0200
Subject: Re: [PATCH] mm: vmscan: do not share cgroup iteration between reclaimers
Message-ID: <20190814081155.GQ17933@dhcp22.suse.cz>
In-Reply-To: <20190813171237.GA21743@cmpxchg.org>
References: <20190812192316.13615-1-hannes@cmpxchg.org> <20190813132938.GJ17933@dhcp22.suse.cz> <20190813171237.GA21743@cmpxchg.org>

On Tue 13-08-19 13:12:37, Johannes Weiner wrote:
> On Tue, Aug 13, 2019 at 03:29:38PM +0200, Michal Hocko wrote:
> > On Mon 12-08-19 15:23:16, Johannes Weiner wrote:
[...]
> > > This change completely eliminates the OOM kills on our service, while
> > > showing no signs of overreclaim - no increased scan rates, %sys time,
> > > or abrupt free memory spikes. I tested across 100 machines that have
> > > 64G of RAM and host about 300 cgroups each.
> >
> > What is the usual direct reclaim involvement on those machines?
>
> 80-200 kb/s. In general we try to keep this low to non-existent on our
> hosts due to the latency implications. So it's fair to say that kswapd
> does page reclaim, and direct reclaim is a sign of overload.

Well, there are workloads which are much heavier on direct reclaim.
How much they rely on large memcg trees remains to be seen. Your
changelog should state that the above workload is very light on direct
reclaim, though, because the above paragraph suggests that the risk of
longer stalls is really a non-issue, while I think this is not really
all that clear.
-- 
Michal Hocko
SUSE Labs