From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.0 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8D8D8C433E1 for ; Tue, 14 Jul 2020 15:45:59 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4B7312236F for ; Tue, 14 Jul 2020 15:45:59 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=cmpxchg-org.20150623.gappssmtp.com header.i=@cmpxchg-org.20150623.gappssmtp.com header.b="TeT4WtuD" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4B7312236F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=cmpxchg.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id DF1AA6B000C; Tue, 14 Jul 2020 11:45:58 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D7A228D0001; Tue, 14 Jul 2020 11:45:58 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C42416B000E; Tue, 14 Jul 2020 11:45:58 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0249.hostedemail.com [216.40.44.249]) by kanga.kvack.org (Postfix) with ESMTP id AB59A6B000C for ; Tue, 14 Jul 2020 11:45:58 -0400 (EDT) Received: from smtpin11.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 6B59E1EE6 for ; Tue, 14 Jul 2020 15:45:58 +0000 (UTC) X-FDA: 77037107196.11.book87_3f09a5c26ef2 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin11.hostedemail.com (Postfix) with ESMTP id A9728180F8B9A for ; Tue, 14 Jul 2020 15:45:55 +0000 (UTC) X-HE-Tag: book87_3f09a5c26ef2 X-Filterd-Recvd-Size: 5608 Received: from mail-qk1-f193.google.com (mail-qk1-f193.google.com [209.85.222.193]) by imf01.hostedemail.com (Postfix) with ESMTP for ; Tue, 14 Jul 2020 15:45:54 +0000 (UTC) Received: by mail-qk1-f193.google.com with SMTP id e11so16014876qkm.3 for ; Tue, 14 Jul 2020 08:45:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cmpxchg-org.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=7U8JsZvgXHH3n8xgxoCLEbRbt6AY8Z+mxAelisJgRw8=; b=TeT4WtuDoSJlYSAQH2bV1YMeSx1E6X2BFU8rj0/8YyGh0bErT86/GtYKHAowuaOF6E i5sZLU99YAKqeQ4qr4Qp40qynq4yNSyzg1yH0BPltc5IY1ZLrkVQwxI8UTVm3ozseEdy 5ZkeWdsxPMWs2cvuTkyXFRGoW6rRjLvtArEwKHjtaJEZrrRGsCnu7J5y946a97v7K8xc O8xhwzvcMs0lb/N+ymrisTzO4wstCoEc86JTj/YVRRwGUOkVXxACR80FW58sUvUxstgn J4rgoTKUsqwoXilNN1i9HA6gtgwsAfe+lhDNrffl8KLuAP0+jQXa9OkWVGhEGjbFqp8/ 6C6g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=7U8JsZvgXHH3n8xgxoCLEbRbt6AY8Z+mxAelisJgRw8=; b=VA7W/VTItrdVjYSLEqjJPQohcHmL2PYdOXXmp6TVEyBdrM2VXciFex9DsFmIWUeBss Gm1xhC8IphXEEy+sFFjoujTbJZjYUjw5k09yvJh+CB32JMw76k0uUGdEkBdNty2YXkkJ Nf7B3goiwO0VGKv75TG6m7ymeHHn7y4G8wntd76A8QCuLAAUnjpUcrfQH8cYYHeOiZDt yLfZHgingFRoL2NezwNmvcD4Pqxxc1yixMdr3tvldVRRJMlIwN/rGrpcLfdoCqXPACPU /9qQS3f3wkbcg4gG7/BDx/9GlDzuQ2JrEq8YcphzIGh5keXus3K4nQEPUjQi461iSdUg Mzfw== X-Gm-Message-State: AOAM530u2lAE51Dl5kagUNsu9e/XiU5hsynEka9I8wPUsK1r6AHy5ItF 0qf4LQ3a/r9/6y2lNP3/Pb4g7g== X-Google-Smtp-Source: ABdhPJwDgYPEzL6yjkDXQkBKQCvfcAKX1idi2WJWmoKy4Y+Sz7jQrn0lIt5Ndq1K+62+h1vAQ0vstA== X-Received: by 2002:a37:689:: with SMTP id 131mr4805114qkg.468.1594741554235; Tue, 14 Jul 2020 08:45:54 -0700 (PDT) Received: from localhost ([2620:10d:c091:480::1:81a8]) by smtp.gmail.com with ESMTPSA id r6sm23081172qtt.81.2020.07.14.08.45.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 14 Jul 2020 08:45:53 -0700 (PDT) Date: Tue, 14 Jul 2020 11:45:04 -0400 From: Johannes Weiner To: Chris Down Cc: Andrew Morton , Michal Hocko , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com Subject: Re: [PATCH v2 1/2] mm, memcg: reclaim more aggressively before high allocator throttling Message-ID: <20200714154504.GB215857@cmpxchg.org> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Rspamd-Queue-Id: A9728180F8B9A X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam03 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Mon, Jul 13, 2020 at 12:42:35PM +0100, Chris Down wrote: > In Facebook production, we've seen cases where cgroups have been put > into allocator throttling even when they appear to have a lot of slack > file caches which should be trivially reclaimable. > > Looking more closely, the problem is that we only try a single cgroup > reclaim walk for each return to usermode before calculating whether or > not we should throttle. This single attempt doesn't produce enough > pressure to shrink for cgroups with a rapidly growing amount of file > caches prior to entering allocator throttling. > > As an example, we see that threads in an affected cgroup are stuck in > allocator throttling: > > # for i in $(cat cgroup.threads); do > > grep over_high "/proc/$i/stack" > > done > [<0>] mem_cgroup_handle_over_high+0x10b/0x150 > [<0>] mem_cgroup_handle_over_high+0x10b/0x150 > [<0>] mem_cgroup_handle_over_high+0x10b/0x150 > > ...however, there is no I/O pressure reported by PSI, despite a lot of > slack file pages: > > # cat memory.pressure > some avg10=78.50 avg60=84.99 avg300=84.53 total=5702440903 > full avg10=78.50 avg60=84.99 avg300=84.53 total=5702116959 > # cat io.pressure > some avg10=0.00 avg60=0.00 avg300=0.00 total=78051391 > full avg10=0.00 avg60=0.00 avg300=0.00 total=78049640 > # grep _file memory.stat > inactive_file 1370939392 > active_file 661635072 > > This patch changes the behaviour to retry reclaim either until the > current task goes below the 10ms grace period, or we are making no > reclaim progress at all. In the latter case, we enter reclaim throttling > as before. > > To a user, there's no intuitive reason for the reclaim behaviour to > differ from hitting memory.high as part of a new allocation, as opposed > to hitting memory.high because someone lowered its value. As such this > also brings an added benefit: it unifies the reclaim behaviour between > the two. > > There's precedent for this behaviour: we already do reclaim retries when > writing to memory.{high,max}, in max reclaim, and in the page allocator > itself. > > Signed-off-by: Chris Down > Cc: Andrew Morton > Cc: Johannes Weiner > Cc: Tejun Heo > Cc: Michal Hocko Acked-by: Johannes Weiner