From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C7990C433DB for ; Fri, 26 Mar 2021 15:36:20 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 2CE6861A24 for ; Fri, 26 Mar 2021 15:36:20 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2CE6861A24 Authentication-Results: mail.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=suse.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 52EE56B006C; Fri, 26 Mar 2021 11:36:19 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 505B46B006E; Fri, 26 Mar 2021 11:36:19 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3CDEB6B0070; Fri, 26 Mar 2021 11:36:19 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0167.hostedemail.com [216.40.44.167]) by kanga.kvack.org (Postfix) with ESMTP id 22EB16B006C for ; Fri, 26 Mar 2021 11:36:19 -0400 (EDT) Received: from smtpin08.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id DE5B58249980 for ; Fri, 26 Mar 2021 15:36:18 +0000 (UTC) X-FDA: 77962426836.08.7386730 Received: from mx2.suse.de (mx2.suse.de [195.135.220.15]) by imf21.hostedemail.com (Postfix) with ESMTP id 75CF9E005F35 for ; Fri, 26 Mar 2021 15:36:13 +0000 (UTC) X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1616772962; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=F5KU6VylMRMyZvTyqPo/BHGYNwWheIqROBxQ4/DH4+8=; b=Y3ouveCx9Q+ASuaEpDN67hcpfdPanfqzzoghiKgdifFuk0NGyc7ynIPTnhYmNm5H2ouZyF QqaTv6fuxtmBOkrziGCtakHrFA2Z405U45W9qp38NhOVs3CL+s2ts7G/DvU9OldkwxnYGm j9OVL85tEnC4zxCm5yEtQbPGMyOzy0I= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 20121AC6A; Fri, 26 Mar 2021 15:36:02 +0000 (UTC) Date: Fri, 26 Mar 2021 16:36:01 +0100 From: Michal Hocko To: Aaron Tomlin Cc: linux-mm@kvack.org, akpm@linux-foundation.org, linux-kernel@vger.kernel.org, Vlastimil Babka Subject: Re: [PATCH] mm/page_alloc: try oom if reclaim is unable to make forward progress Message-ID: References: <20210315165837.789593-1-atomlin@redhat.com> <20210319172901.cror2u53b7caws3a@ava.usersys.com> <20210325210159.r565fvfitoqeuykp@ava.usersys.com> <20210326112254.jy5jkiwtgj3pqkt2@ava.usersys.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20210326112254.jy5jkiwtgj3pqkt2@ava.usersys.com> X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 75CF9E005F35 X-Stat-Signature: aa5tuo4nmy4bwdti749fgioujcg4htpy Received-SPF: none (suse.com>: No applicable sender policy available) receiver=imf21; identity=mailfrom; envelope-from=""; helo=mx2.suse.de; client-ip=195.135.220.15 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1616772973-204842 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Fri 26-03-21 11:22:54, Aaron Tomlin wrote: [...] > > Both reclaim and compaction maintain their own retries counters as they > > are targeting a different operation. Although the compaction really > > depends on the reclaim to do some progress. > > Yes. Looking at should_compact_retry() if the last known compaction result > was skipped i.e. suggesting there was not enough order-0 pages to support > compaction, so assistance is needed from reclaim > (see __compaction_suitable()). > > I noticed that the value of compaction_retries, compact_result and > compact_priority was 0, COMPACT_SKIPPED and 1 i.e. COMPACT_PRIO_SYNC_LIGHT, > respectively. > > > OK, this sound unexpected as it indicates that the reclaim is able to > > make a forward progress but compaction doesn't want to give up and keeps > > retrying. Are you able to reproduce this or could you find out which > > specific condition keeps compaction retrying? I would expect that it is > > one of the 3 conditions before the max_retries is checked. > > Unfortunately, I have been told it is not entirely reproducible. > I suspect it is the following in should_compact_retry() - as I indicated > above the last known value stored in compaction_retries was 0: > > > if (order > PAGE_ALLOC_COSTLY_ORDER) > max_retries /= 4; > if (*compaction_retries <= max_retries) { > ret = true; > goto out; > } OK, I kinda expected this would be not easily reproducible. The reason I dislike your patch is that it addes yet another criterion for oom while we already do have 2 which doesn't make the resulting code easier to reason about. We should be focusing on the compaction retry logic and see whether we can have some "run away" scenarios there. Seeing so many retries without compaction bailing out sounds like a bug in that retry logic. Vlastimil is much more familiar with that. -- Michal Hocko SUSE Labs