Date: Mon, 19 Sep 2022 15:28:44 +0200
From: Michal Hocko
To: Zhenhua Huang
Cc: akpm@linux-foundation.org, mgorman@techsingularity.net, vbabka@suse.cz, linux-mm@kvack.org, quic_tingweiz@quicinc.com
Subject: Re: [RESEND PATCH] mm:page_alloc.c: lower the order requirement of should_reclaim_retry
References: <1663556455-30188-1-git-send-email-quic_zhenhuah@quicinc.com>

On Mon 19-09-22 19:24:32, Zhenhua Huang wrote:
> Thanks Michal for comments!
>
> On 2022/9/19 16:14, Michal Hocko wrote:
> > On Mon 19-09-22 11:00:55, Zhenhua Huang wrote:
> > > When a driver was continuously allocating order 3
> > > pages, it would be very easily OOM even there were lots of reclaimable
> > > pages. A test module is used to reproduce this issue,
> > > several key ftrace events are as below:
> > >
> > > insmod-6968 [005] .... 321.306007: reclaim_retry_zone: node=0 zone=Normal
> > > order=3 reclaimable=539988 available=592856 min_wmark=21227 no_progress_loops=0
> > > wmark_check=0
> > > insmod-6968 [005] .... 321.306009: compact_retry: order=3
> > > priority=COMPACT_PRIO_SYNC_LIGHT compaction_result=withdrawn retries=0
> > > max_retries=16 should_retry=1
> > > insmod-6968 [004] .... 321.308220:
> > > mm_compaction_try_to_compact_pages: order=3 gfp_mask=GFP_KERNEL priority=0
> > > insmod-6968 [004] .... 321.308964: mm_compaction_end:
> > > zone_start=0x80000 migrate_pfn=0xaa800 free_pfn=0x80800 zone_end=0x940000,
> > > mode=sync status=complete
> > > insmod-6968 [004] .... 321.308971: reclaim_retry_zone: node=0
> > > zone=Normal order=3 reclaimable=539830 available=592776 min_wmark=21227
> > > no_progress_loops=0 wmark_check=0
> > > insmod-6968 [004] .... 321.308973: compact_retry: order=3
> > > priority=COMPACT_PRIO_SYNC_FULL compaction_result=failed retries=0
> > > max_retries=16 should_retry=0
> > >
> > > There're ~2GB reclaimable pages(reclaimable=539988) but VM decides not to
> > > reclaim any more:
> > > insmod-6968 [005] .... 321.306007: reclaim_retry_zone: node=0 zone=Normal
> > > order=3 reclaimable=539988 available=592856 min_wmark=21227 no_progress_loops=0
> > > wmark_check=0
> > >
> > > From meminfo when oom, there was NO qualified order >= 3 pages(CMA page not qualified)
> > > can meet should_reclaim_retry's requirement:
> > > Normal : 24671*4kB (UMEC) 13807*8kB (UMEC) 8214*16kB (UEC) 190*32kB (C)
> > > 94*64kB (C) 28*128kB (C) 16*256kB (C) 7*512kB (C) 5*1024kB (C) 7*2048kB (C)
> > > 46*4096kB (C) = 571796kB
> > >
> > > The reason of should_reclaim_retry early aborting was that is based on having the order
> > > pages in its free_list. For order 3 pages, that's easily fragmented. Considering enough free
> > > pages are the fundamental of compaction. It may not be suitable to stop reclaiming
> > > when lots of page cache there. Relax order by one to fix this issue.
> >
> > For the higher order request we rely on should_compact_retry which backs
> > on based on the compaction feedback. I would recommend looking why the
> > compaction fails.
> I think the reason of compaction failure is there're not enough free pages.
> Like in ftrace events showed, free pages(which include CMA) was only 592856
> - 539988 = 52868 pages(reclaimable=539988 available=592856).
>
> There are some restrictions like suitable_migration_target() for free pages
> and suitable_migration_source() for movable pages. Hence eligible targets is
> fewer.

If the compaction decides the retry is not worth it then either it is
making a wrong call or it doesn't make sense to retry.

> > Also this patch doesn't really explain why it should work and honestly
> > it doesn't really make much sense to me either.
> Sorry, my fault. IMO, The reason it should work is, say for this case of
> order 3 allocation: we can perform direct reclaim more times as we have only
> order 2 pages(which *lowered* by this change) in free_list(8214*16kB (UEC)).
> The order requirement which I have lowered is should_reclaim_retry ->
> __zone_watermark_ok:
>         for (o = order; o < MAX_ORDER; o++) {
>                 struct free_area *area = &z->free_area[o];
> ...
>                 for (mt = 0; mt < MIGRATE_PCPTYPES; mt++) {
>                         if (!free_area_empty(area, mt))
>                                 return true;
>                 }
>
> Order 2 pages can be more easily met, hence VM has more chance to return
> true from should_reclaim_retry.

This is a wrong approach to the problem because there is no real
guarantee the reclaim round will do anything useful. You should really
be looking at the compaction side of things.
-- 
Michal Hocko
SUSE Labs
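
For anyone following the thread, the loop quoted above can be modelled in
userspace to see why the check fails at order 3 and passes at order 2 for the
reported free lists. This is only a sketch under stated assumptions: every
sketch_* name is hypothetical, the migratetype enum is simplified, and the CMA
and high-atomic branches of the real __zone_watermark_ok() are left out (the
report says CMA pages do not qualify for this allocation). The free-list
contents are transcribed from the meminfo line in the report, where U, M, E
and C mark unmovable, movable, reclaimable and CMA blocks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for MAX_ORDER: orders 0..10 (4kB..4096kB). */
#define SKETCH_MAX_ORDER 11

/* Simplified migratetypes; not the kernel's real enum layout. */
enum sketch_migratetype {
    SKETCH_UNMOVABLE,
    SKETCH_MOVABLE,
    SKETCH_RECLAIMABLE,
    SKETCH_PCPTYPES,                 /* types below this one count in the check */
    SKETCH_CMA = SKETCH_PCPTYPES,    /* assumed not to qualify, as in the report */
    SKETCH_NR_TYPES
};

/* One flag per migratetype: does this order have any free block of it? */
struct sketch_free_area {
    bool nonempty[SKETCH_NR_TYPES];
};

/* Fill the table from the "Normal :" meminfo line quoted in the report. */
void sketch_fill_from_report(struct sketch_free_area *areas)
{
    assert(areas);
    for (unsigned int o = 0; o < SKETCH_MAX_ORDER; o++) {
        for (int mt = 0; mt < SKETCH_NR_TYPES; mt++)
            areas[o].nonempty[mt] = false;
        areas[o].nonempty[SKETCH_CMA] = true;    /* every order lists (C) */
    }
    /* 4kB and 8kB are (UMEC); 16kB is (UEC); 32kB and above are (C) only. */
    for (unsigned int o = 0; o <= 1; o++) {
        areas[o].nonempty[SKETCH_UNMOVABLE] = true;
        areas[o].nonempty[SKETCH_MOVABLE] = true;
        areas[o].nonempty[SKETCH_RECLAIMABLE] = true;
    }
    areas[2].nonempty[SKETCH_UNMOVABLE] = true;
    areas[2].nonempty[SKETCH_RECLAIMABLE] = true;
}

/* Model of the quoted loop: any qualifying free block at or above 'order'? */
bool sketch_has_free_block(const struct sketch_free_area *areas,
                           unsigned int order)
{
    assert(areas);
    for (unsigned int o = order; o < SKETCH_MAX_ORDER; o++) {
        for (int mt = 0; mt < SKETCH_PCPTYPES; mt++) {
            if (areas[o].nonempty[mt])
                return true;
        }
    }
    return false;
}
```

With the reported lists, sketch_has_free_block(areas, 3) is false and
sketch_has_free_block(areas, 2) is true, which is the entire effect of
lowering the order by one. The model says nothing about whether another
reclaim round would actually yield a usable order-3 page, which is the
objection raised above.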