Date: Wed, 25 Nov 2020 14:01:56 -0500
From: Andrea Arcangeli
To: Vlastimil Babka
Cc: David Hildenbrand, Mel Gorman, Andrew Morton, linux-mm@kvack.org,
    Qian Cai, Michal Hocko, linux-kernel@vger.kernel.org,
    Mike Rapoport, Baoquan He
Subject: Re: [PATCH 1/1] mm: compaction: avoid fast_isolate_around() to set
    pageblock_skip on reserved pages
References: <8C537EB7-85EE-4DCF-943E-3CC0ED0DF56D@lca.pw>
    <20201121194506.13464-1-aarcange@redhat.com>
    <20201121194506.13464-2-aarcange@redhat.com>
    <1c4c405b-52e0-cf6b-1f82-91a0a1e3dd53@suse.cz>
In-Reply-To: <1c4c405b-52e0-cf6b-1f82-91a0a1e3dd53@suse.cz>

On Wed, Nov 25, 2020 at 01:08:54PM +0100, Vlastimil Babka wrote:
> Yeah I guess it would be simpler if zoneid/nid was correct for
> pfn_valid() pfns within a zone's range, even if they are reserved due
> not not being really usable memory.
>
> I don't think we want to introduce CONFIG_HOLES_IN_ZONE to x86. If the
> chosen solution is to make this to a real hole, the hole should be
> extended to MAX_ORDER_NR_PAGES aligned boundaries.

The way pfn_valid() works, it's not possible to render all non-RAM pfns
as !pfn_valid; CONFIG_HOLES_IN_ZONE would not achieve it 100% either. So
I don't think we can rely on that to eliminate all non-RAM reserved
pages from the mem_map and avoid having to initialize them in the first
place. Some could remain, as in this case, since in the same pageblock
there's non-RAM followed by RAM and all pfns are valid.

> In any case, compaction code can't fix this with better range checks.

David's correct that it can, by adding enough PageReserved checks (I'm
running all systems reproducing this with plenty of PageReserved checks
in all places to work around it until we do a proper fix).

My problem with that is that 1) it's simply not enforceable at runtime
that no PageReserved check is missing, and 2) what benefit would it
provide to leave a wrong zoneid in reserved pages and have to add extra
PageReserved checks?
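To make points 1) and 2) concrete, here is a minimal user-space sketch. It is not kernel code: every toy_* name, the 8-page pageblock, and the choice of three reserved pages are assumptions made up for illustration. It models one pageblock where non-RAM reserved pages are followed by RAM, and counts how many pages a walker would attribute to the wrong zone, with and without a PageReserved-style check.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy pageblock size (assumption); real kernel pageblocks are much larger. */
#define TOY_PAGEBLOCK_NR_PAGES 8

enum toy_zone_id { TOY_ZONE_DMA = 0, TOY_ZONE_NORMAL = 1 };

/* Hypothetical stand-in for struct page: only what this sketch needs. */
struct toy_page {
	enum toy_zone_id zone;	/* zone id stored in the page */
	bool reserved;		/* models PageReserved() */
};

/* Hypothetical stand-in for a zone owning the pfn range [start_pfn, end_pfn). */
struct toy_zone {
	enum toy_zone_id id;
	unsigned long start_pfn;
	unsigned long end_pfn;
};

static bool toy_zone_spans_pfn(const struct toy_zone *z, unsigned long pfn)
{
	return pfn >= z->start_pfn && pfn < z->end_pfn;
}

/*
 * Fill one pageblock: the first three pages are reserved non-RAM, the
 * rest is RAM in TOY_ZONE_NORMAL. With init_reserved == false the
 * reserved pages keep zone id 0 (TOY_ZONE_DMA), modelling a struct page
 * that was never initialized.
 */
static void toy_init_block(struct toy_page *map, bool init_reserved)
{
	for (int i = 0; i < TOY_PAGEBLOCK_NR_PAGES; i++) {
		bool non_ram = i < 3;

		map[i].reserved = non_ram;
		map[i].zone = (non_ram && !init_reserved) ?
			      TOY_ZONE_DMA : TOY_ZONE_NORMAL;
	}
}

/*
 * Walk the pageblock that starts at start_pfn and count the pages whose
 * stored zone id disagrees with the zone spanning their pfn. When
 * check_reserved is true, reserved pages are skipped, which models a
 * call site that remembered its PageReserved check.
 */
static int toy_count_zone_mismatches(const struct toy_page *map,
				     const struct toy_zone *z,
				     unsigned long start_pfn,
				     bool check_reserved)
{
	int bad = 0;

	assert(toy_zone_spans_pfn(z, start_pfn));
	for (unsigned long i = 0; i < TOY_PAGEBLOCK_NR_PAGES; i++) {
		if (check_reserved && map[i].reserved)
			continue;
		if (toy_zone_spans_pfn(z, start_pfn + i) &&
		    map[i].zone != z->id)
			bad++;
	}
	return bad;
}
```

In this toy model, a block filled with toy_init_block(map, false) yields three mismatches for a walker that forgot the reserved check and zero for one that remembered it. A block filled with toy_init_block(map, true) yields zero mismatches even for the walker with no reserved check at all, which is the property argued for here.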

A struct page has a deterministic zoneid/nid. If it's pointed to by a
valid pfn (as in pfn_valid()), the simplest is that the zoneid/nid in
the page remains correct no matter whether it was reserved at boot or
was marked reserved by a driver that swaps the page somewhere else with
the GART or EFI or something else. All reserved pages should work the
same, RAM and non-RAM, since the non-RAM status can basically change at
runtime if a driver assigns the page to hw somehow.

NOTE: on the compaction side, we still need to add the
pageblock_pfn_to_page() call to validate the "highest" pfn, because the
pfn_valid() check is missing on the first pfn of the pageblock, and so
is the check for a pageblock that spans two different zones.

Thanks,
Andrea