Date: Tue, 2 Feb 2021 12:51:52 +0000
From: Will Deacon
To: David Hildenbrand
Cc: Anshuman Khandual, linux-arm-kernel@lists.infradead.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org, Catalin Marinas,
    Ard Biesheuvel, Mark Rutland, James Morse, Robin Murphy,
    Jérôme Glisse, Dan Williams, Mike Rapoport
Subject: Re: [PATCH V2 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory
Message-ID: <20210202125152.GC16868@willie-the-truck>
References: <1612239114-28428-1-git-send-email-anshuman.khandual@arm.com>
    <1612239114-28428-2-git-send-email-anshuman.khandual@arm.com>
    <20210202123215.GA16868@willie-the-truck>
    <20210202123524.GB16868@willie-the-truck>

On Tue, Feb 02, 2021 at 01:39:29PM +0100, David Hildenbrand wrote:
> On 02.02.21 13:35, Will Deacon wrote:
> > On Tue, Feb 02, 2021 at 12:32:15PM +0000, Will Deacon wrote:
> > > On Tue, Feb 02, 2021 at 09:41:53AM +0530, Anshuman Khandual wrote:
> > > > pfn_valid() validates a pfn but basically it checks for a valid struct page
> > > > backing for that pfn. It should always return positive for memory ranges
> > > > backed with struct page mapping. But currently pfn_valid() fails for all
> > > > ZONE_DEVICE based memory types even though they have struct page mapping.
> > > >
> > > > pfn_valid() asserts that there is a memblock entry for a given pfn without
> > > > MEMBLOCK_NOMAP flag being set. The problem with ZONE_DEVICE based memory is
> > > > that they do not have memblock entries. Hence memblock_is_map_memory() will
> > > > invariably fail via memblock_search() for a ZONE_DEVICE based address. This
> > > > eventually fails pfn_valid() which is wrong. memblock_is_map_memory() needs
> > > > to be skipped for such memory ranges. As ZONE_DEVICE memory gets hotplugged
> > > > into the system via memremap_pages() called from a driver, their respective
> > > > memory sections will not have SECTION_IS_EARLY set.
> > > >
> > > > Normal hotplug memory will never have MEMBLOCK_NOMAP set in their memblock
> > > > regions. Because the flag MEMBLOCK_NOMAP was specifically designed and set
> > > > for firmware reserved memory regions. memblock_is_map_memory() can just be
> > > > skipped as its always going to be positive and that will be an optimization
> > > > for the normal hotplug memory. Like ZONE_DEVICE based memory, all normal
> > > > hotplugged memory too will not have SECTION_IS_EARLY set for their sections
> > > >
> > > > Skipping memblock_is_map_memory() for all non early memory sections would
> > > > fix pfn_valid() problem for ZONE_DEVICE based memory and also improve its
> > > > performance for normal hotplug memory as well.
> > >
> > > Hmm. Although I follow your logic, this does seem to rely on an awful lot of
> > > assumptions to continue to hold true as the kernel evolves. In particular,
> > > how do we ensure that early sections are always fully backed with
> >
> > Sorry, typo here: ^^^ should be *non-early* sections.
>
> It might be a good idea to have a look at generic
> include/linux/mmzone.h:pfn_valid()

The generic implementation already makes assumptions that aren't true on
arm64, so that's why we've ended up with our own implementation. But the
patches here put us in a position where I worry that pfn_valid() may return
'true' in future for cases where the underlying struct page is either
non-existent or bogus, and debugging those failures really sucks. We had a
raft of those back when NOMAP was introduced and I don't want to re-live
that experience.

> As I expressed already, long term we should really get rid of the arm64
> variant and rather special-case the generic one. Then we won't go out of
> sync - just as it happened with ZONE_DEVICE handling here.

Why does this have to be long term? This ZONE_DEVICE stuff could be the
carrot on the stick :)

Will
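
For readers following the thread, here is a rough userspace sketch of the decision logic under discussion. It is not the arm64 code and not the patch itself: every identifier prefixed with model_ or MODEL_ is hypothetical, the section size and tables are made up, and the real kernel checks are reduced to array lookups. It only contrasts "always consult memblock" with "skip the memblock check for non-early sections", as the quoted commit message describes.

```c
/* Userspace model only: every name below is hypothetical, not a kernel API. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGES_PER_SECTION 8UL /* made-up section size */

struct model_section {
    bool valid; /* a struct page array exists for this section */
    bool early; /* populated at boot (cf. SECTION_IS_EARLY) */
};

struct model_memblock {
    unsigned long base_pfn;
    unsigned long nr_pages;
    bool nomap; /* firmware reserved (cf. MEMBLOCK_NOMAP) */
};

/* Section 0: boot memory; 1: hotplugged or ZONE_DEVICE; 2: hole. */
static const struct model_section sections[] = {
    { .valid = true,  .early = true  },
    { .valid = true,  .early = false },
    { .valid = false, .early = false },
};

/* Only boot memory has memblock entries; pfns 4..7 are marked NOMAP. */
static const struct model_memblock regions[] = {
    { .base_pfn = 0, .nr_pages = 4, .nomap = false },
    { .base_pfn = 4, .nr_pages = 4, .nomap = true  },
};

static bool model_memblock_is_map_memory(unsigned long pfn)
{
    for (size_t i = 0; i < sizeof(regions) / sizeof(regions[0]); i++) {
        const struct model_memblock *r = &regions[i];

        if (pfn >= r->base_pfn && pfn - r->base_pfn < r->nr_pages)
            return !r->nomap;
    }
    return false; /* no memblock entry at all */
}

static bool model_section_ok(unsigned long pfn, size_t *nr)
{
    *nr = pfn / MODEL_PAGES_PER_SECTION;
    return *nr < sizeof(sections) / sizeof(sections[0]) &&
           sections[*nr].valid;
}

/* Modelled current behaviour: every pfn must pass the memblock check. */
static bool model_pfn_valid_old(unsigned long pfn)
{
    size_t nr;

    if (!model_section_ok(pfn, &nr))
        return false;
    return model_memblock_is_map_memory(pfn);
}

/* Modelled proposal: non-early sections skip the memblock check. */
static bool model_pfn_valid_new(unsigned long pfn)
{
    size_t nr;

    if (!model_section_ok(pfn, &nr))
        return false;
    if (!sections[nr].early)
        return true; /* trusts the section to be fully backed */
    return model_memblock_is_map_memory(pfn);
}

/* Small usage example with the tables above. */
static void model_demo(void)
{
    assert(model_pfn_valid_old(0) && model_pfn_valid_new(0));   /* boot RAM */
    assert(!model_pfn_valid_old(4) && !model_pfn_valid_new(4)); /* NOMAP */
    assert(!model_pfn_valid_old(8));  /* no memblock entry: rejected */
    assert(model_pfn_valid_new(8));   /* non-early section: accepted */
}
```

The `return true` branch in model_pfn_valid_new() is where the concern raised above sits: the sketch simply trusts that every pfn in a non-early section has a usable struct page, and nothing in the model enforces that.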