From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753699AbeEKLMi (ORCPT ); Fri, 11 May 2018 07:12:38 -0400
Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:39952
        "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1753473AbeEKLMg (ORCPT );
        Fri, 11 May 2018 07:12:36 -0400
Subject: Re: [RFC][PATCH] arm64: update iomem_resource.end
To: Nicolin Chen
Cc: will.deacon@arm.com, catalin.marinas@arm.com,
        linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
        steve.capper@arm.com, kristina.martsenko@arm.com, labbott@redhat.com,
        stefan@agner.ch, akpm@linux-foundation.org, jglisse@redhat.com
References: <1525906703-28481-1-git-send-email-nicoleotsuka@gmail.com>
        <20180510222920.GA20553@Asurada-Nvidia>
From: Robin Murphy
Message-ID:
Date: Fri, 11 May 2018 12:12:32 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
        Thunderbird/52.7.0
MIME-Version: 1.0
In-Reply-To: <20180510222920.GA20553@Asurada-Nvidia>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-GB
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 10/05/18 23:29, Nicolin Chen wrote:
> Thanks for the comments, Robin.
>
> On Thu, May 10, 2018 at 06:45:59PM +0100, Robin Murphy wrote:
>> On 09/05/18 23:58, Nicolin Chen wrote:
>>> The iomem_resource.end is -1 by default and should be updated in
>>> arch-level code.
>>>
>>> ARM64 so far hasn't updated it while core kernel code (mm/hmm.c)
>>> started to use iomem_resource.end for boundary check. So it'd be
>>> better to assign iomem_resource.end using a valid value, the end
>>> of physical address space for example because iomem_resource.end
>>> in theory should reflect that.
>>>
>>> However, VA_BITS might be smaller than PA_BITS in ARM64.
>>> So using the end of physical address space doesn't make a lot of
>>> sense in this case, or could be even harmful since virtual address
>>> cannot reach that memory region.
>>
>> Why? There's plenty of stuff in the physical address space that will
>> only ever be accessed via ioremap/memremap. There's no reason you
>> shouldn't be able to run a VA_BITS < 48 kernel on a Cavium ThunderX
>
> I'm running VA_BITS_39 and PA_BITS_48 on Tegra 210. There had
> not been any problem of it, however with hmm.....
>
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/mm/hmm.c#n1144
>
> This hmm_devmem_add() requests a region with PFNs being outside
> of the linear region in ARM64 case which takes MAX_PHYSMEM_BITS
> (48 bits) over iomem_resource.end without this patch. Then when
> dealing with page structures in vmemmap region from a given PFN
> directly (CONFIG_SPARSEMEM_VMEMMAP=y), and the given PFN is the
> last one based on physical region (48 bits), the address of its
> page structure will go beyond vmemmap region. Does this sound a
> problem?

Yes, but as far as we're concerned here it's not a problem with arm64:

config ARCH_HAS_HMM
        ...
        depends on (X86_64 || PPC64)
        depends on ZONE_DEVICE
        ...
        depends on MEMORY_HOTPLUG
        depends on MEMORY_HOTREMOVE
        ...

Whatever out-of-tree changes you have to address all of those are
clearly implemented incorrectly; *that's* your problem.

>> where *all* the I/O is in the top half of the PA space. We already
>> constrain RAM in this very function to those regions which fit into
>> the linear map, and if you're accessing anything other than RAM
>> through the linear map you're probably doing something wrong.
>
> If I understand this part correctly, since ARM64 has applied the
> memory limit already, does it mean that probably we should fix
> something in the region_intersects() or add an extra check in the
> hmm_devmem_add(), instead of limiting the iomem_resource?

It means we should implement memory hotplug correctly.
Which, unfortunately, I happen to know is really hard (it's something
I've been looking at from the device-DAX angle).

>> Furthermore, the physical region covered by the linear map doesn't
>> necessarily start at physical address 0 anyway - see PHYS_OFFSET.
>
> Hmm...okay...but there still should be a protection somewhere if
> it happens to access a page structure via pfn_to_page() while the
> PFN is not covered by the vmemmap linear mapping, right?

There already is: pfn_valid() will return false for anything outside
the intersection of memblock regions and the linear map region as
calculated by arm64_memblock_init(); anything calling pfn_to_page()
without checking pfn_valid() first is fundamentally broken. Or if you
have out-of-tree changes to the pfn_valid() implementation then all
bets are off, and it's not something for mainline to work around.

Robin.
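
For illustration only, the "check pfn_valid() before pfn_to_page()" rule
described above can be sketched as a small user-space toy model. This is
not kernel code: every toy_* name, the single contiguous RAM range and
the 4K page size are assumptions made for the example, and the real
arm64 pfn_valid() consults memblock rather than a simple range.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toy user-space model, NOT kernel code. All toy_* names are
 * hypothetical and exist only for this illustration.
 */

struct toy_page {
    unsigned long flags;
};

/* One contiguous RAM range that has backing page structures. */
struct toy_mem {
    uint64_t start_pfn;        /* first PFN with a page structure */
    uint64_t nr_pages;         /* number of entries in memmap */
    struct toy_page *memmap;   /* array of nr_pages page structures */
};

/* Analogue of pfn_valid(): true only for PFNs backed by memmap. */
static bool toy_pfn_valid(const struct toy_mem *m, uint64_t pfn)
{
    return pfn >= m->start_pfn && pfn - m->start_pfn < m->nr_pages;
}

/*
 * Analogue of pfn_to_page(): pure pointer arithmetic with no bounds
 * check, so it must only be called for PFNs known to be valid.
 */
static struct toy_page *toy_pfn_to_page(const struct toy_mem *m, uint64_t pfn)
{
    return m->memmap + (pfn - m->start_pfn);
}

/* The calling pattern: validate first, convert second. */
static struct toy_page *toy_pfn_to_page_checked(const struct toy_mem *m,
                                                uint64_t pfn)
{
    if (!toy_pfn_valid(m, pfn))
        return NULL;
    return toy_pfn_to_page(m, pfn);
}
```

With a 4-page range starting at PFN 0x80000, the checked helper returns
a page pointer for PFNs 0x80000 to 0x80003 and NULL for everything
else, including the last PFN of a 48-bit physical address space,
(1ULL << 36) - 1 with 4K pages, which is the case raised in the quoted
question.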