* [GIT PULL] arm64 fix for 5.14 @ 2021-08-26 13:17 ` Will Deacon 0 siblings, 0 replies; 20+ messages in thread From: Will Deacon @ 2021-08-26 13:17 UTC (permalink / raw) To: torvalds; +Cc: catalin.marinas, linux-arm-kernel, linux-kernel, kernel-team Hi Linus, Please pull this single arm64 fix for 5.14. We received a report this week that the generic version of pfn_valid(), which we switched to this merge window in 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID"), interacts badly with dma_map_resource() due to the following check: /* Don't allow RAM to be mapped */ if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr)))) return DMA_MAPPING_ERROR; Since the ongoing saga to determine the semantics of pfn_valid() is unlikely to be resolved this week (does it indicate valid memory, or just the presence of a struct page, or whether that struct page has been initialised?), just revert back to our old version of pfn_valid() for 5.14. Thanks, Will --->8 The following changes since commit bde8fff82e4a4b0f000dbf4d5eadab2079be0b56: arm64: initialize all of CNTHCTL_EL2 (2021-08-19 10:02:10 +0100) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git tags/arm64-fixes for you to fetch changes up to 3eb9cdffb39701743973382860f214026f4d7825: Partially revert "arm64/mm: drop HAVE_ARCH_PFN_VALID" (2021-08-25 11:33:24 +0100) ---------------------------------------------------------------- arm64 fix for 5.14 - Fix dma_map_resource() by reverting back to old pfn_valid() code ---------------------------------------------------------------- Will Deacon (1): Partially revert "arm64/mm: drop HAVE_ARCH_PFN_VALID" arch/arm64/Kconfig | 1 + arch/arm64/include/asm/page.h | 1 + arch/arm64/mm/init.c | 37 +++++++++++++++++++++++++++++++++++++ 3 files changed, 39 insertions(+) ^ permalink raw reply [flat|nested] 20+ messages in thread
* [GIT PULL] arm64 fix for 5.14 @ 2021-08-26 13:17 ` Will Deacon 0 siblings, 0 replies; 20+ messages in thread From: Will Deacon @ 2021-08-26 13:17 UTC (permalink / raw) To: torvalds; +Cc: catalin.marinas, linux-arm-kernel, linux-kernel, kernel-team Hi Linus, Please pull this single arm64 fix for 5.14. We received a report this week that the generic version of pfn_valid(), which we switched to this merge window in 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID"), interacts badly with dma_map_resource() due to the following check: /* Don't allow RAM to be mapped */ if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr)))) return DMA_MAPPING_ERROR; Since the ongoing saga to determine the semantics of pfn_valid() is unlikely to be resolved this week (does it indicate valid memory, or just the presence of a struct page, or whether that struct page has been initialised?), just revert back to our old version of pfn_valid() for 5.14. Thanks, Will --->8 The following changes since commit bde8fff82e4a4b0f000dbf4d5eadab2079be0b56: arm64: initialize all of CNTHCTL_EL2 (2021-08-19 10:02:10 +0100) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git tags/arm64-fixes for you to fetch changes up to 3eb9cdffb39701743973382860f214026f4d7825: Partially revert "arm64/mm: drop HAVE_ARCH_PFN_VALID" (2021-08-25 11:33:24 +0100) ---------------------------------------------------------------- arm64 fix for 5.14 - Fix dma_map_resource() by reverting back to old pfn_valid() code ---------------------------------------------------------------- Will Deacon (1): Partially revert "arm64/mm: drop HAVE_ARCH_PFN_VALID" arch/arm64/Kconfig | 1 + arch/arm64/include/asm/page.h | 1 + arch/arm64/mm/init.c | 37 +++++++++++++++++++++++++++++++++++++ 3 files changed, 39 insertions(+) _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [GIT PULL] arm64 fix for 5.14 2021-08-26 13:17 ` Will Deacon @ 2021-08-26 18:41 ` Linus Torvalds -1 siblings, 0 replies; 20+ messages in thread From: Linus Torvalds @ 2021-08-26 18:41 UTC (permalink / raw) To: Will Deacon, Christoph Hellwig Cc: Catalin Marinas, Linux ARM, Linux Kernel Mailing List, Android Kernel Team On Thu, Aug 26, 2021 at 6:17 AM Will Deacon <will@kernel.org> wrote: > > Please pull this single arm64 fix for 5.14. Pulled. But adding Christoph to the cc, since I do think the eventual fix needs to be in the DMA mapping code: > We received a report this week > that the generic version of pfn_valid(), which we switched to this merge > window in 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID"), interacts > badly with dma_map_resource() due to the following check: > > /* Don't allow RAM to be mapped */ > if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr)))) > return DMA_MAPPING_ERROR; > > Since the ongoing saga to determine the semantics of pfn_valid() is > unlikely to be resolved this week (does it indicate valid memory, or > just the presence of a struct page, or whether that struct page has been > initialised?), just revert back to our old version of pfn_valid() for > 5.14. I think that's the right thing for now, but yeah, that condition for WARN_ON_ONCE() seems very questionable. "pfn_valid()" is more about whether you can do a "pfn_to_page()" lookup on it. II get the feeling that the dma-mapping code should allow pages that are PageReserved() to be mapped - they aren't "ram" in the kernel sense. Perhaps also make sure it's not the zero page (which is PageReserved(), but most definitely RAM). In a PC world that would be (for example) the legacy PCI space at 0xa0000-0xfffff, but I could easily imagine other platforms having other situations. Linus ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [GIT PULL] arm64 fix for 5.14 @ 2021-08-26 18:41 ` Linus Torvalds 0 siblings, 0 replies; 20+ messages in thread From: Linus Torvalds @ 2021-08-26 18:41 UTC (permalink / raw) To: Will Deacon, Christoph Hellwig Cc: Catalin Marinas, Linux ARM, Linux Kernel Mailing List, Android Kernel Team On Thu, Aug 26, 2021 at 6:17 AM Will Deacon <will@kernel.org> wrote: > > Please pull this single arm64 fix for 5.14. Pulled. But adding Christoph to the cc, since I do think the eventual fix needs to be in the DMA mapping code: > We received a report this week > that the generic version of pfn_valid(), which we switched to this merge > window in 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID"), interacts > badly with dma_map_resource() due to the following check: > > /* Don't allow RAM to be mapped */ > if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr)))) > return DMA_MAPPING_ERROR; > > Since the ongoing saga to determine the semantics of pfn_valid() is > unlikely to be resolved this week (does it indicate valid memory, or > just the presence of a struct page, or whether that struct page has been > initialised?), just revert back to our old version of pfn_valid() for > 5.14. I think that's the right thing for now, but yeah, that condition for WARN_ON_ONCE() seems very questionable. "pfn_valid()" is more about whether you can do a "pfn_to_page()" lookup on it. II get the feeling that the dma-mapping code should allow pages that are PageReserved() to be mapped - they aren't "ram" in the kernel sense. Perhaps also make sure it's not the zero page (which is PageReserved(), but most definitely RAM). In a PC world that would be (for example) the legacy PCI space at 0xa0000-0xfffff, but I could easily imagine other platforms having other situations. Linus _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [GIT PULL] arm64 fix for 5.14 2021-08-26 18:41 ` Linus Torvalds @ 2021-08-27 7:40 ` Christoph Hellwig -1 siblings, 0 replies; 20+ messages in thread From: Christoph Hellwig @ 2021-08-27 7:40 UTC (permalink / raw) To: Linus Torvalds Cc: Will Deacon, Christoph Hellwig, Catalin Marinas, Linux ARM, Linux Kernel Mailing List, Android Kernel Team On Thu, Aug 26, 2021 at 11:41:34AM -0700, Linus Torvalds wrote: > "pfn_valid()" is more about whether you can do a "pfn_to_page()" lookup on it. > > II get the feeling that the dma-mapping code should allow pages that > are PageReserved() to be mapped - they aren't "ram" in the kernel > sense. > > Perhaps also make sure it's not the zero page (which is > PageReserved(), but most definitely RAM). > > In a PC world that would be (for example) the legacy PCI space at > 0xa0000-0xfffff, but I could easily imagine other platforms having > other situations. So what would be the correct check for "this is not actually page backed normal RAM"? ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [GIT PULL] arm64 fix for 5.14 @ 2021-08-27 7:40 ` Christoph Hellwig 0 siblings, 0 replies; 20+ messages in thread From: Christoph Hellwig @ 2021-08-27 7:40 UTC (permalink / raw) To: Linus Torvalds Cc: Will Deacon, Christoph Hellwig, Catalin Marinas, Linux ARM, Linux Kernel Mailing List, Android Kernel Team On Thu, Aug 26, 2021 at 11:41:34AM -0700, Linus Torvalds wrote: > "pfn_valid()" is more about whether you can do a "pfn_to_page()" lookup on it. > > II get the feeling that the dma-mapping code should allow pages that > are PageReserved() to be mapped - they aren't "ram" in the kernel > sense. > > Perhaps also make sure it's not the zero page (which is > PageReserved(), but most definitely RAM). > > In a PC world that would be (for example) the legacy PCI space at > 0xa0000-0xfffff, but I could easily imagine other platforms having > other situations. So what would be the correct check for "this is not actually page backed normal RAM"? _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [GIT PULL] arm64 fix for 5.14 2021-08-27 7:40 ` Christoph Hellwig @ 2021-08-27 17:03 ` Linus Torvalds -1 siblings, 0 replies; 20+ messages in thread From: Linus Torvalds @ 2021-08-27 17:03 UTC (permalink / raw) To: Christoph Hellwig Cc: Will Deacon, Catalin Marinas, Linux ARM, Linux Kernel Mailing List, Android Kernel Team On Fri, Aug 27, 2021 at 12:40 AM Christoph Hellwig <hch@lst.de> wrote: > > > In a PC world that would be (for example) the legacy PCI space at > > 0xa0000-0xfffff, but I could easily imagine other platforms having > > other situations. > > So what would be the correct check for "this is not actually page backed > normal RAM"? It would probably be interesting to have the arm people explain the call chain for the warning that caused that revert, so we'd have a very concrete example of the situation that goes wrong, but taking a wild stab at it, the code might be something like /* Don't allow RAM to be mapped */ if (WARN_ON_ONCE(phys_addr_is_ram(phys_addr))) return DMA_MAPPING_ERROR; and then having something like static inline bool phys_addr_is_ram(phys_addr_t phys_addr) { unsigned long pfn = PHYS_PFN(phys_addr); if (!pfn_valid(pfn)) return false; return is_zero_pfn(pfn) || !PageReserved(pfn_to_page(pfn)); } might be close to right. The ARM code actually uses that complex pfn_to_section_nr() and memblock_is_memory() etc. That seems a bit of an overkill, since the memblock code should have translated all that into being reserved. But again, I don't actually know exactly what triggered the issue on ARM, so the above is just my "this seems to be a more proper check" suggestion. Will? Linus ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [GIT PULL] arm64 fix for 5.14 @ 2021-08-27 17:03 ` Linus Torvalds 0 siblings, 0 replies; 20+ messages in thread From: Linus Torvalds @ 2021-08-27 17:03 UTC (permalink / raw) To: Christoph Hellwig Cc: Will Deacon, Catalin Marinas, Linux ARM, Linux Kernel Mailing List, Android Kernel Team On Fri, Aug 27, 2021 at 12:40 AM Christoph Hellwig <hch@lst.de> wrote: > > > In a PC world that would be (for example) the legacy PCI space at > > 0xa0000-0xfffff, but I could easily imagine other platforms having > > other situations. > > So what would be the correct check for "this is not actually page backed > normal RAM"? It would probably be interesting to have the arm people explain the call chain for the warning that caused that revert, so we'd have a very concrete example of the situation that goes wrong, but taking a wild stab at it, the code might be something like /* Don't allow RAM to be mapped */ if (WARN_ON_ONCE(phys_addr_is_ram(phys_addr))) return DMA_MAPPING_ERROR; and then having something like static inline bool phys_addr_is_ram(phys_addr_t phys_addr) { unsigned long pfn = PHYS_PFN(phys_addr); if (!pfn_valid(pfn)) return false; return is_zero_pfn(pfn) || !PageReserved(pfn_to_page(pfn)); } might be close to right. The ARM code actually uses that complex pfn_to_section_nr() and memblock_is_memory() etc. That seems a bit of an overkill, since the memblock code should have translated all that into being reserved. But again, I don't actually know exactly what triggered the issue on ARM, so the above is just my "this seems to be a more proper check" suggestion. Will? Linus _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [GIT PULL] arm64 fix for 5.14 2021-08-27 17:03 ` Linus Torvalds @ 2021-08-27 17:10 ` Christoph Hellwig -1 siblings, 0 replies; 20+ messages in thread From: Christoph Hellwig @ 2021-08-27 17:10 UTC (permalink / raw) To: Linus Torvalds Cc: Christoph Hellwig, Will Deacon, Catalin Marinas, Linux ARM, Linux Kernel Mailing List, Android Kernel Team On Fri, Aug 27, 2021 at 10:03:29AM -0700, Linus Torvalds wrote: > The ARM code actually uses that complex pfn_to_section_nr() and > memblock_is_memory() etc. That seems a bit of an overkill, since the > memblock code should have translated all that into being reserved. > > But again, I don't actually know exactly what triggered the issue on > ARM, so the above is just my "this seems to be a more proper check" > suggestion. They CCed me on their earlier discussion, but I did not catch up on it until you responded to the pull request If I understood it correct it was about a platform device mapping a MMIO region (like a PCI bar), but something about section alignment cause pfn_valid to mistrigger. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [GIT PULL] arm64 fix for 5.14 @ 2021-08-27 17:10 ` Christoph Hellwig 0 siblings, 0 replies; 20+ messages in thread From: Christoph Hellwig @ 2021-08-27 17:10 UTC (permalink / raw) To: Linus Torvalds Cc: Christoph Hellwig, Will Deacon, Catalin Marinas, Linux ARM, Linux Kernel Mailing List, Android Kernel Team On Fri, Aug 27, 2021 at 10:03:29AM -0700, Linus Torvalds wrote: > The ARM code actually uses that complex pfn_to_section_nr() and > memblock_is_memory() etc. That seems a bit of an overkill, since the > memblock code should have translated all that into being reserved. > > But again, I don't actually know exactly what triggered the issue on > ARM, so the above is just my "this seems to be a more proper check" > suggestion. They CCed me on their earlier discussion, but I did not catch up on it until you responded to the pull request If I understood it correct it was about a platform device mapping a MMIO region (like a PCI bar), but something about section alignment cause pfn_valid to mistrigger. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [GIT PULL] arm64 fix for 5.14 2021-08-27 17:10 ` Christoph Hellwig @ 2021-08-27 17:16 ` Linus Torvalds -1 siblings, 0 replies; 20+ messages in thread From: Linus Torvalds @ 2021-08-27 17:16 UTC (permalink / raw) To: Christoph Hellwig Cc: Will Deacon, Catalin Marinas, Linux ARM, Linux Kernel Mailing List, Android Kernel Team On Fri, Aug 27, 2021 at 10:10 AM Christoph Hellwig <hch@lst.de> wrote: > > They CCed me on their earlier discussion, but I did not catch up on it > until you responded to the pull request If I understood it correct it > was about a platform device mapping a MMIO region (like a PCI bar), > but something about section alignment cause pfn_valid to mistrigger. Yeah, so I can easily see the maxpfn numbers can easily end up being rounded up to a whole memory section etc. I think my suggested solution should JustWork(tm) - exactly because if the area is then in that "this pfn is valid" area, it will double-check the actual underlying page. That said, I think x86 avoids the problem another way - by just making sure max_pfn is exact. That works too, as long as there are no holes in the RAM map that might be used for PCI BAR's. So I think arm could fix it that way too, depending on their memory layout. (x86 has that traditional hole at A0000-FFFFF, but it's special enough that it presumably is never an issue). Linus ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [GIT PULL] arm64 fix for 5.14 @ 2021-08-27 17:16 ` Linus Torvalds 0 siblings, 0 replies; 20+ messages in thread From: Linus Torvalds @ 2021-08-27 17:16 UTC (permalink / raw) To: Christoph Hellwig Cc: Will Deacon, Catalin Marinas, Linux ARM, Linux Kernel Mailing List, Android Kernel Team On Fri, Aug 27, 2021 at 10:10 AM Christoph Hellwig <hch@lst.de> wrote: > > They CCed me on their earlier discussion, but I did not catch up on it > until you responded to the pull request If I understood it correct it > was about a platform device mapping a MMIO region (like a PCI bar), > but something about section alignment cause pfn_valid to mistrigger. Yeah, so I can easily see the maxpfn numbers can easily end up being rounded up to a whole memory section etc. I think my suggested solution should JustWork(tm) - exactly because if the area is then in that "this pfn is valid" area, it will double-check the actual underlying page. That said, I think x86 avoids the problem another way - by just making sure max_pfn is exact. That works too, as long as there are no holes in the RAM map that might be used for PCI BAR's. So I think arm could fix it that way too, depending on their memory layout. (x86 has that traditional hole at A0000-FFFFF, but it's special enough that it presumably is never an issue). Linus _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [GIT PULL] arm64 fix for 5.14 2021-08-27 17:16 ` Linus Torvalds @ 2021-08-31 13:31 ` Will Deacon -1 siblings, 0 replies; 20+ messages in thread From: Will Deacon @ 2021-08-31 13:31 UTC (permalink / raw) To: Linus Torvalds Cc: Christoph Hellwig, Catalin Marinas, Linux ARM, Linux Kernel Mailing List, Android Kernel Team, david [+David] On Fri, Aug 27, 2021 at 10:16:27AM -0700, Linus Torvalds wrote: > On Fri, Aug 27, 2021 at 10:10 AM Christoph Hellwig <hch@lst.de> wrote: > > > > They CCed me on their earlier discussion, but I did not catch up on it > > until you responded to the pull request If I understood it correct it > > was about a platform device mapping a MMIO region (like a PCI bar), > > but something about section alignment cause pfn_valid to mistrigger. > > Yeah, so I can easily see the maxpfn numbers can easily end up being > rounded up to a whole memory section etc. > > I think my suggested solution should JustWork(tm) - exactly because if > the area is then in that "this pfn is valid" area, it will > double-check the actual underlying page. I think the pitfall there is that the 'struct page' might well exist, but isn't necessarily initialised with anything meaningful. I remember seeing something like that in the past (I think for "no-map" memory) and David's reply here: https://lore.kernel.org/r/aff3942e-b9ce-5bae-8214-0e5d89cd071c@redhat.com hints that there are still gotchas with looking at the page flags for pages if the memory is offline or ZONE_DEVICE. Don't get me wrong, I'd really like to use the generic code here as I think it would help to set expectations around what pfn_valid() actually means, I'm just less keen on the try-it-and-see-what-breaks approach given how sensitive it is to the layout of the physical memory map. > That said, I think x86 avoids the problem another way - by just making > sure max_pfn is exact. That works too, as long as there are no holes > in the RAM map that might be used for PCI BAR's. > > So I think arm could fix it that way too, depending on their memory layout. The physical memory map is the wild west, unfortunately. It's one of the things where everybody does something different and it's very common to see disjoint banks of memory placed seemingly randomly around. Will ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [GIT PULL] arm64 fix for 5.14 @ 2021-08-31 13:31 ` Will Deacon 0 siblings, 0 replies; 20+ messages in thread From: Will Deacon @ 2021-08-31 13:31 UTC (permalink / raw) To: Linus Torvalds Cc: Christoph Hellwig, Catalin Marinas, Linux ARM, Linux Kernel Mailing List, Android Kernel Team, david [+David] On Fri, Aug 27, 2021 at 10:16:27AM -0700, Linus Torvalds wrote: > On Fri, Aug 27, 2021 at 10:10 AM Christoph Hellwig <hch@lst.de> wrote: > > > > They CCed me on their earlier discussion, but I did not catch up on it > > until you responded to the pull request If I understood it correct it > > was about a platform device mapping a MMIO region (like a PCI bar), > > but something about section alignment cause pfn_valid to mistrigger. > > Yeah, so I can easily see the maxpfn numbers can easily end up being > rounded up to a whole memory section etc. > > I think my suggested solution should JustWork(tm) - exactly because if > the area is then in that "this pfn is valid" area, it will > double-check the actual underlying page. I think the pitfall there is that the 'struct page' might well exist, but isn't necessarily initialised with anything meaningful. I remember seeing something like that in the past (I think for "no-map" memory) and David's reply here: https://lore.kernel.org/r/aff3942e-b9ce-5bae-8214-0e5d89cd071c@redhat.com hints that there are still gotchas with looking at the page flags for pages if the memory is offline or ZONE_DEVICE. Don't get me wrong, I'd really like to use the generic code here as I think it would help to set expectations around what pfn_valid() actually means, I'm just less keen on the try-it-and-see-what-breaks approach given how sensitive it is to the layout of the physical memory map. > That said, I think x86 avoids the problem another way - by just making > sure max_pfn is exact. That works too, as long as there are no holes > in the RAM map that might be used for PCI BAR's. > > So I think arm could fix it that way too, depending on their memory layout. The physical memory map is the wild west, unfortunately. It's one of the things where everybody does something different and it's very common to see disjoint banks of memory placed seemingly randomly around. Will _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [GIT PULL] arm64 fix for 5.14 2021-08-31 13:31 ` Will Deacon @ 2021-08-31 19:16 ` David Hildenbrand -1 siblings, 0 replies; 20+ messages in thread From: David Hildenbrand @ 2021-08-31 19:16 UTC (permalink / raw) To: Will Deacon, Linus Torvalds Cc: Christoph Hellwig, Catalin Marinas, Linux ARM, Linux Kernel Mailing List, Android Kernel Team On 31.08.21 15:31, Will Deacon wrote: > [+David] > > On Fri, Aug 27, 2021 at 10:16:27AM -0700, Linus Torvalds wrote: >> On Fri, Aug 27, 2021 at 10:10 AM Christoph Hellwig <hch@lst.de> wrote: >>> >>> They CCed me on their earlier discussion, but I did not catch up on it >>> until you responded to the pull request If I understood it correct it >>> was about a platform device mapping a MMIO region (like a PCI bar), >>> but something about section alignment cause pfn_valid to mistrigger. >> >> Yeah, so I can easily see the maxpfn numbers can easily end up being >> rounded up to a whole memory section etc. >> >> I think my suggested solution should JustWork(tm) - exactly because if >> the area is then in that "this pfn is valid" area, it will >> double-check the actual underlying page. > > I think the pitfall there is that the 'struct page' might well exist, > but isn't necessarily initialised with anything meaningful. I remember > seeing something like that in the past (I think for "no-map" memory) and > David's reply here: > > https://lore.kernel.org/r/aff3942e-b9ce-5bae-8214-0e5d89cd071c@redhat.com > > hints that there are still gotchas with looking at the page flags for > pages if the memory is offline or ZONE_DEVICE. > > Don't get me wrong, I'd really like to use the generic code here as I > think it would help to set expectations around what pfn_valid() actually > means, I'm just less keen on the try-it-and-see-what-breaks approach > given how sensitive it is to the layout of the physical memory map. > >> That said, I think x86 avoids the problem another way - by just making >> sure max_pfn is exact. That works too, as long as there are no holes >> in the RAM map that might be used for PCI BAR's. >> >> So I think arm could fix it that way too, depending on their memory layout. > > The physical memory map is the wild west, unfortunately. It's one of the > things where everybody does something different and it's very common to see > disjoint banks of memory placed seemingly randomly around. The resource tree is usually the best place to really identify what's system RAM and what's not IIRC. memblock should work on applicable archs as well. Identifying ZONE_DEVICE ranges reliably is a different story ... -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [GIT PULL] arm64 fix for 5.14 @ 2021-08-31 19:16 ` David Hildenbrand 0 siblings, 0 replies; 20+ messages in thread From: David Hildenbrand @ 2021-08-31 19:16 UTC (permalink / raw) To: Will Deacon, Linus Torvalds Cc: Christoph Hellwig, Catalin Marinas, Linux ARM, Linux Kernel Mailing List, Android Kernel Team On 31.08.21 15:31, Will Deacon wrote: > [+David] > > On Fri, Aug 27, 2021 at 10:16:27AM -0700, Linus Torvalds wrote: >> On Fri, Aug 27, 2021 at 10:10 AM Christoph Hellwig <hch@lst.de> wrote: >>> >>> They CCed me on their earlier discussion, but I did not catch up on it >>> until you responded to the pull request If I understood it correct it >>> was about a platform device mapping a MMIO region (like a PCI bar), >>> but something about section alignment cause pfn_valid to mistrigger. >> >> Yeah, so I can easily see the maxpfn numbers can easily end up being >> rounded up to a whole memory section etc. >> >> I think my suggested solution should JustWork(tm) - exactly because if >> the area is then in that "this pfn is valid" area, it will >> double-check the actual underlying page. > > I think the pitfall there is that the 'struct page' might well exist, > but isn't necessarily initialised with anything meaningful. I remember > seeing something like that in the past (I think for "no-map" memory) and > David's reply here: > > https://lore.kernel.org/r/aff3942e-b9ce-5bae-8214-0e5d89cd071c@redhat.com > > hints that there are still gotchas with looking at the page flags for > pages if the memory is offline or ZONE_DEVICE. > > Don't get me wrong, I'd really like to use the generic code here as I > think it would help to set expectations around what pfn_valid() actually > means, I'm just less keen on the try-it-and-see-what-breaks approach > given how sensitive it is to the layout of the physical memory map. > >> That said, I think x86 avoids the problem another way - by just making >> sure max_pfn is exact. That works too, as long as there are no holes >> in the RAM map that might be used for PCI BAR's. >> >> So I think arm could fix it that way too, depending on their memory layout. > > The physical memory map is the wild west, unfortunately. It's one of the > things where everybody does something different and it's very common to see > disjoint banks of memory placed seemingly randomly around. The resource tree is usually the best place to really identify what's system RAM and what's not IIRC. memblock should work on applicable archs as well. Identifying ZONE_DEVICE ranges reliably is a different story ... -- Thanks, David / dhildenb _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [GIT PULL] arm64 fix for 5.14 2021-08-27 17:16 ` Linus Torvalds @ 2021-08-31 16:56 ` Catalin Marinas -1 siblings, 0 replies; 20+ messages in thread From: Catalin Marinas @ 2021-08-31 16:56 UTC (permalink / raw) To: Linus Torvalds Cc: Christoph Hellwig, Will Deacon, Linux ARM, Linux Kernel Mailing List, Android Kernel Team On Fri, Aug 27, 2021 at 10:16:27AM -0700, Linus Torvalds wrote: > On Fri, Aug 27, 2021 at 10:10 AM Christoph Hellwig <hch@lst.de> wrote: > > They CCed me on their earlier discussion, but I did not catch up on it > > until you responded to the pull request If I understood it correct it > > was about a platform device mapping a MMIO region (like a PCI bar), > > but something about section alignment cause pfn_valid to mistrigger. > > Yeah, so I can easily see the maxpfn numbers can easily end up being > rounded up to a whole memory section etc. > > I think my suggested solution should JustWork(tm) - exactly because if > the area is then in that "this pfn is valid" area, it will > double-check the actual underlying page. > > That said, I think x86 avoids the problem another way - by just making > sure max_pfn is exact. That works too, as long as there are no holes > in the RAM map that might be used for PCI BAR's. > > So I think arm could fix it that way too, depending on their memory layout. The suggested solution in the original thread was to change the generic DMA code to use memblock_is_memory() instead of pfn_valid(): https://lore.kernel.org/lkml/b720e7c8-ca44-0a25-480b-05bf49d03c35@redhat.com/ Given how late we discovered this in the -rc cycle, the decision was to revert the pfn_valid() patch. We'll re-instate it at some point but someone needs to sanity check the other pfn_valid() call sites and the expected semantics. -- Catalin ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [GIT PULL] arm64 fix for 5.14 @ 2021-08-31 16:56 ` Catalin Marinas 0 siblings, 0 replies; 20+ messages in thread From: Catalin Marinas @ 2021-08-31 16:56 UTC (permalink / raw) To: Linus Torvalds Cc: Christoph Hellwig, Will Deacon, Linux ARM, Linux Kernel Mailing List, Android Kernel Team On Fri, Aug 27, 2021 at 10:16:27AM -0700, Linus Torvalds wrote: > On Fri, Aug 27, 2021 at 10:10 AM Christoph Hellwig <hch@lst.de> wrote: > > They CCed me on their earlier discussion, but I did not catch up on it > > until you responded to the pull request If I understood it correct it > > was about a platform device mapping a MMIO region (like a PCI bar), > > but something about section alignment cause pfn_valid to mistrigger. > > Yeah, so I can easily see the maxpfn numbers can easily end up being > rounded up to a whole memory section etc. > > I think my suggested solution should JustWork(tm) - exactly because if > the area is then in that "this pfn is valid" area, it will > double-check the actual underlying page. > > That said, I think x86 avoids the problem another way - by just making > sure max_pfn is exact. That works too, as long as there are no holes > in the RAM map that might be used for PCI BAR's. > > So I think arm could fix it that way too, depending on their memory layout. The suggested solution in the original thread was to change the generic DMA code to use memblock_is_memory() instead of pfn_valid(): https://lore.kernel.org/lkml/b720e7c8-ca44-0a25-480b-05bf49d03c35@redhat.com/ Given how late we discovered this in the -rc cycle, the decision was to revert the pfn_valid() patch. We'll re-instate it at some point but someone needs to sanity check the other pfn_valid() call sites and the expected semantics. -- Catalin _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [GIT PULL] arm64 fix for 5.14 2021-08-26 13:17 ` Will Deacon @ 2021-08-26 18:52 ` pr-tracker-bot -1 siblings, 0 replies; 20+ messages in thread From: pr-tracker-bot @ 2021-08-26 18:52 UTC (permalink / raw) To: Will Deacon Cc: torvalds, catalin.marinas, linux-arm-kernel, linux-kernel, kernel-team The pull request you sent on Thu, 26 Aug 2021 14:17:48 +0100: > git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git tags/arm64-fixes has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/1a6d80ff2419e8ad627b4bf4775a8b4c70af535d Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [GIT PULL] arm64 fix for 5.14 @ 2021-08-26 18:52 ` pr-tracker-bot 0 siblings, 0 replies; 20+ messages in thread From: pr-tracker-bot @ 2021-08-26 18:52 UTC (permalink / raw) To: Will Deacon Cc: torvalds, catalin.marinas, linux-arm-kernel, linux-kernel, kernel-team The pull request you sent on Thu, 26 Aug 2021 14:17:48 +0100: > git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git tags/arm64-fixes has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/1a6d80ff2419e8ad627b4bf4775a8b4c70af535d Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2021-08-31 19:23 UTC | newest] Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-08-26 13:17 [GIT PULL] arm64 fix for 5.14 Will Deacon 2021-08-26 13:17 ` Will Deacon 2021-08-26 18:41 ` Linus Torvalds 2021-08-26 18:41 ` Linus Torvalds 2021-08-27 7:40 ` Christoph Hellwig 2021-08-27 7:40 ` Christoph Hellwig 2021-08-27 17:03 ` Linus Torvalds 2021-08-27 17:03 ` Linus Torvalds 2021-08-27 17:10 ` Christoph Hellwig 2021-08-27 17:10 ` Christoph Hellwig 2021-08-27 17:16 ` Linus Torvalds 2021-08-27 17:16 ` Linus Torvalds 2021-08-31 13:31 ` Will Deacon 2021-08-31 13:31 ` Will Deacon 2021-08-31 19:16 ` David Hildenbrand 2021-08-31 19:16 ` David Hildenbrand 2021-08-31 16:56 ` Catalin Marinas 2021-08-31 16:56 ` Catalin Marinas 2021-08-26 18:52 ` pr-tracker-bot 2021-08-26 18:52 ` pr-tracker-bot
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.