driverdev-devel.linuxdriverproject.org archive mirror
* [PATCH RFC v1 00/12] mm: Don't mark hotplugged pages PG_reserved (including ZONE_DEVICE)
@ 2019-10-22 17:12 David Hildenbrand
From: David Hildenbrand @ 2019-10-22 17:12 UTC
  To: linux-kernel
  Cc: Kate Stewart, Sasha Levin, linux-hyperv, Michal Hocko,
	Radim Krčmář,
	kvm, David Hildenbrand, KarimAllah Ahmed, Benjamin Herrenschmidt,
	Dave Hansen, Alexander Duyck, Michal Hocko, Paul Mackerras,
	linux-mm, Pavel Tatashin, Paul Mackerras, Michael Ellerman,
	H. Peter Anvin, Wanpeng Li, Alexander Duyck, Kees Cook, devel,
	Stefano Stabellini, Stephen Hemminger, Aneesh Kumar K.V,
	Joerg Roedel, x86, YueHaibing, Mike Rapoport,
	Madhumitha Prabakaran, Peter Zijlstra, Ingo Molnar,
	Vlastimil Babka, Nishka Dasgupta, Anthony Yznaga, Oscar Salvador,
	Dan Carpenter, Isaac J. Manjarres, Juergen Gross,
	Anshuman Khandual, Haiyang Zhang, Simon Sandström,
	Dan Williams, kvm-ppc, Qian Cai, Alex Williamson, Mike Rapoport,
	Borislav Petkov, Nicholas Piggin, Andy Lutomirski, xen-devel,
	Boris Ostrovsky, Todd Poynor, Vitaly Kuznetsov, Allison Randal,
	Jim Mattson, Christophe Leroy, Vandana BN, Greg Kroah-Hartman,
	Cornelia Huck, Pavel Tatashin, Mel Gorman, Sean Christopherson,
	Rob Springer, Thomas Gleixner, Johannes Weiner, Paolo Bonzini,
	Andrew Morton, linuxppc-dev

This series is based on [2], which should pop up in linux-next soon:
	https://lkml.org/lkml/2019/10/21/1034

This is the result of a recent discussion with Michal ([1], [2]). Right
now, we mark all pages PG_reserved when initializing the memmap of
hotplugged memory. This includes ZONE_DEVICE memory. For system memory,
PG_reserved is cleared again when onlining the memory; for ZONE_DEVICE
memory, it is never cleared. In ancient times, we needed PG_reserved
because there was no way to tell whether the memmap was already properly
initialized. We now have SECTION_IS_ONLINE for that in the case of
!ZONE_DEVICE memory. The memmap of ZONE_DEVICE memory is already
initialized in a deferred fashion, and there shouldn't be a visible change
in that regard.
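
To illustrate the lifecycle described above, here is a minimal user-space
sketch in C. It is a toy model, not kernel code: toy_page, toy_section,
toy_hotplug_init(), toy_online(), toy_count_reserved() and the
mark_reserved parameter are hypothetical names introduced for illustration
only. It assumes that a single boolean per page is enough to stand in for
PG_reserved and a single boolean per section for SECTION_IS_ONLINE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGES_PER_SECTION 8

/* Toy stand-in for struct page: just the one flag we care about. */
struct toy_page {
	bool reserved;		/* models PG_reserved */
};

/* Toy stand-in for a memory section and its memmap. */
struct toy_section {
	bool online;		/* models SECTION_IS_ONLINE */
	bool zone_device;	/* models memory living in ZONE_DEVICE */
	struct toy_page memmap[TOY_PAGES_PER_SECTION];
};

/*
 * Initialize the memmap of a hotplugged section.
 * mark_reserved == true models the current behavior, false models the
 * behavior this series proposes.
 */
static void toy_hotplug_init(struct toy_section *s, bool zone_device,
			     bool mark_reserved)
{
	s->online = false;
	s->zone_device = zone_device;
	for (size_t i = 0; i < TOY_PAGES_PER_SECTION; i++)
		s->memmap[i].reserved = mark_reserved;
}

/* Online system memory: clear the reserved flag, mark the section online. */
static void toy_online(struct toy_section *s)
{
	assert(!s->zone_device);	/* ZONE_DEVICE memory is never onlined */
	for (size_t i = 0; i < TOY_PAGES_PER_SECTION; i++)
		s->memmap[i].reserved = false;
	s->online = true;
}

/* Count the pages that still carry the reserved flag. */
static size_t toy_count_reserved(const struct toy_section *s)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_PAGES_PER_SECTION; i++)
		n += s->memmap[i].reserved;
	return n;
}
```

In this model, a ZONE_DEVICE section initialized with mark_reserved == true
keeps all of its pages reserved forever, whereas with mark_reserved == false
only the online flag distinguishes it from onlined system memory.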

I remember that some time ago, we already talked on the list about no
longer setting PG_reserved for ZONE_DEVICE pages, but I never saw any
patches. Also, I forgot who was part of the discussion :)

One of the biggest fears was side effects. I went ahead and audited all
users of PageReserved(). The ones that don't need any care (patches)
can be found below. I will double-check and hope I am not missing anything
important.

I am probably a little bit too careful (but I don't want to break things).
In most places (besides KVM and vfio, which are nuts), the
pfn_to_online_page() check could most probably be replaced by an
is_zone_device_page() check. However, I usually get suspicious when I see
a pfn_valid() check, especially after I learned that people mmap parts of
/dev/mem into user space, including memory without memmaps. Also, people
could mmap offline memory blocks this way :/ As long as this does not
hurt performance, I think we should rather do it the clean way.
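
The difference between a pfn_valid()-style check and the
pfn_to_online_page() approach can be sketched in user-space C. This is a
toy model under stated assumptions, not the code from the patches: all
toy_ names are hypothetical, the PFN layout is made up, and
toy_is_reserved_pfn() only mirrors the general idea of treating everything
that is not online memory as reserved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_pfn_kind {
	TOY_NO_MEMMAP,		/* e.g. a range reachable via /dev/mem */
	TOY_OFFLINE,		/* memmap exists, memory block is offline */
	TOY_ONLINE,		/* ordinary online system memory */
	TOY_ZONE_DEVICE,	/* device memory with a memmap */
};

struct toy_page {
	enum toy_pfn_kind kind;
	bool reserved;		/* models PG_reserved */
};

#define TOY_NR_PFNS 4

/* Made-up layout: one PFN of each kind, none marked reserved by hotplug. */
static struct toy_page toy_memmap[TOY_NR_PFNS] = {
	{ TOY_NO_MEMMAP, false },
	{ TOY_OFFLINE, false },
	{ TOY_ONLINE, false },
	{ TOY_ZONE_DEVICE, false },
};

/* Models pfn_valid(): true whenever a memmap exists for the PFN. */
static bool toy_pfn_valid(size_t pfn)
{
	return pfn < TOY_NR_PFNS && toy_memmap[pfn].kind != TOY_NO_MEMMAP;
}

/* Models pfn_to_page(): only meaningful for PFNs that have a memmap. */
static struct toy_page *toy_pfn_to_page(size_t pfn)
{
	assert(toy_pfn_valid(pfn));
	return &toy_memmap[pfn];
}

/* Models pfn_to_online_page(): a page only for online system memory. */
static struct toy_page *toy_pfn_to_online_page(size_t pfn)
{
	if (pfn < TOY_NR_PFNS && toy_memmap[pfn].kind == TOY_ONLINE)
		return &toy_memmap[pfn];
	return NULL;
}

/* Old-style check: trusts the reserved flag of any PFN with a memmap. */
static bool toy_is_reserved_pfn_old(size_t pfn)
{
	if (toy_pfn_valid(pfn))
		return toy_pfn_to_page(pfn)->reserved;
	return true;
}

/* New-style check: anything that is not online memory counts as reserved. */
static bool toy_is_reserved_pfn(size_t pfn)
{
	struct toy_page *page = toy_pfn_to_online_page(pfn);

	if (page)
		return page->reserved;
	return true;
}
```

With the flag no longer set at hotplug time, the old-style check in this
model starts reporting the offline and ZONE_DEVICE PFNs as not reserved,
while the new-style check keeps treating them as reserved.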

I only gave it a quick test with DIMMs on x86-64, but didn't test the
ZONE_DEVICE part at all (any tips for a nice QEMU setup?). Compile-tested
on x86-64 and PPC.

Other users of PageReserved() that should be fine:
- mm/page_owner.c:pagetypeinfo_showmixedcount_print()
  -> Never called for ZONE_DEVICE (+ pfn_to_online_page(pfn))
- mm/page_owner.c:init_pages_in_zone()
  -> Never called for ZONE_DEVICE (!populated_zone(zone))
- mm/page_ext.c:free_page_ext()
  -> Only a BUG_ON(PageReserved(page)), not relevant
- mm/page_alloc.c:has_unmovable_pages()
  -> Not relevant for ZONE_DEVICE
- mm/page_alloc.c:pfn_range_valid_contig()
  -> pfn_to_online_page() already guards us
- mm/mempolicy.c:queue_pages_pte_range()
  -> vm_normal_page() checks against pte_devmap()
- mm/memory-failure.c:hwpoison_user_mappings()
  -> Not reached via memory_failure() due to pfn_to_online_page()
  -> Also not reached indirectly via memory_failure_hugetlb()
- mm/hugetlb.c:gather_bootmem_prealloc()
  -> Only a WARN_ON(PageReserved(page)), not relevant
- kernel/power/snapshot.c:saveable_highmem_page()
  -> pfn_to_online_page() already guards us
- kernel/power/snapshot.c:saveable_page()
  -> pfn_to_online_page() already guards us
- fs/proc/task_mmu.c:can_gather_numa_stats()
  -> vm_normal_page() checks against pte_devmap()
- fs/proc/task_mmu.c:can_gather_numa_stats_pmd()
  -> vm_normal_page_pmd() checks against pmd_devmap()
- fs/proc/page.c:stable_page_flags()
  -> The reserved bit is simply copied, irrelevant
- drivers/firmware/memmap.c:release_firmware_map_entry()
  -> Really only a check to detect bootmem. Not relevant for ZONE_DEVICE
- arch/ia64/kernel/mca_drv.c
- arch/mips/mm/init.c
- arch/mips/mm/ioremap.c
- arch/nios2/mm/ioremap.c
- arch/parisc/mm/ioremap.c
- arch/sparc/mm/tlb.c
- arch/xtensa/mm/cache.c
  -> No ZONE_DEVICE support
- arch/powerpc/mm/init_64.c:vmemmap_free()
  -> Special-cases memmap on altmap
  -> Only a check for bootmem
- arch/x86/kernel/alternative.c:__text_poke()
  -> Only a WARN_ON(!PageReserved(pages[0])) to verify it is bootmem
- arch/x86/mm/init_64.c
  -> Only a check for bootmem

[1] https://lkml.org/lkml/2019/10/21/736
[2] https://lkml.org/lkml/2019/10/21/1034

Cc: Michal Hocko <mhocko@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: kvm-ppc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: kvm@vger.kernel.org
Cc: linux-hyperv@vger.kernel.org
Cc: devel@driverdev.osuosl.org
Cc: xen-devel@lists.xenproject.org
Cc: x86@kernel.org
Cc: Alexander Duyck <alexander.duyck@gmail.com>

David Hildenbrand (12):
  mm/memory_hotplug: Don't allow to online/offline memory blocks with
    holes
  mm/usercopy.c: Prepare check_page_span() for PG_reserved changes
  KVM: x86/mmu: Prepare kvm_is_mmio_pfn() for PG_reserved changes
  KVM: Prepare kvm_is_reserved_pfn() for PG_reserved changes
  vfio/type1: Prepare is_invalid_reserved_pfn() for PG_reserved changes
  staging/gasket: Prepare gasket_release_page() for PG_reserved changes
  staging: kpc2000: Prepare transfer_complete_cb() for PG_reserved
    changes
  powerpc/book3s: Prepare kvmppc_book3s_instantiate_page() for
    PG_reserved changes
  powerpc/64s: Prepare hash_page_do_lazy_icache() for PG_reserved
    changes
  powerpc/mm: Prepare maybe_pte_to_page() for PG_reserved changes
  x86/mm: Prepare __ioremap_check_ram() for PG_reserved changes
  mm/memory_hotplug: Don't mark pages PG_reserved when initializing the
    memmap

 arch/powerpc/kvm/book3s_64_mmu_radix.c     | 14 ++++---
 arch/powerpc/mm/book3s64/hash_utils.c      | 10 +++--
 arch/powerpc/mm/pgtable.c                  | 10 +++--
 arch/x86/kvm/mmu.c                         | 30 +++++++++------
 arch/x86/mm/ioremap.c                      | 13 +++++--
 drivers/hv/hv_balloon.c                    |  6 +++
 drivers/staging/gasket/gasket_page_table.c |  2 +-
 drivers/staging/kpc2000/kpc_dma/fileops.c  |  3 +-
 drivers/vfio/vfio_iommu_type1.c            | 10 ++++-
 drivers/xen/balloon.c                      |  7 ++++
 include/linux/page-flags.h                 |  8 +---
 mm/memory_hotplug.c                        | 43 ++++++++++++++++------
 mm/page_alloc.c                            | 11 ------
 mm/usercopy.c                              |  5 ++-
 virt/kvm/kvm_main.c                        | 10 ++++-
 15 files changed, 115 insertions(+), 67 deletions(-)

-- 
2.21.0



end of thread, other threads:[~2019-10-24 12:51 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
2019-10-22 17:12 [PATCH RFC v1 00/12] mm: Don't mark hotplugged pages PG_reserved (including ZONE_DEVICE) David Hildenbrand
2019-10-22 17:12 ` [PATCH RFC v1 01/12] mm/memory_hotplug: Don't allow to online/offline memory blocks with holes David Hildenbrand
2019-10-24  3:53   ` Anshuman Khandual
2019-10-24  7:55     ` David Hildenbrand
2019-10-22 17:12 ` [PATCH RFC v1 02/12] mm/usercopy.c: Prepare check_page_span() for PG_reserved changes David Hildenbrand
2019-10-23  8:20   ` David Hildenbrand
2019-10-23 16:25     ` Kees Cook
2019-10-23 16:32       ` David Hildenbrand
2019-10-22 17:12 ` [PATCH RFC v1 03/12] KVM: x86/mmu: Prepare kvm_is_mmio_pfn() " David Hildenbrand
2019-10-22 17:12 ` [PATCH RFC v1 04/12] KVM: Prepare kvm_is_reserved_pfn() " David Hildenbrand
2019-10-22 17:12 ` [PATCH RFC v1 05/12] vfio/type1: Prepare is_invalid_reserved_pfn() " David Hildenbrand
2019-10-22 17:12 ` [PATCH RFC v1 06/12] staging/gasket: Prepare gasket_release_page() " David Hildenbrand
2019-10-23  8:17   ` David Hildenbrand
2019-10-22 17:12 ` [PATCH RFC v1 07/12] staging: kpc2000: Prepare transfer_complete_cb() " David Hildenbrand
2019-10-22 17:55   ` Matt Sickler
2019-10-22 21:01     ` David Hildenbrand
2019-10-22 17:12 ` [PATCH RFC v1 08/12] powerpc/book3s: Prepare kvmppc_book3s_instantiate_page() " David Hildenbrand
2019-10-22 17:12 ` [PATCH RFC v1 09/12] powerpc/64s: Prepare hash_page_do_lazy_icache() " David Hildenbrand
2019-10-22 17:12 ` [PATCH RFC v1 10/12] powerpc/mm: Prepare maybe_pte_to_page() " David Hildenbrand
2019-10-22 17:12 ` [PATCH RFC v1 11/12] x86/mm: Prepare __ioremap_check_ram() " David Hildenbrand
2019-10-22 17:12 ` [PATCH RFC v1 12/12] mm/memory_hotplug: Don't mark pages PG_reserved when initializing the memmap David Hildenbrand
2019-10-22 21:54 ` [PATCH RFC v1 00/12] mm: Don't mark hotplugged pages PG_reserved (including ZONE_DEVICE) Dan Williams
2019-10-23  7:26   ` David Hildenbrand
2019-10-23 17:09     ` Dan Williams
2019-10-23 17:27       ` David Hildenbrand
2019-10-23 19:39         ` Dan Williams
2019-10-23 21:22           ` David Hildenbrand
2019-10-24 12:50     ` David Hildenbrand
