All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, benh@kernel.crashing.org,
	bp@alien8.de, catalin.marinas@arm.com, dan.j.williams@intel.com,
	dave.hansen@linux.intel.com, david@redhat.com,
	ebadger@gigaio.com, hch@lst.de, hpa@zytor.com, jgg@ziepe.ca,
	linux-mm@kvack.org, logang@deltatee.com, luto@kernel.org,
	mhocko@suse.com, mingo@redhat.com, mm-commits@vger.kernel.org,
	mpe@ellerman.id.au, paulus@samba.org, peterz@infradead.org,
	tglx@linutronix.de, torvalds@linux-foundation.org,
	will@kernel.org
Subject: [patch 24/35] mm/memory_hotplug: add pgprot_t to mhp_params
Date: Fri, 10 Apr 2020 14:33:36 -0700	[thread overview]
Message-ID: <20200410213336.7dA-akMbT%akpm@linux-foundation.org> (raw)
In-Reply-To: <20200410143047.bf34a933ce1affdc042c7c80@linux-foundation.org>

From: Logan Gunthorpe <logang@deltatee.com>
Subject: mm/memory_hotplug: add pgprot_t to mhp_params

devm_memremap_pages() is currently used by the PCI P2PDMA code to create
struct page mappings for IO memory.  At present, these mappings are
created with PAGE_KERNEL which implies setting the PAT bits to be WB. 
However, on x86, an mtrr register will typically override this and force
the cache type to be UC-.  In the case firmware doesn't set this register
it is effectively WB and will typically result in a machine check
exception when it's accessed.

Other arches are not currently likely to function correctly seeing they
don't have any MTRR registers to fall back on.

To solve this, provide a way to specify the pgprot value explicitly to
arch_add_memory().

Of the arches that support MEMORY_HOTPLUG: x86_64, and arm64 need a simple
change to pass the pgprot_t down to their respective functions which set
up the page tables.  For x86_32, set the page tables explicitly using
_set_memory_prot() (seeing they are already mapped).  For ia64, s390 and
sh, reject anything but PAGE_KERNEL settings -- this should be fine, for
now, seeing these architectures don't support ZONE_DEVICE.

A check in __add_pages() is also added to ensure the pgprot parameter was
set for all arches.

Link: http://lkml.kernel.org/r/20200306170846.9333-7-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Eric Badger <ebadger@gigaio.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/arm64/mm/mmu.c            |    3 ++-
 arch/ia64/mm/init.c            |    3 +++
 arch/powerpc/mm/mem.c          |    3 ++-
 arch/s390/mm/init.c            |    3 +++
 arch/sh/mm/init.c              |    3 +++
 arch/x86/mm/init_32.c          |   12 ++++++++++++
 arch/x86/mm/init_64.c          |    2 +-
 include/linux/memory_hotplug.h |    3 +++
 mm/memory_hotplug.c            |    5 ++++-
 mm/memremap.c                  |    6 +++---
 10 files changed, 36 insertions(+), 7 deletions(-)

--- a/arch/arm64/mm/mmu.c~mm-memory_hotplug-add-pgprot_t-to-mhp_params
+++ a/arch/arm64/mm/mmu.c
@@ -1382,7 +1382,8 @@ int arch_add_memory(int nid, u64 start,
 		flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
 
 	__create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
-			     size, PAGE_KERNEL, __pgd_pgtable_alloc, flags);
+			     size, params->pgprot, __pgd_pgtable_alloc,
+			     flags);
 
 	memblock_clear_nomap(start, size);
 
--- a/arch/ia64/mm/init.c~mm-memory_hotplug-add-pgprot_t-to-mhp_params
+++ a/arch/ia64/mm/init.c
@@ -676,6 +676,9 @@ int arch_add_memory(int nid, u64 start,
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 	int ret;
 
+	if (WARN_ON_ONCE(params->pgprot.pgprot != PAGE_KERNEL.pgprot))
+		return -EINVAL;
+
 	ret = __add_pages(nid, start_pfn, nr_pages, params);
 	if (ret)
 		printk("%s: Problem encountered in __add_pages() as ret=%d\n",
--- a/arch/powerpc/mm/mem.c~mm-memory_hotplug-add-pgprot_t-to-mhp_params
+++ a/arch/powerpc/mm/mem.c
@@ -132,7 +132,8 @@ int __ref arch_add_memory(int nid, u64 s
 	resize_hpt_for_hotplug(memblock_phys_mem_size());
 
 	start = (unsigned long)__va(start);
-	rc = create_section_mapping(start, start + size, nid, PAGE_KERNEL);
+	rc = create_section_mapping(start, start + size, nid,
+				    params->pgprot);
 	if (rc) {
 		pr_warn("Unable to create mapping for hot added memory 0x%llx..0x%llx: %d\n",
 			start, start + size, rc);
--- a/arch/s390/mm/init.c~mm-memory_hotplug-add-pgprot_t-to-mhp_params
+++ a/arch/s390/mm/init.c
@@ -277,6 +277,9 @@ int arch_add_memory(int nid, u64 start,
 	if (WARN_ON_ONCE(params->altmap))
 		return -EINVAL;
 
+	if (WARN_ON_ONCE(params->pgprot.pgprot != PAGE_KERNEL.pgprot))
+		return -EINVAL;
+
 	rc = vmem_add_mapping(start, size);
 	if (rc)
 		return rc;
--- a/arch/sh/mm/init.c~mm-memory_hotplug-add-pgprot_t-to-mhp_params
+++ a/arch/sh/mm/init.c
@@ -412,6 +412,9 @@ int arch_add_memory(int nid, u64 start,
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 	int ret;
 
+	if (WARN_ON_ONCE(params->pgprot.pgprot != PAGE_KERNEL.pgprot)
+		return -EINVAL;
+
 	/* We only have ZONE_NORMAL, so this is easy.. */
 	ret = __add_pages(nid, start_pfn, nr_pages, params);
 	if (unlikely(ret))
--- a/arch/x86/mm/init_32.c~mm-memory_hotplug-add-pgprot_t-to-mhp_params
+++ a/arch/x86/mm/init_32.c
@@ -824,6 +824,18 @@ int arch_add_memory(int nid, u64 start,
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
+	int ret;
+
+	/*
+	 * The page tables were already mapped at boot so if the caller
+	 * requests a different mapping type then we must change all the
+	 * pages with __set_memory_prot().
+	 */
+	if (params->pgprot.pgprot != PAGE_KERNEL.pgprot) {
+		ret = __set_memory_prot(start, nr_pages, params->pgprot);
+		if (ret)
+			return ret;
+	}
 
 	return __add_pages(nid, start_pfn, nr_pages, params);
 }
--- a/arch/x86/mm/init_64.c~mm-memory_hotplug-add-pgprot_t-to-mhp_params
+++ a/arch/x86/mm/init_64.c
@@ -867,7 +867,7 @@ int arch_add_memory(int nid, u64 start,
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 
-	init_memory_mapping(start, start + size, PAGE_KERNEL);
+	init_memory_mapping(start, start + size, params->pgprot);
 
 	return add_pages(nid, start_pfn, nr_pages, params);
 }
--- a/include/linux/memory_hotplug.h~mm-memory_hotplug-add-pgprot_t-to-mhp_params
+++ a/include/linux/memory_hotplug.h
@@ -60,9 +60,12 @@ enum {
 /*
  * Extended parameters for memory hotplug:
  * altmap: alternative allocator for memmap array (optional)
+ * pgprot: page protection flags to apply to newly created page tables
+ *	(required)
  */
 struct mhp_params {
 	struct vmem_altmap *altmap;
+	pgprot_t pgprot;
 };
 
 /*
--- a/mm/memory_hotplug.c~mm-memory_hotplug-add-pgprot_t-to-mhp_params
+++ a/mm/memory_hotplug.c
@@ -311,6 +311,9 @@ int __ref __add_pages(int nid, unsigned
 	int err;
 	struct vmem_altmap *altmap = params->altmap;
 
+	if (WARN_ON_ONCE(!params->pgprot.pgprot))
+		return -EINVAL;
+
 	err = check_hotplug_memory_addressable(pfn, nr_pages);
 	if (err)
 		return err;
@@ -1002,7 +1005,7 @@ static int online_memory_block(struct me
  */
 int __ref add_memory_resource(int nid, struct resource *res)
 {
-	struct mhp_params params = {};
+	struct mhp_params params = { .pgprot = PAGE_KERNEL };
 	u64 start, size;
 	bool new_node = false;
 	int ret;
--- a/mm/memremap.c~mm-memory_hotplug-add-pgprot_t-to-mhp_params
+++ a/mm/memremap.c
@@ -189,8 +189,8 @@ void *memremap_pages(struct dev_pagemap
 		 * We do not want any optional features only our own memmap
 		 */
 		.altmap = pgmap_altmap(pgmap),
+		.pgprot = PAGE_KERNEL,
 	};
-	pgprot_t pgprot = PAGE_KERNEL;
 	int error, is_ram;
 	bool need_devmap_managed = true;
 
@@ -282,8 +282,8 @@ void *memremap_pages(struct dev_pagemap
 	if (nid < 0)
 		nid = numa_mem_id();
 
-	error = track_pfn_remap(NULL, &pgprot, PHYS_PFN(res->start), 0,
-			resource_size(res));
+	error = track_pfn_remap(NULL, &params.pgprot, PHYS_PFN(res->start),
+				0, resource_size(res));
 	if (error)
 		goto err_pfn_remap;
 
_

  parent reply	other threads:[~2020-04-10 21:33 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-10 21:30 incoming Andrew Morton
2020-04-10 21:32 ` [patch 01/35] hfsplus: fix crash and filesystem corruption when deleting files Andrew Morton
2020-04-10 21:32 ` [patch 02/35] mm, memcg: do not high throttle allocators based on wraparound Andrew Morton
2020-04-10 21:32 ` [patch 03/35] mm, slab_common: fix a typo in comment "eariler"->"earlier" Andrew Morton
2020-04-10 21:32 ` [patch 04/35] docs: mm: slab.h: fix a broken cross-reference Andrew Morton
2020-04-10 21:32 ` [patch 05/35] mm/page_alloc.c: fix kernel-doc warning Andrew Morton
2020-04-10 21:32 ` [patch 06/35] mm/page_alloc: make pcpu_drain_mutex and pcpu_drain static Andrew Morton
2020-04-10 21:32 ` [patch 07/35] mm/gup: fix null pointer dereference detected by coverity Andrew Morton
2020-04-10 22:24   ` Linus Torvalds
2020-04-10 23:53     ` Peter Xu
2020-04-11  0:19       ` Linus Torvalds
2020-04-14  4:04     ` Miles Chen
2020-04-10 21:32 ` [patch 08/35] ocfs2: no need try to truncate file beyond i_size Andrew Morton
2020-04-10 21:32 ` [patch 09/35] mm: cma: NUMA node interface Andrew Morton
2020-04-10 21:32 ` [patch 10/35] mm: hugetlb: optionally allocate gigantic hugepages using cma Andrew Morton
2020-04-10 21:32 ` [patch 11/35] mm/mmap.c: initialize align_offset explicitly for vm_unmapped_area Andrew Morton
2020-04-10 21:32 ` [patch 12/35] mm/memory.c: refactor insert_page to prepare for batched-lock insert Andrew Morton
2020-04-10 21:32   ` Andrew Morton
2020-04-10 21:32 ` [patch 13/35] mm: bring sparc pte_index() semantics inline with other platforms Andrew Morton
2020-04-10 21:32 ` [patch 14/35] mm: define pte_index as macro for x86 Andrew Morton
2020-04-10 21:33 ` [patch 15/35] mm/memory.c: add vm_insert_pages() Andrew Morton
2020-04-10 21:33 ` [patch 16/35] mm/vma: define a default value for VM_DATA_DEFAULT_FLAGS Andrew Morton
2020-04-10 21:33   ` Andrew Morton
2020-04-10 21:33 ` [patch 17/35] mm/vma: introduce VM_ACCESS_FLAGS Andrew Morton
2020-04-10 21:33 ` [patch 18/35] mm/special: create generic fallbacks for pte_special() and pte_mkspecial() Andrew Morton
2020-04-10 21:33 ` [patch 19/35] mm/memory_hotplug: drop the flags field from struct mhp_restrictions Andrew Morton
2020-04-10 21:33 ` [patch 20/35] mm/memory_hotplug: rename mhp_restrictions to mhp_params Andrew Morton
2020-04-10 21:33 ` [patch 21/35] x86/mm: thread pgprot_t through init_memory_mapping() Andrew Morton
2020-04-10 21:33 ` [patch 22/35] x86/mm: introduce __set_memory_prot() Andrew Morton
2020-04-10 21:33 ` [patch 23/35] powerpc/mm: thread pgprot_t through create_section_mapping() Andrew Morton
2020-04-10 21:33 ` Andrew Morton [this message]
2020-04-10 21:33 ` [patch 25/35] mm/memremap: set caching mode for PCI P2PDMA memory to WC Andrew Morton
2020-04-10 21:33 ` [patch 26/35] kmod: make request_module() return an error when autoloading is disabled Andrew Morton
2020-04-10 21:33 ` [patch 27/35] fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once() Andrew Morton
2020-04-10 21:33 ` Andrew Morton
2020-04-10 21:33 ` [patch 28/35] docs: admin-guide: document the kernel.modprobe sysctl Andrew Morton
2020-04-10 21:33 ` [patch 29/35] selftests: kmod: fix handling test numbers above 9 Andrew Morton
2020-04-10 21:33 ` [patch 30/35] selftests: kmod: test disabling module autoloading Andrew Morton
2020-04-10 21:34 ` [patch 31/35] change email address for Pali Rohár Andrew Morton
2020-04-10 21:44   ` Joe Perches
2020-04-10 21:34 ` [patch 32/35] drivers/dma/tegra20-apb-dma.c: fix platform_get_irq.cocci warnings Andrew Morton
2020-04-13 17:54   ` Jon Hunter
2020-04-10 21:34 ` [patch 33/35] fs/seq_file.c: seq_read(): add info message about buggy .next functions Andrew Morton
2020-04-10 21:34 ` [patch 34/35] kernel/gcov/fs.c: gcov_seq_next() should increase position index Andrew Morton
2020-04-10 21:34 ` [patch 35/35] ipc/util.c: sysvipc_find_ipc() " Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200410213336.7dA-akMbT%akpm@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=benh@kernel.crashing.org \
    --cc=bp@alien8.de \
    --cc=catalin.marinas@arm.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@redhat.com \
    --cc=ebadger@gigaio.com \
    --cc=hch@lst.de \
    --cc=hpa@zytor.com \
    --cc=jgg@ziepe.ca \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=logang@deltatee.com \
    --cc=luto@kernel.org \
    --cc=mhocko@suse.com \
    --cc=mingo@redhat.com \
    --cc=mm-commits@vger.kernel.org \
    --cc=mpe@ellerman.id.au \
    --cc=paulus@samba.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.