linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Logan Gunthorpe <logang@deltatee.com>
Cc: David Hildenbrand <david@redhat.com>,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org,
	linux-sh@vger.kernel.org, platform-driver-x86@vger.kernel.org,
	linux-mm@kvack.org, Christoph Hellwig <hch@lst.de>,
	Dan Williams <dan.j.williams@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH 5/6] mm, memory_hotplug: Provide argument for the pgprot_t in arch_add_memory()
Date: Mon, 9 Dec 2019 21:41:28 +0100	[thread overview]
Message-ID: <20191209204128.GC7658@dhcp22.suse.cz> (raw)
In-Reply-To: <f34a4c52-cc95-15ed-8a72-c05ab4fd6d33@deltatee.com>

On Mon 09-12-19 13:24:19, Logan Gunthorpe wrote:
> 
> 
> On 2019-12-09 12:23 p.m., David Hildenbrand wrote:
> > On 09.12.19 20:13, Logan Gunthorpe wrote:
> >> devm_memremap_pages() is currently used by the PCI P2PDMA code to create
> >> struct page mappings for IO memory. At present, these mappings are created
> >> with PAGE_KERNEL which implies setting the PAT bits to be WB. However, on
> >> x86, an mtrr register will typically override this and force the cache
> >> type to be UC-. In the case firmware doesn't set this register it is
> >> effectively WB and will typically result in a machine check exception
> >> when it's accessed.
> >>
> >> Other arches are not currently likely to function correctly seeing they
> >> don't have any MTRR registers to fall back on.
> >>
> >> To solve this, add an argument to arch_add_memory() to explicitly
> >> set the pgprot value to a specific value.
> >>
> >> Of the arches that support MEMORY_HOTPLUG: x86_64, s390 and arm64 is a
> >> simple change to pass the pgprot_t down to their respective functions
> >> which set up the page tables. For x86_32, set the page tables explicitly
> >> using _set_memory_prot() (seeing they are already mapped). For sh, reject
> >> anything but PAGE_KERNEL settings -- this should be fine, for now, seeing
> >> sh doesn't support ZONE_DEVICE anyway.
> >>
> >> Cc: Dan Williams <dan.j.williams@intel.com>
> >> Cc: David Hildenbrand <david@redhat.com>
> >> Cc: Michal Hocko <mhocko@suse.com>
> >> Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
> >> ---
> >>  arch/arm64/mm/mmu.c            | 4 ++--
> >>  arch/ia64/mm/init.c            | 5 ++++-
> >>  arch/powerpc/mm/mem.c          | 4 ++--
> >>  arch/s390/mm/init.c            | 4 ++--
> >>  arch/sh/mm/init.c              | 5 ++++-
> >>  arch/x86/mm/init_32.c          | 7 ++++++-
> >>  arch/x86/mm/init_64.c          | 4 ++--
> >>  include/linux/memory_hotplug.h | 2 +-
> >>  mm/memory_hotplug.c            | 2 +-
> >>  mm/memremap.c                  | 2 +-
> >>  10 files changed, 25 insertions(+), 14 deletions(-)
> >>
> >> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> >> index 60c929f3683b..48b65272df15 100644
> >> --- a/arch/arm64/mm/mmu.c
> >> +++ b/arch/arm64/mm/mmu.c
> >> @@ -1050,7 +1050,7 @@ int p4d_free_pud_page(p4d_t *p4d, unsigned long addr)
> >>  }
> >>  
> >>  #ifdef CONFIG_MEMORY_HOTPLUG
> >> -int arch_add_memory(int nid, u64 start, u64 size,
> >> +int arch_add_memory(int nid, u64 start, u64 size, pgprot_t prot,
> >>  			struct mhp_restrictions *restrictions)
> > 
> > Can we fiddle that into "struct mhp_restrictions" instead?
> 
> Yes, if that's what people want, it's pretty trivial to do. I chose not
> to do it that way because it doesn't get passed down to add_pages() and
> it's not really a "restriction". If I don't hear any objections, I will
> do that for v2.

I do agree that restriction is not the best fit. But I consider prot
argument to complicate the API to all users even though it is not really
clear whether we are going to have many users really benefiting from it.
Look at the vmalloc API and try to find how many users of __vmalloc do
not use PAGE_KERNEL.

So I can see two options. One of them is to add arch_add_memory_prot
that would allow to have give and extra prot argument or simply call
an arch independent API to change the protection after arch_add_memory.
The later sounds like much less code. The memory shouldn't be in use by
anybody at that stage yet AFAIU. Maybe there even is an API like that.
-- 
Michal Hocko
SUSE Labs

  reply	other threads:[~2019-12-09 20:41 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-09 19:13 [PATCH 0/6] Allow setting caching mode in arch_add_memory() for P2PDMA Logan Gunthorpe
2019-12-09 19:13 ` [PATCH 1/6] x86/mm: Thread pgprot_t through init_memory_mapping() Logan Gunthorpe
2019-12-09 19:13 ` [PATCH 2/6] x86/mm: Introduce _set_memory_prot() Logan Gunthorpe
2019-12-09 19:13 ` [PATCH 3/6] powerpc/mm: Thread pgprot_t through create_section_mapping() Logan Gunthorpe
2019-12-09 19:13 ` [PATCH 4/6] s390/mm: Thread pgprot_t through vmem_add_mapping() Logan Gunthorpe
2019-12-09 19:13 ` [PATCH 5/6] mm, memory_hotplug: Provide argument for the pgprot_t in arch_add_memory() Logan Gunthorpe
2019-12-09 19:23   ` David Hildenbrand
2019-12-09 20:24     ` Logan Gunthorpe
2019-12-09 20:41       ` Michal Hocko [this message]
2019-12-09 21:00         ` Dan Williams
2019-12-09 21:27           ` Logan Gunthorpe
2019-12-09 21:24         ` Logan Gunthorpe
2019-12-10  9:56           ` Michal Hocko
2019-12-09 20:43       ` Dan Williams
2019-12-09 20:52         ` David Hildenbrand
2019-12-10 10:04         ` Michal Hocko
2019-12-10 10:09           ` David Hildenbrand
2019-12-10 10:34             ` Michal Hocko
2019-12-10 11:25               ` David Hildenbrand
2019-12-10 23:52                 ` Logan Gunthorpe
2019-12-11  8:37                   ` Michal Hocko
2019-12-09 19:13 ` [PATCH 6/6] mm/memremap: Set caching mode for PCI P2PDMA memory to WC Logan Gunthorpe
2019-12-09 20:43 ` [PATCH 0/6] Allow setting caching mode in arch_add_memory() for P2PDMA Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191209204128.GC7658@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=benh@kernel.crashing.org \
    --cc=bp@alien8.de \
    --cc=catalin.marinas@arm.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@redhat.com \
    --cc=hch@lst.de \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-ia64@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linux-sh@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=logang@deltatee.com \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=platform-driver-x86@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).