* [PATCH 0/4] x86/mm: Fix set_memory_rox()
@ 2022-11-10 12:52 Peter Zijlstra
2022-11-10 12:52 ` [PATCH 1/4] x86/mm: Add a few comments Peter Zijlstra
` (3 more replies)
0 siblings, 4 replies; 9+ messages in thread
From: Peter Zijlstra @ 2022-11-10 12:52 UTC (permalink / raw)
To: x86, dave.hansen; +Cc: linux-kernel, peterz, torvalds, oliver.sang
Hi,
The robot reported a failure on commit:
b38994948567 ("x86/mm: Implement native set_memory_rox()")
which led to some non-obvious code in __change_page_attr_set_clr(). These few
patches attempt to clean that up.
* [PATCH 1/4] x86/mm: Add a few comments
2022-11-10 12:52 [PATCH 0/4] x86/mm: Fix set_memory_rox() Peter Zijlstra
@ 2022-11-10 12:52 ` Peter Zijlstra
2022-11-10 16:22 ` Linus Torvalds
2022-11-10 12:52 ` [PATCH 2/4] x86/mm: Untangle __change_page_attr_set_clr(.checkalias) Peter Zijlstra
` (2 subsequent siblings)
3 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2022-11-10 12:52 UTC (permalink / raw)
To: x86, dave.hansen; +Cc: linux-kernel, peterz, torvalds, oliver.sang
It's a shame to hide useful comments in Changelogs; add some to the
code.
Shamelessly stolen from commit:
c40a56a7818c ("x86/mm/init: Remove freed kernel image areas from alias mapping")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
arch/x86/mm/pat/set_memory.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -219,6 +219,17 @@ within_inclusive(unsigned long addr, uns
#ifdef CONFIG_X86_64
+/*
+ * The kernel image is mapped into two places in the virtual address space
+ * (addresses without KASLR, of course):
+ *
+ * 1. The kernel direct map (0xffff880000000000)
+ * 2. The "high kernel map" (0xffffffff81000000)
+ *
+ * We actually execute out of #2. If we get the address of a kernel symbol, it
+ * points to #2, but almost all physical-to-virtual translations point to #1.
+ */
+
static inline unsigned long highmap_start_pfn(void)
{
return __pa_symbol(_text) >> PAGE_SHIFT;
@@ -1626,6 +1637,9 @@ static int __change_page_attr(struct cpa
static int __change_page_attr_set_clr(struct cpa_data *cpa, int checkalias);
+/*
+ * Check the directmap and "high kernel map" 'aliases'.
+ */
static int cpa_process_alias(struct cpa_data *cpa)
{
struct cpa_data alias_cpa;
* [PATCH 2/4] x86/mm: Untangle __change_page_attr_set_clr(.checkalias)
2022-11-10 12:52 [PATCH 0/4] x86/mm: Fix set_memory_rox() Peter Zijlstra
2022-11-10 12:52 ` [PATCH 1/4] x86/mm: Add a few comments Peter Zijlstra
@ 2022-11-10 12:52 ` Peter Zijlstra
2022-11-10 12:52 ` [PATCH 3/4] x86/mm: Inhibit _PAGE_NX changes from cpa_process_alias() Peter Zijlstra
2022-11-10 12:52 ` [PATCH 4/4] x86/mm: Rename __change_page_attr_set_clr(.checkalias) Peter Zijlstra
3 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2022-11-10 12:52 UTC (permalink / raw)
To: x86, dave.hansen; +Cc: linux-kernel, peterz, torvalds, oliver.sang
The .checkalias argument to __change_page_attr_set_clr() is overloaded
and serves two different purposes:
- it inhibits the call to cpa_process_alias() -- as suggested by the
name; however,
- it also serves as 'primary' indicator for __change_page_attr()
  (which in turn also serves as a recursion terminator for
  cpa_process_alias()).
Untangle these by extending the use of CPA_NO_CHECK_ALIAS to all
callsites that currently use .checkalias=0 for this purpose.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
arch/x86/mm/pat/set_memory.c | 30 +++++++++++-------------------
1 file changed, 11 insertions(+), 19 deletions(-)
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -1721,7 +1721,7 @@ static int __change_page_attr_set_clr(st
if (ret)
goto out;
- if (checkalias) {
+ if (checkalias && !(cpa->flags & CPA_NO_CHECK_ALIAS)) {
ret = cpa_process_alias(cpa);
if (ret)
goto out;
@@ -1795,18 +1795,12 @@ static int change_page_attr_set_clr(unsi
cpa.numpages = numpages;
cpa.mask_set = mask_set;
cpa.mask_clr = mask_clr;
- cpa.flags = 0;
+ cpa.flags = in_flag;
cpa.curpage = 0;
cpa.force_split = force_split;
- if (in_flag & (CPA_ARRAY | CPA_PAGES_ARRAY))
- cpa.flags |= in_flag;
-
/* No alias checking for _NX bit modifications */
checkalias = (pgprot_val(mask_set) | pgprot_val(mask_clr)) != _PAGE_NX;
- /* Has caller explicitly disabled alias checking? */
- if (in_flag & CPA_NO_CHECK_ALIAS)
- checkalias = 0;
ret = __change_page_attr_set_clr(&cpa, checkalias);
@@ -2061,11 +2055,9 @@ int set_memory_np(unsigned long addr, in
int set_memory_np_noalias(unsigned long addr, int numpages)
{
- int cpa_flags = CPA_NO_CHECK_ALIAS;
-
return change_page_attr_set_clr(&addr, numpages, __pgprot(0),
__pgprot(_PAGE_PRESENT), 0,
- cpa_flags, NULL);
+ CPA_NO_CHECK_ALIAS, NULL);
}
int set_memory_4k(unsigned long addr, int numpages)
@@ -2282,7 +2274,7 @@ static int __set_pages_p(struct page *pa
.numpages = numpages,
.mask_set = __pgprot(_PAGE_PRESENT | _PAGE_RW),
.mask_clr = __pgprot(0),
- .flags = 0};
+ .flags = CPA_NO_CHECK_ALIAS };
/*
* No alias checking needed for setting present flag. otherwise,
@@ -2290,7 +2282,7 @@ static int __set_pages_p(struct page *pa
* mappings (this adds to complexity if we want to do this from
* atomic context especially). Let's keep it simple!
*/
- return __change_page_attr_set_clr(&cpa, 0);
+ return __change_page_attr_set_clr(&cpa, 1);
}
static int __set_pages_np(struct page *page, int numpages)
@@ -2301,7 +2293,7 @@ static int __set_pages_np(struct page *p
.numpages = numpages,
.mask_set = __pgprot(0),
.mask_clr = __pgprot(_PAGE_PRESENT | _PAGE_RW),
- .flags = 0};
+ .flags = CPA_NO_CHECK_ALIAS };
/*
* No alias checking needed for setting not present flag. otherwise,
@@ -2309,7 +2301,7 @@ static int __set_pages_np(struct page *p
* mappings (this adds to complexity if we want to do this from
* atomic context especially). Let's keep it simple!
*/
- return __change_page_attr_set_clr(&cpa, 0);
+ return __change_page_attr_set_clr(&cpa, 1);
}
int set_direct_map_invalid_noflush(struct page *page)
@@ -2380,7 +2372,7 @@ int __init kernel_map_pages_in_pgd(pgd_t
.numpages = numpages,
.mask_set = __pgprot(0),
.mask_clr = __pgprot(~page_flags & (_PAGE_NX|_PAGE_RW)),
- .flags = 0,
+ .flags = CPA_NO_CHECK_ALIAS,
};
WARN_ONCE(num_online_cpus() > 1, "Don't call after initializing SMP");
@@ -2393,7 +2385,7 @@ int __init kernel_map_pages_in_pgd(pgd_t
cpa.mask_set = __pgprot(_PAGE_PRESENT | page_flags);
- retval = __change_page_attr_set_clr(&cpa, 0);
+ retval = __change_page_attr_set_clr(&cpa, 1);
__flush_tlb_all();
out:
@@ -2423,12 +2415,12 @@ int __init kernel_unmap_pages_in_pgd(pgd
.numpages = numpages,
.mask_set = __pgprot(0),
.mask_clr = __pgprot(_PAGE_PRESENT | _PAGE_RW),
- .flags = 0,
+ .flags = CPA_NO_CHECK_ALIAS,
};
WARN_ONCE(num_online_cpus() > 1, "Don't call after initializing SMP");
- retval = __change_page_attr_set_clr(&cpa, 0);
+ retval = __change_page_attr_set_clr(&cpa, 1);
__flush_tlb_all();
return retval;
* [PATCH 3/4] x86/mm: Inhibit _PAGE_NX changes from cpa_process_alias()
2022-11-10 12:52 [PATCH 0/4] x86/mm: Fix set_memory_rox() Peter Zijlstra
2022-11-10 12:52 ` [PATCH 1/4] x86/mm: Add a few comments Peter Zijlstra
2022-11-10 12:52 ` [PATCH 2/4] x86/mm: Untangle __change_page_attr_set_clr(.checkalias) Peter Zijlstra
@ 2022-11-10 12:52 ` Peter Zijlstra
2022-11-14 16:08 ` Dave Hansen
2022-11-10 12:52 ` [PATCH 4/4] x86/mm: Rename __change_page_attr_set_clr(.checkalias) Peter Zijlstra
3 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2022-11-10 12:52 UTC (permalink / raw)
To: x86, dave.hansen; +Cc: linux-kernel, peterz, torvalds, oliver.sang
There is a kludge in change_page_attr_set_clr() that inhibits
propagating NX changes to the aliases (directmap and highmap) -- this
kludge is twofold:
- it also inhibits the primary checks in __change_page_attr();
- it hard depends on single bit changes.
The introduction of set_memory_rox() triggered this last issue for
clearing both _PAGE_RW and _PAGE_NX.
Explicitly ignore _PAGE_NX in cpa_process_alias() instead.
Fixes: b38994948567 ("x86/mm: Implement native set_memory_rox()")
Reported-by: kernel test robot <oliver.sang@intel.com>
Debugged-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
arch/x86/mm/pat/set_memory.c | 28 +++++++++++++++++++++++-----
1 file changed, 23 insertions(+), 5 deletions(-)
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -1663,6 +1663,12 @@ static int cpa_process_alias(struct cpa_
alias_cpa.flags &= ~(CPA_PAGES_ARRAY | CPA_ARRAY);
alias_cpa.curpage = 0;
+ /* Directmap always has NX set, do not modify. */
+ if (__supported_pte_mask & _PAGE_NX) {
+ alias_cpa.mask_clr.pgprot &= ~_PAGE_NX;
+ alias_cpa.mask_set.pgprot &= ~_PAGE_NX;
+ }
+
cpa->force_flush_all = 1;
ret = __change_page_attr_set_clr(&alias_cpa, 0);
@@ -1685,6 +1691,15 @@ static int cpa_process_alias(struct cpa_
alias_cpa.flags &= ~(CPA_PAGES_ARRAY | CPA_ARRAY);
alias_cpa.curpage = 0;
+ /*
+ * [_text, _brk_end) also covers data, do not modify NX except
+ * in cases where the highmap is the primary target.
+ */
+ if (__supported_pte_mask & _PAGE_NX) {
+ alias_cpa.mask_clr.pgprot &= ~_PAGE_NX;
+ alias_cpa.mask_set.pgprot &= ~_PAGE_NX;
+ }
+
cpa->force_flush_all = 1;
/*
* The high mapping range is imprecise, so ignore the
@@ -1703,6 +1718,12 @@ static int __change_page_attr_set_clr(st
unsigned long rempages = numpages;
int ret = 0;
+ /*
+ * No changes, easy!
+ */
+ if (!(pgprot_val(cpa->mask_set) | pgprot_val(cpa->mask_clr)))
+ return ret;
+
while (rempages) {
/*
* Store the remaining nr of pages for the large page
@@ -1749,7 +1770,7 @@ static int change_page_attr_set_clr(unsi
struct page **pages)
{
struct cpa_data cpa;
- int ret, cache, checkalias;
+ int ret, cache;
memset(&cpa, 0, sizeof(cpa));
@@ -1799,10 +1820,7 @@ static int change_page_attr_set_clr(unsi
cpa.curpage = 0;
cpa.force_split = force_split;
- /* No alias checking for _NX bit modifications */
- checkalias = (pgprot_val(mask_set) | pgprot_val(mask_clr)) != _PAGE_NX;
-
- ret = __change_page_attr_set_clr(&cpa, checkalias);
+ ret = __change_page_attr_set_clr(&cpa, 1);
/*
* Check whether we really changed something:
* [PATCH 4/4] x86/mm: Rename __change_page_attr_set_clr(.checkalias)
2022-11-10 12:52 [PATCH 0/4] x86/mm: Fix set_memory_rox() Peter Zijlstra
` (2 preceding siblings ...)
2022-11-10 12:52 ` [PATCH 3/4] x86/mm: Inhibit _PAGE_NX changes from cpa_process_alias() Peter Zijlstra
@ 2022-11-10 12:52 ` Peter Zijlstra
3 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2022-11-10 12:52 UTC (permalink / raw)
To: x86, dave.hansen; +Cc: linux-kernel, peterz, torvalds, oliver.sang
Now that the checkalias functionality is taken over by CPA_NO_CHECK_ALIAS,
rename the argument to better match its remaining purpose: primary,
matching __change_page_attr().
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
arch/x86/mm/pat/set_memory.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -1635,7 +1635,7 @@ static int __change_page_attr(struct cpa
return err;
}
-static int __change_page_attr_set_clr(struct cpa_data *cpa, int checkalias);
+static int __change_page_attr_set_clr(struct cpa_data *cpa, int primary);
/*
* Check the directmap and "high kernel map" 'aliases'.
@@ -1712,7 +1712,7 @@ static int cpa_process_alias(struct cpa_
return 0;
}
-static int __change_page_attr_set_clr(struct cpa_data *cpa, int checkalias)
+static int __change_page_attr_set_clr(struct cpa_data *cpa, int primary)
{
unsigned long numpages = cpa->numpages;
unsigned long rempages = numpages;
@@ -1736,13 +1736,13 @@ static int __change_page_attr_set_clr(st
if (!debug_pagealloc_enabled())
spin_lock(&cpa_lock);
- ret = __change_page_attr(cpa, checkalias);
+ ret = __change_page_attr(cpa, primary);
if (!debug_pagealloc_enabled())
spin_unlock(&cpa_lock);
if (ret)
goto out;
- if (checkalias && !(cpa->flags & CPA_NO_CHECK_ALIAS)) {
+ if (primary && !(cpa->flags & CPA_NO_CHECK_ALIAS)) {
ret = cpa_process_alias(cpa);
if (ret)
goto out;
* Re: [PATCH 1/4] x86/mm: Add a few comments
2022-11-10 12:52 ` [PATCH 1/4] x86/mm: Add a few comments Peter Zijlstra
@ 2022-11-10 16:22 ` Linus Torvalds
2022-11-10 18:02 ` Peter Zijlstra
0 siblings, 1 reply; 9+ messages in thread
From: Linus Torvalds @ 2022-11-10 16:22 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: x86, dave.hansen, linux-kernel, oliver.sang
On Thu, Nov 10, 2022 at 4:58 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> It's a shame to hide useful comments in Changelogs, add some to the
> code.
>
> Shamelessly stolen from commit:
When the comment says how the image is mapped into two places,
wouldn't it be good to also have the reason *why* rather than just
what..
That said, my real commentary for this patch set is not that
particular instance, but the bigger picture - that this code is still
a bit opaque as to why these things are done.
Do we need any of those alias passes at all for pure protection bit
changes? I thought we only did these because things like cacheability
bits have to be in sync due to machine checks etc?
Or am I missing some case where writability matters too?
Linus
* Re: [PATCH 1/4] x86/mm: Add a few comments
2022-11-10 16:22 ` Linus Torvalds
@ 2022-11-10 18:02 ` Peter Zijlstra
2022-11-11 11:50 ` Peter Zijlstra
0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2022-11-10 18:02 UTC (permalink / raw)
To: Linus Torvalds; +Cc: x86, dave.hansen, linux-kernel, oliver.sang
On Thu, Nov 10, 2022 at 08:22:54AM -0800, Linus Torvalds wrote:
> On Thu, Nov 10, 2022 at 4:58 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > It's a shame to hide useful comments in Changelogs, add some to the
> > code.
> >
> > Shamelessly stolen from commit:
>
> When the comment says how the image is mapped into two places,
> wouldn't it be good to also have the reason *why* rather than just
> what..
Agreed, and I think I can actually answer this. How about I change it
into something like this:
/*
* The kernel image is mapped into two places in the virtual address space
* (addresses without KASLR, of course):
*
* 1. The kernel direct map (0xffff880000000000)
* 2. The "high kernel map" (0xffffffff81000000)
*
* We actually execute out of #2. If we get the address of a kernel symbol, it
* points to #2, but almost all physical-to-virtual translations point to #1.
*
* This is so that we can have both a directmap of all physical memory
* *and* take full advantage of the limited (s32) immediate addressing
* range (2G) of x86_64.
*
* See Documentation/x86/x86_64/mm.rst for more details.
*/
> That said, my real commentary for this patch set is not that
> particular instance, but the bigger picture - that this code is still
> a bit opaque as to why these things are done.
Yes, small steps I suppose. It took me way longer than it should have to
figure out what this code was actually doing. And why we shouldn't
propagate those NX bits in the various cases.
> Do we need any of those alias passes at all for pure protection bit
> changes? I thought we only did these because things like cacheability
> bits have to be in sync due to machine checks etc?
>
> Or am I missing some case where writability matters too?
I _think_, but I'm not actually sure, that it matters in exactly that
case dhansen mentions, where we do a physical to virtual address
translation and expect access to match whatever alias we originally came
from.
* Re: [PATCH 1/4] x86/mm: Add a few comments
2022-11-10 18:02 ` Peter Zijlstra
@ 2022-11-11 11:50 ` Peter Zijlstra
0 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2022-11-11 11:50 UTC (permalink / raw)
To: Linus Torvalds; +Cc: x86, dave.hansen, linux-kernel, oliver.sang
On Thu, Nov 10, 2022 at 07:02:31PM +0100, Peter Zijlstra wrote:
> > Do we need any of those alias passes at all for pure protection bit
> > changes? I thought we only did these because things like cacheability
> > bits have to be in sync due to machine checks etc?
> >
> > Or am I missing some case where writability matters too?
>
> I _think_, but I'm not actually sure, that it matters in exactly that
> case dhansen mentions, where we do a physical to virtual address
> translation and expect access to match whatever alias we originally came
> from.
That of course only covers the directmap; for giggles I did the below
patch on top of these and the testcase at hand boots and finishes just
fine...
So yeah, no sodding clue why we do that :/
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index f275605892df..c63e6117221a 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -1656,7 +1676,7 @@ static int cpa_process_alias(struct cpa_data *cpa)
return ret;
}
-#ifdef CONFIG_X86_64
+#if 0 // def CONFIG_X86_64
/*
* If the primary call didn't touch the high mapping already
* and the physical address is inside the kernel map, we need
* Re: [PATCH 3/4] x86/mm: Inhibit _PAGE_NX changes from cpa_process_alias()
2022-11-10 12:52 ` [PATCH 3/4] x86/mm: Inhibit _PAGE_NX changes from cpa_process_alias() Peter Zijlstra
@ 2022-11-14 16:08 ` Dave Hansen
0 siblings, 0 replies; 9+ messages in thread
From: Dave Hansen @ 2022-11-14 16:08 UTC (permalink / raw)
To: Peter Zijlstra, x86; +Cc: linux-kernel, torvalds, oliver.sang
On 11/10/22 04:52, Peter Zijlstra wrote:
> There is a kludge in change_page_attr_set_clr() that inhibits
> propagating NX changes to the aliases (directmap and highmap) -- this
> kludge is twofold:
>
> - it also inhibits the primary checks in __change_page_attr();
> - it hard depends on single bit changes.
>
> The introduction of set_memory_rox() triggered this last issue for
> clearing both _PAGE_RW and _PAGE_NX.
>
> Explicitly ignore _PAGE_NX in cpa_process_alias() instead.
>
> Fixes: b38994948567 ("x86/mm: Implement native set_memory_rox()")
> Reported-by: kernel test robot <oliver.sang@intel.com>
> Debugged-by: Dave Hansen <dave.hansen@intel.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Looks sane. It's also nice how the checkalias stuff continues to fall
out of there.
Acked-by: Dave Hansen <dave.hansen@intel.com>