linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* your "x86: mm: convert dump_pagetables to use walk_page_range" change
@ 2020-05-12  9:39 Jan Beulich
  2020-05-12 13:02 ` Steven Price
  0 siblings, 1 reply; 4+ messages in thread
From: Jan Beulich @ 2020-05-12  9:39 UTC (permalink / raw)
  To: Steven Price; +Cc: the arch/x86 maintainers, lkml

Steven,

in the description of this change you say:

"The effective permissions are passed down the chain using new fields in
 struct pg_state."

I don't see how this works, and I suppose this part of the change is
(part of) the reason why a W+X warning has magically disappeared in
5.6.x (compared to 5.5.x) when running a 32-bit kernel under Xen.

Quoting the relevant piece of code:

	if (level > 0) {
		new_eff = effective_prot(st->prot_levels[level - 1],
					 new_prot);
	} else {
		new_eff = new_prot;
	}

	if (level >= 0)
		st->prot_levels[level] = new_eff;

The generic framework calls note_page() only for leaf pages or holes
afaics. The protections for a leaf page found at a level other than
the numerically highest one have no meaning at all for a mapping at
a later address mapped with a numerically higher level mapping.
Instead it's the non-leaf page tables for that specific address
which determine the effective protection for any particular mapping.

To take an example, suppose the first present leaf page is found
at level 4. st->prot_levels[] will be all zero at this time, from
which it follows that new_eff will be zero then, too.

I don't think the intended effect can be achieved without either
retaining the original behavior of passing the effective protection
into note_page(), or calling note_page() also for non-leaf pages
(indicating to it which case it is, and adjusting it accordingly).

Am I overlooking something?

Additionally I'd like to note that note_page()'s "unsigned long val"
parameter isn't wide enough for 32-bit PAE PTEs, and hence the NX
flag will always be seen as clear in new_prot in such configs.

Jan

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: your "x86: mm: convert dump_pagetables to use walk_page_range" change
  2020-05-12  9:39 your "x86: mm: convert dump_pagetables to use walk_page_range" change Jan Beulich
@ 2020-05-12 13:02 ` Steven Price
  2020-05-12 13:09   ` Jan Beulich
  2020-05-12 13:18   ` Peter Zijlstra
  0 siblings, 2 replies; 4+ messages in thread
From: Steven Price @ 2020-05-12 13:02 UTC (permalink / raw)
  To: Jan Beulich; +Cc: the arch/x86 maintainers, lkml

On 12/05/2020 10:39, Jan Beulich wrote:
> Steven,

Hi Jan,

> in the description of this change you say:
> 
> "The effective permissions are passed down the chain using new fields in
>   struct pg_state."
> 
> I don't see how this works, and I suppose this part of the change is
> (part of) the reason why a W+X warning has magically disappeared in
> 5.6.x (compared to 5.5.x) when running a 32-bit kernel under Xen.
> 
> Quoting the relevant piece of code:
> 
> 	if (level > 0) {
> 		new_eff = effective_prot(st->prot_levels[level - 1],
> 					 new_prot);
> 	} else {
> 		new_eff = new_prot;
> 	}
> 
> 	if (level >= 0)
> 		st->prot_levels[level] = new_eff;
> 
> The generic framework calls note_page() only for leaf pages or holes
> afaics. The protections for a leaf page found at a level other than
> the numerically highest one have no meaning at all for a mapping at
> a later address mapped with a numerically higher level mapping.
> Instead it's the non-leaf page tables for that specific address
> which determine the effective protection for any particular mapping.
> 
> To take an example, suppose the first present leaf page is found
> at level 4. st->prot_levels[] will be all zero at this time, from
> which it follows that new_eff will be zero then, too.
> 
> I don't think the intended effect can be achieved without either
> retaining the original behavior of passing the effective protection
> into note_page(), or calling note_page() also for non-leaf pages
> (indicating to it which case it is, and adjusting it accordingly).
> 
> Am I overlooking something?

Sadly I don't think you are - you're reasoning seems correct. It looks 
like the computation of effective permissions will need to be done in 
ptdump.c rather than dump_pagetables.c - as it's only ptdump.c that 
deals with the non-leaf entries as you point out.

> Additionally I'd like to note that note_page()'s "unsigned long val"
> parameter isn't wide enough for 32-bit PAE PTEs, and hence the NX
> flag will always be seen as clear in new_prot in such configs.

Ah, interesting. I'm not sure what type is actually guaranteed to be 
correct. pgprotval_t is x86 specific, but it might be necessary to 
extend it to other architectures. I think I got the "unsigned long" from 
the generic page.h (and because it happens to work on most 
architectures) - but hadn't noticed that that file was specifically only 
for NOMMU architectures.

I'll see if I can come up with fixes, but if you've got anything ready 
already then please jump in.

Steve

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: your "x86: mm: convert dump_pagetables to use walk_page_range" change
  2020-05-12 13:02 ` Steven Price
@ 2020-05-12 13:09   ` Jan Beulich
  2020-05-12 13:18   ` Peter Zijlstra
  1 sibling, 0 replies; 4+ messages in thread
From: Jan Beulich @ 2020-05-12 13:09 UTC (permalink / raw)
  To: Steven Price; +Cc: the arch/x86 maintainers, lkml

On 12.05.2020 15:02, Steven Price wrote:
> On 12/05/2020 10:39, Jan Beulich wrote:
>> Additionally I'd like to note that note_page()'s "unsigned long val"
>> parameter isn't wide enough for 32-bit PAE PTEs, and hence the NX
>> flag will always be seen as clear in new_prot in such configs.
> 
> Ah, interesting. I'm not sure what type is actually guaranteed to be 
> correct. pgprotval_t is x86 specific, but it might be necessary to 
> extend it to other architectures. I think I got the "unsigned long" from 
> the generic page.h (and because it happens to work on most 
> architectures) - but hadn't noticed that that file was specifically only 
> for NOMMU architectures.

Well, it's pteval_t (still x86-specific I guess) unless the call
sites could/would be switched to use pte_flags(). As per
https://lists.xenproject.org/archives/html/xen-devel/2020-05/msg00549.html
the use of pte_val() has another bad effect, but as described
there it might also be Xen specific code which is broken here.

> I'll see if I can come up with fixes, but if you've got anything ready 
> already then please jump in.

No, I don't. I lack the time right now to make a change potentially
affecting multiple architectures.

Jan

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: your "x86: mm: convert dump_pagetables to use walk_page_range" change
  2020-05-12 13:02 ` Steven Price
  2020-05-12 13:09   ` Jan Beulich
@ 2020-05-12 13:18   ` Peter Zijlstra
  1 sibling, 0 replies; 4+ messages in thread
From: Peter Zijlstra @ 2020-05-12 13:18 UTC (permalink / raw)
  To: Steven Price; +Cc: Jan Beulich, the arch/x86 maintainers, lkml

On Tue, May 12, 2020 at 02:02:51PM +0100, Steven Price wrote:
> > Additionally I'd like to note that note_page()'s "unsigned long val"
> > parameter isn't wide enough for 32-bit PAE PTEs, and hence the NX
> > flag will always be seen as clear in new_prot in such configs.
> 
> Ah, interesting. I'm not sure what type is actually guaranteed to be
> correct. pgprotval_t is x86 specific, but it might be necessary to extend it
> to other architectures. I think I got the "unsigned long" from the generic
> page.h (and because it happens to work on most architectures) - but hadn't
> noticed that that file was specifically only for NOMMU architectures.

unsigned long long should cover the i386-PAE / ARMv7-LPAE cases. I
can't, from teh top of my head, remember if there's anybody else who has
a longer PTE.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-05-12 13:19 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-12  9:39 your "x86: mm: convert dump_pagetables to use walk_page_range" change Jan Beulich
2020-05-12 13:02 ` Steven Price
2020-05-12 13:09   ` Jan Beulich
2020-05-12 13:18   ` Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).