linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] powerpc/mm: Fix lockup on kernel exec fault
@ 2021-07-01 11:17 Christophe Leroy
  2021-07-02  1:25 ` Nicholas Piggin
  2021-07-06 10:52 ` Michael Ellerman
  0 siblings, 2 replies; 4+ messages in thread
From: Christophe Leroy @ 2021-07-01 11:17 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, npiggin
  Cc: linux-kernel, linuxppc-dev

The powerpc kernel is not prepared to handle exec faults from kernel.
Especially, the function is_exec_fault() will return 'false' when an
exec fault is taken by kernel, because the check is based on reading
current->thread.regs->trap which contains the trap from user.

For instance, when provoking a LKDTM EXEC_USERSPACE test,
current->thread.regs->trap is set to SYSCALL trap (0xc00), and
the fault taken by the kernel is not seen as an exec fault by
set_access_flags_filter().

Commit d7df2443cd5f ("powerpc/mm: Fix spurrious segfaults on radix
with autonuma") made it clear and handled it properly. But later on
commit d3ca587404b3 ("powerpc/mm: Fix reporting of kernel execute
faults") removed that handling, introducing test based on error_code.
And here is the problem, because on the 603 all upper bits of SRR1
get cleared when the TLB instruction miss handler bails out to ISI.

Until commit cbd7e6ca0210 ("powerpc/fault: Avoid heavy
search_exception_tables() verification"), an exec fault from kernel
at a userspace address was indirectly caught by the lack of entry for
that address in the exception tables. But after that commit the
kernel mainly rely on KUAP or on core mm handling to catch wrong
user accesses. Here the access is not wrong, so mm handles it.
It is a minor fault because PAGE_EXEC is not set,
set_access_flags_filter() should set PAGE_EXEC and voila.
But as is_exec_fault() returns false as explained in the begining,
set_access_flags_filter() bails out without setting PAGE_EXEC flag,
which leads to a forever minor exec fault.

As the kernel is not prepared to handle such exec faults, the thing
to do is to fire in bad_kernel_fault() for any exec fault taken by
the kernel, as it was prior to commit d3ca587404b3.

Fixes: d3ca587404b3 ("powerpc/mm: Fix reporting of kernel execute faults")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/mm/fault.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 34f641d4a2fe..a8d0ce85d39a 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -199,9 +199,7 @@ static bool bad_kernel_fault(struct pt_regs *regs, unsigned long error_code,
 {
 	int is_exec = TRAP(regs) == INTERRUPT_INST_STORAGE;
 
-	/* NX faults set DSISR_PROTFAULT on the 8xx, DSISR_NOEXEC_OR_G on others */
-	if (is_exec && (error_code & (DSISR_NOEXEC_OR_G | DSISR_KEYFAULT |
-				      DSISR_PROTFAULT))) {
+	if (is_exec) {
 		pr_crit_ratelimited("kernel tried to execute %s page (%lx) - exploit attempt? (uid: %d)\n",
 				    address >= TASK_SIZE ? "exec-protected" : "user",
 				    address,
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] powerpc/mm: Fix lockup on kernel exec fault
  2021-07-01 11:17 [PATCH] powerpc/mm: Fix lockup on kernel exec fault Christophe Leroy
@ 2021-07-02  1:25 ` Nicholas Piggin
  2021-07-02  5:10   ` Christophe Leroy
  2021-07-06 10:52 ` Michael Ellerman
  1 sibling, 1 reply; 4+ messages in thread
From: Nicholas Piggin @ 2021-07-02  1:25 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Christophe Leroy, Michael Ellerman,
	Paul Mackerras
  Cc: linux-kernel, linuxppc-dev

Excerpts from Christophe Leroy's message of July 1, 2021 9:17 pm:
> The powerpc kernel is not prepared to handle exec faults from kernel.
> Especially, the function is_exec_fault() will return 'false' when an
> exec fault is taken by kernel, because the check is based on reading
> current->thread.regs->trap which contains the trap from user.
> 
> For instance, when provoking a LKDTM EXEC_USERSPACE test,
> current->thread.regs->trap is set to SYSCALL trap (0xc00), and
> the fault taken by the kernel is not seen as an exec fault by
> set_access_flags_filter().
> 
> Commit d7df2443cd5f ("powerpc/mm: Fix spurrious segfaults on radix
> with autonuma") made it clear and handled it properly. But later on
> commit d3ca587404b3 ("powerpc/mm: Fix reporting of kernel execute
> faults") removed that handling, introducing test based on error_code.
> And here is the problem, because on the 603 all upper bits of SRR1
> get cleared when the TLB instruction miss handler bails out to ISI.

So the problem is 603 doesn't see the DSISR_NOEXEC_OR_G bit?

I don't see the problem with this for 64s, I don't think anything sane
can be done for any 0x400 interrupt in the kernel so it's probably
good to catch all here just in case. For 64s,

Acked-by: Nicholas Piggin <npiggin@gmail.com>

Why is 32s clearing those top bits? And it seems to be setting DSISR
that AFAIKS it does not use. Seems like it would be good to add a
NOEXEC_OR_G bit into SRR1.

Thanks,
Nick


> Until commit cbd7e6ca0210 ("powerpc/fault: Avoid heavy
> search_exception_tables() verification"), an exec fault from kernel
> at a userspace address was indirectly caught by the lack of entry for
> that address in the exception tables. But after that commit the
> kernel mainly rely on KUAP or on core mm handling to catch wrong
> user accesses. Here the access is not wrong, so mm handles it.
> It is a minor fault because PAGE_EXEC is not set,
> set_access_flags_filter() should set PAGE_EXEC and voila.
> But as is_exec_fault() returns false as explained in the begining,
> set_access_flags_filter() bails out without setting PAGE_EXEC flag,
> which leads to a forever minor exec fault.
> 
> As the kernel is not prepared to handle such exec faults, the thing
> to do is to fire in bad_kernel_fault() for any exec fault taken by
> the kernel, as it was prior to commit d3ca587404b3.
> 
> Fixes: d3ca587404b3 ("powerpc/mm: Fix reporting of kernel execute faults")
> Cc: stable@vger.kernel.org
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> ---
>  arch/powerpc/mm/fault.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 34f641d4a2fe..a8d0ce85d39a 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -199,9 +199,7 @@ static bool bad_kernel_fault(struct pt_regs *regs, unsigned long error_code,
>  {
>  	int is_exec = TRAP(regs) == INTERRUPT_INST_STORAGE;
>  
> -	/* NX faults set DSISR_PROTFAULT on the 8xx, DSISR_NOEXEC_OR_G on others */
> -	if (is_exec && (error_code & (DSISR_NOEXEC_OR_G | DSISR_KEYFAULT |
> -				      DSISR_PROTFAULT))) {
> +	if (is_exec) {
>  		pr_crit_ratelimited("kernel tried to execute %s page (%lx) - exploit attempt? (uid: %d)\n",
>  				    address >= TASK_SIZE ? "exec-protected" : "user",
>  				    address,
> -- 
> 2.25.0
> 
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] powerpc/mm: Fix lockup on kernel exec fault
  2021-07-02  1:25 ` Nicholas Piggin
@ 2021-07-02  5:10   ` Christophe Leroy
  0 siblings, 0 replies; 4+ messages in thread
From: Christophe Leroy @ 2021-07-02  5:10 UTC (permalink / raw)
  To: Nicholas Piggin, Benjamin Herrenschmidt, Michael Ellerman,
	Paul Mackerras
  Cc: linux-kernel, linuxppc-dev



Le 02/07/2021 à 03:25, Nicholas Piggin a écrit :
> Excerpts from Christophe Leroy's message of July 1, 2021 9:17 pm:
>> The powerpc kernel is not prepared to handle exec faults from kernel.
>> Especially, the function is_exec_fault() will return 'false' when an
>> exec fault is taken by kernel, because the check is based on reading
>> current->thread.regs->trap which contains the trap from user.
>>
>> For instance, when provoking a LKDTM EXEC_USERSPACE test,
>> current->thread.regs->trap is set to SYSCALL trap (0xc00), and
>> the fault taken by the kernel is not seen as an exec fault by
>> set_access_flags_filter().
>>
>> Commit d7df2443cd5f ("powerpc/mm: Fix spurrious segfaults on radix
>> with autonuma") made it clear and handled it properly. But later on
>> commit d3ca587404b3 ("powerpc/mm: Fix reporting of kernel execute
>> faults") removed that handling, introducing test based on error_code.
>> And here is the problem, because on the 603 all upper bits of SRR1
>> get cleared when the TLB instruction miss handler bails out to ISI.
> 
> So the problem is 603 doesn't see the DSISR_NOEXEC_OR_G bit?

I a way yes. But the problem is also that the kernel doesn't see it as an exec fault in 
set_access_flags_filter() as explained above. If it could see it as an exec fault, it would set 
PAGE_EXEC and it would work (or maybe not because it seems it also checks for the dirtiness of the 
page, and here the page is also flagged as dirty).

603 will see DSISR_NOEXEC_OR_G if it's an access to a page which is in a segment flagged NX.

> 
> I don't see the problem with this for 64s, I don't think anything sane
> can be done for any 0x400 interrupt in the kernel so it's probably
> good to catch all here just in case. For 64s,
> 
> Acked-by: Nicholas Piggin <npiggin@gmail.com>
> 
> Why is 32s clearing those top bits? And it seems to be setting DSISR
> that AFAIKS it does not use. Seems like it would be good to add a
> NOEXEC_OR_G bit into SRR1.

Probably for simplicity.

When taking the Instruction TLB Miss interrupt, SRR1 contains CR0 fields in bits 0-3 and some 
dedicated info in bits 12-15. That doesn't match SRR1 bits for ISI, so before falling back to the 
ISI handler, ITLB Miss handler error patch clears upper SRR1 bits.

Maybe it could instead try to set the right bits, but it would make it more complicated because the 
error patch can be taken for the following reasons:
- No page table
- Not PAGE_PRESENT
- Not PAGE_ACCESSED
- Not PAGE_EXEC
- Below TASK_SIZE and not PAGE_USER

At the time being the verification of the flags is done with a single 'andc' operation. If we wanted 
to set the proper bits, it would mean testing the flags separately, which would impact performance 
on the no-error path.

Or maybe it would be good enough to set the PROTFAULT bit in all cases but the lack of page table. 
The 8xx sets PROTFAULT when hitting non-exec pages, so the kernel is prepared for it anyway. Not 
sure about the lack of PAGE_PRESENT thought. The 8xx sets NOHPTE bit when PAGE_PRESENT is cleared.

But is it really worth doing ?

Christophe

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] powerpc/mm: Fix lockup on kernel exec fault
  2021-07-01 11:17 [PATCH] powerpc/mm: Fix lockup on kernel exec fault Christophe Leroy
  2021-07-02  1:25 ` Nicholas Piggin
@ 2021-07-06 10:52 ` Michael Ellerman
  1 sibling, 0 replies; 4+ messages in thread
From: Michael Ellerman @ 2021-07-06 10:52 UTC (permalink / raw)
  To: npiggin, Paul Mackerras, Benjamin Herrenschmidt,
	Christophe Leroy, Michael Ellerman
  Cc: linux-kernel, linuxppc-dev

On Thu, 1 Jul 2021 11:17:08 +0000 (UTC), Christophe Leroy wrote:
> The powerpc kernel is not prepared to handle exec faults from kernel.
> Especially, the function is_exec_fault() will return 'false' when an
> exec fault is taken by kernel, because the check is based on reading
> current->thread.regs->trap which contains the trap from user.
> 
> For instance, when provoking a LKDTM EXEC_USERSPACE test,
> current->thread.regs->trap is set to SYSCALL trap (0xc00), and
> the fault taken by the kernel is not seen as an exec fault by
> set_access_flags_filter().
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/mm: Fix lockup on kernel exec fault
      https://git.kernel.org/powerpc/c/cd5d5e602f502895e47e18cd46804d6d7014e65c

cheers

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-07-06 10:52 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-01 11:17 [PATCH] powerpc/mm: Fix lockup on kernel exec fault Christophe Leroy
2021-07-02  1:25 ` Nicholas Piggin
2021-07-02  5:10   ` Christophe Leroy
2021-07-06 10:52 ` Michael Ellerman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).