All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] x86/dumpstack: Fix misleading instruction pointer error message
@ 2020-10-02  4:29 Mark Mossberg
  2020-10-02 10:26 ` [tip: x86/core] " tip-bot2 for Mark Mossberg
  2020-11-03 12:50 ` [PATCH v2] " Oleg Nesterov
  0 siblings, 2 replies; 14+ messages in thread
From: Mark Mossberg @ 2020-10-02  4:29 UTC (permalink / raw)
  To: tglx, mingo, bp, x86, linux-kernel; +Cc: hpa, jannh, Mark Mossberg

Printing "Bad RIP value" if copy_code() fails can be misleading for
userspace pointers, since copy_code() can fail if the instruction
pointer is valid, but the code is paged out. This is because copy_code()
calls copy_from_user_nmi() for userspace pointers, which disables page
fault handling.

This is reproducible in OOM situations, where it's plausible that the
code may be reclaimed in the time between entry into the kernel and when
this message is printed. This leaves a misleading log in dmesg that
suggests instruction pointer corruption has occurred, which may alarm
users.

This patch changes the message to state the error condition more
precisely.

Thanks to Jann Horn for help with understanding OOM reclamation.

Signed-off-by: Mark Mossberg <mark.mossberg@gmail.com>
---
 arch/x86/kernel/dumpstack.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 48ce44576947..ea8d51ec251b 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -115,7 +115,8 @@ void show_opcodes(struct pt_regs *regs, const char *loglvl)
 	unsigned long prologue = regs->ip - PROLOGUE_SIZE;
 
 	if (copy_code(regs, opcodes, prologue, sizeof(opcodes))) {
-		printk("%sCode: Bad RIP value.\n", loglvl);
+		printk("%sCode: Unable to access opcode bytes at RIP 0x%lx.\n",
+		       loglvl, prologue);
 	} else {
 		printk("%sCode: %" __stringify(PROLOGUE_SIZE) "ph <%02x> %"
 		       __stringify(EPILOGUE_SIZE) "ph\n", loglvl, opcodes,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [tip: x86/core] x86/dumpstack: Fix misleading instruction pointer error message
  2020-10-02  4:29 [PATCH v2] x86/dumpstack: Fix misleading instruction pointer error message Mark Mossberg
@ 2020-10-02 10:26 ` tip-bot2 for Mark Mossberg
  2020-11-03 12:50 ` [PATCH v2] " Oleg Nesterov
  1 sibling, 0 replies; 14+ messages in thread
From: tip-bot2 for Mark Mossberg @ 2020-10-02 10:26 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Mark Mossberg, Borislav Petkov, x86, LKML

The following commit has been merged into the x86/core branch of tip:

Commit-ID:     238c91115cd05c71447ea071624a4c9fe661f970
Gitweb:        https://git.kernel.org/tip/238c91115cd05c71447ea071624a4c9fe661f970
Author:        Mark Mossberg <mark.mossberg@gmail.com>
AuthorDate:    Fri, 02 Oct 2020 04:29:16 
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Fri, 02 Oct 2020 11:33:55 +02:00

x86/dumpstack: Fix misleading instruction pointer error message

Printing "Bad RIP value" if copy_code() fails can be misleading for
userspace pointers, since copy_code() can fail if the instruction
pointer is valid but the code is paged out. This is because copy_code()
calls copy_from_user_nmi() for userspace pointers, which disables page
fault handling.

This is reproducible in OOM situations, where it's plausible that the
code may be reclaimed in the time between entry into the kernel and when
this message is printed. This leaves a misleading log in dmesg that
suggests instruction pointer corruption has occurred, which may alarm
users.

Change the message to state the error condition more precisely.

 [ bp: Massage a bit. ]

Signed-off-by: Mark Mossberg <mark.mossberg@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20201002042915.403558-1-mark.mossberg@gmail.com
---
 arch/x86/kernel/dumpstack.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 48ce445..ea8d51e 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -115,7 +115,8 @@ void show_opcodes(struct pt_regs *regs, const char *loglvl)
 	unsigned long prologue = regs->ip - PROLOGUE_SIZE;
 
 	if (copy_code(regs, opcodes, prologue, sizeof(opcodes))) {
-		printk("%sCode: Bad RIP value.\n", loglvl);
+		printk("%sCode: Unable to access opcode bytes at RIP 0x%lx.\n",
+		       loglvl, prologue);
 	} else {
 		printk("%sCode: %" __stringify(PROLOGUE_SIZE) "ph <%02x> %"
 		       __stringify(EPILOGUE_SIZE) "ph\n", loglvl, opcodes,

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] x86/dumpstack: Fix misleading instruction pointer error message
  2020-10-02  4:29 [PATCH v2] x86/dumpstack: Fix misleading instruction pointer error message Mark Mossberg
  2020-10-02 10:26 ` [tip: x86/core] " tip-bot2 for Mark Mossberg
@ 2020-11-03 12:50 ` Oleg Nesterov
  2020-11-03 17:15   ` Borislav Petkov
  1 sibling, 1 reply; 14+ messages in thread
From: Oleg Nesterov @ 2020-11-03 12:50 UTC (permalink / raw)
  To: Mark Mossberg; +Cc: tglx, mingo, bp, x86, linux-kernel, hpa, jannh, kyin

On 10/02, Mark Mossberg wrote:
>
> Printing "Bad RIP value" if copy_code() fails can be misleading for
> userspace pointers, since copy_code() can fail if the instruction
> pointer is valid, but the code is paged out.

Another problem is that show_opcodes() makes no sense if user_mode(regs)
and tsk is not current. Try "echo t > /proc/sysrq-trigger".

In this case copy_from_user_nmi() will either fail, or (worse) it will
read the "random" memory from current->mm.

Perhaps we can add something like

	if (user_mode(regs) && regs != task_pt_regs(current))
		return;

at the start of show_opcodes() ?

> --- a/arch/x86/kernel/dumpstack.c
> +++ b/arch/x86/kernel/dumpstack.c
> @@ -115,7 +115,8 @@ void show_opcodes(struct pt_regs *regs, const char *loglvl)
>  	unsigned long prologue = regs->ip - PROLOGUE_SIZE;
>  
>  	if (copy_code(regs, opcodes, prologue, sizeof(opcodes))) {
> -		printk("%sCode: Bad RIP value.\n", loglvl);
> +		printk("%sCode: Unable to access opcode bytes at RIP 0x%lx.\n",
> +		       loglvl, prologue);
>  	} else {
>  		printk("%sCode: %" __stringify(PROLOGUE_SIZE) "ph <%02x> %"
>  		       __stringify(EPILOGUE_SIZE) "ph\n", loglvl, opcodes,
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] x86/dumpstack: Fix misleading instruction pointer error message
  2020-11-03 12:50 ` [PATCH v2] " Oleg Nesterov
@ 2020-11-03 17:15   ` Borislav Petkov
  2020-11-03 17:47     ` Oleg Nesterov
  0 siblings, 1 reply; 14+ messages in thread
From: Borislav Petkov @ 2020-11-03 17:15 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Mark Mossberg, tglx, mingo, x86, linux-kernel, hpa, jannh, kyin

On Tue, Nov 03, 2020 at 01:50:34PM +0100, Oleg Nesterov wrote:
> Another problem is that show_opcodes() makes no sense if user_mode(regs)
> and tsk is not current.

Because if not current, we would access *some* user address space but
not the one to which regs belong to?

> Try "echo t > /proc/sysrq-trigger".

What am I looking for?

I see a bunch of:

[   37.622896] Code: Unable to access opcode bytes at RIP <user address>

and three Code: lines with opcode bytes, as expected:

[   37.148693] Code: 11 0d 00 48 89 c6 4c 89 ef e8 98 07 00 00 48 83 f8 ff 0f 84 3e 02 00 00 48 3b 05 b7 28 0d 00 48 89 c3 0f 83 b5 00 00 00 48 8b <0d> e7 10 0d 00 48 83 f8 0d 76 13 48 b8 28 75 6e 72 65 61 63 68 48

So all those other but the three cases, copy_code() failed.

> In this case copy_from_user_nmi() will either fail, or (worse) it will
> read the "random" memory from current->mm.
> 
> Perhaps we can add something like
> 
> 	if (user_mode(regs) && regs != task_pt_regs(current))
> 		return;
> 
> at the start of show_opcodes() ?

tglx made it use copy_from_user_nmi() in:

d181d2da0141 ("x86/dumpstack: Dump user space code correctly again")

I'm thinking this should not use the atomic variant if it can get called
in !atomic context too.

Thomas?

Leaving in the rest for reference.

> > --- a/arch/x86/kernel/dumpstack.c
> > +++ b/arch/x86/kernel/dumpstack.c
> > @@ -115,7 +115,8 @@ void show_opcodes(struct pt_regs *regs, const char *loglvl)
> >  	unsigned long prologue = regs->ip - PROLOGUE_SIZE;
> >  
> >  	if (copy_code(regs, opcodes, prologue, sizeof(opcodes))) {
> > -		printk("%sCode: Bad RIP value.\n", loglvl);
> > +		printk("%sCode: Unable to access opcode bytes at RIP 0x%lx.\n",
> > +		       loglvl, prologue);
> >  	} else {
> >  		printk("%sCode: %" __stringify(PROLOGUE_SIZE) "ph <%02x> %"
> >  		       __stringify(EPILOGUE_SIZE) "ph\n", loglvl, opcodes,
> > -- 
> > 2.25.1

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] x86/dumpstack: Fix misleading instruction pointer error message
  2020-11-03 17:15   ` Borislav Petkov
@ 2020-11-03 17:47     ` Oleg Nesterov
  2020-11-03 17:52       ` Borislav Petkov
  0 siblings, 1 reply; 14+ messages in thread
From: Oleg Nesterov @ 2020-11-03 17:47 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Mark Mossberg, tglx, mingo, x86, linux-kernel, hpa, jannh, kyin

On 11/03, Borislav Petkov wrote:
>
> On Tue, Nov 03, 2020 at 01:50:34PM +0100, Oleg Nesterov wrote:
> > Another problem is that show_opcodes() makes no sense if user_mode(regs)
> > and tsk is not current.
>
> Because if not current, we would access *some* user address space but
> not the one to which regs belong to?

Yes, because if not current, copy_from_user() will access the current's
user address space at address = foreign process's regs->ip.

> > Try "echo t > /proc/sysrq-trigger".
>
> What am I looking for?
>
> I see a bunch of:
>
> [   37.622896] Code: Unable to access opcode bytes at RIP <user address>

this means that foreign_regs->ip is not mmapped,

> and three Code: lines with opcode bytes, as expected:
>
> [   37.148693] Code: 11 0d 00 48 89 c6 4c 89 ef e8 98 07 00 00 48 83 f8 ff 0f 84 3e 02 00 00 48 3b 05 b7 28 0d 00 48 89 c3 0f 83 b5 00 00 00 48 8b <0d> e7 10 0d 00 48 83 f8 0d 76 13 48 b8 28 75 6e 72 65 61 63 68 48

I'd say this is NOT expected and adds the unnecessary confusion.
./scripts/decodecode reports

	...

	Code starting with the faulting instruction
	===========================================
	   0:	0d e7 10 0d 00       	or     $0xd10e7,%eax
	   5:	48 83 f8 0d          	cmp    $0xd,%rax
	   9:	76 13                	jbe    0x1e
	   b:	48 b8 28 75 6e 72 65 	movabs $0x68636165726e7528,%rax
	  12:	61 63 68
	  15:	48                   	rex.W

and this is because foreign_regs->ip happens to be a valid address in current->mm.

> I'm thinking this should not use the atomic variant if it can get called
> in !atomic context too.

For what?

Oleg.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] x86/dumpstack: Fix misleading instruction pointer error message
  2020-11-03 17:47     ` Oleg Nesterov
@ 2020-11-03 17:52       ` Borislav Petkov
  2020-11-03 18:11         ` Oleg Nesterov
  0 siblings, 1 reply; 14+ messages in thread
From: Borislav Petkov @ 2020-11-03 17:52 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Mark Mossberg, tglx, mingo, x86, linux-kernel, hpa, jannh, kyin

On Tue, Nov 03, 2020 at 06:47:44PM +0100, Oleg Nesterov wrote:
> > I'm thinking this should not use the atomic variant if it can get called
> > in !atomic context too.
> 
> For what?

I'm thinking copy_code() should not use copy_from_user_nmi() if former
can be called in non-atomic context too.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] x86/dumpstack: Fix misleading instruction pointer error message
  2020-11-03 17:52       ` Borislav Petkov
@ 2020-11-03 18:11         ` Oleg Nesterov
  2020-11-03 18:20           ` Borislav Petkov
  0 siblings, 1 reply; 14+ messages in thread
From: Oleg Nesterov @ 2020-11-03 18:11 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Mark Mossberg, tglx, mingo, x86, linux-kernel, hpa, jannh, kyin

On 11/03, Borislav Petkov wrote:
>
> On Tue, Nov 03, 2020 at 06:47:44PM +0100, Oleg Nesterov wrote:
> > > I'm thinking this should not use the atomic variant if it can get called
> > > in !atomic context too.
> >
> > For what?
>
> I'm thinking copy_code() should not use copy_from_user_nmi() if former
> can be called in non-atomic context too.

I understand, but why do you think this makes sense?

Say, do you think it is fine to block in fuse_readpage() ?

Anyway, this is off-topic and I won't argue.

Oleg.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] x86/dumpstack: Fix misleading instruction pointer error message
  2020-11-03 18:11         ` Oleg Nesterov
@ 2020-11-03 18:20           ` Borislav Petkov
  2020-11-16 22:01             ` Thomas Gleixner
  0 siblings, 1 reply; 14+ messages in thread
From: Borislav Petkov @ 2020-11-03 18:20 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Mark Mossberg, tglx, mingo, x86, linux-kernel, hpa, jannh, kyin

On Tue, Nov 03, 2020 at 07:11:15PM +0100, Oleg Nesterov wrote:
> > I'm thinking copy_code() should not use copy_from_user_nmi() if former
> > can be called in non-atomic context too.
> 
> I understand, but why do you think this makes sense?

Because the copy_from_user_nmi()'s name tells me that it is at least
supposed to be called in atomic context. At least this is how I
understand it. And in atomic context regs is supposed to belong to
current, right?

So I kinda agree with what you're proposing but if copy_from_user_nmi()
can be "tricked" into reading off from the weeds, then there should be
a big fat warning above it at least so that users are warned to do the
appropriate checks.

Or there should be another wrapper around it which does the
regs-belongs-to-current checks, etc and copy_code() should use that
wrapper...

AFAICT at least.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] x86/dumpstack: Fix misleading instruction pointer error message
  2020-11-03 18:20           ` Borislav Petkov
@ 2020-11-16 22:01             ` Thomas Gleixner
  2020-11-16 23:04               ` Andy Lutomirski
                                 ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Thomas Gleixner @ 2020-11-16 22:01 UTC (permalink / raw)
  To: Borislav Petkov, Oleg Nesterov
  Cc: Mark Mossberg, mingo, x86, linux-kernel, hpa, jannh, kyin

On Tue, Nov 03 2020 at 19:20, Borislav Petkov wrote:
> On Tue, Nov 03, 2020 at 07:11:15PM +0100, Oleg Nesterov wrote:
>> > I'm thinking copy_code() should not use copy_from_user_nmi() if former
>> > can be called in non-atomic context too.

While copy_from_user_nmi() is named that way, it can be invoked from
other contexts as well. See the comment inside.

>> I understand, but why do you think this makes sense?
>
> Because the copy_from_user_nmi()'s name tells me that it is at least
> supposed to be called in atomic context. At least this is how I
> understand it. And in atomic context regs is supposed to belong to
> current, right?

Whatever context you are in current can only read it's own user space
obviously.

AFAICT even before the change I did, show_opcodes() did not care and
just either dumped what was available at regs->ip in the current tasks
user space mapping or faulted.

Fix below.

Thanks,

        tglx
---
Subject: x86/dumpstack: Don't try to access user space code of other tasks
From: Thomas Gleixner <tglx@linutronix.de>
Date: Mon, 16 Nov 2020 22:26:52 +0100

sysrq-t ends up invoking show_opcodes() for each task which tries to access
the user space code of other processes which is obviously bogus.

It either manages to dump where the foreign tasks regs->ip points to in
currents mapping or triggers a pagefault and prints "Code: Bad RIP
value.". Both is just wrong.

Add a safeguard in copy_code() and check whether the @regs pointer matches
currents pt_regs. If not, do not even try to access it.

While at it, add commentry why using copy_from_user_nmi() is safe in
copy_code() even if the function name suggests otherwise.

Reported-by: Mark Mossberg <mark.mossberg@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/dumpstack.c |   23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -78,6 +78,9 @@ static int copy_code(struct pt_regs *reg
 	if (!user_mode(regs))
 		return copy_from_kernel_nofault(buf, (u8 *)src, nbytes);
 
+	/* The user space code from other tasks cannot be accessed. */
+	if (regs != task_pt_regs(current))
+		return -EPERM;
 	/*
 	 * Make sure userspace isn't trying to trick us into dumping kernel
 	 * memory by pointing the userspace instruction pointer at it.
@@ -85,6 +88,12 @@ static int copy_code(struct pt_regs *reg
 	if (__chk_range_not_ok(src, nbytes, TASK_SIZE_MAX))
 		return -EINVAL;
 
+	/*
+	 * Even if named copy_from_user_nmi() this can be invoked from
+	 * other contexts and will not try to resolve a pagefault, which is
+	 * the correct thing to do here as this code can be called from any
+	 * context.
+	 */
 	return copy_from_user_nmi(buf, (void __user *)src, nbytes);
 }
 
@@ -115,13 +124,19 @@ void show_opcodes(struct pt_regs *regs,
 	u8 opcodes[OPCODE_BUFSIZE];
 	unsigned long prologue = regs->ip - PROLOGUE_SIZE;
 
-	if (copy_code(regs, opcodes, prologue, sizeof(opcodes))) {
-		printk("%sCode: Unable to access opcode bytes at RIP 0x%lx.\n",
-		       loglvl, prologue);
-	} else {
+	switch (copy_code(regs, opcodes, prologue, sizeof(opcodes))) {
+	case 0:
 		printk("%sCode: %" __stringify(PROLOGUE_SIZE) "ph <%02x> %"
 		       __stringify(EPILOGUE_SIZE) "ph\n", loglvl, opcodes,
 		       opcodes[PROLOGUE_SIZE], opcodes + PROLOGUE_SIZE + 1);
+		break;
+	case -EPERM:
+		/* No access to the user space stack of other tasks. Ignore. */
+		break;
+	default:
+		printk("%sCode: Unable to access opcode bytes at RIP 0x%lx.\n",
+		       loglvl, prologue);
+		break;
 	}
 }
 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] x86/dumpstack: Fix misleading instruction pointer error message
  2020-11-16 22:01             ` Thomas Gleixner
@ 2020-11-16 23:04               ` Andy Lutomirski
  2020-11-16 23:37                 ` Thomas Gleixner
  2020-11-17  9:54               ` Borislav Petkov
  2020-11-17 17:31               ` Oleg Nesterov
  2 siblings, 1 reply; 14+ messages in thread
From: Andy Lutomirski @ 2020-11-16 23:04 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Borislav Petkov, Oleg Nesterov, Mark Mossberg, Ingo Molnar,
	X86 ML, LKML, H. Peter Anvin, Jann Horn, kyin

On Mon, Nov 16, 2020 at 2:01 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>  arch/x86/kernel/dumpstack.c |   23 +++++++++++++++++++----
>  1 file changed, 19 insertions(+), 4 deletions(-)
>
> --- a/arch/x86/kernel/dumpstack.c
> +++ b/arch/x86/kernel/dumpstack.c
> @@ -78,6 +78,9 @@ static int copy_code(struct pt_regs *reg
>         if (!user_mode(regs))
>                 return copy_from_kernel_nofault(buf, (u8 *)src, nbytes);
>
> +       /* The user space code from other tasks cannot be accessed. */
> +       if (regs != task_pt_regs(current))
> +               return -EPERM;

Depending on exactly where this gets called, this may not be
sufficient.  You should also check nmi_uaccess_okay().

--Andy

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] x86/dumpstack: Fix misleading instruction pointer error message
  2020-11-16 23:04               ` Andy Lutomirski
@ 2020-11-16 23:37                 ` Thomas Gleixner
  2020-11-17  3:29                   ` Andy Lutomirski
  0 siblings, 1 reply; 14+ messages in thread
From: Thomas Gleixner @ 2020-11-16 23:37 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Borislav Petkov, Oleg Nesterov, Mark Mossberg, Ingo Molnar,
	X86 ML, LKML, H. Peter Anvin, Jann Horn, kyin

On Mon, Nov 16 2020 at 15:04, Andy Lutomirski wrote:

> On Mon, Nov 16, 2020 at 2:01 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>>  arch/x86/kernel/dumpstack.c |   23 +++++++++++++++++++----
>>  1 file changed, 19 insertions(+), 4 deletions(-)
>>
>> --- a/arch/x86/kernel/dumpstack.c
>> +++ b/arch/x86/kernel/dumpstack.c
>> @@ -78,6 +78,9 @@ static int copy_code(struct pt_regs *reg
>>         if (!user_mode(regs))
>>                 return copy_from_kernel_nofault(buf, (u8 *)src, nbytes);
>>
>> +       /* The user space code from other tasks cannot be accessed. */
>> +       if (regs != task_pt_regs(current))
>> +               return -EPERM;
>
> Depending on exactly where this gets called, this may not be
> sufficient.  You should also check nmi_uaccess_okay().

which is what copy_from_user_nmi() already does.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] x86/dumpstack: Fix misleading instruction pointer error message
  2020-11-16 23:37                 ` Thomas Gleixner
@ 2020-11-17  3:29                   ` Andy Lutomirski
  0 siblings, 0 replies; 14+ messages in thread
From: Andy Lutomirski @ 2020-11-17  3:29 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Borislav Petkov, Oleg Nesterov, Mark Mossberg, Ingo Molnar,
	X86 ML, LKML, H. Peter Anvin, Jann Horn, kyin

On Mon, Nov 16, 2020 at 3:37 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Mon, Nov 16 2020 at 15:04, Andy Lutomirski wrote:
>
> > On Mon, Nov 16, 2020 at 2:01 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> >>  arch/x86/kernel/dumpstack.c |   23 +++++++++++++++++++----
> >>  1 file changed, 19 insertions(+), 4 deletions(-)
> >>
> >> --- a/arch/x86/kernel/dumpstack.c
> >> +++ b/arch/x86/kernel/dumpstack.c
> >> @@ -78,6 +78,9 @@ static int copy_code(struct pt_regs *reg
> >>         if (!user_mode(regs))
> >>                 return copy_from_kernel_nofault(buf, (u8 *)src, nbytes);
> >>
> >> +       /* The user space code from other tasks cannot be accessed. */
> >> +       if (regs != task_pt_regs(current))
> >> +               return -EPERM;
> >
> > Depending on exactly where this gets called, this may not be
> > sufficient.  You should also check nmi_uaccess_okay().
>
> which is what copy_from_user_nmi() already does.

Whoops.  I thought I checked that...

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] x86/dumpstack: Fix misleading instruction pointer error message
  2020-11-16 22:01             ` Thomas Gleixner
  2020-11-16 23:04               ` Andy Lutomirski
@ 2020-11-17  9:54               ` Borislav Petkov
  2020-11-17 17:31               ` Oleg Nesterov
  2 siblings, 0 replies; 14+ messages in thread
From: Borislav Petkov @ 2020-11-17  9:54 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Oleg Nesterov, Mark Mossberg, mingo, x86, linux-kernel, hpa, jannh, kyin

On Mon, Nov 16, 2020 at 11:01:03PM +0100, Thomas Gleixner wrote:
> Subject: x86/dumpstack: Don't try to access user space code of other tasks
> From: Thomas Gleixner <tglx@linutronix.de>
> Date: Mon, 16 Nov 2020 22:26:52 +0100
> 
> sysrq-t ends up invoking show_opcodes() for each task which tries to access
> the user space code of other processes which is obviously bogus.
> 
> It either manages to dump where the foreign tasks regs->ip points to in

I guess you mean here "points to valid mapping of current" or so.

> currents mapping or triggers a pagefault and prints "Code: Bad RIP
> value.". Both is just wrong.
> 
> Add a safeguard in copy_code() and check whether the @regs pointer matches
> currents pt_regs. If not, do not even try to access it.
> 
> While at it, add commentry why using copy_from_user_nmi() is safe in

s/commentry/commentary/

> copy_code() even if the function name suggests otherwise.
> 
> Reported-by: Mark Mossberg <mark.mossberg@gmail.com>

This is Reported-by: Oleg

> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  arch/x86/kernel/dumpstack.c |   23 +++++++++++++++++++----
>  1 file changed, 19 insertions(+), 4 deletions(-)
> 
> --- a/arch/x86/kernel/dumpstack.c
> +++ b/arch/x86/kernel/dumpstack.c
> @@ -78,6 +78,9 @@ static int copy_code(struct pt_regs *reg
>  	if (!user_mode(regs))
>  		return copy_from_kernel_nofault(buf, (u8 *)src, nbytes);
>  
> +	/* The user space code from other tasks cannot be accessed. */
> +	if (regs != task_pt_regs(current))
> +		return -EPERM;
>  	/*
>  	 * Make sure userspace isn't trying to trick us into dumping kernel
>  	 * memory by pointing the userspace instruction pointer at it.
> @@ -85,6 +88,12 @@ static int copy_code(struct pt_regs *reg
>  	if (__chk_range_not_ok(src, nbytes, TASK_SIZE_MAX))
>  		return -EINVAL;
>  
> +	/*
> +	 * Even if named copy_from_user_nmi() this can be invoked from
> +	 * other contexts and will not try to resolve a pagefault, which is
> +	 * the correct thing to do here as this code can be called from any
> +	 * context.
> +	 */

Can we stick the first part of this comment about "this can be invoked
from other contexts" over the function definition?

>  	return copy_from_user_nmi(buf, (void __user *)src, nbytes);
>  }

...

With this, I see Code: only once with Sysrq-T:

[   25.491878] task:bash            state:R  running task     stack:    0 pid: 4267 ppid:  4187 flags:0x00004000

...

[   25.497740] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 48 8d 05 f9 61 0d 00 8b 00 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 41 54 49 89 d4 55 48 89 f5 53

which is the shell doing the

$ echo t > /proc/sysrq-trigger

So

Reviewed-by: Borislav Petkov <bp@suse.de>
Tested-by: Borislav Petkov <bp@suse.de>

Thanks!

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] x86/dumpstack: Fix misleading instruction pointer error message
  2020-11-16 22:01             ` Thomas Gleixner
  2020-11-16 23:04               ` Andy Lutomirski
  2020-11-17  9:54               ` Borislav Petkov
@ 2020-11-17 17:31               ` Oleg Nesterov
  2 siblings, 0 replies; 14+ messages in thread
From: Oleg Nesterov @ 2020-11-17 17:31 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Borislav Petkov, Mark Mossberg, mingo, x86, linux-kernel, hpa,
	jannh, kyin

On 11/16, Thomas Gleixner wrote:
>
> Subject: x86/dumpstack: Don't try to access user space code of other tasks
> From: Thomas Gleixner <tglx@linutronix.de>
> Date: Mon, 16 Nov 2020 22:26:52 +0100
> 
> sysrq-t ends up invoking show_opcodes() for each task which tries to access
> the user space code of other processes which is obviously bogus.
> 
> It either manages to dump where the foreign tasks regs->ip points to in
> currents mapping or triggers a pagefault and prints "Code: Bad RIP
> value.". Both is just wrong.
> 
> Add a safeguard in copy_code() and check whether the @regs pointer matches
> currents pt_regs. If not, do not even try to access it.
> 
> While at it, add commentry why using copy_from_user_nmi() is safe in
> copy_code() even if the function name suggests otherwise.
> 
> Reported-by: Mark Mossberg <mark.mossberg@gmail.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Acked-by: Oleg Nesterov <oleg@redhat.com>


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2020-11-17 17:32 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-02  4:29 [PATCH v2] x86/dumpstack: Fix misleading instruction pointer error message Mark Mossberg
2020-10-02 10:26 ` [tip: x86/core] " tip-bot2 for Mark Mossberg
2020-11-03 12:50 ` [PATCH v2] " Oleg Nesterov
2020-11-03 17:15   ` Borislav Petkov
2020-11-03 17:47     ` Oleg Nesterov
2020-11-03 17:52       ` Borislav Petkov
2020-11-03 18:11         ` Oleg Nesterov
2020-11-03 18:20           ` Borislav Petkov
2020-11-16 22:01             ` Thomas Gleixner
2020-11-16 23:04               ` Andy Lutomirski
2020-11-16 23:37                 ` Thomas Gleixner
2020-11-17  3:29                   ` Andy Lutomirski
2020-11-17  9:54               ` Borislav Petkov
2020-11-17 17:31               ` Oleg Nesterov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.