* kernel BUG at mm/usercopy.c:72!
@ 2017-05-15 19:19 Breno Leitao
  2017-05-16  4:00 ` Anshuman Khandual
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Breno Leitao @ 2017-05-15 19:19 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: gromero

Hello,

Kernel 4.12-rc1 hits a BUG when I run it on a POWER8 virtual
machine. Just SSHing into the machine triggers it.

	[23.138124] usercopy: kernel memory overwrite attempt detected to d000000003d80030 (mm_struct) (560 bytes)
	[23.138195] ------------[ cut here ]------------
	[23.138229] kernel BUG at mm/usercopy.c:72!
	[23.138252] Oops: Exception in kernel mode, sig: 5 [#3]
	[23.138280] SMP NR_CPUS=2048 
	[23.138280] NUMA 
	[23.138302] pSeries
	[23.138330] Modules linked in:
	[23.138354] CPU: 4 PID: 2215 Comm: sshd Tainted: G      D         4.12.0-rc1+ #9
	[23.138395] task: c0000001e272dc00 task.stack: c0000001e27b0000
	[23.138430] NIP: c000000000342358 LR: c000000000342354 CTR: c0000000006eb060
	[23.138472] REGS: c0000001e27b3a00 TRAP: 0700   Tainted: G      D          (4.12.0-rc1+)
	[23.138513] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
	[23.138517]   CR: 28004222  XER: 20000000
	[23.138565] CFAR: c000000000b34500 SOFTE: 1 
	[23.138565] GPR00: c000000000342354 c0000001e27b3c80 c00000000142a000 000000000000005e 
	[23.138565] GPR04: c0000001ffe0ade8 c0000001ffe21bf8 2920283536302062 79746573290d0a74 
	[23.138565] GPR08: 0000000000000007 c000000000f61864 00000001feeb0000 3064206f74206465 
	[23.138565] GPR12: 0000000000004400 c00000000fb42600 0000000000000015 00000000545bdc40 
	[23.138565] GPR16: 00000000545c49c8 000001000b4b8890 00007ffff78c26f0 00000000545cf000 
	[23.138565] GPR20: 00000000546109c8 000000000000c7e8 0000000054610010 00007ffff78c22e8 
	[23.138565] GPR24: 00000000545c8c40 c0000000ff6bcef0 c0000000001e5220 0000000000000230 
	[23.138565] GPR28: d000000003d80260 0000000000000000 0000000000000230 d000000003d80030 
	[23.138920] NIP [c000000000342358] __check_object_size+0x88/0x2d0
	[23.138956] LR [c000000000342354] __check_object_size+0x84/0x2d0
	[23.138990] Call Trace:
	[23.139006] [c0000001e27b3c80] [c000000000342354] __check_object_size+0x84/0x2d0 (unreliable)
	[23.139056] [c0000001e27b3d00] [c0000000009f5ba8] bpf_prog_create_from_user+0xa8/0x1a0
	[23.139099] [c0000001e27b3d60] [c0000000001e5d30] do_seccomp+0x120/0x720
	[23.139136] [c0000001e27b3dd0] [c0000000000fd53c] SyS_prctl+0x2ac/0x6b0
	[23.139172] [c0000001e27b3e30] [c00000000000af84] system_call+0x38/0xe0
	[23.139218] Instruction dump:
	[23.139240] 60000000 60420000 3c82ff94 3ca2ff9d 38841788 38a5e868 3c62ff95 7fc8f378 
	[23.139283] 7fe6fb78 386310c0 487f2169 60000000 <0fe00000> 60420000 2ba30010 409d018c 
	[23.139328] ---[ end trace 1a1dc952a4b7c4af ]---
	
I found that kernel 4.11 does not have this issue. I also found that, if
I revert 517e1fbeb65f5eade8d14f46ac365db6c75aea9b, I do not see the
problem.

On the other hand, if I cherry-pick commit
517e1fbeb65f5eade8d14f46ac365db6c75aea9b into 4.11, I start seeing the
same issue on 4.11 as well.

FWIW I am using the following .config file[1].

[1] http://paste.ubuntu.com/24582478/


* Re: kernel BUG at mm/usercopy.c:72!
  2017-05-15 19:19 kernel BUG at mm/usercopy.c:72! Breno Leitao
@ 2017-05-16  4:00 ` Anshuman Khandual
  2017-05-16  4:44   ` Balbir Singh
  2017-05-16 11:02 ` Michael Ellerman
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Anshuman Khandual @ 2017-05-16  4:00 UTC (permalink / raw)
  To: Breno Leitao, linuxppc-dev; +Cc: gromero

On 05/16/2017 12:49 AM, Breno Leitao wrote:
> Hello,
> 
> Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
> machine. Justing SSHing into the machine causes this issue.
> 
> 	[23.138124] usercopy: kernel memory overwrite attempt detected to d000000003d80030 (mm_struct) (560 bytes)
> 	[23.138195] ------------[ cut here ]------------
> 	[23.138229] kernel BUG at mm/usercopy.c:72!
> 	[23.138252] Oops: Exception in kernel mode, sig: 5 [#3]
> 	[23.138280] SMP NR_CPUS=2048 
> 	[23.138280] NUMA 
> 	[23.138302] pSeries
> 	[23.138330] Modules linked in:
> 	[23.138354] CPU: 4 PID: 2215 Comm: sshd Tainted: G      D         4.12.0-rc1+ #9
> 	[23.138395] task: c0000001e272dc00 task.stack: c0000001e27b0000
> 	[23.138430] NIP: c000000000342358 LR: c000000000342354 CTR: c0000000006eb060
> 	[23.138472] REGS: c0000001e27b3a00 TRAP: 0700   Tainted: G      D          (4.12.0-rc1+)
> 	[23.138513] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
> 	[23.138517]   CR: 28004222  XER: 20000000
> 	[23.138565] CFAR: c000000000b34500 SOFTE: 1 
> 	[23.138565] GPR00: c000000000342354 c0000001e27b3c80 c00000000142a000 000000000000005e 
> 	[23.138565] GPR04: c0000001ffe0ade8 c0000001ffe21bf8 2920283536302062 79746573290d0a74 
> 	[23.138565] GPR08: 0000000000000007 c000000000f61864 00000001feeb0000 3064206f74206465 
> 	[23.138565] GPR12: 0000000000004400 c00000000fb42600 0000000000000015 00000000545bdc40 
> 	[23.138565] GPR16: 00000000545c49c8 000001000b4b8890 00007ffff78c26f0 00000000545cf000 
> 	[23.138565] GPR20: 00000000546109c8 000000000000c7e8 0000000054610010 00007ffff78c22e8 
> 	[23.138565] GPR24: 00000000545c8c40 c0000000ff6bcef0 c0000000001e5220 0000000000000230 
> 	[23.138565] GPR28: d000000003d80260 0000000000000000 0000000000000230 d000000003d80030 
> 	[23.138920] NIP [c000000000342358] __check_object_size+0x88/0x2d0
> 	[23.138956] LR [c000000000342354] __check_object_size+0x84/0x2d0
> 	[23.138990] Call Trace:
> 	[23.139006] [c0000001e27b3c80] [c000000000342354] __check_object_size+0x84/0x2d0 (unreliable)
> 	[23.139056] [c0000001e27b3d00] [c0000000009f5ba8] bpf_prog_create_from_user+0xa8/0x1a0
> 	[23.139099] [c0000001e27b3d60] [c0000000001e5d30] do_seccomp+0x120/0x720
> 	[23.139136] [c0000001e27b3dd0] [c0000000000fd53c] SyS_prctl+0x2ac/0x6b0
> 	[23.139172] [c0000001e27b3e30] [c00000000000af84] system_call+0x38/0xe0
> 	[23.139218] Instruction dump:
> 	[23.139240] 60000000 60420000 3c82ff94 3ca2ff9d 38841788 38a5e868 3c62ff95 7fc8f378 
> 	[23.139283] 7fe6fb78 386310c0 487f2169 60000000 <0fe00000> 60420000 2ba30010 409d018c 
> 	[23.139328] ---[ end trace 1a1dc952a4b7c4af ]---
> 	
> I found that kernel 4.11 does not have this issue. I also found that, if
> I revert 517e1fbeb65f5eade8d14f46ac365db6c75aea9b, I do not see the
> problem.

commit 517e1fbeb65f5eade8d14f46ac365db6c75aea9b
Author: Laura Abbott <labbott@redhat.com>
Date:   Tue Apr 4 14:09:00 2017 -0700

    mm/usercopy: Drop extra is_vmalloc_or_module() check
    
    Previously virt_addr_valid() was insufficient to validate if virt_to_page()
    could be called on an address on arm64. This has since been fixed up so
    there is no need for the extra check. Drop it.
    
    Signed-off-by: Laura Abbott <labbott@redhat.com>
    Acked-by: Mark Rutland <mark.rutland@arm.com>
    Signed-off-by: Kees Cook <keescook@chromium.org>

diff --git a/mm/usercopy.c b/mm/usercopy.c
index 1eba99b..a9852b2 100644
--- a/mm/usercopy.c
+++ b/mm/usercopy.c
@@ -200,17 +200,6 @@ static inline const char *check_heap_object(const void *ptr, unsigned long n,
 {
 	struct page *page;
 
-	/*
-	 * Some architectures (arm64) return true for virt_addr_valid() on
-	 * vmalloced addresses. Work around this by checking for vmalloc
-	 * first.
-	 *
-	 * We also need to check for module addresses explicitly since we
-	 * may copy static data from modules to userspace
-	 */
-	if (is_vmalloc_or_module_addr(ptr))
-		return NULL;
-
 	if (!virt_addr_valid(ptr))
 		return NULL;
 


On POWER8 (CONFIG_PPC64),

#define virt_addr_valid(kaddr)	pfn_valid(virt_to_pfn(kaddr))
#define virt_to_pfn(kaddr)	(__pa(kaddr) >> PAGE_SHIFT)
#define __pa(x) ((unsigned long)(x) & 0x0fffffffffffffffUL)

Since __pa() simply masks off the top four bits, some vmalloc (0xd range)
addresses can still pass the virt_addr_valid() test. So the explicit check
for vmalloc and module addresses that the commit removed is still required
on powerpc. If that is the case, we should revert the commit.

- Anshuman


* Re: kernel BUG at mm/usercopy.c:72!
  2017-05-16  4:00 ` Anshuman Khandual
@ 2017-05-16  4:44   ` Balbir Singh
  2017-05-16  5:04     ` Anshuman Khandual
  0 siblings, 1 reply; 14+ messages in thread
From: Balbir Singh @ 2017-05-16  4:44 UTC (permalink / raw)
  To: Anshuman Khandual, Breno Leitao, linuxppc-dev; +Cc: gromero

On Tue, 2017-05-16 at 09:30 +0530, Anshuman Khandual wrote:
> On 05/16/2017 12:49 AM, Breno Leitao wrote:
> > Hello,
> > 
> > Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
> > machine. Justing SSHing into the machine causes this issue.
> > 
> > 	[23.138124] usercopy: kernel memory overwrite attempt detected to d000000003d80030 (mm_struct) (560 bytes)
> > 	[23.138195] ------------[ cut here ]------------
> > 	[23.138229] kernel BUG at mm/usercopy.c:72!
> > 	[23.138252] Oops: Exception in kernel mode, sig: 5 [#3]
> > 	[23.138280] SMP NR_CPUS=2048 
> > 	[23.138280] NUMA 
> > 	[23.138302] pSeries
> > 	[23.138330] Modules linked in:
> > 	[23.138354] CPU: 4 PID: 2215 Comm: sshd Tainted: G      D         4.12.0-rc1+ #9
> > 	[23.138395] task: c0000001e272dc00 task.stack: c0000001e27b0000
> > 	[23.138430] NIP: c000000000342358 LR: c000000000342354 CTR: c0000000006eb060
> > 	[23.138472] REGS: c0000001e27b3a00 TRAP: 0700   Tainted: G      D          (4.12.0-rc1+)
> > 	[23.138513] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
> > 	[23.138517]   CR: 28004222  XER: 20000000
> > 	[23.138565] CFAR: c000000000b34500 SOFTE: 1 
> > 	[23.138565] GPR00: c000000000342354 c0000001e27b3c80 c00000000142a000 000000000000005e 
> > 	[23.138565] GPR04: c0000001ffe0ade8 c0000001ffe21bf8 2920283536302062 79746573290d0a74 
> > 	[23.138565] GPR08: 0000000000000007 c000000000f61864 00000001feeb0000 3064206f74206465 
> > 	[23.138565] GPR12: 0000000000004400 c00000000fb42600 0000000000000015 00000000545bdc40 
> > 	[23.138565] GPR16: 00000000545c49c8 000001000b4b8890 00007ffff78c26f0 00000000545cf000 
> > 	[23.138565] GPR20: 00000000546109c8 000000000000c7e8 0000000054610010 00007ffff78c22e8 
> > 	[23.138565] GPR24: 00000000545c8c40 c0000000ff6bcef0 c0000000001e5220 0000000000000230 
> > 	[23.138565] GPR28: d000000003d80260 0000000000000000 0000000000000230 d000000003d80030 
> > 	[23.138920] NIP [c000000000342358] __check_object_size+0x88/0x2d0
> > 	[23.138956] LR [c000000000342354] __check_object_size+0x84/0x2d0
> > 	[23.138990] Call Trace:
> > 	[23.139006] [c0000001e27b3c80] [c000000000342354] __check_object_size+0x84/0x2d0 (unreliable)
> > 	[23.139056] [c0000001e27b3d00] [c0000000009f5ba8] bpf_prog_create_from_user+0xa8/0x1a0
> > 	[23.139099] [c0000001e27b3d60] [c0000000001e5d30] do_seccomp+0x120/0x720
> > 	[23.139136] [c0000001e27b3dd0] [c0000000000fd53c] SyS_prctl+0x2ac/0x6b0
> > 	[23.139172] [c0000001e27b3e30] [c00000000000af84] system_call+0x38/0xe0
> > 	[23.139218] Instruction dump:
> > 	[23.139240] 60000000 60420000 3c82ff94 3ca2ff9d 38841788 38a5e868 3c62ff95 7fc8f378 
> > 	[23.139283] 7fe6fb78 386310c0 487f2169 60000000 <0fe00000> 60420000 2ba30010 409d018c 
> > 	[23.139328] ---[ end trace 1a1dc952a4b7c4af ]---
> > 	
> > I found that kernel 4.11 does not have this issue. I also found that, if
> > I revert 517e1fbeb65f5eade8d14f46ac365db6c75aea9b, I do not see the
> > problem.
> 
> commit 517e1fbeb65f5eade8d14f46ac365db6c75aea9b
> Author: Laura Abbott <labbott@redhat.com>
> Date:   Tue Apr 4 14:09:00 2017 -0700
> 
>     mm/usercopy: Drop extra is_vmalloc_or_module() check
>     
>     Previously virt_addr_valid() was insufficient to validate if virt_to_page()
>     could be called on an address on arm64. This has since been fixed up so
>     there is no need for the extra check. Drop it.
>     
>     Signed-off-by: Laura Abbott <labbott@redhat.com>
>     Acked-by: Mark Rutland <mark.rutland@arm.com>
>     Signed-off-by: Kees Cook <keescook@chromium.org>
> 
> diff --git a/mm/usercopy.c b/mm/usercopy.c
> index 1eba99b..a9852b2 100644
> --- a/mm/usercopy.c
> +++ b/mm/usercopy.c
> @@ -200,17 +200,6 @@ static inline const char *check_heap_object(const void *ptr, unsigned long n,
>  {
>  	struct page *page;
>  
> -	/*
> -	 * Some architectures (arm64) return true for virt_addr_valid() on
> -	 * vmalloced addresses. Work around this by checking for vmalloc
> -	 * first.
> -	 *
> -	 * We also need to check for module addresses explicitly since we
> -	 * may copy static data from modules to userspace
> -	 */
> -	if (is_vmalloc_or_module_addr(ptr))
> -		return NULL;
> -
>  	if (!virt_addr_valid(ptr))
>  		return NULL;
>  
> 
> 
> On POWER8 (CONFIG_PPC64),
> 
> #define virt_addr_valid(kaddr)	pfn_valid(virt_to_pfn(kaddr))
> #define virt_to_pfn(kaddr)	(__pa(kaddr) >> PAGE_SHIFT)
> #define __pa(x) ((unsigned long)(x) & 0x0fffffffffffffffUL)
> 
> Hence some vmalloc (0xd range) addresses can still pass the virt_addr_valid()
> test, hence the removed exclusive check for vmalloc and module addresses in
> the commit is still required for powerpc. If that is the case, we should
> revert the commit.
>

I guess we should evaluate the meaning of virt_addr_valid() and what
it should return for the 0xd.. and 0xf.. ranges, for example?

Balbir Singh. 


* Re: kernel BUG at mm/usercopy.c:72!
  2017-05-16  4:44   ` Balbir Singh
@ 2017-05-16  5:04     ` Anshuman Khandual
  0 siblings, 0 replies; 14+ messages in thread
From: Anshuman Khandual @ 2017-05-16  5:04 UTC (permalink / raw)
  To: Balbir Singh, Breno Leitao, linuxppc-dev; +Cc: gromero

On 05/16/2017 10:14 AM, Balbir Singh wrote:
> On Tue, 2017-05-16 at 09:30 +0530, Anshuman Khandual wrote:
>> On 05/16/2017 12:49 AM, Breno Leitao wrote:
>>> Hello,
>>>
>>> Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
>>> machine. Justing SSHing into the machine causes this issue.
>>>
>>> 	[23.138124] usercopy: kernel memory overwrite attempt detected to d000000003d80030 (mm_struct) (560 bytes)
>>> 	[23.138195] ------------[ cut here ]------------
>>> 	[23.138229] kernel BUG at mm/usercopy.c:72!
>>> 	[23.138252] Oops: Exception in kernel mode, sig: 5 [#3]
>>> 	[23.138280] SMP NR_CPUS=2048 
>>> 	[23.138280] NUMA 
>>> 	[23.138302] pSeries
>>> 	[23.138330] Modules linked in:
>>> 	[23.138354] CPU: 4 PID: 2215 Comm: sshd Tainted: G      D         4.12.0-rc1+ #9
>>> 	[23.138395] task: c0000001e272dc00 task.stack: c0000001e27b0000
>>> 	[23.138430] NIP: c000000000342358 LR: c000000000342354 CTR: c0000000006eb060
>>> 	[23.138472] REGS: c0000001e27b3a00 TRAP: 0700   Tainted: G      D          (4.12.0-rc1+)
>>> 	[23.138513] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
>>> 	[23.138517]   CR: 28004222  XER: 20000000
>>> 	[23.138565] CFAR: c000000000b34500 SOFTE: 1 
>>> 	[23.138565] GPR00: c000000000342354 c0000001e27b3c80 c00000000142a000 000000000000005e 
>>> 	[23.138565] GPR04: c0000001ffe0ade8 c0000001ffe21bf8 2920283536302062 79746573290d0a74 
>>> 	[23.138565] GPR08: 0000000000000007 c000000000f61864 00000001feeb0000 3064206f74206465 
>>> 	[23.138565] GPR12: 0000000000004400 c00000000fb42600 0000000000000015 00000000545bdc40 
>>> 	[23.138565] GPR16: 00000000545c49c8 000001000b4b8890 00007ffff78c26f0 00000000545cf000 
>>> 	[23.138565] GPR20: 00000000546109c8 000000000000c7e8 0000000054610010 00007ffff78c22e8 
>>> 	[23.138565] GPR24: 00000000545c8c40 c0000000ff6bcef0 c0000000001e5220 0000000000000230 
>>> 	[23.138565] GPR28: d000000003d80260 0000000000000000 0000000000000230 d000000003d80030 
>>> 	[23.138920] NIP [c000000000342358] __check_object_size+0x88/0x2d0
>>> 	[23.138956] LR [c000000000342354] __check_object_size+0x84/0x2d0
>>> 	[23.138990] Call Trace:
>>> 	[23.139006] [c0000001e27b3c80] [c000000000342354] __check_object_size+0x84/0x2d0 (unreliable)
>>> 	[23.139056] [c0000001e27b3d00] [c0000000009f5ba8] bpf_prog_create_from_user+0xa8/0x1a0
>>> 	[23.139099] [c0000001e27b3d60] [c0000000001e5d30] do_seccomp+0x120/0x720
>>> 	[23.139136] [c0000001e27b3dd0] [c0000000000fd53c] SyS_prctl+0x2ac/0x6b0
>>> 	[23.139172] [c0000001e27b3e30] [c00000000000af84] system_call+0x38/0xe0
>>> 	[23.139218] Instruction dump:
>>> 	[23.139240] 60000000 60420000 3c82ff94 3ca2ff9d 38841788 38a5e868 3c62ff95 7fc8f378 
>>> 	[23.139283] 7fe6fb78 386310c0 487f2169 60000000 <0fe00000> 60420000 2ba30010 409d018c 
>>> 	[23.139328] ---[ end trace 1a1dc952a4b7c4af ]---
>>> 	
>>> I found that kernel 4.11 does not have this issue. I also found that, if
>>> I revert 517e1fbeb65f5eade8d14f46ac365db6c75aea9b, I do not see the
>>> problem.
>>
>> commit 517e1fbeb65f5eade8d14f46ac365db6c75aea9b
>> Author: Laura Abbott <labbott@redhat.com>
>> Date:   Tue Apr 4 14:09:00 2017 -0700
>>
>>     mm/usercopy: Drop extra is_vmalloc_or_module() check
>>     
>>     Previously virt_addr_valid() was insufficient to validate if virt_to_page()
>>     could be called on an address on arm64. This has since been fixed up so
>>     there is no need for the extra check. Drop it.
>>     
>>     Signed-off-by: Laura Abbott <labbott@redhat.com>
>>     Acked-by: Mark Rutland <mark.rutland@arm.com>
>>     Signed-off-by: Kees Cook <keescook@chromium.org>
>>
>> diff --git a/mm/usercopy.c b/mm/usercopy.c
>> index 1eba99b..a9852b2 100644
>> --- a/mm/usercopy.c
>> +++ b/mm/usercopy.c
>> @@ -200,17 +200,6 @@ static inline const char *check_heap_object(const void *ptr, unsigned long n,
>>  {
>>  	struct page *page;
>>  
>> -	/*
>> -	 * Some architectures (arm64) return true for virt_addr_valid() on
>> -	 * vmalloced addresses. Work around this by checking for vmalloc
>> -	 * first.
>> -	 *
>> -	 * We also need to check for module addresses explicitly since we
>> -	 * may copy static data from modules to userspace
>> -	 */
>> -	if (is_vmalloc_or_module_addr(ptr))
>> -		return NULL;
>> -
>>  	if (!virt_addr_valid(ptr))
>>  		return NULL;
>>  
>>
>>
>> On POWER8 (CONFIG_PPC64),
>>
>> #define virt_addr_valid(kaddr)	pfn_valid(virt_to_pfn(kaddr))
>> #define virt_to_pfn(kaddr)	(__pa(kaddr) >> PAGE_SHIFT)
>> #define __pa(x) ((unsigned long)(x) & 0x0fffffffffffffffUL)
>>
>> Hence some vmalloc (0xd range) addresses can still pass the virt_addr_valid()
>> test, hence the removed exclusive check for vmalloc and module addresses in
>> the commit is still required for powerpc. If that is the case, we should
>> revert the commit.
>>
> 
> I guess it we should evaluate the meaning of virt_addr_valid() and what
> it should return for 0xd.. and 0xf.. ranges for example?

Hmm, I get your point. But 0xd and 0xf addresses *are* virtual addresses,
so I wonder how we can return anything else for them. Hence the extra
check above is required for vmalloc addresses, if that's not something
we want.


* Re: kernel BUG at mm/usercopy.c:72!
  2017-05-15 19:19 kernel BUG at mm/usercopy.c:72! Breno Leitao
  2017-05-16  4:00 ` Anshuman Khandual
@ 2017-05-16 11:02 ` Michael Ellerman
  2017-05-16 16:15   ` Breno Leitao
  2017-05-16 11:09 ` Michael Ellerman
  2017-05-18 10:17 ` Michael Ellerman
  3 siblings, 1 reply; 14+ messages in thread
From: Michael Ellerman @ 2017-05-16 11:02 UTC (permalink / raw)
  To: Breno Leitao, linuxppc-dev; +Cc: gromero

Breno Leitao <leitao@debian.org> writes:

> Hello,
>
> Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
> machine. Justing SSHing into the machine causes this issue.
>
> 	[23.138124] usercopy: kernel memory overwrite attempt detected to d000000003d80030 (mm_struct) (560 bytes)
> 	[23.138195] ------------[ cut here ]------------
> 	[23.138229] kernel BUG at mm/usercopy.c:72!
> 	[23.138252] Oops: Exception in kernel mode, sig: 5 [#3]
> 	[23.138280] SMP NR_CPUS=2048 
> 	[23.138280] NUMA 
> 	[23.138302] pSeries
> 	[23.138330] Modules linked in:
> 	[23.138354] CPU: 4 PID: 2215 Comm: sshd Tainted: G      D         4.12.0-rc1+ #9
> 	[23.138395] task: c0000001e272dc00 task.stack: c0000001e27b0000
> 	[23.138430] NIP: c000000000342358 LR: c000000000342354 CTR: c0000000006eb060
> 	[23.138472] REGS: c0000001e27b3a00 TRAP: 0700   Tainted: G      D          (4.12.0-rc1+)
> 	[23.138513] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
> 	[23.138517]   CR: 28004222  XER: 20000000
> 	[23.138565] CFAR: c000000000b34500 SOFTE: 1 
> 	[23.138565] GPR00: c000000000342354 c0000001e27b3c80 c00000000142a000 000000000000005e 
> 	[23.138565] GPR04: c0000001ffe0ade8 c0000001ffe21bf8 2920283536302062 79746573290d0a74 
> 	[23.138565] GPR08: 0000000000000007 c000000000f61864 00000001feeb0000 3064206f74206465 
> 	[23.138565] GPR12: 0000000000004400 c00000000fb42600 0000000000000015 00000000545bdc40 
> 	[23.138565] GPR16: 00000000545c49c8 000001000b4b8890 00007ffff78c26f0 00000000545cf000 
> 	[23.138565] GPR20: 00000000546109c8 000000000000c7e8 0000000054610010 00007ffff78c22e8 
> 	[23.138565] GPR24: 00000000545c8c40 c0000000ff6bcef0 c0000000001e5220 0000000000000230 
> 	[23.138565] GPR28: d000000003d80260 0000000000000000 0000000000000230 d000000003d80030 
> 	[23.138920] NIP [c000000000342358] __check_object_size+0x88/0x2d0
> 	[23.138956] LR [c000000000342354] __check_object_size+0x84/0x2d0
> 	[23.138990] Call Trace:
> 	[23.139006] [c0000001e27b3c80] [c000000000342354] __check_object_size+0x84/0x2d0 (unreliable)
> 	[23.139056] [c0000001e27b3d00] [c0000000009f5ba8] bpf_prog_create_from_user+0xa8/0x1a0
> 	[23.139099] [c0000001e27b3d60] [c0000000001e5d30] do_seccomp+0x120/0x720
> 	[23.139136] [c0000001e27b3dd0] [c0000000000fd53c] SyS_prctl+0x2ac/0x6b0
> 	[23.139172] [c0000001e27b3e30] [c00000000000af84] system_call+0x38/0xe0
> 	[23.139218] Instruction dump:
> 	[23.139240] 60000000 60420000 3c82ff94 3ca2ff9d 38841788 38a5e868 3c62ff95 7fc8f378 
> 	[23.139283] 7fe6fb78 386310c0 487f2169 60000000 <0fe00000> 60420000 2ba30010 409d018c 
> 	[23.139328] ---[ end trace 1a1dc952a4b7c4af ]---

Do you have any idea what is calling seccomp() and triggering the bug?

I run the BPF and seccomp test suites, and I haven't seen this.

cheers


* Re: kernel BUG at mm/usercopy.c:72!
  2017-05-15 19:19 kernel BUG at mm/usercopy.c:72! Breno Leitao
  2017-05-16  4:00 ` Anshuman Khandual
  2017-05-16 11:02 ` Michael Ellerman
@ 2017-05-16 11:09 ` Michael Ellerman
  2017-05-16 14:32   ` Kees Cook
  2017-05-18 10:17 ` Michael Ellerman
  3 siblings, 1 reply; 14+ messages in thread
From: Michael Ellerman @ 2017-05-16 11:09 UTC (permalink / raw)
  To: Breno Leitao, linuxppc-dev, Kees Cook, Laura Abbott
  Cc: gromero, Anshuman Khandual, Balbir Singh

[Cc'ing the relevant folks]

Breno Leitao <leitao@debian.org> writes:
> Hello,
>
> Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
> machine. Justing SSHing into the machine causes this issue.
>
> 	[23.138124] usercopy: kernel memory overwrite attempt detected to d000000003d80030 (mm_struct) (560 bytes)
> 	[23.138195] ------------[ cut here ]------------
> 	[23.138229] kernel BUG at mm/usercopy.c:72!
> 	[23.138252] Oops: Exception in kernel mode, sig: 5 [#3]
> 	[23.138280] SMP NR_CPUS=2048 
> 	[23.138280] NUMA 
> 	[23.138302] pSeries
> 	[23.138330] Modules linked in:
> 	[23.138354] CPU: 4 PID: 2215 Comm: sshd Tainted: G      D         4.12.0-rc1+ #9
> 	[23.138395] task: c0000001e272dc00 task.stack: c0000001e27b0000
> 	[23.138430] NIP: c000000000342358 LR: c000000000342354 CTR: c0000000006eb060
> 	[23.138472] REGS: c0000001e27b3a00 TRAP: 0700   Tainted: G      D          (4.12.0-rc1+)
> 	[23.138513] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
> 	[23.138517]   CR: 28004222  XER: 20000000
> 	[23.138565] CFAR: c000000000b34500 SOFTE: 1 
> 	[23.138565] GPR00: c000000000342354 c0000001e27b3c80 c00000000142a000 000000000000005e 
> 	[23.138565] GPR04: c0000001ffe0ade8 c0000001ffe21bf8 2920283536302062 79746573290d0a74 
> 	[23.138565] GPR08: 0000000000000007 c000000000f61864 00000001feeb0000 3064206f74206465 
> 	[23.138565] GPR12: 0000000000004400 c00000000fb42600 0000000000000015 00000000545bdc40 
> 	[23.138565] GPR16: 00000000545c49c8 000001000b4b8890 00007ffff78c26f0 00000000545cf000 
> 	[23.138565] GPR20: 00000000546109c8 000000000000c7e8 0000000054610010 00007ffff78c22e8 
> 	[23.138565] GPR24: 00000000545c8c40 c0000000ff6bcef0 c0000000001e5220 0000000000000230 
> 	[23.138565] GPR28: d000000003d80260 0000000000000000 0000000000000230 d000000003d80030 
> 	[23.138920] NIP [c000000000342358] __check_object_size+0x88/0x2d0
> 	[23.138956] LR [c000000000342354] __check_object_size+0x84/0x2d0
> 	[23.138990] Call Trace:
> 	[23.139006] [c0000001e27b3c80] [c000000000342354] __check_object_size+0x84/0x2d0 (unreliable)
> 	[23.139056] [c0000001e27b3d00] [c0000000009f5ba8] bpf_prog_create_from_user+0xa8/0x1a0
> 	[23.139099] [c0000001e27b3d60] [c0000000001e5d30] do_seccomp+0x120/0x720
> 	[23.139136] [c0000001e27b3dd0] [c0000000000fd53c] SyS_prctl+0x2ac/0x6b0
> 	[23.139172] [c0000001e27b3e30] [c00000000000af84] system_call+0x38/0xe0
> 	[23.139218] Instruction dump:
> 	[23.139240] 60000000 60420000 3c82ff94 3ca2ff9d 38841788 38a5e868 3c62ff95 7fc8f378 
> 	[23.139283] 7fe6fb78 386310c0 487f2169 60000000 <0fe00000> 60420000 2ba30010 409d018c 
> 	[23.139328] ---[ end trace 1a1dc952a4b7c4af ]---
> 	
> I found that kernel 4.11 does not have this issue. I also found that, if
> I revert 517e1fbeb65f5eade8d14f46ac365db6c75aea9b, I do not see the
> problem.
>
> On the other side, if I cherry-pick commit
> 517e1fbeb65f5eade8d14f46ac365db6c75aea9b into 4.11, I start seeing the
> same issue also on 4.11.

Yeah it looks like powerpc also suffers from the same bug that arm64
used to, ie. virt_addr_valid() will return true for some vmalloc
addresses.

virt_addr_valid() is used pretty widely, I'm not sure if we can just fix
it without other fallout. I'll dig a bit more tomorrow if no one beats
me to it.

Kees, depending on how that turns out we may ask you to revert
517e1fbeb65f ("mm/usercopy: Drop extra is_vmalloc_or_module() check").

cheers


* Re: kernel BUG at mm/usercopy.c:72!
  2017-05-16 11:09 ` Michael Ellerman
@ 2017-05-16 14:32   ` Kees Cook
  2017-05-16 14:35     ` Laura Abbott
                       ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Kees Cook @ 2017-05-16 14:32 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Breno Leitao, linuxppc-dev, Laura Abbott, gromero,
	Anshuman Khandual, Balbir Singh

On Tue, May 16, 2017 at 4:09 AM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> [Cc'ing the relevant folks]
>
> Breno Leitao <leitao@debian.org> writes:
>> Hello,
>>
>> Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
>> machine. Justing SSHing into the machine causes this issue.
>>
>>       [23.138124] usercopy: kernel memory overwrite attempt detected to d000000003d80030 (mm_struct) (560 bytes)
>>       [23.138195] ------------[ cut here ]------------
>>       [23.138229] kernel BUG at mm/usercopy.c:72!
>>       [23.138252] Oops: Exception in kernel mode, sig: 5 [#3]
>>       [23.138280] SMP NR_CPUS=2048
>>       [23.138280] NUMA
>>       [23.138302] pSeries
>>       [23.138330] Modules linked in:
>>       [23.138354] CPU: 4 PID: 2215 Comm: sshd Tainted: G      D         4.12.0-rc1+ #9
>>       [23.138395] task: c0000001e272dc00 task.stack: c0000001e27b0000
>>       [23.138430] NIP: c000000000342358 LR: c000000000342354 CTR: c0000000006eb060
>>       [23.138472] REGS: c0000001e27b3a00 TRAP: 0700   Tainted: G      D          (4.12.0-rc1+)
>>       [23.138513] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
>>       [23.138517]   CR: 28004222  XER: 20000000
>>       [23.138565] CFAR: c000000000b34500 SOFTE: 1
>>       [23.138565] GPR00: c000000000342354 c0000001e27b3c80 c00000000142a000 000000000000005e
>>       [23.138565] GPR04: c0000001ffe0ade8 c0000001ffe21bf8 2920283536302062 79746573290d0a74
>>       [23.138565] GPR08: 0000000000000007 c000000000f61864 00000001feeb0000 3064206f74206465
>>       [23.138565] GPR12: 0000000000004400 c00000000fb42600 0000000000000015 00000000545bdc40
>>       [23.138565] GPR16: 00000000545c49c8 000001000b4b8890 00007ffff78c26f0 00000000545cf000
>>       [23.138565] GPR20: 00000000546109c8 000000000000c7e8 0000000054610010 00007ffff78c22e8
>>       [23.138565] GPR24: 00000000545c8c40 c0000000ff6bcef0 c0000000001e5220 0000000000000230
>>       [23.138565] GPR28: d000000003d80260 0000000000000000 0000000000000230 d000000003d80030
>>       [23.138920] NIP [c000000000342358] __check_object_size+0x88/0x2d0
>>       [23.138956] LR [c000000000342354] __check_object_size+0x84/0x2d0
>>       [23.138990] Call Trace:
>>       [23.139006] [c0000001e27b3c80] [c000000000342354] __check_object_size+0x84/0x2d0 (unreliable)
>>       [23.139056] [c0000001e27b3d00] [c0000000009f5ba8] bpf_prog_create_from_user+0xa8/0x1a0
>>       [23.139099] [c0000001e27b3d60] [c0000000001e5d30] do_seccomp+0x120/0x720
>>       [23.139136] [c0000001e27b3dd0] [c0000000000fd53c] SyS_prctl+0x2ac/0x6b0
>>       [23.139172] [c0000001e27b3e30] [c00000000000af84] system_call+0x38/0xe0
>>       [23.139218] Instruction dump:
>>       [23.139240] 60000000 60420000 3c82ff94 3ca2ff9d 38841788 38a5e868 3c62ff95 7fc8f378
>>       [23.139283] 7fe6fb78 386310c0 487f2169 60000000 <0fe00000> 60420000 2ba30010 409d018c
>>       [23.139328] ---[ end trace 1a1dc952a4b7c4af ]---
>>
>> I found that kernel 4.11 does not have this issue. I also found that, if
>> I revert 517e1fbeb65f5eade8d14f46ac365db6c75aea9b, I do not see the
>> problem.
>>
>> On the other side, if I cherry-pick commit
>> 517e1fbeb65f5eade8d14f46ac365db6c75aea9b into 4.11, I start seeing the
>> same issue also on 4.11.
>
> Yeah it looks like powerpc also suffers from the same bug that arm64
> used to, ie. virt_addr_valid() will return true for some vmalloc
> addresses.
>
> virt_addr_valid() is used pretty widely, I'm not sure if we can just fix
> it without other fallout. I'll dig a bit more tomorrow if no one beats
> me to it.
>
> Kees, depending on how that turns out we may ask you to revert
> 517e1fbeb65f ("mm/usercopy: Drop extra is_vmalloc_or_module() check").

That's fine by me. Let me know what you think would be best.

Laura, I don't see much harm in putting this back in place. It seems
like it's just a matter of efficiency to have it removed?

-Kees

-- 
Kees Cook
Pixel Security


* Re: kernel BUG at mm/usercopy.c:72!
  2017-05-16 14:32   ` Kees Cook
@ 2017-05-16 14:35     ` Laura Abbott
  2017-05-18  5:09       ` Michael Ellerman
  2017-05-17 10:05     ` Balbir Singh
  2017-05-18 10:16     ` Michael Ellerman
  2 siblings, 1 reply; 14+ messages in thread
From: Laura Abbott @ 2017-05-16 14:35 UTC (permalink / raw)
  To: Kees Cook, Michael Ellerman
  Cc: Breno Leitao, linuxppc-dev, gromero, Anshuman Khandual, Balbir Singh

On 05/16/2017 07:32 AM, Kees Cook wrote:
> On Tue, May 16, 2017 at 4:09 AM, Michael Ellerman <mpe@ellerman.id.au> wrote:
>> [Cc'ing the relevant folks]
>>
>> Breno Leitao <leitao@debian.org> writes:
>>> Hello,
>>>
>>> Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
>>> machine. Justing SSHing into the machine causes this issue.
>>>
>>>       [23.138124] usercopy: kernel memory overwrite attempt detected to d000000003d80030 (mm_struct) (560 bytes)
>>>       [23.138195] ------------[ cut here ]------------
>>>       [23.138229] kernel BUG at mm/usercopy.c:72!
>>>       [23.138252] Oops: Exception in kernel mode, sig: 5 [#3]
>>>       [23.138280] SMP NR_CPUS=2048
>>>       [23.138280] NUMA
>>>       [23.138302] pSeries
>>>       [23.138330] Modules linked in:
>>>       [23.138354] CPU: 4 PID: 2215 Comm: sshd Tainted: G      D         4.12.0-rc1+ #9
>>>       [23.138395] task: c0000001e272dc00 task.stack: c0000001e27b0000
>>>       [23.138430] NIP: c000000000342358 LR: c000000000342354 CTR: c0000000006eb060
>>>       [23.138472] REGS: c0000001e27b3a00 TRAP: 0700   Tainted: G      D          (4.12.0-rc1+)
>>>       [23.138513] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
>>>       [23.138517]   CR: 28004222  XER: 20000000
>>>       [23.138565] CFAR: c000000000b34500 SOFTE: 1
>>>       [23.138565] GPR00: c000000000342354 c0000001e27b3c80 c00000000142a000 000000000000005e
>>>       [23.138565] GPR04: c0000001ffe0ade8 c0000001ffe21bf8 2920283536302062 79746573290d0a74
>>>       [23.138565] GPR08: 0000000000000007 c000000000f61864 00000001feeb0000 3064206f74206465
>>>       [23.138565] GPR12: 0000000000004400 c00000000fb42600 0000000000000015 00000000545bdc40
>>>       [23.138565] GPR16: 00000000545c49c8 000001000b4b8890 00007ffff78c26f0 00000000545cf000
>>>       [23.138565] GPR20: 00000000546109c8 000000000000c7e8 0000000054610010 00007ffff78c22e8
>>>       [23.138565] GPR24: 00000000545c8c40 c0000000ff6bcef0 c0000000001e5220 0000000000000230
>>>       [23.138565] GPR28: d000000003d80260 0000000000000000 0000000000000230 d000000003d80030
>>>       [23.138920] NIP [c000000000342358] __check_object_size+0x88/0x2d0
>>>       [23.138956] LR [c000000000342354] __check_object_size+0x84/0x2d0
>>>       [23.138990] Call Trace:
>>>       [23.139006] [c0000001e27b3c80] [c000000000342354] __check_object_size+0x84/0x2d0 (unreliable)
>>>       [23.139056] [c0000001e27b3d00] [c0000000009f5ba8] bpf_prog_create_from_user+0xa8/0x1a0
>>>       [23.139099] [c0000001e27b3d60] [c0000000001e5d30] do_seccomp+0x120/0x720
>>>       [23.139136] [c0000001e27b3dd0] [c0000000000fd53c] SyS_prctl+0x2ac/0x6b0
>>>       [23.139172] [c0000001e27b3e30] [c00000000000af84] system_call+0x38/0xe0
>>>       [23.139218] Instruction dump:
>>>       [23.139240] 60000000 60420000 3c82ff94 3ca2ff9d 38841788 38a5e868 3c62ff95 7fc8f378
>>>       [23.139283] 7fe6fb78 386310c0 487f2169 60000000 <0fe00000> 60420000 2ba30010 409d018c
>>>       [23.139328] ---[ end trace 1a1dc952a4b7c4af ]---
>>>
>>> I found that kernel 4.11 does not have this issue. I also found that, if
>>> I revert 517e1fbeb65f5eade8d14f46ac365db6c75aea9b, I do not see the
>>> problem.
>>>
>>> On the other side, if I cherry-pick commit
>>> 517e1fbeb65f5eade8d14f46ac365db6c75aea9b into 4.11, I start seeing the
>>> same issue also on 4.11.
>>
>> Yeah it looks like powerpc also suffers from the same bug that arm64
>> used to, ie. virt_addr_valid() will return true for some vmalloc
>> addresses.
>>
>> virt_addr_valid() is used pretty widely, I'm not sure if we can just fix
>> it without other fallout. I'll dig a bit more tomorrow if no one beats
>> me to it.
>>
>> Kees, depending on how that turns out we may ask you to revert
>> 517e1fbeb65f ("mm/usercopy: Drop extra is_vmalloc_or_module() check").
> 
> That's fine by me. Let me know what you think would be best.
> 
> Laura, I don't see much harm in putting this back in place. It seems
> like it's just a matter of efficiency to have it removed?
> 
> -Kees
> 

Yes, there shouldn't be any harm if we need to bring it back.
Perhaps I should submit a follow on patch to rename virt_addr_valid to
virt_addr_valid_except_where_it_isnt.

Thanks,
Laura


* Re: kernel BUG at mm/usercopy.c:72!
  2017-05-16 11:02 ` Michael Ellerman
@ 2017-05-16 16:15   ` Breno Leitao
  0 siblings, 0 replies; 14+ messages in thread
From: Breno Leitao @ 2017-05-16 16:15 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev, gromero

On Tue, May 16, 2017 at 09:02:29PM +1000, Michael Ellerman wrote:
> Breno Leitao <leitao@debian.org> writes:
> 
> > Hello,
> >
> > Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
> > machine. Just SSHing into the machine causes this issue.
> >
> > 	[23.138124] usercopy: kernel memory overwrite attempt detected to d000000003d80030 (mm_struct) (560 bytes)
> > 	[23.138195] ------------[ cut here ]------------
> > 	[23.138229] kernel BUG at mm/usercopy.c:72!
> > 	[23.138252] Oops: Exception in kernel mode, sig: 5 [#3]
> > 	[23.138280] SMP NR_CPUS=2048 
> > 	[23.138280] NUMA 
> > 	[23.138302] pSeries
> > 	[23.138330] Modules linked in:
> > 	[23.138354] CPU: 4 PID: 2215 Comm: sshd Tainted: G      D         4.12.0-rc1+ #9
> > 	[23.138395] task: c0000001e272dc00 task.stack: c0000001e27b0000
> > 	[23.138430] NIP: c000000000342358 LR: c000000000342354 CTR: c0000000006eb060
> > 	[23.138472] REGS: c0000001e27b3a00 TRAP: 0700   Tainted: G      D          (4.12.0-rc1+)
> > 	[23.138513] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
> > 	[23.138517]   CR: 28004222  XER: 20000000
> > 	[23.138565] CFAR: c000000000b34500 SOFTE: 1 
> > 	[23.138565] GPR00: c000000000342354 c0000001e27b3c80 c00000000142a000 000000000000005e 
> > 	[23.138565] GPR04: c0000001ffe0ade8 c0000001ffe21bf8 2920283536302062 79746573290d0a74 
> > 	[23.138565] GPR08: 0000000000000007 c000000000f61864 00000001feeb0000 3064206f74206465 
> > 	[23.138565] GPR12: 0000000000004400 c00000000fb42600 0000000000000015 00000000545bdc40 
> > 	[23.138565] GPR16: 00000000545c49c8 000001000b4b8890 00007ffff78c26f0 00000000545cf000 
> > 	[23.138565] GPR20: 00000000546109c8 000000000000c7e8 0000000054610010 00007ffff78c22e8 
> > 	[23.138565] GPR24: 00000000545c8c40 c0000000ff6bcef0 c0000000001e5220 0000000000000230 
> > 	[23.138565] GPR28: d000000003d80260 0000000000000000 0000000000000230 d000000003d80030 
> > 	[23.138920] NIP [c000000000342358] __check_object_size+0x88/0x2d0
> > 	[23.138956] LR [c000000000342354] __check_object_size+0x84/0x2d0
> > 	[23.138990] Call Trace:
> > 	[23.139006] [c0000001e27b3c80] [c000000000342354] __check_object_size+0x84/0x2d0 (unreliable)
> > 	[23.139056] [c0000001e27b3d00] [c0000000009f5ba8] bpf_prog_create_from_user+0xa8/0x1a0
> > 	[23.139099] [c0000001e27b3d60] [c0000000001e5d30] do_seccomp+0x120/0x720
> > 	[23.139136] [c0000001e27b3dd0] [c0000000000fd53c] SyS_prctl+0x2ac/0x6b0
> > 	[23.139172] [c0000001e27b3e30] [c00000000000af84] system_call+0x38/0xe0
> > 	[23.139218] Instruction dump:
> > 	[23.139240] 60000000 60420000 3c82ff94 3ca2ff9d 38841788 38a5e868 3c62ff95 7fc8f378 
> > 	[23.139283] 7fe6fb78 386310c0 487f2169 60000000 <0fe00000> 60420000 2ba30010 409d018c 
> > 	[23.139328] ---[ end trace 1a1dc952a4b7c4af ]---
> 
> Do you have any idea what is calling seccomp() and triggering the bug?

This bug is hit through several paths, not only via seccomp. Here is
another path, via vfs_read(), that triggers the bug:

	[  370.154307] usercopy: kernel memory exposure attempt detected from d000000003d6007c (vm_area_struct) (6 bytes)
	[  370.154373] ------------[ cut here ]------------
	[  370.154402] kernel BUG at mm/usercopy.c:72!
	[  370.154425] Oops: Exception in kernel mode, sig: 5 [#4]
	<snip>
	[370.155220] [c0000001d30efab0] [c000000000342354] __check_object_size+0x84/0x2b0 (unreliable)
	[370.155272] [c0000001d30efb30] [c0000000006c96cc] copy_from_read_buf+0xac/0x1e0
	[370.155315] [c0000001d30efba0] [c0000000006ccbc4] n_tty_read+0x324/0x920
	[370.155351] [c0000001d30efcb0] [c0000000006c4c50] tty_read+0xc0/0x180
	[370.155387] [c0000001d30efd00] [c000000000347f64] __vfs_read+0x44/0x1a0
	[370.155424] [c0000001d30efd90] [c0000000003499ac] vfs_read+0xbc/0x1b0
	[370.155460] [c0000001d30efde0] [c00000000034b6f8] SyS_read+0x68/0x110
	[370.155497] [c0000001d30efe30] [c00000000000af84] system_call+0x38/0xe0

Anyway, I see the seccomp() path when I log into the system using SSH,
and the tty_read() path only during system boot.

> I run the BPF and seccomp test suites, and I haven't seen this.

Do you have the hardening options enabled? For example, I cannot
reproduce this problem unless CONFIG_HARDENED_USERCOPY=y is set.


* Re: kernel BUG at mm/usercopy.c:72!
  2017-05-16 14:32   ` Kees Cook
  2017-05-16 14:35     ` Laura Abbott
@ 2017-05-17 10:05     ` Balbir Singh
  2017-05-18 10:16     ` Michael Ellerman
  2 siblings, 0 replies; 14+ messages in thread
From: Balbir Singh @ 2017-05-17 10:05 UTC (permalink / raw)
  To: Kees Cook, Michael Ellerman
  Cc: Breno Leitao, linuxppc-dev, Laura Abbott, gromero, Anshuman Khandual

> > Kees, depending on how that turns out we may ask you to revert
> > 517e1fbeb65f ("mm/usercopy: Drop extra is_vmalloc_or_module() check").
> 
> That's fine by me. Let me know what you think would be best.
> 
> Laura, I don't see much harm in putting this back in place. It seems
> like it's just a matter of efficiency to have it removed?

It looks like we resolved the struct page of 0xd000000003d80030 as PageSlab?

Balbir Singh.


* Re: kernel BUG at mm/usercopy.c:72!
  2017-05-16 14:35     ` Laura Abbott
@ 2017-05-18  5:09       ` Michael Ellerman
  0 siblings, 0 replies; 14+ messages in thread
From: Michael Ellerman @ 2017-05-18  5:09 UTC (permalink / raw)
  To: Laura Abbott, Kees Cook
  Cc: Breno Leitao, linuxppc-dev, gromero, Anshuman Khandual, Balbir Singh

Laura Abbott <labbott@redhat.com> writes:

> On 05/16/2017 07:32 AM, Kees Cook wrote:
>> On Tue, May 16, 2017 at 4:09 AM, Michael Ellerman <mpe@ellerman.id.au> wrote:
>>> [Cc'ing the relevant folks]
>>>
>>> Breno Leitao <leitao@debian.org> writes:
>>>> Hello,
>>>>
>>>> Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
>>>> machine. Just SSHing into the machine causes this issue.
>>>>
>>>>       [23.138124] usercopy: kernel memory overwrite attempt detected to d000000003d80030 (mm_struct) (560 bytes)
>>>>       [23.138195] ------------[ cut here ]------------
>>>>       [23.138229] kernel BUG at mm/usercopy.c:72!
>>>>       [23.138252] Oops: Exception in kernel mode, sig: 5 [#3]
>>>>       [23.138280] SMP NR_CPUS=2048
>>>>       [23.138280] NUMA
>>>>       [23.138302] pSeries
>>>>       [23.138330] Modules linked in:
>>>>       [23.138354] CPU: 4 PID: 2215 Comm: sshd Tainted: G      D         4.12.0-rc1+ #9
>>>>       [23.138395] task: c0000001e272dc00 task.stack: c0000001e27b0000
>>>>       [23.138430] NIP: c000000000342358 LR: c000000000342354 CTR: c0000000006eb060
>>>>       [23.138472] REGS: c0000001e27b3a00 TRAP: 0700   Tainted: G      D          (4.12.0-rc1+)
>>>>       [23.138513] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
>>>>       [23.138517]   CR: 28004222  XER: 20000000
>>>>       [23.138565] CFAR: c000000000b34500 SOFTE: 1
>>>>       [23.138565] GPR00: c000000000342354 c0000001e27b3c80 c00000000142a000 000000000000005e
>>>>       [23.138565] GPR04: c0000001ffe0ade8 c0000001ffe21bf8 2920283536302062 79746573290d0a74
>>>>       [23.138565] GPR08: 0000000000000007 c000000000f61864 00000001feeb0000 3064206f74206465
>>>>       [23.138565] GPR12: 0000000000004400 c00000000fb42600 0000000000000015 00000000545bdc40
>>>>       [23.138565] GPR16: 00000000545c49c8 000001000b4b8890 00007ffff78c26f0 00000000545cf000
>>>>       [23.138565] GPR20: 00000000546109c8 000000000000c7e8 0000000054610010 00007ffff78c22e8
>>>>       [23.138565] GPR24: 00000000545c8c40 c0000000ff6bcef0 c0000000001e5220 0000000000000230
>>>>       [23.138565] GPR28: d000000003d80260 0000000000000000 0000000000000230 d000000003d80030
>>>>       [23.138920] NIP [c000000000342358] __check_object_size+0x88/0x2d0
>>>>       [23.138956] LR [c000000000342354] __check_object_size+0x84/0x2d0
>>>>       [23.138990] Call Trace:
>>>>       [23.139006] [c0000001e27b3c80] [c000000000342354] __check_object_size+0x84/0x2d0 (unreliable)
>>>>       [23.139056] [c0000001e27b3d00] [c0000000009f5ba8] bpf_prog_create_from_user+0xa8/0x1a0
>>>>       [23.139099] [c0000001e27b3d60] [c0000000001e5d30] do_seccomp+0x120/0x720
>>>>       [23.139136] [c0000001e27b3dd0] [c0000000000fd53c] SyS_prctl+0x2ac/0x6b0
>>>>       [23.139172] [c0000001e27b3e30] [c00000000000af84] system_call+0x38/0xe0
>>>>       [23.139218] Instruction dump:
>>>>       [23.139240] 60000000 60420000 3c82ff94 3ca2ff9d 38841788 38a5e868 3c62ff95 7fc8f378
>>>>       [23.139283] 7fe6fb78 386310c0 487f2169 60000000 <0fe00000> 60420000 2ba30010 409d018c
>>>>       [23.139328] ---[ end trace 1a1dc952a4b7c4af ]---
>>>>
>>>> I found that kernel 4.11 does not have this issue. I also found that, if
>>>> I revert 517e1fbeb65f5eade8d14f46ac365db6c75aea9b, I do not see the
>>>> problem.
>>>>
>>>> On the other side, if I cherry-pick commit
>>>> 517e1fbeb65f5eade8d14f46ac365db6c75aea9b into 4.11, I start seeing the
>>>> same issue also on 4.11.
>>>
>>> Yeah it looks like powerpc also suffers from the same bug that arm64
>>> used to, ie. virt_addr_valid() will return true for some vmalloc
>>> addresses.
>>>
>>> virt_addr_valid() is used pretty widely, I'm not sure if we can just fix
>>> it without other fallout. I'll dig a bit more tomorrow if no one beats
>>> me to it.
>>>
>>> Kees, depending on how that turns out we may ask you to revert
>>> 517e1fbeb65f ("mm/usercopy: Drop extra is_vmalloc_or_module() check").
>> 
>> That's fine by me. Let me know what you think would be best.
>> 
>> Laura, I don't see much harm in putting this back in place. It seems
>> like it's just a matter of efficiency to have it removed?
>
> Yes, there shouldn't be any harm if we need to bring it back.
> Perhaps I should submit a follow on patch to rename virt_addr_valid to
> virt_addr_valid_except_where_it_isnt.

I suspect there's lots of history here.

virt_addr_valid() is also hardly used: there are only a few tens of
callers, vs hundreds for virt_to_page(). That is scary as hell, given
the latter is only safe if the former returns true.

cheers


* Re: kernel BUG at mm/usercopy.c:72!
  2017-05-16 14:32   ` Kees Cook
  2017-05-16 14:35     ` Laura Abbott
  2017-05-17 10:05     ` Balbir Singh
@ 2017-05-18 10:16     ` Michael Ellerman
  2017-05-18 10:58       ` Balbir Singh
  2 siblings, 1 reply; 14+ messages in thread
From: Michael Ellerman @ 2017-05-18 10:16 UTC (permalink / raw)
  To: Kees Cook
  Cc: Breno Leitao, linuxppc-dev, Laura Abbott, gromero,
	Anshuman Khandual, Balbir Singh

Kees Cook <keescook@chromium.org> writes:
> On Tue, May 16, 2017 at 4:09 AM, Michael Ellerman <mpe@ellerman.id.au> wrote:
>> Yeah it looks like powerpc also suffers from the same bug that arm64
>> used to, ie. virt_addr_valid() will return true for some vmalloc
>> addresses.
>>
>> virt_addr_valid() is used pretty widely, I'm not sure if we can just fix
>> it without other fallout. I'll dig a bit more tomorrow if no one beats
>> me to it.
>>
>> Kees, depending on how that turns out we may ask you to revert
>> 517e1fbeb65f ("mm/usercopy: Drop extra is_vmalloc_or_module() check").
>
> That's fine by me. Let me know what you think would be best.

Oh man, what a mess.

I think we can do a small fix for this in powerpc code for 4.12, will
post it soon for Breno to test - I still can't reproduce locally.

cheers


* Re: kernel BUG at mm/usercopy.c:72!
  2017-05-15 19:19 kernel BUG at mm/usercopy.c:72! Breno Leitao
                   ` (2 preceding siblings ...)
  2017-05-16 11:09 ` Michael Ellerman
@ 2017-05-18 10:17 ` Michael Ellerman
  3 siblings, 0 replies; 14+ messages in thread
From: Michael Ellerman @ 2017-05-18 10:17 UTC (permalink / raw)
  To: Breno Leitao, linuxppc-dev; +Cc: gromero

Breno Leitao <leitao@debian.org> writes:

> Hello,
>
> Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
> machine. Just SSHing into the machine causes this issue.

Can you try this?

cheers

diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index 2a32483c7b6c..8da5d4c1cab2 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -132,7 +132,19 @@ extern long long virt_phys_offset;
 #define virt_to_pfn(kaddr)	(__pa(kaddr) >> PAGE_SHIFT)
 #define virt_to_page(kaddr)	pfn_to_page(virt_to_pfn(kaddr))
 #define pfn_to_kaddr(pfn)	__va((pfn) << PAGE_SHIFT)
+
+#ifdef CONFIG_PPC_BOOK3S_64
+/*
+ * On hash the vmalloc and other regions alias to the kernel region when passed
+ * through __pa(), which virt_to_pfn() uses. That means virt_addr_valid() can
+ * return true for some vmalloc addresses, which is incorrect. So explicitly
+ * check that the address is in the kernel region.
+ */
+#define virt_addr_valid(kaddr) (REGION_ID(kaddr) == KERNEL_REGION_ID && \
+				pfn_valid(virt_to_pfn(kaddr)))
+#else
 #define virt_addr_valid(kaddr)	pfn_valid(virt_to_pfn(kaddr))
+#endif
 
 /*
  * On Book-E parts we need __va to parse the device tree and we can't


* Re: kernel BUG at mm/usercopy.c:72!
  2017-05-18 10:16     ` Michael Ellerman
@ 2017-05-18 10:58       ` Balbir Singh
  0 siblings, 0 replies; 14+ messages in thread
From: Balbir Singh @ 2017-05-18 10:58 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Kees Cook, Breno Leitao, linuxppc-dev, Laura Abbott, gromero,
	Anshuman Khandual

On Thu, May 18, 2017 at 8:16 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> Kees Cook <keescook@chromium.org> writes:
>> On Tue, May 16, 2017 at 4:09 AM, Michael Ellerman <mpe@ellerman.id.au> wrote:
>>> Yeah it looks like powerpc also suffers from the same bug that arm64
>>> used to, ie. virt_addr_valid() will return true for some vmalloc
>>> addresses.
>>>
>>> virt_addr_valid() is used pretty widely, I'm not sure if we can just fix
>>> it without other fallout. I'll dig a bit more tomorrow if no one beats
>>> me to it.
>>>
>>> Kees, depending on how that turns out we may ask you to revert
>>> 517e1fbeb65f ("mm/usercopy: Drop extra is_vmalloc_or_module() check").
>>
>> That's fine by me. Let me know what you think would be best.
>
> Oh man, what a mess.
>
> I think we can do a small fix for this in powerpc code for 4.12, will
> post it soon for Breno to test - I still can't reproduce locally.

To reproduce locally you'd need a 0xd000..<addr> that aliases to a
0xc000..<addr> (due to the assumptions in __pa()) and for which
virt_to_page(addr) has PageSlab(page) set, I guess. With very few
modules, a whole lot of 0xd000... space is unused, but if we had a
bunch of modules and ended up with a 0xd000..<addr> range aliased as
PageSlab you would probably run into it more easily.

Balbir Singh


