* Re: kernel BUG at mm/usercopy.c:72!
2017-05-15 19:19 kernel BUG at mm/usercopy.c:72! Breno Leitao
@ 2017-05-16 4:00 ` Anshuman Khandual
2017-05-16 4:44 ` Balbir Singh
2017-05-16 11:02 ` Michael Ellerman
` (2 subsequent siblings)
3 siblings, 1 reply; 14+ messages in thread
From: Anshuman Khandual @ 2017-05-16 4:00 UTC (permalink / raw)
To: Breno Leitao, linuxppc-dev; +Cc: gromero
On 05/16/2017 12:49 AM, Breno Leitao wrote:
> Hello,
>
> Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
> machine. Just SSHing into the machine causes this issue.
>
> [23.138124] usercopy: kernel memory overwrite attempt detected to d000000003d80030 (mm_struct) (560 bytes)
> [23.138195] ------------[ cut here ]------------
> [23.138229] kernel BUG at mm/usercopy.c:72!
> [23.138252] Oops: Exception in kernel mode, sig: 5 [#3]
> [23.138280] SMP NR_CPUS=2048
> [23.138280] NUMA
> [23.138302] pSeries
> [23.138330] Modules linked in:
> [23.138354] CPU: 4 PID: 2215 Comm: sshd Tainted: G D 4.12.0-rc1+ #9
> [23.138395] task: c0000001e272dc00 task.stack: c0000001e27b0000
> [23.138430] NIP: c000000000342358 LR: c000000000342354 CTR: c0000000006eb060
> [23.138472] REGS: c0000001e27b3a00 TRAP: 0700 Tainted: G D (4.12.0-rc1+)
> [23.138513] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
> [23.138517] CR: 28004222 XER: 20000000
> [23.138565] CFAR: c000000000b34500 SOFTE: 1
> [23.138565] GPR00: c000000000342354 c0000001e27b3c80 c00000000142a000 000000000000005e
> [23.138565] GPR04: c0000001ffe0ade8 c0000001ffe21bf8 2920283536302062 79746573290d0a74
> [23.138565] GPR08: 0000000000000007 c000000000f61864 00000001feeb0000 3064206f74206465
> [23.138565] GPR12: 0000000000004400 c00000000fb42600 0000000000000015 00000000545bdc40
> [23.138565] GPR16: 00000000545c49c8 000001000b4b8890 00007ffff78c26f0 00000000545cf000
> [23.138565] GPR20: 00000000546109c8 000000000000c7e8 0000000054610010 00007ffff78c22e8
> [23.138565] GPR24: 00000000545c8c40 c0000000ff6bcef0 c0000000001e5220 0000000000000230
> [23.138565] GPR28: d000000003d80260 0000000000000000 0000000000000230 d000000003d80030
> [23.138920] NIP [c000000000342358] __check_object_size+0x88/0x2d0
> [23.138956] LR [c000000000342354] __check_object_size+0x84/0x2d0
> [23.138990] Call Trace:
> [23.139006] [c0000001e27b3c80] [c000000000342354] __check_object_size+0x84/0x2d0 (unreliable)
> [23.139056] [c0000001e27b3d00] [c0000000009f5ba8] bpf_prog_create_from_user+0xa8/0x1a0
> [23.139099] [c0000001e27b3d60] [c0000000001e5d30] do_seccomp+0x120/0x720
> [23.139136] [c0000001e27b3dd0] [c0000000000fd53c] SyS_prctl+0x2ac/0x6b0
> [23.139172] [c0000001e27b3e30] [c00000000000af84] system_call+0x38/0xe0
> [23.139218] Instruction dump:
> [23.139240] 60000000 60420000 3c82ff94 3ca2ff9d 38841788 38a5e868 3c62ff95 7fc8f378
> [23.139283] 7fe6fb78 386310c0 487f2169 60000000 <0fe00000> 60420000 2ba30010 409d018c
> [23.139328] ---[ end trace 1a1dc952a4b7c4af ]---
>
> I found that kernel 4.11 does not have this issue. I also found that, if
> I revert 517e1fbeb65f5eade8d14f46ac365db6c75aea9b, I do not see the
> problem.
commit 517e1fbeb65f5eade8d14f46ac365db6c75aea9b
Author: Laura Abbott <labbott@redhat.com>
Date: Tue Apr 4 14:09:00 2017 -0700

    mm/usercopy: Drop extra is_vmalloc_or_module() check

    Previously virt_addr_valid() was insufficient to validate if virt_to_page()
    could be called on an address on arm64. This has since been fixed up so
    there is no need for the extra check. Drop it.

    Signed-off-by: Laura Abbott <labbott@redhat.com>
    Acked-by: Mark Rutland <mark.rutland@arm.com>
    Signed-off-by: Kees Cook <keescook@chromium.org>

diff --git a/mm/usercopy.c b/mm/usercopy.c
index 1eba99b..a9852b2 100644
--- a/mm/usercopy.c
+++ b/mm/usercopy.c
@@ -200,17 +200,6 @@ static inline const char *check_heap_object(const void *ptr, unsigned long n,
 {
 	struct page *page;

-	/*
-	 * Some architectures (arm64) return true for virt_addr_valid() on
-	 * vmalloced addresses. Work around this by checking for vmalloc
-	 * first.
-	 *
-	 * We also need to check for module addresses explicitly since we
-	 * may copy static data from modules to userspace
-	 */
-	if (is_vmalloc_or_module_addr(ptr))
-		return NULL;
-
 	if (!virt_addr_valid(ptr))
 		return NULL;

On POWER8 (CONFIG_PPC64),

#define virt_addr_valid(kaddr) pfn_valid(virt_to_pfn(kaddr))
#define virt_to_pfn(kaddr) (__pa(kaddr) >> PAGE_SHIFT)
#define __pa(x) ((unsigned long)(x) & 0x0fffffffffffffffUL)

Since __pa() only masks off the top nibble, some vmalloc (0xd range)
addresses can still pass the virt_addr_valid() test. The explicit check for
vmalloc and module addresses that the commit removed is therefore still
required on powerpc. If that is the case, we should revert the commit.
- Anshuman
* Re: kernel BUG at mm/usercopy.c:72!
2017-05-16 4:00 ` Anshuman Khandual
@ 2017-05-16 4:44 ` Balbir Singh
2017-05-16 5:04 ` Anshuman Khandual
0 siblings, 1 reply; 14+ messages in thread
From: Balbir Singh @ 2017-05-16 4:44 UTC (permalink / raw)
To: Anshuman Khandual, Breno Leitao, linuxppc-dev; +Cc: gromero
On Tue, 2017-05-16 at 09:30 +0530, Anshuman Khandual wrote:
> On 05/16/2017 12:49 AM, Breno Leitao wrote:
> > Hello,
> >
> > Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
> > machine. Just SSHing into the machine causes this issue.
> >
> > [23.138124] usercopy: kernel memory overwrite attempt detected to d000000003d80030 (mm_struct) (560 bytes)
> > [23.138195] ------------[ cut here ]------------
> > [23.138229] kernel BUG at mm/usercopy.c:72!
> > [23.138252] Oops: Exception in kernel mode, sig: 5 [#3]
> > [23.138280] SMP NR_CPUS=2048
> > [23.138280] NUMA
> > [23.138302] pSeries
> > [23.138330] Modules linked in:
> > [23.138354] CPU: 4 PID: 2215 Comm: sshd Tainted: G D 4.12.0-rc1+ #9
> > [23.138395] task: c0000001e272dc00 task.stack: c0000001e27b0000
> > [23.138430] NIP: c000000000342358 LR: c000000000342354 CTR: c0000000006eb060
> > [23.138472] REGS: c0000001e27b3a00 TRAP: 0700 Tainted: G D (4.12.0-rc1+)
> > [23.138513] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
> > [23.138517] CR: 28004222 XER: 20000000
> > [23.138565] CFAR: c000000000b34500 SOFTE: 1
> > [23.138565] GPR00: c000000000342354 c0000001e27b3c80 c00000000142a000 000000000000005e
> > [23.138565] GPR04: c0000001ffe0ade8 c0000001ffe21bf8 2920283536302062 79746573290d0a74
> > [23.138565] GPR08: 0000000000000007 c000000000f61864 00000001feeb0000 3064206f74206465
> > [23.138565] GPR12: 0000000000004400 c00000000fb42600 0000000000000015 00000000545bdc40
> > [23.138565] GPR16: 00000000545c49c8 000001000b4b8890 00007ffff78c26f0 00000000545cf000
> > [23.138565] GPR20: 00000000546109c8 000000000000c7e8 0000000054610010 00007ffff78c22e8
> > [23.138565] GPR24: 00000000545c8c40 c0000000ff6bcef0 c0000000001e5220 0000000000000230
> > [23.138565] GPR28: d000000003d80260 0000000000000000 0000000000000230 d000000003d80030
> > [23.138920] NIP [c000000000342358] __check_object_size+0x88/0x2d0
> > [23.138956] LR [c000000000342354] __check_object_size+0x84/0x2d0
> > [23.138990] Call Trace:
> > [23.139006] [c0000001e27b3c80] [c000000000342354] __check_object_size+0x84/0x2d0 (unreliable)
> > [23.139056] [c0000001e27b3d00] [c0000000009f5ba8] bpf_prog_create_from_user+0xa8/0x1a0
> > [23.139099] [c0000001e27b3d60] [c0000000001e5d30] do_seccomp+0x120/0x720
> > [23.139136] [c0000001e27b3dd0] [c0000000000fd53c] SyS_prctl+0x2ac/0x6b0
> > [23.139172] [c0000001e27b3e30] [c00000000000af84] system_call+0x38/0xe0
> > [23.139218] Instruction dump:
> > [23.139240] 60000000 60420000 3c82ff94 3ca2ff9d 38841788 38a5e868 3c62ff95 7fc8f378
> > [23.139283] 7fe6fb78 386310c0 487f2169 60000000 <0fe00000> 60420000 2ba30010 409d018c
> > [23.139328] ---[ end trace 1a1dc952a4b7c4af ]---
> >
> > I found that kernel 4.11 does not have this issue. I also found that, if
> > I revert 517e1fbeb65f5eade8d14f46ac365db6c75aea9b, I do not see the
> > problem.
>
> commit 517e1fbeb65f5eade8d14f46ac365db6c75aea9b
> Author: Laura Abbott <labbott@redhat.com>
> Date: Tue Apr 4 14:09:00 2017 -0700
>
> mm/usercopy: Drop extra is_vmalloc_or_module() check
>
> Previously virt_addr_valid() was insufficient to validate if virt_to_page()
> could be called on an address on arm64. This has since been fixed up so
> there is no need for the extra check. Drop it.
>
> Signed-off-by: Laura Abbott <labbott@redhat.com>
> Acked-by: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Kees Cook <keescook@chromium.org>
>
> diff --git a/mm/usercopy.c b/mm/usercopy.c
> index 1eba99b..a9852b2 100644
> --- a/mm/usercopy.c
> +++ b/mm/usercopy.c
> @@ -200,17 +200,6 @@ static inline const char *check_heap_object(const void *ptr, unsigned long n,
> {
> struct page *page;
>
> - /*
> - * Some architectures (arm64) return true for virt_addr_valid() on
> - * vmalloced addresses. Work around this by checking for vmalloc
> - * first.
> - *
> - * We also need to check for module addresses explicitly since we
> - * may copy static data from modules to userspace
> - */
> - if (is_vmalloc_or_module_addr(ptr))
> - return NULL;
> -
> if (!virt_addr_valid(ptr))
> return NULL;
>
>
>
> On POWER8 (CONFIG_PPC64),
>
> #define virt_addr_valid(kaddr) pfn_valid(virt_to_pfn(kaddr))
> #define virt_to_pfn(kaddr) (__pa(kaddr) >> PAGE_SHIFT)
> #define __pa(x) ((unsigned long)(x) & 0x0fffffffffffffffUL)
>
> Hence some vmalloc (0xd range) addresses can still pass the virt_addr_valid()
> test, hence the removed exclusive check for vmalloc and module addresses in
> the commit is still required for powerpc. If that is the case, we should
> revert the commit.
>
I guess we should evaluate the meaning of virt_addr_valid() and what
it should return for the 0xd.. and 0xf.. ranges, for example?
Balbir Singh.
* Re: kernel BUG at mm/usercopy.c:72!
2017-05-16 4:44 ` Balbir Singh
@ 2017-05-16 5:04 ` Anshuman Khandual
0 siblings, 0 replies; 14+ messages in thread
From: Anshuman Khandual @ 2017-05-16 5:04 UTC (permalink / raw)
To: Balbir Singh, Breno Leitao, linuxppc-dev; +Cc: gromero
On 05/16/2017 10:14 AM, Balbir Singh wrote:
> On Tue, 2017-05-16 at 09:30 +0530, Anshuman Khandual wrote:
>> On 05/16/2017 12:49 AM, Breno Leitao wrote:
>>> Hello,
>>>
>>> Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
>>> machine. Just SSHing into the machine causes this issue.
>>>
>>> [23.138124] usercopy: kernel memory overwrite attempt detected to d000000003d80030 (mm_struct) (560 bytes)
>>> [23.138195] ------------[ cut here ]------------
>>> [23.138229] kernel BUG at mm/usercopy.c:72!
>>> [23.138252] Oops: Exception in kernel mode, sig: 5 [#3]
>>> [23.138280] SMP NR_CPUS=2048
>>> [23.138280] NUMA
>>> [23.138302] pSeries
>>> [23.138330] Modules linked in:
>>> [23.138354] CPU: 4 PID: 2215 Comm: sshd Tainted: G D 4.12.0-rc1+ #9
>>> [23.138395] task: c0000001e272dc00 task.stack: c0000001e27b0000
>>> [23.138430] NIP: c000000000342358 LR: c000000000342354 CTR: c0000000006eb060
>>> [23.138472] REGS: c0000001e27b3a00 TRAP: 0700 Tainted: G D (4.12.0-rc1+)
>>> [23.138513] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
>>> [23.138517] CR: 28004222 XER: 20000000
>>> [23.138565] CFAR: c000000000b34500 SOFTE: 1
>>> [23.138565] GPR00: c000000000342354 c0000001e27b3c80 c00000000142a000 000000000000005e
>>> [23.138565] GPR04: c0000001ffe0ade8 c0000001ffe21bf8 2920283536302062 79746573290d0a74
>>> [23.138565] GPR08: 0000000000000007 c000000000f61864 00000001feeb0000 3064206f74206465
>>> [23.138565] GPR12: 0000000000004400 c00000000fb42600 0000000000000015 00000000545bdc40
>>> [23.138565] GPR16: 00000000545c49c8 000001000b4b8890 00007ffff78c26f0 00000000545cf000
>>> [23.138565] GPR20: 00000000546109c8 000000000000c7e8 0000000054610010 00007ffff78c22e8
>>> [23.138565] GPR24: 00000000545c8c40 c0000000ff6bcef0 c0000000001e5220 0000000000000230
>>> [23.138565] GPR28: d000000003d80260 0000000000000000 0000000000000230 d000000003d80030
>>> [23.138920] NIP [c000000000342358] __check_object_size+0x88/0x2d0
>>> [23.138956] LR [c000000000342354] __check_object_size+0x84/0x2d0
>>> [23.138990] Call Trace:
>>> [23.139006] [c0000001e27b3c80] [c000000000342354] __check_object_size+0x84/0x2d0 (unreliable)
>>> [23.139056] [c0000001e27b3d00] [c0000000009f5ba8] bpf_prog_create_from_user+0xa8/0x1a0
>>> [23.139099] [c0000001e27b3d60] [c0000000001e5d30] do_seccomp+0x120/0x720
>>> [23.139136] [c0000001e27b3dd0] [c0000000000fd53c] SyS_prctl+0x2ac/0x6b0
>>> [23.139172] [c0000001e27b3e30] [c00000000000af84] system_call+0x38/0xe0
>>> [23.139218] Instruction dump:
>>> [23.139240] 60000000 60420000 3c82ff94 3ca2ff9d 38841788 38a5e868 3c62ff95 7fc8f378
>>> [23.139283] 7fe6fb78 386310c0 487f2169 60000000 <0fe00000> 60420000 2ba30010 409d018c
>>> [23.139328] ---[ end trace 1a1dc952a4b7c4af ]---
>>>
>>> I found that kernel 4.11 does not have this issue. I also found that, if
>>> I revert 517e1fbeb65f5eade8d14f46ac365db6c75aea9b, I do not see the
>>> problem.
>>
>> commit 517e1fbeb65f5eade8d14f46ac365db6c75aea9b
>> Author: Laura Abbott <labbott@redhat.com>
>> Date: Tue Apr 4 14:09:00 2017 -0700
>>
>> mm/usercopy: Drop extra is_vmalloc_or_module() check
>>
>> Previously virt_addr_valid() was insufficient to validate if virt_to_page()
>> could be called on an address on arm64. This has since been fixed up so
>> there is no need for the extra check. Drop it.
>>
>> Signed-off-by: Laura Abbott <labbott@redhat.com>
>> Acked-by: Mark Rutland <mark.rutland@arm.com>
>> Signed-off-by: Kees Cook <keescook@chromium.org>
>>
>> diff --git a/mm/usercopy.c b/mm/usercopy.c
>> index 1eba99b..a9852b2 100644
>> --- a/mm/usercopy.c
>> +++ b/mm/usercopy.c
>> @@ -200,17 +200,6 @@ static inline const char *check_heap_object(const void *ptr, unsigned long n,
>> {
>> struct page *page;
>>
>> - /*
>> - * Some architectures (arm64) return true for virt_addr_valid() on
>> - * vmalloced addresses. Work around this by checking for vmalloc
>> - * first.
>> - *
>> - * We also need to check for module addresses explicitly since we
>> - * may copy static data from modules to userspace
>> - */
>> - if (is_vmalloc_or_module_addr(ptr))
>> - return NULL;
>> -
>> if (!virt_addr_valid(ptr))
>> return NULL;
>>
>>
>>
>> On POWER8 (CONFIG_PPC64),
>>
>> #define virt_addr_valid(kaddr) pfn_valid(virt_to_pfn(kaddr))
>> #define virt_to_pfn(kaddr) (__pa(kaddr) >> PAGE_SHIFT)
>> #define __pa(x) ((unsigned long)(x) & 0x0fffffffffffffffUL)
>>
>> Hence some vmalloc (0xd range) addresses can still pass the virt_addr_valid()
>> test, hence the removed exclusive check for vmalloc and module addresses in
>> the commit is still required for powerpc. If that is the case, we should
>> revert the commit.
>>
>
> I guess we should evaluate the meaning of virt_addr_valid() and what
> it should return for the 0xd.. and 0xf.. ranges, for example?
Hmm, I get your point. But 0xd and 0xf are *actually* virtual addresses,
so I wonder how we can return anything else for them. Hence the extra
check above is required for vmalloc addresses if that's not something
we want.
* Re: kernel BUG at mm/usercopy.c:72!
2017-05-15 19:19 kernel BUG at mm/usercopy.c:72! Breno Leitao
2017-05-16 4:00 ` Anshuman Khandual
@ 2017-05-16 11:02 ` Michael Ellerman
2017-05-16 16:15 ` Breno Leitao
2017-05-16 11:09 ` Michael Ellerman
2017-05-18 10:17 ` Michael Ellerman
3 siblings, 1 reply; 14+ messages in thread
From: Michael Ellerman @ 2017-05-16 11:02 UTC (permalink / raw)
To: Breno Leitao, linuxppc-dev; +Cc: gromero
Breno Leitao <leitao@debian.org> writes:
> Hello,
>
> Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
> machine. Just SSHing into the machine causes this issue.
>
> [23.138124] usercopy: kernel memory overwrite attempt detected to d000000003d80030 (mm_struct) (560 bytes)
> [23.138195] ------------[ cut here ]------------
> [23.138229] kernel BUG at mm/usercopy.c:72!
> [23.138252] Oops: Exception in kernel mode, sig: 5 [#3]
> [23.138280] SMP NR_CPUS=2048
> [23.138280] NUMA
> [23.138302] pSeries
> [23.138330] Modules linked in:
> [23.138354] CPU: 4 PID: 2215 Comm: sshd Tainted: G D 4.12.0-rc1+ #9
> [23.138395] task: c0000001e272dc00 task.stack: c0000001e27b0000
> [23.138430] NIP: c000000000342358 LR: c000000000342354 CTR: c0000000006eb060
> [23.138472] REGS: c0000001e27b3a00 TRAP: 0700 Tainted: G D (4.12.0-rc1+)
> [23.138513] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
> [23.138517] CR: 28004222 XER: 20000000
> [23.138565] CFAR: c000000000b34500 SOFTE: 1
> [23.138565] GPR00: c000000000342354 c0000001e27b3c80 c00000000142a000 000000000000005e
> [23.138565] GPR04: c0000001ffe0ade8 c0000001ffe21bf8 2920283536302062 79746573290d0a74
> [23.138565] GPR08: 0000000000000007 c000000000f61864 00000001feeb0000 3064206f74206465
> [23.138565] GPR12: 0000000000004400 c00000000fb42600 0000000000000015 00000000545bdc40
> [23.138565] GPR16: 00000000545c49c8 000001000b4b8890 00007ffff78c26f0 00000000545cf000
> [23.138565] GPR20: 00000000546109c8 000000000000c7e8 0000000054610010 00007ffff78c22e8
> [23.138565] GPR24: 00000000545c8c40 c0000000ff6bcef0 c0000000001e5220 0000000000000230
> [23.138565] GPR28: d000000003d80260 0000000000000000 0000000000000230 d000000003d80030
> [23.138920] NIP [c000000000342358] __check_object_size+0x88/0x2d0
> [23.138956] LR [c000000000342354] __check_object_size+0x84/0x2d0
> [23.138990] Call Trace:
> [23.139006] [c0000001e27b3c80] [c000000000342354] __check_object_size+0x84/0x2d0 (unreliable)
> [23.139056] [c0000001e27b3d00] [c0000000009f5ba8] bpf_prog_create_from_user+0xa8/0x1a0
> [23.139099] [c0000001e27b3d60] [c0000000001e5d30] do_seccomp+0x120/0x720
> [23.139136] [c0000001e27b3dd0] [c0000000000fd53c] SyS_prctl+0x2ac/0x6b0
> [23.139172] [c0000001e27b3e30] [c00000000000af84] system_call+0x38/0xe0
> [23.139218] Instruction dump:
> [23.139240] 60000000 60420000 3c82ff94 3ca2ff9d 38841788 38a5e868 3c62ff95 7fc8f378
> [23.139283] 7fe6fb78 386310c0 487f2169 60000000 <0fe00000> 60420000 2ba30010 409d018c
> [23.139328] ---[ end trace 1a1dc952a4b7c4af ]---
Do you have any idea what is calling seccomp() and triggering the bug?
I run the BPF and seccomp test suites, and I haven't seen this.
cheers
* Re: kernel BUG at mm/usercopy.c:72!
2017-05-16 11:02 ` Michael Ellerman
@ 2017-05-16 16:15 ` Breno Leitao
0 siblings, 0 replies; 14+ messages in thread
From: Breno Leitao @ 2017-05-16 16:15 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev, gromero
On Tue, May 16, 2017 at 09:02:29PM +1000, Michael Ellerman wrote:
> Breno Leitao <leitao@debian.org> writes:
>
> > Hello,
> >
> > Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
> > machine. Just SSHing into the machine causes this issue.
> >
> > [23.138124] usercopy: kernel memory overwrite attempt detected to d000000003d80030 (mm_struct) (560 bytes)
> > [23.138195] ------------[ cut here ]------------
> > [23.138229] kernel BUG at mm/usercopy.c:72!
> > [23.138252] Oops: Exception in kernel mode, sig: 5 [#3]
> > [23.138280] SMP NR_CPUS=2048
> > [23.138280] NUMA
> > [23.138302] pSeries
> > [23.138330] Modules linked in:
> > [23.138354] CPU: 4 PID: 2215 Comm: sshd Tainted: G D 4.12.0-rc1+ #9
> > [23.138395] task: c0000001e272dc00 task.stack: c0000001e27b0000
> > [23.138430] NIP: c000000000342358 LR: c000000000342354 CTR: c0000000006eb060
> > [23.138472] REGS: c0000001e27b3a00 TRAP: 0700 Tainted: G D (4.12.0-rc1+)
> > [23.138513] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
> > [23.138517] CR: 28004222 XER: 20000000
> > [23.138565] CFAR: c000000000b34500 SOFTE: 1
> > [23.138565] GPR00: c000000000342354 c0000001e27b3c80 c00000000142a000 000000000000005e
> > [23.138565] GPR04: c0000001ffe0ade8 c0000001ffe21bf8 2920283536302062 79746573290d0a74
> > [23.138565] GPR08: 0000000000000007 c000000000f61864 00000001feeb0000 3064206f74206465
> > [23.138565] GPR12: 0000000000004400 c00000000fb42600 0000000000000015 00000000545bdc40
> > [23.138565] GPR16: 00000000545c49c8 000001000b4b8890 00007ffff78c26f0 00000000545cf000
> > [23.138565] GPR20: 00000000546109c8 000000000000c7e8 0000000054610010 00007ffff78c22e8
> > [23.138565] GPR24: 00000000545c8c40 c0000000ff6bcef0 c0000000001e5220 0000000000000230
> > [23.138565] GPR28: d000000003d80260 0000000000000000 0000000000000230 d000000003d80030
> > [23.138920] NIP [c000000000342358] __check_object_size+0x88/0x2d0
> > [23.138956] LR [c000000000342354] __check_object_size+0x84/0x2d0
> > [23.138990] Call Trace:
> > [23.139006] [c0000001e27b3c80] [c000000000342354] __check_object_size+0x84/0x2d0 (unreliable)
> > [23.139056] [c0000001e27b3d00] [c0000000009f5ba8] bpf_prog_create_from_user+0xa8/0x1a0
> > [23.139099] [c0000001e27b3d60] [c0000000001e5d30] do_seccomp+0x120/0x720
> > [23.139136] [c0000001e27b3dd0] [c0000000000fd53c] SyS_prctl+0x2ac/0x6b0
> > [23.139172] [c0000001e27b3e30] [c00000000000af84] system_call+0x38/0xe0
> > [23.139218] Instruction dump:
> > [23.139240] 60000000 60420000 3c82ff94 3ca2ff9d 38841788 38a5e868 3c62ff95 7fc8f378
> > [23.139283] 7fe6fb78 386310c0 487f2169 60000000 <0fe00000> 60420000 2ba30010 409d018c
> > [23.139328] ---[ end trace 1a1dc952a4b7c4af ]---
>
> Do you have any idea what is calling seccomp() and triggering the bug?
This bug is hit via several paths, not only via seccomp. Here is
another path, via vfs_read, that triggers the bug:
[ 370.154307] usercopy: kernel memory exposure attempt detected from d000000003d6007c (vm_area_struct) (6 bytes)
[ 370.154373] ------------[ cut here ]------------
[ 370.154402] kernel BUG at mm/usercopy.c:72!
[ 370.154425] Oops: Exception in kernel mode, sig: 5 [#4]
<snip>
[370.155220] [c0000001d30efab0] [c000000000342354] __check_object_size+0x84/0x2b0 (unreliable)
[370.155272] [c0000001d30efb30] [c0000000006c96cc] copy_from_read_buf+0xac/0x1e0
[370.155315] [c0000001d30efba0] [c0000000006ccbc4] n_tty_read+0x324/0x920
[370.155351] [c0000001d30efcb0] [c0000000006c4c50] tty_read+0xc0/0x180
[370.155387] [c0000001d30efd00] [c000000000347f64] __vfs_read+0x44/0x1a0
[370.155424] [c0000001d30efd90] [c0000000003499ac] vfs_read+0xbc/0x1b0
[370.155460] [c0000001d30efde0] [c00000000034b6f8] SyS_read+0x68/0x110
[370.155497] [c0000001d30efe30] [c00000000000af84] system_call+0x38/0xe0
Anyway, I see the seccomp() path issue when I log into the system using SSH,
and the tty_read() path issue during system boot.
> I run the BPF and seccomp test suites, and I haven't seen this.
Do you have the hardening options enabled? For example, I cannot
reproduce this problem unless CONFIG_HARDENED_USERCOPY=y is set.
* Re: kernel BUG at mm/usercopy.c:72!
2017-05-15 19:19 kernel BUG at mm/usercopy.c:72! Breno Leitao
2017-05-16 4:00 ` Anshuman Khandual
2017-05-16 11:02 ` Michael Ellerman
@ 2017-05-16 11:09 ` Michael Ellerman
2017-05-16 14:32 ` Kees Cook
2017-05-18 10:17 ` Michael Ellerman
3 siblings, 1 reply; 14+ messages in thread
From: Michael Ellerman @ 2017-05-16 11:09 UTC (permalink / raw)
To: Breno Leitao, linuxppc-dev, Kees Cook, Laura Abbott
Cc: gromero, Anshuman Khandual, Balbir Singh
[Cc'ing the relevant folks]
Breno Leitao <leitao@debian.org> writes:
> Hello,
>
> Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
> machine. Just SSHing into the machine causes this issue.
>
> [23.138124] usercopy: kernel memory overwrite attempt detected to d000000003d80030 (mm_struct) (560 bytes)
> [23.138195] ------------[ cut here ]------------
> [23.138229] kernel BUG at mm/usercopy.c:72!
> [23.138252] Oops: Exception in kernel mode, sig: 5 [#3]
> [23.138280] SMP NR_CPUS=2048
> [23.138280] NUMA
> [23.138302] pSeries
> [23.138330] Modules linked in:
> [23.138354] CPU: 4 PID: 2215 Comm: sshd Tainted: G D 4.12.0-rc1+ #9
> [23.138395] task: c0000001e272dc00 task.stack: c0000001e27b0000
> [23.138430] NIP: c000000000342358 LR: c000000000342354 CTR: c0000000006eb060
> [23.138472] REGS: c0000001e27b3a00 TRAP: 0700 Tainted: G D (4.12.0-rc1+)
> [23.138513] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
> [23.138517] CR: 28004222 XER: 20000000
> [23.138565] CFAR: c000000000b34500 SOFTE: 1
> [23.138565] GPR00: c000000000342354 c0000001e27b3c80 c00000000142a000 000000000000005e
> [23.138565] GPR04: c0000001ffe0ade8 c0000001ffe21bf8 2920283536302062 79746573290d0a74
> [23.138565] GPR08: 0000000000000007 c000000000f61864 00000001feeb0000 3064206f74206465
> [23.138565] GPR12: 0000000000004400 c00000000fb42600 0000000000000015 00000000545bdc40
> [23.138565] GPR16: 00000000545c49c8 000001000b4b8890 00007ffff78c26f0 00000000545cf000
> [23.138565] GPR20: 00000000546109c8 000000000000c7e8 0000000054610010 00007ffff78c22e8
> [23.138565] GPR24: 00000000545c8c40 c0000000ff6bcef0 c0000000001e5220 0000000000000230
> [23.138565] GPR28: d000000003d80260 0000000000000000 0000000000000230 d000000003d80030
> [23.138920] NIP [c000000000342358] __check_object_size+0x88/0x2d0
> [23.138956] LR [c000000000342354] __check_object_size+0x84/0x2d0
> [23.138990] Call Trace:
> [23.139006] [c0000001e27b3c80] [c000000000342354] __check_object_size+0x84/0x2d0 (unreliable)
> [23.139056] [c0000001e27b3d00] [c0000000009f5ba8] bpf_prog_create_from_user+0xa8/0x1a0
> [23.139099] [c0000001e27b3d60] [c0000000001e5d30] do_seccomp+0x120/0x720
> [23.139136] [c0000001e27b3dd0] [c0000000000fd53c] SyS_prctl+0x2ac/0x6b0
> [23.139172] [c0000001e27b3e30] [c00000000000af84] system_call+0x38/0xe0
> [23.139218] Instruction dump:
> [23.139240] 60000000 60420000 3c82ff94 3ca2ff9d 38841788 38a5e868 3c62ff95 7fc8f378
> [23.139283] 7fe6fb78 386310c0 487f2169 60000000 <0fe00000> 60420000 2ba30010 409d018c
> [23.139328] ---[ end trace 1a1dc952a4b7c4af ]---
>
> I found that kernel 4.11 does not have this issue. I also found that, if
> I revert 517e1fbeb65f5eade8d14f46ac365db6c75aea9b, I do not see the
> problem.
>
> On the other side, if I cherry-pick commit
> 517e1fbeb65f5eade8d14f46ac365db6c75aea9b into 4.11, I start seeing the
> same issue also on 4.11.
Yeah, it looks like powerpc also suffers from the same bug that arm64
used to, i.e. virt_addr_valid() will return true for some vmalloc
addresses.
virt_addr_valid() is used pretty widely, so I'm not sure if we can just fix
it without other fallout. I'll dig a bit more tomorrow if no one beats
me to it.
Kees, depending on how that turns out we may ask you to revert
517e1fbeb65f ("mm/usercopy: Drop extra is_vmalloc_or_module() check").
cheers
* Re: kernel BUG at mm/usercopy.c:72!
2017-05-16 11:09 ` Michael Ellerman
@ 2017-05-16 14:32 ` Kees Cook
2017-05-16 14:35 ` Laura Abbott
` (2 more replies)
0 siblings, 3 replies; 14+ messages in thread
From: Kees Cook @ 2017-05-16 14:32 UTC (permalink / raw)
To: Michael Ellerman
Cc: Breno Leitao, linuxppc-dev, Laura Abbott, gromero,
Anshuman Khandual, Balbir Singh
On Tue, May 16, 2017 at 4:09 AM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> [Cc'ing the relevant folks]
>
> Breno Leitao <leitao@debian.org> writes:
>> Hello,
>>
>> Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
>> machine. Just SSHing into the machine causes this issue.
>>
>> [23.138124] usercopy: kernel memory overwrite attempt detected to d000000003d80030 (mm_struct) (560 bytes)
>> [23.138195] ------------[ cut here ]------------
>> [23.138229] kernel BUG at mm/usercopy.c:72!
>> [23.138252] Oops: Exception in kernel mode, sig: 5 [#3]
>> [23.138280] SMP NR_CPUS=2048
>> [23.138280] NUMA
>> [23.138302] pSeries
>> [23.138330] Modules linked in:
>> [23.138354] CPU: 4 PID: 2215 Comm: sshd Tainted: G D 4.12.0-rc1+ #9
>> [23.138395] task: c0000001e272dc00 task.stack: c0000001e27b0000
>> [23.138430] NIP: c000000000342358 LR: c000000000342354 CTR: c0000000006eb060
>> [23.138472] REGS: c0000001e27b3a00 TRAP: 0700 Tainted: G D (4.12.0-rc1+)
>> [23.138513] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
>> [23.138517] CR: 28004222 XER: 20000000
>> [23.138565] CFAR: c000000000b34500 SOFTE: 1
>> [23.138565] GPR00: c000000000342354 c0000001e27b3c80 c00000000142a000 000000000000005e
>> [23.138565] GPR04: c0000001ffe0ade8 c0000001ffe21bf8 2920283536302062 79746573290d0a74
>> [23.138565] GPR08: 0000000000000007 c000000000f61864 00000001feeb0000 3064206f74206465
>> [23.138565] GPR12: 0000000000004400 c00000000fb42600 0000000000000015 00000000545bdc40
>> [23.138565] GPR16: 00000000545c49c8 000001000b4b8890 00007ffff78c26f0 00000000545cf000
>> [23.138565] GPR20: 00000000546109c8 000000000000c7e8 0000000054610010 00007ffff78c22e8
>> [23.138565] GPR24: 00000000545c8c40 c0000000ff6bcef0 c0000000001e5220 0000000000000230
>> [23.138565] GPR28: d000000003d80260 0000000000000000 0000000000000230 d000000003d80030
>> [23.138920] NIP [c000000000342358] __check_object_size+0x88/0x2d0
>> [23.138956] LR [c000000000342354] __check_object_size+0x84/0x2d0
>> [23.138990] Call Trace:
>> [23.139006] [c0000001e27b3c80] [c000000000342354] __check_object_size+0x84/0x2d0 (unreliable)
>> [23.139056] [c0000001e27b3d00] [c0000000009f5ba8] bpf_prog_create_from_user+0xa8/0x1a0
>> [23.139099] [c0000001e27b3d60] [c0000000001e5d30] do_seccomp+0x120/0x720
>> [23.139136] [c0000001e27b3dd0] [c0000000000fd53c] SyS_prctl+0x2ac/0x6b0
>> [23.139172] [c0000001e27b3e30] [c00000000000af84] system_call+0x38/0xe0
>> [23.139218] Instruction dump:
>> [23.139240] 60000000 60420000 3c82ff94 3ca2ff9d 38841788 38a5e868 3c62ff95 7fc8f378
>> [23.139283] 7fe6fb78 386310c0 487f2169 60000000 <0fe00000> 60420000 2ba30010 409d018c
>> [23.139328] ---[ end trace 1a1dc952a4b7c4af ]---
>>
>> I found that kernel 4.11 does not have this issue. I also found that, if
>> I revert 517e1fbeb65f5eade8d14f46ac365db6c75aea9b, I do not see the
>> problem.
>>
>> On the other side, if I cherry-pick commit
>> 517e1fbeb65f5eade8d14f46ac365db6c75aea9b into 4.11, I start seeing the
>> same issue also on 4.11.
>
> Yeah it looks like powerpc also suffers from the same bug that arm64
> used to, ie. virt_addr_valid() will return true for some vmalloc
> addresses.
>
> virt_addr_valid() is used pretty widely, I'm not sure if we can just fix
> it without other fallout. I'll dig a bit more tomorrow if no one beats
> me to it.
>
> Kees, depending on how that turns out we may ask you to revert
> 517e1fbeb65f ("mm/usercopy: Drop extra is_vmalloc_or_module() check").
That's fine by me. Let me know what you think would be best.
Laura, I don't see much harm in putting this back in place. It seems
like it's just a matter of efficiency to have it removed?
-Kees
--
Kees Cook
Pixel Security
* Re: kernel BUG at mm/usercopy.c:72!
2017-05-16 14:32 ` Kees Cook
@ 2017-05-16 14:35 ` Laura Abbott
2017-05-18 5:09 ` Michael Ellerman
2017-05-17 10:05 ` Balbir Singh
2017-05-18 10:16 ` Michael Ellerman
2 siblings, 1 reply; 14+ messages in thread
From: Laura Abbott @ 2017-05-16 14:35 UTC (permalink / raw)
To: Kees Cook, Michael Ellerman
Cc: Breno Leitao, linuxppc-dev, gromero, Anshuman Khandual, Balbir Singh
On 05/16/2017 07:32 AM, Kees Cook wrote:
> On Tue, May 16, 2017 at 4:09 AM, Michael Ellerman <mpe@ellerman.id.au> wrote:
>> [Cc'ing the relevant folks]
>>
>> Breno Leitao <leitao@debian.org> writes:
>>> Hello,
>>>
>>> Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
>>> machine. Just SSHing into the machine causes this issue.
>>>
>>> [23.138124] usercopy: kernel memory overwrite attempt detected to d000000003d80030 (mm_struct) (560 bytes)
>>> [23.138195] ------------[ cut here ]------------
>>> [23.138229] kernel BUG at mm/usercopy.c:72!
>>> [23.138252] Oops: Exception in kernel mode, sig: 5 [#3]
>>> [23.138280] SMP NR_CPUS=2048
>>> [23.138280] NUMA
>>> [23.138302] pSeries
>>> [23.138330] Modules linked in:
>>> [23.138354] CPU: 4 PID: 2215 Comm: sshd Tainted: G D 4.12.0-rc1+ #9
>>> [23.138395] task: c0000001e272dc00 task.stack: c0000001e27b0000
>>> [23.138430] NIP: c000000000342358 LR: c000000000342354 CTR: c0000000006eb060
>>> [23.138472] REGS: c0000001e27b3a00 TRAP: 0700 Tainted: G D (4.12.0-rc1+)
>>> [23.138513] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
>>> [23.138517] CR: 28004222 XER: 20000000
>>> [23.138565] CFAR: c000000000b34500 SOFTE: 1
>>> [23.138565] GPR00: c000000000342354 c0000001e27b3c80 c00000000142a000 000000000000005e
>>> [23.138565] GPR04: c0000001ffe0ade8 c0000001ffe21bf8 2920283536302062 79746573290d0a74
>>> [23.138565] GPR08: 0000000000000007 c000000000f61864 00000001feeb0000 3064206f74206465
>>> [23.138565] GPR12: 0000000000004400 c00000000fb42600 0000000000000015 00000000545bdc40
>>> [23.138565] GPR16: 00000000545c49c8 000001000b4b8890 00007ffff78c26f0 00000000545cf000
>>> [23.138565] GPR20: 00000000546109c8 000000000000c7e8 0000000054610010 00007ffff78c22e8
>>> [23.138565] GPR24: 00000000545c8c40 c0000000ff6bcef0 c0000000001e5220 0000000000000230
>>> [23.138565] GPR28: d000000003d80260 0000000000000000 0000000000000230 d000000003d80030
>>> [23.138920] NIP [c000000000342358] __check_object_size+0x88/0x2d0
>>> [23.138956] LR [c000000000342354] __check_object_size+0x84/0x2d0
>>> [23.138990] Call Trace:
>>> [23.139006] [c0000001e27b3c80] [c000000000342354] __check_object_size+0x84/0x2d0 (unreliable)
>>> [23.139056] [c0000001e27b3d00] [c0000000009f5ba8] bpf_prog_create_from_user+0xa8/0x1a0
>>> [23.139099] [c0000001e27b3d60] [c0000000001e5d30] do_seccomp+0x120/0x720
>>> [23.139136] [c0000001e27b3dd0] [c0000000000fd53c] SyS_prctl+0x2ac/0x6b0
>>> [23.139172] [c0000001e27b3e30] [c00000000000af84] system_call+0x38/0xe0
>>> [23.139218] Instruction dump:
>>> [23.139240] 60000000 60420000 3c82ff94 3ca2ff9d 38841788 38a5e868 3c62ff95 7fc8f378
>>> [23.139283] 7fe6fb78 386310c0 487f2169 60000000 <0fe00000> 60420000 2ba30010 409d018c
>>> [23.139328] ---[ end trace 1a1dc952a4b7c4af ]---
>>>
>>> I found that kernel 4.11 does not have this issue. I also found that, if
>>> I revert 517e1fbeb65f5eade8d14f46ac365db6c75aea9b, I do not see the
>>> problem.
>>>
>>> On the other side, if I cherry-pick commit
>>> 517e1fbeb65f5eade8d14f46ac365db6c75aea9b into 4.11, I start seeing the
>>> same issue also on 4.11.
>>
>> Yeah it looks like powerpc also suffers from the same bug that arm64
>> used to, ie. virt_addr_valid() will return true for some vmalloc
>> addresses.
>>
>> virt_addr_valid() is used pretty widely, I'm not sure if we can just fix
>> it without other fallout. I'll dig a bit more tomorrow if no one beats
>> me to it.
>>
>> Kees, depending on how that turns out we may ask you to revert
>> 517e1fbeb65f ("mm/usercopy: Drop extra is_vmalloc_or_module() check").
>
> That's fine by me. Let me know what you think would be best.
>
> Laura, I don't see much harm in putting this back in place. It seems
> like it's just a matter of efficiency to have it removed?
>
> -Kees
>
Yes, there shouldn't be any harm if we need to bring it back.
Perhaps I should submit a follow-on patch to rename virt_addr_valid to
virt_addr_valid_except_where_it_isnt.
Thanks,
Laura
* Re: kernel BUG at mm/usercopy.c:72!
2017-05-16 14:35 ` Laura Abbott
@ 2017-05-18 5:09 ` Michael Ellerman
0 siblings, 0 replies; 14+ messages in thread
From: Michael Ellerman @ 2017-05-18 5:09 UTC (permalink / raw)
To: Laura Abbott, Kees Cook
Cc: Breno Leitao, linuxppc-dev, gromero, Anshuman Khandual, Balbir Singh
Laura Abbott <labbott@redhat.com> writes:
> On 05/16/2017 07:32 AM, Kees Cook wrote:
>> On Tue, May 16, 2017 at 4:09 AM, Michael Ellerman <mpe@ellerman.id.au> wrote:
>>> [Cc'ing the relevant folks]
>>>
>>> Breno Leitao <leitao@debian.org> writes:
>>>> Hello,
>>>>
>>>> Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
>>>> machine. Just SSHing into the machine causes this issue.
>>>>
>>>> [23.138124] usercopy: kernel memory overwrite attempt detected to d000000003d80030 (mm_struct) (560 bytes)
>>>> [23.138195] ------------[ cut here ]------------
>>>> [23.138229] kernel BUG at mm/usercopy.c:72!
>>>> [23.138252] Oops: Exception in kernel mode, sig: 5 [#3]
>>>> [23.138280] SMP NR_CPUS=2048
>>>> [23.138280] NUMA
>>>> [23.138302] pSeries
>>>> [23.138330] Modules linked in:
>>>> [23.138354] CPU: 4 PID: 2215 Comm: sshd Tainted: G D 4.12.0-rc1+ #9
>>>> [23.138395] task: c0000001e272dc00 task.stack: c0000001e27b0000
>>>> [23.138430] NIP: c000000000342358 LR: c000000000342354 CTR: c0000000006eb060
>>>> [23.138472] REGS: c0000001e27b3a00 TRAP: 0700 Tainted: G D (4.12.0-rc1+)
>>>> [23.138513] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
>>>> [23.138517] CR: 28004222 XER: 20000000
>>>> [23.138565] CFAR: c000000000b34500 SOFTE: 1
>>>> [23.138565] GPR00: c000000000342354 c0000001e27b3c80 c00000000142a000 000000000000005e
>>>> [23.138565] GPR04: c0000001ffe0ade8 c0000001ffe21bf8 2920283536302062 79746573290d0a74
>>>> [23.138565] GPR08: 0000000000000007 c000000000f61864 00000001feeb0000 3064206f74206465
>>>> [23.138565] GPR12: 0000000000004400 c00000000fb42600 0000000000000015 00000000545bdc40
>>>> [23.138565] GPR16: 00000000545c49c8 000001000b4b8890 00007ffff78c26f0 00000000545cf000
>>>> [23.138565] GPR20: 00000000546109c8 000000000000c7e8 0000000054610010 00007ffff78c22e8
>>>> [23.138565] GPR24: 00000000545c8c40 c0000000ff6bcef0 c0000000001e5220 0000000000000230
>>>> [23.138565] GPR28: d000000003d80260 0000000000000000 0000000000000230 d000000003d80030
>>>> [23.138920] NIP [c000000000342358] __check_object_size+0x88/0x2d0
>>>> [23.138956] LR [c000000000342354] __check_object_size+0x84/0x2d0
>>>> [23.138990] Call Trace:
>>>> [23.139006] [c0000001e27b3c80] [c000000000342354] __check_object_size+0x84/0x2d0 (unreliable)
>>>> [23.139056] [c0000001e27b3d00] [c0000000009f5ba8] bpf_prog_create_from_user+0xa8/0x1a0
>>>> [23.139099] [c0000001e27b3d60] [c0000000001e5d30] do_seccomp+0x120/0x720
>>>> [23.139136] [c0000001e27b3dd0] [c0000000000fd53c] SyS_prctl+0x2ac/0x6b0
>>>> [23.139172] [c0000001e27b3e30] [c00000000000af84] system_call+0x38/0xe0
>>>> [23.139218] Instruction dump:
>>>> [23.139240] 60000000 60420000 3c82ff94 3ca2ff9d 38841788 38a5e868 3c62ff95 7fc8f378
>>>> [23.139283] 7fe6fb78 386310c0 487f2169 60000000 <0fe00000> 60420000 2ba30010 409d018c
>>>> [23.139328] ---[ end trace 1a1dc952a4b7c4af ]---
>>>>
>>>> I found that kernel 4.11 does not have this issue. I also found that, if
>>>> I revert 517e1fbeb65f5eade8d14f46ac365db6c75aea9b, I do not see the
>>>> problem.
>>>>
>>>> On the other side, if I cherry-pick commit
>>>> 517e1fbeb65f5eade8d14f46ac365db6c75aea9b into 4.11, I start seeing the
>>>> same issue also on 4.11.
>>>
>>> Yeah it looks like powerpc also suffers from the same bug that arm64
>>> used to, ie. virt_addr_valid() will return true for some vmalloc
>>> addresses.
>>>
>>> virt_addr_valid() is used pretty widely, I'm not sure if we can just fix
>>> it without other fallout. I'll dig a bit more tomorrow if no one beats
>>> me to it.
>>>
>>> Kees, depending on how that turns out we may ask you to revert
>>> 517e1fbeb65f ("mm/usercopy: Drop extra is_vmalloc_or_module() check").
>>
>> That's fine by me. Let me know what you think would be best.
>>
>> Laura, I don't see much harm in putting this back in place. It seems
>> like it's just a matter of efficiency to have it removed?
>
> Yes, there shouldn't be any harm if we need to bring it back.
> Perhaps I should submit a follow-on patch to rename virt_addr_valid to
> virt_addr_valid_except_where_it_isnt.
I suspect there's lots of history here.
virt_addr_valid() is also hardly used: there are only a few tens of
callers, vs hundreds for virt_to_page(). Which is scary as hell given
the latter is only safe if the former returns true.
cheers
* Re: kernel BUG at mm/usercopy.c:72!
2017-05-16 14:32 ` Kees Cook
2017-05-16 14:35 ` Laura Abbott
@ 2017-05-17 10:05 ` Balbir Singh
2017-05-18 10:16 ` Michael Ellerman
2 siblings, 0 replies; 14+ messages in thread
From: Balbir Singh @ 2017-05-17 10:05 UTC (permalink / raw)
To: Kees Cook, Michael Ellerman
Cc: Breno Leitao, linuxppc-dev, Laura Abbott, gromero, Anshuman Khandual
> > Kees, depending on how that turns out we may ask you to revert
> > 517e1fbeb65f ("mm/usercopy: Drop extra is_vmalloc_or_module() check").
>
> That's fine by me. Let me know what you think would be best.
>
> Laura, I don't see much harm in putting this back in place. It seems
> like it's just a matter of efficiency to have it removed?
It looks like we resolved struct page of 0xd000000003d80030 as PageSlab?
Balbir Singh.
* Re: kernel BUG at mm/usercopy.c:72!
2017-05-16 14:32 ` Kees Cook
2017-05-16 14:35 ` Laura Abbott
2017-05-17 10:05 ` Balbir Singh
@ 2017-05-18 10:16 ` Michael Ellerman
2017-05-18 10:58 ` Balbir Singh
2 siblings, 1 reply; 14+ messages in thread
From: Michael Ellerman @ 2017-05-18 10:16 UTC (permalink / raw)
To: Kees Cook
Cc: Breno Leitao, linuxppc-dev, Laura Abbott, gromero,
Anshuman Khandual, Balbir Singh
Kees Cook <keescook@chromium.org> writes:
> On Tue, May 16, 2017 at 4:09 AM, Michael Ellerman <mpe@ellerman.id.au> wrote:
>> Yeah it looks like powerpc also suffers from the same bug that arm64
>> used to, ie. virt_addr_valid() will return true for some vmalloc
>> addresses.
>>
>> virt_addr_valid() is used pretty widely, I'm not sure if we can just fix
>> it without other fallout. I'll dig a bit more tomorrow if no one beats
>> me to it.
>>
>> Kees, depending on how that turns out we may ask you to revert
>> 517e1fbeb65f ("mm/usercopy: Drop extra is_vmalloc_or_module() check").
>
> That's fine by me. Let me know what you think would be best.
Oh man, what a mess.
I think we can do a small fix for this in powerpc code for 4.12, will
post it soon for Breno to test - I still can't reproduce locally.
cheers
* Re: kernel BUG at mm/usercopy.c:72!
2017-05-18 10:16 ` Michael Ellerman
@ 2017-05-18 10:58 ` Balbir Singh
0 siblings, 0 replies; 14+ messages in thread
From: Balbir Singh @ 2017-05-18 10:58 UTC (permalink / raw)
To: Michael Ellerman
Cc: Kees Cook, Breno Leitao, linuxppc-dev, Laura Abbott, gromero,
Anshuman Khandual
On Thu, May 18, 2017 at 8:16 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> Kees Cook <keescook@chromium.org> writes:
>> On Tue, May 16, 2017 at 4:09 AM, Michael Ellerman <mpe@ellerman.id.au> wrote:
>>> Yeah it looks like powerpc also suffers from the same bug that arm64
>>> used to, ie. virt_addr_valid() will return true for some vmalloc
>>> addresses.
>>>
>>> virt_addr_valid() is used pretty widely, I'm not sure if we can just fix
>>> it without other fallout. I'll dig a bit more tomorrow if no one beats
>>> me to it.
>>>
>>> Kees, depending on how that turns out we may ask you to revert
>>> 517e1fbeb65f ("mm/usercopy: Drop extra is_vmalloc_or_module() check").
>>
>> That's fine by me. Let me know what you think would be best.
>
> Oh man, what a mess.
>
> I think we can do a small fix for this in powerpc code for 4.12, will
> post it soon for Breno to test - I still can't reproduce locally.
To reproduce locally you'd need a 0xd000..<addr> address that aliases to
0xc000..<addr> (due to the assumptions in __pa()), with PageSlab(page)
set on virt_to_page(addr), I guess. With very few modules loaded, a whole
lot of the 0xd000... space is unused, but with a bunch of modules you
could end up with a 0xd000..<addr> range aliased to a PageSlab page and
would probably run into it more easily.
Balbir Singh
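The aliasing described above can be modelled in user space. This is only a
sketch: it assumes that __pa() on 64-bit hash masks off the top four bits of
the address, that the region id is the top nibble, and that pages are 64K.
All model_* names and the MODEL_* constants are hypothetical and are not
copied from the kernel tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout for 64-bit Book3S hash; illustrative values only. */
#define MODEL_PAGE_OFFSET  0xc000000000000000ULL
#define MODEL_REGION_SHIFT 60
#define MODEL_PAGE_SHIFT   16 /* assumption: 64K pages */

/* Model of __pa(): drop the region nibble, keep the low 60 bits. */
static inline uint64_t model_pa(uint64_t vaddr)
{
	return vaddr & 0x0fffffffffffffffULL;
}

/* Model of REGION_ID(): the top nibble selects the region. */
static inline uint64_t model_region_id(uint64_t vaddr)
{
	return vaddr >> MODEL_REGION_SHIFT;
}

/* Model of virt_to_pfn(): it goes through __pa(), so the region is lost. */
static inline uint64_t model_virt_to_pfn(uint64_t vaddr)
{
	return model_pa(vaddr) >> MODEL_PAGE_SHIFT;
}

/* True when two addresses in different regions resolve to the same pfn. */
static inline bool model_aliases(uint64_t a, uint64_t b)
{
	return model_region_id(a) != model_region_id(b) &&
	       model_virt_to_pfn(a) == model_virt_to_pfn(b);
}
```

With these assumptions, the address from the oops (0xd000000003d80030) and
0xc000000003d80030 map to the same pfn, so whatever flags the linear-map
page carries, PageSlab included, are what a lookup of the 0xd000... address
would see.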
* Re: kernel BUG at mm/usercopy.c:72!
2017-05-15 19:19 kernel BUG at mm/usercopy.c:72! Breno Leitao
` (2 preceding siblings ...)
2017-05-16 11:09 ` Michael Ellerman
@ 2017-05-18 10:17 ` Michael Ellerman
3 siblings, 0 replies; 14+ messages in thread
From: Michael Ellerman @ 2017-05-18 10:17 UTC (permalink / raw)
To: Breno Leitao, linuxppc-dev; +Cc: gromero
Breno Leitao <leitao@debian.org> writes:
> Hello,
>
> Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
> machine. Just SSHing into the machine causes this issue.
Can you try this?
cheers
diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index 2a32483c7b6c..8da5d4c1cab2 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -132,7 +132,19 @@ extern long long virt_phys_offset;
#define virt_to_pfn(kaddr) (__pa(kaddr) >> PAGE_SHIFT)
#define virt_to_page(kaddr) pfn_to_page(virt_to_pfn(kaddr))
#define pfn_to_kaddr(pfn) __va((pfn) << PAGE_SHIFT)
+
+#ifdef CONFIG_PPC_BOOK3S_64
+/*
+ * On hash the vmalloc and other regions alias to the kernel region when passed
+ * through __pa(), which virt_to_pfn() uses. That means virt_addr_valid() can
+ * return true for some vmalloc addresses, which is incorrect. So explicitly
+ * check that the address is in the kernel region.
+ */
+#define virt_addr_valid(kaddr) (REGION_ID(kaddr) == KERNEL_REGION_ID && \
+ pfn_valid(virt_to_pfn(kaddr)))
+#else
#define virt_addr_valid(kaddr) pfn_valid(virt_to_pfn(kaddr))
+#endif
/*
* On Book-E parts we need __va to parse the device tree and we can't
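
The effect of the extra region check in the patch above can be illustrated
with a small user-space model. This is a sketch under stated assumptions,
not kernel code: the sketch_* helpers are hypothetical, the __pa() mask and
region shift are assumed for 64-bit hash, pages are assumed to be 64K, and
SKETCH_MAX_PFN stands in for pfn_valid() with a made-up 8 GiB of RAM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed constants; illustrative only, not taken from the kernel headers. */
#define SKETCH_PAGE_OFFSET      0xc000000000000000ULL
#define SKETCH_REGION_SHIFT     60
#define SKETCH_KERNEL_REGION_ID (SKETCH_PAGE_OFFSET >> SKETCH_REGION_SHIFT)
#define SKETCH_PAGE_SHIFT       16        /* assumption: 64K pages */
#define SKETCH_MAX_PFN          0x20000ULL /* hypothetical: 8 GiB of RAM */

/* Stand-in for pfn_valid(): every pfn below the end of RAM is valid. */
static inline bool sketch_pfn_valid(uint64_t pfn)
{
	return pfn < SKETCH_MAX_PFN;
}

/* Stand-in for virt_to_pfn(): masks the region nibble the way __pa() does. */
static inline uint64_t sketch_virt_to_pfn(uint64_t kaddr)
{
	return (kaddr & 0x0fffffffffffffffULL) >> SKETCH_PAGE_SHIFT;
}

/* Old behaviour: only the pfn is checked, so aliased regions pass. */
static inline bool sketch_valid_old(uint64_t kaddr)
{
	return sketch_pfn_valid(sketch_virt_to_pfn(kaddr));
}

/* Patched behaviour: the address must also sit in the kernel region. */
static inline bool sketch_valid_new(uint64_t kaddr)
{
	return (kaddr >> SKETCH_REGION_SHIFT) == SKETCH_KERNEL_REGION_ID &&
	       sketch_pfn_valid(sketch_virt_to_pfn(kaddr));
}
```

In this model the vmalloc-region address from the oops passes the old check
and fails the new one, while the matching 0xc000... linear-map address
passes both.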