* next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
@ 2017-05-15 22:06 Luis R. Rodriguez
2017-05-15 22:15 ` Luis R. Rodriguez
0 siblings, 1 reply; 21+ messages in thread
From: Luis R. Rodriguez @ 2017-05-15 22:06 UTC (permalink / raw)
To: Stephen Smalley, Ingo Molnar
Cc: Andy Lutomirski, Michal Hocko, Andrew Morton, Kees Cook,
Eric W. Biederman, Mateusz Guzik, mcgrof, linux-kernel
For a few kernel releases now I have managed to trigger the warning added via
commit e1a58320a38dfa ("x86/mm: Warn on W^X mappings", merged upstream since
v4.4) on my KVM qemu x86_64 system. Since I just booted into the shiny new
linux-next tag next-20170515 (based on v4.12-rc1) and this is still triggering
I figured it's time to tackle this.
Let me know if this is already known or what can be done to try to fix this.
Using QEMU emulator version 2.7.94 (v2.8.0-rc4-dirty)
I will try updating my distro package for qemu to see whether it is the cause
of this and of the other odd fork issue I reported [0].
[0] https://lkml.kernel.org/r/CAB=NE6VZXq3y-3pfouYTBUco2Cq2xqoLZrgDFdVx+_=_=SwG_Q@mail.gmail.com
My config:
http://drvbp1.linux-foundation.org/~mcgrof/2017/05/15/configs/piggy-x86_64_qemu_fork_kmemleak.config
The splat:
[ 0.911209] x86/mm: Found insecure W+X mapping at address ffffffffc0288000/0xffffffffc0288000
[ 0.912066] ------------[ cut here ]------------
[ 0.912544] WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
[ 0.913381] Modules linked in:
[ 0.913672] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc1-next-20170515+ #144
[ 0.914434] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.1-0-g8891697-prebuilt.qemu-project.org 04/01/2014
[ 0.915595] task: ffff98d43a5eac80 task.stack: ffffad22c0630000
[ 0.916174] RIP: 0010:note_page+0x630/0x7e0
[ 0.916595] RSP: 0018:ffffad22c0633df0 EFLAGS: 00010286
[ 0.917101] RAX: 0000000000000051 RBX: ffffad22c0633e88 RCX: ffffffff91256708
[ 0.917805] RDX: 0000000000000000 RSI: 0000000000000096 RDI: 0000000000000246
[ 0.918511] RBP: ffffad22c0633e28 R08: 6666666666666678 R09: 0000000000000160
[ 0.919214] R10: ffffad22c0633dd8 R11: 3030303838323063 R12: 0000000000000000
[ 0.919917] R13: 0000000000000004 R14: 0000000000000000 R15: 0000000000000000
[ 0.920615] FS: 0000000000000000(0000) GS:ffff98d43fc00000(0000) knlGS:0000000000000000
[ 0.921384] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.921943] CR2: 0000000000000000 CR3: 00000000a3a09000 CR4: 00000000000006f0
[ 0.922657] Call Trace:
[ 0.922901] ptdump_walk_pgd_level_core+0x3e7/0x490
[ 0.923354] ? 0xffffffff90600000
[ 0.923662] ptdump_walk_pgd_level_checkwx+0x17/0x20
[ 0.924145] mark_rodata_ro+0xf4/0x100
[ 0.924536] ? rest_init+0x80/0x80
[ 0.924862] kernel_init+0x2f/0x100
[ 0.925197] ret_from_fork+0x2c/0x40
[ 0.925552] Code: 48 c7 43 28 00 00 00 00 48 89 43 20 e9 05 fd ff ff 48 8b 73 10 48 c7 c7 c8 34 fe 90 c6 05 c8 eb bc 00 01 48 89 f2 e8 8d fc 11 00 <0f> ff e9 1f fa ff ff 48 8b 70 20 48 c7 c7 05 b1 fe 90 e8 76 fc
[ 0.927368] ---[ end trace 97137ae213b9cb25 ]---
[ 0.927830] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
Luis
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
2017-05-15 22:06 next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0 Luis R. Rodriguez
@ 2017-05-15 22:15 ` Luis R. Rodriguez
2017-05-15 22:57 ` Kees Cook
2017-05-15 23:30 ` Luis R. Rodriguez
0 siblings, 2 replies; 21+ messages in thread
From: Luis R. Rodriguez @ 2017-05-15 22:15 UTC (permalink / raw)
To: Luis R. Rodriguez
Cc: Stephen Smalley, Ingo Molnar, Andy Lutomirski, Michal Hocko,
Andrew Morton, Kees Cook, Eric W. Biederman, Mateusz Guzik,
linux-kernel
On Tue, May 16, 2017 at 12:06:50AM +0200, Luis R. Rodriguez wrote:
> Using QEMU emulator version 2.7.94 (v2.8.0-rc4-dirty)
>
> I will try updating my distro package for qemu and see if perhaps its this
> and for the other odd fork issue I reported [0].
>
> [0] https://lkml.kernel.org/r/CAB=NE6VZXq3y-3pfouYTBUco2Cq2xqoLZrgDFdVx+_=_=SwG_Q@mail.gmail.com
Yeah, nope. Using my distribution's latest:
QEMU emulator version 2.8.0(openSUSE Tumbleweed)
And still both issues are present.
Luis
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
2017-05-15 22:15 ` Luis R. Rodriguez
@ 2017-05-15 22:57 ` Kees Cook
2017-05-15 23:45 ` Luis R. Rodriguez
2017-05-15 23:30 ` Luis R. Rodriguez
1 sibling, 1 reply; 21+ messages in thread
From: Kees Cook @ 2017-05-15 22:57 UTC (permalink / raw)
To: Luis R. Rodriguez
Cc: Stephen Smalley, Ingo Molnar, Andy Lutomirski, Michal Hocko,
Andrew Morton, Eric W. Biederman, Mateusz Guzik, LKML
On Mon, May 15, 2017 at 3:15 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Tue, May 16, 2017 at 12:06:50AM +0200, Luis R. Rodriguez wrote:
>> Using QEMU emulator version 2.7.94 (v2.8.0-rc4-dirty)
>>
>> I will try updating my distro package for qemu and see if perhaps its this
>> and for the other odd fork issue I reported [0].
>>
>> [0] https://lkml.kernel.org/r/CAB=NE6VZXq3y-3pfouYTBUco2Cq2xqoLZrgDFdVx+_=_=SwG_Q@mail.gmail.com
>
> Yeah nope, using my distribution latest:
>
> QEMU emulator version 2.8.0(openSUSE Tumbleweed)
>
> And still both issues are present.
>
> Luis
Can you enable CONFIG_X86_PTDUMP=y and then find out what is located
at ffffffffc0288000 via /sys/kernel/debug/kernel_page_tables ?
-Kees
--
Kees Cook
Pixel Security
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
2017-05-15 22:15 ` Luis R. Rodriguez
2017-05-15 22:57 ` Kees Cook
@ 2017-05-15 23:30 ` Luis R. Rodriguez
1 sibling, 0 replies; 21+ messages in thread
From: Luis R. Rodriguez @ 2017-05-15 23:30 UTC (permalink / raw)
To: Luis R. Rodriguez
Cc: Stephen Smalley, Ingo Molnar, Andy Lutomirski, Michal Hocko,
Andrew Morton, Kees Cook, Eric W. Biederman, Mateusz Guzik,
linux-kernel
On Mon, May 15, 2017 at 3:15 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Tue, May 16, 2017 at 12:06:50AM +0200, Luis R. Rodriguez wrote:
>> Using QEMU emulator version 2.7.94 (v2.8.0-rc4-dirty)
>>
>> I will try updating my distro package for qemu and see if perhaps its this
>> and for the other odd fork issue I reported [0].
>>
>> [0] https://lkml.kernel.org/r/CAB=NE6VZXq3y-3pfouYTBUco2Cq2xqoLZrgDFdVx+_=_=SwG_Q@mail.gmail.com
>
> Yeah nope, using my distribution latest:
>
> QEMU emulator version 2.8.0(openSUSE Tumbleweed)
>
> And still both issues are present.
FWIW also compiled and tried to boot with the latest qemu, v2.9.0-rc5
and it also has both issues, so I don't think this is because of the
version of qemu.
Luis
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
2017-05-15 22:57 ` Kees Cook
@ 2017-05-15 23:45 ` Luis R. Rodriguez
2017-05-16 0:12 ` Kees Cook
0 siblings, 1 reply; 21+ messages in thread
From: Luis R. Rodriguez @ 2017-05-15 23:45 UTC (permalink / raw)
To: Kees Cook
Cc: Stephen Smalley, Ingo Molnar, Andy Lutomirski, Michal Hocko,
Andrew Morton, Eric W. Biederman, Mateusz Guzik, LKML
On Mon, May 15, 2017 at 3:57 PM, Kees Cook <keescook@chromium.org> wrote:
> On Mon, May 15, 2017 at 3:15 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
>> On Tue, May 16, 2017 at 12:06:50AM +0200, Luis R. Rodriguez wrote:
>>> Using QEMU emulator version 2.7.94 (v2.8.0-rc4-dirty)
>>>
>>> I will try updating my distro package for qemu and see if perhaps its this
>>> and for the other odd fork issue I reported [0].
>>>
>>> [0] https://lkml.kernel.org/r/CAB=NE6VZXq3y-3pfouYTBUco2Cq2xqoLZrgDFdVx+_=_=SwG_Q@mail.gmail.com
>>
>> Yeah nope, using my distribution latest:
>>
>> QEMU emulator version 2.8.0(openSUSE Tumbleweed)
>>
>> And still both issues are present.
>>
>> Luis
>
> Can you enable CONFIG_X86_PTDUMP=y and then find out what is located
> at ffffffffc0288000 via /sys/kernel/debug/kernel_page_tables ?
Sure thing.
Recompiled with this enabled, new warning:
[ 0.891559] x86/mm: Found insecure W+X mapping at address ffffffffc00e4000/0xffffffffc00e4000
[ 0.892394] ------------[ cut here ]------------
[ 0.892834] WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
[ 0.893674] Modules linked in:
[ 0.893972] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc1-next-20170515+ #145
[ 0.894687] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
[ 0.895828] task: ffff8ed7fa5ccc80 task.stack: ffffae3900630000
[ 0.896403] RIP: 0010:note_page+0x630/0x7e0
[ 0.896780] RSP: 0018:ffffae3900633df0 EFLAGS: 00010286
[ 0.897271] RAX: 0000000000000051 RBX: ffffae3900633e88 RCX: ffffffff9b456708
[ 0.897940] RDX: 0000000000000000 RSI: 0000000000000096 RDI: 0000000000000246
[ 0.898624] RBP: ffffae3900633e28 R08: 203a6d6d2f363878 R09: 0000000000000165
[ 0.899314] R10: ffffae3900633dd8 R11: 736e6920646e756f R12: 0000000000000000
[ 0.899987] R13: 0000000000000004 R14: 0000000000000000 R15: 0000000000000000
[ 0.900629] FS: 0000000000000000(0000) GS:ffff8ed7ffc00000(0000) knlGS:0000000000000000
[ 0.901398] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.901908] CR2: 0000000000000000 CR3: 0000000118009000 CR4: 00000000000006f0
[ 0.902590] Call Trace:
[ 0.902827] ptdump_walk_pgd_level_core+0x3e7/0x490
[ 0.903274] ? 0xffffffff9a800000
[ 0.903595] ptdump_walk_pgd_level_checkwx+0x17/0x20
[ 0.904064] mark_rodata_ro+0xf4/0x100
[ 0.904423] ? rest_init+0x80/0x80
[ 0.904744] kernel_init+0x2f/0x100
[ 0.905068] ret_from_fork+0x2c/0x40
[ 0.905393] Code: 48 c7 43 28 00 00 00 00 48 89 43 20 e9 05 fd ff ff 48 8b 73 10 48 c7 c7 28 36 1e 9b c6 05 c8 eb bc 00 01 48 89 f2 e8 cd fc 11 00 <0f> ff e9 1f fa ff ff 48 8b 70 20 48 c7 c7 65 b2 1e 9b e8 b6 fc
[ 0.907173] ---[ end trace 878b39cb0c248e66 ]---
[ 0.907655] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
And ffffffffc00e4000 is:
---[ Modules ]---
0xffffffffc0000000-0xffffffffc00e4000 912K pte
0xffffffffc00e4000-0xffffffffc00e5000 4K RW GLB x pte
In case someone needs the full /sys/kernel/debug/kernel_page_tables file:
http://drvbp1.linux-foundation.org/~mcgrof/2017/05/15/kernel_page_tables/piggy-4.12.0-rc1-next-20170515-page-tables.txt
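For anyone repeating this lookup, here is a minimal Python sketch that finds
the entry covering a given address in text shaped like
/sys/kernel/debug/kernel_page_tables. The function name find_mapping, the
regular expression, and the assumption that each entry sits on one line and
describes a half-open [start, end) range are mine and not anything the kernel
provides. The sample lines are the two entries quoted above.

```python
import re

# Assumed entry layout: "0x<start>-0x<end> <size> [flags...] <level>"
ENTRY = re.compile(r"^0x([0-9a-f]+)-0x([0-9a-f]+)\s+(\S+)\s*(.*)$")

# The two entries from this boot, one entry per line.
SAMPLE = """\
---[ Modules ]---
0xffffffffc0000000-0xffffffffc00e4000 912K pte
0xffffffffc00e4000-0xffffffffc00e5000 4K RW GLB x pte
"""


def find_mapping(dump_text, address):
    """Return (section, start, end, attrs) for the entry covering address.

    Returns None when no entry covers the address. Ranges are treated as
    half-open, so an entry's end address belongs to the next entry.
    """
    section = None
    for line in dump_text.splitlines():
        line = line.strip()
        if line.startswith("---["):
            # Section marker such as "---[ Modules ]---"
            section = line.strip("-[] ")
            continue
        match = ENTRY.match(line)
        if not match:
            continue
        start = int(match.group(1), 16)
        end = int(match.group(2), 16)
        if start <= address < end:
            return section, start, end, match.group(4).split()
    return None


if __name__ == "__main__":
    print(find_mapping(SAMPLE, 0xFFFFFFFFC00E4000))
```

On a live system the same function could be fed the contents of the debugfs
file (readable by root, with debugfs mounted) instead of SAMPLE.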
Luis
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
2017-05-15 23:45 ` Luis R. Rodriguez
@ 2017-05-16 0:12 ` Kees Cook
2017-05-17 16:40 ` Luis R. Rodriguez
0 siblings, 1 reply; 21+ messages in thread
From: Kees Cook @ 2017-05-16 0:12 UTC (permalink / raw)
To: Luis R. Rodriguez
Cc: Stephen Smalley, Ingo Molnar, Andy Lutomirski, Michal Hocko,
Andrew Morton, Eric W. Biederman, Mateusz Guzik, LKML
On Mon, May 15, 2017 at 4:45 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Mon, May 15, 2017 at 3:57 PM, Kees Cook <keescook@chromium.org> wrote:
>> On Mon, May 15, 2017 at 3:15 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
>>> On Tue, May 16, 2017 at 12:06:50AM +0200, Luis R. Rodriguez wrote:
>>>> Using QEMU emulator version 2.7.94 (v2.8.0-rc4-dirty)
>>>>
>>>> I will try updating my distro package for qemu and see if perhaps its this
>>>> and for the other odd fork issue I reported [0].
>>>>
>>>> [0] https://lkml.kernel.org/r/CAB=NE6VZXq3y-3pfouYTBUco2Cq2xqoLZrgDFdVx+_=_=SwG_Q@mail.gmail.com
>>>
>>> Yeah nope, using my distribution latest:
>>>
>>> QEMU emulator version 2.8.0(openSUSE Tumbleweed)
>>>
>>> And still both issues are present.
>>>
>>> Luis
>>
>> Can you enable CONFIG_X86_PTDUMP=y and then find out what is located
>> at ffffffffc0288000 via /sys/kernel/debug/kernel_page_tables ?
>
> Sure thing.
>
> Recompiled with this enabled, new warning:
>
> [ 0.891559] x86/mm: Found insecure W+X mapping at address
> ffffffffc00e4000/0xffffffffc00e4000
> [ 0.892394] ------------[ cut here ]------------
> [ 0.892834] WARNING: CPU: 0 PID: 1 at
> arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
> [ 0.893674] Modules linked in:
> [ 0.893972] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
> 4.12.0-rc1-next-20170515+ #145
> [ 0.894687] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> [ 0.895828] task: ffff8ed7fa5ccc80 task.stack: ffffae3900630000
> [ 0.896403] RIP: 0010:note_page+0x630/0x7e0
> [ 0.896780] RSP: 0018:ffffae3900633df0 EFLAGS: 00010286
> [ 0.897271] RAX: 0000000000000051 RBX: ffffae3900633e88 RCX: ffffffff9b456708
> [ 0.897940] RDX: 0000000000000000 RSI: 0000000000000096 RDI: 0000000000000246
> [ 0.898624] RBP: ffffae3900633e28 R08: 203a6d6d2f363878 R09: 0000000000000165
> [ 0.899314] R10: ffffae3900633dd8 R11: 736e6920646e756f R12: 0000000000000000
> [ 0.899987] R13: 0000000000000004 R14: 0000000000000000 R15: 0000000000000000
> [ 0.900629] FS: 0000000000000000(0000) GS:ffff8ed7ffc00000(0000)
> knlGS:0000000000000000
> [ 0.901398] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 0.901908] CR2: 0000000000000000 CR3: 0000000118009000 CR4: 00000000000006f0
> [ 0.902590] Call Trace:
> [ 0.902827] ptdump_walk_pgd_level_core+0x3e7/0x490
> [ 0.903274] ? 0xffffffff9a800000
> [ 0.903595] ptdump_walk_pgd_level_checkwx+0x17/0x20
> [ 0.904064] mark_rodata_ro+0xf4/0x100
> [ 0.904423] ? rest_init+0x80/0x80
> [ 0.904744] kernel_init+0x2f/0x100
> [ 0.905068] ret_from_fork+0x2c/0x40
> [ 0.905393] Code: 48 c7 43 28 00 00 00 00 48 89 43 20 e9 05 fd ff
> ff 48 8b 73 10 48 c7 c7 28 36 1e 9b c6 05 c8 eb bc 00 01 48 89 f2 e8
> cd fc 11 00 <0f> ff e9 1f fa ff ff 48 8b 70 20 48 c7 c7 65 b2 1e 9b e8
> b6 fc
> [ 0.907173] ---[ end trace 878b39cb0c248e66 ]---
> [ 0.907655] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
>
> And ffffffffc00e4000 is:
>
> ---[ Modules ]---
> 0xffffffffc0000000-0xffffffffc00e4000 912K
> pte
> 0xffffffffc00e4000-0xffffffffc00e5000 4K RW
> GLB x pte
>
> In case someone needs the full /sys/kernel/debug/kernel_page_tables file:
>
> http://drvbp1.linux-foundation.org/~mcgrof/2017/05/15/kernel_page_tables/piggy-4.12.0-rc1-next-20170515-page-tables.txt
---[ Modules ]---
0xffffffffc0000000-0xffffffffc00e4000 912K pte
This should be the modules ASLR gap
0xffffffffc00e4000-0xffffffffc00e5000 4K RW GLB x pte
This is part of the same gap, but it's RW+x strangely?
0xffffffffc00e5000-0xffffffffc00e6000 4K pte
This is more of the gap?
0xffffffffc00e6000-0xffffffffc00fa000 80K ro GLB x pte
0xffffffffc00fa000-0xffffffffc010c000 72K ro GLB NX pte
0xffffffffc010c000-0xffffffffc011b000 60K RW GLB NX pte
This should be the first loaded module. Can you check that
0xffffffffc00e6000 matches the first module in /proc/modules?
Something touched the module gap and left it RW+x...
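To make the reading of the flags explicit, here is a small Python sketch that
picks out the entries that are both writable and executable from a dump in
this format. It assumes "RW" marks a writable entry and a bare "x" (as opposed
to "NX") marks an executable one, which is how the entries above read. The
name wx_entries is made up for illustration, and this is not how the kernel's
own check is implemented.

```python
import re

# Assumed entry prefix: "0x<start>-0x<end>", followed by size, flags, level.
RANGE = re.compile(r"^0x([0-9a-f]+)-0x([0-9a-f]+)\s")

# The module-area entries discussed above, one entry per line.
SAMPLE = """\
---[ Modules ]---
0xffffffffc0000000-0xffffffffc00e4000 912K pte
0xffffffffc00e4000-0xffffffffc00e5000 4K RW GLB x pte
0xffffffffc00e5000-0xffffffffc00e6000 4K pte
0xffffffffc00e6000-0xffffffffc00fa000 80K ro GLB x pte
0xffffffffc00fa000-0xffffffffc010c000 72K ro GLB NX pte
0xffffffffc010c000-0xffffffffc011b000 60K RW GLB NX pte
"""


def wx_entries(dump_text):
    """Return (start, end) for every entry flagged both RW and x."""
    found = []
    for line in dump_text.splitlines():
        match = RANGE.match(line.strip())
        if not match:
            continue  # section markers and blank lines
        tokens = line.split()
        # "x" must appear as its own token; "NX" entries are not executable.
        if "RW" in tokens and "x" in tokens:
            found.append((int(match.group(1), 16), int(match.group(2), 16)))
    return found


if __name__ == "__main__":
    for start, end in wx_entries(SAMPLE):
        print(f"W+X: {start:#x}-{end:#x} ({(end - start) // 1024}K)")
```

Only the 4K entry at 0xffffffffc00e4000 matches; the "ro ... x" text mapping
and the "RW ... NX" data mapping of the module do not.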
Are you able to bisect this?
-Kees
--
Kees Cook
Pixel Security
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
2017-05-16 0:12 ` Kees Cook
@ 2017-05-17 16:40 ` Luis R. Rodriguez
2017-05-17 17:53 ` Kees Cook
0 siblings, 1 reply; 21+ messages in thread
From: Luis R. Rodriguez @ 2017-05-17 16:40 UTC (permalink / raw)
To: Kees Cook
Cc: Luis R. Rodriguez, Stephen Smalley, Ingo Molnar, Andy Lutomirski,
Michal Hocko, Andrew Morton, Eric W. Biederman, Mateusz Guzik,
LKML
On Mon, May 15, 2017 at 05:12:18PM -0700, Kees Cook wrote:
> On Mon, May 15, 2017 at 4:45 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > On Mon, May 15, 2017 at 3:57 PM, Kees Cook <keescook@chromium.org> wrote:
> >> On Mon, May 15, 2017 at 3:15 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> >>> On Tue, May 16, 2017 at 12:06:50AM +0200, Luis R. Rodriguez wrote:
> >>>> Using QEMU emulator version 2.7.94 (v2.8.0-rc4-dirty)
> >>>>
> >>>> I will try updating my distro package for qemu and see if perhaps its this
> >>>> and for the other odd fork issue I reported [0].
> >>>>
> >>>> [0] https://lkml.kernel.org/r/CAB=NE6VZXq3y-3pfouYTBUco2Cq2xqoLZrgDFdVx+_=_=SwG_Q@mail.gmail.com
> >>>
> >>> Yeah nope, using my distribution latest:
> >>>
> >>> QEMU emulator version 2.8.0(openSUSE Tumbleweed)
> >>>
> >>> And still both issues are present.
> >>>
> >>> Luis
> >>
> >> Can you enable CONFIG_X86_PTDUMP=y and then find out what is located
> >> at ffffffffc0288000 via /sys/kernel/debug/kernel_page_tables ?
> >
> > Sure thing.
> >
> > Recompiled with this enabled, new warning:
> >
> > [ 0.891559] x86/mm: Found insecure W+X mapping at address
> > ffffffffc00e4000/0xffffffffc00e4000
> > [ 0.892394] ------------[ cut here ]------------
> > [ 0.892834] WARNING: CPU: 0 PID: 1 at
> > arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
> > [ 0.893674] Modules linked in:
> > [ 0.893972] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
> > 4.12.0-rc1-next-20170515+ #145
> > [ 0.894687] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> > BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> > [ 0.895828] task: ffff8ed7fa5ccc80 task.stack: ffffae3900630000
> > [ 0.896403] RIP: 0010:note_page+0x630/0x7e0
> > [ 0.896780] RSP: 0018:ffffae3900633df0 EFLAGS: 00010286
> > [ 0.897271] RAX: 0000000000000051 RBX: ffffae3900633e88 RCX: ffffffff9b456708
> > [ 0.897940] RDX: 0000000000000000 RSI: 0000000000000096 RDI: 0000000000000246
> > [ 0.898624] RBP: ffffae3900633e28 R08: 203a6d6d2f363878 R09: 0000000000000165
> > [ 0.899314] R10: ffffae3900633dd8 R11: 736e6920646e756f R12: 0000000000000000
> > [ 0.899987] R13: 0000000000000004 R14: 0000000000000000 R15: 0000000000000000
> > [ 0.900629] FS: 0000000000000000(0000) GS:ffff8ed7ffc00000(0000)
> > knlGS:0000000000000000
> > [ 0.901398] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 0.901908] CR2: 0000000000000000 CR3: 0000000118009000 CR4: 00000000000006f0
> > [ 0.902590] Call Trace:
> > [ 0.902827] ptdump_walk_pgd_level_core+0x3e7/0x490
> > [ 0.903274] ? 0xffffffff9a800000
> > [ 0.903595] ptdump_walk_pgd_level_checkwx+0x17/0x20
> > [ 0.904064] mark_rodata_ro+0xf4/0x100
> > [ 0.904423] ? rest_init+0x80/0x80
> > [ 0.904744] kernel_init+0x2f/0x100
> > [ 0.905068] ret_from_fork+0x2c/0x40
> > [ 0.905393] Code: 48 c7 43 28 00 00 00 00 48 89 43 20 e9 05 fd ff
> > ff 48 8b 73 10 48 c7 c7 28 36 1e 9b c6 05 c8 eb bc 00 01 48 89 f2 e8
> > cd fc 11 00 <0f> ff e9 1f fa ff ff 48 8b 70 20 48 c7 c7 65 b2 1e 9b e8
> > b6 fc
> > [ 0.907173] ---[ end trace 878b39cb0c248e66 ]---
> > [ 0.907655] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
> >
> > And ffffffffc00e4000 is:
> >
> > ---[ Modules ]---
> > 0xffffffffc0000000-0xffffffffc00e4000 912K
> > pte
> > 0xffffffffc00e4000-0xffffffffc00e5000 4K RW
> > GLB x pte
> >
> > In case someone needs the full /sys/kernel/debug/kernel_page_tables file:
> >
> > http://drvbp1.linux-foundation.org/~mcgrof/2017/05/15/kernel_page_tables/piggy-4.12.0-rc1-next-20170515-page-tables.txt
>
> ---[ Modules ]---
> 0xffffffffc0000000-0xffffffffc00e4000 912K
> pte
>
> This should be the modules ASLR gap
>
> 0xffffffffc00e4000-0xffffffffc00e5000 4K RW
> GLB x pte
>
> This is part of the same gap, but it's RW+x strangely?
>
> 0xffffffffc00e5000-0xffffffffc00e6000 4K
> pte
>
> This is more of the gap?
>
> 0xffffffffc00e6000-0xffffffffc00fa000 80K ro
> GLB x pte
> 0xffffffffc00fa000-0xffffffffc010c000 72K ro
> GLB NX pte
> 0xffffffffc010c000-0xffffffffc011b000 60K RW
> GLB NX pte
>
> This should be the first loaded module. Can you check that
> 0xffffffffc00e6000 matches the first module in /proc/modules?
Yes, but I had killed that boot session again, so upon my next boot
I had a different layout, the ASLR gap was much larger:
---[ Modules ]---
0xffffffffc0000000-0xffffffffc01b0000 1728K pte
0xffffffffc01b0000-0xffffffffc01b1000 4K RW GLB x pte
0xffffffffc01b1000-0xffffffffc01b2000 4K pte
0xffffffffc01b2000-0xffffffffc01c6000 80K ro GLB x pte
0xffffffffc01c6000-0xffffffffc01cc000 24K ro GLB NX pte
0xffffffffc01cc000-0xffffffffc01d5000 36K RW GLB NX pte
As you can guess, if we follow a similar pattern, the RW hole is the one this
boot warned about:
[ 1.450483] x86/mm: Found insecure W+X mapping at address ffffffffc01b0000/0xffffffffc01b0000
[ 1.451280] ------------[ cut here ]------------
[ 1.451721] WARNING: CPU: 1 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
[ 1.452499] Modules linked in:
[ 1.452791] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc1-next-20170515+ #145
I checked and indeed 0xffffffffc01b2000 is part of a module. It was not the first one
on the /proc/modules list, but then again /proc/modules does not seem to have a specific
order other than perhaps being pegged into a linked list of modules once they go live,
and it seems it's typically output backwards from when that happened. Sorting that
by address we get:
root@piggy:~# cat /proc/modules | sort -k 6 | head -3
e1000 143360 0 - Live 0xffffffffc01b2000 (E)
mbcache 16384 1 ext4, Live 0xffffffffc01d6000 (E)
scsi_mod 217088 4 sg,sr_mod,sd_mod,libata, Live 0xffffffffc01df000 (E)
And this then seems to be the first module loaded:
e1000 143360 0 - Live 0xffffffffc01b2000 (E)
The dmesg output seems to confirm this: it matches the list of modules as
sorted above.
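The sort above compares the address field as text, which works here because
all the addresses have the same width. A minimal Python sketch of the same
idea with a numeric comparison follows. It assumes the load address is the
sixth whitespace-separated field, as in the lines above; the function name
modules_by_address is hypothetical, and the sample is the three lines shown
above in shuffled order.

```python
# Three lines in /proc/modules format, taken from the output above,
# deliberately out of address order.
SAMPLE = """\
scsi_mod 217088 4 sg,sr_mod,sd_mod,libata, Live 0xffffffffc01df000 (E)
e1000 143360 0 - Live 0xffffffffc01b2000 (E)
mbcache 16384 1 ext4, Live 0xffffffffc01d6000 (E)
"""


def modules_by_address(text):
    """Return (name, address) pairs sorted by numeric load address."""
    modules = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        # fields: name, size, refcount, users, state, address, [taint]
        modules.append((fields[0], int(fields[5], 16)))
    return sorted(modules, key=lambda item: item[1])


if __name__ == "__main__":
    for name, address in modules_by_address(SAMPLE):
        print(f"{address:#018x} {name}")
```

The lowest address, and so the first entry after sorting, belongs to e1000.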
> Something touched the module gap and left is RW+x...
Lemme try booting with e1000 renamed to e1000.ko.ignore and see how that goes.
> Are you able to bisect this?
This issue has been present for a while, so since I recall it I might be
able to reduce the number of target kernels needed to bisect. Lemme tinker
a bit, and if no clear culprit comes up I will try to bisect.
Luis
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
2017-05-17 16:40 ` Luis R. Rodriguez
@ 2017-05-17 17:53 ` Kees Cook
2017-05-19 0:44 ` Luis R. Rodriguez
0 siblings, 1 reply; 21+ messages in thread
From: Kees Cook @ 2017-05-17 17:53 UTC (permalink / raw)
To: Luis R. Rodriguez
Cc: Stephen Smalley, Ingo Molnar, Andy Lutomirski, Michal Hocko,
Andrew Morton, Eric W. Biederman, Mateusz Guzik, LKML
On Wed, May 17, 2017 at 9:40 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> Yes, but I had killed that boot session again, so upon my next boot
> I had a different layout, the ASLR gap was much larger:
>
> ---[ Modules ]---
> 0xffffffffc0000000-0xffffffffc01b0000 1728K pte
> 0xffffffffc01b0000-0xffffffffc01b1000 4K RW GLB x pte
> 0xffffffffc01b1000-0xffffffffc01b2000 4K pte
> 0xffffffffc01b2000-0xffffffffc01c6000 80K ro GLB x pte
> 0xffffffffc01c6000-0xffffffffc01cc000 24K ro GLB NX pte
> 0xffffffffc01cc000-0xffffffffc01d5000 36K RW GLB NX pte
>
> As you can guess if we follow similar pattern the RW hole is the one this boot
> warned about:
>
> [ 1.450483] x86/mm: Found insecure W+X mapping at address ffffffffc01b0000/0xffffffffc01b0000
> [ 1.451280] ------------[ cut here ]------------
> [ 1.451721] WARNING: CPU: 1 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
> [ 1.452499] Modules linked in:
> [ 1.452791] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc1-next-20170515+ #145
>
> I checked and indeed 0xffffffffc01b2000 is part of a module, it was not the first one
> on the /proc/modules list but then again /proc/modules does not seem to have a specific
> order other than perhaps being pegged into a linked list of modules once they go live,
> and it seems its typically output backwards from when that happened, sorting that
> by address we get:
Right, sorry, I'd expect it at the bottom of the list in
/proc/modules, but that's fine, it's there.
>
> root@piggy:~# cat /proc/modules | sort -k 6 | head -3
> e1000 143360 0 - Live 0xffffffffc01b2000 (E)
> mbcache 16384 1 ext4, Live 0xffffffffc01d6000 (E)
> scsi_mod 217088 4 sg,sr_mod,sd_mod,libata, Live 0xffffffffc01df000 (E)
>
> And this then seems to be the first module loaded:
>
> e1000 143360 0 - Live 0xffffffffc01b2000 (E)
>
> The output of dmesg seems to confirm this as per the list of modules sorted
> as per above.
>
>> Something touched the module gap and left is RW+x...
>
> Lemme try booting with e1000 renamed to e1000.ko.ignore and see how that goes.
Is it possible a module got loaded before e1000 and then unloaded?
That seems odd, but maybe unload isn't cleaning up?
>> Are you able to bisect this?
>
> This issue has been present for a while so since I recall this I might be
> able to reduce the number of needed target kernels to bisect. Lemme tinker
> a bit and if no clear culprit comes up then will try bisect.
Okay, thanks!
-Kees
--
Kees Cook
Pixel Security
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
2017-05-17 17:53 ` Kees Cook
@ 2017-05-19 0:44 ` Luis R. Rodriguez
2017-05-19 3:08 ` Luis R. Rodriguez
0 siblings, 1 reply; 21+ messages in thread
From: Luis R. Rodriguez @ 2017-05-19 0:44 UTC (permalink / raw)
To: Kees Cook
Cc: Luis R. Rodriguez, Stephen Smalley, Ingo Molnar, Andy Lutomirski,
Michal Hocko, Vlastimil Babka, Andrew Morton, Eric W. Biederman,
Mateusz Guzik, LKML
On Wed, May 17, 2017 at 10:53:06AM -0700, Kees Cook wrote:
> On Wed, May 17, 2017 at 9:40 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > Yes, but I had killed that boot session again, so upon my next boot
> > I had a different layout, the ASLR gap was much larger:
> >
> > ---[ Modules ]---
> > 0xffffffffc0000000-0xffffffffc01b0000 1728K pte
> > 0xffffffffc01b0000-0xffffffffc01b1000 4K RW GLB x pte
> > 0xffffffffc01b1000-0xffffffffc01b2000 4K pte
> > 0xffffffffc01b2000-0xffffffffc01c6000 80K ro GLB x pte
> > 0xffffffffc01c6000-0xffffffffc01cc000 24K ro GLB NX pte
> > 0xffffffffc01cc000-0xffffffffc01d5000 36K RW GLB NX pte
> >
> > As you can guess if we follow similar pattern the RW hole is the one this boot
> > warned about:
> >
> > [ 1.450483] x86/mm: Found insecure W+X mapping at address ffffffffc01b0000/0xffffffffc01b0000
> > [ 1.451280] ------------[ cut here ]------------
> > [ 1.451721] WARNING: CPU: 1 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
> > [ 1.452499] Modules linked in:
> > [ 1.452791] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc1-next-20170515+ #145
> >
> > I checked and indeed 0xffffffffc01b2000 is part of a module, it was not the first one
> > on the /proc/modules list but then again /proc/modules does not seem to have a specific
> > order other than perhaps being pegged into a linked list of modules once they go live,
> > and it seems its typically output backwards from when that happened, sorting that
> > by address we get:
>
> Right, sorry, I'd expect it at the bottom of the list in
> /proc/modules, but that's fine, it's there.
>
> >
> > root@piggy:~# cat /proc/modules | sort -k 6 | head -3
> > e1000 143360 0 - Live 0xffffffffc01b2000 (E)
> > mbcache 16384 1 ext4, Live 0xffffffffc01d6000 (E)
> > scsi_mod 217088 4 sg,sr_mod,sd_mod,libata, Live 0xffffffffc01df000 (E)
> >
> > And this then seems to be the first module loaded:
> >
> > e1000 143360 0 - Live 0xffffffffc01b2000 (E)
> >
> > The output of dmesg seems to confirm this as per the list of modules sorted
> > as per above.
> >
> >> Something touched the module gap and left is RW+x...
> >
> > Lemme try booting with e1000 renamed to e1000.ko.ignore and see how that goes.
>
> Is it possible a module got loaded before e1000 and then unloaded?
> That seems odd, but maybe unload isn't cleaning up?
>
> >> Are you able to bisect this?
> >
> > This issue has been present for a while so since I recall this I might be
> > able to reduce the number of needed target kernels to bisect. Lemme tinker
> > a bit and if no clear culprit comes up then will try bisect.
>
> Okay, thanks!
Sorry to report that this issue has been present since the feature was added,
and it is still present today. *But* it may also be a configuration issue,
given I have booted this guest *without* this issue ...
So:
git checkout -b WX e1a58320a38dfa72be48a0f1a3a92273663ba6db
That boots with the warning. To help debug further I've minimized my modules
to only a few: scsi_mod, e1000, libata.
I suspect at this point this is not the fault of a particular module but
instead just an accounting semantic (>= or <= on an edge) but let's see.
I now boot 4.3.0-rc3 at commit e1a58320a38df ("x86/mm: Warn on W^X
mappings") and I get:
[ 0.949435] ------------[ cut here ]------------
[ 0.949992] WARNING: CPU: 2 PID: 1 at arch/x86/mm/dump_pagetables.c:225 note_page+0x635/0x7e0()
[ 0.950996] x86/mm: Found insecure W+X mapping at address ffffffffc0000000/0xffffffffc0000000
[ 0.951814] Modules linked in:
[ 0.952123] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.3.0-rc3-FINAL-TEST-WITH-WX-NOFLOPPY+ #365
[ 0.952929] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
[ 0.954033] 0000000000000000 000000001f722925 ffff88013a5d7d40 ffffffff812ff335
[ 0.954742] ffff88013a5d7d88 ffff88013a5d7d78 ffffffff81079be2 ffff88013a5d7e90
[ 0.955522] 0000000000000000 0000000000000004 0000000000000000 0000000000000000
[ 0.956256] Call Trace:
[ 0.956496] [<ffffffff812ff335>] dump_stack+0x44/0x5f
[ 0.956953] [<ffffffff81079be2>] warn_slowpath_common+0x82/0xc0
[ 0.957519] [<ffffffff81079c7c>] warn_slowpath_fmt+0x5c/0x80
[ 0.958066] [<ffffffff8106c155>] note_page+0x635/0x7e0
[ 0.958595] [<ffffffff8106c5eb>] ptdump_walk_pgd_level_core+0x2eb/0x410
[ 0.959219] [<ffffffff8106c7b7>] ptdump_walk_pgd_level_checkwx+0x17/0x20
[ 0.959856] [<ffffffff8106260d>] mark_rodata_ro+0xed/0x100
[ 0.960372] [<ffffffff815aa7d0>] ? rest_init+0x80/0x80
[ 0.960869] [<ffffffff815aa7ed>] kernel_init+0x1d/0xe0
[ 0.961358] [<ffffffff815b798f>] ret_from_fork+0x3f/0x70
[ 0.961900] [<ffffffff815aa7d0>] ? rest_init+0x80/0x80
[ 0.962389] ---[ end trace 6125ebcb24c9e3d0 ]---
[ 0.962822] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
---[ High Kernel Mapping ]---
0xffffffff80000000-0xffffffff81000000 16M pmd
0xffffffff81000000-0xffffffff81600000 6M ro PSE GLB x pmd
0xffffffff81600000-0xffffffff81a00000 4M ro PSE GLB NX pmd
0xffffffff81a00000-0xffffffff81c00000 2M RW GLB NX pte
0xffffffff81c00000-0xffffffff82200000 6M RW PSE GLB NX pmd
0xffffffff82200000-0xffffffff82400000 2M RW GLB NX pte
0xffffffff82400000-0xffffffffc0000000 988M pmd
---[ Modules ]---
0xffffffffc0000000-0xffffffffc0001000 4K RW GLB x pte
0xffffffffc0001000-0xffffffffc0002000 4K pte
0xffffffffc0002000-0xffffffffc0039000 220K RW GLB x pte
root@piggy:~# cat /proc/modules | sort -k 6 | head -3
scsi_mod 221979 4 sg,sd_mod,sr_mod,libata, Live 0xffffffffc0002000 (E)
e1000 127757 0 - Live 0xffffffffc004d000 (E)
libata 229931 2 ata_generic,ata_piix, Live 0xffffffffc0076000 (E)
So that 4K RW page looks suspect: it seems to get used for allocation purposes
on an edge for some particular reason, and it also happens to sit on the edge
of the high kernel mapping. Could this be a boundary semantic issue?
For instance, could it be that since 0xffffffffc0002000 is given to the first
module, scsi_mod, by the allocator, and since that address is *technically*
part of two boundaries, we get a splat?
0xffffffffc0001000-0xffffffffc0002000 4K pte
0xffffffffc0002000-0xffffffffc0039000 220K RW GLB x pte
Luis
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
2017-05-19 0:44 ` Luis R. Rodriguez
@ 2017-05-19 3:08 ` Luis R. Rodriguez
2017-05-19 15:40 ` Luis R. Rodriguez
0 siblings, 1 reply; 21+ messages in thread
From: Luis R. Rodriguez @ 2017-05-19 3:08 UTC (permalink / raw)
To: Luis R. Rodriguez
Cc: Kees Cook, Stephen Smalley, Ingo Molnar, Andy Lutomirski,
Michal Hocko, Vlastimil Babka, Andrew Morton, Eric W. Biederman,
Mateusz Guzik, LKML
On Fri, May 19, 2017 at 02:44:14AM +0200, Luis R. Rodriguez wrote:
> On Wed, May 17, 2017 at 10:53:06AM -0700, Kees Cook wrote:
> > On Wed, May 17, 2017 at 9:40 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > > Yes, but I had killed that boot session again, so upon my next boot
> > > I had a different layout, the ASLR gap was much larger:
> > >
> > > ---[ Modules ]---
> > > 0xffffffffc0000000-0xffffffffc01b0000 1728K pte
> > > 0xffffffffc01b0000-0xffffffffc01b1000 4K RW GLB x pte
> > > 0xffffffffc01b1000-0xffffffffc01b2000 4K pte
> > > 0xffffffffc01b2000-0xffffffffc01c6000 80K ro GLB x pte
> > > 0xffffffffc01c6000-0xffffffffc01cc000 24K ro GLB NX pte
> > > 0xffffffffc01cc000-0xffffffffc01d5000 36K RW GLB NX pte
> > >
> > > As you can guess if we follow similar pattern the RW hole is the one this boot
> > > warned about:
> > >
> > > [ 1.450483] x86/mm: Found insecure W+X mapping at address ffffffffc01b0000/0xffffffffc01b0000
> > > [ 1.451280] ------------[ cut here ]------------
> > > [ 1.451721] WARNING: CPU: 1 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
> > > [ 1.452499] Modules linked in:
> > > [ 1.452791] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc1-next-20170515+ #145
> > >
> > > I checked and indeed 0xffffffffc01b2000 is part of a module, it was not the first one
> > > on the /proc/modules list but then again /proc/modules does not seem to have a specific
> > > order other than perhaps being pegged into a linked list of modules once they go live,
> > > and it seems its typically output backwards from when that happened, sorting that
> > > by address we get:
> >
> > Right, sorry, I'd expect it at the bottom of the list in
> > /proc/modules, but that's fine, it's there.
> >
> > >
> > > root@piggy:~# cat /proc/modules | sort -k 6 | head -3
> > > e1000 143360 0 - Live 0xffffffffc01b2000 (E)
> > > mbcache 16384 1 ext4, Live 0xffffffffc01d6000 (E)
> > > scsi_mod 217088 4 sg,sr_mod,sd_mod,libata, Live 0xffffffffc01df000 (E)
> > >
> > > And this then seems to be the first module loaded:
> > >
> > > e1000 143360 0 - Live 0xffffffffc01b2000 (E)
> > >
> > > The output of dmesg seems to confirm this as per the list of modules sorted
> > > as per above.
> > >
> > >> Something touched the module gap and left is RW+x...
> > >
> > > Lemme try booting with e1000 renamed to e1000.ko.ignore and see how that goes.
> >
> > Is it possible a module got loaded before e1000 and then unloaded?
> > That seems odd, but maybe unload isn't cleaning up?
> >
> > >> Are you able to bisect this?
> > >
> > > This issue has been present for a while so since I recall this I might be
> > > able to reduce the number of needed target kernels to bisect. Lemme tinker
> > > a bit and if no clear culprit comes up then will try bisect.
> >
> > Okay, thanks!
>
> Sorry to report that this issue is present since the feature's addition. So
> the issue is there since its addition and is still present today. *But* it
> may also be a configuration issue, given I have booted this guest *without*
> this issue ...
>
> So:
>
> git checkout -b WX e1a58320a38dfa72be48a0f1a3a92273663ba6db
>
> That boots with the warning. To help debug further I've minimized my modules
> to only a few: scsi_mod, e1000, libata.
>
> I suspect at this point this is not the fault of a particular module but
> instead just an accounting semantic (>= or <= on an edge) but let's see.
>
> I now boot on 4.3.0-rc3 on commit (e1a58320a38df ("x86/mm: Warn on W^X
> mappings") and I with:
>
> [ 0.949435] ------------[ cut here ]------------
> [ 0.949992] WARNING: CPU: 2 PID: 1 at arch/x86/mm/dump_pagetables.c:225 note_page+0x635/0x7e0()
> [ 0.950996] x86/mm: Found insecure W+X mapping at address ffffffffc0000000/0xffffffffc0000000
> [ 0.951814] Modules linked in:
> [ 0.952123] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.3.0-rc3-FINAL-TEST-WITH-WX-NOFLOPPY+ #365
> [ 0.952929] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> [ 0.954033] 0000000000000000 000000001f722925 ffff88013a5d7d40 ffffffff812ff335
> [ 0.954742] ffff88013a5d7d88 ffff88013a5d7d78 ffffffff81079be2 ffff88013a5d7e90
> [ 0.955522] 0000000000000000 0000000000000004 0000000000000000 0000000000000000
> [ 0.956256] Call Trace:
> [ 0.956496] [<ffffffff812ff335>] dump_stack+0x44/0x5f
> [ 0.956953] [<ffffffff81079be2>] warn_slowpath_common+0x82/0xc0
> [ 0.957519] [<ffffffff81079c7c>] warn_slowpath_fmt+0x5c/0x80
> [ 0.958066] [<ffffffff8106c155>] note_page+0x635/0x7e0
> [ 0.958595] [<ffffffff8106c5eb>] ptdump_walk_pgd_level_core+0x2eb/0x410
> [ 0.959219] [<ffffffff8106c7b7>] ptdump_walk_pgd_level_checkwx+0x17/0x20
> [ 0.959856] [<ffffffff8106260d>] mark_rodata_ro+0xed/0x100
> [ 0.960372] [<ffffffff815aa7d0>] ? rest_init+0x80/0x80
> [ 0.960869] [<ffffffff815aa7ed>] kernel_init+0x1d/0xe0
> [ 0.961358] [<ffffffff815b798f>] ret_from_fork+0x3f/0x70
> [ 0.961900] [<ffffffff815aa7d0>] ? rest_init+0x80/0x80
> [ 0.962389] ---[ end trace 6125ebcb24c9e3d0 ]---
> [ 0.962822] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
>
>
> ---[ High Kernel Mapping ]---
> 0xffffffff80000000-0xffffffff81000000 16M pmd
> 0xffffffff81000000-0xffffffff81600000 6M ro PSE GLB x pmd
> 0xffffffff81600000-0xffffffff81a00000 4M ro PSE GLB NX pmd
> 0xffffffff81a00000-0xffffffff81c00000 2M RW GLB NX pte
> 0xffffffff81c00000-0xffffffff82200000 6M RW PSE GLB NX pmd
> 0xffffffff82200000-0xffffffff82400000 2M RW GLB NX pte
> 0xffffffff82400000-0xffffffffc0000000 988M pmd
> ---[ Modules ]---
> 0xffffffffc0000000-0xffffffffc0001000 4K RW GLB x pte
> 0xffffffffc0001000-0xffffffffc0002000 4K pte
> 0xffffffffc0002000-0xffffffffc0039000 220K RW GLB x pte
>
> root@piggy:~# cat /proc/modules | sort -k 6 | head -3
> scsi_mod 221979 4 sg,sd_mod,sr_mod,libata, Live 0xffffffffc0002000 (E)
> e1000 127757 0 - Live 0xffffffffc004d000 (E)
> libata 229931 2 ata_generic,ata_piix, Live 0xffffffffc0076000 (E)
>
> So that 4K RW seems suspect of getting used for allocation purpose on edge
> for a particular reason and it also happens to be on the edge of the high
> kernel mapping. Could it be the boundary semantic issue ?
>
> For instance can it be that since 0xffffffffc0002000 is given to the first
> module by the allocator, scsi_mod, and since that address is *technically*
> part of two boundaries we get a splat ?
>
> 0xffffffffc0001000-0xffffffffc0002000 4K pte
> 0xffffffffc0002000-0xffffffffc0039000 220K RW GLB x pte
Note that both on the latest linux-next and on the commit that introduced this
check, this config and kernel yield only *one* page:
x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
I believe this is a further indication that my suspicion might be right.
Luis
* Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
2017-05-19 3:08 ` Luis R. Rodriguez
@ 2017-05-19 15:40 ` Luis R. Rodriguez
2017-05-19 17:28 ` Luis R. Rodriguez
2017-05-19 17:35 ` Catalin Marinas
0 siblings, 2 replies; 21+ messages in thread
From: Luis R. Rodriguez @ 2017-05-19 15:40 UTC (permalink / raw)
To: Catalin Marinas, Steven Rostedt
Cc: Luis R. Rodriguez, Kees Cook, Stephen Smalley, Ingo Molnar,
Andy Lutomirski, Michal Hocko, Vlastimil Babka, Andrew Morton,
Eric W. Biederman, Mateusz Guzik, LKML
On Fri, May 19, 2017 at 05:08:02AM +0200, Luis R. Rodriguez wrote:
> On Fri, May 19, 2017 at 02:44:14AM +0200, Luis R. Rodriguez wrote:
> > On Wed, May 17, 2017 at 10:53:06AM -0700, Kees Cook wrote:
> > > On Wed, May 17, 2017 at 9:40 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > > > Yes, but I had killed that boot session again, so upon my next boot
> > > > I had a different layout, the ASLR gap was much larger:
> > > >
> > > > ---[ Modules ]---
> > > > 0xffffffffc0000000-0xffffffffc01b0000 1728K pte
> > > > 0xffffffffc01b0000-0xffffffffc01b1000 4K RW GLB x pte
> > > > 0xffffffffc01b1000-0xffffffffc01b2000 4K pte
> > > > 0xffffffffc01b2000-0xffffffffc01c6000 80K ro GLB x pte
> > > > 0xffffffffc01c6000-0xffffffffc01cc000 24K ro GLB NX pte
> > > > 0xffffffffc01cc000-0xffffffffc01d5000 36K RW GLB NX pte
> > > >
> > > > As you can guess if we follow similar pattern the RW hole is the one this boot
> > > > warned about:
> > > >
> > > > [ 1.450483] x86/mm: Found insecure W+X mapping at address ffffffffc01b0000/0xffffffffc01b0000
> > > > [ 1.451280] ------------[ cut here ]------------
> > > > [ 1.451721] WARNING: CPU: 1 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
> > > > [ 1.452499] Modules linked in:
> > > > [ 1.452791] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc1-next-20170515+ #145
> > > >
> > > > I checked and indeed 0xffffffffc01b2000 is part of a module, it was not the first one
> > > > on the /proc/modules list but then again /proc/modules does not seem to have a specific
> > > > order other than perhaps being pegged into a linked list of modules once they go live,
> > > > and it seems its typically output backwards from when that happened, sorting that
> > > > by address we get:
> > >
> > > Right, sorry, I'd expect it at the bottom of the list in
> > > /proc/modules, but that's fine, it's there.
> > >
> > > >
> > > > root@piggy:~# cat /proc/modules | sort -k 6 | head -3
> > > > e1000 143360 0 - Live 0xffffffffc01b2000 (E)
> > > > mbcache 16384 1 ext4, Live 0xffffffffc01d6000 (E)
> > > > scsi_mod 217088 4 sg,sr_mod,sd_mod,libata, Live 0xffffffffc01df000 (E)
> > > >
> > > > And this then seems to be the first module loaded:
> > > >
> > > > e1000 143360 0 - Live 0xffffffffc01b2000 (E)
> > > >
> > > > The output of dmesg seems to confirm this as per the list of modules sorted
> > > > as per above.
> > > >
> > > >> Something touched the module gap and left is RW+x...
> > > >
> > > > Lemme try booting with e1000 renamed to e1000.ko.ignore and see how that goes.
> > >
> > > Is it possible a module got loaded before e1000 and then unloaded?
> > > That seems odd, but maybe unload isn't cleaning up?
> > >
> > > >> Are you able to bisect this?
> > > >
> > > > This issue has been present for a while so since I recall this I might be
> > > > able to reduce the number of needed target kernels to bisect. Lemme tinker
> > > > a bit and if no clear culprit comes up then will try bisect.
> > >
> > > Okay, thanks!
> >
> > Sorry to report that this issue is present since the feature's addition. So
> > the issue is there since its addition and is still present today. *But* it
> > may also be a configuration issue, given I have booted this guest *without*
> > this issue ...
> >
> > So:
> >
> > git checkout -b WX e1a58320a38dfa72be48a0f1a3a92273663ba6db
> >
> > That boots with the warning. To help debug further I've minimized my modules
> > to only a few: scsi_mod, e1000, libata.
> >
> > I suspect at this point this is not the fault of a particular module but
> > instead just an accounting semantic (>= or <= on an edge) but let's see.
> >
> > I now boot on 4.3.0-rc3 on commit (e1a58320a38df ("x86/mm: Warn on W^X
> > mappings") and I with:
> >
> > [ 0.949435] ------------[ cut here ]------------
> > [ 0.949992] WARNING: CPU: 2 PID: 1 at arch/x86/mm/dump_pagetables.c:225 note_page+0x635/0x7e0()
> > [ 0.950996] x86/mm: Found insecure W+X mapping at address ffffffffc0000000/0xffffffffc0000000
> > [ 0.951814] Modules linked in:
> > [ 0.952123] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.3.0-rc3-FINAL-TEST-WITH-WX-NOFLOPPY+ #365
> > [ 0.952929] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> > [ 0.954033] 0000000000000000 000000001f722925 ffff88013a5d7d40 ffffffff812ff335
> > [ 0.954742] ffff88013a5d7d88 ffff88013a5d7d78 ffffffff81079be2 ffff88013a5d7e90
> > [ 0.955522] 0000000000000000 0000000000000004 0000000000000000 0000000000000000
> > [ 0.956256] Call Trace:
> > [ 0.956496] [<ffffffff812ff335>] dump_stack+0x44/0x5f
> > [ 0.956953] [<ffffffff81079be2>] warn_slowpath_common+0x82/0xc0
> > [ 0.957519] [<ffffffff81079c7c>] warn_slowpath_fmt+0x5c/0x80
> > [ 0.958066] [<ffffffff8106c155>] note_page+0x635/0x7e0
> > [ 0.958595] [<ffffffff8106c5eb>] ptdump_walk_pgd_level_core+0x2eb/0x410
> > [ 0.959219] [<ffffffff8106c7b7>] ptdump_walk_pgd_level_checkwx+0x17/0x20
> > [ 0.959856] [<ffffffff8106260d>] mark_rodata_ro+0xed/0x100
> > [ 0.960372] [<ffffffff815aa7d0>] ? rest_init+0x80/0x80
> > [ 0.960869] [<ffffffff815aa7ed>] kernel_init+0x1d/0xe0
> > [ 0.961358] [<ffffffff815b798f>] ret_from_fork+0x3f/0x70
> > [ 0.961900] [<ffffffff815aa7d0>] ? rest_init+0x80/0x80
> > [ 0.962389] ---[ end trace 6125ebcb24c9e3d0 ]---
> > [ 0.962822] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
> >
> >
> > ---[ High Kernel Mapping ]---
> > 0xffffffff80000000-0xffffffff81000000 16M pmd
> > 0xffffffff81000000-0xffffffff81600000 6M ro PSE GLB x pmd
> > 0xffffffff81600000-0xffffffff81a00000 4M ro PSE GLB NX pmd
> > 0xffffffff81a00000-0xffffffff81c00000 2M RW GLB NX pte
> > 0xffffffff81c00000-0xffffffff82200000 6M RW PSE GLB NX pmd
> > 0xffffffff82200000-0xffffffff82400000 2M RW GLB NX pte
> > 0xffffffff82400000-0xffffffffc0000000 988M pmd
> > ---[ Modules ]---
> > 0xffffffffc0000000-0xffffffffc0001000 4K RW GLB x pte
> > 0xffffffffc0001000-0xffffffffc0002000 4K pte
> > 0xffffffffc0002000-0xffffffffc0039000 220K RW GLB x pte
> >
> > root@piggy:~# cat /proc/modules | sort -k 6 | head -3
> > scsi_mod 221979 4 sg,sd_mod,sr_mod,libata, Live 0xffffffffc0002000 (E)
> > e1000 127757 0 - Live 0xffffffffc004d000 (E)
> > libata 229931 2 ata_generic,ata_piix, Live 0xffffffffc0076000 (E)
> >
> > So that 4K RW seems suspect of getting used for allocation purpose on edge
> > for a particular reason and it also happens to be on the edge of the high
> > kernel mapping. Could it be the boundary semantic issue ?
> >
> > For instance can it be that since 0xffffffffc0002000 is given to the first
> > module by the allocator, scsi_mod, and since that address is *technically*
> > part of two boundaries we get a splat ?
> >
> > 0xffffffffc0001000-0xffffffffc0002000 4K pte
> > 0xffffffffc0002000-0xffffffffc0039000 220K RW GLB x pte
>
> Note on the latest linux-next and on the commit that introduced this the config
> and kernel yields only *one* page:
>
> x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
>
> I believe this is more indications my suspicion might be right.
If the following is a legit way to force the kernel to tell us who owns a
page, then perhaps this technique can be used in the future to figure out
who the hell caused this. Catalin, can you confirm? In this case this is
perhaps not a leaked page, but I am trying to abuse the kmemleak debugfs
API to query who allocated the page. Is that fine?
[ 0.916771] WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:235 note_page+0x63c/0x7e0
[ 0.917636] x86/mm: Found insecure W+X mapping at address ffffffffc03d5000/0xffffffffc03d5000
[ 0.918502] Modules linked in:
[ 0.918819] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.11.0-mcgrof-force-config #340
[ 0.919631] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
[ 0.920011] Call Trace:
[ 0.920011] dump_stack+0x63/0x81
[ 0.920011] __warn+0xcb/0xf0
[ 0.920011] warn_slowpath_fmt+0x5a/0x80
[ 0.920011] note_page+0x63c/0x7e0
[ 0.920011] ptdump_walk_pgd_level_core+0x3b1/0x460
[ 0.920011] ? 0xffffffff86c00000
[ 0.920011] ptdump_walk_pgd_level_checkwx+0x17/0x20
[ 0.920011] mark_rodata_ro+0xf4/0x100
[ 0.920011] ? rest_init+0x80/0x80
[ 0.920011] kernel_init+0x2a/0x100
[ 0.920011] ret_from_fork+0x2c/0x40
[ 0.925474] ---[ end trace dca00cd779490a2b ]---
[ 0.925959] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
echo dump=0xffffffffc03d5000 > /sys/kernel/debug/kmemleak
dmesg | tail
[ 49.209565] kmemleak: Object 0xffffffffc03d5000 (size 335):
[ 49.210814] kmemleak: comm "swapper/0", pid 1, jiffies 4294892440
[ 49.212148] kmemleak: min_count = 2
[ 49.212852] kmemleak: count = 0
[ 49.213363] kmemleak: flags = 0x1
[ 49.213363] kmemleak: checksum = 0
[ 49.213363] kmemleak: backtrace:
[ 49.213363] kmemleak_alloc+0x4a/0xa0
[ 49.213363] __vmalloc_node_range+0x20a/0x2b0
[ 49.213363] module_alloc+0x67/0xc0
[ 49.213363] arch_ftrace_update_trampoline+0xba/0x260
[ 49.213363] ftrace_startup+0x90/0x210
[ 49.213363] register_ftrace_function+0x4b/0x60
[ 49.213363] arm_kprobe+0x84/0xe0
[ 49.213363] register_kprobe+0x56e/0x5b0
[ 49.213363] init_test_probes+0x61/0x560
[ 49.213363] init_kprobes+0x1e3/0x206
[ 49.213363] do_one_initcall+0x52/0x1a0
[ 49.213363] kernel_init_freeable+0x178/0x200
[ 49.213363] kernel_init+0xe/0x100
[ 49.213363] ret_from_fork+0x2c/0x40
[ 49.213363] 0xffffffffffffffff
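The two manual steps above (pick the address out of the W+X warning, then write dump=<addr> to the kmemleak debugfs file) can be sketched as below. This is a hedged illustration, not a tested tool: wx_addresses() and kmemleak_dump() are hypothetical names, the regular expression assumes the warning text looks exactly like the line quoted above, and the real write needs root, CONFIG_DEBUG_KMEMLEAK and a mounted debugfs.

```python
import re

# Matches e.g. "... Found insecure W+X mapping at address ffff.../0xffff..."
WX_RE = re.compile(
    r"Found insecure W\+X mapping at address ([0-9a-f]+)/0x[0-9a-f]+"
)

# Sample copied from the splat above.
SAMPLE_DMESG = """\
[ 0.917636] x86/mm: Found insecure W+X mapping at address ffffffffc03d5000/0xffffffffc03d5000
[ 0.925959] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
"""


def wx_addresses(dmesg_text):
    """Return every address reported by the W+X check, as integers."""
    return [int(m.group(1), 16) for m in WX_RE.finditer(dmesg_text)]


def kmemleak_dump(addr, control="/sys/kernel/debug/kmemleak"):
    """Ask kmemleak to log what it knows about the object at addr.

    The result lands in the kernel log, so read it back with dmesg.
    """
    with open(control, "w") as f:
        f.write("dump=0x%x\n" % addr)


# Only parse here; the write is left to the caller since it needs root.
for addr in wx_addresses(SAMPLE_DMESG):
    print("echo dump=0x%x > /sys/kernel/debug/kmemleak" % addr)
```

As Catalin would need to confirm, this only helps for allocations kmemleak actually tracks.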
Luis
* Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
2017-05-19 15:40 ` Luis R. Rodriguez
@ 2017-05-19 17:28 ` Luis R. Rodriguez
2017-05-20 2:38 ` Masami Hiramatsu
2017-05-19 17:35 ` Catalin Marinas
1 sibling, 1 reply; 21+ messages in thread
From: Luis R. Rodriguez @ 2017-05-19 17:28 UTC (permalink / raw)
To: Masami Hiramatsu, Jim Keniston, davem, sagar.abhishek
Cc: Catalin Marinas, mcgrof, Steven Rostedt, Kees Cook,
Stephen Smalley, Ingo Molnar, Andy Lutomirski, Michal Hocko,
Vlastimil Babka, Andrew Morton, Eric W. Biederman, Mateusz Guzik,
LKML
On Fri, May 19, 2017 at 05:40:16PM +0200, Luis R. Rodriguez wrote:
> On Fri, May 19, 2017 at 05:08:02AM +0200, Luis R. Rodriguez wrote:
> > On Fri, May 19, 2017 at 02:44:14AM +0200, Luis R. Rodriguez wrote:
> > > On Wed, May 17, 2017 at 10:53:06AM -0700, Kees Cook wrote:
> > > > On Wed, May 17, 2017 at 9:40 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > > > > Yes, but I had killed that boot session again, so upon my next boot
> > > > > I had a different layout, the ASLR gap was much larger:
> > > > >
> > > > > ---[ Modules ]---
> > > > > 0xffffffffc0000000-0xffffffffc01b0000 1728K pte
> > > > > 0xffffffffc01b0000-0xffffffffc01b1000 4K RW GLB x pte
> > > > > 0xffffffffc01b1000-0xffffffffc01b2000 4K pte
> > > > > 0xffffffffc01b2000-0xffffffffc01c6000 80K ro GLB x pte
> > > > > 0xffffffffc01c6000-0xffffffffc01cc000 24K ro GLB NX pte
> > > > > 0xffffffffc01cc000-0xffffffffc01d5000 36K RW GLB NX pte
> > > > >
> > > > > As you can guess if we follow similar pattern the RW hole is the one this boot
> > > > > warned about:
> > > > >
> > > > > [ 1.450483] x86/mm: Found insecure W+X mapping at address ffffffffc01b0000/0xffffffffc01b0000
> > > > > [ 1.451280] ------------[ cut here ]------------
> > > > > [ 1.451721] WARNING: CPU: 1 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
> > > > > [ 1.452499] Modules linked in:
> > > > > [ 1.452791] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc1-next-20170515+ #145
> > > > >
> > > > > I checked and indeed 0xffffffffc01b2000 is part of a module, it was not the first one
> > > > > on the /proc/modules list but then again /proc/modules does not seem to have a specific
> > > > > order other than perhaps being pegged into a linked list of modules once they go live,
> > > > > and it seems its typically output backwards from when that happened, sorting that
> > > > > by address we get:
> > > >
> > > > Right, sorry, I'd expect it at the bottom of the list in
> > > > /proc/modules, but that's fine, it's there.
> > > >
> > > > >
> > > > > root@piggy:~# cat /proc/modules | sort -k 6 | head -3
> > > > > e1000 143360 0 - Live 0xffffffffc01b2000 (E)
> > > > > mbcache 16384 1 ext4, Live 0xffffffffc01d6000 (E)
> > > > > scsi_mod 217088 4 sg,sr_mod,sd_mod,libata, Live 0xffffffffc01df000 (E)
> > > > >
> > > > > And this then seems to be the first module loaded:
> > > > >
> > > > > e1000 143360 0 - Live 0xffffffffc01b2000 (E)
> > > > >
> > > > > The output of dmesg seems to confirm this as per the list of modules sorted
> > > > > as per above.
> > > > >
> > > > >> Something touched the module gap and left is RW+x...
> > > > >
> > > > > Lemme try booting with e1000 renamed to e1000.ko.ignore and see how that goes.
> > > >
> > > > Is it possible a module got loaded before e1000 and then unloaded?
> > > > That seems odd, but maybe unload isn't cleaning up?
> > > >
> > > > >> Are you able to bisect this?
> > > > >
> > > > > This issue has been present for a while so since I recall this I might be
> > > > > able to reduce the number of needed target kernels to bisect. Lemme tinker
> > > > > a bit and if no clear culprit comes up then will try bisect.
> > > >
> > > > Okay, thanks!
> > >
> > > Sorry to report that this issue is present since the feature's addition. So
> > > the issue is there since its addition and is still present today. *But* it
> > > may also be a configuration issue, given I have booted this guest *without*
> > > this issue ...
> > >
> > > So:
> > >
> > > git checkout -b WX e1a58320a38dfa72be48a0f1a3a92273663ba6db
> > >
> > > That boots with the warning. To help debug further I've minimized my modules
> > > to only a few: scsi_mod, e1000, libata.
> > >
> > > I suspect at this point this is not the fault of a particular module but
> > > instead just an accounting semantic (>= or <= on an edge) but let's see.
> > >
> > > I now boot on 4.3.0-rc3 on commit (e1a58320a38df ("x86/mm: Warn on W^X
> > > mappings") and I with:
> > >
> > > [ 0.949435] ------------[ cut here ]------------
> > > [ 0.949992] WARNING: CPU: 2 PID: 1 at arch/x86/mm/dump_pagetables.c:225 note_page+0x635/0x7e0()
> > > [ 0.950996] x86/mm: Found insecure W+X mapping at address ffffffffc0000000/0xffffffffc0000000
> > > [ 0.951814] Modules linked in:
> > > [ 0.952123] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.3.0-rc3-FINAL-TEST-WITH-WX-NOFLOPPY+ #365
> > > [ 0.952929] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> > > [ 0.954033] 0000000000000000 000000001f722925 ffff88013a5d7d40 ffffffff812ff335
> > > [ 0.954742] ffff88013a5d7d88 ffff88013a5d7d78 ffffffff81079be2 ffff88013a5d7e90
> > > [ 0.955522] 0000000000000000 0000000000000004 0000000000000000 0000000000000000
> > > [ 0.956256] Call Trace:
> > > [ 0.956496] [<ffffffff812ff335>] dump_stack+0x44/0x5f
> > > [ 0.956953] [<ffffffff81079be2>] warn_slowpath_common+0x82/0xc0
> > > [ 0.957519] [<ffffffff81079c7c>] warn_slowpath_fmt+0x5c/0x80
> > > [ 0.958066] [<ffffffff8106c155>] note_page+0x635/0x7e0
> > > [ 0.958595] [<ffffffff8106c5eb>] ptdump_walk_pgd_level_core+0x2eb/0x410
> > > [ 0.959219] [<ffffffff8106c7b7>] ptdump_walk_pgd_level_checkwx+0x17/0x20
> > > [ 0.959856] [<ffffffff8106260d>] mark_rodata_ro+0xed/0x100
> > > [ 0.960372] [<ffffffff815aa7d0>] ? rest_init+0x80/0x80
> > > [ 0.960869] [<ffffffff815aa7ed>] kernel_init+0x1d/0xe0
> > > [ 0.961358] [<ffffffff815b798f>] ret_from_fork+0x3f/0x70
> > > [ 0.961900] [<ffffffff815aa7d0>] ? rest_init+0x80/0x80
> > > [ 0.962389] ---[ end trace 6125ebcb24c9e3d0 ]---
> > > [ 0.962822] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
> > >
> > >
> > > ---[ High Kernel Mapping ]---
> > > 0xffffffff80000000-0xffffffff81000000 16M pmd
> > > 0xffffffff81000000-0xffffffff81600000 6M ro PSE GLB x pmd
> > > 0xffffffff81600000-0xffffffff81a00000 4M ro PSE GLB NX pmd
> > > 0xffffffff81a00000-0xffffffff81c00000 2M RW GLB NX pte
> > > 0xffffffff81c00000-0xffffffff82200000 6M RW PSE GLB NX pmd
> > > 0xffffffff82200000-0xffffffff82400000 2M RW GLB NX pte
> > > 0xffffffff82400000-0xffffffffc0000000 988M pmd
> > > ---[ Modules ]---
> > > 0xffffffffc0000000-0xffffffffc0001000 4K RW GLB x pte
> > > 0xffffffffc0001000-0xffffffffc0002000 4K pte
> > > 0xffffffffc0002000-0xffffffffc0039000 220K RW GLB x pte
> > >
> > > root@piggy:~# cat /proc/modules | sort -k 6 | head -3
> > > scsi_mod 221979 4 sg,sd_mod,sr_mod,libata, Live 0xffffffffc0002000 (E)
> > > e1000 127757 0 - Live 0xffffffffc004d000 (E)
> > > libata 229931 2 ata_generic,ata_piix, Live 0xffffffffc0076000 (E)
> > >
> > > So that 4K RW seems suspect of getting used for allocation purpose on edge
> > > for a particular reason and it also happens to be on the edge of the high
> > > kernel mapping. Could it be the boundary semantic issue ?
> > >
> > > For instance can it be that since 0xffffffffc0002000 is given to the first
> > > module by the allocator, scsi_mod, and since that address is *technically*
> > > part of two boundaries we get a splat ?
> > >
> > > 0xffffffffc0001000-0xffffffffc0002000 4K pte
> > > 0xffffffffc0002000-0xffffffffc0039000 220K RW GLB x pte
> >
> > Note on the latest linux-next and on the commit that introduced this the config
> > and kernel yields only *one* page:
> >
> > x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
> >
> > I believe this is more indications my suspicion might be right.
>
> If the following is a legit forced way to get query the kernel to ask it
> who owns a page then perhaps this technique can be used in the future to
> figure out who the hell caused this. Catalin, can you confirm? In this
> case this is perhaps not a leaked page but I am trying to abuse the
> kmemleak debugfs API to query who allocated the page. Is that fine?
>
> [ 0.916771] WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:235 note_page+0x63c/0x7e0
> [ 0.917636] x86/mm: Found insecure W+X mapping at address ffffffffc03d5000/0xffffffffc03d5000
> [ 0.918502] Modules linked in:
> [ 0.918819] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.11.0-mcgrof-force-config #340
> [ 0.919631] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> [ 0.920011] Call Trace:
> [ 0.920011] dump_stack+0x63/0x81
> [ 0.920011] __warn+0xcb/0xf0
> [ 0.920011] warn_slowpath_fmt+0x5a/0x80
> [ 0.920011] note_page+0x63c/0x7e0
> [ 0.920011] ptdump_walk_pgd_level_core+0x3b1/0x460
> [ 0.920011] ? 0xffffffff86c00000
> [ 0.920011] ptdump_walk_pgd_level_checkwx+0x17/0x20
> [ 0.920011] mark_rodata_ro+0xf4/0x100
> [ 0.920011] ? rest_init+0x80/0x80
> [ 0.920011] kernel_init+0x2a/0x100
> [ 0.920011] ret_from_fork+0x2c/0x40
> [ 0.925474] ---[ end trace dca00cd779490a2b ]---
> [ 0.925959] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
>
> echo dump=0xffffffffc03d5000 > /sys/kernel/debug/kmemleak
> dmesg | tail
>
> [ 49.209565] kmemleak: Object 0xffffffffc03d5000 (size 335):
> [ 49.210814] kmemleak: comm "swapper/0", pid 1, jiffies 4294892440
> [ 49.212148] kmemleak: min_count = 2
> [ 49.212852] kmemleak: count = 0
> [ 49.213363] kmemleak: flags = 0x1
> [ 49.213363] kmemleak: checksum = 0
> [ 49.213363] kmemleak: backtrace:
> [ 49.213363] kmemleak_alloc+0x4a/0xa0
> [ 49.213363] __vmalloc_node_range+0x20a/0x2b0
> [ 49.213363] module_alloc+0x67/0xc0
> [ 49.213363] arch_ftrace_update_trampoline+0xba/0x260
> [ 49.213363] ftrace_startup+0x90/0x210
> [ 49.213363] register_ftrace_function+0x4b/0x60
> [ 49.213363] arm_kprobe+0x84/0xe0
> [ 49.213363] register_kprobe+0x56e/0x5b0
> [ 49.213363] init_test_probes+0x61/0x560
> [ 49.213363] init_kprobes+0x1e3/0x206
> [ 49.213363] do_one_initcall+0x52/0x1a0
> [ 49.213363] kernel_init_freeable+0x178/0x200
> [ 49.213363] kernel_init+0xe/0x100
> [ 49.213363] ret_from_fork+0x2c/0x40
> [ 49.213363] 0xffffffffffffffff
Aha! And the winner is:
CONFIG_KPROBES_SANITY_TEST
I can confirm that disabling it on 4.3.0-rc3 and on linux-next next-20170519 avoids the WARN.
I can also confirm that 'echo dump=mem-area > /sys/kernel/debug/kmemleak' yields
the same trace on both of these kernels.
So -- the above kmemleak hack actually seems to work for finding out who owns that page.
Now to figure out how the hell kernel/test_kprobes.c screws around with things.
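The kmemleak backtrace quoted above can also be read mechanically to name the first caller past the generic allocation path. A minimal sketch follows; frames() and first_caller() are hypothetical names, and the set of frames treated as "allocator" is my own illustrative choice, not something kmemleak defines.

```python
import re

# One symbol+offset/size entry per line, e.g. "[ 49.213363] module_alloc+0x67/0xc0"
FRAME_RE = re.compile(r"^\[\s*\d+\.\d+\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+\s*$")

# Frames that only say "memory was allocated" (assumed list, for illustration).
ALLOCATOR_FRAMES = {"kmemleak_alloc", "__vmalloc_node_range", "module_alloc"}

# Sample copied from the kmemleak dump quoted above.
BACKTRACE = """\
[ 49.213363] kmemleak: backtrace:
[ 49.213363] kmemleak_alloc+0x4a/0xa0
[ 49.213363] __vmalloc_node_range+0x20a/0x2b0
[ 49.213363] module_alloc+0x67/0xc0
[ 49.213363] arch_ftrace_update_trampoline+0xba/0x260
[ 49.213363] ftrace_startup+0x90/0x210
[ 49.213363] register_ftrace_function+0x4b/0x60
[ 49.213363] arm_kprobe+0x84/0xe0
[ 49.213363] register_kprobe+0x56e/0x5b0
[ 49.213363] init_test_probes+0x61/0x560
[ 49.213363] init_kprobes+0x1e3/0x206
[ 49.213363] do_one_initcall+0x52/0x1a0
[ 49.213363] kernel_init_freeable+0x178/0x200
[ 49.213363] kernel_init+0xe/0x100
[ 49.213363] ret_from_fork+0x2c/0x40
[ 49.213363] 0xffffffffffffffff
"""


def frames(text):
    """Return the symbol names of the backtrace, innermost first."""
    out = []
    for line in text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            out.append(m.group(1))
    return out


def first_caller(symbols, skip=ALLOCATOR_FRAMES):
    """First frame that is not part of the generic allocation path."""
    for sym in symbols:
        if sym not in skip:
            return sym
    return None


syms = frames(BACKTRACE)
print("allocated on behalf of:", first_caller(syms))
print("kprobes self-test in trace:", "init_test_probes" in syms)
```

Here the first non-allocator frame is arch_ftrace_update_trampoline(), reached via init_test_probes(), which is consistent with CONFIG_KPROBES_SANITY_TEST being involved.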
Luis
* Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
2017-05-19 15:40 ` Luis R. Rodriguez
2017-05-19 17:28 ` Luis R. Rodriguez
@ 2017-05-19 17:35 ` Catalin Marinas
2017-05-19 18:27 ` Andy Lutomirski
2017-05-26 22:13 ` Luis R. Rodriguez
1 sibling, 2 replies; 21+ messages in thread
From: Catalin Marinas @ 2017-05-19 17:35 UTC (permalink / raw)
To: Luis R. Rodriguez
Cc: Steven Rostedt, Kees Cook, Stephen Smalley, Ingo Molnar,
Andy Lutomirski, Michal Hocko, Vlastimil Babka, Andrew Morton,
Eric W. Biederman, Mateusz Guzik, LKML
On Fri, May 19, 2017 at 05:40:16PM +0200, Luis R. Rodriguez wrote:
> If the following is a legit forced way to get query the kernel to ask it
> who owns a page then perhaps this technique can be used in the future to
> figure out who the hell caused this. Catalin, can you confirm? In this
> case this is perhaps not a leaked page but I am trying to abuse the
> kmemleak debugfs API to query who allocated the page. Is that fine?
>
> [ 0.916771] WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:235 note_page+0x63c/0x7e0
> [ 0.917636] x86/mm: Found insecure W+X mapping at address ffffffffc03d5000/0xffffffffc03d5000
> [ 0.918502] Modules linked in:
> [ 0.918819] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.11.0-mcgrof-force-config #340
> [ 0.919631] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> [ 0.920011] Call Trace:
> [ 0.920011] dump_stack+0x63/0x81
> [ 0.920011] __warn+0xcb/0xf0
> [ 0.920011] warn_slowpath_fmt+0x5a/0x80
> [ 0.920011] note_page+0x63c/0x7e0
> [ 0.920011] ptdump_walk_pgd_level_core+0x3b1/0x460
> [ 0.920011] ? 0xffffffff86c00000
> [ 0.920011] ptdump_walk_pgd_level_checkwx+0x17/0x20
> [ 0.920011] mark_rodata_ro+0xf4/0x100
> [ 0.920011] ? rest_init+0x80/0x80
> [ 0.920011] kernel_init+0x2a/0x100
> [ 0.920011] ret_from_fork+0x2c/0x40
> [ 0.925474] ---[ end trace dca00cd779490a2b ]---
> [ 0.925959] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
>
> echo dump=0xffffffffc03d5000 > /sys/kernel/debug/kmemleak
> dmesg | tail
>
> [ 49.209565] kmemleak: Object 0xffffffffc03d5000 (size 335):
> [ 49.210814] kmemleak: comm "swapper/0", pid 1, jiffies 4294892440
> [ 49.212148] kmemleak: min_count = 2
> [ 49.212852] kmemleak: count = 0
> [ 49.213363] kmemleak: flags = 0x1
> [ 49.213363] kmemleak: checksum = 0
> [ 49.213363] kmemleak: backtrace:
> [ 49.213363] kmemleak_alloc+0x4a/0xa0
> [ 49.213363] __vmalloc_node_range+0x20a/0x2b0
> [ 49.213363] module_alloc+0x67/0xc0
> [ 49.213363] arch_ftrace_update_trampoline+0xba/0x260
> [ 49.213363] ftrace_startup+0x90/0x210
> [ 49.213363] register_ftrace_function+0x4b/0x60
> [ 49.213363] arm_kprobe+0x84/0xe0
> [ 49.213363] register_kprobe+0x56e/0x5b0
> [ 49.213363] init_test_probes+0x61/0x560
> [ 49.213363] init_kprobes+0x1e3/0x206
> [ 49.213363] do_one_initcall+0x52/0x1a0
> [ 49.213363] kernel_init_freeable+0x178/0x200
> [ 49.213363] kernel_init+0xe/0x100
> [ 49.213363] ret_from_fork+0x2c/0x40
> [ 49.213363] 0xffffffffffffffff
You could as well use kmemleak this way since it tracks the memory
allocations. However, it doesn't track alloc_pages and also doesn't
track mapping existing pages (vmap etc.)
--
Catalin
* Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
2017-05-19 17:35 ` Catalin Marinas
@ 2017-05-19 18:27 ` Andy Lutomirski
2017-05-19 19:16 ` Kees Cook
2017-05-26 22:13 ` Luis R. Rodriguez
1 sibling, 1 reply; 21+ messages in thread
From: Andy Lutomirski @ 2017-05-19 18:27 UTC (permalink / raw)
To: Catalin Marinas
Cc: Luis R. Rodriguez, Steven Rostedt, Kees Cook, Stephen Smalley,
Ingo Molnar, Michal Hocko, Vlastimil Babka, Andrew Morton,
Eric W. Biederman, Mateusz Guzik, LKML
On Fri, May 19, 2017 at 10:35 AM, Catalin Marinas
<catalin.marinas@arm.com> wrote:
> On Fri, May 19, 2017 at 05:40:16PM +0200, Luis R. Rodriguez wrote:
>> If the following is a legit forced way to get query the kernel to ask it
>> who owns a page then perhaps this technique can be used in the future to
>> figure out who the hell caused this. Catalin, can you confirm? In this
>> case this is perhaps not a leaked page but I am trying to abuse the
>> kmemleak debugfs API to query who allocated the page. Is that fine?
>>
>> [ 0.916771] WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:235 note_page+0x63c/0x7e0
>> [ 0.917636] x86/mm: Found insecure W+X mapping at address ffffffffc03d5000/0xffffffffc03d5000
>> [ 0.918502] Modules linked in:
>> [ 0.918819] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.11.0-mcgrof-force-config #340
>> [ 0.919631] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
>> [ 0.920011] Call Trace:
>> [ 0.920011] dump_stack+0x63/0x81
>> [ 0.920011] __warn+0xcb/0xf0
>> [ 0.920011] warn_slowpath_fmt+0x5a/0x80
>> [ 0.920011] note_page+0x63c/0x7e0
>> [ 0.920011] ptdump_walk_pgd_level_core+0x3b1/0x460
>> [ 0.920011] ? 0xffffffff86c00000
>> [ 0.920011] ptdump_walk_pgd_level_checkwx+0x17/0x20
>> [ 0.920011] mark_rodata_ro+0xf4/0x100
>> [ 0.920011] ? rest_init+0x80/0x80
>> [ 0.920011] kernel_init+0x2a/0x100
>> [ 0.920011] ret_from_fork+0x2c/0x40
>> [ 0.925474] ---[ end trace dca00cd779490a2b ]---
>> [ 0.925959] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
>>
>> echo dump=0xffffffffc03d5000 > /sys/kernel/debug/kmemleak
>> dmesg | tail
>>
>> [ 49.209565] kmemleak: Object 0xffffffffc03d5000 (size 335):
>> [ 49.210814] kmemleak: comm "swapper/0", pid 1, jiffies 4294892440
>> [ 49.212148] kmemleak: min_count = 2
>> [ 49.212852] kmemleak: count = 0
>> [ 49.213363] kmemleak: flags = 0x1
>> [ 49.213363] kmemleak: checksum = 0
>> [ 49.213363] kmemleak: backtrace:
>> [ 49.213363] kmemleak_alloc+0x4a/0xa0
>> [ 49.213363] __vmalloc_node_range+0x20a/0x2b0
>> [ 49.213363] module_alloc+0x67/0xc0
>> [ 49.213363] arch_ftrace_update_trampoline+0xba/0x260
>> [ 49.213363] ftrace_startup+0x90/0x210
>> [ 49.213363] register_ftrace_function+0x4b/0x60
>> [ 49.213363] arm_kprobe+0x84/0xe0
>> [ 49.213363] register_kprobe+0x56e/0x5b0
>> [ 49.213363] init_test_probes+0x61/0x560
>> [ 49.213363] init_kprobes+0x1e3/0x206
>> [ 49.213363] do_one_initcall+0x52/0x1a0
>> [ 49.213363] kernel_init_freeable+0x178/0x200
>> [ 49.213363] kernel_init+0xe/0x100
>> [ 49.213363] ret_from_fork+0x2c/0x40
>> [ 49.213363] 0xffffffffffffffff
>
> You could as well use kmemleak this way since it tracks the memory
> allocations. However, it doesn't track alloc_pages and also doesn't
> track mapping existing pages (vmap etc.)
One thing I've pondered: can we make some debugging mode (kmemleak,
perhaps?) check that freed memory is RW at the time it's freed? I
once wrote some buggy code that freed an R page and caused an OOPS
much later, and this bug here seems likely to be some code that frees
RWX memory.
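To make the idea concrete, here is a minimal user-space model of such a
free-time check. It is only a sketch: the MODEL_* bits, the enum and
check_perms_at_free() are hypothetical names, not kernel interfaces, and
a real debug mode would have to read the permissions from the page
tables rather than take them as an argument.

```c
/* Hypothetical permission bits for this user-space model; not kernel flags. */
enum { MODEL_R = 1, MODEL_W = 2, MODEL_X = 4 };

enum free_check {
	FREE_OK,           /* plain RW, fine to hand back to the allocator */
	FREE_NOT_WRITABLE, /* e.g. a page freed while still read-only */
	FREE_EXECUTABLE,   /* e.g. a page freed while still RX or RWX */
};

/* Classify the permissions a range has at the moment it is freed. */
enum free_check check_perms_at_free(unsigned int perms)
{
	if (perms & MODEL_X)
		return FREE_EXECUTABLE;
	if (!(perms & MODEL_W))
		return FREE_NOT_WRITABLE;
	return FREE_OK;
}
```

Both failure cases mentioned here are covered: an R page is reported as
not writable, and an RWX page is reported as executable.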
--Andy
* Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
2017-05-19 18:27 ` Andy Lutomirski
@ 2017-05-19 19:16 ` Kees Cook
2017-05-19 19:18 ` Andy Lutomirski
0 siblings, 1 reply; 21+ messages in thread
From: Kees Cook @ 2017-05-19 19:16 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Catalin Marinas, Luis R. Rodriguez, Steven Rostedt,
Stephen Smalley, Ingo Molnar, Michal Hocko, Vlastimil Babka,
Andrew Morton, Eric W. Biederman, Mateusz Guzik, LKML
On Fri, May 19, 2017 at 11:27 AM, Andy Lutomirski <luto@kernel.org> wrote:
> One thing I've pondered: can we make some debugging mode (kmemleak,
> perhaps?) check that freed memory is RW at the time it's freed? I
> once wrote some buggy code that freed an R page and caused an OOPS
> much later, and this bug here seems likely to be some code that frees
> RWX memory.
Which begs for even more checks: nothing should ever make a page RWX.
Either R, RW, or RX only... (or X too I guess, in the future).
-Kees
--
Kees Cook
Pixel Security
* Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
2017-05-19 19:16 ` Kees Cook
@ 2017-05-19 19:18 ` Andy Lutomirski
2017-05-19 19:29 ` Kees Cook
0 siblings, 1 reply; 21+ messages in thread
From: Andy Lutomirski @ 2017-05-19 19:18 UTC (permalink / raw)
To: Kees Cook
Cc: Andy Lutomirski, Catalin Marinas, Luis R. Rodriguez,
Steven Rostedt, Stephen Smalley, Ingo Molnar, Michal Hocko,
Vlastimil Babka, Andrew Morton, Eric W. Biederman, Mateusz Guzik,
LKML
On Fri, May 19, 2017 at 12:16 PM, Kees Cook <keescook@chromium.org> wrote:
> On Fri, May 19, 2017 at 11:27 AM, Andy Lutomirski <luto@kernel.org> wrote:
>> One thing I've pondered: can we make some debugging mode (kmemleak,
>> perhaps?) check that freed memory is RW at the time it's freed? I
>> once wrote some buggy code that freed an R page and caused an OOPS
>> much later, and this bug here seems likely to be some code that frees
>> RWX memory.
>
> Which begs for even more checks: nothing should ever make a page RWX.
> Either R, RW, or RX only... (or X too I guess, in the future).
I could see pages being RWX temporarily during boot. OTOH if we ban
RWX outright (after very early boot, anyway), then catching code that
messes up and leaves pages RWX gets much easier.
--Andy
* Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
2017-05-19 19:18 ` Andy Lutomirski
@ 2017-05-19 19:29 ` Kees Cook
0 siblings, 0 replies; 21+ messages in thread
From: Kees Cook @ 2017-05-19 19:29 UTC (permalink / raw)
To: Andy Lutomirski, Laura Abbott
Cc: Catalin Marinas, Luis R. Rodriguez, Steven Rostedt,
Stephen Smalley, Ingo Molnar, Michal Hocko, Vlastimil Babka,
Andrew Morton, Eric W. Biederman, Mateusz Guzik, LKML
On Fri, May 19, 2017 at 12:18 PM, Andy Lutomirski <luto@kernel.org> wrote:
> On Fri, May 19, 2017 at 12:16 PM, Kees Cook <keescook@chromium.org> wrote:
>> On Fri, May 19, 2017 at 11:27 AM, Andy Lutomirski <luto@kernel.org> wrote:
>>> One thing I've pondered: can we make some debugging mode (kmemleak,
>>> perhaps?) check that freed memory is RW at the time it's freed? I
>>> once wrote some buggy code that freed an R page and caused an OOPS
>>> much later, and this bug here seems likely to be some code that frees
>>> RWX memory.
>>
>> Which begs for even more checks: nothing should ever make a page RWX.
>> Either R, RW, or RX only... (or X too I guess, in the future).
>
> I could see pages being RWX temporarily during boot. OTOH if we ban
> RWX outright (after very early boot, anyway), then catching code that
> messes up and leaves pages RWX gets much easier.
Right, early boot is kind of special. It'd be nice to have the ban there
too, but I meant during normal runtime. We'd probably need to adjust
set_memory_rw/ro/nx/x to have the correct side-effects, instead
of just controlling specific bits:
set_memory_rw() (RW_)
set_memory_ro() (R__)
set_memory_rx() (R_X)
set_memory_x() (__X)
That kind of refactoring might not be _too_ bad:
- add set_memory_rx()
- s/\bset_memory_x\b/set_memory_rx/g
- fix what breaks from expecting writable-executable memory
- adjust set_memory_rw() to drop x
- fix what breaks from expecting writable-executable memory
- adjust set_memory_ro() to drop x
- fix what breaks from expecting executable memory
- add set_memory_x() some day...
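A small user-space model of the difference, purely as a sketch: the
PERM_* bits and the model_*/toggle_* function names are hypothetical and
do not match the real set_memory_*() signatures, which take an address
and a page count. The point is only that setters which replace the whole
permission set cannot produce W+X, while a setter that just flips the X
bit can.

```c
/* Hypothetical permission bits for this user-space model; not kernel flags. */
enum { PERM_R = 1, PERM_W = 2, PERM_X = 4 };

/* Bit-toggling style: only the X bit changes, so RW becomes RWX. */
unsigned int toggle_set_x(unsigned int perms)
{
	return perms | PERM_X;
}

/* Proposed style: each call replaces the whole permission set. */
unsigned int model_set_memory_rw(unsigned int perms)
{
	(void)perms;		/* previous state is deliberately ignored */
	return PERM_R | PERM_W;
}

unsigned int model_set_memory_ro(unsigned int perms)
{
	(void)perms;
	return PERM_R;
}

unsigned int model_set_memory_rx(unsigned int perms)
{
	(void)perms;
	return PERM_R | PERM_X;
}

unsigned int model_set_memory_x(unsigned int perms)
{
	(void)perms;
	return PERM_X;
}

/* Non-zero if the permission set is both writable and executable. */
int is_wx(unsigned int perms)
{
	return (perms & PERM_W) && (perms & PERM_X);
}
```

In this model, toggle_set_x() on an RW range yields a W+X range, while
no sequence of the model_set_memory_*() calls ever does.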
-Kees
--
Kees Cook
Pixel Security
* Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
2017-05-19 17:28 ` Luis R. Rodriguez
@ 2017-05-20 2:38 ` Masami Hiramatsu
2017-05-23 14:48 ` Luis R. Rodriguez
0 siblings, 1 reply; 21+ messages in thread
From: Masami Hiramatsu @ 2017-05-20 2:38 UTC (permalink / raw)
To: Luis R. Rodriguez
Cc: Jim Keniston, davem, sagar.abhishek, Catalin Marinas,
Steven Rostedt, Kees Cook, Stephen Smalley, Ingo Molnar,
Andy Lutomirski, Michal Hocko, Vlastimil Babka, Andrew Morton,
Eric W. Biederman, Mateusz Guzik, LKML
Hi Luis,
On Fri, 19 May 2017 19:28:54 +0200
"Luis R. Rodriguez" <mcgrof@kernel.org> wrote:
> On Fri, May 19, 2017 at 05:40:16PM +0200, Luis R. Rodriguez wrote:
> > On Fri, May 19, 2017 at 05:08:02AM +0200, Luis R. Rodriguez wrote:
> > > On Fri, May 19, 2017 at 02:44:14AM +0200, Luis R. Rodriguez wrote:
> > > > On Wed, May 17, 2017 at 10:53:06AM -0700, Kees Cook wrote:
> > > > > On Wed, May 17, 2017 at 9:40 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > > > > > Yes, but I had killed that boot session again, so upon my next boot
> > > > > > I had a different layout, the ASLR gap was much larger:
> > > > > >
> > > > > > ---[ Modules ]---
> > > > > > 0xffffffffc0000000-0xffffffffc01b0000 1728K pte
> > > > > > 0xffffffffc01b0000-0xffffffffc01b1000 4K RW GLB x pte
> > > > > > 0xffffffffc01b1000-0xffffffffc01b2000 4K pte
> > > > > > 0xffffffffc01b2000-0xffffffffc01c6000 80K ro GLB x pte
> > > > > > 0xffffffffc01c6000-0xffffffffc01cc000 24K ro GLB NX pte
> > > > > > 0xffffffffc01cc000-0xffffffffc01d5000 36K RW GLB NX pte
> > > > > >
> > > > > > As you can guess if we follow similar pattern the RW hole is the one this boot
> > > > > > warned about:
> > > > > >
> > > > > > [ 1.450483] x86/mm: Found insecure W+X mapping at address ffffffffc01b0000/0xffffffffc01b0000
> > > > > > [ 1.451280] ------------[ cut here ]------------
> > > > > > [ 1.451721] WARNING: CPU: 1 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
> > > > > > [ 1.452499] Modules linked in:
> > > > > > [ 1.452791] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc1-next-20170515+ #145
> > > > > >
> > > > > > I checked and indeed 0xffffffffc01b2000 is part of a module, it was not the first one
> > > > > > on the /proc/modules list but then again /proc/modules does not seem to have a specific
> > > > > > order other than perhaps being pegged into a linked list of modules once they go live,
> > > > > > and it seems its typically output backwards from when that happened, sorting that
> > > > > > by address we get:
> > > > >
> > > > > Right, sorry, I'd expect it at the bottom of the list in
> > > > > /proc/modules, but that's fine, it's there.
> > > > >
> > > > > >
> > > > > > root@piggy:~# cat /proc/modules | sort -k 6 | head -3
> > > > > > e1000 143360 0 - Live 0xffffffffc01b2000 (E)
> > > > > > mbcache 16384 1 ext4, Live 0xffffffffc01d6000 (E)
> > > > > > scsi_mod 217088 4 sg,sr_mod,sd_mod,libata, Live 0xffffffffc01df000 (E)
> > > > > >
> > > > > > And this then seems to be the first module loaded:
> > > > > >
> > > > > > e1000 143360 0 - Live 0xffffffffc01b2000 (E)
> > > > > >
> > > > > > The output of dmesg seems to confirm this as per the list of modules sorted
> > > > > > as per above.
> > > > > >
> > > > > >> Something touched the module gap and left is RW+x...
> > > > > >
> > > > > > Lemme try booting with e1000 renamed to e1000.ko.ignore and see how that goes.
> > > > >
> > > > > Is it possible a module got loaded before e1000 and then unloaded?
> > > > > That seems odd, but maybe unload isn't cleaning up?
> > > > >
> > > > > >> Are you able to bisect this?
> > > > > >
> > > > > > This issue has been present for a while so since I recall this I might be
> > > > > > able to reduce the number of needed target kernels to bisect. Lemme tinker
> > > > > > a bit and if no clear culprit comes up then will try bisect.
> > > > >
> > > > > Okay, thanks!
> > > >
> > > > Sorry to report that this issue is present since the feature's addition. So
> > > > the issue is there since its addition and is still present today. *But* it
> > > > may also be a configuration issue, given I have booted this guest *without*
> > > > this issue ...
> > > >
> > > > So:
> > > >
> > > > git checkout -b WX e1a58320a38dfa72be48a0f1a3a92273663ba6db
> > > >
> > > > That boots with the warning. To help debug further I've minimized my modules
> > > > to only a few: scsi_mod, e1000, libata.
> > > >
> > > > I suspect at this point this is not the fault of a particular module but
> > > > instead just an accounting semantic (>= or <= on an edge) but let's see.
> > > >
> > > > I now boot on 4.3.0-rc3 on commit (e1a58320a38df ("x86/mm: Warn on W^X
> > > > mappings") and I with:
> > > >
> > > > [ 0.949435] ------------[ cut here ]------------
> > > > [ 0.949992] WARNING: CPU: 2 PID: 1 at arch/x86/mm/dump_pagetables.c:225 note_page+0x635/0x7e0()
> > > > [ 0.950996] x86/mm: Found insecure W+X mapping at address ffffffffc0000000/0xffffffffc0000000
> > > > [ 0.951814] Modules linked in:
> > > > [ 0.952123] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.3.0-rc3-FINAL-TEST-WITH-WX-NOFLOPPY+ #365
> > > > [ 0.952929] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> > > > [ 0.954033] 0000000000000000 000000001f722925 ffff88013a5d7d40 ffffffff812ff335
> > > > [ 0.954742] ffff88013a5d7d88 ffff88013a5d7d78 ffffffff81079be2 ffff88013a5d7e90
> > > > [ 0.955522] 0000000000000000 0000000000000004 0000000000000000 0000000000000000
> > > > [ 0.956256] Call Trace:
> > > > [ 0.956496] [<ffffffff812ff335>] dump_stack+0x44/0x5f
> > > > [ 0.956953] [<ffffffff81079be2>] warn_slowpath_common+0x82/0xc0
> > > > [ 0.957519] [<ffffffff81079c7c>] warn_slowpath_fmt+0x5c/0x80
> > > > [ 0.958066] [<ffffffff8106c155>] note_page+0x635/0x7e0
> > > > [ 0.958595] [<ffffffff8106c5eb>] ptdump_walk_pgd_level_core+0x2eb/0x410
> > > > [ 0.959219] [<ffffffff8106c7b7>] ptdump_walk_pgd_level_checkwx+0x17/0x20
> > > > [ 0.959856] [<ffffffff8106260d>] mark_rodata_ro+0xed/0x100
> > > > [ 0.960372] [<ffffffff815aa7d0>] ? rest_init+0x80/0x80
> > > > [ 0.960869] [<ffffffff815aa7ed>] kernel_init+0x1d/0xe0
> > > > [ 0.961358] [<ffffffff815b798f>] ret_from_fork+0x3f/0x70
> > > > [ 0.961900] [<ffffffff815aa7d0>] ? rest_init+0x80/0x80
> > > > [ 0.962389] ---[ end trace 6125ebcb24c9e3d0 ]---
> > > > [ 0.962822] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
> > > >
> > > >
> > > > ---[ High Kernel Mapping ]---
> > > > 0xffffffff80000000-0xffffffff81000000 16M pmd
> > > > 0xffffffff81000000-0xffffffff81600000 6M ro PSE GLB x pmd
> > > > 0xffffffff81600000-0xffffffff81a00000 4M ro PSE GLB NX pmd
> > > > 0xffffffff81a00000-0xffffffff81c00000 2M RW GLB NX pte
> > > > 0xffffffff81c00000-0xffffffff82200000 6M RW PSE GLB NX pmd
> > > > 0xffffffff82200000-0xffffffff82400000 2M RW GLB NX pte
> > > > 0xffffffff82400000-0xffffffffc0000000 988M pmd
> > > > ---[ Modules ]---
> > > > 0xffffffffc0000000-0xffffffffc0001000 4K RW GLB x pte
> > > > 0xffffffffc0001000-0xffffffffc0002000 4K pte
> > > > 0xffffffffc0002000-0xffffffffc0039000 220K RW GLB x pte
> > > >
> > > > root@piggy:~# cat /proc/modules | sort -k 6 | head -3
> > > > scsi_mod 221979 4 sg,sd_mod,sr_mod,libata, Live 0xffffffffc0002000 (E)
> > > > e1000 127757 0 - Live 0xffffffffc004d000 (E)
> > > > libata 229931 2 ata_generic,ata_piix, Live 0xffffffffc0076000 (E)
> > > >
> > > > So that 4K RW seems suspect of getting used for allocation purpose on edge
> > > > for a particular reason and it also happens to be on the edge of the high
> > > > kernel mapping. Could it be the boundary semantic issue ?
> > > >
> > > > For instance can it be that since 0xffffffffc0002000 is given to the first
> > > > module by the allocator, scsi_mod, and since that address is *technically*
> > > > part of two boundaries we get a splat ?
> > > >
> > > > 0xffffffffc0001000-0xffffffffc0002000 4K pte
> > > > 0xffffffffc0002000-0xffffffffc0039000 220K RW GLB x pte
> > >
> > > Note on the latest linux-next and on the commit that introduced this the config
> > > and kernel yields only *one* page:
> > >
> > > x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
> > >
> > > I believe this is more indications my suspicion might be right.
> >
> > If the following is a legit forced way to get query the kernel to ask it
> > who owns a page then perhaps this technique can be used in the future to
> > figure out who the hell caused this. Catalin, can you confirm? In this
> > case this is perhaps not a leaked page but I am trying to abuse the
> > kmemleak debugfs API to query who allocated the page. Is that fine?
> >
> > [ 0.916771] WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:235 note_page+0x63c/0x7e0
> > [ 0.917636] x86/mm: Found insecure W+X mapping at address ffffffffc03d5000/0xffffffffc03d5000
> > [ 0.918502] Modules linked in:
> > [ 0.918819] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.11.0-mcgrof-force-config #340
> > [ 0.919631] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> > [ 0.920011] Call Trace:
> > [ 0.920011] dump_stack+0x63/0x81
> > [ 0.920011] __warn+0xcb/0xf0
> > [ 0.920011] warn_slowpath_fmt+0x5a/0x80
> > [ 0.920011] note_page+0x63c/0x7e0
> > [ 0.920011] ptdump_walk_pgd_level_core+0x3b1/0x460
> > [ 0.920011] ? 0xffffffff86c00000
> > [ 0.920011] ptdump_walk_pgd_level_checkwx+0x17/0x20
> > [ 0.920011] mark_rodata_ro+0xf4/0x100
> > [ 0.920011] ? rest_init+0x80/0x80
> > [ 0.920011] kernel_init+0x2a/0x100
> > [ 0.920011] ret_from_fork+0x2c/0x40
> > [ 0.925474] ---[ end trace dca00cd779490a2b ]---
> > [ 0.925959] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
> >
> > echo dump=0xffffffffc03d5000 > /sys/kernel/debug/kmemleak
> > dmesg | tail
> >
> > [ 49.209565] kmemleak: Object 0xffffffffc03d5000 (size 335):
> > [ 49.210814] kmemleak: comm "swapper/0", pid 1, jiffies 4294892440
> > [ 49.212148] kmemleak: min_count = 2
> > [ 49.212852] kmemleak: count = 0
> > [ 49.213363] kmemleak: flags = 0x1
> > [ 49.213363] kmemleak: checksum = 0
> > [ 49.213363] kmemleak: backtrace:
> > [ 49.213363] kmemleak_alloc+0x4a/0xa0
> > [ 49.213363] __vmalloc_node_range+0x20a/0x2b0
> > [ 49.213363] module_alloc+0x67/0xc0
> > [ 49.213363] arch_ftrace_update_trampoline+0xba/0x260
> > [ 49.213363] ftrace_startup+0x90/0x210
> > [ 49.213363] register_ftrace_function+0x4b/0x60
> > [ 49.213363] arm_kprobe+0x84/0xe0
> > [ 49.213363] register_kprobe+0x56e/0x5b0
> > [ 49.213363] init_test_probes+0x61/0x560
> > [ 49.213363] init_kprobes+0x1e3/0x206
> > [ 49.213363] do_one_initcall+0x52/0x1a0
> > [ 49.213363] kernel_init_freeable+0x178/0x200
> > [ 49.213363] kernel_init+0xe/0x100
> > [ 49.213363] ret_from_fork+0x2c/0x40
> > [ 49.213363] 0xffffffffffffffff
>
> Aha! And the winner is:
>
> CONFIG_KPROBES_SANITY_TEST
>
> I confirm disabling it on 4.3.0-rc3 and on linux-next next-20170519 avoids the WARN.
> I also can confirm using the 'echo dump=mem-area > /sys/kernel/debug/kmemleak' yields
> the same trace for both of these kernels.
>
> So -- the above kmemleak hack seems to actually work to seek who owns that page.
>
> Now to figure out how the hell kernel/test_kprobes.c screws around with things.
Ah, that was fixed recently:
https://marc.info/?l=linux-kernel&m=149076389011850
Note that this patch depends on another patch in the series:
https://marc.info/?l=linux-kernel&m=149076370111812&w=2
Thank you,
--
Masami Hiramatsu <mhiramat@kernel.org>
* Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
2017-05-20 2:38 ` Masami Hiramatsu
@ 2017-05-23 14:48 ` Luis R. Rodriguez
2017-05-24 17:55 ` Luis R. Rodriguez
0 siblings, 1 reply; 21+ messages in thread
From: Luis R. Rodriguez @ 2017-05-23 14:48 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: Luis R. Rodriguez, Jim Keniston, davem, sagar.abhishek,
Catalin Marinas, Steven Rostedt, Kees Cook, Stephen Smalley,
Ingo Molnar, Andy Lutomirski, Michal Hocko, Vlastimil Babka,
Andrew Morton, Eric W. Biederman, Mateusz Guzik, LKML
On Sat, May 20, 2017 at 11:38:50AM +0900, Masami Hiramatsu wrote:
> Hi Luis,
>
> On Fri, 19 May 2017 19:28:54 +0200
> "Luis R. Rodriguez" <mcgrof@kernel.org> wrote:
>
> > On Fri, May 19, 2017 at 05:40:16PM +0200, Luis R. Rodriguez wrote:
> > > On Fri, May 19, 2017 at 05:08:02AM +0200, Luis R. Rodriguez wrote:
> > > > On Fri, May 19, 2017 at 02:44:14AM +0200, Luis R. Rodriguez wrote:
> > > > > On Wed, May 17, 2017 at 10:53:06AM -0700, Kees Cook wrote:
> > > > > > On Wed, May 17, 2017 at 9:40 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > > > > > > Yes, but I had killed that boot session again, so upon my next boot
> > > > > > > I had a different layout, the ASLR gap was much larger:
> > > > > > >
> > > > > > > ---[ Modules ]---
> > > > > > > 0xffffffffc0000000-0xffffffffc01b0000 1728K pte
> > > > > > > 0xffffffffc01b0000-0xffffffffc01b1000 4K RW GLB x pte
> > > > > > > 0xffffffffc01b1000-0xffffffffc01b2000 4K pte
> > > > > > > 0xffffffffc01b2000-0xffffffffc01c6000 80K ro GLB x pte
> > > > > > > 0xffffffffc01c6000-0xffffffffc01cc000 24K ro GLB NX pte
> > > > > > > 0xffffffffc01cc000-0xffffffffc01d5000 36K RW GLB NX pte
> > > > > > >
> > > > > > > As you can guess if we follow similar pattern the RW hole is the one this boot
> > > > > > > warned about:
> > > > > > >
> > > > > > > [ 1.450483] x86/mm: Found insecure W+X mapping at address ffffffffc01b0000/0xffffffffc01b0000
> > > > > > > [ 1.451280] ------------[ cut here ]------------
> > > > > > > [ 1.451721] WARNING: CPU: 1 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
> > > > > > > [ 1.452499] Modules linked in:
> > > > > > > [ 1.452791] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc1-next-20170515+ #145
> > > > > > >
> > > > > > > I checked and indeed 0xffffffffc01b2000 is part of a module, it was not the first one
> > > > > > > on the /proc/modules list but then again /proc/modules does not seem to have a specific
> > > > > > > order other than perhaps being pegged into a linked list of modules once they go live,
> > > > > > > and it seems its typically output backwards from when that happened, sorting that
> > > > > > > by address we get:
> > > > > >
> > > > > > Right, sorry, I'd expect it at the bottom of the list in
> > > > > > /proc/modules, but that's fine, it's there.
> > > > > >
> > > > > > >
> > > > > > > root@piggy:~# cat /proc/modules | sort -k 6 | head -3
> > > > > > > e1000 143360 0 - Live 0xffffffffc01b2000 (E)
> > > > > > > mbcache 16384 1 ext4, Live 0xffffffffc01d6000 (E)
> > > > > > > scsi_mod 217088 4 sg,sr_mod,sd_mod,libata, Live 0xffffffffc01df000 (E)
> > > > > > >
> > > > > > > And this then seems to be the first module loaded:
> > > > > > >
> > > > > > > e1000 143360 0 - Live 0xffffffffc01b2000 (E)
> > > > > > >
> > > > > > > The output of dmesg seems to confirm this as per the list of modules sorted
> > > > > > > as per above.
> > > > > > >
> > > > > > >> Something touched the module gap and left is RW+x...
> > > > > > >
> > > > > > > Lemme try booting with e1000 renamed to e1000.ko.ignore and see how that goes.
> > > > > >
> > > > > > Is it possible a module got loaded before e1000 and then unloaded?
> > > > > > That seems odd, but maybe unload isn't cleaning up?
> > > > > >
> > > > > > >> Are you able to bisect this?
> > > > > > >
> > > > > > > This issue has been present for a while so since I recall this I might be
> > > > > > > able to reduce the number of needed target kernels to bisect. Lemme tinker
> > > > > > > a bit and if no clear culprit comes up then will try bisect.
> > > > > >
> > > > > > Okay, thanks!
> > > > >
> > > > > Sorry to report that this issue is present since the feature's addition. So
> > > > > the issue is there since its addition and is still present today. *But* it
> > > > > may also be a configuration issue, given I have booted this guest *without*
> > > > > this issue ...
> > > > >
> > > > > So:
> > > > >
> > > > > git checkout -b WX e1a58320a38dfa72be48a0f1a3a92273663ba6db
> > > > >
> > > > > That boots with the warning. To help debug further I've minimized my modules
> > > > > to only a few: scsi_mod, e1000, libata.
> > > > >
> > > > > I suspect at this point this is not the fault of a particular module but
> > > > > instead just an accounting semantic (>= or <= on an edge) but let's see.
> > > > >
> > > > > I now boot on 4.3.0-rc3 on commit (e1a58320a38df ("x86/mm: Warn on W^X
> > > > > mappings") and I with:
> > > > >
> > > > > [ 0.949435] ------------[ cut here ]------------
> > > > > [ 0.949992] WARNING: CPU: 2 PID: 1 at arch/x86/mm/dump_pagetables.c:225 note_page+0x635/0x7e0()
> > > > > [ 0.950996] x86/mm: Found insecure W+X mapping at address ffffffffc0000000/0xffffffffc0000000
> > > > > [ 0.951814] Modules linked in:
> > > > > [ 0.952123] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.3.0-rc3-FINAL-TEST-WITH-WX-NOFLOPPY+ #365
> > > > > [ 0.952929] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> > > > > [ 0.954033] 0000000000000000 000000001f722925 ffff88013a5d7d40 ffffffff812ff335
> > > > > [ 0.954742] ffff88013a5d7d88 ffff88013a5d7d78 ffffffff81079be2 ffff88013a5d7e90
> > > > > [ 0.955522] 0000000000000000 0000000000000004 0000000000000000 0000000000000000
> > > > > [ 0.956256] Call Trace:
> > > > > [ 0.956496] [<ffffffff812ff335>] dump_stack+0x44/0x5f
> > > > > [ 0.956953] [<ffffffff81079be2>] warn_slowpath_common+0x82/0xc0
> > > > > [ 0.957519] [<ffffffff81079c7c>] warn_slowpath_fmt+0x5c/0x80
> > > > > [ 0.958066] [<ffffffff8106c155>] note_page+0x635/0x7e0
> > > > > [ 0.958595] [<ffffffff8106c5eb>] ptdump_walk_pgd_level_core+0x2eb/0x410
> > > > > [ 0.959219] [<ffffffff8106c7b7>] ptdump_walk_pgd_level_checkwx+0x17/0x20
> > > > > [ 0.959856] [<ffffffff8106260d>] mark_rodata_ro+0xed/0x100
> > > > > [ 0.960372] [<ffffffff815aa7d0>] ? rest_init+0x80/0x80
> > > > > [ 0.960869] [<ffffffff815aa7ed>] kernel_init+0x1d/0xe0
> > > > > [ 0.961358] [<ffffffff815b798f>] ret_from_fork+0x3f/0x70
> > > > > [ 0.961900] [<ffffffff815aa7d0>] ? rest_init+0x80/0x80
> > > > > [ 0.962389] ---[ end trace 6125ebcb24c9e3d0 ]---
> > > > > [ 0.962822] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
> > > > >
> > > > >
> > > > > ---[ High Kernel Mapping ]---
> > > > > 0xffffffff80000000-0xffffffff81000000 16M pmd
> > > > > 0xffffffff81000000-0xffffffff81600000 6M ro PSE GLB x pmd
> > > > > 0xffffffff81600000-0xffffffff81a00000 4M ro PSE GLB NX pmd
> > > > > 0xffffffff81a00000-0xffffffff81c00000 2M RW GLB NX pte
> > > > > 0xffffffff81c00000-0xffffffff82200000 6M RW PSE GLB NX pmd
> > > > > 0xffffffff82200000-0xffffffff82400000 2M RW GLB NX pte
> > > > > 0xffffffff82400000-0xffffffffc0000000 988M pmd
> > > > > ---[ Modules ]---
> > > > > 0xffffffffc0000000-0xffffffffc0001000 4K RW GLB x pte
> > > > > 0xffffffffc0001000-0xffffffffc0002000 4K pte
> > > > > 0xffffffffc0002000-0xffffffffc0039000 220K RW GLB x pte
> > > > >
> > > > > root@piggy:~# cat /proc/modules | sort -k 6 | head -3
> > > > > scsi_mod 221979 4 sg,sd_mod,sr_mod,libata, Live 0xffffffffc0002000 (E)
> > > > > e1000 127757 0 - Live 0xffffffffc004d000 (E)
> > > > > libata 229931 2 ata_generic,ata_piix, Live 0xffffffffc0076000 (E)
> > > > >
> > > > > So that 4K RW seems suspect of getting used for allocation purpose on edge
> > > > > for a particular reason and it also happens to be on the edge of the high
> > > > > kernel mapping. Could it be the boundary semantic issue ?
> > > > >
> > > > > For instance can it be that since 0xffffffffc0002000 is given to the first
> > > > > module by the allocator, scsi_mod, and since that address is *technically*
> > > > > part of two boundaries we get a splat ?
> > > > >
> > > > > 0xffffffffc0001000-0xffffffffc0002000 4K pte
> > > > > 0xffffffffc0002000-0xffffffffc0039000 220K RW GLB x pte
> > > >
> > > > Note on the latest linux-next and on the commit that introduced this the config
> > > > and kernel yields only *one* page:
> > > >
> > > > x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
> > > >
> > > > I believe this is more indications my suspicion might be right.
> > >
> > > If the following is a legit forced way to get query the kernel to ask it
> > > who owns a page then perhaps this technique can be used in the future to
> > > figure out who the hell caused this. Catalin, can you confirm? In this
> > > case this is perhaps not a leaked page but I am trying to abuse the
> > > kmemleak debugfs API to query who allocated the page. Is that fine?
> > >
> > > [ 0.916771] WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:235 note_page+0x63c/0x7e0
> > > [ 0.917636] x86/mm: Found insecure W+X mapping at address ffffffffc03d5000/0xffffffffc03d5000
> > > [ 0.918502] Modules linked in:
> > > [ 0.918819] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.11.0-mcgrof-force-config #340
> > > [ 0.919631] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> > > [ 0.920011] Call Trace:
> > > [ 0.920011] dump_stack+0x63/0x81
> > > [ 0.920011] __warn+0xcb/0xf0
> > > [ 0.920011] warn_slowpath_fmt+0x5a/0x80
> > > [ 0.920011] note_page+0x63c/0x7e0
> > > [ 0.920011] ptdump_walk_pgd_level_core+0x3b1/0x460
> > > [ 0.920011] ? 0xffffffff86c00000
> > > [ 0.920011] ptdump_walk_pgd_level_checkwx+0x17/0x20
> > > [ 0.920011] mark_rodata_ro+0xf4/0x100
> > > [ 0.920011] ? rest_init+0x80/0x80
> > > [ 0.920011] kernel_init+0x2a/0x100
> > > [ 0.920011] ret_from_fork+0x2c/0x40
> > > [ 0.925474] ---[ end trace dca00cd779490a2b ]---
> > > [ 0.925959] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
> > >
> > > echo dump=0xffffffffc03d5000 > /sys/kernel/debug/kmemleak
> > > dmesg | tail
> > >
> > > [ 49.209565] kmemleak: Object 0xffffffffc03d5000 (size 335):
> > > [ 49.210814] kmemleak: comm "swapper/0", pid 1, jiffies 4294892440
> > > [ 49.212148] kmemleak: min_count = 2
> > > [ 49.212852] kmemleak: count = 0
> > > [ 49.213363] kmemleak: flags = 0x1
> > > [ 49.213363] kmemleak: checksum = 0
> > > [ 49.213363] kmemleak: backtrace:
> > > [ 49.213363] kmemleak_alloc+0x4a/0xa0
> > > [ 49.213363] __vmalloc_node_range+0x20a/0x2b0
> > > [ 49.213363] module_alloc+0x67/0xc0
> > > [ 49.213363] arch_ftrace_update_trampoline+0xba/0x260
> > > [ 49.213363] ftrace_startup+0x90/0x210
> > > [ 49.213363] register_ftrace_function+0x4b/0x60
> > > [ 49.213363] arm_kprobe+0x84/0xe0
> > > [ 49.213363] register_kprobe+0x56e/0x5b0
> > > [ 49.213363] init_test_probes+0x61/0x560
> > > [ 49.213363] init_kprobes+0x1e3/0x206
> > > [ 49.213363] do_one_initcall+0x52/0x1a0
> > > [ 49.213363] kernel_init_freeable+0x178/0x200
> > > [ 49.213363] kernel_init+0xe/0x100
> > > [ 49.213363] ret_from_fork+0x2c/0x40
> > > [ 49.213363] 0xffffffffffffffff
> >
> > Aha! And the winner is:
> >
> > CONFIG_KPROBES_SANITY_TEST
> >
> > I confirm disabling it on 4.3.0-rc3 and on linux-next next-20170519 avoids the WARN.
> > I also can confirm using the 'echo dump=mem-area > /sys/kernel/debug/kmemleak' yields
> > the same trace for both of these kernels.
> >
> > So -- the above kmemleak hack seems to actually work to seek who owns that page.
> >
> > Now to figure out how the hell kernel/test_kprobes.c screws around with things.
>
> Ah, that was fixed recently;
>
> https://marc.info/?l=linux-kernel&m=149076389011850
>
> Note that this patch depends another patch in the series;
>
> https://marc.info/?l=linux-kernel&m=149076370111812&w=2
I actually boot-tested the linux-next tag next-20170519, which carries these
patches, and the WARNING is still there. Please note the issue is with
CONFIG_KPROBES_SANITY_TEST enabled.
[ 1.025601] x86/mm: Found insecure W+X mapping at address ffffffffc01e7000/0xffffffffc01e7000
[ 1.026429] ------------[ cut here ]------------
[ 1.026885] WARNING: CPU: 1 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
[ 1.027711] Modules linked in:
[ 1.028032] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc1-next-20170519 #151
[ 1.028788] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
[ 1.029928] task: ffff9fd47a5ccc80 task.stack: ffffb6bcc0630000
[ 1.030509] RIP: 0010:note_page+0x630/0x7e0
[ 1.030917] RSP: 0000:ffffb6bcc0633df0 EFLAGS: 00010286
[ 1.031425] RAX: 0000000000000051 RBX: ffffb6bcc0633e88 RCX: ffffffffbb656708
[ 1.032132] RDX: 0000000000000000 RSI: 0000000000000096 RDI: 0000000000000246
[ 1.032834] RBP: ffffb6bcc0633e28 R08: 203a6d6d2f363878 R09: 0000000000000161
[ 1.033539] R10: ffffb6bcc0633dd8 R11: 736e6920646e756f R12: 0000000000000000
[ 1.034235] R13: 0000000000000004 R14: 0000000000000000 R15: 0000000000000000
[ 1.034927] FS: 0000000000000000(0000) GS:ffff9fd47fc80000(0000) knlGS:0000000000000000
[ 1.035722] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.036290] CR2: ffffb6bcc073c000 CR3: 0000000053209000 CR4: 00000000000006e0
[ 1.036839] Call Trace:
[ 1.037034] ptdump_walk_pgd_level_core+0x3e7/0x490
[ 1.037367] ? 0xffffffffbaa00000
[ 1.037705] ptdump_walk_pgd_level_checkwx+0x17/0x20
[ 1.038187] mark_rodata_ro+0xf4/0x100
[ 1.038559] ? rest_init+0x80/0x80
[ 1.038890] kernel_init+0x2f/0x100
[ 1.039235] ret_from_fork+0x2c/0x40
[ 1.039582] Code: 48 c7 43 28 00 00 00 00 48 89 43 20 e9 05 fd ff ff 48 8b 73 10 48 c7 c7 f0 3d 3e bb c6 05 f8 eb bc 00 01 48 89 f2 e8 1d 02 12 00 <0f> ff e9 1f fa ff ff 48 8b 70 20 48 c7 c7 3c ba 3e bb e8 06 02
[ 1.041416] ---[ end trace e726c1b63e5a81a9 ]---
[ 1.041872] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
root@piggy:~# echo dump=0xffffffffc01e7000 > /sys/kernel/debug/kmemleak
On dmesg:
May 23 07:44:51 piggy kernel: kmemleak: Object 0xffffffffc01e7000 (size 335):
May 23 07:44:51 piggy kernel: kmemleak: comm "swapper/0", pid 1, jiffies 4294892451
May 23 07:44:51 piggy kernel: kmemleak: min_count = 2
May 23 07:44:51 piggy kernel: kmemleak: count = 2
May 23 07:44:51 piggy kernel: kmemleak: flags = 0x1
May 23 07:44:51 piggy kernel: kmemleak: checksum = 0
May 23 07:44:51 piggy kernel: kmemleak: backtrace:
May 23 07:44:51 piggy kernel: kmemleak_alloc+0x4a/0xa0
May 23 07:44:51 piggy kernel: __vmalloc_node_range+0x20c/0x2b0
May 23 07:44:51 piggy kernel: module_alloc+0x67/0xc0
May 23 07:44:51 piggy kernel: arch_ftrace_update_trampoline+0xc1/0x240
May 23 07:44:51 piggy kernel: ftrace_startup+0x92/0x210
May 23 07:44:51 piggy kernel: register_ftrace_function+0x4b/0x60
May 23 07:44:51 piggy kernel: arm_kprobe+0x84/0xc0
May 23 07:44:51 piggy kernel: register_kprobe+0x59c/0x5e0
May 23 07:44:51 piggy kernel: init_test_probes+0x61/0x560
May 23 07:44:51 piggy kernel: init_kprobes+0x1ea/0x20d
May 23 07:44:51 piggy kernel: do_one_initcall+0x52/0x1a0
May 23 07:44:51 piggy kernel: kernel_init_freeable+0x17d/0x205
May 23 07:44:51 piggy kernel: kernel_init+0xe/0x100
May 23 07:44:51 piggy kernel: ret_from_fork+0x2c/0x40
May 23 07:44:51 piggy kernel: 0xffffffffffffffff
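
For reading dumps like the one above, a minimal shell sketch can reduce the output to bare function names, in the same order as the dump, so the path from module_alloc() back to its callers is easier to scan. The helper name kmemleak_frames is made up for illustration and is not an existing tool. It assumes every backtrace line carries a symbol+0xOFF/0xLEN frame, as in the log above, and treats the first line without one as the end of the backtrace:

```shell
# Hypothetical helper: read kernel log text on stdin and print the function
# names of a kmemleak backtrace, one per line, in the order they appear.
kmemleak_frames() {
    awk '
        # The backtrace starts right after the "kmemleak: backtrace:" line.
        /kmemleak:[ \t]*backtrace:/ { in_bt = 1; next }
        in_bt {
            if (match($0, /[A-Za-z_][A-Za-z0-9_.]*\+0x[0-9a-f]+\/0x[0-9a-f]+/)) {
                frame = substr($0, RSTART, RLENGTH)
                sub(/\+.*/, "", frame)   # drop the +0xOFF/0xLEN suffix
                print frame
            } else {
                in_bt = 0                # first non-frame line ends the trace
            }
        }
    '
}
```

Running `dmesg | kmemleak_frames` after the echo above should list kmemleak_alloc, __vmalloc_node_range and module_alloc first, followed by arch_ftrace_update_trampoline and the kprobes callers down to init_test_probes.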
Luis
* Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
2017-05-23 14:48 ` Luis R. Rodriguez
@ 2017-05-24 17:55 ` Luis R. Rodriguez
0 siblings, 0 replies; 21+ messages in thread
From: Luis R. Rodriguez @ 2017-05-24 17:55 UTC (permalink / raw)
To: Luis R. Rodriguez, Thomas Gleixner
Cc: Masami Hiramatsu, Jim Keniston, davem, sagar.abhishek,
Catalin Marinas, Steven Rostedt, Kees Cook, Stephen Smalley,
Ingo Molnar, Andy Lutomirski, Michal Hocko, Vlastimil Babka,
Andrew Morton, Eric W. Biederman, Mateusz Guzik, LKML
On Tue, May 23, 2017 at 04:48:50PM +0200, Luis R. Rodriguez wrote:
> On Sat, May 20, 2017 at 11:38:50AM +0900, Masami Hiramatsu wrote:
> > Hi Luis,
> >
> > On Fri, 19 May 2017 19:28:54 +0200
> > "Luis R. Rodriguez" <mcgrof@kernel.org> wrote:
> > >
> > > Aha! And the winner is:
> > >
> > > CONFIG_KPROBES_SANITY_TEST
> > >
> > > I confirm disabling it on 4.3.0-rc3 and on linux-next next-20170519 avoids the WARN.
> > > I also can confirm using the 'echo dump=mem-area > /sys/kernel/debug/kmemleak' yields
> > > the same trace for both of these kernels.
> > >
> > > So -- the above kmemleak hack seems to actually work to seek who owns that page.
> > >
> > > Now to figure out how the hell kernel/test_kprobes.c screws around with things.
> >
> > Ah, that was fixed recently;
> >
> > https://marc.info/?l=linux-kernel&m=149076389011850
> >
> > Note that this patch depends on another patch in the series;
> >
> > https://marc.info/?l=linux-kernel&m=149076370111812&w=2
>
> I actually boot-tested the linux-next tag next-20170519, which carries these
> patches, and the WARNING is still there. Please note the issue occurs with
> CONFIG_KPROBES_SANITY_TEST enabled.
>
> [ 1.025601] x86/mm: Found insecure W+X mapping at address ffffffffc01e7000/0xffffffffc01e7000
> [ 1.026429] ------------[ cut here ]------------
> [ 1.026885] WARNING: CPU: 1 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
> [ 1.027711] Modules linked in:
> [ 1.028032] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc1-next-20170519 #151
> [ 1.028788] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> [ 1.029928] task: ffff9fd47a5ccc80 task.stack: ffffb6bcc0630000
> [ 1.030509] RIP: 0010:note_page+0x630/0x7e0
> [ 1.030917] RSP: 0000:ffffb6bcc0633df0 EFLAGS: 00010286
> [ 1.031425] RAX: 0000000000000051 RBX: ffffb6bcc0633e88 RCX: ffffffffbb656708
> [ 1.032132] RDX: 0000000000000000 RSI: 0000000000000096 RDI: 0000000000000246
> [ 1.032834] RBP: ffffb6bcc0633e28 R08: 203a6d6d2f363878 R09: 0000000000000161
> [ 1.033539] R10: ffffb6bcc0633dd8 R11: 736e6920646e756f R12: 0000000000000000
> [ 1.034235] R13: 0000000000000004 R14: 0000000000000000 R15: 0000000000000000
> [ 1.034927] FS: 0000000000000000(0000) GS:ffff9fd47fc80000(0000) knlGS:0000000000000000
> [ 1.035722] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1.036290] CR2: ffffb6bcc073c000 CR3: 0000000053209000 CR4: 00000000000006e0
> [ 1.036839] Call Trace:
> [ 1.037034] ptdump_walk_pgd_level_core+0x3e7/0x490
> [ 1.037367] ? 0xffffffffbaa00000
> [ 1.037705] ptdump_walk_pgd_level_checkwx+0x17/0x20
> [ 1.038187] mark_rodata_ro+0xf4/0x100
> [ 1.038559] ? rest_init+0x80/0x80
> [ 1.038890] kernel_init+0x2f/0x100
> [ 1.039235] ret_from_fork+0x2c/0x40
> [ 1.039582] Code: 48 c7 43 28 00 00 00 00 48 89 43 20 e9 05 fd ff ff 48 8b 73 10 48 c7 c7 f0 3d 3e bb c6 05 f8 eb bc 00 01 48 89 f2 e8 1d 02 12 00 <0f> ff e9 1f fa ff ff 48 8b 70 20 48 c7 c7 3c ba 3e bb e8 06 02
> [ 1.041416] ---[ end trace e726c1b63e5a81a9 ]---
> [ 1.041872] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
>
> root@piggy:~# echo dump=0xffffffffc01e7000 > /sys/kernel/debug/kmemleak
>
> On dmesg:
>
> May 23 07:44:51 piggy kernel: kmemleak: Object 0xffffffffc01e7000 (size 335):
> May 23 07:44:51 piggy kernel: kmemleak: comm "swapper/0", pid 1, jiffies 4294892451
> May 23 07:44:51 piggy kernel: kmemleak: min_count = 2
> May 23 07:44:51 piggy kernel: kmemleak: count = 2
> May 23 07:44:51 piggy kernel: kmemleak: flags = 0x1
> May 23 07:44:51 piggy kernel: kmemleak: checksum = 0
> May 23 07:44:51 piggy kernel: kmemleak: backtrace:
> May 23 07:44:51 piggy kernel: kmemleak_alloc+0x4a/0xa0
> May 23 07:44:51 piggy kernel: __vmalloc_node_range+0x20c/0x2b0
> May 23 07:44:51 piggy kernel: module_alloc+0x67/0xc0
> May 23 07:44:51 piggy kernel: arch_ftrace_update_trampoline+0xc1/0x240
> May 23 07:44:51 piggy kernel: ftrace_startup+0x92/0x210
> May 23 07:44:51 piggy kernel: register_ftrace_function+0x4b/0x60
> May 23 07:44:51 piggy kernel: arm_kprobe+0x84/0xc0
> May 23 07:44:51 piggy kernel: register_kprobe+0x59c/0x5e0
> May 23 07:44:51 piggy kernel: init_test_probes+0x61/0x560
> May 23 07:44:51 piggy kernel: init_kprobes+0x1ea/0x20d
> May 23 07:44:51 piggy kernel: do_one_initcall+0x52/0x1a0
> May 23 07:44:51 piggy kernel: kernel_init_freeable+0x17d/0x205
> May 23 07:44:51 piggy kernel: kernel_init+0xe/0x100
> May 23 07:44:51 piggy kernel: ret_from_fork+0x2c/0x40
> May 23 07:44:51 piggy kernel: 0xffffffffffffffff

Turns out that Thomas Gleixner's patch from today [0] fixes this, as the same
module_alloc() path was the culprit. Steven Rostedt, however, just
reported [1] that this patch crashes on his ftracetests, so it would seem we
just need to address that last kink to fix this properly.

We can take this further in that other thread.

[0] https://lkml.kernel.org/r/alpine.DEB.2.20.1705241459480.2201@nanos
[1] https://lkml.kernel.org/r/20170524134728.61a896c9@vmware.local.home

Luis
* Re: next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0
2017-05-19 17:35 ` Catalin Marinas
2017-05-19 18:27 ` Andy Lutomirski
@ 2017-05-26 22:13 ` Luis R. Rodriguez
1 sibling, 0 replies; 21+ messages in thread
From: Luis R. Rodriguez @ 2017-05-26 22:13 UTC (permalink / raw)
To: Catalin Marinas
Cc: Steven Rostedt, Kees Cook, Stephen Smalley, Ingo Molnar,
Andy Lutomirski, Michal Hocko, Vlastimil Babka, Andrew Morton,
Eric W. Biederman, Mateusz Guzik, LKML, Luis R. Rodriguez
On Fri, May 19, 2017 at 10:35 AM, Catalin Marinas
<catalin.marinas@arm.com> wrote:
> On Fri, May 19, 2017 at 05:40:16PM +0200, Luis R. Rodriguez wrote:
>> If the following is a legit way to force the kernel to tell us
>> who owns a page, then perhaps this technique can be used in the future to
>> figure out who the hell caused this. Catalin, can you confirm? In this
>> case this is perhaps not a leaked page, but I am trying to abuse the
>> kmemleak debugfs API to query who allocated the page. Is that fine?
>>
>> [ 0.916771] WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:235 note_page+0x63c/0x7e0
>> [ 0.917636] x86/mm: Found insecure W+X mapping at address ffffffffc03d5000/0xffffffffc03d5000
>> [ 0.918502] Modules linked in:
>> [ 0.918819] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.11.0-mcgrof-force-config #340
>> [ 0.919631] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
>> [ 0.920011] Call Trace:
>> [ 0.920011] dump_stack+0x63/0x81
>> [ 0.920011] __warn+0xcb/0xf0
>> [ 0.920011] warn_slowpath_fmt+0x5a/0x80
>> [ 0.920011] note_page+0x63c/0x7e0
>> [ 0.920011] ptdump_walk_pgd_level_core+0x3b1/0x460
>> [ 0.920011] ? 0xffffffff86c00000
>> [ 0.920011] ptdump_walk_pgd_level_checkwx+0x17/0x20
>> [ 0.920011] mark_rodata_ro+0xf4/0x100
>> [ 0.920011] ? rest_init+0x80/0x80
>> [ 0.920011] kernel_init+0x2a/0x100
>> [ 0.920011] ret_from_fork+0x2c/0x40
>> [ 0.925474] ---[ end trace dca00cd779490a2b ]---
>> [ 0.925959] x86/mm: Checked W+X mappings: FAILED, 1 W+X pages found.
>>
>> echo dump=0xffffffffc03d5000 > /sys/kernel/debug/kmemleak
>> dmesg | tail
>>
>> [ 49.209565] kmemleak: Object 0xffffffffc03d5000 (size 335):
>> [ 49.210814] kmemleak: comm "swapper/0", pid 1, jiffies 4294892440
>> [ 49.212148] kmemleak: min_count = 2
>> [ 49.212852] kmemleak: count = 0
>> [ 49.213363] kmemleak: flags = 0x1
>> [ 49.213363] kmemleak: checksum = 0
>> [ 49.213363] kmemleak: backtrace:
>> [ 49.213363] kmemleak_alloc+0x4a/0xa0
>> [ 49.213363] __vmalloc_node_range+0x20a/0x2b0
>> [ 49.213363] module_alloc+0x67/0xc0
>> [ 49.213363] arch_ftrace_update_trampoline+0xba/0x260
>> [ 49.213363] ftrace_startup+0x90/0x210
>> [ 49.213363] register_ftrace_function+0x4b/0x60
>> [ 49.213363] arm_kprobe+0x84/0xe0
>> [ 49.213363] register_kprobe+0x56e/0x5b0
>> [ 49.213363] init_test_probes+0x61/0x560
>> [ 49.213363] init_kprobes+0x1e3/0x206
>> [ 49.213363] do_one_initcall+0x52/0x1a0
>> [ 49.213363] kernel_init_freeable+0x178/0x200
>> [ 49.213363] kernel_init+0xe/0x100
>> [ 49.213363] ret_from_fork+0x2c/0x40
>> [ 49.213363] 0xffffffffffffffff
>
> You could as well use kmemleak this way since it tracks the memory
> allocations.
Great!
> However, it doesn't track alloc_pages and also doesn't
> track mapping existing pages (vmap etc.)
Can we verify that? If so, then the splat from the above complaint
could include a follow-up dump of the trace, no? That would be *much* more
useful.
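
Until the splat itself can do that, a user-space approximation might look like the sketch below. It relies on the dump=<addr> command of the kmemleak debugfs file used earlier in this thread. The function names wx_addrs and kmemleak_dump_wx are hypothetical, and kmemleak_dump_wx assumes root, a mounted debugfs and CONFIG_DEBUG_KMEMLEAK=y. It can only help for allocations that kmemleak actually tracks.

```shell
# Hypothetical helper: read kernel log text on stdin and print the address of
# every "Found insecure W+X mapping" report, one 0x-prefixed address per line.
wx_addrs() {
    sed -n 's|.*Found insecure W+X mapping at address [0-9a-f]*/\(0x[0-9a-f]*\).*|\1|p'
}

# Hypothetical helper: ask kmemleak to dump what it knows about each reported
# address. The result lands in the kernel log, not on stdout.
# Assumes root, debugfs mounted at /sys/kernel/debug, and kmemleak enabled.
kmemleak_dump_wx() {
    dmesg | wx_addrs | while read -r addr; do
        echo "dump=$addr" > /sys/kernel/debug/kmemleak
    done
}
```

After boot, calling `kmemleak_dump_wx` and then `dmesg | tail` should show one kmemleak object report per W+X address found, provided kmemleak knows the object.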
Luis
end of thread, other threads: [~2017-05-27 1:19 UTC]
Thread overview: 21+ messages
2017-05-15 22:06 next-20170515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/dump_pagetables.c:236 note_page+0x630/0x7e0 Luis R. Rodriguez
2017-05-15 22:15 ` Luis R. Rodriguez
2017-05-15 22:57 ` Kees Cook
2017-05-15 23:45 ` Luis R. Rodriguez
2017-05-16 0:12 ` Kees Cook
2017-05-17 16:40 ` Luis R. Rodriguez
2017-05-17 17:53 ` Kees Cook
2017-05-19 0:44 ` Luis R. Rodriguez
2017-05-19 3:08 ` Luis R. Rodriguez
2017-05-19 15:40 ` Luis R. Rodriguez
2017-05-19 17:28 ` Luis R. Rodriguez
2017-05-20 2:38 ` Masami Hiramatsu
2017-05-23 14:48 ` Luis R. Rodriguez
2017-05-24 17:55 ` Luis R. Rodriguez
2017-05-19 17:35 ` Catalin Marinas
2017-05-19 18:27 ` Andy Lutomirski
2017-05-19 19:16 ` Kees Cook
2017-05-19 19:18 ` Andy Lutomirski
2017-05-19 19:29 ` Kees Cook
2017-05-26 22:13 ` Luis R. Rodriguez
2017-05-15 23:30 ` Luis R. Rodriguez