* arm64: W+X mapping check failures
@ 2018-04-25 13:37 Jan Glauber
  2018-04-25 13:55 ` Jeffrey Hugo
  2018-04-25 13:57 ` Mark Rutland
  0 siblings, 2 replies; 7+ messages in thread
From: Jan Glauber @ 2018-04-25 13:37 UTC (permalink / raw)
  To: linux-arm-kernel

Hi all,

enabling CONFIG_DEBUG_WX we see insecure mappings reported across various kernel
versions and machines. I've not yet seen this with upstream but that doesn't
mean much as the issue is a race and I cannot trigger it reliably.

The reported W+X mappings are gone after the boot is finished. The addresses
all belong to .init.* sections of the first loaded kernel modules.

Example log (I changed the warnings as I found the backtrace quite useless):

[   39.157884] Freeing unused kernel memory: 5248K
[   39.167997] note_prot_wx: Found insecure W+X mapping at start: ffff000000ab9000  addr: ffff000000abd000  pages: 4
[   39.178246] note_prot_wx: Found insecure W+X mapping at start: ffff000000ac3000  addr: ffff000000ac5000  pages: 2
[   39.188495] note_prot_wx: Found insecure W+X mapping at start: ffff000000acd000  addr: ffff000000ad0000  pages: 3
[   39.198745] note_prot_wx: Found insecure W+X mapping at start: ffff000000af9000  addr: ffff000000afc000  pages: 3
[   39.212981] Checked W+X mappings: FAILED, 12 W+X pages found, 0 non-UXN pages found
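
The numbers in this log can be cross-checked with a little arithmetic. The sketch below assumes that `start` and `addr` delimit each reported range and that pages are 4 KiB; neither is stated in the thread, though the figures are consistent with both. The regex and the helper name `parse_wx_ranges` are illustrative only.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB pages

# The four note_prot_wx lines from the log above, copied verbatim.
LOG = """\
[   39.167997] note_prot_wx: Found insecure W+X mapping at start: ffff000000ab9000  addr: ffff000000abd000  pages: 4
[   39.178246] note_prot_wx: Found insecure W+X mapping at start: ffff000000ac3000  addr: ffff000000ac5000  pages: 2
[   39.188495] note_prot_wx: Found insecure W+X mapping at start: ffff000000acd000  addr: ffff000000ad0000  pages: 3
[   39.198745] note_prot_wx: Found insecure W+X mapping at start: ffff000000af9000  addr: ffff000000afc000  pages: 3
"""

PATTERN = re.compile(r"start: ([0-9a-f]+)\s+addr: ([0-9a-f]+)\s+pages: (\d+)")


def parse_wx_ranges(log):
    """Return (start, end, reported_pages) for each W+X line in the log."""
    return [(int(s, 16), int(e, 16), int(n)) for s, e, n in PATTERN.findall(log)]


ranges = parse_wx_ranges(LOG)
# Page count derived from the address range of each entry.
computed = [(end - start) // PAGE_SIZE for start, end, _ in ranges]
# Page count as printed in the log.
reported = [n for _, _, n in ranges]
total = sum(reported)

print(computed, reported, total)
```

Under those assumptions the per-range counts agree with the printed `pages:` values, and their sum matches the 12 W+X pages in the summary line.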

I think this is a race between module loading and the ptdump_check_wx().
The RCU'd do_free_init() can be delayed _after_ ptdump_check_wx() for a coming module.

I tried using stop_machine() around the memory check similar to arm but that does not
solve the race. It is not a critical issue as the .init sections are freed afterwards
anyway but still the warning is a bit misleading.

Any thoughts?

--Jan


* arm64: W+X mapping check failures
  2018-04-25 13:37 arm64: W+X mapping check failures Jan Glauber
@ 2018-04-25 13:55 ` Jeffrey Hugo
  2018-04-25 14:50   ` Jan Glauber
  2018-04-25 13:57 ` Mark Rutland
  1 sibling, 1 reply; 7+ messages in thread
From: Jeffrey Hugo @ 2018-04-25 13:55 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Jan,

On 4/25/2018 7:37 AM, Jan Glauber wrote:
> Hi all,
> 
> enabling CONFIG_DEBUG_WX we see insecure mappings reported across various kernel
> versions and machines. I've not yet seen this with upstream but that doesn't
> mean much as the issue is a race and I cannot trigger it reliably.
> 
> The reported W+X mappings are gone after the boot is finished. The addresses
> all belong to .init.* sections of the first loaded kernel modules.
> 
> Example log (I changed the warnings as I found the backtrace quite useless):
> 
> [   39.157884] Freeing unused kernel memory: 5248K
> [   39.167997] note_prot_wx: Found insecure W+X mapping at start: ffff000000ab9000  addr: ffff000000abd000  pages: 4
> [   39.178246] note_prot_wx: Found insecure W+X mapping at start: ffff000000ac3000  addr: ffff000000ac5000  pages: 2
> [   39.188495] note_prot_wx: Found insecure W+X mapping at start: ffff000000acd000  addr: ffff000000ad0000  pages: 3
> [   39.198745] note_prot_wx: Found insecure W+X mapping at start: ffff000000af9000  addr: ffff000000afc000  pages: 3
> [   39.212981] Checked W+X mappings: FAILED, 12 W+X pages found, 0 non-UXN pages found
> 
> I think this is a race between module loading and the ptdump_check_wx().
> The RCU'd do_free_init() can be delayed _after_ ptdump_check_wx() for a coming module.
> 
> I tried using stop_machine() around the memory check similar to arm but that does not
> solve the race. It is not a critical issue as the .init sections are freed afterwards
> anyway but still the warning is a bit misleading.
> 
> Any thoughts?
> 
> --Jan

You are correct.  It appears you have independently found the issue I 
was about to send a fix for.

I have a setup that can repro this 100% of the time, and have confirmed 
there is a race between ptdump_check_wx() and do_free_init().

My fix is to put rcu_barrier_sched() just before the call to 
ptdump_check_wx().  This "flushes" the queued work, ensuring it runs to 
completion before ptdump_check_wx().
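
The ordering problem and the effect of the barrier can be illustrated with a toy model. This is only a sketch, not the kernel implementation: `ToyRcu`, `boot` and the page numbers are hypothetical, and the model assumes the module's init pages stay W+X until the deferred free callback runs.

```python
class ToyRcu:
    """Toy stand-in for a deferred-callback queue (not real RCU)."""

    def __init__(self):
        self.pending = []

    def call_rcu(self, callback):
        # Queue the callback; it runs at some later, unspecified time.
        self.pending.append(callback)

    def barrier(self):
        # Run every callback queued so far before returning.
        while self.pending:
            self.pending.pop(0)()


def boot(flush_before_check):
    """Return (W+X pages seen by the check, W+X pages left after boot)."""
    wx_pages = set()
    rcu = ToyRcu()

    # A module load maps its init pages W+X and defers freeing them.
    init_pages = {0xAB9, 0xABA, 0xABB, 0xABC}  # hypothetical page numbers
    wx_pages |= init_pages
    rcu.call_rcu(lambda: wx_pages.difference_update(init_pages))

    if flush_before_check:
        rcu.barrier()  # models the barrier placed before the check

    found = len(wx_pages)  # models the W+X check

    rcu.barrier()  # the deferred free eventually runs either way
    return found, len(wx_pages)


without_fix = boot(flush_before_check=False)
with_fix = boot(flush_before_check=True)
print(without_fix, with_fix)
```

In both runs nothing is left W+X once the queue drains, which matches the observation that the mappings are gone after boot; only the run without the flush lets the check observe them.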

In my testing, it works, however this fix does not prevent additional 
load_module() invocations from being triggered, and recreating the race 
condition.  From my debugging, it appears this might not be an issue in 
practice, as it looks like all modules that are expected to be loaded in 
that phase of boot are loaded before ptdump_check_wx() is called.

The other alternative would be to remove the use of PAGE_KERNEL_EXEC 
from module_alloc(), but based on the effort to clean that up afterward 
in the module loading process, I suspect that is not viable.



-- 
Jeffrey Hugo
Qualcomm Datacenter Technologies as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.


* arm64: W+X mapping check failures
  2018-04-25 13:37 arm64: W+X mapping check failures Jan Glauber
  2018-04-25 13:55 ` Jeffrey Hugo
@ 2018-04-25 13:57 ` Mark Rutland
  2018-04-25 14:47   ` Jan Glauber
  1 sibling, 1 reply; 7+ messages in thread
From: Mark Rutland @ 2018-04-25 13:57 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Apr 25, 2018 at 03:37:04PM +0200, Jan Glauber wrote:
> Hi all,

Hi Jan,

> enabling CONFIG_DEBUG_WX we see insecure mappings reported across various kernel
> versions and machines. I've not yet seen this with upstream but that doesn't
> mean much as the issue is a race and I cannot trigger it reliably.

Can you please tell us which kernel version(s) you're seeing this with,
and with which config options (if not defconfig)?

... and if possible, on which machines.

> The reported W+X mappings are gone after the boot is finished. The addresses
> all belong to .init.* sections of the first loaded kernel modules.

I'm afraid I haven't tried loading modules before getting to userspace,
and I'm not sure what I'd need to set up to test that.

> Example log (I changed the warnings as I found the backtrace quite useless):
> 
> [   39.157884] Freeing unused kernel memory: 5248K
> [   39.167997] note_prot_wx: Found insecure W+X mapping at start: ffff000000ab9000  addr: ffff000000abd000  pages: 4
> [   39.178246] note_prot_wx: Found insecure W+X mapping at start: ffff000000ac3000  addr: ffff000000ac5000  pages: 2
> [   39.188495] note_prot_wx: Found insecure W+X mapping at start: ffff000000acd000  addr: ffff000000ad0000  pages: 3
> [   39.198745] note_prot_wx: Found insecure W+X mapping at start: ffff000000af9000  addr: ffff000000afc000  pages: 3
> [   39.212981] Checked W+X mappings: FAILED, 12 W+X pages found, 0 non-UXN pages found
> 
> I think this is a race between module loading and the ptdump_check_wx().
> The RCU'd do_free_init() can be delayed _after_ ptdump_check_wx() for a coming module.

Do we need some explicit RCU sync to complete this, prior to
ptdump_check_wx(), perhaps?

> I tried using stop_machine() around the memory check similar to arm but that does not
> solve the race. It is not a critical issue as the .init sections are freed afterwards
> anyway but still the warning is a bit misleading.
> 
> Any thoughts?

I'm not sure if stop_machine() would complete an RCU grace period and
complete the freeing of module memory. As above, would some explicit RCU
sync help?

Thanks,
Mark.


* arm64: W+X mapping check failures
  2018-04-25 13:57 ` Mark Rutland
@ 2018-04-25 14:47   ` Jan Glauber
  2018-04-26 11:00     ` James Morse
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Glauber @ 2018-04-25 14:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Apr 25, 2018 at 02:57:02PM +0100, Mark Rutland wrote:
> On Wed, Apr 25, 2018 at 03:37:04PM +0200, Jan Glauber wrote:
> > Hi all,
> 
> Hi Jan,
> 
> > enabling CONFIG_DEBUG_WX we see insecure mappings reported across various kernel
> > versions and machines. I've not yet seen this with upstream but that doesn't
> > mean much as the issue is a race and I cannot trigger it reliably.
> 
> Can you please tell us which kernel version(s) you're seeing this with,
> and with which config options (if not defconfig)?

Ubuntu artful and bionic at least; these are 4.13+ and 4.15+.

> ... and if possible, on which machines.

ThunderX 1 and 2 and one other unspecified arm64 platform (would need to
ask).

> > The reported W+X mappings are gone after the boot is finished. The addresses
> > all belong to .init.* sections of the first loaded kernel modules.
> 
> I'm afraid I haven't tried loading modules before getting to userspace,
> and I'm not sure what I'd need to set up to test that.

Not much I guess, initramfs with early modules. For instance encrypted
root should be a possible testcase. In my tests it was always cryptd and
dependent modules (crypto_simd, aes_neon_blk, aes_neon_bs) that
triggered the issue.

> > Example log (I changed the warnings as I found the backtrace quite useless):
> > 
> > [   39.157884] Freeing unused kernel memory: 5248K
> > [   39.167997] note_prot_wx: Found insecure W+X mapping at start: ffff000000ab9000  addr: ffff000000abd000  pages: 4
> > [   39.178246] note_prot_wx: Found insecure W+X mapping at start: ffff000000ac3000  addr: ffff000000ac5000  pages: 2
> > [   39.188495] note_prot_wx: Found insecure W+X mapping at start: ffff000000acd000  addr: ffff000000ad0000  pages: 3
> > [   39.198745] note_prot_wx: Found insecure W+X mapping at start: ffff000000af9000  addr: ffff000000afc000  pages: 3
> > [   39.212981] Checked W+X mappings: FAILED, 12 W+X pages found, 0 non-UXN pages found
> > 
> > I think this is a race between module loading and the ptdump_check_wx().
> > The RCU'd do_free_init() can be delayed _after_ ptdump_check_wx() for a coming module.
> 
> Do we need some explicit RCU sync to complete this, prior to
> ptdump_check_wx(), perhaps?

Yes.

> > I tried using stop_machine() around the memory check similar to arm but that does not
> > solve the race. It is not a critical issue as the .init sections are freed afterwards
> > anyway but still the warning is a bit misleading.
> > 
> > Any thoughts?
> 
> I'm not sure if stop_machine() would complete an RCU grace period and
> complete the freeing of module memory. As above, would some explicit RCU
> sync help?

Yes, I tried synchronize_sched(), but without first looking at what it does.
Jeffrey's rcu_barrier_sched() looks better suited here.

thanks,
Jan

> Thanks,
> Mark.


* arm64: W+X mapping check failures
  2018-04-25 13:55 ` Jeffrey Hugo
@ 2018-04-25 14:50   ` Jan Glauber
  2018-04-25 15:18     ` Jeffrey Hugo
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Glauber @ 2018-04-25 14:50 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Apr 25, 2018 at 07:55:20AM -0600, Jeffrey Hugo wrote:
> Hi Jan,
> 
> On 4/25/2018 7:37 AM, Jan Glauber wrote:
> >Hi all,
> >
> >enabling CONFIG_DEBUG_WX we see insecure mappings reported across various kernel
> >versions and machines. I've not yet seen this with upstream but that doesn't
> >mean much as the issue is a race and I cannot trigger it reliably.
> >
> >The reported W+X mappings are gone after the boot is finished. The addresses
> >all belong to .init.* sections of the first loaded kernel modules.
> >
> >Example log (I changed the warnings as I found the backtrace quite useless):
> >
> >[   39.157884] Freeing unused kernel memory: 5248K
> >[   39.167997] note_prot_wx: Found insecure W+X mapping at start: ffff000000ab9000  addr: ffff000000abd000  pages: 4
> >[   39.178246] note_prot_wx: Found insecure W+X mapping at start: ffff000000ac3000  addr: ffff000000ac5000  pages: 2
> >[   39.188495] note_prot_wx: Found insecure W+X mapping at start: ffff000000acd000  addr: ffff000000ad0000  pages: 3
> >[   39.198745] note_prot_wx: Found insecure W+X mapping at start: ffff000000af9000  addr: ffff000000afc000  pages: 3
> >[   39.212981] Checked W+X mappings: FAILED, 12 W+X pages found, 0 non-UXN pages found
> >
> >I think this is a race between module loading and the ptdump_check_wx().
> >The RCU'd do_free_init() can be delayed _after_ ptdump_check_wx() for a coming module.
> >
> >I tried using stop_machine() around the memory check similar to arm but that does not
> >solve the race. It is not a critical issue as the .init sections are freed afterwards
> >anyway but still the warning is a bit misleading.
> >
> >Any thoughts?
> >
> >--Jan
> 
> You are correct.  It appears you have independently found the issue
> I was about to send a fix for.
> 
> I have a setup that can repro this 100% of the time, and have
> confirmed there is a race between ptdump_check_wx() and
> do_free_init().

How did you manage to hit this every time? Just wondering...

> My fix is to put rcu_barrier_sched() just before the call to
> ptdump_check_wx().  This "flushes" the queued work, ensuring it runs
> to completion before ptdump_check_wx().

Looks good to me, I tried synchronize_sched() which did not help but
I should have read the documentation first.

> In my testing, it works, however this fix does not prevent
> additional load_module() invocations from being triggered, and
> recreating the race condition.  From my debugging, it appears this
> might not be an issue in practice, as it looks like all modules that
> are expected to be loaded in that phase of boot are loaded before
> ptdump_check_wx() is called.

Yes, the race would still be there. We would need some combination of
stop_machine and the rcu barrier but I guess calling rcu_barrier_sched()
inside stop_machine would be a very very bad idea.

> The other alternative would be to remove the use of PAGE_KERNEL_EXEC
> from module_alloc(), but based on the effort to clean that up
> afterward in the module loading process, I suspect that is not
> viable.


* arm64: W+X mapping check failures
  2018-04-25 14:50   ` Jan Glauber
@ 2018-04-25 15:18     ` Jeffrey Hugo
  0 siblings, 0 replies; 7+ messages in thread
From: Jeffrey Hugo @ 2018-04-25 15:18 UTC (permalink / raw)
  To: linux-arm-kernel

On 4/25/2018 8:50 AM, Jan Glauber wrote:
> On Wed, Apr 25, 2018 at 07:55:20AM -0600, Jeffrey Hugo wrote:
>> Hi Jan,
>>
>> On 4/25/2018 7:37 AM, Jan Glauber wrote:
>>> Hi all,
>>>
>>> enabling CONFIG_DEBUG_WX we see insecure mappings reported across various kernel
>>> versions and machines. I've not yet seen this with upstream but that doesn't
>>> mean much as the issue is a race and I cannot trigger it reliably.
>>>
>>> The reported W+X mappings are gone after the boot is finished. The addresses
>>> all belong to .init.* sections of the first loaded kernel modules.
>>>
>>> Example log (I changed the warnings as I found the backtrace quite useless):
>>>
>>> [   39.157884] Freeing unused kernel memory: 5248K
>>> [   39.167997] note_prot_wx: Found insecure W+X mapping at start: ffff000000ab9000  addr: ffff000000abd000  pages: 4
>>> [   39.178246] note_prot_wx: Found insecure W+X mapping at start: ffff000000ac3000  addr: ffff000000ac5000  pages: 2
>>> [   39.188495] note_prot_wx: Found insecure W+X mapping at start: ffff000000acd000  addr: ffff000000ad0000  pages: 3
>>> [   39.198745] note_prot_wx: Found insecure W+X mapping at start: ffff000000af9000  addr: ffff000000afc000  pages: 3
>>> [   39.212981] Checked W+X mappings: FAILED, 12 W+X pages found, 0 non-UXN pages found
>>>
>>> I think this is a race between module loading and the ptdump_check_wx().
>>> The RCU'd do_free_init() can be delayed _after_ ptdump_check_wx() for a coming module.
>>>
>>> I tried using stop_machine() around the memory check similar to arm but that does not
>>> solve the race. It is not a critical issue as the .init sections are freed afterwards
>>> anyway but still the warning is a bit misleading.
>>>
>>> Any thoughts?
>>>
>>> --Jan
>>
>> You are correct.  It appears you have independently found the issue
>> I was about to send a fix for.
>>
>> I have a setup that can repro this 100% of the time, and have
>> confirmed there is a race between ptdump_check_wx() and
>> do_free_init().
> 
> How did you manage to hit this every time? Just wondering...

A software-based system simulator that is basically single-threaded.  If the 
simulator hits a race condition, it hits it every time.

On hardware (QDF2400), I have to put several devices into a reboot loop 
and wait 12+ hours for a single repro.  It usually ends up being 2000+ 
reboots combined.

>> My fix is to put rcu_barrier_sched() just before the call to
>> ptdump_check_wx().  This "flushes" the queued work, ensuring it runs
>> to completion before ptdump_check_wx().
> 
> Looks good to me, I tried synchronize_sched() which did not help but
> I should have read the documentation first.

Yep.  I thought of using synchronize_sched() based on the comment from 
do_init_module() until I went and scrutinized the documentation in the 
RCU header.
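
The distinction between the two calls can be shown with a toy model. This is a simplified sketch, not kernel code: the class `ToyRcuSched` is hypothetical, its method names merely mirror the kernel functions, and "no guarantee that callbacks have run" is modelled as "callbacks stay queued".

```python
class ToyRcuSched:
    """Toy model only: names mirror the kernel API, behaviour is simplified."""

    def __init__(self):
        self.queued = []   # callbacks registered but not yet invoked
        self.invoked = []  # names of callbacks that have run

    def call_rcu_sched(self, name):
        self.queued.append(name)

    def synchronize_sched(self):
        # Returns once a grace period has elapsed. It makes no promise that
        # queued callbacks have run, so this model leaves them queued.
        pass

    def rcu_barrier_sched(self):
        # Returns only after every callback queued so far has been invoked.
        while self.queued:
            self.invoked.append(self.queued.pop(0))


rcu = ToyRcuSched()
rcu.call_rcu_sched("do_free_init")

rcu.synchronize_sched()
after_sync = list(rcu.queued)      # callback is still pending

rcu.rcu_barrier_sched()
after_barrier = list(rcu.queued)   # nothing left pending

print(after_sync, after_barrier, rcu.invoked)
```

In this model a check placed after `synchronize_sched()` can still run before the free callback, while a check placed after `rcu_barrier_sched()` cannot.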

> 
>> In my testing, it works, however this fix does not prevent
>> additional load_module() invocations from being triggered, and
>> recreating the race condition.  From my debugging, it appears this
>> might not be an issue in practice, as it looks like all modules that
>> are expected to be loaded in that phase of boot are loaded before
>> ptdump_check_wx() is called.
> 
> Yes, the race would still be there. We would need some combination of
> stop_machine and the rcu barrier but I guess calling rcu_barrier_sched()
> inside stop_machine would be a very very bad idea.
> 

Yeah, that sounds like a horrible idea to me, but I'm certainly not an 
expert.

>> The other alternative would be to remove the use of PAGE_KERNEL_EXEC
>> from module_alloc(), but based on the effort to clean that up
>> afterward in the module loading process, I suspect that is not
>> viable.
>>


-- 
Jeffrey Hugo
Qualcomm Datacenter Technologies as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.


* arm64: W+X mapping check failures
  2018-04-25 14:47   ` Jan Glauber
@ 2018-04-26 11:00     ` James Morse
  0 siblings, 0 replies; 7+ messages in thread
From: James Morse @ 2018-04-26 11:00 UTC (permalink / raw)
  To: linux-arm-kernel

Hi guys,

On 25/04/18 15:47, Jan Glauber wrote:
> On Wed, Apr 25, 2018 at 02:57:02PM +0100, Mark Rutland wrote:
>> On Wed, Apr 25, 2018 at 03:37:04PM +0200, Jan Glauber wrote:
>>> The reported W+X mappings are gone after the boot is finished. The addresses
>>> all belong to .init.* sections of the first loaded kernel modules.
>>
>> I'm afraid I haven't tried loading modules before getting to userspace,
>> and I'm not sure what I'd need to set up to test that.
> 
> Not much I guess, initramfs with early modules. For instance encrypted
> root should be a possible testcase. In my tests it was always cryptd and
> dependent modules (crypto_simd, aes_neon_blk, aes_neon_bs) that
> triggered the issue.

I've used "elevator=deadline" on the kernel cmdline as something that takes less
setup. This will cause deadline-iosched.ko to be loaded during
kernel_init_freeable() if it's not built in.

dmesg shows:
| [   10.195409] I/O scheduler deadline not found

| [   10.204365] scsi 4:0:0:0: Direct-Access     ATA      WDC WD5000AAKX-0
| [   10.222301] sd 4:0:0:0: [sda] 976773168 512-byte logical blocks:
| [   10.237494] sd 4:0:0:0: [sda] Write Protect is off
| [   10.247130] sd 4:0:0:0: [sda] Mode Sense: 00 3a 00 00
| [   10.257365] sd 4:0:0:0: [sda] Write cache: enabled, read cache: enabled,
| [   10.336133]  sda: sda1 sda2 sda3 sda4 sda5

| [   11.723286] io scheduler deadline registered (default)
| [   11.738017] Freeing unused kernel memory: 5824K



Thanks,

James

