* BUG: KASAN: stack-out-of-bounds in unwind_get_return_address @ 2016-11-29 18:13 Scott Bauer 2016-11-30 18:35 ` Josh Poimboeuf 0 siblings, 1 reply; 35+ messages in thread From: Scott Bauer @ 2016-11-29 18:13 UTC (permalink / raw) To: linux-kernel; +Cc: jpoimboe, peterz, mingo, luto This is super easy to repro ontop of 4.9-rc7: run pm-suspend and it hits every time [ 968.667086] ================================================================== [ 968.667091] BUG: KASAN: stack-out-of-bounds in unwind_get_return_address+0x11d/0x130 at addr ffff8803867d7878 [ 968.667092] Read of size 8 by task pm-suspend/7774 [ 968.667095] page:ffffea000e19f5c0 count:0 mapcount:0 mapping: (null) index:0x0 [ 968.667096] flags: 0x2ffff0000000000() [ 968.667097] page dumped because: kasan: bad access detected [ 968.667099] CPU: 0 PID: 7774 Comm: pm-suspend Tainted: G B 4.9.0-rc7+ #8 [ 968.667100] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016 [ 968.667102] ffff8803867d7468 ffffffffb4c0d051 ffff8803867d7500 ffff8803867d7878 [ 968.667103] ffff8803867d74f0 ffffffffb45cbe34 ffffffffb4e64136 ffffffffb4510d42 [ 968.667105] ffff8803828c3f4c 0000000000000097 0000000041b58ab3 ffffffffb6192731 [ 968.667105] Call Trace: [ 968.667108] [<ffffffffb4c0d051>] dump_stack+0x63/0x82 [ 968.667110] [<ffffffffb45cbe34>] kasan_report_error+0x4b4/0x4e0 [ 968.667112] [<ffffffffb4e64136>] ? acpi_hw_read_port+0xd0/0x1ea [ 968.667113] [<ffffffffb4510d42>] ? kfree_const+0x22/0x30 [ 968.667114] [<ffffffffb4e64066>] ? acpi_hw_validate_io_request+0x1a6/0x1a6 [ 968.667116] [<ffffffffb45cc011>] __asan_report_load8_noabort+0x61/0x70 [ 968.667117] [<ffffffffb411a29d>] ? unwind_get_return_address+0x11d/0x130 [ 968.667118] [<ffffffffb411a29d>] unwind_get_return_address+0x11d/0x130 [ 968.667119] [<ffffffffb411a497>] ? 
unwind_next_frame+0x97/0xf0 [ 968.667120] [<ffffffffb40b01e2>] __save_stack_trace+0x92/0x100 [ 968.667122] [<ffffffffb40b026b>] save_stack_trace+0x1b/0x20 [ 968.667123] [<ffffffffb45cac76>] save_stack+0x46/0xd0 [ 968.667124] [<ffffffffb40b026b>] ? save_stack_trace+0x1b/0x20 [ 968.667125] [<ffffffffb45cac76>] ? save_stack+0x46/0xd0 [ 968.667126] [<ffffffffb45caeed>] ? kasan_kmalloc+0xad/0xe0 [ 968.667127] [<ffffffffb45cb432>] ? kasan_slab_alloc+0x12/0x20 [ 968.667128] [<ffffffffb4e62d56>] ? acpi_hw_read+0x2b6/0x3aa [ 968.667129] [<ffffffffb4e62aa0>] ? acpi_hw_validate_register+0x20b/0x20b [ 968.667131] [<ffffffffb4e642c2>] ? acpi_hw_write_port+0x72/0xc7 [ 968.667132] [<ffffffffb4e63108>] ? acpi_hw_write+0x11f/0x15f [ 968.667133] [<ffffffffb4e62fe9>] ? acpi_hw_read_multiple+0x19f/0x19f [ 968.667134] [<ffffffffb45cb065>] ? memcpy+0x45/0x50 [ 968.667135] [<ffffffffb4e642c2>] ? acpi_hw_write_port+0x72/0xc7 [ 968.667136] [<ffffffffb4e63108>] ? acpi_hw_write+0x11f/0x15f [ 968.667137] [<ffffffffb4e62fe9>] ? acpi_hw_read_multiple+0x19f/0x19f [ 968.667138] [<ffffffffb45cad86>] ? kasan_unpoison_shadow+0x36/0x50 [ 968.667140] [<ffffffffb45caeed>] kasan_kmalloc+0xad/0xe0 [ 968.667141] [<ffffffffb45cb432>] kasan_slab_alloc+0x12/0x20 [ 968.667142] [<ffffffffb45c757c>] kmem_cache_alloc_trace+0xbc/0x1e0 [ 968.667143] [<ffffffffb4e64de2>] ? acpi_get_sleep_type_data+0x9a/0x578 [ 968.667144] [<ffffffffb4e64de2>] acpi_get_sleep_type_data+0x9a/0x578 [ 968.667146] [<ffffffffb4e63bc9>] acpi_hw_legacy_wake_prep+0x88/0x22c [ 968.667147] [<ffffffffb4e63b41>] ? acpi_hw_legacy_sleep+0x3c7/0x3c7 [ 968.667148] [<ffffffffb4e64904>] ? acpi_write_bit_register+0x28d/0x2d3 [ 968.667149] [<ffffffffb4e64677>] ? acpi_read_bit_register+0x19b/0x19b [ 968.667150] [<ffffffffb4e6555d>] acpi_hw_sleep_dispatch+0xb5/0xba [ 968.667151] [<ffffffffb4e65579>] acpi_leave_sleep_state_prep+0x17/0x19 [ 968.667153] [<ffffffffb4e0e1d4>] acpi_suspend_enter+0x154/0x1e0 [ 968.667154] [<ffffffffb4e0e080>] ? 
trace_suspend_resume+0xe8/0xe8 [ 968.667156] [<ffffffffb4262539>] suspend_devices_and_enter+0xb09/0xdb0 [ 968.667157] [<ffffffffb44a6069>] ? printk+0xa8/0xd8 [ 968.667158] [<ffffffffb4261a30>] ? arch_suspend_enable_irqs+0x20/0x20 [ 968.667159] [<ffffffffb4260815>] ? try_to_freeze_tasks+0x295/0x600 [ 968.667160] [<ffffffffb4262ea9>] pm_suspend+0x6c9/0x780 [ 968.667162] [<ffffffffb4244010>] ? finish_wait+0x1f0/0x1f0 [ 968.667163] [<ffffffffb42627e0>] ? suspend_devices_and_enter+0xdb0/0xdb0 [ 968.667164] [<ffffffffb425fe02>] state_store+0xa2/0x120 [ 968.667165] [<ffffffffb4c12ca0>] ? kobj_attr_show+0x60/0x60 [ 968.667166] [<ffffffffb4c12cd6>] kobj_attr_store+0x36/0x70 [ 968.667168] [<ffffffffb47b0701>] sysfs_kf_write+0x131/0x200 [ 968.667169] [<ffffffffb47ae0e5>] kernfs_fop_write+0x295/0x3f0 [ 968.667170] [<ffffffffb462aadf>] __vfs_write+0xef/0x760 [ 968.667172] [<ffffffffb454d136>] ? handle_mm_fault+0x1346/0x35e0 [ 968.667173] [<ffffffffb462a9f0>] ? do_iter_readv_writev+0x660/0x660 [ 968.667174] [<ffffffffb454bdf0>] ? __pmd_alloc+0x310/0x310 [ 968.667176] [<ffffffffb47345d0>] ? do_lock_file_wait+0x1e0/0x1e0 [ 968.667178] [<ffffffffb4ad66e8>] ? apparmor_file_permission+0x18/0x20 [ 968.667179] [<ffffffffb4a14773>] ? security_file_permission+0x73/0x1c0 [ 968.667181] [<ffffffffb462ba3d>] ? rw_verify_area+0xbd/0x2b0 [ 968.667182] [<ffffffffb462c069>] vfs_write+0x149/0x4a0 [ 968.667184] [<ffffffffb462f9a9>] SyS_write+0xd9/0x1c0 [ 968.667185] [<ffffffffb462f8d0>] ? 
SyS_read+0x1c0/0x1c0 [ 968.667187] [<ffffffffb5a708fb>] entry_SYSCALL_64_fastpath+0x1e/0xad [ 968.667188] Memory state around the buggy address: [ 968.667189] ffff8803867d7700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 968.667190] ffff8803867d7780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 968.667191] >ffff8803867d7800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f4 [ 968.667192] ^ [ 968.667192] ffff8803867d7880: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 [ 968.667193] ffff8803867d7900: 00 00 00 f1 f1 f1 f1 04 f4 f4 f4 f3 f3 f3 f3 00 [ 968.667193] ================================================================== ^ permalink raw reply [flat|nested] 35+ messages in thread
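A note on reading the "Memory state around the buggy address" dump in the report above. This is only a minimal sketch, assuming generic KASAN's mapping of one shadow byte per 8 bytes of memory and the layout the dump itself shows (16 shadow bytes per row, rows 0x80 apart). The helper names are hypothetical and are not kernel functions; they only locate the shadow byte that belongs to the reported address.

```c
#include <stdint.h>

/* Assumption: one shadow byte describes 8 bytes of memory (generic KASAN). */
#define SHADOW_SCALE_SHIFT	3
/* The report prints 16 shadow bytes per row, so a row spans 16 * 8 = 128 bytes. */
#define SHADOW_BYTES_PER_ROW	16

/* Hypothetical helper: start address of the dump row that contains addr. */
uint64_t dump_row_base(uint64_t addr)
{
	uint64_t row_span = (uint64_t)SHADOW_BYTES_PER_ROW << SHADOW_SCALE_SHIFT;

	return addr & ~(row_span - 1);
}

/* Hypothetical helper: position (0..15) of addr's shadow byte within its row. */
unsigned int dump_row_index(uint64_t addr)
{
	return (unsigned int)((addr - dump_row_base(addr)) >> SHADOW_SCALE_SHIFT);
}
```

For the reported address ffff8803867d7878 this gives row ffff8803867d7800 and index 15, i.e. the last shadow byte of the row marked with ">", which is the f4 value in the dump. A non-zero shadow byte there is what makes KASAN flag the 8-byte read.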
* Re: BUG: KASAN: stack-out-of-bounds in unwind_get_return_address 2016-11-29 18:13 BUG: KASAN: stack-out-of-bounds in unwind_get_return_address Scott Bauer @ 2016-11-30 18:35 ` Josh Poimboeuf 2016-11-30 19:02 ` Scott Bauer 0 siblings, 1 reply; 35+ messages in thread From: Josh Poimboeuf @ 2016-11-30 18:35 UTC (permalink / raw) To: Scott Bauer; +Cc: linux-kernel, peterz, mingo, luto On Tue, Nov 29, 2016 at 11:13:01AM -0700, Scott Bauer wrote: > This is super easy to repro ontop of 4.9-rc7: > run pm-suspend and it hits every time > > > [ 968.667086] ================================================================== > [ 968.667091] BUG: KASAN: stack-out-of-bounds in unwind_get_return_address+0x11d/0x130 at addr ffff8803867d7878 > [ 968.667092] Read of size 8 by task pm-suspend/7774 > [ 968.667095] page:ffffea000e19f5c0 count:0 mapcount:0 mapping: (null) index:0x0 > [ 968.667096] flags: 0x2ffff0000000000() > [ 968.667097] page dumped because: kasan: bad access detected Thanks for reporting this. I think it's a false positive caused by the fact that the suspend and resume happen at different contexts. Can you test if this patch fixes it? 
diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c index 4858733..62bd046 100644 --- a/arch/x86/kernel/acpi/sleep.c +++ b/arch/x86/kernel/acpi/sleep.c @@ -115,6 +115,9 @@ int x86_acpi_suspend_lowlevel(void) pause_graph_tracing(); do_suspend_lowlevel(); unpause_graph_tracing(); + + kasan_unpoison_stack_below_sp(); + return 0; } diff --git a/include/linux/kasan.h b/include/linux/kasan.h index 820c0ad..ca36126 100644 --- a/include/linux/kasan.h +++ b/include/linux/kasan.h @@ -45,6 +45,12 @@ void kasan_unpoison_shadow(const void *address, size_t size); void kasan_unpoison_task_stack(struct task_struct *task); void kasan_unpoison_stack_above_sp_to(const void *watermark); +asmlinkage void kasan_unpoison_task_stack_below(const void *watermark); + +static inline void kasan_unpoison_stack_below_sp(void) +{ + kasan_unpoison_task_stack_below(__builtin_frame_address(0)); +} void kasan_alloc_pages(struct page *page, unsigned int order); void kasan_free_pages(struct page *page, unsigned int order); ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: BUG: KASAN: stack-out-of-bounds in unwind_get_return_address 2016-11-30 18:35 ` Josh Poimboeuf @ 2016-11-30 19:02 ` Scott Bauer 2016-11-30 23:10 ` [PATCH] x86/suspend: fix false positive KASAN warning on suspend/resume Josh Poimboeuf 0 siblings, 1 reply; 35+ messages in thread From: Scott Bauer @ 2016-11-30 19:02 UTC (permalink / raw) To: Josh Poimboeuf; +Cc: linux-kernel, peterz, mingo, luto On Wed, Nov 30, 2016 at 12:35:07PM -0600, Josh Poimboeuf wrote: > On Tue, Nov 29, 2016 at 11:13:01AM -0700, Scott Bauer wrote: > > This is super easy to repro ontop of 4.9-rc7: > > run pm-suspend and it hits every time > > > > > > [ 968.667086] ================================================================== > > [ 968.667091] BUG: KASAN: stack-out-of-bounds in unwind_get_return_address+0x11d/0x130 at addr ffff8803867d7878 > > [ 968.667092] Read of size 8 by task pm-suspend/7774 > > [ 968.667095] page:ffffea000e19f5c0 count:0 mapcount:0 mapping: (null) index:0x0 > > [ 968.667096] flags: 0x2ffff0000000000() > > [ 968.667097] page dumped because: kasan: bad access detected > > Thanks for reporting this. I think it's a false positive caused by the > fact that the suspend and resume happen at different contexts. > > Can you test if this patch fixes it? 
> > diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c > index 4858733..62bd046 100644 > --- a/arch/x86/kernel/acpi/sleep.c > +++ b/arch/x86/kernel/acpi/sleep.c > @@ -115,6 +115,9 @@ int x86_acpi_suspend_lowlevel(void) > pause_graph_tracing(); > do_suspend_lowlevel(); > unpause_graph_tracing(); > + > + kasan_unpoison_stack_below_sp(); > + > return 0; > } > > diff --git a/include/linux/kasan.h b/include/linux/kasan.h > index 820c0ad..ca36126 100644 > --- a/include/linux/kasan.h > +++ b/include/linux/kasan.h > @@ -45,6 +45,12 @@ void kasan_unpoison_shadow(const void *address, size_t size); > > void kasan_unpoison_task_stack(struct task_struct *task); > void kasan_unpoison_stack_above_sp_to(const void *watermark); > +asmlinkage void kasan_unpoison_task_stack_below(const void *watermark); > + > +static inline void kasan_unpoison_stack_below_sp(void) > +{ > + kasan_unpoison_task_stack_below(__builtin_frame_address(0)); > +} > > void kasan_alloc_pages(struct page *page, unsigned int order); > void kasan_free_pages(struct page *page, unsigned int order); Thanks for the quick turn-around. This patch worked for me. You can add me as tested by if you need. ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-11-30 19:02 ` Scott Bauer @ 2016-11-30 23:10 ` Josh Poimboeuf 2016-12-01 9:05 ` Andrey Ryabinin 2016-12-01 14:04 ` [PATCH] " Rafael J. Wysocki 0 siblings, 2 replies; 35+ messages in thread From: Josh Poimboeuf @ 2016-11-30 23:10 UTC (permalink / raw) To: Rafael J. Wysocki, Len Brown, Pavel Machek, linux-pm Cc: linux-kernel, peterz, mingo, luto, Scott Bauer, x86, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, kasan-dev Resuming from a suspend operation is showing a KASAN false positive warning: BUG: KASAN: stack-out-of-bounds in unwind_get_return_address+0x11d/0x130 at addr ffff8803867d7878 Read of size 8 by task pm-suspend/7774 page:ffffea000e19f5c0 count:0 mapcount:0 mapping: (null) index:0x0 flags: 0x2ffff0000000000() page dumped because: kasan: bad access detected CPU: 0 PID: 7774 Comm: pm-suspend Tainted: G B 4.9.0-rc7+ #8 Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016 ffff8803867d7468 ffffffffb4c0d051 ffff8803867d7500 ffff8803867d7878 ffff8803867d74f0 ffffffffb45cbe34 ffffffffb4e64136 ffffffffb4510d42 ffff8803828c3f4c 0000000000000097 0000000041b58ab3 ffffffffb6192731 Call Trace: dump_stack+0x63/0x82 kasan_report_error+0x4b4/0x4e0 ? acpi_hw_read_port+0xd0/0x1ea ? kfree_const+0x22/0x30 ? acpi_hw_validate_io_request+0x1a6/0x1a6 __asan_report_load8_noabort+0x61/0x70 ? unwind_get_return_address+0x11d/0x130 unwind_get_return_address+0x11d/0x130 ? unwind_next_frame+0x97/0xf0 __save_stack_trace+0x92/0x100 save_stack_trace+0x1b/0x20 save_stack+0x46/0xd0 ? save_stack_trace+0x1b/0x20 ? save_stack+0x46/0xd0 ? kasan_kmalloc+0xad/0xe0 ? kasan_slab_alloc+0x12/0x20 ? acpi_hw_read+0x2b6/0x3aa ? acpi_hw_validate_register+0x20b/0x20b ? acpi_hw_write_port+0x72/0xc7 ? acpi_hw_write+0x11f/0x15f ? acpi_hw_read_multiple+0x19f/0x19f ? memcpy+0x45/0x50 ? acpi_hw_write_port+0x72/0xc7 ? acpi_hw_write+0x11f/0x15f ? acpi_hw_read_multiple+0x19f/0x19f ? 
kasan_unpoison_shadow+0x36/0x50 kasan_kmalloc+0xad/0xe0 kasan_slab_alloc+0x12/0x20 kmem_cache_alloc_trace+0xbc/0x1e0 ? acpi_get_sleep_type_data+0x9a/0x578 acpi_get_sleep_type_data+0x9a/0x578 acpi_hw_legacy_wake_prep+0x88/0x22c ? acpi_hw_legacy_sleep+0x3c7/0x3c7 ? acpi_write_bit_register+0x28d/0x2d3 ? acpi_read_bit_register+0x19b/0x19b acpi_hw_sleep_dispatch+0xb5/0xba acpi_leave_sleep_state_prep+0x17/0x19 acpi_suspend_enter+0x154/0x1e0 ? trace_suspend_resume+0xe8/0xe8 suspend_devices_and_enter+0xb09/0xdb0 ? printk+0xa8/0xd8 ? arch_suspend_enable_irqs+0x20/0x20 ? try_to_freeze_tasks+0x295/0x600 pm_suspend+0x6c9/0x780 ? finish_wait+0x1f0/0x1f0 ? suspend_devices_and_enter+0xdb0/0xdb0 state_store+0xa2/0x120 ? kobj_attr_show+0x60/0x60 kobj_attr_store+0x36/0x70 sysfs_kf_write+0x131/0x200 kernfs_fop_write+0x295/0x3f0 __vfs_write+0xef/0x760 ? handle_mm_fault+0x1346/0x35e0 ? do_iter_readv_writev+0x660/0x660 ? __pmd_alloc+0x310/0x310 ? do_lock_file_wait+0x1e0/0x1e0 ? apparmor_file_permission+0x18/0x20 ? security_file_permission+0x73/0x1c0 ? rw_verify_area+0xbd/0x2b0 vfs_write+0x149/0x4a0 SyS_write+0xd9/0x1c0 ? SyS_read+0x1c0/0x1c0 entry_SYSCALL_64_fastpath+0x1e/0xad Memory state around the buggy address: ffff8803867d7700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff8803867d7780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff8803867d7800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f4 ^ ffff8803867d7880: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 ffff8803867d7900: 00 00 00 f1 f1 f1 f1 04 f4 f4 f4 f3 f3 f3 f3 00 KASAN instrumentation poisons the stack when entering a function and unpoisons it when exiting the function. However, in the suspend path, some functions never return, so their stack never gets unpoisoned, resulting in stale KASAN shadow data which can cause false positive warnings like the one above. 
Reported-by: Scott Bauer <scott.bauer@intel.com> Tested-by: Scott Bauer <scott.bauer@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> --- arch/x86/kernel/acpi/sleep.c | 3 +++ include/linux/kasan.h | 7 +++++++ 2 files changed, 10 insertions(+) diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c index 4858733..62bd046 100644 --- a/arch/x86/kernel/acpi/sleep.c +++ b/arch/x86/kernel/acpi/sleep.c @@ -115,6 +115,9 @@ int x86_acpi_suspend_lowlevel(void) pause_graph_tracing(); do_suspend_lowlevel(); unpause_graph_tracing(); + + kasan_unpoison_stack_below_sp(); + return 0; } diff --git a/include/linux/kasan.h b/include/linux/kasan.h index 820c0ad..e0945d5 100644 --- a/include/linux/kasan.h +++ b/include/linux/kasan.h @@ -45,6 +45,12 @@ void kasan_unpoison_shadow(const void *address, size_t size); void kasan_unpoison_task_stack(struct task_struct *task); void kasan_unpoison_stack_above_sp_to(const void *watermark); +asmlinkage void kasan_unpoison_task_stack_below(const void *watermark); + +static inline void kasan_unpoison_stack_below_sp(void) +{ + kasan_unpoison_task_stack_below(__builtin_frame_address(0)); +} void kasan_alloc_pages(struct page *page, unsigned int order); void kasan_free_pages(struct page *page, unsigned int order); @@ -87,6 +93,7 @@ static inline void kasan_unpoison_shadow(const void *address, size_t size) {} static inline void kasan_unpoison_task_stack(struct task_struct *task) {} static inline void kasan_unpoison_stack_above_sp_to(const void *watermark) {} +static inline void kasan_unpoison_stack_below_sp(void) {} static inline void kasan_enable_current(void) {} static inline void kasan_disable_current(void) {} -- 2.7.4 ^ permalink raw reply related [flat|nested] 35+ messages in thread
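The changelog's explanation (poison on function entry, unpoison on exit, stale shadow when a function never returns) can be illustrated with a small user-space model. This is a sketch under stated assumptions, not kernel code: the toy stack, the 0xf3 marker, and every function name below are made up for illustration, and unpoison_below() only imitates what the patch's kasan_unpoison_task_stack_below() is meant to achieve.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_SIZE	1024	/* toy stack, offsets 0..1023, grows down */
#define GRANULE		8	/* bytes covered by one shadow byte */
#define POISON		0xf3	/* stand-in redzone marker */

static uint8_t shadow[STACK_SIZE / GRANULE];

/* Models the instrumented prologue: poison a redzone in the new frame. */
static void frame_enter(size_t redzone_off)
{
	shadow[redzone_off / GRANULE] = POISON;
}

/* Models the instrumented epilogue: clear the redzone again. */
static void frame_exit(size_t redzone_off)
{
	shadow[redzone_off / GRANULE] = 0;
}

/* Hypothetical analogue of the fix: clear all shadow below the watermark. */
static void unpoison_below(size_t watermark)
{
	memset(shadow, 0, watermark / GRANULE);
}

/*
 * A deeper frame poisons its redzone and then control leaves it without
 * running the epilogue (the "never returns" case). Returns 1 if the stale
 * poison is still there when the caller continues, 0 otherwise.
 */
int stale_poison_after_resume(int apply_fix)
{
	size_t caller_sp = 512;		/* frame of the function that resumes */
	size_t callee_redzone = 256;	/* deeper frame, lower address */

	memset(shadow, 0, sizeof(shadow));
	frame_enter(callee_redzone);
	/* suspend happens here: frame_exit(callee_redzone) is never reached */
	(void)frame_exit;

	if (apply_fix)
		unpoison_below(caller_sp);

	return shadow[callee_redzone / GRANULE] != 0;
}
```

Without the fix the model reports stale poison below the caller's stack pointer, which is the situation a later, unrelated frame can trip over; with the fix the range below the watermark is clean again.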
* Re: [PATCH] x86/suspend: fix false positive KASAN warning on suspend/resume
  2016-11-30 23:10 ` [PATCH] x86/suspend: fix false positive KASAN warning on suspend/resume Josh Poimboeuf
@ 2016-12-01 9:05 ` Andrey Ryabinin
  2016-12-01 14:04 ` [PATCH] " Rafael J. Wysocki
  1 sibling, 0 replies; 35+ messages in thread
From: Andrey Ryabinin @ 2016-12-01 9:05 UTC (permalink / raw)
  To: Josh Poimboeuf, Rafael J. Wysocki, Len Brown, Pavel Machek, linux-pm
  Cc: linux-kernel, peterz, mingo, luto, Scott Bauer, x86, Alexander Potapenko, Dmitry Vyukov, kasan-dev

On 12/01/2016 02:10 AM, Josh Poimboeuf wrote:
> Resuming from a suspend operation is showing a KASAN false positive
> warning:
>
> KASAN instrumentation poisons the stack when entering a function and
> unpoisons it when exiting the function. However, in the suspend path,
> some functions never return, so their stack never gets unpoisoned,
> resulting in stale KASAN shadow data which can cause false positive
> warnings like the one above.
>
> Reported-by: Scott Bauer <scott.bauer@intel.com>
> Tested-by: Scott Bauer <scott.bauer@intel.com>
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> ---
>  arch/x86/kernel/acpi/sleep.c | 3 +++
>  include/linux/kasan.h | 7 +++++++
>  2 files changed, 10 insertions(+)
>
> diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c
> index 4858733..62bd046 100644
> --- a/arch/x86/kernel/acpi/sleep.c
> +++ b/arch/x86/kernel/acpi/sleep.c
> @@ -115,6 +115,9 @@ int x86_acpi_suspend_lowlevel(void)
>  	pause_graph_tracing();
>  	do_suspend_lowlevel();
>  	unpause_graph_tracing();
> +
> +	kasan_unpoison_stack_below_sp();
> +

I think this might be too late. We may hit stale poison in the first C function called
after resume (restore_processor_state()). Thus the shadow must be unpoisoned prior to such a call,
i.e. somewhere in do_suspend_lowlevel() after .Lresume_point.
* Re: [PATCH] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-01 9:05 ` Andrey Ryabinin (?) @ 2016-12-01 14:58 ` Josh Poimboeuf 2016-12-01 16:45 ` Josh Poimboeuf -1 siblings, 1 reply; 35+ messages in thread From: Josh Poimboeuf @ 2016-12-01 14:58 UTC (permalink / raw) To: Andrey Ryabinin Cc: Rafael J. Wysocki, Len Brown, Pavel Machek, linux-pm, linux-kernel, peterz, mingo, luto, Scott Bauer, x86, Alexander Potapenko, Dmitry Vyukov, kasan-dev On Thu, Dec 01, 2016 at 12:05:34PM +0300, Andrey Ryabinin wrote: > > > On 12/01/2016 02:10 AM, Josh Poimboeuf wrote: > > Resuming from a suspend operation is showing a KASAN false positive > > warning: > > > > > KASAN instrumentation poisons the stack when entering a function and > > unpoisons it when exiting the function. However, in the suspend path, > > some functions never return, so their stack never gets unpoisoned, > > resulting in stale KASAN shadow data which can cause false positive > > warnings like the one above. > > > > Reported-by: Scott Bauer <scott.bauer@intel.com> > > Tested-by: Scott Bauer <scott.bauer@intel.com> > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> > > --- > > arch/x86/kernel/acpi/sleep.c | 3 +++ > > include/linux/kasan.h | 7 +++++++ > > 2 files changed, 10 insertions(+) > > > > diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c > > index 4858733..62bd046 100644 > > --- a/arch/x86/kernel/acpi/sleep.c > > +++ b/arch/x86/kernel/acpi/sleep.c > > @@ -115,6 +115,9 @@ int x86_acpi_suspend_lowlevel(void) > > pause_graph_tracing(); > > do_suspend_lowlevel(); > > unpause_graph_tracing(); > > + > > + kasan_unpoison_stack_below_sp(); > > + > > I think this might be too late. We may hit stale poison in the first C function called > after resume (restore_processor_state()). Thus the shadow must be unpoisoned prior such call, > i.e. somewhere in do_suspend_lowlevel() after .Lresume_point. Yeah, I think you're right. Will spin a v2. 
-- Josh
* Re: [PATCH] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-01 14:58 ` Josh Poimboeuf @ 2016-12-01 16:45 ` Josh Poimboeuf 2016-12-01 16:51 ` Dmitry Vyukov 0 siblings, 1 reply; 35+ messages in thread From: Josh Poimboeuf @ 2016-12-01 16:45 UTC (permalink / raw) To: Andrey Ryabinin Cc: Rafael J. Wysocki, Len Brown, Pavel Machek, linux-pm, linux-kernel, peterz, mingo, luto, Scott Bauer, x86, Alexander Potapenko, Dmitry Vyukov, kasan-dev On Thu, Dec 01, 2016 at 08:58:21AM -0600, Josh Poimboeuf wrote: > On Thu, Dec 01, 2016 at 12:05:34PM +0300, Andrey Ryabinin wrote: > > > > > > On 12/01/2016 02:10 AM, Josh Poimboeuf wrote: > > > Resuming from a suspend operation is showing a KASAN false positive > > > warning: > > > > > > > > KASAN instrumentation poisons the stack when entering a function and > > > unpoisons it when exiting the function. However, in the suspend path, > > > some functions never return, so their stack never gets unpoisoned, > > > resulting in stale KASAN shadow data which can cause false positive > > > warnings like the one above. > > > > > > Reported-by: Scott Bauer <scott.bauer@intel.com> > > > Tested-by: Scott Bauer <scott.bauer@intel.com> > > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> > > > --- > > > arch/x86/kernel/acpi/sleep.c | 3 +++ > > > include/linux/kasan.h | 7 +++++++ > > > 2 files changed, 10 insertions(+) > > > > > > diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c > > > index 4858733..62bd046 100644 > > > --- a/arch/x86/kernel/acpi/sleep.c > > > +++ b/arch/x86/kernel/acpi/sleep.c > > > @@ -115,6 +115,9 @@ int x86_acpi_suspend_lowlevel(void) > > > pause_graph_tracing(); > > > do_suspend_lowlevel(); > > > unpause_graph_tracing(); > > > + > > > + kasan_unpoison_stack_below_sp(); > > > + > > > > I think this might be too late. We may hit stale poison in the first C function called > > after resume (restore_processor_state()). 
Thus the shadow must be unpoisoned prior to such a call,
> > i.e. somewhere in do_suspend_lowlevel() after .Lresume_point.
>
> Yeah, I think you're right. Will spin a v2.

So I tried calling kasan_unpoison_task_stack_below() from
do_suspend_lowlevel(), but it hung on the resume. Presumably because
restore_processor_state() does some important setup which would be
needed before calling into kasan_unpoison_task_stack_below(). For
example, setting up the gs register. So it's a bit of a catch-22.

It could probably be fixed properly by rewriting do_suspend_lowlevel()
to call restore_processor_state() with the temporary stack before
switching to the original stack and doing the unpoison.

(And there are some other issues with do_suspend_lowlevel() and I'd love
to try taking a scalpel to it. But I have too many knives in the air
already to want to try to attempt that right now...)

Unless somebody else wants to take a stab at it, my original patch is
probably good enough for now, since restore_processor_state() doesn't
seem to be triggering any KASAN warnings.

-- 
Josh
* Re: [PATCH] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-01 16:45 ` Josh Poimboeuf @ 2016-12-01 16:51 ` Dmitry Vyukov 2016-12-01 17:13 ` Josh Poimboeuf 0 siblings, 1 reply; 35+ messages in thread From: Dmitry Vyukov @ 2016-12-01 16:51 UTC (permalink / raw) To: Josh Poimboeuf Cc: Andrey Ryabinin, Rafael J. Wysocki, Len Brown, Pavel Machek, linux-pm, LKML, Peter Zijlstra, Ingo Molnar, Andy Lutomirski, Scott Bauer, x86, Alexander Potapenko, kasan-dev On Thu, Dec 1, 2016 at 5:45 PM, Josh Poimboeuf <jpoimboe@redhat.com> wrote: > On Thu, Dec 01, 2016 at 08:58:21AM -0600, Josh Poimboeuf wrote: >> On Thu, Dec 01, 2016 at 12:05:34PM +0300, Andrey Ryabinin wrote: >> > >> > >> > On 12/01/2016 02:10 AM, Josh Poimboeuf wrote: >> > > Resuming from a suspend operation is showing a KASAN false positive >> > > warning: >> > > >> > >> > > KASAN instrumentation poisons the stack when entering a function and >> > > unpoisons it when exiting the function. However, in the suspend path, >> > > some functions never return, so their stack never gets unpoisoned, >> > > resulting in stale KASAN shadow data which can cause false positive >> > > warnings like the one above. >> > > >> > > Reported-by: Scott Bauer <scott.bauer@intel.com> >> > > Tested-by: Scott Bauer <scott.bauer@intel.com> >> > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> >> > > --- >> > > arch/x86/kernel/acpi/sleep.c | 3 +++ >> > > include/linux/kasan.h | 7 +++++++ >> > > 2 files changed, 10 insertions(+) >> > > >> > > diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c >> > > index 4858733..62bd046 100644 >> > > --- a/arch/x86/kernel/acpi/sleep.c >> > > +++ b/arch/x86/kernel/acpi/sleep.c >> > > @@ -115,6 +115,9 @@ int x86_acpi_suspend_lowlevel(void) >> > > pause_graph_tracing(); >> > > do_suspend_lowlevel(); >> > > unpause_graph_tracing(); >> > > + >> > > + kasan_unpoison_stack_below_sp(); >> > > + >> > >> > I think this might be too late. 
We may hit stale poison in the first C function called >> > after resume (restore_processor_state()). Thus the shadow must be unpoisoned prior such call, >> > i.e. somewhere in do_suspend_lowlevel() after .Lresume_point. >> >> Yeah, I think you're right. Will spin a v2. > > So I tried calling kasan_unpoison_task_stack_below() from > do_suspend_lowlevel(), but it hung on the resume. Presumably because > restore_processor_state() does some important setup which would be > needed before calling into kasan_unpoison_task_stack_below(). For > example, setting up the gs register. So it's a bit of a catch-22. > > It could probably be fixed properly by rewriting do_suspend_lowlevel() > to call restore_processor_state() with the temporary stack before > switching to the original stack and doing the unpoison. > > (And there are some other issues with do_suspend_lowlevel() and I'd love > to try taking a scalpel to it. But I have too many knives in the air > already to want to try to attempt that right now...) > > Unless somebody else wants to take a stab at it, my original patch is > probably good enough for now, since restore_processor_state() doesn't > seem to be triggering any KASAN warnings. restore_processor_state/__restore_processor_state does not seem to have any local variables, so KASAN does not do any stack checks there. We could disable KASAN instrumentation of the file, or of particular functions. Or we could call kasan_unpoison_shadow() on the stack range before switching to it. ^ permalink raw reply [flat|nested] 35+ messages in thread
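For reference, the "disable instrumentation of particular functions" option can be sketched in plain C. This is a hedged user-space illustration, assuming GCC or Clang: the NO_ASAN macro and both function names are invented here, and the kernel uses its own wrapper macro and per-file Makefile switches rather than this exact spelling. The point it shows is that the attribute applies only to the annotated function; anything it calls stays instrumented unless annotated too.

```c
/*
 * Assumption: GCC or Clang, which accept no_sanitize_address. Other
 * compilers get an empty macro so the file still builds.
 */
#if defined(__GNUC__) || defined(__clang__)
#define NO_ASAN __attribute__((no_sanitize_address))
#else
#define NO_ASAN
#endif

/* Ordinary function: instrumented when built with -fsanitize=address. */
int sum_values(const int *values, int count)
{
	int sum = 0;

	for (int i = 0; i < count; i++)
		sum += values[i];
	return sum;
}

/*
 * Annotated function: the compiler emits no redzones or shadow checks
 * for its own frame, but the call into sum_values() is still checked
 * there, because the attribute does not propagate to callees.
 */
NO_ASAN int sum_without_own_checks(void)
{
	int buf[4] = { 1, 2, 3, 4 };

	return sum_values(buf, 4);
}
```

Built without any sanitizer the attribute is simply a no-op, so the example behaves the same either way; the difference is only in which functions get instrumented.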
* Re: [PATCH] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-01 16:51 ` Dmitry Vyukov @ 2016-12-01 17:13 ` Josh Poimboeuf 2016-12-01 17:27 ` Dmitry Vyukov 0 siblings, 1 reply; 35+ messages in thread From: Josh Poimboeuf @ 2016-12-01 17:13 UTC (permalink / raw) To: Dmitry Vyukov Cc: Andrey Ryabinin, Rafael J. Wysocki, Len Brown, Pavel Machek, linux-pm, LKML, Peter Zijlstra, Ingo Molnar, Andy Lutomirski, Scott Bauer, x86, Alexander Potapenko, kasan-dev On Thu, Dec 01, 2016 at 05:51:52PM +0100, Dmitry Vyukov wrote: > On Thu, Dec 1, 2016 at 5:45 PM, Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > On Thu, Dec 01, 2016 at 08:58:21AM -0600, Josh Poimboeuf wrote: > >> On Thu, Dec 01, 2016 at 12:05:34PM +0300, Andrey Ryabinin wrote: > >> > > >> > > >> > On 12/01/2016 02:10 AM, Josh Poimboeuf wrote: > >> > > Resuming from a suspend operation is showing a KASAN false positive > >> > > warning: > >> > > > >> > > >> > > KASAN instrumentation poisons the stack when entering a function and > >> > > unpoisons it when exiting the function. However, in the suspend path, > >> > > some functions never return, so their stack never gets unpoisoned, > >> > > resulting in stale KASAN shadow data which can cause false positive > >> > > warnings like the one above. 
> >> > > > >> > > Reported-by: Scott Bauer <scott.bauer@intel.com> > >> > > Tested-by: Scott Bauer <scott.bauer@intel.com> > >> > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> > >> > > --- > >> > > arch/x86/kernel/acpi/sleep.c | 3 +++ > >> > > include/linux/kasan.h | 7 +++++++ > >> > > 2 files changed, 10 insertions(+) > >> > > > >> > > diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c > >> > > index 4858733..62bd046 100644 > >> > > --- a/arch/x86/kernel/acpi/sleep.c > >> > > +++ b/arch/x86/kernel/acpi/sleep.c > >> > > @@ -115,6 +115,9 @@ int x86_acpi_suspend_lowlevel(void) > >> > > pause_graph_tracing(); > >> > > do_suspend_lowlevel(); > >> > > unpause_graph_tracing(); > >> > > + > >> > > + kasan_unpoison_stack_below_sp(); > >> > > + > >> > > >> > I think this might be too late. We may hit stale poison in the first C function called > >> > after resume (restore_processor_state()). Thus the shadow must be unpoisoned prior such call, > >> > i.e. somewhere in do_suspend_lowlevel() after .Lresume_point. > >> > >> Yeah, I think you're right. Will spin a v2. > > > > So I tried calling kasan_unpoison_task_stack_below() from > > do_suspend_lowlevel(), but it hung on the resume. Presumably because > > restore_processor_state() does some important setup which would be > > needed before calling into kasan_unpoison_task_stack_below(). For > > example, setting up the gs register. So it's a bit of a catch-22. > > > > It could probably be fixed properly by rewriting do_suspend_lowlevel() > > to call restore_processor_state() with the temporary stack before > > switching to the original stack and doing the unpoison. > > > > (And there are some other issues with do_suspend_lowlevel() and I'd love > > to try taking a scalpel to it. But I have too many knives in the air > > already to want to try to attempt that right now...) 
> >
> > Unless somebody else wants to take a stab at it, my original patch is
> > probably good enough for now, since restore_processor_state() doesn't
> > seem to be triggering any KASAN warnings.
>
> restore_processor_state/__restore_processor_state does not seem to
> have any local variables, so KASAN does not do any stack checks there.

Actually, looking at the object code, it uses a lot of stack space and
has several calls to __asan_report_load*() functions. Probably due to
inlining of other functions which have stack variables.

> We could disable KASAN instrumentation of the file, or of particular
> functions.

I don't think that would be sufficient unless it were disabled for
__restore_processor_state() and all the functions it calls (and the
functions they call, etc), which wouldn't necessarily be
straightforward.

> Or we could call kasan_unpoison_shadow() on the stack range
> before switching to it.

I tried that already, but it hung because restore_processor_state()
hadn't been called yet (the catch-22 I mentioned above).

-- 
Josh
* Re: [PATCH] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-01 17:13 ` Josh Poimboeuf @ 2016-12-01 17:27 ` Dmitry Vyukov 2016-12-01 17:34 ` Josh Poimboeuf 0 siblings, 1 reply; 35+ messages in thread From: Dmitry Vyukov @ 2016-12-01 17:27 UTC (permalink / raw) To: Josh Poimboeuf Cc: Andrey Ryabinin, Rafael J. Wysocki, Len Brown, Pavel Machek, linux-pm, LKML, Peter Zijlstra, Ingo Molnar, Andy Lutomirski, Scott Bauer, x86, Alexander Potapenko, kasan-dev On Thu, Dec 1, 2016 at 6:13 PM, Josh Poimboeuf <jpoimboe@redhat.com> wrote: > On Thu, Dec 01, 2016 at 05:51:52PM +0100, Dmitry Vyukov wrote: >> On Thu, Dec 1, 2016 at 5:45 PM, Josh Poimboeuf <jpoimboe@redhat.com> wrote: >> > On Thu, Dec 01, 2016 at 08:58:21AM -0600, Josh Poimboeuf wrote: >> >> On Thu, Dec 01, 2016 at 12:05:34PM +0300, Andrey Ryabinin wrote: >> >> > >> >> > >> >> > On 12/01/2016 02:10 AM, Josh Poimboeuf wrote: >> >> > > Resuming from a suspend operation is showing a KASAN false positive >> >> > > warning: >> >> > > >> >> > >> >> > > KASAN instrumentation poisons the stack when entering a function and >> >> > > unpoisons it when exiting the function. However, in the suspend path, >> >> > > some functions never return, so their stack never gets unpoisoned, >> >> > > resulting in stale KASAN shadow data which can cause false positive >> >> > > warnings like the one above. 
>> >> > > >> >> > > Reported-by: Scott Bauer <scott.bauer@intel.com> >> >> > > Tested-by: Scott Bauer <scott.bauer@intel.com> >> >> > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> >> >> > > --- >> >> > > arch/x86/kernel/acpi/sleep.c | 3 +++ >> >> > > include/linux/kasan.h | 7 +++++++ >> >> > > 2 files changed, 10 insertions(+) >> >> > > >> >> > > diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c >> >> > > index 4858733..62bd046 100644 >> >> > > --- a/arch/x86/kernel/acpi/sleep.c >> >> > > +++ b/arch/x86/kernel/acpi/sleep.c >> >> > > @@ -115,6 +115,9 @@ int x86_acpi_suspend_lowlevel(void) >> >> > > pause_graph_tracing(); >> >> > > do_suspend_lowlevel(); >> >> > > unpause_graph_tracing(); >> >> > > + >> >> > > + kasan_unpoison_stack_below_sp(); >> >> > > + >> >> > >> >> > I think this might be too late. We may hit stale poison in the first C function called >> >> > after resume (restore_processor_state()). Thus the shadow must be unpoisoned prior such call, >> >> > i.e. somewhere in do_suspend_lowlevel() after .Lresume_point. >> >> >> >> Yeah, I think you're right. Will spin a v2. >> > >> > So I tried calling kasan_unpoison_task_stack_below() from >> > do_suspend_lowlevel(), but it hung on the resume. Presumably because >> > restore_processor_state() does some important setup which would be >> > needed before calling into kasan_unpoison_task_stack_below(). For >> > example, setting up the gs register. So it's a bit of a catch-22. >> > >> > It could probably be fixed properly by rewriting do_suspend_lowlevel() >> > to call restore_processor_state() with the temporary stack before >> > switching to the original stack and doing the unpoison. >> > >> > (And there are some other issues with do_suspend_lowlevel() and I'd love >> > to try taking a scalpel to it. But I have too many knives in the air >> > already to want to try to attempt that right now...) 
>> >
>> > Unless somebody else wants to take a stab at it, my original patch is
>> > probably good enough for now, since restore_processor_state() doesn't
>> > seem to be triggering any KASAN warnings.
>>
>> restore_processor_state/__restore_processor_state does not seem to
>> have any local variables, so KASAN does not do any stack checks there.
>
> Actually, looking at the object code, it uses a lot of stack space and
> has several calls to __asan_report_load*() functions. Probably due to
> inlining of other functions which have stack variables.

Those can be loads of heap variables (or other non-stack data). KASAN
will emit these checks for lots of loads, but they don't necessarily go
to the stack.

>> We could disable KASAN instrumentation of the file, or of particular
>> functions.
>
> I don't think that would be sufficient unless it were disabled for
> __restore_processor_state() and all the functions it calls (and the
> functions they call, etc), which wouldn't necessarily be
> straightforward.
>
>> Or we could call kasan_unpoison_shadow() on the stack range
>> before switching to it.
>
> I tried that already, but it hung because restore_processor_state()
> hadn't been called yet (the catch-22 I mentioned aboved).

Ah, I see, we just can't execute normal C code at that point...

^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-01 17:27 ` Dmitry Vyukov @ 2016-12-01 17:34 ` Josh Poimboeuf 2016-12-01 17:47 ` Dmitry Vyukov 0 siblings, 1 reply; 35+ messages in thread From: Josh Poimboeuf @ 2016-12-01 17:34 UTC (permalink / raw) To: Dmitry Vyukov Cc: Andrey Ryabinin, Rafael J. Wysocki, Len Brown, Pavel Machek, linux-pm, LKML, Peter Zijlstra, Ingo Molnar, Andy Lutomirski, Scott Bauer, x86, Alexander Potapenko, kasan-dev On Thu, Dec 01, 2016 at 06:27:31PM +0100, Dmitry Vyukov wrote: > On Thu, Dec 1, 2016 at 6:13 PM, Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > On Thu, Dec 01, 2016 at 05:51:52PM +0100, Dmitry Vyukov wrote: > >> On Thu, Dec 1, 2016 at 5:45 PM, Josh Poimboeuf <jpoimboe@redhat.com> wrote: > >> > On Thu, Dec 01, 2016 at 08:58:21AM -0600, Josh Poimboeuf wrote: > >> >> On Thu, Dec 01, 2016 at 12:05:34PM +0300, Andrey Ryabinin wrote: > >> >> > > >> >> > > >> >> > On 12/01/2016 02:10 AM, Josh Poimboeuf wrote: > >> >> > > Resuming from a suspend operation is showing a KASAN false positive > >> >> > > warning: > >> >> > > > >> >> > > >> >> > > KASAN instrumentation poisons the stack when entering a function and > >> >> > > unpoisons it when exiting the function. However, in the suspend path, > >> >> > > some functions never return, so their stack never gets unpoisoned, > >> >> > > resulting in stale KASAN shadow data which can cause false positive > >> >> > > warnings like the one above. 
> >> >> > > > >> >> > > Reported-by: Scott Bauer <scott.bauer@intel.com> > >> >> > > Tested-by: Scott Bauer <scott.bauer@intel.com> > >> >> > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> > >> >> > > --- > >> >> > > arch/x86/kernel/acpi/sleep.c | 3 +++ > >> >> > > include/linux/kasan.h | 7 +++++++ > >> >> > > 2 files changed, 10 insertions(+) > >> >> > > > >> >> > > diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c > >> >> > > index 4858733..62bd046 100644 > >> >> > > --- a/arch/x86/kernel/acpi/sleep.c > >> >> > > +++ b/arch/x86/kernel/acpi/sleep.c > >> >> > > @@ -115,6 +115,9 @@ int x86_acpi_suspend_lowlevel(void) > >> >> > > pause_graph_tracing(); > >> >> > > do_suspend_lowlevel(); > >> >> > > unpause_graph_tracing(); > >> >> > > + > >> >> > > + kasan_unpoison_stack_below_sp(); > >> >> > > + > >> >> > > >> >> > I think this might be too late. We may hit stale poison in the first C function called > >> >> > after resume (restore_processor_state()). Thus the shadow must be unpoisoned prior such call, > >> >> > i.e. somewhere in do_suspend_lowlevel() after .Lresume_point. > >> >> > >> >> Yeah, I think you're right. Will spin a v2. > >> > > >> > So I tried calling kasan_unpoison_task_stack_below() from > >> > do_suspend_lowlevel(), but it hung on the resume. Presumably because > >> > restore_processor_state() does some important setup which would be > >> > needed before calling into kasan_unpoison_task_stack_below(). For > >> > example, setting up the gs register. So it's a bit of a catch-22. > >> > > >> > It could probably be fixed properly by rewriting do_suspend_lowlevel() > >> > to call restore_processor_state() with the temporary stack before > >> > switching to the original stack and doing the unpoison. > >> > > >> > (And there are some other issues with do_suspend_lowlevel() and I'd love > >> > to try taking a scalpel to it. But I have too many knives in the air > >> > already to want to try to attempt that right now...) 
> >> > > >> > Unless somebody else wants to take a stab at it, my original patch is > >> > probably good enough for now, since restore_processor_state() doesn't > >> > seem to be triggering any KASAN warnings. > >> > >> restore_processor_state/__restore_processor_state does not seem to > >> have any local variables, so KASAN does not do any stack checks there. > > > > Actually, looking at the object code, it uses a lot of stack space and > > has several calls to __asan_report_load*() functions. Probably due to > > inlining of other functions which have stack variables. > > That can be loads of heap variables (or other non-stack data). KASAN > will emit these checks for lots of loads, but they don't necessary go > to stack. I also see the stack poisoning instructions: 54f: 49 c1 ee 03 shr $0x3,%r14 553: 4c 01 f0 add %r14,%rax 556: c7 00 f1 f1 f1 f1 movl $0xf1f1f1f1,(%rax) 55c: c7 40 04 00 00 f4 f4 movl $0xf4f40000,0x4(%rax) 563: c7 40 08 f3 f3 f3 f3 movl $0xf3f3f3f3,0x8(%rax) > >> We could disable KASAN instrumentation of the file, or of particular > >> functions. > > > > I don't think that would be sufficient unless it were disabled for > > __restore_processor_state() and all the functions it calls (and the > > functions they call, etc), which wouldn't necessarily be > > straightforward. > > > >> Or we could call kasan_unpoison_shadow() on the stack range > >> before switching to it. > > > > I tried that already, but it hung because restore_processor_state() > > hadn't been called yet (the catch-22 I mentioned aboved). > > Ah, I see, we just can't execute normal C code at that point... Right. -- Josh ^ permalink raw reply [flat|nested] 35+ messages in thread
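Editorial note on the disassembly quoted above: the following is a minimal C sketch, not kernel code, of how those three movl immediates translate into KASAN shadow bytes. The marker values (0xF1 left redzone, 0xF2 mid, 0xF3 right, 0xF4 partial) are assumed to match the stack redzone constants in mm/kasan/kasan.h of that kernel generation; the enum and function names are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Hypothetical names; values assumed from mm/kasan/kasan.h (4.9 era). */
enum shadow_kind {
    SHADOW_ACCESSIBLE,     /* 0x00: all 8 bytes of the granule are valid */
    SHADOW_PARTIAL_PREFIX, /* 0x01..0x07: only the first N bytes are valid */
    SHADOW_STACK_LEFT,     /* 0xF1: redzone to the left of stack variables */
    SHADOW_STACK_MID,      /* 0xF2: redzone between stack variables */
    SHADOW_STACK_RIGHT,    /* 0xF3: redzone to the right of stack variables */
    SHADOW_STACK_PARTIAL,  /* 0xF4: padding after a partially used slot */
    SHADOW_OTHER           /* anything this sketch does not model */
};

/* Classify one shadow byte as it would appear in a KASAN memory dump. */
static enum shadow_kind classify_shadow(uint8_t b)
{
    if (b == 0x00)
        return SHADOW_ACCESSIBLE;
    if (b <= 0x07)
        return SHADOW_PARTIAL_PREFIX;
    switch (b) {
    case 0xF1: return SHADOW_STACK_LEFT;
    case 0xF2: return SHADOW_STACK_MID;
    case 0xF3: return SHADOW_STACK_RIGHT;
    case 0xF4: return SHADOW_STACK_PARTIAL;
    default:   return SHADOW_OTHER;
    }
}

/*
 * x86 is little-endian, so a 32-bit movl immediate lands in memory with
 * its least significant byte first. This unpacks it in that order.
 */
static void unpack_le32(uint32_t imm, uint8_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)(imm >> (8 * i));
}
```

Applied to the three stores, the twelve shadow bytes written are f1 f1 f1 f1 00 00 f4 f4 f3 f3 f3 f3: a left redzone, two fully accessible granules, partial-slot padding, then a right redzone. That is the same pattern visible in the "Memory state around the buggy address" dump, which is why it suggests real stack poisoning rather than only load checks.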
* Re: [PATCH] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-01 17:34 ` Josh Poimboeuf @ 2016-12-01 17:47 ` Dmitry Vyukov 2016-12-01 17:56 ` Josh Poimboeuf 0 siblings, 1 reply; 35+ messages in thread From: Dmitry Vyukov @ 2016-12-01 17:47 UTC (permalink / raw) To: Josh Poimboeuf Cc: Andrey Ryabinin, Rafael J. Wysocki, Len Brown, Pavel Machek, linux-pm, LKML, Peter Zijlstra, Ingo Molnar, Andy Lutomirski, Scott Bauer, x86, Alexander Potapenko, kasan-dev On Thu, Dec 1, 2016 at 6:34 PM, Josh Poimboeuf <jpoimboe@redhat.com> wrote: >> >> >> > >> >> >> > On 12/01/2016 02:10 AM, Josh Poimboeuf wrote: >> >> >> > > Resuming from a suspend operation is showing a KASAN false positive >> >> >> > > warning: >> >> >> > > >> >> >> > >> >> >> > > KASAN instrumentation poisons the stack when entering a function and >> >> >> > > unpoisons it when exiting the function. However, in the suspend path, >> >> >> > > some functions never return, so their stack never gets unpoisoned, >> >> >> > > resulting in stale KASAN shadow data which can cause false positive >> >> >> > > warnings like the one above. 
>> >> >> > > >> >> >> > > Reported-by: Scott Bauer <scott.bauer@intel.com> >> >> >> > > Tested-by: Scott Bauer <scott.bauer@intel.com> >> >> >> > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> >> >> >> > > --- >> >> >> > > arch/x86/kernel/acpi/sleep.c | 3 +++ >> >> >> > > include/linux/kasan.h | 7 +++++++ >> >> >> > > 2 files changed, 10 insertions(+) >> >> >> > > >> >> >> > > diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c >> >> >> > > index 4858733..62bd046 100644 >> >> >> > > --- a/arch/x86/kernel/acpi/sleep.c >> >> >> > > +++ b/arch/x86/kernel/acpi/sleep.c >> >> >> > > @@ -115,6 +115,9 @@ int x86_acpi_suspend_lowlevel(void) >> >> >> > > pause_graph_tracing(); >> >> >> > > do_suspend_lowlevel(); >> >> >> > > unpause_graph_tracing(); >> >> >> > > + >> >> >> > > + kasan_unpoison_stack_below_sp(); >> >> >> > > + >> >> >> > >> >> >> > I think this might be too late. We may hit stale poison in the first C function called >> >> >> > after resume (restore_processor_state()). Thus the shadow must be unpoisoned prior such call, >> >> >> > i.e. somewhere in do_suspend_lowlevel() after .Lresume_point. >> >> >> >> >> >> Yeah, I think you're right. Will spin a v2. >> >> > >> >> > So I tried calling kasan_unpoison_task_stack_below() from >> >> > do_suspend_lowlevel(), but it hung on the resume. Presumably because >> >> > restore_processor_state() does some important setup which would be >> >> > needed before calling into kasan_unpoison_task_stack_below(). For >> >> > example, setting up the gs register. So it's a bit of a catch-22. >> >> > >> >> > It could probably be fixed properly by rewriting do_suspend_lowlevel() >> >> > to call restore_processor_state() with the temporary stack before >> >> > switching to the original stack and doing the unpoison. >> >> > >> >> > (And there are some other issues with do_suspend_lowlevel() and I'd love >> >> > to try taking a scalpel to it. 
But I have too many knives in the air
>> >> > already to want to try to attempt that right now...)
>> >> >
>> >> > Unless somebody else wants to take a stab at it, my original patch is
>> >> > probably good enough for now, since restore_processor_state() doesn't
>> >> > seem to be triggering any KASAN warnings.
>> >>
>> >> restore_processor_state/__restore_processor_state does not seem to
>> >> have any local variables, so KASAN does not do any stack checks there.
>> >
>> > Actually, looking at the object code, it uses a lot of stack space and
>> > has several calls to __asan_report_load*() functions. Probably due to
>> > inlining of other functions which have stack variables.
>>
>> That can be loads of heap variables (or other non-stack data). KASAN
>> will emit these checks for lots of loads, but they don't necessary go
>> to stack.
>
> I also see the stack poisoning instructions:
>
> 54f: 49 c1 ee 03 shr $0x3,%r14
> 553: 4c 01 f0 add %r14,%rax
> 556: c7 00 f1 f1 f1 f1 movl $0xf1f1f1f1,(%rax)
> 55c: c7 40 04 00 00 f4 f4 movl $0xf4f40000,0x4(%rax)
> 563: c7 40 08 f3 f3 f3 f3 movl $0xf3f3f3f3,0x8(%rax)

OK, then we are potentially in trouble.
It may work as long as the stack region that is used for local vars
in restore_processor_state() does not contain any stale poisoning. But
it can break at any moment.

Have you tried kasan_unpoison_task_stack_below() or kasan_unpoison_shadow()?
I can see how kasan_unpoison_task_stack_below() can hang (it at least
uses current). But kasan_unpoison_shadow() is quite trivial: it
computes the shadow address with simple math and writes zeroes there.

>> >> We could disable KASAN instrumentation of the file, or of particular
>> >> functions.
>> >
>> > I don't think that would be sufficient unless it were disabled for
>> > __restore_processor_state() and all the functions it calls (and the
>> > functions they call, etc), which wouldn't necessarily be
>> > straightforward.
>> > >> >> Or we could call kasan_unpoison_shadow() on the stack range >> >> before switching to it. >> > >> > I tried that already, but it hung because restore_processor_state() >> > hadn't been called yet (the catch-22 I mentioned aboved). >> >> Ah, I see, we just can't execute normal C code at that point... > > Right. > > -- > Josh ^ permalink raw reply [flat|nested] 35+ messages in thread
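Editorial note on the "simple math" mentioned in the message above: generic KASAN maps each 8-byte granule of memory to one shadow byte, at (addr >> 3) plus a fixed shadow offset, and unpoisoning writes zeroes into that shadow range. The sketch below is a user-space model of that idea only. It replaces the fixed offset with an index into a small array, assumes granule-aligned ranges, and every name in it (model_shadow, model_unpoison, MODEL_MEM_SIZE) is hypothetical rather than taken from the kernel.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHADOW_SCALE_SHIFT 3                       /* 8 bytes per shadow byte */
#define GRANULE            (1u << SHADOW_SCALE_SHIFT)
#define MODEL_MEM_SIZE     1024u                   /* hypothetical region size */

/* One shadow byte per granule of the modelled region; 0 means accessible. */
static uint8_t model_shadow[MODEL_MEM_SIZE >> SHADOW_SCALE_SHIFT];

/* Byte offset in the modelled region -> index of its shadow byte. */
static size_t shadow_index(size_t offset)
{
    return offset >> SHADOW_SCALE_SHIFT;
}

/*
 * Model of unpoisoning [offset, offset + size): zero the covering shadow
 * bytes. Simplification: only granule-aligned ranges are accepted, so the
 * partial-granule handling a real implementation needs is left out.
 * Returns 0 on success, -1 if the range is unaligned or out of bounds.
 */
static int model_unpoison(size_t offset, size_t size)
{
    if (offset % GRANULE || size % GRANULE)
        return -1;
    if (offset > MODEL_MEM_SIZE || size > MODEL_MEM_SIZE - offset)
        return -1;
    memset(&model_shadow[shadow_index(offset)], 0, size >> SHADOW_SCALE_SHIFT);
    return 0;
}
```

The point of the model is that nothing in it depends on per-CPU or per-task state: it is a shift, an add, and a memset. That is consistent with the suggestion that this routine could be usable before restore_processor_state() has run, whereas a helper that dereferences current might not be.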
* Re: [PATCH] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-01 17:47 ` Dmitry Vyukov @ 2016-12-01 17:56 ` Josh Poimboeuf 2016-12-01 20:31 ` [PATCH v2] " Josh Poimboeuf 0 siblings, 1 reply; 35+ messages in thread From: Josh Poimboeuf @ 2016-12-01 17:56 UTC (permalink / raw) To: Dmitry Vyukov Cc: Andrey Ryabinin, Rafael J. Wysocki, Len Brown, Pavel Machek, linux-pm, LKML, Peter Zijlstra, Ingo Molnar, Andy Lutomirski, Scott Bauer, x86, Alexander Potapenko, kasan-dev On Thu, Dec 01, 2016 at 06:47:07PM +0100, Dmitry Vyukov wrote: > On Thu, Dec 1, 2016 at 6:34 PM, Josh Poimboeuf <jpoimboe@redhat.com> wrote: > >> >> >> > > >> >> >> > On 12/01/2016 02:10 AM, Josh Poimboeuf wrote: > >> >> >> > > Resuming from a suspend operation is showing a KASAN false positive > >> >> >> > > warning: > >> >> >> > > > >> >> >> > > >> >> >> > > KASAN instrumentation poisons the stack when entering a function and > >> >> >> > > unpoisons it when exiting the function. However, in the suspend path, > >> >> >> > > some functions never return, so their stack never gets unpoisoned, > >> >> >> > > resulting in stale KASAN shadow data which can cause false positive > >> >> >> > > warnings like the one above. 
> >> >> >> > > > >> >> >> > > Reported-by: Scott Bauer <scott.bauer@intel.com> > >> >> >> > > Tested-by: Scott Bauer <scott.bauer@intel.com> > >> >> >> > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> > >> >> >> > > --- > >> >> >> > > arch/x86/kernel/acpi/sleep.c | 3 +++ > >> >> >> > > include/linux/kasan.h | 7 +++++++ > >> >> >> > > 2 files changed, 10 insertions(+) > >> >> >> > > > >> >> >> > > diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c > >> >> >> > > index 4858733..62bd046 100644 > >> >> >> > > --- a/arch/x86/kernel/acpi/sleep.c > >> >> >> > > +++ b/arch/x86/kernel/acpi/sleep.c > >> >> >> > > @@ -115,6 +115,9 @@ int x86_acpi_suspend_lowlevel(void) > >> >> >> > > pause_graph_tracing(); > >> >> >> > > do_suspend_lowlevel(); > >> >> >> > > unpause_graph_tracing(); > >> >> >> > > + > >> >> >> > > + kasan_unpoison_stack_below_sp(); > >> >> >> > > + > >> >> >> > > >> >> >> > I think this might be too late. We may hit stale poison in the first C function called > >> >> >> > after resume (restore_processor_state()). Thus the shadow must be unpoisoned prior such call, > >> >> >> > i.e. somewhere in do_suspend_lowlevel() after .Lresume_point. > >> >> >> > >> >> >> Yeah, I think you're right. Will spin a v2. > >> >> > > >> >> > So I tried calling kasan_unpoison_task_stack_below() from > >> >> > do_suspend_lowlevel(), but it hung on the resume. Presumably because > >> >> > restore_processor_state() does some important setup which would be > >> >> > needed before calling into kasan_unpoison_task_stack_below(). For > >> >> > example, setting up the gs register. So it's a bit of a catch-22. > >> >> > > >> >> > It could probably be fixed properly by rewriting do_suspend_lowlevel() > >> >> > to call restore_processor_state() with the temporary stack before > >> >> > switching to the original stack and doing the unpoison. 
> >> >> > > >> >> > (And there are some other issues with do_suspend_lowlevel() and I'd love > >> >> > to try taking a scalpel to it. But I have too many knives in the air > >> >> > already to want to try to attempt that right now...) > >> >> > > >> >> > Unless somebody else wants to take a stab at it, my original patch is > >> >> > probably good enough for now, since restore_processor_state() doesn't > >> >> > seem to be triggering any KASAN warnings. > >> >> > >> >> restore_processor_state/__restore_processor_state does not seem to > >> >> have any local variables, so KASAN does not do any stack checks there. > >> > > >> > Actually, looking at the object code, it uses a lot of stack space and > >> > has several calls to __asan_report_load*() functions. Probably due to > >> > inlining of other functions which have stack variables. > >> > >> That can be loads of heap variables (or other non-stack data). KASAN > >> will emit these checks for lots of loads, but they don't necessary go > >> to stack. > > > > I also see the stack poisoning instructions: > > > > 54f: 49 c1 ee 03 shr $0x3,%r14 > > 553: 4c 01 f0 add %r14,%rax > > 556: c7 00 f1 f1 f1 f1 movl $0xf1f1f1f1,(%rax) > > 55c: c7 40 04 00 00 f4 f4 movl $0xf4f40000,0x4(%rax) > > 563: c7 40 08 f3 f3 f3 f3 movl $0xf3f3f3f3,0x8(%rax) > > OK, then we are in trouble potentially. > It may work as long as as the stack region that is used for local vars > in restore_processor_state() does not contain any stale poisoning. But > it can break at any moment. > > Have you tried kasan_unpoison_task_stack_below() or kasan_unpoison_shadow()? > I can see how kasan_unpoison_task_stack_below() can hang (it at least > uses current). But kasan_unpoison_shadow() is quite trivial, it > computes shadow address with simple math and writes zeroes there. Good idea, I'll give kasan_unpoison_shadow() a shot. -- Josh ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH v2] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-01 17:56 ` Josh Poimboeuf @ 2016-12-01 20:31 ` Josh Poimboeuf 2016-12-02 9:44 ` Dmitry Vyukov ` (2 more replies) 0 siblings, 3 replies; 35+ messages in thread From: Josh Poimboeuf @ 2016-12-01 20:31 UTC (permalink / raw) To: Dmitry Vyukov Cc: Andrey Ryabinin, Rafael J. Wysocki, Len Brown, Pavel Machek, linux-pm, LKML, Peter Zijlstra, Ingo Molnar, Andy Lutomirski, Scott Bauer, x86, Alexander Potapenko, kasan-dev Resuming from a suspend operation is showing a KASAN false positive warning: BUG: KASAN: stack-out-of-bounds in unwind_get_return_address+0x11d/0x130 at addr ffff8803867d7878 Read of size 8 by task pm-suspend/7774 page:ffffea000e19f5c0 count:0 mapcount:0 mapping: (null) index:0x0 flags: 0x2ffff0000000000() page dumped because: kasan: bad access detected CPU: 0 PID: 7774 Comm: pm-suspend Tainted: G B 4.9.0-rc7+ #8 Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016 Call Trace: dump_stack+0x63/0x82 kasan_report_error+0x4b4/0x4e0 ? acpi_hw_read_port+0xd0/0x1ea ? kfree_const+0x22/0x30 ? acpi_hw_validate_io_request+0x1a6/0x1a6 __asan_report_load8_noabort+0x61/0x70 ? unwind_get_return_address+0x11d/0x130 unwind_get_return_address+0x11d/0x130 ? unwind_next_frame+0x97/0xf0 __save_stack_trace+0x92/0x100 save_stack_trace+0x1b/0x20 save_stack+0x46/0xd0 ? save_stack_trace+0x1b/0x20 ? save_stack+0x46/0xd0 ? kasan_kmalloc+0xad/0xe0 ? kasan_slab_alloc+0x12/0x20 ? acpi_hw_read+0x2b6/0x3aa ? acpi_hw_validate_register+0x20b/0x20b ? acpi_hw_write_port+0x72/0xc7 ? acpi_hw_write+0x11f/0x15f ? acpi_hw_read_multiple+0x19f/0x19f ? memcpy+0x45/0x50 ? acpi_hw_write_port+0x72/0xc7 ? acpi_hw_write+0x11f/0x15f ? acpi_hw_read_multiple+0x19f/0x19f ? kasan_unpoison_shadow+0x36/0x50 kasan_kmalloc+0xad/0xe0 kasan_slab_alloc+0x12/0x20 kmem_cache_alloc_trace+0xbc/0x1e0 ? 
acpi_get_sleep_type_data+0x9a/0x578 acpi_get_sleep_type_data+0x9a/0x578 acpi_hw_legacy_wake_prep+0x88/0x22c ? acpi_hw_legacy_sleep+0x3c7/0x3c7 ? acpi_write_bit_register+0x28d/0x2d3 ? acpi_read_bit_register+0x19b/0x19b acpi_hw_sleep_dispatch+0xb5/0xba acpi_leave_sleep_state_prep+0x17/0x19 acpi_suspend_enter+0x154/0x1e0 ? trace_suspend_resume+0xe8/0xe8 suspend_devices_and_enter+0xb09/0xdb0 ? printk+0xa8/0xd8 ? arch_suspend_enable_irqs+0x20/0x20 ? try_to_freeze_tasks+0x295/0x600 pm_suspend+0x6c9/0x780 ? finish_wait+0x1f0/0x1f0 ? suspend_devices_and_enter+0xdb0/0xdb0 state_store+0xa2/0x120 ? kobj_attr_show+0x60/0x60 kobj_attr_store+0x36/0x70 sysfs_kf_write+0x131/0x200 kernfs_fop_write+0x295/0x3f0 __vfs_write+0xef/0x760 ? handle_mm_fault+0x1346/0x35e0 ? do_iter_readv_writev+0x660/0x660 ? __pmd_alloc+0x310/0x310 ? do_lock_file_wait+0x1e0/0x1e0 ? apparmor_file_permission+0x18/0x20 ? security_file_permission+0x73/0x1c0 ? rw_verify_area+0xbd/0x2b0 vfs_write+0x149/0x4a0 SyS_write+0xd9/0x1c0 ? SyS_read+0x1c0/0x1c0 entry_SYSCALL_64_fastpath+0x1e/0xad Memory state around the buggy address: ffff8803867d7700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff8803867d7780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff8803867d7800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f4 ^ ffff8803867d7880: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 ffff8803867d7900: 00 00 00 f1 f1 f1 f1 04 f4 f4 f4 f3 f3 f3 f3 00 KASAN instrumentation poisons the stack when entering a function and unpoisons it when exiting the function. However, in the suspend path, some functions never return, so their stack never gets unpoisoned, resulting in stale KASAN shadow data which can cause later false positive warnings like the one above. 
Reported-by: Scott Bauer <scott.bauer@intel.com> Suggested-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> --- arch/x86/kernel/acpi/wakeup_64.S | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/arch/x86/kernel/acpi/wakeup_64.S b/arch/x86/kernel/acpi/wakeup_64.S index 169963f..1df9b75 100644 --- a/arch/x86/kernel/acpi/wakeup_64.S +++ b/arch/x86/kernel/acpi/wakeup_64.S @@ -109,6 +109,22 @@ ENTRY(do_suspend_lowlevel) movq pt_regs_r14(%rax), %r14 movq pt_regs_r15(%rax), %r15 +#ifdef CONFIG_KASAN + /* + * The suspend path may have poisoned some areas deeper in the stack, + * which we now need to unpoison. + * + * We can't call kasan_unpoison_task_stack_below() because it uses %gs + * for 'current', which hasn't been set up yet. Instead, calculate the + * stack range manually and call kasan_unpoison_shadow(). + */ + movq %rsp, %rdi + andq $CURRENT_MASK, %rdi + movq %rsp, %rsi + xorq %rdi, %rsi + call kasan_unpoison_shadow +#endif + xorl %eax, %eax addq $8, %rsp FRAME_END -- 2.7.4 ^ permalink raw reply related [flat|nested] 35+ messages in thread
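Editorial note on the assembly in the patch above: the four instructions before the call build the two arguments of kasan_unpoison_shadow(address, size) in %rdi and %rsi. The C sketch below restates that arithmetic; it is an illustration under assumptions, not the kernel's code. THREAD_SIZE is assumed to be 32 KiB here (the real value depends on the kernel configuration), CURRENT_MASK is taken to be ~(THREAD_SIZE - 1), and the function names are hypothetical.

```c
#include <stdint.h>

/* Assumed for illustration: stacks are THREAD_SIZE-sized and -aligned. */
#define THREAD_SIZE  (32u * 1024u)
#define CURRENT_MASK (~((uint64_t)THREAD_SIZE - 1))

/*
 * First argument (%rdi): lowest address of the stack that contains sp.
 *   movq %rsp, %rdi
 *   andq $CURRENT_MASK, %rdi
 */
static uint64_t stack_base(uint64_t sp)
{
    return sp & CURRENT_MASK;
}

/*
 * Second argument (%rsi): number of bytes between the stack base and sp.
 *   movq %rsp, %rsi
 *   xorq %rdi, %rsi
 * The base equals sp with its low bits cleared, so XOR leaves exactly
 * those low bits, which is the same value as sp - base.
 */
static uint64_t bytes_below_sp(uint64_t sp)
{
    return sp ^ stack_base(sp);
}
```

As a worked example under the 32 KiB assumption, a stack pointer of 0xffff8803867d7878 gives a base of 0xffff8803867d0000 and a size of 0x7878, so the call would clear the shadow for everything between the bottom of the stack and the current stack pointer. That is the region where frames of functions that never returned could have left stale poison.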
* Re: [PATCH v2] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-01 20:31 ` [PATCH v2] " Josh Poimboeuf @ 2016-12-02 9:44 ` Dmitry Vyukov 2016-12-02 12:54 ` Pavel Machek 2016-12-02 13:41 ` Andrey Ryabinin 2 siblings, 0 replies; 35+ messages in thread From: Dmitry Vyukov @ 2016-12-02 9:44 UTC (permalink / raw) To: Josh Poimboeuf Cc: Andrey Ryabinin, Rafael J. Wysocki, Len Brown, Pavel Machek, linux-pm, LKML, Peter Zijlstra, Ingo Molnar, Andy Lutomirski, Scott Bauer, x86, Alexander Potapenko, kasan-dev On Thu, Dec 1, 2016 at 9:31 PM, Josh Poimboeuf <jpoimboe@redhat.com> wrote: > Resuming from a suspend operation is showing a KASAN false positive > warning: > > BUG: KASAN: stack-out-of-bounds in unwind_get_return_address+0x11d/0x130 at addr ffff8803867d7878 > Read of size 8 by task pm-suspend/7774 > page:ffffea000e19f5c0 count:0 mapcount:0 mapping: (null) index:0x0 > flags: 0x2ffff0000000000() > page dumped because: kasan: bad access detected > CPU: 0 PID: 7774 Comm: pm-suspend Tainted: G B 4.9.0-rc7+ #8 > Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016 > Call Trace: > dump_stack+0x63/0x82 > kasan_report_error+0x4b4/0x4e0 > ? acpi_hw_read_port+0xd0/0x1ea > ? kfree_const+0x22/0x30 > ? acpi_hw_validate_io_request+0x1a6/0x1a6 > __asan_report_load8_noabort+0x61/0x70 > ? unwind_get_return_address+0x11d/0x130 > unwind_get_return_address+0x11d/0x130 > ? unwind_next_frame+0x97/0xf0 > __save_stack_trace+0x92/0x100 > save_stack_trace+0x1b/0x20 > save_stack+0x46/0xd0 > ? save_stack_trace+0x1b/0x20 > ? save_stack+0x46/0xd0 > ? kasan_kmalloc+0xad/0xe0 > ? kasan_slab_alloc+0x12/0x20 > ? acpi_hw_read+0x2b6/0x3aa > ? acpi_hw_validate_register+0x20b/0x20b > ? acpi_hw_write_port+0x72/0xc7 > ? acpi_hw_write+0x11f/0x15f > ? acpi_hw_read_multiple+0x19f/0x19f > ? memcpy+0x45/0x50 > ? acpi_hw_write_port+0x72/0xc7 > ? acpi_hw_write+0x11f/0x15f > ? acpi_hw_read_multiple+0x19f/0x19f > ? 
kasan_unpoison_shadow+0x36/0x50 > kasan_kmalloc+0xad/0xe0 > kasan_slab_alloc+0x12/0x20 > kmem_cache_alloc_trace+0xbc/0x1e0 > ? acpi_get_sleep_type_data+0x9a/0x578 > acpi_get_sleep_type_data+0x9a/0x578 > acpi_hw_legacy_wake_prep+0x88/0x22c > ? acpi_hw_legacy_sleep+0x3c7/0x3c7 > ? acpi_write_bit_register+0x28d/0x2d3 > ? acpi_read_bit_register+0x19b/0x19b > acpi_hw_sleep_dispatch+0xb5/0xba > acpi_leave_sleep_state_prep+0x17/0x19 > acpi_suspend_enter+0x154/0x1e0 > ? trace_suspend_resume+0xe8/0xe8 > suspend_devices_and_enter+0xb09/0xdb0 > ? printk+0xa8/0xd8 > ? arch_suspend_enable_irqs+0x20/0x20 > ? try_to_freeze_tasks+0x295/0x600 > pm_suspend+0x6c9/0x780 > ? finish_wait+0x1f0/0x1f0 > ? suspend_devices_and_enter+0xdb0/0xdb0 > state_store+0xa2/0x120 > ? kobj_attr_show+0x60/0x60 > kobj_attr_store+0x36/0x70 > sysfs_kf_write+0x131/0x200 > kernfs_fop_write+0x295/0x3f0 > __vfs_write+0xef/0x760 > ? handle_mm_fault+0x1346/0x35e0 > ? do_iter_readv_writev+0x660/0x660 > ? __pmd_alloc+0x310/0x310 > ? do_lock_file_wait+0x1e0/0x1e0 > ? apparmor_file_permission+0x18/0x20 > ? security_file_permission+0x73/0x1c0 > ? rw_verify_area+0xbd/0x2b0 > vfs_write+0x149/0x4a0 > SyS_write+0xd9/0x1c0 > ? SyS_read+0x1c0/0x1c0 > entry_SYSCALL_64_fastpath+0x1e/0xad > Memory state around the buggy address: > ffff8803867d7700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > ffff8803867d7780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > >ffff8803867d7800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f4 > ^ > ffff8803867d7880: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 > ffff8803867d7900: 00 00 00 f1 f1 f1 f1 04 f4 f4 f4 f3 f3 f3 f3 00 > > KASAN instrumentation poisons the stack when entering a function and > unpoisons it when exiting the function. However, in the suspend path, > some functions never return, so their stack never gets unpoisoned, > resulting in stale KASAN shadow data which can cause later false > positive warnings like the one above. 
> > Reported-by: Scott Bauer <scott.bauer@intel.com> > Suggested-by: Dmitry Vyukov <dvyukov@google.com> > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> > --- > arch/x86/kernel/acpi/wakeup_64.S | 16 ++++++++++++++++ > 1 file changed, 16 insertions(+) > > diff --git a/arch/x86/kernel/acpi/wakeup_64.S b/arch/x86/kernel/acpi/wakeup_64.S > index 169963f..1df9b75 100644 > --- a/arch/x86/kernel/acpi/wakeup_64.S > +++ b/arch/x86/kernel/acpi/wakeup_64.S > @@ -109,6 +109,22 @@ ENTRY(do_suspend_lowlevel) > movq pt_regs_r14(%rax), %r14 > movq pt_regs_r15(%rax), %r15 > > +#ifdef CONFIG_KASAN > + /* > + * The suspend path may have poisoned some areas deeper in the stack, > + * which we now need to unpoison. > + * > + * We can't call kasan_unpoison_task_stack_below() because it uses %gs > + * for 'current', which hasn't been set up yet. Instead, calculate the > + * stack range manually and call kasan_unpoison_shadow(). > + */ > + movq %rsp, %rdi > + andq $CURRENT_MASK, %rdi > + movq %rsp, %rsi > + xorq %rdi, %rsi > + call kasan_unpoison_shadow > +#endif > + > xorl %eax, %eax > addq $8, %rsp > FRAME_END Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Thanks! ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v2] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-01 20:31 ` [PATCH v2] " Josh Poimboeuf 2016-12-02 9:44 ` Dmitry Vyukov @ 2016-12-02 12:54 ` Pavel Machek 2016-12-02 13:41 ` Andrey Ryabinin 2 siblings, 0 replies; 35+ messages in thread From: Pavel Machek @ 2016-12-02 12:54 UTC (permalink / raw) To: Josh Poimboeuf Cc: Dmitry Vyukov, Andrey Ryabinin, Rafael J. Wysocki, Len Brown, linux-pm, LKML, Peter Zijlstra, Ingo Molnar, Andy Lutomirski, Scott Bauer, x86, Alexander Potapenko, kasan-dev [-- Attachment #1: Type: text/plain, Size: 1963 bytes --] Hi! > Resuming from a suspend operation is showing a KASAN false positive > warning: > KASAN instrumentation poisons the stack when entering a function and > unpoisons it when exiting the function. However, in the suspend path, > some functions never return, so their stack never gets unpoisoned, > resulting in stale KASAN shadow data which can cause later false > positive warnings like the one above. > > Reported-by: Scott Bauer <scott.bauer@intel.com> > Suggested-by: Dmitry Vyukov <dvyukov@google.com> > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Pavel Machek <pavel@ucw.cz> > --- > arch/x86/kernel/acpi/wakeup_64.S | 16 ++++++++++++++++ > 1 file changed, 16 insertions(+) > > diff --git a/arch/x86/kernel/acpi/wakeup_64.S b/arch/x86/kernel/acpi/wakeup_64.S > index 169963f..1df9b75 100644 > --- a/arch/x86/kernel/acpi/wakeup_64.S > +++ b/arch/x86/kernel/acpi/wakeup_64.S > @@ -109,6 +109,22 @@ ENTRY(do_suspend_lowlevel) > movq pt_regs_r14(%rax), %r14 > movq pt_regs_r15(%rax), %r15 > > +#ifdef CONFIG_KASAN > + /* > + * The suspend path may have poisoned some areas deeper in the stack, > + * which we now need to unpoison. > + * > + * We can't call kasan_unpoison_task_stack_below() because it uses %gs > + * for 'current', which hasn't been set up yet. Instead, calculate the > + * stack range manually and call kasan_unpoison_shadow(). 
> + */
> + movq %rsp, %rdi
> + andq $CURRENT_MASK, %rdi
> + movq %rsp, %rsi
> + xorq %rdi, %rsi
> + call kasan_unpoison_shadow
> +#endif

Well... you may want to add a note to kasan_unpoison_shadow():

/*
 * This is called by early resume code, with cpu not yet properly
 * resumed. In particular, %gs may not be set up, and thus current
 * is not available.
 */

Thanks,
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply [flat|nested] 35+ messages in thread
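For readers following the v2 hunk: the and/xor sequence can be read as producing a (base, size) pair for kasan_unpoison_shadow(). The userspace sketch below mirrors that arithmetic. DEMO_THREAD_SIZE, DEMO_CURRENT_MASK, struct demo_stack_range and demo_range_below() are hypothetical names made up for illustration, and the 0x4000 stack size is an assumed value, not the one used by any particular kernel configuration.

```c
#include <stdint.h>

/* Assumed, illustrative stack size; the real THREAD_SIZE depends on
 * the architecture and the kernel configuration. */
#define DEMO_THREAD_SIZE  ((uintptr_t)0x4000)
/* Assumed to play the role of CURRENT_MASK: clears the low bits. */
#define DEMO_CURRENT_MASK (~(DEMO_THREAD_SIZE - 1))

struct demo_stack_range {
	uintptr_t base;	/* value the hunk leaves in %rdi */
	uintptr_t size;	/* value the hunk leaves in %rsi */
};

/*
 * Mirrors: movq %rsp,%rdi; andq $MASK,%rdi; movq %rsp,%rsi; xorq %rdi,%rsi
 *
 * base keeps only the high bits of sp, so sp ^ base leaves exactly the
 * low bits, which equals sp - base.
 */
static struct demo_stack_range demo_range_below(uintptr_t sp)
{
	struct demo_stack_range r;

	r.base = sp & DEMO_CURRENT_MASK;
	r.size = sp ^ r.base;
	return r;
}
```

With the assumed 0x4000 size, a stack pointer of 0x7878 gives base 0x4000 and size 0x3878, the same as 0x7878 - 0x4000.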
* Re: [PATCH v2] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-01 20:31 ` [PATCH v2] " Josh Poimboeuf @ 2016-12-02 13:41 ` Andrey Ryabinin 2016-12-02 12:54 ` Pavel Machek 2016-12-02 13:41 ` Andrey Ryabinin 2 siblings, 0 replies; 35+ messages in thread From: Andrey Ryabinin @ 2016-12-02 13:41 UTC (permalink / raw) To: Josh Poimboeuf, Dmitry Vyukov Cc: Rafael J. Wysocki, Len Brown, Pavel Machek, linux-pm, LKML, Peter Zijlstra, Ingo Molnar, Andy Lutomirski, Scott Bauer, x86, Alexander Potapenko, kasan-dev On 12/01/2016 11:31 PM, Josh Poimboeuf wrote: > arch/x86/kernel/acpi/wakeup_64.S | 16 ++++++++++++++++ > 1 file changed, 16 insertions(+) > > diff --git a/arch/x86/kernel/acpi/wakeup_64.S b/arch/x86/kernel/acpi/wakeup_64.S > index 169963f..1df9b75 100644 > --- a/arch/x86/kernel/acpi/wakeup_64.S > +++ b/arch/x86/kernel/acpi/wakeup_64.S > @@ -109,6 +109,22 @@ ENTRY(do_suspend_lowlevel) > movq pt_regs_r14(%rax), %r14 > movq pt_regs_r15(%rax), %r15 > > +#ifdef CONFIG_KASAN > + /* > + * The suspend path may have poisoned some areas deeper in the stack, > + * which we now need to unpoison. > + * > + * We can't call kasan_unpoison_task_stack_below() because it uses %gs > + * for 'current', which hasn't been set up yet. Instead, calculate the > + * stack range manually and call kasan_unpoison_shadow(). > + */ > + movq %rsp, %rdi > + andq $CURRENT_MASK, %rdi > + movq %rsp, %rsi > + xorq %rdi, %rsi > + call kasan_unpoison_shadow > +#endif > + Looks good, but in fact we can use kasan_unpoison_task_stack_below(). We just need to change it a little: diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c index 70c0097..e779236 100644 --- a/mm/kasan/kasan.c +++ b/mm/kasan/kasan.c @@ -80,7 +80,9 @@ void kasan_unpoison_task_stack(struct task_struct *task) /* Unpoison the stack for the current task beyond a watermark sp value. 
*/ asmlinkage void kasan_unpoison_task_stack_below(const void *watermark) { - __kasan_unpoison_stack(current, watermark); + void *base = (void *)((unsigned long)watermark & ~(THREAD_SIZE - 1)); + + kasan_unpoison_shadow(base, watermark - base); } With this we don't have to calculate stack range in assembly. ^ permalink raw reply related [flat|nested] 35+ messages in thread
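A small userspace sketch of the masking in the suggested change may help when judging where it is valid. It relies on the assumption that the task stack starts on a THREAD_SIZE-aligned address; when that holds, masking any address inside the stack recovers its base, and when it does not, the mask lands somewhere else. DEMO_THREAD_SIZE, demo_stack_base() and demo_mask_finds_base() are hypothetical helpers for illustration only, and 0x4000 is an assumed size.

```c
#include <stdint.h>

/* Assumed, illustrative stack size (must be a power of two). */
#define DEMO_THREAD_SIZE ((uintptr_t)0x4000)

/* Same arithmetic as the proposed body:
 * base = watermark & ~(THREAD_SIZE - 1) */
static uintptr_t demo_stack_base(uintptr_t watermark)
{
	return watermark & ~(DEMO_THREAD_SIZE - 1);
}

/*
 * Returns 1 if masking an address 'offset' bytes into a stack that
 * really starts at 'true_base' recovers 'true_base', 0 otherwise.
 * 'offset' is expected to be smaller than DEMO_THREAD_SIZE.
 */
static int demo_mask_finds_base(uintptr_t true_base, uintptr_t offset)
{
	return demo_stack_base(true_base + offset) == true_base;
}
```

For example, demo_mask_finds_base(0x10000, 0x3878) holds because 0x10000 is aligned to the assumed size, while demo_mask_finds_base(0x11000, 0x3878) does not.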
* Re: [PATCH v2] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-02 13:41 ` Andrey Ryabinin (?) @ 2016-12-02 14:01 ` Josh Poimboeuf 2016-12-02 14:02 ` Dmitry Vyukov 2016-12-02 14:42 ` [PATCH v3] " Josh Poimboeuf -1 siblings, 2 replies; 35+ messages in thread From: Josh Poimboeuf @ 2016-12-02 14:01 UTC (permalink / raw) To: Andrey Ryabinin Cc: Dmitry Vyukov, Rafael J. Wysocki, Len Brown, Pavel Machek, linux-pm, LKML, Peter Zijlstra, Ingo Molnar, Andy Lutomirski, Scott Bauer, x86, Alexander Potapenko, kasan-dev On Fri, Dec 02, 2016 at 04:41:09PM +0300, Andrey Ryabinin wrote: > > > On 12/01/2016 11:31 PM, Josh Poimboeuf wrote: > > > arch/x86/kernel/acpi/wakeup_64.S | 16 ++++++++++++++++ > > 1 file changed, 16 insertions(+) > > > > diff --git a/arch/x86/kernel/acpi/wakeup_64.S b/arch/x86/kernel/acpi/wakeup_64.S > > index 169963f..1df9b75 100644 > > --- a/arch/x86/kernel/acpi/wakeup_64.S > > +++ b/arch/x86/kernel/acpi/wakeup_64.S > > @@ -109,6 +109,22 @@ ENTRY(do_suspend_lowlevel) > > movq pt_regs_r14(%rax), %r14 > > movq pt_regs_r15(%rax), %r15 > > > > +#ifdef CONFIG_KASAN > > + /* > > + * The suspend path may have poisoned some areas deeper in the stack, > > + * which we now need to unpoison. > > + * > > + * We can't call kasan_unpoison_task_stack_below() because it uses %gs > > + * for 'current', which hasn't been set up yet. Instead, calculate the > > + * stack range manually and call kasan_unpoison_shadow(). > > + */ > > + movq %rsp, %rdi > > + andq $CURRENT_MASK, %rdi > > + movq %rsp, %rsi > > + xorq %rdi, %rsi > > + call kasan_unpoison_shadow > > +#endif > > + > > Looks good, but in fact we can use kasan_unpoison_task_stack_below(). 
We just need to change it a little: > > diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c > index 70c0097..e779236 100644 > --- a/mm/kasan/kasan.c > +++ b/mm/kasan/kasan.c > @@ -80,7 +80,9 @@ void kasan_unpoison_task_stack(struct task_struct *task) > /* Unpoison the stack for the current task beyond a watermark sp value. */ > asmlinkage void kasan_unpoison_task_stack_below(const void *watermark) > { > - __kasan_unpoison_stack(current, watermark); > + void *base = (void *)((unsigned long)watermark & ~(THREAD_SIZE - 1)); > + > + kasan_unpoison_shadow(base, watermark - base); > } > > > With this we don't have to calculate stack range in assembly. That is better indeed, will do a v3. -- Josh ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v2] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-02 14:01 ` Josh Poimboeuf @ 2016-12-02 14:02 ` Dmitry Vyukov 2016-12-02 14:42 ` [PATCH v3] " Josh Poimboeuf 1 sibling, 0 replies; 35+ messages in thread From: Dmitry Vyukov @ 2016-12-02 14:02 UTC (permalink / raw) To: Josh Poimboeuf Cc: Andrey Ryabinin, Rafael J. Wysocki, Len Brown, Pavel Machek, linux-pm, LKML, Peter Zijlstra, Ingo Molnar, Andy Lutomirski, Scott Bauer, x86, Alexander Potapenko, kasan-dev On Fri, Dec 2, 2016 at 3:01 PM, Josh Poimboeuf <jpoimboe@redhat.com> wrote: > On Fri, Dec 02, 2016 at 04:41:09PM +0300, Andrey Ryabinin wrote: >> >> >> On 12/01/2016 11:31 PM, Josh Poimboeuf wrote: >> >> > arch/x86/kernel/acpi/wakeup_64.S | 16 ++++++++++++++++ >> > 1 file changed, 16 insertions(+) >> > >> > diff --git a/arch/x86/kernel/acpi/wakeup_64.S b/arch/x86/kernel/acpi/wakeup_64.S >> > index 169963f..1df9b75 100644 >> > --- a/arch/x86/kernel/acpi/wakeup_64.S >> > +++ b/arch/x86/kernel/acpi/wakeup_64.S >> > @@ -109,6 +109,22 @@ ENTRY(do_suspend_lowlevel) >> > movq pt_regs_r14(%rax), %r14 >> > movq pt_regs_r15(%rax), %r15 >> > >> > +#ifdef CONFIG_KASAN >> > + /* >> > + * The suspend path may have poisoned some areas deeper in the stack, >> > + * which we now need to unpoison. >> > + * >> > + * We can't call kasan_unpoison_task_stack_below() because it uses %gs >> > + * for 'current', which hasn't been set up yet. Instead, calculate the >> > + * stack range manually and call kasan_unpoison_shadow(). >> > + */ >> > + movq %rsp, %rdi >> > + andq $CURRENT_MASK, %rdi >> > + movq %rsp, %rsi >> > + xorq %rdi, %rsi >> > + call kasan_unpoison_shadow >> > +#endif >> > + >> >> Looks good, but in fact we can use kasan_unpoison_task_stack_below(). 
We just need to change it a little: >> >> diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c >> index 70c0097..e779236 100644 >> --- a/mm/kasan/kasan.c >> +++ b/mm/kasan/kasan.c >> @@ -80,7 +80,9 @@ void kasan_unpoison_task_stack(struct task_struct *task) >> /* Unpoison the stack for the current task beyond a watermark sp value. */ >> asmlinkage void kasan_unpoison_task_stack_below(const void *watermark) >> { >> - __kasan_unpoison_stack(current, watermark); >> + void *base = (void *)((unsigned long)watermark & ~(THREAD_SIZE - 1)); >> + >> + kasan_unpoison_shadow(base, watermark - base); >> } >> >> >> With this we don't have to calculate stack range in assembly. > > That is better indeed, will do a v3. agree ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH v3] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-02 14:01 ` Josh Poimboeuf 2016-12-02 14:02 ` Dmitry Vyukov @ 2016-12-02 14:42 ` Josh Poimboeuf 2016-12-02 14:45 ` Andrey Ryabinin 1 sibling, 1 reply; 35+ messages in thread From: Josh Poimboeuf @ 2016-12-02 14:42 UTC (permalink / raw) To: Andrey Ryabinin Cc: Dmitry Vyukov, Rafael J. Wysocki, Len Brown, Pavel Machek, linux-pm, LKML, Peter Zijlstra, Ingo Molnar, Andy Lutomirski, Scott Bauer, x86, Alexander Potapenko, kasan-dev Resuming from a suspend operation is showing a KASAN false positive warning: BUG: KASAN: stack-out-of-bounds in unwind_get_return_address+0x11d/0x130 at addr ffff8803867d7878 Read of size 8 by task pm-suspend/7774 page:ffffea000e19f5c0 count:0 mapcount:0 mapping: (null) index:0x0 flags: 0x2ffff0000000000() page dumped because: kasan: bad access detected CPU: 0 PID: 7774 Comm: pm-suspend Tainted: G B 4.9.0-rc7+ #8 Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016 Call Trace: dump_stack+0x63/0x82 kasan_report_error+0x4b4/0x4e0 ? acpi_hw_read_port+0xd0/0x1ea ? kfree_const+0x22/0x30 ? acpi_hw_validate_io_request+0x1a6/0x1a6 __asan_report_load8_noabort+0x61/0x70 ? unwind_get_return_address+0x11d/0x130 unwind_get_return_address+0x11d/0x130 ? unwind_next_frame+0x97/0xf0 __save_stack_trace+0x92/0x100 save_stack_trace+0x1b/0x20 save_stack+0x46/0xd0 ? save_stack_trace+0x1b/0x20 ? save_stack+0x46/0xd0 ? kasan_kmalloc+0xad/0xe0 ? kasan_slab_alloc+0x12/0x20 ? acpi_hw_read+0x2b6/0x3aa ? acpi_hw_validate_register+0x20b/0x20b ? acpi_hw_write_port+0x72/0xc7 ? acpi_hw_write+0x11f/0x15f ? acpi_hw_read_multiple+0x19f/0x19f ? memcpy+0x45/0x50 ? acpi_hw_write_port+0x72/0xc7 ? acpi_hw_write+0x11f/0x15f ? acpi_hw_read_multiple+0x19f/0x19f ? kasan_unpoison_shadow+0x36/0x50 kasan_kmalloc+0xad/0xe0 kasan_slab_alloc+0x12/0x20 kmem_cache_alloc_trace+0xbc/0x1e0 ? 
acpi_get_sleep_type_data+0x9a/0x578 acpi_get_sleep_type_data+0x9a/0x578 acpi_hw_legacy_wake_prep+0x88/0x22c ? acpi_hw_legacy_sleep+0x3c7/0x3c7 ? acpi_write_bit_register+0x28d/0x2d3 ? acpi_read_bit_register+0x19b/0x19b acpi_hw_sleep_dispatch+0xb5/0xba acpi_leave_sleep_state_prep+0x17/0x19 acpi_suspend_enter+0x154/0x1e0 ? trace_suspend_resume+0xe8/0xe8 suspend_devices_and_enter+0xb09/0xdb0 ? printk+0xa8/0xd8 ? arch_suspend_enable_irqs+0x20/0x20 ? try_to_freeze_tasks+0x295/0x600 pm_suspend+0x6c9/0x780 ? finish_wait+0x1f0/0x1f0 ? suspend_devices_and_enter+0xdb0/0xdb0 state_store+0xa2/0x120 ? kobj_attr_show+0x60/0x60 kobj_attr_store+0x36/0x70 sysfs_kf_write+0x131/0x200 kernfs_fop_write+0x295/0x3f0 __vfs_write+0xef/0x760 ? handle_mm_fault+0x1346/0x35e0 ? do_iter_readv_writev+0x660/0x660 ? __pmd_alloc+0x310/0x310 ? do_lock_file_wait+0x1e0/0x1e0 ? apparmor_file_permission+0x18/0x20 ? security_file_permission+0x73/0x1c0 ? rw_verify_area+0xbd/0x2b0 vfs_write+0x149/0x4a0 SyS_write+0xd9/0x1c0 ? SyS_read+0x1c0/0x1c0 entry_SYSCALL_64_fastpath+0x1e/0xad Memory state around the buggy address: ffff8803867d7700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff8803867d7780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff8803867d7800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f4 ^ ffff8803867d7880: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 ffff8803867d7900: 00 00 00 f1 f1 f1 f1 04 f4 f4 f4 f3 f3 f3 f3 00 KASAN instrumentation poisons the stack when entering a function and unpoisons it when exiting the function. However, in the suspend path, some functions never return, so their stack never gets unpoisoned, resulting in stale KASAN shadow data which can cause later false positive warnings like the one above. 
Reported-by: Scott Bauer <scott.bauer@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> --- arch/x86/kernel/acpi/wakeup_64.S | 9 +++++++++ mm/kasan/kasan.c | 9 ++++++++- 2 files changed, 17 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/acpi/wakeup_64.S b/arch/x86/kernel/acpi/wakeup_64.S index 169963f..50b8ed0 100644 --- a/arch/x86/kernel/acpi/wakeup_64.S +++ b/arch/x86/kernel/acpi/wakeup_64.S @@ -109,6 +109,15 @@ ENTRY(do_suspend_lowlevel) movq pt_regs_r14(%rax), %r14 movq pt_regs_r15(%rax), %r15 +#ifdef CONFIG_KASAN + /* + * The suspend path may have poisoned some areas deeper in the stack, + * which we now need to unpoison. + */ + movq %rsp, %rdi + call kasan_unpoison_task_stack_below +#endif + xorl %eax, %eax addq $8, %rsp FRAME_END diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c index 0e9505f..e9d8ba0 100644 --- a/mm/kasan/kasan.c +++ b/mm/kasan/kasan.c @@ -80,7 +80,14 @@ void kasan_unpoison_task_stack(struct task_struct *task) /* Unpoison the stack for the current task beyond a watermark sp value. */ asmlinkage void kasan_unpoison_task_stack_below(const void *watermark) { - __kasan_unpoison_stack(current, watermark); + /* + * Calculate the task stack base address. Avoid using 'current' + * because this function is called by early resume code which hasn't + * yet set up the percpu register (%gs). + */ + void *base = (void *)((unsigned long)watermark & CURRENT_MASK); + + kasan_unpoison_shadow(base, watermark - base); } /* -- 2.7.4 ^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH v3] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-02 14:42 ` [PATCH v3] " Josh Poimboeuf @ 2016-12-02 14:45 ` Andrey Ryabinin 0 siblings, 0 replies; 35+ messages in thread From: Andrey Ryabinin @ 2016-12-02 14:45 UTC (permalink / raw) To: Josh Poimboeuf Cc: Dmitry Vyukov, Rafael J. Wysocki, Len Brown, Pavel Machek, linux-pm, LKML, Peter Zijlstra, Ingo Molnar, Andy Lutomirski, Scott Bauer, x86, Alexander Potapenko, kasan-dev On 12/02/2016 05:42 PM, Josh Poimboeuf wrote: > diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c > index 0e9505f..e9d8ba0 100644 > --- a/mm/kasan/kasan.c > +++ b/mm/kasan/kasan.c > @@ -80,7 +80,14 @@ void kasan_unpoison_task_stack(struct task_struct *task) > /* Unpoison the stack for the current task beyond a watermark sp value. */ > asmlinkage void kasan_unpoison_task_stack_below(const void *watermark) > { > - __kasan_unpoison_stack(current, watermark); > + /* > + * Calculate the task stack base address. Avoid using 'current' > + * because this function is called by early resume code which hasn't > + * yet set up the percpu register (%gs). > + */ > + void *base = (void *)((unsigned long)watermark & CURRENT_MASK); CURRENT_MASK is defined only on x86... > + > + kasan_unpoison_shadow(base, watermark - base); > } > > /* > ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v3] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-02 14:45 ` Andrey Ryabinin (?) @ 2016-12-02 15:08 ` Josh Poimboeuf 2016-12-02 17:42 ` [PATCH v4] " Josh Poimboeuf -1 siblings, 1 reply; 35+ messages in thread From: Josh Poimboeuf @ 2016-12-02 15:08 UTC (permalink / raw) To: Andrey Ryabinin Cc: Dmitry Vyukov, Rafael J. Wysocki, Len Brown, Pavel Machek, linux-pm, LKML, Peter Zijlstra, Ingo Molnar, Andy Lutomirski, Scott Bauer, x86, Alexander Potapenko, kasan-dev On Fri, Dec 02, 2016 at 05:45:18PM +0300, Andrey Ryabinin wrote: > > > On 12/02/2016 05:42 PM, Josh Poimboeuf wrote: > > > > diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c > > index 0e9505f..e9d8ba0 100644 > > --- a/mm/kasan/kasan.c > > +++ b/mm/kasan/kasan.c > > @@ -80,7 +80,14 @@ void kasan_unpoison_task_stack(struct task_struct *task) > > /* Unpoison the stack for the current task beyond a watermark sp value. */ > > asmlinkage void kasan_unpoison_task_stack_below(const void *watermark) > > { > > - __kasan_unpoison_stack(current, watermark); > > + /* > > + * Calculate the task stack base address. Avoid using 'current' > > + * because this function is called by early resume code which hasn't > > + * yet set up the percpu register (%gs). > > + */ > > + void *base = (void *)((unsigned long)watermark & CURRENT_MASK); > > CURRENT_MASK is defined only on x86... Oops. I guess I should have taken your suggested patch verbatim... Will do a proper multi-arch compile before submitting v4. -- Josh ^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH v4] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-02 15:08 ` Josh Poimboeuf @ 2016-12-02 17:42 ` Josh Poimboeuf 2016-12-02 20:55 ` Andrey Ryabinin ` (2 more replies) 0 siblings, 3 replies; 35+ messages in thread From: Josh Poimboeuf @ 2016-12-02 17:42 UTC (permalink / raw) To: Andrey Ryabinin Cc: Dmitry Vyukov, Rafael J. Wysocki, Len Brown, Pavel Machek, linux-pm, LKML, Peter Zijlstra, Ingo Molnar, Andy Lutomirski, Scott Bauer, x86, Alexander Potapenko, kasan-dev Resuming from a suspend operation is showing a KASAN false positive warning: BUG: KASAN: stack-out-of-bounds in unwind_get_return_address+0x11d/0x130 at addr ffff8803867d7878 Read of size 8 by task pm-suspend/7774 page:ffffea000e19f5c0 count:0 mapcount:0 mapping: (null) index:0x0 flags: 0x2ffff0000000000() page dumped because: kasan: bad access detected CPU: 0 PID: 7774 Comm: pm-suspend Tainted: G B 4.9.0-rc7+ #8 Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016 Call Trace: dump_stack+0x63/0x82 kasan_report_error+0x4b4/0x4e0 ? acpi_hw_read_port+0xd0/0x1ea ? kfree_const+0x22/0x30 ? acpi_hw_validate_io_request+0x1a6/0x1a6 __asan_report_load8_noabort+0x61/0x70 ? unwind_get_return_address+0x11d/0x130 unwind_get_return_address+0x11d/0x130 ? unwind_next_frame+0x97/0xf0 __save_stack_trace+0x92/0x100 save_stack_trace+0x1b/0x20 save_stack+0x46/0xd0 ? save_stack_trace+0x1b/0x20 ? save_stack+0x46/0xd0 ? kasan_kmalloc+0xad/0xe0 ? kasan_slab_alloc+0x12/0x20 ? acpi_hw_read+0x2b6/0x3aa ? acpi_hw_validate_register+0x20b/0x20b ? acpi_hw_write_port+0x72/0xc7 ? acpi_hw_write+0x11f/0x15f ? acpi_hw_read_multiple+0x19f/0x19f ? memcpy+0x45/0x50 ? acpi_hw_write_port+0x72/0xc7 ? acpi_hw_write+0x11f/0x15f ? acpi_hw_read_multiple+0x19f/0x19f ? kasan_unpoison_shadow+0x36/0x50 kasan_kmalloc+0xad/0xe0 kasan_slab_alloc+0x12/0x20 kmem_cache_alloc_trace+0xbc/0x1e0 ? 
acpi_get_sleep_type_data+0x9a/0x578 acpi_get_sleep_type_data+0x9a/0x578 acpi_hw_legacy_wake_prep+0x88/0x22c ? acpi_hw_legacy_sleep+0x3c7/0x3c7 ? acpi_write_bit_register+0x28d/0x2d3 ? acpi_read_bit_register+0x19b/0x19b acpi_hw_sleep_dispatch+0xb5/0xba acpi_leave_sleep_state_prep+0x17/0x19 acpi_suspend_enter+0x154/0x1e0 ? trace_suspend_resume+0xe8/0xe8 suspend_devices_and_enter+0xb09/0xdb0 ? printk+0xa8/0xd8 ? arch_suspend_enable_irqs+0x20/0x20 ? try_to_freeze_tasks+0x295/0x600 pm_suspend+0x6c9/0x780 ? finish_wait+0x1f0/0x1f0 ? suspend_devices_and_enter+0xdb0/0xdb0 state_store+0xa2/0x120 ? kobj_attr_show+0x60/0x60 kobj_attr_store+0x36/0x70 sysfs_kf_write+0x131/0x200 kernfs_fop_write+0x295/0x3f0 __vfs_write+0xef/0x760 ? handle_mm_fault+0x1346/0x35e0 ? do_iter_readv_writev+0x660/0x660 ? __pmd_alloc+0x310/0x310 ? do_lock_file_wait+0x1e0/0x1e0 ? apparmor_file_permission+0x18/0x20 ? security_file_permission+0x73/0x1c0 ? rw_verify_area+0xbd/0x2b0 vfs_write+0x149/0x4a0 SyS_write+0xd9/0x1c0 ? SyS_read+0x1c0/0x1c0 entry_SYSCALL_64_fastpath+0x1e/0xad Memory state around the buggy address: ffff8803867d7700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff8803867d7780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff8803867d7800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f4 ^ ffff8803867d7880: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 ffff8803867d7900: 00 00 00 f1 f1 f1 f1 04 f4 f4 f4 f3 f3 f3 f3 00 KASAN instrumentation poisons the stack when entering a function and unpoisons it when exiting the function. However, in the suspend path, some functions never return, so their stack never gets unpoisoned, resulting in stale KASAN shadow data which can cause later false positive warnings like the one above. 
Reported-by: Scott Bauer <scott.bauer@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> --- arch/x86/kernel/acpi/wakeup_64.S | 9 +++++++++ mm/kasan/kasan.c | 9 ++++++++- 2 files changed, 17 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/acpi/wakeup_64.S b/arch/x86/kernel/acpi/wakeup_64.S index 169963f..50b8ed0 100644 --- a/arch/x86/kernel/acpi/wakeup_64.S +++ b/arch/x86/kernel/acpi/wakeup_64.S @@ -109,6 +109,15 @@ ENTRY(do_suspend_lowlevel) movq pt_regs_r14(%rax), %r14 movq pt_regs_r15(%rax), %r15 +#ifdef CONFIG_KASAN + /* + * The suspend path may have poisoned some areas deeper in the stack, + * which we now need to unpoison. + */ + movq %rsp, %rdi + call kasan_unpoison_task_stack_below +#endif + xorl %eax, %eax addq $8, %rsp FRAME_END diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c index 0e9505f..b2a0cff 100644 --- a/mm/kasan/kasan.c +++ b/mm/kasan/kasan.c @@ -80,7 +80,14 @@ void kasan_unpoison_task_stack(struct task_struct *task) /* Unpoison the stack for the current task beyond a watermark sp value. */ asmlinkage void kasan_unpoison_task_stack_below(const void *watermark) { - __kasan_unpoison_stack(current, watermark); + /* + * Calculate the task stack base address. Avoid using 'current' + * because this function is called by early resume code which hasn't + * yet set up the percpu register (%gs). + */ + void *base = (void *)((unsigned long)watermark & ~(THREAD_SIZE - 1)); + + kasan_unpoison_shadow(base, watermark - base); } /* -- 2.7.4 ^ permalink raw reply related [flat|nested] 35+ messages in thread
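The stale-shadow problem described in the changelog can be modelled with a toy shadow array. This is only a sketch under simplifying assumptions: it tracks one shadow byte per stack byte (real KASAN maps several bytes of memory to one shadow byte), uses a 64-byte toy stack, and every demo_* name is hypothetical rather than a kernel API. It shows a frame that is poisoned on entry, never unpoisoned because the function does not return, and then cleared by unpoisoning everything between the stack base and a watermark.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_STACK_SIZE 64	/* assumed toy stack size, in bytes */

/* Toy shadow: 0 means accessible, non-zero means poisoned. */
static unsigned char demo_shadow[DEMO_STACK_SIZE];

static void demo_poison(size_t off, size_t len)
{
	memset(demo_shadow + off, 0xf1, len);
}

static void demo_unpoison(size_t off, size_t len)
{
	memset(demo_shadow + off, 0, len);
}

/* Returns 1 if an access at 'off' would pass the toy check. */
static int demo_access_ok(size_t off)
{
	return demo_shadow[off] == 0;
}

/*
 * Models an instrumented function on the suspend path: it poisons part
 * of its frame on entry, but the matching unpoison on exit never runs
 * because the function never returns.
 */
static void demo_enter_and_never_return(size_t frame_off)
{
	demo_poison(frame_off, 8);
}

/*
 * Models the idea of the fix: on resume, unpoison everything between
 * the stack base (offset 0 here) and the current watermark.
 */
static void demo_unpoison_below(size_t watermark)
{
	demo_unpoison(0, watermark);
}
```

Calling demo_enter_and_never_return(16) leaves offset 20 flagged as poisoned; a later demo_unpoison_below(32) clears it, so a subsequent check of that offset no longer reports a false positive.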
* Re: [PATCH v4] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-02 17:42 ` [PATCH v4] " Josh Poimboeuf @ 2016-12-02 20:55 ` Andrey Ryabinin 2016-12-02 21:09 ` Pavel Machek 2016-12-08 0:10 ` Rafael J. Wysocki 2 siblings, 0 replies; 35+ messages in thread From: Andrey Ryabinin @ 2016-12-02 20:55 UTC (permalink / raw) To: Josh Poimboeuf Cc: Dmitry Vyukov, Rafael J. Wysocki, Len Brown, Pavel Machek, linux-pm, LKML, Peter Zijlstra, Ingo Molnar, Andy Lutomirski, Scott Bauer, x86, Alexander Potapenko, kasan-dev On 12/02/2016 08:42 PM, Josh Poimboeuf wrote: > Resuming from a suspend operation is showing a KASAN false positive > warning: > > BUG: KASAN: stack-out-of-bounds in unwind_get_return_address+0x11d/0x130 at addr ffff8803867d7878 > Read of size 8 by task pm-suspend/7774 > page:ffffea000e19f5c0 count:0 mapcount:0 mapping: (null) index:0x0 > flags: 0x2ffff0000000000() > page dumped because: kasan: bad access detected > CPU: 0 PID: 7774 Comm: pm-suspend Tainted: G B 4.9.0-rc7+ #8 > Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016 > Call Trace: > dump_stack+0x63/0x82 > kasan_report_error+0x4b4/0x4e0 > ? acpi_hw_read_port+0xd0/0x1ea > ? kfree_const+0x22/0x30 > ? acpi_hw_validate_io_request+0x1a6/0x1a6 > __asan_report_load8_noabort+0x61/0x70 > ? unwind_get_return_address+0x11d/0x130 > unwind_get_return_address+0x11d/0x130 > ? unwind_next_frame+0x97/0xf0 > __save_stack_trace+0x92/0x100 > save_stack_trace+0x1b/0x20 > save_stack+0x46/0xd0 > ? save_stack_trace+0x1b/0x20 > ? save_stack+0x46/0xd0 > ? kasan_kmalloc+0xad/0xe0 > ? kasan_slab_alloc+0x12/0x20 > ? acpi_hw_read+0x2b6/0x3aa > ? acpi_hw_validate_register+0x20b/0x20b > ? acpi_hw_write_port+0x72/0xc7 > ? acpi_hw_write+0x11f/0x15f > ? acpi_hw_read_multiple+0x19f/0x19f > ? memcpy+0x45/0x50 > ? acpi_hw_write_port+0x72/0xc7 > ? acpi_hw_write+0x11f/0x15f > ? acpi_hw_read_multiple+0x19f/0x19f > ? 
kasan_unpoison_shadow+0x36/0x50 > kasan_kmalloc+0xad/0xe0 > kasan_slab_alloc+0x12/0x20 > kmem_cache_alloc_trace+0xbc/0x1e0 > ? acpi_get_sleep_type_data+0x9a/0x578 > acpi_get_sleep_type_data+0x9a/0x578 > acpi_hw_legacy_wake_prep+0x88/0x22c > ? acpi_hw_legacy_sleep+0x3c7/0x3c7 > ? acpi_write_bit_register+0x28d/0x2d3 > ? acpi_read_bit_register+0x19b/0x19b > acpi_hw_sleep_dispatch+0xb5/0xba > acpi_leave_sleep_state_prep+0x17/0x19 > acpi_suspend_enter+0x154/0x1e0 > ? trace_suspend_resume+0xe8/0xe8 > suspend_devices_and_enter+0xb09/0xdb0 > ? printk+0xa8/0xd8 > ? arch_suspend_enable_irqs+0x20/0x20 > ? try_to_freeze_tasks+0x295/0x600 > pm_suspend+0x6c9/0x780 > ? finish_wait+0x1f0/0x1f0 > ? suspend_devices_and_enter+0xdb0/0xdb0 > state_store+0xa2/0x120 > ? kobj_attr_show+0x60/0x60 > kobj_attr_store+0x36/0x70 > sysfs_kf_write+0x131/0x200 > kernfs_fop_write+0x295/0x3f0 > __vfs_write+0xef/0x760 > ? handle_mm_fault+0x1346/0x35e0 > ? do_iter_readv_writev+0x660/0x660 > ? __pmd_alloc+0x310/0x310 > ? do_lock_file_wait+0x1e0/0x1e0 > ? apparmor_file_permission+0x18/0x20 > ? security_file_permission+0x73/0x1c0 > ? rw_verify_area+0xbd/0x2b0 > vfs_write+0x149/0x4a0 > SyS_write+0xd9/0x1c0 > ? SyS_read+0x1c0/0x1c0 > entry_SYSCALL_64_fastpath+0x1e/0xad > Memory state around the buggy address: > ffff8803867d7700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > ffff8803867d7780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > >ffff8803867d7800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f4 > ^ > ffff8803867d7880: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 > ffff8803867d7900: 00 00 00 f1 f1 f1 f1 04 f4 f4 f4 f3 f3 f3 f3 00 > > KASAN instrumentation poisons the stack when entering a function and > unpoisons it when exiting the function. However, in the suspend path, > some functions never return, so their stack never gets unpoisoned, > resulting in stale KASAN shadow data which can cause later false > positive warnings like the one above. 
> > Reported-by: Scott Bauer <scott.bauer@intel.com> > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4] x86/suspend: fix false positive KASAN warning on suspend/resume @ 2016-12-02 20:55 ` Andrey Ryabinin 0 siblings, 0 replies; 35+ messages in thread From: Andrey Ryabinin @ 2016-12-02 20:55 UTC (permalink / raw) To: Josh Poimboeuf Cc: Dmitry Vyukov, Rafael J. Wysocki, Len Brown, Pavel Machek, linux-pm, LKML, Peter Zijlstra, Ingo Molnar, Andy Lutomirski, Scott Bauer, x86, Alexander Potapenko, kasan-dev On 12/02/2016 08:42 PM, Josh Poimboeuf wrote: > Resuming from a suspend operation is showing a KASAN false positive > warning: > > BUG: KASAN: stack-out-of-bounds in unwind_get_return_address+0x11d/0x130 at addr ffff8803867d7878 > Read of size 8 by task pm-suspend/7774 > page:ffffea000e19f5c0 count:0 mapcount:0 mapping: (null) index:0x0 > flags: 0x2ffff0000000000() > page dumped because: kasan: bad access detected > CPU: 0 PID: 7774 Comm: pm-suspend Tainted: G B 4.9.0-rc7+ #8 > Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016 > Call Trace: > dump_stack+0x63/0x82 > kasan_report_error+0x4b4/0x4e0 > ? acpi_hw_read_port+0xd0/0x1ea > ? kfree_const+0x22/0x30 > ? acpi_hw_validate_io_request+0x1a6/0x1a6 > __asan_report_load8_noabort+0x61/0x70 > ? unwind_get_return_address+0x11d/0x130 > unwind_get_return_address+0x11d/0x130 > ? unwind_next_frame+0x97/0xf0 > __save_stack_trace+0x92/0x100 > save_stack_trace+0x1b/0x20 > save_stack+0x46/0xd0 > ? save_stack_trace+0x1b/0x20 > ? save_stack+0x46/0xd0 > ? kasan_kmalloc+0xad/0xe0 > ? kasan_slab_alloc+0x12/0x20 > ? acpi_hw_read+0x2b6/0x3aa > ? acpi_hw_validate_register+0x20b/0x20b > ? acpi_hw_write_port+0x72/0xc7 > ? acpi_hw_write+0x11f/0x15f > ? acpi_hw_read_multiple+0x19f/0x19f > ? memcpy+0x45/0x50 > ? acpi_hw_write_port+0x72/0xc7 > ? acpi_hw_write+0x11f/0x15f > ? acpi_hw_read_multiple+0x19f/0x19f > ? kasan_unpoison_shadow+0x36/0x50 > kasan_kmalloc+0xad/0xe0 > kasan_slab_alloc+0x12/0x20 > kmem_cache_alloc_trace+0xbc/0x1e0 > ? 
acpi_get_sleep_type_data+0x9a/0x578 > acpi_get_sleep_type_data+0x9a/0x578 > acpi_hw_legacy_wake_prep+0x88/0x22c > ? acpi_hw_legacy_sleep+0x3c7/0x3c7 > ? acpi_write_bit_register+0x28d/0x2d3 > ? acpi_read_bit_register+0x19b/0x19b > acpi_hw_sleep_dispatch+0xb5/0xba > acpi_leave_sleep_state_prep+0x17/0x19 > acpi_suspend_enter+0x154/0x1e0 > ? trace_suspend_resume+0xe8/0xe8 > suspend_devices_and_enter+0xb09/0xdb0 > ? printk+0xa8/0xd8 > ? arch_suspend_enable_irqs+0x20/0x20 > ? try_to_freeze_tasks+0x295/0x600 > pm_suspend+0x6c9/0x780 > ? finish_wait+0x1f0/0x1f0 > ? suspend_devices_and_enter+0xdb0/0xdb0 > state_store+0xa2/0x120 > ? kobj_attr_show+0x60/0x60 > kobj_attr_store+0x36/0x70 > sysfs_kf_write+0x131/0x200 > kernfs_fop_write+0x295/0x3f0 > __vfs_write+0xef/0x760 > ? handle_mm_fault+0x1346/0x35e0 > ? do_iter_readv_writev+0x660/0x660 > ? __pmd_alloc+0x310/0x310 > ? do_lock_file_wait+0x1e0/0x1e0 > ? apparmor_file_permission+0x18/0x20 > ? security_file_permission+0x73/0x1c0 > ? rw_verify_area+0xbd/0x2b0 > vfs_write+0x149/0x4a0 > SyS_write+0xd9/0x1c0 > ? SyS_read+0x1c0/0x1c0 > entry_SYSCALL_64_fastpath+0x1e/0xad > Memory state around the buggy address: > ffff8803867d7700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > ffff8803867d7780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > >ffff8803867d7800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f4 > ^ > ffff8803867d7880: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 > ffff8803867d7900: 00 00 00 f1 f1 f1 f1 04 f4 f4 f4 f3 f3 f3 f3 00 > > KASAN instrumentation poisons the stack when entering a function and > unpoisons it when exiting the function. However, in the suspend path, > some functions never return, so their stack never gets unpoisoned, > resulting in stale KASAN shadow data which can cause later false > positive warnings like the one above. 
> > Reported-by: Scott Bauer <scott.bauer@intel.com> > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> ^ permalink raw reply [flat|nested] 35+ messages in thread
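The stale-shadow explanation quoted above can be modelled in userspace. What follows is only a toy sketch, not kernel code: it assumes a 256-byte "stack" addressed by offset, one shadow byte per 8 bytes (as generic KASAN uses), and an arbitrary nonzero poison marker. All names (toy_poison, toy_unpoison_below, and so on) are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_STACK_SIZE 256u  /* bytes in the toy stack (hypothetical size) */
#define GRANULE        8u    /* bytes covered by one shadow byte */
#define TOY_POISON     0xf3  /* any nonzero marker; 0 means accessible */

static unsigned char toy_shadow[TOY_STACK_SIZE / GRANULE];

/* Set the shadow for [off, off + len) to 'val'; both must be granule-aligned. */
static void toy_set_shadow(size_t off, size_t len, unsigned char val)
{
	assert(off % GRANULE == 0 && len % GRANULE == 0);
	assert(off + len <= TOY_STACK_SIZE);
	memset(&toy_shadow[off / GRANULE], val, len / GRANULE);
}

/* What instrumented code does on function entry (redzones get poisoned). */
static void toy_poison(size_t off, size_t len)
{
	toy_set_shadow(off, len, TOY_POISON);
}

/* Would an instrumented load at 'off' trigger a report? */
static int toy_is_poisoned(size_t off)
{
	assert(off < TOY_STACK_SIZE);
	return toy_shadow[off / GRANULE] != 0;
}

/*
 * Toy analogue of unpoisoning the stack below a watermark: the stack grows
 * down, so every offset in [0, watermark) belongs to frames deeper than the
 * one we resumed in, and any poison left there is stale.
 */
static void toy_unpoison_below(size_t watermark)
{
	toy_set_shadow(0, watermark, 0);
}
```

A function that poisons offsets 64..95 and never returns leaves that range marked forever; a later, unrelated frame that reuses those bytes then looks like an out-of-bounds access. Calling toy_unpoison_below(128) after "resume" clears the stale range while leaving poison at or above the watermark untouched.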
* Re: [PATCH v4] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-02 17:42 ` [PATCH v4] " Josh Poimboeuf 2016-12-02 20:55 ` Andrey Ryabinin @ 2016-12-02 21:09 ` Pavel Machek 2016-12-02 21:57 ` Josh Poimboeuf 2016-12-08 0:10 ` Rafael J. Wysocki 2 siblings, 1 reply; 35+ messages in thread
From: Pavel Machek @ 2016-12-02 21:09 UTC (permalink / raw)
To: Josh Poimboeuf
Cc: Andrey Ryabinin, Dmitry Vyukov, Rafael J. Wysocki, Len Brown, linux-pm, LKML, Peter Zijlstra, Ingo Molnar, Andy Lutomirski, Scott Bauer, x86, Alexander Potapenko, kasan-dev

On Fri 2016-12-02 11:42:21, Josh Poimboeuf wrote:
> Resuming from a suspend operation is showing a KASAN false positive
> warning:
>
>
> Reported-by: Scott Bauer <scott.bauer@intel.com>
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>

Acked-by: Pavel Machek <pavel@ucw.cz>

> diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c
> index 0e9505f..b2a0cff 100644
> --- a/mm/kasan/kasan.c
> +++ b/mm/kasan/kasan.c
> @@ -80,7 +80,14 @@ void kasan_unpoison_task_stack(struct task_struct *task)
>  /* Unpoison the stack for the current task beyond a watermark sp value. */
>  asmlinkage void kasan_unpoison_task_stack_below(const void *watermark)
>  {
> -	__kasan_unpoison_stack(current, watermark);
> +	/*
> +	 * Calculate the task stack base address.  Avoid using 'current'
> +	 * because this function is called by early resume code which hasn't
> +	 * yet set up the percpu register (%gs).
> +	 */
> +	void *base = (void *)((unsigned long)watermark & ~(THREAD_SIZE - 1));
> +
> +	kasan_unpoison_shadow(base, watermark - base);
>  }
>

I know you modified this code to be arch-independent... but is it
really? I guess it is portable enough across architectures that run
kasan today.

								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH v4] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-02 21:09 ` Pavel Machek @ 2016-12-02 21:57 ` Josh Poimboeuf 0 siblings, 0 replies; 35+ messages in thread
From: Josh Poimboeuf @ 2016-12-02 21:57 UTC (permalink / raw)
To: Pavel Machek
Cc: Andrey Ryabinin, Dmitry Vyukov, Rafael J. Wysocki, Len Brown, linux-pm, LKML, Peter Zijlstra, Ingo Molnar, Andy Lutomirski, Scott Bauer, x86, Alexander Potapenko, kasan-dev

On Fri, Dec 02, 2016 at 10:09:03PM +0100, Pavel Machek wrote:
> On Fri 2016-12-02 11:42:21, Josh Poimboeuf wrote:
> > Resuming from a suspend operation is showing a KASAN false positive
> > warning:
> >
> >
> > Reported-by: Scott Bauer <scott.bauer@intel.com>
> > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
>
> Acked-by: Pavel Machek <pavel@ucw.cz>
>
> > diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c
> > index 0e9505f..b2a0cff 100644
> > --- a/mm/kasan/kasan.c
> > +++ b/mm/kasan/kasan.c
> > @@ -80,7 +80,14 @@ void kasan_unpoison_task_stack(struct task_struct *task)
> >  /* Unpoison the stack for the current task beyond a watermark sp value. */
> >  asmlinkage void kasan_unpoison_task_stack_below(const void *watermark)
> >  {
> > -	__kasan_unpoison_stack(current, watermark);
> > +	/*
> > +	 * Calculate the task stack base address.  Avoid using 'current'
> > +	 * because this function is called by early resume code which hasn't
> > +	 * yet set up the percpu register (%gs).
> > +	 */
> > +	void *base = (void *)((unsigned long)watermark & ~(THREAD_SIZE - 1));
> > +
> > +	kasan_unpoison_shadow(base, watermark - base);
> >  }
> >
>
> I know you modified this code to be arch-independent... but is it
> really? I guess it is portable enough across architectures that run
> kasan today.

Yes, it's arch-independent as far as I know.  All the implementations
of alloc_thread_stack_node() in kernel/fork.c create THREAD_SIZE
sized/aligned stacks.  ia64 has its own implementation of
alloc_thread_stack_node(), which also has a THREAD_SIZE sized/aligned
stack, with task_struct stored at the beginning.

Architectures whose stack grows up would need to call a different
helper which unpoisons the stack above the watermark, but that was also
the case before my patch.

-- 
Josh

^ permalink raw reply [flat|nested] 35+ messages in thread
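The alignment argument in the reply above can be checked in isolation. This is a minimal userspace sketch, not the kernel implementation: it assumes a THREAD_SIZE of 16 KiB purely for illustration (the real value depends on architecture and configuration) and a stack that grows down. The helper names stack_base_of and bytes_below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed for illustration only; the kernel's THREAD_SIZE varies by arch/config. */
#define THREAD_SIZE ((uintptr_t)16384)

/* Lowest address of the THREAD_SIZE-aligned stack that contains 'watermark'. */
static uintptr_t stack_base_of(uintptr_t watermark)
{
	/* The mask trick only works when the size is a power of two. */
	assert((THREAD_SIZE & (THREAD_SIZE - 1)) == 0);
	return watermark & ~(THREAD_SIZE - 1);
}

/* Size of the range [base, watermark), i.e. what the patch passes as the length. */
static size_t bytes_below(uintptr_t watermark)
{
	return (size_t)(watermark - stack_base_of(watermark));
}
```

With a stack pointer of 0x7000c123 and the assumed 16 KiB size, the base comes out as 0x7000c000 and the range to unpoison is 0x123 bytes. No task pointer is needed for this, which matches the stated reason for avoiding 'current' in early resume code. The result is only meaningful if the stack really is THREAD_SIZE sized and aligned, which is the property argued for above.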
* Re: [PATCH v4] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-02 17:42 ` [PATCH v4] " Josh Poimboeuf 2016-12-02 20:55 ` Andrey Ryabinin 2016-12-02 21:09 ` Pavel Machek @ 2016-12-08 0:10 ` Rafael J. Wysocki 2 siblings, 0 replies; 35+ messages in thread From: Rafael J. Wysocki @ 2016-12-08 0:10 UTC (permalink / raw) To: Josh Poimboeuf Cc: Andrey Ryabinin, Dmitry Vyukov, Len Brown, Pavel Machek, linux-pm, LKML, Peter Zijlstra, Ingo Molnar, Andy Lutomirski, Scott Bauer, x86, Alexander Potapenko, kasan-dev On Friday, December 02, 2016 11:42:21 AM Josh Poimboeuf wrote: > Resuming from a suspend operation is showing a KASAN false positive > warning: > > BUG: KASAN: stack-out-of-bounds in unwind_get_return_address+0x11d/0x130 at addr ffff8803867d7878 > Read of size 8 by task pm-suspend/7774 > page:ffffea000e19f5c0 count:0 mapcount:0 mapping: (null) index:0x0 > flags: 0x2ffff0000000000() > page dumped because: kasan: bad access detected > CPU: 0 PID: 7774 Comm: pm-suspend Tainted: G B 4.9.0-rc7+ #8 > Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016 > Call Trace: > dump_stack+0x63/0x82 > kasan_report_error+0x4b4/0x4e0 > ? acpi_hw_read_port+0xd0/0x1ea > ? kfree_const+0x22/0x30 > ? acpi_hw_validate_io_request+0x1a6/0x1a6 > __asan_report_load8_noabort+0x61/0x70 > ? unwind_get_return_address+0x11d/0x130 > unwind_get_return_address+0x11d/0x130 > ? unwind_next_frame+0x97/0xf0 > __save_stack_trace+0x92/0x100 > save_stack_trace+0x1b/0x20 > save_stack+0x46/0xd0 > ? save_stack_trace+0x1b/0x20 > ? save_stack+0x46/0xd0 > ? kasan_kmalloc+0xad/0xe0 > ? kasan_slab_alloc+0x12/0x20 > ? acpi_hw_read+0x2b6/0x3aa > ? acpi_hw_validate_register+0x20b/0x20b > ? acpi_hw_write_port+0x72/0xc7 > ? acpi_hw_write+0x11f/0x15f > ? acpi_hw_read_multiple+0x19f/0x19f > ? memcpy+0x45/0x50 > ? acpi_hw_write_port+0x72/0xc7 > ? acpi_hw_write+0x11f/0x15f > ? acpi_hw_read_multiple+0x19f/0x19f > ? 
kasan_unpoison_shadow+0x36/0x50 > kasan_kmalloc+0xad/0xe0 > kasan_slab_alloc+0x12/0x20 > kmem_cache_alloc_trace+0xbc/0x1e0 > ? acpi_get_sleep_type_data+0x9a/0x578 > acpi_get_sleep_type_data+0x9a/0x578 > acpi_hw_legacy_wake_prep+0x88/0x22c > ? acpi_hw_legacy_sleep+0x3c7/0x3c7 > ? acpi_write_bit_register+0x28d/0x2d3 > ? acpi_read_bit_register+0x19b/0x19b > acpi_hw_sleep_dispatch+0xb5/0xba > acpi_leave_sleep_state_prep+0x17/0x19 > acpi_suspend_enter+0x154/0x1e0 > ? trace_suspend_resume+0xe8/0xe8 > suspend_devices_and_enter+0xb09/0xdb0 > ? printk+0xa8/0xd8 > ? arch_suspend_enable_irqs+0x20/0x20 > ? try_to_freeze_tasks+0x295/0x600 > pm_suspend+0x6c9/0x780 > ? finish_wait+0x1f0/0x1f0 > ? suspend_devices_and_enter+0xdb0/0xdb0 > state_store+0xa2/0x120 > ? kobj_attr_show+0x60/0x60 > kobj_attr_store+0x36/0x70 > sysfs_kf_write+0x131/0x200 > kernfs_fop_write+0x295/0x3f0 > __vfs_write+0xef/0x760 > ? handle_mm_fault+0x1346/0x35e0 > ? do_iter_readv_writev+0x660/0x660 > ? __pmd_alloc+0x310/0x310 > ? do_lock_file_wait+0x1e0/0x1e0 > ? apparmor_file_permission+0x18/0x20 > ? security_file_permission+0x73/0x1c0 > ? rw_verify_area+0xbd/0x2b0 > vfs_write+0x149/0x4a0 > SyS_write+0xd9/0x1c0 > ? SyS_read+0x1c0/0x1c0 > entry_SYSCALL_64_fastpath+0x1e/0xad > Memory state around the buggy address: > ffff8803867d7700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > ffff8803867d7780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > >ffff8803867d7800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f4 > ^ > ffff8803867d7880: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 > ffff8803867d7900: 00 00 00 f1 f1 f1 f1 04 f4 f4 f4 f3 f3 f3 f3 00 > > KASAN instrumentation poisons the stack when entering a function and > unpoisons it when exiting the function. However, in the suspend path, > some functions never return, so their stack never gets unpoisoned, > resulting in stale KASAN shadow data which can cause later false > positive warnings like the one above. 
> > Reported-by: Scott Bauer <scott.bauer@intel.com> > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> > --- > arch/x86/kernel/acpi/wakeup_64.S | 9 +++++++++ > mm/kasan/kasan.c | 9 ++++++++- > 2 files changed, 17 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/kernel/acpi/wakeup_64.S b/arch/x86/kernel/acpi/wakeup_64.S > index 169963f..50b8ed0 100644 > --- a/arch/x86/kernel/acpi/wakeup_64.S > +++ b/arch/x86/kernel/acpi/wakeup_64.S > @@ -109,6 +109,15 @@ ENTRY(do_suspend_lowlevel) > movq pt_regs_r14(%rax), %r14 > movq pt_regs_r15(%rax), %r15 > > +#ifdef CONFIG_KASAN > + /* > + * The suspend path may have poisoned some areas deeper in the stack, > + * which we now need to unpoison. > + */ > + movq %rsp, %rdi > + call kasan_unpoison_task_stack_below > +#endif > + > xorl %eax, %eax > addq $8, %rsp > FRAME_END > diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c > index 0e9505f..b2a0cff 100644 > --- a/mm/kasan/kasan.c > +++ b/mm/kasan/kasan.c > @@ -80,7 +80,14 @@ void kasan_unpoison_task_stack(struct task_struct *task) > /* Unpoison the stack for the current task beyond a watermark sp value. */ > asmlinkage void kasan_unpoison_task_stack_below(const void *watermark) > { > - __kasan_unpoison_stack(current, watermark); > + /* > + * Calculate the task stack base address. Avoid using 'current' > + * because this function is called by early resume code which hasn't > + * yet set up the percpu register (%gs). > + */ > + void *base = (void *)((unsigned long)watermark & ~(THREAD_SIZE - 1)); > + > + kasan_unpoison_shadow(base, watermark - base); > } > > /* > Applied. Thanks, Rafael ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-11-30 23:10 ` [PATCH] x86/suspend: fix false positive KASAN warning on suspend/resume Josh Poimboeuf 2016-12-01 9:05 ` Andrey Ryabinin @ 2016-12-01 14:04 ` Rafael J. Wysocki 2016-12-01 16:53 ` Josh Poimboeuf 1 sibling, 1 reply; 35+ messages in thread From: Rafael J. Wysocki @ 2016-12-01 14:04 UTC (permalink / raw) To: Josh Poimboeuf Cc: Rafael J. Wysocki, Len Brown, Pavel Machek, Linux PM, Linux Kernel Mailing List, Peter Zijlstra, Ingo Molnar, Andy Lutomirski, Scott Bauer, the arch/x86 maintainers, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, kasan-dev On Thu, Dec 1, 2016 at 12:10 AM, Josh Poimboeuf <jpoimboe@redhat.com> wrote: > Resuming from a suspend operation is showing a KASAN false positive > warning: > > BUG: KASAN: stack-out-of-bounds in unwind_get_return_address+0x11d/0x130 at addr ffff8803867d7878 > Read of size 8 by task pm-suspend/7774 > page:ffffea000e19f5c0 count:0 mapcount:0 mapping: (null) index:0x0 > flags: 0x2ffff0000000000() > page dumped because: kasan: bad access detected > CPU: 0 PID: 7774 Comm: pm-suspend Tainted: G B 4.9.0-rc7+ #8 > Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016 > ffff8803867d7468 ffffffffb4c0d051 ffff8803867d7500 ffff8803867d7878 > ffff8803867d74f0 ffffffffb45cbe34 ffffffffb4e64136 ffffffffb4510d42 > ffff8803828c3f4c 0000000000000097 0000000041b58ab3 ffffffffb6192731 > Call Trace: > dump_stack+0x63/0x82 > kasan_report_error+0x4b4/0x4e0 > ? acpi_hw_read_port+0xd0/0x1ea > ? kfree_const+0x22/0x30 > ? acpi_hw_validate_io_request+0x1a6/0x1a6 > __asan_report_load8_noabort+0x61/0x70 > ? unwind_get_return_address+0x11d/0x130 > unwind_get_return_address+0x11d/0x130 > ? unwind_next_frame+0x97/0xf0 > __save_stack_trace+0x92/0x100 > save_stack_trace+0x1b/0x20 > save_stack+0x46/0xd0 > ? save_stack_trace+0x1b/0x20 > ? save_stack+0x46/0xd0 > ? kasan_kmalloc+0xad/0xe0 > ? kasan_slab_alloc+0x12/0x20 > ? 
acpi_hw_read+0x2b6/0x3aa > ? acpi_hw_validate_register+0x20b/0x20b > ? acpi_hw_write_port+0x72/0xc7 > ? acpi_hw_write+0x11f/0x15f > ? acpi_hw_read_multiple+0x19f/0x19f > ? memcpy+0x45/0x50 > ? acpi_hw_write_port+0x72/0xc7 > ? acpi_hw_write+0x11f/0x15f > ? acpi_hw_read_multiple+0x19f/0x19f > ? kasan_unpoison_shadow+0x36/0x50 > kasan_kmalloc+0xad/0xe0 > kasan_slab_alloc+0x12/0x20 > kmem_cache_alloc_trace+0xbc/0x1e0 > ? acpi_get_sleep_type_data+0x9a/0x578 > acpi_get_sleep_type_data+0x9a/0x578 > acpi_hw_legacy_wake_prep+0x88/0x22c > ? acpi_hw_legacy_sleep+0x3c7/0x3c7 > ? acpi_write_bit_register+0x28d/0x2d3 > ? acpi_read_bit_register+0x19b/0x19b > acpi_hw_sleep_dispatch+0xb5/0xba > acpi_leave_sleep_state_prep+0x17/0x19 > acpi_suspend_enter+0x154/0x1e0 > ? trace_suspend_resume+0xe8/0xe8 > suspend_devices_and_enter+0xb09/0xdb0 > ? printk+0xa8/0xd8 > ? arch_suspend_enable_irqs+0x20/0x20 > ? try_to_freeze_tasks+0x295/0x600 > pm_suspend+0x6c9/0x780 > ? finish_wait+0x1f0/0x1f0 > ? suspend_devices_and_enter+0xdb0/0xdb0 > state_store+0xa2/0x120 > ? kobj_attr_show+0x60/0x60 > kobj_attr_store+0x36/0x70 > sysfs_kf_write+0x131/0x200 > kernfs_fop_write+0x295/0x3f0 > __vfs_write+0xef/0x760 > ? handle_mm_fault+0x1346/0x35e0 > ? do_iter_readv_writev+0x660/0x660 > ? __pmd_alloc+0x310/0x310 > ? do_lock_file_wait+0x1e0/0x1e0 > ? apparmor_file_permission+0x18/0x20 > ? security_file_permission+0x73/0x1c0 > ? rw_verify_area+0xbd/0x2b0 > vfs_write+0x149/0x4a0 > SyS_write+0xd9/0x1c0 > ? 
SyS_read+0x1c0/0x1c0 > entry_SYSCALL_64_fastpath+0x1e/0xad > Memory state around the buggy address: > ffff8803867d7700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > ffff8803867d7780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > >ffff8803867d7800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f4 > ^ > ffff8803867d7880: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 > ffff8803867d7900: 00 00 00 f1 f1 f1 f1 04 f4 f4 f4 f3 f3 f3 f3 00 > > KASAN instrumentation poisons the stack when entering a function and > unpoisons it when exiting the function. However, in the suspend path, > some functions never return, so their stack never gets unpoisoned, > resulting in stale KASAN shadow data which can cause false positive > warnings like the one above. > > Reported-by: Scott Bauer <scott.bauer@intel.com> > Tested-by: Scott Bauer <scott.bauer@intel.com> > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> > --- > arch/x86/kernel/acpi/sleep.c | 3 +++ > include/linux/kasan.h | 7 +++++++ > 2 files changed, 10 insertions(+) > > diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c > index 4858733..62bd046 100644 > --- a/arch/x86/kernel/acpi/sleep.c > +++ b/arch/x86/kernel/acpi/sleep.c > @@ -115,6 +115,9 @@ int x86_acpi_suspend_lowlevel(void) > pause_graph_tracing(); > do_suspend_lowlevel(); > unpause_graph_tracing(); > + > + kasan_unpoison_stack_below_sp(); > + > return 0; > } > > diff --git a/include/linux/kasan.h b/include/linux/kasan.h > index 820c0ad..e0945d5 100644 > --- a/include/linux/kasan.h > +++ b/include/linux/kasan.h > @@ -45,6 +45,12 @@ void kasan_unpoison_shadow(const void *address, size_t size); > > void kasan_unpoison_task_stack(struct task_struct *task); > void kasan_unpoison_stack_above_sp_to(const void *watermark); > +asmlinkage void kasan_unpoison_task_stack_below(const void *watermark); > + > +static inline void kasan_unpoison_stack_below_sp(void) > +{ > + kasan_unpoison_task_stack_below(__builtin_frame_address(0)); > +} > > void 
kasan_alloc_pages(struct page *page, unsigned int order); > void kasan_free_pages(struct page *page, unsigned int order); > @@ -87,6 +93,7 @@ static inline void kasan_unpoison_shadow(const void *address, size_t size) {} > > static inline void kasan_unpoison_task_stack(struct task_struct *task) {} > static inline void kasan_unpoison_stack_above_sp_to(const void *watermark) {} > +static inline void kasan_unpoison_stack_below_sp(void) {} > > static inline void kasan_enable_current(void) {} > static inline void kasan_disable_current(void) {} > -- Looks OK to me. Whom do you expect to apply this? Thanks, Rafael ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-01 14:04 ` [PATCH] " Rafael J. Wysocki @ 2016-12-01 16:53 ` Josh Poimboeuf 2016-12-01 17:05 ` Rafael J. Wysocki 0 siblings, 1 reply; 35+ messages in thread From: Josh Poimboeuf @ 2016-12-01 16:53 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Rafael J. Wysocki, Len Brown, Pavel Machek, Linux PM, Linux Kernel Mailing List, Peter Zijlstra, Ingo Molnar, Andy Lutomirski, Scott Bauer, the arch/x86 maintainers, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, kasan-dev On Thu, Dec 01, 2016 at 03:04:22PM +0100, Rafael J. Wysocki wrote: > On Thu, Dec 1, 2016 at 12:10 AM, Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > Resuming from a suspend operation is showing a KASAN false positive > > warning: > > > > BUG: KASAN: stack-out-of-bounds in unwind_get_return_address+0x11d/0x130 at addr ffff8803867d7878 > > Read of size 8 by task pm-suspend/7774 > > page:ffffea000e19f5c0 count:0 mapcount:0 mapping: (null) index:0x0 > > flags: 0x2ffff0000000000() > > page dumped because: kasan: bad access detected > > CPU: 0 PID: 7774 Comm: pm-suspend Tainted: G B 4.9.0-rc7+ #8 > > Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016 > > ffff8803867d7468 ffffffffb4c0d051 ffff8803867d7500 ffff8803867d7878 > > ffff8803867d74f0 ffffffffb45cbe34 ffffffffb4e64136 ffffffffb4510d42 > > ffff8803828c3f4c 0000000000000097 0000000041b58ab3 ffffffffb6192731 > > Call Trace: > > dump_stack+0x63/0x82 > > kasan_report_error+0x4b4/0x4e0 > > ? acpi_hw_read_port+0xd0/0x1ea > > ? kfree_const+0x22/0x30 > > ? acpi_hw_validate_io_request+0x1a6/0x1a6 > > __asan_report_load8_noabort+0x61/0x70 > > ? unwind_get_return_address+0x11d/0x130 > > unwind_get_return_address+0x11d/0x130 > > ? unwind_next_frame+0x97/0xf0 > > __save_stack_trace+0x92/0x100 > > save_stack_trace+0x1b/0x20 > > save_stack+0x46/0xd0 > > ? save_stack_trace+0x1b/0x20 > > ? save_stack+0x46/0xd0 > > ? kasan_kmalloc+0xad/0xe0 > > ? 
kasan_slab_alloc+0x12/0x20 > > ? acpi_hw_read+0x2b6/0x3aa > > ? acpi_hw_validate_register+0x20b/0x20b > > ? acpi_hw_write_port+0x72/0xc7 > > ? acpi_hw_write+0x11f/0x15f > > ? acpi_hw_read_multiple+0x19f/0x19f > > ? memcpy+0x45/0x50 > > ? acpi_hw_write_port+0x72/0xc7 > > ? acpi_hw_write+0x11f/0x15f > > ? acpi_hw_read_multiple+0x19f/0x19f > > ? kasan_unpoison_shadow+0x36/0x50 > > kasan_kmalloc+0xad/0xe0 > > kasan_slab_alloc+0x12/0x20 > > kmem_cache_alloc_trace+0xbc/0x1e0 > > ? acpi_get_sleep_type_data+0x9a/0x578 > > acpi_get_sleep_type_data+0x9a/0x578 > > acpi_hw_legacy_wake_prep+0x88/0x22c > > ? acpi_hw_legacy_sleep+0x3c7/0x3c7 > > ? acpi_write_bit_register+0x28d/0x2d3 > > ? acpi_read_bit_register+0x19b/0x19b > > acpi_hw_sleep_dispatch+0xb5/0xba > > acpi_leave_sleep_state_prep+0x17/0x19 > > acpi_suspend_enter+0x154/0x1e0 > > ? trace_suspend_resume+0xe8/0xe8 > > suspend_devices_and_enter+0xb09/0xdb0 > > ? printk+0xa8/0xd8 > > ? arch_suspend_enable_irqs+0x20/0x20 > > ? try_to_freeze_tasks+0x295/0x600 > > pm_suspend+0x6c9/0x780 > > ? finish_wait+0x1f0/0x1f0 > > ? suspend_devices_and_enter+0xdb0/0xdb0 > > state_store+0xa2/0x120 > > ? kobj_attr_show+0x60/0x60 > > kobj_attr_store+0x36/0x70 > > sysfs_kf_write+0x131/0x200 > > kernfs_fop_write+0x295/0x3f0 > > __vfs_write+0xef/0x760 > > ? handle_mm_fault+0x1346/0x35e0 > > ? do_iter_readv_writev+0x660/0x660 > > ? __pmd_alloc+0x310/0x310 > > ? do_lock_file_wait+0x1e0/0x1e0 > > ? apparmor_file_permission+0x18/0x20 > > ? security_file_permission+0x73/0x1c0 > > ? rw_verify_area+0xbd/0x2b0 > > vfs_write+0x149/0x4a0 > > SyS_write+0xd9/0x1c0 > > ? 
SyS_read+0x1c0/0x1c0 > > entry_SYSCALL_64_fastpath+0x1e/0xad > > Memory state around the buggy address: > > ffff8803867d7700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > ffff8803867d7780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > > >ffff8803867d7800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f4 > > ^ > > ffff8803867d7880: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 > > ffff8803867d7900: 00 00 00 f1 f1 f1 f1 04 f4 f4 f4 f3 f3 f3 f3 00 > > > > KASAN instrumentation poisons the stack when entering a function and > > unpoisons it when exiting the function. However, in the suspend path, > > some functions never return, so their stack never gets unpoisoned, > > resulting in stale KASAN shadow data which can cause false positive > > warnings like the one above. > > > > Reported-by: Scott Bauer <scott.bauer@intel.com> > > Tested-by: Scott Bauer <scott.bauer@intel.com> > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> > > --- > > arch/x86/kernel/acpi/sleep.c | 3 +++ > > include/linux/kasan.h | 7 +++++++ > > 2 files changed, 10 insertions(+) > > > > diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c > > index 4858733..62bd046 100644 > > --- a/arch/x86/kernel/acpi/sleep.c > > +++ b/arch/x86/kernel/acpi/sleep.c > > @@ -115,6 +115,9 @@ int x86_acpi_suspend_lowlevel(void) > > pause_graph_tracing(); > > do_suspend_lowlevel(); > > unpause_graph_tracing(); > > + > > + kasan_unpoison_stack_below_sp(); > > + > > return 0; > > } > > > > diff --git a/include/linux/kasan.h b/include/linux/kasan.h > > index 820c0ad..e0945d5 100644 > > --- a/include/linux/kasan.h > > +++ b/include/linux/kasan.h > > @@ -45,6 +45,12 @@ void kasan_unpoison_shadow(const void *address, size_t size); > > > > void kasan_unpoison_task_stack(struct task_struct *task); > > void kasan_unpoison_stack_above_sp_to(const void *watermark); > > +asmlinkage void kasan_unpoison_task_stack_below(const void *watermark); > > + > > +static inline void 
kasan_unpoison_stack_below_sp(void) > > +{ > > + kasan_unpoison_task_stack_below(__builtin_frame_address(0)); > > +} > > > > void kasan_alloc_pages(struct page *page, unsigned int order); > > void kasan_free_pages(struct page *page, unsigned int order); > > @@ -87,6 +93,7 @@ static inline void kasan_unpoison_shadow(const void *address, size_t size) {} > > > > static inline void kasan_unpoison_task_stack(struct task_struct *task) {} > > static inline void kasan_unpoison_stack_above_sp_to(const void *watermark) {} > > +static inline void kasan_unpoison_stack_below_sp(void) {} > > > > static inline void kasan_enable_current(void) {} > > static inline void kasan_disable_current(void) {} > > -- > > Looks OK to me. > > Whom do you expect to apply this? Assuming it gets an ack from Andrey, can you take it? Or would the tip tree be better? -- Josh ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-01 16:53 ` Josh Poimboeuf @ 2016-12-01 17:05 ` Rafael J. Wysocki 2016-12-02 10:15 ` Ingo Molnar 0 siblings, 1 reply; 35+ messages in thread From: Rafael J. Wysocki @ 2016-12-01 17:05 UTC (permalink / raw) To: Josh Poimboeuf Cc: Rafael J. Wysocki, Rafael J. Wysocki, Len Brown, Pavel Machek, Linux PM, Linux Kernel Mailing List, Peter Zijlstra, Ingo Molnar, Andy Lutomirski, Scott Bauer, the arch/x86 maintainers, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, kasan-dev On Thu, Dec 1, 2016 at 5:53 PM, Josh Poimboeuf <jpoimboe@redhat.com> wrote: > On Thu, Dec 01, 2016 at 03:04:22PM +0100, Rafael J. Wysocki wrote: >> On Thu, Dec 1, 2016 at 12:10 AM, Josh Poimboeuf <jpoimboe@redhat.com> wrote: >> > Resuming from a suspend operation is showing a KASAN false positive >> > warning: >> > >> > BUG: KASAN: stack-out-of-bounds in unwind_get_return_address+0x11d/0x130 at addr ffff8803867d7878 >> > Read of size 8 by task pm-suspend/7774 >> > page:ffffea000e19f5c0 count:0 mapcount:0 mapping: (null) index:0x0 >> > flags: 0x2ffff0000000000() >> > page dumped because: kasan: bad access detected >> > CPU: 0 PID: 7774 Comm: pm-suspend Tainted: G B 4.9.0-rc7+ #8 >> > Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016 >> > ffff8803867d7468 ffffffffb4c0d051 ffff8803867d7500 ffff8803867d7878 >> > ffff8803867d74f0 ffffffffb45cbe34 ffffffffb4e64136 ffffffffb4510d42 >> > ffff8803828c3f4c 0000000000000097 0000000041b58ab3 ffffffffb6192731 >> > Call Trace: >> > dump_stack+0x63/0x82 >> > kasan_report_error+0x4b4/0x4e0 >> > ? acpi_hw_read_port+0xd0/0x1ea >> > ? kfree_const+0x22/0x30 >> > ? acpi_hw_validate_io_request+0x1a6/0x1a6 >> > __asan_report_load8_noabort+0x61/0x70 >> > ? unwind_get_return_address+0x11d/0x130 >> > unwind_get_return_address+0x11d/0x130 >> > ? 
unwind_next_frame+0x97/0xf0 >> > __save_stack_trace+0x92/0x100 >> > save_stack_trace+0x1b/0x20 >> > save_stack+0x46/0xd0 >> > ? save_stack_trace+0x1b/0x20 >> > ? save_stack+0x46/0xd0 >> > ? kasan_kmalloc+0xad/0xe0 >> > ? kasan_slab_alloc+0x12/0x20 >> > ? acpi_hw_read+0x2b6/0x3aa >> > ? acpi_hw_validate_register+0x20b/0x20b >> > ? acpi_hw_write_port+0x72/0xc7 >> > ? acpi_hw_write+0x11f/0x15f >> > ? acpi_hw_read_multiple+0x19f/0x19f >> > ? memcpy+0x45/0x50 >> > ? acpi_hw_write_port+0x72/0xc7 >> > ? acpi_hw_write+0x11f/0x15f >> > ? acpi_hw_read_multiple+0x19f/0x19f >> > ? kasan_unpoison_shadow+0x36/0x50 >> > kasan_kmalloc+0xad/0xe0 >> > kasan_slab_alloc+0x12/0x20 >> > kmem_cache_alloc_trace+0xbc/0x1e0 >> > ? acpi_get_sleep_type_data+0x9a/0x578 >> > acpi_get_sleep_type_data+0x9a/0x578 >> > acpi_hw_legacy_wake_prep+0x88/0x22c >> > ? acpi_hw_legacy_sleep+0x3c7/0x3c7 >> > ? acpi_write_bit_register+0x28d/0x2d3 >> > ? acpi_read_bit_register+0x19b/0x19b >> > acpi_hw_sleep_dispatch+0xb5/0xba >> > acpi_leave_sleep_state_prep+0x17/0x19 >> > acpi_suspend_enter+0x154/0x1e0 >> > ? trace_suspend_resume+0xe8/0xe8 >> > suspend_devices_and_enter+0xb09/0xdb0 >> > ? printk+0xa8/0xd8 >> > ? arch_suspend_enable_irqs+0x20/0x20 >> > ? try_to_freeze_tasks+0x295/0x600 >> > pm_suspend+0x6c9/0x780 >> > ? finish_wait+0x1f0/0x1f0 >> > ? suspend_devices_and_enter+0xdb0/0xdb0 >> > state_store+0xa2/0x120 >> > ? kobj_attr_show+0x60/0x60 >> > kobj_attr_store+0x36/0x70 >> > sysfs_kf_write+0x131/0x200 >> > kernfs_fop_write+0x295/0x3f0 >> > __vfs_write+0xef/0x760 >> > ? handle_mm_fault+0x1346/0x35e0 >> > ? do_iter_readv_writev+0x660/0x660 >> > ? __pmd_alloc+0x310/0x310 >> > ? do_lock_file_wait+0x1e0/0x1e0 >> > ? apparmor_file_permission+0x18/0x20 >> > ? security_file_permission+0x73/0x1c0 >> > ? rw_verify_area+0xbd/0x2b0 >> > vfs_write+0x149/0x4a0 >> > SyS_write+0xd9/0x1c0 >> > ? 
SyS_read+0x1c0/0x1c0 >> > entry_SYSCALL_64_fastpath+0x1e/0xad >> > Memory state around the buggy address: >> > ffff8803867d7700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> > ffff8803867d7780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> > >ffff8803867d7800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f4 >> > ^ >> > ffff8803867d7880: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 >> > ffff8803867d7900: 00 00 00 f1 f1 f1 f1 04 f4 f4 f4 f3 f3 f3 f3 00 >> > >> > KASAN instrumentation poisons the stack when entering a function and >> > unpoisons it when exiting the function. However, in the suspend path, >> > some functions never return, so their stack never gets unpoisoned, >> > resulting in stale KASAN shadow data which can cause false positive >> > warnings like the one above. >> > >> > Reported-by: Scott Bauer <scott.bauer@intel.com> >> > Tested-by: Scott Bauer <scott.bauer@intel.com> >> > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> >> > --- >> > arch/x86/kernel/acpi/sleep.c | 3 +++ >> > include/linux/kasan.h | 7 +++++++ >> > 2 files changed, 10 insertions(+) >> > >> > diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c >> > index 4858733..62bd046 100644 >> > --- a/arch/x86/kernel/acpi/sleep.c >> > +++ b/arch/x86/kernel/acpi/sleep.c >> > @@ -115,6 +115,9 @@ int x86_acpi_suspend_lowlevel(void) >> > pause_graph_tracing(); >> > do_suspend_lowlevel(); >> > unpause_graph_tracing(); >> > + >> > + kasan_unpoison_stack_below_sp(); >> > + >> > return 0; >> > } >> > >> > diff --git a/include/linux/kasan.h b/include/linux/kasan.h >> > index 820c0ad..e0945d5 100644 >> > --- a/include/linux/kasan.h >> > +++ b/include/linux/kasan.h >> > @@ -45,6 +45,12 @@ void kasan_unpoison_shadow(const void *address, size_t size); >> > >> > void kasan_unpoison_task_stack(struct task_struct *task); >> > void kasan_unpoison_stack_above_sp_to(const void *watermark); >> > +asmlinkage void kasan_unpoison_task_stack_below(const void *watermark); >> > + >> 
> +static inline void kasan_unpoison_stack_below_sp(void) >> > +{ >> > + kasan_unpoison_task_stack_below(__builtin_frame_address(0)); >> > +} >> > >> > void kasan_alloc_pages(struct page *page, unsigned int order); >> > void kasan_free_pages(struct page *page, unsigned int order); >> > @@ -87,6 +93,7 @@ static inline void kasan_unpoison_shadow(const void *address, size_t size) {} >> > >> > static inline void kasan_unpoison_task_stack(struct task_struct *task) {} >> > static inline void kasan_unpoison_stack_above_sp_to(const void *watermark) {} >> > +static inline void kasan_unpoison_stack_below_sp(void) {} >> > >> > static inline void kasan_enable_current(void) {} >> > static inline void kasan_disable_current(void) {} >> > -- >> >> Looks OK to me. >> >> Whom do you expect to apply this? > > Assuming it gets an ack from Andrey, can you take it? Or would the tip > tree be better? I can take it unless anyone else wants to take care of it. :-) Thanks, Rafael ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH] x86/suspend: fix false positive KASAN warning on suspend/resume 2016-12-01 17:05 ` Rafael J. Wysocki @ 2016-12-02 10:15 ` Ingo Molnar 0 siblings, 0 replies; 35+ messages in thread
From: Ingo Molnar @ 2016-12-02 10:15 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Josh Poimboeuf, Rafael J. Wysocki, Len Brown, Pavel Machek, Linux PM, Linux Kernel Mailing List, Peter Zijlstra, Andy Lutomirski, Scott Bauer, the arch/x86 maintainers, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, kasan-dev

* Rafael J. Wysocki <rafael@kernel.org> wrote:

> >> Looks OK to me.
> >>
> >> Whom do you expect to apply this?
> >
> > Assuming it gets an ack from Andrey, can you take it?  Or would the tip
> > tree be better?
>
> I can take it unless anyone else wants to take care of it. :-)

Please pick up the fixes in this thread.

Thanks!

	Ingo

^ permalink raw reply [flat|nested] 35+ messages in thread