linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dmitry Vyukov <dvyukov@google.com>
To: Mike Galbraith <efault@gmx.de>
Cc: "Zhang, Qiang" <Qiang.Zhang@windriver.com>,
	Andrew Halaney <ahalaney@redhat.com>,
	"andreyknvl@gmail.com" <andreyknvl@gmail.com>,
	"ryabinin.a.a@gmail.com" <ryabinin.a.a@gmail.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"kasan-dev@googlegroups.com" <kasan-dev@googlegroups.com>
Subject: Re: Question on KASAN calltrace record in RT
Date: Wed, 14 Apr 2021 07:26:29 +0200	[thread overview]
Message-ID: <CACT4Y+bVkBscD+Ggp6oQm3LbyiMVmwaaX20fQJLHobg6_z4VzQ@mail.gmail.com> (raw)
In-Reply-To: <182eea30ee9648b2a618709e9fc894e49cb464ad.camel@gmx.de>

On Wed, Apr 14, 2021 at 6:00 AM Mike Galbraith <efault@gmx.de> wrote:
>
> On Tue, 2021-04-13 at 17:29 +0200, Dmitry Vyukov wrote:
> > On Tue, Apr 6, 2021 at 10:26 AM Zhang, Qiang <Qiang.Zhang@windriver.com> wrote:
> > >
> > > Hello everyone
> > >
> > > In RT system,   after  Andrew test,   found the following calltrace ,
> > > in KASAN, we record callstack through stack_depot_save(), in this function, may be call alloc_pages,  but in RT, the spin_lock replace with
> > > rt_mutex in alloc_pages(), if before call this function, the irq is disabled,
> > > will trigger following calltrace.
> > >
> > > maybe  add array[KASAN_STACK_DEPTH] in struct kasan_track to record callstack  in RT system.
> > >
> > > Is there a better solution ?
> >
> > Hi Qiang,
> >
> > Adding 2 full stacks per heap object can increase memory usage too much.
> > The stackdepot has a preallocation mechanism, I would start with
> > adding interrupts check here:
> > https://elixir.bootlin.com/linux/v5.12-rc7/source/lib/stackdepot.c#L294
> > and just not do preallocation in interrupt context. This will solve
> > the problem, right?
>
> Hm, this thing might actually be (sorta?) working, modulo one startup
> gripe.  The CRASH_DUMP inspired gripe I get with !RT appeared (and shut
> up when told I don't care given kdump has worked just fine for ages:),
> but no more might_sleep() gripeage.
>
>
> CONFIG_KASAN_SHADOW_OFFSET=0xdffffc0000000000
> CONFIG_HAVE_ARCH_KASAN=y
> CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
> CONFIG_CC_HAS_KASAN_GENERIC=y
> CONFIG_KASAN=y
> CONFIG_KASAN_GENERIC=y
> CONFIG_KASAN_OUTLINE=y
> # CONFIG_KASAN_INLINE is not set
> CONFIG_KASAN_STACK=1
> CONFIG_KASAN_VMALLOC=y
> # CONFIG_KASAN_MODULE_TEST is not set
>
> ---
>  lib/stackdepot.c |   10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> --- a/lib/stackdepot.c
> +++ b/lib/stackdepot.c
> @@ -71,7 +71,7 @@ static void *stack_slabs[STACK_ALLOC_MAX
>  static int depot_index;
>  static int next_slab_inited;
>  static size_t depot_offset;
> -static DEFINE_SPINLOCK(depot_lock);
> +static DEFINE_RAW_SPINLOCK(depot_lock);
>
>  static bool init_stack_slab(void **prealloc)
>  {
> @@ -265,7 +265,7 @@ depot_stack_handle_t stack_depot_save(un
>         struct page *page = NULL;
>         void *prealloc = NULL;
>         unsigned long flags;
> -       u32 hash;
> +       u32 hash, may_prealloc = !IS_ENABLED(CONFIG_PREEMPT_RT) || preemptible();
>
>         if (unlikely(nr_entries == 0) || stack_depot_disable)
>                 goto fast_exit;
> @@ -291,7 +291,7 @@ depot_stack_handle_t stack_depot_save(un
>          * The smp_load_acquire() here pairs with smp_store_release() to
>          * |next_slab_inited| in depot_alloc_stack() and init_stack_slab().
>          */
> -       if (unlikely(!smp_load_acquire(&next_slab_inited))) {
> +       if (unlikely(!smp_load_acquire(&next_slab_inited) && may_prealloc)) {
>                 /*
>                  * Zero out zone modifiers, as we don't have specific zone
>                  * requirements. Keep the flags related to allocation in atomic
> @@ -305,7 +305,7 @@ depot_stack_handle_t stack_depot_save(un
>                         prealloc = page_address(page);
>         }
>
> -       spin_lock_irqsave(&depot_lock, flags);
> +       raw_spin_lock_irqsave(&depot_lock, flags);
>
>         found = find_stack(*bucket, entries, nr_entries, hash);
>         if (!found) {
> @@ -329,7 +329,7 @@ depot_stack_handle_t stack_depot_save(un
>                 WARN_ON(!init_stack_slab(&prealloc));
>         }
>
> -       spin_unlock_irqrestore(&depot_lock, flags);
> +       raw_spin_unlock_irqrestore(&depot_lock, flags);
>  exit:
>         if (prealloc) {
>                 /* Nobody used this memory, ok to free it. */
>
> [    0.692437] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:943
> [    0.692439] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0
> [    0.692442] Preemption disabled at:
> [    0.692443] [<ffffffff811a1510>] on_each_cpu_cond_mask+0x30/0xb0
> [    0.692451] CPU: 5 PID: 1 Comm: swapper/0 Not tainted 5.12.0.g2afefec-tip-rt #5
> [    0.692454] Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
> [    0.692456] Call Trace:
> [    0.692458]  ? on_each_cpu_cond_mask+0x30/0xb0
> [    0.692462]  dump_stack+0x8a/0xb5
> [    0.692467]  ___might_sleep.cold+0xfe/0x112
> [    0.692471]  rt_spin_lock+0x1c/0x60

HI Mike,

If freeing pages from smp_call_function is not OK, then perhaps we
need just to collect the objects to be freed to the task/CPU that
executes kasan_quarantine_remove_cache and it will free them (we know
it can free objects).

> [    0.692475]  free_unref_page+0x117/0x3c0
> [    0.692481]  qlist_free_all+0x60/0xd0
> [    0.692485]  per_cpu_remove_cache+0x5b/0x70
> [    0.692488]  smp_call_function_many_cond+0x185/0x3d0
> [    0.692492]  ? qlist_move_cache+0xe0/0xe0
> [    0.692495]  ? qlist_move_cache+0xe0/0xe0
> [    0.692497]  on_each_cpu_cond_mask+0x44/0xb0
> [    0.692501]  kasan_quarantine_remove_cache+0x52/0xf0
> [    0.692505]  ? acpi_bus_init+0x183/0x183
> [    0.692510]  kmem_cache_shrink+0xe/0x20
> [    0.692513]  acpi_os_purge_cache+0xa/0x10
> [    0.692517]  acpi_purge_cached_objects+0x1d/0x68
> [    0.692522]  acpi_initialize_objects+0x11/0x39
> [    0.692524]  ? acpi_ev_install_xrupt_handlers+0x6f/0x7c
> [    0.692529]  acpi_bus_init+0x50/0x183
> [    0.692532]  acpi_init+0xce/0x182
> [    0.692536]  ? acpi_bus_init+0x183/0x183
> [    0.692539]  ? intel_idle_init+0x36d/0x36d
> [    0.692543]  ? acpi_bus_init+0x183/0x183
> [    0.692546]  do_one_initcall+0x71/0x300
> [    0.692550]  ? trace_event_raw_event_initcall_finish+0x120/0x120
> [    0.692553]  ? parameq+0x90/0x90
> [    0.692556]  ? __wake_up_common+0x1e0/0x200
> [    0.692560]  ? kasan_unpoison+0x21/0x50
> [    0.692562]  ? __kasan_slab_alloc+0x24/0x70
> [    0.692567]  do_initcalls+0xff/0x129
> [    0.692571]  kernel_init_freeable+0x19c/0x1ce
> [    0.692574]  ? rest_init+0xc6/0xc6
> [    0.692577]  kernel_init+0xd/0x11a
> [    0.692580]  ret_from_fork+0x1f/0x30
>
> [   15.428008] ==================================================================
> [   15.428011] BUG: KASAN: vmalloc-out-of-bounds in crash_setup_memmap_entries+0x17e/0x3a0

This looks like a genuine kernel bug on first glance. I think it needs
to be fixed rather than ignored.

> [   15.428018] Write of size 8 at addr ffffc90000426008 by task kexec/1187
> [   15.428022] CPU: 2 PID: 1187 Comm: kexec Tainted: G        W   E     5.12.0.g2afefec-tip-rt #5
> [   15.428025] Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
> [   15.428027] Call Trace:
> [   15.428029]  ? crash_setup_memmap_entries+0x17e/0x3a0
> [   15.428032]  dump_stack+0x8a/0xb5
> [   15.428037]  print_address_description.constprop.0+0x16/0xa0
> [   15.428044]  kasan_report+0xc4/0x100
> [   15.428047]  ? crash_setup_memmap_entries+0x17e/0x3a0
> [   15.428050]  crash_setup_memmap_entries+0x17e/0x3a0
> [   15.428053]  ? strcmp+0x2e/0x50
> [   15.428057]  ? native_machine_crash_shutdown+0x240/0x240
> [   15.428059]  ? kexec_purgatory_find_symbol.isra.0+0x145/0x1a0
> [   15.428066]  setup_boot_parameters+0x181/0x5c0
> [   15.428069]  bzImage64_load+0x6b5/0x740
> [   15.428072]  ? bzImage64_probe+0x140/0x140
> [   15.428075]  ? iov_iter_kvec+0x5f/0x70
> [   15.428080]  ? rw_verify_area+0x80/0x80
> [   15.428087]  ? __might_sleep+0x31/0xd0
> [   15.428091]  ? __might_sleep+0x31/0xd0
> [   15.428094]  ? ___might_sleep+0xc9/0xe0
> [   15.428096]  ? bzImage64_probe+0x140/0x140
> [   15.428099]  arch_kexec_kernel_image_load+0x102/0x130
> [   15.428102]  kimage_file_alloc_init+0xda/0x290
> [   15.428107]  __do_sys_kexec_file_load+0x21f/0x390
> [   15.428110]  ? __x64_sys_open+0x100/0x100
> [   15.428113]  ? kexec_calculate_store_digests+0x390/0x390
> [   15.428117]  ? rcu_nocb_flush_deferred_wakeup+0x36/0x50
> [   15.428122]  do_syscall_64+0x3d/0x80
> [   15.428127]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [   15.428132] RIP: 0033:0x7f46ad026759
> [   15.428135] Code: 00 48 81 c4 80 00 00 00 89 f0 c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 0f d7 2b 00 f7 d8 64 89 01 48
> [   15.428137] RSP: 002b:00007ffcf6f96788 EFLAGS: 00000206 ORIG_RAX: 0000000000000140
> [   15.428141] RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 00007f46ad026759
> [   15.428143] RDX: 0000000000000182 RSI: 0000000000000005 RDI: 0000000000000003
> [   15.428145] RBP: 00007ffcf6f96a28 R08: 0000000000000002 R09: 0000000000000000
> [   15.428146] R10: 0000000000b0d5e0 R11: 0000000000000206 R12: 0000000000000004
> [   15.428148] R13: 0000000000000000 R14: 0000000000000000 R15: 00000000ffffffff
> [   15.428152] Memory state around the buggy address:
> [   15.428164]  ffffc90000425f00: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> [   15.428166]  ffffc90000425f80: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> [   15.428168] >ffffc90000426000: 00 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> [   15.428169]                       ^
> [   15.428171]  ffffc90000426080: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> [   15.428172]  ffffc90000426100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> [   15.428173] ==================================================================
> [   15.428174] Disabling lock debugging due to kernel taint
>
> kasan: stop grumbling about CRASH_DUMP
>
> Signed-off-by: Mike Galbraith <efault@gmx.de>
> ---
>  arch/x86/kernel/Makefile |    1 +
>  kernel/Makefile          |    1 +
>  2 files changed, 2 insertions(+)
>
> --- a/arch/x86/kernel/Makefile
> +++ b/arch/x86/kernel/Makefile
> @@ -105,6 +105,7 @@ obj-$(CONFIG_X86_TSC)               += trace_clock.o
>  obj-$(CONFIG_CRASH_CORE)       += crash_core_$(BITS).o
>  obj-$(CONFIG_KEXEC_CORE)       += machine_kexec_$(BITS).o
>  obj-$(CONFIG_KEXEC_CORE)       += relocate_kernel_$(BITS).o crash.o
> +KASAN_SANITIZE_crash.o         := n
>  obj-$(CONFIG_KEXEC_FILE)       += kexec-bzimage64.o
>  obj-$(CONFIG_CRASH_DUMP)       += crash_dump_$(BITS).o
>  obj-y                          += kprobes/
> --- a/kernel/Makefile
> +++ b/kernel/Makefile
> @@ -72,6 +72,7 @@ obj-$(CONFIG_CRASH_CORE) += crash_core.o
>  obj-$(CONFIG_KEXEC_CORE) += kexec_core.o
>  obj-$(CONFIG_KEXEC) += kexec.o
>  obj-$(CONFIG_KEXEC_FILE) += kexec_file.o
> +KASAN_SANITIZE_kexec_file.o := n
>  obj-$(CONFIG_KEXEC_ELF) += kexec_elf.o
>  obj-$(CONFIG_BACKTRACE_SELF_TEST) += backtracetest.o
>  obj-$(CONFIG_COMPAT) += compat.o
>

  reply	other threads:[~2021-04-14  5:26 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-06  8:26 Question on KASAN calltrace record in RT Zhang, Qiang
2021-04-13 15:29 ` Dmitry Vyukov
2021-04-14  4:00   ` Mike Galbraith
2021-04-14  5:26     ` Dmitry Vyukov [this message]
2021-04-14  6:15       ` Mike Galbraith
2021-04-14 15:04         ` [patch] kasan: make it RT aware Mike Galbraith
2021-04-15 18:13       ` Question on KASAN calltrace record in RT Mike Galbraith
2021-04-14  7:29     ` 回复: " Zhang, Qiang
2021-04-14  7:56       ` Mike Galbraith
2021-04-14  8:09         ` 回复: " Zhang, Qiang
2021-04-14  6:58   ` Zhang, Qiang
2021-04-14  7:07     ` Dmitry Vyukov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CACT4Y+bVkBscD+Ggp6oQm3LbyiMVmwaaX20fQJLHobg6_z4VzQ@mail.gmail.com \
    --to=dvyukov@google.com \
    --cc=Qiang.Zhang@windriver.com \
    --cc=ahalaney@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=andreyknvl@gmail.com \
    --cc=efault@gmx.de \
    --cc=kasan-dev@googlegroups.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ryabinin.a.a@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).