All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dmitry Vyukov <dvyukov@google.com>
To: Mike Galbraith <efault@gmx.de>
Cc: "Zhang, Qiang" <Qiang.Zhang@windriver.com>,
	Andrew Halaney <ahalaney@redhat.com>,
	"andreyknvl@gmail.com" <andreyknvl@gmail.com>,
	"ryabinin.a.a@gmail.com" <ryabinin.a.a@gmail.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"kasan-dev@googlegroups.com" <kasan-dev@googlegroups.com>
Subject: Re: Question on KASAN calltrace record in RT
Date: Wed, 14 Apr 2021 07:26:29 +0200	[thread overview]
Message-ID: <CACT4Y+bVkBscD+Ggp6oQm3LbyiMVmwaaX20fQJLHobg6_z4VzQ@mail.gmail.com> (raw)
In-Reply-To: <182eea30ee9648b2a618709e9fc894e49cb464ad.camel@gmx.de>

On Wed, Apr 14, 2021 at 6:00 AM Mike Galbraith <efault@gmx.de> wrote:
>
> On Tue, 2021-04-13 at 17:29 +0200, Dmitry Vyukov wrote:
> > On Tue, Apr 6, 2021 at 10:26 AM Zhang, Qiang <Qiang.Zhang@windriver.com> wrote:
> > >
> > > Hello everyone
> > >
> > > In RT system,   after  Andrew test,   found the following calltrace ,
> > > in KASAN, we record callstack through stack_depot_save(), in this function, may be call alloc_pages,  but in RT, the spin_lock replace with
> > > rt_mutex in alloc_pages(), if before call this function, the irq is disabled,
> > > will trigger following calltrace.
> > >
> > > maybe  add array[KASAN_STACK_DEPTH] in struct kasan_track to record callstack  in RT system.
> > >
> > > Is there a better solution ?
> >
> > Hi Qiang,
> >
> > Adding 2 full stacks per heap object can increase memory usage too much.
> > The stackdepot has a preallocation mechanism, I would start with
> > adding interrupts check here:
> > https://elixir.bootlin.com/linux/v5.12-rc7/source/lib/stackdepot.c#L294
> > and just not do preallocation in interrupt context. This will solve
> > the problem, right?
>
> Hm, this thing might actually be (sorta?) working, modulo one startup
> gripe.  The CRASH_DUMP inspired gripe I get with !RT appeared (and shut
> up when told I don't care given kdump has worked just fine for ages:),
> but no more might_sleep() gripeage.
>
>
> CONFIG_KASAN_SHADOW_OFFSET=0xdffffc0000000000
> CONFIG_HAVE_ARCH_KASAN=y
> CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
> CONFIG_CC_HAS_KASAN_GENERIC=y
> CONFIG_KASAN=y
> CONFIG_KASAN_GENERIC=y
> CONFIG_KASAN_OUTLINE=y
> # CONFIG_KASAN_INLINE is not set
> CONFIG_KASAN_STACK=1
> CONFIG_KASAN_VMALLOC=y
> # CONFIG_KASAN_MODULE_TEST is not set
>
> ---
>  lib/stackdepot.c |   10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> --- a/lib/stackdepot.c
> +++ b/lib/stackdepot.c
> @@ -71,7 +71,7 @@ static void *stack_slabs[STACK_ALLOC_MAX
>  static int depot_index;
>  static int next_slab_inited;
>  static size_t depot_offset;
> -static DEFINE_SPINLOCK(depot_lock);
> +static DEFINE_RAW_SPINLOCK(depot_lock);
>
>  static bool init_stack_slab(void **prealloc)
>  {
> @@ -265,7 +265,7 @@ depot_stack_handle_t stack_depot_save(un
>         struct page *page = NULL;
>         void *prealloc = NULL;
>         unsigned long flags;
> -       u32 hash;
> +       u32 hash, may_prealloc = !IS_ENABLED(CONFIG_PREEMPT_RT) || preemptible();
>
>         if (unlikely(nr_entries == 0) || stack_depot_disable)
>                 goto fast_exit;
> @@ -291,7 +291,7 @@ depot_stack_handle_t stack_depot_save(un
>          * The smp_load_acquire() here pairs with smp_store_release() to
>          * |next_slab_inited| in depot_alloc_stack() and init_stack_slab().
>          */
> -       if (unlikely(!smp_load_acquire(&next_slab_inited))) {
> +       if (unlikely(!smp_load_acquire(&next_slab_inited) && may_prealloc)) {
>                 /*
>                  * Zero out zone modifiers, as we don't have specific zone
>                  * requirements. Keep the flags related to allocation in atomic
> @@ -305,7 +305,7 @@ depot_stack_handle_t stack_depot_save(un
>                         prealloc = page_address(page);
>         }
>
> -       spin_lock_irqsave(&depot_lock, flags);
> +       raw_spin_lock_irqsave(&depot_lock, flags);
>
>         found = find_stack(*bucket, entries, nr_entries, hash);
>         if (!found) {
> @@ -329,7 +329,7 @@ depot_stack_handle_t stack_depot_save(un
>                 WARN_ON(!init_stack_slab(&prealloc));
>         }
>
> -       spin_unlock_irqrestore(&depot_lock, flags);
> +       raw_spin_unlock_irqrestore(&depot_lock, flags);
>  exit:
>         if (prealloc) {
>                 /* Nobody used this memory, ok to free it. */
>
> [    0.692437] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:943
> [    0.692439] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0
> [    0.692442] Preemption disabled at:
> [    0.692443] [<ffffffff811a1510>] on_each_cpu_cond_mask+0x30/0xb0
> [    0.692451] CPU: 5 PID: 1 Comm: swapper/0 Not tainted 5.12.0.g2afefec-tip-rt #5
> [    0.692454] Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
> [    0.692456] Call Trace:
> [    0.692458]  ? on_each_cpu_cond_mask+0x30/0xb0
> [    0.692462]  dump_stack+0x8a/0xb5
> [    0.692467]  ___might_sleep.cold+0xfe/0x112
> [    0.692471]  rt_spin_lock+0x1c/0x60

HI Mike,

If freeing pages from smp_call_function is not OK, then perhaps we
need just to collect the objects to be freed to the task/CPU that
executes kasan_quarantine_remove_cache and it will free them (we know
it can free objects).

> [    0.692475]  free_unref_page+0x117/0x3c0
> [    0.692481]  qlist_free_all+0x60/0xd0
> [    0.692485]  per_cpu_remove_cache+0x5b/0x70
> [    0.692488]  smp_call_function_many_cond+0x185/0x3d0
> [    0.692492]  ? qlist_move_cache+0xe0/0xe0
> [    0.692495]  ? qlist_move_cache+0xe0/0xe0
> [    0.692497]  on_each_cpu_cond_mask+0x44/0xb0
> [    0.692501]  kasan_quarantine_remove_cache+0x52/0xf0
> [    0.692505]  ? acpi_bus_init+0x183/0x183
> [    0.692510]  kmem_cache_shrink+0xe/0x20
> [    0.692513]  acpi_os_purge_cache+0xa/0x10
> [    0.692517]  acpi_purge_cached_objects+0x1d/0x68
> [    0.692522]  acpi_initialize_objects+0x11/0x39
> [    0.692524]  ? acpi_ev_install_xrupt_handlers+0x6f/0x7c
> [    0.692529]  acpi_bus_init+0x50/0x183
> [    0.692532]  acpi_init+0xce/0x182
> [    0.692536]  ? acpi_bus_init+0x183/0x183
> [    0.692539]  ? intel_idle_init+0x36d/0x36d
> [    0.692543]  ? acpi_bus_init+0x183/0x183
> [    0.692546]  do_one_initcall+0x71/0x300
> [    0.692550]  ? trace_event_raw_event_initcall_finish+0x120/0x120
> [    0.692553]  ? parameq+0x90/0x90
> [    0.692556]  ? __wake_up_common+0x1e0/0x200
> [    0.692560]  ? kasan_unpoison+0x21/0x50
> [    0.692562]  ? __kasan_slab_alloc+0x24/0x70
> [    0.692567]  do_initcalls+0xff/0x129
> [    0.692571]  kernel_init_freeable+0x19c/0x1ce
> [    0.692574]  ? rest_init+0xc6/0xc6
> [    0.692577]  kernel_init+0xd/0x11a
> [    0.692580]  ret_from_fork+0x1f/0x30
>
> [   15.428008] ==================================================================
> [   15.428011] BUG: KASAN: vmalloc-out-of-bounds in crash_setup_memmap_entries+0x17e/0x3a0

This looks like a genuine kernel bug on first glance. I think it needs
to be fixed rather than ignored.

> [   15.428018] Write of size 8 at addr ffffc90000426008 by task kexec/1187
> [   15.428022] CPU: 2 PID: 1187 Comm: kexec Tainted: G        W   E     5.12.0.g2afefec-tip-rt #5
> [   15.428025] Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
> [   15.428027] Call Trace:
> [   15.428029]  ? crash_setup_memmap_entries+0x17e/0x3a0
> [   15.428032]  dump_stack+0x8a/0xb5
> [   15.428037]  print_address_description.constprop.0+0x16/0xa0
> [   15.428044]  kasan_report+0xc4/0x100
> [   15.428047]  ? crash_setup_memmap_entries+0x17e/0x3a0
> [   15.428050]  crash_setup_memmap_entries+0x17e/0x3a0
> [   15.428053]  ? strcmp+0x2e/0x50
> [   15.428057]  ? native_machine_crash_shutdown+0x240/0x240
> [   15.428059]  ? kexec_purgatory_find_symbol.isra.0+0x145/0x1a0
> [   15.428066]  setup_boot_parameters+0x181/0x5c0
> [   15.428069]  bzImage64_load+0x6b5/0x740
> [   15.428072]  ? bzImage64_probe+0x140/0x140
> [   15.428075]  ? iov_iter_kvec+0x5f/0x70
> [   15.428080]  ? rw_verify_area+0x80/0x80
> [   15.428087]  ? __might_sleep+0x31/0xd0
> [   15.428091]  ? __might_sleep+0x31/0xd0
> [   15.428094]  ? ___might_sleep+0xc9/0xe0
> [   15.428096]  ? bzImage64_probe+0x140/0x140
> [   15.428099]  arch_kexec_kernel_image_load+0x102/0x130
> [   15.428102]  kimage_file_alloc_init+0xda/0x290
> [   15.428107]  __do_sys_kexec_file_load+0x21f/0x390
> [   15.428110]  ? __x64_sys_open+0x100/0x100
> [   15.428113]  ? kexec_calculate_store_digests+0x390/0x390
> [   15.428117]  ? rcu_nocb_flush_deferred_wakeup+0x36/0x50
> [   15.428122]  do_syscall_64+0x3d/0x80
> [   15.428127]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [   15.428132] RIP: 0033:0x7f46ad026759
> [   15.428135] Code: 00 48 81 c4 80 00 00 00 89 f0 c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 0f d7 2b 00 f7 d8 64 89 01 48
> [   15.428137] RSP: 002b:00007ffcf6f96788 EFLAGS: 00000206 ORIG_RAX: 0000000000000140
> [   15.428141] RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 00007f46ad026759
> [   15.428143] RDX: 0000000000000182 RSI: 0000000000000005 RDI: 0000000000000003
> [   15.428145] RBP: 00007ffcf6f96a28 R08: 0000000000000002 R09: 0000000000000000
> [   15.428146] R10: 0000000000b0d5e0 R11: 0000000000000206 R12: 0000000000000004
> [   15.428148] R13: 0000000000000000 R14: 0000000000000000 R15: 00000000ffffffff
> [   15.428152] Memory state around the buggy address:
> [   15.428164]  ffffc90000425f00: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> [   15.428166]  ffffc90000425f80: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> [   15.428168] >ffffc90000426000: 00 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> [   15.428169]                       ^
> [   15.428171]  ffffc90000426080: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> [   15.428172]  ffffc90000426100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
> [   15.428173] ==================================================================
> [   15.428174] Disabling lock debugging due to kernel taint
>
> kasan: stop grumbling about CRASH_DUMP
>
> Signed-off-by: Mike Galbraith <efault@gmx.de>
> ---
>  arch/x86/kernel/Makefile |    1 +
>  kernel/Makefile          |    1 +
>  2 files changed, 2 insertions(+)
>
> --- a/arch/x86/kernel/Makefile
> +++ b/arch/x86/kernel/Makefile
> @@ -105,6 +105,7 @@ obj-$(CONFIG_X86_TSC)               += trace_clock.o
>  obj-$(CONFIG_CRASH_CORE)       += crash_core_$(BITS).o
>  obj-$(CONFIG_KEXEC_CORE)       += machine_kexec_$(BITS).o
>  obj-$(CONFIG_KEXEC_CORE)       += relocate_kernel_$(BITS).o crash.o
> +KASAN_SANITIZE_crash.o         := n
>  obj-$(CONFIG_KEXEC_FILE)       += kexec-bzimage64.o
>  obj-$(CONFIG_CRASH_DUMP)       += crash_dump_$(BITS).o
>  obj-y                          += kprobes/
> --- a/kernel/Makefile
> +++ b/kernel/Makefile
> @@ -72,6 +72,7 @@ obj-$(CONFIG_CRASH_CORE) += crash_core.o
>  obj-$(CONFIG_KEXEC_CORE) += kexec_core.o
>  obj-$(CONFIG_KEXEC) += kexec.o
>  obj-$(CONFIG_KEXEC_FILE) += kexec_file.o
> +KASAN_SANITIZE_kexec_file.o := n
>  obj-$(CONFIG_KEXEC_ELF) += kexec_elf.o
>  obj-$(CONFIG_BACKTRACE_SELF_TEST) += backtracetest.o
>  obj-$(CONFIG_COMPAT) += compat.o
>

  reply	other threads:[~2021-04-14  5:26 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-06  8:26 Question on KASAN calltrace record in RT Zhang, Qiang
2021-04-13 15:29 ` Dmitry Vyukov
2021-04-14  4:00   ` Mike Galbraith
2021-04-14  5:26     ` Dmitry Vyukov [this message]
2021-04-14  6:15       ` Mike Galbraith
2021-04-14 15:04         ` [patch] kasan: make it RT aware Mike Galbraith
2021-04-15 18:13       ` Question on KASAN calltrace record in RT Mike Galbraith
2021-04-14  7:29     ` 回复: " Zhang, Qiang
2021-04-14  7:56       ` Mike Galbraith
2021-04-14  8:09         ` 回复: " Zhang, Qiang
2021-04-14  6:58   ` Zhang, Qiang
2021-04-14  7:07     ` Dmitry Vyukov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CACT4Y+bVkBscD+Ggp6oQm3LbyiMVmwaaX20fQJLHobg6_z4VzQ@mail.gmail.com \
    --to=dvyukov@google.com \
    --cc=Qiang.Zhang@windriver.com \
    --cc=ahalaney@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=andreyknvl@gmail.com \
    --cc=efault@gmx.de \
    --cc=kasan-dev@googlegroups.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ryabinin.a.a@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.