From: Andrey Konovalov <andreyknvl@google.com>
To: Qian Cai <cai@lca.pw>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	kasan-dev <kasan-dev@googlegroups.com>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	Andrey Ryabinin <aryabinin@virtuozzo.com>,
	Dmitry Vyukov <dvyukov@google.com>
Subject: Re: soft lockups with SLAB_CONSISTENCY_CHECKS + KASAN_SW_TAGS (was: livelock with KASAN_SW_TAGS)
Date: Tue, 19 Feb 2019 19:56:57 +0100
Message-ID: <CAAeHK+yUQ0kZUspiFazjkFu7CRdaL_DZijUXD1po45gnGZkV3w@mail.gmail.com>
In-Reply-To: <1550601754.6911.41.camel@lca.pw>

On Tue, Feb 19, 2019 at 7:42 PM Qian Cai <cai@lca.pw> wrote:
>
> On Tue, 2019-02-19 at 18:56 +0100, Andrey Konovalov wrote:
> > > > Once the machine is restricted to 16 CPUs (nr_cpus=16), although it still
> > > > triggers soft lockups and msgstress03 seems to run forever, the machine is
> > > > still responsive and it is possible to log in via ssh. Hence, it is possible
> > > > to capture a task dump (echo t >/proc/sysrq-trigger) while this is happening.
> > > >
> > > > https://git.sr.ht/~cai/linux-debug/tree/master/console
> > > >
> > > > Some traces look strange, as if free_debug_processing() were running
> > > > in a loop:
> > > >
> > > > [ 1986.002139] Call trace:
> > > > [ 1986.002145]  _raw_spin_unlock_irqrestore+0x44/0xac
> > > > [ 1986.002152]  free_debug_processing+0x2f4/0x3e4
> > > > [ 1986.002157]  kmem_cache_free+0x44c/0x870
> > > > [ 1986.002163]  free_object_rcu+0x200/0x228
> > > > [ 1986.002169]  rcu_process_callbacks+0xb00/0x12c0
> > > > [ 1986.002175]  __do_softirq+0x644/0xfd0
> > > > [ 1986.002181]  irq_exit+0x29c/0x370
> > > > [ 1986.002187]  __handle_domain_irq+0xe0/0x1c4
> > > > [ 1986.002192]  gic_handle_irq+0x1c4/0x3b0
> > > > [ 1986.002197]  el1_irq+0xb0/0x140
> > > > [ 1986.002203]  lock_release+0x660/0x7dc
> > > > [ 1986.002209]  rcu_lock_release+0x20/0x28
> > > > [ 1986.002214]  do_msgrcv+0x708/0xed0
> > > > [ 1986.002219]  ksys_msgrcv+0x4c/0x60
> > > > [ 1986.002224]  __arm64_sys_msgrcv+0xb8/0x194
> > > > [ 1986.002230]  el0_svc_handler+0x230/0x3bc
> > > > [ 1986.002236]  el0_svc+0x8/0xc
> > > > [ 1986.007106]  OUTLINED_FUNCTION_169+0x4/0xc
> > > > [ 1986.011885]  free_debug_processing+0x2f4/0x3e4
> > > > [ 1986.017186]  load_msg+0x4c/0x324
> > > > [ 1986.021617]  kmem_cache_free+0x44c/0x870
> > > > [ 1986.026917]  ksys_msgsnd+0x1e0/0xe5c
> > > > [ 1988.050035]  _raw_spin_unlock_irqrestore+0x44/0xac
> > > > [ 1988.054821]  free_debug_processing+0x2f4/0x3e4
> > > > [ 1988.059260]  kfree+0x3f8/0x7ac
> > > > [ 1988.062313]  free_msg+0x50/0xb0
> > > > [ 1988.065450]  do_msgrcv+0xd80/0xed0
> > > > [ 1988.068846]  ksys_msgrcv+0x4c/0x60
> > > > [ 1988.072243]  __arm64_sys_msgrcv+0xb8/0x194
> > > > [ 1988.076336]  el0_svc_handler+0x230/0x3bc
> > > > [ 1988.080255]  el0_svc+0x
> > >
> > > I'm hoping that Andrey can make sense of this, since he recently hacked up
> > > freelist_ptr(), although only if CONFIG_SLAB_FREELIST_HARDENED=y, which
> > > isn't the case in your .config.
> >
> > So far, I've been unable to trigger this in QEMU as well.
> >
> > Qian, could you check if this still happens after adding that -pg flag
> > in KASAN Makefile?
>
> Yes, it still happens, although the reproducer (LTP msgstress0[3-4]) is making
> slow progress, so it is not strictly a livelock now. The situation gets worse if
> the system has more CPUs, probably because more CPUs are trying to acquire the
> spinlock in free_debug_processing() and then flood the console with soft
> lockups.
>
> One workaround is to add "KASAN_SANITIZE_string.o := n" to lib/Makefile, which
> stops KASAN from instrumenting check_bytes8(); the reproducers then run
> smoothly without triggering any soft lockups.
>
> It looks like check_bytes8() is a big CPU consumer, especially with KASAN
> instrumentation added.
>
> 0000000000001a2c <check_bytes8>:
>     1a2c:       d2c20008        mov     x8, #0x100000000000
>     1a30:       f2fdffe8        movk    x8, #0xefff, lsl #48
>     1a34:       34000202        cbz     w2, 1a74 <check_bytes8+0x48>
>     1a38:       aa1e03e3        mov     x3, x30
>     1a3c:       94000047        bl      1b58 <OUTLINED_FUNCTION_2>
>     1a40:       aa0303fe        mov     x30, x3
>     1a44:       54000060        b.eq    1a50 <check_bytes8+0x24>  // b.none
>     1a48:       7103fd3f        cmp     w9, #0xff
>     1a4c:       54000101        b.ne    1a6c <check_bytes8+0x40>  // b.any
>     1a50:       39400009        ldrb    w9, [x0]
>     1a54:       6b21013f        cmp     w9, w1, uxtb
>     1a58:       54000101        b.ne    1a78 <check_bytes8+0x4c>  // b.any
>     1a5c:       91000400        add     x0, x0, #0x1
>     1a60:       51000442        sub     w2, w2, #0x1
>     1a64:       35fffea2        cbnz    w2, 1a38 <check_bytes8+0xc>
>     1a68:       14000003        b       1a74 <check_bytes8+0x48>
>     1a6c:       d4212400        brk     #0x920
>     1a70:       17fffff8        b       1a50 <check_bytes8+0x24>
>     1a74:       aa1f03e0        mov     x0, xzr
>     1a78:       d65f03c0        ret
>
> This function is called over and over again (with interrupts disabled):
>
> free_debug_processing [1]
>   free_consistency_checks
>     check_object
>       memchr_inv [2]
>         check_bytes8
>
> [1] iterate all objects in the slab.
> [2] while (words) { words--;

Ah, so it doesn't lock up, it's just very slow? memchr_inv() is the only
caller of check_bytes8(), so we could remove instrumentation from the
latter and add one KASAN range check to the former. But I'd say this is
the expected behavior: KASAN slows things down, and I don't think it
makes much sense to enable it together with other memory debugging
options.
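
To make the suggestion concrete, here is a rough userspace sketch, not
kernel code. It only models the cost: check_range_stub() is a hypothetical
stand-in for a KASAN access check that just counts calls, and
shadow_checks, check_bytes8_instr(), check_bytes8_plain() and
memchr_inv_sketch() are names made up for this illustration. The real
memchr_inv() also compares whole words for longer ranges; that part is
left out here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counter: how many simulated KASAN checks were performed. */
unsigned long shadow_checks;

/* Hypothetical stand-in for a KASAN access check; it only counts calls. */
void check_range_stub(const void *addr, size_t size)
{
	(void)addr;
	(void)size;
	shadow_checks++;
}

/* Models the instrumented helper: one check in front of every byte load. */
const uint8_t *check_bytes8_instr(const uint8_t *start, uint8_t value,
				  size_t bytes)
{
	while (bytes) {
		check_range_stub(start, 1);
		if (*start != value)
			return start;
		start++;
		bytes--;
	}
	return NULL;
}

/* Models the helper with instrumentation removed: plain byte loads. */
const uint8_t *check_bytes8_plain(const uint8_t *start, uint8_t value,
				  size_t bytes)
{
	while (bytes) {
		if (*start != value)
			return start;
		start++;
		bytes--;
	}
	return NULL;
}

/*
 * Simplified, byte-wise-only memchr_inv(): validate the whole range once,
 * then scan it with the uninstrumented helper. Returns a pointer to the
 * first byte that differs from c, or NULL if every byte matches.
 */
const void *memchr_inv_sketch(const void *start, int c, size_t bytes)
{
	check_range_stub(start, bytes);
	return check_bytes8_plain((const uint8_t *)start, (uint8_t)c, bytes);
}
```

With a 64-byte region where every byte matches, check_bytes8_instr()
performs 64 simulated checks, while memchr_inv_sketch() performs one. In
the kernel the single check would presumably be something like
kasan_check_read(), but treat that as an assumption rather than a tested
patch.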


>
> I also noticed that even a single "top" command is now consuming 30% - 40%
> CPU all the time. Sometimes, it jumps to 80% or so.
>
> 5969 root      20   0   24512  10560   4736 R  83.8   0.0   3:25.79 top
