Linux-mm Archive on lore.kernel.org
From: Qian Cai <cai@lca.pw>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: peterz@infradead.org, mingo@redhat.com,
	akpm@linux-foundation.org,  tglx@linutronix.de,
	thgarnie@google.com, tytso@mit.edu, cl@linux.com,
	 penberg@kernel.org, rientjes@google.com, will@kernel.org,
	linux-mm@kvack.org,  linux-kernel@vger.kernel.org,
	keescook@chromium.org
Subject: Re: [PATCH] mm/slub: fix a deadlock in shuffle_freelist()
Date: Mon, 16 Sep 2019 17:31:34 -0400
Message-ID: <1568669494.5576.157.camel@lca.pw> (raw)
In-Reply-To: <20190916195115.g4hj3j3wstofpsdr@linutronix.de>

On Mon, 2019-09-16 at 21:51 +0200, Sebastian Andrzej Siewior wrote:
> On 2019-09-16 10:01:27 [-0400], Qian Cai wrote:
> > On Mon, 2019-09-16 at 11:03 +0200, Sebastian Andrzej Siewior wrote:
> > > On 2019-09-13 12:27:44 [-0400], Qian Cai wrote:
> > > …
> > > > Chain exists of:
> > > >   random_write_wait.lock --> &rq->lock --> batched_entropy_u32.lock
> > > > 
> > > >  Possible unsafe locking scenario:
> > > > 
> > > >        CPU0                    CPU1
> > > >        ----                    ----
> > > >   lock(batched_entropy_u32.lock);
> > > >                                lock(&rq->lock);
> > > >                                lock(batched_entropy_u32.lock);
> > > >   lock(random_write_wait.lock);
> > > 
> > > would this deadlock still occur if lockdep knew that
> > > batched_entropy_u32.lock on CPU0 could be acquired at the same time
> > > as CPU1 acquired its batched_entropy_u32.lock?
> > 
> > I suppose that might fix it too, if lockdep can be taught that trick, but it
> > would be better to have a patch, if you have something in mind, that could be
> > tested to make sure.
> 
> get_random_bytes() is heavier than get_random_int() so I would prefer to
> avoid its usage to fix what looks like a false positive report from
> lockdep.
> But no, I don't have a patch sitting around. A lock in per-CPU memory
> could lead to the scenario mentioned above if the lock could be obtained
> cross-CPU; it just isn't so in this case. So I don't think it is that
> simple.
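
(For illustration only, not a patch from this thread: one way to express that
hint would be to give every per-CPU instance of the batched entropy lock its
own lockdep class key, so lockdep stops treating the CPU0 and CPU1 instances
as the same lock. All names below are hypothetical; this is a minimal sketch,
not what drivers/char/random.c actually does.)

#include <linux/init.h>
#include <linux/lockdep.h>
#include <linux/percpu.h>
#include <linux/spinlock.h>

/* Hypothetical per-CPU lock plus a per-CPU lock class key. */
static DEFINE_PER_CPU(spinlock_t, batched_entropy_lock);
static DEFINE_PER_CPU(struct lock_class_key, batched_entropy_key);

static void __init batched_entropy_lockdep_init(void)
{
	int cpu;

	for_each_possible_cpu(cpu) {
		spinlock_t *lock = per_cpu_ptr(&batched_entropy_lock, cpu);

		spin_lock_init(lock);
		/* One lock class per CPU instead of one shared class. */
		lockdep_set_class(lock, per_cpu_ptr(&batched_entropy_key, cpu));
	}
}

Whether that much lockdep annotation is worth it just to silence what looks
like a false positive is a separate question.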

get_random_u64() is also busted.

[  752.925079] WARNING: possible circular locking dependency detected
[  752.931951] 5.3.0-rc8-next-20190915+ #2 Tainted: G             L   
[  752.938906] ------------------------------------------------------
[  752.945774] ls/9665 is trying to acquire lock:
[  752.950905] ffff90001311fef8 (random_write_wait.lock){..-.}, at: __wake_up_common_lock+0xa8/0x11c
[  752.960481] 
               but task is already holding lock:
[  752.967698] ffff008abc7b9c00 (batched_entropy_u64.lock){....}, at: get_random_u64+0x6c/0x1dc
[  752.976835] 
               which lock already depends on the new lock.

[  752.987089] 
               the existing dependency chain (in reverse order) is:
[  752.995953] 
               -> #4 (batched_entropy_u64.lock){....}:
[  753.003702]        lock_acquire+0x320/0x364
[  753.008577]        _raw_spin_lock_irqsave+0x7c/0x9c
[  753.014145]        get_random_u64+0x6c/0x1dc
[  753.019109]        add_to_free_area_random+0x54/0x1c8
[  753.024851]        free_one_page+0x86c/0xc28
[  753.029818]        __free_pages_ok+0x69c/0xdac
[  753.034960]        __free_pages+0xbc/0xf8
[  753.039663]        __free_pages_core+0x2ac/0x3c0
[  753.044973]        memblock_free_pages+0xe0/0xf8
[  753.050281]        __free_pages_memory+0xcc/0xfc
[  753.055588]        __free_memory_core+0x70/0x78
[  753.060809]        free_low_memory_core_early+0x148/0x18c
[  753.066897]        memblock_free_all+0x18/0x54
[  753.072033]        mem_init+0x9c/0x160
[  753.076472]        mm_init+0x14/0x38
[  753.080737]        start_kernel+0x19c/0x52c
[  753.085607] 
               -> #3 (&(&zone->lock)->rlock){..-.}:
[  753.093092]        lock_acquire+0x320/0x364
[  753.097964]        _raw_spin_lock+0x64/0x80
[  753.102839]        rmqueue_bulk+0x50/0x15a0
[  753.107712]        get_page_from_freelist+0x2260/0x29dc
[  753.113627]        __alloc_pages_nodemask+0x36c/0x1ce0
[  753.119457]        alloc_page_interleave+0x34/0x17c
[  753.125023]        alloc_pages_current+0x80/0xe0
[  753.130334]        allocate_slab+0xfc/0x1d80
[  753.135296]        ___slab_alloc+0x5d4/0xa70
[  753.140257]        kmem_cache_alloc+0x588/0x66c
[  753.145480]        __debug_object_init+0x9d8/0xbac
[  753.150962]        debug_object_init+0x40/0x50
[  753.156098]        hrtimer_init+0x38/0x2b4
[  753.160885]        init_dl_task_timer+0x24/0x44
[  753.166108]        __sched_fork+0xc0/0x168
[  753.170894]        init_idle+0x80/0x3d8
[  753.175420]        idle_thread_get+0x60/0x8c
[  753.180385]        _cpu_up+0x10c/0x348
[  753.184824]        do_cpu_up+0x114/0x170
[  753.189437]        cpu_up+0x20/0x2c
[  753.193615]        smp_init+0xf8/0x1bc
[  753.198054]        kernel_init_freeable+0x198/0x26c
[  753.203622]        kernel_init+0x18/0x334
[  753.208323]        ret_from_fork+0x10/0x18
[  753.213107] 
               -> #2 (&rq->lock){-.-.}:
[  753.219550]        lock_acquire+0x320/0x364
[  753.224423]        _raw_spin_lock+0x64/0x80
[  753.229299]        task_fork_fair+0x64/0x22c
[  753.234261]        sched_fork+0x24c/0x3d8
[  753.238962]        copy_process+0xa60/0x29b0
[  753.243921]        _do_fork+0xb8/0xa64
[  753.248360]        kernel_thread+0xc4/0xf4
[  753.253147]        rest_init+0x30/0x320
[  753.257673]        arch_call_rest_init+0x10/0x18
[  753.262980]        start_kernel+0x424/0x52c
[  753.267849] 
               -> #1 (&p->pi_lock){-.-.}:
[  753.274467]        lock_acquire+0x320/0x364
[  753.279342]        _raw_spin_lock_irqsave+0x7c/0x9c
[  753.284910]        try_to_wake_up+0x74/0x128c
[  753.289959]        default_wake_function+0x38/0x48
[  753.295440]        pollwake+0x118/0x158
[  753.299967]        __wake_up_common+0x16c/0x240
[  753.305187]        __wake_up_common_lock+0xc8/0x11c
[  753.310754]        __wake_up+0x3c/0x4c
[  753.315193]        account+0x390/0x3e0
[  753.319632]        extract_entropy+0x2cc/0x37c
[  753.324766]        _xfer_secondary_pool+0x35c/0x3c4
[  753.330333]        push_to_pool+0x54/0x308
[  753.335119]        process_one_work+0x558/0xb1c
[  753.340339]        worker_thread+0x494/0x650
[  753.345300]        kthread+0x1cc/0x1e8
[  753.349739]        ret_from_fork+0x10/0x18
[  753.354522] 
               -> #0 (random_write_wait.lock){..-.}:
[  753.362093]        validate_chain+0xfcc/0x2fd4
[  753.367227]        __lock_acquire+0x868/0xc2c
[  753.372274]        lock_acquire+0x320/0x364
[  753.377147]        _raw_spin_lock_irqsave+0x7c/0x9c
[  753.382715]        __wake_up_common_lock+0xa8/0x11c
[  753.388282]        __wake_up+0x3c/0x4c
[  753.392720]        account+0x390/0x3e0
[  753.397159]        extract_entropy+0x2cc/0x37c
[  753.402292]        crng_reseed+0x60/0x350
[  753.406991]        _extract_crng+0xd8/0x164
[  753.411864]        crng_reseed+0x7c/0x350
[  753.416563]        _extract_crng+0xd8/0x164
[  753.421436]        get_random_u64+0xec/0x1dc
[  753.426396]        arch_mmap_rnd+0x18/0x78
[  753.431187]        load_elf_binary+0x6d0/0x1730
[  753.436411]        search_binary_handler+0x10c/0x35c
[  753.442067]        __do_execve_file+0xb58/0xf7c
[  753.447287]        __arm64_sys_execve+0x6c/0xa4
[  753.452509]        el0_svc_handler+0x170/0x240
[  753.457643]        el0_svc+0x8/0xc
[  753.461732] 
               other info that might help us debug this:

[  753.471812] Chain exists of:
                 random_write_wait.lock --> &(&zone->lock)->rlock --> batched_entropy_u64.lock

[  753.486588]  Possible unsafe locking scenario:

[  753.493890]        CPU0                    CPU1
[  753.499108]        ----                    ----
[  753.504324]   lock(batched_entropy_u64.lock);
[  753.509372]                                lock(&(&zone->lock)->rlock);
[  753.516675]                                lock(batched_entropy_u64.lock);
[  753.524238]   lock(random_write_wait.lock);
[  753.529113] 
                *** DEADLOCK ***

[  753.537111] 1 lock held by ls/9665:
[  753.541287]  #0: ffff008abc7b9c00 (batched_entropy_u64.lock){....}, at: get_random_u64+0x6c/0x1dc
[  753.550858] 
               stack backtrace:
[  753.556602] CPU: 121 PID: 9665 Comm: ls Tainted: G             L    5.3.0-rc8-next-20190915+ #2
[  753.565987] Hardware name: HPE Apollo 70             /C01_APACHE_MB         , BIOS L50_5.13_1.11 06/18/2019
[  753.576414] Call trace:
[  753.579553]  dump_backtrace+0x0/0x264
[  753.583905]  show_stack+0x20/0x2c
[  753.587911]  dump_stack+0xd0/0x140
[  753.592003]  print_circular_bug+0x368/0x380
[  753.596876]  check_noncircular+0x28c/0x294
[  753.601664]  validate_chain+0xfcc/0x2fd4
[  753.606276]  __lock_acquire+0x868/0xc2c
[  753.610802]  lock_acquire+0x320/0x364
[  753.615154]  _raw_spin_lock_irqsave+0x7c/0x9c
[  753.620202]  __wake_up_common_lock+0xa8/0x11c
[  753.625248]  __wake_up+0x3c/0x4c
[  753.629171]  account+0x390/0x3e0
[  753.633095]  extract_entropy+0x2cc/0x37c
[  753.637708]  crng_reseed+0x60/0x350
[  753.641887]  _extract_crng+0xd8/0x164
[  753.646238]  crng_reseed+0x7c/0x350
[  753.650417]  _extract_crng+0xd8/0x164
[  753.654768]  get_random_u64+0xec/0x1dc
[  753.659208]  arch_mmap_rnd+0x18/0x78
[  753.663474]  load_elf_binary+0x6d0/0x1730
[  753.668173]  search_binary_handler+0x10c/0x35c
[  753.673308]  __do_execve_file+0xb58/0xf7c
[  753.678007]  __arm64_sys_execve+0x6c/0xa4
[  753.682707]  el0_svc_handler+0x170/0x240
[  753.687319]  el0_svc+0x8/0xc


Thread overview: 19+ messages
2019-09-13 16:27 Qian Cai
2019-09-16  9:03 ` Sebastian Andrzej Siewior
2019-09-16 14:01   ` Qian Cai
2019-09-16 19:51     ` Sebastian Andrzej Siewior
2019-09-16 21:31       ` Qian Cai [this message]
2019-09-17  7:16         ` Sebastian Andrzej Siewior
2019-09-18 19:59           ` Qian Cai
2019-09-25  9:31 ` Peter Zijlstra
2019-09-25 15:18   ` Qian Cai
2019-09-25 16:45     ` Peter Zijlstra
2019-09-26 12:29       ` Qian Cai
2019-10-01  9:18         ` [PATCH] sched: Avoid spurious lock dependencies Peter Zijlstra
2019-10-01 10:01           ` Valentin Schneider
2019-10-01 11:22           ` Qian Cai
2019-10-01 11:36           ` Srikar Dronamraju
2019-10-01 13:44             ` Peter Zijlstra
2019-10-29 11:10           ` Qian Cai
2019-10-29 12:44             ` Peter Zijlstra
2019-11-12  0:54               ` Qian Cai
