From: Tejun Heo <tj@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Christoph Lameter <cl@linux-foundation.org>,
Ingo Molnar <mingo@elte.hu>, Nick Piggin <npiggin@suse.de>,
Jiri Kosina <jkosina@suse.cz>, Yinghai Lu <yhlu.kernel@gmail.com>,
Thomas Gleixner <tglx@linutronix.de>,
linux-kernel@vger.kernel.org
Subject: [PATCH percpu#for-linus] percpu: fix possible deadlock via irq lock inversion
Date: Mon, 09 Nov 2009 14:46:52 +0900 [thread overview]
Message-ID: <4AF7ACCC.2050208@kernel.org> (raw)
In-Reply-To: <1257610412.4108.3.camel@laptop>
Lockdep caught the following irq lock inversion which can lead to
deadlock.
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.32-rc5-tip-04815-g12f0f93-dirty #745
---------------------------------------------------------
hub 1-3:1.0: state 7 ports 2 chg 0000 evt 0004
ksoftirqd/65/199 just changed the state of lock:
(pcpu_lock){..-...}, at: [<ffffffff81130e04>] free_percpu+0x38/0x104
but this lock took another, SOFTIRQ-unsafe lock in the past:
(vmap_area_lock){+.+...}
and interrupts could create inverse lock ordering between them.
This happens because pcpu_lock is allowed to be acquired from irq
context for free_percpu() path and alloc_percpu() path may call
vfree() with pcpu_lock held. As vmap_area_lock isn't irq safe, if an
IRQ occurs while vmap_area_lock is held and the irq handler calls
free_percpu(), locking order inversion occurs.
As the nesting only occurs in the alloc path which isn't allowed to be
called from irq context, A->B->A deadlock won't occur but A->B B->A
deadlock is still possible.
The only place where vmap_area_lock is nested under pcpu_lock is while
extending area_map to free old map. Break the locking order by
temporarily releasing pcpu_lock when freeing old map. This is safe to
do as allocation path is protected with outer pcpu_alloc_mutex.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
---
If nobody objects, I'll push it to Linus tomorrow. Thanks.
mm/percpu.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/mm/percpu.c b/mm/percpu.c
index d907971..30cd343 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -372,7 +372,7 @@ static struct pcpu_chunk *pcpu_chunk_addr_search(void *addr)
static int pcpu_extend_area_map(struct pcpu_chunk *chunk, unsigned long *flags)
{
int new_alloc;
- int *new;
+ int *new, *old = NULL;
size_t size;
/* has enough? */
@@ -407,10 +407,23 @@ static int pcpu_extend_area_map(struct pcpu_chunk *chunk, unsigned long *flags)
* one of the first chunks and still using static map.
*/
if (chunk->map_alloc >= PCPU_DFL_MAP_ALLOC)
- pcpu_mem_free(chunk->map, size);
+ old = chunk->map;
chunk->map_alloc = new_alloc;
chunk->map = new;
+
+ /*
+ * pcpu_mem_free() might end up calling vfree() which uses
+ * IRQ-unsafe lock and thus can't be called with pcpu_lock
+ * held. Release and reacquire pcpu_lock if old map needs to
+ * be freed.
+ */
+ if (old) {
+ spin_unlock_irqrestore(&pcpu_lock, *flags);
+ pcpu_mem_free(old, size);
+ spin_lock_irqsave(&pcpu_lock, *flags);
+ }
+
return 0;
}
next prev parent reply other threads:[~2009-11-09 5:47 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <86802c440911041008q4969b9bdk15b4598c40bb84bd@mail.gmail.com>
[not found] ` <4AF25FC7.4000502@kernel.org>
[not found] ` <20091105082102.GA2870@elte.hu>
[not found] ` <4AF28D7A.6020209@kernel.org>
2009-11-05 14:31 ` irq lock inversion Jiri Kosina
2009-11-06 5:53 ` Tejun Heo
2009-11-06 7:17 ` Ingo Molnar
2009-11-06 7:45 ` Tejun Heo
2009-11-06 7:58 ` Ingo Molnar
2009-11-06 8:24 ` Tejun Heo
2009-11-06 8:40 ` Ingo Molnar
2009-11-06 8:52 ` Tejun Heo
2009-11-06 16:08 ` Christoph Lameter
2009-11-06 16:38 ` Tejun Heo
2009-11-06 17:03 ` Christoph Lameter
2009-11-07 16:13 ` Peter Zijlstra
2009-11-09 5:46 ` Tejun Heo [this message]
2009-11-06 9:59 ` Jens Axboe
2009-11-08 9:38 ` Ingo Molnar
2009-11-09 15:34 ` Jens Axboe
2009-11-09 15:45 ` Ingo Molnar
2009-11-09 15:49 ` Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4AF7ACCC.2050208@kernel.org \
--to=tj@kernel.org \
--cc=cl@linux-foundation.org \
--cc=jkosina@suse.cz \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=npiggin@suse.de \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
--cc=yhlu.kernel@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.