* perf/workqueue: lockdep warning on process exit @ 2014-06-16 14:24 Sasha Levin 2014-06-16 21:41 ` Sasha Levin ` (2 more replies) 0 siblings, 3 replies; 7+ messages in thread From: Sasha Levin @ 2014-06-16 14:24 UTC (permalink / raw) To: Peter Zijlstra, Ingo Molnar, acme, paulus, Tejun Heo; +Cc: LKML, Dave Jones Hi all, While fuzzing with trinity inside a KVM tools guest running the latest -next kernel I've stumbled on the following spew: [ 430.429005] ====================================================== [ 430.429005] [ INFO: possible circular locking dependency detected ] [ 430.429005] 3.15.0-next-20140613-sasha-00026-g6dd125d-dirty #654 Not tainted [ 430.429005] ------------------------------------------------------- [ 430.429005] trinity-c578/9725 is trying to acquire lock: [ 430.429005] (&(&pool->lock)->rlock){-.-...}, at: __queue_work (kernel/workqueue.c:1346) [ 430.429005] [ 430.429005] but task is already holding lock: [ 430.429005] (&ctx->lock){-.....}, at: perf_event_exit_task (kernel/events/core.c:7471 kernel/events/core.c:7533) [ 430.439509] [ 430.439509] which lock already depends on the new lock. [ 430.439509] [ 430.439509] [ 430.439509] the existing dependency chain (in reverse order) is: [ 430.439509] -> #3 (&ctx->lock){-.....}: [ 430.439509] lock_acquire (./arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602) [ 430.439509] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151) [ 430.439509] __perf_event_task_sched_out (kernel/events/core.c:2358 kernel/events/core.c:2384) [ 430.450111] perf_event_task_sched_out (include/linux/perf_event.h:702) [ 430.450111] __schedule (kernel/sched/core.c:2138 kernel/sched/core.c:2176 kernel/sched/core.c:2300 kernel/sched/core.c:2795) [ 430.450111] preempt_schedule_irq (./arch/x86/include/asm/paravirt.h:814 kernel/sched/core.c:2912) [ 430.450111] retint_kernel (arch/x86/kernel/entry_64.S:936) [ 430.450111] syscall_trace_leave (arch/x86/kernel/ptrace.c:1531) [ 430.450111] int_check_syscall_exit_work (arch/x86/kernel/entry_64.S:590) [ 430.450111] -> #2 (&rq->lock){-.-.-.}: [ 430.450111] lock_acquire (./arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602) [ 430.450111] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151) [ 430.450111] wake_up_new_task (include/linux/sched.h:2889 kernel/sched/core.c:329 kernel/sched/core.c:2088) [ 430.450111] do_fork (kernel/fork.c:1628) [ 430.450111] kernel_thread (kernel/fork.c:1650) [ 430.450111] rest_init (init/main.c:405) [ 430.450111] start_kernel (init/main.c:680) [ 430.450111] x86_64_start_reservations (arch/x86/kernel/head64.c:194) [ 430.450111] x86_64_start_kernel (arch/x86/kernel/head64.c:183) [ 430.450111] -> #1 (&p->pi_lock){-.-.-.}: [ 430.450111] lock_acquire (./arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602) [ 430.450111] _raw_spin_lock_irqsave (include/linux/spinlock_api_smp.h:117 kernel/locking/spinlock.c:159) [ 430.450111] try_to_wake_up (kernel/sched/core.c:1666) [ 430.450111] wake_up_process (kernel/sched/core.c:1762 (discriminator 2)) [ 430.450111] create_and_start_worker (include/linux/spinlock.h:353 kernel/workqueue.c:1768) [ 430.450111] init_workqueues (kernel/workqueue.c:4938) [ 430.450111] do_one_initcall (init/main.c:790) [ 430.450111] kernel_init_freeable (init/main.c:891 init/main.c:998) [ 430.450111] kernel_init (init/main.c:936) [ 430.450111] ret_from_fork (arch/x86/kernel/entry_64.S:349) [ 430.450111] -> #0 (&(&pool->lock)->rlock){-.-...}: [ 430.450111] __lock_acquire (kernel/locking/lockdep.c:1840 kernel/locking/lockdep.c:1945 kernel/locking/lockdep.c:2131 kernel/locking/lockdep.c:3182) [ 430.450111] lock_acquire (./arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602) [ 430.450111] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151) [ 430.450111] __queue_work (kernel/workqueue.c:1346) [ 430.450111] queue_work_on (kernel/workqueue.c:1424) [ 430.450111] free_object (lib/debugobjects.c:209) [ 430.450111] __debug_check_no_obj_freed (lib/debugobjects.c:715) [ 430.450111] debug_check_no_obj_freed (lib/debugobjects.c:727) [ 430.450111] kmem_cache_free (mm/slub.c:2683 mm/slub.c:2711) [ 430.450111] free_task (kernel/fork.c:221) [ 430.450111] __put_task_struct (kernel/fork.c:250) [ 430.450111] put_ctx (include/linux/sched.h:1855 kernel/events/core.c:898) [ 430.450111] perf_event_exit_task (kernel/events/core.c:907 kernel/events/core.c:7478 kernel/events/core.c:7533) [ 430.450111] do_exit (kernel/exit.c:766) [ 430.450111] do_group_exit (kernel/exit.c:884) [ 430.450111] get_signal_to_deliver (kernel/signal.c:2347) [ 430.450111] do_signal (arch/x86/kernel/signal.c:698) [ 430.450111] do_notify_resume (arch/x86/kernel/signal.c:751) [ 430.450111] int_signal (arch/x86/kernel/entry_64.S:600) [ 430.450111] [ 430.450111] other info that might help us debug this: [ 430.450111] [ 430.450111] Chain exists of: &(&pool->lock)->rlock --> &rq->lock --> &ctx->lock [ 430.450111] Possible unsafe locking scenario: [ 430.450111] [ 430.450111] CPU0 CPU1 [ 430.450111] ---- ---- [ 430.450111] lock(&ctx->lock); [ 430.450111] lock(&rq->lock); [ 430.450111] lock(&ctx->lock); [ 430.450111] lock(&(&pool->lock)->rlock); [ 430.450111] [ 430.450111] *** DEADLOCK *** [ 430.450111] [ 430.450111] 1 lock held by trinity-c578/9725: [ 430.450111] #0: (&ctx->lock){-.....}, at: perf_event_exit_task (kernel/events/core.c:7471 kernel/events/core.c:7533) [ 430.450111] [ 430.450111] stack backtrace: [ 430.450111] CPU: 6 PID: 9725 Comm: trinity-c578 Not tainted 3.15.0-next-20140613-sasha-00026-g6dd125d-dirty #654 [ 430.450111] ffffffffadb45840 ffff880101787848 ffffffffaa511b1c 0000000000000003 [ 430.450111] ffffffffadb8a4c0 ffff880101787898 ffffffffaa5044e2 0000000000000001 [ 430.450111] ffff880101787928 ffff880101787898 ffff8800aed98cf8 ffff8800aed98000 [ 430.450111] Call Trace: [ 430.450111] dump_stack (lib/dump_stack.c:52) [ 430.450111] print_circular_bug (kernel/locking/lockdep.c:1216) [ 430.450111] __lock_acquire (kernel/locking/lockdep.c:1840 kernel/locking/lockdep.c:1945 kernel/locking/lockdep.c:2131 kernel/locking/lockdep.c:3182) [ 430.450111] ? kvm_clock_read (./arch/x86/include/asm/preempt.h:90 arch/x86/kernel/kvmclock.c:86) [ 430.450111] ? sched_clock (./arch/x86/include/asm/paravirt.h:192 arch/x86/kernel/tsc.c:305) [ 430.450111] ? put_lock_stats.isra.12 (./arch/x86/include/asm/preempt.h:98 kernel/locking/lockdep.c:254) [ 430.450111] lock_acquire (./arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602) [ 430.450111] ? __queue_work (kernel/workqueue.c:1346) [ 430.450111] ? __lock_is_held (kernel/locking/lockdep.c:3516) [ 430.450111] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151) [ 430.450111] ? __queue_work (kernel/workqueue.c:1346) [ 430.450111] __queue_work (kernel/workqueue.c:1346) [ 430.450111] queue_work_on (kernel/workqueue.c:1424) [ 430.450111] free_object (lib/debugobjects.c:209) [ 430.450111] __debug_check_no_obj_freed (lib/debugobjects.c:715) [ 430.450111] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) [ 430.450111] debug_check_no_obj_freed (lib/debugobjects.c:727) [ 430.450111] kmem_cache_free (mm/slub.c:2683 mm/slub.c:2711) [ 430.450111] ? free_task (kernel/fork.c:221) [ 430.450111] free_task (kernel/fork.c:221) [ 430.450111] __put_task_struct (kernel/fork.c:250) [ 430.450111] put_ctx (include/linux/sched.h:1855 kernel/events/core.c:898) [ 430.450111] perf_event_exit_task (kernel/events/core.c:907 kernel/events/core.c:7478 kernel/events/core.c:7533) [ 430.450111] ? preempt_count_sub (kernel/sched/core.c:2602) [ 430.450111] do_exit (kernel/exit.c:766) [ 430.450111] ? debug_smp_processor_id (lib/smp_processor_id.c:57) [ 430.450111] ? put_lock_stats.isra.12 (./arch/x86/include/asm/preempt.h:98 kernel/locking/lockdep.c:254) [ 430.450111] ? _raw_spin_unlock_irq (./arch/x86/include/asm/paravirt.h:819 include/linux/spinlock_api_smp.h:168 kernel/locking/spinlock.c:199) [ 430.450111] do_group_exit (kernel/exit.c:884) [ 430.450111] get_signal_to_deliver (kernel/signal.c:2347) [ 430.450111] ? vtime_account_user (kernel/sched/cputime.c:687) [ 430.450111] do_signal (arch/x86/kernel/signal.c:698) [ 430.450111] ? vtime_account_user (kernel/sched/cputime.c:687) [ 430.450111] ? preempt_count_sub (kernel/sched/core.c:2602) [ 430.450111] ? context_tracking_user_exit (./arch/x86/include/asm/paravirt.h:809 (discriminator 2) kernel/context_tracking.c:182 (discriminator 2)) [ 430.450111] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) [ 430.450111] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2557 kernel/locking/lockdep.c:2599) [ 430.450111] do_notify_resume (arch/x86/kernel/signal.c:751) [ 430.450111] int_signal (arch/x86/kernel/entry_64.S:600) Thanks, Sasha ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: perf/workqueue: lockdep warning on process exit 2014-06-16 14:24 perf/workqueue: lockdep warning on process exit Sasha Levin @ 2014-06-16 21:41 ` Sasha Levin 2014-06-17 15:57 ` Tejun Heo 2014-06-23 14:12 ` Peter Zijlstra 2 siblings, 0 replies; 7+ messages in thread From: Sasha Levin @ 2014-06-16 21:41 UTC (permalink / raw) To: Peter Zijlstra, Ingo Molnar, acme, paulus, Tejun Heo; +Cc: LKML, Dave Jones On 06/16/2014 10:24 AM, Sasha Levin wrote: > Hi all, > > While fuzzing with trinity inside a KVM tools guest running the latest -next > kernel I've stumbled on the following spew: I think that this is due to a recent change, since it seems to reproduce very often. I've tried bisecting it but it seems that I'm hitting unrelated issues and can't correctly run the bisection. Thanks, Sasha ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: perf/workqueue: lockdep warning on process exit 2014-06-16 14:24 perf/workqueue: lockdep warning on process exit Sasha Levin 2014-06-16 21:41 ` Sasha Levin @ 2014-06-17 15:57 ` Tejun Heo 2014-06-23 14:12 ` Peter Zijlstra 2 siblings, 0 replies; 7+ messages in thread From: Tejun Heo @ 2014-06-17 15:57 UTC (permalink / raw) To: Sasha Levin; +Cc: Peter Zijlstra, Ingo Molnar, acme, paulus, LKML, Dave Jones Hello, On Mon, Jun 16, 2014 at 10:24:58AM -0400, Sasha Levin wrote: > [ 430.429005] ====================================================== > [ 430.429005] [ INFO: possible circular locking dependency detected ] > [ 430.429005] 3.15.0-next-20140613-sasha-00026-g6dd125d-dirty #654 Not tainted > [ 430.429005] ------------------------------------------------------- > [ 430.429005] trinity-c578/9725 is trying to acquire lock: > [ 430.429005] (&(&pool->lock)->rlock){-.-...}, at: __queue_work (kernel/workqueue.c:1346) > [ 430.429005] > [ 430.429005] but task is already holding lock: > [ 430.429005] (&ctx->lock){-.....}, at: perf_event_exit_task (kernel/events/core.c:7471 kernel/events/core.c:7533) > [ 430.439509] > [ 430.439509] which lock already depends on the new lock. > [ 430.439509] > [ 430.439509] > [ 430.439509] the existing dependency chain (in reverse order) is: > [ 430.439509] > -> #3 (&ctx->lock){-.....}: ... > -> #2 (&rq->lock){-.-.-.}: ... > -> #1 (&p->pi_lock){-.-.-.}: ... > -> #0 (&(&pool->lock)->rlock){-.-...}: ... > [ 430.450111] other info that might help us debug this: > [ 430.450111] > [ 430.450111] Chain exists of: > &(&pool->lock)->rlock --> &rq->lock --> &ctx->lock > > [ 430.450111] Possible unsafe locking scenario: > [ 430.450111] > [ 430.450111] CPU0 CPU1 > [ 430.450111] ---- ---- > [ 430.450111] lock(&ctx->lock); > [ 430.450111] lock(&rq->lock); > [ 430.450111] lock(&ctx->lock); > [ 430.450111] lock(&(&pool->lock)->rlock); > [ 430.450111] > [ 430.450111] *** DEADLOCK *** > [ 430.450111] > [ 430.450111] 1 lock held by trinity-c578/9725: > [ 430.450111] #0: (&ctx->lock){-.....}, at: perf_event_exit_task (kernel/events/core.c:7471 kernel/events/core.c:7533) > [ 430.450111] > [ 430.450111] stack backtrace: > [ 430.450111] CPU: 6 PID: 9725 Comm: trinity-c578 Not tainted 3.15.0-next-20140613-sasha-00026-g6dd125d-dirty #654 > [ 430.450111] ffffffffadb45840 ffff880101787848 ffffffffaa511b1c 0000000000000003 > [ 430.450111] ffffffffadb8a4c0 ffff880101787898 ffffffffaa5044e2 0000000000000001 > [ 430.450111] ffff880101787928 ffff880101787898 ffff8800aed98cf8 ffff8800aed98000 > [ 430.450111] Call Trace: > [ 430.450111] dump_stack (lib/dump_stack.c:52) > [ 430.450111] print_circular_bug (kernel/locking/lockdep.c:1216) > [ 430.450111] __lock_acquire (kernel/locking/lockdep.c:1840 kernel/locking/lockdep.c:1945 kernel/locking/lockdep.c:2131 kernel/locking/lockdep.c:3182) > [ 430.450111] lock_acquire (./arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602) > [ 430.450111] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151) > [ 430.450111] __queue_work (kernel/workqueue.c:1346) > [ 430.450111] queue_work_on (kernel/workqueue.c:1424) > [ 430.450111] free_object (lib/debugobjects.c:209) > [ 430.450111] __debug_check_no_obj_freed (lib/debugobjects.c:715) > [ 430.450111] debug_check_no_obj_freed (lib/debugobjects.c:727) > [ 430.450111] kmem_cache_free (mm/slub.c:2683 mm/slub.c:2711) > [ 430.450111] free_task (kernel/fork.c:221) > [ 430.450111] __put_task_struct (kernel/fork.c:250) > [ 430.450111] put_ctx (include/linux/sched.h:1855 kernel/events/core.c:898) > [ 430.450111] perf_event_exit_task (kernel/events/core.c:907 kernel/events/core.c:7478 kernel/events/core.c:7533) > [ 430.450111] do_exit (kernel/exit.c:766) So, perf_event_exit_task() ends up freeing perf_events under perf_event_context->lock which may nest inside rq lock. With SLAB_DEBUG_OBJECTS enabled, sl?b calls into debugobjects which in turn call into workqueue for its internal management. This leads to possible deadlock as workqueue is now being invoked under a lock which nests under rq lock. This is a really low level feature invoking high level debugging facility leading to possible deadlocks. I don't know why it showed up now and there may be better ways but the default thing to do seems to be turning off SLAB_DEBUG_OBJECTS for perf_events. Thanks. -- tejun ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: perf/workqueue: lockdep warning on process exit 2014-06-16 14:24 perf/workqueue: lockdep warning on process exit Sasha Levin 2014-06-16 21:41 ` Sasha Levin 2014-06-17 15:57 ` Tejun Heo @ 2014-06-23 14:12 ` Peter Zijlstra 2014-06-25 21:17 ` Sasha Levin 2014-07-16 19:19 ` [tip:perf/urgent] perf: Fix " tip-bot for Peter Zijlstra 2 siblings, 2 replies; 7+ messages in thread From: Peter Zijlstra @ 2014-06-23 14:12 UTC (permalink / raw) To: Sasha Levin; +Cc: Ingo Molnar, acme, paulus, Tejun Heo, LKML, Dave Jones On Mon, Jun 16, 2014 at 10:24:58AM -0400, Sasha Levin wrote: > Hi all, > > While fuzzing with trinity inside a KVM tools guest running the latest -next > kernel I've stumbled on the following spew: > > [ 430.429005] ====================================================== > [ 430.429005] [ INFO: possible circular locking dependency detected ] > [ 430.429005] 3.15.0-next-20140613-sasha-00026-g6dd125d-dirty #654 Not tainted > [ 430.429005] ------------------------------------------------------- > [ 430.429005] trinity-c578/9725 is trying to acquire lock: > [ 430.429005] (&(&pool->lock)->rlock){-.-...}, at: __queue_work (kernel/workqueue.c:1346) > [ 430.429005] > [ 430.429005] but task is already holding lock: > [ 430.429005] (&ctx->lock){-.....}, at: perf_event_exit_task (kernel/events/core.c:7471 kernel/events/core.c:7533) > [ 430.439509] > [ 430.439509] which lock already depends on the new lock. > [ 430.450111] 1 lock held by trinity-c578/9725: > [ 430.450111] #0: (&ctx->lock){-.....}, at: perf_event_exit_task (kernel/events/core.c:7471 kernel/events/core.c:7533) > [ 430.450111] > [ 430.450111] stack backtrace: > [ 430.450111] CPU: 6 PID: 9725 Comm: trinity-c578 Not tainted 3.15.0-next-20140613-sasha-00026-g6dd125d-dirty #654 > [ 430.450111] ffffffffadb45840 ffff880101787848 ffffffffaa511b1c 0000000000000003 > [ 430.450111] ffffffffadb8a4c0 ffff880101787898 ffffffffaa5044e2 0000000000000001 > [ 430.450111] ffff880101787928 ffff880101787898 ffff8800aed98cf8 ffff8800aed98000 > [ 430.450111] Call Trace: > [ 430.450111] dump_stack (lib/dump_stack.c:52) > [ 430.450111] print_circular_bug (kernel/locking/lockdep.c:1216) > [ 430.450111] __lock_acquire (kernel/locking/lockdep.c:1840 kernel/locking/lockdep.c:1945 kernel/locking/lockdep.c:2131 kernel/locking/lockdep.c:3182) > [ 430.450111] lock_acquire (./arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602) > [ 430.450111] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151) > [ 430.450111] __queue_work (kernel/workqueue.c:1346) > [ 430.450111] queue_work_on (kernel/workqueue.c:1424) > [ 430.450111] free_object (lib/debugobjects.c:209) > [ 430.450111] __debug_check_no_obj_freed (lib/debugobjects.c:715) > [ 430.450111] debug_check_no_obj_freed (lib/debugobjects.c:727) > [ 430.450111] kmem_cache_free (mm/slub.c:2683 mm/slub.c:2711) > [ 430.450111] free_task (kernel/fork.c:221) > [ 430.450111] __put_task_struct (kernel/fork.c:250) > [ 430.450111] put_ctx (include/linux/sched.h:1855 kernel/events/core.c:898) > [ 430.450111] perf_event_exit_task (kernel/events/core.c:907 kernel/events/core.c:7478 kernel/events/core.c:7533) > [ 430.450111] do_exit (kernel/exit.c:766) > [ 430.450111] do_group_exit (kernel/exit.c:884) > [ 430.450111] get_signal_to_deliver (kernel/signal.c:2347) > [ 430.450111] do_signal (arch/x86/kernel/signal.c:698) > [ 430.450111] do_notify_resume (arch/x86/kernel/signal.c:751) > [ 430.450111] int_signal (arch/x86/kernel/entry_64.S:600) Urgh.. so the only way I can make that happen is through: perf_event_exit_task_context() raw_spin_lock(&child_ctx->lock); unclone_ctx(child_ctx) put_ctx(ctx->parent_ctx); raw_spin_unlock_irqrestore(&child_ctx->lock); And we can avoid this by doing something like.. I can't immediately see how this changed recently, but given that you say its easy to reproduce, can you give this a spin? --- kernel/events/core.c | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index a33d9a2bcbd7..5e90fa579055 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -7474,7 +7474,7 @@ __perf_event_exit_task(struct perf_event *child_event, static void perf_event_exit_task_context(struct task_struct *child, int ctxn) { struct perf_event *child_event, *next; - struct perf_event_context *child_ctx; + struct perf_event_context *child_ctx, *parent_ctx; unsigned long flags; if (likely(!child->perf_event_ctxp[ctxn])) { @@ -7499,6 +7499,15 @@ static void perf_event_exit_task_context(struct task_struct *child, int ctxn) raw_spin_lock(&child_ctx->lock); task_ctx_sched_out(child_ctx); child->perf_event_ctxp[ctxn] = NULL; + + /* + * In order to avoid freeing: child_ctx->parent_ctx->task + * under perf_event_context::lock, grab another reference. + */ + parent_ctx = child_ctx->parent_ctx; + if (parent_ctx) + get_ctx(parent_ctx); + /* * If this context is a clone; unclone it so it can't get * swapped to another process while we're removing all @@ -7509,6 +7518,13 @@ static void perf_event_exit_task_context(struct task_struct *child, int ctxn) raw_spin_unlock_irqrestore(&child_ctx->lock, flags); /* + * Now that we no longer hold perf_event_context::lock, drop + * our extra child_ctx->parent_ctx reference. + */ + if (parent_ctx) + put_ctx(parent_ctx); + + /* * Report the task dead after unscheduling the events so that we * won't get any samples after PERF_RECORD_EXIT. We can however still * get a few PERF_RECORD_READ events. ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: perf/workqueue: lockdep warning on process exit 2014-06-23 14:12 ` Peter Zijlstra @ 2014-06-25 21:17 ` Sasha Levin 2014-07-07 13:45 ` Peter Zijlstra 2014-07-16 19:19 ` [tip:perf/urgent] perf: Fix " tip-bot for Peter Zijlstra 1 sibling, 1 reply; 7+ messages in thread From: Sasha Levin @ 2014-06-25 21:17 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Ingo Molnar, acme, paulus, Tejun Heo, LKML, Dave Jones On 06/23/2014 10:12 AM, Peter Zijlstra wrote: > On Mon, Jun 16, 2014 at 10:24:58AM -0400, Sasha Levin wrote: >> Hi all, >> >> While fuzzing with trinity inside a KVM tools guest running the latest -next >> kernel I've stumbled on the following spew: >> >> [ 430.429005] ====================================================== >> [ 430.429005] [ INFO: possible circular locking dependency detected ] >> [ 430.429005] 3.15.0-next-20140613-sasha-00026-g6dd125d-dirty #654 Not tainted >> [ 430.429005] ------------------------------------------------------- >> [ 430.429005] trinity-c578/9725 is trying to acquire lock: >> [ 430.429005] (&(&pool->lock)->rlock){-.-...}, at: __queue_work (kernel/workqueue.c:1346) >> [ 430.429005] >> [ 430.429005] but task is already holding lock: >> [ 430.429005] (&ctx->lock){-.....}, at: perf_event_exit_task (kernel/events/core.c:7471 kernel/events/core.c:7533) >> [ 430.439509] >> [ 430.439509] which lock already depends on the new lock. > > >> [ 430.450111] 1 lock held by trinity-c578/9725: >> [ 430.450111] #0: (&ctx->lock){-.....}, at: perf_event_exit_task (kernel/events/core.c:7471 kernel/events/core.c:7533) >> [ 430.450111] >> [ 430.450111] stack backtrace: >> [ 430.450111] CPU: 6 PID: 9725 Comm: trinity-c578 Not tainted 3.15.0-next-20140613-sasha-00026-g6dd125d-dirty #654 >> [ 430.450111] ffffffffadb45840 ffff880101787848 ffffffffaa511b1c 0000000000000003 >> [ 430.450111] ffffffffadb8a4c0 ffff880101787898 ffffffffaa5044e2 0000000000000001 >> [ 430.450111] ffff880101787928 ffff880101787898 ffff8800aed98cf8 ffff8800aed98000 >> [ 430.450111] Call Trace: >> [ 430.450111] dump_stack (lib/dump_stack.c:52) >> [ 430.450111] print_circular_bug (kernel/locking/lockdep.c:1216) >> [ 430.450111] __lock_acquire (kernel/locking/lockdep.c:1840 kernel/locking/lockdep.c:1945 kernel/locking/lockdep.c:2131 kernel/locking/lockdep.c:3182) >> [ 430.450111] lock_acquire (./arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602) >> [ 430.450111] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151) >> [ 430.450111] __queue_work (kernel/workqueue.c:1346) >> [ 430.450111] queue_work_on (kernel/workqueue.c:1424) >> [ 430.450111] free_object (lib/debugobjects.c:209) >> [ 430.450111] __debug_check_no_obj_freed (lib/debugobjects.c:715) >> [ 430.450111] debug_check_no_obj_freed (lib/debugobjects.c:727) >> [ 430.450111] kmem_cache_free (mm/slub.c:2683 mm/slub.c:2711) >> [ 430.450111] free_task (kernel/fork.c:221) >> [ 430.450111] __put_task_struct (kernel/fork.c:250) >> [ 430.450111] put_ctx (include/linux/sched.h:1855 kernel/events/core.c:898) >> [ 430.450111] perf_event_exit_task (kernel/events/core.c:907 kernel/events/core.c:7478 kernel/events/core.c:7533) >> [ 430.450111] do_exit (kernel/exit.c:766) >> [ 430.450111] do_group_exit (kernel/exit.c:884) >> [ 430.450111] get_signal_to_deliver (kernel/signal.c:2347) >> [ 430.450111] do_signal (arch/x86/kernel/signal.c:698) >> [ 430.450111] do_notify_resume (arch/x86/kernel/signal.c:751) >> [ 430.450111] int_signal (arch/x86/kernel/entry_64.S:600) > > > Urgh.. so the only way I can make that happen is through: > > perf_event_exit_task_context() > raw_spin_lock(&child_ctx->lock); > unclone_ctx(child_ctx) > put_ctx(ctx->parent_ctx); > raw_spin_unlock_irqrestore(&child_ctx->lock); > > And we can avoid this by doing something like.. > > I can't immediately see how this changed recently, but given that you > say its easy to reproduce, can you give this a spin? With the patch I've observed the following, which seems to be related. [ 1853.792414] ====================================================== [ 1853.794073] [ INFO: possible circular locking dependency detected ] [ 1853.795636] 3.16.0-rc2-next-20140625-sasha-00023-g9bc0cd4-dirty #727 Tainted: G W [ 1853.797760] ------------------------------------------------------- [ 1853.799353] trinity-c3/29876 is trying to acquire lock: [ 1853.800172] (&(&pool->lock)->rlock){-.-.-.}, at: __queue_work (kernel/workqueue.c:1339) [ 1853.800172] [ 1853.800172] but task is already holding lock: [ 1853.800172] (&ctx->lock){-.-...}, at: perf_lock_task_context (kernel/events/core.c:983) [ 1853.800172] [ 1853.800172] which lock already depends on the new lock. [ 1853.800172] [ 1853.800172] [ 1853.800172] the existing dependency chain (in reverse order) is: [ 1853.800172] -> #3 (&ctx->lock){-.-...}: [ 1853.800172] lock_acquire (./arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602) [ 1853.800172] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151) [ 1853.816396] __perf_event_task_sched_out (kernel/events/core.c:2359 kernel/events/core.c:2385) [ 1853.817250] perf_event_task_sched_out (include/linux/perf_event.h:702) [ 1853.817250] __schedule (kernel/sched/core.c:2142 kernel/sched/core.c:2180 kernel/sched/core.c:2304 kernel/sched/core.c:2799) [ 1853.817250] schedule (kernel/sched/core.c:2836) [ 1853.817250] p9_client_rpc (net/9p/client.c:756 (discriminator 9)) [ 1853.817250] p9_client_getattr_dotl (net/9p/client.c:1754) [ 1853.817250] v9fs_vfs_getattr_dotl (fs/9p/vfs_inode_dotl.c:496) [ 1853.817250] vfs_getattr_nosec (fs/stat.c:57) [ 1853.817250] vfs_getattr (fs/stat.c:73) [ 1853.817250] vfs_fstatat (fs/stat.c:111) [ 1853.817250] vfs_lstat (fs/stat.c:130) [ 1853.817250] SYSC_newlstat (fs/stat.c:284) [ 1853.817250] SyS_newlstat (fs/stat.c:278) [ 1853.817250] tracesys (arch/x86/kernel/entry_64.S:542) [ 1853.817250] -> #2 (&rq->lock){-.-.-.}: [ 1853.817250] lock_acquire (./arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602) [ 1853.817250] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151) [ 1853.817250] wake_up_new_task (include/linux/sched.h:2891 kernel/sched/core.c:329 kernel/sched/core.c:2092) [ 1853.817250] do_fork (kernel/fork.c:1630) [ 1853.817250] kernel_thread (kernel/fork.c:1652) [ 1853.817250] rest_init (init/main.c:404) [ 1853.817250] start_kernel (init/main.c:681) [ 1853.817250] x86_64_start_reservations (arch/x86/kernel/head64.c:194) [ 1853.817250] x86_64_start_kernel (arch/x86/kernel/head64.c:183) [ 1853.817250] -> #1 (&p->pi_lock){-.-.-.}: [ 1853.817250] lock_acquire (./arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602) [ 1853.817250] _raw_spin_lock_irqsave (include/linux/spinlock_api_smp.h:117 kernel/locking/spinlock.c:159) [ 1853.817250] try_to_wake_up (kernel/sched/core.c:1670) [ 1853.817250] wake_up_process (kernel/sched/core.c:1766 (discriminator 2)) [ 1853.817250] create_and_start_worker (include/linux/spinlock.h:353 kernel/workqueue.c:1764) [ 1853.817250] init_workqueues (kernel/workqueue.c:4923) [ 1853.817250] do_one_initcall (init/main.c:791) [ 1853.817250] kernel_init_freeable (init/main.c:892 init/main.c:999) [ 1853.817250] kernel_init (init/main.c:937) [ 1853.817250] ret_from_fork (arch/x86/kernel/entry_64.S:349) [ 1853.817250] -> #0 (&(&pool->lock)->rlock){-.-.-.}: [ 1853.817250] __lock_acquire (kernel/locking/lockdep.c:1840 kernel/locking/lockdep.c:1945 kernel/locking/lockdep.c:2131 kernel/locking/lockdep.c:3182) [ 1853.817250] lock_acquire (./arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602) [ 1853.817250] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151) [ 1853.817250] __queue_work (kernel/workqueue.c:1339) [ 1853.817250] queue_work_on (kernel/workqueue.c:1417) [ 1853.817250] free_object (lib/debugobjects.c:209) [ 1853.817250] __debug_check_no_obj_freed (lib/debugobjects.c:715) [ 1853.817250] debug_check_no_obj_freed (lib/debugobjects.c:727) [ 1853.817250] kmem_cache_free (mm/slub.c:2657 mm/slub.c:2686) [ 1853.817250] free_task (kernel/fork.c:221) [ 1853.817250] __put_task_struct (kernel/fork.c:250) [ 1853.817250] put_ctx (include/linux/sched.h:1857 kernel/events/core.c:899) [ 1853.817250] find_get_context (kernel/events/core.c:908 kernel/events/core.c:3161) [ 1853.817250] SYSC_perf_event_open (kernel/events/core.c:7200) [ 1853.817250] SyS_perf_event_open (kernel/events/core.c:7064) [ 1853.817250] tracesys (arch/x86/kernel/entry_64.S:542) [ 1853.817250] [ 1853.817250] other info that might help us debug this: [ 1853.817250] [ 1853.817250] Chain exists of: &(&pool->lock)->rlock --> &rq->lock --> &ctx->lock [ 1853.817250] Possible unsafe locking scenario: [ 1853.817250] [ 1853.817250] CPU0 CPU1 [ 1853.817250] ---- ---- [ 1853.817250] lock(&ctx->lock); [ 1853.817250] lock(&rq->lock); [ 1853.817250] lock(&ctx->lock); [ 1853.817250] lock(&(&pool->lock)->rlock); [ 1853.817250] [ 1853.817250] *** DEADLOCK *** [ 1853.817250] [ 1853.817250] 2 locks held by trinity-c3/29876: [ 1853.817250] #0: ([watchdog] 4141316 iterations. [F:3363768 S:777188 HI:15026] cpu_hotplug.lock){++++++}, at: get_online_cpus (kernel/cpu.c:90) [ 1853.924037] #1: (&ctx->lock){-.-...}, at: perf_lock_task_context (kernel/events/core.c:983) [ 1853.924037] [ 1853.924037] stack backtrace: [ 1853.924037] CPU: 3 PID: 29876 Comm: trinity-c3 Tainted: G W 3.16.0-rc2-next-20140625-sasha-00023-g9bc0cd4-dirty #727 [ 1853.924037] ffffffffa1b4a790 ffff880219dcfa28 ffffffff9e524b4b 0000000000000003 [ 1853.924037] ffffffffa1b8f410 ffff880219dcfa78 ffffffff9e5176e0 0000000000000002 [ 1853.924037] ffff880219dcfb08 ffff880219dcfa78 ffff880206b73d40 ffff880206b73000 [ 1853.924037] Call Trace: [ 1853.924037] dump_stack (lib/dump_stack.c:52) [ 1853.924037] print_circular_bug (kernel/locking/lockdep.c:1216) [ 1853.924037] __lock_acquire (kernel/locking/lockdep.c:1840 kernel/locking/lockdep.c:1945 kernel/locking/lockdep.c:2131 kernel/locking/lockdep.c:3182) [ 1853.924037] ? kvm_clock_read (./arch/x86/include/asm/preempt.h:90 arch/x86/kernel/kvmclock.c:86) [ 1853.924037] ? sched_clock (./arch/x86/include/asm/paravirt.h:192 arch/x86/kernel/tsc.c:305) [ 1853.924037] ? put_lock_stats.isra.12 (./arch/x86/include/asm/preempt.h:98 kernel/locking/lockdep.c:254) [ 1853.924037] lock_acquire (./arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602) [ 1853.924037] ? __queue_work (kernel/workqueue.c:1339) [ 1853.924037] ? __lock_is_held (kernel/locking/lockdep.c:3516) [ 1853.924037] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151) [ 1853.924037] ? __queue_work (kernel/workqueue.c:1339) [ 1853.924037] __queue_work (kernel/workqueue.c:1339) [ 1853.924037] queue_work_on (kernel/workqueue.c:1417) [ 1853.924037] free_object (lib/debugobjects.c:209) [ 1853.924037] __debug_check_no_obj_freed (lib/debugobjects.c:715) [ 1853.924037] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) [ 1853.924037] ? free_task (kernel/fork.c:221) [ 1853.924037] debug_check_no_obj_freed (lib/debugobjects.c:727) [ 1853.924037] kmem_cache_free (mm/slub.c:2657 mm/slub.c:2686) [ 1853.924037] free_task (kernel/fork.c:221) [ 1853.924037] __put_task_struct (kernel/fork.c:250) [ 1853.924037] put_ctx (include/linux/sched.h:1857 kernel/events/core.c:899) [ 1853.924037] find_get_context (kernel/events/core.c:908 kernel/events/core.c:3161) [ 1853.924037] SYSC_perf_event_open (kernel/events/core.c:7200) [ 1853.924037] ? preempt_count_sub (kernel/sched/core.c:2606) [ 1853.924037] ? context_tracking_user_exit (./arch/x86/include/asm/paravirt.h:809 (discriminator 2) kernel/context_tracking.c:184 (discriminator 2)) [ 1853.924037] SyS_perf_event_open (kernel/events/core.c:7064) Thanks, Sasha ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: perf/workqueue: lockdep warning on process exit 2014-06-25 21:17 ` Sasha Levin @ 2014-07-07 13:45 ` Peter Zijlstra 0 siblings, 0 replies; 7+ messages in thread From: Peter Zijlstra @ 2014-07-07 13:45 UTC (permalink / raw) To: Sasha Levin; +Cc: Ingo Molnar, acme, paulus, Tejun Heo, LKML, Dave Jones [-- Attachment #1: Type: text/plain, Size: 1734 bytes --] On Wed, Jun 25, 2014 at 05:17:48PM -0400, Sasha Levin wrote: > [ 1853.817250] 2 locks held by trinity-c3/29876: > [ 1853.817250] #0: cpu_hotplug.lock){++++++}, at: get_online_cpus (kernel/cpu.c:90) > [ 1853.924037] #1: (&ctx->lock){-.-...}, at: perf_lock_task_context (kernel/events/core.c:983) > [ 1853.924037] > [ 1853.924037] Call Trace: > [ 1853.924037] lock_acquire (./arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602) > [ 1853.924037] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151) > [ 1853.924037] __queue_work (kernel/workqueue.c:1339) > [ 1853.924037] queue_work_on (kernel/workqueue.c:1417) > [ 1853.924037] free_object (lib/debugobjects.c:209) > [ 1853.924037] __debug_check_no_obj_freed (lib/debugobjects.c:715) > [ 1853.924037] debug_check_no_obj_freed (lib/debugobjects.c:727) > [ 1853.924037] kmem_cache_free (mm/slub.c:2657 mm/slub.c:2686) > [ 1853.924037] free_task (kernel/fork.c:221) > [ 1853.924037] __put_task_struct (kernel/fork.c:250) > [ 1853.924037] put_ctx (include/linux/sched.h:1857 kernel/events/core.c:899) > [ 1853.924037] find_get_context (kernel/events/core.c:908 kernel/events/core.c:3161) > [ 1853.924037] SYSC_perf_event_open (kernel/events/core.c:7200) > [ 1853.924037] ? preempt_count_sub (kernel/sched/core.c:2606) > [ 1853.924037] ? context_tracking_user_exit (./arch/x86/include/asm/paravirt.h:809 (discriminator 2) kernel/context_tracking.c:184 (discriminator 2)) > [ 1853.924037] SyS_perf_event_open (kernel/events/core.c:7064) Egads.. that shouldn't be possible. The only put_ctx() from find_get_context() is in the !ctx branch, and if perf_lock_task_context() returns NULL it shouldn't be holding ctx->lock. Does this reproduce? [-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 7+ messages in thread
* [tip:perf/urgent] perf: Fix lockdep warning on process exit 2014-06-23 14:12 ` Peter Zijlstra 2014-06-25 21:17 ` Sasha Levin @ 2014-07-16 19:19 ` tip-bot for Peter Zijlstra 1 sibling, 0 replies; 7+ messages in thread From: tip-bot for Peter Zijlstra @ 2014-07-16 19:19 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, sasha.levin, hpa, mingo, torvalds, peterz, acme, davej, tj, tglx Commit-ID: 4a1c0f262f88e2676fda80a6bf80e7dbccae1dcb Gitweb: http://git.kernel.org/tip/4a1c0f262f88e2676fda80a6bf80e7dbccae1dcb Author: Peter Zijlstra <peterz@infradead.org> AuthorDate: Mon, 23 Jun 2014 16:12:42 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Wed, 16 Jul 2014 13:18:42 +0200 perf: Fix lockdep warning on process exit Sasha Levin reported: > While fuzzing with trinity inside a KVM tools guest running the latest -next > kernel I've stumbled on the following spew: > > ====================================================== > [ INFO: possible circular locking dependency detected ] > 3.15.0-next-20140613-sasha-00026-g6dd125d-dirty #654 Not tainted > ------------------------------------------------------- > trinity-c578/9725 is trying to acquire lock: > (&(&pool->lock)->rlock){-.-...}, at: __queue_work (kernel/workqueue.c:1346) > > but task is already holding lock: > (&ctx->lock){-.....}, at: perf_event_exit_task (kernel/events/core.c:7471 kernel/events/core.c:7533) > > which lock already depends on the new lock. > 1 lock held by trinity-c578/9725: > #0: (&ctx->lock){-.....}, at: perf_event_exit_task (kernel/events/core.c:7471 kernel/events/core.c:7533) > > Call Trace: > dump_stack (lib/dump_stack.c:52) > print_circular_bug (kernel/locking/lockdep.c:1216) > __lock_acquire (kernel/locking/lockdep.c:1840 kernel/locking/lockdep.c:1945 kernel/locking/lockdep.c:2131 kernel/locking/lockdep.c:3182) > lock_acquire (./arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602) > _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151) > __queue_work (kernel/workqueue.c:1346) > queue_work_on (kernel/workqueue.c:1424) > free_object (lib/debugobjects.c:209) > __debug_check_no_obj_freed (lib/debugobjects.c:715) > debug_check_no_obj_freed (lib/debugobjects.c:727) > kmem_cache_free (mm/slub.c:2683 mm/slub.c:2711) > free_task (kernel/fork.c:221) > __put_task_struct (kernel/fork.c:250) > put_ctx (include/linux/sched.h:1855 kernel/events/core.c:898) > perf_event_exit_task (kernel/events/core.c:907 kernel/events/core.c:7478 kernel/events/core.c:7533) > do_exit (kernel/exit.c:766) > do_group_exit (kernel/exit.c:884) > get_signal_to_deliver (kernel/signal.c:2347) > do_signal (arch/x86/kernel/signal.c:698) > do_notify_resume (arch/x86/kernel/signal.c:751) > int_signal (arch/x86/kernel/entry_64.S:600) Urgh.. so the only way I can make that happen is through: perf_event_exit_task_context() raw_spin_lock(&child_ctx->lock); unclone_ctx(child_ctx) put_ctx(ctx->parent_ctx); raw_spin_unlock_irqrestore(&child_ctx->lock); And we can avoid this by doing the change below. I can't immediately see how this changed recently, but given that you say it's easy to reproduce, lets fix this. Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Dave Jones <davej@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140623141242.GB19860@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org> --- kernel/events/core.c | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index c46b02b..6b17ac1 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -7486,7 +7486,7 @@ __perf_event_exit_task(struct perf_event *child_event, static void perf_event_exit_task_context(struct task_struct *child, int ctxn) { struct perf_event *child_event, *next; - struct perf_event_context *child_ctx; + struct perf_event_context *child_ctx, *parent_ctx; unsigned long flags; if (likely(!child->perf_event_ctxp[ctxn])) { @@ -7511,6 +7511,15 @@ static void perf_event_exit_task_context(struct task_struct *child, int ctxn) raw_spin_lock(&child_ctx->lock); task_ctx_sched_out(child_ctx); child->perf_event_ctxp[ctxn] = NULL; + + /* + * In order to avoid freeing: child_ctx->parent_ctx->task + * under perf_event_context::lock, grab another reference. + */ + parent_ctx = child_ctx->parent_ctx; + if (parent_ctx) + get_ctx(parent_ctx); + /* * If this context is a clone; unclone it so it can't get * swapped to another process while we're removing all @@ -7521,6 +7530,13 @@ static void perf_event_exit_task_context(struct task_struct *child, int ctxn) raw_spin_unlock_irqrestore(&child_ctx->lock, flags); /* + * Now that we no longer hold perf_event_context::lock, drop + * our extra child_ctx->parent_ctx reference. + */ + if (parent_ctx) + put_ctx(parent_ctx); + + /* * Report the task dead after unscheduling the events so that we * won't get any samples after PERF_RECORD_EXIT. We can however still * get a few PERF_RECORD_READ events. ^ permalink raw reply related [flat|nested] 7+ messages in thread
end of thread, other threads:[~2014-07-16 19:20 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2014-06-16 14:24 perf/workqueue: lockdep warning on process exit Sasha Levin 2014-06-16 21:41 ` Sasha Levin 2014-06-17 15:57 ` Tejun Heo 2014-06-23 14:12 ` Peter Zijlstra 2014-06-25 21:17 ` Sasha Levin 2014-07-07 13:45 ` Peter Zijlstra 2014-07-16 19:19 ` [tip:perf/urgent] perf: Fix " tip-bot for Peter Zijlstra
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).