On Thu, May 01, 2014 at 02:49:01PM -0400, Vince Weaver wrote:
>
> OK, humor me a bit here.
>
> I'm looking at the buggy trace and comparing against a "good" trace where
> the bug doesn't happen.
>
> It is a race condition of sorts, because it's just a 10us or so
> interleaving of calls that causes the bug to happen or not.
>
> In the good trace:
>
>   [parent] __perf_event_task_sched_out (and hence perf_swevent_del)
>   [child]  perf_release
>
> In the buggy trace:
>
>   [child]  perf_release
>   [parent] __perf_event_task_sched_out (perf_swevent_del never happens)
>
>
> perf_swevent_del calls
>	hlist_del_rcu(event->hlist_entry)
> to remove the event from the swevent hlist.
>
> Now in theory perf_release() calls sw_perf_event_destroy() which you
> would think would also call the above.  Instead it does
>	swevent_hlist_put_cpu(event, cpu);
> which does all kinds of weird hash stuff that I don't follow.
>
> Should the above two be equivalent?  Is it reference counting in there
> with if (!--swhash->hlist_refcount) causing the issue?

perf_release()
  put_event()
    perf_remove_from_context()
      __perf_remove_from_context()
        event_sched_out()
          ->del()

is the path that would call ->del() and hlist_del_rcu().

Now perf_remove_from_context() only calls __perf_remove_from_context()
when the task is active somewhere, otherwise it simply calls
list_del_event().

Both perf_remove_from_context() and perf_event_context_sched_out() (as
called from __perf_event_task_sched_out) hold ctx->lock, so they should
be serialized against each other.

Clearly I'm missing something though, will go stare at the trace now.
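
For anyone following along, the reasoning above can be sketched as a
user-space toy. This is not the kernel code: every toy_* name is
hypothetical, a pthread mutex stands in for ctx->lock, and a bool stands
in for membership on the swevent hlist. It only models the behaviour the
serialization argument expects, namely that whichever side takes the
lock first, the event ends up off the hlist. It does not reproduce the
bug in the trace.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model. Names echo the kernel functions, but every
 * toy_* type and function here is hypothetical and heavily simplified. */
struct toy_event {
	bool on_hlist;	/* stands in for being linked on the swevent hlist */
	bool on_ctx;	/* stands in for being linked on the context's event list */
};

struct toy_ctx {
	pthread_mutex_t lock;		/* stands in for ctx->lock */
	bool task_active;		/* task currently scheduled in somewhere? */
	struct toy_event *event;	/* single event, NULL once removed */
};

/* Models ->del(): for software events this is where hlist_del_rcu() runs. */
static void toy_del(struct toy_event *e)
{
	e->on_hlist = false;
}

/* Models event_sched_out(): only an event that is scheduled in gets ->del(). */
static void toy_event_sched_out(struct toy_event *e)
{
	if (e->on_hlist)
		toy_del(e);
}

/* Models list_del_event(): detach the event from its context. */
static void toy_list_del_event(struct toy_ctx *ctx)
{
	ctx->event->on_ctx = false;
	ctx->event = NULL;
}

/* Release side: sched out first only if the task is active, then detach. */
static void toy_remove_from_context(struct toy_ctx *ctx)
{
	pthread_mutex_lock(&ctx->lock);
	if (ctx->event) {
		struct toy_event *e = ctx->event;

		if (ctx->task_active)
			toy_event_sched_out(e);
		toy_list_del_event(ctx);
		/* Invariant the argument relies on: a detached event is off the hlist. */
		assert(!e->on_hlist);
	}
	pthread_mutex_unlock(&ctx->lock);
}

/* Context-switch side: sched out whatever is still attached to the context. */
static void toy_task_sched_out(struct toy_ctx *ctx)
{
	pthread_mutex_lock(&ctx->lock);
	if (ctx->event)
		toy_event_sched_out(ctx->event);
	ctx->task_active = false;
	pthread_mutex_unlock(&ctx->lock);
}

/* Runs both sides in the given order; returns true if the event is
 * still on the hlist afterwards (i.e. a stale entry was left behind). */
bool toy_leaves_stale_entry(bool release_first)
{
	struct toy_event e = { .on_hlist = true, .on_ctx = true };
	struct toy_ctx ctx = {
		.lock = PTHREAD_MUTEX_INITIALIZER,
		.task_active = true,
		.event = &e,
	};

	if (release_first) {
		toy_remove_from_context(&ctx);
		toy_task_sched_out(&ctx);
	} else {
		toy_task_sched_out(&ctx);
		toy_remove_from_context(&ctx);
	}
	return e.on_hlist;
}
```

In this model neither ordering leaves a stale entry, so if the real
trace shows one, either the task-active check or the locking has to
differ from this simplified picture somewhere.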