From: Alexander Shishkin
To: Peter Zijlstra
Cc: Ingo Molnar, linux-kernel@vger.kernel.org, vince@deater.net,
	eranian@google.com, johannes@sipsolutions.net,
	Arnaldo Carvalho de Melo
Subject: Re: [PATCH 4/7] perf: Free aux pages in unmap path
In-Reply-To: <20151204170206.GI17308@twins.programming.kicks-ass.net>
References: <1449138762-15194-1-git-send-email-alexander.shishkin@linux.intel.com>
	<1449138762-15194-5-git-send-email-alexander.shishkin@linux.intel.com>
	<20151204170206.GI17308@twins.programming.kicks-ass.net>
User-Agent: Notmuch/0.21 (http://notmuchmail.org) Emacs/24.5.1 (x86_64-pc-linux-gnu)
Date: Wed, 09 Dec 2015 11:57:51 +0200
Message-ID: <87si3cx7ts.fsf@ashishki-desk.ger.corp.intel.com>

Peter Zijlstra writes:

> Yuck, nasty problem. Also, I think it's broken. By not having
> mmap_mutex around the whole thing, notably rb_free_aux(), you can race
> against mmap().
>
> What seems possible now is that:
>
>	mmap(aux);	// rb->aux_mmap_count == 1
>
>	munmap(aux)
>	  atomic_dec_and_mutex_lock(&rb->aux_mmap_count,
>				    &event->mmap_mutex);	// == 0
>	  mutex_unlock(&event->mmap_mutex);
>
>				mmap(aux)
>				  if (rb_has_aux())
>				    atomic_inc(&rb->aux_mmap_count); // == 1
>
>	  rb_free_aux();	// oops!!

Wait, this isn't actually a problem: we can hold mmap_mutex over
rb_free_aux(), as the current code already does.
My patch got this wrong, though; there's really no reason to drop the
mutex before rb_free_aux().

> So I thought that pulling all the aux bits out from the ring_buffer
> struct, such that we have rb->aux, would solve the issue in that we can
> then fix mmap() to have the same retry loop as for event->rb.
>
> And while that fixes that race (I almost had that patch complete -- I
> might still send it out, just so you can see what it looks like), it
> doesn't solve the complete problem, I don't think.

I was toying with that some time ago, but I couldn't really see
benefits that would justify the hassle.

> Because in that case, you want the event to start again on the new
> buffer, and I think it's possible we end up calling ->start() before
> we've issued the ->stop(), and that would be BAD (tm).

So if we just hold the mmap_mutex over rb_free_aux(), this won't
happen, right?

> The only solution I've come up with is:
>
>	struct rb_aux *aux = rb->aux;
>
>	if (aux && vma->vm_pgoff == aux->pgoff) {
>		ctx = perf_event_ctx_lock(event);
>		if (atomic_dec_and_mutex_lock(&aux->mmap_count,
>					      &event->mmap_mutex)) {
>			/* we now hold both ctx::mutex and event::mmap_mutex */
>			rb->aux = NULL;
>			ring_buffer_put(rb); /* aux had a reference */
>			_perf_event_stop(event);

Here we really need to ensure that none of the events on the
rb->event_list is running, not just the parent, and that still
presents complications wrt the irqsave rb->event_lock, even with your
new idea for perf_event_stop().
How about something like this to stop the writers:

static int __ring_buffer_output_stop(void *info)
{
	struct ring_buffer *rb = info;
	struct perf_event *event;

	spin_lock(&rb->event_lock);
	list_for_each_entry_rcu(event, &rb->event_list, rb_entry) {
		if (event->state != PERF_EVENT_STATE_ACTIVE)
			continue;

		event->pmu->stop(event, PERF_EF_UPDATE);
	}
	spin_unlock(&rb->event_lock);

	return 0;
}

static void perf_event_output_stop(struct perf_event *event)
{
	struct ring_buffer *rb = event->rb;

	lockdep_assert_held(&event->mmap_mutex);

	if (event->cpu == -1)
		perf_event_stop(event);
	else
		cpu_function_call(event->cpu, __ring_buffer_output_stop, rb);
}

And then in the mmap_close:

	if (rb_has_aux(rb) && vma->vm_pgoff == rb->aux_pgoff &&
	    atomic_dec_and_mutex_lock(&rb->aux_mmap_count,
				      &event->mmap_mutex)) {
		perf_event_output_stop(event);

		/* undo the mlock accounting here */

		rb_free_aux(rb);
		mutex_unlock(&event->mmap_mutex);
	}

Regards,
--
Alex