From: Alexander Shishkin
To: Peter Zijlstra
Cc: Ingo Molnar, linux-kernel@vger.kernel.org, vince@deater.net,
 eranian@google.com, Arnaldo Carvalho de Melo
Subject: Re: [PATCH v2 2/5] perf: Free aux pages in unmap path
In-Reply-To: <20160314123837.GU6356@twins.programming.kicks-ass.net>
References: <1457098969-21595-1-git-send-email-alexander.shishkin@linux.intel.com>
 <1457098969-21595-3-git-send-email-alexander.shishkin@linux.intel.com>
 <20160314123837.GU6356@twins.programming.kicks-ass.net>
Date: Mon, 14 Mar 2016 16:04:44 +0200
Message-ID: <87fuvtyx37.fsf@ashishki-desk.ger.corp.intel.com>

Peter Zijlstra writes:

> On Fri, Mar 04, 2016 at 03:42:46PM +0200, Alexander Shishkin wrote:
>> @@ -4649,10 +4679,22 @@ static void perf_mmap_close(struct vm_area_struct *vma)
>>  	 */
>>  	if (rb_has_aux(rb) && vma->vm_pgoff == rb->aux_pgoff &&
>>  	    atomic_dec_and_mutex_lock(&rb->aux_mmap_count, &event->mmap_mutex)) {
>> +		/*
>> +		 * Stop all aux events that are writing to this here buffer,
>> +		 * so that we can free its aux pages and corresponding pmu
>> +		 * data. Note that after rb::aux_mmap_count dropped to zero,
>> +		 * they won't start any more (see perf_aux_output_begin()).
>> +		 */
>> +		perf_pmu_output_stop(event);
>
> So to me it seems like we're interested in rb, we don't particularly
> care about @event in this case.

Yeah, @event is used only for its rb and pmu down this path.

>> +	if (!has_aux(event))
>> +		return;
>> +
>
> 	if (!parent)
> 		parent = event;
>
>> +	if (rcu_dereference(event->rb) == rb)
>
> s/event/parent/
>
>> +		ro->err = __perf_event_stop(event);
>
>> +	else if (parent && rcu_dereference(parent->rb) == rb)
>> +		ro->err = __perf_event_stop(event);
>
> and these can go.. However..

You're right: it's the parent that's got the ->rb, but it may well be
the child that's actually running and writing data there, so we need to
stop any running children as well. I think I actually broke this bit
since the previous version; either way, it needs a comment explaining
what's going on.
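Roughly what I'm thinking of for the respin (a sketch only, untested;
the exact shape may still change):

static void __perf_event_output_stop(struct perf_event *event, void *data)
{
	struct perf_event *parent = event->parent;
	struct remote_output *ro = data;
	struct ring_buffer *rb = ro->rb;

	if (!has_aux(event))
		return;

	if (!parent)
		parent = event;

	/*
	 * With inheritance, it's the parent that holds the rb reference,
	 * but it's the child (@event here) that may be scheduled in and
	 * writing aux data right now, so check the parent's rb (falling
	 * back to the event itself) and stop the event we're looking at.
	 */
	if (rcu_dereference(parent->rb) == rb)
		ro->err = __perf_event_stop(event);
}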
>
>> +}
>> +
>> +static int __perf_pmu_output_stop(void *info)
>> +{
>> +	struct perf_event *event = info;
>> +	struct pmu *pmu = event->pmu;
>> +	struct perf_cpu_context *cpuctx = get_cpu_ptr(pmu->pmu_cpu_context);
>> +	struct remote_output ro = {
>> +		.rb = event->rb,
>> +	};
>> +
>> +	rcu_read_lock();
>> +	perf_event_aux_ctx(&cpuctx->ctx, __perf_event_output_stop, &ro);
>> +	if (cpuctx->task_ctx)
>> +		perf_event_aux_ctx(cpuctx->task_ctx, __perf_event_output_stop,
>> +				   &ro);
>> +	rcu_read_unlock();
>> +
>> +	return ro.err;
>> +}
>> +
>> +static void perf_pmu_output_stop(struct perf_event *event)
>> +{
>> +	int cpu, err;
>> +
>> +	/* better be thorough */
>> +	get_online_cpus();
>> +restart:
>> +	for_each_online_cpu(cpu) {
>> +		err = cpu_function_call(cpu, __perf_pmu_output_stop, event);
>> +		if (err)
>> +			goto restart;
>> +	}
>> +	put_online_cpus();
>> +}
>
> This seems wildly overkill, could we not iterate rb->event_list like we
> do for the normal buffer?

Actually we can; I've appended a rough sketch of what I mean at the end
of this mail. One problem though is that iterating rb::event_list
requires an rcu read section (or the irqsafe rb::event_lock), while we
also need to send IPIs from inside the walk. The normal buffer case
tears down rb::event_list as it goes, so it can close the rcu read
section right after it fetches one event from the list; here, however,
we must keep the list intact.

> Sure, we need to IPI for each event found, but that seems better than
> unconditionally sending IPIs to all CPUs.

Actually, won't it "often" be the case that the number of events is a
multiple of the number of cpus? The usual use case is one event per
task per cpu with inheritance enabled, in which case we'll zap multiple
events per IPI anyway.

Regards,
--
Alex
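The rb::event_list walk I have in mind, roughly (again a sketch only,
untested; the cpu==-1 handling via ->oncpu is my guess at how to pick
which cpu to IPI):

static void perf_pmu_output_stop(struct perf_event *event)
{
	struct perf_event *iter;
	int err, cpu;

restart:
	rcu_read_lock();
	list_for_each_entry_rcu(iter, &event->rb->event_list, rb_entry) {
		/*
		 * For a per-cpu event, IPI the cpu it's bound to; for a
		 * task (cpu==-1) event, peek at ->oncpu and skip it if
		 * it's not currently scheduled in anywhere.
		 */
		cpu = iter->cpu;
		if (cpu == -1)
			cpu = READ_ONCE(iter->oncpu);

		if (cpu == -1)
			continue;

		/*
		 * Unlike the normal buffer teardown, the list has to stay
		 * intact, so keep the rcu read section across the whole
		 * walk and start over if an IPI fails.
		 */
		err = cpu_function_call(cpu, __perf_pmu_output_stop, event);
		if (err) {
			rcu_read_unlock();
			goto restart;
		}
	}
	rcu_read_unlock();
}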