From mboxrd@z Thu Jan 1 00:00:00 1970 From: Neil Leeder Subject: Re: [PATCH 2/2] soc: qcom: add l2 cache perf events driver Date: Fri, 10 Jun 2016 18:34:04 -0400 Message-ID: <7a8fd879-6f0b-3b73-013f-99b12ba5729f@codeaurora.org> References: <1464987812-14360-1-git-send-email-nleeder@codeaurora.org> <1464987812-14360-3-git-send-email-nleeder@codeaurora.org> <20160606095139.GC6831@leverpostej> <20160609155615.GE12201@leverpostej> <20160609194136.GZ30154@twins.programming.kicks-ass.net> Mime-Version: 1.0 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Return-path: Received: from smtp.codeaurora.org ([198.145.29.96]:55596 "EHLO smtp.codeaurora.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751724AbcFJWeJ (ORCPT ); Fri, 10 Jun 2016 18:34:09 -0400 In-Reply-To: <20160609194136.GZ30154@twins.programming.kicks-ass.net> Sender: linux-arm-msm-owner@vger.kernel.org List-Id: linux-arm-msm@vger.kernel.org To: Peter Zijlstra , Mark Rutland Cc: Catalin Marinas , WillDeacon , Ingo Molnar , Arnaldo Carvalho de Melo , linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, Mark Langsdorf , Mark Salter , Jon Masters , Timur Tabi , cov@codeaurora.org, nleeder@codeaurora.org On 6/9/2016 03:41 PM, Peter Zijlstra wrote: > On Thu, Jun 09, 2016 at 04:56:16PM +0100, Mark Rutland wrote: >>>>> +static irqreturn_t l2_cache__handle_irq(int irq_num, void *data) >>>>> +{ >>>>> + struct hml2_pmu *slice = data; >>>>> + u32 ovsr; >>>>> + int idx; >>>>> + struct pt_regs *regs; >>>>> + >>>>> + ovsr = hml2_pmu__getreset_ovsr(); >>>>> + if (!hml2_pmu__has_overflowed(ovsr)) >>>>> + return IRQ_NONE; >>>>> + >>>>> + regs = get_irq_regs(); >>>>> + >>>>> + for (idx = 0; idx < l2cache_pmu.num_counters; idx++) { >>>>> + struct perf_event *event = slice->events[idx]; >>>>> + struct hw_perf_event *hwc; >>>>> + struct perf_sample_data data; >>>>> + >>>>> + if (!event) >>>>> + continue; >>>>> + >>>>> + if (!hml2_pmu__counter_has_overflowed(ovsr, idx)) >>>>> + continue; >>>>> + >>>>> + l2_cache__event_update_from_slice(event, slice); >>>>> + hwc = &event->hw; >>>>> + >>>>> + if (is_sampling_event(event)) { >>>>> + perf_sample_data_init(&data, 0, hwc->last_period); >>>> >>>> I don't think sampling makes sense, given this is an uncore PMU and the >>>> events are triggered by other CPUs. >>> >>> There is origin filtering so events can be attributed to a CPU when sampling. >> >> Ok. I believe that's different from all other uncore PMUs we support >> (none of the drivers support sampling, certainly), so I'm not entirely >> sure how/if we can make use of that sanely and reliably. > > Right; because not only do you need to know which CPU originated the > event, the IRQ must also happen on that CPU. Simply knowing which CPU > triggered it is not enough for sampling. > >> For the timebeing, I think this sampling needs to go, and the event_init >> logic needs to reject sampling as with other uncore PMU drivers. > > Agreed. I want to make sure I understand what the concern is here. Given the hardware filter which restricts counting to events generated by a specific CPU, and an irq which is affine to that CPU, sampling and task mode would seem to work for a single perf use. Is the issue only related to multiple concurrent perf uses? > >> One thing I forgot to mention in my earlier comments is that as an >> uncore PMU you need to have task_ctx_nr = perf_invalid_context here >> also. > > For good reasons, uncore PMUs (as is the case here) count strictly more > than the effect of single CPUs (and thus also the current task). So > attributing it back to a task is nonsense. > Thanks, Neil -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. From mboxrd@z Thu Jan 1 00:00:00 1970 From: nleeder@codeaurora.org (Neil Leeder) Date: Fri, 10 Jun 2016 18:34:04 -0400 Subject: [PATCH 2/2] soc: qcom: add l2 cache perf events driver In-Reply-To: <20160609194136.GZ30154@twins.programming.kicks-ass.net> References: <1464987812-14360-1-git-send-email-nleeder@codeaurora.org> <1464987812-14360-3-git-send-email-nleeder@codeaurora.org> <20160606095139.GC6831@leverpostej> <20160609155615.GE12201@leverpostej> <20160609194136.GZ30154@twins.programming.kicks-ass.net> Message-ID: <7a8fd879-6f0b-3b73-013f-99b12ba5729f@codeaurora.org> To: linux-arm-kernel@lists.infradead.org List-Id: linux-arm-kernel.lists.infradead.org On 6/9/2016 03:41 PM, Peter Zijlstra wrote: > On Thu, Jun 09, 2016 at 04:56:16PM +0100, Mark Rutland wrote: >>>>> +static irqreturn_t l2_cache__handle_irq(int irq_num, void *data) >>>>> +{ >>>>> + struct hml2_pmu *slice = data; >>>>> + u32 ovsr; >>>>> + int idx; >>>>> + struct pt_regs *regs; >>>>> + >>>>> + ovsr = hml2_pmu__getreset_ovsr(); >>>>> + if (!hml2_pmu__has_overflowed(ovsr)) >>>>> + return IRQ_NONE; >>>>> + >>>>> + regs = get_irq_regs(); >>>>> + >>>>> + for (idx = 0; idx < l2cache_pmu.num_counters; idx++) { >>>>> + struct perf_event *event = slice->events[idx]; >>>>> + struct hw_perf_event *hwc; >>>>> + struct perf_sample_data data; >>>>> + >>>>> + if (!event) >>>>> + continue; >>>>> + >>>>> + if (!hml2_pmu__counter_has_overflowed(ovsr, idx)) >>>>> + continue; >>>>> + >>>>> + l2_cache__event_update_from_slice(event, slice); >>>>> + hwc = &event->hw; >>>>> + >>>>> + if (is_sampling_event(event)) { >>>>> + perf_sample_data_init(&data, 0, hwc->last_period); >>>> >>>> I don't think sampling makes sense, given this is an uncore PMU and the >>>> events are triggered by other CPUs. >>> >>> There is origin filtering so events can be attributed to a CPU when sampling. >> >> Ok. I believe that's different from all other uncore PMUs we support >> (none of the drivers support sampling, certainly), so I'm not entirely >> sure how/if we can make use of that sanely and reliably. > > Right; because not only do you need to know which CPU originated the > event, the IRQ must also happen on that CPU. Simply knowing which CPU > triggered it is not enough for sampling. > >> For the timebeing, I think this sampling needs to go, and the event_init >> logic needs to reject sampling as with other uncore PMU drivers. > > Agreed. I want to make sure I understand what the concern is here. Given the hardware filter which restricts counting to events generated by a specific CPU, and an irq which is affine to that CPU, sampling and task mode would seem to work for a single perf use. Is the issue only related to multiple concurrent perf uses? > >> One thing I forgot to mention in my earlier comments is that as an >> uncore PMU you need to have task_ctx_nr = perf_invalid_context here >> also. > > For good reasons, uncore PMUs (as is the case here) count strictly more > than the effect of single CPUs (and thus also the current task). So > attributing it back to a task is nonsense. > Thanks, Neil -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.