Date: Wed, 25 Jan 2017 14:02:07 +0100
From: Jiri Olsa
To: Peter Zijlstra
Cc: Jiri Olsa, lkml, Ingo Molnar, Andi Kleen, Alexander Shishkin,
	Arnaldo Carvalho de Melo, Vince Weaver
Subject: Re: [PATCH 4/4] perf/x86/intel: Throttle PEBS events only from pmi
Message-ID: <20170125130207.GA24586@krava>
References: <1482931866-6018-1-git-send-email-jolsa@kernel.org>
	<1482931866-6018-5-git-send-email-jolsa@kernel.org>
	<20170124164122.GL25813@worktop.programming.kicks-ass.net>
In-Reply-To: <20170124164122.GL25813@worktop.programming.kicks-ass.net>

On Tue, Jan 24, 2017 at 05:41:22PM +0100, Peter Zijlstra wrote:

SNIP

> So I don't particularly like these patches... they make a wee bit of a
> mess.
> 
> Under the assumption that draining a single event is on the same order
> of cost as a regular PMI, then accounting a drain of multiple events as
> an equal amount of interrupts makes sense.

I understood it more as protecting against interrupt-context abuse than
as accounting the samples here.. but yes, we throttle based on the
(hwc->interrupts >= max_samples_per_tick) check (simplified sketch
below), so I guess we should not cross max_samples_per_tick in any case.

> We should not disregard this work. Now it looks like both (BTS & PEBS)
> drain methods only count a single interrupt, that's something we maybe
> ought to fix too.

right, we account just a single hwc->interrupts++ per drain, and the
BTS drain does not have the throttling logic at all

I tried to put the perf_event_overflow call into both the PEBS and BTS
drains (below, untested). I think it's going to be slower, especially
for the BTS code, which bypassed the per-sample output path and wrote
all the samples into one output buffer at once; a toy model of that
cost difference follows after the diff.
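To spell out the check mentioned above: here is a simplified sketch of
the per-event accounting done on the generic overflow path (reduced
pseudo-code, not the exact kernel implementation -- the real one also
resets the count on each tick and emits PERF_RECORD_THROTTLE):

/*
 * Simplified sketch: every perf_event_overflow() call accounts one
 * interrupt against the event and reports throttling once the event
 * crosses max_samples_per_tick within the current tick.
 */
static int account_interrupt(struct hw_perf_event *hwc)
{
	hwc->interrupts++;

	if (unlikely(hwc->interrupts >= max_samples_per_tick)) {
		/* throttled: caller stops the event until the next tick */
		hwc->interrupts = MAX_INTERRUPTS;
		return 1;
	}

	return 0;
}

So a drain that bumps hwc->interrupts only once for a whole buffer can
output far more than max_samples_per_tick samples from a single PMI
without ever tripping this check.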
jirka


---
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index be202390bbd3..703bd82c69b1 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -520,10 +520,7 @@ int intel_pmu_drain_bts_buffer(void)
 	};
 	struct perf_event *event = cpuc->events[INTEL_PMC_IDX_FIXED_BTS];
 	struct bts_record *at, *base, *top;
-	struct perf_output_handle handle;
-	struct perf_event_header header;
 	struct perf_sample_data data;
-	unsigned long skip = 0;
 	struct pt_regs regs;
 
 	if (!event)
@@ -562,40 +559,18 @@ int intel_pmu_drain_bts_buffer(void)
 	 */
 	if (event->attr.exclude_kernel &&
 	    (kernel_ip(at->from) || kernel_ip(at->to)))
-		skip++;
-	}
-
-	/*
-	 * Prepare a generic sample, i.e. fill in the invariant fields.
-	 * We will overwrite the from and to address before we output
-	 * the sample.
-	 */
-	rcu_read_lock();
-	perf_prepare_sample(&header, &data, event, &regs);
-
-	if (perf_output_begin(&handle, event, header.size *
-			      (top - base - skip)))
-		goto unlock;
-
-	for (at = base; at < top; at++) {
-		/* Filter out any records that contain kernel addresses. */
-		if (event->attr.exclude_kernel &&
-		    (kernel_ip(at->from) || kernel_ip(at->to)))
 			continue;
 
 		data.ip		= at->from;
 		data.addr	= at->to;
 
-		perf_output_sample(&handle, &header, &data, event);
-	}
+		if (perf_event_overflow(event, &data, &regs)) {
+			x86_pmu_stop(event, 0);
+			break;
+		}
 
-	perf_output_end(&handle);
+	}
 
-	/* There's new data available. */
-	event->hw.interrupts++;
-	event->pending_kill = POLL_IN;
-unlock:
-	rcu_read_unlock();
 	return 1;
 }
 
@@ -1243,25 +1218,18 @@ static void __intel_pmu_pebs_event(struct perf_event *event,
 	    !(event->hw.flags & PERF_X86_EVENT_AUTO_RELOAD))
 		return;
 
-	while (count > 1) {
+	while (count > 0) {
 		setup_pebs_sample_data(event, iregs, at, &data, &regs);
-		perf_event_output(event, &data, &regs);
+
+		if (perf_event_overflow(event, &data, &regs)) {
+			x86_pmu_stop(event, 0);
+			break;
+		}
+
 		at += x86_pmu.pebs_record_size;
 		at = get_next_pebs_record_by_bit(at, top, bit);
 		count--;
 	}
-
-	setup_pebs_sample_data(event, iregs, at, &data, &regs);
-
-	/*
-	 * All but the last records are processed.
-	 * The last one is left to be able to call the overflow handler.
-	 */
-	if (perf_event_overflow(event, &data, &regs)) {
-		x86_pmu_stop(event, 0);
-		return;
-	}
-
 }
 
 static void intel_pmu_drain_pebs_core(struct pt_regs *iregs)
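
To make the cost concern above concrete, here is a toy userspace model
(hypothetical helper names, nothing from the kernel) of the two drain
strategies; it only counts how many output-buffer transactions each
variant opens:

#include <stdio.h>

static int transactions;

/* Stand-ins for the begin/output/end output-buffer primitives. */
static void output_begin(void)   { transactions++; }
static void output_end(void)     { }
static void output_sample(int i) { (void) i; }

/* Stand-in for perf_event_overflow(): a full transaction per record;
 * returning 1 would mean "throttled, stop the event". */
static int overflow_sample(int i)
{
	output_begin();
	output_sample(i);
	output_end();
	return 0;
}

int main(void)
{
	const int n = 1000;	/* records sitting in the BTS buffer */
	int i;

	/* old BTS drain: one transaction around the whole buffer */
	transactions = 0;
	output_begin();
	for (i = 0; i < n; i++)
		output_sample(i);
	output_end();
	printf("batched drain:    %d transaction(s)\n", transactions);

	/* patched drain: one transaction (plus throttle check) per record */
	transactions = 0;
	for (i = 0; i < n; i++) {
		if (overflow_sample(i))
			break;
	}
	printf("per-record drain: %d transaction(s)\n", transactions);

	return 0;
}

The batched drain opens a single transaction for the whole buffer while
the per-record variant opens one per sample, which is where I'd expect
most of the BTS slowdown to come from.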