Date: Thu, 8 Sep 2016 10:06:13 +0200 (CEST)
From: Thomas Gleixner
To: Binoy Jayan
Cc: "Steven Rostedt (Red Hat)", Ingo Molnar, Daniel Wagner, Arnd Bergmann,
    linux-kernel@vger.kernel.org, Masami
Subject: Re: [PATCH v6 3/4] tracing: Add trace_irqsoff tracepoints
In-Reply-To: <1473246835-30075-4-git-send-email-binoy.jayan@linaro.org>
References: <1473246835-30075-1-git-send-email-binoy.jayan@linaro.org>
            <1473246835-30075-4-git-send-email-binoy.jayan@linaro.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 7 Sep 2016, Binoy Jayan wrote:

> This captures only the latencies introduced by disabled irqs and
> preemption. Additional per process data has to be captured to calculate
> the effective latencies introduced for individual processes.

And what is the additional per-process data, and how is it captured and
used?

>  static DEFINE_PER_CPU(int, tracing_cpu);
> +static DEFINE_PER_CPU(cycle_t, lat_ts[LT_MAX]);
>
>  static DEFINE_RAW_SPINLOCK(max_trace_lock);
>
> @@ -419,9 +425,23 @@ stop_critical_timing(unsigned long ip, unsigned long parent_ip)
>  	atomic_dec(&data->disabled);
>  }
>
> +static inline void latency_preempt_timing_start(enum latency_type ltype)
> +{
> +	this_cpu_write(lat_ts[ltype], (cycle_t) trace_clock_local());

What is this silly type cast for? Why can't you just use u64?
> +}
> +
> +static inline void latency_preempt_timing_stop(enum latency_type type)
> +{
> +	trace_latency_preempt(type,
> +		(cycle_t) trace_clock_local() - this_cpu_read(lat_ts[type]));

And then of course you use a completely different data type in the trace
itself.

> +DECLARE_EVENT_CLASS(latency_template,
> +	TP_PROTO(int ltype, cycles_t latency),

Are you sure that you know what you are doing here? If yes, then please
explain it in the form of comments so mere mortals can understand it as
well.

>  /* start and stop critical timings used to for stoppage (in idle) */
>  void start_critical_timings(void)
>  {
> +	if (unlikely(trace_latency_preempt_enabled()))
> +		latency_preempt_timing_start(LT_CRITTIME);

I doubt that this conditional is less expensive than a simple
unconditional store to a per-cpu variable.

> +
>  	if (preempt_trace() || irq_trace())
>  		start_critical_timing(CALLER_ADDR0, CALLER_ADDR1);
>  }
> @@ -431,6 +451,9 @@ void stop_critical_timings(void)
>  {
>  	if (preempt_trace() || irq_trace())
>  		stop_critical_timing(CALLER_ADDR0, CALLER_ADDR1);
> +
> +	if (unlikely(trace_latency_preempt_enabled()))
> +		latency_preempt_timing_stop(LT_CRITTIME);

And this is silly as well. You can put the whole evaluation into the
trace event assignment so the tracer core will handle that conditional.

Aside of that, it is silly to evaluate trace_clock_local() for the actual
tracepoint, simply because that time is already stored in the tracepoint
itself. The flow here is:

	event = trace_event_buffer_lock_reserve();
	entry = ring_buffer_event_data(event);
	{ <assign>; }	<-- Here we assign the entries by the __field and
			    __array macros.

So you should talk to Steven about having a way to retrieve that time
from the entry itself in a generic way.

Thanks,

	tglx