linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Daniel Bristot de Oliveira <bristot@redhat.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org, Phil Auld <pauld@redhat.com>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Kate Carcia <kcarcia@redhat.com>,
	Jonathan Corbet <corbet@lwn.net>, Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Alexandre Chartre <alexandre.chartre@oracle.com>,
	Clark Willaims <williams@redhat.com>,
	John Kacur <jkacur@redhat.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	linux-doc@vger.kernel.org
Subject: Re: [PATCH V3 9/9] tracing: Add timerlat tracer
Date: Sat, 12 Jun 2021 10:47:16 +0200	[thread overview]
Message-ID: <80f74da1-a9a7-2af9-dd4f-c22f312214df@redhat.com> (raw)
In-Reply-To: <20210611164855.252f35fb@gandalf.local.home>

On 6/11/21 10:48 PM, Steven Rostedt wrote:
> On Fri, 11 Jun 2021 16:13:36 +0200
> Daniel Bristot de Oliveira <bristot@redhat.com> wrote:
> 
>>>> +
>>>> +#ifdef CONFIG_STACKTRACE
>>>> +/*
>>>> + * Stack trace will take place only at IRQ level, so, no need
>>>> + * to control nesting here.
>>>> + */
>>>> +struct trace_stack {
>>>> +	int stack_size;
>>>> +	int nr_entries;
>>>> +	unsigned long           calls[PAGE_SIZE];  
>>>
>>> That is rather big. It's 8 * PAGE_SIZE. I don't think that's what you really
>>> wanted.  
>>
>> no, I did not want that... is 256 a good number?
> 
> Sure. But make it a macro.
> 
> #define MAX_CALLS 256
> 
> or something like that.

Ack!

>>
>>>> +};
>>>> +
>>>> +static DEFINE_PER_CPU(struct trace_stack, trace_stack);
>>>> +
>>>> +/**  
>>>
>>> Again, remove the KernelDoc notation of /**, or make it real kerneldoc
>>> notation.  
>>
>> Fixed!
>>
>> [...]
>>
>>>>   *
>>>> @@ -801,6 +1017,22 @@ void trace_softirq_exit_callback(void *data, unsigned int vec_nr)
>>>>  	if (!osn_var->sampling)
>>>>  		return;
>>>>  
>>>> +#ifdef CONFIG_TIMERLAT_TRACER
>>>> +	/*
>>>> +	 * If the timerlat is enabled, but the irq handler did
>>>> +	 * not run yet enabling timerlat_tracer, do not trace.
>>>> +	 */
>>>> +	if (unlikely(osnoise_data.timerlat_tracer)) {
>>>> +		struct timerlat_variables *tlat_var;
>>>> +		tlat_var = this_cpu_tmr_var();
>>>> +		if (!tlat_var->tracing_thread) {  
>>>
>>> What happens if the timer interrupt triggers here?  
>>
>> The tracer will not report the softirq overhead. But at this point, the softirq
>> is returning, and the duration would be from this time to...
>>
>>
>>
>>>> +			osn_var->softirq.arrival_time = 0;
>>>> +			osn_var->softirq.delta_start = 0;
>>>> +			return;
>>>> +		}
>>>> +	}
>>>> +#endif
>>>> +
>>>>  	duration = get_int_safe_duration(osn_var, &osn_var->softirq.delta_start);  
>>
>> here.
>>
>> We can disable interrupts to avoid this issue. But the question is, is it worth
>> to disable interrupts to avoid this problem?
>>
>>>>  	trace_softirq_noise(vec_nr, osn_var->softirq.arrival_time, duration);
>>>>  	cond_move_thread_delta_start(osn_var, duration);
>>>> @@ -893,6 +1125,18 @@ thread_exit(struct osnoise_variables *osn_var, struct task_struct *t)
>>>>  	if (!osn_var->sampling)
>>>>  		return;
>>>>  
>>>> +#ifdef CONFIG_TIMERLAT_TRACER
>>>> +	if (osnoise_data.timerlat_tracer) {
>>>> +		struct timerlat_variables *tlat_var;
>>>> +		tlat_var = this_cpu_tmr_var();
>>>> +		if (!tlat_var->tracing_thread) {  
>>>
>>> Or here?  
>>
>> The problem that can happen with the softirq cannot happen here: this code runs
>> with interrupts disabled on __schedule() (it is hooked to the sched_switch).
>>
>>>> +			osn_var->thread.delta_start = 0;
>>>> +			osn_var->thread.arrival_time = 0;
>>>> +			return;
>>>> +		}
>>>> +	}
>>>> +#endif
>>>> +
>>>>  	duration = get_int_safe_duration(osn_var, &osn_var->thread.delta_start);
>>>>  
>>>>  	trace_thread_noise(t, osn_var->thread.arrival_time, duration);
>>>> @@ -1182,6 +1426,197 @@ static int osnoise_main(void *data)
>>>>  	return 0;
>>>>  }
>>>>  
>>>> +#ifdef CONFIG_TIMERLAT_TRACER
>>>> +/**
>>>> + * timerlat_irq - hrtimer handler for timerlat.
>>>> + */
>>>> +static enum hrtimer_restart timerlat_irq(struct hrtimer *timer)
>>>> +{
>>>> +	struct osnoise_variables *osn_var = this_cpu_osn_var();
>>>> +	struct trace_array *tr = osnoise_trace;
>>>> +	struct timerlat_variables *tlat;
>>>> +	struct timerlat_sample s;
>>>> +	u64 now;
>>>> +	u64 diff;
>>>> +
>>>> +	/*
>>>> +	 * I am not sure if the timer was armed for this CPU. So, get
>>>> +	 * the timerlat struct from the timer itself, not from this
>>>> +	 * CPU.
>>>> +	 */
>>>> +	tlat = container_of(timer, struct timerlat_variables, timer);
>>>> +
>>>> +	now = ktime_to_ns(hrtimer_cb_get_time(&tlat->timer));
>>>> +
>>>> +	/*
>>>> +	 * Enable the osnoise: events for thread an softirq.
>>>> +	 */
>>>> +	tlat->tracing_thread = true;
>>>> +
>>>> +	osn_var->thread.arrival_time = time_get();
>>>> +
>>>> +	/*
>>>> +	 * A hardirq is running: the timer IRQ. It is for sure preempting
>>>> +	 * a thread, and potentially preempting a softirq.
>>>> +	 *
>>>> +	 * At this point, it is not interesting to know the duration of the
>>>> +	 * preempted thread (and maybe softirq), but how much time they will
>>>> +	 * delay the beginning of the execution of the timer thread.
>>>> +	 *
>>>> +	 * To get the correct (net) delay added by the softirq, its delta_start
>>>> +	 * is set as the IRQ one. In this way, at the return of the IRQ, the delta
>>>> +	 * start of the sofitrq will be zeroed, accounting then only the time
>>>> +	 * after that.
>>>> +	 *
>>>> +	 * The thread follows the same principle. However, if a softirq is
>>>> +	 * running, the thread needs to receive the softirq delta_start. The
>>>> +	 * reason being is that the softirq will be the last to be unfolded,
>>>> +	 * resseting the thread delay to zero.
>>>> +	 */
>>>> +#ifndef CONFIG_PREEMPT_RT
>>>> +	if (osn_var->softirq.delta_start) {
>>>> +		copy_int_safe_time(osn_var, &osn_var->thread.delta_start,
>>>> +				   &osn_var->softirq.delta_start);  
>>>
>>> Isn't softirq.delta_start going to be zero here? It doesn't look to get
>>> updated until you set tracing_thread to true, but that happens here, and as
>>> this is in a interrupt context, there will not be a softirq happening
>>> between the setting of that to true to this point.  
>>
>> No... on the timerlat, the "sampling" is always on. And the
>> osnoise_data.timerlat_tracer is only checked at the softirq return, so the
>> softirq entry always set set the delta_start.
> 
> OK, I was confused by the timerlat using the "__osnoise_tracer_start()". If
> timerlat is going to use that, perhaps we need to rename it, because the
> "osnoise" is one tracer, and its confusing that the "timerlat" is using
> functions called "*_osnoise_*". I was thinking that those functions were
> only for the osnoise tracer and not part of the timerlat tracer, and
> ignored them when looking at what the timerlat tracer was doing.
> 
> Can we rename that to simply "start_latency_tracing()" or something more
> generic.

right, also considering that we will (I hope) have the rtsl next, and other
ideas coming around the usage of osnoise: events on other tracers, I also think
should a more generic term to the events. Indeed, on rtsl they have a different
name.

Thinking only about the instrumentation/events, what they are tracking is the
execution time. So how about naming them as:

exec_time:thread
exec_time:irq

Also adding that, although here we measure the execution time of "task" context,
on rtsl we have other kinds of "windows" that they measure, for instance, the
poid window (Preemption or IRQ disabled). So, the term exec time also fits there.

?

>>
>>>> +
>>>> +		copy_int_safe_time(osn_var, &osn_var->softirq.delta_start,
>>>> +				    &osn_var->irq.delta_start);
>>>> +	} else {
>>>> +		copy_int_safe_time(osn_var, &osn_var->thread.delta_start,
>>>> +				    &osn_var->irq.delta_start);
>>>> +	}
>>>> +#else /* CONFIG_PREEMPT_RT */
>>>> +	/*
>>>> +	 * The sofirqs run as threads on RT, so there is not need
>>>> +	 * to keep track of it.
>>>> +	 */
>>>> +	copy_int_safe_time(osn_var, &osn_var->thread.delta_start, &osn_var->irq.delta_start);
>>>> +#endif /* CONFIG_PREEMPT_RT */
>>>> +
>>>> +	/*
>>>> +	 * Compute the current time with the expected time.
>>>> +	 */
>>>> +	diff = now - tlat->abs_period;
>>>> +
>>>> +	tlat->count++;
>>>> +	s.seqnum = tlat->count;
>>>> +	s.timer_latency = diff;
>>>> +	s.context = IRQ_CONTEXT;
>>>> +
>>>> +	trace_timerlat_sample(&s);
>>>> +
>>>> +	/* Keep a running maximum ever recorded os noise "latency" */
>>>> +	if (diff > tr->max_latency) {
>>>> +		tr->max_latency = diff;
>>>> +		latency_fsnotify(tr);
>>>> +	}
>>>> +
>>>> +	if (osnoise_data.stop_tracing_in)
>>>> +		if (time_to_us(diff) >= osnoise_data.stop_tracing_in)
>>>> +			osnoise_stop_tracing();
>>>> +
>>>> +	wake_up_process(tlat->kthread);
>>>> +
>>>> +#ifdef CONFIG_STACKTRACE
>>>> +	if (osnoise_data.print_stack)
>>>> +		timerlat_save_stack(0);
>>>> +#endif  
>>>
>>> No need for the #ifdef above. timerlat_save_stack() is defined as a nop
>>> when not enabled, and the compiler will just optimize this out.  
>>
>> The osnoise_data.print_stack is ifdefed, should I remove it from ifdef?
> 
> Well, the above ifdef is for STACKTRACE not for TIMERLAT_TRACER, which
> encompasses all of this. And the "timerlat_save_stack()" is a nop when
> STACKTRACE is not defined. So no.

Ooooppppsss, right, miss attention from my side.

-- Daniel

> -- Steve
> 
>>
>>>   
>>>> +
>>>> +	return HRTIMER_NORESTART;
>>>> +}
>>>> +
> 


  reply	other threads:[~2021-06-12  8:47 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-14 20:51 [PATCH V3 0/9] hwlat improvements and osnoise/timerlat tracers Daniel Bristot de Oliveira
2021-05-14 20:51 ` [PATCH V3 1/9] tracing/hwlat: Fix Clark's email Daniel Bristot de Oliveira
2021-05-14 20:51 ` [PATCH V3 2/9] tracing/hwlat: Implement the mode config option Daniel Bristot de Oliveira
2021-06-03 20:11   ` Steven Rostedt
2021-05-14 20:51 ` [PATCH V3 3/9] tracing/hwlat: Switch disable_migrate to mode none Daniel Bristot de Oliveira
2021-05-14 20:51 ` [PATCH V3 4/9] tracing/hwlat: Implement the per-cpu mode Daniel Bristot de Oliveira
2021-05-27 11:58   ` Juri Lelli
2021-05-27 12:29     ` Daniel Bristot de Oliveira
2021-06-03 21:17   ` Steven Rostedt
2021-06-04 15:31     ` Daniel Bristot de Oliveira
2021-05-14 20:51 ` [PATCH V3 5/9] tracing/trace: Add a generic function to read/write u64 values from tracefs Daniel Bristot de Oliveira
2021-06-03 21:22   ` Steven Rostedt
2021-06-04 16:05     ` Daniel Bristot de Oliveira
2021-06-04 16:18       ` Steven Rostedt
2021-06-04 16:34         ` Daniel Bristot de Oliveira
2021-05-14 20:51 ` [PATCH V3 6/9] trace/hwlat: Use the generic function to read/write width and window Daniel Bristot de Oliveira
2021-06-03 21:27   ` Steven Rostedt
2021-06-04 16:36     ` Daniel Bristot de Oliveira
2021-06-04 20:50       ` Steven Rostedt
2021-05-14 20:51 ` [PATCH V3 7/9] tracing: Add __print_ns_to_secs() and __print_ns_without_secs() helpers Daniel Bristot de Oliveira
2021-06-03 21:29   ` Steven Rostedt
2021-06-04  4:19     ` Joe Perches
2021-06-04 16:21       ` Steven Rostedt
2021-06-04 19:09         ` [PATCH] treewide: Add missing semicolons to __assign_str uses Joe Perches
2021-06-04 19:38         ` Joe Perches
2021-06-07 23:18           ` Jason Gunthorpe
2021-06-12 15:42         ` [PATCH V2] " Joe Perches
2021-06-12 23:11           ` Steven Rostedt
2021-06-30 11:28           ` Joe Perches
2021-06-30 12:22             ` Steven Rostedt
2021-06-04 16:07     ` [PATCH V3 7/9] tracing: Add __print_ns_to_secs() and __print_ns_without_secs() helpers Daniel Bristot de Oliveira
2021-05-14 20:51 ` [PATCH V3 8/9] tracing: Add osnoise tracer Daniel Bristot de Oliveira
2021-06-03 21:31   ` Steven Rostedt
2021-06-04 21:28   ` Steven Rostedt
2021-06-07 12:00     ` Daniel Bristot de Oliveira
2021-06-07 15:47       ` Steven Rostedt
2021-06-08 15:24         ` Daniel Bristot de Oliveira
2021-06-08 17:17     ` Daniel Bristot de Oliveira
2021-06-08 17:39       ` Steven Rostedt
2021-06-08 19:33         ` Daniel Bristot de Oliveira
2021-06-08 19:42           ` Steven Rostedt
2021-06-09 12:14     ` Daniel Bristot de Oliveira
2021-06-09 13:03       ` Steven Rostedt
2021-06-09 13:44         ` Daniel Bristot de Oliveira
2021-05-14 20:51 ` [PATCH V3 9/9] tracing: Add timerlat tracer Daniel Bristot de Oliveira
2021-06-08  1:36   ` Steven Rostedt
2021-06-11 12:59     ` Daniel Bristot de Oliveira
2021-06-11 20:03       ` Steven Rostedt
2021-06-12  9:41         ` Daniel Bristot de Oliveira
2021-06-12 23:06           ` Steven Rostedt
2021-06-11 14:13     ` Daniel Bristot de Oliveira
2021-06-11 20:48       ` Steven Rostedt
2021-06-12  8:47         ` Daniel Bristot de Oliveira [this message]
2021-06-12 23:09           ` Steven Rostedt
2021-06-15  8:18             ` Daniel Bristot de Oliveira
2021-05-27 12:07 ` [PATCH V3 0/9] hwlat improvements and osnoise/timerlat tracers Juri Lelli
2021-05-29  2:16   ` Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=80f74da1-a9a7-2af9-dd4f-c22f312214df@redhat.com \
    --to=bristot@redhat.com \
    --cc=alexandre.chartre@oracle.com \
    --cc=bigeasy@linutronix.de \
    --cc=corbet@lwn.net \
    --cc=jkacur@redhat.com \
    --cc=juri.lelli@redhat.com \
    --cc=kcarcia@redhat.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pauld@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=williams@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).