All of lore.kernel.org
 help / color / mirror / Atom feed
From: Daniel Bristot de Oliveira <bristot@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>,
	Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Juri Lelli <juri.lelli@gmail.com>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-rt-users <linux-rt-users@vger.kernel.org>
Subject: Re: [PATCH 3/4] sched/deadline: Tracepoints for deadline scheduler
Date: Tue, 23 Feb 2016 13:19:18 -0300	[thread overview]
Message-ID: <56CC8686.5090404@redhat.com> (raw)
In-Reply-To: <20160223104408.GO6357@twins.programming.kicks-ass.net>



On 02/23/2016 07:44 AM, Peter Zijlstra wrote:
>>> Worse, the proposed tracepoints are atrocious, look at crap like this:
>>> > > 
>>>> > > > +		if (trace_sched_deadline_yield_enabled()) {
>>>> > > > +			u64 delta_exec = rq_clock_task(rq) - p->se.exec_start;
>>>> > > > +			/* Subtract the last run till now */
>>>> > > > +			if (likely((s64)delta_exec > 0))
>>>> > > > +				p->dl.runtime -= delta_exec;
>>>> > > > +			trace_sched_deadline_yield(&p->dl);
>>>> > > > +		}  
>>> > > 
>>> > > tracepoints should _NEVER_ change state, ever.
>> > 
>> > Heh, it's not really changing state. The code directly after this is:
>> > 
>> > 	p->dl.runtime = 0;
> Yes, it more or less 'works', but its still atrocious shite. Its the
> worst kind of anti pattern possible.
> 
> Suppose someone comes and removes that line, and ignores the tracepoint
> stuff, because, hell its a tracepoint, those don't modify stuff.
> 
> Its just really, utterly bad practice.

It is possible to clean up this by removing the "p->dl.runtime = 0;"
from yield_task_dl(), to carry the dl_runtime until update_curr_dl().
We end up not proposing this change because we thought it could be
too much intrusive. But seems that it was the correct approach.

Btw, your patch for "Always calculate end of period on sched_yield()"
already do the necessary clean up, so in the possible next version of
these tracepoint the hack will disappear. 

>>> > > So tell me why these specific tracepoints and why the existing ones
>>> > > could not be extended to include this information. For example, why a
>>> > > trace_sched_dealine_yield, and not a generic trace_sched_yield() that
>>> > > works for all classes.
>> > 
>> > But what about reporting actual runtime, and when the next period will
>> > come. That only matters for deadline.
> How is that an answer to the question? Are you implying a generic
> trace_sched_yield() call could not do this?

I agree that a trace_sched_yield() could partially do this. But each
scheduler would have its own specific data to be printed, and it is hard
to define how many and the type of these data.

>>> > > But do not present me with a bunch of random arse, hacked together
>>> > > tracepoints and tell me they might be useful, maybe.
>> > 
>> > 
>> > They ARE useful. These are the tracepoints I'm currently using to
>> > debug the deadline scheduler with. They have been indispensable for my
>> > current work.
> They are, most obviously, a hacked together debug session for sure. This
> is _NOT_ what you commit.
> 
> Now ideally we'd do something like the below, but because trainwreck, we
> cannot actually do this I think :-(

I do liked this patch, but I agree that bad things can happen on
applications that already use sched:sched_switch :-(.
 
> I really dislike tracepoints, and I'm >.< close to proposing a patch
> removing them all from the scheduler.

Scheduler tracepoints become such a standard not only for debugging the
kernel, but also for understanding performance issues and finding
optimizations for user-space tasks, e.g. finding the best way to spread
task among processors on a large NUMA box. Many people would cry
if sched tracepoints disappears :'-(.

> @@ -132,33 +177,63 @@ TRACE_EVENT(sched_switch,
>  	TP_STRUCT__entry(
>  		__array(	char,	prev_comm,	TASK_COMM_LEN	)
>  		__field(	pid_t,	prev_pid			)
> -		__field(	int,	prev_prio			)
> +		__field(	int,	prev_policy			)
> +		__field(	s64,	prev_f1				)
> +		__field(	u64,	prev_f2				)
> +		__field(	u64,	prev_f3				)

This helps me to explain why did I chose to create deadline specific
tracepoints.

It is not possible to define in advance a common set of parameters to be
traced/printed for all future (exotic) scheduler, because each scheduler
will use its own set of parameters. Hence, the number and type of the
parameter can vary. How many fields do we need? and which are the types
of these fields? What if a scheduler needs two s64 fields and no u64?

Moreover, it is not possible to define the changes that will be needed
on classical schedulers algorithms for them to fit on Linux's
abstractions and restrictions. Therefore, it is not safe to mention
that LLF would (theoretically) use the same set of parameters of a EDF
scheduler, because the theoretical LLF would need to be modified
to fit on Linux. For example, there is no throttling concept on the
classical deadline scheduler. So, the proposed tracepoints are only valid
for Linux's sched deadline scheduler (deadline scheduler + CBS).

That is why I believe that it would be a better approach to have
scheduler specific tracepoints for the sched_*_entity particularities
and sched_* specific points interest.

Finally, I was based in the same approach used on the fair scheduler's
sched:sched_stat* tracepoints: they report sched_entity data on fair.c,
and my patches report sched_deadline_entity data on deadline.c.

  parent reply	other threads:[~2016-02-23 16:19 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-02-22 17:08 [PATCH 0/4] Tracepoints for deadline scheduler Daniel Bristot de Oliveira
2016-02-22 17:08 ` [PATCH 1/4] tracing: Add __print_ns_to_secs() and __print_ns_without_secs() helpers Daniel Bristot de Oliveira
2016-02-22 17:08 ` [PATCH 2/4] sched: Move deadline container_of() helper functions into sched.h Daniel Bristot de Oliveira
2016-02-22 17:08 ` [PATCH 3/4] sched/deadline: Tracepoints for deadline scheduler Daniel Bristot de Oliveira
2016-02-22 17:32   ` Peter Zijlstra
2016-02-22 17:48     ` Steven Rostedt
2016-02-22 20:11       ` Daniel Bristot de Oliveira
2016-02-22 21:30       ` Peter Zijlstra
2016-02-22 22:30         ` Steven Rostedt
2016-02-23 10:40           ` Juri Lelli
2016-02-23 10:44           ` Peter Zijlstra
2016-02-23 13:10             ` Steven Rostedt
2016-02-24  8:48               ` Ingo Molnar
2016-02-23 14:27             ` Peter Zijlstra
2016-02-23 16:19             ` Daniel Bristot de Oliveira [this message]
2016-02-24  2:29             ` Daniel Bristot de Oliveira
2016-02-22 17:48   ` kbuild test robot
2016-02-22 17:08 ` [PATCH 4/4] tools lib traceevent: Implements '%' operation Daniel Bristot de Oliveira
2016-02-22 20:23   ` Steven Rostedt
2016-02-23 14:38     ` Arnaldo Carvalho de Melo
2016-02-23 14:44       ` Arnaldo Carvalho de Melo
2016-02-25  5:41   ` [tip:perf/core] tools lib traceevent: Implement " tip-bot for Daniel Bristot de Oliveira

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56CC8686.5090404@redhat.com \
    --to=bristot@redhat.com \
    --cc=acme@redhat.com \
    --cc=juri.lelli@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.