* [PATCH] proc/schedstat: Expose /proc/<pid>/schedstat if delay accounting is enabled
@ 2015-05-25 7:12 Naveen N. Rao
2015-05-28 5:11 ` Naveen N. Rao
2015-05-28 9:11 ` Balbir Singh
0 siblings, 2 replies; 17+ messages in thread
From: Naveen N. Rao @ 2015-05-25 7:12 UTC (permalink / raw)
To: bsingharora, mingo; +Cc: linux-kernel
/proc/<pid>/schedstat is currently only available if CONFIG_SCHEDSTATS is
enabled. But, all the fields that this exposes are available and valid
if CONFIG_TASK_DELAY_ACCT is enabled as well.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
fs/proc/base.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 093ca14..3ece303 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -304,14 +304,17 @@ static int proc_pid_stack(struct seq_file *m, struct pid_namespace *ns,
}
#endif
-#ifdef CONFIG_SCHEDSTATS
+#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
/*
* Provides /proc/PID/schedstat
*/
static int proc_pid_schedstat(struct seq_file *m, struct pid_namespace *ns,
struct pid *pid, struct task_struct *task)
{
- seq_printf(m, "%llu %llu %lu\n",
+ if (unlikely(!sched_info_on()))
+ seq_printf(m, "0 0 0\n");
+ else
+ seq_printf(m, "%llu %llu %lu\n",
(unsigned long long)task->se.sum_exec_runtime,
(unsigned long long)task->sched_info.run_delay,
task->sched_info.pcount);
@@ -2600,7 +2603,7 @@ static const struct pid_entry tgid_base_stuff[] = {
#ifdef CONFIG_STACKTRACE
ONE("stack", S_IRUSR, proc_pid_stack),
#endif
-#ifdef CONFIG_SCHEDSTATS
+#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
ONE("schedstat", S_IRUGO, proc_pid_schedstat),
#endif
#ifdef CONFIG_LATENCYTOP
@@ -2948,7 +2951,7 @@ static const struct pid_entry tid_base_stuff[] = {
#ifdef CONFIG_STACKTRACE
ONE("stack", S_IRUSR, proc_pid_stack),
#endif
-#ifdef CONFIG_SCHEDSTATS
+#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
ONE("schedstat", S_IRUGO, proc_pid_schedstat),
#endif
#ifdef CONFIG_LATENCYTOP
--
2.4.0
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH] proc/schedstat: Expose /proc/<pid>/schedstat if delay accounting is enabled
2015-05-25 7:12 [PATCH] proc/schedstat: Expose /proc/<pid>/schedstat if delay accounting is enabled Naveen N. Rao
@ 2015-05-28 5:11 ` Naveen N. Rao
2015-05-28 9:11 ` Balbir Singh
1 sibling, 0 replies; 17+ messages in thread
From: Naveen N. Rao @ 2015-05-28 5:11 UTC (permalink / raw)
To: bsingharora, mingo; +Cc: linux-kernel
Ingo, Balbir,
Any comments on this? Can you please take a look?
Thanks,
Naveen
On 2015/05/25 12:42PM, Naveen N Rao wrote:
> /proc/<pid>/schedstat is currently only available if CONFIG_SCHEDSTATS is
> enabled. But, all the fields that this exposes are available and valid
> if CONFIG_TASK_DELAY_ACCT is enabled as well.
>
> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> ---
> fs/proc/base.c | 11 +++++++----
> 1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 093ca14..3ece303 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -304,14 +304,17 @@ static int proc_pid_stack(struct seq_file *m, struct pid_namespace *ns,
> }
> #endif
>
> -#ifdef CONFIG_SCHEDSTATS
> +#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> /*
> * Provides /proc/PID/schedstat
> */
> static int proc_pid_schedstat(struct seq_file *m, struct pid_namespace *ns,
> struct pid *pid, struct task_struct *task)
> {
> - seq_printf(m, "%llu %llu %lu\n",
> + if (unlikely(!sched_info_on()))
> + seq_printf(m, "0 0 0\n");
> + else
> + seq_printf(m, "%llu %llu %lu\n",
> (unsigned long long)task->se.sum_exec_runtime,
> (unsigned long long)task->sched_info.run_delay,
> task->sched_info.pcount);
> @@ -2600,7 +2603,7 @@ static const struct pid_entry tgid_base_stuff[] = {
> #ifdef CONFIG_STACKTRACE
> ONE("stack", S_IRUSR, proc_pid_stack),
> #endif
> -#ifdef CONFIG_SCHEDSTATS
> +#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> ONE("schedstat", S_IRUGO, proc_pid_schedstat),
> #endif
> #ifdef CONFIG_LATENCYTOP
> @@ -2948,7 +2951,7 @@ static const struct pid_entry tid_base_stuff[] = {
> #ifdef CONFIG_STACKTRACE
> ONE("stack", S_IRUSR, proc_pid_stack),
> #endif
> -#ifdef CONFIG_SCHEDSTATS
> +#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> ONE("schedstat", S_IRUGO, proc_pid_schedstat),
> #endif
> #ifdef CONFIG_LATENCYTOP
> --
> 2.4.0
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] proc/schedstat: Expose /proc/<pid>/schedstat if delay accounting is enabled
2015-05-25 7:12 [PATCH] proc/schedstat: Expose /proc/<pid>/schedstat if delay accounting is enabled Naveen N. Rao
2015-05-28 5:11 ` Naveen N. Rao
@ 2015-05-28 9:11 ` Balbir Singh
2015-05-29 6:16 ` Naveen N. Rao
2015-06-25 18:00 ` Cong Wang
1 sibling, 2 replies; 17+ messages in thread
From: Balbir Singh @ 2015-05-28 9:11 UTC (permalink / raw)
To: Naveen N. Rao; +Cc: mingo, linux-kernel
On Mon, May 25, 2015 at 12:42 PM, Naveen N. Rao
<naveen.n.rao@linux.vnet.ibm.com> wrote:
> /proc/<pid>/schedstat is currently only available if CONFIG_SCHEDSTATS is
> enabled. But, all the fields that this exposes are available and valid
> if CONFIG_TASK_DELAY_ACCT is enabled as well.
>
> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> ---
> fs/proc/base.c | 11 +++++++----
> 1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 093ca14..3ece303 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -304,14 +304,17 @@ static int proc_pid_stack(struct seq_file *m, struct pid_namespace *ns,
> }
> #endif
>
> -#ifdef CONFIG_SCHEDSTATS
> +#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> /*
> * Provides /proc/PID/schedstat
> */
> static int proc_pid_schedstat(struct seq_file *m, struct pid_namespace *ns,
> struct pid *pid, struct task_struct *task)
> {
> - seq_printf(m, "%llu %llu %lu\n",
> + if (unlikely(!sched_info_on()))
> + seq_printf(m, "0 0 0\n");
> + else
> + seq_printf(m, "%llu %llu %lu\n",
> (unsigned long long)task->se.sum_exec_runtime,
> (unsigned long long)task->sched_info.run_delay,
> task->sched_info.pcount);
> @@ -2600,7 +2603,7 @@ static const struct pid_entry tgid_base_stuff[] = {
> #ifdef CONFIG_STACKTRACE
> ONE("stack", S_IRUSR, proc_pid_stack),
> #endif
> -#ifdef CONFIG_SCHEDSTATS
> +#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> ONE("schedstat", S_IRUGO, proc_pid_schedstat),
> #endif
> #ifdef CONFIG_LATENCYTOP
> @@ -2948,7 +2951,7 @@ static const struct pid_entry tid_base_stuff[] = {
> #ifdef CONFIG_STACKTRACE
> ONE("stack", S_IRUSR, proc_pid_stack),
> #endif
> -#ifdef CONFIG_SCHEDSTATS
> +#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> ONE("schedstat", S_IRUGO, proc_pid_schedstat),
> #endif
> #ifdef CONFIG_LATENCYTOP
> --
The change looks reasonable, from what I can understand you want these
changes so that you can use /proc/<pid>/schedstat instead of the
netlink interface when CONFIG_TASK_DELAY_ACCT is enabled.
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] proc/schedstat: Expose /proc/<pid>/schedstat if delay accounting is enabled
2015-05-28 9:11 ` Balbir Singh
@ 2015-05-29 6:16 ` Naveen N. Rao
2015-05-29 8:04 ` Ingo Molnar
2015-06-25 18:00 ` Cong Wang
1 sibling, 1 reply; 17+ messages in thread
From: Naveen N. Rao @ 2015-05-29 6:16 UTC (permalink / raw)
To: Balbir Singh; +Cc: mingo, linux-kernel
On 2015/05/28 02:41PM, Balbir Singh wrote:
> On Mon, May 25, 2015 at 12:42 PM, Naveen N. Rao
> <naveen.n.rao@linux.vnet.ibm.com> wrote:
> > /proc/<pid>/schedstat is currently only available if CONFIG_SCHEDSTATS is
> > enabled. But, all the fields that this exposes are available and valid
> > if CONFIG_TASK_DELAY_ACCT is enabled as well.
> >
> > Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> > ---
> > fs/proc/base.c | 11 +++++++----
> > 1 file changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > index 093ca14..3ece303 100644
> > --- a/fs/proc/base.c
> > +++ b/fs/proc/base.c
> > @@ -304,14 +304,17 @@ static int proc_pid_stack(struct seq_file *m, struct pid_namespace *ns,
> > }
> > #endif
> >
> > -#ifdef CONFIG_SCHEDSTATS
> > +#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> > /*
> > * Provides /proc/PID/schedstat
> > */
> > static int proc_pid_schedstat(struct seq_file *m, struct pid_namespace *ns,
> > struct pid *pid, struct task_struct *task)
> > {
> > - seq_printf(m, "%llu %llu %lu\n",
> > + if (unlikely(!sched_info_on()))
> > + seq_printf(m, "0 0 0\n");
> > + else
> > + seq_printf(m, "%llu %llu %lu\n",
> > (unsigned long long)task->se.sum_exec_runtime,
> > (unsigned long long)task->sched_info.run_delay,
> > task->sched_info.pcount);
> > @@ -2600,7 +2603,7 @@ static const struct pid_entry tgid_base_stuff[] = {
> > #ifdef CONFIG_STACKTRACE
> > ONE("stack", S_IRUSR, proc_pid_stack),
> > #endif
> > -#ifdef CONFIG_SCHEDSTATS
> > +#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> > ONE("schedstat", S_IRUGO, proc_pid_schedstat),
> > #endif
> > #ifdef CONFIG_LATENCYTOP
> > @@ -2948,7 +2951,7 @@ static const struct pid_entry tid_base_stuff[] = {
> > #ifdef CONFIG_STACKTRACE
> > ONE("stack", S_IRUSR, proc_pid_stack),
> > #endif
> > -#ifdef CONFIG_SCHEDSTATS
> > +#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> > ONE("schedstat", S_IRUGO, proc_pid_schedstat),
> > #endif
> > #ifdef CONFIG_LATENCYTOP
> > --
>
> The change looks reasonable, from what I can understand you want these
> changes so that you can use /proc/<pid>/schedstat instead of the
> netlink interface when CONFIG_TASK_DELAY_ACCT is enabled.
>
> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
>
Yes, thanks for the review, Balbir!
Ingo,
Can you please pick this up?
Regards,
Naveen
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] proc/schedstat: Expose /proc/<pid>/schedstat if delay accounting is enabled
2015-05-29 6:16 ` Naveen N. Rao
@ 2015-05-29 8:04 ` Ingo Molnar
2015-05-29 8:55 ` Naveen N. Rao
0 siblings, 1 reply; 17+ messages in thread
From: Ingo Molnar @ 2015-05-29 8:04 UTC (permalink / raw)
To: Naveen N. Rao; +Cc: Balbir Singh, linux-kernel, Peter Zijlstra
* Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> wrote:
> /proc/<pid>/schedstat is currently only available if CONFIG_SCHEDSTATS is
> enabled. But, all the fields that this exposes are available and valid
> if CONFIG_TASK_DELAY_ACCT is enabled as well.
>
> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> ---
> fs/proc/base.c | 11 +++++++----
> 1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 093ca14..3ece303 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -304,14 +304,17 @@ static int proc_pid_stack(struct seq_file *m, struct pid_namespace *ns,
> }
> #endif
>
> +#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> +#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> +#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
So such #ifdef parades are ugly and are usually a sign of some problem with the
patch - as in this case.
But there's deeper problems as well:
/*
* Provides /proc/PID/schedstat
*/
static int proc_pid_schedstat(struct seq_file *m, struct pid_namespace *ns,
struct pid *pid, struct task_struct *task)
{
seq_printf(m, "%llu %llu %lu\n",
(unsigned long long)task->se.sum_exec_runtime,
(unsigned long long)task->sched_info.run_delay,
task->sched_info.pcount);
return 0;
}
- The sum_exec_runtime field is available unconditionally.
- But the sched_info.run_delay field is only maintained if CONFIG_SCHEDSTATS is
enabled.
- Also, the sched_info.pcount field is again only maintained if CONFIG_SCHEDSTATS
is enabled.
So the claim in your changelog that these fields are maintained in the
delayaccounting case is plain false: none of the fields are conditional on delay
accounting.
So what's the purpose of your patch?
Thanks,
Ingo
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] proc/schedstat: Expose /proc/<pid>/schedstat if delay accounting is enabled
2015-05-29 8:04 ` Ingo Molnar
@ 2015-05-29 8:55 ` Naveen N. Rao
2015-05-29 9:18 ` Ingo Molnar
0 siblings, 1 reply; 17+ messages in thread
From: Naveen N. Rao @ 2015-05-29 8:55 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Balbir Singh, linux-kernel, Peter Zijlstra, srikar
On 2015/05/29 10:04AM, Ingo Molnar wrote:
>
> * Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> wrote:
>
> > /proc/<pid>/schedstat is currently only available if CONFIG_SCHEDSTATS is
> > enabled. But, all the fields that this exposes are available and valid
> > if CONFIG_TASK_DELAY_ACCT is enabled as well.
> >
> > Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> > ---
> > fs/proc/base.c | 11 +++++++----
> > 1 file changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > index 093ca14..3ece303 100644
> > --- a/fs/proc/base.c
> > +++ b/fs/proc/base.c
> > @@ -304,14 +304,17 @@ static int proc_pid_stack(struct seq_file *m, struct pid_namespace *ns,
> > }
> > #endif
> >
> > +#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> > +#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> > +#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
>
Hi Ingo,
Thanks for the review.
> So such #ifdef parades are ugly and are usually a sign of some problem with the
> patch - as in this case.
>
> But there's deeper problems as well:
>
> /*
> * Provides /proc/PID/schedstat
> */
> static int proc_pid_schedstat(struct seq_file *m, struct pid_namespace *ns,
> struct pid *pid, struct task_struct *task)
> {
> seq_printf(m, "%llu %llu %lu\n",
> (unsigned long long)task->se.sum_exec_runtime,
> (unsigned long long)task->sched_info.run_delay,
> task->sched_info.pcount);
>
> return 0;
> }
>
> - The sum_exec_runtime field is available unconditionally.
>
> - But the sched_info.run_delay field is only maintained if CONFIG_SCHEDSTATS is
> enabled.
>
> - Also, the sched_info.pcount field is again only maintained if CONFIG_SCHEDSTATS
> is enabled.
I may be missing something, but from my reading of the code, the above
are maintained if any one of CONFIG_SCHEDSTATS or CONFIG_TASK_DELAY_ACCT
are enabled (from kernel/sched/stats.h).
Only the runqueue-specific sched_info is tracked with CONFIG_SCHEDSTATS
separately [kernel/sched/stats.c and the corresponding global
/proc/schedstat]
Regards,
Naveen
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] proc/schedstat: Expose /proc/<pid>/schedstat if delay accounting is enabled
2015-05-29 8:55 ` Naveen N. Rao
@ 2015-05-29 9:18 ` Ingo Molnar
2015-05-29 9:45 ` Naveen N. Rao
0 siblings, 1 reply; 17+ messages in thread
From: Ingo Molnar @ 2015-05-29 9:18 UTC (permalink / raw)
To: Naveen N. Rao; +Cc: Balbir Singh, linux-kernel, Peter Zijlstra, srikar
* Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> wrote:
> On 2015/05/29 10:04AM, Ingo Molnar wrote:
> >
> > * Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> wrote:
> >
> > > /proc/<pid>/schedstat is currently only available if CONFIG_SCHEDSTATS is
> > > enabled. But, all the fields that this exposes are available and valid
> > > if CONFIG_TASK_DELAY_ACCT is enabled as well.
> > >
> > > Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> > > ---
> > > fs/proc/base.c | 11 +++++++----
> > > 1 file changed, 7 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > > index 093ca14..3ece303 100644
> > > --- a/fs/proc/base.c
> > > +++ b/fs/proc/base.c
> > > @@ -304,14 +304,17 @@ static int proc_pid_stack(struct seq_file *m, struct pid_namespace *ns,
> > > }
> > > #endif
> > >
> > > +#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> > > +#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> > > +#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> >
>
> Hi Ingo,
> Thanks for the review.
>
> > So such #ifdef parades are ugly and are usually a sign of some problem with the
> > patch - as in this case.
> >
> > But there's deeper problems as well:
> >
> > /*
> > * Provides /proc/PID/schedstat
> > */
> > static int proc_pid_schedstat(struct seq_file *m, struct pid_namespace *ns,
> > struct pid *pid, struct task_struct *task)
> > {
> > seq_printf(m, "%llu %llu %lu\n",
> > (unsigned long long)task->se.sum_exec_runtime,
> > (unsigned long long)task->sched_info.run_delay,
> > task->sched_info.pcount);
> >
> > return 0;
> > }
> >
> > - The sum_exec_runtime field is available unconditionally.
> >
> > - But the sched_info.run_delay field is only maintained if CONFIG_SCHEDSTATS is
> > enabled.
> >
> > - Also, the sched_info.pcount field is again only maintained if CONFIG_SCHEDSTATS
> > is enabled.
>
> I may be missing something, but from my reading of the code, the above
> are maintained if any one of CONFIG_SCHEDSTATS or CONFIG_TASK_DELAY_ACCT
> are enabled (from kernel/sched/stats.h).
Hm, indeed - I mis-read the rq-specific code - sorry.
So all this should really be cleaned up:
include/linux/sched.h:#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
include/linux/sched.h:#endif /* defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT) */
include/linux/sched.h:#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
kernel/sched/core.c:#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
kernel/sched/stats.h:#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
kernel/sched/stats.h:#endif /* CONFIG_SCHEDSTATS || CONFIG_TASK_DELAY_ACCT */
by introducing an intermediate Kconfig variable, named CONFIG_SCHED_INFO or so,
and selected by both SCHEDSTATS and TASK_DELAY_ACCT.
Please make it two patches: the first one adds CONFIG_SCHED_INFO and cleans up the
code to use it, the second one uses it for the procps change.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] proc/schedstat: Expose /proc/<pid>/schedstat if delay accounting is enabled
2015-05-29 9:18 ` Ingo Molnar
@ 2015-05-29 9:45 ` Naveen N. Rao
2015-05-29 9:54 ` Ingo Molnar
0 siblings, 1 reply; 17+ messages in thread
From: Naveen N. Rao @ 2015-05-29 9:45 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Balbir Singh, linux-kernel, Peter Zijlstra, srikar
On 2015/05/29 11:18AM, Ingo Molnar wrote:
>
> * Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> wrote:
>
> > On 2015/05/29 10:04AM, Ingo Molnar wrote:
> > >
> > > * Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> wrote:
> > >
> > > - The sum_exec_runtime field is available unconditionally.
> > >
> > > - But the sched_info.run_delay field is only maintained if CONFIG_SCHEDSTATS is
> > > enabled.
> > >
> > > - Also, the sched_info.pcount field is again only maintained if CONFIG_SCHEDSTATS
> > > is enabled.
> >
> > I may be missing something, but from my reading of the code, the above
> > are maintained if any one of CONFIG_SCHEDSTATS or CONFIG_TASK_DELAY_ACCT
> > are enabled (from kernel/sched/stats.h).
>
> Hm, indeed - I mis-read the rq-specific code - sorry.
>
> So all this should really be cleaned up:
>
> include/linux/sched.h:#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> include/linux/sched.h:#endif /* defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT) */
> include/linux/sched.h:#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> kernel/sched/core.c:#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> kernel/sched/stats.h:#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> kernel/sched/stats.h:#endif /* CONFIG_SCHEDSTATS || CONFIG_TASK_DELAY_ACCT */
>
> by introducing an intermediate Kconfig variable, named CONFIG_SCHED_INFO or so,
> and selected by both SCHEDSTATS and TASK_DELAY_ACCT.
>
> Please make it two patches: the first one adds CONFIG_SCHED_INFO and cleans up the
> code to use it, the second one uses it for the procps change.
Sure, will do.
On a related note, even though sum_exec_runtime is available
unconditionally, I dump all zeroes in my patch if !sched_info_on() to
make it clear that some of the fields are not available. Is this ok or
should be display sum_exec_runtime regardless of sched_info?
Thanks,
Naveen
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] proc/schedstat: Expose /proc/<pid>/schedstat if delay accounting is enabled
2015-05-29 9:45 ` Naveen N. Rao
@ 2015-05-29 9:54 ` Ingo Molnar
2015-05-29 17:06 ` Naveen N. Rao
0 siblings, 1 reply; 17+ messages in thread
From: Ingo Molnar @ 2015-05-29 9:54 UTC (permalink / raw)
To: Naveen N. Rao; +Cc: Balbir Singh, linux-kernel, Peter Zijlstra, srikar
* Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> wrote:
> On 2015/05/29 11:18AM, Ingo Molnar wrote:
> >
> > * Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> wrote:
> >
> > > On 2015/05/29 10:04AM, Ingo Molnar wrote:
> > > >
> > > > * Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> wrote:
> > > >
> > > > - The sum_exec_runtime field is available unconditionally.
> > > >
> > > > - But the sched_info.run_delay field is only maintained if CONFIG_SCHEDSTATS is
> > > > enabled.
> > > >
> > > > - Also, the sched_info.pcount field is again only maintained if CONFIG_SCHEDSTATS
> > > > is enabled.
> > >
> > > I may be missing something, but from my reading of the code, the above
> > > are maintained if any one of CONFIG_SCHEDSTATS or CONFIG_TASK_DELAY_ACCT
> > > are enabled (from kernel/sched/stats.h).
> >
> > Hm, indeed - I mis-read the rq-specific code - sorry.
> >
> > So all this should really be cleaned up:
> >
> > include/linux/sched.h:#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> > include/linux/sched.h:#endif /* defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT) */
> > include/linux/sched.h:#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> > kernel/sched/core.c:#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> > kernel/sched/stats.h:#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> > kernel/sched/stats.h:#endif /* CONFIG_SCHEDSTATS || CONFIG_TASK_DELAY_ACCT */
> >
> > by introducing an intermediate Kconfig variable, named CONFIG_SCHED_INFO or so,
> > and selected by both SCHEDSTATS and TASK_DELAY_ACCT.
> >
> > Please make it two patches: the first one adds CONFIG_SCHED_INFO and cleans up the
> > code to use it, the second one uses it for the procps change.
>
> Sure, will do.
>
> On a related note, even though sum_exec_runtime is available unconditionally, I
> dump all zeroes in my patch if !sched_info_on() to make it clear that some of
> the fields are not available. Is this ok or should be display sum_exec_runtime
> regardless of sched_info?
So I'd suggest printing -1 for non-available fields, that should be unambigous
enough and makes it also possible to write out 0 in some cases.
That way the schedstat file should be made unconditional altogether. (And we still
want the Kconfig cleanups as a separate patch.)
Thanks,
Ingo
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] proc/schedstat: Expose /proc/<pid>/schedstat if delay accounting is enabled
2015-05-29 9:54 ` Ingo Molnar
@ 2015-05-29 17:06 ` Naveen N. Rao
2015-06-02 7:58 ` Ingo Molnar
0 siblings, 1 reply; 17+ messages in thread
From: Naveen N. Rao @ 2015-05-29 17:06 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Balbir Singh, linux-kernel, Peter Zijlstra, srikar
On 2015/05/29 11:54AM, Ingo Molnar wrote:
>
> > > So all this should really be cleaned up:
> > >
> > > include/linux/sched.h:#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> > > include/linux/sched.h:#endif /* defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT) */
> > > include/linux/sched.h:#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> > > kernel/sched/core.c:#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> > > kernel/sched/stats.h:#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> > > kernel/sched/stats.h:#endif /* CONFIG_SCHEDSTATS || CONFIG_TASK_DELAY_ACCT */
> > >
> > > by introducing an intermediate Kconfig variable, named CONFIG_SCHED_INFO or so,
> > > and selected by both SCHEDSTATS and TASK_DELAY_ACCT.
> > >
> > > Please make it two patches: the first one adds CONFIG_SCHED_INFO and cleans up the
> > > code to use it, the second one uses it for the procps change.
> >
> > Sure, will do.
> >
> > On a related note, even though sum_exec_runtime is available unconditionally, I
> > dump all zeroes in my patch if !sched_info_on() to make it clear that some of
> > the fields are not available. Is this ok or should be display sum_exec_runtime
> > regardless of sched_info?
>
> So I'd suggest printing -1 for non-available fields, that should be unambigous
> enough and makes it also possible to write out 0 in some cases.
Per Documentation/scheduler/sched-stats.txt (and the linked latency.c
there), user-space seems to be expecting unsigned values here. Would
displaying -1 here be ok?
Thanks,
Naveen
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] proc/schedstat: Expose /proc/<pid>/schedstat if delay accounting is enabled
2015-05-29 17:06 ` Naveen N. Rao
@ 2015-06-02 7:58 ` Ingo Molnar
2015-06-25 8:39 ` Naveen N. Rao
0 siblings, 1 reply; 17+ messages in thread
From: Ingo Molnar @ 2015-06-02 7:58 UTC (permalink / raw)
To: Naveen N. Rao; +Cc: Balbir Singh, linux-kernel, Peter Zijlstra, srikar
* Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> wrote:
> On 2015/05/29 11:54AM, Ingo Molnar wrote:
> >
> > > > So all this should really be cleaned up:
> > > >
> > > > include/linux/sched.h:#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> > > > include/linux/sched.h:#endif /* defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT) */
> > > > include/linux/sched.h:#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> > > > kernel/sched/core.c:#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> > > > kernel/sched/stats.h:#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
> > > > kernel/sched/stats.h:#endif /* CONFIG_SCHEDSTATS || CONFIG_TASK_DELAY_ACCT */
> > > >
> > > > by introducing an intermediate Kconfig variable, named CONFIG_SCHED_INFO or so,
> > > > and selected by both SCHEDSTATS and TASK_DELAY_ACCT.
> > > >
> > > > Please make it two patches: the first one adds CONFIG_SCHED_INFO and cleans up the
> > > > code to use it, the second one uses it for the procps change.
> > >
> > > Sure, will do.
> > >
> > > On a related note, even though sum_exec_runtime is available unconditionally, I
> > > dump all zeroes in my patch if !sched_info_on() to make it clear that some of
> > > the fields are not available. Is this ok or should be display sum_exec_runtime
> > > regardless of sched_info?
> >
> > So I'd suggest printing -1 for non-available fields, that should be unambigous
> > enough and makes it also possible to write out 0 in some cases.
>
> Per Documentation/scheduler/sched-stats.txt (and the linked latency.c there),
> user-space seems to be expecting unsigned values here. Would displaying -1 here
> be ok?
Probably not (the code is silly, why doesn't it split up the string and use
atol()?) - hopefully real user-space is better? Can you try some real, packaged up
tools that read schedstats, to see whether they work with -1?
Thanks,
Ingo
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] proc/schedstat: Expose /proc/<pid>/schedstat if delay accounting is enabled
2015-06-02 7:58 ` Ingo Molnar
@ 2015-06-25 8:39 ` Naveen N. Rao
2015-06-25 12:12 ` Ingo Molnar
0 siblings, 1 reply; 17+ messages in thread
From: Naveen N. Rao @ 2015-06-25 8:39 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Balbir Singh, linux-kernel, Peter Zijlstra, srikar, ricklind
On 2015/06/02 09:58AM, Ingo Molnar wrote:
>
> * Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> wrote:
>
> > On 2015/05/29 11:54AM, Ingo Molnar wrote:
> > >
> > > > On a related note, even though sum_exec_runtime is available
> > > > unconditionally, I dump all zeroes in my patch if
> > > > !sched_info_on() to make it clear that some of the fields are
> > > > not available. Is this ok or should be display sum_exec_runtime
> > > > regardless of sched_info?
> > >
> > > So I'd suggest printing -1 for non-available fields, that should be unambigous
> > > enough and makes it also possible to write out 0 in some cases.
> >
> > Per Documentation/scheduler/sched-stats.txt (and the linked latency.c there),
> > user-space seems to be expecting unsigned values here. Would displaying -1 here
> > be ok?
>
> Probably not (the code is silly, why doesn't it split up the string and use
> atol()?) - hopefully real user-space is better? Can you try some real, packaged up
> tools that read schedstats, to see whether they work with -1?
Hi Ingo,
Sorry for the delay - I had been off on vacation.
I see that quite a few packages are using /proc/<pid>/schedstat - pcp,
systemd, dstat, android, among others. While most of these seem to be
splitting up the fields properly, they are using a variant of
strtoull(), which returns ULLONG_MAX for -1, and none of these check for
that condition. If any of the tools use the value read to report total
execution time or run delay, it will be incorrect.
At this point, I feel it is better to display all the three fields in
schedstat only if sched_info_on() is true, as explained above. What do
you suggest?
Thanks,
Naveen
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] proc/schedstat: Expose /proc/<pid>/schedstat if delay accounting is enabled
2015-06-25 8:39 ` Naveen N. Rao
@ 2015-06-25 12:12 ` Ingo Molnar
0 siblings, 0 replies; 17+ messages in thread
From: Ingo Molnar @ 2015-06-25 12:12 UTC (permalink / raw)
To: Naveen N. Rao
Cc: Balbir Singh, linux-kernel, Peter Zijlstra, srikar, ricklind
* Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> wrote:
> On 2015/06/02 09:58AM, Ingo Molnar wrote:
> >
> > * Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> wrote:
> >
> > > On 2015/05/29 11:54AM, Ingo Molnar wrote:
> > > >
> > > > > On a related note, even though sum_exec_runtime is available
> > > > > unconditionally, I dump all zeroes in my patch if
> > > > > !sched_info_on() to make it clear that some of the fields are
> > > > > not available. Is this ok or should be display sum_exec_runtime
> > > > > regardless of sched_info?
> > > >
> > > > So I'd suggest printing -1 for non-available fields, that should be unambigous
> > > > enough and makes it also possible to write out 0 in some cases.
> > >
> > > Per Documentation/scheduler/sched-stats.txt (and the linked latency.c there),
> > > user-space seems to be expecting unsigned values here. Would displaying -1 here
> > > be ok?
> >
> > Probably not (the code is silly, why doesn't it split up the string and use
> > atol()?) - hopefully real user-space is better? Can you try some real, packaged up
> > tools that read schedstats, to see whether they work with -1?
>
> Hi Ingo,
> Sorry for the delay - I had been off on vacation.
>
> I see that quite a few packages are using /proc/<pid>/schedstat - pcp, systemd,
> dstat, android, among others. While most of these seem to be splitting up the
> fields properly, they are using a variant of strtoull(), which returns
> ULLONG_MAX for -1, and none of these check for that condition. If any of the
> tools use the value read to report total execution time or run delay, it will be
> incorrect.
>
> At this point, I feel it is better to display all the three fields in schedstat
> only if sched_info_on() is true, as explained above. What do you suggest?
Ok, agreed - thanks for the analysis.
Mind (re-)submitting the patches accordingly?
Thanks,
Ingo
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] proc/schedstat: Expose /proc/<pid>/schedstat if delay accounting is enabled
2015-05-28 9:11 ` Balbir Singh
2015-05-29 6:16 ` Naveen N. Rao
@ 2015-06-25 18:00 ` Cong Wang
2015-06-25 18:27 ` Naveen N. Rao
1 sibling, 1 reply; 17+ messages in thread
From: Cong Wang @ 2015-06-25 18:00 UTC (permalink / raw)
To: Balbir Singh; +Cc: Naveen N. Rao, Ingo Molnar, linux-kernel
On Thu, May 28, 2015 at 2:11 AM, Balbir Singh <bsingharora@gmail.com> wrote:
> On Mon, May 25, 2015 at 12:42 PM, Naveen N. Rao
> <naveen.n.rao@linux.vnet.ibm.com> wrote:
>> /proc/<pid>/schedstat is currently only available if CONFIG_SCHEDSTATS is
>> enabled. But, all the fields that this exposes are available and valid
>> if CONFIG_TASK_DELAY_ACCT is enabled as well.
>>
[...]
> The change looks reasonable, from what I can understand you want these
> changes so that you can use /proc/<pid>/schedstat instead of the
> netlink interface when CONFIG_TASK_DELAY_ACCT is enabled.
>
Why?
If you need the procfs interface, just enable CONFIG_SCHEDSTATS.
If you need the netlink interface, enable CONFIG_TASK_DELAY_ACCT.
They are just two different interfaces for getting the same sched
information, so why make this change?
There must be some reason you don't want to enable
CONFIG_SCHEDSTATS? If so, please add it in the changelog.
My guess is it depends on DEBUG_KERNEL which is not what you
want? But I see no reason it should have that dependency, it just
exposes some stats, looks like can be just removed (and moved out
of Kconfig,debug of course).
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] proc/schedstat: Expose /proc/<pid>/schedstat if delay accounting is enabled
2015-06-25 18:00 ` Cong Wang
@ 2015-06-25 18:27 ` Naveen N. Rao
2015-06-25 18:40 ` Cong Wang
0 siblings, 1 reply; 17+ messages in thread
From: Naveen N. Rao @ 2015-06-25 18:27 UTC (permalink / raw)
To: Cong Wang; +Cc: Balbir Singh, Ingo Molnar, linux-kernel
On 2015/06/25 11:00AM, Cong Wang wrote:
> On Thu, May 28, 2015 at 2:11 AM, Balbir Singh <bsingharora@gmail.com> wrote:
> > On Mon, May 25, 2015 at 12:42 PM, Naveen N. Rao
> > <naveen.n.rao@linux.vnet.ibm.com> wrote:
> >> /proc/<pid>/schedstat is currently only available if CONFIG_SCHEDSTATS is
> >> enabled. But, all the fields that this exposes are available and valid
> >> if CONFIG_TASK_DELAY_ACCT is enabled as well.
> >>
>
> [...]
>
> > The change looks reasonable, from what I can understand you want these
> > changes so that you can use /proc/<pid>/schedstat instead of the
> > netlink interface when CONFIG_TASK_DELAY_ACCT is enabled.
> >
>
> Why?
>
> If you need the procfs interface, just enable CONFIG_SCHEDSTATS.
> If you need the netlink interface, enable CONFIG_TASK_DELAY_ACCT.
> They are just two different interfaces for getting the same sched
> information, so why make this change?
>
> There must be some reason you don't want to enable
> CONFIG_SCHEDSTATS? If so, please add it in the changelog.
>
> My guess is it depends on DEBUG_KERNEL which is not what you
> want? But I see no reason it should have that dependency, it just
> exposes some stats, looks like can be just removed (and moved out
> of Kconfig,debug of course).
The primary issue with CONFIG_SCHEDSTATS is the (slight) additional
overhead it introduces, per Kconfig. Due to this, it is not enabled by
default by some of the distro kernels.
Regards,
Naveen
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] proc/schedstat: Expose /proc/<pid>/schedstat if delay accounting is enabled
2015-06-25 18:27 ` Naveen N. Rao
@ 2015-06-25 18:40 ` Cong Wang
2015-06-26 8:39 ` Naveen N. Rao
0 siblings, 1 reply; 17+ messages in thread
From: Cong Wang @ 2015-06-25 18:40 UTC (permalink / raw)
To: Naveen N. Rao; +Cc: Balbir Singh, Ingo Molnar, linux-kernel
On Thu, Jun 25, 2015 at 11:27 AM, Naveen N. Rao
<naveen.n.rao@linux.vnet.ibm.com> wrote:
>
> The primary issue with CONFIG_SCHEDSTATS is the (slight) additional
> overhead it introduces, per Kconfig. Due to this, it is not enabled by
> default by some of the distro kernels.
>
The overhead is same for CONFIG_TASK_DELAY_ACCT since the
core sched info code is protected by CONFIG_TASK_DELAY_ACCT
or CONFIG_TASK_DELAY_ACCT?
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] proc/schedstat: Expose /proc/<pid>/schedstat if delay accounting is enabled
2015-06-25 18:40 ` Cong Wang
@ 2015-06-26 8:39 ` Naveen N. Rao
0 siblings, 0 replies; 17+ messages in thread
From: Naveen N. Rao @ 2015-06-26 8:39 UTC (permalink / raw)
To: Cong Wang; +Cc: Balbir Singh, Ingo Molnar, linux-kernel
On 2015/06/25 11:40AM, Cong Wang wrote:
> On Thu, Jun 25, 2015 at 11:27 AM, Naveen N. Rao
> <naveen.n.rao@linux.vnet.ibm.com> wrote:
> >
> > The primary issue with CONFIG_SCHEDSTATS is the (slight) additional
> > overhead it introduces, per Kconfig. Due to this, it is not enabled by
> > default by some of the distro kernels.
> >
>
> The overhead is same for CONFIG_TASK_DELAY_ACCT since the
> core sched info code is protected by CONFIG_TASK_DELAY_ACCT
> or CONFIG_TASK_DELAY_ACCT?
CONFIG_SCHEDSTATS enables a lot more scheduler statistics
(/proc/schedstat) than just the task delay accounting enabled by
CONFIG_TASK_DELAY_ACCT. So, no - the overhead is not the same.
- Naveen
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2015-06-26 8:40 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-25 7:12 [PATCH] proc/schedstat: Expose /proc/<pid>/schedstat if delay accounting is enabled Naveen N. Rao
2015-05-28 5:11 ` Naveen N. Rao
2015-05-28 9:11 ` Balbir Singh
2015-05-29 6:16 ` Naveen N. Rao
2015-05-29 8:04 ` Ingo Molnar
2015-05-29 8:55 ` Naveen N. Rao
2015-05-29 9:18 ` Ingo Molnar
2015-05-29 9:45 ` Naveen N. Rao
2015-05-29 9:54 ` Ingo Molnar
2015-05-29 17:06 ` Naveen N. Rao
2015-06-02 7:58 ` Ingo Molnar
2015-06-25 8:39 ` Naveen N. Rao
2015-06-25 12:12 ` Ingo Molnar
2015-06-25 18:00 ` Cong Wang
2015-06-25 18:27 ` Naveen N. Rao
2015-06-25 18:40 ` Cong Wang
2015-06-26 8:39 ` Naveen N. Rao
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).