* [RFC][PATCH 0/2 v2.0]sched: updating /proc/schedstat
@ 2011-01-25 20:41 Ciju Rajan K
  2011-01-25 20:45 ` [PATCH 1/2 v2.0]sched: Removing unused fields from /proc/schedstat Ciju Rajan K
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Ciju Rajan K @ 2011-01-25 20:41 UTC (permalink / raw)
  To: linux kernel mailing list
  Cc: Peter Zijlstra, Bharata B Rao, Ingo Molnar, Srivatsa Vaddagiri,
	Satoru Takeuchi, Ciju Rajan K

Hi All,

Here is v2.0 of the patch set, which updates the
/proc/schedstat statistics. Please review the patches
and let me know your thoughts.


Changes from v1.0:

* Fixed a couple of typos
* Rewrote the documentation for the sched-domain statistics
* Rebased to 2.6.38-rc2

-Ciju


 Documentation/scheduler/sched-stats.txt |  150 +++++++++++---------------------
 include/linux/sched.h                   |   11 --
 kernel/sched_debug.c                    |    1 
 kernel/sched_stats.h                    |   13 +-
 4 files changed, 61 insertions(+), 114 deletions(-)






* [PATCH 1/2 v2.0]sched: Removing unused fields from /proc/schedstat
  2011-01-25 20:41 [RFC][PATCH 0/2 v2.0]sched: updating /proc/schedstat Ciju Rajan K
@ 2011-01-25 20:45 ` Ciju Rajan K
  2011-01-26  6:10   ` Satoru Takeuchi
  2011-01-25 20:46 ` [PATCH 2/2 v2.0]sched: Updating the sched-stat documentation Ciju Rajan K
  2011-02-18 12:43 ` [RFC][PATCH 0/2 v3.0]sched: updating /proc/schedstat Ciju Rajan K
  2 siblings, 1 reply; 13+ messages in thread
From: Ciju Rajan K @ 2011-01-25 20:45 UTC (permalink / raw)
  To: linux kernel mailing list
  Cc: Ciju Rajan K, Peter Zijlstra, Bharata B Rao, Ingo Molnar,
	Srivatsa Vaddagiri, Satoru Takeuchi


sched: Updating the fields of /proc/schedstat

This patch removes the unused statistics from /proc/schedstat.
It also updates the runqueue structure fields accordingly.

Signed-off-by: Ciju Rajan K <ciju@linux.vnet.ibm.com>


diff --git a/include/linux/sched.h b/include/linux/sched.h
index d747f94..1c0ac12 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -954,20 +954,9 @@ struct sched_domain {
 	unsigned int alb_failed;
 	unsigned int alb_pushed;
 
-	/* SD_BALANCE_EXEC stats */
-	unsigned int sbe_count;
-	unsigned int sbe_balanced;
-	unsigned int sbe_pushed;
-
-	/* SD_BALANCE_FORK stats */
-	unsigned int sbf_count;
-	unsigned int sbf_balanced;
-	unsigned int sbf_pushed;
-
 	/* try_to_wake_up() stats */
 	unsigned int ttwu_wake_remote;
 	unsigned int ttwu_move_affine;
-	unsigned int ttwu_move_balance;
 #endif
 #ifdef CONFIG_SCHED_DEBUG
 	char *name;
diff --git a/kernel/sched_debug.c b/kernel/sched_debug.c
index eb6cb8e..99893be 100644
--- a/kernel/sched_debug.c
+++ b/kernel/sched_debug.c
@@ -286,7 +286,6 @@ static void print_cpu(struct seq_file *m, int cpu)
 
 	P(yld_count);
 
-	P(sched_switch);
 	P(sched_count);
 	P(sched_goidle);
 #ifdef CONFIG_SMP
diff --git a/kernel/sched_stats.h b/kernel/sched_stats.h
index 48ddf43..8869ed9 100644
--- a/kernel/sched_stats.h
+++ b/kernel/sched_stats.h
@@ -4,7 +4,7 @@
  * bump this up when changing the output format or the meaning of an existing
  * format, so that tools can adapt (or abort)
  */
-#define SCHEDSTAT_VERSION 15
+#define SCHEDSTAT_VERSION 16
 
 static int show_schedstat(struct seq_file *seq, void *v)
 {
@@ -26,9 +26,9 @@ static int show_schedstat(struct seq_file *seq, void *v)
 
 		/* runqueue-specific stats */
 		seq_printf(seq,
-		    "cpu%d %u %u %u %u %u %u %llu %llu %lu",
+		    "cpu%d %u %u %u %u %u %llu %llu %lu",
 		    cpu, rq->yld_count,
-		    rq->sched_switch, rq->sched_count, rq->sched_goidle,
+		    rq->sched_count, rq->sched_goidle,
 		    rq->ttwu_count, rq->ttwu_local,
 		    rq->rq_cpu_time,
 		    rq->rq_sched_info.run_delay, rq->rq_sched_info.pcount);
@@ -57,12 +57,9 @@ static int show_schedstat(struct seq_file *seq, void *v)
 				    sd->lb_nobusyg[itype]);
 			}
 			seq_printf(seq,
-				   " %u %u %u %u %u %u %u %u %u %u %u %u\n",
+				   " %u %u %u %u %u\n",
 			    sd->alb_count, sd->alb_failed, sd->alb_pushed,
-			    sd->sbe_count, sd->sbe_balanced, sd->sbe_pushed,
-			    sd->sbf_count, sd->sbf_balanced, sd->sbf_pushed,
-			    sd->ttwu_wake_remote, sd->ttwu_move_affine,
-			    sd->ttwu_move_balance);
+			    sd->ttwu_wake_remote, sd->ttwu_move_affine);
 		}
 		preempt_enable();
 #endif
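
[A minimal userspace sketch, not part of the patch above: it illustrates
the "adapt (or abort)" guidance in the SCHEDSTAT_VERSION comment by
reading the "version <N>" line that show_schedstat() prints first and
refusing to run against an unexpected format. The error handling is an
illustrative assumption.]

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	FILE *fp = fopen("/proc/schedstat", "r");
	unsigned int version;

	if (!fp) {
		perror("/proc/schedstat");
		return EXIT_FAILURE;
	}
	/* The first line of /proc/schedstat is "version <N>". */
	if (fscanf(fp, "version %u", &version) != 1 || version != 16) {
		fprintf(stderr, "unsupported schedstat version\n");
		fclose(fp);
		return EXIT_FAILURE;
	}
	printf("schedstat version %u\n", version);
	fclose(fp);
	return EXIT_SUCCESS;
}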


* [PATCH 2/2 v2.0]sched: Updating the sched-stat documentation
  2011-01-25 20:41 [RFC][PATCH 0/2 v2.0]sched: updating /proc/schedstat Ciju Rajan K
  2011-01-25 20:45 ` [PATCH 1/2 v2.0]sched: Removing unused fields from /proc/schedstat Ciju Rajan K
@ 2011-01-25 20:46 ` Ciju Rajan K
  2011-02-03  9:19   ` Satoru Takeuchi
  2011-02-18 12:43 ` [RFC][PATCH 0/2 v3.0]sched: updating /proc/schedstat Ciju Rajan K
  2 siblings, 1 reply; 13+ messages in thread
From: Ciju Rajan K @ 2011-01-25 20:46 UTC (permalink / raw)
  To: linux kernel mailing list
  Cc: Ciju Rajan K, Peter Zijlstra, Bharata B Rao, Ingo Molnar,
	Srivatsa Vaddagiri, Satoru Takeuchi

sched: Updating the sched-stat documentation

Some unused fields have been removed from /proc/schedstat.
This patch updates the documentation to reflect those changes.

Signed-off-by: Ciju Rajan K <ciju@linux.vnet.ibm.com>
---

diff --git a/Documentation/scheduler/sched-stats.txt b/Documentation/scheduler/sched-stats.txt
index 01e6940..28bee75 100644
--- a/Documentation/scheduler/sched-stats.txt
+++ b/Documentation/scheduler/sched-stats.txt
@@ -26,119 +26,81 @@ Note that any such script will necessarily be version-specific, as the main
 reason to change versions is changes in the output format.  For those wishing
 to write their own scripts, the fields are described here.
 
+The first two fields of /proc/schedstat indicate the version (current
+version is 16) and the jiffies value. The values that follow are the
+CPU and domain statistics.
+
 CPU statistics
 --------------
-cpu<N> 1 2 3 4 5 6 7 8 9 10 11 12
-
-NOTE: In the sched_yield() statistics, the active queue is considered empty
-    if it has only one process in it, since obviously the process calling
-    sched_yield() is that process.
-
-First four fields are sched_yield() statistics:
-     1) # of times both the active and the expired queue were empty
-     2) # of times just the active queue was empty
-     3) # of times just the expired queue was empty
-     4) # of times sched_yield() was called
-
-Next three are schedule() statistics:
-     5) # of times we switched to the expired queue and reused it
-     6) # of times schedule() was called
-     7) # of times schedule() left the processor idle
+The format is like this:
 
-Next two are try_to_wake_up() statistics:
-     8) # of times try_to_wake_up() was called
-     9) # of times try_to_wake_up() was called to wake up the local cpu
+cpu<N> 1 2 3 4 5 6 7 8
 
-Next three are statistics describing scheduling latency:
-    10) sum of all time spent running by tasks on this processor (in jiffies)
-    11) sum of all time spent waiting to run by tasks on this processor (in
-        jiffies)
-    12) # of timeslices run on this cpu
+     1) # of times sched_yield() was called on this CPU
+     2) # of times the scheduler ran on this CPU
+     3) # of times the scheduler picked the idle task to run next on this CPU
+     4) # of times try_to_wake_up() was run on this CPU
+        (number of times a task wakeup was attempted from this CPU)
+     5) # of times try_to_wake_up() woke up a task on the same CPU
+        (local wakeup)
+     6) Time (in ns) for which tasks have run on this CPU
+     7) Time (in ns) for which tasks on this CPU's runqueue have waited
+        before getting to run on the CPU
+     8) # of tasks that have run on this CPU
 
 
 Domain statistics
 -----------------
-One of these is produced per domain for each cpu described. (Note that if
-CONFIG_SMP is not defined, *no* domains are utilized and these lines
-will not appear in the output.)
+One of these is produced per domain for each cpu described. 
+(Note that if CONFIG_SMP is not defined, *no* domains are utilized
+ and these lines will not appear in the output.)
 
-domain<N> <cpumask> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
+domain<N> <cpumask> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
 
 The first field is a bit mask indicating what cpus this domain operates over.
 
-The next 24 are a variety of load_balance() statistics in grouped into types
-of idleness (idle, busy, and newly idle):
-
-     1) # of times in this domain load_balance() was called when the
-        cpu was idle
-     2) # of times in this domain load_balance() checked but found
-        the load did not require balancing when the cpu was idle
-     3) # of times in this domain load_balance() tried to move one or
-        more tasks and failed, when the cpu was idle
-     4) sum of imbalances discovered (if any) with each call to
-        load_balance() in this domain when the cpu was idle
-     5) # of times in this domain pull_task() was called when the cpu
-        was idle
-     6) # of times in this domain pull_task() was called even though
-        the target task was cache-hot when idle
-     7) # of times in this domain load_balance() was called but did
-        not find a busier queue while the cpu was idle
-     8) # of times in this domain a busier queue was found while the
-        cpu was idle but no busier group was found
-
-     9) # of times in this domain load_balance() was called when the
-        cpu was busy
-    10) # of times in this domain load_balance() checked but found the
-        load did not require balancing when busy
-    11) # of times in this domain load_balance() tried to move one or
-        more tasks and failed, when the cpu was busy
-    12) sum of imbalances discovered (if any) with each call to
-        load_balance() in this domain when the cpu was busy
-    13) # of times in this domain pull_task() was called when busy
-    14) # of times in this domain pull_task() was called even though the
-        target task was cache-hot when busy
-    15) # of times in this domain load_balance() was called but did not
-        find a busier queue while the cpu was busy
-    16) # of times in this domain a busier queue was found while the cpu
-        was busy but no busier group was found
-
-    17) # of times in this domain load_balance() was called when the
-        cpu was just becoming idle
-    18) # of times in this domain load_balance() checked but found the
-        load did not require balancing when the cpu was just becoming idle
-    19) # of times in this domain load_balance() tried to move one or more
-        tasks and failed, when the cpu was just becoming idle
-    20) sum of imbalances discovered (if any) with each call to
-        load_balance() in this domain when the cpu was just becoming idle
-    21) # of times in this domain pull_task() was called when newly idle
-    22) # of times in this domain pull_task() was called even though the
-        target task was cache-hot when just becoming idle
-    23) # of times in this domain load_balance() was called but did not
-        find a busier queue while the cpu was just becoming idle
-    24) # of times in this domain a busier queue was found while the cpu
-        was just becoming idle but no busier group was found
-
+The next 24 are a variety of load_balance() statistics grouped into
+types of idleness (idle, busy, and newly idle). The three idle 
+states are:
+
+CPU_IDLE:          This state is entered after CPU_NEWLY_IDLE 
+                   state fails to find a new task for this CPU
+CPU_NOT_IDLE:      Load balancer is being run on a CPU when it is 
+                   not in IDLE state (busy times)
+CPU_NEWLY_IDLE:    Load balancer is being run on a CPU which is 
+                   about to enter IDLE state
+
+There are eight stats available for each of the above three states:
+     - # of times in this domain load_balance() was called
+     - # of times in this domain load_balance() checked but found
+        the load did not require balancing
+     - # of times in this domain load_balance() tried to move one or
+        more tasks and failed
+     - sum of imbalances discovered (if any) with each call to
+        load_balance() in this domain
+     - # of times in this domain pull_task() was called
+     - # of times in this domain pull_task() was called even though
+        the target task was cache-hot
+     - # of times in this domain load_balance() was called but did
+        not find a busier queue
+     - # of times in this domain a busier queue was found but no 
+        busier group was found
+
+   Fields 1-8 are the stats for when the cpu was idle (CPU_IDLE),
+   fields 9-16 are the stats for when the cpu was busy (CPU_NOT_IDLE),
+   and fields 17-24 are the stats for when the cpu was just
+   becoming idle (CPU_NEWLY_IDLE).
+     
    Next three are active_load_balance() statistics:
     25) # of times active_load_balance() was called
     26) # of times active_load_balance() tried to move a task and failed
     27) # of times active_load_balance() successfully moved a task
 
-   Next three are sched_balance_exec() statistics:
-    28) sbe_cnt is not used
-    29) sbe_balanced is not used
-    30) sbe_pushed is not used
-
-   Next three are sched_balance_fork() statistics:
-    31) sbf_cnt is not used
-    32) sbf_balanced is not used
-    33) sbf_pushed is not used
-
-   Next three are try_to_wake_up() statistics:
-    34) # of times in this domain try_to_wake_up() awoke a task that
+   Next two are try_to_wake_up() statistics:
+    28) # of times in this domain try_to_wake_up() awoke a task that
         last ran on a different cpu in this domain
-    35) # of times in this domain try_to_wake_up() moved a task to the
+    29) # of times in this domain try_to_wake_up() moved a task to the
         waking cpu because it was cache-cold on its own cpu anyway
-    36) # of times in this domain try_to_wake_up() started passive balancing
 
 /proc/<pid>/schedstat
 ----------------
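
[As a worked example of the version-16 cpu<N> line documented above,
here is a small sketch, not part of the patch: it parses the eight
fields in the order given by the seq_printf() format in patch 1/2 and
derives the average wait per timeslice from fields 7 and 8. The
reporting is an illustrative assumption.]

#include <stdio.h>

int main(void)
{
	char line[512];
	FILE *fp = fopen("/proc/schedstat", "r");

	if (!fp)
		return 1;
	while (fgets(line, sizeof(line), fp)) {
		int cpu;
		unsigned int yld, sched_cnt, goidle, ttwu, ttwu_local;
		unsigned long long cpu_time, run_delay;
		unsigned long pcount;

		/* Matches: "cpu%d %u %u %u %u %u %llu %llu %lu" */
		if (sscanf(line, "cpu%d %u %u %u %u %u %llu %llu %lu",
			   &cpu, &yld, &sched_cnt, &goidle, &ttwu,
			   &ttwu_local, &cpu_time, &run_delay, &pcount) == 9)
			printf("cpu%d: avg wait %llu ns over %lu timeslices\n",
			       cpu, pcount ? run_delay / pcount : 0, pcount);
	}
	fclose(fp);
	return 0;
}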


* Re: [PATCH 1/2 v2.0]sched: Removing unused fields from /proc/schedstat
  2011-01-25 20:45 ` [PATCH 1/2 v2.0]sched: Removing unused fields from /proc/schedstat Ciju Rajan K
@ 2011-01-26  6:10   ` Satoru Takeuchi
  2011-01-31  4:10     ` Ciju Rajan K
  0 siblings, 1 reply; 13+ messages in thread
From: Satoru Takeuchi @ 2011-01-26  6:10 UTC (permalink / raw)
  To: Ciju Rajan K
  Cc: linux kernel mailing list, Peter Zijlstra, Bharata B Rao,
	Ingo Molnar, Srivatsa Vaddagiri

Hi Ciju,

(2011/01/26 5:45), Ciju Rajan K wrote:
>
> sched: Updating the fields of /proc/schedstat
>
> This patch removes the unused statistics from /proc/schedstat.
> It also updates the runqueue structure fields accordingly.
>
> Signed-off-by: Ciju Rajan K<ciju@linux.vnet.ibm.com>

This patch is logically correct, compiles successfully, and works
fine. But after reading the old and upstream scheduler code again,
I am worried about whether it is good to remove all of the fields
you mentioned.

I think we can remove rq->sched_switch and sd->ttwu_move_balance
without any problem because they are meaningless. The former
is from the old O(1) scheduler and counted the number of runqueue
switches between the active and expired queues. The latter is for
the SD_WAKE_BALANCE flag, whose logic is already gone.

However, sbe_* is for the SD_BALANCE_EXEC flag and sbf_* is for
the SD_BALANCE_FORK flag. Since the logic for both is still alive,
the absence of this accounting is a regression from my perspective.
In addition, these fields would be useful for analyzing load
balance behavior.

# although I had not noticed that they are always zero ;-(

I would prefer not to remove these fields ({sbe,sbf}_*) but to add
accounting code for these flags again. What do you think?

Thanks,
Satoru

>
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index d747f94..1c0ac12 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -954,20 +954,9 @@ struct sched_domain {
>   	unsigned int alb_failed;
>   	unsigned int alb_pushed;
>
> -	/* SD_BALANCE_EXEC stats */
> -	unsigned int sbe_count;
> -	unsigned int sbe_balanced;
> -	unsigned int sbe_pushed;
> -
> -	/* SD_BALANCE_FORK stats */
> -	unsigned int sbf_count;
> -	unsigned int sbf_balanced;
> -	unsigned int sbf_pushed;
> -
>   	/* try_to_wake_up() stats */
>   	unsigned int ttwu_wake_remote;
>   	unsigned int ttwu_move_affine;
> -	unsigned int ttwu_move_balance;
>   #endif
>   #ifdef CONFIG_SCHED_DEBUG
>   	char *name;
> diff --git a/kernel/sched_debug.c b/kernel/sched_debug.c
> index eb6cb8e..99893be 100644
> --- a/kernel/sched_debug.c
> +++ b/kernel/sched_debug.c
> @@ -286,7 +286,6 @@ static void print_cpu(struct seq_file *m, int cpu)
>
>   	P(yld_count);
>
> -	P(sched_switch);
>   	P(sched_count);
>   	P(sched_goidle);
>   #ifdef CONFIG_SMP
> diff --git a/kernel/sched_stats.h b/kernel/sched_stats.h
> index 48ddf43..8869ed9 100644
> --- a/kernel/sched_stats.h
> +++ b/kernel/sched_stats.h
> @@ -4,7 +4,7 @@
>    * bump this up when changing the output format or the meaning of an existing
>    * format, so that tools can adapt (or abort)
>    */
> -#define SCHEDSTAT_VERSION 15
> +#define SCHEDSTAT_VERSION 16
>
>   static int show_schedstat(struct seq_file *seq, void *v)
>   {
> @@ -26,9 +26,9 @@ static int show_schedstat(struct seq_file *seq, void *v)
>
>   		/* runqueue-specific stats */
>   		seq_printf(seq,
> -		    "cpu%d %u %u %u %u %u %u %llu %llu %lu",
> +		    "cpu%d %u %u %u %u %u %llu %llu %lu",
>   		    cpu, rq->yld_count,
> -		    rq->sched_switch, rq->sched_count, rq->sched_goidle,
> +		    rq->sched_count, rq->sched_goidle,
>   		    rq->ttwu_count, rq->ttwu_local,
>   		    rq->rq_cpu_time,
>   		    rq->rq_sched_info.run_delay, rq->rq_sched_info.pcount);
> @@ -57,12 +57,9 @@ static int show_schedstat(struct seq_file *seq, void *v)
>   				    sd->lb_nobusyg[itype]);
>   			}
>   			seq_printf(seq,
> -				   " %u %u %u %u %u %u %u %u %u %u %u %u\n",
> +				   " %u %u %u %u %u\n",
>   			    sd->alb_count, sd->alb_failed, sd->alb_pushed,
> -			    sd->sbe_count, sd->sbe_balanced, sd->sbe_pushed,
> -			    sd->sbf_count, sd->sbf_balanced, sd->sbf_pushed,
> -			    sd->ttwu_wake_remote, sd->ttwu_move_affine,
> -			    sd->ttwu_move_balance);
> +			    sd->ttwu_wake_remote, sd->ttwu_move_affine);
>   		}
>   		preempt_enable();
>   #endif




* Re: [PATCH 1/2 v2.0]sched: Removing unused fields from /proc/schedstat
  2011-01-26  6:10   ` Satoru Takeuchi
@ 2011-01-31  4:10     ` Ciju Rajan K
  2011-02-02  8:54       ` Ciju Rajan K
  0 siblings, 1 reply; 13+ messages in thread
From: Ciju Rajan K @ 2011-01-31  4:10 UTC (permalink / raw)
  To: Satoru Takeuchi
  Cc: linux kernel mailing list, Peter Zijlstra, Bharata B Rao,
	Ingo Molnar, Srivatsa Vaddagiri, Ciju Rajan K

Hi Satoru,

> 
> This patch is logically correct, compiles successfully, and works
> fine. But after reading the old and upstream scheduler code again,
> I am worried about whether it is good to remove all of the fields
> you mentioned.
> 
> I think we can remove rq->sched_switch and sd->ttwu_move_balance
> without any problem because they are meaningless. The former
> is from the old O(1) scheduler and counted the number of runqueue
> switches between the active and expired queues. The latter is for
> the SD_WAKE_BALANCE flag, whose logic is already gone.
> 
> However, sbe_* is for the SD_BALANCE_EXEC flag and sbf_* is for
> the SD_BALANCE_FORK flag. Since the logic for both is still alive,
> the absence of this accounting is a regression from my perspective.
> In addition, these fields would be useful for analyzing load
> balance behavior.
> 

The sbe_* & sbf_* counters were added by commit
68767a0ae428801649d510d9a65bb71feed44dd1. git describe shows that it
went into v2.6.12-1422-g68767a0, which is quite old. So in my opinion
this might not be a regression.

> # although I had not noticed that they are always zero ;-(
> 
> I would prefer not to remove these fields ({sbe,sbf}_*) but to add
> accounting code for these flags again. What do you think?

I will go through the code and verify once again.

-Ciju



* Re: [PATCH 1/2 v2.0]sched: Removing unused fields from /proc/schedstat
  2011-01-31  4:10     ` Ciju Rajan K
@ 2011-02-02  8:54       ` Ciju Rajan K
  2011-02-03  9:19         ` Satoru Takeuchi
  0 siblings, 1 reply; 13+ messages in thread
From: Ciju Rajan K @ 2011-02-02  8:54 UTC (permalink / raw)
  To: Satoru Takeuchi
  Cc: Ciju Rajan K, linux kernel mailing list, Peter Zijlstra,
	Bharata B Rao, Ingo Molnar, Srivatsa Vaddagiri

Hi Satoru,


>> I think we can remove rq->sched_switch and sd->ttwu_move_balance
>> without any problem because they are meaningless. The former
>> is from the old O(1) scheduler and counted the number of runqueue
>> switches between the active and expired queues. The latter is for
>> the SD_WAKE_BALANCE flag, whose logic is already gone.
>>
>> However, sbe_* is for the SD_BALANCE_EXEC flag and sbf_* is for
>> the SD_BALANCE_FORK flag. Since the logic for both is still alive,
>> the absence of this accounting is a regression from my perspective.
>> In addition, these fields would be useful for analyzing load
>> balance behavior.
>>

The sbe_* & sbf_* counters were added by commit
68767a0ae428801649d510d9a65bb71feed44dd1, but they were subsequently
removed by commit 476d139c218e44e045e4bc6d4cc02b010b343939.

[ciju@ciju kernel]$ git describe 68767a0ae428801649d510d9a65bb71feed44dd1 --contains
v2.6.13-rc1~68^2~148
[ciju@ciju kernel]$ git describe 476d139c218e44e045e4bc6d4cc02b010b343939 --contains
v2.6.13-rc1~68^2~140

So they were introduced and removed in the 2.6.13 time frame.


When the counters were removed, the sbe_*/sbf_* variable
declarations were left in place, which caused a little confusion.
So I believe these stats were never actually available, and their
removal can't be considered a regression.

476d139c218e44e045e4bc6d4cc02b010b343939 consolidated the fork and
exec balancing. Thereafter it became non-trivial to provide separate
stats for fork and exec events. So if people think a consolidated
balance-on-event stat is needed, it can be looked into separately. But
that shouldn't prevent this documentation cleanup patch from
getting in.

-Ciju




* Re: [PATCH 1/2 v2.0]sched: Removing unused fields from /proc/schedstat
  2011-02-02  8:54       ` Ciju Rajan K
@ 2011-02-03  9:19         ` Satoru Takeuchi
  2011-02-07  9:33           ` Ciju Rajan K
  0 siblings, 1 reply; 13+ messages in thread
From: Satoru Takeuchi @ 2011-02-03  9:19 UTC (permalink / raw)
  To: Ciju Rajan K
  Cc: linux kernel mailing list, Peter Zijlstra, Bharata B Rao,
	Ingo Molnar, Srivatsa Vaddagiri

Hi Ciju,

(2011/02/02 17:54), Ciju Rajan K wrote:
> Hi Satoru,
>
>
>>> I think we can remove rq->sched_switch and sd->ttwu_move_balance
>>> without any problem because they are meaningless. The former
>>> is from the old O(1) scheduler and counted the number of runqueue
>>> switches between the active and expired queues. The latter is for
>>> the SD_WAKE_BALANCE flag, whose logic is already gone.
>>>
>>> However, sbe_* is for the SD_BALANCE_EXEC flag and sbf_* is for
>>> the SD_BALANCE_FORK flag. Since the logic for both is still alive,
>>> the absence of this accounting is a regression from my perspective.
>>> In addition, these fields would be useful for analyzing load
>>> balance behavior.
>>>
>
> The sbe_* & sbf_* counters were added by commit
> 68767a0ae428801649d510d9a65bb71feed44dd1, but they were subsequently
> removed by commit 476d139c218e44e045e4bc6d4cc02b010b343939.

OK, understood. It's OK as long as the user tools that refer to
/proc/schedstat are released in sync with this change.

I confirmed the following:

  - This patch removes some unused schedstat fields and related
    data.
  - A kernel with this patch applied works fine on my i386 box.

Tested-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>

Thanks,
Satoru

>
> [ciju@ciju kernel]$ git describe 68767a0ae428801649d510d9a65bb71feed44dd1 --contains
> v2.6.13-rc1~68^2~148
> [ciju@ciju kernel]$ git describe 476d139c218e44e045e4bc6d4cc02b010b343939 --contains
> v2.6.13-rc1~68^2~140
>
> So they were introduced and removed in the 2.6.13 time frame.
>
>
> When the counters were removed, the sbe_*/sbf_* variable
> declarations were left in place, which caused a little confusion.
> So I believe these stats were never actually available, and their
> removal can't be considered a regression.
>
> 476d139c218e44e045e4bc6d4cc02b010b343939 consolidated the fork and
> exec balancing. Thereafter it became non-trivial to provide separate
> stats for fork and exec events. So if people think a consolidated
> balance-on-event stat is needed, it can be looked into separately. But
> that shouldn't prevent this documentation cleanup patch from
> getting in.
>
> -Ciju
>
>




* Re: [PATCH 2/2 v2.0]sched: Updating the sched-stat documentation
  2011-01-25 20:46 ` [PATCH 2/2 v2.0]sched: Updating the sched-stat documentation Ciju Rajan K
@ 2011-02-03  9:19   ` Satoru Takeuchi
  0 siblings, 0 replies; 13+ messages in thread
From: Satoru Takeuchi @ 2011-02-03  9:19 UTC (permalink / raw)
  To: Ciju Rajan K
  Cc: linux kernel mailing list, Peter Zijlstra, Bharata B Rao,
	Ingo Molnar, Srivatsa Vaddagiri

(2011/01/26 5:46), Ciju Rajan K wrote:
> sched: Updating the sched-stat documentation
>
> Some unused fields have been removed from /proc/schedstat.
> This patch updates the documentation to reflect those changes.
>
> Signed-off-by: Ciju Rajan K<ciju@linux.vnet.ibm.com>

Reviewed-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>

> ---
>
> diff --git a/Documentation/scheduler/sched-stats.txt b/Documentation/scheduler/sched-stats.txt
> index 01e6940..28bee75 100644
> --- a/Documentation/scheduler/sched-stats.txt
> +++ b/Documentation/scheduler/sched-stats.txt
> @@ -26,119 +26,81 @@ Note that any such script will necessarily be version-specific, as the main
>   reason to change versions is changes in the output format.  For those wishing
>   to write their own scripts, the fields are described here.
>
> +The first two fields of /proc/schedstat indicate the version (current
> +version is 16) and the jiffies value. The values that follow are the
> +CPU and domain statistics.
> +
>   CPU statistics
>   --------------
> -cpu<N>  1 2 3 4 5 6 7 8 9 10 11 12
> -
> -NOTE: In the sched_yield() statistics, the active queue is considered empty
> -    if it has only one process in it, since obviously the process calling
> -    sched_yield() is that process.
> -
> -First four fields are sched_yield() statistics:
> -     1) # of times both the active and the expired queue were empty
> -     2) # of times just the active queue was empty
> -     3) # of times just the expired queue was empty
> -     4) # of times sched_yield() was called
> -
> -Next three are schedule() statistics:
> -     5) # of times we switched to the expired queue and reused it
> -     6) # of times schedule() was called
> -     7) # of times schedule() left the processor idle
> +The format is like this:
>
> -Next two are try_to_wake_up() statistics:
> -     8) # of times try_to_wake_up() was called
> -     9) # of times try_to_wake_up() was called to wake up the local cpu
> +cpu<N>  1 2 3 4 5 6 7 8
>
> -Next three are statistics describing scheduling latency:
> -    10) sum of all time spent running by tasks on this processor (in jiffies)
> -    11) sum of all time spent waiting to run by tasks on this processor (in
> -        jiffies)
> -    12) # of timeslices run on this cpu
> +     1) # of times sched_yield() was called on this CPU
> +     2) # of times the scheduler ran on this CPU
> +     3) # of times the scheduler picked the idle task to run next on this CPU
> +     4) # of times try_to_wake_up() was run on this CPU
> +        (number of times a task wakeup was attempted from this CPU)
> +     5) # of times try_to_wake_up() woke up a task on the same CPU
> +        (local wakeup)
> +     6) Time (in ns) for which tasks have run on this CPU
> +     7) Time (in ns) for which tasks on this CPU's runqueue have waited
> +        before getting to run on the CPU
> +     8) # of tasks that have run on this CPU
>
>
>   Domain statistics
>   -----------------
> -One of these is produced per domain for each cpu described. (Note that if
> -CONFIG_SMP is not defined, *no* domains are utilized and these lines
> -will not appear in the output.)
> +One of these is produced per domain for each cpu described.
> +(Note that if CONFIG_SMP is not defined, *no* domains are utilized
> + and these lines will not appear in the output.)
>
> -domain<N>  <cpumask>  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
> +domain<N>  <cpumask>  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
>
>   The first field is a bit mask indicating what cpus this domain operates over.
>
> -The next 24 are a variety of load_balance() statistics in grouped into types
> -of idleness (idle, busy, and newly idle):
> -
> -     1) # of times in this domain load_balance() was called when the
> -        cpu was idle
> -     2) # of times in this domain load_balance() checked but found
> -        the load did not require balancing when the cpu was idle
> -     3) # of times in this domain load_balance() tried to move one or
> -        more tasks and failed, when the cpu was idle
> -     4) sum of imbalances discovered (if any) with each call to
> -        load_balance() in this domain when the cpu was idle
> -     5) # of times in this domain pull_task() was called when the cpu
> -        was idle
> -     6) # of times in this domain pull_task() was called even though
> -        the target task was cache-hot when idle
> -     7) # of times in this domain load_balance() was called but did
> -        not find a busier queue while the cpu was idle
> -     8) # of times in this domain a busier queue was found while the
> -        cpu was idle but no busier group was found
> -
> -     9) # of times in this domain load_balance() was called when the
> -        cpu was busy
> -    10) # of times in this domain load_balance() checked but found the
> -        load did not require balancing when busy
> -    11) # of times in this domain load_balance() tried to move one or
> -        more tasks and failed, when the cpu was busy
> -    12) sum of imbalances discovered (if any) with each call to
> -        load_balance() in this domain when the cpu was busy
> -    13) # of times in this domain pull_task() was called when busy
> -    14) # of times in this domain pull_task() was called even though the
> -        target task was cache-hot when busy
> -    15) # of times in this domain load_balance() was called but did not
> -        find a busier queue while the cpu was busy
> -    16) # of times in this domain a busier queue was found while the cpu
> -        was busy but no busier group was found
> -
> -    17) # of times in this domain load_balance() was called when the
> -        cpu was just becoming idle
> -    18) # of times in this domain load_balance() checked but found the
> -        load did not require balancing when the cpu was just becoming idle
> -    19) # of times in this domain load_balance() tried to move one or more
> -        tasks and failed, when the cpu was just becoming idle
> -    20) sum of imbalances discovered (if any) with each call to
> -        load_balance() in this domain when the cpu was just becoming idle
> -    21) # of times in this domain pull_task() was called when newly idle
> -    22) # of times in this domain pull_task() was called even though the
> -        target task was cache-hot when just becoming idle
> -    23) # of times in this domain load_balance() was called but did not
> -        find a busier queue while the cpu was just becoming idle
> -    24) # of times in this domain a busier queue was found while the cpu
> -        was just becoming idle but no busier group was found
> -
> +The next 24 are a variety of load_balance() statistics grouped into
> +types of idleness (idle, busy, and newly idle). The three idle
> +states are:
> +
> +CPU_IDLE:          This state is entered after CPU_NEWLY_IDLE
> +                   state fails to find a new task for this CPU
> +CPU_NOT_IDLE:      Load balancer is being run on a CPU when it is
> +                   not in IDLE state (busy times)
> +CPU_NEWLY_IDLE:    Load balancer is being run on a CPU which is
> +                   about to enter IDLE state
> +
> +There are eight stats available for each of the above three states:
> +     - # of times in this domain load_balance() was called
> +     - # of times in this domain load_balance() checked but found
> +        the load did not require balancing
> +     - # of times in this domain load_balance() tried to move one or
> +        more tasks and failed
> +     - sum of imbalances discovered (if any) with each call to
> +        load_balance() in this domain
> +     - # of times in this domain pull_task() was called
> +     - # of times in this domain pull_task() was called even though
> +        the target task was cache-hot
> +     - # of times in this domain load_balance() was called but did
> +        not find a busier queue
> +     - # of times in this domain a busier queue was found but no
> +        busier group was found
> +
> +   Fields 1-8 are the stats for when the cpu was idle (CPU_IDLE),
> +   fields 9-16 are the stats for when the cpu was busy (CPU_NOT_IDLE),
> +   and fields 17-24 are the stats for when the cpu was just
> +   becoming idle (CPU_NEWLY_IDLE).
> +
>      Next three are active_load_balance() statistics:
>       25) # of times active_load_balance() was called
>       26) # of times active_load_balance() tried to move a task and failed
>       27) # of times active_load_balance() successfully moved a task
>
> -   Next three are sched_balance_exec() statistics:
> -    28) sbe_cnt is not used
> -    29) sbe_balanced is not used
> -    30) sbe_pushed is not used
> -
> -   Next three are sched_balance_fork() statistics:
> -    31) sbf_cnt is not used
> -    32) sbf_balanced is not used
> -    33) sbf_pushed is not used
> -
> -   Next three are try_to_wake_up() statistics:
> -    34) # of times in this domain try_to_wake_up() awoke a task that
> +   Next two are try_to_wake_up() statistics:
> +    28) # of times in this domain try_to_wake_up() awoke a task that
>           last ran on a different cpu in this domain
> -    35) # of times in this domain try_to_wake_up() moved a task to the
> +    29) # of times in this domain try_to_wake_up() moved a task to the
>           waking cpu because it was cache-cold on its own cpu anyway
> -    36) # of times in this domain try_to_wake_up() started passive balancing
>
>   /proc/<pid>/schedstat
>   ----------------




* Re: [PATCH 1/2 v2.0]sched: Removing unused fields from /proc/schedstat
  2011-02-03  9:19         ` Satoru Takeuchi
@ 2011-02-07  9:33           ` Ciju Rajan K
  0 siblings, 0 replies; 13+ messages in thread
From: Ciju Rajan K @ 2011-02-07  9:33 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Satoru Takeuchi, linux kernel mailing list, Bharata B Rao,
	Ingo Molnar, Srivatsa Vaddagiri, Ciju Rajan K

Hi Peter,

Could you please consider this patch set for inclusion?

-Ciju

On 02/03/2011 02:49 PM, Satoru Takeuchi wrote:
> Hi Ciju,
> 
> (2011/02/02 17:54), Ciju Rajan K wrote:
>> Hi Satoru,
>>
>>
>>>> I think we can remove rq->sched_switch and sd->ttwu_move_balance
>>>> without any problem because they are meaningless. The former
>>>> is from the old O(1) scheduler and counted the number of runqueue
>>>> switches between the active and expired queues. The latter is for
>>>> the SD_WAKE_BALANCE flag, whose logic is already gone.
>>>>
>>>> However, sbe_* is for the SD_BALANCE_EXEC flag and sbf_* is for
>>>> the SD_BALANCE_FORK flag. Since the logic for both is still alive,
>>>> the absence of this accounting is a regression from my perspective.
>>>> In addition, these fields would be useful for analyzing load
>>>> balance behavior.
>>>>
>>
>> The sbe_* & sbf_* counters were added by commit
>> 68767a0ae428801649d510d9a65bb71feed44dd1, but they were subsequently
>> removed by commit 476d139c218e44e045e4bc6d4cc02b010b343939.
> 
> OK, understood. It's OK as long as the user tools that refer to
> /proc/schedstat are released in sync with this change.
> 
> I confirmed the following:
> 
> - This patch removes some unused schedstat fields and related
> data.
> - A kernel with this patch applied works fine on my i386 box.
> 
> Tested-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
> 
> Thanks,
> Satoru
> 
>>
>> [ciju@ciju kernel]$ git describe 68767a0ae428801649d510d9a65bb71feed44dd1 --contains
>> v2.6.13-rc1~68^2~148
>> [ciju@ciju kernel]$ git describe 476d139c218e44e045e4bc6d4cc02b010b343939 --contains
>> v2.6.13-rc1~68^2~140
>>
>> So they were introduced and removed in the 2.6.13 time frame.
>>
>>
>> When the counters were removed, the sbe_*/sbf_* variable
>> declarations were left in place, which caused a little confusion.
>> So I believe these stats were never actually available, and their
>> removal can't be considered a regression.
>>
>> 476d139c218e44e045e4bc6d4cc02b010b343939 consolidated the fork and
>> exec balancing. Thereafter it became non-trivial to provide separate
>> stats for fork and exec events. So if people think a consolidated
>> balance-on-event stat is needed, it can be looked into separately. But
>> that shouldn't prevent this documentation cleanup patch from
>> getting in.
>>
>> -Ciju
>>
>>
> 
> 


* [RFC][PATCH 0/2 v3.0]sched: updating /proc/schedstat
  2011-01-25 20:41 [RFC][PATCH 0/2 v2.0]sched: updating /proc/schedstat Ciju Rajan K
  2011-01-25 20:45 ` [PATCH 1/2 v2.0]sched: Removing unused fields from /proc/schedstat Ciju Rajan K
  2011-01-25 20:46 ` [PATCH 2/2 v2.0]sched: Updating the sched-stat documentation Ciju Rajan K
@ 2011-02-18 12:43 ` Ciju Rajan K
  2011-02-18 12:46   ` [PATCH 1/2 v3.0]sched: Removing unused fields from /proc/schedstat Ciju Rajan K
                     ` (2 more replies)
  2 siblings, 3 replies; 13+ messages in thread
From: Ciju Rajan K @ 2011-02-18 12:43 UTC (permalink / raw)
  To: linux kernel mailing list
  Cc: Ciju Rajan K, Peter Zijlstra, Bharata B Rao, Ingo Molnar,
	Srivatsa Vaddagiri, Satoru Takeuchi

Hi All,

Here is v3.0 of the patch set, which updates the
/proc/schedstat statistics. Please review the patches
and consider them for inclusion.

Changes from v2.0:
 * Rebased to linux-2.6-tip
 * Added Tested-by tag

Changes from v1.0:
 * Fixed a couple of typos
 * Rewrote the documentation for the sched-domain statistics
 * Rebased to 2.6.38-rc2

Previous versions of the patches were posted here:
 (v1.0) https://lkml.org/lkml/2011/1/17/87
 (v2.0) https://lkml.org/lkml/2011/1/25/456

-Ciju


 Documentation/scheduler/sched-stats.txt |  144 ++++++++++++--------------------
 include/linux/sched.h                   |   11 --
 kernel/sched_debug.c                    |    1 
 kernel/sched_stats.h                    |   13 +-
 4 files changed, 60 insertions(+), 109 deletions(-)


* [PATCH 1/2 v3.0]sched: Removing unused fields from /proc/schedstat
  2011-02-18 12:43 ` [RFC][PATCH 0/2 v3.0]sched: updating /proc/schedstat Ciju Rajan K
@ 2011-02-18 12:46   ` Ciju Rajan K
  2011-02-18 12:47   ` [PATCH 2/2 v3.0]sched: Updating the sched-stat documentation Ciju Rajan K
  2011-02-22  8:38   ` [RFC][PATCH 0/2 v3.0]sched: updating /proc/schedstat Bharata B Rao
  2 siblings, 0 replies; 13+ messages in thread
From: Ciju Rajan K @ 2011-02-18 12:46 UTC (permalink / raw)
  To: linux kernel mailing list
  Cc: Ciju Rajan K, Peter Zijlstra, Bharata B Rao, Ingo Molnar,
	Srivatsa Vaddagiri, Satoru Takeuchi


From: Ciju Rajan K <ciju@linux.vnet.ibm.com>
Date: Fri, 18 Feb 2011 16:31:12 +0530
Subject: [PATCH 1/2 v3.0] sched: Updating the fields of /proc/schedstat

This patch removes the unused statistics from /proc/schedstat.
It also updates the runqueue structure fields accordingly.

Signed-off-by: Ciju Rajan K <ciju@linux.vnet.ibm.com>
---
 include/linux/sched.h |   11 -----------
 kernel/sched_debug.c  |    1 -
 kernel/sched_stats.h  |   13 +++++--------
 3 files changed, 5 insertions(+), 20 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 23e9c27..a1691c7 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -954,20 +954,9 @@ struct sched_domain {
 	unsigned int alb_failed;
 	unsigned int alb_pushed;
 
-	/* SD_BALANCE_EXEC stats */
-	unsigned int sbe_count;
-	unsigned int sbe_balanced;
-	unsigned int sbe_pushed;
-
-	/* SD_BALANCE_FORK stats */
-	unsigned int sbf_count;
-	unsigned int sbf_balanced;
-	unsigned int sbf_pushed;
-
 	/* try_to_wake_up() stats */
 	unsigned int ttwu_wake_remote;
 	unsigned int ttwu_move_affine;
-	unsigned int ttwu_move_balance;
 #endif
 #ifdef CONFIG_SCHED_DEBUG
 	char *name;
diff --git a/kernel/sched_debug.c b/kernel/sched_debug.c
index 7bacd83..726b306 100644
--- a/kernel/sched_debug.c
+++ b/kernel/sched_debug.c
@@ -286,7 +286,6 @@ static void print_cpu(struct seq_file *m, int cpu)
 
 	P(yld_count);
 
-	P(sched_switch);
 	P(sched_count);
 	P(sched_goidle);
 #ifdef CONFIG_SMP
diff --git a/kernel/sched_stats.h b/kernel/sched_stats.h
index 48ddf43..8869ed9 100644
--- a/kernel/sched_stats.h
+++ b/kernel/sched_stats.h
@@ -4,7 +4,7 @@
  * bump this up when changing the output format or the meaning of an existing
  * format, so that tools can adapt (or abort)
  */
-#define SCHEDSTAT_VERSION 15
+#define SCHEDSTAT_VERSION 16
 
 static int show_schedstat(struct seq_file *seq, void *v)
 {
@@ -26,9 +26,9 @@ static int show_schedstat(struct seq_file *seq, void *v)
 
 		/* runqueue-specific stats */
 		seq_printf(seq,
-		    "cpu%d %u %u %u %u %u %u %llu %llu %lu",
+		    "cpu%d %u %u %u %u %u %llu %llu %lu",
 		    cpu, rq->yld_count,
-		    rq->sched_switch, rq->sched_count, rq->sched_goidle,
+		    rq->sched_count, rq->sched_goidle,
 		    rq->ttwu_count, rq->ttwu_local,
 		    rq->rq_cpu_time,
 		    rq->rq_sched_info.run_delay, rq->rq_sched_info.pcount);
@@ -57,12 +57,9 @@ static int show_schedstat(struct seq_file *seq, void *v)
 				    sd->lb_nobusyg[itype]);
 			}
 			seq_printf(seq,
-				   " %u %u %u %u %u %u %u %u %u %u %u %u\n",
+				   " %u %u %u %u %u\n",
 			    sd->alb_count, sd->alb_failed, sd->alb_pushed,
-			    sd->sbe_count, sd->sbe_balanced, sd->sbe_pushed,
-			    sd->sbf_count, sd->sbf_balanced, sd->sbf_pushed,
-			    sd->ttwu_wake_remote, sd->ttwu_move_affine,
-			    sd->ttwu_move_balance);
+			    sd->ttwu_wake_remote, sd->ttwu_move_affine);
 		}
 		preempt_enable();
 #endif


* [PATCH 2/2 v3.0]sched: Updating the sched-stat documentation
  2011-02-18 12:43 ` [RFC][PATCH 0/2 v3.0]sched: updating /proc/schedstat Ciju Rajan K
  2011-02-18 12:46   ` [PATCH 1/2 v3.0]sched: Removing unused fields from /proc/schedstat Ciju Rajan K
@ 2011-02-18 12:47   ` Ciju Rajan K
  2011-02-22  8:38   ` [RFC][PATCH 0/2 v3.0]sched: updating /proc/schedstat Bharata B Rao
  2 siblings, 0 replies; 13+ messages in thread
From: Ciju Rajan K @ 2011-02-18 12:47 UTC (permalink / raw)
  To: linux kernel mailing list
  Cc: Ciju Rajan K, Peter Zijlstra, Bharata B Rao, Ingo Molnar,
	Srivatsa Vaddagiri, Satoru Takeuchi

From: Ciju Rajan K <ciju@linux.vnet.ibm.com>
Date: Fri, 18 Feb 2011 16:29:14 +0530
Subject: [PATCH 2/2 v3.0] sched: Updating the sched-stat documentation

Some unused fields have been removed from /proc/schedstat.
This patch updates the documentation to reflect those changes.

Signed-off-by: Ciju Rajan K <ciju@linux.vnet.ibm.com>
---
 Documentation/scheduler/sched-stats.txt |  144 ++++++++++++-------------------
 1 files changed, 55 insertions(+), 89 deletions(-)

diff --git a/Documentation/scheduler/sched-stats.txt b/Documentation/scheduler/sched-stats.txt
index 1cd5d51..de47562 100644
--- a/Documentation/scheduler/sched-stats.txt
+++ b/Documentation/scheduler/sched-stats.txt
@@ -1,3 +1,4 @@
+Version 16 of schedstats removed some of the unused fields.
 Version 15 of schedstats dropped counters for some sched_yield:
 yld_exp_empty, yld_act_empty and yld_both_empty. Otherwise, it is
 identical to version 14.
@@ -30,112 +31,77 @@ Note that any such script will necessarily be version-specific, as the main
 reason to change versions is changes in the output format.  For those wishing
 to write their own scripts, the fields are described here.
 
+The first two fields of /proc/schedstat indicate the version (current
+version is 16) and the jiffies value. The values that follow are the
+CPU and domain statistics.
+
 CPU statistics
 --------------
-cpu<N> 1 2 3 4 5 6 7 8 9
-
-First field is a sched_yield() statistic:
-     1) # of times sched_yield() was called
-
-Next three are schedule() statistics:
-     2) # of times we switched to the expired queue and reused it
-     3) # of times schedule() was called
-     4) # of times schedule() left the processor idle
-
-Next two are try_to_wake_up() statistics:
-     5) # of times try_to_wake_up() was called
-     6) # of times try_to_wake_up() was called to wake up the local cpu
-
-Next three are statistics describing scheduling latency:
-     7) sum of all time spent running by tasks on this processor (in jiffies)
-     8) sum of all time spent waiting to run by tasks on this processor (in
-        jiffies)
-     9) # of timeslices run on this cpu
-
+The format is like this:
+
+cpu<N> 1 2 3 4 5 6 7 8
+
+     1) # of times sched_yield() was called on this CPU
+     2) # of times the scheduler ran on this CPU
+     3) # of times the scheduler picked the idle task to run next on this CPU
+     4) # of times try_to_wake_up() was run on this CPU
+        (number of times a task wakeup was attempted from this CPU)
+     5) # of times try_to_wake_up() woke up a task on the same CPU
+        (local wakeup)
+     6) Time (in ns) for which tasks have run on this CPU
+     7) Time (in ns) for which tasks on this CPU's runqueue have waited
+        before getting to run on the CPU
+     8) # of tasks that have run on this CPU
 
 Domain statistics
 -----------------
-One of these is produced per domain for each cpu described. (Note that if
-CONFIG_SMP is not defined, *no* domains are utilized and these lines
-will not appear in the output.)
+One of these is produced per domain for each cpu described. 
+(Note that if CONFIG_SMP is not defined, *no* domains are utilized
+and these lines will not appear in the output.)
 
-domain<N> <cpumask> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
+domain<N> <cpumask> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
 
 The first field is a bit mask indicating what cpus this domain operates over.
 
 The next 24 are a variety of load_balance() statistics in grouped into types
 of idleness (idle, busy, and newly idle):
 
-     1) # of times in this domain load_balance() was called when the
-        cpu was idle
-     2) # of times in this domain load_balance() checked but found
-        the load did not require balancing when the cpu was idle
-     3) # of times in this domain load_balance() tried to move one or
-        more tasks and failed, when the cpu was idle
-     4) sum of imbalances discovered (if any) with each call to
-        load_balance() in this domain when the cpu was idle
-     5) # of times in this domain pull_task() was called when the cpu
-        was idle
-     6) # of times in this domain pull_task() was called even though
-        the target task was cache-hot when idle
-     7) # of times in this domain load_balance() was called but did
-        not find a busier queue while the cpu was idle
-     8) # of times in this domain a busier queue was found while the
-        cpu was idle but no busier group was found
-
-     9) # of times in this domain load_balance() was called when the
-        cpu was busy
-    10) # of times in this domain load_balance() checked but found the
-        load did not require balancing when busy
-    11) # of times in this domain load_balance() tried to move one or
-        more tasks and failed, when the cpu was busy
-    12) sum of imbalances discovered (if any) with each call to
-        load_balance() in this domain when the cpu was busy
-    13) # of times in this domain pull_task() was called when busy
-    14) # of times in this domain pull_task() was called even though the
-        target task was cache-hot when busy
-    15) # of times in this domain load_balance() was called but did not
-        find a busier queue while the cpu was busy
-    16) # of times in this domain a busier queue was found while the cpu
-        was busy but no busier group was found
-
-    17) # of times in this domain load_balance() was called when the
-        cpu was just becoming idle
-    18) # of times in this domain load_balance() checked but found the
-        load did not require balancing when the cpu was just becoming idle
-    19) # of times in this domain load_balance() tried to move one or more
-        tasks and failed, when the cpu was just becoming idle
-    20) sum of imbalances discovered (if any) with each call to
-        load_balance() in this domain when the cpu was just becoming idle
-    21) # of times in this domain pull_task() was called when newly idle
-    22) # of times in this domain pull_task() was called even though the
-        target task was cache-hot when just becoming idle
-    23) # of times in this domain load_balance() was called but did not
-        find a busier queue while the cpu was just becoming idle
-    24) # of times in this domain a busier queue was found while the cpu
-        was just becoming idle but no busier group was found
-
+CPU_IDLE:          This state is entered after CPU_NEWLY_IDLE
+                   state fails to find a new task for this CPU
+CPU_NOT_IDLE:      Load balancer is being run on a CPU when it is
+                   not in IDLE state (busy times)
+CPU_NEWLY_IDLE:    Load balancer is being run on a CPU which is
+                   about to enter IDLE state
+
+There are eight stats available for each of the above three states:
+     - # of times in this domain load_balance() was called
+     - # of times in this domain load_balance() checked but found
+        the load did not require balancing
+     - # of times in this domain load_balance() tried to move one or
+        more tasks and failed
+     - sum of imbalances discovered (if any) with each call to
+        load_balance() in this domain
+     - # of times in this domain pull_task() was called
+     - # of times in this domain pull_task() was called even though
+        the target task was cache-hot
+     - # of times in this domain load_balance() was called but did
+        not find a busier queue
+     - # of times in this domain a busier queue was found but no 
+        busier group was found
+
+   Fields 1-8 are the stats for when the cpu was idle (CPU_IDLE),
+   fields 9-16 are the stats for when the cpu was busy (CPU_NOT_IDLE),
+   and fields 17-24 are the stats for when the cpu was just
+   becoming idle (CPU_NEWLY_IDLE).
+ 
    Next three are active_load_balance() statistics:
     25) # of times active_load_balance() was called
     26) # of times active_load_balance() tried to move a task and failed
     27) # of times active_load_balance() successfully moved a task
 
-   Next three are sched_balance_exec() statistics:
-    28) sbe_cnt is not used
-    29) sbe_balanced is not used
-    30) sbe_pushed is not used
-
-   Next three are sched_balance_fork() statistics:
-    31) sbf_cnt is not used
-    32) sbf_balanced is not used
-    33) sbf_pushed is not used
-
-   Next three are try_to_wake_up() statistics:
-    34) # of times in this domain try_to_wake_up() awoke a task that
-        last ran on a different cpu in this domain
-    35) # of times in this domain try_to_wake_up() moved a task to the
-        waking cpu because it was cache-cold on its own cpu anyway
-    36) # of times in this domain try_to_wake_up() started passive balancing
+   Next two are try_to_wake_up() statistics:
+    28) # of times in this domain try_to_wake_up() awoke a task that
+         last ran on a different cpu in this domain
+    29) # of times in this domain try_to_wake_up() moved a task to the
+         waking cpu because it was cache-cold on its own cpu anyway
 
 /proc/<pid>/schedstat
 ----------------
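
[Similarly, a hedged sketch for the domain<N> lines, not part of the
patch: it walks the 29 per-domain fields in the documented order
(fields 1-8/9-16/17-24 are the eight load_balance() stats for CPU_IDLE,
CPU_NOT_IDLE and CPU_NEWLY_IDLE; 25-27 are active_load_balance();
28-29 are try_to_wake_up()). The printed summary is an illustrative
assumption.]

#include <stdio.h>
#include <stdlib.h>

static const char *state[] = { "CPU_IDLE", "CPU_NOT_IDLE", "CPU_NEWLY_IDLE" };

int main(void)
{
	char line[1024];
	FILE *fp = fopen("/proc/schedstat", "r");

	if (!fp)
		return 1;
	while (fgets(line, sizeof(line), fp)) {
		int dom, used, i;
		char mask[64];
		unsigned long f[29];
		char *p;

		/* Matches the "domain<N> <cpumask> ..." lines only. */
		if (sscanf(line, "domain%d %63s%n", &dom, mask, &used) != 2)
			continue;
		p = line + used;
		for (i = 0; i < 29; i++)
			f[i] = strtoul(p, &p, 10);

		/* Eight load_balance() stats per idleness state. */
		for (i = 0; i < 3; i++)
			printf("domain%d %s: lb_count=%lu lb_balanced=%lu\n",
			       dom, state[i], f[i * 8], f[i * 8 + 1]);
		/* Fields 25-27: active_load_balance(); 28-29: ttwu. */
		printf("domain%d: alb_count=%lu ttwu_wake_remote=%lu\n",
		       dom, f[24], f[27]);
	}
	fclose(fp);
	return 0;
}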


* Re: [RFC][PATCH 0/2 v3.0]sched: updating /proc/schedstat
  2011-02-18 12:43 ` [RFC][PATCH 0/2 v3.0]sched: updating /proc/schedstat Ciju Rajan K
  2011-02-18 12:46   ` [PATCH 1/2 v3.0]sched: Removing unused fields from /proc/schedstat Ciju Rajan K
  2011-02-18 12:47   ` [PATCH 2/2 v3.0]sched: Updating the sched-stat documentation Ciju Rajan K
@ 2011-02-22  8:38   ` Bharata B Rao
  2 siblings, 0 replies; 13+ messages in thread
From: Bharata B Rao @ 2011-02-22  8:38 UTC (permalink / raw)
  To: Ciju Rajan K
  Cc: linux kernel mailing list, Peter Zijlstra, Ingo Molnar,
	Srivatsa Vaddagiri, Satoru Takeuchi

On Fri, Feb 18, 2011 at 06:13:28PM +0530, Ciju Rajan K wrote:
> Hi All,
> 
> Here is the v3.0 of the patch set, which updates 
> /proc/schedstat statistics. Please review the patches
> and consider for inclusion.

I believe this documentation cleanup is good to go in. I hope you
have ensured that userspace tools can work smoothly with the
bumped-up version.

Regards,
Bharata.


Thread overview: 13+ messages
2011-01-25 20:41 [RFC][PATCH 0/2 v2.0]sched: updating /proc/schedstat Ciju Rajan K
2011-01-25 20:45 ` [PATCH 1/2 v2.0]sched: Removing unused fields from /proc/schedstat Ciju Rajan K
2011-01-26  6:10   ` Satoru Takeuchi
2011-01-31  4:10     ` Ciju Rajan K
2011-02-02  8:54       ` Ciju Rajan K
2011-02-03  9:19         ` Satoru Takeuchi
2011-02-07  9:33           ` Ciju Rajan K
2011-01-25 20:46 ` [PATCH 2/2 v2.0]sched: Updating the sched-stat documentation Ciju Rajan K
2011-02-03  9:19   ` Satoru Takeuchi
2011-02-18 12:43 ` [RFC][PATCH 0/2 v3.0]sched: updating /proc/schedstat Ciju Rajan K
2011-02-18 12:46   ` [PATCH 1/2 v3.0]sched: Removing unused fields from /proc/schedstat Ciju Rajan K
2011-02-18 12:47   ` [PATCH 2/2 v3.0]sched: Updating the sched-stat documentation Ciju Rajan K
2011-02-22  8:38   ` [RFC][PATCH 0/2 v3.0]sched: updating /proc/schedstat Bharata B Rao
