linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/6] Proper kernel irq time reporting -v1
@ 2010-10-25 22:30 Venkatesh Pallipadi
  2010-10-25 22:30 ` [PATCH 1/6] Free up pf flag PF_KSOFTIRQD -v1 Venkatesh Pallipadi
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Venkatesh Pallipadi @ 2010-10-25 22:30 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, H. Peter Anvin, Thomas Gleixner,
	Balbir Singh, Martin Schwidefsky
  Cc: linux-kernel, Paul Turner, Eric Dumazet, Shaun Ruffell,
	Yong Zhang, Venkatesh Pallipadi

This is Part 2 of
"Proper kernel irq time accounting -v4"
http://lkml.indiana.edu/hypermail//linux/kernel/1010.0/01175.html

and applies on top of those changes.

Changes since v0:
( v0 - http://lkml.indiana.edu/hypermail//linux/kernel/1010.2/02420.html )
- Use of this_cpu_* variants
- Other review comments on v0 addressed


Part 1 addressed how irq time is accounted in the scheduler and charged to
tasks. This patchset addresses how irq times are reported in /proc/stat,
and also excludes irq time from task->stime, etc.

Example:
Running a cpu-intensive loop and a network-intensive nc on a 4-CPU system
and looking at 'top' output.

With vanilla kernel:
Cpu0  :  0.0% us,  0.3% sy,  0.0% ni, 99.3% id,  0.0% wa,  0.0% hi,  0.3% si
Cpu1  : 100.0% us,  0.0% sy,  0.0% ni,  0.0% id,  0.0% wa,  0.0% hi,  0.0% si
Cpu2  :  1.3% us, 27.2% sy,  0.0% ni,  0.0% id,  0.0% wa,  0.0% hi, 71.4% si
Cpu3  :  1.6% us,  1.3% sy,  0.0% ni, 96.7% id,  0.0% wa,  0.0% hi,  0.3% si

 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 7555 root      20   0  1760  528  436 R  100  0.0   0:15.79 nc
 7563 root      20   0  3632  268  204 R  100  0.0   0:13.13 loop

Notes:
- Both tasks show 100% CPU, even when one of them is stuck on a CPU that's
  spending ~70% of its time in softirq.
- no hardirq time.


With "Part 1" patches:
Cpu0  :  0.0% us,  0.0% sy,  0.0% ni, 100.0% id,  0.0% wa,  0.0% hi,  0.0% si
Cpu1  : 100.0% us,  0.0% sy,  0.0% ni,  0.0% id,  0.0% wa,  0.0% hi,  0.0% si
Cpu2  :  2.0% us, 30.6% sy,  0.0% ni,  0.0% id,  0.0% wa,  0.0% hi, 67.4% si
Cpu3  :  0.7% us,  0.7% sy,  0.3% ni, 98.3% id,  0.0% wa,  0.0% hi,  0.0% si

 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 6289 root      20   0  3632  268  204 R  100  0.0   2:18.67 loop
 5737 root      20   0  1760  528  436 R   33  0.0   0:26.72 nc

Notes:
- Tasks show 100% CPU and 33% CPU, corresponding to their non-irq exec time.
- no hardirq time.


With "Part 1 + Part 2" patches:
Cpu0  :  1.3% us,  1.0% sy,  0.3% ni, 97.0% id,  0.0% wa,  0.0% hi,  0.3% si
Cpu1  : 99.3% us,  0.0% sy,  0.0% ni,  0.0% id,  0.0% wa,  0.7% hi,  0.0% si
Cpu2  :  1.3% us, 31.5% sy,  0.0% ni,  0.0% id,  0.0% wa,  8.3% hi, 58.9% si
Cpu3  :  1.0% us,  2.0% sy,  0.3% ni, 95.0% id,  0.0% wa,  0.7% hi,  1.0% si

 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
20929 root      20   0  3632  268  204 R   99  0.0   3:48.25 loop
20796 root      20   0  1760  528  436 R   33  0.0   2:38.65 nc

Notes:
- Both task exec time and hard irq time are reported correctly.
- hi and si times are based on fine-granularity info, not on samples.
- getrusage() now gives a proper utime/stime split, with irq times excluded
  from that ratio.
- Other places that report user/sys time, such as cgroup cpuacct.stat,
  now include only non-irq exec time.

Signed-off-by: Venkatesh Pallipadi <venki@google.com>


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/6] Free up pf flag PF_KSOFTIRQD -v1
  2010-10-25 22:30 [PATCH 0/6] Proper kernel irq time reporting -v1 Venkatesh Pallipadi
@ 2010-10-25 22:30 ` Venkatesh Pallipadi
  2010-10-25 22:30 ` [PATCH 2/6] cleanup account_system_vtime with this_cpu_* -v1 Venkatesh Pallipadi
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Venkatesh Pallipadi @ 2010-10-25 22:30 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, H. Peter Anvin, Thomas Gleixner,
	Balbir Singh, Martin Schwidefsky
  Cc: linux-kernel, Paul Turner, Eric Dumazet, Shaun Ruffell,
	Yong Zhang, Venkatesh Pallipadi

Cleanup patch: free up the PF_KSOFTIRQD flag and use a per_cpu ksoftirqd
pointer instead, as suggested by Eric Dumazet.

Tested-by: Shaun Ruffell <sruffell@digium.com>

Signed-off-by: Venkatesh Pallipadi <venki@google.com>
---
 include/linux/interrupt.h |    7 +++++++
 include/linux/sched.h     |    1 -
 kernel/sched.c            |    2 +-
 kernel/softirq.c          |    3 +--
 4 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 01b2816..0473d88 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -426,6 +426,13 @@ extern void raise_softirq(unsigned int nr);
  */
 DECLARE_PER_CPU(struct list_head [NR_SOFTIRQS], softirq_work_list);
 
+DECLARE_PER_CPU(struct task_struct *, ksoftirqd);
+
+static inline struct task_struct *this_cpu_ksoftirqd(void)
+{
+	return this_cpu_read(ksoftirqd);
+}
+
 /* Try to send a softirq to a remote cpu.  If this cannot be done, the
  * work will be queued to the local cpu.
  */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 0383601..0b25c60 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1691,7 +1691,6 @@ extern void thread_group_times(struct task_struct *p, cputime_t *ut, cputime_t *
 /*
  * Per process flags
  */
-#define PF_KSOFTIRQD	0x00000001	/* I am ksoftirqd */
 #define PF_STARTING	0x00000002	/* being created */
 #define PF_EXITING	0x00000004	/* getting shut down */
 #define PF_EXITPIDONE	0x00000008	/* pi exit done on shut down */
diff --git a/kernel/sched.c b/kernel/sched.c
index abf8440..fae668b 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -1986,7 +1986,7 @@ void account_system_vtime(struct task_struct *curr)
 	 */
 	if (hardirq_count())
 		per_cpu(cpu_hardirq_time, cpu) += delta;
-	else if (in_serving_softirq() && !(curr->flags & PF_KSOFTIRQD))
+	else if (in_serving_softirq() && curr != this_cpu_ksoftirqd())
 		per_cpu(cpu_softirq_time, cpu) += delta;
 
 	local_irq_restore(flags);
diff --git a/kernel/softirq.c b/kernel/softirq.c
index f02a9df..f7a88af 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -54,7 +54,7 @@ EXPORT_SYMBOL(irq_stat);
 
 static struct softirq_action softirq_vec[NR_SOFTIRQS] __cacheline_aligned_in_smp;
 
-static DEFINE_PER_CPU(struct task_struct *, ksoftirqd);
+DEFINE_PER_CPU(struct task_struct *, ksoftirqd);
 
 char *softirq_to_name[NR_SOFTIRQS] = {
 	"HI", "TIMER", "NET_TX", "NET_RX", "BLOCK", "BLOCK_IOPOLL",
@@ -719,7 +719,6 @@ static int run_ksoftirqd(void * __bind_cpu)
 {
 	set_current_state(TASK_INTERRUPTIBLE);
 
-	current->flags |= PF_KSOFTIRQD;
 	while (!kthread_should_stop()) {
 		preempt_disable();
 		if (!local_softirq_pending()) {
-- 
1.7.1



* [PATCH 2/6] cleanup account_system_vtime with this_cpu_* -v1
  2010-10-25 22:30 [PATCH 0/6] Proper kernel irq time reporting -v1 Venkatesh Pallipadi
  2010-10-25 22:30 ` [PATCH 1/6] Free up pf flag PF_KSOFTIRQD -v1 Venkatesh Pallipadi
@ 2010-10-25 22:30 ` Venkatesh Pallipadi
  2010-10-25 22:30 ` [PATCH 3/6] Add nsecs_to_cputime64 interface for asm-generic -v1 Venkatesh Pallipadi
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Venkatesh Pallipadi @ 2010-10-25 22:30 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, H. Peter Anvin, Thomas Gleixner,
	Balbir Singh, Martin Schwidefsky
  Cc: linux-kernel, Paul Turner, Eric Dumazet, Shaun Ruffell,
	Yong Zhang, Venkatesh Pallipadi

The this_cpu_* variants are more efficient than per_cpu(). Clean up the
IRQ_TIME_ACCOUNTING account_system_vtime() code to use this_cpu_*.

Signed-off-by: Venkatesh Pallipadi <venki@google.com>
---
 kernel/sched.c |   14 ++++++--------
 1 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index fae668b..a37bb83 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -1966,7 +1966,6 @@ static u64 irq_time_cpu(int cpu)
 void account_system_vtime(struct task_struct *curr)
 {
 	unsigned long flags;
-	int cpu;
 	u64 now, delta;
 
 	if (!sched_clock_irqtime)
@@ -1974,20 +1973,19 @@ void account_system_vtime(struct task_struct *curr)
 
 	local_irq_save(flags);
 
-	cpu = smp_processor_id();
-	now = sched_clock_cpu(cpu);
-	delta = now - per_cpu(irq_start_time, cpu);
-	per_cpu(irq_start_time, cpu) = now;
+	now = sched_clock_cpu(smp_processor_id());
+	delta = now - this_cpu_read(irq_start_time);
+	this_cpu_write(irq_start_time, now);
 	/*
 	 * We do not account for softirq time from ksoftirqd here.
 	 * We want to continue accounting softirq time to ksoftirqd thread
 	 * in that case, so as not to confuse scheduler with a special task
 	 * that do not consume any time, but still wants to run.
 	 */
-	if (hardirq_count())
-		per_cpu(cpu_hardirq_time, cpu) += delta;
+	if (in_irq())
+		this_cpu_add(cpu_hardirq_time, delta);
 	else if (in_serving_softirq() && curr != this_cpu_ksoftirqd())
-		per_cpu(cpu_softirq_time, cpu) += delta;
+		this_cpu_add(cpu_softirq_time, delta);
 
 	local_irq_restore(flags);
 }
-- 
1.7.1



* [PATCH 3/6] Add nsecs_to_cputime64 interface for asm-generic -v1
  2010-10-25 22:30 [PATCH 0/6] Proper kernel irq time reporting -v1 Venkatesh Pallipadi
  2010-10-25 22:30 ` [PATCH 1/6] Free up pf flag PF_KSOFTIRQD -v1 Venkatesh Pallipadi
  2010-10-25 22:30 ` [PATCH 2/6] cleanup account_system_vtime with this_cpu_* -v1 Venkatesh Pallipadi
@ 2010-10-25 22:30 ` Venkatesh Pallipadi
  2010-10-25 22:30 ` [PATCH 4/6] Refactor account_system_time separating id-update -v1 Venkatesh Pallipadi
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Venkatesh Pallipadi @ 2010-10-25 22:30 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, H. Peter Anvin, Thomas Gleixner,
	Balbir Singh, Martin Schwidefsky
  Cc: linux-kernel, Paul Turner, Eric Dumazet, Shaun Ruffell,
	Yong Zhang, Venkatesh Pallipadi

Add the nsecs_to_cputime64 interface. This is used by the following patches
to update cpu irq stats based on the ns-granularity info from
IRQ_TIME_ACCOUNTING.

Tested-by: Shaun Ruffell <sruffell@digium.com>

Signed-off-by: Venkatesh Pallipadi <venki@google.com>
---
 include/asm-generic/cputime.h |    3 +++
 include/linux/jiffies.h       |    1 +
 kernel/time.c                 |   23 +++++++++++++++++++++--
 3 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/include/asm-generic/cputime.h b/include/asm-generic/cputime.h
index ca0f239..c0f0da0 100644
--- a/include/asm-generic/cputime.h
+++ b/include/asm-generic/cputime.h
@@ -30,6 +30,9 @@ typedef u64 cputime64_t;
 #define cputime64_to_jiffies64(__ct)	(__ct)
 #define jiffies64_to_cputime64(__jif)	(__jif)
 #define cputime_to_cputime64(__ct)	((u64) __ct)
+#define cputime64_gt(__a, __b)		((__a) >  (__b))
+
+#define nsecs_to_cputime64(__ct)	nsecs_to_jiffies64(__ct)
 
 
 /*
diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h
index 6811f4b..922aa31 100644
--- a/include/linux/jiffies.h
+++ b/include/linux/jiffies.h
@@ -307,6 +307,7 @@ extern clock_t jiffies_to_clock_t(long x);
 extern unsigned long clock_t_to_jiffies(unsigned long x);
 extern u64 jiffies_64_to_clock_t(u64 x);
 extern u64 nsec_to_clock_t(u64 x);
+extern u64 nsecs_to_jiffies64(u64 n);
 extern unsigned long nsecs_to_jiffies(u64 n);
 
 #define TIMESTAMP_SIZE	30
diff --git a/kernel/time.c b/kernel/time.c
index ba9b338..fde4691 100644
--- a/kernel/time.c
+++ b/kernel/time.c
@@ -645,7 +645,7 @@ u64 nsec_to_clock_t(u64 x)
 }
 
 /**
- * nsecs_to_jiffies - Convert nsecs in u64 to jiffies
+ * nsecs_to_jiffies64 - Convert nsecs in u64 to jiffies64
  *
  * @n:	nsecs in u64
  *
@@ -657,7 +657,7 @@ u64 nsec_to_clock_t(u64 x)
  *   NSEC_PER_SEC = 10^9 = (5^9 * 2^9) = (1953125 * 512)
  *   ULLONG_MAX ns = 18446744073.709551615 secs = about 584 years
  */
-unsigned long nsecs_to_jiffies(u64 n)
+u64 nsecs_to_jiffies64(u64 n)
 {
 #if (NSEC_PER_SEC % HZ) == 0
 	/* Common case, HZ = 100, 128, 200, 250, 256, 500, 512, 1000 etc. */
@@ -674,6 +674,25 @@ unsigned long nsecs_to_jiffies(u64 n)
 #endif
 }
 
+
+/**
+ * nsecs_to_jiffies - Convert nsecs in u64 to jiffies
+ *
+ * @n:	nsecs in u64
+ *
+ * Unlike {m,u}secs_to_jiffies, type of input is not unsigned int but u64.
+ * And this doesn't return MAX_JIFFY_OFFSET since this function is designed
+ * for scheduler, not for use in device drivers to calculate timeout value.
+ *
+ * note:
+ *   NSEC_PER_SEC = 10^9 = (5^9 * 2^9) = (1953125 * 512)
+ *   ULLONG_MAX ns = 18446744073.709551615 secs = about 584 years
+ */
+unsigned long nsecs_to_jiffies(u64 n)
+{
+	return (unsigned long)nsecs_to_jiffies64(n);
+}
+
 #if (BITS_PER_LONG < 64)
 u64 get_jiffies_64(void)
 {
-- 
1.7.1



* [PATCH 4/6] Refactor account_system_time separating id-update -v1
  2010-10-25 22:30 [PATCH 0/6] Proper kernel irq time reporting -v1 Venkatesh Pallipadi
                   ` (2 preceding siblings ...)
  2010-10-25 22:30 ` [PATCH 3/6] Add nsecs_to_cputime64 interface for asm-generic -v1 Venkatesh Pallipadi
@ 2010-10-25 22:30 ` Venkatesh Pallipadi
  2010-10-25 22:30 ` [PATCH 5/6] Export ns irqtimes through /proc/stat -v1 Venkatesh Pallipadi
  2010-10-25 22:30 ` [PATCH 6/6] Account ksoftirqd time as cpustat softirq -v1 Venkatesh Pallipadi
  5 siblings, 0 replies; 13+ messages in thread
From: Venkatesh Pallipadi @ 2010-10-25 22:30 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, H. Peter Anvin, Thomas Gleixner,
	Balbir Singh, Martin Schwidefsky
  Cc: linux-kernel, Paul Turner, Eric Dumazet, Shaun Ruffell,
	Yong Zhang, Venkatesh Pallipadi

Refactor account_system_time(), separating the logic that identifies which
update is needed from the code that does the actual updating.

This is used by a following patch for IRQ_TIME_ACCOUNTING, which has
different identification logic but the same update logic.

Tested-by: Shaun Ruffell <sruffell@digium.com>

Signed-off-by: Venkatesh Pallipadi <venki@google.com>
---
 kernel/sched.c |   46 +++++++++++++++++++++++++++++++---------------
 1 files changed, 31 insertions(+), 15 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index a37bb83..f291f3d 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -3499,6 +3499,32 @@ static void account_guest_time(struct task_struct *p, cputime_t cputime,
 }
 
 /*
+ * Account system cpu time to a process and desired cpustat field
+ * @p: the process that the cpu time gets accounted to
+ * @cputime: the cpu time spent in kernel space since the last update
+ * @cputime_scaled: cputime scaled by cpu frequency
+ * @target_cputime64: pointer to cpustat field that needs updating
+ */
+static inline
+void __account_system_time(struct task_struct *p, cputime_t cputime,
+			cputime_t cputime_scaled, cputime64_t *target_cputime64)
+{
+	cputime64_t tmp = cputime_to_cputime64(cputime);
+
+	/* Add system time to process. */
+	p->stime = cputime_add(p->stime, cputime);
+	p->stimescaled = cputime_add(p->stimescaled, cputime_scaled);
+	account_group_system_time(p, cputime);
+
+	/* Add system time to cpustat. */
+	*target_cputime64 = cputime64_add(*target_cputime64, tmp);
+	cpuacct_update_stats(p, CPUACCT_STAT_SYSTEM, cputime);
+
+	/* Account for system time used */
+	acct_update_integrals(p);
+}
+
+/*
  * Account system cpu time to a process.
  * @p: the process that the cpu time gets accounted to
  * @hardirq_offset: the offset to subtract from hardirq_count()
@@ -3509,31 +3535,21 @@ void account_system_time(struct task_struct *p, int hardirq_offset,
 			 cputime_t cputime, cputime_t cputime_scaled)
 {
 	struct cpu_usage_stat *cpustat = &kstat_this_cpu.cpustat;
-	cputime64_t tmp;
+	cputime64_t *target_cputime64;
 
 	if ((p->flags & PF_VCPU) && (irq_count() - hardirq_offset == 0)) {
 		account_guest_time(p, cputime, cputime_scaled);
 		return;
 	}
 
-	/* Add system time to process. */
-	p->stime = cputime_add(p->stime, cputime);
-	p->stimescaled = cputime_add(p->stimescaled, cputime_scaled);
-	account_group_system_time(p, cputime);
-
-	/* Add system time to cpustat. */
-	tmp = cputime_to_cputime64(cputime);
 	if (hardirq_count() - hardirq_offset)
-		cpustat->irq = cputime64_add(cpustat->irq, tmp);
+		target_cputime64 = &cpustat->irq;
 	else if (in_serving_softirq())
-		cpustat->softirq = cputime64_add(cpustat->softirq, tmp);
+		target_cputime64 = &cpustat->softirq;
 	else
-		cpustat->system = cputime64_add(cpustat->system, tmp);
+		target_cputime64 = &cpustat->system;
 
-	cpuacct_update_stats(p, CPUACCT_STAT_SYSTEM, cputime);
-
-	/* Account for system time used */
-	acct_update_integrals(p);
+	__account_system_time(p, cputime, cputime_scaled, target_cputime64);
 }
 
 /*
-- 
1.7.1



* [PATCH 5/6] Export ns irqtimes through /proc/stat -v1
  2010-10-25 22:30 [PATCH 0/6] Proper kernel irq time reporting -v1 Venkatesh Pallipadi
                   ` (3 preceding siblings ...)
  2010-10-25 22:30 ` [PATCH 4/6] Refactor account_system_time separating id-update -v1 Venkatesh Pallipadi
@ 2010-10-25 22:30 ` Venkatesh Pallipadi
  2010-10-26  9:45   ` Peter Zijlstra
  2010-10-25 22:30 ` [PATCH 6/6] Account ksoftirqd time as cpustat softirq -v1 Venkatesh Pallipadi
  5 siblings, 1 reply; 13+ messages in thread
From: Venkatesh Pallipadi @ 2010-10-25 22:30 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, H. Peter Anvin, Thomas Gleixner,
	Balbir Singh, Martin Schwidefsky
  Cc: linux-kernel, Paul Turner, Eric Dumazet, Shaun Ruffell,
	Yong Zhang, Venkatesh Pallipadi

CONFIG_IRQ_TIME_ACCOUNTING adds ns-granularity irq time accounting on each
CPU. This info is already used in the scheduler to charge tasks properly
(earlier patches). This patch retrofits the ns-granularity hardirq and
softirq information into the /proc/stat irq and softirq fields.

The update is still done on the timer tick, where we look at the accumulated
ns hardirq/softirq time and account the tick as user/system/irq/softirq/guest
accordingly.

No new interface is added.

Earlier versions looked at adding these as new fields in some /proc files.
This approach seems best in terms of impact on existing apps, even though it
has somewhat more kernel code than earlier versions.

Tested-by: Shaun Ruffell <sruffell@digium.com>

Signed-off-by: Venkatesh Pallipadi <venki@google.com>
---
 kernel/sched.c |  102 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 102 insertions(+), 0 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index f291f3d..49f6f61 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2000,8 +2000,40 @@ static void sched_irq_time_avg_update(struct rq *rq, u64 curr_irq_time)
 	}
 }
 
+static int irqtime_account_hi_update(void)
+{
+	struct cpu_usage_stat *cpustat = &kstat_this_cpu.cpustat;
+	unsigned long flags;
+	u64 latest_ns;
+	int ret = 0;
+
+	local_irq_save(flags);
+	latest_ns = this_cpu_read(cpu_hardirq_time);
+	if (cputime64_gt(nsecs_to_cputime64(latest_ns), cpustat->irq))
+		ret = 1;
+	local_irq_restore(flags);
+	return ret;
+}
+
+static int irqtime_account_si_update(void)
+{
+	struct cpu_usage_stat *cpustat = &kstat_this_cpu.cpustat;
+	unsigned long flags;
+	u64 latest_ns;
+	int ret = 0;
+
+	local_irq_save(flags);
+	latest_ns = this_cpu_read(cpu_softirq_time);
+	if (cputime64_gt(nsecs_to_cputime64(latest_ns), cpustat->softirq))
+		ret = 1;
+	local_irq_restore(flags);
+	return ret;
+}
+
 #else
 
+#define sched_clock_irqtime	(0)
+
 static u64 irq_time_cpu(int cpu)
 {
 	return 0;
@@ -3552,6 +3584,65 @@ void account_system_time(struct task_struct *p, int hardirq_offset,
 	__account_system_time(p, cputime, cputime_scaled, target_cputime64);
 }
 
+#ifdef CONFIG_IRQ_TIME_ACCOUNTING
+/*
+ * Account a tick to a process and cpustat
+ * @p: the process that the cpu time gets accounted to
+ * @user_tick: is the tick from userspace
+ * @rq: the pointer to rq
+ *
+ * Tick demultiplexing follows the order
+ * - pending hardirq update
+ * - pending softirq update
+ * - user_time
+ * - idle_time
+ * - system time
+ *   - check for guest_time
+ *   - else account as system_time
+ *
+ * Check for hardirq is done both for system and user time as there is
+ * no timer going off while we are on hardirq and hence we may never get an
+ * opportunity to update it solely in system time.
+ * p->stime and friends are only updated on system time and not on irq
+ * softirq as those do not count in task exec_runtime any more.
+ */
+static void irqtime_account_process_tick(struct task_struct *p, int user_tick,
+						struct rq *rq)
+{
+	cputime_t one_jiffy_scaled = cputime_to_scaled(cputime_one_jiffy);
+	cputime64_t tmp = cputime_to_cputime64(cputime_one_jiffy);
+	struct cpu_usage_stat *cpustat = &kstat_this_cpu.cpustat;
+
+	if (irqtime_account_hi_update()) {
+		cpustat->irq = cputime64_add(cpustat->irq, tmp);
+	} else if (irqtime_account_si_update()) {
+		cpustat->softirq = cputime64_add(cpustat->softirq, tmp);
+	} else if (user_tick) {
+		account_user_time(p, cputime_one_jiffy, one_jiffy_scaled);
+	} else if (p == rq->idle) {
+		account_idle_time(cputime_one_jiffy);
+	} else if (p->flags & PF_VCPU) { /* System time or guest time */
+		account_guest_time(p, cputime_one_jiffy, one_jiffy_scaled);
+	} else {
+		__account_system_time(p, cputime_one_jiffy, one_jiffy_scaled,
+					&cpustat->system);
+	}
+}
+
+static void irqtime_account_idle_ticks(int ticks)
+{
+	int i;
+	struct rq *rq = this_rq();
+
+	for (i = 0; i < ticks; i++)
+		irqtime_account_process_tick(current, 0, rq);
+}
+#else
+static void irqtime_account_idle_ticks(int ticks) {}
+static void irqtime_account_process_tick(struct task_struct *p, int user_tick,
+						struct rq *rq) {}
+#endif
+
 /*
  * Account for involuntary wait time.
  * @steal: the cpu time spent in involuntary wait
@@ -3592,6 +3683,11 @@ void account_process_tick(struct task_struct *p, int user_tick)
 	cputime_t one_jiffy_scaled = cputime_to_scaled(cputime_one_jiffy);
 	struct rq *rq = this_rq();
 
+	if (sched_clock_irqtime) {
+		irqtime_account_process_tick(p, user_tick, rq);
+		return;
+	}
+
 	if (user_tick)
 		account_user_time(p, cputime_one_jiffy, one_jiffy_scaled);
 	else if ((p != rq->idle) || (irq_count() != HARDIRQ_OFFSET))
@@ -3617,6 +3713,12 @@ void account_steal_ticks(unsigned long ticks)
  */
 void account_idle_ticks(unsigned long ticks)
 {
+
+	if (sched_clock_irqtime) {
+		irqtime_account_idle_ticks(ticks);
+		return;
+	}
+
 	account_idle_time(jiffies_to_cputime(ticks));
 }
 
-- 
1.7.1



* [PATCH 6/6] Account ksoftirqd time as cpustat softirq -v1
  2010-10-25 22:30 [PATCH 0/6] Proper kernel irq time reporting -v1 Venkatesh Pallipadi
                   ` (4 preceding siblings ...)
  2010-10-25 22:30 ` [PATCH 5/6] Export ns irqtimes through /proc/stat -v1 Venkatesh Pallipadi
@ 2010-10-25 22:30 ` Venkatesh Pallipadi
  2010-10-26  9:33   ` Peter Zijlstra
  5 siblings, 1 reply; 13+ messages in thread
From: Venkatesh Pallipadi @ 2010-10-25 22:30 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, H. Peter Anvin, Thomas Gleixner,
	Balbir Singh, Martin Schwidefsky
  Cc: linux-kernel, Paul Turner, Eric Dumazet, Shaun Ruffell,
	Yong Zhang, Venkatesh Pallipadi

softirq time spent in ksoftirqd context is not accounted in the
ns-granularity per-cpu softirq stats, as we want it to remain part of
ksoftirqd's exec_runtime.

Account it separately as softirq time in /proc/stat.

Tested-by: Shaun Ruffell <sruffell@digium.com>

Signed-off-by: Venkatesh Pallipadi <venki@google.com>
---
 kernel/sched.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 49f6f61..0955050 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -3617,6 +3617,14 @@ static void irqtime_account_process_tick(struct task_struct *p, int user_tick,
 		cpustat->irq = cputime64_add(cpustat->irq, tmp);
 	} else if (irqtime_account_si_update()) {
 		cpustat->softirq = cputime64_add(cpustat->softirq, tmp);
+	} else if (this_cpu_ksoftirqd() == p) {
+		/*
+		 * ksoftirqd time do not get accounted in cpu_softirq_time.
+		 * So, we have to handle it separately here.
+		 * Also, p->stime needs to be updated for ksoftirqd.
+		 */
+		__account_system_time(p, cputime_one_jiffy, one_jiffy_scaled,
+					&cpustat->softirq);
 	} else if (user_tick) {
 		account_user_time(p, cputime_one_jiffy, one_jiffy_scaled);
 	} else if (p == rq->idle) {
-- 
1.7.1



* Re: [PATCH 6/6] Account ksoftirqd time as cpustat softirq -v1
  2010-10-25 22:30 ` [PATCH 6/6] Account ksoftirqd time as cpustat softirq -v1 Venkatesh Pallipadi
@ 2010-10-26  9:33   ` Peter Zijlstra
  2010-10-26  9:50     ` Peter Zijlstra
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2010-10-26  9:33 UTC (permalink / raw)
  To: Venkatesh Pallipadi
  Cc: Ingo Molnar, H. Peter Anvin, Thomas Gleixner, Balbir Singh,
	Martin Schwidefsky, linux-kernel, Paul Turner, Eric Dumazet,
	Shaun Ruffell, Yong Zhang

On Mon, 2010-10-25 at 15:30 -0700, Venkatesh Pallipadi wrote:
> softirq time in ksoftirqd context is not accounted in ns granularity
> per cpu softirq stats, as we want that to be a part of ksoftirqd
> exec_runtime.
> 
> Accounting them as softirq on /proc/stat separately.
> 
> Tested-by: Shaun Ruffell <sruffell@digium.com>
> 
> Signed-off-by: Venkatesh Pallipadi <venki@google.com>
> ---
>  kernel/sched.c |    8 ++++++++
>  1 files changed, 8 insertions(+), 0 deletions(-)
> 
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 49f6f61..0955050 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -3617,6 +3617,14 @@ static void irqtime_account_process_tick(struct task_struct *p, int user_tick,
>  		cpustat->irq = cputime64_add(cpustat->irq, tmp);
>  	} else if (irqtime_account_si_update()) {
>  		cpustat->softirq = cputime64_add(cpustat->softirq, tmp);
> +	} else if (this_cpu_ksoftirqd() == p) {
> +		/*
> +		 * ksoftirqd time do not get accounted in cpu_softirq_time.
> +		 * So, we have to handle it separately here.
> +		 * Also, p->stime needs to be updated for ksoftirqd.
> +		 */
> +		__account_system_time(p, cputime_one_jiffy, one_jiffy_scaled,
> +					&cpustat->softirq);
>  	} else if (user_tick) {
>  		account_user_time(p, cputime_one_jiffy, one_jiffy_scaled);
>  	} else if (p == rq->idle) {


I'm somewhat confused by this patch.. This is significantly different
from the thing proposed last time around, which was to use:

  cpustat->softirq + this_cpu_ksoftirqd()->se.sum_exec_runtime

The above loses the fine-grained aspect of the accounting and simply
charges a whole jiffy if the current process happens to be ksoftirqd.




* Re: [PATCH 5/6] Export ns irqtimes through /proc/stat -v1
  2010-10-25 22:30 ` [PATCH 5/6] Export ns irqtimes through /proc/stat -v1 Venkatesh Pallipadi
@ 2010-10-26  9:45   ` Peter Zijlstra
  2010-10-26 14:57     ` Peter Zijlstra
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2010-10-26  9:45 UTC (permalink / raw)
  To: Venkatesh Pallipadi
  Cc: Ingo Molnar, H. Peter Anvin, Thomas Gleixner, Balbir Singh,
	Martin Schwidefsky, linux-kernel, Paul Turner, Eric Dumazet,
	Shaun Ruffell, Yong Zhang

On Mon, 2010-10-25 at 15:30 -0700, Venkatesh Pallipadi wrote:
> +static void irqtime_account_process_tick(struct task_struct *p, int user_tick,
> +                                               struct rq *rq)
> +{
> +       cputime_t one_jiffy_scaled = cputime_to_scaled(cputime_one_jiffy);
> +       cputime64_t tmp = cputime_to_cputime64(cputime_one_jiffy);
> +       struct cpu_usage_stat *cpustat = &kstat_this_cpu.cpustat;
> +
> +       if (irqtime_account_hi_update()) {
> +               cpustat->irq = cputime64_add(cpustat->irq, tmp);
> +       } else if (irqtime_account_si_update()) {
> +               cpustat->softirq = cputime64_add(cpustat->softirq, tmp);
> +       } else 

I'm still not sure about this else stmt, the above two conditions can
basically 'eat' user/system ticks. What we need to show is that there is
no bias towards either kind so the ratio is not affected -- can we make
such an argument?

>                if (user_tick) {
> +               account_user_time(p, cputime_one_jiffy, one_jiffy_scaled);
> +       } else if (p == rq->idle) {
> +               account_idle_time(cputime_one_jiffy);
> +       } else if (p->flags & PF_VCPU) { /* System time or guest time */
> +               account_guest_time(p, cputime_one_jiffy, one_jiffy_scaled);
> +       } else {
> +               __account_system_time(p, cputime_one_jiffy, one_jiffy_scaled,
> +                                       &cpustat->system);
> +       }
> +} 


* Re: [PATCH 6/6] Account ksoftirqd time as cpustat softirq -v1
  2010-10-26  9:33   ` Peter Zijlstra
@ 2010-10-26  9:50     ` Peter Zijlstra
  2010-10-26 17:35       ` Venkatesh Pallipadi
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2010-10-26  9:50 UTC (permalink / raw)
  To: Venkatesh Pallipadi
  Cc: Ingo Molnar, H. Peter Anvin, Thomas Gleixner, Balbir Singh,
	Martin Schwidefsky, linux-kernel, Paul Turner, Eric Dumazet,
	Shaun Ruffell, Yong Zhang

On Tue, 2010-10-26 at 11:33 +0200, Peter Zijlstra wrote:
> On Mon, 2010-10-25 at 15:30 -0700, Venkatesh Pallipadi wrote:
> > softirq time in ksoftirqd context is not accounted in ns granularity
> > per cpu softirq stats, as we want that to be a part of ksoftirqd
> > exec_runtime.
> > 
> > Accounting them as softirq on /proc/stat separately.
> > 
> > Tested-by: Shaun Ruffell <sruffell@digium.com>
> > 
> > Signed-off-by: Venkatesh Pallipadi <venki@google.com>
> > ---
> >  kernel/sched.c |    8 ++++++++
> >  1 files changed, 8 insertions(+), 0 deletions(-)
> > 
> > diff --git a/kernel/sched.c b/kernel/sched.c
> > index 49f6f61..0955050 100644
> > --- a/kernel/sched.c
> > +++ b/kernel/sched.c
> > @@ -3617,6 +3617,14 @@ static void irqtime_account_process_tick(struct task_struct *p, int user_tick,
> >  		cpustat->irq = cputime64_add(cpustat->irq, tmp);
> >  	} else if (irqtime_account_si_update()) {
> >  		cpustat->softirq = cputime64_add(cpustat->softirq, tmp);
> > +	} else if (this_cpu_ksoftirqd() == p) {
> > +		/*
> > +		 * ksoftirqd time do not get accounted in cpu_softirq_time.
> > +		 * So, we have to handle it separately here.
> > +		 * Also, p->stime needs to be updated for ksoftirqd.
> > +		 */
> > +		__account_system_time(p, cputime_one_jiffy, one_jiffy_scaled,
> > +					&cpustat->softirq);
> >  	} else if (user_tick) {
> >  		account_user_time(p, cputime_one_jiffy, one_jiffy_scaled);
> >  	} else if (p == rq->idle) {
> 
> 
> I'm somewhat confused by this patch.. This is significantly different
> from the thing proposed last time around, which was to use:
> 
>   cpustat->softirq + this_cpu_ksoftirqd()->se.sum_exec_runtime
> 
> The above looses the fine grained aspect of the accounting and simply
> charges a whole jiffy if the current process happens to be ksoftirqd.

Btw, both these solutions can cause si + us + ni + sy > 100%, are we ok
with that?


* Re: [PATCH 5/6] Export ns irqtimes through /proc/stat -v1
  2010-10-26  9:45   ` Peter Zijlstra
@ 2010-10-26 14:57     ` Peter Zijlstra
  2010-10-26 17:35       ` Venkatesh Pallipadi
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2010-10-26 14:57 UTC (permalink / raw)
  To: Venkatesh Pallipadi
  Cc: Ingo Molnar, H. Peter Anvin, Thomas Gleixner, Balbir Singh,
	Martin Schwidefsky, linux-kernel, Paul Turner, Eric Dumazet,
	Shaun Ruffell, Yong Zhang

On Tue, 2010-10-26 at 11:45 +0200, Peter Zijlstra wrote:
> On Mon, 2010-10-25 at 15:30 -0700, Venkatesh Pallipadi wrote:
> > +static void irqtime_account_process_tick(struct task_struct *p, int user_tick,
> > +                                               struct rq *rq)
> > +{
> > +       cputime_t one_jiffy_scaled = cputime_to_scaled(cputime_one_jiffy);
> > +       cputime64_t tmp = cputime_to_cputime64(cputime_one_jiffy);
> > +       struct cpu_usage_stat *cpustat = &kstat_this_cpu.cpustat;
> > +
> > +       if (irqtime_account_hi_update()) {
> > +               cpustat->irq = cputime64_add(cpustat->irq, tmp);
> > +       } else if (irqtime_account_si_update()) {
> > +               cpustat->softirq = cputime64_add(cpustat->softirq, tmp);
> > +       } else 
> 
> I'm still not sure about this else stmt, the above two conditions can
> basically 'eat' user/system ticks. What we need to show is that there is
> no bias towards either kind so the ratio is not affected -- can we make
> such an argument?

I think I can make a counter-argument: if either or both of these checks
are true we had system time in the last tick, hence there is a larger
chance this tick is a system tick.

Therefore it will not provide the same user/system ratio.

Hmm?

> >                if (user_tick) {
> > +               account_user_time(p, cputime_one_jiffy, one_jiffy_scaled);
> > +       } else if (p == rq->idle) {
> > +               account_idle_time(cputime_one_jiffy);
> > +       } else if (p->flags & PF_VCPU) { /* System time or guest time */
> > +               account_guest_time(p, cputime_one_jiffy, one_jiffy_scaled);
> > +       } else {
> > +               __account_system_time(p, cputime_one_jiffy, one_jiffy_scaled,
> > +                                       &cpustat->system);
> > +       }
> > +} 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 5/6] Export ns irqtimes through /proc/stat -v1
  2010-10-26 14:57     ` Peter Zijlstra
@ 2010-10-26 17:35       ` Venkatesh Pallipadi
  0 siblings, 0 replies; 13+ messages in thread
From: Venkatesh Pallipadi @ 2010-10-26 17:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, H. Peter Anvin, Thomas Gleixner, Balbir Singh,
	Martin Schwidefsky, linux-kernel, Paul Turner, Eric Dumazet,
	Shaun Ruffell, Yong Zhang

On Tue, Oct 26, 2010 at 7:57 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, 2010-10-26 at 11:45 +0200, Peter Zijlstra wrote:
>> On Mon, 2010-10-25 at 15:30 -0700, Venkatesh Pallipadi wrote:
>> > +static void irqtime_account_process_tick(struct task_struct *p, int user_tick,
>> > +                                               struct rq *rq)
>> > +{
>> > +       cputime_t one_jiffy_scaled = cputime_to_scaled(cputime_one_jiffy);
>> > +       cputime64_t tmp = cputime_to_cputime64(cputime_one_jiffy);
>> > +       struct cpu_usage_stat *cpustat = &kstat_this_cpu.cpustat;
>> > +
>> > +       if (irqtime_account_hi_update()) {
>> > +               cpustat->irq = cputime64_add(cpustat->irq, tmp);
>> > +       } else if (irqtime_account_si_update()) {
>> > +               cpustat->softirq = cputime64_add(cpustat->softirq, tmp);
>> > +       } else
>>
>> I'm still not sure about this else stmt, the above two conditions can
>> basically 'eat' user/system ticks. What we need to show is that there is
>> no bias towards either kind so the ratio is not affected -- can we make
>> such an argument?
>
> I think I can make a counter-argument: if either or both of these checks
> are true we had system time in the last tick, hence there is a larger
> chance this tick is a system tick.
>
> Therefore it will not provide the same user/system ratio.
>
> Hmm?
>

This is about task user/system time, right?
With the earlier changes, hardirq/softirq time is no longer part of a
task's sum_exec_runtime. So, if we had significant hardirq or softirq
time during this tick, the task's sum_exec_runtime wouldn't have
advanced much, and accounting the tick as task system time would skew
the task's user/system ratio needlessly. Eating the system tick should
therefore be the right thing to do here :-)

The other case is significant hardirq/softirq time during the last
tick, but not enough to trigger the irq folding, so that tick gets
accounted as system/user; then a tiny amount of hardirq/softirq in the
next tick tips it into folding, so that tick is not accounted as
system/user. This can affect the user/system ratio if the last tick
was user and the current tick is system, or vice versa. But I feel
this is no different from the variation we already have with
tick-instance-based sampling.

The only other option I can think of is to forget about this folding
business, micro-account the hardirq/softirq fraction on every tick,
and account the remaining time (one_tick - (hardirq + softirq)) as
user/idle/guest/system. The problem with that is that task times and
kstat are kept in cputime64, which on x86 is jiffies, so we don't have
enough resolution for this approach.

Thanks,
Venki

>> >                if (user_tick) {
>> > +               account_user_time(p, cputime_one_jiffy, one_jiffy_scaled);
>> > +       } else if (p == rq->idle) {
>> > +               account_idle_time(cputime_one_jiffy);
>> > +       } else if (p->flags & PF_VCPU) { /* System time or guest time */
>> > +               account_guest_time(p, cputime_one_jiffy, one_jiffy_scaled);
>> > +       } else {
>> > +               __account_system_time(p, cputime_one_jiffy, one_jiffy_scaled,
>> > +                                       &cpustat->system);
>> > +       }
>> > +}
>
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 6/6] Account ksoftirqd time as cpustat softirq -v1
  2010-10-26  9:50     ` Peter Zijlstra
@ 2010-10-26 17:35       ` Venkatesh Pallipadi
  0 siblings, 0 replies; 13+ messages in thread
From: Venkatesh Pallipadi @ 2010-10-26 17:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, H. Peter Anvin, Thomas Gleixner, Balbir Singh,
	Martin Schwidefsky, linux-kernel, Paul Turner, Eric Dumazet,
	Shaun Ruffell, Yong Zhang

On Tue, Oct 26, 2010 at 2:50 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, 2010-10-26 at 11:33 +0200, Peter Zijlstra wrote:
>> On Mon, 2010-10-25 at 15:30 -0700, Venkatesh Pallipadi wrote:
>> > softirq time in ksoftirqd context is not accounted in ns granularity
>> > per cpu softirq stats, as we want that to be a part of ksoftirqd
>> > exec_runtime.
>> >
>> > Accounting them as softirq on /proc/stat separately.
>> >
>> > Tested-by: Shaun Ruffell <sruffell@digium.com>
>> >
>> > Signed-off-by: Venkatesh Pallipadi <venki@google.com>
>> > ---
>> >  kernel/sched.c |    8 ++++++++
>> >  1 files changed, 8 insertions(+), 0 deletions(-)
>> >
>> > diff --git a/kernel/sched.c b/kernel/sched.c
>> > index 49f6f61..0955050 100644
>> > --- a/kernel/sched.c
>> > +++ b/kernel/sched.c
>> > @@ -3617,6 +3617,14 @@ static void irqtime_account_process_tick(struct task_struct *p, int user_tick,
>> >             cpustat->irq = cputime64_add(cpustat->irq, tmp);
>> >     } else if (irqtime_account_si_update()) {
>> >             cpustat->softirq = cputime64_add(cpustat->softirq, tmp);
>> > +   } else if (this_cpu_ksoftirqd() == p) {
>> > +           /*
>> > +            * ksoftirqd time does not get accounted in cpu_softirq_time.
>> > +            * So, we have to handle it separately here.
>> > +            * Also, p->stime needs to be updated for ksoftirqd.
>> > +            */
>> > +           __account_system_time(p, cputime_one_jiffy, one_jiffy_scaled,
>> > +                                   &cpustat->softirq);
>> >     } else if (user_tick) {
>> >             account_user_time(p, cputime_one_jiffy, one_jiffy_scaled);
>> >     } else if (p == rq->idle) {
>>
>>
>> I'm somewhat confused by this patch.. This is significantly different
>> from the thing proposed last time around, which was to use:
>>
>>   cpustat->softirq + this_cpu_ksoftirqd()->se.sum_exec_runtime
>>

In the previous version I was using sum_exec_runtime, and later I also
checked whether the current task is ksoftirqd before charging. In
effect that was a stricter change, which would only account the time
when there is folding and current happens to be ksoftirqd. The second
check was required because we are also accounting ksoftirqd->stime,
and I didn't want to do that when current is not ksoftirqd.

>> The above loses the fine-grained aspect of the accounting and simply
>> charges a whole jiffy if the current process happens to be ksoftirqd.
>

Yes. This leaves ksoftirqd with coarse granularity as before.

> Btw, both these solutions can cause si + us + ni + sy > 100%, are we ok
> with that?

The current if/else if/.../else logic gets called every tick and
charges that tick to what it thinks is the most appropriate bucket, so
the total should not be greater than 100%. ksoftirqd gets accounted
only as a softirq tick and not as system, which is how it is
accounted currently.

Thanks,
Venki

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2010-10-26 17:35 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-10-25 22:30 [PATCH 0/6] Proper kernel irq time reporting -v1 Venkatesh Pallipadi
2010-10-25 22:30 ` [PATCH 1/6] Free up pf flag PF_KSOFTIRQD -v1 Venkatesh Pallipadi
2010-10-25 22:30 ` [PATCH 2/6] cleanup account_system_vtime with this_cpu_* -v1 Venkatesh Pallipadi
2010-10-25 22:30 ` [PATCH 3/6] Add nsecs_to_cputime64 interface for asm-generic -v1 Venkatesh Pallipadi
2010-10-25 22:30 ` [PATCH 4/6] Refactor account_system_time separating id-update -v1 Venkatesh Pallipadi
2010-10-25 22:30 ` [PATCH 5/6] Export ns irqtimes through /proc/stat -v1 Venkatesh Pallipadi
2010-10-26  9:45   ` Peter Zijlstra
2010-10-26 14:57     ` Peter Zijlstra
2010-10-26 17:35       ` Venkatesh Pallipadi
2010-10-25 22:30 ` [PATCH 6/6] Account ksoftirqd time as cpustat softirq -v1 Venkatesh Pallipadi
2010-10-26  9:33   ` Peter Zijlstra
2010-10-26  9:50     ` Peter Zijlstra
2010-10-26 17:35       ` Venkatesh Pallipadi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).