All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/10] vtime: Delay cputime accounting to tick / context switch
@ 2017-01-05 17:11 Frederic Weisbecker
  2017-01-05 17:11 ` [PATCH 01/10] powerpc32: Fix stale scaled stime on " Frederic Weisbecker
                   ` (11 more replies)
  0 siblings, 12 replies; 24+ messages in thread
From: Frederic Weisbecker @ 2017-01-05 17:11 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Heiko Carstens, Martin Schwidefsky, Tony Luck,
	Fenghua Yu, Peter Zijlstra, Rik van Riel, Thomas Gleixner,
	Ingo Molnar, Stanislaw Gruszka, Wanpeng Li,
	Christian Borntraeger

This version is a rebase on top of latest Linus tree which includes
the fix 8f2b468aadc ("s390/vtime: correct system time accounting").

Also a small change: I have moved account_system_index_scaled() to s390
in patch "s390/cputime: delayed accounting of system time" because it is
the only user of the function.

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
	vtime/acc-v2

Thanks,
	Frederic
---

Frederic Weisbecker (9):
      powerpc32: Fix stale scaled stime on context switch
      ia64: Fix wrong start cputime assignment on task switch
      cputime: Allow accounting system time using cpustat index
      cputime: Export account_guest_time
      powerpc: Prepare accounting structure for cputime flush on tick
      powerpc: Migrate stolen_time field to accounting structure
      powerpc/vtime: Accumulate cputime and account only on tick/task switch
      ia64: Accumulate cputime and account only on tick/task switch
      vtime: Rename vtime_account_user() to vtime_flush()

Martin Schwidefsky (1):
      s390/cputime: delayed accounting of system time


 arch/ia64/include/asm/thread_info.h   |   6 ++
 arch/ia64/kernel/time.c               |  66 +++++++++++-----
 arch/powerpc/include/asm/accounting.h |  14 +++-
 arch/powerpc/include/asm/paca.h       |   1 -
 arch/powerpc/kernel/asm-offsets.c     |   8 +-
 arch/powerpc/kernel/time.c            | 138 +++++++++++++++++++++-------------
 arch/powerpc/xmon/xmon.c              |   8 +-
 arch/s390/include/asm/lowcore.h       |  65 ++++++++--------
 arch/s390/include/asm/processor.h     |   3 +
 arch/s390/kernel/vtime.c              | 138 ++++++++++++++++++++++------------
 include/linux/kernel_stat.h           |   5 +-
 include/linux/vtime.h                 |   7 +-
 kernel/sched/cputime.c                |  16 ++--
 13 files changed, 302 insertions(+), 173 deletions(-)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 01/10] powerpc32: Fix stale scaled stime on context switch
  2017-01-05 17:11 [PATCH 00/10] vtime: Delay cputime accounting to tick / context switch Frederic Weisbecker
@ 2017-01-05 17:11 ` Frederic Weisbecker
  2017-01-14 10:00   ` [tip:sched/core] sched/cputime, " tip-bot for Frederic Weisbecker
  2017-01-05 17:11 ` [PATCH 02/10] ia64: Fix wrong start cputime assignment on task switch Frederic Weisbecker
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Frederic Weisbecker @ 2017-01-05 17:11 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Heiko Carstens, Martin Schwidefsky, Tony Luck,
	Fenghua Yu, Peter Zijlstra, Rik van Riel, Thomas Gleixner,
	Ingo Molnar, Stanislaw Gruszka, Wanpeng Li,
	Christian Borntraeger

On context switch with powerpc32, the cputime is accumulated in the
thread_info struct. So the switching-in task must move forward its
start time snapshot to the current time in order to later compute the
delta spent in system mode.

This is what we do for the normal cputime by initializing the starttime
field to the value of the previous task's starttime which got freshly
updated.

But we are missing the update of the scaled cputime start time. As a
result we may be accounting too much scaled cputime later.

Fix this by initializing the scaled cputime the same way we do for
normal cputime.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 arch/powerpc/kernel/time.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index bc2e08d..ce21650 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -407,6 +407,7 @@ void arch_vtime_task_switch(struct task_struct *prev)
 	struct cpu_accounting_data *acct = get_accounting(current);
 
 	acct->starttime = get_accounting(prev)->starttime;
+	acct->startspurr = get_accounting(prev)->startspurr;
 	acct->system_time = 0;
 	acct->user_time = 0;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 02/10] ia64: Fix wrong start cputime assignment on task switch
  2017-01-05 17:11 [PATCH 00/10] vtime: Delay cputime accounting to tick / context switch Frederic Weisbecker
  2017-01-05 17:11 ` [PATCH 01/10] powerpc32: Fix stale scaled stime on " Frederic Weisbecker
@ 2017-01-05 17:11 ` Frederic Weisbecker
  2017-01-14 10:00   ` [tip:sched/core] sched/cputime, ia64: Fix incorrect " tip-bot for Frederic Weisbecker
  2017-01-05 17:11 ` [PATCH 03/10] cputime: Allow accounting system time using cpustat index Frederic Weisbecker
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Frederic Weisbecker @ 2017-01-05 17:11 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Heiko Carstens, Martin Schwidefsky, Tony Luck,
	Fenghua Yu, Peter Zijlstra, Rik van Riel, Thomas Gleixner,
	Ingo Molnar, Stanislaw Gruszka, Wanpeng Li,
	Christian Borntraeger

On task switch we must initialize the current cputime of the next task
using the value of the previous task which got freshly updated.

But we are confusing that with doing the opposite, which should result
in wrong cputime accounting.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 arch/ia64/kernel/time.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/ia64/kernel/time.c b/arch/ia64/kernel/time.c
index 71775b95..637e741 100644
--- a/arch/ia64/kernel/time.c
+++ b/arch/ia64/kernel/time.c
@@ -83,7 +83,7 @@ void arch_vtime_task_switch(struct task_struct *prev)
 	struct thread_info *pi = task_thread_info(prev);
 	struct thread_info *ni = task_thread_info(current);
 
-	pi->ac_stamp = ni->ac_stamp;
+	ni->ac_stamp = pi->ac_stamp;
 	ni->ac_stime = ni->ac_utime = 0;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 03/10] cputime: Allow accounting system time using cpustat index
  2017-01-05 17:11 [PATCH 00/10] vtime: Delay cputime accounting to tick / context switch Frederic Weisbecker
  2017-01-05 17:11 ` [PATCH 01/10] powerpc32: Fix stale scaled stime on " Frederic Weisbecker
  2017-01-05 17:11 ` [PATCH 02/10] ia64: Fix wrong start cputime assignment on task switch Frederic Weisbecker
@ 2017-01-05 17:11 ` Frederic Weisbecker
  2017-01-14 10:01   ` [tip:sched/core] sched/cputime: " tip-bot for Frederic Weisbecker
  2017-01-05 17:11 ` [PATCH 04/10] cputime: Export account_guest_time Frederic Weisbecker
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Frederic Weisbecker @ 2017-01-05 17:11 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Heiko Carstens, Martin Schwidefsky, Tony Luck,
	Fenghua Yu, Peter Zijlstra, Rik van Riel, Thomas Gleixner,
	Ingo Molnar, Stanislaw Gruszka, Wanpeng Li,
	Christian Borntraeger

In order to prepare for CONFIG_VIRT_CPU_ACCOUNTING_NATIVE to delay
cputime accounting to the tick, let's provide APIs to account system
time to precise contexts: hardirq, softirq, pure system, ...

Inspired-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 include/linux/kernel_stat.h |  2 ++
 kernel/sched/cputime.c      | 10 +++++-----
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/include/linux/kernel_stat.h b/include/linux/kernel_stat.h
index 00f7768..14b63bb 100644
--- a/include/linux/kernel_stat.h
+++ b/include/linux/kernel_stat.h
@@ -80,6 +80,8 @@ static inline unsigned int kstat_cpu_irqs_sum(unsigned int cpu)
 
 extern void account_user_time(struct task_struct *, cputime_t);
 extern void account_system_time(struct task_struct *, int, cputime_t);
+extern void account_system_index_time(struct task_struct *, cputime_t,
+				      enum cpu_usage_stat);
 extern void account_steal_time(cputime_t);
 extern void account_idle_time(cputime_t);
 
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 7700a9c..aad4283 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -176,8 +176,8 @@ static void account_guest_time(struct task_struct *p, cputime_t cputime)
  * @cputime: the cpu time spent in kernel space since the last update
  * @index: pointer to cpustat field that has to be updated
  */
-static inline
-void __account_system_time(struct task_struct *p, cputime_t cputime, int index)
+void account_system_index_time(struct task_struct *p,
+			       cputime_t cputime, enum cpu_usage_stat index)
 {
 	/* Add system time to process. */
 	p->stime += cputime;
@@ -213,7 +213,7 @@ void account_system_time(struct task_struct *p, int hardirq_offset,
 	else
 		index = CPUTIME_SYSTEM;
 
-	__account_system_time(p, cputime, index);
+	account_system_index_time(p, cputime, index);
 }
 
 /*
@@ -400,7 +400,7 @@ static void irqtime_account_process_tick(struct task_struct *p, int user_tick,
 		 * So, we have to handle it separately here.
 		 * Also, p->stime needs to be updated for ksoftirqd.
 		 */
-		__account_system_time(p, cputime, CPUTIME_SOFTIRQ);
+		account_system_index_time(p, cputime, CPUTIME_SOFTIRQ);
 	} else if (user_tick) {
 		account_user_time(p, cputime);
 	} else if (p == rq->idle) {
@@ -408,7 +408,7 @@ static void irqtime_account_process_tick(struct task_struct *p, int user_tick,
 	} else if (p->flags & PF_VCPU) { /* System time or guest time */
 		account_guest_time(p, cputime);
 	} else {
-		__account_system_time(p, cputime, CPUTIME_SYSTEM);
+		account_system_index_time(p, cputime, CPUTIME_SYSTEM);
 	}
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 04/10] cputime: Export account_guest_time
  2017-01-05 17:11 [PATCH 00/10] vtime: Delay cputime accounting to tick / context switch Frederic Weisbecker
                   ` (2 preceding siblings ...)
  2017-01-05 17:11 ` [PATCH 03/10] cputime: Allow accounting system time using cpustat index Frederic Weisbecker
@ 2017-01-05 17:11 ` Frederic Weisbecker
  2017-01-14 10:01   ` [tip:sched/core] sched/cputime: Export account_guest_time() tip-bot for Frederic Weisbecker
  2017-01-05 17:11 ` [PATCH 05/10] powerpc: Prepare accounting structure for cputime flush on tick Frederic Weisbecker
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Frederic Weisbecker @ 2017-01-05 17:11 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Heiko Carstens, Martin Schwidefsky, Tony Luck,
	Fenghua Yu, Peter Zijlstra, Rik van Riel, Thomas Gleixner,
	Ingo Molnar, Stanislaw Gruszka, Wanpeng Li,
	Christian Borntraeger

In order to prepare for CONFIG_VIRT_CPU_ACCOUNTING_NATIVE to delay
cputime accounting to the tick, let's allow archs to account cputime
directly to gtime.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 include/linux/kernel_stat.h | 1 +
 kernel/sched/cputime.c      | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/kernel_stat.h b/include/linux/kernel_stat.h
index 14b63bb..cfd6c0c 100644
--- a/include/linux/kernel_stat.h
+++ b/include/linux/kernel_stat.h
@@ -79,6 +79,7 @@ static inline unsigned int kstat_cpu_irqs_sum(unsigned int cpu)
 }
 
 extern void account_user_time(struct task_struct *, cputime_t);
+extern void account_guest_time(struct task_struct *, cputime_t);
 extern void account_system_time(struct task_struct *, int, cputime_t);
 extern void account_system_index_time(struct task_struct *, cputime_t,
 				      enum cpu_usage_stat);
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index aad4283..5813ee4 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -151,7 +151,7 @@ void account_user_time(struct task_struct *p, cputime_t cputime)
  * @p: the process that the cpu time gets accounted to
  * @cputime: the cpu time spent in virtual machine since the last update
  */
-static void account_guest_time(struct task_struct *p, cputime_t cputime)
+void account_guest_time(struct task_struct *p, cputime_t cputime)
 {
 	u64 *cpustat = kcpustat_this_cpu->cpustat;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 05/10] powerpc: Prepare accounting structure for cputime flush on tick
  2017-01-05 17:11 [PATCH 00/10] vtime: Delay cputime accounting to tick / context switch Frederic Weisbecker
                   ` (3 preceding siblings ...)
  2017-01-05 17:11 ` [PATCH 04/10] cputime: Export account_guest_time Frederic Weisbecker
@ 2017-01-05 17:11 ` Frederic Weisbecker
  2017-01-14 10:02   ` [tip:sched/core] sched/cputime, " tip-bot for Frederic Weisbecker
  2017-01-05 17:11 ` [PATCH 06/10] powerpc: Migrate stolen_time field to accounting structure Frederic Weisbecker
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Frederic Weisbecker @ 2017-01-05 17:11 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Heiko Carstens, Martin Schwidefsky, Tony Luck,
	Fenghua Yu, Peter Zijlstra, Rik van Riel, Thomas Gleixner,
	Ingo Molnar, Stanislaw Gruszka, Wanpeng Li,
	Christian Borntraeger

In order to prepare for CONFIG_VIRT_CPU_ACCOUNTING_NATIVE to delay
cputime accounting to the tick, provide finegrained accumulators to
powerpc in order to store the cputime until flushing.

While at it, normalize the name of several fields according to common
cputime naming.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 arch/powerpc/include/asm/accounting.h | 14 +++++++++++---
 arch/powerpc/kernel/asm-offsets.c     |  8 ++++----
 arch/powerpc/kernel/time.c            | 31 ++++++++++++++++---------------
 arch/powerpc/xmon/xmon.c              |  6 +++---
 4 files changed, 34 insertions(+), 25 deletions(-)

diff --git a/arch/powerpc/include/asm/accounting.h b/arch/powerpc/include/asm/accounting.h
index c133246..3abcf98 100644
--- a/arch/powerpc/include/asm/accounting.h
+++ b/arch/powerpc/include/asm/accounting.h
@@ -12,9 +12,17 @@
 
 /* Stuff for accurate time accounting */
 struct cpu_accounting_data {
-	unsigned long user_time;	/* accumulated usermode TB ticks */
-	unsigned long system_time;	/* accumulated system TB ticks */
-	unsigned long user_time_scaled;	/* accumulated usermode SPURR ticks */
+	/* Accumulated cputime values to flush on ticks*/
+	unsigned long utime;
+	unsigned long stime;
+	unsigned long utime_scaled;
+	unsigned long stime_scaled;
+	unsigned long gtime;
+	unsigned long hardirq_time;
+	unsigned long softirq_time;
+	unsigned long steal_time;
+	unsigned long idle_time;
+	/* Internal counters */
 	unsigned long starttime;	/* TB value snapshot */
 	unsigned long starttime_user;	/* TB value on exit to usermode */
 	unsigned long startspurr;	/* SPURR value snapshot */
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 0601e6a..e505319 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -252,9 +252,9 @@ int main(void)
 	DEFINE(ACCOUNT_STARTTIME_USER,
 	       offsetof(struct paca_struct, accounting.starttime_user));
 	DEFINE(ACCOUNT_USER_TIME,
-	       offsetof(struct paca_struct, accounting.user_time));
+	       offsetof(struct paca_struct, accounting.utime));
 	DEFINE(ACCOUNT_SYSTEM_TIME,
-	       offsetof(struct paca_struct, accounting.system_time));
+	       offsetof(struct paca_struct, accounting.stime));
 	DEFINE(PACA_TRAP_SAVE, offsetof(struct paca_struct, trap_save));
 	DEFINE(PACA_NAPSTATELOST, offsetof(struct paca_struct, nap_state_lost));
 	DEFINE(PACA_SPRG_VDSO, offsetof(struct paca_struct, sprg_vdso));
@@ -265,9 +265,9 @@ int main(void)
 	DEFINE(ACCOUNT_STARTTIME_USER,
 	       offsetof(struct thread_info, accounting.starttime_user));
 	DEFINE(ACCOUNT_USER_TIME,
-	       offsetof(struct thread_info, accounting.user_time));
+	       offsetof(struct thread_info, accounting.utime));
 	DEFINE(ACCOUNT_SYSTEM_TIME,
-	       offsetof(struct thread_info, accounting.system_time));
+	       offsetof(struct thread_info, accounting.stime));
 #endif
 #endif /* CONFIG_PPC64 */
 
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index ce21650..17a2cd1 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -271,8 +271,8 @@ void accumulate_stolen_time(void)
 
 	sst = scan_dispatch_log(acct->starttime_user);
 	ust = scan_dispatch_log(acct->starttime);
-	acct->system_time -= sst;
-	acct->user_time -= ust;
+	acct->stime -= sst;
+	acct->utime -= ust;
 	local_paca->stolen_time += ust + sst;
 
 	local_paca->soft_enabled = save_soft_enabled;
@@ -281,10 +281,11 @@ void accumulate_stolen_time(void)
 static inline u64 calculate_stolen_time(u64 stop_tb)
 {
 	u64 stolen = 0;
+	struct cpu_accounting_data *acct = &local_paca->accounting;
 
 	if (get_paca()->dtl_ridx != be64_to_cpu(get_lppaca()->dtl_idx)) {
 		stolen = scan_dispatch_log(stop_tb);
-		get_paca()->accounting.system_time -= stolen;
+		acct->stime -= stolen;
 	}
 
 	stolen += get_paca()->stolen_time;
@@ -316,17 +317,17 @@ static unsigned long vtime_delta(struct task_struct *tsk,
 
 	now = mftb();
 	nowscaled = read_spurr(now);
-	acct->system_time += now - acct->starttime;
+	acct->stime += now - acct->starttime;
 	acct->starttime = now;
 	deltascaled = nowscaled - acct->startspurr;
 	acct->startspurr = nowscaled;
 
 	*stolen = calculate_stolen_time(now);
 
-	delta = acct->system_time;
-	acct->system_time = 0;
-	udelta = acct->user_time - acct->utime_sspurr;
-	acct->utime_sspurr = acct->user_time;
+	delta = acct->stime;
+	acct->stime = 0;
+	udelta = acct->utime - acct->utime_sspurr;
+	acct->utime_sspurr = acct->utime;
 
 	/*
 	 * Because we don't read the SPURR on every kernel entry/exit,
@@ -348,7 +349,7 @@ static unsigned long vtime_delta(struct task_struct *tsk,
 			*sys_scaled = deltascaled;
 		}
 	}
-	acct->user_time_scaled += user_scaled;
+	acct->utime_scaled += user_scaled;
 
 	return delta;
 }
@@ -387,10 +388,10 @@ void vtime_account_user(struct task_struct *tsk)
 	cputime_t utime, utimescaled;
 	struct cpu_accounting_data *acct = get_accounting(tsk);
 
-	utime = acct->user_time;
-	utimescaled = acct->user_time_scaled;
-	acct->user_time = 0;
-	acct->user_time_scaled = 0;
+	utime = acct->utime;
+	utimescaled = acct->utime_scaled;
+	acct->utime = 0;
+	acct->utime_scaled = 0;
 	acct->utime_sspurr = 0;
 	account_user_time(tsk, utime);
 	tsk->utimescaled += utimescaled;
@@ -408,8 +409,8 @@ void arch_vtime_task_switch(struct task_struct *prev)
 
 	acct->starttime = get_accounting(prev)->starttime;
 	acct->startspurr = get_accounting(prev)->startspurr;
-	acct->system_time = 0;
-	acct->user_time = 0;
+	acct->stime = 0;
+	acct->utime = 0;
 }
 #endif /* CONFIG_PPC32 */
 
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 9c0e17c..9f3b170 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -2287,9 +2287,9 @@ static void dump_one_paca(int cpu)
 	DUMP(p, subcore_sibling_mask, "x");
 #endif
 
-	DUMP(p, accounting.user_time, "llx");
-	DUMP(p, accounting.system_time, "llx");
-	DUMP(p, accounting.user_time_scaled, "llx");
+	DUMP(p, accounting.utime, "llx");
+	DUMP(p, accounting.stime, "llx");
+	DUMP(p, accounting.utime_scaled, "llx");
 	DUMP(p, accounting.starttime, "llx");
 	DUMP(p, accounting.starttime_user, "llx");
 	DUMP(p, accounting.startspurr, "llx");
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 06/10] powerpc: Migrate stolen_time field to accounting structure
  2017-01-05 17:11 [PATCH 00/10] vtime: Delay cputime accounting to tick / context switch Frederic Weisbecker
                   ` (4 preceding siblings ...)
  2017-01-05 17:11 ` [PATCH 05/10] powerpc: Prepare accounting structure for cputime flush on tick Frederic Weisbecker
@ 2017-01-05 17:11 ` Frederic Weisbecker
  2017-01-14 10:03   ` [tip:sched/core] sched/cputime, powerpc: Migrate stolen_time field to the " tip-bot for Frederic Weisbecker
  2017-01-05 17:11 ` [PATCH 07/10] powerpc/vtime: Accumulate cputime and account only on tick/task switch Frederic Weisbecker
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Frederic Weisbecker @ 2017-01-05 17:11 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Heiko Carstens, Martin Schwidefsky, Tony Luck,
	Fenghua Yu, Peter Zijlstra, Rik van Riel, Thomas Gleixner,
	Ingo Molnar, Stanislaw Gruszka, Wanpeng Li,
	Christian Borntraeger

That in order to gather all cputime accumulation to the same place.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 arch/powerpc/include/asm/paca.h | 1 -
 arch/powerpc/kernel/time.c      | 6 +++---
 arch/powerpc/xmon/xmon.c        | 2 +-
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 6a6792b..708c3e5 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -187,7 +187,6 @@ struct paca_struct {
 
 	/* Stuff for accurate time accounting */
 	struct cpu_accounting_data accounting;
-	u64 stolen_time;		/* TB ticks taken by hypervisor */
 	u64 dtl_ridx;			/* read index in dispatch log */
 	struct dtl_entry *dtl_curr;	/* pointer corresponding to dtl_ridx */
 
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 17a2cd1..714313e 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -273,7 +273,7 @@ void accumulate_stolen_time(void)
 	ust = scan_dispatch_log(acct->starttime);
 	acct->stime -= sst;
 	acct->utime -= ust;
-	local_paca->stolen_time += ust + sst;
+	acct->steal_time += ust + sst;
 
 	local_paca->soft_enabled = save_soft_enabled;
 }
@@ -288,8 +288,8 @@ static inline u64 calculate_stolen_time(u64 stop_tb)
 		acct->stime -= stolen;
 	}
 
-	stolen += get_paca()->stolen_time;
-	get_paca()->stolen_time = 0;
+	stolen += acct->steal_time;
+	acct->steal_time = 0;
 	return stolen;
 }
 
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 9f3b170..3f864c3 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -2294,7 +2294,7 @@ static void dump_one_paca(int cpu)
 	DUMP(p, accounting.starttime_user, "llx");
 	DUMP(p, accounting.startspurr, "llx");
 	DUMP(p, accounting.utime_sspurr, "llx");
-	DUMP(p, stolen_time, "llx");
+	DUMP(p, accounting.steal_time, "llx");
 #undef DUMP
 
 	catch_memory_errors = 0;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 07/10] powerpc/vtime: Accumulate cputime and account only on tick/task switch
  2017-01-05 17:11 [PATCH 00/10] vtime: Delay cputime accounting to tick / context switch Frederic Weisbecker
                   ` (5 preceding siblings ...)
  2017-01-05 17:11 ` [PATCH 06/10] powerpc: Migrate stolen_time field to accounting structure Frederic Weisbecker
@ 2017-01-05 17:11 ` Frederic Weisbecker
  2017-01-14 10:03   ` [tip:sched/core] sched/cputime, " tip-bot for Frederic Weisbecker
  2017-01-05 17:11 ` [PATCH 08/10] ia64: " Frederic Weisbecker
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Frederic Weisbecker @ 2017-01-05 17:11 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Heiko Carstens, Martin Schwidefsky, Tony Luck,
	Fenghua Yu, Peter Zijlstra, Rik van Riel, Thomas Gleixner,
	Ingo Molnar, Stanislaw Gruszka, Wanpeng Li,
	Christian Borntraeger

Currently CONFIG_VIRT_CPU_ACCOUNTING_NATIVE accounts the cputime on
any context boundary: irq entry/exit, guest entry/exit, context switch,
etc...

Calling functions such as account_system_time(), account_user_time()
and such can be costly, especially if they are called on many fastpath
such as twice per IRQ. Those functions do more than just accounting to
kcpustat and task cputime. Depending on the config, some subsystems can
perform unpleasant multiplications and divisions, among other things.

So lets accumulate the cputime instead and delay the accounting on ticks
and context switches only.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 arch/powerpc/kernel/time.c | 120 +++++++++++++++++++++++++++++----------------
 1 file changed, 77 insertions(+), 43 deletions(-)

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 714313e..4255e69 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -280,17 +280,10 @@ void accumulate_stolen_time(void)
 
 static inline u64 calculate_stolen_time(u64 stop_tb)
 {
-	u64 stolen = 0;
-	struct cpu_accounting_data *acct = &local_paca->accounting;
+	if (get_paca()->dtl_ridx != be64_to_cpu(get_lppaca()->dtl_idx))
+		return scan_dispatch_log(stop_tb);
 
-	if (get_paca()->dtl_ridx != be64_to_cpu(get_lppaca()->dtl_idx)) {
-		stolen = scan_dispatch_log(stop_tb);
-		acct->stime -= stolen;
-	}
-
-	stolen += acct->steal_time;
-	acct->steal_time = 0;
-	return stolen;
+	return 0;
 }
 
 #else /* CONFIG_PPC_SPLPAR */
@@ -306,27 +299,26 @@ static inline u64 calculate_stolen_time(u64 stop_tb)
  * or soft irq state.
  */
 static unsigned long vtime_delta(struct task_struct *tsk,
-				 unsigned long *sys_scaled,
-				 unsigned long *stolen)
+				 unsigned long *stime_scaled,
+				 unsigned long *steal_time)
 {
 	unsigned long now, nowscaled, deltascaled;
-	unsigned long udelta, delta, user_scaled;
+	unsigned long stime;
+	unsigned long utime, utime_scaled;
 	struct cpu_accounting_data *acct = get_accounting(tsk);
 
 	WARN_ON_ONCE(!irqs_disabled());
 
 	now = mftb();
 	nowscaled = read_spurr(now);
-	acct->stime += now - acct->starttime;
+	stime = now - acct->starttime;
 	acct->starttime = now;
 	deltascaled = nowscaled - acct->startspurr;
 	acct->startspurr = nowscaled;
 
-	*stolen = calculate_stolen_time(now);
+	*steal_time = calculate_stolen_time(now);
 
-	delta = acct->stime;
-	acct->stime = 0;
-	udelta = acct->utime - acct->utime_sspurr;
+	utime = acct->utime - acct->utime_sspurr;
 	acct->utime_sspurr = acct->utime;
 
 	/*
@@ -339,39 +331,54 @@ static unsigned long vtime_delta(struct task_struct *tsk,
 	 * the user ticks get saved up in paca->user_time_scaled to be
 	 * used by account_process_tick.
 	 */
-	*sys_scaled = delta;
-	user_scaled = udelta;
-	if (deltascaled != delta + udelta) {
-		if (udelta) {
-			*sys_scaled = deltascaled * delta / (delta + udelta);
-			user_scaled = deltascaled - *sys_scaled;
+	*stime_scaled = stime;
+	utime_scaled = utime;
+	if (deltascaled != stime + utime) {
+		if (utime) {
+			*stime_scaled = deltascaled * stime / (stime + utime);
+			utime_scaled = deltascaled - *stime_scaled;
 		} else {
-			*sys_scaled = deltascaled;
+			*stime_scaled = deltascaled;
 		}
 	}
-	acct->utime_scaled += user_scaled;
+	acct->utime_scaled += utime_scaled;
 
-	return delta;
+	return stime;
 }
 
 void vtime_account_system(struct task_struct *tsk)
 {
-	unsigned long delta, sys_scaled, stolen;
+	unsigned long stime, stime_scaled, steal_time;
+	struct cpu_accounting_data *acct = get_accounting(tsk);
 
-	delta = vtime_delta(tsk, &sys_scaled, &stolen);
-	account_system_time(tsk, 0, delta);
-	tsk->stimescaled += sys_scaled;
-	if (stolen)
-		account_steal_time(stolen);
+	stime = vtime_delta(tsk, &stime_scaled, &steal_time);
+
+	stime -= min(stime, steal_time);
+	acct->steal_time += steal_time;
+
+	if ((tsk->flags & PF_VCPU) && !irq_count()) {
+		acct->gtime += stime;
+		acct->utime_scaled += stime_scaled;
+	} else {
+		if (hardirq_count())
+			acct->hardirq_time += stime;
+		else if (in_serving_softirq())
+			acct->softirq_time += stime;
+		else
+			acct->stime += stime;
+
+		acct->stime_scaled += stime_scaled;
+	}
 }
 EXPORT_SYMBOL_GPL(vtime_account_system);
 
 void vtime_account_idle(struct task_struct *tsk)
 {
-	unsigned long delta, sys_scaled, stolen;
+	unsigned long stime, stime_scaled, steal_time;
+	struct cpu_accounting_data *acct = get_accounting(tsk);
 
-	delta = vtime_delta(tsk, &sys_scaled, &stolen);
-	account_idle_time(delta + stolen);
+	stime = vtime_delta(tsk, &stime_scaled, &steal_time);
+	acct->idle_time += stime + steal_time;
 }
 
 /*
@@ -385,16 +392,45 @@ void vtime_account_idle(struct task_struct *tsk)
  */
 void vtime_account_user(struct task_struct *tsk)
 {
-	cputime_t utime, utimescaled;
 	struct cpu_accounting_data *acct = get_accounting(tsk);
 
-	utime = acct->utime;
-	utimescaled = acct->utime_scaled;
+	if (acct->utime)
+		account_user_time(tsk, acct->utime);
+
+	if (acct->utime_scaled)
+		tsk->utimescaled += acct->utime_scaled;
+
+	if (acct->gtime)
+		account_guest_time(tsk, acct->gtime);
+
+	if (acct->steal_time)
+		account_steal_time(acct->steal_time);
+
+	if (acct->idle_time)
+		account_idle_time(acct->idle_time);
+
+	if (acct->stime)
+		account_system_index_time(tsk, acct->stime, CPUTIME_SYSTEM);
+
+	if (acct->stime_scaled)
+		tsk->stimescaled += acct->stime_scaled;
+
+	if (acct->hardirq_time)
+		account_system_index_time(tsk, acct->hardirq_time, CPUTIME_IRQ);
+
+	if (acct->softirq_time)
+		account_system_index_time(tsk, acct->softirq_time, CPUTIME_SOFTIRQ);
+
 	acct->utime = 0;
 	acct->utime_scaled = 0;
 	acct->utime_sspurr = 0;
-	account_user_time(tsk, utime);
-	tsk->utimescaled += utimescaled;
+	acct->gtime = 0;
+	acct->steal_time = 0;
+	acct->idle_time = 0;
+	acct->stime = 0;
+	acct->stime_scaled = 0;
+	acct->hardirq_time = 0;
+	acct->softirq_time = 0;
 }
 
 #ifdef CONFIG_PPC32
@@ -409,8 +445,6 @@ void arch_vtime_task_switch(struct task_struct *prev)
 
 	acct->starttime = get_accounting(prev)->starttime;
 	acct->startspurr = get_accounting(prev)->startspurr;
-	acct->stime = 0;
-	acct->utime = 0;
 }
 #endif /* CONFIG_PPC32 */
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 08/10] ia64: Accumulate cputime and account only on tick/task switch
  2017-01-05 17:11 [PATCH 00/10] vtime: Delay cputime accounting to tick / context switch Frederic Weisbecker
                   ` (6 preceding siblings ...)
  2017-01-05 17:11 ` [PATCH 07/10] powerpc/vtime: Accumulate cputime and account only on tick/task switch Frederic Weisbecker
@ 2017-01-05 17:11 ` Frederic Weisbecker
  2017-01-14 10:04   ` [tip:sched/core] sched/cputime, " tip-bot for Frederic Weisbecker
  2017-01-05 17:11 ` [PATCH 09/10] s390/cputime: delayed accounting of system time Frederic Weisbecker
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Frederic Weisbecker @ 2017-01-05 17:11 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Heiko Carstens, Martin Schwidefsky, Tony Luck,
	Fenghua Yu, Peter Zijlstra, Rik van Riel, Thomas Gleixner,
	Ingo Molnar, Stanislaw Gruszka, Wanpeng Li,
	Christian Borntraeger

Currently CONFIG_VIRT_CPU_ACCOUNTING_NATIVE accounts the cputime on
any context boundary: irq entry/exit, guest entry/exit, context switch,
etc...

Calling functions such as account_system_time(), account_user_time()
and such can be costly, especially if they are called on many fastpath
such as twice per IRQ. Those functions do more than just accounting to
kcpustat and task cputime. Depending on the config, some subsystems can
perform unpleasant multiplications and divisions, among other things.

So lets accumulate the cputime instead and delay the accounting on ticks
and context switches only.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 arch/ia64/include/asm/thread_info.h |  6 ++++
 arch/ia64/kernel/time.c             | 60 ++++++++++++++++++++++++++++---------
 2 files changed, 52 insertions(+), 14 deletions(-)

diff --git a/arch/ia64/include/asm/thread_info.h b/arch/ia64/include/asm/thread_info.h
index c702642..8742d74 100644
--- a/arch/ia64/include/asm/thread_info.h
+++ b/arch/ia64/include/asm/thread_info.h
@@ -27,6 +27,12 @@ struct thread_info {
 	mm_segment_t addr_limit;	/* user-level address space limit */
 	int preempt_count;		/* 0=premptable, <0=BUG; will also serve as bh-counter */
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
+	__u64 utime;
+	__u64 stime;
+	__u64 gtime;
+	__u64 hardirq_time;
+	__u64 softirq_time;
+	__u64 idle_time;
 	__u64 ac_stamp;
 	__u64 ac_leave;
 	__u64 ac_stime;
diff --git a/arch/ia64/kernel/time.c b/arch/ia64/kernel/time.c
index 637e741..37f1b31 100644
--- a/arch/ia64/kernel/time.c
+++ b/arch/ia64/kernel/time.c
@@ -63,14 +63,39 @@ extern cputime_t cycle_to_cputime(u64 cyc);
 
 void vtime_account_user(struct task_struct *tsk)
 {
-	cputime_t delta_utime;
 	struct thread_info *ti = task_thread_info(tsk);
+	cputime_t delta;
 
-	if (ti->ac_utime) {
-		delta_utime = cycle_to_cputime(ti->ac_utime);
-		account_user_time(tsk, delta_utime);
-		ti->ac_utime = 0;
+	if (ti->utime)
+		account_user_time(tsk, cycle_to_cputime(ti->utime));
+
+	if (ti->gtime)
+		account_guest_time(tsk, cycle_to_cputime(ti->gtime));
+
+	if (ti->idle_time)
+		account_idle_time(cycle_to_cputime(ti->idle_time));
+
+	if (ti->stime) {
+		delta = cycle_to_cputime(ti->stime);
+		account_system_index_time(tsk, delta, CPUTIME_SYSTEM);
+	}
+
+	if (ti->hardirq_time) {
+		delta = cycle_to_cputime(ti->hardirq_time);
+		account_system_index_time(tsk, delta, CPUTIME_IRQ);
+	}
+
+	if (ti->softirq_time) {
+		delta = cycle_to_cputime(ti->softirq_time);
+		account_system_index_time(tsk, delta, CPUTIME_SOFTIRQ);
 	}
+
+	ti->utime = 0;
+	ti->gtime = 0;
+	ti->idle_time = 0;
+	ti->stime = 0;
+	ti->hardirq_time = 0;
+	ti->softirq_time = 0;
 }
 
 /*
@@ -91,18 +116,15 @@ void arch_vtime_task_switch(struct task_struct *prev)
  * Account time for a transition between system, hard irq or soft irq state.
  * Note that this function is called with interrupts enabled.
  */
-static cputime_t vtime_delta(struct task_struct *tsk)
+static __u64 vtime_delta(struct task_struct *tsk)
 {
 	struct thread_info *ti = task_thread_info(tsk);
-	cputime_t delta_stime;
-	__u64 now;
+	__u64 now, delta_stime;
 
 	WARN_ON_ONCE(!irqs_disabled());
 
 	now = ia64_get_itc();
-
-	delta_stime = cycle_to_cputime(ti->ac_stime + (now - ti->ac_stamp));
-	ti->ac_stime = 0;
+	delta_stime = now - ti->ac_stamp;
 	ti->ac_stamp = now;
 
 	return delta_stime;
@@ -110,15 +132,25 @@ static cputime_t vtime_delta(struct task_struct *tsk)
 
 void vtime_account_system(struct task_struct *tsk)
 {
-	cputime_t delta = vtime_delta(tsk);
+	struct thread_info *ti = task_thread_info(tsk);
+	__u64 stime = vtime_delta(tsk);
 
-	account_system_time(tsk, 0, delta);
+	if ((tsk->flags & PF_VCPU) && !irq_count())
+		ti->gtime += stime;
+	else if (hardirq_count())
+		ti->hardirq_time += stime;
+	else if (in_serving_softirq())
+		ti->softirq_time += stime;
+	else
+		ti->stime += stime;
 }
 EXPORT_SYMBOL_GPL(vtime_account_system);
 
 void vtime_account_idle(struct task_struct *tsk)
 {
-	account_idle_time(vtime_delta(tsk));
+	struct thread_info *ti = task_thread_info(tsk);
+
+	ti->idle_time += vtime_delta(tsk);
 }
 
 #endif /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 09/10] s390/cputime: delayed accounting of system time
  2017-01-05 17:11 [PATCH 00/10] vtime: Delay cputime accounting to tick / context switch Frederic Weisbecker
                   ` (7 preceding siblings ...)
  2017-01-05 17:11 ` [PATCH 08/10] ia64: " Frederic Weisbecker
@ 2017-01-05 17:11 ` Frederic Weisbecker
  2017-01-14 10:04   ` [tip:sched/core] sched/cputime, s390: Implement " tip-bot for Martin Schwidefsky
  2017-01-05 17:11 ` [PATCH 10/10] vtime: Rename vtime_account_user() to vtime_flush() Frederic Weisbecker
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 24+ messages in thread
From: Frederic Weisbecker @ 2017-01-05 17:11 UTC (permalink / raw)
  To: LKML
  Cc: Martin Schwidefsky, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Heiko Carstens, Tony Luck, Fenghua Yu,
	Peter Zijlstra, Rik van Riel, Thomas Gleixner, Ingo Molnar,
	Stanislaw Gruszka, Wanpeng Li, Christian Borntraeger,
	Frederic Weisbecker

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

The account_system_time() function is called with a cputime that
occurred while running in the kernel. The function detects which
context the CPU is currently running in and accounts the time to
the correct bucket. This forces the arch code to account the
cputime for hardirq and softirq immediately.

Such accounting function can be costly and perform unwelcome divisions
and multiplications, among others.

The arch code can delay the accounting for system time. For s390
the accounting is done once per timer tick and for each task switch.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
[rebase against latest linus tree and move account_system_index_scaled()]
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 arch/s390/include/asm/lowcore.h   |  65 +++++++++---------
 arch/s390/include/asm/processor.h |   3 +
 arch/s390/kernel/vtime.c          | 134 +++++++++++++++++++++++++-------------
 3 files changed, 124 insertions(+), 78 deletions(-)

diff --git a/arch/s390/include/asm/lowcore.h b/arch/s390/include/asm/lowcore.h
index 9bfad2a..61261e0e 100644
--- a/arch/s390/include/asm/lowcore.h
+++ b/arch/s390/include/asm/lowcore.h
@@ -85,53 +85,56 @@ struct lowcore {
 	__u64	mcck_enter_timer;		/* 0x02c0 */
 	__u64	exit_timer;			/* 0x02c8 */
 	__u64	user_timer;			/* 0x02d0 */
-	__u64	system_timer;			/* 0x02d8 */
-	__u64	steal_timer;			/* 0x02e0 */
-	__u64	last_update_timer;		/* 0x02e8 */
-	__u64	last_update_clock;		/* 0x02f0 */
-	__u64	int_clock;			/* 0x02f8 */
-	__u64	mcck_clock;			/* 0x0300 */
-	__u64	clock_comparator;		/* 0x0308 */
+	__u64	guest_timer;			/* 0x02d8 */
+	__u64	system_timer;			/* 0x02e0 */
+	__u64	hardirq_timer;			/* 0x02e8 */
+	__u64	softirq_timer;			/* 0x02f0 */
+	__u64	steal_timer;			/* 0x02f8 */
+	__u64	last_update_timer;		/* 0x0300 */
+	__u64	last_update_clock;		/* 0x0308 */
+	__u64	int_clock;			/* 0x0310 */
+	__u64	mcck_clock;			/* 0x0318 */
+	__u64	clock_comparator;		/* 0x0320 */
 
 	/* Current process. */
-	__u64	current_task;			/* 0x0310 */
-	__u8	pad_0x318[0x320-0x318];		/* 0x0318 */
-	__u64	kernel_stack;			/* 0x0320 */
+	__u64	current_task;			/* 0x0328 */
+	__u8	pad_0x318[0x320-0x318];		/* 0x0330 */
+	__u64	kernel_stack;			/* 0x0338 */
 
 	/* Interrupt, panic and restart stack. */
-	__u64	async_stack;			/* 0x0328 */
-	__u64	panic_stack;			/* 0x0330 */
-	__u64	restart_stack;			/* 0x0338 */
+	__u64	async_stack;			/* 0x0340 */
+	__u64	panic_stack;			/* 0x0348 */
+	__u64	restart_stack;			/* 0x0350 */
 
 	/* Restart function and parameter. */
-	__u64	restart_fn;			/* 0x0340 */
-	__u64	restart_data;			/* 0x0348 */
-	__u64	restart_source;			/* 0x0350 */
+	__u64	restart_fn;			/* 0x0358 */
+	__u64	restart_data;			/* 0x0360 */
+	__u64	restart_source;			/* 0x0368 */
 
 	/* Address space pointer. */
-	__u64	kernel_asce;			/* 0x0358 */
-	__u64	user_asce;			/* 0x0360 */
+	__u64	kernel_asce;			/* 0x0370 */
+	__u64	user_asce;			/* 0x0378 */
 
 	/*
 	 * The lpp and current_pid fields form a
 	 * 64-bit value that is set as program
 	 * parameter with the LPP instruction.
 	 */
-	__u32	lpp;				/* 0x0368 */
-	__u32	current_pid;			/* 0x036c */
+	__u32	lpp;				/* 0x0380 */
+	__u32	current_pid;			/* 0x0384 */
 
 	/* SMP info area */
-	__u32	cpu_nr;				/* 0x0370 */
-	__u32	softirq_pending;		/* 0x0374 */
-	__u64	percpu_offset;			/* 0x0378 */
-	__u64	vdso_per_cpu_data;		/* 0x0380 */
-	__u64	machine_flags;			/* 0x0388 */
-	__u32	preempt_count;			/* 0x0390 */
-	__u8	pad_0x0394[0x0398-0x0394];	/* 0x0394 */
-	__u64	gmap;				/* 0x0398 */
-	__u32	spinlock_lockval;		/* 0x03a0 */
-	__u32	fpu_flags;			/* 0x03a4 */
-	__u8	pad_0x03a8[0x0400-0x03a8];	/* 0x03a8 */
+	__u32	cpu_nr;				/* 0x0388 */
+	__u32	softirq_pending;		/* 0x038c */
+	__u64	percpu_offset;			/* 0x0390 */
+	__u64	vdso_per_cpu_data;		/* 0x0398 */
+	__u64	machine_flags;			/* 0x03a0 */
+	__u32	preempt_count;			/* 0x03a8 */
+	__u8	pad_0x03ac[0x03b0-0x03ac];	/* 0x03ac */
+	__u64	gmap;				/* 0x03b0 */
+	__u32	spinlock_lockval;		/* 0x03b8 */
+	__u32	fpu_flags;			/* 0x03bc */
+	__u8	pad_0x03c0[0x0400-0x03c0];	/* 0x03c0 */
 
 	/* Per cpu primary space access list */
 	__u32	paste[16];			/* 0x0400 */
diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h
index 6bca916..977a5b6 100644
--- a/arch/s390/include/asm/processor.h
+++ b/arch/s390/include/asm/processor.h
@@ -111,7 +111,10 @@ struct thread_struct {
 	unsigned int  acrs[NUM_ACRS];
         unsigned long ksp;              /* kernel stack pointer             */
 	unsigned long user_timer;	/* task cputime in user space */
+	unsigned long guest_timer;	/* task cputime in kvm guest */
 	unsigned long system_timer;	/* task cputime in kernel space */
+	unsigned long hardirq_timer;	/* task cputime in hardirq context */
+	unsigned long softirq_timer;	/* task cputime in softirq context */
 	unsigned long sys_call_table;	/* system call table address */
 	mm_segment_t mm_segment;
 	unsigned long gmap_addr;	/* address of last gmap fault. */
diff --git a/arch/s390/kernel/vtime.c b/arch/s390/kernel/vtime.c
index 1b5c5ee..1a53e0b 100644
--- a/arch/s390/kernel/vtime.c
+++ b/arch/s390/kernel/vtime.c
@@ -90,14 +90,41 @@ static void update_mt_scaling(void)
 	__this_cpu_write(mt_scaling_jiffies, jiffies_64);
 }
 
+static inline u64 update_tsk_timer(unsigned long *tsk_vtime, u64 new)
+{
+	u64 delta;
+
+	delta = new - *tsk_vtime;
+	*tsk_vtime = new;
+	return delta;
+}
+
+
+static inline u64 scale_vtime(u64 vtime)
+{
+	u64 mult = __this_cpu_read(mt_scaling_mult);
+	u64 div = __this_cpu_read(mt_scaling_div);
+
+	if (smp_cpu_mtid)
+		return vtime * mult / div;
+	return vtime;
+}
+
+static void account_system_index_scaled(struct task_struct *p,
+					cputime_t cputime, cputime_t scaled,
+					enum cpu_usage_stat index)
+{
+	p->stimescaled += scaled;
+	account_system_index_time(p, cputime, index);
+}
+
 /*
  * Update process times based on virtual cpu times stored by entry.S
  * to the lowcore fields user_timer, system_timer & steal_clock.
  */
 static int do_account_vtime(struct task_struct *tsk)
 {
-	u64 timer, clock, user, system, steal;
-	u64 user_scaled, system_scaled;
+	u64 timer, clock, user, guest, system, hardirq, softirq, steal;
 
 	timer = S390_lowcore.last_update_timer;
 	clock = S390_lowcore.last_update_clock;
@@ -110,36 +137,53 @@ static int do_account_vtime(struct task_struct *tsk)
 #endif
 		: "=m" (S390_lowcore.last_update_timer),
 		  "=m" (S390_lowcore.last_update_clock));
-	S390_lowcore.system_timer += timer - S390_lowcore.last_update_timer;
-	S390_lowcore.steal_timer += S390_lowcore.last_update_clock - clock;
+	clock = S390_lowcore.last_update_clock - clock;
+	timer -= S390_lowcore.last_update_timer;
+
+	if (hardirq_count())
+		S390_lowcore.hardirq_timer += timer;
+	else
+		S390_lowcore.system_timer += timer;
 
 	/* Update MT utilization calculation */
 	if (smp_cpu_mtid &&
 	    time_after64(jiffies_64, this_cpu_read(mt_scaling_jiffies)))
 		update_mt_scaling();
 
-	user = S390_lowcore.user_timer - tsk->thread.user_timer;
-	S390_lowcore.steal_timer -= user;
-	tsk->thread.user_timer = S390_lowcore.user_timer;
+	/* Calculate cputime delta */
+	user = update_tsk_timer(&tsk->thread.user_timer,
+				READ_ONCE(S390_lowcore.user_timer));
+	guest = update_tsk_timer(&tsk->thread.guest_timer,
+				 READ_ONCE(S390_lowcore.guest_timer));
+	system = update_tsk_timer(&tsk->thread.system_timer,
+				  READ_ONCE(S390_lowcore.system_timer));
+	hardirq = update_tsk_timer(&tsk->thread.hardirq_timer,
+				   READ_ONCE(S390_lowcore.hardirq_timer));
+	softirq = update_tsk_timer(&tsk->thread.softirq_timer,
+				   READ_ONCE(S390_lowcore.softirq_timer));
+	S390_lowcore.steal_timer +=
+		clock - user - guest - system - hardirq - softirq;
 
-	system = S390_lowcore.system_timer - tsk->thread.system_timer;
-	S390_lowcore.steal_timer -= system;
-	tsk->thread.system_timer = S390_lowcore.system_timer;
-
-	user_scaled = user;
-	system_scaled = system;
-	/* Do MT utilization scaling */
-	if (smp_cpu_mtid) {
-		u64 mult = __this_cpu_read(mt_scaling_mult);
-		u64 div = __this_cpu_read(mt_scaling_div);
+	/* Push account value */
+	if (user) {
+		account_user_time(tsk, user);
+		tsk->utimescaled += scale_vtime(user);
+	}
 
-		user_scaled = (user_scaled * mult) / div;
-		system_scaled = (system_scaled * mult) / div;
+	if (guest) {
+		account_guest_time(tsk, guest);
+		tsk->utimescaled += scale_vtime(guest);
 	}
-	account_user_time(tsk, user);
-	tsk->utimescaled += user_scaled;
-	account_system_time(tsk, 0, system);
-	tsk->stimescaled += system_scaled;
+
+	if (system)
+		account_system_index_scaled(tsk, system, scale_vtime(system),
+					    CPUTIME_SYSTEM);
+	if (hardirq)
+		account_system_index_scaled(tsk, hardirq, scale_vtime(hardirq),
+					    CPUTIME_IRQ);
+	if (softirq)
+		account_system_index_scaled(tsk, softirq, scale_vtime(softirq),
+					    CPUTIME_SOFTIRQ);
 
 	steal = S390_lowcore.steal_timer;
 	if ((s64) steal > 0) {
@@ -147,16 +191,22 @@ static int do_account_vtime(struct task_struct *tsk)
 		account_steal_time(steal);
 	}
 
-	return virt_timer_forward(user + system);
+	return virt_timer_forward(user + guest + system + hardirq + softirq);
 }
 
 void vtime_task_switch(struct task_struct *prev)
 {
 	do_account_vtime(prev);
 	prev->thread.user_timer = S390_lowcore.user_timer;
+	prev->thread.guest_timer = S390_lowcore.guest_timer;
 	prev->thread.system_timer = S390_lowcore.system_timer;
+	prev->thread.hardirq_timer = S390_lowcore.hardirq_timer;
+	prev->thread.softirq_timer = S390_lowcore.softirq_timer;
 	S390_lowcore.user_timer = current->thread.user_timer;
+	S390_lowcore.guest_timer = current->thread.guest_timer;
 	S390_lowcore.system_timer = current->thread.system_timer;
+	S390_lowcore.hardirq_timer = current->thread.hardirq_timer;
+	S390_lowcore.softirq_timer = current->thread.softirq_timer;
 }
 
 /*
@@ -176,32 +226,22 @@ void vtime_account_user(struct task_struct *tsk)
  */
 void vtime_account_irq_enter(struct task_struct *tsk)
 {
-	u64 timer, system, system_scaled;
+	u64 timer;
 
 	timer = S390_lowcore.last_update_timer;
 	S390_lowcore.last_update_timer = get_vtimer();
-	S390_lowcore.system_timer += timer - S390_lowcore.last_update_timer;
-
-	/* Update MT utilization calculation */
-	if (smp_cpu_mtid &&
-	    time_after64(jiffies_64, this_cpu_read(mt_scaling_jiffies)))
-		update_mt_scaling();
-
-	system = S390_lowcore.system_timer - tsk->thread.system_timer;
-	S390_lowcore.steal_timer -= system;
-	tsk->thread.system_timer = S390_lowcore.system_timer;
-	system_scaled = system;
-	/* Do MT utilization scaling */
-	if (smp_cpu_mtid) {
-		u64 mult = __this_cpu_read(mt_scaling_mult);
-		u64 div = __this_cpu_read(mt_scaling_div);
-
-		system_scaled = (system_scaled * mult) / div;
-	}
-	account_system_time(tsk, 0, system);
-	tsk->stimescaled += system_scaled;
-
-	virt_timer_forward(system);
+	timer -= S390_lowcore.last_update_timer;
+
+	if ((tsk->flags & PF_VCPU) && (irq_count() == 0))
+		S390_lowcore.guest_timer += timer;
+	else if (hardirq_count())
+		S390_lowcore.hardirq_timer += timer;
+	else if (in_serving_softirq())
+		S390_lowcore.softirq_timer += timer;
+	else
+		S390_lowcore.system_timer += timer;
+
+	virt_timer_forward(timer);
 }
 EXPORT_SYMBOL_GPL(vtime_account_irq_enter);
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 10/10] vtime: Rename vtime_account_user() to vtime_flush()
  2017-01-05 17:11 [PATCH 00/10] vtime: Delay cputime accounting to tick / context switch Frederic Weisbecker
                   ` (8 preceding siblings ...)
  2017-01-05 17:11 ` [PATCH 09/10] s390/cputime: delayed accounting of system time Frederic Weisbecker
@ 2017-01-05 17:11 ` Frederic Weisbecker
  2017-01-14 10:05   ` [tip:sched/core] sched/cputime: " tip-bot for Frederic Weisbecker
  2017-01-09  8:13 ` [PATCH 00/10] vtime: Delay cputime accounting to tick / context switch Martin Schwidefsky
  2017-01-10 11:45 ` Thomas Gleixner
  11 siblings, 1 reply; 24+ messages in thread
From: Frederic Weisbecker @ 2017-01-05 17:11 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Heiko Carstens, Martin Schwidefsky, Tony Luck,
	Fenghua Yu, Peter Zijlstra, Rik van Riel, Thomas Gleixner,
	Ingo Molnar, Stanislaw Gruszka, Wanpeng Li,
	Christian Borntraeger

CONFIG_VIRT_CPU_ACCOUNTING_NATIVE used to accumulate user time and
account it on ticks and context switches only through the
vtime_account_user() function.

Now this model has been generalized on the 3 archs for all kind of
cputime (system, irq, ...) and all the cputime flushing happens under
vtime_account_user().

So let's rename this function to better reflect its new role.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 arch/ia64/kernel/time.c     | 2 +-
 arch/powerpc/kernel/time.c  | 6 ++----
 arch/s390/kernel/vtime.c    | 2 +-
 include/linux/kernel_stat.h | 2 +-
 include/linux/vtime.h       | 7 +++++--
 kernel/sched/cputime.c      | 4 +---
 6 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/arch/ia64/kernel/time.c b/arch/ia64/kernel/time.c
index 37f1b31..d040f12 100644
--- a/arch/ia64/kernel/time.c
+++ b/arch/ia64/kernel/time.c
@@ -61,7 +61,7 @@ static struct clocksource *itc_clocksource;
 
 extern cputime_t cycle_to_cputime(u64 cyc);
 
-void vtime_account_user(struct task_struct *tsk)
+void vtime_flush(struct task_struct *tsk)
 {
 	struct thread_info *ti = task_thread_info(tsk);
 	cputime_t delta;
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 4255e69..02e9730 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -382,15 +382,13 @@ void vtime_account_idle(struct task_struct *tsk)
 }
 
 /*
- * Transfer the user time accumulated in the paca
- * by the exception entry and exit code to the generic
- * process user time records.
+ * Account the whole cputime accumulated in the paca
  * Must be called with interrupts disabled.
  * Assumes that vtime_account_system/idle() has been called
  * recently (i.e. since the last entry from usermode) so that
  * get_paca()->user_time_scaled is up to date.
  */
-void vtime_account_user(struct task_struct *tsk)
+void vtime_flush(struct task_struct *tsk)
 {
 	struct cpu_accounting_data *acct = get_accounting(tsk);
 
diff --git a/arch/s390/kernel/vtime.c b/arch/s390/kernel/vtime.c
index 1a53e0b..0a9e5d6 100644
--- a/arch/s390/kernel/vtime.c
+++ b/arch/s390/kernel/vtime.c
@@ -214,7 +214,7 @@ void vtime_task_switch(struct task_struct *prev)
  * accounting system time in order to correctly compute
  * the stolen time accounting.
  */
-void vtime_account_user(struct task_struct *tsk)
+void vtime_flush(struct task_struct *tsk)
 {
 	if (do_account_vtime(tsk))
 		virt_timer_expire();
diff --git a/include/linux/kernel_stat.h b/include/linux/kernel_stat.h
index cfd6c0c..c3e38de 100644
--- a/include/linux/kernel_stat.h
+++ b/include/linux/kernel_stat.h
@@ -89,7 +89,7 @@ extern void account_idle_time(cputime_t);
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
 static inline void account_process_tick(struct task_struct *tsk, int user)
 {
-	vtime_account_user(tsk);
+	vtime_flush(tsk);
 }
 #else
 extern void account_process_tick(struct task_struct *, int user);
diff --git a/include/linux/vtime.h b/include/linux/vtime.h
index aa9bfea..0681fe2 100644
--- a/include/linux/vtime.h
+++ b/include/linux/vtime.h
@@ -58,27 +58,28 @@ static inline void vtime_task_switch(struct task_struct *prev)
 
 extern void vtime_account_system(struct task_struct *tsk);
 extern void vtime_account_idle(struct task_struct *tsk);
-extern void vtime_account_user(struct task_struct *tsk);
 
 #else /* !CONFIG_VIRT_CPU_ACCOUNTING */
 
 static inline void vtime_task_switch(struct task_struct *prev) { }
 static inline void vtime_account_system(struct task_struct *tsk) { }
-static inline void vtime_account_user(struct task_struct *tsk) { }
 #endif /* !CONFIG_VIRT_CPU_ACCOUNTING */
 
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
 extern void arch_vtime_task_switch(struct task_struct *tsk);
+extern void vtime_account_user(struct task_struct *tsk);
 extern void vtime_user_enter(struct task_struct *tsk);
 
 static inline void vtime_user_exit(struct task_struct *tsk)
 {
 	vtime_account_user(tsk);
 }
+
 extern void vtime_guest_enter(struct task_struct *tsk);
 extern void vtime_guest_exit(struct task_struct *tsk);
 extern void vtime_init_idle(struct task_struct *tsk, int cpu);
 #else /* !CONFIG_VIRT_CPU_ACCOUNTING_GEN  */
+static inline void vtime_account_user(struct task_struct *tsk) { }
 static inline void vtime_user_enter(struct task_struct *tsk) { }
 static inline void vtime_user_exit(struct task_struct *tsk) { }
 static inline void vtime_guest_enter(struct task_struct *tsk) { }
@@ -93,9 +94,11 @@ static inline void vtime_account_irq_exit(struct task_struct *tsk)
 	/* On hard|softirq exit we always account to hard|softirq cputime */
 	vtime_account_system(tsk);
 }
+extern void vtime_flush(struct task_struct *tsk);
 #else /* !CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
 static inline void vtime_account_irq_enter(struct task_struct *tsk) { }
 static inline void vtime_account_irq_exit(struct task_struct *tsk) { }
+static inline void vtime_flush(struct task_struct *tsk) { }
 #endif
 
 
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 5813ee4..f7c14cc 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -437,9 +437,7 @@ void vtime_common_task_switch(struct task_struct *prev)
 	else
 		vtime_account_system(prev);
 
-#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
-	vtime_account_user(prev);
-#endif
+	vtime_flush(prev);
 	arch_vtime_task_switch(prev);
 }
 #endif
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 00/10] vtime: Delay cputime accounting to tick / context switch
  2017-01-05 17:11 [PATCH 00/10] vtime: Delay cputime accounting to tick / context switch Frederic Weisbecker
                   ` (9 preceding siblings ...)
  2017-01-05 17:11 ` [PATCH 10/10] vtime: Rename vtime_account_user() to vtime_flush() Frederic Weisbecker
@ 2017-01-09  8:13 ` Martin Schwidefsky
  2017-01-10 11:45 ` Thomas Gleixner
  11 siblings, 0 replies; 24+ messages in thread
From: Martin Schwidefsky @ 2017-01-09  8:13 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Heiko Carstens, Tony Luck, Fenghua Yu, Peter Zijlstra,
	Rik van Riel, Thomas Gleixner, Ingo Molnar, Stanislaw Gruszka,
	Wanpeng Li, Christian Borntraeger

On Thu,  5 Jan 2017 18:11:40 +0100
Frederic Weisbecker <fweisbec@gmail.com> wrote:

> This version is a rebase on top of latest Linus tree which includes
> the fix 8f2b468aadc ("s390/vtime: correct system time accounting").
> 
> Also a small change: I have moved account_system_index_scaled() to s390
> in patch "s390/cputime: delayed accounting of system time" because it is
> the only user of the function.
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
> 	vtime/acc-v2

That looks good, I get sensible numbers for s390. Thanks for doing this!

For the s390 parts:
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 00/10] vtime: Delay cputime accounting to tick / context switch
  2017-01-05 17:11 [PATCH 00/10] vtime: Delay cputime accounting to tick / context switch Frederic Weisbecker
                   ` (10 preceding siblings ...)
  2017-01-09  8:13 ` [PATCH 00/10] vtime: Delay cputime accounting to tick / context switch Martin Schwidefsky
@ 2017-01-10 11:45 ` Thomas Gleixner
  2017-01-10 15:21   ` Frederic Weisbecker
  11 siblings, 1 reply; 24+ messages in thread
From: Thomas Gleixner @ 2017-01-10 11:45 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Heiko Carstens, Martin Schwidefsky, Tony Luck, Fenghua Yu,
	Peter Zijlstra, Rik van Riel, Ingo Molnar, Stanislaw Gruszka,
	Wanpeng Li, Christian Borntraeger

On Thu, 5 Jan 2017, Frederic Weisbecker wrote:

> This version is a rebase on top of latest Linus tree which includes
> the fix 8f2b468aadc ("s390/vtime: correct system time accounting").
> 
> Also a small change: I have moved account_system_index_scaled() to s390
> in patch "s390/cputime: delayed accounting of system time" because it is
> the only user of the function.
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
> 	vtime/acc-v2

Acked-by: Thomas Gleixner <tglx@linutronix.de>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 00/10] vtime: Delay cputime accounting to tick / context switch
  2017-01-10 11:45 ` Thomas Gleixner
@ 2017-01-10 15:21   ` Frederic Weisbecker
  0 siblings, 0 replies; 24+ messages in thread
From: Frederic Weisbecker @ 2017-01-10 15:21 UTC (permalink / raw)
  To: Thomas Gleixner, Martin Schwidefsky
  Cc: LKML, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Heiko Carstens, Tony Luck, Fenghua Yu, Peter Zijlstra,
	Rik van Riel, Ingo Molnar, Stanislaw Gruszka, Wanpeng Li,
	Christian Borntraeger

On Tue, Jan 10, 2017 at 12:45:43PM +0100, Thomas Gleixner wrote:
> On Thu, 5 Jan 2017, Frederic Weisbecker wrote:
> 
> > This version is a rebase on top of latest Linus tree which includes
> > the fix 8f2b468aadc ("s390/vtime: correct system time accounting").
> > 
> > Also a small change: I have moved account_system_index_scaled() to s390
> > in patch "s390/cputime: delayed accounting of system time" because it is
> > the only user of the function.
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
> > 	vtime/acc-v2
> 
> Acked-by: Thomas Gleixner <tglx@linutronix.de>

Thanks for your reviews! I added the acks on the series.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [tip:sched/core] sched/cputime, powerpc32: Fix stale scaled stime on context switch
  2017-01-05 17:11 ` [PATCH 01/10] powerpc32: Fix stale scaled stime on " Frederic Weisbecker
@ 2017-01-14 10:00   ` tip-bot for Frederic Weisbecker
  0 siblings, 0 replies; 24+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2017-01-14 10:00 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tony.luck, mingo, tglx, riel, sgruszka, wanpeng.li, fenghua.yu,
	fweisbec, benh, linux-kernel, schwidefsky, borntraeger,
	heiko.carstens, peterz, mpe, torvalds, hpa, paulus

Commit-ID:  90d08ba2b9b4be4aeca6a5b5a4b09fbcde30194d
Gitweb:     http://git.kernel.org/tip/90d08ba2b9b4be4aeca6a5b5a4b09fbcde30194d
Author:     Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Thu, 5 Jan 2017 18:11:41 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 14 Jan 2017 09:54:11 +0100

sched/cputime, powerpc32: Fix stale scaled stime on context switch

On context switch with powerpc32, the cputime is accumulated in the
thread_info struct. So the switching-in task must move forward its
start time snapshot to the current time in order to later compute the
delta spent in system mode.

This is what we do for the normal cputime by initializing the starttime
field to the value of the previous task's starttime which got freshly
updated.

But we are missing the update of the scaled cputime start time. As a
result we may be accounting too much scaled cputime later.

Fix this by initializing the scaled cputime the same way we do for
normal cputime.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1483636310-6557-2-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/powerpc/kernel/time.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index bc2e08d..ce21650 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -407,6 +407,7 @@ void arch_vtime_task_switch(struct task_struct *prev)
 	struct cpu_accounting_data *acct = get_accounting(current);
 
 	acct->starttime = get_accounting(prev)->starttime;
+	acct->startspurr = get_accounting(prev)->startspurr;
 	acct->system_time = 0;
 	acct->user_time = 0;
 }

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [tip:sched/core] sched/cputime, ia64: Fix incorrect start cputime assignment on task switch
  2017-01-05 17:11 ` [PATCH 02/10] ia64: Fix wrong start cputime assignment on task switch Frederic Weisbecker
@ 2017-01-14 10:00   ` tip-bot for Frederic Weisbecker
  0 siblings, 0 replies; 24+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2017-01-14 10:00 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: wanpeng.li, benh, mpe, mingo, tglx, riel, fweisbec, peterz,
	sgruszka, borntraeger, linux-kernel, paulus, fenghua.yu,
	heiko.carstens, schwidefsky, tony.luck, torvalds, hpa

Commit-ID:  8388d21468e7e7656867b67ab2ec98a78c9ad799
Gitweb:     http://git.kernel.org/tip/8388d21468e7e7656867b67ab2ec98a78c9ad799
Author:     Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Thu, 5 Jan 2017 18:11:42 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 14 Jan 2017 09:54:11 +0100

sched/cputime, ia64: Fix incorrect start cputime assignment on task switch

On task switch we must initialize the current cputime of the next task
using the value of the previous task which got freshly updated.

But we are confusing that with doing the opposite, which should result
in incorrect cputime accounting.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1483636310-6557-3-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/ia64/kernel/time.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/ia64/kernel/time.c b/arch/ia64/kernel/time.c
index 71775b95..637e741 100644
--- a/arch/ia64/kernel/time.c
+++ b/arch/ia64/kernel/time.c
@@ -83,7 +83,7 @@ void arch_vtime_task_switch(struct task_struct *prev)
 	struct thread_info *pi = task_thread_info(prev);
 	struct thread_info *ni = task_thread_info(current);
 
-	pi->ac_stamp = ni->ac_stamp;
+	ni->ac_stamp = pi->ac_stamp;
 	ni->ac_stime = ni->ac_utime = 0;
 }
 

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [tip:sched/core] sched/cputime: Allow accounting system time using cpustat index
  2017-01-05 17:11 ` [PATCH 03/10] cputime: Allow accounting system time using cpustat index Frederic Weisbecker
@ 2017-01-14 10:01   ` tip-bot for Frederic Weisbecker
  0 siblings, 0 replies; 24+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2017-01-14 10:01 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: sgruszka, fenghua.yu, borntraeger, mpe, torvalds, riel,
	tony.luck, mingo, heiko.carstens, linux-kernel, schwidefsky, hpa,
	wanpeng.li, paulus, benh, tglx, peterz, fweisbec

Commit-ID:  c31cc6a5187e8b09ccee34f81728a90f80e872e7
Gitweb:     http://git.kernel.org/tip/c31cc6a5187e8b09ccee34f81728a90f80e872e7
Author:     Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Thu, 5 Jan 2017 18:11:43 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 14 Jan 2017 09:54:11 +0100

sched/cputime: Allow accounting system time using cpustat index

In order to prepare for CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y to delay
cputime accounting to the tick, let's provide APIs to account system
time to precise contexts: hardirq, softirq, pure system, ...

Inspired-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1483636310-6557-4-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/kernel_stat.h |  2 ++
 kernel/sched/cputime.c      | 10 +++++-----
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/include/linux/kernel_stat.h b/include/linux/kernel_stat.h
index 00f7768..14b63bb 100644
--- a/include/linux/kernel_stat.h
+++ b/include/linux/kernel_stat.h
@@ -80,6 +80,8 @@ static inline unsigned int kstat_cpu_irqs_sum(unsigned int cpu)
 
 extern void account_user_time(struct task_struct *, cputime_t);
 extern void account_system_time(struct task_struct *, int, cputime_t);
+extern void account_system_index_time(struct task_struct *, cputime_t,
+				      enum cpu_usage_stat);
 extern void account_steal_time(cputime_t);
 extern void account_idle_time(cputime_t);
 
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 7700a9c..aad4283 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -176,8 +176,8 @@ static void account_guest_time(struct task_struct *p, cputime_t cputime)
  * @cputime: the cpu time spent in kernel space since the last update
  * @index: pointer to cpustat field that has to be updated
  */
-static inline
-void __account_system_time(struct task_struct *p, cputime_t cputime, int index)
+void account_system_index_time(struct task_struct *p,
+			       cputime_t cputime, enum cpu_usage_stat index)
 {
 	/* Add system time to process. */
 	p->stime += cputime;
@@ -213,7 +213,7 @@ void account_system_time(struct task_struct *p, int hardirq_offset,
 	else
 		index = CPUTIME_SYSTEM;
 
-	__account_system_time(p, cputime, index);
+	account_system_index_time(p, cputime, index);
 }
 
 /*
@@ -400,7 +400,7 @@ static void irqtime_account_process_tick(struct task_struct *p, int user_tick,
 		 * So, we have to handle it separately here.
 		 * Also, p->stime needs to be updated for ksoftirqd.
 		 */
-		__account_system_time(p, cputime, CPUTIME_SOFTIRQ);
+		account_system_index_time(p, cputime, CPUTIME_SOFTIRQ);
 	} else if (user_tick) {
 		account_user_time(p, cputime);
 	} else if (p == rq->idle) {
@@ -408,7 +408,7 @@ static void irqtime_account_process_tick(struct task_struct *p, int user_tick,
 	} else if (p->flags & PF_VCPU) { /* System time or guest time */
 		account_guest_time(p, cputime);
 	} else {
-		__account_system_time(p, cputime, CPUTIME_SYSTEM);
+		account_system_index_time(p, cputime, CPUTIME_SYSTEM);
 	}
 }
 

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [tip:sched/core] sched/cputime: Export account_guest_time()
  2017-01-05 17:11 ` [PATCH 04/10] cputime: Export account_guest_time Frederic Weisbecker
@ 2017-01-14 10:01   ` tip-bot for Frederic Weisbecker
  0 siblings, 0 replies; 24+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2017-01-14 10:01 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: heiko.carstens, hpa, torvalds, borntraeger, fenghua.yu, mingo,
	peterz, tony.luck, fweisbec, tglx, mpe, riel, sgruszka,
	schwidefsky, paulus, wanpeng.li, linux-kernel, benh

Commit-ID:  1213699ab426608ff1925ab263dd6925102bb92a
Gitweb:     http://git.kernel.org/tip/1213699ab426608ff1925ab263dd6925102bb92a
Author:     Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Thu, 5 Jan 2017 18:11:44 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 14 Jan 2017 09:54:11 +0100

sched/cputime: Export account_guest_time()

In order to prepare for CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y to delay
cputime accounting to the tick, let's allow archs to account cputime
directly to gtime.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1483636310-6557-5-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/kernel_stat.h | 1 +
 kernel/sched/cputime.c      | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/kernel_stat.h b/include/linux/kernel_stat.h
index 14b63bb..cfd6c0c 100644
--- a/include/linux/kernel_stat.h
+++ b/include/linux/kernel_stat.h
@@ -79,6 +79,7 @@ static inline unsigned int kstat_cpu_irqs_sum(unsigned int cpu)
 }
 
 extern void account_user_time(struct task_struct *, cputime_t);
+extern void account_guest_time(struct task_struct *, cputime_t);
 extern void account_system_time(struct task_struct *, int, cputime_t);
 extern void account_system_index_time(struct task_struct *, cputime_t,
 				      enum cpu_usage_stat);
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index aad4283..5813ee4 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -151,7 +151,7 @@ void account_user_time(struct task_struct *p, cputime_t cputime)
  * @p: the process that the cpu time gets accounted to
  * @cputime: the cpu time spent in virtual machine since the last update
  */
-static void account_guest_time(struct task_struct *p, cputime_t cputime)
+void account_guest_time(struct task_struct *p, cputime_t cputime)
 {
 	u64 *cpustat = kcpustat_this_cpu->cpustat;
 

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [tip:sched/core] sched/cputime, powerpc: Prepare accounting structure for cputime flush on tick
  2017-01-05 17:11 ` [PATCH 05/10] powerpc: Prepare accounting structure for cputime flush on tick Frederic Weisbecker
@ 2017-01-14 10:02   ` tip-bot for Frederic Weisbecker
  0 siblings, 0 replies; 24+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2017-01-14 10:02 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: schwidefsky, wanpeng.li, benh, torvalds, borntraeger, peterz,
	linux-kernel, hpa, fweisbec, tony.luck, tglx, mpe,
	heiko.carstens, mingo, paulus, fenghua.yu, riel, sgruszka

Commit-ID:  8c8b73c4811f2b5e458a7418dca07d2ef85c7db1
Gitweb:     http://git.kernel.org/tip/8c8b73c4811f2b5e458a7418dca07d2ef85c7db1
Author:     Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Thu, 5 Jan 2017 18:11:45 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 14 Jan 2017 09:54:12 +0100

sched/cputime, powerpc: Prepare accounting structure for cputime flush on tick

In order to prepare for CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y to delay
cputime accounting to the tick, provide finegrained accumulators to
powerpc in order to store the cputime until flushing.

While at it, normalize the name of several fields according to common
cputime naming.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1483636310-6557-6-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/powerpc/include/asm/accounting.h | 14 +++++++++++---
 arch/powerpc/kernel/asm-offsets.c     |  8 ++++----
 arch/powerpc/kernel/time.c            | 31 ++++++++++++++++---------------
 arch/powerpc/xmon/xmon.c              |  6 +++---
 4 files changed, 34 insertions(+), 25 deletions(-)

diff --git a/arch/powerpc/include/asm/accounting.h b/arch/powerpc/include/asm/accounting.h
index c133246..3abcf98 100644
--- a/arch/powerpc/include/asm/accounting.h
+++ b/arch/powerpc/include/asm/accounting.h
@@ -12,9 +12,17 @@
 
 /* Stuff for accurate time accounting */
 struct cpu_accounting_data {
-	unsigned long user_time;	/* accumulated usermode TB ticks */
-	unsigned long system_time;	/* accumulated system TB ticks */
-	unsigned long user_time_scaled;	/* accumulated usermode SPURR ticks */
+	/* Accumulated cputime values to flush on ticks*/
+	unsigned long utime;
+	unsigned long stime;
+	unsigned long utime_scaled;
+	unsigned long stime_scaled;
+	unsigned long gtime;
+	unsigned long hardirq_time;
+	unsigned long softirq_time;
+	unsigned long steal_time;
+	unsigned long idle_time;
+	/* Internal counters */
 	unsigned long starttime;	/* TB value snapshot */
 	unsigned long starttime_user;	/* TB value on exit to usermode */
 	unsigned long startspurr;	/* SPURR value snapshot */
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 0601e6a..e505319 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -252,9 +252,9 @@ int main(void)
 	DEFINE(ACCOUNT_STARTTIME_USER,
 	       offsetof(struct paca_struct, accounting.starttime_user));
 	DEFINE(ACCOUNT_USER_TIME,
-	       offsetof(struct paca_struct, accounting.user_time));
+	       offsetof(struct paca_struct, accounting.utime));
 	DEFINE(ACCOUNT_SYSTEM_TIME,
-	       offsetof(struct paca_struct, accounting.system_time));
+	       offsetof(struct paca_struct, accounting.stime));
 	DEFINE(PACA_TRAP_SAVE, offsetof(struct paca_struct, trap_save));
 	DEFINE(PACA_NAPSTATELOST, offsetof(struct paca_struct, nap_state_lost));
 	DEFINE(PACA_SPRG_VDSO, offsetof(struct paca_struct, sprg_vdso));
@@ -265,9 +265,9 @@ int main(void)
 	DEFINE(ACCOUNT_STARTTIME_USER,
 	       offsetof(struct thread_info, accounting.starttime_user));
 	DEFINE(ACCOUNT_USER_TIME,
-	       offsetof(struct thread_info, accounting.user_time));
+	       offsetof(struct thread_info, accounting.utime));
 	DEFINE(ACCOUNT_SYSTEM_TIME,
-	       offsetof(struct thread_info, accounting.system_time));
+	       offsetof(struct thread_info, accounting.stime));
 #endif
 #endif /* CONFIG_PPC64 */
 
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index ce21650..17a2cd1 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -271,8 +271,8 @@ void accumulate_stolen_time(void)
 
 	sst = scan_dispatch_log(acct->starttime_user);
 	ust = scan_dispatch_log(acct->starttime);
-	acct->system_time -= sst;
-	acct->user_time -= ust;
+	acct->stime -= sst;
+	acct->utime -= ust;
 	local_paca->stolen_time += ust + sst;
 
 	local_paca->soft_enabled = save_soft_enabled;
@@ -281,10 +281,11 @@ void accumulate_stolen_time(void)
 static inline u64 calculate_stolen_time(u64 stop_tb)
 {
 	u64 stolen = 0;
+	struct cpu_accounting_data *acct = &local_paca->accounting;
 
 	if (get_paca()->dtl_ridx != be64_to_cpu(get_lppaca()->dtl_idx)) {
 		stolen = scan_dispatch_log(stop_tb);
-		get_paca()->accounting.system_time -= stolen;
+		acct->stime -= stolen;
 	}
 
 	stolen += get_paca()->stolen_time;
@@ -316,17 +317,17 @@ static unsigned long vtime_delta(struct task_struct *tsk,
 
 	now = mftb();
 	nowscaled = read_spurr(now);
-	acct->system_time += now - acct->starttime;
+	acct->stime += now - acct->starttime;
 	acct->starttime = now;
 	deltascaled = nowscaled - acct->startspurr;
 	acct->startspurr = nowscaled;
 
 	*stolen = calculate_stolen_time(now);
 
-	delta = acct->system_time;
-	acct->system_time = 0;
-	udelta = acct->user_time - acct->utime_sspurr;
-	acct->utime_sspurr = acct->user_time;
+	delta = acct->stime;
+	acct->stime = 0;
+	udelta = acct->utime - acct->utime_sspurr;
+	acct->utime_sspurr = acct->utime;
 
 	/*
 	 * Because we don't read the SPURR on every kernel entry/exit,
@@ -348,7 +349,7 @@ static unsigned long vtime_delta(struct task_struct *tsk,
 			*sys_scaled = deltascaled;
 		}
 	}
-	acct->user_time_scaled += user_scaled;
+	acct->utime_scaled += user_scaled;
 
 	return delta;
 }
@@ -387,10 +388,10 @@ void vtime_account_user(struct task_struct *tsk)
 	cputime_t utime, utimescaled;
 	struct cpu_accounting_data *acct = get_accounting(tsk);
 
-	utime = acct->user_time;
-	utimescaled = acct->user_time_scaled;
-	acct->user_time = 0;
-	acct->user_time_scaled = 0;
+	utime = acct->utime;
+	utimescaled = acct->utime_scaled;
+	acct->utime = 0;
+	acct->utime_scaled = 0;
 	acct->utime_sspurr = 0;
 	account_user_time(tsk, utime);
 	tsk->utimescaled += utimescaled;
@@ -408,8 +409,8 @@ void arch_vtime_task_switch(struct task_struct *prev)
 
 	acct->starttime = get_accounting(prev)->starttime;
 	acct->startspurr = get_accounting(prev)->startspurr;
-	acct->system_time = 0;
-	acct->user_time = 0;
+	acct->stime = 0;
+	acct->utime = 0;
 }
 #endif /* CONFIG_PPC32 */
 
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 9c0e17c..9f3b170 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -2287,9 +2287,9 @@ static void dump_one_paca(int cpu)
 	DUMP(p, subcore_sibling_mask, "x");
 #endif
 
-	DUMP(p, accounting.user_time, "llx");
-	DUMP(p, accounting.system_time, "llx");
-	DUMP(p, accounting.user_time_scaled, "llx");
+	DUMP(p, accounting.utime, "llx");
+	DUMP(p, accounting.stime, "llx");
+	DUMP(p, accounting.utime_scaled, "llx");
 	DUMP(p, accounting.starttime, "llx");
 	DUMP(p, accounting.starttime_user, "llx");
 	DUMP(p, accounting.startspurr, "llx");

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [tip:sched/core] sched/cputime, powerpc: Migrate stolen_time field to the accounting structure
  2017-01-05 17:11 ` [PATCH 06/10] powerpc: Migrate stolen_time field to accounting structure Frederic Weisbecker
@ 2017-01-14 10:03   ` tip-bot for Frederic Weisbecker
  0 siblings, 0 replies; 24+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2017-01-14 10:03 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, fenghua.yu, wanpeng.li, riel, torvalds, benh,
	peterz, sgruszka, tglx, mingo, paulus, mpe, borntraeger,
	tony.luck, heiko.carstens, hpa, schwidefsky, fweisbec

Commit-ID:  f828c3d0aebab130a19d36336b50afa3414fa0bc
Gitweb:     http://git.kernel.org/tip/f828c3d0aebab130a19d36336b50afa3414fa0bc
Author:     Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Thu, 5 Jan 2017 18:11:46 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 14 Jan 2017 09:54:12 +0100

sched/cputime, powerpc: Migrate stolen_time field to the accounting structure

That in order to gather all cputime accumulation to the same place.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1483636310-6557-7-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/powerpc/include/asm/paca.h | 1 -
 arch/powerpc/kernel/time.c      | 6 +++---
 arch/powerpc/xmon/xmon.c        | 2 +-
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 6a6792b..708c3e5 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -187,7 +187,6 @@ struct paca_struct {
 
 	/* Stuff for accurate time accounting */
 	struct cpu_accounting_data accounting;
-	u64 stolen_time;		/* TB ticks taken by hypervisor */
 	u64 dtl_ridx;			/* read index in dispatch log */
 	struct dtl_entry *dtl_curr;	/* pointer corresponding to dtl_ridx */
 
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 17a2cd1..714313e 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -273,7 +273,7 @@ void accumulate_stolen_time(void)
 	ust = scan_dispatch_log(acct->starttime);
 	acct->stime -= sst;
 	acct->utime -= ust;
-	local_paca->stolen_time += ust + sst;
+	acct->steal_time += ust + sst;
 
 	local_paca->soft_enabled = save_soft_enabled;
 }
@@ -288,8 +288,8 @@ static inline u64 calculate_stolen_time(u64 stop_tb)
 		acct->stime -= stolen;
 	}
 
-	stolen += get_paca()->stolen_time;
-	get_paca()->stolen_time = 0;
+	stolen += acct->steal_time;
+	acct->steal_time = 0;
 	return stolen;
 }
 
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 9f3b170..3f864c3 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -2294,7 +2294,7 @@ static void dump_one_paca(int cpu)
 	DUMP(p, accounting.starttime_user, "llx");
 	DUMP(p, accounting.startspurr, "llx");
 	DUMP(p, accounting.utime_sspurr, "llx");
-	DUMP(p, stolen_time, "llx");
+	DUMP(p, accounting.steal_time, "llx");
 #undef DUMP
 
 	catch_memory_errors = 0;

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [tip:sched/core] sched/cputime, powerpc/vtime: Accumulate cputime and account only on tick/task switch
  2017-01-05 17:11 ` [PATCH 07/10] powerpc/vtime: Accumulate cputime and account only on tick/task switch Frederic Weisbecker
@ 2017-01-14 10:03   ` tip-bot for Frederic Weisbecker
  0 siblings, 0 replies; 24+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2017-01-14 10:03 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: paulus, tony.luck, fenghua.yu, tglx, wanpeng.li, mpe, riel,
	mingo, borntraeger, hpa, fweisbec, linux-kernel, peterz,
	torvalds, benh, heiko.carstens, sgruszka, schwidefsky

Commit-ID:  a19ff1a2cc9227f82e97836a8ee3e593f622eaf9
Gitweb:     http://git.kernel.org/tip/a19ff1a2cc9227f82e97836a8ee3e593f622eaf9
Author:     Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Thu, 5 Jan 2017 18:11:47 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 14 Jan 2017 09:54:12 +0100

sched/cputime, powerpc/vtime: Accumulate cputime and account only on tick/task switch

Currently CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y accounts the cputime on
any context boundary: irq entry/exit, guest entry/exit, context switch,
etc...

Calling functions such as account_system_time(), account_user_time()
and such can be costly, especially if they are called on many fastpath
such as twice per IRQ. Those functions do more than just accounting to
kcpustat and task cputime. Depending on the config, some subsystems can
perform unpleasant multiplications and divisions, among other things.

So lets accumulate the cputime instead and delay the accounting on ticks
and context switches only.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1483636310-6557-8-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/powerpc/kernel/time.c | 120 +++++++++++++++++++++++++++++----------------
 1 file changed, 77 insertions(+), 43 deletions(-)

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 714313e..4255e69 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -280,17 +280,10 @@ void accumulate_stolen_time(void)
 
 static inline u64 calculate_stolen_time(u64 stop_tb)
 {
-	u64 stolen = 0;
-	struct cpu_accounting_data *acct = &local_paca->accounting;
-
-	if (get_paca()->dtl_ridx != be64_to_cpu(get_lppaca()->dtl_idx)) {
-		stolen = scan_dispatch_log(stop_tb);
-		acct->stime -= stolen;
-	}
+	if (get_paca()->dtl_ridx != be64_to_cpu(get_lppaca()->dtl_idx))
+		return scan_dispatch_log(stop_tb);
 
-	stolen += acct->steal_time;
-	acct->steal_time = 0;
-	return stolen;
+	return 0;
 }
 
 #else /* CONFIG_PPC_SPLPAR */
@@ -306,27 +299,26 @@ static inline u64 calculate_stolen_time(u64 stop_tb)
  * or soft irq state.
  */
 static unsigned long vtime_delta(struct task_struct *tsk,
-				 unsigned long *sys_scaled,
-				 unsigned long *stolen)
+				 unsigned long *stime_scaled,
+				 unsigned long *steal_time)
 {
 	unsigned long now, nowscaled, deltascaled;
-	unsigned long udelta, delta, user_scaled;
+	unsigned long stime;
+	unsigned long utime, utime_scaled;
 	struct cpu_accounting_data *acct = get_accounting(tsk);
 
 	WARN_ON_ONCE(!irqs_disabled());
 
 	now = mftb();
 	nowscaled = read_spurr(now);
-	acct->stime += now - acct->starttime;
+	stime = now - acct->starttime;
 	acct->starttime = now;
 	deltascaled = nowscaled - acct->startspurr;
 	acct->startspurr = nowscaled;
 
-	*stolen = calculate_stolen_time(now);
+	*steal_time = calculate_stolen_time(now);
 
-	delta = acct->stime;
-	acct->stime = 0;
-	udelta = acct->utime - acct->utime_sspurr;
+	utime = acct->utime - acct->utime_sspurr;
 	acct->utime_sspurr = acct->utime;
 
 	/*
@@ -339,39 +331,54 @@ static unsigned long vtime_delta(struct task_struct *tsk,
 	 * the user ticks get saved up in paca->user_time_scaled to be
 	 * used by account_process_tick.
 	 */
-	*sys_scaled = delta;
-	user_scaled = udelta;
-	if (deltascaled != delta + udelta) {
-		if (udelta) {
-			*sys_scaled = deltascaled * delta / (delta + udelta);
-			user_scaled = deltascaled - *sys_scaled;
+	*stime_scaled = stime;
+	utime_scaled = utime;
+	if (deltascaled != stime + utime) {
+		if (utime) {
+			*stime_scaled = deltascaled * stime / (stime + utime);
+			utime_scaled = deltascaled - *stime_scaled;
 		} else {
-			*sys_scaled = deltascaled;
+			*stime_scaled = deltascaled;
 		}
 	}
-	acct->utime_scaled += user_scaled;
+	acct->utime_scaled += utime_scaled;
 
-	return delta;
+	return stime;
 }
 
 void vtime_account_system(struct task_struct *tsk)
 {
-	unsigned long delta, sys_scaled, stolen;
+	unsigned long stime, stime_scaled, steal_time;
+	struct cpu_accounting_data *acct = get_accounting(tsk);
+
+	stime = vtime_delta(tsk, &stime_scaled, &steal_time);
+
+	stime -= min(stime, steal_time);
+	acct->steal_time += steal_time;
 
-	delta = vtime_delta(tsk, &sys_scaled, &stolen);
-	account_system_time(tsk, 0, delta);
-	tsk->stimescaled += sys_scaled;
-	if (stolen)
-		account_steal_time(stolen);
+	if ((tsk->flags & PF_VCPU) && !irq_count()) {
+		acct->gtime += stime;
+		acct->utime_scaled += stime_scaled;
+	} else {
+		if (hardirq_count())
+			acct->hardirq_time += stime;
+		else if (in_serving_softirq())
+			acct->softirq_time += stime;
+		else
+			acct->stime += stime;
+
+		acct->stime_scaled += stime_scaled;
+	}
 }
 EXPORT_SYMBOL_GPL(vtime_account_system);
 
 void vtime_account_idle(struct task_struct *tsk)
 {
-	unsigned long delta, sys_scaled, stolen;
+	unsigned long stime, stime_scaled, steal_time;
+	struct cpu_accounting_data *acct = get_accounting(tsk);
 
-	delta = vtime_delta(tsk, &sys_scaled, &stolen);
-	account_idle_time(delta + stolen);
+	stime = vtime_delta(tsk, &stime_scaled, &steal_time);
+	acct->idle_time += stime + steal_time;
 }
 
 /*
@@ -385,16 +392,45 @@ void vtime_account_idle(struct task_struct *tsk)
  */
 void vtime_account_user(struct task_struct *tsk)
 {
-	cputime_t utime, utimescaled;
 	struct cpu_accounting_data *acct = get_accounting(tsk);
 
-	utime = acct->utime;
-	utimescaled = acct->utime_scaled;
+	if (acct->utime)
+		account_user_time(tsk, acct->utime);
+
+	if (acct->utime_scaled)
+		tsk->utimescaled += acct->utime_scaled;
+
+	if (acct->gtime)
+		account_guest_time(tsk, acct->gtime);
+
+	if (acct->steal_time)
+		account_steal_time(acct->steal_time);
+
+	if (acct->idle_time)
+		account_idle_time(acct->idle_time);
+
+	if (acct->stime)
+		account_system_index_time(tsk, acct->stime, CPUTIME_SYSTEM);
+
+	if (acct->stime_scaled)
+		tsk->stimescaled += acct->stime_scaled;
+
+	if (acct->hardirq_time)
+		account_system_index_time(tsk, acct->hardirq_time, CPUTIME_IRQ);
+
+	if (acct->softirq_time)
+		account_system_index_time(tsk, acct->softirq_time, CPUTIME_SOFTIRQ);
+
 	acct->utime = 0;
 	acct->utime_scaled = 0;
 	acct->utime_sspurr = 0;
-	account_user_time(tsk, utime);
-	tsk->utimescaled += utimescaled;
+	acct->gtime = 0;
+	acct->steal_time = 0;
+	acct->idle_time = 0;
+	acct->stime = 0;
+	acct->stime_scaled = 0;
+	acct->hardirq_time = 0;
+	acct->softirq_time = 0;
 }
 
 #ifdef CONFIG_PPC32
@@ -409,8 +445,6 @@ void arch_vtime_task_switch(struct task_struct *prev)
 
 	acct->starttime = get_accounting(prev)->starttime;
 	acct->startspurr = get_accounting(prev)->startspurr;
-	acct->stime = 0;
-	acct->utime = 0;
 }
 #endif /* CONFIG_PPC32 */
 

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [tip:sched/core] sched/cputime, ia64: Accumulate cputime and account only on tick/task switch
  2017-01-05 17:11 ` [PATCH 08/10] ia64: " Frederic Weisbecker
@ 2017-01-14 10:04   ` tip-bot for Frederic Weisbecker
  0 siblings, 0 replies; 24+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2017-01-14 10:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: wanpeng.li, hpa, tony.luck, riel, heiko.carstens, peterz,
	sgruszka, benh, torvalds, tglx, mpe, linux-kernel, schwidefsky,
	mingo, fweisbec, paulus, fenghua.yu, borntraeger

Commit-ID:  7dd582305d19fd178bb42ecd1666285ecfb1657a
Gitweb:     http://git.kernel.org/tip/7dd582305d19fd178bb42ecd1666285ecfb1657a
Author:     Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Thu, 5 Jan 2017 18:11:48 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 14 Jan 2017 09:54:12 +0100

sched/cputime, ia64: Accumulate cputime and account only on tick/task switch

Currently CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y accounts the cputime on
any context boundary: irq entry/exit, guest entry/exit, context switch,
etc...

Calling functions such as account_system_time(), account_user_time()
and such can be costly, especially if they are called on many fastpath
such as twice per IRQ. Those functions do more than just accounting to
kcpustat and task cputime. Depending on the config, some subsystems can
perform unpleasant multiplications and divisions, among other things.

So lets accumulate the cputime instead and delay the accounting on ticks
and context switches only.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1483636310-6557-9-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/ia64/include/asm/thread_info.h |  6 ++++
 arch/ia64/kernel/time.c             | 62 ++++++++++++++++++++++++++++---------
 2 files changed, 53 insertions(+), 15 deletions(-)

diff --git a/arch/ia64/include/asm/thread_info.h b/arch/ia64/include/asm/thread_info.h
index c702642..8742d74 100644
--- a/arch/ia64/include/asm/thread_info.h
+++ b/arch/ia64/include/asm/thread_info.h
@@ -27,6 +27,12 @@ struct thread_info {
 	mm_segment_t addr_limit;	/* user-level address space limit */
 	int preempt_count;		/* 0=premptable, <0=BUG; will also serve as bh-counter */
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
+	__u64 utime;
+	__u64 stime;
+	__u64 gtime;
+	__u64 hardirq_time;
+	__u64 softirq_time;
+	__u64 idle_time;
 	__u64 ac_stamp;
 	__u64 ac_leave;
 	__u64 ac_stime;
diff --git a/arch/ia64/kernel/time.c b/arch/ia64/kernel/time.c
index 637e741..37f1b31 100644
--- a/arch/ia64/kernel/time.c
+++ b/arch/ia64/kernel/time.c
@@ -63,14 +63,39 @@ extern cputime_t cycle_to_cputime(u64 cyc);
 
 void vtime_account_user(struct task_struct *tsk)
 {
-	cputime_t delta_utime;
 	struct thread_info *ti = task_thread_info(tsk);
+	cputime_t delta;
 
-	if (ti->ac_utime) {
-		delta_utime = cycle_to_cputime(ti->ac_utime);
-		account_user_time(tsk, delta_utime);
-		ti->ac_utime = 0;
+	if (ti->utime)
+		account_user_time(tsk, cycle_to_cputime(ti->utime));
+
+	if (ti->gtime)
+		account_guest_time(tsk, cycle_to_cputime(ti->gtime));
+
+	if (ti->idle_time)
+		account_idle_time(cycle_to_cputime(ti->idle_time));
+
+	if (ti->stime) {
+		delta = cycle_to_cputime(ti->stime);
+		account_system_index_time(tsk, delta, CPUTIME_SYSTEM);
+	}
+
+	if (ti->hardirq_time) {
+		delta = cycle_to_cputime(ti->hardirq_time);
+		account_system_index_time(tsk, delta, CPUTIME_IRQ);
+	}
+
+	if (ti->softirq_time) {
+		delta = cycle_to_cputime(ti->softirq_time);
+		account_system_index_time(tsk, delta, CPUTIME_SOFTIRQ);
 	}
+
+	ti->utime = 0;
+	ti->gtime = 0;
+	ti->idle_time = 0;
+	ti->stime = 0;
+	ti->hardirq_time = 0;
+	ti->softirq_time = 0;
 }
 
 /*
@@ -91,18 +116,15 @@ void arch_vtime_task_switch(struct task_struct *prev)
  * Account time for a transition between system, hard irq or soft irq state.
  * Note that this function is called with interrupts enabled.
  */
-static cputime_t vtime_delta(struct task_struct *tsk)
+static __u64 vtime_delta(struct task_struct *tsk)
 {
 	struct thread_info *ti = task_thread_info(tsk);
-	cputime_t delta_stime;
-	__u64 now;
+	__u64 now, delta_stime;
 
 	WARN_ON_ONCE(!irqs_disabled());
 
 	now = ia64_get_itc();
-
-	delta_stime = cycle_to_cputime(ti->ac_stime + (now - ti->ac_stamp));
-	ti->ac_stime = 0;
+	delta_stime = now - ti->ac_stamp;
 	ti->ac_stamp = now;
 
 	return delta_stime;
@@ -110,15 +132,25 @@ static cputime_t vtime_delta(struct task_struct *tsk)
 
 void vtime_account_system(struct task_struct *tsk)
 {
-	cputime_t delta = vtime_delta(tsk);
-
-	account_system_time(tsk, 0, delta);
+	struct thread_info *ti = task_thread_info(tsk);
+	__u64 stime = vtime_delta(tsk);
+
+	if ((tsk->flags & PF_VCPU) && !irq_count())
+		ti->gtime += stime;
+	else if (hardirq_count())
+		ti->hardirq_time += stime;
+	else if (in_serving_softirq())
+		ti->softirq_time += stime;
+	else
+		ti->stime += stime;
 }
 EXPORT_SYMBOL_GPL(vtime_account_system);
 
 void vtime_account_idle(struct task_struct *tsk)
 {
-	account_idle_time(vtime_delta(tsk));
+	struct thread_info *ti = task_thread_info(tsk);
+
+	ti->idle_time += vtime_delta(tsk);
 }
 
 #endif /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [tip:sched/core] sched/cputime, s390: Implement delayed accounting of system time
  2017-01-05 17:11 ` [PATCH 09/10] s390/cputime: delayed accounting of system time Frederic Weisbecker
@ 2017-01-14 10:04   ` tip-bot for Martin Schwidefsky
  0 siblings, 0 replies; 24+ messages in thread
From: tip-bot for Martin Schwidefsky @ 2017-01-14 10:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: heiko.carstens, torvalds, wanpeng.li, paulus, borntraeger, tglx,
	hpa, riel, peterz, sgruszka, tony.luck, benh, mingo, fweisbec,
	schwidefsky, linux-kernel, mpe, fenghua.yu

Commit-ID:  b7394a5f4ce9542666cc68422c3594ea854adc2c
Gitweb:     http://git.kernel.org/tip/b7394a5f4ce9542666cc68422c3594ea854adc2c
Author:     Martin Schwidefsky <schwidefsky@de.ibm.com>
AuthorDate: Thu, 5 Jan 2017 18:11:49 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 14 Jan 2017 09:54:12 +0100

sched/cputime, s390: Implement delayed accounting of system time

The account_system_time() function is called with a cputime that
occurred while running in the kernel. The function detects which
context the CPU is currently running in and accounts the time to
the correct bucket. This forces the arch code to account the
cputime for hardirq and softirq immediately.

Such accounting function can be costly and perform unwelcome divisions
and multiplications, among others.

The arch code can delay the accounting for system time. For s390
the accounting is done once per timer tick and for each task switch.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
[ Rebase against latest linus tree and move account_system_index_scaled(). ]
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1483636310-6557-10-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/s390/include/asm/lowcore.h   |  65 +++++++++---------
 arch/s390/include/asm/processor.h |   3 +
 arch/s390/kernel/vtime.c          | 136 ++++++++++++++++++++++++--------------
 3 files changed, 125 insertions(+), 79 deletions(-)

diff --git a/arch/s390/include/asm/lowcore.h b/arch/s390/include/asm/lowcore.h
index 9bfad2a..61261e0 100644
--- a/arch/s390/include/asm/lowcore.h
+++ b/arch/s390/include/asm/lowcore.h
@@ -85,53 +85,56 @@ struct lowcore {
 	__u64	mcck_enter_timer;		/* 0x02c0 */
 	__u64	exit_timer;			/* 0x02c8 */
 	__u64	user_timer;			/* 0x02d0 */
-	__u64	system_timer;			/* 0x02d8 */
-	__u64	steal_timer;			/* 0x02e0 */
-	__u64	last_update_timer;		/* 0x02e8 */
-	__u64	last_update_clock;		/* 0x02f0 */
-	__u64	int_clock;			/* 0x02f8 */
-	__u64	mcck_clock;			/* 0x0300 */
-	__u64	clock_comparator;		/* 0x0308 */
+	__u64	guest_timer;			/* 0x02d8 */
+	__u64	system_timer;			/* 0x02e0 */
+	__u64	hardirq_timer;			/* 0x02e8 */
+	__u64	softirq_timer;			/* 0x02f0 */
+	__u64	steal_timer;			/* 0x02f8 */
+	__u64	last_update_timer;		/* 0x0300 */
+	__u64	last_update_clock;		/* 0x0308 */
+	__u64	int_clock;			/* 0x0310 */
+	__u64	mcck_clock;			/* 0x0318 */
+	__u64	clock_comparator;		/* 0x0320 */
 
 	/* Current process. */
-	__u64	current_task;			/* 0x0310 */
-	__u8	pad_0x318[0x320-0x318];		/* 0x0318 */
-	__u64	kernel_stack;			/* 0x0320 */
+	__u64	current_task;			/* 0x0328 */
+	__u8	pad_0x318[0x320-0x318];		/* 0x0330 */
+	__u64	kernel_stack;			/* 0x0338 */
 
 	/* Interrupt, panic and restart stack. */
-	__u64	async_stack;			/* 0x0328 */
-	__u64	panic_stack;			/* 0x0330 */
-	__u64	restart_stack;			/* 0x0338 */
+	__u64	async_stack;			/* 0x0340 */
+	__u64	panic_stack;			/* 0x0348 */
+	__u64	restart_stack;			/* 0x0350 */
 
 	/* Restart function and parameter. */
-	__u64	restart_fn;			/* 0x0340 */
-	__u64	restart_data;			/* 0x0348 */
-	__u64	restart_source;			/* 0x0350 */
+	__u64	restart_fn;			/* 0x0358 */
+	__u64	restart_data;			/* 0x0360 */
+	__u64	restart_source;			/* 0x0368 */
 
 	/* Address space pointer. */
-	__u64	kernel_asce;			/* 0x0358 */
-	__u64	user_asce;			/* 0x0360 */
+	__u64	kernel_asce;			/* 0x0370 */
+	__u64	user_asce;			/* 0x0378 */
 
 	/*
 	 * The lpp and current_pid fields form a
 	 * 64-bit value that is set as program
 	 * parameter with the LPP instruction.
 	 */
-	__u32	lpp;				/* 0x0368 */
-	__u32	current_pid;			/* 0x036c */
+	__u32	lpp;				/* 0x0380 */
+	__u32	current_pid;			/* 0x0384 */
 
 	/* SMP info area */
-	__u32	cpu_nr;				/* 0x0370 */
-	__u32	softirq_pending;		/* 0x0374 */
-	__u64	percpu_offset;			/* 0x0378 */
-	__u64	vdso_per_cpu_data;		/* 0x0380 */
-	__u64	machine_flags;			/* 0x0388 */
-	__u32	preempt_count;			/* 0x0390 */
-	__u8	pad_0x0394[0x0398-0x0394];	/* 0x0394 */
-	__u64	gmap;				/* 0x0398 */
-	__u32	spinlock_lockval;		/* 0x03a0 */
-	__u32	fpu_flags;			/* 0x03a4 */
-	__u8	pad_0x03a8[0x0400-0x03a8];	/* 0x03a8 */
+	__u32	cpu_nr;				/* 0x0388 */
+	__u32	softirq_pending;		/* 0x038c */
+	__u64	percpu_offset;			/* 0x0390 */
+	__u64	vdso_per_cpu_data;		/* 0x0398 */
+	__u64	machine_flags;			/* 0x03a0 */
+	__u32	preempt_count;			/* 0x03a8 */
+	__u8	pad_0x03ac[0x03b0-0x03ac];	/* 0x03ac */
+	__u64	gmap;				/* 0x03b0 */
+	__u32	spinlock_lockval;		/* 0x03b8 */
+	__u32	fpu_flags;			/* 0x03bc */
+	__u8	pad_0x03c0[0x0400-0x03c0];	/* 0x03c0 */
 
 	/* Per cpu primary space access list */
 	__u32	paste[16];			/* 0x0400 */
diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h
index 6bca916..977a5b6 100644
--- a/arch/s390/include/asm/processor.h
+++ b/arch/s390/include/asm/processor.h
@@ -111,7 +111,10 @@ struct thread_struct {
 	unsigned int  acrs[NUM_ACRS];
         unsigned long ksp;              /* kernel stack pointer             */
 	unsigned long user_timer;	/* task cputime in user space */
+	unsigned long guest_timer;	/* task cputime in kvm guest */
 	unsigned long system_timer;	/* task cputime in kernel space */
+	unsigned long hardirq_timer;	/* task cputime in hardirq context */
+	unsigned long softirq_timer;	/* task cputime in softirq context */
 	unsigned long sys_call_table;	/* system call table address */
 	mm_segment_t mm_segment;
 	unsigned long gmap_addr;	/* address of last gmap fault. */
diff --git a/arch/s390/kernel/vtime.c b/arch/s390/kernel/vtime.c
index 1b5c5ee..1a53e0b 100644
--- a/arch/s390/kernel/vtime.c
+++ b/arch/s390/kernel/vtime.c
@@ -90,14 +90,41 @@ static void update_mt_scaling(void)
 	__this_cpu_write(mt_scaling_jiffies, jiffies_64);
 }
 
+static inline u64 update_tsk_timer(unsigned long *tsk_vtime, u64 new)
+{
+	u64 delta;
+
+	delta = new - *tsk_vtime;
+	*tsk_vtime = new;
+	return delta;
+}
+
+
+static inline u64 scale_vtime(u64 vtime)
+{
+	u64 mult = __this_cpu_read(mt_scaling_mult);
+	u64 div = __this_cpu_read(mt_scaling_div);
+
+	if (smp_cpu_mtid)
+		return vtime * mult / div;
+	return vtime;
+}
+
+static void account_system_index_scaled(struct task_struct *p,
+					cputime_t cputime, cputime_t scaled,
+					enum cpu_usage_stat index)
+{
+	p->stimescaled += scaled;
+	account_system_index_time(p, cputime, index);
+}
+
 /*
  * Update process times based on virtual cpu times stored by entry.S
  * to the lowcore fields user_timer, system_timer & steal_clock.
  */
 static int do_account_vtime(struct task_struct *tsk)
 {
-	u64 timer, clock, user, system, steal;
-	u64 user_scaled, system_scaled;
+	u64 timer, clock, user, guest, system, hardirq, softirq, steal;
 
 	timer = S390_lowcore.last_update_timer;
 	clock = S390_lowcore.last_update_clock;
@@ -110,36 +137,53 @@ static int do_account_vtime(struct task_struct *tsk)
 #endif
 		: "=m" (S390_lowcore.last_update_timer),
 		  "=m" (S390_lowcore.last_update_clock));
-	S390_lowcore.system_timer += timer - S390_lowcore.last_update_timer;
-	S390_lowcore.steal_timer += S390_lowcore.last_update_clock - clock;
+	clock = S390_lowcore.last_update_clock - clock;
+	timer -= S390_lowcore.last_update_timer;
+
+	if (hardirq_count())
+		S390_lowcore.hardirq_timer += timer;
+	else
+		S390_lowcore.system_timer += timer;
 
 	/* Update MT utilization calculation */
 	if (smp_cpu_mtid &&
 	    time_after64(jiffies_64, this_cpu_read(mt_scaling_jiffies)))
 		update_mt_scaling();
 
-	user = S390_lowcore.user_timer - tsk->thread.user_timer;
-	S390_lowcore.steal_timer -= user;
-	tsk->thread.user_timer = S390_lowcore.user_timer;
-
-	system = S390_lowcore.system_timer - tsk->thread.system_timer;
-	S390_lowcore.steal_timer -= system;
-	tsk->thread.system_timer = S390_lowcore.system_timer;
-
-	user_scaled = user;
-	system_scaled = system;
-	/* Do MT utilization scaling */
-	if (smp_cpu_mtid) {
-		u64 mult = __this_cpu_read(mt_scaling_mult);
-		u64 div = __this_cpu_read(mt_scaling_div);
+	/* Calculate cputime delta */
+	user = update_tsk_timer(&tsk->thread.user_timer,
+				READ_ONCE(S390_lowcore.user_timer));
+	guest = update_tsk_timer(&tsk->thread.guest_timer,
+				 READ_ONCE(S390_lowcore.guest_timer));
+	system = update_tsk_timer(&tsk->thread.system_timer,
+				  READ_ONCE(S390_lowcore.system_timer));
+	hardirq = update_tsk_timer(&tsk->thread.hardirq_timer,
+				   READ_ONCE(S390_lowcore.hardirq_timer));
+	softirq = update_tsk_timer(&tsk->thread.softirq_timer,
+				   READ_ONCE(S390_lowcore.softirq_timer));
+	S390_lowcore.steal_timer +=
+		clock - user - guest - system - hardirq - softirq;
+
+	/* Push account value */
+	if (user) {
+		account_user_time(tsk, user);
+		tsk->utimescaled += scale_vtime(user);
+	}
 
-		user_scaled = (user_scaled * mult) / div;
-		system_scaled = (system_scaled * mult) / div;
+	if (guest) {
+		account_guest_time(tsk, guest);
+		tsk->utimescaled += scale_vtime(guest);
 	}
-	account_user_time(tsk, user);
-	tsk->utimescaled += user_scaled;
-	account_system_time(tsk, 0, system);
-	tsk->stimescaled += system_scaled;
+
+	if (system)
+		account_system_index_scaled(tsk, system, scale_vtime(system),
+					    CPUTIME_SYSTEM);
+	if (hardirq)
+		account_system_index_scaled(tsk, hardirq, scale_vtime(hardirq),
+					    CPUTIME_IRQ);
+	if (softirq)
+		account_system_index_scaled(tsk, softirq, scale_vtime(softirq),
+					    CPUTIME_SOFTIRQ);
 
 	steal = S390_lowcore.steal_timer;
 	if ((s64) steal > 0) {
@@ -147,16 +191,22 @@ static int do_account_vtime(struct task_struct *tsk)
 		account_steal_time(steal);
 	}
 
-	return virt_timer_forward(user + system);
+	return virt_timer_forward(user + guest + system + hardirq + softirq);
 }
 
 void vtime_task_switch(struct task_struct *prev)
 {
 	do_account_vtime(prev);
 	prev->thread.user_timer = S390_lowcore.user_timer;
+	prev->thread.guest_timer = S390_lowcore.guest_timer;
 	prev->thread.system_timer = S390_lowcore.system_timer;
+	prev->thread.hardirq_timer = S390_lowcore.hardirq_timer;
+	prev->thread.softirq_timer = S390_lowcore.softirq_timer;
 	S390_lowcore.user_timer = current->thread.user_timer;
+	S390_lowcore.guest_timer = current->thread.guest_timer;
 	S390_lowcore.system_timer = current->thread.system_timer;
+	S390_lowcore.hardirq_timer = current->thread.hardirq_timer;
+	S390_lowcore.softirq_timer = current->thread.softirq_timer;
 }
 
 /*
@@ -176,32 +226,22 @@ void vtime_account_user(struct task_struct *tsk)
  */
 void vtime_account_irq_enter(struct task_struct *tsk)
 {
-	u64 timer, system, system_scaled;
+	u64 timer;
 
 	timer = S390_lowcore.last_update_timer;
 	S390_lowcore.last_update_timer = get_vtimer();
-	S390_lowcore.system_timer += timer - S390_lowcore.last_update_timer;
-
-	/* Update MT utilization calculation */
-	if (smp_cpu_mtid &&
-	    time_after64(jiffies_64, this_cpu_read(mt_scaling_jiffies)))
-		update_mt_scaling();
-
-	system = S390_lowcore.system_timer - tsk->thread.system_timer;
-	S390_lowcore.steal_timer -= system;
-	tsk->thread.system_timer = S390_lowcore.system_timer;
-	system_scaled = system;
-	/* Do MT utilization scaling */
-	if (smp_cpu_mtid) {
-		u64 mult = __this_cpu_read(mt_scaling_mult);
-		u64 div = __this_cpu_read(mt_scaling_div);
-
-		system_scaled = (system_scaled * mult) / div;
-	}
-	account_system_time(tsk, 0, system);
-	tsk->stimescaled += system_scaled;
-
-	virt_timer_forward(system);
+	timer -= S390_lowcore.last_update_timer;
+
+	if ((tsk->flags & PF_VCPU) && (irq_count() == 0))
+		S390_lowcore.guest_timer += timer;
+	else if (hardirq_count())
+		S390_lowcore.hardirq_timer += timer;
+	else if (in_serving_softirq())
+		S390_lowcore.softirq_timer += timer;
+	else
+		S390_lowcore.system_timer += timer;
+
+	virt_timer_forward(timer);
 }
 EXPORT_SYMBOL_GPL(vtime_account_irq_enter);
 

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [tip:sched/core] sched/cputime: Rename vtime_account_user() to vtime_flush()
  2017-01-05 17:11 ` [PATCH 10/10] vtime: Rename vtime_account_user() to vtime_flush() Frederic Weisbecker
@ 2017-01-14 10:05   ` tip-bot for Frederic Weisbecker
  0 siblings, 0 replies; 24+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2017-01-14 10:05 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: riel, linux-kernel, heiko.carstens, schwidefsky, paulus,
	fweisbec, torvalds, peterz, sgruszka, mpe, benh, wanpeng.li,
	tglx, borntraeger, mingo, fenghua.yu, hpa, tony.luck

Commit-ID:  c8d7dabf8f91fadd265e6eb87afb201d14ea299b
Gitweb:     http://git.kernel.org/tip/c8d7dabf8f91fadd265e6eb87afb201d14ea299b
Author:     Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Thu, 5 Jan 2017 18:11:50 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sat, 14 Jan 2017 09:54:13 +0100

sched/cputime: Rename vtime_account_user() to vtime_flush()

CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y used to accumulate user time and
account it on ticks and context switches only through the
vtime_account_user() function.

Now this model has been generalized on the 3 archs for all kind of
cputime (system, irq, ...) and all the cputime flushing happens under
vtime_account_user().

So let's rename this function to better reflect its new role.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1483636310-6557-11-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/ia64/kernel/time.c     | 2 +-
 arch/powerpc/kernel/time.c  | 6 ++----
 arch/s390/kernel/vtime.c    | 2 +-
 include/linux/kernel_stat.h | 2 +-
 include/linux/vtime.h       | 7 +++++--
 kernel/sched/cputime.c      | 4 +---
 6 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/arch/ia64/kernel/time.c b/arch/ia64/kernel/time.c
index 37f1b31..d040f12 100644
--- a/arch/ia64/kernel/time.c
+++ b/arch/ia64/kernel/time.c
@@ -61,7 +61,7 @@ static struct clocksource *itc_clocksource;
 
 extern cputime_t cycle_to_cputime(u64 cyc);
 
-void vtime_account_user(struct task_struct *tsk)
+void vtime_flush(struct task_struct *tsk)
 {
 	struct thread_info *ti = task_thread_info(tsk);
 	cputime_t delta;
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 4255e69..02e9730 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -382,15 +382,13 @@ void vtime_account_idle(struct task_struct *tsk)
 }
 
 /*
- * Transfer the user time accumulated in the paca
- * by the exception entry and exit code to the generic
- * process user time records.
+ * Account the whole cputime accumulated in the paca
  * Must be called with interrupts disabled.
  * Assumes that vtime_account_system/idle() has been called
  * recently (i.e. since the last entry from usermode) so that
  * get_paca()->user_time_scaled is up to date.
  */
-void vtime_account_user(struct task_struct *tsk)
+void vtime_flush(struct task_struct *tsk)
 {
 	struct cpu_accounting_data *acct = get_accounting(tsk);
 
diff --git a/arch/s390/kernel/vtime.c b/arch/s390/kernel/vtime.c
index 1a53e0b..0a9e5d6 100644
--- a/arch/s390/kernel/vtime.c
+++ b/arch/s390/kernel/vtime.c
@@ -214,7 +214,7 @@ void vtime_task_switch(struct task_struct *prev)
  * accounting system time in order to correctly compute
  * the stolen time accounting.
  */
-void vtime_account_user(struct task_struct *tsk)
+void vtime_flush(struct task_struct *tsk)
 {
 	if (do_account_vtime(tsk))
 		virt_timer_expire();
diff --git a/include/linux/kernel_stat.h b/include/linux/kernel_stat.h
index cfd6c0c..c3e38de 100644
--- a/include/linux/kernel_stat.h
+++ b/include/linux/kernel_stat.h
@@ -89,7 +89,7 @@ extern void account_idle_time(cputime_t);
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
 static inline void account_process_tick(struct task_struct *tsk, int user)
 {
-	vtime_account_user(tsk);
+	vtime_flush(tsk);
 }
 #else
 extern void account_process_tick(struct task_struct *, int user);
diff --git a/include/linux/vtime.h b/include/linux/vtime.h
index aa9bfea..0681fe2 100644
--- a/include/linux/vtime.h
+++ b/include/linux/vtime.h
@@ -58,27 +58,28 @@ static inline void vtime_task_switch(struct task_struct *prev)
 
 extern void vtime_account_system(struct task_struct *tsk);
 extern void vtime_account_idle(struct task_struct *tsk);
-extern void vtime_account_user(struct task_struct *tsk);
 
 #else /* !CONFIG_VIRT_CPU_ACCOUNTING */
 
 static inline void vtime_task_switch(struct task_struct *prev) { }
 static inline void vtime_account_system(struct task_struct *tsk) { }
-static inline void vtime_account_user(struct task_struct *tsk) { }
 #endif /* !CONFIG_VIRT_CPU_ACCOUNTING */
 
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
 extern void arch_vtime_task_switch(struct task_struct *tsk);
+extern void vtime_account_user(struct task_struct *tsk);
 extern void vtime_user_enter(struct task_struct *tsk);
 
 static inline void vtime_user_exit(struct task_struct *tsk)
 {
 	vtime_account_user(tsk);
 }
+
 extern void vtime_guest_enter(struct task_struct *tsk);
 extern void vtime_guest_exit(struct task_struct *tsk);
 extern void vtime_init_idle(struct task_struct *tsk, int cpu);
 #else /* !CONFIG_VIRT_CPU_ACCOUNTING_GEN  */
+static inline void vtime_account_user(struct task_struct *tsk) { }
 static inline void vtime_user_enter(struct task_struct *tsk) { }
 static inline void vtime_user_exit(struct task_struct *tsk) { }
 static inline void vtime_guest_enter(struct task_struct *tsk) { }
@@ -93,9 +94,11 @@ static inline void vtime_account_irq_exit(struct task_struct *tsk)
 	/* On hard|softirq exit we always account to hard|softirq cputime */
 	vtime_account_system(tsk);
 }
+extern void vtime_flush(struct task_struct *tsk);
 #else /* !CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
 static inline void vtime_account_irq_enter(struct task_struct *tsk) { }
 static inline void vtime_account_irq_exit(struct task_struct *tsk) { }
+static inline void vtime_flush(struct task_struct *tsk) { }
 #endif
 
 
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 5813ee4..f7c14cc 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -437,9 +437,7 @@ void vtime_common_task_switch(struct task_struct *prev)
 	else
 		vtime_account_system(prev);
 
-#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
-	vtime_account_user(prev);
-#endif
+	vtime_flush(prev);
 	arch_vtime_task_switch(prev);
 }
 #endif

^ permalink raw reply related	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2017-01-14 11:14 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-05 17:11 [PATCH 00/10] vtime: Delay cputime accounting to tick / context switch Frederic Weisbecker
2017-01-05 17:11 ` [PATCH 01/10] powerpc32: Fix stale scaled stime on " Frederic Weisbecker
2017-01-14 10:00   ` [tip:sched/core] sched/cputime, " tip-bot for Frederic Weisbecker
2017-01-05 17:11 ` [PATCH 02/10] ia64: Fix wrong start cputime assignment on task switch Frederic Weisbecker
2017-01-14 10:00   ` [tip:sched/core] sched/cputime, ia64: Fix incorrect " tip-bot for Frederic Weisbecker
2017-01-05 17:11 ` [PATCH 03/10] cputime: Allow accounting system time using cpustat index Frederic Weisbecker
2017-01-14 10:01   ` [tip:sched/core] sched/cputime: " tip-bot for Frederic Weisbecker
2017-01-05 17:11 ` [PATCH 04/10] cputime: Export account_guest_time Frederic Weisbecker
2017-01-14 10:01   ` [tip:sched/core] sched/cputime: Export account_guest_time() tip-bot for Frederic Weisbecker
2017-01-05 17:11 ` [PATCH 05/10] powerpc: Prepare accounting structure for cputime flush on tick Frederic Weisbecker
2017-01-14 10:02   ` [tip:sched/core] sched/cputime, " tip-bot for Frederic Weisbecker
2017-01-05 17:11 ` [PATCH 06/10] powerpc: Migrate stolen_time field to accounting structure Frederic Weisbecker
2017-01-14 10:03   ` [tip:sched/core] sched/cputime, powerpc: Migrate stolen_time field to the " tip-bot for Frederic Weisbecker
2017-01-05 17:11 ` [PATCH 07/10] powerpc/vtime: Accumulate cputime and account only on tick/task switch Frederic Weisbecker
2017-01-14 10:03   ` [tip:sched/core] sched/cputime, " tip-bot for Frederic Weisbecker
2017-01-05 17:11 ` [PATCH 08/10] ia64: " Frederic Weisbecker
2017-01-14 10:04   ` [tip:sched/core] sched/cputime, " tip-bot for Frederic Weisbecker
2017-01-05 17:11 ` [PATCH 09/10] s390/cputime: delayed accounting of system time Frederic Weisbecker
2017-01-14 10:04   ` [tip:sched/core] sched/cputime, s390: Implement " tip-bot for Martin Schwidefsky
2017-01-05 17:11 ` [PATCH 10/10] vtime: Rename vtime_account_user() to vtime_flush() Frederic Weisbecker
2017-01-14 10:05   ` [tip:sched/core] sched/cputime: " tip-bot for Frederic Weisbecker
2017-01-09  8:13 ` [PATCH 00/10] vtime: Delay cputime accounting to tick / context switch Martin Schwidefsky
2017-01-10 11:45 ` Thomas Gleixner
2017-01-10 15:21   ` Frederic Weisbecker

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.