All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] cputime: fix invalid gtime
@ 2015-10-28  7:01 Hiroshi Shimamoto
  2015-10-28 16:11 ` Frederic Weisbecker
  0 siblings, 1 reply; 6+ messages in thread
From: Hiroshi Shimamoto @ 2015-10-28  7:01 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Frederic Weisbecker; +Cc: linux-kernel

From: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>

/proc/stats shows invalid gtime when the thread is running in guest.
When vtime accounting is not enabled, we cannot get a valid delta.
The delta is calculated now - tsk->vtime_snap, but tsk->vtime_snap
is only updated when vtime accounting is enabled.

This patch makes task_gtime() just return gtime when vtime accounting
is not enabled.

The kernel config contains CONFIG_VIRT_CPU_ACCOUNTING_GEN=y and
CONFIG_NO_HZ_FULL_ALL=n and boot without nohz_full.

I ran and stop a busy loop in VM and see the gtime in host.
Dump the 43rd field which shows the gtime in every second.
 # while :; do awk '{print $3" "$43}' /proc/3955/task/4014/stat; sleep 1; done
S 4348
R 7064566
R 7064766
R 7064967
R 7065168
S 4759
S 4759

During running busy loop, it returns large value.

After applying this patch, we can see right gtime.

 # while :; do awk '{print $3" "$43}' /proc/10913/task/10956/stat; sleep 1; done
S 5338
R 5365
R 5465
R 5566
R 5666
S 5726
S 5726

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
---
 kernel/sched/cputime.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 8cbc3db..f614ee9 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -786,6 +786,9 @@ cputime_t task_gtime(struct task_struct *t)
 	unsigned int seq;
 	cputime_t gtime;
 
+	if (!vtime_accounting_enabled())
+		return t->gtime;
+
 	do {
 		seq = read_seqbegin(&t->vtime_seqlock);
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] cputime: fix invalid gtime
  2015-10-28  7:01 [PATCH v2] cputime: fix invalid gtime Hiroshi Shimamoto
@ 2015-10-28 16:11 ` Frederic Weisbecker
  2015-10-29  1:10   ` Hiroshi Shimamoto
  0 siblings, 1 reply; 6+ messages in thread
From: Frederic Weisbecker @ 2015-10-28 16:11 UTC (permalink / raw)
  To: Hiroshi Shimamoto; +Cc: Peter Zijlstra, Ingo Molnar, linux-kernel

On Wed, Oct 28, 2015 at 07:01:18AM +0000, Hiroshi Shimamoto wrote:
> From: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
> 
> /proc/stats shows invalid gtime when the thread is running in guest.
> When vtime accounting is not enabled, we cannot get a valid delta.
> The delta is calculated now - tsk->vtime_snap, but tsk->vtime_snap
> is only updated when vtime accounting is enabled.
> 
> This patch makes task_gtime() just return gtime when vtime accounting
> is not enabled.
> 
> The kernel config contains CONFIG_VIRT_CPU_ACCOUNTING_GEN=y and
> CONFIG_NO_HZ_FULL_ALL=n and boot without nohz_full.
> 
> I ran and stop a busy loop in VM and see the gtime in host.
> Dump the 43rd field which shows the gtime in every second.
>  # while :; do awk '{print $3" "$43}' /proc/3955/task/4014/stat; sleep 1; done
> S 4348
> R 7064566
> R 7064766
> R 7064967
> R 7065168
> S 4759
> S 4759
> 
> During running busy loop, it returns large value.
> 
> After applying this patch, we can see right gtime.
> 
>  # while :; do awk '{print $3" "$43}' /proc/10913/task/10956/stat; sleep 1; done
> S 5338
> R 5365
> R 5465
> R 5566
> R 5666
> S 5726
> S 5726
> 
> Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
> ---
>  kernel/sched/cputime.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index 8cbc3db..f614ee9 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -786,6 +786,9 @@ cputime_t task_gtime(struct task_struct *t)
>  	unsigned int seq;
>  	cputime_t gtime;
>  
> +	if (!vtime_accounting_enabled())
> +		return t->gtime;
> +

Obviously I completely messed up there. And task_cputime() has a similar issue
but it happens to work due to vtime_snap_whence set to VTIME_SLEEPING when vtime
doesn't run. Still it works at the cost of a seqcount read operation.

Do you think you could fix it too (along with task_cputime_scaled())? I think those
patches will also need a stable tag.

Thanks!



>  	do {
>  		seq = read_seqbegin(&t->vtime_seqlock);
>  
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [PATCH v2] cputime: fix invalid gtime
  2015-10-28 16:11 ` Frederic Weisbecker
@ 2015-10-29  1:10   ` Hiroshi Shimamoto
  2015-10-29  3:37     ` Frederic Weisbecker
  0 siblings, 1 reply; 6+ messages in thread
From: Hiroshi Shimamoto @ 2015-10-29  1:10 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: Peter Zijlstra, Ingo Molnar, linux-kernel

> Subject: Re: [PATCH v2] cputime: fix invalid gtime
> 
> On Wed, Oct 28, 2015 at 07:01:18AM +0000, Hiroshi Shimamoto wrote:
> > From: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
> >
> > /proc/stats shows invalid gtime when the thread is running in guest.
> > When vtime accounting is not enabled, we cannot get a valid delta.
> > The delta is calculated now - tsk->vtime_snap, but tsk->vtime_snap
> > is only updated when vtime accounting is enabled.
> >
> > This patch makes task_gtime() just return gtime when vtime accounting
> > is not enabled.
> >
> > The kernel config contains CONFIG_VIRT_CPU_ACCOUNTING_GEN=y and
> > CONFIG_NO_HZ_FULL_ALL=n and boot without nohz_full.
> >
> > I ran and stop a busy loop in VM and see the gtime in host.
> > Dump the 43rd field which shows the gtime in every second.
> >  # while :; do awk '{print $3" "$43}' /proc/3955/task/4014/stat; sleep 1; done
> > S 4348
> > R 7064566
> > R 7064766
> > R 7064967
> > R 7065168
> > S 4759
> > S 4759
> >
> > During running busy loop, it returns large value.
> >
> > After applying this patch, we can see right gtime.
> >
> >  # while :; do awk '{print $3" "$43}' /proc/10913/task/10956/stat; sleep 1; done
> > S 5338
> > R 5365
> > R 5465
> > R 5566
> > R 5666
> > S 5726
> > S 5726
> >
> > Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
> > ---
> >  kernel/sched/cputime.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> > index 8cbc3db..f614ee9 100644
> > --- a/kernel/sched/cputime.c
> > +++ b/kernel/sched/cputime.c
> > @@ -786,6 +786,9 @@ cputime_t task_gtime(struct task_struct *t)
> >  	unsigned int seq;
> >  	cputime_t gtime;
> >
> > +	if (!vtime_accounting_enabled())
> > +		return t->gtime;
> > +
> 
> Obviously I completely messed up there. And task_cputime() has a similar issue
> but it happens to work due to vtime_snap_whence set to VTIME_SLEEPING when vtime
> doesn't run. Still it works at the cost of a seqcount read operation.
> 
> Do you think you could fix it too (along with task_cputime_scaled())? I think those
> patches will also need a stable tag.

Do you mean that task_cputime() and task_cputime_scaled() don't hit invalid behavior
but have some extra operation cost which could be removed?

Will look into it, and send patches with stable tag.

thanks,
Hiroshi

> 
> Thanks!
> 
> 
> 
> >  	do {
> >  		seq = read_seqbegin(&t->vtime_seqlock);
> >
> > --
> > 1.8.3.1
> >

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] cputime: fix invalid gtime
  2015-10-29  1:10   ` Hiroshi Shimamoto
@ 2015-10-29  3:37     ` Frederic Weisbecker
  2015-10-29  4:30       ` Hiroshi Shimamoto
  0 siblings, 1 reply; 6+ messages in thread
From: Frederic Weisbecker @ 2015-10-29  3:37 UTC (permalink / raw)
  To: Hiroshi Shimamoto; +Cc: Peter Zijlstra, Ingo Molnar, linux-kernel

On Thu, Oct 29, 2015 at 01:10:01AM +0000, Hiroshi Shimamoto wrote:
> > Obviously I completely messed up there. And task_cputime() has a similar issue
> > but it happens to work due to vtime_snap_whence set to VTIME_SLEEPING when vtime
> > doesn't run. Still it works at the cost of a seqcount read operation.
> > 
> > Do you think you could fix it too (along with task_cputime_scaled())? I think those
> > patches will also need a stable tag.
> 
> Do you mean that task_cputime() and task_cputime_scaled() don't hit invalid behavior
> but have some extra operation cost which could be removed?

Exactly.

> 
> Will look into it, and send patches with stable tag.

Thanks a lot!

Oh and another detail: vtime_accounting_enabled() checks if vtime
accounting is done precisely on the current CPU. That's what we want to check
when we account the time but not when we want to read the cputime of a task.

For example, CPU 0 never has vtime_accounting_enabled() because it plays the
role of timekeeper and as such it keeps the tick periodic. So if task A runs on
CPU 1 that has vtime accounting on, and we read the cputime of task A from CPU 0,
vtime_accounting_enabled() will be false whereas we need to compute the delta.

So vtime_accounting_enabled() isn't suitable to check if vtime is running on _some_
CPU such that we can't return utime/stime with a raw read.

Ideally we shoud rename vtime_accounting_enabled() to vtime_accounting_cpu_enabled()
and have vtime_accounting_enabled() to check if vtime runs somewhere. But that would
be too much an invasive change for a stable patch. So lets just use
context_tracking_is_enabled() for now instead.

I think task_gtime() is also buggy when the target task runs in a CPU that doesn't do
vtime accounting (whereas another CPU does vtime accounting). We should fix that with
using a new VTIME_GUEST value instead of (or along with) PF_VCPU. But that's another story,
your fix is much more important for now.

> 
> thanks,
> Hiroshi
> 
> > 
> > Thanks!
> > 
> > 
> > 
> > >  	do {
> > >  		seq = read_seqbegin(&t->vtime_seqlock);
> > >
> > > --
> > > 1.8.3.1
> > >

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [PATCH v2] cputime: fix invalid gtime
  2015-10-29  3:37     ` Frederic Weisbecker
@ 2015-10-29  4:30       ` Hiroshi Shimamoto
  2015-10-29 12:44         ` Frederic Weisbecker
  0 siblings, 1 reply; 6+ messages in thread
From: Hiroshi Shimamoto @ 2015-10-29  4:30 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: Peter Zijlstra, Ingo Molnar, linux-kernel

> Subject: Re: [PATCH v2] cputime: fix invalid gtime
> 
> On Thu, Oct 29, 2015 at 01:10:01AM +0000, Hiroshi Shimamoto wrote:
> > > Obviously I completely messed up there. And task_cputime() has a similar issue
> > > but it happens to work due to vtime_snap_whence set to VTIME_SLEEPING when vtime
> > > doesn't run. Still it works at the cost of a seqcount read operation.
> > >
> > > Do you think you could fix it too (along with task_cputime_scaled())? I think those
> > > patches will also need a stable tag.
> >
> > Do you mean that task_cputime() and task_cputime_scaled() don't hit invalid behavior
> > but have some extra operation cost which could be removed?
> 
> Exactly.
> 
> >
> > Will look into it, and send patches with stable tag.
> 
> Thanks a lot!
> 
> Oh and another detail: vtime_accounting_enabled() checks if vtime
> accounting is done precisely on the current CPU. That's what we want to check
> when we account the time but not when we want to read the cputime of a task.
> 
> For example, CPU 0 never has vtime_accounting_enabled() because it plays the
> role of timekeeper and as such it keeps the tick periodic. So if task A runs on
> CPU 1 that has vtime accounting on, and we read the cputime of task A from CPU 0,
> vtime_accounting_enabled() will be false whereas we need to compute the delta.
> 
> So vtime_accounting_enabled() isn't suitable to check if vtime is running on _some_
> CPU such that we can't return utime/stime with a raw read.

I see the point, vtime accounting can be enabled on dedicated cpu and there is no
guarantee the reading thread is on the same state.

> 
> Ideally we shoud rename vtime_accounting_enabled() to vtime_accounting_cpu_enabled()
> and have vtime_accounting_enabled() to check if vtime runs somewhere. But that would
> be too much an invasive change for a stable patch. So lets just use
> context_tracking_is_enabled() for now instead.

I have dig the code.
And my understanding is that vtime_accounting_enabled() does check global flag with
context_tracking_is_enabled() and then check current cpu state with
context_tracking_cpu_is_enabled(). For now, we just check global flag to fix current
issue instead of checking both in vtime_accounting_enabled(). In future we should fix
more precisely.
Is that correct?

thanks,
Hiroshi

> 
> I think task_gtime() is also buggy when the target task runs in a CPU that doesn't do
> vtime accounting (whereas another CPU does vtime accounting). We should fix that with
> using a new VTIME_GUEST value instead of (or along with) PF_VCPU. But that's another story,
> your fix is much more important for now.
> 
> >
> > thanks,
> > Hiroshi
> >
> > >
> > > Thanks!
> > >
> > >
> > >
> > > >  	do {
> > > >  		seq = read_seqbegin(&t->vtime_seqlock);
> > > >
> > > > --
> > > > 1.8.3.1
> > > >

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] cputime: fix invalid gtime
  2015-10-29  4:30       ` Hiroshi Shimamoto
@ 2015-10-29 12:44         ` Frederic Weisbecker
  0 siblings, 0 replies; 6+ messages in thread
From: Frederic Weisbecker @ 2015-10-29 12:44 UTC (permalink / raw)
  To: Hiroshi Shimamoto; +Cc: Peter Zijlstra, Ingo Molnar, linux-kernel

On Thu, Oct 29, 2015 at 04:30:20AM +0000, Hiroshi Shimamoto wrote:
> > Subject: Re: [PATCH v2] cputime: fix invalid gtime
> > 
> > On Thu, Oct 29, 2015 at 01:10:01AM +0000, Hiroshi Shimamoto wrote:
> > > > Obviously I completely messed up there. And task_cputime() has a similar issue
> > > > but it happens to work due to vtime_snap_whence set to VTIME_SLEEPING when vtime
> > > > doesn't run. Still it works at the cost of a seqcount read operation.
> > > >
> > > > Do you think you could fix it too (along with task_cputime_scaled())? I think those
> > > > patches will also need a stable tag.
> > >
> > > Do you mean that task_cputime() and task_cputime_scaled() don't hit invalid behavior
> > > but have some extra operation cost which could be removed?
> > 
> > Exactly.
> > 
> > >
> > > Will look into it, and send patches with stable tag.
> > 
> > Thanks a lot!
> > 
> > Oh and another detail: vtime_accounting_enabled() checks if vtime
> > accounting is done precisely on the current CPU. That's what we want to check
> > when we account the time but not when we want to read the cputime of a task.
> > 
> > For example, CPU 0 never has vtime_accounting_enabled() because it plays the
> > role of timekeeper and as such it keeps the tick periodic. So if task A runs on
> > CPU 1 that has vtime accounting on, and we read the cputime of task A from CPU 0,
> > vtime_accounting_enabled() will be false whereas we need to compute the delta.
> > 
> > So vtime_accounting_enabled() isn't suitable to check if vtime is running on _some_
> > CPU such that we can't return utime/stime with a raw read.
> 
> I see the point, vtime accounting can be enabled on dedicated cpu and there is no
> guarantee the reading thread is on the same state.

Exactly!

> 
> > 
> > Ideally we shoud rename vtime_accounting_enabled() to vtime_accounting_cpu_enabled()
> > and have vtime_accounting_enabled() to check if vtime runs somewhere. But that would
> > be too much an invasive change for a stable patch. So lets just use
> > context_tracking_is_enabled() for now instead.
> 
> I have dig the code.
> And my understanding is that vtime_accounting_enabled() does check global flag with
> context_tracking_is_enabled() and then check current cpu state with
> context_tracking_cpu_is_enabled(). For now, we just check global flag to fix current
> issue instead of checking both in vtime_accounting_enabled(). In future we should fix
> more precisely.
> Is that correct?

Perfectly correct :-)

Thanks!

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2015-10-29 12:44 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-28  7:01 [PATCH v2] cputime: fix invalid gtime Hiroshi Shimamoto
2015-10-28 16:11 ` Frederic Weisbecker
2015-10-29  1:10   ` Hiroshi Shimamoto
2015-10-29  3:37     ` Frederic Weisbecker
2015-10-29  4:30       ` Hiroshi Shimamoto
2015-10-29 12:44         ` Frederic Weisbecker

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.