From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752916AbcHJSID (ORCPT <rfc822;w@1wt.eu>);
	Wed, 10 Aug 2016 14:08:03 -0400
Received: from mx1.redhat.com ([209.132.183.28]:35622 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S932629AbcHJSIA (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
	Wed, 10 Aug 2016 14:08:00 -0400
Date: Wed, 10 Aug 2016 12:52:12 -0400
From: Rik van Riel <riel@redhat.com>
To: Wanpeng Li <kernellwp@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>, Ingo Molnar <mingo@kernel.org>,
        LKML <linux-kernel@vger.kernel.org>,
        Paolo Bonzini <pbonzini@redhat.com>,
        Peter Zijlstra <peterz@infradead.org>,
        Wanpeng Li <wanpeng.li@hotmail.com>,
        Thomas Gleixner <tglx@linutronix.de>,
        Radim Krcmar <rkrcmar@redhat.com>, Mike Galbraith <efault@gmx.de>
Subject: [PATCH] time,virt: resync steal time when guest & host lose sync
Message-ID: <20160810125212.78564dc2@annuminas.surriel.com>
In-Reply-To: <CANRm+CzXSCNGoVbuOB0Ruj2nmfHNRfcO3eB-91Z-fnBOnn-gbQ@mail.gmail.com>
References: <1468421405-20056-1-git-send-email-fweisbec@gmail.com>
	<1468421405-20056-2-git-send-email-fweisbec@gmail.com>
	<CANRm+CyYHLaihuk+ckQg42-Lo_3vFHaS8gU=GmXh8Rfq5mMpaA@mail.gmail.com>
	<1470751579.13905.77.camel@redhat.com>
	<CANRm+CzxWo9i_=Ygn4CAF0L5=1avtnX=ahwMK96fj3hnQW-_zA@mail.gmail.com>
	<CANRm+CzXSCNGoVbuOB0Ruj2nmfHNRfcO3eB-91Z-fnBOnn-gbQ@mail.gmail.com>
Organization: Red Hat, Inc.
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.30]); Wed, 10 Aug 2016 16:52:16 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 10 Aug 2016 07:39:08 +0800
Wanpeng Li <kernellwp@gmail.com> wrote:

> The regression is caused by your commit "sched,time: Count actually
> elapsed irq & softirq time".

Wanpeng, does this patch fix your issue?

Paolo, what is your opinion on this issue?

I can think of all kinds of ways in which guest and host might lose
sync with steal time: uninitialized values at boot, a guest pause
followed by save to disk and reload, live migration, and so on.
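
To make the intended behaviour easier to poke at outside the kernel, here
is a rough userspace model of option (3) from the changelog below. It is
only a sketch: struct steal_model, model_steal_account() and RESYNC_FACTOR
are made-up names for illustration, all values are plain nanosecond
counters, and the cputime_t <-> nsecs conversions done by the real code
are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace model, not kernel code. The fields stand in
 * for paravirt_steal_clock() and this_rq()->prev_steal_time.
 */
struct steal_model {
	uint64_t host_steal;      /* steal time reported by the host */
	uint64_t prev_steal_time; /* steal time the guest has consumed so far */
	uint64_t accounted;       /* steal time charged to the cpu statistics */
};

/* Same factor as the patch; the exact value is not critical. */
#define RESYNC_FACTOR 32

/*
 * Account at most maxtime of steal time. If the pending steal time is
 * more than RESYNC_FACTOR times maxtime, assume guest and host are out
 * of sync: consume the whole delta without charging any of it.
 */
static inline uint64_t model_steal_account(struct steal_model *m,
					   uint64_t maxtime)
{
	uint64_t steal;

	assert(m != NULL);

	steal = m->host_steal - m->prev_steal_time;

	if (steal > (uint64_t)RESYNC_FACTOR * maxtime) {
		/* Way out of sync: resync, account nothing. */
		m->prev_steal_time += steal;
		return 0;
	}

	/* Normal case: clamp to maxtime, as the min() in the patch does. */
	if (steal > maxtime)
		steal = maxtime;

	m->accounted += steal;
	m->prev_steal_time += steal;
	return steal;
}
```

With host_steal = 1000 and maxtime = 10, the first call resyncs and
charges nothing; a later, smaller delta is clamped to maxtime and
accounted as usual.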

---8<---

Subject: time,virt: resync steal time when guest & host lose sync

When guest and host wildly disagree on steal time, a guest can
do several things:
1) Quickly account all the steal time at once (the kernel did this before
   57430218317e ("sched/cputime: Count actually elapsed irq & softirq time"),
   when steal_account_process_ticks got ULONG_MAX as its maximum value).
2) Stay out of sync for an indeterminate amount of time. This is what the
   system does today.
3) Sync up the guest value to the host-provided value, without accounting
   an absurdly large value in the cpu time statistics.

This patch makes the kernel do (3), which seems like the right thing
to do.

The exact value of the threshold used probably does not matter too much,
as long as it is large enough to cover all the timer ticks that passed
during an idle period, because (irqtime_)account_idle_ticks can process
a large amount of time all at once.

Signed-off-by: Rik van Riel <riel@redhat.com>
Reported-by: Wanpeng Li <kernellwp@gmail.com>
---
 kernel/sched/cputime.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 1934f658c036..c18f9e717af6 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -273,7 +273,17 @@ static __always_inline cputime_t steal_account_process_time(cputime_t maxtime)
 		steal = paravirt_steal_clock(smp_processor_id());
 		steal -= this_rq()->prev_steal_time;
 
-		steal_cputime = min(nsecs_to_cputime(steal), maxtime);
+		steal_cputime = nsecs_to_cputime(steal);
+		if (steal_cputime > 32 * maxtime) {
+			/*
+			 * Guest and host steal time values are way out of
+			 * sync. Sync up the guest steal time with the host.
+			 */
+			this_rq()->prev_steal_time +=
+					cputime_to_nsecs(steal_cputime);
+			return 0;
+		}
+		steal_cputime = min(steal_cputime, maxtime);
 		account_steal_time(steal_cputime);
 		this_rq()->prev_steal_time += cputime_to_nsecs(steal_cputime);