qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* Re: [Qemu-devel] [PATCH v15 23/26] sched: early boot clock
       [not found] ` <20180719205545.16512-24-pasha.tatashin@oracle.com>
@ 2018-11-06  5:42   ` Dominique Martinet
  2018-11-06 11:35     ` Steven Sistare
  0 siblings, 1 reply; 12+ messages in thread
From: Dominique Martinet @ 2018-11-06  5:42 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: steven.sistare, daniel.m.jordan, linux, schwidefsky,
	heiko.carstens, john.stultz, sboyd, x86, linux-kernel, mingo,
	tglx, hpa, douly.fnst, peterz, prarit, feng.tang, pmladek,
	gnomes, linux-s390, boris.ostrovsky, jgross, pbonzini,
	virtualization, kvm, qemu-devel

(added various kvm/virtualization lists in Cc as well as qemu as I don't
know who's "wrong" here)

Pavel Tatashin wrote on Thu, Jul 19, 2018:
> Allow sched_clock() to be used before schec_clock_init() is called.
> This provides with a way to get early boot timestamps on machines with
> unstable clocks.

This isn't something I understand, but bisect tells me this patch
(landed as 857baa87b64 ("sched/clock: Enable sched clock early")) makes
a VM running with kvmclock take a step in uptime/printk timer early in
boot sequence as illustrated below. The step seems to be related to the
amount of time the host was suspended while qemu was running before the
reboot.

$ dmesg
...
[    0.000000] SMBIOS 2.8 present.
[    0.000000] DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/2014
[    0.000000] Hypervisor detected: KVM
[    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[283120.529821] kvm-clock: cpu 0, msr 321a8001, primary cpu clock
[283120.529822] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[283120.529824] tsc: Detected 2592.000 MHz processor
...

(The VM is x86_64 on x86_64, I can provide my .config on request but
don't think it's related)


It's rather annoying for me as I often reboot VMs and rely on the
'uptime' command to check if I did just reboot or not as I have the
attention span of a goldfish; I'd rather not have to find something else
to check if I did just reboot or not.

Note that if the qemu process is restarted, there is no offset anymore.

I unfortunately just did that so cannot say with confidence (putting my
laptop to sleep for 30s only led to a 2s offset and I do not want to
wait longer right now), but it looks like the clock is still mostly
correct after reboot after disabling my VM's ntp client. Will infirm
that tomorrow if I was wrong.


Happy to try to help fixing this in any way, as written above the quote
I'm not even actually sure who is wrong here.

Thanks!



(As a side, mostly unrelated note, insert swearing here about cf7a63ef4
not compiling earlier in this serie; some variable declaration got
removed before their use. Was fixed in the next patch but I didn't
notice the kernel didn't fully rebuild and wasted time in my bisect
heading the wrong way...)

> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
> ---
>  init/main.c          |  2 +-
>  kernel/sched/clock.c | 20 +++++++++++++++++++-
>  2 files changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/init/main.c b/init/main.c
> index 162d931c9511..ff0a24170b95 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -642,7 +642,6 @@ asmlinkage __visible void __init start_kernel(void)
>  	softirq_init();
>  	timekeeping_init();
>  	time_init();
> -	sched_clock_init();
>  	printk_safe_init();
>  	perf_event_init();
>  	profile_init();
> @@ -697,6 +696,7 @@ asmlinkage __visible void __init start_kernel(void)
>  	acpi_early_init();
>  	if (late_time_init)
>  		late_time_init();
> +	sched_clock_init();
>  	calibrate_delay();
>  	pid_idr_init();
>  	anon_vma_init();
> diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
> index 0e9dbb2d9aea..422cd63f8f17 100644
> --- a/kernel/sched/clock.c
> +++ b/kernel/sched/clock.c
> @@ -202,7 +202,25 @@ static void __sched_clock_gtod_offset(void)
>  
>  void __init sched_clock_init(void)
>  {
> +	unsigned long flags;
> +
> +	/*
> +	 * Set __gtod_offset such that once we mark sched_clock_running,
> +	 * sched_clock_tick() continues where sched_clock() left off.
> +	 *
> +	 * Even if TSC is buggered, we're still UP at this point so it
> +	 * can't really be out of sync.
> +	 */
> +	local_irq_save(flags);
> +	__sched_clock_gtod_offset();
> +	local_irq_restore(flags);
> +
>  	sched_clock_running = 1;
> +
> +	/* Now that sched_clock_running is set adjust scd */
> +	local_irq_save(flags);
> +	sched_clock_tick();
> +	local_irq_restore(flags);
>  }
>  /*
>   * We run this as late_initcall() such that it runs after all built-in drivers,
> @@ -356,7 +374,7 @@ u64 sched_clock_cpu(int cpu)
>  		return sched_clock() + __sched_clock_offset;
>  
>  	if (unlikely(!sched_clock_running))
> -		return 0ull;
> +		return sched_clock();
>  
>  	preempt_disable_notrace();
>  	scd = cpu_sdc(cpu);
-- 
Dominique Martinet | Asmadeus

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v15 23/26] sched: early boot clock
  2018-11-06  5:42   ` [Qemu-devel] [PATCH v15 23/26] sched: early boot clock Dominique Martinet
@ 2018-11-06 11:35     ` Steven Sistare
  2019-01-02 20:20       ` Salvatore Bonaccorso
  0 siblings, 1 reply; 12+ messages in thread
From: Steven Sistare @ 2018-11-06 11:35 UTC (permalink / raw)
  To: Dominique Martinet, Pavel Tatashin
  Cc: daniel.m.jordan, linux, schwidefsky, heiko.carstens, john.stultz,
	sboyd, x86, linux-kernel, mingo, tglx, hpa, douly.fnst, peterz,
	prarit, feng.tang, pmladek, gnomes, linux-s390, boris.ostrovsky,
	jgross, pbonzini, virtualization, kvm, qemu-devel

Pavel has a new email address, cc'd - steve

On 11/6/2018 12:42 AM, Dominique Martinet wrote:
> (added various kvm/virtualization lists in Cc as well as qemu as I don't
> know who's "wrong" here)
> 
> Pavel Tatashin wrote on Thu, Jul 19, 2018:
>> Allow sched_clock() to be used before schec_clock_init() is called.
>> This provides with a way to get early boot timestamps on machines with
>> unstable clocks.
> 
> This isn't something I understand, but bisect tells me this patch
> (landed as 857baa87b64 ("sched/clock: Enable sched clock early")) makes
> a VM running with kvmclock take a step in uptime/printk timer early in
> boot sequence as illustrated below. The step seems to be related to the
> amount of time the host was suspended while qemu was running before the
> reboot.
> 
> $ dmesg
> ...
> [    0.000000] SMBIOS 2.8 present.
> [    0.000000] DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/2014
> [    0.000000] Hypervisor detected: KVM
> [    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
> [283120.529821] kvm-clock: cpu 0, msr 321a8001, primary cpu clock
> [283120.529822] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
> [283120.529824] tsc: Detected 2592.000 MHz processor
> ...
> 
> (The VM is x86_64 on x86_64, I can provide my .config on request but
> don't think it's related)
> 
> 
> It's rather annoying for me as I often reboot VMs and rely on the
> 'uptime' command to check if I did just reboot or not as I have the
> attention span of a goldfish; I'd rather not have to find something else
> to check if I did just reboot or not.
> 
> Note that if the qemu process is restarted, there is no offset anymore.
> 
> I unfortunately just did that so cannot say with confidence (putting my
> laptop to sleep for 30s only led to a 2s offset and I do not want to
> wait longer right now), but it looks like the clock is still mostly
> correct after reboot after disabling my VM's ntp client. Will infirm
> that tomorrow if I was wrong.
> 
> 
> Happy to try to help fixing this in any way, as written above the quote
> I'm not even actually sure who is wrong here.
> 
> Thanks!
> 
> 
> 
> (As a side, mostly unrelated note, insert swearing here about cf7a63ef4
> not compiling earlier in this serie; some variable declaration got
> removed before their use. Was fixed in the next patch but I didn't
> notice the kernel didn't fully rebuild and wasted time in my bisect
> heading the wrong way...)
> 
>> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
>> ---
>>  init/main.c          |  2 +-
>>  kernel/sched/clock.c | 20 +++++++++++++++++++-
>>  2 files changed, 20 insertions(+), 2 deletions(-)
>>
>> diff --git a/init/main.c b/init/main.c
>> index 162d931c9511..ff0a24170b95 100644
>> --- a/init/main.c
>> +++ b/init/main.c
>> @@ -642,7 +642,6 @@ asmlinkage __visible void __init start_kernel(void)
>>  	softirq_init();
>>  	timekeeping_init();
>>  	time_init();
>> -	sched_clock_init();
>>  	printk_safe_init();
>>  	perf_event_init();
>>  	profile_init();
>> @@ -697,6 +696,7 @@ asmlinkage __visible void __init start_kernel(void)
>>  	acpi_early_init();
>>  	if (late_time_init)
>>  		late_time_init();
>> +	sched_clock_init();
>>  	calibrate_delay();
>>  	pid_idr_init();
>>  	anon_vma_init();
>> diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
>> index 0e9dbb2d9aea..422cd63f8f17 100644
>> --- a/kernel/sched/clock.c
>> +++ b/kernel/sched/clock.c
>> @@ -202,7 +202,25 @@ static void __sched_clock_gtod_offset(void)
>>  
>>  void __init sched_clock_init(void)
>>  {
>> +	unsigned long flags;
>> +
>> +	/*
>> +	 * Set __gtod_offset such that once we mark sched_clock_running,
>> +	 * sched_clock_tick() continues where sched_clock() left off.
>> +	 *
>> +	 * Even if TSC is buggered, we're still UP at this point so it
>> +	 * can't really be out of sync.
>> +	 */
>> +	local_irq_save(flags);
>> +	__sched_clock_gtod_offset();
>> +	local_irq_restore(flags);
>> +
>>  	sched_clock_running = 1;
>> +
>> +	/* Now that sched_clock_running is set adjust scd */
>> +	local_irq_save(flags);
>> +	sched_clock_tick();
>> +	local_irq_restore(flags);
>>  }
>>  /*
>>   * We run this as late_initcall() such that it runs after all built-in drivers,
>> @@ -356,7 +374,7 @@ u64 sched_clock_cpu(int cpu)
>>  		return sched_clock() + __sched_clock_offset;
>>  
>>  	if (unlikely(!sched_clock_running))
>> -		return 0ull;
>> +		return sched_clock();
>>  
>>  	preempt_disable_notrace();
>>  	scd = cpu_sdc(cpu);

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v15 23/26] sched: early boot clock
  2018-11-06 11:35     ` Steven Sistare
@ 2019-01-02 20:20       ` Salvatore Bonaccorso
  2019-01-03 21:28         ` Pavel Tatashin
  0 siblings, 1 reply; 12+ messages in thread
From: Salvatore Bonaccorso @ 2019-01-02 20:20 UTC (permalink / raw)
  To: Steven Sistare
  Cc: Dominique Martinet, Pavel Tatashin, daniel.m.jordan, linux,
	schwidefsky, heiko.carstens, john.stultz, sboyd, x86,
	linux-kernel, mingo, tglx, hpa, douly.fnst, peterz, prarit,
	feng.tang, pmladek, gnomes, linux-s390, boris.ostrovsky, jgross,
	pbonzini, virtualization, kvm, qemu-devel

Hi,

On Tue, Nov 06, 2018 at 06:35:36AM -0500, Steven Sistare wrote:
> Pavel has a new email address, cc'd - steve
> 
> On 11/6/2018 12:42 AM, Dominique Martinet wrote:
> > (added various kvm/virtualization lists in Cc as well as qemu as I don't
> > know who's "wrong" here)
> > 
> > Pavel Tatashin wrote on Thu, Jul 19, 2018:
> >> Allow sched_clock() to be used before schec_clock_init() is called.
> >> This provides with a way to get early boot timestamps on machines with
> >> unstable clocks.
> > 
> > This isn't something I understand, but bisect tells me this patch
> > (landed as 857baa87b64 ("sched/clock: Enable sched clock early")) makes
> > a VM running with kvmclock take a step in uptime/printk timer early in
> > boot sequence as illustrated below. The step seems to be related to the
> > amount of time the host was suspended while qemu was running before the
> > reboot.
> > 
> > $ dmesg
> > ...
> > [    0.000000] SMBIOS 2.8 present.
> > [    0.000000] DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/2014
> > [    0.000000] Hypervisor detected: KVM
> > [    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
> > [283120.529821] kvm-clock: cpu 0, msr 321a8001, primary cpu clock
> > [283120.529822] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
> > [283120.529824] tsc: Detected 2592.000 MHz processor
> > ...
> > 
> > (The VM is x86_64 on x86_64, I can provide my .config on request but
> > don't think it's related)
> > 
> > 
> > It's rather annoying for me as I often reboot VMs and rely on the
> > 'uptime' command to check if I did just reboot or not as I have the
> > attention span of a goldfish; I'd rather not have to find something else
> > to check if I did just reboot or not.
> > 
> > Note that if the qemu process is restarted, there is no offset anymore.
> > 
> > I unfortunately just did that so cannot say with confidence (putting my
> > laptop to sleep for 30s only led to a 2s offset and I do not want to
> > wait longer right now), but it looks like the clock is still mostly
> > correct after reboot after disabling my VM's ntp client. Will infirm
> > that tomorrow if I was wrong.
> > 
> > 
> > Happy to try to help fixing this in any way, as written above the quote
> > I'm not even actually sure who is wrong here.

A user in Debian reported the same/similar issue (with 4.19.13):

https://bugs.debian.org/918036

Regards,
Salvatore

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v15 23/26] sched: early boot clock
  2019-01-02 20:20       ` Salvatore Bonaccorso
@ 2019-01-03 21:28         ` Pavel Tatashin
  2019-01-03 23:43           ` Dominique Martinet
  2019-01-04  7:30           ` [Qemu-devel] [PATCH v15 23/26] sched: early boot clock (was Re: Bug#918036: linux: uptime after reboot wrong (kvm-clock related?)) Thorsten Glaser
  0 siblings, 2 replies; 12+ messages in thread
From: Pavel Tatashin @ 2019-01-03 21:28 UTC (permalink / raw)
  To: Salvatore Bonaccorso
  Cc: Steven Sistare, Dominique Martinet, Pavel Tatashin,
	Daniel Jordan, linux, schwidefsky, heiko.carstens, john.stultz,
	sboyd, x86, LKML, mingo, Thomas Gleixner, hpa, douly.fnst,
	Peter Zijlstra, prarit, feng.tang, pmladek, gnomes, linux-s390,
	boris.ostrovsky, jgross, pbonzini, virtualization, kvm,
	qemu-devel

Could you please send the config file and qemu arguments that were
used to reproduce this problem.

Thank you,
Pasha

On Wed, Jan 2, 2019 at 3:20 PM Salvatore Bonaccorso <carnil@debian.org> wrote:
>
> Hi,
>
> On Tue, Nov 06, 2018 at 06:35:36AM -0500, Steven Sistare wrote:
> > Pavel has a new email address, cc'd - steve
> >
> > On 11/6/2018 12:42 AM, Dominique Martinet wrote:
> > > (added various kvm/virtualization lists in Cc as well as qemu as I don't
> > > know who's "wrong" here)
> > >
> > > Pavel Tatashin wrote on Thu, Jul 19, 2018:
> > >> Allow sched_clock() to be used before schec_clock_init() is called.
> > >> This provides with a way to get early boot timestamps on machines with
> > >> unstable clocks.
> > >
> > > This isn't something I understand, but bisect tells me this patch
> > > (landed as 857baa87b64 ("sched/clock: Enable sched clock early")) makes
> > > a VM running with kvmclock take a step in uptime/printk timer early in
> > > boot sequence as illustrated below. The step seems to be related to the
> > > amount of time the host was suspended while qemu was running before the
> > > reboot.
> > >
> > > $ dmesg
> > > ...
> > > [    0.000000] SMBIOS 2.8 present.
> > > [    0.000000] DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/2014
> > > [    0.000000] Hypervisor detected: KVM
> > > [    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
> > > [283120.529821] kvm-clock: cpu 0, msr 321a8001, primary cpu clock
> > > [283120.529822] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
> > > [283120.529824] tsc: Detected 2592.000 MHz processor
> > > ...
> > >
> > > (The VM is x86_64 on x86_64, I can provide my .config on request but
> > > don't think it's related)
> > >
> > >
> > > It's rather annoying for me as I often reboot VMs and rely on the
> > > 'uptime' command to check if I did just reboot or not as I have the
> > > attention span of a goldfish; I'd rather not have to find something else
> > > to check if I did just reboot or not.
> > >
> > > Note that if the qemu process is restarted, there is no offset anymore.
> > >
> > > I unfortunately just did that so cannot say with confidence (putting my
> > > laptop to sleep for 30s only led to a 2s offset and I do not want to
> > > wait longer right now), but it looks like the clock is still mostly
> > > correct after reboot after disabling my VM's ntp client. Will infirm
> > > that tomorrow if I was wrong.
> > >
> > >
> > > Happy to try to help fixing this in any way, as written above the quote
> > > I'm not even actually sure who is wrong here.
>
> A user in Debian reported the same/similar issue (with 4.19.13):
>
> https://bugs.debian.org/918036
>
> Regards,
> Salvatore

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v15 23/26] sched: early boot clock
  2019-01-03 21:28         ` Pavel Tatashin
@ 2019-01-03 23:43           ` Dominique Martinet
  2019-01-07 18:17             ` Pavel Tatashin
  2019-01-04  7:30           ` [Qemu-devel] [PATCH v15 23/26] sched: early boot clock (was Re: Bug#918036: linux: uptime after reboot wrong (kvm-clock related?)) Thorsten Glaser
  1 sibling, 1 reply; 12+ messages in thread
From: Dominique Martinet @ 2019-01-03 23:43 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: Salvatore Bonaccorso, Steven Sistare, Pavel Tatashin,
	Daniel Jordan, linux, schwidefsky, heiko.carstens, john.stultz,
	sboyd, x86, LKML, mingo, Thomas Gleixner, hpa, douly.fnst,
	Peter Zijlstra, prarit, feng.tang, pmladek, gnomes, linux-s390,
	boris.ostrovsky, jgross, pbonzini, virtualization, kvm,
	qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1574 bytes --]

Pavel Tatashin wrote on Thu, Jan 03, 2019:
> Could you please send the config file and qemu arguments that were
> used to reproduce this problem.

Running qemu by hand, nothing fancy e.g. this works:

# qemu-system-x86_64 -m 1G -smp 4 -drive file=/root/kvm-wrapper/disks/f2.img,if=virtio -serial mon:stdio --enable-kvm -cpu Haswell -device virtio-rng-pci -nographic

(used a specific cpu just in case but normally runnning with cpu host on
a skylake machine; can probably go older)


qemu is fedora 29 blend as is:
$ qemu-system-x86_64 --version
QEMU emulator version 3.0.0 (qemu-3.0.0-3.fc29)
Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers


compressed .config attached to the mail, this can likely be trimmed down
some as well but that takes more time for me..
I didn't rebuild the kernel so not 100% sure (comes from
/proc/config.gz) but it should work on a 4.20-rc2 kernel as written in
the first few lines; 857baa87b64 I referred to in another mail was
merged in 4.19-rc1 so anything past that is probably OK to reproduce...


Re-checked today with these exact options (fresh VM start; then suspend
laptop for a bit, then reboot VM):
[    0.000000] Hypervisor detected: KVM
[    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[ 2477.907447] kvm-clock: cpu 0, msr 153a4001, primary cpu clock
[ 2477.907448] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[ 2477.907450] tsc: Detected 2592.000 MHz processor


As offered previously, happy to help in any way.

Thanks,
-- 
Dominique

[-- Attachment #2: config.xz --]
[-- Type: application/octet-stream, Size: 19476 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v15 23/26] sched: early boot clock (was Re: Bug#918036: linux: uptime after reboot wrong (kvm-clock related?))
  2019-01-03 21:28         ` Pavel Tatashin
  2019-01-03 23:43           ` Dominique Martinet
@ 2019-01-04  7:30           ` Thorsten Glaser
  1 sibling, 0 replies; 12+ messages in thread
From: Thorsten Glaser @ 2019-01-04  7:30 UTC (permalink / raw)
  To: Salvatore Bonaccorso, 918036, Pavel Tatashin
  Cc: Steven Sistare, Dominique Martinet, Pavel Tatashin,
	Daniel Jordan, linux, schwidefsky, heiko.carstens, john.stultz,
	sboyd, x86, LKML, mingo, Thomas Gleixner, hpa, douly.fnst,
	Peter Zijlstra, prarit, feng.tang, pmladek, gnomes, linux-s390,
	boris.ostrovsky, jgross, pbonzini, virtualization, kvm,
	qemu-devel

Hi Salvatore,

>p.s.: my earlier reply to you seem to have been rejected and never
>      reached you, hope this one does now.

if you sent from Googlemail, it may reach me in the next weeks or
never *shrug* they don’t play nice with greylisting. The -submitter
or @d.o works, though. I’m following up from my $dayjob address as
the issue occurred there (which is also Googlemail, unfortunately).

>There was now a followup on this, and if you can I think it's best if
>you can followup there.
>
>https://lore.kernel.org/lkml/CA+CK2bC70pnL0Wimb0xt99J4nNfi8W3zuUHgAk-jsPuOP9jpHA@mail.gmail.com/

OK, doing now:


Pavel Tatashin wrote:

>Could you please send the config file and qemu arguments that were
>used to reproduce this problem.

This is from a libvirt-managed system. The arguments as shown by
“ps axwww” are:

qemu-system-x86_64 -enable-kvm -name ci-busyapps -S -machine pc-1.1,accel=kvm,usb=off -m 8192 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 09536d92-dd73-8993-78fb-e0c885acf763 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/ci-busyapps.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/dev/vms/ci-busyapps,format=raw,if=none,id=drive-virtio-disk0,cache=none,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=25 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:05:6e:fd,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 127.0.0.1:0 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on

I’ve attached the kernel configuration; this is a stock Debian
unstable/amd64 system, just upgraded. After upgrading the guest,
I merely issued a “reboot” in the guest and did not stop/start
qemu.

The host is Debian jessie/amd64 (Linux 3.16.0-7-amd64 / 3.16.59-1)
in case that matters.

Thanks,
//mirabilos
-- 
tarent solutions GmbH
Rochusstraße 2-4, D-53123 Bonn • http://www.tarent.de/
Tel: +49 228 54881-393 • Fax: +49 228 54881-235
HRB 5168 (AG Bonn) • USt-ID (VAT): DE122264941
Geschäftsführer: Dr. Stefan Barth, Kai Ebenrett, Boris Esser, Alexander Steeg

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v15 23/26] sched: early boot clock
  2019-01-03 23:43           ` Dominique Martinet
@ 2019-01-07 18:17             ` Pavel Tatashin
  2019-01-07 23:48               ` Dominique Martinet
  0 siblings, 1 reply; 12+ messages in thread
From: Pavel Tatashin @ 2019-01-07 18:17 UTC (permalink / raw)
  To: Dominique Martinet
  Cc: Salvatore Bonaccorso, Steven Sistare, Pavel Tatashin,
	Daniel Jordan, linux, schwidefsky, heiko.carstens, John Stultz,
	sboyd, x86, LKML, mingo, Thomas Gleixner, hpa, douly.fnst,
	Peter Zijlstra, prarit, feng.tang, Petr Mladek, gnomes,
	linux-s390, boris.ostrovsky, jgross, pbonzini, virtualization,
	kvm, qemu-devel

On Thu, Jan 3, 2019 at 6:43 PM Dominique Martinet
<asmadeus@codewreck.org> wrote:
>
> Pavel Tatashin wrote on Thu, Jan 03, 2019:
> > Could you please send the config file and qemu arguments that were
> > used to reproduce this problem.
>
> Running qemu by hand, nothing fancy e.g. this works:
>
> # qemu-system-x86_64 -m 1G -smp 4 -drive file=/root/kvm-wrapper/disks/f2.img,if=virtio -serial mon:stdio --enable-kvm -cpu Haswell -device virtio-rng-pci -nographic
>
> (used a specific cpu just in case but normally runnning with cpu host on
> a skylake machine; can probably go older)
>
>
> qemu is fedora 29 blend as is:
> $ qemu-system-x86_64 --version
> QEMU emulator version 3.0.0 (qemu-3.0.0-3.fc29)
> Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers
>
>
> compressed .config attached to the mail, this can likely be trimmed down
> some as well but that takes more time for me..
> I didn't rebuild the kernel so not 100% sure (comes from
> /proc/config.gz) but it should work on a 4.20-rc2 kernel as written in
> the first few lines; 857baa87b64 I referred to in another mail was
> merged in 4.19-rc1 so anything past that is probably OK to reproduce...
>
>
> Re-checked today with these exact options (fresh VM start; then suspend
> laptop for a bit, then reboot VM):
> [    0.000000] Hypervisor detected: KVM
> [    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
> [ 2477.907447] kvm-clock: cpu 0, msr 153a4001, primary cpu clock
> [ 2477.907448] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
> [ 2477.907450] tsc: Detected 2592.000 MHz processor

I could not reproduce the problem. Did you suspend to memory between
wake ups? Does this time jump happen every time, even if your laptop
sleeps for a minute?

I have tried with qemu 2.6 and 3.1 on Ubuntu, testing 4.20rc2.

Pasha

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v15 23/26] sched: early boot clock
  2019-01-07 18:17             ` Pavel Tatashin
@ 2019-01-07 23:48               ` Dominique Martinet
  2019-01-08  1:04                 ` Pavel Tatashin
  0 siblings, 1 reply; 12+ messages in thread
From: Dominique Martinet @ 2019-01-07 23:48 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: Salvatore Bonaccorso, Steven Sistare, Pavel Tatashin,
	Daniel Jordan, linux, schwidefsky, heiko.carstens, John Stultz,
	sboyd, x86, LKML, mingo, Thomas Gleixner, hpa, douly.fnst,
	Peter Zijlstra, prarit, feng.tang, Petr Mladek, gnomes,
	linux-s390, boris.ostrovsky, jgross, pbonzini, virtualization,
	kvm, qemu-devel

Pavel Tatashin wrote on Mon, Jan 07, 2019:
> I could not reproduce the problem. Did you suspend to memory between
> wake ups? Does this time jump happen every time, even if your laptop
> sleeps for a minute?

I'm not sure I understand "suspend to memory between the wake ups".
The full sequence is:
 - start a VM (just in case, I let it boot till the end)
 - suspend to memory (aka systemctl suspend) the host
 - after resuming the host, soft reboot the VM (login through
serial/ssh/whatever and reboot or in the qemu console 'system_reset')

I've just slept exactly one minute and reproduced again with the fedora
stock kernel now (4.19.13-300.fc29.x86_64) in the VM.

Interestingly I'm not getting the same offset between multiple reboots
now despite not suspending again; but if I don't suspend I cannot seem
to get it to give an offset at all (only tried for a few minutes; this
might not be true) ; OTOH I pushed my luck further and even with a five
seconds sleep I'm getting a noticeable offset on first VM reboot after
resume:

[    0.000000] Hypervisor detected: KVM
[    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[  179.362163] kvm-clock: cpu 0, msr 13c01001, primary cpu clock
[  179.362163] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns

Honestly not sure what more information I could give, I'll try on some
other hardware than my laptop (if I can get a server to resume after
suspend through ipmi or wake on lan); but I don't have anything I could
install ubuntu on to try their qemu's version... although I really don't
want to believe that's the difference...

Thanks,
-- 
Dominique

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v15 23/26] sched: early boot clock
  2019-01-07 23:48               ` Dominique Martinet
@ 2019-01-08  1:04                 ` Pavel Tatashin
  2019-01-08  1:09                   ` Dominique Martinet
  2019-01-26  2:04                   ` Jon DeVree
  0 siblings, 2 replies; 12+ messages in thread
From: Pavel Tatashin @ 2019-01-08  1:04 UTC (permalink / raw)
  To: Dominique Martinet
  Cc: Salvatore Bonaccorso, Steven Sistare, Pavel Tatashin,
	Daniel Jordan, linux, schwidefsky, heiko.carstens, John Stultz,
	sboyd, x86, LKML, mingo, Thomas Gleixner, hpa, douly.fnst,
	Peter Zijlstra, prarit, feng.tang, Petr Mladek, gnomes,
	linux-s390, boris.ostrovsky, jgross, pbonzini, virtualization,
	kvm, qemu-devel

I did exactly the same sequence on Kaby Lake CPU and could not
reproduce it. What is your host CPU?

Thank you,
Pasha

On Mon, Jan 7, 2019 at 6:48 PM Dominique Martinet
<asmadeus@codewreck.org> wrote:
>
> Pavel Tatashin wrote on Mon, Jan 07, 2019:
> > I could not reproduce the problem. Did you suspend to memory between
> > wake ups? Does this time jump happen every time, even if your laptop
> > sleeps for a minute?
>
> I'm not sure I understand "suspend to memory between the wake ups".
> The full sequence is:
>  - start a VM (just in case, I let it boot till the end)
>  - suspend to memory (aka systemctl suspend) the host
>  - after resuming the host, soft reboot the VM (login through
> serial/ssh/whatever and reboot or in the qemu console 'system_reset')
>
> I've just slept exactly one minute and reproduced again with the fedora
> stock kernel now (4.19.13-300.fc29.x86_64) in the VM.
>
> Interestingly I'm not getting the same offset between multiple reboots
> now despite not suspending again; but if I don't suspend I cannot seem
> to get it to give an offset at all (only tried for a few minutes; this
> might not be true) ; OTOH I pushed my luck further and even with a five
> seconds sleep I'm getting a noticeable offset on first VM reboot after
> resume:
>
> [    0.000000] Hypervisor detected: KVM
> [    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
> [  179.362163] kvm-clock: cpu 0, msr 13c01001, primary cpu clock
> [  179.362163] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
>
> Honestly not sure what more information I could give, I'll try on some
> other hardware than my laptop (if I can get a server to resume after
> suspend through ipmi or wake on lan); but I don't have anything I could
> install ubuntu on to try their qemu's version... although I really don't
> want to believe that's the difference...
>
> Thanks,
> --
> Dominique

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v15 23/26] sched: early boot clock
  2019-01-08  1:04                 ` Pavel Tatashin
@ 2019-01-08  1:09                   ` Dominique Martinet
  2019-01-26  2:04                   ` Jon DeVree
  1 sibling, 0 replies; 12+ messages in thread
From: Dominique Martinet @ 2019-01-08  1:09 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: Salvatore Bonaccorso, Steven Sistare, Pavel Tatashin,
	Daniel Jordan, linux, schwidefsky, heiko.carstens, John Stultz,
	sboyd, x86, LKML, mingo, Thomas Gleixner, hpa, douly.fnst,
	Peter Zijlstra, prarit, feng.tang, Petr Mladek, gnomes,
	linux-s390, boris.ostrovsky, jgross, pbonzini, virtualization,
	kvm, qemu-devel

Pavel Tatashin wrote on Mon, Jan 07, 2019:
> I did exactly the same sequence on Kaby Lake CPU and could not
> reproduce it. What is your host CPU?

skylake consumer laptop CPU: Intel(R) Core(TM) i7-6500U CPU @ 2.50GHz

I don't have any kaby lake around; I have access to older servers though...
-- 
Dominique

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v15 23/26] sched: early boot clock
  2019-01-08  1:04                 ` Pavel Tatashin
  2019-01-08  1:09                   ` Dominique Martinet
@ 2019-01-26  2:04                   ` Jon DeVree
  2019-01-26 16:11                     ` Pavel Tatashin
  1 sibling, 1 reply; 12+ messages in thread
From: Jon DeVree @ 2019-01-26  2:04 UTC (permalink / raw)
  To: Pavel Tatashin
  Cc: Dominique Martinet, Salvatore Bonaccorso, Steven Sistare,
	Pavel Tatashin, Daniel Jordan, linux, schwidefsky,
	heiko.carstens, John Stultz, sboyd, x86, LKML, mingo,
	Thomas Gleixner, hpa, douly.fnst, Peter Zijlstra, prarit,
	feng.tang, Petr Mladek, gnomes, linux-s390, boris.ostrovsky,
	jgross, pbonzini, virtualization, kvm, qemu-devel

On Mon, Jan 07, 2019 at 20:04:41 -0500, Pavel Tatashin wrote:
> I did exactly the same sequence on Kaby Lake CPU and could not
> reproduce it. What is your host CPU?
> 

I have some machines which display this bug and others that don't, so I
was able to figure out the difference between their configurations.

TL;DR the bug appears to be based on wther or not
/sys/devices/system/clocksource/clocksource0/current_clocksource is set
to TSC in the hypervisor

This is the log from a machine with the bug:

[    0.000000] Hypervisor detected: KVM
[    0.000000] kvm-clock: Using msrs 12 and 11
[1162908.013830] kvm-clock: cpu 0, msr 3e0fea001, primary cpu clock
[1162908.013830] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[1162908.013834] tsc: Detected 1899.888 MHz processor

This is the log from a machine without the bug:

[    0.000000] Hypervisor detected: KVM
[    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[    0.000000] kvm-clock: cpu 0, msr 149fea001, primary cpu clock
[    0.000000] kvm-clock: using sched offset of 1558436482528906 cycles
[    0.000002] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[    0.000004] tsc: Detected 2097.570 MHz processor

Note the additional line of output on the machine without the bug:

[    0.000000] kvm-clock: using sched offset of 1558436482528906 cycles

This is printed from kvm_sched_clock_init() in
arch/x86/kernel/kvmclock.c based on whether or not the clock is stable.
For the clock to be stable both KVM_FEATURE_CLOCKSOURCE_STABLE_BIT and
PVCLOCK_TSC_STABLE_BIT have to be set.  Both of these are controlled by
the hypervisor kernel.

* KVM_FEATURE_CLOCKSOURCE_STABLE_BIT is always set by the hypervisor
  starting with Linux v2.6.35 - 371bcf646d17 ("KVM: x86: Tell the guest
  we'll warn it about tsc stability")
* PVCLOCK_TSC_STABLE_BIT is set starting in Linux v3.8 but only if the
  clocksource is the TSC - d828199e8444 ("KVM: x86: implement
  PVCLOCK_TSC_STABLE_BIT pvclock flag")

I changed the clocksource of a hypervisor that wasn't having issues from
TSC to HPET and when I started up a guest VM the bug suddenly appeared.
I shut down the guest, set the hypervisor's clocksource back to TSC,
started up the guest and the bug went away again.

You don't actually have to reboot the guest before the bug is visible
either, just letting the guest sit at the GRUB menu for a minute or two
before loading Linux is enough to make the bug plainly visible in the
printk timestamps.

I don't know enough to actually fix the bug, but hopefully this is
enough to allow everyone else to reproduce it and come up with a fix.

-- 
Jon
X(7): A program for managing terminal windows. See also screen(1) and tmux(1).

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Qemu-devel] [PATCH v15 23/26] sched: early boot clock
  2019-01-26  2:04                   ` Jon DeVree
@ 2019-01-26 16:11                     ` Pavel Tatashin
  0 siblings, 0 replies; 12+ messages in thread
From: Pavel Tatashin @ 2019-01-26 16:11 UTC (permalink / raw)
  To: Dominique Martinet, Salvatore Bonaccorso, Steven Sistare,
	Pavel Tatashin, Daniel Jordan, linux, schwidefsky,
	heiko.carstens, John Stultz, sboyd, x86, LKML, mingo,
	Thomas Gleixner, hpa, douly.fnst, Peter Zijlstra, prarit,
	feng.tang, Petr Mladek, gnomes, linux-s390, boris.ostrovsky,
	jgross, pbonzini, virtualization, kvm, qemu-devel

On 19-01-25 21:04:10, Jon DeVree wrote:
> 
> * KVM_FEATURE_CLOCKSOURCE_STABLE_BIT is always set by the hypervisor
>   starting with Linux v2.6.35 - 371bcf646d17 ("KVM: x86: Tell the guest
>   we'll warn it about tsc stability")
> * PVCLOCK_TSC_STABLE_BIT is set starting in Linux v3.8 but only if the
>   clocksource is the TSC - d828199e8444 ("KVM: x86: implement
>   PVCLOCK_TSC_STABLE_BIT pvclock flag")
> 
> I changed the clocksource of a hypervisor that wasn't having issues from
> TSC to HPET and when I started up a guest VM the bug suddenly appeared.
> I shut down the guest, set the hypervisor's clocksource back to TSC,
> started up the guest and the bug went away again.
> 
> You don't actually have to reboot the guest before the bug is visible
> either, just letting the guest sit at the GRUB menu for a minute or two
> before loading Linux is enough to make the bug plainly visible in the
> printk timestamps.
> 
> I don't know enough to actually fix the bug, but hopefully this is
> enough to allow everyone else to reproduce it and come up with a fix.

Thank you very much for your analysis, I am now able to reproduce the
problem by setting clocksource on my machine to hpet. I will soon submit a
patch with a fix.

Pasha

> 
> -- 
> Jon
> X(7): A program for managing terminal windows. See also screen(1) and tmux(1).

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2019-01-26 16:11 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20180719205545.16512-1-pasha.tatashin@oracle.com>
     [not found] ` <20180719205545.16512-24-pasha.tatashin@oracle.com>
2018-11-06  5:42   ` [Qemu-devel] [PATCH v15 23/26] sched: early boot clock Dominique Martinet
2018-11-06 11:35     ` Steven Sistare
2019-01-02 20:20       ` Salvatore Bonaccorso
2019-01-03 21:28         ` Pavel Tatashin
2019-01-03 23:43           ` Dominique Martinet
2019-01-07 18:17             ` Pavel Tatashin
2019-01-07 23:48               ` Dominique Martinet
2019-01-08  1:04                 ` Pavel Tatashin
2019-01-08  1:09                   ` Dominique Martinet
2019-01-26  2:04                   ` Jon DeVree
2019-01-26 16:11                     ` Pavel Tatashin
2019-01-04  7:30           ` [Qemu-devel] [PATCH v15 23/26] sched: early boot clock (was Re: Bug#918036: linux: uptime after reboot wrong (kvm-clock related?)) Thorsten Glaser
     [not found] <154644530361.2390.18185252504260044930.reportbug@ci-busyapps.lan.tarent.de>
     [not found] ` <20190102163939.GB13145@eldamar.local>

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).