All of lore.kernel.org
 help / color / mirror / Atom feed
From: Igor Mammedov <imammedo@redhat.com>
To: Amit Shah <amit.shah@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
	kvm@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com,
	hpa@zytor.com, x86@kernel.org, johnstul@us.ibm.com,
	riel@redhat.com, avi@redhat.com
Subject: Re: x86: kvmclock: abstract save/restore sched_clock_state
Date: Thu, 09 Feb 2012 16:13:29 +0100	[thread overview]
Message-ID: <4F33E299.3090500@redhat.com> (raw)
In-Reply-To: <20120209122742.GA25505@amit.redhat.com>

On 02/09/2012 01:27 PM, Amit Shah wrote:
> On (Tue) 07 Feb 2012 [19:05:42], Marcelo Tosatti wrote:
>>
>> Upon resume from hibernation, CPU 0's hvclock area contains the old
>> values for system_time and tsc_timestamp. It is necessary for the
>> hypervisor to update these values with uptodate ones before the CPU uses
>> them.
>>
>> Abstract TSC's save/restore sched_clock_state functions and use
>> restore_state to write to KVM_SYSTEM_TIME MSR, forcing an update.
>>
>> Fixes suspend-to-disk with kvmclock.
>
> There are stalls after resume, see trace below.
>
> ./x86_64-softmmu/qemu-system-x86_64 -kernel ~/src/linux/arch/x86/boot/bzImage  -append 'root=/dev/vda1 console=tty0 console=ttyS0 no_console_suspend' -drive file=/guests/f14-suspend.qcow2,if=none,id=dr0 -device virtio-blk-pci,drive=dr0 -net nic,model=virtio -net user  -serial stdio  -enable-kvm -m 512  -cpu host -smp 4
>
> <The following is on resume:>
>
> Disabling non-boot CPUs ...
> CPU 1 is now offline
> CPU 2 is now offline
> CPU 3 is now offline
> lockdep: fixing up alternatives.
> kvm-clock: cpu 0, msr 0:1f1d19c1, primary cpu clock, resume
> PM: Restoring platform NVS memory
> Enabling non-boot CPUs ...
> lockdep: fixing up alternatives.
> Booting Node 0 Processor 1 APIC 0x1
> Calibrating delay loop (skipped) already calibrated this CPU
> INFO: rcu_sched detected stalls on CPUs/tasks: { 1} (detected by 0, t=314192 jiffies)
> Pid: 662, comm: bash Not tainted 3.3.0-rc2+ #293
> Call Trace:
>   <IRQ>   [<ffffffff8108b6c9>] __rcu_pending+0x268/0x3c5
>   [<ffffffff8108bef5>] rcu_check_callbacks+0xab/0x108
>   [<ffffffff81031208>] update_process_times+0x3f/0x75
>   [<ffffffff810649b5>] tick_sched_timer+0x6d/0x8c
>   [<ffffffff81042e68>] __run_hrtimer+0xc9/0x15c
>   [<ffffffff81064948>] ? tick_nohz_handler+0xd5/0xd5
>   [<ffffffff8104359b>] hrtimer_interrupt+0xea/0x1b4
>   [<ffffffff81316192>] smp_apic_timer_interrupt+0x76/0x89
>   [<ffffffff81315833>] apic_timer_interrupt+0x73/0x80
>   <EOI>   [<ffffffff8130b9ec>] ? arch_local_irq_restore+0x6/0xd
>   [<ffffffff8130840c>] native_cpu_up+0x108/0x127
>   [<ffffffff81308f2b>] _cpu_up+0x92/0xfc
>   [<ffffffff812facf7>] enable_nonboot_cpus+0x4d/0xb2
>   [<ffffffff81059836>] hibernation_snapshot+0x1f3/0x2a0
>   [<ffffffff81044406>] ? cleanup_srcu_struct+0x52/0x58
>   [<ffffffff81059e6d>] hibernate+0x97/0x196
>   [<ffffffff81057eb0>] state_store+0x5c/0x106
>   [<ffffffff811af49b>] kobj_attr_store+0x17/0x19
>   [<ffffffff811368fa>] sysfs_write_file+0x10e/0x14a
>   [<ffffffff810eb5f6>] vfs_write+0xab/0xd2
>   [<ffffffff810ec955>] ? fget_light+0x3a/0xa1
>   [<ffffffff810eb7ad>] sys_write+0x4d/0x74
>   [<ffffffff81314ce9>] system_call_fastpath+0x16/0x1b
> INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} (detected by 0, t=314193 jiffies)
> Pid: 662, comm: bash Not tainted 3.3.0-rc2+ #293
> Call Trace:
>   <IRQ>   [<ffffffff8108b6c9>] __rcu_pending+0x268/0x3c5
>   [<ffffffff8108bf33>] rcu_check_callbacks+0xe9/0x108
>   [<ffffffff81031208>] update_process_times+0x3f/0x75
>   [<ffffffff810649b5>] tick_sched_timer+0x6d/0x8c
>   [<ffffffff81042e68>] __run_hrtimer+0xc9/0x15c
>   [<ffffffff81064948>] ? tick_nohz_handler+0xd5/0xd5
>   [<ffffffff8104359b>] hrtimer_interrupt+0xea/0x1b4
>   [<ffffffff81316192>] smp_apic_timer_interrupt+0x76/0x89
>   [<ffffffff81315833>] apic_timer_interrupt+0x73/0x80
>   <EOI>   [<ffffffff8130b9ec>] ? arch_local_irq_restore+0x6/0xd
>   [<ffffffff8130840c>] native_cpu_up+0x108/0x127
>   [<ffffffff81308f2b>] _cpu_up+0x92/0xfc
>   [<ffffffff812facf7>] enable_nonboot_cpus+0x4d/0xb2
>   [<ffffffff81059836>] hibernation_snapshot+0x1f3/0x2a0
>   [<ffffffff81044406>] ? cleanup_srcu_struct+0x52/0x58
>   [<ffffffff81059e6d>] hibernate+0x97/0x196
>   [<ffffffff81057eb0>] state_store+0x5c/0x106
>   [<ffffffff811af49b>] kobj_attr_store+0x17/0x19
>   [<ffffffff811368fa>] sysfs_write_file+0x10e/0x14a
>   [<ffffffff810eb5f6>] vfs_write+0xab/0xd2
>   [<ffffffff810ec955>] ? fget_light+0x3a/0xa1
>   [<ffffffff810eb7ad>] sys_write+0x4d/0x74
>   [<ffffffff81314ce9>] system_call_fastpath+0x16/0x1b
> INFO: rcu_sched detected stalls on CPUs/tasks: { 1} (detected by 0, t=1713579840232 jiffies)
> Pid: 662, comm: bash Not tainted 3.3.0-rc2+ #293
> Call Trace:
>   <IRQ>   [<ffffffff8108b6c9>] __rcu_pending+0x268/0x3c5
>   [<ffffffff8108bef5>] rcu_check_callbacks+0xab/0x108
>   [<ffffffff81031208>] update_process_times+0x3f/0x75
>   [<ffffffff810649b5>] tick_sched_timer+0x6d/0x8c
>   [<ffffffff81042e68>] __run_hrtimer+0xc9/0x15c
>   [<ffffffff81064948>] ? tick_nohz_handler+0xd5/0xd5
>   [<ffffffff8104359b>] hrtimer_interrupt+0xea/0x1b4
>   [<ffffffff81316192>] smp_apic_timer_interrupt+0x76/0x89
>   [<ffffffff81315833>] apic_timer_interrupt+0x73/0x80
>   <EOI>   [<ffffffff8130b9ec>] ? arch_local_irq_restore+0x6/0xd
>   [<ffffffff8130840c>] native_cpu_up+0x108/0x127
>   [<ffffffff81308f2b>] _cpu_up+0x92/0xfc
>   [<ffffffff812facf7>] enable_nonboot_cpus+0x4d/0xb2
>   [<ffffffff81059836>] hibernation_snapshot+0x1f3/0x2a0
>   [<ffffffff81044406>] ? cleanup_srcu_struct+0x52/0x58
>   [<ffffffff81059e6d>] hibernate+0x97/0x196
>   [<ffffffff81057eb0>] state_store+0x5c/0x106
>   [<ffffffff811af49b>] kobj_attr_store+0x17/0x19
>   [<ffffffff811368fa>] sysfs_write_file+0x10e/0x14a
>   [<ffffffff810eb5f6>] vfs_write+0xab/0xd2
>   [<ffffffff810ec955>] ? fget_light+0x3a/0xa1
>   [<ffffffff810eb7ad>] sys_write+0x4d/0x74
>   [<ffffffff81314ce9>] system_call_fastpath+0x16/0x1b
> INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} (detected by 0, t=1713579840233 jiffies)
> Pid: 662, comm: bash Not tainted 3.3.0-rc2+ #293
> Call Trace:
>   <IRQ>   [<ffffffff8108b6c9>] __rcu_pending+0x268/0x3c5
>   [<ffffffff8108bf33>] rcu_check_callbacks+0xe9/0x108
>   [<ffffffff81031208>] update_process_times+0x3f/0x75
>   [<ffffffff810649b5>] tick_sched_timer+0x6d/0x8c
>   [<ffffffff81042e68>] __run_hrtimer+0xc9/0x15c
>   [<ffffffff81064948>] ? tick_nohz_handler+0xd5/0xd5
>   [<ffffffff8104359b>] hrtimer_interrupt+0xea/0x1b4
>   [<ffffffff81316192>] smp_apic_timer_interrupt+0x76/0x89
>   [<ffffffff81315833>] apic_timer_interrupt+0x73/0x80
>   <EOI>   [<ffffffff8130b9ec>] ? arch_local_irq_restore+0x6/0xd
>   [<ffffffff8130840c>] native_cpu_up+0x108/0x127
>   [<ffffffff81308f2b>] _cpu_up+0x92/0xfc
>   [<ffffffff812facf7>] enable_nonboot_cpus+0x4d/0xb2
>   [<ffffffff81059836>] hibernation_snapshot+0x1f3/0x2a0
>   [<ffffffff81044406>] ? cleanup_srcu_struct+0x52/0x58
>   [<ffffffff81059e6d>] hibernate+0x97/0x196
>   [<ffffffff81057eb0>] state_store+0x5c/0x106
>   [<ffffffff811af49b>] kobj_attr_store+0x17/0x19
>   [<ffffffff811368fa>] sysfs_write_file+0x10e/0x14a
>   [<ffffffff810eb5f6>] vfs_write+0xab/0xd2
>   [<ffffffff810ec955>] ? fget_light+0x3a/0xa1
>   [<ffffffff810eb7ad>] sys_write+0x4d/0x74
>   [<ffffffff81314ce9>] system_call_fastpath+0x16/0x1b
>
>
> 		Amit

Stalls are probably caused by uninitialized percpu hv_clock, with
following patch I don't see stalls. Although I might be just lucky.
http://git.kernel.org/?p=virt/kvm/kvm.git;a=commit;h=e2971ac7e1d186af059e088d305496c5cb47d487

However there is/are a warning/s on suspend path and with following patch:
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index 66c74f4..c8e7587 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -117,6 +117,7 @@
  static void native_smp_send_reschedule(int cpu)
  {
         if (unlikely(cpu_is_offline(cpu))) {
+               printk ("curr_cpu: %d, for cpu: %d\n", smp_processor_id(), cpu);
                 WARN_ON(1);
                 return;
         }


[    3.101919] Unregister pv shared memory for cpu 1
[    3.207321] CPU 1 is now offline
[    3.210177] curr_cpu: 0, for cpu: 1
[    3.211172] ------------[ cut here ]------------
[    3.211172] WARNING: at arch/x86/kernel/smp.c:121 native_smp_send_reschedule+0x40/0x71()
[    3.211172] Hardware name: Bochs
[    3.211172] Modules linked in: floppy virtio_blk
[    3.211172] Pid: 264, comm: sh Not tainted 3.3.0-rc1+ #42
[    3.211172] Call Trace:
[    3.211172]  <IRQ>  [<ffffffff81045ecd>] warn_slowpath_common+0x83/0x9b
[    3.211172]  [<ffffffff81045eff>] warn_slowpath_null+0x1a/0x1c
[    3.211172]  [<ffffffff81026dba>] native_smp_send_reschedule+0x40/0x71
[    3.211172]  [<ffffffff810749d6>] trigger_load_balance+0x28e/0x2ba
[    3.211172]  [<ffffffff8106e7ec>] scheduler_tick+0xe9/0xf2
[    3.211172]  [<ffffffff81052db8>] update_process_times+0x65/0x76
[    3.211172]  [<ffffffff81088219>] tick_sched_timer+0x75/0x9f
[    3.211172]  [<ffffffff81064056>] __run_hrtimer+0xb1/0x155
[    3.211172]  [<ffffffff810881a4>] ? tick_nohz_handler+0xdd/0xdd
[    3.211172]  [<ffffffff8106476c>] hrtimer_interrupt+0xe0/0x19e
[    3.211172]  [<ffffffff814d847c>] smp_apic_timer_interrupt+0x77/0x8a
[    3.211172]  [<ffffffff814d7133>] apic_timer_interrupt+0x73/0x80
[    3.211172]  <EOI>  [<ffffffff814ba14f>] ? cpuid4_cache_sysfs_exit+0x94/0xad
[    3.211172]  [<ffffffff8111ebe3>] ? cmpxchg_double_slab+0x6/0xef
[    3.211172]  [<ffffffff814c9afc>] __slab_free+0xe6/0x1b7
[    3.211172]  [<ffffffff8111eed2>] kmem_cache_free+0x72/0xac
[    3.211172]  [<ffffffff814ba14f>] ? cpuid4_cache_sysfs_exit+0x94/0xad
[    3.211172]  [<ffffffff81186d23>] release_sysfs_dirent+0x9b/0xb9
[    3.211172]  [<ffffffff81186d59>] __sysfs_put+0x18/0x1a
[    3.211172]  [<ffffffff81187257>] sysfs_addrm_finish+0x8a/0xa2
[    3.211172]  [<ffffffff81185d2b>] sysfs_hash_and_remove+0x5c/0x71
[    3.211172]  [<ffffffff8118674d>] sysfs_remove_file+0x34/0x3b
[    3.211172]  [<ffffffff812ee9da>] device_remove_file+0x17/0x19
[    3.211172]  [<ffffffff814bcaff>] mce_cpu_callback+0xa3/0x186
[    3.211172]  [<ffffffff814d31c0>] notifier_call_chain+0x37/0x63
[    3.211172]  [<ffffffff810655b6>] __raw_notifier_call_chain+0xe/0x10
[    3.211172]  [<ffffffff81047758>] __cpu_notify+0x20/0x32
[    3.211172]  [<ffffffff81047863>] cpu_notify_nofail+0x13/0x1b
[    3.211172]  [<ffffffff814ae6b3>] _cpu_down+0x13a/0x21f
[    3.211172]  [<ffffffff81047915>] disable_nonboot_cpus+0x66/0x11e
[    3.211172]  [<ffffffff8107bff1>] hibernation_restore+0x78/0x106
[    3.211172]  [<ffffffff8107c25f>] software_resume+0x1e0/0x249
[    3.211172]  [<ffffffff8107c353>] resume_store+0x8b/0x97
[    3.211172]  [<ffffffff8122f23b>] kobj_attr_store+0x17/0x19
[    3.211172]  [<ffffffff81186183>] sysfs_write_file+0x101/0x13d
[    3.211172]  [<ffffffff8112f2e3>] vfs_write+0xac/0xf3
[    3.211172]  [<ffffffff810caf1d>] ? trace_hardirqs_on_caller+0xf/0x24
[    3.211172]  [<ffffffff8112f4d2>] sys_write+0x4a/0x6e
[    3.211172]  [<ffffffff814d6629>] system_call_fastpath+0x16/0x1b
[    3.211172] ---[ end trace 5aa7abdf47c70330 ]---
[    3.276748] Unregister pv shared memory for cpu 2
[    3.280771] CPU 2 is now offline
[    3.285070] Unregister pv shared memory for cpu 3
[    3.288206] CPU 3 is now offline
[   31.290110] kvm-clock: cpu 0, msr 0:1fc13341, primary cpu clock, resume
[   31.290110] Enabling non-boot CPUs ...
[   31.291319] Booting Node 0 Processor 1 APIC 0x1
[   31.293720] smpboot cpu 1: start_ip = 96000
[   31.172141] Calibrating delay loop (skipped) already calibrated this CPUcurr_cpu: 0, for cpu: 2
[   31.330037] ------------[ cut here ]------------
[   31.330094] WARNING: at arch/x86/kernel/smp.c:121 native_smp_send_reschedule+0x40/0x71()
[   31.330094] Hardware name: Bochs
[   31.330094] Modules linked in: lockd fcoe libfcoe 8021q libfc scsi_transport_fc scsi_tgt garp stp llc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 
nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ip6table_filter ip6_tables e1000 pcspkr uinput i2c_piix4 i2c_core ppdev parport_pc parport 
microcode sunrpc floppy virtio_blk [last unloaded: scsi_wait_scan]
[   31.330094] Pid: 1016, comm: bash Tainted: G        W    3.3.0-rc1+ #42
[   31.330094] Call Trace:
[   31.330094]  <IRQ>  [<ffffffff81045ecd>] warn_slowpath_common+0x83/0x9b
[   31.330094]  [<ffffffff81045eff>] warn_slowpath_null+0x1a/0x1c
[   31.330094]  [<ffffffff81026dba>] native_smp_send_reschedule+0x40/0x71
[   31.330094]  [<ffffffff810749d6>] trigger_load_balance+0x28e/0x2ba
[   31.330094]  [<ffffffff8106e7ec>] scheduler_tick+0xe9/0xf2
[   31.330094]  [<ffffffff81052db8>] update_process_times+0x65/0x76
[   31.330094]  [<ffffffff81088219>] tick_sched_timer+0x75/0x9f
[   31.330094]  [<ffffffff81064056>] __run_hrtimer+0xb1/0x155
[   31.330094]  [<ffffffff810881a4>] ? tick_nohz_handler+0xdd/0xdd
[   31.330094]  [<ffffffff8106476c>] hrtimer_interrupt+0xe0/0x19e
[   31.330094]  [<ffffffff814d847c>] smp_apic_timer_interrupt+0x77/0x8a
[   31.330094]  [<ffffffff814d7133>] apic_timer_interrupt+0x73/0x80
[   31.330094]  <EOI>  [<ffffffff814cffe1>] ? retint_restore_args+0x13/0x13
[   31.330094]  [<ffffffff8108d42a>] ? arch_local_irq_restore+0x6/0xd
[   31.330094]  [<ffffffff814cfd88>] _raw_spin_unlock_irqrestore+0x3d/0x3f
[   31.330094]  [<ffffffff814b1bd3>] workqueue_cpu_callback+0x203/0x214
[   31.330094]  [<ffffffff814d31c0>] notifier_call_chain+0x37/0x63
[   31.330094]  [<ffffffff814c6b68>] ? arch_local_irq_restore+0x6/0xd
[   31.330094]  [<ffffffff810655b6>] __raw_notifier_call_chain+0xe/0x10
[   31.330094]  [<ffffffff81047758>] __cpu_notify+0x20/0x32
[   31.330094]  [<ffffffff814bfb99>] _cpu_up+0xcd/0xfc
[   31.330094]  [<ffffffff814ae88b>] enable_nonboot_cpus+0x57/0xc4
[   31.330094]  [<ffffffff8107becc>] hibernation_snapshot+0x1f3/0x2a0
[   31.330094]  [<ffffffff8107c503>] hibernate+0x97/0x198
[   31.330094]  [<ffffffff8107a613>] state_store+0x5c/0x106
[   31.330094]  [<ffffffff8122f23b>] kobj_attr_store+0x17/0x19
[   31.330094]  [<ffffffff81186183>] sysfs_write_file+0x101/0x13d
[   31.330094]  [<ffffffff8112f2e3>] vfs_write+0xac/0xf3
[   31.330094]  [<ffffffff810caf1d>] ? trace_hardirqs_on_caller+0xf/0x24
[   31.330094]  [<ffffffff8112f4d2>] sys_write+0x4a/0x6e
[   31.330094]  [<ffffffff814d6629>] system_call_fastpath+0x16/0x1b
[   31.330094] ---[ end trace bbd3f0319d3e2c44 ]---
[   31.402829] KVM setup async PF for cpu 1
[   31.402829] kvm-stealtime: cpu 1, msr 1fc8df00
[   31.656224] NMI watchdog disabled (cpu1): hardware events not enabled
[   31.659240] CPU1 is up
[   31.660402] Booting Node 0 Processor 2 APIC 0x2
[   31.661522] smpboot cpu 2: start_ip = 96000
[   31.278595] Calibrating delay loop (skipped) already calibrated this CPU
[   31.697981] NMI watchdog disabled (cpu2): hardware events not enabled
[   31.698047] KVM setup async PF for cpu 2
[   31.698051] kvm-stealtime: cpu 2, msr 1fd0df00
[   31.706326] CPU2 is up

however I think they are not related to subj patch.



  reply	other threads:[~2012-02-09 15:13 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-02-07 21:05 x86: kvmclock: abstract save/restore sched_clock_state Marcelo Tosatti
2012-02-08 10:53 ` Igor Mammedov
2012-02-09 12:27 ` Amit Shah
2012-02-09 15:13   ` Igor Mammedov [this message]
2012-02-10 10:02     ` Amit Shah
2012-02-10 10:11       ` Igor Mammedov
2012-02-10 10:23         ` Amit Shah
2012-02-10 12:32       ` Marcelo Tosatti
2012-02-10 12:33         ` Marcelo Tosatti
2012-02-10 12:43           ` Igor Mammedov
2012-02-13 12:46             ` Amit Shah
2012-02-13 12:56           ` Amit Shah
2012-02-10 13:18         ` Amit Shah
2012-02-10 20:58           ` Igor Mammedov
2012-02-13 12:39             ` Amit Shah
2012-02-13 13:07 ` x86: kvmclock: abstract save/restore sched_clock_state (v2) Marcelo Tosatti
2012-02-13 15:20   ` Igor Mammedov
2012-02-13 15:52     ` Marcelo Tosatti
2012-02-15 10:28       ` Avi Kivity
2012-02-13 15:45   ` [PATCH RFC] pvclock: Make pv_clock more robust and fixup it if overflow happens Igor Mammedov
2012-02-13 17:48     ` Marcelo Tosatti
2012-02-13 18:15       ` Igor Mammedov
2012-02-13 16:26   ` x86: kvmclock: abstract save/restore sched_clock_state (v2) Amit Shah
2012-03-01  9:58 ` x86: kvmclock: abstract save/restore sched_clock_state Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F33E299.3090500@redhat.com \
    --to=imammedo@redhat.com \
    --cc=amit.shah@redhat.com \
    --cc=avi@redhat.com \
    --cc=hpa@zytor.com \
    --cc=johnstul@us.ibm.com \
    --cc=kvm@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=mtosatti@redhat.com \
    --cc=riel@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.