All of lore.kernel.org
 help / color / mirror / Atom feed
From: Igor Mammedov <imammedo@redhat.com>
To: Amit Shah <amit.shah@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
	kvm@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com,
	hpa@zytor.com, x86@kernel.org, johnstul@us.ibm.com,
	riel@redhat.com, avi@redhat.com, Gleb Natapov <gleb@redhat.com>
Subject: Re: x86: kvmclock: abstract save/restore sched_clock_state
Date: Fri, 10 Feb 2012 21:58:47 +0100	[thread overview]
Message-ID: <4F358507.2030906@redhat.com> (raw)
In-Reply-To: <20120210131828.GA31999@amit.redhat.com>

I've to revoke my ack and say NAK to this patch. Patch itself is in right
direction but clock restore happens too late.

With patch I've used to hunt down overflow for cpu hotplug
(maybe it will be better for it to be in kernel, which will help to detect
  overflow problem without spending a lot of time to debug it?):

diff --git a/arch/x86/kernel/pvclock.c b/arch/x86/kernel/pvclock.c
index 42eb330..0081e10 100644
--- a/arch/x86/kernel/pvclock.c
+++ b/arch/x86/kernel/pvclock.c
@@ -43,7 +43,10 @@ void pvclock_set_flags(u8 flags)

  static u64 pvclock_get_nsec_offset(struct pvclock_shadow_time *shadow)
  {
-       u64 delta = native_read_tsc() - shadow->tsc_timestamp;
+       u64 tsc = native_read_tsc();
+       u64 shadow_timestamp = shadow->tsc_timestamp;
+       u64 delta = tsc - shadow_timestamp;
+       BUG_ON (tsc < shadow_timestamp);
         return pvclock_scale_delta(delta, shadow->tsc_to_nsec_mul,
                                    shadow->tsc_shift);
  }

I've got following back-trace:

#29 0xffffffff81004925 in oops_end (flags=70, regs=<optimized out>, signr=11) at arch/x86/kernel/dumpstack.c:246
#30 0xffffffff81004a77 in die (str=0xffffffff8152ca7f "invalid opcode", regs=0xffff8800149859e8, err=0)
     at arch/x86/kernel/dumpstack.c:305
#31 0xffffffff81001fed in do_trap (trapnr=6, signr=4, str=0xffffffff8152ca7f "invalid opcode", regs=0xffff8800149859e8,
     error_code=0, info=0xffff880014985948) at arch/x86/kernel/traps.c:168
#32 0xffffffff810021f7 in do_invalid_op (regs=0xffff8800149859e8, error_code=0) at arch/x86/kernel/traps.c:209
#33 <signal handler called>
#34 pvclock_get_nsec_offset (shadow=0xffff880014985a98) at arch/x86/kernel/pvclock.c:51
#35 pvclock_clocksource_read (src=0xffff88001f1d1ac0) at arch/x86/kernel/pvclock.c:107
#36 0xffffffff8101aa97 in kvm_clock_read () at arch/x86/kernel/kvmclock.c:79
#37 0xffffffff810087cc in paravirt_sched_clock () at /home/imammedov/fc15/linux-2.6/arch/x86/include/asm/paravirt.h:230
#38 sched_clock () at arch/x86/kernel/tsc.c:71
#39 0xffffffff8104f025 in sched_clock_local (scd=0xffff88001f1d2600) at kernel/sched/clock.c:147
#40 0xffffffff8104f1b7 in sched_clock_cpu (cpu=0) at kernel/sched/clock.c:232
#41 0xffffffff8104f1e5 in local_clock () at kernel/sched/clock.c:316
#42 0xffffffff810699d2 in lockstat_clock () at kernel/lockdep.c:158
#43 __lock_acquire (lock=0xffffffff816131b8, subclass=<optimized out>, trylock=0, read=0, check=2, hardirqs_off=<optimized out>,
     nest_lock=0x0, ip=18446744071578924098, references=0) at kernel/lockdep.c:3098
#44 0xffffffff8106a924 in lock_acquire (lock=0xffffffff816131b8, subclass=0, trylock=0, read=0, check=2, nest_lock=0x0,
     ip=18446744071578924098) at kernel/lockdep.c:3555
#45 0xffffffff8136b1c3 in __raw_spin_lock (lock=0xffffffff816131a0) at include/linux/spinlock_api_smp.h:143
#46 _raw_spin_lock (lock=0xffffffff816131a0) at kernel/spinlock.c:137
#47 0xffffffff81013442 in prepare_set () at arch/x86/kernel/cpu/mtrr/generic.c:676
#48 0xffffffff8101368c in generic_set_all () at arch/x86/kernel/cpu/mtrr/generic.c:723
#49 0xffffffff8101258b in mtrr_bp_restore () at arch/x86/kernel/cpu/mtrr/main.c:738
#50 0xffffffff812cabfa in __restore_processor_state (ctxt=<optimized out>) at arch/x86/power/cpu.c:227
#51 restore_processor_state () at arch/x86/power/cpu.c:233
---Type <return> to continue, or q <return> to quit---
#52 0xffffffff8105a1b6 in create_image (platform_mode=0) at kernel/power/hibernate.c:291
#53 hibernation_snapshot (platform_mode=0) at kernel/power/hibernate.c:363
#54 0xffffffff8105a825 in hibernate () at kernel/power/hibernate.c:629
#55 0xffffffff81058868 in state_store (kobj=<optimized out>, attr=<optimized out>,
     buf=0xffff88001366e000 <Address 0xffff88001366e000 out of bounds>, n=5) at kernel/power/main.c:284
#56 0xffffffff811e6cc3 in kobj_attr_store (kobj=<optimized out>, attr=<optimized out>, buf=<optimized out>,
     count=<optimized out>) at lib/kobject.c:699
#57 0xffffffff811372c2 in flush_write_buffer (count=<optimized out>, buffer=<optimized out>, dentry=<optimized out>)
     at fs/sysfs/file.c:202
#58 sysfs_write_file (file=<optimized out>, buf=0x7f9ce710d000 <Address 0x7f9ce710d000 out of bounds>, count=<optimized out>,
     ppos=0xffff880014985f58) at fs/sysfs/file.c:236
#59 0xffffffff810ebfbe in vfs_write (file=0xffff88001cb48000, buf=0x7f9ce710d000 <Address 0x7f9ce710d000 out of bounds>,
     count=<optimized out>, pos=0xffff880014985f58) at fs/read_write.c:435
#60 0xffffffff810ec175 in sys_write (fd=<optimized out>, buf=0x7f9ce710d000 <Address 0x7f9ce710d000 out of bounds>,
     count=<optimized out>) at fs/read_write.c:487

that shows an access to clock happens right before x86_platform.restore_sched_clock_state
is called.

Moving x86_platform.restore_sched_clock_state before mtrr_bp_restore solves issue.
It isn't bugged on me for 20 save/restore cycles with this change, without this change
it bugs on 2nd-3rd cycle.


BTW Amit,
your config doesn't have CONFIG_KVM_GUEST set, which causes primary cpu clock to be
uninitialized too in case of SMP kernel. Commit ca3f10172eea9b95b moved it from
kvmclock_init to kvm_guest_init but forgot to change Kconfig too and kvm_guest_init
is called only when KVM_GUEST is selected.  This commit created following implicit deps

if SMP && KVM_CLOCK then select KVM_GUEST

But I don't know if it's possible to express it in Kconfig. That rises a question:
Do we need KVM_CLOCK option? Maybe better to use KVM_GUEST instead and remove KVM_CLOCK option?
Or make KVM_CLOCK depend on KVM_GUEST...

  reply	other threads:[~2012-02-10 20:58 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-02-07 21:05 x86: kvmclock: abstract save/restore sched_clock_state Marcelo Tosatti
2012-02-08 10:53 ` Igor Mammedov
2012-02-09 12:27 ` Amit Shah
2012-02-09 15:13   ` Igor Mammedov
2012-02-10 10:02     ` Amit Shah
2012-02-10 10:11       ` Igor Mammedov
2012-02-10 10:23         ` Amit Shah
2012-02-10 12:32       ` Marcelo Tosatti
2012-02-10 12:33         ` Marcelo Tosatti
2012-02-10 12:43           ` Igor Mammedov
2012-02-13 12:46             ` Amit Shah
2012-02-13 12:56           ` Amit Shah
2012-02-10 13:18         ` Amit Shah
2012-02-10 20:58           ` Igor Mammedov [this message]
2012-02-13 12:39             ` Amit Shah
2012-02-13 13:07 ` x86: kvmclock: abstract save/restore sched_clock_state (v2) Marcelo Tosatti
2012-02-13 15:20   ` Igor Mammedov
2012-02-13 15:52     ` Marcelo Tosatti
2012-02-15 10:28       ` Avi Kivity
2012-02-13 15:45   ` [PATCH RFC] pvclock: Make pv_clock more robust and fixup it if overflow happens Igor Mammedov
2012-02-13 17:48     ` Marcelo Tosatti
2012-02-13 18:15       ` Igor Mammedov
2012-02-13 16:26   ` x86: kvmclock: abstract save/restore sched_clock_state (v2) Amit Shah
2012-03-01  9:58 ` x86: kvmclock: abstract save/restore sched_clock_state Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F358507.2030906@redhat.com \
    --to=imammedo@redhat.com \
    --cc=amit.shah@redhat.com \
    --cc=avi@redhat.com \
    --cc=gleb@redhat.com \
    --cc=hpa@zytor.com \
    --cc=johnstul@us.ibm.com \
    --cc=kvm@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=mtosatti@redhat.com \
    --cc=riel@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.