All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christoffer Dall <christoffer.dall@arm.com>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>,
	QEMU Developers <qemu-devel@nongnu.org>,
	steven.price@arm.com, qemu-arm <qemu-arm@nongnu.org>,
	Heyi Guo <guoheyi@huawei.com>,
	kvmarm@lists.cs.columbia.edu
Subject: Re: [RFC] arm/cpu: fix soft lockup panic after resuming from stop
Date: Tue, 12 Mar 2019 13:25:28 +0100	[thread overview]
Message-ID: <20190312122528.GA12646@e113682-lin.lund.arm.com> (raw)
In-Reply-To: <CAFEAcA-29Q-sTg2wLre0GJJCr2gfu89rg0vuTOf7cW5drE4d7Q@mail.gmail.com>

[Adding Steven Price, who has recently looked at this, in cc]

On Tue, Mar 12, 2019 at 10:08:47AM +0000, Peter Maydell wrote:
> On Tue, 12 Mar 2019 at 06:10, Heyi Guo <guoheyi@huawei.com> wrote:
> >
> > When we stop a VM for more than 30 seconds and then resume it, by qemu
> > monitor command "stop" and "cont", Linux on VM will complain of "soft
> > lockup - CPU#x stuck for xxs!" as below:
> >
> > [ 2783.809517] watchdog: BUG: soft lockup - CPU#3 stuck for 2395s!
> > [ 2783.809559] watchdog: BUG: soft lockup - CPU#2 stuck for 2395s!
> > [ 2783.809561] watchdog: BUG: soft lockup - CPU#1 stuck for 2395s!
> > [ 2783.809563] Modules linked in...
> >
> > This is because Guest Linux uses generic timer virtual counter as
> > a software watchdog, and CNTVCT_EL0 does not stop when VM is stopped
> > by qemu.
> >
> > This patch is to fix this issue by saving the value of CNTVCT_EL0 when
> > stopping and restoring it when resuming.
> 
> Hi -- I know we have issues with the passage of time in Arm VMs
> running under KVM when the VM is suspended, but the topic is
> a tricky one, and it's not clear to me that this is the correct
> way to fix it. I would prefer to see us start with a discussion
> on the kvm-arm mailing list about the best approach to the problem.
> 
> I've cc'd that list and a couple of the Arm KVM maintainers
> for their opinion.
> 
> QEMU patch left below for context -- the brief summary is that
> it uses KVM_GET_ONE_REG/KVM_SET_ONE_REG on the timer CNT register
> to save it on VM pause and write that value back on VM resume.
> 
> thanks
> -- PMM
> 
> 
> 
> > Cc: Peter Maydell <peter.maydell@linaro.org>
> > Signed-off-by: Heyi Guo <guoheyi@huawei.com>
> > ---
> >  target/arm/cpu.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 65 insertions(+)
> >
> > diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> > index 96f0ff0..7bbba3d 100644
> > --- a/target/arm/cpu.c
> > +++ b/target/arm/cpu.c
> > @@ -896,6 +896,60 @@ static void arm_cpu_finalizefn(Object *obj)
> >  #endif
> >  }
> >
> > +static int get_vcpu_timer_tick(CPUState *cs, uint64_t *tick_at_pause)
> > +{
> > +    int err;
> > +    struct kvm_one_reg reg;
> > +
> > +    reg.id = KVM_REG_ARM_TIMER_CNT;
> > +    reg.addr = (uintptr_t) tick_at_pause;
> > +
> > +    err = kvm_vcpu_ioctl(cs, KVM_GET_ONE_REG, &reg);
> > +    return err;
> > +}
> > +
> > +static int set_vcpu_timer_tick(CPUState *cs, uint64_t tick_at_pause)
> > +{
> > +    int err;
> > +    struct kvm_one_reg reg;
> > +
> > +    reg.id = KVM_REG_ARM_TIMER_CNT;
> > +    reg.addr = (uintptr_t) &tick_at_pause;
> > +
> > +    err = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &reg);
> > +    return err;
> > +}
> > +
> > +static void arch_timer_change_state_handler(void *opaque, int running,
> > +                                            RunState state)
> > +{
> > +    static uint64_t hw_ticks_at_paused;
> > +    static RunState pre_state = RUN_STATE__MAX;
> > +    int err;
> > +    CPUState *cs = (CPUState *)opaque;
> > +
> > +    switch (state) {
> > +    case RUN_STATE_PAUSED:
> > +        err = get_vcpu_timer_tick(cs, &hw_ticks_at_paused);
> > +        if (err) {
> > +            error_report("Get vcpu timer tick failed: %d", err);
> > +        }
> > +        break;
> > +    case RUN_STATE_RUNNING:
> > +        if (pre_state == RUN_STATE_PAUSED) {
> > +            err = set_vcpu_timer_tick(cs, hw_ticks_at_paused);
> > +            if (err) {
> > +                error_report("Resume vcpu timer tick failed: %d", err);
> > +            }
> > +        }
> > +        break;
> > +    default:
> > +        break;
> > +    }
> > +
> > +    pre_state = state;
> > +}
> > +
> >  static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
> >  {
> >      CPUState *cs = CPU(dev);
> > @@ -906,6 +960,12 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
> >      Error *local_err = NULL;
> >      bool no_aa32 = false;
> >
> > +    /*
> > +     * Only add change state handler for arch timer once, for KVM will help to
> > +     * synchronize virtual timer of all VCPUs.
> > +     */
> > +    static bool arch_timer_change_state_handler_added;
> > +
> >      /* If we needed to query the host kernel for the CPU features
> >       * then it's possible that might have failed in the initfn, but
> >       * this is the first point where we can report it.
> > @@ -1181,6 +1241,11 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
> >
> >      init_cpreg_list(cpu);
> >
> > +    if (!arch_timer_change_state_handler_added && kvm_enabled()) {
> > +        qemu_add_vm_change_state_handler(arch_timer_change_state_handler, cs);
> > +        arch_timer_change_state_handler_added = true;
> > +    }
> > +
> >  #ifndef CONFIG_USER_ONLY
> >      if (cpu->has_el3 || arm_feature(env, ARM_FEATURE_M_SECURITY)) {
> >          cs->num_ases = 2;
> > --
> > 1.8.3.1

  reply	other threads:[~2019-03-12 12:25 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-12  6:09 [Qemu-devel] [RFC] arm/cpu: fix soft lockup panic after resuming from stop Heyi Guo
2019-03-12  6:22 ` no-reply
2019-03-12 10:08 ` Peter Maydell
2019-03-12 12:25   ` Christoffer Dall [this message]
2019-03-12 14:12   ` Marc Zyngier
2019-03-13  1:57     ` Heyi Guo
2019-03-13 10:22       ` Steven Price
2019-03-13 10:11     ` Steven Price
2019-03-14 14:55       ` Heyi Guo
2019-03-15  8:59       ` Christoffer Dall
2019-03-19 14:39         ` Heyi Guo
2019-03-20 16:29           ` Steven Price
2019-03-26 11:54             ` Heyi Guo
2019-03-26 13:53               ` Heyi Guo
2019-03-26 17:12                 ` Steven Price
2019-04-11  7:27                   ` [Qemu-devel] " Heyi Guo
2019-04-11  7:27                     ` Heyi Guo
2019-04-11  7:27                     ` Heyi Guo
2019-04-11  7:27                     ` [Qemu-devel] " Heyi Guo
2019-04-11 15:31                     ` Steven Price
2019-04-11 15:31                       ` Steven Price

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190312122528.GA12646@e113682-lin.lund.arm.com \
    --to=christoffer.dall@arm.com \
    --cc=guoheyi@huawei.com \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=marc.zyngier@arm.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-arm@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=steven.price@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.