kvmarm.lists.cs.columbia.edu archive mirror
From: Julien Grall <julien.grall@arm.com>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: "linux-rt-users@vger.kernel.org" <linux-rt-users@vger.kernel.org>,
	Marc Zyngier <maz@kernel.org>,
	Andre Przywara <andre.przywara@arm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	anna-maria@linutronix.de,
	"kvmarm@lists.cs.columbia.edu" <kvmarm@lists.cs.columbia.edu>
Subject: Re: KVM Arm64 and Linux-RT issues
Date: Fri, 16 Aug 2019 17:32:38 +0100
Message-ID: <e9a77a95-ce0e-27a4-acb0-e997eb656e14@arm.com>
In-Reply-To: <20190816152317.pbhctfiyurjrepju@linutronix.de>

Hi Sebastian,

On 16/08/2019 16:23, Sebastian Andrzej Siewior wrote:
> On 2019-08-16 16:18:20 [+0100], Julien Grall wrote:
>> Sadly, I managed to hit the same BUG_ON() today with this patch
>> applied on top v5.2-rt1-rebase. :/ Although, it is more difficult
>> to hit than previously.
>>
>> [  157.449545] 000: BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:968
>> [  157.449569] 000: in_atomic(): 1, irqs_disabled(): 0, pid: 990, name: kvm-vcpu-1
>> [  157.449579] 000: 2 locks held by kvm-vcpu-1/990:
>> [  157.449592] 000:  #0: 00000000c2fc8217 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x70/0xae0
>> [  157.449638] 000:  #1: 0000000096863801 (&cpu_base->softirq_expiry_lock){+.+.}, at: hrtimer_grab_expiry_lock+0x24/0x40
>> [  157.449677] 000: Preemption disabled at:
>> [  157.449679] 000: [<ffff0000111a4538>] schedule+0x30/0xd8
>> [  157.449702] 000: CPU: 0 PID: 990 Comm: kvm-vcpu-1 Tainted: G        W 5.2.0-rt1-00001-gd368139e892f #104
>> [  157.449712] 000: Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Jan 23 2017
>> [  157.449718] 000: Call trace:
>> [  157.449722] 000:  dump_backtrace+0x0/0x130
>> [  157.449730] 000:  show_stack+0x14/0x20
>> [  157.449738] 000:  dump_stack+0xbc/0x104
>> [  157.449747] 000:  ___might_sleep+0x198/0x238
>> [  157.449756] 000:  rt_spin_lock+0x5c/0x70
>> [  157.449765] 000:  hrtimer_grab_expiry_lock+0x24/0x40
>> [  157.449773] 000:  hrtimer_cancel+0x1c/0x38
>> [  157.449780] 000:  kvm_timer_vcpu_load+0x78/0x3e0
> 
> …
>> I will do some debug and see what I can find.
> 
> which timer is this? Is there another one?

It looks like the timer is the background timer (bg_timer).
The BUG() also seems to happen with the other timers, but
less often. All of them have already been converted.

Interestingly, hrtimer_grab_expiry_lock() may be called for a
timer even when is_soft is 0 (which I assume means the softirq
will not be used).

diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 7d7db8802131..fe05e553dea2 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -934,6 +934,9 @@ void hrtimer_grab_expiry_lock(const struct hrtimer *timer)
 {
        struct hrtimer_clock_base *base = timer->base;
 
+       WARN(!preemptible(), "is_soft %u base %p base->cpu_base %p\n",
+            timer->is_soft, base, base ? base->cpu_base : NULL);
+
        if (base && base->cpu_base) {
                spin_lock(&base->cpu_base->softirq_expiry_lock);
                spin_unlock(&base->cpu_base->softirq_expiry_lock);

[  576.291886] 004: is_soft 0 base ffff80097eed44c0 base->cpu_base ffff80097eed4380

Because the hrtimer is started when the vCPU is scheduled out
and canceled when it is scheduled back in, there is no
guarantee the hrtimer will be running on the same pCPU as the
one canceling it. So I think the following can happen:

CPU0                                          |  CPU1
                                              |
                                              |  hrtimer_interrupt()
                                              |    raw_spin_lock_irqsave(&cpu_base->lock)
 hrtimer_cancel()                             |    __hrtimer_run_queues()
   hrtimer_try_to_cancel()                    |      __run_hrtimer()
     lock_hrtimer_base()                      |        base->running = timer;
                                              |        raw_spin_unlock_irqrestore(&cpu_base->lock)
       raw_spin_lock_irqsave(&cpu_base->lock) |        fn(timer);
     hrtimer_callback_running()               |
        
hrtimer_callback_running() will return true because the callback
is running on another CPU. This means hrtimer_try_to_cancel()
returns -1, and therefore hrtimer_cancel() calls
hrtimer_grab_expiry_lock(). On RT that takes a sleeping lock
while preemption is disabled, which matches the trace above.
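
To make the interleaving concrete, here is a minimal single-threaded sketch in C. It is a user-space model, not kernel code: every model_* name is hypothetical, the expiry lock is reduced to a counter, and CPU1's progress is simulated by a step count (an assumption made only so the loop terminates). It only illustrates that a cancel racing with a running callback sees -1 and goes through the expiry-lock path without looking at is_soft.

```c
#include <assert.h>
#include <stddef.h>
#include <stdbool.h>

struct model_timer;

/* Hypothetical stand-in for the per-CPU hrtimer base. */
struct model_cpu_base {
	const struct model_timer *running; /* timer whose callback is executing */
	int callback_steps_left;           /* simulated work left in the callback */
	int expiry_lock_grabs;             /* times the expiry lock path was taken */
};

struct model_timer {
	struct model_cpu_base *base;
	bool enqueued;
	bool is_soft;
};

/* CPU1 side: the timer is marked as running before its callback is invoked. */
static void model_callback_begin(struct model_timer *t, int steps)
{
	assert(steps > 0); /* a callback that never ends would hang the model */
	t->enqueued = false;
	t->base->running = t;
	t->base->callback_steps_left = steps;
}

/* Returns -1 if the callback is running, 0 if inactive, 1 if dequeued. */
static int model_try_to_cancel(struct model_timer *t)
{
	if (t->base->running == t)
		return -1;
	if (!t->enqueued)
		return 0;
	t->enqueued = false;
	return 1;
}

/* Stand-in for the lock/unlock pair; note that is_soft is never consulted. */
static void model_grab_expiry_lock(struct model_timer *t)
{
	struct model_cpu_base *base = t->base;

	base->expiry_lock_grabs++;
	/* Simulate CPU1 making progress while CPU0 waits. */
	if (base->callback_steps_left > 0 && --base->callback_steps_left == 0)
		base->running = NULL;
}

/* CPU0 side: retry until the timer is no longer running its callback. */
static int model_cancel(struct model_timer *t)
{
	for (;;) {
		int ret = model_try_to_cancel(t);

		if (ret >= 0)
			return ret;
		model_grab_expiry_lock(t);
	}
}
```

Canceling an enqueued timer whose callback is not running returns 1 and never touches the lock. After model_callback_begin() on a timer with is_soft == false, model_cancel() only returns once the simulated callback has finished, and expiry_lock_grabs is non-zero by then.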

Did I miss anything?

Cheers,

-- 
Julien Grall