From: Jordan Niethe <jniethe5@gmail.com>
To: Paul Mackerras <paulus@ozlabs.org>,
linuxppc-dev@ozlabs.org, kvm@vger.kernel.org
Cc: kvm-ppc@vger.kernel.org, David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [PATCH v2 1/3] KVM: PPC: Book3S HV: Fix race in re-enabling XIVE escalation interrupts
Date: Wed, 14 Aug 2019 14:46:38 +1000
Message-ID: <53a17acd0330bc38190ab36625e48d1727a16fa4.camel@gmail.com>
In-Reply-To: <20190813100349.GD9567@blackberry>
On Tue, 2019-08-13 at 20:03 +1000, Paul Mackerras wrote:
> Escalation interrupts are interrupts sent to the host by the XIVE
> hardware when it has an interrupt to deliver to a guest VCPU but that
> VCPU is not running anywhere in the system. Hence we disable the
> escalation interrupt for the VCPU being run when we enter the guest
> and re-enable it when the guest does an H_CEDE hypercall indicating
> it is idle.
>
> It is possible that an escalation interrupt gets generated just as we
> are entering the guest. In that case the escalation interrupt may be
> using a queue entry in one of the interrupt queues, and that queue
> entry may not have been processed when the guest exits with an
> H_CEDE.
> The existing entry code detects this situation and does not clear the
> vcpu->arch.xive_esc_on flag as an indication that there is a pending
> queue entry (if the queue entry gets processed, xive_esc_irq() will
> clear the flag). There is a comment in the code saying that if the
> flag is still set on H_CEDE, we have to abort the cede rather than
> re-enabling the escalation interrupt, lest we end up with two
> occurrences of the escalation interrupt in the interrupt queue.
>
> However, the exit code doesn't do that; it aborts the cede in the sense
> that vcpu->arch.ceded gets cleared, but it still enables the escalation
> interrupt by setting the source's PQ bits to 00. Instead we need to
> set the PQ bits to 10, indicating that an interrupt has been triggered.
> We also need to avoid setting vcpu->arch.xive_esc_on in this case
> (i.e. vcpu->arch.xive_esc_on seen to be set on H_CEDE) because
> xive_esc_irq() will run at some point and clear it, and if we race with
> that we may end up with an incorrect result (i.e. xive_esc_on set when
> the escalation interrupt has just been handled).
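To check my reading of the 00 versus 10 point, here is a rough C model of the source's PQ state. It is only a sketch under my own assumptions: that a trigger on a source in state 00 moves it to 10 and forwards an event to the queue, that a trigger on 10 moves it to 11 without forwarding, and that 01 is masked. The struct, the function names, and the starting state (masked, with one unprocessed entry already queued) are all made up for illustration and are not kernel code.

```c
#include <assert.h>

/* PQ state of the escalation source, encoded as (P << 1) | Q. */
enum pq { PQ_00 = 0, PQ_01 = 1, PQ_10 = 2, PQ_11 = 3 };

/* Hypothetical model of one escalation source plus its queue. */
struct esc_source {
	enum pq pq;        /* current P and Q bits */
	int queue_entries; /* escalation entries sitting in the queue */
};

/* The hardware wants to raise the escalation (assumed ESB behaviour). */
static void esc_trigger(struct esc_source *s)
{
	switch (s->pq) {
	case PQ_00: /* enabled: forward an event, mark it pending */
		s->pq = PQ_10;
		s->queue_entries++;
		break;
	case PQ_10: /* already pending: coalesce, nothing new queued */
		s->pq = PQ_11;
		break;
	case PQ_01: /* masked */
	case PQ_11: /* already pending and queued */
		break;
	}
}

/* Stand-in for the "set PQ" load the cede exit path performs. */
static void esc_set_pq(struct esc_source *s, enum pq new_pq)
{
	s->pq = new_pq;
}

/*
 * Start from the situation described above: the source was masked on
 * guest entry while one escalation entry was still unprocessed. Write
 * the given PQ value on H_CEDE, let the hardware trigger once more,
 * and report how many escalation entries the queue then holds.
 */
int entries_after_cede(enum pq pq_written_on_cede)
{
	struct esc_source s = { .pq = PQ_01, .queue_entries = 1 };

	/* The cede exit path only ever writes 00 or 10. */
	assert(pq_written_on_cede == PQ_00 || pq_written_on_cede == PQ_10);

	esc_set_pq(&s, pq_written_on_cede);
	esc_trigger(&s);
	return s.queue_entries;
}
```

In this model, writing 00 lets the next trigger queue a second entry while the first is still unprocessed, whereas writing 10 makes the next trigger coalesce, which matches the description above as far as I can tell.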
>
> It is extremely unlikely that having two queue entries would cause
> observable problems; theoretically it could cause queue overflow, but
> the CPU would have to have thousands of interrupts targeted to it for
> that to be possible. However, this fix will also make it possible to
> determine accurately whether there is an unhandled escalation
> interrupt in the queue, which will be needed by the following patch.
>
> Cc: stable@vger.kernel.org # v4.16+
> Fixes: 9b9b13a6d153 ("KVM: PPC: Book3S HV: Keep XIVE escalation interrupt masked unless ceded")
> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
> ---
> v2: don't set xive_esc_on if we're not using a XIVE escalation
> interrupt.
>
> arch/powerpc/kvm/book3s_hv_rmhandlers.S | 36 +++++++++++++++++++++------------
> 1 file changed, 23 insertions(+), 13 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index 337e644..2e7e788 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -2831,29 +2831,39 @@ kvm_cede_prodded:
> kvm_cede_exit:
> ld r9, HSTATE_KVM_VCPU(r13)
> #ifdef CONFIG_KVM_XICS
> - /* Abort if we still have a pending escalation */
> + /* are we using XIVE with single escalation? */
> + ld r10, VCPU_XIVE_ESC_VADDR(r9)
> + cmpdi r10, 0
> + beq 3f
> + li r6, XIVE_ESB_SET_PQ_00
Would it make sense to move the above instruction down to the 4: label
instead? If we do not branch to 4, r6 is overwritten anyway, so I think
that would save a load immediate on that path. It would also mean that
you could keep using r5 everywhere instead of changing it to r6.
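To make the suggestion concrete, here is a small C model of the two control flows. It is only a sketch for comparing them, not a proposed implementation: the struct, the enum, and the function names are hypothetical, the local variables named r5 and r6 merely mirror the registers in the patch, and the real-mode versus virtual-mode split of the ESB load is left out because it does not affect which PQ value is chosen.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the vcpu fields the exit path touches. */
struct vcpu_model {
	bool has_esc_vaddr; /* models VCPU_XIVE_ESC_VADDR != 0 */
	bool xive_esc_on;   /* models vcpu->arch.xive_esc_on */
	bool ceded;         /* models vcpu->arch.ceded */
};

/* Which ESB access the exit path ends up doing, if any. */
enum esb_op { ESB_NONE, ESB_SET_PQ_00, ESB_SET_PQ_10 };

/* Flow as posted: preload 00 into r6, overwrite it on the abort path. */
enum esb_op cede_exit_posted(struct vcpu_model *v)
{
	enum esb_op r6;

	if (!v->has_esc_vaddr)
		return ESB_NONE;        /* beq 3f */
	r6 = ESB_SET_PQ_00;             /* li r6, XIVE_ESB_SET_PQ_00 */
	if (v->xive_esc_on) {           /* escalation still pending */
		v->ceded = false;       /* abort the cede */
		r6 = ESB_SET_PQ_10;     /* overwrites the preload */
	} else {                        /* 4: */
		v->xive_esc_on = true;
	}
	return r6;                      /* 5: ESB load at offset r6 */
}

/* Suggested flow: pick the value in each branch, reusing r5. */
enum esb_op cede_exit_suggested(struct vcpu_model *v)
{
	enum esb_op r5;

	if (!v->has_esc_vaddr)
		return ESB_NONE;
	if (v->xive_esc_on) {
		v->ceded = false;
		r5 = ESB_SET_PQ_10;
	} else {                        /* 4: */
		r5 = ESB_SET_PQ_00;     /* load moved down to here */
		v->xive_esc_on = true;
	}
	return r5;
}

/* True if both flows pick the same ESB access and leave the same state. */
bool flows_agree(bool has_esc_vaddr, bool esc_on)
{
	struct vcpu_model a = { has_esc_vaddr, esc_on, true };
	struct vcpu_model b = a;

	return cede_exit_posted(&a) == cede_exit_suggested(&b) &&
	       a.xive_esc_on == b.xive_esc_on && a.ceded == b.ceded;
}
```

In this model r5 is free for reuse because the value loaded from VCPU_XIVE_ESC_ON is not needed once the compare has been done, and the two flows agree for every combination of inputs.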
> + /*
> + * If we still have a pending escalation, abort the cede,
> + * and we must set PQ to 10 rather than 00 so that we don't
> + * potentially end up with two entries for the escalation
> + * interrupt in the XIVE interrupt queue. In that case
> + * we also don't want to set xive_esc_on to 1 here in
> + * case we race with xive_esc_irq().
> + */
> lbz r5, VCPU_XIVE_ESC_ON(r9)
> cmpwi r5, 0
> - beq 1f
> + beq 4f
> li r0, 0
> stb r0, VCPU_CEDED(r9)
> -1: /* Enable XIVE escalation */
> - li r5, XIVE_ESB_SET_PQ_00
> + li r6, XIVE_ESB_SET_PQ_10
> + b 5f
> +4: li r0, 1
> + stb r0, VCPU_XIVE_ESC_ON(r9)
> + /* make sure store to xive_esc_on is seen before xive_esc_irq runs */
> + sync
> +5: /* Enable XIVE escalation */
> mfmsr r0
> andi. r0, r0, MSR_DR /* in real mode? */
> beq 1f
> - ld r10, VCPU_XIVE_ESC_VADDR(r9)
> - cmpdi r10, 0
> - beq 3f
> - ldx r0, r10, r5
> + ldx r0, r10, r6
> b 2f
> 1: ld r10, VCPU_XIVE_ESC_RADDR(r9)
> - cmpdi r10, 0
> - beq 3f
> - ldcix r0, r10, r5
> + ldcix r0, r10, r6
> 2: sync
> - li r0, 1
> - stb r0, VCPU_XIVE_ESC_ON(r9)
> #endif /* CONFIG_KVM_XICS */
> 3: b guest_exit_cont
>