All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paul Mackerras <paulus@samba.org>
To: Alexander Graf <agraf@suse.de>
Cc: kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [PATCH 1/5] KVM: PPC: Book3S HV: Handle guest-caused machine checks on POWER7 without panicking
Date: Sat, 24 Nov 2012 08:42:10 +1100	[thread overview]
Message-ID: <20121123214210.GA23537@bloggs.ozlabs.ibm.com> (raw)
In-Reply-To: <5E924DFD-5B00-4B3C-9933-575665FBD8B4@suse.de>

On Fri, Nov 23, 2012 at 03:13:09PM +0100, Alexander Graf wrote:
> 
> On 22.11.2012, at 10:25, Paul Mackerras wrote:
> 
> > +	/* Do they have an SLB shadow buffer registered? */
> > +	slb = vcpu->arch.slb_shadow.pinned_addr;
> > +	if (!slb)
> > +		return;
> 
> Mind to explain this case? What happens here? Do we leave the guest with an empty SLB? Why would this ever happen? What happens next as soon as we go back into the guest?

Yes, we leave the guest with an empty SLB, the access gets retried and
this time the guest gets an SLB miss interrupt, which it can hopefully
handle using an SLB miss handler that runs entirely in real mode.
This could happen for instance while the guest is in SLOF or yaboot or
some other code that runs basically in real mode but occasionally
turns the MMU on for some accesses, and happens to have a bug where it
creates a duplicate SLB entry.

> > +	/* Sanity check */
> > +	n = slb->persistent;
> > +	if (n > SLB_MIN_SIZE)
> > +		n = SLB_MIN_SIZE;
> 
> Please use a max() macro here.

OK.

> > +	rb = 0x800;		/* IS field = 0b10, flush congruence class */
> > +	for (i = 0; i < 128; ++i) {
> 
> Please put up a #define for this. POWER7_TLB_SIZE or so. Is there any way to fetch that number from an SPR? I don't really want to have a p7+ and a p8 function in here too ;).
> 
> > +		asm volatile("tlbiel %0" : : "r" (rb));
> > +		rb += 0x1000;
> 
> I assume this also means (1 << TLBIE_ENTRY_SHIFT)? Would be nice to keep the code readable without guessing :).

The 0x800 and 0x1000 are taken from the architecture - it defines
fields in the RB value for the flush type and TLB index.  The 128 is
POWER7-specific and isn't in any SPR that I know of.  Eventually we'll
probably have to put it (the number of TLB congruence classes) in the
cputable, but for now I'll just do a define.

> So I take it that p7 does not implement tlbia?

Correct.

> So we never return 0? How about ECC errors and the likes? Wouldn't those also be #MCs that the host needs to handle?

Yes, true.  In fact the OPAL firmware gets to see the machine checks
before we do (see the opal_register_exception_handler() calls in
arch/powerpc/platforms/powernv/opal.c), so it should have already
handled recoverable things like L1 cache parity errors.

I'll make the function return 0 if there's an error that it doesn't
know about.

> > 	ld	r8, HSTATE_VMHANDLER(r13)
> > 	ld	r7, HSTATE_HOST_MSR(r13)
> > 
> > 	cmpwi	r12, BOOK3S_INTERRUPT_EXTERNAL
> > +BEGIN_FTR_SECTION
> > 	beq	11f
> > -	cmpwi	r12, BOOK3S_INTERRUPT_MACHINE_CHECK
> > +END_FTR_SECTION_IFSET(CPU_FTR_ARCH_206)
> 
> Mind to explain the logic that happens here? Do we get external interrupts on 970? If not, the cmpwi should also be inside the FTR section. Also, if we do a beq here, why do the beqctr below again?

I was making it not call the host kernel machine check handler if it
was a machine check that pulled us out of the guest.  In fact we
probably do still want to call the handler, but we don't want to jump
to 0x200, since that has been patched by OPAL, and we don't want to
make OPAL think we just got another machine check.  Instead we would
need to jump to machine_check_pSeries.

The feature section is because POWER7 sets HSRR0/1 on external
interrupts, whereas PPC970 sets SRR0/1.

Paul.

WARNING: multiple messages have this Message-ID (diff)
From: Paul Mackerras <paulus@samba.org>
To: Alexander Graf <agraf@suse.de>
Cc: kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [PATCH 1/5] KVM: PPC: Book3S HV: Handle guest-caused machine checks on POWER7 without panicking
Date: Fri, 23 Nov 2012 21:42:10 +0000	[thread overview]
Message-ID: <20121123214210.GA23537@bloggs.ozlabs.ibm.com> (raw)
In-Reply-To: <5E924DFD-5B00-4B3C-9933-575665FBD8B4@suse.de>

On Fri, Nov 23, 2012 at 03:13:09PM +0100, Alexander Graf wrote:
> 
> On 22.11.2012, at 10:25, Paul Mackerras wrote:
> 
> > +	/* Do they have an SLB shadow buffer registered? */
> > +	slb = vcpu->arch.slb_shadow.pinned_addr;
> > +	if (!slb)
> > +		return;
> 
> Mind to explain this case? What happens here? Do we leave the guest with an empty SLB? Why would this ever happen? What happens next as soon as we go back into the guest?

Yes, we leave the guest with an empty SLB, the access gets retried and
this time the guest gets an SLB miss interrupt, which it can hopefully
handle using an SLB miss handler that runs entirely in real mode.
This could happen for instance while the guest is in SLOF or yaboot or
some other code that runs basically in real mode but occasionally
turns the MMU on for some accesses, and happens to have a bug where it
creates a duplicate SLB entry.

> > +	/* Sanity check */
> > +	n = slb->persistent;
> > +	if (n > SLB_MIN_SIZE)
> > +		n = SLB_MIN_SIZE;
> 
> Please use a max() macro here.

OK.

> > +	rb = 0x800;		/* IS field = 0b10, flush congruence class */
> > +	for (i = 0; i < 128; ++i) {
> 
> Please put up a #define for this. POWER7_TLB_SIZE or so. Is there any way to fetch that number from an SPR? I don't really want to have a p7+ and a p8 function in here too ;).
> 
> > +		asm volatile("tlbiel %0" : : "r" (rb));
> > +		rb += 0x1000;
> 
> I assume this also means (1 << TLBIE_ENTRY_SHIFT)? Would be nice to keep the code readable without guessing :).

The 0x800 and 0x1000 are taken from the architecture - it defines
fields in the RB value for the flush type and TLB index.  The 128 is
POWER7-specific and isn't in any SPR that I know of.  Eventually we'll
probably have to put it (the number of TLB congruence classes) in the
cputable, but for now I'll just do a define.

> So I take it that p7 does not implement tlbia?

Correct.

> So we never return 0? How about ECC errors and the likes? Wouldn't those also be #MCs that the host needs to handle?

Yes, true.  In fact the OPAL firmware gets to see the machine checks
before we do (see the opal_register_exception_handler() calls in
arch/powerpc/platforms/powernv/opal.c), so it should have already
handled recoverable things like L1 cache parity errors.

I'll make the function return 0 if there's an error that it doesn't
know about.

> > 	ld	r8, HSTATE_VMHANDLER(r13)
> > 	ld	r7, HSTATE_HOST_MSR(r13)
> > 
> > 	cmpwi	r12, BOOK3S_INTERRUPT_EXTERNAL
> > +BEGIN_FTR_SECTION
> > 	beq	11f
> > -	cmpwi	r12, BOOK3S_INTERRUPT_MACHINE_CHECK
> > +END_FTR_SECTION_IFSET(CPU_FTR_ARCH_206)
> 
> Mind to explain the logic that happens here? Do we get external interrupts on 970? If not, the cmpwi should also be inside the FTR section. Also, if we do a beq here, why do the beqctr below again?

I was making it not call the host kernel machine check handler if it
was a machine check that pulled us out of the guest.  In fact we
probably do still want to call the handler, but we don't want to jump
to 0x200, since that has been patched by OPAL, and we don't want to
make OPAL think we just got another machine check.  Instead we would
need to jump to machine_check_pSeries.

The feature section is because POWER7 sets HSRR0/1 on external
interrupts, whereas PPC970 sets SRR0/1.

Paul.

  reply	other threads:[~2012-11-23 21:42 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-11-22  9:24 [PATCH 0/5] KVM: PPC: Fix various bugs and vulnerabilities in HV KVM Paul Mackerras
2012-11-22  9:24 ` Paul Mackerras
2012-11-22  9:25 ` [PATCH 1/5] KVM: PPC: Book3S HV: Handle guest-caused machine checks on POWER7 without panicking Paul Mackerras
2012-11-22  9:25   ` Paul Mackerras
2012-11-23 14:13   ` Alexander Graf
2012-11-23 14:13     ` Alexander Graf
2012-11-23 21:42     ` Paul Mackerras [this message]
2012-11-23 21:42       ` Paul Mackerras
2012-11-26 13:15       ` Alexander Graf
2012-11-26 13:15         ` Alexander Graf
2012-11-26 21:33         ` Paul Mackerras
2012-11-26 21:33           ` Paul Mackerras
2012-11-26 21:55           ` Alexander Graf
2012-11-26 21:55             ` Alexander Graf
2012-11-26 22:03             ` Alexander Graf
2012-11-26 22:03               ` Alexander Graf
2012-11-26 23:11               ` Paul Mackerras
2012-11-26 23:11                 ` Paul Mackerras
2012-11-24  8:37     ` [PATCH v2] " Paul Mackerras
2012-11-24  8:37       ` Paul Mackerras
2012-11-26 23:16       ` Alexander Graf
2012-11-26 23:16         ` Alexander Graf
2012-11-26 23:18         ` Paul Mackerras
2012-11-26 23:18           ` Paul Mackerras
2012-11-26 23:20           ` Alexander Graf
2012-11-26 23:20             ` Alexander Graf
2012-11-27  0:20             ` Paul Mackerras
2012-11-27  0:20               ` Paul Mackerras
2012-12-22 14:09       ` [PATCH] KVM: PPC: Book3S HV: Fix compilation without CONFIG_PPC_POWERNV Andreas Schwab
2012-12-22 14:09         ` Andreas Schwab
2013-01-06 13:05         ` Alexander Graf
2013-01-06 13:05           ` Alexander Graf
2012-11-22  9:27 ` [PATCH 2/5] KVM: PPC: Book3S HV: Reset reverse-map chains when resetting the HPT Paul Mackerras
2012-11-22  9:27   ` Paul Mackerras
2012-11-22  9:28 ` [PATCH 3/5] KVM: PPC: Book3S HV: Improve handling of local vs. global TLB invalidations Paul Mackerras
2012-11-22  9:28   ` Paul Mackerras
2012-11-23 15:43   ` Alexander Graf
2012-11-23 15:43     ` Alexander Graf
2012-11-23 22:07     ` Paul Mackerras
2012-11-23 22:07       ` Paul Mackerras
2012-11-26 13:10       ` Alexander Graf
2012-11-26 13:10         ` Alexander Graf
2012-11-26 21:48         ` Paul Mackerras
2012-11-26 21:48           ` Paul Mackerras
2012-11-26 22:03           ` Alexander Graf
2012-11-26 22:03             ` Alexander Graf
2012-11-26 23:16             ` Paul Mackerras
2012-11-26 23:16               ` Paul Mackerras
2012-11-26 23:18               ` Alexander Graf
2012-11-26 23:18                 ` Alexander Graf
2012-11-22  9:28 ` [PATCH 4/5] KVM: PPC: Book3S HV: Don't give the guest RW access to RO pages Paul Mackerras
2012-11-22  9:28   ` Paul Mackerras
2012-11-23 15:47   ` Alexander Graf
2012-11-23 15:47     ` Alexander Graf
2012-11-23 22:13     ` Paul Mackerras
2012-11-23 22:13       ` Paul Mackerras
2012-11-24  9:05       ` Alexander Graf
2012-11-24  9:05         ` Alexander Graf
2012-11-24  9:32         ` Paul Mackerras
2012-11-24  9:32           ` Paul Mackerras
2012-11-26 13:09           ` Alexander Graf
2012-11-26 13:09             ` Alexander Graf
2012-11-22  9:29 ` [PATCH 5/5] KVM: PPC: Book3S HV: Report correct HPT entry index when reading HPT Paul Mackerras
2012-11-22  9:29   ` Paul Mackerras
2012-11-23 15:48 ` [PATCH 0/5] KVM: PPC: Fix various bugs and vulnerabilities in HV KVM Alexander Graf
2012-11-23 15:48   ` Alexander Graf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121123214210.GA23537@bloggs.ozlabs.ibm.com \
    --to=paulus@samba.org \
    --cc=agraf@suse.de \
    --cc=kvm-ppc@vger.kernel.org \
    --cc=kvm@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.