Re: [PATCH 2/2] KVM: arm64: nVHE: Don't consume host SErrors with RAS

From: James Morse <james.morse@arm.com>
To: Andrew Scull <ascull@google.com>
Cc: kernel-team@android.com, catalin.marinas@arm.com,
	Marc Zyngier <maz@kernel.org>,
	will@kernel.org, kvmarm@lists.cs.columbia.edu
Subject: Re: [PATCH 2/2] KVM: arm64: nVHE: Don't consume host SErrors with RAS
Date: Wed, 26 Aug 2020 18:41:00 +0100	[thread overview]
Message-ID: <394b84de-6989-e980-c068-5ef9c48cd078@arm.com> (raw)
In-Reply-To: <20200811151230.GC2917393@google.com>

Hi Andrew,

On 11/08/2020 16:12, Andrew Scull wrote:
> On Wed, Aug 05, 2020 at 03:37:27PM +0100, James Morse wrote:
>> On 31/07/2020 11:20, Andrew Scull wrote:
>>> If there is a pending physical SError, we'd have to keep it pending so
>>> the host can consume it.

>>> __guest_enter has the dsb-isb for RAS so
>>> SErrors will become pending, but it doesn't consume them.

>>> I can't
>>> remember now whether this was reliable or not; I assumed it was as it is
>>> gated on the RAS config.
>>
>> It should be reliable for 'asynchronous external abort', but not 'SError Interrupt', both
>> of which are signalled via the SError vector.
> 
> Uh oh, more terms with differences that I don't understand yet.

> I assume that there aren't SError Interrupts that we would be concerned
> about being deliered to the guest that should have been delivered to the
> host after checing for asynchronous external aborts? At least, that seem
> to be the current behaviour if I'm reading things right.

Currently KVM assume everything is an asynchronous-external-abort as that was the pre-RAS
behaviour. SError interrupt means something along the lines of 'part of the SoC fell out',
and I expect we're not going to get very far before we trip over the problem again.
If we don't, its probably something related to the guest.

With v8.2 we can tell these things apart, so we could try harder, but I don't think
anything uses SError-Interrupt. Its much more likely that catastrophic errors are taken to
EL3 firmware, or handled by a dedicated microcontroller.

>> The DSB completes previous memory accesses, so that the components they land in can return
>> an abort if they need to. The ISB ensures the abort is seen as pending in ISR_EL1 before
>> the CPU reads it.
>>
>> The ESB-instruction is potentially cheaper, (as cheap as a nop if all errors are going to
>> be reported as uncontained anyway), but it has the nasty side effect of consuming the
>> error, which we don't want here.
>>
>>
>>> The above didn't need an ESB for the host but incorrect assumptions
>>> might change that.
>>>
>>>> Furthermore, assuming you find a way to do it, what do you do with it?
>>>>
>>>> (a) Either there was something pending already and it is still pending,
>>
>>> If a physical SError is pending, you leave it pending and go back to the
>>> host so it can consume it.
>>
>> Great idea!
>> If the CPU has IESB, you can just check ISR_EL1 on entry from the host.
>> If not, you'll need some extra barriers.

> I'm not sure it would be easy to check ISR_EL1 and bail to the host in
> the general case as it would require a meaningful return code for each
> context, something like EAGAIN that doesn't exist in the hyp interfaces.

I guess whether its a problem depends on how much work we're going to do. It doesn't feel
right to carry on once we know something is burning.

If its a bug to make an HVC from SError-masked code, this would only happen if IESB
synchronised the error.

For __kvm_vcpu_run_*(), we could return immediately as if we'd taken an IRQ out of the
guest, this is pretty much the trick __guest_enter pulls. We should probably do this today.

For others, we could rewind ELR_EL2 and forcibly emulate the exception back into the host.
If it returns, it runs the HVC again. Its probably not worth the effort, unless we do the
same for the guest.... which won't promise not to do this from masked

> I was instead thinking that EL2 would continue as usual except it won't
> consume any SErrors with the ESB.
What we don't know can't hurt us! I think this is fine provided we don't end up making the
blackout window huge. A whole set of extra vectors is I guess the cost of this.

Will KVM ever want to handle its own errors? (it doesn't today).

If its okay to merge KVM's asynchronous external abort and the hosts, then I agree this
works, we'd need to be careful not to unmask SError or run ESB.

If KVM wants to know if its an asynchronous external abort that it caused, you'd need to
virtualise the errors for the host.

>>>> (b) Or there was nothing pending and you now have an error that you
>>>>     don't know how to report (the host EL1 never issued an ESB)
>>>
>>> If there isn't a physical SError pending, either there is no SError at
>>> all (easy) or the SError has already been consumed to DISR_EL1 by a host
>>> ESB and we'd preserve DISR_EL1 for the host to handle however it chooses.
>>
>> (I think this is a host bug)
>>
>>
>>>> We could just error out on hypercalls if DISR_EL1 is non-zero, but
>>>> I don't see how we do that for traps, as it would just confuse the host
>>>> EL1.
>>>
>>> Traps would need to be left pending. Detected but not consumed with the
>>> dsb-isb in __guest_enter.
>>
>> You're trapping stuff from the host?
> 
> Not yet, but it might happen e.g. to context switch FPSIMD under
> protected KVM.

Sounds fun.

>> If you might trap something between the ESB-instruction and the host reading DISR_EL1 then
>> you can't use the ESB-instruction in your trap handling path, as it would over-write it.
>> This sucks as presumably you want to make it common with the really-a-guest trap handling
>> path.
>>
>> The best thing to do would be to go the whole-hog and route SError to EL2 with
>> HCR_EL2.AMO. SError taken out of the host can be re-injected with HCR_EL2.VSE and
>> VSESR_EL2, which will be consumed to VDISR_EL2 if the 'host' executes an ESB-instruction.
>>
>> DISR_EL1 then belongs to EL2, meaning its free to use the ESB-instruction as it likes.
> 
> Does a save and restore of DISR_EL1 (see other thread on this series)
> address the concerns here or would you still think it better to start
> virtualizing for the host?

Other than trapping things from the host, I'm confused as to the circumstances that
DISR_EL1 would get clobbered. (is there an example?)

Isn't that unnecessary in the "EL2 would continue as usual except it won't
consume any SErrors with the ESB." world?

DISR_EL1 belongs to the thing that is consuming SError for the platform. The hardware
takes care of ensuring the guest can never corrupt this. KVMs EL2 code is part of the
host, so it magically knows its not overwriting anything when uses the ESB-instruction.
I can see how trapping stuff from the host would change that.

Save restoring it looks really weird, having two things consuming errors leads to problems
around who consumes what. Not handling the DISR_EL1 value as quickly as possible looks
problematic.

(I don't think I'm answering the question here).

For trapping, or KVM detecting its own asynchronous-external-abort, I think you need to
turn the virtualisation on. Otherwise its avoiding ESB-instructions and unmasking SError,
which comes with the cost of special copies of things like the vectors.

Thanks,

James
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm