On Wed, 29 Jul 2020, Jan Beulich wrote:
> On 29.07.2020 09:08, Bertrand Marquis wrote:
> > > On 28 Jul 2020, at 21:54, Jan Beulich <jbeulich@suse.com> wrote:
> > > 
> > > On 28.07.2020 17:52, Bertrand Marquis wrote:
> > > > At the moment on Arm, a Linux guest running with KTPI enabled will
> > > > cause the following error when a context switch happens in user mode:
> > > > (XEN) p2m.c:1890: d1v0: Failed to walk page-table va 0xffffff837ebe0cd0
> > > > The error is caused by the virtual address for the runstate area
> > > > registered by the guest only being accessible when the guest is running
> > > > in kernel space when KPTI is enabled.
> > > > To solve this issue, this patch is doing the translation from virtual
> > > > address to physical address during the hypercall and mapping the
> > > > required pages using vmap. This is removing the conversion from virtual
> > > > to physical address during the context switch which is solving the
> > > > problem with KPTI.
> > > > This is done only on arm architecture, the behaviour on x86 is not
> > > > modified by this patch and the address conversion is done as before
> > > > during each context switch.
> > > > This is introducing several limitations in comparison to the previous
> > > > behaviour (on arm only):
> > > > - if the guest is remapping the area at a different physical address Xen
> > > > will continue to update the area at the previous physical address. As
> > > > the area is in kernel space and usually defined as a global variable
> > > > this
> > > > is something which is believed not to happen. If this is required by a
> > > > guest, it will have to call the hypercall with the new area (even if it
> > > > is at the same virtual address).
> > > > - the area needs to be mapped during the hypercall. For the same reasons
> > > > as for the previous case, even if the area is registered for a different
> > > > vcpu. It is believed that registering an area using a virtual address
> > > > unmapped is not something done.
> > > 
> > > Beside me thinking that an in-use and stable ABI can't be changed like
> > > this, no matter what is "believed" kernel code may or may not do, I
> > > also don't think having arch-es diverge in behavior here is a good
> > > idea. Use of commonly available interfaces shouldn't lead to head
> > > aches or surprises when porting code from one arch to another. I'm
> > > pretty sure it was suggested before: Why don't you simply introduce
> > > a physical address based hypercall (and then also on x86 at the same
> > > time, keeping functional parity)? I even seem to recall giving a
> > > suggestion how to fit this into a future "physical addresses only"
> > > model, as long as we can settle on the basic principles of that
> > > conversion path that we want to go sooner or later anyway (as I
> > > understand).
> > 
> > I fully agree with the “physical address only” model and i think it must be
> > done. Introducing a new hypercall taking a physical address as parameter
> > is the long term solution (and I would even volunteer to do it in a new
> > patchset).
> > But this would not solve the issue here unless linux is modified.
> > So I do see this patch as a “bug fix”.
> 
> Well, it is sort of implied by my previous reply that we won't get away
> without an OS side change here. The prereq to get away without would be
> that it is okay to change the behavior of a hypercall like you do, and
> that it is okay to make the behavior diverge between arch-es. I think
> I've made pretty clear that I don't think either is really an option.

Hi Jan,

This is a difficult problem to solve and the current situation honestly
sucks: there is no way to solve the problem without making compromises.

The new hypercall is good-to-have in any case (it is a better interface)
but it is not a full solution.  If we introduce a new hypercall we fix
new guests but don't fix existing guests. If we change Linux in any way,
we are still going to have problems with all already-released kernel
binaries. Leaving the issue unfixed is not an option either because the
problem can't be ignored.

There is no simple way out of this situation.


After careful considerations and many discussions (most of them on
xen-devel [1][2]) we came to the realization that changing the behavior
for this hypercall on ARM is the best we can do:

1) it solves the problem for existing guests
2) it has no ill effects as far as we know


Are we happy about it? We are not. It is the best we can do given the
constraints. The rule to never change hypercalls is a good one to have
and I fully support it, but in this unique circumstance it is best to
make an exception.

The idea is to minimize the impact to x86 so that's why Bertrand's patch
is introducing a divergence between Arm and x86 on this hypercall.


In summary, please consider this patch together with its context. This
has been the only time to my memory when we had to do this -- it is
certainly a unique situation.


[1] https://marc.info/?l=xen-devel&m=158946660216159
[2] https://marc.info/?l=xen-devel&m=159067967432381