kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* nvmx: get/set_nested_state ignores VM_EXIT_INSTRUCTION_LEN
@ 2019-07-21 17:40 Jan Kiszka
  2019-07-21 19:14 ` Paolo Bonzini
  0 siblings, 1 reply; 4+ messages in thread
From: Jan Kiszka @ 2019-07-21 17:40 UTC (permalink / raw)
  To: Paolo Bonzini, Liran Alon, kvm; +Cc: Jim Mattson, KarimAllah Ahmed

Hi all,

made some progress understanding why vmport from L2 breaks since QEMU gets/sets
the nested state around it: We do not preserve VM_EXIT_INSTRUCTION_LEN, and that
breaks skip_emulated_instruction when completing the PIO access on next run. The
field is suddenly 0, and so we loop infinitely over the IO instruction. Unless
some other magic prevents migration while an IO instruction is in flight, vmport
may not be the only victim here.

Now the question is how to preserve that information: Can we restore the value
into vmcs02 on set_nested_state, despite this field being read-only? Or do we
need to cache its content and use that instead in skip_emulated_instruction?

Looking at this pattern, I wonder if there is more. What other fields are used
across PIO or MMIO when the handling is done by userland?

Jan

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: nvmx: get/set_nested_state ignores VM_EXIT_INSTRUCTION_LEN
  2019-07-21 17:40 nvmx: get/set_nested_state ignores VM_EXIT_INSTRUCTION_LEN Jan Kiszka
@ 2019-07-21 19:14 ` Paolo Bonzini
  2019-07-21 19:31   ` Jan Kiszka
  0 siblings, 1 reply; 4+ messages in thread
From: Paolo Bonzini @ 2019-07-21 19:14 UTC (permalink / raw)
  To: Jan Kiszka, Liran Alon, kvm; +Cc: Jim Mattson, KarimAllah Ahmed

On 21/07/19 19:40, Jan Kiszka wrote:
> Hi all,
> 
> made some progress understanding why vmport from L2 breaks since QEMU gets/sets
> the nested state around it: We do not preserve VM_EXIT_INSTRUCTION_LEN, and that
> breaks skip_emulated_instruction when completing the PIO access on next run. The
> field is suddenly 0, and so we loop infinitely over the IO instruction. Unless
> some other magic prevents migration while an IO instruction is in flight, vmport
> may not be the only victim here.
> 
> Now the question is how to preserve that information: Can we restore the value
> into vmcs02 on set_nested_state, despite this field being read-only? Or do we
> need to cache its content and use that instead in skip_emulated_instruction?

Hmm I think technically this is invalid, since you're not supposed to
modify state at all while MMIO is pending.  Of course that's kinda moot
if it's the only way to emulate vmport, but perhaps we can (or even
should!) fix it in QEMU.  Is KVM_SET_NESTED_STATE needed for level <
KVM_PUT_RESET_STATE?  Even if it is, we should first complete I/O and
then do kvm_arch_put_registers.

> Looking at this pattern, I wonder if there is more. What other fields are used
> across PIO or MMIO when the handling is done by userland?


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: nvmx: get/set_nested_state ignores VM_EXIT_INSTRUCTION_LEN
  2019-07-21 19:14 ` Paolo Bonzini
@ 2019-07-21 19:31   ` Jan Kiszka
  2019-07-21 20:24     ` Paolo Bonzini
  0 siblings, 1 reply; 4+ messages in thread
From: Jan Kiszka @ 2019-07-21 19:31 UTC (permalink / raw)
  To: Paolo Bonzini, Liran Alon, kvm; +Cc: Jim Mattson, KarimAllah Ahmed

On 21.07.19 21:14, Paolo Bonzini wrote:
> On 21/07/19 19:40, Jan Kiszka wrote:
>> Hi all,
>>
>> made some progress understanding why vmport from L2 breaks since QEMU gets/sets
>> the nested state around it: We do not preserve VM_EXIT_INSTRUCTION_LEN, and that
>> breaks skip_emulated_instruction when completing the PIO access on next run. The
>> field is suddenly 0, and so we loop infinitely over the IO instruction. Unless
>> some other magic prevents migration while an IO instruction is in flight, vmport
>> may not be the only victim here.
>>
>> Now the question is how to preserve that information: Can we restore the value
>> into vmcs02 on set_nested_state, despite this field being read-only? Or do we
>> need to cache its content and use that instead in skip_emulated_instruction?
>
> Hmm I think technically this is invalid, since you're not supposed to
> modify state at all while MMIO is pending.  Of course that's kinda moot
> if it's the only way to emulate vmport, but perhaps we can (or even
> should!) fix it in QEMU.  Is KVM_SET_NESTED_STATE needed for level <
> KVM_PUT_RESET_STATE?  Even if it is, we should first complete I/O and
> then do kvm_arch_put_registers.

Are we sure that vmport is the only case? What prevents a migration from
starting in the middle of an IO exit?

And I suspect we need to enhance the API specification then.

Jan

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: nvmx: get/set_nested_state ignores VM_EXIT_INSTRUCTION_LEN
  2019-07-21 19:31   ` Jan Kiszka
@ 2019-07-21 20:24     ` Paolo Bonzini
  0 siblings, 0 replies; 4+ messages in thread
From: Paolo Bonzini @ 2019-07-21 20:24 UTC (permalink / raw)
  To: Jan Kiszka, Liran Alon, kvm; +Cc: Jim Mattson, KarimAllah Ahmed

On 21/07/19 21:31, Jan Kiszka wrote:
> On 21.07.19 21:14, Paolo Bonzini wrote:
>> On 21/07/19 19:40, Jan Kiszka wrote:
>>> Hi all,
>>>
>>> made some progress understanding why vmport from L2 breaks since QEMU gets/sets
>>> the nested state around it: We do not preserve VM_EXIT_INSTRUCTION_LEN, and that
>>> breaks skip_emulated_instruction when completing the PIO access on next run. The
>>> field is suddenly 0, and so we loop infinitely over the IO instruction. Unless
>>> some other magic prevents migration while an IO instruction is in flight, vmport
>>> may not be the only victim here.
>>>
>>> Now the question is how to preserve that information: Can we restore the value
>>> into vmcs02 on set_nested_state, despite this field being read-only? Or do we
>>> need to cache its content and use that instead in skip_emulated_instruction?
>>
>> Hmm I think technically this is invalid, since you're not supposed to
>> modify state at all while MMIO is pending.  Of course that's kinda moot
>> if it's the only way to emulate vmport, but perhaps we can (or even
>> should!) fix it in QEMU.  Is KVM_SET_NESTED_STATE needed for level <
>> KVM_PUT_RESET_STATE?  Even if it is, we should first complete I/O and
>> then do kvm_arch_put_registers.
> 
> Are we sure that vmport is the only case? What prevents a migration from
> starting in the middle of an IO exit?

Migration syncs with the CPU thread via pause_all_vcpus(), after which
the CPU thread must be in qemu_wait_io_event.  KVM_SET_NESTED_STATE
should definitely not be part of KVM_PUT_RUNTIME_STATE.

Paolo

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-07-21 20:24 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-21 17:40 nvmx: get/set_nested_state ignores VM_EXIT_INSTRUCTION_LEN Jan Kiszka
2019-07-21 19:14 ` Paolo Bonzini
2019-07-21 19:31   ` Jan Kiszka
2019-07-21 20:24     ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).