* [PATCH v2] docs/virt/kvm: Document running nested guests
@ 2020-04-20 11:17 Kashyap Chamarthy
2020-04-21 10:35 ` Paolo Bonzini
2020-04-22 8:56 ` Cornelia Huck
0 siblings, 2 replies; 6+ messages in thread
From: Kashyap Chamarthy @ 2020-04-20 11:17 UTC (permalink / raw)
To: kvm; +Cc: pbonzini, cohuck, dgilbert, vkuznets, kchamart
This is a rewrite of this[1] Wiki page with further enhancements. The
doc also includes a section on debugging problems in nested
environments.
[1] https://www.linux-kvm.org/page/Nested_Guests
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
---
v1 is here: https://marc.info/?l=kvm&m=158108941605311&w=2
In v2:
- Address Cornelia's feedback from v1:
https://marc.info/?l=kvm&m=158109042605606&w=2
- Address Dave's feedback from v1:
https://marc.info/?l=kvm&m=158109134905930&w=2
---
.../virt/kvm/running-nested-guests.rst | 275 ++++++++++++++++++
1 file changed, 275 insertions(+)
create mode 100644 Documentation/virt/kvm/running-nested-guests.rst
diff --git a/Documentation/virt/kvm/running-nested-guests.rst b/Documentation/virt/kvm/running-nested-guests.rst
new file mode 100644
index 0000000000000000000000000000000000000000..c6c9ccfa0c00e3cbfd65782ceae962b7ef52b34b
--- /dev/null
+++ b/Documentation/virt/kvm/running-nested-guests.rst
@@ -0,0 +1,275 @@
+==============================
+Running nested guests with KVM
+==============================
+
+A nested guest is the ability to run a guest inside another guest (it
+can be KVM-based or a different hypervisor). The straightforward
+example is a KVM guest that in turn runs on KVM a guest (the rest of
+this document is built on this example)::
+
+ .----------------. .----------------.
+ | | | |
+ | L2 | | L2 |
+ | (Nested Guest) | | (Nested Guest) |
+ | | | |
+ |----------------'--'----------------|
+ | |
+ | L1 (Guest Hypervisor) |
+ | KVM (/dev/kvm) |
+ | |
+ .------------------------------------------------------.
+ | L0 (Host Hypervisor) |
+ | KVM (/dev/kvm) |
+ |------------------------------------------------------|
+ | Hardware (with virtualization extensions) |
+ '------------------------------------------------------'
+
+Terminology:
+
+- L0 – level-0; the bare metal host, running KVM
+
+- L1 – level-1 guest; a VM running on L0; also called the "guest
+ hypervisor", as it itself is capable of running KVM.
+
+- L2 – level-2 guest; a VM running on L1, this is the "nested guest"
+
+.. note:: The above diagram is modelled after x86 architecture; s390x,
+ ppc64 and other architectures are likely to have different
+ design for nesting.
+
+ For example, s390x has an additional layer, called "LPAR
+ hypervisor" (Logical PARtition) on the baremetal, resulting in
+ "four levels" in a nested setup — L0 (bare metal, running the
+ LPAR hypervisor), L1 (host hypervisor), L2 (guest hypervisor),
+ L3 (nested guest).
+
+ This document will stick with the three-level terminology (L0,
+ L1, and L2) for all architectures; and will largely focus on
+ x86.
+
+
+Use Cases
+---------
+
+There are several scenarios where nested KVM can be useful, to name a
+few:
+
+- As a developer, you want to test your software on different operating
+ systems (OSes). Instead of renting multiple VMs from a Cloud
+ Provider, using nested KVM lets you rent a large enough "guest
+ hypervisor" (level-1 guest). This in turn allows you to create
+ multiple nested guests (level-2 guests), running different OSes, on
+ which you can develop and test your software.
+
+- Live migration of "guest hypervisors" and their nested guests, for
+ load balancing, disaster recovery, etc.
+
+- VM image creation tools (e.g. ``virt-install``) often run
+ their own VM, and users expect these to work inside a VM.
+
+- Some OSes use virtualization internally for security (e.g. to let
+ applications run safely in isolation).
+
+
+Enabling "nested" (x86)
+-----------------------
+
+From Linux kernel v4.19 onwards, the ``nested`` KVM parameter is enabled
+by default for Intel x86, but *not* for AMD. (Though your Linux
+distribution might override this default.)
+
+In case you are running a Linux kernel older than v4.19, to enable
+nesting, set the ``nested`` KVM module parameter to ``Y`` or ``1``. To
+persist this setting across reboots, you can add it in a config file, as
+shown below:
+
+1. On the bare metal host (L0), list the kernel modules and ensure that
+ the KVM modules are loaded::
+
+ $ lsmod | grep -i kvm
+ kvm_intel 133627 0
+ kvm 435079 1 kvm_intel
+
+2. Show information for ``kvm_intel`` module::
+
+ $ modinfo kvm_intel | grep -i nested
+ parm: nested:bool
+
+3. For the nested KVM configuration to persist across reboots, place the
+ below in ``/etc/modprobe.d/kvm_intel.conf`` (create the file if it
+ doesn't exist)::
+
+ $ cat /etc/modprobe.d/kvm_intel.conf
+ options kvm-intel nested=y
+
+4. Unload and re-load the KVM Intel module::
+
+ $ sudo rmmod kvm-intel
+ $ sudo modprobe kvm-intel
+
+5. Verify if the ``nested`` parameter for KVM is enabled::
+
+ $ cat /sys/module/kvm_intel/parameters/nested
+ Y
+
+For AMD hosts, the process is the same as above, except that the module
+name is ``kvm-amd``.
+
+
+Additional nested-related kernel parameters (x86)
+-------------------------------------------------
+
+If your hardware is sufficiently advanced (Intel Haswell processor or
+above which has newer hardware virt extensions), you might want to
+enable additional features: "Shadow VMCS (Virtual Machine Control
+Structure)", APIC Virtualization on your bare metal host (L0).
+Parameters for Intel hosts::
+
+ $ cat /sys/module/kvm_intel/parameters/enable_shadow_vmcs
+ Y
+
+ $ cat /sys/module/kvm_intel/parameters/enable_apicv
+ N
+
+ $ cat /sys/module/kvm_intel/parameters/ept
+ Y
+
+Again, to persist the above values across reboot, append them to
+``/etc/modprobe.d/kvm_intel.conf``::
+
+ options kvm-intel nested=y
+ options kvm-intel enable_shadow_vmcs=y
+ options kvm-intel enable_apicv=y
+ options kvm-intel ept=y
+
+.. note:: Depending on the hardware and kernel versions, some of the
+ above might be automatically enabled; so check before you do
+ the above.
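
One way to follow the note above is to look at what the running kernel
reports before changing anything. The following is a minimal sketch and not
part of the original patch: the function name ``show_kvm_nested_params`` and
the ``KVM_SYSFS_ROOT`` override are made up for illustration, while the
parameter names are the ones shown in this section. It assumes the AMD module
exposes ``nested`` under ``/sys/module/kvm_amd``, mirroring the Intel path.

```shell
# Hypothetical helper: print the nesting-related KVM module parameters that
# exist on this host. KVM_SYSFS_ROOT is an illustrative override, not a kernel
# or KVM setting; it defaults to the real sysfs location.
show_kvm_nested_params() {
    root="${KVM_SYSFS_ROOT:-/sys/module}"
    for entry in kvm_intel/nested kvm_intel/ept kvm_intel/enable_shadow_vmcs \
                 kvm_intel/enable_apicv kvm_amd/nested; do
        mod="${entry%%/*}"      # module name, e.g. kvm_intel
        param="${entry##*/}"    # parameter name, e.g. nested
        file="$root/$mod/parameters/$param"
        # Parameters that this kernel or CPU vendor does not provide are skipped.
        if [ -r "$file" ]; then
            echo "$mod $param=$(cat "$file")"
        fi
    done
}

show_kvm_nested_params
```

On a host without a loaded KVM vendor module the function prints nothing,
which is itself a hint that the modules from the earlier steps are missing.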
+
+
+Starting a nested guest (x86)
+-----------------------------
+
+Once your bare metal host (L0) is configured for nesting, you should be
+able to start an L1 guest with::
+
+ $ qemu-kvm -cpu host [...]
+
+The above will pass through the host CPU's capabilities as-is to the
+guest; or, for better live migration compatibility, use a named CPU
+model supported by QEMU, e.g.::
+
+ $ qemu-kvm -cpu Haswell-noTSX-IBRS,vmx=on
+
+then the guest hypervisor will subsequently be capable of running a
+nested guest with accelerated KVM.
+
+
+Enabling "nested" (s390x)
+-------------------------
+
+1. On the host hypervisor (L0), enable the ``nested`` parameter on
+ s390x::
+
+ $ rmmod kvm
+ $ modprobe kvm nested=1
+
+.. note:: On s390x, the kernel parameter ``hpage`` parameter is mutually
+ exclusive with the ``nested`` parameter; i.e. to have
+ ``nested`` enabled you _must_ disable the ``hpage`` parameter.
+
+2. The guest hypervisor (L1) must be allowed to have ``sie`` CPU
+ feature — with QEMU, this is possible by using "host passthrough"
+ (via the command-line ``-cpu host``).
+
+3. Now the KVM module can be enabled in the L1 (guest hypervisor)::
+
+ $ modprobe kvm
+
+
+Live migration with nested KVM
+------------------------------
+
+The below live migration scenarios should work as of Linux kernel 5.3
+and QEMU 4.2.0. In all the below cases, L1 exposes ``/dev/kvm`` in
+it, i.e. the L2 guest is a "KVM-accelerated guest", not a "plain
+emulated guest" (as done by QEMU's TCG).
+
+- Migrating a nested guest (L2) to another L1 guest on the *same* bare
+ metal host.
+
+- Migrating a nested guest (L2) to another L1 guest on a *different*
+ bare metal host.
+
+- Migrating an L1 guest, with an *offline* nested guest in it, to
+ another bare metal host.
+
+- Migrating an L1 guest, with a *live* nested guest in it, to another
+ bare metal host.
+
+Limitations on Linux kernel versions older than 5.3
+---------------------------------------------------
+
+On x86 systems-only (as this does *not* apply for s390x):
+
+On Linux kernel versions older than 5.3, once an L1 guest has started an
+L2 guest, the L1 guest would no longer be capable of being migrated, saved,
+or loaded (refer to QEMU documentation on "save"/"load") until the L2
+guest shuts down.
+
+Attempting to migrate or save-and-load an L1 guest while an L2 guest is
+running will result in undefined behavior. You might see a ``kernel
+BUG!`` entry in ``dmesg``, a kernel 'oops', or an outright kernel panic.
+Such a migrated or loaded L1 guest can no longer be considered stable or
+secure, and must be restarted.
+
+Migrating an L1 guest merely configured to support nesting, while not
+actually running L2 guests, is expected to function normally.
+Live-migrating an L2 guest from one L1 guest to another is also expected
+to succeed.
+
+Reporting bugs from "nested" setups
+-----------------------------------
+
+(This is written with x86 terminology in mind, but similar should apply
+for other architectures.)
+
+Debugging "nested" problems can involve sifting through log files across
+L0, L1 and L2; this can result in tedious back-n-forth between the bug
+reporter and the bug fixer.
+
+- Mention that you are in a "nested" setup. If you are running any kind
+ of "nesting" at all, say so. Unfortunately, this needs to be called
+ out because when reporting bugs, people tend to forget to even
+ *mention* that they're using nested virtualization.
+
+- Ensure you are actually running KVM on KVM. Sometimes people do not
+ have KVM enabled for their guest hypervisor (L1), which results in
+ them running with pure emulation, or what QEMU calls "TCG", while
+ they think they're running nested KVM. This confuses "nested virt"
+ (which could also mean QEMU on KVM) with "nested KVM" (KVM on KVM).
+
+- What information to collect? The following; it's not an exhaustive
+ list, but a very good starting point:
+
+ - Kernel, libvirt, and QEMU version from L0
+
+ - Kernel, libvirt and QEMU version from L1
+
+ - QEMU command-line of L1 -- preferably full log from
+ ``/var/log/libvirt/qemu/instance.log``
+
+ - QEMU command-line of L2 -- preferably full log from
+ ``/var/log/libvirt/qemu/instance.log``
+
+ - Full ``dmesg`` output from L0
+
+ - Full ``dmesg`` output from L1
+
+ - Output of: ``x86info -a`` (& ``lscpu``) from L0
+
+ - Output of: ``x86info -a`` (& ``lscpu``) from L1
+
+ - Output of: ``dmidecode`` from L0
+
+ - Output of: ``dmidecode`` from L1
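
The list above could be gathered in one go with a small script. This is a
rough sketch under stated assumptions, not something the patch proposes: the
function name ``collect_nested_report`` and the output file names are
hypothetical, and the QEMU binary may be called something else on your
distribution. Tools that are not installed are skipped, and the libvirt log
mentioned above still has to be attached by hand.

```shell
# Hypothetical helper: gather the items from the list above into a directory.
# Usage: collect_nested_report LEVEL [OUTDIR], once on L0 and once inside L1.
collect_nested_report() {
    level="$1"
    outdir="${2:-nested-report-$level}"
    mkdir -p "$outdir" || return 1

    uname -r > "$outdir/kernel-version.txt"

    # Binary names differ between distributions; record whichever exist.
    for tool in virsh qemu-system-x86_64 qemu-kvm; do
        if command -v "$tool" >/dev/null 2>&1; then
            "$tool" --version > "$outdir/$tool-version.txt" 2>&1
        fi
    done

    # dmesg and dmidecode usually need root; note failures instead of aborting.
    dmesg > "$outdir/dmesg.txt" 2>&1 || echo "dmesg failed" >> "$outdir/errors.txt"
    for cmd in "lscpu" "x86info -a" "dmidecode"; do
        name="${cmd%% *}"
        if command -v "$name" >/dev/null 2>&1; then
            # $cmd is left unquoted on purpose so that "x86info -a" splits into
            # command and argument.
            $cmd > "$outdir/$name.txt" 2>&1 || echo "$cmd failed" >> "$outdir/errors.txt"
        fi
    done
    return 0
}
```

Running ``collect_nested_report L0`` on the bare metal host and
``collect_nested_report L1`` inside the guest hypervisor keeps the two sets of
output apart, which helps avoid mixing up the levels in a bug report.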
--
2.21.1
* Re: [PATCH v2] docs/virt/kvm: Document running nested guests
2020-04-20 11:17 [PATCH v2] docs/virt/kvm: Document running nested guests Kashyap Chamarthy
@ 2020-04-21 10:35 ` Paolo Bonzini
2020-04-27 10:14 ` Kashyap Chamarthy
2020-04-22 8:56 ` Cornelia Huck
1 sibling, 1 reply; 6+ messages in thread
From: Paolo Bonzini @ 2020-04-21 10:35 UTC (permalink / raw)
To: Kashyap Chamarthy, kvm; +Cc: cohuck, dgilbert, vkuznets
Mostly looks good except for kernel parameters:
On 20/04/20 13:17, Kashyap Chamarthy wrote:
> +Enabling "nested" (x86)
> +-----------------------
> +
> +From Linux kernel v4.19 onwards, the ``nested`` KVM parameter is enabled
> +by default for Intel x86, but *not* for AMD. (Though your Linux
> +distribution might override this default.)
It is enabled for AMD as well.
>
> +
> +If your hardware is sufficiently advanced (Intel Haswell processor or
> +above which has newer hardware virt extensions), you might want to
> +enable additional features: "Shadow VMCS (Virtual Machine Control
> +Structure)", APIC Virtualization on your bare metal host (L0).
> +Parameters for Intel hosts::
> +
> + $ cat /sys/module/kvm_intel/parameters/enable_shadow_vmcs
> + Y
> +
> + $ cat /sys/module/kvm_intel/parameters/enable_apicv
> + N
> +
> + $ cat /sys/module/kvm_intel/parameters/ept
> + Y
These are enabled by default if you have them, on all kernel versions.
So you may instead tell people to check them (especially
enable_shadow_vmcs and ept) if their L2 guests run slower.
>
> +Starting a nested guest (x86)
> +-----------------------------
> +
> +Once your bare metal host (L0) is configured for nesting, you should be
> +able to start an L1 guest with::
> +
> + $ qemu-kvm -cpu host [...]
> +
> +The above will pass through the host CPU's capabilities as-is to the
> +guest; or, for better live migration compatibility, use a named CPU
> +model supported by QEMU, e.g.::
> +
> + $ qemu-kvm -cpu Haswell-noTSX-IBRS,vmx=on
> +
> +then the guest hypervisor will subsequently be capable of running a
> +nested guest with accelerated KVM.
> +
The latter is only on QEMU 4.2 and newer. Also, you should group by
architecture and use third-level headings within an architecture.
Paolo
* Re: [PATCH v2] docs/virt/kvm: Document running nested guests
2020-04-20 11:17 [PATCH v2] docs/virt/kvm: Document running nested guests Kashyap Chamarthy
2020-04-21 10:35 ` Paolo Bonzini
@ 2020-04-22 8:56 ` Cornelia Huck
2020-04-27 15:22 ` Kashyap Chamarthy
1 sibling, 1 reply; 6+ messages in thread
From: Cornelia Huck @ 2020-04-22 8:56 UTC (permalink / raw)
To: Kashyap Chamarthy; +Cc: kvm, pbonzini, dgilbert, vkuznets
On Mon, 20 Apr 2020 13:17:55 +0200
Kashyap Chamarthy <kchamart@redhat.com> wrote:
> This is a rewrite of this[1] Wiki page with further enhancements. The
> doc also includes a section on debugging problems in nested
> environments.
>
> [1] https://www.linux-kvm.org/page/Nested_Guests
>
> Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
> ---
> v1 is here: https://marc.info/?l=kvm&m=158108941605311&w=2
>
> In v2:
> - Address Cornelia's feedback from v1:
> https://marc.info/?l=kvm&m=158109042605606&w=2
> - Address Dave's feedback from v1:
> https://marc.info/?l=kvm&m=158109134905930&w=2
> ---
> .../virt/kvm/running-nested-guests.rst | 275 ++++++++++++++++++
> 1 file changed, 275 insertions(+)
> create mode 100644 Documentation/virt/kvm/running-nested-guests.rst
>
> diff --git a/Documentation/virt/kvm/running-nested-guests.rst b/Documentation/virt/kvm/running-nested-guests.rst
> new file mode 100644
> index 0000000000000000000000000000000000000000..c6c9ccfa0c00e3cbfd65782ceae962b7ef52b34b
> --- /dev/null
> +++ b/Documentation/virt/kvm/running-nested-guests.rst
> @@ -0,0 +1,275 @@
> +==============================
> +Running nested guests with KVM
> +==============================
> +
> +A nested guest is the ability to run a guest inside another guest (it
> +can be KVM-based or a different hypervisor). The straightforward
> +example is a KVM guest that in turn runs on KVM a guest (the rest of
s/on KVM a guest/on a KVM guest/
> +this document is built on this example)::
> +
> + .----------------. .----------------.
> + | | | |
> + | L2 | | L2 |
> + | (Nested Guest) | | (Nested Guest) |
> + | | | |
> + |----------------'--'----------------|
> + | |
> + | L1 (Guest Hypervisor) |
> + | KVM (/dev/kvm) |
> + | |
> + .------------------------------------------------------.
> + | L0 (Host Hypervisor) |
> + | KVM (/dev/kvm) |
> + |------------------------------------------------------|
> + | Hardware (with virtualization extensions) |
> + '------------------------------------------------------'
> +
> +Terminology:
> +
> +- L0 – level-0; the bare metal host, running KVM
> +
> +- L1 – level-1 guest; a VM running on L0; also called the "guest
> + hypervisor", as it itself is capable of running KVM.
> +
> +- L2 – level-2 guest; a VM running on L1, this is the "nested guest"
> +
> +.. note:: The above diagram is modelled after x86 architecture; s390x,
s/x86 architecture/the x86 architecture/
> + ppc64 and other architectures are likely to have different
s/to have/to have a/
> + design for nesting.
> +
> + For example, s390x has an additional layer, called "LPAR
> + hypervisor" (Logical PARtition) on the baremetal, resulting in
> + "four levels" in a nested setup — L0 (bare metal, running the
> + LPAR hypervisor), L1 (host hypervisor), L2 (guest hypervisor),
> + L3 (nested guest).
What about:
"For example, s390x always has an LPAR (Logical PARtition) hypervisor
running on bare metal, adding another layer and resulting in at least
four levels in a nested setup..."
> +
> + This document will stick with the three-level terminology (L0,
> + L1, and L2) for all architectures; and will largely focus on
> + x86.
> +
> +
(...)
> +Enabling "nested" (s390x)
> +-------------------------
> +
> +1. On the host hypervisor (L0), enable the ``nested`` parameter on
> + s390x::
> +
> + $ rmmod kvm
> + $ modprobe kvm nested=1
> +
> +.. note:: On s390x, the kernel parameter ``hpage`` parameter is mutually
Drop one of the "parameter"?
> + exclusive with the ``nested`` parameter; i.e. to have
> + ``nested`` enabled you _must_ disable the ``hpage`` parameter.
"i.e., in order to be able to enable ``nested``, the ``hpage``
parameter _must_ be disabled."
?
> +
> +2. The guest hypervisor (L1) must be allowed to have ``sie`` CPU
"must be provided with" ?
> + feature — with QEMU, this is possible by using "host passthrough"
s/this is possible by/this can be done by e.g./ ?
> + (via the command-line ``-cpu host``).
> +
> +3. Now the KVM module can be enabled in the L1 (guest hypervisor)::
s/enabled/loaded/
> +
> + $ modprobe kvm
> +
> +
> +Live migration with nested KVM
> +------------------------------
> +
> +The below live migration scenarios should work as of Linux kernel 5.3
> +and QEMU 4.2.0. In all the below cases, L1 exposes ``/dev/kvm`` in
> +it, i.e. the L2 guest is a "KVM-accelerated guest", not a "plain
> +emulated guest" (as done by QEMU's TCG).
The 5.3/4.2 versions likely apply to x86? Should work for s390x as well
as of these versions, but should have worked earlier already :)
> +
> +- Migrating a nested guest (L2) to another L1 guest on the *same* bare
> + metal host.
> +
> +- Migrating a nested guest (L2) to another L1 guest on a *different*
> + bare metal host.
> +
> +- Migrating an L1 guest, with an *offline* nested guest in it, to
> + another bare metal host.
> +
> +- Migrating an L1 guest, with a *live* nested guest in it, to another
> + bare metal host.
> +
> +Limitations on Linux kernel versions older than 5.3
> +---------------------------------------------------
> +
> +On x86 systems-only (as this does *not* apply for s390x):
Add a "x86" marker? Or better yet, group all the x86 stuff in an x86
section?
> +
> +On Linux kernel versions older than 5.3, once an L1 guest has started an
> +L2 guest, the L1 guest would no longer be capable of being migrated, saved,
> +or loaded (refer to QEMU documentation on "save"/"load") until the L2
> +guest shuts down.
> +
> +Attempting to migrate or save-and-load an L1 guest while an L2 guest is
> +running will result in undefined behavior. You might see a ``kernel
> +BUG!`` entry in ``dmesg``, a kernel 'oops', or an outright kernel panic.
> +Such a migrated or loaded L1 guest can no longer be considered stable or
> +secure, and must be restarted.
> +
> +Migrating an L1 guest merely configured to support nesting, while not
> +actually running L2 guests, is expected to function normally.
> +Live-migrating an L2 guest from one L1 guest to another is also expected
> +to succeed.
> +
> +Reporting bugs from "nested" setups
> +-----------------------------------
> +
> +(This is written with x86 terminology in mind, but similar should apply
> +for other architectures.)
Better to reorder it a bit (see below).
> +
> +Debugging "nested" problems can involve sifting through log files across
> +L0, L1 and L2; this can result in tedious back-n-forth between the bug
> +reporter and the bug fixer.
> +
> +- Mention that you are in a "nested" setup. If you are running any kind
> + of "nesting" at all, say so. Unfortunately, this needs to be called
> + out because when reporting bugs, people tend to forget to even
> + *mention* that they're using nested virtualization.
> +
> +- Ensure you are actually running KVM on KVM. Sometimes people do not
> + have KVM enabled for their guest hypervisor (L1), which results in
> + them running with pure emulation, or what QEMU calls "TCG", while
> + they think they're running nested KVM. This confuses "nested virt"
> + (which could also mean QEMU on KVM) with "nested KVM" (KVM on KVM).
> +
> +- What information to collect? The following; it's not an exhaustive
> + list, but a very good starting point:
> +
> + - Kernel, libvirt, and QEMU version from L0
> +
> + - Kernel, libvirt and QEMU version from L1
> +
> + - QEMU command-line of L1 -- preferably full log from
> + ``/var/log/libvirt/qemu/instance.log``
(if you are running libvirt)
> +
> + - QEMU command-line of L2 -- preferably full log from
> + ``/var/log/libvirt/qemu/instance.log``
(if you are running libvirt)
> +
> + - Full ``dmesg`` output from L0
> +
> + - Full ``dmesg`` output from L1
> +
> + - Output of: ``x86info -a`` (& ``lscpu``) from L0
> +
> + - Output of: ``x86info -a`` (& ``lscpu``) from L1
lscpu makes sense for other architectures as well.
> +
> + - Output of: ``dmidecode`` from L0
> +
> + - Output of: ``dmidecode`` from L1
This looks x86 specific? Maybe have a list of things that make sense
everywhere, and list architecture-specific stuff in specific
subsections?
* Re: [PATCH v2] docs/virt/kvm: Document running nested guests
2020-04-21 10:35 ` Paolo Bonzini
@ 2020-04-27 10:14 ` Kashyap Chamarthy
0 siblings, 0 replies; 6+ messages in thread
From: Kashyap Chamarthy @ 2020-04-27 10:14 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: kvm, cohuck, dgilbert, vkuznets
On Tue, Apr 21, 2020 at 12:35:21PM +0200, Paolo Bonzini wrote:
> Mostly looks good except for kernel parameters:
[Just noticed this; somehow the KVM e-mails, which I explicitly Cced
myself, aren't arriving in my Inbox.]
> On 20/04/20 13:17, Kashyap Chamarthy wrote:
> > +Enabling "nested" (x86)
> > +-----------------------
> > +
> > +From Linux kernel v4.19 onwards, the ``nested`` KVM parameter is enabled
> > +by default for Intel x86, but *not* for AMD. (Though your Linux
> > +distribution might override this default.)
>
> It is enabled for AMD as well.
Ah, thanks. Will correct.
> > +
> > +If your hardware is sufficiently advanced (Intel Haswell processor or
> > +above which has newer hardware virt extensions), you might want to
> > +enable additional features: "Shadow VMCS (Virtual Machine Control
> > +Structure)", APIC Virtualization on your bare metal host (L0).
> > +Parameters for Intel hosts::
> > +
> > + $ cat /sys/module/kvm_intel/parameters/enable_shadow_vmcs
> > + Y
> > +
> > + $ cat /sys/module/kvm_intel/parameters/enable_apicv
> > + N
> > +
> > + $ cat /sys/module/kvm_intel/parameters/ept
> > + Y
>
>
> These are enabled by default if you have them, on all kernel versions.
> So you may instead tell people to check them (especially
> enable_shadow_vmcs and ept) if their L2 guests run slower.
Noted, will amend.
> >
> > +Starting a nested guest (x86)
> > +-----------------------------
> > +
> > +Once your bare metal host (L0) is configured for nesting, you should be
> > +able to start an L1 guest with::
> > +
> > + $ qemu-kvm -cpu host [...]
> > +
> > +The above will pass through the host CPU's capabilities as-is to the
> > +guest; or, for better live migration compatibility, use a named CPU
> > +model supported by QEMU, e.g.::
> > +
> > + $ qemu-kvm -cpu Haswell-noTSX-IBRS,vmx=on
> > +
> > +then the guest hypervisor will subsequently be capable of running a
> > +nested guest with accelerated KVM.
> > +
>
> The latter is only on QEMU 4.2 and newer. Also, you should group by
> architecture and use third-level headings within an architecture.
Okay, will adjust the structure.
Thanks for the review.
--
/kashyap
* Re: [PATCH v2] docs/virt/kvm: Document running nested guests
2020-04-22 8:56 ` Cornelia Huck
@ 2020-04-27 15:22 ` Kashyap Chamarthy
2020-04-30 10:25 ` Cornelia Huck
0 siblings, 1 reply; 6+ messages in thread
From: Kashyap Chamarthy @ 2020-04-27 15:22 UTC (permalink / raw)
To: Cornelia Huck; +Cc: kvm, pbonzini, dgilbert, vkuznets
On Wed, Apr 22, 2020 at 10:56:18AM +0200, Cornelia Huck wrote:
> On Mon, 20 Apr 2020 13:17:55 +0200
> Kashyap Chamarthy <kchamart@redhat.com> wrote:
[Just noticed this today ... thanks for the review.]
[...]
> > +A nested guest is the ability to run a guest inside another guest (it
> > +can be KVM-based or a different hypervisor). The straightforward
> > +example is a KVM guest that in turn runs on KVM a guest (the rest of
>
> s/on KVM a guest/on a KVM guest/
Will fix in v3.
[...]
> > +Terminology:
> > +
> > +- L0 – level-0; the bare metal host, running KVM
> > +
> > +- L1 – level-1 guest; a VM running on L0; also called the "guest
> > + hypervisor", as it itself is capable of running KVM.
> > +
> > +- L2 – level-2 guest; a VM running on L1, this is the "nested guest"
> > +
> > +.. note:: The above diagram is modelled after x86 architecture; s390x,
>
> s/x86 architecture/the x86 architecture/
>
> > + ppc64 and other architectures are likely to have different
>
> s/to have/to have a/
Noted (both the above)
> > + design for nesting.
> > +
> > + For example, s390x has an additional layer, called "LPAR
> > + hypervisor" (Logical PARtition) on the baremetal, resulting in
> > + "four levels" in a nested setup — L0 (bare metal, running the
> > + LPAR hypervisor), L1 (host hypervisor), L2 (guest hypervisor),
> > + L3 (nested guest).
>
> What about:
>
> "For example, s390x always has an LPAR (Logical PARtition) hypervisor
> running on bare metal, adding another layer and resulting in at least
> four levels in a nested setup..."
Yep, reads nicer; thanks.
[...]
> > +1. On the host hypervisor (L0), enable the ``nested`` parameter on
> > + s390x::
> > +
> > + $ rmmod kvm
> > + $ modprobe kvm nested=1
> > +
> > +.. note:: On s390x, the kernel parameter ``hpage`` parameter is mutually
>
> Drop one of the "parameter"?
Will do.
> > + exclusive with the ``nested`` parameter; i.e. to have
> > + ``nested`` enabled you _must_ disable the ``hpage`` parameter.
>
> "i.e., in order to be able to enable ``nested``, the ``hpage``
> parameter _must_ be disabled."
>
> ?
Yes :)
>
> > +
> > +2. The guest hypervisor (L1) must be allowed to have ``sie`` CPU
>
> "must be provided with" ?
>
> > + feature — with QEMU, this is possible by using "host passthrough"
>
> s/this is possible by/this can be done by e.g./ ?
>
> > + (via the command-line ``-cpu host``).
> > +
> > +3. Now the KVM module can be enabled in the L1 (guest hypervisor)::
>
> s/enabled/loaded/
Will adjust the above three; thanks.
> > +
> > + $ modprobe kvm
> > +
> > +
> > +Live migration with nested KVM
> > +------------------------------
> > +
> > +The below live migration scenarios should work as of Linux kernel 5.3
> > +and QEMU 4.2.0. In all the below cases, L1 exposes ``/dev/kvm`` in
> > +it, i.e. the L2 guest is a "KVM-accelerated guest", not a "plain
> > +emulated guest" (as done by QEMU's TCG).
>
> The 5.3/4.2 versions likely apply to x86? Should work for s390x as well
> as of these versions, but should have worked earlier already :)
Heh, I'll specify the x86-ness of those versions :-)
> > +
> > +- Migrating a nested guest (L2) to another L1 guest on the *same* bare
> > + metal host.
> > +
> > +- Migrating a nested guest (L2) to another L1 guest on a *different*
> > + bare metal host.
> > +
> > +- Migrating an L1 guest, with an *offline* nested guest in it, to
> > + another bare metal host.
> > +
> > +- Migrating an L1 guest, with a *live* nested guest in it, to another
> > + bare metal host.
> > +
> > +Limitations on Linux kernel versions older than 5.3
> > +---------------------------------------------------
> > +
> > +On x86 systems-only (as this does *not* apply for s390x):
>
> Add a "x86" marker? Or better yet, group all the x86 stuff in an x86
> section?
Right, forgot here, will do.
[...]
> > +Reporting bugs from "nested" setups
> > +-----------------------------------
> > +
> > +(This is written with x86 terminology in mind, but similar should apply
> > +for other architectures.)
>
> Better to reorder it a bit (see below).
[...]
> > + - Kernel, libvirt, and QEMU version from L0
> > +
> > + - Kernel, libvirt and QEMU version from L1
> > +
> > + - QEMU command-line of L1 -- preferably full log from
> > + ``/var/log/libvirt/qemu/instance.log``
>
> (if you are running libvirt)
>
> > +
> > + - QEMU command-line of L2 -- preferably full log from
> > + ``/var/log/libvirt/qemu/instance.log``
>
> (if you are running libvirt)
Yes, I'll mention that bit. (I'm just too used to reports coming from
libvirt users :-))
> > +
> > + - Full ``dmesg`` output from L0
> > +
> > + - Full ``dmesg`` output from L1
> > +
> > + - Output of: ``x86info -a`` (& ``lscpu``) from L0
> > +
> > + - Output of: ``x86info -a`` (& ``lscpu``) from L1
>
> lscpu makes sense for other architectures as well.
Noted.
> > +
> > + - Output of: ``dmidecode`` from L0
> > +
> > + - Output of: ``dmidecode`` from L1
>
> This looks x86 specific? Maybe have a list of things that make sense
> everywhere, and list architecture-specific stuff in specific
> subsections?
Can do. Do you have any other specific debugging bits to look out for
s390x or any other arch?
Thanks for the careful review. Much appreciate it :-)
--
/kashyap
* Re: [PATCH v2] docs/virt/kvm: Document running nested guests
2020-04-27 15:22 ` Kashyap Chamarthy
@ 2020-04-30 10:25 ` Cornelia Huck
0 siblings, 0 replies; 6+ messages in thread
From: Cornelia Huck @ 2020-04-30 10:25 UTC (permalink / raw)
To: Kashyap Chamarthy; +Cc: kvm, pbonzini, dgilbert, vkuznets
On Mon, 27 Apr 2020 17:22:49 +0200
Kashyap Chamarthy <kchamart@redhat.com> wrote:
> On Wed, Apr 22, 2020 at 10:56:18AM +0200, Cornelia Huck wrote:
> > On Mon, 20 Apr 2020 13:17:55 +0200
> > Kashyap Chamarthy <kchamart@redhat.com> wrote:
> > > +
> > > + - Output of: ``dmidecode`` from L0
> > > +
> > > + - Output of: ``dmidecode`` from L1
> >
> > This looks x86 specific? Maybe have a list of things that make sense
> > everywhere, and list architecture-specific stuff in specific
> > subsections?
>
> Can do. Do you have any other specific debugging bits to look out for
> s390x or any other arch?
Not from the top of my head... but we can easily add something later on
anyway.