* [Qemu-devel] About the light VM solution!
@ 2017-12-05  6:33 Yang Zhong
  2017-12-05 12:06 ` Stefan Hajnoczi
  2017-12-07 12:03 ` Richard W.M. Jones
  0 siblings, 2 replies; 12+ messages in thread
From: Yang Zhong @ 2017-12-05  6:33 UTC (permalink / raw)
  To: pbonzini, stefanha, berrange; +Cc: qemu-devel

Hello all,

As you know, AWS has decided to switch to KVM in their clouds. This news has made almost all
Chinese CSPs (cloud service providers) pay more attention to KVM/QEMU, especially light VM
solutions.

Below is Intel's solution for light VMs, qemu-lite:
http://events.linuxfoundation.org/sites/events/files/slides/Light%20weight%20virtualization%20with%20QEMU%26KVM_0.pdf

My question is whether the community has any plans to implement a light VM or alternative solutions. If not, would our
qemu-lite solution be suitable for upstream again? Many thanks!

Regards,

Yang


* Re: [Qemu-devel] About the light VM solution!
  2017-12-05  6:33 [Qemu-devel] About the light VM solution! Yang Zhong
@ 2017-12-05 12:06 ` Stefan Hajnoczi
  2017-12-05 13:35   ` Paolo Bonzini
  2017-12-07 12:03 ` Richard W.M. Jones
  1 sibling, 1 reply; 12+ messages in thread
From: Stefan Hajnoczi @ 2017-12-05 12:06 UTC (permalink / raw)
  To: Yang Zhong; +Cc: pbonzini, berrange, qemu-devel

On Tue, Dec 05, 2017 at 02:33:13PM +0800, Yang Zhong wrote:
> As you know, AWS has decided to switch to KVM in their clouds. This news make almost all
> china CSPs(clouds service provider) pay more attention on KVM/Qemu, especially light VM
> solution.
> 
> Below are intel solution for light VM, qemu-lite.
> http://events.linuxfoundation.org/sites/events/files/slides/Light%20weight%20virtualization%20with%20QEMU%26KVM_0.pdf
> 
> My question is whether community has some plan to implement light VM or alternative solutions? If no, whether our 
> qemu-lite solution is suitable for upstream again? Many thanks!

Booting VMs faster is appreciated upstream.  I think there is interest
in patches that further this goal and hope you have time to contribute.

What caused a lot of discussion and held back progress was the approach
that was taken.  The basic philosophy seems to be bypassing or
special-casing components in order to avoid slow operations.  This
requires special QEMU, firmware, and/or guest kernel binaries and causes
extra work for the management stack, distributions, and testers.  It is
preferable to have just one binary and no special configuration
settings.

I think patches are more likely to be merged with the following
approach:
1. Benchmark or profile to find slow operations.
2. Perform an experiment to see if bypassing the operations improves
   performance.  If no, go back to step 1.
3. Investigate the slow operation to see if it can be optimized or
   skipped completely based on run-time knowledge.  This means no
   special '-lite' binaries or new config options.
4. Implement patches to do this, retest, and send them upstream.

My view is that qemu-lite only got to Step 2 in some cases.  Going to
Step 4 is more work but the result will be easier to merge.
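
Steps 1 and 2 above can be sketched as a small timing harness.  This is
only an illustration and not something posted in the thread: the helper
names `time_command` and `compare` are made up, and it assumes each
command exits on its own once the guest has booted (for example because
the guest powers off from its init).  The two command lines in the usage
part are placeholders to be replaced with the baseline and experimental
QEMU invocations.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=5):
    """Run argv `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def compare(baseline_argv, experiment_argv, runs=5):
    """Return (baseline_median, experiment_median, speedup)."""
    base = statistics.median(time_command(baseline_argv, runs))
    exp = statistics.median(time_command(experiment_argv, runs))
    return base, exp, base / exp


if __name__ == "__main__":
    # Placeholder commands standing in for two QEMU command lines.
    baseline = [sys.executable, "-c", "import time; time.sleep(0.05)"]
    experiment = [sys.executable, "-c", "pass"]
    base, exp, speedup = compare(baseline, experiment, runs=3)
    print(f"baseline {base * 1000:.1f} ms, "
          f"experiment {exp * 1000:.1f} ms, speedup {speedup:.2f}x")
```

Using the median of several runs keeps one noisy run from deciding
whether an experiment "helped"; if the speedup is not clearly above 1,
that is the signal to go back to step 1.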

Stefan


* Re: [Qemu-devel] About the light VM solution!
  2017-12-05 12:06 ` Stefan Hajnoczi
@ 2017-12-05 13:35   ` Paolo Bonzini
  2017-12-05 13:47     ` Stefan Hajnoczi
  2017-12-06 15:11     ` Stefan Hajnoczi
  0 siblings, 2 replies; 12+ messages in thread
From: Paolo Bonzini @ 2017-12-05 13:35 UTC (permalink / raw)
  To: Stefan Hajnoczi, Yang Zhong; +Cc: berrange, qemu-devel

On 05/12/2017 13:06, Stefan Hajnoczi wrote:
> On Tue, Dec 05, 2017 at 02:33:13PM +0800, Yang Zhong wrote:
>> As you know, AWS has decided to switch to KVM in their clouds. This news make almost all
>> china CSPs(clouds service provider) pay more attention on KVM/Qemu, especially light VM
>> solution.
>>
>> Below are intel solution for light VM, qemu-lite.
>> http://events.linuxfoundation.org/sites/events/files/slides/Light%20weight%20virtualization%20with%20QEMU%26KVM_0.pdf
>>
>> My question is whether community has some plan to implement light VM or alternative solutions? If no, whether our 
>> qemu-lite solution is suitable for upstream again? Many thanks!
> 
> What caused a lot of discussion and held back progress was the approach
> that was taken.  The basic philosophy seems to be bypassing or
> special-casing components in order to avoid slow operations.  This
> requires special QEMU, firmware, and/or guest kernel binaries and causes
> extra work for the management stack, distributions, and testers.

I think having a special firmware (be it qboot or a special-purpose
SeaBIOS) is acceptable.

So is having a special QEMU binary with fewer runtime dependencies,
though that should only be a stopgap measure; the real solution is to
modularize e.g. the UI and audio subsystems to remove runtime
dependencies from the QEMU binary itself.

I agree with Stefan however that there should be no special machine
types or kernels.

Referring to your slides, the remaining points for fast boot are:

* parallelize VCPU initialization: do you have patches? :)

* q35-lite: any other machine options that have not been merged yet?

* SeaBIOS+Option ROM: can you take new numbers with DMA-based option
ROM, or with qboot?

* guest kernel: my proposal to make Linux a multiboot kernel has been
nacked upstream, but Oracle is working on supporting Xen PVH binaries in
QEMU.  These are very similar to multiboot and in particular they're
uncompressed.

For memory consumption, 2.11 should have improved things already thanks
to shared FlatViews.  Your malloc_trim patch should also be merged in 2.12.

So are things really that bad?  Almost all "qemu-lite" patches that have
been proposed upstream have been accepted.

Only mmap-ing the kernel into the guest is not going to be accepted, but
maybe you can look into replacing stdio with open+mmap in load_linux and
load_multiboot, for both the kernel and the initrd.  This should save a
few milliseconds too.
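
The difference between reading a file through buffered I/O and mapping
it with open+mmap can be shown in a few lines.  This sketch is Python
rather than QEMU's C, and it is not the code of load_linux or
load_multiboot; `load_with_read` and `load_with_mmap` are hypothetical
helper names.  It only illustrates the host-side idea: the mapped
variant hands back a view backed by the page cache instead of first
copying the whole image into a private buffer.

```python
import mmap
import os
import tempfile


def load_with_read(path):
    """Buffered read: the whole file is copied into a new bytes object."""
    with open(path, "rb") as f:
        return f.read()


def load_with_mmap(path):
    """open + mmap: map the file read-only; pages are faulted in on use."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Length 0 maps the whole file (this raises ValueError for an
        # empty file).  The mapping remains valid after the fd is closed.
        return mmap.mmap(fd, 0, access=mmap.ACCESS_READ)
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Stand-in for a kernel or initrd image.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\x7fELF" + b"\0" * 4096)
        image_path = tmp.name
    mapped = load_with_mmap(image_path)
    print(mapped[:4] == load_with_read(image_path)[:4])
    mapped.close()
    os.unlink(image_path)
```

Whether this saves measurable time depends on the image size and on
whether the file is already in the page cache, so it is worth timing
both variants before and after.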

Paolo



* Re: [Qemu-devel] About the light VM solution!
  2017-12-05 13:35   ` Paolo Bonzini
@ 2017-12-05 13:47     ` Stefan Hajnoczi
  2017-12-05 14:00       ` Paolo Bonzini
  2017-12-06 15:11     ` Stefan Hajnoczi
  1 sibling, 1 reply; 12+ messages in thread
From: Stefan Hajnoczi @ 2017-12-05 13:47 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Stefan Hajnoczi, Yang Zhong, qemu-devel

On Tue, Dec 5, 2017 at 1:35 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 05/12/2017 13:06, Stefan Hajnoczi wrote:
>> On Tue, Dec 05, 2017 at 02:33:13PM +0800, Yang Zhong wrote:
>>> As you know, AWS has decided to switch to KVM in their clouds. This news make almost all
>>> china CSPs(clouds service provider) pay more attention on KVM/Qemu, especially light VM
>>> solution.
>>>
>>> Below are intel solution for light VM, qemu-lite.
>>> http://events.linuxfoundation.org/sites/events/files/slides/Light%20weight%20virtualization%20with%20QEMU%26KVM_0.pdf
>>>
>>> My question is whether community has some plan to implement light VM or alternative solutions? If no, whether our
>>> qemu-lite solution is suitable for upstream again? Many thanks!
>>
>> What caused a lot of discussion and held back progress was the approach
>> that was taken.  The basic philosophy seems to be bypassing or
>> special-casing components in order to avoid slow operations.  This
>> requires special QEMU, firmware, and/or guest kernel binaries and causes
>> extra work for the management stack, distributions, and testers.
>
> I think having a special firmware (be it qboot or a special-purpose
> SeaBIOS) is acceptable.

The work Marc Mari Barcelo did in 2015 showed that SeaBIOS can boot
guests quickly.  The guest kernel was entered in <35 milliseconds
IIRC.  Why is special firmware necessary?

I'm not against additional binaries if there's no other way, but it's
important to demonstrate why special-casing is necessary.


* Re: [Qemu-devel] About the light VM solution!
  2017-12-05 13:47     ` Stefan Hajnoczi
@ 2017-12-05 14:00       ` Paolo Bonzini
  2017-12-05 16:31         ` Stefan Hajnoczi
  0 siblings, 1 reply; 12+ messages in thread
From: Paolo Bonzini @ 2017-12-05 14:00 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Stefan Hajnoczi, Yang Zhong, qemu-devel

On 05/12/2017 14:47, Stefan Hajnoczi wrote:
> On Tue, Dec 5, 2017 at 1:35 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 05/12/2017 13:06, Stefan Hajnoczi wrote:
>>> On Tue, Dec 05, 2017 at 02:33:13PM +0800, Yang Zhong wrote:
>>>> As you know, AWS has decided to switch to KVM in their clouds. This news make almost all
>>>> china CSPs(clouds service provider) pay more attention on KVM/Qemu, especially light VM
>>>> solution.
>>>>
>>>> Below are intel solution for light VM, qemu-lite.
>>>> http://events.linuxfoundation.org/sites/events/files/slides/Light%20weight%20virtualization%20with%20QEMU%26KVM_0.pdf
>>>>
>>>> My question is whether community has some plan to implement light VM or alternative solutions? If no, whether our
>>>> qemu-lite solution is suitable for upstream again? Many thanks!
>>>
>>> What caused a lot of discussion and held back progress was the approach
>>> that was taken.  The basic philosophy seems to be bypassing or
>>> special-casing components in order to avoid slow operations.  This
>>> requires special QEMU, firmware, and/or guest kernel binaries and causes
>>> extra work for the management stack, distributions, and testers.
>>
>> I think having a special firmware (be it qboot or a special-purpose
>> SeaBIOS) is acceptable.
> 
> The work Marc Mari Barcelo did in 2015 showed that SeaBIOS can boot
> guests quickly.  The guest kernel was entered in <35 milliseconds
> IIRC.  Why is special firmware necessary?

I thought that wasn't the "conventional" SeaBIOS, but rather one with
reduced configuration options, but I may be remembering wrong.

Paolo

> I'm not against additional binaries if there's no other way, but it's
> important to demonstrate why special-casing is necessary.
> 


* Re: [Qemu-devel] About the light VM solution!
  2017-12-05 14:00       ` Paolo Bonzini
@ 2017-12-05 16:31         ` Stefan Hajnoczi
  2017-12-06  9:21           ` Gonglei (Arei)
  0 siblings, 1 reply; 12+ messages in thread
From: Stefan Hajnoczi @ 2017-12-05 16:31 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Stefan Hajnoczi, Yang Zhong, qemu-devel

On Tue, Dec 05, 2017 at 03:00:10PM +0100, Paolo Bonzini wrote:
> On 05/12/2017 14:47, Stefan Hajnoczi wrote:
> > On Tue, Dec 5, 2017 at 1:35 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> >> On 05/12/2017 13:06, Stefan Hajnoczi wrote:
> >>> On Tue, Dec 05, 2017 at 02:33:13PM +0800, Yang Zhong wrote:
> >>>> As you know, AWS has decided to switch to KVM in their clouds. This news make almost all
> >>>> china CSPs(clouds service provider) pay more attention on KVM/Qemu, especially light VM
> >>>> solution.
> >>>>
> >>>> Below are intel solution for light VM, qemu-lite.
> >>>> http://events.linuxfoundation.org/sites/events/files/slides/Light%20weight%20virtualization%20with%20QEMU%26KVM_0.pdf
> >>>>
> >>>> My question is whether community has some plan to implement light VM or alternative solutions? If no, whether our
> >>>> qemu-lite solution is suitable for upstream again? Many thanks!
> >>>
> >>> What caused a lot of discussion and held back progress was the approach
> >>> that was taken.  The basic philosophy seems to be bypassing or
> >>> special-casing components in order to avoid slow operations.  This
> >>> requires special QEMU, firmware, and/or guest kernel binaries and causes
> >>> extra work for the management stack, distributions, and testers.
> >>
> >> I think having a special firmware (be it qboot or a special-purpose
> >> SeaBIOS) is acceptable.
> > 
> > The work Marc Mari Barcelo did in 2015 showed that SeaBIOS can boot
> > guests quickly.  The guest kernel was entered in <35 milliseconds
> > IIRC.  Why is special firmware necessary?
> 
> I thought that wasn't the "conventional" SeaBIOS, but rather one with
> reduced configuration options, but I may be remembering wrong.

Marc didn't spend much time on optimizing SeaBIOS; he used the build
options that were suggested.  An extra flag can be added in
qemu_preinit() to skip slow init that's unnecessary on optimized
machines.  That would allow a single SeaBIOS binary to run both full and
lite systems.
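
The single-binary idea can be illustrated with a toy model.  This is
Python, not SeaBIOS code: `InitStep`, `firmware_init`, and the
`lite_machine` flag are hypothetical names, and how the firmware would
actually detect an optimized machine is not specified here.  The point
is only that the decision is made at run time, so one build serves both
full and lite systems.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class InitStep:
    name: str
    run: Callable[[], None]
    needed_on_lite: bool  # False: safe to skip on an optimized machine


def firmware_init(steps: List[InitStep], lite_machine: bool) -> List[str]:
    """Run the init steps, skipping optional ones when lite_machine is set.

    Returns the names of the steps that actually ran.
    """
    executed = []
    for step in steps:
        if lite_machine and not step.needed_on_lite:
            continue  # run-time skip; no separate "-lite" build needed
        step.run()
        executed.append(step.name)
    return executed


# Hypothetical step list; the names are placeholders, not SeaBIOS functions.
STEPS = [
    InitStep("memory map", lambda: None, needed_on_lite=True),
    InitStep("legacy device probe", lambda: None, needed_on_lite=False),
    InitStep("load kernel", lambda: None, needed_on_lite=True),
]

if __name__ == "__main__":
    print(firmware_init(STEPS, lite_machine=False))
    print(firmware_init(STEPS, lite_machine=True))
```

The same binary runs every step on a full machine and only the required
ones on a lite machine, which is the property that avoids extra work for
distributions and testers.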

Stefan


* Re: [Qemu-devel] About the light VM solution!
  2017-12-05 16:31         ` Stefan Hajnoczi
@ 2017-12-06  9:21           ` Gonglei (Arei)
  2017-12-06 15:09             ` Stefan Hajnoczi
  0 siblings, 1 reply; 12+ messages in thread
From: Gonglei (Arei) @ 2017-12-06  9:21 UTC (permalink / raw)
  To: Stefan Hajnoczi, Paolo Bonzini; +Cc: Yang Zhong, Stefan Hajnoczi, qemu-devel


> -----Original Message-----
> From: Qemu-devel
> [mailto:qemu-devel-bounces+arei.gonglei=huawei.com@nongnu.org] On
> Behalf Of Stefan Hajnoczi
> Sent: Wednesday, December 06, 2017 12:31 AM
> To: Paolo Bonzini
> Cc: Yang Zhong; Stefan Hajnoczi; qemu-devel
> Subject: Re: [Qemu-devel] About the light VM solution!
> 
> On Tue, Dec 05, 2017 at 03:00:10PM +0100, Paolo Bonzini wrote:
> > On 05/12/2017 14:47, Stefan Hajnoczi wrote:
> > > On Tue, Dec 5, 2017 at 1:35 PM, Paolo Bonzini <pbonzini@redhat.com>
> wrote:
> > >> On 05/12/2017 13:06, Stefan Hajnoczi wrote:
> > >>> On Tue, Dec 05, 2017 at 02:33:13PM +0800, Yang Zhong wrote:
> > >>>> As you know, AWS has decided to switch to KVM in their clouds. This
> news make almost all
> > >>>> china CSPs(clouds service provider) pay more attention on KVM/Qemu,
> especially light VM
> > >>>> solution.
> > >>>>
> > >>>> Below are intel solution for light VM, qemu-lite.
> > >>>>
> http://events.linuxfoundation.org/sites/events/files/slides/Light%20weight%2
> 0virtualization%20with%20QEMU%26KVM_0.pdf
> > >>>>
> > >>>> My question is whether community has some plan to implement light
> VM or alternative solutions? If no, whether our
> > >>>> qemu-lite solution is suitable for upstream again? Many thanks!
> > >>>
> > >>> What caused a lot of discussion and held back progress was the approach
> > >>> that was taken.  The basic philosophy seems to be bypassing or
> > >>> special-casing components in order to avoid slow operations.  This
> > >>> requires special QEMU, firmware, and/or guest kernel binaries and
> causes
> > >>> extra work for the management stack, distributions, and testers.
> > >>
> > >> I think having a special firmware (be it qboot or a special-purpose
> > >> SeaBIOS) is acceptable.
> > >
> > > The work Marc Mari Barcelo did in 2015 showed that SeaBIOS can boot
> > > guests quickly.  The guest kernel was entered in <35 milliseconds
> > > IIRC.  Why is special firmware necessary?
> >
> > I thought that wasn't the "conventional" SeaBIOS, but rather one with
> > reduced configuration options, but I may be remembering wrong.
> 
> Marc didn't spend much time on optimizing SeaBIOS, he used the build
> options that were suggested.  An extra flag can be added in
> qemu_preinit() to skip slow init that's unnecessary on optimized
> machines.  That would allow a single SeaBIOS binary to run both full and
> lite systems.
> 
Which options do you remember, Stefan? Or do you have any links to that
thread? I'm interested in this topic.

Thanks,
-Gonglei


* Re: [Qemu-devel] About the light VM solution!
  2017-12-06  9:21           ` Gonglei (Arei)
@ 2017-12-06 15:09             ` Stefan Hajnoczi
  2017-12-07  0:49               ` Gonglei (Arei)
  0 siblings, 1 reply; 12+ messages in thread
From: Stefan Hajnoczi @ 2017-12-06 15:09 UTC (permalink / raw)
  To: Gonglei (Arei); +Cc: Paolo Bonzini, Yang Zhong, Stefan Hajnoczi, qemu-devel

On Wed, Dec 06, 2017 at 09:21:55AM +0000, Gonglei (Arei) wrote:
> 
> > -----Original Message-----
> > From: Qemu-devel
> > [mailto:qemu-devel-bounces+arei.gonglei=huawei.com@nongnu.org] On
> > Behalf Of Stefan Hajnoczi
> > Sent: Wednesday, December 06, 2017 12:31 AM
> > To: Paolo Bonzini
> > Cc: Yang Zhong; Stefan Hajnoczi; qemu-devel
> > Subject: Re: [Qemu-devel] About the light VM solution!
> > 
> > On Tue, Dec 05, 2017 at 03:00:10PM +0100, Paolo Bonzini wrote:
> > > On 05/12/2017 14:47, Stefan Hajnoczi wrote:
> > > > On Tue, Dec 5, 2017 at 1:35 PM, Paolo Bonzini <pbonzini@redhat.com>
> > wrote:
> > > >> On 05/12/2017 13:06, Stefan Hajnoczi wrote:
> > > >>> On Tue, Dec 05, 2017 at 02:33:13PM +0800, Yang Zhong wrote:
> > > >>>> As you know, AWS has decided to switch to KVM in their clouds. This
> > news make almost all
> > > >>>> china CSPs(clouds service provider) pay more attention on KVM/Qemu,
> > especially light VM
> > > >>>> solution.
> > > >>>>
> > > >>>> Below are intel solution for light VM, qemu-lite.
> > > >>>>
> > http://events.linuxfoundation.org/sites/events/files/slides/Light%20weight%2
> > 0virtualization%20with%20QEMU%26KVM_0.pdf
> > > >>>>
> > > >>>> My question is whether community has some plan to implement light
> > VM or alternative solutions? If no, whether our
> > > >>>> qemu-lite solution is suitable for upstream again? Many thanks!
> > > >>>
> > > >>> What caused a lot of discussion and held back progress was the approach
> > > >>> that was taken.  The basic philosophy seems to be bypassing or
> > > >>> special-casing components in order to avoid slow operations.  This
> > > >>> requires special QEMU, firmware, and/or guest kernel binaries and
> > causes
> > > >>> extra work for the management stack, distributions, and testers.
> > > >>
> > > >> I think having a special firmware (be it qboot or a special-purpose
> > > >> SeaBIOS) is acceptable.
> > > >
> > > > The work Marc Mari Barcelo did in 2015 showed that SeaBIOS can boot
> > > > guests quickly.  The guest kernel was entered in <35 milliseconds
> > > > IIRC.  Why is special firmware necessary?
> > >
> > > I thought that wasn't the "conventional" SeaBIOS, but rather one with
> > > reduced configuration options, but I may be remembering wrong.
> > 
> > Marc didn't spend much time on optimizing SeaBIOS, he used the build
> > options that were suggested.  An extra flag can be added in
> > qemu_preinit() to skip slow init that's unnecessary on optimized
> > machines.  That would allow a single SeaBIOS binary to run both full and
> > lite systems.
> > 
> What's options do you remember? Stefan. Or any links about that
> thread? I'm Interesting with this topic.

Here is what I found:

Marc Mari's fastest SeaBIOS build took 8 ms from the first guest CPU
instruction to entering the guest kernel.  CBFS was used instead of a
normal boot device (e.g. virtio-blk).  Most hardware support was
disabled.

https://mail.coreboot.org/pipermail/seabios/2015-July/009554.html

The SeaBIOS configuration file is here:

https://mail.coreboot.org/pipermail/seabios/2015-July/009548.html

Stefan


* Re: [Qemu-devel] About the light VM solution!
  2017-12-05 13:35   ` Paolo Bonzini
  2017-12-05 13:47     ` Stefan Hajnoczi
@ 2017-12-06 15:11     ` Stefan Hajnoczi
  2017-12-08 11:29       ` Yang Zhong
  1 sibling, 1 reply; 12+ messages in thread
From: Stefan Hajnoczi @ 2017-12-06 15:11 UTC (permalink / raw)
  To: Yang Zhong; +Cc: berrange, qemu-devel, Paolo Bonzini

On Tue, Dec 05, 2017 at 02:35:42PM +0100, Paolo Bonzini wrote:
> On 05/12/2017 13:06, Stefan Hajnoczi wrote:
> > On Tue, Dec 05, 2017 at 02:33:13PM +0800, Yang Zhong wrote:
> Referring to your slides, the remaining points for fast boot are:
> 
> * parallelize VCPU initialization: do you have patches? :)
> 
> * q35-lite: any other machine options that have not been merged yet?
> 
> * SeaBIOS+Option ROM: can you take new numbers with DMA-based option
> ROM, or with qboot?
> 
> * guest kernel: my proposal to make Linux a multiboot kernel has been
> nacked upstream, but Oracle is working on supporting Xen PVH binaries in
> QEMU.  These are very similar to multiboot and in particular they're
> uncompressed.

By the way, sending a separate patch series for each optimization is
good.  That way patches requiring more discussion don't hold up patches
that could be applied immediately.  It also makes reviewing and
understanding performance results easier since there is only one change.

Stefan


* Re: [Qemu-devel] About the light VM solution!
  2017-12-06 15:09             ` Stefan Hajnoczi
@ 2017-12-07  0:49               ` Gonglei (Arei)
  0 siblings, 0 replies; 12+ messages in thread
From: Gonglei (Arei) @ 2017-12-07  0:49 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Paolo Bonzini, Yang Zhong, Stefan Hajnoczi, qemu-devel

> -----Original Message-----
> From: Stefan Hajnoczi [mailto:stefanha@redhat.com]
> Sent: Wednesday, December 06, 2017 11:10 PM
> To: Gonglei (Arei)
> Cc: Paolo Bonzini; Yang Zhong; Stefan Hajnoczi; qemu-devel
> Subject: Re: [Qemu-devel] About the light VM solution!
> 
> On Wed, Dec 06, 2017 at 09:21:55AM +0000, Gonglei (Arei) wrote:
> >
> > > -----Original Message-----
> > > From: Qemu-devel
> > > [mailto:qemu-devel-bounces+arei.gonglei=huawei.com@nongnu.org] On
> > > Behalf Of Stefan Hajnoczi
> > > Sent: Wednesday, December 06, 2017 12:31 AM
> > > To: Paolo Bonzini
> > > Cc: Yang Zhong; Stefan Hajnoczi; qemu-devel
> > > Subject: Re: [Qemu-devel] About the light VM solution!
> > >
> > > On Tue, Dec 05, 2017 at 03:00:10PM +0100, Paolo Bonzini wrote:
> > > > On 05/12/2017 14:47, Stefan Hajnoczi wrote:
> > > > > On Tue, Dec 5, 2017 at 1:35 PM, Paolo Bonzini <pbonzini@redhat.com>
> > > wrote:
> > > > >> On 05/12/2017 13:06, Stefan Hajnoczi wrote:
> > > > >>> On Tue, Dec 05, 2017 at 02:33:13PM +0800, Yang Zhong wrote:
> > > > >>>> As you know, AWS has decided to switch to KVM in their clouds. This
> > > news make almost all
> > > > >>>> china CSPs(clouds service provider) pay more attention on
> KVM/Qemu,
> > > especially light VM
> > > > >>>> solution.
> > > > >>>>
> > > > >>>> Below are intel solution for light VM, qemu-lite.
> > > > >>>>
> > >
> http://events.linuxfoundation.org/sites/events/files/slides/Light%20weight%2
> > > 0virtualization%20with%20QEMU%26KVM_0.pdf
> > > > >>>>
> > > > >>>> My question is whether community has some plan to implement
> light
> > > VM or alternative solutions? If no, whether our
> > > > >>>> qemu-lite solution is suitable for upstream again? Many thanks!
> > > > >>>
> > > > >>> What caused a lot of discussion and held back progress was the
> approach
> > > > >>> that was taken.  The basic philosophy seems to be bypassing or
> > > > >>> special-casing components in order to avoid slow operations.  This
> > > > >>> requires special QEMU, firmware, and/or guest kernel binaries and
> > > causes
> > > > >>> extra work for the management stack, distributions, and testers.
> > > > >>
> > > > >> I think having a special firmware (be it qboot or a special-purpose
> > > > >> SeaBIOS) is acceptable.
> > > > >
> > > > > The work Marc Mari Barcelo did in 2015 showed that SeaBIOS can boot
> > > > > guests quickly.  The guest kernel was entered in <35 milliseconds
> > > > > IIRC.  Why is special firmware necessary?
> > > >
> > > > I thought that wasn't the "conventional" SeaBIOS, but rather one with
> > > > reduced configuration options, but I may be remembering wrong.
> > >
> > > Marc didn't spend much time on optimizing SeaBIOS, he used the build
> > > options that were suggested.  An extra flag can be added in
> > > qemu_preinit() to skip slow init that's unnecessary on optimized
> > > machines.  That would allow a single SeaBIOS binary to run both full and
> > > lite systems.
> > >
> > What's options do you remember? Stefan. Or any links about that
> > thread? I'm Interesting with this topic.
> 
> Here is what I found:
> 
> Marc Mari's fastest SeaBIOS build took 8 ms from the first guest CPU
> instruction to entering the guest kernel.  CBFS was used instead of a
> normal boot device (e.g. virtio-blk).  Most hardware support was
> disabled.
> 
> https://mail.coreboot.org/pipermail/seabios/2015-July/009554.html
> 
> The SeaBIOS configuration file is here:
> 
> https://mail.coreboot.org/pipermail/seabios/2015-July/009548.html
> 
Thanks for your information. :)
 
Thanks,
-Gonglei


* Re: [Qemu-devel] About the light VM solution!
  2017-12-05  6:33 [Qemu-devel] About the light VM solution! Yang Zhong
  2017-12-05 12:06 ` Stefan Hajnoczi
@ 2017-12-07 12:03 ` Richard W.M. Jones
  1 sibling, 0 replies; 12+ messages in thread
From: Richard W.M. Jones @ 2017-12-07 12:03 UTC (permalink / raw)
  To: Yang Zhong; +Cc: pbonzini, stefanha, berrange, qemu-devel

I did a bit of work on this back in early 2016 and wrote a paper which
analyzes what Intel were doing with Clear Containers back then, and
how it fitted with the more distribution-centric view of Fedora and
Red Hat, ie that we ideally want a single qemu binary, a single
kernel, a single SeaBIOS, etc.

You can read the paper here:

  http://oirase.annexia.org/tmp/paper.pdf

and the source is here:

  http://git.annexia.org/?p=libguestfs-talks.git;a=tree;f=2016-eng-talk;h=5a0a29ceb1e9db39539669717ec06e4f94eba086;hb=HEAD

To address a few points from this thread:

* As Paolo mentioned, one problem is that we link qemu to so many
  libraries, and glibc / ELF dynamic loading is very slow.  Modularizing qemu
  could help here.  Reducing symbol interposition
  (-fvisibility=hidden) in more libraries would help a bit.  Linker
  security features enabled in downstream distros don't help.
  Rewriting ELF to be less crazy would help a lot but good luck there :-)

* As Stefan & Paolo mentioned, it would be nice if SeaBIOS was faster
  in the default configuration.  I ended up compiling a special
  minimal SeaBIOS which saved a load of time, mainly not probing PCI
  unnecessarily IIRC.

* Considerable time is taken in booting the kernel, and that's mostly
  in running all the initcall functions.  We wanted to use a Fedora
  distro kernel, but unfortunately many subsystems do loads of
  initcall work which is run even when that subsystem is not used.
  This is why compiling a custom kernel (compiling out these
  subsystems) is enticing.  Parallelizing initialization could help
  here, although at the moment using -smp slows things down, and
  parallelizing was also rejected upstream IIRC.

* PCI config space probing is really slow.  Unfortunately accelerating
  it in the kernel doesn't seem either very easy or very acceptable to
  KVM upstream: the use case is rather narrow & the implementation
  seems like it would be very complex.  I also wrote a parallelizing
  PCI probe for Linux which helped a bit but wasn't upstream material.

* udev is another huge problem.  It's slow, it's monolithic, it
  resists modifications such as modularization or removing parts.

* Debugging over the UART is slow.
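
To see which initcalls dominate guest kernel boot, one option is to boot
with the `initcall_debug` kernel parameter and rank the reported
durations.  The parser below is a sketch under assumptions: the function
name `slowest_initcalls` is made up, the regular expression targets the
usual "initcall NAME returned RET after N usecs" lines, and the sample
log uses invented symbols and numbers purely to exercise the code.

```python
import re

# Matches lines such as
#   initcall some_init+0x0/0x50 returned 0 after 1234 usecs
# including the optional "[module]" tag printed for module initcalls.
INITCALL_RE = re.compile(
    r"initcall (?P<name>\S+?)(?:\+0x[0-9a-f]+/0x[0-9a-f]+)?"
    r"(?: \[[\w-]+\])? returned (?P<ret>-?\d+) after (?P<usecs>\d+) usecs"
)


def slowest_initcalls(log_text, top=10):
    """Return up to `top` (name, usecs) pairs, slowest first."""
    timings = [
        (m.group("name"), int(m.group("usecs")))
        for m in INITCALL_RE.finditer(log_text)
    ]
    return sorted(timings, key=lambda t: t[1], reverse=True)[:top]


# Invented sample data, not measurements from any real boot.
SAMPLE_LOG = """\
[    0.101] initcall a_init+0x0/0x10 returned 0 after 5 usecs
[    0.202] initcall b_init+0x0/0x20 returned 0 after 900 usecs
[    0.303] initcall c_init+0x0/0x30 [cmod] returned -19 after 42 usecs
"""

if __name__ == "__main__":
    for name, usecs in slowest_initcalls(SAMPLE_LOG, top=2):
        print(f"{usecs:>8} us  {name}")
```

Feeding it the output of `dmesg` from a guest booted with
`initcall_debug` gives a quick list of candidates for the kind of
subsystem work described above.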

libguestfs also ships with benchmarking tools which can be very useful
to actually measure boot time:

  https://github.com/libguestfs/libguestfs/tree/master/utils

HTH,

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html


* Re: [Qemu-devel] About the light VM solution!
  2017-12-06 15:11     ` Stefan Hajnoczi
@ 2017-12-08 11:29       ` Yang Zhong
  0 siblings, 0 replies; 12+ messages in thread
From: Yang Zhong @ 2017-12-08 11:29 UTC (permalink / raw)
  To: Stefan Hajnoczi, pbonzini, rjones
  Cc: qemu-devel, berrange, anthony.xu, chao.p.peng

On Wed, Dec 06, 2017 at 03:11:55PM +0000, Stefan Hajnoczi wrote:
> On Tue, Dec 05, 2017 at 02:35:42PM +0100, Paolo Bonzini wrote:
> > On 05/12/2017 13:06, Stefan Hajnoczi wrote:
> > > On Tue, Dec 05, 2017 at 02:33:13PM +0800, Yang Zhong wrote:
> > Referring to your slides, the remaining points for fast boot are:
> > 
> > * parallelize VCPU initialization: do you have patches? :)
> > 
> > * q35-lite: any other machine options that have not been merged yet?
> > 
> > * SeaBIOS+Option ROM: can you take new numbers with DMA-based option
> > ROM, or with qboot?
> > 
> > * guest kernel: my proposal to make Linux a multiboot kernel has been
> > nacked upstream, but Oracle is working on supporting Xen PVH binaries in
> > QEMU.  These are very similar to multiboot and in particular they're
> > uncompressed.
> 
> By the way, sending a separate patch series for each optimization is
> good.  That way patches requiring more discussion don't hold up patches
> that could be applied immediately.  It also makes reviewing and
> understanding performance results easier since there is only one change.
> 
> Stefan

  Hello Stefan, Paolo and Richard,

  Many thanks for your detailed comments and for your great help!

  We are first discussing the light VM solution with our customers to
  understand their concerns. We will then continue the discussion on this thread.

  Richard, your paper and test script will be a great help for our further work.

  I have also CC'd PengChao and Anthony on this thread; they will join the discussion.

  Regards,

  Yang 
