* RFC: Still TODO for 4.2?
@ 2012-01-04 16:29 Ian Campbell
  2012-01-04 16:47 ` Konrad Rzeszutek Wilk
                   ` (6 more replies)
  0 siblings, 7 replies; 97+ messages in thread
From: Ian Campbell @ 2012-01-04 16:29 UTC (permalink / raw)
  To: xen-devel
  Cc: Ian Jackson, Keir Fraser, Jan Beulich, Stefano Stabellini, Tim Deegan

What are the outstanding things to do before we think we can start on
the 4.2 -rc's? Does anyone have a timetable in mind?

hypervisor:

      * ??? - Keir, Tim, Jan?

tools:

      * libxl stable API -- we would like 4.2 to define a stable API
        which downstreams can start to rely on not changing. Aspects of
        this are:
              * event handling (IanJ working on this)
              * drop libxl_device_model_info (move bits to build_info or
                elsewhere as appropriate) (IanC working on this, patches
                shortly)
              * add libxl_defbool and generally try to arrange that
                memset(foo,0,...) requests the defaults (IanC working on
                this, patches shortly; see the sketch after this list)
              * The topologyinfo datastructure should be a list of
                tuples, not a tuple of lists. (nobody currently looking
                at this, not 100% sure this makes sense, could possibly
                defer and change after 4.2 in a compatible way)
              * Block script support -- can be done post 4.2?
      * Hotplug script stuff -- internal to libxl (I think, therefore I
        didn't put this under stable API above) but still good to have
        for 4.2? Roger Pau Monné was looking at this but it's looking
        like a big can of worms...
      * Integrate qemu+seabios upstream into the build (Stefano has
        posted patches, I guess they need refreshing and reposting). No
        change in default qemu for 4.2.
      * More formally deprecate xm/xend. Manpage patches are already in
        the tree. Needs a release note and communication around -rc1 to
        remind people to test xl.
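
As a sketch of how the memset()-gives-defaults item above might look
(the libxl_defbool name comes from that item; the encoding and helper
names here are my guesses rather than the final API):

    /* A tristate whose zero-initialised state means "use the default". */
    typedef struct {
        int val;   /* assumed: 0 = default, > 0 = true, < 0 = false */
    } libxl_defbool;

    static inline void libxl_defbool_set(libxl_defbool *db, int b)
    {
        db->val = b ? 1 : -1;
    }

    static inline int libxl_defbool_is_default(const libxl_defbool *db)
    {
        return db->val == 0;
    }

    /* Callers would then get sensible defaults simply by zeroing a struct:
     *     libxl_domain_build_info info;
     *     memset(&info, 0, sizeof(info));  // all defbool fields now "default"
     */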

Has anybody got anything else? I'm sure I've missed stuff. Are there any
must haves e.g. in the paging/sharing spaces?

Ian.


* Re: RFC: Still TODO for 4.2?
  2012-01-04 16:29 RFC: Still TODO for 4.2? Ian Campbell
@ 2012-01-04 16:47 ` Konrad Rzeszutek Wilk
  2012-01-04 16:51   ` Stefano Stabellini
  2012-01-04 16:55 ` Jan Beulich
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 97+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-01-04 16:47 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, Keir Fraser, Tim Deegan, Ian Jackson,
	Stefano Stabellini, Jan Beulich

On Wed, Jan 04, 2012 at 04:29:22PM +0000, Ian Campbell wrote:
> What are the outstanding things to do before we think we can start on
> the 4.2 -rc's? Does anyone have a timetable in mind?
> 
> hypervisor:
> 
>       * ??? - Keir, Tim, Jan?

Mark ARM as experimental? Docs on how to compile it, use it?

> 
> tools:
> 
>       * libxl stable API -- we would like 4.2 to define a stable API
>         which downstream's can start to rely on not changing. Aspects of
>         this are:
>               * event handling (IanJ working on this)
>               * drop libxl_device_model_info (move bits to build_info or
>                 elsewhere as appropriate) (IanC working on this, patches
>                 shortly)
>               * add libxl_defbool and generally try and arrange that
>                 memset(foo,0,...) requests the defaults (IanC working on
>                 this, patches shortly)
>               * The topologyinfo datastructure should be a list of
>                 tuples, not a tuple of lists. (nobody currently looking
>                 at this, not 100% sure this makes sense, could possibly
>                 defer and change after 4.2 in a compatible way)
>               * Block script support -- can be done post 4.2?
>       * Hotplug script stuff -- internal to libxl (I think, therefore I
>         didn't put this under stable API above) but still good to have
>         for 4.2? Roger Pau Monet was looking at this but its looking
>         like a big can-o-worms...
>       * Integrate qemu+seabios upstream into the build (Stefano has
>         posted patches, I guess they need refreshing and reposting). No
>         change in default qemu for 4.2.

Anthony's PCI passthrough patches?

>       * More formally deprecate xm/xend. Manpage patches already in
>         tree. Needs release noting and communication around -rc1 to
>         remind people to test xl.

> 
> Has anybody got anything else? I'm sure I've missed stuff. Are there any
> must haves e.g. in the paging/sharing spaces?
> 
> Ian.
> 
> 


* Re: RFC: Still TODO for 4.2?
  2012-01-04 16:47 ` Konrad Rzeszutek Wilk
@ 2012-01-04 16:51   ` Stefano Stabellini
  2012-01-16 13:42     ` Ian Campbell
  0 siblings, 1 reply; 97+ messages in thread
From: Stefano Stabellini @ 2012-01-04 16:51 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: xen-devel, Keir (Xen.org),
	Ian Campbell, Stefano Stabellini, Tim (Xen.org),
	Ian Jackson, Jan Beulich

On Wed, 4 Jan 2012, Konrad Rzeszutek Wilk wrote:
> On Wed, Jan 04, 2012 at 04:29:22PM +0000, Ian Campbell wrote:
> > What are the outstanding things to do before we think we can start on
> > the 4.2 -rc's? Does anyone have a timetable in mind?
> > 
> > hypervisor:
> > 
> >       * ??? - Keir, Tim, Jan?
> 
> Mark ARM as experimental? Docs on how to compile it, use it?
> 
> > 
> > tools:
> > 
> >       * libxl stable API -- we would like 4.2 to define a stable API
> >         which downstream's can start to rely on not changing. Aspects of
> >         this are:
> >               * event handling (IanJ working on this)
> >               * drop libxl_device_model_info (move bits to build_info or
> >                 elsewhere as appropriate) (IanC working on this, patches
> >                 shortly)
> >               * add libxl_defbool and generally try and arrange that
> >                 memset(foo,0,...) requests the defaults (IanC working on
> >                 this, patches shortly)
> >               * The topologyinfo datastructure should be a list of
> >                 tuples, not a tuple of lists. (nobody currently looking
> >                 at this, not 100% sure this makes sense, could possibly
> >                 defer and change after 4.2 in a compatible way)
> >               * Block script support -- can be done post 4.2?
> >       * Hotplug script stuff -- internal to libxl (I think, therefore I
> >         didn't put this under stable API above) but still good to have
> >         for 4.2? Roger Pau Monet was looking at this but its looking
> >         like a big can-o-worms...
> >       * Integrate qemu+seabios upstream into the build (Stefano has
> >         posted patches, I guess they need refreshing and reposting). No
> >         change in default qemu for 4.2.
> 
> Anthony's PCI passthrough patches?

Right. And Anthony's save/restore patches as well.


* Re: RFC: Still TODO for 4.2?
  2012-01-04 16:29 RFC: Still TODO for 4.2? Ian Campbell
  2012-01-04 16:47 ` Konrad Rzeszutek Wilk
@ 2012-01-04 16:55 ` Jan Beulich
  2012-01-16 13:39   ` Ian Campbell
  2012-01-04 17:25 ` Pasi Kärkkäinen
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 97+ messages in thread
From: Jan Beulich @ 2012-01-04 16:55 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Ian Jackson, Keir Fraser, xen-devel, Stefano Stabellini, Tim Deegan

>>> On 04.01.12 at 17:29, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> What are the outstanding things to do before we think we can start on
> the 4.2 -rc's? Does anyone have a timetable in mind?
> 
> hypervisor:
> 
>       * ??? - Keir, Tim, Jan?

Apart from a few small things that I have on my todo list, the only
bigger one (at least from a possible-impact perspective) is finishing
off the closing of the security hole in MSI-X passthrough (uniformly -
i.e. even for Dom0 - disallowing write access to MSI-X table pages),
which I intend to do only once the upstream qemu patch series also
incorporates the respective recent qemu-xen change.

Jan


* Re: RFC: Still TODO for 4.2?
  2012-01-04 16:29 RFC: Still TODO for 4.2? Ian Campbell
  2012-01-04 16:47 ` Konrad Rzeszutek Wilk
  2012-01-04 16:55 ` Jan Beulich
@ 2012-01-04 17:25 ` Pasi Kärkkäinen
  2012-01-04 17:36   ` George Dunlap
                     ` (3 more replies)
  2012-01-04 17:39 ` Roger Pau Monné
                   ` (3 subsequent siblings)
  6 siblings, 4 replies; 97+ messages in thread
From: Pasi Kärkkäinen @ 2012-01-04 17:25 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, Keir Fraser, Tim Deegan, Ian Jackson,
	Stefano Stabellini, Jan Beulich

On Wed, Jan 04, 2012 at 04:29:22PM +0000, Ian Campbell wrote:
> What are the outstanding things to do before we think we can start on
> the 4.2 -rc's? Does anyone have a timetable in mind?
> 
> hypervisor:
> 
>       * ??? - Keir, Tim, Jan?
> 
> tools:
> 
>       * libxl stable API -- we would like 4.2 to define a stable API
>         which downstream's can start to rely on not changing. Aspects of
>         this are:
>               * event handling (IanJ working on this)
>               * drop libxl_device_model_info (move bits to build_info or
>                 elsewhere as appropriate) (IanC working on this, patches
>                 shortly)
>               * add libxl_defbool and generally try and arrange that
>                 memset(foo,0,...) requests the defaults (IanC working on
>                 this, patches shortly)
>               * The topologyinfo datastructure should be a list of
>                 tuples, not a tuple of lists. (nobody currently looking
>                 at this, not 100% sure this makes sense, could possibly
>                 defer and change after 4.2 in a compatible way)
>               * Block script support -- can be done post 4.2?
>       * Hotplug script stuff -- internal to libxl (I think, therefore I
>         didn't put this under stable API above) but still good to have
>         for 4.2? Roger Pau Monet was looking at this but its looking
>         like a big can-o-worms...
>       * Integrate qemu+seabios upstream into the build (Stefano has
>         posted patches, I guess they need refreshing and reposting). No
>         change in default qemu for 4.2.
>       * More formally deprecate xm/xend. Manpage patches already in
>         tree. Needs release noting and communication around -rc1 to
>         remind people to test xl.
> 
> Has anybody got anything else? I'm sure I've missed stuff. Are there any
> must haves e.g. in the paging/sharing spaces?
> 

- What's the status of Nested Hardware Virtualization?
I remember some email saying Intel vmx-on-vmx has some performance issues,
and AMD svm-on-svm works better...


- Also there's a bunch of VGA-passthrough-related patches
that I once volunteered to collect/rebase/cleanup/repost myself,
but I still haven't had time for that :(


-- Pasi


* Re: RFC: Still TODO for 4.2?
  2012-01-04 17:25 ` Pasi Kärkkäinen
@ 2012-01-04 17:36   ` George Dunlap
  2012-01-04 18:20   ` Tim Deegan
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 97+ messages in thread
From: George Dunlap @ 2012-01-04 17:36 UTC (permalink / raw)
  To: Pasi Kärkkäinen
  Cc: xen-devel, Keir Fraser, Ian Campbell, Tim Deegan, Ian Jackson,
	Stefano Stabellini, Jan Beulich

On Wed, Jan 4, 2012 at 12:25 PM, Pasi Kärkkäinen <pasik@iki.fi> wrote:
> On Wed, Jan 04, 2012 at 04:29:22PM +0000, Ian Campbell wrote:
>> What are the outstanding things to do before we think we can start on
>> the 4.2 -rc's? Does anyone have a timetable in mind?
>>
>> hypervisor:
>>
>>       * ??? - Keir, Tim, Jan?

It would be good to have domctls / sysctls set up to modify scheduler
parameters, like the credit1 timeslice (and schedule rate, if that
ever makes it in).
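
Something along these lines, say -- purely illustrative, neither this
sysctl nor the field names exist in the tree today, it just shows the
kind of knob I mean:

    #include <stdint.h>

    /* Hypothetical sysctl payload for reading/adjusting credit1 parameters. */
    struct xen_sysctl_scheduler_op {
        uint32_t cpupool_id;            /* which cpupool's scheduler to touch */
        uint32_t sched_id;              /* e.g. XEN_SCHEDULER_CREDIT */
        uint32_t cmd;                   /* get-parameters / set-parameters */
        union {
            struct {
                uint32_t tslice_ms;     /* credit1 timeslice, in milliseconds */
                uint32_t ratelimit_us;  /* "schedule rate" limit, in microseconds */
            } sched_credit;
        } u;
    };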

 -George

>>
>> tools:
>>
>>       * libxl stable API -- we would like 4.2 to define a stable API
>>         which downstream's can start to rely on not changing. Aspects of
>>         this are:
>>               * event handling (IanJ working on this)
>>               * drop libxl_device_model_info (move bits to build_info or
>>                 elsewhere as appropriate) (IanC working on this, patches
>>                 shortly)
>>               * add libxl_defbool and generally try and arrange that
>>                 memset(foo,0,...) requests the defaults (IanC working on
>>                 this, patches shortly)
>>               * The topologyinfo datastructure should be a list of
>>                 tuples, not a tuple of lists. (nobody currently looking
>>                 at this, not 100% sure this makes sense, could possibly
>>                 defer and change after 4.2 in a compatible way)
>>               * Block script support -- can be done post 4.2?
>>       * Hotplug script stuff -- internal to libxl (I think, therefore I
>>         didn't put this under stable API above) but still good to have
>>         for 4.2? Roger Pau Monet was looking at this but its looking
>>         like a big can-o-worms...
>>       * Integrate qemu+seabios upstream into the build (Stefano has
>>         posted patches, I guess they need refreshing and reposting). No
>>         change in default qemu for 4.2.
>>       * More formally deprecate xm/xend. Manpage patches already in
>>         tree. Needs release noting and communication around -rc1 to
>>         remind people to test xl.
>>
>> Has anybody got anything else? I'm sure I've missed stuff. Are there any
>> must haves e.g. in the paging/sharing spaces?
>>
>
> - What's the status of Nested Hardware Virtualization?
> I remember some email saying Intel vmx-on-vmx has some performance issues,
> and amd svm-on-svm works better..
>
>
> - Also there's a bunch of VGA passthru related patches,
> that I once volunteered to collect/rebase/cleanup/repost myself,
> but I still haven't had time for that :(
>
>
> -- Pasi
>
>


* Re: RFC: Still TODO for 4.2?
  2012-01-04 16:29 RFC: Still TODO for 4.2? Ian Campbell
                   ` (2 preceding siblings ...)
  2012-01-04 17:25 ` Pasi Kärkkäinen
@ 2012-01-04 17:39 ` Roger Pau Monné
  2012-01-05 18:07   ` Driver domains and hotplug scripts, redux Ian Jackson
  2012-01-05 17:49 ` RFC: Still TODO for 4.2? Ian Jackson
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 97+ messages in thread
From: Roger Pau Monné @ 2012-01-04 17:39 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, Keir Fraser, Tim Deegan, Ian Jackson,
	Stefano Stabellini, Jan Beulich

2012/1/4 Ian Campbell <Ian.Campbell@citrix.com>:
> What are the outstanding things to do before we think we can start on
> the 4.2 -rc's? Does anyone have a timetable in mind?
>
> hypervisor:
>
>      * ??? - Keir, Tim, Jan?
>
> tools:
>
>      * libxl stable API -- we would like 4.2 to define a stable API
>        which downstream's can start to rely on not changing. Aspects of
>        this are:
>              * event handling (IanJ working on this)
>              * drop libxl_device_model_info (move bits to build_info or
>                elsewhere as appropriate) (IanC working on this, patches
>                shortly)
>              * add libxl_defbool and generally try and arrange that
>                memset(foo,0,...) requests the defaults (IanC working on
>                this, patches shortly)
>              * The topologyinfo datastructure should be a list of
>                tuples, not a tuple of lists. (nobody currently looking
>                at this, not 100% sure this makes sense, could possibly
>                defer and change after 4.2 in a compatible way)
>              * Block script support -- can be done post 4.2?
>      * Hotplug script stuff -- internal to libxl (I think, therefore I
>        didn't put this under stable API above) but still good to have
>        for 4.2? Roger Pau Monet was looking at this but its looking
>        like a big can-o-worms...

The hotplug implementation I've sent can be improved with asynchronous
events once IanJ's patches are in. It might also be good to do some
cleaning of the Linux hotplug scripts; right now they are a stylistic
mess, apart from the fact that they take different parameters
depending on the script being called, which I think could be avoided.

I don't know much about driver domains, but from what I've read they
should be running something like NetBSD's xenbackendd and listening for
xenstore events. Most of the functions that I've written for my hotplug
series can be used to create a little daemon; that's not the problem.
The problem is what we can use to synchronize hotplug script calling
and libxl (what comes to mind is using a dedicated xenstore variable
for each device, but someone might have a better idea).
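
For example, libxl could block on a per-device xenstore node along these
lines (a rough sketch; the "hotplug-status"/"connected" convention is
borrowed from the existing Linux scripts, and whether we reuse it or
introduce a new dedicated variable is exactly the open question):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <stdbool.h>
    #include <xs.h>

    /* Wait until whatever runs the hotplug script in the backend domain
     * writes a status node under the device's backend path. */
    static bool wait_for_device(struct xs_handle *xs, const char *bepath)
    {
        char node[256];
        unsigned int num, len;
        bool ok = false;

        snprintf(node, sizeof(node), "%s/hotplug-status", bepath);
        if (!xs_watch(xs, node, "hotplug"))
            return false;

        for (;;) {
            char **ev = xs_read_watch(xs, &num);  /* blocks for the next event */
            if (!ev)
                break;
            free(ev);

            char *status = xs_read(xs, XBT_NULL, node, &len);
            if (status) {                         /* the script has reported back */
                ok = !strcmp(status, "connected");
                free(status);
                break;
            }
        }
        xs_unwatch(xs, node, "hotplug");
        return ok;
    }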

>      * Integrate qemu+seabios upstream into the build (Stefano has
>        posted patches, I guess they need refreshing and reposting). No
>        change in default qemu for 4.2.
>      * More formally deprecate xm/xend. Manpage patches already in
>        tree. Needs release noting and communication around -rc1 to
>        remind people to test xl.
>
> Has anybody got anything else? I'm sure I've missed stuff. Are there any
> must haves e.g. in the paging/sharing spaces?



* Re: RFC: Still TODO for 4.2?
  2012-01-04 17:25 ` Pasi Kärkkäinen
  2012-01-04 17:36   ` George Dunlap
@ 2012-01-04 18:20   ` Tim Deegan
  2012-01-05 10:39     ` Ian Campbell
  2012-01-04 19:21   ` RFC: Still TODO for 4.2? Wei Huang
  2012-01-16 13:28   ` Ian Campbell
  3 siblings, 1 reply; 97+ messages in thread
From: Tim Deegan @ 2012-01-04 18:20 UTC (permalink / raw)
  To: Pasi Kärkkäinen
  Cc: xen-devel, Keir Fraser, Ian Campbell, Ian Jackson,
	Stefano Stabellini, Jan Beulich

At 19:25 +0200 on 04 Jan (1325705119), Pasi Kärkkäinen wrote:
> On Wed, Jan 04, 2012 at 04:29:22PM +0000, Ian Campbell wrote:
> > What are the outstanding things to do before we think we can start on
> > the 4.2 -rc's? Does anyone have a timetable in mind?
> > 
> > hypervisor:
> > 
> >       * ??? - Keir, Tim, Jan?

I would like to get the interface changes for sharing/paging/mem-events
done and dusted so that 4.2 is a stable API that we hold to.

It would be nice to get the implementation solid too (i.e., using wait
queues) but that can happen later if it's the only thing holding up a
release.

> - What's the status of Nested Hardware Virtualization? 
> I remember some email saying Intel vmx-on-vmx has some performance issues,
> and amd svm-on-svm works better..

The basic feature is in for AMD and Intel, but AIUI it's not getting a
lot of use and it's not in the xen.org automated testing.  The AMD code
has nested-paging support too, which is a requirement for decent
performance. 

We could call it 'experimental' for 4.2?

Cheers,

Tim.


* Re: RFC: Still TODO for 4.2?
  2012-01-04 17:25 ` Pasi Kärkkäinen
  2012-01-04 17:36   ` George Dunlap
  2012-01-04 18:20   ` Tim Deegan
@ 2012-01-04 19:21   ` Wei Huang
  2012-01-04 19:43     ` Pasi Kärkkäinen
  2012-01-16 13:28   ` Ian Campbell
  3 siblings, 1 reply; 97+ messages in thread
From: Wei Huang @ 2012-01-04 19:21 UTC (permalink / raw)
  To: Pasi Kärkkäinen
  Cc: xen-devel, Keir Fraser, Ian Campbell, Tim Deegan, Ian Jackson,
	Stefano Stabellini, Jan Beulich

On 01/04/2012 11:25 AM, Pasi Kärkkäinen wrote:
> On Wed, Jan 04, 2012 at 04:29:22PM +0000, Ian Campbell wrote:
>> What are the outstanding things to do before we think we can start on
>> the 4.2 -rc's? Does anyone have a timetable in mind?
>>
>> hypervisor:
>>
>>        * ??? - Keir, Tim, Jan?
>>
>> tools:
>>
>>        * libxl stable API -- we would like 4.2 to define a stable API
>>          which downstream's can start to rely on not changing. Aspects of
>>          this are:
>>                * event handling (IanJ working on this)
>>                * drop libxl_device_model_info (move bits to build_info or
>>                  elsewhere as appropriate) (IanC working on this, patches
>>                  shortly)
>>                * add libxl_defbool and generally try and arrange that
>>                  memset(foo,0,...) requests the defaults (IanC working on
>>                  this, patches shortly)
>>                * The topologyinfo datastructure should be a list of
>>                  tuples, not a tuple of lists. (nobody currently looking
>>                  at this, not 100% sure this makes sense, could possibly
>>                  defer and change after 4.2 in a compatible way)
>>                * Block script support -- can be done post 4.2?
>>        * Hotplug script stuff -- internal to libxl (I think, therefore I
>>          didn't put this under stable API above) but still good to have
>>          for 4.2? Roger Pau Monet was looking at this but its looking
>>          like a big can-o-worms...
>>        * Integrate qemu+seabios upstream into the build (Stefano has
>>          posted patches, I guess they need refreshing and reposting). No
>>          change in default qemu for 4.2.
>>        * More formally deprecate xm/xend. Manpage patches already in
>>          tree. Needs release noting and communication around -rc1 to
>>          remind people to test xl.
>>
>> Has anybody got anything else? I'm sure I've missed stuff. Are there any
>> must haves e.g. in the paging/sharing spaces?
>>
> - What's the status of Nested Hardware Virtualization?
> I remember some email saying Intel vmx-on-vmx has some performance issues,
> and amd svm-on-svm works better..
>
>
> - Also there's a bunch of VGA passthru related patches,
> that I once volunteered to collect/rebase/cleanup/repost myself,
> but I still haven't had time for that :(
Since there was quite a lot of interest in this subject, should we 
document the working combinations in a separate wiki page (like 
hypervisor, dom0, gfx card, driver version, tricks, etc.)?
>
> -- Pasi
>
>
>


* Re: RFC: Still TODO for 4.2?
  2012-01-04 19:21   ` RFC: Still TODO for 4.2? Wei Huang
@ 2012-01-04 19:43     ` Pasi Kärkkäinen
  2012-01-04 19:57       ` Wei Huang
  2012-01-05 13:19       ` Re : RFC: Still TODO for 4.2? David TECHER
  0 siblings, 2 replies; 97+ messages in thread
From: Pasi Kärkkäinen @ 2012-01-04 19:43 UTC (permalink / raw)
  To: Wei Huang
  Cc: xen-devel, Keir Fraser, Ian Campbell, Tim Deegan, Ian Jackson,
	Stefano Stabellini, Jan Beulich

On Wed, Jan 04, 2012 at 01:21:46PM -0600, Wei Huang wrote:
>>>
>>> Has anybody got anything else? I'm sure I've missed stuff. Are there any
>>> must haves e.g. in the paging/sharing spaces?
>>>
>> - What's the status of Nested Hardware Virtualization?
>> I remember some email saying Intel vmx-on-vmx has some performance issues,
>> and amd svm-on-svm works better..
>>
>>
>> - Also there's a bunch of VGA passthru related patches,
>> that I once volunteered to collect/rebase/cleanup/repost myself,
>> but I still haven't had time for that :(
> Since there were quite a lot of interest on this subject, should we  
> document it in a separate wiki for working combinations (like  
> hypervisor, dom0, gfx card, driver version, tricks, etc)?
>

I actually once started writing down that kind of stuff:
http://wiki.xen.org/xenwiki/XenVGAPassthroughTestedAdapters.html

Feel free to contribute :)

There's also:
http://wiki.xen.org/xenwiki/XenVGAPassthrough


-- Pasi


* Re: RFC: Still TODO for 4.2?
  2012-01-04 19:43     ` Pasi Kärkkäinen
@ 2012-01-04 19:57       ` Wei Huang
  2012-01-05  7:27         ` Pasi Kärkkäinen
  2012-01-06 15:37         ` Konrad Rzeszutek Wilk
  2012-01-05 13:19       ` Re : RFC: Still TODO for 4.2? David TECHER
  1 sibling, 2 replies; 97+ messages in thread
From: Wei Huang @ 2012-01-04 19:57 UTC (permalink / raw)
  To: Pasi Kärkkäinen
  Cc: xen-devel, Keir Fraser, Ian Campbell, Tim Deegan, Ian Jackson,
	Stefano Stabellini, Jan Beulich

On 01/04/2012 01:43 PM, Pasi Kärkkäinen wrote:
> On Wed, Jan 04, 2012 at 01:21:46PM -0600, Wei Huang wrote:
>>>> Has anybody got anything else? I'm sure I've missed stuff. Are there any
>>>> must haves e.g. in the paging/sharing spaces?
>>>>
>>> - What's the status of Nested Hardware Virtualization?
>>> I remember some email saying Intel vmx-on-vmx has some performance issues,
>>> and amd svm-on-svm works better..
>>>
>>>
>>> - Also there's a bunch of VGA passthru related patches,
>>> that I once volunteered to collect/rebase/cleanup/repost myself,
>>> but I still haven't had time for that :(
>> Since there were quite a lot of interest on this subject, should we
>> document it in a separate wiki for working combinations (like
>> hypervisor, dom0, gfx card, driver version, tricks, etc)?
>>
> I actually once started writing down that kind of stuff:
> http://wiki.xen.org/xenwiki/XenVGAPassthroughTestedAdapters.html
>
> Feel free to contribute :)
>
> There's also:
> http://wiki.xen.org/xenwiki/XenVGAPassthrough
Thanks for sharing. I will contribute my findings as needed. BTW, do you 
need my VBIOS loading patches (sent a long time ago) for AMD GPUs? It is a 
dilemma for several reasons: it doesn't always work, and sometimes it can 
screw up the main display while passing through a secondary GPU. Plus the 
recent Catalyst driver seems very stable even without these patches. But 
Wei Wang told me that he needs them for some of his cards.
>
> -- Pasi
>
>


* Re: RFC: Still TODO for 4.2?
  2012-01-04 19:57       ` Wei Huang
@ 2012-01-05  7:27         ` Pasi Kärkkäinen
  2012-01-06 15:37         ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 97+ messages in thread
From: Pasi Kärkkäinen @ 2012-01-05  7:27 UTC (permalink / raw)
  To: Wei Huang
  Cc: xen-devel, Keir Fraser, Ian Campbell, Tim Deegan, Ian Jackson,
	Stefano Stabellini, Jan Beulich

On Wed, Jan 04, 2012 at 01:57:28PM -0600, Wei Huang wrote:
> On 01/04/2012 01:43 PM, Pasi Kärkkäinen wrote:
>> On Wed, Jan 04, 2012 at 01:21:46PM -0600, Wei Huang wrote:
>>>>> Has anybody got anything else? I'm sure I've missed stuff. Are there any
>>>>> must haves e.g. in the paging/sharing spaces?
>>>>>
>>>> - What's the status of Nested Hardware Virtualization?
>>>> I remember some email saying Intel vmx-on-vmx has some performance issues,
>>>> and amd svm-on-svm works better..
>>>>
>>>>
>>>> - Also there's a bunch of VGA passthru related patches,
>>>> that I once volunteered to collect/rebase/cleanup/repost myself,
>>>> but I still haven't had time for that :(
>>> Since there were quite a lot of interest on this subject, should we
>>> document it in a separate wiki for working combinations (like
>>> hypervisor, dom0, gfx card, driver version, tricks, etc)?
>>>
>> I actually once started writing down that kind of stuff:
>> http://wiki.xen.org/xenwiki/XenVGAPassthroughTestedAdapters.html
>>
>> Feel free to contribute :)
>>
>> There's also:
>> http://wiki.xen.org/xenwiki/XenVGAPassthrough
> Thanks for sharing. I will contribute my findings as needed. BTW, do you  
> need my VBIOS loading patches (sent long time ago) for AMD GPU? It is a  
> dilemma for several reasons:  it doesn't always work; sometimes it can  
> screw up main display while passthru 2nd GPUs. Plus the recent Catalyst  
> driver seems very stable even without these patches. But Wei Wang told  
> me that he needs them for some of his cards.
>

Yes, please send the patch!

-- Pasi


* Re: RFC: Still TODO for 4.2?
  2012-01-04 18:20   ` Tim Deegan
@ 2012-01-05 10:39     ` Ian Campbell
  2012-01-06 15:24       ` RFC: Still TODO for 4.2? Nested Paging for Intel Nested Virt Pasi Kärkkäinen
  0 siblings, 1 reply; 97+ messages in thread
From: Ian Campbell @ 2012-01-05 10:39 UTC (permalink / raw)
  To: Tim Deegan
  Cc: xen-devel, Keir (Xen.org), Stefano Stabellini, Ian Jackson, Jan Beulich

On Wed, 2012-01-04 at 18:20 +0000, Tim Deegan wrote:
> At 19:25 +0200 on 04 Jan (1325705119), Pasi Kärkkäinen wrote:
> > On Wed, Jan 04, 2012 at 04:29:22PM +0000, Ian Campbell wrote:
> > > What are the outstanding things to do before we think we can start on
> > > the 4.2 -rc's? Does anyone have a timetable in mind?
> > > 
> > > hypervisor:
> > > 
> > >       * ??? - Keir, Tim, Jan?
> 
> I would like to get the interface changes for sharing/paging/mem-events
> done and dusted so that 4.2 is a stable API that we hold to.
> 
> It would be nice to get the implementation solid too (i.e., using wait
> queues) but that can happen later if it's the only thing holding up a
> release.
> 
> > - What's the status of Nested Hardware Virtualization? 
> > I remember some email saying Intel vmx-on-vmx has some performance issues,
> > and amd svm-on-svm works better..

That's the impression that I've gotten too.

> The basic feature is in for AMD and Intel, but AIUI it's not getting a
> lot of use and it's not in the xen.org automated testing.  The AMD code
> has nested-paging support too, which is a requirement for decent
> performance. 
> 
> We could call it 'experimental' for 4.2?

IMHO we shouldn't hold up the release for it, so either it is working by
the time the release happens or it is experimental. (I guess we should
consider svm-on-svm and vmx-on-vmx separately for these purposes.)

Ian.


* Re :  RFC: Still TODO for 4.2?
  2012-01-04 19:43     ` Pasi Kärkkäinen
  2012-01-04 19:57       ` Wei Huang
@ 2012-01-05 13:19       ` David TECHER
  2012-01-05 13:25         ` Ian Campbell
  1 sibling, 1 reply; 97+ messages in thread
From: David TECHER @ 2012-01-05 13:19 UTC (permalink / raw)
  To: Pasi Kärkkäinen, Wei Huang
  Cc: xen-devel, Keir Fraser, Ian Campbell, Tim Deegan, Ian Jackson,
	Stefano Stabellini, Jan Beulich



Pasi

I have been trying to maintain the patches for Xen 4.2 for a few months now.

Please have a look at http://www.davidgis.fr/blog/index.php?2011/12/07/860-xen-42unstable-patches-for-vga-pass-through

Once a week, I try to test the patches.

Let me know if I can contribute.

David


________________________________
 From: Pasi Kärkkäinen <pasik@iki.fi>
To: Wei Huang <wei.huang2@amd.com> 
Cc: xen-devel <xen-devel@lists.xensource.com>; Keir Fraser <keir@xen.org>; Ian Campbell <Ian.Campbell@citrix.com>; Tim Deegan <tim@xen.org>; Ian Jackson <Ian.Jackson@eu.citrix.com>; Stefano Stabellini <stefano.stabellini@citrix.com>; Jan Beulich <JBeulich@suse.com> 
Sent: Wednesday, 4 January 2012, 20:43
Subject: Re: [Xen-devel] RFC: Still TODO for 4.2?
 
On Wed, Jan 04, 2012 at 01:21:46PM -0600, Wei Huang wrote:
>>>
>>> Has anybody got anything else? I'm sure I've missed stuff. Are there any
>>> must haves e.g. in the paging/sharing spaces?
>>>
>> - What's the status of Nested Hardware Virtualization?
>> I remember some email saying Intel vmx-on-vmx has some performance issues,
>> and amd svm-on-svm works better..
>>
>>
>> - Also there's a bunch of VGA passthru related patches,
>> that I once volunteered to collect/rebase/cleanup/repost myself,
>> but I still haven't had time for that :(
> Since there were quite a lot of interest on this subject, should we  
> document it in a separate wiki for working combinations (like  
> hypervisor, dom0, gfx card, driver version, tricks, etc)?
>

I actually once started writing down that kind of stuff:
http://wiki.xen.org/xenwiki/XenVGAPassthroughTestedAdapters.html

Feel free to contribute :)

There's also:
http://wiki.xen.org/xenwiki/XenVGAPassthrough


-- Pasi





* Re: RFC: Still TODO for 4.2?
  2012-01-05 13:19       ` Re : RFC: Still TODO for 4.2? David TECHER
@ 2012-01-05 13:25         ` Ian Campbell
  2012-01-05 13:41           ` Re : " David TECHER
  0 siblings, 1 reply; 97+ messages in thread
From: Ian Campbell @ 2012-01-05 13:25 UTC (permalink / raw)
  To: David TECHER
  Cc: xen-devel, Keir (Xen.org), Stefano Stabellini, Tim (Xen.org),
	Wei Huang, Ian Jackson, Jan Beulich

On Thu, 2012-01-05 at 13:19 +0000, David TECHER wrote:
> Pasi
> 
> 
> I tryied to maintain the patches for Xen 4.2  since a few month.
> 
> 
> Please have a look
> http://www.davidgis.fr/blog/index.php?2011/12/07/860-xen-42unstable-patches-for-vga-pass-through

Please can you post these patches, against the tip of xen-unstable, with
a changelog etc as described in
http://wiki.xen.org/wiki/SubmittingXenPatches to xen-devel.

Then we can look at accepting them into the tree and you can stop
needing to maintain them like this. Or is there some reason they can't
be submitted?

Ian.

> 
> 
> Once a week, I try to test the patches.
> 
> 
> Let me know if I can contribute.
> 
> 
> David
> 
> 
> ______________________________________________________________________
> De : Pasi Kärkkäinen <pasik@iki.fi>
> À : Wei Huang <wei.huang2@amd.com> 
> Cc : xen-devel <xen-devel@lists.xensource.com>; Keir Fraser
> <keir@xen.org>; Ian Campbell <Ian.Campbell@citrix.com>; Tim Deegan
> <tim@xen.org>; Ian Jackson <Ian.Jackson@eu.citrix.com>; Stefano
> Stabellini <stefano.stabellini@citrix.com>; Jan Beulich
> <JBeulich@suse.com> 
> Envoyé le : Mercredi 4 Janvier 2012 20h43
> Objet : Re: [Xen-devel] RFC: Still TODO for 4.2?
> 
> On Wed, Jan 04, 2012 at 01:21:46PM -0600, Wei Huang wrote:
> >>>
> >>> Has anybody got anything else? I'm sure I've missed stuff. Are
> there any
> >>> must haves e.g. in the paging/sharing spaces?
> >>>
> >> - What's the status of Nested Hardware Virtualization?
> >> I remember some email saying Intel vmx-on-vmx has some performance
> issues,
> >> and amd svm-on-svm works better..
> >>
> >>
> >> - Also there's a bunch of VGA passthru related patches,
> >> that I once volunteered to collect/rebase/cleanup/repost myself,
> >> but I still haven't had time for that :(
> > Since there were quite a lot of interest on this subject, should
> we  
> > document it in a separate wiki for working combinations (like  
> > hypervisor, dom0, gfx card, driver version, tricks, etc)?
> >
> 
> I actually once started writing down that kind of stuff:
> http://wiki.xen.org/xenwiki/XenVGAPassthroughTestedAdapters.html
> 
> Feel free to contribute :)
> 
> There's also:
> http://wiki.xen.org/xenwiki/XenVGAPassthrough
> 
> 
> -- Pasi
> 
> 
> 
> 
> 





* Re :  RFC: Still TODO for 4.2?
  2012-01-05 13:25         ` Ian Campbell
@ 2012-01-05 13:41           ` David TECHER
  2012-01-05 16:18             ` Ian Campbell
  0 siblings, 1 reply; 97+ messages in thread
From: David TECHER @ 2012-01-05 13:41 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, Ian Jackson, Wei Huang, Tim (Xen.org),
	Jan Beulich



Ian

I will try to submit the patches tonight when I am at home (I am in France - CET); otherwise you can download them at http://www.davidgis.fr/download/xen-4.2_rev24232_gfx-passthrough-patchs.tar.bz2


For your information, and as is written in my article, I am not the author of the patches, just a simple maintainer.

However, it is worth having a try at submitting them tonight.


Thanks.

David.



________________________________
 From: Ian Campbell <Ian.Campbell@citrix.com>
To: David TECHER <davidtecher@yahoo.fr> 
Cc: xen-devel <xen-devel@lists.xensource.com>; Keir (Xen.org) <keir@xen.org>; Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>; Tim (Xen.org) <tim@xen.org>; Wei Huang <wei.huang2@amd.com>; Ian Jackson <Ian.Jackson@eu.citrix.com>; Jan Beulich <JBeulich@suse.com> 
Sent: Thursday, 5 January 2012, 14:25
Subject: Re: [Xen-devel] RFC: Still TODO for 4.2?
 
On Thu, 2012-01-05 at 13:19 +0000, David TECHER wrote:
> Pasi
> 
> 
> I tryied to maintain the patches for Xen 4.2  since a few month.
> 
> 
> Please have a look
> http://www.davidgis.fr/blog/index.php?2011/12/07/860-xen-42unstable-patches-for-vga-pass-through

Please can you post these patches, against the tip of xen-unstable, with
a changelog etc as described in
http://wiki.xen.org/wiki/SubmittingXenPatches to xen-devel.

Then we can look at accepting them in to the tree and you can stop
needing to maintain them like this. Or is there some reason these can't
be submitted?

Ian.

> 
> 
> Once a week, I try to test the patches.
> 
> 
> Let me know if I can contribute.
> 
> 
> David
> 
> 
> ______________________________________________________________________
> De : Pasi Kärkkäinen <pasik@iki.fi>
> À : Wei Huang <wei.huang2@amd.com> 
> Cc : xen-devel <xen-devel@lists.xensource.com>; Keir Fraser
> <keir@xen.org>; Ian Campbell <Ian.Campbell@citrix.com>; Tim Deegan
> <tim@xen.org>; Ian Jackson <Ian.Jackson@eu.citrix.com>; Stefano
> Stabellini <stefano.stabellini@citrix.com>; Jan Beulich
> <JBeulich@suse.com> 
> Envoyé le : Mercredi 4 Janvier 2012 20h43
> Objet : Re: [Xen-devel] RFC: Still TODO for 4.2?
> 
> On Wed, Jan 04, 2012 at 01:21:46PM -0600, Wei Huang wrote:
> >>>
> >>> Has anybody got anything else? I'm sure I've missed stuff. Are
> there any
> >>> must haves e.g. in the paging/sharing spaces?
> >>>
> >> - What's the status of Nested Hardware Virtualization?
> >> I remember some email saying Intel vmx-on-vmx has some performance
> issues,
> >> and amd svm-on-svm works better..
> >>
> >>
> >> - Also there's a bunch of VGA passthru related patches,
> >> that I once volunteered to collect/rebase/cleanup/repost myself,
> >> but I still haven't had time for that :(
> > Since there were quite a lot of interest on this subject, should
> we  
> > document it in a separate wiki for working combinations (like  
> > hypervisor, dom0, gfx card, driver version, tricks, etc)?
> >
> 
> I actually once started writing down that kind of stuff:
> http://wiki.xen.org/xenwiki/XenVGAPassthroughTestedAdapters.html
> 
> Feel free to contribute :)
> 
> There's also:
> http://wiki.xen.org/xenwiki/XenVGAPassthrough
> 
> 
> -- Pasi
> 
> 
> 
> 
> 






* Re: RFC: Still TODO for 4.2?
  2012-01-05 13:41           ` Re : " David TECHER
@ 2012-01-05 16:18             ` Ian Campbell
  0 siblings, 0 replies; 97+ messages in thread
From: Ian Campbell @ 2012-01-05 16:18 UTC (permalink / raw)
  To: David TECHER
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, Ian Jackson, Wei Huang, Tim (Xen.org),
	Jan Beulich

On Thu, 2012-01-05 at 13:41 +0000, David TECHER wrote:
> Ian
> 
> 
> I will try to submit the patch tonight when I am at home (I am in
> France - C.E.T) else you can download patches at
> http://www.davidgis.fr/download/xen-4.2_rev24232_gfx-passthrough-patchs.tar.bz2

Thanks, I'm not actually interested in this functionality myself -- I
just wanted to encourage their submission upstream since others seem to
want them.

> For your information and as it is writtten in my article, I am not the
> author of the patches, just a simple maintainer.

Then there is one additional wrinkle over and above what is described in
the SubmittingXenPatches wiki page: you should include a "Signed-off-by"
from the original author above your own.

Hopefully they included one when they originally posted the patch, in
which case you can just pass it on. If they did not then please CC them
on the posting and ask them to supply it (don't just make it up
yourself).

Hopefully the original authorship of all the patches is clear; if not,
please note this in your changelog and we'll see if we can track the
authors down, etc.

> However tt is worth having a try for submitting tonight

Please.

Thanks,
Ian.

> 
> 
> 
> Thanks.
> 
> 
> David.
> 
> 
> 
> ______________________________________________________________________
> De : Ian Campbell <Ian.Campbell@citrix.com>
> À : David TECHER <davidtecher@yahoo.fr> 
> Cc : xen-devel <xen-devel@lists.xensource.com>; Keir (Xen.org)
> <keir@xen.org>; Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>;
> Tim (Xen.org) <tim@xen.org>; Wei Huang <wei.huang2@amd.com>; Ian
> Jackson <Ian.Jackson@eu.citrix.com>; Jan Beulich <JBeulich@suse.com> 
> Envoyé le : Jeudi 5 Janvier 2012 14h25
> Objet : Re: [Xen-devel] RFC: Still TODO for 4.2?
> 
> On Thu, 2012-01-05 at 13:19 +0000, David TECHER wrote:
> > Pasi
> > 
> > 
> > I tryied to maintain the patches for Xen 4.2  since a few month.
> > 
> > 
> > Please have a look
> >
> http://www.davidgis.fr/blog/index.php?2011/12/07/860-xen-42unstable-patches-for-vga-pass-through
> 
> Please can you post these patches, against the tip of xen-unstable,
> with
> a changelog etc as described in
> http://wiki.xen.org/wiki/SubmittingXenPatches to xen-devel.
> 
> Then we can look at accepting them in to the tree and you can stop
> needing to maintain them like this. Or is there some reason these
> can't
> be submitted?
> 
> Ian.
> 
> > 
> > 
> > Once a week, I try to test the patches.
> > 
> > 
> > Let me know if I can contribute.
> > 
> > 
> > David
> > 
> > 
> >
> ______________________________________________________________________
> > De : Pasi Kärkkäinen <pasik@iki.fi>
> > À : Wei Huang <wei.huang2@amd.com> 
> > Cc : xen-devel <xen-devel@lists.xensource.com>; Keir Fraser
> > <keir@xen.org>; Ian Campbell <Ian.Campbell@citrix.com>; Tim Deegan
> > <tim@xen.org>; Ian Jackson <Ian.Jackson@eu.citrix.com>; Stefano
> > Stabellini <stefano.stabellini@citrix.com>; Jan Beulich
> > <JBeulich@suse.com> 
> > Envoyé le : Mercredi 4 Janvier 2012 20h43
> > Objet : Re: [Xen-devel] RFC: Still TODO for 4.2?
> > 
> > On Wed, Jan 04, 2012 at 01:21:46PM -0600, Wei Huang wrote:
> > >>>
> > >>> Has anybody got anything else? I'm sure I've missed stuff. Are
> > there any
> > >>> must haves e.g. in the paging/sharing spaces?
> > >>>
> > >> - What's the status of Nested Hardware Virtualization?
> > >> I remember some email saying Intel vmx-on-vmx has some
> performance
> > issues,
> > >> and amd svm-on-svm works better..
> > >>
> > >>
> > >> - Also there's a bunch of VGA passthru related patches,
> > >> that I once volunteered to collect/rebase/cleanup/repost myself,
> > >> but I still haven't had time for that :(
> > > Since there were quite a lot of interest on this subject, should
> > we  
> > > document it in a separate wiki for working combinations (like  
> > > hypervisor, dom0, gfx card, driver version, tricks, etc)?
> > >
> > 
> > I actually once started writing down that kind of stuff:
> > http://wiki.xen.org/xenwiki/XenVGAPassthroughTestedAdapters.html
> > 
> > Feel free to contribute :)
> > 
> > There's also:
> > http://wiki.xen.org/xenwiki/XenVGAPassthrough
> > 
> > 
> > -- Pasi
> > 
> > 
> > 
> > 
> > 
> 
> 
> 
> 
> 
> 





* Re: RFC: Still TODO for 4.2?
  2012-01-04 16:29 RFC: Still TODO for 4.2? Ian Campbell
                   ` (3 preceding siblings ...)
  2012-01-04 17:39 ` Roger Pau Monné
@ 2012-01-05 17:49 ` Ian Jackson
  2012-01-06 13:37   ` Ian Campbell
  2012-01-16 11:55 ` George Dunlap
  2012-01-19 21:14 ` RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend Pasi Kärkkäinen
  6 siblings, 1 reply; 97+ messages in thread
From: Ian Jackson @ 2012-01-05 17:49 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Stefano Stabellini, xen-devel, Keir Fraser, Jan Beulich, Tim Deegan

Ian Campbell writes ("[Xen-devel] RFC: Still TODO for 4.2?"):
> What are the outstanding things to do before we think we can start on
> the 4.2 -rc's? Does anyone have a timetable in mind?
> 
> hypervisor:
> 
>       * ??? - Keir, Tim, Jan?
> 
> tools:
> 
>       * libxl stable API -- we would like 4.2 to define a stable API
>         which downstream's can start to rely on not changing. Aspects of
>         this are:

Relatedly, xl should have a json-based querier so that users do not
have to rely on the weird handwritten sexp printfs.
Ian.


* Driver domains and hotplug scripts, redux
  2012-01-04 17:39 ` Roger Pau Monné
@ 2012-01-05 18:07   ` Ian Jackson
  2012-01-06 12:01     ` Stefano Stabellini
  2012-01-09 10:08     ` Roger Pau Monné
  0 siblings, 2 replies; 97+ messages in thread
From: Ian Jackson @ 2012-01-05 18:07 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Stefano Stabellini, Ian Campbell

Roger Pau Monné writes ("Re: [Xen-devel] RFC: Still TODO for 4.2?"):
> I don't know much about driver domains, but from what I've read they
> should be running something like NetBSD xenbackend and listen for
> xenstore events. Most of the functions that I've written on my hotplug
> series can be used to create a little daemon, that's not the problem,
> the problem is what can we use to synchronize hotplug script calling
> and libxl (what comes to mind is using a dedicated xenstore variable
> for each device, but someone might have a better idea).

This envisages device setup/teardown scripts in driver domains
running in a different way to those in the same domain as the
toolstack.  Are we sure this is a good idea?

I think it would be preferable to have only one interface to device
scripts, which is used everywhere.  That interface would have to
involve initiation by the toolstack, and collection of resulting
success/failure/etc., via xenstore.

The sequence of events for vifs with a kernel-level backend needs
to go like this:
  * toolstack tells backend domain to create vif, via xenstore
  * backend kernel creates a virtual network interface vifNN
  * something in backend domain notices that this vifNN
    has appeared and consequently
  * device setup script runs, enslaves vifNN to bridge, adds
    it to routing tables, gives it an address, etc.
  * something in backend domain tells toolstack vif is ready
  [ device is used ]
  * toolstack tells backend domain to destroy vif; perhaps entire
    xenstore directory is forcibly removed??
  * backend kernel removes virtual network interface immediately
    and all routes, bridge enslavements, etc., are undone
  * something in backend notices the removal
  * device teardown script may need to remove eg firewall rules
  * when this is complete, the backend domain notifies the
    toolstack (how??)

For block devices with a kernel-level backend:
  * toolstack tells backend domain to create vbd
    parameters include: vbd number, target??, script??
  * something in backend domain notices this and consequently
  * device setup script runs, creates a suitable actual
    block device in backend domain
  * backend kernel picks up actual block device details and
    becomes available to guest
  * something in backend domain tells the toolstack all is well
  [ device is used ]
  * toolstack tells backend domain to destroy vbd; perhaps entire
    xenstore directory is forcibly removed??
  * backend kernel removes its actual backend and closes the
    block device, and somehow notifies userspace when this
    is done so that
  * device teardown script cleans up, including making actual
    block device go away (if it was one which the setup script
    created)
  * when this is complete, the backend domain notifies the
    toolstack (how??)

For block devices with a user-level backend:
  * toolstack tells backend domain to create vbd
    parameters include: vbd number, target??, script??
  * userland backend notices this, does its housekeeping
    and setup, and tells the toolstack all is well
  [ device is used ]
  * toolstack tells backend domain to destroy vbd; perhaps entire
    xenstore directory is forcibly removed??
  * userland backend removes its actual backend and closes the
    resources it was using, and
  * notifies the toolstack (how??)
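
To make the "something"s a bit more concrete, here is a very rough sketch
of a watcher daemon in the backend domain; the script path, its arguments
and the status node are placeholders, not a proposal for the final
interface:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <xs.h>

    int main(void)
    {
        struct xs_handle *xs = xs_daemon_open();
        unsigned int num;

        /* Watch this domain's vif backend subtree; an analogous watch
         * would cover vbd backends. */
        if (!xs || !xs_watch(xs, "backend/vif", "vif"))
            return 1;

        for (;;) {
            char **ev = xs_read_watch(xs, &num);   /* blocks for an event */
            if (!ev)
                break;

            const char *path = ev[XS_WATCH_PATH];  /* node that changed */
            char cmd[512], status[600];

            /* Placeholder handling: a real daemon would derive the device's
             * backend directory and add vs. remove from the node that
             * changed; here the event path is simply treated as the device
             * path and handed to a (made-up) setup script. */
            snprintf(cmd, sizeof(cmd),
                     "XENBUS_PATH=%s /etc/xen/scripts/device-setup online",
                     path);
            int ok = (system(cmd) == 0);

            /* Report the outcome so the toolstack's watch can fire. */
            snprintf(status, sizeof(status), "%s/hotplug-status", path);
            xs_write(xs, XBT_NULL, status, ok ? "connected" : "error",
                     ok ? strlen("connected") : strlen("error"));
            free(ev);
        }

        xs_daemon_close(xs);
        return 0;
    }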

Much of this seems to be covered by, or coverable by, the existing
xenstore protocol.  I think we just need to define in more detail
exactly how it should all work, and on each platform how the
"something"s work.

Ian.


* Re: Driver domains and hotplug scripts, redux
  2012-01-05 18:07   ` Driver domains and hotplug scripts, redux Ian Jackson
@ 2012-01-06 12:01     ` Stefano Stabellini
  2012-01-09 10:08     ` Roger Pau Monné
  1 sibling, 0 replies; 97+ messages in thread
From: Stefano Stabellini @ 2012-01-06 12:01 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Roger Pau Monné, xen-devel, Ian Campbell, Stefano Stabellini


On Thu, 5 Jan 2012, Ian Jackson wrote:
> Roger Pau Monné writes ("Re: [Xen-devel] RFC: Still TODO for 4.2?"):
> > I don't know much about driver domains, but from what I've read they
> > should be running something like NetBSD xenbackend and listen for
> > xenstore events. Most of the functions that I've written on my hotplug
> > series can be used to create a little daemon, that's not the problem,
> > the problem is what can we use to synchronize hotplug script calling
> > and libxl (what comes to mind is using a dedicated xenstore variable
> > for each device, but someone might have a better idea).
> 
> This envisages devicer setup/teardown scripts in driver domains
> running in a different way to those in the same domain as the
> toolstack.  Are we sure this is a good idea ?
> 
> I think it would be preferable to have only one interface to device
> scripts, which is used everywhere.  That interface would have to
> involve initiation by the toolstack, and collection of resulting
> success/failure/etc., via xenstore.
> 
> The sequence of events for vifs with a kernel-level backend needs
> to go like this:
>   * toolstack tells backend domain to create vif, via xenstore
>   * backend kernel creates a virtual network interface vifNN
>   * something in backend domain notices that this vifNN
>     has appeared and consequently
>   * device setup script runs, enslaves vifNN to bridge, adds
>     it to routing tables, gives it an address, etc.
>   * something in backend domain domain tells toolstack vif is ready
>   [ device is used ]
>   * toolstack tells backend domain to destroy vif; perhaps entire
>     xenstore directory is forcibly removed??
>   * backend kernel removes virtual network interface immediately
>     and all routes, bridge enslavements, etc., are undone
>   * something in backend notices the removal
>   * device teardown script may need to remove eg firewall rules
>   * when this is complete, the backend domain notifies the
>     toolstack (how??)
> 
> For block devices with a kernel-level backend:
>   * toolstack tells backend domain to create vbd
>     parameters include: vbd number, target??, script??
>   * something in backend domain notices this and consequently
>   * device setup script runs, creates a suitable actual
>     block device in backend domain
>   * backend kernel picks up actual block device details and
>     becomes available to guest
>   * something in backend domain tells the toolstack all is well
>   [ device is used ]
>   * toolstack tells backend domain to destroy vbd; perhaps entire
>     xenstore directory is forcibly removed??
>   * backend kernel removes its actual backend and closes the
>     block device, and somehow notifies userspace when this
>     is done so that
>   * device teardown script cleans up, including making actual
>     block device go away (if it was one which the setup script
>     created)
>   * when this is complete, the backend domain notifies the
>     toolstack (how??)
> 
> For block devices with a user-level backend:
>   * toolstack tells backend domain to create vbd
>     parameters include: vbd number, target??, script??
>   * userland backend notices this, does its housekeeping
>     and setup, and tells the toolstack all is well
>   [ device is used ]
>   * toolstack tells backend domain to destroy vbd; perhaps entire
>     xenstore directory is forcibly removed??
>   * userland backend removes its actual backend and closes the
>     resources it was using, and
>   * notifies the toolstack (how??)
> 
> Much of this seems to be covered by, or coverable by, the existing
> xenstore protocol.  I think we just need to define in more detail
> exactly how it should all work, and on each platform how the
> "something"s work.
 
I have given some thought to this issue and these are my observations:

- given that the backend might be in userspace, it is advisable not
to rely on udev for the execution of the scripts;

- given that some complex network storage solutions might have difficult
timing requirements, it is advisable not to tie the execution of the setup
script to the creation of the block backend node on xenstore (that can
cause the block backend to run before its time). We want to be able to
set up the storage and, once that is done, write the params node on
xenstore for the block backend;

- the same goes for teardown: it is better not to tie the execution of the
teardown block script to the removal of the block backend node on
xenstore. We could run the scripts independently and store their
information somewhere else.

As a consequence I suggest that we adopt a solution similar to
xenbackendd; however, rather than reacting to backend creation on
xenstore, it should react to different, more explicit events in another
xenstore location.
This way the toolstack can decide exactly when the script gets executed,
independently from the block/network backend.
Also, storing the script info in a different location has the advantage
that we can prevent the script from writing to (and maybe even reading)
the backend nodes on xenstore, which frankly is quite scary.

However, to do this we would probably need to change some, most, or all of
the scripts.
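
As an illustration of the kind of flow I mean -- every path and node name
below is made up for the example, the point is only that the request is
explicit and lives apart from the backend directory:

    #include <stdio.h>
    #include <string.h>
    #include <stdbool.h>
    #include <xs.h>

    /* Toolstack side: ask the backend domain's script daemon to run a
     * setup/teardown script via a dedicated control area, instead of having
     * the daemon key off backend-node creation. */
    static bool request_script(struct xs_handle *xs, int backend_domid,
                               const char *device, const char *action)
    {
        char node[256];

        /* e.g. /local/domain/1/hotplug-control/vbd-5-51712/request = "setup" */
        snprintf(node, sizeof(node),
                 "/local/domain/%d/hotplug-control/%s/request",
                 backend_domid, device);
        if (!xs_write(xs, XBT_NULL, node, action, strlen(action)))
            return false;

        /* The daemon runs the script, then writes .../result = "ok" or
         * "error"; the toolstack watches that node to know when to proceed. */
        snprintf(node, sizeof(node),
                 "/local/domain/%d/hotplug-control/%s/result",
                 backend_domid, device);
        return xs_watch(xs, node, device);
    }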



* Re: RFC: Still TODO for 4.2?
  2012-01-05 17:49 ` RFC: Still TODO for 4.2? Ian Jackson
@ 2012-01-06 13:37   ` Ian Campbell
  2012-01-10 16:06     ` Ian Jackson
  0 siblings, 1 reply; 97+ messages in thread
From: Ian Campbell @ 2012-01-06 13:37 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Tim (Xen.org), xen-devel, Keir (Xen.org),
	Jan Beulich, Stefano Stabellini

On Thu, 2012-01-05 at 17:49 +0000, Ian Jackson wrote:
> Ian Campbell writes ("[Xen-devel] RFC: Still TODO for 4.2?"):
> > What are the outstanding things to do before we think we can start on
> > the 4.2 -rc's? Does anyone have a timetable in mind?
> > 
> > hypervisor:
> > 
> >       * ??? - Keir, Tim, Jan?
> > 
> > tools:
> > 
> >       * libxl stable API -- we would like 4.2 to define a stable API
> >         which downstream's can start to rely on not changing. Aspects of
> >         this are:
> 
> Relatedly, xl should have a json-based querier intended for users to
> not have to use the weird handwritten sexp printfs.

You mean for the "create -d" output? I agree and I've got such a patch
somewhere that I could polish off (and will).

I'd argue that the json output should be the default with sxp requiring
a special option, even though that breaks backwards compat with xm. I
have a hard time believing that the sexp printed by xl is close enough
to the xm one that people haven't already been hacking around it in
their parsers anyway...

Ian.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? Nested Paging for Intel Nested Virt
  2012-01-05 10:39     ` Ian Campbell
@ 2012-01-06 15:24       ` Pasi Kärkkäinen
  0 siblings, 0 replies; 97+ messages in thread
From: Pasi Kärkkäinen @ 2012-01-06 15:24 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Shan, Haitao, xen-devel, Keir (Xen.org),
	Dong, Eddie, Stefano Stabellini, Tim Deegan, Ian Jackson,
	Jan Beulich

On Thu, Jan 05, 2012 at 10:39:20AM +0000, Ian Campbell wrote:
> On Wed, 2012-01-04 at 18:20 +0000, Tim Deegan wrote:
> > > At 19:25 +0200 on 04 Jan (1325705119), Pasi Kärkkäinen wrote:
> > > On Wed, Jan 04, 2012 at 04:29:22PM +0000, Ian Campbell wrote:
> > > > What are the outstanding things to do before we think we can start on
> > > > the 4.2 -rc's? Does anyone have a timetable in mind?
> > > > 
> > > > hypervisor:
> > > > 
> > > >       * ??? - Keir, Tim, Jan?
> > 
> > I would like to get the interface changes for sharing/paging/mem-events
> > done and dusted so that 4.2 is a stable API that we hold to.
> > 
> > It would be nice to get the implementation solid too (i.e., using wait
> > queues) but that can happen later if it's the only thing holding up a
> > release.
> > 
> > > - What's the status of Nested Hardware Virtualization? 
> > > I remember some email saying Intel vmx-on-vmx has some performance issues,
> > > and amd svm-on-svm works better..
> 
> That's the impression that I've gotten too.
> 

Yep. 

Intel guys: any plans to implement Nested Paging for Nested Virt?


> > The basic feature is in for AMD and Intel, but AIUI it's not getting a
> > lot of use and it's not in the xen.org automated testing.  The AMD code
> > has nested-paging support too, which is a requirement for decent
> > performance. 
> > 
> > We could call it 'experimental' for 4.2?
> 
> IMHO we shouldn't hold up the release for it so either it is working by
> the time the release happens or it is experimental. (I guess we should
> consider svm-on-svm and vmx-on-vmx separately for these purposes).
> 

Makes sense.

-- Pasi

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2?
  2012-01-04 19:57       ` Wei Huang
  2012-01-05  7:27         ` Pasi Kärkkäinen
@ 2012-01-06 15:37         ` Konrad Rzeszutek Wilk
  2012-01-06 19:08           ` Wei Huang
  2012-02-06 17:57           ` Pasi Kärkkäinen
  1 sibling, 2 replies; 97+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-01-06 15:37 UTC (permalink / raw)
  To: Wei Huang
  Cc: xen-devel, Keir Fraser, Ian Campbell, Tim Deegan, Ian Jackson,
	Stefano Stabellini, Jan Beulich

On Wed, Jan 04, 2012 at 01:57:28PM -0600, Wei Huang wrote:
> On 01/04/2012 01:43 PM, Pasi Kärkkäinen wrote:
> >On Wed, Jan 04, 2012 at 01:21:46PM -0600, Wei Huang wrote:
> >>>>Has anybody got anything else? I'm sure I've missed stuff. Are there any
> >>>>must haves e.g. in the paging/sharing spaces?
> >>>>
> >>>- What's the status of Nested Hardware Virtualization?
> >>>I remember some email saying Intel vmx-on-vmx has some performance 
> >>>issues,
> >>>and amd svm-on-svm works better..
> >>>
> >>>
> >>>- Also there's a bunch of VGA passthru related patches,
> >>>that I once volunteered to collect/rebase/cleanup/repost myself,
> >>>but I still haven't had time for that :(
> >>Since there were quite a lot of interest on this subject, should we
> >>document it in a separate wiki for working combinations (like
> >>hypervisor, dom0, gfx card, driver version, tricks, etc)?
> >>
> >I actually once started writing down that kind of stuff:
> >http://wiki.xen.org/xenwiki/XenVGAPassthroughTestedAdapters.html
> >
> >Feel free to contribute :)
> >
> >There's also:
> >http://wiki.xen.org/xenwiki/XenVGAPassthrough
> Thanks for sharing. I will contribute my findings as needed. BTW, do you 
> need my VBIOS loading patches (sent long time ago) for AMD GPU? It is a 

Yes! Though I haven't yet figured out how to extract the AMD GPU BIOS
from the card. I've been able to pass a Radeon 4870 through to a Win 7 HVM
guest and it works nicely.. the first time. After I shut down the guest
it just never works again :-(
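
(For what it's worth, on Linux the card's option ROM can often be dumped
via sysfs -- a sketch, assuming the GPU sits at 0000:01:00.0 and exposes
a readable ROM, which not all cards do:)

    #!/bin/sh
    BDF=0000:01:00.0                       # the GPU's PCI address (example)
    ROM=/sys/bus/pci/devices/$BDF/rom

    echo 1 > "$ROM"                        # enable reads of the ROM image
    cat "$ROM" > /root/vbios-$BDF.rom      # save it for the VBIOS loading patches
    echo 0 > "$ROM"                        # disable it again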

> dilemma for several reasons:  it doesn't always work; sometimes it can 
> screw up main display while passthru 2nd GPUs. Plus the recent Catalyst 
> driver seems very stable even without these patches. But Wei Wang told 
> me that he needs them for some of his cards.
> >
> >-- Pasi
> >
> >
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2?
  2012-01-06 15:37         ` Konrad Rzeszutek Wilk
@ 2012-01-06 19:08           ` Wei Huang
  2012-02-06 17:57           ` Pasi Kärkkäinen
  1 sibling, 0 replies; 97+ messages in thread
From: Wei Huang @ 2012-01-06 19:08 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: xen-devel, Keir Fraser, Ian Campbell, Tim Deegan, Ian Jackson,
	Stefano Stabellini, Jan Beulich

On 01/06/2012 09:37 AM, Konrad Rzeszutek Wilk wrote:
> On Wed, Jan 04, 2012 at 01:57:28PM -0600, Wei Huang wrote:
>> On 01/04/2012 01:43 PM, Pasi Kärkkäinen wrote:
>>> On Wed, Jan 04, 2012 at 01:21:46PM -0600, Wei Huang wrote:
>>>>>> Has anybody got anything else? I'm sure I've missed stuff. Are there any
>>>>>> must haves e.g. in the paging/sharing spaces?
>>>>>>
>>>>> - What's the status of Nested Hardware Virtualization?
>>>>> I remember some email saying Intel vmx-on-vmx has some performance
>>>>> issues,
>>>>> and amd svm-on-svm works better..
>>>>>
>>>>>
>>>>> - Also there's a bunch of VGA passthru related patches,
>>>>> that I once volunteered to collect/rebase/cleanup/repost myself,
>>>>> but I still haven't had time for that :(
>>>> Since there were quite a lot of interest on this subject, should we
>>>> document it in a separate wiki for working combinations (like
>>>> hypervisor, dom0, gfx card, driver version, tricks, etc)?
>>>>
>>> I actually once started writing down that kind of stuff:
>>> http://wiki.xen.org/xenwiki/XenVGAPassthroughTestedAdapters.html
>>>
>>> Feel free to contribute :)
>>>
>>> There's also:
>>> http://wiki.xen.org/xenwiki/XenVGAPassthrough
>> Thanks for sharing. I will contribute my findings as needed. BTW, do you
>> need my VBIOS loading patches (sent long time ago) for AMD GPU? It is a
> Yes! Thought I haven't figured out yet how to extract the AMD GPU BIOS
> from the card. I've been able to pass in a Radeon 4870 to an Win 7 HVM
> guest and it works nicely.. the first time. After I shut down the guest
> it just never works again :-(

This is a known issue which should be addressed for 4.2. I remember
several people reporting this problem.

>
>> dilemma for several reasons:  it doesn't always work; sometimes it can
>> screw up main display while passthru 2nd GPUs. Plus the recent Catalyst
>> driver seems very stable even without these patches. But Wei Wang told
>> me that he needs them for some of his cards.
>>> -- Pasi
>>>
>>>
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xensource.com
>> http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-05 18:07   ` Driver domains and hotplug scripts, redux Ian Jackson
  2012-01-06 12:01     ` Stefano Stabellini
@ 2012-01-09 10:08     ` Roger Pau Monné
  2012-01-09 12:26       ` Stefano Stabellini
  1 sibling, 1 reply; 97+ messages in thread
From: Roger Pau Monné @ 2012-01-09 10:08 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Stefano Stabellini, Ian Campbell

2012/1/5 Ian Jackson <Ian.Jackson@eu.citrix.com>:
> Roger Pau Monné writes ("Re: [Xen-devel] RFC: Still TODO for 4.2?"):
>> I don't know much about driver domains, but from what I've read they
>> should be running something like NetBSD xenbackend and listen for
>> xenstore events. Most of the functions that I've written on my hotplug
>> series can be used to create a little daemon, that's not the problem,
>> the problem is what can we use to synchronize hotplug script calling
>> and libxl (what comes to mind is using a dedicated xenstore variable
>> for each device, but someone might have a better idea).

Sorry I didn't reply earlier, I was still on holiday (mostly working
on the autotools stuff).

> This envisages device setup/teardown scripts in driver domains
> running in a different way to those in the same domain as the
> toolstack.  Are we sure this is a good idea?

No, I think it's best that even if the driver domain is Dom0 the same
procedure is followed: xenbackendd should be running in every driver
domain, Dom0 included, and the toolstack should never execute hotplug
scripts directly.

> I think it would be preferable to have only one interface to device
> scripts, which is used everywhere.  That interface would have to
> involve initiation by the toolstack, and collection of resulting
> success/failure/etc., via xenstore.

Are we only going to use xenstore to share information between both
domains (Dom0 <--> Driver domain)?

I'm going to comment on the vif case, but I think both vif and block
devices should follow the same approach, and hotplug script execution
has to be something "standard" that does not depend on the type of
device.

> The sequence of events for vifs with a kernel-level backend needs
> to go like this:
>  * toolstack tells backend domain to create vif, via xenstore

How does the toolstack tell a domain to create a device? Does creating a
xenstore entry like:

/local/domain/<domid>/backend/vif/...

trigger the creation of a vif interface in the <domid> domain?

>  * backend kernel creates a virtual network interface vifNN
>  * something in backend domain notices that this vifNN
>    has appeared and consequently

This should be handled by xenbackendd (I know it will not exactly be
xenbackendd, but let's call it that to simplify things), since it
should be listening to /local/domain/<domid>/backend/* for changes and
reacting to them.

>  * device setup script runs, enslaves vifNN to bridge, adds
>    it to routing tables, gives it an address, etc.

Handled by hotplug scripts.

>  * something in backend domain domain tells toolstack vif is ready

Hotplug scripts should change the backend state (and write the
appropriate values) to notify everyone when everything is OK. Since
xenbackendd is the one that executes the scripts, it should examine the
exit code of the called hotplug script and write the exit status code
and message if hotplug script execution is not successful. These values
can be retrieved by the toolstack, which can notify the user if
something failed.
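
Something along these lines (a sketch; the hotplug-status/hotplug-error
names match what the existing Linux scripts write, the rest is
illustrative):

    # xenbackendd runs the hotplug script and records the result where the
    # toolstack can read it.
    run_hotplug() {
        dir=$1 script=$2 action=$3
        if out=$("$script" "$action" 2>&1); then
            xenstore-write "$dir/hotplug-status" connected
        else
            xenstore-write "$dir/hotplug-status" error
            xenstore-write "$dir/hotplug-error" "$out"
        fi
    }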

>  [ device is used ]
>  * toolstack tells backend domain to destroy vif; perhaps entire
>    xenstore directory is forcibly removed??

If the entire xenstore directory is forcibly removed, how does
xenbackendd know the parameters to pass to the hotplug script to shut
down the device? Do we have to keep a copy of them somewhere else (in
xenstore, or in a private xenbackendd database)?

Here we have two cases, depending on whether it is a shutdown or a destroy:

When doing a shutdown, the toolstack should wait for a notification
from the driver domain that hotplug execution has finished (either
successfully or not) and then proceed with the removal of the xenstore
directory.

DomU closes device --> driver domain notices --> execution of hotplug
scripts --> write result to xenstore --> toolstack reads results of
hotplug teardown.

When doing a destroy, should the toolstack manually set the frontend
state to closed, and thus force the execution of hotplug scripts in
the driver domain? I know this has been a cause of discussion in
previous patches, but I really don't see the problem with modifying
the frontend status if the domain is already dead: it's just a way to
force the unplug of the device and the execution of hotplug scripts.
Normally the DomU should set the frontend status to closed, but since
we killed it from the toolstack, the toolstack itself should be the one
in charge of setting the status to closed.

toolstack kills domain --> toolstack sets frontend status to closed
--> driver domain kernel notices frontend change and closes backend
--> xenbackendd notices change --> execution of hotplug scripts -->
write results to xenstore --> toolstack reads results of hotplug
teardown.
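
In xenstore terms the toolstack side of that forced unplug would be
something like (illustrative ids; 6 is XenbusStateClosed):

    # After destroying domain 5, force its vif 0 frontend to Closed so the
    # backend in the driver domain tears down and the hotplug scripts run.
    xenstore-write /local/domain/5/device/vif/0/state 6
    # ...then wait for the driver domain to report that teardown has finished.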

>  * backend kernel removes virtual network interface immediately
>    and all routes, bridge enslavements, etc., are undone
>  * something in backend notices the removal
>  * device teardown script may need to remove eg firewall rules
>  * when this is complete, the backend domain notifies the
>    toolstack (how??)

Should the toolstack wait for a notification from the driver domain? I
think it is important that the toolstack is always aware of what
happens in the driver domain, and it should wait for the execution of
the teardown hotplug scripts and catch their results, so it can notify
the user if teardown is not successful.

> For block devices with a kernel-level backend:
>  * toolstack tells backend domain to create vbd
>    parameters include: vbd number, target??, script??
>  * something in backend domain notices this and consequently
>  * device setup script runs, creates a suitable actual
>    block device in backend domain
>  * backend kernel picks up actual block device details and
>    becomes available to guest
>  * something in backend domain tells the toolstack all is well
>  [ device is used ]
>  * toolstack tells backend domain to destroy vbd; perhaps entire
>    xenstore directory is forcibly removed??
>  * backend kernel removes its actual backend and closes the
>    block device, and somehow notifies userspace when this
>    is done so that
>  * device teardown script cleans up, including making actual
>    block device go away (if it was one which the setup script
>    created)
>  * when this is complete, the backend domain notifies the
>    toolstack (how??)
>
> For block devices with a user-level backend:
>  * toolstack tells backend domain to create vbd
>    parameters include: vbd number, target??, script??
>  * userland backend notices this, does its housekeeping
>    and setup, and tells the toolstack all is well
>  [ device is used ]
>  * toolstack tells backend domain to destroy vbd; perhaps entire
>    xenstore directory is forcibly removed??
>  * userland backend removes its actual backend and closes the
>    resources it was using, and
>  * notifies the toolstack (how??)

When it comes to block hotplug scripts, we have to let xenbackendd
decide which kind of backend to use, so we should agree on xenstore
contents that can cover all types of block backend (phy, qdisk,
blktap...), since the toolstack probably doesn't have access to the
requested medium.
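
For example, the daemon in the driver domain could pick the backend type
from the target it is handed (purely illustrative policy, not a proposal
for the exact rules):

    # Decide how to serve the requested target inside the driver domain.
    dir=$1                                    # xenstore dir of the request
    target=$(xenstore-read "$dir/params")
    if [ -b "$target" ]; then
        backend=vbd        # block device -> kernel blkback
    elif [ -f "$target" ]; then
        backend=qdisk      # plain file   -> qemu qdisk backend
    else
        echo "don't know how to serve $target" >&2
        exit 1
    fi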

> Much of this seems to be covered by, or coverable by, the existing
> xenstore protocol.  I think we just need to define in more detail
> exactly how it should all work, and on each platform how the
> "something"s work.
>
> Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-09 10:08     ` Roger Pau Monné
@ 2012-01-09 12:26       ` Stefano Stabellini
  2012-01-09 17:39         ` Ian Jackson
  0 siblings, 1 reply; 97+ messages in thread
From: Stefano Stabellini @ 2012-01-09 12:26 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, Ian Jackson, Ian Campbell, Stefano Stabellini


On Mon, 9 Jan 2012, Roger Pau Monné wrote:
> > This envisages device setup/teardown scripts in driver domains
> > running in a different way to those in the same domain as the
> > toolstack.  Are we sure this is a good idea?
> 
> No, I think it's best that even is the driver domain is Dom0 the same
> procedure should be executed, and xenbackendd should be running in
> every driver domain, Dom0 included, toolstack should never execute
> hotplug scripts directly.

I agree.


> > I think it would be preferable to have only one interface to device
> > scripts, which is used everywhere.  That interface would have to
> > involve initiation by the toolstack, and collection of resulting
> > success/failure/etc., via xenstore.
> 
> Are we only going to use xenstore to share information between both
> domains (Dom0 <--> Driver domain)?

That would make things easier, but we have to be careful not to turn
xenstore into an RPC-style communication mechanism, because it is not
very good at that. In that case we would be better off with libvchan.


> I'm going to comment the vif case, but I think both vif and block
> devices should follow the same approach, and hotplug script execution
> has to be something "standard" and should not rely on the type of the
> device.

Right. However vif and block scripts need to be executed at different
points of the lifecycle of the VM.


> > The sequence of events for vifs with a kernel-level backend needs
> > to go like this:
> >  * toolstack tells backend domain to create vif, via xenstore
> 
> How does the toolstack tell a domain to create a device? Creating a
> xenstore entry like:
> 
> /local/domain/<domid>/backend/vif/...
> 
> does trigger the creation of a vif interface in the <domid> domain?

Yes, I think so.


> >  * backend kernel creates a virtual network interface vifNN
> >  * something in backend domain notices that this vifNN
> >    has appeared and consequently
> 
> This should be handled by xenbackendd (I know it will not exactly be
> xenbackendd, but let's call it that way to simplify things), since it
> should be listening to /local/domain/<domid>/backend/* for changes and
> react upon them.

I think this is wrong because we would be tying the vif creation to
the script execution, while these two kinds of events might need to
happen at different points in time (especially in the block case). We
need to be flexible.
I would make xenbackendd listen to a different xenstore location,
maybe /hotplug/<domid>/*, so that the toolstack can explicitly ask
xenbackendd for something and make sure that it gets done before taking
other actions.
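
For instance (a hypothetical layout, just to illustrate the idea):

    # The toolstack explicitly asks the driver domain's xenbackendd to set up
    # a vif, and only proceeds once it reports success.
    req=/hotplug/1/vif/5/0                    # driver domain 1, guest 5, vif 0
    xenstore-write $req/script /etc/xen/scripts/vif-bridge
    xenstore-write $req/bridge xenbr0
    xenstore-write $req/request add
    # wait for $req/state to read "connected" before writing (or asking the
    # driver domain to write) the usual backend/frontend entries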


> >  * device setup script runs, enslaves vifNN to bridge, adds
> >    it to routing tables, gives it an address, etc.
> 
> Handled by hotplug scripts.
> 
> >  * something in backend domain domain tells toolstack vif is ready
> 
> Hotplug scripts should change backend state (and write the appropriate
> values) to notify everything when ok. Since xenbackendd is the one
> that executes the scripts, it should examine the exit code of the
> called hotplug script and write the exit status code and message if
> hotplug script execution is not successful. This values can be
> retrieved from the toolstack and notify the user if something failed.

Or the script could write the return value to /hotplug/<domid>/vif/state
itself. Either way should work.


> >  [ device is used ]
> >  * toolstack tells backend domain to destroy vif; perhaps entire
> >    xenstore directory is forcibly removed??
> 
> If entire xenstore directory is forcibly removed, how does xenbackendd
> know the parameters to pass to the hotplug script to shutdown the
> device? Do we have to keep a copy of this somewhere else (xenstore or
> create a xenbackendd private database)?

This problem would disappear if we use /hotplug/<domid> rather than
/local/domain/<domid>/backend to store the parameters.


> Here we have two cases, whether it is a shutdown or a destroy:
> 
> When doing a shutdown the toolstack should wait to get a notification
> from the driver domain that hotplug execution was done (either
> successfully or not) and then proceed with the removal of xenstore
> directory.
> 
> DomU closes device --> driver domain notices --> execution of hotplug
> scripts --> write result to xenstore --> toolstack reads results of
> hotplug teardown.
> 
> When doing a destroy, the toolstack should manually set the frontend
> state to closed, and thus force the execution of hotplug scripts in
> the driver domain? I know this has been a cause of discussion in
> previous patches, but I really don't see the problem with modifying
> the frontend status if the domain is already dead, it's just a way to
> force the unplug of the device and the execution of hotplug scripts.
> Normally the DomU should set the frontend status to closed, but since
> we killed it from the toolstack, it should be the toolstack itself the
> one in charge of setting the status to closed.

This problem would go away too if we used /hotplug/<domid> rather than
/local/domain/<domid>/backend to trigger xenbackendd events.


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-09 12:26       ` Stefano Stabellini
@ 2012-01-09 17:39         ` Ian Jackson
       [not found]           ` <alpine.DEB.2.00.1201101448030.3150@kaball-desktop>
  0 siblings, 1 reply; 97+ messages in thread
From: Ian Jackson @ 2012-01-09 17:39 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Ian Campbell

Stefano Stabellini writes ("Re: Driver domains and hotplug scripts, redux"):
> On Mon, 9 Jan 2012, Roger Pau Monné wrote:
> > This should be handled by xenbackendd (I know it will not exactly be
> > xenbackendd, but let's call it that way to simplify things), since it
> > should be listening to /local/domain/<domid>/backend/* for changes and
> > react upon them.
> 
> I think this is wrong because we would be tying together the vif
> creation with the script execution, while these two kinds of events
> might need to be executed at different point in time (especially in the
> block case). We need be flexible.
> I would make xenbackendd listen to a different xenstore location,
> maybe /hotplug/<domid>/*, so that the toolstack can explicitly ask
> xenbackendd for something, making sure that it gets done before taking
> other actions.

The fact that any script is being run, and exactly what that script
is, and when, does not need to be visible to the main toolstack.  It's
a function entirely inside the driver domain.

So I disagree.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2?
  2012-01-06 13:37   ` Ian Campbell
@ 2012-01-10 16:06     ` Ian Jackson
  0 siblings, 0 replies; 97+ messages in thread
From: Ian Jackson @ 2012-01-10 16:06 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, Ian Jackson, Tim (Xen.org),
	Jan Beulich

Ian Campbell writes ("Re: [Xen-devel] RFC: Still TODO for 4.2?"):
> I'd argue that the json output should be the default with sxp requiring
> a special option, even though that break backwards compat with xm. I
> have a hard time believing that the sexp printed by xl is close enough
> to the xm one that people haven't already been hacking around it in
> their parsers anyway...

Yes.  Now that we have a global xl configuration file, we could put an
option in there to revert to the weird sexps.

Ian.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
       [not found]                         ` <20236.28931.127139.752426@mariner.uk.xensource.com>
@ 2012-01-11 11:50                           ` Roger Pau Monné
  2012-01-11 12:17                             ` Ian Campbell
  2012-01-11 12:50                             ` Stefano Stabellini
  0 siblings, 2 replies; 97+ messages in thread
From: Roger Pau Monné @ 2012-01-11 11:50 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Stefano Stabellini

Hello,

This comes from my experience with Xen and hotplug scripts, and it
might be wrong, since I wasn't able to find any document explaining
exactly how hotplug execution works and who does what. I'm going to try
to list the sequence of events that happens when a device is added (I
really don't want to keep up the discussion about whether this is a
protocol or not):

1. Toolstack writes: /local/domain/0/backend/<vbd or vif>/... with "state = 1".
2. Kernel acks xenstore backend device creation, creates the device
and sets backend "state = 2".
3. xenbackendd notices backend device with "state == 2" and launches
hotplug script.
4. Hotplug script executes necessary actions and sets backend
"hotplug-status = connected".
5. Kernel notices "hotplug-status == connected", plugs the device, and
sets xenstore backend device "state = 4".

This is true on NetBSD, because there aren't any userspace backend
devices; someone should probably add the missing bits if the device is
implemented in userspace (I'm not really sure what happens inside
the kernel in #2 and #5, especially when using blktap or qdisk).
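
In xenstore terms the plug sequence above looks roughly like this (a
sketch with illustrative paths, for vif 0 of domain 5 with the backend
in domain 0):

    BE=/local/domain/0/backend/vif/5/0
    xenstore-write $BE/state 1                   # 1. toolstack creates the backend
    #                                              (plus the other backend nodes)
    #                                              2. kernel sets $BE/state = 2
    #                                              3. xenbackendd runs the script
    xenstore-write $BE/hotplug-status connected  # 4. script reports success
    #                                              5. kernel sets $BE/state = 4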

Regarding device shutdown/destroy:

1. Guest sets frontend state to 6 (closed)
2. Kernel unplugs the device and sets backend "state = 6".
3. xenbackendd notices device with "state == 6", and performs the
necessary cleanup.
3. Toolstack notices device with "state == 6" and removes xenstore
backend entries.

Notice that I've used two #3s: that's where the race condition happens,
because there's no synchronization between the toolstack and
hotplug/xenbackendd to know when the hotplug scripts have been executed
(however, we should be able to synchronize this by watching
"hotplug-status" instead of "state" and waiting for it to change to
"disconnected").

Now, we have to decide how to fix the shutdown/destroy race and how to
implement this outside of Dom0. I'm not really sure it's a good idea to
try so hard to keep this flow intact; I think it's better to define a
flow that solves our current problems, regardless of how things are
now, and then map the two flows onto each other to see what should be
changed and how.

Since the device will be plugged from a domain other than Dom0,
the toolstack doesn't really (and probably shouldn't) know anything
about which backend type will be used (phy, blktap, qdisk...). With
that in mind, I don't see how we can write
/local/domain/<driverdom_id>/backend/... from Dom0; instead we should
create something like:

/hotplug/domain/<driverdom_id>/<vbd or vif>/<domu_id>/<device_id>/params
/hotplug/domain/<driverdom_id>/<vbd or vif>/<domu_id>/<device_id>/script
/hotplug/domain/<driverdom_id>/<vbd or vif>/<domu_id>/<device_id>/state
[These seem like the minimum necessary parameters, but there are
probably others, so add what you feel is necessary]

With that, the driver domain should be able to create
/local/domain/<driverdomain_id>/backend/... and the frontend entries as
well.
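
As a concrete (hypothetical) example, for vbd 51712 of domain 3 served
by driver domain 1 the toolstack would write something like:

    base=/hotplug/domain/1/vbd/3/51712
    xenstore-write $base/params /dev/vg0/guest-root      # what to export
    xenstore-write $base/script /etc/xen/scripts/block   # how to set it up
    xenstore-write $base/state  requested                # kick xenbackendd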

I'm not sure whether we should control the execution of hotplug
scripts from Dom0, or instead let the driver domain decide when it's
best to execute each script. This adds /hotplug to xenstore, but the
plug/unplug sequence could be the same as the one we currently have;
the only change is that each driver domain is in charge of writing
its own xenstore backend/frontend entries to trigger the plug
sequence.

Hope that helps, Roger.

(xen-devel mailing list was removed at some point during the
conversation, so I'm adding it again)

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-11 11:50                           ` Roger Pau Monné
@ 2012-01-11 12:17                             ` Ian Campbell
  2012-01-11 14:26                               ` Dave Scott
                                                 ` (2 more replies)
  2012-01-11 12:50                             ` Stefano Stabellini
  1 sibling, 3 replies; 97+ messages in thread
From: Ian Campbell @ 2012-01-11 12:17 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Ian Jackson, Stefano Stabellini

On Wed, 2012-01-11 at 11:50 +0000, Roger Pau Monné wrote:
> Hello,
> 
> This comes from my experience with Xen and hotplug scripts, and it
> might be wrong, since I wasn't able to find any document explaining
> exactly how hotplug execution works and who does what. I'm gonna try
> to list the sequence of events that happens when a device is added (I
> really don't want to keep on with the discusion if this is a protocol
> or not):
> 
> 1. Toolstack writes: /local/domain/0/backend/<vbd or vif>/... with "state = 1".
> 2. Kernel acks xenstore backend device creation, creates the device
> and sets backend "state = 2".
> 3. xenbackendd notices backend device with "state == 2" and launches
> hotplug script.

On Linux I think state == 2 corresponds to the generation of a
uevent, which triggers udev to run the hotplug script. I'm not 100% sure
that all devices do this at the same point though.

> 4. Hotplug script executes necessary actions and sets backend
> "hotplug-status = connected".
> 5. Kernel notices "hotplug-status == connected", plugs the device, and
> sets xenstore backend device "state = 4".

I think 4+5 are correct for Linux netback, but blkback actually
waits for phys-dev (or whatever its real name is) to be written rather
than for the hotplug-status node.
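
Roughly what the Linux block script ends up writing once it has a device
(a sketch; XENBUS_PATH is the backend directory handed to the script in
its environment, the device is an example):

    dev=/dev/loop0                            # device the setup script prepared
    majmin=$(stat -L -c '%t:%T' "$dev")       # hex major:minor
    xenstore-write "$XENBUS_PATH/physical-device" "$majmin"
    xenstore-write "$XENBUS_PATH/hotplug-status" connected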

> This is true on NetBSD, because there aren't any userspace hotplug
> devices, someone should probably add the missing bits if the device is
> implemented in userspace (I'm not really sure of what happens inside
> the kernel in #2 and #5, specially when using blktap or qdisk).

Nothing happens in the kernel for qdisk. It is a separate backend path
which the kernel doesn't watch or have a driver for.

blktap1 behaves a lot like blkback, I think.

blktap2 doesn't use xenbus IIRC; rather, it is created via userspace
tools/libraries. There _might_ be some hotplug script interaction which
causes the phys-dev node to get written to the associated blkback device,
but I think this is not the case and the toolstack just writes phys-dev
itself because it knows what it is from when it created it.

> Regarding device shutdown/destroy:

We need to consider 3 cases:
      * guest-initiated graceful shutdown
      * toolstack-initiated graceful shutdown
      * toolstack-initiated forceful destroy.

> 1. Guest sets frontend state to 6 (closed)
> 2. Kernel unplugs the device and sets backend "state = 6".
> 3. xenbackendd notices device with "state == 6", and performs the
> necessary cleanup.
> 3. Toolstack notices device with "state == 6" and removes xenstore
> backend entries.

At least some backends/frontends make use of state 5 as part of this,
probably at #1 or #2.

The ordering of #1 and #2 probably depends on whether the frontend or
the backend initiates things.

The forceful destroy case is different; it is effectively:
1. rm the backend dir in xenstore.

Somewhere in both of these flows a Linux backend will generate a hotplug
event which will cause a script to run, although in some cases the script
can't do much because the backend dir is already gone...
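
In xenstore-command terms that forced case really is just removing the
backend directory, e.g. (illustrative path):

    xenstore-rm /local/domain/0/backend/vif/5/0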

> Notice that I've used two #3, that's where the race condition happens,
> because there's no synchronization between toolstack and
> hotplug/xenbackendd to know when hotplug scripts have been executed
> (however we should be able to synchronize this watching
> "hotplug-status" instead of "state", and waiting for it to change to
> "disconnected").
> 
> Now, we have to decide how to fix the shutdown/destroy race and how to
> implement this outside of the Dom0. I'm not really sure if it's a good
> idea to try so hard to keep this flow intact, I think it's best to try
> to define a flow that solves our current problems, regardless of how
> things are now, and then try to map both flows to see what should be
> changed and how.
> 
> Since the device will be plugged from a Domain different than Dom0,
> the toolstack doesn't really (and probably shouldn't) know anything
> about which backend type will be used (phy, blktap, qdisk...). Having
> that in mind, I don't know how can we write
> /local/domain/<driverdom_id>/backend/... from Dom0, instead we should
> create something like:
> 
> /hotplug/domain/<driverdom_id>/<vbd or vif>/<domu_id>/<device_id>/params
> /hotplug/domain/<driverdom_id>/<vbd or vif>/<domu_id>/<device_id>/script
> /hotplug/domain/<driverdom_id>/<vbd or vif>/<domu_id>/<device_id>/state
> [This seem like the minimum necessary parameters, but probably there
> are others, so add what you feel necessary]
> 
> With that the driver domain should be able to create
> /local/domain/<driverdomain_id>/backend/... and the frontend also.
> 
> I'm not sure if we should control the execution of hotplug scripts
> from Dom0, or instead let the driver domain decide when it's best to
> execute each script. This adds /hotplug to xenstore, but the
> plug/unplug sequence could be the same as the one we currently have,
> the only change is that each driver domain is in charge of writing
> it's own xenstore backend/frontend entries to trigger the plug
> sequence.
> 
> Hope that helps, Roger.
> 
> (xen-devel mailing list was removed at some point during the
> conversation, so I'm adding it again)
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-11 11:50                           ` Roger Pau Monné
  2012-01-11 12:17                             ` Ian Campbell
@ 2012-01-11 12:50                             ` Stefano Stabellini
  1 sibling, 0 replies; 97+ messages in thread
From: Stefano Stabellini @ 2012-01-11 12:50 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Ian Jackson, Stefano Stabellini


On Wed, 11 Jan 2012, Roger Pau Monné wrote:
> Hello,
> 
> This comes from my experience with Xen and hotplug scripts, and it
> might be wrong, since I wasn't able to find any document explaining
> exactly how hotplug execution works and who does what. I'm gonna try
> to list the sequence of events that happens when a device is added (I
> really don't want to keep on with the discusion if this is a protocol
> or not):
> 
> 1. Toolstack writes: /local/domain/0/backend/<vbd or vif>/... with "state = 1".
> 2. Kernel acks xenstore backend device creation, creates the device
> and sets backend "state = 2".
> 3. xenbackendd notices backend device with "state == 2" and launches
> hotplug script.
> 4. Hotplug script executes necessary actions and sets backend
> "hotplug-status = connected".
> 5. Kernel notices "hotplug-status == connected", plugs the device, and
> sets xenstore backend device "state = 4".
> 
> This is true on NetBSD, because there aren't any userspace hotplug
> devices, someone should probably add the missing bits if the device is
> implemented in userspace (I'm not really sure of what happens inside
> the kernel in #2 and #5, specially when using blktap or qdisk).
> 
> Regarding device shutdown/destroy:
> 
> 1. Guest sets frontend state to 6 (closed)
> 2. Kernel unplugs the device and sets backend "state = 6".
> 3. xenbackendd notices device with "state == 6", and performs the
> necessary cleanup.
> 3. Toolstack notices device with "state == 6" and removes xenstore
> backend entries.
> 
> Notice that I've used two #3, that's where the race condition happens,
> because there's no synchronization between toolstack and
> hotplug/xenbackendd to know when hotplug scripts have been executed
> (however we should be able to synchronize this watching
> "hotplug-status" instead of "state", and waiting for it to change to
> "disconnected").
> 
> Now, we have to decide how to fix the shutdown/destroy race and how to
> implement this outside of the Dom0. I'm not really sure if it's a good
> idea to try so hard to keep this flow intact, I think it's best to try
> to define a flow that solves our current problems, regardless of how
> things are now, and then try to map both flows to see what should be
> changed and how.
> 
> Since the device will be plugged from a Domain different than Dom0,
> the toolstack doesn't really (and probably shouldn't) know anything
> about which backend type will be used (phy, blktap, qdisk...). Having
> that in mind, I don't know how can we write
> /local/domain/<driverdom_id>/backend/... from Dom0, instead we should
> create something like:
> 
> /hotplug/domain/<driverdom_id>/<vbd or vif>/<domu_id>/<device_id>/params
> /hotplug/domain/<driverdom_id>/<vbd or vif>/<domu_id>/<device_id>/script
> /hotplug/domain/<driverdom_id>/<vbd or vif>/<domu_id>/<device_id>/state
> [This seem like the minimum necessary parameters, but probably there
> are others, so add what you feel necessary]
> 
> With that the driver domain should be able to create
> /local/domain/<driverdomain_id>/backend/... and the frontend also.
 
What you are proposing is good enough to solve the problem I was
describing before: xenbackendd in the driver domain would have the
freedom to run any setup script it needs to run before writing the
backend nodes to xenstore.
Also, considering that xenbackendd would be in charge of both running
the script and writing/removing the backend nodes, it would have full
control over the sequence of events, which gives us a lot of flexibility
to deal with complex scenarios.
For these reasons, I support this idea.


> I'm not sure if we should control the execution of hotplug scripts
> from Dom0, or instead let the driver domain decide when it's best to
> execute each script. 

At this point it is best to keep the driver domain in charge of its own
scripts.


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-11 12:17                             ` Ian Campbell
@ 2012-01-11 14:26                               ` Dave Scott
  2012-01-12 16:50                                 ` Ian Jackson
  2012-01-11 14:44                               ` Roger Pau Monné
  2012-01-12 16:48                               ` Ian Jackson
  2 siblings, 1 reply; 97+ messages in thread
From: Dave Scott @ 2012-01-11 14:26 UTC (permalink / raw)
  To: Ian Campbell, Roger Pau Monné
  Cc: xen-devel, Ian Jackson, Stefano Stabellini


Hi,

Ian Campbell wrote:
> blktap2 doesn't use xenbus IIRC, rather it is created via userspace
> tools/libraries. There _might_ be some hotplug script interaction which
> causes the phys-dev node to get written to the associated blkback
> device
> but I think this is not the case and the toolstack just writes the
> phys-dev because it knows what it is from when it created it.

FYI XCP/xapi is a heavy user of blktap2 and we use it like this:

0. user calls VM.start
1. xapi invokes one of its "storage managers" (plugin scripts, one per
   storage type) telling it to "attach" a disk.
2. the storage manager zones-in the relevant LUN and runs "tap-ctl"
   commands to create a block device. This is returned to xapi.
3. xapi creates the block backend directory in xenstore with
   "params=<block device>".

Xapi used to write the physical-device key directly but I stopped
it doing that when I discovered that using the "params" key worked for
both FreeBSD and Linux storage driver domains. (In the case of a
driver domain on XCP, the "storage manager" code runs inside the driver
domain and xapi talks to it over an RPC mechanism -- currently JSON-rpc
over a network but we'll probably support something else in future: maybe
DBUS-over-libvchan)
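
Step 3 then amounts to a write along these lines (illustrative names;
the real code lives in xapi):

    # The storage manager returned a blktap2 device; xapi hands it to blkback
    # via the params key.
    xenstore-write /local/domain/0/backend/vbd/5/51712/params \
                   /dev/xen/blktap-2/tapdev0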

We used to store cleanup information in xenstore (under /xapi) but we
don't need to do this anymore, now that xapi remembers which disks it
has "attach"ed.

Cheers,
Dave

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-11 12:17                             ` Ian Campbell
  2012-01-11 14:26                               ` Dave Scott
@ 2012-01-11 14:44                               ` Roger Pau Monné
  2012-01-12 16:48                               ` Ian Jackson
  2 siblings, 0 replies; 97+ messages in thread
From: Roger Pau Monné @ 2012-01-11 14:44 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Ian Jackson, Stefano Stabellini

2012/1/11 Ian Campbell <Ian.Campbell@citrix.com>:
> On Wed, 2012-01-11 at 11:50 +0000, Roger Pau Monné wrote:
>> Hello,
>>
>> This comes from my experience with Xen and hotplug scripts, and it
>> might be wrong, since I wasn't able to find any document explaining
>> exactly how hotplug execution works and who does what. I'm gonna try
>> to list the sequence of events that happens when a device is added (I
>> really don't want to keep on with the discusion if this is a protocol
>> or not):
>>
>> 1. Toolstack writes: /local/domain/0/backend/<vbd or vif>/... with "state = 1".
>> 2. Kernel acks xenstore backend device creation, creates the device
>> and sets backend "state = 2".
>> 3. xenbackendd notices backend device with "state == 2" and launches
>> hotplug script.
>
> In the Linux I think state == 2 corresponds to the generation of a
> uevent which triggers udev to run the hotplug script. I'm not 100% sure
> that all devices do this at the same point though.
>
>> 4. Hotplug script executes necessary actions and sets backend
>> "hotplug-status = connected".
>> 5. Kernel notices "hotplug-status == connected", plugs the device, and
>> sets xenstore backend device "state = 4".
>
> I think 4+5 are correct for Linux netback but for blkback it actually
> waits for phys-dev (or whatever it's real name is) to be written wather
> than the hotplug-status node.
>
>> This is true on NetBSD, because there aren't any userspace hotplug
>> devices, someone should probably add the missing bits if the device is
>> implemented in userspace (I'm not really sure of what happens inside
>> the kernel in #2 and #5, specially when using blktap or qdisk).
>
> Nothing happens in the kernel for qdisk. It is a separate backend path
> which the kernel doesn't watch or have a driver for.
>
> blktap1 behaves a lot like blkback, I think.
>
> blktap2 doesn't use xenbus IIRC, rather it is created via userspace
> tools/libraries. There _might_ be some hotplug script interaction which
> causes the phys-dev node to get written to the associated blkback device
> but I think this is not the case and the toolstack just writes the
> phys-dev because it knows what it is from when it created it.
>
>> Regarding device shutdown/destroy:
>
> We need to consider 3 cases:
>      * guest initiated graceful shutdown
>      * toolstack initiated graceful shutdown
>      * toolstack initiated forceful destroy.
>
>> 1. Guest sets frontend state to 6 (closed)
>> 2. Kernel unplugs the device and sets backend "state = 6".
>> 3. xenbackendd notices device with "state == 6", and performs the
>> necessary cleanup.
>> 3. Toolstack notices device with "state == 6" and removes xenstore
>> backend entries.
>
> At least some backend/frontends make use of state 5 as part of this,
> probably at #1 or #2.
>
> The ordering of #1 and #2 probably depends on whether the frontend or
> the backend initiates things.

I don't think graceful shutdown presents any problems, since it can be
handled by the driver domain the same way it is handled by the
toolstack right now, and the toolstack should just wait for:

/hotplug/domain/<driverdom_id>/<vbd or vif>/<domu_id>/<device_id>/state

to be set to "closed".

The forceful shutdown, however, presents some problems, and I don't
think the toolstack should remove the backend; instead we should add
something like:

/hotplug/domain/<driverdom_id>/<vbd or vif>/<domu_id>/<device_id>/destroy

and let the toolstack set this to 1 and wait for the hotplug state to be
set to "closed" (just like a normal shutdown). Then the driver domain
should take care of destroying the device and removing the backend and
frontend nodes, since it was the one that created them.
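
On the toolstack side that would be something like (again, the node
names are just the proposal above):

    base=/hotplug/domain/1/vbd/3/51712
    xenstore-write $base/destroy 1      # ask the driver domain to tear down
    while [ "$(xenstore-read $base/state)" != closed ]; do
        sleep 1                         # poll; a real implementation would use a watch
    done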

> The forceful destroy case is different, it is effectively:
> 1. rm backend dir in xenstore.
>
> Somewhere in both of these a Linux backend will generate a hotplug event
> which will cause a script to run, although in some cases the script
> can't do much because the backend dir is already gone...
>
>> Notice that I've used two #3, that's where the race condition happens,
>> because there's no synchronization between toolstack and
>> hotplug/xenbackendd to know when hotplug scripts have been executed
>> (however we should be able to synchronize this watching
>> "hotplug-status" instead of "state", and waiting for it to change to
>> "disconnected").
>>
>> Now, we have to decide how to fix the shutdown/destroy race and how to
>> implement this outside of the Dom0. I'm not really sure if it's a good
>> idea to try so hard to keep this flow intact, I think it's best to try
>> to define a flow that solves our current problems, regardless of how
>> things are now, and then try to map both flows to see what should be
>> changed and how.
>>
>> Since the device will be plugged from a Domain different than Dom0,
>> the toolstack doesn't really (and probably shouldn't) know anything
>> about which backend type will be used (phy, blktap, qdisk...). Having
>> that in mind, I don't know how can we write
>> /local/domain/<driverdom_id>/backend/... from Dom0, instead we should
>> create something like:
>>
>> /hotplug/domain/<driverdom_id>/<vbd or vif>/<domu_id>/<device_id>/params
>> /hotplug/domain/<driverdom_id>/<vbd or vif>/<domu_id>/<device_id>/script
>> /hotplug/domain/<driverdom_id>/<vbd or vif>/<domu_id>/<device_id>/state
>> [This seem like the minimum necessary parameters, but probably there
>> are others, so add what you feel necessary]
>>
>> With that the driver domain should be able to create
>> /local/domain/<driverdomain_id>/backend/... and the frontend also.
>>
>> I'm not sure if we should control the execution of hotplug scripts
>> from Dom0, or instead let the driver domain decide when it's best to
>> execute each script. This adds /hotplug to xenstore, but the
>> plug/unplug sequence could be the same as the one we currently have,
>> the only change is that each driver domain is in charge of writing
>> it's own xenstore backend/frontend entries to trigger the plug
>> sequence.
>>
>> Hope that helps, Roger.
>>
>> (xen-devel mailing list was removed at some point during the
>> conversation, so I'm adding it again)
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xensource.com
>> http://lists.xensource.com/xen-devel
>
>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-11 12:17                             ` Ian Campbell
  2012-01-11 14:26                               ` Dave Scott
  2012-01-11 14:44                               ` Roger Pau Monné
@ 2012-01-12 16:48                               ` Ian Jackson
  2012-01-16 17:52                                 ` Roger Pau Monné
  2 siblings, 1 reply; 97+ messages in thread
From: Ian Jackson @ 2012-01-12 16:48 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Ian Jackson, Stefano Stabellini

Ian Campbell writes ("Re: [Xen-devel] Driver domains and hotplug scripts, redux"):
> We need to consider 3 cases:
>       * guest initiated graceful shutdown
>       * toolstack initiated graceful shutdown
>       * toolstack initiated forceful destroy.

When we consider that the driver and toolstack domains might be
different, there are in fact three different levels of grace:
  i.   fully graceful: wait for both guest and driver domain
  ii.  semi graceful: mess up the guest, wait only for driver domain
  iii. very ungraceful: mess up the guest and the driver domain

I'm not sure how often we want (iii), but (ii) is going to be
the common case.  However:

> The forceful destroy case is different, it is effectively:
> 1. rm backend dir in xenstore.

That's (iii).  We want a way to do (ii) as well.

Ian.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-11 14:26                               ` Dave Scott
@ 2012-01-12 16:50                                 ` Ian Jackson
  2012-01-12 18:07                                   ` Dave Scott
  0 siblings, 1 reply; 97+ messages in thread
From: Ian Jackson @ 2012-01-12 16:50 UTC (permalink / raw)
  To: Dave Scott; +Cc: xen-devel, Ian Campbell, Stefano Stabellini

Dave Scott writes ("Re: [Xen-devel] Driver domains and hotplug scripts, redux"):
> FYI XCP/xapi is a heavy user of blktap2 and we use it like this:
> 
> 0. user calls VM.start
> 1. xapi invokes one of its "storage managers" (plugin scripts, one per
>    storage type) telling it to "attach" a disk.
> 2. the storage manager zones-in the relevant LUN and runs "tap-ctl"
>    commands to create a block device. This is returned to xapi.
> 3. xapi creates the block backend directory in xenstore with
>    "params=<block device>".

Thanks for the info.  Right, this is a model we should continue to
support in libxl.  All the "management" is done outside libxl, and
libxl is simply provided with the block device.

ATM libxl only supports taking an actual device name in /dev, and TBH
I can't really see that changing because some parts of libxl might
need to actually open it in dom0.

I guess though it wouldn't be hard for xapi to provide a name in
/dev.  It must surely make one for its own purposes.

Ian.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-12 16:50                                 ` Ian Jackson
@ 2012-01-12 18:07                                   ` Dave Scott
  0 siblings, 0 replies; 97+ messages in thread
From: Dave Scott @ 2012-01-12 18:07 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Roger Pau Monné, xen-devel, Ian Campbell, Stefano Stabellini

Hi,

> Dave Scott writes ("Re: [Xen-devel] Driver domains and hotplug scripts,
> redux"):
> > FYI XCP/xapi is a heavy user of blktap2 and we use it like this:
> >
> > 0. user calls VM.start
> > 1. xapi invokes one of its "storage managers" (plugin scripts, one
> per
> >    storage type) telling it to "attach" a disk.
> > 2. the storage manager zones-in the relevant LUN and runs "tap-ctl"
> >    commands to create a block device. This is returned to xapi.
> > 3. xapi creates the block backend directory in xenstore with
> >    "params=<block device>".
> 

Ian Jackson replied:
> Thanks for the info.  Right, this is a model we should continue to
> support in libxl.  All the "management" is done outside libxl, and
> libxl is simply provided with the block device.

That would be great :-)

> ATM libxl only supports taking an actual device name in /dev, and TBH
> I can't really see that changing because some parts of libxl might
> need to actually open it in dom0.
> 
> I guess though it wouldn't be hard for xapi to provide a name in
> /dev.  It must surely make one for its own purposes.

I'm pretty sure our block devices end up in /dev -- so I think that would work for us.

Thanks,
Dave

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2?
  2012-01-04 16:29 RFC: Still TODO for 4.2? Ian Campbell
                   ` (4 preceding siblings ...)
  2012-01-05 17:49 ` RFC: Still TODO for 4.2? Ian Jackson
@ 2012-01-16 11:55 ` George Dunlap
  2012-01-19 21:14 ` RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend Pasi Kärkkäinen
  6 siblings, 0 replies; 97+ messages in thread
From: George Dunlap @ 2012-01-16 11:55 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, Keir Fraser, Tim Deegan, Ian Jackson,
	Stefano Stabellini, Jan Beulich

On Wed, Jan 4, 2012 at 4:29 PM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> What are the outstanding things to do before we think we can start on
> the 4.2 -rc's? Does anyone have a timetable in mind?
>
> hypervisor:
>
>      * ??? - Keir, Tim, Jan?
>
> tools:
>
>      * libxl stable API -- we would like 4.2 to define a stable API
>        which downstream's can start to rely on not changing. Aspects of
>        this are:
>              * event handling (IanJ working on this)
>              * drop libxl_device_model_info (move bits to build_info or
>                elsewhere as appropriate) (IanC working on this, patches
>                shortly)
>              * add libxl_defbool and generally try and arrange that
>                memset(foo,0,...) requests the defaults (IanC working on
>                this, patches shortly)
>              * The topologyinfo datastructure should be a list of
>                tuples, not a tuple of lists. (nobody currently looking
>                at this, not 100% sure this makes sense, could possibly
>                defer and change after 4.2 in a compatible way)
>              * Block script support -- can be done post 4.2?
>      * Hotplug script stuff -- internal to libxl (I think, therefore I
>        didn't put this under stable API above) but still good to have
>        for 4.2? Roger Pau Monet was looking at this but its looking
>        like a big can-o-worms...
>      * Integrate qemu+seabios upstream into the build (Stefano has
>        posted patches, I guess they need refreshing and reposting). No
>        change in default qemu for 4.2.
>      * More formally deprecate xm/xend. Manpage patches already in
>        tree. Needs release noting and communication around -rc1 to
>        remind people to test xl.
>
> Has anybody got anything else? I'm sure I've missed stuff. Are there any
> must haves e.g. in the paging/sharing spaces?

Making sure xl has feature parity with xend w.r.t. driver domains
seems like something good to have, if we're going to be deprecating
xend.

>
> Ian.
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2?
  2012-01-04 17:25 ` Pasi Kärkkäinen
                     ` (2 preceding siblings ...)
  2012-01-04 19:21   ` RFC: Still TODO for 4.2? Wei Huang
@ 2012-01-16 13:28   ` Ian Campbell
  2012-01-16 14:39     ` Re : " David TECHER
  3 siblings, 1 reply; 97+ messages in thread
From: Ian Campbell @ 2012-01-16 13:28 UTC (permalink / raw)
  To: Pasi Kärkkäinen
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, Ian Jackson, Tim (Xen.org),
	Jan Beulich

On Wed, 2012-01-04 at 17:25 +0000, Pasi Kärkkäinen wrote:
> 
> - Also there's a bunch of VGA passthru related patches,
> that I once volunteered to collect/rebase/cleanup/repost myself,
> but I still haven't had time for that :( 

I'm not going to include this in the list unless someone steps up and
starts submitting patches.

Ian.



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2?
  2012-01-04 16:55 ` Jan Beulich
@ 2012-01-16 13:39   ` Ian Campbell
  2012-01-16 14:48     ` Jan Beulich
  0 siblings, 1 reply; 97+ messages in thread
From: Ian Campbell @ 2012-01-16 13:39 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Keir (Xen.org), Tim (Xen.org),
	xen-devel, Ian Jackson, Stefano Stabellini

On Wed, 2012-01-04 at 16:55 +0000, Jan Beulich wrote:
> >>> On 04.01.12 at 17:29, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > What are the outstanding things to do before we think we can start on
> > the 4.2 -rc's? Does anyone have a timetable in mind?
> > 
> > hypervisor:
> > 
> >       * ??? - Keir, Tim, Jan?
> 
> Apart from a few small things that I have on my todo list, the only
> bigger one (at least from a possible impact perspective) is the
> round-up of the closing of the security hole in MSI-X passthrough
> (uniformly - i.e. even for Dom0 - disallowing write access to MSI-X
> table pages), which I intended to do only once the upstream qemu
> patch series also incorporates the respective recent qemu-xen
> change.

It sounds like this issue is a blocker for the release, whereas the
upstream qemu support for pci passthrough is not necessarily. Has your
precondition for inclusion been met yet or do we need to prod someone?

Ian.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2?
  2012-01-04 16:51   ` Stefano Stabellini
@ 2012-01-16 13:42     ` Ian Campbell
  0 siblings, 0 replies; 97+ messages in thread
From: Ian Campbell @ 2012-01-16 13:42 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Keir (Xen.org),
	Konrad Rzeszutek Wilk, Ian Jackson, Tim (Xen.org),
	Jan Beulich

On Wed, 2012-01-04 at 16:51 +0000, Stefano Stabellini wrote:
> On Wed, 4 Jan 2012, Konrad Rzeszutek Wilk wrote:
> > On Wed, Jan 04, 2012 at 04:29:22PM +0000, Ian Campbell wrote:
> > >       * Integrate qemu+seabios upstream into the build (Stefano has
> > >         posted patches, I guess they need refreshing and reposting). No
> > >         change in default qemu for 4.2.
> > 
> > Anthony's PCI passthrough patches?
> 
> Right. And Anthony's save/restore patches as well.

Since these are dependent on external factors (qemu upstream), are we
willing to block our own release for them?

Given that upstream qemu won't be the default in this release, I think
the answer is "no", although obviously they are nice-to-haves.

Ian.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re :  RFC: Still TODO for 4.2?
  2012-01-16 13:28   ` Ian Campbell
@ 2012-01-16 14:39     ` David TECHER
  0 siblings, 0 replies; 97+ messages in thread
From: David TECHER @ 2012-01-16 14:39 UTC (permalink / raw)
  To: Ian Campbell, Pasi Kärkkäinen
  Cc: xen-devel, Keir (Xen.org), Stefano Stabellini, Tim (Xen.org),
	Ian Jackson, Jan Beulich


[-- Attachment #1.1: Type: text/plain, Size: 1182 bytes --]

I said a couple of weeks ago that I would try to submit the patches.

Sorry, I have not submitted them yet; I was/am very busy these last few weeks.


I will try to submit the patches for VGA passthrough this weekend.

Kind regards.

David.



________________________________
 From: Ian Campbell <Ian.Campbell@citrix.com>
To: Pasi Kärkkäinen <pasik@iki.fi>
Cc: xen-devel <xen-devel@lists.xensource.com>; Keir (Xen.org) <keir@xen.org>; Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>; Ian Jackson <Ian.Jackson@eu.citrix.com>; Tim (Xen.org) <tim@xen.org>; Jan Beulich <JBeulich@suse.com>
Sent: Monday, 16 January 2012, 14:28
Subject: Re: [Xen-devel] RFC: Still TODO for 4.2?
 
On Wed, 2012-01-04 at 17:25 +0000, Pasi Kärkkäinen wrote:
> 
> - Also there's a bunch of VGA passthru related patches,
> that I once volunteered to collect/rebase/cleanup/repost myself,
> but I still haven't had time for that :( 

I'm not going to include this in the list unless someone steps up and
starts submitting patches.

Ian.



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

[-- Attachment #1.2: Type: text/html, Size: 2328 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2?
  2012-01-16 13:39   ` Ian Campbell
@ 2012-01-16 14:48     ` Jan Beulich
  2012-01-16 15:00       ` Stefano Stabellini
  0 siblings, 1 reply; 97+ messages in thread
From: Jan Beulich @ 2012-01-16 14:48 UTC (permalink / raw)
  To: anthony.perard, Ian Campbell
  Cc: Keir (Xen.org), Tim (Xen.org),
	xen-devel, Ian Jackson, Stefano Stabellini

>>> On 16.01.12 at 14:39, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Wed, 2012-01-04 at 16:55 +0000, Jan Beulich wrote:
>> >>> On 04.01.12 at 17:29, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>> > What are the outstanding things to do before we think we can start on
>> > the 4.2 -rc's? Does anyone have a timetable in mind?
>> > 
>> > hypervisor:
>> > 
>> >       * ??? - Keir, Tim, Jan?
>> 
>> Apart from a few small things that I have on my todo list, the only
> >> bigger one (at least from a possible impact perspective) is the
>> round-up of the closing of the security hole in MSI-X passthrough
>> (uniformly - i.e. even for Dom0 - disallowing write access to MSI-X
>> table pages), which I intended to do only once the upstream qemu
>> patch series also incorporates the respective recent qemu-xen
>> change.
> 
> It sounds like this issue is a blocker for the release, whereas the
> upstream qemu support for pci passthrough is not necessarily. Has your
> precondition for inclusion been met yet or do we need to prod someone?

I haven't seen updated upstream qemu patches since this was discussed
on IRC - Anthony, do you have a rough timeline?

Jan

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2?
  2012-01-16 14:48     ` Jan Beulich
@ 2012-01-16 15:00       ` Stefano Stabellini
  0 siblings, 0 replies; 97+ messages in thread
From: Stefano Stabellini @ 2012-01-16 15:00 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Keir (Xen.org),
	Ian Campbell, Stefano Stabellini, Tim (Xen.org),
	Ian Jackson, Anthony Perard

On Mon, 16 Jan 2012, Jan Beulich wrote:
> >>> On 16.01.12 at 14:39, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > On Wed, 2012-01-04 at 16:55 +0000, Jan Beulich wrote:
> >> >>> On 04.01.12 at 17:29, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> >> > What are the outstanding things to do before we think we can start on
> >> > the 4.2 -rc's? Does anyone have a timetable in mind?
> >> > 
> >> > hypervisor:
> >> > 
> >> >       * ??? - Keir, Tim, Jan?
> >> 
> >> Apart from a few small things that I have on my todo list, the only
> >> bigger one (at least from a possible impact perspective) is the
> >> round-up of the closing of the security hole in MSI-X passthrough
> >> (uniformly - i.e. even for Dom0 - disallowing write access to MSI-X
> >> table pages), which I intended to do only once the upstream qemu
> >> patch series also incorporates the respective recent qemu-xen
> >> change.
> > 
> > It sounds like this issue is a blocker for the release, whereas the
> > upstream qemu support for pci passthrough is not necessarily. Has your
> > precondition for inclusion been met yet or do we need to prod someone?
> 
> I haven't seen updated upstream qemu patches since this was discussed
> on IRC - Anthony, do you have a rough timeline?
 
We had long discussions on qemu-devel and on #qemu; I think we have a
plan now and I am currently prototyping a different approach to solve
the issue. The new approach requires libxl support, so I would still
like to put upstream qemu save/restore on the roadmap, at least for the
libxl side.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-12 16:48                               ` Ian Jackson
@ 2012-01-16 17:52                                 ` Roger Pau Monné
  2012-01-16 17:58                                   ` Ian Jackson
  0 siblings, 1 reply; 97+ messages in thread
From: Roger Pau Monné @ 2012-01-16 17:52 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Ian Campbell, Stefano Stabellini

2012/1/12 Ian Jackson <Ian.Jackson@eu.citrix.com>:
> Ian Campbell writes ("Re: [Xen-devel] Driver domains and hotplug scripts, redux"):
>> We need to consider 3 cases:
>>       * guest initiated graceful shutdown
>>       * toolstack initiated graceful shutdown
>>       * toolstack initiated forceful destroy.
>
> When we consider that the driver and toolstack domains might be
> different, there are in fact three different levels of grace:
>  i.   fully graceful: wait for both guest and driver domain
>  ii.  semi graceful: mess up the guest, wait only for driver domain
>  iii. very ungraceful: mess up the guest and the driver domain
>
> I'm not sure whether how often we want (iii), but (ii) is going to be
> the common case.  However:
>
>> The forceful destroy case is different, it is effectively:
>> 1. rm backend dir in xenstore.
>
> That's (iii).  We want a way to do (ii) as well.

From my point of view, (iii) should only happen after (i) or (ii) has
failed (timeout or error trying to unplug devices).

What should we do with xend? Are we keeping it in 4.2? I'm asking this
because the changes I'm introducing disable some udev rules that are
needed for xend. The other option is to update xend to talk to
xenbackendd also.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-16 17:52                                 ` Roger Pau Monné
@ 2012-01-16 17:58                                   ` Ian Jackson
  2012-01-17  9:17                                     ` Ian Campbell
  2012-01-17  9:22                                     ` Roger Pau Monné
  0 siblings, 2 replies; 97+ messages in thread
From: Ian Jackson @ 2012-01-16 17:58 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Ian Campbell, Stefano Stabellini

Roger Pau Monné writes ("Re: [Xen-devel] Driver domains and hotplug scripts, redux"):
> 2012/1/12 Ian Jackson <Ian.Jackson@eu.citrix.com>:
> > Ian Campbell writes ("Re: [Xen-devel] Driver domains and hotplug scripts, redux"):
> >> The forceful destroy case is different, it is effectively:
> >> 1. rm backend dir in xenstore.
> >
> > That's (iii).  We want a way to do (ii) as well.
> 
> From my point of view, (iii) should only happen after (i) or (ii) has
> failed (timeout or error trying to unplug devices).

There has to be a user option to ask for a "very forceful" detach.

> What should we do with xend? Are we keeping it on 4.2? I'm asking this
> because the changes I'm introducing disables some udev rules that are
> needed for xend. The other option is to update xend to talk to
> xenbackendd also.

I think xend is not going to go away in 4.2, unfortunately.

Ian.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-16 17:58                                   ` Ian Jackson
@ 2012-01-17  9:17                                     ` Ian Campbell
  2012-01-17  9:30                                       ` Roger Pau Monné
  2012-01-17  9:40                                       ` Roger Pau Monné
  2012-01-17  9:22                                     ` Roger Pau Monné
  1 sibling, 2 replies; 97+ messages in thread
From: Ian Campbell @ 2012-01-17  9:17 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Roger Pau Monné, xen-devel, Stefano Stabellini

On Mon, 2012-01-16 at 17:58 +0000, Ian Jackson wrote:
> Roger Pau Monné writes ("Re: [Xen-devel] Driver domains and hotplug scripts, redux"):
> > 2012/1/12 Ian Jackson <Ian.Jackson@eu.citrix.com>:
> > > Ian Campbell writes ("Re: [Xen-devel] Driver domains and hotplug scripts, redux"):
> > >> The forceful destroy case is different, it is effectively:
> > >> 1. rm backend dir in xenstore.
> > >
> > > That's (iii).  We want a way to do (ii) as well.
> > 
> > From my point of view, (iii) should only happen after (i) or (ii) has
> > failed (timeout or error trying to unplug devices).
> 
> There has to be a user option to ask for a "very forceful" detach.
> 
> > What should we do with xend? Are we keeping it on 4.2? I'm asking this
> > because the changes I'm introducing disables some udev rules that are
> > needed for xend. The other option is to update xend to talk to
> > xenbackendd also.
> 
> I think xend is not going to go away in 4.2, unfortunately.

However xend should not be transitioned to this new scheme but should
continue to use its existing scripts in the current manner.

There was a conversation last year[0] about how a toolstack could
opt-in/out of the use of the hotplug scripts. We decided that toolstacks
should have to opt into the use of these scripts, by touching a stamp
file.

Although this wasn't implemented yet (unless I missed it) I guess the
same scheme would apply to this work if that sort of thing turns out to
be necessary.

Ian.

[0] http://lists.xen.org/archives/html/xen-devel/2011-06/msg00952.html


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-16 17:58                                   ` Ian Jackson
  2012-01-17  9:17                                     ` Ian Campbell
@ 2012-01-17  9:22                                     ` Roger Pau Monné
  1 sibling, 0 replies; 97+ messages in thread
From: Roger Pau Monné @ 2012-01-17  9:22 UTC (permalink / raw)
  To: Ian Jackson; +Cc: xen-devel, Ian Campbell, Stefano Stabellini

2012/1/16 Ian Jackson <Ian.Jackson@eu.citrix.com>:
> Roger Pau Monné writes ("Re: [Xen-devel] Driver domains and hotplug scripts, redux"):
>> 2012/1/12 Ian Jackson <Ian.Jackson@eu.citrix.com>:
>> > Ian Campbell writes ("Re: [Xen-devel] Driver domains and hotplug scripts, redux"):
>> >> The forceful destroy case is different, it is effectively:
>> >> 1. rm backend dir in xenstore.
>> >
>> > That's (iii).  We want a way to do (ii) as well.
>>
>> From my point of view, (iii) should only happen after (i) or (ii) has
>> failed (timeout or error trying to unplug devices).
>
> There has to be a user option to ask for a "very forceful" detach.

Let's map current shutdown options to your points:

xl shutdown -> (i)
xl destroy -> (ii), or (iii) if a timeout happens while trying to unplug devices.
xl destroy -f -> (iii)?

I guess adding a -f to destroy is easy and it should work as you
described in (iii).
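
To make the mapping concrete, the commands would look roughly like this
(a sketch only: the -f flag is just the proposal above, not an existing
option, and <domain> is a placeholder):

  # (i)   fully graceful: wait for both the guest and the driver domain
  xl shutdown <domain>

  # (ii)  semi graceful: give up on the guest but still wait for the
  #       driver domain to tear down its backends
  xl destroy <domain>

  # (iii) very ungraceful: also give up on the driver domain and just
  #       remove the backend directories in xenstore (proposed flag)
  xl destroy -f <domain>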

>
>> What should we do with xend? Are we keeping it on 4.2? I'm asking this
>> because the changes I'm introducing disables some udev rules that are
>> needed for xend. The other option is to update xend to talk to
>> xenbackendd also.
>
> I think xend is not going to go away in 4.2, unfortunately.

I see pain.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-17  9:17                                     ` Ian Campbell
@ 2012-01-17  9:30                                       ` Roger Pau Monné
  2012-01-17  9:43                                         ` Ian Campbell
  2012-01-17  9:40                                       ` Roger Pau Monné
  1 sibling, 1 reply; 97+ messages in thread
From: Roger Pau Monné @ 2012-01-17  9:30 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Ian Jackson, Stefano Stabellini

2012/1/17 Ian Campbell <Ian.Campbell@citrix.com>:
> On Mon, 2012-01-16 at 17:58 +0000, Ian Jackson wrote:
>> Roger Pau Monné writes ("Re: [Xen-devel] Driver domains and hotplug scripts, redux"):
>> > 2012/1/12 Ian Jackson <Ian.Jackson@eu.citrix.com>:
>> > > Ian Campbell writes ("Re: [Xen-devel] Driver domains and hotplug scripts, redux"):
>> > >> The forceful destroy case is different, it is effectively:
>> > >> 1. rm backend dir in xenstore.
>> > >
>> > > That's (iii).  We want a way to do (ii) as well.
>> >
>> > From my point of view, (iii) should only happen after (i) or (ii) has
>> > failed (timeout or error trying to unplug devices).
>>
>> There has to be a user option to ask for a "very forceful" detach.
>>
>> > What should we do with xend? Are we keeping it on 4.2? I'm asking this
>> > because the changes I'm introducing disables some udev rules that are
>> > needed for xend. The other option is to update xend to talk to
>> > xenbackendd also.
>>
>> I think xend is not going to go away in 4.2, unfortunately.
>
> However xend should not be transition to this new scheme but should
> continue to use its existing scripts in the current manner.
>
> There was a conversation last year[0] about how a toolstack could
> opt-in/out of the use of the hotplug scripts. We decided that toolstacks
> should have to opt into the use of these scripts, by touching a stamp
> file.

I'm not sure this solves our problems, since it doesn't exactly disable
udev; it disables hotplug scripts entirely, but they are needed from
libxl as well (my approach uses the current hotplug scripts).

Also, if both xl and xend are running, there is a good chance of ending
up in a mess, since machines started from xl (using the new xenstore
protocol /hotplug/...) cannot be stopped successfully from xend, and
the other way around.

> Although this wasn't implemented yet (unless I missed it) I guess the
> same scheme would apply to this work if that sort of thing turns out to
> be necessary.
>
> Ian.
>
> [0] http://lists.xen.org/archives/html/xen-devel/2011-06/msg00952.html

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-17  9:17                                     ` Ian Campbell
  2012-01-17  9:30                                       ` Roger Pau Monné
@ 2012-01-17  9:40                                       ` Roger Pau Monné
  2012-01-17  9:52                                         ` Ian Campbell
  1 sibling, 1 reply; 97+ messages in thread
From: Roger Pau Monné @ 2012-01-17  9:40 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Ian Jackson, Stefano Stabellini

2012/1/17 Ian Campbell <Ian.Campbell@citrix.com>:
> However xend should not be transition to this new scheme but should
> continue to use its existing scripts in the current manner.
>
> There was a conversation last year[0] about how a toolstack could
> opt-in/out of the use of the hotplug scripts. We decided that toolstacks
> should have to opt into the use of these scripts, by touching a stamp
> file.
>
> Although this wasn't implemented yet (unless I missed it) I guess the
> same scheme would apply to this work if that sort of thing turns out to
> be necessary.

Sorry for replying so many times. This is a big maybe, and possibly
it's too drastic, but after these changes xl and xend will not be
compatible anymore, so why don't we disable xend by default and only
build xl?

When the new configure script is in, it will be trivial to select if
you want xl or xend, and only install one of those. Adding the option
--enable-xend should disable xl and only build and install xend
(printing a very big warning that xend is deprecated).

> Ian.
>
> [0] http://lists.xen.org/archives/html/xen-devel/2011-06/msg00952.html
>

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-17  9:30                                       ` Roger Pau Monné
@ 2012-01-17  9:43                                         ` Ian Campbell
  0 siblings, 0 replies; 97+ messages in thread
From: Ian Campbell @ 2012-01-17  9:43 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Ian Jackson, Stefano Stabellini

On Tue, 2012-01-17 at 09:30 +0000, Roger Pau Monné wrote:
> 2012/1/17 Ian Campbell <Ian.Campbell@citrix.com>:
> > On Mon, 2012-01-16 at 17:58 +0000, Ian Jackson wrote:
> >> Roger Pau Monné writes ("Re: [Xen-devel] Driver domains and hotplug scripts, redux"):
> >> > 2012/1/12 Ian Jackson <Ian.Jackson@eu.citrix.com>:
> >> > > Ian Campbell writes ("Re: [Xen-devel] Driver domains and hotplug scripts, redux"):
> >> > >> The forceful destroy case is different, it is effectively:
> >> > >> 1. rm backend dir in xenstore.
> >> > >
> >> > > That's (iii).  We want a way to do (ii) as well.
> >> >
> >> > From my point of view, (iii) should only happen after (i) or (ii) has
> >> > failed (timeout or error trying to unplug devices).
> >>
> >> There has to be a user option to ask for a "very forceful" detach.
> >>
> >> > What should we do with xend? Are we keeping it on 4.2? I'm asking this
> >> > because the changes I'm introducing disables some udev rules that are
> >> > needed for xend. The other option is to update xend to talk to
> >> > xenbackendd also.
> >>
> >> I think xend is not going to go away in 4.2, unfortunately.
> >
> > However xend should not be transition to this new scheme but should
> > continue to use its existing scripts in the current manner.
> >
> > There was a conversation last year[0] about how a toolstack could
> > opt-in/out of the use of the hotplug scripts. We decided that toolstacks
> > should have to opt into the use of these scripts, by touching a stamp
> > file.
> 
> I'm not sure this solves our problems, since this doesn't disable udev
> exactly, it disables hotplug scripts entirely, but they are needed
> from libxl also (my approach uses the current hotplug scripts).

Sure, I wasn't sure exactly what approach was being planned.

> Also, if both xl and xend are running,

That is not a supported configuration so I don't think we should concern
ourselves unduly with it.

Let's fix xl to work how we want it and then we can look at ways of
making sure that nothing changes if you are using xend.

>  there are a lot of chances of
> getting a mess, since machines started from xl (using the new xenstore
> protocol /hotplug/...) could not be stopped successfully from xend,
> and the other way around.
> 
> > Although this wasn't implemented yet (unless I missed it) I guess the
> > same scheme would apply to this work if that sort of thing turns out to
> > be necessary.
> >
> > Ian.
> >
> > [0] http://lists.xen.org/archives/html/xen-devel/2011-06/msg00952.html



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-17  9:40                                       ` Roger Pau Monné
@ 2012-01-17  9:52                                         ` Ian Campbell
  2012-01-17 10:00                                           ` Roger Pau Monné
  0 siblings, 1 reply; 97+ messages in thread
From: Ian Campbell @ 2012-01-17  9:52 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Ian Jackson, Stefano Stabellini

On Tue, 2012-01-17 at 09:40 +0000, Roger Pau Monné wrote:
> 2012/1/17 Ian Campbell <Ian.Campbell@citrix.com>:
> > However xend should not be transition to this new scheme but should
> > continue to use its existing scripts in the current manner.
> >
> > There was a conversation last year[0] about how a toolstack could
> > opt-in/out of the use of the hotplug scripts. We decided that toolstacks
> > should have to opt into the use of these scripts, by touching a stamp
> > file.
> >
> > Although this wasn't implemented yet (unless I missed it) I guess the
> > same scheme would apply to this work if that sort of thing turns out to
> > be necessary.
> 
> Sorry for replying so many times, this is a big maybe, and possibly
> it's too drastic, but after this changes xl and xend will not be
> compatible anymore, so why don't we disable xend by default, and only
> build xl?

I don't think they are compatible now, are they? I've certainly seen odd
behaviour when using xl with xend (accidentally) running; usually xend
reaps the domain I've just started...

I'm all for disabling the build of xend by default but I had assumed
that others would think 4.2 was rather an aggressive timeline for that.

> When the new configure script is in, it will be trivial to select if
> you want xl or xend, and only install one of those. Adding the option
> --enable-xend should disable xl and only build and install xend
> (printing a very big warning that xend is deprecated).

I don't think --enable-xend should ever disable xl (or vice versa). Many
folks (e.g. distros) will want to build both, perhaps to package them in
two different binary packages, but certainly to offer their users the
choice, at least for the time being.

> 
> > Ian.
> >
> > [0] http://lists.xen.org/archives/html/xen-devel/2011-06/msg00952.html
> >



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-17  9:52                                         ` Ian Campbell
@ 2012-01-17 10:00                                           ` Roger Pau Monné
  2012-01-17 10:39                                             ` Ian Campbell
  0 siblings, 1 reply; 97+ messages in thread
From: Roger Pau Monné @ 2012-01-17 10:00 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Ian Jackson, Stefano Stabellini

2012/1/17 Ian Campbell <Ian.Campbell@citrix.com>:
> On Tue, 2012-01-17 at 09:40 +0000, Roger Pau Monné wrote:
>> 2012/1/17 Ian Campbell <Ian.Campbell@citrix.com>:
>> > However xend should not be transition to this new scheme but should
>> > continue to use its existing scripts in the current manner.
>> >
>> > There was a conversation last year[0] about how a toolstack could
>> > opt-in/out of the use of the hotplug scripts. We decided that toolstacks
>> > should have to opt into the use of these scripts, by touching a stamp
>> > file.
>> >
>> > Although this wasn't implemented yet (unless I missed it) I guess the
>> > same scheme would apply to this work if that sort of thing turns out to
>> > be necessary.
>>
>> Sorry for replying so many times, this is a big maybe, and possibly
>> it's too drastic, but after this changes xl and xend will not be
>> compatible anymore, so why don't we disable xend by default, and only
>> build xl?
>
> I don't think they are compatible now, are they? I've certainly seen odd
> behaviour when using xl with xend (accidentally) running, usually xend
> reaps the domain I've just started...
>
> I'm all for disabling the build of xend by default but I had assumed
> that others would think 4.2 was rather an aggressive timeline for that.
>
>> When the new configure script is in, it will be trivial to select if
>> you want xl or xend, and only install one of those. Adding the option
>> --enable-xend should disable xl and only build and install xend
>> (printing a very big warning that xend is deprecated).
>
> I don't think --enable-xend should ever disable xl (or vice versa). Many
> folks (e.g. distros) will want to build both, perhaps to package them in
> two different binary packages, but certainly to offer their users the
> choice, at least for the time being.

My main concern with this is that xend and xl will start to use
different udev rules (well, xend will continue to use the existing
ones, while xl will only use a subset of those). So we have to decide
which udev rules file to install, because we can't have both installed
at the same time.

Another option is to install the xl udev rules by default, and have xend
put its own rules in place from its init script. Since xl doesn't use a
daemon, xl should always check if xend is running before doing anything
and fail if xend is found.
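
A very rough sketch of that check (assuming xend's usual pid file
location; in practice it would live inside xl/libxl itself rather than
a shell wrapper):

  # refuse to run while xend appears to be active
  if [ -s /var/run/xend.pid ] && \
     kill -0 "$(cat /var/run/xend.pid)" 2>/dev/null; then
      echo "xend is running; refusing to continue" >&2
      exit 1
  fi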

>>
>> > Ian.
>> >
>> > [0] http://lists.xen.org/archives/html/xen-devel/2011-06/msg00952.html
>> >
>
>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-17 10:00                                           ` Roger Pau Monné
@ 2012-01-17 10:39                                             ` Ian Campbell
  2012-01-23 11:40                                               ` Roger Pau Monné
  0 siblings, 1 reply; 97+ messages in thread
From: Ian Campbell @ 2012-01-17 10:39 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Ian Jackson, Stefano Stabellini

On Tue, 2012-01-17 at 10:00 +0000, Roger Pau Monné wrote:
> 2012/1/17 Ian Campbell <Ian.Campbell@citrix.com>:
> > On Tue, 2012-01-17 at 09:40 +0000, Roger Pau Monné wrote:
> >> 2012/1/17 Ian Campbell <Ian.Campbell@citrix.com>:
> >> > However xend should not be transition to this new scheme but should
> >> > continue to use its existing scripts in the current manner.
> >> >
> >> > There was a conversation last year[0] about how a toolstack could
> >> > opt-in/out of the use of the hotplug scripts. We decided that toolstacks
> >> > should have to opt into the use of these scripts, by touching a stamp
> >> > file.
> >> >
> >> > Although this wasn't implemented yet (unless I missed it) I guess the
> >> > same scheme would apply to this work if that sort of thing turns out to
> >> > be necessary.
> >>
> >> Sorry for replying so many times, this is a big maybe, and possibly
> >> it's too drastic, but after this changes xl and xend will not be
> >> compatible anymore, so why don't we disable xend by default, and only
> >> build xl?
> >
> > I don't think they are compatible now, are they? I've certainly seen odd
> > behaviour when using xl with xend (accidentally) running, usually xend
> > reaps the domain I've just started...
> >
> > I'm all for disabling the build of xend by default but I had assumed
> > that others would think 4.2 was rather an aggressive timeline for that.
> >
> >> When the new configure script is in, it will be trivial to select if
> >> you want xl or xend, and only install one of those. Adding the option
> >> --enable-xend should disable xl and only build and install xend
> >> (printing a very big warning that xend is deprecated).
> >
> > I don't think --enable-xend should ever disable xl (or vice versa). Many
> > folks (e.g. distros) will want to build both, perhaps to package them in
> > two different binary packages, but certainly to offer their users the
> > choice, at least for the time being.
> 
> My main concern with this is that xend and xl will start to use
> different udev rules (well, xend will continue to use the existing
> ones, while xl will only use a subset of those). So we have to decide
> which udev rules file to install, because we can't have both installed
> at the same time.

Sure we can. Perhaps they need to have an "if $TOOLSTACK" check (e.g. if
[ -f /var/run/xend.hotplug ]) added to the top, that is all.

> Another option is to install xl udev rules by default, and make xend
> move it's own rules in the init script.

I don't think initscripts should be messing with udev rules.

Perhaps the opt-in needs to be more fine-grained, e.g. opt in to vif but
not block scripts, or whatever distinction you think is necessary, instead
of just a global opt-in; it's just a different naming convention for the
stamp file. This avoids reconfiguration and the need to install subsets
of the scripts etc.
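
For illustration, the opt-in check at the top of a hotplug script could
be as small as this (a sketch: /var/run/xend.hotplug is the example
stamp from above, and the per-class vif stamp name is made up):

  # bail out silently unless some toolstack has opted in to having udev
  # run this script, either globally or just for vif devices
  if [ -f /var/run/xend.hotplug ] || [ -f /var/run/xen-hotplug-vif ]; then
      : # opted in, carry on with the normal script body below
  else
      exit 0
  fi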

>  Since xl doesn't use a daemon,
> xl should always check if xend is running before doing anything and
> fail if xend is found.

I think that is a separate question/issue to the one of hotplug scripts.

Ian.

> 
> >>
> >> > Ian.
> >> >
> >> > [0] http://lists.xen.org/archives/html/xen-devel/2011-06/msg00952.html
> >> >
> >
> >



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-04 16:29 RFC: Still TODO for 4.2? Ian Campbell
                   ` (5 preceding siblings ...)
  2012-01-16 11:55 ` George Dunlap
@ 2012-01-19 21:14 ` Pasi Kärkkäinen
  2012-01-20  7:59   ` Ian Campbell
  6 siblings, 1 reply; 97+ messages in thread
From: Pasi Kärkkäinen @ 2012-01-19 21:14 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, Keir Fraser, Tim Deegan, Ian Jackson,
	Stefano Stabellini, Jan Beulich

On Wed, Jan 04, 2012 at 04:29:22PM +0000, Ian Campbell wrote:
> 
> Has anybody got anything else? I'm sure I've missed stuff. Are there any
> must haves e.g. in the paging/sharing spaces?
> 

Something that I just remembered:
http://wiki.xen.org/xenwiki/Xen4.1

"NUMA-aware memory allocation for VMs. xl in Xen 4.1 will allocate equal amount of memory from every NUMA node for the VM. xm/xend allocates all the memory from the same NUMA node."

Is this something that should be looked at? Should the numa memory allocation be an option so it can be controlled per domain? 
The default libxl behaviour might cause unexpected performance issues on multi-socket systems? 

-- Pasi

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-19 21:14 ` RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend Pasi Kärkkäinen
@ 2012-01-20  7:59   ` Ian Campbell
  2012-01-20  8:15     ` Pasi Kärkkäinen
  2012-01-20 10:55     ` Stefano Stabellini
  0 siblings, 2 replies; 97+ messages in thread
From: Ian Campbell @ 2012-01-20  7:59 UTC (permalink / raw)
  To: Pasi Kärkkäinen
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, Ian Jackson, Tim (Xen.org),
	Jan Beulich

On Thu, 2012-01-19 at 21:14 +0000, Pasi Kärkkäinen wrote:
> On Wed, Jan 04, 2012 at 04:29:22PM +0000, Ian Campbell wrote:
> > 
> > Has anybody got anything else? I'm sure I've missed stuff. Are there any
> > must haves e.g. in the paging/sharing spaces?
> > 
> 
> Something that I just remembered:
> http://wiki.xen.org/xenwiki/Xen4.1
> 
> "NUMA-aware memory allocation for VMs. xl in Xen 4.1 will allocate
> equal amount of memory from every NUMA node for the VM. xm/xend
> allocates all the memory from the same NUMA node."

I'm not that familiar with the NUMA support but my understanding was
that memory was allocated by libxc/the hypervisor and not by the
toolstack, and that the default was to allocate from the same NUMA nodes
as the domain's processors were pinned to, i.e. if you pin the processors
appropriately the Right Thing just happens. Do you believe this is not
the case and/or not working right with xl?

CCing Juergen since he added the cpupool support and in particular the
cpupool-numa-split option so I'm hoping he knows something about NUMA
more generally.

> Is this something that should be looked at?
 
Probably, but is anyone doing so?

> Should the numa memory allocation be an option so it can be controlled
> per domain? 

What options did xm provide in this regard?

Does xl's cpupool (with the cpupool-numa-split option) serve the same
purpose?

> The default libxl behaviour might cause unexpected performance issues
> on multi-socket systems? 

I'm not convinced libxl is behaving any different to xend but perhaps
someone can show me the error of my ways.

Ian.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20  7:59   ` Ian Campbell
@ 2012-01-20  8:15     ` Pasi Kärkkäinen
  2012-01-20  9:01       ` Ian Campbell
  2012-01-20 10:55     ` Stefano Stabellini
  1 sibling, 1 reply; 97+ messages in thread
From: Pasi Kärkkäinen @ 2012-01-20  8:15 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, Ian Jackson, Tim (Xen.org),
	Jan Beulich

On Fri, Jan 20, 2012 at 07:59:28AM +0000, Ian Campbell wrote:
> On Thu, 2012-01-19 at 21:14 +0000, Pasi Kärkkäinen wrote:
> > On Wed, Jan 04, 2012 at 04:29:22PM +0000, Ian Campbell wrote:
> > > 
> > > Has anybody got anything else? I'm sure I've missed stuff. Are there any
> > > must haves e.g. in the paging/sharing spaces?
> > > 
> > 
> > Something that I just remembered:
> > http://wiki.xen.org/xenwiki/Xen4.1
> > 
> > "NUMA-aware memory allocation for VMs. xl in Xen 4.1 will allocate
> > equal amount of memory from every NUMA node for the VM. xm/xend
> > allocates all the memory from the same NUMA node."
> 
> I'm not that familiar with the NUMA support but my understanding was
> that memory was allocated by libxc/the-hypervisor and not by the
> toolstack and that the default was to allocate from the same numa nodes
> as domains the processor's were pinned to i.e. if you pin the processors
> appropriately the Right Thing just happens. Do you believe this is not
> the case and/or not working right with xl?
> 
> CCing Juergen since he added the cpupool support and in particular the
> cpupool-numa-split option so I'm hoping he knows something about NUMA
> more generally.
> 
> > Is this something that should be looked at?
>  
> Probably, but is anyone doing so?
> 
> > Should the numa memory allocation be an option so it can be controlled
> > per domain? 
> 
> What options did xm provide in this regard?
> 
> Does xl's cpupool (with the cpupool-numa-split option) server the same
> purpose?
> 
> > The default libxl behaviour might cause unexpected performance issues
> > on multi-socket systems? 
> 
> I'm not convinced libxl is behaving any different to xend but perhaps
> someone can show me the error of my ways.
> 


See this thread: 
http://old-list-archives.xen.org/archives/html/xen-devel/2011-07/msg01423.html

where Stefano wrote:
"I think we forgot about this feature but it is important and hopefully
somebody will write a patch for it before 4.2 is out."


-- Pasi

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20  8:15     ` Pasi Kärkkäinen
@ 2012-01-20  9:01       ` Ian Campbell
  2012-01-20  9:47         ` Dario Faggioli
  2012-01-23  9:59         ` Juergen Gross
  0 siblings, 2 replies; 97+ messages in thread
From: Ian Campbell @ 2012-01-20  9:01 UTC (permalink / raw)
  To: Pasi Kärkkäinen
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, Ian Jackson, Juergen Gross, Tim (Xen.org),
	Jan Beulich

On Fri, 2012-01-20 at 08:15 +0000, Pasi Kärkkäinen wrote:
> On Fri, Jan 20, 2012 at 07:59:28AM +0000, Ian Campbell wrote:
> > On Thu, 2012-01-19 at 21:14 +0000, Pasi Kärkkäinen wrote:
> > > On Wed, Jan 04, 2012 at 04:29:22PM +0000, Ian Campbell wrote:
> > > > 
> > > > Has anybody got anything else? I'm sure I've missed stuff. Are there any
> > > > must haves e.g. in the paging/sharing spaces?
> > > > 
> > > 
> > > Something that I just remembered:
> > > http://wiki.xen.org/xenwiki/Xen4.1
> > > 
> > > "NUMA-aware memory allocation for VMs. xl in Xen 4.1 will allocate
> > > equal amount of memory from every NUMA node for the VM. xm/xend
> > > allocates all the memory from the same NUMA node."
> > 
> > I'm not that familiar with the NUMA support but my understanding was
> > that memory was allocated by libxc/the-hypervisor and not by the
> > toolstack and that the default was to allocate from the same numa nodes
> > as domains the processor's were pinned to i.e. if you pin the processors
> > appropriately the Right Thing just happens. Do you believe this is not
> > the case and/or not working right with xl?
> > 
> > CCing Juergen since he added the cpupool support and in particular the
> > cpupool-numa-split option so I'm hoping he knows something about NUMA
> > more generally.
> > 
> > > Is this something that should be looked at?
> >  
> > Probably, but is anyone doing so?
> > 
> > > Should the numa memory allocation be an option so it can be controlled
> > > per domain? 
> > 
> > What options did xm provide in this regard?
> > 
> > Does xl's cpupool (with the cpupool-numa-split option) server the same
> > purpose?
> > 
> > > The default libxl behaviour might cause unexpected performance issues
> > > on multi-socket systems? 
> > 
> > I'm not convinced libxl is behaving any different to xend but perhaps
> > someone can show me the error of my ways.
> > 
> 
> 
> See this thread: 
> http://old-list-archives.xen.org/archives/html/xen-devel/2011-07/msg01423.html
> 
> where Stefano wrote:
> "I think we forgot about this feature but it is important and hopefully
> somebody will write a patch for it before 4.2 is out."

Is anyone looking into this?

Does cpupool-numa-split solve this same problem?

I think I forgot to actually CC Juergen when I said I would; doing that now.

Ian.



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20  9:01       ` Ian Campbell
@ 2012-01-20  9:47         ` Dario Faggioli
  2012-01-20 11:56           ` Ian Campbell
  2012-01-23  9:59         ` Juergen Gross
  1 sibling, 1 reply; 97+ messages in thread
From: Dario Faggioli @ 2012-01-20  9:47 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, Keir (Xen.org), Stefano Stabellini, Tim (Xen.org),
	Juergen Gross, Ian Jackson, Jan Beulich


[-- Attachment #1.1: Type: text/plain, Size: 1667 bytes --]

On Fri, 2012-01-20 at 09:01 +0000, Ian Campbell wrote: 
> > See this thread: 
> > http://old-list-archives.xen.org/archives/html/xen-devel/2011-07/msg01423.html
> > 
> > where Stefano wrote:
> > "I think we forgot about this feature but it is important and hopefully
> > somebody will write a patch for it before 4.2 is out."
> 
> Is anyone looking into this?
> 
Hi,

Actually, I'll be investigating how we could gradually introduce some
NUMA support in both the Xen scheduler and memory allocator!

I already have some ideas and I'm looking at the code to better
understand it and find out what the best strategy is and where it's
best to start. I can give some more details, here or in a separate
thread, in a few days, if you're interested.

The only thing is that I'm not sure how far I'll be able to get before
4.2. I really think I'll have _something_ but maybe not a full-fledged
NUMA-aware patchset! :-P

> Does cpupool-numa-split solve this same problem?
> 
I'm looking right into that, and indeed I think it does a lot in this
direction. The idea was to try doing something similar in a sort of
automatic fashion, so that everyone can benefit from at least some
NUMA-awareness, even without having to bother with cpupools.

Is that what you were asking?

Thanks and Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-------------------------------------------------------------------
Dario Faggioli, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
PhD Candidate, ReTiS Lab, Scuola Superiore Sant'Anna, Pisa (Italy)



[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20  7:59   ` Ian Campbell
  2012-01-20  8:15     ` Pasi Kärkkäinen
@ 2012-01-20 10:55     ` Stefano Stabellini
  2012-01-20 11:22       ` Ian Campbell
  2012-01-20 11:26       ` Dario Faggioli
  1 sibling, 2 replies; 97+ messages in thread
From: Stefano Stabellini @ 2012-01-20 10:55 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, George Dunlap, Tim (Xen.org),
	Dario Faggioli, Ian Jackson, Jan Beulich

[-- Attachment #1: Type: text/plain, Size: 1241 bytes --]

On Fri, 20 Jan 2012, Ian Campbell wrote:
> On Thu, 2012-01-19 at 21:14 +0000, Pasi Kärkkäinen wrote:
> > On Wed, Jan 04, 2012 at 04:29:22PM +0000, Ian Campbell wrote:
> > > 
> > > Has anybody got anything else? I'm sure I've missed stuff. Are there any
> > > must haves e.g. in the paging/sharing spaces?
> > > 
> > 
> > Something that I just remembered:
> > http://wiki.xen.org/xenwiki/Xen4.1
> > 
> > "NUMA-aware memory allocation for VMs. xl in Xen 4.1 will allocate
> > equal amount of memory from every NUMA node for the VM. xm/xend
> > allocates all the memory from the same NUMA node."
> 
> I'm not that familiar with the NUMA support but my understanding was
> that memory was allocated by libxc/the-hypervisor and not by the
> toolstack and that the default was to allocate from the same numa nodes
> as domains the processor's were pinned to i.e. if you pin the processors
> appropriately the Right Thing just happens. Do you believe this is not
> the case and/or not working right with xl?

It seems that xend is retrieving NUMA info about the platform, see
pyxc_numainfo, then using that info to pin vcpus to pcpus, see
_setCPUAffinity.
Still it seems to me more of a hack than the right way to solve the
problem.

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20 10:55     ` Stefano Stabellini
@ 2012-01-20 11:22       ` Ian Campbell
  2012-01-20 11:25         ` Stefano Stabellini
  2012-01-20 11:44         ` Dario Faggioli
  2012-01-20 11:26       ` Dario Faggioli
  1 sibling, 2 replies; 97+ messages in thread
From: Ian Campbell @ 2012-01-20 11:22 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Keir (Xen.org), George Dunlap, Tim (Xen.org),
	Dario Faggioli, Ian Jackson, Jan Beulich

On Fri, 2012-01-20 at 10:55 +0000, Stefano Stabellini wrote:
> On Fri, 20 Jan 2012, Ian Campbell wrote:
> > On Thu, 2012-01-19 at 21:14 +0000, Pasi Kärkkäinen wrote:
> > > On Wed, Jan 04, 2012 at 04:29:22PM +0000, Ian Campbell wrote:
> > > > 
> > > > Has anybody got anything else? I'm sure I've missed stuff. Are there any
> > > > must haves e.g. in the paging/sharing spaces?
> > > > 
> > > 
> > > Something that I just remembered:
> > > http://wiki.xen.org/xenwiki/Xen4.1
> > > 
> > > "NUMA-aware memory allocation for VMs. xl in Xen 4.1 will allocate
> > > equal amount of memory from every NUMA node for the VM. xm/xend
> > > allocates all the memory from the same NUMA node."
> > 
> > I'm not that familiar with the NUMA support but my understanding was
> > that memory was allocated by libxc/the-hypervisor and not by the
> > toolstack and that the default was to allocate from the same numa nodes
> > as domains the processor's were pinned to i.e. if you pin the processors
> > appropriately the Right Thing just happens. Do you believe this is not
> > the case and/or not working right with xl?
> 
> It seems that xend is retrieving numa info about the platform, see
> pyxc_numainfo, then using those info to pin vcpus to pcpus, see
> _setCPUAffinity.
> Still it seems to me more of an hack than the right way to solve the
> problem.

Right, so in the absence of any explicit configuration it basically
picks a NUMA node (via some heuristic) and automatically puts the guest
into it.

It seems to me that xl's behaviour isn't wrong as such, it's just
different.

I think the important thing is that xl should honour the user's explicit
requests to use a particular node, either via vcpu pinning or cpupools
etc.
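
As a sketch of what such an explicit request looks like today (cpus=
and pool= are existing xl config keys; the node layout, the pool name
and the config path are assumed):

  # split the host into one cpupool per NUMA node (Pool-node0, ...)
  xl cpupool-numa-split

  # then, in the guest config, either pin vcpus to node 0's pcpus:
  #   cpus = "0-3"
  # or place the whole domain in node 0's pool:
  #   pool = "Pool-node0"
  xl create /etc/xen/guest.cfg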

Ian.


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20 11:22       ` Ian Campbell
@ 2012-01-20 11:25         ` Stefano Stabellini
  2012-01-20 11:44         ` Dario Faggioli
  1 sibling, 0 replies; 97+ messages in thread
From: Stefano Stabellini @ 2012-01-20 11:25 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, George Dunlap, Tim (Xen.org),
	Dario Faggioli, Ian Jackson, Jan Beulich

[-- Attachment #1: Type: text/plain, Size: 2062 bytes --]

On Fri, 20 Jan 2012, Ian Campbell wrote:
> On Fri, 2012-01-20 at 10:55 +0000, Stefano Stabellini wrote:
> > On Fri, 20 Jan 2012, Ian Campbell wrote:
> > > On Thu, 2012-01-19 at 21:14 +0000, Pasi Kärkkäinen wrote:
> > > > On Wed, Jan 04, 2012 at 04:29:22PM +0000, Ian Campbell wrote:
> > > > > 
> > > > > Has anybody got anything else? I'm sure I've missed stuff. Are there any
> > > > > must haves e.g. in the paging/sharing spaces?
> > > > > 
> > > > 
> > > > Something that I just remembered:
> > > > http://wiki.xen.org/xenwiki/Xen4.1
> > > > 
> > > > "NUMA-aware memory allocation for VMs. xl in Xen 4.1 will allocate
> > > > equal amount of memory from every NUMA node for the VM. xm/xend
> > > > allocates all the memory from the same NUMA node."
> > > 
> > > I'm not that familiar with the NUMA support but my understanding was
> > > that memory was allocated by libxc/the-hypervisor and not by the
> > > toolstack and that the default was to allocate from the same numa nodes
> > > as domains the processor's were pinned to i.e. if you pin the processors
> > > appropriately the Right Thing just happens. Do you believe this is not
> > > the case and/or not working right with xl?
> > 
> > It seems that xend is retrieving numa info about the platform, see
> > pyxc_numainfo, then using those info to pin vcpus to pcpus, see
> > _setCPUAffinity.
> > Still it seems to me more of an hack than the right way to solve the
> > problem.
> 
> Right, so in the absence of any explicit configuration it basically
> picks a NUMA node (via some heuristic) and automatically puts the guest
> into it.
> 
> It seems to me that xl's behaviour isn't wrong as such, it's just
> different.
> 
> I think the important thing is that xl should honour user's explicit
> requests to use a particular node, either via vcpu pinning or cpupools
> etc.
 
Indeed.
Moreover it should be a choice for the allocator and the scheduler to
make, not the toolstack.
Otherwise we end up with non-NUMA algorithms in Xen and NUMA algorithms
in the toolstack?

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20 10:55     ` Stefano Stabellini
  2012-01-20 11:22       ` Ian Campbell
@ 2012-01-20 11:26       ` Dario Faggioli
  1 sibling, 0 replies; 97+ messages in thread
From: Dario Faggioli @ 2012-01-20 11:26 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Keir (Xen.org),
	Ian Campbell, George Dunlap, Tim (Xen.org),
	Ian Jackson, Jan Beulich


[-- Attachment #1.1: Type: text/plain, Size: 602 bytes --]

On Fri, 2012-01-20 at 10:55 +0000, Stefano Stabellini wrote: 
> It seems that xend is retrieving numa info about the platform, see
> pyxc_numainfo, then using those info to pin vcpus to pcpus, see
> _setCPUAffinity.
> Still it seems to me more of an hack than the right way to solve the
> problem.
>
And it looks very much the same to me.

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-------------------------------------------------------------------
Dario Faggioli, PhD, Senior Software Engineer, Citrix Systems R&D Ltd.,
Cambridge (UK)



[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20 11:22       ` Ian Campbell
  2012-01-20 11:25         ` Stefano Stabellini
@ 2012-01-20 11:44         ` Dario Faggioli
  2012-01-20 11:54           ` Ian Campbell
  1 sibling, 1 reply; 97+ messages in thread
From: Dario Faggioli @ 2012-01-20 11:44 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, George Dunlap, Tim (Xen.org),
	Ian Jackson, Jan Beulich


[-- Attachment #1.1: Type: text/plain, Size: 2618 bytes --]

On Fri, 2012-01-20 at 11:22 +0000, Ian Campbell wrote: 
> > It seems that xend is retrieving numa info about the platform, see
> > pyxc_numainfo, then using those info to pin vcpus to pcpus, see
> > _setCPUAffinity.
> > Still it seems to me more of an hack than the right way to solve the
> > problem.
> 
> Right, so in the absence of any explicit configuration it basically
> picks a NUMA node (via some heuristic) and automatically puts the guest
> into it.
> 
Seems so. As Stefano is saying I don't think this is something that
should be done at the toolstack level, or at least not at the xl-level
of the toolstack. :-)

> It seems to me that xl's behaviour isn't wrong as such, it's just
> different.
> 
Indeed.

> I think the important thing is that xl should honour user's explicit
> requests to use a particular node, either via vcpu pinning or cpupools
> etc.
> 
And I agree again, honouring explicit user requests is a key point. I
think the issue here is what should be done, say, by default, i.e., if
the user doesn't say anything about CPU/memory allocation. My idea was
to have Xen support a "NUMA-aware operational mode" where (and this
will actually be the first step!) it does exactly what xend is doing
right now --- that is, choosing a node and putting the new guest there,
both memory and CPU-wise. However, having this logic in the hypervisor
would allow Xen itself, for example, while investigating which node to
use for a new guest, or during a sort of periodic load balancing or
whatever, to change its mind and move a guest to a different node from
where it was put in the first place, as well as a bunch of other things.
I'm not sure the same can be done within the toolstack but I think I can
say that if it can, it would be way more complex and probably less
effective... Am I wrong?

Of course, even in such a mode, if the user explicitly tells us what he
wants, e.g., by means of cpupools, pinning, etc., we should still honour
such a request.

Then the question is whether or not this mode would be the default, or
would need to be explicitly requested (a boot parameter or something), but
that would become important only when we have it up and
running... :-)

What do you think? Does this look reasonable? As the topic has been
raised, I'd very much enjoy some early feedback! :-P

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-------------------------------------------------------------------
Dario Faggioli, PhD, Senior Software Engineer, Citrix Systems R&D Ltd.,
Cambridge (UK)



[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20 11:44         ` Dario Faggioli
@ 2012-01-20 11:54           ` Ian Campbell
  2012-01-20 12:04             ` Dario Faggioli
  0 siblings, 1 reply; 97+ messages in thread
From: Ian Campbell @ 2012-01-20 11:54 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, George Dunlap, Tim (Xen.org),
	Ian Jackson, Jan Beulich

On Fri, 2012-01-20 at 11:44 +0000, Dario Faggioli wrote:

> And I agree again, honouring explicit user requests is key point. I
> think the issue here is what should be dona, say by default, i.e., if
> the user doesn't say anything about CPU/memory allocation. My idea was
> to have Xen supporting a "NUMA-aware operational mode" where (and this
> will actually be the first step!) it does exactly what xend is doing
> right now --- that is, choosing a node and putting the new guest there,
> both memory and CPU-wise. However, having this logic in the hypervisor
> would allow Xen itself, for example, while investigating which node to
> use for a new guest, or during a sort of periodic load balancing or
> whatever, to change its mind and move a guest to a different node from
> where it was put in the first place, as well as a bunch of other things.
> I'm not sure the same can be done within the toolstack but I think I can
> say that if it can, it would be way more complex and probably less
> effective... Am I wrong?

This might be doable for HVM guests but for PV guests pretty much the
only way would be a kind of local migration which would need tool
support. For the PV case hybrid support would help (by introducing HAP
for PV guests). Not saying it's not worthwhile but might just be harder
than it sounds.

> Of course, even in such mode, if the user explicitly tells us what he
> wants, e.g., by means of cpupools, pinning, etc., we should still honour
> such request.

Do we get this right now?

> Then the question is whether or not this mode would be the default, or
> would need to be explicitly requested (boot parameter or something), but
> that would only become important once we have it up and
> running... :-)

Yeah, I think we can defer that decision ;-)

Ian.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20  9:47         ` Dario Faggioli
@ 2012-01-20 11:56           ` Ian Campbell
  0 siblings, 0 replies; 97+ messages in thread
From: Ian Campbell @ 2012-01-20 11:56 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, Ian Jackson, Juergen Gross, Tim (Xen.org),
	Jan Beulich

On Fri, 2012-01-20 at 09:47 +0000, Dario Faggioli wrote:
> On Fri, 2012-01-20 at 09:01 +0000, Ian Campbell wrote: 
> > > See this thread: 
> > > http://old-list-archives.xen.org/archives/html/xen-devel/2011-07/msg01423.html
> > > 
> > > where Stefano wrote:
> > > "I think we forgot about this feature but it is important and hopefully
> > > somebody will write a patch for it before 4.2 is out."
> > 
> > Is anyone looking into this?
> > 
> Hi,
> 
> Actually, I'll be investigating how we could gradually introduce some
> NUMA support in both the Xen scheduler and memory allocator!
> 
> I already have some ideas and I'm looking at the code to better
> understand it and find out what's the best strategy and from where it's
> better to start. I can give some more details, here or on separate
> thread, in a few days, if you're interested.
> 
> The only thing is that I'm not sure how far I'll be able to get before
> 4.2. I really think I'll have _something_ but maybe not a full-fledged
> NUMA-aware patchset! :-P

Sure, for 4.2 we should make sure that the toolstack does something
vaguely sensible.

> > Does cpupool-numa-split solve this same problem?
> > 
> I'm looking right into that, and indeed I think it does a lot in this
> direction. The idea was to try doing something similar in a sort of
> automatic fashion, so that everyone can benefit from at least some
> NUMA-awareness, even without having to bother with cpupools.
> 
Is that what you were asking?

Yes, thanks!

Ian

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20 11:54           ` Ian Campbell
@ 2012-01-20 12:04             ` Dario Faggioli
  2012-01-20 12:33               ` Ian Campbell
  0 siblings, 1 reply; 97+ messages in thread
From: Dario Faggioli @ 2012-01-20 12:04 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, George Dunlap, Tim (Xen.org),
	Ian Jackson, Jan Beulich


[-- Attachment #1.1: Type: text/plain, Size: 1214 bytes --]

On Fri, 2012-01-20 at 11:54 +0000, Ian Campbell wrote: 
> This might be doable for HVM guests but for PV guests pretty much the
> only way would be a kind of local migration which would need tool
> support. For the PV case hybrid support would help (by introducing HAP
> for PV guests). Not saying it's not worthwhile but might just be harder
> than it sounds.
> 
That local migration analogy was exactly what came out when we were
trying to envision how to deal with the PV guest case. I still know too
little about all this to have an authoritative opinion, so, for now, thanks
for the warning! :-D

> > Of course, even in such mode, if the user explicitly tells us what he
> > wants, e.g., by means of cpupools, pinning, etc., we should still honour
> > such request.
> 
> Do we get this right now?
> 
Sorry, not sure what you mean here...

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-------------------------------------------------------------------
Dario Faggioli, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
PhD Candidate, ReTiS Lab, Scuola Superiore Sant'Anna, Pisa (Italy)



[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20 12:04             ` Dario Faggioli
@ 2012-01-20 12:33               ` Ian Campbell
  2012-01-20 13:11                 ` Ian Campbell
  0 siblings, 1 reply; 97+ messages in thread
From: Ian Campbell @ 2012-01-20 12:33 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, George Dunlap, Ian Jackson, Tim (Xen.org),
	Jan Beulich

On Fri, 2012-01-20 at 12:04 +0000, Dario Faggioli wrote:
> On Fri, 2012-01-20 at 11:54 +0000, Ian Campbell wrote: 

> > > Of course, even in such mode, if the user explicitly tells us what he
> > > wants, e.g., by means of cpupools, pinning, etc., we should still honour
> > > such request.
> > 
> > Do we get this right now?
> > 
> Sorry, not sure what you mean here...

I meant: "if the user explicitly tells us what he wants, e.g., by
means of cpupools, pinning, etc.", do we still honour such a request?

Ian.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20 12:33               ` Ian Campbell
@ 2012-01-20 13:11                 ` Ian Campbell
  2012-01-20 15:06                   ` Ian Campbell
  0 siblings, 1 reply; 97+ messages in thread
From: Ian Campbell @ 2012-01-20 13:11 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, George Dunlap, Tim (Xen.org),
	Ian Jackson, Jan Beulich

On Fri, 2012-01-20 at 12:33 +0000, Ian Campbell wrote:
> On Fri, 2012-01-20 at 12:04 +0000, Dario Faggioli wrote:
> > On Fri, 2012-01-20 at 11:54 +0000, Ian Campbell wrote: 
> 
> > > > Of course, even in such mode, if the user explicitly tells us what he
> > > > wants, e.g., by means of cpupools, pinning, etc., we should still honour
> > > > such request.
> > > 
> > > Do we get this right now?
> > > 
> > Sorry, not sure what you mean here...
> 
> I meant: "if the user explicitly tells us what he wants, e.g., by
> means of cpupools, pinning, etc.", do we still honour such a request?

It appears that with cpupools we do not. After running
cpupool-numa-split I started a guest with pool=Pool-node1 and got:
# xl cpupool-list 
Name               CPUs   Sched     Active   Domain count
Pool-node0           8    credit       y          1
Pool-node1           8    credit       y          1

(so dom0 on node0, guest on node 1) but:
(XEN) Memory location of each domain:
(XEN) Domain 0 (total: 131072):
(XEN)     Node 0: 61098
(XEN)     Node 1: 69974
(XEN) Domain 1 (total: 6290427):
(XEN)     Node 0: 3407101
(XEN)     Node 1: 2883326

With your patches to support vcpu pin and giving the guest vcpus="8-15"
I see effectively the same thing. (xl vcpu-list shows the affinity is
correct, so your patches seem correct in that regard).
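
For reference, this is roughly the shape of config and checks involved
(values illustrative, not the exact config used):

    memory = 4096
    vcpus  = 8
    cpus   = "8-15"        # pin to node 1's pcpus (with the affinity patches)
    # pool = "Pool-node1"  # alternatively: place the guest in the node-1 pool

    # then, to see where the vcpus and memory ended up:
    xl vcpu-list <domain>
    xl debug-keys u ; xl dmesg | tail   # 'u' dumps per-domain memory per node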

Your patches do the affinity setting pretty early so I'm not sure what's
going on.

Ian.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20 13:11                 ` Ian Campbell
@ 2012-01-20 15:06                   ` Ian Campbell
  2012-01-20 16:02                     ` Dario Faggioli
  0 siblings, 1 reply; 97+ messages in thread
From: Ian Campbell @ 2012-01-20 15:06 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, George Dunlap, Ian Jackson, Tim (Xen.org),
	Jan Beulich

On Fri, 2012-01-20 at 13:11 +0000, Ian Campbell wrote:

> Your patches do the affinity setting pretty early so I'm not sure what's
> going on.

The cpu affinities are set and d->node_affinity is getting correctly
updated to only include one node before any memory is allocated to the
domain, yet memory appears to be being allocated on both nodes.

Strange.
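
For context, d->node_affinity is recomputed from the vcpu affinities
roughly like this (paraphrased from memory of the hypervisor's
domain_update_node_affinity(); details may well differ from the real
code):

    /* Sketch of domain_update_node_affinity(d), from memory. */
    cpumask_t cpumask;
    nodemask_t nodemask = NODE_MASK_NONE;
    struct vcpu *v;
    unsigned int node;

    cpumask_clear(&cpumask);

    /* Union of all the vcpus' cpu affinities... */
    for_each_vcpu ( d, v )
        cpumask_or(&cpumask, &cpumask, v->cpu_affinity);

    /* ...mapped back to the set of nodes those cpus belong to. */
    for_each_online_node ( node )
        if ( cpumask_intersects(&node_to_cpumask(node), &cpumask) )
            node_set(node, nodemask);

    d->node_affinity = nodemask;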

Ian.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20 15:06                   ` Ian Campbell
@ 2012-01-20 16:02                     ` Dario Faggioli
  2012-01-20 16:21                       ` Ian Campbell
  0 siblings, 1 reply; 97+ messages in thread
From: Dario Faggioli @ 2012-01-20 16:02 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, George Dunlap, Ian Jackson, Tim (Xen.org),
	Jan Beulich


[-- Attachment #1.1: Type: text/plain, Size: 919 bytes --]

On Fri, 2012-01-20 at 15:06 +0000, Ian Campbell wrote: 
> On Fri, 2012-01-20 at 13:11 +0000, Ian Campbell wrote:
> 
> > Your patches do the affinity setting pretty early so I'm not sure what's
> > going on.
> 
> The cpu affinities are set and d->node_affinity is getting correctly
> updated to only include one node before any memory is allocated to the
> domain, yet memory appears to be being allocated on both nodes.
> 
I'm also looking into this and NOT finding an answer... Yet. Will keep
investigating and report back as soon as I get what's happening...

Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-------------------------------------------------------------------
Dario Faggioli, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
PhD Candidate, ReTiS Lab, Scuola Superiore Sant'Anna, Pisa (Italy)



[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20 16:02                     ` Dario Faggioli
@ 2012-01-20 16:21                       ` Ian Campbell
  2012-01-20 16:28                         ` Ian Campbell
  2012-01-20 16:58                         ` Dario Faggioli
  0 siblings, 2 replies; 97+ messages in thread
From: Ian Campbell @ 2012-01-20 16:21 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, George Dunlap, Tim (Xen.org),
	Ian Jackson, Jan Beulich

On Fri, 2012-01-20 at 16:02 +0000, Dario Faggioli wrote:
> On Fri, 2012-01-20 at 15:06 +0000, Ian Campbell wrote: 
> > On Fri, 2012-01-20 at 13:11 +0000, Ian Campbell wrote:
> > 
> > > Your patches do the affinity setting pretty early so I'm not sure what's
> > > going on.
> > 
> > The cpu affinities are set and d->node_affinity is getting correctly
> > updated to only include one node before any memory is allocated to the
> > domain, yet memory appears to be being allocated on both nodes.
> > 
> I'm also looking into this and NOT finding an answer... Yet. Will keep
> investigating and report back as soon as I get what's happening...

I think I made the rather basic c*ckup of using a domain configured with
more memory than is in any single node.

/me dons the brown paper bag.

With your affinity patches and the domain restricted to a single node
via cpu affinity, The Right Thing seems to happen.

cpupools don't seem to do this, I don't know if that is expected or not.

Ian.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20 16:21                       ` Ian Campbell
@ 2012-01-20 16:28                         ` Ian Campbell
  2012-01-20 16:31                           ` George Dunlap
  2012-01-20 16:58                         ` Dario Faggioli
  1 sibling, 1 reply; 97+ messages in thread
From: Ian Campbell @ 2012-01-20 16:28 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, George Dunlap, Ian Jackson, Juergen Gross,
	Tim (Xen.org),
	Jan Beulich

On Fri, 2012-01-20 at 16:21 +0000, Ian Campbell wrote:
> cpupools don't seem to do this, I don't know if that is expected or not.

Right, so cpupools do not appear to set the vcpu affinity, at least not
at the level where it affects memory allocation. However both
	pool="Pool-node0" cpus="0-7"
and
	pool="Pool-node1" cpus="8-15"
work as expected on a system with 8 cpus per node.

Should something be doing this pinning automatically?

Ian.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20 16:28                         ` Ian Campbell
@ 2012-01-20 16:31                           ` George Dunlap
  2012-01-20 16:39                             ` Ian Campbell
  0 siblings, 1 reply; 97+ messages in thread
From: George Dunlap @ 2012-01-20 16:31 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, George Dunlap, Ian Jackson, Juergen Gross,
	Tim (Xen.org),
	Dario Faggioli, Jan Beulich

On Fri, 2012-01-20 at 16:28 +0000, Ian Campbell wrote:
> On Fri, 2012-01-20 at 16:21 +0000, Ian Campbell wrote:
> > cpupools don't seem to do this, I don't know if that is expected or not.
> 
> Right, so cpupools do not appear to set the vcpu affinity, at least not
> at the level where it affects memory allocation. However both
> 	pool="Pool-node0" cpus="0-7"
> and
> 	pool="Pool-node1" cpus="8-15"
> work as expected on a system with 8 cpus per node.
> 
> Should something be doing this pinning automatically?

It seems like it would be useful; But then we have the issue of, if a vm
was pinned to cpus 0-3 of Pool-node0, and you move it to Pool-node1,
what do you do?

 -George

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20 16:31                           ` George Dunlap
@ 2012-01-20 16:39                             ` Ian Campbell
  2012-01-20 16:43                               ` George Dunlap
  0 siblings, 1 reply; 97+ messages in thread
From: Ian Campbell @ 2012-01-20 16:39 UTC (permalink / raw)
  To: George Dunlap
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, George Dunlap, Ian Jackson, Juergen Gross,
	Tim (Xen.org),
	Dario Faggioli, Jan Beulich

On Fri, 2012-01-20 at 16:31 +0000, George Dunlap wrote:
> On Fri, 2012-01-20 at 16:28 +0000, Ian Campbell wrote:
> > On Fri, 2012-01-20 at 16:21 +0000, Ian Campbell wrote:
> > > cpupools don't seem to do this, I don't know if that is expected or not.
> > 
> > Right, so cpupools do not appear to set the vcpu affinity, at least not
> > at the level where it affects memory allocation. However both
> > 	pool="Pool-node0" cpus="0-7"
> > and
> > 	pool="Pool-node1" cpus="8-15"
> > work as expected on a system with 8 cpus per node.
> > 
> > Should something be doing this pinning automatically?
> 
> It seems like it would be useful; But then we have the issue of, if a vm
> was pinned to cpus 0-3 of Pool-node0, and you move it to Pool-node1,
> what do you do?

I've no idea, it's not clear to me now what the semantics of cpupools
are if they don't restrict the VCPU affinity like I previously assumed.

I'm actually struggling to find what libxl does with the poolid or
poolname parameters in libxl_domain_create_info other than write the
latter to /local/domain/<D>/pool_name. I must be missing something.

Ian.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20 16:39                             ` Ian Campbell
@ 2012-01-20 16:43                               ` George Dunlap
  2012-01-20 16:54                                 ` Ian Campbell
  2012-01-20 16:55                                 ` Ian Campbell
  0 siblings, 2 replies; 97+ messages in thread
From: George Dunlap @ 2012-01-20 16:43 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, George Dunlap, Ian Jackson, Juergen Gross,
	Tim (Xen.org),
	Dario Faggioli, Jan Beulich

On Fri, 2012-01-20 at 16:39 +0000, Ian Campbell wrote:
> On Fri, 2012-01-20 at 16:31 +0000, George Dunlap wrote:
> > On Fri, 2012-01-20 at 16:28 +0000, Ian Campbell wrote:
> > > On Fri, 2012-01-20 at 16:21 +0000, Ian Campbell wrote:
> > > > cpupools don't seem to do this, I don't know if that is expected or not.
> > > 
> > > Right, so cpupools do not appear to set the vcpu affinity, at least not
> > > at the level where it affects memory allocation. However both
> > > 	pool="Pool-node0" cpus="0-7"
> > > and
> > > 	pool="Pool-node1" cpus="8-15"
> > > work as expected on a system with 8 cpus per node.
> > > 
> > > Should something be doing this pinning automatically?
> > 
> > It seems like it would be useful; But then we have the issue of, if a vm
> > was pinned to cpus 0-3 of Pool-node0, and you move it to Pool-node1,
> > what do you do?
> 
> I've no idea, it's not clear to me now what the semantics of cpupools
> are if they don't restrict the VCPU affinity like I previously assumed.

Well, it does restrict what cpus the VM will run on; the effective
affinity will be the union of the pool cpus and the vcpu affinity.

 -George

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20 16:43                               ` George Dunlap
@ 2012-01-20 16:54                                 ` Ian Campbell
  2012-01-20 17:32                                   ` Dario Faggioli
  2012-01-20 16:55                                 ` Ian Campbell
  1 sibling, 1 reply; 97+ messages in thread
From: Ian Campbell @ 2012-01-20 16:54 UTC (permalink / raw)
  To: George Dunlap
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, George Dunlap, Ian Jackson, Juergen Gross,
	Tim (Xen.org),
	Dario Faggioli, Jan Beulich

On Fri, 2012-01-20 at 16:43 +0000, George Dunlap wrote:
> On Fri, 2012-01-20 at 16:39 +0000, Ian Campbell wrote:
> > On Fri, 2012-01-20 at 16:31 +0000, George Dunlap wrote:
> > > On Fri, 2012-01-20 at 16:28 +0000, Ian Campbell wrote:
> > > > On Fri, 2012-01-20 at 16:21 +0000, Ian Campbell wrote:
> > > > > cpupools don't seem to do this, I don't know if that is expected or not.
> > > > 
> > > > Right, so cpupools do not appear to set the vcpu affinity, at least not
> > > > at the level where it affects memory allocation. However both
> > > > 	pool="Pool-node0" cpus="0-7"
> > > > and
> > > > 	pool="Pool-node1" cpus="8-15"
> > > > work as expected on a system with 8 cpus per node.
> > > > 
> > > > Should something be doing this pinning automatically?
> > > 
> > > It seems like it would be useful; But then we have the issue of, if a vm
> > > was pinned to cpus 0-3 of Pool-node0, and you move it to Pool-node1,
> > > what do you do?
> > 
> > I've no idea, it's not clear to me now what the semantics of cpupools
> > are if they don't restrict the VCPU affinity like I previously assumed.
> 
> Well, it does restrict what cpus the VM will run on; the effective
> affinity will be the union of the pool cpus and the vcpu affinity.

Ah, right.

I confused myself into thinking that cpupools ~= NUMA because I've only
used cpupool-numa-split but I can see that you might also divide your
cpus up some other way.

Should that same union be used for d->node_affinity though? It seems
like it would make sense.

Ian.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20 16:43                               ` George Dunlap
  2012-01-20 16:54                                 ` Ian Campbell
@ 2012-01-20 16:55                                 ` Ian Campbell
  2012-01-20 16:59                                   ` George Dunlap
  1 sibling, 1 reply; 97+ messages in thread
From: Ian Campbell @ 2012-01-20 16:55 UTC (permalink / raw)
  To: George Dunlap
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, George Dunlap, Ian Jackson, Juergen Gross,
	Tim (Xen.org),
	Dario Faggioli, Jan Beulich

On Fri, 2012-01-20 at 16:43 +0000, George Dunlap wrote:
> On Fri, 2012-01-20 at 16:39 +0000, Ian Campbell wrote:
> > On Fri, 2012-01-20 at 16:31 +0000, George Dunlap wrote:
> > > On Fri, 2012-01-20 at 16:28 +0000, Ian Campbell wrote:
> > > > On Fri, 2012-01-20 at 16:21 +0000, Ian Campbell wrote:
> > > > > cpupools don't seem to do this, I don't know if that is expected or not.
> > > > 
> > > > Right, so cpupools do not appear to set the vcpu affinity, at least not
> > > > at the level where it affects memory allocation. However both
> > > > 	pool="Pool-node0" cpus="0-7"
> > > > and
> > > > 	pool="Pool-node1" cpus="8-15"
> > > > work as expected on a system with 8 cpus per node.
> > > > 
> > > > Should something be doing this pinning automatically?
> > > 
> > > It seems like it would be useful; But then we have the issue of, if a vm
> > > was pinned to cpus 0-3 of Pool-node0, and you move it to Pool-node1,
> > > what do you do?
> > 
> > I've no idea, it's not clear to me now what the semantics of cpupools
> > are if they don't restrict the VCPU affinity like I previously assumed.
> 
> Well, it does restrict what cpus the VM will run on; the effective
> affinity will be the union of the pool cpus and the vcpu affinity.

Did you mean intersection?

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20 16:21                       ` Ian Campbell
  2012-01-20 16:28                         ` Ian Campbell
@ 2012-01-20 16:58                         ` Dario Faggioli
  2012-01-20 17:23                           ` Ian Campbell
  1 sibling, 1 reply; 97+ messages in thread
From: Dario Faggioli @ 2012-01-20 16:58 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, George Dunlap, Tim (Xen.org),
	Ian Jackson, Jan Beulich


[-- Attachment #1.1: Type: text/plain, Size: 1254 bytes --]

On Fri, 2012-01-20 at 16:21 +0000, Ian Campbell wrote: 
> I think I made the rather basic c*ckup of using a domain configured with
> more memory than is in any single node.
> 
Well, that is one of the most interesting use cases, indeed! :-)

Seriously, I really expect to have a very hard time when I come to
consider scenarios like that one...

> With your affinity patches and the domain restricted to a single node
> via cpu affinity The Right Thing seems to happen.
> 
> cpupools don't seem to do this, I don't know if that is expected or not.
> 
Glad to know my patches are (well, could be!) useful for something!
About cpupools, I think that if a domain is created as part of a pool, the
very same behaviour you achieve with my patches should be expected.
However, as George is correctly pointing out, that might turn out to be
quite bad if the domain is then moved! :-(

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-------------------------------------------------------------------
Dario Faggioli, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
PhD Candidate, ReTiS Lab, Scuola Superiore Sant'Anna, Pisa (Italy)



[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20 16:55                                 ` Ian Campbell
@ 2012-01-20 16:59                                   ` George Dunlap
  0 siblings, 0 replies; 97+ messages in thread
From: George Dunlap @ 2012-01-20 16:59 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, George Dunlap, Ian Jackson, Juergen Gross,
	Tim (Xen.org),
	Dario Faggioli, Jan Beulich

On Fri, 2012-01-20 at 16:55 +0000, Ian Campbell wrote:
> On Fri, 2012-01-20 at 16:43 +0000, George Dunlap wrote:
> > On Fri, 2012-01-20 at 16:39 +0000, Ian Campbell wrote:
> > > On Fri, 2012-01-20 at 16:31 +0000, George Dunlap wrote:
> > > > On Fri, 2012-01-20 at 16:28 +0000, Ian Campbell wrote:
> > > > > On Fri, 2012-01-20 at 16:21 +0000, Ian Campbell wrote:
> > > > > > cpupools don't seem to do this, I don't know if that is expected or not.
> > > > > 
> > > > > Right, so cpupools do not appear to set the vcpu affinity, at least not
> > > > > at the level where it affects memory allocation. However both
> > > > > 	pool="Pool-node0" cpus="0-7"
> > > > > and
> > > > > 	pool="Pool-node1" cpus="8-15"
> > > > > work as expected on a system with 8 cpus per node.
> > > > > 
> > > > > Should something be doing this pinning automatically?
> > > > 
> > > > It seems like it would be useful; But then we have the issue of, if a vm
> > > > was pinned to cpus 0-3 of Pool-node0, and you move it to Pool-node1,
> > > > what do you do?
> > > 
> > > I've no idea, it's not clear to me now what the semantics of cpupools
> > > are if they don't restrict the VCPU affinity like I previously assumed.
> > 
> > Well, it does restrict what cpus the VM will run on; the effective
> > affinity will be the union of the pool cpus and the vcpu affinity.
> 
> Did you mean intersection?

Hmm, seems it's been too long since I used set theory. :-)  Yes,
intersection (i.e., a vcpu will run on a cpu only if that cpu is in both
the affinity mask and the cpu pool).
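
In other words, roughly (field names from memory, not the literal
scheduler code):

    /* A vcpu may only run where its affinity and its pool agree. */
    cpumask_t effective;
    cpumask_and(&effective, v->cpu_affinity, d->cpupool->cpu_valid);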

 -George

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20 16:58                         ` Dario Faggioli
@ 2012-01-20 17:23                           ` Ian Campbell
  2012-01-20 17:28                             ` Dario Faggioli
  0 siblings, 1 reply; 97+ messages in thread
From: Ian Campbell @ 2012-01-20 17:23 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, George Dunlap, Ian Jackson, Tim (Xen.org),
	Jan Beulich

On Fri, 2012-01-20 at 16:58 +0000, Dario Faggioli wrote:
> On Fri, 2012-01-20 at 16:21 +0000, Ian Campbell wrote: 
> > With your affinity patches and the domain restricted to a single node
> > via cpu affinity The Right Thing seems to happen.
> > 
> > cpupools don't seem to do this, I don't know if that is expected or not.
> > 
> Glad to know my patches are (well, could be!) useful for something!
> About cpupool, I think if a domain is created as part of the pool, the
> very same behaviour you achieve with my patches should be expected.
> However, as George is correctly pointing out, that might turn out to be
> quite bad if the domain is then moved! :-(

It's no worse than starting a VM with CPUs pinned one way and then
changing it -- you might end up with CPUs with pessimal access to the
memory assigned to the guest.

Ian.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20 17:23                           ` Ian Campbell
@ 2012-01-20 17:28                             ` Dario Faggioli
  0 siblings, 0 replies; 97+ messages in thread
From: Dario Faggioli @ 2012-01-20 17:28 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, George Dunlap, Ian Jackson, Tim (Xen.org),
	Jan Beulich


[-- Attachment #1.1: Type: text/plain, Size: 791 bytes --]

On Fri, 2012-01-20 at 17:23 +0000, Ian Campbell wrote: 
> It's no worse than starting a VM with CPUs pinned one way and then
> changing it -- you might end up with CPUs with pessimal access to the
> memory assigned to the guest.
> 
It is exactly the same thing! What one can argue is that right now,
_without_ my xl-cpupin patches, you CAN'T really start a VM with CPUs
pinned (at least with xl, as it seems you can with xm/xend)... :-P

Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-------------------------------------------------------------------
Dario Faggioli, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
PhD Candidate, ReTiS Lab, Scuola Superiore Sant'Anna, Pisa (Italy)



[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20 16:54                                 ` Ian Campbell
@ 2012-01-20 17:32                                   ` Dario Faggioli
  2012-01-23 10:19                                     ` Ian Campbell
  0 siblings, 1 reply; 97+ messages in thread
From: Dario Faggioli @ 2012-01-20 17:32 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, George Dunlap, Ian Jackson, Juergen Gross,
	Tim (Xen.org),
	George Dunlap, Jan Beulich


[-- Attachment #1.1: Type: text/plain, Size: 1132 bytes --]

On Fri, 2012-01-20 at 16:54 +0000, Ian Campbell wrote: 
> I confused myself into thinking that cpupools ~= NUMA because I've only
> used cpupool-numa-split but I can see that you might also divide your
> cpus up some other way.
> 
Yeah, indeed, although the numa-split case looks like the most useful
one to me.

> Should that same union be used for d->node_affinity though? It seems
> like it would make sense.
> 
According to me, it should. Then, at least right now, moving it would
probably kill its performance because all its memory will be far away,
while right now it's all more "stochastic".

Still, I think it should be done, as if you place a domain in a cpupool
at its creation, I think the case of moving it away from there would be
quite rare.

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-------------------------------------------------------------------
Dario Faggioli, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
PhD Candidate, ReTiS Lab, Scuola Superiore Sant'Anna, Pisa (Italy)



[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20  9:01       ` Ian Campbell
  2012-01-20  9:47         ` Dario Faggioli
@ 2012-01-23  9:59         ` Juergen Gross
  1 sibling, 0 replies; 97+ messages in thread
From: Juergen Gross @ 2012-01-23  9:59 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, Keir (Xen.org), Stefano Stabellini, Tim (Xen.org),
	Ian Jackson, Jan Beulich

On 01/20/2012 10:01 AM, Ian Campbell wrote:
> On Fri, 2012-01-20 at 08:15 +0000, Pasi Kärkkäinen wrote:
>> On Fri, Jan 20, 2012 at 07:59:28AM +0000, Ian Campbell wrote:
>>> On Thu, 2012-01-19 at 21:14 +0000, Pasi Kärkkäinen wrote:
>>>> On Wed, Jan 04, 2012 at 04:29:22PM +0000, Ian Campbell wrote:
>>>>> Has anybody got anything else? I'm sure I've missed stuff. Are there any
>>>>> must haves e.g. in the paging/sharing spaces?
>>>>>
>>>> Something that I just remembered:
>>>> http://wiki.xen.org/xenwiki/Xen4.1
>>>>
>>>> "NUMA-aware memory allocation for VMs. xl in Xen 4.1 will allocate
>>>> equal amount of memory from every NUMA node for the VM. xm/xend
>>>> allocates all the memory from the same NUMA node."
>>> I'm not that familiar with the NUMA support but my understanding was
>>> that memory was allocated by libxc/the-hypervisor and not by the
>>> toolstack and that the default was to allocate from the same numa nodes
>>> as the domain's processors were pinned to, i.e. if you pin the processors
>>> appropriately the Right Thing just happens. Do you believe this is not
>>> the case and/or not working right with xl?
>>>
>>> CCing Juergen since he added the cpupool support and in particular the
>>> cpupool-numa-split option so I'm hoping he knows something about NUMA
>>> more generally.
>>>
>>>> Is this something that should be looked at?
>>>
>>> Probably, but is anyone doing so?
>>>
>>>> Should the numa memory allocation be an option so it can be controlled
>>>> per domain?
>>> What options did xm provide in this regard?
>>>
>>> Does xl's cpupool (with the cpupool-numa-split option) serve the same
>>> purpose?
>>>
>>>> The default libxl behaviour might cause unexpected performance issues
>>>> on multi-socket systems?
>>> I'm not convinced libxl is behaving any different to xend but perhaps
>>> someone can show me the error of my ways.
>>>
>>
>> See this thread:
>> http://old-list-archives.xen.org/archives/html/xen-devel/2011-07/msg01423.html
>>
>> where Stefano wrote:
>> "I think we forgot about this feature but it is important and hopefully
>> somebody will write a patch for it before 4.2 is out."
> Is anyone looking into this?
>
> Does cpupool-numa-split solve this same problem?
>
> I think I forgot to actually CC Juergen when I said, doing that now.

I've just sent a patch which should do the job.
I just have no NUMA machine to test it on; I only tested that the patch
doesn't break booting dom0...


Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
PDG ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@ts.fujitsu.com
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-20 17:32                                   ` Dario Faggioli
@ 2012-01-23 10:19                                     ` Ian Campbell
  2012-01-23 13:14                                       ` Dario Faggioli
  0 siblings, 1 reply; 97+ messages in thread
From: Ian Campbell @ 2012-01-23 10:19 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, George Dunlap, Tim (Xen.org),
	Juergen Gross, Ian Jackson, Jan Beulich

On Fri, 2012-01-20 at 17:32 +0000, Dario Faggioli wrote:
> On Fri, 2012-01-20 at 16:54 +0000, Ian Campbell wrote: 
> > I confused myself into thinking that cpupools ~= NUMA because I've only
> > used cpupool-numa-split but I can see that you might also divide your
> > cpus up some other way.
> > 
> Yeah, indeed, although the numa-split case looks like the most useful
> one to me.
> 
> > Should that same union be used for d->node_affinity though? It seems
> > like it would make sense.
> > 
> According to me, it should.

I agree.

One idea I had over the weekend is that we could support a special
'cpus="pool"' syntax to mean "pin this guest to the node I configured it
to be in". I think this is a second best option to simply having
d->node_affinity reflect the pool though.
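
Something like this in the guest config, i.e. (hypothetical syntax, not
implemented anywhere):

    pool = "Pool-node1"
    cpus = "pool"        # pin to whatever pcpus/node back Pool-node1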

>  Then, at least right now, moving it would
> probably kill its performance because all its memory will be far away,
> while right now it's all more "stochastic".

Yes, in some sense the xend behaviour is best case good behaviour and
worst case bad behaviour, while xl has a more average/consistent
behaviour across the range. In practice however I suspect xend probably
hits the good cases more often than not.

> Still, I think it should be done, as if you place a domain in a cpupool
> at its creation, I think the case of moving it away from there would be
> quite rare.

Agreed.

Ian.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-17 10:39                                             ` Ian Campbell
@ 2012-01-23 11:40                                               ` Roger Pau Monné
  2012-01-27  8:43                                                 ` Roger Pau Monné
  0 siblings, 1 reply; 97+ messages in thread
From: Roger Pau Monné @ 2012-01-23 11:40 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Ian Jackson, Stefano Stabellini

2012/1/17 Ian Campbell <Ian.Campbell@citrix.com>:
> On Tue, 2012-01-17 at 10:00 +0000, Roger Pau Monné wrote:
>> 2012/1/17 Ian Campbell <Ian.Campbell@citrix.com>:
>> > On Tue, 2012-01-17 at 09:40 +0000, Roger Pau Monné wrote:
>> >> 2012/1/17 Ian Campbell <Ian.Campbell@citrix.com>:
>> >> > However xend should not be transitioned to this new scheme but should
>> >> > continue to use its existing scripts in the current manner.
>> >> >
>> >> > There was a conversation last year[0] about how a toolstack could
>> >> > opt-in/out of the use of the hotplug scripts. We decided that toolstacks
>> >> > should have to opt into the use of these scripts, by touching a stamp
>> >> > file.
>> >> >
>> >> > Although this wasn't implemented yet (unless I missed it) I guess the
>> >> > same scheme would apply to this work if that sort of thing turns out to
>> >> > be necessary.
>> >>
>> >> Sorry for replying so many times, this is a big maybe, and possibly
>> >> it's too drastic, but after these changes xl and xend will not be
>> >> compatible anymore, so why don't we disable xend by default, and only
>> >> build xl?
>> >
>> > I don't think they are compatible now, are they? I've certainly seen odd
>> > behaviour when using xl with xend (accidentally) running, usually xend
>> > reaps the domain I've just started...
>> >
>> > I'm all for disabling the build of xend by default but I had assumed
>> > that others would think 4.2 was rather an aggressive timeline for that.
>> >
>> >> When the new configure script is in, it will be trivial to select if
>> >> you want xl or xend, and only install one of those. Adding the option
>> >> --enable-xend should disable xl and only build and install xend
>> >> (printing a very big warning that xend is deprecated).
>> >
>> > I don't think --enable-xend should ever disable xl (or vice versa). Many
>> > folks (e.g. distros) will want to build both, perhaps to package them in
>> > two different binary packages, but certainly to offer their users the
>> > choice, at least for the time being.
>>
>> My main concern with this is that xend and xl will start to use
>> different udev rules (well, xend will continue to use the existing
>> ones, while xl will only use a subset of those). So we have to decide
>> which udev rules file to install, because we can't have both installed
>> at the same time.
>
> Sure we can. Perhaps they need to have an "if $TOOLSTACK" check (e.g. if
> [ -f /var/run/xend.hotplug ]) added to the top, that is all.
>
>> Another option is to install xl udev rules by default, and make xend
>> move its own rules in the init script.
>
> I don't think initscripts should be messing with udev rules.
>
> Perhaps the opt-in needs to be more fine-grained, e.g. opt-in to vif but
> not block scripts or whatever distinction you think is necessary, instead
> of just a global opt-in; it's just a different naming convention for the
> stamp file. This avoids reconfiguration and the need to install subsets
> of the scripts etc.
>
>>  Since xl doesn't use a daemon,
>> xl should always check if xend is running before doing anything and
>> fail if xend is found.
>
> I think that is a separate question/issue to the one of hotplug scripts.

I posted a WIP the other week about calling hotplug scripts from
libxl; by the end of this week or the beginning of the next I
will try to post the finished driver domain series, and then we can
decide how to fix xend to preserve compatibility.

With all this, is there a deadline for 4.2? I would really like to have
driver domains added to 4.2.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-23 10:19                                     ` Ian Campbell
@ 2012-01-23 13:14                                       ` Dario Faggioli
  2012-01-23 13:20                                         ` Ian Campbell
  0 siblings, 1 reply; 97+ messages in thread
From: Dario Faggioli @ 2012-01-23 13:14 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, George Dunlap, Tim (Xen.org),
	Juergen Gross, Ian Jackson, Jan Beulich


[-- Attachment #1.1: Type: text/plain, Size: 1322 bytes --]

On Mon, 2012-01-23 at 10:19 +0000, Ian Campbell wrote: 
> > According to me, it should.
> 
> I agree.
> 
> One idea I had over the weekend is that we could support a special
> 'cpus="pool"' syntax to mean "pin this guest to the node I configured it
> to be in". I think this is a second best option to simply having
> d->node_affinity reflect the pool though.
> 
Which is exactly what Juergen is doing, right? Or you meant something
else?

> >  Then, at least right now, moving it would
> > probably kill its performance because all its memory will be far away,
> > while right now it's all more "stochastic".
> 
> Yes, in some sense the xend behaviour is best case good behaviour and
> worst case bad behaviour, while xl has a more average/consistent
> behaviour across the range. In practice however I suspect xend probably
> hits the good cases more often than not.
> 
Me too. I'm thinking how/working to get to something even better! :-)

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-------------------------------------------------------------------
Dario Faggioli, http://retis.sssup.it/people/faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
PhD Candidate, ReTiS Lab, Scuola Superiore Sant'Anna, Pisa (Italy)



[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend
  2012-01-23 13:14                                       ` Dario Faggioli
@ 2012-01-23 13:20                                         ` Ian Campbell
  0 siblings, 0 replies; 97+ messages in thread
From: Ian Campbell @ 2012-01-23 13:20 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: xen-devel, Keir (Xen.org),
	Stefano Stabellini, George Dunlap, Tim (Xen.org),
	Juergen Gross, Ian Jackson, Jan Beulich

On Mon, 2012-01-23 at 13:14 +0000, Dario Faggioli wrote:
> On Mon, 2012-01-23 at 10:19 +0000, Ian Campbell wrote: 
> > > According to me, it should.
> > 
> > I agree.
> > 
> > One idea I had over the weekend is that we could support a special
> > 'cpus="pool"' syntax to mean "pin this guest to the node I configured it
> > to be in". I think this is a second best option to simply having
> > d->node_affinity reflect the pool though.
> > 
> Which is exactly what Juergen is doing, right? Or you meant something
> else?

I meant what Juergen is doing, I just hadn't seen that mail yet.

> > >  Then, at least right now, moving it would
> > > probably kill its performance because all its memory will be far away,
> > > while right now it's all more "stochastic".
> > 
> > Yes, in some sense the xend behaviour is best case good behaviour and
> > worst case bad behaviour, while xl has a more average/consistent
> > behaviour across the range. In practice however I suspect xend probably
> > hits the good cases more often than not.
> > 
> Me too. I'm thinking how/working to get to something even better! :-)

Great.

Ian.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-23 11:40                                               ` Roger Pau Monné
@ 2012-01-27  8:43                                                 ` Roger Pau Monné
  2012-01-27 10:57                                                   ` Stefano Stabellini
  0 siblings, 1 reply; 97+ messages in thread
From: Roger Pau Monné @ 2012-01-27  8:43 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, Ian Jackson, Stefano Stabellini

Hello,

I have a question regarding driver domains and root hard disks: if the
root hard disk (the one containing the kernel and ramdisk) is on a
driver domain, how can we pass the kernel to the Dom0? libvchan seems
like a good option to pass the kernel and ramdisk from driver domains
to Dom0, but I would like to hear opinions about that.

If kernel and ramdisk passing is not implemented, the only way to boot
from a hard disk stored on a driver domain is to extract the kernel and
ramdisk and store them on the Dom0.

Thanks, Roger.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-27  8:43                                                 ` Roger Pau Monné
@ 2012-01-27 10:57                                                   ` Stefano Stabellini
  2012-01-31  9:57                                                     ` Roger Pau Monné
  0 siblings, 1 reply; 97+ messages in thread
From: Stefano Stabellini @ 2012-01-27 10:57 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, Ian Jackson, Ian Campbell, Stefano Stabellini

[-- Attachment #1: Type: text/plain, Size: 1255 bytes --]

On Fri, 27 Jan 2012, Roger Pau Monné wrote:
> Hello,
> 
> I have a question regarding driver domains and root hard disks, if the
> root hard disk (the one containing the kernel and ramdisk) is on a
> driver domain, how can we pass the kernel to the Dom0? libvchan seems
> like a good option to pass the kernel and ramdisk from driver domains
> to Dom0, but I would like to hear opinions about that.
> 
> If kernel and ramdisk passing is not implemented, the only way to boot
> from hard disk stored on driver domains is to extract the kernel and
> ramdisk and store them on the Dom0.

Let me describe this scenario in more detail for you:
we are using a storage driver domain (the storage controller is assigned
to a domain other than dom0), and dom0 is still responsible for creating
all the other VMs, including the storage driver domain.

How can dom0 create the storage driver domain if the kernel and initrd
of the storage driver domain are on the hard disk?

A simple solution would be having the storage driver domain's kernel and
initrd inside dom0's initrd, or passed to dom0 through multiboot by the
bootloader as an additional payload (better).
Dom0 should be capable of freeing the memory used this way after
creating the storage driver domain.
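
For example, the bootloader side of that could look roughly like this
(grub2 syntax, paths purely illustrative; how dom0 would then consume
the extra modules is exactly the part that would need implementing):

    multiboot /boot/xen.gz dom0_mem=1024M
    module    /boot/vmlinuz-dom0 root=/dev/sda1 ro
    module    /boot/initrd-dom0.img
    module    /boot/vmlinuz-storage-domU       # extra payload
    module    /boot/initrd-storage-domU.img    # extra payload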

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-27 10:57                                                   ` Stefano Stabellini
@ 2012-01-31  9:57                                                     ` Roger Pau Monné
  2012-01-31 10:06                                                       ` Tim Deegan
                                                                         ` (2 more replies)
  0 siblings, 3 replies; 97+ messages in thread
From: Roger Pau Monné @ 2012-01-31  9:57 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Ian Jackson, Ian Campbell

Hello,

I've been thinking about this email for some time, but there are parts
that are still not clear, so I'm sorry to bother you again with
this...

2012/1/27 Stefano Stabellini <stefano.stabellini@eu.citrix.com>:
> On Fri, 27 Jan 2012, Roger Pau Monné wrote:
>> Hello,
>>
>> I have a question regarding driver domains and root hard disks, if the
>> root hard disk (the one containing the kernel and ramdisk) is on a
>> driver domain, how can we pass the kernel to the Dom0? libvchan seems
>> like a good option to pass the kernel and ramdisk from driver domains
>> to Dom0, but I would like to hear opinions about that.
>>
>> If kernel and ramdisk passing is not implemented, the only way to boot
>> from hard disk stored on driver domains is to extract the kernel and
>> ramdisk and store them on the Dom0.
>
> Let me describe in more details this scenario for you:
> we are using a storage driver domain (the storage controller is assigned
> to a domain other than dom0), and dom0 is still responsible for creating
> all the other VMs, including the storage driver domain.

Ok, you launch the Dom0 normally and then you launch the domain(s)
that will be the driver domains, that's ok. Each of these driver
domains should be running xenbackendd to react to device
creation/destruction.

>
> How can dom0 create the storage driver domain if the kernel and initrd
> of the storage driver domain are on the hard disk?
>
> A simple solution would be having the storage driver domain kernel and
> initrd inside dom0 initrd or passed to dom0 through multiboot by the
> bootloader as an additional payload (better).
> Dom0 should be capable of freeing the memory used this way after
> creating the storage driver domain.

That's what I don't get. Booting the driver domain should be no
problem, because you can also have a xenbackendd running in the Dom0
to boot the driver domain (or maybe you want to use both Dom0 and
another DomU as driver domains).

What I don't get is what you do when you have to boot a PV DomU whose
root HDD is on the driver domain. Dom0 needs the kernel/initrd from
the HDD (usually extracted using pygrub). Since the HDD is inside the
driver domain, Dom0 doesn't have access to that image, so there's no
way to extract the kernel/initrd from the Dom0. What I thought is that
the driver domain has to run pygrub, extract the kernel/initrd, and
pass both files to the Dom0, but how can we pass those files? libvchan
seems like the best option, but I would like to hear others' opinions
about this.

Currently I have a mostly working xenbackendd implementation with
libxl that can handle vbd and vif interfaces, but I'm missing qdisk;
I still have to look into the Qemu stuff to be able to launch a device
model that only attaches a HDD.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-31  9:57                                                     ` Roger Pau Monné
@ 2012-01-31 10:06                                                       ` Tim Deegan
  2012-01-31 13:47                                                         ` Stefano Stabellini
  2012-01-31 13:51                                                       ` Stefano Stabellini
  2012-01-31 20:02                                                       ` Ian Campbell
  2 siblings, 1 reply; 97+ messages in thread
From: Tim Deegan @ 2012-01-31 10:06 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, Ian Jackson, Ian Campbell, Stefano Stabellini

Hi, 

At 10:57 +0100 on 31 Jan (1328007457), Roger Pau Monné wrote:
> That's what I don't get. Booting the driver domain should be no
> problem, because you can also have a xenbackendd running in the Dom0
> to boot the driver domain (or maybe you want to use both Dom0 and
> another DomU as driver domains).
> 
> What I don't get is what you do when you have to boot a PV DomU which
> root HDD is on the driver domain. Dom0 needs the kernel/initrd from
> the HDD (usually extracted using pygrub). Since the HDD is inside the
> driver domain, Dom0 doesn't have access to that image, so there's no
> way to extract the kernel/initrd from the Dom0. What I thought is that
> the driver domain has to run pygrub, extract the kernel/initrd, and
> pass both files to the Dom0, but how can we pass those files? libvchan
> seems like the best option, but I would like to hear others' opinions
> about this.

You could attach to the disk image (using blkfront in dom0 and
blkback/tap/whatever in the driver domain), run pygrub and detach.
That's basically the same thing you have to do with a qcow image in a
traditional dom0.
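
Roughly (device names illustrative, and assuming the disk spec lets you
name the backend domain with a backend= key):

    xl block-attach 0 'format=raw, vdev=xvdb, backend=storage-domU, target=/dev/vg/guest-root'
    pygrub /dev/xvdb       # extract the guest's kernel/initrd
    xl block-detach 0 xvdb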

Or you could use pvgrub to boot the domain, so dom0 code never has to
touch the guest-supplied disk image or kernel.  That seems much better
to me, but maybe it wouldn't work for some existing deployments?

Cheers,

Tim.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-31 10:06                                                       ` Tim Deegan
@ 2012-01-31 13:47                                                         ` Stefano Stabellini
  2012-01-31 13:51                                                           ` Ian Campbell
  0 siblings, 1 reply; 97+ messages in thread
From: Stefano Stabellini @ 2012-01-31 13:47 UTC (permalink / raw)
  To: Tim Deegan
  Cc: Roger Pau Monné,
	xen-devel, Ian Jackson, Ian Campbell, Stefano Stabellini

On Tue, 31 Jan 2012, Tim Deegan wrote:
> Hi, 
> 
> At 10:57 +0100 on 31 Jan (1328007457), Roger Pau Monné wrote:
> > That's what I don't get. Booting the driver domain should be no
> > problem, because you can also have a xenbackendd running in the Dom0
> > to boot the driver domain (or maybe you want to use both Dom0 and
> > another DomU as driver domains).
> > 
> > What I don't get is what you do when you have to boot a PV DomU which
> > root HDD is on the driver domain. Dom0 needs the kernel/initrd from
> > the HDD (usually extracted using pygrub). Since the HDD is inside the
> > driver domain, Dom0 doesn't have access to that image, so there's no
> > way to extract the kernel/initrd from the Dom0. What I thought is that
> > the driver domain has to run pygrub, extract the kernel/initrd, and
> > pass both files to the Dom0, but how can we pass those files? libvchan
> > seems like the best option, but I would like to hear others' opinions
> > about this.
> 
> You could attach to the disk image (using blkfront in dom0 and
> blkback/tap/whatever in the driver domain), run pygrub and detach.
> That's basically the same thing you have to do with a qcow image in a
> traditional dom0.

this


> Or you could use pvgrub to boot the domain, so dom0 code never has to
> touch the guets-supplied disk image or kernel.  That seems much better
> to me, but maybe it wouldn't work for some existing deployments?

It wouldn't work with guests that use grub2, I believe.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-31  9:57                                                     ` Roger Pau Monné
  2012-01-31 10:06                                                       ` Tim Deegan
@ 2012-01-31 13:51                                                       ` Stefano Stabellini
  2012-01-31 20:02                                                       ` Ian Campbell
  2 siblings, 0 replies; 97+ messages in thread
From: Stefano Stabellini @ 2012-01-31 13:51 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, Ian Jackson, Ian Campbell, Stefano Stabellini

[-- Attachment #1: Type: text/plain, Size: 1447 bytes --]

On Tue, 31 Jan 2012, Roger Pau Monné wrote:
> That's what I don't get. Booting the driver domain should be no
> problem, because you can also have a xenbackendd running in the Dom0
> to boot the driver domain (or maybe you want to use both Dom0 and
> another DomU as driver domains).
> 
> What I don't get is what you do when you have to boot a PV DomU whose
> root HDD is on the driver domain. Dom0 needs the kernel/initrd from
> the HDD (usually extracted using pygrub). Since the HDD is inside the
> driver domain, Dom0 doesn't have access to that image, so there's no
> way to extract the kernel/initrd from Dom0. What I thought is that
> the driver domain has to run pygrub, extract the kernel/initrd, and
> pass both files to Dom0, but how can we pass those files? libvchan
> seems like the best option, but I would like to hear others' opinions
> about this.

As Tim said, you would use blkfront in dom0 to get a device you can open
with pygrub.


> Currently I have a mostly working xenbackendd implementation with
> libxl, that can handle vbd and vif interfaces, but I'm missing qdisk,
> I still have to look into the Qemu stuff to be able to launch a device
> model that only attaches a HDD.
 
In the case of a qdisk, you can use the same blkfront-in-dom0 trick described
above, or, if the backend and the frontend are both in dom0, you can use
QEMU's NBD server to export the disk and nbd-client to set up a device
that pygrub can open.
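
For example, something along these lines -- the image path and /dev/nbd0 are
placeholders, and the qemu-nbd options should be checked against the QEMU in
use:

modprobe nbd                                      # kernel NBD client support
qemu-nbd --read-only --connect=/dev/nbd0 /path/to/guest-disk.qcow2
pygrub /dev/nbd0                                  # extract kernel/initrd
qemu-nbd --disconnect /dev/nbd0

(Equivalently, run qemu-nbd as a TCP server and attach with nbd-client; the
end result either way is a local block device pygrub can open.)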

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-31 13:47                                                         ` Stefano Stabellini
@ 2012-01-31 13:51                                                           ` Ian Campbell
  0 siblings, 0 replies; 97+ messages in thread
From: Ian Campbell @ 2012-01-31 13:51 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Roger Pau Monné, Ian Jackson, xen-devel, Tim (Xen.org)

On Tue, 2012-01-31 at 13:47 +0000, Stefano Stabellini wrote:
> On Tue, 31 Jan 2012, Tim Deegan wrote:
> > Hi, 
> > 
> > At 10:57 +0100 on 31 Jan (1328007457), Roger Pau Monné wrote:
> > > That's what I don't get. Booting the driver domain should be no
> > > problem, because you can also have a xenbackendd running in the Dom0
> > > to boot the driver domain (or maybe you want to use both Dom0 and
> > > another DomU as driver domains).
> > > 
> > > What I don't get is what you do when you have to boot a PV DomU whose
> > > root HDD is on the driver domain. Dom0 needs the kernel/initrd from
> > > the HDD (usually extracted using pygrub). Since the HDD is inside the
> > > driver domain, Dom0 doesn't have access to that image, so there's no
> > > way to extract the kernel/initrd from Dom0. What I thought is that
> > > the driver domain has to run pygrub, extract the kernel/initrd, and
> > > pass both files to Dom0, but how can we pass those files? libvchan
> > > seems like the best option, but I would like to hear others' opinions
> > > about this.
> > 
> > You could attach to the disk image (using blkfront in dom0 and
> > blkback/tap/whatever in the driver domain), run pygrub and detach.
> > That's basically the same thing you have to do with a qcow image in a
> > traditional dom0.
> 
> this
> 
> 
> > Or you could use pvgrub to boot the domain, so dom0 code never has to
> > > touch the guest-supplied disk image or kernel.  That seems much better
> > to me, but maybe it wouldn't work for some existing deployments?
> 
> It wouldn't work with guests that use grub2, I believe.

One of the grub2 developers posted about working on pv grub2 back in
November...

Ian.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: Driver domains and hotplug scripts, redux
  2012-01-31  9:57                                                     ` Roger Pau Monné
  2012-01-31 10:06                                                       ` Tim Deegan
  2012-01-31 13:51                                                       ` Stefano Stabellini
@ 2012-01-31 20:02                                                       ` Ian Campbell
  2 siblings, 0 replies; 97+ messages in thread
From: Ian Campbell @ 2012-01-31 20:02 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Ian Jackson, Stefano Stabellini

On Tue, 2012-01-31 at 09:57 +0000, Roger Pau Monné wrote:
> 
> 
> What I don't get is what you do when you have to boot a PV DomU whose
> root HDD is on the driver domain. Dom0 needs the kernel/initrd from
> the HDD (usually extracted using pygrub). Since the HDD is inside the
> driver domain, Dom0 doesn't have access to that image, so there's no
> way to extract the kernel/initrd from Dom0. 

The usual way to deal with this is to create a vbd device in dom0 (or
whichever domain runs pygrub) attached to the backend domain and pass
that to pygrub. This is the sort of thing libxl_device_disk_local_attach
would be expected to handle.

Ian.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: RFC: Still TODO for 4.2?
  2012-01-06 15:37         ` Konrad Rzeszutek Wilk
  2012-01-06 19:08           ` Wei Huang
@ 2012-02-06 17:57           ` Pasi Kärkkäinen
  2012-02-13 17:52             ` Extracting ATI/AMD Radeon VBIOS ROM Pasi Kärkkäinen
  1 sibling, 1 reply; 97+ messages in thread
From: Pasi Kärkkäinen @ 2012-02-06 17:57 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: xen-devel, Keir Fraser, Ian Campbell, Tim Deegan, Wei Huang,
	Ian Jackson, Stefano Stabellini, Jan Beulich

On Fri, Jan 06, 2012 at 11:37:14AM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, Jan 04, 2012 at 01:57:28PM -0600, Wei Huang wrote:
> > >On Wed, Jan 04, 2012 at 01:21:46PM -0600, Wei Huang wrote:
> > >On Wed, Jan 04, 2012 at 01:21:46PM -0600, Wei Huang wrote:
> > >>>>Has anybody got anything else? I'm sure I've missed stuff. Are there any
> > >>>>must haves e.g. in the paging/sharing spaces?
> > >>>>
> > >>>- What's the status of Nested Hardware Virtualization?
> > >>>I remember some email saying Intel vmx-on-vmx has some performance 
> > >>>issues,
> > >>>and amd svm-on-svm works better..
> > >>>
> > >>>
> > >>>- Also there's a bunch of VGA passthru related patches,
> > >>>that I once volunteered to collect/rebase/cleanup/repost myself,
> > >>>but I still haven't had time for that :(
> > >>Since there were quite a lot of interest on this subject, should we
> > >>document it in a separate wiki for working combinations (like
> > >>hypervisor, dom0, gfx card, driver version, tricks, etc)?
> > >>
> > >I actually once started writing down that kind of stuff:
> > >http://wiki.xen.org/xenwiki/XenVGAPassthroughTestedAdapters.html
> > >
> > >Feel free to contribute :)
> > >
> > >There's also:
> > >http://wiki.xen.org/xenwiki/XenVGAPassthrough
> > Thanks for sharing. I will contribute my findings as needed. BTW, do you 
> > need my VBIOS loading patches (sent long time ago) for AMD GPU? It is a 
> 
> Yes! Though I haven't yet figured out how to extract the AMD GPU BIOS
> from the card. I've been able to pass in a Radeon 4870 to a Win 7 HVM
> guest and it works nicely.. the first time. After I shut down the guest
> it just never works again :-(
> 

Hmm.. I wonder if this would work:

cd /sys/bus/pci/devices/0000:01:05.0
sudo sh -c "echo 1 > rom"
sudo sh -c "cat rom > ~/bios.rom"
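
Same idea with the steps spelled out (run as root; the PCI address is only
an example, adjust it to whatever lspci reports for the GPU):

cd /sys/bus/pci/devices/0000:01:05.0
echo 1 > rom             # writing 1 to the sysfs "rom" attribute enables reads
cat rom > ~/bios.rom     # dump the option ROM image
echo 0 > rom             # disable ROM access again when done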


-- Pasi

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Extracting ATI/AMD Radeon VBIOS ROM
  2012-02-06 17:57           ` Pasi Kärkkäinen
@ 2012-02-13 17:52             ` Pasi Kärkkäinen
  0 siblings, 0 replies; 97+ messages in thread
From: Pasi Kärkkäinen @ 2012-02-13 17:52 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: xen-devel, Keir Fraser, Ian Campbell, Tim Deegan, Wei Huang,
	Ian Jackson, Stefano Stabellini, Jan Beulich

On Mon, Feb 06, 2012 at 07:57:55PM +0200, Pasi Kärkkäinen wrote:
> > > >>>- Also there's a bunch of VGA passthru related patches,
> > > >>>that I once volunteered to collect/rebase/cleanup/repost myself,
> > > >>>but I still haven't had time for that :(
> > > >>Since there were quite a lot of interest on this subject, should we
> > > >>document it in a separate wiki for working combinations (like
> > > >>hypervisor, dom0, gfx card, driver version, tricks, etc)?
> > > >>
> > > >I actually once started writing down that kind of stuff:
> > > >http://wiki.xen.org/xenwiki/XenVGAPassthroughTestedAdapters.html
> > > >
> > > >Feel free to contribute :)
> > > >
> > > >There's also:
> > > >http://wiki.xen.org/xenwiki/XenVGAPassthrough
> > > Thanks for sharing. I will contribute my findings as needed. BTW, do you 
> > > need my VBIOS loading patches (sent long time ago) for AMD GPU? It is a 
> > 
> > Yes! Though I haven't yet figured out how to extract the AMD GPU BIOS
> > from the card. I've been able to pass in a Radeon 4870 to a Win 7 HVM
> > guest and it works nicely.. the first time. After I shut down the guest
> > it just never works again :-(
> > 
> 
> Hmm.. I wonder if this would work:
> 
> cd /sys/bus/pci/devices/0000:01:05.0
> sudo sh -c "echo 1 > rom"
> sudo sh -c "cat rom > ~/bios.rom"
> 

Extracting the Radeon Mobility HD3650 VBIOS ROM as above works on my laptop
(on bare-metal Linux).

-- Pasi

^ permalink raw reply	[flat|nested] 97+ messages in thread

end of thread, other threads:[~2012-02-13 17:52 UTC | newest]

Thread overview: 97+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-01-04 16:29 RFC: Still TODO for 4.2? Ian Campbell
2012-01-04 16:47 ` Konrad Rzeszutek Wilk
2012-01-04 16:51   ` Stefano Stabellini
2012-01-16 13:42     ` Ian Campbell
2012-01-04 16:55 ` Jan Beulich
2012-01-16 13:39   ` Ian Campbell
2012-01-16 14:48     ` Jan Beulich
2012-01-16 15:00       ` Stefano Stabellini
2012-01-04 17:25 ` Pasi Kärkkäinen
2012-01-04 17:36   ` George Dunlap
2012-01-04 18:20   ` Tim Deegan
2012-01-05 10:39     ` Ian Campbell
2012-01-06 15:24       ` RFC: Still TODO for 4.2? Nested Paging for Intel Nested Virt Pasi Kärkkäinen
2012-01-04 19:21   ` RFC: Still TODO for 4.2? Wei Huang
2012-01-04 19:43     ` Pasi Kärkkäinen
2012-01-04 19:57       ` Wei Huang
2012-01-05  7:27         ` Pasi Kärkkäinen
2012-01-06 15:37         ` Konrad Rzeszutek Wilk
2012-01-06 19:08           ` Wei Huang
2012-02-06 17:57           ` Pasi Kärkkäinen
2012-02-13 17:52             ` Extracting ATI/AMD Radeon VBIOS ROM Pasi Kärkkäinen
2012-01-05 13:19       ` Re : RFC: Still TODO for 4.2? David TECHER
2012-01-05 13:25         ` Ian Campbell
2012-01-05 13:41           ` Re : " David TECHER
2012-01-05 16:18             ` Ian Campbell
2012-01-16 13:28   ` Ian Campbell
2012-01-16 14:39     ` Re : " David TECHER
2012-01-04 17:39 ` Roger Pau Monné
2012-01-05 18:07   ` Driver domains and hotplug scripts, redux Ian Jackson
2012-01-06 12:01     ` Stefano Stabellini
2012-01-09 10:08     ` Roger Pau Monné
2012-01-09 12:26       ` Stefano Stabellini
2012-01-09 17:39         ` Ian Jackson
     [not found]           ` <alpine.DEB.2.00.1201101448030.3150@kaball-desktop>
     [not found]             ` <20236.23822.715733.455559@mariner.uk.xensource.com>
     [not found]               ` <alpine.DEB.2.00.1201101547540.3150@kaball-desktop>
     [not found]                 ` <20236.24780.865152.458124@mariner.uk.xensource.com>
     [not found]                   ` <alpine.DEB.2.00.1201101619460.3150@kaball-desktop>
     [not found]                     ` <20236.27158.706017.813195@mariner.uk.xensource.com>
     [not found]                       ` <alpine.DEB.2.00.1201101655390.3150@kaball-desktop>
     [not found]                         ` <20236.28931.127139.752426@mariner.uk.xensource.com>
2012-01-11 11:50                           ` Roger Pau Monné
2012-01-11 12:17                             ` Ian Campbell
2012-01-11 14:26                               ` Dave Scott
2012-01-12 16:50                                 ` Ian Jackson
2012-01-12 18:07                                   ` Dave Scott
2012-01-11 14:44                               ` Roger Pau Monné
2012-01-12 16:48                               ` Ian Jackson
2012-01-16 17:52                                 ` Roger Pau Monné
2012-01-16 17:58                                   ` Ian Jackson
2012-01-17  9:17                                     ` Ian Campbell
2012-01-17  9:30                                       ` Roger Pau Monné
2012-01-17  9:43                                         ` Ian Campbell
2012-01-17  9:40                                       ` Roger Pau Monné
2012-01-17  9:52                                         ` Ian Campbell
2012-01-17 10:00                                           ` Roger Pau Monné
2012-01-17 10:39                                             ` Ian Campbell
2012-01-23 11:40                                               ` Roger Pau Monné
2012-01-27  8:43                                                 ` Roger Pau Monné
2012-01-27 10:57                                                   ` Stefano Stabellini
2012-01-31  9:57                                                     ` Roger Pau Monné
2012-01-31 10:06                                                       ` Tim Deegan
2012-01-31 13:47                                                         ` Stefano Stabellini
2012-01-31 13:51                                                           ` Ian Campbell
2012-01-31 13:51                                                       ` Stefano Stabellini
2012-01-31 20:02                                                       ` Ian Campbell
2012-01-17  9:22                                     ` Roger Pau Monné
2012-01-11 12:50                             ` Stefano Stabellini
2012-01-05 17:49 ` RFC: Still TODO for 4.2? Ian Jackson
2012-01-06 13:37   ` Ian Campbell
2012-01-10 16:06     ` Ian Jackson
2012-01-16 11:55 ` George Dunlap
2012-01-19 21:14 ` RFC: Still TODO for 4.2? xl domain numa memory allocation vs xm/xend Pasi Kärkkäinen
2012-01-20  7:59   ` Ian Campbell
2012-01-20  8:15     ` Pasi Kärkkäinen
2012-01-20  9:01       ` Ian Campbell
2012-01-20  9:47         ` Dario Faggioli
2012-01-20 11:56           ` Ian Campbell
2012-01-23  9:59         ` Juergen Gross
2012-01-20 10:55     ` Stefano Stabellini
2012-01-20 11:22       ` Ian Campbell
2012-01-20 11:25         ` Stefano Stabellini
2012-01-20 11:44         ` Dario Faggioli
2012-01-20 11:54           ` Ian Campbell
2012-01-20 12:04             ` Dario Faggioli
2012-01-20 12:33               ` Ian Campbell
2012-01-20 13:11                 ` Ian Campbell
2012-01-20 15:06                   ` Ian Campbell
2012-01-20 16:02                     ` Dario Faggioli
2012-01-20 16:21                       ` Ian Campbell
2012-01-20 16:28                         ` Ian Campbell
2012-01-20 16:31                           ` George Dunlap
2012-01-20 16:39                             ` Ian Campbell
2012-01-20 16:43                               ` George Dunlap
2012-01-20 16:54                                 ` Ian Campbell
2012-01-20 17:32                                   ` Dario Faggioli
2012-01-23 10:19                                     ` Ian Campbell
2012-01-23 13:14                                       ` Dario Faggioli
2012-01-23 13:20                                         ` Ian Campbell
2012-01-20 16:55                                 ` Ian Campbell
2012-01-20 16:59                                   ` George Dunlap
2012-01-20 16:58                         ` Dario Faggioli
2012-01-20 17:23                           ` Ian Campbell
2012-01-20 17:28                             ` Dario Faggioli
2012-01-20 11:26       ` Dario Faggioli
