* Could Xen hyperviosr be able to invoke Linux systemcalls?
@ 2015-08-15  1:31 Kun Cheng
  2015-08-16 16:16 ` Frediano Ziglio
  0 siblings, 1 reply; 8+ messages in thread
From: Kun Cheng @ 2015-08-15  1:31 UTC
  To: xen-devel


Hi all,

This might be a dumb question, but I'm just not confident about it. I'm
not familiar with Xen's memory management. Currently I want to add some
support (mostly dealing with machine memory) to the hypervisor to assist
the management of the VMs running above it. There is some code in the
Linux kernel that should be useful, but can Xen call Linux system calls or
other kernel functions?

I'm not sure about this, as in my understanding the Xen hypervisor lies
under the kernel, so it can't invoke a system call from the kernel (or,
let's say, dom0's kernel). If I want to use that code, I suppose I have to
implement it in the hypervisor myself, right?

And does anyone know which of Xen's wiki pages explain the management and
APIs of Xen's memory?

Thank you all.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

* Re: Could Xen hyperviosr be able to invoke Linux systemcalls?
  2015-08-15  1:31 Could Xen hyperviosr be able to invoke Linux systemcalls? Kun Cheng
@ 2015-08-16 16:16 ` Frediano Ziglio
  2015-08-17  0:55   ` Kun Cheng
  0 siblings, 1 reply; 8+ messages in thread
From: Frediano Ziglio @ 2015-08-16 16:16 UTC
  To: Kun Cheng; +Cc: xen-devel

2015-08-15 2:31 GMT+01:00 Kun Cheng <chengkunck@gmail.com>:
> Hi all,
>
> That might be a dumb question but I just not confident with it. I'm not
> familiar with Xen's memory management part. Currently I want to add some
> support  (it should cope more with machine memory) to the hyperviosr to
> assist the management of the above VMs. Now the situation is there're some
> codes in the kernel which are supposed to be useful. but can Xen call Linux
> system calls or other kernel functions?
>
> I'm not pretty sure about this as in my understanding xen hyperviosr lies
> under the kernel, so it can't invoke a systemcall from the kernel (or let's
> say dom0's kernel) . Then if I want to use those codes, I suppose I have to
> implement them in the hyperviosr by myself, right?
>
> And does anyone know which one of xen's wiki pages explain the management &
> APIs of xen's memory?
>
> Thank you all.
>


Good question. However, I would hardly expect to see such a thing in a
hypervisor. Usually VM management is done by inserting a CD/DVD/USB device
and managing the installation, or by injecting files into the filesystem
at the storage level. Xen emulates the hardware, so it is not that easy to
make system calls. For example, you cannot be sure which kind of OS is
running (well... unless you are using PV guests, but even then you are not
100% sure). Say that you are able to detect the OS from what it is doing,
or by looking at storage/memory/whatever. Then you have to do the syscall.
You could trap syscall/int for the OS you detected, send the VM an event
that triggers a syscall from a high-permission executable, detect it, and
then start injecting syscalls to do whatever you want.
However, you also have to consider the "ethics" of doing so. Basically you
are hacking the OS from the hypervisor, forcing the VM to do something
unexpected. I don't know if such a patch would be considered for
inclusion upstream. Surely not unless you can easily disable it or,
better, unless it is disabled by default and can be enabled and detected
from the VMs.

I don't know if something similar is possible from domain0 instead of
Xen. You can surely change the memory of another domain from dom0, but
injecting syscalls is another matter. You could poll the VM to see whether
it is running in userspace (stopping the CPU), set a new context, and
change the code the CPU is running, but that is even more hacky than the
Xen suggestion.

Well, these are just some ideas; you could even change the callback
code once it is registered and use it to inject code. Still, the ethical
question remains.

Frediano

* Re: Could Xen hyperviosr be able to invoke Linux systemcalls?
  2015-08-16 16:16 ` Frediano Ziglio
@ 2015-08-17  0:55   ` Kun Cheng
  2015-08-17 19:25     ` Dario Faggioli
  0 siblings, 1 reply; 8+ messages in thread
From: Kun Cheng @ 2015-08-17  0:55 UTC
  To: Frediano Ziglio; +Cc: xen-devel


On Mon, Aug 17, 2015 at 12:16 AM Frediano Ziglio <freddy77@gmail.com> wrote:

> 2015-08-15 2:31 GMT+01:00 Kun Cheng <chengkunck@gmail.com>:
> > Hi all,
> >
> > That might be a dumb question but I just not confident with it. I'm not
> > familiar with Xen's memory management part. Currently I want to add some
> > support  (it should cope more with machine memory) to the hyperviosr to
> > assist the management of the above VMs. Now the situation is there're
> some
> > codes in the kernel which are supposed to be useful. but can Xen call
> Linux
> > system calls or other kernel functions?
> >
> > I'm not pretty sure about this as in my understanding xen hyperviosr lies
> > under the kernel, so it can't invoke a systemcall from the kernel (or
> let's
> > say dom0's kernel) . Then if I want to use those codes, I suppose I have
> to
> > implement them in the hyperviosr by myself, right?
> >
> > And does anyone know which one of xen's wiki pages explain the
> management &
> > APIs of xen's memory?
> >
> > Thank you all.
> >
>
>
> Good question. However I would hardly see such stuff in an hypervisor.
> Usually VM management is done inserting some cd/dvd/usb and managing
> the installation or injecting files into the filesystem at storage
> level. Xen emulate the hardware so is not that easy to do system
> calls. Just an example you are not sure which kind of OS is running
> (well... unless you are using PVs but even so you are not 100% sure).
> Saying that you are able to detect OS from what is doing/looking at
> storage/memory/whatever. Then you have to do the syscall. You could
> trap syscall/int for the OS you detected to run send to the VM an
> event that trigger a syscall from an high permission executable and
> detect it and then start injecting syscalls to do whatever you want.
> However you have also consider the "ethics" of doing so. Basically you
> are hacking the OS from the hypervisor forcing the VM to do something
> unexpected. I don't know if such a patch would be considered for
> inclusion upstream. Surely not if you can easily disable it or better
> if by default is disabled but you can enable and detect from the VMs.
>
>

What I'm planning is to add page migration support for NUMA-aware
scheduling. In that case I'll mostly be dealing with Xen's memory
management and scheduling code, to make the relevant pages migrate to
another node along with their VCPU. However, the Linux kernel has already
implemented some basic mechanisms, so the whole job would be easier if I
could leverage the kernel's existing code or functions. More specifically,
I want to confirm whether we could use code or functions in the Linux
kernel to assist the hypervisor. My guess is no, because in my
understanding the Xen hypervisor lies under the Linux kernel, i.e. dom0's
kernel. Given that Dom0 is a special domain, if I want to manage and move
all the machine memory pages, can the kernel be helpful?

Your example gave me some hints that injecting system calls could be
useful to achieve vNUMA. However, that is currently not in my plan.



> I don't know if something similar is possible from domain0 instead of
> Xen. You can surely change memory of another domain from dom0 but
> injecting syscall is another stuff. You can poll the VM to see if is
> running in userspace (stopping the CPU), set a new context and change
> code cpu is running but is even much more hacky then the Xen
> suggestion.
>

Hmm, "change the memory from dom0" means we can control the VM's memory by
using a config file or an xl command, right? What about at the code level?
I'm really confused now. Is Xen's memory management implemented entirely
by itself, or does it also receive help from the kernel?


>
> Well, actually are just some ideas, you could even change the callback
> code once registered and use it to inject code. Still the ethic
> question remain.
>
> Frediano
>


* Re: Could Xen hyperviosr be able to invoke Linux systemcalls?
  2015-08-17  0:55   ` Kun Cheng
@ 2015-08-17 19:25     ` Dario Faggioli
  2015-08-18  1:18       ` Kun Cheng
  0 siblings, 1 reply; 8+ messages in thread
From: Dario Faggioli @ 2015-08-17 19:25 UTC
  To: Kun Cheng; +Cc: Frediano Ziglio, xen-devel


On Mon, 2015-08-17 at 00:55 +0000, Kun Cheng wrote:
> 
> 
> On Mon, Aug 17, 2015 at 12:16 AM Frediano Ziglio <freddy77@gmail.com> 
> 
> What I'm planing is adding page migration support for NUMA aware
> scheduling. In such a case the most time I'll be dealing with Xen's
> memory management & scheduling part to make relevant pages migrate to
> another node with their VCPU. However, Linux kernel has already
> implemented some basic mechanisms so the whole work would be better by
> leveraging the kernel's  existing code or functions. 
>
No, not at all. As you figured (or at least had an intuition about)
yourself, Xen does run below Linux. Actually, it runs below any guest,
including Dom0, which is a special guest but still a guest, and need not
even be a Linux guest.

So there's no code sharing, and no mechanism to invoke Linux code and
have it affect Xen's scheduling or memory management (and there never
will be :-P).

> More specifically, I want to confirm that could we use the code or
> functions in linux kernel to assist the hypervisor? 
>
No, it's the other way around.

> My guess is not because in my understanding xen hypervisor lies under
> the linux kernel, i.e. dom0's kernel. 
>
Exactly.

> Given that Dom0 is a special domain, if I want to manage & move all
> the machine memory pages, can the kernel be helpful? 
>
The Dom0 kernel doesn't know anything about the memory of other guests.
It basically doesn't even know that they exist... That's the point of
virtualization, isn't it?
Also, Linux's and Xen's scheduling and memory management are so different
(and that's by design) that, even for similar (or the same) features, the
implementation will be different anyway, so sharing the code won't help
at all.
 
> Hmm, "change the memory from dom0" means we can control the VM's
> memory by using a config file or xl command right? 
>
"changing the memory from dom0" means something like "the dom0 can ask
Xen, via toolstack, to do something to the memory of other guests",
i.e., it has enough privileges to do that, but that's it.

> What if we goes to the code level? I'm really confused now. Is Xen's
> memory management complemented all by itsself or does it also receive
> help from the kernel? 
>  
As said, Xen doesn't even know what kernel is actually running in the
various guests, including dom0, and things should remain that way,
especially for core things like scheduling and memory management.

So, in summary, what you're after should be achieved entirely inside
Xen. It is possible that, in the PV guest case, you'd need some help
from the guest. However, that would be in the form of "Xen
asking/forcing the guest to do something on the *guest* *itself*", not
in the form of "Xen asking dom0 to do something on Xen's own
memory/scheduling or (directly) on other guests' memory".

Hope this helps clear things up for you. :-)

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


* Re: Could Xen hyperviosr be able to invoke Linux systemcalls?
  2015-08-17 19:25     ` Dario Faggioli
@ 2015-08-18  1:18       ` Kun Cheng
  2015-08-18  9:16         ` Dario Faggioli
  0 siblings, 1 reply; 8+ messages in thread
From: Kun Cheng @ 2015-08-18  1:18 UTC
  To: Dario Faggioli; +Cc: Frediano Ziglio, xen-devel


On Tue, Aug 18, 2015 at 3:25 AM Dario Faggioli <dario.faggioli@citrix.com>
wrote:

> On Mon, 2015-08-17 at 00:55 +0000, Kun Cheng wrote:
> >
> >
> > On Mon, Aug 17, 2015 at 12:16 AM Frediano Ziglio <freddy77@gmail.com>
> >
> > What I'm planing is adding page migration support for NUMA aware
> > scheduling. In such a case the most time I'll be dealing with Xen's
> > memory management & scheduling part to make relevant pages migrate to
> > another node with their VCPU. However, Linux kernel has already
> > implemented some basic mechanisms so the whole work would be better by
> > leveraging the kernel's  existing code or functions.
> >
> No, not at all. As you figured (or at least had intuition about)
> yourself, Xen does run below Linux. Actually, it runs below any guest,
> including Dom0, which is a special guest but still a guest, and can even
> not be a Linux guest.
>
> So there's no code sharing, or no mechanism to invoke Linux code and
> have it affect Xen's scheduling or memory management (and never will
> be :-P).
>
>
Thank you Dario and Frediano.

Not being able to share the existing kernel mechanisms is kind of
frustrating... But just as you said, it's the point of virtualization. And
now I have a better understanding of why you said it would be tough ;)
(I'm starting to envy the KVM guys, LOL)


> > More specifically, I want to confirm that could we use the code or
> > functions in linux kernel to assist the hypervisor?
> >
> No, it's the other way around.
>
> > My guess is not because in my understanding xen hypervisor lies under
> > the linux kernel, i.e. dom0's kernel.
> >
> Exactly.
>
> > Given that Dom0 is a special domain, if I want to manage & move all
> > the machine memory pages, can the kernel be helpful?
> >
> The Dom0 kernel doesn't know anything about the memory of other guest.
> It basically doesn't even know that they exist... That's the point of
> virtualization, isn't it?
> Also Linux's and Xen's scheduling and memory management are so different
> (and that's by design) that, even for similar (or the same) feature, the
> implementation will be different anyway, so sharing the code won't help
> at all.
>
> > Hmm, "change the memory from dom0" means we can control the VM's
> > memory by using a config file or xl command right?
> >
> "changing the memory from dom0" means something like "the dom0 can ask
> Xen, via toolstack, to do something to the memory of other guests",
> i.e., it has enough privileges to do that, but that's it.
>
> > What if we goes to the code level? I'm really confused now. Is Xen's
> > memory management complemented all by itsself or does it also receive
> > help from the kernel?
> >
> As said, Xen doesn't even know what kernel is actually running in the
> various guest, including dom0, and things should remains that way,
> especially for core things like scheduling and memory management.
>
> So, in summary, what you're after should be achieved entirely inside
> Xen. It is possible than, in the PV guest case, you'd need some help
> from the guest. However, that would be in the form of "Xen
> asking/forcing the guest to do something on the *guest* *itself*", not
> in the form of "Xen asking dom0 to do something on Xen's own
> memory/scheduling or (directly) on other guests' memory".
>
> Hope this helps clearing things out for you. :-)
>


At this point I still have other plans. But 'asking the guest to do
something on the guest itself' sounds like exposing the virtual NUMA
topology to the guest (vNUMA). I wrote this email because the hypervisor
is responsible for allocating machine memory for each guest. Then, in the
PV case, there are the P2M and M2P tables to help with address translation
(and shadow page tables in HVM guests). So what first came to my mind was
that the hypervisor should move the pages for guests and then the P2M
mappings should be updated somehow. However, inside a guest domain, the OS
can only manage guest physical memory, which I don't think it can move to
another node by itself.
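
To make the P2M/M2P point concrete, here is a toy model in Python. None of
it is Xen code: `ToyDomain`, `move_page` and the frame numbers are invented
purely for illustration, and the "node" placement of frames is an assumed
layout. It only shows the bookkeeping I have in mind, i.e. copy the page
to a new machine frame and update both maps so they stay consistent:

```python
# Toy model of PV address translation, NOT Xen code. All names are made up.
# p2m: guest pseudo-physical frame number (PFN) -> machine frame number (MFN)
# m2p: MFN -> PFN (the reverse map)


class ToyDomain:
    def __init__(self, mfns):
        # The guest sees PFNs 0..n-1, backed by whatever MFNs it was given.
        self.p2m = {pfn: mfn for pfn, mfn in enumerate(mfns)}
        self.m2p = {mfn: pfn for pfn, mfn in self.p2m.items()}


def move_page(dom, machine_memory, pfn, new_mfn):
    """Re-back one guest page with a different machine frame."""
    old_mfn = dom.p2m[pfn]
    # Copy the contents, then update both maps so they stay consistent.
    machine_memory[new_mfn] = machine_memory[old_mfn]
    dom.p2m[pfn] = new_mfn
    del dom.m2p[old_mfn]
    dom.m2p[new_mfn] = pfn
    return old_mfn  # the caller can now free this frame


# Assumed layout: machine frames 100-101 sit on node 0, frame 200 on node 1.
memory = {100: b"page A", 101: b"page B", 200: b""}
dom = ToyDomain([100, 101])
freed = move_page(dom, memory, pfn=1, new_mfn=200)
```

The guest still sees PFN 1 afterwards; only the machine frame behind it
has changed. In a real PV guest the page tables themselves hold machine
frame numbers, so they would need fixing up as well, which this sketch
ignores.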


>
> Regards,
> Dario
>
> --
> <<This happens because I choose it to happen!>> (Raistlin Majere)
> -----------------------------------------------------------------
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
>


Maybe I misunderstood your words... 'asking the guest to do something on
the guest itself' confuses me a bit. Could you explain your thinking in
more detail, if it's convenient for you?

Thank you,
Kenneth


* Re: Could Xen hyperviosr be able to invoke Linux systemcalls?
  2015-08-18  1:18       ` Kun Cheng
@ 2015-08-18  9:16         ` Dario Faggioli
  2015-08-19  0:47           ` Kun Cheng
  0 siblings, 1 reply; 8+ messages in thread
From: Dario Faggioli @ 2015-08-18  9:16 UTC
  To: Kun Cheng; +Cc: Frediano Ziglio, xen-devel


On Tue, 2015-08-18 at 01:18 +0000, Kun Cheng wrote:

> On Tue, Aug 18, 2015 at 3:25 AM Dario Faggioli
> <dario.faggioli@citrix.com> wrote:
> 
>         On Mon, 2015-08-17 at 00:55 +0000, Kun Cheng wrote:
>         >
>         >
>         > On Mon, Aug 17, 2015 at 12:16 AM Frediano Ziglio
>         <freddy77@gmail.com>
>         >
>         > What I'm planing is adding page migration support for NUMA
>         aware
>         > scheduling. In such a case the most time I'll be dealing
>         with Xen's
>         > memory management & scheduling part to make relevant pages
>         migrate to
>         > another node with their VCPU. However, Linux kernel has
>         already
>         > implemented some basic mechanisms so the whole work would be
>         better by
>         > leveraging the kernel's  existing code or functions.
>         >
>         No, not at all. As you figured (or at least had intuition
>         about)
>         yourself, Xen does run below Linux. Actually, it runs below
>         any guest,
>         including Dom0, which is a special guest but still a guest,
>         and can even
>         not be a Linux guest.
>         
>         So there's no code sharing, or no mechanism to invoke Linux
>         code and
>         have it affect Xen's scheduling or memory management (and
>         never will
>         be :-P).

> 
> 
> Not being able to share the existing kernel mechanism is some kind of
> frustrating......
>
You think? Well, I guess I see what you mean. However, being able to do
custom things, specifically tailored to the kind of workload that Xen
focuses on (i.e., virtualization, of course), instead of having to rely
on tweaking a general purpose operating system, trying to bend it as
much as possible to some specific needs (i.e., basically, what KVM is
doing), is one of Xen's strengths.

Then, whether or not we always manage to take proper advantage of that
is another story.

> But just as you said it's the point of virtualization. And now I gain
> a better understanding why you said it would be tough ;)   (I start to
> envy KVM guys, LOL)
>  
Yeah, sometimes it happens that they get something sort of "for free",
but I really believe what I just said above, so no envy. :-)

>         So, in summary, what you're after should be achieved entirely
>         inside
>         Xen. It is possible than, in the PV guest case, you'd need
>         some help
>         from the guest. However, that would be in the form of "Xen
>         asking/forcing the guest to do something on the *guest*
>         *itself*", not
>         in the form of "Xen asking dom0 to do something on Xen's own
>         memory/scheduling or (directly) on other guests' memory".
>
>         Hope this helps clearing things out for you. :-)
> 

> At this point I still have other plans.  But 'asking the guest to do
> something on the guest itself' sounds like exposing the virtual NUMA
> topology to the guest (vNUMA). 
>
How so? We already have it, although it's not yet fully usable (right
for PV guests) due to other issues. But I don't see what that has to do
with what we're talking about.

In the PV case, what virtual NUMA topology takes is:
 - the tools and the hypervisor being able to allocate memory for the
   guest in a specific way (matching the topology we want the guest to
   have)
 - the hypervisor to store the virtual topology somewhere, in order to
   be able to provide it to the guest
 - the guest to ask about its own NUMA topology via a PV path
   (hypercalls), rather than via ACPI (which basically doesn't exist in
   PV)
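
A rough sketch of those three ingredients, in Python purely for
illustration. This is a made-up data layout, not Xen's real vNUMA
structures: `VNode`, `ToyHypervisor`, `set_topology` and `get_topology`
are all hypothetical names, and the method call merely stands in for the
hypercall a PV guest would use.

```python
# Illustrative only: a made-up data layout, not Xen's real vNUMA structures.
from dataclasses import dataclass


@dataclass
class VNode:
    vnode: int   # node ID as the guest sees it
    pnode: int   # host node backing it (used for allocation only)
    mem_mb: int  # memory the guest sees on this virtual node
    vcpus: list  # vCPUs the guest sees on this virtual node


class ToyHypervisor:
    def __init__(self):
        self.topologies = {}  # domid -> list of VNode

    def set_topology(self, domid, vnodes):
        # First two bullets: the tools pick the layout, the hypervisor
        # stores it.
        self.topologies[domid] = list(vnodes)

    def get_topology(self, domid):
        # Third bullet: stands in for the PV query path (a hypercall in
        # reality). Only the virtual view is returned, not host placement.
        return [(n.vnode, n.mem_mb, tuple(n.vcpus))
                for n in self.topologies[domid]]


xen = ToyHypervisor()
xen.set_topology(1, [VNode(0, 2, 1024, [0, 1]), VNode(1, 3, 1024, [2, 3])])
guest_view = xen.get_topology(1)
```

Note that nothing in here moves any memory around: the topology is fixed
when the guest is built and only read back afterwards.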

Again, what does this have to do with memory migration?

> I wrote this email because hypervisor is responsible to allocate
> machine memory for each guest. Then, in a PV case there are P2M and
> M2P to help address translation (and shadow page tables in HVMs). So
> what first came to my mind was hypervisor should move the pages for
> guests and then P2M things should better be renewed somehow. However
> inside a guest domain, its OS can only manage the guest physical
> memory, which I don't think is able to be moved to another node by
> itself.
>
A PV guest knows that it is a PV guest (that's the point of
paravirtualization), and in fact it performs hypercalls and everything.
However, that knowledge does not go as far as being aware of the host NUMA
layout, or being able to move its own memory to a different NUMA node in
the host.

What I recommend is that you have a look at the migration code. It's kind
of a beast, I know, but it's been rewritten almost from scratch very
recently, and I'm sure it's now a lot better and easier to understand
than before.

The reason I'm suggesting this is that, particularly for PV, moving the
guest's RAM from under its feet is going to be possible only with
something similar to performing a local migration. The main difference
is that we may want to be able to do it more 'lively' (i.e., without
stopping the world, even for a small amount of time, as happens in
migration), as well as that we may want to be able to move specific
chunks of memory, rather than all of it.

These are not small differences, and the migration code probably wouldn't
be reusable as it is, but it's the closest thing to what you're
saying you're trying to achieve that I can imagine.
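
To give an idea of the general shape, here is a toy pre-copy loop in
Python, loosely modelled on how live migration works in general. It is
not taken from the Xen migration code: `precopy_chunk`, its callbacks and
the `max_rounds`/`threshold` knobs are all hypothetical. It copies a chunk
of pages while the guest runs, re-copies whatever got dirtied, and only
pauses for the last few pages:

```python
# Toy pre-copy loop; not Xen code, every name here is hypothetical.

def precopy_chunk(pages, copy_page, take_dirty, pause, resume,
                  max_rounds=5, threshold=2):
    """Copy `pages` while the guest runs, re-copying whatever it dirtied."""
    page_set = set(pages)
    to_copy = set(page_set)
    rounds = 0
    while len(to_copy) > threshold and rounds < max_rounds:
        for pfn in sorted(to_copy):
            copy_page(pfn)
        # Pages written during this round must be copied again.
        to_copy = take_dirty() & page_set
        rounds += 1
    # The final pass needs the guest to hold still, but only briefly.
    pause()
    for pfn in sorted(to_copy | (take_dirty() & page_set)):
        copy_page(pfn)
    resume()
    return rounds


# A fake guest that dirties a few pages after each round.
copied = []
events = []
dirty_batches = [{3, 4, 9}, {4}, set()]


def take_dirty():
    return dirty_batches.pop(0) if dirty_batches else set()


rounds = precopy_chunk(
    pages=range(8),
    copy_page=copied.append,
    take_dirty=take_dirty,
    pause=lambda: events.append("pause"),
    resume=lambda: events.append("resume"),
)
```

The short pause at the end is exactly the part one would like to get rid
of for NUMA page migration, and page 9 is ignored because it is outside
the chunk being moved.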

> 
> Maybe I misunderstood you words... 'asking the guest to do something
> on the guest itself' confuses me a bit, could you explain more details
> of your thought if it's convenient for you?
>
Yeah, my bad. Perhaps, for now, it's better if you forget about this.
Very quickly, what I was hinting at is some mechanism that we could
come up with (but that will be one of the last steps) for putting the PV
guest into some kind of quiescent state, i.e., a state where it does
not change its page tables --as we're fiddling with them-- without being
completely suspended. If we ever get there, I think that this could
only be done with some cooperation from the guest, e.g., having it go
through a protocol that we'd need to define, upon request from the
hypervisor. But that's just speculation at this time, and we really
shouldn't think about it until we get there... It's not like there aren't
super difficult problems to solve already! :-P

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


* Re: Could Xen hyperviosr be able to invoke Linux systemcalls?
  2015-08-18  9:16         ` Dario Faggioli
@ 2015-08-19  0:47           ` Kun Cheng
  2015-08-19 18:27             ` Dario Faggioli
  0 siblings, 1 reply; 8+ messages in thread
From: Kun Cheng @ 2015-08-19  0:47 UTC
  To: Dario Faggioli; +Cc: Frediano Ziglio, xen-devel


On Tue, Aug 18, 2015 at 5:16 PM Dario Faggioli <dario.faggioli@citrix.com>
wrote:

> On Tue, 2015-08-18 at 01:18 +0000, Kun Cheng wrote:
>
> > On Tue, Aug 18, 2015 at 3:25 AM Dario Faggioli
> > <dario.faggioli@citrix.com> wrote:
> >
> >         On Mon, 2015-08-17 at 00:55 +0000, Kun Cheng wrote:
> >         >
> >         >
> >         > On Mon, Aug 17, 2015 at 12:16 AM Frediano Ziglio
> >         <freddy77@gmail.com>
> >         >
> >         > What I'm planing is adding page migration support for NUMA
> >         aware
> >         > scheduling. In such a case the most time I'll be dealing
> >         with Xen's
> >         > memory management & scheduling part to make relevant pages
> >         migrate to
> >         > another node with their VCPU. However, Linux kernel has
> >         already
> >         > implemented some basic mechanisms so the whole work would be
> >         better by
> >         > leveraging the kernel's  existing code or functions.
> >         >
> >         No, not at all. As you figured (or at least had intuition
> >         about)
> >         yourself, Xen does run below Linux. Actually, it runs below
> >         any guest,
> >         including Dom0, which is a special guest but still a guest,
> >         and can even
> >         not be a Linux guest.
> >
> >         So there's no code sharing, or no mechanism to invoke Linux
> >         code and
> >         have it affect Xen's scheduling or memory management (and
> >         never will
> >         be :-P).
>
> >
> >
> > Not being able to share the existing kernel mechanism is some kind of
> > frustrating......
> >
> You think? Well, I guess I see what you mean. However, being able to do
> custom things, specifically tailored to the kind of workload that Xen
> focuses on (i.e., virtualization, of course), instead of having to rely
> on tweaking a general purpose operating system, trying to bending it as
> much as possible to some specific needs (i.e., basically, what KVM is
> doing), is one of Xen's strengths.
>

Agreed. "Decoupling the hypervisor from a specific OS kernel", that's what
Xen does.


>
> Then, whether or not we always manage to take proper advantage of that
> it's another pair of hands.
>
> > But just as you said it's the point of virtualization. And now I gain
> > a better understanding why you said it would be tough ;)   (I start to
> > envy KVM guys, LOL)
> >
> Yeah, sometimes it happens that they get something sort of "for free",
> but I really believe what I just said above, so no anvy. :-)
>
> >         So, in summary, what you're after should be achieved entirely
> >         inside
> >         Xen. It is possible than, in the PV guest case, you'd need
> >         some help
> >         from the guest. However, that would be in the form of "Xen
> >         asking/forcing the guest to do something on the *guest*
> >         *itself*", not
> >         in the form of "Xen asking dom0 to do something on Xen's own
> >         memory/scheduling or (directly) on other guests' memory".
> >
> >         Hope this helps clearing things out for you. :-)
> >
>
> > At this point I still have other plans.  But 'asking the guest to do
> > something on the guest itself' sounds like exposing the virtual NUMA
> > topology to the guest (vNUMA).
> >
> How so? We already have it, although it's not yet fully usable (right
> for PV guests) due to other issues. But I don't see what that has to do
> with what we're talking about.
>
> In the PV case, virtual NUMA what virtual NUMA topology takes is:
>  - the tools and the hypervisor being able to allocate memory for the
>    guest in a specific way (matching the topology we want the guest to
>    have)
>  - the hypervisor to store the virtual topology somewhere, in order to
>    be able to provide it to the guest
>  - the guest to ask about its own NUMA topology via a PV path
>    (hypercalls), rather than via ACPI (which basically doesn't exist in
>    PV)
>
> Again, what does this have to do with memory migration?
>

No, vNUMA is not involved; I agree with you. The truth is that vNUMA was
the first thing that came to my mind when you mentioned "to do something
on the guest itself". That's all, never mind.


>
> > I wrote this email because hypervisor is responsible to allocate
> > machine memory for each guest. Then, in a PV case there are P2M and
> > M2P to help address translation (and shadow page tables in HVMs). So
> > what first came to my mind was hypervisor should move the pages for
> > guests and then P2M things should better be renewed somehow. However
> > inside a guest domain, its OS can only manage the guest physical
> > memory, which I don't think is able to be moved to another node by
> > itself.
> >
> A PV guests know about the fact that it is a PV guest (that's the point
> of paravirtualization), and in fact, it performs hypercalls ad
> everything. However, such a knowledge does not go as far as being aware
> of the host NUMA layout, and being able to move its own memory to a
> different NUMA node in the host.
>
> What I recommend you, is to have a look at the migration code. It's kind
> of a beast, I know, but it's been rewrote almost from scratch just very
> recently, and I'm sure now it's a lot better and easier to understand
> than before.
>
> Reason I'm suggesting this is that, particularly for PV, moving the
> guest's RAM under its own feet is going to be possible oly with
> something similar to performing a local migration. The main difference
> is that we may want to be able to do it more 'lively' (i.e., without
> stopping the world, even for a small amount of time, as it happens in
> migration), as well as that we may want to be able to move specific
> chunks of memory, rather than all of it.
>
> These are not small differences, and the migration code wouldn't
> probably be reusable as it is, but it's the closest thing to what you're
> saying you're trying to achieve that I can imagine.
>

Live migration between nodes is perhaps the easiest way. But it also has
drawbacks, mainly because such migration is coarse-grained. Suppose a VM
has multiple VCPUs and only some of them are moved to another node (or to
several other nodes): it will then be tough to decide which node should be
the target for the live migration. Still, I think live migration is the
best 'first step', with fine-grained memory migration as the final goal.
By the way, I am currently digging into the migration code. ;)


>
> >
> > Maybe I misunderstood your words... 'asking the guest to do something
> > on the guest itself' confuses me a bit. Could you explain your thought
> > in more detail, if it's convenient for you?
> >
> Yeah, my bad. Perhaps, for now, it's better if you forget about this.
> Very quickly, what I was hinting at is some mechanism that we could
> come up with (but that will be one of the last steps) for putting the PV
> guest in some kind of quiescent state, i.e., a state where it does
> not change its page tables --as we're fiddling with them-- without being
> completely suspended. If we ever get there, I think that this could
> only be done with some cooperation from the guest, e.g., having it go
> through a protocol that we'd need to define, upon request from the
> hypervisor.


I see. So the basic idea is to keep a VM alive while it does not access the
pages during the migration. Yes, that will help accelerate the whole
process, e.g. by not spending too much time waiting for the lock.


> But that's just speculation at this time, and we really
> shouldn't think about it until we get there... It's not like there aren't
> super difficult problems to solve already! :-P
>

Yeah, thinking too much will only aggravate the difficulty.

>
> Regards,
> Dario
>
> --
> <<This happens because I choose it to happen!>> (Raistlin Majere)
> -----------------------------------------------------------------
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
>

Thank you so much Dario!

Best regards,
Kenneth


* Re: Could Xen hyperviosr be able to invoke Linux systemcalls?
  2015-08-19  0:47           ` Kun Cheng
@ 2015-08-19 18:27             ` Dario Faggioli
  0 siblings, 0 replies; 8+ messages in thread
From: Dario Faggioli @ 2015-08-19 18:27 UTC (permalink / raw)
  To: Kun Cheng; +Cc: Frediano Ziglio, xen-devel


On Wed, 2015-08-19 at 00:47 +0000, Kun Cheng wrote:


> Live migration between nodes is perhaps the easiest way. But it also
> has drawbacks, mainly because that migration is coarse-grained.
>
What I'm saying is that you can, as a first step, look at the migration
code and implement (let's call it that) page moving in a similar way. That
does not mean that you'd have to *always* move all the pages, like a
migration does; you may well move only one page, or a bunch of them.

It's the implementation mechanism that you should take inspiration
from.
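
To make "moving only one page" concrete, here is a toy Python model of the
bookkeeping involved. It is only a sketch and not Xen code: ToyHost,
alloc_on_node and move_page are hypothetical names, the P2M and M2P are
plain dictionaries, and it ignores everything hard (keeping the guest from
touching the page while it is copied, fixing up the guest's page tables,
locking).

```python
from dataclasses import dataclass, field


@dataclass
class ToyHost:
    """Toy host memory: machine frames (MFNs), each owned by a NUMA node."""
    node_of_mfn: dict                              # mfn -> NUMA node id
    contents: dict = field(default_factory=dict)   # mfn -> page data
    free: set = field(default_factory=set)         # unallocated mfns

    def alloc_on_node(self, node):
        """Take the lowest-numbered free frame that lives on `node`."""
        for mfn in sorted(self.free):
            if self.node_of_mfn[mfn] == node:
                self.free.remove(mfn)
                return mfn
        raise MemoryError(f"no free frame on node {node}")


def move_page(host, p2m, m2p, pfn, target_node):
    """Move the frame backing guest page `pfn` to `target_node`.

    Returns the MFN backing `pfn` afterwards.
    """
    old_mfn = p2m[pfn]
    if host.node_of_mfn[old_mfn] == target_node:
        return old_mfn                     # already on the right node

    new_mfn = host.alloc_on_node(target_node)
    # Copy the data, then switch both translation tables to the new frame.
    host.contents[new_mfn] = host.contents.pop(old_mfn, b"")
    p2m[pfn] = new_mfn
    del m2p[old_mfn]
    m2p[new_mfn] = pfn
    host.free.add(old_mfn)                 # old frame can be reused
    return new_mfn


if __name__ == "__main__":
    # Frames 0-1 are on node 0, frames 2-3 on node 1; the guest owns 0 and 1.
    host = ToyHost(node_of_mfn={0: 0, 1: 0, 2: 1, 3: 1},
                   contents={0: b"a", 1: b"b"}, free={2, 3})
    p2m = {0: 0, 1: 1}
    m2p = {0: 0, 1: 1}
    move_page(host, p2m, m2p, pfn=1, target_node=1)
    print(p2m, m2p)
```

Only guest page 1 is moved here; guest page 0 stays where it was, which is
the difference with respect to a full migration.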

>  Suppose a VM has multiple VCPUs and only some of them are moved to
> one or more other nodes: it will then be tough to decide which node
> should be the target of the live migration.
>
A policy needs to be defined, sure. But we're not talking about that,
we're talking about, after you've decided what to do, how to do that.

>  However, I also think live migration is the best 'first step', with
> fine-grained memory migration as the final goal. By the way, I am
> currently digging into the migration code. ;)
>
Great! :-)

Regards,
Dario

-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


Thread overview: 8+ messages
2015-08-15  1:31 Could Xen hyperviosr be able to invoke Linux systemcalls? Kun Cheng
2015-08-16 16:16 ` Frediano Ziglio
2015-08-17  0:55   ` Kun Cheng
2015-08-17 19:25     ` Dario Faggioli
2015-08-18  1:18       ` Kun Cheng
2015-08-18  9:16         ` Dario Faggioli
2015-08-19  0:47           ` Kun Cheng
2015-08-19 18:27             ` Dario Faggioli
