* How can Xen trigger a context switch in an HVM guest domain?
From: XiaYubin @ 2009-10-31 11:02 UTC
  To: xen-devel; +Cc: George Dunlap

Hi, all,

As I'm doing research on cooperative scheduling between Xen and the
guest domain, I want to know in how many ways Xen can trigger a context
switch inside an HVM guest domain (which runs Windows in my case). Do
I have to write a driver (like the balloon driver)? Is a user process
enough? Or is there an even simpler way?

All your suggestions are appreciated. Thanks! :)

--
Yubin


* Re: How can Xen trigger a context switch in an HVM guest domain?
From: George Dunlap @ 2009-10-31 15:20 UTC
  To: XiaYubin; +Cc: xen-devel

Context switching is a choice the guest OS has to make, and how that's
done will differ based on the operating system.  I think if you're
thinking about modifying the guest scheduler, you're probably better
off starting with Linux.  Even if there's a way to convince Windows to
call schedule() to pick a new process, I'm not sure you'll be able to
tell it *which* process to choose.

As for the mechanism on Xen's side, it would be easy enough to allocate
a "reschedule" event channel for the guest, so that whenever you
want to trigger a guest reschedule, you just raise the event channel.
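
Very roughly, something like this (pseudo-C sketch; the helper names
are invented to show the shape of it, not real Xen API):

  /* Xen side: allocate an unbound event channel for the guest. */
  static evtchn_port_t resched_port;

  void setup_resched_channel(struct domain *d)
  {
      resched_port = alloc_unbound_evtchn_for_guest(d); /* hypothetical */
  }

  /* Whenever you decide the guest should reschedule: */
  void request_guest_reschedule(struct domain *d)
  {
      /* Raising the channel injects a virtual interrupt; the guest's
       * handler would simply call its own schedule(). */
      raise_evtchn_to_guest(d, resched_port);           /* hypothetical */
  }

The guest side would bind the port and point the handler at its
scheduler, which does mean a small PV component in the guest.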

 -George


* Re: How can Xen trigger a context switch in an HVM guest domain?
From: XiaYubin @ 2009-11-01  5:54 UTC
  To: George Dunlap; +Cc: xen-devel

Hi, George,

Thank you for your reply. Actually, I'm looking for a generic
mechanism for cooperative scheduling. Independence from the guest OS
would make such a mechanism more convincing and practical, just as it
does for the balloon driver.

Maybe you are wondering why I asked such a weird question, so let me
describe it in more detail. My current work is based on "Task-aware
VM scheduling", which was published at VEE'09. By monitoring CR3
changes at the VMM level, Xen can gather information about tasks' CPU
consumption and identify CPU hogs and I/O tasks. The task-aware
mechanism therefore offers a more fine-grained scheduler than the
original VCPU-level scheduler, since a single VCPU may run a mix of
CPU hogs and I/O tasks.

Imagine there are n VMs. One of them, named mix-VM, runs two tasks:
cpuhog and iotask (network). The other VMs, named CPU-VMs, run just
cpuhog. All VMs use PV drivers (the GPLPV drivers for Windows).

Here's what is supposed to happen when iotask receives a network
packet: the NIC raises an IRQ, which passes to Xen, and domain-0 then
sends an inter-domain event to the mix-VM, which is likely to be on
the run-queue. Xen schedules it to run immediately and sets its state
to preempting-state. Right after that, the mix-VM *should* schedule
iotask to process the incoming packet, and then schedule cpuhog once
processing is done. When CR3 changes to cpuhog's, Xen knows that the
mix-VM has finished its I/O processing (here we assume that cpuhog's
priority is lower than iotask's, as it usually is in most OSes), and
schedules the mix-VM out, ending its preempting-state. The mix-VM can
therefore preempt other VMs to process I/O ASAP, while keeping the
preemption time as short as possible to preserve fairness. The point
is: cpuhog should not run in preempting-state.
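
Roughly, the check at the CR3-write intercept looks like this
(pseudo-C sketch; the fields and helpers are names I made up, not
real Xen code):

  void on_guest_cr3_write(struct vcpu *v, unsigned long new_cr3)
  {
      struct domain *d = v->domain;

      if ( d->preempting && new_cr3 == d->cpuhog_cr3 )
      {
          /* I/O processing is done and the guest is about to run
           * cpuhog: end the preempting-state and deschedule the VM. */
          d->preempting = 0;
          raise_softirq(SCHEDULE_SOFTIRQ);
      }
  }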

However, a problem arises when the mix-VM sends packets. When iotask
sends a large amount of data (over TCP), it blocks and waits to be
woken up after the guest kernel has sent all the data, which may be
split into thousands of TCP packets. The mix-VM receives an ACK
packet every time it sends a packet, which makes it enter
preempting-state. Note that at this moment, the mix-VM's CR3 is
cpuhog's (as the only running process). After the guest kernel
processes the ACK packet and sends the next packet, it switches back
to user mode, which means cpuhog gets to run in preempting-state. The
point is: since there is no CR3 change, Xen gets no chance to run.

One way is to add a hook at every user/kernel mode switch, so that
Xen can catch the moment when cpuhog gets to run. However, that costs
too much. Another way is to force the VM to reschedule when it enters
preempting-state: it will then trap to Xen when CR3 changes, and Xen
can end the preempting-state when the guest schedules cpuhog to run.
That's why I want to trigger a guest context switch from Xen. I don't
really care *which* process it switches to; I just want to give Xen a
chance to run. The point is: is there a better/simpler way to solve
this problem?

I hope I described the problem clearly. Could you please give more
details on your "reschedule event channel" idea? Thanks!

--
Yubin


* Re: How can Xen trigger a context switch in an HVM guest domain?
From: James (song wei) @ 2009-11-02  9:16 UTC
  To: xen-devel


Could you add a timer while watching for CR3 changes, to prevent one
task from using up too much CPU time without a CR3 change?

- James (Song Wei)

* Re: How can Xen trigger a context switch in an HVM guest domain?
From: George Dunlap @ 2009-11-02 16:05 UTC
  To: XiaYubin; +Cc: xen-devel

OK, so you want to allow a VM to run so that it can do packet
processing in the kernel, but once it's done in the kernel you want to
preempt the VM again.

An idea I was going to try out is that if a VM receives an interrupt
(possibly only certain interrupts, like network), let it run for a
very short amount of time (say, 1ms or 500us).  That should be enough
for it to do its basic packet processing (or audio processing, video
processing, whatever).  True, you're going to run the "cpu hog" during
that time, but that will be debited against the time it gets to run
later.  (I haven't tested this idea yet. It may work better with some
credit algorithms than others.)
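
Sketched out, it would look something like this (pseudo-C against a
credit-like scheduler; the fields and helpers are invented for
illustration, I haven't written this):

  #define BOOST_NS 500000ULL                  /* 500us boost slice */

  /* On delivering a (network) interrupt to a vcpu: */
  void boost_for_irq(struct vcpu *v)
  {
      v->boost_start = NOW();                 /* ns, as in Xen */
      set_boost_timer(v, BOOST_NS);           /* hypothetical */
      vcpu_wake(v);                           /* run it right away */
  }

  /* When the boost slice expires: */
  void boost_timer_fn(struct vcpu *v)
  {
      /* Debit the boosted time against normal credits, so a cpu hog
       * that ran during the boost pays for it later. */
      v->credit -= ns_to_credit(NOW() - v->boost_start); /* hypothetical */
      raise_softirq(SCHEDULE_SOFTIRQ);        /* reschedule this cpu */
  }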

The problem with inducing a guest to call schedule():
* It may not have any other runnable processes, or it may choose the
same process to run again; so it may not switch the CR3 anyway.
* The only reliable way to do it without some kind of
paravirtualization (even if only a kernel driver) would be to give it
a timer interrupt, which may mess up other things on the system, such
as the system time.

If you're really keen to preempt on return to userspace, you could try
something like the following.  Before delivering the interrupt, note
the EIP the guest is at.  If it's in user space, set a hardware
breakpoint at that address.  Then deliver the interrupt.  If the guest
calls schedule(), you can catch the CR3 switch; if it returns to the
same process, it will hit the breakpoint.
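
A rough sketch of that, with the VMCS accesses hidden behind invented
helper names:

  /* Before injecting the interrupt: */
  void arm_return_breakpoint(struct vcpu *v)
  {
      unsigned long rip = guest_rip(v);       /* hypothetical accessor */

      if ( guest_cpl(v) == 3 )                /* guest in user space */
      {
          set_guest_dr0(v, rip);              /* breakpoint address */
          /* DR7 = 0x401: bit 0 (L0) locally enables breakpoint 0,
           * R/W0 = LEN0 = 0 means break on instruction execution,
           * and bit 10 is a reserved always-one bit. */
          set_guest_dr7(v, 0x401);
      }
      /* Now inject the interrupt.  A #DB vmexit at this RIP, with no
       * CR3 change in between, means the same process came back. */
  }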

Two possible problems:
* For reasons of ancient history, the iret instruction may set the RF
flag in the EFLAGS register, which will cause the breakpoint not to
fire after the guest iret.  You may need to decode the instruction and
set the breakpoint at the instruction after, or something like that.
* I believe Windows doesn't do a CR3 switch when it does a *thread*
switch (threads of the same process share an address space).  If so,
on a thread switch you'll get neither the CR3 switch nor the
breakpoint (since the other thread is probably running somewhere
else).

Peace,
 -George


* Re: How can Xen trigger a context switch in an HVM guest domain?
From: XiaYubin @ 2009-11-03  1:43 UTC
  To: George Dunlap, James (song wei); +Cc: xen-devel

James and George, thank you both! The breakpoint approach is
interesting; I hadn't even thought of it :)

OK, I'm going to use a simpler way to verify my idea first. Before a
preempting-state VM runs, I will set a timer so that Xen gets to run
every 100us (maybe longer for the first iteration). The timer handler
will check whether the preempting VM is in kernel mode or user mode.
If it is in user mode with cpuhog's CR3, it will be scheduled out.
Meanwhile, if the iteration count goes beyond some threshold (say 5),
the VM will also be scheduled out. This seems much simpler than the
breakpoint approach, and more accurate than the 1ms-timer one. It may
bring some overhead, but preemption is not supposed to occur
frequently, and fairness matters more.
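
In pseudo-C, with made-up names, the handler would be something like:

  #define POLL_NS   100000ULL                 /* 100us period */
  #define MAX_POLLS 5

  /* Armed just before a preempting-state VM starts to run. */
  void preempt_poll_fn(struct vcpu *v)
  {
      int in_user = (guest_cpl(v) == 3);      /* hypothetical accessors */
      int in_hog  = (guest_cr3(v) == v->domain->cpuhog_cr3);

      if ( (in_user && in_hog) || ++v->poll_count > MAX_POLLS )
      {
          v->domain->preempting = 0;          /* end preempting-state */
          raise_softirq(SCHEDULE_SOFTIRQ);    /* schedule the VM out */
      }
      else
          rearm_poll_timer(v, POLL_NS);       /* hypothetical */
  }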

The thread problem also exists on Linux. Currently I have no good way
to identify different threads from the hypervisor's perspective. I
have a dream that one day those OS guys will export this information
to the VMM, a dream that one day our children will live in a world
where virtualization rules. I have a dream today :)

Thanks!

--
Yubin


* Re: How can Xen trigger a context switch in an HVM guest domain?
From: George Dunlap @ 2009-11-03 11:51 UTC
  To: XiaYubin; +Cc: xen-devel, James (song wei)

When I first started doing performance analysis, the sedf scheduler
was using a 500us timeslice, which (by my estimate) caused the
first-generation VMX-capable processors to spend at least 5% of their
time handling vmenters and vmexits.  Performance has obviously
improved since then, but they're still not free. :-)
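
(Back-of-envelope: 5% of a 500us timeslice is 25us, so that estimate
assumes roughly 25us of combined vmexit-handle-vmenter work per
slice on those early parts.)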

 -George

