* A question on the credit scheduler
@ 2011-12-16  7:55 zhikai
  2011-12-16 11:04 ` George Dunlap
  2011-12-16 14:44 ` gavin
  0 siblings, 2 replies; 9+ messages in thread
From: zhikai @ 2011-12-16  7:55 UTC (permalink / raw)
  To: xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 403 bytes --]

Hi All,


In the credit scheduler, the scheduling decision function csched_schedule() is called from the schedule function in scheduler.c, like this:
next_slice = sched->do_schedule(sched, now, tasklet_work_scheduled);


But how often is csched_schedule() called and run? Does this frequency have something to do with the credit scheduler's timeslice of 30ms?


Best Regards,
Gavin


[-- Attachment #1.2: Type: text/html, Size: 692 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: A question on the credit scheduler
  2011-12-16  7:55 A question on the credit scheduler zhikai
@ 2011-12-16 11:04 ` George Dunlap
  2011-12-16 14:44 ` gavin
  1 sibling, 0 replies; 9+ messages in thread
From: George Dunlap @ 2011-12-16 11:04 UTC (permalink / raw)
  To: zhikai; +Cc: xen-devel

2011/12/16 zhikai <gbtux@126.com>:
> Hi All,
>
> In the credit scheduler, the scheduling decision function csched_schedule()
> is called in the schedule function in scheduler.c, such as the following.
> next_slice = sched->do_schedule(sched, now, tasklet_work_scheduled);
>
> But, how often the csched_schedule() is called and to run? Does this
> frequency have something to do with the slice of credit scheduler that is
> 30ms?

The scheduler runs whenever the SCHEDULE_SOFTIRQ is raised.  If you
grep through the source code for that string, you can find all the
places where it's raised.

Some examples include:
* When the 30ms timeslice is finished
* When a sleeping vcpu of higher priority than what's currently running wakes up
* When a vcpu blocks
* When a vcpu is migrated from one cpu to another

30ms is actually a pretty long time; in typical workloads, vcpus block
or are preempted by other waking vcpus without using up their full
timeslice.

 -George

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: A question on the credit scheduler
  2011-12-16  7:55 A question on the credit scheduler zhikai
  2011-12-16 11:04 ` George Dunlap
@ 2011-12-16 14:44 ` gavin
  2011-12-16 15:58   ` George Dunlap
  2011-12-17  9:16   ` gavin
  1 sibling, 2 replies; 9+ messages in thread
From: gavin @ 2011-12-16 14:44 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 1546 bytes --]

At 2011-12-16 19:04:19,"George Dunlap" <George.Dunlap@eu.citrix.com> wrote:
>2011/12/16 zhikai <gbtux@126.com>:
>> Hi All,
>>
>> In the credit scheduler, the scheduling decision function csched_schedule()
>> is called in the schedule function in scheduler.c, such as the following.
>> next_slice = sched->do_schedule(sched, now, tasklet_work_scheduled);
>>
>> But, how often the csched_schedule() is called and to run? Does this
>> frequency have something to do with the slice of credit scheduler that is
>> 30ms?
>
>The scheduler runs whenever the SCHEDULE_SOFTIRQ is raised.  If you
>grep through the source code fro that string, you can find all the
>places where it's raised.
>
>Some examples include:
>* When the 30ms timeslice is finished
>* When a sleeping vcpu of higher priority than what's currently running wakes up
>* When a vcpu blocks
>* When a vcpu is migrated from one cpu to another
>
>30ms is actually a pretty long time; in typical workloads, vcpus block
>or are preempted by other waking vcpus without using up their full
>timeslice.


Thank you very much for your reply.
So, a vCPU is very likely to be preempted whenever the SCHEDULE_SOFTIRQ is raised. And we cannot find a small timeslice, say a ms, such that the time any vCPU spends in the running phase is always k*a ms for some integer k. There is no such small timeslice. Is that right?


Best Regards,
Gavin


>
> -George
>
>_______________________________________________
>Xen-devel mailing list
>Xen-devel@lists.xensource.com
>http://lists.xensource.com/xen-devel

[-- Attachment #1.2: Type: text/html, Size: 2696 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: A question on the credit scheduler
  2011-12-16 14:44 ` gavin
@ 2011-12-16 15:58   ` George Dunlap
  2011-12-17  9:16   ` gavin
  1 sibling, 0 replies; 9+ messages in thread
From: George Dunlap @ 2011-12-16 15:58 UTC (permalink / raw)
  To: gavin; +Cc: xen-devel

2011/12/16 gavin <gbtux@126.com>:
> At 2011-12-16 19:04:19,"George Dunlap" <George.Dunlap@eu.citrix.com> wrote:
>
>>2011/12/16 zhikai <gbtux@126.com>:
>>> Hi All,
>>>
>>> In the credit scheduler, the scheduling decision function csched_schedule()
>>> is called in the schedule function in scheduler.c, such as the following.
>>> next_slice = sched->do_schedule(sched, now, tasklet_work_scheduled);
>>>
>>> But, how often the csched_schedule() is called and to run? Does this
>>> frequency have something to do with the slice of credit scheduler that is
>>> 30ms?
>>
>>The scheduler runs whenever the SCHEDULE_SOFTIRQ is raised.  If you
>>grep through the source code fro that string, you can find all the
>>places where it's raised.
>>
>>Some examples include:
>>* When the 30ms timeslice is finished
>>* When a sleeping vcpu of higher priority than what's currently running wakes up
>>* When a vcpu blocks
>>* When a vcpu is migrated from one cpu to another
>>
>>30ms is actually a pretty long time; in typical workloads, vcpus block
>>or are preempted by other waking vcpus without using up their full
>>timeslice.
>
> Thank you very much for your reply.
>
> So, the vcpu is very likely to be preempted whenever the SCHEDULE_SOFTIRQ is
> raised.

It depends; if you have a cpu-burning vcpu running on a cpu all by
itself, then after its 30ms timeslice, Xen will have no one else to
run, and so will let it run again.

But yes, if there are other vcpus on the runqueue, or the host is
moderately busy, it's likely that SCHEDULE_SOFTIRQ will cause a
context-switch.

> And we cannot find a small timeslice, such as a(ms), which makes the
> time any vcpu spending on running phase is k*a(ms), k is integer here. There
> is no such a small timeslice. Is it right?

I'm sorry, I don't really understand your question.  Perhaps if you
told me what you're trying to accomplish?

 -George

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: A question on the credit scheduler
  2011-12-16 14:44 ` gavin
  2011-12-16 15:58   ` George Dunlap
@ 2011-12-17  9:16   ` gavin
  2011-12-19 10:37     ` George Dunlap
                       ` (2 more replies)
  1 sibling, 3 replies; 9+ messages in thread
From: gavin @ 2011-12-17  9:16 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 3764 bytes --]


At 2011-12-16 23:58:26,"George Dunlap" <George.Dunlap@eu.citrix.com> wrote:
>2011/12/16 gavin <gbtux@126.com>:
>> At 2011-12-16 19:04:19,"George Dunlap" <George.Dunlap@eu.citrix.com> wrote:
>>
>>>2011/12/16 zhikai <gbtux@126.com>:
>>>> Hi All,
>>>>
>>>> In the credit scheduler, the scheduling decision function csched_schedule()
>>>> is called in the schedule function in scheduler.c, such as the following.
>>>> next_slice = sched->do_schedule(sched, now, tasklet_work_scheduled);
>>>>
>>>> But, how often the csched_schedule() is called and to run? Does this
>>>> frequency have something to do with the slice of credit scheduler that is
>>>> 30ms?
>>>
>>>The scheduler runs whenever the SCHEDULE_SOFTIRQ is raised.  If you
>>>grep through the source code fro that string, you can find all the
>>>places where it's raised.
>>>
>>>Some examples include:
>>>* When the 30ms timeslice is finished
>>>* When a sleeping vcpu of higher priority than what's currently running wakes up
>>>* When a vcpu blocks
>>>* When a vcpu is migrated from one cpu to another
>>>
>>>30ms is actually a pretty long time; in typical workloads, vcpus block
>>>or are preempted by other waking vcpus without using up their full
>>>timeslice.
>>
>> Thank you very much for your reply.
>>
>> So, the vcpu is very likely to be preempted whenever the SCHEDULE_SOFTIRQ is
>> raised.
>
>It depends; if you have a cpu-burning vcpu running on a cpu all by
>itself, then after its 30ms timeslice, Xen will have no one else to
>run, and so will let it run again.
>
>But yes, if there are other vcpus on the runqueue, or the host is
>moderately busy, it's likely that SCHEDULE_SOFTIRQ will cause a
>context-switch.
>
>> And we cannot find a small timeslice, such as a(ms), which makes the
>> time any vcpu spending on running phase is k*a(ms), k is integer here. There
>> is no such a small timeslice. Is it right?
>
>I'm sorry, I don't really understand your question.  Perhaps if you
>told me what you're trying to accomplish?

I will try to describe my idea clearly below, though I really don't know whether it will work. Please give me some advice if possible.

According to the credit scheduler in Xen, a vCPU can run for a 30ms timeslice when it is scheduled on a physical CPU, and a vCPU with the BOOST priority will preempt the running one and run for an additional 10ms. So my thought is: if we monitor each physical CPU every 10ms, we can record which vCPU (if any) it is mapped to, and we can also record when a physical CPU is not mapped to any vCPU. We can then compute CPU usage as the proportion of mapped samples to the total number of samples.

For example, if we monitor the physical CPUs every 10ms, we collect 100 (pCPU_id, vCPU_id) pairs per second. If 60 of those pairs map the pCPU to a valid vCPU, and in the other 40 the pCPU is not mapped to any vCPU, then the usage of the physical CPUs is 60%.

Here we monitor the physical CPUs every 10ms, but we could also use a shorter interval, such as 1ms. Whatever interval we choose, we must make sure that either no CPU context switch happens within an interval, or context switches always occur at interval boundaries. Only under this condition can the idea work.

So, I am not sure whether we can find a time interval that meets this condition; in other words, whether there is an interval such that all CPU context switches occur at its boundaries.




Thank you very much.

Gavin


> -George
>
>_______________________________________________
>Xen-devel mailing list
>Xen-devel@lists.xensource.com
>http://lists.xensource.com/xen-devel

[-- Attachment #1.2: Type: text/html, Size: 6302 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: A question on the credit scheduler
  2011-12-17  9:16   ` gavin
@ 2011-12-19 10:37     ` George Dunlap
  2011-12-19 16:13       ` Shriram Rajagopalan
  2011-12-20  7:30     ` gavin
  2011-12-20 15:49     ` gavin
  2 siblings, 1 reply; 9+ messages in thread
From: George Dunlap @ 2011-12-19 10:37 UTC (permalink / raw)
  To: gavin; +Cc: xen-devel

2011/12/17 gavin <gbtux@126.com>:
>
> At 2011-12-16 23:58:26,"George Dunlap" <George.Dunlap@eu.citrix.com> wrote:
>>2011/12/16 gavin <gbtux@126.com>:
>>> At 2011-12-16 19:04:19,"George Dunlap" <George.Dunlap@eu.citrix.com> wrote:
>>>
>>>>2011/12/16 zhikai <gbtux@126.com>:
>>>>> Hi All,
>>>>>
>>>>> In the credit scheduler, the scheduling decision function csched_schedule()
>>>>> is called in the schedule function in scheduler.c, such as the following.
>>>>> next_slice = sched->do_schedule(sched, now, tasklet_work_scheduled);
>>>>>
>>>>> But, how often the csched_schedule() is called and to run? Does this
>>>>> frequency have something to do with the slice of credit scheduler that is
>>>>> 30ms?
>>>>
>>>>The scheduler runs whenever the SCHEDULE_SOFTIRQ is raised.  If you
>>>>grep through the source code fro that string, you can find all the
>>>>places where it's raised.
>>>>
>>>>Some examples include:
>>>>* When the 30ms timeslice is finished
>>>>* When a sleeping vcpu of higher priority than what's currently running wakes up
>>>>* When a vcpu blocks
>>>>* When a vcpu is migrated from one cpu to another
>>>>
>>>>30ms is actually a pretty long time; in typical workloads, vcpus block
>>>>or are preempted by other waking vcpus without using up their full
>>>>timeslice.
>>>
>>> Thank you very much for your reply.
>>>
>>> So, the vcpu is very likely to be preempted whenever the SCHEDULE_SOFTIRQ is
>>> raised.
>>
>>It depends; if you have a cpu-burning vcpu running on a cpu all by
>>itself, then after its 30ms timeslice, Xen will have no one else to
>>run, and so will let it run again.
>>
>>But yes, if there are other vcpus on the runqueue, or the host is
>>moderately busy, it's likely that SCHEDULE_SOFTIRQ will cause a
>>context-switch.
>>
>>> And we cannot find a small timeslice, such as a(ms), which makes the
>>> time any vcpu spending on running phase is k*a(ms), k is integer here. There
>>> is no such a small timeslice. Is it right?
>>
>>I'm sorry, I don't really understand your question.  Perhaps if you
>>told me what you're trying to accomplish?
>
> I try to describe my idea as the following clearly. But I really don't know
> if it will work. Please give me some advice if possible.
>
> According to the credit scheduler in Xen, a vCPU can run a 30ms timeslice
> when it is scheduled on the physical CPU. And, a vCPU with the BOOST
> priority will preempt the running one and run additional 10ms. So, what I
> think is if we monitor the physical CPU every 10ms and we can get the
> mapping information of a physical CPU and a vCPU. And also, we can get the
> un-mapping information that a physical CPU isn’t mapped to any vCPU. Thus,
> we can get the CPU usage by calculating the proportion of the mapping
> information to the total time when we monitored.
>
> For example, if we monitor the physical CPUs every 10ms and we can get 100
> pairs of pCPU and vCPU in a second, such as (pCPU_id, vCPU_id). If there is
> 60 mapping pairs that the pCPU is mapped to a valid vPCU, and 40 un-mapping
> pairs that we cannot find the pCPU to be mapped a valid vCPU. So, we can get
> the usage of the physical CPUs that is 60%.
>
> Here, we monitor the physical CPUs every 10ms. We also can monitor them once
> less than the 10ms interval, such as 1ms interval. Whatever interval we
> choose, we must make sure no CPU content switch in the interval or the
> context switch always occur at the edge of interval. Only in this condition,
> can this idea work.
>
> So, I am not sure whether we can find such a time interval that can meet
> this condition. In other words, whether we can find such a time interval
> that ensures all the CPU content switch occur at the edge of interval.

You still haven't described exactly what it is you're trying to
accomplish: what is your end goal?  It seems to be related somehow to
measuring how busy the system is (i.e., the number of active pcpus and
idle pcpus); but as I don't know what you want to do with that
information, I can't tell you the best way to get it.

Regarding a map of pcpus to vcpus, that already exists.  The
scheduling code will keep track of the currently running vcpu here:
  per_cpu(schedule_data, pcpu_id).curr

You can see examples of the above structure used in
xen/common/sched_credit2.c.  If "is_idle(per_cpu(schedule_data,
pcpu).curr)" is false, then the cpu is running a vcpu; if it is true,
then the pcpu is idle (although it may be running a tasklet).

Additionally, if all you want is the number of non-idle cpus, the
credit1 scheduler keeps track of the idle and non-idle cpus in
prv->idlers.  You could easily use "cpumask_weight(&prv->idlers)" to
find out how many idle cpus there are at any given time.  If you know
how many online cpus there are, that will give you the busy-ness of
the system.

So now that you have this instantaneous percentage, what do you want
to do with it?

 -George

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: A question on the credit scheduler
  2011-12-19 10:37     ` George Dunlap
@ 2011-12-19 16:13       ` Shriram Rajagopalan
  0 siblings, 0 replies; 9+ messages in thread
From: Shriram Rajagopalan @ 2011-12-19 16:13 UTC (permalink / raw)
  To: George Dunlap; +Cc: gavin, xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 5974 bytes --]

On Mon, Dec 19, 2011 at 4:37 AM, George Dunlap
<George.Dunlap@eu.citrix.com>wrote:

> 2011/12/17 gavin <gbtux@126.com>:
> >
> > At 2011-12-16 23:58:26,"George Dunlap" <George.Dunlap@eu.citrix.com
> > wrote:
> >>2011/12/16 gavin <gbtux@126.com>:
> >>> At 2011-12-16 19:04:19,"George Dunlap" <George.Dunlap@eu.citrix.com
> > wrote:
> >>>
> >>>>2011/12/16 zhikai <gbtux@126.com>:
> >>>>> Hi All,
> >>>>>
>
> >>>>> In the credit scheduler, the scheduling decision function csched_schedule()
>
> >>>>> is called in the schedule function in scheduler.c, such as the following.
> >>>>> next_slice = sched->do_schedule(sched, now, tasklet_work_scheduled);
> >>>>>
> >>>>> But, how often the csched_schedule() is called and to run? Does this
>
> >>>>> frequency have something to do with the slice of credit scheduler that is
> >>>>> 30ms?
> >>>>
> >>>>The scheduler runs whenever the SCHEDULE_SOFTIRQ is raised.  If you
> >>>>grep through the source code fro that string, you can find all the
> >>>>places where it's raised.
> >>>>
> >>>>Some examples include:
> >>>>* When the 30ms timeslice is finished
>
> >>>>* When a sleeping vcpu of higher priority than what's currently running wakes up
> >>>>* When a vcpu blocks
> >>>>* When a vcpu is migrated from one cpu to another
> >>>>
> >>>>30ms is actually a pretty long time; in typical workloads, vcpus block
> >>>>or are preempted by other waking vcpus without using up their full
> >>>>timeslice.
> >>>
> >>> Thank you very much for your reply.
> >>>
>
> >>> So, the vcpu is very likely to be preempted whenever the SCHEDULE_SOFTIRQ is
> >>> raised.
> >>
> >>It depends; if you have a cpu-burning vcpu running on a cpu all by
> >>itself, then after its 30ms timeslice, Xen will have no one else to
> >>run, and so will let it run again.
> >>
> >>But yes, if there are other vcpus on the runqueue, or the host is
> >>moderately busy, it's likely that SCHEDULE_SOFTIRQ will cause a
> >>context-switch.
> >>
> >>> And we cannot find a small timeslice, such as a(ms), which makes the
>
> >>> time any vcpu spending on running phase is k*a(ms), k is integer here. There
> >>> is no such a small timeslice. Is it right?
> >>
> >>I'm sorry, I don't really understand your question.  Perhaps if you
> >>told me what you're trying to accomplish?
> >
> > I try to describe my idea as the following clearly. But I really don't
> know
> > if it will work. Please give me some advice if possible.
> >
> > According to the credit scheduler in Xen, a vCPU can run a 30ms timeslice
> > when it is scheduled on the physical CPU. And, a vCPU with the BOOST
> > priority will preempt the running one and run additional 10ms. So, what I
> > think is if we monitor the physical CPU every 10ms and we can get the
> > mapping information of a physical CPU and a vCPU. And also, we can get
> the
> > un-mapping information that a physical CPU isn’t mapped to any vCPU.
> Thus,
> > we can get the CPU usage by calculating the proportion of the mapping
> > information to the total time when we monitored.
> >
> > For example, if we monitor the physical CPUs every 10ms and we can get
> 100
> > pairs of pCPU and vCPU in a second, such as (pCPU_id, vCPU_id). If there
> is
> > 60 mapping pairs that the pCPU is mapped to a valid vPCU, and 40
> un-mapping
> > pairs that we cannot find the pCPU to be mapped a valid vCPU. So, we can
> get
> > the usage of the physical CPUs that is 60%.
> >
> > Here, we monitor the physical CPUs every 10ms. We also can monitor them
> once
> > less than the 10ms interval, such as 1ms interval. Whatever interval we
> > choose, we must make sure no CPU content switch in the interval or the
> > context switch always occur at the edge of interval. Only in this
> condition,
> > can this idea work.
> >
> > So, I am not sure whether we can find such a time interval that can meet
> > this condition. In other words, whether we can find such a time interval
> > that ensures all the CPU content switch occur at the edge of interval.
>
> You still haven't described exactly what it is you're trying to
> accomplish: what is your end goal?  It seems to be related somehow to
> measuring how busy the system is (i.e., the number of active pcpus and
> idle pcpus); but as I don't know what you want to do with that
> information, I can't tell you the best way to get it.
>
> Regarding a map of pcpus to vcpus, that already exists.  The
> scheduling code will keep track of the currently running vcpu here:
>  per_cpu(schedule_data, pcpu_id).curr
>
> You can see examples of the above structure used in
> xen/common/sched_credit2.c.  If "is_idle(per_cpu(schedule_data,
> pcpu).curr)" is false, then the cpu is running a vcpu; if it is true,
> then the pcpu is idle (although it may be running a tasklet).
>
> Additionally, if all you want is the number of non-idle cpus, the
> credit1 scheduler keeps track of the idle and non-idle cpus in
> prv->idlers.  You could easily use "cpumask_weight(&prv->idlers)" to
> find out how many idle cpus there are at any given time.  If you know
> how many online cpus there are, that will give you the busy-ness of
> the system.
>
> So now that you have this instantaneous percentage, what do you want
> to do with it?
>
>
A tangential question:
 When you pin a vcpu to a pcpu (e.g. xm vcpu-pin 0 0 0), are the soft irqs
for that cpu still raised? (Let's assume for the sake of simplicity that
there are 2 cpus in the system and 2 domains - a dom0 and a domU, each
pinned to one cpu.)
 Do the vcpu pauses (and subsequent resumes with no context switch, etc.)
still happen due to the irqs or the scheduler code? Or will the scheduler
be effectively disabled in this scenario?

shriram


 -George
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
>

[-- Attachment #1.2: Type: text/html, Size: 7713 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: A question on the credit scheduler
  2011-12-17  9:16   ` gavin
  2011-12-19 10:37     ` George Dunlap
@ 2011-12-20  7:30     ` gavin
  2011-12-20 15:49     ` gavin
  2 siblings, 0 replies; 9+ messages in thread
From: gavin @ 2011-12-20  7:30 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 7111 bytes --]

1) My original goal is to calculate the CPU usage percentage in a different way from existing tools such as xentrace.

Because the aim of all scheduling is to map vCPUs to pCPUs, we can obtain a mapping sequence of pCPUs and vCPUs by monitoring the pCPUs at a fixed interval.

For example, if the SCHEDULE_SOFTIRQ were raised only when the timeslice finishes, the vCPUs would run as described in figure 1 (two pCPUs and three vCPUs). We monitor all the CPUs once every time t, which here equals the length of the timeslice. Over ten timeslices (0.3 second in credit1), we get a mapping sequence containing 20 (pCPU, vCPU) pairs, as follows.

(pCPU1, vCPU1), (pCPU2, vCPU2), (pCPU1, vCPU1), (pCPU2, non), (pCPU1, vCPU1), (pCPU2, vCPU3), (pCPU1, vCPU2), (pCPU2, non), (pCPU1, non), (pCPU2, vCPU2), (pCPU1, vCPU3), (pCPU2, non), (pCPU1, vCPU3), (pCPU2, vCPU1), (pCPU1, non), (pCPU2, vCPU1), (pCPU1, vCPU1), (pCPU2, non), (pCPU1, non), (pCPU2, vCPU2).

If no vCPU is mapped on a pCPU, we use a (pCPU*, non) pair, which means the pCPU is idle. In the above sequence, there are 7 idle pairs in total.

So, from the above mapping sequence, we can calculate the CPU usage percentage over the ten timeslices (0.3 second):

Usage Percentage = (20-7)/20 = 65%

2) If we can get such a mapping sequence of pCPUs and vCPUs, then besides calculating the CPU usage percentage, we might also find some patterns in the sequence and use them to infer which tasks are CPU-bound and which are I/O-bound.

However, this idea may not work. Because the SCHEDULE_SOFTIRQ is raised not only when the timeslice finishes but also in many other situations, we cannot get a regular mapping of vCPUs to pCPUs as in figure 1; the mapping may instead be irregular, as shown in figure 2. In that case, we cannot find a proper time t at which to monitor the CPUs.



--
Best Regards,
Gavin



At 2011-12-19 18:37:14,"George Dunlap" <George.Dunlap@eu.citrix.com> wrote:
>2011/12/17 gavin <gbtux@126.com>:
>>
>> At 2011-12-16 23:58:26,"George Dunlap" <George.Dunlap@eu.citrix.com> wrote:
>>>2011/12/16 gavin <gbtux@126.com>:
>>>> At 2011-12-16 19:04:19,"George Dunlap" <George.Dunlap@eu.citrix.com> wrote:
>>>>
>>>>>2011/12/16 zhikai <gbtux@126.com>:
>>>>>> Hi All,
>>>>>>
>>>>>> In the credit scheduler, the scheduling decision function csched_schedule()
>>>>>> is called in the schedule function in scheduler.c, such as the following.
>>>>>> next_slice = sched->do_schedule(sched, now, tasklet_work_scheduled);
>>>>>>
>>>>>> But, how often the csched_schedule() is called and to run? Does this
>>>>>> frequency have something to do with the slice of credit scheduler that is
>>>>>> 30ms?
>>>>>
>>>>>The scheduler runs whenever the SCHEDULE_SOFTIRQ is raised.  If you
>>>>>grep through the source code fro that string, you can find all the
>>>>>places where it's raised.
>>>>>
>>>>>Some examples include:
>>>>>* When the 30ms timeslice is finished
>>>>>* When a sleeping vcpu of higher priority than what's currently running wakes up
>>>>>* When a vcpu blocks
>>>>>* When a vcpu is migrated from one cpu to another
>>>>>
>>>>>30ms is actually a pretty long time; in typical workloads, vcpus block
>>>>>or are preempted by other waking vcpus without using up their full
>>>>>timeslice.
>>>>
>>>> Thank you very much for your reply.
>>>>
>>>> So, the vcpu is very likely to be preempted whenever the SCHEDULE_SOFTIRQ is
>>>> raised.
>>>
>>>It depends; if you have a cpu-burning vcpu running on a cpu all by
>>>itself, then after its 30ms timeslice, Xen will have no one else to
>>>run, and so will let it run again.
>>>
>>>But yes, if there are other vcpus on the runqueue, or the host is
>>>moderately busy, it's likely that SCHEDULE_SOFTIRQ will cause a
>>>context-switch.
>>>
>>>> And we cannot find a small timeslice, such as a(ms), which makes the
>>>> time any vcpu spending on running phase is k*a(ms), k is integer here. There
>>>> is no such a small timeslice. Is it right?
>>>
>>>I'm sorry, I don't really understand your question.  Perhaps if you
>>>told me what you're trying to accomplish?
>>
>> I try to describe my idea as the following clearly. But I really don't know
>> if it will work. Please give me some advice if possible.
>>
>> According to the credit scheduler in Xen, a vCPU can run a 30ms timeslice
>> when it is scheduled on the physical CPU. And, a vCPU with the BOOST
>> priority will preempt the running one and run additional 10ms. So, what I
>> think is if we monitor the physical CPU every 10ms and we can get the
>> mapping information of a physical CPU and a vCPU. And also, we can get the
>> un-mapping information that a physical CPU isn’t mapped to any vCPU. Thus,
>> we can get the CPU usage by calculating the proportion of the mapping
>> information to the total time when we monitored.
>>
>> For example, if we monitor the physical CPUs every 10ms and we can get 100
>> pairs of pCPU and vCPU in a second, such as (pCPU_id, vCPU_id). If there is
>> 60 mapping pairs that the pCPU is mapped to a valid vPCU, and 40 un-mapping
>> pairs that we cannot find the pCPU to be mapped a valid vCPU. So, we can get
>> the usage of the physical CPUs that is 60%.
>>
>> Here, we monitor the physical CPUs every 10ms. We also can monitor them once
>> less than the 10ms interval, such as 1ms interval. Whatever interval we
>> choose, we must make sure no CPU content switch in the interval or the
>> context switch always occur at the edge of interval. Only in this condition,
>> can this idea work.
>>
>> So, I am not sure whether we can find such a time interval that can meet
>> this condition. In other words, whether we can find such a time interval
>> that ensures all the CPU content switch occur at the edge of interval.
>
>You still haven't described exactly what it is you're trying to
>accomplish: what is your end goal?  It seems to be related somehow to
>measuring how busy the system is (i.e., the number of active pcpus and
>idle pcpus); but as I don't know what you want to do with that
>information, I can't tell you the best way to get it.
>
>Regarding a map of pcpus to vcpus, that already exists.  The
>scheduling code will keep track of the currently running vcpu here:
>  per_cpu(schedule_data, pcpu_id).curr
>
>You can see examples of the above structure used in
>xen/common/sched_credit2.c.  If "is_idle(per_cpu(schedule_data,
>pcpu).curr)" is false, then the cpu is running a vcpu; if it is true,
>then the pcpu is idle (although it may be running a tasklet).
>
>Additionally, if all you want is the number of non-idle cpus, the
>credit1 scheduler keeps track of the idle and non-idle cpus in
>prv->idlers.  You could easily use "cpumask_weight(&prv->idlers)" to
>find out how many idle cpus there are at any given time.  If you know
>how many online cpus there are, that will give you the busy-ness of
>the system.
>
>So now that you have this instantaneous percentage, what do you want
>to do with it?
>
> -George
>
>_______________________________________________
>Xen-devel mailing list
>Xen-devel@lists.xensource.com
>http://lists.xensource.com/xen-devel

[-- Attachment #1.2: Type: text/html, Size: 13577 bytes --]

[-- Attachment #2: fig1.jpg --]
[-- Type: image/jpeg, Size: 26669 bytes --]

[-- Attachment #3: fig2.jpg --]
[-- Type: image/jpeg, Size: 32119 bytes --]

[-- Attachment #4: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: A question on the credit scheduler
  2011-12-17  9:16   ` gavin
  2011-12-19 10:37     ` George Dunlap
  2011-12-20  7:30     ` gavin
@ 2011-12-20 15:49     ` gavin
  2 siblings, 0 replies; 9+ messages in thread
From: gavin @ 2011-12-20 15:49 UTC (permalink / raw)
  To: George Dunlap; +Cc: xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 699 bytes --]

Hi George,  


I have another question, about the credit2 scheduler.
In a shared-cache multi-core system - say two processors (processor 0 and processor 1), each with two CPU cores - it seems that the credit scheduler treats all four cores the same and does not distinguish them.
However, I am not sure whether the credit2 scheduler distinguishes them from each other. Maybe the four cores could be divided into two groups, with the two cores on the same processor in the same group. I noticed that CPU topology is on the to-do list on the credit2 Xen wiki page, but there are no details about the current state of the credit2 scheduler.




--
Best Regards,
Gavin

[-- Attachment #1.2: Type: text/html, Size: 987 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2011-12-20 15:49 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-12-16  7:55 A question on the credit scheduler zhikai
2011-12-16 11:04 ` George Dunlap
2011-12-16 14:44 ` gavin
2011-12-16 15:58   ` George Dunlap
2011-12-17  9:16   ` gavin
2011-12-19 10:37     ` George Dunlap
2011-12-19 16:13       ` Shriram Rajagopalan
2011-12-20  7:30     ` gavin
2011-12-20 15:49     ` gavin
