From mboxrd@z Thu Jan  1 00:00:00 1970
From: George Dunlap <George.Dunlap@eu.citrix.com>
Subject: Re: A question on the credit scheduler
Date: Mon, 19 Dec 2011 10:37:14 +0000
Message-ID: <CAFLBxZZP=K504L86WDJ-F+9SuTf7aKU_C=i8nLZwqDuejoVGag@mail.gmail.com>
References: <26c93838.3b3fb.13445de0e67.Coremail.gbtux@126.com>
	<CAFLBxZYM-8vEc5Egmv+iXqMDUE_UJcgHbwr7=71H-kFowTzXWA@mail.gmail.com>
	<676b608e.2fb26.13447553671.Coremail.gbtux@126.com>
	<CAFLBxZaBiwEvz-dgPzgBQQTj2xMi8m+QnZ3M+uoqw4AKmb0ZvA@mail.gmail.com>
	<35e4701f.ca5a.1344b4ed98e.Coremail.gbtux@126.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="windows-1252"
Content-Transfer-Encoding: quoted-printable
Return-path: <xen-devel-bounces@lists.xensource.com>
In-Reply-To: <35e4701f.ca5a.1344b4ed98e.Coremail.gbtux@126.com>
To: gavin <gbtux@126.com>
Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>
List-Id: xen-devel@lists.xenproject.org

2011/12/17 gavin <gbtux@126.com>:
>
> At 2011-12-16 23:58:26,"George Dunlap" <George.Dunlap@eu.citrix.com> wrote:
>>2011/12/16 gavin <gbtux@126.com>:
>>> At 2011-12-16 19:04:19,"George Dunlap" <George.Dunlap@eu.citrix.com> wrote:
>>>
>>>>2011/12/16 zhikai <gbtux@126.com>:
>>>>> Hi All,
>>>>>
>>>>> In the credit scheduler, the scheduling decision function csched_schedule()
>>>>> is called in the schedule function in scheduler.c, such as the following.
>>>>> next_slice = sched->do_schedule(sched, now, tasklet_work_scheduled);
>>>>>
>>>>> But, how often the csched_schedule() is called and to run? Does this
>>>>> frequency have something to do with the slice of credit scheduler that is
>>>>> 30ms?
>>>>
>>>>The scheduler runs whenever the SCHEDULE_SOFTIRQ is raised.  If you
>>>>grep through the source code for that string, you can find all the
>>>>places where it's raised.
>>>>
>>>>Some examples include:
>>>>* When the 30ms timeslice is finished
>>>>* When a sleeping vcpu of higher priority than what's currently running wakes up
>>>>* When a vcpu blocks
>>>>* When a vcpu is migrated from one cpu to another
>>>>
>>>>30ms is actually a pretty long time; in typical workloads, vcpus block
>>>>or are preempted by other waking vcpus without using up their full
>>>>timeslice.
>>>
>>> Thank you very much for your reply.
>>>
>>> So, the vcpu is very likely to be preempted whenever the SCHEDULE_SOFTIRQ is
>>> raised.
>>
>>It depends; if you have a cpu-burning vcpu running on a cpu all by
>>itself, then after its 30ms timeslice, Xen will have no one else to
>>run, and so will let it run again.
>>
>>But yes, if there are other vcpus on the runqueue, or the host is
>>moderately busy, it's likely that SCHEDULE_SOFTIRQ will cause a
>>context-switch.
>>
>>> And we cannot find a small timeslice, such as a(ms), which makes the
>>> time any vcpu spending on running phase is k*a(ms), k is integer here. There
>>> is no such a small timeslice. Is it right?
>>
>>I'm sorry, I don't really understand your question.  Perhaps if you
>>told me what you're trying to accomplish?
>
> I try to describe my idea as the following clearly. But I really don't know
> if it will work. Please give me some advice if possible.
>
> According to the credit scheduler in Xen, a vCPU can run a 30ms timeslice
> when it is scheduled on the physical CPU. And, a vCPU with the BOOST
> priority will preempt the running one and run additional 10ms. So, what I
> think is if we monitor the physical CPU every 10ms and we can get the
> mapping information of a physical CPU and a vCPU. And also, we can get the
> un-mapping information that a physical CPU isn't mapped to any vCPU. Thus,
> we can get the CPU usage by calculating the proportion of the mapping
> information to the total time when we monitored.
>
> For example, if we monitor the physical CPUs every 10ms and we can get 100
> pairs of pCPU and vCPU in a second, such as (pCPU_id, vCPU_id). If there are
> 60 mapping pairs that the pCPU is mapped to a valid vCPU, and 40 un-mapping
> pairs that we cannot find the pCPU to be mapped to a valid vCPU. So, we can get
> the usage of the physical CPUs that is 60%.
>
> Here, we monitor the physical CPUs every 10ms. We also can monitor them once
> less than the 10ms interval, such as 1ms interval. Whatever interval we
> choose, we must make sure no CPU context switch in the interval or the
> context switch always occur at the edge of interval. Only in this condition,
> can this idea work.
>
> So, I am not sure whether we can find such a time interval that can meet
> this condition. In other words, whether we can find such a time interval
> that ensures all the CPU context switches occur at the edge of interval.

You still haven't described exactly what it is you're trying to
accomplish: what is your end goal?  It seems to be related somehow to
measuring how busy the system is (i.e., the number of active pcpus and
idle pcpus); but as I don't know what you want to do with that
information, I can't tell you the best way to get it.

Regarding a map of pcpus to vcpus, that already exists.  The
scheduling code will keep track of the currently running vcpu here:
  per_cpu(schedule_data, pcpu_id).curr

You can see examples of the above structure used in
xen/common/sched_credit2.c.  If
"is_idle_vcpu(per_cpu(schedule_data, pcpu).curr)" is false, then the
cpu is running a vcpu; if it is true, then the pcpu is idle (although
it may be running a tasklet).
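
As a rough illustration of that check, here is a standalone C model.
It is not Xen code: the struct names, the "is_idle" flag and both
helper functions are made up for the example, and it only mirrors the
idea of looking at the "curr" pointer of each pcpu.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for Xen's types; not the real definitions. */
struct vcpu_model {
    int  vcpu_id;
    bool is_idle;   /* true for the per-pcpu idle vcpu */
};

struct schedule_data_model {
    const struct vcpu_model *curr;  /* vcpu currently running on this pcpu */
};

/* Returns true if the pcpu is running a real (non-idle) vcpu. */
static bool pcpu_is_busy(const struct schedule_data_model *sd, size_t pcpu)
{
    assert(sd != NULL && sd[pcpu].curr != NULL);
    return !sd[pcpu].curr->is_idle;
}

/* Walks all pcpus once and counts those running a non-idle vcpu. */
static size_t count_busy_pcpus(const struct schedule_data_model *sd,
                               size_t nr_pcpus)
{
    size_t busy = 0;

    for (size_t i = 0; i < nr_pcpus; i++)
        if (pcpu_is_busy(sd, i))
            busy++;
    return busy;
}
```

With three modelled pcpus where only the middle one points at the idle
vcpu, count_busy_pcpus() would report two busy pcpus.  In the
hypervisor the same walk would of course need to respect the
scheduler's locking, which this model ignores.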

Additionally, if all you want is the number of non-idle cpus, the
credit1 scheduler keeps track of the idle and non-idle cpus in
prv->idlers.  You could easily use "cpumask_weight(&prv->idlers)" to
find out how many idle cpus there are at any given time.  If you know
how many online cpus there are, that will give you the busy-ness of
the system.
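
The arithmetic is simple enough to sketch in standalone C.  This is
only a model under assumptions: the cpu masks are plain 64-bit
integers rather than Xen's cpumask_t, and mask_weight() and
busy_percent() are hypothetical helpers, with mask_weight() playing
the role that cpumask_weight() plays above.

```c
#include <assert.h>
#include <stdint.h>

/* Counts the set bits in a 64-bit mask (one bit per cpu). */
static unsigned int mask_weight(uint64_t mask)
{
    unsigned int n = 0;

    while (mask) {
        mask &= mask - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Percentage of online cpus that are not idle at this instant. */
static unsigned int busy_percent(uint64_t online, uint64_t idlers)
{
    unsigned int nr_online = mask_weight(online);

    assert(nr_online > 0);

    /* Only count idlers that are actually online. */
    unsigned int nr_idle = mask_weight(idlers & online);

    return 100u * (nr_online - nr_idle) / nr_online;
}
```

For example, with 8 online cpus and 2 of them in the idlers mask,
busy_percent(0xFF, 0x03) gives 75.  Note that this is a snapshot: the
value can change at the next context switch.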

So now that you have this instantaneous percentage, what do you want
to do with it?

 -George