From: George Dunlap
Subject: Re: A question on the credit scheduler
Date: Mon, 19 Dec 2011 10:37:14 +0000
References: <26c93838.3b3fb.13445de0e67.Coremail.gbtux@126.com> <676b608e.2fb26.13447553671.Coremail.gbtux@126.com> <35e4701f.ca5a.1344b4ed98e.Coremail.gbtux@126.com>
In-Reply-To: <35e4701f.ca5a.1344b4ed98e.Coremail.gbtux@126.com>
To: gavin
Cc: "xen-devel@lists.xensource.com"
List-Id: xen-devel@lists.xenproject.org

2011/12/17 gavin:
>
> At 2011-12-16 23:58:26, "George Dunlap" wrote:
>>2011/12/16 gavin:
>>> At 2011-12-16 19:04:19, "George Dunlap" wrote:
>>>
>>>>2011/12/16 zhikai:
>>>>> Hi All,
>>>>>
>>>>> In the credit scheduler, the scheduling decision function csched_schedule()
>>>>> is called in the schedule function in scheduler.c, such as the following.
>>>>> next_slice = sched->do_schedule(sched, now, tasklet_work_scheduled);
>>>>>
>>>>> But, how often the csched_schedule() is called and to run? Does this
>>>>> frequency have something to do with the slice of credit scheduler that is
>>>>> 30ms?
>>>>
>>>>The scheduler runs whenever the SCHEDULE_SOFTIRQ is raised.  If you
>>>>grep through the source code for that string, you can find all the
>>>>places where it's raised.
>>>>
>>>>Some examples include:
>>>>* When the 30ms timeslice is finished
>>>>* When a sleeping vcpu of higher priority than what's currently running wakes up
>>>>* When a vcpu blocks
>>>>* When a vcpu is migrated from one cpu to another
>>>>
>>>>30ms is actually a pretty long time; in typical workloads, vcpus block
>>>>or are preempted by other waking vcpus without using up their full
>>>>timeslice.
>>>
>>> Thank you very much for your reply.
>>>
>>> So, the vcpu is very likely to be preempted whenever the SCHEDULE_SOFTIRQ is
>>> raised.
>>
>>It depends; if you have a cpu-burning vcpu running on a cpu all by
>>itself, then after its 30ms timeslice, Xen will have no one else to
>>run, and so will let it run again.
>>
>>But yes, if there are other vcpus on the runqueue, or the host is
>>moderately busy, it's likely that SCHEDULE_SOFTIRQ will cause a
>>context-switch.
>>
>>> And we cannot find a small timeslice, such as a(ms), which makes the
>>> time any vcpu spending on running phase is k*a(ms), k is integer here. There
>>> is no such a small timeslice. Is it right?
>>
>>I'm sorry, I don't really understand your question.  Perhaps if you
>>told me what you're trying to accomplish?
>
> I try to describe my idea as the following clearly. But I really don't know
> if it will work. Please give me some advice if possible.
>
> According to the credit scheduler in Xen, a vCPU can run a 30ms timeslice
> when it is scheduled on the physical CPU. And, a vCPU with the BOOST
> priority will preempt the running one and run additional 10ms.
> So, what I think is if we monitor the physical CPU every 10ms and we can
> get the mapping information of a physical CPU and a vCPU. And also, we can
> get the un-mapping information that a physical CPU isn't mapped to any
> vCPU. Thus, we can get the CPU usage by calculating the proportion of the
> mapping information to the total time when we monitored.
>
> For example, if we monitor the physical CPUs every 10ms and we can get 100
> pairs of pCPU and vCPU in a second, such as (pCPU_id, vCPU_id). If there are
> 60 mapping pairs that the pCPU is mapped to a valid vCPU, and 40 un-mapping
> pairs that we cannot find the pCPU to be mapped a valid vCPU. So, we can get
> the usage of the physical CPUs that is 60%.
>
> Here, we monitor the physical CPUs every 10ms. We also can monitor them once
> less than the 10ms interval, such as 1ms interval. Whatever interval we
> choose, we must make sure no CPU context switch in the interval or the
> context switch always occur at the edge of interval. Only in this condition,
> can this idea work.
>
> So, I am not sure whether we can find such a time interval that can meet
> this condition. In other words, whether we can find such a time interval
> that ensures all the CPU context switch occur at the edge of interval.

You still haven't described exactly what it is you're trying to
accomplish: what is your end goal?  It seems to be related somehow to
measuring how busy the system is (i.e., the number of active pcpus and
idle pcpus); but as I don't know what you want to do with that
information, I can't tell you the best way to get it.

Regarding a map of pcpus to vcpus, that already exists.  The
scheduling code will keep track of the currently running vcpu here:

per_cpu(schedule_data, pcpu_id).curr

You can see examples of the above structure used in
xen/common/sched_credit2.c.
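
To make the lookup concrete, here is a minimal standalone sketch of a
per-pCPU "currently running vcpu" table. It is plain C that models the
idea rather than Xen code: mock_vcpu, mock_schedule_data, mock_sd,
NR_PCPUS and running_vcpu_id() are all hypothetical names, and the
is_idle flag is an assumed simplification of how an idle vcpu is
identified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PCPUS 4            /* assumed host size for this sketch */

/* Hypothetical stand-ins; these are NOT Xen's real definitions. */
struct mock_vcpu {
    int  vcpu_id;
    bool is_idle;             /* true for a placeholder "idle" vcpu */
};

struct mock_schedule_data {
    struct mock_vcpu *curr;   /* vcpu currently running on this pCPU */
};

/* One slot per pCPU, playing the role of per_cpu(schedule_data, cpu). */
static struct mock_schedule_data mock_sd[NR_PCPUS];

/* Return the id of the vcpu running on pcpu, or -1 if the pCPU is idle. */
static int running_vcpu_id(unsigned int pcpu)
{
    assert(pcpu < NR_PCPUS);
    const struct mock_vcpu *v = mock_sd[pcpu].curr;
    /* An unset slot is treated as idle in this sketch. */
    return (v == NULL || v->is_idle) ? -1 : v->vcpu_id;
}
```

Calling running_vcpu_id() for each pCPU at one instant yields a set of
(pCPU_id, vCPU_id) pairs of the kind described in the quoted mail; it
says nothing about what happened between two samples.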
If "is_idle(per_cpu(schedule_data, pcpu).curr)" is false, then the cpu
is running a vcpu; if it is true, then the pcpu is idle (although it
may be running a tasklet).

Additionally, if all you want is the number of non-idle cpus, the
credit1 scheduler keeps track of the idle and non-idle cpus in
prv->idlers.  You could easily use "cpumask_weight(&prv->idlers)" to
find out how many idle cpus there are at any given time.  If you know
how many online cpus there are, that will give you the busy-ness of
the system.

So now that you have this instantaneous percentage, what do you want
to do with it?

 -George
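
As a rough illustration of the arithmetic, the sketch below models a
cpumask as a 64-bit word and derives a busy percentage from an idlers
mask and an online mask. It is standalone C, not Xen code:
mock_cpumask_t, mock_cpumask_weight() and busy_percent() are
hypothetical names, and the 64-pCPU limit is an assumption of the
sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit n set means pCPU n.
 * Limited to 64 pCPUs; NOT Xen's real cpumask type. */
typedef uint64_t mock_cpumask_t;

/* Count the set bits in the mask (clears the lowest set bit per pass). */
static unsigned int mock_cpumask_weight(mock_cpumask_t mask)
{
    unsigned int n = 0;
    while (mask) {
        mask &= mask - 1;
        n++;
    }
    return n;
}

/* Percentage of online pCPUs that are not idle, rounded down. */
static unsigned int busy_percent(mock_cpumask_t idlers, mock_cpumask_t online)
{
    unsigned int nr_online = mock_cpumask_weight(online);
    assert(nr_online > 0);
    /* Only count idlers that are actually online. */
    unsigned int nr_idle = mock_cpumask_weight(idlers & online);
    return (nr_online - nr_idle) * 100 / nr_online;
}
```

For example, with four online pCPUs (online = 0xF) of which two are
idle (idlers = 0x3), busy_percent() returns 50. The value is only an
instantaneous snapshot of the masks passed in.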