On Thu, 2016-09-01 at 12:08 +0100, anshul makkar wrote:
> On 17/08/16 18:19, Dario Faggioli wrote:
> > 
> > diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> > 
> > @@ -506,34 +506,68 @@ void smt_idle_mask_clear(unsigned int cpu, cpumask_t *mask)
> >   }
> > 
> >   /*
> > - * When a hard affinity change occurs, we may not be able to check some
> > - * (any!) of the other runqueues, when looking for the best new processor
> > - * for svc (as trylock-s in csched2_cpu_pick() can fail). If that happens, we
> > - * pick, in order of decreasing preference:
> > - *  - svc's current pcpu;
> > - *  - another pcpu from svc's current runq;
> > - *  - any cpu.
> > + * In csched2_cpu_pick(), it may not be possible to actually look at remote
> > + * runqueues (the trylock-s on their spinlocks can fail!). If that happens,
>
> With remote runqueues, do you mean runqs on a remote socket?
>
I mean runqueues different from the runq the vcpu is currently
assigned to (as per runq_assign()/runq_deassign()). If runqueues are
configured to be per-socket then, yes, it will try to lock runqueues
containing pcpus that are on a different socket wrt
svc->vcpu->processor. (The P.S. at the bottom of this mail sketches
the trylock-and-fallback pattern.)

> Can't we just read their workload, or change the lock type to allow
> reading?
>
Reading without taking the lock would race against the load value
being updated. And updating the load is done by __update_runq_load()
which, with all its shifts and maths, is by no means an atomic
operation. So it's not just a matter of risking a slightly outdated
value, which, I agree, may not be that bad: we risk reading something
inconsistent. :-/ (The P.P.S. at the bottom has a toy example of
this.)

About "changing the lock type", I guess you mean we could turn the
runqueue lock into an rw-lock too? If yes, that's indeed interesting,
and I've also thought about it, but, for now, I've always deferred
actually trying it. It's technically non-trivial, as it would involve
changing schedule_data->schedule_lock and all the
{v,p}cpu_schedule_lock*() functions. Also, it's a lock that will be
taken for writing almost all of the time, which usually means a plain
spinlock is what you want.

So, IMO, before embarking on something like that, it'd be good to
figure out how frequently we actually fail to take a remote runqueue
lock, and what the real impact of dealing with the consequences of
that is. I'm not saying it's not worth a try, but it's at high risk of
being a lot of work for a very small gain, and there are more
important things to focus on.

> > + * we pick, in order of decreasing preference:
> > + *  1) svc's current pcpu, if it is part of svc's soft affinity;
> > + *  2) a pcpu in svc's current runqueue that is also in svc's soft affinity;
>
> svc's current runqueue: do you mean the runq in which svc is
> currently queued?
>
I mean the runqueue to which svc is currently assigned (again, as per
runq_assign()/runq_deassign()), which in turn means that, if svc is
queued in a runqueue, it is queued there (so, I guess the short answer
to your question is "yes" :-D).

Regards,
Dario
-- 
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
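
P.S. To make the trylock-and-fallback idea concrete, here is a
minimal, self-contained userspace sketch. It is not the actual
csched2_cpu_pick() code: the struct, the pthread mutexes standing in
for Xen's runqueue spinlocks, and the pick_least_loaded() helper are
all made up for illustration.

#include <limits.h>
#include <pthread.h>
#include <stddef.h>

struct runq {
    pthread_mutex_t lock;
    unsigned int load;            /* protected by lock */
};

/*
 * Return the least loaded runqueue among those whose lock could be
 * taken without blocking, or NULL if every trylock failed. On NULL,
 * the caller falls back, in order of decreasing preference, to the
 * vcpu's current pcpu, another pcpu of its current runqueue, or any
 * pcpu, as per the comment under discussion above.
 */
struct runq *pick_least_loaded(struct runq *rqs, size_t nr)
{
    struct runq *best = NULL;
    unsigned int best_load = UINT_MAX;
    size_t i;

    for ( i = 0; i < nr; i++ )
    {
        if ( pthread_mutex_trylock(&rqs[i].lock) != 0 )
            continue;             /* lock busy: skip it, don't spin */
        if ( rqs[i].load < best_load )
        {
            best_load = rqs[i].load;
            best = &rqs[i];
        }
        pthread_mutex_unlock(&rqs[i].lock);
    }
    return best;
}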
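
P.P.S. And this is the kind of inconsistency I mean when I say a
lockless read races with the load update. The decaying average below
(reusing struct runq from the P.S.) is a made-up, simplified stand-in
for __update_runq_load(), not the real thing:

#define LOADAVG_SHIFT 4

/*
 * The update is a read-modify-write spread over more than one store,
 * so a reader that doesn't hold the lock can run between step 1 and
 * step 2 and observe a value that never was a valid load average,
 * i.e., not merely a stale one.
 */
static void update_load(struct runq *rq, unsigned int instant_load)
{
    rq->load -= rq->load >> LOADAVG_SHIFT;        /* step 1 */
    /* an unlocked reader here sees a half-updated value */
    rq->load += instant_load >> LOADAVG_SHIFT;    /* step 2 */
}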