On Tue, 2016-09-13 at 12:28 +0100, George Dunlap wrote: > On 17/08/16 18:18, Dario Faggioli wrote: > >  > diff --git a/xen/common/sched_credit2.c > > @@ -2233,7 +2241,8 @@ void __dump_execstate(void *unused); > >  static struct csched2_vcpu * > >  runq_candidate(struct csched2_runqueue_data *rqd, > >                 struct csched2_vcpu *scurr, > > -               int cpu, s_time_t now) > > +               int cpu, s_time_t now, > > +               unsigned int *pos) > > I think I'd prefer if this were called "skipped" or something like > that > -- to indicate how many vcpus in the runqueue had been skipped before > coming to this one. > Done. > > @@ -2298,6 +2340,7 @@ csched2_schedule( > >      struct csched2_runqueue_data *rqd; > >      struct csched2_vcpu * const scurr = CSCHED2_VCPU(current); > >      struct csched2_vcpu *snext = NULL; > > +    unsigned int snext_pos = 0; > >      struct task_slice ret; > >   > >      SCHED_STAT_CRANK(schedule); > > @@ -2347,7 +2390,7 @@ csched2_schedule( > >          snext = CSCHED2_VCPU(idle_vcpu[cpu]); > >      } > >      else > > -        snext=runq_candidate(rqd, scurr, cpu, now); > > +        snext = runq_candidate(rqd, scurr, cpu, now, &snext_pos); > >   > >      /* If switching from a non-idle runnable vcpu, put it > >       * back on the runqueue. */ > > @@ -2371,8 +2414,21 @@ csched2_schedule( > >              __set_bit(__CSFLAG_scheduled, &snext->flags); > >          } > >   > > -        /* Check for the reset condition */ > > -        if ( snext->credit <= CSCHED2_CREDIT_RESET ) > > +        /* > > +         * The reset condition is "has a scheduler epoch come to > > an end?". > > +         * The way this is enforced is checking whether the vcpu > > at the top > > +         * of the runqueue has negative credits. This means the > > epochs have > > +         * variable lenght, as in one epoch expores when: > > +         *  1) the vcpu at the top of the runqueue has executed > > for > > +         *     around 10 ms (with default parameters); > > +         *  2) no other vcpu with higher credits wants to run. > > +         * > > +         * Here, where we want to check for reset, we need to make > > sure the > > +         * proper vcpu is being used. In fact, > > runqueue_candidate() may have > > +         * not returned the first vcpu in the runqueue, for > > various reasons > > +         * (e.g., affinity). Only trigger a reset when it does. > > +         */ > > +        if ( snext_pos == 0 && snext->credit <= > > CSCHED2_CREDIT_RESET ) > > This bit wasn't mentioned in the description. :-) > You're right. Actually, I think this change deserves to be in its own patch, so in v2 I'm splitting this patch in two. > There's a certain amount of sense to the idea here, but it's the kind > of > thing that may have strange side effects.  Did you look at traces > before > and after this change?  And does the behavior seem more rational? > I have. It's not like it was happening a lot of times that we were resetting upon the wrong vcpus, but I indeed have caught a couple of examples. And yes, the trace looked more 'regular' with this patch. Or, IOW, without this patch, there were some of the reset events that were suspiciously closer between each other. TBH, in the vast majority of the cases, even when a "spurious reset" was involved, the difference was rather hard to tell, but please, consider that the combination of hard-affinity, this patch and soft- affinity will potentially make things much worse (and in fact, I saw the most severe occurrences when using hard-affinity). It's also rather hard to measure the effect, but I think what is implemented here is the right thing to do. And even if it may be hard to measure the performance impact, I claim that this is a 'correctness' issue, or at least a matter of adhering as much as possible to the algorithm theory and idea. > If so, I'm happy to trust your judgement -- just want to check to > make > sure. :-) > Ah, thanks. :-) Regards, Dario -- <> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)