On mer, 2014-07-09 at 16:55 -0400, Josh Whitehead wrote: > This patch adds the pieces needed for the CBS functionality. > So, first of all, this is absolutely my favorite resource-reservation based real-time scheduling algorithm... so glad you chose it! :-P > This includes a > "soft" flag to indicate whether a domain should be handled in a pure EDF manner > or handled by the CBS algorithm, the CBS algorithm itself to the wake function, > and few other bits and comments to track scheduler statistics, set parameters, > and document the functionality of the code itself. > Indeed, it's something pretty easy to implement, on top of a decent EDF-looking scheduler. > --- a/xen/common/sched_sedf.c > +++ b/xen/common/sched_sedf.c > /****************************************************************************** > - * Simple EDF Scheduler for xen > + * Simple EDF Scheduler for Xen > * This line changed already in patch 2. > * Permission is hereby granted, free of charge, to any person obtaining a copy > * of this software and associated documentation files (the "Software"), to > @@ -46,6 +46,7 @@ > #define CHECK(_p) ((void)0) > #endif > > +#define SEDF_SOFT_TASK (1) > #define SEDF_ASLEEP (16) > And this is why I was saying it was ugly and not very practical to leave SEDF_ASLEEP alone and #define to 16. Now you have two status flag, and instead of bits 1 and 2, they're represented by bits 1 and 5... Weird, isn't it? > #define DEFAULT_PERIOD (MILLISECS(20)) > @@ -73,7 +74,8 @@ struct sedf_vcpu_info { > > /* Parameters for EDF */ > s_time_t period; /* = Server scheduling period */ > - s_time_t budget; /* = Guarenteed minimum CPU time per period */ > + s_time_t budget; /* = Guarenteed minimum CPU time per period */ > And this one too. Just decide how do you want them to look, and make that happen once and for ever! :-P :-P > + /* Note: Server bandwidth = (budget / period) */ > Again, I'd avoid the word server. 'VCPU bandwidth' sounds just fine. > @@ -208,6 +215,7 @@ static void *sedf_alloc_vdata(const struct scheduler *ops, struct vcpu *v, void > inf->vcpu = v; > > inf->deadl_abs = 0; > + inf->cputime = 0; > inf->status = SEDF_ASLEEP; > inf is xzalloc-ed above, so I don't think we need to zero the fields explicitly. I see this is probably an attempt to achieve more simmetry and clarity, given the already existing explicit zeroing of deadl_abs. I don't have a strong opinion. It's certainly inconsistent and odd-looking to have only one, so I'd either =0 both (as above) or none. However, whatever path you choose to go, I don't think either this or the removal of deadl_abs=0 belongs to this patch. It's probably best to put this hunk in one of the cleanup two cleanup patches before this. > @@ -306,29 +314,64 @@ static void desched_edf_vcpu(s_time_t now, struct vcpu *v) > /* Update the vcpu's cputime */ > inf->cputime += now - inf->sched_start_abs; > > - /* Scheduling decisions which don't remove the running vcpu from > - * the runq */ > + /* If running vcpu has budget remaining, return */ > if ( (inf->cputime < inf->budget) && sedf_runnable(v) ) > return; > > I happen to like the original comment better. Well, actually, I'd change it a bit myself, but please, stick to something that provides some insights about "the semantic" of what's happening. I mean, the fact that you return if the condition is true is already evident from the code, no much point in saying it in a comment. :-) OTOH, it may help the reader to tell him what's the consequence of such return, i.e., "if the vcpu still has budget left, leave it alone on the RunQ" (or something like that). > + /* Current vcpu has consumed its budget for this period */ > __del_from_queue(v); > > - /* > - * Manage bookkeeping (i.e. calculate next deadline, memorise > - * overrun-time of budget) of finished vcpus. > - */ > +#ifdef SEDF_STATS > + /* Manage deadline misses */ > + if ( unlikely(inf->deadl_abs < now) ) > + { > + inf->miss_tot++; > + inf->miss_time += inf->cputime; > Two questions: 1) you're checking whether a deadline miss happened only in case there has been a budget overrun, right? So, why? Can't a vcpu overcome its deadline even without overrunning (e.g., in case the system is overloaded to more than 100%, i UP case, utilization)? 2) By miss_time you mean by how much the deadline has been missed (in literature, that's called tardiness), right? If yes, what makes you sure that the miss time is inf->cputime? > + } > +#endif > + > These #ifdef are really really ugly. I think using some helpers, and their empty stubs in case SEDF_STATS is not defined, would make it all look a lot better. E.g., have a look at what's done in sched_credit.c However, this can well happen in another unrelated series. > + /* Manage overruns */ > if ( inf->cputime >= inf->budget ) > { > inf->cputime -= inf->budget; > > /* Set next deadline */ > inf->deadl_abs += inf->period; > + > + /* Ensure that the cputime is always less than budget */ > + if ( unlikely(inf->cputime > inf->budget) ) > + { > +#ifdef SEDF_STATS > + inf->over_tot++; > Why this is incremented only here? Or, I guess, what's the intended meaning and usage of over_tot? I'm asking because it looks like you want to count the number of overruns. However, it seems to me that putting this here, you are counting the occurrences of the vcpu having executed more than twice its budget (you do the ++ only if cputime-budget>budget ==> cputime>2*budget)... Is that what you want? > + inf->over_time += inf->cputime; > +#endif > + > + /* Make up for the overage by pushing the deadline > + into the future; ensure one domain doesn't consistently > + consume extra time by overrunning its budget */ > Style of the comment. > + inf->deadl_abs += ((inf->cputime / inf->budget) > + * inf->period); > + inf->cputime -= (inf->cputime / inf->budget) * inf->budget; > + } > + Honestly, the whole "if ( inf->cputime >= inf->budget ) {}" block looks a bit clumsy to me. How about just something like this: while ( inf->cputime >= inf->budget ) { inf->over_tot++; // if you want to count *all* overruns, // otherwise, skip the ++ for 1st iteration inf->cputime -= inf->budget; inf->deadl_abs += inf->period; } > + /* Ensure that the start of the next period is in the future */ > + if ( unlikely(PERIOD_BEGIN(inf) < now) ) > + inf->deadl_abs += > + (DIV_UP(now - PERIOD_BEGIN(inf), > + inf->period)) * inf->period; > Well, I agree that checking that the new period begins in future is required. However, if at this point that is not true, we really are lagging quite a bit! On what to do, I'm not sure whether I'd just push the deadline ahead, as you're doing, or go for a full reset (e.g., deadl_abs=NOW()+period). I'm fine with the pushing ahead, if you like it better, but I'd consider printing a warning somewhere, as this really should not happen. > } > > /* Add a runnable vcpu to the appropriate queue */ > if ( sedf_runnable(v) ) > { > - __add_to_waitqueue_sort(v); > + if( sedf_soft(v) ) > + { > + __add_to_runqueue_sort(v); > + } > + else > + { > + __add_to_waitqueue_sort(v); > + } > } > > ASSERT(EQ(sedf_runnable(v), __task_on_queue(v))); > As I said in another mail, let's kill EQ() macro too. ASSERT(sedf_runnable(v) == __task_on_queue(v)); looks perfectly fine to me. Of course, do it in the appropriate patch (not here). > @@ -418,9 +461,9 @@ static void sedf_deinit(const struct scheduler *ops) > static struct task_slice sedf_do_schedule( > const struct scheduler *ops, s_time_t now, bool_t tasklet_work_scheduled) > { > - int cpu = smp_processor_id(); > - struct list_head *runq = RUNQ(cpu); > - struct list_head *waitq = WAITQ(cpu); > + int cpu = smp_processor_id(); > + struct list_head *runq = RUNQ(cpu); > + struct list_head *waitq = WAITQ(cpu); > struct sedf_vcpu_info *inf = SEDF_VCPU(current); > struct sedf_vcpu_info *runinf, *waitinf; > struct task_slice ret; > I may be wrong, but I feel like this hunk is something that would be better fixed in patch 2 than in this one, isn't it? > @@ -469,7 +512,7 @@ static struct task_slice sedf_do_schedule( > waitinf = list_entry(waitq->next, > struct sedf_vcpu_info, list); > /* > - * Rerun scheduler, when scheduled vcpu consumes > + * Rerun scheduler when scheduled vcpu consumes > And this one too. > * its budget or the first vcpu from the waitqueue > * gets ready. > */ > @@ -490,7 +533,7 @@ static struct task_slice sedf_do_schedule( > } > > /* > - * TODO: Do something USEFUL when this happens and find out, why it > + * TODO: Do something USEFUL when this happens and find out why it > And also this one. > * still can happen!!! > */ > if ( ret.time < 0) > @@ -554,15 +597,21 @@ static inline int should_switch(struct vcpu *cur, > /* > * This function wakes up a vcpu, i.e. moves them into the appropriate queue > * > - * When a blocked vcpu unblocks, it is allowed to start execution at > - * the beginning of the next complete period > - * (D..deadline, R..running, B..blocking/sleeping, U..unblocking/waking up > + * For Hard Real-Time vcpus (soft = 0): > + * -When a blocked vcpu unblocks, it is allowed to start execution at > + * the beginning of the next complete period > + * (D..deadline, R..running, B..blocking/sleeping, U..unblocking/waking up > + * > + * DRRB_____D__U_____DRRRRR___D________ ... > + * > + * -This causes the vcpu to miss a period (and a deadlline) > + * -Doesn't disturb the schedule at all > + * -Deadlines keep occuring isochronous > So, now I'm curious, why did you pick this variant, instead of the 'let's reset everything at wake-up time' one? You think keeping deadline isochronous is important, to the point that is ok to postpone execution by one period? (Note that I'm not saying that is not the case... I'm genuinely asking :-) ) > - if ( now < inf->deadl_abs ) > - { > - /* Short blocking */ > - inf->short_block_tot++; > - } > - else > + if ( sedf_soft(v) ) > { > - /* Long unblocking, someone is going to miss their deadline. */ > - inf->long_block_tot++; > + /* > + * Apply CBS rule > + * Where: > + * c == Remaining server budget == (inf->budget - cpu_time) > + * d == Server (vcpu) deadline == inf->deadl_abs > + * r == Wake-up time of vcpu == now > + * U == Server (vcpu) bandwidth == (inf->budget / inf->period) > + * > + * if c>=(d-r)*U ---> > + * (inf->budget - cputime) >= > + * (inf->deadl_abs - now) * (inf->period / inf->period) > + * > + * If true, push deadline back by one period and refresh budget, else > + * use current budget and deadline. > + * > + * Note: The 'if' statement below is equivalent to the above comments; > + * it has been simplified to avoid the division operator > + */ > + if ((inf->budget - inf->cputime) * inf->period >= > + (inf->deadl_abs - now) * inf->budget) > + { > + /* Push back deadline by one period */ > + inf->deadl_abs += inf->period; > + inf->cputime = 0; > + } > + This looks ok. > + /* > + * In CBS we don't care if the period has begun, > + * the task doesn't have to wait for its period > + * because it'll never request more than its budget > + * for any given period. > + */ > + __add_to_runqueue_sort(v); > Well, I'm not sure I understand the comment or, if I understand it, that I agree with it. It's not that the vcpu won't request more than its budget. A vcpu may well try to execute for 100% of the time... What happens in that case is that the budgetting mechanism of the algorithm kicks in and slow it down, by pushing its deadline away (and hence lowering its priority). But perhaps you were trying to say something different? > } > + else { > + /* Task is a hard task, treat accordingly */ > +#ifdef SEDF_STATS > + if ( now < inf->deadl_abs ) > + { > + /* Short blocking */ > + inf->short_block_tot++; > + } > + else > + { > + /* Long unblocking, someone is going to miss their deadline. */ > + inf->long_block_tot++; > Again, I'm not sure I understand who 'someone' should be. It looks to me that it is right the vcpu that is unblocking that risks to miss its deadline, since it will have to skip a period. It should not be any of the other vcpus that miss its own deadline, because of this vcpu unblocking, or we'd be breaking the CPU isolation feature that this kind of real-time scheduling is supposed to provide... Is that what you meant? > + } > +#endif > I think this could have been inside #ifdef SEDF_STATS from the beginning, which would mean do part of this in patch 2. > - if ( PERIOD_BEGIN(inf) > now ) > - __add_to_waitqueue_sort(v); > - else > - __add_to_runqueue_sort(v); > + if ( PERIOD_BEGIN(inf) > now ) > + __add_to_waitqueue_sort(v); > + else > + __add_to_runqueue_sort(v); > + } > > #ifdef SEDF_STATS > /* Do some statistics here... */ > @@ -626,7 +716,7 @@ static void sedf_wake(const struct scheduler *ops, struct vcpu *v) > cpu_raise_softirq(v->processor, SCHEDULE_SOFTIRQ); > } > > -/* Print a lot of useful information about a vcpus in the system */ > +/* Print a lot of useful information about a vcpu in the system */ > ditto (do this in patch 2). > @@ -760,8 +868,9 @@ static int sedf_adjust(const struct scheduler *ops, struct domain *d, struct xen > goto out; > } > > - op->u.sedf.period = SEDF_VCPU(d->vcpu[0])->period; > - op->u.sedf.slice = SEDF_VCPU(d->vcpu[0])->budget; > + op->u.sedf.period = SEDF_VCPU(d->vcpu[0])->period; > + op->u.sedf.budget = SEDF_VCPU(d->vcpu[0])->budget; > + op->u.sedf.soft = sedf_soft(d->vcpu[0]); > And again (i.e., do this in patch 2?). > } > Regards, Dario -- <> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)