xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Dario Faggioli <dario.faggioli@citrix.com>
To: Tianyang Chen <tiche@seas.upenn.edu>, xen-devel@lists.xenproject.org
Cc: george.dunlap@citrix.com, Dagaen Golomb <dgolomb@seas.upenn.edu>,
	Meng Xu <mengxu@cis.upenn.edu>
Subject: Re: [PATCH v9]xen: sched: convert RTDS from time to event driven model
Date: Wed, 16 Mar 2016 15:25:07 +0100	[thread overview]
Message-ID: <1458138307.3102.901.camel@citrix.com> (raw)
In-Reply-To: <1458000275-4237-1-git-send-email-tiche@seas.upenn.edu>


[-- Attachment #1.1: Type: text/plain, Size: 20325 bytes --]

Ok, it's about time that we deal with this changelog!

On Mon, 2016-03-14 at 20:04 -0400, Tianyang Chen wrote:
> Budget replenishment and enforcement are separated by adding
> a replenishment timer, which fires at the next most imminent
> release time of all runnable vcpus.
> 
First of all, state (quickly) the "problems". So, right now:
 - the scheduler, although the algorithm is event driven by nature, 
   follow a time driven model (is invoked periodically!), making the 
   code looks unnatural;
 - budget replenishment logic, budget enforcement logic and scheduling
   decisions are mixed and entangled, making the code hard to 
   understand;
 - the various queues of vcpus are scanned various times, making the 
   code inefficient;

Then describe your solution. The first sentence you've got up above is
ok...

> A replenishment queue has been added to keep track of all vcpus that
> are runnable.
> 
...and this one too.

> The following functions have major changes to manage the
> replenishment
> events and timer.
> 
> repl_handler(): It is a timer handler which is re-programmed
> to fire at the nearest vcpu deadline to replenish vcpus.
> 
> rt_schedule(): picks the highest runnable vcpu based on cpu
> affinity and ret.time will be passed to schedule(). If an idle
> vcpu is picked, -1 is returned to avoid busy-waiting. repl_update()
> has been removed.
> 
> rt_vcpu_wake(): when a vcpu wakes up, it tickles instead of
> picking one from the run queue.
> 
> rt_context_saved(): when context switching is finished, the
> preempted vcpu will be put back into the runq.
> 
This is too detailed for a changelog. If you want this information to
live somewhere (it already lives in the list archives, actually), make
a cover letter (this is just one patch, so it's not required, but
nothing forbids that). Or put it in a wiki page. Or write a blog post.
Or (which would be kind of nice) all of them! :-)

Just a few more words, in addition of the above two sentences, covering
the other changes done in the patch are more than enough. Such as:

"In the main scheduling function, we now return the next time when a
making a scheduling decision is going to be necessary, such as when the
budget of the currently running vcpu will run over.

Finally, when waking up a vcpu, it is now enough to tickle the various
CPUs appropriately, like all other schedulers also do."

> Simplified funtional graph:
> 
> schedule.c SCHEDULE_SOFTIRQ:
>     rt_schedule():
>         [spin_lock]
>         burn_budget(scurr)
>         snext = runq_pick()
>         [spin_unlock]
> 
> sched_rt.c TIMER_SOFTIRQ
>     replenishment_timer_handler()
>         [spin_lock]
>         <for_each_vcpu_on_q(i)> {
>             replenish(i)
>             runq_tickle(i)
>         }>
>         program_timer()
>         [spin_lock]
> 
And kill this (or, again, move to cover, wiki, blog, etc.) as well.

Also, trying to apply this gave:

/* 
<stdin>:194: trailing whitespace.
 * A helper function that only removes a vcpu from a queue 
<stdin>:355: trailing whitespace.
    /* 
<stdin>:356: trailing whitespace.
     * The timer initialization will happen later when 
<stdin>:380: trailing whitespace.
    {   
warning: squelched 10 whitespace errors
warning: 15 lines add whitespace errors.

And I think there are a couple of long lines as well.

> --- a/xen/common/sched_rt.c
> +++ b/xen/common/sched_rt.c
> @@ -3,7 +3,9 @@
>   * EDF scheduling is a real-time scheduling algorithm used in
> embedded field.
>   *
>   * by Sisu Xi, 2013, Washington University in Saint Louis
> - * and Meng Xu, 2014, University of Pennsylvania
> + * Meng Xu, 2014-2016, University of Pennsylvania
> + * Tianyang Chen, 2016, University of Pennsylvania
> + * Dagaen Golomb, 2016, University of Pennsylvania
>   *
 * Meng Xu, 2014-2016, University of Pennsylvania.
 *
 * Conversion toward event drive model by Tianyang Chen
 * and Dagaen Golomb, 2016, University of Pennsylvania.

> @@ -115,6 +118,18 @@
>  #define RTDS_delayed_runq_add (1<<__RTDS_delayed_runq_add)
>  
>  /*
> + * The replenishment timer handler needs to check this bit
> + * to know where a replenished vcpu was, when deciding which
> + * vcpu should tickle.
> + * A replenished vcpu should tickle if it was moved from the
> + * depleted queue to the run queue.
> + * + Set in burn_budget() if a vcpu has zero budget left.
> + * + Cleared and checked in the repenishment handler.
> + */
>
Stuff about how the replenishment timer handler uses it, with this
level of details, is better said in the replenishment timer handler
itself. Such that, if at some point we'll use this flag for other
things, we will not have to change this comment!

Just limit to what the flag is meant at representing. So, to sort of
follow the style of other similar comments (yes, perhaps it's not too
bad :-D).

And I'd lose the _was in the name.

/*
 * RTDS_depleted: does this vcpu run out of budget?
 * This flag is:
 * + set in burn_budget(), if a vcpu has zero budget left;
 * + checked and cleared in the replenishment timer handler,
 *   for the vcpus that are being replenished.
 */


> +#define __RTDS_was_depleted     3
> +#define RTDS_was_depleted (1<<__RTDS_was_depleted)
>
__RTDS_depleted and RTDS_depleted.

It's equally expressive and more general (i.e., suitable for more broad
usage in future, without needing to change the name at that time).

> @@ -142,6 +157,9 @@ static cpumask_var_t *_cpumask_scratch;
>   */
>  static unsigned int nr_rt_ops;
>  
> +/* handler for the replenishment timer */
> +static void repl_handler(void *data);
> +
Can we call the function repl_timer_handler() and lose the comment?

> @@ -160,6 +180,7 @@ struct rt_private {
>   */
>  struct rt_vcpu {
>      struct list_head q_elem;    /* on the runq/depletedq list */
> +    struct list_head replq_elem; /* on the repl event list */
>  
"on the replenishment events list"

It still fits, and it's much better.

> @@ -316,6 +356,13 @@ rt_dump(const struct scheduler *ops)
>          rt_dump_vcpu(ops, svc);
>      }
>  
> +    printk("Global Replenishment Event info:\n");
>
"Events"

> @@ -380,11 +427,78 @@ rt_update_deadline(s_time_t now, struct rt_vcpu
> *svc)
>      return;
>  }
>  
> +/* 
> + * A helper function that only removes a vcpu from a queue 
> + * and it returns 1 if the vcpu was in front of the list.
> + */
> +static inline int
> +deadline_queue_remove(struct list_head *queue, struct list_head
> *elem)
> +{
> +    int pos = 0;
> +
> +    if ( queue->next != elem )
> +        pos = 1;
> +
> +    list_del_init(elem);
> +    return !pos;
> +}
> +
Can we move deadline_queue_insert() here, so that these two similar
helpers are close to each other, and harmonize, or even unify the
comments.

Something like (put above deadline_queue_remove()):
/*
 * Helpers for removing and inserting a vcpu in a queue
 * that is being kept ordered by the vcpus' deadlines (as EDF
 * mandates).
 *
 * For callers' convenience, the vcpu removing helper returns
 * true if the vcpu removed was the one at the front of the
 * queue; similarly, the inserting helper returns true if the
 * inserted ended at the front of the queue (i.e., in both
 * cases, if the vcpu with the earliest deadline is what we
 * are dealing with).
 */

Also, it looks like the both can be of boolean return type.

> +/*
> + * Removing a vcpu from the replenishment queue could
> + * re-program the timer for the next replenishment event
> + * if it was at the front of the list.
> + */
>
This makes me think that it is the removal itself that,
automati_g_ically, would reprogram the timer.

Kill this comment and expand the one inside the function.

> +static inline void
> +__replq_remove(const struct scheduler *ops, struct rt_vcpu *svc)
> +{
>
Can we call it replq_remove()?

> +    struct rt_private *prv = rt_priv(ops);
> +    struct list_head *replq = rt_replq(ops);
> +    struct timer* repl_timer = prv->repl_timer;
>
There's no need for this local variable (repl_timer).

> +
> +    ASSERT( vcpu_on_replq(svc) );
> +
> +    if ( deadline_queue_remove(replq, &svc->replq_elem) )
> +    {
> +        /* re-arm the timer for the next replenishment event */
/*
 * The replenishment timer needs to be set to fire when a
 * replenishment for the vcpu at the front of the replenishment
 * queue is due. If it is such vcpu that we just removed, we may
 * need to reprogram the timer.
 */

> +        if ( !list_empty(replq) )
> +        {
> +            struct rt_vcpu *svc_next = replq_elem(replq->next);
> +            set_timer(repl_timer, svc_next->cur_deadline);
> +        }
> +        else
> +            stop_timer(repl_timer);
> +    }
> +}
> +
> +/*
> + * An utility function that inserts a vcpu to a
> + * queue based on certain order (EDF). The returned
> + * value is 1 if a vcpu has been inserted to the
> + * front of a list.
> + */
> +static int
> +deadline_queue_insert(struct rt_vcpu * (*_get_q_elem)(struct
> list_head *elem),
> +    struct rt_vcpu *svc, struct list_head *elem, struct list_head
> *queue)
> +{
>
Indentation.

Also I'd use some #defines to make things look a bit better.

So:

static bool_t
deadline_queue_insert(struct rt_vcpu * (*qelem)(struct list_head *),
                      struct rt_vcpu *svc, struct list_head *elem,
                      struct list_head *queue)
{
 ...
}
#define deadline_runq_insert(...) \
  deadline_queue_insert(&__q_elem, ##__VA_ARGS__)
#define deadline_replq_insert(...) \
  deadline_queue_insert(&replq_elem, ##__VA_ARGS__)

And then, at the various call sites:

 deadline_runq_insert(svc, &svc->q_elem, runq);
 ...
 if ( deadline_replq_insert(svc, &svc->replq_elem, replq) )
 ...
 if ( deadline_replq_insert(svc, &svc->replq_elem, replq) )
 ...
 etc.

Which would also make a few long lines go away.

> @@ -397,27 +511,62 @@ __runq_insert(const struct scheduler *ops,
> struct rt_vcpu *svc)

> +/*
> + * Insert svc into the replenishment event list
> + * in replenishment time order.
> + * vcpus that need to be replished earlier go first.
> + * The timer may be re-programmed if svc is inserted
> + * at the front of the event list.
> + */
>
You can keep the lines a bit longer than this, the comment will look
better. No need to necessarily reach 79 characters, of course, but
reaching ~70 would be ok.

And, actually, even if I feel like I remember I suggested this comment
myself, I think it's best to do what I said already for replq_remove():
keep only the last sentence, and move it inside the fucntion (just
above the if is fine).

> +static void
> +__replq_insert(const struct scheduler *ops, struct rt_vcpu *svc)
> +{
>
Call this replq_insert().

> +    struct list_head *replq = rt_replq(ops);
> +    struct rt_private *prv = rt_priv(ops);
> +    struct timer *repl_timer = prv->repl_timer;
> +
Again, no need for this.

> +/*
> + * Removes and re-inserts an event to the replenishment queue.
>
    * The aim is to update its position inside the queue, as its 
    * deadline (and hence its replenishment time) could have
    * changed.

> + */
> +static void
> +replq_reinsert(const struct scheduler *ops, struct rt_vcpu *svc)
> +{
> +    struct list_head *replq = rt_replq(ops);
> +    struct rt_private *prv = rt_priv(ops);
> +    struct timer *repl_timer = prv->repl_timer;
> +
> +    ASSERT( vcpu_on_replq(svc) );
> +
> +    if ( deadline_queue_remove(replq, &svc->replq_elem) )
> +    {
> +        if ( deadline_queue_insert(&replq_elem, svc, &svc-
> >replq_elem, replq) )
> +            set_timer(repl_timer, svc->cur_deadline);
> +        else
> +        {
> +            struct rt_vcpu *svc_next = replq_elem(replq->next);
> +            set_timer(repl_timer, svc_next->cur_deadline);
> +        }
>      }
> +    else if ( deadline_queue_insert(&replq_elem, svc, &svc-
> >replq_elem, replq) )
> +        set_timer(repl_timer, svc->cur_deadline);
>
This looks better, and it should work (but please double check, haven't
tested it):

static void
replq_reinsert(const struct scheduler *ops, struct rt_vcpu *svc)
{
    struct list_head *replq = rt_replq(ops);
    struct rt_vcpu *rearm_svc = svc;
    bool_t rearm = false;

    ASSERT( vcpu_on_replq(svc) );

    /*
     * If svc was at the front of the replenishment queue, we certainly
     * need to re-program the timer, and we want to use the deadline of
     * the vcpu which is now at the front of the queue (which may still
     * be svc or not).
     * 
     * We may also need to re-program, if svc has been put at the front
     * of the replenishment queue when being re-inserted.
     */
    if ( deadline_queue_remove(replq, &svc->replq_elem) )
    {
        deadline_replq_insert(svc, &svc->replq_elem, replq);
        rearm_svc = replq_elem(replq->next);
        rearm = true; 
    }
    else
        rearm = deadline_replq_insert(svc, &svc->replq_elem, replq);

    if ( rearm )
        set_timer(rt_priv(ops)->repl_timer, rearm_svc->cur_deadline);
}

> @@ -706,8 +888,15 @@ burn_budget(const struct scheduler *ops, struct
> rt_vcpu *svc, s_time_t now)
>  
>      svc->cur_budget -= delta;
>  
> -    if ( svc->cur_budget < 0 )
> +    if ( svc->cur_budget <= 0 )
> +    {    
>          svc->cur_budget = 0;
> +        /* 
> +         * Set __RTDS_was_depleted so the replenishment
> +         * handler can let this vcpu tickle if it was refilled.
> +         */
> +        set_bit(__RTDS_was_depleted, &svc->flags);
>
Kill the comment. You'll say what happens there... in there! :-)

> @@ -1150,6 +1301,100 @@ rt_dom_cntl(
>      return rc;
>  }
>  
> +/*
> + * The replenishment timer handler picks vcpus
> + * from the replq and does the actual replenishment.
> + */
> +static void repl_handler(void *data){
> +    s_time_t now = NOW();
> +    struct scheduler *ops = data; 
> +    struct rt_private *prv = rt_priv(ops);
> +    struct list_head *replq = rt_replq(ops);
> +    struct list_head *runq = rt_runq(ops);
> +    struct timer *repl_timer = prv->repl_timer;
> +    struct list_head *iter, *tmp;
> +    struct list_head tmp_replq;
> +    struct rt_vcpu *svc = NULL;
>
You don't need to initialize this, do you?

> +    spin_lock_irq(&prv->lock);
> +
> +    stop_timer(repl_timer);
> +
There is no need for stopping the timer inside the handler.

> +    /*
> +     * A temperary queue for updated vcpus.
> +     * It is used to tickle.
> +     */
> +    INIT_LIST_HEAD(&tmp_replq);
> +
You can use LIST_HEAD(tmp_replq); for defining and initializing. I'm
quite sure this is what I suggested, are you using this because you
tried and it didn't work?

The comment is not useful. It's quite clear from a number of factors
that it's a temporary list.

> +    list_for_each_safe ( iter, tmp, replq )
> +    {
> +        svc = replq_elem(iter);
> +
> +        if ( now < svc->cur_deadline )
> +            break;
> +
> +        rt_update_deadline(now, svc);
> +        /*
> +         * When a vcpu is replenished, it is moved
> +         * to a temporary queue to tickle.
> +         */
> +        list_del(&svc->replq_elem);
> +        list_add(&svc->replq_elem, &tmp_replq);
> +
           list_del(&svc->replq_elem);
           rt_update_deadline(now, svc);
           list_add(&svc->replq_elem, &tmp_replq);

Is more clear, IMO. Also, the comments. I think they need to be more
clear about what is happening in these two loops.

What I think is best is to get rid of all these pieces inside the
list_for_each(), and put one just one comment above it, explaining what
happens inside the loop (the same is true for the other below
list_for_each()).

> +        /*
> +         * If svc is on run queue, we need
> +         * to put it to the correct place since
> +         * its deadline changes.
> +         */
> +        if ( __vcpu_on_q(svc) )
> +        {
> +            /* put back to runq */
> +            __q_remove(svc);
> +            __runq_insert(ops, svc);
> +        }
> +    }
> +
> +    /* Iterate through the list of updated vcpus. */
> +    list_for_each_safe ( iter, tmp, &tmp_replq )
> +    {
> +        struct vcpu* vc;
> +        svc = replq_elem(iter);
> +        vc = svc->vcpu;
>
For vcpus, we should use variables named 'v'. But, anyway, I don't
think you need this (it's used only once!).

> +        if ( curr_on_cpu(vc->processor) == vc && 
> +             ( !list_empty(runq) ) )
>
So, this is because, since we don't keep the idle vcpus in the
runqueues, we need to catch the case where v is running, but no other
vcpu is waiting on the runqueue in runnable state, is this right?

In any case, parentheses around the second part of the && seems
pointless.

> +        {
> +            /*
> +             * If the vcpu is running, let the head
> +             * of the run queue tickle if it has a 
> +             * higher priority.
> +             */
> +            struct rt_vcpu *next_on_runq = __q_elem(runq->next);
>
Empty line here.

Also, the correct wording is "tickle the head of the runqueue". (But
see what I said above about comments for these loops.)

> +            if ( svc->cur_deadline > next_on_runq->cur_deadline )
> +                runq_tickle(ops, next_on_runq);
> +        }
> +        else if ( __vcpu_on_q(svc) )
> +        {
> +            /* Only need to tickle a vcpu that was depleted. */
> +            if ( test_and_clear_bit(__RTDS_was_depleted, &svc-
> >flags) )
>
These two if-s can be combined, can't they?


> +                runq_tickle(ops, svc);
> +        }
> +
> +        list_del(&svc->replq_elem);
> +        /* Move it back to the replenishment event queue. */
> +        deadline_queue_insert(&replq_elem, svc, &svc->replq_elem,
> replq);
> +    }
> +
> +    /* 
> +     * Use the vcpu that's on the top of the run queue.
>
It's not the run queue, it's the replenishment queue.

> +     * Otherwise don't program the timer.
> +     */
>
"Otherwise" what?

Maybe:

/*
 * If there are vcpus left in the replenishment event list,
 * set the next replenishment to happen at the deadline of
 * the one at the front.
 */

> +    if ( !list_empty(replq) )
> +        set_timer(repl_timer, replq_elem(replq->next)-
> >cur_deadline);
> +
> +    spin_unlock_irq(&prv->lock);
> +}
> +

Thanks and Regads,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

  parent reply	other threads:[~2016-03-16 14:25 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-15  0:04 Tianyang Chen
2016-03-16  3:11 ` Meng Xu
2016-03-16  3:32   ` Chen, Tianyang
2016-03-16  3:40     ` Meng Xu
2016-03-16 10:23       ` Dario Faggioli
2016-03-16 14:20         ` Meng Xu
2016-03-16 14:44           ` Dario Faggioli
2016-03-16 20:51             ` Meng Xu
2016-03-16 14:25 ` Dario Faggioli [this message]
2016-03-16 15:43   ` Chen, Tianyang
2016-03-16 16:11     ` Dario Faggioli
2016-03-16 20:54     ` Meng Xu
2016-03-16 20:45   ` Meng Xu
2016-03-17  2:24   ` Chen, Tianyang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1458138307.3102.901.camel@citrix.com \
    --to=dario.faggioli@citrix.com \
    --cc=dgolomb@seas.upenn.edu \
    --cc=george.dunlap@citrix.com \
    --cc=mengxu@cis.upenn.edu \
    --cc=tiche@seas.upenn.edu \
    --cc=xen-devel@lists.xenproject.org \
    --subject='Re: [PATCH v9]xen: sched: convert RTDS from time to event driven model' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).