xen-devel.lists.xenproject.org archive mirror
* [PATCH v2 0/2] xen: Credit2: fix two issues from recently committed series.
@ 2016-07-19 15:33 Dario Faggioli
  2016-07-19 15:33 ` [PATCH v2 1/2] xen: credit2: fix two s_time_t handling issues in load balancing Dario Faggioli
  2016-07-19 15:34 ` [PATCH v2 2/2] xen: credit2: fix potential issues in csched2_cpu_pick with tracing enabled Dario Faggioli
  0 siblings, 2 replies; 5+ messages in thread
From: Dario Faggioli @ 2016-07-19 15:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Anshul Makkar, George Dunlap

v2 of <146892985892.30642.2392453881110942183.stgit@Solace.fritz.box>, as v1
was making things worse!

In fact, there was a bug in patch 1 which turned the ASSERT() from being
useless to being wrong, and it was actually triggering.

Sorry for the noise.

Regards,
Dario
---
Dario Faggioli (2):
      xen: credit2: fix two s_time_t handling issues in load balancing
      xen: credit2: fix potential issues in csched2_cpu_pick with tracing enabled

 xen/common/sched_credit2.c |   21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [PATCH v2 1/2] xen: credit2: fix two s_time_t handling issues in load balancing
  2016-07-19 15:33 [PATCH v2 0/2] xen: Credit2: fix two issues from recently committed series Dario Faggioli
@ 2016-07-19 15:33 ` Dario Faggioli
  2016-07-20  9:33   ` George Dunlap
  2016-07-19 15:34 ` [PATCH v2 2/2] xen: credit2: fix potential issues in csched2_cpu_pick with tracing enabled Dario Faggioli
  1 sibling, 1 reply; 5+ messages in thread
From: Dario Faggioli @ 2016-07-19 15:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Anshul Makkar, George Dunlap

Both issues were introduced in d205f8a7f48e2ec ("xen: credit2: rework
load tracking logic").

First, in __update_runq_load(), the ASSERT() was actually
useless. Let's instead check that the computed value of
the load has not overflowed (and hence gone negative).

While there, do that in __update_svc_load() as well.

Second, in balance_load(), cpus_max needs to be extended
before being shifted, so that the result can be compared
with an s_time_t value without risking losing information.

Spotted by Coverity.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Anshul Makkar <anshul.makkar@citrix.com>
---
Changed from v1:
 * fixed a '> 0' which wanted to be '>= 0' in the ASSERT()-s;
 * cite Coverity in the changelog.
---
 xen/common/sched_credit2.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index b33ba7a..a55240f 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -656,7 +656,8 @@ __update_runq_load(const struct scheduler *ops,
     rqd->load += change;
     rqd->load_last_update = now;
 
-    ASSERT(rqd->avgload <= STIME_MAX && rqd->b_avgload <= STIME_MAX);
+    /* Overflow, capable of making the load look negative, must not occur. */
+    ASSERT(rqd->avgload >= 0 && rqd->b_avgload >= 0);
 
     if ( unlikely(tb_init_done) )
     {
@@ -714,6 +715,9 @@ __update_svc_load(const struct scheduler *ops,
     }
     svc->load_last_update = now;
 
+    /* Overflow, capable of making the load look negative, must not occur. */
+    ASSERT(svc->avgload >= 0);
+
     if ( unlikely(tb_init_done) )
     {
         struct {
@@ -1742,7 +1746,7 @@ retry:
          * If we're under 100% capacaty, only shift if load difference
          * is > 1.  otherwise, shift if under 12.5%
          */
-        if ( load_max < (cpus_max << prv->load_precision_shift) )
+        if ( load_max < ((s_time_t)cpus_max << prv->load_precision_shift) )
         {
             if ( st.load_delta < (1ULL << (prv->load_precision_shift +
                                            opt_underload_balance_tolerance)) )



* [PATCH v2 2/2] xen: credit2: fix potential issues in csched2_cpu_pick with tracing enabled
  2016-07-19 15:33 [PATCH v2 0/2] xen: Credit2: fix two issues from recently committed series Dario Faggioli
  2016-07-19 15:33 ` [PATCH v2 1/2] xen: credit2: fix two s_time_t handling issues in load balancing Dario Faggioli
@ 2016-07-19 15:34 ` Dario Faggioli
  2016-07-20  9:47   ` George Dunlap
  1 sibling, 1 reply; 5+ messages in thread
From: Dario Faggioli @ 2016-07-19 15:34 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Anshul Makkar, George Dunlap

When no suitable runqueue is found for a vCPU, and a
fallback is used instead, we either:
 - don't issue any trace record (while we should), or
 - risk underrunning the runqueues array while
   preparing the trace record.

Fix both issues and, while there, also a couple of style
problems found nearby.

Spotted by Coverity.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Anshul Makkar <anshul.makkar@citrix.com>
---
Changes from v1:
 * cite Coverity in the changelog.
---
 xen/common/sched_credit2.c |   13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index a55240f..3009ff9 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -1443,7 +1443,8 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
     {
         /* We may be here because someone requested us to migrate. */
         __clear_bit(__CSFLAG_runq_migrate_request, &svc->flags);
-        return get_fallback_cpu(svc);
+        new_cpu = get_fallback_cpu(svc);
+        goto out;
     }
 
     /* First check to see if we're here because someone else suggested a place
@@ -1505,7 +1506,7 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
         if ( rqd_avgload < min_avgload )
         {
             min_avgload = rqd_avgload;
-            min_rqi=i;
+            min_rqi = i;
         }
     }
 
@@ -1520,20 +1521,20 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
         BUG_ON(new_cpu >= nr_cpu_ids);
     }
 
-out_up:
+ out_up:
     read_unlock(&prv->lock);
-
+ out:
     if ( unlikely(tb_init_done) )
     {
         struct {
             uint64_t b_avgload;
             unsigned vcpu:16, dom:16;
             unsigned rq_id:16, new_cpu:16;
-       } d;
-        d.b_avgload = prv->rqd[min_rqi].b_avgload;
+        } d;
         d.dom = vc->domain->domain_id;
         d.vcpu = vc->vcpu_id;
         d.rq_id = c2r(ops, new_cpu);
+        d.b_avgload = prv->rqd[d.rq_id].b_avgload;
         d.new_cpu = new_cpu;
         __trace_var(TRC_CSCHED2_PICKED_CPU, 1,
                     sizeof(d),



* Re: [PATCH v2 1/2] xen: credit2: fix two s_time_t handling issues in load balancing
  2016-07-19 15:33 ` [PATCH v2 1/2] xen: credit2: fix two s_time_t handling issues in load balancing Dario Faggioli
@ 2016-07-20  9:33   ` George Dunlap
  0 siblings, 0 replies; 5+ messages in thread
From: George Dunlap @ 2016-07-20  9:33 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Anshul Makkar, Andrew Cooper

On Tue, Jul 19, 2016 at 4:33 PM, Dario Faggioli
<dario.faggioli@citrix.com> wrote:
> both introduced in d205f8a7f48e2ec ("xen: credit2: rework
> load tracking logic").
>
> First, in __update_runq_load(), the ASSERT() was actually
> useless. Let's instead check that the computed value of
> the load has not overflowed (and hence gone negative).
>
> While there, do that in __update_svc_load() as well.
>
> Second, in balance_load(), cpus_max needs to be extended
> before being shifted, so that the result can be compared
> with an s_time_t value without risking losing information.
>
> Spotted by Coverity.
>
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: George Dunlap <george.dunlap@citrix.com>

And queued.

> ---
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Anshul Makkar <anshul.makkar@citrix.com>
> ---
> Changed from v1:
>  * fixed a '> 0' which wanted to be '>= 0' in the ASSERT()-s;
>  * cite Coverity in the changelog.
> ---
>  xen/common/sched_credit2.c |    8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index b33ba7a..a55240f 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -656,7 +656,8 @@ __update_runq_load(const struct scheduler *ops,
>      rqd->load += change;
>      rqd->load_last_update = now;
>
> -    ASSERT(rqd->avgload <= STIME_MAX && rqd->b_avgload <= STIME_MAX);
> +    /* Overflow, capable of making the load look negative, must not occur. */
> +    ASSERT(rqd->avgload >= 0 && rqd->b_avgload >= 0);
>
>      if ( unlikely(tb_init_done) )
>      {
> @@ -714,6 +715,9 @@ __update_svc_load(const struct scheduler *ops,
>      }
>      svc->load_last_update = now;
>
> +    /* Overflow, capable of making the load look negative, must not occur. */
> +    ASSERT(svc->avgload >= 0);
> +
>      if ( unlikely(tb_init_done) )
>      {
>          struct {
> @@ -1742,7 +1746,7 @@ retry:
>           * If we're under 100% capacaty, only shift if load difference
>           * is > 1.  otherwise, shift if under 12.5%
>           */
> -        if ( load_max < (cpus_max << prv->load_precision_shift) )
> +        if ( load_max < ((s_time_t)cpus_max << prv->load_precision_shift) )
>          {
>              if ( st.load_delta < (1ULL << (prv->load_precision_shift +
>                                             opt_underload_balance_tolerance)) )
>
>

* Re: [PATCH v2 2/2] xen: credit2: fix potential issues in csched2_cpu_pick with tracing enabled
  2016-07-19 15:34 ` [PATCH v2 2/2] xen: credit2: fix potential issues in csched2_cpu_pick with tracing enabled Dario Faggioli
@ 2016-07-20  9:47   ` George Dunlap
  0 siblings, 0 replies; 5+ messages in thread
From: George Dunlap @ 2016-07-20  9:47 UTC (permalink / raw)
  To: Dario Faggioli; +Cc: xen-devel, Anshul Makkar, Andrew Cooper

On Tue, Jul 19, 2016 at 4:34 PM, Dario Faggioli
<dario.faggioli@citrix.com> wrote:
> When no suitable runqueue is found for a vCPU, and a
> fallback is used instead, we either:
>  - don't issue any trace record (while we should), or
>  - risk underrunning the runqueues array while
>    preparing the trace record.
>
> Fix both issues and, while there, also a couple of style
> problems found nearby.
>
> Spotted by Coverity.
>
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Anshul Makkar <anshul.makkar@citrix.com>
> ---
> Changes from v1:
>  * cite Coverity in the changelog.
> ---
>  xen/common/sched_credit2.c |   13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index a55240f..3009ff9 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -1443,7 +1443,8 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
>      {
>          /* We may be here because someone requested us to migrate. */
>          __clear_bit(__CSFLAG_runq_migrate_request, &svc->flags);
> -        return get_fallback_cpu(svc);
> +        new_cpu = get_fallback_cpu(svc);
> +        goto out;
>      }
>
>      /* First check to see if we're here because someone else suggested a place
> @@ -1505,7 +1506,7 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
>          if ( rqd_avgload < min_avgload )
>          {
>              min_avgload = rqd_avgload;
> -            min_rqi=i;
> +            min_rqi = i;
>          }
>      }
>
> @@ -1520,20 +1521,20 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
>          BUG_ON(new_cpu >= nr_cpu_ids);
>      }
>
> -out_up:
> + out_up:
>      read_unlock(&prv->lock);
> -
> + out:
>      if ( unlikely(tb_init_done) )
>      {
>          struct {
>              uint64_t b_avgload;
>              unsigned vcpu:16, dom:16;
>              unsigned rq_id:16, new_cpu:16;
> -       } d;
> -        d.b_avgload = prv->rqd[min_rqi].b_avgload;
> +        } d;
>          d.dom = vc->domain->domain_id;
>          d.vcpu = vc->vcpu_id;
>          d.rq_id = c2r(ops, new_cpu);
> +        d.b_avgload = prv->rqd[d.rq_id].b_avgload;

Hmm, actually -- is this unlocked access to the prv structure the best
idea?  It looks like at the moment nothing bad should happen (as we
don't re-initialize a pcpu's entry in prv->runq_map[] to -1 when
de-initializing the pcpu), but if we ever *did*, then there'd be a
race condition we could possibly trip over.

Sorry for missing this during review.

What about having a local variable that we initialize to something
sensible (like 0 or -1) and setting it before the read_unlock()?

 -George

