* [PATCH v2 0/2] xen: Credit2: fix two issues from recently committed series.
@ 2016-07-19 15:33 Dario Faggioli
2016-07-19 15:33 ` [PATCH v2 1/2] xen: credit2: fix two s_time_t handling issues in load balancing Dario Faggioli
2016-07-19 15:34 ` [PATCH v2 2/2] xen: credit2: fix potential issues in csched2_cpu_pick with tracing enabled Dario Faggioli
0 siblings, 2 replies; 5+ messages in thread
From: Dario Faggioli @ 2016-07-19 15:33 UTC
To: xen-devel; +Cc: Andrew Cooper, Anshul Makkar, George Dunlap
v2 of <146892985892.30642.2392453881110942183.stgit@Solace.fritz.box>, as v1
was making things worse!
In fact, there was a bug in patch 1 which turned the ASSERT() from being
useless to being wrong, and it was actually triggering.
Sorry for the noise.
Regards,
Dario
---
Dario Faggioli (2):
xen: credit2: fix two s_time_t handling issues in load balancing
xen: credit2: fix potential issues in csched2_cpu_pick with tracing enabled
xen/common/sched_credit2.c | 21 +++++++++++++--------
1 file changed, 13 insertions(+), 8 deletions(-)
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
* [PATCH v2 1/2] xen: credit2: fix two s_time_t handling issues in load balancing
From: Dario Faggioli @ 2016-07-19 15:33 UTC
To: xen-devel; +Cc: Andrew Cooper, Anshul Makkar, George Dunlap
Both issues were introduced in d205f8a7f48e2ec ("xen: credit2: rework
load tracking logic").
First, in __update_runq_load(), the ASSERT() was actually
useless: an s_time_t value can never exceed STIME_MAX, so
the check always passed. Let's instead check that the
computed value of the load has not overflowed (and hence
gone negative).
While there, do that in __update_svc_load() as well.
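As a standalone illustration (not the Xen code), the sketch below shows why the old condition could not fail and the new one can. It assumes s_time_t is a signed 64-bit type and STIME_MAX is its maximum; the typedef, the macro and the helper names here are stand-ins introduced for the example. A negative value plays the role of a load that has wrapped around.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for Xen's s_time_t and STIME_MAX (assumed: signed 64-bit). */
typedef int64_t s_time_t;
#define STIME_MAX INT64_MAX

/* Old condition: holds for every representable s_time_t, so it never fires. */
static bool old_check(s_time_t avgload)
{
    return avgload <= STIME_MAX;
}

/* New condition: fails once the value has ended up in the negative range. */
static bool new_check(s_time_t avgload)
{
    return avgload >= 0;
}

/* Hypothetical helper mirroring where the ASSERT() sits after an update. */
static s_time_t checked_load(s_time_t avgload)
{
    assert(new_check(avgload));
    return avgload;
}
```

With these definitions, old_check(-1) is true while new_check(-1) is false, which is the difference the patch relies on.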
Second, in balance_load(), cpus_max needs to be widened to
s_time_t before being shifted, so that the result can be
compared with an s_time_t value without risking losing info.
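The effect of the cast can be seen in a small standalone sketch. The declared type of cpus_max is not shown in this thread, so the example assumes a 32-bit unsigned value purely to keep the narrow shift well defined in C; the function names and the shift amounts are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for Xen's s_time_t (assumed: signed 64-bit). */
typedef int64_t s_time_t;

/* Shift done in the narrow type: bits pushed past bit 31 are dropped. */
static s_time_t capacity_narrow(uint32_t cpus_max, unsigned int shift)
{
    assert(shift < 31);
    return (s_time_t)(uint32_t)(cpus_max << shift);
}

/* Widen first, then shift, as the patch does: no bits are dropped. */
static s_time_t capacity_wide(uint32_t cpus_max, unsigned int shift)
{
    assert(shift < 31);
    return (s_time_t)cpus_max << shift;
}
```

For small inputs the two agree, for instance with cpus_max = 2 and shift = 18. With cpus_max = 4 and shift = 30 the narrow version yields 0, while the widened one yields 4294967296.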
Spotted by Coverity.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Anshul Makkar <anshul.makkar@citrix.com>
---
Changed from v1:
* fixed a '> 0' which wanted to be '>= 0' in the ASSERT()-s;
* cite Coverity in the changelog.
---
xen/common/sched_credit2.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index b33ba7a..a55240f 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -656,7 +656,8 @@ __update_runq_load(const struct scheduler *ops,
rqd->load += change;
rqd->load_last_update = now;
- ASSERT(rqd->avgload <= STIME_MAX && rqd->b_avgload <= STIME_MAX);
+ /* Overflow, capable of making the load look negative, must not occur. */
+ ASSERT(rqd->avgload >= 0 && rqd->b_avgload >= 0);
if ( unlikely(tb_init_done) )
{
@@ -714,6 +715,9 @@ __update_svc_load(const struct scheduler *ops,
}
svc->load_last_update = now;
+ /* Overflow, capable of making the load look negative, must not occur. */
+ ASSERT(svc->avgload >= 0);
+
if ( unlikely(tb_init_done) )
{
struct {
@@ -1742,7 +1746,7 @@ retry:
* If we're under 100% capacaty, only shift if load difference
* is > 1. otherwise, shift if under 12.5%
*/
- if ( load_max < (cpus_max << prv->load_precision_shift) )
+ if ( load_max < ((s_time_t)cpus_max << prv->load_precision_shift) )
{
if ( st.load_delta < (1ULL << (prv->load_precision_shift +
opt_underload_balance_tolerance)) )
* [PATCH v2 2/2] xen: credit2: fix potential issues in csched2_cpu_pick with tracing enabled
From: Dario Faggioli @ 2016-07-19 15:34 UTC
To: xen-devel; +Cc: Andrew Cooper, Anshul Makkar, George Dunlap
In fact, when we cannot find a suitable runqueue in which to
place a vCPU, and hence use a fallback, we either:
- don't issue any trace record (while we should), or
- risk underrunning the runqueues array while preparing
the trace record.
Fix both issues and, while there, also a couple of style
problems found nearby.
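The shape of the fix can be sketched outside of Xen as follows. Everything here is a toy model: the runq and pick_result structures, the cpu_to_runq() mapping (standing in for c2r()), the example data and the rule that runqueue i owns CPU i are all assumptions made for the illustration. The point is that the fallback path jumps to the common exit instead of returning early, and that the runqueue index used for the record is derived from the chosen CPU, not from min_rqi, which may still be -1 on that path.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_RUNQS 4

/* Toy runqueue: only the field the trace record needs. */
struct runq { long b_avgload; };

/* Example data for the usage note below. */
static const struct runq example_rqs[NR_RUNQS] = { {10}, {5}, {7}, {9} };

/* What the trace record would carry. */
struct pick_result { int new_cpu; int rq_id; long b_avgload; };

/* Hypothetical CPU-to-runqueue mapping, standing in for c2r(). */
static int cpu_to_runq(int cpu)
{
    return cpu % NR_RUNQS;
}

static struct pick_result pick_cpu(const struct runq *rqs,
                                   bool use_fallback, int fallback_cpu)
{
    struct pick_result res;
    int i, min_rqi = -1, new_cpu;

    if ( use_fallback )
    {
        /* No runqueue was examined: min_rqi is still -1 here. */
        new_cpu = fallback_cpu;
        goto out;
    }

    for ( i = 0; i < NR_RUNQS; i++ )
        if ( min_rqi < 0 || rqs[i].b_avgload < rqs[min_rqi].b_avgload )
            min_rqi = i;
    new_cpu = min_rqi;   /* toy rule: runqueue i owns CPU i */

 out:
    /* Common exit: the record is filled on both paths. */
    assert(new_cpu >= 0);
    res.new_cpu = new_cpu;
    res.rq_id = cpu_to_runq(new_cpu);
    res.b_avgload = rqs[res.rq_id].b_avgload;   /* never rqs[-1] */
    return res;
}
```

Calling pick_cpu(example_rqs, true, 6) takes the fallback path and still produces a record, with rq_id 2 and b_avgload 7; pick_cpu(example_rqs, false, 0) selects the least loaded toy runqueue, index 1.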
Spotted by Coverity.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Anshul Makkar <anshul.makkar@citrix.com>
---
Changes from v1:
* cite Coverity in the changelog.
---
xen/common/sched_credit2.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index a55240f..3009ff9 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -1443,7 +1443,8 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
{
/* We may be here because someone requested us to migrate. */
__clear_bit(__CSFLAG_runq_migrate_request, &svc->flags);
- return get_fallback_cpu(svc);
+ new_cpu = get_fallback_cpu(svc);
+ goto out;
}
/* First check to see if we're here because someone else suggested a place
@@ -1505,7 +1506,7 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
if ( rqd_avgload < min_avgload )
{
min_avgload = rqd_avgload;
- min_rqi=i;
+ min_rqi = i;
}
}
@@ -1520,20 +1521,20 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
BUG_ON(new_cpu >= nr_cpu_ids);
}
-out_up:
+ out_up:
read_unlock(&prv->lock);
-
+ out:
if ( unlikely(tb_init_done) )
{
struct {
uint64_t b_avgload;
unsigned vcpu:16, dom:16;
unsigned rq_id:16, new_cpu:16;
- } d;
- d.b_avgload = prv->rqd[min_rqi].b_avgload;
+ } d;
d.dom = vc->domain->domain_id;
d.vcpu = vc->vcpu_id;
d.rq_id = c2r(ops, new_cpu);
+ d.b_avgload = prv->rqd[d.rq_id].b_avgload;
d.new_cpu = new_cpu;
__trace_var(TRC_CSCHED2_PICKED_CPU, 1,
sizeof(d),
* Re: [PATCH v2 1/2] xen: credit2: fix two s_time_t handling issues in load balancing
From: George Dunlap @ 2016-07-20 9:33 UTC
To: Dario Faggioli; +Cc: xen-devel, Anshul Makkar, Andrew Cooper
On Tue, Jul 19, 2016 at 4:33 PM, Dario Faggioli
<dario.faggioli@citrix.com> wrote:
> Both issues were introduced in d205f8a7f48e2ec ("xen: credit2: rework
> load tracking logic").
>
> First, in __update_runq_load(), the ASSERT() was actually
> useless: an s_time_t value can never exceed STIME_MAX, so
> the check always passed. Let's instead check that the
> computed value of the load has not overflowed (and hence
> gone negative).
>
> While there, do that in __update_svc_load() as well.
>
> Second, in balance_load(), cpus_max needs to be widened to
> s_time_t before being shifted, so that the result can be
> compared with an s_time_t value without risking losing info.
>
> Spotted by Coverity.
>
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
And queued.
> ---
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Anshul Makkar <anshul.makkar@citrix.com>
> ---
> Changed from v1:
> * fixed a '> 0' which wanted to be '>= 0' in the ASSERT()-s;
> * cite Coverity in the changelog.
> ---
> xen/common/sched_credit2.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index b33ba7a..a55240f 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -656,7 +656,8 @@ __update_runq_load(const struct scheduler *ops,
> rqd->load += change;
> rqd->load_last_update = now;
>
> - ASSERT(rqd->avgload <= STIME_MAX && rqd->b_avgload <= STIME_MAX);
> + /* Overflow, capable of making the load look negative, must not occur. */
> + ASSERT(rqd->avgload >= 0 && rqd->b_avgload >= 0);
>
> if ( unlikely(tb_init_done) )
> {
> @@ -714,6 +715,9 @@ __update_svc_load(const struct scheduler *ops,
> }
> svc->load_last_update = now;
>
> + /* Overflow, capable of making the load look negative, must not occur. */
> + ASSERT(svc->avgload >= 0);
> +
> if ( unlikely(tb_init_done) )
> {
> struct {
> @@ -1742,7 +1746,7 @@ retry:
> * If we're under 100% capacaty, only shift if load difference
> * is > 1. otherwise, shift if under 12.5%
> */
> - if ( load_max < (cpus_max << prv->load_precision_shift) )
> + if ( load_max < ((s_time_t)cpus_max << prv->load_precision_shift) )
> {
> if ( st.load_delta < (1ULL << (prv->load_precision_shift +
> opt_underload_balance_tolerance)) )
>
>
* Re: [PATCH v2 2/2] xen: credit2: fix potential issues in csched2_cpu_pick with tracing enabled
From: George Dunlap @ 2016-07-20 9:47 UTC
To: Dario Faggioli; +Cc: xen-devel, Anshul Makkar, Andrew Cooper
On Tue, Jul 19, 2016 at 4:34 PM, Dario Faggioli
<dario.faggioli@citrix.com> wrote:
> In fact, when we cannot find a suitable runqueue in which to
> place a vCPU, and hence use a fallback, we either:
> - don't issue any trace record (while we should), or
> - risk underrunning the runqueues array while preparing
> the trace record.
>
> Fix both issues and, while there, also a couple of style
> problems found nearby.
>
> Spotted by Coverity.
>
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Anshul Makkar <anshul.makkar@citrix.com>
> ---
> Changes from v1:
> * cite Coverity in the changelog.
> ---
> xen/common/sched_credit2.c | 13 +++++++------
> 1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index a55240f..3009ff9 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -1443,7 +1443,8 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
> {
> /* We may be here because someone requested us to migrate. */
> __clear_bit(__CSFLAG_runq_migrate_request, &svc->flags);
> - return get_fallback_cpu(svc);
> + new_cpu = get_fallback_cpu(svc);
> + goto out;
> }
>
> /* First check to see if we're here because someone else suggested a place
> @@ -1505,7 +1506,7 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
> if ( rqd_avgload < min_avgload )
> {
> min_avgload = rqd_avgload;
> - min_rqi=i;
> + min_rqi = i;
> }
> }
>
> @@ -1520,20 +1521,20 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
> BUG_ON(new_cpu >= nr_cpu_ids);
> }
>
> -out_up:
> + out_up:
> read_unlock(&prv->lock);
> -
> + out:
> if ( unlikely(tb_init_done) )
> {
> struct {
> uint64_t b_avgload;
> unsigned vcpu:16, dom:16;
> unsigned rq_id:16, new_cpu:16;
> - } d;
> - d.b_avgload = prv->rqd[min_rqi].b_avgload;
> + } d;
> d.dom = vc->domain->domain_id;
> d.vcpu = vc->vcpu_id;
> d.rq_id = c2r(ops, new_cpu);
> + d.b_avgload = prv->rqd[d.rq_id].b_avgload;
Hmm, actually -- is this unlocked access to the prv structure the best
idea? It looks like at the moment nothing bad should happen (as we
don't re-initialize a pcpu's entry in prv->runq_map[] to -1 when
de-initializing the pcpu), but if we ever *did*, then there'd be a
race condition we could possibly trip over.
Sorry for missing this during review.
What about having a local variable that we initialize to something
sensible (like 0 or -1) and setting it before the read_unlock()?
-George
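One possible shape of the suggestion above, as a standalone sketch and not the code that was eventually committed: the value needed for the trace record is copied into a local variable while the lock is still held, and the local starts from a default for paths that never take the lock. The toy lock, the toy_priv structure, the example data and all function names are hypothetical; the accessor asserts that the lock is held, so an unlocked read would be caught.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy lock that only records whether it is held. */
struct toy_lock { bool held; };

static void toy_read_lock(struct toy_lock *l)
{
    assert(!l->held);
    l->held = true;
}

static void toy_read_unlock(struct toy_lock *l)
{
    assert(l->held);
    l->held = false;
}

/* Toy stand-in for the scheduler's private data. */
struct toy_priv {
    struct toy_lock lock;
    int64_t b_avgload[2];
};

/* Example data for the usage note below. */
static struct toy_priv example_prv = {
    .lock = { false },
    .b_avgload = { 100, 250 },
};

/* Shared state may only be read with the lock held. */
static int64_t read_b_avgload(const struct toy_priv *prv, int rqi)
{
    assert(prv->lock.held);
    return prv->b_avgload[rqi];
}

/* Returns the value that would go into the trace record. */
static int64_t pick_and_trace(struct toy_priv *prv, int rqi, bool use_fallback)
{
    int64_t traced = -1;   /* default for the path that skips the lock */

    if ( use_fallback )
        goto out;

    toy_read_lock(&prv->lock);
    traced = read_b_avgload(prv, rqi);   /* snapshot taken under the lock */
    toy_read_unlock(&prv->lock);

 out:
    /* Only the local copy is used from here on. */
    return traced;
}
```

Here pick_and_trace(&example_prv, 1, false) returns the snapshot 250, while pick_and_trace(&example_prv, 0, true) returns the default -1 without touching the shared data.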