* [PATCH v2 0/2] xen: Credit2: fix two issues from recently committed series.
From: Dario Faggioli @ 2016-07-19 15:33 UTC
To: xen-devel; +Cc: Andrew Cooper, Anshul Makkar, George Dunlap

v2 of <146892985892.30642.2392453881110942183.stgit@Solace.fritz.box>, as v1
was making things worse!

In fact, there was a bug in patch 1 which turned the ASSERT() from being
useless to being wrong, and it was actually triggering.

Sorry for the noise.

Regards,
Dario
---
Dario Faggioli (2):
      xen: credit2: fix two s_time_t handling issues in load balancing
      xen: credit2: fix potential issues in csched2_cpu_pick with tracing enabled

 xen/common/sched_credit2.c |   21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
* [PATCH v2 1/2] xen: credit2: fix two s_time_t handling issues in load balancing
From: Dario Faggioli @ 2016-07-19 15:33 UTC
To: xen-devel; +Cc: Andrew Cooper, Anshul Makkar, George Dunlap

both introduced in d205f8a7f48e2ec ("xen: credit2: rework
load tracking logic").

First, in __update_runq_load(), the ASSERT() was actually
useless. Let's instead check that the computed value of
the load has not overflowed (and hence gone negative).

While there, do that in __update_svc_load() as well.

Second, in balance_load(), cpus_max needs to be extended
in order to be correctly shifted, and the result compared
with an s_time_t value, without risking losing info.

Spotted by Coverity.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Anshul Makkar <anshul.makkar@citrix.com>
---
Changed from v1:
 * fixed a '> 0' which wanted to be '>= 0' in the ASSERT()-s;
 * cite Coverity in the changelog.
---
 xen/common/sched_credit2.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index b33ba7a..a55240f 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -656,7 +656,8 @@ __update_runq_load(const struct scheduler *ops,
     rqd->load += change;
     rqd->load_last_update = now;
 
-    ASSERT(rqd->avgload <= STIME_MAX && rqd->b_avgload <= STIME_MAX);
+    /* Overflow, capable of making the load look negative, must not occur. */
+    ASSERT(rqd->avgload >= 0 && rqd->b_avgload >= 0);
 
     if ( unlikely(tb_init_done) )
     {
@@ -714,6 +715,9 @@ __update_svc_load(const struct scheduler *ops,
     }
     svc->load_last_update = now;
 
+    /* Overflow, capable of making the load look negative, must not occur. */
+    ASSERT(svc->avgload >= 0);
+
     if ( unlikely(tb_init_done) )
     {
         struct {
@@ -1742,7 +1746,7 @@ retry:
          * If we're under 100% capacaty, only shift if load difference
          * is > 1.  otherwise, shift if under 12.5%
          */
-        if ( load_max < (cpus_max << prv->load_precision_shift) )
+        if ( load_max < ((s_time_t)cpus_max << prv->load_precision_shift) )
         {
             if ( st.load_delta < (1ULL << (prv->load_precision_shift +
                                            opt_underload_balance_tolerance)) )
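The two fixes described in this patch can be illustrated with a small standalone C sketch. It is not the Xen code: s_time_t is assumed here to be a signed 64-bit integer, the helper names and the shift amounts are hypothetical, and an unsigned 32-bit operand is used for the "narrow" case so that the wraparound is well defined in C (cpus_max in the patch is a plain integer variable). The point is only that a shift is evaluated in the width of its left operand, so the operand has to be widened before shifting, and that an overflowed signed 64-bit load would show up as a negative value.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stands in for Xen's s_time_t as a signed 64-bit integer. */
typedef int64_t s_time_t;

/*
 * Shift evaluated in 32-bit arithmetic: bits shifted past bit 31 are lost
 * before the result is ever converted to the wider type.
 * 'shift' must be smaller than 32 here.
 */
static s_time_t capacity_narrow(uint32_t cpus, unsigned int shift)
{
    return (s_time_t)(uint32_t)(cpus << shift);
}

/* Widen first, then shift: the full value survives the comparison. */
static s_time_t capacity_wide(uint32_t cpus, unsigned int shift)
{
    return (s_time_t)cpus << shift;
}

/*
 * The kind of check the new ASSERT()s perform: a load value that
 * overflowed its signed 64-bit storage would appear negative.
 */
static int load_is_sane(s_time_t avgload)
{
    return avgload >= 0;
}
```

With 8 CPUs and a (made-up) shift of 30, the narrow variant yields 0 while the widened one yields 8 * 2^30, which is why a comparison against an s_time_t load can go wrong without the cast.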
* Re: [PATCH v2 1/2] xen: credit2: fix two s_time_t handling issues in load balancing
From: George Dunlap @ 2016-07-20 9:33 UTC
To: Dario Faggioli; +Cc: xen-devel, Anshul Makkar, Andrew Cooper

On Tue, Jul 19, 2016 at 4:33 PM, Dario Faggioli
<dario.faggioli@citrix.com> wrote:
> both introduced in d205f8a7f48e2ec ("xen: credit2: rework
> load tracking logic").
>
> First, in __update_runq_load(), the ASSERT() was actually
> useless. Let's instead check that the computed value of
> the load has not overflowed (and hence gone negative).
>
> While there, do that in __update_svc_load() as well.
>
> Second, in balance_load(), cpus_max needs to be extended
> in order to be correctly shifted, and the result compared
> with an s_time_t value, without risking losing info.
>
> Spotted by Coverity.
>
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: George Dunlap <george.dunlap@citrix.com>

And queued.

> ---
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Anshul Makkar <anshul.makkar@citrix.com>
> ---
> Changed from v1:
>  * fixed a '> 0' which wanted to be '>= 0' in the ASSERT()-s;
>  * cite Coverity in the changelog.
> ---
>  xen/common/sched_credit2.c |    8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index b33ba7a..a55240f 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -656,7 +656,8 @@ __update_runq_load(const struct scheduler *ops,
>      rqd->load += change;
>      rqd->load_last_update = now;
>
> -    ASSERT(rqd->avgload <= STIME_MAX && rqd->b_avgload <= STIME_MAX);
> +    /* Overflow, capable of making the load look negative, must not occur. */
> +    ASSERT(rqd->avgload >= 0 && rqd->b_avgload >= 0);
>
>      if ( unlikely(tb_init_done) )
>      {
> @@ -714,6 +715,9 @@ __update_svc_load(const struct scheduler *ops,
>      }
>      svc->load_last_update = now;
>
> +    /* Overflow, capable of making the load look negative, must not occur. */
> +    ASSERT(svc->avgload >= 0);
> +
>      if ( unlikely(tb_init_done) )
>      {
>          struct {
> @@ -1742,7 +1746,7 @@ retry:
>           * If we're under 100% capacaty, only shift if load difference
>           * is > 1.  otherwise, shift if under 12.5%
>           */
> -        if ( load_max < (cpus_max << prv->load_precision_shift) )
> +        if ( load_max < ((s_time_t)cpus_max << prv->load_precision_shift) )
>          {
>              if ( st.load_delta < (1ULL << (prv->load_precision_shift +
>                                             opt_underload_balance_tolerance)) )
* [PATCH v2 2/2] xen: credit2: fix potential issues in csched2_cpu_pick with tracing enabled
From: Dario Faggioli @ 2016-07-19 15:34 UTC
To: xen-devel; +Cc: Andrew Cooper, Anshul Makkar, George Dunlap

In fact, when not finding a suitable runqueue in which to
place a vCPU, and hence using a fallback, we either:
 - don't issue any trace record (while we should),
 - risk underrunning when accessing the runqueues
   array, while preparing the trace record.

Fix both issues and, while there, also a couple of style
problems found nearby.

Spotted by Coverity.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Anshul Makkar <anshul.makkar@citrix.com>
---
Changes from v1:
 * cite Coverity in the changelog.
---
 xen/common/sched_credit2.c |   13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index a55240f..3009ff9 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -1443,7 +1443,8 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
     {
         /* We may be here because someone requested us to migrate. */
         __clear_bit(__CSFLAG_runq_migrate_request, &svc->flags);
-        return get_fallback_cpu(svc);
+        new_cpu = get_fallback_cpu(svc);
+        goto out;
     }
 
     /* First check to see if we're here because someone else suggested a place
@@ -1505,7 +1506,7 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
         if ( rqd_avgload < min_avgload )
         {
             min_avgload = rqd_avgload;
-            min_rqi=i;
+            min_rqi = i;
         }
     }
 
@@ -1520,20 +1521,20 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
         BUG_ON(new_cpu >= nr_cpu_ids);
     }
 
-out_up:
+ out_up:
     read_unlock(&prv->lock);
-
+ out:
     if ( unlikely(tb_init_done) )
     {
         struct {
             uint64_t b_avgload;
             unsigned vcpu:16, dom:16;
             unsigned rq_id:16, new_cpu:16;
-        } d;
-        d.b_avgload = prv->rqd[min_rqi].b_avgload;
+        } d;
         d.dom = vc->domain->domain_id;
         d.vcpu = vc->vcpu_id;
         d.rq_id = c2r(ops, new_cpu);
+        d.b_avgload = prv->rqd[d.rq_id].b_avgload;
         d.new_cpu = new_cpu;
         __trace_var(TRC_CSCHED2_PICKED_CPU, 1,
                     sizeof(d),
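The control-flow change in this patch can be sketched in isolation. The following is a toy model, not the Xen function: the runqueue table, the CPU-to-runqueue mapping (cpu_to_runq), the trace() helper and pick_cpu() itself are all hypothetical stand-ins. It shows the two points from the changelog: the fallback path jumps to a common exit label so that a trace record is still produced, and the record indexes the runqueue array through the CPU that was actually chosen rather than through a "minimum runqueue" index that may still hold its initial value of -1.

```c
#include <assert.h>

#define NR_RUNQ 2

/* Hypothetical per-runqueue load table (stands in for prv->rqd[]). */
static const long runq_load[NR_RUNQ] = { 100, 200 };

/* Toy mapping, assumed for illustration: two CPUs per runqueue. */
static int cpu_to_runq(int cpu)
{
    return cpu / 2;
}

/* Minimal trace sink so the effect of each call can be observed. */
static int traces_emitted;
static long last_traced_load;

static void trace(long load)
{
    traces_emitted++;
    last_traced_load = load;
}

/* lock_ok == 0 models the trylock failing, so a fallback CPU is used. */
static int pick_cpu(int lock_ok, int fallback_cpu)
{
    int min_rqi = -1;   /* only becomes valid if the search runs */
    int new_cpu;

    if ( !lock_ok )
    {
        new_cpu = fallback_cpu;
        goto out;       /* an early 'return' here would skip the trace */
    }

    min_rqi = 0;            /* pretend runqueue 0 is the least loaded */
    new_cpu = min_rqi * 2;  /* first CPU of that runqueue */

 out:
    /* Index via the chosen CPU; min_rqi may still be -1 on this path. */
    trace(runq_load[cpu_to_runq(new_cpu)]);
    return new_cpu;
}
```

Calling pick_cpu(0, 3) takes the fallback path, returns CPU 3 and still records the load of runqueue 1; indexing with min_rqi on that path would have read element -1 of the array.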
* Re: [PATCH v2 2/2] xen: credit2: fix potential issues in csched2_cpu_pick with tracing enabled
From: George Dunlap @ 2016-07-20 9:47 UTC
To: Dario Faggioli; +Cc: xen-devel, Anshul Makkar, Andrew Cooper

On Tue, Jul 19, 2016 at 4:34 PM, Dario Faggioli
<dario.faggioli@citrix.com> wrote:
> In fact, when not finding a suitable runqueue in which to
> place a vCPU, and hence using a fallback, we either:
>  - don't issue any trace record (while we should),
>  - risk underrunning when accessing the runqueues
>    array, while preparing the trace record.
>
> Fix both issues and, while there, also a couple of style
> problems found nearby.
>
> Spotted by Coverity.
>
> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> Cc: George Dunlap <george.dunlap@citrix.com>
> Cc: Anshul Makkar <anshul.makkar@citrix.com>
> ---
> Changes from v1:
>  * cite Coverity in the changelog.
> ---
>  xen/common/sched_credit2.c |   13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> index a55240f..3009ff9 100644
> --- a/xen/common/sched_credit2.c
> +++ b/xen/common/sched_credit2.c
> @@ -1443,7 +1443,8 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
>      {
>          /* We may be here because someone requested us to migrate. */
>          __clear_bit(__CSFLAG_runq_migrate_request, &svc->flags);
> -        return get_fallback_cpu(svc);
> +        new_cpu = get_fallback_cpu(svc);
> +        goto out;
>      }
>
>      /* First check to see if we're here because someone else suggested a place
> @@ -1505,7 +1506,7 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
>          if ( rqd_avgload < min_avgload )
>          {
>              min_avgload = rqd_avgload;
> -            min_rqi=i;
> +            min_rqi = i;
>          }
>      }
>
> @@ -1520,20 +1521,20 @@ csched2_cpu_pick(const struct scheduler *ops, struct vcpu *vc)
>          BUG_ON(new_cpu >= nr_cpu_ids);
>      }
>
> -out_up:
> + out_up:
>      read_unlock(&prv->lock);
> -
> + out:
>      if ( unlikely(tb_init_done) )
>      {
>          struct {
>              uint64_t b_avgload;
>              unsigned vcpu:16, dom:16;
>              unsigned rq_id:16, new_cpu:16;
> -        } d;
> -        d.b_avgload = prv->rqd[min_rqi].b_avgload;
> +        } d;
>          d.dom = vc->domain->domain_id;
>          d.vcpu = vc->vcpu_id;
>          d.rq_id = c2r(ops, new_cpu);
> +        d.b_avgload = prv->rqd[d.rq_id].b_avgload;

Hmm, actually -- is this unlocked access to the prv structure the best idea?

It looks like at the moment nothing bad should happen (as we don't
re-initialize a pcpu's entry in prv->runq_map[] to -1 when
de-initializing the pcpu), but if we ever *did*, then there'd be a
race condition we could possibly trip over.

Sorry for missing this during review.

What about having a local variable that we initialize to something
sensible (like 0 or -1) and setting it before the read_unlock()?

 -George
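The alternative suggested in this reply (sample the value into a local variable while the lock is still held, and let the tracing code use only that local copy) might look roughly like the sketch below. Everything in it is a hypothetical stand-in: the lock is modelled by a plain flag, rqd[] by a two-element array, and the -1 sentinel is just one of the "sensible" initial values mentioned above. It is not a proposed patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; in Xen these would be prv->rqd[] and prv->lock. */
struct runq {
    int64_t b_avgload;
};

static struct runq rqd[2] = { { 10 }, { 20 } };
static int lock_held;

static void read_lock_stub(void)   { lock_held = 1; }
static void read_unlock_stub(void) { lock_held = 0; }

/* All reads of shared state go through here, so the sketch can verify
 * that they only ever happen with the lock held. */
static int64_t read_b_avgload(int rqi)
{
    assert(lock_held);
    return rqd[rqi].b_avgload;
}

/*
 * Returns the value that would be placed in the trace record.
 * lock_ok == 0 models the fallback path, where the lock was never taken.
 */
static int64_t pick_and_trace(int lock_ok, int rqi)
{
    int64_t load = -1;  /* sentinel: "not sampled" */

    if ( lock_ok )
    {
        read_lock_stub();
        load = read_b_avgload(rqi);  /* sample while the lock is held */
        read_unlock_stub();
    }

    /* From here on only the local copy is used; no shared state is read. */
    return load;
}
```

On the fallback path the record simply carries the sentinel, which avoids both the unlocked read and any array indexing with a possibly stale runqueue id.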
Thread overview: 5+ messages
2016-07-19 15:33 [PATCH v2 0/2] xen: Credit2: fix two issues from recently committed series Dario Faggioli
2016-07-19 15:33 ` [PATCH v2 1/2] xen: credit2: fix two s_time_t handling issues in load balancing Dario Faggioli
2016-07-20  9:33   ` George Dunlap
2016-07-19 15:34 ` [PATCH v2 2/2] xen: credit2: fix potential issues in csched2_cpu_pick with tracing enabled Dario Faggioli
2016-07-20  9:47   ` George Dunlap