From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753916AbcFPQQa (ORCPT ); Thu, 16 Jun 2016 12:16:30 -0400 Received: from bombadil.infradead.org ([198.137.202.9]:40500 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751530AbcFPQQ3 (ORCPT ); Thu, 16 Jun 2016 12:16:29 -0400 Date: Thu, 16 Jun 2016 18:16:03 +0200 From: Peter Zijlstra To: Yuyang Du Cc: Chris Wilson , Andrey Ryabinin , Linus Torvalds , Mike Galbraith , Thomas Gleixner , bsegall@google.com, morten.rasmussen@arm.com, pjt@google.com, steve.muckle@linaro.org, linux-kernel@vger.kernel.org, kernel@kyup.com Subject: Re: Divide-by-zero in post_init_entity_util_avg Message-ID: <20160616161603.GH30927@twins.programming.kicks-ass.net> References: <20160609090142.GS32344@nuc-i3427.alporthouse.com> <20160609013324.GH8105@intel.com> <20160609130750.GQ30909@twins.programming.kicks-ass.net> <20160616085040.GF30927@twins.programming.kicks-ass.net> <20160616122504.GG30927@twins.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160616122504.GG30927@twins.programming.kicks-ass.net> User-Agent: Mutt/1.5.23.1 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jun 16, 2016 at 02:25:04PM +0200, Peter Zijlstra wrote: > On Thu, Jun 16, 2016 at 10:50:40AM +0200, Peter Zijlstra wrote: > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > > index f75930bdd326..3fd3d903e6b6 100644 > > --- a/kernel/sched/fair.c > > +++ b/kernel/sched/fair.c > > @@ -2878,6 +2878,20 @@ static inline void cfs_rq_util_change(struct cfs_rq *cfs_rq) > > } > > } > > > > +/* > > + * Explicitly do a load-store to ensure the temporary value never hits memory. > > + * This allows lockless observations without ever seeing the negative values. > > + * > > + * Incidentally, this also generates much saner code for x86. > > + */ > > +#define sub_positive(type, ptr, val) do { \ > > + type tmp = READ_ONCE(*ptr); \ > > + tmp -= (val); \ > > + if (tmp < 0) \ > > + tmp = 0; \ > > + WRITE_ONCE(*ptr, tmp); \ > > +} while (0) > > + > > /* Group cfs_rq's load_avg is used for task_h_load and update_cfs_share */ > > static inline int > > update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq, bool update_freq) > > @@ -2887,15 +2901,15 @@ update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq, bool update_freq) > > > > if (atomic_long_read(&cfs_rq->removed_load_avg)) { > > s64 r = atomic_long_xchg(&cfs_rq->removed_load_avg, 0); > > - sa->load_avg = max_t(long, sa->load_avg - r, 0); > > - sa->load_sum = max_t(s64, sa->load_sum - r * LOAD_AVG_MAX, 0); > > + sub_positive(long, &sa->load_avg, r); > > + sub_positive(s64, &sa->load_sum, r * LOAD_AVG_MAX); > > Hmm, so either we should change these variables to signed types as > forced here, or this logic (along with the former) is plain wrong. > > As it stands any unsigned value with the MSB set will wipe the field > after this subtraction. > > I suppose instead we'd want something like: > > tmp = READ_ONCE(*ptr); > if (tmp > val) > tmp -= val; > else > tmp = 0; > WRITE_ONCE(*ptr, tmp); Stackoverflow suggested this pattern for (unsigned) underflow checking: r = a - b; if ((r = a - b) > a) underflow() should generate the right asm, but no, that doesn't work either. http://stackoverflow.com/questions/24958469/subtract-and-detect-underflow-most-efficient-way-x86-64-with-gcc