From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932091AbbFHJOe (ORCPT ); Mon, 8 Jun 2015 05:14:34 -0400 Received: from casper.infradead.org ([85.118.1.10]:41084 "EHLO casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752064AbbFHJO0 (ORCPT ); Mon, 8 Jun 2015 05:14:26 -0400 Date: Mon, 8 Jun 2015 11:14:17 +0200 From: Peter Zijlstra To: Oleg Nesterov Cc: umgwanakikbuti@gmail.com, mingo@elte.hu, ktkhai@parallels.com, rostedt@goodmis.org, tglx@linutronix.de, juri.lelli@gmail.com, pang.xunlei@linaro.org, wanpeng.li@linux.intel.com, linux-kernel@vger.kernel.org Subject: Re: [PATCH 08/14] hrtimer: Allow hrtimer::function() to free the timer Message-ID: <20150608091417.GM19282@twins.programming.kicks-ass.net> References: <20150605084836.364306429@infradead.org> <20150605085205.723058588@infradead.org> <20150607223317.GA5193@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150607223317.GA5193@redhat.com> User-Agent: Mutt/1.5.21 (2012-12-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Jun 08, 2015 at 12:33:17AM +0200, Oleg Nesterov wrote: > Not sure I read this patch correctly, it doesn't apply to Linus's tree. I was working on tip/master, there's a number of timer patches in there. > And I simply can not understand the complication in hrtimer_active(), > please help! > > On 06/05, Peter Zijlstra wrote: > > > > +bool hrtimer_active(const struct hrtimer *timer) > > +{ > > + struct hrtimer_cpu_base *cpu_base; > > + unsigned int seq; > > + bool active; > > + > > + do { > > + active = false; > > + cpu_base = READ_ONCE(timer->base->cpu_base); > > + seq = raw_read_seqcount(&cpu_base->seq); > > + > > + if (timer->state != HRTIMER_STATE_INACTIVE || > > + cpu_base->running == timer) > > + active = true; > > Why we can't simply return true in this case? > > Unless you lock this timer, hrtimer_active() is inherently racy anyway. > Granted, it must not wrongly return False if the timer is pending or > running. > > But "false positive" does not differ from the case when (say) the > running timer->function() finishes right after hrtimer_active() returns > True. So the problem with: static inline bool hrtimer_active(const struct timer *timer) { return timer->state != HRTIMER_STATE_INACTIVE || timer->base->cpu_base->running == timer; } Is that is can indeed return false falsely. Consider __run_hrtimer: cpu_base->running = timer; __remove_hrtimer(..., HRTIMER_STATE_INACTIVE, ...); Our test could observe INACTIVE but not yet see the ->running store. In this case it would return false, even though it is in fact active. > > + } while (read_seqcount_retry(&cpu_base->seq, seq) || > > + cpu_base != READ_ONCE(timer->base->cpu_base)); > > Why do we need to re-check >cpu_base? Because each CPU's cpu_base has independent seq numbers, so if the timer migrates, the seq numbers might align just so that we fail to detect change, even though there was -- extremely unlikely, but possible. > I think we can ignore migrate_hrtimer_list(), it doesn't clear ->state. I'm inclined to agree, although I did not get around to staring at that on Friday and am currently in the process of waking up. > Otherwise the timer can change its ->base only if it is not running and > inactive, and again I think we should only eliminate the false negative > return. Agreed. > And I think there is a problem. Consider a timer TIMER which always > rearms itself using some "default" timeout. > > In this case __hrtimer_start_range_ns(&TIMER, ...) must preserve > hrtimer_active(&TIMER) == T. By definition, and currently this is > true. I must ask for a little clarification; is this the simple: ->function() hrtimer_forward_now(&self, timeout); return HRTIMER_RESTART; Or something that 'randomly' calls: hrtimer_start() on a timer? Because for the latter this isn't currently true for the same reason as you give here: > After this patch this is no longer true (afaics). If the timer is > pending but not running, __hrtimer_start_range_ns()->remove_hrtimer() > will clear ENQUEUED first, then set it again in enqueue_hrtimer(). That is so even with the current code; the current code uses: hrtimer->state & CALLBACK for __remove_hrtimer(.state). In the above case of a pending timer, that's 0 aka. HRTIMER_STATE_INACTIVE. > This means that hrtimer_active() returns false in between. And note > that it doesn't matter if the timer changes its ->base or not, so > that 2nd cpu_base above can't help. > > I think that __hrtimer_start_range_ns() should preserve ENQUEUED > like migrate_hrtimer_list() should do (see the previous email). I tend to agree, but I think its a pre-existing problem, not one introduced by my proposed patch. > Finally. Suppose that timer->function() returns HRTIMER_RESTART > and hrtimer_active() is called right after __run_hrtimer() sets > cpu_base->running = NULL. I can't understand why hrtimer_active() > can't miss ENQUEUED in this case. We have wmb() in between, yes, > but then hrtimer_active() should do something like > > active = cpu_base->running == timer; > if (!active) { > rmb(); > active = state != HRTIMER_STATE_INACTIVE; > } > > No? Hmm, good point. Let me think about that. It would be nice to be able to avoid more memory barriers.