From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S1753332AbbFHOmv (ORCPT );
    Mon, 8 Jun 2015 10:42:51 -0400
Received: from bombadil.infradead.org ([198.137.202.9]:42374 "EHLO
    bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
    with ESMTP id S1752516AbbFHOmo (ORCPT );
    Mon, 8 Jun 2015 10:42:44 -0400
Date: Mon, 8 Jun 2015 16:42:31 +0200
From: Peter Zijlstra
To: Oleg Nesterov
Cc: umgwanakikbuti@gmail.com, mingo@elte.hu, ktkhai@parallels.com,
    rostedt@goodmis.org, tglx@linutronix.de, juri.lelli@gmail.com,
    pang.xunlei@linaro.org, wanpeng.li@linux.intel.com,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH 08/14] hrtimer: Allow hrtimer::function() to free the timer
Message-ID: <20150608144231.GL3644@twins.programming.kicks-ass.net>
References: <20150605084836.364306429@infradead.org>
    <20150605085205.723058588@infradead.org>
    <20150607223317.GA5193@redhat.com>
    <20150608091417.GM19282@twins.programming.kicks-ass.net>
    <20150608124234.GW18673@twins.programming.kicks-ass.net>
    <20150608142749.GB13168@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150608142749.GB13168@redhat.com>
User-Agent: Mutt/1.5.21 (2012-12-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jun 08, 2015 at 04:27:49PM +0200, Oleg Nesterov wrote:
> On 06/08, Peter Zijlstra wrote:
> >
> > On Mon, Jun 08, 2015 at 11:14:17AM +0200, Peter Zijlstra wrote:
> > > > Finally. Suppose that timer->function() returns HRTIMER_RESTART
> > > > and hrtimer_active() is called right after __run_hrtimer() sets
> > > > cpu_base->running = NULL. I can't understand why hrtimer_active()
> > > > can't miss ENQUEUED in this case. We have wmb() in between, yes,
> > > > but then hrtimer_active() should do something like
> > > >
> > > >     active = cpu_base->running == timer;
> > > >     if (!active) {
> > > >         rmb();
> > > >         active = state != HRTIMER_STATE_INACTIVE;
> > > >     }
> > > >
> > > > No?
> > >
> > > Hmm, good point. Let me think about that. It would be nice to be able to
> > > avoid more memory barriers.
>
> Yes, but otoh, can't we avoid seqcount_t altogether?
>
> To remind, we assume that
>
>     - "false positive" is fine. If we observe ENQUEUED or ->running
>       we can safely return true. It doesn't matter if the timer becomes
>       "inactive" right after return.
>
>     - we need to fix migrate_hrtimer_list() and __hrtimer_start_range_ns()
>       to preserve ENQUEUED. This fixes the races with hrtimer_is_queued()
>       and hrtimer_active() we currently have.
>
> Now, can't we simply do
>
>     __run_hrtimer()
>     {
>
>         cpu_base->running = timer;
>
>         wmb();                          // 1
>
>         __remove_hrtimer(INACTIVE);     // clears ENQUEUED
>
>         fn();                           // autorearm can set ENQUEUED again
>
>         wmb();                          // 2
>
>         cpu_base->running = NULL;       // XXX
>     }
>
>     hrtimer_active(timer)
>     {
>         if (timer->state & ENQUEUED)
>             return true;
>
>         rmb();                          // pairs with 1
>
>
>         // We do not care if we race with __hrtimer_start_range_ns().
>         // The running timer can't change its base.
>         // If it was ENQUEUED, we rely on the previous check.
>
>         base = timer->base->cpu_base;
>         read_barrier_depends();
>         if (base->running == timer)
>             return true;
>
>         rmb();                          // pairs with 2
>
>         // Avoid the race with auto-rearming timer. If we see the
>         // result of XXX above we should also see ENQUEUED if it
>         // was set by __run_hrtimer() or timer->function().
>         //
>         // We do not care if another thread does hrtimer_start()
>         // and we miss ENQUEUED. In this case we can the "inactive"
>         // window anyway, we can pretend that hrtimer_start() was
>         // called after XXX above. So we can equally pretend that
>         // hrtimer_active() was called in this window.
>         //
>         if (timer->state & ENQUEUED)
>             return true;
>
>         return false;
>     }
>
> Most probably I missed something... I'll try to think more, but perhaps
> you see a hole immediately?

This is something I proposed earlier; Kirill said:

  lkml.kernel.org/r/2134411433408823@web8j.yandex.ru

Which I read like the below, imagine our timer expires periodically and
rearms itself:

  acquire
  cpu_base->running = timer;
  wmb
  timer->state = INACTIVE;
  release
                                  [R] timer->state (== INACTIVE)
  fn()
  acquire
  timer->state = ACTIVE
  wmb
  cpu_base->running = NULL
  release

                                  [R] cpu_base->running (== NULL)

  acquire
  cpu_base->running = timer;
  wmb
  timer->state = INACTIVE;
  release

                                  [R] timer->state (== INACTIVE)

  fn()
  acquire
  timer->state = ACTIVE
  wmb
  cpu_base->running = NULL
  release


And we have a false negative.