From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932925AbcEQMYU (ORCPT ); Tue, 17 May 2016 08:24:20 -0400
Received: from mail-wm0-f46.google.com ([74.125.82.46]:37753 "EHLO
	mail-wm0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S932869AbcEQMYT (ORCPT ); Tue, 17 May 2016 08:24:19 -0400
Date: Tue, 17 May 2016 13:24:15 +0100
From: Matt Fleming
To: Yuyang Du
Cc: Peter Zijlstra, Ingo Molnar, linux-kernel@vger.kernel.org,
	Byungchul Park, Frederic Weisbecker, Luca Abeni,
	"Rafael J . Wysocki", Rik van Riel, Thomas Gleixner, Wanpeng Li,
	Mel Gorman, Mike Galbraith
Subject: Re: [RFC][PATCH 5/5] sched/core: Add debug code to catch missing update_rq_clock()
Message-ID: <20160517122415.GD21993@codeblueprint.co.uk>
References: <1463082593-27777-1-git-send-email-matt@codeblueprint.co.uk>
	<1463082593-27777-6-git-send-email-matt@codeblueprint.co.uk>
	<20160515021439.GC8790@intel.com>
	<20160516094638.GB6574@codeblueprint.co.uk>
	<20160516201109.GD8790@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160516201109.GD8790@intel.com>
User-Agent: Mutt/1.5.24+41 (02bc14ed1569) (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 17 May, at 04:11:09AM, Yuyang Du wrote:
> On Mon, May 16, 2016 at 10:46:38AM +0100, Matt Fleming wrote:
> >
> > No because if someone calls rq_clock() immediately after __schedule(),
> > or even immediately after we clear RQCF_ACT_SKIP in __schedule(), we
> > should trigger a warning since the clock has not actually been
> > updated.
>
> Well, I don't know how concurrent it can be, but aren't both update
> and read synchronized by rq->lock? So I don't understand the latter
> case, and the former should be addressed (missing its own update?).

I'm not talking about concurrency; when I said "someone" above, I was
referring to code. So, if the code looks like the following, either
now or in the future,

	static void __schedule(bool preempt)
	{
		...
		/* Clear RQCF_ACT_SKIP */
		rq->clock_update_flags = 0;

		...
		delta = rq_clock();
	}

I would expect to see a warning triggered, because we've read the rq
clock outside of the code area where we know it's safe to do so
without a clock update.

The solution for that bug may be as simple as rearranging the code,

	delta = rq_clock();
	...
	rq->clock_update_flags = 0;

but we definitely want to catch such bugs in the first instance.
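
To make the expected behaviour concrete, here is a rough userspace
sketch of the check. It is a toy model, not the patch itself. The
struct layout, the flag values, the RQCF_UPDATED name, the warning
counter and the two read_*() helpers are all assumptions made up for
illustration; only rq_clock(), update_rq_clock(), clock_update_flags
and RQCF_ACT_SKIP are names taken from the discussion above.

```c
#include <stdio.h>

/* Toy userspace model of the debug check; not kernel code. */
#define RQCF_ACT_SKIP	0x02u	/* name from the thread; value is illustrative */
#define RQCF_UPDATED	0x04u	/* hypothetical "clock has been updated" marker */

struct rq {
	unsigned int clock_update_flags;
	unsigned long long clock;
};

/* Hypothetical counter: number of clock reads with no recorded update. */
static int missing_update_warnings;

/* Record a new clock value and note that an update has happened. */
static void update_rq_clock(struct rq *rq, unsigned long long now)
{
	rq->clock = now;
	rq->clock_update_flags |= RQCF_UPDATED;
}

/* Read the clock, warning if neither an update nor an active skip is recorded. */
static unsigned long long rq_clock(struct rq *rq)
{
	if (!(rq->clock_update_flags & (RQCF_ACT_SKIP | RQCF_UPDATED))) {
		missing_update_warnings++;
		fprintf(stderr, "WARNING: rq_clock() without update_rq_clock()\n");
	}
	return rq->clock;
}

/* Buggy ordering: the flags are cleared before the clock is read. */
static unsigned long long read_after_clear(struct rq *rq)
{
	rq->clock_update_flags = 0;
	return rq_clock(rq);
}

/* Rearranged ordering: read the clock first, then clear the flags. */
static unsigned long long read_before_clear(struct rq *rq)
{
	unsigned long long delta = rq_clock(rq);

	rq->clock_update_flags = 0;
	return delta;
}
```

In this model, calling update_rq_clock() and then read_after_clear()
bumps the warning counter, while the same sequence with
read_before_clear() does not. Both return the same clock value, which
is exactly why the bad ordering is easy to miss without a check.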