From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752234AbcHKTcu (ORCPT ); Thu, 11 Aug 2016 15:32:50 -0400
Received: from mga01.intel.com ([192.55.52.88]:23560 "EHLO mga01.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751340AbcHKTcs (ORCPT ); Thu, 11 Aug 2016 15:32:48 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.28,506,1464678000"; d="scan'208";a="1034028728"
Subject: Re: [RESEND PATCH v4] x86/hpet: Reduce HPET counter read contention
To: Waiman Long, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin"
References: <1470853770-37625-1-git-send-email-Waiman.Long@hpe.com>
Cc: linux-kernel@vger.kernel.org, x86@kernel.org, Jiang Liu,
	Borislav Petkov, Andy Lutomirski, Prarit Bhargava, Scott J Norton,
	Douglas Hatch, Randy Wright, John Stultz
From: Dave Hansen
Message-ID: <57ACD2DE.6080306@intel.com>
Date: Thu, 11 Aug 2016 12:32:46 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.8.0
MIME-Version: 1.0
In-Reply-To: <1470853770-37625-1-git-send-email-Waiman.Long@hpe.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 08/10/2016 11:29 AM, Waiman Long wrote:
> +static cycle_t read_hpet(struct clocksource *cs)
> +{
> +	int seq;
> +
> +	seq = READ_ONCE(hpet_save.seq);
> +	if (!HPET_SEQ_LOCKED(seq)) {
...
> +	}
> +
> +	/*
> +	 * Wait until the locked sequence number changes which indicates
> +	 * that the saved HPET value is up-to-date.
> +	 */
> +	while (READ_ONCE(hpet_save.seq) == seq) {
> +		/*
> +		 * Since reading the HPET is much slower than a single
> +		 * cpu_relax() instruction, we use two here in an attempt
> +		 * to reduce the amount of cacheline contention in the
> +		 * hpet_save.seq cacheline.
> +		 */
> +		cpu_relax();
> +		cpu_relax();
> +	}
> +
> +	return (cycle_t)READ_ONCE(hpet_save.hpet);
> +}

It's a real bummer that this all has to be open-coded. I have to wonder
if there were any alternatives that you tried that were simpler.

Is READ_ONCE()/smp_store_release() really strong enough here? It
guarantees ordering, but you need ordering *and* a guarantee that your
write is visible to the reader. Don't you need actual barriers for
that? Otherwise, you might be seeing a stale HPET value, and the spin
loop that you did waiting for it to be up-to-date was worthless. The
seqlock code uses barriers, btw.

Also, since you're fundamentally reading a second-hand HPET value, does
that have any impact on the precision of the HPET as a timesource? Or,
is it so coarse already that this isn't an issue?
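
To make the ordering question above concrete, here is a minimal userspace
sketch of a release/acquire pairing. It uses C11 atomics, not the kernel's
READ_ONCE()/smp_store_release(), and the names in it (saved_counter,
publish, wait_for_update) are made up for illustration; none of them come
from the patch. It only shows the C11 guarantee that an acquire load which
observes the new sequence number also observes the value written before the
matching release store. It does not say how soon that store becomes
visible, and it is not a claim about what the patch actually needs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's hpet_save; not the kernel struct. */
struct saved_counter {
	_Atomic uint32_t seq;	/* bumped after each published value */
	_Atomic uint64_t value;	/* last published counter value */
};

static struct saved_counter saved;

/*
 * Publisher side: store the value first, then release-store the new
 * sequence number so the value store cannot be reordered after it.
 */
static void publish(struct saved_counter *s, uint64_t v, uint32_t new_seq)
{
	assert(s != NULL);
	atomic_store_explicit(&s->value, v, memory_order_relaxed);
	atomic_store_explicit(&s->seq, new_seq, memory_order_release);
}

/*
 * Waiter side: spin until seq differs from old_seq. The acquire load
 * pairs with the release store in publish(), so once the new seq is
 * seen, the value read afterwards is at least as new as the one
 * published with it.
 */
static uint64_t wait_for_update(struct saved_counter *s, uint32_t old_seq)
{
	assert(s != NULL);
	while (atomic_load_explicit(&s->seq, memory_order_acquire) == old_seq)
		;	/* a real spin loop would relax the CPU here */
	return atomic_load_explicit(&s->value, memory_order_relaxed);
}
```

In real use publish() and wait_for_update() would run on different
threads. Called back to back on one thread, publish(&saved, 42, 2)
followed by wait_for_update(&saved, 1) returns 42 without spinning.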