From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751907Ab3J3RQA (ORCPT ); Wed, 30 Oct 2013 13:16:00 -0400 Received: from e06smtp16.uk.ibm.com ([195.75.94.112]:53995 "EHLO e06smtp16.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751125Ab3J3RP7 (ORCPT ); Wed, 30 Oct 2013 13:15:59 -0400 In-Reply-To: <20131030153931.GM16117@laptop.programming.kicks-ass.net> References: <20131025173749.GG19466@laptop.lan> <20131028132634.GO19466@laptop.lan> <20131028163418.GD4126@linux.vnet.ibm.com> <20131028201735.GA15629@redhat.com> <20131030092725.GL4126@linux.vnet.ibm.com> <20131030112526.GI16117@laptop.programming.kicks-ass.net> <20131030153931.GM16117@laptop.programming.kicks-ass.net> Subject: Re: perf events ring buffer memory barrier on powerpc X-KeepSent: CA1DB4AC:7D4F6EEE-42257C14:0056F77D; type=4; name=$KeepSent To: Peter Zijlstra Cc: Anton Blanchard , Benjamin Herrenschmidt , Frederic Weisbecker , LKML , Linux PPC dev , Mathieu Desnoyers , Michael Ellerman , Michael Neuling , Oleg Nesterov , "Paul E. McKenney" X-Mailer: Lotus Notes Release 8.5.3 September 15, 2011 Message-ID: From: Victor Kaplansky Date: Wed, 30 Oct 2013 19:14:29 +0200 X-MIMETrack: Serialize by Router on D06ML319/06/M/IBM(Release 8.5.3FP5|July 31, 2013) at 30/10/2013 19:14:21 MIME-Version: 1.0 Content-type: text/plain; charset=US-ASCII X-TM-AS-MML: No X-Content-Scanned: Fidelis XPS MAILER x-cbid: 13103017-3548-0000-0000-000006FF6B7E Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Peter Zijlstra wrote on 10/30/2013 05:39:31 PM: > Although I suppose speculative reads are allowed -- they don't have the > destructive behaviour speculative writes have -- and thus we could in > fact get reorder issues. I agree. > > But since it is still a dependent load in that we do that @tail vs @head > comparison before doing other loads, wouldn't a read_barrier_depends() > be sufficient? Or do we still need a complete rmb? We need a complete rmb() here IMO. I think there is a fundamental difference between load and stores in this aspect. Load are allowed to be hoisted by compiler or executed speculatively by HW. To prevent load "*(ubuf->data + tail)" to be hoisted beyond "ubuf->head" load you would need something like this: void ubuf_read(void) { u64 head, tail; tail = ubuf->tail; head = ACCESS_ONCE(ubuf->head); /* * Ensure we read the buffer boundaries before the actual buffer * data... */ while (tail != head) { smp_read_barrier_depends(); /* for Alpha */ obj = *(ubuf->data + head - 128); /* process obj */ tail += obj->size; tail %= ubuf->size; } /* * Ensure all data reads are complete before we issue the * ubuf->tail update; once that update hits, kbuf_write() can * observe and overwrite data. */ smp_mb(); /* D, matches with A */ ubuf->tail = tail; } (note that "head" is part of address calculation of obj load now). But, even in this demo example some "smp_read_barrier_depends()" before "obj = *(ubuf->data + head - 100);" is required for architectures like Alpha. Though, on more sane architectures "smp_read_barrier_depends()" will be translated to nothing. Regards, -- Victor