From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751907Ab3J3RQA (ORCPT ); Wed, 30 Oct 2013 13:16:00 -0400 Received: from e06smtp16.uk.ibm.com ([195.75.94.112]:53995 "EHLO e06smtp16.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751125Ab3J3RP7 (ORCPT ); Wed, 30 Oct 2013 13:15:59 -0400 In-Reply-To: <20131030153931.GM16117@laptop.programming.kicks-ass.net> References: <20131025173749.GG19466@laptop.lan> <20131028132634.GO19466@laptop.lan> <20131028163418.GD4126@linux.vnet.ibm.com> <20131028201735.GA15629@redhat.com> <20131030092725.GL4126@linux.vnet.ibm.com> <20131030112526.GI16117@laptop.programming.kicks-ass.net> <20131030153931.GM16117@laptop.programming.kicks-ass.net> Subject: Re: perf events ring buffer memory barrier on powerpc X-KeepSent: CA1DB4AC:7D4F6EEE-42257C14:0056F77D; type=4; name=$KeepSent To: Peter Zijlstra Cc: Anton Blanchard , Benjamin Herrenschmidt , Frederic Weisbecker , LKML , Linux PPC dev , Mathieu Desnoyers , Michael Ellerman , Michael Neuling , Oleg Nesterov , "Paul E. McKenney" X-Mailer: Lotus Notes Release 8.5.3 September 15, 2011 Message-ID: From: Victor Kaplansky Date: Wed, 30 Oct 2013 19:14:29 +0200 X-MIMETrack: Serialize by Router on D06ML319/06/M/IBM(Release 8.5.3FP5|July 31, 2013) at 30/10/2013 19:14:21 MIME-Version: 1.0 Content-type: text/plain; charset=US-ASCII X-TM-AS-MML: No X-Content-Scanned: Fidelis XPS MAILER x-cbid: 13103017-3548-0000-0000-000006FF6B7E Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Peter Zijlstra wrote on 10/30/2013 05:39:31 PM: > Although I suppose speculative reads are allowed -- they don't have the > destructive behaviour speculative writes have -- and thus we could in > fact get reorder issues. I agree. > > But since it is still a dependent load in that we do that @tail vs @head > comparison before doing other loads, wouldn't a read_barrier_depends() > be sufficient? Or do we still need a complete rmb? We need a complete rmb() here IMO. I think there is a fundamental difference between load and stores in this aspect. Load are allowed to be hoisted by compiler or executed speculatively by HW. To prevent load "*(ubuf->data + tail)" to be hoisted beyond "ubuf->head" load you would need something like this: void ubuf_read(void) { u64 head, tail; tail = ubuf->tail; head = ACCESS_ONCE(ubuf->head); /* * Ensure we read the buffer boundaries before the actual buffer * data... */ while (tail != head) { smp_read_barrier_depends(); /* for Alpha */ obj = *(ubuf->data + head - 128); /* process obj */ tail += obj->size; tail %= ubuf->size; } /* * Ensure all data reads are complete before we issue the * ubuf->tail update; once that update hits, kbuf_write() can * observe and overwrite data. */ smp_mb(); /* D, matches with A */ ubuf->tail = tail; } (note that "head" is part of address calculation of obj load now). But, even in this demo example some "smp_read_barrier_depends()" before "obj = *(ubuf->data + head - 100);" is required for architectures like Alpha. Though, on more sane architectures "smp_read_barrier_depends()" will be translated to nothing. Regards, -- Victor From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e06smtp13.uk.ibm.com (e06smtp13.uk.ibm.com [195.75.94.109]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "e06smtp13.uk.ibm.com", Issuer "GeoTrust SSL CA" (not verified)) by ozlabs.org (Postfix) with ESMTPS id BEC182C0390 for ; Thu, 31 Oct 2013 04:15:58 +1100 (EST) Received: from /spool/local by e06smtp13.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Wed, 30 Oct 2013 17:15:55 -0000 Received: from b06cxnps4074.portsmouth.uk.ibm.com (d06relay11.portsmouth.uk.ibm.com [9.149.109.196]) by d06dlp01.portsmouth.uk.ibm.com (Postfix) with ESMTP id F321C17D805C for ; Wed, 30 Oct 2013 17:15:26 +0000 (GMT) Received: from d06av02.portsmouth.uk.ibm.com (d06av02.portsmouth.uk.ibm.com [9.149.37.228]) by b06cxnps4074.portsmouth.uk.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id r9UHEPCR61407310 for ; Wed, 30 Oct 2013 17:14:25 GMT Received: from d06av02.portsmouth.uk.ibm.com (localhost [127.0.0.1]) by d06av02.portsmouth.uk.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id r9UHEZav015733 for ; Wed, 30 Oct 2013 11:14:37 -0600 In-Reply-To: <20131030153931.GM16117@laptop.programming.kicks-ass.net> References: <20131025173749.GG19466@laptop.lan> <20131028132634.GO19466@laptop.lan> <20131028163418.GD4126@linux.vnet.ibm.com> <20131028201735.GA15629@redhat.com> <20131030092725.GL4126@linux.vnet.ibm.com> <20131030112526.GI16117@laptop.programming.kicks-ass.net> <20131030153931.GM16117@laptop.programming.kicks-ass.net> Subject: Re: perf events ring buffer memory barrier on powerpc To: Peter Zijlstra Message-ID: From: Victor Kaplansky Date: Wed, 30 Oct 2013 19:14:29 +0200 MIME-Version: 1.0 Content-type: text/plain; charset=US-ASCII Cc: Michael Neuling , Mathieu Desnoyers , LKML , Oleg Nesterov , Linux PPC dev , Anton Blanchard , Frederic Weisbecker , "Paul E. McKenney" List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Peter Zijlstra wrote on 10/30/2013 05:39:31 PM: > Although I suppose speculative reads are allowed -- they don't have the > destructive behaviour speculative writes have -- and thus we could in > fact get reorder issues. I agree. > > But since it is still a dependent load in that we do that @tail vs @head > comparison before doing other loads, wouldn't a read_barrier_depends() > be sufficient? Or do we still need a complete rmb? We need a complete rmb() here IMO. I think there is a fundamental difference between load and stores in this aspect. Load are allowed to be hoisted by compiler or executed speculatively by HW. To prevent load "*(ubuf->data + tail)" to be hoisted beyond "ubuf->head" load you would need something like this: void ubuf_read(void) { u64 head, tail; tail = ubuf->tail; head = ACCESS_ONCE(ubuf->head); /* * Ensure we read the buffer boundaries before the actual buffer * data... */ while (tail != head) { smp_read_barrier_depends(); /* for Alpha */ obj = *(ubuf->data + head - 128); /* process obj */ tail += obj->size; tail %= ubuf->size; } /* * Ensure all data reads are complete before we issue the * ubuf->tail update; once that update hits, kbuf_write() can * observe and overwrite data. */ smp_mb(); /* D, matches with A */ ubuf->tail = tail; } (note that "head" is part of address calculation of obj load now). But, even in this demo example some "smp_read_barrier_depends()" before "obj = *(ubuf->data + head - 100);" is required for architectures like Alpha. Though, on more sane architectures "smp_read_barrier_depends()" will be translated to nothing. Regards, -- Victor