From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 5 Sep 2011 14:43:16 +1000
From: David Gibson
Message-ID: <20110905044316.GD30278@yookeroo.fritz.box>
References: <20110901163359.GB11620@redhat.com> <786649703.1049386.1314909069542.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com> <20110902154549.GA18368@redhat.com> <20110903144635.GD12965@yookeroo.fritz.box> <20110904091643.GA20795@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
In-Reply-To: <20110904091643.GA20795@redhat.com>
Subject: Re: [Qemu-devel] [PATCH] virtio: Make memory barriers be memory barriers
To: "Michael S. Tsirkin"
Cc: aliguori@us.ibm.com, aik@ozlabs.ru, rusty@rustcorp.com.au, qemu-devel@nongnu.org, agraf@suse.de, Paolo Bonzini

On Sun, Sep 04, 2011 at 12:16:43PM +0300, Michael S. Tsirkin wrote:
> On Sun, Sep 04, 2011 at 12:46:35AM +1000, David Gibson wrote:
> > On Fri, Sep 02, 2011 at 06:45:50PM +0300, Michael S. Tsirkin wrote:
> > > On Thu, Sep 01, 2011 at 04:31:09PM -0400, Paolo Bonzini wrote:
> > > > > > > Why not limit the change to ppc then?
> > > > > >
> > > > > > Because the bug is masked by the x86 memory model, but it is still
> > > > > > there even there conceptually.  It is not really true that x86 does
> > > > > > not need memory barriers, though it doesn't in this case:
> > > > > >
> > > > > > http://bartoszmilewski.wordpress.com/2008/11/05/who-ordered-memory-fences-on-an-x86/
> > > > > >
> > > > > > Paolo
> > > > >
> > > > > Right.
> > > > > To summarize, on x86 we probably want wmb and rmb to be a compiler
> > > > > barrier only.  Only mb might in theory need to be an mfence.
> > > >
> > > > No, wmb needs to be sfence and rmb needs to be lfence.  GCC does
> > > > not provide those, so they should become __sync_synchronize() too,
> > > > or you should use inline assembly.
> > > >
> > > > > But there might be reasons why that is not an issue either
> > > > > if we look closely enough.
> > > >
> > > > Since the ring buffers are not using locked instructions (no xchg
> > > > or cmpxchg) the barriers simply must be there, even on x86.  Whether
> > > > it works in practice is not interesting, only the formal model is
> > > > interesting.
> > > >
> > > > Paolo
> > >
> > > Well, can you describe an issue in virtio that lfence/sfence help solve
> > > in terms of a memory model please?
> > > Pls note that the guest uses the smp_ variants for barriers.
> >
> > Ok, so, I'm having a bit of trouble with the fact that I'm having to
> > argue the case that things the protocol requires to be memory
> > barriers actually *be* memory barriers on all platforms.
> >
> > I mean, argue for a richer set of barriers, with per-arch minimal
> > implementations instead of the large but portable hammer of
> > sync_synchronize, if you will.
>
> That's what I'm saying really.  On x86 the richer set of barriers
> need not insert code at all for both wmb and rmb macros.  All we
> might need is an 'optimization barrier' - e.g. linux does
> __asm__ __volatile__("": : :"memory")
> ppc needs something like sync_synchronize there.

But you're approaching this the wrong way around - correctness should
come first.  That is, we should first ensure that there is a sufficient
memory barrier to satisfy the protocol.  Then, *if* there is a
measurable performance improvement and *if* we can show that a weaker
barrier is sufficient on a given platform, then we can whittle it down
to a lighter barrier.

> > But just leaving them out on x86!?  Seriously, wtf?  Do you enjoy
> > having software that works chiefly by accident?
>
> I'm surprised at the controversy too.  People seem to argue, at the
> same time, that the x86 cpu does not reorder stores and that we need
> an sfence between stores to prevent the guest from seeing them out of
> order.

I don't know the x86 storage model well enough to definitively say that
the barrier is not necessary there - nor to say that it is necessary.
All I know is that the x86 model is quite strongly ordered, and I
assume that is why the lack of a barrier has not caused an observed
problem on x86.

Again, correctness first.  sync_synchronize should be a sufficient
barrier for wmb() on all platforms.  If you really don't want it, the
onus is on you to show that (a) it's safe to do so and (b) it's
actually worth it.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson