From: Paolo Bonzini
Date: Fri, 18 May 2012 10:57:35 +0200
Subject: Re: [Qemu-devel] [PATCH 13/13] iommu: Add a memory barrier to DMA RW function
To: Benjamin Herrenschmidt
Cc: David Gibson, qemu-devel@nongnu.org, Anthony Liguori, "Michael S. Tsirkin"

On 18/05/2012 10:18, Benjamin Herrenschmidt wrote:
> On Fri, 2012-05-18 at 08:53 +0200, Paolo Bonzini wrote:
>
>> It depends on what semantics you attach to dma_mb.  In my opinion,
>> having a separate barrier for DMA is wrong, because you want the same
>> semantics on all architectures.
>>
>> The x86 requirements are roughly as follows:
>>
>> 1) it never needs explicit rmb and wmb (as long as you don't use
>> non-temporal stores etc.);
>>
>> 2) synchronized operations have an implicit mb before and after
>> (unlike LL/SC on PowerPC);
>>
>> 3) everywhere else, you need an mb.
>>
>> So, on x86 you have more or less an implicit wmb after each write and
>> an implicit rmb before a read.  This explains why these kinds of bugs
>> are very hard to see on x86 (or often impossible to see).
>
> So what you mean is that on x86, a read can pass a write (and
> vice-versa)?  Interesting... I didn't know that.

I have to look it up every time.  It takes several pages in the x86
manuals, but the important part is this:

0) Reads are not moved before older reads, and writes are not moved
before older writes.

1) Writes are not moved before older reads.

2) Reads may be moved before older writes to different locations, but
not before older writes to the same location.

3) Intra-processor forwarding is allowed: while a store is temporarily
held in a processor's store buffer, it can satisfy the processor's own
loads.

Rule 2 is why the classic store-buffering test (CPU0 does x = 1 and
then reads y, CPU1 does y = 1 and then reads x, both starting from
zero) can end with both CPUs reading zero on x86.

> I'm still tempted to add barriers in map and unmap as well, in the
> case where they don't bounce, to provide consistent semantics here,
> i.e. all accesses done between the map and unmap are ordered vs. all
> previous and subsequent accesses.  Ok with that?

[...]

> I will not add barriers to the various ld*/st* variants.

In theory you would need a memory barrier before the first ld/st and
one after the last...  Considering that virtio uses map/unmap, what
about leaving map/unmap and ld*_phys/st*_phys as the high-performance
unsafe API?  Then you can add barriers around ld*_pci_dma/st*_pci_dma.
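To make that split concrete, here is a minimal sketch of what the
barrier-wrapped accessors could look like, layered on top of the
"unsafe" ld*_phys/st*_phys routines.  The smp_mb() primitive and the
exact *_pci_dma names and signatures are assumptions for illustration,
not a committed API:

/* Sketch only: barrier-wrapped DMA accessors on top of the
 * ld*_phys/st*_phys routines.  smp_mb() stands in for whatever
 * full-barrier primitive this series ends up providing; the names and
 * signatures below are hypothetical.  (dev is unused here; the real
 * helpers would presumably route through the device's DMA context.)
 */
static inline uint32_t ldl_pci_dma(PCIDevice *dev, dma_addr_t addr)
{
    uint32_t val;

    smp_mb();                /* order against the CPU's earlier accesses */
    val = ldl_phys(addr);
    smp_mb();                /* order against the CPU's later accesses   */
    return val;
}

static inline void stl_pci_dma(PCIDevice *dev, dma_addr_t addr,
                               uint32_t val)
{
    smp_mb();                /* make earlier updates visible before a
                              * store the device may act upon            */
    stl_phys(addr, val);
}

That keeps the map/unmap path fast while giving device models that go
through the *_pci_dma helpers the ordering discussed above.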
>> x86 -> Load Seq_Cst:  mov           or  mfence; mov
>>        Store Seq_Cst: mov; mfence   or  mov
>>
>> ARM -> Load Seq_Cst:  ldr; dmb       or  dmb; ldr; dmb
>>        Store Seq_Cst: dmb; str; dmb  or  dmb; str
>>
>> PPC -> Load Seq_Cst:  sync; ld; cmp; bc; isync
>>        Store Seq_Cst: sync; st
>>
>> where cmp; bc; isync can be replaced by sync.
>
> Hrm, the cmp/bc/isync can be -very- expensive; we use a variant of
> that using twi to enforce complete execution of reads in our readX()
> accessors in the kernel, but I don't think I want to do that in qemu.

Ah, ok, thanks for explaining what cmp; bc; isync really is. :)

> The full sync should provide all the synchronization we need.

You mean "sync; ld; sync" for load and "sync; st" for store?  That
would do, yes.

Paolo
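For reference, that "sync; ld; sync" / "sync; st" discipline can be
approximated in portable C with GCC's __sync_synchronize() full
barrier, which compiles to sync on PPC, dmb on ARM, and typically
mfence on x86.  A sketch under those assumptions, not something this
thread settled on:

#include <stdint.h>

/* Sketch: sequentially consistent 32-bit load/store built from full
 * barriers, mirroring the mappings quoted above ("sync; ld; sync" for
 * loads, "sync; st" for stores on PPC).
 */
static inline uint32_t seq_cst_load32(const volatile uint32_t *p)
{
    uint32_t val;

    __sync_synchronize();    /* PPC: sync                                */
    val = *p;
    __sync_synchronize();    /* PPC: full sync instead of cmp; bc; isync */
    return val;
}

static inline void seq_cst_store32(volatile uint32_t *p, uint32_t val)
{
    __sync_synchronize();    /* PPC: sync                                */
    *p = val;
}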