From mboxrd@z Thu Jan  1 00:00:00 1970
From: Daniel Vetter
Date: Thu, 26 Apr 2018 11:20:44 +0200
Subject: Re: [Linaro-mm-sig] noveau vs arm dma ops
To: Russell King - ARM Linux
Cc: Christoph Hellwig, Linux Kernel Mailing List, amd-gfx list,
 "moderated list:DMA BUFFER SHARING FRAMEWORK", Jerome Glisse,
 iommu@lists.linux-foundation.org, dri-devel, Dan Williams,
 Thierry Reding, Logan Gunthorpe, Christian König, Linux ARM,
 "open list:DMA BUFFER SHARING FRAMEWORK"
In-Reply-To: <20180425225443.GQ16141@n2100.armlinux.org.uk>
References: <20180424184847.GA3247@infradead.org>
 <20180425054855.GA17038@infradead.org>
 <20180425064335.GB28100@infradead.org>
 <20180425074151.GA2271@ulmo>
 <20180425085439.GA29996@infradead.org>
 <20180425100429.GR25142@phenom.ffwll.local>
 <20180425153312.GD27076@infradead.org>
 <20180425225443.GQ16141@n2100.armlinux.org.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"

On Thu, Apr 26, 2018 at 12:54 AM, Russell King - ARM Linux wrote:
> On Wed, Apr 25, 2018 at 08:33:12AM -0700, Christoph Hellwig wrote:
>> On Wed, Apr 25, 2018 at 12:04:29PM +0200, Daniel Vetter wrote:
>> > - dma api hides the cache flushing requirements from us. GPUs love
>> >   non-snooped access, and worse give userspace control over that. We
>> >   want a strict separation between mapping stuff and flushing stuff.
>> >   With the IOMMU api we mostly have the former, but for the latter,
>> >   arch maintainers regularly tell us they won't allow it. So we have
>> >   drm_clflush.c.
>>
>> The problem is that an entirely separate cache flushing API is hard.
>> That being said, if you look at my generic dma-noncoherent API series,
>> it tries to move that way. So far it is in early stages and,
>> unfortunately, apparently rather buggy.
>
> And if folk want a cacheable mapping with explicit cache flushing, the
> cache flushing must not be defined in terms of "this is what the CPU
> seems to need", but from the point of view of a CPU with infinite
> prefetching, infinite caching and infinite capacity to perform
> writebacks of dirty cache lines at unexpected moments while the memory
> is mapped in a cacheable mapping.
>
> (The reason for that is that you're operating in a non-CPU-specific
> space, so you can't make any guarantees as to how much caching or
> prefetching the CPU will do - different CPUs will do different
> amounts.)
>
> So, for example, the sequence:
>
> GPU writes to memory
>                         CPU reads from cacheable memory
>
> if the memory was previously dirty (iow, the CPU has written to it),
> you need to flush the dirty cache lines _before_ the GPU writes happen;
> and since you don't know whether the CPU has speculatively prefetched,
> you also need to flush any prefetched cache lines before reading from
> the cacheable memory _after_ the GPU has finished writing.
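Agreed - and in streaming DMA API terms that's just the usual buffer
ownership dance. A minimal sketch of your sequence (dev, handle and len
are hypothetical: a struct device and a dma_addr_t from an earlier
dma_map_single() call; error handling and the actual GPU kick omitted):

    #include <linux/dma-mapping.h>

    static void gpu_writes_then_cpu_reads(struct device *dev,
                                          dma_addr_t handle, size_t len)
    {
            /* The CPU may hold dirty lines for this buffer: hand
             * ownership to the device first (your first cache op). */
            dma_sync_single_for_device(dev, handle, len, DMA_FROM_DEVICE);

            /* ... GPU writes to the buffer, driver waits for it to
             * complete ... */

            /* Hand ownership back to the CPU: this discards anything
             * the CPU speculatively prefetched in the meantime (your
             * second cache op). Only after this is a read through the
             * cacheable mapping safe. */
            dma_sync_single_for_cpu(dev, handle, len, DMA_FROM_DEVICE);
    }
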
> Also note that "flush" there can be "clean the cache", "clean and
> invalidate the cache" or "invalidate the cache" as appropriate - some
> CPUs are able to perform all three of those operations, and the
> appropriate one depends not only on where in the above sequence it's
> being used, but also on what the operations are.
>
> So, the above sequence could be:
>
>                         CPU invalidates cache for memory
>                                 (due to possible dirty cache lines)
> GPU writes to memory
>                         CPU invalidates cache for memory
>                                 (to get rid of any speculatively
>                                  prefetched lines)
>                         CPU reads from cacheable memory
>
> Yes, in the above case, _two_ cache operations are required to ensure
> correct behaviour.  However, if you know for certain that the memory
> was previously clean, then the first cache operation can be skipped.
>
> What I'm pointing out is that there's much more here than just "I want
> to flush the cache", which is what DRM currently seems to assume with
> the code in drm_cache.c.
>
> If we can agree on a set of interfaces that allows _proper_ use of
> these facilities, then there shouldn't be a problem.  The DMA API does
> that via its ideas about who owns a particular buffer (because of the
> above problem), and that's something which would need to be carried
> over to such a cache flushing API (it should be pretty obvious that
> having a GPU read or write memory while the cache for that memory is
> being cleaned will lead to unexpected results.)
>
> Also note that things get even more interesting in an SMP environment
> if cache operations aren't broadcast across the SMP cluster (which
> means cache operations have to be IPI'd to other CPUs.)
>
> The next issue, which I've brought up before, is what exposing cache
> flushing to userspace brings on architectures where it isn't already
> exposed.  As has been shown by Google Project Zero, this risks exposing
> those architectures to Spectre and Meltdown exploits where they weren't
> at such a risk before.  (I've pretty much shown here that you _do_
> need to control which cache lines get flushed to make these exploits
> work, and flushing the cache by reading lots of data in lieu of having
> the ability to explicitly flush bits of cache makes it very difficult
> to impossible for them to work.)

The above is already what we're implementing in i915, at least
conceptually (it all boils down to clflush instructions, because those
both invalidate and flush).

One architectural guarantee we're exploiting is that prefetched (and
hence non-dirty) cachelines will never get written back, but dropped
instead. We kinda need that, though: otherwise the cpu could randomly
corrupt the data the gpu is writing, and non-coherent access would just
not work on those platforms.

But aside from that, yes, we do an invalidate before reading, and a
flush after every write (or anything else that could leave dirty
cachelines behind). Plus a bit of tracking in the driver
(kernel/userspace do this together, with some hilariously bad evolved
semantics at least for i915, but oh well, we can't fix uapi mistakes)
to avoid redundant cacheline flushes/invalidates.

So ack.
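To spell out what "boils down to clflush" means - a minimal sketch, not
the actual i915 paths (those additionally track read/write domains so
they can skip redundant flushes; the helper name here is made up,
vaddr/size are a cacheable kernel mapping of a non-snooped buffer):

    #include <drm/drm_cache.h>
    #include <asm/barrier.h>

    static void cpu_cache_sync(void *vaddr, unsigned long size)
    {
            /* On x86, clflush both writes back dirty lines and
             * invalidates them, so the same call serves as the "flush"
             * before a GPU read and as the "invalidate" after a GPU
             * write. */
            drm_clflush_virt_range(vaddr, size);

            /* Order the flushes against whatever kicks off or retires
             * the GPU access (MMIO write, seqno read, ...). */
            wmb();
    }

The instruction itself is the easy part; the ugly part is the
bookkeeping deciding when you're allowed to skip it, which is where the
uapi-visible domain tracking comes in.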
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch