From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from ozlabs.org (ozlabs.org [IPv6:2401:3900:2:1::2])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by lists.ozlabs.org (Postfix) with ESMTPS id 4875E1A00FE
	for ; Thu, 4 Sep 2014 19:34:40 +1000 (EST)
Received: from mail-we0-x22e.google.com (mail-we0-x22e.google.com [IPv6:2a00:1450:400c:c03::22e])
	(using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits))
	(No client certificate requested)
	by ozlabs.org (Postfix) with ESMTPS id 8FE941400E9
	for ; Thu, 4 Sep 2014 19:34:38 +1000 (EST)
Received: by mail-we0-f174.google.com with SMTP id u57so9880904wes.5
	for ; Thu, 04 Sep 2014 02:34:34 -0700 (PDT)
Sender: Daniel Vetter
Date: Thu, 4 Sep 2014 11:34:54 +0200
From: Daniel Vetter
To: Thomas Hellstrom
Subject: Re: TTM placement & caching issue/questions
Message-ID: <20140904093454.GG15520@phenom.ffwll.local>
References: <1409789547.30640.136.camel@pasglop> <54081844.7000604@vmware.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
In-Reply-To: <54081844.7000604@vmware.com>
Cc: dri-devel@lists.freedesktop.org, Michel Danzer , linuxppc-dev@ozlabs.org
List-Id: Linux on PowerPC Developers Mail List

On Thu, Sep 04, 2014 at 09:44:04AM +0200, Thomas Hellstrom wrote:
> Last time I tested, (and it seems like Michel is on the same track),
> writing with the CPU to write-combined memory was substantially faster
> than writing to cached memory, with the additional side-effect that CPU
> caches are left unpolluted.
>
> Moreover (although only tested on Intel's embedded chipsets), texturing
> from cpu-cache-coherent PCI memory was a real GPU performance hog
> compared to texturing from non-snooped memory. Hence, whenever a buffer
> could be classified as GPU-read-only (or almost at least), it should be
> placed in write-combined memory.

Just a quick comment since this explicitly refers to Intel chips: On
desktop/laptop chips with the big shared L3/L4 caches it's the other way
round. Cached uploads are substantially faster than WC, and not using
coherent access is a severe perf hit for texturing. I guess the hw guys
worked really hard to hide the snooping costs so that the GPU can benefit
from the massive bandwidth these caches can provide.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
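
The two reports in this message (write-combined wins on Intel's embedded chipsets, cached wins on desktop/laptop chips with a big shared L3/L4) can be read as a per-platform placement rule. The following is only a rough sketch of that reading, not code from TTM or any driver: the names `Caching`, `Platform`, `shared_llc` and `pick_caching` are hypothetical, and the fallback for buffers that are not GPU-read-mostly is an assumption the thread does not cover.

```python
from dataclasses import dataclass
from enum import Enum


class Caching(Enum):
    CACHED = "cached"        # CPU-cache-coherent (snooped) mapping
    WRITE_COMBINED = "wc"    # uncached, write-combined mapping


@dataclass(frozen=True)
class Platform:
    # Hypothetical flag: True for chips where the GPU shares a large
    # last-level (L3/L4) cache with the CPU.
    shared_llc: bool


def pick_caching(platform: Platform, gpu_read_mostly: bool) -> Caching:
    """Pick a caching mode for a buffer, following the two reports above."""
    if platform.shared_llc:
        # Reported for desktop/laptop chips: cached uploads beat WC and
        # non-coherent access is a severe hit for texturing.
        return Caching.CACHED
    if gpu_read_mostly:
        # Reported for embedded chipsets: WC uploads are faster and
        # texturing from non-snooped memory is cheaper.
        return Caching.WRITE_COMBINED
    # Assumption: neither report covers this case; default to cached.
    return Caching.CACHED
```

For example, `pick_caching(Platform(shared_llc=False), gpu_read_mostly=True)` selects write-combined, while the same buffer on a platform with `shared_llc=True` stays cached. A real driver would presumably base this on measurements per chip generation rather than a single flag.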