From mboxrd@z Thu Jan 1 00:00:00 1970
From: Daniel Vetter
Subject: Re: [PATCH 1/6] RFCish: write only mappings (aka non-blocking)
Date: Tue, 20 Sep 2011 13:06:43 +0200
References: <1316492706-31081-1-git-send-email-ben@bwidawsk.net>
In-Reply-To: <1316492706-31081-1-git-send-email-ben@bwidawsk.net>
To: Ben Widawsky
Cc: intel-gfx@lists.freedesktop.org
List-Id: intel-gfx@lists.freedesktop.org

On Mon, Sep 19, 2011 at 09:25:00PM -0700, Ben Widawsky wrote:
> I'm going to keep this short...
> Patch 5 is my test case.
> On Gen6 I see slightly better performance. On Gen5 I see really really
> improvements (like 3x) for non GTT write only maps over regular mmaps.
> GTT mappings don't really show any improvements as a whole.
>
> Better tests would be nice, but without more significant Mesa changes,
> or better benchmarks, I'm not sure how to get those. While I think
> these patches are mostly complete, ideas for better testing are very
> welcome. Also of course, general optimizations or pointing out my errors
> would be nice.

Ok, I'm gonna be the dense annoying bastard here:

- Can we stop calling these mappings write-only? Afaics the
  distinguishing feature is that they're non-blocking. And yes, current
  users only use non-blocking paths to upload data because the amount
  of data we're currently downloading is so small.
  Hence we can use one bo for each download without wasting too much
  space and still avoid unnecessary blocking. But I think this will
  change, e.g. with designs like sna that tightly integrate gpu and sw
  rendering. Or OpenCL.

- Why do we need any patches for gtt non-blocking mmaps? I've re-read
  our code, and afaics we're only calling wait_rendering from gem_fault
  if obj->gtt_space == NULL. I.e. there's no way the gpu is currently
  using the data and hence no way for us to block on it. I think the
  only thing needed is a small libdrm patch to enable non-blocking gtt
  mmaps

	void drm_intel_enable_non_blocking_gtt_mmap(obj)

  which sets a bit somewhere and moves the obj (once) into the gtt
  domain. And a corresponding change in gtt_mmap to disable the
  set_domain call. This only works as long as no one else accesses the
  object from the cpu domain, but afaics we'll use non-blocking mmaps
  only for unshared buffers, so that should be fine.

  I might also just be dense and not see the issue ...

- I'm sorry for having suggested implementing the clflush ioctl; I
  think it's a foolish idea now. Non-blocking mmaps are a performance
  optimization; needing to sync caches with clflush is very much the
  opposite. So I think we can dustbin this.

  Now non-blocking cpu mmaps make very much sense on llc/snooped buffer
  objects. So I think we actually need an ioctl to get obj->cache_level
  so userspace can decide whether it should use non-blocking gtt mmaps
  or (non-blocking) cpu mmaps. We might as well go full-circle, make
  Chris happy and merge the corresponding set_cache_level ioctl to
  enable snooped buffers on machines with ilk-like coherency (i.e. that
  atom thing I'm hearing about ...). But imo that's material for
  non-blocking mmaps, step 2.

Cheers, Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48
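
For illustration only, here is a rough sketch of the libdrm-side idea
from the second and third points, written against a mock buffer object
rather than the real drm_intel_bo. Every name in it (mock_bo,
mock_set_domain, mock_enable_non_blocking_gtt_mmap, mock_gtt_mmap,
mock_choose_map and the enums) is hypothetical and exists in neither
libdrm nor the kernel. The set_domain call is simulated with a counter
so the "move to the gtt domain once, then skip set_domain on every
later map" behaviour can be observed. The cache-level check assumes
userspace has already obtained obj->cache_level somehow, which is the
ioctl the mail asks for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real libdrm or i915 names. */
enum mock_domain { MOCK_DOMAIN_CPU, MOCK_DOMAIN_GTT };
enum mock_cache_level { MOCK_CACHE_NONE, MOCK_CACHE_LLC_OR_SNOOPED };
enum mock_map_kind { MOCK_MAP_GTT_NON_BLOCKING, MOCK_MAP_CPU_NON_BLOCKING };

struct mock_bo {
    enum mock_domain domain;
    enum mock_cache_level cache_level;
    bool non_blocking_gtt;   /* the "bit somewhere" from the mail */
    int set_domain_calls;    /* counts simulated set_domain calls */
};

/* Simulated set_domain: in a real driver this is the call that may block. */
static void mock_set_domain(struct mock_bo *bo, enum mock_domain domain)
{
    bo->domain = domain;
    bo->set_domain_calls++;
}

/* Set the flag and move the object into the gtt domain exactly once. */
void mock_enable_non_blocking_gtt_mmap(struct mock_bo *bo)
{
    assert(bo != NULL);
    if (bo->non_blocking_gtt)
        return;
    mock_set_domain(bo, MOCK_DOMAIN_GTT);
    bo->non_blocking_gtt = true;
}

/* Map path: skip set_domain once non-blocking mode is enabled. */
void mock_gtt_mmap(struct mock_bo *bo)
{
    assert(bo != NULL);
    if (!bo->non_blocking_gtt)
        mock_set_domain(bo, MOCK_DOMAIN_GTT);
    /* The actual mapping of the object would happen here. */
}

/* Pick cpu mmaps for llc/snooped objects, gtt mmaps otherwise. */
enum mock_map_kind mock_choose_map(const struct mock_bo *bo)
{
    assert(bo != NULL);
    return bo->cache_level == MOCK_CACHE_LLC_OR_SNOOPED
               ? MOCK_MAP_CPU_NON_BLOCKING
               : MOCK_MAP_GTT_NON_BLOCKING;
}
```

As in the mail, this only holds up if nothing else touches the object
from the cpu domain after the one-time move, so the sketch is meant
for unshared buffers.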