From: Chris Wilson
Subject: Re: [PATCH 05/13] drm/i915: Insert a flush between batches if the breadcrumb was dropped
Date: Sat, 14 Jul 2012 11:24:33 +0100
Message-ID: <1342261485_7208@CP5-2952>
References: <1342185256-16024-1-git-send-email-chris@chris-wilson.co.uk>
 <1342185256-16024-6-git-send-email-chris@chris-wilson.co.uk>
 <20120713154620.GG5721@phenom.ffwll.local>
In-Reply-To: <20120713154620.GG5721@phenom.ffwll.local>
To: Daniel Vetter
Cc: intel-gfx@lists.freedesktop.org

On Fri, 13 Jul 2012 17:46:20 +0200, Daniel Vetter wrote:
> On Fri, Jul 13, 2012 at 02:14:08PM +0100, Chris Wilson wrote:
> > If we drop the breadcrumb request after a batch due to a signal for
> > example we aim to fix it up at the next opportunity. In this case we
> > emit a second batchbuffer with no waits upon the first and so no
> > opportunity to insert the missing request, so we need to emit the
> > missing flush for coherency. (Note that that invalidating the render
> > cache is the same as flushing it, so there should have been no
> > observable corruption.)
> >
> > Signed-off-by: Chris Wilson
>
> Imo still too meager commit message ;-) As I've said in the previous mail,
> I'd like some mention of the two commits that made this disaster possible
> (put the blame on me where it is due). And I think some more in-detail
> walk-thru of how things blow up would be great.
> And the Bugzilla link for
> the QA bugreport.

Sure, in the patch I thought I was sending I had an extra paragraph:

As a side effect this will also paper over issues such as
https://bugs.freedesktop.org/show_bug.cgi?id=52040 whereby we clear the
write_domain on objects on the defunct gpu_write_list.

References: https://bugs.freedesktop.org/show_bug.cgi?id=52040

> Also, I still don't understand why this patch here isn't enough to fix up
> the fallout. So if you can enlighten me where/why stuff blows up even with
> this I'd highly appreciate. Not just because not understanding bugs makes
> me queasy, but also to have a clear picture of what I'd need to send to
> Dave it this -next cycle misses 3.6.

The remaining fallout is that we still end up using the flushing-list,
as revealed by *adding* a WARN. To end up in that situation we must
retire an object with a write-domain still set. But how can this be
possible if we always clear the write_list prior to the
request/retirement?

I thought I had it, being sneaky with the use of INSTRUCTION write domain
for pipe-control. However, looks like I'm going to have to reproduce with
some more debugging.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre