From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ben Widawsky
Subject: Re: [RFC] drm/i915: use PIPE_CONTROL for flushing on gen6+
Date: Tue, 30 Aug 2011 16:11:29 -0700
Message-ID: <20110830231129.GA7274@bolo_yeung.jf.intel.com>
References: <20110812111845.339aab04@jbarnes-desktop>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path:
Received: from cloud01.chad-versace.us (184-106-247-128.static.cloud-ips.com [184.106.247.128]) by gabe.freedesktop.org (Postfix) with ESMTP id 14F459E8CF for ; Tue, 30 Aug 2011 16:12:39 -0700 (PDT)
Content-Disposition: inline
In-Reply-To: <20110812111845.339aab04@jbarnes-desktop>
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org
Errors-To: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org
To: Jesse Barnes
Cc: intel-gfx@lists.freedesktop.org
List-Id: intel-gfx@lists.freedesktop.org

On Fri, Aug 12, 2011 at 11:18:45AM -0700, Jesse Barnes wrote:
>  	int space = (ring->head & HEAD_ADDR) - (ring->tail + 8);
> @@ -123,6 +133,112 @@ render_ring_flush(struct intel_ring_buffer *ring,
>  	return 0;
>  }
>
> +/**
> + * Emits a PIPE_CONTROL with a non-zero post-sync operation, for
> + * implementing two workarounds on gen6. From section 1.4.7.1
> + * "PIPE_CONTROL" of the Sandy Bridge PRM volume 2 part 1:
> + *
> + *   [DevSNB-C+{W/A}] Before any depth stall flush (including those
> + *   produced by non-pipelined state commands), software needs to first
> + *   send a PIPE_CONTROL with no bits set except Post-Sync Operation !=
> + *   0.
> + *
> + *   [Dev-SNB{W/A}]: Before a PIPE_CONTROL with Write Cache Flush Enable
> + *   =1, a PIPE_CONTROL with any non-zero post-sync-op is required.
> + *
> + * And the workaround for these two requires this workaround first:
> + *
> + *   [Dev-SNB{W/A}]: Pipe-control with CS-stall bit set must be sent
> + *   BEFORE the pipe-control with a post-sync op and no write-cache
> + *   flushes.
> + *
> + * And this last workaround is tricky because of the requirements on
> + * that bit. From section 1.4.7.2.3 "Stall" of the Sandy Bridge PRM
> + * volume 2 part 1:
> + *
> + *   "1 of the following must also be set:
> + *    - Render Target Cache Flush Enable ([12] of DW1)
> + *    - Depth Cache Flush Enable ([0] of DW1)
> + *    - Stall at Pixel Scoreboard ([1] of DW1)
> + *    - Depth Stall ([13] of DW1)
> + *    - Post-Sync Operation ([13] of DW1)
> + *    - Notify Enable ([8] of DW1)"
> + *
> + * The cache flushes require the workaround flush that triggered this
> + * one, so we can't use it. Depth stall would trigger the same.
> + * Post-sync nonzero is what triggered this second workaround, so we
> + * can't use that one either. Notify enable is IRQs, which aren't
> + * really our business. That leaves only stall at scoreboard.
> + */
> +static int
> +intel_emit_post_sync_nonzero_flush(struct intel_ring_buffer *ring)
> +{
> +	struct pipe_control *pc = ring->private;
> +	u32 scratch_addr = pc->gtt_offset + 128;
> +	int ret;
> +
> +
> +	ret = intel_ring_begin(ring, 6);
> +	if (ret)
> +		return ret;
> +
> +	intel_ring_emit(ring, GFX_OP_PIPE_CONTROL_GEN6);
> +	intel_ring_emit(ring, PIPE_CONTROL_CS_STALL |
> +			PIPE_CONTROL_STALL_AT_SCOREBOARD);
> +	intel_ring_emit(ring, scratch_addr | PIPE_CONTROL_GLOBAL_GTT); /* address */
> +	intel_ring_emit(ring, 0); /* low dword */
> +	intel_ring_emit(ring, 0); /* high dword */
> +	intel_ring_emit(ring, MI_NOOP);
> +	intel_ring_advance(ring);
> +
> +	ret = intel_ring_begin(ring, 6);
> +	if (ret)
> +		return ret;
> +
> +	intel_ring_emit(ring, GFX_OP_PIPE_CONTROL_GEN6);
> +	intel_ring_emit(ring, PIPE_CONTROL_QW_WRITE);
> +	intel_ring_emit(ring, scratch_addr | PIPE_CONTROL_GLOBAL_GTT); /* address */
> +	intel_ring_emit(ring, 0);
> +	intel_ring_emit(ring, 0);
> +	intel_ring_emit(ring, MI_NOOP);
> +	intel_ring_advance(ring);
> +
> +	return 0;
> +}
> +
> +static int
> +gen6_render_ring_flush(struct intel_ring_buffer *ring,
> +		       u32 invalidate_domains, u32 flush_domains)
> +{
> +	u32 flags = 0;
> +	struct pipe_control *pc = ring->private;
> +	u32 scratch_addr = pc->gtt_offset + 128;
> +	int ret;
> +
> +	/* Force SNB workarounds for PIPE_CONTROL flushes */
> +	intel_emit_post_sync_nonzero_flush(ring);
> +
> +	/* Just flush everything for now */
> +	flags |= PIPE_CONTROL_WC_FLUSH;
> +	flags |= PIPE_CONTROL_IS_FLUSH;
> +	flags |= PIPE_CONTROL_TC_FLUSH;
> +	flags |= PIPE_CONTROL_DEPTH_FLUSH;
> +
> +	ret = intel_ring_begin(ring, 6);
> +	if (ret)
> +		return ret;
> +
> +	intel_ring_emit(ring, GFX_OP_PIPE_CONTROL_GEN6);
> +	intel_ring_emit(ring, flags);
> +	intel_ring_emit(ring, scratch_addr | PIPE_CONTROL_GLOBAL_GTT);
> +	intel_ring_emit(ring, 0); /* lower dword */
> +	intel_ring_emit(ring, 0); /* upper dword */
> +	intel_ring_emit(ring, MI_NOOP);
> +	intel_ring_advance(ring);
> +
> +	return 0;
> +}
> +
>  static void ring_write_tail(struct intel_ring_buffer *ring,
>  			    u32 value)
>  {

While I'm not convinced this is broken, I think you either want to
specify a qword write (3 << 14) or use n=2 for the pipe control, i.e.:

intel_ring_emit(ring, GFX_OP_PIPE_CONTROL_GEN);
intel_ring_emit(ring, flags);
intel_ring_emit(ring, 0); /* ignored if no write */
intel_ring_emit(ring, 0); /* ignored if no write */
intel_ring_advance(ring);

Ben
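
For anyone who wants to compare the two options outside the driver, here is a
rough standalone sketch. It is not the driver code: mock_ring, mock_emit and
the SKETCH_* macros are made-up names, the header encoding (length field =
total dwords - 2) and the position of the post-sync field in bits 15:14 of DW1
are assumptions that should be checked against i915_reg.h and the PRM, and the
post-sync value is left to the caller rather than hard-coded.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encodings, for illustration only; verify against i915_reg.h / PRM. */
#define SKETCH_PIPE_CONTROL(len) \
	((0x3u << 29) | (0x3u << 27) | (0x2u << 24) | ((len) - 2u))
#define SKETCH_POST_SYNC_MASK	(0x3u << 14)	/* assumed post-sync op field */
#define SKETCH_MI_NOOP		0u

/* Hypothetical stand-in for the ring buffer: just collects dwords. */
struct mock_ring {
	uint32_t dw[16];
	unsigned int count;
};

static void mock_emit(struct mock_ring *r, uint32_t v)
{
	assert(r->count < sizeof(r->dw) / sizeof(r->dw[0]));
	r->dw[r->count++] = v;
}

/*
 * Option 1: keep the 5-dword packet with an address, but actually request
 * a post-sync write so the address and data dwords are meaningful.
 */
static void sketch_flush_with_write(struct mock_ring *r, uint32_t flags,
				    uint32_t post_sync_op, uint32_t addr)
{
	/* The caller must pass a non-zero op confined to the assumed field. */
	assert((post_sync_op & SKETCH_POST_SYNC_MASK) != 0);
	assert((post_sync_op & ~SKETCH_POST_SYNC_MASK) == 0);

	mock_emit(r, SKETCH_PIPE_CONTROL(5u));
	mock_emit(r, flags | post_sync_op);
	mock_emit(r, addr);
	mock_emit(r, 0);		/* lower dword of the written data */
	mock_emit(r, 0);		/* upper dword of the written data */
	mock_emit(r, SKETCH_MI_NOOP);	/* pad to an even number of dwords */
}

/*
 * Option 2: no post-sync write, so use the shorter packet (length field 2,
 * four dwords in total) and leave the trailing dwords as zero.
 */
static void sketch_flush_no_write(struct mock_ring *r, uint32_t flags)
{
	assert((flags & SKETCH_POST_SYNC_MASK) == 0);

	mock_emit(r, SKETCH_PIPE_CONTROL(4u));
	mock_emit(r, flags);
	mock_emit(r, 0);		/* ignored if no write */
	mock_emit(r, 0);		/* ignored if no write */
}
```

Either helper leaves an even number of dwords in the mock ring, which is why
the longer variant keeps the MI_NOOP pad and the shorter one does not need it.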