From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755981AbbCPOM6 (ORCPT ); Mon, 16 Mar 2015 10:12:58 -0400 Received: from mail.linuxfoundation.org ([140.211.169.12]:54767 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755938AbbCPOMx (ORCPT ); Mon, 16 Mar 2015 10:12:53 -0400 From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Chris Wilson , Daniel Vetter , Jani Nikula Subject: [PATCH 3.19 050/177] drm/i915: Insert a command barrier on BLT/BSD cache flushes Date: Mon, 16 Mar 2015 15:07:37 +0100 Message-Id: <20150316140815.364824384@linuxfoundation.org> X-Mailer: git-send-email 2.3.3 In-Reply-To: <20150316140813.085032723@linuxfoundation.org> References: <20150316140813.085032723@linuxfoundation.org> User-Agent: quilt/0.64 MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-15 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org 3.19-stable review patch. If anyone has any objections, please let me know. ------------------ From: Chris Wilson commit f0a1fb10e5f79f5aaf8d7e94b9fa6bf2fa9aeebf upstream. This looked like an odd regression from commit ec5cc0f9b019af95e4571a9fa162d94294c8d90b Author: Chris Wilson Date: Thu Jun 12 10:28:55 2014 +0100 drm/i915: Restrict GPU boost to the RCS engine but in reality it undercovered a much older coherency bug. The issue that boosting the GPU frequency on the BCS ring was masking was that we could wake the CPU up after completion of a BCS batch and inspect memory prior to the write cache being fully evicted. In order to serialise the breadcrumb interrupt (and so ensure that the CPU's view of memory is coherent) we need to perform a post-sync operation in the MI_FLUSH_DW. v2: Fix all the MI_FLUSH_DW (bsd plus the duplication in execlists). Also fix the invalidate_domains mask in gen8_emit_flush() for ring != VCS. Testcase: gpuX-rcs-gpu-read-after-write Signed-off-by: Chris Wilson Acked-by: Daniel Vetter Signed-off-by: Jani Nikula Signed-off-by: Greg Kroah-Hartman --- drivers/gpu/drm/i915/intel_lrc.c | 20 +++++++++++--------- drivers/gpu/drm/i915/intel_ringbuffer.c | 23 +++++++++++++++++++---- 2 files changed, 30 insertions(+), 13 deletions(-) --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1237,15 +1237,17 @@ static int gen8_emit_flush(struct intel_ cmd = MI_FLUSH_DW + 1; - if (ring == &dev_priv->ring[VCS]) { - if (invalidate_domains & I915_GEM_GPU_DOMAINS) - cmd |= MI_INVALIDATE_TLB | MI_INVALIDATE_BSD | - MI_FLUSH_DW_STORE_INDEX | - MI_FLUSH_DW_OP_STOREDW; - } else { - if (invalidate_domains & I915_GEM_DOMAIN_RENDER) - cmd |= MI_INVALIDATE_TLB | MI_FLUSH_DW_STORE_INDEX | - MI_FLUSH_DW_OP_STOREDW; + /* We always require a command barrier so that subsequent + * commands, such as breadcrumb interrupts, are strictly ordered + * wrt the contents of the write cache being flushed to memory + * (and thus being coherent from the CPU). + */ + cmd |= MI_FLUSH_DW_STORE_INDEX | MI_FLUSH_DW_OP_STOREDW; + + if (invalidate_domains & I915_GEM_GPU_DOMAINS) { + cmd |= MI_INVALIDATE_TLB; + if (ring == &dev_priv->ring[VCS]) + cmd |= MI_INVALIDATE_BSD; } intel_logical_ring_emit(ringbuf, cmd); --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -2178,6 +2178,14 @@ static int gen6_bsd_ring_flush(struct in cmd = MI_FLUSH_DW; if (INTEL_INFO(ring->dev)->gen >= 8) cmd += 1; + + /* We always require a command barrier so that subsequent + * commands, such as breadcrumb interrupts, are strictly ordered + * wrt the contents of the write cache being flushed to memory + * (and thus being coherent from the CPU). + */ + cmd |= MI_FLUSH_DW_STORE_INDEX | MI_FLUSH_DW_OP_STOREDW; + /* * Bspec vol 1c.5 - video engine command streamer: * "If ENABLED, all TLBs will be invalidated once the flush @@ -2185,8 +2193,8 @@ static int gen6_bsd_ring_flush(struct in * Post-Sync Operation field is a value of 1h or 3h." */ if (invalidate & I915_GEM_GPU_DOMAINS) - cmd |= MI_INVALIDATE_TLB | MI_INVALIDATE_BSD | - MI_FLUSH_DW_STORE_INDEX | MI_FLUSH_DW_OP_STOREDW; + cmd |= MI_INVALIDATE_TLB | MI_INVALIDATE_BSD; + intel_ring_emit(ring, cmd); intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT); if (INTEL_INFO(ring->dev)->gen >= 8) { @@ -2282,6 +2290,14 @@ static int gen6_ring_flush(struct intel_ cmd = MI_FLUSH_DW; if (INTEL_INFO(ring->dev)->gen >= 8) cmd += 1; + + /* We always require a command barrier so that subsequent + * commands, such as breadcrumb interrupts, are strictly ordered + * wrt the contents of the write cache being flushed to memory + * (and thus being coherent from the CPU). + */ + cmd |= MI_FLUSH_DW_STORE_INDEX | MI_FLUSH_DW_OP_STOREDW; + /* * Bspec vol 1c.3 - blitter engine command streamer: * "If ENABLED, all TLBs will be invalidated once the flush @@ -2289,8 +2305,7 @@ static int gen6_ring_flush(struct intel_ * Post-Sync Operation field is a value of 1h or 3h." */ if (invalidate & I915_GEM_DOMAIN_RENDER) - cmd |= MI_INVALIDATE_TLB | MI_FLUSH_DW_STORE_INDEX | - MI_FLUSH_DW_OP_STOREDW; + cmd |= MI_INVALIDATE_TLB; intel_ring_emit(ring, cmd); intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT); if (INTEL_INFO(ring->dev)->gen >= 8) {