All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/19] drm-intel-collector push simple patches for review
@ 2013-09-10 22:36 Rodrigo Vivi
  2013-09-10 22:36 ` [PATCH 01/19] drm/i915: check that the i965g/gm 4G limit is really obeyed Rodrigo Vivi
                   ` (18 more replies)
  0 siblings, 19 replies; 30+ messages in thread
From: Rodrigo Vivi @ 2013-09-10 22:36 UTC (permalink / raw)
  To: intel-gfx

Hi all,

This is another drm-intel-collector push for review:
http://cgit.freedesktop.org/~vivijim/drm-intel/log/?h=drm-intel-collector

Here goes the list in order for better reviewers assignment:

0001-drm-i915-check-that-the-i965g-gm-4G-limit-is-really-.patch
0002-drm-i915-Cancel-outstanding-modeset-workers-before-s.patch
0003-drm-i915-Move-the-conditional-seqno-query-into-the-t.patch
0004-drm-i915-Add-some-missing-steps-to-i915_driver_load-.patch
0005-drm-i915-Asynchronously-perform-the-set-base-for-a-s.patch
0006-drm-i915-Align-tiled-scanouts-from-stolen-memory-to-.patch
0007-drm-i915-Pair-seqno-completion-tracepoint-with-its-d.patch
0008-drm-i915-write-D_COMP-using-the-mailbox.patch
0009-drm-i915-Initialise-min-max-frequencies-before-updat.patch
0010-drm-i915-Delay-the-relase-of-the-forcewake-by-a-jiff.patch
0011-drm-i915-Add-a-tracepoint-for-using-a-semaphore.patch
0012-drm-i915-Write-RING_TAIL-once-per-request.patch
0013-drm-i915-Boost-RPS-frequency-for-CPU-stalls.patch
0014-drm-i915-Tweak-RPS-thresholds-to-more-aggressively-d.patch
0015-drm-i915-i915.quirks_set-quirks_mask-overrides-dmi-m.patch
0016-drm-i915-Fix-l3-parity-buffer-offset.patch
0017-drm-i915-Fix-HSW-parity-test.patch
0018-drm-i915-Allow-GT3-Slice-Shutdown-on-Boot.patch
0019-drm-i915-Allow-Dynamically-GT3-Slice-Shutdown.patch

Overall Process:

1. Daniel pushs drm-intel-testing
2. I rebase drm-intel-collector onto drm-intel-testing
3. I collect all simple (1-2) patches that wasn't yet reviewed and not queued by Daniel
4. Request automated QA's PRTS automated i-g-t tests comparing drm-intel-testing x drm-intel-collector
5. If tests are ok I send the patches as a series to intel-gfx mailing list for better tracking and to be reviewed.

There are some reasons that some patches can be left behind:
1. It was send so long time ago. I started with patches from Jul 26th.
2. Your patch didn't applied cleanly and I couldn't easily solve the conflicts.
3. Kernel didn't compiled with your patch.
4. I simply missed it. If you believe this is the case please warn me.

Please help me to get these patches reviewed and queued by Daniel.

Also, please let me know if you have further ideas how to improve this process.

Thanks in advance,
Rodrigo.

Ben Widawsky (2):
  drm/i915: Fix l3 parity buffer offset
  drm/i915: Fix HSW parity test

Chris Wilson (12):
  drm/i915: Cancel outstanding modeset workers before suspend
  drm/i915: Move the conditional seqno query into the tracepoint
  drm/i915: Add some missing steps to i915_driver_load error path
  drm/i915: Asynchronously perform the set-base for a simple modeset
  drm/i915: Align tiled scanouts from stolen memory to 256k in the GTT
  drm/i915: Pair seqno completion tracepoint with its dispatch
  drm/i915: Initialise min/max frequencies before updating RPS registers
  drm/i915: Delay the relase of the forcewake by a jiffie
  drm/i915: Add a tracepoint for using a semaphore
  drm/i915: Write RING_TAIL once per-request
  drm/i915: Boost RPS frequency for CPU stalls
  drm/i915: Tweak RPS thresholds to more aggressively downclock

Daniel Vetter (1):
  drm/i915: check that the i965g/gm 4G limit is really obeyed

Kamal Mostafa (1):
  drm/i915: i915.quirks_set/quirks_mask overrides dmi match

Paulo Zanoni (1):
  drm/i915: write D_COMP using the mailbox

Rodrigo Vivi (2):
  drm/i915: Allow GT3 Slice Shutdown on Boot.
  drm/i915: Allow Dynamically GT3 Slice Shutdown.

 drivers/gpu/drm/i915/i915_debugfs.c        |   2 +-
 drivers/gpu/drm/i915/i915_dma.c            |  43 ++---
 drivers/gpu/drm/i915/i915_drv.c            |   5 +
 drivers/gpu/drm/i915/i915_drv.h            |  29 +++-
 drivers/gpu/drm/i915/i915_gem.c            | 146 +++++++++++------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |   2 +-
 drivers/gpu/drm/i915/i915_irq.c            |  59 ++++---
 drivers/gpu/drm/i915/i915_reg.h            |  17 +-
 drivers/gpu/drm/i915/i915_sysfs.c          |  82 ++++++++--
 drivers/gpu/drm/i915/i915_trace.h          |  52 ++++--
 drivers/gpu/drm/i915/intel_display.c       |  72 +++++++--
 drivers/gpu/drm/i915/intel_drv.h           |   7 +-
 drivers/gpu/drm/i915/intel_pm.c            | 250 +++++++++++++++++++++--------
 drivers/gpu/drm/i915/intel_ringbuffer.c    |  30 ++--
 drivers/gpu/drm/i915/intel_ringbuffer.h    |   7 +-
 drivers/gpu/drm/i915/intel_uncore.c        |  30 +++-
 16 files changed, 612 insertions(+), 221 deletions(-)

-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 01/19] drm/i915: check that the i965g/gm 4G limit is really obeyed
  2013-09-10 22:36 [PATCH 00/19] drm-intel-collector push simple patches for review Rodrigo Vivi
@ 2013-09-10 22:36 ` Rodrigo Vivi
  2013-09-10 22:36 ` [PATCH 02/19] drm/i915: Cancel outstanding modeset workers before suspend Rodrigo Vivi
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Rodrigo Vivi @ 2013-09-10 22:36 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

From: Daniel Vetter <daniel.vetter@ffwll.ch>

In truly crazy circumstances shmem might give us the wrong type of
page. So be a bit paranoid and double check this.

Cc: Rob Clark <robdclark@gmail.com>
References: http://lkml.org/lkml/2011/7/11/238
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f0884a9..b3aaf29 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1861,6 +1861,9 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 			sg->length += PAGE_SIZE;
 		}
 		last_pfn = page_to_pfn(page);
+
+		/* Check that the i965g/gm workaround works. */
+		WARN_ON((gfp & __GFP_DMA32) && (last_pfn >= 0x00100000UL));
 	}
 #ifdef CONFIG_SWIOTLB
 	if (!swiotlb_nr_tbl())
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 02/19] drm/i915: Cancel outstanding modeset workers before suspend
  2013-09-10 22:36 [PATCH 00/19] drm-intel-collector push simple patches for review Rodrigo Vivi
  2013-09-10 22:36 ` [PATCH 01/19] drm/i915: check that the i965g/gm 4G limit is really obeyed Rodrigo Vivi
@ 2013-09-10 22:36 ` Rodrigo Vivi
  2013-09-10 22:36 ` [PATCH 03/19] drm/i915: Move the conditional seqno query into the tracepoint Rodrigo Vivi
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Rodrigo Vivi @ 2013-09-10 22:36 UTC (permalink / raw)
  To: intel-gfx; +Cc: Liu, Chuansheng

From: Chris Wilson <chris@chris-wilson.co.uk>

Upon resume we will do a complete restoration of the mode and so reset
all tasks, but during suspend (and unload) we need to make sure that no
workers run concurrently with our suspend code. Or worse just after.

The issue was first raised whilst tackling a suspend issue with Takashi
Iwai (http://lists.freedesktop.org/archives/intel-gfx/2012-April/016738.html)
and then it was independently rediscovered by Chuanshen Lui.

v2: Rebase for the lost year.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Takashi Iwai <tiwai@suse.de>
Cc: "Liu, Chuansheng" <chuansheng.liu@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_drv.h      |  1 +
 drivers/gpu/drm/i915/intel_display.c | 21 ++++++++++++++++-----
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e357995..2729338 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2229,6 +2229,7 @@ extern void intel_modeset_init_hw(struct drm_device *dev);
 extern void intel_modeset_suspend_hw(struct drm_device *dev);
 extern void intel_modeset_init(struct drm_device *dev);
 extern void intel_modeset_gem_init(struct drm_device *dev);
+extern void intel_modeset_quiesce(struct drm_device *dev);
 extern void intel_modeset_cleanup(struct drm_device *dev);
 extern int intel_modeset_vga_set_state(struct drm_device *dev, bool state);
 extern void intel_modeset_setup_hw_state(struct drm_device *dev,
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 122aa02..86a70d9 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -10106,6 +10106,7 @@ void intel_modeset_init_hw(struct drm_device *dev)
 
 void intel_modeset_suspend_hw(struct drm_device *dev)
 {
+	intel_modeset_quiesce(dev);
 	intel_suspend_hw(dev);
 }
 
@@ -10550,9 +10551,19 @@ void intel_modeset_gem_init(struct drm_device *dev)
 	intel_modeset_setup_hw_state(dev, false);
 }
 
-void intel_modeset_cleanup(struct drm_device *dev)
+void intel_modeset_quiesce(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	cancel_work_sync(&dev_priv->hotplug_work);
+	cancel_work_sync(&dev_priv->rps.work);
+
+	/* catch all required for dev_priv->wq */
+	flush_scheduled_work();
+}
+
+void intel_modeset_cleanup(struct drm_device *dev)
+{
 	struct drm_crtc *crtc;
 
 	/*
@@ -10561,7 +10572,10 @@ void intel_modeset_cleanup(struct drm_device *dev)
 	 * experience fancy races otherwise.
 	 */
 	drm_irq_uninstall(dev);
-	cancel_work_sync(&dev_priv->hotplug_work);
+
+	/* flush any delayed tasks or pending work */
+	intel_modeset_quiesce(dev);
+
 	/*
 	 * Due to the hpd irq storm handling the hotplug work can re-arm the
 	 * poll handlers. Hence disable polling after hpd handling is shut down.
@@ -10590,9 +10604,6 @@ void intel_modeset_cleanup(struct drm_device *dev)
 
 	mutex_unlock(&dev->struct_mutex);
 
-	/* flush any delayed tasks or pending work */
-	flush_scheduled_work();
-
 	/* destroy backlight, if any, before the connectors */
 	intel_panel_destroy_backlight(dev);
 
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 03/19] drm/i915: Move the conditional seqno query into the tracepoint
  2013-09-10 22:36 [PATCH 00/19] drm-intel-collector push simple patches for review Rodrigo Vivi
  2013-09-10 22:36 ` [PATCH 01/19] drm/i915: check that the i965g/gm 4G limit is really obeyed Rodrigo Vivi
  2013-09-10 22:36 ` [PATCH 02/19] drm/i915: Cancel outstanding modeset workers before suspend Rodrigo Vivi
@ 2013-09-10 22:36 ` Rodrigo Vivi
  2013-09-10 22:36 ` [PATCH 04/19] drm/i915: Add some missing steps to i915_driver_load error path Rodrigo Vivi
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Rodrigo Vivi @ 2013-09-10 22:36 UTC (permalink / raw)
  To: intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

We only wish to know the value of seqno when emitting the tracepoint, so
move the query from a parameter to the macro to inside the conditional
macro body so that the query is only evaluated when required.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_irq.c   |  2 +-
 drivers/gpu/drm/i915/i915_trace.h | 21 ++++++++++++++++++---
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 83cce0c..e97d386 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -801,7 +801,7 @@ static void notify_ring(struct drm_device *dev,
 	if (ring->obj == NULL)
 		return;
 
-	trace_i915_gem_request_complete(ring, ring->get_seqno(ring, false));
+	trace_i915_gem_request_complete(ring);
 
 	wake_up_all(&ring->irq_queue);
 	i915_queue_hangcheck(dev);
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index e2c5ee6..a1797f6 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -304,9 +304,24 @@ DEFINE_EVENT(i915_gem_request, i915_gem_request_add,
 	    TP_ARGS(ring, seqno)
 );
 
-DEFINE_EVENT(i915_gem_request, i915_gem_request_complete,
-	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno),
-	    TP_ARGS(ring, seqno)
+TRACE_EVENT(i915_gem_request_complete,
+	    TP_PROTO(struct intel_ring_buffer *ring),
+	    TP_ARGS(ring),
+
+	    TP_STRUCT__entry(
+			     __field(u32, dev)
+			     __field(u32, ring)
+			     __field(u32, seqno)
+			     ),
+
+	    TP_fast_assign(
+			   __entry->dev = ring->dev->primary->index;
+			   __entry->ring = ring->id;
+			   __entry->seqno = ring->get_seqno(ring, false);
+			   ),
+
+	    TP_printk("dev=%u, ring=%u, seqno=%u",
+		      __entry->dev, __entry->ring, __entry->seqno)
 );
 
 DEFINE_EVENT(i915_gem_request, i915_gem_request_retire,
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 04/19] drm/i915: Add some missing steps to i915_driver_load error path
  2013-09-10 22:36 [PATCH 00/19] drm-intel-collector push simple patches for review Rodrigo Vivi
                   ` (2 preceding siblings ...)
  2013-09-10 22:36 ` [PATCH 03/19] drm/i915: Move the conditional seqno query into the tracepoint Rodrigo Vivi
@ 2013-09-10 22:36 ` Rodrigo Vivi
  2013-09-10 22:36 ` [PATCH 05/19] drm/i915: Asynchronously perform the set-base for a simple modeset Rodrigo Vivi
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Rodrigo Vivi @ 2013-09-10 22:36 UTC (permalink / raw)
  To: intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

We missed adding a few cleanup steps for recent additions.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_dma.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 3de6050..9f39db9 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1546,7 +1546,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 
 	ret = i915_gem_gtt_init(dev);
 	if (ret)
-		goto put_bridge;
+		goto out_regs;
 
 	if (drm_core_check_feature(dev, DRIVER_MODESET))
 		i915_kick_out_firmware_fb(dev_priv);
@@ -1575,7 +1575,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 				     aperture_size);
 	if (dev_priv->gtt.mappable == NULL) {
 		ret = -EIO;
-		goto out_rmmap;
+		goto out_gtt;
 	}
 
 	dev_priv->gtt.mtrr = arch_phys_wc_add(dev_priv->gtt.mappable_base,
@@ -1649,7 +1649,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 		ret = i915_load_modeset_init(dev);
 		if (ret < 0) {
 			DRM_ERROR("failed to init modeset\n");
-			goto out_gem_unload;
+			goto out_power_well;
 		}
 	} else {
 		/* Start out suspended in ums mode. */
@@ -1669,6 +1669,10 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 
 	return 0;
 
+out_power_well:
+	if (HAS_POWER_WELL(dev))
+		i915_remove_power_well(dev);
+	drm_vblank_cleanup(dev);
 out_gem_unload:
 	if (dev_priv->mm.inactive_shrinker.shrink)
 		unregister_shrinker(&dev_priv->mm.inactive_shrinker);
@@ -1682,12 +1686,17 @@ out_gem_unload:
 out_mtrrfree:
 	arch_phys_wc_del(dev_priv->gtt.mtrr);
 	io_mapping_free(dev_priv->gtt.mappable);
+out_gtt:
+	list_del(&dev_priv->gtt.base.global_link);
+	drm_mm_takedown(&dev_priv->gtt.base.mm);
 	dev_priv->gtt.base.cleanup(&dev_priv->gtt.base);
-out_rmmap:
+out_regs:
 	pci_iounmap(dev->pdev, dev_priv->regs);
 put_bridge:
 	pci_dev_put(dev_priv->bridge_dev);
 free_priv:
+	if (dev_priv->slab)
+		kmem_cache_destroy(dev_priv->slab);
 	kfree(dev_priv);
 	return ret;
 }
@@ -1780,6 +1789,8 @@ int i915_driver_unload(struct drm_device *dev)
 	if (dev_priv->regs != NULL)
 		pci_iounmap(dev->pdev, dev_priv->regs);
 
+	drm_vblank_cleanup(dev);
+
 	intel_teardown_gmbus(dev);
 	intel_teardown_mchbar(dev);
 
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 05/19] drm/i915: Asynchronously perform the set-base for a simple modeset
  2013-09-10 22:36 [PATCH 00/19] drm-intel-collector push simple patches for review Rodrigo Vivi
                   ` (3 preceding siblings ...)
  2013-09-10 22:36 ` [PATCH 04/19] drm/i915: Add some missing steps to i915_driver_load error path Rodrigo Vivi
@ 2013-09-10 22:36 ` Rodrigo Vivi
  2013-09-10 22:36 ` [PATCH 06/19] drm/i915: Align tiled scanouts from stolen memory to 256k in the GTT Rodrigo Vivi
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Rodrigo Vivi @ 2013-09-10 22:36 UTC (permalink / raw)
  To: intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

A simple modeset, where we only wish to switch over to a new framebuffer
such as the transition from fbcon to X, takes around 30-60ms. This is
due to three factors:

1. We need to make sure the fb->obj is in the display domain, which
incurs a cache flush to ensure no dirt is left on the scanout.

2. We need to flush any pending rendering before performing the mmio
so that the frame is complete before it is shown.

3. We currently wait for the vblank after the mmio to be sure that the
old fb is no longer being shown before releasing it.

(1) can only be eliminated by userspace preparing the fb->obj in advance
to already be in the display domain. This can be done through use of the
create2 ioctl, or by reusing an existing fb->obj.

However, (2) and (3) are already solved by the existing page flip
mechanism, and it is surprisingly trivial to wire them up for use in the
set-base fast path. Though it can be argued that this represents a
subtle ABI break in that the set_config ioctl now returns before the old
framebuffer is unpinned. The danger is that userspace will start to
modify it before it is no longer being shown, however we should be able
to prevent that through proper domain tracking.

By combining all of the above, we can achieve an instaneous set_config:

[     6.601] (II) intel(0): switch to mode 2560x1440@60.0 on pipe 0 using DP2, position (0, 0), rotation normal
[     6.601] (II) intel(0): Setting screen physical size to 677 x 381

v2 (by Vivi): page_flip_flag was added to intel_crtc_page_flip
              in a previous commit. using 0.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/intel_display.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 86a70d9..c99dc1d 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -9264,10 +9264,13 @@ static int intel_crtc_set_config(struct drm_mode_set *set)
 		ret = intel_set_mode(set->crtc, set->mode,
 				     set->x, set->y, set->fb);
 	} else if (config->fb_changed) {
-		intel_crtc_wait_for_pending_flips(set->crtc);
-
-		ret = intel_pipe_set_base(set->crtc,
-					  set->x, set->y, set->fb);
+		if (to_intel_framebuffer(set->fb)->obj->ring == NULL ||
+		    save_set.x != set->x || save_set.y != set->y ||
+		    intel_crtc_page_flip(set->crtc, set->fb, NULL, 0)) {
+			intel_crtc_wait_for_pending_flips(set->crtc);
+			ret = intel_pipe_set_base(set->crtc,
+						  set->x, set->y, set->fb);
+		}
 	}
 
 	if (ret) {
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 06/19] drm/i915: Align tiled scanouts from stolen memory to 256k in the GTT
  2013-09-10 22:36 [PATCH 00/19] drm-intel-collector push simple patches for review Rodrigo Vivi
                   ` (4 preceding siblings ...)
  2013-09-10 22:36 ` [PATCH 05/19] drm/i915: Asynchronously perform the set-base for a simple modeset Rodrigo Vivi
@ 2013-09-10 22:36 ` Rodrigo Vivi
  2013-09-10 22:36 ` [PATCH 07/19] drm/i915: Pair seqno completion tracepoint with its dispatch Rodrigo Vivi
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Rodrigo Vivi @ 2013-09-10 22:36 UTC (permalink / raw)
  To: intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

For unfathomable reasons this alignment appears to be required for tiled
scanouts being read from stolen memory. I can find no reference in the
w/a db to support this requirement, but the evidence of my own eyes says
this prevents many headaches.

Note that I have not tricked anything older than Sandybridge into using
stolen tiled scanouts, so the extra alignment may be required there as
well.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/intel_display.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index c99dc1d..882da0e 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -1840,6 +1840,8 @@ intel_pin_and_fence_fb_obj(struct drm_device *dev,
 	case I915_TILING_X:
 		/* pin() will align the object as required by fence */
 		alignment = 0;
+		if (obj->stolen && INTEL_INFO(dev)->gen >= 6)
+			alignment = 256 * 1024;
 		break;
 	case I915_TILING_Y:
 		/* Despite that we check this in framebuffer_init userspace can
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 07/19] drm/i915: Pair seqno completion tracepoint with its dispatch
  2013-09-10 22:36 [PATCH 00/19] drm-intel-collector push simple patches for review Rodrigo Vivi
                   ` (5 preceding siblings ...)
  2013-09-10 22:36 ` [PATCH 06/19] drm/i915: Align tiled scanouts from stolen memory to 256k in the GTT Rodrigo Vivi
@ 2013-09-10 22:36 ` Rodrigo Vivi
  2013-09-10 22:36 ` [PATCH 08/19] drm/i915: write D_COMP using the mailbox Rodrigo Vivi
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Rodrigo Vivi @ 2013-09-10 22:36 UTC (permalink / raw)
  To: intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

In order to time how long a seqno is executed by a ring, we need to
measure both its insertion and its completion. (Using the completion of
the previous seqno as an estimate for when the GPU starts, if busy.) In
order to get an exact completion timestamp, we need irqs. This is
enabled by trace_i915_gem_ring_dispatch, so it makes more sens to pair
the completion event with that rather than the request.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
 drivers/gpu/drm/i915/i915_irq.c            |  2 +-
 drivers/gpu/drm/i915/i915_trace.h          | 48 +++++++++++++++---------------
 3 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index e519f9f..ce8ea97 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1167,7 +1167,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 			goto err;
 	}
 
-	trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags);
+	trace_i915_gem_ring_dispatch(ring, flags);
 
 	i915_gem_execbuffer_move_to_active(&eb->vmas, ring);
 	i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index e97d386..26882bd 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -801,7 +801,7 @@ static void notify_ring(struct drm_device *dev,
 	if (ring->obj == NULL)
 		return;
 
-	trace_i915_gem_request_complete(ring);
+	trace_i915_gem_ring_complete(ring);
 
 	wake_up_all(&ring->irq_queue);
 	i915_queue_hangcheck(dev);
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index a1797f6..5c8e36a 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -234,8 +234,8 @@ TRACE_EVENT(i915_gem_evict_everything,
 );
 
 TRACE_EVENT(i915_gem_ring_dispatch,
-	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno, u32 flags),
-	    TP_ARGS(ring, seqno, flags),
+	    TP_PROTO(struct intel_ring_buffer *ring, u32 flags),
+	    TP_ARGS(ring, flags),
 
 	    TP_STRUCT__entry(
 			     __field(u32, dev)
@@ -247,15 +247,35 @@ TRACE_EVENT(i915_gem_ring_dispatch,
 	    TP_fast_assign(
 			   __entry->dev = ring->dev->primary->index;
 			   __entry->ring = ring->id;
-			   __entry->seqno = seqno;
+			   __entry->seqno = intel_ring_get_seqno(ring),
 			   __entry->flags = flags;
-			   i915_trace_irq_get(ring, seqno);
+			   i915_trace_irq_get(ring, __entry->seqno);
 			   ),
 
 	    TP_printk("dev=%u, ring=%u, seqno=%u, flags=%x",
 		      __entry->dev, __entry->ring, __entry->seqno, __entry->flags)
 );
 
+TRACE_EVENT(i915_gem_ring_complete,
+	    TP_PROTO(struct intel_ring_buffer *ring),
+	    TP_ARGS(ring),
+
+	    TP_STRUCT__entry(
+			     __field(u32, dev)
+			     __field(u32, ring)
+			     __field(u32, seqno)
+			     ),
+
+	    TP_fast_assign(
+			   __entry->dev = ring->dev->primary->index;
+			   __entry->ring = ring->id;
+			   __entry->seqno = ring->get_seqno(ring, false);
+			   ),
+
+	    TP_printk("dev=%u, ring=%u, seqno=%u",
+		      __entry->dev, __entry->ring, __entry->seqno)
+);
+
 TRACE_EVENT(i915_gem_ring_flush,
 	    TP_PROTO(struct intel_ring_buffer *ring, u32 invalidate, u32 flush),
 	    TP_ARGS(ring, invalidate, flush),
@@ -304,26 +324,6 @@ DEFINE_EVENT(i915_gem_request, i915_gem_request_add,
 	    TP_ARGS(ring, seqno)
 );
 
-TRACE_EVENT(i915_gem_request_complete,
-	    TP_PROTO(struct intel_ring_buffer *ring),
-	    TP_ARGS(ring),
-
-	    TP_STRUCT__entry(
-			     __field(u32, dev)
-			     __field(u32, ring)
-			     __field(u32, seqno)
-			     ),
-
-	    TP_fast_assign(
-			   __entry->dev = ring->dev->primary->index;
-			   __entry->ring = ring->id;
-			   __entry->seqno = ring->get_seqno(ring, false);
-			   ),
-
-	    TP_printk("dev=%u, ring=%u, seqno=%u",
-		      __entry->dev, __entry->ring, __entry->seqno)
-);
-
 DEFINE_EVENT(i915_gem_request, i915_gem_request_retire,
 	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno),
 	    TP_ARGS(ring, seqno)
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 08/19] drm/i915: write D_COMP using the mailbox
  2013-09-10 22:36 [PATCH 00/19] drm-intel-collector push simple patches for review Rodrigo Vivi
                   ` (6 preceding siblings ...)
  2013-09-10 22:36 ` [PATCH 07/19] drm/i915: Pair seqno completion tracepoint with its dispatch Rodrigo Vivi
@ 2013-09-10 22:36 ` Rodrigo Vivi
  2013-09-19  9:58   ` Damien Lespiau
  2013-09-10 22:36 ` [PATCH 09/19] drm/i915: Initialise min/max frequencies before updating RPS registers Rodrigo Vivi
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 30+ messages in thread
From: Rodrigo Vivi @ 2013-09-10 22:36 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

You can't write it using the MCHBAR mirror, the write will just get
dropped.

This should make us BSpec-compliant, but there's no real bug I could
reproduce that is fixed by this patch.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_reg.h      |  4 ++++
 drivers/gpu/drm/i915/intel_display.c | 10 ++++++++--
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index bcee89b..ff60945 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -1441,6 +1441,8 @@
  * device 0 function 0's pci config register 0x44 or 0x48 and matches it in
  * every way.  It is not accessible from the CP register read instructions.
  *
+ * Starting form Haswell, you can't write registers using the MCHBAR mirror,
+ * just read.
  */
 #define MCHBAR_MIRROR_BASE	0x10000
 
@@ -4723,6 +4725,8 @@
 #define   GEN6_PCODE_READ_MIN_FREQ_TABLE	0x9
 #define	  GEN6_PCODE_WRITE_RC6VIDS		0x4
 #define	  GEN6_PCODE_READ_RC6VIDS		0x5
+#define   GEN6_PCODE_READ_D_COMP		0x10
+#define   GEN6_PCODE_WRITE_D_COMP		0x11
 #define   GEN6_ENCODE_RC6_VID(mv)		(((mv) - 245) / 5)
 #define   GEN6_DECODE_RC6_VID(vids)		(((vids) * 5) + 245)
 #define GEN6_PCODE_DATA				0x138128
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 882da0e..3098465 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -6019,7 +6019,10 @@ void hsw_disable_lcpll(struct drm_i915_private *dev_priv,
 
 	val = I915_READ(D_COMP);
 	val |= D_COMP_COMP_DISABLE;
-	I915_WRITE(D_COMP, val);
+	mutex_lock(&dev_priv->rps.hw_lock);
+	if (sandybridge_pcode_write(dev_priv, GEN6_PCODE_WRITE_D_COMP, val))
+		DRM_ERROR("Failed to disable D_COMP\n");
+	mutex_unlock(&dev_priv->rps.hw_lock);
 	POSTING_READ(D_COMP);
 	ndelay(100);
 
@@ -6061,7 +6064,10 @@ void hsw_restore_lcpll(struct drm_i915_private *dev_priv)
 	val = I915_READ(D_COMP);
 	val |= D_COMP_COMP_FORCE;
 	val &= ~D_COMP_COMP_DISABLE;
-	I915_WRITE(D_COMP, val);
+	mutex_lock(&dev_priv->rps.hw_lock);
+	if (sandybridge_pcode_write(dev_priv, GEN6_PCODE_WRITE_D_COMP, val))
+		DRM_ERROR("Failed to enable D_COMP\n");
+	mutex_unlock(&dev_priv->rps.hw_lock);
 	POSTING_READ(D_COMP);
 
 	val = I915_READ(LCPLL_CTL);
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 09/19] drm/i915: Initialise min/max frequencies before updating RPS registers
  2013-09-10 22:36 [PATCH 00/19] drm-intel-collector push simple patches for review Rodrigo Vivi
                   ` (7 preceding siblings ...)
  2013-09-10 22:36 ` [PATCH 08/19] drm/i915: write D_COMP using the mailbox Rodrigo Vivi
@ 2013-09-10 22:36 ` Rodrigo Vivi
  2013-09-10 22:36 ` [PATCH 10/19] drm/i915: Delay the relase of the forcewake by a jiffie Rodrigo Vivi
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Rodrigo Vivi @ 2013-09-10 22:36 UTC (permalink / raw)
  To: intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

The RPS register writing routines use the current value of min/max to
set certain limits and interrupt gating. If we set those afterwards, we
risk setting up the hw incorrectly and losing power management events,
and worse, trigger some internal assertions.

Reorder the calling sequences to be correct, and remove the then
unrequired clamping from inside set_rps(). And for a bonus, fix the bug
of calling gen6_set_rps() from Valleyview.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c |  2 +-
 drivers/gpu/drm/i915/i915_sysfs.c   | 16 ++++++++--------
 drivers/gpu/drm/i915/intel_pm.c     | 19 +++++--------------
 3 files changed, 14 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 9ac4e31..8c5ce9b 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2008,7 +2008,7 @@ i915_max_freq_set(void *data, u64 val)
 	if (IS_VALLEYVIEW(dev)) {
 		val = vlv_freq_opcode(dev_priv->mem_freq, val);
 		dev_priv->rps.max_delay = val;
-		gen6_set_rps(dev, val);
+		valleyview_set_rps(dev, val);
 	} else {
 		do_div(val, GT_FREQUENCY_MULTIPLIER);
 		dev_priv->rps.max_delay = val;
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index c8c4112..05195c0 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -294,15 +294,15 @@ static ssize_t gt_max_freq_mhz_store(struct device *kdev,
 		DRM_DEBUG("User requested overclocking to %d\n",
 			  val * GT_FREQUENCY_MULTIPLIER);
 
+	dev_priv->rps.max_delay = val;
+
 	if (dev_priv->rps.cur_delay > val) {
-		if (IS_VALLEYVIEW(dev_priv->dev))
-			valleyview_set_rps(dev_priv->dev, val);
+		if (IS_VALLEYVIEW(dev))
+			valleyview_set_rps(dev, val);
 		else
-			gen6_set_rps(dev_priv->dev, val);
+			gen6_set_rps(dev, val);
 	}
 
-	dev_priv->rps.max_delay = val;
-
 	mutex_unlock(&dev_priv->rps.hw_lock);
 
 	return count;
@@ -359,15 +359,15 @@ static ssize_t gt_min_freq_mhz_store(struct device *kdev,
 		return -EINVAL;
 	}
 
+	dev_priv->rps.min_delay = val;
+
 	if (dev_priv->rps.cur_delay < val) {
 		if (IS_VALLEYVIEW(dev))
 			valleyview_set_rps(dev, val);
 		else
-			gen6_set_rps(dev_priv->dev, val);
+			gen6_set_rps(dev, val);
 	}
 
-	dev_priv->rps.min_delay = val;
-
 	mutex_unlock(&dev_priv->rps.hw_lock);
 
 	return count;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 0c115cc..2207e27 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3263,26 +3263,19 @@ static void ironlake_disable_drps(struct drm_device *dev)
  * ourselves, instead of doing a rmw cycle (which might result in us clearing
  * all limits and the gpu stuck at whatever frequency it is at atm).
  */
-static u32 gen6_rps_limits(struct drm_i915_private *dev_priv, u8 *val)
+static u32 gen6_rps_limits(struct drm_i915_private *dev_priv, u8 val)
 {
 	u32 limits;
 
-	limits = 0;
-
-	if (*val >= dev_priv->rps.max_delay)
-		*val = dev_priv->rps.max_delay;
-	limits |= dev_priv->rps.max_delay << 24;
-
 	/* Only set the down limit when we've reached the lowest level to avoid
 	 * getting more interrupts, otherwise leave this clear. This prevents a
 	 * race in the hw when coming out of rc6: There's a tiny window where
 	 * the hw runs at the minimal clock before selecting the desired
 	 * frequency, if the down threshold expires in that window we will not
 	 * receive a down interrupt. */
-	if (*val <= dev_priv->rps.min_delay) {
-		*val = dev_priv->rps.min_delay;
+	limits = dev_priv->rps.max_delay << 24;
+	if (val <= dev_priv->rps.min_delay)
 		limits |= dev_priv->rps.min_delay << 16;
-	}
 
 	return limits;
 }
@@ -3290,7 +3283,6 @@ static u32 gen6_rps_limits(struct drm_i915_private *dev_priv, u8 *val)
 void gen6_set_rps(struct drm_device *dev, u8 val)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	u32 limits = gen6_rps_limits(dev_priv, &val);
 
 	WARN_ON(!mutex_is_locked(&dev_priv->rps.hw_lock));
 	WARN_ON(val > dev_priv->rps.max_delay);
@@ -3311,7 +3303,8 @@ void gen6_set_rps(struct drm_device *dev, u8 val)
 	/* Make sure we continue to get interrupts
 	 * until we hit the minimum or maximum frequencies.
 	 */
-	I915_WRITE(GEN6_RP_INTERRUPT_LIMITS, limits);
+	I915_WRITE(GEN6_RP_INTERRUPT_LIMITS,
+		   gen6_rps_limits(dev_priv, val));
 
 	POSTING_READ(GEN6_RPNSWREQ);
 
@@ -3349,8 +3342,6 @@ void valleyview_set_rps(struct drm_device *dev, u8 val)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	gen6_rps_limits(dev_priv, &val);
-
 	WARN_ON(!mutex_is_locked(&dev_priv->rps.hw_lock));
 	WARN_ON(val > dev_priv->rps.max_delay);
 	WARN_ON(val < dev_priv->rps.min_delay);
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 10/19] drm/i915: Delay the relase of the forcewake by a jiffie
  2013-09-10 22:36 [PATCH 00/19] drm-intel-collector push simple patches for review Rodrigo Vivi
                   ` (8 preceding siblings ...)
  2013-09-10 22:36 ` [PATCH 09/19] drm/i915: Initialise min/max frequencies before updating RPS registers Rodrigo Vivi
@ 2013-09-10 22:36 ` Rodrigo Vivi
  2013-09-10 22:36 ` [PATCH 11/19] drm/i915: Add a tracepoint for using a semaphore Rodrigo Vivi
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Rodrigo Vivi @ 2013-09-10 22:36 UTC (permalink / raw)
  To: intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

Obtaining the forcwake requires expensive and time consuming
serialisation. And we often try to obtain the forcewake multiple times
in very quick succession. We can reduce the overhead of these sequences
by delaying the forcewake release, and so not hammer the hw quite so
hard.

I was hoping this would help with the spurious
[drm:__gen6_gt_force_wake_mt_get] *ERROR* Timed out waiting for forcewake old ack to clear.
found on Haswell. Alas not.

v2: Fix teardown ordering - unmap the regs after turning off forcewake,
and make sure we do turn off forcewake - both found by Ville.

Note: I have no claims for improved performance, stablity or power
comsumption for this patch. We should not be hitting the registers often
enough for this to improve benchmarks, but given the nature of our hw it
is likely to improve long term stability.

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_dma.c     |  6 ++++--
 drivers/gpu/drm/i915/i915_drv.h     |  3 +++
 drivers/gpu/drm/i915/intel_uncore.c | 30 ++++++++++++++++++++++++++++--
 3 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 9f39db9..93f1e50 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1786,8 +1786,6 @@ int i915_driver_unload(struct drm_device *dev)
 	list_del(&dev_priv->gtt.base.global_link);
 	WARN_ON(!list_empty(&dev_priv->vm_list));
 	drm_mm_takedown(&dev_priv->gtt.base.mm);
-	if (dev_priv->regs != NULL)
-		pci_iounmap(dev->pdev, dev_priv->regs);
 
 	drm_vblank_cleanup(dev);
 
@@ -1799,6 +1797,10 @@ int i915_driver_unload(struct drm_device *dev)
 
 	dev_priv->gtt.base.cleanup(&dev_priv->gtt.base);
 
+	intel_uncore_fini(dev);
+	if (dev_priv->regs != NULL)
+		pci_iounmap(dev->pdev, dev_priv->regs);
+
 	if (dev_priv->slab)
 		kmem_cache_destroy(dev_priv->slab);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2729338..676d799 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -406,6 +406,8 @@ struct intel_uncore {
 
 	unsigned fifo_count;
 	unsigned forcewake_count;
+
+	struct delayed_work force_wake_work;
 };
 
 #define DEV_INFO_FOR_EACH_FLAG(func, sep) \
@@ -1774,6 +1776,7 @@ extern void intel_uncore_early_sanitize(struct drm_device *dev);
 extern void intel_uncore_init(struct drm_device *dev);
 extern void intel_uncore_clear_errors(struct drm_device *dev);
 extern void intel_uncore_check_errors(struct drm_device *dev);
+extern void intel_uncore_fini(struct drm_device *dev);
 
 void
 i915_enable_pipestat(drm_i915_private_t *dev_priv, int pipe, u32 mask);
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 8649f1c..462cc7f 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -204,6 +204,18 @@ static void vlv_force_wake_put(struct drm_i915_private *dev_priv)
 	gen6_gt_check_fifodbg(dev_priv);
 }
 
+static void gen6_force_wake_work(struct work_struct *work)
+{
+	struct drm_i915_private *dev_priv =
+		container_of(work, typeof(*dev_priv), uncore.force_wake_work.work);
+	unsigned long irqflags;
+
+	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+	if (--dev_priv->uncore.forcewake_count == 0)
+		dev_priv->uncore.funcs.force_wake_put(dev_priv);
+	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+}
+
 void intel_uncore_early_sanitize(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -216,6 +228,9 @@ void intel_uncore_init(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
+	INIT_DELAYED_WORK(&dev_priv->uncore.force_wake_work,
+			  gen6_force_wake_work);
+
 	if (IS_VALLEYVIEW(dev)) {
 		dev_priv->uncore.funcs.force_wake_get = vlv_force_wake_get;
 		dev_priv->uncore.funcs.force_wake_put = vlv_force_wake_put;
@@ -261,6 +276,13 @@ void intel_uncore_init(struct drm_device *dev)
 	}
 }
 
+void intel_uncore_fini(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	flush_delayed_work(&dev_priv->uncore.force_wake_work);
+}
+
 static void intel_uncore_forcewake_reset(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -306,8 +328,12 @@ void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv)
 	unsigned long irqflags;
 
 	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
-	if (--dev_priv->uncore.forcewake_count == 0)
-		dev_priv->uncore.funcs.force_wake_put(dev_priv);
+	if (--dev_priv->uncore.forcewake_count == 0) {
+		dev_priv->uncore.forcewake_count++;
+		mod_delayed_work(dev_priv->wq,
+				 &dev_priv->uncore.force_wake_work,
+				 1);
+	}
 	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 }
 
-- 
1.8.1.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 11/19] drm/i915: Add a tracepoint for using a semaphore
  2013-09-10 22:36 [PATCH 00/19] drm-intel-collector push simple patches for review Rodrigo Vivi
                   ` (9 preceding siblings ...)
  2013-09-10 22:36 ` [PATCH 10/19] drm/i915: Delay the relase of the forcewake by a jiffie Rodrigo Vivi
@ 2013-09-10 22:36 ` Rodrigo Vivi
  2013-09-10 22:36 ` [PATCH 12/19] drm/i915: Write RING_TAIL once per-request Rodrigo Vivi
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Rodrigo Vivi @ 2013-09-10 22:36 UTC (permalink / raw)
  To: intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

So that we can find the callers who introduce a ring stall. A single
ring stall is not too unwelcome, the right issue becomes when they start
to interlock and prevent any concurrent work. That, however, is a little
tricker to detect with a mere tracepoint!

v2: Rebrand it as a ring event, rather than an object event.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_gem.c   |  2 ++
 drivers/gpu/drm/i915/i915_trace.h | 19 +++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index b3aaf29..bbcb1b6 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2600,6 +2600,8 @@ i915_gem_object_sync(struct drm_i915_gem_object *obj,
 	if (ret)
 		return ret;
 
+	trace_i915_gem_ring_sync_to(from, to);
+
 	ret = to->sync_to(to, from, seqno);
 	if (!ret)
 		/* We use last_read_seqno because sync_to()
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index 5c8e36a..48e8f07 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -233,6 +233,25 @@ TRACE_EVENT(i915_gem_evict_everything,
 	    TP_printk("dev=%d", __entry->dev)
 );
 
+TRACE_EVENT(i915_gem_ring_sync_to,
+	    TP_PROTO(struct intel_ring_buffer *from, struct intel_ring_buffer *to),
+	    TP_ARGS(from, to),
+
+	    TP_STRUCT__entry(
+			     __field(u32, dev)
+			     __field(u32, sync_from)
+			     __field(u32, sync_to)
+			     ),
+
+	    TP_fast_assign(
+			   __entry->dev = from->dev->primary->index;
+			   __entry->sync_from = from->id;
+			   __entry->sync_to = to->id;
+			   ),
+
+	    TP_printk("dev=%u, sync-from=%u, sync-to=%u", __entry->dev, __entry->sync_from, __entry->sync_to)
+);
+
 TRACE_EVENT(i915_gem_ring_dispatch,
 	    TP_PROTO(struct intel_ring_buffer *ring, u32 flags),
 	    TP_ARGS(ring, flags),
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 12/19] drm/i915: Write RING_TAIL once per-request
  2013-09-10 22:36 [PATCH 00/19] drm-intel-collector push simple patches for review Rodrigo Vivi
                   ` (10 preceding siblings ...)
  2013-09-10 22:36 ` [PATCH 11/19] drm/i915: Add a tracepoint for using a semaphore Rodrigo Vivi
@ 2013-09-10 22:36 ` Rodrigo Vivi
  2013-09-10 22:36 ` [PATCH 13/19] drm/i915: Boost RPS frequency for CPU stalls Rodrigo Vivi
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Rodrigo Vivi @ 2013-09-10 22:36 UTC (permalink / raw)
  To: intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

Ignoring the legacy DRI1 code, and a couple of special cases (to be
discussed later), all access to the ring is mediated through requests.
The first write to a ring will grab a seqno and mark the ring as having
an outstanding_lazy_request. Either through explicitly adding a request
after an execbuffer or through an implicit wait (either by the CPU or by
a semaphore), that sequence of writes will be terminated with a request.
So we can ellide all the intervening writes to the tail register and
send the entire command stream to the GPU at once. This will reduce the
number of *serialising* writes to the tail register by a factor or 3-5
times (depending upon architecture and number of workarounds, context
switches, etc involved). This becomes even more noticeable when the
register write is overloaded with a number of debugging tools. The
astute reader will wonder if it is then possible to overflow the ring
with a single command. It is not. When we start a command sequence to
the ring, we check for available space and issue a wait in case we have
not. The ring wait will in this case be forced to flush the outstanding
register write and then poll the ACTHD for sufficient space to continue.

The exception to the rule where everything is inside a request are a few
initialisation cases where we may want to write GPU commands via the CS
before userspace wakes up and page flips.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_dma.c         |  2 +-
 drivers/gpu/drm/i915/intel_display.c    | 10 +++++-----
 drivers/gpu/drm/i915/intel_ringbuffer.c | 30 ++++++++++++++++--------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  7 ++++++-
 4 files changed, 28 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 93f1e50..af46c47 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -52,7 +52,7 @@
 	intel_ring_emit(LP_RING(dev_priv), x)
 
 #define ADVANCE_LP_RING() \
-	intel_ring_advance(LP_RING(dev_priv))
+	__intel_ring_advance(LP_RING(dev_priv))
 
 /**
  * Lock test for when it's just for synchronization of ring access.
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 3098465..701ce74 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -7713,7 +7713,7 @@ static int intel_gen2_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, 0); /* aux display base address, unused */
 
 	intel_mark_page_flip_active(intel_crtc);
-	intel_ring_advance(ring);
+	__intel_ring_advance(ring);
 	return 0;
 
 err_unpin:
@@ -7755,7 +7755,7 @@ static int intel_gen3_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, MI_NOOP);
 
 	intel_mark_page_flip_active(intel_crtc);
-	intel_ring_advance(ring);
+	__intel_ring_advance(ring);
 	return 0;
 
 err_unpin:
@@ -7804,7 +7804,7 @@ static int intel_gen4_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, pf | pipesrc);
 
 	intel_mark_page_flip_active(intel_crtc);
-	intel_ring_advance(ring);
+	__intel_ring_advance(ring);
 	return 0;
 
 err_unpin:
@@ -7849,7 +7849,7 @@ static int intel_gen6_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, pf | pipesrc);
 
 	intel_mark_page_flip_active(intel_crtc);
-	intel_ring_advance(ring);
+	__intel_ring_advance(ring);
 	return 0;
 
 err_unpin:
@@ -7928,7 +7928,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, (MI_NOOP));
 
 	intel_mark_page_flip_active(intel_crtc);
-	intel_ring_advance(ring);
+	__intel_ring_advance(ring);
 	return 0;
 
 err_unpin:
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 284afaf..686e5b2 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -41,6 +41,16 @@ static inline int ring_space(struct intel_ring_buffer *ring)
 	return space;
 }
 
+void __intel_ring_advance(struct intel_ring_buffer *ring)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+
+	ring->tail &= ring->size - 1;
+	if (dev_priv->gpu_error.stop_rings & intel_ring_flag(ring))
+		return;
+	ring->write_tail(ring, ring->tail);
+}
+
 static int
 gen2_render_ring_flush(struct intel_ring_buffer *ring,
 		       u32	invalidate_domains,
@@ -631,7 +641,7 @@ gen6_add_request(struct intel_ring_buffer *ring)
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
 	intel_ring_emit(ring, ring->outstanding_lazy_seqno);
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
-	intel_ring_advance(ring);
+	__intel_ring_advance(ring);
 
 	return 0;
 }
@@ -744,7 +754,7 @@ pc_render_add_request(struct intel_ring_buffer *ring)
 	intel_ring_emit(ring, ring->scratch.gtt_offset | PIPE_CONTROL_GLOBAL_GTT);
 	intel_ring_emit(ring, ring->outstanding_lazy_seqno);
 	intel_ring_emit(ring, 0);
-	intel_ring_advance(ring);
+	__intel_ring_advance(ring);
 
 	return 0;
 }
@@ -965,7 +975,7 @@ i9xx_add_request(struct intel_ring_buffer *ring)
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
 	intel_ring_emit(ring, ring->outstanding_lazy_seqno);
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
-	intel_ring_advance(ring);
+	__intel_ring_advance(ring);
 
 	return 0;
 }
@@ -1414,6 +1424,9 @@ static int ring_wait_for_space(struct intel_ring_buffer *ring, int n)
 	if (ret != -ENOSPC)
 		return ret;
 
+	/* force the tail write in case we have been skipping them */
+	__intel_ring_advance(ring);
+
 	trace_i915_ring_wait_begin(ring);
 	/* With GEM the hangcheck timer should kick us out of the loop,
 	 * leaving it early runs the risk of corrupting GEM state (due
@@ -1568,17 +1581,6 @@ void intel_ring_init_seqno(struct intel_ring_buffer *ring, u32 seqno)
 	ring->hangcheck.seqno = seqno;
 }
 
-void intel_ring_advance(struct intel_ring_buffer *ring)
-{
-	struct drm_i915_private *dev_priv = ring->dev->dev_private;
-
-	ring->tail &= ring->size - 1;
-	if (dev_priv->gpu_error.stop_rings & intel_ring_flag(ring))
-		return;
-	ring->write_tail(ring, ring->tail);
-}
-
-
 static void gen6_bsd_ring_write_tail(struct intel_ring_buffer *ring,
 				     u32 value)
 {
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index ad2dd65..9db7917 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -238,7 +238,12 @@ static inline void intel_ring_emit(struct intel_ring_buffer *ring,
 	iowrite32(data, ring->virtual_start + ring->tail);
 	ring->tail += 4;
 }
-void intel_ring_advance(struct intel_ring_buffer *ring);
+static inline void intel_ring_advance(struct intel_ring_buffer *ring)
+{
+	ring->tail &= ring->size - 1;
+}
+void __intel_ring_advance(struct intel_ring_buffer *ring);
+
 int __must_check intel_ring_idle(struct intel_ring_buffer *ring);
 void intel_ring_init_seqno(struct intel_ring_buffer *ring, u32 seqno);
 int intel_ring_flush_all_caches(struct intel_ring_buffer *ring);
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 13/19] drm/i915: Boost RPS frequency for CPU stalls
  2013-09-10 22:36 [PATCH 00/19] drm-intel-collector push simple patches for review Rodrigo Vivi
                   ` (11 preceding siblings ...)
  2013-09-10 22:36 ` [PATCH 12/19] drm/i915: Write RING_TAIL once per-request Rodrigo Vivi
@ 2013-09-10 22:36 ` Rodrigo Vivi
  2013-09-10 22:53   ` Chris Wilson
  2013-09-10 22:36 ` [PATCH 14/19] drm/i915: Tweak RPS thresholds to more aggressively downclock Rodrigo Vivi
                   ` (5 subsequent siblings)
  18 siblings, 1 reply; 30+ messages in thread
From: Rodrigo Vivi @ 2013-09-10 22:36 UTC (permalink / raw)
  To: intel-gfx; +Cc: Owen Taylor, Stéphane Marchesin, Zhuang, Lena

From: Chris Wilson <chris@chris-wilson.co.uk>

If we encounter a situation where the CPU blocks waiting for results
from the GPU, give the GPU a kick to boost its the frequency.

This should work to reduce user interface stalls and to quickly promote
mesa to high frequencies - but the cost is that our requested frequency
stalls high (as we do not idle for long enough before rc6 to start
reducing frequencies, nor are we aggressive at down clocking an
underused GPU). However, this should be mitigated by rc6 itself powering
off the GPU when idle, and that energy use is dependent upon the workload
of the GPU in addition to its frequency (e.g. the math or sampler
functions only consume power when used). Still, this is likely to
adversely affect light workloads.

In particular, this nearly eliminates the highly noticeable wake-up lag
in animations from idle. For example, expose or workspace transitions.
(However, given the situation where we fail to downclock, our requested
frequency is almost always the maximum, except for Baytrail where we
manually downclock upon idling. This often masks the latency of
upclocking after being idle, so animations are typically smooth - at the
cost of increased power consumption.)

Stéphane raised the concern that this will punish good applications and
reward bad applications - but due to the nature of how mesa performs its
client throttling, I believe all mesa applications will be roughly
equally affected. To address this concern, and to prevent applications
like compositors from regularly boosting the RPS state, we only allow
each client to receive one boost in each period of activity.

Unfortunately, this techinique is ineffective with Ironlake - which also
has dynamic render power states and suffers just as dramatically. For
Ironlake, the thermal/power headroom is shared with the CPU through
Intelligent Power Sharing and the intel-ips module. This leaves us with
no GPU boost frequencies available when coming out of idle, and due to
hardware limitations we cannot change the arbitration between the CPU and
GPU quickly enough to be effective.

v2: Limit each client to receiving a single boost for each active period.
v3: Tidy up. Allow the device to always boost if it waits outside of
client context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Stéphane Marchesin <stephane.marchesin@gmail.com>
Cc: Owen Taylor <otaylor@redhat.com>
Cc: "Meng, Mengmeng" <mengmeng.meng@intel.com>
Cc: "Zhuang, Lena" <lena.zhuang@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_dma.c      |  16 ++---
 drivers/gpu/drm/i915/i915_drv.h      |  19 +++--
 drivers/gpu/drm/i915/i915_gem.c      | 136 ++++++++++++++++++++++++-----------
 drivers/gpu/drm/i915/i915_irq.c      |  11 ---
 drivers/gpu/drm/i915/intel_display.c |   3 +
 drivers/gpu/drm/i915/intel_drv.h     |   3 +
 drivers/gpu/drm/i915/intel_pm.c      |  42 ++++++-----
 7 files changed, 139 insertions(+), 91 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index af46c47..61af9b8 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1812,19 +1812,11 @@ int i915_driver_unload(struct drm_device *dev)
 
 int i915_driver_open(struct drm_device *dev, struct drm_file *file)
 {
-	struct drm_i915_file_private *file_priv;
-
-	DRM_DEBUG_DRIVER("\n");
-	file_priv = kzalloc(sizeof(*file_priv), GFP_KERNEL);
-	if (!file_priv)
-		return -ENOMEM;
-
-	file->driver_priv = file_priv;
-
-	spin_lock_init(&file_priv->mm.lock);
-	INIT_LIST_HEAD(&file_priv->mm.request_list);
+	int ret;
 
-	idr_init(&file_priv->context_idr);
+	ret = i915_gem_open(dev, file);
+	if (ret)
+		return ret;
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 676d799..8fb1eb7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -833,9 +833,6 @@ struct intel_gen6_power_mgmt {
 	struct work_struct work;
 	u32 pm_iir;
 
-	/* On vlv we need to manually drop to Vmin with a delayed work. */
-	struct delayed_work vlv_work;
-
 	/* The below variables an all the rps hw state are protected by
 	 * dev->struct mutext. */
 	u8 cur_delay;
@@ -952,6 +949,15 @@ struct i915_gem_mm {
 	struct delayed_work retire_work;
 
 	/**
+	 * When we detect an idle GPU, we want to turn on
+	 * powersaving features. So once we see that there
+	 * are no more requests outstanding and no more
+	 * arrive within a small period of time, we fire
+	 * off the idle_work.
+	 */
+	struct delayed_work idle_work;
+
+	/**
 	 * Are we in a non-interruptible section of code like
 	 * modesetting?
 	 */
@@ -1566,13 +1572,17 @@ struct drm_i915_gem_request {
 };
 
 struct drm_i915_file_private {
+	struct drm_i915_private *dev_priv;
+
 	struct {
 		spinlock_t lock;
 		struct list_head request_list;
+		struct delayed_work idle_work;
 	} mm;
 	struct idr context_idr;
 
 	struct i915_ctx_hang_stats hang_stats;
+	atomic_t rps_wait_boost;
 };
 
 #define INTEL_INFO(dev)	(to_i915(dev)->info)
@@ -1921,7 +1931,7 @@ i915_gem_object_unpin_fence(struct drm_i915_gem_object *obj)
 	}
 }
 
-void i915_gem_retire_requests(struct drm_device *dev);
+bool i915_gem_retire_requests(struct drm_device *dev);
 void i915_gem_retire_requests_ring(struct intel_ring_buffer *ring);
 int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
 				      bool interruptible);
@@ -1972,6 +1982,7 @@ int i915_gem_attach_phys_object(struct drm_device *dev,
 void i915_gem_detach_phys_object(struct drm_device *dev,
 				 struct drm_i915_gem_object *obj);
 void i915_gem_free_all_phys_object(struct drm_device *dev);
+int i915_gem_open(struct drm_device *dev, struct drm_file *file);
 void i915_gem_release(struct drm_device *dev, struct drm_file *file);
 
 uint32_t
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index bbcb1b6..89c2844 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -970,6 +970,14 @@ i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno)
 	return ret;
 }
 
+static bool can_wait_boost(struct drm_i915_file_private *file_priv)
+{
+	if (file_priv == NULL)
+		return true;
+
+	return !atomic_xchg(&file_priv->rps_wait_boost, true);
+}
+
 /**
  * __wait_seqno - wait until execution of seqno has finished
  * @ring: the ring expected to report seqno
@@ -990,7 +998,9 @@ i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno)
  */
 static int __wait_seqno(struct intel_ring_buffer *ring, u32 seqno,
 			unsigned reset_counter,
-			bool interruptible, struct timespec *timeout)
+			bool interruptible,
+			struct timespec *timeout,
+			struct drm_i915_file_private *file_priv)
 {
 	drm_i915_private_t *dev_priv = ring->dev->dev_private;
 	struct timespec before, now, wait_time={1,0};
@@ -1013,6 +1023,9 @@ static int __wait_seqno(struct intel_ring_buffer *ring, u32 seqno,
 
 	timeout_jiffies = timespec_to_jiffies_timeout(&wait_time);
 
+	if (dev_priv->info->gen >= 6 && can_wait_boost(file_priv))
+		gen6_rps_boost(dev_priv);
+
 	if (WARN_ON(!ring->irq_get(ring)))
 		return -ENODEV;
 
@@ -1095,7 +1108,7 @@ i915_wait_seqno(struct intel_ring_buffer *ring, uint32_t seqno)
 
 	return __wait_seqno(ring, seqno,
 			    atomic_read(&dev_priv->gpu_error.reset_counter),
-			    interruptible, NULL);
+			    interruptible, NULL, NULL);
 }
 
 static int
@@ -1145,6 +1158,7 @@ i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj,
  */
 static __must_check int
 i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
+					    struct drm_file *file,
 					    bool readonly)
 {
 	struct drm_device *dev = obj->base.dev;
@@ -1171,7 +1185,7 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
 
 	reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
 	mutex_unlock(&dev->struct_mutex);
-	ret = __wait_seqno(ring, seqno, reset_counter, true, NULL);
+	ret = __wait_seqno(ring, seqno, reset_counter, true, NULL, file->driver_priv);
 	mutex_lock(&dev->struct_mutex);
 	if (ret)
 		return ret;
@@ -1220,7 +1234,7 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
 	 * We will repeat the flush holding the lock in the normal manner
 	 * to catch cases where we are gazumped.
 	 */
-	ret = i915_gem_object_wait_rendering__nonblocking(obj, !write_domain);
+	ret = i915_gem_object_wait_rendering__nonblocking(obj, file, !write_domain);
 	if (ret)
 		goto unref;
 
@@ -2120,6 +2134,8 @@ int __i915_add_request(struct intel_ring_buffer *ring,
 	if (file) {
 		struct drm_i915_file_private *file_priv = file->driver_priv;
 
+		cancel_delayed_work_sync(&file_priv->mm.idle_work);
+
 		spin_lock(&file_priv->mm.lock);
 		request->file_priv = file_priv;
 		list_add_tail(&request->client_list,
@@ -2135,6 +2151,7 @@ int __i915_add_request(struct intel_ring_buffer *ring,
 		i915_queue_hangcheck(ring->dev);
 
 		if (was_empty) {
+			cancel_delayed_work_sync(&dev_priv->mm.idle_work);
 			queue_delayed_work(dev_priv->wq,
 					   &dev_priv->mm.retire_work,
 					   round_jiffies_up_relative(HZ));
@@ -2156,10 +2173,12 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)
 		return;
 
 	spin_lock(&file_priv->mm.lock);
-	if (request->file_priv) {
-		list_del(&request->client_list);
-		request->file_priv = NULL;
-	}
+	list_del(&request->client_list);
+	if (list_empty(&file_priv->mm.request_list))
+		mod_delayed_work(to_i915(request->ring->dev)->wq,
+				 &file_priv->mm.idle_work,
+				 msecs_to_jiffies(100));
+	request->file_priv = NULL;
 	spin_unlock(&file_priv->mm.lock);
 }
 
@@ -2405,57 +2424,53 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
 	WARN_ON(i915_verify_lists(ring->dev));
 }
 
-void
+bool
 i915_gem_retire_requests(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct intel_ring_buffer *ring;
+	bool idle = true;
 	int i;
 
-	for_each_ring(ring, dev_priv, i)
+	for_each_ring(ring, dev_priv, i) {
 		i915_gem_retire_requests_ring(ring);
+		idle &= list_empty(&ring->request_list);
+	}
+
+	if (idle)
+		mod_delayed_work(dev_priv->wq,
+				   &dev_priv->mm.idle_work,
+				   msecs_to_jiffies(100));
+
+	return idle;
 }
 
 static void
 i915_gem_retire_work_handler(struct work_struct *work)
 {
-	drm_i915_private_t *dev_priv;
-	struct drm_device *dev;
-	struct intel_ring_buffer *ring;
+	struct drm_i915_private *dev_priv =
+		container_of(work, typeof(*dev_priv), mm.retire_work.work);
+	struct drm_device *dev = dev_priv->dev;
 	bool idle;
-	int i;
-
-	dev_priv = container_of(work, drm_i915_private_t,
-				mm.retire_work.work);
-	dev = dev_priv->dev;
 
 	/* Come back later if the device is busy... */
-	if (!mutex_trylock(&dev->struct_mutex)) {
-		queue_delayed_work(dev_priv->wq, &dev_priv->mm.retire_work,
-				   round_jiffies_up_relative(HZ));
-		return;
-	}
-
-	i915_gem_retire_requests(dev);
-
-	/* Send a periodic flush down the ring so we don't hold onto GEM
-	 * objects indefinitely.
-	 */
-	idle = true;
-	for_each_ring(ring, dev_priv, i) {
-		if (ring->gpu_caches_dirty)
-			i915_add_request(ring, NULL);
-
-		idle &= list_empty(&ring->request_list);
+	idle = false;
+	if (mutex_trylock(&dev->struct_mutex)) {
+		idle = i915_gem_retire_requests(dev);
+		mutex_unlock(&dev->struct_mutex);
 	}
-
-	if (!dev_priv->ums.mm_suspended && !idle)
+	if (!idle)
 		queue_delayed_work(dev_priv->wq, &dev_priv->mm.retire_work,
 				   round_jiffies_up_relative(HZ));
-	if (idle)
-		intel_mark_idle(dev);
+}
 
-	mutex_unlock(&dev->struct_mutex);
+static void
+i915_gem_idle_work_handler(struct work_struct *work)
+{
+	struct drm_i915_private *dev_priv =
+		container_of(work, typeof(*dev_priv), mm.idle_work.work);
+
+	intel_mark_idle(dev_priv->dev);
 }
 
 /**
@@ -2553,7 +2568,7 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 	reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
 	mutex_unlock(&dev->struct_mutex);
 
-	ret = __wait_seqno(ring, seqno, reset_counter, true, timeout);
+	ret = __wait_seqno(ring, seqno, reset_counter, true, timeout, file->driver_priv);
 	if (timeout)
 		args->timeout_ns = timespec_to_ns(timeout);
 	return ret;
@@ -3766,7 +3781,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 	if (seqno == 0)
 		return 0;
 
-	ret = __wait_seqno(ring, seqno, reset_counter, true, NULL);
+	ret = __wait_seqno(ring, seqno, reset_counter, true, NULL, NULL);
 	if (ret == 0)
 		queue_delayed_work(dev_priv->wq, &dev_priv->mm.retire_work, 0);
 
@@ -4239,6 +4254,7 @@ i915_gem_idle(struct drm_device *dev)
 
 	/* Cancel the retire work handler, which should be idle now. */
 	cancel_delayed_work_sync(&dev_priv->mm.retire_work);
+	cancel_delayed_work_sync(&dev_priv->mm.idle_work);
 
 	return 0;
 }
@@ -4571,6 +4587,8 @@ i915_gem_load(struct drm_device *dev)
 		INIT_LIST_HEAD(&dev_priv->fence_regs[i].lru_list);
 	INIT_DELAYED_WORK(&dev_priv->mm.retire_work,
 			  i915_gem_retire_work_handler);
+	INIT_DELAYED_WORK(&dev_priv->mm.idle_work,
+			  i915_gem_idle_work_handler);
 	init_waitqueue_head(&dev_priv->gpu_error.reset_queue);
 
 	/* On GEN3 we really need to make sure the ARB C3 LP bit is set */
@@ -4794,6 +4812,8 @@ void i915_gem_release(struct drm_device *dev, struct drm_file *file)
 {
 	struct drm_i915_file_private *file_priv = file->driver_priv;
 
+	cancel_delayed_work_sync(&file_priv->mm.idle_work);
+
 	/* Clean up our request list when the client is going away, so that
 	 * later retire_requests won't dereference our soon-to-be-gone
 	 * file_priv.
@@ -4811,6 +4831,38 @@ void i915_gem_release(struct drm_device *dev, struct drm_file *file)
 	spin_unlock(&file_priv->mm.lock);
 }
 
+static void
+i915_gem_file_idle_work_handler(struct work_struct *work)
+{
+	struct drm_i915_file_private *file_priv =
+		container_of(work, typeof(*file_priv), mm.idle_work.work);
+
+	atomic_set(&file_priv->rps_wait_boost, false);
+}
+
+int i915_gem_open(struct drm_device *dev, struct drm_file *file)
+{
+	struct drm_i915_file_private *file_priv;
+
+	DRM_DEBUG_DRIVER("\n");
+
+	file_priv = kzalloc(sizeof(*file_priv), GFP_KERNEL);
+	if (!file_priv)
+		return -ENOMEM;
+
+	file->driver_priv = file_priv;
+	file_priv->dev_priv = dev->dev_private;
+
+	spin_lock_init(&file_priv->mm.lock);
+	INIT_LIST_HEAD(&file_priv->mm.request_list);
+	INIT_DELAYED_WORK(&file_priv->mm.idle_work,
+			  i915_gem_file_idle_work_handler);
+
+	idr_init(&file_priv->context_idr);
+
+	return 0;
+}
+
 static bool mutex_is_locked_by(struct mutex *mutex, struct task_struct *task)
 {
 	if (!mutex_is_locked(mutex))
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 26882bd..90cfb3b 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -853,17 +853,6 @@ static void gen6_pm_rps_work(struct work_struct *work)
 			gen6_set_rps(dev_priv->dev, new_delay);
 	}
 
-	if (IS_VALLEYVIEW(dev_priv->dev)) {
-		/*
-		 * On VLV, when we enter RC6 we may not be at the minimum
-		 * voltage level, so arm a timer to check.  It should only
-		 * fire when there's activity or once after we've entered
-		 * RC6, and then won't be re-armed until the next RPS interrupt.
-		 */
-		mod_delayed_work(dev_priv->wq, &dev_priv->rps.vlv_work,
-				 msecs_to_jiffies(100));
-	}
-
 	mutex_unlock(&dev_priv->rps.hw_lock);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 701ce74..1eabca3 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -7525,6 +7525,9 @@ void intel_mark_idle(struct drm_device *dev)
 
 		intel_decrease_pllclock(crtc);
 	}
+
+	if (dev_priv->info->gen >= 6)
+		gen6_rps_idle(dev->dev_private);
 }
 
 void intel_mark_fb_busy(struct drm_i915_gem_object *obj,
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index ea97c23..c6807d7 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -799,4 +799,7 @@ extern void hsw_pc8_restore_interrupts(struct drm_device *dev);
 extern void intel_aux_display_runtime_get(struct drm_i915_private *dev_priv);
 extern void intel_aux_display_runtime_put(struct drm_i915_private *dev_priv);
 
+extern void gen6_rps_idle(struct drm_i915_private *dev_priv);
+extern void gen6_rps_boost(struct drm_i915_private *dev_priv);
+
 #endif /* __INTEL_DRV_H__ */
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 2207e27..54c0e1e 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3313,6 +3313,26 @@ void gen6_set_rps(struct drm_device *dev, u8 val)
 	trace_intel_gpu_freq_change(val * 50);
 }
 
+void gen6_rps_idle(struct drm_i915_private *dev_priv)
+{
+	mutex_lock(&dev_priv->rps.hw_lock);
+	if (dev_priv->info->is_valleyview)
+		valleyview_set_rps(dev_priv->dev, dev_priv->rps.min_delay);
+	else
+		gen6_set_rps(dev_priv->dev, dev_priv->rps.min_delay);
+	mutex_unlock(&dev_priv->rps.hw_lock);
+}
+
+void gen6_rps_boost(struct drm_i915_private *dev_priv)
+{
+	mutex_lock(&dev_priv->rps.hw_lock);
+	if (dev_priv->info->is_valleyview)
+		valleyview_set_rps(dev_priv->dev, dev_priv->rps.max_delay);
+	else
+		gen6_set_rps(dev_priv->dev, dev_priv->rps.max_delay);
+	mutex_unlock(&dev_priv->rps.hw_lock);
+}
+
 /*
  * Wait until the previous freq change has completed,
  * or the timeout elapsed, and then update our notion
@@ -3700,24 +3720,6 @@ int valleyview_rps_min_freq(struct drm_i915_private *dev_priv)
 	return vlv_punit_read(dev_priv, PUNIT_REG_GPU_LFM) & 0xff;
 }
 
-static void vlv_rps_timer_work(struct work_struct *work)
-{
-	drm_i915_private_t *dev_priv = container_of(work, drm_i915_private_t,
-						    rps.vlv_work.work);
-
-	/*
-	 * Timer fired, we must be idle.  Drop to min voltage state.
-	 * Note: we use RPe here since it should match the
-	 * Vmin we were shooting for.  That should give us better
-	 * perf when we come back out of RC6 than if we used the
-	 * min freq available.
-	 */
-	mutex_lock(&dev_priv->rps.hw_lock);
-	if (dev_priv->rps.cur_delay > dev_priv->rps.rpe_delay)
-		valleyview_set_rps(dev_priv->dev, dev_priv->rps.rpe_delay);
-	mutex_unlock(&dev_priv->rps.hw_lock);
-}
-
 static void valleyview_setup_pctx(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -3855,8 +3857,6 @@ static void valleyview_enable_rps(struct drm_device *dev)
 				      dev_priv->rps.rpe_delay),
 			 dev_priv->rps.rpe_delay);
 
-	INIT_DELAYED_WORK(&dev_priv->rps.vlv_work, vlv_rps_timer_work);
-
 	valleyview_set_rps(dev_priv->dev, dev_priv->rps.rpe_delay);
 
 	gen6_enable_rps_interrupts(dev);
@@ -4596,8 +4596,6 @@ void intel_disable_gt_powersave(struct drm_device *dev)
 	} else if (INTEL_INFO(dev)->gen >= 6) {
 		cancel_delayed_work_sync(&dev_priv->rps.delayed_resume_work);
 		cancel_work_sync(&dev_priv->rps.work);
-		if (IS_VALLEYVIEW(dev))
-			cancel_delayed_work_sync(&dev_priv->rps.vlv_work);
 		mutex_lock(&dev_priv->rps.hw_lock);
 		if (IS_VALLEYVIEW(dev))
 			valleyview_disable_rps(dev);
-- 
1.8.1.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 14/19] drm/i915: Tweak RPS thresholds to more aggressively downclock
  2013-09-10 22:36 [PATCH 00/19] drm-intel-collector push simple patches for review Rodrigo Vivi
                   ` (12 preceding siblings ...)
  2013-09-10 22:36 ` [PATCH 13/19] drm/i915: Boost RPS frequency for CPU stalls Rodrigo Vivi
@ 2013-09-10 22:36 ` Rodrigo Vivi
  2013-09-10 22:36 ` [PATCH 15/19] drm/i915: i915.quirks_set/quirks_mask overrides dmi match Rodrigo Vivi
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 30+ messages in thread
From: Rodrigo Vivi @ 2013-09-10 22:36 UTC (permalink / raw)
  To: intel-gfx; +Cc: Owen Taylor, Stéphane Marchesin, Zhuang, Lena

From: Chris Wilson <chris@chris-wilson.co.uk>

After applying wait-boost we often find ourselves stuck at higher clocks
than required. The current threshold value requires the GPU to be
continuously and completely idle for 313ms before it is dropped by one
bin. Conversely, we require the GPU to be busy for an average of 90% over
a 84ms period before we upclock. So the current thresholds almost never
downclock the GPU, and respond very slowly to sudden demands for more
power. It is easy to observe that we currently lock into the wrong bin
and both underperform in benchmarks and consume more power than optimal
(just by repeating the task and measuring the different results).

An alternative approach, as discussed in the bspec, is to use a
continuous threshold for upclocking, and an average value for downclocking.
This is good for quickly detecting and reacting to state changes within a
frame, however it fails with the common throttling method of waiting
upon the outstanding frame - at least it is difficult to choose a
threshold that works well at 15,000fps and at 60fps. So continue to use
average busy/idle loads to determine frequency change.

v2: Use 3 power zones to keep frequencies low in steady-state mostly
idle (e.g. scrolling, interactive 2D drawing), and frequencies high
for demanding games. In between those end-states, we use a
fast-reclocking algorithm to converge more quickly on the desired bin.

v3: Bug fixes - make sure we reset adj after switching power zones.

v4: Tune - drop the continuous busy thresholds as it prevents us from
choosing the right frequency for glxgears style swap benchmarks. Instead
the goal is to be able to find the right clocks irrespective of the
wait-boost.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Stéphane Marchesin <stephane.marchesin@gmail.com>
Cc: Owen Taylor <otaylor@redhat.com>
Cc: "Meng, Mengmeng" <mengmeng.meng@intel.com>
Cc: "Zhuang, Lena" <lena.zhuang@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_drv.h |   5 ++
 drivers/gpu/drm/i915/i915_irq.c |  46 ++++++++++----
 drivers/gpu/drm/i915/i915_reg.h |   2 +-
 drivers/gpu/drm/i915/intel_pm.c | 135 ++++++++++++++++++++++++++++++----------
 4 files changed, 141 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8fb1eb7..d648bfe 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -839,8 +839,13 @@ struct intel_gen6_power_mgmt {
 	u8 min_delay;
 	u8 max_delay;
 	u8 rpe_delay;
+	u8 rp1_delay;
+	u8 rp0_delay;
 	u8 hw_max;
 
+	int last_adj;
+	enum { LOW_POWER, BETWEEN, HIGH_POWER } power;
+
 	struct delayed_work delayed_resume_work;
 
 	/*
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 90cfb3b..f3f5ffe 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -812,7 +812,7 @@ static void gen6_pm_rps_work(struct work_struct *work)
 	drm_i915_private_t *dev_priv = container_of(work, drm_i915_private_t,
 						    rps.work);
 	u32 pm_iir;
-	u8 new_delay;
+	int new_delay, adj;
 
 	spin_lock_irq(&dev_priv->irq_lock);
 	pm_iir = dev_priv->rps.pm_iir;
@@ -829,29 +829,49 @@ static void gen6_pm_rps_work(struct work_struct *work)
 
 	mutex_lock(&dev_priv->rps.hw_lock);
 
+	adj = dev_priv->rps.last_adj;
 	if (pm_iir & GEN6_PM_RP_UP_THRESHOLD) {
-		new_delay = dev_priv->rps.cur_delay + 1;
+		if (adj > 0)
+			adj *= 2;
+		else
+			adj = 1;
+		new_delay = dev_priv->rps.cur_delay + adj;
 
 		/*
 		 * For better performance, jump directly
 		 * to RPe if we're below it.
 		 */
-		if (IS_VALLEYVIEW(dev_priv->dev) &&
-		    dev_priv->rps.cur_delay < dev_priv->rps.rpe_delay)
+		if (new_delay < dev_priv->rps.rpe_delay)
+			new_delay = dev_priv->rps.rpe_delay;
+	} else if (pm_iir & GEN6_PM_RP_DOWN_TIMEOUT) {
+		if (dev_priv->rps.cur_delay > dev_priv->rps.rpe_delay)
 			new_delay = dev_priv->rps.rpe_delay;
-	} else
-		new_delay = dev_priv->rps.cur_delay - 1;
+		else
+			new_delay = dev_priv->rps.min_delay;
+		adj = 0;
+	} else if (pm_iir & GEN6_PM_RP_DOWN_THRESHOLD) {
+		if (adj < 0)
+			adj *= 2;
+		else
+			adj = -1;
+		new_delay = dev_priv->rps.cur_delay + adj;
+	} else { /* unknown event */
+		new_delay = dev_priv->rps.cur_delay;
+	}
 
 	/* sysfs frequency interfaces may have snuck in while servicing the
 	 * interrupt
 	 */
-	if (new_delay >= dev_priv->rps.min_delay &&
-	    new_delay <= dev_priv->rps.max_delay) {
-		if (IS_VALLEYVIEW(dev_priv->dev))
-			valleyview_set_rps(dev_priv->dev, new_delay);
-		else
-			gen6_set_rps(dev_priv->dev, new_delay);
-	}
+	if (new_delay < (int)dev_priv->rps.min_delay)
+		new_delay = dev_priv->rps.min_delay;
+	if (new_delay > (int)dev_priv->rps.max_delay)
+		new_delay = dev_priv->rps.max_delay;
+	dev_priv->rps.last_adj = new_delay - dev_priv->rps.cur_delay;
+
+	if (IS_VALLEYVIEW(dev_priv->dev))
+		valleyview_set_rps(dev_priv->dev, new_delay);
+	else
+		gen6_set_rps(dev_priv->dev, new_delay);
 
 	mutex_unlock(&dev_priv->rps.hw_lock);
 }
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index ff60945..bcaba63 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -4669,7 +4669,7 @@
 #define   GEN6_RP_UP_IDLE_MIN			(0x1<<3)
 #define   GEN6_RP_UP_BUSY_AVG			(0x2<<3)
 #define   GEN6_RP_UP_BUSY_CONT			(0x4<<3)
-#define   GEN7_RP_DOWN_IDLE_AVG			(0x2<<0)
+#define   GEN6_RP_DOWN_IDLE_AVG			(0x2<<0)
 #define   GEN6_RP_DOWN_IDLE_CONT		(0x1<<0)
 #define GEN6_RP_UP_THRESHOLD			0xA02C
 #define GEN6_RP_DOWN_THRESHOLD			0xA030
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 54c0e1e..4af6c1e 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3280,6 +3280,98 @@ static u32 gen6_rps_limits(struct drm_i915_private *dev_priv, u8 val)
 	return limits;
 }
 
+static void gen6_set_rps_thresholds(struct drm_i915_private *dev_priv, u8 val)
+{
+	int new_power;
+
+	new_power = dev_priv->rps.power;
+	switch (dev_priv->rps.power) {
+	case LOW_POWER:
+		if (val > dev_priv->rps.rpe_delay + 1 && val > dev_priv->rps.cur_delay)
+			new_power = BETWEEN;
+		break;
+
+	case BETWEEN:
+		if (val <= dev_priv->rps.rpe_delay && val < dev_priv->rps.cur_delay)
+			new_power = LOW_POWER;
+		else if (val >= dev_priv->rps.rp0_delay && val > dev_priv->rps.cur_delay)
+			new_power = HIGH_POWER;
+		break;
+
+	case HIGH_POWER:
+		if (val < (dev_priv->rps.rp1_delay + dev_priv->rps.rp0_delay) >> 1 && val < dev_priv->rps.cur_delay)
+			new_power = BETWEEN;
+		break;
+	}
+	/* Max/min bins are special */
+	if (val == dev_priv->rps.min_delay)
+		new_power = LOW_POWER;
+	if (val == dev_priv->rps.max_delay)
+		new_power = HIGH_POWER;
+	if (new_power == dev_priv->rps.power)
+		return;
+
+	/* Note the units here are not exactly 1us, but 1280ns. */
+	switch (new_power) {
+	case LOW_POWER:
+		/* Upclock if more than 95% busy over 16ms */
+		I915_WRITE(GEN6_RP_UP_EI, 12500);
+		I915_WRITE(GEN6_RP_UP_THRESHOLD, 11800);
+
+		/* Downclock if less than 85% busy over 32ms */
+		I915_WRITE(GEN6_RP_DOWN_EI, 25000);
+		I915_WRITE(GEN6_RP_DOWN_THRESHOLD, 21250);
+
+		I915_WRITE(GEN6_RP_CONTROL,
+			   GEN6_RP_MEDIA_TURBO |
+			   GEN6_RP_MEDIA_HW_NORMAL_MODE |
+			   GEN6_RP_MEDIA_IS_GFX |
+			   GEN6_RP_ENABLE |
+			   GEN6_RP_UP_BUSY_AVG |
+			   GEN6_RP_DOWN_IDLE_AVG);
+		break;
+
+	case BETWEEN:
+		/* Upclock if more than 90% busy over 13ms */
+		I915_WRITE(GEN6_RP_UP_EI, 10250);
+		I915_WRITE(GEN6_RP_UP_THRESHOLD, 9225);
+
+		/* Downclock if less than 75% busy over 32ms */
+		I915_WRITE(GEN6_RP_DOWN_EI, 25000);
+		I915_WRITE(GEN6_RP_DOWN_THRESHOLD, 18750);
+
+		I915_WRITE(GEN6_RP_CONTROL,
+			   GEN6_RP_MEDIA_TURBO |
+			   GEN6_RP_MEDIA_HW_NORMAL_MODE |
+			   GEN6_RP_MEDIA_IS_GFX |
+			   GEN6_RP_ENABLE |
+			   GEN6_RP_UP_BUSY_AVG |
+			   GEN6_RP_DOWN_IDLE_AVG);
+		break;
+
+	case HIGH_POWER:
+		/* Upclock if more than 85% busy over 10ms */
+		I915_WRITE(GEN6_RP_UP_EI, 8000);
+		I915_WRITE(GEN6_RP_UP_THRESHOLD, 6800);
+
+		/* Downclock if less than 60% busy over 32ms */
+		I915_WRITE(GEN6_RP_DOWN_EI, 25000);
+		I915_WRITE(GEN6_RP_DOWN_THRESHOLD, 15000);
+
+		I915_WRITE(GEN6_RP_CONTROL,
+			   GEN6_RP_MEDIA_TURBO |
+			   GEN6_RP_MEDIA_HW_NORMAL_MODE |
+			   GEN6_RP_MEDIA_IS_GFX |
+			   GEN6_RP_ENABLE |
+			   GEN6_RP_UP_BUSY_AVG |
+			   GEN6_RP_DOWN_IDLE_AVG);
+		break;
+	}
+
+	dev_priv->rps.power = new_power;
+	dev_priv->rps.last_adj = 0;
+}
+
 void gen6_set_rps(struct drm_device *dev, u8 val)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -3291,6 +3383,8 @@ void gen6_set_rps(struct drm_device *dev, u8 val)
 	if (val == dev_priv->rps.cur_delay)
 		return;
 
+	gen6_set_rps_thresholds(dev_priv, val);
+
 	if (IS_HASWELL(dev))
 		I915_WRITE(GEN6_RPNSWREQ,
 			   HSW_FREQUENCY(val));
@@ -3512,7 +3606,10 @@ static void gen6_enable_rps(struct drm_device *dev)
 
 	/* In units of 50MHz */
 	dev_priv->rps.hw_max = dev_priv->rps.max_delay = rp_state_cap & 0xff;
-	dev_priv->rps.min_delay = (rp_state_cap & 0xff0000) >> 16;
+	dev_priv->rps.min_delay = (rp_state_cap >> 16) & 0xff;
+	dev_priv->rps.rp1_delay = (rp_state_cap >>  8) & 0xff;
+	dev_priv->rps.rp0_delay = (rp_state_cap >>  0) & 0xff;
+	dev_priv->rps.rpe_delay = dev_priv->rps.rp1_delay;
 	dev_priv->rps.cur_delay = 0;
 
 	/* disable the counters and set deterministic thresholds */
@@ -3560,38 +3657,9 @@ static void gen6_enable_rps(struct drm_device *dev)
 		   GEN6_RC_CTL_EI_MODE(1) |
 		   GEN6_RC_CTL_HW_ENABLE);
 
-	if (IS_HASWELL(dev)) {
-		I915_WRITE(GEN6_RPNSWREQ,
-			   HSW_FREQUENCY(10));
-		I915_WRITE(GEN6_RC_VIDEO_FREQ,
-			   HSW_FREQUENCY(12));
-	} else {
-		I915_WRITE(GEN6_RPNSWREQ,
-			   GEN6_FREQUENCY(10) |
-			   GEN6_OFFSET(0) |
-			   GEN6_AGGRESSIVE_TURBO);
-		I915_WRITE(GEN6_RC_VIDEO_FREQ,
-			   GEN6_FREQUENCY(12));
-	}
-
-	I915_WRITE(GEN6_RP_DOWN_TIMEOUT, 1000000);
-	I915_WRITE(GEN6_RP_INTERRUPT_LIMITS,
-		   dev_priv->rps.max_delay << 24 |
-		   dev_priv->rps.min_delay << 16);
-
-	I915_WRITE(GEN6_RP_UP_THRESHOLD, 59400);
-	I915_WRITE(GEN6_RP_DOWN_THRESHOLD, 245000);
-	I915_WRITE(GEN6_RP_UP_EI, 66000);
-	I915_WRITE(GEN6_RP_DOWN_EI, 350000);
-
+	/* Power down if completely idle for over 50ms */
+	I915_WRITE(GEN6_RP_DOWN_TIMEOUT, 50000);
 	I915_WRITE(GEN6_RP_IDLE_HYSTERSIS, 10);
-	I915_WRITE(GEN6_RP_CONTROL,
-		   GEN6_RP_MEDIA_TURBO |
-		   GEN6_RP_MEDIA_HW_NORMAL_MODE |
-		   GEN6_RP_MEDIA_IS_GFX |
-		   GEN6_RP_ENABLE |
-		   GEN6_RP_UP_BUSY_AVG |
-		   (IS_HASWELL(dev) ? GEN7_RP_DOWN_IDLE_AVG : GEN6_RP_DOWN_IDLE_CONT));
 
 	ret = sandybridge_pcode_write(dev_priv, GEN6_PCODE_WRITE_MIN_FREQ_TABLE, 0);
 	if (!ret) {
@@ -3607,7 +3675,8 @@ static void gen6_enable_rps(struct drm_device *dev)
 		DRM_DEBUG_DRIVER("Failed to set the min frequency\n");
 	}
 
-	gen6_set_rps(dev_priv->dev, (gt_perf_status & 0xff00) >> 8);
+	dev_priv->rps.power = HIGH_POWER; /* force a reset */
+	gen6_set_rps(dev_priv->dev, dev_priv->rps.min_delay);
 
 	gen6_enable_rps_interrupts(dev);
 
-- 
1.8.1.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 15/19] drm/i915: i915.quirks_set/quirks_mask overrides dmi match
  2013-09-10 22:36 [PATCH 00/19] drm-intel-collector push simple patches for review Rodrigo Vivi
                   ` (13 preceding siblings ...)
  2013-09-10 22:36 ` [PATCH 14/19] drm/i915: Tweak RPS thresholds to more aggressively downclock Rodrigo Vivi
@ 2013-09-10 22:36 ` Rodrigo Vivi
  2013-09-11  8:21   ` [Intel-gfx] " Jani Nikula
  2013-09-10 22:36 ` [PATCH 16/19] drm/i915: Fix l3 parity buffer offset Rodrigo Vivi
                   ` (3 subsequent siblings)
  18 siblings, 1 reply; 30+ messages in thread
From: Rodrigo Vivi @ 2013-09-10 22:36 UTC (permalink / raw)
  To: intel-gfx; +Cc: Kamal Mostafa, stable, Rodrigo Vivi

From: Kamal Mostafa <kamal@canonical.com>

Boot params quirks_set and quirks_mask allow the user to enable or
inhibit any dmi-matched quirks, overriding the dmi match table.  Examples:

  i915.quirks_set=0x2   - enables QUIRK_LVDS_SSC_DISABLE
  i915.quirks_set=0x8   - enables QUIRK_NO_PCH_PWM_ENABLE
  i915.quirks_mask=0x8  - disables QUIRK_NO_PCH_PWM_ENABLE
  i915.quirks_mask=0xFF - disables all quirks

Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/intel_display.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 1eabca3..12607f2 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -10043,8 +10043,19 @@ static struct intel_quirk intel_quirks[] = {
 	{ 0x0166, 0x1028, 0x058b, quirk_no_pcm_pwm_enable },
 };
 
+unsigned long i915_quirks_set __read_mostly = 0;
+module_param_named(quirks_set, i915_quirks_set, ulong, 0600);
+MODULE_PARM_DESC(quirks_set,
+		"Enable specified quirks bits.");
+
+unsigned long i915_quirks_mask __read_mostly = 0;
+module_param_named(quirks_mask, i915_quirks_mask, ulong, 0600);
+MODULE_PARM_DESC(quirks_mask,
+		"Disable specified quirks bits (override dmi match).");
+
 static void intel_init_quirks(struct drm_device *dev)
 {
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct pci_dev *d = dev->pdev;
 	int i;
 
@@ -10062,6 +10073,10 @@ static void intel_init_quirks(struct drm_device *dev)
 		if (dmi_check_system(*intel_dmi_quirks[i].dmi_id_list) != 0)
 			intel_dmi_quirks[i].hook(dev);
 	}
+
+	/* handle user-specified quirks overrides */
+	dev_priv->quirks |= i915_quirks_set;
+	dev_priv->quirks &= ~i915_quirks_mask;
 }
 
 /* Disable the VGA plane that we never use */
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 16/19] drm/i915: Fix l3 parity buffer offset
  2013-09-10 22:36 [PATCH 00/19] drm-intel-collector push simple patches for review Rodrigo Vivi
                   ` (14 preceding siblings ...)
  2013-09-10 22:36 ` [PATCH 15/19] drm/i915: i915.quirks_set/quirks_mask overrides dmi match Rodrigo Vivi
@ 2013-09-10 22:36 ` Rodrigo Vivi
  2013-09-11 11:45   ` Ville Syrjälä
  2013-09-10 22:36 ` [PATCH 17/19] drm/i915: Fix HSW parity test Rodrigo Vivi
                   ` (2 subsequent siblings)
  18 siblings, 1 reply; 30+ messages in thread
From: Rodrigo Vivi @ 2013-09-10 22:36 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

The buf pointer used during l3_write is just char *, therefore it does
not require the silly /4.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_sysfs.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index 05195c0..70de7a9 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -184,9 +184,7 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
 	if (temp)
 		dev_priv->l3_parity.remap_info = temp;
 
-	memcpy(dev_priv->l3_parity.remap_info + (offset/4),
-	       buf + (offset/4),
-	       count);
+	memcpy(dev_priv->l3_parity.remap_info + (offset/4), buf, count);
 
 	i915_gem_l3_remap(drm_dev);
 
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 17/19] drm/i915: Fix HSW parity test
  2013-09-10 22:36 [PATCH 00/19] drm-intel-collector push simple patches for review Rodrigo Vivi
                   ` (15 preceding siblings ...)
  2013-09-10 22:36 ` [PATCH 16/19] drm/i915: Fix l3 parity buffer offset Rodrigo Vivi
@ 2013-09-10 22:36 ` Rodrigo Vivi
  2013-09-10 22:36 ` [PATCH 18/19] drm/i915: Allow GT3 Slice Shutdown on Boot Rodrigo Vivi
  2013-09-10 22:36 ` [PATCH 19/19] drm/i915: Allow Dynamically GT3 Slice Shutdown Rodrigo Vivi
  18 siblings, 0 replies; 30+ messages in thread
From: Rodrigo Vivi @ 2013-09-10 22:36 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Ben Widawsky

From: Ben Widawsky <benjamin.widawsky@intel.com>

Haswell changed the log registers to be WO, so we can no longer read
them to determine the programming (which sucks, see later note). For
now, simply use the cached value, and hope HW doesn't screw us over.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57441
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Tested-by: shui yangwei <yangweix.shui@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_sysfs.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index 70de7a9..491aba6 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -129,6 +129,19 @@ i915_l3_read(struct file *filp, struct kobject *kobj,
 	if (ret)
 		return ret;
 
+	if (IS_HASWELL(drm_dev)) {
+		int last = min_t(int, GEN7_L3LOG_SIZE, count + offset);
+		if ((!dev_priv->l3_parity.remap_info))
+			memset(buf + offset, 0, last - offset);
+		else
+			memcpy(buf + offset,
+			       dev_priv->l3_parity.remap_info + (offset/4),
+			       last - offset);
+
+		i = last;
+		goto out;
+	}
+
 	misccpctl = I915_READ(GEN7_MISCCPCTL);
 	I915_WRITE(GEN7_MISCCPCTL, misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE);
 
@@ -137,6 +150,7 @@ i915_l3_read(struct file *filp, struct kobject *kobj,
 
 	I915_WRITE(GEN7_MISCCPCTL, misccpctl);
 
+out:
 	mutex_unlock(&drm_dev->struct_mutex);
 
 	return i - offset;
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 18/19] drm/i915: Allow GT3 Slice Shutdown on Boot.
  2013-09-10 22:36 [PATCH 00/19] drm-intel-collector push simple patches for review Rodrigo Vivi
                   ` (16 preceding siblings ...)
  2013-09-10 22:36 ` [PATCH 17/19] drm/i915: Fix HSW parity test Rodrigo Vivi
@ 2013-09-10 22:36 ` Rodrigo Vivi
  2013-09-10 22:36 ` [PATCH 19/19] drm/i915: Allow Dynamically GT3 Slice Shutdown Rodrigo Vivi
  18 siblings, 0 replies; 30+ messages in thread
From: Rodrigo Vivi @ 2013-09-10 22:36 UTC (permalink / raw)
  To: intel-gfx

Slice shutdown is a power savings feature whereby parts of HW i.e. slice is
shut off on boot or dynamically to save power.

This patch only introduces a way to disable half of Haswell GT3 slices on boot.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_drv.c |  5 +++++
 drivers/gpu/drm/i915/i915_drv.h |  1 +
 drivers/gpu/drm/i915/i915_gem.c |  5 -----
 drivers/gpu/drm/i915/i915_reg.h | 11 +++++++++++
 drivers/gpu/drm/i915/intel_pm.c | 23 +++++++++++++++++++++++
 5 files changed, 40 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 72e2be7..2650071 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -154,6 +154,11 @@ module_param_named(prefault_disable, i915_prefault_disable, bool, 0600);
 MODULE_PARM_DESC(prefault_disable,
 		"Disable page prefaulting for pread/pwrite/reloc (default:false). For developers only.");
 
+int i915_gt3_policy __read_mostly = 1;
+module_param_named(gt3_policy, i915_gt3_policy, int, 0600);
+MODULE_PARM_DESC(gt3_policy,
+		 "GT3 boot with Full (1) or Half (0) slices enabled. (default:full)");
+
 static struct drm_driver driver;
 extern int intel_agp_enabled;
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d648bfe..62bac41 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1743,6 +1743,7 @@ extern bool i915_fastboot __read_mostly;
 extern int i915_enable_pc8 __read_mostly;
 extern int i915_pc8_timeout __read_mostly;
 extern bool i915_prefault_disable __read_mostly;
+extern int i915_gt3_policy __read_mostly;
 
 extern int i915_suspend(struct drm_device *dev, pm_message_t state);
 extern int i915_resume(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 89c2844..5fcbaad 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4388,11 +4388,6 @@ i915_gem_init_hw(struct drm_device *dev)
 	if (dev_priv->ellc_size)
 		I915_WRITE(HSW_IDICR, I915_READ(HSW_IDICR) | IDIHASHMSK(0xf));
 
-	if (IS_HSW_GT3(dev))
-		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
-	else
-		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
-
 	if (HAS_PCH_NOP(dev)) {
 		u32 temp = I915_READ(GEN7_MSG_CTL);
 		temp &= ~(WAIT_FOR_PCH_FLR_ACK | WAIT_FOR_PCH_RESET_ACK);
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index bcaba63..dc2adcb 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -269,6 +269,17 @@
 #define  LOWER_SLICE_ENABLED	(1<<0)
 #define  LOWER_SLICE_DISABLED	(0<<0)
 
+#define HSW_GT_SLICE_INFO	0x138064
+#define   SLICE_SEL_BOTH	(1<<3)
+#define   SLICE_AUTOWAKE	(1<<2)
+#define   SLICE_STATUS_MASK	0x3
+#define   SLICE_STATUS_GT_OFF	(0<<0)
+#define   SLICE_STATUS_MAIN_ON	(2<<0)
+#define   SLICE_STATUS_BOTH_ON	(3<<0)
+
+#define HSW_SLICESHUTDOWN	0xA190
+#define   SLICE_SHUTDOWN	(1<<0)
+
 /*
  * 3D instructions used by the kernel
  */
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 4af6c1e..2b9db88 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3697,6 +3697,28 @@ static void gen6_enable_rps(struct drm_device *dev)
 	gen6_gt_force_wake_put(dev_priv);
 }
 
+static void intel_init_gt3_slices(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (!IS_HSW_GT3(dev))
+		return;
+
+	if (!i915_gt3_policy) {
+		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+		POSTING_READ(MI_PREDICATE_RESULT_2);
+
+		I915_WRITE(HSW_SLICESHUTDOWN, SLICE_SHUTDOWN);
+		POSTING_READ(HSW_SLICESHUTDOWN);
+
+		I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
+		POSTING_READ(HSW_GT_SLICE_INFO);
+	} else {
+		I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
+		POSTING_READ(MI_PREDICATE_RESULT_2);
+	}
+}
+
 void gen6_update_ring_freq(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -4688,6 +4710,7 @@ static void intel_gen6_powersave_work(struct work_struct *work)
 	} else {
 		gen6_enable_rps(dev);
 		gen6_update_ring_freq(dev);
+		intel_init_gt3_slices(dev);
 	}
 	mutex_unlock(&dev_priv->rps.hw_lock);
 }
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 19/19] drm/i915: Allow Dynamically GT3 Slice Shutdown.
  2013-09-10 22:36 [PATCH 00/19] drm-intel-collector push simple patches for review Rodrigo Vivi
                   ` (17 preceding siblings ...)
  2013-09-10 22:36 ` [PATCH 18/19] drm/i915: Allow GT3 Slice Shutdown on Boot Rodrigo Vivi
@ 2013-09-10 22:36 ` Rodrigo Vivi
  18 siblings, 0 replies; 30+ messages in thread
From: Rodrigo Vivi @ 2013-09-10 22:36 UTC (permalink / raw)
  To: intel-gfx

Slice shutdown is a power savings feature whereby parts of HW i.e. slice is
shut off on boot or dynamically to save power.

This patch introduces a sysfs interface to easily allow dynamically switch
between full and half GT3 slices.

v2: remove unused variables and fix identation

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/i915_sysfs.c | 48 +++++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_drv.h  |  4 +++-
 drivers/gpu/drm/i915/intel_pm.c   | 31 +++++++++++++++++++++++++
 3 files changed, 82 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index 491aba6..a87c260 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -93,6 +93,48 @@ static struct attribute_group rc6_attr_group = {
 	.name = power_group_name,
 	.attrs =  rc6_attrs
 };
+
+static ssize_t gt3_policy_show(struct device *kdev,
+			      struct device_attribute *attr, char *buf)
+{
+	struct drm_minor *minor = container_of(kdev, struct drm_minor, kdev);
+	struct drm_device *dev = minor->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	return sprintf(buf, "%s\n", I915_READ(MI_PREDICATE_RESULT_2) ==
+		       LOWER_SLICE_ENABLED ? "full" : "half");
+}
+
+static ssize_t gt3_policy_store(struct device *kdev,
+			       struct device_attribute *attr,
+			       const char *buf, size_t count)
+{
+	struct drm_minor *minor = container_of(kdev, struct drm_minor, kdev);
+	struct drm_device *dev = minor->dev;
+
+	if (!strncmp(buf, "full", sizeof("full") - 1))
+		intel_set_gt3_full(dev);
+	else if (!strncmp(buf, "half", sizeof("half") - 1))
+		intel_set_gt3_half(dev);
+	else
+		return -EINVAL;
+
+	return count;
+}
+
+static DEVICE_ATTR(gt3_policy, S_IRUGO | S_IWUSR,
+		   gt3_policy_show, gt3_policy_store);
+
+static struct attribute *gt3_policy_attrs[] = {
+	&dev_attr_gt3_policy.attr,
+	NULL
+};
+
+static struct attribute_group gt3_policy_attr_group = {
+	.name = power_group_name,
+	.attrs =  gt3_policy_attrs
+};
+
 #endif
 
 static int l3_access_valid(struct drm_device *dev, loff_t offset)
@@ -518,6 +560,12 @@ void i915_setup_sysfs(struct drm_device *dev)
 		if (ret)
 			DRM_ERROR("RC6 residency sysfs setup failed\n");
 	}
+	if (IS_HSW_GT3(dev)) {
+		ret = sysfs_merge_group(&dev->primary->kdev.kobj,
+					&gt3_policy_attr_group);
+		if (ret)
+			DRM_ERROR("GT3 policy sysfs setup failed\n");
+	}
 #endif
 	if (HAS_L3_GPU_CACHE(dev)) {
 		ret = device_create_bin_file(&dev->primary->kdev, &dpf_attrs);
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index c6807d7..ced31f1 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -739,7 +739,9 @@ extern void intel_update_fbc(struct drm_device *dev);
 /* IPS */
 extern void intel_gpu_ips_init(struct drm_i915_private *dev_priv);
 extern void intel_gpu_ips_teardown(void);
-
+/* Slice Shutdown */
+extern void intel_set_gt3_full(struct drm_device *dev);
+extern void intel_set_gt3_half(struct drm_device *dev);
 /* Power well */
 extern int i915_init_power_well(struct drm_device *dev);
 extern void i915_remove_power_well(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 2b9db88..a87be43 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3697,6 +3697,37 @@ static void gen6_enable_rps(struct drm_device *dev)
 	gen6_gt_force_wake_put(dev_priv);
 }
 
+void intel_set_gt3_full(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (!IS_HSW_GT3(dev))
+		return;
+
+	I915_WRITE(HSW_GT_SLICE_INFO, SLICE_SEL_BOTH);
+	POSTING_READ(HSW_GT_SLICE_INFO);
+
+	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_ENABLED);
+	POSTING_READ(MI_PREDICATE_RESULT_2);
+}
+
+void intel_set_gt3_half(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (!IS_HSW_GT3(dev))
+		return;
+
+	I915_WRITE(HSW_SLICESHUTDOWN, SLICE_SHUTDOWN);
+	POSTING_READ(HSW_SLICESHUTDOWN);
+
+	I915_WRITE(HSW_GT_SLICE_INFO, ~SLICE_SEL_BOTH);
+	POSTING_READ(HSW_GT_SLICE_INFO);
+
+	I915_WRITE(MI_PREDICATE_RESULT_2, LOWER_SLICE_DISABLED);
+	POSTING_READ(MI_PREDICATE_RESULT_2);
+}
+
 static void intel_init_gt3_slices(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 13/19] drm/i915: Boost RPS frequency for CPU stalls
  2013-09-10 22:36 ` [PATCH 13/19] drm/i915: Boost RPS frequency for CPU stalls Rodrigo Vivi
@ 2013-09-10 22:53   ` Chris Wilson
  0 siblings, 0 replies; 30+ messages in thread
From: Chris Wilson @ 2013-09-10 22:53 UTC (permalink / raw)
  To: Rodrigo Vivi
  Cc: Owen Taylor, intel-gfx, Stéphane Marchesin, Zhuang, Lena

On Tue, Sep 10, 2013 at 07:36:42PM -0300, Rodrigo Vivi wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> If we encounter a situation where the CPU blocks waiting for results
> from the GPU, give the GPU a kick to boost its the frequency.
> 
> This should work to reduce user interface stalls and to quickly promote
> mesa to high frequencies - but the cost is that our requested frequency
> stalls high (as we do not idle for long enough before rc6 to start
> reducing frequencies, nor are we aggressive at down clocking an
> underused GPU). However, this should be mitigated by rc6 itself powering
> off the GPU when idle, and that energy use is dependent upon the workload
> of the GPU in addition to its frequency (e.g. the math or sampler
> functions only consume power when used). Still, this is likely to
> adversely affect light workloads.
> 
> In particular, this nearly eliminates the highly noticeable wake-up lag
> in animations from idle. For example, expose or workspace transitions.
> (However, given the situation where we fail to downclock, our requested
> frequency is almost always the maximum, except for Baytrail where we
> manually downclock upon idling. This often masks the latency of
> upclocking after being idle, so animations are typically smooth - at the
> cost of increased power consumption.)
> 
> Stéphane raised the concern that this will punish good applications and
> reward bad applications - but due to the nature of how mesa performs its
> client throttling, I believe all mesa applications will be roughly
> equally affected. To address this concern, and to prevent applications
> like compositors from regularly boosting the RPS state, we only allow
> each client to receive one boost in each period of activity.
> 
> Unfortunately, this techinique is ineffective with Ironlake - which also
> has dynamic render power states and suffers just as dramatically. For
> Ironlake, the thermal/power headroom is shared with the CPU through
> Intelligent Power Sharing and the intel-ips module. This leaves us with
> no GPU boost frequencies available when coming out of idle, and due to
> hardware limitations we cannot change the arbitration between the CPU and
> GPU quickly enough to be effective.
> 
> v2: Limit each client to receiving a single boost for each active period.
> v3: Tidy up. Allow the device to always boost if it waits outside of
> client context.

There's a remaining issue that the compositor needs the boost
regularly as even though they never idle themselves (therefore under the
above scheme they are only allowed to gain one wait-boost), the GPU
sleeps waiting for flips. The latency from waking up from rc6 after a
flip is enough for the compositor to start dropping frames.

At the moment I'm running with

http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=perf&id=377a87649111b3fccf5ba488811706c3aeab89e1

on top (and v4 anyway).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Intel-gfx] [PATCH 15/19] drm/i915: i915.quirks_set/quirks_mask overrides dmi match
  2013-09-10 22:36 ` [PATCH 15/19] drm/i915: i915.quirks_set/quirks_mask overrides dmi match Rodrigo Vivi
@ 2013-09-11  8:21   ` Jani Nikula
  2013-09-17 16:42     ` Kamal Mostafa
  0 siblings, 1 reply; 30+ messages in thread
From: Jani Nikula @ 2013-09-11  8:21 UTC (permalink / raw)
  To: Rodrigo Vivi, Kamal Mostafa, intel-gfx; +Cc: stable

On Wed, 11 Sep 2013, Rodrigo Vivi <rodrigo.vivi@gmail.com> wrote:
> From: Kamal Mostafa <kamal@canonical.com>
>
> Boot params quirks_set and quirks_mask allow the user to enable or
> inhibit any dmi-matched quirks, overriding the dmi match table.  Examples:
>
>   i915.quirks_set=0x2   - enables QUIRK_LVDS_SSC_DISABLE
>   i915.quirks_set=0x8   - enables QUIRK_NO_PCH_PWM_ENABLE
>   i915.quirks_mask=0x8  - disables QUIRK_NO_PCH_PWM_ENABLE
>   i915.quirks_mask=0xFF - disables all quirks

While I admit this can be used to workaround issues with certain
machines, this is a hack that exposes an implementation detail to the
userspace, and carves it into the stone. Any user of it would have to
look at the kernel source to make a smart choice of parameters; more
likely forums would start filling with tips what to set.

So NAK from me.

That said, I think we still have a few stones unturned wrt backlight and
what BIOS does in UEFI mode.


Cheers,
Jani.

>
> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
> Cc: stable@vger.kernel.org
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
> ---
>  drivers/gpu/drm/i915/intel_display.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 1eabca3..12607f2 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -10043,8 +10043,19 @@ static struct intel_quirk intel_quirks[] = {
>  	{ 0x0166, 0x1028, 0x058b, quirk_no_pcm_pwm_enable },
>  };
>  
> +unsigned long i915_quirks_set __read_mostly = 0;
> +module_param_named(quirks_set, i915_quirks_set, ulong, 0600);
> +MODULE_PARM_DESC(quirks_set,
> +		"Enable specified quirks bits.");
> +
> +unsigned long i915_quirks_mask __read_mostly = 0;
> +module_param_named(quirks_mask, i915_quirks_mask, ulong, 0600);
> +MODULE_PARM_DESC(quirks_mask,
> +		"Disable specified quirks bits (override dmi match).");
> +
>  static void intel_init_quirks(struct drm_device *dev)
>  {
> +	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct pci_dev *d = dev->pdev;
>  	int i;
>  
> @@ -10062,6 +10073,10 @@ static void intel_init_quirks(struct drm_device *dev)
>  		if (dmi_check_system(*intel_dmi_quirks[i].dmi_id_list) != 0)
>  			intel_dmi_quirks[i].hook(dev);
>  	}
> +
> +	/* handle user-specified quirks overrides */
> +	dev_priv->quirks |= i915_quirks_set;
> +	dev_priv->quirks &= ~i915_quirks_mask;
>  }
>  
>  /* Disable the VGA plane that we never use */
> -- 
> 1.8.1.4
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Jani Nikula, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 16/19] drm/i915: Fix l3 parity buffer offset
  2013-09-10 22:36 ` [PATCH 16/19] drm/i915: Fix l3 parity buffer offset Rodrigo Vivi
@ 2013-09-11 11:45   ` Ville Syrjälä
  2013-09-11 17:07     ` Ben Widawsky
  0 siblings, 1 reply; 30+ messages in thread
From: Ville Syrjälä @ 2013-09-11 11:45 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx, Ben Widawsky, Ben Widawsky

On Tue, Sep 10, 2013 at 07:36:45PM -0300, Rodrigo Vivi wrote:
> From: Ben Widawsky <benjamin.widawsky@intel.com>
> 
> The buf pointer used during l3_write is just char *, therefore it does
> not require the silly /4.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
> ---
>  drivers/gpu/drm/i915/i915_sysfs.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
> index 05195c0..70de7a9 100644
> --- a/drivers/gpu/drm/i915/i915_sysfs.c
> +++ b/drivers/gpu/drm/i915/i915_sysfs.c
> @@ -184,9 +184,7 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
>  	if (temp)
>  		dev_priv->l3_parity.remap_info = temp;
>  
> -	memcpy(dev_priv->l3_parity.remap_info + (offset/4),
> -	       buf + (offset/4),
> -	       count);
> +	memcpy(dev_priv->l3_parity.remap_info + (offset/4), buf, count);

The commit message doesn't really reflect the fact that you completely
remove the offset from 'buf', which is actually the correct thing to do,
but the commit message should match the patch.

Also i915_l3_read() should get the same fix.

And while on the subject, I don't understand why we're playing weird
tricks with the remap_info memory allocation. Ie. why don't we simply
do this?

 ...
 if (!dev_priv->l3_parity.remap_info) {
 	dev_priv->l3_parity.remap_info = kzalloc(GEN7_L3LOG_SIZE,
 	                                         GFP_KERNEL);

 	if (!dev_priv->l3_parity.remap_info) {
 		mutex_unlock(&drm_dev->struct_mutex);
 		return -ENOMEM;
 	}
 }
 ...

>  
>  	i915_gem_l3_remap(drm_dev);
>  
> -- 
> 1.8.1.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 16/19] drm/i915: Fix l3 parity buffer offset
  2013-09-11 11:45   ` Ville Syrjälä
@ 2013-09-11 17:07     ` Ben Widawsky
  2013-09-11 17:44       ` Ville Syrjälä
  0 siblings, 1 reply; 30+ messages in thread
From: Ben Widawsky @ 2013-09-11 17:07 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx, Ben Widawsky

On Wed, Sep 11, 2013 at 02:45:28PM +0300, Ville Syrjälä wrote:
> On Tue, Sep 10, 2013 at 07:36:45PM -0300, Rodrigo Vivi wrote:
> > From: Ben Widawsky <benjamin.widawsky@intel.com>
> > 
> > The buf pointer used during l3_write is just char *, therefore it does
> > not require the silly /4.
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
> > ---
> >  drivers/gpu/drm/i915/i915_sysfs.c | 4 +---
> >  1 file changed, 1 insertion(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
> > index 05195c0..70de7a9 100644
> > --- a/drivers/gpu/drm/i915/i915_sysfs.c
> > +++ b/drivers/gpu/drm/i915/i915_sysfs.c
> > @@ -184,9 +184,7 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
> >  	if (temp)
> >  		dev_priv->l3_parity.remap_info = temp;
> >  
> > -	memcpy(dev_priv->l3_parity.remap_info + (offset/4),
> > -	       buf + (offset/4),
> > -	       count);
> > +	memcpy(dev_priv->l3_parity.remap_info + (offset/4), buf, count);
> 
> The commit message doesn't really reflect the fact that you completely
> remove the offset from 'buf', which is actually the correct thing to do,
> but the commit message should match the patch.

I don't know, it seems pretty clear to me. Shall I just copy and paste
what you wrote here?

> 
> Also i915_l3_read() should get the same fix.

Should it?

> 
> And while on the subject, I don't understand why we're playing weird
> tricks with the remap_info memory allocation. Ie. why don't we simply
> do this?

It was mostly to be able to differentiate NULL vs. 0's. The former being
no remaps ever, the latter being a cleared remapping. That's why temp is
used.

> 
>  ...
>  if (!dev_priv->l3_parity.remap_info) {
>  	dev_priv->l3_parity.remap_info = kzalloc(GEN7_L3LOG_SIZE,
>  	                                         GFP_KERNEL);
> 
>  	if (!dev_priv->l3_parity.remap_info) {
>  		mutex_unlock(&drm_dev->struct_mutex);
>  		return -ENOMEM;
>  	}
>  }
>  ...
> 
> >  
> >  	i915_gem_l3_remap(drm_dev);
> >  
> > -- 
> > 1.8.1.4
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Ville Syrjälä
> Intel OTC

-- 
Ben Widawsky, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 16/19] drm/i915: Fix l3 parity buffer offset
  2013-09-11 17:07     ` Ben Widawsky
@ 2013-09-11 17:44       ` Ville Syrjälä
  2013-09-11 17:59         ` Ben Widawsky
  0 siblings, 1 reply; 30+ messages in thread
From: Ville Syrjälä @ 2013-09-11 17:44 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: intel-gfx, Ben Widawsky

On Wed, Sep 11, 2013 at 10:07:39AM -0700, Ben Widawsky wrote:
> On Wed, Sep 11, 2013 at 02:45:28PM +0300, Ville Syrjälä wrote:
> > On Tue, Sep 10, 2013 at 07:36:45PM -0300, Rodrigo Vivi wrote:
> > > From: Ben Widawsky <benjamin.widawsky@intel.com>
> > > 
> > > The buf pointer used during l3_write is just char *, therefore it does
> > > not require the silly /4.
> > > 
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_sysfs.c | 4 +---
> > >  1 file changed, 1 insertion(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
> > > index 05195c0..70de7a9 100644
> > > --- a/drivers/gpu/drm/i915/i915_sysfs.c
> > > +++ b/drivers/gpu/drm/i915/i915_sysfs.c
> > > @@ -184,9 +184,7 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
> > >  	if (temp)
> > >  		dev_priv->l3_parity.remap_info = temp;
> > >  
> > > -	memcpy(dev_priv->l3_parity.remap_info + (offset/4),
> > > -	       buf + (offset/4),
> > > -	       count);
> > > +	memcpy(dev_priv->l3_parity.remap_info + (offset/4), buf, count);
> > 
> > The commit message doesn't really reflect the fact that you completely
> > remove the offset from 'buf', which is actually the correct thing to do,
> > but the commit message should match the patch.
> 
> I don't know, it seems pretty clear to me. Shall I just copy and paste
> what you wrote here?

The commit msg says this:
- buf + offset/4
+ buf + offset

But the patch does this:
- buf + offset/4
+ buf

Doesn't seem very clear to me :)

> 
> > 
> > Also i915_l3_read() should get the same fix.
> 
> Should it?

Yes, 'buf' is just the buffer userspace will eventually get from the
read() syscall. Seeking seeks inside the file, not inside the buffer
userspace passes to read()/write().

Well, actually 'buf' is one of two temp buffers inside the kernel. It
looks like bin.c will copy from that buffer to another temp buffer,
and then from the second temp buffer to userspace. Seems very
inefficient, but perhaps there's a reason for it.

> 
> > 
> > And while on the subject, I don't understand why we're playing weird
> > tricks with the remap_info memory allocation. Ie. why don't we simply
> > do this?
> 
> It was mostly to be able to differentiate NULL vs. 0's. The former being
> no remaps ever, the latter being a cleared remapping. That's why temp is
> used.

Hmm. I don't get it. The only difference between my idea and the current
code is if i915_gpu_idle() fails for the first write. But we could just
do the allocation after we've called i915_gpu_idle() and then there's no
longer any difference.

Also there's no guarantee that userspace would even write the whole
thing. It could just write one byte and then we'd implicitly clear the
rest to 0. Not sure if that's considered OK or not.

Actually 'temp' would make more sense if we'd do the allocation outside
struct_mutex.

> > 
> >  ...
> >  if (!dev_priv->l3_parity.remap_info) {
> >  	dev_priv->l3_parity.remap_info = kzalloc(GEN7_L3LOG_SIZE,
> >  	                                         GFP_KERNEL);
> > 
> >  	if (!dev_priv->l3_parity.remap_info) {
> >  		mutex_unlock(&drm_dev->struct_mutex);
> >  		return -ENOMEM;
> >  	}
> >  }
> >  ...
> > 
> > >  
> > >  	i915_gem_l3_remap(drm_dev);
> > >  
> > > -- 
> > > 1.8.1.4
> > > 
> > > _______________________________________________
> > > Intel-gfx mailing list
> > > Intel-gfx@lists.freedesktop.org
> > > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> > 
> > -- 
> > Ville Syrjälä
> > Intel OTC
> 
> -- 
> Ben Widawsky, Intel Open Source Technology Center

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 16/19] drm/i915: Fix l3 parity buffer offset
  2013-09-11 17:44       ` Ville Syrjälä
@ 2013-09-11 17:59         ` Ben Widawsky
  0 siblings, 0 replies; 30+ messages in thread
From: Ben Widawsky @ 2013-09-11 17:59 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx, Ben Widawsky

On Wed, Sep 11, 2013 at 08:44:21PM +0300, Ville Syrjälä wrote:
> On Wed, Sep 11, 2013 at 10:07:39AM -0700, Ben Widawsky wrote:
> > On Wed, Sep 11, 2013 at 02:45:28PM +0300, Ville Syrjälä wrote:
> > > On Tue, Sep 10, 2013 at 07:36:45PM -0300, Rodrigo Vivi wrote:
> > > > From: Ben Widawsky <benjamin.widawsky@intel.com>
> > > > 
> > > > The buf pointer used during l3_write is just char *, therefore it does
> > > > not require the silly /4.
> > > > 
> > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_sysfs.c | 4 +---
> > > >  1 file changed, 1 insertion(+), 3 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
> > > > index 05195c0..70de7a9 100644
> > > > --- a/drivers/gpu/drm/i915/i915_sysfs.c
> > > > +++ b/drivers/gpu/drm/i915/i915_sysfs.c
> > > > @@ -184,9 +184,7 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
> > > >  	if (temp)
> > > >  		dev_priv->l3_parity.remap_info = temp;
> > > >  
> > > > -	memcpy(dev_priv->l3_parity.remap_info + (offset/4),
> > > > -	       buf + (offset/4),
> > > > -	       count);
> > > > +	memcpy(dev_priv->l3_parity.remap_info + (offset/4), buf, count);
> > > 
> > > The commit message doesn't really reflect the fact that you completely
> > > remove the offset from 'buf', which is actually the correct thing to do,
> > > but the commit message should match the patch.
> > 
> > I don't know, it seems pretty clear to me. Shall I just copy and paste
> > what you wrote here?
> 
> The commit msg says this:
> - buf + offset/4
> + buf + offset
> 
> But the patch does this:
> - buf + offset/4
> + buf
> 
> Doesn't seem very clear to me :)
> 

That's fine. I want it to be as clear as possible (still seems clear to
me). What do you want me to write? I'll fix it with the below fixed as
well.

> > 
> > > 
> > > Also i915_l3_read() should get the same fix.
> > 
> > Should it?
> 
> Yes, 'buf' is just the buffer userspace will eventually get from the
> read() syscall. Seeking seeks inside the file, not inside the buffer
> userspace passes to read()/write().
> 
> Well, actually 'buf' is one of two temp buffers inside the kernel. It
> looks like bin.c will copy from that buffer to another temp buffer,
> and then from the second temp buffer to userspace. Seems very
> inefficient, but perhaps there's a reason for it.
> 

Now I see what I was missing. Thanks for catching it.

> > 
> > > 
> > > And while on the subject, I don't understand why we're playing weird
> > > tricks with the remap_info memory allocation. Ie. why don't we simply
> > > do this?
> > 
> > It was mostly to be able to differentiate NULL vs. 0's. The former being
> > no remaps ever, the latter being a cleared remapping. That's why temp is
> > used.
> 
> Hmm. I don't get it. The only difference between my idea and the current
> code is if i915_gpu_idle() fails for the first write. But we could just
> do the allocation after we've called i915_gpu_idle() and then there's no
> longer any difference.
> 
> Also there's no guarantee that userspace would even write the whole
> thing. It could just write one byte and then we'd implicitly clear the
> rest to 0. Not sure if that's considered OK or not.

It's not required for userspace to write the whole thing. The operation
should be a RMW, except for the initial case where zero-filled is what
you want (all 0's should be no remapping).

If you want to submit cleanup patches on top of this, I can review them,
but I don't feel this suggestion belongs in this patch anyway.

> 
> Actually 'temp' would make more sense if we'd do the allocation outside
> struct_mutex.
> 
> > > 
> > >  ...
> > >  if (!dev_priv->l3_parity.remap_info) {
> > >  	dev_priv->l3_parity.remap_info = kzalloc(GEN7_L3LOG_SIZE,
> > >  	                                         GFP_KERNEL);
> > > 
> > >  	if (!dev_priv->l3_parity.remap_info) {
> > >  		mutex_unlock(&drm_dev->struct_mutex);
> > >  		return -ENOMEM;
> > >  	}
> > >  }
> > >  ...
> > > 
> > > >  
> > > >  	i915_gem_l3_remap(drm_dev);
> > > >  
> > > > -- 
> > > > 1.8.1.4
> > > > 
> > > > _______________________________________________
> > > > Intel-gfx mailing list
> > > > Intel-gfx@lists.freedesktop.org
> > > > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> > > 
> > > -- 
> > > Ville Syrjälä
> > > Intel OTC
> > 
> > -- 
> > Ben Widawsky, Intel Open Source Technology Center
> 
> -- 
> Ville Syrjälä
> Intel OTC

-- 
Ben Widawsky, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [Intel-gfx] [PATCH 15/19] drm/i915: i915.quirks_set/quirks_mask overrides dmi match
  2013-09-11  8:21   ` [Intel-gfx] " Jani Nikula
@ 2013-09-17 16:42     ` Kamal Mostafa
  2013-09-19 12:10       ` Jani Nikula
  0 siblings, 1 reply; 30+ messages in thread
From: Kamal Mostafa @ 2013-09-17 16:42 UTC (permalink / raw)
  To: Jani Nikula; +Cc: Rodrigo Vivi, intel-gfx, stable, Daniel Vetter

[-- Attachment #1: Type: text/plain, Size: 4216 bytes --]

On Wed, 2013-09-11 at 11:21 +0300, Jani Nikula wrote:
> On Wed, 11 Sep 2013, Rodrigo Vivi <rodrigo.vivi@gmail.com> wrote:
> > From: Kamal Mostafa <kamal@canonical.com>
> >
> > Boot params quirks_set and quirks_mask allow the user to enable or
> > inhibit any dmi-matched quirks, overriding the dmi match table.  Examples:
> >
> >   i915.quirks_set=0x2   - enables QUIRK_LVDS_SSC_DISABLE
> >   i915.quirks_set=0x8   - enables QUIRK_NO_PCH_PWM_ENABLE
> >   i915.quirks_mask=0x8  - disables QUIRK_NO_PCH_PWM_ENABLE
> >   i915.quirks_mask=0xFF - disables all quirks


Thanks for reviewing this, Jani.  I hope you're open to a little more
debate on the topic...

> While I admit this can be used to workaround issues with certain
> machines, this is a hack that exposes an implementation detail to the
> userspace, and carves it into the stone. Any user of it would have to
> look at the kernel source to make a smart choice of parameters; more
> likely forums would start filling with tips what to set.

All of the above is true of all boot params.  Why is that a problem?

Consider that if this quirks_set/mask feature were available:

 - Users and developers could trivially check whether any future machine
needs to be added to any of the existing dmi-match quirk tables.
Currently we have to build a test kernel and get the user to install it,
but this would allow us to just ask "Does i915.quirks_set=0x2 fix it?".
That feature alone makes me want quirks_set/mask.

 - We would unblock some black-screen-on-boot users who currently have
no other solution.  And we would unblock future users who face future
dmi/quirk-mismatch issues (of which I am sure there will be more).

 - The existing i915.invert_brightness override param would not have
been needed.  (It is worth nothing also that my previously proposed
i915.disable_pch_pwm override patch was rejected, even though its
implementation is identical to invert_brightness.)


> So NAK from me.
> 
> That said, I think we still have a few stones unturned wrt backlight and
> what BIOS does in UEFI mode.

I would be thrilled to see the UEFI/BIOS/backlight issue get fixed, but
in my opinion i915.quirks_set/mask would be very useful (a) anyway, and
(b) immediately.

 -Kamal


> 
> Cheers,
> Jani.
> 
> >
> > Signed-off-by: Kamal Mostafa <kamal@canonical.com>
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
> > ---
> >  drivers/gpu/drm/i915/intel_display.c | 15 +++++++++++++++
> >  1 file changed, 15 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > index 1eabca3..12607f2 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -10043,8 +10043,19 @@ static struct intel_quirk intel_quirks[] = {
> >  	{ 0x0166, 0x1028, 0x058b, quirk_no_pcm_pwm_enable },
> >  };
> >  
> > +unsigned long i915_quirks_set __read_mostly = 0;
> > +module_param_named(quirks_set, i915_quirks_set, ulong, 0600);
> > +MODULE_PARM_DESC(quirks_set,
> > +		"Enable specified quirks bits.");
> > +
> > +unsigned long i915_quirks_mask __read_mostly = 0;
> > +module_param_named(quirks_mask, i915_quirks_mask, ulong, 0600);
> > +MODULE_PARM_DESC(quirks_mask,
> > +		"Disable specified quirks bits (override dmi match).");
> > +
> >  static void intel_init_quirks(struct drm_device *dev)
> >  {
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> >  	struct pci_dev *d = dev->pdev;
> >  	int i;
> >  
> > @@ -10062,6 +10073,10 @@ static void intel_init_quirks(struct drm_device *dev)
> >  		if (dmi_check_system(*intel_dmi_quirks[i].dmi_id_list) != 0)
> >  			intel_dmi_quirks[i].hook(dev);
> >  	}
> > +
> > +	/* handle user-specified quirks overrides */
> > +	dev_priv->quirks |= i915_quirks_set;
> > +	dev_priv->quirks &= ~i915_quirks_mask;
> >  }
> >  
> >  /* Disable the VGA plane that we never use */
> > -- 
> > 1.8.1.4
> >
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 


[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 08/19] drm/i915: write D_COMP using the mailbox
  2013-09-10 22:36 ` [PATCH 08/19] drm/i915: write D_COMP using the mailbox Rodrigo Vivi
@ 2013-09-19  9:58   ` Damien Lespiau
  2013-09-19 12:11     ` Daniel Vetter
  0 siblings, 1 reply; 30+ messages in thread
From: Damien Lespiau @ 2013-09-19  9:58 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx, Paulo Zanoni

On Tue, Sep 10, 2013 at 07:36:37PM -0300, Rodrigo Vivi wrote:
> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> 
> You can't write it using the MCHBAR mirror, the write will just get
> dropped.
> 
> This should make us BSpec-compliant, but there's no real bug I could
> reproduce that is fixed by this patch.
> 
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>

With or without the 'from' typo fixed (can be fixed later, amended
while applying):

Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>

> ---
>  drivers/gpu/drm/i915/i915_reg.h      |  4 ++++
>  drivers/gpu/drm/i915/intel_display.c | 10 ++++++++--
>  2 files changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index bcee89b..ff60945 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -1441,6 +1441,8 @@
>   * device 0 function 0's pci config register 0x44 or 0x48 and matches it in
>   * every way.  It is not accessible from the CP register read instructions.
>   *
> + * Starting form Haswell, you can't write registers using the MCHBAR mirror,

'from'

> + * just read.
>   */
>  #define MCHBAR_MIRROR_BASE	0x10000
>  
> @@ -4723,6 +4725,8 @@
>  #define   GEN6_PCODE_READ_MIN_FREQ_TABLE	0x9
>  #define	  GEN6_PCODE_WRITE_RC6VIDS		0x4
>  #define	  GEN6_PCODE_READ_RC6VIDS		0x5
> +#define   GEN6_PCODE_READ_D_COMP		0x10
> +#define   GEN6_PCODE_WRITE_D_COMP		0x11
>  #define   GEN6_ENCODE_RC6_VID(mv)		(((mv) - 245) / 5)
>  #define   GEN6_DECODE_RC6_VID(vids)		(((vids) * 5) + 245)
>  #define GEN6_PCODE_DATA				0x138128
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 882da0e..3098465 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -6019,7 +6019,10 @@ void hsw_disable_lcpll(struct drm_i915_private *dev_priv,
>  
>  	val = I915_READ(D_COMP);
>  	val |= D_COMP_COMP_DISABLE;
> -	I915_WRITE(D_COMP, val);
> +	mutex_lock(&dev_priv->rps.hw_lock);
> +	if (sandybridge_pcode_write(dev_priv, GEN6_PCODE_WRITE_D_COMP, val))
> +		DRM_ERROR("Failed to disable D_COMP\n");
> +	mutex_unlock(&dev_priv->rps.hw_lock);
>  	POSTING_READ(D_COMP);
>  	ndelay(100);
>  
> @@ -6061,7 +6064,10 @@ void hsw_restore_lcpll(struct drm_i915_private *dev_priv)
>  	val = I915_READ(D_COMP);
>  	val |= D_COMP_COMP_FORCE;
>  	val &= ~D_COMP_COMP_DISABLE;
> -	I915_WRITE(D_COMP, val);
> +	mutex_lock(&dev_priv->rps.hw_lock);
> +	if (sandybridge_pcode_write(dev_priv, GEN6_PCODE_WRITE_D_COMP, val))
> +		DRM_ERROR("Failed to enable D_COMP\n");
> +	mutex_unlock(&dev_priv->rps.hw_lock);
>  	POSTING_READ(D_COMP);
>  
>  	val = I915_READ(LCPLL_CTL);
> -- 
> 1.8.1.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 15/19] drm/i915: i915.quirks_set/quirks_mask overrides dmi match
  2013-09-17 16:42     ` Kamal Mostafa
@ 2013-09-19 12:10       ` Jani Nikula
  0 siblings, 0 replies; 30+ messages in thread
From: Jani Nikula @ 2013-09-19 12:10 UTC (permalink / raw)
  To: Kamal Mostafa; +Cc: intel-gfx, stable, Daniel Vetter

On Tue, 17 Sep 2013, Kamal Mostafa <kamal@canonical.com> wrote:
> On Wed, 2013-09-11 at 11:21 +0300, Jani Nikula wrote:
>> On Wed, 11 Sep 2013, Rodrigo Vivi <rodrigo.vivi@gmail.com> wrote:
>> > From: Kamal Mostafa <kamal@canonical.com>
>> >
>> > Boot params quirks_set and quirks_mask allow the user to enable or
>> > inhibit any dmi-matched quirks, overriding the dmi match table.  Examples:
>> >
>> >   i915.quirks_set=0x2   - enables QUIRK_LVDS_SSC_DISABLE
>> >   i915.quirks_set=0x8   - enables QUIRK_NO_PCH_PWM_ENABLE
>> >   i915.quirks_mask=0x8  - disables QUIRK_NO_PCH_PWM_ENABLE
>> >   i915.quirks_mask=0xFF - disables all quirks
>
>
> Thanks for reviewing this, Jani.  I hope you're open to a little more
> debate on the topic...

I'm just offering my opinions; at the end of the day it's Daniel's call.

>> While I admit this can be used to workaround issues with certain
>> machines, this is a hack that exposes an implementation detail to the
>> userspace, and carves it into the stone. Any user of it would have to
>> look at the kernel source to make a smart choice of parameters; more
>> likely forums would start filling with tips what to set.
>
> All of the above is true of all boot params.  Why is that a problem?

Personally I think we are accumulating too many parameters to begin
with. This is a "meta parameter" that makes all of our quirks an
interface to our driver. Something that would be non-trivial to change
or remove.

It opens quirks intended to some specific machines to *all* machines.

> Consider that if this quirks_set/mask feature were available:
>
>  - Users and developers could trivially check whether any future machine
> needs to be added to any of the existing dmi-match quirk tables.
> Currently we have to build a test kernel and get the user to install it,
> but this would allow us to just ask "Does i915.quirks_set=0x2 fix it?".
> That feature alone makes me want quirks_set/mask.

>From a developer perspective, sure, I'd like that.

The flip side is that if setting a module parameter fixes things for the
advanced user, some of the bugs might go unreported. We wouldn't get the
feedback we need to make the driver work with the default settings. I
wouldn't like that.

I've already seen this with the i915.invert_brightness option; if
setting that fixes the problem, it can be hard to convince the reporter
to even send 'lspci -vmmnn' output!

>  - We would unblock some black-screen-on-boot users who currently have
> no other solution.  And we would unblock future users who face future
> dmi/quirk-mismatch issues (of which I am sure there will be more).

Again, how to convince those users to interact with us to fix the issues
for the rest?

>  - The existing i915.invert_brightness override param would not have
> been needed.  (It is worth nothing also that my previously proposed
> i915.disable_pch_pwm override patch was rejected, even though its
> implementation is identical to invert_brightness.)

FWIW, I don't much like that parameter either.

>> So NAK from me.
>> 
>> That said, I think we still have a few stones unturned wrt backlight and
>> what BIOS does in UEFI mode.
>
> I would be thrilled to see the UEFI/BIOS/backlight issue get fixed, but
> in my opinion i915.quirks_set/mask would be very useful (a) anyway, and
> (b) immediately.

If this was debugfs, with zero guarantees on the API or forward
compatibility, I'd be all in. Perhaps that would be useful for some of
the cases, but not all.


Cheers,
Jani.


PS. One code comment inline below.

>
>  -Kamal
>
>
>> 
>> Cheers,
>> Jani.
>> 
>> >
>> > Signed-off-by: Kamal Mostafa <kamal@canonical.com>
>> > Cc: stable@vger.kernel.org
>> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
>> > ---
>> >  drivers/gpu/drm/i915/intel_display.c | 15 +++++++++++++++
>> >  1 file changed, 15 insertions(+)
>> >
>> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
>> > index 1eabca3..12607f2 100644
>> > --- a/drivers/gpu/drm/i915/intel_display.c
>> > +++ b/drivers/gpu/drm/i915/intel_display.c
>> > @@ -10043,8 +10043,19 @@ static struct intel_quirk intel_quirks[] = {
>> >  	{ 0x0166, 0x1028, 0x058b, quirk_no_pcm_pwm_enable },
>> >  };
>> >  
>> > +unsigned long i915_quirks_set __read_mostly = 0;
>> > +module_param_named(quirks_set, i915_quirks_set, ulong, 0600);
>> > +MODULE_PARM_DESC(quirks_set,
>> > +		"Enable specified quirks bits.");
>> > +
>> > +unsigned long i915_quirks_mask __read_mostly = 0;
>> > +module_param_named(quirks_mask, i915_quirks_mask, ulong, 0600);
>> > +MODULE_PARM_DESC(quirks_mask,
>> > +		"Disable specified quirks bits (override dmi match).");

There's no point in making the perm writable, since it's only read on
driver init time.

>> > +
>> >  static void intel_init_quirks(struct drm_device *dev)
>> >  {
>> > +	struct drm_i915_private *dev_priv = dev->dev_private;
>> >  	struct pci_dev *d = dev->pdev;
>> >  	int i;
>> >  
>> > @@ -10062,6 +10073,10 @@ static void intel_init_quirks(struct drm_device *dev)
>> >  		if (dmi_check_system(*intel_dmi_quirks[i].dmi_id_list) != 0)
>> >  			intel_dmi_quirks[i].hook(dev);
>> >  	}
>> > +
>> > +	/* handle user-specified quirks overrides */
>> > +	dev_priv->quirks |= i915_quirks_set;
>> > +	dev_priv->quirks &= ~i915_quirks_mask;
>> >  }
>> >  
>> >  /* Disable the VGA plane that we never use */
>> > -- 
>> > 1.8.1.4
>> >
>> > _______________________________________________
>> > Intel-gfx mailing list
>> > Intel-gfx@lists.freedesktop.org
>> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>> 
>

-- 
Jani Nikula, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 08/19] drm/i915: write D_COMP using the mailbox
  2013-09-19  9:58   ` Damien Lespiau
@ 2013-09-19 12:11     ` Daniel Vetter
  0 siblings, 0 replies; 30+ messages in thread
From: Daniel Vetter @ 2013-09-19 12:11 UTC (permalink / raw)
  To: Damien Lespiau; +Cc: intel-gfx, Paulo Zanoni

On Thu, Sep 19, 2013 at 10:58:36AM +0100, Damien Lespiau wrote:
> On Tue, Sep 10, 2013 at 07:36:37PM -0300, Rodrigo Vivi wrote:
> > From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> > 
> > You can't write it using the MCHBAR mirror, the write will just get
> > dropped.
> > 
> > This should make us BSpec-compliant, but there's no real bug I could
> > reproduce that is fixed by this patch.
> > 
> > Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
> 
> With or without the 'from' typo fixed (can be fixed later, amended
> while applying):
> 
> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>

Queued for -next, thanks for the patch.
> 
> > ---
> >  drivers/gpu/drm/i915/i915_reg.h      |  4 ++++
> >  drivers/gpu/drm/i915/intel_display.c | 10 ++++++++--
> >  2 files changed, 12 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> > index bcee89b..ff60945 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -1441,6 +1441,8 @@
> >   * device 0 function 0's pci config register 0x44 or 0x48 and matches it in
> >   * every way.  It is not accessible from the CP register read instructions.
> >   *
> > + * Starting form Haswell, you can't write registers using the MCHBAR mirror,
> 
> 'from'

Fixed.
-Daniel

> 
> > + * just read.
> >   */
> >  #define MCHBAR_MIRROR_BASE	0x10000
> >  
> > @@ -4723,6 +4725,8 @@
> >  #define   GEN6_PCODE_READ_MIN_FREQ_TABLE	0x9
> >  #define	  GEN6_PCODE_WRITE_RC6VIDS		0x4
> >  #define	  GEN6_PCODE_READ_RC6VIDS		0x5
> > +#define   GEN6_PCODE_READ_D_COMP		0x10
> > +#define   GEN6_PCODE_WRITE_D_COMP		0x11
> >  #define   GEN6_ENCODE_RC6_VID(mv)		(((mv) - 245) / 5)
> >  #define   GEN6_DECODE_RC6_VID(vids)		(((vids) * 5) + 245)
> >  #define GEN6_PCODE_DATA				0x138128
> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > index 882da0e..3098465 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -6019,7 +6019,10 @@ void hsw_disable_lcpll(struct drm_i915_private *dev_priv,
> >  
> >  	val = I915_READ(D_COMP);
> >  	val |= D_COMP_COMP_DISABLE;
> > -	I915_WRITE(D_COMP, val);
> > +	mutex_lock(&dev_priv->rps.hw_lock);
> > +	if (sandybridge_pcode_write(dev_priv, GEN6_PCODE_WRITE_D_COMP, val))
> > +		DRM_ERROR("Failed to disable D_COMP\n");
> > +	mutex_unlock(&dev_priv->rps.hw_lock);
> >  	POSTING_READ(D_COMP);
> >  	ndelay(100);
> >  
> > @@ -6061,7 +6064,10 @@ void hsw_restore_lcpll(struct drm_i915_private *dev_priv)
> >  	val = I915_READ(D_COMP);
> >  	val |= D_COMP_COMP_FORCE;
> >  	val &= ~D_COMP_COMP_DISABLE;
> > -	I915_WRITE(D_COMP, val);
> > +	mutex_lock(&dev_priv->rps.hw_lock);
> > +	if (sandybridge_pcode_write(dev_priv, GEN6_PCODE_WRITE_D_COMP, val))
> > +		DRM_ERROR("Failed to enable D_COMP\n");
> > +	mutex_unlock(&dev_priv->rps.hw_lock);
> >  	POSTING_READ(D_COMP);
> >  
> >  	val = I915_READ(LCPLL_CTL);
> > -- 
> > 1.8.1.4
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2013-09-19 12:11 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-09-10 22:36 [PATCH 00/19] drm-intel-collector push simple patches for review Rodrigo Vivi
2013-09-10 22:36 ` [PATCH 01/19] drm/i915: check that the i965g/gm 4G limit is really obeyed Rodrigo Vivi
2013-09-10 22:36 ` [PATCH 02/19] drm/i915: Cancel outstanding modeset workers before suspend Rodrigo Vivi
2013-09-10 22:36 ` [PATCH 03/19] drm/i915: Move the conditional seqno query into the tracepoint Rodrigo Vivi
2013-09-10 22:36 ` [PATCH 04/19] drm/i915: Add some missing steps to i915_driver_load error path Rodrigo Vivi
2013-09-10 22:36 ` [PATCH 05/19] drm/i915: Asynchronously perform the set-base for a simple modeset Rodrigo Vivi
2013-09-10 22:36 ` [PATCH 06/19] drm/i915: Align tiled scanouts from stolen memory to 256k in the GTT Rodrigo Vivi
2013-09-10 22:36 ` [PATCH 07/19] drm/i915: Pair seqno completion tracepoint with its dispatch Rodrigo Vivi
2013-09-10 22:36 ` [PATCH 08/19] drm/i915: write D_COMP using the mailbox Rodrigo Vivi
2013-09-19  9:58   ` Damien Lespiau
2013-09-19 12:11     ` Daniel Vetter
2013-09-10 22:36 ` [PATCH 09/19] drm/i915: Initialise min/max frequencies before updating RPS registers Rodrigo Vivi
2013-09-10 22:36 ` [PATCH 10/19] drm/i915: Delay the relase of the forcewake by a jiffie Rodrigo Vivi
2013-09-10 22:36 ` [PATCH 11/19] drm/i915: Add a tracepoint for using a semaphore Rodrigo Vivi
2013-09-10 22:36 ` [PATCH 12/19] drm/i915: Write RING_TAIL once per-request Rodrigo Vivi
2013-09-10 22:36 ` [PATCH 13/19] drm/i915: Boost RPS frequency for CPU stalls Rodrigo Vivi
2013-09-10 22:53   ` Chris Wilson
2013-09-10 22:36 ` [PATCH 14/19] drm/i915: Tweak RPS thresholds to more aggressively downclock Rodrigo Vivi
2013-09-10 22:36 ` [PATCH 15/19] drm/i915: i915.quirks_set/quirks_mask overrides dmi match Rodrigo Vivi
2013-09-11  8:21   ` [Intel-gfx] " Jani Nikula
2013-09-17 16:42     ` Kamal Mostafa
2013-09-19 12:10       ` Jani Nikula
2013-09-10 22:36 ` [PATCH 16/19] drm/i915: Fix l3 parity buffer offset Rodrigo Vivi
2013-09-11 11:45   ` Ville Syrjälä
2013-09-11 17:07     ` Ben Widawsky
2013-09-11 17:44       ` Ville Syrjälä
2013-09-11 17:59         ` Ben Widawsky
2013-09-10 22:36 ` [PATCH 17/19] drm/i915: Fix HSW parity test Rodrigo Vivi
2013-09-10 22:36 ` [PATCH 18/19] drm/i915: Allow GT3 Slice Shutdown on Boot Rodrigo Vivi
2013-09-10 22:36 ` [PATCH 19/19] drm/i915: Allow Dynamically GT3 Slice Shutdown Rodrigo Vivi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.